Intelligent Enterprise

Better Insight for Business Decisions



January 20, 2000 Volume 3 - Number 2



Seeking Spatial Intelligence

A data warehousing initiative offers the best opportunity to bring the business value of spatial data into your organization

By Michael L. Gonzales



Right or wrong, the term “GIS” — the acronym for geographic information system — carries a lot of baggage. For most of its existence, GIS has been the property of an exclusive group of government agencies and commercial applications. It’s been defined by a need to analyze map-specific data and constrained by the demands of complex data structures. Moreover, in a seemingly conscious attempt to ensure obscurity, many people define GIS simply and narrowly as automated map management.

Spatial data, however, is not about geography or maps; it’s about visualizing and understanding relationships and information just below the layer of numeric data. For that reason, GIS has thrived for years in several key industries, including oil and gas, real estate, government, telecom, and utilities. More recent opportunities include retail — where the technology is deployed for market analysis, site selection, and property and transportation management — and healthcare, where GISs help track provider inventory and monitor health service billing areas. In fact, most industries use geographic information to support some business process.

Unfortunately, these applications lead to stovepipe solutions that address the needs of small, well-defined user groups. Indeed, apart from a few customers with specific geospatial requirements, GIS simply isn’t considered a mainstream requirement. Many CIOs shudder at the thought of implementing what they perceive to be arcane technologies involving complex objects and special systems.

If these misperceptions continue, GIS will stay in the computing “basement” with other relevant but esoteric technologies. Thus, the growth of the GIS industry depends on how well it can move beyond these stovepipes and expand its role in decision support. GIS must reinvent itself; a new role is required that exploits the technology’s virtues and demonstrates its value to the decision-support process. (Perhaps we should simply remove the “G” from GIS, an observation that reflects the industry’s public-image problems.) In essence, GIS has outgrown the purposes for which it was invented, yet is still perceived as having those purposes only.


Core Customer | Influencing Factor | Spatial Entity          | Spatial Attributes
Commercial    | Add locations      | Streets, cities, states | Distance of locations to the warehouse
Residential   | Move               | Streets, zones          | Average family size of neighborhood and average household income
Employees     | Change jobs        | Streets                 | Average number of employees, average employee salary
Facilities    | Expand locations   | Facility layout         | Square footage

TABLE 1 The spatial dimension’s influence across the organization.

Remaking GIS

Jack Dangermond, president of Environmental Systems Research Institute (ESRI), a leading provider of geospatial software, has said that maps are merely the “door prize” for GIS. There’s a lot of insight in that statement. GIS was the impetus for developing and maturing the technologies and techniques for storing, managing, and presenting spatial data. But the value of this hard-won knowledge goes well beyond mapping, the traditional application of GIS. Not unlike date and time, space and spatial data represent a critical analytic perspective for which no substitute exists.

Table 1 shows how “space” is an integral concept across the organization; indeed, there is virtually no part of a business where it doesn’t have some impact. The challenge for the GIS industry is to formulate a message that clearly communicates the value space brings to these processes. Simultaneously, it must demonstrate to the IT community that a major part of the core technology necessary to integrate space is already in place: the RDBMS.

An Enabling Technology

Until the early 1990s, GIS required non-relational data structures to store and manage the objects necessary for spatial operations. During that same period, RDBMSs came to dominate data management for all other corporate applications, so the enterprise and the GIS community were going in different directions with their data. Over the course of the 1990s, however, database vendors invested resources and partnered with leading GIS vendors to ensure that data types particular to spatial data blend seamlessly into the corporate database. For example, ESRI and geospatial vendor MapInfo Corp. provide engine-based spatial data management technology for storing, managing, and accessing spatial data within leading RDBMSs.

Modern relational databases have cleared the path for the GIS industry to introduce itself to the enterprise on a scale not previously possible. With spatial data stored and managed in the same environment as production systems, IT shops are less resistant to the idea of new data types. Moreover, because the inclusion of spatial data is almost transparent, report and query programmers and application developers need only learn an expanded SQL to exploit the information.

Follow the Leader

Aided by Y2K problems and customer demand to integrate legacy systems, the big ERP vendors have entrenched themselves in nearly every Global 1,000 company. But because at this point most of these companies have already deployed an ERP solution, the vendors are now running out of prime customers. To sustain the lofty growth rates enjoyed in recent years, ERP vendors are now focusing on new applications for existing customers and expanding into the middle market. Because an estimated 80 percent of all data locked up in corporate databases has a geographic component, these new growth initiatives involve the inclusion of spatial data in ERP solutions.

This news is good for companies under constant pressure to contain costs and be competitive. Spatially enabled ERP improves decision-making information by leveraging corporate data assets stored and maintained in the ERP system. Intuitive, graphical interfaces help information flow faster across the organization. Developing or maintaining a competitive advantage can therefore be facilitated through spatially based market analysis, product analysis, trade area analysis, and site selection. For managers, improving reporting and analysis functions by even a fraction can result in big savings and reduced time-to-market. And operational users will be able to react to customer service needs more quickly while improving resource scheduling and utilization.

Although many ERP systems can’t do everything that spatial applications require, they do capture and maintain operational data that GISs can use as attributes of spatial features such as points (addresses), lines/networks (pipe inventories), or polygons (service areas). To GIS users, the content of an ERP database looks like the mother of all attribute data. These systems provide the descriptive characteristics of spatial features that bring GIS applications to life. Gaining access to this vast store of data saves time and money while growing the analytic ability of the enterprise.

Thus, a symbiotic relationship exists between ERP and GIS. For example, SAP’s EnjoySAP initiative involves a series of programs designed to simplify business processes by providing more intuitive user interfaces that meet people’s specific preferences, skills, and work requirements. A key component of this initiative is the addition of GIS functionality to new front-office applications. The first products resulting from the initiative are already on the market. For example, ESRI mapping technology is embedded in the user interface for Business Information Warehouse (BW); end users can interact with BW through map-based interfaces.

The important point here is that SAP chose its data warehouse, BW, as the initial integration point for spatial data. Several reasons support this strategy:

• Data warehouse efforts are often isolated. They represent downstream solutions from the production systems. As such, data warehouse iterations lend themselves to the application of new techniques and technologies simply because they have virtually no impact on source systems.

• Data warehouses are static, providing an excellent environment in which to systematically extract, scrub, and otherwise transform source data into the target system.

• Data warehouses emphasize analysis. Consequently, users are specifically interested in enhancing the analytic value of the data being extracted, transformed, and stored there.

But what about the many organizations that haven’t implemented SAP? For them, we need to create a realistic and scalable model that reflects the way they do business.

A Spatial Paradigm

Introducing spatial data across your source systems can be very disruptive. Of course, most GIS vendors try to sell spatial techniques and technologies by focusing on their graphical appeal and front-end applicability. (Logistics departments, oil and gas firms, and utility companies are favorite targets.) But as I explained earlier, this strategy has led inevitably to stovepipe functionality. Only two situations justify introducing spatial data on the production side of your organization: you have a specific business requirement driving the implementation, which will almost certainly lead to a stovepipe rather than an enterprise solution; or you’re implementing an ERP solution, such as SAP, that already includes space as a standard feature of its environment. Otherwise, data warehouse iterations are the best vehicle for blending spatial data into the enterprise.

Thus, a new spatial paradigm emerges: First enable your enterprise data warehouse to deliver spatial value, and only then focus on getting that information to end users. As I’ll explain, this strategy is less intrusive, yet more effective, for enabling a spatial enterprise.

Entities or Attributes

Before beginning any spatial warehouse iteration, you must understand the two possible types of iterations. At the risk of oversimplification, spatial warehouse implementations are defined by two general categories of spatial data: entities (computer representations of streets, lots, forests, a section of pipe, or an oil well, for example) and attributes (descriptive characteristics about entities, such as the income level of a neighborhood, the type of pipe used in a pipeline section, or the number of barrels pumped from a particular oil well).

The difference between spatial entities and attributes isn’t trivial. One very distinguishing characteristic is how the information will appear to users; spatial entities usually require software that understands the objects (map programs, for example). A spatial entity iteration of your warehouse will undoubtedly dictate that you add a geographical tool to access the data being stored. Conversely, implementations of spatial attributes let you use traditional online analytic processing (OLAP) and SQL reporting tools. (See Figure 1.)



FIGURE 1 Spatially expanding standard data access tools.


You can add spatial attributes to your RDBMS by installing a spatial extender for the database engine (an Informix DataBlade, IBM DB2 Extender, or Oracle Cartridge, for example). At this point, the engine will have expanded SQL capability that supports arguments such as ADJACENT TO, CONTAINS, or OVERLAPS. Data access programmers or advanced power users then simply need to learn the new arguments to exploit the spatial functions.
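The exact SQL syntax for these predicates differs from one extender to the next, so rather than guess at any vendor’s dialect, the following Python sketch only illustrates what predicates in the spirit of CONTAINS and OVERLAPS compute. The function names, the box representation, and the ray-casting approach are assumptions chosen for illustration; they are not taken from any of the products named above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def contains(polygon: List[Point], point: Point) -> bool:
    """Ray-casting test: True if the point lies inside the polygon.

    Points exactly on the boundary may be classified either way.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


# Hypothetical service area (a square) and two customer locations.
service_area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
customer_inside = contains(service_area, (2.0, 2.0))
customer_outside = contains(service_area, (5.0, 2.0))
```

A spatially extended engine would evaluate tests like these inside the database and typically back them with a spatial index, which is why the programmer only has to learn the new SQL arguments.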

If you have a choice, the best way to start your organization down the spatial path is to implement spatial attributes first. This strategy enhances analytic value for users without changing the tools they use to access the warehouse. This option wasn’t readily available until recently, however; GIS vendors are just now providing technology that allows attribute-only iterations. For example, ESRI is publishing a COM library with more than 1,000 objects that will help third-party and in-house developers create integrated solutions between traditional and spatial data.

How to Scope a Spatial Iteration

Typical warehouse projects comprise three core development tracks: data extraction and transformation, management, and access. The type of spatial implementation you choose will bear directly on these tracks. (See sidebar, “The Big Six.”) For that reason, the data architect must focus on properly identifying the scope of the iteration, and the best way to scope any data warehouse project is by asking business questions. For example, the VP of marketing might want answers to questions such as “What’s the average driving distance between customers and the stores they patronize?” or “What’s the average income level of our customers’ neighborhoods?” Both of these questions are spatial-centric, involving the average driving distance and the average income level for neighborhoods. To answer either one, you must be able to attach a geographic reference (address) to nongeographic information, in this case distance and income level.
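To make the first question concrete, here is a minimal sketch that averages the distance between geocoded customers and stores. It assumes each address has already been converted to latitude and longitude, and it substitutes straight-line (great-circle) distance for driving distance, which would require a road network. The function names and the input layout are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_distance_km(visits):
    """Average customer-to-store distance.

    visits: list of ((customer_lat, customer_lon), (store_lat, store_lon)) pairs.
    This is a straight-line proxy, not true driving distance.
    """
    if not visits:
        raise ValueError("at least one customer/store pair is required")
    distances = [haversine_km(c[0], c[1], s[0], s[1]) for c, s in visits]
    return sum(distances) / len(distances)
```

In a warehouse, a figure like this would normally be computed once during transformation and stored as a spatial attribute, so that ordinary reporting tools can average it without any spatial logic of their own.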

A seasoned architect would create a single data mart that contains sufficient information to handle either question and more. For example, the architect would include the obvious dimensions of Customer and Store as well as a spatial dimension that includes attributes regarding demographic information (neighborhood income level) and distance. Other dimensions could include date and, possibly, product.
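One way such a data mart might be laid out is sketched below with Python’s built-in sqlite3 module. Every table name, column name, and sample value is invented for illustration, and the Date and Product dimensions are left out for brevity. Note that the spatial dimension holds only attributes (income level and distance), so plain SQL is enough to query it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE store_dim (store_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE spatial_dim (
    spatial_id INTEGER PRIMARY KEY,
    neighborhood_income REAL,  -- demographic attribute
    distance_km REAL           -- customer-to-store distance attribute
);
CREATE TABLE sales_fact (
    customer_id INTEGER REFERENCES customer_dim(customer_id),
    store_id INTEGER REFERENCES store_dim(store_id),
    spatial_id INTEGER REFERENCES spatial_dim(spatial_id),
    sale_amount REAL
);
""")

# Made-up sample rows, purely for illustration.
conn.executemany("INSERT INTO customer_dim VALUES (?, ?)",
                 [(1, "Customer A"), (2, "Customer B")])
conn.execute("INSERT INTO store_dim VALUES (?, ?)", (10, "Store X"))
conn.executemany("INSERT INTO spatial_dim VALUES (?, ?, ?)",
                 [(100, 40000.0, 2.0), (101, 60000.0, 6.0)])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                 [(1, 10, 100, 25.0), (2, 10, 101, 75.0)])

# Both marketing questions are answered with ordinary SQL aggregates.
rows = conn.execute("""
    SELECT s.name, AVG(sp.distance_km), AVG(sp.neighborhood_income)
    FROM sales_fact f
    JOIN store_dim s ON s.store_id = f.store_id
    JOIN spatial_dim sp ON sp.spatial_id = f.spatial_id
    GROUP BY s.name
""").fetchall()
```

Because the averages are taken over fact rows, they are weighted by number of sales rather than by number of customers; a real design would choose the grain deliberately.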

Of course, it’s easy to spot the spatial aspects of these business questions. But what if the business question isn’t as spatially obvious, such as “What are our product sales by date, customer, and store?” In this case, there’s no explicit mention of a spatial requirement. If space were already implemented in the atomic data warehouse, the architect would only have to persuade executives to support incorporation of a spatial dimension to enhance analysis. In fact, if space were already integrated, adding space to any project iteration would be as fundamental as adding a Date dimension; you’d simply expect to have a spatial dimension. However, if you’ve never implemented a spatial iteration but want to start somewhere, this is a good opportunity to do so. The architect only needs to propose the value of space to the analysis using spatial concepts such as drive time or distance, and demographic attributes such as income level.

Spacing Out

Most GIS implementations were invented in the “back room” of the IT department by innovators who have been left alone to deal with their own problems and celebrate their own successes. In the meantime, IT has focused available resources on other areas of technology considered more important to the enterprise.

To bring spatial data into the mainstream, technologists should take the approach of data warehouse or decision-support initiatives. The rationale is simple: Data warehousing is often the impetus to transform corporate data into information. It requires a conscious effort to cleanse and integrate data across the enterprise, with the primary goal of building analytic value. It’s a time when IS expects to learn new analysis tools and methods for building and maintaining new data stores. Simply put, no other IT effort provides a better opportunity to introduce spatial data.

The introduction of spatial technology and techniques, however, does not guarantee their success. In parallel to the inclusion of space into the enterprise database, you must also inject it into the corporate consciousness. The organization itself — how it does business — must change to match this expanded analytic capability, raising the company to new levels of discovery, data exploitation, and spatial intelligence.


Treasure Map

Avoiding the stovepipe trap

Taking these steps will ensure that spatial data won’t be consigned to stovepipe analysis in your organization. They represent a purposeful, scalable, iterative approach that’s conducive to the new spatial enterprise.

• Use your data warehouse effort as the vehicle to spatially enable your enterprise’s analytic power.

• Identify and implement a spatial extender to store, manage, and retrieve spatial data in the same database environment as the standard data types in your warehouse.

• Distinguish between standard data warehouse implementations and spatial implementations. Spatial data warehouse implementations require more attention to understanding business requirements. Traditional implementations mainly address time; the spatial dimension adds an entirely new perspective to the data and therefore to the analysis.

• For your first spatial implementation, focus only on attribute data. Doing so will greatly simplify the implementation process while offering better analytic value.

• Expand your spatial capability over several warehouse iterations rather than all at once. The “big bang” strategy fails for standard warehouse efforts, and there’s no reason to believe it will be any different for spatial ones.


THE BIG SIX

Keys to spatial intelligence

Other implications exist, but generally, the data architect who keeps these six issues on the radar screen has a reasonable chance of successful implementation.

1. Determine if the iteration includes spatial entities, attributes, or both. If you’re implementing spatial entities, you’ll have to evaluate data access tools for the users.

2. Determine if you need to purchase external demographic data. Spatial data warehouse iterations often require data from third parties; it’s common for organizations to have addresses for their clients but little else.

3. Focus on ensuring accurate address information. This issue sits between the source and extraction layers of the project. Cleaning addresses from source data is often so difficult that it requires some scrubbing on the source side as well as processes to guarantee quality on the warehouse side.

4. Add a geocoding process. The purpose of geocoding is to attach a geographic reference to nongeographic attributes. In a warehouse, an address is often geocoded to associate physical location (longitude and latitude) with attribute values. If a consistent, high-quality address is unavailable, a ZIP code can serve the same purpose at a coarser level of detail.

5. Add a spatial extender to your database engine. You’ll need to invest in a spatial extension to your database if you’re implementing a spatial data warehouse.

6. Consider disk space and indexing strategies. A spatial implementation will affect disk space and partitioning. Moreover, DBAs will need to understand indexing options to ensure data access performance.
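The geocoding step in item 4, including the fallback from address to ZIP code, can be sketched roughly as follows. The lookup tables stand in for whatever reference data an organization licenses or builds; the function names, the normalization rule, and the return format are assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

LatLon = Tuple[float, float]


def normalize(address: str) -> str:
    """Crude address cleanup: uppercase, drop commas, collapse whitespace."""
    return " ".join(address.upper().replace(",", " ").split())


def geocode(
    address: str,
    zip_code: str,
    address_table: Dict[str, LatLon],
    zip_centroids: Dict[str, LatLon],
) -> Tuple[Optional[LatLon], str]:
    """Return (location, match_level).

    Tries an exact match on the normalized address first, then falls back
    to the ZIP code centroid, which is coarser but more often available.
    """
    key = normalize(address)
    if key in address_table:
        return address_table[key], "address"
    if zip_code in zip_centroids:
        return zip_centroids[zip_code], "zip"
    return None, "unmatched"
```

Recording the match level alongside the coordinates lets analysts tell precise locations from ZIP-level approximations, and the count of unmatched rows gives a quick measure of how much address cleaning item 3 still requires.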


Michael L. Gonzales (mlg@starfocus.com), a database developer for more than a decade, manages The Focus Group Ltd., a consulting firm specializing in ROLAP and OLAP techniques and technologies. He has also written several books and conducts data warehouse seminars across the country.



 




