Intelligent Enterprise

Better Insight for Business Decisions

January 01, 2001




Spatial Enabling Your Data Warehouse


Ralph goes to MapObjects basic training


Ralph Kimball

I have always been puzzled by the chasm separating the data warehouse community and the geographic information systems (GIS) community. Very few "conventional" data warehouses exploit their data with a map-driven approach, yet these same data warehouses are rich with geographic entities including addresses, point locations, sales districts, and higher level political geographies.

Conversely, I have heard the mainline GIS community talking about "geographic data warehouses," but these rarely bear any similarity to the data warehouse world with which I am familiar. GIS data warehouses certainly have a lot of data, but the concerns of the GIS data warehouse revolve around such unfamiliar terms as vector data sets, cadastral databases, spatiotemporal information systems, and helical hyperspatial codes.

I have always instinctively believed that the conventional data warehouse community could gain a great deal by taking advantage of some GIS tools and user interfaces. A map can be very compelling: for example, a two-dimensional portrayal of data can show patterns that other kinds of analysis simply can't reveal.

Presumably, if you GIS-enable one of your conventional data warehouses, it should be easier for you to answer questions such as:

  • Do your customers come to your store because it is located near their home or near their work? What does that mean to you in terms of hours of operation and reducing in-store queues?
  • Have you located your distribution center optimally between your suppliers and your customers, taking into account expected growth in the next five years?
  • What factors explain the obvious disparities in profitability and customer retention that you see when you plot these factors against a national map of all U.S. counties?

Investigating a GIS Vendor

I chose ESRI as the target of my investigation because it is currently the leading GIS provider. In its literature, ESRI defines GIS as "taking the spatial components that already exist in our customers' databases such as store locations, telephone pole locations, and transportation routes and connecting these components to a physical location somewhere on the surface of the earth." That definition is pretty much what was on my mind when I began researching the GIS data warehouse dilemma. I am constantly advising my data warehouse clients about how to carefully parse and store data about locations, yet I don't often see them using a GIS tool for that data.

Furthermore, I was intrigued when I read a comment attributed to ESRI's founder, Jack Dangermond. He said that the maps produced by ESRI's software are really "just the door prize." In other words, the maps, as compelling and necessary as they may be, are simply a means to deliver the information and insights contained in the spatially oriented data.

I agree with the tone of Dangermond's comment. Many times over the years, I have seen dazzling technology distract the customer from the business value contained in the product. The maps and the colorful user interfaces are a necessary component of GIS, but the real point of the exercise is to be able to make decisions from the underlying data. It is at this point that the motivations of the conventional and GIS data warehouses overlap exactly.

Going to Boot Camp

Keeping all this in mind, I decided to learn firsthand what barriers existed to attaching GIS user interfaces to our conventional data warehouses. I wanted to see what it would be like to enter the GIS world as both a data warehouse expert and a GIS novice. Would I be lost? Would I come away frustrated because I would still not know how to make the GIS connection?

So I purchased a developer's license for ESRI's MapObjects 2.1 Visual Basic (VB) product, and I signed up and paid for a spot in ESRI's MapObjects boot camp training in Redlands, Calif.

I chose MapObjects because I am comfortable with Microsoft Visual Basic. I use VB extensively and have written a number of serious data warehouse applications in VB, including the Star Tracker demonstration query tool, which has been distributed to more than 160,000 locations, as well as various other hybrid tools that I use to demonstrate multimedia data warehouse techniques in my classes.

ESRI implemented MapObjects with standard OLE 2.0 interfaces - that is, as an OCX module - which means that I should be able to drag a map object from the VB toolbox directly into an existing VB application, and all MapObjects properties, methods, and events would be accessible.

I came away from my week of training at ESRI very pleased. I got exactly what I expected. The MapObjects OCX is an extremely full-featured driver for GIS capabilities. It is definitely not a bait-and-switch product intended to lure you into buying the higher-end dedicated GIS systems from ESRI. Rather, from the point of view of a conventional data warehouse implementer, MapObjects provides an easy way to GIS-enable almost any data warehouse whose dimensions contain information about locations and routes.

Now, let's do a quick tour of the features you will find in MapObjects.

MapObjects Components And Commands

You connect MapObjects to your existing VB application (in developer mode) by dragging a map control from the VB toolbox into one of your application's forms. At that point, you have access to all the MapObjects facilities. The map control provides access to 39 map-oriented objects. Some of the mapping display objects include:

  • Map layer objects, which are the most important objects in the system. A map often has multiple layers, such as political boundaries, roads, and rivers. These layers can be displayed all at once when the map appears on the end user's screen or selectively by request from the end user. Map layers are based on points, lines, or polygons. The information contained in a map layer is called a geodataset object.
  • Image layer objects, which are based on raster data, such as an aerial or satellite photograph. Image layers may be mixed with map layers.
  • A single tracking layer object, which is a special layer that is always "on top." It contains geoevent objects, which may be continuously moving on the display without requiring the complex underlying map layers and image layers to be redrawn. A geoevent object might be a moving train, whose position is being updated from a Global Positioning System (GPS) device.
  • Event renderer objects, which display symbols at certain distances along a line object. An event renderer object could show that an automobile accident is 3.2 miles from the beginning of a road.
  • Z renderer objects, which apply a graphical symbol to the Z value (for example, the altitude) of a map feature.
  • Chart renderer objects, which associate either a bar chart or a pie chart with the data values of a map feature. Using these objects, you can make little pie charts appear in all the states of the U.S.
  • Value map renderer objects, which associate a unique symbol with each underlying data value of a map feature.
  • Dot density renderer objects, which sprinkle dot patterns of various densities onto maps, depending on the data values of the underlying map features.
  • Label renderer objects, which place text labels on map features. These labels automatically arrange themselves to avoid overlapping.
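
To make the event renderer idea concrete, here is a minimal Python sketch of the underlying calculation: finding the point that lies a given distance along a line. It is not MapObjects code and does not use any ESRI API. The function name `point_along_line`, the flat x/y coordinates, and the sample road are all assumptions made for illustration.

```python
import math

def point_along_line(vertices, distance):
    """Return the (x, y) point `distance` units from the start of a polyline.

    `vertices` is a list of (x, y) tuples in a flat coordinate system whose
    units match `distance` (an assumption; real map data needs a projection).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    remaining = distance
    # Walk the polyline one segment at a time.
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        if seg_len > 0 and remaining <= seg_len:
            t = remaining / seg_len  # fraction of this segment to travel
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg_len
    raise ValueError("distance exceeds the length of the line")

# Hypothetical road: 3 miles east, then 4 miles north.
road = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
accident = point_along_line(road, 3.2)  # 3.2 miles from the start of the road
```

In this made-up example the accident falls on the second segment, a short way past the corner. A renderer would then draw its symbol at that coordinate.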

MapObjects also includes basic geometric objects, which include points, lines, polygons, rectangles, ellipses, and collections of parts, and projection objects, which allow spatial data to be converted among various map projections.

Two of the most interesting types of objects in MapObjects from the perspective of the conventional data warehouse are the data access objects and the address matching objects. Data access objects include:

  • Data connection objects, which create connections to specific sources of geographic data, including ESRI Shapefiles, CAD files, ARC/INFO workspaces, VPF (military format) data sources, and a useful variety of commercial relational databases including Oracle, DB2, SQL Server, Informix, and Sybase. You can access commercial relational databases through ESRI's spatial database engine (ARC SDE) interface, which allows regular SQL queries to be formed with additional SQL extensions appropriate for geographic queries.
  • Recordset objects, which are the specific rows of data in a geodataset object or the rows returned by a query against table objects and field objects.
  • Statistics objects, which compute the max, min, mean, count, and standard deviation of the values of a field.
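
The summary that a statistics object produces for a field can be sketched in a few lines of Python. This is only an illustration of the five measures listed above, not the MapObjects interface. The function name `field_statistics`, the row layout, and the sample data are hypothetical, and the sketch assumes a population standard deviation, which the product may or may not use.

```python
import statistics

def field_statistics(rows, field):
    """Summarize one numeric field across a list of row dictionaries."""
    # Ignore rows where the field is missing (None).
    values = [row[field] for row in rows if row[field] is not None]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        # Assumption: population (not sample) standard deviation.
        "stddev": statistics.pstdev(values),
    }

# Made-up rows standing in for a recordset.
stores = [
    {"store": "A", "sales": 2.0},
    {"store": "B", "sales": 4.0},
    {"store": "C", "sales": 4.0},
    {"store": "D", "sales": 6.0},
]
summary = field_statistics(stores, "sales")
```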

MapObjects' address matching objects implement the bridge between raw addresses and geographic points. The important address matching objects are standardizer objects, which standardize street addresses and intersections against a user-supplied set of address patterns, and geocoder objects, which map a set of standardized addresses onto a street network, supplied separately as a geodataset. A set of addresses that have been geocoded can then be displayed on a map. This is how you would take addresses in your customer dimension and show them on a map.
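
The two-step bridge from raw address to map point can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the MapObjects standardizer or geocoder: the suffix table, the `SEGMENTS` street network with its address ranges, and the linear interpolation along each segment are all hypothetical stand-ins for the user-supplied patterns and street geodataset.

```python
import re

# Hypothetical standardization rules: map street-type words to abbreviations.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "BOULEVARD": "BLVD"}

def standardize(raw):
    """Return (house_number, street_name) in a canonical upper-case form."""
    tokens = re.sub(r"[^\w\s]", " ", raw).upper().split()
    if not tokens or not tokens[0].isdigit():
        raise ValueError(f"no house number in {raw!r}")
    street = " ".join(SUFFIXES.get(t, t) for t in tokens[1:])
    return int(tokens[0]), street

# Hypothetical street network: each segment carries an address range
# and the coordinates of its two end points.
SEGMENTS = [
    {"street": "MAIN ST", "low": 100, "high": 198, "start": (0.0, 0.0), "end": (1.0, 0.0)},
    {"street": "MAIN ST", "low": 200, "high": 298, "start": (1.0, 0.0), "end": (2.0, 0.0)},
]

def geocode(raw, segments=SEGMENTS):
    """Return an (x, y) point for the address, or None if nothing matches."""
    number, street = standardize(raw)
    for seg in segments:
        if seg["street"] == street and seg["low"] <= number <= seg["high"]:
            span = seg["high"] - seg["low"]
            # Interpolate the house number's position along the segment.
            t = (number - seg["low"]) / span if span else 0.0
            (x1, y1), (x2, y2) = seg["start"], seg["end"]
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None

point = geocode("149 Main Street.")  # halfway along the first segment
```

Applied to a customer dimension, the same idea would run once per address row and store the resulting coordinates alongside the row for later display.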

The fundamental map control implements a full set of events that let the developer insert code at critical moments in a user session. Experienced VB programmers will recognize the standard VB Click, DblClick, GotFocus, LostFocus, KeyDown, KeyPress, MouseDown, and MouseUp events. MapObjects supplies a number of useful additional events with the map control, including BeforeLayerDraw, AfterLayerDraw, BeforeTrackingLayerDraw, AfterTrackingLayerDraw, and DrawingCancelled. These extra map-specific events give the user interface designer very fine-grained control over the map drawing experience.

All 39 MapObjects objects implement various properties and methods. There are literally hundreds of properties and methods, and the MapObjects programmer is well advised to paste ESRI's big MapObjects map on the wall to remember which objects support which properties and methods.

As an experienced VB programmer, my impression is that MapObjects was implemented by a team that knew what it was doing and that made a serious attempt to provide a complete and useful toolbox of capabilities. It is worth noting that this is not a "release 1.0" product; MapObjects is currently at the 2.1 release level.

MapObjects comes with five full CDs of maps and geographic data, covering the United States in great detail and the rest of the world in moderate detail. The information on these CDs may be embedded in your application as a way to get started with the graphical aspect of mapping if you do not have your own geodatasets.

After taking ESRI's MapObjects class, I was able to construct my own data warehouse application from scratch. In less than a day, I was able to connect my own private data to ESRI's map data contained on the CDs, and could pan, scroll, highlight, analyze, and query by "geographic gesture."

Figure 1 shows a little snippet of something I created, borrowing code from the ESRI Help system and the class exercises, as well as adding my own logic. This application reads data from a set of relational tables in my own data warehouse, selects a set of countries based on my constraints, shades and labels the countries, and lets me do queries by double-clicking on the map. I built it incrementally, by adding features one at a time, which is consistent with the way you use VB.
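
The heart of a query by double-clicking on the map is deciding which polygon contains the clicked point and then looking up the warehouse facts for it. The Python sketch below shows one common way to do that, a ray-casting point-in-polygon test. It is not the code behind Figure 1 and not a MapObjects call. The region names, their square outlines, and the `FACTS` values are invented for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon`.

    `polygon` is a list of (x, y) vertices; the last vertex connects
    back to the first.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside

# Hypothetical regions and warehouse facts keyed by region name.
REGIONS = {
    "Westland": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Eastland": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
FACTS = {"Westland": 120, "Eastland": 75}

def query_at(point):
    """Return (region, fact) for the region under the click, or None."""
    for name, polygon in REGIONS.items():
        if point_in_polygon(point, polygon):
            return name, FACTS[name]
    return None

result = query_at((5, 1))  # a click inside the hypothetical Eastland
```

In a real application the click would arrive in screen coordinates and would first have to be converted to map coordinates, and the facts would come from a query against the warehouse rather than a dictionary.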

In this column, I have only glossed over what it takes to really link your data warehouse to a GIS. In my next column, I'll dig deeper and describe in detail how to make the connection from MapObjects to your own, real, dirty data. We'll also take a look at the serious geographic database extensions provided by the major DBMS vendors including Oracle, IBM, Microsoft, and others.

Meanwhile, if you are even just a moderate VB programmer, consider going to the ESRI MapObjects training school (see Resources). It will be a useful eye opener, and will give you confidence that you can extend your conventional data warehouse to be GIS- and spatially enabled.



Ralph Kimball co-invented the Star Workstation at Xerox and founded Red Brick Systems. He has three best-selling data warehousing books in print, including the newly released The Data Webhouse Toolkit (Wiley, 2000). Ralph teaches dimensional data warehouse design through Kimball University and critically reviews large data warehouse projects. You can reach Ralph through his Web site at www.ralphkimball.com.



RESOURCES

ESRI
MapObjects training school

 








