Welcome Guest. | Log In| Register | Membership Benefits

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
Home
Digital Library
Events
RSS | Newsletters
Webcasts




June 13, 2001



The Unity Dimension

A saner solution for when differing entites share a role

By Nick Galemmo
Edited By Ralph Kimball


Continued from Page 1

Unity Technique

A unity dimension involves the same technique as a variant, but its component dimensions are full. In my situation, the component dimensions were the existing Customer, Vendor, Plant, and Storage_Location dimensions. The unity dimension, Shipping_Points, contained basic information, such as name and address common to all four dimensions, and contained an equal number of rows as all four combined.

And, as in a variant dimension, all tables share the same surrogate key. This is important because it allows great flexibility in how you can use the dimensions. In essence, the primary keys of the component dimensions are foreign keys to the unity dimension.

While the backroom extract-load process is loading fact records, the natural key in the fact table record must be checked against the corresponding component dimension to obtain the proper surrogate key. The process is fairly complex because the load process must identify which version of the dimension record a natural key represents.

However, since it's safe to assume that the source system already "understands" that issue, you can usually identify the correct version of the dimension from the transaction context. Where ambiguity exists, the load process may be able to determine the version using multiple lookups and assign the correct surrogate keys to the fact record, saving the user work at query time.

In the final model, the Shipment_ Facts table contains two foreign keys to the Shipping_Points dimension, one for the point of origin and the other for the destination. (See Figure 1.) A generic query, such as an analysis of shipping volumes between any two points, joins to the Shipping_Points dimension to get descriptive information about the sending and receiving locations.

And, because the unity and its component dimensions share the surrogate key, it's easy to apply context to the queries. A query to analyze shipments between vendors and customers is a simple matter of joining the point of origin key with the vendor dimension, and the destination key to the customer dimension. The result is a structure that can handle both generic and context queries without snowflakes or complex SQL.

Unity Maintenance

Maintaining the unity dimension is easy. It is simply a matter of performing the same update to the unity dimension as you would to any of the component dimensions. Whenever you add a new row to a component dimension, add a row to the unity dimension using the same key. If you update a component dimension row, apply the same changes to the corresponding unity dimension row.

However, for this solution to work well, you must share a common surrogate key sequence among the component dimensions. Doing so will ensure the key values are mutually exclusive. If you have the luxury of building a new warehouse, I recommend using one surrogate key sequence for most, if not all, of your dimensions.



Rate This Article

Comments:

Optional e-mail address:

Because a 32-bit key would allow for more than four billion unique values, a typical application shouldn't run out of keys for at least a few hundred years. (If you are truly gun-shy about a problem like the Y2K bug or you work for a telecom company or government entity, you could always use a 64-bit key.) If you have an existing warehouse and use independent key sequences for each dimension table, I strongly suggest you consider reassigning keys before implementing a unity dimension.

You can retrofit a unity dimension into an existing warehouse without reassigning keys, but queries become messy. One approach is to define the unity dimension with its own surrogate key and store the component dimension's surrogate key and a type code, identifying the component dimension, as a composite natural key to the unity dimension. The fact table will contain foreign keys to the unity dimension as before. If you wish to query using a component dimension, you will be forced to join through the unity dimension to the appropriate component dimension, filtering on the type code. The result is a snowflake query that can become quite complex and performs poorly.

You may represent a unity dimension as a subgroup cluster in your model. Doing so gives a fairly intuitive representation of the structure.

This unity dimension solution clearly has value when modeling logistics; you can also apply it in any situation where different business entities share a common role.



Guest columnist Nick Galemmo, [nicholas.galemmo@us.nestle.com] is an information architect at Nestle USA. He has more than 26 years of IT experience, with the last eight in data warehousing, and numerous data warehouse and system integration projects under his belt.





IE Weekly Newsletter
Subscribe to the newsletter
    Email Address