Declaring the GrainIt's the most important dimensional design step after identifying data sourcesIn debugging literally thousands of dimensional designs from my students over the years, I have found that the most frequent design error by far is not declaring the grain of the fact table at the beginning of the design process. If the grain isn't clearly defined, the whole design rests on quicksand. Discussions about candidate dimensions go around in circles, and rogue facts that introduce application errors sneak into the design. Declaring the grain means saying exactly what a fact table record represents. Remember that a fact table record captures a measurement. Example declarations of the grain include:
In Business TermsNotice that most of these grain declarations are expressed in business terms. Perhaps you were expecting the grain to be a traditional declaration of the fact table's primary key. Although the grain ultimately is equivalent to the primary key, it's a mistake to list a set of dimensions and then assume that this list is the grain declaration. This is the most common mistake in my students' designs. In a properly executed dimensional design, the grain is first anchored to a clear business object (no pun intended) and a set of business rules. Then, the dimensions that implement that grain become obvious. So, when you make a grain declaration, you can then have a very precise discussion of which dimensions are possible and which are not. For example, a line item of a doctor's bill likely would have the following dimensions:
And quite possibly others. Powerful EffectsIf you've been following this example, I hope you've noticed some powerful effects from declaring the grain. First, you can visualize the dimensionality of the doctor bill line item very precisely, and you can therefore confidently examine your data sources, deciding whether or not a dimension can be attached to this data. For example, you probably would exclude "treatment outcome" from this example because most medical billing data doesn't tie to any notion of outcome. But a general entity/relation-oriented "data model" of doctor visits might well include treatment outcome. After all, in an abstract sense, doesn't every treatment have an outcome?
|
Most Popular This Week
IE Weekly Newsletter
Subscribe to the newsletter
|
|
|




