October 26, 1999, Volume 2 - Number 15
Dont put the cart before the horse when creating profitability data marts
Not So Fast
Ralph Kimball
Profitability is a central theme in many data warehouse designs. Just the existence of the data warehouse
implies a desire to understand the business in a fine-grained, analytic way and if youve
gone that far, youre surely interested in profitability analysis. However, although concern with
profitability might be foremost among your priorities, an attempt to start there is jumping the gun. You
need to complete the intermediary steps before you can deliver a profitability data mart, and there are a
few ways to go about it.
For the last 10 years, I have started almost every speech and class with the observation that data
warehouses are usually initiated by the twin, interwoven themes of: understand the customer and
measure profitability every which way.
I have seen a number of data warehouse projects start with
just one simple requirement: Show customer profitability. Gee, maybe after doing customer
profitability, we can do billings, and then maybe service calls.
But it is almost impossible to start
with profitability, customer or otherwise, as the first data mart. You can achieve a view of
profitability only after all the components of revenue and cost have been separately sourced and brought
into the data warehouse. I call these separate components of revenue and cost first-level data
marts. Profitability is definitely a second-level data mart and can be built only after the
first-level data marts are complete and available as data sources.
Lets explore some of the issues you face when building a profitability data mart, with an eye
toward satisfying the anxious business managers who want to get started measuring the profitability of
their business every which way.
Find the Components. The first step in building a profitability data mart is to determine the
profit equation. Then you must find all the individual sources of revenue and cost. A profitability data
mart should present its results in a standard little income statement. The simplest view of an income
statement is:
Revenue Costs = Profit.
Typically the revenue and costs are broken down into a number of standard buckets that make sense across
as many of the product or service lines of the business as possible. Generally you would expect an income
statement to fit on one side of a sheet of paper. To be more specific, take the familiar example of a
large packaged-goods manufacturer that ships its products to commercial customers. The
manufacturers shipment invoice provides a starting point for a typical income statement, the basis
of a profitability data mart:
Gross Revenue at List Price
Minus revenue adjustment for
finance terms
Minus revenue adjustment for marketing promotions
Net Revenue
Minus manufacturing costs
Minus storage costs
Minus freight costs
Minus cost of returns
Minus allocated marketing costs
Profit
The first thing you should notice is that the true revenue target is the net revenue, shown on the fourth
line, because that may be the revenue reported to stockholders; the gross revenue on line one is based on
list prices. But youll usually want to look at both.
Second, youll see that although
youve listed some important costs, youre not building the complete income statement as shown
in the companys annual report. The income statement shown here is activity based and omits general
costs such as R&D and the CEOs salary. The marketing and finance departments, likely users of
the profitability data mart, usually are willing to see this activity-based subset of the total company
profit. Thats fortunate, because an activity-based income statement is a more practical goal for
the data warehouse team.
As soon as you define the income statement, which I call the profit equation,
you must source all the data requiring the data warehouse team to do a lot of sleuthing.
Youll find some of the revenue and cost data in operational transactional form but some of it may
show up only in accounts payable. Maybe you periodically write a bonus check to your large customers
whose contracts call for a certain threshold of purchases. Such an expenditure is certainly a cost, but
it wont show up in the invoice transaction system.
Marketing and Finance Need to Help. After establishing the profit equation and identifying the
sources of revenue and cost data, you can design all the first-level data marts representing these
sources. Each source yields a separate data mart. If you have been reading this magazine, you know that
you can build separate data marts this way, if and only if you conform your dimensions and facts.
Conforming means finding an enterprisewide standard definition. Conformed dimensions, such as
calendar, customer, location, and product, are used with every data mart. No exceptions to the conformity
are allowed. Conformed facts, such as revenue and costs, must be mathematically combinable across the
separate data marts.
It is essential to involve both marketing and finance in the conforming step. Through their involvement
in this conforming exercise, these groups will deepen their commitment to an enterprisewide definition of
the data warehouse. Whats more, you will be covertly warming these groups up for an even more
controversial exercise that lies ahead: the dreaded allocations.
Allocations: The Heart of the Profitability Data Mart. As soon as you have established the
bus architecture of conformed dimensions and facts, the data mart teams can begin building
the separate first-level data marts around each of the components of revenue and cost. As these
first-level data marts become available to the community, they will be useful for restricted analyses
within each of these subject areas. But you dont have a profitability data mart yet!
The central truth is that in order to present a view of customer profitability, or product line
profitability, or geographic profitability, or promotions profitability, you have to assign all the
components of cost down to an atomic level. To continue the manufacturer example, you have to assign them
down to the line item on the shipping invoice.
Yes, I know that freight costs, for instance, pertain
to the shipment as a whole and are not assigned to the individual line item. This is where it gets hard.
You must assign the costs all the way down to the lowest level. If you dont, in the
manufacturing example, you will be unable to build even product line profitability because the freight
costs wont be represented anywhere against products. It doesnt help to try rolling up
individual products to some higher level and then assigning the costs. Either the costs are at the lowest
level, or they dont participate in the roll-up.
Remember that your strong implicit goal is to see
profitability every which way. In other words, you want the income statement, which is a list of
numerical facts, to be surrounded by a rich set of dimensions, including customer, product, time,
location, promotion, and others. Then you just constrain on your favorite dimension and see that version
of profitability. The most dimensional piece of data you have is the lowest piece of data, the invoice
line item. Hence, you must build the complete income statement at that level.
Once your marketing and
finance partners have dealt with conformed dimensions and facts, it is time to give them a harder
assignment: deciding on the allocations. Doing so entails apportioning costs down to the lowest level of
granularity in your base profitability fact table. In the manufacturing example, the shipping invoice
line item represents this level.
If Youre in a Hurry
It should be obvious at this point that you cant release the profitability data mart until
youve physically sourced all the components of revenue and cost and defined all the allocation
rules. Hence, a properly implemented profitability data mart could take years to release, especially if
many separate data sources underlie the revenue and (especially) costs.
If you dont have the
luxury of this kind of time, there is an interesting alternate path of development, as long as all
parties understand and agree.
At the outset, get marketing and finance to agree on rules of thumb for
the costs, before the underlying cost-driver data marts are available. Therefore, for instance, you
decide that each products manufacturing costs are 27 percent of the list price, and freight costs
are $1.29 per pound. By using rules of thumb, and by carefully sourcing the revenue components, you can
build a complete income statement.
Of course, it isnt very accurate at first. Everyone knows that. But you put enough work into the
rules of thumb that people are willing to live with release 1.0 of the profitability data
mart. Now you progressively upgrade the profitability data mart as the separate sources of cost data come
online in their own first-level data marts and are incorporated into the profit equation through the
allocation rules you defined.
Another advantage is that users become familiar with the concepts of
customer profitability, as well as the other slices of profitability, at an early point in your data
warehouse. The users will think of new interpretations and new requirements as their experience grows. As
you are building the separate cost-driver data marts one at a time and making new releases of the
profitability data mart, you can incorporate some of these new perspectives. Who ever said the data
warehouse is a static thing, anyway?
Ralph Kimball, Ph.D., co-invented the Star Workstation at Xerox and founder of Red Brick
Systems, works as an independent consultant designing large data warehouses. He is the author of The Data Warehouse Toolkit (Wiley, 1996) and the newly published The Data Warehouse Lifecycle Toolkit (Wiley, 1998). You can reach him
through his Web page at www.ralphkimball.com.