Make sure your packaged application is part of your data Webhouse

Welcoming the Packaged App

The tremendous rush toward customer relationship management (CRM), e-business, and business intelligence has brought many end-user departments into the computer marketplace as new customers. This demand is almost entirely great news for us data warehouse and data Webhouse implementers.
We have finally gotten the order to put a data foundation under nearly every business decision and business relationship. Business-to-customer (B2C) and business-to-business (B2B) relationships are both data intensive, and our end-user departments are very aware of this fact. In a sense, with the Web providing a big push recently, business has at last committed heavily to managing by the numbers. Now all we IT folks need to do is build the infrastructure to support this revolution.
Our marketing, sales, finance, and operations department users are in a hurry to keep up with their marketplaces and competition, and they are buying packaged application solutions to satisfy their urgent needs. The heavyweight packaged applications are the ERP systems, many of which were installed prior to the Y2K boundary. But by counting the sheer number of software license sales, the real growth has been at the lower end of the packaged applications market for processes such as sales pipeline and call center management, campaign management, CRM, and business intelligence (BI).
The packaged application providers deliver a very useful service because they have already written the software. But at the same time, these providers may not emphasize the system and interfacing issues that make their applications function in the larger IT context, perhaps to avoid IT scrutiny and the longer sales cycles involved.
Avoid Stovepipe Data Marts
What happens if you fail to welcome the packaged application as a full member of your data Webhouse? The packaged application will become a stovepipe data mart.
Those of us a little long in the tooth have seen this scenario before. Throughout the 1980s, syndicated grocery and drug store scanner data was sold on turnkey hardware directly to retail and manufacturing marketing departments, without IT involvement. Back then, it was a little easier to get away with such a strategy because there were no networks to speak of. But for those of us building the first integrated data warehouses and trying to combine the syndicated data with internal shipments and finance data, it was a nightmare: None of the dimensions in the syndicated data conformed to the internal data.
In the syndicated data, the Time dimension had the grain of quad weeks (four-week intervals unrelated to calendar months), the Product dimension had a vendor-supplied rollup hierarchy, and the Market dimension consisted of about 54 specially constructed markets that did not align with state boundaries. The infamous 54th market was called All Other and comprised all the spaces in between the first 53 markets! I remember being asked (or told outright) on several occasions to bring the syndicated data into the data warehouse, and then deciding that the task was an impossible one.
Conforming dimensions is the key to a successful distributed data Webhouse, and you have to do so from the most granular data during the data preparation phase. For the packaged application data, this process occurs in the back room of the package provider, and for internal company data, it occurs in the backroom data staging area.
Automatic Conforming
Conforming dimensions on the fly, where the allocation factors between out-of-conformance data sources are applied at query time, has been a dream for many years, but it is computationally intensive and slows down real-time queries. More seriously, this approach still requires the assignment of hundreds or possibly thousands of allocation factors. For example, you need to decompose time spans that overlap in awkward ways between two data sources into individual days, and then roll them up again. It is rarely satisfactory to assume each day of the week is equal; maybe you do much more business on weekdays than on weekends. You also need to make product hierarchies exactly equivalent between two data sources. The names of categories and departments must be drawn from the same domain, must have the same contents, and must be spelled consistently. Finally, you have to make geographic zones and regions identical in the same way. Conforming incompatible geographies accurately is especially challenging because it may involve a sophisticated model of population densities and demographic patterns. Thus, beware of anyone claiming to conform separate data sources without your extensive involvement!
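To make the time-span example concrete, here is a minimal Python sketch of decomposing one four-week period into days with weekday allocation factors and rolling the days back up to calendar months. The weights, the function names, and the sample period are all assumptions made up for illustration; real factors would have to come from your own daily history.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical allocation factors by weekday (Monday=0 ... Sunday=6).
# These assume weekdays carry more business than weekends.
WEEKDAY_WEIGHTS = [1.2, 1.2, 1.2, 1.2, 1.2, 0.5, 0.5]


def allocate_to_days(start, num_days, total, weights=WEEKDAY_WEIGHTS):
    """Spread one period total across its days in proportion to weekday weights."""
    days = [start + timedelta(days=i) for i in range(num_days)]
    weight_sum = sum(weights[d.weekday()] for d in days)
    return {d: total * weights[d.weekday()] / weight_sum for d in days}


def roll_up_to_months(daily):
    """Re-aggregate daily amounts to calendar months keyed by (year, month)."""
    monthly = defaultdict(float)
    for day, amount in daily.items():
        monthly[(day.year, day.month)] += amount
    return dict(monthly)


# A made-up quad week that straddles the January/February boundary.
quad_week = allocate_to_days(date(2000, 1, 17), 28, 2800.0)
by_month = roll_up_to_months(quad_week)
```

Even this toy version needs seven factors for a single measure. Multiply that by products, markets, and seasons and the count of allocation factors grows quickly, which is the point of the paragraph above.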
Even if we could conform on the fly, the architecture isn't correct; you would have to re-apply the allocations in each query, repeatedly. You will never have enough computing horsepower to make this approach the best solution.
Vendors Do Take Integration Seriously
Are vendors aware of these issues, and do they encourage their applications to be functioning components of a larger data Webhouse? Remarkably, a whole movement known as enterprise application integration (EAI) is addressing this set of issues. Although the EAI vendors are mostly focused on transaction processing, they are defining exactly what we need to make the distributed Webhouse function. They are defining a framework, principally through extensible markup language (XML), for transmitting business results back and forth, whether for B2C or B2B. Customers ordering a book fill out a form on the Web, and parts of the form such as the book's ISBN, title, author, price, tax, and shipping charge are all described in an accessible language that every customer and business observer can use. This arrangement allows incompatible applications to pick up the information and consume it locally. It also lets a data Webhouse read the order form and populate its fact tables.
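As a rough illustration of a Webhouse reading such an order form, the Python sketch below parses a small XML document and builds one fact row from it. The element names, the order ID, the field values, and the fact row layout are all hypothetical; they are not taken from any real EAI or XML standard.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical book order; element names and values are illustrative only.
ORDER_XML = """
<order id="A-1001">
  <isbn>0000000000</isbn>
  <title>Example Title</title>
  <author>Example Author</author>
  <price>49.99</price>
  <tax>4.12</tax>
  <shipping>5.00</shipping>
</order>
"""


def order_to_fact_row(xml_text):
    """Turn one order document into a dictionary shaped like a fact table row."""
    order = ET.fromstring(xml_text.strip())
    price = Decimal(order.findtext("price"))
    tax = Decimal(order.findtext("tax"))
    shipping = Decimal(order.findtext("shipping"))
    return {
        "order_id": order.get("id"),              # degenerate dimension
        "product_natural_key": order.findtext("isbn"),
        "price_amount": price,
        "tax_amount": tax,
        "shipping_amount": shipping,
        "total_amount": price + tax + shipping,
    }


row = order_to_fact_row(ORDER_XML)
```

In a real staging area, the ISBN would still have to be translated into the conformed Product dimension key before the row is loaded.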
Besides supporting the common data interchange XML offers, what can a packaged-application vendor do to make us take their integration responsibility seriously? Here's my list of recommendations:
Offer to supply the package data in terms of the customer's dimensions, especially Product (or Service), Customer, Geography, Calendar, Status, and Transaction. Charging more for this service is acceptable, provided it requires extra processing or special software development.
Call on IT early in the sales cycle. If you handle this correctly, IT should be grateful that an end-user department has recognized a need for a packaged application and has articulated the business requirements seriously. But both the package provider and the end-user department must agree that IT will be responsible for running the hardware and software and integrating the information into the fabric of the organization.
Publish the package data interface specifications so that the IT department can extract all the important dimensional data and fact data into a remote data Webhouse. In fact, give up all notions that the packaged application is itself the enterprise data warehouse. Admittedly, this recommendation is strongly worded, but in my experience, the concerns for most packaged application providers are transaction processing and expanding and protecting their proprietary interests. These goals rarely translate into genuinely effective support for data Webhousing, which demands a profound level of integration among many data sources, simplicity of presentation, and no-compromise speed.
Allow the import of data from the rest of the organization or from business partners, even if these third parties aren't using the packaged application.
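One way to picture the first and third recommendations is a small translation step in the staging area that restates package rows in terms of the warehouse's conformed Product dimension. The Python sketch below is only an assumption about how that might look; the product codes, surrogate keys, and field names are invented for illustration.

```python
# Hypothetical lookup from the package's own product codes to the
# warehouse's conformed Product dimension surrogate keys.
PRODUCT_KEY_MAP = {"PKG-001": 101, "PKG-002": 102}


def conform_rows(package_rows, key_map):
    """Split package rows into those that map to a conformed key and those that don't."""
    conformed, rejected = [], []
    for row in package_rows:
        key = key_map.get(row["product_code"])
        if key is None:
            # Unmapped codes are held back for review rather than guessed at.
            rejected.append(row)
        else:
            conformed.append({"product_key": key, "amount": row["amount"]})
    return conformed, rejected


# Made-up extract from the packaged application.
extract = [
    {"product_code": "PKG-001", "amount": 120.0},
    {"product_code": "PKG-999", "amount": 35.0},
]
conformed, rejected = conform_rows(extract, PRODUCT_KEY_MAP)
```

Holding back the unmapped rows, rather than loading them under a guessed key, keeps the conformed dimension trustworthy while someone resolves the mismatch with the package provider.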
Fortunately, powerful forces are pushing in the right direction to make application integration real. The whole move to distributed supply chain management requires an honest effort to share data. No vendor that thinks about the problem wants to create an objection to sales by being perceived as so proprietary that it defeats the idea of e-commerce.
So what is the best strategy for IT, given the immense and sometimes undisciplined growth of packaged applications?
Spend time with your end-user departments to see what they need, and be forewarned when they are looking at a packaged-application solution. Credibility with the end users has always been the gold coin for IT, but now it is even more urgent. The world is really beginning to manage by the numbers over the Web. IT is responsible for building the infrastructure for the Webhouse revolution.
Be on the packaged application selection committee so you can ask the right questions early in the process. Have the provider respond to the list of interfacing responsibilities I described.
Begin the process of integrating the packaged application into the rest of your information structure as quickly as possible.
Learning More About These Topics
Inevitably, an article like this one leaves behind a whole host of questions. How do you learn about all these industry developments? Where can you find information about B2B, B2C, CRM, EAI, and XML? One of the reasons I write for Intelligent Enterprise is that addressing these topics is its mission. (I hope you keep your back issues.)
My other favorite research tool is the Google search engine, which I have mentioned many times. Open www.google.com and type in data warehouse CRM or EAI or packaged application CRM. Google organizes its results in order of the Web pages that other sites reference more often; thus, it seems to return very relevant results from abstract searches. I recommend it highly.
I've focused here more on the small application packages that are potentially little islands of data. The really big ERP application packages present an additional set of challenges and will get their own special treatment in the next issue.
Ralph Kimball co-invented the Star Workstation at Xerox and founded Red Brick Systems. He has the three best-selling data warehousing books in print, including the newly released The Data Webhouse Toolkit (Wiley, 2000). Ralph teaches dimensional data warehouse design through Kimball University and critically reviews large data warehouse projects. You can reach Ralph through his Web site at www.ralphkimball.com.




