Kimball University: Six Key Decisions for ETL Architectures > > Intelligent Enterprise: Better Insight for Business Decisions

Welcome Guest. | Log In| Register | Membership Benefits

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
RSS
Webcasts
Digital Library
Subscribe
Home


  • EMAIL
  • PRINT
  • REPRINTS
  • Follow Us on Twitter
  • FOLLOW US
  • Share

Kimball University: Six Key Decisions for ETL Architectures


Best-practice advice on software vs. coding, where to integrate, how to capture changed data, when to stage data, where to correct data and what latency levels to shoot for.


By Bob Becker
October 9, 2009

Bob Becker Bob Becker
This article describes six key decisions that must be made while crafting the ETL architecture for a dimensional data warehouse. These decisions have significant impacts on the upfront and ongoing cost and complexity of the ETL solution and, ultimately, on the success of the overall BI/DW solution. Read on for Kimball Group's advice on making the right choices.

1. Should We Use an ETL Tool?

One of the earliest and most fundamental decisions you must make is whether to hand code your ETL system or use a vendor-supplied package. Technical issues and license costs aside, you shouldn't go off in a direction that your employees and managers find unfamiliar without seriously considering the decision's long-term implications. This decision will have a major impact on the ETL environment, driving staffing decisions, design approaches, metadata strategies, and implementation timelines for a long time.

In today's environment, most organizations should use a vendor-supplied ETL tool as a general rule. However, this decision must be made on the basis of available resources to build and manage the system. ETL tools are really system building environments that use icons, arrows and properties to build the ETL solution rather than writing code. Be careful; if your proposed ETL development team is comprised of a number of old-school hand coders, they might not adapt well to an ETL tool. For this reason alone, some organizations find that custom ETL development is still a reasonable solution.

If you decide to use an ETL tool, don't expect a huge payback in your first iteration. The advantages will become more apparent as you traverse additional iterations and begin to leverage the development advantages of using a tool during subsequent implementations. You'll also experience the benefits of enhanced maintenance capabilities, more complete documentation and improved metadata support over time. The article "Should You Use an ETL Tool," by Joy Mundy, provides an in-depth look at the pros and cons of using an ETL tool.

2. Where and how should data integration take place?

Data integration is a huge topic for IT because, ultimately, it aims to make all systems work together seamlessly. The "360 degree view of the enterprise" is a commonly discussed goal that really means data integration. In many cases, serious data integration should take place among an organization's primary transaction systems before data arrives at the data warehouse. However, rather than dealing with integration in the operational environment, these requirements are often pushed back to the data warehouse and the ETL system.

Most of us understand the core concept that integration means making disparate databases function together in a useful way. We all know we need it; we just don't have a clear idea of how to break it down into manageable pieces. Does integration mean that all parties across large organizations agree on every data element or only on some data elements? This is the crux of the decision that must be made. At what level in the data does business management agree/insist integration will occur? Are they willing to establish common definitions across organizations and to abide by those definitions?

Fundamentally, integration means reaching agreement on the meaning of data from the perspective of two or more databases. With integration, the results of two databases can be combined into a single data warehouse analysis. Without such an accord, the databases will remain isolated stovepipes that can't be linked in an application. For the context of our ETL environment, data integration takes the form of conforming dimensions and conforming facts in the data warehouse. Conforming dimensions means establishing common dimensional attributes across separated fact tables so that "drill across" reports can be generated using these attributes. Conforming facts means making agreements on common business metrics such as key performance indicators (KPIs) across separated databases so that these numbers can be compared mathematically for calculating differences and ratios.


  • EMAIL
  • PRINT
  • REPRINTS
  • Follow Us on Twitter
  • FOLLOW US
  • Share


 





New on the BLOG
SAP Gets Microsoft Nod on Performance Management
11.19.2009
blog author
Cindi Howson
SAP announced yesterday that Microsoft supports SAP BusinessObjects Planning and Consolidation (formerly known as OutlookSoft) as a "preferred solution"... It makes sense that Microsoft would prefer SAP BusinessObjects' product over others in that it runs on the Microsoft platform...

Read more from Cindi Howson >>

Doug Henschen
Text Mining: The Intersection of Content & BI
Companies used to employ armies of people to read through documents such as customer satisfaction surveys, but it took longer, cost a lot more money and yielded far less detailed, reliable and consistent information than you can now quickly uncover using text mining applications.

11.18.2009
Read more from Doug Henschen >>

Is the SharePoint Bubble Going to Burst?
11.18.2009
blog author
Alan Pelz-Sharpe
There is plenty to be excited about in SP2010, especially if you belong to the SharePoint channel of resellers, consultants, developers, and system integrators... but I am starting to get the distinct impression that the SharePoint bubble is about to burst.

Read more from Alan Pelz-Sharpe >>



Intelligent Enterprise Newsletters
Subscribe Here:
*Email:
 First Name:
 Last Name:
  Intelligent Enterprise Blogosphere Newsletter:
  Intelligent Enterprise Newsletter:

Email Type: