Managing Data Warehouse Growth
Climb Every Warehouse
One of the biggest challenges in business intelligence and data warehousing initiatives is managing growth. Conflicting demands to support more users, deal with increased query and data complexity, and add more "right time" information have many at a crossroads. We explore technological changes that will make it easier to scale and offer advice on guiding the growth.
By Richard Winter and Rick Burns
November 1, 2006
Data warehouses are growing rapidly--and not just in sheer size. Conflicting demands to support more users, increased query and data complexity and more "right-time" information have brought organizations to a crossroads. Stakeholders need to know: Do we have the right architecture for growth? Can we scale up without breaking the bank? And do we have the personnel to manage this beast? Companies fear that if they can't answer these questions, the competition could leave them behind.
Although the data-warehousing concept dates back to the 1980s, it wasn't until the mid-1990s that the notion of a separate store specifically designed to support data access and analysis became widespread. Now, with nearly every large business organization managing or at least planning a data warehouse, the race is on to see which company can extract the most value from its information. Dealing with scale is a critical part of that effort. As companies come to rely on BI (business intelligence) for strategic, tactical and now operational decision-making, data warehouses are arguably the centerpiece of most data-management functions.
We explore what's happening with data-warehouse scalability, based in part on results from the Winter TopTen Survey of the largest databases. Then, after looking at significant technology moves vendors are making to meet scalability needs, we offer some advice to guide your data warehouse growth.
From Back Office To Front Lines
Data warehousing hasn't stayed the same over all these years. When the discipline broke out in the 1990s, it was enough to give business strategists, financial managers and marketing specialists accurate and timely reporting. Most received data through homegrown tools, emerging independent BI products or reporting functions embedded in leading-edge applications. In recent years, the vision has evolved beyond reporting on what happened yesterday to forecasting what needs to be done tomorrow--if not initiating action today. Data warehousing is moving out of the back room to play an important role in what front-line employees do with suppliers, customers and partners, as well as in what those suppliers, customers and partners do for themselves in a self-service environment.
In the early days, it was a major triumph to report accurately which customers had defected to the competition. Companies knew it took time and money to acquire a customer. For firms where business performance was extremely sensitive to the dynamics of customer retention, it made sense to amass as much data as possible to understand both individual customers and patterns of behavior. With better information about which customers are likely to defect and which are the most valuable to retain, the next step has been to connect such knowledge to action.
Data warehousing is moving into a more time-sensitive realm as businesses want to turn information immediately into actions that support goals such as putting customers first, improving service and delivering special offers while customers are online or in the store. With timing being everything, companies are urgently trying to move data warehousing from a solely passive role to one that better fulfills the BI goal of delivering information at the right time and to the right people. Raising the bar of business value is welcome, but with greater relevance comes the pressure to deliver.
Along with accommodating growth in size, user population and workload, data warehouses must also handle more complex queries. And data latency--the time between receipt of the data and its availability for query--must become shorter and shorter.
In 1995, the first Winter Corp. survey of database size reported that the largest system in the world contained a terabyte of data. Ten years later, the biggest we found was 100 times larger. Even more astounding is the growth curve: the chart "Growth of Database Size" shows a tripling of the largest database identified (and thus publicly acknowledged; some organizations do not want such publicity) in each of the two most recent two-year periods. If this trend continues, we will see a 300 TB data warehouse in 2007--and that's real data, not just disk space allotted for the data. With 300 TB of data, the disk measure--often erroneously reported as the raw data figure--would likely reach 1,000 TB, or a petabyte.
Of course, there is much more to the story than terabytes. As the chart "Multidimensional Scalability" shows, scalability is also about user populations and workloads. The complexity of both queries and schemas is growing as well, and latency is continuously under pressure as data timeliness becomes paramount. Thus, overall scalability requirements are rising faster than the exponential curve for sheer size shown in the first chart.
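The arithmetic behind that projection can be sketched in a few lines of Python. This is only an illustration using the survey figures quoted above: the function names are hypothetical, 2005 is taken as the base year because it is ten years after the 1995 survey, and the disk-to-data ratio is simply inferred from the 300 TB and 1,000 TB figures rather than measured.

```python
# Illustrative extrapolation only, based on the figures quoted in the article.
LARGEST_1995_TB = 1.0         # largest database in the 1995 survey
LARGEST_2005_TB = 100.0       # "100 times larger" ten years later
TRIPLING_PERIOD_YEARS = 2     # recent trend: size triples every two years
DISK_PER_DATA = 1000 / 300    # assumption: ratio implied by 300 TB data -> ~1,000 TB disk


def project_size_tb(base_tb, base_year, target_year,
                    factor=3.0, period=TRIPLING_PERIOD_YEARS):
    """Extrapolate size, assuming it multiplies by `factor` every `period` years."""
    return base_tb * factor ** ((target_year - base_year) / period)


def annual_growth_rate(start_tb, end_tb, years):
    """Compound annual growth rate implied by growing from start to end."""
    return (end_tb / start_tb) ** (1 / years) - 1


data_2007 = project_size_tb(LARGEST_2005_TB, 2005, 2007)   # real data, in TB
disk_2007 = data_2007 * DISK_PER_DATA                      # allotted disk, in TB

decade_rate = annual_growth_rate(LARGEST_1995_TB, LARGEST_2005_TB, 10)
recent_rate = annual_growth_rate(1.0, 3.0, TRIPLING_PERIOD_YEARS)

print(f"Projected 2007 data size: {data_2007:.0f} TB")
print(f"Projected 2007 disk size: {disk_2007:.0f} TB")
print(f"Average annual growth, 1995-2005: {decade_rate:.0%}")
print(f"Annual growth implied by tripling every two years: {recent_rate:.0%}")
```

Under these assumptions, tripling every two years works out to roughly 73 percent growth per year, compared with an average of about 58 percent per year across the whole decade, which suggests the curve has steepened in recent years.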



