Welcome Guest. | Log In| Register | Membership Benefits

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
Home
Digital Library
Events
RSS | Newsletters
Webcasts


April 20, 1999, Volume 2 - Number 6


Customer Marketing Databases on the Growth Curve

These tips for building a scalable customer marketing database will win you new customers and overcome draining storage demands

By Steve Helle and Darryl Petruska



Maintaining tremendous customer data volumes is a critical goal in every enterprise. One way you can break ahead is with a customer marketing database. Combining customer data and organizational interaction with those customers, such databases can provide unprecedented benefits — with substantial challenges for your IT organization. Most customer databases traditionally stored 3,000 bytes of information in the early 1990s. But today’s systems require 20,000 storage bytes per customer — nearly 3TB if you’re managing 150 million customers. And the shotgun, mass-marketing techniques of previous years are rapidly being replaced with more focused approaches targeted to the individual. It’s essential to explore the techniques that can help you build successful, scalable customer marketing databases.

Growing Scalability Needs

As more organizations try to attract, win, and retain individual customers, having a database that supports these efforts can be the best strategy. But these databases make several demands on processing capacity. Customer names and addresses are typically lengthy, and an organization will frequently maintain name and address variations — those supplied by the customer, standardized according to U.S. postal service guidelines, and historical addresses. Combine this information with data about a customer’s lifestyle and patterns (commonly referred to as “demographics”) and historical interactions with an organization, and you’ll have a database that requires a huge amount of storage.

Such a database also has unique processing requirements. In order for you to use names, postal addresses, and email addresses in customer contacts, systems need to standardize the information. For instance, in a traditional direct marketing environment, a company needs to analyze names and addresses and put them in a format suitable for mailing. A company could convert a name from a first initial to a full name if the name were stored on file. A system might try to determine gender and apply a title to a name. Most often, systems correct existing address information by matching to proper street, city, and zip codes from a postal service file. When you standardize the information, you process it against a National Change of Address (NCOA) database.

These processing requirements often demand multiple sorts and passes through a customer information database or file. Many organizations have information on more than 100 million customers and prospects. Therefore, companies deploy parallel processing to meet processing- time requirements.

Designing Your Database

Prior to designing a customer marketing database, identify all existing customer information, because it’s critical for marketing purposes. During the planning phase, you should identify the business value propositions — the real value that such a database system will deliver, in meaningful business terms. A sample business value proposition might be to increase an organization’s market share by n percent by achieving a certain growth rate in new customers. During the planning phase, stay focused on the organization’s marketing needs. Marrying customer data with business value propositions provides you with a project that’s scoped properly, giving you a jump on the competition.

Figure 1 is an example of a customer marketing database environment that captures customer data from operational systems. These tips can help you avoid the pitfalls and create an effective database:

FIGURE 1 Typical customer marketing database environment.

Identify customer and prospect information sources and create a consolidated customer model. Typically, an organization may have customer information scattered across several operational systems. Including prospect names and addresses will increase the size of the database substantially. You need to decide which sources are relevant for marketing purposes. You also have to consider several issues when designing the data model: Does the customer base consist of both individuals and businesses? Should you identify customers at both the individual and household levels? Also, review what demographics are available, how often they change, and how valuable it is to track their values historically. What level of promotion history is required? Will you apply dynamic scoring models to the database?

Design the extract and transformation process. One of the most difficult tasks is actually populating the customer database — standardizing names and addresses, applying address changes, and uniquely identifying households and customers. There are several software packages and service bureaus that specialize in these areas. The goal is to create a consolidated customer view. Ideally, you should use a persistent numeric key for customer identification. A concise identifier is critical for performance in a large-volume environment.

Support multiple marketing activities. You have to employ data mining, modeling, statistics, and scoring to improve understanding about the customers and their behavior. Marketers perform iterative ad hoc analyses to determine which customers to promote. During campaign execution, systems managing mailings, telemarketing, or email will use large volumes of customer contact information. Finally, you have to track promotion history and analyze marketing activity results based on responses captured in the operational systems. A dimensional data warehouse with specific extensions is the best approach to supporting the entire direct marketing cycle.

Store detailed information about the customer’s interaction with the business. Examples of such information include purchases, requests, complaints, returns, payments, and shipments. You can manage business activity most effectively when it’s modeled at the lowest meaningful level of detail using a star schema. Ralph Kimball provides an excellent guide to this approach in his book The Data Warehouse Toolkit (Wiley, 1996). You can then use summary tables to enhance performance. These business activity models are the basis for measuring the organization’s success in establishing and maintaining customer relationships.

Adapt the traditional star schema to manage large customer volumes. Customers are often modeled as the lowest level of a geographic dimension in a star schema. This approach has serious drawbacks when you’re marketing directly to individuals. For example, people tend to move more frequently than businesses. Also, for organizations with large customer bases, the size of the customer dimension becomes unwieldy in terms of record length and number of rows. These characteristics can ruin the star schema’s performance. The solution is to optimize the model based on how your organization uses different customer information categories.

Isolate detailed customer contact information. This activity includes lengthy fields such as name, postal address, phone number, and email address. At the most detailed level, this information isn’t dimensional in nature. For example, a marketer will probably not ask for counts of customers named “Jane” that live on “Main Street.” Therefore, place current customer contact information in its own table. Although this table is logically a dimension, users will typically access it during an extract or single-customer lookup. You should redundantly model any dimensional contact fields such as state or area code in a separate dimension table.

Model geography independent of the customer table. Geography usually translates into two distinct dimensions based on zip code and census. This approach has several advantages. Not only are you preserving dimensional query performance, but you’re also avoiding the difficult task of tracking address history (at least in the warehouse). You can apply address changes directly to the customer table. You can also maintain geographic history through the relationship of the geography dimensions to the various fact tables. Finally, the geography dimensions are the natural place to store demographics based on zip code or census.

Figure 2 illustrates a typical model of business activity using the principles previously described. You store facts about the business in the Txn Fact table with the common dimensions of Product and Time. A Promotion dimension based on the organization’s direct marketing activity is typically included. Zip Geography and Census Geography are dimensions that support geographic analysis. Finally, you store customer contact information in the Customer table, including fields that could be derived from the geography dimensions on the Customer table in order to minimize joins during extracts.

FIGURE 2 Typical customer marketing database model.

Analyze demographic attribute use. Organizations often have access to customer and household information such as income, marital status, and home ownership. You can collect this information directly from customers or purchase it from outside data providers. But modeling such a variety of attributes poses special problems. They’re usually independent and don’t group naturally into hierarchies. Adding them to the customer table degrades the ability to use them dimensionally. Simply placing them on the fact table can also affect performance negatively. You can make the best decision on how to model these fields by analyzing demographic attribute use.

Create dimensions for critical demographics. Even when an organization has access to a selection of customer demographics, only a few are commonly used dimensionally in conjunction with the large transactional fact tables. You should build standard dimension tables for these attributes. Figure 2 depicts two examples of this type of dimension. HH Demographics contains household-level characteristics, and Ind Demographics stores attributes that pertain to individuals. Again, this dimensional approach tracks historical changes naturally without complicated date comparisons. For example, if a customer marries, old orders will have one dimension key, yet new orders will have another.

Model additional attributes in a customer profile. If the organization maintains many customer characteristics, it may not be feasible to model all of them as dimensions. In such a case, you should create a customer profile. In essence, this table is a customer-level summary. You can either maintain the table as a current view or store it as a series of snapshots. For each customer, this table contains the full set of demographic attributes. You can also add transactional summaries such as total orders and dollars, as well as use bitmap indexes to obtain fast customer counts. Figure 2 contains a Customer Profile table defined as a snapshot.

If you store the proper attributes, the customer profile table can become the basis for segmentation and scoring. Segmentation is the process of dividing the customer base into distinct groups for promotional purposes. Scoring involves applying a model to rank customers. You can store scores derived from multiple models directly on the profile. Various statistical, data mining, and campaign management packages are available that perform segmentation and scoring.

Model critical facts as dimensions. Although you usually model fields such as order dollars, number of orders, and days since last order as facts, marketers often use groupings of these facts as a basis for segmenting their customer base. For example, a marketer may wish to create a report that shows customer counts grouped by region and ranges of total purchase dollars. You can derive such a query from the detailed fact tables. However, the query requires aggregating totals by customer, assigning each derived total to the appropriate range, and then counting the number of customers in each range. Most query tools will not support this type of analysis directly. In addition, such analysis is time consuming because you’re aggregating the totals from the detail for each query. Therefore, it’s useful to create a dimension table that contains ranged values of critical recency, frequency, and sales measures. The Activity table in Figure 2 is an example.

Track detailed promotion history. Create a promotion fact table that contains an entry for each promotion you send to a customer. In many organizations, this table can reach incredible volumes. You can use summary tables to analyze overall promotion response. However, you need the detail to compare the differences between those customers and prospects who respond and those who don’t. You can use partitioning, archival, and historical summarization judiciously to help manage the volume. The Promotion Fact table in Figure 2 is a simple example of this process.

By understanding the unique characteristics of a customer marketing database, you’ll be able to identify and consolidate customer information, categorize the information by use, and implement the appropriate data model. A properly designed database is an important tool in your efforts to enhance your customer relationships.

Adding Business Value

In the current business climate, deregulation, global expansion, and intense competition for customers across new and quickly changing communication channels abound. It’s vital for organizations to abandon mass-marketing approaches. Two examples based on our experiences within the past few years in Europe point to the need for customer marketing databases. A large European bank was concerned that American banks would continue their expansion in Europe and would steal the European bank’s profitable customer base. Unfortunately, the European bank couldn’t even identify which customers it wanted to keep. In another instance, a major telecommunications firm and former monopoly that was suddenly facing European telecom market deregulation stated it wouldn’t be surprised if it could identify only half of its customers — not necessarily the profitable ones. The problem was that half the telecommunications firm’s customer base comprised nearly 40 million people. In these environments, knowing your customers individually — not as a homogeneous group — is vital to survival.

Capturing vital customer information also benefits Web activity. How many people visited your Web site yesterday? What did those visitors tell you about your site’s structure? Did you get a five percent visitor order rate, or was it more like one percent? You can learn this information by capturing traffic from your Web servers and placing pages strategically in your Web site. Many satisfied customers visiting your site are quite often eager to tell you more about themselves and their interests. What if someone just gave you permission to email them targeted marketing messages? Or — better yet — asked you to send them email promotions? You’ve just achieved a positive response for next to no cost at all. Compare that to the traditional approach of direct marketing, often at a cost in excess of a dollar per direct mail recipient. You’re lucky to get greater than a one-percent response rate in that environment.

Clearing the Clutter

Individual customers and prospects are constantly bombarded with untargeted information that clutters precious minutes in their day. Companies will win customers by delivering meaningful marketing messages to individuals. Building information in a customer marketing database that’s centered on a customer, and not many customers, can provide a tremendous competitive advantage. Building a customer marketing database that can identify millions of individuals may seem like a daunting challenge. But with a few good design principles, the systems of today can deliver an efficient vehicle in the battle for customers.

Steve Helle is the CEO of DataInsight, a systems integration firm that develops database solutions for Internet traffic and customer marketing data.

Darryl Petruska is a senior principal with DataInsight who leads customer marketing database implementations.

You can reach Steve and Darryl through the DataInsight Web site at www.data-insight.com.





IE Weekly Newsletter
Subscribe to the newsletter
    Email Address








Enabling People and Organizations to Harness the Transformative Power of Technology