Intelligent Enterprise: Better Insight for Business Decisions

Kimball University: Eight Recommendations for International Data Quality


Language, culture, and country-by-country compliance and privacy requirements are just a few of the tough data quality problems global organizations must solve. Start by addressing data accuracy at the source and adopting an MDM strategy, then follow these six other best-practice approaches.


By Ralph Kimball
August 1, 2008

Thomas Friedman's wonderful book, The World Is Flat, chronicles a revolution that most of us in IT are well aware of. Our enterprises collect and process data from around the world. We have hundreds or even thousands of suppliers, and we have millions of customers in almost every country. Our employees, with their attendant names and addresses, come from every conceivable culture. Our financial transactions are denominated in dozens of currencies. We need to know the exact time in remote cities. And above all, even though the Web gives us a tight electronic connection to all of our computing assets, we are dealing with a profoundly distributed system. This, of course, is the point of Friedman's book.

Data quality is enough of a challenge in an idealized monocultural environment, and it grows to epic proportions in a flat world. Strangely, though, international data quality is not treated as a single coherent theme in the IT world. For the most part, IT organizations simply react to specific data problems in specific locations, without an overall architecture. Is an overall architecture even possible? This article examines the many challenges surrounding international data quality and concludes with eight recommendations for addressing the problem.

Languages and Character Sets

Beyond America and Western Europe there are hundreds of languages and writing systems that cannot be rendered using a single-byte character set such as ASCII. The Unicode standard, of course, is the internationally agreed-upon multi-byte character set intended to handle all the writing systems on earth. The latest release, Unicode 5.1, encodes 100,715 characters covering virtually every modern language. It is important to understand that Unicode is not a font. It is a character set. The architectural challenge for the data warehouse is to ensure end-to-end support for Unicode, all the way from data capture through all forms of storage, DBMSs, and ETL processes to the report writers and BI tools. If any one of these stages cannot support Unicode, the final result will be corrupted and unacceptable.
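The end-to-end requirement can be illustrated with a minimal Python sketch. It models each pipeline stage only by the character encoding that stage can store, which is a simplifying assumption; the function names and the sample pipeline are hypothetical and do not come from any particular ETL or BI product.

```python
from typing import Iterable


def stage_preserves(text: str, encoding: str) -> bool:
    """True if `text` survives being stored as bytes in `encoding`."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The stage's character set has no representation for some character.
        return False


def pipeline_preserves(text: str, stage_encodings: Iterable[str]) -> bool:
    """True only if every stage in the pipeline can carry `text` unchanged."""
    return all(stage_preserves(text, enc) for enc in stage_encodings)


def lossy_store(text: str, encoding: str) -> str:
    """Mimic a stage that silently substitutes characters it cannot store."""
    return text.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    # Hypothetical pipeline: capture, staging, one legacy Latin-1 DBMS, BI tool.
    pipeline = ["utf-8", "utf-8", "latin-1", "utf-8"]
    for sample in ["Kimball", "São Paulo", "東京"]:
        print(sample, pipeline_preserves(sample, pipeline),
              lossy_store(sample, "latin-1"))
```

In this sketch a single Latin-1 stage is enough to make the whole pipeline fail for the Japanese sample, even though every other stage is Unicode-capable. A check of this kind could be run against representative sample data for each stage before going live.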

Cultures, Names and Salutations

The handling of names is a sensitive issue, and doing it incorrectly is a sign of disrespect. Consider the following examples from different cultures:

   Brazil: Mauricio do Prado Filho
   Singapore: Jennifer Chan-Lee Bee Lang
   USA: Frances Hayden-Kimball

Are you confident that you can parse these names? Where does the last name start? Is Frances male or female? Some years ago, my title was Director of Applications. I received a letter addressed to "Dir of Apps", which began with "Dear Dir." I didn't take that letter very seriously!
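To see why parsing is risky, consider a minimal Python sketch of a common shortcut: treating the final whitespace-separated token as the last name. The function and the record layout are hypothetical illustrations, and the family-name values in the comments are plausible readings rather than verified facts about these individuals.

```python
from dataclasses import dataclass


def naive_last_name(full_name: str) -> str:
    """Common shortcut: assume the final token is the family name."""
    return full_name.split()[-1]


@dataclass
class PersonName:
    """Hypothetical record that keeps name parts as captured at the source."""
    full_name: str    # exactly as the person writes it
    family_name: str  # supplied at data entry, never derived by parsing
    salutation: str   # how the person asked to be addressed


if __name__ == "__main__":
    for name in ["Mauricio do Prado Filho",
                 "Jennifer Chan-Lee Bee Lang",
                 "Frances Hayden-Kimball"]:
        print(name, "->", naive_last_name(name))
    # "Filho" is a generational suffix (like "Junior"), not a family name.
    # "Lang" is probably part of a given name; the family name is likely "Chan-Lee".
    # Only the USA example happens to come out right.
```

The dataclass reflects one possible design choice: ask for the family name and preferred salutation when the data is captured, and store them alongside the full name, rather than trying to infer them later.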
