
Intelligent Enterprise

Better Insight for Business Decisions





March 08, 2001



Data in the Time of Cholera

How a 19th century physician's data warehouse helped prevent the spread of cholera

By Steven Johnston

Data warehousing projects usually are difficult and expensive efforts that tend to lose direction because they challenge both business and IT managers with a new and different way of viewing information. Whereas both communities have successfully used computers for the past 50 years to help understand what happens in the day-to-day operations of their business, now they are trying to use computers and large amounts of data to help understand why things happen in their business. Fortunately, historical precedent can be helpful in understanding the issues involved.

In essence, data warehousing is an observational science. This point is an important one because observational science has been around for 4,500 years and provides a wealth of experience in acquiring and analyzing large amounts of data. Thus, it brings to data warehousing a sense of history that the field currently lacks. For a member of a data warehousing team, it is very comforting to know that many others have gone before you, have struggled with similar challenges, and, despite it all, have succeeded.

Data Mining Deja Vu

Twenty years ago I was an exploration geophysicist working for a major oil and mining company. Exploration geophysicists collect large amounts of observational data and search for trends and patterns in the earth's gravitational and magnetic fields, electrical resistivity, and the response to manmade seismic (sound) waves to locate commercial deposits of oil and minerals. Over the years, my career wandered to supporting geophysical analytical applications as a programmer and then to finally becoming an IT professional.

In March 2000, I became involved in my first data warehousing project at a major airline. I immediately felt strangely comfortable with the concepts of data warehousing. Then it finally dawned on me that my deja vu stemmed from having done "data mining" on mining data about 20 years earlier! Once the concept of data warehousing as an observational science crystallized in my mind, I was able to benefit from my past experience as an observational scientist and the historical experiences of other observational scientists.

Clay-Tablet DBMS

There are two kinds of science: experimental and observational. Physics and chemistry are examples of experimental sciences because they deal with small isolated systems that can be studied through experimentation. Galileo invented experimental science about 400 years ago to explore the physics of falling bodies.

Observational sciences, such as astronomy, meteorology, geophysics, and epidemiology, deal with systems that are too large and too complex to be experimented with. Observational science is much older than experimental science and goes back thousands of years to early Chinese astronomical observations around 2500 B.C. By 1600 B.C. the Babylonians were plotting the positions of the fixed stars; by 800 B.C. they were recording the motions of the planets relative to the fixed stars. By 200 B.C., the Greeks had used astronomical observations to figure out that the sun was at the center of the solar system, that the earth and planets revolved about the sun, and that the moon revolved about the earth; they had also determined the diameter of the earth to within 5 percent.

Observational sciences try to figure out how complex things work through an organized process of discovery. (See Table 1.)

This discovery process has changed little over the past 4,500 years. Although the DBMS of choice for the Babylonians was clay tablets, the process was the same.

Cholera: A Case Study

A classic case study in early data warehousing as an observational science is the birth of epidemiology in 1854. We now know that cholera is a terrible disease caused by bacteria that enter the intestines from contaminated water. The bacteria release a toxin that gives the victim a severe case of diarrhea. A 200-pound man can lose 40 pounds in a single day. Massive dehydration makes the blood so thick that the heart can't pump it, and the patient dies quickly. Within two to three days, more than 50 percent of the victims may have perished.

In 1854, everyone believed that diseases were caused by "miasma," a substance found in bad-smelling air. Through folklore and anecdotal observations, people knew that if you were around foul-smelling sick people in hospitals or were exposed to foul-smelling sewage, there was a good chance that you would get sick as well. People also observed that men who made a living cleaning out cesspools and laborers who boiled down rotting horse carcasses for glue and tallow would get sick and vomit from the extremely foul air that was part of their daily work.






