
Intelligent Enterprise

Better Insight for Business Decisions





November 12, 2001

Catastrophic Failure

Know the threats to your mission-critical data warehouse and how to defend against them

By Ralph Kimball

Continued from Page 1

Parallel communication paths. Even a distributed data warehouse implementation can be compromised if it depends on too few communication paths. Fortunately, the Internet is a robust communication network that is highly parallelized and continuously adapts itself to its own changing topology. My impression is that the architects of the Internet are very concerned about systemwide failures due to denial-of-service attacks and other intentional disruptions. Collapse of the overall Internet is probably not the biggest worry. The Internet is, however, locally vulnerable if key switching centers (where high-performance Web servers attach directly to the Internet backbone) are attacked. Each local data warehouse team should have a plan for connecting to the Internet if the local switching center is compromised. Providing redundant multimode access paths, such as dedicated lines and satellite links from your building to the Internet, further reduces vulnerability.
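As a rough illustration of what such a plan might look like in software, the sketch below picks the first working path from a priority-ordered list. The path names and the probe function are hypothetical; a real probe would test actual connectivity rather than consult a fixed set.

```python
from typing import Callable, Optional, Sequence


def choose_path(paths: Sequence[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first path, in priority order, that the probe reports as up."""
    for path in paths:
        if is_up(path):
            return path
    return None  # every path is down; the recovery plan must cover this case


# Hypothetical access paths, listed from most to least preferred.
ACCESS_PATHS = ["local_switching_center", "dedicated_line", "satellite_link"]

# Stand-in probe for the example: pretend the local switching center is down.
down = {"local_switching_center"}
selected = choose_path(ACCESS_PATHS, lambda path: path not in down)
print(selected)
```

The point of the ordering is that the fallback decision is made in advance, not improvised during an outage.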

Extended storage area networks (SANs). A SAN is typically a cluster of high-performance disk drives and backup devices connected via very high-speed Fibre Channel technology. Rather than acting as a file server, this cluster of disk drives exposes a block-level interface to the computers accessing the SAN, which makes the drives appear to be connected to the backplane of each computer.

SANs offer at least three huge benefits to a centralized data warehouse. First, a single physical SAN can be 10 kilometers in extent. This means that disk drives, archive systems, and backup devices can be located in separate buildings on a fairly big campus. Second, backup and copying can be performed disk-to-disk at extraordinary speeds across the SAN. And third, because all the disks on a SAN are a shared resource for attached processors, you can configure multiple application systems to access the data in parallel. This design is especially compelling in a true read-only environment.

Daily backups to removable media taken to secure storage. We've known about this one for years, but now it's time to take it more seriously. No matter what other protections we put in place, nothing provides the bedrock security that offline, securely stored physical media provide. But before rushing into buying the latest high-density device, give considerable thought to how hard it will be to read the data from the storage media one, five, and even 10 years into the future.

Strategically placed packet filtering gateways. We need to isolate the key servers of our data warehouse so that they're not directly accessible from the local area networks used within our buildings. In a typical configuration, an application server composes queries that are passed to a separate database server. If the database server is isolated behind a packet filtering gateway, the database server can receive packets from the outside world only if they come from the trusted application server. Therefore, all other forms of access either are prohibited or must be locally connected to the database server behind the gateway. Consequently, DBAs with system privileges must have their terminals connected to this inner network, so that their administrative actions and passwords typed in the clear can't be detected by packet sniffers on the regular network in the building.
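The filtering rule described above can be sketched in a few lines. This is only a model of the decision the gateway makes, not a real firewall configuration; the addresses and the database port are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address

# Hypothetical addresses and port, chosen only for illustration.
APP_SERVER = ip_address("10.0.1.10")   # trusted application server
DB_SERVER = ip_address("10.0.2.20")    # database server behind the gateway
DB_PORT = 5432                         # assumed database listener port


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dst_port: int


def gateway_allows(packet: Packet) -> bool:
    """Forward a packet arriving from outside only if it comes from the
    trusted application server and targets the database listener."""
    return (
        ip_address(packet.src) == APP_SERVER
        and ip_address(packet.dst) == DB_SERVER
        and packet.dst_port == DB_PORT
    )


print(gateway_allows(Packet("10.0.1.10", "10.0.2.20", 5432)))  # application server
print(gateway_allows(Packet("10.0.1.55", "10.0.2.20", 5432)))  # ordinary desktop
```

Traffic on the inner network never crosses the gateway, which is why DBA terminals connected there are not subject to this rule.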

Role-enabled bottleneck authentication and access. Data warehouses can be compromised if there are too many different ways to access them and if security is not centrally controlled. Note that I didn't say centrally located; rather, I said centrally controlled. An appropriate solution would be a Lightweight Directory Access Protocol (LDAP) server controlling all outside-the-gateway access to the data warehouse. The LDAP server allows all requesting users to be authenticated in a uniform way, regardless of whether they are inside the building or coming in over the Internet from a remote location. Once the user is authenticated, the directory server associates the user with a named role. The application server then decides, screen by screen, whether the authenticated user's role entitles that user to see the information. As our data warehouses grow to thousands of users and hundreds of distinct roles, the advantages of this bottleneck architecture become significant.
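A minimal sketch of this two-step flow follows: authenticate once, obtain a role, then check the role against each screen. The in-memory dictionary is only a stand-in for the directory server (a real deployment would query LDAP), and the user names, roles, and screen names are all hypothetical.

```python
import hashlib
import hmac
from typing import Optional

_SALT = b"demo-salt"  # fixed salt for illustration only


def _digest(password: str) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), _SALT, 100_000)


# Stand-in for the directory server: user -> (password digest, named role).
DIRECTORY = {
    "alice": (_digest("alice-secret"), "finance_analyst"),
    "bob": (_digest("bob-secret"), "store_manager"),
}

# Kept by the application server: screen -> roles entitled to see it.
SCREEN_ROLES = {
    "revenue_by_region": {"finance_analyst"},
    "store_inventory": {"finance_analyst", "store_manager"},
}


def authenticate(user: str, password: str) -> Optional[str]:
    """Return the user's role if the credentials match, otherwise None."""
    entry = DIRECTORY.get(user)
    if entry is None:
        return None
    stored, role = entry
    return role if hmac.compare_digest(stored, _digest(password)) else None


def may_view(role: Optional[str], screen: str) -> bool:
    """Screen-by-screen decision made by the application server."""
    return role is not None and role in SCREEN_ROLES.get(screen, set())


role = authenticate("bob", "bob-secret")
print(role, may_view(role, "store_inventory"), may_view(role, "revenue_by_region"))
```

Because screens are mapped to roles and not to individual users, adding a thousand users means adding directory entries only; the screen table changes only when a new role or screen appears.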




There is much we can do to secure our data warehouses. In the past few years our data warehouses have become too critical to the operations of our organizations to remain as exposed as they have been. We have had the wake-up call.

I have written extensively on these topics. I cover the design of distributed architectures and discuss packet filtering gateways and role-enabled security comprehensively in The Data Warehouse Lifecycle Toolkit (Wiley, 1998). I describe the application of SANs to data warehouses in my Intelligent Enterprise column "Adjust Your Thinking for SANs" (March 8, 2001).


Ralph Kimball coinvented the Star Workstation at Xerox and founded Red Brick Systems. He has three best-selling data warehousing books in print, including The Data Webhouse Toolkit (Wiley, 2000). He teaches dimensional data warehouse design through Kimball University and critically reviews large data warehouse projects. You can reach him through his Web site, www.rkimball.com.





