November 1998, Volume 1 Number 2
Testing your system for performance and scalability in an iterative fashion can be the difference between success and disaster.
Measure Early, Measure Often
Richard Winter
In an extreme VLDB project, you need to consider that your usual database or system platform may not meet engineering requirements; the requirements as you state them may be beyond the capabilities of any off-the-shelf product. You must also consider that different database products may perform very differently on the application. The difference in performance -- or price/performance -- between one product and another could be a factor of 100 or more. Similarly, the difference in performance -- or price/performance -- between one database design and another could be a factor of 100 or more.
Finally, in an extreme VLDB project, you must consider that there are no good models for predicting performance. If you want to manage the engineering risks in your project appropriately -- those concerned with performance, availability, and scalability -- you must measure; and if you are going to act on the results of your measurements before it's too late, you must measure early.
The VLDB Performance Management Approach
There are two problems with measuring early in the process: The system doesn't exist yet, and you have incomplete knowledge of its performance and workload requirements. But the alternative approach -- not measuring at all -- results in unacceptable risks.
Therefore, the best approach is iterative. You first estimate the requirements and workload, then measure. Next, taking the uncertainties in your measurements into account, you make the best decisions you can at that point in the process and move to the next stage of development. Then you repeat the process. At each stage, your information is better and your decisions can be firmer.
Remarkably, even early stage measurements based on incomplete and uncertain information can be extremely valuable. And they are often the key to avoiding disaster. Beyond that, the very process of iterating on the estimation of performance requirements, the measurements of candidate solutions, and the application of measurement results to implementation leads to insights about performance, scalability, and availability and increases the prospects for success.
How Extreme is Your Project?
Some key issues for many project managers are: How extreme are my database requirements? Am I really trying to do something that hasn't been done before? Is it really so important that I engage in an early stage measurement process that will take resources and delay the start of "real work?"
I recommend finding out how the key dimensions of your implementation compare to those already in production. If you are significantly beyond the frontier, then you are in a high-risk situation and you need to measure before you get committed. Among the dimensions you may want to evaluate when you assess your position are: database size, largest table, row size, number of rows, workload volume, type of workload, your intuitive assessment of data and query complexity, number of concurrent users, transaction rate (OLTP), number of queries in-flight, performance requirements, availability requirements, batch update workload, and online update workload.
If you find that you are significantly beyond prior experience in any dimension or meaningful combination of dimensions, proceed with care. Figure 3 illustrates the frontier for the combination of database size and queries in-flight for decision-support systems.
Figure 3. The DSS frontier for database size and queries in flight. (Source: Winter survey of Very Large Databases, Database Programming & Design, August 1998.)
Each point plotted on the graph (represented by small squares) indicates a decision-support system in production. The two points marked with a "blue ribbon" symbol were those found to have the largest value in a category in Winter Corp's VLDB Survey. Thus, in the upper left, Kmart had 800 in-flight queries. In the lower right, Sears had a database size of 4.6TB.
The dark area represents the "frontier," that part of the space beyond which there is no significant production experience. If your requirements fall at or near the frontier, you should probably measure early. The gray shaded area is "beyond the frontier" -- well beyond any significant production experience. If your requirement falls in the gray shaded area, you must measure before you get too far into the development process.
A general guideline is that if two or more dimensions of your system size seem critical and are intuitively related, look at them together in terms of the VLDB frontier. With more than three dimensions, it's easier to work with a tabular rather than a graphical representation of the data.
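The tabular comparison can be produced with a few lines of code. The following is a minimal sketch in Python, not a tool from the Winter survey. The two frontier values are the survey maxima cited above (4.6TB and 800 queries in flight). The function names, the 0.5 "near the frontier" threshold, and the sample requirements are assumptions made for illustration. The sketch checks each dimension on its own, which is a simplification of the combined frontier shown in Figure 3.

```python
# Hypothetical sketch: compare project requirements with the largest values
# known to be in production, one dimension per row.

# Largest production values per dimension. These two figures are the survey
# maxima quoted in the article; extend the table with data you have verified.
FRONTIER = {
    "database_size_tb": 4.6,    # Sears
    "queries_in_flight": 800,   # Kmart
}

# Assumed threshold (not from the article): a requirement at or above this
# fraction of the production maximum is treated as "near the frontier".
NEAR = 0.5


def assess(requirements, frontier=FRONTIER, near=NEAR):
    """Return one (dimension, required, frontier, ratio, status) row per dimension."""
    rows = []
    for dimension, required in requirements.items():
        limit = frontier[dimension]
        ratio = required / limit
        if ratio > 1.0:
            status = "beyond frontier"
        elif ratio >= near:
            status = "near frontier"
        else:
            status = "within experience"
        rows.append((dimension, required, limit, round(ratio, 2), status))
    return rows


def print_table(rows):
    print(f"{'dimension':<20}{'required':>10}{'frontier':>10}{'ratio':>8}  status")
    for dimension, required, limit, ratio, status in rows:
        print(f"{dimension:<20}{required:>10}{limit:>10}{ratio:>8}  {status}")


if __name__ == "__main__":
    # Invented requirements for a hypothetical project.
    print_table(assess({"database_size_tb": 6.0, "queries_in_flight": 300}))
```

Any row marked "beyond frontier" is a signal to measure before you commit. Treat rows marked "near frontier" as candidates for early measurement as well.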
In Figure 3, one system operated by Fidelity falls in the middle of the graph. If that system had requirements similar to yours, you would give it additional weight. In judging that similarity, look at the nature of the application and the feel of the query set as well as the quantitative factors. Even so, it's usually not a good idea to treat a single system as having changed the shape of the frontier.
Also, recognize that the frontier changes rapidly as technology advances and new systems come into production, so no one source of information is complete. The Winter Survey (see www.wintercorp.com) is a good starting point, but you should supplement it with private inquiries because data about some systems is kept confidential. Vendors are usually prepared to provide additional data points. Just be sure you verify them directly with the database owner, take particular care to use consistent measures for all data points, and make sure you learn the status of the system. Some vendors routinely describe systems that are really only in test or pilot use as "production" systems.
"Production status" is extremely important at or beyond the frontier because even experienced users get surprised and back off in late testing stages. Frequently, problems in performance, availability, or scaling surface only days or weeks before production operation begins. Sometimes they surface early in production. So a system isn't in production until it's actually in production, and it doesn't count unless it stays there. Even then, it's not really in production unless it is in heavy use, or clearly progressing toward its intended level of use.
Defining Your Benchmark
After your risk assessment convinces you that you should perform a measurement, then what do you do? You must specify a benchmark that you can apply and refine at various stages of the project. A famous politician said that the best approach to getting elected was to have your supporters vote "early and often." Similarly, the best approach to VLDB development is to measure early and often -- and keep doing so throughout development and production.
To define a benchmark, you must define a database workload, a database concept, and a set of operational requirements. To address scalability, you must define these factors as you believe they will exist at multiple points in the future.
For performance measurement purposes, the database concept is a rough description of the 20 to 40 entities in the database that are expected to figure most significantly in either data volumes or transaction volumes. This description needs to include only estimated row count, estimated row size, estimated number of columns, primary keys, foreign keys, and columns to use to a significant extent in data selection. For the last three items, the description needs to include an estimate of the distribution of data values over the rows. The workload definition comprises your best estimate of the nature, frequency, and distribution of the SQL requests your system will need to process.
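A database concept of this kind is small enough to write down as plain data. The sketch below, in Python, is one possible way to record it and derive a rough data volume. It is illustrative only: the class and function names, the example entities, and their numbers are invented. Using a distinct-value count to stand in for the distribution of data values, and compound growth to project the concept to future points in time, are both simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One of the 20 to 40 most significant entities in the database concept."""
    name: str
    est_rows: int                  # estimated row count
    est_row_bytes: int             # estimated row size
    est_columns: int               # estimated number of columns
    primary_key: tuple
    foreign_keys: tuple = ()
    selection_columns: tuple = ()  # columns used heavily in data selection
    # Assumption: a distinct-value count per key or selection column stands in
    # for a fuller description of how values are distributed over the rows.
    distinct_values: dict = field(default_factory=dict)

    def raw_bytes(self) -> int:
        return self.est_rows * self.est_row_bytes


def total_raw_gb(entities) -> float:
    """Raw data volume in GB (10**9 bytes), ignoring indexes and overhead."""
    return sum(e.raw_bytes() for e in entities) / 1e9


def projected_raw_gb(entities, annual_growth: float, years: int) -> float:
    """Volume after `years`, assuming simple compound growth (an assumption)."""
    return total_raw_gb(entities) * (1 + annual_growth) ** years


# Invented example entities; replace them with your own estimates.
EXAMPLE_CONCEPT = [
    Entity("sale", 2_000_000_000, 120, 15, ("sale_id",),
           foreign_keys=("customer_id", "store_id"),
           selection_columns=("sale_date",),
           distinct_values={"customer_id": 50_000_000, "sale_date": 1_095}),
    Entity("customer", 50_000_000, 400, 30, ("customer_id",),
           selection_columns=("region",),
           distinct_values={"region": 12}),
]

if __name__ == "__main__":
    # Volume now and at two future points, at an assumed 50% annual growth.
    for years in (0, 1, 3):
        print(years, round(projected_raw_gb(EXAMPLE_CONCEPT, 0.5, years), 1))
```

Even a rough table like this lets you generate test data at the right scale and rerun the same benchmark as the estimates improve.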
Uncertainty clouds many of the factors that contribute to an accurate picture of the database and the workload early in the project. This uncertainty is unavoidable, but if you make a reasonable estimate of the workload requirements and measure, you are much better off than if you don't. You can often deal with the uncertainty by simplifying assumptions. You don't simplify away the database performance requirement, but you can often ignore factors that primarily concern application logic or user interface. Usually, it is at least possible to come up with reasonable estimates of the "lowest likely" and "highest likely" values. Then in the worst case, you can measure both and manage the risks accordingly.
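One way to carry the "lowest likely" and "highest likely" estimates into the benchmark is to record both for each uncertain factor and derive two benchmark cases from them. The Python sketch below illustrates that idea under stated assumptions: the names and example figures are hypothetical. Combining all the low values into one case and all the high values into another is a simplification, since real factors may not move together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    low: float    # "lowest likely" value
    high: float   # "highest likely" value

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low estimate must not exceed high estimate")


def benchmark_cases(estimates):
    """Build the two cases to measure: every factor low, and every factor high."""
    return {
        "lowest likely": {name: e.low for name, e in estimates.items()},
        "highest likely": {name: e.high for name, e in estimates.items()},
    }


# Invented figures for a hypothetical decision-support project.
EXAMPLE_ESTIMATES = {
    "database_size_gb": Estimate(800, 3000),
    "concurrent_users": Estimate(200, 1000),
    "queries_in_flight": Estimate(40, 250),
}

if __name__ == "__main__":
    for label, case in benchmark_cases(EXAMPLE_ESTIMATES).items():
        print(label, case)
```

Measuring both cases brackets the risk: if the platform handles the high case, the uncertainty matters little; if it struggles with the low case, you know early.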
Applying Your Benchmark Definition
When you have defined your benchmark, you are in a position to take a variety of positive actions. You can compare it to standard benchmarks and determine whether any standard benchmark data might contribute to your analysis. If you are about to do a platform acquisition, you can build the benchmark into the acquisition process. If you have a platform, you run the benchmark to gain information about how your application is likely to perform and employ the results to shape and support decisions concerning configuration, database design, or implementation.
The point is that a benchmark definition provides the tool you need for rational decision making about performance and cost throughout the project lifecycle. Without it, you have only guesswork and opinion. With it, you have the means to get or keep your project on the right track. You have a tool to manage your systems' performance. In the early stages of the project, it is a blunt tool, but it is crucial for avoiding disaster in the early decisions. And as it is refined with better information as the project unfolds, it becomes a sharper tool with ever increasing power to help your effort succeed.
Richard Winter is a specialist in large database technology and implementation, and is president of Boston-based Winter Corp. You can reach him via email at Richard.Winter@wintercorp.com or by fax at 617-338-4499.
RESOURCES
Winter Corp.: www.wintercorp.com
(industry research and reports on VLDB practice)