Accuracy and Precision

Techniques for improving accuracy and precision in the enterprise can help improve decision-making. Analytic accuracy and precision will make or break real-time decision-support systems.

By Seth Grimes
March 6, 2004
With growing demand for real-time decision support, analytic accuracy and precision are more important than ever. Systems are increasingly embedding analytics and eliminating immediate human oversight from operational processes in the name of speed, efficiency, and economy. They monitor and respond directly to dynamic business conditions, relying on automated measurement, classification, prediction, and execution. Highly automated systems typically do not deal well with incertitude, so they had better work correctly, accurately, and precisely from measurement to action.

Techniques to improve accuracy and precision are more widely understood than applied, perhaps because they're often proposed out of sensible context, sold as ends in themselves rather than as one means toward a goal that is influenced by many factors. And when they are applied, they seem to be perceived as magic bullets that on their own, in isolation, will target and solve business problems. Meanwhile, the race toward profitability through automation and disintermediation continues, only increasing the need for accuracy and precision.

With the hope of providing useful perspective, I'll devote this column to discussing a number of accuracy and precision techniques. There's no magic bullet, however, because context and actual requirements are key. We'll start with data quality.

Garbage In

You know the old saw, "garbage in, garbage out." The implication is that data quality is an absolute, a "must have," an end in itself without regard for actual needs that can be met by realistic but limited steps. Take a hypothetical direct-marketing firm, where nine addresses out of 100 are undeliverable and three are undetected duplicates due to variant spellings of names or addresses. These errors could be tolerable if the cost of correcting them is greater than the estimated value of the 9 percent missed opportunity plus the cost of delivering the 3 percent duplicates. The cost of absolute quality may be higher than the return.
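The direct-marketing trade-off can be sketched as a simple break-even check. Only the 9 percent and 3 percent error rates come from the example above; the list size, the value of a reached address, the cost per mail piece, the correction budget, and the function names are all hypothetical figures chosen for illustration.

```python
def error_cost(n_addresses, undeliverable_rate, duplicate_rate,
               value_per_reached_address, cost_per_piece):
    """Estimated cost of leaving address errors uncorrected for one mailing."""
    # Value lost because undeliverable pieces never reach a prospect.
    missed_opportunity = n_addresses * undeliverable_rate * value_per_reached_address
    # Money spent delivering a second copy to the same prospect.
    duplicate_waste = n_addresses * duplicate_rate * cost_per_piece
    return missed_opportunity + duplicate_waste


def worth_correcting(cost_of_errors, cost_of_correction):
    """Correction pays off only if it costs less than the errors do."""
    return cost_of_correction < cost_of_errors


# Hypothetical inputs: 100,000 addresses, $0.50 expected value per reached
# address, $0.40 per mail piece, and an $8,000 quote for cleaning the list.
loss = error_cost(100_000, 0.09, 0.03, 0.50, 0.40)
print(f"Estimated cost of errors: ${loss:,.2f}")
print("Worth correcting:", worth_correcting(loss, 8_000.00))
```

With these assumed figures the errors cost less than the cleanup, so tolerating them is the rational choice. A cheaper cleanup or a more valuable mailing would reverse the answer, which is the point: the required level of quality depends on context.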
Spam e-mail is at one extreme of the spectrum, where the cost of sending an e-mail message is so low that a spammer can send to a dictionary's worth of addresses at a known domain, with absolutely no discretion in targeting, in the hope of a small number of sales. The U.S. population census is at the other extreme: The government interprets the Constitutional mandate to perform an enumeration of the population to mean that it must make a significant effort to count every individual without recourse to statistical adjustment for missed or duplicated individuals. Because the government cannot meet its accuracy goals through statistical techniques, it is making a huge effort to improve its TIGER geographic database and its Master Address File in order to target survey forms more precisely.

But responses are not verified and may nonetheless be nonfactual. For instance, I might identify myself as Aleut, and the output tables will dutifully relay my self-reported but incorrect ethnicity. There's no formal data quality problem here, nor is there when someone gives fictitious personal information when signing up to access a Web site. Data quality is important, but only to the extent that efforts don't overshoot the required precision. And data quality alone is insufficient to ensure accuracy in the face of data issues not related to quality.

Processes and Models

The government conducts an accuracy and coverage evaluation of the census, an independent survey to assess the source and extent of error, but it judged that statistically adjusting 2000 census results to improve accuracy would likely introduce errors greater than those the adjustments would eliminate. Fixing results may not improve their accuracy because of limitations in the sensitivity of the measurement instrument and of the correction techniques, which reinforces the importance of designing and building accuracy into processing systems. Do a job right, and you won't have to correct the results.
The twin trends of business performance and process management are positive steps given the central role they assign to process manageability and measurability. Both entail modeling organizational dynamics with built-in assessment and decision points to accommodate evaluation of alternative scenarios. They differ in that performance management emphasizes process-quality measurement, monitoring, and optimization, while process management emphasizes testable, repeatable, intentional (rather than haphazard) processes.

The long-established software-testing concepts of verification and validation (V&V) come into play on this larger, organizational scale. Verification seeks to show that a model or algorithm is implemented correctly, while validation is the determination that you've chosen the right model and algorithms for the problem at hand. For example, in the past, I've picked on analytic software that provides only linear data fitting, which is useless if you're trying to detect and predict periodic effects like seasonality. The programming might be verified as completely correct, but the results will be wildly inaccurate due to the inadequacy of the model applied. Organizational process models need V&V just as much as software programs do to ensure that models are apt and well coded. Good-quality modeling, with the added dimension that process models should adapt to changing circumstances, is important for quality results.
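The linear-fitting example can be made concrete with a small sketch. The data here is synthetic (a hypothetical monthly series with a pure 12-month cycle), and the function names are illustrative rather than taken from any particular product. The least-squares line is implemented correctly, so it would pass verification, yet it fails validation because a straight line cannot represent seasonality. A simple per-month average stands in for a seasonal model.

```python
import math


def fit_line(ts, ys):
    """Ordinary least-squares fit of y = intercept + slope * t."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    var = sum((t - t_mean) ** 2 for t in ts)
    slope = cov / var
    return slope, y_mean - slope * t_mean


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


PERIOD = 12  # months per seasonal cycle (assumed)
months = list(range(48))  # four years of synthetic monthly data
sales = [10 + 5 * math.sin(2 * math.pi * t / PERIOD) for t in months]

# Model 1: a straight line, correctly coded but the wrong model.
slope, intercept = fit_line(months, sales)
linear_pred = [intercept + slope * t for t in months]

# Model 2: the average value observed for each month of the cycle.
by_month = {}
for t, y in zip(months, sales):
    by_month.setdefault(t % PERIOD, []).append(y)
month_mean = {m: sum(v) / len(v) for m, v in by_month.items()}
seasonal_pred = [month_mean[t % PERIOD] for t in months]

linear_rmse = rmse(sales, linear_pred)
seasonal_rmse = rmse(sales, seasonal_pred)
print(f"linear RMSE:   {linear_rmse:.3f}")
print(f"seasonal RMSE: {seasonal_rmse:.3f}")
```

On this idealized series the linear model's error is a large fraction of the seasonal swing, while the per-month model reproduces the data almost exactly. Real data would add noise and trend, so the gap would be smaller, but the distinction between a verified implementation and a validated model remains.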