Breakthrough Analysis, by Seth Grimes
Seth Grimes is an analytics strategist with Washington, DC-based Alta Plana Corporation. He consults on data management and analysis systems.

From Text Analytics to Data Warehousing
IBM recently posted a quite nice page on extracting business value from "unstructured" data. The page describes use of IBM's own products and formats, to be sure, but it is potentially helpful for anyone who wishes to learn about information extraction from textual sources for data warehousing.

IBM's page starts with a brief text-analytics overview. It then dives into implementation with the OmniFind Analytics Edition for DB2 and its pureXML capabilities. It describes a process flow that includes XML tagging of document features and two alternatives for what comes next: mapping the XML schema to relational database structures, or using the XML structures directly for analyses.

This text-analytics workflow, and the choices involved in dealing with text-sourced information, are not specific to IBM's tools, however. So while IBM provides diagrams, code listings, and an analysis of the alternative approaches that relate to its own products, the lessons apply much more generally.

The premise is that because much valuable business information originates in "unstructured" form (e-mail, Web pages, news and blog articles, corporate reports, etc.), you need to look at text analytics as a technology that can unlock value. And naturally, if you already have a BI program and a data warehouse, you'll want to explore integrating text-sourced information into your existing data-analysis infrastructure. You'll want to explore unified analytics.

Information extraction to databases enables unified analytics. I cover approaches in my own text-analytics courses and presentations, where I use open-source GATE (General Architecture for Text Engineering) software for illustrations and examples in order to remain independent of any product, but IBM's is the first clear, freely available, and practical technical exposition that I have seen on this topic. If you want to learn more about unified analytics, do visit IBM's From Text Analytics to Data Warehousing page.
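For readers who want a feel for the two alternatives mentioned above, mapping tagged XML to relational structures versus analyzing the XML directly, here is a minimal Python sketch. It is not IBM's code and does not use OmniFind or DB2 pureXML: the standard-library `sqlite3` and `xml.etree.ElementTree` modules stand in for the database and XML engine, and the tag names, attributes, table layout, and sample document are all hypothetical, invented purely for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical annotator output: one document with tagged features.
# The element and attribute names are assumptions, not an IBM format.
ANNOTATED = """
<document id="msg-001" source="e-mail">
  <entity type="company">Acme Corp</entity>
  <entity type="product">Widget Pro</entity>
  <sentiment target="Widget Pro">negative</sentiment>
</document>
""".strip()


def shred_to_relational(xml_text, conn):
    """Alternative 1: map the XML structure onto relational rows."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entity (doc_id TEXT, type TEXT, value TEXT)"
    )
    rows = [(root.get("id"), e.get("type"), e.text) for e in root.findall("entity")]
    conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", rows)
    return len(rows)


def query_xml_directly(xml_text, entity_type):
    """Alternative 2: leave the XML intact and query it with a path expression."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.findall(f"entity[@type='{entity_type}']")]


conn = sqlite3.connect(":memory:")
inserted = shred_to_relational(ANNOTATED, conn)

# The same question asked both ways: which companies does the document mention?
companies_sql = [
    row[0] for row in conn.execute("SELECT value FROM entity WHERE type = 'company'")
]
companies_xml = query_xml_directly(ANNOTATED, "company")
print(inserted, companies_sql, companies_xml)
```

The relational route lets existing BI tools join extracted entities against warehouse tables with ordinary SQL, while the direct route keeps the document structure intact at the cost of a separate query language. Which one fits depends on how stable the tag schema is and how the results will be consumed.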
Disclosure: IBM is a sponsor of an editorially independent text-analytics report I am writing, which is unrelated to my Intelligent Enterprise writing.




