Breakthrough Analysis, by Seth Grimes. Seth Grimes is a principal of Alta Plana Corp., a Washington, D.C.-based consultancy specializing in large-scale analytic computing systems. True BI for the Masses "BI for the Masses" is overused marketing-speak meant to suggest that Vendor X's break-out Product Y is going to enable/deliver business intelligence beyond the 15%-20% of knowledge workers who currently do BI. (I got that estimate from a chat with industry veteran Dave Wells, who says the figure becomes 40% if you include Excel.) Well, I have my own notion of BI for the Masses, and it is NOT:
>>Continue reading "True BI for the Masses" Posted Monday, November 16, 2009 7:00 AM >>Comments Commercial and Community Open Source and Pentaho BI Last week, I offered the opinion that BI software publisher Pentaho has moved beyond a commercial open source business model. When "strategic" software components such as the new Pentaho Analyzer interface are not open source, having a "core open" suite like Pentaho's seems no longer enough to define the vendor as an open source company. I held up EnterpriseDB as a company that went down a similar route just last year. While what counts most is great, affordable software, an understanding of market trends helps everyone concerned make strategic choices. In this spirit, I'll present perspectives that complement mine, from Pentaho and from the Pentaho community, regarding the importance of the open source base to Pentaho and the company’s users and regarding a new community development, the open source Pentaho Analysis Tool (PAT). >>Continue reading "Commercial and Community Open Source and Pentaho BI" Posted Thursday, October 22, 2009 1:56 PM >>Comments Open Source Decision Time for Pentaho BI Companies adapt their business models to changing business conditions and emerging opportunities. For BI software publisher Pentaho, the demise of as-a-service BI provider LucidEra created an opportunity that was too good to pass up. LucidEra's Clearview interface, acquired and rebranded Pentaho Analyzer, fills a product-line gap by providing pivot analysis for non-technical business users. That this centerpiece Enterprise Edition component was not and is not open source invites a question. Is Pentaho, founded as a "commercial open source" BI vendor, still defined by open source? Pentaho itself seems unsure. >>Continue reading "Open Source Decision Time for Pentaho BI" Posted Monday, October 12, 2009 5:21 PM >>Comments Visual BI Meets Pop Culture BI has crossed a cultural threshold. 
Data visualization forms have become a tool of pop culture. Witness a pseudo-infographic published in the Arts & Leisure section of this last Sunday's New York Times. (The broadsheet page reduces surprisingly well to a computer screen.) The BI forms are there, absent the usual, numerical BI content. "He Came, He Heard, He Shared" subverts familiar graphics -- pie, line, and area charts, a horizontal bar chart, a pair of linked timeline charts -- to deliver social/media commentary. Artist and blogger Andrew Kuo substitutes qualitative text for numerical scales in using those BI forms to present personal commentary on music industry changes in the last ten years. >>Continue reading "Visual BI Meets Pop Culture" Posted Tuesday, October 6, 2009 2:20 PM >>Comments Recovery.gov Double Fault: Broken Data Feeds The relaunched recovery.gov government-transparency site no longer supports automated data feeds. These feeds had allowed users of the 1.0 site to perform their own value-added analyses, "the whole point of accountability and transparency," according to one site user, an executive with a large, government systems integrator. According to that user, who asked not to be named, referring to the Recovery Accountability and Transparency Board (RATB) and lead contractor Smartronix, "from a software architecture standpoint, they seem to have missed a key principle here: backward compatibility." RATB spokesperson Edward Pound confirmed that the relaunched site no longer offers the feeds. Pound did not know if notice had been provided to users, on-site or through another mechanism, of the discontinuation of the data-feed interface. He stressed that the Recovery Board is working hard to meet emerging user needs and improve site capabilities in furtherance of its non-political mission of promoting open government. 
>>Continue reading "Recovery.gov Double Fault: Broken Data Feeds" Posted Friday, October 2, 2009 1:36 PM >>Comments Relaunched Recovery.gov Fails Accessibility Standards Recovery.gov, a showcase government-transparency Web site that relaunched on Monday, fails to meet U.S. federal government Section 508 accessibility standards or accessibility best practices. The non-compliance issues relate to display of data tables -- an essential point given the site's promise of "Data, Data & More Data" -- despite on-site compliance claims. Other elements including navigation maps, while compliant, are poorly designed. Sharron Rush, co-founder and executive director of accessibility-advocacy organization Knowbility, goes so far as to state, "The recovery.gov Web site is a good example of what NOT to do for accessibility in my opinion." >>Continue reading "Relaunched Recovery.gov Fails Accessibility Standards" Posted Wednesday, September 30, 2009 2:37 PM >>Comments Twitter Stirs Up the Analyst Industry A few recent tweets got me thinking: Twitter has stirred up the analyst industry. Every Twitter user gets the same on-site visibility and capabilities. As a result, celebrities excluded, authority chez Twitter derives from your network and from your tweets and only after that from your extra-Twitter identity (a.k.a. your biography and employment). Since open publishing is an independent-analyst ethos, we independents have taken to wide-open Twitter like, well, whales to the air. >>Continue reading "Twitter Stirs Up the Analyst Industry" Posted Wednesday, September 23, 2009 9:51 AM >>Comments Questions and Answers about USAspending.gov My blog article on USAspending.gov's design flaws has attracted record page views according to IE editor Doug Henschen, boosted by coverage on Slashdot, Government Computer News, and other outlets. Posted comments reveal many misconceptions about the site. I'll distill the more interesting ones, with some of my own, into a series of questions. 
Noting that "[g]overnment should be collaborative," I'll attempt answers myself. >>Continue reading "Questions and Answers about USAspending.gov" Posted Tuesday, September 8, 2009 3:22 PM >>Comments CEP + BI = Real-Time Event Analytics Event-driven analytics aims to facilitate business decisions and actions as opportunities (and threats) emerge. Move the concept into the Now, into the worlds of (for instance) on-line commerce and real-time detection of suspect financial transactions, and it becomes apparent that old, DBMS-reliant architectures, with their load-index-query overhead, just aren't fast enough. Complex event processing (CEP) technology does deliver required sub-second responses, combining real-time BI capabilities -- what my twitter friend @communicating describes as "the act of identifying 'pain' & 'profit' signals as they happen" -- with the ability to tap both DBMS-stored historical data and real-time "data in flight." Yet the mainstream BI world doesn’t have a clear picture of the possibilities afforded by CEP-enabled event-driven analytics, so I set out to study perceptions with the support of one of the space’s leading vendors. >>Continue reading "CEP + BI = Real-Time Event Analytics" Posted Friday, September 4, 2009 9:20 AM >>Comments Serious Design Failure at USAspending.gov The U.S. federal government's USAspending.gov Web site is a travesty, almost a parody of a government-transparency site. The site looks fine, but it significantly fails U.S. government accessibility requirements and its use of graphics has only gotten worse -- far worse -- since I wrote about execution issues a month ago. Further, it's old-school, a mockery of Gov 2.0 principles of interactivity and responsiveness and community. >>Continue reading "Serious Design Failure at USAspending.gov" Posted Tuesday, September 1, 2009 1:04 PM >>Comments The Opposite of Open Source What's the opposite of open source? Hint: The answer is quite straight-forward. 
And it's not what some analysts and insiders would have you believe. The definition of "open source" (as applied to software) is almost universally accepted as that of the Open Source Initiative. Per the OSI, "open source doesn't just mean access to the source code. The distribution terms of open-source software must comply with [certain] criteria" that are outlined on the OSI's Web site. Open-source software, per the OSI, is free, "free" as in "free beer" rather than necessarily as in "free speech," which latter usage of "free" carries with it certain responsibilities. Those responsibilities are "vitally important" according to Richard Stallman and other free-software movement proponents. >>Continue reading "The Opposite of Open Source" Posted Monday, August 17, 2009 10:40 PM >>Comments Whacky Graphics at USAspending.gov I started this blog entry with the intent of appraising USAspending.gov's IT Dashboard, a new, interactive tool for evaluation of Federal Government IT spending. I find the dashboard less than compelling, but a close look will have to wait because the graphical issues start front-and-center on the site's main page, before you even get to the dashboard, with one downright whacky graphic. I can't recall the last time I saw a graphic that so distorted the numbers, so I tried to recreate it (and failed). Here's how. >>Continue reading "Whacky Graphics at USAspending.gov" Posted Friday, July 31, 2009 9:03 AM >>Comments In SPSS, IBM Gains an Open R & Python Analytics Platform I love telling folks that I ran my first SPSS programs in 1976... and that I haven't run one since. I was in high school. I keypunched and submitted card decks for a researcher back when "SPSS" still stood for Statistical Package for the Social Sciences. SPSS has long since reinvented itself as a predictive analytics vendor. 
As numerous commentators have pointed out, the company's data mining capabilities will fill a gap in IBM's product line on completion of the announced acquisition and put new heat on rivals including SAS, SAP, and Oracle. SPSS brings other, less-visible assets to the pending IBM deal. Readers whose interest goes beyond analyses of the "IBM gets to check off another capabilities box" variety may be interested in learning about one of them. SPSS provides an open stats platform that allows users to patch Python and R code into their SPSS routines. SPSS's Bring Your Own Analytics is a clear competitive differentiator. Whatever you call it, SPSS's stats platform is a pioneering example of hybrid commercial-open source analytical computing, with benefits for users and the company alike. >>Continue reading "In SPSS, IBM Gains an Open R & Python Analytics Platform" Posted Thursday, July 30, 2009 2:50 PM >>Comments Predictive Text Analytics and SPSS's Predictive Enterprise Vision Damn trademarks. I'm slated to speak on predictive text analytics at October's Predictive Analytics World conference near Washington DC. Release of the PAW agenda elicited a twitter comment from my friend Olivier Jouve, "Seth, glad to see you using 'Predictive Text Analytics' - expression that SPSS and I crafted in 2003!" (I’m at @sethgrimes on twitter by the way.) Olivier is SPSS vice president for corporate development. He and SPSS do deserve credit for promoting wide commercial deployment of text technologies that had previously been accessible only to researchers. I only regret that before titling my PAW talk, I hadn't realized that SPSS had trademarked "predictive text analytics," turning a term that deserves wide business application into SPSS property. Had I known of the trademark, I would have chosen a different title. 
While SPSS hasn't trademarked "predictive analytics," a more general term that has been in use for years, I'm impressed with the company's ability to execute on a broader vision to similarly own that field. >>Continue reading "Predictive Text Analytics and SPSS's Predictive Enterprise Vision" Posted Friday, July 17, 2009 4:11 PM >>Comments CEP, Events, and Continuous {Transformation | Intelligence} Given that BI thought leaders are wrestling with the notion of events, perhaps we will see a BI-mainstreaming of event processing in the not-too-distant future. Myself, I was way ahead of the game in my expectations of demand for BI access to stream sources. While a combination of legacy database and analytical technology has held BI back, lack of perception of need has been a far greater factor, especially given the under-utilization of conventional BI decades after the term first became popular. Interest in streams and events has definitely picked up in the last few months -- I've reported on novel applications for "continuous transformation" and otherwise done a bit of writing to promote awareness -- and next year could very well be the break-out year for BI on data and event streams. >>Continue reading "CEP, Events, and Continuous {Transformation | Intelligence}" Posted Wednesday, July 1, 2009 10:52 AM >>Comments When Business Gets Too Personal Visualization guru Stephen Few reminds us that analyst opinions, while offered by recognized experts, are inherently personal, and that on the other side of the table, there are real people behind products, marketing campaigns, and corporate decisions. I'll amplify that each of us does bring unique personal experience and even personality to bear when reviewing (analysts) or promoting (vendors) products, and I'll agree that we should each be accountable for what we write or claim. 
It's an analyst's personal perspective, coupled with strong judgment, communications skills, and fairness, that creates a sense of authority and makes his or her views worth reading. Good analysts don't blindly accept vendor claims. We investigate, and sometimes we reject what we've been told. But I disagree with Steve that analysts should always name names. Some situations become simply too personal. I and others I know have even been the subject of vindictive behavior, which unhelpfully diverts attention from products to people. In the worst cases I've seen, the vendor can even exploit personal conflict to dismiss or attempt to denigrate the analyst. >>Continue reading "When Business Gets Too Personal" Posted Friday, June 26, 2009 5:50 AM >>Comments Summer Reading: IR, Sentiment Analysis, and Visualization Summer's slower pace allows time to work through material set aside for calmer days. What's on your reading list? Mine includes a variety of papers and also longer works on Information Retrieval, Sentiment Analysis, and Visualization. The items on my list are technical and accessible (which is not the same as easy), of potential interest to anyone who works with analytics. I've paged through them and plan to take a deeper dive. TechWeb readers might also find them worth at least a quick look. >>Continue reading "Summer Reading: IR, Sentiment Analysis, and Visualization" Posted Thursday, June 25, 2009 7:04 AM >>Comments Reports from the 2009 Text Analytics Summit The number and quality of end-user presentations at this year's Text Analytics Summit prove that Marti Hearst's 1999 observation, "The nascent field of text data mining (TDM) has the peculiar distinction of having a name and a fair amount of hype but as yet almost no practitioners," is definitively no longer operative. The summit was vendor heavy in its first year, 2005. For the last couple of years, end users have dominated the program. 
Their numbers held up this year even with overall attendance down about 1/6, a far smaller loss (due to economic conditions of course) than I've observed at other, recent analytics conferences. I'd pin much of the text summit's loss on integrators and start-ups facing limits imposed by a down economy. They're out there, but better to cut conference presence than R&D or staff when budgets are tight. Voice of the Customer was (again) a popular summit topic, augmented this year by very helpful talks on sentiment analysis (e.g., by Bing Liu) and with broadened coverage of listening platforms and on-line sources including social media. I believe this year's was the first summit with open-source (GATE project coordinator Hamish Cunningham) and search (Usama Fayyad and Daniel Tunkelang) on the program. And I again presented a pre-summit workshop, Text Analytics for Dummies. I've posted my slides, which folks are welcome to use however they wish. Rather than recap further myself, I'll point you to a rich crop of blog articles posted by summit attendees. >>Continue reading "Reports from the 2009 Text Analytics Summit" Posted Wednesday, June 10, 2009 12:57 PM >>Comments Limits of Visualization: Wordle Misses Meaning Visualizations graphically represent information. As I related in connection to my image gallery, "See Connections with Visualization," "Building on the foundation of basic charting, data graphics and dashboard displays, the growing palate of visualization makes analyses more accessible and understandable to a general business audience -- as well as to seasoned BI professionals." Yet visualizations can be no better than the information and analyses that feed them. Let's use Wordle, an excellent "toy," to explore the trade-off between easy-to-use, attractive visualization for the masses and true, deep sense-making. Factoring in semantic considerations discussed over 50 years ago, it is unfortunately clear that Wordle images miss meaning. 
>>Continue reading "Limits of Visualization: Wordle Misses Meaning" Posted Thursday, May 21, 2009 4:06 AM >>Comments Text Analytics Survey and Summit Serious IT-market research should look at the demand side, at customer and prospect perspectives. I try to maintain balance myself in studying the text-analytics market, software and services designed to help enterprises find business value in "unstructured" text. You can help. Please participate in a survey I'm running and consider attending the up-coming Text Analytics Summit. My text-analytics survey runs through May 10. I'd like to know how your organization is dealing with unstructured sources and the role text mining/analytics plays or might play. I'll write up survey findings in a free report, available in early June. The survey will take you 5-10 minutes. Please help. >>Continue reading "Text Analytics Survey and Summit" Posted Monday, May 4, 2009 11:43 AM >>Comments Candid Thoughts About Recent Gartner & Forrester Research My reaction to recent Gartner and Forrester analyst reports, as covered in Intelligent Enterprise articles, is that the firms are tarting up a couple of instances of questionable research. A Gartner report urges us to "Reduce Costs of Data Integration by Rationalizing Tools and Infrastructure, and Centralizing Skills." According to Gartner's press release, "organizations can save up to $500,000 annually by rationalising tools in the short term and adopting a shared-services model in the longer term," a simplistic statement to say the least, more on which in a moment. Forrester's report, "Voice of the Customer: The Next Generation," does not seem simplistic, but I do get to wondering when Voice of the Customer research is based on interviews with "more than 20 companies" and at least 17 of the ones listed are VOC solution providers. I asked lead author Bruce Temkin, "How did you confirm their assertions with user organizations?" He did not respond. 
>>Continue reading "Candid Thoughts About Recent Gartner & Forrester Research" Posted Monday, April 27, 2009 6:01 PM >>Comments IBM Weighs In: Information Wants To Be Expensive I was quite disappointed to discover that IBM has cut off free access to historical IBM Journal articles. Decades of valuable, industry focused computing literature is now behind a "paywall." Gone is open availability of seminal material that establishes the foundations of business intelligence, data warehousing, text analytics, and more. For what? IBM earned $12.3 billion on sales of $103.6 billion in 2008. If IBM makes $1 million yearly selling journal access, that would represent less than one-hundredth of one percent of annual profit. Let's look at what we in the data business have lost. >>Continue reading "IBM Weighs In: Information Wants To Be Expensive" Posted Sunday, April 19, 2009 12:12 AM >>Comments The ACM Looks at Sentiment Analysis Our Sentiments, Exactly in the April issue of the Communications of the ACM tackles sentiment analysis. The subhead: "With sentiment analysis algorithms, companies can identify and assess the wide variety of opinions found online and create computational models of human opinion." (I suppose that last bit, the geek-speak, suits the audience. The Association for Computing Machinery is, after all, essentially an industry association for computer scientists.) Author Alex Wright interviewed me for the article, and with his permission, I'll share our conversation... >>Continue reading "The ACM Looks at Sentiment Analysis" Posted Thursday, April 2, 2009 12:07 PM >>Comments A Last Look at Open Source BI Open-source BI and I have come to a parting of ways. OS-BI capabilities, reliability, and support have matured. Commercial OS-BI vendors now compete with BI market leaders. That competition now appears to focus primarily on solutions and on the cost and community advantages open-source-reliant business models can (and do) offer enterprises of all sizes. 
In the end, for me as a technology analyst, OS-BI is simply boring. I will, however, take one last look, a snapshot of the state of the market, before I take my leave of the topic. >>Continue reading "A Last Look at Open Source BI" Posted Tuesday, March 31, 2009 6:03 PM >>Comments SAS Gets with the (Open Source) Program A January New York Times article on the R open-source statistical programming environment catalysed a change in attitude at SAS, the largest independent BI and analytics vendor. In just one month, SAS's position swerved from disdain -- the Times quoted Anne H. Milley, SAS director of technology product marketing, as opining, "We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet." -- to embrace with an admission that "both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers." And that's good news, for SAS and for R. >>Continue reading "SAS Gets with the (Open Source) Program" Posted Thursday, March 5, 2009 12:09 AM >>Comments Prediction Market Forecasts Are Not A Sure Thing The February 26 Economist looks at the state of prediction markets, a tool for turning collective human insights into forecasts. The title and subhead capture the reporter's take: "An Uncertain Future," "A novel way of generating forecasts has yet to take off." Prediction markets rely on broad participation to generate likelihood profiles for a variety of foreseeable outcomes. The short, anecdotal Economist article, by contrast, cites only one trading-platform provider and experiences at three user organizations. Those experiences are nonetheless telling. "Koch Industries, an American conglomerate in a range of businesses including chemicals,... says the results so far have been pretty accurate compared to actual outcomes, but stresses that markets are complementary to other forecasting techniques, not a substitute for them." 
On the down side, a couple of banks found that "a big hurdle facing managers using prediction markets is getting enough people to keep trading after the novelty has worn off" and "another reason prediction markets flop is that employees cannot see how the results are used, so they lose interest." >>Continue reading "Prediction Market Forecasts Are Not A Sure Thing" Posted Sunday, March 1, 2009 10:30 AM >>Comments Infonic Reloaded, or the Liberation of Lexalytics I have been following the recent implosion and regeneration of text-analytics, document-management, and Sharepoint services provider Infonic. The company recently went into administration in the United Kingdom due to insolvency. (I reported on the impending train wreck last month.) Infonic was reconstituted under new-old ownership of, reportedly, a couple of its executives. Many (former) shareholders are upset. I feel like a voyeur because this imbroglio affects Infonic subsidiary Lexalytics, which doesn't deserve the taint of guilt by association. I've concluded that the story bears exploring and that Lexalytics, given what I know of its products and management, should come out fine. The Company X that I wrote about last month was Lexalytics. It should have been, directly, Infonic, given that Infonic is and was the troubled partner in the merger of the two companies, which was announced last summer. The merger terms called for Infonic to own "between 70% and 75% of the issued share capital of the Vehicle (Infonic's percentage being dependent on certain conditions)." I critiqued the valuation of the deal in a July blog article. >>Continue reading "Infonic Reloaded, or the Liberation of Lexalytics" Posted Tuesday, February 24, 2009 7:56 AM >>Comments Slicing Up The BI Market Analyst Lyndsay Wise recently published a useful BI demand-side analysis, Redefining the Mid-Market and Its Business Intelligence Requirements. 
Her thesis is that "small and mid-sized organizations seem to get the short end of the stick when it comes to their software needs" based on IT infrastructure she sees as needed to support BI and related technology. Yet I wonder if segmenting BI-user enterprises by gross revenue is the best way to look at the BI demand-side market. Let's consider other, more refined approaches. >>Continue reading "Slicing Up The BI Market" Posted Wednesday, February 18, 2009 11:30 AM >>Comments Semantic Web Snake Oil The Gospel of Matthew says, "by their fruits ye shall know them." Judging by their work, some of the biggest Semantic Web proponents are snake-oil salesmen. What else are we to conclude when academics and industry figures who fervently boost the Semantic Web can't be bothered — or are unable — to publish their own Web materials with semantic mark-up? The day they get their own acts together, that's the day the Semantic Web will emerge as more than just a questionable, always-just-over-the-horizon panacea for whatever ails Web users, as more than a justification for academic conference junketing, to solve real-world information findability problems. Who knows when we'll see that day, which is why, when it comes to semantics, my bet is (and has long been) on analytics. >>Continue reading "Semantic Web Snake Oil" Posted Wednesday, February 4, 2009 12:49 PM >>Comments Vendor Stability Matters Too Technology is important, and so is vendor stability. You want solutions that perform, and you need to be confident that providers will be there for support and upgrades. When I see evidence that companies I follow are facing serious business complications, do I relate what I see, possibly adding to the difficulties faced by companies I'd like to see succeed, or do I keep my views to myself? This isn't an abstract dilemma. I have two software vendors in mind. Here are their stories, a cautionary tale, names withheld as an ethical compromise. 
>>Continue reading "Vendor Stability Matters Too" Posted Thursday, January 29, 2009 10:13 AM >>Comments The Real Data Liberation Initiative The Data Liberation Initiative is a worthy project that aims to provide academic researchers with affordable and equitable access to Canadian current governmental statistics and other data. I had a chance to meet with three DLI team members back in 2003, and I'm glad to see that the initiative, approved by the Canadian government as a pilot in 1996, has grown into a robust effort with subprojects that benefit the spectrum of public data users. DLI and similar undertakings such as the US government's Fedstats, the Open Data Foundation, and IBM's Many Eyes are what real data liberation is all about. >>Continue reading "The Real Data Liberation Initiative" Posted Thursday, January 15, 2009 2:20 PM >>Comments Complex Event Processing as a Marketing Device I thought I'd share a CEP-Interest list message on Complex Event Processing (CEP) as a marketing device with Intelligent Enterprise readers. Posted by a long-time member of the event-processing community, Hans Gilde, it's an objective, refreshing market positioning/messaging analysis. I don't completely agree with Hans of course, but I do think his views are insightful. They echo points I wrote about back in 2007, and some are points I puzzled out in writing a sponsored CEP-BI white paper for Coral8. With Hans's permission, the rest of the words in this article are his: For software vendors, CEP is a marketing device and nothing else. Notice that no two people will agree on what CEP is, but everyone claims to be more CEP than the next guy. It's like asking to define "cool." The only two consistent attributes across all "CEP" products are (1) they are in some way used for soft real-time processing and (2) they are general purpose, rather than coming pre-customized for a very specific purpose. 
So the question of the CEP marketplace has two parts: The marketplace for the brand (really a "brand attribute" or whatever) that is the acronym CEP and the marketplace for products with the above two attributes. >>Continue reading "Complex Event Processing as a Marketing Device" Posted Monday, January 5, 2009 5:20 PM >>Comments Quality Issues (Still) Plague Spoke.com People Search "If you want to keep your job, use Spoke," advises a recent e-mail from the folks behind "the fastest growing and most up-to-date business network in the U.S." Sounds like something to look into — social / people networks are one of the most important BI assets to have emerged in recent years — and I figured I owe Spoke another chance after panning it back in 2004. Grading according to the same accuracy, completeness, quality, usefulness, and usability standards I'd apply to other BI tools, I'm afraid I'd give Spoke a low C. Here's why. >>Continue reading "Quality Issues (Still) Plague Spoke.com People Search" Posted Wednesday, December 31, 2008 3:24 PM >>Comments Spock.com Taps Text Analytics Spock is a people-search engine, currently in beta release. The company uses "a combination of search-engine technologies and user edits to aggregate the world's people information and make it searchable." Think Google meets LinkedIn: Web search with accuracy boosted by allowing individuals to claim, augment, and correct information about themselves. (See the screenshot below.) [Screenshot: Spock interface allowing a registered individual to amend his or her profile.] Andrew Borthwick is Principal Scientist at Spock Networks. I "met" him on the GATE e-mail list. (GATE is open-source text-mining software.) Andrew kindly fielded a number of questions I had about Spock's use of text analytics to build its search database and about Spock's data-quality efforts. 
>>Continue reading "Spock.com Taps Text Analytics" Posted Tuesday, December 16, 2008 9:07 AM >>Comments BI on Content Feeds, a.k.a. Continuous (Twitter) Transformation The rapid pace and high volume of twitter messaging has upped the stakes for BI on content feeds. BI on content feeds: that would be stuff like monitoring and mining sentiment from social media for reputation and brand management, which you can do with text analytics on RSS and Atom feeds and Web pages. I wrote in September on a leading-edge implementation at Thomson Reuters. But twitter messaging is both faster and, given social-network mediation, more focused: instant messaging text gone public. One approach to making sense of the flow is the CEPish application of continuous transformations that the folks behind SQLstream recently showed me. >>Continue reading "BI on Content Feeds, a.k.a. Continuous (Twitter) Transformation" Posted Monday, December 8, 2008 11:13 PM >>Comments "Many Known and Unknown Fatal Bugs" in MySQL 5.1!? What to make of Michael Widenius's astounding blog posting, "Oops, we did it again (MySQL 5.1 released as GA with crashing bugs)"? The signs have been there: MySQL, as an independent company and then as a Sun Microsystems subsidiary, has worked for over three years (!) to bring out 5.1. EWeek quoted Zack Urlocker, VP of products for Sun's Database Group, last May as claiming, "This version now has zero bugs," a statement disputed even then by Widenius, who effectively characterized MySQL bug management as a shell game, and GA release did take 7 more months. >>Continue reading ""Many Known and Unknown Fatal Bugs" in MySQL 5.1!?" Posted Thursday, December 4, 2008 5:01 PM >>Comments Up Next: BI on Social Networks It's time for the BI community to treat social networks as the business-intelligence resource they are. (BI is more than reports, dashboards, and OLAP!) The recent "Motrin moms" clamor and response to Mumbai terrorism prove networks' value. 
Both cases involved twitter, the first as a conduit for advertising-prompted outrage and the second for early and rapid news dissemination. It has become clear that twitter and the rest of a broad set of social networks — as messaging / blogging / microblogging channels and as a means of publishing and finding personal and corporate information — hold immense business value. The value of the information that flows through these networks is indisputable. A deeper challenge is next on the agenda: optimizing that flow by better understanding the networks themselves. >>Continue reading "Up Next: BI on Social Networks" Posted Sunday, November 30, 2008 9:03 PM >>Comments Open Source BI: Eclipse BIRT and Talend Information Week has published my article on open source business intelligence (OSBI), Open Source BI Still Fighting For Its Share, a title that applies both to the BI software market and to IW column inches. (The article is now also an Intelligent Enterprise feature.) I'll share with readers material I wrote, cut by IW's editors, on open-source data-integration vendor Talend and on Eclipse BIRT, Business Intelligence and Reporting Tools. >>Continue reading "Open Source BI: Eclipse BIRT and Talend" Posted Monday, November 17, 2008 10:30 AM >>Comments Clarabridge Focuses on Customer Experience I spent a worthwhile day last week at Clarabridge's inaugural user conference. The company is a leading text analytics vendor, and the opportunity to catch up with staff and users and (it turned out) prospects and partners, without having to travel far from home, was too good to pass up. Clarabridge is different from many other text analytics vendors in its singular focus on customer-experience management (CEM). This isn't to say that you can't license and use Clarabridge's Content Mining Platform (CMP) for other applications, and it's not to say that the company doesn't have strong technical capabilities. 
It's a matter of market positioning that concentrates on a defined set of business users, across a broad spectrum of business sectors, with market messaging that focuses on business benefits rather than on technology. The approach seems to work. Clarabridge CEO and co-founder Sid Banerjee says "the business impact and deployability/usability of our solutions seem to be resonating with our customers and prospects." >>Continue reading "Clarabridge Focuses on Customer Experience" Posted Friday, November 14, 2008 8:12 AM >>Comments SAS Offers 101 on Voice Mining Manya Mayes of SAS has written a helpful introductory paper on audio analytics, "Tune into the Voice of Your Customer with Voice Mining." While the paper, which includes a call-center case study, focuses on customer-feedback audio, the technology and techniques described have applications for e-discovery, intelligence, and rich-media search. Given coverage of distinctive characteristics of speech and of analytical concerns that include BI integration, Manya's paper merits a look for anyone who works with audio data. Manya's thesis: "Combining voice capture with business intelligence, analytics, and text mining provides valuable customer intelligence for marketing and competitive intelligence business functions." Hers is a marketing paper, but it's motivated by significant technical and business challenges. I've looked at the issues myself and have planned to research them and couldn't agree with her more. (That's why we plan significant coverage of speech mining and audio analytics at the Text Analytics Summit, which I chair, in 2009.) >>Continue reading "SAS Offers 101 on Voice Mining" Posted Wednesday, October 29, 2008 8:00 PM >>Comments Heatmap Visualizations: the NY Times and NASDAQ The New York Times has published another excellent visualization, this one a heatmap, Can a President Tame the Business Cycle? 
The on-line, interactive version adds highly useful capabilities — a detail window that pops-up as a mouse-over effect, and alternative, bar-chart visualizations of each of the seven data series — that obviously can't be delivered in a static, printed newspaper. I've noted that paper/electronic gap before. My purpose this go-around is to explore, via comparison with a financial-information visualization published by the NASDAQ stock market, how underwhelming a mediocre heatmap can be. >>Continue reading "Heatmap Visualizations: the NY Times and NASDAQ" Posted Wednesday, October 22, 2008 12:10 PM >>Comments Lexalytics' ExecDex, or the PR Folks Know Best A press release from Lexalytics touts ExecDex, a Web site that features a "business-leader ranking index." I checked it out. Now Lexalytics makes an interesting sentiment-analysis engine, but I thought ExecDex should have been more fully developed before release to the likes of me. It seems Lexalytics CEO Jeff Catlin agreed, but the two of us couldn't have been more wrong. I suspect in the end, we both received a lesson in looking at tech applications through others' eyes. >>Continue reading "Lexalytics' ExecDex, or the PR Folks Know Best" Posted Friday, October 10, 2008 11:05 AM >>Comments The Semantic Web: Perhaps Not So "On the Cusp" The Semantic Web was conceptualized almost a decade ago, but despite progress on protocols and publishing tools, it remains far from realization. SW-technologist David Provost doesn't share my pessimism. To the contrary, the premise of his new report, On The Cusp: A Global Review of the Semantic Web Industry, is revealed by the report's title, namely that we're almost there. Yet the report itself, like so much material in the SW world, is itself devoid of semantic mark-up. Yes, semantics are important in boosting information findability and usefulness, but these SW examples — I cited another in a year-ago blog article — only emphasize the gap between SW boosterism and Web reality. 
>>Continue reading "The Semantic Web: Perhaps Not So "On the Cusp"" Posted Wednesday, October 8, 2008 6:04 PM >>Comments Nominate Now for the 2008 Jolt Awards Do you work for a company or project that makes software tools? Now's the time to check out, and consider a nomination for, the 19th annual Jolt Awards for software-development product excellence. I judge the database and enterprise tools categories. I'm definitely on the look-out myself for "Joltworthy" data management and analysis and application-deployment tools, products that provide an SDK, components, languages or APIs, and/or back-end capabilities for developers. There are 13 categories total. They accommodate a spectrum of software and software-related products. >>Continue reading "Nominate Now for the 2008 Jolt Awards" Posted Wednesday, October 1, 2008 3:32 PM >>Comments Is Business Activity Monitoring a BI Application? A question I posed to a LinkedIn group — Is Business Activity Monitoring (BAM) a BI Application? — sparked interesting discussion. I noted and asked, "BAM involves dashboards and analyses for business processes, and BI isn't typically very processy. If not BI, who 'owns' BAM?" There have been 9 responses to date, including two from Howard Dresner, who has done as much as anyone to shape current-day BI. The responses speak to growing interest in operational BI, and they hint at the impact that complex event processing (CEP) will have on enterprise analytics. BAM displays operational performance indicators in numerical and graphical form, often backed by rules-based alerting capabilities. BAM monitors execution of business processes and is part of operational-performance management solutions. It can be incorporated in line-of-business and operational interfaces, for instance for contact-center management, and in automated control systems. As the speed-of-business accelerates, BAM is more important than ever. >>Continue reading "Is Business Activity Monitoring a BI Application?" 
Posted Monday, September 29, 2008 12:33 PM >>Comments Lyzasoft's Non-Analytical Approach to Analytics Lyzasoft, Inc. calls its Lyza software a "powerful desktop analytics solution." According to the company, Lyza "enables analysts to synthesize, explore, and visualize data, then to publish compelling presentations and dashboards." Lyza seems worth considering as a personal data-integration tool, but it appears to fall short of greater claims. >>Continue reading "Lyzasoft's Non-Analytical Approach to Analytics " Posted Monday, September 22, 2008 11:14 AM >>Comments Event Processing Meets Text: Reuters at Gartner Richard Brown of Thomson Reuters delivered an illuminating talk, "News, Blogs, and Full-Tick Logs: Innovative Approaches to Quantitative and Event-Driven Trading," Tuesday at Gartner's Event Processing Summit. The summit and the Event Processing Technical Society symposium now underway feature many such use cases, descriptions of low-latency transformation and analysis of high-volume data and event streams as applied to diverse business problems. Brown's case study, which looked at exploiting information from unstructured sources to support financial-market trading, was of particular interest (to me) due to its combination of events, text sources, and sentiment analysis. >>Continue reading "Event Processing Meets Text: Reuters at Gartner " Posted Thursday, September 18, 2008 7:09 AM >>Comments Infobright, Kickfire, MySQL 5.1, and the MySQL Platform There's more to/about the Infobright open-source announcement than I covered in my Intelligent Enterprise article. I have thoughts to share on Infobright's architecture and limitations of the release. There's more to say about the MySQL data-warehousing context and then there's the puzzle of the significantly delayed MySQL 5.1 general availability (GA) release. 
>>Continue reading "Infobright, Kickfire, MySQL 5.1, and the MySQL Platform " Posted Monday, September 15, 2008 2:08 PM >>Comments MicroStrategy and SAS Advance Mainstream BI Visualization Advanced visualization isn't easy to get right. Simply put, our ability to generate and crunch data, and the palette of visualization options available to us, have outstripped our ability to choose charts and options that are appropriate for our data and effectively communicate important interrelationships. Nonetheless, we all know that visualization done right can really boost BI's value. The visualization approaches taken by leading BI companies vary significantly. Some have built out their capabilities nicely — notably MicroStrategy and SAS — while others continue to promote more-of-the-same-but-glitzier graphs and charts. In a worst-case scenario, we even have a major analytics vendor that’s fumbled its latest visualization launch. >>Continue reading "MicroStrategy and SAS Advance Mainstream BI Visualization" Posted Thursday, September 11, 2008 2:40 PM >>Comments BSD Licensing Puts the Shine on Google Chrome For all the coverage of the Chrome Web browser announcement, little note has been taken of Google's choice of the ultra-liberal BSD open-source license. The BSD choice accentuates that Chrome will be more (and very likely less) than a conventional Web browser. Chrome will contribute Web rendering to an increasingly comprehensive enterprise software platform and may (further) tie non-Google application developers to the Google stack. >>Continue reading "BSD Licensing Puts the Shine on Google Chrome" Posted Wednesday, September 3, 2008 4:13 PM >>Comments Learning about Text Analytics I spend a lot of time on teaching materials on text analytics: articles, presentations, and courses. I've gotten positive feedback about my introductory materials, which I designed for practitioners (like myself) rather than for academics or researchers. 
There are great resources out there — technical papers and white papers, case studies, software, etc. — but you have to get the basics down first. >>Continue reading "Learning about Text Analytics" Posted Friday, August 29, 2008 9:52 AM >>Comments Yahoo Plans "A New Generation of Search" Prabhakar Raghavan, head of Yahoo Research, says that Yahoo "will be launching a new generation of search in two to three months... Search is going to move in a completely new direction." The initiative, one would infer from today's Financial Express interview of Raghavan, will build on Yahoo's BOSS (build your own search software) platform, which implements a "self-service Web services model for developers and start-ups." >>Continue reading "Yahoo Plans "A New Generation of Search"" Posted Monday, August 25, 2008 1:50 PM >>Comments Kognitio and Data as a Service Gain Traction "Gaining traction" is a good description of Kognitio's Data as a Service message. I spoke to company execs at this week's Data Warehousing Institute Conference, and I reviewed results of a survey they released at TDWI. Signs are that Kognitio's DaaS positioning is helping the company define itself and carve a niche in the crowded and dynamic data warehousing-analytics market. That positioning may need tighter focus, but focus will surely come as the company signs some of the prospects it has gained since its February North American launch. >>Continue reading "Kognitio and Data as a Service Gain Traction" Posted Tuesday, August 19, 2008 1:58 PM >>Comments The Real Limits of Prediction Back in July 2005, I published a column titled The Limits of Prediction that addressed barriers to the adoption of predictive analytics. 
I've continued to think about the topic even as politicians come up with yet more wishful-thinking panaceas like off-shore oil drilling to reduce gasoline prices and business — for instance, the real-estate, financial, and automotive sectors — reports huge losses derived from in-hindsight foolish decisions. All those folks spend massive sums on IT and analytics. Why didn't they know better? >>Continue reading "The Real Limits of Prediction" Posted Wednesday, August 13, 2008 11:31 AM >>Comments Lexalytics and Infonic Merge (and Overvalue?) Sentiment Analytics The Lexalytics-Infonic merger announced last week creates a company, focused on sentiment analysis, that is poised to compete with larger, established text-technologies vendors. The companies' market presence hasn't been huge — Infonic's text-analytics division booked £600 thousand (about $1.2 million) in 2007 sales, and Lexalytics, a rapidly growing US start-up, earned $1.5 million in 2007 — but the combined company should be worth more than the sum of the parts. Whether it will be worth enough to justify a $40 million valuation of the deal, which exceeds Infonic's $23 million market capitalization, remains to be seen. >>Continue reading "Lexalytics and Infonic Merge (and Overvalue?) Sentiment Analytics" Posted Wednesday, August 6, 2008 9:58 PM >>Comments Sybase, DBMS Clusters, MPP, and DATAllegro Sybase is a DBMS stalwart that gets far less attention than deserved. The company recently beat financial-performance estimates and has raised its 2008 sales estimate to $1.11 billion. Sybase's on-going success — the company's DBMS is much more than the parent of Microsoft SQL Server — earns the company a closer look, in its own right and as an excuse for one last comment on Microsoft's planned acquisition of DATAllegro. 
>>Continue reading "Sybase, DBMS Clusters, MPP, and DATAllegro" Posted Thursday, July 31, 2008 2:05 PM >>Comments Two More Views of the Microsoft-DATAllegro Deal I learned of the Microsoft-DATAllegro deal from DATAllegro e-mail sent at 12:57 pm EDT on July 24. Ten hours later, at 11:11 pm, I thought I'd see what others had to say. The search for views was more illuminating than any additional analyses I found. Take a look... if you don't mind snarky blog articles — >>Continue reading "Two More Views of the Microsoft-DATAllegro Deal" Posted Thursday, July 24, 2008 11:00 PM >>Comments DATAllegro? Is Microsoft Buying the Wrong Company? My first thought on learning of Microsoft's plan to acquire DW appliance vendor DATAllegro is that MS is buying the wrong company. Yes, DATAllegro's parallelized database technology will fill a big gap in Microsoft's DW product line, namely that SQL Server doesn't scale to the top end, but the technology isn't compatible and will take years to build into SQL Server. I'd think it's Dataupia that would, without disruption, close the gap. No, this deal is about gutting the DATAllegro appliance. Look at the details — >>Continue reading "DATAllegro? Is Microsoft Buying the Wrong Company?" Posted Thursday, July 24, 2008 5:06 PM >>Comments Enterprise Search and the Findability Gap AIIM, the Association for Information and Image Management, has been bombarding me with e-mail promoting Dan Keldsen's and Carl Frappaolo's sponsored study, Findability: The Art and Science of Making Content Easy to Find. While I wonder what "findability" offers beyond buzzword differentiation for a few search vendors, even after learning that "findability is the art and science of making content findable," I do understand the distinction that "under findability, the burden of intelligent content processing is placed on the content itself." 
I'm tempted to say that's hokum — in keeping with my amity for automated knowledge discovery in text, a.k.a., text analytics — except that supporting examples are easy to come by. Here are a few, inadvertently provided by a leading enterprise-search vendor. >>Continue reading "Enterprise Search and the Findability Gap" Posted Thursday, July 17, 2008 6:44 AM >>Comments Aster nCluster Builds on Open Source PostgreSQL I've written about the "category error" of looking at open source primarily as targeting end-user replacement of BI applications and established data warehouse platforms. I've long seen that OS's greatest BI/DW impact has instead been in enabling developers to build BI into line-of-business applications and create specialized analytical tools. I'm more convinced than ever of this assessment, even as OS-BI vendors have launched improvements that target enterprise end users. On the DW front, the launch of Aster nCluster supports my point. >>Continue reading "Aster nCluster Builds on Open Source PostgreSQL" Posted Tuesday, July 15, 2008 4:18 PM >>Comments Survey on Voice of the Customer Text Analytics I have created a short survey for users and consultants on Voice of the Customer (VoC) text analytics best practices. There are seven questions plus a comment field. The survey should take less than 5 minutes to complete. If you are involved with VoC text analytics or are looking at solutions for possible adoption, please respond to the survey at -- http://www.surveymonkey.com/s.aspx?sm=HyhmPOYKhh8BcDeC_2b1Im5A_3d_3d I'll publish results at a later date. Thanks!
Posted Thursday, June 26, 2008 7:42 PM >>Comments E-Discovery, Compliance, Auditing, and Investigation E-discovery and auditing are flip sides of a single coin, the one concerned with retention of records and their production in litigation, the other with studying records to verify the correct execution of corporate business processes and accounting procedures. Extending the metaphor, compliance is the coin standing on edge: neither anticipation and response to litigation (e-discovery) nor historical analysis (auditing) but rather operational rules and monitoring designed to ensure that businesses stay out of legal and accounting trouble. >>Continue reading "E-Discovery, Compliance, Auditing, and Investigation" Posted Tuesday, June 24, 2008 5:57 AM >>Comments News & Surprises from Text Analytics Summit 2008 Others have reported on this year's Text Analytics Summit: the prevalence of Voice of the {Customer | Market | Patient} as a theme, the focus on sentiment analysis and on BI integration, the vendor announcements, applications for analysis of social media data, and so on. This commentary is helpful so I'll link to it in this article, and I have additional observations to share, drawn from summit discussions, concerning the evolution of the text-analytics market. >>Continue reading "News & Surprises from Text Analytics Summit 2008" Posted Friday, June 20, 2008 4:50 PM >>Comments Social Networking and the Enterprise I have to comment on my colleague Doug Henschen's article, "Is Social Networking KM All Over Again?" Doug did right at the Enterprise 2.0 conference to focus on cloud computing, a much more appropriate topic for enterprises than social networking. From the corporate perspective, the "cloud" is a diverse source of information, including all kinds of social and traditional media, out there to be searched and filtered for exploitable enterprise-relevant nuggets. But precipitous enterprise adoption of social networking? 
That would be foolish, destructive and not just disruptive. Corporations rely on and benefit from hierarchies and restricted lines of communications. Being selectively anti-social, for corporations, is a good thing. >>Continue reading "Social Networking and the Enterprise" Posted Thursday, June 19, 2008 10:18 AM >>Comments ParAccel Taps Experience, Open Source I've been very impressed by ParAccel, a company that has lived up to their Intelligent Enterprise designation as a Company to Watch. They sell a columnar, MPP DBMS that has won strong market visibility in the two years since the company's founding. The industry experience of ParAccel executives, gained at companies including Oracle, Teradata, Netezza, and DATAllegro, surely plays a part in the company's success, and so must two technology antecedents: an earlier columnar DBMS, Clareos, and the PostgreSQL open-source RDBMS. >>Continue reading "ParAccel Taps Experience, Open Source" Posted Thursday, June 5, 2008 10:45 AM >>Comments BI Innovation From the Inside Out Some of the most interesting BI innovation of recent days has come from a, well, likely source: insiders itching for another go at BI, a chance to (re-)do it right. Ward Yaternick is a case in point. Ward led Cognos development teams, with lead responsibility for the PowerPlay OLAP engine. He created OLAP@Work, an Excel add-in to access Microsoft OLAP Services that he subsequently sold to Business Objects. Ward has been building a new company/product, nextanalytics, that unquestionably represents a fresh take on BI. Ward believes nextanalytics is now ready for prime time. >>Continue reading "BI Innovation From the Inside Out" Posted Tuesday, June 3, 2008 12:17 PM >>Comments Misunderstanding Open Source Richard Stallman announced the GNU Project in September 1983. Eric S. Raymond published the first version of The Cathedral and the Bazaar in 1997. 
IDC estimated a year ago that worldwide revenue from standalone open source software reached $1.8 billion in 2006, projecting a compound annual growth rate (CAGR) of 26% from 2006 to 2011. That's revenue, not the presumably much higher avoided cost of closed source alternatives. So why are open-source fundamentals still so widely misunderstood, including in the business intelligence and data warehousing markets? Case in point: An advisory firm's recent ranking of Top 200 Private Technology Companies misclassifies DATAllegro, a data warehouse appliance vendor, as open source. Now DATAllegro's innovative "direct data streaming" and database parallelization technology is built around the Ingres open-source DBMS, but they haven't opened their own source à la Sun and OpenSPARC. >>Continue reading "Misunderstanding Open Source" Posted Monday, May 26, 2008 4:56 PM >>Comments From Text Analytics to Data Warehousing IBM recently posted a quite nice page on extracting business value from "unstructured" data. The page describes use of IBM's own products and formats to be sure, but it is potentially helpful for anyone who wishes to learn about information extraction from textual sources for data warehousing. IBM's page starts with a brief text-analytics overview. It then dives into implementation with the OmniFind Analytics Edition for DB2 and its pureXML capabilities. It describes a process flow that includes XML tagging of document features and the alternatives of mapping the XML schema to relational database structures or using the XML structures directly for analyses. This text-analytics workflow, and the choices involved in dealing with text-sourced information, are not specific to IBM's tools, however. So while IBM provides diagrams and code listings and an analysis of the alternative approaches that relate to their own products, the lessons apply much more generally. 
>>Continue reading "From Text Analytics to Data Warehousing" Posted Sunday, May 18, 2008 11:08 AM >>Comments A Visualization is Worth a Thousand Words The New York Times publishes exceptional visualizations. A couple this week stand out: All of Inflation's Little Parts, graphing the average American's spending by category, and a map of the human "diseasome" that supports the article, Redefining Disease, Genes and All. What distinguishes these visualizations — the first is a form of treemap, a "space-constrained visualization of hierarchies," and the second a network-connectivity diagram — is their success at communicating relationships along multiple data dimensions. >>Continue reading "A Visualization is Worth a Thousand Words" Posted Wednesday, May 7, 2008 6:05 PM >>Comments Finding Design Failure with Microsoft Office Search Commands Cheers to Microsoft Labs for their release of Search Commands, an Office 2007 add-in that "helps you find commands, options, wizards, and galleries in... Word, Excel, and PowerPoint." The embedded Guided Help calls it "a useful complement to the usual method of browsing for commands by clicking tabs on the Ribbon." I'm all for a way to work around Office ribbons, a set of interface elements introduced in Office 2007 that I characterized last September as "visually unbalanced." Ribbons degrade Office usability. I wrote in September that "they force extra clicking around for routine work and make it hard to find less frequently used functions." Microsoft is now, essentially, pleading guilty. Search Commands' Guided Help, in addition to calling the awkward process of "browsing for commands by clicking tabs" a "usual method," says Search Commands is "especially useful for finding commands that you use less often." 
>>Continue reading "Finding Design Failure with Microsoft Office Search Commands" Posted Tuesday, April 29, 2008 12:06 AM >>Comments Text Technologies in the Mainstream Adoption of text analytics has accelerated in the years I've followed the topic, with growth in expected and unexpected directions both. It wasn't hard to foresee extension of leading data mining workbenches to text, but I'll admit I had thought BI vendors would be much quicker to build handling of "unstructured information" into their technology stacks. (This lag has created partnering opportunities for a number of BI-focused text analytics companies. Business Objects' acquisition of Inxight and SAS's of Teragram show that the BI big guns are closing the gap.) And I didn't anticipate the nature of the solutions, other than semantically enhanced search and expansion in the legal, tax & regulatory (LTR) sector, that would be responsible for the greatest market growth. I'm referring to sectors such as media & publishing and applications including competitive intelligence and Voice of the Customer analytics supporting CRM, product management, and marketing. >>Continue reading "Text Technologies in the Mainstream" Posted Thursday, April 24, 2008 2:53 PM >>Comments BI Software Is a Commodity Technology BI Scorecard author and IE blogger Cindi Howson writes that she "gasped" on hearing Jim Davis of SAS talk about the commoditization of BI. Yet I'm with Davis — BI software is a commodity — the technology, that is, and not BI as a whole. That "as a whole" includes extensions wrapped around the commoditized technology core, extensions that build-out common-place core BI into solutions, extensions that adapt the technology — that package it in suite or application or embedded form — and link it to information sources and presentation and, nowadays, decision management. I wrote about BI's information angle in an earlier BI-market appraisal. 
I didn’t bother to take on BI technology as a commodity in that earlier article because the technology’s commodity status was almost self-evident to me. But to take it on now, I’ll cite what I wrote in a look at database management software. Condensing a bit: So far as that functional core is concerned, 1) there's broad consensus on definitions, 2) basic interface standards are well established, 3) leading products are interchangeable, 4) vendors compete on extended or niche capabilities, and 5) vendors don't compete on software price. I'd further add that with a commodity there is no barrier to user entry-level adoption and there's similarly only a low entry barrier for a would-be technology provider. This description fits BI technology. It fits reporting and OLAP and ETL from the leading BI firms. It explains why the innovation is outside these core areas, differentiated by (claiming) improvement on the commodity core. >>Continue reading "BI Software Is a Commodity Technology " Posted Tuesday, April 22, 2008 5:32 PM >>Comments Prediction Markets and Unpredictable Decision Making Prediction markets are mathematically based but human powered, a tool for turning collective human insights into forecasts. The approach enables individuals to bet on ideas or events. Twists such as anonymous participation, restriction to experts, and pre-screening for character traits are designed to reduce bias and boost accuracy. The New York Times reports that "companies use prediction markets to funnel ideas from the work force" (Betting to Improve the Odds, April 9, 2008). Steve Lohr's article provides illuminating examples from companies including Best Buy, InterContinental Hotels, and Hewlett-Packard. Lohr also notes, "for years, public prediction markets have been used for politics, where buyers and sellers bet on which candidate will win a particular race." 
It’s the kind of article that everyone interested in decision sciences should read, and then follow up on to understand not only the mechanics of the techniques but also their limitations. >>Continue reading "Prediction Markets and Unpredictable Decision Making" Posted Friday, April 11, 2008 1:42 PM >>Comments Teradata Has Acquired BI/DW Firm Claraview Intelligent Enterprise and TechWeb are not the New York Times, so we get to publish some of the news that's seemingly Not Fit to Print: Data warehousing powerhouse Teradata has acquired BI/DW solutions provider Claraview. The news is there on Claraview's Web site — "Claraview is a division of Teradata Corporation (www.teradata.com), the world's largest company solely focused on raising intelligence through data warehousing and enterprise analytics" — but I don't have a clue why Teradata hasn't seen fit to announce the acquisition. There are a few hints of the take-over out on the Web, and an individual in a position to know told me the deal is (or was) "an open secret." >>Continue reading "Teradata Has Acquired BI/DW Firm Claraview" Posted Thursday, March 20, 2008 2:51 PM >>Comments The SAS-Teragram Deal's Back Story SAS has announced their take-over of text-analytics vendor Teragram. The companies' joint press release states that "the acquisition will enhance SAS's own robust text mining and analytical BI offerings and extend them to enterprise and mobile search." The release cites Teragram's natural language processing (NLP), categorization, and enterprise search technologies, but it leaves much of the back-story untold. Here's my take on positioning, technology, and solution considerations that likely motivated what seems like a smart move for both companies. 
The essence of the announcement is captured in SAS CEO Jim Goodnight's statement, "Teragram's technologies augment, strengthen, and extend SAS's ability to combine structured and unstructured data – not only in our text mining solution but embedded across the entire SAS Enterprise Intelligence Platform – to drive better answers faster." SAS chief text mining strategist Manya Mayes added in e-mail to me, "Teragram has a broad solution that includes enterprise search and mobile BI" that "complements SAS Text Miner." >>Continue reading "The SAS-Teragram Deal's Back Story" Posted Monday, March 17, 2008 11:00 PM >>Comments Parsing Joseph Weizenbaum I learned of the March 5 death of computer scientist Joseph Weizenbaum from a posting on the CPSR e-mail list. CPSR, Computer Professionals for Social Responsibility, recognized Weizenbaum with their Norbert Wiener Award in 1988. But he was of course best known for creating Eliza, a mid-'60s computer program that conducted natural-language conversations, notably mimicking a psychotherapist's interview with a patient. Eliza was named for the character in George Bernard Shaw's Pygmalion (familiar as the source of the musical My Fair Lady), someone who was similarly taught to impersonate something she was not. Weizenbaum wrote in his 1976 book Computer Power and Human Reason, "Eliza created the most remarkable illusion of having understood in the minds of the many people who conversed with it." >>Continue reading "Parsing Joseph Weizenbaum" Posted Saturday, March 8, 2008 10:02 PM >>Comments Greenplum 3, Open Source (Bizgres) 0.9 Greenplum recently released a new version of its BI optimized DBMS, Greenplum 3 (G3). The software is based on the PostgreSQL open-source database system; proprietary extensions add support for parallel loading and query and other scalability and reliability features. But with G3, Greenplum appears to be moving ever farther from the company's open-source roots. 
Specifically, Greenplum sponsors Bizgres, an open source, BI-optimized but non-MPP DBMS, downloadable as an April 2006 0.9 version at bizgres.org or via Greenplum's site. Later versions are available in the code repository but no later version has been packaged as a General Availability release. Note that Greenplum does, separately, contribute to core PostgreSQL development. Frankly, I wonder if/why that contribution isn't the focus of the company's open-source involvement. >>Continue reading "Greenplum 3, Open Source (Bizgres) 0.9" Posted Monday, March 3, 2008 11:30 AM >>Comments Data Warehouse Appliances? Me too! Just as every presidential candidate this cycle is the candidate of Change, it seems that all the DBMS vendors offer the preferred data-warehouse appliance solution. That's the message I heard from appliance panelists at today's TDWI Washington DC chapter meeting. For a couple of them it was a real stretch, which in one case wasn't a bad thing. The net take-away is that we are seeing Change in the DBMS world, even if for the politicians that word is still only a promise. TDWI-DC's panel consisted of Doug Cardin from IBM, Victoria Eastwood from Infobright, Phil Francisco of Netezza, Foster Hinshaw of Dataupia, and Rita Sallam of Oracle. Now my definition of DW appliance is a packaging of processor, storage, operating system, and DBMS that is optimized for data warehousing. A scalability model is essential. And only one of the represented companies hits the mark: Netezza, with an asterisk for IBM. >>Continue reading "Data Warehouse Appliances? Me too!" Posted Friday, February 29, 2008 8:34 PM >>Comments TDWI Selection Bias: It Depends Whom You Ask The saying "There are three kinds of lies: lies, damned lies, and statistics" is attributed to Benjamin Disraeli, and it's nicely illustrated in a couple of Intelligent Enterprise reality-check photos from this week's TDWI conference. 
Check out the TDWI Image Gallery photos posted by Intelligent Enterprise Editor-in-Chief Doug Henschen. Executive Summit attendees used the "dots method" to identify Important BI Technologies and Biggest BI Challenges. You get simple histograms showing what's hot and what's not. And you get clear illustrations of "selection bias": not TDWI's fault, but an effect to keep in mind when you assess formal and informal research findings. >>Continue reading "TDWI Selection Bias: It Depends Whom You Ask" Posted Wednesday, February 20, 2008 12:57 AM >>Comments IBM on Text Technologies for the Legal Sector My last blog article relayed key points about e-discovery and potential knowledge-discovery (KDD) applications in the legal sector that were reinforced by my participation in the recent LegalTech conference. A LegalTech exhibitor I spoke to mentioned his company's discussions with IBM, so I dropped IBM text-technologies researcher Aaron Brown a note to learn his company's side of the story. Aaron is program director, Content Discovery and Search, IBM Information Management Software. His thoughts on legal-sector KDD were very much in line with mine. He graciously gave me permission to share his response, which I'll post verbatim — >>Continue reading "IBM on Text Technologies for the Legal Sector" Posted Friday, February 15, 2008 8:40 AM >>Comments Text Technologies in the Legal World "Discovery" is a legal process whereby parties to a lawsuit request and provide documents and information that may be pertinent in litigation. "Discovery" also describes an analytics goal that has nothing to do with the court system: extraction of useful information — data, facts, and rules, which together constitute knowledge — from databases and textual sources. 
I had expected the December 2006 federal rules amendments on discovery of electronically stored information — "discovery" here in the legal sense — to open new vistas for application of knowledge-discovery technologies: data mining, machine learning, visualization, and the like. The reasoning is simple. Corporations must now retain vast volumes of electronic records including e-mail and information from enterprise operational systems. To comply with e-discovery mandates, they must be able to "produce" records in response to discovery processes, and that means metadata-management, classification, search, and similar systems. >>Continue reading "Text Technologies in the Legal World" Posted Wednesday, February 13, 2008 5:04 PM >>Comments The Rigidity Trap Applies to PowerPoint, Dashboards Alike Curt Monash shares my disdain for PowerPoint: not the software per se but rather the rigid communication dysstyle (="dysfunctional style") it encourages. Seeming solutions such as pecha-kucha — "a simple set of rules to presentations: exactly 20 slides displayed for 20 seconds each" — are seductive. You pick up the pace and limit the text that can appear on a slide, which seem like pluses. Per Daniel H. Pink's description in Wired, you "say what you need to say... and then sit the hell down." On the other hand, you’re still locked in that rigid PowerPoint sequence. Just think of that stereotypical image of the American tourist abroad: If foreigners don't understand English, speak slower and louder and maybe they'll get it. It doesn't work of course. Similarly, faster, simpler presentations aren't necessarily better presentations. The same principle applies to communicating analytical results. 
>>Continue reading "The Rigidity Trap Applies to PowerPoint, Dashboards Alike" Posted Saturday, February 2, 2008 3:38 PM >>Comments Silobreaker advances social-network visualization I’m a fan of network visualizations, by which I mean display of interconnectedness mined from disparate sources. The subject matter could be just about anything: witness the collection of projects at Manuel Lima’s VisualComplexity site. Social networks inferred from on-line media prove particularly interesting, the sort of stuff you’ll find in static form at Jeffrey Heer’s and Danah Boyd’s vizster site and dynamically in Linkinfluence’s Map of the Political Blogosphere, which I wrote about last month. Silobreaker (as an on-line application) takes these efforts a big step further. Silobreaker visualizations add huge value to the company’s underlying news-aggregation service. They classify nodes by type. If you hover your mouse cursor over a node, you can explore its connectedness, and if you hover over the node’s text, you can learn more about that node, whether it represents a person, company, key phrase, or other type of entity. Hover over a link (edge) and you’ll see “documents indicating a relationship.” Naturally, you can double-click on a node to remake the visualization. Please, visit the site yourself and explore… and do some searches on terms that interest you. Then try the 360° Search, which aggregates content related to the search topic and displays retrieved and analyzed information via a variety of widgets. >>Continue reading "Silobreaker advances social-network visualization" Posted Wednesday, January 30, 2008 11:14 AM >>Comments Renaming the Next Generation Internet Prof. David Farber has issued an interesting challenge: Endlessly people talk about the Next Generation Internet. In fact the term has been used so badly that it is meaningless. 
I need a name for the Internet-like network we will need when we are faced with end to end optical communications at hundreds of gigabits; multi-core computers (large number) and other now-research technologies. While we shouldn't confuse names with substance, and while the Net would, like Shakespeare's Romeo, "Retain that dear perfection which he owes/Without that title," yet we understand the power of names to describe and even to inspire, including in the IT world. >>Continue reading "Renaming the Next Generation Internet" Posted Wednesday, January 23, 2008 11:23 PM >>Comments Buying FAST, Microsoft also gets AIW, Radar, and AdMomentum The best analysis of the motivations and implications of Microsoft's bid for Fast Search & Transfer (FAST) is Stephen Arnold's, yet his in-depth look, along with the rest of the reporting I've read, has nothing to say about some of the most interesting technology that Microsoft will acquire. Sure, FAST itself says that "Our Business is Enterprise Search," and enterprise-search mojo and customers are what Microsoft seeks to acquire. There's much more to FAST, however: a fresh approach to data warehousing, a search-integrated BI dashboard, and an ad-delivery platform, that last being where the real search money is to be made. >>Continue reading "Buying FAST, Microsoft also gets AIW, Radar, and AdMomentum" Posted Monday, January 14, 2008 9:53 AM >>Comments Column stores and Census data: ParAccel and SuperSTAR ParAccel has won well-deserved attention in recent months, including Intelligent Enterprise recognition as a Company to Watch. They're a start-up that boasts an all-star cast of executives, positioned in a hot category, namely column-store DBMSes that are optimized for analytics. There's irony, however, in their market positioning. It's not that column stores, most notably SybaseIQ, have been around for decades. It's that ParAccel chose to explain their product with an application, analysis of U.S. 
Census data, that is essentially owned by a competing column-store system, SuperSTAR from Space-Time Research. I have personal history here: I designed the U.S. Census Bureau's Census 2000 tabulation system, working on subcontract to IBM. Back in 1998, I wrapped up the selection of SuperSTAR over competing options. We chose SuperSTAR for superior performance and ease of use. I then led the development team that created a system that supported both ad-hoc queries and the production of hundreds of billions of statistical tables for subsequent publication via the Census Bureau's American FactFinder Web site. >>Continue reading "Column stores and Census data: ParAccel and SuperSTAR" Posted Tuesday, January 8, 2008 5:37 PM >>Comments Lessons from the Netflix Prize competition The $1,000,000 Netflix Prize competition has produced interesting results, even if no winner, 15 months in. Some of those results are a bit surprising; others we should have expected but didn't anticipate. So while participants haven't yet bettered the accuracy of Netflix' Cinematch recommendation algorithm by 10%, the threshold to win the $1 million prize, we can still take away lessons about predictive-analytics fundamentals. I recently checked on competition status after receiving a note from Alex Lupu, VP Marketing USA for Scio Systems; Alex has been keeping me apprised of his company's progress toward launch of property-lease abstracting and analysis tools. Like Alex I'm into text analytics, and I liked his take that "intelligent communication between customer and the [Netflix suggestion] system" could provide an alternative route to better recommendations. Alex sees analysis of "'open questions' that allow customer to write a sentence or two" about movies as potentially beneficial in complementing traditional, pure-numbers predictive modeling. Alex says "assuming the customer is a static entity seems wrong to me, thus looking at databases only is not of much help." 
>>Continue reading "Lessons from the Netflix Prize competition" Posted Thursday, January 3, 2008 11:14 PM >>Comments A Year of IntelligentEnterprise.com It has been a year since Intelligent Enterprise magazine went on-line only. The last print issue, dated January 2007, came out last December. I thought I would miss the paper edition but now I see that, from a writer's point of view, the overhead of a print run, particularly for an IT publication, is a greater liability than may be justified by the extra value delivered. I was afraid that writing for Web distribution would diminish my authority as an analytics-industry observer. After all, the expense of producing and distributing printed magazines says that someone, even if only the publisher, thinks that the content justifies the cost. Sure, items on the Web are more findable and, simultaneously, timely and long-lived. But for an established author, the threat of losing the distinction conferred by paper counterbalances greater reach and timeliness. That threat hasn't been realized. >>Continue reading "A Year of IntelligentEnterprise.com" Posted Monday, December 31, 2007 2:15 PM >>Comments Campaign Visualizations: The Bad and the Ugly I wrote last week about a set of New York Times campaign visualizations that caught my eye. They met my "good" criteria: data-appropriate, designed to communicate rather than (merely) show off. The good is often contrasted with the bad and the ugly. Let's check out examples and then look at a TIBCO-Spotfire demonstration site. The Bad: A Map of the Political Blogosphere from Linkinfluence, a company that "engineers mapping, monitoring, and analytics solutions for the social web." I read about the site in Matthew Hurst's Data Mining blog. Matthew writes, "I believe that they have put plenty of effort in to the design of the data visualization and the overall look and feel to really make the site stand apart from others in this space." 
>>Continue reading "Campaign Visualizations: The Bad and the Ugly" Posted Monday, December 24, 2007 10:28 PM >>Comments Campaign visualizations win my vote I do admire a nice visualization, one whose composition suits the nature of the underlying data, one designed to communicate rather than as a means of showing off technology. Given these criteria, the New York Times delivered twice last Sunday with a pair of visualizations that nicely distill presidential-campaign themes and dynamics from what was otherwise a mighty big pile of words: debate transcripts. The Times's visualizations are useful in another way. They exemplify good design, especially when contrasted with other technology-first visualizations on similar topics. >>Continue reading "Campaign visualizations win my vote" Posted Tuesday, December 18, 2007 2:34 PM >>Comments Business Intelligence in 2008 Facebook is good for something (beyond wasting time)! It brought me to a BI 3.0 discussion thread started by Darren Cunningham, prompted by his LucidEra colleague Ken Rudin's blog entry, What's in Store for Business Intelligence in 2008. Ken is perceptive. His five predictions are:
Do follow the link and read the full blog article, and then consider my BI 3.0 comment addressed to Darren, that LucidEra does interesting enough work, but Ken Rudin's hot-in-2008 list is mighty solipsistic. Is there really nothing (significant) in store for BI in 2008 that isn't touched on by LucidEra offerings? >>Continue reading "Business Intelligence in 2008" Posted Monday, December 3, 2007 11:15 PM >>Comments (How) Has Open Source Data Warehousing Developed? Given the strengths of open source database management systems (DBMSes), open source seems like a natural platform for data warehousing. We've seen a number of success stories over the last few years, Travelocity, O'Reilly, FTD, and Frontier Airlines among them, but the roster of case studies is mighty thin. But I've only recently (re-)started looking — in the last couple of years, on the open-source front I've covered mostly BI (e.g., May, March) — and I hope to find many more for a report I am planning on open source (based) data warehousing (OSDW). >>Continue reading "(How) Has Open Source Data Warehousing Developed?" Posted Tuesday, November 27, 2007 8:14 PM >>Comments Let's stop agonizing about BI positioning I'm getting pretty tired of the agonizing whether recent market events and trends mean the end of business intelligence as we know it. Some of my fellow pundits are scrutinizing vendor consolidations and they're studying the impact of the emergence of new analytical approaches and application-delivery methods. Consolidation will mean a refocusing of product development for acquired vendors, that's all, not "the end of BI as a separate application" as Ephraim Schwartz, for instance, sees it. 
Schwartz acknowledges authoritative views that are contrary to his — "As Howard Dresner, principal at Dresner Advisory Services, says, 'For every vendor that is acquired, there are 20 emerging companies offering new approaches, technologies, and business models.'" — but dismisses them and slights the impact that Software as a Service, SaaS, has already had on BI. >>Continue reading "Let's stop agonizing about BI positioning" Posted Wednesday, November 21, 2007 5:08 PM >>Comments BI needs both architectural thinking and innovation IBM buys Cognos. SAP buys Business Objects. I agree with Neil Raden: "One big yawn." Neil's take: [Hyperion, Business Objects, and Cognos] have made substantial progress in refurbishing their products for a completely new world, but refurbishing only goes so far. The architectures can't really cut it. They need to scale, they need to be intelligent, they need to react in real-time when necessary, unattended when appropriate. They need to live on the Web. Neil's analysis is spot on but calls for elaboration. Neil sees outmoded product (and process and business?) architectures as impeding innovation, for the established BI vendors and implicitly for the organizations that rely on their tools. But there's more to the picture than scalability, "intelligence" (whatever that is), real-time reaction, autonomicity, and webification. The more is imagination: openness to, and the ability to deliver, new ways of analyzing data and using analytical findings. >>Continue reading "BI needs both architectural thinking and innovation" Posted Thursday, November 15, 2007 11:18 AM >>Comments Tableau does Web 0.2... but that's just a first step In a year when the Net is abuzz about Web 2.0, Tableau Server, out this week, qualifies as Web 0.2. But don’t get me wrong. Web 2.0 is about social media and collaboration, user-driven integration, on-demand access, and a first level of semantic search and discovery. 
On the back-end, Web 2.0 is about network-accessible services that enable all that stuff. Tableau Software’s first foray onto the Web is a modest step when considered in light of Web 2.0 agendas, and also in light of the very high expectations created by the company’s stand-alone Tableau Desktop application. It is not, however, a failure. Rather it shows caution, implicit care to get it (collaborative Web computing) right and not overextend and underdeliver. >>Continue reading "Tableau does Web 0.2... but that's just a first step" Posted Tuesday, November 6, 2007 3:27 PM >>Comments Can Security Awareness Deliver Competitive Advantage? It's disconcerting to live in a world where security can be seen as delivering competitive advantage, yet that's the idea behind Unisys's Enterprise Security initiative. But after all, the company's Trusted Enterprise Model only extends the security selling point that is a marketing mainstay for financial institutions and that has been adopted or embraced by IT vendors, sometimes far too slowly, with the rise of network computing. >>Continue reading "Can Security Awareness Deliver Competitive Advantage?" Posted Wednesday, October 31, 2007 11:22 PM >>Comments BI as Commodity Technology: The Information Angle I promised to follow an earlier article that looked at database management systems as a commodity technology with a similar assessment of business intelligence. In drafting the promised article, however, I realized that I couldn't limit my evaluation to the software side of BI. BI is complex. It is simultaneously software, transformational work practices, and business information. Admittedly, I am going far beyond IE Editor in Chief Doug Henschen's take on BI, but consider: What value is reporting or OLAP or data mining — software — that doesn't tap all data that contributes to relevant business insights — information — that can help you restructure, realign, or optimize business operations — practices? 
To understand if BI is a commodity technology, we need to examine all three, complementary aspects of business intelligence: software, information, and practices. Let's start with information, with BI sources and BI results. >>Continue reading "BI as Commodity Technology: The Information Angle" Posted Monday, October 29, 2007 4:31 PM >>Comments Jolt Awards Nominations Now Being Accepted A quick notice to let everyone know that the nomination period for the 2008 Jolt Awards is now open. I'm a Jolt Awards judge, my second year, while this is the 18th go-around for the awards. The main sponsor is Dr. Dobb's, like Intelligent Enterprise a CMP computing magazine (or is IE a business magazine focusing on computing or is IE a portal rather than a magazine?) The Jolt Awards target various aspects of software development practice: development environments, management and coding tools and utilities, libraries, books, developer networks, and lots more. >>Continue reading "Jolt Awards Nominations Now Being Accepted" Posted Friday, October 19, 2007 9:06 AM >>Comments Semantic Web Visions: A Tale of Two Studies Prof. Jorge Cardoso of the University of Madeira, Portugal, has written a very interesting paper titled "The Semantic Web Vision: Where are We?" Cardoso surveyed over 600 academic and industry researchers in December 2006. He published his findings in the September-October 2007 issue of IEEE Intelligent Systems. They include that "mainstream adoption is still five to ten years away." Cardoso defines the Semantic Web as "a machine-readable World Wide Web" and he notes "a significant evolution of standards as improvements and innovations allow the delivery of more complex, more sophisticated, and more far-reaching semantic applications." (Bill Inmon, please note.) 
>>Continue reading "Semantic Web Visions: A Tale of Two Studies" Posted Wednesday, October 17, 2007 5:37 PM >>Comments Petraeus Does PowerPoint Is there anything to add to an item that was the rage of the political media a month back, the misuse of one of our favorite miscommunication tools, PowerPoint, by U.S. military leadership? Check out Gen. David Petraeus' September 10, 2007 slideshow explaining and justifying the drawdown of U.S. troops inserted into Iraq in the recent "surge." The bloggers (e.g., Yglesias, Benen, Drum) found particularly notable an egregiously amateurish slide that related simply that in nine months, troop levels would be back to the levels of nine months previous. >>Continue reading "Petraeus Does PowerPoint" Posted Monday, October 15, 2007 12:07 AM >>Comments Is Database Software a Commodity Technology? Is database management a mature, commoditized software technology? It depends whom you ask. Start with IE Editor-in-Chief Doug Henschen, who wanted my take last July after he and I each wrote on Oracle 11g. My answer: definitely!, which no doubt confirmed Doug in his intention to later write that Oracle President Charles "Phillips's higher calling was to dispel the idea that database management systems have been commoditized in a mature market." OK, I admit that the president of Oracle speaks with more authority than I do although he perhaps speaks with more bias as well. I don't have a 47.1% share of the RDBMS market to protect. Nor is Microsoft's 17.4% RDBMS market share my responsibility. Contrast Phillips with Microsoft Technical Fellow David Campbell, quoted in an interview published in the September issue of Database Trends and Applications magazine. DBTA asked Campbell about factors distinguishing Microsoft SQL Server and its primary competitors. Campbell's response: "For 99 percent of your information-management needs, you can get the job done with SQL Server, DB2, or Oracle." 
Campbell's conclusion was that "you want to choose the one where your ongoing cost of operations is lowest." >>Continue reading "Is Database Software a Commodity Technology?" Posted Sunday, September 30, 2007 11:04 PM >>Comments Five years of OpenOffice.org OpenOffice.org has reached a significant anniversary. Earlier this month, OO passed the five-year mark as the only office software on my laptop computers, first installed when I bought a Windows 2000 machine in 2002, reinstalled a couple of months ago on a replacement laptop running Windows Vista and Ubuntu Linux. With open-source Apache Tomcat, Cygwin, Firefox, MySQL, Python, R, and Thunderbird to keep OO company, there's been no looking back. Instead, given diverse project-health indicators such as the release of IBM's new Lotus Symphony, the reported assignment of 35 programmers to the project by IBM, and the continued evolution of the NeoOffice native version for MacOS X, I'm looking forward to my next five years of OpenOffice.org. >>Continue reading "Five years of OpenOffice.org" Posted Thursday, September 27, 2007 9:30 AM >>Comments Complex Event Processing Struggles for Market Definition Complex Event Processing (CEP) seemed like a no-brainer for broad-market acceptance when I first wrote about a key constituent technology a couple of years back. Relational data warehouses and conventional analytics have not kept up with the explosive growth in real-time data volumes and the perceived demand for real-time analytics. CEP promised to fill the gap: technology developed for the extreme high-volume, low-latency processing demands of capital-market algorithmic trading and communications networks, compatible with emerging service-oriented architectures, applicable to a broad spectrum of security, logistics, and click-stream challenges. Further, CEP is supported by a vibrant, diverse community of academic and industrial researchers. 
IBM and Oracle and other established companies are doing very significant work, and the field has spawned half-a-dozen start-ups. Yet two years on, CEP is still struggling for market definition outside of capital markets. >>Continue reading "Complex Event Processing Struggles for Market Definition" Posted Wednesday, September 26, 2007 12:22 PM >>Comments Actuate: Commercial Open Source, Commercial Community I'm grateful to Actuate for giving me a preview look at BIRT Exchange, a new community site set to launch next Monday, September 24. Like the sponsoring company, the new site straddles the commercial open and closed source worlds. It will surely benefit BIRT Java programmers whether they use the open-source Eclipse version of BIRT or the closed source Actuate version. But make no mistake: Actuate's motives remain staunchly commercial and the company will retain tight control over BIRT development. >>Continue reading "Actuate: Commercial Open Source, Commercial Community" Posted Tuesday, September 18, 2007 2:05 PM >>Comments Merger Mania: What's Next For Analytics Vendors? Cognos's planned takeover of performance-management (PM) and OLAP vendor Applix was eminently predictable. BI companies have been hungry for PM capabilities; think of the Actuate-Performancesoft, Business Objects-Cartesis, Oracle-Hyperion, and SAP-OutlookSoft deals. Cognos's step -- preceded by their 2003 Adaytum acquisition -- leaves Information Builders and Microstrategy as the only major BI players without deep, in-house PM solutions and Longview as one of the last remaining independent PM vendors, albeit noting their alliance with Information Builders and others. >>Continue reading "Merger Mania: What's Next For Analytics Vendors?" Posted Wednesday, September 5, 2007 3:38 PM >>Comments Host Google Ads, Boost Your Page Rank I've been puzzling out a technique, used by sites that machine-aggregate content, that may boost pages' Google rankings. 
The aggregators stuff their pages with (Google) ads and contextually similar - albeit just similar enough - content. All that pseudo-content surely moves them up the Google rankings. How else to explain the success of the bottom-feeders who exploit others' content in order to sell ads? >>Continue reading "Host Google Ads, Boost Your Page Rank" Posted Friday, August 24, 2007 11:26 PM >>Comments Market Intelligence (without Search): TechNavio Debuts Search has many limits but the price is right and the alternative, reliance on traditional interfaces and human-structured information, is increasingly perceived as unacceptable. Yet there's still much to be said for old-school information-retrieval methods. Witness a new, search-free IT-market research tool, TechNavio, just launched by Infiniti Research. >>Continue reading "Market Intelligence (without Search): TechNavio Debuts" Posted Wednesday, August 22, 2007 8:25 AM >>Comments FAST Falters: Financials and BI-Search The appetite for Search continues to grow rapidly, and Fast Search and Transfer (FAST) has been one of the most aggressive players in the market. I've covered FAST's move to provide a contextual advertising alternative. Another company initiative boldly claims to revolutionize business intelligence. Yet FAST's recent difficulties, which appear to involve technical missteps and not just operational issues, should make us rethink the limits of search, particularly when it comes to extravagant claims about search-BI. >>Continue reading "FAST Falters: Financials and BI-Search" Posted Monday, August 20, 2007 3:09 PM >>Comments Gartner, Open Source, and Microsoft I received Gartner e-mail this week marketing their up-coming open-source summit. The message contains gems that illuminate Gartner's perspective on open source and the larger IT world. I characterized Gartner as the oracle of IT establishment and looked at their summit plans in a blog entry last week. 
Analysts will explain the heretofore anti-establishment open-source movement, albeit without the help of representatives of the communities that lend open source its power and vibrancy. Gartner's theme -- a quite valid one -- seems to be that establishment IT needs to come to grips with open source, and of course that Gartner is the organization that can show the way. They claim to be good at it. >>Continue reading "Gartner, Open Source, and Microsoft" Posted Wednesday, July 18, 2007 10:54 AM >>Comments Can Oracle 11g OLAP Query Acceleration Alter BI? Yes, Oracle 11g is a blockbuster release, sure to maintain the company's dominant market position. >>Continue reading "Can Oracle 11g OLAP Query Acceleration Alter BI?" Posted Sunday, July 15, 2007 2:27 PM >>Comments Roads to Semantics: Tim Berners-Lee and Bill Inmon There couldn't be a greater contrast between the views on semantics of Web creator Tim Berners-Lee and of data warehousing figure Bill Inmon. >>Continue reading "Roads to Semantics: Tim Berners-Lee and Bill Inmon" Posted Monday, July 9, 2007 7:21 AM >>Comments Cognitive Dissonance: Gartner and Open Source Gartner has announced an Open Source Summit for this coming fall. The summit will bring together, on the one hand, an analyst firm known for authoritative pronouncements on all things IT, and on the other, a disruptive model for software development that is, at its core, anti-authoritarian. The term that comes to mind is cognitive dissonance. How will the Gartner summit bridge two conflicting world views? >>Continue reading "Cognitive Dissonance: Gartner and Open Source" Posted Tuesday, July 3, 2007 1:16 PM >>Comments Voice of the Customer is Only Half the Text Analytics Picture As Curt Monash reports in his Text Technologies blog, Voice of the Customer was a central theme at this year’s Text Analytics Summit. 
The aim is to stay on top of reputation, quality, and product-design issues by crunching blog- and message-board text, call-center notes and e-mail, and free-text survey responses. (Some vendors call these activities "Enterprise Feedback Management.") Yet VOC and the analytical approach it typifies are only half the overall text-analytics picture. Text analytics still delivers very high value in traditional, non-VOC application domains such as life sciences and intelligence, areas where vendors still derive the major part of their revenues. >>Continue reading "Voice of the Customer is Only Half the Text Analytics Picture" Posted Thursday, June 14, 2007 5:15 PM >>Comments Text Analytics in Search (and a 'PS' on Inxight) The relationship between search and text analytics was a recurring topic at this week's Text Analytics Summit in Boston. The one supports information retrieval and the other just about anything else automated you can do with a document set, from knowledge extraction to automated classification and processing: complementary functions that rely on similar technical underpinnings. Ramana Rao, who has a wonderful ability to clarify, put it this way: "Google's white box makes everything seem so simple," but "we got to simplicity without handing the complexity of reality." It's text analytics, of course, that will equip search to handle complexity. >>Continue reading "Text Analytics in Search (and a 'PS' on Inxight)" Posted Thursday, June 14, 2007 1:18 AM >>Comments Can IT Redeem Politics Gone Wrong? Retired Vice Admiral Lowell Jacoby, former director of the Defense Intelligence Agency, currently an executive at a Washington DC IT solutions provider, gives exactly the keynote presentation one would expect. He gives a keynote whose essence I've heard before: IT is an information sponge that can clean up some nasty, real-world spills. 
I heard this theme in 2003 when Richard Perle, former assistant secretary of defense, spoke at a Capitol Hill program on data mining. It was the rationale for DARPA's ill-fated-but-resurgent Total Information Awareness program. It bespeaks an attitude that would apply IT on a massive scale in a rear-guard attempt to contain a political situation gone horribly wrong - we have to do something, right? - with not a moment's thought given to alternative paths. >>Continue reading "Can IT Redeem Politics Gone Wrong?" Posted Friday, June 8, 2007 4:56 PM >>Comments On the Inxight and ClearForest Text Analytics Deals Buying Inxight is a smart move for Business Objects. Business Objects has strong analytics and is well positioned in the BI marketplace. Folding the capability to extract information from text into their technology stack is a natural next step for the company. This acquisition affirms the text-BI/integrated-analytics strategy being pursued by other vendors, notably Attensity, Clarabridge, Intelligent Results, SAS and SPSS. It also follows the precedent of data-integrator Informatica's late 2006 purchase of text-analytics vendor Itemfield. While the Itemfield deal closed less than six months ago, a demo I saw at last week's TDWI World Conference showed that Informatica has made quick work of seamless integration of text-extraction into their flagship PowerCenter ETL product. Given that Business Objects has been an original equipment manufacturing (OEM) licensee of Inxight technology for several years – which provides at least one reason they didn't go after fellow French company TEMIS – I expect similarly quick inclusion of Inxight text analytics into Business Objects' Data Integrator product. 
>>Continue reading "On the Inxight and ClearForest Text Analytics Deals" Posted Thursday, May 24, 2007 12:22 PM >>Comments Text Analytics Comes of Age An Accelovation briefing earlier this week was doubly helpful in affirming my take on the maturity of the text-analytics market and in showing me that I might be doing my job wrong. Regarding the first point: Despite an hour-long presentation by company CEO Jonathan Spier and VP of Products Jens Tellefsen, I don't have a clue how the company's "business insight discovery technology" works. That's because the company is pitching solutions to business analysts and not to IT geeks like me, a sure sign that the underlying technologies are stable and capable, and Spier and Tellefsen steered our conversation away from tech talk. That a vendor can center its message on what rather than how is a hallmark of a maturing market. >>Continue reading "Text Analytics Comes of Age" Posted Thursday, May 17, 2007 9:22 AM >>Comments Open Source BI Firm Targets Integration Needs I profited from my recent Rome visit to learn more about an aggressive open-source business intelligence (OSBI) contender, SpagoBI, positioned to go head-to-head with leading OSBI rivals. I've known about SpagoBI for a couple of years; the software is produced by Rome-based systems integrator Engineering Ingegneria Informatica. It uses some of the same components as software from Pentaho and JasperSoft, packaged however in a framework that Technical Director Gabriele Ruffatti asserts is more flexible and extensible than those of Engineering’s OSBI rivals. >>Continue reading "Open Source BI Firm Targets Integration Needs" Posted Wednesday, May 16, 2007 8:45 AM >>Comments Open Source Business: Altruistic or Profit Driven? While in Rome last week to teach a class on "Open Source for the Enterprise," I had the pleasure of getting together with Roberto Galoppini, who consults and writes a very perceptive blog on commercial open source. 
Check out, for instance, this useful table classifying licensing and revenue models of companies commercializing open source software (OSS). Roberto thinks a lot about open-source business models, and he took issue with an over-simplification in my course materials. We agree that a third column is missing from a table I had prepared contrasting Open and Closed approaches, this table – >>Continue reading "Open Source Business: Altruistic or Profit Driven?" Posted Monday, May 14, 2007 12:12 PM >>Comments Report from the European Text Analytics Summit I had the privilege of chairing last week's European Text Analytics Summit in Amsterdam. The event was very enjoyable, in no small part because of the diversity of attendee backgrounds and roles. I've never attended any other computing event (outside the summit series) that mixes scientists, police investigators, and media-company product managers with technologists. While I can't say I learned anything completely new (to me), quite a few points surfaced that are worthy of note. I'll report some of them, grouped under the headings user stories, market, and technology. >>Continue reading "Report from the European Text Analytics Summit" Posted Tuesday, May 1, 2007 8:53 AM >>Comments FAST pushes SNaaS – Software NOT as a Service Enterprise search vendor FAST is poised to strike a blow for SNaaS – Software NOT as a Service. With a planned April 30 software release, FAST plans to alter the Web's money equation, which to date has been service mediated. The FAST AdMomentum platform – provided for installation by online publishers – is designed to shift control of delivery of contextual advertising from third-party service providers. Company CEO John M. 
Lervik claims that by adopting a SNaaS model, media companies, retailers, and telecommunications service providers will be able "to maintain control of their revenue, serving their advertisers and audiences more effectively," something "difficult to do with third-party platforms." >>Continue reading "FAST pushes SNaaS – Software NOT as a Service" Posted Tuesday, April 17, 2007 7:26 PM >>Comments The Grand Challenge for Text Mining Ronen Feldman last year posed a grand challenge problem for text mining: to create "systems that will be able to pass standard reading comprehension tests such as SAT, GRE, GMAT etc." Feldman is one of the great authorities of the field, a computer science professor, author, and co-founder of text-analytics vendor ClearForest. No one is more qualified to suggest text mining's research agenda than he. Indeed, the aim of Feldman and his 2006 SIGKDD co-authors proposing Data Mining Grand Challenges goes far beyond research. It is to "get researchers, press, funding agencies, venture capitalists, and public interested, greatly stimulate research, and produce dramatic advances in science and technology." This is a worthy vision and goal. >>Continue reading "The Grand Challenge for Text Mining" Posted Friday, April 13, 2007 11:49 AM >>Comments Just How 'Free' Are Open Source Licensing Models? Confusion and controversy about Open Source licensing did not start with current Free Software Foundation efforts to revise the GNU General Public License (GPL). Nor will emergence of an acceptable GPL V3 – or of a revised Lesser GPL or Affero GPL (thanks Dana Blankenhorn) – make OS licensing much less problematical for enterprise users. Concerns are both alleviated and complicated by a profusion of options that range from GPL's communitarianism to the Common Public License's collaborative focus to BSD's laissez-faire liberality. The variety of schemes in use creates opportunity: witness, for instance, Apache's magnificent munificence. 
But one must also take care to avoid bait-and-switch, pretend Open Source licenses that promise freedom in both common senses, liberty and price, but ultimately deliver neither. >>Continue reading "Just How 'Free' Are Open Source Licensing Models?" Posted Tuesday, April 10, 2007 10:26 AM >>Comments Reframing Text Analytics with BI I spent a pleasant and illuminating 90 minutes recently with Justin Langseth, president and co-founder of Clarabridge. Clarabridge sells text-mining software designed to integrate with business-intelligence tools. The company's solutions target both established text-analytics markets, such as life sciences, law enforcement and intelligence, as well as rapidly growing segments: marketing, CRM, reputation management and the like. But boosting Clarabridge is not my job, and, at least for those 90 minutes, it wasn't Justin's either. We did talk about the company's latest software release, but the bulk of our conversation, the helpful and illuminating part, was about the changing market landscape. >>Continue reading "Reframing Text Analytics with BI" Posted Monday, April 2, 2007 8:12 AM >>Comments InfoWorld Follows (Readers') On-line Path InfoWorld has announced that their April 2 issue will be the last to appear in print. The magazine – the computing trade rag I'd most want to write for if I weren't part of the Intelligent Enterprise family – follows in IE's footsteps in going on-line only; IE's last print issue appeared in January. Like IE, InfoWorld cites the Web as "a more efficient delivery mechanism" and they also cite advertisers' desire for "more immediate gratification and measureable results than print can afford them." Yet there's another important factor to on-line delivery that InfoWorld does not explore: reader preferences. 
>>Continue reading "InfoWorld Follows (Readers') On-line Path" Posted Monday, March 26, 2007 12:05 PM >>Comments On Products, the Press, Analysts and SaaS "How do companies with such a trivial product get such [extensive] press?" I had written in a recent blog entry about the claims and the coverage garnered by a BI software as a service (SaaS) company. Their products may be quite nice – their architecture and positioning seem sound – but their grandiose self-depiction overstates their impact on the overall BI market. The person who sent me this question founded a company that creates BI solutions using open-source software, a rival of the company I wrote about. His question was half serious, half rhetorical – he surely had his own answer in mind – but it's worth a moment's thought. Consider the following as two minutes of PR 101 from someone on the receiving end of many press releases. >>Continue reading "On Products, the Press, Analysts and SaaS" Posted Wednesday, March 21, 2007 12:59 PM >>Comments New from the Hype Machine: BI as SaaS The launch of LucidEra, an "on-demand, reporting and analysis solution that focuses on simplicity," has generated quite a bit of attention. The company has been put forward as the poster child for software-as-a-service (SaaS) BI, a pacesetter for an emerging BI revolution. My take? The company appears to have spun the impact of a modest, narrowly focused (and perhaps quite nice) solution far out of proportion to its real import. Salesforce.com is a hosted customer-relationship-management (CRM) system. Fair enough; sounds like a solid foundation for a business. Babbi wrote in e-mail to me that "more applications are planned. Unfortunately we are not ready to share that roadmap at this time." 
>>Continue reading "New from the Hype Machine: BI as SaaS" Posted Tuesday, March 13, 2007 8:10 AM >>Comments Straight Dope About Open-Source BI Open-source business intelligence has changed nothing, yet it is making all the difference in the world. >>Continue reading "Straight Dope About Open-Source BI" Posted Monday, March 5, 2007 5:12 PM >>Comments Can Open Source Apps Find Strength in Numbers? Observations I drew from this week's LinuxWorld OpenSolutions Summit are that (1) location does matter, in both physical and market space and (2) some people have a strange notion of what constitutes an IT solution. Regarding market space – namely how to go about creating some – the interesting news at the summit was the announcement of a new Open Solutions Alliance. But I'll get to that after first explaining my point on strange notions. >>Continue reading "Can Open Source Apps Find Strength in Numbers?" Posted Friday, February 16, 2007 2:10 PM >>Comments SAS BI: Solid or Stolid? I agree with Gartner's assessment that "SAS offers the most comprehensive BI platform in the industry" with unmatched advanced analytics. It has been twenty years since I first programmed with SAS. I've invested thousands of hours in the company's products. I want the company to do well. And fortunately my experience over the years and my on-going monitoring of the broad BI market has rewarded my confidence. Yet Gartner also reports that SAS BI software is perceived as lacking usability. I agree with that assessment, too, and I've welcomed SAS efforts to counter it. It has been a chronic limitation that SAS functions are incompletely exposed through the graphical interfaces and that the variety of powerful analytical and presentation modules are not well integrated. I'd like to know that those situations have changed. >>Continue reading "SAS BI: Solid or Stolid?" 
Posted Wednesday, February 14, 2007 9:01 AM >>Comments Defining Text Analytics I’ve been writing and speaking and consulting on text analytics for years. This work led to a recent call from Philip Russom, an analyst at the Data Warehousing Institute, late of Forrester, Giga, Hurwitz, and Intelligent Enterprise. Philip invited me to contribute an expert comment – my take on “text analytics” in six sentences or fewer – for a forthcoming TDWI report on BI search and text analytics. I failed. I took eight sentences – we’ll see if Philip cuts them down – and I thought I’d share the lot with you. >>Continue reading "Defining Text Analytics" Posted Thursday, February 8, 2007 10:37 AM >>Comments Roadkill at the Corner of Search and BI The intersection of search and business intelligence has gained lots of attention in recent months. But are companies actually implementing search-BI solutions? Or is search-BI mostly talk, as yet unworthy of serious consideration by organizations with real-world problems to solve? E-mail from a friend seeking advice suggested these questions. Christine works for a prominent, expensive BI-DW consultancy. She's helping a client with information access problems. Her client regularly generates large numbers of reports that "sit within a data warehouse system, run off Business Objects/Cognos against one source or another, or more commonly are created manually by copying/pasting or rekeying data into a spreadsheet/Word doc/Powerpoint/Web site." The client envisions "something like the Amazon site that allows search, and when someone selects a report it lists other reports, noting 'people who looked at this report also looked at... and a way for users to rate the report, etc.'" Yes, even folks who rekey data into spreadsheets are allowed to dream. 
>>Continue reading "Roadkill at the Corner of Search and BI" Posted Wednesday, January 31, 2007 10:49 AM >>Comments EnterpriseDB's Open (Source) Deception Andy Astor, CEO of EnterpriseDB, stated in November, "Our offering is How does EnterpriseDB see it? CEO Astor, in an April 2006 interview, said that "the extensions that we've done [to the open source PostgreSQL that EnterpriseDB is based on] -- the Oracle compatibility for example -- is code that we share with customers who subscribe to our service." Paying customers -- and only paying customers -- can see and modify the code but "just can't redistribute it." That's open source?! >>Continue reading "EnterpriseDB's Open (Source) Deception" Posted Monday, January 15, 2007 4:47 PM >>Comments Humans and Avatars: The Ghost in the Machine The January 10 New York Times ran an intriguing article, "Computers Join Actors in Hybrids On Screen". It describes a new James Cameron film, "Avatar." The movie's alien characters will be designed by computer but played by human actors. The Times reports that "their bodies will be filmed using the latest evolution of motion-capture technology -- markers placed on the actor and tracked by a camera -- while the facial expressions will be tracked by tiny cameras on headsets that will record their performances to insert them into a virtual world." >>Continue reading "Humans and Avatars: The Ghost in the Machine" Posted Wednesday, January 10, 2007 4:19 PM >>Comments