Data Frontiers, by Curt Monash Curt Monash runs Monash Research, which provides strategic, analysis-based advice to users and vendors of advanced information technology. He also writes the blogs DBMS2, Text Technologies, and Strategic Messaging. Write him at contact@monash.com Quick Thoughts on Sybase/Aleri Sybase announced an asset purchase that amounts to a takeover of CEP (Complex Event Processing) vendor Aleri. Perhaps not coincidentally, Sybase already had technology under the hood from Aleri predecessor/acquiree Coral8, for financial services uses (notwithstanding that between Aleri Classic and Coral8, Aleri Classic was the one of the two more focused on financial services). Quick reactions include: >>Continue reading "Quick Thoughts on Sybase/Aleri" Posted Thursday, February 4, 2010 3:44 PM >>Comments Database Snooping Threatens Liberty - And We're All Making Matters Worse Every year or two, I get back on my soapbox to say:
>>Continue reading "Database Snooping Threatens Liberty - And We're All Making Matters Worse" Posted Tuesday, February 2, 2010 2:30 PM >>Comments Netezza Skimmer Joins the Short List As I previously complained, last week wasn't a very convenient time for me to have briefings. So when Netezza emailed to say it would release its new entry-level Skimmer appliance this week, while I asked for and got a Friday afternoon briefing, I kept it quick and basic. That said, highlights of my Netezza Skimmer briefing included: >>Continue reading "Netezza Skimmer Joins the Short List" Posted Wednesday, January 27, 2010 8:14 AM >>Comments Two Cornerstones of Oracle's Database Hardware Strategy After several months of careful optimization, Oracle managed to pick the most inconvenient* day possible for me to get an Exadata update from Juan Loaiza. But the call itself was long and fascinating, with the two main takeaways being:
>>Continue reading "Two Cornerstones of Oracle's Database Hardware Strategy" Posted Friday, January 22, 2010 3:41 PM >>Comments Oracle Lifts Cloud Over MySQL Storage Engine Vendors Earlier this month, Oracle put out a press release promising to play nicely with MySQL if its Sun takeover is approved. The parts in italics below are quotes. My comments are in plain text. 1. Continued Availability of Storage Engine APIs. Oracle shall maintain and periodically enhance MySQL's Pluggable Storage Engine Architecture to allow users the flexibility to choose from a portfolio of native and third party supplied storage engines. MySQL's Pluggable Storage Engine Architecture shall mean MySQL's current practice of using, publicly-available, documented application programming interfaces to allow storage engine vendors to "plug" into the MySQL database server. Documentation shall be consistent with the documentation currently provided by Sun. >>Continue reading "Oracle Lifts Cloud Over MySQL Storage Engine Vendors" Posted Tuesday, December 29, 2009 12:50 PM >>Comments Reports of Perfectly-Balanced Hardware Configurations are Greatly Exaggerated Data warehouse appliance and software appliance vendors like to claim that they've worked out just the right hardware configuration(s), and that a single configuration is correct for a fairly broad range of workloads. But there are a lot of reasons to be dubious about that. Specific vendor evidence includes: >>Continue reading "Reports of Perfectly-Balanced Hardware Configurations are Greatly Exaggerated" Posted Tuesday, November 24, 2009 9:26 AM >>Comments Teradata's Hardware Strategy and Tactics In my opinion, the most important takeaways about Teradata's hardware strategy from the Teradata Partners conference last week are:
>>Continue reading "Teradata's Hardware Strategy and Tactics" Posted Tuesday, October 27, 2009 9:06 AM >>Comments This Week at the Teradata Partners User Conference Here are some highlights of what's going on, although names, dates, and details will have to await conversations and press releases this week.
>>Continue reading "This Week at the Teradata Partners User Conference " Posted Tuesday, October 20, 2009 11:18 AM >>Comments Oracle Exadata 2 Capacity Pricing Revealed Analyzing Oracle Exadata pricing is always harder than one would first think. But I've finally gotten around to doing an Oracle Exadata 2 pricing spreadsheet. The main takeaways are:
>>Continue reading "Oracle Exadata 2 Capacity Pricing Revealed" Posted Tuesday, October 6, 2009 10:20 AM >>Comments Thoughts on Integrating OLTP and Data Warehousing (Especially in Exadata 2) Oracle is pushing Exadata 2 as being a great system for OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises, namely:
>>Continue reading "Thoughts on Integrating OLTP and Data Warehousing (Especially in Exadata 2)" Posted Tuesday, September 29, 2009 4:44 PM >>Comments Issues Comparing Analytic DBMS Performance The analytic DBMS/data warehouse appliance market is full of competitive performance claims. Sometimes, they're completely fabricated, with no basis in fact whatsoever. But often performance-advantage claims are based on one or more head-to-head performance comparisons. That is, System A and System B are used to run the same set of queries, and some function is applied that takes the two sets of query running times as an input, and spits out a relative performance number as an output. For example, Greg Rahn twittered to me that Oracle Exadata commonly outperforms existing Oracle installations by a factor of 50 or better, based on a "geometric mean". What I presume he meant by that is:
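As a generic illustration of the kind of function described above, and not necessarily the calculation Rahn had in mind, a geometric mean of per-query speedups can be sketched in Python. The function name and the timings are invented for illustration; nothing here comes from an actual benchmark.

```python
import math


def geometric_mean_speedup(old_times, new_times):
    """Geometric mean of per-query speedups (old running time / new running time)."""
    if len(old_times) != len(new_times) or not old_times:
        raise ValueError("need two non-empty, equal-length lists of timings")
    ratios = [old / new for old, new in zip(old_times, new_times)]
    # Average the logs, then exponentiate: same as the n-th root of the product.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical running times in seconds for the same three queries.
system_a = [100.0, 400.0, 90.0]
system_b = [10.0, 4.0, 9.0]

print(geometric_mean_speedup(system_a, system_b))
```

With these made-up numbers the per-query speedups are 10X, 100X, and 10X. Their arithmetic mean is 40X, while the geometric mean is about 21.5X, so a single outlier query moves the geometric figure less.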
>>Continue reading "Issues Comparing Analytic DBMS Performance" Posted Tuesday, September 22, 2009 2:23 PM >>Comments Thinking About Analytic Speed For a variety of reasons, I don't plan to post my complete Enzee keynote slide deck soon, if ever. But perhaps one or more of its subjects are worth spinning out in their own blog posts. >>Continue reading "Thinking About Analytic Speed " Posted Monday, September 14, 2009 9:36 AM >>Comments Teradata's Active Enterprise Data Warehouse Story Teradata used to tell a one-size-fits-all Enterprise Data Warehouse (EDW) story. That's no longer the case. Last year, Teradata introduced a range of products. I think Teradata is serious about selling its full product range, and by now has achieved buy-in from its sales force for that strategy. I base these beliefs on data points such as:
But that raises the question: How does Teradata pitch the advantages of its top-end product line these days? At least at the corporate level, the answer seems to focus less on the "EDW" concept than it used to, and more on "Active." Teradata -- which actually has been talking about "Active Data Warehousing" for about a decade -- indeed calls its top-end 55xx series the "Teradata Active Enterprise Data Warehouse." >>Continue reading "Teradata's Active Enterprise Data Warehouse Story " Posted Monday, August 24, 2009 8:04 AM >>Comments Sorting out Netezza and Oracle Exadata Data Warehouse Appliance Pricing Netezza recently announced a new generation of data warehouse appliance called TwinFin. TwinFin's clearest stated list price is "a little under $20,000 per terabyte of user data," which in my opinion immediately became the new industry reference point for discussing prices in the data warehouse appliance category. Vigorous discussion ensued, especially in the comment thread to the first of the two posts linked above. Here's some followup. Netezza should not have claimed a "10-15X price/performance improvement," based on a 3-5X performance improvement and a 3X decrease in price/terabyte, and I should have grilled Netezza harder when it first made the claim. In fact, there is no unit of performance that you can, in a reasonable blended average, get 10-15X more of per dollar in TwinFin than you can in the predecessor NPS series. To look at it another way, multiplying 3-5X by 3X would only make sense if 3-5X were a measure of something like "terabytes/unit of performance." But in fact the 3-5X is a blended average of something more like "units of performance/unit of time"; i.e., you can do 3-5X more calculations or queries in a unit of time over the same database (of the same size*) on the new machine as you can on the old.
>>Continue reading "Sorting out Netezza and Oracle Exadata Data Warehouse Appliance Pricing" Posted Monday, August 10, 2009 7:32 AM >>Comments Teradata 13 Focuses on Advanced Analytic Performance Last October I wrote about the Teradata 13 release of Teradata's database management software. Teradata 13, which will be used across the various Teradata product lines, has now been announced for GCA (General Customer Availability)*. So far as I can tell, there were two main points of emphasis for Teradata 13:
To put it even more concisely, the focus of Teradata 13 is on advanced analytic performance, although there of course are some enhancements in simple query performance and in analytic functionality as well. >>Continue reading "Teradata 13 Focuses on Advanced Analytic Performance " Posted Monday, August 3, 2009 3:28 PM >>Comments Netezza Is Changing its Hardware Architecture, Slashing Prices Netezza is about to make its biggest product announcement in years. In particular:
For months, it has been an increasingly open secret that Netezza was planning a major refresh of its product line. As signaled by a blog post from Netezza's product marketing VP Phil Francisco, many of the details are finally fit to post.* >>Continue reading "Netezza Is Changing its Hardware Architecture, Slashing Prices" Posted Friday, July 31, 2009 10:35 AM >>Comments Initial reactions to IBM acquiring SPSS IBM is acquiring SPSS. My initial thoughts (questions by Eric Lai of Computerworld) include: 1) good buy for IBM? why or why not? Yes. The integration of predictive analytics with other analytic or operational technologies is still ahead of us, so there was a lot of value to be gained from SPSS beyond what it had standalone. (That said, I haven't actually looked at the numbers, so I have no comment on the price.) By the way, SPSS coined the phrase "predictive analytics," with the rest of the industry then coming around to use it. As with all successful marketing phrases, it's somewhat misleading, in that it's not wholly focused on prediction. 2) how does it position IBM vs. competitors? >>Continue reading "Initial reactions to IBM acquiring SPSS" Posted Wednesday, July 29, 2009 7:50 AM >>Comments Update on Microsoft's Madison and Fast Track Data Warehouse Products I chatted with Stuart Frost of Microsoft on Tuesday. Stuart is and remains GM of Microsoft's data warehouse product unit, covering about $1 billion or so of revenue. While rumors of Stuart's departure from Microsoft are clearly exaggerated, it does seem that his role is more one of coordination than actual management. Microsoft Madison availability remains scheduled for H1 2010. Nothing new there. Tangible progress includes a few customer commitments of various sorts, including one outright planned purchase (due to some internal customer considerations around using up a budget). At the moment various Microsoft Madison technology "previews" are going on, which seem to amount to proofs-of-concept that:
>>Continue reading "Update on Microsoft's Madison and Fast Track Data Warehouse Products" Posted Friday, July 17, 2009 10:43 AM >>Comments Hasso Plattner Calls for In-Memory OLTP Column Stores Former SAP CEO Hasso Plattner has written a paper called A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database, in association with a SIGMOD keynote address.* The approach Plattner advocates is an MPP in-memory column store, presumably somewhat akin to SAP's frequently renamed Business Warehouse Accelerator/Business Intelligence Accelerator/BWA/BIA/Son-of-TREX technology. There also are strong similarities to the MPP in-memory row store project H-Store/VoltDB, although I don't know whether Plattner would go so far as to adopt the H-Store view that all transactions should run in stored procedures. Unsurprisingly, SAP applications are used as the OLTP paradigm throughout. *Thanks to Dave Kellogg for tipping me off to Plattner's paper. I only went to two SIGMOD sessions, neither of which was Plattner's. Nobody actually mentioned Plattner's talk to me when I was down at SIGMOD. Perhaps the most interesting part is Plattner's claim that what's demanding about OLTP isn't database updating per se, but rather maintaining aggregates for quick-response analytics. In his main example of that point, Plattner proposes a real-life "more than 18" table schema, of which two are base tables, and (most of?) the rest are materialized views that his proposed database architecture dispenses with (because analytic performance is sufficiently good without them). Thus, Plattner's core columnar argument seemingly is... >>Continue reading "Hasso Plattner Calls for In-Memory OLTP Column Stores" Posted Wednesday, July 8, 2009 9:13 AM >>Comments Google Announces Fusion Tables Google has announced an experimental cloud-based data management system called Fusion Tables.
A press article and Slashdot thread ensued, based on some bizarre-sounding analyst quotes that I will not attempt to parse. >>Continue reading "Google Announces Fusion Tables " Posted Monday, June 15, 2009 9:10 AM >>Comments Greenplum's Announcement and the Future of Data Marts Greenplum is announcing today a long-term vision, under the name Enterprise Data Cloud (EDC). Key observations around the concept -- mixing mine and Greenplum's together -- include:
In essence, Greenplum is pitching this story: >>Continue reading "Greenplum's Announcement and the Future of Data Marts" Posted Monday, June 8, 2009 9:20 AM >>Comments Reinventing Business Intelligence I've felt for quite a while that business intelligence tools are due for a revolution. But I've found the subject daunting to write about because -- well, because it's so multifaceted and big. So to break that logjam, here are some thoughts on the reinvention of business intelligence technology, with no pretense of being in any way comprehensive. Natural language and classic science fiction Actually, there's a pretty well-known example of BI near-perfection -- the Star Trek computers, usually voiced by the late Majel Barrett Roddenberry. They didn't have a big role in the recent movie, which was so fast-paced nobody had time to analyze very much, but were a big part of the Star Trek universe overall. Star Trek's computers integrated analytics, operations, and authentication, all with a great natural language/voice interface and visual displays. That example is at the heart of a 1998 article on natural language recognition I just re-posted. >>Continue reading "Reinventing Business Intelligence" Posted Tuesday, June 2, 2009 9:41 AM >>Comments More on MySQL Forks and Storage Engines The issue of MySQL forks and their possible effect on closed-source storage engine vendors continues to get attention. The underlying question is: Suppose Oracle wants to make life difficult for third-party storage engine vendors via its incipient control of MySQL? Can the storage engine vendors insulate themselves from this risk by working with a MySQL fork? As laid out most clearly in a comment thread to a previous post*, Mike Hogan (CEO of ScaleDB) believes closed-source storage engine vendors can use a MySQL fork without running afoul of the GPL. 
In a nutshell, what he proposes is an inbetween layer of software, itself open-sourced, that on one side interfaces with MySQL, and on the other side talks cleanly enough to storage engines that it doesn't infect them with the GPL. >>Continue reading "More on MySQL Forks and Storage Engines" Posted Tuesday, May 26, 2009 9:30 AM >>Comments The Real Story on IBM's System S Release IBM hastily announced System S Streams this week, a product that was supposed to be called InfoSphere Streams and introduced only in 2010. Apparently, the rush is because senior management wanted to talk about it later this week, and perhaps also because it was implicitly baked into some of IBM's advertising already. Scrambling ensued. Even so, Jeff Jones and team got to me fast, and briefed me -- fairly non-technically, unfortunately, but otherwise how I like it, namely on a harmless embargo and without any NDAs. Microsoft also introduced CEP this week. Perhaps it is more than coincidence that IBM rushed out its own announcement of an immature CEP technology immediately after Microsoft revealed its plans. Taken together, these announcements support my theory that the small independent CEP/stream processing vendors are more or less ceding broad parts of the potential stream processing market. >>Continue reading "The Real Story on IBM's System S Release" Posted Friday, May 15, 2009 9:53 AM >>Comments eBay's Enormous Data Warehouses Detailed A few weeks ago, I had the chance to visit eBay, meet briefly with Oliver Ratzesberger and his team, and then catch up later with Oliver for dinner. I've already alluded to those discussions in a couple of posts, specifically on MapReduce (which eBay doesn't like) and the astonishingly great difference between high- and low-end disk drives (to which eBay clued me in). Now I'm finally getting around to writing about the core of what we discussed, which is two of the very largest data warehouses in the world. 
>>Continue reading "eBay's Enormous Data Warehouses Detailed" Posted Friday, May 1, 2009 9:38 AM >>Comments It's Time to Strengthen MySQL Forkers As my first three posts on the Oracle/Sun merger suggested, I think Oracle will do a better job with MySQL product development than Sun has. But of course that's a low hurdle. And so it leaves open the questions: What should and/or will be the most widely adopted code lines of MySQL (or other open source DBMS), especially for the types of users and vendors who are engaged with MySQL (as opposed to principal alternative PostgreSQL) today? >>Continue reading "It's Time to Strengthen MySQL Forkers" Posted Tuesday, April 21, 2009 11:51 AM >>Comments First Thoughts on Oracle Acquiring Sun
>>Continue reading "First Thoughts on Oracle Acquiring Sun " Posted Monday, April 20, 2009 11:21 AM >>Comments Notes On Inforsense, Tableau, Jaspersoft and More I keep not finding the time to write as much about business intelligence as I'd like to. So I'm going to do one omnibus post here covering a lot of companies and trends, then circle back in more detail when I can. Top-level highlights include:
>>Continue reading "Notes On Inforsense, Tableau, Jaspersoft and More" Posted Monday, April 6, 2009 10:27 AM >>Comments SAS Enters Its Own Cloud The Register has a fairly detailed article about SAS expanding its cloud/SaaS offerings. I disagree with one part, namely: SAS may not have a choice but to build its own cloud. Given the sensitive nature of the data its customers analyze, moving that data out to a public cloud such as the Amazon EC2 and S3 combo is just not going to happen. And even if rugged security could make customers comfortable with that idea, moving large data sets into clouds (as Sun Microsystems discovered with the Sun Grid) is problematic. Even if you can parallelize the uploads of large data sets, it takes time. But if you run the applications locally in the SAS cloud, then doing further analysis on that data is no big deal. It's all on the same SAN anyway, locked down locally just as you would do in your own data center. I fail to see why SAS's campus would be better than leading hosting companies' data centers for either data privacy/security or data upload speed. Rather, I think major reasons for SAS building its own data center for cloud computing probably focus on: >>Continue reading "SAS Enters Its Own Cloud " Posted Wednesday, March 25, 2009 9:48 AM >>Comments Database Implications if IBM Acquires Sun Reported or rumored merger discussions between IBM and Sun are generating huge amounts of discussion (some links below). Here are some quick thoughts around the subject of how the IBM/Sun deal -- if it happens -- might affect the database management system industry.
>>Continue reading "Database Implications if IBM Acquires Sun " Posted Thursday, March 19, 2009 10:14 AM >>Comments Complex Event Processing Vendors Flounder Independent CEP (Complex/Event Processing) vendors continue to flounder, at least outside the financial services and national intelligence markets.
>>Continue reading "Complex Event Processing Vendors Flounder" Posted Wednesday, March 18, 2009 10:11 AM >>Comments Quick Take on Microsoft SQL Server Fast Track Stuart Frost of Microsoft (née DATAllegro) checked in, with Microsoft's TDWI-timed announcements. The news part was something called "SQL Server Fast Track," which is the Microsoft SQL Server equivalent to Oracle's "recommended configurations" or IBM's "BCUs." SQL Server Fast Track is further being portrayed as an incremental step toward Madison, Microsoft's future high-end data warehousing offering. >>Continue reading "Quick Take on Microsoft SQL Server Fast Track" Posted Monday, February 23, 2009 3:31 PM >>Comments Analytics' Role in a Frightening Economy I chatted the other day with an executive on the general business side (as opposed to the trading operation) of a household-name brokerage firm, one that's in no immediate financial peril. It seems their #1 analytic-technology priority right now is changing planning from an annual to a monthly cycle.* That's a smart idea. While it's especially important in their business, larger enterprises of all kinds should consider following suit. *By the way, they seem to want to use Applix technology, now owned by IBM/Cognos, to do it, more for the planning tools than for the cool in-memory OLAP engine itself. Your mileage may vary. >>Continue reading "Analytics' Role in a Frightening Economy" Posted Monday, February 9, 2009 2:41 PM >>Comments Why BI is in a Funk I wrote recently that BI is in a "funk." Let me now offer a few ideas as to why that is so. 1. At its heart, BI is an application development technology, and making money from innovating in development is hard. To quote myself: Products are obsolete before they [are] mature. Products commonly do only part of what is necessary. Generally, a new tool will be developed to help with a new need... But these tools will often be weak at what came before...
By the time the shiny new tools mature to do a good job at the older requirements, some other... shift comes along, with yet newer and shinier tools to handle the latest twists. >>Continue reading "Why BI is in a Funk" Posted Friday, January 30, 2009 11:42 AM >>Comments Don't Let Gartner's Data Warehouse Magic Quadrant Confuse You Gartner's latest Magic Quadrant for data warehouse DBMSs was published late last year. Thankfully, vendors don't seem to be taking it as seriously as usual, so I didn't immediately hear about it. (I finally noticed it in a Greenplum pay-per-click ad.) Links to Gartner MQs tend to come and go, but as of now here are two working links to the 2008 Gartner Data Warehouse Database Management System MQ. My posts on the 2007 and 2006 MQs have also been updated with working links. Highlights of this year's data warehouse DBMS Magic Quadrant include: >>Continue reading "Don't Let Gartner's Data Warehouse Magic Quadrant Confuse You" Posted Wednesday, January 21, 2009 10:50 AM >>Comments How to Buy an Analytic DBMS I went to London for a couple of days recently, at the behest of Kognitio. Since I was in the neighborhood anyway, I visited their offices for a briefing. But the main driver for the trip was a seminar Thursday at which I was the featured speaker. As promised, the slides have been uploaded here. The material covered on the first 13 slides should be very familiar to readers of this blog. I touched on database diversity and the disk-speed barrier, after which I zoomed through a quick survey of the data warehouse DBMS market. But then I turned to material I've been working on more recently -- practical advice directly on the subject of how to buy an analytic DBMS.
I started by proposing a seven-part segmentation self-assessment: >>Continue reading "How to Buy an Analytic DBMS" Posted Monday, December 22, 2008 5:54 AM >>Comments Hot Topics in High-Performance Analytics For the past few months, I've collected a lot of data points to the effect that high-performance analytics -- i.e., beyond straightforward query -- is becoming increasingly important. And I've written about some of them at length. For example:
Ack. I can't decide whether "analytics" should be a singular or plural noun. Thoughts? Another area that's come up which I haven't blogged about so much is data mining in the database. Data mining accounts for a large part of data warehouse use. The traditional way to do data mining is to extract data from the database and dump it into SAS. But there are problems with this scenario, including: >>Continue reading "Hot Topics in High-Performance Analytics" Posted Monday, November 17, 2008 10:03 AM >>Comments Getting to Answers on Oracle's New Hardware I spent about six hours at Oracle last week talking with Andy Mendelsohn, Ray Roccaforte, Juan Loaiza, Cetin Ozbutun, et al. and plan to write more later. For now, let me pass along a few quick comments. The key philosophical point that I had perhaps been missing is that Oracle thinks there is and should be a storage (server) tier, just as there also are database (server), application (server), and web (server) tiers. Exadata cells are designed to never talk with each other. Instead, they talk to a set of Infiniband switches, which then talk to a grid of servers on the database tier. Oracle thinks this has solved its I/O bandwidth problem for once and for all. It's hard to see why that wouldn't be the case. What Exadata does on the storage tier in query execution is throw stuff away. Mainly, this is projection and restriction/SELECT. But if a join has been resolved on a small fact table, and Oracle is now filtering a fact table to match a value or set of values, the storage tier can do that too. >>Continue reading "Getting to Answers on Oracle's New Hardware" Posted Wednesday, October 22, 2008 12:27 PM >>Comments A Quick Guide to Teradata's Latest News The Teradata Partners (i.e., user) conference is this week. So there have been lots of press releases, some presentations, lots of meetings, and so on. 
A lot of Teradata's messaging is in flux, as it moves fairly rapidly to correct what I believe have been some deficiencies in the past. One confusing result is that there was very little prebriefing about the actual announcement details, and we're all scrambling to figure out what's up. Teradata does a good job of collecting its press releases at one URL. So without linking to most of them individually, let me jump in to an overview of Teradata news this week (whether or not in actual press release format): >>Continue reading "A Quick Guide to Teradata's Latest News" Posted Tuesday, October 14, 2008 11:51 AM >>Comments HP-Oracle Appliance Prices Estimated I've been trying to figure out how much the HP-Oracle Database Machine and HP-Oracle Exadata Storage Server actually cost. My first estimate was $58-190K/TB (user data), but I've since updated my pricing spreadsheet. Specifically: The first page of these estimates has been modestly altered to reflect more chargeable software options, as per the discussion below. >>Continue reading "HP-Oracle Appliance Prices Estimated" Posted Friday, October 3, 2008 1:11 PM >>Comments HP-Oracle Hardware Parallelization Clarified Some kind Oracle development managers have reached out and helped me better understand where Oracle does or doesn't stand in query and analytic parallelization. Let's start with the part everybody pretty much knows already: There are two parts to a parallelization story -- how you get data off of disk, and what you do with it once you have it. >>Continue reading "HP-Oracle Hardware Parallelization Clarified" Posted Wednesday, October 1, 2008 11:49 AM >>Comments Oracle Finally Answers Data Warehouse Challengers Oracle, in partnership with HP, has announced a new data warehouse appliance product line, cleverly branded "Exadata."
The basic idea seems to be that database processing is split among two sets of servers: (The new stuff) A set of back-end servers -- the Oracle Exadata Storage Servers -- that gets data off of disk and does some preliminary query processing. Numbers are being thrown around suggesting that, unlike prior Oracle offerings, the Exadata-based appliance at least has scalability and price/performance worth comparing to Teradata -- hey, Exa is bigger than Tera! -- Netezza, et al. >>Continue reading "Oracle Finally Answers Data Warehouse Challengers" Posted Thursday, September 25, 2008 1:49 AM >>Comments Vertica Spells Out Compression Claims Omer Trajman of column-store DBMS vendor Vertica put up a must-read blog spelling out detailed compression numbers, based on actual field experience (which I'd guess is from a combination of production systems and POCs): >>Continue reading "Vertica Spells Out Compression Claims " Posted Wednesday, September 24, 2008 12:54 PM >>Comments Infobright Open Source Move Packs Potential Infobright announced today that it's going full-bore into open source -- specifically in the MySQL ecosystem -- with the licensing approach, pricing, distribution strategy, and VC money from Sun that such a move naturally entails. I think this is a great idea, for a number of reasons: >>Continue reading "Infobright Open Source Move Packs Potential" Posted Monday, September 15, 2008 11:16 AM >>Comments Tradeoffs In Splitting DBMS Work Among MPP Nodes I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives.
The base-case MPP DBMS architecture is one in which there are two kinds of nodes: A boss node, whose jobs include: >>Continue reading "Tradeoffs In Splitting DBMS Work Among MPP Nodes" Posted Tuesday, September 9, 2008 12:16 PM >>Comments Why MapReduce Matters to SQL Data Warehousing Greenplum and Aster Data have both just announced the integration of MapReduce into their SQL MPP data warehouse products. So why do I think this could be a big deal? The short answer is "Because MapReduce offers dramatic performance gains in analytic application areas that still need great performance speed-up." The long answer goes something like this. The core ideas of MapReduce are: >>Continue reading "Why MapReduce Matters to SQL Data Warehousing" Posted Thursday, August 28, 2008 8:53 AM >>Comments David Raab Offers Kudos for QlikView David Raab is a great fan and former reseller of QlikTech's QlikView. His recent lengthy post about the product (I hesitate to call it "detailed" only because he rightly observes that QlikTech is in fact stingy with technical detail) is positive enough to have been recommended by the company itself. Specifically, it was cited in the comment thread to my recent post on QlikTech, where David himself also addressed some of my questions. But of course, no technology is perfect, not even one as great as David thinks QlikView is. >>Continue reading "David Raab Offers Kudos for QlikView" Posted Monday, August 25, 2008 8:18 AM >>Comments When to Use Modern DBMS Alternatives If there's one central theme in my DBMS2 blog, it's that modern database management system alternatives should in many cases be used instead of the traditional market leaders. So it was only a matter of time before somebody sponsored a white paper on that subject. The paper, sponsored by EnterpriseDB (disclosure noted), is now posted along with my other recent white papers. 
Its conclusion -- summarizing what kinds of database management system you should use in which circumstances -- is reproduced below. Many new applications are built on existing databases, adding new features to already-operating systems. But others are built in connection with truly new databases. And in the latter cases, it's rare that a market-leading product is the best choice. Mid-range DBMS (for OLTP) or specialty data warehousing systems (for analytics) are usually just as capable, and much more cost-effective. Exceptions arise mainly in three kinds of cases: >>Continue reading "When to Use Modern DBMS Alternatives" Posted Thursday, August 21, 2008 8:13 AM >>Comments Comparing Vertica, ParAccel and Exasol I talked with executives at Nuremberg, Germany-based Exasol last week -- at 5:00 am ET! -- and of course want to blog about it. For clarity, I'd like to start by comparing/contrasting the fundamental data structures at Vertica, ParAccel, and Exasol. And it feels like that should be a separate post. So here goes. >>Continue reading "Comparing Vertica, ParAccel and Exasol" Posted Tuesday, August 19, 2008 9:00 AM >>Comments Patent Nonsense in the Data Warehouse DBMS Market There are two recent patent lawsuits in the data warehouse DBMS market. In one, Sybase is suing Vertica. In another, an individual named Cary Jardin (techie founder of XPrime, a sort of predecessor company to ParAccel) is suing DATAllegro. Naturally, there's press coverage of the DATAllegro case, due in part to its surely non-coincidental timing -- right after the Microsoft acquisition was announced -- and in part to a vigorous PR campaign around it. And the Sybase case so excited one troll that he posted identical references to it on about 12 different threads in this blog, as well as to a variety of Vertica-related articles in the online trade press. But I think it's very unlikely that either of these cases turns out to matter much.
>>Continue reading "Patent Nonsense in the Data Warehouse DBMS Market" Posted Friday, August 15, 2008 10:15 AM >>Comments