Welcome Guest. | Log In| Register | Membership Benefits

Intelligent Enterprise

Better Insight for Business Decisions

Intelligent Enterprise - Better Insight for Business Decisions
search Intelligent Enterprise
Home
Digital Library
Events
RSS | Newsletters
Webcasts




April 22, 2003

I Will Never Have To Port This Code

Debunking shortcuts and SQL myths

by Joe Celko

If you hang around the database developer newsgroups, you'll find rookie programmers who write their code with proprietary features of whatever SQL product they have. They argue that they'll never have to port this code, so why bother to write standard SQL when you can gain some advantage in performance with a proprietary feature?

What's wrong with this thinking?

First of all, unless you throw out a program within a year of writing it, it's either going to be ported or so seriously maintained it might as well be a port; and I don't mean that it's necessarily going to be moved to a totally different database product.

Moving to another release of the same product is enough to cause problems. A prime example was the introduction of the SQL-92 [LEFT | RIGHT | FULL] OUTER JOIN syntax to SQL products. Previously, each vendor had its own version of an outer join, usually based on an "extended equality" notation. Consider Chris Date's sample SELECT statement ("Watch out for Outer Join," InfoDB, Spring/Summer 1990): SELECT * FROM Supplier, SupParts WHERE Supplier.supno *= SupParts.supno AND qty < 200; The results differ depending on the order in which the two predicates are applied.

Data Types

Another common vendor extension is binary data or BIT data types. This data type exists in the SQL-92 Standard but has no operators or rules for use with a host language.

Amateurs will use this data type for storing Boolean flags because that's how they did it in a low-level programming language like C or assembly. They knew that their language used 1 or 0 or -1 for TRUE and something else for FALSE all the time. They didn't have to worry about binary storage in the hardware being high end or low end because the compiler was written for one family of machines.

In low-level languages, the data and the code are part of the same system. In a database, you can split data from any particular host language so you can use it with any host language. Maybe that host language doesn't even exist yet — do you know anyone who planned for C# front ends five years ago? Even two years ago?

Learn From Your Mistakes

In the SQL Server Programming newsgroup, Tibor Karaszi wrote that he used the BIT data type quite a lot in a product-specific maintenance tool that will "never" be ported to another DBMS. But during the last rework of the product (with approximately 60 percent rewrite), he and his programmers saw the phrase "host language does not agree" quite a few times.

If 'T' and 'F' for Booleans had been used instead, Tibor infers fewer bugs would have resulted during development. He also reports the same experience when using bits in a binary string to report outcome of an execution. The practice is to set a bit in the binary string to indicate a state, such as 1 = error, 2 = warning, 4 = information, 8 = in progress, so 3 indicates both errors and warnings. During this rework, the code became harder to write, understand, and maintain.

Aaron Bertrand, SQL Server MVP and frequent contributor to the same newsgroup wrote this wonderful quote: "I've found that another release of the same database, specifically SQL Server, is very much backward compatible. From what I've seen, most problems people have when moving from [versions] 6.5 to 2000 is when they try to use new features that weren't supported in 6.5, but haven't upgraded their compatibility level from 6.5 to 8.0 — so, through non-action, they have inadvertently made it non-forward-compatible."



Rate This Article

Comments:

Optional e-mail address:

The best way to find proprietary coding tricks is to move the data into a data warehouse. There, data must be converted to the correct data types. Now we find out that the same program running on different machines has been configured with different upper- and lowercase sorting rules, different display formats for temporal data, and different ANSI feature settings. But it's still the same program.

You might want to look at a wonderful feature called a "FIPS Flagger" that comes with most databases. Required by the Federal Information Processing Standards, it sends out a warning when proprietary code is parsed. It's the one configuration flag you should have set.


Joe Celko [celko@northfacelearning.com] is vice president of RDBMS at North Face Learning in Salt Lake City and author of five books on SQL.









IE Weekly Newsletter
Subscribe to the newsletter
    Email Address