Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

Ovid (2709)

Ovid
  (email not shown publicly)
http://publius-ovidius.livejournal.com/
AOL IM: ovidperl (Add Buddy, Send Message)

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Wednesday October 19, 2005
10:12 PM

Eliminating lookup tables

[ #27251 ]

One common problem with those new to databases is how they often stick something like field1, field2, field3, etc, in a table definition. Typically this if foolish. It violates first normal form by repeating information (if it doesn't, the fields are probably named poorly). For example, I once had to work on a database which had recipes for pears. The recipes table had eight fields for ingredients. If you had nine ingredients, you were out of luck.

At first blush, someone might create an ingredients table and list all of the ingredients and add a recipe id for each ingredient. Of course, this still does not work. You can probably reuse those ingredients in multiple recipes (such as the pears, duh!). So you create a lookup table. This table two two fields: recipe ID and ingredient ID (yes, we're keeping this deliberately simple).

Want the ingredients for a recipe?

SELECT i.name
FROM   ingredients i, recipe_ingredient ri, recipes r
WHERE  r.id = ri.recipe_id
  AND  r.id = i.ingredient_id
  AND  r.id = ?

The problem is, it's easy to get that wrong. Do you see where I goofed? Wouldn't it be nice if we could do away with that lookup table?

You can. It's called a "multivalued" database. UniData, Pick, Universe and others violate first normal form and support "multivalued" columns. Instead of lookup ids, you create references to the fields in the corresponding table. The syntax is different (I have the books for UniQuery somewhere but I can't find them), but they essentially allow you to do this:

SELECT ingredients FROM recipes WHERE recipe.id = ?

That's much easier to get right. If you delete an ingredient from the ingredients table, it automatically goes away in a corresponding recipe unless you have a constraint to protect it.

Unfortunately, these require specialized programming and query languages and they're generally proprietary (anyone know of an open-source one?). However, there's something else interesting they allow: another step closer to logic programming.

Consider the following Prolog facts:

likes( john, [cheese, money] ).
likes( mary, [cheese, running] ).

That's really close to how a multivalue database would store records in a table named likes/2. It becomes trivial to join on these values and you can get much richer relational data. Unfortunately, I don't recall that multivalue databases support recursion. Also, the multivalued columns (iirc) must point to a single table. This makes optimization much easier but also reduces the utility. Rats.

You can also create custom datatypes by integrating your database language directly into the table definitions, though my experience has been that this can slow things to a crawl. I really wish I had another multivalue database to play with. As it turns out, there's an open-source mvRDBMS named MaVerick. Regrettably, the documentation is sparse and it looks like I'd have to dig out my UniData books to even play with it. Now where could they be?

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • I believe that PostgreSQL and Oracle both support this kind of thing. But naturally, they support it in incompatible ways.
    • Can you point to me where in the Postgres docs this is described? I've found references to it on the Web but I'm having trouble finding it in the documentation.

      • Likewise, I've seen it described but not looked for it in the docs. Nor have I used it, and I don't have a Postgres install handy to fiddle with.
      • Is this the section you're looking for: Arrays [postgresql.org]?
        • I suspect that's what was meant. Thanks! Unfortunately, it shows that Postgres really can't be used an MV database unless you push a lot of stuff into the application layer. Thus, one could think of it as the MySQL of MV databases (in other words, a fancy toy).

  • While it still requires that you type the query correctly once, views are portable across relational databases (finally even MySQL as of 5.0)

    CREATE VIEW recipe_ingredient_list AS
      SELECT r.id, i.name
      FROM   ingredients i, recipe_ingredient ri, recipes r
      WHERE  r.id = ri.recipe_id
        AND  i.id = ri.ingredient_id

    so that you could use this query:

    SELECT name
    FROM   recipe_ingredient_list
    WHERE  id = ?

    And if you use PostgreSQL as the backend (so

    • Aside from the "write rules in Perl" part, we actually use "updateable" views with Postgres (and SQLite) all the time where I work. It's very handy.

  • --
    J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers