Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • The cute trick is to build another table containing the path of each node (".1.5.12.19." means that this is node 19 whose parent is node 12 whose parent is node 5 whose parent is node 1).
    Beware that this is a premature optimization.

    While this technique works fine with a static tree, it makes tree transformations hideously difficult. Imagine the pain that comes from moving node 12 to be a sibling of node 5, or removing node 5 and consequently promoting their children up a level.

    The other technique of using a single integer column to store parent_id is the most flexible, even if the SQL is slightly hairier.

    Alternatively, you can select the entire subtree as a two pass operation. First, (recursively) find all of the children of node 5. Next, build a UNION query of all parent nodes that are part of the subtree rooted at node 5. If you build your union query properly, the nodes also come back in-order.

    • Heh, if you wanted to be really ugly, you could keep the parent id in the sme table, but have a trigger to rebuild the secondary table with the complete path-like listings. It'd be heavy on the database, but might be affordable depending upon how many updates you're doing.

      -Dom

    • It seems a bit premature to call this premature :-) As with every problem, the right data structure depends on your data and how it's accessed. Just as an alphabetized flat file is quick to binary search but slow to insert, whereas an unordered flat file is quick to insert (append) but slow to search, you choose your solution based on what you know about the data. If you were going to be doing a lot of hoists or reparenting operations, then I guess you'd have to test it in the field. This doesn't change
      • How do you recursively find all the children of node 5 without doing a ton of SELECTs? That's the problem that a path table gets around.

        I personally don't think that the "ton of SELECTs" is all that horrible. That's probably my personal style though.

        It starts out with something like this:

        SELECT id FROM nodes WHERE parent IN (5)

        ...which returns a list of IDs. That list of IDs then goes into a new SQL statement. The IDs are also pushed into a list that contains all IDs returned from your per-leve

        • I've written code as the one you describe dozens of times and I must say that it certainly was "fast enough". I'd be worried if I had very deep trees though, as that would be when it could become costly.

          But then why store a tree in an RDB? Wouldn't XPath be a *lot* more pleasant to access random things in a tree? ;)

          --

          -- Robin Berjon [berjon.com]