Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • The cute trick is to build another table containing the path of each node (".1.5.12.19." means that this is node 19 whose parent is node 12 whose parent is node 5 whose parent is node 1).

    Beware that this is a premature optimization.

    While this technique works fine with a static tree, it makes tree transformations hideously difficult. Imagine the pain that comes from moving node 12 to be a sibling of node 5, or removing node 5 and consequently promoting their children up a level.

    The other technique

    • It seems a bit premature to call this premature :-) As with every problem, the right data structure depends on your data and how it's accessed. Just as an alphabetized flat file is quick to binary search but slow to insert, whereas an unordered flat file is quick to insert (append) but slow to search, you choose your solution based on what you know about the data. If you were going to be doing a lot of hoists or reparenting operations, then I guess you'd have to test it in the field. This doesn't change
      • How do you recursively find all the children of node 5 without doing a ton of SELECTs? That's the problem that a path table gets around.
        I personally don't think that the "ton of SELECTs" is all that horrible. That's probably my personal style though.

        It starts out with something like this:

        SELECT id FROM nodes WHERE parent IN (5)
        ...which returns a list of IDs. That list of IDs then goes into a new SQL statement. The IDs are also pushed into a list that contains all IDs returned from your per-level queries:
        SELECT id FROM nodes WHERE parent IN (6, 7, 8, 9)
        That process is very amenable to a simple while loop:
        my @ids = ();
        my @new_ids = (5);   ## starting condition
        while (@new_ids) {
            push(@ids, @new_ids);
            my $set = join(", ", @new_ids);
            @new_ids = do_query("SELECT .... WHERE id IN ($set)");
        }

        ## @ids contains all children rooted at the subtree at node 5
        At the end of that, you've executed N queries to find all nodes that are 1..N levels beneath your root. From there, it's only one more query to obtain all nodes. You could even rewrite the queries to return all the children at each level over N queries instead of doing the N+1 query. That N+1st query can help to automatically order the nodes though.

        With this property, I don't know that the work to "compile" the tree structure into a text field is all that much of a benefit. But that's just my gut feeling, and of course I could be wrong about this.

        • I've written code as the one you describe dozens of times and I must say that it certainly was "fast enough". I'd be worried if I had very deep trees though, as that would be when it could become costly.

          But then why store a tree in an RDB? Wouldn't XPath be a *lot* more pleasant to access random things in a tree? ;)

          --

          -- Robin Berjon [berjon.com]