NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.
All the Perl that's Practical to Extract and Report
Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
DISTINCT or ORDER BY ? (Score:1)
In the customer-visit itinerary, ORDER BY may actually be a better solution than DISTINCT -- the repetition value in the BAG will be needed to help plan how long to lay over in each city, as it measures how many visits are required there. GROUP BY (City) COUNT(Customer) might be better, and would get you the equivalent of DISTINCT.
Bill
# I had a sig when sigs were cool
use Sig;
Re:DISTINCT or ORDER BY ? (Score:2)
You've come up with a rather artificial reason why one might, in this case, want a bag instead of a set, but your example is arbitrary. Who knows why Alice is in a particular city? Maybe she has some weird contractual agreement that she needs to at least be in those cities once a year? The point is the same: if I ask for information, why should I want my query, by default, to repeat itself rather than just give me the information I ask for?
Don't focus on the specifics of this particular problem becaus
DISTINCTly DisORDERed (Score:1)
This is not just a artificial quibble on the artificial example. I gave a different, more general rule that explains why the artificial example is wrong at an even deeper level. I claim that in general both general rules need to be addressed together.
I agree that SQL having been designed abstractly by mathematicians (with whom I happily self-identify) so that it encourages thinking SETs yet was implemented so it often returns BAGs is a problem of abstraction. Having a SHOW DUPS keyword instead of a DISTINCT keyword, reversing the defaults, would be more practical yet theoretically sound. Furthermore, both SET and BAG are the wrong abstractions: a ORDERED SET is the usually right return type for all queries. And when a BAG is what is desired, it should be a Multiplicity-encoded Ordered Multiset, not an unordered Bag with repetition. (Obviously(?), a totally order is to be preferred over a partial order -- in SQL terms, that means using a long enough composite key.)
To repeat, both SET and BAG are unordered, pure mathematical concepts (dear to my heart), but in practical programming, we need order. Any query returning a true (unordered) SET with DISTINCT but no ORDER BY is just as broken if not more broken as one returning a true (unordered) BAG (without DISTINCT) which you've objected to. Just as BAGginess is fixed with DISTINCT (or COUNT .. GROUP BY ..), DisORDER is cured by ORDER (or GROUP BY).
If in the artificial example the query weilder had used ORDER BY [as all practical SQL queries should be written whether desiring BAG or SET, to avoid getting bitten by hidden order dependencies], the artificially weird result would not have looked so weird, and the multiplicity of cities in the artificial example would have had emergent semantics, and the accountant wouldn't need firing (emotional statement in original artificial example). And the desirability of applying COUNT ... GROUP BY would have suggested itself.
There is more than one way to convert a BAG to a SET with
COUNT(... )FROM with ORDER BY ... GROUP BY ..., which is the MultiSet impelmentation of BAG, where the multiplicity is an attribute of the element rather than manifested by repetition, is a semantics preserving subsitute for DISTINCT.I agree with your basic thesis that all results should be converted to SET form. However, I would prefer COUNT ... GROUP rather than DISTINCT and further require the result be ORDERED.
Bill
# I had a sig when sigs were cool
use Sig;
Reply to This
Parent