Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

Ovid (2709)

Ovid
  (email not shown publicly)
http://publius-ovidius.livejournal.com/
AOL IM: ovidperl (Add Buddy, Send Message)

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Monday January 22, 2007
01:40 PM

MySQL Madness (ma, get my gun!)

[ #32232 ]

So I was debugging a MySQL problem and quickly hacked out the following query. It took a long time to run before finally giving me an 'OUT OF MEMORY' error:

SELECT distinct(ip)
FROM   foo
WHERE  ip NOT IN (
  SELECT distinct(ip)
  FROM   bar
)

See a problem with that query? The problem I have is one you can't see. A coworker pointed out a typo. There's a bar.id column, but not a bar.ip column. MySQL didn't catch that. It was apparently trying to run the outer query, collect all results and then parsing and filtering on the subquery.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • The folks at Oracle realized that antijoins were often performance bottlenecks and provided hints to specifically to speed them up, e.g. HASH_AJ, and specifying the optimizer type (RULE vs COST) can make a huge difference.

    I'm curious what the point of the 'distinct' in the subquery is, btw. I think that would make the explain plan worse in Oracle. But I digress.

    For stuff like this in Oracle, I generally opt for the simple minus query. Reasonable performance and easy to understand.

    How do the following

    • I couldn't really say how those others compare. I'm not a performance guru. I've traditionally gone more for the normalization aspect and not worried as much about the implementation, trusting the DBAs to be good at their jobs. Now I'm finding I need to focus on the implementation more and more.

      As for the 'distinct' clause in the subquery, I really thought there might be duplicate ip addresses in that table and I had no idea if MySQL would optimize those away or force every value to be compared multipl