Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

djberg96 (2603)

djberg96
  (email not shown publicly)

Journal of djberg96 (2603)

Friday September 17, 2004
04:16 PM

Oracle: hash antijoins?

[ #20927 ]
One of the few things I haven't been able to confirm from Guy Harrison's book is that a hash antijoin is more efficient than a minus when doing a diff. Let me provide a couple quick examples:

-- minus
select foo, bar, baz
from table1
minus
select foo, bar, baz
from table2

-- hash antijoin
select foo, bar, baz
from table1
where (foo, bar, baz) not in (
   select /*+ hash_aj */ foo, bar, baz
   from table2
)

The explain plan for the minus query looks reasonable, while the one for the hash antijoin looks awful and, indeed, performs badly.

However, I'm also querying against a view, which in turn is getting its data from a link, so that might be a major issue. The view itself seems to be reasonable.

Anyone out there have an opinion?

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • As a side note, I've never had the opportunity to use a multi-column /(not )?in/ clause, I've wanted it for years and have never heard of it being a valid syntax until now.

    I googled for hash_aj and found this [oracle.com] in which one of the replies suggests that the hint is not being used, and that there should be a 'and field is not null' clause in both the query and subquery (the example there is a single column 'not in' clause, so I assume there should be a 'not null' condition for each column?).

    • I meant to say that though the reply suggests that the hint is not being used, adding a 'field is not null' expression to the where clauses (main and sub query) would maybe cause the hint to be used, and that printing the query plan should tell you for sure whether or not it is being used. I don't currently use Oracle, so I can't test any of this :)
      • Indeed, that link proved to be useful - thanks. Adding "is not null" conditionals forced the hint to be used. The only problem is that it can alter your result set when compared to a minus (i.e. what if I don't want to exclude the NULL's?).

        Besides, in my 10 column compare the performance increase was negligible - about 1 second faster, and that was with a PARALLEL(4) hint tossed in to boot.

        But, it's good to know and YMMV. :)

        • what if I don't want to exclude the NULL's?

          Then I think you need to use the MINUS anyway (which is not portable -- doesn't work on my version of Informix, and SqlServer doesn't complain, but it doesn't do the right thing either). If there are any nulls in the sub-query table, then you shouldn't get any results at all unless you have 'is not null' for those columns in the subquery, and if you have nulls in the main query table, then those records shouldn't be showing up. It's that old "if it's null, then

          • Indeed, a correlated subquery can be very efficient. It can also suck rocks. It largely depends on how your indexes are set up. :)

            I'll admit I had brushed this off, so now I'll go back and see how it compares. We don't have control over the remote databases we're pulling some data from so it may or may not be an option.

  • I remember reading somewhere that the hint wasn't being used (I see runrig has found a link). minus is way cool, besides (who said tuning was a rational procedure, anyway ?)