Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

jdavidb (1361)

jdavidb
  (email not shown publicly)
http://voiceofjohn.blogspot.com/

J. David Blackstone has a Bachelor of Science in Computer Science and Engineering and nine years of experience at a wireless telecommunications company, where he learned Perl and never looked back. J. David has an advantage in that he works really hard, he has a passion for writing good software, and he knows many of the world's best Perl programmers.

Journal of jdavidb (1361)

Thursday July 08, 2004
11:09 AM

Observed

[ #19754 ]

Running massive deletes (on the order of tens of millions of records, probably a hundred million records when it's all done) on an Oracle table. I'm pulling the relevent key information in a select, looping for each record returned and sending the key information back in a delete, committing every 500 or 1000 rows. My DBA suggested I instead selected the ROWID as that is faster. The DBD::Oracle summary says a ROWID "can be treated as a string and used to rapidly (re)select rows."

However, switching to the use of ROWIDs appears to slow my program down about 50-100%. I haven't rigorously measured this, and there could of course be many other factors. But 500 rows in the original program take about .1333 seconds to delete on average, while 500 rows in the ROWID version of the program often take .3333 seconds.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • Sounds weird. Is it the same if you do it from PL/SQL? Is the query plan what you would expect with/without the rowid in the query?

    If performance is very important, you could perhaps consider these things:

    Why first select and then delete? Is there a client-side decision whether a row should stay or go? It's probably more efficient to do it in the delete where clause.

    Delete in parallel from many processes. Oracle can probably handle more load than serialized deletes. At least it can do somehting useful du
    • If most rows should go, it may be faster to copy the remaining rows to another table and rename it into place, then truncate the original table.

      Bingo. I think I should'a done that.

      Delete in parallel from many processes.

      I am. I've currently got one running for every day between June 3rd and June 16th, inclusive. Seems to be going much faster now than it was before (although I've been on June 3rd forever... fortunately, I suspect there are an abnormally enormous number of records on that day com

      --
      J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers
      • a single delete was blowing up due to exceeding the size of the rollback segment

        I remember Thomas Kyte (of http://asktom.oracle.com/ [oracle.com] fame) recommending someone to... increase the rollback segment in a case like this :)

        Here's a discussion [oracle.com] about deleting gazillion rows where they also mention that it may be clever to drop the index before deleting.
        • increase the rollback segment in a case like this

          Well, yeah, there is the obvious solution, but there's also the issue of being one step further removed from those kinds of actions than I would like. Plus I worried that the rollback segment might need to be increased by more than one order of magnitude and that it wouldn't be practical. :)

          Thanks for your help.

          --
          J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers