
All the Perl that's Practical to Extract and Report

  • I have no idea what you've been smoking, because there's no way you should be getting that kind of discrepancy. Either you're using a very old version of Ruby, your interpreter is broken, or you've been sniffing glue again.

    For my results, I used ruby 1.6.7 and perl 5.8.0 on Mandrake 9. I took the sample text you gave in your journal entry and copied it over and over until I ended up with a 2.4 MB file. I used "bzip-0.21" as the target. Hopefully, I didn't screw up the logic.

    I've provided the exact b

    • by dreadpiratepeter (2770) on 2003.01.15 9:40 (#16028)
      My reply is actually to the original post, not the first reply, but I couldn't find a link to comment on that.

      Anyway, I was able to get the time for the Perl test significantly lower by doing two things:
         I precompiled the regexp with qr//.
         I joined the relevant search fields using a (presumably) unused char (^A) and searched the joined string once, instead of searching each field separately.

      On my box that brought the average down from around 37 seconds to around 26 seconds (using djberg96's benchmark version of the script).

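      use Benchmark;
      use strict;
      my $target = "bzip-0.21";
      my $p = qr/$target/;   # compile the pattern once, outside the loop

      # Pass a code ref rather than a q{} string: Benchmark evals string
      # code in its own scope, where the lexical $p above is not visible.
      timethese(1,{
         "original" => sub {
            for (my $iter = 0; $iter < 100; $iter++)
            {
              my $count = 0;
              open(INDEX, "INDEX") || die "Couldn't open file: $!\n";
              while (<INDEX>)
              {
                chomp;
                # Join only the searched fields with ^A (\001) so one
                # match covers all of them.
                my $f = join("\001",(split/\|/)[0,3,6,7,8]);
                if ($f =~ $p)
                {
                   $count++;
                }
              }
              close INDEX;
            }
         }
      });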
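
      To see what the join trick does without needing the INDEX file, here is a minimal sketch that runs on a few in-memory lines. The sample records and the count_matches helper are made up for illustration and are not from the original script. It also uses \Q...\E so the "." in the target is matched literally, which the benchmark above does not do.

```perl
use strict;
use warnings;

# Hypothetical pipe-delimited records, nine fields each; contents are made up.
my @lines = (
    "bzip-0.21|/ports/bzip|/usr/local|A block-sorting compressor|d|m|archivers||",
    "zip-2.3|/ports/zip|/usr/local|Create PKZIP-compatible archives|d|m|archivers||",
    "foo-1.0|/ports/foo|/usr/local|Example port|d|m|misc|bzip-0.21|",
    "bar-2.0|/ports/bzip-0.21|/usr/local|Another example|d|m|misc||",
);

my $target = "bzip-0.21";
my $re     = qr/\Q$target\E/;   # compiled once; \Q...\E quotes the '.'

# Hypothetical helper: count lines where any searched field contains the target.
sub count_matches {
    my ($re, @records) = @_;
    my $count = 0;
    for my $line (@records) {
        # Limit -1 keeps trailing empty fields so the slice never yields undef.
        my @fields = split /\|/, $line, -1;
        # \001 separates the fields so a match cannot span two of them.
        my $joined = join "\001", @fields[0, 3, 6, 7, 8];
        $count++ if $joined =~ $re;
    }
    return $count;
}

print count_matches($re, @lines), "\n";   # prints 2
```

      The first and third records match (fields 0 and 7). The fourth does not, because the target only appears in field 1, which is not one of the searched fields.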

      • Yep. Precompiling the regex and joining the fields to be searched shaved a couple of seconds off the Perl script.

        Thanks.
        --
        Buck