Continuing with my comments made here, here's the source for the Perl and Ruby scripts I used to test with. The Python version is included for
#!/usr/bin/perl
my $target = shift;
for (my $iter = 0; $iter < 100; $iter++)
{
    my $count = 0;
    open(INDEX, "/usr/ports/INDEX");
    while (<INDEX>)
    {
        chomp;
        my @fields = split /\|/;
        if ($fields[0] =~ m{$target} || $fields[6] =~ m{$target} ||
            $fields[3] =~ m{$target} || $fields[7] =~ m{$target} ||
            $fields[8] =~ m{$target})
        {
            $count++;
        }
    }
    close INDEX;
}
#!/usr/local/bin/ruby
# distribution-name|port-path|installation-prefix|comment| \
# description-file|maintainer|categories|build deps|run deps|www site
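target = ARGV[0]
100.times do |iter|
  index = File.new("/usr/ports/INDEX")
  count = 0
  index.each do |line|
    fields = line.chomp.split('|')
    # NOTE: the match patterns were cut off in the post as received; the
    # /#{target}/ form and the field order are assumed from the Perl version.
    if fields[0] =~ /#{target}/ || fields[6] =~ /#{target}/ ||
       fields[3] =~ /#{target}/ || fields[7] =~ /#{target}/ ||
       fields[8] =~ /#{target}/
      count += 1
    end
  end
  index.close
end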
#!/usr/local/bin/python
import sys
import string
import re
target = re.compile(sys.argv[1])
for iter in range(100):
    index = open('/usr/ports/INDEX', 'r')
    count = 0
    line = index.readline()
    while line != '':
        fields = line.strip().split('|')
        if target.search(fields[0]) or target.search(fields[6]) \
           or target.search(fields[3]) or target.search(fields[7]) \
           or target.search(fields[8]):
            count = count + 1
        line = index.readline()
    index.close()
An example of the data used (from
# distribution-name|port-path|installation-prefix|comment| \
# description-file|maintainer|categories|build deps|run deps|www site
9e-1.0|/usr/ports/archivers/9e|/usr/local|Explode Plan9 archives|/usr/ports/archivers/9e/pkg-descr|ports@FreeBSD.Org|archivers|||http:/
arc-5.21e.8_1|/usr/ports/archivers/arc|/usr/local|Create & extract files from DOS
arj-3.10b|/usr/ports/archivers/arj|/usr/local|Open-source ARJ|/usr/ports/archivers/arj/pkg-descr|kot@premierbank.dp.ua|archivers|autoconf
bzip-0.21|/usr/ports/archivers/bzip|/usr/local|A block-sorting file compressor|/usr/ports/archivers/bzip/pkg-descr|ports@FreeBSD.org|archivers|||ht
bzip2-1.0.2|/usr/ports/archivers/bzip2|/usr/local|A block-sorting file compressor|/usr/ports/archivers/bzip2/pkg-descr|jharris@widomaker.com|archivers
cabextract-0.6|/usr/ports/archivers/cabextract|/usr/local|A program to extract Microsoft cabinet (.CAB) files|/usr/ports/archivers/cabextract/pkg-descr|sobomax@FreeBSD.org|archivers|l
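For anyone who wants to poke at the data format, here is a rough Ruby sketch of splitting one INDEX line into named fields, going by the header comment above. The symbol names, the parse_index_line helper, and the sample line (modelled on the bzip entry, with a placeholder URL because the real one is cut off) are all assumptions for illustration.

```ruby
# Field names taken from the INDEX header comment; the symbol names are
# illustrative, not part of any ports tool.
INDEX_FIELDS = [
  :distribution_name, :port_path, :installation_prefix, :comment,
  :description_file, :maintainer, :categories, :build_deps, :run_deps,
  :www_site
].freeze

# Split one INDEX line into a hash keyed by field name.
# The -1 limit keeps trailing empty fields, which a plain split('|') drops.
def parse_index_line(line)
  values = line.chomp.split('|', -1)
  Hash[INDEX_FIELDS.zip(values)]
end

# Hypothetical sample line: modelled on the bzip entry, placeholder URL.
sample = 'bzip-0.21|/usr/ports/archivers/bzip|/usr/local|' \
         'A block-sorting file compressor|' \
         '/usr/ports/archivers/bzip/pkg-descr|ports@FreeBSD.org|' \
         'archivers|||http://example.org/'

port = parse_index_line(sample)
puts port[:distribution_name]
puts port[:build_deps].empty?
```

The scripts above search fields 0, 3, 6, 7 and 8, which in this naming are the distribution name, comment, categories, build deps and run deps.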
The average runtimes (minutes:seconds) for a single run of each script above were:
Perl: 0:35
Python: 0:38
Ruby: 2:29
Eh? (Score:2)
For my results, I used ruby 1.6.7 and perl 5.8.0 on Mandrake 9. I took the sample text you gave in your journal entry and copied it over and over until I ended up with a 2.4 MB file. I used "bzip-0.21" as the target. Hopefully, I didn't screw up the logic.
I've provided the exact b
Re:Eh? (Score:2)
Ruby:
Perl:
Re:Eh? (Score:1)
Anyway, I actually was able to get the benchmark on the Perl test significantly lower by doing 2 things:
I precompiled the regexp
I joined the relevant search fields using an (assumedly) unused char (^A) and searched on that.
On my box that put the average from around 37 secs. to around 26 secs. (Using djberg96's benchmark version of the script).
use Benchmark;
use strict;
my
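The two changes described above (compile the regexp once, then join the searched fields with ^A and run a single match per line) might look roughly like this in Ruby. This is only a sketch of the idea, not the Perl script from this comment; count_matches, SEARCH_FIELDS and SEPARATOR are made-up names, and the sample lines are illustrative.

```ruby
# Indices of the fields the scripts in the journal entry search:
# name, comment, categories, build deps, run deps.
SEARCH_FIELDS = [0, 3, 6, 7, 8].freeze
SEPARATOR = "\x01" # ^A, assumed never to occur in the data

# Count the lines whose searched fields match the pattern.
def count_matches(lines, pattern)
  re = Regexp.new(pattern) # compiled once, outside the loop
  lines.count do |line|
    fields = line.chomp.split('|', -1)
    # One joined string means one regexp match per line instead of five.
    haystack = fields.values_at(*SEARCH_FIELDS).join(SEPARATOR)
    !(re =~ haystack).nil?
  end
end

# Illustrative lines in INDEX format (not real INDEX entries).
lines = [
  'bzip-0.21|/p/bzip|/usr/local|A block-sorting file compressor|' \
  '/p/bzip/pkg-descr|m@example.org|archivers|||http://example.org/',
  'arj-3.10b|/p/arj|/usr/local|Open-source ARJ|' \
  '/p/arj/pkg-descr|m@example.org|archivers|autoconf||'
]
puts count_matches(lines, 'bzip')
```

One caveat with the joined string: patterns with anchors, or patterns such as `.*` that can run across the separator, may match differently than they would against each field on its own.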
Re:Eh? (Score:1)
Thanks.
Buck
Re:Eh? (Score:1)
Or use List::Util::first instead of grep (though it may only be an improvement on bigger arrays).
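For comparison, the same first-versus-grep idea carries over to Ruby: `any?` stops at the first matching element, while `grep` tests every element and builds the full list. A small sketch (the method names and sample fields are made up for illustration):

```ruby
# Returns true as soon as one field matches; later fields are not tested.
def any_field_matches?(fields, re)
  fields.any? { |field| re =~ field }
end

# For contrast: grep tests every field and returns all the matches.
def matching_fields(fields, re)
  fields.grep(re)
end

fields = ['bzip-0.21', 'A block-sorting file compressor', 'archivers', '', '']

# Count how many fields are actually tested before any? gives up.
tested = 0
fields.any? { |field| tested += 1; /bzip/ =~ field }
puts tested
puts any_field_matches?(fields, /bzip/)
```

With only five short fields per line the saving is likely small, which fits the remark that it may only help on bigger arrays.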
I'm using perl 5.6.1 and ruby 1.6.8 and getting ruby about twice as slow as perl.
Re:Eh? (Score:2)
Re:Eh? (Score:1)
I just installed Ruby today and have been poking through the online docs, but couldn't find a 'break' or 'last' statement. Is there such a thing? The best I could come up with was throwing an exception and catching it outside the loop. I still need to g
Re:Eh? (Score:2)
Visit rubycentral [rubycentral.com] or ruby-doc [ruby-doc.org].
The first link is an online version of Programming Ruby, aka "The Pickaxe". You can still buy that book at the store, if you prefer paper.
Re:Eh? (Score:1)
I was poking through the bookstore and the only Ruby book there was Sams "Teach Yourself Ruby in 21 Days". I can't recommend it, as it had no mention of 'break', 'next', or 'redo', nor the IO.foreach method in your example (and it was a thick book).
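Since 'break', 'next' and IO.foreach came up, here is a minimal sketch of how they fit together. It uses File.foreach, which is the same method inherited from IO; the first_match helper and the sample data are made up for illustration.

```ruby
require 'tempfile'

# Return the first non-comment line matching re, or nil if there is none.
def first_match(path, re)
  found = nil
  File.foreach(path) do |line|
    next if line.start_with?('#') # 'next' skips to the following line
    if line =~ re
      found = line.chomp
      break                       # 'break' leaves the loop, no exception needed
    end
  end
  found
end

Tempfile.create('index') do |f|
  f.write("# bzip entries follow\narj-3.10b|a\nbzip-0.21|b\nbzip2-1.0.2|c\n")
  f.flush
  puts first_match(f.path, /bzip/) # prints "bzip-0.21|b"
end
```

The header line is skipped by 'next' even though it contains the pattern, and the loop stops at the first real match without reading the rest of the file.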
Re:Eh? (Score:1)
[ayeka:~/portfinder] buck> repeat 5 time ruby pftest2.rb ruby
68.394u 2.119s 1:10.56 99.9% 4+1346k 0+0io 0pf+0w
69.770u 2.258s 1:12.08 99.9% 4+1346k 0+0
Buck
Re:Eh? (Score:2)
127.33user 2.35system 2:10.31elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (222major+301minor)pagefaults 0swaps
Perl:
126.28user 1.74system 2:08.32elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (374major+164minor)pagefaults 0swaps
Perhaps it's a NetBSD issue? Seems unlikely, but based on the results you're getting versus what I'm getting, I'd consider it a possibility at least. At least I cut it down to x2 instead of x4!
Please consider posting to the mail
Re:Eh? (Score:1)
Perhaps it's a NetBSD issue? (Buck: FreeBSD even :) ) Seems unlikely, but based on the results you're getting versus what I'm getting, I'd consider it a possibility at least. At least I cut it down to x2 instead of x4!
Agreed.
Please consider posting to the mailing list with this info (ruby-talk@ruby-lang.org).
I'd like to try these on my TiBook with OSX 10.2.3 first, though I don't expect much of a change. Is the mailing list archived somewhere where I can research before posting anything?
By the way,
Buck
Re:Eh? (Score:2)
There's a gateway between the mailing list and comp.lang.ruby, so you can search via deja (or your local news server) and get everything from the mailing list that way.
FreeBSD even :) - Oops. Probably not the first time I made that mistake. Probably won't be the last. :-P