On OS X there's a little package management tool called fink. Its no Debian, but its ok. Listing the available packages takes a few seconds longer than I liked on my iBook so I decided to do a little profiling to figure out what was being slow.
Turns out it was two things. One was their version comparision function. It was repeating over 90% of its calls so I memoized it. That knocked out about 25% of the runtime.
The other was a function that was simply looking for new
sub search_comparedb {
my $path = shift;
my (@files, $file, $fullpath, @stats);
# FIXME: should probably just check dirs of $config->get_treelist()
opendir(DIR, $path) || die "can't opendir $path: $!";
@files = grep { !/^[\.#]/ } readdir(DIR);
closedir DIR;
foreach $file (@files) {
$fullpath = "$path/$file";
if (-d $fullpath) {
next if $file eq "binary-$debarch";
next if $file eq "CVS";
return 1 if (&search_comparedb($fullpath));
}
else {
next if !(substr($file, length($file)-5) eq ".info");
@stats = stat($fullpath);
return 1 if ($stats[9] > $db_mtime);
}
}
return 0;
}
Pretty straight forward. A few small touch ups could be done, but basically ok code. And it was sucking down a third of the runtime. This is over directory trees containing 2700 files and about 50 non-ignored directories. Just 50 calls to this function ate nearly a second or runtime on my iBook.
I could have tried unrolling the recursion, but I never really learned how to do that (I'd be interested to be shown how its done). Anyhow, I did the simple thing:
sub search_comparedb {
my $path = shift;
$path
# Using find is much faster than doing it in Perl
return
(grep !m{/(CVS|binary-$debarch)/},
`/usr/bin/find $path -type f -name '*.info' -newer $basepath/var/db/fink.db`)
? 1 : 0;
}
find. We can get away with this because the code only runs on OS X where
Benchmark: timing 100 iterations of find, orig...
find: 9 wallclock secs ( 0.19 usr 0.00 sys + 1.34 cusr 4.58 csys = 6.11 CPU) @ 526.32/s (n=100)
orig: 49 wallclock secs (21.28 usr + 0.00 sys = 21.28 CPU) @ 4.70/s (n=100)
God damn. Some sort of Perl module wrapper around a find C library.
What was the next bottleneck? Oddly enough, Storable::retrieve. Its the top time killer now gulping down 40% of the runtime loading up that index file we just checked to see if it needed updating. But I'd already nearly doubled the performance so I stopped there.
Shameless plug (Score:3, Informative)
It wraps the C code from the GNU findutils into some XS to natively query an existing locatedb.
Reply to This
Re:Shameless plug (Score:2)
faster... (Score:1)
When fast is fast enough. (Score:2)