I just caught a link to a programming language benchmark for bioinformatics. Unsurprisingly, the Perl is grotty and the C and C++ and Java implementations beat it handily.
Are there any PDLers who'd like to bring some sanity to the results? (My eyes are going on strike from the C-style nested loops. Make sure you catch the use of bitwise and (&) as a control flow operator. That has to work only by accident.)
substr(A,B,1) (Score:2)
You *might* be able to make it faster with a regexp. But basically, doing anything character by character in Perl is very slow.
(it's also one of the main reasons that XML::SAX::PurePerl is so slow)
Re: (Score:1)
I managed to compensate by applying a regex if the character I see suggests I can read ahead a fair way.
You might be able to abuse the regex engine for this though...
s/(.)/something; $1/ge
You can double the speed (more or less) (Score:1)
In the alignment.pl code, nearly all the time is spent in the 'compute f matrix' loop. Pre-splitting the strings into arrays saved a few seconds (and took nearly no time itself). Using @_ directly in the score and max subroutines instead of assigning to lexicals, and using the ?: operator to write them as one-line functions, saved a few more seconds. (The Python didn't seem quite fair to compare against, since it has named parameters and so skips that assignment.)
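For anyone who wants to see roughly what those two tweaks look like, here is a minimal sketch. It is not the code from alignment.pl: the input strings, the +1/-1 scoring values, and the names score and max2 are placeholders made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical inputs; the benchmark reads its own sequences.
my ($seq_a, $seq_b) = ('GATTACA', 'GCATGCU');

# Split each string into an array of characters once, up front,
# instead of calling substr() inside the inner loop.
my @a = split //, $seq_a;
my @b = split //, $seq_b;

# One-line helpers that read @_ directly, skipping the copy into lexicals.
# The scoring values (+1 match, -1 mismatch) are placeholders.
sub score { $_[0] eq $_[1] ? 1 : -1 }
sub max2  { $_[0] > $_[1] ? $_[0] : $_[1] }

# Toy loop: add up the per-position score, floored at zero.
my $total = 0;
for my $i (0 .. $#a) {
    $total += max2(score($a[$i], $b[$i]), 0);
}
print "$total\n";
```

The trade-off is readability: $_[0] and $_[1] say nothing about what the arguments mean, so this is probably only worth doing in the innermost loop.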
There was also a lot of array indexing, so pre-assigning the first leve
Re: (Score:1)
4x (Score:1)
At the expense of some readability, I sped alignment.pl up by a factor of four (96 sec to 23 sec on my iMac G5 with perl5.8.6). You can view my modified code [chrisdolan.net] at your peril. The substr was not in fact the biggest cost -- I was surprised that changing to m/\G(.)/cg didn't save any time. The biggest win (about 40% time decrease) was unrolling the subroutines, of course, which is what some of the other languages may be doing
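For reference, the two ways of walking a string one character at a time that this thread compares look something like the sketch below. The input string and the array names are made up for illustration; this is not taken from the modified alignment.pl.

```perl
use strict;
use warnings;

my $seq = 'GATTACA';   # hypothetical input

# Index-based walk with substr(), one character per call.
my @via_substr;
for my $i (0 .. length($seq) - 1) {
    push @via_substr, substr($seq, $i, 1);
}

# Regex walk: in scalar context, /g resumes at pos($seq); \G anchors the
# match there, and /c keeps pos() from being reset when the match fails.
my @via_regex;
while ($seq =~ m/\G(.)/cg) {
    push @via_regex, $1;
}

print "same\n" if "@via_substr" eq "@via_regex";
```

Both loops visit the same characters in the same order, so which one is faster is something to measure on your own perl rather than assume. Note that . does not match a newline without /s, which is fine for sequence data on a single line.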