Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Here are some possibilities for you (Score:1)
If it is much bigger than will fit in memory, though, you should go for a radically different approach: do a mergesort on disk. Odds are you won't even need to write this yourself - create a datas
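For anyone who does want to see what a mergesort on disk looks like, here is a minimal sketch, not a tuned implementation. It assumes the data is one record per line and that plain string order is what you want. The name external_sort and the $max_lines parameter are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: sort the lines of file $in into file $out while
# holding at most $max_lines lines in memory at any one time.
sub external_sort {
    my ($in, $out, $max_lines) = @_;
    open my $in_fh, '<', $in or die "open $in: $!";

    # Pass 1: cut the input into sorted runs, one temp file per run.
    my (@runs, @buf);
    my $flush = sub {
        return unless @buf;
        my $tmp = tempfile(UNLINK => 1);    # scalar context: handle only
        print {$tmp} sort @buf;
        seek $tmp, 0, 0 or die "seek: $!";  # rewind so the run can be read back
        push @runs, $tmp;
        @buf = ();
    };
    while (my $line = <$in_fh>) {
        $line .= "\n" unless $line =~ /\n\z/;  # terminate a final bare line
        push @buf, $line;
        $flush->() if @buf >= $max_lines;
    }
    $flush->();
    close $in_fh;

    # Pass 2: k-way merge. Keep the current head line of every run and
    # repeatedly write out the smallest one.
    open my $out_fh, '>', $out or die "open $out: $!";
    my @head = map { scalar readline($_) } @runs;
    while (1) {
        my $min;
        for my $i (0 .. $#head) {
            next unless defined $head[$i];     # this run is exhausted
            $min = $i if !defined($min) || $head[$i] lt $head[$min];
        }
        last unless defined $min;              # every run is exhausted
        print {$out_fh} $head[$min];
        $head[$min] = readline($runs[$min]);
    }
    close $out_fh or die "close $out: $!";
}
```

You would call it as something like external_sort('big.txt', 'sorted.txt', 1_000_000), where both file names are placeholders. The linear scan for the smallest head line is fine for a handful of runs; with many runs a heap would be the usual replacement.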
another approach (Score:1)
1) create a dictionary for the words in the file assigning an integer to every different word
2) map the text file into an array (@word) where every word is replaced by its index.
3) create another array (@offset) containing 0..$#word
4) sort @offset as follows:
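@offset = sort {
    for (my $i = 0; ; $i++) {
        # a suffix that runs off the end of @word sorts first
        return -1 if $a + $i >= @word;
        return  1 if $b + $i >= @word;
        # compare word ids; on a tie, move on to the next word
        return ($word[$a+$i] <=> $word[$b+$i] or next);
    }
} @offset;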
5) now the offsets into
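Steps 1 to 4 can be put together into a small script. This is only a sketch on a made-up six-word sample (@text). It also assumes the dictionary hands out integers in sorted word order, which the steps above do not require, so that numeric order of the ids matches string order of the words.

```perl
use strict;
use warnings;

# Hypothetical sample; in practice the words would come from the file.
my @text = qw(to be or not to be);

# Step 1: dictionary mapping every distinct word to an integer.
# Assumption: ids are assigned in sorted word order.
my %seen;
my @vocab = sort grep { !$seen{$_}++ } @text;
my %id;
@id{@vocab} = 0 .. $#vocab;

# Step 2: the text as an array of word ids.
my @word = map { $id{$_} } @text;

# Step 3: one offset per position in the text.
my @offset = 0 .. $#word;

# Step 4: sort the offsets by the word sequence that starts at each one.
@offset = sort {
    for (my $i = 0; ; $i++) {
        return -1 if $a + $i >= @word;   # $a's suffix ended first
        return  1 if $b + $i >= @word;   # $b's suffix ended first
        return ($word[$a+$i] <=> $word[$b+$i] or next);
    }
} @offset;

# Show each offset with the text that starts there.
print "$_: ", join(' ', @text[$_ .. $#text]), "\n" for @offset;
```

For this sample the sorted offsets come out as 5 1 3 2 4 0, so positions whose text starts with the same words end up next to each other. Note that each comparison can walk a long way through @word when the text has long repeated stretches.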