

  • by vsergu (505) on 2004.09.16 12:49 (#34389)
    I looked briefly at Plucene but never got very far and got discouraged by all the comments about slowness.

    I had heard of Swish-e but hadn't even considered it because I didn't think it could handle gigabyte-sized collections of documents. After seeing Josh Rabinowitz's talk at YAPC I decided to try it out, and we've been quite surprised at how fast it is. We're in the process of junking the Windows-based search software we were using (with a jerry-rigged Perl/MySQL system for splitting up searches and distributing them across several servers) and replacing it with Swish-e.

    Swish-e has some limitations, some of which make it tricky to highlight matches or even give the total number of hits in a document, but the speed is worth it.