All the Perl that's Practical to Extract and Report

  • I looked briefly at Plucene but never got very far and got discouraged by all the comments about slowness.

    I had heard of Swish-e but hadn't even considered it because I didn't think it could handle gigabyte collections of documents. After seeing Josh Rabinowitz's talk at YAPC I decided to try it out, and we've been quite surprised at how fast it is. We're in the process of junking the Windows-based search software we were using (with a jerry-rigged Perl/MySQL system for splitting up searches and distributin
  • I know it's not Perl, but have you considered looking at Lucene [apache.org], together with some Inline-Java [cpan.org] glue? That way, you'd get the fairly nice Lucene architecture, and possibly better performance.

    -Dom

  • I can't comment on scalability for the likes of Slashdot, but Swish-e certainly worked well for the site we deployed it on. Its parsing is quite good (for HTML, XML, and whatever you can parse programmatically, so it can grab data from SQL) and the output is quite flexible (your choice of templating systems); it wasn't too tough to make very nice-looking output.

    We're not using SWISH::API under mod_perl, so I'm curious about whether it can stand up to Slashdot's load.
    --

    -DA [coder.com]

  • Pudge, whatever came of this? I'm curious if you found a suitable one or if you're still looking. Have you looked at Texis' Webinator [thunderstone.com]?