Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Scalar vs. Vector (Score:2)
Not quite. MySQL has a scalar cosine function, like Perl and damn near every other programming language you can name. Because this is a vector-based search engine, you need to take the cosine of the angle between the two vectors, as the article clearly states:
Re:Scalar vs. Vector (Score:1)
That confirms my initial concern that it wouldn't scale well, as you would have to do a lot of mathematical grunt work in Perl (or more likely in the C-backed PDL library), as opposed to using a specialised tool like a relational database to do the bulk of the processing.
I don't think MySQL would manage the cosine math involved here. Oracle probably would, using horrid PL/SQL perversions, but that would rather detract from the initial elegance.
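To make the "cosine math" concrete: the vector cosine is the dot product of the two vectors divided by the product of their lengths. Here is a minimal pure-Perl sketch, assuming each document is a sparse term vector stored as a hash of term => weight. The name cosine_similarity and the sample weights are mine, not the article's, and the article's own code (which uses PDL) may well do this differently.

```perl
use strict;
use warnings;

# Cosine similarity between two sparse term vectors, each a hash ref
# of term => weight. Hypothetical helper, not taken from the article.
sub cosine_similarity {
    my ($u, $v) = @_;
    my ($dot, $sq_u, $sq_v) = (0, 0, 0);

    for my $term (keys %$u) {
        $sq_u += $u->{$term} ** 2;
        # Only terms present in both vectors contribute to the dot product.
        $dot  += $u->{$term} * $v->{$term} if exists $v->{$term};
    }
    $sq_v += $_ ** 2 for values %$v;

    # A zero-length vector has no direction; treat it as "no similarity".
    return 0 unless $sq_u && $sq_v;
    return $dot / sqrt($sq_u * $sq_v);
}

my %doc   = (perl => 2, search => 1, vector => 1);
my %query = (perl => 1, vector => 1);
printf "%.4f\n", cosine_similarity(\%doc, \%query);   # prints 0.8660
```

Every query has to be scored against every document vector this way, which is where the scaling worry comes from.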
It would be nice to be able to do simple, fast similar-document matching and scoring, but a reverse index still offers advantages such as boolean matches, search term weighting, and phrase matching, all of which could be hit and miss with vector-based searching.
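For comparison, here is a rough sketch of the reverse index approach with a boolean AND match, assuming documents are plain strings keyed by ID. The names build_index and search_all are made up for illustration. Word positions are stored so that phrase matching could be layered on top, but that part is not shown.

```perl
use strict;
use warnings;

# Build a reverse index: term => { doc_id => [word positions] }.
sub build_index {
    my (%docs) = @_;
    my %index;
    while (my ($id, $text) = each %docs) {
        my $pos = 0;
        for my $word ($text =~ /\w+/g) {
            push @{ $index{lc $word}{$id} }, $pos++;
        }
    }
    return \%index;
}

# Boolean AND: return the IDs of documents containing every term.
sub search_all {
    my ($index, @terms) = @_;
    my %hits;
    for my $term (map { lc $_ } @terms) {
        $hits{$_}++ for keys %{ $index->{$term} || {} };
    }
    my @ids = sort { $a cmp $b } grep { $hits{$_} == @terms } keys %hits;
    return @ids;
}

my $index = build_index(
    1 => 'Vector search in Perl',
    2 => 'Building a search engine with MySQL',
    3 => 'Perl search engine tricks',
);
print join(',', search_all($index, qw(perl search))), "\n";   # prints 1,3
```

Each lookup only touches the postings for the query terms, so exact boolean answers come cheap. The vector approach gives ranked similarity instead.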
@JAPH = qw(Hacker Perl Another Just);
print reverse @JAPH;