All the Perl that's Practical to Extract and Report

use Perl

TeeJay (2309)

  (email not shown publicly)

Working in Truro
Graduate with BSc (Hons) in Computer Systems and Networks
pm :,,
lug : Devon & Cornwall LUG
irc : TeeJay
skype : hashbangperl
livejournal : hashbangperl
flickr : hashbangperl

Journal of TeeJay (2309)

Wednesday November 20, 2002
10:10 AM

more search magic with mysql

[ #9038 ]
One of the problems I have found with MySQL 4 is the way it scores results in the boolean mode of fulltext search.

It means that something that matches '"quite a long phrase"' scores lower than '+word other'. So your scoring is totally different from query to query, depending on how many distinct tokens the query contains.

The way around this is to calculate a max score for each query. This snippet of code is quite handy for that:

#!/usr/bin/perl -w
use strict;

print "\nstarting...\n";
my @strings = ('"quite long phrase"','+must optional','"short phrase" word', '"quite long phrase" word');
foreach my $string (@strings) {
    my $max = 0;
    print "string:$string\n";
    # split the query into quoted phrases and bare words
    my @tokens = $string =~ m/(\"[\s\S]+\"|\S+)/g;
    print "tokens:\n";
    print join(":",@tokens), "\n\n";
    foreach my $token (@tokens) {
        # quoted phrases and +words count for 0.8, plain words for 0.3
        $max += ($token =~ m/[\"\+]/) ? 0.8 : 0.3;
    }
    print "max score : $max\n";
}

print "done...\n";
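
One way to put that maximum to use, sketched here as my own assumption rather than anything MySQL does for you: divide the raw relevance score that comes back from the query by the query's maximum, so results from differently shaped queries end up on a comparable scale. The names max_score and normalised_score are hypothetical, the 0.8 and 0.3 weights are the ones from the snippet above, and the raw score of 0.55 is invented purely for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: the highest score a query could reach, using the
# same weights as the snippet above (0.8 for phrases and +words, 0.3 otherwise).
sub max_score {
    my ($query) = @_;
    my $max = 0;
    # quoted phrases or bare words; [^"]+ keeps each phrase to its own quotes
    foreach my $token ($query =~ m/("[^"]+"|\S+)/g) {
        $max += ($token =~ m/["+]/) ? 0.8 : 0.3;
    }
    return $max;
}

# Hypothetical helper: scale a raw score against the query's maximum.
sub normalised_score {
    my ($raw, $query) = @_;
    my $max = max_score($query);
    return $max ? $raw / $max : 0;    # guard against an empty query
}

# 0.55 is an invented raw score, not a real MySQL result.
printf "%.2f\n", normalised_score(0.55, '+must optional');
```

Whether a plain division is the right normalisation for your data is a judgement call; it is only meant to show where the max score would plug in.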

  • my @tokens = m/(\"[\s\S]+\"|\S+)/g;

    Shouldn't that regex be m/(\"[^"]+\"|\S+)/g ? Or as I'd normally write it, /("[^"]+"|\S+)/g . The string might include two quoted phrases.

    • That's a very good point.

      The second regex would do very nicely.


      @JAPH = qw(Hacker Perl Another Just);
      print reverse @JAPH;
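
For anyone wanting to see the difference discussed in the comments above, here is a minimal sketch that runs both patterns over a query with two quoted phrases. The query string is made up for the test; the two regexes are the ones quoted in the thread.

```perl
#!/usr/bin/perl -w
use strict;

# A made-up query containing two quoted phrases.
my $query = '"short phrase" word "another phrase"';

# Original pattern: [\s\S]+ is greedy, so it runs on to the last quote.
my @greedy = $query =~ m/(\"[\s\S]+\"|\S+)/g;

# Suggested pattern: [^"]+ stops at the first closing quote.
my @fixed = $query =~ m/("[^"]+"|\S+)/g;

print scalar(@greedy), " token(s): ", join(":", @greedy), "\n";
print scalar(@fixed),  " token(s): ", join(":", @fixed),  "\n";
```

The greedy version swallows the whole string as one "phrase", so the max score comes out too low, while the negated character class gives back the two phrases and the bare word separately.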