Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
Lingua-EN-MatchNames (Score:1)
Re:Lingua-EN-MatchNames (Score:1)
Yes, I looked at it, as well as the excellent modules Lingua::EN::NameParse and Lingua::EN::AddressParse by Kim Ryan. I plan to use them once I get to the blocking-window level of matching.
The problem, of course, is that when you have many millions of records, the turnaround time for a really close look at each record just gets too large. So what I think I need to do is work out how to split large datasets into groups of records that can be compared in an economical amount of time (the