Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
three regexen (Score:2)
I'm not sure if I'm misreading this, but it looks like you have three different regular expressions there:
Re:three regexen (Score:2)
Sorry for not being clear. I'd expect
Re:three regexen (Score:1)
If there is any possibility of accented 'national' characters (and in unconstrained data there always is), \w is much preferred to [A-Za-z] or /[A-Z]/i. Bear in mind that \w also matches digits and underscore.
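To make that concrete, here is a minimal sketch comparing the character classes. The helper names and sample strings are made up for illustration, and the /u flag (Perl 5.14 or later) is my assumption about how you would ask for Unicode rules; the original post's regex isn't shown, so this is not a drop-in fix.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.
# ASCII letters only: rejects anything accented.
sub is_ascii_letters { return $_[0] =~ /\A[A-Za-z]+\z/ ? 1 : 0 }

# \w under Unicode rules (/u): letters in any script, plus digits and underscore.
sub is_word_chars { return $_[0] =~ /\A\w+\z/u ? 1 : 0 }

# \p{L}: letters only, in any script.
sub is_letters { return $_[0] =~ /\A\p{L}+\z/u ? 1 : 0 }

# Made-up sample data; "\x{e9}" is e-acute.
my %samples = (
    plain      => "Helsing",
    accented   => "Ren\x{e9}e",
    with_digit => "agent_007",
);

for my $label (sort keys %samples) {
    my $s = $samples{$label};
    printf "%-10s ascii=%d word=%d letters=%d\n",
        $label, is_ascii_letters($s), is_word_chars($s), is_letters($s);
}
```

If you only want letters, \p{L} is probably closer to what you mean than \w. Either way, the data has to be decoded into characters first, or the accented bytes won't look like letters at all.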
I'd worry that in some systems a 'person' might actually be shorter than 8 chars, or contain spaces or lower case (van Helsing, etc.).
What strings(1) shows you isn't quite what Perl sees. Try xd(1) or od(1) to see the details. If you're on Windows (or VMS?), call binmode on your filehandle. (For portability, call binmode any time you read binary data.)
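If od(1) isn't handy, something like the following shows roughly what Perl sees once binmode is on. It's only a sketch: hex_dump_head is a name I made up, and the 64-byte default in the usage line is arbitrary.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $count bytes of a file as
# space-separated hex pairs, read without any newline translation.
sub hex_dump_head {
    my ($path, $count) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # raw bytes: no CRLF translation, no encoding layers
    my $buf = '';
    my $got = read($fh, $buf, $count);
    die "Read failed on $path: $!" unless defined $got;
    close($fh);
    # '(H2)*' unpacks every byte as two hex digits.
    return join ' ', unpack('(H2)*', $buf);
}

# Usage: perl hexhead.pl somefile
print hex_dump_head($ARGV[0], 64), "\n" if @ARGV;
```

Comparing this output with and without the binmode call on Windows should show whether line-ending translation is eating bytes your regex expects.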
Good luck; we'll be interested to hear what the results are.
Bill
# I had a sig when sigs were cool
use Sig;