I was enjoying my coffee and doing some catch-up work this morning when my sysadmin friend IM'd me, asking if I knew of a way to generate Perl-compatible regular expressions from a set of words. A quick trip to the CPAN revealed Regex::PreSuf. A couple of minutes later I emailed him this script from the command line.
#!/usr/bin/env perl
use strict;
use warnings;
use Regex::PreSuf;
# Put in the words you want to match here
my @words = qw( foo bar blitz );
my $re = presuf( @words );
print "$re\n";
As a bonus, the docs say that the generated regexes are usually faster than plain alternation. I can think of a few places in my code to refactor already.
Don't forget about the alternatives... (Score:2)
- Regexp::Assemble [cpan.org]
- Regexp::List [cpan.org]
I've been told these are not just examples of the phenomenon known as "reinventing the wheel": the authors allegedly knew of Regex::PreSuf, and made improved versions. Hopefully. So, it might be worthwhile to actually compare these modules...
Re: (Score:1)
Beat me to the punch… Regexp::Assemble [cpan.org] is the one I generally use.
Re:Don't forget about the alternatives... (Score:1)
As the author of Regexp::Assemble, let me weigh in:
Yes, I knew about Regex::PreSuf (and it is referenced in the SEE ALSO section of the documentation). R::PS doesn't deal with metacharacters, so something like a\d+b and a\s+b is going to produce a\[ds]+b, which won't even compile.
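For anyone who wants to try the metacharacter case, here is a minimal sketch, assuming Regexp::Assemble is installed from CPAN. It only uses new, add and re; the test strings are made up for illustration, and the exact pattern the module emits may vary between versions, so the script checks behaviour rather than printing an expected regex.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Regexp::Assemble;    # from CPAN, not a core module

# Feed in patterns rather than literal words. Single quotes keep the
# backslashes intact so \d and \s reach the module as metacharacters.
my $ra = Regexp::Assemble->new;
$ra->add( 'a\d+b', 'a\s+b' );

# re() returns the assembled pattern as a compiled qr// object.
my $re = $ra->re;

# Illustrative inputs: the first two should match, the third should not.
for my $candidate ( 'a42b', 'a  b', 'axb' ) {
    my $verdict = $candidate =~ /^$re$/ ? 'matches' : 'does not match';
    print "'$candidate' $verdict\n";
}
```

The anchors are added at match time because the assembled pattern itself is unanchored.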
Regexp::List, I knew about, but you'll forgive me if I can't quite recall why I discarded it when I evaluated it. I think it gets exponentially slower as the input list grows.
Regexp::Assemble comes with a number of scripts in
Re:Don't forget about the alternatives... (Score:1)
Thanks for the weigh in! This looks like the industrial strength solution I will put into production. I need to dive into tries also and get a good understanding of those. I like the as_string method for readability here.
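To get a feel for the trie idea, here is a rough pure-Perl sketch that needs no CPAN modules. It is not how Regexp::Assemble or Regex::PreSuf are implemented; the names build_trie and trie_to_regex are made up for this example, and it only handles literal words (every character goes through quotemeta).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build a trie: each node is a hash keyed by the next character.
# The empty-string key marks "a word ends here".
sub build_trie {
    my @words = @_;
    my %trie;
    for my $word (@words) {
        my $node = \%trie;
        for my $char ( split //, $word ) {
            $node->{$char} ||= {};
            $node = $node->{$char};
        }
        $node->{''} = 1;
    }
    return \%trie;
}

# Walk the trie and emit a pattern string. Branches become (?:...|...)
# groups, and a group is made optional when a word also ends at that node.
sub trie_to_regex {
    my ($node) = @_;
    my $is_end = exists $node->{''};
    my @alts;
    for my $char ( sort grep { $_ ne '' } keys %$node ) {
        push @alts, quotemeta($char) . trie_to_regex( $node->{$char} );
    }
    return '' unless @alts;
    my $group = ( @alts == 1 && !$is_end )
        ? $alts[0]
        : '(?:' . join( '|', @alts ) . ')';
    $group .= '?' if $is_end;
    return $group;
}

my @words   = qw( foo bar blitz );
my $pattern = trie_to_regex( build_trie(@words) );
print "$pattern\n";    # prints (?:b(?:ar|litz)|foo)
```

The shared leading "b" of bar and blitz is factored out once, which is the kind of saving over a flat foo|bar|blitz alternation that the modules above aim for.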
Another fun morning with Perl and Coffee!