All the Perl that's Practical to Extract and Report

Phred (5358)

  fredNO@SPAMtaperfriendlymusic.org
http://www.redhotpenguin.com/

Fred is a Perl and PostgreSQL geek. He has made some very small contributions to a few cpan modules and mod_perl.

Journal of Phred (5358)

Thursday November 24, 2005
01:27 PM

Regex::PreSuf with my coffee

[ #27727 ]

I was enjoying my coffee and doing some catch-up work this morning when my sysadmin friend IMed me, asking if I knew of a way to generate Perl-compatible regular expressions from a set of words. A quick trip to the CPAN revealed Regex::PreSuf. A couple of minutes later I emailed him this script from the command line.

#!/usr/bin/env perl

use strict;
use warnings;

use Regex::PreSuf;

# Put in the words you want to match here
my @words = qw( foo bar blitz );

# presuf() returns the generated pattern as a plain string
my $re = presuf( @words );
print "$re\n";

As a bonus, the docs say that the generated regexes are usually faster than plain alternation. I can already think of a few places in my code to refactor :)
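For comparison, here is a minimal sketch of the plain alternation that the docs are talking about, built with core Perl only. The helper name alternation() and the test words are made up for illustration; nothing here comes from Regex::PreSuf itself.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Same word list as in the script above
my @words = qw( foo bar blitz );

# Plain alternation: escape each word, then join with "|".
# alternation() is a hypothetical helper, not part of any module.
sub alternation {
    my @list = @_;
    return join '|', map { quotemeta } @list;
}

my $alt = alternation(@words);    # foo|bar|blitz
my $re  = qr/\A(?:$alt)\z/;       # anchored so only whole words match

for my $candidate (qw( foo blitz food )) {
    printf "%-6s %s\n", $candidate, ( $candidate =~ $re ? 'match' : 'no match' );
}
```

A string returned by presuf() should be usable in the same way, by dropping it into the qr// in place of $alt.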

  • I've been told these are not just examples of the phenomenon known as "reinventing the wheel": the authors allegedly knew of Regex::PreSuf, and made improved versions. Hopefully. So, it might be worthwhile to actually compare these modules...
    • Beat me to the punch… Regexp::Assemble is the one I generally use.

    • As the author of Regexp::Assemble, let me weigh in:

      Yes, I knew about Regex::PreSuf (and it is referenced in the SEE ALSO section of the documentation). R::PS doesn't deal with metacharacters, so something like a\d+b and a\s+b is going to produce a\[ds]+b, which won't even compile.

      Regexp::List, I knew about, but you'll forgive me if I can't quite recall why I discarded it when I evaluated it. I think it gets exponentially slower as the input list grows.

      Regexp::Assemble comes with a number of scripts in

      • Thanks for weighing in! This looks like the industrial-strength solution I will put into production. I also need to dive into tries and get a good understanding of those. I like the as_string method for readability here.

        Another fun morning with Perl and Coffee!
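Since tries came up, here is a toy sketch of the general idea: put the words into a trie, then walk it to emit a pattern with the shared prefixes factored out. This is not how Regexp::Assemble or Regex::PreSuf are implemented; it only handles literal words, and build_trie() and trie_to_regex() are names invented for this example.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Insert each word into a hash-of-hashes trie, one character per level.
# The empty-string key marks "a word ends at this node".
sub build_trie {
    my @words = @_;
    my %trie;
    for my $word (@words) {
        my $node = \%trie;
        for my $char ( split //, $word ) {
            $node->{$char} ||= {};
            $node = $node->{$char};
        }
        $node->{''} = 1;
    }
    return \%trie;
}

# Walk the trie and emit a pattern string with common prefixes shared.
sub trie_to_regex {
    my ($node) = @_;
    my $ends_here = exists $node->{''};

    my @branches;
    for my $char ( sort grep { $_ ne '' } keys %$node ) {
        push @branches, quotemeta($char) . trie_to_regex( $node->{$char} );
    }

    return '' unless @branches;    # leaf: nothing follows
    my $body = join '|', @branches;

    # A single branch with no word ending here needs no grouping
    return $body if @branches == 1 && !$ends_here;

    # Otherwise group the branches; the group is optional if a word ends here
    return "(?:$body)" . ( $ends_here ? '?' : '' );
}

my $pattern = trie_to_regex( build_trie(qw( foo bar blitz )) );
print "$pattern\n";    # (?:b(?:ar|litz)|foo)

my $re = qr/\A(?:$pattern)\z/;
print "blitz matches\n" if 'blitz' =~ $re;
```

Only prefixes are shared here; folding common suffixes and coping with metacharacters is where the real modules earn their keep.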