All the Perl that's Practical to Extract and Report

Wednesday December 27, 2006
01:07 AM

Happy Hanukkah/Merry Christmas/Happy Kwanzaa/Merry Newtonmas

To all my friends in the Perl community: Happy Hanukkah, Merry Christmas, and Happy New Year!

Here's wishing you and yours a very happy holiday season. This is our second Christmas in Palo Alto. Our second year in California was a little less confusing than the first. Still some wonderful things -- the farmer's markets, the street festivals, of which there seemed to be one within five to ten miles each weekend day through the summer. It's also been a great place for concerts -- U2, and Sting, as well as really good local stuff. Some 100+ degree days, so N. Cal is not weather heaven, and some rains in the winter that were as heavy as monsoons in Asia, only the pervasive wetness that soaked through clothes was cold rather than hot.

I'm having a ball at Yahoo!, and Shymala has been very lucky in having the historical novelist, Beverly Swerling, decide to involve herself in the novel Shymala's been working on for the past several years. Beverly has been the perfect mentor, and a very good friend.

Shymala has had some medical problems recently that have got us both motivated to take care of some things that got postponed during all the busyness since we've been out here, so that will be part of our new year, along with getting ourselves back into a gym again.

Real estate around here, though slowing, is still ridiculous, so we are again figuring out where we might live when we grow up. Having done a cross-country relocation at an age when we should have had more sense, we want the next move to be final, but there's the difficulty that there are a lot of wonderful places out there, and who can guess what life will be like a couple of years down the road in any particular place?

We saw a lot of wonderful things over the past year - the SF skyline from Sausalito, surfers at Santa Cruz, some very happy ducks (and some very barbecued ones!), the San Jose Rose Garden, and sea otters in the ocean at Monterey. We've finally done some of the tourist stuff in SF - driving down Lombard Street (the crookedest street in San Francisco), checking out the sea lions at Pier 39, and driving across the Golden Gate, just to do it. Still haven't ridden a cable car, though. And we've had a chance to catch up with some very old friends, which was lovely.

Hope things have been good for everyone!

best wishes

Shymala & Joe

Monday October 16, 2006
08:10 PM

Perl catches a pedophile on MySpace.

Wired Magazine's Kevin Poulsen has shown that at least part of the problem of policing MySpace is a SMOP.

http://www.wired.com/news/technology/1,71948-0.html

My road to this New York police unit began in Perl.

In May, I began an automated search of MySpace's membership rolls for 385,932 registered sex offenders in 46 states, mined from the Department of Justice's National Sex Offender Registry website -- a gateway to the state-run Megan's Law websites around the country. I searched on first and last names, limiting results to a five mile radius of the offender's registered ZIP code.

Wired News will publish the code under an open-source license later this week.

The code swept in a vast number of false or unverifiable matches. Working part time for several months, I sifted the data and manually compared photographs, ages and other data, until enhanced privacy features MySpace launched in June began frustrating the analysis.

Excluding a handful of obvious fakes, I confirmed 744 sex offenders with MySpace profiles, after an examination of about a third of the data. Of those, 497 are registered for sex crimes against children. In this group, six of them are listed as repeat offenders, though Lubrano's previous convictions were not in the registry, so this number may be low. At least 243 of the 497 have convictions in 2000 or later.
Tuesday August 29, 2006
06:04 PM

I'm liking Jifty

Just lately, I ran across librivox.org, an open-source audiobook project. These folks read public-domain books, upload the recordings as MP3s, and then release them in the public domain. Pretty cool.

The project has been running itself via the Librivox forum, and has come up with a pretty decent workflow to allow people all over the world to participate in reading books. Heck, they even did The Importance of Being Earnest this way, with individual folks from around the world reading lines, and an editor putting it all together to make the final "performance".

But they're starting to get to the point where it'd be better to have a separate application to handle the workflow and act as a search engine for completed works. I started looking at Jifty for this and it looks like it might be just the thing. By stealing some code from Wifty (the Jifty wiki), it looks like I can bang together a basic application in a short while -- if an already-underway PHP/MySQL project isn't done first. That one has some of the more experienced Librivoxers behind it, so it may end up being the system of choice; it's more likely to have captured the workflow for sure, as I'm looking at it from the outside, as opposed to being experienced in the process.

Still, as a slightly more difficult application, this should be a good vehicle for me to learn Jifty, and I'm going to go ahead anyway, even if they don't use mine.
Monday August 21, 2006
07:35 PM

Graph.pm: serious life simplification

Should you ever need to do dependency checking, just use Graph.pm instead of trying to do it yourself.

In my case, App::SimpleScan had a serious pessimization that I knew about: if you defined a variable, App::SimpleScan would try every possible value for it when substituting, even if said variable did not appear in the test spec. This was because I needed to be able to have variable substitutions possibly contain other variable substitutions (though I drew the line at recursive definitions).

If you don't have some means of tracking dependencies, you must always check every possible combination of values. If you get a bunch of variables, this slows things WAY down, even if most of them only have a couple of different values, because you now have to check the cross-product of all possible values. Not good.

Graph.pm just makes all this go away. You add each variable as a node, with edges pointing to variables that this one might need to substitute. If you've defined all your variables right (i.e., no circular dependencies), then you've got what's called a directed acyclic graph (or DAG). It's essentially a forest of trees that may or may not share some nodes.

To figure out all the variables a given one depends on, you put the one(s) you're interested in into an array, and then iterate over that, recording all the adjacent nodes (the successors). Repeat until you get no more new nodes (this could be optimized too, by pulling out nodes that have no successors, but this is fast enough for now). Ta-da: these, and only these, need to be substituted.

Oh, and Graph.pm detects cycles for you as well, so you just need to check whether any cycles got created as each variable is defined. Sweet.
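
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual App::SimpleScan code: the variable names (url, host, path, domain, unused) and the needed_variables helper are made up for the example, and it assumes the Graph module from CPAN is installed. An edge from A to B means "A's value may contain a substitution of B".

```perl
use strict;
use warnings;
use Graph;    # CPAN module; Graph->new builds a directed graph by default

# Hypothetical variable definitions: an edge A -> B means
# "the value of A may contain a reference to B".
my $deps = Graph->new;
$deps->add_edge('url',  'host');
$deps->add_edge('url',  'path');
$deps->add_edge('host', 'domain');
$deps->add_vertex('unused');    # defined, but nothing in the spec uses it

# Refuse circular definitions; in practice this check would run
# each time a variable is defined.
die "circular variable definition\n" if $deps->has_a_cycle;

# Hypothetical helper: start from the variables that appear in the
# test spec and keep following successors until no new nodes turn up.
sub needed_variables {
    my ($graph, @start) = @_;
    my %seen  = map { $_ => 1 } @start;
    my @queue = @start;
    while (defined(my $var = shift @queue)) {
        for my $next ($graph->successors($var)) {
            push @queue, $next unless $seen{$next}++;
        }
    }
    return sort keys %seen;
}

my @needed = needed_variables($deps, 'url');
print "@needed\n";    # prints "domain host path url"
```

Only the four variables reachable from url get substituted; unused never enters the cross-product at all.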
Friday June 30, 2006
04:17 PM

YAPC::NA 2006 part 2

So on to specific items:
  • Jeffrey Goff (DrForr)'s presentation on PPI was cool. There seemed to be some confusion in the audience about exactly how much PPI understood about perl, and some of the folks asking questions didn't quite seem to get that PPI knows how to tokenize Perl well; it doesn't claim to understand it. Quite interesting, and possibly of use to me in connection with Class::Pluggability.
  • Adam Kennedy (Alias)'s PITA presentation was very well done. The equation of the state of CPAN to London's sewers in the 1840's was actually spot on. Can't wait to see more from this project.
  • Perl::Critic was good. Including a Perl::Critic test in modules is also an excellent idea - setting that up as a derivative of Module::Starter::PBP sounds like it might be worth doing.
  • Object::Trampoline looks like a great tool: I had some one-on-one time with Steven Lembark after the session, and he has so many applications for this kind of thing it was amazing. Definitely looking for more stuff from him.
  • The Perl 6 Update showed a lot of well-thought-out and interesting stuff. Perl 6 is looking exciting again.
  • Luke Closs's Selenium testing stuff is great, and I intend to steal^Wadapt as much of it as possible for our testing.
  • José Castro (cog)'s Acme presentation was great, and I've got some ideas for a few modules of my own now...
  • Jifty looks cool. I've installed it, and will be reading through it and through the docs of some of the more interesting-looking modules it loaded in the install process.
  • Stevan Little's Moose talk was interesting; have to look at it a bit more to see if it's useful to me.

Got a lot more time in the hall sessions this YAPC, mostly because I seem to have given people ideas both about doing more testing and about plugins.

It was a great YAPC. Wish I could do it more often!
 
02:38 AM

YAPC::NA 2006

This YAPC was far more intense than previous ones for me, both on the professional and the personal level - so much so that I've waited until getting home to try to blog about it.

I arrived Sunday evening - unfortunately too late for the arrival dinner. Or the anti-arrival dinner. In fact, I was so late that no one was willing to deliver food at all. The perils of routing oneself through O'Hare.

Chicago is an interesting town; a tad rough around the edges, and very much itself. I haven't stayed in dorms since college, so I'd forgotten the dorm experience. As dorms go, the MSV dorms were quite okay. I gather that the SSV dorms were like living in concrete monk's cells. I was foresighted enough to take along a container of disinfecting wipes, which was nice to clean up around the shower and sink a little. I recommend it for anyone staying in a dorm.

The sessions were all very good this year, with some very cool stuff which I intend to start using and adapting as soon as possible:
  • Object::Trampoline, which uses one of the most wonderful pieces of subtle Perl I've seen in a while: an object that exists only for the purpose of replacing itself with something else as soon as it's called.
  • Test::WWW::Selenium, which wraps up the Selenium web testing platform in a handy Test::More interface. I intend to try grafting this into simple_scan as soon as I possibly can.
  • The Goo, a development environment which features a way of deciding "what do I feel like doing" (a couple problems installing, but I can work those out and send back some patches).
  • Jifty, yet another web app development platform, but this one looks a little less esoteric - not trying too hard to be "Perl on Rails". It installed a lot of interesting-looking prereqs as well; I'm planning on firing up Pod::Webserver::Source and reading a lot next week.
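
The "object that replaces itself" trick mentioned for Object::Trampoline can be sketched in a few lines. This is only an illustration of the general idea under my own assumptions, not Object::Trampoline's actual code or interface: the My::Trampoline and Counter packages are invented for the example. It relies on the fact that @_ aliases the caller's variable, so assigning to $_[0] inside AUTOLOAD swaps the object the caller is holding.

```perl
use strict;
use warnings;

package My::Trampoline;    # hypothetical stand-in class

sub new {
    my ($class, $target_class, @args) = @_;
    # Hold on to a closure that builds the real object later.
    return bless sub { $target_class->new(@args) }, $class;
}

sub DESTROY { }    # keep object destruction out of AUTOLOAD

our $AUTOLOAD;
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    # $_[0] is an alias for the caller's variable: overwrite it
    # with the real object on the first method call.
    $_[0] = $_[0]->();
    my $method = $_[0]->can($name)
        or die "Can't locate method $name in " . ref($_[0]) . "\n";
    goto &$method;    # re-dispatch the call to the real object
}

package Counter;    # hypothetical "expensive" class

sub new {
    my ($class, %args) = @_;
    return bless { count => $args{start} || 0 }, $class;
}

sub increment {
    my $self = shift;
    return ++$self->{count};
}

package main;

my $obj    = My::Trampoline->new('Counter', start => 5);
my $before = ref $obj;          # still the stand-in
my $value  = $obj->increment;   # first call builds the Counter
my $after  = ref $obj;          # the variable now holds the real thing

print "$before -> $after, value $value\n";
# prints "My::Trampoline -> Counter, value 6"
```

Nothing is constructed until the first method call, which is handy when the real object is expensive to build or may never be used.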

My presentations both went very well, though I think I may have been a little too high-energy in the Pluggability one, so I may have been off-camera a chunk of the time. Lots of interest in that; I'll need to finish off the final version and get it on CPAN soonest.

The simple_scan presentation went very well; the jokes got laughs when they were supposed to, and the stories were well-received as well. I had someone come up to me and say that they'd come to my second talk because they'd been told that I was a good speaker, and that they were glad they did. Thanks very much, whoever you are!

I've spent more time on #perl this year, especially in the time leading up to the conference, and it was really great to meet the people that I did; I was also glad that I had good solid work to refer to this year. I feel like I've come a long way in the past two years.

Had a great time in the off-hours as well, with good conversation, and sharing MST3K with both old and new (hi, q!) friends.

More tomorrow. Time for bed.
Tuesday May 16, 2006
06:47 PM

Attribute::Handlers and pluggability

With chromatic's entry yesterday and a fair amount of experimentation and glob madness, I've finally gotten an elegant way to handle getting rid of the monkey code in my pluggability framework, which will be part of the "Designing for Pluggability" talk at YAPC.

package Vacuum::Plugin::Screamer;
use base qw(Plugin::Base);
no strict;
 
sub foo :PluggedMethod {
  print "RUNNING!!!!!\n";
}
 
sub volume :Option(=i);
sub vocal_range :Slot;
 
1;

Here we have defined a plugin method that will automatically be accessible to the parent (Vacuum::Pluggable), a command-line option definition (with automatic creation of an accessor method for it), and a slot to be added to the parent object (with an automatically-generated accessor as well).

Still to go: prehooks and posthooks, and the generalization of the pluggable base class, but it's all going very well. This is sufficiently advanced technology with a vengeance. Thanks!
Monday March 06, 2006
02:57 PM

Who'da thunk

... that this would work?

SKIP: {
   print "first\n";
   SKIP: {
     print "second\n";
     SKIP: {
       print "third\n";
       last SKIP;
       print "shouldn't print\n";
     }
     print "leaving second\n";
   }
   print "leaving first\n";
}

with output

first
second
third
leaving second
leaving first

Not only does it compile, but it works as you'd expect: the last leaves only the innermost SKIP block. Or maybe I should say "it works the way I want it to".

This means that Test::More SKIP blocks can be nested and the Right Thing will happen. Why this matters will be made clear in a module I'll be releasing soon that wraps up Test::More-like retryable tests in a nice syntax.
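
As a rough illustration of what that looks like with Test::More itself (which is a core module), here is a small sketch. The $have_server and $have_login flags and the test names are invented for the example; skip() leaves the innermost enclosing SKIP block, just like the bare last SKIP above.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical preconditions, hard-coded for the example.
my $have_server = 1;
my $have_login  = 0;

SKIP: {
    skip 'no test server available', 4 unless $have_server;
    ok(1, 'server responds');

    SKIP: {
        # Leaves only this inner block, marking two tests as skipped.
        skip 'no login credentials', 2 unless $have_login;
        ok(1, 'login page loads');
        ok(1, 'login succeeds');
    }

    # Still runs, because the inner skip did not leave the outer block.
    ok(1, 'outer block continues after the inner skip');
}
```

Run as a test script, this reports tests 2 and 3 as skipped and tests 1 and 4 as passing.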
Thursday January 26, 2006
03:45 PM

MakeMaker fun

Here at Yahoo! we have a very cool install system, yinst; handwaving all the details, it makes it really easy to manage a zillion servers and development machines.

To use it properly, you need to build packages according to its scheme, which doesn't match a standard CPAN distribution. I found myself repeatedly building the config files for the build, doing the build, and transferring to our repository. Yesterday the idea hit me that I shouldn't be doing this; the build process should.

A quick look over the ExtUtils::MakeMaker docs showed me that it was actually trivial to add a section to the Makefile.PL to put my special build targets into the Makefile.

sub MY::postamble {
  return <<'MAKE_FRAG';
yman: *.3
        cd build && pod2man ../<MAIN PM FILE> > "../man/man3/<MAIN MODULE>.3"
 
yinst: *.yicf
        cd build && yinst_create -t release *.yicf
 
ydist: yman yinst *.tgz
        cd build && dist_install *.tgz
 
MAKE_FRAG
}

So now the Makefile has my targets in it. "Wait, wait," I hear you say. "What's all that stuff in the angle brackets?" That stuff is part 2 of my plan. In addition to the MakeMaker change, I put together a Module::Starter::Yahoo module that builds a base distribution modelled on the Module::Starter::PBP one, but with the added build directory and some extra mojo to build the Yahoo! build files.

So now I can do this to start up a new module:

module-starter --module=My::New::Module

And when my module's ready to distribute, I do this:

make
make test
make ydist

and poof, my module is built, tested, and distributed in proper format. And of course, if it's CPAN-able, I can just do a make dist instead and then upload to CPAN.

Since I based it on Module::Starter::PBP, I could also use the perl -MModule::Starter::Yahoo=setup business to get this installed as my base Module::Starter config.

I did find that Module::Starter::PBP wasn't well set up to be subclassed, which meant I had to cut and paste the create_distro method into my code and modify it (it uses hard-coded path components in its File::Spec calls), but not a big deal.

Also, the templating philosophy used in this module is slanted toward "one module, one file" because it's used to build .pm and .t files. I needed to have all of the modules in the .yicf file, so a little fiddling about was necessary there as well. All in all, though, I'm quite happy with the time spent to save time later.

I still have to edit the prereqs into the .yicf file; possibly I should expand the make yinst target to parse Makefile.PL and pull the prereqs out automatically.
Monday November 21, 2005
06:31 PM

CPANTS Weakness

It does suck when you add another module to CPAN, with everything in place just perfectly except "someone else is using this" ... and this makes your score drop.


Sigh.