All the Perl that's Practical to Extract and Report

rjbs (4671)

rjbs
  (email not shown publicly)
http://rjbs.manxome.org/
AOL IM: RicardoJBSignes
Yahoo! ID: RicardoSignes

I'm a Perl coder living in Bethlehem, PA and working in Philadelphia. I'm a philosopher and theologian by training, but I was shocked to learn upon my graduation that these skills don't have many associated careers. Now I write code.

Journal of rjbs (4671)

Monday November 03, 2008
10:42 PM

october horror recap

As is our tradition, Gloria and I watched a bunch of horror movies this October. Here's a quick recap:

Saw II-IV were a good watch... well, at least the first two. We saw Saw a few years ago and both agreed it was mediocre. Saw II was better, though, and Saw III was actually a really good capstone to the trilogy. For some reason, though, they decided to keep going. Saw IV was really lousy. I expect that V and VI will be worse, but I'm in no rush to find out.

Opera, like Suspiria, was an incomprehensible and joyless horror film by Dario Argento. I suggest you avoid it.

Chopping Mall was a good old fashioned 80's horror film by Roger Corman, King of the Bs. It had everything you need: teenagers having sex, robots getting struck by lightning, and an exploding head. Would buy again.

Child's Play I-III were a disappointment. Each one of the movies had a few amusing bits, but they weren't very funny, weren't very scary, and weren't even all that memorable. I'm looking forward to Bride and Seed of Chucky, though, as they seem like they'll be a lot of fun. I'm just surprised that they got made. The first three were just lousy.

Dark Water was a movie about a girl who drowns and then terrorizes the living, based on a book by Koji Suzuki. In other words, it sounds a lot like Ring. It had a lot of cool ideas, but basically it stank. It wasn't very compelling, and the climax made nearly no sense. On the plus side, Dark Water, unlike Ringu, is unlikely to haunt my nights for years to come.

Prom Night was easily the worst of the movies we watched. Its plot was nearly incomprehensible, the construction of the narrative was confusing at best, the writing was awful, Leslie Nielsen and Jamie Lee Curtis were both totally wasted, and the "payoff" was pointless. How on earth did that get to be a classic?

Mr. Ice Cream Man was mostly a terrible movie, not even on the B list, but the guy who played the ice cream man was creepy enough to get it above Prom Night. Still, what the hell was up with that last scene? It felt completely tacked on.

Basket Case 2 was, in a lot of ways, better than the first. Both were hard to really rate. They're strange movies. The second one was much more over the top, and I think it was better for it. There's a scene in which we meet a freak who is mostly a gigantic head, mostly mouth, who sings opera. Awesome!

Black Sheep did not impress. I'm not a big fan of the New Zealand school of horror-comedy. It had a few good gags, but mostly it didn't entertain me. I think it's the only movie that Gloria gave up on and went to bed during. I don't blame her.

Silent Night, Deadly Night was the movie on which we were the most conflicted. I thought it was pretty good, for what it was. Gloria thought that a guy dressed as Santa killing people was just Too Much. Despite that, it's now probably the most quoted of all the movies we watched this past month. "Punish!" and "Naughty!" just make great lines to yell out randomly.

We might watch one or two more movies that were left over, but then we may be back to our normal mix of programming until next October. I wonder when we can start Martha on some horror flicks...

Saturday November 01, 2008
07:54 PM

hogar crea flan

There is a fairly well-respected charity that's fairly active in Bethlehem. It's called Hogar Crea. I'm sure they do lots of good stuff and help people. That's the impression I've gotten from various people. That's not what matters to me.

What matters to me is that sometimes they come to my door and in exchange for just a few dollars they give me a delicious flan. God bless you, Hogar Crea.

Monday October 27, 2008
10:58 AM

what the heck is distzilla?

At the Pittsburgh Perl Workshop this year, I gave a lightning talk about Dist::Zilla, the system I am increasingly using to manage my CPAN distributions. I'm using it instead of writing a Makefile.PL, but it doesn't do the same thing as Module::Build or ExtUtils::MakeMaker. I'm using it instead of running module-starter, but it doesn't do the same things as Module::Starter. I've had some people say, "So should I stop using X and use Dist::Zilla instead?" The answer is complicated.

(Well, actually, for now the answer is simple: probably not. Dist::Zilla is a lot of fun and I really, really appreciate the amount of work it saves me, but it's really young, underbaked, and probably full of bugs that I haven't noticed yet. Still, the adventurous may enjoy it.)

The idea behind Dist::Zilla is that once you've configured it, all you need to do to build well-packaged CPAN distributions is write code and documentation. If you're thinking, "but that's what I've been doing anyway!" then first consider this: If you are writing =head1 NAME\n\nMyModule - awesome module by me then you are not just writing code and documentation. If you are adding a license to every file, again, you are not just writing code and documentation.

If you use, say, Module::Starter to get all this written for you, then you're safe from writing that boilerplate stuff. Unfortunately, if you need to change the license, or you want to add a 'BUGTRACKER' section to every module, Module::Starter can't help you. It creates a bunch of files and then its job is done. It never, ever looks at your module-started distribution and fixes up things. This also means that if you realize that your templates have failed to include use strict for your last three module-started distributions, you have to fix them by hand. The same goes for the stock templates, which until recently didn't include a license declaration in the Makefile.PL.

With Dist::Zilla this content is not created at startup. It is not stored in your repository. Instead, the files in your repo are just the code, documentation, and the Dist::Zilla configuration. When you run dzil build, your files are rebuilt every time, adding all the boilerplate content from your current setup. If you want to change the license everywhere, you change one line. If you want to start adding a VERSION header, you tweak the Pod::Weaver plugin's configuration.
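
To make the idea concrete, here is a toy sketch of build-time boilerplate. It is not Dist::Zilla's API or how Pod::Weaver really works; build_module and %config are hypothetical names invented for this illustration. The point is only that the stored source holds code and docs, while the NAME and LICENSE sections come from one configuration on every build.

```perl
use strict;
use warnings;

# Hypothetical illustration only; none of these names come from Dist::Zilla.
# The one place the license text lives:
my %config = (
  license => 'This is free software; you can redistribute it under the same terms as Perl itself.',
);

# Append generated NAME and LICENSE sections to hand-written source.
sub build_module {
  my ($package, $abstract, $source) = @_;
  return join "\n",
    $source,
    '=head1 NAME', '',
    "$package - $abstract", '',
    '=head1 LICENSE', '',
    $config{license}, '',
    '=cut', '';
}

print build_module('MyModule', 'awesome module by me', "package MyModule;\n1;\n");
```

Changing $config{license} and rebuilding changes the license section of every module built this way, which is the "change one line" property described above.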

So, there does exist a dzil new command for starting a new distribution. All it really does, though, is make a directory (maybe) and add a stock configuration file. Why would it add anything else? If you want any code, you would only be writing the actual code needed, not any boilerplate, so adding anything would be foolish.

There's also dzil release, which goes beyond what Module::Starter (and its competitors) do and into the realm of ShipIt or Module::Release. I'm hoping I can integrate with or steal from one of those sort of tools. Right now, it exists, but all it does is build a dist and upload it. In the future, it will have at least two more kinds of plugins to make the release phase more useful: VCS (so it can check in and tag releases) and changelog management. It has a changelog thing now, but it stinks and isn't very useful. In the future, you won't need to edit a changelog. It will be able to read changes out of your commits, or you will just tell it to record a changelog entry. Then the Changes file can be generated as needed.

For now, I am manually editing my Changes file.

So, eventually Dist::Zilla could obsolete Module::Starter for people who like what Dist::Zilla does. Then again, people might still want to have starter templates that add minimal boilerplate for using certain frameworks. We'll see what happens.

Saturday October 25, 2008
01:08 PM

pod people versus elementalists

A long time ago, I wanted to write something to let my pod (documentation) contain its own coverage hints. I gave up when I found out that it was not going to be trivial to say something like this:

my @blocks = PodParser->read_file($my_perl_module)->data_for('coverage');

In order to extract "foo\nbar" from:

sub foo { ... }

=begin coverage

foo
bar

=end coverage

I found ways, but they all bugged me. I gave up on the project for a long time, because it was a real yak, but eventually I came back to it when I realized how much pod manipulation I'd want in Dist::Zilla. I really wasn't happy with how Pod::Simple worked. Dieter had contributed a bit to Pod::Simple, and had talked about writing a more TreeBuilder-like interface. There were a number of significant blockers, though, and I didn't want to get hung up on them. Instead, while walking to McGrady's for ABE.pm, I had an idea and called Dieter to brainstorm with him. Basically, the idea can now be summarized as "I should write Pod::Eventual."

Pod is really great. It's so easy to write that I know I write much, much more documentation than I would if I had to produce, say, a chm file. It's a very, very simple format, and is complex enough to handle almost everything I've ever needed from it. My problems have been that I want to write even less and have it rewritten for me, so I can avoid boilerplate.

The root problem is that pod has both very simple and very complex parts. Here are some of the simple things:

  • a pod document is made up of paragraphs
  • paragraphs are separated by blank lines (but 'cut' commands are special)
  • pod can be interwoven with non-pod in a document
  • pod paragraphs are either:
    • commands (start with =)
    • verbatim (start with whitespace)
    • text (start with anything else)
  • the non-whitespace characters after the = in a command are the command

So, knowing this is about enough to write a tolerable pod paragraph parser for most uses. Sure, it misses a lot of encoding stuff, but adding that later (I believe) is not a big issue.
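
The rules above can be sketched in a few lines of Perl. This is a rough illustration under those rules only, not Pod::Eventual's actual code or interface; classify_pod_paragraphs and the event hash keys are names made up here. It ignores encodings and the finer points of what counts as a blank line.

```perl
use strict;
use warnings;

# Hypothetical helper, not Pod::Eventual's real API: split a string into
# blank-line-separated paragraphs and classify each one.
sub classify_pod_paragraphs {
  my ($source) = @_;
  my @events;
  my $in_pod = 0;

  for my $para (split /\n(?:[ \t]*\n)+/, $source) {
    $para =~ s/\n+\z//;
    next unless length $para;

    if ($para =~ /\A=(\S+)[ \t]*(.*)\z/s) {
      # A command: the non-whitespace run after "=" is the command name.
      my ($command, $content) = ($1, $2);
      $in_pod = $command eq 'cut' ? 0 : 1;   # =cut switches back to non-pod
      push @events, { type => 'command', command => $command, content => $content };
    }
    elsif (! $in_pod) {
      push @events, { type => 'nonpod', content => $para };
    }
    elsif ($para =~ /\A[ \t]/) {
      push @events, { type => 'verbatim', content => $para };
    }
    else {
      push @events, { type => 'text', content => $para };
    }
  }

  return @events;
}

my $doc = "sub foo { ... }\n\n=begin coverage\n\nfoo\nbar\n\n=end coverage\n\n=cut\n";

for my $event (classify_pod_paragraphs($doc)) {
  my $command = defined $event->{command} ? $event->{command} : '';
  printf "%-8s %s\n", $event->{type}, $command;
}
```

With events like these, pulling "foo\nbar" out of the coverage block is a matter of collecting the text paragraphs between the begin and end commands.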

It omits two very, very big things. First of all, it ignores the content of text paragraphs. That means that I've said nothing about what F<markup> means. This is a big obnoxious problem, and I have absolutely zero interest in tackling it. Hooray for punting, right?

The second problem is that it assumes that all pod documents are sequences of paragraphs. In fact, this is true. The problem is that on top of the syntax of paragraphs, there are paragraph semantics that make this, for example, an illegal document:

=pod

=item * Isn't this simple?

=end

We have an =item outside of an =over and an =end outside of a =begin. Wait... outside? If a pod document is just a sequence of paragraphs, how does containment work?

Well, it doesn't.

It is fairly obvious that the begin and over commands set up containment. They have start commands and end commands, and anything between the two is contained "inside" the block. Unfortunately, there's a lot of implicit containment in how many pod formatters relate the document to the reader. For example, look at how the Sub::Exporter Cookbook is presented. head2 items are presented, in table of contents, as being contained by the head1 items. You'd also like to think that the text and verbatim paragraphs that occur between two head1 paragraphs are contained by the first. Unfortunately, that isn't how it works, and it isn't really clear how it should work. What items cause the end of a container? What items can contain themselves?

Again, I originally punted. Pod::Eventual just produces the sequence of events. For the things I wanted to do, however, I needed structure. I wanted to be able to make a head1 thing and put head2 or other things inside of it. (Actually, in Pod::Weaver, the technical term for these is "thingers".)

Dieter had long since abandoned his work on pod stuff, so I stole (with his blessing) the name for my pod event-to-tree transformer: Pod::Elemental. It reads in a document that contains pod and returns a sequence of roots of trees that represent the document's pod. The logic by which they're formed into a structure is contained in the Nester, and anyone can write his own nester to use whatever nesting logic he thinks makes the most sense.

Pod::Weaver uses Pod::Elemental to turn a Perl document (using PPI) into a just-Perl document and a collection of just-Pod elements. The elements are then reorganized and rewritten, in part by looking at the Perl and in part by using plugins and provided input. Dist::Zilla uses Pod::Weaver to add a name-and-abstract section, a license section, to build methods and attributes sections, and to do other stuff like that. It works very well, assuming you don't mind minor explosions while I rejigger the API every other day.

Right now, I know that I have ignored a lot of what is demanded by perlpodspec. Frankly, I intend to keep ignoring a bunch of it. My goal is to let people work with pod paragraph syntax without worrying about the syntax of paragraph content or the semantics of paragraph ordering -- until they want to. The default Pod::Elemental::Nester, for example, will barf if you try to give it an =end outside a matching =begin. Pod::Eventual, however, doesn't care.
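
As a rough picture of what a nester does, here is a sketch that folds a flat list of paragraph events into a tree and complains about a stray =end. It is an assumption-laden toy, not Pod::Elemental::Nester's real logic: nest_events and the event hashes are hypothetical, and only =begin/=end containment is handled.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Pod::Elemental::Nester's real code: treat
# =begin/=end as containers and everything else as a leaf.
sub nest_events {
  my (@events) = @_;
  my $root  = { type => 'root', children => [] };
  my @stack = ($root);

  for my $event (@events) {
    my $command = $event->{type} eq 'command' ? $event->{command} : '';

    if ($command eq 'begin') {
      # Open a new container and descend into it.
      my $node = { %$event, children => [] };
      push @{ $stack[-1]{children} }, $node;
      push @stack, $node;
    }
    elsif ($command eq 'end') {
      die "=end $event->{content} outside matching =begin\n"
        unless @stack > 1 and $stack[-1]{content} eq $event->{content};
      pop @stack;
    }
    else {
      push @{ $stack[-1]{children} }, $event;
    }
  }

  die "=begin $stack[-1]{content} never closed\n" if @stack > 1;
  return $root;
}

my $tree = nest_events(
  { type => 'command', command => 'begin', content => 'coverage' },
  { type => 'text',    content => "foo\nbar" },
  { type => 'command', command => 'end',   content => 'coverage' },
);
print scalar @{ $tree->{children} }, " top-level element\n";
print scalar @{ $tree->{children}[0]{children} }, " paragraph inside the block\n";

my $ok = eval { nest_events({ type => 'command', command => 'end', content => '' }); 1 };
print "stray =end rejected: $@" unless $ok;
```

A flat event stream never needs this check, which is why the event layer can accept documents the nester refuses.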

Pod::Elemental doesn't care about other things, though, like the magic attached to =begin (data) blocks whose identifiers begin with a colon. Why? It's just about slinging around paragraphs, not about understanding meaning.

I'm definitely planning on adding quite a bit more standards-compliance to Elemental. For one thing, I want to get =encoding hashed out and improve the interface for the element tree. Even Eventual needs some help. For example, I think it gets the definition of a blank line (which divides paragraphs) wrong, and I'd like to change how it understands the lines between =cut and a blank.

Still, though, I'm very happy with what I have and how simply I got it. I definitely would not recommend writing a pod-to-text converter using any of this code, but for writing a pod preprocessor, I've found it really great.

Monday October 20, 2008
04:40 PM

overheard in #email

Programming email can leave you a bit ... touched. Here are some recent gems from #email on irc.perl.org:

<dave0> To bounce, or not to bounce
<dave0> that is the question
<dave0> whether 'tis nobler in the mind to suffer
<dave0> the slings and arrows of outrageous NDRs
<dave0> or by opposing, end them

It's a poetry contest he wants, is it?

<rjbs> what a piece of work is mime
<rjbs> how nested in structure
<rjbs> how infinite in quirkiness
<rjbs> in form and encoding how completely execrable
<rjbs> the paragon of feature creep
<rjbs> and yet to me
<rjbs> what is the quintessence of wtf

Have we gone too far? No.

<confound> well, writing a poem about mime is pretty wtf
<dave0> I think a poem about MIME would need to be something epic and
        full of angry, capricious Norse gods
<rjbs> multipart/epicpoem
<rjbs> Das MIMErdämmerungenlied
<rjbs> stanza*1*=latin-1'q'no_no'Das=20MIMErd=E4mmerungenlied
<dave0> ; format=opera

Trust us. It's hilarious.

Saturday October 18, 2008
11:25 AM

coping with solaris cron

More and more, we're eliminating Linux boxes in favor of Solaris. This is generally not a huge deal, but one of the niggling details has been Sun's cron. It sucks. It sucks because it uses a constant as the subject of its alert messages. If you have a lot of servers running a lot of cron jobs, generating a lot of output, you end up with a display that looks like this:

1 N Oct17 Super-User      (  7)   Output from "cron" command
2 N Oct17 Super-User      (  7)   Output from "cron" command
3 N Oct17 Super-User      (  7)   Output from "cron" command
4 N Oct17 Super-User      (  7)   Output from "cron" command
5 N Oct18 Super-User      (  7)   Output from "cron" command
6 N Oct18 Super-User      (  7)   Output from "cron" command
7 N Oct18 Super-User      (  7)   Output from "cron" command
8 N Oct18 Super-User      (  7)   Output from "cron" command
9 N Oct18 Super-User      (  7)   Output from "cron" command
10 N Oct18 Super-User      (  7)   Output from "cron" command
11 N Oct18 Super-User      (  7)   Output from "cron" command

Seriously?

So, I put in a change request to have this fixed. Deploying Vixie cron was going to be a massive pain (I was told) so instead we updated our use of puppet to ensure that our cronjobs were run by a wrapper script. I'm really happy with it, as it eliminates a few other crappy wrapper scripts and gets me what I wanted to begin with. There are a few internal modules used below, but it should be trivial to replace them with whatever you want. (I've removed a few constants, too.)

Maybe I'll CPANize this later.

#!/usr/bin/perl
use strict;
use warnings;

use Digest::MD5 qw(md5_hex);
use Fcntl qw(:flock);
use Getopt::Long::Descriptive;
use ICG::SvcLogger;
use IPC::Run3 qw(run3);
use String::Flogger qw(flog);
use Sys::Hostname::Long;
use Text::Template;
use Time::HiRes ();

my ($opt, $usage) = describe_options(
  '%c %o',
   [ 'command|c=s',   'command to run (passed to ``)', { required => 1 } ],
   [ 'subject|s=s',   'subject of mail to send (defaults to command)'    ],
   [ 'rcpt|r=s@',     'recipient of mail; may be given many times',      ],
   [ 'errors-only|E', 'do not mail if exit code 0, even with output',    ],
   [ 'sender|f=s',    'sender for message',                              ],
   [ 'jobname|j=s',   'job name; used for locking, if given'             ],
   [ 'lock!',         'lock this job (default: lock; --no-lock to not)',
                      { default => 1 }                                   ],
);

die "illegal job name: $opt->{jobname}\n"
  if $opt->{jobname} and $opt->{jobname} !~ m{\A[-a-z0-9]+\z};

my $rcpts   = $opt->{rcpt}
           || [ split /\s*,\s*/, ($ENV{MAILTO} ? $ENV{MAILTO} : '...') ];

my $host    = hostname_long;
my $sender  = $opt->{sender} || sprintf '%s@%s', ($ENV{USER}||'cron'), $host;

my $subject = $opt->{subject} || $opt->{command};
   $subject =~ s{\A/\S+/([^/]+)(\s|$)}{$1$2} if $subject eq $opt->{command};

my $logger  = ICG::SvcLogger->new({
  program_name => 'cronjob',
  facility     => 'cron',
});

my $lockfile = sprintf '.../cronjob.%s', $opt->{jobname} || md5_hex($subject);

goto LOCKED if ! $opt->{lock};

open my $lock_fh, '>', $lockfile or die "couldn't open lockfile $lockfile: $!";
flock $lock_fh, LOCK_EX | LOCK_NB or die "couldn't lock lockfile $lockfile";
printf $lock_fh "running %s\nstarted at %s\n",
  $opt->{command}, scalar localtime $^T;

LOCKED:

$logger->log([ 'trying to run %s', $opt->{command} ]);

my $start = Time::HiRes::time;
my $output;

$logger->log_fatal([ 'run3 failed to run command: %s', $@ ])
  unless eval { run3($opt->{command}, \undef, \$output, \$output); 1; };

my %waitpid = (
  status => $?,
  exit   => $? >> 8,
  signal => $? & 127,
  core   => $? & 128,
);

my $end = Time::HiRes::time;

unlink $lockfile if -e $lockfile;

my $send_mail = ($waitpid{status} != 0)
             || (length $output && ! $opt->{errors_only});

if ($send_mail) {
  require Email::Simple;
  require Email::Simple::Creator;
  require ICG::Sendmail;
  require Text::Template;

  my $template = do { local $/; <DATA> };
  my $body     = Text::Template->fill_this_in(
    $template,
    HASH => {
      command => \$opt->{command},
      output  => \$output,
      time    => \(sprintf '%0.4f', $end - $start),
      waitpid => \%waitpid,
    },
  );

  my $subject = sprintf '%s%s',
    $waitpid{status} ? 'FAIL: ' : '',
    $subject;

  my $email = Email::Simple->create(
    body   => $body,
    header => [
      To      => join(', ', @$rcpts),
      From    => qq{"cron/$host" <$sender>},
      Subject => $subject,
    ],
  );

  ICG::Sendmail->sendmail(
    $email,
    {
      to      => $rcpts,
      from    => $sender,
      archive => undef,
    }
  );
}

__DATA__
Command: { $command }
Time   : { $time }s
Status : { join('', flog('%s', \%waitpid)) }

Output :
{ $output || '(no output)' }
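
For anyone adapting the script, the %waitpid hash is the usual way of picking apart $? after a child process finishes. A small self-contained sketch of the same decoding, where decode_wait_status is a hypothetical helper name that is not part of the script above:

```perl
use strict;
use warnings;

# Decode a wait status the same way the %waitpid hash above does.
sub decode_wait_status {
  my ($status) = @_;
  return (
    status => $status,
    exit   => $status >> 8,     # exit code of the child
    signal => $status & 127,    # signal that killed it, if any
    core   => $status & 128,    # nonzero if it dumped core
  );
}

# $^X is the running perl, so this needs no external command.
system($^X, '-e', 'exit 3');
my %waitpid = decode_wait_status($?);

printf "exit=%d signal=%d core=%d\n", @waitpid{qw(exit signal core)};
```

A nonzero status from any of the three fields is what makes the wrapper send mail and prefix the subject with FAIL.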

Friday October 17, 2008
09:06 PM

american express sends me mixed messages

In general, I am a very, very happy cardholder. Just recently, when my EVDO modem died, American Express paid for me to replace it with nearly no questions asked. That saved me about $250, since the modem had just gone out of warranty. That pays for over half my annual membership. They also paid for some MacBook repairs earlier this year, which was a real plus.

Then again, I just got a letter that the "domestic companion airfare program" has been discontinued. This was the program that said that when flying within the country, I could get a free ticket when purchasing one at their normal prices. When going to YAPC in Chicago, this saved us about $400. It's really the reason I decided to upgrade to a platinum card.

Now, instead of being able to travel for less, I have to hope for my electronics to fail to recoup the cost of membership.

It's annoying to lose such a great program, but... I still really like my American Express.

Thursday October 16, 2008
09:19 PM

annoying things learned about perl today

App::Cmd::Tester lets you test that an App::Cmd program output the right things to standard error and standard out, and did so in the right order. It does stuff Test::Output can do, but also just a bit more.

I had a need to generalize this earlier today, and ran into a bunch of obnoxious problems. Most of these center around the "it's hard to get synchronized but separate output from a spawned program's stderr and stdout" problem. Others were less well-known, at least to me.

For example, I was doing something like this:

tie my $scalar, 'Tie::Class', { common => \$string, private => \$other };

I had a number of things like this, and I thought I'd be able to write:

tie my $scalar, $object->tie_args;

This didn't work, though, because the tie_args method gets called in scalar context, so it evaluated to the hash reference, which of course had no TIESCALAR method.
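
Here is a small reproduction of that context problem. Demo::Tie and this tie_args are stand-ins written for the example, not the real code: the method returns a class name and an argument, and wantarray records how it was called.

```perl
use strict;
use warnings;

package Demo::Tie;   # hypothetical tie class, for illustration only
sub TIESCALAR { my ($class, $arg) = @_; return bless { arg => $arg }, $class }
sub FETCH     { $_[0]{arg} }
sub STORE     { $_[0]{arg} = $_[1] }

package main;

my $context;
sub tie_args {
  $context = wantarray ? 'list' : 'scalar';
  return ('Demo::Tie', 'some argument');
}

# tie evaluates its second operand in scalar context, so only the last
# element of the returned list survives and the class name is lost.
my $ok = eval { tie my $broken, tie_args(); 1 };
print "tie_args was called in $context context\n";
print "tie failed: $@" unless $ok;

# Unpacking in list context first keeps both values.
my ($class, @args) = tie_args();
tie my $scalar, $class, @args;
print "tied value: $scalar\n";
```

Unpacking into variables first is the least surprising workaround when the class name and arguments have to come from one call.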

Instead, I ended up doing something possibly better anyway:

tie my $scalar, $object, $argument;

The object has a TIESCALAR method that proxies to the correct class with the correct arguments. I'm pretty happy with that.
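
A minimal sketch of that proxy arrangement, assuming two made-up classes (Demo::Factory and Demo::Scalar) rather than the real IO::TieCombine internals. The object handed to tie receives the TIESCALAR call and forwards it to the class that implements the tie, adding a reference to a buffer shared by every scalar it creates.

```perl
use strict;
use warnings;

package Demo::Scalar;     # hypothetical class that implements the tie
sub TIESCALAR { my ($class, %arg) = @_; return bless { %arg }, $class }
sub FETCH     { ${ $_[0]{buffer} } }
sub STORE     { ${ $_[0]{buffer} } .= $_[1] }   # append to the shared buffer

package Demo::Factory;    # hypothetical object that tie is pointed at
sub new {
  my ($class) = @_;
  my $buffer = '';
  return bless { buffer => \$buffer }, $class;
}

# tie calls TIESCALAR on whatever it is given, so the object can proxy
# to the real class with the correct arguments.
sub TIESCALAR {
  my ($self, $name) = @_;
  return Demo::Scalar->TIESCALAR(buffer => $self->{buffer}, name => $name);
}

package main;

my $factory = Demo::Factory->new;
tie my $out, $factory, 'stdout';
tie my $err, $factory, 'stderr';

$out = "hello ";
$err = "world";
print "combined: ${ $factory->{buffer} }\n";
```

Both tied scalars write into the same buffer in the order the assignments happen, which is the property needed to check that output arrived in the right order.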

The other problem is this:

tie my $scalar, 'Some::Tie';
open my $fh, '>', \$scalar or die "failed to open ref: $!";
print $fh "this is a test" or die "failed to print: $!";

This code just doesn't do what I mean. It doesn't raise any exceptions, but the STORE method is never called on the tie. This seems, to me, like a bug in perl. I'll need to investigate.

The code I was working on has been uploaded as IO::TieCombine.

08:01 AM

lies, lies, lies (about curious george)

From Zap2It's "premise of 'Curious George'":

Premise of Curious George
Curious George is a sweet African monkey who can't help but run into trouble. Mr. Renkins, who George calls 'The Man in the Yellow Hat' tries very hard to care for George and is always saving the day. The show's themes are about learning, forgiveness and playful curiosity.

George is not a monkey. He does not have a tail. That, however, is a mistake made long ago by the original translator of Curious George.

The more appalling mistake is identifying The Man with Mr. Renkins. Mister Renkins is The Man's neighbor at his country house. He is a farmer, and is married to Mrs. Renkins. There is no reason to believe, as far as I know, that there is any relationship between the Renkinses and The Man.

The recent movie version of Curious George assigned a name to The Man, but I will not repeat it here, as I find the very idea offensive, and do not wish for anyone else to be burdened by knowing what some moron thinks George's Friend's name is.

Time for Curious George. He's going to make a lemonade stand and fix some traffic lights.

Tuesday October 14, 2008
09:18 PM

another unproductive complaint about subversion

I remember in 2005 or so when I first started using Subversion, I liked it so much. It was much easier to use than CVS. Everyone said it would make tagging and branching easier than CVS. In CVS, tagging was fine, but branching was such a pain that I never bothered.

Eventually, I found out that branching and merging were much easier, but still a real pain. Tagging, though, was completely insane. Tags were implemented as copies (just like branches). This sort of made sense as a cheap way for branches to work, but made no sense for tags. Tags are labels for points in time in a repository. They shouldn't be mutable, unless maybe to let you remove a label from rev 1 and put it on rev 2.

Because they're implemented as copies, you can actually go in and alter the state of a tag, meaning that tags are useless as ... tags. It also means that if you have a standard Subversion repository with trunk, branches, and tags directories, and you check out the whole thing, you check out absolutely every file in every revision. "Copies are cheap" was a big Subversion mantra back in the day, because in the repository, only files that changed were new files on disk -- but that only goes for what's in the repository, not your checkout. In your checkout, every copy of readme.txt is its own file -- and it has to be, because even the tags are mutable. You can't say that ./tags/1.000/readme.txt is the same file as ./tags/2.000/readme.txt just because there was no change between the two releases, because you could go change either of them, and if you do, you'd change both. Oops!

This came up today because of a piece of automated deployment code that did something like this:

$ mkdir TEMP
$ cd TEMP
$ svn co $REPO/project
$ cd project/trunk
$ bump-perl-version
$ cd ..
$ svn cp trunk tags/$NEW_VERSION
$ svn ci -m "bump and tag $NEW_VERSION"

Checking out requires a whole lot of space, because it has to check out every single tag's copy of every file. Tagging the new release is also fairly space hungry. How hungry? Well... the project I'm working on right now is a web application. Let's call it New-Webapp.

If I export a copy of trunk from Subversion, getting just the files that make up the latest version of the application, it's 1.9 megabytes.

If I check out a copy of the trunk, which adds all the extra working-copy files, it's 5.2 megabytes.

If I check out the whole repo, getting every tag and branch (for your information, there is exactly one branch), it's 207 megabytes.

Now, keep in mind that this gives me every file from every tagged release (there are 40 releases). This does not give me the entire revision history. There are many, many revisions missing. After all, what I have is basically 42 revisions: 40 releases, trunk, and one branch. That's it.

If I use git-svn to build a git repository of the project, meaning that I have absolutely every revision, every tag, and every branch, it's 249 megabytes. That's all 1149 revisions.
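
Putting those figures side by side, using only the numbers quoted above (the variable names are just for this calculation):

```perl
use strict;
use warnings;

# Figures quoted above, in megabytes.
my %size = (full_checkout => 207, git => 249);
my $trees     = 42;      # 40 tagged releases + trunk + one branch
my $revisions = 1149;    # everything git-svn imported

printf "svn full checkout: %.1f MB per tree (%d trees)\n",
  $size{full_checkout} / $trees, $trees;
printf "git-svn clone:     %.2f MB per revision (%d revisions)\n",
  $size{git} / $revisions, $revisions;
printf "git holds %.0fx as many revisions in %.1fx the space\n",
  $revisions / $trees, $size{git} / $size{full_checkout};
```

The per-tree figure for the full checkout comes out close to the size of a single trunk checkout, which is what you'd expect if every tag is a complete working copy.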

I am so ready to be done with Subversion.