All the Perl that's Practical to Extract and Report

Purdy (2383)

Purdy
  reversethis-{ofni.ydrup} {ta} {nosaj}
http://purdy.info/
AOL IM: EmeraldWarp
Yahoo! ID: jpurdy2

Bleh - not feeling creative right now. You can check me out on PerlMonks [perlmonks.org].

Journal of Purdy (2383)

Thursday July 30, 2009
07:40 AM

Wow - perlmonks hacked - passwords exposed!

If you haven't heard the news, perlmonks was 0wn3d pretty majorly. The hackers were able to get access to the box and reap usernames and passwords (which were stored in cleartext -- FAIL!).

It's also interesting to see that even elite Perl people aren't immune to insecure passwords, looking at the list (scroll down to find 'perlmonks').

Oh well - time to pick a new password & go around and update. This is when it's useful to have a Google Notebook of sites & password hints (not the actual passwords themselves ... that would be stupid, wouldn't it, perlmonks developers?) to know what sites you need to update.

Peace,

Jason
Saturday January 31, 2009
11:23 AM

Ouch

I woke up this morning and wanted to find something that would change my Windows wallpaper automatically, randomly and at regular intervals from my Picasa photo library. So what to use ... what to use ... Google? But to my surprise, every site is flagged as malware ... even Google.

Talk about serious egg-on-your-face. The only thing that could have made it worse is if it had happened on a weekday.

Looks like others have noticed this, too.

Peace,

Jason

Friday January 02, 2009
10:29 PM

A new way to test (to me?)

It's a shame all of my code can't reside in the tidy confines of my Perl libraries - from time to time, I have to write JavaScript for a better browser experience. If you haven't heard, Yahoo's YUI has a testing framework (YUITest) that is pretty neat and I'm now using it (vs. Selenium).

Granted, I wasn't heavily invested w/ Selenium, so switching wasn't really that much of a hurdle. But YUITest is pretty cool stuff. I will tell you that it's no Test::More. It doesn't have cmp_ok, so you have to program around it a bit. And you have to program in JavaScript, itself.

What I found as a neat approach is to write the self-contained YUITest code in a separate file in a template directory (i.e. selectTest.yuit).

Then in my app code, I pass along a conditional template parameter that will include that .yuit template contents and then tie that to a query parameter ... i.e.

http://www.example.com/myApp.cgi

And that'll run the app as usual. Then if I append ?t=1, then my app will suck in the .yuit file, which brings in the YUITest framework and runs it. You could get more advanced and automate it and use the framework's reporting functionality to make things even more robust.
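The toggle described above might look roughly like this on the Perl side. This is only a minimal sketch in core Perl, not the actual app code: the `render_page` helper, the `__YUITEST__` placeholder and the inline page markup are all made up for illustration. Only the `t` parameter and the `selectTest.yuit` file name come from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the page HTML, pulling in the .yuit test
# file only when the request carries a true "t" parameter.
sub render_page {
    my ( $params, $template_dir ) = @_;

    # Stand-in for the real page template (assumption for this sketch).
    my $html = '<html><body><select id="pick"></select>__YUITEST__</body></html>';

    my $tests = '';
    if ( $params->{t} ) {
        my $file = "$template_dir/selectTest.yuit";
        open( my $fh, '<', $file ) or die "Cannot open $file: $!";
        local $/;    # slurp mode: read the whole file in one go
        $tests = <$fh>;
        close( $fh );
    }

    # Normal requests substitute an empty string, so the page is unchanged.
    $html =~ s/__YUITEST__/$tests/;
    return $html;
}
```

Calling `render_page( { t => 1 }, '/path/to/templates' )` returns the page with the test code included, while `render_page( {}, ... )` returns the plain page. Keeping the switch in one place means the test code never ships to regular visitors unless they ask for it.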

Peace,

Jason

Friday March 28, 2008
04:00 PM

Robots.txt Tip + Webinale.de

Knocking off some of the dust here... wanted to share two quickies:

I finally figured this out, and it may be beneath you, but let's say you have a web document root that's shared between ports 80 and 443 (iow, http and https go to the same place). Then your site gets spidered by the search engines and they put a bunch of your stuff in a supplemental index because it's redundant. Since there's only one robots.txt file, you can't easily say IF https, then go away w/o saying the same thing to the http version. So what do you do? Create a robots-ssl.txt file and then in your ssl apache configuration, use Alias:

Alias /robots.txt /path/to/robots-ssl.txt

Then http://www.example.com/robots.txt and https://www.example.com/robots.txt have two different contents, while sharing the same web root directory!

You probably already knew this, didn't you?

Ok, next up ... in more exciting news, I'm speaking at webinale.de! I submitted 6 talks, 5 technical and 1 marketing ... and wouldn't you know it? The marketing talk is approved. So I'll be speaking on SEO. I've been listening & learning German via the Deutsche Welle podcasts and I'm lurking on #perlde to pick up reading & writing German, too. Thankfully, I'll be able to do my presentation in English (I think it would be torture to listen to my German ;)).

Peace,

Jason

PS: How ironic (ok, English nerds, coincidental ;)) is it that I'm listening to Daft Punk atm?

Thursday November 29, 2007
09:39 AM

The next thing CPAN needs...

I had a recent experience that gave me pause (pun intended!) and inspiration for a helpful CPAN tool: a categorical tabulation of module popularity.

My example: about a year ago (maybe more), I first dabbled into AJAX and went looking for a module that would import/export native Perl objects in JSON format. So a CPAN search pointed me to the JSON module.

Fast forward to yesterday, where I'm pointed to JSON::Syck, which fits my simple needs, but more importantly (and objectively), is faster & more memory efficient.
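For anyone who hasn't done the Perl-to-JSON round trip, here's a minimal sketch of what "import/export native Perl objects" means. It uses JSON::PP rather than either module named above, purely because it ships with recent Perls (that's an assumption about your install), and the data is made up.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical;

# Made-up sample data, just for illustration.
my $data = { name => 'example', tags => [ 'perl', 'ajax' ], count => 2 };

my $text = $json->encode( $data );    # Perl structure -> JSON text
my $back = $json->decode( $text );    # JSON text -> Perl structure

print "$text\n";
print "first tag: $back->{tags}[0]\n";
```

The other JSON modules expose the same idea under different function names, which is exactly why it's hard to pick one from a search result alone.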

So I wish there was something where I could search for JSON, CSV, DBI, CGI, etc and then search.cpan.org would recognize that as a category and present objective data that would better direct me (and other developers) to the most popular (and often best) selection.

An initial objection would be flamewars, but if we kept it to pure numbers and tied it to BitCard logins, the numbers would be objective themselves. Of course, people will probably want to make comments and that's where it could get awkward.

Another objection might be that something already exists, whether it be the rating system or this wiki page, but neither fits this need, IMO, basically because a sense of popularity isn't thrown behind the modules.

Another objection might be upstart modules would find it harder to be adopted, but if we allow people to change their "allegiance", upstart modules' new votes would become more substantial quickly.

Maybe even simpler would be a download tracker in CPAN to tell how many times a CPAN module has been downloaded/installed. Then put those #'s in the search results.

Now who's going to put the perspiration behind the inspiration? ;)

Peace,

Jason

PS: I highly recommend Daft Punk Alive 2007 - score it for $9 in MP3 format. It's awesome coding music!

Thursday September 13, 2007
09:11 AM

Math is hard...

especially when it involves dates & times. This random musing comes from upgrading CAP::Session and remembering the pain of writing tests for session_delete.

At this point in our lives, time is intuitive. Computers don't come pre-installed with the human experience, much like children. My oldest daughter (age 3) is currently stuck on everything in the past having happened "yesterday", regardless of whether it actually was yesterday or last month or two minutes ago.

Major props to those CPAN modules that get it right (DateTime being my current favorite) ... it's harder than you think.

I'm not suggesting that those of you who are childless rush out and get a child, but children do bring a real-life analogy to what your computer is like. Now I just need to finish this Potty_Training 2.0 upgrade.

Peace,

Jason

Wednesday January 24, 2007
01:37 PM

This post brought to you by ...

ActiveState has released version 4.0 of their Komodo IDE, which supports multiple languages (Perl, PHP, etc), including TemplateToolkit. It has tons of other features, including vi key bindings and extensive configuration options to make it bend to your will. I've been playing around with the betas for the last few months and the new excitement for me is the capability to edit files remotely over SSH.

They also have XPI support, so we could develop add-ons much like those for Firefox & Thunderbird.

I was surprised that my fellow folks on #cgiapp haven't heard about it, so I wanted to share its goodness here. They also have a free version with some functionality stripped out. If you're using some other editor (Crimson Editor, jEdit, etc), Komodo (Edit) is definitely worth your replacement consideration.

Also to share my latest webdev bounty, make sure you have Firebug. There was a great article in Dr. Dobb's about how it can be used and I've found it priceless when trying to debug JavaScript & CSS/layout stuff.

Anyway, enough shilling for now. ;)

Peace,

Jason

Tuesday October 10, 2006
08:38 AM

Perl needs (more) evangelism

I was having lunch with a programmer friend of mine, who does his work in .NET (C#, I believe) and we got into another 'Why Perl? Why .NET?' diatribe, which really went nowhere[1]. The sticking point to me was that while Perl is a great language/platform to immerse yourself in, the cool/new stuff leaves Perl behind.

This idea was reinforced by a recent Slashdot story, where an aspiring student picked great programmers to ask questions, but Larry Wall didn't make the cut. Not that Larry isn't great, but that Perl doesn't have the mindshare such that it made the student's list. Hopefully, Larry didn't get the email & ignore it. ;)

Topcoder is a neat site where programmers can compete, but they only support Java, .NET and C++, not Perl.

Google has code competitions which include Python, but not Perl. They have a neat Desktop system you can develop on, but not in Perl.

You can develop extensions for Firefox/Thunderbird, but not in Perl.

I'm probably not saying anything that hasn't already been said, but I'm worried about being the guy scrounging for jobs when I'm 50 and too set in my ways to learn yet another language, when all these cool/new things are the now/then standard.

We need to get Perl embedded into these cool/new things so that we never have to leave the comfy confines of the language to not only get the job done, but do some cool stuff, too.

Peace,

Jason

[1]: This leads me to yet another lesson I've learned - you learn more from listening than talking. There is no real truth that can win an emotional/instinctual/behavioral/spiritual argument. Watching Pudge & Ovid go at it reinforces that lesson. ;)

Thursday September 28, 2006
08:45 AM

Free lesson for you...

I have been working pretty hard on a work project for about a month now (off & on and for the last week, mostly on) and I've come to a realization that perhaps most of you already have.

For the impatient, here's the lesson up-front:

When tasked with importing data from an external source, consider importing it into a separate db/table and then building/extending the necessary functionality on top of that (versus importing the data right into your existing data).

We run circulation data for two magazines, both on systems we built ourselves. It was decided to outsource circulation of one of these to another vendor. Several months later, I was asked to import their data for some functionality that the vendor doesn't provide. As it turns out, there are some similarities between the schemas of their data and ours, but mostly differences. A big one being when a user renews their subscription, I treat that as a separate subscription and they treat that as an extension of their existing subscription.
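To make the lesson concrete, here's a rough sketch of the "separate table plus a layer on top" idea in plain Perl. Everything in it is hypothetical: the field names, the `terms` count and the `local_view` adapter are invented for illustration and are not the real schemas. The point is only that the vendor rows stay untouched in their own structure and get translated into the local shape (one record per renewal) on the way out.

```perl
use strict;
use warnings;

# Hypothetical vendor rows, kept as-is in their own table/structure.
# The vendor treats a renewal as an extension of one subscription,
# modelled here as a simple count of terms.
my @vendor_subscriptions = (
    { subscriber => 'A100', terms => 3 },    # renewed twice
    { subscriber => 'B200', terms => 1 },    # never renewed
);

# Adapter: present vendor data in the local shape (one record per term)
# without ever writing it into the existing subscriptions data.
sub local_view {
    my @rows;
    for my $vendor_sub (@_) {
        for my $term ( 1 .. $vendor_sub->{terms} ) {
            push @rows,
              {
                subscriber => $vendor_sub->{subscriber},
                term       => $term,
                source     => 'vendor',
              };
        }
    }
    return @rows;
}

my @rows = local_view( @vendor_subscriptions );
printf "%s term %d (%s)\n", @{$_}{qw(subscriber term source)} for @rows;
```

If the mapping turns out to be wrong, only the adapter changes and the imported data can be reloaded from scratch, which is much harder once vendor rows are mixed into your own tables.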

Anyway, like I said, I've been working hard and feel like I'm currently at 85 or 90% completion, but to nail the final non-conformities would require user-specific code, which makes my eyes bleed when I'm already facing some messed-up code (5 or 6 main IF branches and a few places that could be re-factored).

Perhaps this is when I should move to logic programming vs. functional programming, but I still don't have my head around that one. :(

Perhaps also, I'm just at a point that all programmers reach when they've invested too much time in a project and question "WHY?" and I should just buckle down and knock out the remaining 10-15% of cases.

Cheers,

Jason

Tuesday August 15, 2006
09:14 AM

Codestorm

I recently had a situation come up where I had to whip up some code to split up a huge (1 GB) mbox file. I KNOW I should be using Maildir, but c'mon, people ... it's what Debian does by default and I don't spend my time sysadmin'ing stuff. In looking around, I couldn't believe others hadn't already done this (perhaps they have and my Google-fu just wasn't adequate). There was a promising git-mailsplit program, but I couldn't find it in Debian.

So I whipped this up - feel free to use/tweak this for your own use:


#!/usr/bin/perl -wT

# Process:
# 1) cp /var/mail/person /var/mail/person.bak
# 2) Run this script
# 3) chmod/chown the INBOX.GigSplitNN files
# chown person:users /home/person/INBOX.GigSplit*
# chmod 0600 /home/person/INBOX.GigSplit*
# 4) mv /var/mail/person /var/mail/person.prerm
# 5) mail the person and see if the /var/mail/person gets setup right
# 6) diff /var/mail/person.bak and /var/mail/person.prerm and put that in /var/mail/person
# i ended up just tailing the file with the right number of differing lines
# and >>'ing that into /var/mail/person
# b/c diff'ing two 1GB files takes WAY too long!

use strict;

open( MBOX, '/var/mail/person.bak' ) || die "Cannot open person.bak: $!";

# go through the mbox file
my $message = '';
my $line_count = 0;
my $message_count = 0;
my $file_base = '/home/person/INBOX.GigSplit';
my $file_i = 1;
my $line_count_limit = 580000; # this ends up with ~40MB files, which are more tolerable
my $need_to_write_init = 1;

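while ( my $line = <MBOX> ) {
        $line_count++;
        # A "From " line starts a new message, so write out the previous one
        if ( $line =~ /^From / && length( $message ) > 0 ) {
                append_message( $message );
                $message = '';
        }
        $message .= $line;
}

# The loop only writes a message when it sees the next "From " line,
# so the last message in the mbox has to be flushed explicitly.
append_message( $message ) if length( $message ) > 0;

close( MBOX );

print "All done!\n";

# Append one message to the current split file, rolling over to the
# next file once the line count limit has been passed.
sub append_message {
        my $msg = shift;
        $message_count++;
        my $file = $file_base . sprintf( "%02d", $file_i );
        print "Got message # $message_count - appending to $file ...\n";
        if ( $need_to_write_init ) {
                write_initial_msg( $file );
                $need_to_write_init = 0;
        }
        open( SPLIT, ">>$file" ) || die "Cannot append to $file: $!";
        print SPLIT $msg;
        close( SPLIT );
        if ( $line_count > $line_count_limit ) {
                print "Line Count exceeded $line_count_limit, so incrementing \$file_i...\n";
                $file_i++;
                $line_count = 0;
                $need_to_write_init = 1;
        }
}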

sub write_initial_msg {
        my $file = shift;
        open( FILE, ">$file" ) || die "Cannot open $file to put in initial msg: $!";
        print FILE <<"_EOF_";
From MAILER-DAEMON Mon Aug 14 13:00:31 2006
Date: 14 Aug 2006 13:00:31 -0400
From: Mail System Internal Data <MAILER-DAEMON\@mail.example.com>
Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
Message-ID: <1155574831\@mail.example.com>
X-IMAP: 1134739889 0000025473
Status: RO

This text is part of the internal format of your mail folder, and is not
a real message. It is created automatically by the mail system software.
If deleted, important folder data will be lost, and it will be re-created
with the data reset to initial values.

_EOF_
        close( FILE );
}

So that will create INBOX.GigSplit01 ... INBOX.GigSplitNN, which my user could manage with Squirrelmail (I had to hack /home/person/.mailboxlist to add those new folders). Since the problem stemmed from a checkbox in her email client keeping old messages on the server and not removing them, she could simply delete a lot of the stuff as redundant and just look at the more recent messages for stuff she missed. Remotely accessing a 1GB mbox file tends to timeout. ;)

Yes, I KNOW that could be optimized and probably even done in one line (go for it, golfers!) ... it was something I had to do and it wasn't too painful to run (6700 msgs in 2 minutes).

That's just the way I roll!* ;)

Speaking of coding, Google has their Code Jam going on, but where's the love for Perl? You can program in C++, C#, Java, Python and VB.NET, but not Perl. It probably has to do with what TopCoder supports, but something should really be done to get Perl in that list, for longevity's sake.

Peace,

Jason

* = My new favorite saying