
Journal of LTjake (4001)

Thursday June 03, 2010
10:33 AM

File::HomeDir 0.91 data directory changes

After performing a routine upgrade of my CPAN modules on my $work machine, I noticed that I was prompted to configure CPAN for the first time after I attempted to load the CPAN shell again.

Obviously I had already done all of that so I had to figure out what had changed. I remembered having installed only a couple new modules, including File::HomeDir.

Browsing its Changes file shows:

Moving the FreeDesktop driver to prod

My operating system is Ubuntu 10.04, which apparently means that File::HomeDir will use these new FreeDesktop rules. The big change is that the default data dir is no longer your home directory, but ~/.local/share/

The simple fix was to move my .cpan dir to ~/.local/share/. From then on, everything worked as per usual. This will also affect any other apps that use File::HomeDir to locate your data dir, including Padre.
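
For illustration, here is a minimal sketch of how a FreeDesktop-style data directory is usually resolved. It follows the XDG Base Directory convention ($XDG_DATA_HOME if set, otherwise ~/.local/share). The helper name xdg_data_home is hypothetical and not part of File::HomeDir, whose own logic may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT File::HomeDir's API): resolve the per-user data
# directory the way the XDG Base Directory convention describes it.
sub xdg_data_home {
    my ($env) = @_;

    # $XDG_DATA_HOME wins when it is set and non-empty ...
    return $env->{XDG_DATA_HOME}
        if defined $env->{XDG_DATA_HOME} && length $env->{XDG_DATA_HOME};

    # ... otherwise fall back to ~/.local/share under the user's home.
    my $home = defined $env->{HOME} ? $env->{HOME} : '';
    return "$home/.local/share";
}

# Show where a .cpan directory would be expected under these rules.
print xdg_data_home( \%ENV ), "/.cpan\n";
```

Passing the environment in as a hash reference keeps the helper easy to try with different settings without touching the real %ENV.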

Wednesday February 03, 2010
04:04 PM

Project Stepping Stones

Ever looked back at something you've worked on and thought: "Gee, it's too bad that project didn't get to its ultimate goal, but I've learned a lot from it"? I have one of those projects. Such is the world of technology: your toolset is constantly evolving and shape-shifting; even "The Next Big Thing" can become obsolete. We move on to the next "Next Big Thing."

One such area in the Web Developer's toolset is "Search." I'm sure we can all relate to the experience of our first textbox with some behind-the-scenes code doing SQL "SELECT ... LIKE" statements. Perhaps at first it was raw DBI calls; maybe moving on to an abstraction layer (ORM and whatnot) shortly thereafter. Here's where things get interesting.

What happens when this is no longer Good Enough(tm)? With Google being essentially ubiquitous, people expect to plunk some words in the box and magically get what they want out the other side. I put in "Cat Hat" -- why didn't it give me "Cat in the Hat"? Okay, no problem. We can do some field and query normalization: remove stopwords, add term parsing ... wait, wait, wait. There has to be prior art for this.
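
To make the "Cat Hat" problem concrete, here is a rough sketch of the kind of home-grown normalization described above. The stopword list and the helper names (normalize_terms, matches) are made up for illustration; a real search engine does far more than this.

```perl
use strict;
use warnings;

# Illustrative stopword list; a real one would be much longer.
my %STOPWORD = map { $_ => 1 } qw(a an and in of the);

# Lowercase the text, split on non-word characters, and drop stopwords.
sub normalize_terms {
    my ($text) = @_;
    return grep { length($_) && !$STOPWORD{$_} } split /\W+/, lc $text;
}

# True (1) when every normalized query term occurs in the normalized title.
sub matches {
    my ( $query, $title ) = @_;
    my %in_title = map { $_ => 1 } normalize_terms($title);
    my @terms    = normalize_terms($query);
    return 0 unless @terms;
    return ( grep { !$in_title{$_} } @terms ) ? 0 : 1;
}

print matches( 'Cat Hat', 'Cat in the Hat' ) ? "match\n" : "no match\n";
```

With the stopwords removed, both the query and the title reduce to the terms "cat" and "hat", so the lookup succeeds where a plain SQL LIKE '%Cat Hat%' would not.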

In 2004, the options are somewhat limited as far as Free/Open Source search software goes, especially in Perl land. Swish-e looks pretty neat. We actually did some prototyping with it. It was definitely a step ahead of plain old SQL. Plucene came on the scene. Unfortunately, its poor performance was a bit of a non-starter for us. The fact that it was modeled after the Lucene Java library, however, caught my eye.

I wanted to harness their project and its community, and bring it into our little Perl world. Luckily for me, someone else had already started down that road. The Lucene Web Service was a project by Robert Kaye, sponsored by CD Baby, which allowed users to talk to Lucene via an XML-based web service. After using it for a while, we developed some patches for bug fixes and enhancements. Because of our momentum with the project, we were eventually given total control over its development.

We attempted to strengthen the project by hooking into some existing standards. We leveraged the Atom Publishing Protocol as an analogy for dealing with indexes and documents. Search results were returned as an OpenSearch document. A document's field-value pairs were listed in the XOXO microformat. Creating a client for this setup meant a bunch of glue between the existing components (XML::Atom::Client and WWW::OpenSearch).

Almost in parallel, the Solr project emerged. Similar idea, much more support behind it. In the end, our idea never got very far, and Solr has turned out to be a fabulous product -- which we now use.

To this end, the Lucene WebService website will (finally) be shutting down in about a week's time. I've moved the pertinent code and wiki data to github in case anyone wants it. I still think it has some niche applications, but without some serious revamping of the java code, it will likely just rot.

At least it's a project that has led me to bigger and better things.

Friday January 15, 2010
09:47 AM

New Padre-Plugin-PerlTidy release

Padre 0.54 introduced a couple of project-specific settings, one for Perl::Critic and another for Perl::Tidy. As the maintainer of the Perl::Tidy plugin, it was only natural that I should implement support for this new feature.

Unfortunately, it wasn't immediately obvious to me how I might get at this info. With Adam's guidance, I was able to write the following:

my $tidyrc = $self->current->document->project->config->config_perltidy;

It's a mess of chained method calls, but it's pretty clear that we're getting the current document's project-related config, if it exists.

Here's a screenshot of the new version of the plugin in action.

[Image]

Grab version 0.09 from CPAN now.

Friday December 18, 2009
02:05 PM

The last five months

I haven't bothered to post anything in the last five months. With Christmas just around the corner, I figure this is as good a time as any to play catch up.

Padre-Plugin-PerlTidy

  • A new release of this plugin -- changes made mostly by other people.

Gedcom-FOAF

  • Rather than using a base url for the data, you can now specify a number of url templates. This makes the module actually useful. Thanks to Chris Prather for working through this with me.

Geo-IPfree

  • A couple of releases with various refactoring bits and bug fixes. The folks at software77.net now produce a data file specifically for this module. I will ship an updated copy with every release. Refactoring this code has been pretty satisfying, though there are some parts of the module API which I detest but will be unable to modify.

Image-Textmode/Image-Textmode-Reader-ANSI-XS

  • Various bug fixes thanks to some testing with a large dataset from Doug Moore and sixteencolors.net

Config-Any

  • Released version 0.18, which prefers YAML::XS over any other YAML parser. This created a number of issues with the HTML::FormFu crowd as existing parsers allowed this sort of syntax "auto_id: %n" -- whereas YAML::XS complains about an exposed percent sign. The easy fix is to wrap the string in quotes "auto_id: '%n'"

GD-Image-Scale2x

  • Fixed a nasty bug due to a missing my() which randomly broke the module on some platforms.

CGI-Application-PhotoGallery

  • A tiny patch for max_height included in this release. This still has some pending issues in RT -- though I have a hard time justifying spending any time on them as I don't use this module at all.

Catalyst-Model-WebService-Solr

  • Apparently, this module was basically broken. Fixed thanks to a supplied patch.

Template-Provider

  • Another kind user supplied some patches/info to support mod_perl and fully qualified template names.

CQL-Parser/SRU

  • Removed use of UNIVERSAL->import from these modules. Not even sure why it was there to begin with.

WebService-Solr

  • A couple of releases of this module. Includes some bug fixes, feature additions and Solr 1.4 compatibility.

Remove auto_install from my dists

  • Although, as I understand it, auto_install now works in newer versions of Module::Install, I've decided to remove it from my dists to avoid any issues.

See you next year.

Tuesday July 28, 2009
07:25 AM

Adding a feature to Padre

I've been following (and even contributing to) the Padre IDE project from very early on. I've watched it grow from very modest beginnings into something quite impressive -- usable, even.

Its deep integration with Perl is such a killer feature. There are already a good two-dozen plugins, one of which I've been shepherding: Padre-Plugin-PerlTidy.

In light of Padre's first birthday, I decided I wanted to give something back into the Padre core rather than just an ancillary project.

I tend to use gedit on Ubuntu, and I rather liked the "right margin" option. This option puts a gray vertical line on whichever column you specify. It's an easy visual cue for long lines. It turns out that the Scintilla editor component supports this feature and all I had to do was enable the menus and dialogs to allow users to toggle the feature.

[Padre with "right margin" option]

...and there you have it. It didn't take very long, and it's not exactly mind blowing, but it's something I've found useful.

Wednesday June 03, 2009
02:43 PM

Dear Module Author

Dear Module Author,

When preparing to upload a new release of your module to PAUSE could you please review your Changes file?

Did you remember to update it? Does it contain something meaningful? Here are a couple of examples of Changes entries which mean very little to me at a glance:

  • Bug Fixed
  • Foo::Bar Fixed
  • Fixed RT #12345

Also, your SCM revision log does not a good Changes file make.

Yes, this is old news. This is just a reminder.

Tuesday May 26, 2009
12:57 PM

Elsewhere

This is just a quick note to let y'all know that I now have a twitter account and an identi.ca account.

You have been warned.

Thursday May 21, 2009
12:33 PM

Benchmark

As noted in my last post, I was able to get a bit of a speed boost based on observations made as a result of code profiling.

In general, if I want to see if one piece of code is faster than another, I use Benchmark. Benchmark is shipped as part of the core set of modules, so there's no need to load up CPAN to get started. Its simplest usage, and the one I prefer, looks something like this:

    use Benchmark ();

    # number of times to run each piece of code
    my $count = 100_000;

    Benchmark::cmpthese( $count, {
        Foo1 => sub {
            # code to do Foo1 here
        },
        Foo2 => sub {
            # code to do Foo2 here
        },
    } );

Of note is that $count can be negative, in which case its absolute value is the minimum number of CPU seconds to run each piece of code rather than the number of iterations. The result looks like this:

             Rate Foo1 Foo2
    Foo1 108665/s   -- -38%
    Foo2 175460/s  61%   --

It's pretty easy to see that Foo2 was faster. Using the above it was easy for me to test the XS-based ANSI parser vs the pure Perl version.

4k worth of ANSI over 10 seconds yields the following:

         Rate    PP    XS
    PP 15.7/s    --  -96%
    XS  379/s 2316%    --

For giggles, I tested it against a 33k ANSI, giving:

         Rate    PP    XS
    PP 2.23/s    --  -96%
    XS 58.7/s 2528%    --

Looks like a success to me!
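
For anyone who wants something runnable to start from, here is a small self-contained variant of the cmpthese() pattern above. The two subs (with_join and with_concat) are hypothetical stand-ins for Foo1 and Foo2, not the ANSI parsers; the numbers you get will depend entirely on your machine.

```perl
use strict;
use warnings;
use Benchmark ();

# Two hypothetical ways to build the same string, standing in for Foo1/Foo2.
my @words = ('perl') x 100;

sub with_join   { return join '', @words }
sub with_concat { my $s = ''; $s .= $_ for @words; return $s }

# A positive count means "run each sub this many times". A negative count
# such as -10 would instead mean "run each sub for at least 10 CPU seconds".
my $count = 20_000;

# The 'none' style suppresses the printed chart; cmpthese() still returns
# the chart as an array reference of rows (header row first). Benchmark may
# still print a "too few iterations" warning for very short runs.
my $rows = Benchmark::cmpthese( $count, {
    join   => \&with_join,
    concat => \&with_concat,
}, 'none' );

# Print the chart ourselves, one tab-separated line per row.
print join( "\t", @$_ ), "\n" for @$rows;
```

Capturing the rows instead of letting Benchmark print them is handy if you want to log or post-process the results.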

Monday May 11, 2009
10:41 AM

Devel::NYTProf

Another week, another QA tool.

This week I'm going to talk about Devel::NYTProf (aka NYTProf).

To start, if you're interested in profilers, you should check out the brief history section of the pod, then take a glance at its features. Until recently, I hadn't been very interested in profiling my code. I didn't really have anything that needed the profiling, and the tools just seemed a bit awkward to me. This changed for me when I saw the output from nytprofhtml (1, 2).

While working on Image-TextMode, I noticed that parsing large (~75k) ANSI files was getting to be pretty slow. I decided to run NYTProf on the parsing code, and here's what I got:

[Image: Profiling - Before]

The putpixel(), width() and height() methods are called for every character/attribute combo stored for the image. This turns out to be a really big inefficiency. I've had some XS code in my back pocket for ANSI parsing, so I decided to whip up a replacement parser using that code and run the profiler again.

[Image: Profiling - After]

Huge win! By moving _read() to XS (including putpixel, width, and height) I was able to shave over a second off of the total time (_read inclusive goes from 1.3 seconds to 0.03). Although working with XS was a bit of a pain, it was really great to see such a speed improvement.

I recommend everyone take a look at NYTProf if you're looking to find speed inefficiencies in your code.

Friday May 01, 2009
02:42 PM

Perl::Critic

Holy -- this weekly thing goes by way too fast!

Anyway, as promised, I'm making my first QA tool post. This week, we're chatting about Perl::Critic.

Perl::Critic has been around since late 2005. I was able to resist its icy gaze until last fall. So, why wouldn't I want to jump right in with Perl::Critic early on? Mostly what I imagined was putting a significant amount of time in to bend Perl::Critic policies to my will so I wouldn't have to change how I code. This is, of course, the wrong way to look at it.

There's nothing wrong with having a tool that confirms you're doing the right thing -- but what I really wanted was a tool that showed me the bad habits I've learned and gave me a slap on the wrist every time I tried to use them. The easiest way to get started was to copy someone else's policy file. RJBS was nice enough to comply.

For the Image::TextMode project, after adding my own tweaks to the policies, this is the result. A simple automated test integrates it into my development cycle.

After running it against my code, it found some issues -- most of which were pretty tame: 2-arg open, lack of pod, plus a few regex and character matching niggles.
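
As an example of the "2-arg open" complaint (the InputOutput::ProhibitTwoArgOpen policy), here is a short sketch of the three-argument form that keeps Perl::Critic quiet. The file name and scratch directory are made up for the example; this is not code from Image::TextMode.

```perl
use strict;
use warnings;
use File::Temp ();

# Scratch directory so the example cleans up after itself.
my $dir  = File::Temp::tempdir( CLEANUP => 1 );
my $path = "$dir/notes.txt";    # hypothetical file name

# Three-argument open: the mode is separate from the filename, so odd
# characters in $path (a leading '>' or a trailing '|') cannot change
# how the file is opened.
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} "hello\n";
close $out or die "Cannot close $path: $!";

open my $in, '<', $path or die "Cannot read $path: $!";
my $line = <$in>;
close $in or die "Cannot close $path: $!";

print $line;

# The two-argument form that gets flagged looks like:  open(FH, ">$path")
```

The lexical filehandles ($out, $in) are a bonus: unlike a bareword FH, they are scoped to the enclosing block.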

In my policy file, I have two sections: Things I don't agree with and things I've had to disable temporarily. I hope to eventually go back and clean up my code so I can remove the remainder of the temporarily disabled policies. The policies I don't agree with may change over time, but this is my current list of preferences.

I have yet to use this setup in any other project, but I think the tool is useful enough that I could put it into place from the very beginning of a project or go back and run it against all of my old projects over time.

Until next time...