
http://sam.tregar.com/

Journal of samtregar (2699)

Sunday June 22, 2008
02:22 PM

Making Maps with Math::Geometry::Voronoi

I've been generating overlays for Google Maps lately at my day job. Without going into too much detail, the data set I need to display is a set of points, each assigned to a given set. In the real world these sets form contiguous regions which I need to translate into shapes to draw over the map. The rub is that these regions come in really complex shapes - drawing convex hulls isn't an option.

My first attempt at the problem once I realized I couldn't draw a hull was to divide the world into a grid and color each box according to what I found inside, subdividing until each box contained only a single set's point(s). The results were blocky (duh) and not all that appealing. But as long as the data was dense enough (i.e. urban areas) it did a decent job of expressing the shapes. Unfortunately in rural areas where points are more spread out it looked terrible.
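For the curious, the grid idea can be sketched in a few lines. This is a reconstruction for illustration, not the code I ran at work: the subdivide() function, the [x, y, set] point layout, the depth limit and the 'mixed' label are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Each point is [x, y, set_name]. Returns a list of boxes
# [x0, y0, x1, y1, label], where label is the set name, undef for an
# empty box, or 'mixed' when the (assumed) depth limit is reached.
sub subdivide {
    my ($x0, $y0, $x1, $y1, $points, $depth) = @_;

    # Keep only the points inside this box (half-open on the max edges).
    my @inside = grep {
        $_->[0] >= $x0 && $_->[0] < $x1 && $_->[1] >= $y0 && $_->[1] < $y1
    } @$points;

    my %sets  = map { $_->[2] => 1 } @inside;
    my @names = keys %sets;

    return [$x0, $y0, $x1, $y1, undef]     unless @names;
    return [$x0, $y0, $x1, $y1, $names[0]] if @names == 1;
    return [$x0, $y0, $x1, $y1, 'mixed']   if $depth <= 0;

    # More than one set in the box: split into four quadrants and recurse.
    my ($mx, $my) = (($x0 + $x1) / 2, ($y0 + $y1) / 2);
    return map { subdivide(@$_, \@inside, $depth - 1) } (
        [$x0, $y0, $mx, $my], [$mx, $y0, $x1, $my],
        [$x0, $my, $mx, $y1], [$mx, $my, $x1, $y1],
    );
}

my @boxes = subdivide(0, 0, 10, 10, [[1, 1, 'a'], [9, 9, 'b']], 8);
```

Sparse data leaves you with a few huge boxes, which is roughly why the rural areas looked so bad.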

For my second attempt I decided to actually learn some computational geometry - I read a book and quite a few sites around the web. This led me to Voronoi diagrams. There's a wide variety of Voronoi diagram producing code out there on the web - everything from highly-templated C++ monsters to Java to some excellent 20-year-old C code. I chose the latter of course (Perl and old-school C go together like chocolate and peanut butter), and after some serious debugging I give you Math::Geometry::Voronoi.

You can see a demo of it in action here: http://sam.tregar.com/voronoi.cgi. This is essentially what I'm doing with Google Maps, except that I'm coloring multiple regions the same color and drawing a line around the border rather than each cell.
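If you want a feel for what a Voronoi diagram expresses without installing anything, here is a brute-force sketch. It is not how Math::Geometry::Voronoi works (the module wraps real diagramming code in C); it just applies the defining property, that every location belongs to its nearest point, on a sampled grid, and then marks the cells where the owning set changes. The function names and the [x, y, set] layout are made up for the example.

```perl
use strict;
use warnings;

# Sites are [x, y, set_name]. Returns the set of the closest site
# (the first one listed wins a tie).
sub nearest_set {
    my ($x, $y, $sites) = @_;
    my ($best, $best_d2);
    for my $s (@$sites) {
        my $d2 = ($s->[0] - $x)**2 + ($s->[1] - $y)**2;
        ($best, $best_d2) = ($s->[2], $d2)
          if !defined $best_d2 || $d2 < $best_d2;
    }
    return $best;
}

# Sample a $w x $h grid and return the cells whose right or lower
# neighbour belongs to a different set: those lie on a region border.
sub border_cells {
    my ($w, $h, $sites) = @_;

    my @owner;
    for my $y (0 .. $h - 1) {
        for my $x (0 .. $w - 1) {
            $owner[$y][$x] = nearest_set($x, $y, $sites);
        }
    }

    my @border;
    for my $y (0 .. $h - 1) {
        for my $x (0 .. $w - 1) {
            my $here = $owner[$y][$x];
            push @border, [$x, $y]
              if ($x + 1 < $w && $owner[$y][$x + 1] ne $here)
              || ($y + 1 < $h && $owner[$y + 1][$x] ne $here);
        }
    }
    return \@border;
}

my $border = border_cells(10, 1, [[0, 0, 'a'], [1, 0, 'a'], [9, 0, 'b']]);
```

Two neighbouring points in the same set produce no border between them, which is the "color multiple regions the same color" effect described above.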

The speed of the diagramming code is really good. Don't let the demo fool you - that code is running in CGI mode on a shared server. Put it on a fast mod_perl server and it's definitely not going to be the bottleneck in a mapping application. That prize goes to Google Maps. Sweet app, but it sure is slow!

-sam

Wednesday April 23, 2008
06:15 PM

VMWare's Jerky Mouse

This isn't the least bit Perl related, but I figured this was an easy way to get some useful info out to Google. I searched in vain for a solution and managed to hit it only by accident. Maybe I can save someone else the trouble.

In short, I installed Fedora 8 as a guest in VMWare running on Windows XP. Everything was fine but the mouse movement was jerky. Not jerky like a video performance problem, but more like wiggly and jittery. It looked a little like my hand was shaking on the mouse, but I was calm as can be.

Reading through the logs, I realized that X had somehow detected my mouse twice. I guess it was getting conflicting signals from each movement. I fixed it by opening /etc/X11/xorg.conf and commenting out the entire mouse configuration section and the reference to the mouse config in the screen section. I restarted X and my mouse was auto-detected fine.

Most of the advice I could find on the topic of funky mouse movement in VMWare advises you to install VMWare tools. Don't bother - it won't work under Fedora 8 and you'll waste a lot of time trying. You don't need it anyway - Fedora 8 comes with working VMWare display and mouse drivers.

-sam

Thursday April 03, 2008
03:12 PM

Sharing a DBI handle between Class::DBI and Rose::DB::Object

At my current job we've got a large existing code-base built on Class::DBI. For my current project I decided to experiment with using Rose::DB::Object instead, hoping to see less need for hand-written SQL. So far it's been a success, but one issue was quite difficult to get right - getting the two systems to share a DBI handle. If I didn't do this then I was going to see a doubling of total DBI connections when I deployed my new app to the web cluster, which is unacceptable.

I started with the most obvious solution, an over-ridden init_db() in my Rose::DB::Object subclass:

sub init_db {
    my ($pkg, @args) = @_;

    # Hand Rose the handle Class::DBI already holds instead of opening a new one.
    My::Rose::DB->new_or_cached(dbh => My::Class::DBI->db_Main(), @args);
}

That worked great at first - when Rose needs a DB connection it gets one pre-loaded with my Class::DBI handle. (And as a side-note, the fact that the Class::DBI handle uses DBIx::ContextualFetch doesn't cause Rose problems.)

However, sometimes for unknown reasons Rose will decide it needs to reconnect after the initial connection is established. To intercept these calls I added an overridden dbh() method to my Rose::DB sub-class:

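sub dbh {
    my $self = shift;

    # Called as a getter: put the Class::DBI handle in place first,
    # so Rose never has a reason to open its own connection.
    unless (@_) {
        $self->{dbh} ||= My::Class::DBI->db_Main();
    }
    return $self->SUPER::dbh(@_);
}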

Another useful trick helped me notice very quickly when Rose decided to reconnect: I gave Rose::DB an invalid password in my call to register_db. That way Rose would have all the correct information about the connection, but wouldn't be able to initiate new connections.
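Here is a toy model of how the two tricks fit together. None of the Stub:: classes are real: Stub::DB only imitates the "connect lazily unless a handle is stored" behaviour I'm assuming of Rose::DB, and Stub::Legacy stands in for the Class::DBI side. It runs without either library installed.

```perl
use strict;
use warnings;

# Stand-in for Rose::DB: dbh() connects lazily unless a handle is stored.
package Stub::DB;
our $connects = 0;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub dbh {
    my $self = shift;
    $self->{dbh} = shift if @_;
    unless ($self->{dbh}) {
        # The deliberately invalid password makes a stray connect fail loudly.
        die "connect refused: bad password\n" if $self->{password} eq 'invalid';
        $connects++;
        $self->{dbh} = { source => 'own connection' };
    }
    return $self->{dbh};
}

# Stand-in for the Class::DBI side, which owns the one real handle.
package Stub::Legacy;
my $main_handle = { source => 'legacy handle' };
sub db_Main { return $main_handle }

# The subclass mirrors the dbh() override shown above.
package Stub::SharedDB;
our @ISA = ('Stub::DB');
sub dbh {
    my $self = shift;
    unless (@_) {
        $self->{dbh} ||= Stub::Legacy->db_Main();
    }
    return $self->SUPER::dbh(@_);
}

package main;

my $db = Stub::SharedDB->new(password => 'invalid');
my $first = $db->dbh;      # shared handle, no connect attempted
delete $db->{dbh};         # simulate the handle being dropped
my $second = $db->dbh;     # override quietly restores the shared handle
```

Without the override, the second call would try to connect and die on the bad password, which is exactly the early warning you want.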

I hope this helps other suffering Class::DBI users to give Rose a try!

-sam

Monday December 03, 2007
02:57 PM

HTML::FillInForm patch to mark invalid fields

I've been working on producing yet another form validation system, this time for MasonX::WebApp. I modeled my work on the very nice CGI::Application::Plugin::ValidateRM, with a bit of the Krang::Message system thrown in to help get errors out to the user.

One thing I wanted to do this time that I'd never done before was automate marking invalid form fields. I'd done this in the past using manual template changes, but that's a lot of grunt work.

Instead, my coworker Perrin Harkins suggested we use HTML::FillInForm to do the job. We're already using it to re-fill forms with errors, so why not use it to set up a CSS class on invalid fields too? It turns out HTML::FillInForm already has a very similar facility to disable selected fields, so adding an invalid fields feature was pretty simple.
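To show the effect I'm after, here is a rough standalone sketch. It is not the patch and it does not use HTML::FillInForm at all: mark_invalid() is a hypothetical helper that finds form tags with a regex, which is fine for a demo but no substitute for the real HTML parsing the module does.

```perl
use strict;
use warnings;

# Add a CSS class to every <input>, <select> or <textarea> tag whose
# name attribute appears in @$invalid. The default class name 'invalid'
# is an assumption for this example.
sub mark_invalid {
    my ($html, $invalid, $class) = @_;
    $class = 'invalid' unless defined $class;
    my %bad = map { $_ => 1 } @$invalid;

    $html =~ s{(<(?:input|select|textarea)\b[^>]*>)}{
        my $tag = $1;
        my ($name) = $tag =~ /\bname\s*=\s*["']([^"']*)["']/i;
        if (defined $name && $bad{$name}) {
            # Extend an existing class attribute, or add a new one.
            $tag =~ s/\bclass\s*=\s*(["'])/class=$1$class /i
              or $tag =~ s/^<(\w+)/<$1 class="$class"/;
        }
        $tag;
    }gie;

    return $html;
}

my $form = '<input type="text" name="email"> <input name="age" class="num">';
my $marked = mark_invalid($form, ['email', 'age']);
```

A stylesheet rule on the chosen class then does the highlighting, with no template changes per form.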

I've already sent along the patch to the maintainer, but if you want to get it now here it is:

-sam

Friday July 06, 2007
03:19 PM

Get your Geo::Coder::US DB file here!

I've put up a freshly built Geo::Coder::US DB file for download here:

It's 384MB compressed (845MB uncompressed), but believe me, you'd much rather download this than download the 6.6GB of TIGER/Line source files needed to build it!

Supposedly I've got 6TB of downstream bandwidth each month at my ISP (Dreamhost). I guess we'll see if they meant it! I should hook up something to take down the file automatically if I go over my limit...

-sam

Thursday April 26, 2007
01:46 PM

Review of Programming Erlang, the short version

In short, fantastic book, absolutely recommended to any programmer interested in learning about a fascinating and very novel programming language. It's very well written and I was engaged from start to finish.

I think it may be of particular interest to Perl programmers because Erlang offers a great solution to a problem which is very hard to tackle in Perl - reliable, simple parallel processing. I'm definitely excited to use Erlang in the future - I've already got ideas for more projects than I could possibly find time to complete. I think its utility will only grow as processors continue to add more cores.

As a side note, I read the book as a beta PDF, which I enjoyed a lot more than I thought I would. It was very convenient to be able to cut-and-paste code from the book into a running Erlang interpreter to see it work. I got several updated versions of the book as I was reading, but the changes weren't too disruptive.

I may write a longer review, perhaps after I've written a moderate amount of Erlang so I can get a sense for what was left out. But don't wait for that - buy it now and thank me later!

-sam

Friday April 20, 2007
02:18 PM

Article about Data Warehousing with MySQL and Perl

An article I wrote about building a data warehouse with Perl and MySQL just went up on O'Reilly's database site:

I'd intended it for perl.com, but as the editor correctly pointed out it doesn't have a lot of Perl content. Take a look and let me know what you think (|grep -v 'mysql hate').

-sam

Friday December 08, 2006
01:52 PM

Decoding another bash error

Again I've been confused by a bash error message. And again, Google was little help, so I figured I'd post it here so perhaps others won't search in vain.

Here's the error, encountered trying to run a Perl script on a USB drive from my Fedora Core 5 machine:

$ bin/krang_ctl restart
-bash: bin/krang_ctl: /usr/bin/perl: bad interpreter: Permission denied

The "bad interpreter" part led me on a mission to make sure Perl was installed ok. It was, and scripts in other locations ran fine. Then I spent some time investigating the "Permission denied" angle, but I couldn't find a permissions problem anywhere.

Finally I looked at how the USB disk was mounted and there it was:

# grep Cube /etc/mtab
/dev/sda1 /media/Cube ext3 rw,nosuid,noexec,nodev 0 0

Note the "noexec" there! Fedora auto-mounted this disk with "noexec" turned on. I still don't know how to tell Fedora not to do that, but I do know how to fix it after the fact:

# mount /media/Cube/ -o remount,exec

After that the problem went away. How the heck having "noexec" on a mounted filesystem triggers a "bad interpreter" error in bash, I have no idea...
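If you'd rather check for this up front, a small sketch along these lines will tell you whether a path sits on a noexec mount. The function names are made up, it takes the mount table as text so you can feed it the contents of /etc/mtab, and it ignores wrinkles such as escaped spaces in mount points.

```perl
use strict;
use warnings;

# Given mount-table text (one "device mountpoint type options ..." entry
# per line) and a path, return the options of the longest mount point
# that contains the path.
sub mount_options_for {
    my ($mtab, $path) = @_;
    my ($best_len, @opts) = (-1);

    for my $line (split /\n/, $mtab) {
        my (undef, $mount, undef, $options) = split ' ', $line;
        next unless defined $options;

        (my $dir = $mount) =~ s{/+$}{};    # "/" becomes "", matching everything
        next unless $path eq $dir || index($path, "$dir/") == 0;
        next unless length($dir) > $best_len;

        $best_len = length $dir;
        @opts     = split /,/, $options;
    }
    return @opts;
}

# True if the filesystem holding $path is mounted with noexec.
sub is_noexec {
    my %opt = map { $_ => 1 } mount_options_for(@_);
    return $opt{noexec} ? 1 : 0;
}

my $mtab = "/dev/sda2 / ext3 rw 0 0\n"
         . "/dev/sda1 /media/Cube ext3 rw,nosuid,noexec,nodev 0 0\n";
print "noexec!\n" if is_noexec($mtab, '/media/Cube/bin/krang_ctl');
```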

Hope that helps someone!

-sam

Sunday October 29, 2006
05:44 PM

Mac line-endings and Text::CSV_XS

To say that Text::CSV_XS has trouble with Mac line-endings (\015) is somewhat of an understatement. Not only will it not parse a file that uses them to end lines, it won't even allow them inside a field in binary-mode. Binary-mode is advertised as allowing any character as long as it's in a quoted string, so this is clearly (in my opinion) a bug.

I dug into the code intending to solve both problems, but only managed to fix the latter. Actually supporting \015 as a line-ending character looks like it would be hard. For my purposes it wouldn't help unless it was automatic - if I have to tell Text::CSV_XS that a file has Mac line-endings then I might as well just translate them. That's the way Unix and Windows line-endings work now - you don't have to tell the module what to expect and you can even mix them in a single file. The way it accomplishes this feat doesn't extend well though, at least as far as I can tell.
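Translating them up front is a one-liner, sketched here with a made-up helper name. Be aware that it rewrites every \015, including any sitting inside a quoted field, so it only suits data where that doesn't matter.

```perl
use strict;
use warnings;

# Convert Windows (\015\012) and old Mac (\015) line endings to Unix (\012).
sub normalize_line_endings {
    my ($text) = @_;
    $text =~ s/\015\012?/\012/g;
    return $text;
}

my $unix = normalize_line_endings("a,b\015c,d\015\012e,f\012");
```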

In any case, here's the bug fix: mac.diff. After you apply it you should find that stray \015 characters work just fine in binary mode. I also sent it to the maintainer, but since the module hasn't had a release in 5 years I'm not exactly holding my breath!

I came very close to going on an optimization mission while I was in the code. The state machine looks like it could benefit from some tweaking and the way lines are read looks like it could be improved. This would be pretty foolish though - Text::CSV_XS is already so fast that I've never seen it show up in a profile on a serious app. Usually I'm reading CSVs so I can load data into a database via DBI, by which point Text::CSV_XS is unlikely to be a bottleneck.

-sam

Sunday September 17, 2006
02:28 PM

Safely timeout DBI with DBIx::Timeout

Recently I needed to find a way to time out a DBI request. I found the state of the art less than satisfying, involving unsafe signals and a chance of memory corruption deep in Perl's guts:

This led me to create DBIx::Timeout which instead of using unsafe signals:

  - Forks a child process which sleeps for $timeout seconds.

  - Runs your long-running query in the parent process.

  - If the parent process finishes first it kills the child and
    returns.

  - If the child process wakes up it kills the parent's DB thread and
    exits with a code so the parent knows it was timed out.
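The shape of that dance can be tried without a database. The sketch below is not DBIx::Timeout's code or API: run_with_timeout() is a made-up name, and a worker process stands in for the MySQL thread, with the parent blocking in waitpid() much as it would block inside a DBI call. Killing the worker plays the role of "KILL $thread_id".

```perl
use strict;
use warnings;
use POSIX ();

# Returns 1 if $work finished in time, 0 if the watchdog cut it short.
sub run_with_timeout {
    my ($timeout, $work) = @_;

    # Stand-in for the DB thread executing the slow query.
    my $worker = fork();
    die "fork failed: $!" unless defined $worker;
    if ($worker == 0) {
        $work->();
        POSIX::_exit(0);
    }

    # Watchdog child: sleep, then kill the "query" and report the
    # timeout through its exit code.
    my $watchdog = fork();
    die "fork failed: $!" unless defined $watchdog;
    if ($watchdog == 0) {
        sleep $timeout;
        kill 'TERM', $worker;
        POSIX::_exit(2);
    }

    # Parent: block until the "query" returns, one way or the other.
    waitpid($worker, 0);

    # Stop the watchdog if it is still asleep, then see how it ended.
    # Exit code 2 means it woke up and fired; death by signal means we won.
    kill 'TERM', $watchdog;
    waitpid($watchdog, 0);
    return (($? >> 8) == 2) ? 0 : 1;
}

my $finished = run_with_timeout(5, sub { 1 });
```

No signal handler ever runs in the parent, which is the whole point: the parent only ever waits and checks exit codes.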

Tim Bunce suggested a possible optimization - fork just one child process and have it watch any number of slow queries simultaneously. It would accept assignments via a pipe interface. Seems like a good idea, although it's likely overkill for my usage. The queries I need to time out are very likely to be long-running, and when they're not, they don't need to be very fast. The overhead of forking a process that exits almost immediately won't cause any problems, I'm betting.

So, please give it a try! And if you're not a MySQL user, please port it to your DB and send me a patch. It should be an easy job - all you have to do is implement a call to kill another process in the DB (MySQL does it with "KILL $thread_id"). (UPDATE: actually, it's a little more work - you also need to write new tests. The tests I wrote for MySQL use GET_LOCK() to test timeouts - you'll need to do something analogous for your DB.)

-sam

PS: I should note that the mechanism this module uses was suggested by my co-worker Perrin Harkins. I'll add that to the module's POD for the next release.