Journal of bart (450)

Tuesday June 05, 2007
03:56 PM

Linus Torvalds on version control systems

Two days ago, I watched the video of Linus Torvalds' talk about GIT versus other version control systems at Google. It's 67 minutes long, but definitely interesting.

I think I must have checked out almost every freely available source control system last autumn, and in the end, I liked none. Linus seems to have the same sentiment: the most important aspect of a version control system, to me, is easy branching and merging. Otherwise, it's just a glorified archive, merely enhanced with a diff view on the history. There's one of these systems that doesn't even do merging. If you want it, you need a third-party tool.

Now, where we differ in opinion is on his insistence that the system must be distributed. He is a huge scale developer, keeping a repository of the entire Linux kernel, with thousands of contributors, while I'm just a small time developer. What I want it for is for maintenance of some CPAN modules and at work, there's only the three of us, so in theory, a central repository would work just as well.

But even for CPAN modules, you can easily consider my personal repository, the one that eventually gets posted on CPAN, and the code from people who fix bugs and submit patches, as separate repositories, which I'd eventually like to be able to merge easily. So even at this small scale, a system that is well suited for easy merging of other people's code would be beneficial. Actually, it makes even more sense to use a distributed system for CPAN work than for the code of us three at work.

What kills off GIT as a candidate for me is the fact that it doesn't work on Windows. I can understand that Linus prefers to use just Linux, but I'm not fully converted, and I only use Unixy systems for server roles, not for desktop. And I'm not prepared to have to use an external Linux server just for the source control system; that would blow most of GIT's benefits out of the water.

So I'm left in limbo. Linus' only other recommendation to look at is Mercurial. I remember trying it out, and it didn't work too well on Windows. Neither did most of the other things I tried, BTW.

Thursday May 31, 2007
04:38 PM

"Nice in theory": curried templates

Some time ago, I had a wild idea. I have no idea how to implement it, so I'm just going to describe how it could be here, hoping that someone will get inspired by the idea and run with it, implement it, and hopefully in Perl. It sounds to me like it might actually be very nifty. So let me explain the concept.

In functional languages, there's an often touted feature called curried functions. Most "normal" languages have user defined functions, which expect a particular set of parameters to work, and if you fail to pass them all, they'll just barf. Other languages just fill in defaults for the missing values.

Other people thought they might be able to make it do something more useful instead, and they implemented "curried functions" in some of the so-called "functional languages", or as I could call them: "partially executed functions". A function, without the programmer who wrote it having to do anything extra, can take fewer parameters than required, fill those in, and return a function instead of a plain result. To execute that function, you simply pass in the rest of the parameters. Pass in all required parameters, and you get the final function value.
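To make the idea concrete, here is a minimal sketch in Python. Python does not curry functions automatically the way some functional languages do, so this uses `functools.partial` from the standard library to get the same effect by hand. The `volume` function and its parameters are made up purely for illustration.

```python
from functools import partial

def volume(length, width, height):
    """An ordinary function that needs all three parameters."""
    return length * width * height

# Supply only some of the parameters: the result is not a number
# but a new function that still waits for the remaining parameter.
slab = partial(volume, 2, 3)   # length and width are now fixed

# Passing in the rest produces the final function value.
result = slab(4)
print(result)
```

The intermediate `slab` can be kept around and called as often as you like with different heights, which is exactly the property the template idea below relies on.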

Move over to templates. Templates are a lot like functions, where the template is the function, the values to fill in are the function parameters, and the final (string) output is the function value.

Many dynamic websites depend on templates for the layout for their pages, complemented with code for the logic. But most often, going from template to page is a single step: you need to provide all the values for the variables to produce usable output.

It's quite common for these websites, when they do get to produce the final webpage (or a part of a webpage, a "pagelet", typically a rectangular area), to store it in a cache (memory), so that next time, as long as the data doesn't change, they don't have to regenerate the pagelet on every page load. Instead, they fetch it from the pagelet cache.

If one value can't be filled in, then you can't cache the pagelet. You have to produce it at the end, when the final webpage is built.

Introduce "curried templates". Just like curried functions, that would be a template that may be invoked with an incomplete set of variables, and then you'd end up with a new, partially filled in, simpler, template.

A typical use could be on a news site, where the text of a news article is filled into a boilerplate template. In the resulting template, you can then fill in the parameters for an ad, so every viewer of the web page could see the same article with a different ad.

I'm quite sure that the code to produce fully cached pagelets, and the code to produce dynamic pagelets from scratch, typically look nothing alike. And that is a shame. I can imagine that using these curried templates, not only would the code for both extremes look very much alike, but so could the code for any intermediate form. It's just a matter of how many values you already fill in at the first templating stage.

Now, like I said, I have no idea where to start implementing such a system, but it sounds like it would be too cool. And I do think Perl's very capable of implementing such a system without too much black magic.

A currying template could always produce a new template object, where stringification of the object is overloaded to produce the flattened string output. That way you never have to distinguish between the curried or the plain output. It almost sounds too easy.
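As a rough illustration of what that could look like, here is a small sketch. It is in Python rather than Perl, and it is only one possible approach, not a worked-out design. The class name `CurriedTemplate`, its `fill` method, and the sample page are all hypothetical. It leans on `string.Template.safe_substitute` from the standard library, which leaves any placeholder it has no value for untouched, so the output of one stage is still a template for the next.

```python
from string import Template

class CurriedTemplate:
    """Hypothetical sketch of a template that can be filled in stages."""

    def __init__(self, text):
        self.text = text

    def fill(self, **values):
        # safe_substitute replaces the placeholders it has values for
        # and leaves the rest as-is, so the result is a simpler template.
        return CurriedTemplate(Template(self.text).safe_substitute(**values))

    def __str__(self):
        # Stringification gives the flattened output of whatever has been
        # filled in so far; callers need not know how many stages ran.
        return self.text

page = CurriedTemplate("<h1>$title</h1><p>$body</p><div>$ad</div>")

# First stage: fill in the article once; this object could be cached.
article = page.fill(title="News", body="Something happened.")

# Second stage: fill in the ad separately for each viewer.
final = article.fill(ad="Buy more widgets")
print(final)
```

One caveat with this naive version: values filled in at an early stage are scanned again at later stages, so a value that itself contains something looking like a placeholder would be substituted too. A real implementation would need to escape or otherwise protect already filled-in text.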

Sunday May 27, 2007
03:26 PM

Spiderman 3

This afternoon I went to see the movie Spiderman 3 in the theater, with my son and a neighbor's kid. I wasn't expecting too much, as many critics say the movie's too much of a hodgepodge to be actually good.

Well, I disagree. I actually think this is the best of the 3 spiderman movies. So there's a confusing mixture of storylines, but they all have one thing in common: this is about spiderman's life. All these things happen to him all at the same time. Do you think villains would nicely wait their turn and attack him one at a time? No. Villains attack when it suits them, and yes, it makes life really confusing for the person in the center of the story. It may seem like a lot of balls to juggle, but it's Peter Parker who has to do the juggling. And he's having just a bit of trouble coping.

Nowhere in the movie did I lose track of the story, or did I think they did something totally out of character. I think it works.

Monday May 14, 2007
04:32 PM

Corrupt headers in Hotmail's bounce mails

Lately I've been working on a script to parse and classify mails that come in after a bulk mail has been sent, most of them in the form of bounces. Of the roughly 2000 mails, 2 had corrupt headers. Guess where they both originated from? Oh, yeah, I already told you in the post title: Hotmail.

The problem with these two mails is that in the middle of the mail headers, there's a blank line, followed by a line starting with 3 garbage characters and then "From: ". Apart from those 3 garbage characters, it's the real "From:" line.

You would expect that a huge company under the umbrella would be capable of getting their stuff right. I think it's quite typical that they don't. Can't. Won't.

Can anybody explain what the origin of this garbage could be? I have no idea. In Perl, you can match it with /\357\273\277/.
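For what it's worth, those three bytes (octal 357 273 277, or EF BB BF in hex) are the UTF-8 encoding of U+FEFF, the byte order mark that some software writes at the very start of UTF-8 text. How it ends up in the middle of these particular headers is not something this confirms; that part remains a guess. A minimal Python sketch of checking for it and stripping it, where the helper name `strip_bom` and the sample address are made up for illustration:

```python
import codecs

# The same three bytes as the Perl pattern /\357\273\277/.
BOM = b"\357\273\277"

def strip_bom(line: bytes) -> bytes:
    """Remove a leading UTF-8 byte order mark from a raw line, if present."""
    return line[len(BOM):] if line.startswith(BOM) else line

# The standard library knows this byte sequence as the UTF-8 BOM.
print(BOM == codecs.BOM_UTF8)

# Made-up sample of a corrupted header line.
raw = BOM + b"From: someone@example.com"
print(strip_bom(raw))
```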

Thursday May 10, 2007
12:38 AM

"Your Favorite Programming Quote"

I found a nice blog entry about refactoring via one of the popular links sites (I thought it was Reddit, but I can't find it any more). I like it.

My favorite quote (by Larry Wall, heh) from that post is

Don't buy something unless you've wanted it three times.

which is, technically, not a programming quote. But what it means to a programmer is this: don't bother refactoring into a separate module until you've needed a particular functionality in at least 3 applications. Before that happens, you won't have a good idea how to generalize the code, yet.

It makes sense as economic advice too, it keeps you from ending up with stuff you rarely use. Do I really need an MP3 player? Well, so far I've not wanted to have it 3 times. So, no.

I think I'll keep an eye on that blog in the future.

Wednesday April 04, 2007
04:35 PM

Imogen Heap

Last weekend, when I was browsing some media content on the internet, I stumbled onto a website of music videos of the nominees for the Grammy Awards of 2007. As was to be expected, it contained a bunch of mostly forgettable stuff that can almost all be classified into one of two piles, as usual:

  1. pop and R&B, like Britney Spears and Beyonce
  2. rock, like Green Day

But occasionally, there's something in there that is different. And this is where Imogen Heap fits in, with the song "Goodnight and Go".

It looks like I just might be the last person in the world to have heard of her, as some tracks of hers have already been listened to over 5 million times on MySpace. Wow.

So, what's it all about, then? Her compositions are not very special, just the plain, optimistic, electronic pop songs. She does have a beautiful voice, as can often be heard among Celtic folk singers, like Enya and Clannad.

Yet she has no fear of experimentation: she often lets her voice skip between vocal chords, just for the weird effect, and she has put electronic effects quite prominently in the foreground, which is quite uncommon for a woman. For example, "Hide and Seek" is just her voice and a vocoder.

I probably must have annoyed my housemates that weekend, as I'm guessing I must have played the few songs I found on the internet about 40 times in all.

I have also had a few of her songs stuck in my head all working day, for 4 days the following week. So yeah, she has made an impression.

I've been to every larger record shop here in town, and none of them had anything by her in stock. No surprise there, I suppose?

Saturday March 10, 2007
07:45 AM

Firefox 2 annoyance

On one of my PCs, I upgraded my Firefox browser to 2.0 a few weeks ago. I can't say that, apart from details, I notice much of a difference. Which is not necessarily a bad thing...

But the developers behind Firefox should learn the difference between a crash and a system shutdown. Because every single time I first open the browser, an annoying dialog box pops up asking me if I want to restore the previous session. It won't go away without clicking on a button first, and it just blocks the working of the browser. Even when I click "new session", the browser forgets the argument (either a URL or the path to an HTML page) Firefox got opened with, and simply opens on the homepage. Argh!

And if they think restoring the last session is such a cool feature, why don't they just add it as a menu item, in the system menu? So that you can still restore a session even after you closed the browser normally?

Friday February 09, 2007
04:43 PM

On a slow server? Mmm, not exactly...

When I looked at the Javascript used by this site, copy/pasted it from Firefox' Web Developer info page into a text editor window, I found it had a staggering 13500 lines of Javascript code, and saves down to a file of well over 400k. 400k of Javascript! For a site that is relatively sparsely sprinkled with Javascript. That is... Well, that is nuts, no other word for it. I'm thinking that, if you'd just open the stock library (which is one from Yahoo), extract the functions you use, and put them in a new file, you'd probably get along with a library file of maybe 10k to 20k, I think.

The site occasionally takes a long time to load; load times of 10 to 20 seconds are no exception, on a broadband connection. Most of the loading time is spent on the domain where the Javascript library resides. I see this happen at least once every few days, so the amount of bandwidth I must be spending on this site each month must be a bit on the high side.

And I do feel sorry for the people with a slow internet connection.

Monday February 05, 2007
03:19 PM

Ridiculous copyright notice

This is a warning that is printed in the user manual for my new Toshiba flatscreen TV, right under the information on how to change the aspect ratio and image size of the picture, in case the automatically chosen setting isn't what you would have chosen:

Using the special functions to change the size of the displayed image (i.e. changing the height/width ratio) for the purpose of public display or commercial gain may infringe on copyright laws.

What... The... F.

Monday January 29, 2007
04:07 PM

Foxit Reader, no thank you.

I've been looking at various versions of Acrobat Reader, and alternatives like Foxit Reader. On any forum discussing Acrobat Reader, you can find a few comments recommending Foxit Reader instead, because it's much smaller. So I decided to try it out.

Indeed, it is fast. It loads in a few seconds on an older computer. Its rendering looks very good, in a graphics intensive file.

But it succeeded in crashing once, and hanging once, bringing down the entire PC (an old Win98 system, but otherwise a very stable platform), in under 15 minutes of testing. So it doesn't give me the impression of being a mature product.

So off it goes.