
All the Perl that's Practical to Extract and Report

use Perl

ethan (3163)

  reversethis-{ed. ... rap.nov.olissat}

Being a 25-year-old chap living in the westernmost town of Germany. Studying communication and information science and being a huge fan of XS-related things.

Journal of ethan (3163)

Tuesday July 06, 2004
04:02 AM

little pigeons' photos

By popular demand, you can now see a few photos of the two little pigeons. snapshot0003 is one of the less blurry shots.

Since they don't yet have all their feathers, they look a little like porcupines.

Monday July 05, 2004
04:52 AM

$pigeons *= 2

Four days ago the two little pigeons hatched from the two eggs on my ledge. So now I have two baseball-sized furry yellowish things in the flower pot. It took them around two days to open their eyes and they have grown considerably since. For the first few days there was always one of the parent birds sitting on them to keep them warm. Now they are apparently old enough to be left alone for an hour or so in between.

They are actually quite cute. If you touch one lightly on the head, it apparently mistakes it for an attack from its brother (or sister, who knows) and they start to fight with each other for a few seconds. Another funny thing is their proportions. Their claws and beaks are already full-sized, which gives them a rather comical look.

Thursday June 24, 2004
03:03 AM

Inline::Lua out

Finally I am through with 0.01 and have released it to the CPAN. As always with a module due to be released, I found some bugs after writing the test cases (not too many tests yet, need to provide a few more). Yesterday I did some last-minute additions that turned out to be quite tricky, too.

I am glad that this is done. Often enough I have modules that in theory are ready for CPAN shipping but then it just doesn't happen because small things turn out to be hard or annoying to fix, and so I put them into the postpone-queue and later forget about them.

Tuesday June 22, 2004
04:13 AM

Working around bugs

Doing that is something I absolutely hate, because it can make you look quite stupid. See this code:

sub validate {
    my $o = shift;

    while (@_) {
        my ($key, $val) = splice @_, 0, 2;
        if ($key eq 'UNDEF') {
            Inline::Lua->register_undef(\@$val), next if ref $val eq 'ARRAY';
            Inline::Lua->register_undef(\%$val), next if ref $val eq 'HASH';
            Inline::Lua->register_undef(\*$val), next if ref $val eq 'GLOB';
            Inline::Lua->register_undef(\&$val), next if ref $val eq 'CODE';
            Inline::Lua->register_undef($val);
        }
    }
}

Now, why on earth do I dereference $val just to pass a reference to the dereferenced value? register_undef is an XSUB. What it receives when I just pass the raw reference is, for some reason, an SvPVIV:

SV = PVIV(0x82a5ab8) at 0x81b9134
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,ROK)
  IV = 0
  RV = 0x814a75c
  PV = 0x814a75c ""
  CUR = 0
  LEN = 0

Note that the ROK flag is still set.

It should be, of course, an SvRV. I am not yet sure who to blame. The above validate method is triggered by Inline so I suspect it might do funny things to its arguments. But this cannot be confirmed when looking at its code. Odd.
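For what it's worth, the workaround should not change which thing ends up registered: taking a fresh reference to the dereferenced value yields a new reference scalar that points at the same underlying array, hash, glob or sub. Here is a small pure-Perl sketch to check that. The %sample and %rewrap tables are made up for this illustration and have nothing to do with Inline::Lua itself.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# One sample value per reference type handled in validate().
# These values are illustrative only.
my %sample = (
    ARRAY => [1, 2, 3],
    HASH  => { a => 1 },
    GLOB  => \*STDOUT,
    CODE  => sub { 42 },
);

# The same "dereference, then take a reference again" trick as above.
my %rewrap = (
    ARRAY => sub { \@{ $_[0] } },
    HASH  => sub { \%{ $_[0] } },
    GLOB  => sub { \*{ $_[0] } },
    CODE  => sub { \&{ $_[0] } },
);

my %same;
for my $type (sort keys %sample) {
    my $val   = $sample{$type};
    my $fresh = $rewrap{$type}->($val);

    # refaddr() returns the address of the referent, so equal addresses
    # mean both references point at the very same thing.
    $same{$type} = refaddr($val) == refaddr($fresh) ? 1 : 0;
    printf "%-5s same referent: %s\n", $type, $same{$type} ? 'yes' : 'no';
}
```

So the only thing that changes is the scalar holding the reference, which fits the suspicion that the original scalar, and not the data it points to, is what arrives in an odd state.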

On a related note, Inline::Lua is done. 80% of the perldocs are there as well (and 0% of the tests, naturally). The ambitious plans laid out in my last journal entry all turned out to be feasible, even with less effort than I had expected. Perl and Lua can now happily exchange basic types, arrays, hashes, tables, filehandles and functions without getting confused. Tomorrow I'll possibly be able to release it to the CPAN.

Tuesday June 15, 2004
02:10 AM

Lua

A few days ago I had the bright idea of writing an Inline module, just because I was curious how that would work. I chose a language that came to my mind quite spontaneously, partly because I remembered vaguely that it was a language meant to be embedded into other applications.

What I've seen so far of Lua is very impressive. The language is extremely clean and, despite offering only a handful of features and concepts, very powerful. Some interesting things have been integrated into it quite well, such as coroutines and closures. The latter make it feel a bit like a functional programming language with the very nice touch of an imperative syntax. It's even object-oriented.

Its C API is a bit confusing for me as of now. That is probably because I haven't yet written a single program in this language. But the Inline stuff already works quite well for some of the basic Lua/Perl types. The nice thing about Lua is that its types map quite well onto Perl. It knows about functions as a data type so a little bit of currying looks like this:

function foo (a)
    return function (b) return a * b end
end
 
io.write( foo(5)(3) )

Very neat! I already have some ideas about how the inlined Lua functions can return Lua closures back to Perl, as in

use Inline Lua;
 
print foo(5)->(3);
__END__
__Lua__
function foo (a)
    return function (b) return a * b end
end
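For comparison, this is the same curried multiplication written as a plain Perl closure. It is only a sketch of the behaviour a Lua closure handed back to Perl would have to mimic. The name foo_perl is made up here so it cannot be confused with the inlined Lua foo.

```perl
use strict;
use warnings;

# Pure-Perl counterpart of the Lua function: returns a closure that
# remembers $x and multiplies it with whatever it is called with.
sub foo_perl {
    my ($x) = @_;
    return sub {
        my ($y) = @_;
        return $x * $y;
    };
}

print foo_perl(5)->(3), "\n";    # prints 15
```

The calling syntax on the Perl side is exactly the foo(5)->(3) from the Inline example, so a caller would not need to care whether the closure came from Lua or from Perl.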

Saturday June 12, 2004
09:56 AM

Counting the minutes

Just 63 minutes till the kick-off of the opening match of the European football championships. The opener will be Portugal versus Greece. After that, Spain faces Russia, which I am looking forward to even more. Tomorrow France plays England, which is better yet. Oh, yes, and on Tuesday we (Germany) will be facing the Netherlands, which will be interesting as I happen to be in Aachen, which is within walking distance of the Dutch border. I think a humiliating 3-0 for Germany would be in order although I am afraid that the other way round is more likely.

12:35 AM

pigeons on my ledge

Living in the very centre of a city, I am quite accustomed to pigeons being all over the place and usually making annoying sounds etc. Lately I noticed that I always had the same couple of pigeons on the ledge outside my window. There is a flower tub (without any flowers; they shrivelled long ago) on this ledge and they seemed to be very preoccupied with this tub.

After a while I got curious and had a look into the tub to see what was so special about it. To my surprise I found one pigeon egg in it! So apparently they have chosen my ledge and tub as their breeding ground.

Pigeons themselves aren't very interesting animals, you may say, and I would mostly agree. They make the most annoying sounds and have an obnoxious way of walking (with their head moving forward on each step like a woodpecker). However, having two pigeons breed just four meters from where I am sitting right now is quite interesting.

Of course, in the beginning I tried all the funny experiments that spring to mind immediately, like putting an additional chicken egg into the tub to see how they would react. Apparently it's not so easy to confuse them, as they continued breeding without much fuss.

I looked up some things about pigeons in the dictionary and found out that they are monogamous. According to the entry, both the female and the male pigeon engage in the breeding process although I think I've only seen the female one in the past two days. It also says that there are always two eggs in the nest. As there was only one in the tub I was rather sceptical as to how successful their breeding would eventually be.

But yesterday I had a look again and now there are indeed two eggs. From now on it should take around two weeks for their offspring to hatch.

Wednesday June 09, 2004
12:11 AM

Testing vim 6.3

Just installed the latest vim and now have to check that everything still works. For a reason beyond me, 6.3 no longer looks at /etc/vimrc, or maybe my old 6.1 only did it because it was configured to do so.

Saturday May 29, 2004
02:07 AM

Dealing with bounces

When looking at the annoyance factor of unwanted mail, bounce messages (caused by some insane worms randomly sending mails with arbitrary from-addresses) seem to have overtaken ordinary spam. The problem with those is that they pass my spam filters, so now I have to take steps.

I figure that it should be possible to get an almost flawless detection of those bounces with a specially tailored Bayesian filter. Note that I don't want to use an existing Bayes filter (the one that is part of SpamAssassin, for example). I would first have to train it and also, I suspect that real spam and bounces don't have much in common when looking at the words used.

So what I have started doing now is writing a Bayesian filter for bounces. The first thing I wrote was a flex scanner that detects valid RFC822 mail addresses. The scanner gets fed one message. It opens a pipe to another process (the one that does the actual filtering) and writes the mail to this process. The only thing the scanner does is replace every email address it can find in the body with T_MAILADDR or somesuch. If I read RFC822 correctly, the below should be the rules for a valid email address:

    atom            [!#$%&'*+\-/0-9=?A-Z^_`a-z{|}~]+
    dtext           [\x00-\x0C\x0E-\x5A\x5E-\x7F]*
    qtext           [\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]*
    quoted_pair     "\\"[\x00-\x7F]
    quoted_string   "\""({qtext}|{quoted_pair})*"\""
    word            {atom}|{quoted_string}
 
    domain_literal  "["({dtext}|{quoted_pair})*"]"
    domain_ref      {atom}
    sub_domain      {domain_ref}|{domain_literal}
    domain          {sub_domain}("."{sub_domain})*
    local_part      {word}("."{word})*
 
    addr_spec       {local_part}"@"{domain}

This should be a huge advantage for a Bayesian filter since now not every single email address is a word of its own; instead they all get mapped onto one word.

The idea behind that is, of course, that bounce messages tend to have a lot of email addresses in their body. Some of them even include whole header fields, so I could extend the scanner to detect those and generate another token for them.
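To get a feel for the substitution step before touching flex, it can be sketched with a Perl regex. This is a deliberately simplified version and an assumption on my part, not the scanner described above: it only knows dot-separated atoms on both sides of the "@" and ignores quoted strings and domain literals. The name tokenize_addresses is made up for this sketch.

```perl
use strict;
use warnings;

# Simplified subset of the RFC822 grammar: an atom is one or more
# characters that are neither specials, space nor control characters.
my $atom      = qr/[!#\$%&'*+\-\/0-9=?A-Z^_`a-z{|}~]+/;
my $dot_atoms = qr/$atom(?:\.$atom)*/;
my $addr_spec = qr/$dot_atoms\@$dot_atoms/;

# Replace every address found in the text with a single token so that
# the filter sees one word instead of thousands of distinct addresses.
sub tokenize_addresses {
    my ($body) = @_;
    $body =~ s/$addr_spec/T_MAILADDR/g;
    return $body;
}

print tokenize_addresses(
    "Delivery to <joe\@example.com> failed; sender was ann.lee\@mail.example.org\n"
);
```

Because "<", ">" and ";" are specials, they are left in place around the token, so the surrounding structure of the bounce text survives the substitution.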

For now I'll prototype the program that the scanner opens a pipe to in Perl and see whether the approach makes any sense at all. If it does, I can rewrite it in C and have a fairly well-performing Bayesian filter that I can plug into my .procmailrc before SpamAssassin is even triggered.
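Such a prototype could start out as small as the following. This is a minimal naive Bayes sketch with add-one smoothing, not the actual filter. The class names, the training sentences and the helper names (tokens, train, classify) are all invented for illustration, and real training data would of course come from real mail.

```perl
use strict;
use warnings;

my (%count, %total, %docs, %vocab);   # per-class token counts, token totals, message counts

# Split a message into lowercase word tokens (T_MAILADDR becomes t_mailaddr).
sub tokens {
    my ($text) = @_;
    return grep { length } split /[^A-Za-z0-9_]+/, lc $text;
}

# Record the tokens of one message under the given class.
sub train {
    my ($class, $text) = @_;
    $docs{$class}++;
    for my $tok (tokens($text)) {
        $count{$class}{$tok}++;
        $total{$class}++;
        $vocab{$tok} = 1;
    }
}

# Return the class with the highest log-probability for the message.
sub classify {
    my ($text) = @_;
    my $ndocs = 0;
    $ndocs += $_ for values %docs;
    my $vsize = keys %vocab;

    my ($best, $best_score);
    for my $class (sort keys %docs) {
        my $score = log($docs{$class} / $ndocs);            # class prior
        for my $tok (tokens($text)) {
            my $c = $count{$class}{$tok} || 0;
            $score += log(($c + 1) / ($total{$class} + $vsize));   # add-one smoothing
        }
        ($best, $best_score) = ($class, $score)
            if !defined $best_score || $score > $best_score;
    }
    return $best;
}

# Made-up training data, already passed through the address scanner.
train(bounce => "undeliverable mail returned to sender T_MAILADDR T_MAILADDR user unknown");
train(bounce => "delivery failed for T_MAILADDR mailbox full T_MAILADDR");
train(ham    => "lunch tomorrow at noon sounds good see you there");
train(ham    => "patch for the module attached please review");

print classify("message to T_MAILADDR was undeliverable user unknown"), "\n";   # prints bounce
```

Working with sums of logarithms instead of products of probabilities avoids underflow on long messages, which matters once whole bounce bodies are fed in.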

Monday May 24, 2004
12:21 AM

How to test this thing?

I just finished wrapping libstatgrab into Unix::Statgrab. It would now be time to add the tests (yes, I don't write them beforehand). But how am I to write tests for a library that is designed to return different results on each platform and even on each machine?

Maybe I'll just have the tests call each function/method and make sure that they at least do not segfault. I think pulling out values from Config.pm and testing some of them against what the library figures out is a bit too hairy.

On the upside though, this C library is deliberately portable among several unices so I won't have to worry about compilation issues, I hope.
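A smoke test along those lines might look roughly like this. The %probe table holds stand-in subs with invented names and return values so that the sketch runs on its own. In a real test file they would be replaced by the functions the module actually exports.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-ins for the wrapped library calls. The names and
# return values are placeholders, not the module's real interface.
my %probe = (
    host_info  => sub { return { hostname => 'localhost' } },
    load_stats => sub { return { min1 => 0.42 } },
);

for my $name (sort keys %probe) {
    # eval catches Perl-level exceptions. A segfault would kill the whole
    # test process, which the test harness reports as a failure anyway.
    my $result = eval { $probe{$name}->() };
    is($@, '', "$name did not die");

    # Check only the shape of the result, never machine-specific values.
    ok(defined $result, "$name returned something");
}

done_testing();
```

Testing shape instead of values keeps the suite meaningful on every platform, at the price of not noticing when the library reports nonsense.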