excalibor (262)

http://praeter.blogspot.com/

Journal of excalibor (262)

Tuesday September 26, 2006
09:19 AM

Promises, promises...

I've been thinking about lazy evaluation and the like lately, thanks to some healthy reading.

One thing that I've always liked about some languages (and which Perl 6 will have for free) is the ability to have lazy arrays and list comprehensions (Haskell is the brightest example, but Python has had them for a good while now, and they have been proposed in a Scheme SRFI, which is pretty cool).

The nice thing about Scheme's (or Python's) comprehensions is that they are eager, and that rings closer to Perl's world view than Haskell's.

Of course, we've always had something similar to a list comprehension in Perl, namely map.

For example, Haskell's simple example:

Prelude> let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Prelude> a
[0,1,2,3,4,5,6,7,8,9]
Prelude> let b = [ x `mod` 2 == 0 | x <- a ]
Prelude> b
[True,False,True,False,True,False,True,False,True,False]

where 'Prelude>' is GHCi's prompt, can be easily written in Perl as

$ perl -we '@a=0..9;@b=map { $_%2 == 0 ? "True" : "False" } @a; print "@b\n"'
True False True False True False True False True False

Of course, Perl's truth and falsehood values are different: 0 (or '0', '', (), or undef) is false and anything else is true, so we would normally write 0 and 1 rather than 'False' and 'True'.

The real difference between the two cases is which terms are calculated. If we limit this to just the first 3 elements of the comprehension, Haskell will only calculate those, and never test the rest of the elements. And if it's a calculated list, it will even skip the generation of the rest of the list. Perl will build all of it before slicing the result. A lot of unnecessary work.
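To make that eagerness visible, here is a small sketch (the $tested counter is only there for illustration, it is not part of the example above). It slices the first three results out of the map, yet the test still runs on every element:

```perl
use strict;
use warnings;

my $tested = 0;        # counts how many elements the block actually tests
my @a = 0 .. 9;

# map builds the complete result list first; the slice is applied afterwards
my @b = (map { $tested++; $_ % 2 == 0 ? 'True' : 'False' } @a)[0 .. 2];

print "@b\n";                          # True False True
print "elements tested: $tested\n";    # elements tested: 10
```

We only wanted three answers, but all ten elements were tested.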

We may have a hard time making lazy list comprehensions in Perl, but we can make eager list comprehensions of sorts pretty easily. This is part I.

And the trick is to resort to promises, like Scheme does. (There are probably other possibilities, but this is the one I know of :-P ):

We are going to implement a new type, a Promise, that, when created, will return the promise of a future evaluation. Thus:

my $promise = delay(q[foo(2+4)]);

is a promise to, eventually, evaluate foo(2+4), if forcefully asked for:

my $result = force($promise);

We are striving for a simple implementation to see how this works; the details are for CPAN, if I ever upload this there.

Scheme's delay and force work as if defined this way:

(define (delay x) (lambda () x))
(define (force p) (p))

delay is actually defined as a macro (a special form), so its argument is not evaluated when delay is called, only when the promise is forced. We can simulate this with Perl's ability to eval expressions. And thanks to Perl's closures, we can implement this in a straightforward way:

package Promise;

# a bit of debugging info, and we name the argument of delay() so we can close over it (lexically speaking)
sub delay { print STDERR "<@_>\n"; my $code = $_[0]; return sub { eval $code } }
sub force { $_[0]->() }

1;

and test it:

use Promise;

sub foo { $_[0] + 1 }

my $p = Promise::delay(q[foo( 3 + 4 + foo(5) )]);
my $r = Promise::force($p);
print "* $r\n";

Executing it, we get:

$ perl test.pl
<foo( 3 + 4 + foo(5) )>
*

Uh? The debugging statement printed the code we wanted to execute correctly, so what went wrong? Well, if we had done our homework we would be checking the error message left by the string eval, stored in $@ (on failure the eval itself just returns undef), and we would then be able to raise the exception 'Undefined subroutine &Promise::foo called at (eval 1) line 1.'. Let's do it:

(in Promise.pm):
package Promise;

sub delay { print STDERR "<@_>\n"; my $code = $_[0]; return sub { eval $code; die $@ if $@ } }
sub force { $_[0]->() }

1;
(in test.pl):
use Promise;

sub foo { $_[0] + 1 }

my $p = Promise::delay(q[foo( 3 + 4 + foo(5) )]);
my $r = Promise::force($p);
print "* $r\n";
__END__

And now we get:

$ perl test.pl
<foo( 3 + 4 + foo(5) )>
Undefined subroutine &Promise::foo called at (eval 1) line 1.

as expected. Why is this failing? Because the symbol 'foo' is not defined in the Promise:: package, we have defined it in main:: ... There's surely a way to 'upval' foo from main:: to Promise:: (like Tcl does), but we don't have to, for the time being, because we can fully qualify our code-to-be.

(test.pl):
use Promise;

sub foo { $_[0] + 1 }

my $p = Promise::delay(q[::foo( 3 + 4 + ::foo(5) )]);
my $r = Promise::force($p);
print "* $r\n";
__END__

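No exception now, but it still doesn't work: we get no result back. Before the error check, the eval was the last statement in the closure, so its value was returned; now the last statement is the die check, and the value of the eval is lost. Let's fix it: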

package Promise;

sub delay { print STDERR "<@_>\n"; my $code = $_[0]; return sub { my $ret = eval $code; die $@ if $@; $ret } }
sub force { $_[0]->() }

1;
(in the shell):
$ perl test.pl
<::foo( 3 + 4 + ::foo(5) )>
* 14

Now it works!
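Going back to the package problem for a moment: one possible way to avoid the ::foo qualification is to remember the caller's package in delay() and switch to it inside the string eval. This is only a sketch under that assumption; the package name Promise::Sketch is made up for the example, and lexical (my) variables of the caller are still not visible to the evaluated string:

```perl
use strict;
use warnings;

{
    package Promise::Sketch;    # hypothetical name, to avoid clashing with Promise.pm

    sub delay {
        my $code = $_[0];
        my $pkg  = caller;      # the package that called delay(), e.g. "main"
        return sub {
            # evaluate the string in the caller's package, so foo means main::foo
            my $ret = eval "package $pkg; $code";
            die $@ if $@;
            $ret;
        };
    }

    sub force { $_[0]->() }
}

package main;

sub foo { $_[0] + 1 }

my $p = Promise::Sketch::delay(q[foo( 3 + 4 + foo(5) )]);
my $r = Promise::Sketch::force($p);
print "* $r\n";    # * 14
```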

Of course, this is a really naïve implementation, and it may perform particularly badly, as Perl doesn't optimize its tail calls, but it's a beginning. Another problem is that we may be recalculating a promise again and again, which defeats its very purpose of easing calculations; in Scheme a promise is fulfilled once, and the next time you force it you get the stored result instead of a new evaluation. Memoize can help here and save us a lot of cruft, or not, depending on how we construct the code to be evaluated. Evaluating a string is actually suboptimal, but it's the easiest way to avoid Perl's eager evaluation of function parameters. There surely are other ways.
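One of those other ways, as a minimal sketch and not the Promise.pm above: pass a code block instead of a string, and cache the value the first time it is forced. The name delay_once and the $calls counter are made up for this example, and the cache only handles scalar results:

```perl
use strict;
use warnings;

# The (&) prototype lets us write delay_once { ... } without the sub keyword.
sub delay_once(&) {
    my $thunk = shift;
    my ($done, $value);
    return sub {
        unless ($done) {
            $value = $thunk->();    # evaluated on the first force only
            $done  = 1;
        }
        return $value;              # later forces return the cached value
    };
}

sub force { $_[0]->() }

my $calls = 0;
my $p = delay_once { $calls++; 6 * 7 };

print force($p), "\n";                  # 42
print force($p), "\n";                  # 42
print "evaluated $calls time(s)\n";     # evaluated 1 time(s)
```

Since the block is a real closure, it also sees the caller's lexical variables and needs no package qualification, at the price of writing the braces explicitly.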

Anyway, to check that this is really working, let's try something a little bit more involved. Let's calculate numbers. A (potentially) infinite chain of numbers:

(in test_naturals.pl):
use Promise;
# returns a number and a promise to calculate the next one later
sub nat {
    my $N = shift;
    return [ $N, Promise::delay(qq{::nat($N+1)}) ];
}

# getters
sub get_N { $_[0]->[0] }
sub get_next { $_[0]->[1] }
# pretty-printer
sub show_N { print get_N($_[0]), "\n" }

# the number one
my $one = nat(1);
show_N($one);
# the next Natural is, of course, the number two
my $two = Promise::force(get_next($one));
show_N($two);

# let's get some of them...
my @naturals;
my $number = nat(1);
my $num = get_N($number);
while ( $num <= 10 )
{
    push @naturals, $num;
    $number = Promise::force(get_next($number));
    $num = get_N($number);
}

print "@naturals\n";

We create some convenience functions, but don't let 'em fool you. nat() is a function that takes a number, and returns it alongside a promise to calculate the next one.

We check that it works, and then we calculate the first ten Natural numbers. Note that after calling nat(1) to assign it to $number right before the loop, we never explicitly use the function again. It's implicitly called by each promise we force to be fulfilled.
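The same idea gets us back to the "first 3 elements" case from the beginning. The sketch below is self-contained, so it uses plain closures as stand-ins for Promise::delay and Promise::force; take() and smap() are hypothetical helper names, and the $tests counter is only there to show how much work is done:

```perl
use strict;
use warnings;

sub force { $_[0]->() }

# a number plus a closure that promises the next one
sub nat {
    my $n = shift;
    return [ $n, sub { nat($n + 1) } ];
}

# lazily apply $f to every element of a stream
sub smap {
    my ($f, $stream) = @_;
    return [ $f->($stream->[0]), sub { smap($f, force($stream->[1])) } ];
}

# collect the first $count elements, forcing only what is needed
sub take {
    my ($count, $stream) = @_;
    my @out;
    for my $i (1 .. $count) {
        push @out, $stream->[0];
        $stream = force($stream->[1]) if $i < $count;
    }
    return @out;
}

my $tests = 0;
my $is_even = sub { $tests++; $_[0] % 2 == 0 ? 'True' : 'False' };

my @b = take(3, smap($is_even, nat(0)));
print "@b\n";                    # True False True
print "tests run: $tests\n";     # tests run: 3
```

Only three elements are ever generated and tested, even though nat(0) describes an endless sequence.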

The popular name for these 'objects' is Generators. I'm sure there are pretty good implementations of these in books (like in MJD's Higher Order...); but this is the basis, and I didn't find any in CPAN. Hope you find this idea useful, because it is more powerful than it may appear at first (deceptively simple, uh? ;-)

Comments and discussion welcome! Laters!

Tuesday September 19, 2006
06:08 AM

Catalized

OK, I have finally managed to install Catalyst.

It's been kind of nightmarish, and I will recall in this post the whole process, so we have enough background to back up our musings and conclusions.

First, as I have commented before on this journal, CPAN and Ubuntu packaging (apt) don't like each other very much. I guess it would take some tinkering with the perl packages so that the post-install scripts update the perllocal.pod file or something like that, at least for every package installed. Using `dh-make-perl' won't do unless someone finds a way to debianize all the accompanying dependencies found by CPAN.pm in the process; otherwise it becomes too cumbersome.

Following advice, I've made a parallel Perl installation, version 5.8.8, from scratch. Compiling and installing was easy, then came the surprise: LWP was not installed, and CPAN.pm wouldn't deal with my authenticating proxy (!!!). It kept dropping my proxy_user and proxy_pass variables, and not doing what I supposed it had to do. As we have to include our realm when authenticating against the proxy (username@realm), it further confuses some tools' parsers when they deal with http://username@realm:password@server:port/, which is frustrating.

Finally, using privoxy on my local machine so it would provide user and password to the real proxy, I managed to configure curl to get through the whole process, and then came some waiting until CPAN managed to get the files, etc.

Using the cat-install script was certainly painless: I simply had to install by hand a couple of modules I wasn't able to download automatically because they were blocked by my local proxy (argh!), and then installing Catalyst::Devel was simple enough as well.

In the end I only had to install some 234 CPAN packages, and it was working. As an aside I think it's a big bunch of packages, really. I still think a batteries-included Catalyst distribution with all those packages nicely tarballed so you only install those still missing on the system would be a nice idea.

Now I have a Test application available in my local port 3000, and I'll be toying around for a while until I get the gist of it. I'll probably rant about it on this blog, anyway... :-)

In the meantime, Revision 6 (almost!) of the Scheme Report (R6RS, at http://www.r6rs.org) has hit the stores, with lots of nice things to ponder and to drool over a bit... Go and check it out yourself!

laters!

Tuesday September 12, 2006
11:33 AM

Concurrency

While I wait for my freshly installed-from-sources Perl 5.8.8 to run the cat-install script to install Catalyst, I'll mumble about concurrency in Perl, now that I am trying to learn Erlang.

Concurrency can be done at the level of processes (fork(), which is fork(2) and friends on UNIXen) or at the level of threads.

Since Perl 5.8.x we have threads, although they are a bit shaky. However, threads mean semaphores, locks, etc... Ugly.

CPAN to the rescue:

subs::parallel looks very nice. I will have to stress it to see if it's really usable, but it puts Perl threads to use in a very simple way, and I like it.

Parallel::Simple also looks very nice, although it uses fork(). This also has problems, but nowadays fork() is very fast and cheap on UNIXen.

If you have to run the same thing many times, and it can run in parallel, Parallel::ForkControl is very nice. I've used it in the past to run control code on a big subnet, limiting the amount of bandwidth I sucked up, and it improved my partner's program by 3 orders of magnitude (yeah, that's right, 3 hours--counting the sleep we added on purpose--instead of 3 months) by improving the use of the IO bottleneck... :-)

There are other CPAN modules so you don't have to use threads or fork() yourself. Have a look at the module namespaces Proc::*, Parallel::* or forks::*, for example, and see which one suits your needs best.
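For comparison, this is roughly the kind of bare fork()/waitpid() bookkeeping those modules save you from. It is a minimal sketch for UNIX-like systems, using only core Perl; the "job" each child does here (exiting with its job number) is just a placeholder:

```perl
use strict;
use warnings;

my @pids;
for my $job (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: do the work, then report through the exit status
        exit $job;
    }
    push @pids, $pid;    # parent: remember the child
}

# parent: wait for every child and collect its exit status
my %status;
for my $pid (@pids) {
    waitpid($pid, 0);
    $status{$pid} = $? >> 8;
}

my @codes = sort { $a <=> $b } values %status;
print "children exited with: @codes\n";    # children exited with: 1 2 3
```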

BTW, none of this is similar to Erlang's lightweight processes, which use message passing to work, and I don't think we could make a similar thing that worked easily enough.

Termite is a Scheme implementation of Erlang's concurrency model, and a couple of changes to the compiler/interpreter were necessary to implement it in Scheme, so I guess some changes would be needed in perl itself (i.e. in the compiler/interpreter).

Simulating Erlang's ability in Perl is a different story, which I may tackle one of these days if I have enough time... :-P
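Just to give a flavour of message passing with what Perl already has, here is a rough sketch using fork() and two pipes, one per direction. It is nowhere near Erlang's lightweight processes (these are full OS processes, and the line-based "protocol" is made up for the example), but the parent and the child only communicate by sending each other messages:

```perl
use strict;
use warnings;
use IO::Handle;

# one pipe per direction
pipe(my $to_child_r,  my $to_child_w)  or die "pipe failed: $!";
pipe(my $to_parent_r, my $to_parent_w) or die "pipe failed: $!";
$_->autoflush(1) for $to_child_w, $to_parent_w;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # child: receive one message, answer it, and finish
    close $to_child_w;
    close $to_parent_r;
    my $msg = <$to_child_r>;
    chomp $msg;
    print {$to_parent_w} "pong: $msg\n";
    exit 0;
}

# parent: send a message and wait for the answer
close $to_child_r;
close $to_parent_w;
print {$to_child_w} "ping\n";
close $to_child_w;

my $reply = <$to_parent_r>;
chomp $reply;
waitpid($pid, 0);

print "$reply\n";    # pong: ping
```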

In the meantime, I'm back to fighting with CPAN.pm and my proxy, which is stupid enough to be really annoying (argh!). BTW, any way to let CPAN know it has to use the HTTP methods instead of FTP? Using the fully qualified HTTP URI of my proxy doesn't seem to work... :-/

Best regards!

Monday August 14, 2006
03:46 AM

Packaging complexity (huge!)

The other day, the fine folks at Hiveminder announced their new site, made with Perl, using Jifty (http://use.perl.org/article.pl?sid=06/08/08/0249201).

I thought, let's have a look, and went to go for Jifty. It looks really cool! Maybe even as cool as Catalyst looks, yeah.

Holy cow! After three days fighting with CPAN modules, dependencies, failed tests, etc, I have given up.

It probably has to do with the packaging system of my GNU/Linux box and CPAN, which don't get along very well (I mean, there's a difference between doing aptitude install libtest-pod-perl and perl -MCPAN -e 'install Test::Pod', for example). The difference is that in the first case, apt is dealing with dependencies, paths, etc; in the latter, it's CPAN, and they don't think the same way.

This is, of course, very frustrating. I had the same experience with Catalyst, and I also gave up (that's why I said that Jifty looks as cool as Catalyst looks, because I haven't been able to try either of them!).

My rant is that these frameworks are cool, and use a host of standard modules (in the CPAN sense), but they should be available 'batteries included'. I mean: they should not be so fragile as to be useless without all of CPAN at hand.

I work behind a firewall, and CPAN access is, let's say, inconvenient. This may be the best-of-both-worlds way of doing it: you offer the option of downloading a tarball with everything that's needed to get the framework going, and then Makefile.PL (or Makefile, or another program) checks which of the CPAN packages bundled in the tarball it really needs to install on the system.

This is the best of both worlds: I can make a Debian package from the tarball (carefully defining dependencies and so on) and use apt to install the packages I am missing (so my system keeps track of paths, versions, and dependencies), or use CPAN from the local repository.
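The "check which ones are really missing" step could look something like this. It is only a sketch of the idea, not what any real framework's Makefile.PL does, and the module list is made up for the example (the last name is deliberately one that should not exist):

```perl
use strict;
use warnings;

# hypothetical list of modules bundled in the tarball
my @bundled = qw(strict File::Spec No::Such::Module::Hopefully);

# a module is missing if requiring it fails
my @missing = grep { !eval "require $_; 1" } @bundled;

print "need to install: @missing\n";    # need to install: No::Such::Module::Hopefully
```

A real installer would also have to compare versions, not just check that the module loads.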

But I was not offered such a thing. Either it's available as Debian packages or as CPAN tarballs, but I have to go fetching dependencies, and some fail (because I have something set up in a way that package doesn't like) and I'm screwed.

So, what possibilities are there to simplify all this complexity, so that huge frameworks don't need a fresh system (maybe even compiled from the sources) to live happily on production boxes?

Will CPANPLUS solve this? Because if Perl 6 won't be able to benefit from CPAN easily (and I cannot always benefit from it using Perl 5!) then neither Perl 6 nor these frameworks will have an easy path to success in the real world. Especially the web frameworks, with Ruby on Rails, Django, webpy, and all the Java frameworks out there...

(enough ranting, back to work)

Wednesday August 02, 2006
08:09 AM

Back!

Well, after a long, long time, I'm back.

Not that it's such great news, actually, but, well, it's what's here to be (sorta).

I'll link this blog to my main blogging site (about the historical fiction writing I am doing), which is also in English, so you can check over there as well.

As I'm still being paid to program in Perl, I find myself happy, yet longing for job offers from my homeland, as I'd love to move back there.

Thus, I'll rant about all those Java offers, and about the wonderful things of daily Perl programming, and some things I'd love to find fixed (so I don't have to fix them myself ;-)

Stay tuned if you want, and laters!