All the Perl that's Practical to Extract and Report

inkdroid (3294)

  (email not shown publicly)
AOL IM: inkdroid
Yahoo! ID: summe_e
Jabber: inkdroid

inkdroid is a person, not a robot. however, inkdroid likes ink. inkdroid likes perl too.

Journal of inkdroid (3294)

Wednesday August 20, 2003
06:26 AM

timeless way of building (2)

In my last journal entry I found it difficult to summarize what Alexander meant by the "quality without a name" because he expresses the idea so well himself, and the idea (by definition) kind of defies description. The same can be said for the second part of the Timeless Way of Building entitled "The Gate". This section of the book outlines the idea of a pattern language, which is the main ingredient in creating places, buildings, towns (software perhaps) which have the quality without a name. He does such a masterful job at describing what a pattern language is, that it is hard to review it here...but here it goes anyway.

The basic idea is that during the act of building a builder will assemble component pieces to create a new work. These component pieces are themselves made up of component pieces. The new work is itself just a part of a larger whole, which is being built in components. The components and their relationships together make up the patterns of a 'pattern language', which is shared by a community of builders. These components or elements of the language can include both physical objects (town square, religious center, main street), and also events that take place in these places (markets, weddings, walking). In fact the places and events are inextricably bound together. The 'pattern language' is the complete set of options that a builder has available when building. The languages vary from place to place, and from time period to time period. And as you might expect, pattern languages aren't always used explicitly; builders often use them implicitly. Choices are always made, patterns are copied from various places, and are incorporated in new ways. Most importantly, some patterns (and hence pattern languages) are able to generate and sustain life, while others inhibit and end up destroying life.

This is where things get interesting: the difference which makes a pattern (and the language it is a part of) alive. Alexander argues that the specialization which makes much of modern life possible also encourages a dislocation in the components of a pattern language:

If I build a fireplace for myself, it is natural for me to make a place to put the wood, a corner to sit in, a mantel wide enough to put things on, an opening which lets fire draw.

But, if I design fireplaces for other people--not for myself--then I never have to build a fire in the fireplaces I design. Gradually my ideas become more and more influenced by style, and shape, and crazy notions--my feeling for the simple business of making fire leaves the fireplace altogether.

So, it is inevitable that as the work of building passes into the hands of specialists, the patterns which they use become more and more banal, more willful, and less anchored in reality.

It's extremely important that patterns remain useful and rooted to their purpose, or else they risk becoming abstracted, useless and dead.

Another point Alexander makes reminded me of the Perl philosophy, and open source in general, since pattern languages must be shared if they are to be alive.

The acts of design which have been thought of as central are acts which use the structure already present in these underlying languages to generate the structure of specific buildings.

In this view, it is the structure of the underlying language which is doing most of the hard work. If you want to influence the structure of your town, you must help to change the underlying languages. It is useless to be innovative in an individual building, or an individual plan, if this innovation does not become part of a living pattern language which everyone can use.

And we may conclude, even more strongly, that the central task of 'architecture' is the creation of a single, shared, evolving, pattern language, which everyone contributes to, and everyone can use.

Alexander goes on to outline how to explicitly describe a pattern as a rule so that it may be shared. Essentially, all patterns have the form:

Context --> System of Forces --> Configuration

Where context is the arena in which the pattern operates, the system of forces is the dynamic which the pattern is attempting to resolve, and the configuration is the resolution to the conflict. This method of description will make more sense if you read the book. It's not really relevant, but I couldn't help but be reminded of the idea of a triple in RDF, where each statement has the form:

Resource -> Property -> Value

I think this is because I'm enjoying O'Reilly's new RDF book at the moment too :-) At any rate, after Alexander provides details about how to describe a pattern it is very clear what he means by it. He stresses that it is hard work finding these deep life-sustaining patterns, but he provides instructions on how to go about it. What's more, these instructions use architecture as the model, but they seem to abstract so well to software development. It's no wonder that this book inspired others to look for patterns in software. In building construction patterns had been shared implicitly for a long time, but the specialization and professionalization of architecture have led to these patterns being dropped in favor of ones that could be expressed as blueprints etc. What's more, these blueprints are guarded as business secrets, and are not shared. All this leads to a breakdown in the patterns, and the overarching pattern language that they make up. The process of discovering and sharing patterns in an unambiguous way is a technique for countering this trend. "The Gate" was every bit as fascinating as "The Quality", and I've still got one more section to go...up next "The Way".

Thursday July 31, 2003
01:43 PM


I work with PHP and Perl a lot during the day. I've been internally bewailing the fact that there is no POD-like documentation format for PHP. At least none that comes standard, as POD and javadoc do for Perl/Java. Then I was looking closely at one of the PHP class files I was editing and noticed that someone (it had to be petdance) had done this:


/*

=head1 NAME

Page - A base class for all HTML pages.

=cut

*/

class Page {

/*

=head2 Page()

The constructor for the Page object. Pass in the args, etc, etc.

=cut

*/

function Page() {
     ... do PHP stuff here...

So it's PHP but it has POD embedded in it within PHP multiline comments. So then you can run perldoc on the PHP file, and kazaam you get a nice POD document on the screen. I have no idea why this never occurred to me before, the solution is so simple it's pure genius :) Thanks Andy, this made my day.
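The same trick can be checked from Perl without shelling out to perldoc. This is only a minimal sketch: the PHP text in the heredoc is a made-up stand-in for the real class file, and the `=cut` lines are my assumption about how the POD blocks were closed. It uses Pod::Simple::Text, which ships with Perl, to pull the POD out of the PHP comment and render it as plain text.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Stand-in for a PHP class file with POD hidden inside a /* ... */ comment.
my $php = <<'PHP';
<?php
/*

=head1 NAME

Page - A base class for all HTML pages.

=cut

*/

class Page {
}
PHP

# Render only the POD; everything outside the POD blocks is ignored.
my $text   = '';
my $parser = Pod::Simple::Text->new;
$parser->output_string( \$text );
$parser->parse_string_document($php);

print $text;
```

The PHP interpreter sees a comment, and the POD parser sees documentation, so neither tool needs to know about the other.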

Sunday June 22, 2003
11:07 AM

timeless way of building (1)

Recently I've been reading about design patterns. The thing I really like about them is that they provide guidance in finding successful ways to solve problems with workable models. In my reading so far I kept running across the name Christopher Alexander, and I decided to pick up his book The Timeless Way of Building. Alexander is an architect and mathematician, whose ideas about pattern languages largely inspired the design pattern movement in computer programming. I really enjoyed reading jplindstrom's notes about the Mythical Man Month, so I thought I could post some of my notes about the Timeless Way of Building here.

When I first began seeing myself as a professional programmer I was working in New York City. On my way to and from work I would often pass all sorts of construction projects. Sometimes I would see workmen assembling scaffolding to make structural repairs to a building, or drilling into the road revealing a vast network of cables and pipes. The act of creating new structures, and fixing existing ones in the city's complexity somehow reminded me of programming. For some reason I enjoyed thinking about programming as a type of building...especially all the craft that goes into it. So it wasn't a stretch for me to pick up The Timeless Way of Building.

The Timeless Way of Building has an interesting format. It is broken down into three parts: The Quality, The Gate, The Way. Each part is then broken down into chapters, and each chapter is made up of brief sections, each section having its own 'headline'. Alexander designed the book this way so you can read large chunks quickly to get the feeling of the book as a whole.

The title says the book is about building, but as you might guess from the three parts and the "timeless" adjective, there is a lot of philosophy thrown into the mix. This is especially true in the first few chapters, where Alexander starts out talking about the elusive "quality without a name", which is present in those moments we experience when things just seem "right". He is able to talk about this quality by approximating it with words like "alive", "whole", "comfortable", "free", "exact", and "eternal", while showing how the approximations don't completely describe the quality without a name. He goes on to examine how places can exhibit this same quality, and how each place and building is actually a collection of elements and relationships among those elements...which he calls patterns. For example:

Consider a typical mid-twentieth-century American metropolitan region. Somewhere towards the center of the region, there is a central business district, which contains a very high density office block; near these there are high density apartments. The overall density of the region slopes off with distance from the center, according to an exponential law; periodically there are again peaks of higher density, but smaller than the central ones; and subsidiary to these smaller peaks, there are still smaller peaks. Each of these peaks of density contains stores and offices surrounded by higher density housing. Towards the outer fringe of the metropolis there are large areas of freestanding one family houses; the farther out from the center they are, the larger their gardens. The region is served by a network of freeways. These freeways are closer together at the center. Independent of the freeways, there is a roughly regular two dimensional network of streets. Every five or ten streets, there is a larger one, which functions as an artery. A few other arteries are even bigger than the others: these tend to be arranged radially, branching out from the center in a star-shaped fashion. Where an artery meets a freeway, there is a characteristic cloverleaf arrangement of connecting lanes. Where two arteries intersect, there is a traffic light; where a local street meets an artery, there is a stop sign. The major commercial areas, which coincide with the high density peaks in the density distribution, all fall on the major arteries. Industrial areas all fall within half a mile of a freeway; and the older ones are also close to at least one major artery.

In much of the early chapters Alexander is training the reader to look for patterns. He also begins to examine patterns that are 'alive', or self sustaining, and those that are 'dead' or destructive. What's more the patterns of a place are inextricably bound up with the activities that take place there.

The town which is alive, and beautiful, for me, shows, in a thousand ways, how all its institutions work together to make people comfortable, and deep seated in respect for themselves.

Places outdoors where people eat, and dance; old people sitting in the street, watching the world go by; places where teenage boys and girls hang out, within the neighborhood, free enough of their parents that they feel themselves alive, and stay there; car places where cars are kept, shielded, if there are many of them, so that they don't oppress us by their presence; work going on among the families, children playing where work is going on, and learning from it.

And finally the quality without a name appears, not when an isolated pattern lives, but when an entire system of patterns, interdependent at many levels, is stable and alive.

A building or town becomes alive when every pattern in it is alive: when it allows each person in it, and each plant and animal, and every stream, and bridge, and wall and roof, and every human group and every road, to become alive in its own terms.

And as that happens, the whole town reaches the state that individual people sometimes reach at their best and happiest moments, when they are most free.

It is very difficult for me to describe the effect that this first section "The Quality" had on me. Near to the end of the section I found myself being reminded of what I enjoy so much about the Perl programming language: the vibrant community, diversity of interests, the wealth of CPAN where programmers can share work together...all of these just seem very full of life. I'm looking forward to the next section "The Gate" where Alexander asks:

Is there a fluid code, which generates the quality without a name in buildings, and makes things live? Is there some process which takes place inside a person's mind, when she allows herself to generate a building or a place which is alive? And is there indeed a process which is so simple too, that all the people of society can use it, and so generate not only individual buildings, but whole neighborhoods and towns? It turns out there is. It takes the form of language.

The most refreshing thing about this book so far is that it provides a lens for thinking about programming and design that is outside of the ordinary realm of computer science. I can't wait to get on to the next section.

Thursday June 19, 2003
11:18 AM

my $day = YAPC::NA->new( day => 3 );

  • My last day in Boca started out listening to Piers Cawley talk about refactoring. Refactoring is the process of improving the design of some code, without changing its functionality. Refactoring also involves the art of looking for repetition in your code, and eliminating it. Piers spoke a bit about refactoring in general, and then set out to refactor some of his own code (I think it was Class::Builder). He used the audience as his colleague pair programmer, and people were more than happy to point out missing parens, or quotes. Piers used Test::Class to write his tests. Apparently Test::Class is very much like the JUnit framework (which my colleague Mike O'Regan really likes a lot). I don't know much about JUnit, so I was interested to learn that the original was SUnit written by Kent Beck for Smalltalk. I need to check out Test::Class now. It seemed to me that Piers was talking more about testing than refactoring, but perhaps the two are so intertwined it's impossible to talk about one without the other.
  • Next I headed off to hear Peter Chines talk about exceptions. Unfortunately I missed a significant chunk of the beginning of the talk. Peter went over the various standard functions for throwing exceptions and warnings in Perl (croak, carp, confess) and provided guidance on why they were important to use. He also showed how signals could be used to automatically add information to all die messages (a handy trick) and talked about CGI::Carp, which automatically logs the name of the generating program and a timestamp to STDERR. As 2shortplanks pointed out, it would've been nice to see some examples of throwing objects as exceptions. And someone piped up at the end about Damian's Coy module, which translates Perl's regular messages into soothing haikus.
  • Directly following Peter was Mark Fowler talking about extending Template::Toolkit. Mark motored through an amazing amount of material about TT. The essential point I took away is that it's remarkably easy to extend TT by subclassing Template::Plugin. For some reason I find the TT framework much easier to grok than Mason. Perhaps it is because the writing of the templates seems like a new language, totally independent of Perl. Perhaps the comparison doesn't hold since they really are quite different in some ways: Mason is an Apache application development environment, and TT is a more generalized templating system. I'm really looking forward to seeing the new ORA book on TT.
  • After lunch I headed over to hear Ken Williams talk about Machine Learning. This was a huge topic, and Ken really could've had a whole day to talk about this fascinating stuff. Ken summarized ML as any system that improves (or changes) as it receives training examples. Clustering, categorization, recognition, and filtering are all examples of ML systems. It becomes feasible to use ML when you have too much data to sift through, people are too slow at doing the sifting, and when you can afford to be wrong occasionally. Perl is a useful ML language since you've got CPAN at your fingertips, it is a quick prototyping language (and you may not throw away the prototype :), and if you need it you can drop down to C via XS for speed. As an example Ken described how the use of decision trees could improve SpamAssassin. SA has over 600 attributes that it uses to identify a spam message. The attributes all have (+/-) weights associated with them, which when summed together will mark an email as spam if the sum exceeds a certain threshold. The problem with 600 attributes is that they all need to have rules associated with them, and these rules must individually be applied to come up with the final sum...which requires a fair amount of time and processing. Ken wrote a program that uses the 600 SA attributes, and a collection of spam/ham available from SA, to generate a decision tree to identify spam. The benefit of a decision tree is that it doesn't need to process each rule, but only a subset of rules that are tuned to each email as it comes in. Ken used the AI::DecisionTree module to do the hard stuff, and plugged the tree into GraphViz for displaying the nodes. Pressed for time, Ken quickly went through an example of collaborative filtering, which most people experience when they go on Amazon and are presented with a list of recommendations. Ken wrote another program which used Search::ContextGraph to analyze the CPAN non-core modules that 30 people had installed. This allowed him to say if you have LWP::UserAgent installed, you are likely to also like URI :-) which makes sense. It would've been cool to have half a day or even a full day of this stuff...I felt like we were just scratching the surface.
  • I managed to stay awake with the help of some sugar-rich cookies and hear Michael Rodriguez talk about XML Modules. He provided a really nice summary of a large chunk of the XML modules that are available on CPAN. They all descend from two C ancestors: expat, and libxml2. They are generally divided into two camps: event based parsers (SAX), which process the XML as it comes in and throw events (callbacks); and DOM based parsers, which read the entire file into an in-memory data structure and then allow you to query it. One of the things that Michael said which really stuck with me is that XML should really only be used as an interchange format, and not as a live data format. What he meant was that XML is really good for exchanging data with other people, but once you've got it you should parse it and get the data into your relational database ASAP. Another thing that I liked was his joke that XML was really the revenge of Java programmers who need angle brackets around everything so that they can parse text easily. Of course Java now has Perl-like regular expressions, so things have probably improved (or have they?).
  • Unfortunately I wasn't able to make it to Damian's closing talk because I had an early flight. I took the bus from FAU to the tri-rail, got off at the wrong station, had to take a bus into Ft. Lauderdale, and then another bus to the airport. It was good to get to see the city at any rate. It was cool to hear what I thought was French on the buses, and everyone seemed so relaxed and friendly...perhaps it's the fine weather. I finally got to the airport and my flight was delayed a couple hours, so I probably could've heard Damian speak after all! Overall it was a great/informative time. Thanks Perl Foundation, and Follett for sending me this year.
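To make the notes from Peter Chines's exceptions talk a bit more concrete, here is a minimal sketch of two of the ideas: croak reporting an error from the caller's point of view, and a `$SIG{__DIE__}` handler adding information to every die message. The `Demo::Config` package, the `parse_port` helper and the `[port-demo]` prefix are all made up for illustration; this is not code from the talk.

```perl
use strict;
use warnings;

package Demo::Config;    # hypothetical package, for illustration only
use Carp qw(croak);

# croak blames the caller's line rather than the line inside this helper.
sub parse_port {
    my ($value) = @_;
    croak "port must be a number, got '$value'" unless $value =~ /^\d+$/;
    return $value + 0;
}

package main;

my $error = '';
{
    # Every die (including the one inside croak) passes through this
    # handler, which re-dies with a prefix added to the message.
    local $SIG{__DIE__} = sub { die "[port-demo] $_[0]" };

    eval { Demo::Config::parse_port('eighty'); 1 } or $error = $@;
}

print $error;
```

Because the handler is installed with `local`, it only applies inside the block, which keeps the prefixing from leaking into unrelated code.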
Tuesday June 17, 2003
08:43 PM

my $day = YAPC::NA::->new( day => 2 );

  • I spent the AM listening to Arthur Bergman talk about ithreads, Perl's threading mechanism. Arthur was a measured speaker, clearly comfortable with his material, and with being in front of a packed room full of hackers. I didn't know much about threads apart from what I skimmed in the latest camel. Apart from the basics here are the things I took from the talk. If you are using ithreads you must compile perl with them built in. If you want to use threads you should use the latest snapshot of perl (5.8.1 should be out "any day now"). Compiling in threads will cost you a 10% performance hit...even if you are writing a program that doesn't use threads explicitly. mod_perl2 is using ithreads big time in support of Apache2's Multi Process Modules (MPMs)...and RT3.0 supposedly runs under MPM. Arthur got a Perl Foundation grant to get POE to work with ithreads, and also has plans to create a which will make all sorts of information about the running Perl interpreter available to Perl programs. So much stuff was covered, even a few scary dips into the Perl source. Best of all Arthur ended the session by asking the audience for suggestions on a simple threading application to write as an example. Several ideas were thrown around: a MUD server, a card shuffler, a customer request broker. Finally Ken Williams suggested that we write a voting application that would allow people to connect to a port on Arthur's presentation machine and make votes for the sample code to write. No sooner said than done, Arthur had it hacked out in 15 minutes, and then we could see as people connected and placed their votes. Unfortunately the fix was in, since one of the POE hackers wrote a program to cast a ton of votes for the MUD server.
  • After lunch I headed over to hear Brian Ingerson talk about extreme programming tools for module authors. Brian's talk was really refreshing since he clearly has a lot of experience implementing extreme programming ideas, but he didn't seem to be sold on it as a religion...just as a set of practical techniques. He stressed agile strategies more than extremeness. But man, his presentation was extremely extreme. He presented his slides using CGI::Kwiki, using some extension he wrote for slideshows (which I think I heard him creating just the night before in the dorm lounge with some other folks). It was a trip, since folks in the audience were able to modify his slides before he got to them! At one point he hit next and came to this page (hit ENTER to reveal the various states of the page). There were so many good things in this talk, you can see the full contents here. Here are a few of the things that stuck in my head. The best things about extreme programming are: testing, saw sharpening (having good tools), refactoring, and collaboration. When writing modules you should: make it ridiculously easy to use, scale to any task, easy to install, few pre-requisites, have a good descriptive/catchy name, and the first release should be good enough, but not complete. Ingy also commented in an offhand way that you should imagine that you are writing Perl itself when you are writing a module for CPAN. I thought that was a simple and neat idea, in that you should try to emulate the flexibility, utility and friendliness of Perl when you write modules to extend it. He had some time to demonstrate the CGI::Kwiki module, and Module::Build which illustrated the concepts he covered in the beginning. I'd really like to set up a CGI::Kwiki for when I get back. Ingy graciously cut his session short so we could go to the lightning talks.
  • The lightning talks were as usual very humorous. Highlights for me were 3D madness in Perl by Pierre Denis, WWW::Mechanize by Uri Guttman (he mentioned my colleague Andy Lester), and of course the Allison's Restaurant sing along. You may have had to be there...but you can because some of the slides are up already. I just hope someone captured it on video :)
08:05 AM

my $day = YAPC::NA->new( day => 1 );

Day 1 of YAPC was great and I am looking forward to more today. In a nutshell here's what I got to hear about yesterday:
  • Damian Conway talking about the Perl6 language. I've been subscribed to the perl6-lang list for a bit, but hearing Damian talk about the new stuff that is coming along is really helpful. He has so much enthusiasm and energy for the new language constructs, and it is contagious. No word yet on *when* Perl6 is going to be in alpha or beta, but perhaps this will come at his closing talk at the end.
  • Dave Rolsky talking about dates/times in Perl. This talk was fascinating, mainly because of the breadth of Dave's knowledge of dates, and the date/time Perl project which he organized (and which is coming along really well). I particularly liked the overview of the various time standards and history of calendars that Dave presented. And of course the comparison (strengths and weaknesses) and summary of existing date/time modules on CPAN (over 35 of them!) was invaluable. The main thing I took from the talk is that you never want to store offsets such as -5:00 GMT in the database (store a zone name like "America/New_York" instead), and that you want to use the date/time project (and help them extend it). I also had never heard of Olson's timezone database before, and never really thought how the 2038 32bit time bug will start to take effect rather soon in mortgage calculations :)
  • After lunch it was more Damian talking about how he has customized his personal work environment to achieve the goal of more laziness. First he talked about some cool macros that he wrote for optimizing his key strokes in Vi. Damian's point was not to use Vi or his macros, but to get in the mindset where you become aware of where you are spending your time, and then take the time to write a tool, or an optimization. Also, he stressed the mantra that the computer should adapt to people, and that people shouldn't have to adapt to the computer. I need to check out Text::Autoformat (which he said people should look at), and he recommended OSX users take a look at a simple/free app called XShelf.
  • After that I headed over to hear Chris Winters talk about generating Java w/ Perl. He is working at a utility company that must use Java (for the usual reasons), and they have a huge database (500 tables), and needed a bunch of rudimentary classes. He was primarily a Perl programmer, and learning Java on the job, so he wrote Perl programs that wrote all the repetitive Java classes from metadata. I learned about the difference between passive code generators (for example Visual C++) whose output you can edit, and active code generators which have output you should not edit.
  • After that it was more Rolsky talking about advanced Masonry. I started fading near the end of the day, and was thinking about my Open Archives project a bit; so I only woke up enough near the end to ask how he saw Mason relating to the Template Toolkit.
  • After that I went back to the dorms, did a little coding, and then went off to the Boston party to have a few drinks. Met some cool people, and fell asleep pretty late :)
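The 2038 point from Dave's talk is easy to check with nothing but core Perl. A rough sketch, assuming a perl recent enough that gmtime handles values beyond the 32-bit range: the largest value a signed 32-bit time_t can hold falls in January 2038, and one second later a wrapped counter lands back in 1901. The 30-year term is just my example of a long-running calculation.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $format = '%Y-%m-%d %H:%M:%S';

# Largest value a signed 32-bit time_t can hold.
my $last_second = 2**31 - 1;
my $last_stamp  = strftime( $format, gmtime($last_second) );

# What a wrapped 32-bit counter would report one second later.
my $wrapped_stamp = strftime( $format, gmtime( -2**31 ) );

# A 30-year mortgage crosses the limit if it starts in this year or later.
my $first_affected_year = 2038 - 30;

print "last representable second: $last_stamp UTC\n";       # 2038-01-19 03:14:07
print "after wrapping:            $wrapped_stamp UTC\n";    # 1901-12-13 20:45:52
print "30-year loans affected from: $first_affected_year\n";
```

So any code that computes a final payment date with 32-bit epoch seconds starts producing nonsense for loans written from 2008 on, well before 2038 itself.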
Saturday June 14, 2003
04:07 PM


I want two CPAN modules, which may or may not exist:


An XML::SAX handler that does what XML::Simple does. So if I have some XML like this:

    <foo a="1">
        <bar>baz</bar>
        <bar>bar</bar>
    </foo>

I would be able to do this:

    my $foo = XML::Filter::Simple->new();
    my $parser = XML::SAX::ParserFactory->parser( Handler => $foo );
    $parser->parse_string( $xml );

    ## same kind of data structure as XML::Simple
    print $foo->{ a };        # prints 1
    print $foo->{ bar }[0];    # prints baz
    print $foo->{ bar }[1];    # prints bar


I want to be able to keep up to date with the goings on of my congressman, senators, and I want Perl to help me.

use Politics::US::Senator;
use Politics::US::Bill;

my $senator = Politics::US::Senator->new( 'Durbin, Richard' );
foreach my $vote ( $senator->votes() ) {

     my $decision = $vote->decision();
     my $billNum = $vote->billNumber();
     my $bill = Politics::US::Bill->new( $billNum );

     print
          "Today Durbin cast a vote of $decision regarding $billNum\n",
          "The bill's title is: ", $bill->title(), "\n",
          "And here is the content of the bill:\n";
     # ...followed by the text of the bill itself
}

Between the Senate website, and Thomas and WWW::Mechanize this isn't so far fetched at all.

Friday June 06, 2003
01:01 PM

pacman (eclipsed)

Today's Astronomy Picture of the Day of the solar eclipse last week proves that the eclipse was really a video game simulation...just as I suspected.
Tuesday June 03, 2003
10:50 AM

attack of the killer hashes

By now you've probably already seen the news articles about this new form of DoS, which essentially makes the attacked computer do lots of work without bombarding it at the network layer. I didn't realize that the attacks were demonstrated on Perl regexes and hashes. I haven't read the article yet, but here's an excerpt from the original.

We present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures. Frequently used data structures have ``average-case'' expected running time that's far more efficient than the worst case. For example, both binary trees and hash tables can degenerate to linked lists with carefully chosen input. We show how an attacker can effectively compute such input, and we demonstrate attacks against the hash table implementations in two versions of Perl, the Squid web proxy, and the Bro intrusion detection system. Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks.
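To get a feel for what "degenerate to linked lists" means, here is a toy sketch. The hash function below (a sum of character codes) is deliberately weak and is my own assumption for illustration; it is not the function Perl uses internally, and the keys are not the ones from the paper. The point is only that an attacker who knows the hash function can pick keys that all land in one bucket.

```perl
use strict;
use warnings;

# Toy hash function (illustration only): sum of character codes.
sub toy_hash {
    my ( $key, $buckets ) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $buckets;
}

# Put the keys into a chained hash table and return the longest chain.
sub longest_chain {
    my ( $buckets, @keys ) = @_;
    my @table = map { [] } 1 .. $buckets;
    push @{ $table[ toy_hash( $_, $buckets ) ] }, $_ for @keys;
    my $max = 0;
    for my $chain (@table) {
        $max = @$chain if @$chain > $max;
    }
    return $max;
}

# Ordinary-looking keys spread across the buckets.
my @benign = map { "key$_" } 1 .. 64;

# "ac" and "bb" have the same character sum, so every 6-block
# combination of them is a distinct key with an identical hash.
my @hostile = map {
    my $n = $_;
    join '', map { ( $n >> $_ ) & 1 ? 'ac' : 'bb' } 0 .. 5;
} 0 .. 63;

printf "benign keys:  longest chain %d\n", longest_chain( 16, @benign );
printf "hostile keys: longest chain %d\n", longest_chain( 16, @hostile );
```

With the hostile keys all 64 entries share one bucket, so every lookup walks a 64-element list; scale that up and each insert costs time proportional to the number of keys already stored, which is where the denial of service comes from.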

Friday May 23, 2003
06:44 AM

data collection

The Wall Street Journal has a nice article about data collection by the US Federal Government after 9/11. Interesting to note that the Total Information Awareness program has been renamed to the Terrorism Information Awareness program. Among the database initiatives covered is this story:

Examples of how local police records can live on in federal databases are surfacing in Denver, where the police department recently released documents in response to a lawsuit by the American Civil Liberties Union. They show the police intelligence unit had secretly built a computer database full of personal details about people active in political groups. Included were a Quaker peace-advocacy group called the American Friends Service Committee and right-wing causes such as the pro-gun lobby.

The Denver department is purging people not suspected of a crime from the records. But last summer, when a man listed in the Denver files as a gun-rights group member got into a fender bender, a police officer checking VGTOF found him described as "a member of a terrorist organization" and part of a "militia." According to a Denver police memo, the officer reported the stop to the FBI as a "terrorist contact." The Denver police and the FBI decline to comment on how the man ended up in VGTOF.

Sure mistakes happen, but the problem is that restrictions on the accuracy of the data are being loosened as the amount of data is being increased. The loosening is probably so that TIA is legally allowed to link together these new data sources. Bruce Schneier has a nice piece debunking that whole idea with a bit of math. The EFF is doing a great job fighting this stuff in Congress. They have their work cut out for them, because it's hard to speak rationally when large amounts of fear are involved.