All the Perl that's Practical to Extract and Report

use Perl

iburrell (4155)

Journal of iburrell (4155)

Saturday June 02, 2007
06:57 PM

Reverse proxy mirror

Does anybody know of software that could be used to build a mirroring reverse proxy? I am looking for something to cache and mirror files for distributions and other archives of mostly static files.

To clients, it looks like a copy of the archive. If a file does not exist, or is too old, it is fetched from the master servers and saved to disk. The files can be populated from CDs or mirrored through rsync to get complete copies. Ideally, it would load balance from multiple mirror servers.
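The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not an existing tool: the class name `MirrorCache`, the `fetch` callback, the `max_age` parameter, and the round-robin choice of master are all assumptions made for the example.

```python
import os
import time


class MirrorCache:
    """Serve files from a local tree, refilling from master servers on demand."""

    def __init__(self, root, masters, fetch, max_age=24 * 3600):
        self.root = root          # local directory that looks like the archive
        self.masters = masters    # list of upstream base URLs (hypothetical)
        self.fetch = fetch        # callable(master, rel_path) -> bytes
        self.max_age = max_age    # seconds before a local copy counts as too old
        self._next = 0            # index of the master to try first next time

    def get(self, rel_path, now=None):
        """Return the local filename for rel_path, fetching it if needed."""
        # Note: a real server would also have to reject paths containing "..".
        now = time.time() if now is None else now
        local = os.path.join(self.root, rel_path)
        if os.path.isfile(local) and now - os.path.getmtime(local) <= self.max_age:
            return local  # present and fresh: serve straight from disk

        errors = []
        count = len(self.masters)
        for i in range(count):
            master = self.masters[(self._next + i) % count]
            try:
                data = self.fetch(master, rel_path)
            except OSError as exc:  # urllib errors are OSError subclasses
                errors.append(exc)
                continue
            # Start with the following master next time (naive load balancing).
            self._next = (self._next + i + 1) % count
            os.makedirs(os.path.dirname(local) or ".", exist_ok=True)
            tmp = local + ".part"
            with open(tmp, "wb") as handle:
                handle.write(data)
            os.replace(tmp, local)  # atomic rename, readers never see half a file
            return local
        raise OSError(f"no master could supply {rel_path}: {errors}")
```

Because the files are stored under their normal names, the same tree can be pre-populated from CDs or rsync, which is what the opaque mod_disk_cache layout does not allow.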

It is possible to set something up with Apache2, mod_proxy, and mod_disk_cache, but that setup cannot load balance between servers and it stores files in an opaque cache.

Would Perlbal be usable for this? Could it be extended? I think a mod_perl module could be written.

Sunday September 17, 2006
08:27 PM

CPAN Black Hole

I hate CPAN maintainers who don't respond to bugs. I have been putting in bugs for problems I find in various CPAN modules. For some, I have even put in patches which fix the bug. The result has been complete silence.

I don't expect maintainers to drop everything and fix the bug. I would hope that they would at least send an acknowledgement that they have seen the bug and might work on it. A snarky "that is not a bug but a feature", "it has been fixed in the latest release", or "you are an idiot, do this instead" is better than silence.

If you post a module to CPAN, you have an obligation to maintain it. I suspect fixing bugs is more beneficial to the community than writing new, cool stuff, even if it is really boring. The user has an obligation to report bugs they encounter so the maintainer is not in the dark. One big advantage of open source is that the user can provide a fix or at least a test case. The maintainer's responsibility is to apply the patch if it is suitable, not to let it sit there gathering dust.

The best example is bug 27, a major limitation in URI which is only 5 years old. People have even provided patches.

Sunday August 07, 2005
09:23 PM

New Blog

I started a blog at with personal and technical stuff.

I am using Textpattern which is written in PHP. I realize now why PHP is popular for the web. Installing involved downloading the archive, unpacking in the web directory, and going to a setup.php page which made the database and configured everything. Everything was done through a web interface and there were no modules to install.

Does anyone have any suggestions on how to integrate this journal and an outside blog?

Monday February 02, 2004
08:14 PM

HTTP URLs and authentication

Microsoft just posted a security update for Internet Explorer. One of the fixes is to disable usernames and passwords in HTTP URLs. This is to prevent spoof sites from using URLs that confuse naive users about which site they are browsing. Unfortunately, it violates the URL standard and breaks useful behavior.

This affects one of our websites. The protected section of the website uses basic authentication. The username and password are put in the URL to access the content pages. This is a pretty stupid way of doing authentication since the username and password are exposed. But the content pages are accessed through a CGI script and in a frame so the URLs are not visible.

I think Microsoft should have chosen a different solution to the problem: not showing the username and password in the URL bar or status bar. The username (and auth mechanism) should show in a "Page Info" dialog. Hiding information is bad (it would make debugging harder) but deceiving users is worse.
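To make the problem concrete, here is a small Python sketch that parses the userinfo part of a URL and produces a display form with the credentials hidden. The host names and the `describe` and `display_url` helpers are made up for illustration; only `urlsplit` and `urlunsplit` come from the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def describe(url):
    """Split out the pieces a browser would use for authentication."""
    parts = urlsplit(url)
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
    }


def display_url(url):
    """Return the URL without username or password, for a URL bar."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # simplification: IPv6 brackets are not restored
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# A legitimate use: credentials for a basic-auth protected page.
legit = describe("http://alice:secret@intranet.example/reports")

# A spoof: everything before the "@" is only a username, not the site.
spoof = describe("http://www.example.com@attacker.example/login")
shown = display_url("http://www.example.com@attacker.example/login")
```

In the spoof case the real host is `attacker.example` and `www.example.com` is merely the username, so showing only the stripped form in the URL bar removes the confusion without disabling the feature.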

To make it clear when authentication happens, the login dialog should pop up with the username and password filled in from the URL. This makes it obvious that authentication is happening. It also lets the user see the username and password. This doesn't help with spoof sites, which probably don't use any authentication.

Wednesday October 15, 2003
03:18 PM

Spam and Authentication

I have been reading lots of articles recently about how to combat spam. One thing some of the articles have touched on is the problem of authentication. Email doesn't authenticate where the email comes from or is going. It is trivial for spammers or viruses to fake the From address and the return path. It is trivial for them to send their email through open relays or blast it directly to the victim mail servers.

One possible solution is to introduce authentication into the SMTP protocol. This wouldn't protect the From: header in mail messages. But it would protect the return path that is used for bounce messages. It can also be used for access control. This is the difference between knowing the message came from your friend Bob, and knowing that someone sent the message.
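The distinction between the From: header and the return path is easy to miss, so here is a minimal Python sketch of it. The addresses and the `build_mail` helper are invented for the example, and nothing is actually sent.

```python
from email.message import EmailMessage


def build_mail(envelope_from, header_from, rcpt, body):
    """Build a message whose From: header is independent of the envelope sender."""
    msg = EmailMessage()
    msg["From"] = header_from   # what the reader's mail client displays
    msg["To"] = rcpt
    msg["Subject"] = "hello"
    msg.set_content(body)
    # The envelope sender (return path) travels in the SMTP "MAIL FROM" command,
    # not in the message itself. With smtplib it would be passed separately:
    #   smtplib.SMTP(host).send_message(msg, from_addr=envelope_from, to_addrs=[rcpt])
    return envelope_from, [rcpt], msg


envelope_from, rcpts, msg = build_mail(
    "bulk@sender.example",          # where bounces go
    "Bob <bob@friend.example>",     # what the recipient sees
    "you@example.org",
    "Hi there",
)
```

Nothing in plain SMTP ties these two values together or checks either one, which is why authenticating the sending side only vouches for the envelope and not for the header.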

Introducing public-key cryptography into the protocol would not be too hard. SMTP has a mechanism for extensions and adding commands. However, any public-key signature system would depend on distributing the keys and enabling the access control. This requires infrastructure to regulate the sending of email. It requires more centralization in deciding who can send email. It also requires organizations to buy into the system before it helps in limiting the spread of email.

One way to help with the infrastructure problem is to have companies that provide the authentication services. They would run relays that sign messages for their customers. The customers would sign up for accounts, with either monthly or per-email fees to limit the amount of email that could be sent through the system. The relay companies need to be able to authenticate their customers, but existing SMTP authentication or SSL client certificates are widely supported by email clients.

There would also need to be a mechanism for ISPs to join the system. They would need to get certificates that could be used for signing messages. And even become CAs for creating new certificates for servers and clients.

Authentication would be used to create a group of mail servers that can trust where email is coming from. They can send bounce messages. The postmaster and abuse addresses work and are answered.

It also helps tie the email system into the legal system. If a spammer breaks into an account and sends a million emails that would normally cost $10,000 to send through a commercial relay, that is a much more serious offense than using some unknown number of open relays. Likewise, forging a cryptographic signature is much more serious than just putting some characters in an email.

Saturday May 10, 2003
09:15 PM

Mistake about SQLite

I made one mistake in my article about SQLite. They recently changed the comparison operators to work based on type instead of by value. I found this out from a Linux Journal article on SQLite. I am glad that the argument for typing prevailed. If they put in some validation of types when inserting and updating, then that would be enough to preserve data integrity.

Monday May 05, 2003
10:38 PM

SQLite and typelessness

I have been looking at the SQLite database and the DBD::SQLite module for accessing it. SQLite is a database with some nice features: embeddable library, small and fast, single file, transactions, and pretty complete SQL92 support. This makes it really useful for places where a full database like PostgreSQL is overkill but you want DBI and SQL and the fun of relations.

It has one big problem: it is typeless. You would think that after programming in Perl for a long time, I would like typelessness. In a dynamic language, it works pretty well. In a database, it is pure evil. The reasons have nothing to do with performance or encoding. Or the historical usage of limited sizes and types in databases. The problem is with the logical definition of types, domains and operators.

Types have a domain, the set of allowed values. Boolean has a simple domain of two values, true or false. There are the sets of positive integers, rational numbers, and irrational numbers. There are all the strings shorter than 2 GB. The domain is completely separate from the encoding. The integer 3 can be represented in many different ways (binary, octal, decimal), but they are all logically the same.

Typing is important for databases because it restricts the values that can be entered in a column. If I define a column to hold a price, I make it numeric. It shouldn't be able to hold any other values. I don't want anyone to insert their dog's name or their manifesto for world peace into that column. I expect it to always be a number that can be added or subtracted. I think strong typing is more important for databases because they are about rigidly specifying the permanent storage of data. Perl variables are fleeting; databases are permanent and logical.
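As an illustration of what restricting a column looks like in practice, here is a sketch using Python's bundled sqlite3 module. Note the assumptions: that module wraps SQLite 3, which is newer than the version discussed in this entry, and the `items` table, the `try_insert` helper, and the `CHECK` constraint are one possible way to get the validation, not something the SQLite of 2003 did for you.

```python
import sqlite3


def make_db():
    """Create an in-memory table whose price column only accepts numbers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        " name TEXT NOT NULL,"
        " price NUMERIC NOT NULL"
        "  CHECK (typeof(price) IN ('integer', 'real'))"
        ")"
    )
    return conn


def try_insert(conn, name, price):
    """Insert a row; return False if the database rejects the value."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO items (name, price) VALUES (?, ?)", (name, price)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
accepted = try_insert(conn, "widget", 9.99)   # a real price
rejected = try_insert(conn, "dog", "Fido")    # not a number
total = conn.execute("SELECT SUM(price) FROM items").fetchone()[0]
```

With the constraint in place the dog's name never reaches the column, so a later `SUM(price)` only ever sees numbers regardless of which program did the insert.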

Types also define a set of operators that can operate on values of the type. In Perl, the numeric and string operators are distinct: > and gt, == and eq. This is because the operations are different. Comparing integers is different from comparing the same integers as strings: 10000 > 1001 but "10000" lt "1001". Adding two numbers is different from concatenating two strings even though some languages use the same operator.
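The same comparisons can be tried in Python, used here only for illustration. Python is one of the languages that reuses a single operator and picks the operation from the type of the operands instead of having separate operators as Perl does. The variable names are arbitrary.

```python
# Numeric comparison: 10000 is the larger number.
numeric_greater = 10000 > 1001

# String comparison goes character by character: "1000" < "1001" at the 4th char.
string_less = "10000" < "1001"

# The same "+" adds numbers but concatenates strings.
number_sum = 2 + 3
string_sum = "2" + "3"

# Sorting shows the difference on a whole list.
part_numbers = ["10000", "1001", "999"]
as_strings = sorted(part_numbers)            # lexical order
as_numbers = sorted(part_numbers, key=int)   # numeric order
```

The two sort orders are different, which is exactly why the choice between string and numeric comparison has to be made deliberately and not guessed from what the values look like.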

SQL doesn't have the distinction of separate operators; it uses types to figure out which operator to use. The way SQLite tries to figure out which operator to use is unintuitive. It tries to see if it can parse the values as numbers and use the numeric comparison; otherwise it uses string comparison. Instead, it should use the type of the column. I found a web site talking about SQLite that got confused about how it works. This is the kind of confusion that makes bugs. If I am sorting part numbers, I want them compared as strings even though some of them happen to look like numbers. Similarly, if I am adding a column of integers, I don't want any fractional values or Fido to mess up the results. I want to be able to add the interval of '1 day' to a date and get a proper date as the result.
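For comparison, here is what column-driven comparison looks like, sketched with Python's bundled sqlite3 module. That module wraps SQLite 3, which postdates this entry and takes the declared column type into account, so the example shows the behavior being asked for, not the behavior being criticized. The `parts` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# part_no is declared TEXT, qty is declared INTEGER.
conn.execute("CREATE TABLE parts (part_no TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO parts (part_no, qty) VALUES (?, ?)",
    [("10000", 10000), ("1001", 1001), ("A7", 5)],
)

# Part numbers sort as strings even though two of them look like numbers.
by_part = [row[0] for row in conn.execute(
    "SELECT part_no FROM parts ORDER BY part_no")]

# Quantities sort as numbers.
by_qty = [row[0] for row in conn.execute(
    "SELECT qty FROM parts ORDER BY qty")]
```

Here `by_part` comes back in lexical order and `by_qty` in numeric order, because the comparison follows the column and not the appearance of each value.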

A big advantage of databases is that there are many different ways to access the data. It is possible (and wise) to check all values in the Perl code. But programmers make mistakes, and any code that skips the checks could corrupt the database. This is especially bad since everyone expects SQL and databases to be strongly typed and writes SQL code that will break otherwise.

I think it is great that SQLite wants to free us from the encoding of values and size limits. Why should a varchar be limited to 255 or 8000 characters? Well, when it is a credit card number with at most 16 digits. Also, SQLite has some

A good comparison to SQLite is PostgreSQL. PostgreSQL is an object-relational database with a rich type system. It has extensible, user-defined types, and operators and functions can be written in multiple languages, including Perl. It has a standard syntax from SQL99 for specifying type conversions.