perrin's Journal perrin's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-01-25T02:07:53+00:00 pudge Technology hourly 1 1970-01-01T00:00+00:00 perrin's Journal There is no Perl web hosting problem! <p>I keep seeing people refer to some kind of crisis in the world of cheap Perl web hosting. They talk about ISPs not supporting mod_perl and invoke PHP as an evil horde conquering all hosts.</p><p>Well, I'm fine with PHP, and congratulate it on its success. However, Perl is not in trouble.</p><p>First, for those who do want the full power of a mod_perl installation, or want to choose their own version of perl, their own webserver, etc., virtual hosting with root is available for about $30-$35 a month. That's not much for getting your own box. And yes, there's mod_perl hosting out there too, but I can't see why you'd use it when you can get full control this cheaply.</p><p>But obviously not everyone wants to administer their own system, so for everyone else there is FastCGI. Hosts with FastCGI are available for $5-$10 a month! Even if PHP became available for $2, that wouldn't be enough of a difference to matter. And FastCGI works just fine for perl.</p><p>There are other things you can complain about if you really want to: installing modules on some hosts is not obvious, and configuring FastCGI on some is not well-documented. However, unlike pricing, those are problems you can solve yourself. So please stop with the ISP panic. Perl web hosting is alright.</p> perrin 2009-07-08T13:31:03+00:00 journal Do you order your sub definitions? <p>Just curious. I usually try to arrange them so that most calls to other subs in the same file are forward references and the internal subs are defined close to where they are called most prominently. 
Does anyone else think about this?</p> perrin 2009-04-29T21:11:13+00:00 journal CHI::Driver::DBI released <p>If you haven't seen it yet, CHI is the new replacement for the nice-but-slow Cache::Cache modules. I uploaded a DBI driver for it to CPAN. If you're skeptical of the value, try benchmarking MySQL against memcached. It's pretty close for this kind of thing, and on a local connection MySQL wins by a fair amount. It's also a lot simpler than trying to get memcached or Cache::FastMmap going on some random shared hosting provider.</p><p>I have another driver for IPC::MMA mostly done, which is looking extremely fast. After that will be BerkeleyDB.</p> perrin 2009-03-20T23:36:07+00:00 journal Is he talking about JavaScript or Perl? Here's a great quote from Douglas Crockford's "JavaScript: The Good Parts":<blockquote><div><p>The worst features of a language aren't the features that are obviously dangerous or useless. Those are easily avoided. The worst features are the attractive nuisances, the features that are both useful and dangerous.</p></div> </blockquote> perrin 2008-09-12T20:15:33+00:00 journal Why don't profilers default to wall time? It seems like at least once a week someone on PerlMonks or a mailing list I'm on is confounded by their profiler results. (Recent example <a href="">here</a>.) They look at the output, and nothing seems to be taking much longer than anything else, so they start chasing phantoms like method call overhead. <p>The problem is that they're measuring the wrong thing. Is your program taking too much CPU, or is it taking too much time? If you're running on a computer built in the last decade, it's probably time that's the issue. So why do we profile CPU cycles? </p><p>Nearly every program outside of the scientific community (or other math-heavy situations) is I/O bound. You don't see the I/O when you profile CPU time. It looks like it's not even there. 
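To see the difference concretely, here is a small self-contained sketch comparing wall-clock time to CPU time around a simulated slow call (a plain sleep standing in for a database query):

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Compare wall-clock time against CPU time for an I/O-style wait.
# A CPU profiler charges almost nothing to sleep(), even though the
# program spent a full second "working" from the user's point of view.
my $wall_start = time();
my @cpu_start  = times();    # (user, system) CPU seconds so far

sleep 1;                     # stands in for a slow database call

my $wall = time() - $wall_start;
my @cpu  = times();
my $cpu  = ( $cpu[0] + $cpu[1] ) - ( $cpu_start[0] + $cpu_start[1] );

printf "wall: %.2fs  cpu: %.2fs\n", $wall, $cpu;
```

A CPU-time profile of this program would show almost nothing worth optimizing, while a wall-time profile would correctly attribute the whole second to the "query."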
</p><p>I've seen this happen so many times on the Class::DBI, Template Toolkit, and mod_perl lists. People show up all desperate to eliminate every method call or rewrite their templating module in C based on a profile they did. Then you tell them to do one by wall time and they find out that 99.9% of the run time is actually waiting for database calls to come back or something similar.</p> perrin 2008-06-12T19:43:05+00:00 journal perldoc pain You know when you perldoc a module, see what you need in the example code, copy it, paste it into your program, edit it a little, and then... it doesn't work? And you get really strange error messages? And finally you realize it's because the single-quotes or pipes that you pasted are some kind of insane unicode characters rather than the normal ASCII thing they look like? <p>I hate that.</p> perrin 2008-04-15T21:41:57+00:00 journal "monkeypatching" considered bananas <p>Who would've thought? A Ruby programmer finally acknowledged that writing code which modifies other people's classes at runtime <a href="">might not have been the best idea</a> for writing maintainable code. He advocates writing and using actual APIs instead.</p> perrin 2008-02-26T06:05:08+00:00 journal MacOS X is a harsh mistress <p>When MacOS X came out, combining things I like about the Mac with Unix, I was tempted. But I waited. I figured I'd give them time to iron out the bugs, time for the hardware to catch up to the new rendering technology, etc.</p><p>Now that Macs are running Intel chips, I figured I had waited long enough, and I bought one of the new iMacs. It's a really nice machine, and except for some glitchy rendering here and there (tooltips that remain on the screen, for example), it's a solid OS.</p><p>However, getting a dev environment that's as productive as Linux has been a real chore. 
Here are some of the things I've been dealing with:</p><p> <b>Terminals</b> </p><p>You don't expect to spend a lot of time fussing with terminals these days. They should just work, right? Well, I first tried the terminal app that comes with the OS. It had broken key mappings for page up/down, which I fixed. It also makes only a really half-hearted attempt to do X11-style mouse paste, which is really irritating since I'm so accustomed to it.</p><p>I tried an alternative, called iTerm. It has better mouse paste support, but it has broken arrow key mappings. When was the last time you had to fix your arrow keys on Linux? It's like 1997 all over again.</p><p>Both of these seem to confuse my screen session about their terminal capabilities. iTerm does better at showing color (e.g. in man page headings) but both of them cause screen to do its horrific visual bell ("Wuff wuff!") until I manually tell it to use a normal bell.</p><p> <b>Text Editors</b> </p><p>Everyone seems to love TextMate. I do plan to try it, but it's very unlikely that I'm going to pay $80 for a closed source text editor in this day and age. If you're one of those people who loves TextMate and thinks it's worth the money, I'd be interested to hear your reasons.</p><p>I have tried TextWrangler, and found it pretty decent in some ways. I got perltidy wired up to it without trouble, and the SFTP browser is pretty lame, but works. I also like the Emacs key support. However, it has no code folding, and the syntax coloring is pretty weak compared to most Linux editors, like Kate, vim, and Emacs.</p><p>I tried Eclipse + EPIC, which is actually better than I expected. Syntax coloring is nice, and it feels pretty responsive. On the downside, I can't figure out how to make a filter for perltidy to work on just a selected region rather than a whole file. I also can't figure out how to get<nobr> <wbr></nobr>.html files with Mason code in them to switch into Perl mode. 
And then there's all the project-oriented Java cruft in the menus.</p><p>I tried a promising-looking Emacs port called Aquamacs. It's Emacs with keys remapped to normal Mac bindings, and easy-to-use font menus, separate windows for each buffer, etc. It looked pretty nice, but none of the elisp extensions I want to use (tramp, mmm-mode) seem to work with it. I admit to being a novice at this stuff, but I gave it a pretty good try and couldn't make them work.</p><p>The Carbon Emacs package from Apple worked great with elisp packages. In fact, it's really nice all-around: good syntax coloring from cperl and mmm-mode, tramp for SSH access, easy perltidy integration. The only real issue I've had with it is how poorly it plays with the rest of the Mac. Copying and pasting something from the terminal or Firefox into it seems to require some tricky incantations. I haven't been able to do it without resorting to clicking on menus. I suspect there's three or more different copy/paste systems happening at once here and they are not playing well together. It also took me a crazy amount of effort to change my font size, which is very non-Mac but somewhat expected from Emacs.</p><p>Overall I do like my Mac, but it's disturbing how reminiscent of the early days of Linux this has been. I may resort to running X11 with an xterm and maybe Kate if I can figure out how to get it running. I figure if I have to I can always run a double-boot (or Parallels) with Fedora. If anyone has tips about getting X11 mouse paste to work on the Mac, or taming Carbon Emacs, or getting Kate to run, pass them over.</p> perrin 2007-09-30T23:23:59+00:00 journal Which qualities are most important to project success? <p>Straw poll time. 
Rank these factors in order of how much effect they have on whether or not a project will succeed:</p><ol> <li>Good documentation.</li> <li>Choice of implementation language.</li> <li>Good (and plentiful) hardware.</li> <li>Coding skills of team members.</li> <li>Quality of project management.</li> <li>Automated tests.</li> <li>Clean, well-organized code.</li> <li>Choice of framework or tools (aside from language).</li> <li>Non-rushed deadline.</li> </ol> perrin 2007-09-13T23:14:39+00:00 journal the limits of object-relational mappers <p>At Plus Three, we built a large project using Class::DBI. When we started the project, Class::DBI was the ORM that best met our needs. I applied some patches from the mailing list and added a couple of CPAN modules and some custom code in order to get these features:</p><p>- LIMIT support for MySQL on all search queries.<br>- Ability to retrieve all records from one table with a sort ("retrieve_all_sorted").<br>- Ability to run any search query as a count instead of returning records.<br>- A safe version of find_and_create. The existing one was not atomic.</p><p>Surprisingly, after these enhancements, Class::DBI 0.96 proved up to the task for the entire project.</p><p>Since then Rose::DB::Object and DBIx::Class have matured, and some other interesting things have come along. These have much more complete querying abilities than Class::DBI. They generate multi-table joins with no trouble. They can fetch related objects in one query in order to avoid multiple trips to the db.</p><p>The obvious question is, how much different would the code be if it had been written with one of these instead. And the answer? It would be reduced, but not as much as you might imagine.</p><p>See, the great thing about Class::DBI is that it's very easy to add custom SQL to your classes. You just set up a SQL statement that returns the fields you want from the current class and give it a name and you have a custom search. 
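As a minimal sketch of what that looks like (the `cd` table and its columns are made up, and the `__PACKAGE__->connection(...)` setup is omitted), a named custom search in Class::DBI goes roughly like this:

```perl
package My::CD;
use base 'Class::DBI';

# A made-up table for illustration; real code would also call
# __PACKAGE__->connection(...) with DBI connect arguments.
__PACKAGE__->table('cd');
__PACKAGE__->columns( All => qw(cdid artist title release_date) );

# Naming a SELECT this way makes Class::DBI generate a
# search_new_releases() class method that returns My::CD objects.
__PACKAGE__->set_sql( new_releases => qq{
    SELECT __ESSENTIAL__
    FROM   __TABLE__
    WHERE  release_date > ?
    ORDER BY release_date DESC
});

package main;
my @recent = My::CD->search_new_releases('2006-01-01');
```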
You can even generate SQL programmatically at run-time and use it to select objects. It is limited by the fact that it can only result in a list of objects from one class, but in practice that is rarely an issue.</p><p>We used this feature extensively. We had a complex, normalized db, and many carefully tuned SQL queries. Looking at the SQL we wrote, I would guess that about half of it is relatively simple JOIN and LEFT JOIN queries that would be eliminated (or automated) by a more capable ORM. The rest though is beyond the capabilities of existing ORMs.</p><p>What's in it? Subqueries, both as derived tables (in the FROM clause) and as NOT EXISTS queries. Transaction control, with SELECT...FOR UPDATE. Database-specific extensions like INTERVAL and GROUP_CONCAT. Import/export commands like LOAD DATA INFILE. UPDATE statements that use joins. Temporary table creation. Not all applications require this level of sophistication with SQL, but I suspect all of the ones with large amounts of data and moderately complex schemas do.</p><p>In a few cases, an ORM could get the same results by writing the query in a simpler way. However, performance would suffer. You can get on your soapbox and cite some C.J. Date stuff you read about the relational model and how the phrasing of the query shouldn't matter, but in the real world it matters a great deal. And this is as true with Oracle as it with MySQL.</p><p>Could ORMs learn this? They could probably learn some of it. They could expand their coverage of SQL to include many of these operations. They could embed some common wisdom about how to optimize certain types of queries for one database or another, although this would still not always be correct.</p><p>In the end though, what's the point of an ORM that has all the complexity of SQL? It doesn't gain you anything unless it makes things simpler, which means it has to ignore a large amount of the capabilities of SQL. 
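As a sketch of the kind of statement that stays hand-written, here is one of those NOT EXISTS queries run through plain DBI; the connection details, table, and column names are all hypothetical:

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection; RaiseError saves checking every call.
my $dbh = DBI->connect( 'dbi:mysql:warehouse', 'user', 'secret',
                        { RaiseError => 1 } );

# Find customers with no order in the last 90 days. The NOT EXISTS
# subquery and the INTERVAL arithmetic are the sort of thing no ORM
# of this era will generate for you; the schema here is made up.
my $rows = $dbh->selectall_arrayref( q{
    SELECT c.customer_id,
    FROM   customer c
    WHERE  NOT EXISTS (
        SELECT 1
        FROM   orders o
        WHERE  o.customer_id = c.customer_id
          AND  o.created > NOW() - INTERVAL 90 DAY
    )
}, { Slice => {} } );

printf "%d idle customers\n", scalar @$rows;
```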
An ORM is mostly about making the easy work of simple fetches and saves as automated as possible, not about creating an impenetrable shield between your programmers and SQL.</p><p>What all this means to me in practical terms is that an easy way to use custom SQL is the most important feature to look for in any ORM. With a simplistic one like Class::DBI you have to go to custom SQL too soon, but even with a more powerful one you will eventually have to go there.</p> perrin 2007-08-20T20:03:20+00:00 journal My OSCON slides are up <p>The slides from my talk, "Care and Feeding of Large Web Applications", are <a href="">available for download</a>. These have a couple of corrections from the YAPC version, but nothing major.</p> perrin 2007-08-07T21:38:33+00:00 journal don't be so dismissive <p>I don't know if this is an actual trend, or I'm just noticing it more, but I feel like people are being more dismissive of others' work these days. Nat Torkington actually touched on this a little at OSCON in his keynote (which is <a href="">available online</a>, incidentally.) There's a tendency to just write off entire popular projects with some kind of sweeping generalization. We all make jokes about the "competition" now and then, but lately it feels more vicious.</p><p>I'll give you a couple of examples. First, MySQL. I heard lots of snide remarks about MySQL at OSCON. Some people went as far as to say that if everyone would use Postgres instead, none of the scaling techniques we hear about (like splitting your db up into shards on multiple servers) would be necessary.</p><p>Think about who uses MySQL: Yahoo, Google, etc. These people have enough money to try Postgres, and a huge financial incentive to look for something that would make their database scaling easier. Don't you think they might have tried it? Maybe it didn't meet their needs. Maybe it's better at certain things than Postgres.</p><p>Another popular target is PHP. 
People who have never used it slam it left and right as being a tool for idiots. The fact is, some of the smartest people I know do a lot of work in PHP. It's not a toy language anymore. It has nice OO support. It has a profiler that works more reliably than Devel::DProf.</p><p>The problem with this attitude is that what goes around comes around. I recall being at an open source content management conference and having a Java fan derisively say to me "People still use Perl?" I also experienced how this looks from the other side: this guy came off as an arrogant fool.</p><p>Tools that become very popular, like MySQL, PHP, and Perl, have good reasons for it. Even if they aren't your chosen tools, keeping enough of an open mind to learn from what they do right is worth it. I know I learn a great deal from articles and talks for Java programmers, even though I haven't used it as a primary language since I stopped working at Scholastic.</p> perrin 2007-07-31T19:27:18+00:00 journal MySQL bulk loading techniques <p>I'm working on the next generation of the data warehouse described <a href="">here</a> by Sam Tregar. This time, I'm trying to keep it up-to-date in near real-time. Because of this, I have several new constraints on how I load data: </p><ul> <li>The tables are all InnoDB. This is necessary because there will be long-running queries running on these tables while data is being loaded, and that requires the MVCC support in InnoDB. MyISAM tables would block updates while anyone is reading. Incidentally, contrary to what people often claim on Slashdot, converting these tables to InnoDB improved query performance quite a bit. Only the loading speed suffered compared to MyISAM tables.</li> <li>We can't disable the indexes while loading. This is a huge help in the current system where we load into an off-line database. The <code>ALTER TABLE DISABLE KEYS</code> and <code>ALTER TABLE ENABLE KEYS</code> commands allow the indexes to be rebuilt in bulk. 
If we did this to an on-line table though, anyone using it would suddenly have no indexes available. Also, InnoDB doesn't have the same bulk index creation optimization, although this is supposed to be coming soon.</li> <li>The incoming data will be a mix of new rows and updates to existing rows.</li> <li>Some of the loads will be partial data for a table, i.e. not all data loads cover all columns in the target table.</li> </ul><p> So, I had a few ideas of ways to load the data and wanted to see what would give me the best results. I made a quick and dirty benchmark script and tried them out on a relatively small table (~50K rows) which I loaded with 20K rows and then tested ways of copying the full data in, meaning a combination of new rows and updates. Here are the results. </p><p> The fastest approach is an INSERT...SELECT with an ON DUPLICATE KEY UPDATE clause. That looks a bit like this:</p><p> <code> INSERT INTO foo_test SELECT * FROM foo ON DUPLICATE KEY UPDATE bar=VALUES(bar), baz=VALUES(baz),... </code> </p><p>This was pretty fast, coming in at 29 seconds. Some people have trouble with the <code>INSERT...SELECT</code> because it takes a shared lock (like <code>SELECT...FOR UPDATE</code>) on the source table while it runs. This is apparently fixed in MySQL 5.1 by using row-based replication. It's also not really an issue for us because we're doing this work on a replicated database, so the worst case is that the replication falls behind a bit while the statement runs. </p><p> Although it won't work for us, I tried <code>REPLACE</code> as well, just to see how it compared. It was quite a bit slower, coming in at 54 seconds, or almost twice as long. </p><p>I considered trying a combination of <code>INSERT IGNORE...SELECT</code> and a bulk <code>UPDATE</code> (using a join), but figured this would do poorly if the <code>SELECT</code> had any real work in it, since it would be running twice. 
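Since the ON DUPLICATE KEY UPDATE list has to name every column, it's convenient to generate the statement rather than maintain it by hand. A small helper (using the same made-up `foo`/`foo_test` names as above) might look like:

```perl
use strict;
use warnings;

# Build the ON DUPLICATE KEY UPDATE clause from a column list, so the
# statement doesn't have to be edited by hand as columns change.
sub upsert_sql {
    my ( $target, $source, @cols ) = @_;
    my $updates = join ', ', map { "$_=VALUES($_)" } @cols;
    return "INSERT INTO $target SELECT * FROM $source"
         . " ON DUPLICATE KEY UPDATE $updates";
}

my $sql = upsert_sql( 'foo_test', 'foo', qw(bar baz) );
# $sql can then be handed to $dbh->do() on a DBI handle.
print "$sql\n";
```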
</p><p> The most common workaround for people who have trouble with the <code>INSERT...SELECT</code> locking is to use a temporary file with <code>SELECT INTO OUTFILE</code> and <code>LOAD DATA INFILE</code>. I tried that next. The dump is really fast, taking only 1 second. Loading is complicated by the fact that you can't do updates with <code>LOAD DATA INFILE</code>, so I decided the best thing would be to load the data into a temporary table and then do an <code>INSERT...SELECT</code> from that. </p><p>I got that load to go very quickly by making my temp table a MyISAM one and running an <code>ALTER TABLE DISABLE KEYS</code> on it before loading. It loaded in 3 seconds. Then I did the same <code>INSERT...SELECT</code> from the temp table which took the same 29 seconds (and I never built the indexes because I didn't need them). In total, the temp file only added 4 seconds or 14% overhead. This seems like a good solution for people who run into locking issues. </p><p>Then I tested using two database handles, one to <code>SELECT</code> and one to <code>INSERT/UPDATE</code>, pumping the data from one to the other. I was pretty sure I couldn't beat the <code>INSERT...SELECT</code> with this approach, but we have some situations where we need to process every row in perl during the load, such as geocoding addresses or applying logic that gets too ugly when done in SQL. </p><p> I played around with the frequency of commits and with MySQL's multi-row <code>INSERT</code> extension, and got this reasonably fast. It ran in 43 seconds, or a bit less than 50% slower than the <code>INSERT...SELECT</code>. </p><p> Looking at how fast the <code>LOAD DATA INFILE</code> was, I tried a different approach for processing every row, doing a <code>SELECT</code> and writing the rows out with Text::CSV_XS. Then I loaded that file into a temp table with <code>LOAD DATA INFILE</code> and did an <code>INSERT...SELECT</code> from the temp table as before. </p><p> This was much better. 
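Sketched in code, that dump-process-load pipeline looks something like the following; the table names, the columns, and the per-row `geocode()` step are all made up for illustration:

```perl
use strict;
use warnings;
use DBI;
use Text::CSV_XS;

sub geocode { return $_[0] }    # hypothetical per-row perl work

my $dbh = DBI->connect( 'dbi:mysql:warehouse', 'user', 'secret',
                        { RaiseError => 1 } );

# Stream the source rows (mysql_use_result avoids buffering the whole
# result set), process each one in perl, and write them out as CSV.
my $csv = Text::CSV_XS->new( { binary => 1, eol => "\n" } );
open my $fh, '>', '/tmp/foo.csv' or die "can't write CSV: $!";

my $sth = $dbh->prepare( 'SELECT id, address FROM foo',
                         { mysql_use_result => 1 } );
$sth->execute;
while ( my $row = $sth->fetchrow_arrayref ) {
    $row->[1] = geocode( $row->[1] );
    $csv->print( $fh, $row );
}
close $fh or die $!;

# Bulk load into a temp table, then upsert into the real one.
$dbh->do('CREATE TEMPORARY TABLE foo_load LIKE foo_test');
$dbh->do(q{LOAD DATA LOCAL INFILE '/tmp/foo.csv' INTO TABLE foo_load
           FIELDS TERMINATED BY ',' ENCLOSED BY '"'});
$dbh->do(q{INSERT INTO foo_test SELECT * FROM foo_load
           ON DUPLICATE KEY UPDATE bar=VALUES(bar)});
```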
Dumping the <code>SELECT</code> with Text::CSV_XS only took 3 seconds and, combined with the 4 second load, it only adds 24% overhead and gives me a chance to work on every row in perl. It's also much simpler to code than the multi-row <code>INSERT</code> stuff. </p><p> I should point out that working with these large data sets row by row requires the "mysql_use_result" option, which makes the server spool results to the client instead of dumping them all at once. I activate it for specific statement handles like this: </p><p> <code> my $sth = $dbh-&gt;prepare($sql, {mysql_use_result =&gt; 1}); </code> </p><p>If anyone has additional ideas, I'd be interested in hearing them. For now, I'm happy with the first approach for updates that can all be done in SQL and the last one for updates that require perl processing on every row. </p> perrin 2007-05-14T22:35:43+00:00 journal Bruce Perens seems uninformed about Apache HTTPD <p>There's an article <a href="">here</a> about Bruce Perens manipulating the headers on some Lighttpd servers to look like Apache for Netcraft stats purposes. At the end of this article, he says this:</p><blockquote><div><p>"Apache is desirable if you want mod_perl, or mod_some-interpretive-language. That's an old-fashioned way of programming. Most newly-architected Web sites decouple the dispatcher running the interpretive language code from the Web server. These days, something like lighttpd works better for most sites. Open Source is about evolution."</p></div> </blockquote><p> This statement is confused on many levels. To begin with, I don't think many people choose Apache HTTPD because they want to run mod_perl, but that's subjective so I'll skip it.</p><p>Look at the part about separating the interpreter from the web server. This has been the recommended mod_perl configuration for busy sites for at least the past 10 years. 
The mod_perl docs have recommended running a light front-end server (Apache with mod_proxy, or Squid, or whatever) and using a mod_perl-enabled Apache as an "application server" behind the scenes. It's essentially identical to running FastCGI or the Java servlet daemons that came along later in terms of architecture.</p><p>And how about that evolution? Lighttpd quickly became popular with the Ruby crowd, but then some of them became annoyed with its FastCGI implementation. So what did they do? They wrote their own HTTPD that runs the Ruby code and used Lighttpd as a front-end, proxying the dynamic requests to it. Hmmm, where have I heard that before? </p><p>The larger point is that the Apache 2 HTTPD is a lot more than a web server. It's really a modular framework for writing networked servers. Everything, right down to the HTTP protocol, can be replaced with a module, and it can be done in multiple languages, including Perl.</p><p>Lighttpd might be the best server for Perens' application, and it's a high-quality open source application. I just wish he would get his facts straight about Apache and mod_perl.</p> perrin 2007-04-18T03:07:22+00:00 journal tech lessons <p>Reading <a href=",1217,a=198614,00.asp">this article</a> about MySpace and their technology led me to some interesting tidbits. </p><blockquote><div><p> "Chau developed the initial version of the MySpace Web site in Perl, running on the Apache Web server, with a MySQL database back end. That didn't make it past the test phase, however, because other Intermix developers had more experience with ColdFusion, the Web application environment originally developed by Allaire and now owned by Adobe. So, the production Web site went live on ColdFusion, running on Windows, and Microsoft SQL Server as the database."</p></div> </blockquote><p> I think that explains why they had so many performance problems early on. This is one case where "go with what your team knows" may have been bad advice. 
They eventually ditched it for C# and saw a big improvement. </p><blockquote><div><p>Whenever a particular database was hit with a disproportionate load, for whatever reason, the cluster of disk storage devices in the SAN dedicated to that database would be overloaded. "We would have disks that could handle significantly more I/O, only they were attached to the wrong database," Benedetto says.</p></div> </blockquote><p> They solved this by going to a storage technology that pooled their resources instead of partitioning them. I think this supports my theory that partitioning is usually a bad idea and you should share resources as much as possible. Partitioning used to be a big sell for expensive EJB tools and IBM hardware, but the end result is that some of your hardware is under-utilized while parts of your application are starving for resources. </p><blockquote><div><p>The cache is also a better place to store transitory data that doesn't need to be recorded in a database, such as temporary files created to track a particular user's session on the Web site&#8212;a lesson that Benedetto admits he had to learn the hard way. "I'm a database and storage guy, so my answer tended to be, let's put everything in the database," he says, but putting inappropriate items such as session tracking data in the database only bogged down the Web site.</p></div> </blockquote><p> Storing sessions in your lossy cache storage is a mistake, in my opinion. If your session suddenly disappears for no reason when you're browsing, this is why -- they put it in the same unreliable storage that they use for caching. But then he goes on to say that he really doesn't care if your data gets lost:</p><blockquote><div><p>In other words, on MySpace the occasional glitch might mean the Web site loses track of someone's latest profile update, but it doesn't mean the site has lost track of that person's money. 
"That's one of the keys to the Web site's performance, knowing that we can accept some loss of data," Benedetto says. So, MySpace has configured SQL Server to extend the time between the "checkpoints" operations it uses to permanently record updates to disk storage&#8212;even at the risk of losing anywhere between 2 minutes and 2 hours of data&#8212;because this tweak makes the database run faster.</p></div> </blockquote><p> Classic. </p> perrin 2007-01-17T21:31:03+00:00 journal POD + perltidy? <p>Does anyone have a handy way to run perltidy on code embedded in POD? It feels very backwards to be indenting that by hand when I run perltidy for everything else. Maybe an extension to Pod::Tidy?</p> perrin 2007-01-09T23:46:44+00:00 journal Are we done with Joel Spolsky now? <p>I've never been a fan of Joel's writing, but he kind of clinched the deal with <a href="">this one.</a> To quote:</p><blockquote><div><p>...the bottom line is that there are three and a half platforms (C#, Java, PHP, and a half Python) that are all equally likely to make you successful, an infinity of platforms where you're pretty much guaranteed to fail spectacularly when it's too late to change anything (Lisp, ISAPI DLLs written in C, Perl)...</p></div> </blockquote><p>I just hope I can fail as spectacularly as Amazon, Yahoo, and TicketMaster have with their use of Perl.</p><p>And incidentally, eBay was originally an ISAPI DLL.</p> perrin 2006-09-01T20:22:17+00:00 journal HTTP Server Fever! <p>Is it just me, or is everyone writing an HTTP server these days? After Apache 2 became solid, it looked like there wasn't much of interest left to do in the world of HTTP servers, and the field had been fully commodified. CPAN had a half dozen or so Perl HTTP servers, all of which were fine for entertainment but not useful for real sites. 
You'd hear some crank on Slashdot shouting about thttpd (I swear that wasn't me), but it didn't set the world on fire.</p><p>Then the single-threaded servers started showing up in earnest. A non-blocking I/O approach to networking is well-known to scale better than threads or processes, and it appealed to developers in a very primal way -- it's fast! Well, not so much fast, since you'd run out of bandwidth long before that mattered, but you could handle lots of open connections to slow clients without any trouble.</p><p>Lighttpd quickly became a star, especially in the PHP and Ruby worlds. (Why were Rails developers looking for a faster web server rather than trying to fix Ruby's performance problems? Probably because it's a much easier problem.) </p><p>Somewhere in there, Perlbal made the scene. It's a bit of a hodgepodge of features, having been developed to suit some particular in-house project needs, but an interesting sort of glue project to fill gaps in the Perl web app deployment story.</p><p>Some of the Rails guys then decided they didn't like FastCGI and would write their own HTTP server to replace it, called Mongrel. So far, the <a href="">benchmarks </a> I've seen make it look like performance has gotten worse compared to what they had with FastCGI, but it's still early so maybe they will improve that. They say they were doing it because the FastCGI implementations all had bugs, so maybe they don't care if it's slower anyway.</p><p>Meanwhile, people started popping up on the mod_perl list saying that they had built their own single-threaded servers. I usually ask people two things when they say this:</p><ul> <li>What will you do about DBI, and all of the other blocking network and file I/O calls that are the bread and butter of the average web app? 
Stalling your entire site while someone waits for a query is not going to work.</li> <li>How is this better than running Perl on Lighttpd + FastCGI?</li> </ul><p>The only good answer I've heard to the first question so far is to ship the blocking stuff off to some separate persistent processes (e.g. mod_perl, PPerl, etc.) that you talk to over non-blocking I/O, and pick up the results when it's done. This is what Stas Bekman did with the single-threaded server he works on at <a href="">MailChannels</a> (for blocking spam). It's also what Matt Sergeant seems to be planning for his new single-threaded AxKit2 HTTP server.</p><p>Meanwhile, back at the Apache 2 camp, mod_proxy has picked up useful new features like basic load balancing and people are experimenting with hybrid threaded/non-blocking I/O process models.</p><p>It's good to see innovation happening. Sometimes I do wonder if people are chasing the right things. I find it pretty easy to make a screamingly fast web app with basic Apache and mod_perl these days, so maybe pushing things in a direction that makes development harder (as I think single-threaded programming will be for most people) is not the best move for all of us. High-performance has an undeniable allure though, especially for people like us who still have to convince managers that Perl is fast enough for a web site. (Duh. Maybe you've heard of Amazon?) I'll certainly be paying attention though, to see what Matt and everyone else cooks up.</p> perrin 2006-09-01T04:06:10+00:00 journal job stats show Perl still leads the P languages With the help of <a href="">this job trends app</a>, you can see that job postings for Perl continue to be much higher than those for PHP, Python, or Ruby (honorary P language). It's nowhere close to Java, but it's about twice PHP and leaves the others in the dust. Of course this says nothing about the actual quality of the jobs -- only that Perl skills are in demand. 
perrin 2006-07-16T23:07:31+00:00 journal Do they know they're learning Perl? <p>In <a href="">this article</a> on IBM's DeveloperWorks site, Bruce Tate (the guy who has been pimping Ruby in his book "Beyond Java") teaches Java developers some basics of Ruby text generation. He talks about strings (when you use double quotes, they interpolate variables!) and eval (where have I seen that before?) and then shows a templating system that appears to be a Ruby port of Text::Template (or any of the similar embedded Perl code modules, but nowhere close to a more powerful system like TT). Funny thing -- it all looks exactly like Perl code, and demonstrates basic features of Perl. </p><p> I wonder if these Java programmers who are getting excited about learning Ruby realize that they're actually being taught Perl. Our evil plot to change the name to Ruby has succeeded beyond our wildest dreams. Welcome to the fold, DeveloperWorks readers. </p> perrin 2006-07-14T19:30:57+00:00 journal YAPC Chicago wrapup <p>I'm back from YAPC. Michael (Peters, my colleague at Plus Three) is off at ApacheCon giving his talk there. I think he's actually done by now, since he basically had to go from the plane straight into the conference.</p><p>It was a big one this year. There were about 400 people there, and I got to see lots of old friends and made a few new ones.</p><p>Mark Jason Dominus gave another talk based on his upcoming "red flags" book, about fixing bad style. He dissected a module by Chris Winters, which made for an interesting talk because Chris writes good code to begin with. I definitely want this book when it comes out.</p><p>Andy Lester gave an entertaining talk called "Get Out of Technical Debt Now!" The focus was on postponed "# TODO" items and other tasks, and how to manage them.</p><p>I went to a talk on Perl::Critic, which I plan to start using (maybe in my "Low Maintenance Perl" talk at OSCON), and another on B::Lint and how to customize it. 
Lint looks challenging, so I'll try Perl::Critic first.</p><p>The Perl 5.10 update included some good news about speeding up regexes and eliminating some conditions where a regex would run forever on a large chunk of text. I'm personally not very excited about additions like the defined-or operator ("//"). When I was a kid, we had to check for defined uphill, both ways, in the snow.</p><p>My favorite testing talk was the one on Selenium. This is the project for running a browser to test your JavaScript, which now has a Perl module with an API that looks easy to use from Test::More. (Possible project: a WWW::Mech compatibility layer.)</p><p>There was some buzz about a new OO module called Moose, which is somehow a product of some Perl6 work. I couldn't quite figure out the point of it in the 20-minute presentation, but Randal and Audrey and some other people seemed excited about it.</p><p>Tatsuhiko Miyagawa (now of Six Apart) gave a couple of interesting talks. (He gave more, but these are the ones I saw.) One was on Plagger, a tool for doing things with RSS. It had all kinds of plugins for turning various things into feeds, filtering them, and generating different kinds of output. There's an example at<br></p><p>His other one was a lightning talk on XML::Liberal, which is a module for parsing the broken XML that most RSS feeds have in them. It corrects all the problems before passing it to XML::LibXML, and it has no performance penalty on XML that is not broken. This could be very useful.</p><p>There was a good Subversion talk by one of the book's authors, but not too much to take away from it that I didn't know. One thing I learned is that work is being done on merges that track what you've already merged. I make use of branches and merging quite a bit, so this could be helpful. 
Also, there is a way to mark individual files as requiring a lock before working on them, which can be handy for large binary files.</p><p>On the last day, I went over some new code for Apache::SizeLimit with Dave Rolsky and then caught some of MJD's Higher Order Perl talk.</p><p>The lightning talks were the usual mixed bag. Audrey Tang presented a brilliant one, which I believe was written by someone else at YAPC::Asia. When video becomes available, watch it. Jeff Bisbee showed JavaScript::XRay, a very useful-looking module for essentially tracing JavaScript execution in a frame. There was the usual in-joke movie too, which was particularly good this year.</p><p>Slides for most of these talks are here:<br></p><p>And I'm sure you're all wondering how Michael and I did. I got some good laughs with my lightning talk on Web 2.0 and I was pleased with it. Michael had a packed house for his AJAX talk (adding AJAX features to the Krang CMS), and did a great job. Several people told me later that his was a favorite of theirs at the conference.</p><p>Thanks to the hard-working organizers who pulled off a great event.</p> perrin 2006-06-29T20:37:22+00:00 journal how large sites scale their databases <p>I've been following <a href="">Tim O'Reilly's series on how large sites scale their databases</a>. Also, <a href="">this article</a>. They seem to fall into two camps:</p><ol> <li>Using flat files, typically accompanied by lots of attitude about how much smarter they are for not using an RDBMS and frequent invocations of Google.</li> <li>Using MySQL, with replication to scale reads, and data partitioning to scale writes (users A-H on this cluster, I-P on that one...)</li> </ol><p>Amazingly, Craig's List uses MyISAM tables. I guess it's nearly all reads, but I just didn't think the locking approach used for MyISAM tables would hold up to traffic like that. 
A primary reason why I use InnoDB is its row-level locking and multi-version concurrency control, which means that readers don't block writers.</p><p>Two interesting things here are that none of them use PostgreSQL, despite a few of them being fairly new, and that none of them have tried commercial offerings for database clustering, like the stuff IBM and Oracle sell.</p><p>In fact, I've never met <em>anyone</em> who had tried the Oracle or DB2 clustering. Even the people who have the money seem to avoid it. Can anyone offer any personal anecdotes about it? Does it work at all?</p> perrin 2006-04-28T19:06:46+00:00 journal Test::WWW::Mechanize tricks <p>Sometimes I want to test if my application sent the right redirect, but I don't want to follow it. It might be to a page that isn't accessible from my dev system, or is hosted by someone else and I don't want to hit it every time I run the test suite.</p><p>To do this with Test::WWW::Mechanize, I use this code:</p><blockquote><div><p> <tt>my $mech = Test::WWW::Mechanize-&gt;new(autocheck =&gt; 0);<br>$mech-&gt;requests_redirectable([]);&nbsp; &nbsp; # don't follow redirects<br>$mech-&gt;get($uri);<br>is($mech-&gt;status, 302);<br>$mech-&gt;content_contains($url_to_redirect_to, 'sent to correct URL');</tt></p></div> </blockquote><p>Another thing I want to check is if the request set a cookie. A simple way to do that:</p><blockquote><div><p> <tt>my $resp = $mech-&gt;get($uri);<br>ok($resp-&gt;header('Set-Cookie'), 'has cookie header');<br>like($mech-&gt;cookie_jar-&gt;as_string(), qr/$COOKIE_NAME/,<br>&nbsp; &nbsp; &nbsp;'cookie was accepted');</tt></p></div> </blockquote><p>Since the LWP cookie handling does the same checks that browsers do for domain and path, this can help catch problems with your cookie headers.</p> perrin 2006-04-26T20:10:01+00:00 journal love the new CPAN search! <p>It's been in beta testing for a while, but now it's live. 
Now I can type in "DBI" and the DBI module is at the top of the results! I used to have to dig for it. Same for CGI. It's much faster too. Swish-E, the search engine behind it, has been great on the projects we've used it on, and I'm glad to see it powering CPAN now.</p><p> Not sure if this was Graham Barr or Ask Bjorn Hansen or someone else, but whoever it was, thank you! This is a great improvement for CPAN search. </p> perrin 2006-03-09T20:00:14+00:00 journal Bruce Eckel on Perl, Java, Python, and Ruby <p> <a href="">This</a> article by Bruce Eckel discusses the book "Beyond Java" by Bruce Tate, in which Tate discusses Ruby on Rails and a few other things. For those who don't know, Eckel is the author of one of the most popular Java books ever, "Thinking in Java."</p><p>The article is full of good quotes and has a fun snarky attitude to it. I will warn you now, he doesn't like Perl. However, he does recognize that Perl brought something important to the table, and it's nice to read a piece by someone with a more cool-headed view of the Java vs. Ruby showdown that is going on these days.</p><p>Here are some choice quotes:</p><blockquote><div><p>...almost at the end of the book he declares that he doesn't have time to learn these other languages in any depth -- although he has no trouble condemning the same languages in his rush to Ruby.</p></div> </blockquote><blockquote><div><p>Ruby is to Perl what C++ was to C.</p></div> </blockquote><blockquote><div><p>The backlash from heavyweight web frameworks has been significant. 
We now know that EJB 1 &amp; 2 were based on an entirely flawed set of use cases.</p></div> </blockquote><blockquote><div><p>I think we've mostly been hearing from people who have come from Perl and found Ruby to be a "better Perl, with objects that work," or people who are finally convinced that dynamic languages have merit, and so mix the enthusiasm of the first-time dynamic language user (quite a rush, as I remember from my 2-month experience with Perl many years ago) with their experience of Ruby.</p></div> </blockquote> perrin 2005-12-22T00:38:56+00:00 journal Did you know that Ruby invented MVC? <p>From <a href="">an interview</a> with Bruce Tate, author of some O'Reilly Java books:</p><p> <cite> <b>Siddalingaiah:</b> Ruby on Rails is known for its rapid Web application development, but there are others, such as PHP. The PHP community has developed mountains of open source Web applications. Do you think PHP developers have something valuable to contribute? </cite> </p><p> <cite> <b>Tate:</b> Not really. PHP, and the ideas behind it, have been around for a while. It's quick and dirty. We know quick and dirty. Visual Basic. PHP. Perl. They don't excite me. Now, quick and clean, that excites me. Ruby on Rails is model-view-controller. It stretches the object-relational-mapping state of the art. It's quick and clean. </cite> </p><p>Well, I'm glad we cleared that up. You guys writing MVC apps with OO PHP5 and Perl can all go home now. 
(The idea that ActiveRecord somehow stretches the O/R mapping state of the art is pretty amusing too.)</p> perrin 2005-10-26T15:49:18+00:00 journal Rails cheerleaders: go easy on the hyperbole <p>This is from the first page of <a href="">the latest Ruby on Rails article</a> on the O'Reilly website: </p><p> <cite> "In this short time, Rails has progressed from an already impressive version 0.5 to an awe-inspiring, soon-to-be-released version 1.0 that managed to retain its ease of use and high productivity while adding a mind-boggling array of new features." </cite> </p><p>Come on! I don't know Curt Hibbs, and I have nothing against him, but this intro is positively ludicrous. How is this supposed to be "awe-inspiring?" Does he think that before Rails we had no object/relational mappers, no code generation, no MVC, no test scripts, no dynamic languages? You'd have to ignore an awful lot of web development tools (that pre-date Rails) to think that. </p><p>But he does seem to think that. A little further down the page he says this: </p><p> <cite> "The typical development cycle for testing a change to a web app has steps such as configure, compile, deploy, reset, and test." </cite> </p><p>In what world is that "typical"? The only web development tool in use these days with a compile step is Java, and if you look at web development as a whole, there are more people NOT using Java than using it. </p><p>It's also not a very accurate list. Even Java apps don't require reconfiguration every time you change your code! And Rails doesn't remove the "test" step. </p><p>Most of the writing about Rails that I've seen seems to have blinders on when it comes to anything other than Java. It would be nice to see some acknowledgment that other dynamic languages already have this stuff, and a little more touch of reality in general. The current tone smacks of ego and arrogance. 
I don't see anything wrong with trying to draw in Java converts, but if the plan is to just ignore PHP, Perl, and Python, they're losing out on an awful lot of potential allies.</p> perrin 2005-10-19T16:59:40+00:00 journal Damn you, Google Telling people things like <a href=",2000061733,39204515,00.htm">this</a> is going to cause lots of pain for people like me. It's easy to say everyone should use proprietary features and make 18 different versions of their site to work with all the browser and OS combinations out there when you have a huge staff of great JavaScript people and a QA lab with every platform set up. Most web developers don't have these things. Now, thanks to successful projects like Google Maps, clients will think this is a reasonable thing to expect from web developers. Wait until they see how much it costs to develop and test all those different versions... Wait until they see how much it costs to maintain them... perrin 2005-07-28T15:50:54+00:00 journal YAPC::NA 2005: Pleased to meet you The best thing about conferences is getting to meet people who I'm on mailing lists with but have never seen in person. Some of them are people I've "talked" with, some are lurkers, some are talkative on lists where I'm mostly a lurker. Regardless, it's always great when someone walks up and says "Hi, I know you from the ____ list and I just wanted to say hello in person." <p>So, to all those people who did just that at YAPC, thanks, and pleased to meet you.</p> perrin 2005-07-06T20:51:53+00:00 journal a plea to Module::Build users I'm asking all Module::Build users to please use the "traditional" option to create_makefile_pl so that their work will be more compatible. Full explanation in <a href="">my PerlMonks post</a>. perrin 2005-05-18T17:05:54+00:00 journal
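For readers who haven't used that option before, here is a minimal Build.PL sketch showing the "traditional" setting from that last entry. The module name and prerequisites are hypothetical placeholders; only the `create_makefile_pl` line is the point:

```perl
# Minimal Build.PL sketch -- 'My::Module' and the prereqs are hypothetical.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name        => 'My::Module',            # hypothetical name
    license            => 'perl',
    requires           => { 'perl' => '5.006' },
    # 'traditional' generates a Makefile.PL that drives
    # ExtUtils::MakeMaker directly, so people whose toolchains
    # (or CPAN shells) expect a Makefile.PL can still install.
    create_makefile_pl => 'traditional',
);
$build->create_build_script;
```

The generated Makefile.PL is written when you package the distribution (e.g. with ./Build dist), which is what makes the resulting tarball friendlier to systems without Module::Build installed.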