btilly's Journal btilly's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-01-25T02:08:56+00:00 pudge Technology hourly 1 1970-01-01T00:00+00:00 btilly's Journal I have another blog now <p>I have enough entries now at <a href="">Random Observations</a> that I'm fairly sure I'll post at least sporadically. Be warned that little of what I've felt motivated to write about so far is Perl related. Though I'm sure I'll write some Perl stuff eventually.</p> btilly 2009-10-16T06:39:13+00:00 journal Kelly Criterion <p>For personal reasons I've been writing up <a href="">an explanation</a> of the Kelly Criterion. That is a rule for how to optimize your betting strategy if you find a bet that is favorable to you. However I want to go beyond the rule (which is well-known by gamblers) and get at more subtle issues caused by complicated betting outcomes, choices of multiple bets, and worrying about how to trade off short term volatility with long term returns.</p><p>The explanation isn't done yet and the calculator doesn't have the optimization feature that I really want to add. However I have enough there to look for feedback.</p><p>Please keep in mind that I am targeting gamblers who are comfortable playing with numbers and know something about probability, but whose advanced math skills may be rusty to non-existent.</p> btilly 2009-09-12T07:16:24+00:00 journal Wow, C++ can be fast <p>I have long repeated the maxim that you write in a high level language, and only after it proves too slow do you consider dropping to a low level language. I have also found that the occasions where it is appropriate to go low level are few and far between.</p><p>In my past experience, I never felt the need. Generally most of the stuff that I work on is I/O bound. 
And the rest is stuff which it is cheaper to throw another server at than to take programmer time.</p><p>But that changed. While working on a contract I encountered the need. I had specced out an algorithm to do a particular task, optimized it somewhat, and implemented it as a series of MySQL queries. With some tweaking I got it to run the job I needed with the plan I wanted, but it was still going to take 5 days. I could extract some naive parallelism, but that would only take it down to a day or so. Since they want the job to run regularly, I need to do better than that. So I decided to go with C++.</p><p>The computation dropped from 5 days to <b>10 minutes!</b> Apparently all of the following are good for speed: Manipulating smaller data structures, being able to fit key data structures into level 1 and 2 cache rather than struggling to keep it in RAM, using direct compiled code rather than microcode, avoiding sorts by keeping things in the order I need, and having direct array lookups rather than searching btree indexes. There were some other minor tweaks done, and more I could do, but on the whole I am quite satisfied with the result.<nobr> <wbr></nobr>:-)</p><p>(And, for C++, it is fairly readable.)</p> btilly 2009-03-13T07:11:20+00:00 journal Lots of left joins on subqueries can be slow <p>I just encountered this problem on MySQL. But don't blame the database, since I've seen similar misbehavior on PostgreSQL and Oracle.</p><p>Someone put together a reporting query. In it there are many left joins to subqueries. The overall query was painfully slow and getting slower over time. I was asked to improve this.</p><p>The solution is to move all of the subqueries into queries that populate temp tables. Put indexes on those temp tables. Then do the big join and watch it run much faster.</p><p>The reason why this works is that this plan is not in the query optimizer's repertoire. 
And the reason for <i>that</i> is that if the query optimizer tried to analyze every possible strategy for a complex query, it could take longer to run the analyzer than to run the query!</p><p>Still this pattern does come up from time to time, so it is a good trick to have in your toolbox.</p> btilly 2009-03-02T04:30:35+00:00 journal FreshBooks is cool <p>I haven't been sharing my life, but it is complicated. My job went part time in January, and I'm contracting on the side. (If anyone needs work done requiring a mix of Perl, databases, math, and ability to talk with business people, my resume is <a href="">available</a>.) The promise is that I can go back full time in May. We will see what happens.</p><p>Anyways one client just turned me on to <a href="">FreshBooks</a>, which is a great tool for anyone who is contracting. It keeps track of how long you have worked on which projects for which clients. Your clients can see how much you have worked for them, and when. And it will even bill them for you if you want. And it is free if you don't have too many clients.</p><p>All in all I am very happy with it right now.</p> btilly 2009-02-28T16:34:00+00:00 journal Yet another timesink, project Euler <p>I've been amusing <a href=";profile=btilly">myself</a> recently with <a href="">Project Euler</a>.</p><p>I've known about it for a long time and ignored it because I figured that I knew how to do everything there already. That has been somewhat true - I've solved over half the problems and finding the time to code the solutions has been a bigger challenge than knowing how to do them. But some are challenging. For instance the maximal x for <a href=";id=66">problem 66</a> has over 30 digits, so you aren't solving it by hand.</p><p>I've also learned a bit about Perl. For instance if you have a long list of small integers, it is worth learning about the vec function...</p> btilly 2008-10-23T22:49:35+00:00 journal Why the $#!@@! does keep logging me out? 
<p>This isn't the journal entry that I intended to write when I first logged into the site last night.</p><p>After several experiences of being logged out from very unexpectedly (and often very quickly), my short question is why can't remember for 5 minutes that I am logged in. My follow-up is what alternative site would be recommended that didn't have this problem and had plenty of Perl people.</p><p>I hope it remembers that I am logged in for long enough that I can post this.</p> btilly 2008-10-23T22:38:08+00:00 journal I should have created this project ages ago <p>I finally uploaded <a href="">statistics-distributions.js</a> to Google Code. In the hope that if someone goes searching for that, they might find it there.</p><p>In other news I have become addicted to <a href="">Stack Overflow</a>. We'll see how long that interest lasts.</p> btilly 2008-09-19T07:07:19+00:00 journal Why did I not know this Firefox trick? <p>Start typing into a form and it creates an autocomplete popup list of things you've typed there in the past. That's nice. But your past typos show up as well. The solution? Navigate to that entry and press shift-delete. Bye bye entry.</p><p>I only discovered this by accident. Polling coworkers, most people don't know about it.</p><p>What other really useful functionality does Firefox have that nobody knows about?</p> btilly 2008-08-25T19:01:16+00:00 journal And now I have my evaluations :-/ <p>I got feedback from 12 people, which I'm guessing is about a quarter of the audience. I had a number of people who either didn't know what to expect, or found the math a little hard. The ratings averaged out to 3.67/5. I have no idea how people tend to rate these things. I'm guessing that is at the "improvement needed" end of the scale, which is not <i>entirely</i> surprising considering that it was the first talk that I ever did on this scale. 
However I'd have liked to do better than that.</p><p>Does anyone have any idea how I should be scaling my expectations?</p> btilly 2008-07-30T17:45:28+00:00 journal Well, I survived talking about A/B testing <p>I gave my <a href="">talk</a>. I was more than a little dazed while I gave it, but it seemed to go well. Most of my audience was still there at the end, and they were still alive enough to laugh in the right places.</p><p>I won't <i>really</i> know how it went until I get my feedback in a couple of weeks.</p><p>Ideally this would be the time when I could sit back and enjoy the conference. Unfortunately child care responsibilities made me come home after one day so I'll miss it. However in a short time I saw a lot of familiar faces (eg Schwern, Robert Spiers, Damian, merlyn, etc) and met some interesting people (eg I had lunch with the CTO of Wikipedia). Plus I enjoyed Damian's talk on scripting vim. (Tip: Do not let him near your vim configuration anywhere near April 1. Just..don't.)</p><p>For those who are interested, my <a href="">slides</a> are online. I still have a couple of edits to make (mainly that I want to put in some of the actual questions I was asked), and it should be posted on the OSCON site some time this week.</p> btilly 2008-07-22T22:06:11+00:00 journal Please give feedback on my OSCON tutorial <p>I will be <a href="">presenting</a> about A/B testing on Monday. I now have a <a href="">rough draft</a> of my talk. Any feedback, typo corrections, etc would be appreciated. Note that it is supposed to take 3 hours to present, so it is kind of long.</p><p>And heck, you might learn something from reading the presentation.<nobr> <wbr></nobr>:-)</p> btilly 2008-07-16T16:20:22+00:00 journal How do I share JavaScript? <p>I will be presenting about A/B testing at OSCON. Since I'm presenting in the web track, I can't assume that people have Perl. 
Besides I've strongly recommended that people create a significance calculator as a web page, so I wanted to demonstrate what that looks like. So I decided to kill two birds with one stone and write that in JavaScript.</p><p>For this purpose I've ported Statistics::Distributions from Perl to JavaScript. Which is fine and dandy for my purposes, but now I'm wondering whether I can contribute this code to some project somewhere. With Perl it is easy - everyone knows that CPAN is the place to do things like that. With JavaScript, what are my options?</p> btilly 2008-06-27T14:27:28+00:00 journal I hate context <p>I've hated Perl's notion of context for a long time. So this weekend was just confirmation.</p><p>The problem is simple. Perl's notion of context requires that we think about what we want to do in array and scalar contexts, and potentially do different interesting things. This automatically doubles all APIs. Now it is true that sometimes there is something useful you can hang on this hook. But in my experience it is more often true that nothing really is obvious. And in that case with depressing frequency you get design decisions that age poorly.</p><p>For example at one point I read an article suggesting that it was a good idea to return a reference to an array in scalar context. I was briefly convinced and did this in Text::xSV. And now I curse myself every time I write:<br><code><br> &nbsp; &nbsp; my $name = $csv-&gt;extract("name");<br></code><br>and it does what I don't want.</p><p>Now it is easy to say that this is a poorly thought through design decision. And it was. But I've noticed that attempts to be clever with context frequently lead to bad design decisions. And result in making APIs more complex than they need to be. Sure, context is occasionally handy. But when I compare Perl to Ruby or JavaScript, I find that on balance context doesn't seem worthwhile to me.</p><p>This is old hat. What prompted this is something specific. 
On the Rose::DB::Object list we had a discussion about a bug between Rose::DB::Object and Template Toolkit. Here is the bug in simplest possible form:<br><code><br>#!perl -w</code></p><p><code>package MainObject;<br>sub new { bless {}, shift }<br>sub subobjects {<br> &nbsp; &nbsp; &nbsp; &nbsp; my @data = SubObject-&gt;new("world");<br> &nbsp; &nbsp; &nbsp; &nbsp; # no problem when there are more than 1 subobjects<br> &nbsp; &nbsp; &nbsp; &nbsp; # push @data, SubObject-&gt;new("Hello");<br> &nbsp; &nbsp; &nbsp; &nbsp; return wantarray ? @data : \@data;;<br>}</code></p><p><code>package SubObject;<br>sub new { bless {somevalue =&gt; $_[-1]}, shift}<br>sub somevalue {<br> &nbsp; &nbsp; &nbsp; &nbsp; return shift-&gt;{somevalue};<br>}</code></p><p><code>package main;<br>use strict;<br>use Template;</code></p><p><code>my $template = Template-&gt;new();<br>$template-&gt;process(\*DATA, {mainobject =&gt; MainObject-&gt;new()}) or die $template-&gt;error;<br>__END__</code></p><p><code>[% FOREACH subobject IN mainobject.subobjects.reverse %]<br> &nbsp; &nbsp; &nbsp; &nbsp; still not printed: [% subobject.somevalue %]<br>[% END %]<br></code><br>This prints some blank lines. If there are multiple subobjects it does what it is supposed to.</p><p>Well, OK. Obviously a weird context problem of some sort. Template::Stash::Context is supposed to help resolve those. So we try it:<br><code><br>#!perl -w</code></p><p><code>package MainObject;<br>sub new { bless {}, shift }<br>sub subobjects {<br> &nbsp; &nbsp; &nbsp; &nbsp; my @data = SubObject-&gt;new("hello");<br> &nbsp; &nbsp; &nbsp; &nbsp; # no problem when there are more than 1 subobjects<br> &nbsp; &nbsp; &nbsp; &nbsp; return wantarray ? 
@data : \@data;;<br>}</code></p><p><code>package SubObject;<br>sub new { bless {somevalue =&gt; $_[-1]}, shift}<br>sub somevalue {<br> &nbsp; &nbsp; &nbsp; &nbsp; return shift-&gt;{somevalue};<br>}</code></p><p><code>package main;<br>use strict;<br>use Template;<br>use Template::Stash::Context;</code></p><p><code>my $stash = Template::Stash::Context-&gt;new();<br>my $template = Template-&gt;new({STASH=&gt;$stash});<br>$template-&gt;process(\*DATA, {mainobject =&gt; MainObject-&gt;new()}) or die $template-&gt;error;</code></p><p><code>__END__</code></p><p><code>[% FOREACH subobject IN mainobject.subobjects.reverse %]<br> &nbsp; &nbsp; &nbsp; &nbsp; still not printed: [% subobject.somevalue %]<br>[% END %]<br></code><br>This time it dies a horrible flaming death because it can't locate object method "reverse" via package "SubObject". WTF?</p><p>After some hacking around I came up with the following patch that makes Template::Stash::Context work properly:<br><code><br>---<nobr> <wbr></nobr>/Library/Perl/5.8.8/darwin-thread-multi-2level/Template/Stash/ 2007-04-27<br>10:56:05.000000000 -0700<br>+++ 2008-06-23 00:32:17.000000000 -0700<br>@@ -508,9 +508,8 @@<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; $returnRef,<br>$scalarContext);<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return $retVal if ( $ret ); ## RETURN<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }<br>- elsif (UNIVERSAL::isa($root, 'ARRAY')<br>- &amp;&amp; ($value = $LIST_OPS-&gt;{ $item })) {<br>- @result = &amp;$value($root, @$args);<br>+ elsif ( defined($value = $LIST_OPS-&gt;{ $item 
})) {<br>+ @result = &amp;$value([$root], @$args); ## @result<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; else {<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; @result = (undef, $@);<br></code><br>It turns out that Template::Stash::Context added the ability to do list operations only to native scalar types, and not to scalar objects. This fixes that.</p><p>Of course there is the real underlying problem, which is why returning 1 thing is so different from returning many things. Digging further the cause of that problem is that TT does not internally maintain a notion of context. Because of that, it winds up with the same internal data structure for the return out of scalar context and the return of one thing in list context. Then when it has to thaw it out, it has no choice but to treat them as being the same thing. I found at least one place that assumption was encoded, but there are more, and it doesn't look easy to fix.</p><p>Now it is easy to say "poor design decision within TT". It is. However there is an old design principle. Which is that when people make repeated mistakes involving one part of the interface, at some point you have to ask whether the real mistake is in the design of the interface itself. (This is not, incidentally, a software principle. Read <i>The Design of Everyday Things</i> for more on it. I also saw it repeatedly emphasized in a book about industrial disasters.)</p><p>It seems to me that there are repeated mistakes made by good people involving the notion of context. Which I take as evidence that the problem is with the idea of context itself. And then I conduct a sanity check. There are a lot of languages out there that deliberately borrowed a lot of ideas from Perl. How many have chosen to borrow the idea of context? 
Why not?<nobr> <wbr></nobr>:-)</p> btilly 2008-06-23T20:26:56+00:00 journal Thank you CPAN <p>Everyone's favorite reason to use Perl just came through for me again. Let me give some background.</p><p>At $work I am in charge of reporting. Our business has something called events. Different events are treated differently, and sometimes we make money from them and sometimes we don't. Sometimes we make more money from them, and sometimes less. One of the things we want reports for is to figure out what factors make events make more money. (Obviously because we want to make more events make money for us!)</p><p>Now I have a fairly flexible report that we'll call revenue per event, because that is its name. It allows us to see revenue per event at various ages broken out by various combinations of factors. Such as whether passwords are needed to login, whether gift certificates were added, whether specific promotions were run, that kind of thing. This is a very useful report. We can see that, for instance, running a promotion brings in money (duh) and tell about how much more money events with that promotion make.</p><p>But we have a problem. You see, we have a pretty good idea what factors make events make more money. (Unfortunately we don't control how the event is set up, we're handling them as a service for our direct customers.) So when people take our advice they do several things that are good. How can we tell how good each individual thing they are doing is?</p><p>Hmm..let's see. Sounds like we need to do some sort of multi-variable linear regression. Why don't we look on CPAN and find <a href="">Statistics::Regression</a> and see if that works? Oh look, it does! I've used it in the past. But look, it added a method called standarderrors, what is that? (Run some tests, make hypothesis, email author, get confirmation.) 
Goody, we not only can find a linear regression, but we can get estimates of how much random noise in the data might be throwing off the coefficients! I had been bothered by the fact that people tend to take the numbers I produce as gospel with no eye to whether there was any statistical validity to the numbers.</p><p>Hrm, do I trust the module? I take a legitimate pride in knowing a fair amount of math, but I know I don't know how to do this. Well look at the source code and..holy crap! OK, for me to learn this to my satisfaction would take a long time. What programmer contributed it and do I trust it? Rummage around and do some research...oh, he's a <a href="">professor at Brown</a>. He <b>teaches</b> courses on multi-variable statistics. I think I can trust that he knows his stuff!<nobr> <wbr></nobr>:-)</p><p>Getting my report to do multi-variable regression and display it the way I wanted still wasn't easy. But at least it wasn't easy for programming reasons (some day I need to write something explaining why it may be important to put a condition in an ON clause rather than a WHERE clause - I spent an hour tracking down the resulting bug), and not because I couldn't figure out the math.</p><p>Where else but CPAN would you expect to find something like that?</p> btilly 2008-05-24T05:02:02+00:00 journal There's Claire! <p>My daughter Claire Eleanor Tilly was born on Monday March 24 at 10:01 AM PDT at Long Beach Memorial. She weighed in at 8 lbs, 3 oz, and was 20 inches long. Mother and baby are doing well.<nobr> <wbr></nobr>:-)</p><p><b>Edit:</b> Forgot to say that this happened yesterday.</p> btilly 2008-03-25T18:48:46+00:00 journal I'll be presenting at OSCON :-) <p>I just got notified that my tutorial on A/B testing has been accepted. I'll be presenting at 1:30 on Monday, July 21. Now I just have to extend and improve a 2 hour presentation into a 3 hour one.</p><p>That, and figure out where I'm supposed to have the break. 
And decide whether I'm going to port Statistics::Distributions to JavaScript so that I can port my code samples to JavaScript. That way the code that I present can be the interface that I recommend developing for your users...</p> btilly 2008-03-17T20:31:36+00:00 journal Why make classes you can't subclass? <p>Let me give the background. I'm using Template Toolkit. Because I want to be able to write things like [% cgi.popup_menu(...).scalar %] and get reasonable output, I am using Template::Stash::Context. And then I decided that I'd like to also have it catch any use of unknown variables in the template.</p><p>If you look at the design then it is obvious that I need a different stash. And there is no stash on CPAN that does what I want. (Both strict checks and context.)</p><p>Obviously I should just subclass it, right? Wrong. If you look in Template::Stash::Context you'll find checks in key places that check whether you're at the root by testing whether the reference type is __PACKAGE__. Which a subclassed object is not.</p><p>OK, can I cut and paste? I got that working, but then threw that solution away because I really don't like checking open source code with its licenses into proprietary codebases. Sure, I know what is OK to do, but I don't want anyone else to lose track. And I don't like giving lawyers heart attacks.</p><p>OK, let's just put a proxy class in front of it.</p><p>Nope. Didn't work. I didn't track it down, but I assume that it didn't work because it is sometimes passing stashes in calls to other stashes. 
There is more wrapping/unwrapping needed than I wanted to figure out.</p><p>Final solution?<br><code><br># I do this because I don't want to copy someone else's copyrighted<br># code into our codebase, and the class was not written to be easily<br># subclassed.<br>no warnings 'redefine';<br>my $old_get = \&amp;Template::Stash::Context::get;<br>*Template::Stash::Context::get = sub {<br> &nbsp; &nbsp; my ($self, $ident) = @_;</code></p><p><code> &nbsp; &nbsp; my $variable = ref($ident) ? $ident-&gt;[0] : $ident;<br> &nbsp; &nbsp; if (UNIVERSAL::isa($self, "HASH")<br> &nbsp; &nbsp; &nbsp; &nbsp; and not exists $self-&gt;{$variable}<br> &nbsp; &nbsp; &nbsp; &nbsp; and $variable ne "component"<br> &nbsp; &nbsp; &nbsp; &nbsp; and $variable ne "results"<br> &nbsp; &nbsp; ) {<br> &nbsp; &nbsp; &nbsp; &nbsp; die "Variable $variable not found\n";<br> &nbsp; &nbsp; }</code></p><p><code> &nbsp; &nbsp; return $old_get-&gt;(@_);<br>};<br></code><br>I'm unhappy that I don't give more context, but I'm able to store that in another place in my code so that my debugging messages have all the context I could need.</p><p>That was far harder than it should have been. And it took me longer to come up with that answer than it should have. But that's what happens when I go back to thinking about Perl after spending all of my time writing SQL.</p> btilly 2008-01-23T05:57:36+00:00 journal Well I gave the A/B testing talk <p>And slides are available at for anyone who is interested.</p> btilly 2007-11-09T08:21:54+00:00 journal A/B Testing <p>Well I seem to have volunteered to talk for an indeterminate time on A/B testing at on Thursday next week.</p><p>I've never given a presentation like this before. And my first one will basically be a math talk directed towards programmers? I <i>could</i> have picked an easier topic!</p><p>Ah well. I've long thought that the techniques of A/B testing weren't widely enough understood. This will be my chance to correct that. 
I'll just have to find somewhere to put my slides when I'm done. As for presentation software, <a href="">S5</a> looks like it does what I want.</p><p>Wish me luck.</p> btilly 2007-10-30T05:24:54+00:00 journal Interesting insights from Software Estimation Note: This is a letter that I wrote to Steve McConnell after I read his most recent book. I'm publicly sharing it because I learned something that may be of general interest from thinking about the topic, and I'd like to see more people read the book.<p> I finally got around to reading <a href="">your</a> most recent <a href="">book</a>. As always with your books, I found it fascinating. So fascinating that I'm writing you an open letter which I am going to publicly post elsewhere.</p><p> First of all, as I expected, you hit it out of the park. You've done a very good job on a difficult topic that our industry normally does a horrible job on. Very few of your readers have any clue how poorly they judge what is 90% likely. It is incredibly helpful to be conscious of how people confuse estimates, targets and commitments. You were absolutely right to back up your advice on how to create accurate estimates with advice on how to defend those estimates from organizational pressures to replace them with wishful thinking. And, as always, all of the advice is backed up by invaluable compiled (and meticulously referenced) data on everything from how uncertain the best possible estimates are at various stages in the software lifecycle to how wide the productivity variation is from company to company.</p><p> As with any book, it is not perfect. However the overall quality is extremely high and the remaining imperfections are small. Furthermore you took into account all of my criticisms for the one chapter that I reviewed. 
Since I had the opportunity to review the rest of the book and didn't, I feel that any oversights that I notice are more my fault than yours.</p><p> Needless to say I highly recommend this book to everyone involved in the software development process. And my main difficulty now is identifying who in my immediate environment I should lend it to first. (ie Who would create the biggest positive impact on the company I'm in.)</p><p> As is often the case with your work, of even higher value is how close reading leads to or reinforces insights on other parts of software development. Sometimes this is presented in an understated way. Such as the paragraph on page 64 that says, <i>"...individual performance varies by a factor of 10 or more. Within any particular organization, however, your estimates probably won't need to account for that much variation because both top-tier and bottom-tier developers tend to migrate toward organizations that employ other people with similar skill levels."</i> (A fact which you then provide 2 references to.) I laughed aloud at that one.</p><p> Sometimes the tangential gems are presented very directly. For a random example on page 69 you point out that multi-site development increases needed effort an average of 56%. As you say, this effect should be carefully considered by organizations considering outsourcing. And while most software professionals understand that this is a significant factor, very few of us can quantify it. Which makes it hard for us to get businesses to take it seriously.</p><p> And sometimes the insights are not directly presented. They are just implicit in the copious data that you've presented, waiting to reward the careful reader who can spot them. I'd like to talk about one of those.</p><p> It has long been a mantra among people who like dynamic languages that developers are more productive in small groups, and so there is great value in delivering languages that make small groups as productive as possible. 
I cannot count how many times I have seen variations on this theme, nor can I count how many times I have personally repeated it. Supporting anecdotal evidence is easy to find. However until I read your book, I'd never seen concrete quantitative evidence that I could quote to support what is common knowledge in some circles.</p><p> Well I'd long known evidence for part of that assertion. Variations on the chart that you reproduce as table 5-3 on pages 64-65 have been circulating for ages. And while I agree with your conclusion that it is more productive to use a language such as Java instead of a language like C, I'd also point out that it is more productive to use a language such as Perl instead of a language such as Java. Interpolating from that chart with too much precision, about 2.4 times. I hadn't before seen the more detailed table 18-3 that you offer on page 202. Judging from that, average Java programmers need 2.75 times as much code as average Perl programmers to do the same task. Those estimates agree since neither is very precise - Java takes somewhere between 2 and 3 times as much work for the same task as Perl.</p><p> Of course coding is but one of the tasks that needs to happen in software development. If only half of your development time would have gone to coding (a reasonable estimate based on table 21-4 on page 236), then reducing coding time to 40% of what it was only saves you 30% of overall effort. Still that is a significant reduction. Why don't people pay more attention to it?</p><p> The catch is, of course, that Java has many features that make it much better than Perl for handling the challenges of development in large teams. Therefore it is easy to dismiss the productivity benefit because "Perl is not scalable." 
And it is easy to likewise dismiss the anecdotal accounts of exactly how productive small teams are because common sense keeps us from accepting that 6 people do more than a dozen.</p><p> Which is part of the reason why I am grateful to you for reproducing figure 20-3 on page 229. I've heard estimates before that it takes a team of about 20 people to match the output of a team of 6-7, but I'd never before seen concrete data backing that up.</p><p> Anecdotally the primary cause is well-understood: people are most effective in a flat team, but that only works for teams up to about 6-8 people. With that structure you have little to no overhead from having to manage process, or from people not being able to find out what they need to know when they need to know it. But that falls apart when there are too many lines of communication. The solution to that problem is to introduce process to cut down who needs to talk to whom, when. However adding process drops productivity per person significantly, meaning you have to add more people. And this cascades until you get to the same productivity with a far larger team. From there you can scale for a lot longer, but at far higher cost.</p><p> There are secondary issues that are also well understood. For instance you're likely to find a higher proportion of good developers in the small team environment. Why? Well there are a lot of reasons. First of all it is clear that it is easier for an individual to be productive in the small team than in the large one. People who are drawn to productive environments are likely to be people who value their personal productivity, who are therefore likely to be productive people. Conversely it is much harder for an incompetent developer to hide in a small group than a large one, so the worst developers don't stay. Additionally, given comparable turnover rates, one can maintain staffing levels in a small group while being more selective about candidates than one can in a large group. 
And finally a company that understands the cost benefits of having a small group of good people can justify higher individual salaries for those people.</p><p> So the 3-1 individual productivity difference in lines of code between small teams and large teams has a number of causes. It really isn't as simple as saying, "Move 2/3 of your 20 person team away and you'll get the same productivity." However that said, the line of code measure may be hiding some more dramatic productivity differences.</p><p> Some are very hard to quantify. For example common sense tells us that a team of 6 people that all talk to each other is going to have more consistency across 57,000 lines of code than a team of 20 people who are deliberately being kept from talking all the time. That lack of consistency is going to show up in all sorts of bad ways, from re-invented wheels to misunderstood internal APIs.</p><p> But one is easy to quantify: the small team is much more likely to be using a productive interpreted language than the large one. So the 57,000 line project delivered by the 7 person team might well have 2-3 times the functionality of the 57,000 line project delivered by a 20 person team in about the same time. (As I've noted, the productivity difference comes from a combination of factors, including having better people.) Even if you're paying those programmers 50% more per person, your productivity per dollar is about 5 times better with the small team than the large one. That's a pretty dramatic difference. 
While I'll be the first to admit that there are limits to what small teams of good people can do, I'll also stand in line to point out that those limits are farther out than most people realize, and there is a very good business case for relying on small teams whenever you can.</p><p> Anyways, congratulations on yet another excellent book, and I'm sure that I'll be digesting its consequences for a long time to come.</p><p> Cheers,<br> Ben</p> btilly 2007-06-11T07:54:17+00:00 journal Why does DBD::Pg not like named parameters? <p>At $work I'm getting a few basic tools set up. Since we use PostgreSQL, I'm using DBD::Pg. With DBD::Oracle I grew to like some of the more flexible ways of entering parameters, so I look for that in DBD::Pg. I find that there are three options.</p><p>1. The DBI standard ?. If I was happy with that, I wouldn't be reading this documentation.</p><p>2. They allow positional parameters with $1, $2, $3. Given that this is inside of Perl, I'm going to constantly be wondering whether I'm accidentally interpolating in variables. Better than ? but I don't really like the syntax. The visual disambiguation from variable interpolation was a reason to prefer DBD::Oracle's<nobr> <wbr></nobr>:1,<nobr> <wbr></nobr>:2 and so on.</p><p>3. They support full named parameters.<nobr> <wbr></nobr>:foo.<nobr> <wbr></nobr>:bar. Yay! I like it! But they go on to say, <i>While this syntax is supported by DBD::Pg, its use is highly discouraged.</i></p><p>Huh? Does anyone know why they discourage the cleanest and most flexible solution? Personally whenever I get the chance to work by name rather than by position, I jump at the opportunity. But I'm somewhat reluctant to use a feature that the software authors don't want me to use without knowing the reason for that dislike...</p> btilly 2007-06-07T23:22:46+00:00 journal No longer a programmer I never intended to be a programmer. By personality I'm a poor fit for being a programmer. 
And now I'm somewhat officially no longer a programmer.<p> It has been an interesting trip.</p><p> About a decade ago (only a decade ago? Seems longer) my wife and I left grad school and she entered medical school. I needed to make money. I had no work experience and so was willing to take any job I could get. Given my math background, and the fact that I'd be living in New York City, I figured that my best options were to be an actuary, something in finance, or a programmer. Of the three I thought that the least likely was to be a programmer. I'd done some programming when I was younger, and had never liked it much.</p><p> It took me a month to find a job. And the first job that I found was as a programmer working for a churn and burn consultancy. I can't say enough about how horrible the experience was. But I did wind up learning databases (mostly Access) and VB, and I was given the opportunity to learn Perl. While programming still wasn't a perfect fit for me, Perl was a good fit. And so, less than a year later, I went to another company.</p><p> At that company I had the fortune to work with an extremely talented programmer named Frank, who went from someone who was consulting with the company, to a fellow programmer, to my boss. Along the way he, directly and indirectly, taught me most of what I know about programming. (Note, <b>not</b> most of what <b>he</b> knew about programming.)</p><p> That job ended several years later because I moved. When I moved to the west coast I considered either a finance or a programming job, and found the programming job first at an excellent company named Not too long after I joined, we were purchased by eBay. I've done well here, and can absolutely recommend this as a very good place to work at. Certainly the best that I've worked at. 
In fact I'd recommend that anyone good who wants to work in Santa Monica should ask me how to apply there.</p><p> Along the way I joined some online communities, made friends, learned a lot more, and generally enjoyed the experience. And I've become a competent programmer. However an ongoing issue is that, no matter how capable I might become, I've really got the wrong personality to be a programmer. The key problem is that I'm too extroverted. Sit me down to work on any significant project, and before long I need to surface for air and talk to someone. After a week of this I can feel my motivation level slip. Which means that I actively avoid projects with long periods of heads-down development. For similar reasons, unlike most programmers I know, I simply don't wind up taking on personal programming projects.</p><p> Despite this issue I've been productive. And there are ways to accommodate me. Certainly Rent has bent over backwards to do so. However how many companies will do so? And when I combine that with the ageism that some older friends have experienced, I've long doubted that programming is a sustainable long-term career path for me.</p><p> The problem has been what could come next.</p><p> Well at Rent I've wound up in a reporting role. This works out well for me. Lots of people need data. They don't always understand what data they need, so they need to talk with someone who can talk their needs through with them. And most of the time finding ways to get that data tends to be a short project with immediate results. In the process I've wound up learning a bit about how business decisions actually get made.</p><p> As this has happened, I've slowly drifted away from programming. That is not to say that I don't sometimes write or modify programs. I do, and that is unlikely to change. 
However I'm doing a lot less of it, and things like becoming better at programming or staying in touch with the current best environments, modules, and practices are feeling ever less relevant to my life. This is one of the reasons why I've drifted away from communities like Perlmonks. (A bigger reason is named Sam. It is amazing how much spare time vanishes when you're a parent. It is worth every moment, but it is still a lot of time.)</p><p> But what has happened unofficially has now become somewhat official. I've just given notice at This was a very hard decision for me. I love the company, the organization, the technology, etc. However I have an opportunity to work with friends much closer to home at a company with good opportunities (but admittedly also with some problems) which will give me more of an opportunity to explore how much I like being engaged with the business side. So, with considerable regrets, I'm going from an official title of senior software engineer at eBay to reporting architect at Pictage. To me this is the clearest milestone in my slow drift out of programming saying that I'm no longer a programmer.</p><p> Ironically I actually will be programming more (at least at first) in my new role. There is a lot of basic infrastructure that is missing and I'm going to have to create that. But this will be a fairly limited project. And, if all goes well, I'll wind up doing less and less programming over time. And I will no longer be reflexively describing myself as a programmer.</p><p> For whoever reads to this point, thank you for letting me ramble.</p> btilly 2007-05-22T15:49:07+00:00 journal MySQL has row-level locking? Really? <p>One of the DBAs here was playing around with MySQL. He'd heard that the InnoDB engine has row-level locking. So he created an InnoDB table called cjg (his initials) with two columns, col1 is a number and col2 is text. He inserted a few rows. He set autocommit off. 
He then ran:</p><p>update cjg<br>set col1 = 2<br>where col2 = 1;</p><p>Then in another session he tried to insert the value (3, test) into cjg. It blocked.</p><p>He tried a variety of things, but was unable to get MySQL 5 to demonstrate that it knew how to do row-level locking.</p><p>So, can anyone come up with a demonstration where MySQL clearly does row-level locking? And can you explain why the example he tried to run locked the whole table?</p><p>Thanks,<br>Ben</p> btilly 2007-03-24T04:57:51+00:00 journal Goodbye, Bill <p>My uncle Bill died today. He is survived by his wife Lorna of 58 years, 6 sons, a daughter, many grandchildren and more memories. He will be missed.</p><p>Pretty much by accident, my wife and son <a href="">met</a> him April 2 and 3. I'm glad they did.</p><p>Funeral details are not set yet.</p><p>Ben</p> btilly 2006-04-22T03:04:27+00:00 journal The story about ping Scroll down to and read <a href="">the first review</a>.<nobr> <wbr></nobr>:-) btilly 2005-08-20T18:02:00+00:00 journal Are there videos of the lightning talks at YAPC::NA? <p>I saw Luke Closs' hilarious juggling lightning talk. I despair of describing it to co-workers. Therefore I'd like to send a video around instead.</p><p>Unfortunately I can't find one.</p><p>He says that it was filmed at YAPC::NA, but he has not seen the video anywhere.</p><p>Is there hope that someone knows where such a thing might be found?</p><p>Thanks,<br>Ben</p> btilly 2005-08-08T06:25:18+00:00 journal If O'Reilly cooperates, I'll go to OSCON <p>I was asked if I was interested a couple of months ago, but it wasn't until yesterday that approval came through for the company to pay for it. Of course then I had to OK it with my wife (because she's working and it will leave her with extra child care she wasn't prepared for). That's OK as well, so I'm going!</p><p>So I bought my airline ticket, then tried to buy a conference ticket. Wrong order. 
O'Reilly can't process my order.<nobr> <wbr></nobr>:-( I sent them an email, got no reply. So I waited a while, tried again from the same form and got no confirmation page. My credit card didn't show a charge, so I figured that there was a timeout on their end. I tried from the start again, got to the end and again they can't process my order.</p><p>So right now I'm flying to Portland at the same time as a conference that I can't purchase a registration at. Wonderful.</p><p>In other news, my employer is looking for another developer. If you're good at Perl and are interested in working for (a subsidiary of eBay) in Santa Monica, I can put you in touch with my boss.</p> btilly 2005-07-21T23:01:16+00:00 journal Sometimes the solution is simple <p><a href="">Many moons ago</a> I had an idea for a useful module. Unfortunately I ran into a <a href="">small roadblock</a> and I dropped the project.</p><p>Well tonight I <a href="">realized</a> that instead of installing into UNIVERSAL::AUTOLOAD I could just install into AUTOLOAD where requested and document the problems that AUTOLOAD always has and let people request their own AUTOLOADs, and accept the problems that come with it.</p><p>I have no idea why I missed it before.</p> btilly 2005-04-09T12:04:35+00:00 journal Never believe what you see in the news <p>It seems that every time anything that I know about gets near a reporter, I have cause to wince. Witness that <a href="'s+Law+overshoots+the+mark/2100-1033_3-5616549.html?tag=st_lh">I appear to have moved to Michigan</a>.</p><p>Of course Slashdot is worse yet: <a href="">they claim</a> that the University of Minnesota is now Southwest Missouri State University. Um, not quite.</p><p>Seriously, friendly complaining aside, the publicity is unexpected and great to see. But brief exposures like this make me wonder at how little I should trust the news that I normally see. 
Even when the government doesn't <a href="">deliberately manipulate it</a>.</p> btilly 2005-03-16T07:58:14+00:00 journal