Ovid (2709)

http://publius-ovidius.livejournal.com/
AOL IM: ovidperl

Stuff with the Perl Foundation. A couple of patches in the Perl core. A few CPAN modules. That about sums it up.

Journal of Ovid (2709)

Thursday July 02, 2009
08:47 AM

The Real Problem With Roles

Inheritance, as most of us know, is rather problematic. Specifically, it's been around since 1967 and for the past 42 years, people have been arguing about how to do it right. Some OO languages don't have inheritance (Self, JavaScript). Some languages don't have multiple inheritance (C#, Ruby). Some languages have multiple inheritance with many safeguards (Eiffel) or none (Perl). Some languages just do strange things with inheritance (BETA). Even very experienced OO developers focusing on a single language will argue vociferously about whether or not a design has been done correctly (do you really give a damn about strict equivalence in overridden methods?).

The problem is ultimately that classes fulfill conflicting needs. As an agent of responsibility, a class needs to do everything a class needs to do. That sounds like a stupid tautology, but what it means is that as systems grow, classes grow. Thus, classes tend to get larger. Unfortunately, as agents of code reuse -- via inheritance or delegation -- classes should be smaller. How many times have you seen (or written) code which inherits from something because it needs one or two methods but you're pulling in a lot of extra behavior which you don't need?

This responsibility/reuse tension, which leads to classes wanting to be both larger and smaller at the same time, is much of the reason why inheritance has proven so problematic. Interfaces in Java and C#, along with mixins in Ruby and other languages, were an attempt to decouple the responsibility and reuse needs of classes, but they've all had their problems. Roles seem to handle the decoupling of responsibility and reuse quite nicely, but some of you may have noticed some of the disagreements that I and others have had about the proper use of roles. Why is that?

Ultimately, the disagreement boiled down to the fact that -- like inheritance -- roles serve more than one master. Roles provide behavior and roles provide an interface. Are we faced with another 40 years of arguing because we've again tried to shoehorn too much into one thing? I don't think this need be the case as, unlike classes, the different things the roles provide aren't necessarily antagonistic, but it would be nice if the different parties sat down and tried to come up with a strategy (syntax?) for implementing roles which cleanly addresses these disparate needs. Otherwise, there really will be another 40 years of arguing.

Tuesday June 30, 2009
09:16 AM

Calling All Test:: Authors

The latest developer release of Test::More allows subtests. Subtests are great in that they solve a lot of problems in advanced Perl testing, but they have required a change in Test::Builder. Previously you could do stuff like this:

package Test::StringReverse;

use base 'Test::Builder::Module';
our @EXPORT = qw(is_reversed);

my $BUILDER = Test::Builder->new;

sub is_reversed ($$;$) {
    my ( $have, $want, $name ) = @_;

    my $passed = $want eq scalar reverse $have;

    $BUILDER->ok($passed, $name);
    $BUILDER->diag(<<"    END_DIAG") if not $passed;
    have: $have
    want: $want
    END_DIAG

    return $passed;
}

1;

And you'd have a simple (untested ;) test for whether or not strings are reversed.

The reason that worked is that Test::Builder->new used to return a singleton. This is no longer true. If someone uses your test library in a subtest, the above code would break. Instead, you want to do this:

sub is_reversed ($$;$) {
    my ( $have, $want, $name ) = @_;

    my $passed  = $want eq scalar reverse $have;
    my $builder = __PACKAGE__->builder;

    $builder->ok($passed, $name);
    $builder->diag(<<"    END_DIAG") if not $passed;
    have: $have
    want: $want
    END_DIAG

    return $passed;
}

It's a minor change, it's completely backwards-compatible and it supports subtests. There's a work-around being planned, but it's not out there yet.

06:48 AM

Never Let Them Read From Your Database

An imaginary conversation synthesized from past discussions and the responses I wish I had made.

  • Customer: We need read-only access to your database.
  • Ovid: No.
  • Customer: Please?
  • Ovid: No.
  • Customer: But I need ad-hoc queries.
  • Ovid: Your ad-hoc cartesian join returning 12 billion rows was real fun.
  • Customer: I promise I won't do it again.
  • Ovid: That's what you said about the ad-hoc cartesian join returning 10 billion rows.
  • Customer: But this time I mean it.
  • Ovid: So do I. We guarantee backwards-compatibility in our API, not our database. If we move a field from one table to another, your queries will break.
  • Customer: Then tell us when you do that.
  • Ovid: We did that with another team and had to keep delaying releases while they updated their system.
  • Customer: Then you can provide views to maintain backwards-compatibility.
  • Ovid: We do that already. "View" as in "Model-View-Controller". It's part of our REST API; you should check it out.
  • Customer: But your REST API doesn't provide all the information I need.
  • Ovid: It provides more than the information you need because much of it represents knowledge not stored in the database. If you need more information, let's see what we can do to add this to the API.
  • Customer: Why are you being so difficult?
  • Ovid: Because your temporary convenience is not more important than my long-term pain.

Don't let external customers read directly from your database. Just don't. The usual justification is the need to support ad-hoc queries. Get a few samples and try to figure out a general mechanism to support their actual business needs. If you let them read from your database, they will become dependent on this and beg you to hold off database changes or complain if you don't. As your project grows larger, the pain grows more severe. They will have the best of intentions, but good intentions mean nothing when you need to coordinate your internals with people who should know better than to violate encapsulation.

As a side note, ad-hoc queries, even if not causing performance issues, could potentially be dangerous if the people making them aren't really thinking them through. The problem is two-fold. One, they might not be really paying attention to their core business needs (this is subtle and hard to explain, but common). The other problem is that they might very well be making a query that your API already supports, but because they don't rely as much on your API, they don't know it.

Monday June 29, 2009
09:00 AM

Guess Who Loses: Test::More::subtest versus Test::XML

I've found that one of the best ways to try new testing ideas is to run them against our test suite at work. We've over 30,000 tests at this point, with several test harnesses integrating together, along with two fundamentally different test systems. This means that when I throw something at this test suite, I often shake out bugs. My new Test::Aggregate::Nested combined with our test suite managed to find a rather serious issue with the new subtest function in Test::More. The following, for example, fails:

    use Test::More tests => 2;
    use Test::XML;

    ok 1;
    subtest 'FAIL!' => sub {
        plan tests => 1;
        is_xml '<foo/>', '<foo/>', 'Singleton fail';
    };
    __END__
    xml.t ..
    1..2
    ok 1
        1..1
    Cannot run test (Singleton fail) with active children at /home/ovid/pips_dev/work/Pips3/branches/rights_modeling/deps/lib/perl5/Test/XML.pm line 57.
        # Child (FAIL!) exited without calling finalize()

There's nothing wrong with the code as it's written, but Test::XML knows that the Test::Builder object is a singleton, so there's a false optimization in the code. Near the top of the package, you see this line, defined outside of all functions:

    my $Test = Test::Builder->new;

If every Test::XML function simply had that line in the function rather than trying to share this across all test functions, the subtest code would work fine. Instead, the author knew he had a singleton and there's no point in re-instantiating, is there?

To be fair, I've done the same thing before (see Test::JSON), even though I knew it might be a bad idea. Heck, lots and lots of testing libraries have this issue. Now we need to figure out how to deal with this problem or else subtests aren't going to play well with a lot of code. Damn.

Sunday June 28, 2009
04:48 AM

Test::Aggregate::Nested Almost Done

Boy oh boy, does nested TAP make Test::Aggregate much cleaner. It's not uploaded yet -- still documenting and working out corner cases for the new Test::Aggregate::Nested -- but here's what aggregated test output used to look like. Remember, this is five separate test programs. Can you tell where each starts and ends?

Test-Aggregate  $ prove -lv t/pre_post.t
t/pre_post.t .. # ******** running tests for aggtests/check_plan.t ********

ok 1 - aggtests/check_plan.t ***** 1
ok 2 - aggtests/check_plan.t ***** 2
ok 3 # skip checking plan (aggtests/check_plan.t ***** 3)
ok 4 - env variables should not hang around
ok 5 - aggtests/check_plan.t ***** 4
ok 6 - findbin is reinitialized for every test
ok 7 # skip Testing skip all
#     ok - aggtests/check_plan.t (1 out of 5)
# ******** running tests for aggtests/findbin.t ********
#     ok - aggtests/findbin.t (2 out of 5)
# ******** running tests for aggtests/skip_all.t ********
#     ok - aggtests/skip_all.t (3 out of 5)
# ******** running tests for aggtests/slow_load.t ********
ok 8 - slow loading module loaded
ok 9 - env variables should not hang around
ok 10 - subs work!
ok 11 - Startup should be called once
ok 12 - ... as should shutdown
ok 13 - Setup should be called once for each test program
ok 14 - ... as should teardown
1..14
#     ok - aggtests/slow_load.t (4 out of 5)
# ******** running tests for aggtests/subs.t ********
#     ok - aggtests/subs.t (5 out of 5)
ok
All tests successful.
Files=1, Tests=14,  1 wallclock secs ( 0.02 usr  0.01 sys +  0.11 cusr  0.01 csys =  0.15 CPU)
Result: PASS

Now let's run that with Test::Aggregate::Nested (requires the development version of Test::More):

Test-Aggregate  $ prove -lv t/pre_post_nested.t
t/pre_post_nested.t ..
1..5
    1..5
        1..0 # SKIP Testing skip all
    ok 1 # skip Testing skip all
        1..1
        ok 1 - findbin is reinitialized for every test
    ok 2 - aggtests/findbin.t
        1..1
        ok 1 - subs work!
    ok 3 - aggtests/subs.t
        1..2
        ok 1 - slow loading module loaded
        ok 2 - env variables should not hang around
    ok 4 - aggtests/slow_load.t
        1..5
        ok 1 - aggtests/check_plan.t ***** 1
        ok 2 - aggtests/check_plan.t ***** 2
        ok 3 # skip checking plan (aggtests/check_plan.t ***** 3)
        ok 4 - env variables should not hang around
        ok 5 - aggtests/check_plan.t ***** 4
    ok 5 - aggtests/check_plan.t
ok 1 - nested tests
ok 2 - Startup should be called once
ok 3 - ... as should shutdown
ok 4 - Setup should be called once for each test program
ok 5 - ... as should teardown
ok
All tests successful.
Files=1, Tests=5,  2 wallclock secs ( 0.03 usr  0.01 sys +  0.11 cusr  0.02 csys =  0.17 CPU)
Result: PASS

Much, much nicer. As an added bonus, plans can now be cleanly asserted. I hope to have it on github soon, and later on the CPAN.

Friday June 26, 2009
10:27 AM

The 81% Solution

Whether you're talking about git, Mercurial, SVK or some other distributed version control system, it's very important to realize that one of its strongest benefits is "distributed". Many people don't care about this, but it can tremendously boost productivity when used as part of an overall development strategy.

Recently I got so fed up with our dev box that I installed Ubuntu on my work computer so I could work locally. This was because our dev box was routinely running with loads around 20, and on one occasion hit a load of 57. Seems the admins thought it would be a good idea to let a bunch of other teams develop on that server but not tell us about it. I just couldn't work like that. Instead of using my computer as a dumb workstation with Windows, I now develop locally and am (usually) not dependent on our dev box being available.

That's also the reason why I have MySQL server installed locally, rather than using our test database. Sometimes that test database goes down, so I don't want to depend on that, either. And the admins thought it would be a good idea to let a bunch of other teams develop on that server but not tell us about it.

In fact, sometimes I break down. With a recent back injury, I was working from home for a week. It was slow, frustrating and painful because I wasn't really set up to work from home and this caused me to lose even more productivity. Had I been able to work from my laptop, there would have been no performance hit. Of course, even without a back injury, we had thousands of employees unable to work full hours because of a Tube strike.

Which brings me back to version control. More than once I've been unable to check in code because the subversion server has been down (never mind when the repository simply freaks out and tells me I can't commit). With something like git, I could do all of my work locally and if I can't push to the central repository, oh well.

So yeah, you know all this, but seriously, do the math. Let's be really generous and assume that our dev box, subversion server, mysql server and ability to be physically present at work all run at 99% availability. This means that your chance of being "available" to work is .99^4, or about 96%. If the average availability of all of those things is around 95%, then your "availability" drops to about 81%. Before I started working so hard to "disconnect" from the network at work, my availability was often lower than 81%. It was awful.
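
To make that arithmetic concrete, here is a minimal Perl sketch. It assumes the four dependencies fail independently of one another, which is what multiplying the numbers implies but which real outages won't always honour. The availability helper is a name made up for this illustration.

use strict;
use warnings;

# Combined availability of $count independent dependencies, each of
# which is up $each of the time (a fraction between 0 and 1).
# Hypothetical helper, not from any library.
sub availability {
    my ( $each, $count ) = @_;
    return $each**$count;
}

# Four dependencies: dev box, subversion server, mysql server, being at work.
for my $each ( 0.99, 0.95 ) {
    printf "%.0f%% each => %.1f%% combined\n",
        100 * $each, 100 * availability( $each, 4 );
}

# prints:
# 99% each => 96.1% combined
# 95% each => 81.5% combined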

Now imagine sitting in a job interview and telling them that you're only going to work 81% of the time. You're not going to get the job. But the reality is, many corporate environments enforce 81% availability. This is insane, but common. And of course, we also know that those four items aren't the only things which can disrupt your productivity, but those are the four which most impact mine, so that's why I chose them.

When you're setting up a dev environment for programmers and you want it to be as productive as possible, don't force them to depend on things they can't control, such as network availability. Of course, as an employee, it's always fun when the network goes down and you see everyone wandering around with a coffee cup in hand, having been electronically muzzled. What on earth could convince people that this is a sane development environment?

Wednesday June 24, 2009
02:50 AM

Nested TAP Now Available In Developer Release

Schwern has released Test::Simple 0.89_01. Nested TAP is now available. In case you don't recall (or haven't heard about this), you might write a subtest like this:

  use Test::More tests => 3;

  pass("First test");

  subtest 'An example subtest' => sub {
      plan tests => 2;

      pass("This is a subtest");
      pass("So is this");
  };

  pass("Third test");

And get the following nested TAP:

  1..3
  ok 1 - First test
      1..2
      ok 1 - This is a subtest
      ok 2 - So is this
  ok 2 - An example subtest
  ok 3 - Third test

(Adrian, I believe you have a branch of Test::Class which uses this? :)

Monday June 22, 2009
07:10 AM

Tight Coupling of Applications

At the BBC, we have some mandated technology stacks. While this is almost galling to some people's interpretation of TIMTOWTDI, the reality is that in a large-scale environment, it's very helpful to admins to not have to worry about which VCS, Wiki or other tools teams are using when needing different teams to work in shared environments. Of course, when you get down to lower level stuff, you really don't want to tell a team they can't upgrade DBD::mysql because another team requires an older version, but compromises have to be made. Today, that compromise is becoming painful.

I recently tried to commit some changes to our architecture-dependent CPAN modules (for example, modules in deps/lib/perl5/i486-linux-gnu-thread-multi/). However, subversion kept insisting that a bunch of files I was adding already existed in the repository, even though svn ls svn://path/to/archdeps did not list those files. After I and another developer kept digging around, we finally gave up and came up with a work-around. First, I did this:

find . -type d -name '.svn' -exec rm -fr {} \;

I then tarred up the deps directory, made a fresh checkout and untarred the files over the fresh checkout. The commit then worked just fine. Unfortunately, I then found out that somehow this corrupted several XS extensions, forcing me to reinstall them. Luckily this was trivial, but the last time this happened (yes, this has happened before!), I couldn't even run the cpan shell due to corrupted XS dependencies.

Having run into numerous issues with Subversion and having found git to be so pleasant to work with, I thought it was time for us to give up and switch to git, but we can't. Even if we get developer buy-in, we use Trac. Trac is tightly integrated with subversion but not with git. There are git plugins for Trac, but they're not ready for production use. Even if they were, it shouldn't be the case that two developers dealing with a frustration invest a lot of time and money into solving a problem whose solution potentially impacts a number of other teams.

The fact that Trac is so tightly integrated with our use of Subversion is a huge frustration (I don't know enough about the internals of Trac to know if this is our fault or theirs). I'd love to find a better way of dealing with this and I'd be ecstatic to see some strategy for decoupling the two. Unfortunately, it's not going to happen.

There are some great benefits of working with larger organizations, but this ain't one of them.

Sunday June 21, 2009
08:17 AM

Always, Always, Always Run Your Code Examples

Putting the finishing touches on my YAPC::EU 2009 paper and at the last moment, decided to run a code snippet since I was trying to compare it to Ruby mixins:

package PracticalJoke;
use Moose;
with 'Bomb'   => { exclude => 'explode' },
     'Spouse' => { exclude => 'fuse' };

I should have written excludes and not exclude. That's a very easy mistake to make, but given that people might want to run my examples, it's embarrassing :)
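
A self-contained sketch of the corrected snippet might look like the following. The method bodies in Bomb and Spouse are hypothetical stand-ins, added only so the example runs on its own. It also assumes a more recent Moose than the one current when this was written: as far as I know, later releases spell the option -excludes, with a leading dash, so check the spelling against your installed version.

use strict;
use warnings;

{
    package Bomb;
    use Moose::Role;

    # Stub bodies, invented for this illustration.
    sub fuse    { 'Bomb::fuse' }
    sub explode { 'Bomb::explode' }
}
{
    package Spouse;
    use Moose::Role;

    # Stub bodies, invented for this illustration.
    sub fuse    { 'Spouse::fuse' }
    sub explode { 'Spouse::explode' }
}
{
    package PracticalJoke;
    use Moose;

    # Keep fuse() from Bomb and explode() from Spouse by excluding
    # the conflicting method from the other role.
    with 'Bomb'   => { -excludes => 'explode' },
         'Spouse' => { -excludes => 'fuse' };
}

my $joke = PracticalJoke->new;
print $joke->fuse,    "\n";    # Bomb::fuse
print $joke->explode, "\n";    # Spouse::explode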

Friday June 19, 2009
10:57 AM

Package::Dynamic As Yet Unwritten

Piers Cawley has described Devel::Declare as possibly the most hostilely documented library [he's] ever come across. I'm hard-pressed to think of a more accurate description. For example, here's this gem from the docs:

For a simpler way to install new methods, see also Devel::Declare::MethodInstaller::Simple

The problem with that line is that Devel::Declare::MethodInstaller::Simple has no documentation at all! I now think I understand what it's doing, but my brain hurts.

What I want to do is transform something like this:

repackage $some_package {
    use Moose;
    extends 'Some::Class';
    with    'Some::Role';
}

Into this:

eval <<"END_DYNAMIC_REPACKAGING";
package $some_package;
use Moose;
extends 'Some::Class';
with    'Some::Role';
END_DYNAMIC_REPACKAGING
if ( my $error = $@ ) {
    require Carp;
    Carp::confess($error);
}

I now know that I can just use Moose::Meta::Class to do this, but this is actually a problem I encounter from time to time in other contexts. Rather than explicitly doing a string eval (which breaks syntax highlighting and hides the code from PPI), I'd like that 'repackage' to handle it.

In fact, delaying evaluation of a block until runtime is moderately common and that's really what we're trying to do here. Unfortunately, you have to call a string eval to do this. I think Devel::Declare can handle this, but the parser currently has me defeated. I'll try again later, but if you want to simply post the answer, I won't mind :)
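
For the Moose-specific case, a minimal sketch of the Moose::Meta::Class route might look like this. Some::Class and Some::Role are given hypothetical stub bodies so the example is self-contained, and the package name My::Dynamic::Package is made up. It only covers this one instance; it does nothing for the general problem of delaying evaluation of an arbitrary block.

use strict;
use warnings;

{
    package Some::Class;    # hypothetical stand-in
    use Moose;
    sub greet { 'hello from Some::Class' }
}
{
    package Some::Role;     # hypothetical stand-in
    use Moose::Role;
    sub shout {
        my $self = shift;
        return uc( $self->greet );
    }
}

# The package name is only known at runtime.
my $some_package = 'My::Dynamic::Package';

# Build the class through the metaclass API instead of a string eval.
Moose::Meta::Class->create(
    $some_package,
    superclasses => ['Some::Class'],
    roles        => ['Some::Role'],
);

my $object = $some_package->new;
print $object->shout, "\n";    # HELLO FROM SOME::CLASS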