Leto and I got chromatic to actually use github today at OS Bridge. In the process of explaining it to him I drew up the most useful diagram of git you will ever see. It illustrates the five layers (stash, working copy, staging area, local repo, remote repo), how you move changes between them and what layers different invocations of diff act on.
I wish someone had shown me this months ago.
UPDATE: Of course I'm not the first person to think of this. Here's a much cleaner version of what I did from an article about the git workflow.
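In command form, the layers and diffs look roughly like this. It's a sketch run in a throwaway repository, not a transcript of what we did at OS Bridge: the file name, commit message and identity are made up, and the remote comparison is only a comment because the demo has no remote.

```shell
#!/bin/sh
# Sketch: which layers each diff compares, in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"               # placeholder identity for the demo
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt                             # working copy -> staging area
git commit -q -m "first version"             # staging area -> local repo

echo "two" >> file.txt                       # change lives only in the working copy
unstaged=$(git diff --name-only)             # working copy vs staging area

git add file.txt
staged=$(git diff --cached --name-only)      # staging area vs local repo (HEAD)

echo "three" >> file.txt
since_head=$(git diff HEAD --name-only)      # working copy vs local repo (HEAD)

git stash -q                                 # working copy + staging area -> stash
after_stash=$(git diff HEAD --name-only)     # nothing left to show
stashed=$(git stash show --name-only)        # what the stash is holding
git stash pop -q                             # stash -> working copy

# With a remote configured, "git diff origin/master" would compare the
# working copy against the remote-tracking branch (names assumed), and
# "git push" / "git fetch" move commits between local and remote repos.
echo "unstaged=$unstaged staged=$staged since_head=$since_head stashed=$stashed"
```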
I signed on at YAPC to do a talk simply entitled Trapped In A Room With Schwern. Robert Blackwell said this of what he wants:
"I think you have a nice bent/slant/angle etc on a lot of stuff. I want you to get people talking. And no I would not expect every talk to be about perl. It would make me sad if they were all perl. But if you do it I would hope you could get everyone in the room to get excited about something. And I hope you would bore the crap out of others. Why do I think that b/c your audience is everyone from Larry to noob. You are not going to shock both of them or bore both of them with the same stuff."
I have a goldfish's memory for what's interesting to me. As I'm writing up a list of things to talk about, I'm thinking that I'm missing something really obvious that I've simply forgotten about or that seems old and obvious to me.
So, suggestions? What would people like to hear about? Perl or otherwise. What do I tend to babble animatedly about between sessions?
I posted about my WWW::Selenium + Test.Simple hack yesterday to enable automated Javascript unit testing. One of the problems was it was very slow. It had to start and kill a Firefox instance between each test which takes 8 seconds per test on my machine. Running 7 tests is a full minute.
Solution? Cache the selenium object! This will reuse the same Firefox session between tests so you only get slammed by the startup cost once. Now my 7 tests run in 8 seconds, the time to start up Firefox. That's awesome!
Will reusing the same Firefox process cause a problem? Unlikely. When I test web sites, with or without Selenium, I sure don't restart Firefox between checks. And neither will your users, so this is far more realistic. Web browsers are designed to isolate page requests from one another.
The prototype works. Future directions...
* Roll selenium-server into the distribution.
* Automate starting the selenium server.
* Add a config file...
    * Which browser(s) to use?
    * Which selenium server to use, or start its own?
    * What file extensions to test with selenium?
* Rerun tests across multiple browsers
* Turn the HTML wrapper into a configurable template
* Make it play nice with prove.
    * Turn it into something which can be used with --exec
    * Turn it into something which can be put into
* Modularize it
* Figure out how to keep the Firefox process from appearing
    * Or at least run backgrounded
I've been doing acceptance level QA at my $job lately which means a lot of clicking around in browsers and a lot of writing Selenium tests. Really my job is to reduce the amount of manual testing which needs to be done and automate as much as possible.
I was talking with Zack who said he hates Selenium. What he really meant was he hates testing at the browser level. It's so finicky to write Selenium regression tests that won't break later because the layout changed. He'd rather unit test his Javascript. I pointed out Test.Simple as a solution. Trouble is, that runs in a browser and someone still has to look at it. That's not very useful for automated tests.
So I began wondering... what would it take to pipe the Test.Simple TAP into TAP::Harness? Ideally I want to run the Javascript with a real DOM in a real browser, not some tinker toy simulation. Can I get Firefox to pipe its page rendering to STDOUT? I asked David Wheeler if he knew of anything and he pointed me at some unfinished things like JSAN::Prove which requires a bunch of complicated setup and dependencies which I tl;dr'd.
Maybe there's some module on CPAN which can pipe from Firefox. I searched for "firefox" and what comes up but WWW::Selenium. OF COURSE! I can use WWW::Selenium to talk to Selenium Remote Control and get the output of a web page!
require WWW::Selenium;

# Assumes you have a selenium server running locally on 4444
my $sel = WWW::Selenium->new(
    host        => "localhost",
    port        => 4444,
    browser     => "*firefox",
    browser_url => "file://nothing"
);

$sel->start;
$sel->open('http://isperldeadyet.com');
print $sel->get_body_text();
Then it's a simple matter of hooking this into TAP::Harness.
#!/usr/bin/perl -w

use TAP::Harness;

my $harness = TAP::Harness->new({
    exec => sub {
        my( $harness, $file ) = @_;

        # Let Perl programs run normally
        return undef unless $file =~ m{\.(js|html)$};

        require WWW::Selenium;
        my $sel = WWW::Selenium->new(
            host        => "localhost",
            port        => 4444,
            browser     => "*firefox",
            browser_url => "file://nothing"
        );

        require File::Spec;
        my $url = "file://" . File::Spec->rel2abs($file);

        $sel->start;
        $sel->open($url);

        # Get whatever's inside <pre id="TAP">
        return $sel->get_text(q{//pre[@id="TAP"]});
    },
    verbosity => 1
});

# Run the test files named on the command line
$harness->runtests(@ARGV);
Then write up a little HTML wrapper to run a basic Test.Simple test.
(Note that t/lib contains the Test.Simple libraries)
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<script type="text/javascript" src="t/lib/Test/Builder.js"></script>
<script type="text/javascript" src="t/lib/Test/Simple.js"></script>
<title>TAP test</title>
</head>
<body>
<pre id="TAP">
<script type="text/javascript">
plan({ tests: 1 });
ok( 1, "this is a test from Javascript" );
</script>
</pre>
</body>
</html>
And run it.
$ perl -w javascript_harness.plx tap.html
tap.html.. ok
All tests successful.
Files=1, Tests=1, 8 wallclock secs ( 0.19 usr + 0.03 sys = 0.22 CPU)
Result: PASS
It's slow, but it's awesome! It gives me something to stick into an
automated smoke server. Developers need a faster turn around time, so
they can quickly and individually unit test it directly in their browser using
Test.Harness.Browser... but that's another show.
But wait, there's more!
That HTML wrapper is icky. Wouldn't it be better if one could just
test Javascript directly? Why yes! How about we generate the wrapper
for
#!/usr/bin/perl -w

use autodie;
use TAP::Harness;

my $harness = TAP::Harness->new({
    exec => sub {
        my( $harness, $file ) = @_;

        my($type) = $file =~ m{\.(js|html)$};
        return unless $type;    # run Perl normally

        require File::Spec;
        require WWW::Selenium;
        my $sel = WWW::Selenium->new(
            host        => "localhost",
            port        => 4444,
            browser     => "*firefox",
            browser_url => "file://nothing"
        );

        # Both branches must hand the browser a file:// URL
        my $url = $type eq 'js'   ? _testify_javascript($file)
                : $type eq 'html' ? "file://" . File::Spec->rel2abs($file)
                :                   die "Unknown type $type";

        $sel->start;
        $sel->open($url);

        return $sel->get_text(q{//pre[@id="TAP"]});
    },
});

$harness->runtests(@ARGV);

# Turn .js files into .html with the Test.Simple libraries loaded.
sub _testify_javascript {
    my $file = shift;

    open my $fh, "<", $file;
    my $javascript = join "", <$fh>;

    use Cwd;
    use File::Temp 'tempfile';
    my $cwd = cwd;

    # UNLINK is the tempfile() option for removing the file at exit;
    # CLEANUP only applies to tempdir()
    my($tmpfh, $tmpfile) = tempfile( UNLINK => 1 );
    print $tmpfh <<"END_OF_HTML";
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<script type="text/javascript" src="$cwd/t/lib/Test/Builder.js"></script>
<script type="text/javascript" src="$cwd/t/lib/Test/Simple.js"></script>
<title>TAP test</title>
</head>
<body>
<pre id="TAP">
<script type="text/javascript">
$javascript
</script>
</pre>
</body>
</html>
END_OF_HTML

    return "file://$tmpfile";
}
Which eliminates all the scaffolding from the Javascript test file.
plan({ tests: 1 });
ok( 1, "this is a test from Javascript" );
And the end result is I can run Javascript and Perl tests together from the command line.
$ perl javascript_harness.plx t/*.js t/*.t
t/tap.js.. ok
t/perl.t.. ok
All tests successful.
Files=2, Tests=2, 9 wallclock secs ( 0.19 usr 0.04 sys + 0.03 cusr 0.01 csys = 0.27 CPU)
Result: PASS
UPDATE: As threatened, it's on github.
Some of you young whippersnappers might not even know about Coy. This is the module that introduced Damian Conway to the Perl community as the super genius he is. It's a module that generates haiku based on error messages.
#!/usr/bin/perl -w
use Coy;
open my $fh, "<", "doesnotexist" or die "Can't open file: $!";
-----
Eshun departs near
the village. A pair of woodpeckers
nesting. Bankei.
-----
Eshun's commentary...
Can't open file: No such file or directory
("/Users/schwern/tmp/test.plx Speaks": line 5.)
The American idea of a haiku is that it's a 5/7/5 syllable arrangement. So it must have a big dictionary of words and how many syllables they contain, right? Wrong! It has code to figure out how many syllables a word contains. It can also hyphenate them, pluralize them and has a basic understanding of what words and concepts go together sensibly. That's what makes it a Damian module. You should read his original presentation on it, if for nothing else than to see presentation grand master Damian using Comic Sans! But really because the whole thing is in haiku.
What also makes it a Damian module is it hasn't been touched since 1999. It contains a broken version of Lingua::EN::Inflect which overlays the separated CPAN version. All that clever hyphenating code has never been documented or released or tested. It needs love.
I asked Damian about it. He offered it to me. I can't even keep up with my own stuff so I declined... then I thought better and took it. It's on github now. I've removed the busted Lingua::EN::Inflect, written some basic tests and will re-release once I get PAUSE perms from Damian. Lingua::EN::Hyphenate should be split out into its own release; if anyone is feeling their oats, please take it.
This module is way too awesome to let die.
At PPW 2007 I gave a keynote about Skud's demographic Perl Survey (the domain has sadly lapsed). To illustrate the findings, I did an informal survey of the audience. Fortunately JCap caught it all on film^H^H^H^Helectrons.
First, as a control, I had everyone in the room stand up. It's well known that unless you're Jason Webley you'll never get 100% participation from the audience. So this provides a baseline to compare later measurements against.
As another baseline, we measured those who took the Perl Survey to have some way to compare this set with the set of people who took the survey.
Then I had just the white males stand up. If you visually swap back and forth between that picture and the one of everyone, you'll see there's not a whole lot of difference.
Here we have everyone who is not white and male. Big difference.
And finally a surprising result, those who started Perl during the
So there it is, for posterity.
Adam Kennedy just informed me that UNIVERSAL::require is in the top 100 most depended on modules on CPAN. Roughly #50 with 243 direct dependents and about 900 total dependents. It's an order of magnitude less important than MakeMaker (with about 9000 total dependents) but that's really surprising. And another fun way to destroy CPAN.
When I see code like this:
# Locale-ize the parser
my $ampm_list = join('|', @{$self->{_locale}->am_pms});
$ampm_list.= '|' . lc $ampm_list;
it's a dead giveaway that the author is using 4-character tabs. Particularly hateful. Not only do I have to sleuth that tabs are being used, but I also have to change my tab stops. (Delicious irony: use.perl translated the tabs to spaces... four spaces... so I had to detabify the example for it to show up correctly).
I count about 20 tabs vs space indentation mistakes in this module which just illustrates why tabs are so hateful... THEY'RE INVISIBLE! Little hidden surprises sprinkled all over the code. Even the author can't see the mistake. Tab users like to tout that it allows you to use whatever indentation level you like when in reality you just wind up with mixed up tabs and spaces.
Since I can't KILL ALL TAB USERS I guess I'll just have to get better tools. Fortunately there's already a large discussion of how to show whitespace on the EmacsWiki. I went with show-wspace.el. Now if only emacs could guess at the author's indentation style and adjust itself to match.
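Outside the editor, a grep will at least flush the invisible tabs into the open. A minimal sketch, assuming a POSIX shell and grep; the function name tab_indents is made up for illustration and it only looks at leading whitespace.

```shell
# Print "line-number:content" for every line whose indentation contains a tab.
tab_indents() {
    tab=$(printf '\t')
    # Zero or more leading spaces followed by a tab means a tab is hiding
    # somewhere in the indentation.
    grep -n "^ *${tab}" "$@"
}
```

Run as `tab_indents lib/Some/Module.pm` (path hypothetical), every line it prints in an otherwise space-indented file is one of those hidden surprises.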
Comment from vi user in 3..2..1..
Since it's Valentine's Day, I'll squee about something I love. Github. Oh I loves it so hard. Hopefully this love will stand the test of time.
For years now I've wanted to replace mailing patches around with a version control centric system. I drop patches on the floor all the time, they get lost in my inbox or in the ticket queue. And that sucks, because people who actually write code are really valuable.
To fix this, I want a system that lets anyone make their own branches, which they own, for each task they want to do. They could do easy updates from trunk, and when they're ready, they'd ping the project integrator to evaluate the change and pull it in.
SVK came close to this potential, with its easy pushing and pulling and SVN-like syntax, but I never came around to making the associated tools and web site. Now, github has implemented this. This is why I've moved Test::More to github and why more will be following.
Instead of branching, you fork the whole project. Git makes this cheap. You have total control over that fork, which is good because I'm a terrible bottleneck. When you're ready, you issue a "pull request" which pokes me to look at your changes and integrate them. Even better, there is a fork queue where you can see what changes are pending. Then a few clickies on web forms and the patches are integrated. It's a whole lot easier than the normal git pull process (see step 5). Github's integration process could provide a little more information about the merge, but I'm sure that will come.
Today I integrated a patch before it was even submitted!
Heart.
This came up on hates-software recently. I have a special hate in my heart for this one. It's one of those special "helpful enhancements" which is both inconvenient and fails to do its job.
$ rm *
rm: remove regular file `foo.txt'? y
rm: remove regular file `bar.txt'? y
rm: remove regular file `this.html'? YES
rm: remove regular file `junk.html'? YES!
rm: remove regular file `temporary.tmp'? YES GOD DAMNIT
rm: remove regular file `important.txt'? YES
...
Wait, NOOOOOOOO!!!!!!!
This is the "are you sure?" anti-pattern, where the computer second guesses every potentially irreversible command issued by the user. The Microsoft approach. It results in slow interactions and a frustrated user trained to reflexively hit "yes" before comprehending the warning. By the time they do, it's too late.
This design ignores that there's a difference between a mistake and a slip. A mistake is when the user really doesn't know what they're doing. A slip is when they do know what they're doing, but have a temporary lapse. Many programmers assume users are idiots, that everything is a mistake, and don't account for slips.
Here's what the dialog would look like with the buttons taking slips into account. (I can't take credit for this one, I saw it at YAPC St. Louis)
------------------------------------------------------
Remove file "foo.txt"?
[Yes] [No] [No, but I meant Yes] [Yes, but I meant No]
------------------------------------------------------
You can't really make slips go away, everyone slips up. All you can do is reduce their chance of occurring (which is another show) and most importantly, lessen their impact. One simple way to do that is by turning an irreversible action into a reversible one. That is, provide an undo button. Or, in terms of deleting files, a trash.
$ cat ~/bin/trash
#!/bin/sh
mv --backup=numbered "$@" ~/.Trash/
If you're going to monkey with rm to try and protect the user, make it move files to the trash. It doesn't break the outward interface (making it honor rm's flags is left as an exercise for the reader), and it actually does its intended job instead of just being broken, useless and annoying. Coupled with an automatic trash reaper (a cron job to delete the oldest files when the trash hits a certain size), and with hard drive sizes being what they are, most desktop users will never notice.
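The reaper half can be sketched in a few lines of shell. This is an illustration under assumptions, not a tested tool: the name reap_trash and the size cap are invented here, sizes are whatever `du -sk` reports in kilobytes, "oldest" means oldest modification time, and file names containing newlines are not handled.

```shell
# Delete the oldest entries in a trash directory until it fits under a cap.
#   $1 = trash directory, $2 = size limit in kilobytes
reap_trash() {
    dir=$1
    limit_kb=$2
    while [ "$(du -sk "$dir" | cut -f1)" -gt "$limit_kb" ]; do
        # ls -t sorts newest first, -r reverses it, -A includes dotfiles
        oldest=$(ls -Atr "$dir" | head -n 1)
        [ -n "$oldest" ] || break          # trash is empty, nothing to reap
        rm -rf -- "$dir/$oldest"
    done
}
```

Something like `reap_trash ~/.Trash 1048576` from a nightly cron job would hold the trash to roughly a gigabyte.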
Slips happen. Cushion the blow.