mugwumpjism's Journal
http://use.perl.org/~mugwumpjism/journal/
mugwumpjism's use Perl Journal
use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners.
2012-01-25T02:19:07+00:00
Dowse::BadSSH on CPAN
http://use.perl.org/~mugwumpjism/journal/36436?from=rss
<p>One of the nasty things about the recent OpenSSH vulnerability is that it affects non-Debian systems, too.
</p><p>Thankfully the script to find the bad keys was written in Perl. With a bit of back-porting, I managed to get it to work with perl 5.6.1, and thanks to the magic of <a href="http://search.cpan.org/dist/Module-Install">Module::Install</a>, I have made a tarball which includes the dependencies of the Debian-published script and uploaded it to CPAN as <a href="http://search.cpan.org/dist/Dowse-BadSSH">Dowse::BadSSH</a>.
</p><p>Unlike the published script, the updated <tt>dowkd.pl</tt> is capable of removing bad keys and checks more places on the system, such as known_hosts files and the system host key.
</p><p>Portability patches more than welcome.
</p><p>Yes, I realise I probably should have based my work off <a href="http://repo.or.cz/w/dowkd.git">the upstream sources</a>.
</p><p>Also available from <a href="http://utsl.gen.nz/Dowse-BadSSH-0.04.tar.gz">utsl.gen.nz</a>. Note there were not one but two brown paper bag releases for this. You want at least version 0.04 to safely use the '-r' option.</p>
mugwumpjism 2008-05-16T01:26:57+00:00 journal
Change 33412 was so yesterday
http://use.perl.org/~mugwumpjism/journal/35827?from=rss
<p>That's like ancient history by now. But there's a new <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=change-33412">gittorrent references file</a> for a distribution if you don't need to be up to date. A shallow clone of blead works out to be a 16MB download. A full clone of the ~37,733 commits in the repository is about a 112MB download.
</p><p>And for the record,<br>
<code>
Com- Pump- Release Date Notes<br>
mits king (by no means<br>
comprehensive,<br>
see Changes*<br>
for details)<br>
===============================================================================<nobr>=<wbr></nobr> ===<br>
<br>
Leon <a href="http://utsl.gen.nz/gitweb/?p=perl;a=shortlog;h=maint-5.005">maint-5.005</a> 2007-Oct-02<br>
<br>
Nicholas 5.8.5-RC1 2004-Jul-06<br>
5.8.5-RC2 2004-Jul-08<br>
5.8.5 2004-Jul-19<br>
5.8.6-RC1 2004-Nov-11<br>
5.8.6 2004-Nov-27<br>
5.8.7-RC1 2005-May-18<br>
23,900 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.8.7">5.8.7</a> 2005-May-30<br>
5.8.8-RC1 2006-Jan-20<br>
24,294 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.8.8">5.8.8</a> 2006-Jan-31<br>
25,218 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=shortlog;h=maint-5.8">maint-5.8</a> 2008-Mar-03<br>
<br>
23,217 Rafael <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.9.2">5.9.2</a> 2005-Apr-01<br>
25,521 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.9.3">5.9.3</a> 2006-Jan-28<br>
5.9.4 2006-Aug-15<br>
29,291 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.9.3">5.9.5</a> 2007-Jul-07<br>
5.10.0-RC1 2007-Nov-17<br>
5.10.0-RC2 2007-Nov-25<br>
30,191 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.10.0">5.10.0</a> 2007-Dec-18<br>
30,256 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=shortlog;h=maint-5.10">maint-5.10</a> 2008-Mar-03<br>
30,813 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=shortlog;h=blead">blead</a> 2008-Mar-03<br>
<br>
</code></p>
mugwumpjism 2008-03-03T22:10:55+00:00 news
Another few hours, another 13k changes
http://use.perl.org/~mugwumpjism/journal/35824?from=rss
<p>Freshly up - <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=change-22739">a fresh batch of changes</a>
<code>
Com- Pump- Release Date Notes<br>
mits king (by no means<br>
comprehensive,<br>
see Changes*<br>
for details)<br>
===============================================================================<nobr>=<wbr></nobr> ===<br>
Leon 5.005_04-RC1 2004-Feb-05<br>
5.005_04-RC2 2004-Feb-18<br>
3,943 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.005_04">5.005_04</a> 2004-Feb-23<br>
[...]<br>
Rafael 5.6.2-RC1 2003-Nov-08<br>
8,089 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.6.2">5.6.2</a> 2003-Nov-15 Fix new build issues<br>
Jarkko 5.7.0 2000-Sep-02 The 5.7 track: Development.<br>
5.7.1 2001-Apr-09<br>
5.7.2 2001-Jul-13 Virtual release candidate 0.<br>
15,424 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.7.3">5.7.3</a> 2002-Mar-05<br>
5.8.0-RC1 2002-Jun-01<br>
5.8.0-RC2 2002-Jun-21<br>
5.8.0-RC3 2002-Jul-13<br>
18,560 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.8.0">5.8.0</a> 2002-Jul-18<br>
5.8.1-RC1 2003-Jul-10<br>
5.8.1-RC2 2003-Jul-11<br>
5.8.1-RC3 2003-Jul-30<br>
5.8.1-RC4 2003-Aug-01<br>
5.8.1-RC5 2003-Sep-22<br>
19,911 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.8.1">5.8.1</a> 2003-Sep-25<br>
Nicholas 5.8.2-RC1 2003-Oct-27<br>
5.8.2-RC2 2003-Nov-03<br>
5.8.2 2003-Nov-05<br>
5.8.3-RC1 2004-Jan-07<br>
5.8.3 2004-Jan-14<br>
5.8.4-RC1 2004-Apr-05<br>
5.8.4-RC2 2004-Apr-15<br>
20,328 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.8.4">5.8.4</a> 2004-Apr-21<br>
<br>
21,401 Hugo <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.9.0">5.9.0</a> 2003-Oct-27<br>
22,007 Rafael <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.9.1">5.9.1</a> 2004-Mar-16<br>
</code></p>
mugwumpjism 2008-03-03T12:47:14+00:00 news
Another month, another ... hey what the?!
http://use.perl.org/~mugwumpjism/journal/35822?from=rss
<p>Change 9999 already! That's quite a few releases... new GitTorrent references tag is at <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=change-9999">this link</a>.
</p><p>For funsies, here's another section of the records:</p><p>
<code>
Com- Pump- Release Date Notes<br>
mits king (by no means<br>
comprehensive,<br>
see Changes*<br>
for details)<br>
============================================================================<br>
3460 <a href="http://www.perlfoundation.org/perl5/index.cgi?Gsar">Sarathy</a> <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.005">5.005</a> 1998-Jul-22 Oneperl.<br>
<br>
Sarathy 5.005_01 1998-Jul-27 The 5.005 maintenance track.<br>
5.005_02-T1 1998-Aug-02<br>
5.005_02-T2 1998-Aug-05<br>
5.005_02 1998-Aug-08<br>
<a href="http://www.perlfoundation.org/perl5/index.cgi?Graham">Graham</a> 5.005_03-MT1 1998-Nov-30<br>
5.005_03-MT2 1999-Jan-04<br>
5.005_03-MT3 1999-Jan-17<br>
5.005_03-MT4 1999-Jan-26<br>
5.005_03-MT5 1999-Jan-28<br>
5.005_03-MT6 1999-Mar-05<br>
3907 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.005_03">5.005_03</a> 1999-Mar-28<br>
<br>
Sarathy 5.005_50 1998-Jul-26 The 5.6 development track.<br>
5.005_51 1998-Aug-10<br>
5.005_52 1998-Sep-25<br>
5.005_53 1998-Oct-31<br>
5.005_54 1998-Nov-30<br>
5.005_55 1999-Feb-16<br>
5.005_56 1999-Mar-01<br>
5.005_57 1999-May-25<br>
5.005_58 1999-Jul-27<br>
5.005_59 1999-Aug-02<br>
5.005_60 1999-Aug-02<br>
5.005_61 1999-Aug-20<br>
5.005_62 1999-Oct-15<br>
5.005_63 1999-Dec-09<br>
5.5.640 2000-Feb-02<br>
5.5.650 2000-Feb-08 beta1<br>
5.5.660 2000-Feb-22 beta2<br>
5.5.670 2000-Feb-29 beta3<br>
5.6.0-RC1 2000-Mar-09 Release candidate 1.<br>
5.6.0-RC2 2000-Mar-14 Release candidate 2.<br>
5.6.0-RC3 2000-Mar-21 Release candidate 3.<br>
7223 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.6.0">5.6.0</a> 2000-Mar-22<br>
<br>
Sarathy 5.6.1-TRIAL1 2000-Dec-18 The 5.6 maintenance track.<br>
5.6.1-TRIAL2 2001-Jan-31<br>
5.6.1-TRIAL3 2001-Mar-19<br>
5.6.1-foolish 2001-Apr-01 The "fools-gold" release.<br>
8003 <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=perl-5.6.1">5.6.1</a> 2001-Apr-08<br>
</code>
The "commits" column is the approximate number of discrete changes in that version of Perl, not the number of changes in the entire repository. At Change 9999, there are 12266 revisions in git.</p>
mugwumpjism 2008-03-03T08:54:29+00:00 news
Oh, the irony of it
http://use.perl.org/~mugwumpjism/journal/35521?from=rss
<p>Will the Parrot team finish Perl 6 before I finish converting the Perl 5 history to git?
</p><p>Who knows<nobr> <wbr></nobr>... but it looks like I will end up missing my original target release date. I just haven't had the energy recently to burn the midnight oil for it.
</p><p>However, I have managed to <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commitdiff;h=d6e257af72f918637185bb916b71ba174d2a6296#patch1">knock off</a> one of the reasonably difficult TODO items in my Perforce importer. I think this one was responsible for most of the discrepancies between the preview export and the rsync blead. There are a couple of problems of this scale left; solving them will let me remove the nastiest queries and techniques employed by the tool.
</p><p>I'm quite happy that I developed this along the way using the stable development style, otherwise it would have been truly shattered by now<nobr> <wbr></nobr>:-). At least when progress is slow, it can be steady.
</p><p>Well, with a bit of luck I'll solve the other two biggies with a couple of those "magic" nights you get every now and then over the next few weeks. I'll post here once I get a decent preview export out of this version, and will continue to update as I work on the project.</p>
mugwumpjism 2008-01-30T14:13:05+00:00 journal
The Incomplete history of Perl, to change 999
http://use.perl.org/~mugwumpjism/journal/35194?from=rss
<p>The maintenance release 5.004_05 seems a reasonable place to call it a day. Several changes on the maint-5.004 branch were expanded using the same scripts that were used to extract commits from the early 5.004_NN-series patches.
</p><p>Additionally, there are now new rules that aim to place "soft"-links (commit IDs embedded in the commit message) when change numbers are used in commits.
</p><p> <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=change-999">References File</a></p>
mugwumpjism 2007-12-23T16:46:52+00:00 journal
Updating ... up to Change 379
http://use.perl.org/~mugwumpjism/journal/35192?from=rss
<p>Ok, so I might have built an exporter which can export 100s of commits per second;</p><blockquote><div><p> <tt>second.maia:~/src/perl.clean$ time git-p4raw export-commits -n 400<br>git-p4raw: gathering export plan<br>git-p4raw: exporting commits between 525 and 924<br>100% [===============================================================]D 0h00m00sgit-p4raw: Now checkpointing.<br>warning: Not updating refs/heads/p4/perl (new tip ef9eefa4f9098fbcc59c2aba28c73f4e071fe88f does not contain 5ec5d678e03468a8d1d3cb0b3863aacb4ba75233)<br>warning: Not updating refs/heads/p4/ansiperl (new tip d869d2901d6cf3197af2842a1cd2aca6fcea7024 does not contain c864495f8aa5c60a65eec5f015666b8f3ea5ae96)<br>warning: Not updating refs/heads/p4/win32/perl (new tip 15d3ab12d6f67a93f4836715f854ade61f9f6030 does not contain 1e696528467d711508235105c2294a1911fc12ad)<br>git-p4raw: waited 1s for p4raw.8790.marks to be created<br> <br>real 0m2.528s<br>user 0m0.636s<br>sys 0m0.056s<br>maia:~/src/perl.clean$ </tt></p></div> </blockquote><p>However there are some changes which are just begging to be made<nobr> <wbr></nobr>... I've added a <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commitdiff;h=b458774a5231a562283c35cd1620e772bbbe6127;hp=2266f9413e872796685c1c707fad36e8e9f85276">customization</a> to the part which returns a commit message from a perforce change. This is taking much of the older in-band data, back into its correct header places.
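As a rough illustration of what moving in-band data into header fields can look like, the sketch below pulls a leading "From:" line out of a change description and returns it as a separate author value. The "From: Name &lt;email&gt;" convention, the function name <tt>split_author</tt>, and the sample addresses are assumptions made for this example; the actual rules are in the git-p4raw customization linked above.

```python
import re

# Hypothetical convention: the description starts with "From: Name <email>".
_FROM_LINE = re.compile(r"^From:\s*(?P<name>[^<\n]+?)\s*<(?P<email>[^>\n]+)>\s*$")


def split_author(description, default_author):
    """Return (author, message), moving a leading From: line out of the message."""
    lines = description.splitlines()
    if lines:
        match = _FROM_LINE.match(lines[0])
        if match:
            author = "%s <%s>" % (match.group("name"), match.group("email"))
            # Drop the From: line and any blank lines that followed it.
            rest = lines[1:]
            while rest and not rest[0].strip():
                rest.pop(0)
            return author, "\n".join(rest)
    # No in-band attribution found: keep the committer as author.
    return default_author, description
```

A description without a recognisable first line is returned untouched, so the rewrite can only add information, never lose it.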
</p><p>One of the common practices through the history is to refer to other changes by number. So, I made the exporter <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commitdiff;h=2266f9413e872796685c1c707fad36e8e9f85276;hp=74268106d3c9a7ed63ee3a569ba28cb533905c2e">add extra links to the commits</a>. As the model of git-fast-import is to just fire the objects at it and let it calculate the SHA1, this required a <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw;a=blob;f=0001-git-fast-import-s-X-N-X-substr-mark-2.patch;h=4d167aecb4f69a5508242636feb354f8242e8bca;hb=2266f9413e872796685c1c707fad36e8e9f85276">hack to git-fast-import</a>. Hack it may well be, but it looks like it <a href="http://utsl.gen.nz/gitweb/?p=perl;a=commit;h=945eca2996b248214d83f63300a417d3602c0860">works</a>.
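The idea of linking change numbers to commits can be sketched as a plain text rewrite, assuming the commit IDs are already known. That assumption is exactly what does not hold inside a git-fast-import stream, which is why the exporter needed the patch mentioned above. The "change N" spelling, the function name <tt>add_soft_links</tt>, and the sample ID are illustrative only.

```python
import re

# Assumed spelling of a reference: the word "change" followed by a number.
_CHANGE_REF = re.compile(r"\bchange (\d+)\b", re.IGNORECASE)


def add_soft_links(message, commit_for_change):
    """Append an abbreviated commit ID after each known change number."""
    def link(match):
        commit_id = commit_for_change.get(int(match.group(1)))
        if commit_id is None:
            return match.group(0)  # unknown change: leave the text alone
        return "%s (%s)" % (match.group(0), commit_id[:12])
    return _CHANGE_REF.sub(link, message)
```

References to changes that have no entry in the mapping are left as they were written.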
</p><p> <b>Update:</b> <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=change-379">GitTorrent references file</a></p>
mugwumpjism 2007-12-23T09:28:12+00:00 journal
Happy birthday Perl! (You're on holiday in Hawai'i, right?)
http://use.perl.org/~mugwumpjism/journal/35146?from=rss
Today I published the <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=birthday-release-3">Incomplete history of Perl</a> as my birthday gift to Perl. Users of git or Mercurial can perform the initial clone now and avoid the rush later:<blockquote><div><p> <tt>git clone git://utsl.gen.nz/perl</tt></p></div> </blockquote><p>(also available via HTTP at <tt>http://git.utsl.gen.nz/perl</tt>)</p><p>Note that the preparation of this history has involved many long toiling hours of correlation of changelog information, searching for and attempting to apply literally thousands of patches from p5p archives and comparing them to the binary releases, etc. I believe it to be a significant achievement in the restoration of the early revisions of Perl. It certainly wouldn't have been possible without the support of my employer, <a href="http://www.catalyst.net.nz/">Catalyst IT</a>, and of course the wonderful revision toolkit that is <a href="http://repo.or.cz/">Git</a>.</p>
mugwumpjism 2007-12-19T07:52:06+00:00 journal
Tim Bunce's Perls now in perl git repository
http://use.perl.org/~mugwumpjism/journal/35122?from=rss
Now featuring almost 2,000 commits up to Perforce Change 128.
<p>Here is the <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=21e4c95c6aee4eb47435c8baa0347ebab6cc2f67">release announcement references file</a>.</p>
mugwumpjism 2007-12-17T17:13:20+00:00 news
A new set of references is distributed
http://use.perl.org/~mugwumpjism/journal/35119?from=rss
Using the <a href="http://gittorrent.utsl.gen.nz/rfc.html#peer-state-references">REFERENCES</a> message, the seeder's peer uploads <a href="http://utsl.gen.nz/gitweb/?p=perl;a=tag;h=883045e15cd42923a47fcc119e09c0868b2f593b">this signed update</a> to a couple of neighbouring peers. The peers verify the signature, and begin to request blocks from the seeding peer. The new 1MB or so of delta compressed content is distributed across the swarm in under a minute.
mugwumpjism 2007-12-17T14:12:05+00:00 journal
Seeding the swarm.
http://use.perl.org/~mugwumpjism/journal/35116?from=rss
There's no working <a href="http://gittorrent.utsl.gen.nz/rfc.html">GitTorrent</a> implementation yet (I'm <a href="http://utsl.gen.nz/gitweb/?p=VCS-Git-Torrent">working on it<nobr> <wbr></nobr>;)</a>, but here's a <a href="http://gittorrent.utsl.gen.nz/rfc.html#references">references file</a> anyway for an initial injection of the perl history.<blockquote><div><p> <tt>object 79a55ab9c5f2ae29fbdc495be80567680330f873<br>type tag<br>tag seed<br>tagger Sam Vilain <sam@vilain.net> 1197871162 +1300<br> <br>A complete history of Perl to 5.004_02<br> <br>There would have been more in this first injection, but 5.004_03 was<br>just too good a release to skip over with a single change.<br> <br>8490252049bf42d3d2f75d89178a8682bf22ba74 HEAD<br>8490252049bf42d3d2f75d89178a8682bf22ba74 refs/heads/master<br>73924ba4dbcc4bf33f88796af24ee0d0642f6147 refs/heads/p4/lexwarn/perl<br>4f59651d02ac3f77a817307d559e2381a9831021 refs/heads/p4/mainline/perl<br>d9a464030e8e08486d87711a575d5a5696634a73 refs/heads/p4/maint-5.004/perl<br>0fc29dc9270d39b1226ee7e3d156212e55679ebe refs/heads/p4/oneperl<br>2ddcc7aa6c936ba8e7a9703319dfd8959bb54574 refs/heads/p4/perl<br>2938739d2e019a7e34b70a9c71ba3903201f6d6d refs/heads/p4/perlext/Compiler<br>683929b49c6384fb92ba65fc111b71ae82a6e29d refs/heads/p4/perlext/Thread<br>ec4e49dc1523dcdb6bec56a66be410eab95cfa61 refs/heads/p4/relperl<br>9ed32d99bcab50ff8df392e9741dd3de08a596a4 refs/heads/p4/thrperl<br>fcc42238821171e387706f935f68939e32051fd7 refs/heads/p4/win32/perl<br>8d063cd8450e59ea1c611a2f4f5a21059a2804f1 refs/tags/perl-1.0<br>378cc40b38293ffc7298c6a7ed3cd740ad79be52 refs/tags/perl-2.0<br>ffd30a0b488495f48bc676c58309803860e1e715 refs/tags/perl-2.001<br>a687059cbaf2c6fdccb5e0fae2aee80ec15625a8 refs/tags/perl-3.000<br>27e2fb84680b9cc1db17238d5bf10b97626f477f refs/tags/perl-3.044<br>778d8c9dd800db4cc8f91788f5356eea9559bfc4 refs/tags/perl-3gamma<br>e334a159a5616cab575044bafaf68f75b7bb3a16 refs/tags/perl-4.0.36<br>a0d0e21ea6ea90a22318550944fe6cb09ae10cda 
refs/tags/perl-5.000<br>fec02dd38faf8f83471b031857d89cb76fea1ca0 refs/tags/perl-5.000o<br>748a93069b3d16374a9859d1456065dd3ae11394 refs/tags/perl-5.001<br>4aa0a1f7324b8447469670a1b2427c3ac2428bae refs/tags/perl-5.001l<br>e50aee73b3d4c555c37e4b4a16694765fb16c887 refs/tags/perl-5.001m<br>8e07c86ebc651fe92eb7e3b25f801f57cfb8dd6f refs/tags/perl-5.001n<br>a5f75d667838e8e7bb037880391f5c44476d33b4 refs/tags/perl-5.002<br>cc72480dfb711da8819daacccecdeaf1b246ed48 refs/tags/perl-5.002_01<br>4633a7c4bad06b471d9310620b7fe8ddd158cccd refs/tags/perl-5.002b1<br>f70b6ff5dbae63778d9b1ac9a297c2d960e64cbf refs/tags/perl-5.002b1h<br>91b7def858c29dac014df40946a128c06b3aa2ed refs/tags/perl-5.002b2<br>c07a80fdfe3926b5eb0585b674aa5d1f57b32ade refs/tags/perl-5.002b3<br>2920c5d2b358b11ace52104b6944bfa0e89256a7 refs/tags/perl-5.002gamma<br>b4a488b9f66ab1d51091a616c6fab3b3b300db36 refs/tags/perl-5.003<br>43cc1d52f97c5f21f3207f045444707e7be33927 refs/tags/perl-5.003_01<br>0c73a419525723821ea6572a87abb4e3fd04cab1 refs/tags/perl-5.003_02<br>9bb683e6b9e9c477f8ad211dd2f5b8c19d1b7bc2 refs/tags/perl-5.003_03<br>6252a976dfa7360d3fe0c876e583cf00ccd46c21 refs/tags/perl-5.003_04<br>83f702e05ad51fc3e6dc8ca853f13c5ecfac6166 refs/tags/perl-5.003_05<br>c66dd20ebe4b5d5267b3d700cc1002cbae7a86fe refs/tags/perl-5.003_06<br>49e60db318dafe6977e7332fc64c4fac32f5eb2c refs/tags/perl-5.003_07<br>fc996c6d2516678fa2851fd9005f805d593dd0fd refs/tags/perl-5.003_08<br>caa48c6690970f2445e75b64a107fd944e6632e4 refs/tags/perl-5.003_09<br>c407cc441c4f7840e9470ebb574378e557410788 refs/tags/perl-5.003_10<br>c0ce25c4324ce509b1fa849fa1714cc53be05c1e refs/tags/perl-5.003_11<br>58b4fa454e0ece6644458135d2a1b953914e890b refs/tags/perl-5.003_12<br>0137675c105d12c696699e187e15a717855db5e1 refs/tags/perl-5.003_13<br>8ce17d58707471746c013f21470e4c491b1528d1 refs/tags/perl-5.003_14<br>d0df753667027cbfca2518e4d2af0ee6ca1a0e7b refs/tags/perl-5.003_15<br>a0181376d0b59c0197d01352284440d67ee8a5cf 
refs/tags/perl-5.003_16<br>7fba0250235f5256e8f87e81f39a43b57a3825f4 refs/tags/perl-5.003_17<br>2e5e17f3a88325b29780c6cba8397138a725953f refs/tags/perl-5.003_18<br>33502c5b4ee64074899031c457cf9716683184c4 refs/tags/perl-5.003_19<br>6160ed969d77552af29192c9d758ca097e4f2623 refs/tags/perl-5.003_20<br>160f596370bd319a28800c06784046bba646a01c refs/tags/perl-5.003_21<br>a450582fafbfed0e73f4af686cfd9af9767c655f refs/tags/perl-5.003_22<br>c17482d859ad924eed69061cee34d415c96dd036 refs/tags/perl-5.003_23<br>f698111030ef998e22f359e7cc6013a755013c98 refs/tags/perl-5.003_24<br>f401c9f25871c376dcbfb36b8223dee1af376acf refs/tags/perl-5.003_25<br>5d5810d1c6de8a3a7380a493e10e99d45c2fdfa8 refs/tags/perl-5.003_26<br>147cc559d06ec9e07958756491f3a86ee48915a5 refs/tags/perl-5.003_27<br>927b0b4cd5120a7df523972a098d9118d1941517 refs/tags/perl-5.003_28<br>e068dd810b6a88d60def58601ace4fb7db60d751 refs/tags/perl-5.003_90<br>3508d2b629952724414f48c3ccd24cc38a5cacf4 refs/tags/perl-5.003_91<br>45a349425d251084ca9614c7b8ebd2b475a022e0 refs/tags/perl-5.003_92<br>2c90bd5122441d3ffd67b42518f55a0da737e2b0 refs/tags/perl-5.003_93<br>018aae727c3b70f4bd84b3ba625d3fbca663d22e refs/tags/perl-5.003_94<br>f4135ee29a6ba098dcb34808d211955106dea1a0 refs/tags/perl-5.003_95<br>770cc2ec57cfffdf4e6e0b236e4ac6d3f4f156cd refs/tags/perl-5.003_96<br>6c2286864e8c9e5854704e0f2f76e4acaede21c9 refs/tags/perl-5.003_97<br>fa21ed85df0c7b0426c491b4b9674ad97706dead refs/tags/perl-5.003_97a<br>ba7c7025f98e9a5b785bfc46d3b104244d2fc6ba refs/tags/perl-5.003_97b<br>3a9c13334bc5461a028e4db753a384e54df9dd87 refs/tags/perl-5.003_97c<br>966d917ddb2bd79d3afb8843668bc00444122970 refs/tags/perl-5.003_97d<br>9c1743c6c6306adf4115e06797599457f334f832 refs/tags/perl-5.003_97e<br>9f98d20b03330a3c2d6a54dbfe6915fa10c761db refs/tags/perl-5.003_97f<br>8f1920c69889fe43516bace443aa106aa166b78f refs/tags/perl-5.003_97g<br>5ff748c3dd13028969f3597e82c53e5a59d98fbe refs/tags/perl-5.003_97h<br>3afa6ea5d00c69bec4fd15a7b65a6879b44e90c3 
refs/tags/perl-5.003_97i<br>3f0968d5acc94891e409a8a05ee8831209dd0e3c refs/tags/perl-5.003_97j<br>2ac29e1bada2c1be30af16a2d6b57f386ca76ba3 refs/tags/perl-5.003_98<br>cc4e685f86b71bb8ea4f3fa125f87886664175b6 refs/tags/perl-5.003_99<br>9f98c66411c821bdf5805fb6642517ecb382d9e8 refs/tags/perl-5.003_99a<br>a923c5bf4b2f815a95fc4e1a2ab7787e7455d316 refs/tags/perl-5.004<br>3e3baf6d63945cb64e829d6e5c70a7d00f3d3d03 refs/tags/perl-5.004_01<br>8490252049bf42d3d2f75d89178a8682bf22ba74 refs/tags/perl-5.004_02<br>79072805bf63abe5b5978b5928ab00d360ea3e7f refs/tags/perl-5a2<br>93a17b20b6d176db3f04f51a63b0a781e5ffd11c refs/tags/perl-5a3<br>463ee0b2acbd047c27e8b5393cdd8398881824c5 refs/tags/perl-5a4<br>ed6116ce9b9d13712ea252ee248b0400653db7f9 refs/tags/perl-5a5<br>8990e3071044a96302560bbdb5706f3e74cf1bef refs/tags/perl-5a6<br>2304df62caa7d9be70e8b8bcdb454e139c9c103d refs/tags/perl-5a8<br>85e6fe838fb25b257a1b363debf8691c0992ef71 refs/tags/perl-5a9<br>b061ae04e6c92fbb3db8273eb86ed934b33f704b refs/tags/the_answer<br>-----BEGIN PGP SIGNATURE-----<br>Version: GnuPG v1.4.6 (GNU/Linux)<br> <br>iEYEABECAAYFAkdmED0ACgkQ/AZAiGayWEMC6QCbBgqXwz5z5AaTXLPQfiOApBcg <br> JDQAoIMFdWDwS7IM9Kc4GhWyyYU5snVx<br>=piDI<br>-----END PGP SIGNATURE-----</tt></p></div> </blockquote><p>Clairvoyant observers may notice certain <a href="http://git.utsl.gen.nz/perl/objects/pack/">packs</a> from the tiny corner of the internet are good ones to clone to avoid repacking later.
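The body of the references file quoted above is a list of "commit ID, ref name" pairs, which makes it easy to read mechanically. The following is a minimal sketch that assumes the tag text is available as plain text with one entry per line (rather than the <tt>&lt;br&gt;</tt>-separated HTML shown here); <tt>parse_references</tt> is a name invented for this example and is not part of any GitTorrent code.

```python
import re

# A reference line is a 40-digit hex object ID, whitespace, then the ref name.
_REF_LINE = re.compile(r"^([0-9a-f]{40})\s+(\S+)$")


def parse_references(text):
    """Map ref name -> object ID for every "<sha1> <refname>" line in text."""
    refs = {}
    for line in text.splitlines():
        match = _REF_LINE.match(line.strip())
        if match:
            refs[match.group(2)] = match.group(1)
    return refs
```

Tag headers, the free-form message, and the PGP signature block do not match the pattern, so they are skipped. This sketch does not verify the signature.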
</p><p>Early adopters can
</p><p> <code>git clone http://git.utsl.gen.nz/perl</code>
</p><p>It's easy to change the repository later to its eventual home possibly on <tt>cpan.perl.org</tt>, using the git 1.5+ commands:
</p><p> <code>git remote rm origin<br>git remote add origin git://git.perl.org/perl</code>
</p><p>Many thanks to Catalyst for the bandwidth, hosting, and letting me spend working hours on this stuff.</p>
mugwumpjism 2007-12-17T06:38:42+00:00 journal
Final Perl history conversion coming soon!
http://use.perl.org/~mugwumpjism/journal/35025?from=rss
<p>Well, some time has passed, and I am getting very close to making a release of the Perl 5 project history, converted to git.
</p><p>I would have written <a href="http://vilain.net/index.php?q=node/41">this article about it</a> on this site, but I wanted to include the gitk screenshots<nobr> <wbr></nobr>:-). If you're interested in how the Perforce to Git conversion happened, then read this article!</p>
mugwumpjism 2007-12-01T15:33:37+00:00 news
git-p4raw v2 specification
http://use.perl.org/~mugwumpjism/journal/34549?from=rss
Just a quick update - I have decided to re-shape the messy proof-of-concept raw importer into a tool which is more useful, allowing access to the information which the previous version just embedded with queries as it needed to. I'd love it if current Perl perforce committers and pumpkins would check <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw;a=commitdiff;h=5fabfe98">the specification</a> out and provide feedback. In particular, if there's a perforce command you'd like explained or think should be catered for by the tool, go ahead and ask<nobr> <wbr></nobr>...
mugwumpjism 2007-09-26T13:36:49+00:00 journal
Ripping the Perl Perforce repository
http://use.perl.org/~mugwumpjism/journal/34521?from=rss
<p>Today I had my first successful run of <a href="http://utsl.gen.nz/gitweb/?p=git-p4raw">git-p4raw</a>, a program I have been working on recently that imports a Perforce repository from its raw back-end data files.
</p><p>The program is actually quite simple, and its approach was inspired by a conversation between myself and Gurusamy Sarathy. Perforce keeps these "checkpoint" and "journal" files which have inside them all of the metadata that it tracks. The file content itself is stored in RCS files which are otherwise uninteresting. So, the program loads this data into tables, throws a few constraints on the tables to confirm my suspicions of how the schema works, and then sets about mining changesets from the tables. As there is MD5 information in the database (for most files, anyway), integrity can be assured along the way and as a result I am very confident that this will be the cleanest conversion yet.
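The integrity check described here amounts to hashing each file revision and comparing the result with the digest recorded in the metadata. A minimal sketch, assuming the recorded digest is available as a hexadecimal string; the function name <tt>content_matches</tt> is made up for this example and says nothing about how git-p4raw itself stores or compares digests.

```python
import hashlib


def content_matches(content, recorded_md5):
    """True if the MD5 of content (bytes) equals the recorded hex digest."""
    # Compare case-insensitively, since the stored digest may be upper case.
    return hashlib.md5(content).hexdigest() == recorded_md5.lower()
```

A mismatch would flag a revision whose reconstructed content differs from what the metadata recorded, which can then be investigated before it is exported.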
</p><p>I still have some work to do:
</p><ol>
<li>It might be possible to represent some of the branch cross-merges in the history as real merges in git.</li>
<li>The 5.004 to 5.004_04 series (Tim Bunce's early maintenance work) is currently represented in Perforce as only 5 changes. I'd like to expand that as I have done with the other pre-perforce series.</li>
<li>Many changes have embedded author attribution that should be copied into the git "author" field of the commits, so that OHLOH etc. have the best information.</li>
<li>Final rewriting to glue the old history onto the new.</li>
</ol><p>I look forward to announcing the completion of this work! It's been a long, hard slog!<nobr> <wbr></nobr>:)</p>
mugwumpjism 2007-09-22T13:25:51+00:00 journal
now back to where I was at the end of the Reposithon
http://use.perl.org/~mugwumpjism/journal/34159?from=rss
<p>Ok, so with a bit more work, I've managed to get a full 169 more commits out of the 5.003_08 -> <a href="http://git.catalyst.net.nz/gw?p=perl.git;a=tag;h=perl-5.004">5.004</a> series. I'm proud of it as a loving restoration of the history of the period, a tribute to the work that went into this quality Perl release. Perhaps 50% of the changes are attributed and applied separately, including countless patches mined from the Perl 5 porters archives.
</p><p>Here are the updated sections of THE RECORDS with the new commit totals:
</p><p>
<code>
THE RECORDS<br>
Com- Pump- Release Date Notes<br>
mits king (by no means<br>
comprehensive,<br>
see Changes*<br>
for details)<br>
=======================================================================<nobr>=<wbr></nobr> ===<br>
986 <a href="http://www.perlfoundation.org/perl5/index.cgi?Chip">Chip</a> <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.003_08">5.003_08</a> 1996-Nov-19<br>
5.003_09 1996-Nov-26<br>
5.003_10 1996-Nov-29<br>
5.003_11 1996-Dec-06<br>
5.003_12 1996-Dec-19<br>
5.003_13 1996-Dec-20<br>
1092 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_14">5.003_14</a> 1996-Dec-23<br>
5.003_15 1996-Dec-23<br>
5.003_16 1996-Dec-24<br>
5.003_17 1996-Dec-27<br>
5.003_18 1996-Dec-31<br>
5.003_19 1997-Jan-04<br>
5.003_20 1997-Jan-07<br>
5.003_21 1997-Jan-15<br>
5.003_22 1997-Jan-16<br>
5.003_23 1997-Jan-25<br>
5.003_24 1997-Jan-29<br>
5.003_25 1997-Feb-04<br>
5.003_26 1997-Feb-10<br>
5.003_27 1997-Feb-18<br>
1349 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_28">5.003_28</a> 1997-Feb-21<br>
5.003_90 1997-Feb-25 Ramping up to the 5.004 release.<nobr> <wbr></nobr> <br>
5.003_91 1997-Mar-01<br>
5.003_92 1997-Mar-06<br>
5.003_93 1997-Mar-10<br>
5.003_94 1997-Mar-22<br>
5.003_95 1997-Mar-25<br>
5.003_96 1997-Apr-01<br>
1596 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_97">5.003_97</a> 1997-Apr-03 Fairly widely used.<br>
5.003_97a 1997-Apr-05<br>
5.003_97b 1997-Apr-08<br>
5.003_97c 1997-Apr-10<br>
5.003_97d 1997-Apr-13<br>
5.003_97e 1997-Apr-15<br>
5.003_97f 1997-Apr-17<br>
5.003_97g 1997-Apr-18<br>
5.003_97h 1997-Apr-24<br>
5.003_97i 1997-Apr-25<br>
5.003_97j 1997-Apr-28<br>
1750 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_98">5.003_98</a> 1997-Apr-30<br>
5.003_99 1997-May-01<br>
5.003_99a 1997-May-09<br>
p54rc1 1997-May-12 Release Candidates.<br>
p54rc2 1997-May-14<br>
<br>
1812 Chip <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.004">5.004</a> 1997-May-15 A major maintenance release.<br>
</code>
</p><p>Next mission: lather, rinse, repeat, but with Tim Bunce's 5.004_* so-called 'maintenance' releases.
</p><p> <b>Exercise</b>: download <a href="http://git.catalyst.net.nz/perl.git/objects/pack/pack-bf9ec9a22a047c2712011b87e1df228c69976c36.pack">this 8MB pack file</a>, then see if you can find the 1812 commits within. Start with <tt>git index-pack</tt>.</p>
mugwumpjism 2007-08-20T12:58:37+00:00 journal
Another perl git update
http://use.perl.org/~mugwumpjism/journal/34150?from=rss
Now up to about <a href="http://git.catalyst.net.nz/gw?p=perl.git;a=shortlog;h=restorical-v3alpha7">5.003_97d</a>. I already have more commits in the history than I finished with the first time around. Some commit dates are wrong; they will be fixed shortly. I'm having difficulty pushing tags<nobr> <wbr></nobr>... will hopefully find out what's going wrong soon.
mugwumpjism 2007-08-19T06:35:05+00:00 journal
History is not linear - the case for micro-branching
http://use.perl.org/~mugwumpjism/journal/34135?from=rss
I can't embed images in this journal, so <a href="http://vilain.net/index.php?q=node/38">here</a> instead. It's a little examination of how one point release has been represented, with a gitk screenshot.
mugwumpjism 2007-08-16T23:23:30+00:00 journal
Another perl git update
http://use.perl.org/~mugwumpjism/journal/34126?from=rss
<p>Another half-day on the history. This time I worked on 5.003_21 through 5.003_91. <a href="http://git.catalyst.net.nz/gw?p=perl.git;a=shortlog;h=af35ebf7e4816ca9425aadec48090e4708d6667e">latest version</a>.</p>
mugwumpjism 2007-08-16T07:30:30+00:00 journal
Perl 5 history continued; OHLOH listing
http://use.perl.org/~mugwumpjism/journal/34117?from=rss
<p>After some quality checks on the pre-perforce history, I decided it would be worth running through the import of the 5.003_08 -> 5.004 series again, but this time with a little more patience. When I ran through it at the reposithon, I was accepting all the defaults and decided that I'd written too fancy an interactive importer to let that be the case.
</p><p>Some people might wonder why the importer needs to be interactive; let me say that I'm making dozens of minor corrections along the way. There is a lot of manual work going into it!
</p><p>Just today, I've managed to extract another 29 commits out of the 5.003_07 -> 5.003_21 phase of the history, bringing the total from that period from 200 to 229. This might sound minor, but taking the higher quality of the conversion into account I'd say I'm increasing the detail and resolution by about a third to a half.
</p><p>However, I am pretty much on a roll with it, and hope to keep going all the way through to the end of about 5.004_04 very soon, which will give a very firm base for putting the Perforce history onto. Thanks to Robert Spiers and John Peacock, I now have a Perforce repository with the metadata extracted. It will be worth doing some re-work of the Perforce importer to get it clean.
</p><p>I'm trying to get the project listed on OHLOH - but OHLOH seems to be having some issues. I have written to the maintainers of the site.</p>
mugwumpjism 2007-08-15T15:48:02+00:00 journal
Pumpking conversion so far
http://use.perl.org/~mugwumpjism/journal/34045?from=rss
<p>For those who didn't see the <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=restorical-v2">initial release</a>, here's an annotated
<a href="http://search.cpan.org/dist/perl/pod/perlhist.pod">perlhist</a>; the first column is the number of commits in the history at that point. As I release updates, I will log them here.
</p><p>
<code>
THE RECORDS<br>
Com- Pump- Release Date Notes<br>
mits king (by no means<br>
comprehensive,<br>
see Changes*<br>
for details)<br>
===========================================================================<br>
<br>
Larry 0 Classified. Don’t ask.<br>
ada1bfb6c4d1<br>
1 <a href="http://www.perlfoundation.org/perl5/index.cgi?Larry">Larry</a> <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-1.0">1.000</a> 1987-Dec-18<br>
<br>
1.001..<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-1.010">10</a> 1988-Jan-30<br>
14 1.011..<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-1.014">14</a> 1988-Feb-02<br>
<a href="http://www.perlfoundation.org/perl5/index.cgi?Schwern">Schwern</a> 1.0.15 2002-Dec-18 Modernization<br>
<a href="http://www.perlfoundation.org/perl5/index.cgi?Richard">Richard</a> 1.0.16 2003-Dec-18<br>
<br>
Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-2.000">2.000</a> 1988-Jun-05<br>
<br>
17 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-2.001">2.001</a> 1988-Jun-28<br>
<br>
19 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-2.001">3.000</a> 1989-Oct-18<br>
<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=ada1bfb6c4d1">3.001</a> 1989-Oct-26<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=784e8af9adc4">3.002..4</a> 1989-Nov-11<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=1d1e3892e787">3.005</a> 1989-Nov-18<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=2e223e830e62">3.006..8</a> 1989-Dec-22<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=4bacc023fa33">3.009..12</a> 1990-Mar-02<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=bd7b82b21dc0">3.013..14</a> 1990-Mar-13<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=3ad8b73d1746">3.015</a> 1990-Mar-14<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=145755a28c5a">3.016..18</a> 1990-Mar-28<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=d166707f0aae">3.019..27</a> 1990-Aug-10 User subs.<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=13eed104114b">3.028</a> 1990-Aug-14<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=55cbef4aa38a">3.029..36</a> 1990-Oct-17<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=c964363f708c">3.037</a> 1990-Oct-20<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=84329cee975b">3.038..040</a> 1990-Nov-10<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=57a2d3e23b59">3.041</a> 1990-Nov-13<br>
<a href="http://git.catalyst.net.nz/gw?p=perl.git;a=commitdiff;h=e9358fa65d8e">3.042..43</a> 1991-Jan-??<br>
62 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-3.044">3.044</a> 1991-Jan-12<br>
<br>
63 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=a159586cbb67">4.000</a> 1991-Mar-21<br>
<br>
4.001..3 1991-Apr-12<br>
4.004..9 1991-Jun-07<br>
4.010 1991-Jun-10<br>
4.011..18 1991-Nov-05<br>
4.019 1991-Nov-11 Stable.<br>
4.020..33 1992-Jun-08<br>
4.034 1992-Jun-11<br>
4.035 1992-Jun-23<br>
99 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-4.0.36">4.036</a> 1993-Feb-05 Very stable.<br>
<br>
5.000alpha1 1993-Jul-31<br>
100 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a2">5.000alpha2</a> 1993-Aug-16<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a3">5.000alpha3</a> 1993-Oct-10<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a4">5.000alpha4</a> 1993-???-??<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a5">5.000alpha5</a> 1993-???-??<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a6">5.000alpha6</a> 1994-Mar-18<br>
5.000alpha7 1994-Mar-25<br>
<a href="http://www.perlfoundation.org/perl5/index.cgi?Andy">Andy</a> <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a8">5.000alpha8</a> 1994-Apr-04<br>
Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5a9">5.000alpha9</a> 1994-May-05 ext appears.<br>
5.000alpha10 1994-Jun-11<br>
5.000alpha11 1994-Jul-01<br>
Andy 5.000a11a 1994-Jul-07 To fit 14.<br>
5.000a11b 1994-Jul-14<br>
5.000a11c 1994-Jul-19<br>
5.000a11d 1994-Jul-22<br>
Larry 5.000alpha12 1994-Aug-04<br>
Andy 5.000a12a 1994-Aug-08<br>
5.000a12b 1994-Aug-15<br>
5.000a12c 1994-Aug-22<br>
5.000a12d 1994-Aug-22<br>
5.000a12e 1994-Aug-22<br>
5.000a12f 1994-Aug-24<br>
5.000a12g 1994-Aug-24<br>
5.000a12h 1994-Aug-24<br>
Larry 5.000beta1 1994-Aug-30<br>
Andy 5.000b1a 1994-Sep-06<br>
Larry 5.000beta2 1994-Sep-14 Core slushified.<br>
Andy 5.000b2a 1994-Sep-14<br>
5.000b2b 1994-Sep-17<br>
5.000b2c 1994-Sep-17<br>
Larry 5.000beta3 1994-Sep-??<br>
Andy 5.000b3a 1994-Sep-18<br>
5.000b3b 1994-Sep-22<br>
5.000b3c 1994-Sep-23<br>
5.000b3d 1994-Sep-27<br>
5.000b3e 1994-Sep-28<br>
5.000b3f 1994-Sep-30<br>
5.000b3g 1994-Oct-04<br>
Andy 5.000b3h 1994-Oct-07<br>
Larry? 5.000gamma 1994-Oct-13?<br>
<br>
109 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5.000">5.000</a> 1994-Oct-17<br>
<br>
Andy <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=ac419cfcfe42">5.000a</a> 1994-Dec-19<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=d68b9cd28e53">5.000b</a> 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=43a571f37e97">5.000c</a> 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=3bb22b1bdce1">5.000d</a> 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=69bb195668a1">5.000e</a> 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=e623fb575cfc">5.000f</a> 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=ee695892848a">5.000g</a> 1995-Jan-18<br>
5.000h 1995-Jan-18<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=acc789bc0a0b">5.000i</a> 1995-Jan-26<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=8157edb431ae">5.000j</a> 1995-Feb-07<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=170ef09c4097">5.000k</a> 1995-Feb-11<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=b54c23d1339b">5.000l</a> 1995-Feb-21<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=d849d1eae02e">5.000m</a> 1995-Feb-28<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=039a0abf3996">5.000n</a> 1995-Mar-07<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=6e56b8560de0">5.000o</a> 1995-Mar-13?<br>
<br>
128 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=perl-5.001">5.001</a> 1995-Mar-13<br>
<br>
Andy 5.001a 1995-Mar-15<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=696118d8c7a9">5.001b</a> 1995-Mar-31<br>
<a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=commit;h=9d9330c055c8">5.001c</a> 1995-Apr-07<br>
5.001d 1995-Apr-14<br>
5.001e 1995-Apr-18 Stable.<br>
5.001f 1995-May-31<br>
5.001g 1995-May-25<br>
5.001h 1995-May-25<br>
5.001i 1995-May-30<br>
5.001j 1995-Jun-05<br>
5.001k 1995-Jun-06<br>
5.001l 1995-Jun-06 Stable.<br>
5.001m 1995-Jul-02 Very stable.<br>
5.001n 1995-Oct-31 Very unstable.<br>
146 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.002b1">5.002beta1</a> 1995-Nov-21<br>
5.002b1a 1995-Dec-04<br>
5.002b1b 1995-Dec-04<br>
5.002b1c 1995-Dec-04<br>
5.002b1d 1995-Dec-04<br>
5.002b1e 1995-Dec-08<br>
5.002b1f 1995-Dec-08<br>
<a href="http://www.perlfoundation.org/perl5/index.cgi?Tom">Tom</a> 5.002b1g 1995-Dec-21 Doc release.<br>
207 Andy <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.002b1h">5.002b1h</a> 1996-Jan-05<br>
5.002b2 1996-Jan-14<br>
Larry 5.002b3 1996-Feb-02<br>
Andy 5.002gamma 1996-Feb-11<br>
Larry 5.002delta 1996-Feb-27<br>
<br>
321 Larry <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.002">5.002</a> 1996-Feb-29 Prototypes.<br>
<br>
431 <a href="http://www.perlfoundation.org/perl5/index.cgi?Tom">Charles</a> <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.002_01">5.002_01</a> 1996-Mar-25<br>
<br>
450 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.003">5.003</a> 1996-Jun-25 Security release.<br>
<br>
647 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.003_01">5.003_01</a> 1996-Jul-31<br>
649 <a href="http://www.perlfoundation.org/perl5/index.cgi?Nick">Nick</a> 5.003_02 1996-Aug-10<br>
Andy 5.003_03 1996-Aug-28<br>
5.003_04 1996-Sep-02<br>
5.003_05 1996-Sep-12<br>
5.003_06 1996-Oct-07<br>
976 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.003_07">5.003_07</a> 1996-Oct-10<br>
986 <a href="http://www.perlfoundation.org/perl5/index.cgi?Chip">Chip</a> <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=shortlog;h=perl-5.003_08">5.003_08</a> 1996-Nov-19<br>
5.003_09 1996-Nov-26<br>
5.003_10 1996-Nov-29<br>
5.003_11 1996-Dec-06<br>
5.003_12 1996-Dec-19<br>
5.003_13 1996-Dec-20<br>
1085 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_14">5.003_14</a> 1996-Dec-23<br>
5.003_15 1996-Dec-23<br>
5.003_16 1996-Dec-24<br>
5.003_17 1996-Dec-27<br>
5.003_18 1996-Dec-31<br>
5.003_19 1997-Jan-04<br>
5.003_20 1997-Jan-07<br>
5.003_21 1997-Jan-15<br>
5.003_22 1997-Jan-16<br>
5.003_23 1997-Jan-25<br>
5.003_24 1997-Jan-29<br>
5.003_25 1997-Feb-04<br>
5.003_26 1997-Feb-10<br>
5.003_27 1997-Feb-18<br>
1310 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_28">5.003_28</a> 1997-Feb-21<br>
5.003_90 1997-Feb-25 Ramping up to the 5.004 release.<br>
5.003_91 1997-Mar-01<br>
5.003_92 1997-Mar-06<br>
5.003_93 1997-Mar-10<br>
5.003_94 1997-Mar-22<br>
5.003_95 1997-Mar-25<br>
5.003_96 1997-Apr-01<br>
1461 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_97">5.003_97</a> 1997-Apr-03 Fairly widely used.<br>
5.003_97a 1997-Apr-05<br>
5.003_97b 1997-Apr-08<br>
5.003_97c 1997-Apr-10<br>
5.003_97d 1997-Apr-13<br>
5.003_97e 1997-Apr-15<br>
5.003_97f 1997-Apr-17<br>
5.003_97g 1997-Apr-18<br>
5.003_97h 1997-Apr-24<br>
5.003_97i 1997-Apr-25<br>
5.003_97j 1997-Apr-28<br>
1597 <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.003_98">5.003_98</a> 1997-Apr-30<br>
5.003_99 1997-May-01<br>
5.003_99a 1997-May-09<br>
p54rc1 1997-May-12 Release Candidates.<br>
p54rc2 1997-May-14<br>
<br>
1643 Chip <a href="http://git.catalyst.net.nz/gitweb/perl.git?p=perl.git;a=tag;h=perl-5.004">5.004</a> 1997-May-15 A major maintenance release.<br>
</code></p>mugwumpjism2007-08-08T09:46:48+00:00newsnewsScriptalicious 1.11
http://use.perl.org/~mugwumpjism/journal/34020?from=rss
Apparently people <a href="http://packages.debian.org/libscriptalicious-perl">use Scriptalicious;</a> more than I thought. I have fixed a couple of the longstanding niggles with this module and made a <a href="http://search.cpan.org/~samv/Scriptalicious-1.11/">release</a>. See <a href="http://utsl.gen.nz/gitweb/?p=Scriptalicious;a=shortlog;h=v1.11">shortlog</a> for a complete list of changes or the bundled <a href="http://search.cpan.org/~samv/Scriptalicious-1.11/Changes.pod#VERSION_1.11">Changes POD</a>.mugwumpjism2007-08-07T06:49:52+00:00modulesPumpkings, past and present
http://use.perl.org/~mugwumpjism/journal/33958?from=rss
<p>As I retrace the steps and patchwork of the early pumpkings, I have had the pleasure to meet and discuss the work with two of them recently, Tim Bunce and Gurusamy Sarathy. Tim was pumpking for the 5.004_* maintenance series and many _50+ releases; Gurusamy looked after a few after Malcolm Beattie. They gave me some hints and tips, particularly explaining what the goodies in <tt> <a href="http://git.catalyst.net.nz/gw?p=perl.git;a=tree;f=Porting;h=5090da1871;hb=f3b9ab08ea0b">Porting/</a> </tt> are about, and clarifications of the attribution styles.
</p><p>It turns out that the Perforce backing store might be easy enough to trace through, making for a smooth, clean conversion. Extra information will be accompanied by a <a href="http://git.catalyst.net.nz/gw?p=perl-history-massage.git">Catalyst Application</a>. This is in addition to a third-party extraction of the information performed by John Peacock, who deserves full credit for completing that conversion in something like record time. More data and conversions are good, especially when they can be used to cross-check each other.</p>mugwumpjism2007-08-02T07:57:55+00:00newsnewsI has a loverly bunch of Catalysts
http://use.perl.org/~mugwumpjism/journal/33943?from=rss
<p>Here they are all <a href="http://utsl.gen.nz/gitweb/">sitting on my servers</a>.
</p><p>Beware, they're not such pretty Catalysts
</p><p>rebasing soon with good committers!
</p><p>Once I see if it's in the repo anywhere :). And other good things like all branches visible. See also a BAST import. Couldn't see DBIx::Class there - what happened? Well, if anyone wants to help muck in with conversion, grab the source data, and tell me where it needs to go. You can reply on this comment if you like. It's a bit like a bugtracker, except that you can more easily ignore foolish requests. I can give out logins to that machine as required.
</p><p>Anyone brave enough to add support to <tt>git-svn</tt> to mirror the root path of projects and mesh together a superproject using <a href="http://repo.or.cz/w/git.git?a=blob;f=Documentation/git-submodule.txt;h=2c48936fcd72c5276ae2fed217bd9b9564342f03;hb=HEAD"> <tt>git-submodule</tt> </a> ? This would be a useful final audit. Also looking for people who might be keen to set up <a href="http://repo.or.cz/w/repo.git">repo</a> on the UTSL network.
</p><p>It's unfinished - some repos are still yet to copy. No accuracy or completion claims yet. But coming soon.</p>mugwumpjism2007-08-01T08:26:19+00:00journalcpan6 - moving forward
http://use.perl.org/~mugwumpjism/journal/30907?from=rss
<p>Mark Overmeer gave a <a href="http://birmingham2006.com/cgi-bin/yapc.pl?act=talk-item&talkid=51">talk</a> with support from me at YAPC::Europe 2006 about <a href="http://cpan6.org/">cpan6</a>; the design so far is the result of a collaboration between Mark and me. The talk was generally well received, and during the conference we heard many more people's concerns. The good news is that there were no new requirements that didn't fit cleanly into the design; in fact, it gave some people lots of ideas. I think I can say that we have support for the general direction of things, and now we can open up the debate widely, and start implementing pieces.
</p><p>I invite people to join either the <a href="http://lists.cpan6.org/mailman/listinfo.cgi/pause6">pause6 mailing list</a> (for infrastructure discussions) or the <a href="http://lists.cpan6.org/mailman/listinfo.cgi/tools">cpan6 tools list</a> (for client-side installers and upload tools).
</p><p>The earliest task will be looking at the big picture, and seeing which pieces are the low-hanging fruit that we can write tests for straight away. I'll start the ball rolling after we have a few subscriptions.</p>mugwumpjism2006-09-06T15:12:49+00:00cpanDatabase - Slave or Master? 3 of 3 - Integration
http://use.perl.org/~mugwumpjism/journal/30728?from=rss
<p>This story begins with an effort to store <a href="http://search.cpan.org/dist/Moose">Moose</a> classes in a
Tangram store. Specifically, converting from
<tt>Moose::Meta::Class</tt> objects to a <tt>Tangram::Schema</tt>
structure.
</p><p>The structures are already quite similar. In the Tangram schema,
you have a per-class map of <code>(type, name, (details...))</code>.
In Moose::Meta::Class, you have a map of <code>(attribute,
(details...))</code>, where the <code>details</code> includes a type
constraint. Based on the type constraint, you can guess a reasonable
<tt>type</tt>. Well, not quite. The next thing you really need is
<em>Higher Order Types</em> on your type constraints (called
<em>parametric roles</em> in the Perl 6 canon). In a nutshell,
that's not just saying there's an <tt>Array</tt> somewhere, but saying
there's an <tt>Array of</tt> <em>something</em>. Then you can make
sure that you put an actual <em>foreign key</em> or <em>link
table</em> in that point in the schema, rather than the
<tt>oid+type</tt> pair that you get with <tt>Tangram</tt> when you use
a <tt>ref</tt> column (and, in recent versions, without specifying a
<tt>class</tt>). Getting parametric roles working in Moose is still
an open question, but certainly one I hope to find time for.
</p><p>So, during this deep contemplation, I thought, well, what would
Tangram be adding? I mean, other than the obvious elitism and other
associated baggage? Why not just tie the schema to the Moose
meta-model, and start a new persistence system from scratch? Or use
<tt>DBIx::Class</tt> for all the bits I couldn't be bothered
re-writing?
</p><p>In principle, there are reasons why you might want the storage
schema and the object metamodel to differ. You might not want to map
all object properties to database columns, for instance. Or you might
want to use your own special mapping for them - not just the default.
</p><p>Then I thought, how often did I do that? I added a
<tt>transient</tt> type in <tt>Class::Tangram</tt> for columns that
were not mapped, but only rarely used it, and never for data that I
couldn't derive from the formal columns or some other truly transient
source. I only used the <tt>idbif</tt> mapping type for classes when
I didn't have the time to describe their entire model. So, perhaps a
storage system that just ties these two things together would be
enough of a good start that the rest wouldn't matter.
</p><p> <strong>The Evil Plan to NOT refactor Tangram using
<tt>DBIx::Class</tt> </strong>
</p><p>Ok, so the plan is basically this. Take the Tangram API, and make
the core bits that I remember using into thin wrappers around
<tt>DBIx::Class</tt> and friends. Then, all of the stuff under the
hood that was a headache working with, I'll conveniently forget to
port. That way, it won't be a source compatible refactoring, just
enough to let people who liked the Tangram API do similar sorts of
things with <tt>DBIx::Class</tt>.
</p><p>The first thing I remember using is a schema object for the
connection, if only because of acme's reaction when I say "schema".
In a talk I'd use a UML diagram at this point, but given
<tt><img></tt> tags are banned, instead let's use Moose code.</p><blockquote><div><p> <tt> package DBIx::Moose::Schema;<br> use Moose;<br> has '$.classes' => (is => 'ro',<br> isa => 'Set of Moose::Meta::Class',<br> );</tt></p></div> </blockquote><p>Alright. So, we have a schema which is composed of Moose Classes.
The next thing we need is a Storage object that has the bits we want;</p><blockquote><div><p> <tt> package DBIx::Moose::Storage;<br> use Moose;<br> use Set::Object qw(weak_set);<br> has '$:db' => (is => 'ro', isa => "DBIx::Class::Schema");<br> has '$:objects' => (is => 'rw', isa => "Set::Object",<br> default => sub { weak_set() } );<br> has '$.schema' => (is => 'ro',<br> isa => "DBIx::Moose::Schema");</tt></p></div> </blockquote><p>That <tt>weak_set</tt> is a little bit of magic I cooked up for
nothingmuch recently. All we're doing is keeping references to the
objects we've already loaded from the database, primarily for
transactional consistency. Tangram actually uses a hash there, mapping
an <tt>oid</tt> to a weak reference to the member with that
<tt>oid</tt>, but I think that <tt>oids</tt> suck. In Perl memory,
the <tt>refaddr</tt> can be the <tt>oid</tt>.
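</p><p>As a rough illustration of the "<tt>refaddr</tt> as <tt>oid</tt>" idea, here is a minimal sketch that uses a plain hash of weakened references rather than <tt>Set::Object</tt>'s <tt>weak_set</tt>. The names <tt>%loaded</tt>, <tt>remember</tt> and <tt>lookup</tt> are made up for this example and are not part of Tangram or any planned API.</p>

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr weaken);

# Hypothetical identity map: refaddr => weak reference to a loaded object.
my %loaded;

# Record an object without keeping it alive ourselves.
sub remember {
    my ($obj) = @_;
    my $id = refaddr($obj);
    $loaded{$id} = $obj;
    weaken( $loaded{$id} );
    return $id;
}

# Return the object for an id, or undef if it has been freed.
sub lookup {
    my ($id) = @_;
    # A weakened slot turns into undef once the object is freed; tidy it up.
    delete $loaded{$id} unless defined $loaded{$id};
    return $loaded{$id};
}

my $book = bless { isbn => 'example-isbn' }, 'Book';
my $id   = remember($book);
print "still loaded\n" if lookup($id) == $book;

undef $book;    # the last strong reference goes away
print "gone\n" unless defined lookup($id);
```

<p>One caveat with this approach: Perl may reuse an address once an object is freed, so a <tt>refaddr</tt> only identifies an object for as long as that object is alive, which is why the stale slot is deleted on lookup.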
</p><p>And we'd need an overloaded query interface;</p><blockquote><div><p> <tt> package DBIx::Moose::Remote;<br> use Moose;<br> has '$._storage' => (is => 'ro', weak_ref => 1,<br> isa => "DBIx::Moose::Storage");<br> has '$._class' => (is => 'ro',<br> isa => "Moose::Meta::Class");<br> has '$._resultset' => (is => 'ro',<br> isa => "DBIx::Class::ResultSet",<br> default => \&_rs_default,<br> );<br> sub _rs_default {<br> my $self = shift;<br> $self->_storage->resultset($self->_class);<br> }</tt></p></div> </blockquote><p>So, hopefully, the <tt>DBIx::Class::ResultSet</tt> API will be rich
enough to be able to deal with all the things I did with
<tt>Tangram</tt>, or at least it will given enough TH^HLC.
</p><p>There will be a bit of double-handling of objects involved.
Basically, the objects that we get back from <tt>DBIx::Class</tt> will
be freed very soon after loading, their values passed to a
schema-specified constructor (probably just <tt>Class->new</tt>),
and then their slots that contain collections that are not already
loaded set up to lazy load the referent collections on access. This
happens already in Tangram; the intermediate rows are the arrayrefs
returned by <code>DBI::fetchrow_arrayref()</code>. So there will be
lots of classes, perhaps under <tt>DBIx::Moose::DB::</tt>, that mirror
the objects in the schema. Perhaps we don't need that, but it should
be a good enough starting point, and if it can be eliminated entirely
later on, then all the better. (<b>Update:</b> Matt has kindly pointed me to the part of the API that deals with this; this shouldn't be a problem at all)
</p><p> <strong>Mapping the Index from the Class</strong>
</p><p>One of the nice things about a database index is that it's
basically a performance 'hack' only (because databases are too dumb to
know what to index themselves), and does not actually affect the
operation of the database. So, for the most part, we can ignore
mapping indices and claim we are doing the 'correct' thing <tt>;)</tt>.
</p><p>That is, unless the index happens to be a <em>unique index</em> or
a <em>primary key</em>. What those add is a <em>uniqueness
constraint</em>, which <em>does</em> affect the way that the object
behaves. So, what of that?
</p><p>Interestingly, Perl 6 has the concept of a special <tt>.id</tt>
property. If two object references have the same <tt>.id</tt>
property, then they are considered to be <em>the same object</em>.
This has some interesting implications.
</p><p>After all, isn't this;</p><blockquote><div><p> <tt> class Book;<br> has Str $.isbn where { .chars < 255 };<br> method id {<br> $.isbn;<br> }</tt></p></div> </blockquote><p>The same thing as this?</p><blockquote><div><p> <tt>CREATE TABLE Book (<br> isbn VARCHAR(255),<br> PRIMARY KEY (isbn)<br>);</tt></p></div> </blockquote><p>So, we can perhaps map this in Perl 6 code, at least map one
uniqueness constraint per type. Generalising this to multiple
uniqueness constraints is probably something left best to our Great
Benevolent Navel-Gazers. In the short term, we'll need to come up
with some other kind of way of specifying this per-class; probably a
<tt>Moose::Util::UniquenessConstraint</tt> or somesuch.
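</p><p>To make the "one uniqueness constraint per type" mapping concrete, here is a small sketch in plain Perl that turns a class description into a table definition. The <tt>%class</tt> structure, its <tt>id</tt> and <tt>sql_type</tt> keys and the <tt>ddl_for</tt> function are all assumptions invented for this example; a real version would read the same information from the Moose metaclass.</p>

```perl
use strict;
use warnings;

# Hypothetical class description; 'id' names the attribute giving identity.
my %class = (
    name       => 'Book',
    id         => 'isbn',
    attributes => [
        { name => 'isbn',  sql_type => 'VARCHAR(255)' },
        { name => 'title', sql_type => 'VARCHAR(255)' },
    ],
);

# Build a CREATE TABLE statement, mapping the id attribute to a primary key.
sub ddl_for {
    my ($c) = @_;
    my @cols;
    push @cols, "    $_->{name} $_->{sql_type}" for @{ $c->{attributes} };
    push @cols, "    PRIMARY KEY ($c->{id})" if defined $c->{id};
    return "CREATE TABLE $c->{name} (\n" . join( ",\n", @cols ) . "\n);\n";
}

print ddl_for( \%class );
```

<p>Only a single identity attribute is handled here, which matches the one-constraint-per-type limit described above; multiple uniqueness constraints would need a richer description.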
</p><p> <strong>Mapping Inheritance</strong>
</p><p>Alright, so we still have inheritance to deal with. But wait!
We've got a bigger, brighter picture with Moose. We've now got roles.
</p><p>Fortunately, this is OK. The Tangram <tt>type</tt> column was only
ever used (conceptually, anyway) to derive a <em>bitmap</em> of
associated (ie, sharing a primary key) tables that we expect to find
rows in for a particular tuple. So, if we map the role's properties
to columns, then we only have to "duplicate" columns for particular
roles, if those roles are composed into classes that don't share a
common primary key.
</p><p> <strong>The other features</strong>
</p><p>Well, there may be other important features that I'll remember when
the time comes, but for now I think there's enough ideas here to form
a core roadmap, or at least provide a starting point for discussion.</p>mugwumpjism2006-08-23T12:27:02+00:00yapceDatabase - Slave or Master? 2 of 3 - Object Persistence
http://use.perl.org/~mugwumpjism/journal/30712?from=rss
<p>One of the coolest things about "Object Persistence", is that it
has the word "Object" in it, which of course means it is better than
anything that was around before "Object Oriented Programming" was
decided to be flavour of the decade. Even better than that, it even
has the word "Persistence" in it, which sounds much more sophisticated
and modern than "Database" or "Relational".
</p><p>Then there are shiny tools in this space, like Hibernate for Java. Using
Hibernate, you can make an Enterprisey Macchiato Java Servlet that
blends your database to a set of Java objects, and then provides
Soaped-up Beany Hyper-galactic XML Web services for other parts of
your Enterprisey code base to access. It's fantastic - you end up
with a set of tables (all with surrogate IDs, of course) that you are
guaranteed to not be able to write to safely from anything except the
Java Server. This puts the Java developer in control. Which is the
way (s)he likes it. Maybe Hibernate doesn't have to work like this,
but (s)he prefers it because it means that all the changes to the
objects have to go through the same central set of functions.
Otherwise, the development investment is wasted. And we can't have
that, not at the price it cost.
</p><p>Anyway, Tangram is not quite so antisocial as that. It at least
behaves transactionally, given appropriate use of <a href="http://search.cpan.org/~samv/Tangram-2.10/lib/Tangram/Storage.pod#unload_all(_%5B_%24notify_method_%5D)"> <code>$storage->unload_all</code> </a>
(also <tt>->recycle</tt>) and distribution of clue. But it is
currently anti-social in other ways, such as the surrogate ID
requirement.
</p><p> <strong>Wait a minute - the database has .oids, too</strong>
</p><p>Postgres and Oracle both have a concept of <tt>.rowid</tt>; all
tables except for 'index organised tables' have them by nature. I
have observed that in the vast majority of code that uses the Tangram
API, I never need to use or refer to this <tt>.id</tt>; in fact, when
storing an object in multiple stores, its <tt>.id</tt> will vary
across those stores. In light of this, while I consider surrogate IDs
a design flaw, it's not a tragic one: it's consistent with what the
database does anyway, and it has allowed for interesting patterns to
be built in the meantime while better ideas come forth. For a more
detailed analysis of what I think is wrong with Tangram, see the <a href="http://search.cpan.org/dist/Tangram/lib/Tangram/Sucks.pm">bugs
POD file</a> in the Tangram distribution, especially the <a href="http://search.cpan.org/dist/Tangram/lib/Tangram/Sucks.pm#tables_without_a_type_column">section
on surrogate type columns</a> (actually I've just tidied those up, if
you're reading this before I make a release then read the fresh one <a href="http://utsl.gen.nz/gitweb/?p=Tangram;a=blob;h=cc70b396;hb=doc-updates;f=lib/Tangram/Sucks.pm">here</a>).
</p><p> <strong>What defines "Object Persistence"?</strong>
</p><p>Again, hazarding a set of common features of object persistence
tools that could plausibly form part of a definition;
</p><ol>
<li>They normally <em>do</em> have requirements of the database;
usually not all valid DDL models can be mapped to a set of objects.</li>
<li>They will map features of objects not usually considered
relational concepts such as inheritance and Perl structures like
Arrays and Hashes.</li>
</ol><p> <strong>What's so cool about Tangram</strong>
</p><p>The key design feature of Tangram is what is frequently referred to
as being <em>orthogonal</em> - it is normally non-intrusive on the
objects being stored. A given object may even exist in multiple
stores simultaneously (but be represented by the same Perl object).
The result? Classes do not need to be aware of their storage, any
more than a tuple needs to know it's being stored in a table space.
</p><p>This is implemented with <em>Lazy Loading</em>. The in-memory data
structure is considered equivalent to the database form; via types
such as <a href="http://search.cpan.org/~samv/Tangram/lib/Tangram/Type/Set/FromOne.pod">Tangram::Type::Set::FromOne</a>,
it is possible to follow joins between tables by just walking Perl
objects with visitor iterators like <tt>Data::Dumper</tt>.
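</p><p>For readers who have not met lazy loading before, here is a minimal sketch of the general technique using a tied scalar. It is not Tangram's implementation: <tt>Lazy::Slot</tt> is a hypothetical helper, and the anonymous sub stands in for whatever database query would fetch the related rows.</p>

```perl
package Lazy::Slot;    # hypothetical helper, not part of Tangram
use strict;
use warnings;

# Remember how to load the value; nothing is loaded yet.
sub TIESCALAR {
    my ( $class, $loader ) = @_;
    return bless { loader => $loader }, $class;
}

# First read runs the loader once; later reads reuse the stored value.
sub FETCH {
    my ($self) = @_;
    if ( my $loader = delete $self->{loader} ) {
        $self->{value} = $loader->();
    }
    return $self->{value};
}

# Assigning a value replaces whatever would have been loaded.
sub STORE {
    my ( $self, $value ) = @_;
    delete $self->{loader};
    $self->{value} = $value;
}

package main;

my $queries = 0;
my %cd = ( title => 'Example Album' );
tie $cd{tracks}, 'Lazy::Slot', sub {
    $queries++;    # stands in for a database round trip
    return [ 'Track One', 'Track Two' ];
};

print "queries before access: $queries\n";
my $count = @{ $cd{tracks} };
my $first = $cd{tracks}[0];
print "$count tracks, first is '$first', queries: $queries\n";
```

<p>Code that walks the structure just sees an ordinary array reference in the <tt>tracks</tt> slot, and the loader runs only once however many times the slot is read.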
</p><p> <strong>Tangram Querying</strong>
</p><p>For the cases where you have specific questions for your data
model, and you are not just following adjacent relations between
objects, lazy loading is not enough. We still need some form of query
syntax.
</p><p>For this, Tangram uses <a href="http://search.cpan.org/~samv/Tangram/lib/Tangram/Expr.pod"> <tt>Tangram::Expr</tt> </a>
objects that represent <em>database objects</em> - and they use
<tt>overload</tt> so that you can write your query expressions using
standard perl operators (as far as <tt>overload</tt> allows).
Depending on your inclination, you either "run screaming" from this
syntax or love it.
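</p><p>If the <tt>overload</tt> trick is unfamiliar, the following sketch shows the general idea of building SQL text from ordinary Perl operators. <tt>My::Expr</tt> and its <tt>binop</tt> method are invented for this example and are far simpler than <tt>Tangram::Expr</tt>; in particular the literal quoting is naive and there are no table aliases or joins.</p>

```perl
package My::Expr;    # hypothetical stand-in, not Tangram::Expr
use strict;
use warnings;
use Scalar::Util qw(blessed);
use overload
    'eq' => sub { My::Expr->binop( '=',   @_[ 0 .. 2 ] ) },
    '==' => sub { My::Expr->binop( '=',   @_[ 0 .. 2 ] ) },
    '&'  => sub { My::Expr->binop( 'AND', @_[ 0 .. 2 ] ) },
    '""' => sub { $_[0]{sql} };

sub new {
    my ( $class, $sql ) = @_;
    return bless { sql => $sql }, $class;
}

# Combine two operands into a new expression object.
sub binop {
    my ( $class, $op, $left, $right, $swapped ) = @_;
    ( $left, $right ) = ( $right, $left ) if $swapped;
    for ( $left, $right ) {
        next if blessed($_) && $_->isa(__PACKAGE__);
        # Plain Perl values become quoted literals (naive quoting, sketch only).
        ( my $literal = $_ ) =~ s/'/''/g;
        $_ = $class->new("'$literal'");
    }
    return $class->new("($left->{sql} $op $right->{sql})");
}

package main;

my $artist_name = My::Expr->new('t2.name');
my $cd_artist   = My::Expr->new('t1.artist_id');
my $artist_id   = My::Expr->new('t2.id');

my $filter = ( $artist_name eq 'The Black Seeds' ) & ( $cd_artist == $artist_id );
print "WHERE $filter\n";
```

<p>Run as is, this prints <tt>WHERE ((t2.name = 'The Black Seeds') AND (t1.artist_id = t2.id))</tt>. Note that the comparison operators return expression objects rather than true or false, which is the part that makes some people run screaming.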
</p><p>In my experience, Tangram's query syntax makes some previously hard
queries easy, and some "impossibly difficult" queries easy. You can
build intricate joins with a consistent notation. For example,
process a form, make a list of <tt>Tangram::Expr</tt> fragments, and
then combine them into a filter that can be used for multiple queries.</p><blockquote><div><p> <tt> use List::Util qw(reduce);<br> <br> # get the table aliases<br> my ($r_artist, $r_cd, $r_track)<br> = $storage->remote(qw(Artist CD Track));<br> <br> # build a set of filter expressions - some of these<br> # represent joins.<br> my @filters =<br> ( ( $r_artist->{name} eq "The Black Seeds" ),<br> ( $r_cd->{artist} == $r_artist ),<br> ( $r_cd->{tracks}->includes($r_track) ),<br> ( $r_track->{name} eq "Heavy Mono E" ) );<br> <br> # AND them all together<br> my $filter = reduce { $a & $b } @filters;<br> <br> # then use them for queries<br> my (@cds) = $storage->select( $r_cd, $filter );<br> my (@tracks) = $storage->select( $r_track, $filter );</tt></p></div> </blockquote><p>The query there is already getting reasonably impressive; the first
<code>->select()</code> maps to:</p><blockquote><div><p> <tt> SELECT<br> t1.id,<br> t1.type,<br> t1.artist_id,<br> t1.name<br> FROM<br> CD t1,<br> Artist t2,<br> Track t3<br> WHERE<br> t1.artist_id = t2.id AND<br> t2.name = "The Black Seeds" AND<br> t3.cd = t1.id AND<br> t3.name = "Heavy Mono E"</tt></p></div> </blockquote><p>This is a simple example, and I have found that there are very few
real queries on well-designed schemas that do not map to this syntax
well. That being said, sub-selects require an undocumented syntax,
and while I have some sympathy for the notion that you should be able
to write sub-selects as joins most of the time, it's certainly an
example that the API hasn't been extended in all directions yet.
</p><p> <strong>Tangram Maps Inheritance</strong>
</p><p>There are those who would say inheritance is about as relational a
concept as an <tt>.mdb</tt> file, but I think that there is adequate
justification for its use in data modelling.
</p><p>A good question to ask when validating that a relational schema is in
normal form is "what does this relation mean?" or "what fact
is being represented by this tuple?". We can ask this question for
all tables - and the basic answer is "there exists an object with
these values"¹. The fact is that the object exists. Better answers
can be made for individual tables; consider that answer a template -
ask a meta-question, get a meta-answer.
</p><p>This is where the argument for inheritance stems from. The relations
still describe existence of an object, but certain types of objects
will have extra items in their tuple - relations to the extra
properties bestowed upon them by their sub-classes.
</p><p>In the <a href="http://search.cpan.org/src/SAMV/Tangram-2.10/t/musicstore/MusicStore.pm">CD
store schema</a>, for instance, 'Artist', 'Person' (perhaps better called
'Musician') and 'Band' are related like this. The justification is,
that an artist can be either a musician or a band, but if we are
relating to something in its capacity as an artist (ie, from the CD
table, to say who released it), there also exists by association a
relationship between the CD and all of the properties of the artist in
its capacity of a musician or a band.
</p><p>Tangram short-cuts the query overhead of this situation using a
'<tt>type</tt>' column. The type column is an index into the schema,
and is used to derive a bitmap of which extra tables associated with a
base class are expected to have rows for this primary key. This is a
de-normalization of data, so technically a hack - as noted on
<tt>Tangram::Sucks</tt>, it should be possible to detect the type
using the presence or absence of tuples in the corresponding tables.
Or, somewhat equivalently, <tt>NULL</tt>s when using "Horizontal"
mapping (see <a href="http://search.cpan.org/~samv/Tangram-2.10/lib/Tangram/Relational/Mappings.pod"> <tt>Tangram::Relational::Mappings</tt> </a>
for a description of these terms). I'm told that David Wheeler's <a href="http://search.cpan.org/~dwheeler/Object-Relation/"> <tt>Object::Relation</tt> </a>
can work like this.
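</p><p>As a rough sketch of the <tt>type</tt> column idea, in Python and with an invented three-class schema, the lookup from a type ID to the set of tables that should hold a row might look like this. The type IDs, the one-table-per-class layout and the bit assignments are assumptions made for the example, not Tangram's actual internals.</p>

```python
# Hypothetical schema: class name -> (type id, parent class or None).
SCHEMA = {
    "Artist": (1, None),
    "Person": (2, "Artist"),
    "Band": (3, "Artist"),
}
# One bit per table; each class is assumed to map to a table of the same name.
TABLE_BIT = {name: 1 << i for i, name in enumerate(SCHEMA)}
CLASS_BY_TYPE = {type_id: name for name, (type_id, _) in SCHEMA.items()}


def table_bitmap(type_id):
    """Return a bitmap of the tables expected to hold a row for this type."""
    bitmap = 0
    name = CLASS_BY_TYPE[type_id]
    while name is not None:            # walk up the inheritance chain
        bitmap |= TABLE_BIT[name]
        name = SCHEMA[name][1]
    return bitmap


def tables_for(type_id):
    """Expand the bitmap back into table names, in schema order."""
    bitmap = table_bitmap(type_id)
    return [name for name, bit in TABLE_BIT.items() if bitmap & bit]


band_tables = tables_for(3)    # a Band row implies an Artist row as well
artist_tables = tables_for(1)  # a plain Artist has no sub-class tables
```

<p>With the type known up front, a loader only has to join the tables named in the bitmap, instead of probing every sub-class table for a matching primary key.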
</p><p> <strong>But what about the Schema?</strong>
</p><p>Having a schema structure that is <em>free from side-effects</em>
can be quite useful. Tangram has this down well; its input is a plain
set of perl hashes and arrays, no side effects. If you want to use
the pure objects to create code, you can still pass them to
<tt>Class::Tangram::Generator</tt>. If you want to connect to
storage, pass to <tt>Tangram::Storage</tt>. <a href="http://search.cpan.org/dist/T2">T2</a> was my attempt at making
a set of objects that can both describe this Tangram model
relationally, and itself be stored in a Tangram database. This is
useful for building <em>rapid application development</em> /
<em>Computer-Aided Software Engineering</em> (RAD / CASE) tools.
Consider <a href="http://uml.sourceforge.net/">Umbrello</a>; it could
not compile classes as the objects were manipulated, otherwise you
might override internal behaviour and break the running program!
</p><p> <strong>You don't have to write comprehensive schemata any
more</strong>
</p><p>Consider the package manager, <a href="http://fink.sourceforge.net/">Fink</a>. Whilst using
<tt>Storable</tt> for persistence can make applications like Fink
faster by reducing parse time to load their entire state at start-up,
it is still not as fast as a Berkeley, ISAM or SQLite-style database
which is loaded on demand for small accesses.</p><p>The general approach is not making the whole schema relational in
one go, but instead cherry-picking out the columns that you think are
useful enough to be indexed, and throwing the rest into a single
column that contains a <tt>Storable</tt>, <tt>Data::Dumper</tt> or
even <tt>YAML</tt> data field which is used to construct the rest of
the object. <a href="http://search.cpan.org/~samv/Tangram/lib/Tangram/Type/Dump/Any.pm"> <tt>Tangram::Type::Dump::Any</tt> </a>
is built for this. I wrote a Tangram schema for Fink that does this,
which is <a href="http://utsl.gen.nz/fink/">lurking here</a>.</p><p>You end up with a data source which can be queried on all mapped
columns, and almost all code that was written for the old,
non-Tangram system works too - because previously, the only option was
to follow Perl references, but we've made sure they all get lazy
loaded.
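</p><p>The "index a few columns, dump the rest" approach can be sketched in a few lines of Python with SQLite. This is only an illustration of the idea: JSON stands in for <tt>Storable</tt>, <tt>Data::Dumper</tt> or <tt>YAML</tt>, and the <tt>package</tt> table, its columns and the sample record are invented; it is not the Fink schema or the <tt>Tangram::Type::Dump::Any</tt> API.</p>

```python
import json
import sqlite3

INDEXED = ("name", "version")  # the cherry-picked, queryable columns


def save(db, obj):
    """Store indexed fields as real columns and everything else as one blob."""
    rest = {k: v for k, v in obj.items() if k not in INDEXED}
    db.execute(
        "INSERT INTO package (name, version, rest) VALUES (?, ?, ?)",
        (obj["name"], obj["version"], json.dumps(rest)),
    )


def load(db, name):
    """Rebuild the full object from the indexed columns plus the blob."""
    row = db.execute(
        "SELECT name, version, rest FROM package WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    obj = json.loads(row[2])
    obj.update(name=row[0], version=row[1])
    return obj


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE package (name TEXT PRIMARY KEY, version TEXT, rest TEXT)")

# A made-up record; only name and version become real columns.
original = {"name": "perl", "version": "5.8.8",
            "depends": ["libc"], "maintainer": "nobody"}
save(db, original)
restored = load(db, "perl")
```

<p>Queries can filter on <tt>name</tt> and <tt>version</tt> directly, while the unmapped fields simply ride along in the <tt>rest</tt> column until somebody decides they deserve a column of their own.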
</p><p> <strong>Where Object Persistence Wins</strong>
</p><dl>
<dt> <em>RAD-developed, and imported models</em></dt>
<dd>In the RAD case, the model for your program is developed with a
tool; the relational mapping is then derived by mutating the generated
model.<br>
In the imported case, it comes from the metamodel of another
module, such as <tt>Class::Tangram</tt> or <tt>Class::Meta</tt>.<br>
In both of these cases, a general form of translation is "all that is
required" - write a few rules about how to convert from one metamodel
to another, and you have automatic Object Persistence. Sadly this
"all that is required" part can get quite difficult to understand and
debug.</dd>
<dt> <em>retro-fitting storage around existing objects</em></dt>
<dd>This works out best when you have code that already stores via
something like <tt>Storable</tt>, and hasn't been written relationally
in the first place, just like Fink.</dd>
</dl><p>Yes, I know this is another absurdly long post in a multipart
series. That's actually mostly in this case because I have more to
say about it, rather than being a particular endorsement of the
approach. But more on what I <em>will</em> endorse in the next part.
</p><p>Footnotes:
</p><ol>
<li>Yes, I know there is a widely circulating school of thought saying
"that's not The Right Way™ to do object-relational mapping, you should
be using object values as <em>columns</em> and tuples as <em>object
relations</em>". The former isn't available in current databases, and
the latter is done using classes that consist only of foreign keys
(<tt>Tangram::Type::*::FromOne</tt> relations).</li>
</ol>mugwumpjism2006-08-21T22:13:06+00:00yapceDatabase - Slave or Master? 1 of 3 - Database Abstraction
http://use.perl.org/~mugwumpjism/journal/30698?from=rss
<p>After the <a href="http://en.wikipedia.org/wiki/ACID">ACID</a>
revolution of the 1960s, Relational Database Design was the next big
thing during the late 60's and 70's. It marked an evolutionary step
forward from the Hierarchical models of early ACID-conformant systems;
for after all, it <em>included</em> the hierarchical model, as any
hierarchy can be expressed as relations¹, yet <em>transcended</em> it
by expressing structures that didn't fit hierarchies.
</p><p>And it has some solid theory behind it as well - the relational
model has strong roots in mathematics and logic, and so you can expect
that University-goers will be poring over it with a bit more scrutiny
and peer review than your average <tt>use.perl.org</tt> columnist.
</p><p>Through all this, we have a decent body of experience in looking at
data management problems through the goggles of the Relational Model,
of which modern Relational Database Management Systems (RDBMS's)
provide a reasonable approximation². We have built it up logically
with key concepts such as <em>constraints</em>, <em>sequences</em>,
<em>triggers</em>, <em>joins</em>, <em>views</em>, <em>cursors</em>,
etc, and well-known performance hacks such as <em>indices</em>,
<em>partitioning</em> or <em>materialized views</em>. And this
logical layering is what allows us to build complex RDBMS's and
database applications that do not violate the strict requirements of
ACID. Well, some of us, some of the time. I won't say it's easy to
do it without making mistakes.
</p><p>We have a set of rules that let you decide whether data in the
model is <em>normalized</em> - that is, it does not duplicate or
aggregate any other information in the database - or
<em>de-normalized</em>. We should be able to look at a table, and
decide whether that <tt>auto_increment</tt> primary ID key is actually
a normalized and valid member of the model (such as a customer or
invoice number), or whether it is just a surrogate ID thrown on the
row so that the programmer doesn't have to guess whether
<tt>table.id</tt> exists or not, that does not actually <em>mean</em>
anything in terms of the data model.
</p><p>We have a graphical language of notation, called <em>crowfoot
diagrams</em> (<a href="http://utsl.gen.nz/img/crowfoot.png">example</a>). And this is
a very good common language.
</p><p>We even have Relational abuses such as <em>stored procedures</em>
and <tt>NULL</tt> values².
</p><p>But we want a common language for writing Perl components, not just
for talking to DBAs or writing database schema. We cannot write
entire applications in SQL. And nor do we want to.
</p><p> <strong>What defines "Database Abstraction"?</strong>
</p><p>For the heritage of this term, we can look to Dave Rolsky's <a href="http://poop.sourceforge.net/">POOP Comparison</a> document.
POOP stands for <b>P</b>erl <b>O</b>bject-<b>o</b>riented
<b>P</b>ersistence, and stands out as one of the worst acronyms for a
user group ever.
</p><p>So, "Database Abstraction" is my own refactoring of the term
"RDBMS/OO Mapper" from the above document. Modules such as <a href="http://search.cpan.org/dist/DBIx-Class"> <tt>DBIx::Class</tt> </a>
and Dave's <a href="http://search.cpan.org/dist/Alzabo">Alzabo</a>
clearly fit into this category.
</p><p>Allow me to hazard some key characteristics of modules strictly in
this category;
</p><ol>
<li>they (in principle) do not have particular requirements on table
layout, such as surrogate IDs or type indicators</li>
<li>they do not try to represent or provide concepts not described by
orthodox relational model literature, such as inheritance</li>
</ol><p>Perhaps I'll think of some others as time progresses; I'll try to
add them here if I do.
</p><p> <strong>What's so cool about DBIx::Class</strong>
</p><p>In a nutshell, it does the Database Abstraction part very well,
with a clean modular implementation via <a href="http://search.cpan.org/dist/Class-C3/"> <tt>Class::C3</tt> </a>.
Which isn't quite as chic as <a href="http://search.cpan.org/dist/Moose"> <tt>Moose</tt> </a>, but close
enough that it's probably not worth re-writing <tt>DBIx::Class</tt> in
the near future. It has active maintainers, it has releases, it has
users, it has mailing lists and IRC and all those other indicators of
projects which are "succeeding".
</p><p>One thing I particularly like about its API is <a href="http://search.cpan.org/dist/DBIx-Class/lib/DBIx/Class/ResultSet.pm"> <tt>DBIx::Class::ResultSet</tt> </a>.
In particular, the way that you don't get <em>tables</em> from your
schema, you get <em>result sets</em> that happen to be for
<em>all</em> objects. What's more, they don't actually run the query
until you use them, which makes for easy piecemeal building of simple-ish
queries.
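</p><p>The deferred-execution behaviour is easy to picture with a toy version. The following Python sketch is not the <tt>DBIx::Class::ResultSet</tt> API; the <tt>ResultSet</tt> class, its <tt>search</tt> method, the <tt>executed</tt> counter and the album titles are invented for the example, and a list of dictionaries stands in for a table.</p>

```python
class ResultSet:
    """A chainable query description that only runs when iterated."""

    def __init__(self, rows, conditions=()):
        self._rows = rows                    # stand-in for a database table
        self._conditions = tuple(conditions)
        self.executed = 0                    # counts how often the "query" ran

    def search(self, **criteria):
        # Refining returns a new result set; nothing is fetched here.
        return ResultSet(self._rows, self._conditions + tuple(criteria.items()))

    def __iter__(self):
        self.executed += 1                   # the query runs only at this point
        for row in self._rows:
            if all(row.get(key) == value for key, value in self._conditions):
                yield row


# Made-up rows for illustration.
cds = ResultSet([
    {"title": "Album One", "artist": "The Black Seeds", "year": 2004},
    {"title": "Album Two", "artist": "The Black Seeds", "year": 2006},
    {"title": "Album Three", "artist": "Somebody Else", "year": 2006},
])

recent = cds.search(artist="The Black Seeds").search(year=2006)
ran_before = recent.executed                 # still 0: only a description so far
titles = [cd["title"] for cd in recent]      # iterating triggers the query
```

<p>Because each <tt>search</tt> call just returns a narrower description, conditions can be added in whatever order the calling code discovers them, and the cost of the query is paid once, at the point of use.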
</p><p> <strong>Driving the Perl Objects from the Database Schema</strong>
</p><p>One of the most popular <tt>DBIx::Class</tt> extensions, which I
also think is pretty nifty, is <a href="http://search.cpan.org/dist/DBIx-Class-Schema-Loader/"> <tt>DBIx::Class::Schema::Loader</tt> </a>.
This actually connects to a database, uses DBI's various interfaces
for querying the table structure in about as DB agnostic a way as you
could imagine a tool of its class doing, and then calls the relevant
<tt>DBIx::Class</tt> hooks to create classes which are a reasonable
representation of what it found in the database.
</p><p>For those people who are adamant that best practices be strictly
followed, and normalization guidelines honoured, this works very well
- and it sure is a delight when you have an application with a
database clean enough for this to work without tweaking the schema.
Then again, those developing applications from scratch might prefer
writing in <tt>DBIx::Class</tt> directly.
</p><p> <strong>What's the model of your model?</strong>
</p><p>In all of the above scenarios, but particularly with the
<tt>Loader</tt>, the <em>model</em> (ie, schema) of your database has
a <em>meta-model</em> (ie, governing schema form). It is a very close
relative of the Data Definition Language, DDL - <tt>CREATE TABLE</tt>
statements and so-on that tell the database what to do. And that is
perhaps key to the success of <tt>DBIx::Class</tt> and perhaps all
other modules that work like this - they piggy back on the success of
the relational model.
</p><p>It should be noted that the <tt>DBIx::Class</tt> meta-model is
currently <em>implicit</em>; there is, for instance, a
<tt>DBIx::Class::Schema</tt> module that lets you create objects for a
model, but they just go ahead and make the classes immediately rather
than as a separate step. The closest thing I could find to a pure set of
data structures that represent the schema was probably
<tt>DBIx::Class::Schema::Base</tt>, but even that had the "side
effect" of compiling the classes into Perl as the schema is
constructed.
</p><p>But that's not necessarily a harsh critique of a real problem. As
an exercise, and for a RAD (Rapid Application Development) tool I was
writing at the time to procrastinate from building a real application
for a VC project, I developed a set of modules for Tangram called <a href="http://search.cpan.org/dist/T2">T2</a> that described the
Tangram meta-model using the <a href="http://search.cpan.org/dist/Class-Tangram">Class::Tangram</a>
meta-model. I later found myself wanting to do the same thing to
<tt>Class::Tangram</tt> itself - that is, have <tt>Class::Tangram</tt>
re-entrantly be its own meta-model. Other people have tried this sort
of thing, too - Kurt Stephen's UMMF, David Wheeler's Class::Meta, etc.
Metamodelling really amounts to the data modeller's equivalent of
navel gazing - ie fruitful but only with good practice and a clear mind. I admire Stevan Little's accomplishment with <a href="http://search.cpan.org/dist/Class-MOP">Class::MOP</a> in this
regard, which is why I didn't cut my Moose talk.
</p><p>But I digress. Why don't I summarise the usage scenarios where I
think the <em>Database Abstraction</em> approach really wins.
</p><p> <strong>Summary - Where Database Abstraction Wins</strong>
</p><p>There we go, large heading and everything. I have observed
Database Abstraction to be effective, both in my own practice and, more often,
in that of others, in two situations:
</p><dl>
<dt> <em>Well designed</em> models</dt>
<dd>
If the information has been modelled well using classical set theory
notions, those notions are adequate for the task at hand, and there is
little denormalization present in the data, then any approach that
ends up getting to <tt>DBIx::Class</tt> classes will work well.
</dd><dt> <em>retro-fitting</em> existing models</dt>
<dd>
The <tt>DBIx::Class::Schema::Loader</tt> wins here. You already have
a set of tables, you've defined your foreign keys properly using
constraints and what-not, and it's not just a bunch of integer id
keyed data dumping grounds, so just go ahead and load it all using a
set of clearly-defined conventions.
</dd></dl><p>Right, time to collect a free meal for my delayed flight, then I'll
have a crack at part 2.
</p><p>Footnotes:
</p><ol>
<li>Yes, querying hierarchies in SQL sucks and usually relies on
vendor-specific extensions which are inflexible and not portable. We
will get to this a bit more in part 2 hopefully.</li>
<li>Insert long rant about <tt>NULL</tt> values and duplicate rows
here.</li>
</ol>mugwumpjism2006-08-19T16:53:14+00:00yapceInternational Transit Lounges, what fun
http://use.perl.org/~mugwumpjism/journal/30697?from=rss
<p>Well, here I am sitting at one of the handy power and internet
outlets in Changi Airport in Singapore, hoping the paranoia caused by
missing an international flight the last time I lost track of time
sitting here will prevent the same from happening again. Checking in
to the transit check-in desk, they informed me of a slight delay in my
outgoing flight to Amsterdam, of the order of 5-7 hours. So, I've got
another all-nighter to pull through - I wonder if I'll be ID'd at 4am
by any assault rifle clad security staff this time around. On the
bright side, that means I should be taking off between 8am and 10am in
my home time zone. So, if I sleep deprive myself now I'll hopefully
get some sleep on the 10+ hour leg over the continent, and also
hopefully not miss a boarding call for my delayed flight being brought
forward. Fun, fun, fun.
</p><p>What better thing to do when sleep deprived than write talks, or in
this case, the second set of rants I'm passing off as substitutes for
my withdrawn YAPC talks. This second talk I really hated withdrawing;
but sadly, I had some crazy things happen to me in the 11th hour of
preparation, and when you're a hard core procrastinator like me, that
can really throw a spanner in the works because that last hour is
where most of the work gets done. So, I'll put the material and
'argument' here, and hopefully still be in the position to turn it
into a good talk with slides and examples for the Australasian Perl
conference in December (OSDC). Much, much kudos to my employer, <a href="http://www.catalyst.net.nz/">Catalyst IT</a>, for sending me to
speak at such an insane number of <em>international</em> conferences
this year (OSDC will be my third).
</p><p>In case anyone missed it, this was the advertised talk topic:
</p><p> <strong>Database - Slave or Master? DBIC vs Tangram</strong>
</p><p> <em>Whilst the DBI may be an excellent provider of database driver
independence, just about every programmer who starts using the DBI
ends up either building their own abstractions to its interface, or
using somebody else's. As a result there are a multitude of modules in
this space with significant overlap in functionality.</em>
</p><p> <em>This talk compares two major categories of database management
libraries - "Database Abstraction" (DDL driven) and "Object
Persistence" (metaclass driven). </em> <tt> <a href="http://search.cpan.org/dist/DBIx-Class/">DBIx::Class</a> </tt> <em>
(a module with some design roots in </em> <tt>Class::DBI</tt> <em>) and
</em> <tt> <a href="http://tangram.utsl.gen.nz/">Tangram</a> </tt> <em> (a
prevayler-style persistence system) are examined as mature examples of
each of these styles of access.</em>
</p><p>The plan at this point is to break it into three logical chunks -
in the first part, I will put across my thoughts about the traditional
approach of Database Abstraction used by DBIx::Class and other
modules, where the database is considered to be the centre of the
information. If nothing else, that should help clarify things like
terminology and make sure that readers of the later parts are on the
same page as me. In the second part, I will discuss the alternate
approach used by Tangram, as well as its key advantages and failings.
In the third part I will outline how I see this schism can be closed
without losing the benefits of either or having to rebuild your
applications from scratch (again).
</p><p>Let the rambling begin.</p>mugwumpjism2006-08-19T14:55:22+00:00yapceWhat people love about their VCS - Part 4 of 4. darcs
http://use.perl.org/~mugwumpjism/journal/30616?from=rss
<p>With the shining review of <tt>git</tt> just posted, it seems there would be
little ground left for other tools to show distinction.
</p><p>However I respect and admire <a href="http://www.darcs.net/DarcsWiki/FrontPage">darcs</a> on several
grounds, and there are still clear and useful development interactions
for which <tt>darcs</tt> has an advantage over all current <tt>git</tt> porcelain¹.
</p><p> <strong>It's also properly distributed</strong>
</p><p>Firstly, it should be noted that almost all of the distributed
development advantages of <tt>git</tt> also apply to <tt>darcs</tt>.
<tt>darcs</tt> also uses a single directory for its repository, so
'<tt>grep -r</tt>' is ok from sub-directories, and like <tt>git</tt>, it keeps
these repositories with the checkouts so you can freely move and copy
your checkout directories without worrying about using special
commands or updating some mapping file in an obscure location in your
dotfiles.
</p><p> <tt>darcs</tt> has not been scaled to massive projects, instead
focusing on smaller ones (say, a few thousand commits), where the
extra functionality is considered more important than speed. That
being said, in fact you'll see in newer <tt>darcs</tt> repositories
the first traces of content hashing, which have made drastic
improvements - and could eventually render <tt>git</tt>'s performance edge
marginal.
</p><p> <strong>The (in)famous Patch Calculus</strong>
</p><p>Patch Calculus has to be one of the most frighteningly named terms
used in revision control systems today. It screams "Maths to
University Level required to understand".
</p><p>But let's throw away the awful term and describe it in plain Geek.
Basically it's all about ways of determining, from a set of possibly
unrelated patches, which extra patches are required for any given
"cherry pick". I much prefer terms like <em>Patch Dependency</em> to
refer to this set of concepts. Even <tt>darcs</tt>' term <em>patch
commuting</em> could be better called <em>patch re-ordering</em>.
</p><p>The theory goes like this. If you are trying to get a specific
change from a tree, the tool quickly works out, by examining the history,
which other changes are required first, and adds all of those
patches to your tree as well.
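</p><p>Stripped of the calculus, the dependency part can be sketched as a walk over a graph. The Python below assumes the dependencies are already known and written down explicitly, and that they contain no cycles; <tt>darcs</tt> itself works them out by trying to commute patches, which this sketch does not attempt. The patch names and the <tt>patches_needed</tt> function are hypothetical.</p>

```python
def patches_needed(wanted, depends_on):
    """Return `wanted` plus every patch it transitively depends on,
    ordered so that each patch comes after its prerequisites."""
    ordered, seen = [], set()

    def visit(patch):
        if patch in seen:
            return
        seen.add(patch)
        for dep in depends_on.get(patch, ()):
            visit(dep)
        ordered.append(patch)    # appended only after all prerequisites

    visit(wanted)
    return ordered


# Hypothetical history: each patch lists the patches it builds on.
depends_on = {
    "add-parser": [],
    "fix-typo": [],
    "parser-errors": ["add-parser"],
    "parser-tests": ["parser-errors", "add-parser"],
}
picked = patches_needed("parser-tests", depends_on)
```

<p>Cherry picking <tt>parser-tests</tt> drags in the two parser patches it builds on, while the unrelated <tt>fix-typo</tt> patch stays behind.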
</p><p>The general finding from this technology is that it is useful, but
it opens a big can of worms. In essence, the version control system
is tracking not only the history that you recorded, but also all the
different paths through the patches you have made that history might
have successfully progressed. And on any code base, simple metrics
such as "does this patch apply cleanly" cannot be relied upon to
verify whether or not two changes are actually interdependent.
</p><p>So, what some developers do is manually mark which patches are
predecessors to the next patch that they make. Even more enlightened
developers use metrics such as whether or not the changed code still
compiles successfully or even passes the regression test suite to
decide whether changes are dependent.
</p><p>Whether patch dependency works or not in practice depends on
whether or not developers create commits of a high enough standard
that they co-operate with this feature.
</p><p> <strong>Interactive Commit</strong>
</p><p>I didn't talk about this much in the SVK section despite SVK having
this feature, mainly because <tt>darcs</tt> is where the feature came
from in the first place.
</p><p>Basically, the way it works is when you <tt>record</tt> changes,
you are presented with a summary of the changes, then asked to group
each change, <em>hunk by hunk</em> into bundles which are darcs
patches.
</p><p>This is largely how it is possible for the patch calculus to work
so well - if changes to a single file are combined into a single
commit as so frequently happens with file-grained <tt>commit</tt> in
other VCS, the two features being worked on become entwined and
co-dependent. The better the tool is at keeping changes small and
tidy, the better - but if they are too small, the reverse happens -
every feature is considered to be its own independent change.
</p><p> <strong>¹ - And now darcs is a <tt>git</tt> porcelain, too</strong>
</p><p>With the arrival on the scene of <a href="http://www.darcs.net/DarcsWiki/DarcsGit"> <tt>darcs-git</tt> </a>, a <tt>git</tt> porcelain with the UI of <tt>darcs</tt>, I have access to the interactive commit interface of <tt>darcs</tt> already.
</p><p>I don't miss patch dependency, because it is easily - and I would
add, less confusingly - performed with <tt>git</tt> using topic
branches (making a new branch for each new feature or stream of
development), and the powerful tools of rebasing and cherry picking.</p>mugwumpjism2006-08-14T06:48:04+00:00yapceWhat people Love about their VCS - Part 3 of 4. git
http://use.perl.org/~mugwumpjism/journal/30615?from=rss
<p>It is clear that the earlier posts in this series are light on
details and teasers, whereas this post goes into much detail on each
new feature. For this bias I offer no apology. There is no mistaking
that within the period of one year, I have gone from being an
outspoken SVK advocate to extolling the virtues of the content
filesystem, <tt>git</tt>. And I am not alone.
</p><p> <strong>Content Addressable Filesystem</strong>
</p><p>There are many good reasons that super-massive projects like the
Linux Kernel, XFree86, Mozilla, Gentoo, etc are switching to <tt>git</tt>.
This is not just a short term fad, <tt>git</tt> brings a genuinely new (well,
stolen from <tt>monotone</tt>) concept to the table - that of the
<em>content addressable filesystem</em>.
</p><p>In this model, files, trees of files, and revisions are all hashed
with a specially pre-seeded SHA1 to yield <em>object identifiers</em>,
that uniquely identify (to the strength of the hashing algorithm) the
type and contents of the object. The full ramifications of this take
some time to realise, but include more efficient delta compression¹,
algorithmically faster merging, less error prone file history
detection², but chiefly, much better identification of revisions. All
of a sudden, it does not matter which repository a revision comes from
- if the SHA1 object ID matches, <em>you have the same object</em>, so
the system naturally distributes <em>by model</em>, with no
requirement for URIs or surrogate repository UUIDs and revision
numbers.
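</p><p>The "pre-seeding" is nothing exotic: <tt>git</tt> hashes a short header naming the object type and its length, a NUL byte, and then the content. A minimal Python sketch of that calculation follows; the function name <tt>git_object_id</tt> is my own, and it only computes the ID without writing anything to a repository.</p>

```python
import hashlib


def git_object_id(obj_type, content):
    """Hash `content` (bytes) the way git does for an object of `obj_type`."""
    # Header is "<type> <length in bytes>" followed by a NUL byte.
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_blob = git_object_id("blob", b"")
hello_blob = git_object_id("blob", b"hello world\n")
same_again = git_object_id("blob", b"hello world\n")
```

<p>Running <tt>git hash-object</tt> on a file with the same bytes should print the same ID. Since the ID depends only on type and content, two repositories that have never heard of each other will agree on it.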
</p><p>Being content-keyed also means you are naturally transaction-safe.
In terms of the core repository, you are only ever adding new objects.
So, if two processes try to write to the same file, this will succeed
because it means that they are writing the same contents.
</p><p>It also makes cryptography and authentication easy - you can sign
an <em>entire project and its revision history</em> just by signing
text including a commit ID. And if you recompute the object
identifiers using a stronger hash, you have a stronger guarantee.
</p><p> <strong>The matter of speed</strong>
</p><p>The design of the <tt>git-core</tt> implementation is very OS
efficient. People might scoff at this as a key feature, but consider
this performance comparison;
</p><p> <b>SVK:</b> </p><blockquote><div><p> <tt>wilber:~/src/git$ time svk sync -t 11111<nobr> <wbr></nobr>/pugs/openfoundry<br>Syncing http://svn.openfoundry.org/pugs<br>Retrieving log information from 1 to 11111<br>Committed revision 2 from revision 1.<br>Committed revision 3 from revision 2.<br>Committed revision 4 from revision 3.<br> [...]<br>Committed revision 11110 from revision 11109.<br>Committed revision 11111 from revision 11110.<br>Committed revision 11112 from revision 11111.<br> <br>real 227m36.096s<br>user 3m47.281s<br>sys 5m0.577s</tt></p></div> </blockquote><p>That's 13,656 seconds to mirror 11,111 revisions.
</p><p>Compare that to <b> <tt>git</tt><nobr> <wbr></nobr></b>:</p><blockquote><div><p> <tt>wilber:~/src$ time git-clone git://git.kernel.org/pub/scm/git/git.git git.git<br>Checking files out...<br> 100% (688/688) done<br> <br>real 1m54.932s<br>user 0m2.825s<br>sys 0m0.468s</tt></p></div> </blockquote><p>That was 115 seconds to mirror 6,511 revisions. The key bottleneck
was the network - which was saturated for almost all of the execution
time of the command, not a laborious, revision by revision dialogue
due to a server protocol that just didn't seem to think that people
might want to copy entire repositories³. The server protocol simply
exchanged a few object IDs, then using the <em>merge base
algorithm</em> to figure out which new objects are required, it
generated a delta-compressed pack that just gives you the new objects
you need. So, <tt>git</tt> does not suffer from high latency networks
in the same way that <tt>SVN::Mirror</tt> does.
</p><p>But it's not just the server protocol which is orders of magnitude
faster. Git commands overall execute in incredibly short periods of
time. The reason for this speed isn't (just) because "it's written in
C". It's mainly due to the programming style - files are used as
<em>iterators</em>, and <em>iterator functions</em> are combined
together by way of <em>process pipelines</em>. As the computations
for these iterator functions are all completely independent, they
naturally distribute the processing, and UNIX with its pipe buffers
was always designed to make mincemeat of this kind of highly parallel
processing task.
</p><p>There is a lot to be learnt from this style of programming;
generally the habit has been to try to avoid using unnecessary IPC in
programs in order to make best use of traditional straight line CPU
performance, where task switching is a penalty. Combining iterative
programming with real operating system filehandles can bring the
potential of this speed enhancement to adequately built iterative
programs. I expect it will only be a matter of time before someone
will produce a module for Perl 6 that will auto-thread
many iterative programs to use this trick. Perhaps one day, it will
even be automatic.
</p><p>But that aside, we have yet to touch on some of the further
ramifications of the content filesystem.
</p><p> <strong>Branching is more natural</strong>
</p><p>Branches are much more natural - instead of telling the repository
ahead of time when you are branching, you simply commit. Your commit
can never be invalid (there is no "transaction out of date" error) -
if you commit in a different way to somebody else, then you have just
branched the development.
</p><p>Branches are therefore <em>observed</em>, not <em>declared</em>.
This is an important distinction, but is actually nothing new - it is
the <em>paradigm shift</em> that was so loudly touted by those
<tt>arch</tt> folk who irritatingly kept suggesting that systems like
CVS were fundamentally flawed. Beneath the bile of their arguments,
there was a key point of decentralisation that was entirely missed by
the Subversion design. Most of the new version control systems out
there - bazaar-NG, mercurial, codeville, etc have this property.
</p><p>Also, the repository itself is normally kept alongside the
checkout, in a single directory at the top called<nobr> <wbr></nobr><tt>.git</tt> (or
wherever you point the magic environment variable <tt>GIT_DIR</tt> at
- so you can get your 'spotless' checkouts, if you need them). As the
files are saved in the repository compressed via gzip and/or delta
compressed into a pack file, with filenames that are essentially SHA1
hashes, the '<tt>grep -r</tt>' problem that Subversion and CVS
suffered from is gone.
</p><p>It means that you can explain that to make a branch, you can just
copy the entire checkout+repository:</p><blockquote><div><p> <tt> $ cp -r myproject myproject.test</tt></p></div> </blockquote><p>Not only that, but you can combine repositories back together just
by copying their objects directories over each other.</p><blockquote><div><p> <tt> $ cp -ru myproject.test/.git/objects myproject/.git/<br> $ git-fsck-objects<br> dangling commit deadbeef...<br> $ git-update-ref refs/heads/test deadbeef</tt></p></div> </blockquote><p>Now, that's crude and illustrative only, but these sorts of
characteristics make repository hacks more accessible. Normally you
would just <tt>fetch</tt> those revisions:</p><blockquote><div><p> <tt> $ git-fetch<nobr> <wbr></nobr>../myproject.test test:refs/heads/test</tt></p></div> </blockquote><p> <strong>Merges are truly merges</strong>
</p><p>Unlike in Subversion, the repository itself tracks key information
about merges. When you use `<tt>svn merge</tt>', you are actually
copying changes from one place in the repository to another. Git does
support this, but calls it "pulling" changes from one branch to
another. The difference is that a merge (by default) creates a
special type of commit - a <em>merge commit</em> that has two parents
(a "parent" is just a SHA1 identifier to the previous commit). Thus,
the two branches are truly converged, and if the maintainer of the
other branch then pulls from the merged branch, they're not just
identical - they <em>are</em> the same branch. <em>Merge base
calculations</em> can just look at two commit structures, and find the
most recent commits that the two branches have in common.
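</p><p>Because every commit records its parents, a merge base can be found from the commit graph alone. The Python sketch below is a deliberately naive version, assuming the whole history fits in a dictionary of parent links; it keeps the common ancestors that no other common ancestor descends from. It is not the algorithm <tt>git</tt> actually uses, and the commit names are placeholders for SHA1 IDs.</p>

```python
def ancestors(commit, parents):
    """Return `commit` and everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def merge_bases(a, b, parents):
    """Common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }


# Hypothetical history: A <- B <- C on one branch, B <- D <- E on another,
# and M is a merge commit with the two parents C and D.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["B"], "E": ["D"],
    "M": ["C", "D"],
}
before_merge = merge_bases("C", "D", parents)
after_merge = merge_bases("M", "E", parents)
```

<p>Before the merge the two lines of development last agreed at <tt>B</tt>; once <tt>M</tt> records <tt>D</tt> as a parent, the base for any later merge moves up to <tt>D</tt>, so changes that were already merged are not considered again.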
</p><p>To compare the model of branching and merging to databases and
transactional models, the Subversion model is like auto-commit,
whereas what a distributed SCM such as <tt>git</tt> provides is akin to
transactions, with the diverged branch's commits being like SQL
savepoints, and merges being like full "commit" points.
</p><p> <strong>"Best of" merging - cherry picking</strong>
</p><p>There is also the concept of piecemeal merging via <em>cherry
picking</em>. One by one, you can pluck out individual changes that
you want instead of just merging in all of the changes from the other
branch. If you later pull the entire branch, the commits which were
cherry picked are spotted by matching the changes they introduce (a
cherry-picked commit gets a new commit ID), and do not need to be merged again.
</p><p> <strong>The plethora of tools</strong>
</p><p>Another name for <tt>git</tt> is the <em>Stupid</em> content tracker. This
is a reference to the fact that the <tt>git-core</tt> tools are really
just a set of small "iterator functions" that allow you to build
'real' SCMs atop it. So, instead of using the <tt>git-core</tt> -
the "plumbing" - directly, you will probably be using a "porcelain"
such as <a href="http://git.or.cz/gitwiki/Cogito">Cogito</a>, <a href="http://www.cyd.liu.se/~freku045/gct/">(h)gct</a>, <a href="http://digilander.libero.it/mcostalba/">QGit</a>, <a href="http://www.darcs.net/DarcsWiki/DarcsGit">Darcs-Git</a>, <a href="http://www.procode.org/stgit/">Stacked Git</a>, <a href="http://www.isisetup.ch/">IsiSetup</a>, etc. Instead of using
<tt>git-log</tt> to view revision history, you'll crank up <a href="http://git.or.cz/gitwiki/Gitk">Gitk</a>, <a href="http://git.or.cz/gitwiki/GitView">GitView</a> or the
curses-based <a href="http://jonas.nitro.dk/tig/">tig</a>.
</p><p>The <a href="http://git.or.cz/gitwiki/InterfacesFrontendsAndTools">huge list
of tools</a> which already interface with <tt>git</tt> is a product of the
massive following that it has received in its very short lifetime.
</p><p> <strong>The matter of scaling</strong>
</p><p>The scalability of <tt>git</tt> can be grasped by browsing the many Linux
trees visible on <a href="http://kernel.org/git/">http://kernel.org/git/</a>. In fact, if
you were to combine all of the trees on <tt>kernel.org</tt> into one
<tt>git</tt> repository, you would measure that the project as a whole has
anywhere between 1,000 and 4,000 commits every month. <a href="http://members.cox.net/junkio/200607-ols.pdf">Junio's OLS <tt>git</tt>
presentation</a> contains this and more.
</p><p>In fact, for a laugh, I tried this out. First, I cloned the
mainstream <tt>linux-2.6</tt> tree. This took about 35 minutes to
download the 140MB or so of packfile. Then I went through the list of
trees, and used '<tt>git fetch</tt>' to copy all extra revisions in
those trees into the same repository. It worked, taking between a
second and 8 minutes for each additional branch - and while I write
this, it has happily downloaded over 200 heads so far - leaving me
with a repository with over 40,000 revisions that packs down to only
200MB. (<b>Update:</b> Chris Wedgwood writes that he has a revision history of the Linux kernel dating all the way back to 2.4.0, with almost 97,000 commits, which is only 380MB)
</p><p>Frequently, scalability is reached through distribution of
bottlenecks, and if the design of the system itself eliminates
bottlenecks, there is much less scope for overloaded central servers
like Debian's <tt>alioth</tt> or the OSSF's
<tt>svn.openfoundry.org</tt> to slow you down. While Subversion and
SVK support "Star" and "Tree" (or hierarchical) developer team
patterns, systems such as <tt>git</tt> can truly, both in principle
and practice, be said to support <em>meshes</em> of development teams.
And this is always going to be more scalable.
</p><p> <strong>Revising patches, and uncommit</strong>
</p><p>The ability to undo, and thus completely forget, commits is
sometimes scorned, as if it were "wrong" - that version control
systems Should Not support such a bad practice, and therefore that
having no way to support it is not a flaw, but a feature. "Just
revert", they will say, and demand to know why you would ever want
such a hackish feature as <tt>uncommit</tt>.
</p><p>There is a point to their argument - if you publish a revision then
subsequently withdraw that revision from the history without
explicitly reverting it, people who are tracking your repository may
also have to remove those revisions from their branches before
applying your changes.
</p><p>However, this is not an insurmountable problem when your revision
numbers uniquely and unmistakably identify their history - and when
you are working on a set of patches for later submission, it is
actually what you want. In the name of hubris, you only care to share
the changes once you've made them each able to withstand the hordes of
the Linux Kernel Mailing List reviewers (or wherever you are sending
your changes, even to an upstream Subversion repository via
<tt>git-svn</tt>).
</p><p>In fact, the success of Linux kernel development can also be
attributed in part to its approach of committing to the mainline
kernel only patches that have been reviewed and tested in other trees,
that don't break the compile or add temporary bugs, etc. As they are
refined, the changes themselves are modified before they are
eventually cleared for inclusion in the mainline kernel. This
stringent policy allows them to do things such as <em>bisect</em>
revisions to perform a binary search between two starting points to
locate the exact patch that caused a bug.
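</p><p>The search itself is nothing exotic. Here is a minimal sketch in Python, assuming a linear, oldest-to-newest list of revisions and a test you supply; the function name <tt>bisect</tt> and the revision labels are only illustrative, and the real tool also has to cope with branching history:</p>

```python
def bisect(revisions, is_bad):
    """Return (first bad revision, number of tests run).

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tests += 1
        if is_bad(revisions[middle]):
            bad = middle
        else:
            good = middle
    return revisions[bad], tests


if __name__ == "__main__":
    # Hypothetical history of 1000 revisions; the bug arrives in r613.
    revisions = ["r%d" % n for n in range(1000)]
    culprit, tests = bisect(revisions, lambda rev: int(rev[1:]) >= 613)
    print(culprit, tests)
```

<p>The halving only works if every intermediate revision builds and can be tested, which is exactly what the stringent policy buys.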
</p><p>Before <tt>git</tt> arrived, there were tools such as <a href="http://savannah.nongnu.org/projects/quilt">Quilt</a> that
managed the use case of revising patches, but they were not integrated
with the source control management system. These days, <a href="http://www.spearce.org/category/projects/scm/pg/">Patchy Git</a>
and <a href="http://www.procode.org/stgit/">Stacked Git</a> layer this
atop <tt>git</tt> itself, using a technique that amounts to
commit reversal. In fact, the reversed commits still exist - it's
just that nothing refers to them - they can still be seen by
<tt>git-fsck-objects</tt> before the next time the maintenance command
<tt>git-prune</tt> is run.
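</p><p>A toy model of that in Python, assuming commits are a map from name to parents and branches are a map from name to head commit. The helpers <tt>reachable</tt>, <tt>unreferenced</tt> and <tt>prune</tt> are hypothetical, and the real commands work on every object type, not just commits:</p>

```python
def reachable(parents, refs):
    """Every commit that can be reached by walking back from some ref."""
    seen = set()
    todo = list(refs.values())
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen


def unreferenced(parents, refs):
    """Commits still in the object store that no ref leads to."""
    return set(parents) - reachable(parents, refs)


def prune(parents, refs):
    """Return a copy of the object store without unreferenced commits."""
    keep = reachable(parents, refs)
    return {commit: p for commit, p in parents.items() if commit in keep}


if __name__ == "__main__":
    store = {"a": [], "b": ["a"], "c": ["b"]}
    refs = {"master": "c"}
    refs["master"] = "b"              # "uncommit": move the head back one
    print(unreferenced(store, refs))  # c is still stored, just unreferenced
    print(sorted(prune(store, refs)))
```

<p>Moving the head back changes nothing in the object store; only the later clean-up actually forgets the commit.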
</p><p>So, Stacked Git has a command called <tt>uncommit</tt> that takes a
commit from the head and moves it to your patch stack, <tt>refresh</tt> to update the current patch once it has been suitably revised, a pair of
commands <tt>push</tt> and <tt>pop</tt> to wind the patch stack, a
<tt>pick</tt> command to pluck individual patches from another branch,
and a <tt>pull</tt> command that picks entire stacks of patches, which
is called "rebasing" the patch series. And of course, being a porcelain only, you can mix and match the use of <tt>stgit</tt> with other <tt>git</tt> porcelain.
</p><p>Far from being "so 20th century", patches are a clean way to
represent proposed changes to a code base that have stood the test of
time - and a practice of reviewing and revising patches encourages
debate of the implementation and makes for a tidier and more traceable
project history.
</p><p>The polar opposite to reviewing every patch - a single head that
anyone can commit to - is more like a Wiki, and an open-commit policy
Subversion server suits this style of collaboration well enough.
There is no "better" or "more modern" between these two choices of
development styles - each will suit certain people and projects better
than others.
</p><p>Of course, those tools that made distributed development a key
tenet of their design make the distributed pattern more natural, and
yet it is just as easy for them to support the Wiki-style development
pattern of Subversion.
</p><p>In fact there are no use cases for which I can recommend Subversion
over <tt>git</tt> any more. In my opinion, those that attack it on
the grounds of "simplicity" (usually on the topic of the long, though
able to be abbreviated, revision numbers) have not grasped the beauty
of the <a href="http://utsl.gen.nz/img/git-model.png">core model</a>
of <tt>git</tt>.
</p><p>Footnotes:
</p><p>Many people, especially those with time, effort and ego invested in
their own VCS, judged the features of <tt>git</tt> in very early days.
Without being able to see where it would be today, they each gave
excuses as to why this new VCS <tt>git</tt> offered their users less
functionality. So, a lot of FUD exists, a few points of which I address here.
</p><ol>
<li>git <em>does</em> do delta compression to save space (as a
separate step)</li>
<li>git <em>can</em> track renames of files, though it does not record
this in the meta-data; pragmatically, the observation is that this
is, on the whole, just as good as, if not better than, tracking them
with meta-data.</li>
<li>git <em>is not</em> forced to hold the entire project history, it
is quite possible to have partial repositories using <em>grafts</em>,
though this feature is still relatively new and initial check-outs
cannot easily be made grafts. Patches welcome <tt>;-)</tt>.</li>
</ol>mugwumpjism2006-08-14T06:33:09+00:00yapce