ChrisDolan's Journal ChrisDolan's use Perl Journal en-us use Perl; is Copyright 1998-2006, Chris Nandor. Stories, comments, journals, and other submissions posted on use Perl; are Copyright their respective owners. 2012-01-25T02:11:15+00:00 pudge Technology hourly 1 1970-01-01T00:00+00:00 ChrisDolan's Journal Devel::Symdump + Compress::Zlib crash <p>Lazyweb,</p><p>The following is a known bug that crashes Perl 5.10.0 (solved for 5.10.1 via <a href="">Perl RT#52740</a>)</p><blockquote><div><p> <tt>perl -MCompress::Zlib -MDevel::Symdump -e'Devel::Symdump-&gt;rnew'</tt></p></div> </blockquote><p>Does anyone know of a workaround? I want my software to work on 5.10.0 because that's what Apple ships with Snow Leopard. I don't want to use the 5.8.9 that ships with SL because I want 64-bit.</p><p>See also <a href="">CPAN RT#43675 - Segfault bug in rnew-&gt;packages from Test::Class+Moose+Test::WWW::Mechanize::Catalyst</a></p> ChrisDolan 2010-02-15T02:59:26+00:00 journal [non-perl] Hacking iPhoto via SQLite <p>(a very off-topic SQL exercise follows...)</p><p>Apple's iPhoto 2009 has a very cool face recognition feature where it will show you faces and prompt you to enter names. When it finds resemblances, it will suggest names for you to confirm or reject.</p><p>When it's wrong, it makes for funny little screenshots you can send to friends: <a href="">"Haha, cute girl! You look like an older heavier man!"</a></p><p>But once you reject the match, you can't undo or get it back in any other way. Luckily, iPhoto stores ALL of its metadata in a collection of SQLite databases. 
So, in the particular case that I linked above I was able to undo by telling the database not to ignore the marked mismatches from IMG_0095.JPG:</p><blockquote><div><p> <tt>% osascript -e 'tell application "iPhoto" to quit'<br>% cd "Pictures/iPhoto Library"<br>% sqlite3 iPhotoMain.db<br>sqlite&gt; select photoKey,relativePath from SqFileImage,SqFileInfo where sqFileInfo=SqFileInfo.primaryKey and relativePath like '%IMG_0095.JPG';<br>110|Data/2003/Roll 15/IMG_0095.jpg<br>110|Originals/2003/Roll 15/IMG_0095.JPG<br>3666|Data/2006/Apr 25, 2006/IMG_0095.jpg<br>3666|Originals/2006/Apr 25, 2006/IMG_0095.JPG<br>sqlite&gt;<nobr> <wbr></nobr>.exit<br>% sqlite3 face.db<br>sqlite&gt; update similar_faces set ignore=0 where image_key=110;<br>sqlite&gt;<nobr> <wbr></nobr>.exit<br>% open -a</tt></p></div> </blockquote> ChrisDolan 2009-08-14T02:33:23+00:00 journal [non-perl] mdfind vs. locate <p>I've used the "locate" command for many, many years on various POSIX systems. It's a very fast search for filenames containing the specified string on your machine.</p><p>Here's a Mac command that simulates "locate" using the Spotlight database:</p><blockquote><div><p> <tt>#!/bin/sh<br>mdfind "kMDItemFSName == '*$1*'"</tt></p></div> </blockquote><p>It's not as fast as locate, but it is always up to date (as opposed to locate, which relies on cron to reindex the disk periodically). As anyone can guess from the syntax, it's also very flexible. 
To find all files that originated as email attachments from my friend Glenn, for example, I can do:</p><blockquote><div><p> <tt>mdfind "kMDItemWhereFroms == '**'"</tt></p></div> </blockquote><p>Or to find all files bigger than about a gigabyte:</p><blockquote><div><p> <tt>mdfind "kMDItemFSSize &gt; 1000000000"</tt></p></div> </blockquote><p>(the WhereFroms search is fast and the size search is dramatically slower, maybe because all files have sizes but not many have kMDItemWhereFroms metadata)</p><p>To get a list of the supported keywords, type "mdimport -A"</p><p>Update: I found the <a href="">Spotlight query syntax</a> to be useful</p> ChrisDolan 2009-07-26T19:08:54+00:00 journal Copy protection: users vs. developers <p>A few weeks ago, LWN ran an article about the <a href="">Okular PDF viewer</a> which enforces copy protection as specified in the PDF specification. The LWN editor and several commenters complained about this restriction to their freedom. I have three comments on that topic:</p><p>1) As a <a href="">PDF implementor</a> myself, I chose to implement the PDF copy protection features just as Okular did. I did this because Adobe's license agreement to download the spec insisted that I do not willfully violate the spec. I agreed to those terms and so I have ignored all requests to disable said protection in my own library.</p><p>2) It's open source. Anybody can trivially turn off the copy protection and recompile (my library is Perl, so you don't even need to recompile). If they do, then they can bear the responsibility for violating the spec.</p><p>3) Do you ever hear people complaining about permissions in the tar file utility? Even GNU tar implements file access controls as specified in the tar file. If I untar a file which is -r--r--r-- or even ---x--x--x, is that a violation of my rights? I say not. It's a minor inconvenience at worst and an excellent safety precaution at best. 
Nobody's beating a drum to remove copy protection features from tar.</p><p>The copy protection worth fighting against is the kind that can take away your current rights at some unspecified future time (like what happened when Google Video or Walmart music shut down their DRM servers)</p> ChrisDolan 2009-07-04T18:12:41+00:00 journal Coding Horror <p>Jeff Atwood is the best tech blogger in the world, in my opinion. And his <a href="">most recent post</a> contains my nominee for the quote of the year:</p><blockquote><div><p>Open source software only comes in one edition: <em>awesome</em>.</p></div></blockquote> ChrisDolan 2009-07-03T02:24:36+00:00 journal Rakudo improves! <p>I haven't really been working on features for my <a href="">Perl 6 PDF grammar</a> very much, but I have been using it as a tool to follow improvements in the Rakudo implementation of Perl 6.</p><p>This week, I went through all of the PDF code and looked for comments I wrote for myself about workarounds for incomplete parts of Rakudo. The exciting news is that <b>I was able to remove about half of my workarounds</b> this week. Considering that most of the rest of the annoyances are features that are on the near-term roadmap (mainly, PGE refactoring), I consider this to be superb progress.</p><p>Bravo, Perl 6 developers!</p><p>I made a contribution of my own, too, which has made me absurdly happy out of proportion to the size of the <a href="">patch.</a> I implemented <a href="">'make' from S05</a>. This builtin function is shorthand for the following:</p><blockquote><div><p> <tt> # Equivalent code:<br> &nbsp; $/.result_object($value);<br> &nbsp; make $value;</tt></p></div> </blockquote><p>This, along with other simplifications allowed me to shave about half of the lines of <a href="">PDF::Grammar::Actions</a> while improving readability. Yay!</p> ChrisDolan 2009-02-13T04:04:05+00:00 journal Perk moved to github <p>Today I moved Perk (a Java compiler targeting Parrot) to <a href="">github</a>. 
I do not intend to maintain the SVN repository at googlecode. Sorry for the churn, but better now than later...</p><p>I am working on converting the to be real Perl 6 instead of NQP. I submitted a prerequisite patch to Rakudo to support that (implement the 'make' builtin). But I'm getting mysterious runtime failures with basic PIR.</p><p>If you want to try Perl 6, just edit one line of the Perk Makefile to change "ACTION_COMPILER=nqp.pbc" to "ACTION_COMPILER=perl6.pbc".</p><p>Once I get that to work, I'll probably port perk.pir to Then we'll see about implementing more of the actions.</p> ChrisDolan 2009-02-11T05:47:41+00:00 journal Frozen Perl talk: Perl 6 grammars <p>I'm headed to <a href="">Frozen Perl</a> tomorrow. My talk is titled <a href="">Using Rakudo Grammars</a>, but it ended up being more about Perl 6 than specifically the Parrot/Rakudo implementation.</p><p>I'm looking forward to the hackathon Sunday. I'm thinking that I'd like to work on enabling any language's PCT to be implemented in Rakudo instead of NQP. Patrick Michaud mentioned a few blockers in IRC, so I'm hoping to work on those.</p> ChrisDolan 2009-02-06T03:57:01+00:00 journal Perk: an implementation of Java language on Parrot <p>I moved my Java parsing experiment from my own SVN to the <a href="">Squawk project</a>. Squawk is intended for Parrot-based languages that are too small or immature to live on their own.</p><p>I chose the name "Perk" for my implementation to avoid any risk of Java trademarks. It's short, it reminds you of caffeine, and it's only one letter away from Perl.</p><p>If this project doesn't wither on the vine (a very real possibility, I admit) then I hope it will be source code compatible with basic Java. I have no intention of any interoperability with Java bytecode or the Java VM.</p><p>The source code is easily <a href="">browsable</a>. If anyone is interested in participating, I welcome the company. 
There's no way I'm going to make more than a dent in this project by myself.</p> ChrisDolan 2009-01-15T03:40:42+00:00 journal Regex for floating hex <p><a href="">A couple days ago</a> I wrote about Java syntax for hexadecimal representation of floating point numbers like this:</p><blockquote><div><p> <tt>0x1.fffffeP+127f</tt></p></div> </blockquote><p>Here are the Perl 6 grammar snippets I <a href="">put together</a> to parse this. Easy!</p><blockquote><div><p> <tt>token HexLiteral {<br>&nbsp; '0' ['x'|'X'] [<br>&nbsp; &nbsp; &nbsp;| '.' &lt;HexDigit&gt;+ &lt;HexExponent&gt;? &lt;FloatTypeSuffix&gt;?<br>&nbsp; &nbsp; &nbsp;| &lt;HexDigit&gt;+ [<br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| '.' &lt;HexDigit&gt;* &lt;HexExponent&gt;? &lt;FloatTypeSuffix&gt;?<br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| &lt;IntegerTypeSuffix&gt;?<br>&nbsp; &nbsp; &nbsp;]<br>&nbsp; ]<br>}<br>token HexDigit { [\d|&lt;[a..f]&gt;|&lt;[A..F]&gt;] }<br>token HexExponent { ['p'|'P'] ['+'|'-']? \d+ }<br>token FloatTypeSuffix { 'f'|'F'|'d'|'D' }</tt></p></div> </blockquote> ChrisDolan 2009-01-14T03:29:12+00:00 journal PGE Java parser and floating hex notation <p><a href="">Last week</a> I reported that I had a <a href="">working parser</a> for Java source code implemented as a Perl 6 grammar. Well, a dozen or so bugs later, it actually works now. Out of 7,000+<nobr> <wbr></nobr>.java files in the OpenJDK, I fail to parse 3 of them (plus a handful that are too large to fit in a gig of RAM -- I need to work on decreasing memory consumption). 
One of the three is surely a bug, and the other two are unusual syntax that I've never seen before in java.lang.Double and java.lang.Float:</p><blockquote><div><p> <tt>&nbsp; &nbsp; public static final double MAX_VALUE = 0x1.fffffffffffffP+1023;<br>&nbsp; &nbsp; public static final double MIN_NORMAL = 0x1.0p-1022;<br>&nbsp; &nbsp; public static final double MIN_VALUE = 0x0.0000000000001P-1022;<br>&nbsp; &nbsp;<nobr> <wbr></nobr>...<br>&nbsp; &nbsp; public static final float MAX_VALUE = 0x1.fffffeP+127f;<br>&nbsp; &nbsp; public static final float MIN_NORMAL = 0x1.0p-126f;<br>&nbsp; &nbsp; public static final float MIN_VALUE = 0x0.000002P-126f;</tt></p></div> </blockquote><p>Hex notation for rational numbers -- what an interesting idea. I've never encountered that before. It seems this notation first appeared in the <a href="">3rd edition</a> of the Java language specification.</p><p>This seems to me like a notation worth adding to Perl 6, given its inherent precision advantages over decimal notation.</p> ChrisDolan 2009-01-12T05:41:22+00:00 journal Java parser in Parrot/PGE <p>My favorite part of Perl 6 is the new grammar syntax. Over the last couple of days, I translated a Java source code grammar from <a href="">antlr</a> to <a href="">PGE</a>. After about 4-5 hours of work, I now have <strong>a Perl 6 grammar that can parse all of the<nobr> <wbr></nobr>.java files in the <a href="">OpenJDK</a> </strong> (the Java 7 source code). Well, that may be a lie. 
It's still crunching at about 5-10 seconds per file so it will be a while before I know if it's really true.</p><p>Admittedly most of the credit goes to the authors of the antlr grammar I adapted, but this also says good things about the Perl 6 regex implementation in Parrot.</p><p>The things that bit me hardest were:</p><ol><li>negated classes (PGE doesn't understand "&lt;-[abc]&gt;" so I had to make the inner part a separate token)</li><li>antlr allows character ranges outside of any character group syntax (antlr: "'0'..'9'", perl: "&lt;[0..9]&gt;")</li><li>longest token on integer vs. float (I had to change the antlr grammar to put float ahead of integer)</li><li>whitespace (I cribbed from the Pipp implementation)</li></ol> ChrisDolan 2009-01-09T05:02:20+00:00 journal [non-perl] help, my favorite author is ransoming a book! <p>For the last 20+ years, my favorite sci-fi/fantasy author has been <a href="">Lawrence Watt-Evans</a>. His books are not always profound, but they just fit my brain. He's currently writing a sequel to his 1989 sci-fi mystery Nightside City.</p><p>The catch is that he's writing it one chapter at a time and holding future chapters <a href="">hostage</a> until he gets enough donations. His goal of $300 per chapter is both reasonable and steep, depending on your point of view. He justifies it thoroughly in his FAQ.</p><p>If you've ever read one of his books, or just enjoy sci-fi, or have ever used one of my modules and feel you owe me<nobr> <wbr></nobr>;-) then please consider donating.</p> ChrisDolan 2008-12-05T03:31:48+00:00 journal Padre on Mac install trick <p>I've been curious to peek at Padre, Gabor Szabo's new editor, but its Test::NeedsDisplay prereq always failed for me with an obscure recommendation to install xvfb-run. It turns out all you really need is to run "wxPerl Makefile.PL" instead of "perl Makefile.PL". 
I added a note to that effect in the Makefile.PL (thanks to AdamK's open repository).</p><p>So, any Mac user who wants to try Padre can use this trick (I didn't actually test this exact technique since I got it installed via a much more convoluted trial-and-error process).</p><blockquote><div><p> <tt>cpan install Wx<br>wxPerl -MCPAN -e install Padre</tt></p></div> </blockquote><p>Now I'm off to report some bugs...</p> ChrisDolan 2008-11-28T20:43:07+00:00 journal Rakudo patches <p>I'm addicted to working on Rakudo. I like Perl6 a lot more than Perl5, so I've been digging into some interesting corners. In the last 4 weeks, I've submitted 13 patches:</p><ul><li> <a href="">[perl #60356]</a> Rakudo doesn't allow inheriting from classes with<nobr> <wbr></nobr>:: in the name</li><li> <a href="">[perl #60358]</a> Rakudo doesn't recognize grammars with<nobr> <wbr></nobr>:: in the name</li><li> <a href="">[perl #60384]</a> Remove references to __get_string() and related methods in PGE POD</li><li> <a href="">[perl #60716]</a> invoke multi-level namespace grammars from rules</li><li> <a href="">[perl #60218]</a> P6object.new_class 'hll' option is ignored</li><li> <a href="">[perl #60186]</a> make PGE support {PIR} closures instead of just {{PIR}}</li><li> <a href="">[perl #60160]</a> recursive "use" causes infinite loop</li><li> <a href="">[perl #60366]</a> 'does' fails with roles that have '::' in their names</li><li> <a href="">[perl #57980]</a> Fix bugs with nested ?? !!</li><li> <a href="">[perl #60164]</a> make methods return a boolean, like Test::More</li><li> <a href="">[perl #60718]</a> better error message for<nobr> <wbr></nobr>.new on undefined class</li><li> <a href="">[perl #60446]</a> first draft implementation self.WHO to return package of a class</li></ul><p>This represents ~50 hours of learning, reading, coding and patching. 
That's not much compared to the big contributors, but I'm proud of the work I've done.</p> ChrisDolan 2008-11-21T05:12:25+00:00 journal Rakudo progress <p>I've been learning the Rakudo innards like crazy this past week. I keep staying up past midnight every day -- "just one more bug..."</p><p>I suffered some despair over the weekend when I realized that two fundamental features didn't work 1) lexpad variables under recursion and 2) "::" as a namespace separator. For the latter, I had worked around it by using "PDF__Grammar" instead of "PDF::Grammar" (that is, a one-level namespace instead of N-level). But now it's fixed thanks to a big patch from Jonathan Worthington and a little one from me. Yay!</p><p>The lexpad bug is a bigger one and it's killing my progress because re-entering a method overwrites the lexical variables of the previous call. I know Patrick Michaud is actively working on it though, so I've satisfied myself by working on the other bugs in the meantime.</p><p>It's a little hard to dig through the code because there's so many layers (Perl6-&gt;PGE-&gt;NQP-&gt;PAST-&gt;POST-&gt;PIR), but I'm generally quite impressed with the Rakudo and Parrot/PCT implementation. PIR is surprisingly readable, despite being an assembler-like language. NQP is deceptively close to Perl, which keeps throwing me off. And PGE just rocks.</p> ChrisDolan 2008-11-06T05:19:08+00:00 journal Perl6 surprise: No reference operator <p>The most surprising change I've discovered so far in Perl6 is that the reference operator ('\') is no more. That is, the following code, while valid in both Perl5 and Perl6, behaves differently in each:</p><blockquote><div><p> <tt>&nbsp; &nbsp;my %hash;<br>&nbsp; &nbsp;my $hashref = \%hash;</tt></p></div> </blockquote><p>In Perl6 the \ operator creates Capture instances instead of references. 
To make Perl6 behave the way the above code does under Perl5 behavior (where a scalar "contains" the hash), do like this:</p><blockquote><div><p> <tt>&nbsp; my %hash;<br>&nbsp; my $hashref = $(%hash); # itemization</tt></p></div> </blockquote><p>Then to use that <code>$hashref</code>, do like this:</p><blockquote><div><p> <tt>&nbsp; my @keys = $hashref.keys;<br>&nbsp; # Perl5 equivalent:<br>&nbsp; # my @keys = keys %{$hashref};</tt></p></div> </blockquote><p>So, the $hashref feels less like a pointer and more like a container. The scalar delegates to its value, which is why the ".keys" works in this example without needing to explicitly dereference the scalar.</p><p>The funny thing is that Captures of Lists behave a lot like arrayrefs, so I didn't even notice I had misused the \ in the following code:</p><blockquote><div><p> <tt>&nbsp; my @array;<br>&nbsp; my $arrayref = \@array;<br>&nbsp; for @($arrayref) -&gt; $val {<br>&nbsp; &nbsp; &nbsp;say $val;<br>&nbsp; }</tt></p></div> </blockquote><p>The as-list operator, "@(...)", works about the same on Capture instances as it does on scalars that contain Lists -- in both cases, it returns a list of the enclosed values.</p> ChrisDolan 2008-10-30T05:42:41+00:00 journal 128-bit encryption coming to CAM::PDF <p>w00t! <a href="">Joe Hudson</a> has submitted a modification to the <a href="">CAM::PDF</a> encryption module to support 128-bit encryption instead of just basic 40-bit. Along the way, he fixed a LONG standing bug (about 5 years, I think) where I didn't distinguish between owner and user passwords.</p><p>I'm hoping to integrate his solution in the next week and release CAM::PDF v1.60.</p><p>I love open source!</p> ChrisDolan 2008-10-25T02:28:54+00:00 journal Perl6 PDF library, version <p><a href="">Yesterday</a> I mentioned that I was experimenting with Perl6. Ovid followed up by suggesting that I open the source. 
OK: <a href=""></a></p><p>It's an experiment to see if Perl6 grammars make PDF parsing easier than the Perl5 continued regexes (i.e.<nobr> <wbr></nobr>//cg) that I used in <a href="">CAM::PDF</a>. So far the answer is probably yes, but a few Rakudo limitations are slowing me down. Luckily I'm an optimist.<nobr> <wbr></nobr>:-)</p><p>I have gobs of disk and bandwidth, so if others want to use some of my SVN space, I'm willing to share. I'm thinking about moving my Perl5 work from my private repository to that public repository too, but I have to figure out the SVN slicing commands... <a href="">again</a>.</p> ChrisDolan 2008-10-25T01:40:40+00:00 journal Perl6 is here! <p>I wrote my first Perl6 program last night, using the Rakudo implementation on the Parrot virtual machine. Compilation is slow, but runtime is fast. I made a lot of mistakes, and the compiler doesn't print very useful syntax error messages yet, but once I figured out a few gotchas (like you can't have a grammar rule named "null"; that must be reserved somewhere) it worked and all of my tests passed.</p><p>Take that, false vaporware accusers!</p> ChrisDolan 2008-10-24T00:51:33+00:00 journal Win32 non-cp1252 filenames <p>There was a <a href="">P5P thread</a> recently about encoding filenames on Win32. A while back, I wrote an app that had to support Shift-JIS and other filesystem encodings transparently. I came up with the following unpleasant but successful hack.</p><p>Whenever I want to pass a filename to any system function (open, opendir, unlink, -f, etc) I wrap the filename string in a <code>localfile()</code> call like so: <code>unlink localfile("foo.txt")</code>. 
The <code>localfile()</code> function is defined as follows</p><blockquote><div><p> <tt>use Encode;<br>use English qw(-no_match_vars);<br> &nbsp; <br>my $encoding;<br>sub localfile {<br>&nbsp; &nbsp; my ($filename) = @_;<br>&nbsp; &nbsp; if (!defined $encoding) {<br>&nbsp; &nbsp; &nbsp; &nbsp; $encoding = q{};<br>&nbsp; &nbsp; &nbsp; &nbsp; if ($OSNAME eq 'MSWin32') {<br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; require Win32::Codepage;<br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; $encoding = Win32::Codepage::get_encoding() || q{};<br>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; $encoding &amp;&amp;= Encode::resolve_alias($encoding) || q{};<br>&nbsp; &nbsp; &nbsp; &nbsp; }<br>&nbsp; &nbsp; }<br>&nbsp; &nbsp; return $encoding ? encode($encoding, $filename) : $filename;<br>}</tt></p></div> </blockquote><p>This solution is obnoxious because you have to wrap EVERY filename in your entire program. I tested by setting my working directory to something non-ASCII in a Shift-JIS Windows and looked for test failures.</p><p>It's the only solution that worked reliably for me, though, on arbitrarily-encoded filesystems. Just using utf8 filesystems is so much easier... Well, aside from normalization issues, that is.</p> ChrisDolan 2008-10-07T05:11:14+00:00 journal CGI-Compress-Gzip v1.00: 1 day old bug fixed! <p>I was totally wrong <a href="">yesterday</a>. I blamed the spurious test failures on taint mode, but it was really autoflush that caused problems. Maybe taint was a problem too, but it was not the core problem. CGI::Compress::Gzip disables itself if autoflush mode is on, because that implies that the programmer wants HTML sent to the user NOW, not buffered and sent later via gzip compression.</p><p>I rolled out a 1.00 release this evening which I hope will work around the test failures. My thanks go out to Slaven Rezic (and others) for prompt smoke testing of new releases!</p> ChrisDolan 2008-10-07T04:49:18+00:00 journal CGI-Compress-Gzip v0.23: 5 year old bug fixed! 
<p>For over <b>five years</b>, I've been getting <a href="">smoke failure reports</a> for <a href="">CGI::Compress::Gzip</a>. I've tried many times to solve this problem and have always failed to reproduce it.</p><p>I finally figured it out today!</p><p>The problem was in the test code. The tests simulate a CGI environment by setting an envvar (HTTP_ACCEPT_ENCODING=gzip) and calling an external, very simple CGI program via backticks. The problem is that <i>some smoke systems don't pass envvars</i> (probably due to taint). So, the CGI always ran in non-gzip mode. The fix was simple.</p><p>Even though this test failure was spurious and was never reported by anyone except a smoke tester, I'm relieved to have fixed it finally. As always, I'm grateful for the patience and persistence of the smoke testing community! Perl would be nothing without its CPAN support community.</p><p>If this new release is good with the smoke testers, I'm going to push out a 1.00 release at long last.</p> ChrisDolan 2008-10-06T04:30:41+00:00 journal CAM::PDF v1.50: Better late than never <p>Back in PDF v1.5 (which corresponds to Acrobat 6, in 2003), Adobe added a new feature where nearly all of the document metadata could be serialized in compressed blocks. It was the first completely incompatible feature that Adobe added to the document format since PDF v1.0, so adoption was slow even though it can save about 20-30% of the document size.</p><p>Despite reading large swaths of the PDF v1.5 spec and fielding questions from about a hundred CAM::PDF users over the years, I never heard about this feature. I overlooked it in the 952-page spec and never came across such a PDF in the wild...</p><p><nobr> <wbr></nobr>...Until a month ago that is. Suddenly, people were emailing me left and right about support for this feature. I'm not sure what changed. Someone important (maybe a recent Acrobat release?) 
must have changed a default so new docs use the compressed syntax.</p><p>Now CAM::PDF v1.50 supports reading compressed streams. It still only supports writing the older PDF v1.4 style streams, so as a side effect it's a useful tool for downgrading your PDFs for broader compatibility. Along the way I fixed a serious bug in the PNG decompressor in my code. Wow, I can't believe nobody hit that one before.</p><p>It works very well (pretty good unit tests) but just, uh, don't look too close at the source code. I took some complex, 2002-era, barely-object-oriented code and added another layer of complexity on it. Man, if I had the time to refactor this, I would try to merge CAM::PDF's rich low-level feature set and speed with PDF::API2's saner API and the Perl PDF world would be much happier. Maybe for Rakudo 1.0...</p> ChrisDolan 2008-09-24T02:50:40+00:00 journal How ghostscript parses PDF files <p>Did you know that PDF was created to replace <a href="">PostScript</a>?</p><p>Did you know that PostScript is a Turing-complete language?</p><p>Did you know that ghostscript's <a href="">PDF parser</a> is written in PostScript?</p><blockquote><div><p><nobr> <wbr></nobr><tt>/I<nobr> <wbr></nobr>/it love</tt></p></div> </blockquote> ChrisDolan 2008-09-19T04:17:52+00:00 journal SVN copying <p>I've had to learn this twice over the last 2 years, so I'm going to document it here for eternity.<nobr> <wbr></nobr>:-)</p><p>I have a Subversion repository and I want to split off a piece of it (one subdir) into a new repository on another server.</p><blockquote><div><p> <tt>ssh my-old-server<br>&nbsp; svnadmin dump repositories/myproject | svndumpfilter include subproject \<br>&nbsp; &nbsp; &nbsp; | bzip2 -9 &gt; subproject_at_rev_4753.bz2<br>&nbsp; scp subproject_at_rev_4753.bz2 my-new-server:.<br>ssh my-new-server<br>&nbsp; svnadmin create repos/newproject<br>&nbsp; bzcat subproject_at_rev_4753.bz2 | svnadmin load repos/newproject</tt></p></div> </blockquote><p>I don't use the 
<code>--drop-empty-revs</code> option on svndumpfilter because that seems to confuse (I'm using SVN 1.4 still, not 1.5 yet).</p><p>Then in existing workspaces, I do:</p><blockquote><div><p> <tt>&nbsp; svn switch --relocate http://my-old-server/myproject/subproject \<br>&nbsp; &nbsp; &nbsp; http://my-new-server/newproject/subproject</tt></p></div> </blockquote> ChrisDolan 2008-09-03T06:12:53+00:00 journal Attribute::Handlers in 5.8 vs. 5.10 <p>If you use attributes with multiple arguments like so:<br> &nbsp; &nbsp; &nbsp; &nbsp; <code>sub foo : myattr(one, two) { }</code></p><p>then it's important to realize that the attribute arguments are parsed differently under Perl 5.8 vs. Perl 5.10. In 5.8, you get a string like "one, two" passed to your<nobr> <wbr></nobr>:ATTR sub. Under 5.10, you instead get an arrayref like ['one', 'two'].</p><p>I had some 5.8 code that parsed the attribute args like so:<br> &nbsp; &nbsp; &nbsp; &nbsp; <code>my @args = split<nobr> <wbr></nobr>/\s*,\s*/, $args</code><br>which resulted in @args containing 'ARRAY(0x123456)' under 5.10! My new workaround that is compatible with 5.8 and 5.10 is:<br> &nbsp; &nbsp; &nbsp; &nbsp; <code>my @args = ref $args ? @{$args} : split<nobr> <wbr></nobr>/\s*,\s*/, $args;</code></p><p>If anyone sees flaws in this workaround, or has a better explanation, please comment.</p> ChrisDolan 2008-07-29T05:26:30+00:00 journal BarCamp <p>I attended <a href="">BarCamp-Madison 2008</a> this weekend. It's the first time I've been to a BarCamp and it was significantly better than I expected.</p><p>Madison is a medium-sized city, but is the primary seat of government and university in Wisconsin (with lots of UW research spinoff companies). 
So there were a large number of techies in attendance, especially those representing the non-profit sector.</p><p>I presented my <a href="">Fuse+PDF</a> talk again (I'm into recycling<nobr> <wbr></nobr>:-)) to a small but very curious audience.</p><p>One very interesting discussion was about how to aggregate all of the local user groups in the area (Perl, Linux, Python, Java, Rails, PHP, etc, etc) into a tighter network. One of the products of that idea is the nascent <a href=""></a> (608 is the regional telephone prefix), which hopes to provide a central place to host mailing lists, meeting calendars, etc. The group hopes to identify a shared meeting space so all of the various users groups will have something tangible in common. I'm very excited about this plan. We also discussed the challenge of time-balancing evening user group meetings with children. People suggested a babysitting coop, or a kid-friendly room at/near the meeting space. Hmm...</p><p>The biggest bummer for me was when I described my involvement with Perl::Critic as "helping Perl developers write more readable code". A Python user laughed and said "Good luck with that!"<nobr> <wbr></nobr>:-(</p> ChrisDolan 2008-07-28T01:13:07+00:00 journal More Yacc to Parrot translation <p>I <a href="">mentioned</a> a while back that I was playing with a <a href="">PCT</a>-based grammar to parse Yacc files and transcode the Yacc grammar to PCT.</p><p>My project took two big detours.</p><p>First, I re-learned that Yacc is a parser for a pre-tokenized stream and does not include lexing or scanning, unlike PCT. So it is infeasible to do a full, automated translation. I suspected that would be the case when I started, but I didn't realize how far from complete my translation would be. 
Basically, I generate a lot of mostly-useful PGE "rule {}" constructs, but then a lot of placeholder "token {}" constructs that need to be addressed by a human.</p><p>The second detour was that I was trying to learn Yacc and PCT/PGE at the same time, which was too much. So, I dropped down to Perl5 and wrote a parser based on the m/\G.../cgxms construct. The good news is that I finished, and the parser is blazing fast, if verbose. I can parse the whole Yacc vocabulary (well, Bison v2.1 really). My testcases are perl/perly.y, bash/parse.y, cola.y, lua51.y and even bison/parse-gram.y. I am generating PCT and files that actually compile, but they are far from functional -- they're just starting points, really, but they're better than a blank page I think.</p><p>So, opinions are welcome:</p><ul> <li>Should I continue working on the Perl5 parser to get it to make better PCT output?</li><li>Should I work on porting the parser itself to PCT?</li><li>Should I put this aside and start using it to make real Parrot code from existing Yacc grammars?</li><li>Or should I work on the RT/CPANTesters bugs that have been accumulating against my existing packages? Sigh...</li></ul> ChrisDolan 2008-07-23T06:02:27+00:00 journal Polymorphic database tables? <p>[I started asking this question on IRC, but it got too complicated... It seems like something basic that most DBAs should know, but I'm not a DBA and I couldn't find a good solution after some searching.]</p><p>What's the best way to represent polymorphism in a collection of database tables?</p><p>Consider a website where students answer surveys administered by faculty or departments. Start with three database tables: survey, faculty, and department. How do I indicate one-to-one ownership from faculty to survey and from department to survey? 
I like the strong-typing guarantees of foreign keys, so I really want to avoid un-keyed solutions.</p><p>I've thought about the following solutions, but I'm unhappy with all of them:</p><dl><dt>One null field</dt><dd>Put "faculty_id" and "department_id" foreign keys in the survey table and insist that exactly one is not null. This is awkward in code due to the pervasive conditionals, and problematic as I consider more things that both faculty and departments can own (e.g. student rosters).</dd><dt>Single owner table, two-to-one</dt><dd>The survey table has an owner_id which points to an owner table which has faculty_id and department_id fields, exactly one of them non-null. This is easier to code than the above because everything gets exactly one "owner".</dd><dt>Single owner table, two-to-many</dt><dd>Ownership is not represented in the survey table, but instead the owner table has faculty_id, department_id and survey_id fields. This seems to have no advantage over the "One null field" option.</dd><dt>Multiple owner tables</dt><dd>Create faculty_survey and department_survey one-to-many tables. How do I ensure that each survey is represented exactly once across those two tables?</dd><dt>Multiple survey tables</dt><dd>Partition the surveys into two tables, one for faculty surveys and one for department surveys. This is very painful as I add more things that can be owned.</dd></dl><p>Am I missing something obvious? What happens when I add another type that can be an owner?</p> ChrisDolan 2008-06-14T05:21:19+00:00 journal
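<p>For the "One null field" option in the polymorphic-tables question above, the "exactly one is not null" rule can be pushed into the database itself instead of living only in application code. What follows is a minimal sketch, not a recommendation among the listed options. It assumes SQLite driven from Python's standard sqlite3 module; the <code>name</code> and <code>title</code> columns and the helper names <code>make_db</code> and <code>try_insert</code> are hypothetical, while the survey, faculty and department tables and the faculty_id and department_id keys come from the question.</p>

```python
import sqlite3


def make_db():
    """Build an in-memory database with the three tables from the question.

    Column names other than faculty_id and department_id are assumptions
    made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to, per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE faculty    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE survey (
            id            INTEGER PRIMARY KEY,
            title         TEXT NOT NULL,
            faculty_id    INTEGER REFERENCES faculty(id),
            department_id INTEGER REFERENCES department(id),
            -- exactly one of the two owner columns must be non-null
            CHECK ((faculty_id IS NULL) <> (department_id IS NULL))
        );
    """)
    with conn:  # commit the seed rows so later rollbacks cannot remove them
        conn.execute("INSERT INTO faculty (id, name) VALUES (1, 'Example Professor')")
        conn.execute("INSERT INTO department (id, name) VALUES (1, 'Example Department')")
    return conn


def try_insert(conn, title, faculty_id, department_id):
    """Return True if the survey row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO survey (title, faculty_id, department_id) VALUES (?, ?, ?)",
                (title, faculty_id, department_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

<p>With this schema, <code>try_insert(conn, "t", 1, None)</code> and <code>try_insert(conn, "t", None, 1)</code> succeed, while a survey with no owner, with two owners, or with a faculty_id that does not exist is rejected. The pervasive conditionals in application code remain, and each new owner type still means another nullable column and a wider CHECK, so the sketch does not resolve the objections raised in the question.</p>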