use Perl
All the Perl that's Practical to Extract and Report

rjbs (4671)

I'm a Perl coder living in Bethlehem, PA and working in Philadelphia. I'm a philosopher and theologian by training, but I was shocked to learn upon my graduation that these skills don't have many associated careers. Now I write code.

Journal of rjbs (4671)

Thursday June 14, 2007
10:21 PM

organizing documents, no help from adobe

[ #33524 ]

I've got a bunch of documents that I want to organize, and I didn't want to use some database system, or rely on Spotlight (which I loathe) or anything annoying like that. I wanted a different set of annoyances. I wrote a little module (File::LinkTree::Builder) to build a tree of directories, based on file metadata, whose links lead back to the files in a storage area. So, given (say) my iTunes library, it could build a tree in which I could look up /Rock/80s/Island and find every rock track from Island Records in the '80s. The module is very simple and it lets you define how to find metadata.
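To make the idea concrete, here is a minimal sketch of a link tree in Python. It is not File::LinkTree::Builder's actual interface; the names build_link_tree, get_metadata, and sanitize are hypothetical, and it assumes a filesystem that supports symlinks.

```python
from pathlib import Path
from typing import Callable, List, Mapping, Sequence


def sanitize(value: str) -> str:
    """Make a metadata value safe to use as a single directory name."""
    return value.replace("/", "_").strip() or "_unknown"


def build_link_tree(
    storage: Path,
    tree_root: Path,
    get_metadata: Callable[[Path], Mapping[str, str]],
    keys: Sequence[str],
) -> List[Path]:
    """Create tree_root/<key1>/<key2>/.../<filename> symlinks into storage.

    get_metadata is supplied by the caller, so the tree builder itself
    knows nothing about MP3 tags, PDF metadata, or anything else.
    """
    links = []
    for source in sorted(p for p in storage.rglob("*") if p.is_file()):
        meta = get_metadata(source)
        # Skip files that lack any of the requested metadata fields.
        if any(key not in meta for key in keys):
            continue
        target_dir = tree_root.joinpath(*(sanitize(meta[key]) for key in keys))
        target_dir.mkdir(parents=True, exist_ok=True)
        link = target_dir / source.name
        # Leave existing entries alone so the build can be re-run.
        if not link.is_symlink() and not link.exists():
            link.symlink_to(source.resolve())
        links.append(link)
    return links
```

Called with keys of ["genre", "decade", "label"], a track tagged Rock, 80s, Island would end up linked under tree_root/Rock/80s/Island, while the file itself stays put in the storage area.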

Since a lot of the files I want to deal with are PDF, I thought I'd look into its own metadata system. It has two, but the extensible one is XMP, the eXtensible Metadata Platform, which embeds a bunch of XML/RDF in the PDF. PDF::API2 has a way to get and replace this, so I figured I'd be on my way.

Well, the docs suck, and there's no non-image-related XMP module on the CPAN. I thought I'd just use XML::Simple, but I have to deal with goofy processing instructions, which it doesn't seem to support. I'm getting close to just doing something brute-forcey. Worst case, maybe I'll put YAML in a CDATA block and see whether it gets deleted by some Adobe tool while I'm not looking.
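For what it's worth, a brute-force approach might look something like the sketch below (in Python, not Perl, and not using PDF::API2). It relies on the fact that an XMP packet is wrapped in a pair of xpacket processing instructions, and it assumes the metadata stream is stored uncompressed in the file, which is common but not guaranteed. The function names extract_xmp and dc_values are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional

# An XMP packet sits between <?xpacket begin=...?> and <?xpacket end=...?>.
XPACKET_RE = re.compile(
    rb"<\?xpacket\s+begin=.*?\?>(.*?)<\?xpacket\s+end=.*?\?>",
    re.DOTALL,
)

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def extract_xmp(raw: bytes) -> Optional[bytes]:
    """Return the XML between the xpacket wrappers, or None if absent."""
    match = XPACKET_RE.search(raw)
    return match.group(1).strip() if match else None


def dc_values(xmp: bytes, field: str) -> List[str]:
    """Collect the values of one Dublin Core field from an XMP packet.

    Handles both plain text values and rdf:Bag/Seq/Alt lists of rdf:li.
    """
    root = ET.fromstring(xmp)
    values: List[str] = []
    for elem in root.iter("{%s}%s" % (DC_NS, field)):
        items = list(elem.iter("{%s}li" % RDF_NS))
        if items:
            values.extend((li.text or "").strip() for li in items)
        elif elem.text and elem.text.strip():
            values.append(elem.text.strip())
    return values
```

Stripping the wrappers first means the XML parser never sees the processing instructions at all, which sidesteps the problem rather than solving it. Writing metadata back is a separate matter and is not covered here.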

During my research into XMP, I found this, the most unhelpful technical documentation ever:

  • I wrote me FFRIndexed, back when it became clear that WinFS would not be built into Windows. It has a webserver and a DAV server to export something like a filesystem that is organized along the dimensions of the metadata. The key point of my system is that it only offers those dimensions as "subdirectories" which offer a real refinement, that is, which reduce the size of the result set, but are not trivial reductions to n one-element subsets.

    If you're interested in sharing your metadata collectors, I'd li