Journal of thoellri (4139)

Saturday June 11, 2005
07:20 AM

Foundation vs. Mac::PropertyList

[ #25149 ]
I needed to access iPhoto's AlbumData.xml file in order to read my iPhoto database.

Mac::iPhoto did not work for me; it always failed to load the XML file (now I know why).

Mac::PropertyList would work after massaging the XML data before handing it off to parse_plist. However, the solution was painfully slow. My 600KB AlbumData.xml file took more than 30 secs to load and parse on a Mac Mini.

So I looked into other ways to load the data from AlbumData.xml. Via PerlObjCBridge I implemented some code using NSPropertyListSerialization. I'm happy to report that the 30 secs have turned into 3 secs. That's better ...

Caveats: the data structures returned by plistToHash and loadiPhotoDB are different! Also, some data types are still missing from the plistTraverse() sub; any class it does not handle is reported on STDERR.
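One way to see where two parsed results diverge is to walk both structures in parallel and record the paths that differ. The following is only a sketch: struct_diff is a hypothetical helper that is not part of the original script, and it assumes both inputs are plain (unblessed) Perl hashes, arrays and scalars. Blessed objects of differing classes show up as a type mismatch; anything else is skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the paths at which two plain Perl
# structures differ. Returns an array reference of messages.
sub struct_diff {
    my ($left, $right, $path, $diffs) = @_;
    $path  = '' unless defined $path;
    $diffs ||= [];

    my ($lt, $rt) = (ref $left, ref $right);
    if ($lt ne $rt) {
        push @$diffs, "${path}: type mismatch ("
            . ($lt || 'scalar') . ' vs ' . ($rt || 'scalar') . ')';
    }
    elsif ($lt eq 'HASH') {
        my %seen = map { $_ => 1 } keys %$left, keys %$right;
        for my $key (sort keys %seen) {
            if (!exists $left->{$key}) {
                push @$diffs, "${path}/${key}: only in right";
            }
            elsif (!exists $right->{$key}) {
                push @$diffs, "${path}/${key}: only in left";
            }
            else {
                struct_diff($left->{$key}, $right->{$key}, "${path}/${key}", $diffs);
            }
        }
    }
    elsif ($lt eq 'ARRAY') {
        if (@$left != @$right) {
            push @$diffs, "${path}: array length " . @$left . ' vs ' . @$right;
        }
        else {
            for my $i (0 .. $#$left) {
                struct_diff($left->[$i], $right->[$i], "${path}\[$i]", $diffs);
            }
        }
    }
    elsif (!$lt) {
        # Plain scalars: compare as strings, treating undef as a marker value.
        my $l = defined $left  ? $left  : '<undef>';
        my $r = defined $right ? $right : '<undef>';
        push @$diffs, "${path}: '$l' vs '$r'" if $l ne $r;
    }
    # Other reference types (code refs, same-class objects) are not compared.
    return $diffs;
}

# Made-up sample data standing in for the two parse results.
my $x = { Name => 'Album', Count => 1, Keys => [1, 2] };
my $y = { Name => 'Album', Count => 2, Keys => [1, 2, 3] };
print "$_\n" for @{ struct_diff($x, $y) };
```

With the script below, it could be called as struct_diff(plistToHash(XML), loadiPhotoDB(XML)) to list the differing paths.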

--------------------------------------------------
use strict;
use Foundation;
use Mac::PropertyList;
use Time::HiRes qw{gettimeofday tv_interval};

use constant XML => qq{$ENV{HOME}/Pictures/iPhoto Library/AlbumData.xml};

my $t0 = [gettimeofday];
my $hash=plistToHash(XML);
my $elapsed = tv_interval($t0);
print "using plistToHash = $elapsed\n";
$t0 = [gettimeofday];
$hash=loadiPhotoDB(XML);
$elapsed = tv_interval($t0);
print "using Mac::PropertyList = $elapsed\n";

sub plistToHash {
    my($filename)=@_;
    my $data=NSData->dataWithContentsOfFile_($filename);
    return undef unless($data);
    my $plist=NSPropertyListSerialization->propertyListFromData_mutabilityOption_format_errorDescription_($data,0,undef,undef);
    return undef unless($plist);
    my %dict=();
    return plistTraverse(\%dict,$plist,'dict',0);
}

sub plistTraverse {
    my($dest,$src,$type,$depth)=@_;
    my $e=($type eq 'dict')?$src->keyEnumerator():$src->objectEnumerator;
    while(my $next = $e->nextObject()) {
        # stop at the end of the enumeration (a nil object dereferences to 0)
        last unless($$next);
        my $obj=($type eq 'dict')?$src->objectForKey_($next):$next;
        my $class=$obj->className->cString();
        my $keyString=($type eq 'dict')?$next->cString:"";
        if($class =~ /dictionary$/i){
            my %dict=();
            my $sub=plistTraverse(\%dict,$obj,'dict',$depth+1);
            if($type eq 'dict') {
                $dest->{$keyString}=$sub;
            }else{
                push(@$dest,$sub);
            }
        }elsif($class =~ /array$/i){
            my @array=();
            my $sub=plistTraverse(\@array,$obj,'array',$depth+1);
            if($type eq 'dict') {
                $dest->{$keyString}=$sub;
            }else{
                push(@$dest,$sub);
            }
        }elsif($class =~ /string$/i){
            if($type eq 'dict') {
                $dest->{$keyString}=$obj->cString;
            } else {
                push(@$dest,$obj->cString);
            }
        }elsif($class =~ /number$/i){
            if($type eq 'dict') {
                $dest->{$keyString}=$obj->doubleValue;
            } else {
                push(@$dest,$obj->doubleValue);
            }
        }elsif($class =~ /boolean$/i){
            # boolValue comes back as a number, not the string 'YES'
            if($type eq 'dict') {
                $dest->{$keyString}=$obj->boolValue?1:0;
            } else {
                push(@$dest,$obj->boolValue?1:0);
            }
        } else {
            print STDERR "**** unhandled class: $class\n";
        }
    }
    return $dest;
}

sub loadiPhotoDB {
    my($catalogPath)=@_;
    my $xml;
    # three-argument open with a lexical filehandle
    open(my $catalog, '<', $catalogPath) || return undef;
    {local $/=undef;$xml=<$catalog>;}
    close($catalog);

    # Mac::PropertyList is pretty strict about what it expects to
    # see in the XML file. We are trimming the file before handing
    # it off to parse_plist
    $xml =~ s{^.*<plist\s*.*?>\s*}{}s;
    $xml =~ s{\s*</plist\s*.*?>\s*\z}{}s;

    my $dict=Mac::PropertyList::parse_plist($xml);
    return $dict;
}
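To make the trimming step in loadiPhotoDB concrete, here is a small sketch that applies the same two substitutions to a made-up plist document. The sample XML is an assumption for illustration only; a real AlbumData.xml is far larger and its keys will differ.

```perl
use strict;
use warnings;

# Made-up sample input (not taken from a real AlbumData.xml).
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Name</key>
    <string>Example</string>
</dict>
</plist>
XML

# Drop everything up to and including the opening <plist ...> tag.
$xml =~ s{^.*<plist\s*.*?>\s*}{}s;
# Drop the closing </plist> tag and any trailing whitespace.
$xml =~ s{\s*</plist\s*.*?>\s*\z}{}s;

# What remains starts at <dict> and ends at </dict>.
print "$xml\n";
```

Because the first pattern uses a greedy .* under the /s modifier, it strips up to the last <plist ...> opening tag in the file, which is fine for a document that contains only one.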

  • I'm sure that Mac::PropertyList is slower than hooking into Foundation, but the times you see surprise me. I tried it with my own Album.xml (860kb) and it parsed it in 0.2 seconds. I'm curious how long the timed tests (in t/time.t) take for you.

    Also, what else is in the file besides the plist stuff? The only thing I've ever seen above <plist> is the XML declaration, and nothing below </plist>

    A future release of Mac::PropertyList will have the hooks to the foundation stuff so it's faster if tha
    • Brian - I just ran t/time.t and I see this:

      t/time.........Elapsed time is 0.021996
      t/time.........ok

      Pretty consistently at that value.
      I also ran the sample through Devel::DProf and here's what I see:

      macbox:~/tmp thoellri$ perl -d:DProf plist2.pl
      using Mac::PropertyList = 32.427314
      macbox:~/tmp thoellri$ dprofpp
      Total Elapsed Time = 31.58248 Seconds
      User+System Time = 31.01248 Seconds
      Exclusive Times
      %Time ExclSec CumulS #Calls sec/call Csec/c Name
      61.6 19.12 31.044 11456 0.0017 0.0027 Ma
      • Okay, good to know. I'll fix up the parser.

        The newest version is a fix by Mike Ciul that made things a little bit faster for very large files. It might help.

        What I really need to do is fix up Mike's enhancement so it can deal with files without reading them all in at once. That should be easy, but it's in line after all the other easy things. :)

        After that, I need to add the Foundation stuff (or something similar) so the Mac users don't have to suffer the portability penalty.

        Thanks again :)