Journal of Alias (5735)

Tuesday February 12, 2008
12:51 AM

PDF::API2 Considered Dangerous

[ #35639 ]

In between doing interesting toolchainy stuff like building a system for safe database migrations, making Perl product rpm'ification sane on RHEL5 and implementing a magic inside+outside-aware web testing module, I've also been tasked by $work to make some upgrades to our online configurable product system.

Which, to sum it up, means business cards.

Business cards built from preset web-based templates: company employees fill in their own error-checked details, and the cards appear on their desk a week later.

There are letterheads and other easier stuff too, but mostly nobody pays attention to that, since business cards are both the trickiest and carry the highest emotional factor for clients.

Business cards are also the print industry's loss leader. If you can get a company's business cards right, then you have a decent chance of getting the REST of their print business too.

So print companies ALSO care a lot about business cards.

Now doing layout logic for something small like a business card doesn't seem THAT complex, until you discover that it's all being done to print industry standards, which adds insane (but necessary) complexity to the problem.

Concepts like "colour" and "dimensions" and "font" and "position" and "background" and a myriad of other fundamentals need to be redefined in entirely different ways.

All of this complexity results in about 50,000 SLOC and 30-35 tables, not only to produce your business card correctly, but also to let you see it on screen in advance, absolutely and provably identical to the way it will come back from the print shop. WYSIWYG for custom print jobs.

The current implementation, integrated into the large 200,000 SLOC system, is a port of an implementation originally created by a startup that $work bought years ago. Their implementation was 100% Microsoft, and couldn't be integrated directly, so it was ported instead.

The current version uses PDF::API2 under the covers to provide the primary layout rendering to PDF, from which we then cook it into a gif for screen previewing.

Unfortunately, PDF::API2 in the default usage does not appear to live up to print industry standards.

Oh sure, it prints basic documents, letterheads, invoices and so on well enough.

But for business cards, it's a Big Deal when a company executive has the hyphen in his last name sitting too far to the left.

Why doesn't it work right? To be honest, I'm not entirely sure yet. I have a week allocated some time in the next month to work out exactly why.

But the short version seems to be that PDF::API2 "doesn't do fonts like any other Windows or Mac program that the rest of the industry uses".

A font taken from Windows or Mac and put into PDF::API2 comes out different. This isn't simply a matter of differences between programs. The same fonts appear to work EVERYWHERE else, except in "our product".

Although I'm not entirely sure, we suspect the work of some form of proprietary Adobe auto-hinter magic that is licensed in "everything else", but which isn't available to us.

So PDF::API2 is just using the font metrics available in the font, which can be minimal because they assume the presence of the auto-hinter.
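
To make the "bare metrics" point concrete, here is a minimal sketch in plain Perl. It is not PDF::API2 code and not a diagnosis of the actual problem. It only shows how a layout that uses fewer of the font's metrics can put a hyphen in a different place. The advance widths, the kerning pairs, the 9pt size and the `glyph_offsets` helper are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical metrics in font units (1000 units per em). Real values
# would come from the font file; these numbers are invented.
my %advance = ( A => 667, V => 667, '-' => 333, o => 556 );

# Hypothetical kerning pairs, also in font units.
my %kern = ( 'AV' => -74, 'V-' => -55 );

# Return the x offset (in points) at which each glyph of $text starts,
# using advance widths and, optionally, the kerning pairs above.
sub glyph_offsets {
    my ($text, $size, $use_kerning) = @_;
    my @chars = split //, $text;
    my $x = 0;          # running pen position in font units
    my @offsets;
    for my $i (0 .. $#chars) {
        if ($use_kerning && $i > 0) {
            # Adjust the pen position for this pair, if a pair exists
            $x += $kern{ $chars[$i - 1] . $chars[$i] } // 0;
        }
        push @offsets, $x * $size / 1000;   # font units -> points
        $x += $advance{ $chars[$i] } // die "no metric for '$chars[$i]'";
    }
    return @offsets;
}

my @plain  = glyph_offsets('AV-o', 9, 0);
my @kerned = glyph_offsets('AV-o', 9, 1);

# Index 2 is the hyphen
printf "hyphen starts at %.3fpt without kerning, %.3fpt with kerning\n",
    $plain[2], $kerned[2];
```

With these invented numbers the hyphen moves by a little over a point at 9pt, which is the scale of difference a designer will notice on a business card. Whatever the real cause turns out to be, two renderers that read different subsets of the same font's data will not agree on glyph positions.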

So suffice it to say that while PDF::API2 works just fine, be very cautious if you plan to use it for some form of document that has to replicate a design to high print standards.

You may not get out what the designer put in...

(More details as I investigate further, at which time I'll also hopefully have some proper bugs to submit)

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Lemme start by saying that before I got into programming, I did desktop publishing and typeset many a business card using antiquated versions of Quark and/or PageMaker. So I know very well the horrors of which you speak.

    If I were you I would look into replacing PDF::API2 with PDFLib (or the free version PDFLib-lite). Sure, it is not on CPAN and so annoying to install, but the level of quality in the output and control is top notch. We have been using it at $work for creating very high quality reports an

    • Good.

      We did find PDFLib, and identified it as our last-resort strategy (we'd prefer not to have to replace the renderer).

      As for the horrid API you mention, I was just planning to write a CPAN'ified object-oriented wrapper around the C-inspired one.

      Chalk one up for inflexibility to Java here.

      Every other language they support uses the same ugly hacked API, except for Java which doesn't support it, so they HAD to provide a "real" OO API.
  • Any chance you can share more on your magic web testing module?
    • The inside+outside code isn't anything particularly amazing I'm afraid.

      The web app here has a structure vaguely similar to Catalyst/Jifty in that it has a context object that holds the session/user/data etc.

      First attempts at a pure LWP-based web testing library had some problems with user accounts and debugging. We could instantiate a controller manually (including creating user accounts on the fly) but lacked similar flexibility in the web version.

      So the idea of inside+outside was that the master testing o