Every now and then, we hear people talking about mechanisms for doing Perl in a commercial environment and how they deal with packaging and dependencies.
This is mine.
At Corporate Express, our main Perl application is a 250,000 line non-public monster of a website that has over 100,000 physical users and turns over about a billion dollars. It implements huge amounts of complex business functionality, and has layer upon layer of security and reliability functions in it because we supply to multinationals, governments and the military (only the stuff that doesn't blow up of course). Our
Lest you suspect that 200,000 lines is wasted in re-implementing stuff, the main Build.PL script has around 110 DIRECT dependencies, and somewhere in the 300-500 range of recursive dependencies. Loading the main codebase into memory takes around 150-200 MB of RAM.
When I joined the team, the build system was horribly out of date. The application was stuck on an old version of RedHat that was due to go out of support, and as a Tier 1 application we are absolutely forbidden from using an unsupported platform.
So I took on the task of upgrading both the operating system and the build system for the project. And it's a build system with a history.
Once upon a time, long ago, the project went through a period where the development team was exceptionally strong and highly skilled. And so of course, they created a roll-your-own build system called "VBuild".
They built their own Perl, and along with it they also built their own Apache, mod_perl, and half a dozen other things needed by the project. This is similar to many suggestions I hear from highly skilled people today, that at a certain point it's better just to build your own Perl. VBuild was created in the pre-commercial Linux era, so it's not an entirely unreasonable decision for that time period.
Unfortunately, a few years later the quality of the team dropped off and VBuild turned into a maintenance nightmare because it required a high-skill person to maintain it.
At the time, the Tier 1 "Must be supported" policy was coming into effect, and after the problems with custom-compiling they decided to go with the completely opposite approach of using only vendor provided packages, in a system called "UnVBuild".
Since their platform of choice was RedHat, this had become troublesome even before I arrived. Worse, in the change from RHEL 4 to RHEL 5, some of the vendor packages for things like XSLT support were dropped entirely, leaving us in a bind.
My first instinct was to return to the build-everything approach, but the stories (and commit commentary) from that time period reinforced the idea that a complete custom build was a bad idea. Office supplies is hardly a sexy industry, and not being able to entice good developers into it is a quite legitimate risk.
So in the end, I went with an approach we ended up nicknaming "HalfBuild". The concept behind it is "Vendor where possible, build where needed".
We use a fairly reasonable chunk of vendor packages under this model: Perl itself, the Oracle client, XML::LibXML and a variety of other things where our version needs are modest and RHEL5 contains them. We also use a ton of C libraries from RHEL5 that are consumed by the CPAN modules, like all the image libraries needed by Imager, some PDF and Postscript libraries, and so on.
One RPM "platform-deps" meta-package defines the full list of these system dependencies, and that RPM is maintained exclusively by server operations so that we as developers are cryptographically unable to add major non-Perl dependencies without consulting them first.
On top of this is one enormous "platform-cpan" RPM package built by the dev team that contains every single CPAN dependency needed by all of our Perl projects.
This package lives in its own home at
We then boot up the CPAN client from
The CPAN client grinds away installing for an hour, and then we're left with our "CPAN Layer", which we can include in our application with some simple changes to @INC at the beginning of our bootstrapping module.
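As a rough idea of what those @INC changes can look like, here is a minimal sketch. The install root `/opt/example/cpan-layer`, the `CPAN_LAYER_ROOT` environment variable and the `lib/perl5` layout are all assumptions made up for illustration; they are not the real paths or the real bootstrapping module.

```perl
use strict;
use warnings;

# ASSUMPTION: hypothetical root of the CPAN layer. The real location is
# not given in this post; CPAN_LAYER_ROOT is an invented override.
my $layer_root;
BEGIN {
    $layer_root = $ENV{CPAN_LAYER_ROOT} || '/opt/example/cpan-layer';
}

# "use lib" runs at compile time and prepends the directory to @INC,
# so every later "use" or "require" searches the CPAN layer before
# the vendor-provided Perl library directories.
use lib "$layer_root/lib/perl5";

# Report whether the layer directory made it into the search path.
my $in_inc = grep { $_ eq "$layer_root/lib/perl5" } @INC;
print "CPAN layer in \@INC: ", ( $in_inc ? "yes" : "no" ), "\n";
```

Doing this once in the bootstrapping module means the rest of the codebase never has to know where the layer lives.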
The
Updating
Over the last 2 years, we've upgraded it around once every 6 months and usually because we needed to add five or ten more dependencies. We tend to add these new dependencies as early as we can, when work that needs them is confirmed but unscheduled.
We also resort to the occasional hand-copied or inlined pure-Perl
While not ideal, we've been quite happy with the
It means we only have to maintain 5 RPM packages rather than 500, and updating it takes one or two man-days per 6 months, if there aren't any API changes in the upgrade.
And most importantly it provides us with much better bus sensitivity, which is hugely important in applications with working lives measured in decades.
Thanks (Score:1)
Thanks for this, I would love to hear more similar stories from other people.
In our case, we are still trying to be perfectionists and maintain >300 in-house .deb packages and several hundred hand-built CPAN packages as well.
Actually, in our case our "application" consists of many different scripts and daemons, running on different hosts and clusters, so it is impossible to pack everything in one package anyway.
I've been dreaming for a long time about fully-automated cpan-to-deb (and cpan-to-rpm) bui
Re: (Score:1)
My current solution is to build a quite clean chroot of Debian Lenny and then use cpanm to install some modules in
# debootstrap --arch amd64 --variant=buildd --include=cdbs,libwww-perl lenny
Re: (Score:2)
We have something similar of course, web cluster + application server + integration server + internal web server, but we've found that it's easier to just deploy the whole project to all the servers, and only start up the parts that are relevant.
For example, the internal and external websites share the same web application code, but the controllers and dispatch paths are only loaded into memory if that "profile" is enabled in configuration for that server.
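A minimal sketch of that idea, assuming a simple per-server config hash. The profile names, paths and `Example::Controller::*` classes are invented for illustration and are not the real application's code.

```perl
use strict;
use warnings;

# Hypothetical route table: every name below is invented for illustration.
my %ROUTES_FOR_PROFILE = (
    external => { '/catalogue' => 'Example::Controller::Catalogue' },
    internal => { '/admin'     => 'Example::Controller::Admin' },
);

# Load a class by name at run time, so controllers belonging to
# disabled profiles are never compiled into memory.
sub load_class {
    my ($class) = @_;
    ( my $file = "$class.pm" ) =~ s{::}{/}g;
    require $file;
    return $class;
}

# Build the dispatch table for the profiles enabled in this server's
# config. $loader is injectable so the sketch runs without real controllers.
sub build_dispatch {
    my ( $config, $loader ) = @_;
    $loader ||= \&load_class;
    my %dispatch;
    for my $profile ( sort keys %ROUTES_FOR_PROFILE ) {
        next unless $config->{profiles}{$profile};
        my $routes = $ROUTES_FOR_PROFILE{$profile};
        for my $path ( sort keys %$routes ) {
            $dispatch{$path} = $loader->( $routes->{$path} );
        }
    }
    return \%dispatch;
}

# Demo: a public-facing server with only the "external" profile enabled.
my @loaded;
my $dispatch = build_dispatch(
    { profiles => { external => 1, internal => 0 } },
    sub { push @loaded, $_[0]; return $_[0] },    # stub loader for the demo
);
print join( ', ', sort keys %$dispatch ), "\n";
```

In the demo only the `/catalogue` path ends up in the dispatch table, and the stub loader is never asked for the admin controller.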
So while the public facing web servers do theoretica
cpan mirror (Score:1)
How do you maintain your CPAN mirror? What is your process for verifying that updated modules are needed or allowed? How do you mark packages you do not want to update?