We recently launched a new mod_perl app that is accessed via SSL. Soon after launch, it became clear that we were getting lots of segfaults recorded in our error log:
[Tue Sep 4 03:42:47 2005] [notice] child pid 8195 exit signal Segmentation fault (11)
We cranked up the Apache LogLevel to 'debug' and found that almost every segfault message was preceded by something like this:
[Tue Sep 4 03:42:47 2005] [info] [client 220.127.116.11] mod_perl: Apache->print timed out
A bit of googling turned up this message, which suggests that a bug in mod_ssl's timeout handler could cause segfaults.
It turns out that all the timeouts were occurring while the app was sending a PDF that it generates on the fly. The PDF generation is very quick (thanks to PDF::Reuse) but the files are about 250KB. Assuming a clean connection with 56kbps throughput, that's going to take around 45 seconds to download. As it happens, the app is targeted at people who are travelling internationally, so many (if not most) users get much less bandwidth than 56kbps.
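For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It is not part of the app. The function name, the reading of "250KB" as 250 * 1024 bytes, and the two slower link rates are all assumptions made for illustration. It ignores SSL and TCP overhead entirely, so real downloads will take longer than these figures.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return size_bytes * 8 / bits_per_second


# Assumption: "250KB" is read as 250 * 1024 bytes.
PDF_BYTES = 250 * 1024

# The nominal modem rate from the text, plus two slower rates picked
# purely for illustration (they are not measurements from our logs).
LINK_RATES_BPS = [56_000, 45_000, 28_800]

if __name__ == "__main__":
    for rate in LINK_RATES_BPS:
        seconds = transfer_seconds(PDF_BYTES, rate)
        print(f"{rate / 1000:g} kbps: {seconds:.1f} s")
```

At the full nominal 56kbps the ideal figure is a little under 37 seconds. A modem link rarely delivers its nominal rate, and at an effective 45kbps or so the same file takes about 45 seconds. At 28.8kbps it is over 70 seconds, which is presumably closer to what a travelling user on a poor connection would see.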
Anyway, it turns out that for reasons nobody can explain, our Apache config included this line:
Bumping the timeout up to 600 seconds (10 minutes) has vastly decreased the incidence of segfaulting, although we've still had about a dozen in the last 24 hours. I could increase it further, but having one Apache/mod_perl process tied up handling a single request for more than 10 minutes is not ideal. Luckily it's not a particularly high-traffic app.