I should have been doing something productive last night, but instead I ran some analysis and aggregation against perl.org's caught spam mailboxes.
The end result: a lot of numbers and not enough pretty graphs. What I really want to do is put the data into RRDtool and update it regularly -- but configuring RRDtool to do fancy things is a black art I have not mastered yet.
The short version is that we get a whole lot of spam and viruses every day. The message counts are about the same for each, but the viruses take up an order of magnitude more space (and don't compress very well either).
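For anyone curious what this kind of per-day counting looks like, here is a rough sketch in Python. It is not the script I actually ran: it assumes the caught mail sits in mbox files, and the `daily_totals` function name and the "unknown" bucket for messages without a usable Date header are made up for illustration.

```python
import mailbox
from collections import defaultdict
from email.utils import parsedate_to_datetime


def daily_totals(messages):
    """Return {day: [message_count, total_bytes]} for an iterable of messages.

    `messages` can be a mailbox.mbox instance or any iterable of
    email.message.Message objects. The day is taken from each message's
    own Date header (in that header's timezone), which is an assumption;
    delivery time would be an equally reasonable choice.
    """
    totals = defaultdict(lambda: [0, 0])
    for msg in messages:
        try:
            day = parsedate_to_datetime(msg["Date"]).date().isoformat()
        except (TypeError, ValueError):
            # Missing or unparseable Date header (common in spam).
            day = "unknown"
        totals[day][0] += 1
        totals[day][1] += len(msg.as_bytes())
    return dict(totals)
```

Running it over one mailbox for spam and one for viruses, for example `daily_totals(mailbox.mbox("spam.mbox"))` with a hypothetical file name, gives two sets of numbers that can be compared day by day for both count and size.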
On the positive side, we're not passing this stuff on to the mailing lists or @cpan.org users. On the negative side, there's no sign of this tapering off. The spam level has stayed relatively constant for the past few months, although we've begun trapping a bunch more since we started using SURBL. Virus levels fluctuate widely.
If anyone is interested in playing with the data, let me know. I'd expect some pretty graphs in return.
I'd plug National Shoot a Spammer Day, but really the virus writers are the bigger problem. Why can't they write small viruses? In some ways, the latest MyDoom variant is progress. In my book, it gets classified as spam.