
Beatnik (493)

  (email not shown publicly)

A 29 year old Belgian who likes Mountain Dew, Girl Scout Cookies, Tim Hortons French Vanilla Flavoured Cappuccino, Belgian beer, Belgian chocolate, Belgian women, Magners Cider, chocolate chip cookies and Perl. Likes snowboarding, snorkeling, sailing and silence. Bach can really cheer him up! He still misses his dog.

Project Daddy of Spine, a mod_perl based CMS.

In his superhero time (8.30 AM to 5.30 PM), he works on world peace.

Journal of Beatnik (493)

Thursday December 15, 2005
09:24 AM

1984 : The numbers

[ #28005 ]
If I recall correctly, the European Union does not force ISPs, phone operators, etc. to store all transaction data, as the /. article states. But then again, let's look at the numbers.
A quick calculation:

Belgium has about 10 million inhabitants. Suppose 1 million of those are actively connected to the net and are generating 100 megabytes of traffic per day.
That means that 100 million megabytes per day need to be stored.
calc.exe points out to me that that's about 95 terabytes... per day in total. If you count the different ISPs (we have about 4 major ones) then they each have to store about 24 terabytes a day. The data needs to be available from 6 months up to 2 years... ( 180 * 24 ) which comes down to 4320 terabytes per ISP for 6 months worth of data. For 2 years, that's about 17280 terabytes, per ISP. Now, I know hard disks are cheap but I doubt they're THAT cheap :)

Suppose your average disk is about 250 gig; each ISP would need about 69120 disks (17280 terabytes / 250 gig, counting 1000 gig per terabyte). At $100 for a 250 gig disk, that would cost them $6,912,000. It would be a better idea to store just the metadata :)
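For anyone who wants to redo the sums without calc.exe, here is a rough Python sketch of the calculation above. The function name and parameter names are made up for illustration, and the rounding choices (binary megabytes per terabyte for the daily total, 1000 gig per terabyte for the disk count, 2 years counted as 4 x 180 days) are assumptions picked to match the figures in the post.

```python
def retention_estimate(users=1_000_000, mb_per_user_per_day=100,
                       isps=4, disk_gb=250, disk_price_usd=100):
    """Back-of-the-envelope storage estimate per ISP (hypothetical helper)."""
    mb_per_tb = 1024 * 1024  # binary units, which gives the ~95 TB figure

    # Total traffic per day, in terabytes (about 95.4).
    total_tb_per_day = users * mb_per_user_per_day / mb_per_tb

    # Split evenly over the ISPs and round, as the post does (about 24).
    per_isp_tb_per_day = round(total_tb_per_day / isps)

    # Retention: 6 months counted as 180 days, 2 years as 4 such periods.
    six_months_tb = 180 * per_isp_tb_per_day
    two_years_tb = 4 * six_months_tb

    # Disk count for 2 years, assuming 1000 GB per TB as the post's figure implies.
    disks = two_years_tb * 1000 // disk_gb
    cost_usd = disks * disk_price_usd

    return {
        "total_tb_per_day": total_tb_per_day,
        "per_isp_tb_per_day": per_isp_tb_per_day,
        "six_months_tb": six_months_tb,
        "two_years_tb": two_years_tb,
        "disks": disks,
        "cost_usd": cost_usd,
    }


if __name__ == "__main__":
    for name, value in retention_estimate().items():
        print(f"{name}: {value}")
```

Changing `users` or `mb_per_user_per_day` shows how quickly the totals move, which is the real point: the inputs are guesses, so the outputs are too.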

I'm clearly guessing here.. plus these are figures for a small country with a limited number of ISPs. I didn't account for telephone operators.. or the fact that the data is probably in some kind of failsafe setup.. which would add at least 20% more capacity.
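To get a feel for what that failsafe overhead does to the 2 year figure, here is a small Python sketch. The helper name is hypothetical, the 20% is the lower bound guessed above, and a terabyte is again counted as 1000 gig.

```python
def with_redundancy(capacity_tb, overhead_pct=20, disk_gb=250):
    """Pad a raw capacity with redundancy overhead (hypothetical helper).

    Returns (padded capacity in TB, number of disks needed).
    """
    # Integer arithmetic; rounds down to a whole TB if the result is fractional.
    padded_tb = capacity_tb * (100 + overhead_pct) // 100

    # Ceiling division: a partly filled disk still has to be bought.
    disks = -(-(padded_tb * 1000) // disk_gb)
    return padded_tb, disks


if __name__ == "__main__":
    padded_tb, disks = with_redundancy(17280)
    print(padded_tb, "TB on", disks, "disks of 250 gig")
```

With the 17280 terabytes from above, that works out to 20736 terabytes on 82944 disks per ISP.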

Edit: Updated figures, thx jmm!
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • You divided by 4 twice - once when you went from 95 to 24, and again when you converted 6 months to 180/4. So, your result is one quarter of the real total...
  • I thought it was only the metadata anyway, i.e. a record of who connected/chatted where, rather than the actual transcript of the conversation.

    While I was actually reading the calculation (which seemed conservative, actually) - clearly, they'd be buying the largest possible disks for this use. Say 200GB disks (I'm sure there are larger). Assume a mean failure rate of 5% inside 6 months (again, conservative). If they (the ISPs) don't wish to say "sorry, that disk failed" when the authorities show up, they'

  • What's required is that ISPs and telcos only store the data needed to find out "who talked with whom". So, a telco needs to store who dialled which numbers (which they are already doing, as that's required to write bills), and ISPs need to store who sent an email to whom (which their mail logs are most likely already storing anyway), and who visits which websites (proxy logs?). On the one hand, that means that ISPs that save their mail logs on a tape, or burn them on a CD and forget to erase them after three months