At work, I've got a problem on the back burner which is kind of interesting. We've got some mod_perl processes with big data sets. The processes fork and then serve requests. I've heard from Operations that they're not using Linux's Copy-on-Write feature to the extent desired so I'm trying to understand just what's being shared and not shared.
To that end, I wanted to map out where perl put its data. I made a picture (http://diotalevi.isa-geek.net/~josh/090909/memory-0.png, a strip, showing the visible linear memory layout from 0x3042e0 to 0x8b2990. The left edge shows where arenas are. The really clustered lines to the middle show the pointers from the arenas to the SV heads. The really splayed lines from the middle to the right show the SvANY() pointer from the SV heads to the SV bodies.
I kind of now suspect that maybe the CoW unshared pages containing SV heads because of reference counting are maybe compact or sparse. They sure seem to be highly clustered so maybe it's a-ok to go get a bunch of values between two forked processes and not worry about reference counts. Sure, the SV head pages are going to be unshared but maybe those pages are just full of other SV heads and it's not a big deal. If SV heads weren't clustered then reference count changes could have affected lots of other pages.
Anyway, there's a nice little set of pics at http://diotalevi.isa-geek.net/~josh/090909/. I started truncating precision by powers of two to get things to visually chunk up more. So when you look at memory-0.png, there's no chunking but when you look at memory-4.png, the bottom 4 bits were zeroed out.
There's a github repo of this at http://github.com/jbenjore/Internals-GraphArenas/tree/master for the interested.