use Perl

Journal of scrottie (4167)

Wednesday January 30, 2008
07:05 PM

Databases as a message passing back end

[ #35528 ]

Beating this same horse some more. It isn't dead yet and I want it to die.

In the late 70's or so, MUD was created -- a multiplayer, computer-run, text-based adventure game. Brits dialed in, paying by the minute (some getting addicted, running up huge bills, and going to jail) to play live with other players. I don't remember what technology it was built on, but it's still the gold standard today -- the richest world, most dynamic environment, most complex play, best grammar parser, etc, etc.

Its popularity gave rise to a number of free clones, written essentially as free software (MUD and MUD II were not and are not free). One of the first ran on a GE mid-frame running GECOS. GECOS had no provision for IPC or for server processes, so each player got their own executable spawned. To communicate with other people's processes (and thus be multi-player), each process had to write out a file log of what that player was doing, and another program would read all of those files and write out files for each player of what the results of their commands were. It was IPC not through pipes but through files. It sucked. It was a terrible kludge.
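To make the kludge concrete, here is a rough sketch of that arrangement in Python. It is not the clone's actual code: the file names (`<player>.in`, `<player>.out`), the function names, and the spool directory are all made up for illustration, and it ignores the races a real version would have to deal with (a command appended between the read and the truncate would be lost).

```python
import tempfile
from pathlib import Path


def submit(spool: Path, player: str, command: str) -> None:
    """A player's own process appends what the player did to its log file."""
    with open(spool / f"{player}.in", "a") as log:
        log.write(command + "\n")


def referee_pass(spool: Path) -> None:
    """The separate program: read every player's log, then write the
    results out to one file per player."""
    logs = sorted(spool.glob("*.in"))
    events = []
    for log in logs:
        for command in log.read_text().splitlines():
            events.append(f"{log.stem}: {command}")
        log.write_text("")  # mark the commands as consumed (racy, see above)
    for log in logs:
        with open(spool / f"{log.stem}.out", "a") as out:
            for event in events:
                out.write(event + "\n")


def read_results(spool: Path, player: str) -> list[str]:
    """A player's process picks up whatever the referee wrote for it."""
    out = spool / f"{player}.out"
    return out.read_text().splitlines() if out.exists() else []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        spool = Path(tmp)
        submit(spool, "alice", "say hello")
        submit(spool, "bob", "go north")
        referee_pass(spool)
        print(read_results(spool, "bob"))
```

Every exchange costs at least two file writes and two file reads, and nobody hears anything until the referee gets around to its next pass.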

Now we have Unix, and we have pretty solid IPC -- for programs running on one machine. If you make an app that runs on one machine, you can have great multi-user or multi-player capabilities. But that isn't "scalable". And in the late 1990s, we re-invented scalability. It now means that you have a database backend, and message passing is done over the database backend. Just like the MUD clone, the processes can't talk to each other except through this explicit read and write message passing system. It's so necessary that it actually feels good. Just like people adapting to Pascal's useless type system, successfully adhering to the discipline gives a rewarding feeling of accomplishment, and after a while, we forget that the discipline is ultimately pointless.
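For comparison, the database flavor of the same thing looks roughly like this. It is a sketch only: an in-memory SQLite database stands in for the shared database server, and the `messages` table and the `send`/`poll` helpers are names I made up, not anything from a real app.

```python
import sqlite3


def open_mailbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open the shared database and make sure the message table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " recipient TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def send(conn: sqlite3.Connection, recipient: str, body: str) -> None:
    """'Send' a message by writing a row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (recipient, body) VALUES (?, ?)",
            (recipient, body),
        )


def poll(conn: sqlite3.Connection, recipient: str, last_seen: int = 0):
    """'Receive' by asking for rows newer than the last one we saw."""
    rows = conn.execute(
        "SELECT id, body FROM messages"
        " WHERE recipient = ? AND id > ? ORDER BY id",
        (recipient, last_seen),
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seen
    return [body for _, body in rows], new_last


if __name__ == "__main__":
    conn = open_mailbox()
    send(conn, "bob", "alice says hello")
    messages, last_seen = poll(conn, "bob")
    print(messages, last_seen)
```

Swap "row" for "line in a file" and `poll` for the referee's pass and it is the same write-then-read-back shape as the GECOS clone.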

The irony is that real scalability -- not the kind that novices learning to program during the dot com boom invented -- was largely created as a way of supporting extremely large databases that wouldn't fit on a mere two or four processor computer, or even all of the processors you could ram in one physical cabinet and be able to move the cabinet. So we're using database technology to scale applications, but we're not using the _real_ scalability technology that enabled databases themselves to scale, so our "scaled" applications only scale as far as a database on a dual processor, 8 core machine will scale.

Okay, a few places run Oracle or SQL Server for their database for web apps and do have database servers with hundreds of processors. That's another kicker. Microsoft gets it right on this one but we web app programmers can't.

Here's the upshot. A good database server running on a good operating system will run the same application spread across hundreds of processors -- or even thousands. This technology exists. It's called "single image". OpenSSI is one implementation. It's been hacked onto Linux. Go use it. Write apps that store transient user information in RAM in a very large distributed machine and put persistent data in the database. Cache in RAM. Don't re-adapt your software architecture yet again to use memcached or other sorts of caching. Don't cache data locally on the server and have each server redundantly caching the same garbage. Cache it once on one large computer built out of all the smaller computers. OpenSSI is memcached -- for Linux.
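The split being argued for, transient state in RAM and persistent data in the database with one shared cache in front of it, can be sketched like this. The sketch runs in one ordinary process and does not touch OpenSSI at all; the assumption is that on a single-image machine the same structure would simply live in the big machine's memory. `UserStore`, the `profiles` table, and the field names are all hypothetical.

```python
import sqlite3


class UserStore:
    """Transient data stays in RAM; persistent data goes to the database
    and is cached once, in a single shared place."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS profiles ("
            " user TEXT PRIMARY KEY, email TEXT NOT NULL)"
        )
        self.sessions = {}       # transient: never written to the database
        self.profile_cache = {}  # the one copy of rows already read
        self.db_reads = 0        # counts trips to the database, for the demo

    def save_profile(self, user: str, email: str) -> None:
        """Persist to the database, then refresh the cached copy."""
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO profiles (user, email) VALUES (?, ?)",
                (user, email),
            )
        self.profile_cache[user] = email

    def profile(self, user: str):
        """Read through the cache; only a miss goes to the database."""
        if user not in self.profile_cache:
            self.db_reads += 1
            row = self.conn.execute(
                "SELECT email FROM profiles WHERE user = ?", (user,)
            ).fetchone()
            self.profile_cache[user] = row[0] if row else None
        return self.profile_cache[user]


if __name__ == "__main__":
    store = UserStore(sqlite3.connect(":memory:"))
    store.save_profile("alice", "alice@example.org")
    store.sessions["alice"] = {"room": "lobby"}  # lives and dies in RAM
    print(store.profile("alice"), store.db_reads)
```

The point of the design is that there is exactly one `profile_cache`, not one per web server each holding its own copy of the same rows.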


The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • The thing is, databases were really one of the first mass-usage systems that was forced (or took it upon itself) to solve most of the really tricky technical problems (atomic-ness + transactional-ness + parallelism + many concurrent users on multiple machines/cpus + permissions + platform-neutral) AND made those solutions available via a convenient interface.

    Because it has done this, it is a low-overhead way of leveraging those solutions without having to resort to actually learning anything new or doing anyt