
All the Perl that's Practical to Extract and Report


acme (189)

http://www.astray.com/

Leon Brocard (aka acme) is an orange-loving Perl eurohacker with many varied contributions to the Perl community, including the GraphViz module on the CPAN. YAPC::Europe was all his fault. He is still looking for a Perl Monger group he can start which begins with the letter 'D'.

Journal of acme (189)

Wednesday September 12, 2007
03:50 AM

Nooglers and the PDB: Reactor

[ #34422 ]

I monitor Google's tech talks on Google Video and yesterday a very interesting one popped up: "Nooglers-and-the-pdb: Reactor". While I was watching the video I realised it was an internal Google talk for new Googlers that had been publicly posted by mistake, and it has now been withdrawn. Google Blogoscoped has a good summary of the talk. The blurb is:

Reactor is the backend that provides feed services for Google Reader, igoogle, and other applications. It provides access to the full history of feeds, with tagging and read state management. In this talk we will discuss the design of the reactor backend, including the recently-launched search feature.

The talk covers all sorts of internal Google projects and their code names. It includes the BigTable schema for Reactor, how the new search feature uses a tree of 150 servers (150 million documents) to spread network bandwidth, with a further 40 machines serving 40 million fresh documents, and the fact that the team is three people on the backend and three people plus one intern on the frontend.

Reactor has two tables in BigTable: an items table and a streams table. The items table has an item ID, which is a hash of the URL and other things, another column, and a tag column family that holds tags for each user who has read the item. The streams table has two kinds of streams: feed streams (keyed by the URL), each with a list of item IDs, and streams for users' read tags. The first page is expensive to generate:

"Any time you come to Reader it's doing tens if not hundreds of lookups of BigTable lookups in parallel to find all of your streams and the items in them".

It was quite interesting. Google should make all these talks public ;-)
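
To make the two-table layout and the parallel lookups a bit more concrete, here is a rough Python sketch. It is not Google's code or the real BigTable API: the dictionaries, the "feed/" and "user/" key prefixes, the use of SHA-1 for the item ID and every function name are my own assumptions, chosen only to mirror the description in the talk.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the two tables described in the talk.
items = {}    # item id -> {"url": ..., "tags": {user: set of tags}}
streams = {}  # stream key -> list of item ids


def item_id(url, extra=""):
    """Derive an item id by hashing the URL plus other data (SHA-1 is an assumption)."""
    return hashlib.sha1((url + extra).encode("utf-8")).hexdigest()


def add_item(feed_url, item_url):
    """Store an item and append its id to the feed stream keyed by the feed URL."""
    iid = item_id(item_url)
    items[iid] = {"url": item_url, "tags": {}}
    streams.setdefault("feed/" + feed_url, []).append(iid)
    return iid


def mark_read(user, iid):
    """Record a per-user 'read' tag on the item and in the user's read stream."""
    items[iid]["tags"].setdefault(user, set()).add("read")
    streams.setdefault("user/" + user + "/read", []).append(iid)


def first_page(user, feed_urls):
    """Look up every subscribed feed stream in parallel, then drop read items."""
    keys = ["feed/" + url for url in feed_urls]
    with ThreadPoolExecutor(max_workers=8) as pool:
        # One lookup per stream; pool.map returns results in the order of keys.
        results = list(pool.map(lambda key: streams.get(key, []), keys))
    read = set(streams.get("user/" + user + "/read", []))
    return [iid for ids in results for iid in ids if iid not in read]
```

Calling first_page for a user with many subscriptions issues one lookup per stream, which gives a feel for why the first page needs tens if not hundreds of lookups. A real backend would go over the network rather than reading a local dictionary, so the thread pool here only illustrates the fan-out.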
