NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report


TorgoX (1933)

TorgoX
  sburkeNO@SPAMcpan.org
http://search.cpan.org/~sburke/

"Il est beau comme la retractilité des serres des oiseaux rapaces [...] et surtout, comme la rencontre fortuite sur une table de dissection d'une machine à coudre et d'un parapluie !" -- Lautréamont

Journal of TorgoX (1933)

Tuesday May 27, 2003
06:56 AM

RSS and NNTP

[ #12452 ]
Dear Log,

Since I wrote the program that generates the RSS feeds for nntp.perl.org, and since Robert nicely set it up and everything, I've had a few people ask why it doesn't also provide the full content of each article in each <item> element's <body> section. There are many reasons, but the main one is: it's too damned much bandwidth to make an RSS file that gives you the last N messages on a list, or the last M days' worth of messages, for useful values of N or M. The values of N or M have to err on the side of being large, so that clients that don't poll frequently don't miss out on some posts; but that means that clients that do poll frequently (say, every half hour) still get the whole, potentially huge, file of all recent messages.

The basic problem is that the server doesn't know exactly what messages you have and haven't yet read, and so it can't give you just what's new to you. But it doesn't have to be that way: suppose the server keeps track of this, by having each RSS client access a unique URL like http://whatever/thing_rdf_gen.pl?xyz123 where xyz123 is some unique ID for that client. The program that dynamically generates that RSS feed would show items that it didn't already show last time, and then update its little database for user xyz123 so that it would know not to show them the next time.
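A minimal sketch of that per-client bookkeeping, in Python. Everything here is an assumption made for illustration: the "little database" is just an in-memory dict, articles are (number, subject) pairs, and the function name new_items_for is made up. It is not how the actual nntp.perl.org feed generator works.

```python
# Hypothetical stand-in for the server's "little database":
# maps a client ID (e.g. "xyz123") to the highest article number already shown.
last_shown = {}

def new_items_for(client_id, articles):
    """Return the articles this client has not been shown yet, then record them.

    `articles` is a list of (number, subject) pairs, assumed for this sketch.
    """
    seen_up_to = last_shown.get(client_id, 0)
    fresh = [(n, s) for (n, s) in articles if n > seen_up_to]
    if fresh:
        # Remember the newest article shown so it is skipped next time.
        last_shown[client_id] = max(n for n, _ in fresh)
    return fresh

articles = [(1, "Re: foo"), (2, "bar"), (3, "Re: bar")]
first_poll = new_items_for("xyz123", articles)    # everything is new
second_poll = new_items_for("xyz123", articles)   # nothing new since last poll
articles.append((4, "baz"))
third_poll = new_items_for("xyz123", articles)    # only the newly arrived article
```

The cost of this design is that the server now holds state for every client ID it has ever handed out.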

Or one can have a framework where each client says to the server "here's the IDs of items I've seen; now what items do you have that aren't in this set of IDs?"
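That exchange amounts to a set difference computed on the server. A rough sketch, assuming items are keyed by Message-ID-style strings (the IDs, bodies, and the function name items_not_in are all invented for the example):

```python
def items_not_in(seen_ids, server_items):
    """Server side: return only the items whose IDs the client did not list."""
    return {item_id: body
            for item_id, body in server_items.items()
            if item_id not in seen_ids}

# Hypothetical server contents, keyed by made-up message IDs.
server_items = {
    "<1@example.invalid>": "first post",
    "<2@example.invalid>": "second post",
    "<3@example.invalid>": "third post",
}

# The client reports what it has already seen; the server answers with the rest.
client_seen = {"<1@example.invalid>", "<2@example.invalid>"}
reply = items_not_in(client_seen, server_items)
```

Here the server stays stateless, but every request grows with the size of the client's history.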

Or one can have the client say "Give me the IDs of everything you have, and then I can ask for full details of everything that's new to me".
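The same idea with the work moved to the client might look like the sketch below: one cheap call that lists IDs, then one fetch per unseen ID. The class FeedServer and its methods list_ids and fetch are hypothetical names for this example, not part of any real RSS or NNTP library.

```python
class FeedServer:
    """Toy server that can list item IDs and hand out one item at a time."""

    def __init__(self, items):
        self._items = dict(items)

    def list_ids(self):
        # Cheap call: IDs only, no article bodies.
        return sorted(self._items)

    def fetch(self, item_id):
        # Expensive call: the full content of a single item.
        return self._items[item_id]

def poll(server, seen_ids):
    """Client side: fetch full details only for IDs not seen before."""
    fetched = {}
    for item_id in server.list_ids():
        if item_id not in seen_ids:
            fetched[item_id] = server.fetch(item_id)
            seen_ids.add(item_id)
    return fetched

server = FeedServer({"a1": "first post", "a2": "second post"})
seen = {"a1"}                  # the client's own record of what it has read
new_items = poll(server, seen)
nothing_new = poll(server, seen)
```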

The problem is that all of these options are solutions that have existed practically forever for NNTP, and reiterating them for RSS seems really quite wrong-headed to me, like pointlessly wrapping SMTP in XML-RPC. I have no grand conclusion here, but rather three incomplete thoughts:

* An "rss2nntp" proxy CGI should be simple to produce; it'd basically be just a newsreader that dumps the new news files in an RSS wrapper. The per-user data on the server is basically just a .newsrc.

* While we're at it, an "nntp2rss" program should be simple to produce: say, as a program that polls a given RSS feed, and every time it sees a new item, posts that item's data to a given newsgroup (whether it's one newsgroup per feed, or what, is an open issue).

* The fact that these things are possible, sane, and in fact trivial to implement, suggests that NNTP and RSS are not radically different things. Protocol-wise, they are clearly different -- and that's most of what I just said. And at the basic level of items versus posts, there are some basic problems with expressive range (you can express things in an RSS item that there's no /single obvious translation for/ in an NNTP post, and vice versa). So at the technical level, there's just no relationship; they're chalk and cheese. However, at the user end of things, there is a weird isomorphy between simple typical RSS and simple typical NNTP -- so much so that I'm left wondering: How about having newsreaders (like Netscape News, for example) be RSS readers too?
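To make the first bullet concrete, here is a rough Python sketch of the "dump the new posts in an RSS wrapper" half, with a dict standing in for the per-user .newsrc. The assumptions are loud ones: no NNTP connection is made, posts arrive as (number, subject, body) tuples, the newsrc stand-in records only a highest-read article number per group (a real .newsrc stores ranges), and the output omits channel elements a full RSS 2.0 feed would need, such as <link>.

```python
import xml.etree.ElementTree as ET

def posts_to_rss(group, posts, newsrc):
    """Wrap posts this user has not read in a bare-bones RSS 2.0 document.

    `newsrc` is a simplified, hypothetical stand-in for a .newsrc file:
    it maps a group name to the highest article number already delivered.
    """
    high_water = newsrc.get(group, 0)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = group
    ET.SubElement(channel, "description").text = "New posts in " + group
    for number, subject, body in posts:
        if number <= high_water:
            continue  # already delivered to this user
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = subject
        ET.SubElement(item, "description").text = body
        newsrc[group] = max(newsrc.get(group, 0), number)
    return ET.tostring(rss, encoding="unicode")

newsrc = {}
posts = [(1, "RSS and NNTP", "Dear Log, ..."), (2, "Re: RSS and NNTP", "Agreed.")]
first_feed = posts_to_rss("perl.example", posts, newsrc)
second_feed = posts_to_rss("perl.example", posts, newsrc)   # no items left
```

The reverse direction from the second bullet would be the mirror image: parse each <item> and post it as an article.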

Generalization: maybe most situations that suggest/allow/require a trivial protocol2protocol proxy are situations where what should really happen is for the clients of each protocol to get a bit smarter, so that the proxies aren't needed in anything but the short term.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • > like pointlessly wrapping SMTP in XML-RPC.

    This largely describes "web services" in general.

    Well, I've seen three kinds: (1) toy examples, (2) wrapping of proprietary protocols into XML (they will still be closed and proprietary, mind, as long as the vocabularies and protocols are not public), and then these (3) pointless rewrappings of existing protocols/frameworks.

    In (2) and (3) the only measurable effect has been manifold increase in bandwidth, and the need to have an XML parser everywhere. Not to
    • I just don't see much net benefit.

      Is that a pun :)

    • If I've hurt someone's feelings who think web services are the greatest thing since sliced bread, I'm sorry. I just don't see much net benefit.

      No, that pretty much nails it. Web Services are a vast conspiracy of deep-pocketed vendors and tagheads to make themselves relevant.

      There are a few benefits to Web Services, like the reinvention of IDL and "baked in platform neutrality", but there were better ways to get those benefits than XML-RPC, SOAP, WSDL, and RWSA(*) provide. For example, wrapping a pr

  • Would Etag and Last Modified not prove useful? If a requesting client specified one or the other then you get the entries since the last time. If there are only a few then provide the full content, otherwise provide the headlines.

    Obviously if the client end doesn't provide that information then continue with the current setup.

    Regardless of all of this, congratulations. It's a wonderful resource in its current state.
    --
    Steve Rushe - www.deeden.co.uk
    • Would Etag and Last Modified not prove useful?
      A better-defined replacement for RSS would be more useful. A specification that approximates the less useful half of NNTP doesn't improve when ad-hoc extensions are added to provide one or two more NNTP features on a per-feed basis.
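
The Etag / Last Modified suggestion raised in this thread could be sketched roughly as below, in Python. This is only an illustration under assumptions: the function name respond, the entry tuples, and the full_content_limit threshold are all made up, and no real web framework is involved. Note also that a standard HTTP conditional GET only lets the server answer "304 Not Modified" or send the whole resource, so trimming the feed to "entries since the last time" goes beyond what the If-Modified-Since header strictly promises.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(entries, if_modified_since=None, full_content_limit=3):
    """Return (status, payload) for a feed request.

    `entries` is a list of (posted, title, body) tuples with timezone-aware
    datetimes. With no header, the server falls back to headlines only.
    """
    if if_modified_since is None:
        return 200, [(title, None) for _, title, _ in entries]
    since = parsedate_to_datetime(if_modified_since)
    newer = [e for e in entries if e[0] > since]
    if not newer:
        return 304, []
    if len(newer) <= full_content_limit:
        # Only a few new entries: send them with full content.
        return 200, [(title, body) for _, title, body in newer]
    # Many new entries: send headlines only.
    return 200, [(title, None) for _, title, _ in newer]

def utc(hour, minute):
    return datetime(2003, 5, 27, hour, minute, tzinfo=timezone.utc)

entries = [(utc(5, 0), "old", "old body"),
           (utc(6, 30), "newer", "newer body"),
           (utc(6, 56), "newest", "newest body")]

header = format_datetime(utc(6, 0), usegmt=True)   # what the client would send
with_header = respond(entries, header)
without_header = respond(entries)
up_to_date = respond(entries, format_datetime(utc(7, 0), usegmt=True))
```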