Friday, May 26, 2006

New thinking about Radio UserLand's aggregator

Radio UserLand has shared its XML aggregator with Frontier for years. The built-in aggregator is simple: fetch the XML, compile it into an outline stored in the database, then use a format driver to read the table and extract standard information.

Format drivers are UserTalk scripts written for a specific syndication format. They step through the compiled outline and look for things like the site name, the site URL, a description, or details about an item such as a GUID or permalink. The format driver scripts are big and contain repetitive subroutines because, while it's fun to talk about how different RDF, RSS and Atom are, they aren't that different in ways that users care about. In fact, the currently shipping format driver for RSS handles all "versions" of RSS and RDF *because they are so similar*.

I posted an example of a script that Patrick Ritchie wrote about a year ago with a fresh approach: element drivers. The idea behind element drivers is that you extract only the information you want by calling that element's driver. Want to know the title of that website? As a script writer, you don't have to know the format. You just get the database address of the feed and ask for the site's title:

aggregator.compile.site.title (adrServicesTable)

Here, "adrServicesTable" is the address of the table in the database where the syndication service information is stored.

This script (which doesn't exist yet, but I'm writing it) would return the site's title based on the format of the feed. In an RSS 2.0 feed, that's the value of the "title" element under the "channel" element. In an Atom 1.0 feed, that's the "title" element under the main "feed" element. As an aside, the neat thing about Atom is that it tells you what type of content is in that element. For example, it can tell me that the text of the "title" element is escaped HTML, and that allows me to tell my script to read the text that way. Cool, eh?
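To make the idea concrete, here is a rough sketch of a site-title element driver. It is written in Python rather than UserTalk, and it parses the raw feed XML directly instead of reading a compiled table in the database, so treat it as an illustration of the approach and not as the script itself. The function name site_title and the sample feeds are made up for the example.

```python
import html
import xml.etree.ElementTree as ET

# Namespace used by Atom 1.0 elements.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def site_title(feed_xml):
    """Hypothetical element driver: return the site's title.

    The caller never has to know which syndication format the feed uses.
    """
    root = ET.fromstring(feed_xml)

    if root.tag == "rss":
        # RSS 2.0: the "title" element under the "channel" element.
        return root.findtext("channel/title")

    if root.tag == ATOM_NS + "feed":
        # Atom 1.0: the "title" element under the main "feed" element.
        title = root.find(ATOM_NS + "title")
        if title is None:
            return None
        text = title.text or ""
        if title.get("type") == "html":
            # Atom says the text is escaped HTML, so decode the entities.
            # (This sketch does not strip any HTML tags in the title.)
            text = html.unescape(text)
        return text

    raise ValueError("unrecognized feed format: " + root.tag)


# Made-up sample feeds for illustration.
RSS_SAMPLE = (
    '<rss version="2.0"><channel>'
    "<title>Example Site</title>"
    "</channel></rss>"
)
ATOM_SAMPLE = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<title type="html">Fish &amp;amp; Chips</title>'
    "</feed>"
)

print(site_title(RSS_SAMPLE))
print(site_title(ATOM_SAMPLE))
```

The caller asks the same question of both feeds, and the driver decides where to look. Adding another format means adding one more branch to each element driver instead of writing a whole new format driver.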

Here's the rub:

I think the aggregator format drivers need to go the way of the dodo and we need to use element drivers. It will help us reuse code and write things that are simpler. The catch comes when someone needs to write drivers for a non-standard format. We can allow for that by giving a programmer the ability to bypass the compile process completely and use their own.
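One way the bypass could look, again sketched in Python under assumptions of my own: the names compile_feed, builtin_compile and pipe_compiler are hypothetical, and the pipe-delimited "format" is invented purely to stand in for something non-standard.

```python
import xml.etree.ElementTree as ET


def builtin_compile(raw_feed):
    """Stand-in for the built-in compile step (RSS 2.0 only in this sketch)."""
    root = ET.fromstring(raw_feed)
    return {
        "title": root.findtext("channel/title"),
        "link": root.findtext("channel/link"),
    }


def compile_feed(raw_feed, custom_compiler=None):
    """Compile a feed into a table of standard fields.

    If the programmer supplies custom_compiler, the built-in compile
    process is skipped entirely and theirs is used instead.
    """
    if custom_compiler is not None:
        return custom_compiler(raw_feed)
    return builtin_compile(raw_feed)


def pipe_compiler(raw_feed):
    """Hypothetical compiler for an invented 'title|link' format."""
    title, link = raw_feed.strip().split("|")
    return {"title": title, "link": link}


# Made-up inputs for illustration.
rss = (
    '<rss version="2.0"><channel>'
    "<title>Example Site</title><link>http://example.com/</link>"
    "</channel></rss>"
)
print(compile_feed(rss))
print(compile_feed("My Site|http://example.org/", custom_compiler=pipe_compiler))
```

As long as the custom compiler hands back the same fields the element drivers expect, everything downstream can stay the same.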


Nick Bradbury says pick one syndication format

Nick Bradbury: "So, if you currently offer multiple feed formats, may I suggest that you stop doing this? Just pick a format - any format. If RSS does what you need, stick with it and dump your Atom feed. If you need the extra features that the Atom format offers, dump your RSS feed. Either way you'll be fine, and your readers will be happier."

Amen to that.





(c)Copyright: 2006 Steve Kirks
