May 31, 2004
JRoller Does It Again
Over the weekend, to coincide with the rather annoying Javablogs outage (hardware-related), JRoller decided to tweak its configuration file slightly, and change its base URL from “jroller.com” to “www.jroller.com”.
Quick digression. RSS2.0 defines a “guid” element. According to the RSS spec: “guid stands for globally unique identifier. It’s a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new.” There’s also a specific option to make the GUID the same as the item’s permalink, which was possibly the biggest mistake in the RSS2.0 spec.
We all know that cool URIs don’t change. While the Internet is a lot better than it used to be about making sure that if a URL points to something today, it will continue to point to that thing tomorrow, we still tend to change URLs regularly, just making sure that the old URLs continue to point to the same page.
This is true of my weblog. People continue to use http://fishbowl.pastiche.org/archives/001132.html to address my tutorial on HTTP conditional get, even though the server has been sending back “MOVED PERMANENTLY” notices for over a year.
Anyway, since Roller uses the isPermaLink="true" option in RSS2, and because you’re supposed to use GUID uniqueness to determine whether an item is new, and because they changed the permalink of every single entry over the weekend, the ultimate effect was to have every single article of every JRoller blog suddenly appear to Javablogs as being completely new, flooding the front page with a bunch of stale duplicate articles.
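To make the failure concrete, here is a minimal sketch of the kind of GUID check an aggregator might do. It is not Javablogs' actual code; the function name, the dictionary layout and the example entry URLs are all made up for illustration.

```python
def find_new_items(seen_guids, feed_items):
    """Return the items whose guid has not been seen yet, and remember them."""
    new_items = []
    for item in feed_items:
        if item["guid"] not in seen_guids:
            seen_guids.add(item["guid"])
            new_items.append(item)
    return new_items


seen = set()

# Hypothetical entry, with the permalink doubling as the guid.
before = [{"guid": "http://jroller.com/page/example/20040528", "title": "Old post"}]
# The same entry after the base URL gained a "www." prefix.
after = [{"guid": "http://www.jroller.com/page/example/20040528", "title": "Old post"}]

first_poll = find_new_items(seen, before)   # the entry is genuinely new here
second_poll = find_new_items(seen, after)   # same entry, different guid string
third_poll = find_new_items(seen, after)    # nothing changed since last poll
```

The second poll reports the old post as new, because as far as a string comparison is concerned the two GUIDs have nothing to do with each other. Redirects from the old URL to the new one don't help: the aggregator never fetches the GUID, it only compares it.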
Charles’ recommendations for syndication GUIDs:
- Don’t use a permalink (see Mark Pilgrim’s Howto on Atom IDs for some better suggestions).
- Store the ID with the entry, so that whatever happens to the entry, the GUID will stay the same. (And, Movable Type pet peeve, EXPORT the damn thing with the entry, so that it survives backup/restore)
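As a rough sketch of both recommendations together: mint an ID once, in the tag: URI style (one of the options commonly suggested for Atom IDs), store it on the entry, and emit it with isPermaLink="false". The Entry class, the helper names and the example URLs below are assumptions for illustration, not Roller's or Movable Type's data model.

```python
from dataclasses import dataclass
from datetime import date
from xml.sax.saxutils import escape


def mint_tag_uri(authority, created, specific):
    """Build a tag: URI of the form tag:authority,YYYY-MM-DD:specific."""
    return f"tag:{authority},{created.isoformat()}:{specific}"


@dataclass
class Entry:
    # Hypothetical entry record; the guid is stored, never recomputed.
    title: str
    permalink: str
    guid: str


def render_guid(entry):
    """Emit the RSS 2.0 guid element from the stored ID, not from the permalink."""
    return f'<guid isPermaLink="false">{escape(entry.guid)}</guid>'


entry = Entry(
    title="Old post",
    permalink="http://jroller.com/page/example/20040528",
    guid=mint_tag_uri("jroller.com", date(2004, 5, 28), "/page/example/20040528"),
)

guid_before = render_guid(entry)

# The site's base URL changes; only the permalink is touched.
entry.permalink = "http://www.jroller.com/page/example/20040528"

guid_after = render_guid(entry)
```

Because the ID is minted once and lives with the entry, changing the permalink leaves the guid element byte-for-byte identical, and an aggregator comparing GUIDs sees nothing new. The same stored field is what would need to go into any export format for the ID to survive a backup and restore.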
Ouch. I certainly did not mean to change any of the JRoller URLs. I'll correct that immediately. Could the strain of those false new entries be what took Javablogs down all day?
Posted by: Dave Johnson at May 31, 2004 01:17 PM
Dave: the outage was because the server had a broken case fan and was overheating. I'm pretty sure Javablogs can handle a few thousand entries (although I think the RSS feed is quietly dying right now).
Posted by: Charles Miller at May 31, 2004 01:39 PM
Cool URIs don't change: http://www.w3.org/Provider/Style/URI
Posted by: Dan Moore at June 1, 2004 01:39 PM
/me points up to where that's linked in the middle of the post.
Posted by: Charles Miller at June 1, 2004 01:57 PM