Lesson One:
10.3.2 301 Moved Permanently The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise. (RFC2616: Hypertext Transfer Protocol -- HTTP/1.1.)
It's pretty obvious, you'd think. If a webserver sends a "Moved Permanently" response, an HTTP client ought to rename bookmarks to point to the new address, since the old one is no longer valid and will never be valid again. RSS readers tend to follow this rule religiously. Mark Pilgrim's feed parser documentation states:
If you are polling a feed on a regular basis, it is very important to check the status code (d.status) every time you download. If the feed has been permanently redirected, you should update your database or configuration file with the new address (d.url). Repeatedly requesting the original address of a feed that has been permanently redirected is very rude, and may get you banned from the server. (HTTP Redirects [Universal Feed Parser])
(Most of the time it's a good rule. Javablogs follows it.)
On a recent trip to the USA, Scott stayed in a hotel with a 'net connection. The connection was one of those with an arbitrary web-based registration that made sure you'd agreed to their ass-covering terms of service. Until you had agreed, the hotel's proxy would redirect any HTTP connection to the registration page... using the 301 status code.
Within a few seconds of plugging in the ethernet cable, all of Scott's RSS subscriptions had been silently replaced with the hotel's registration page URL. All the original feed URLs were lost.
Lesson Two:
The Servlet 2.3 specification defines HttpSessionListener#sessionDestroyed thus:
Notification that a session was invalidated.
If you follow the specification to the letter, as many servlet container implementors did, you end up with a totally useless method. Sure, it gets called when a user's session is destroyed or times out. But since it's only called after the session is invalidated, you can't do anything with it.
Want to know what was in the session? Can't, it's invalid. Want to do some cleanup on objects in the session before it is dumped? Can't, it's invalid. Want to know who the session belonged to? Can't, it's invalid.
Thankfully, for the 2.4 spec the wording was fixed to read:
Notification that a session is about to be invalidated
Lesson one is a lesson on when TO follow the spec - for the hotel that doesn't want to piss over their customers. The spec is perfectly clear, it's just that the hotel didn't follow it.
If you're gonna follow the spec, carry a large bludgeoning device for those that don't follow it to ummm.... fix their specification compliance.
Daniel: Spoken like a true programmer.
Hehe, that's what I was going to respond to Daniel. ;)
While you have a valid point, Daniel, the data is still lost. What if you happen to log in to an open wifi hotspot on the road and someone has set up something like that? Who do you blame?
It has been discussed whether changing the data (like subscriptions/bookmarks) on a 301 should happen silently, or whether the user should be asked each time (which could get annoying or confusing depending on the user and the circumstances). But silently changing it - without any way to go back - well, it has its drawbacks, as is demonstrated in the above case.
My dad always said: Know the rules and why they should be followed, but don't follow them religiously just for the simple reason they are rules. Always reserve your right to knowingly break them.
I'd say keep to the spec; there are plenty of legitimate uses for Moved Permanently, and it would be tragic if we removed it because of the potential for people to misuse it.
Instead, make sure you regularly back up your system. That way, you can easily restore your URLs next time you hook up to some dodgy network!
Sencer: you've got all the makings of a really annoying DoS attack there - set up a proxy that passes HTTP from a browser, while 301ing any request that looks like it's from an aggregator.
What a pisser. I've always been annoyed that web browsers didn't make any use of 301; it would really help in the fight against link rot. Too bad the author of the hotel software was so clueless as to appear downright malicious. Weird part is, how does someone so untalented even -know- about 301's? Typically inexperienced web programmers use 302 for everything, whenever they can (especially in ASP).
Now how should web browsers or aggregators handle such a situation to provide a better user experience? Prompting would be annoying; the content author explicitly told you that the content will never be in that location again. Forcing the user to make a decision about what to do at that point would get you inconsistent behavior, and 301's would not be nearly as effective.
Use some sort of heuristic to determine when all the URLs fetched suddenly start returning 301's and warn the user? That sounds okay.
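A rough sketch of what such a heuristic might look like, in Python. Everything here is an assumption for illustration: the function name, the `min_feeds` and `threshold` values, and the extra check that most of the 301s point at the same host (which is what a registration proxy would tend to produce). It takes the results of one polling run and says whether they look suspicious enough to hold off on rewriting anything.

```python
from collections import Counter
from urllib.parse import urlsplit


def looks_like_captive_portal(redirects, min_feeds=3, threshold=0.8):
    """Guess whether a polling run was hijacked by a redirecting proxy.

    redirects maps each original feed URL to the Location it was
    301-redirected to, or to None if no 301 was received.
    min_feeds and threshold are arbitrary illustrative defaults.
    """
    if len(redirects) < min_feeds:
        return False  # too few feeds to call anything a pattern

    targets = [t for t in redirects.values() if t is not None]
    if len(targets) / len(redirects) < threshold:
        return False  # only some feeds moved, which is normal

    # Nearly everything moved at once. Count how many of the new
    # locations share the most common host.
    hosts = Counter(urlsplit(t).hostname for t in targets)
    _, most_common_count = hosts.most_common(1)[0]
    return most_common_count / len(targets) >= threshold
```

If it returns True, the aggregator could skip the updates for that run and warn the user instead of touching the subscription list.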
Probably best of all, snapshot or save all the URLs before updating them. Then the user could return to a point in time (at least the last couple of snapshots) in case anything bad happened to their feeds.
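The snapshot idea could be sketched like this, again in Python. The `FeedList` class, its method names, and the default of three retained snapshots are all made up for the example; a real aggregator would persist the snapshots to disk rather than keep them in memory.

```python
class FeedList:
    """Subscription list that keeps the last few versions of itself."""

    def __init__(self, urls, max_snapshots=3):
        # max_snapshots is assumed to be at least 1.
        self.urls = dict(urls)  # feed name -> feed URL
        self._snapshots = []
        self._max_snapshots = max_snapshots

    def snapshot(self):
        """Save a copy of the current URLs, dropping the oldest copies."""
        self._snapshots.append(dict(self.urls))
        del self._snapshots[:-self._max_snapshots]

    def apply_redirects(self, redirects):
        """Honour 301s (feed name -> new URL), snapshotting first."""
        self.snapshot()
        for name, new_url in redirects.items():
            if name in self.urls:
                self.urls[name] = new_url

    def rollback(self):
        """Restore the most recent snapshot. Returns False if none is left."""
        if not self._snapshots:
            return False
        self.urls = self._snapshots.pop()
        return True
```

After a bad night on a hotel network, one call to `rollback()` would put the subscriptions back the way they were before the redirects were applied.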
> Sencer: you've got all the makings of a really annoying DoS attack there
Well, a good way to ask for user feedback without DoSing him is to just show a little icon or text next to the title of the bookmark/feed in your client. Clicking on it would show you what the deal is (410, 301, etc.) and let you decide whether to change to the new location.
There is an easier way to fix the 301 problem. When you get a 301 try to fetch the feed at the new url. If you can successfully fetch the feed then honour the 301 and change your feed database to point to the new url. Otherwise ignore the 301.
This adheres to the spirit of the rule but protects your users when something goes wrong. (Now I have to go and check whether Aggrevator does that.)
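Here is roughly how that check could look in Python. This is a sketch, not what any particular aggregator does: `fetch` is a hypothetical callable you supply (it takes a URL and returns the response body as text, raising on failure), and "successfully fetch the feed" is interpreted here as "the body parses as XML with an RSS, Atom, or RDF root element", which is only one possible reading.

```python
import xml.etree.ElementTree as ET


def looks_like_feed(body):
    """Crude test: does the body parse as XML with a feed-like root?"""
    try:
        root = ET.fromstring(body)
    except (ET.ParseError, ValueError):
        return False
    # Drop any "{namespace}" prefix from the root tag before comparing.
    tag = root.tag.rsplit("}", 1)[-1].lower()
    return tag in {"rss", "feed", "rdf"}


def resolve_permanent_redirect(old_url, new_url, fetch, is_feed=looks_like_feed):
    """Return the URL to store after receiving a 301 for old_url.

    The 301 is honoured only if new_url can be fetched and its body
    looks like a feed; otherwise the old URL is kept.
    """
    try:
        body = fetch(new_url)
    except Exception:
        return old_url  # could not fetch the new location: ignore the 301
    return new_url if is_feed(body) else old_url
```

With the hotel's proxy, the new URL would return an HTML registration page, the feed check would fail, and the original subscription would stay put.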
Hey, the spec says that "clients with link editing capabilities ought to automatically re-link references to the Request-URI", but it doesn't dictate that those clients should SILENTLY re-link references to the URI. What about just asking "hey, I've got a 301 for this address. I'll update the feed URL, ok?".
If you ever found yourself in a hotel room like that and your newsreader started asking about updating feed URLs for all of your subscriptions, you'd just click on "No For All" after a while and go on with your stuff, with maybe a few damaged links, but not your entire database. Seems like a good compromise between spec adherence and niceness to me.