RSS and MIME types

The last umpteen comments I’ve heard regarding “feed:” all say that we should be using MIME types instead. But NO ONE has addressed ANY of the problems with MIME types… rather, the comments are all basically saying “MIME types are the right way to do this.” Let me describe the MIME problems in more detail here, and if someone has solutions or suggestions, please post them. Two of these problems are deal-breakers.

Problem 1: [severity: deal-breaker] In order to serve up a file with a specific MIME type, you need to make some changes in your web server configuration. There are a LOT of people out there (shared hosting, anyone?) who don’t have this capability. We have to cater to the masses, people – we’re trying to drive adoption of this technology.
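
To make "changing the MIME type" concrete, here is a minimal Python sketch of the extension-to-type lookup a web server performs before it sends a file. The `.rss` extension, the `application/rss+xml` type name, and the helper names are assumptions chosen for illustration; a real server keeps this mapping in its own configuration, which is exactly the part many publishers cannot touch.

```python
import mimetypes

# Assumptions for this sketch: feeds are published with a .rss extension
# and should be labelled application/rss+xml.
FEED_EXTENSION = ".rss"
FEED_MIME_TYPE = "application/rss+xml"


def build_type_map() -> mimetypes.MimeTypes:
    """Return an extension-to-type map that knows about the feed extension."""
    type_map = mimetypes.MimeTypes()
    type_map.add_type(FEED_MIME_TYPE, FEED_EXTENSION)
    return type_map


def content_type_for(path: str, type_map: mimetypes.MimeTypes) -> str:
    """Pick the Content-Type header value a server would send for `path`."""
    guessed, _encoding = type_map.guess_type(path)
    # Unknown extensions fall back to a generic binary type.
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    print(content_type_for("index.rss", build_type_map()))
```

Without the `add_type` call, the answer for a `.rss` file depends entirely on whatever defaults the host happens to ship.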

Problem 1a: [severity: annoyance] There are even more people who wouldn’t know a MIME type from a hole in the head. If Joe user figures out that he can build an XML file with notepad that contains his RSS data (and it’s being done more often than you think), and upload it to his web site, you’d think that’d be enough. Sorry, Joe, you need to change the MIME type too. The what?

Problem 2: [severity: deal-breaker] If you register a handler for a MIME type, the handler gets the contents of the file, rather than the URL. This is great if you’re a media player or whatever. However, with RSS, the client tool needs the URL of the RSS file, not the actual contents of the RSS file. Well, it needs the contents too, but it needs the URL so it can poll the file for changes later. This means that the file that’s actually registered with a new MIME type would have to be some kind of intermediate file, a “discovery” file if you will. So now, not only would Joe user have to learn about MIME types, but he’d have to create another discovery file as well.
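
The difference between the two handoffs can be sketched from the aggregator's side. This is an illustration only: the function name, the two `feed:` URL forms it accepts, and the list of pollable schemes are all assumptions, since the exact `feed:` syntax is not spelled out here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Schemes this hypothetical aggregator is willing to poll (an assumption).
FETCHABLE_SCHEMES = {"http", "https"}


def subscription_url(argument: str) -> Optional[str]:
    """Return a URL that can be polled later, or None if there is none.

    `argument` is whatever the browser hands to the aggregator at launch.
    """
    if not argument.lower().startswith("feed:"):
        # MIME-type handoff: all we received is a local copy of the file,
        # so there is nothing to poll.
        return None

    rest = argument[len("feed:"):]
    if rest.startswith("//"):
        # Assumed form 1: feed://example.com/rss.xml, fetched over HTTP.
        candidate = "http:" + rest
    else:
        # Assumed form 2: feed:https://example.com/rss.xml
        candidate = rest

    parts = urlsplit(candidate)
    if parts.scheme in FETCHABLE_SCHEMES and parts.netloc:
        return candidate
    return None
```

Called with a temp-file path, which is what a MIME handler receives, the function has nothing to return; called with a `feed:` link, it recovers the address to poll.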

Remember the goal. We need an easy subscription mechanism for users to subscribe to feeds. We need a solution that will a) be workable with today’s tools, and b) be easy to implement for the vast majority of publishers. Using feed: as discussed recently meets these requirements.

So please, post your comments. However, if you’re going to advocate using MIME types for RSS, make sure you address AT LEAST problems 1 and 2 in your comment. Don’t just say “feed: is wrong, you have to use MIME types” – address the real problems. Otherwise, it’s all theoretical.

27 thoughts on “RSS and MIME types”

  1. Anil

    In all fairness, messing with the protocol:// standard is a deal-breaker, too. This breaks lots of clients in many ways that we probably can’t even anticipate. I admire your enthusiasm, and share the sense of “shipping is the most important feature” but there are some well-reasoned critiques that can be made here.

    All but the most rudimentary shared hosting situations allow creation of a .htaccess file which can add an appropriate MIME type.

    Point 1a seems like a red herring. Users who are savvy enough to be manually hacking XML files are certainly going to have access to instructions on manually creating an .htaccess file, particularly since every instruction page on the process will link to exactly those directions.

    Discovery files are *not* needed. MP3s are regularly served through the web with no discovery files, and users choose how they’re handled, with methods ranging from desktop download all the way to adding to a playlist or music library. RSS feeds that are fed to a “player” app would automatically pass along their URLs, and the player app could decide whether that means preview, subscribe, or prompt the user.

    I do want something good to happen with user experience here, but we’re adding features to this form of syndication after the fact, a reflection of the format not having been designed to address such things. (Pause for a moment to consider what possibilities that raises for formats that *are* designed.)

    Whatever decision we make here will never reach 100% of users. But the majority of feeds being produced are hosted in a few environments or by a few companies, and we can cover the vast majority of feeds with these technologies. Ignoring one of the few fundamental pieces of web/net infrastructure that’s not yet broken, the URI, is a much higher price to pay, and frankly no form of syndication is so valuable as to be worth that kind of breakage.

  2. Matthew Ernest

    Anil’s MP3 comparison is incorrect in its description of a direct-served MP3 file being passed to a player. When served without a “discovery file”, the browser (or equivalent agent) downloads the file to local storage and then invokes the player on that local file. The player does not get the URL in this situation, and therefore can perform no action based on the URL.

    The situation is just not the way Anil describes, and can be easily verified as such.

  3. Greg Reinacker

    Anil, on the MP3 comparison, Matthew is correct – the file contents are passed along, rather than the URL.

    On the protocol thing, is the feed: thing really that much different than mailto: ?

  4. Danny

    Anil’s right about .htaccess being available with most basic host providers – I use two of the most minimal myself. Initially I argued (on the rss-dev list a few months ago) against the use of a new mime type, but after a bit of reading around I do think it’s far preferable to a new scheme.

    Most people savvy enough to generate RSS won’t have a problem with this.

    The way an agent behaves when an mp3 is served will (should!) usually depend on the mime type used. Saving to file or playing in a media player (streaming or after saving) are two examples of possible behaviour. If the mime type is set to audio/x-mp3 and xmms is the registered app, you get a stream.

    mailto: is a registered scheme. Registering schemes is *not* easy. Some background:

    http://www.w3.org/Addressing/schemes-gen.html#Registration

    http://www.w3.org/TR/webarch/#URI-scheme

    “Good practice

    New URI schemes: Authors of specifications SHOULD avoid introducing new URI schemes when existing schemes can be used to meet the goals of the specifications.”

  5. Greg Reinacker

    Re: “mailto: is not a registered scheme”

    Well, ok…then why couldn’t feed: be the same way? “mailto:” is all about hooking into another application to send mail. That’s all we’re looking for “feed:” to do.

  6. Richard Tallent

    As one of the advocates of the MIME type option, I agree on points 1 and 2 (I’m unconvinced of the J6P XML author argument, and an “rss” extension for hand-edited files would be a decent workaround in most browsers). If RSS and Echo both had a standard place to expose their “home URL”, the biggest issue would be moot, but there isn’t so it is a valid concern.

    That said, I’m still in favor of having a proper MIME type and standard extension so aggregators can export/import/view feeds on disk or network devices, and I’m still in favor of using the HTML <link> tag for autodiscovery through a browser plug-in (with no “feed:” prefix preferably), which will require a proper MIME type. I’m not against coffee mug or RSS icons or whatever for feeds as well that link to “feed:whatever”, but finding the icons can be a pain and the <link> tag is underused.
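
A rough Python sketch of the autodiscovery idea, as a browser plug-in or aggregator might do it: scan a page's HTML for link elements that advertise a feed. The `rel="alternate"` convention, the `application/rss+xml` type, and the names `FeedLinkFinder` and `find_feeds` are assumptions for illustration, not something the thread specifies.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (an assumption).
FEED_TYPES = {"application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised by <link> elements in a page."""

    def __init__(self, page_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.feed_urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rel_values = attributes.get("rel", "").lower().split()
        mime_type = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rel_values and mime_type in FEED_TYPES and href:
            # Resolve relative hrefs against the page address.
            self.feed_urls.append(urljoin(self.page_url, href))


def find_feeds(html_text: str, page_url: str) -> List[str]:
    """Return the absolute feed URLs a page advertises."""
    finder = FeedLinkFinder(page_url)
    finder.feed(html_text)
    return finder.feed_urls
```

Because the page, not the feed file, carries the address, this route gives the aggregator the URL directly, though it still depends on the publisher declaring the type in the page.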

  7. Jaykul

    Maybe on linux hosts you can create an .htaccess file (if someone explains it to you) … but not on a windows server, and not if you’re a newb’ who only knows how to create an RSS file because your blogging software (which you probably had to have installed by someone else) does it automatically.

    That said, as I mentioned before ( http://jaykul.fragmentized.com/internet/feed_me_more.php ), on windows we can solve the first MIME-type problem CLIENT-SIDE by associating to a file extension, and simply publishing feeds with .rss extensions (if you’re feeding your rss through php or some other server-side script, you can set the MIME type that way).

    I’ll ignore the “annoyance” of teaching people what a MIME-TYPE is … long term, that’s not much of an issue, and anyway, eventually server administrators will start taking care of it ;-)

    However, that last issue is absolutely crucial (I dealt with this before, in the aforementioned post) and is a complete deal-breaker that I cannot see any way around. In the face of this, everything else is basically irrelevant. I mean: if your reader doesn’t know what the URL is, it doesn’t matter how many times I send you the stupid rss content, you can’t subscribe to it.

  8. Jaykul

    Somehow, this got left out of the post the first time, it was supposed to go on that blank line before “I’ll ignore …” ;-)

    I think we can probably solve the first issue by creating (on the client side) a mime-type associated with an extension.

    What’s all this about “messing with the protocol:// standard is a deal-breaker, too. This breaks lots of clients …” what client? Show me one, please? On windows at least, the scheme registration is handled by the system, and all the web/html/http clients just hand off the url to whatever application has registered for the scheme. If you don’t believe me, go read my post ( http://jaykul.fragmentized.com/internet/feedhttp_is_better.php ) and grab the script ( http://jaykul.fragmentized.com/binaries/scheme.wsf ) and try it!
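
For readers who have not seen what a Windows scheme registration looks like, here is a hedged Python sketch that builds the text of a .reg file for one. It is not the script linked above; the handler path is hypothetical, and the key layout (a "URL Protocol" value plus a shell\open\command key) follows the usual Windows convention for custom URL schemes.

```python
def reg_string(value: str) -> str:
    """Quote a string for a .reg file, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def scheme_registration(scheme: str, handler_exe: str) -> str:
    """Build .reg text that hands <scheme>: URLs to handler_exe."""
    # %1 is the placeholder Windows replaces with the clicked URL.
    command = '"' + handler_exe + '" "%1"'
    root_key = "HKEY_CLASSES_ROOT\\" + scheme
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + root_key + "]",
        "@=" + reg_string("URL:" + scheme + " Protocol"),
        '"URL Protocol"=""',
        "",
        "[" + root_key + "\\shell\\open\\command]",
        "@=" + reg_string(command),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical install path for an aggregator.
    print(scheme_registration("feed", r"C:\Apps\aggregator.exe"))
```

The point of the sketch is that the browser never needs to understand the scheme: the system looks up the command and launches it with the whole URL as its argument.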

  9. Matthew Ernest

    In the streaming mode that Danny describes, the player is receiving the data over stdin or some other socket-like connection with the browser or agent. The player still does not know a URL for the data it is receiving from the agent.

    When a browser/agent receives a MIME type for which it knows a handler application, pretty much the only options are to save to a file and invoke the handler with the filespec, or to open a pipe and invoke the handler at the receiving end of that pipe.

  10. Greg Reinacker

    One small comment on what Jaykul said…true, on Windows you could use a “standard” extension like .rss. However, if you want to generate RSS feeds on the fly with ASP or .NET or something (like I do on this site, rss.aspx), you’d need to map that additional extension in IIS – which is another thing you can’t do with most hosts.

  11. Roger Benningfield

    Anil and Danny: Hey, the world doesn’t revolve around Apache… no, really. :) I’ve never used anything but IIS for hosting, and probably never will… so .htaccess isn’t a solution. Of course, many IIS hosts seem to be moving toward supplying control panels that let the user configure things like MIME types, so it may not be that big an issue even then.

    Richard: A “standard extension” is a very bad idea. Personal experience says that it is *much* easier to get a host to configure a new MIME type than to get him to install an ISAPI filter so one can have foo.rss parsed by ColdFusion or whatever.

  12. Roger Benningfield

    Greg: The difficulty of mapping the extension is only the first problem with that approach. Once you’ve got that mapping, you can no longer serve static RSS without invoking the application server and its related overhead. Not a viable approach, IMO.

  13. Anil

    Roger, I do use IIS as well, so I understand that .htaccess isn’t a 100% solution. However, if you combine the fact that Apache hosts a disproportionate amount of the sites which currently produce syndication feeds with the fact that, as stated above, most feeds are generated by weblog applications or similar tools (which could easily be updated to present the correct MIME type or file extension) I think we begin to solve a lot of the problem.

    In short, people who run servers and edit files on servers are a lot more technically adept, on average, than people who run clients. By adding a new unregistered protocol, particularly one that breaks the protocol tradition by being about app use instead of transmission protocol, in the way that mailto: is broken, we risk foisting the difficulty of this problem onto nontechnical end users, instead of onto server administrators, who are better equipped to help solve these problems.

    I understand Greg’s bias towards preferring updates of client applications, but I don’t think that’s the most effective way to get broad adoption of a simpler way to handle subscription feeds.

  14. Danny

    Roger (long time no see, btw), I too use an IIS server for some of my material, including some SVG stuff which needs its own mime type (image/svg+xml). When the material is dynamically generated it’s easy enough to set the mime type programmatically (I’m using that particular server because it had the cheapest servlets support I could find). For static files though it was indeed a problem, but I mailed the admins and a couple of days later they’d made the change.

    So Problem 1. only has deal breaker severity when:

    a) the server is IIS (or similar)

    b) the RSS is created manually

    c) the sysops are unfriendly

    I agree that the mime type approach is not without its problems, but I believe the benefits outweigh them.
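
The "set the mime type programmatically" case for dynamically generated feeds can be sketched as a tiny Python WSGI application. The `application/rss+xml` type, the sample feed body, and the names `render_feed` and `feed_app` are assumptions for illustration; any server-side language offers an equivalent way to set the Content-Type header.

```python
from typing import Callable, Iterable

FEED_MIME_TYPE = "application/rss+xml"  # assumed type name


def render_feed() -> bytes:
    """Stand-in for whatever code generates the feed."""
    return (
        b'<?xml version="1.0" encoding="utf-8"?>'
        b'<rss version="2.0"><channel><title>Example</title>'
        b"<link>http://example.com/</link>"
        b"<description>Example feed</description></channel></rss>"
    )


def feed_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI application that labels its output as an RSS feed."""
    body = render_feed()
    headers = [
        # No server configuration involved: the script sets the type itself.
        ("Content-Type", FEED_MIME_TYPE + "; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, feed_app).serve_forever()` from the standard library will serve it. Static, hand-written files get no such hook, which is why they remain the hard case.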

  15. Derek Scruggs

    Re: mailto: not being registered –

    Is javascript: registered? I haven’t checked, but my guess is no. But that hasn’t stopped a lot of developers from making use of it. Given that feed: is for client-side integration only, I think the complaints about registered URI schemes are overblown. You can do anything you want on the client side if the host OS supports it.

    That said, I’ve long advocated a separate MIME type and continue to believe it should be supported. Everything I’ve seen says blog tools should *already* be serving RSS as application/rss+xml, though many of them don’t.

    At least the feed: scheme allows a relatively unsophisticated user to take advantage of advances in aggregator development, even if their blog tool of choice doesn’t.

  16. James Aylett

    Ignoring 1/1a, which are mostly to do with server-side support rather than client-side, surely 2 (figuring out the URI from the feed contents) is /less/ of an issue than getting a new URI scheme right and implemented well? Saying Atom doesn’t support something means it should be addressed now (since Atom is in no way ready for prime time yet); that RSS doesn’t support it now suggests an extension to resolve that. So everyone’s templates have to change to support it – but hey, they would for a new scheme too. And a self-identifying URI probably has other uses besides this anyway …

  17. Joerg

    I just found this thread, and here are some comments:

    mailto: is a registered scheme. See http://www.iana.org/assignments/uri-schemes for a list of registered schemes.

    javascript: is not registered, but it doesn’t really have to be because it isn’t used for URIs anyway. The whole “link” is handled by the client, without contacting any server, so there’s no need for a “real” URI.

    A feed: scheme I think is a bad idea. This isn’t the first time that a new file type has been invented and people start using it on the web more commonly, and it certainly won’t be the last time. You can’t be serious about wanting to introduce a new URI scheme for every new file type. When RSS becomes more commonly used, web hosting service providers will start configuring their servers to respond with an appropriate Content-Type header for files with a .rss extension, so problem 1 and 1a will become a non-issue. The same thing has happened with other new file formats (.png, .xml, .swf, .svg, …).

    Also note that if an application can (on Windows) register itself for handling a certain scheme, it can do the same with MIME types. Adobe’s Acrobat Reader for example by default installs itself to handle application/pdf content, so it’s certainly possible.

    Using feed: has other problems as well, for example how does the aggregator know what protocol to use for downloading the file? Yes, probably HTTP, but are you sure you want to make it impossible to use other protocols like HTTPS or FTP?

    Regarding problem 2, wouldn’t the most simple idea be to include the subscription url as an attribute somewhere in the RSS file? (Yes, that would mean having to extend the RSS standard, but I think that solution is far better than abusing established internet standards just because it looks like the simpler “solution” in the short term.)
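
A hedged Python sketch of that idea: if the feed carries its own subscription URL, a handler that only receives the file contents can still work out where to poll. The element used here, an Atom-namespace link with rel="self", is an assumption standing in for whatever extension would actually be agreed on; plain RSS as discussed in this thread has no such field.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed extension element: <atom:link rel="self" href="..."/>
ATOM_NS = "http://www.w3.org/2005/Atom"


def self_url(feed_xml: str) -> Optional[str]:
    """Return the subscription URL a feed declares for itself, if any."""
    root = ET.fromstring(feed_xml)
    for link in root.iter("{" + ATOM_NS + "}link"):
        if link.get("rel") == "self" and link.get("href"):
            return link.get("href")
    return None


SAMPLE_FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Example feed</description>
    <atom:link rel="self" href="http://example.com/rss.xml"
               type="application/rss+xml"/>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(self_url(SAMPLE_FEED))
```

The trade-off is the one raised in the comment: every publisher's template has to emit the extra element, and a feed that omits it or gets it wrong leaves the aggregator with nothing to poll.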

  18. Roger Benningfield

    Joerg: Unless I’m misunderstanding what folks are suggesting, the logic you apply to javascript: would apply to feed: as well. It’s a purely client-side thing… the user clicks a feed: link, it is captured by an aggregator, and then the aggregator goes from there, making calls via http: or whatever.

    Danny: I don’t personally have a horse in this race for precisely the reason you mention… all of my RSS is generated dynamically, so I can set the MIME type at will. If that’s the way things go, it’s no big deal for me. All I need to do is add support for to JournURL’s template language and I’m done.

    But there’s a simplicity to the feed: approach that strikes me as just about perfect for this space. It’s a purely client-side thing, so I don’t see any real-world harm coming from it, and it can be implemented and understood by virtually anyone.

    I *would* recommend something a little less generic, though. An unregistered scheme like javascript: is safe because it’s so specific… but someone might get the wild idea to do a “real” feed: scheme someday. Might be better to go with feedsub: or something similar.

  19. James Aylett

    javascript: is pretty hideous too, frankly.

    One of my biggest complaints with feed: is that (as mentioned before), it’s /harder/ to configure at the client side than MIME type support anyway. Windows has a centralised MIME registry (it hooks into file types, but we can’t have everything) where I can choose the handling app. Not so for scheme handling; it’s centralised (presumably), but without a convenient editing interface. If I run one main aggregator, but evaluate others that grab the feed: scheme, I’m going to go crazy. Little modal boxes asking me if I want to switch handling back to MachoAggregator (and do I want it to check in the future?) aren’t much help either. Please, let’s not end up with a solution that requires a helper app to switch between candidate feed: handlers …

    (Incidentally, why doesn’t someone write a plugin for MSIE, and another for Mozilla, that gives a “subscribe to this” context menu item with config that allows you to select your aggregator … that’d be cool. Safari would be a nice third browser to support … :-)

  20. Greg Reinacker

    James, I’m sure you’re aware that NewsGator already offers a context menu in IE for subscription…

    You’re also advocating a MIME type without addressing the very real problems.

  21. Danny

    This discussion is pretty mobile, but just to answer that last comment, re. the very real problems you list, when using the mime approach:

    1. is only a problem in a limited number of circumstances (see above). Alternative mechanisms are still available at the client side. In fact, any behaviour you trigger from a feed: URI could be triggered from receipt of mime-typed data (you also have more information on which to act).

    1a. There are plenty of people who wouldn’t know a URI scheme from a hole in the head. There’s plenty of room for misinterpretation of “feed:” (especially when what you’re using it for is “subscribe”). Making sure that your work follows good practice guidelines is one way of avoiding confusion.

    2. That the user agent doesn’t pass the URI is really an oversight of the agent. Creating a new (monumental) architectural feature to work around this is a ridiculous hack. A more sensible hack is simply to pass the URI in the feed. This has collateral advantages – e.g. the data is more portable.

  22. Zenab

    I need to know about IIS configuration for mapping the .aspx extension to .rss, so that readers can load RSS data generated by my .NET program.

