RSS enclosures

Dave Winer on RSS enclosures:

Chris Lydon has been doing a series of audio interviews on his weblog at Harvard. There are already over 25 interviews, representing 40 separate MP3 files. The archive is nearly 300MB. It’s a perfect application for RSS enclosures. [Scripting News]
Eek…any time I see an automatic 300MB download being a perfect application for anything, it gives me pause.

I’ve read Dave’s “How to support enclosures” document. It says aggregators should not download enclosures until the computer is idle, and gives some other guidelines for implementing them. The idea is that the enclosures will be there waiting for you when you get around to looking at them.
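For context, an enclosure in RSS 2.0 is a single element on an item carrying a `url`, a `length` in bytes, and a MIME `type`. A minimal sketch of what an aggregator sees when it parses one (the feed content here is invented for illustration):

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 item with an enclosure; url, length (bytes),
# and type are the element's attributes.
rss = """<rss version="2.0"><channel><title>Lydon interviews</title>
<item>
  <title>Interview 1</title>
  <enclosure url="http://example.org/interview1.mp3"
             length="9000000" type="audio/mpeg"/>
</item>
</channel></rss>"""

tree = ET.fromstring(rss)
for item in tree.iter("item"):
    enc = item.find("enclosure")
    if enc is not None:
        print(enc.get("url"), int(enc.get("length")), enc.get("type"))
```

The `length` attribute is what lets an aggregator decide, before fetching anything, whether an enclosure is worth the bandwidth.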

Here’s my big problem with this, though. The enclosure-aware aggregators I’ve seen thus far just happily download all of these enclosures in the background. There’s a good chance the user will never open most of these files…and yet we’re burning untold amounts of bandwidth to download them anyway. Bandwidth isn’t free, folks.

NewsGator will indeed support enclosures in the next release…but it will work a little differently than existing tools. We may not follow Dave’s recommendations on how to support enclosures to the letter, as our application is unique, and the user experience is different from most other tools…but we believe the user experience will be satisfying, and give users the flexibility to do what they want. Stay tuned.

6 thoughts on “RSS enclosures”

  1. Dave Winer

    No way is a 300MB automatic download even remotely part of the picture. What did I say that gave you that impression?

    Anyway, glad you’re going to support enclosures.

    Also I hope you’ll think about doing it the prescribed way. Your users will thank you. You’re not burning bandwidth if you do the download at off-peak hours and read the Payloads piece. The click-wait problem is what enclosures solve.

  2. Greg Reinacker

    All of the referenced docs imply that the aggregator will automatically download enclosures, albeit off-hours. If there’s 300MB of enclosures in a feed, down they will come.

    And this is potentially a waste of bandwidth even if it’s “off-hours”. It sounds like you might be thinking of the client’s bandwidth…I’m thinking of the publisher’s bandwidth also, which he’s paying for regardless of what time the download happens. And what if everyone’s aggregator started downloading a 30MB attachment at 2am? Suddenly that’s not off-peak any more for the publisher.

    You could meter bandwidth for these attachments on the publisher’s side, but that just creates a whole different resource problem if there’s a large amount of traffic.
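    To put rough numbers on the publisher-side cost (the subscriber count here is purely hypothetical):

```python
# Back-of-the-envelope publisher cost for one enclosure.
# The subscriber count is a hypothetical figure, not from the post.
subscribers = 10_000
enclosure_mb = 30
total_gb = subscribers * enclosure_mb / 1024
print(f"{total_gb:.0f} GB served for one enclosure")  # prints "293 GB served for one enclosure"
```

    Even at modest subscriber counts, a single enclosure can dwarf the feed itself, no matter what hour the downloads happen.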

    In any case, we will be supporting enclosures, and our users will have a choice as to how they’re handled. The no-click-wait scenario will certainly be possible.

  3. Dave Winer

    Okay, but I won’t be putting that 300MB archive on the Lydon feed; the purpose of the feed is to flow the interviews out one or two a day, and they’re all less than 10MB.

    Believe me, I’m not only thinking of the client’s bandwidth. I am hosting all the Lydon stuff on my server. Luckily I’m in an organization that does a huge amount of traffic so this now is the smallest of drops in the bucket and we really love what Chris is doing, and feel very evangelical about it and would be kind of happy if our server got swamped.

    Also, a little over a year ago I worked out a way to move this stuff over a P2P network, with the CTO at Morpheus. Even though we’re not still working on it (I got sick and moved to Harvard, don’t know what he’s doing now) I remember the plan and it will work. But first we have to get there. Right now it would be overkill to deploy a P2P network for this stuff and would slow the adoption, which is already way too slow (this feature first came out 2.5 years ago).

    If you’re really worried about this, the best thing to do is to set a pref saying what’s the largest thing you’re willing to download, and default it to something like 25MB. That should work pretty well for now.

    Also, regarding your question “what if everyone started downloading at 2AM”: your 2AM and mine are probably different, so I don’t see what the problem is. I guess to answer your basic question: yes, I have had enough time to think this through, and I honestly don’t think there’s anything controversial about the feature; once the flow really gets going, people will absolutely love it.

    Also, if you respond could you ping me via email with the URL.

  4. Richard

    The way I solved this in Enclosure Extractor is to let it look at skipHours and skipDays, along with some native elements developers can insert in their feed (without upsetting validators); it’s documented at http://www.lionhardt.com/ee/developerinfo.asp. The application also allows the user to set a limit on file size; if a file is larger than the set limit, the application simply ignores it and moves on.

    I reckon that’s how aggregators will (have to) do it as well; that way both sides can control bandwidth, which is nice not only for the end user but also for the content provider if they are hosted on a virtual domain.
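    The two checks Richard describes (the publisher’s skipHours plus a client-side size cap, using the 25MB default Dave suggests above) might look like this; the function and names are mine, not Enclosure Extractor’s:

```python
from datetime import datetime, timezone

MAX_ENCLOSURE_BYTES = 25 * 1024 * 1024  # 25MB default cap, per Dave's suggestion

def should_download(length_bytes, skip_hours, now=None):
    """Honor the feed's <skipHours> (GMT hours 0-23) and the user's
    size limit; skip the enclosure if either check fails."""
    now = now or datetime.now(timezone.utc)
    if length_bytes > MAX_ENCLOSURE_BYTES:
        return False
    return now.hour not in skip_hours
```

    This way the publisher steers when downloads happen and the user steers how big they can get.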

    Richard

  5. Simple Tool to Download RSS Enclosures

    There have been a few activities at the RSS enclosures front, notably an enclosure-to-iPod gateway Kevin Marks plans to build and for which Adam Curry has set up an experimental syncPod feed.

    I agree with Greg that automatic downloads of large enclosures are not necessarily a ‘good thing’, but I also didn’t want to miss out on the fun of some of the enclosures available. And since I long ago left Radio behind, my aggregators do not support enclosures. So I hacked something together in 20 lines of C#, and you now have a small tool to download enclosures. If you want to get the off-hour behavior that Radio also obeys, stick the tool in your Windows scheduler.

    From the readme file:

    GetRSSEnclosures

    usage: getrssenclosures url [url*]

    The program takes one or more urls of RSS files and scans those for items
    with an enclosure tag. Each of the enclosures it finds it will download into
    the current directory if the file does not exist yet. Progress and error
    messages are written into a file named getrssenclosures.log. The executable
    requires the .NET Framework V1.1.

    If one wants to emulate the way Radio UserLand handles enclosures, one
    should stick this program into the Windows scheduler to achieve downloads
    during off-hours.

    Caveat: This is a hack; there is very limited error checking and
    recovery, and it does not check whether the size and media type of the
    enclosure match those specified in the enclosure tag. Use at your own
    risk: this program may corrupt your file system and cause irreparable
    damage to your system. See the source files in getrssenclosures-src.zip
    for details.

    Todo: make this into a plug-in for Ephpod, make a version with
    a UI so people can pick which enclosures to download.

    Download: executable and source files. Enjoy!

    …[more]
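    The readme’s scan-and-fetch loop could be sketched in Python along these lines (the original tool is C#; the function names here are mine, not from its source):

```python
import os
import urllib.request
import xml.etree.ElementTree as ET

def enclosure_urls(rss_bytes):
    """Return the url attribute of every <enclosure> in an RSS document."""
    root = ET.fromstring(rss_bytes)
    return [enc.get("url") for enc in root.iter("enclosure")]

def fetch_enclosures(feed_url, dest_dir="."):
    """Download each enclosure into dest_dir, skipping files that
    already exist, as the readme describes."""
    with urllib.request.urlopen(feed_url) as resp:
        urls = enclosure_urls(resp.read())
    for url in urls:
        path = os.path.join(dest_dir, os.path.basename(url))
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)
```

    Run from a scheduler at off-peak hours, this gives roughly the behavior the readme recommends.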

  6. Shawn Miller

    The perfect way to download large amounts of RSS enclosures is with a tool that only downloads in the background when network bandwidth isn’t being utilized.

    NuParadigm’s DrizzleCast is a free podcast client for Windows that uses Microsoft BITS technology to download RSS enclosures using idle bandwidth.

    DrizzleCast version 1.1 was released today, adding the ability to view the progress of downloads as well as to open a file once its download has completed. View the screenshots to see these new features in action @ http://www.nuparadigm.com/Products/Toys/DrizzleCast/

