NewsGator feed retrieval intervals

I was just reading an article about Google Reader and their retrieval intervals, and thought this might be a good time to write about what NewsGator Online does. This is relevant not only for online users, but for anyone who is using one of our clients (FeedDemon, NetNewsWire, Inbox, Go!, etc.) in sync mode, since in that mode the clients retrieve content from our online system.

One of the more common questions/complaints we get is something about a feed not appearing to update in a timely manner. 99% of the time, it’s actually a problem with the feed – but I’ll come back to that.

There are about 2.5 million feeds in our system, and these feeds get divided into categories. They have fancy (and sometimes amusing) internal names, but for now I will describe them as follows. Also keep in mind these rules are subject to change, and in fact do change quite often to better optimize the experience for our users and our overall system load.

And before I get into all of this…note that feeds that ping our system will be updated and available typically within 60 seconds. The category the feed is in is largely irrelevant.

Category A: these are feeds that are needed by certain commercial syndication services customers with extremely tight SLAs – some of these SLAs guarantee content available within 2 minutes of publication in a feed. Feeds in this category are retrieved every 60 seconds. Exception – if a feed reliably pings our system with updates, its polling interval may be relaxed to that of a lower category; however, if the feed does not appear to ping us with every update, the 60-second interval remains in effect.

Category B: these are feeds with over 20 subscribers, or occasional feeds that for whatever reason are deemed “important” enough to keep in this category. Retrieval interval is 15 minutes.

Category C: these are feeds with 2-19 subscribers, and any feed that requires credentials to access. These feeds are retrieved every 1-2 hours depending on system load.

Category D: these are feeds with only 1 subscriber, which do not require credentials. If that subscriber is an “active user”, interval is 1-2 hours. If that subscriber is not very active, interval is 4-8 hours depending on load. The definition of “active” changes, but think of it as people who use the system daily-ish.

Category E: this is what we affectionately call the “penalty box.” These are feeds which have returned some kind of error, and they are “penalized” for it. For example – if a feed 404s, it is immediately penalized for 24 hours. A 500 server error? 4 hours. Other kinds of errors (including parsing problems) cause penalties of varying lengths, taking into account how many consecutive errors we see. If a feed continues to have errors for 90 days, it will be blacklisted and no longer retrieved at all…and the only way for a feed to get off the blacklist is for it to a) fix the error(s) and then b) ping us. [I should add that 410 (gone) is not considered an error; feeds that return a 410 are immediately removed and all subscribers are unsubscribed.]

Category F: this is somewhat of a grab bag of other cases. The most visible type of feed in this category is craigslist feeds – we retrieve them on a 48-hour interval. This sucks – for you, for me, for everyone – but the problem is craigslist will throttle and blacklist us, and they seem not to be interested in solving this problem with us (we’re also not the only ones with this problem). So 48 hours is roughly the minimum interval we can get away with and minimize the chances of getting blacklisted (which takes days to undo).
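To make the rules above a little more concrete, here is a rough sketch of how the categories map to polling intervals. This is not our actual code: the `Feed` fields, the function names, and the order in which the checks are applied are all assumptions made for illustration, and the real rules change often. It also treats a feed with exactly 20 subscribers as category B, which the description above leaves unspecified.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MINUTE = 60
HOUR = 60 * MINUTE


@dataclass
class Feed:
    """Hypothetical feed record; field names are illustrative only."""
    subscribers: int
    requires_credentials: bool = False
    tight_sla: bool = False                # needed by a syndication customer (category A)
    sole_subscriber_active: bool = False   # only consulted for single-subscriber feeds
    is_craigslist: bool = False            # the most visible category F case


def category(feed: Feed) -> str:
    """Pick a category; the precedence of these checks is an assumption."""
    if feed.is_craigslist:
        return "F"
    if feed.tight_sla:
        return "A"
    if feed.requires_credentials:
        return "C"
    if feed.subscribers >= 20:  # assumption: exactly 20 is treated as B
        return "B"
    if feed.subscribers >= 2:
        return "C"
    return "D"


def interval_range(feed: Feed) -> Tuple[int, int]:
    """Return (shortest, longest) polling interval in seconds."""
    cat = category(feed)
    if cat == "A":
        return (60, 60)
    if cat == "B":
        return (15 * MINUTE, 15 * MINUTE)
    if cat == "C":
        return (1 * HOUR, 2 * HOUR)
    if cat == "D":
        if feed.sole_subscriber_active:
            return (1 * HOUR, 2 * HOUR)
        return (4 * HOUR, 8 * HOUR)
    return (48 * HOUR, 48 * HOUR)  # category F (craigslist)


def penalty_seconds(status: int) -> Optional[int]:
    """Penalty-box time for the two errors quantified above."""
    if status == 404:
        return 24 * HOUR
    if status == 500:
        return 4 * HOUR
    return None  # other errors vary; a 410 is a removal, not a penalty
```

For example, `interval_range(Feed(subscribers=1))` gives the 4-8 hour range for a single inactive subscriber, and `penalty_seconds(404)` gives 24 hours expressed in seconds.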

By far the best way to help ensure timely updates to content is to encourage publishers to ping our system when they update (I talk about NewsGator’s ping endpoint here). A large number already do this – but there are some folks who do not. If they’re using FeedBurner, we’re already getting pinged; if they’re using another system, they may need to add NewsGator to their ping list manually. But typically, after a ping, updated content is available within 60 seconds. And as mentioned, a ping can even remove a feed from our blacklist.
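For publishers wondering what a ping looks like on the wire, here is a minimal sketch using Python's standard `xmlrpc.client`. It assumes the conventional `weblogUpdates.ping(siteName, siteUrl)` XML-RPC call used by most ping services; the endpoint URL below is a placeholder, not NewsGator's real address, and the helper names are made up for this example.

```python
import xmlrpc.client

# Placeholder only: substitute the real ping endpoint for your service.
PING_ENDPOINT = "http://ping.example.invalid/xmlrpc"


def build_ping_request(site_name: str, site_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (site_name, site_url), methodname="weblogUpdates.ping"
    )


def send_ping(endpoint: str, site_name: str, site_url: str):
    """Send the ping; needs network access and a real endpoint."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(site_name, site_url)


if __name__ == "__main__":
    # Print the request body without contacting any server.
    print(build_ping_request("Example Blog", "http://example.com/"))
```

Most blogging tools do this for you once the endpoint is on their ping list, so a script like this is mainly useful for checking what is being sent.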

We get a fair number of inquiries in the forums and elsewhere about feeds not updating; in nearly all of those instances, everything is actually working fine – the feed has usually fallen into category E for whatever reason. Something I’ve been thinking about is some kind of status page or something where someone can type in the name of a feed, and we’ll display status for that feed (including why it’s in the penalty box if it is)…we’ve resisted doing this because it’s just one of those things our users shouldn’t have to worry about.

26 thoughts on “NewsGator feed retrieval intervals”

  1. Pedro Melo

    Hi,

    Thanks for clarifying NewsGator operation regarding this.

    I for one would welcome some sort of notification when my feeds drop into the E category.

    Best regards,

  2. Andrew Bloomgarden

    I’d really appreciate it if I received some sort of notification when a feed 410s thus unsubscribing me. I don’t think this happens often, but I know that I wouldn’t likely notice for a month or so if a feed suddenly disappeared unless it’s one of my favorites.

  3. Brad

    Thanks for the explanation of this. I in fact just got a reply from NewsGator support about why it took so long to see updates to my (category C, possibly D) blog. Their answer was to disable syncing, which was an unfortunate solution. I’ve since turned on pinging so that I can reënable syncing.

  4. Brian

    And this is the reason why I don’t use the sync feature. I like being able to specify the amount of time between checks, and I don’t have to worry about any “penalty box” scenario (which only hurts us because our feed stops updating at all).

  5. Pingback: Greg Reinacker tells us how often NewsGator updates feeds

  6. Patrick

    If I’m using NetNewsWire in syncing mode, if I set a feed to “don’t sync” then it doesn’t follow these rules, right? In other words if I want to sub to a CL feed I can do it that way and get it updated once an hour?

  7. Pingback: Around the web | alexking.org

  8. Pingback: The devils in the feed details Life is grand

  9. DDA

    “We get a fair number of inquiries in the forums and elsewhere about feeds not updating; in nearly all of those instances, everything is actually working fine – the feed has usually fallen into category E for whatever reason.”

    Then things are *not* working fine; the user is confused or upset about something and the explanation is about how your system has decided their feed doesn’t get refreshed.

    I like the idea of syncing all my readers so I’m not reading the same stuff over and over. But I have important feeds that I want updated and being told, “Well, our system decided your feed had some issue so it won’t be refreshed when you want it to be” doesn’t cut it. So I turn off syncing since I can’t find a way to exclude one feed in NNW; while I can easily set a custom refresh interval, it is in *hours* but I’ve set the default feed refresh to be 30 minutes.

  10. Jo

    You’re effectively penalizing your users for something beyond their control, which just seems insanely stupid to me. A single 404 kills any updates for 24 hours? That’s crazy. Four hours for a 500? I could see that being acceptable after MULTIPLE 404s or 500s, but not after just one.

  11. Sebastian Lewis

    Jo, that’s the thing though: if the feed just keeps 404ing, then NewsGator would just be wasting bandwidth by continuing to let that feed 404 every time they do a refresh. It’s just easier to put it on a 24-hour interval until it returns so that they don’t hammer their servers with needless 404s.

    Sebastian

  12. tbelcher

    Mate, this is just bloody stupid! I’ve just wasted the morning trying to figure out why a couple of my feeds refuse to update. Couldn’t FeedDemon at least show some status icon on feeds that are causing it grief?

    Also the evidence is that your system puts some feeds in the too-hard basket for a lot longer than 24 hours.

    I have 2 feeds that have not been updated for the better part of a week. I look in the raw XML and see that the feed data is correct – the XML contains the latest entries. But your program refuses to show them. That’s just plain !@#$%%^-ing mad.

  13. Pingback: TPN :: The Global Geek Podcast » Blog Archive » FeedDemon now Working; What About Your Feeds that are Not Updating as they Should

  14. Pingback: Symphonious » More On NewsGator Syncing

  15. Geoff

    I’ve just found this post linked from the NewsGator forum, because I had a single feed that refused to update in NetNewsWire. The problem turned out to be that the server had, on roughly June 13th, returned an authorisation error (even though the feed doesn’t require authorisation). No updates had been received since then, suggesting that there’s a class of errors that will cause a feed to not be updated for much longer than 24 hours, or perhaps no longer updated at all.

    It would be enormously helpful if the error status of a synchronised feed on the NewsGator server could be propagated to the NetNewsWire client, so the user knows that they may need to force a refresh to get new entries. I only became aware of the problem at all because someone else subscribed to the same feed (through Google Reader) asked whether I’d seen the latest post… which of course led to the question: “Why aren’t you using Google Reader?” I like NetNewsWire, but I don’t like missing out on news and having no visibility of the reason.

  16. Selva

    How about the internal feeds stored in NewsGator Enterprise Server? Do they have categorization too? Is there any option in Admin to specify this or the time period to poll?

  17. gregr Post author

    @Selva – NGES feeds are all treated equally; there is no algorithm in place there to auto-adjust retrieval intervals. A system admin can specify the global retrieval interval, though, and NGES also has a standard XML-RPC ping endpoint at /ngws/xmlrpcping.aspx.

  18. Peter

    I have several feeds running through Newsgator. All seems to work fine, except one feed running from twitter to Pipes to newsgator.

    I display it in a script on my blog.

    That one is not updating correctly. I need to go into newsgator and refresh the feed in order to get updates…

    I would not be surprised if I am in category E (not the first time there is a problem with Pipes feeds)

    If so, I’d like to know indeed.

  19. Yvon

    Thanks for those explanations. I like your software and the idea of syncing my iPhone and my laptop, but this latency between feed updates is the reason I moved to Google Reader.

  20. Kit Sunde

    Thanks for explaining how it works. One of the party leaders’ blogs was under heavy load due to an ongoing court case and returned a 404, so I’ve missed a day of it (maybe not important to you, but it’s quite fundamental for me).

    This also explains why I’m not getting all of the feeds when a feed pushes more new items within a timespan than you catch.

    It’s even worse when you subscribe to a feed that uses the URL to filter tags so the chance of you getting a feed with more subscribers and thus have it update reasonably often is close to nothing.

    I’m turning off synchronization because of this; it’s way too flaky to go through your servers, even if I really love the client.

  21. New FeedDemon user.. and now EX-user

    Wow, that’s great, thanks a lot. Goodbye FeedDemon and sync POS. 2 days testing FD and syncing = really frustrated about why some feeds don’t get updated or get updated after several hours.

    Found the “online reader” page and noticed that about half of my feeds (~20) “last updated” statuses are from yesterday!! That’s right, no problems/errors, I guess they’re updated in 12H or longer periods. TOTALLY UNACCEPTABLE! No indication of this in my FeedDemon reader.. BAD! I guess this very slow update happens because these feeds are Finnish news/etc. feeds and these feeds don’t have many readers with NewsGator. So, summa summarum: NewsGator sync is ok ONLY if you’re reading popular international feeds that have many readers. Otherwise, DO NOT SYNC!!

    Funny how viewing a source of a feed in FeedDemon shows the latest news items but the feed doesn’t get updated when syncing/updating. Really lame. I’d have to login to NewsGator online MANY TIMES PER DAY and “ping” the “unpopular” feeds in case I want to be up-to-date with feeds. That’s just REALLY LAME!

    Thanks for the sync idea but no thanks, it doesn’t work the way users expect it to work. Your categories are to blame. Goodbye.

    (thank god that I found this blog post, saved me time.. now just a quick FD uninstall.. and off to see the Google Reader I keep on reading everywhere)

  22. Chris

    Finally I now understand why my twitter feeds don’t update. I thought perhaps it was just a matter of poorly written and badly maintained software, since the only way I could get it to update was to delete it from my list of feeds and then re add it.

