Limiting RSS Feed History Size

Posted: July 11th, 2014, 11:22 am
by sideoffries
I've recently started using a single RSS feed from an indexing site and doing multiple regex tests on that one feed. I've limited the size of the feed I pull, but it's very active. The sabnzbd RSS feed "history" (the Matched/Not Matched/Downloaded lists) has grown very rapidly in the last few days, and the "Not Matched" list is over 1500 entries now. As a result, accessing the RSS feed or making a change to the rules is now incredibly slow, as is starting sabnzbd.

So what causes entries to be removed from the feeds' history? I can see that rss_data.sab is currently 71M in size. So I stopped sabnzbd, renamed that file, then restarted, and not surprisingly the UI speed when accessing that RSS feed improved greatly ... and I also lost all the history for it. The problem seems to be in reading/processing that large file, but I see nothing that would allow me to clear or limit how much history is kept for a feed.

Is there any way to fix this short of nuking the file and losing the history? And if not, would nuking it cause any problems with sabnzbd?

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 1:24 pm
by shypike
SABnzbd keeps 3 days of entries.
This is done to prevent re-downloads when a site temporarily drops NZBs.
That happens enough to be a problem.
In your case it works against you, especially in combination with a lot of regexes.
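
For readers following along, the 3-day retention described above can be pictured with a minimal sketch. This is not SABnzbd's actual code: the function name `prune_history`, the `seen` key and the dict layout are all assumptions made up for illustration.

```python
import time

KEEP_SECONDS = 3 * 24 * 3600  # three days, as stated in the post above


def prune_history(entries, now=None, keep_seconds=KEEP_SECONDS):
    """Return only the entries seen within the retention window.

    `entries` maps a link to a dict with a hypothetical 'seen' key holding
    the time (epoch seconds) the entry was recorded. Layout is assumed.
    """
    if now is None:
        now = time.time()
    return {
        link: entry
        for link, entry in entries.items()
        if now - entry["seen"] <= keep_seconds
    }
```

With a retention rule like this, the history size is roughly the number of distinct entries the feed produces in three days, which is why a very active feed leads to a large list.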

Release 0.8.0 will use a different method of preventing double downloads,
and then keeping days of entries will no longer be needed.

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 1:56 pm
by sideoffries
OK, I understand what's going on. Thanks. It might be useful to have a way to disable the automatic testing against history that happens every time the RSS feed is first viewed or a filter rule is modified/saved, and instead let the user display the filter rule results on demand. From what I've noticed, changing a filter doesn't immediately load those changes to the queue anyway. That's only done on the next RSS processing cycle. It's the significant processing time after every view or rule update that's annoying.

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 5:31 pm
by shypike
Each time you change the filters, the current content is re-filtered so you can see the effect of the new filters.
Most of the time is spent on running the regexes. How many do you have?
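
To make the cost concrete, here is a rough sketch of what re-filtering a stored list amounts to. It is an illustration under assumptions, not SABnzbd's implementation: `refilter` is a hypothetical helper and the real filter types (Accept, Reject, Requires and so on) are reduced to a plain "any pattern matches" test.

```python
import re


def refilter(titles, patterns):
    """Split titles into (matched, not_matched) lists.

    Patterns are compiled once, case-insensitively, and every title is then
    tested against each pattern until one matches.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    matched, not_matched = [], []
    for title in titles:
        if any(rx.search(title) for rx in compiled):
            matched.append(title)
        else:
            not_matched.append(title)
    return matched, not_matched
```

In this simplified form the work grows with the number of stored entries multiplied by the number of active patterns, so every extra day of history adds to each re-filter.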

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 5:48 pm
by sideoffries
I'm doing a bit of preprocessing to reduce the number of steps in sab. I'm actually combining 2 feeds in an aggregator and filtering there for resolution and newsgroup. I don't think that matters in this context though, since we're not reading/processing the raw feed when the filters change, right? From there it's only a total of 7 tests on that preprocessed feed: 6 regex and one more that's just a "Requires", not regex. Two of the 6 regex are currently disabled, so I assume they aren't tested at all. Yet running through ~1500 historical entries takes several minutes. It actually runs against the raw feed fairly quickly according to the log. I have the aggregated raw feed limited to 200 entries. So it's just the "history processing" that's the issue.

sab is running on a pretty hefty machine (quad-core 3.2GHz Xeon, RAID-0), so I know it's not bottlenecked on hardware resources.

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 6:03 pm
by shypike
It needs a redesign.
0.8.0 also scans for episodes in titles, using regexes.
Performance is very poor with the current code.

Re: Limiting RSS Feed History Size

Posted: July 11th, 2014, 6:07 pm
by sideoffries
Understood.

So last question and then I'll leave you alone. Is there any harm in nuking the rss_data.sab file while sab is shut down and then restarting? I assume I'd lose history of course, and it might trigger a requeue of a past download; I don't care much about either if it fixes the long waits when editing/changing filters. At least until 0.8.0 is an option :)

Thanks for the explanations.

Re: Limiting RSS Feed History Size

Posted: July 12th, 2014, 1:48 am
by shypike
That will work fine.
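
For anyone who prefers to keep a copy rather than delete outright, a small sketch of setting the file aside is below. The helper name `set_aside` and the `.bak` suffix are made up for illustration, and where rss_data.sab lives depends on the installation, so the path is left to the caller. Only run something like this while SABnzbd is shut down.

```python
from pathlib import Path


def set_aside(data_file):
    """Rename data_file to '<name>.bak' in the same folder and return the new path.

    Raises FileNotFoundError if data_file does not exist. On POSIX systems an
    existing '<name>.bak' is silently replaced; on Windows the rename fails.
    """
    src = Path(data_file)
    dst = src.with_name(src.name + ".bak")
    src.rename(dst)
    return dst
```

Renaming the backup to its original name (again with SABnzbd stopped) restores the old history if needed.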

Re: Limiting RSS Feed History Size

Posted: July 17th, 2014, 7:40 pm
by sideoffries
I've found that the reason this is happening to me is that sabnzbd isn't purging history entries from nzbclub after 3 days, as your earlier explanation said it should. I deleted rss_data.sab 6 days ago, and I still have 6-day-old entries from nzbclub in the history lists as of now.

The aggregated RSS I'm using is from nzbclub plus a newznab-based site. The other site's entries are being purged as expected.

Re: Limiting RSS Feed History Size

Posted: July 18th, 2014, 1:47 pm
by shypike
Possibly it sees them as new entries, or it's just a bug.
I must re-check the code.