Duplicate detection not working properly in 4.5.2+
Forum rules
Help us help you:
- Are you using the latest stable version of SABnzbd? Downloads page.
- Tell us what system you run SABnzbd on.
- Adhere to the forum rules.
- Do you experience problems during downloading?
Check your connection in Status and Interface settings window.
Use Test Server in Config > Servers.
We will probably ask you to do a test using only basic settings.
- Do you experience problems during repair or unpacking?
Enable +Debug logging in the Status and Interface settings window and share the relevant parts of the log here using [ code ] sections.
Re: Duplicate detection not working properly in 4.5.2+
No hard feelings. As for guessit, if this scheme is indeed very common for anime (which I'm completely unfamiliar with myself), you could open an issue there and nicely ask them to add support for it.
Re: Duplicate detection not working properly in 4.5.2+
I see someone already mentioned this problem 4 years ago, and they only catered for it (https://github.com/guessit-io/guessit/pull/778) in something called Kyoo last year. So it seems they do know about it, but the version used by sabnzbd does not. Perhaps that version would be more suitable for sabnzbd as it caters for TV, movies and anime? Just a thought 
Re: Duplicate detection not working properly in 4.5.2+
It seems nobody is maintaining guessit regularly.
Unfortunately, there are not many options out there for us to use as a detection library.
Or we would have to vendor it, but I'm not looking forward to that.
Some patches I can hack into it at runtime, but not this one, unfortunately.
If you like our support, check our special newsserver deal or donate at: https://sabnzbd.org/donate
Re: Duplicate detection not working properly in 4.5.2+
hacking time:
The good:
Code:
>>> import guessit
>>> guessit.guessit("[SubsPlease] Blablabla S2E10 (1080p) [something]")
MatchesDict([('title', 'Blablabla'), ('season', 2), ('episode', 10), ('screen_size', '1080p'), ('release_group', 'something'), ('type', 'episode')])
>>> guessit.guessit("[SubsPlease] Blablabla S2-E10 (1080p) [something]")
MatchesDict([('title', 'Blablabla'), ('season', 2), ('episode', 10), ('screen_size', '1080p'), ('release_group', 'something'), ('type', 'episode')])
>>> guessit.guessit("[SubsPlease] Blablabla S2 E10 (1080p) [something]")
MatchesDict([('title', 'Blablabla'), ('season', 2), ('episode', 10), ('screen_size', '1080p'), ('release_group', 'something'), ('type', 'episode')])
So: ('episode', 10) in all cases. Good.
The bad:
Code:
>>> guessit.guessit("[SubsPlease] Blablabla S2 - 10 (1080p) [something]")
MatchesDict([('title', 'Blablabla'), ('season', 2), ('episode_title', '10'), ('screen_size', '1080p'), ('release_group', 'something'), ('type', 'episode')])
>>> guessit.guessit("[SubsPlease] Blablabla S2_-_10 (1080p) [something]")
MatchesDict([('title', 'Blablabla'), ('season', 2), ('episode_title', '10'), ('screen_size', '1080p'), ('release_group', 'something'), ('type', 'episode')])
No episode, but ('episode_title', '10'). The rest is the same.
If the guessit developers do not solve it, we might consider this:
if SABnzbd sees a season number, no episode from guessit, and an episode_title that is a number, then use the episode_title as the episode. In other words, switch the values of episode and episode_title.
I checked my sabnzbd.log, and
1) there was an episode_title for the Kakkou downloads, and it was always the number
2) there was never an episode_title for other downloads.
So it seems safe to do this.
So somewhere in https://github.com/sabnzbd/sabnzbd/blob ... #L656-L686 ... ah, I see SAB already does some fixing for other guessit results?
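The rule proposed above (season present, no episode, numeric episode_title) could be sketched roughly like this. It is only an illustration, not SABnzbd's actual code: `promote_episode_title` is a hypothetical helper name, and the sketch assumes the guessit result can be treated as a plain dict with the keys shown in the output above.

```python
def promote_episode_title(guess):
    """Return a copy of a guessit-style dict in which a purely numeric
    episode_title has been moved to episode (hypothetical helper)."""
    fixed = dict(guess)  # work on a copy, leave the caller's result untouched
    title = fixed.get("episode_title")
    if (
        "season" in fixed
        and "episode" not in fixed
        and isinstance(title, str)
        # isascii() guards against characters such as superscript digits,
        # which pass isdigit() but cannot be converted by int()
        and title.isascii()
        and title.isdigit()
    ):
        fixed["episode"] = int(title)
        del fixed["episode_title"]
    return fixed


# Shape of the "bad" result from above, trimmed to the relevant keys
bad = {"title": "Blablabla", "season": 2, "episode_title": "10", "type": "episode"}
print(promote_episode_title(bad))
```

Results that already have an episode, have no season, or have a non-numeric episode_title pass through unchanged, which matches the observation that other downloads never had an episode_title in the log.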
Re: Duplicate detection not working properly in 4.5.2+
Easy to check for a pure number or an empty string:
code:
test_strings = ['12345', '123a45', '1.23', '-123', '', '0', 'blabla 12']
for s in test_strings:
    is_digit = s.isdigit()  # True only if the string is non-empty and all characters are digits
    is_empty = s == ''
    print(f"String: '{s}', contains only digits: {is_digit}, is empty: {is_empty}")
result:
String: '12345', contains only digits: True, is empty: False
String: '123a45', contains only digits: False, is empty: False
String: '1.23', contains only digits: False, is empty: False
String: '-123', contains only digits: False, is empty: False
String: '', contains only digits: False, is empty: True
String: '0', contains only digits: True, is empty: False
String: 'blabla 12', contains only digits: False, is empty: False
Re: Duplicate detection not working properly in 4.5.2+
Beautiful!
Or even: schitterend!
Re: Duplicate detection not working properly in 4.5.2+
From what I can gather from the GuessIt pull request I mentioned, they say pretty much exactly what you are describing, i.e. 'Promote "episode_title" to "episode" when the title is in fact the episode number'. So if what you are suggesting for sabnzbd matches what guessit did for Kyoo, then yes, it does sound awesome.
Re: Duplicate detection not working properly in 4.5.2+
Merged. Will be in the next release!
Re: Duplicate detection not working properly in 4.5.2+
Thanks very much everyone 

