DNS loadbalancing / round-robin not functioning as expected

marky
Newbie
Posts: 2
Joined: June 11th, 2011, 7:03 am

Post by marky »

Version: 0.6.4
OS: CentOS 5.6 with python26-2.6.5-6.el5
Install-type: python source
Skin: Default
Firewall Software: None, iptables rules set to accept all.
Are you using IPV6? No
Is the issue reproducible? Yes
NAT: No
DNS servers tested: ISP, OpenDNS and Google Public DNS

SABnzbd is configured to use eu.news.astraweb.com with 15 connections. DNS resolves eu.news.astraweb.com to eight different IP addresses. Regardless of how many or how few connections SABnzbd is configured to use, all connections are established to the first IP returned by the DNS query.
$ host eu.news.astraweb.com
eu.news.astraweb.com has address 193.202.122.55
eu.news.astraweb.com has address 193.202.122.159
eu.news.astraweb.com has address 193.202.122.162
eu.news.astraweb.com has address 193.202.122.61
eu.news.astraweb.com has address 193.202.122.56
eu.news.astraweb.com has address 91.208.207.57
eu.news.astraweb.com has address 193.202.122.54
eu.news.astraweb.com has address 193.202.122.58
$ netstat -an|grep :119
tcp        0      0 X.X.X.X:40535        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40538        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40539        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40544        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40546        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40549        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40551        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40554        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40556        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40557        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40560        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40562        193.202.122.55:119            ESTABLISHED
tcp        0      0 X.X.X.X:40563        193.202.122.55:119            ESTABLISHED
I assume SABnzbd doesn't reorder the IP addresses returned by DNS and simply always uses the first one. This seems to be a well-known problem affecting many applications that use getaddrinfo() instead of the old IPv4-only gethostbyname(), so it is not limited to SABnzbd.

A bit of googling turns up two workarounds people have implemented. One involves modifying getaddrinfo() in glibc to behave like gethostbyname(), which not many are willing to do: it requires recompiling a custom package that affects every application on the system, and maintaining it whenever upstream releases an upgrade. The other, used for example by Gentoo's emerge to load-balance rsync mirrors, simply randomizes the order of the IPs after getaddrinfo() has sorted them.
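The second workaround could look roughly like the sketch below. This is not SABnzbd's or emerge's actual code, just an illustration of the idea. The function name shuffled_addresses and its resolver parameter are made up for this example (the parameter only exists so the function can be tried without a network), and the sketch assumes IPv4 only.

```python
import random
import socket


def shuffled_addresses(host, port, resolver=socket.getaddrinfo):
    """Return the unique (ip, port) pairs for host in random order.

    `resolver` is a hypothetical hook for testing; it defaults to
    socket.getaddrinfo.
    """
    infos = resolver(host, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); for
    # AF_INET the sockaddr is an (ip, port) tuple.
    addresses = sorted(set(info[4] for info in infos))
    # Discard whatever order the resolver imposed.
    random.shuffle(addresses)
    return addresses
```

Calling shuffled_addresses("eu.news.astraweb.com", 119) once per new connection and using the first entry should spread connections over all returned IPs instead of pinning them to one.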

The reason this troublesome change was originally made in glibc is explained here: http://lists.debian.org/debian-ctte/200 ... 00071.html.

I've been wondering for a while why my download speed varies so much, and I've pinpointed it to performance issues with a few of the IPs. I can't say for sure whether it's an ISP routing issue or a server capacity problem at the news provider. Depending on which IP address SABnzbd connects to, I get either 1 MB/s or 10 MB/s, but rarely anything in between (MB/s meaning megabytes per second). Changing the number of concurrent connections doesn't fix the problem.

Manually adding a separate server entry for each IP address that the DNS query for the news server returns gives me a solid 10 MB/s. This is a bad solution, however, as the list of IPs a news provider uses tends to change every now and then. There are other downsides as well, such as problems optimizing the number of connections, statistics, etc.

This manual hack does what I believe SABnzbd is intended to do but, due to underlying OS limitations, doesn't. When SABnzbd is configured to use more than one connection to a news server (as it usually is) and that news server has more than one IP address (as they usually do), only one of those IPs is actually connected to. Going through the list of all available IP addresses for each new connection, either round-robin or simply by picking one at random, would make this problem go away. For example, on Astraweb one slow server out of the eight IPs returned wouldn't have a big overall impact on download speed. Currently, if you end up connected to that slow server, you're stuck with slow download speeds for the duration of the entire download session.
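To illustrate the round-robin variant, here is a minimal sketch. The class name AddressRotator is invented for this example and is not part of SABnzbd; the IP list is simply the one from the host output earlier in this post.

```python
import collections
import itertools


class AddressRotator(object):
    """Hand out addresses one after another, wrapping around at the end."""

    def __init__(self, addresses):
        addresses = list(addresses)
        if not addresses:
            raise ValueError("need at least one address")
        self._cycle = itertools.cycle(addresses)

    def next_address(self):
        # Called once for every new connection to the news server.
        return next(self._cycle)


ips = [
    "193.202.122.55", "193.202.122.159", "193.202.122.162",
    "193.202.122.61", "193.202.122.56", "91.208.207.57",
    "193.202.122.54", "193.202.122.58",
]
rotator = AddressRotator(ips)
# Simulate opening 15 connections and count how many land on each IP.
spread = collections.Counter(rotator.next_address() for _ in range(15))
```

With 15 connections over eight IPs, every address is used and none gets more than two connections, so a single slow server would only hold a small share of them.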

While this seems to be a design problem at the operating system level, could you developers consider solving it on the application side? After all, patching the OS is not an option for most of us. I also doubt I'm the only one experiencing this, but many probably don't know the actual cause and just blame bad service from their ISP or NSP. The problem is much more evident on high-speed links, since on a slow connection even a slow server is often enough to saturate the link.

Alternatively, would anyone who understands Python programming be willing to help me modify the source so that whenever a new connection to the news server is established, it picks an IP at random from the list returned by DNS rather than just using the first one?
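Roughly, the behaviour I have in mind is sketched below. This is only an illustration and not based on SABnzbd's actual connection code. The function name connect_random and its connect parameter are invented here (the parameter is only there so the logic can be tried without a real server), and it assumes the list of (ip, port) pairs has already been obtained from DNS.

```python
import random
import socket


def connect_random(addresses, timeout=10, connect=socket.create_connection):
    """Connect to one of `addresses`, trying them in random order.

    `addresses` is a list of (ip, port) tuples. If an address fails,
    the next one in the shuffled list is tried.
    """
    candidates = list(addresses)
    random.shuffle(candidates)
    last_error = None
    for address in candidates:
        try:
            return connect(address, timeout)
        except socket.error as err:
            # Remember the failure and fall through to the next IP.
            last_error = err
    if last_error is None:
        raise ValueError("no addresses given")
    raise last_error
```

Trying the remaining IPs on failure also means one dead server in the list would not prevent a connection from being established.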

Thanks,
Mark
Last edited by marky on June 11th, 2011, 7:59 am, edited 1 time in total.
shypike
Administrator
Posts: 19773
Joined: January 18th, 2008, 12:49 pm

Re: DNS loadbalancing / round-robin not functioning as expected

Post by shypike »

Noted for a future release.