South African NTP server hammered by Chinese IPs


#1

For the past week I have been adding a server in South Africa to the NTP pool. I was prepared for a high QPS because the continent is served by only 27 active IPv4 and 15 active IPv6 servers, but when going through the IP logs I found that more than 75% of the NTP requests came from China.

Can this be an issue with the GeoDNS server, or is it possible that some Chinese operators use the Africa or South Africa pool as a default for their NTP queries? The issue is on IPv4. I haven’t added an IPv6 address for this server to the pool yet. I haven’t seen such a high percentage of Chinese queries on my European servers.
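
One way to arrive at a figure like the 75% above is to tally logged client addresses by country. A minimal Python sketch, assuming a hypothetical list of one source IP per request and a placeholder prefix table standing in for a real GeoIP database. The addresses are documentation ranges, not real log data, and the names `PREFIXES`, `country_of` and `country_share` are made up for illustration.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# Hypothetical prefix-to-country table; a real setup would query a GeoIP database.
PREFIXES = {
    ip_network("192.0.2.0/24"): "CN",
    ip_network("198.51.100.0/24"): "ZA",
}

def country_of(ip: str) -> str:
    """Return the country code for an address, or '??' if no prefix matches."""
    addr = ip_address(ip)
    for net, cc in PREFIXES.items():
        if addr in net:
            return cc
    return "??"

def country_share(ips):
    """Map each country code to its fraction of the logged requests."""
    counts = Counter(country_of(ip) for ip in ips)
    total = sum(counts.values())
    return {cc: n / total for cc, n in counts.items()}

# Made-up sample: three requests from the "CN" prefix, one from the "ZA" prefix.
sample = ["192.0.2.1", "192.0.2.7", "192.0.2.9", "198.51.100.4"]
shares = country_share(sample)
```

The result is only as good as the address-to-country mapping, so a stale database would skew the percentages.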


#2

I’m in the US and I see a significant amount of my NTP traffic (about 40% of packets) coming from China, which has me scratching my head too. You would think any Chinese overflow would go to the other Asia servers, or even be limited to the European pool, because of their proximity (lower latency and jitter) and number of physical pool servers.

Every few days I also see a crazy spike in traffic, briefly jumping to 4x or 5x normal levels. I haven’t checked the IPs, as I don’t normally log in that much detail, but I have a strong hunch it comes from China. It’s not merely the same IPs querying faster; it’s different IPs (because I rate-limit queries per IP).

For a country with almost 1.4 billion people, I know those few NTP Pool servers in the China zone get hit pretty hard (there is a recent thread with graphs and stats). I know there are a number of people active on the message boards with servers in Singapore. I wonder how much traffic those servers see in comparison, and whether the bulk of their traffic is from China.

Another unknown factor is the bandwidth settings for pool servers. The only public info we see is the number of servers, but we don’t know if their average bandwidth is 1 Mbps or 500 Mbps. I think it would be interesting to see either cumulative bandwidth or even just average bandwidth per zone.

When I was asking about port-related peculiarities on the NTP list, Steven Sommars posted this link to a paper: https://tf.nist.gov/general/pdf/2818.pdf where they logged traffic from a couple of NIST NTP servers for a month and analyzed it in all sorts of ways. A few pages in there is a heat map broken down by regional registry ownership, which is interesting. I only wish they had a table of numbers.


#3

Pure speculation, but sometimes software comes with a bad default configuration listing each continent zone like:

server africa.pool.ntp.org
server asia.pool.ntp.org
server europe.pool.ntp.org
server ...

So they blast traffic evenly at each continent zone, hitting servers in the continents with fewer servers much harder. It could be that the Europe and Africa zones get an equal share of Chinese traffic, but European servers see far less because there are so many of them.
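
To put rough numbers on that speculation, here is a small Python sketch. It assumes traffic is split evenly across the listed zones and then evenly across each zone's servers, which ignores any per-server weighting the pool DNS applies. The 27 Africa servers come from post #1; the Europe count and the total QPS are made-up placeholders, and `per_server_load` is a hypothetical helper.

```python
def per_server_load(total_qps: float, zone_servers: dict) -> dict:
    """Split total_qps evenly across zones, then evenly across each zone's servers."""
    per_zone = total_qps / len(zone_servers)
    return {zone: per_zone / n for zone, n in zone_servers.items()}

# africa: 27 active IPv4 servers (figure from post #1).
# europe: 2700 is a placeholder chosen for easy arithmetic, not a real count.
zones = {"africa": 27, "europe": 2700}

# 54,000 QPS is likewise an arbitrary illustrative total.
load = per_server_load(54_000, zones)
```

With these placeholder numbers each zone receives the same 27,000 QPS, but an Africa server ends up with 100 times the load of a Europe server, simply because the zone has 100 times fewer servers.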


#4

Also, lammert, just out of curiosity, what kind of QPS are you seeing, and what is your bandwidth setting in the NTP Pool Console?


#5

Thanks to both of you for your input on this issue. The NIST document is an interesting read, but I will have to set aside some time to study it. Badly configured client software that lists multiple continents as a “safe” option to always have a server nearby may be the reason, although it looks like the queries are coming from both mobile and wired broadband clients.

I had a Bangalore server for some time and it was hit pretty hard, with peaks of 30k QPS. I haven’t seen such high peaks at the South African server yet, but I am testing it now at a low bandwidth setting (384 kbps). With that setting traffic settles somewhere between 500 and 2000 QPS. The main problem I’m currently facing is the connection with the Los Angeles monitor server, which kicks the server out of the pool on a regular basis due to lost packets. Peering with local servers and my stratum 1 server in Europe is excellent though.
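
For comparison with the 384 kbps setting, the QPS figures above can be converted to one-way bandwidth. A rough Python sketch, assuming plain 48-byte NTP packets (no extension fields or authentication) over IPv4 without IP options, and ignoring link-layer overhead. It only converts the observed rates and says nothing about how the pool interprets the setting; `qps_to_kbps` is a hypothetical helper.

```python
NTP_PAYLOAD = 48   # bytes, basic NTP packet without extension fields or MAC
UDP_HEADER = 8     # bytes
IPV4_HEADER = 20   # bytes, assuming no IP options

def qps_to_kbps(qps: float) -> float:
    """One-way bandwidth in kbit/s for a given query rate (IP layer only)."""
    packet_bits = (NTP_PAYLOAD + UDP_HEADER + IPV4_HEADER) * 8
    return qps * packet_bits / 1000

low = qps_to_kbps(500)    # lower end of the range quoted above
high = qps_to_kbps(2000)  # upper end of the range quoted above
```

Under these assumptions 500 QPS is about 304 kbps and 2000 QPS about 1216 kbps in each direction.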

On a side note, for a little less than a month I have seen a decline in NTP traffic to my European servers. That has never been the case in previous years; for me traffic has always increased over time. I even upped two French servers from 250 Mbps to 1000 Mbps today to compensate for the lost traffic.


#6

I’m sure older clients from many years ago might have the different continents hardcoded, but from my understanding that has long since been remedied: even if you query a zone with zero servers, it will generate a list from a different zone (or zones). I think these days it is mostly clients querying the general global pool, with load balancing picking up the slack in areas that have few or no servers (like most of Asia).

As for your European servers, an additional 98 IPv4 servers have been added to the European zone in the last 60 days. I’m sure that took over a considerable share of the load.


#7

I think I’ve mentioned it before (and obviously not done the work yet), but I think the right approach is to deprecate the continent zones (and likely even the country ones). They’d still work, but they’d point to the “computer picks your zone” default zone.

I think they cause more trouble than they’re worth.

If there are use cases where people really need them, maybe there’s a variant of the “vendor zones” where people get a personal zone that can be limited to servers from a particular area (or even one’s own servers).


#8

I was going to ask you about this: whether the DNS now uses 100% geo-location or still bases its answers on the zone requested.

I think forcing every request to be a geo-lookup (assuming you are updating the GeoIP DBs regularly), regardless of the zone queried, is the way to go. In the end I think the vast majority of clients will be much better targeted, since a lot of the usage is “behind the scenes” in embedded and mobile devices, where the end user has no clue what the factory setting is, much less knows how to change it (or is even able to).

Can you geo-target so you aren’t just giving them random servers from a zone they are located in, but perhaps the “closest” servers to their location? Well, that might be something to implement further down the road… I think starting out just based on country, then continent, then adjacent continent(s) would probably be a good first step.
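
The country, then continent, then adjacent continent(s) order could look something like the following Python sketch. Everything in it is hypothetical: the server names, the adjacency table and the `pick_servers` function are invented for illustration and are not how the pool's DNS is actually implemented.

```python
# Hypothetical data: server name -> (country code, continent).
SERVERS = {
    "s1": ("ZA", "africa"),
    "s2": ("DE", "europe"),
    "s3": ("FR", "europe"),
    "s4": ("SG", "asia"),
}

# Made-up adjacency table, purely for illustration.
ADJACENT = {
    "africa": ["europe", "asia"],
    "asia": ["europe"],
    "europe": ["africa", "asia"],
}

def pick_servers(country: str, continent: str, minimum: int = 1) -> list:
    """Return the first tier (country, continent, adjacent) with enough servers."""
    tiers = [
        [s for s, (cc, _) in SERVERS.items() if cc == country],
        [s for s, (_, ct) in SERVERS.items() if ct == continent],
        [s for s, (_, ct) in SERVERS.items() if ct in ADJACENT.get(continent, [])],
    ]
    for tier in tiers:
        if len(tier) >= minimum:
            return tier
    # Last resort: fall back to every known server (the "global" answer).
    return list(SERVERS)
```

For example, `pick_servers("CN", "asia", minimum=2)` finds no server in the country and only one on the continent, so it falls through to the adjacent-continent tier. The `minimum` threshold is a design choice: it lets a thinly served zone spill over before its few servers are overloaded.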

Perhaps create a ‘debug’ subzone if someone really wants to force requests to one area for one reason or another, e.g. .debug.pool.ntp.org ??? Or maybe only allow it on the grundclock.com domain?

If a person’s GeoIP is way off, they probably have more issues than just time since most large sites are transparently geotargeting for performance too.