IPv6 for China Zone

Over the past few days, I’ve added four IPv6 addresses to the CN zone. Once the scores went above ten, I was a bit surprised to see I was getting next to no traffic against the new IPs (<100kbit/s).

Suddenly, today, all four servers got slammed with traffic for an hour and a half, before just as quickly dropping back to nothing. Over that 1.5-hour period, each server was pushing up to 10 Mbit/s and 10k requests/sec. The servers are now back to idling away, whilst having maintained high scores throughout.

Does anyone have any insight into what is going on here?
I was expecting high traffic, but don’t really get why I’m getting virtually no traffic outside of a single busy period.

According to Google, China currently has under 4% IPv6 adoption, so an IPv6 server in the cn pool can reach at most about 1 in 25 Chinese clients.

Yes, not ideal, but 4% of a big number is still quite a bit :grinning:

With ~800 million internet users, China would have about 3 times as many IPv6-capable users as the UK, for example.

Hi @burble. It’s weird that the “busy period” would last so long with no interruptions. The zone is re-generated every 10 minutes, so by chance you should end up in it regularly. (I don’t quite remember how the code works; for IPv6 with just 19 servers it might be “all the time” actually.)

I suspect some of the ISPs in China do dumb things with caching DNS records for a long time, or some variation of that.

Indeed, two things about the results don’t make sense to me.

  1. With a relatively small number of servers, I expected to be taking traffic more or less constantly, whereas I’m actually seeing long periods of nothing.
  2. The periods with traffic are the same across all four servers, so it doesn’t appear to be a case of each server being randomly allocated into the zone.

Attached is a screenshot of the traffic over a 12 hour period yesterday.

Random querying suggested that each of the four servers was regularly being published in DNS for the cn and asia zones, even during periods of little activity.
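For anyone wanting to repeat that kind of spot check, here is a rough Python sketch of polling a zone name and counting how often your own addresses appear in the AAAA answers. The function names, the polling interval, and the use of a `2.` zone name are my own assumptions for illustration, and a caching resolver between you and the pool will skew the counts.

```python
import ipaddress
import socket
import time


def resolve_aaaa(hostname):
    """Return the set of IPv6 addresses the resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, 123, socket.AF_INET6, socket.SOCK_DGRAM)
    return {info[4][0] for info in infos}


def count_appearances(hostname, my_addresses, rounds, delay_seconds=0.0,
                      resolve=resolve_aaaa):
    """Poll hostname `rounds` times and count how often each address is published.

    `resolve` is injectable so the logic can be exercised without the network.
    """
    # Compare parsed addresses so different textual spellings still match.
    counts = {ipaddress.ip_address(a): 0 for a in my_addresses}
    for i in range(rounds):
        published = {ipaddress.ip_address(a) for a in resolve(hostname)}
        for addr in counts:
            if addr in published:
                counts[addr] += 1
        if delay_seconds and i < rounds - 1:
            time.sleep(delay_seconds)
    return {str(addr): n for addr, n in counts.items()}
```

Something like `count_appearances("2.cn.pool.ntp.org", ["2001:db8::1"], rounds=30, delay_seconds=60)` would then give a crude appearance count over half an hour (the address here is a documentation placeholder, not one of my servers).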

So, I did a bit more digging, and it turns out the bursts in traffic are coming from a few operators in India rather than China (details are below, after the break). I guess this is a combination of higher IPv6 adoption in India, a dire lack of NTP servers in the India zone, and my servers being in the asia zone.

An outstanding question though is why the traffic arrives in such harsh bursts across different servers at the same time:
All four of the servers get hit at the same time and same period, yet I have other servers in the asia zone that don’t see this pattern at all. If this were DNS related, I might expect to see peaks at different times for each server in relation to its appearance in DNS queries. I’d perhaps also expect to see peaks across other servers in the asia zone.
DNS caching is plausible, but with 4 different operators involved I would expect them to have different DNS infrastructure.


Gory details

I started by logging IP addresses for about 50 minutes this morning. This included a 20 minute peak period with approximately 2mb/sec of traffic. Overall, I had 790k requests from 452k unique IP addresses. 7k of these were invalid addresses (e.g. they lacked a network part, such as ::1:2:3:4).

After stripping away the interface IDs, I used a script to look up the relevant network prefix and country code through whois. This resulted in another 2k addresses where I couldn’t immediately identify the prefix or code.
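As a rough illustration of the stripping step (not the actual script I used), the interface ID can be dropped with Python's `ipaddress` module by truncating each address to its /64. The function name and the choice of /64 are assumptions for the sketch; the whois lookup itself is left out.

```python
import ipaddress


def network_prefix(address, prefix_len=64):
    """Return the network containing `address`, or None if it is unusable.

    An address is treated as unusable if it does not parse as IPv6, or if
    its network part is all zeros (e.g. ::1:2:3:4).
    """
    try:
        ip = ipaddress.IPv6Address(address)
    except ValueError:
        return None
    # strict=False lets us pass a host address and get its enclosing network.
    net = ipaddress.IPv6Network(f"{ip}/{prefix_len}", strict=False)
    if int(net.network_address) == 0:
        return None
    return net
```

The resulting prefix (for example `2405:200:1234:5678::/64`) is what would then be fed to a whois lookup to find the covering allocation and country code.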

Finally, I sorted the results by country code looking at the overall figures and some 5 minute blocks before and during the peak. The top few results are below:

5 minutes of traffic before the peak

  1. 3678 VN
  2. 1011 TH
  3. 786 CA
  4. 605 IL
  5. 488 IN
  6. 451 US
  7. 374 CN

5 minutes during peak

  1. 204991 IN
  2. 1923 CN
  3. 1066 TH
  4. 476 VN
  5. 472 IL

Across entire 50 min period

  1. 718633 IN
  2. 17230 VN
  3. 16093 CN
  4. 10238 TH
  5. 5027 CA

Top prefixes

  1. 397464 - 2405:200::/29 IN - Reliance Jio Infocomm Limited
  2. 108502 - 2409:4000::/22 IN - Reliance Jio Infocomm Limited
  3. 100759 - 2402:3a80::/32 IN - Hutchison Max Telecom Limited
  4. 90870 - 2402:8100::/32 IN - Idea Cellular Limited
  5. 20171 - 2401:4900::/32 IN - Bharti Airtel Limited
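
The per-country tallies above could be produced along these lines. This is only a sketch: it assumes each log record has already been reduced to a (Unix timestamp, country code) pair, and the function name and record shape are made up for the example.

```python
from collections import Counter, defaultdict


def tally_by_block(records, block_seconds=300):
    """Group (timestamp, country_code) records into fixed-size time blocks.

    Returns a dict mapping each block's start time to a Counter of
    country codes seen in that block.
    """
    blocks = defaultdict(Counter)
    for timestamp, country in records:
        # Round the timestamp down to the start of its block.
        block_start = int(timestamp) - int(timestamp) % block_seconds
        blocks[block_start][country] += 1
    return dict(blocks)
```

Calling `.most_common(5)` on any one block's Counter gives a top-five list like the ones shown, and summing the Counters gives the figures for the whole period.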

OK, mystery solved.

There are very few IPv6 servers in the India zone, but there is a huge amount of traffic. At times all the servers would drop out (perhaps understandably, see below), leaving no servers in the zone. When this happens the traffic gets passed to the Asia zone and the peaks in traffic seen above were because there were no IPv6 servers available in the India zone at that time.

I’ve added 3 dedicated servers to the India zone. Whilst these are not ideally located, they should be ‘ok’ and have enough capacity to underpin the zone. This will prevent traffic spilling on to other struggling servers in the Asia zone, and mean any new servers in India don’t get pummelled the moment they enter the zone.

For a while yesterday, before I could add the other two servers, I had the only server in the India IPv6 zone. It was taking ~60mb/s and 70k requests/sec before somebody else joined with an AWS node that helped soak up the traffic. Whoever has that AWS node must be paying a small fortune in bandwidth charges, hats off to them.

@ask - Another way of dealing with underserved zones like this would be to continue to return servers from the region until there is a critical mass of local servers available. This would provide a much smoother transition than the current binary on/off and prevent a small number of local servers being swamped.
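
To make the suggestion concrete, here is a toy Python sketch contrasting the two behaviours. It is not the pool's actual code: both function names and the `critical_mass` threshold are hypothetical, and the "current" behaviour is only what the traffic pattern in this thread suggests.

```python
def current_selection(country_servers, region_servers):
    """Binary fallback: use the regional zone only when the country zone is empty."""
    return list(country_servers) if country_servers else list(region_servers)


def blended_selection(country_servers, region_servers, critical_mass=10):
    """Proposed behaviour: keep mixing in regional servers until the country
    zone has at least `critical_mass` servers of its own (hypothetical value)."""
    local = list(country_servers)
    if len(local) >= critical_mass:
        return local
    # Add regional servers that are not already in the local list.
    extras = [s for s in region_servers if s not in local]
    return local + extras
```

With the blended version, the first server to join an empty country zone shares the load with the regional servers instead of taking all of it at once.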

Geoff Huston just gave a presentation. I’ve only skimmed the slides, but they have a table of the ISPs with the most IPv6 users in the world. Jio is number 1 with 240 million users, and Vodafone (Hutchison Max Telecom) and Idea are near the top of the list too.

India probably has an eye-watering amount of IPv6 NTP traffic even if nothing’s wrong.

Thanks @mnordhoff , that helps put some context around the volume of traffic.

As far as I can tell, an AWS instance and a couple of DO droplets have been trying to support this 60mb/s+ zone on their own, and I really feel for whoever has been doing this as the bandwidth charges must be punishing.

I’ve done what I can to try and help, but the long term answer can only be to get more capacity.

As soon as our router vendor solves an open issue we’ve got, we’ll consider adding one of our nodes to the IPv6 asia pool. If we did it today, however, the bug would cause us serious IPv6 stability issues due to the number of unique IPv6 flows we would be seeing with high overall PPS :frowning:

Trying to prop up the Indian IPv6 zone is its own special kind of challenge. It would seem quite a few people have an app or device that attempts to sync the time at midnight.

Here’s what happens (times shown in the chart are BST):

[chart: successful responses per second across five servers]

The vertical axis shows the total number of successful responses, per second, across five servers. That’s a peak of 300k/sec, 150mb/sec, and a total of about 25 million responses in the 10 minute period. A second, smaller spike, about half the size, also occurs at 2am IST.

For your amusement, a full 24 hours in the zone looks like this (again, times are BST):