Overcoming the Great Obstacle of NTP in China

TL;DR

#### Skip all of these paragraphs and jump directly to the end of the article to read the most important part, which needs your help.


Who Am I?

I’m a free software advocate and a member of the Beijing GNU/Linux User Group. I love to play with servers and have 7 years of experience on desktops. I own 6 virtual servers, which I use for experimenting with different applications as a hobby.

Why Am I Here?

Recently, I read an article published by LWN.net introducing the NTP Pool system, and out of personal interest I also did some research of my own on the history of the NTP community. I was amazed (but not surprised) that time for everyone around the world is provided by only a handful of developers plus many volunteers.

I know the Pool still needs a lot of non-American and non-European servers. In particular, there are only 8 pool servers in China, a country with a population of over a billion.

Status of NTP in China

Currently, the NTP Pool in China is unreliable even for sysadmins and power users: “cn.pool.ntp.org” hardly works, and some sysadmins have simply given up using it. They probably don’t know that there are only 8 servers in the China Pool.

Most people in China use the default servers provided by the vendor of their system, mainly Microsoft (for Windows) and Apple. This works, but the latency is a little high. Occasionally, a few users look for a good local public NTP server as an alternative, but none exist.

Some universities and colleges provide NTP servers, but they have no intention of keeping the service continuously available. In some cases, those servers are only meant to serve the school network, and their IP addresses were exposed to the Internet by accident.

The National Time Service Center of the Chinese Academy of Sciences used to provide a public server, but it simply disappeared from the Internet. Some blame the government, but it is quite understandable given how public servers have been abused; even NIST cannot cope with it anymore…

Campaigning People to Join

FOSS/Linux groups can be found in a handful of universities in China. Some are quite active, moderately resourced, and already provide excellent free and open source mirrors for dozens of projects.

A few independent, city-based groups can also be found. The Beijing GNU/Linux User Group is the most active of them. We can publicize the NTP Pool and encourage individuals and groups to contribute servers. Since servers have low bandwidth requirements, it is relatively easy for people like me to join the pool in 20 minutes, and things will get much better.


First Experiment to Join the Pool

In November, I set up one of my VPSes in China to run a public NTP service and added it to the NTP Pool. However, the NTP Pool monitoring station could not reach it at all, and the score kept dropping, down to -100.

But its connectivity was perfectly okay within China. Later, after I switched off a P2P network system (which I was also running for testing), the server immediately became reachable by the monitoring station, and the score started to rise and looked pretty good.

So it apparently started to work. It looked as if the P2P system had been consuming too much bandwidth, but strangely, it used very little traffic, and the limit was far from being reached…

A Weird Phenomenon

The score continued to rise. When the score exceeded 10 and the server started to be used by the Pool system, I noticed the score started to drop again. I queried my NTP server from a VPS in Japan, and the server was clearly unreachable. But it was still working within China, at least from my own connection!

I thought maybe the bandwidth I had set was too high, so I changed it to the minimum, but that didn’t help much. Weirdly enough, once I established a single TCP connection between the Japanese VPS and my server in China, NTP became reachable at once!
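For anyone who wants to repeat this kind of check from a VPS outside China, here is a minimal sketch in Python. It is not the script I used; the function names are made up for illustration, and the TCP port to probe is an assumption (any port the server accepts connections on would do). It sends one bare SNTP client request over UDP and, separately, opens a short-lived TCP connection.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request(version=4):
    """Build a 48-byte SNTP client request (LI=0, VN=version, Mode=3)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as Unix time (float)."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    # The transmit timestamp occupies bytes 40..47: seconds and fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def ntp_query(host, port=123, timeout=3.0):
    """Send one request over UDP; return server time, or None if unreachable."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (host, port))
            data, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and resolution failures
            return None
    return parse_transmit_time(data)


def tcp_probe(host, port, timeout=3.0):
    """Open and close one TCP connection; return True if it succeeded."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

To compare the two cases, call `ntp_query("ntp.example.net")`, then `tcp_probe("ntp.example.net", 22)`, then `ntp_query` again. Both the hostname and port 22 are placeholders. A single lost UDP packet proves nothing, so the query should be repeated several times before drawing conclusions.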

The Great Wall of China!

Finally, I realized that the score graph on the NTP Pool site showed a sawtooth wave around score 10. Once the score rose above 10, the server immediately became unreachable from the monitoring station and was kicked out. Then the score started to drop, and later the server became reachable again.
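The sawtooth can be reproduced with a toy model. The sketch below is only an illustration of the hypothesis, not the Pool's real scoring code: the update rule (`score * 0.95 + step`, with +1 for a successful check and -5 for a timeout), the threshold of 10, and the cooldown of 3 checks are all assumptions chosen to mimic the graph.

```python
def simulate_scores(checks, threshold=10.0, cooldown=3):
    """Toy model: the server becomes unreachable for `cooldown` checks
    every time its score rises above `threshold`."""
    score = 0.0
    blocked_for = 0  # remaining checks during which the monitor times out
    history = []
    for _ in range(checks):
        if blocked_for > 0:
            step = -5.0  # assumed penalty for a timed-out check
            blocked_for -= 1
        else:
            step = 1.0  # assumed reward for a successful check
        score = score * 0.95 + step
        if step > 0 and score > threshold:
            # The server is now handed out to clients; in this model that
            # is the moment it stops being reachable from abroad.
            blocked_for = cooldown
        history.append(round(score, 2))
    return history


if __name__ == "__main__":
    for i, value in enumerate(simulate_scores(60)):
        print(i, value)
```

Under these assumptions the score climbs slowly to just above 10, collapses within a few checks, and then climbs again, which matches the shape I saw. The model says nothing about why the server becomes unreachable.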

Gotcha! This is exactly how international traffic normally behaves in China! The exact details of how it works remain non-deterministic and a puzzle, but the main features are clear.

  • First, the outgoing capacity is really limited, partly to make it easier to control the flow of information.
  • Second, the Great Firewall actively interferes with all types of traffic. Currently, the government is less active in blocking specific servers and protocols than in the old days; it has switched to a random jamming approach.
  • Third, these two factors can be combined. For example, if the outgoing links are too busy, interfering with traffic that looks like VPNs (e.g. QoS-ing it to the lowest priority) seems to be a convenient choice.

So if you have a connection issue, it is hard to distinguish between state censorship (the Great Firewall), ISP throttling, and an infrastructure overload or malfunction.

Unusable UDP

And normally, if stable transmission cannot be guaranteed even for TCP, UDP packets are basically ignored completely in international traffic.

Back to my NTP server: once the score rose above 10, my server started to send UDP packets across the international links, and then the mysterious throttling mechanism kicked in and made the server unreachable from outside China.

After a cooling period, the throttling stopped and the server became reachable again. Establishing a TCP connection can also temporarily bypass the throttling. The throttling is mysterious and seems unrelated to traffic volume: sometimes it happened at 30 KBps, and sometimes we didn’t see it even at 300 KBps.

Overcome the Obstacle

I believe this UDP throttling is why there are only 8 servers in the China NTP Pool, of which normally only 3 are online. It prevents the NTP servers from serving the Chinese Internet, and prevents newcomers’ servers from working in the pool.

We must do something different if we want to overcome the issue. The only possible solution I have come up with is to add a monitoring server within China, but I don’t even know whether that is possible…

Alternative Explanation

Or, as Ask has said, it is also possible that the traffic was simply too large: when the server started to be used by the Pool, a burst of traffic hit it, and the monitoring station failed to reach it at all. By the time I ran my own tests, the server had already been kicked out of the Pool, so I saw a working server.

I may have wrongly blamed the Great Firewall. We need to do more work to determine the real reasons.

Current Status

I have written to Ask, and Ask suggested doing more testing on a Chinese server. This weekend I’m going to temporarily give Ask root access to two VPS servers in China for testing purposes. Once the problems have been identified, the servers can be permanently donated to serve the community.

What is everyone’s opinion?

@tomli It seems more likely to be related to load; however, the GFW could also be having some impact.

Once more servers are in the pool (see this post) then we might have a better idea what the limitations or issues are.