Adding servers to the China zone

OK, added to CN pool.

https://web.beta.grundclock.com/scores/118.122.35.10
Looking at the result from the Zurich station, it seems we need a monitor station in China.
Are there any instructions for setting up a monitor station? I want to set one up for testing purposes.

Hey, getting back to this slightly old Topic, I have a server to be added. If that works out with the load, I might have another one.
https://www.ntppool.org/scores/95.216.144.226
https://www.ntppool.org/scores/2a01:4f9:c010:1625::1

Thanks.

I'm glad to say my company, Tencent (a Chinese tech company), is serving the CN pool now.
Five 1000 Mbps servers with GPS sources.

We need some time to optimize the network and fix an incorrect stratum issue.

https://www.ntppool.org/user/cate83vctnviet3dbbzj

This is amazing, thank you @maxmeng! I notice from the graphs today that there are periods where the monitoring is scoring them low. I know the name Tencent: you're a huge operator, so I doubt the problem is at your end.

@ask: is the problem here that the monitoring vantage point is outside China, and any packet loss is unfairly penalising their scores?

This is an instance where I wish the pool was working with the RIPE Atlas team; they have probes in almost every country and could query NTP servers for a more accurate local reading.

Heck, if RIPE doesn't donate the resources, I would be willing to at least donate my credits to see something like this happen.

Using RIPE Atlas would be really interesting, indeed. (For China specifically their coverage is pretty limited though).

The work with RIPE Atlas is building something to use the API and then figuring out how to cancel out "noise" appropriately (Atlas will always find problems, so you have to figure out which are noise and which are "real", or at least what the reasonable thresholds are).
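To make the "noise versus real" idea concrete, here is a minimal sketch of one possible approach: require a quorum of probes to get a reply, and judge the time offset by the median so that a single bad probe cannot condemn a server. It does not call the RIPE Atlas API at all; the function name, the input shape (one offset per probe, `None` for no reply), and every threshold are assumptions made up for illustration.

```python
from statistics import median

def classify(offsets_ms, max_offset_ms=100.0, min_probes=3, quorum=0.5):
    """Classify one NTP server from several probe measurements.

    offsets_ms: one entry per probe; a float offset in milliseconds,
    or None if that probe got no reply. All thresholds are hypothetical.
    """
    if len(offsets_ms) < min_probes:
        return "insufficient-data"
    replies = [o for o in offsets_ms if o is not None]
    # Too few probes got an answer: treat as a reachability problem.
    if len(replies) / len(offsets_ms) < quorum:
        return "unreachable"
    # The median ignores a minority of wild outliers (noisy probes).
    if abs(median(replies)) > max_offset_ms:
        return "bad-time"
    return "ok"

print(classify([1.0, 2.0, 900.0, None]))   # one outlier and one lost reply
print(classify([None, None, None, 5.0]))   # most probes got nothing back
```

The hard part described above is still choosing the thresholds; the sketch only shows where they would plug in.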

(Other work is having the monitoring system deal better with many more test probe results without using too many resources or making the system more fragile; I've been fussing with that for way too long.)

@maxmeng, this is amazing, thank you so much!

Any update on my server?

We have a student-administered datacenter at Kajaani University of Applied Sciences. Currently we are looking for real-world usage for our servers and bandwidth. The China pool would be an ideal way to test our 1 Gbit uplink and firewall.

You can add https://www.ntppool.org/scores/195.148.70.12 to the China pool.

Please also add my server to the China zone:

https://www.ntppool.org/scores/94.130.49.186

You are welcome to add my servers as well: https://www.ntppool.org/user/pb8yvobuaj4oscre4yfw
I have more than enough bandwidth and can easily get more. I'm going to add a server with 10G transit in a few months.
As you have clearly got a lot of offers, you may only want to add my two German ones if you have enough already, but I am happy for you to add all of them.

My theory: monitoring of nodes, both those serving the CN zone from outside China and those inside China, is broken. Monitoring surely has to treat "behind-GFW" as a separate Internet; otherwise the CN zone will be completely unstable in the face of any packet loss crossing the GFW.

It feels like it has been almost a year since the zone started collapsing, and we're still heroically throwing extra nodes around the world which flit in and out of the zone, rather than building on a stable foundation?

Did @ask ever answer whether the monitoring for CN could be "more forgiving"? Because the situation is going to keep on going (brave volunteers stepping up, monitoring taking some within-China nodes out of the pool, volunteers getting punished) until the monitoring actually reflects the reality of the zone.
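To get a feel for what "more forgiving" could mean in numbers, here is a rough sketch. It assumes a score that decays each check and then gains or loses a fixed step (`score = decay * score + step`), with a cutoff of 10 for being served in the zone. The constants below (+1 for a good reply, -5 for a timeout, decay 0.95, cap 20) are assumptions for illustration and may not match what the production monitor does; the long-run figure also ignores how the cap affects the dynamics.

```python
def steady_state_score(loss, good_step=1.0, timeout_step=-5.0,
                       decay=0.95, cap=20.0):
    """Approximate long-run average score for a given packet-loss rate.

    Assumed update rule: score = decay * score + step.
    All constants are illustrative assumptions, not the monitor's own.
    """
    expected_step = (1.0 - loss) * good_step + loss * timeout_step
    return min(cap, expected_step / (1.0 - decay))

def max_tolerable_loss(threshold=10.0, good_step=1.0,
                       timeout_step=-5.0, decay=0.95):
    """Loss rate at which the long-run score equals the threshold."""
    needed_step = threshold * (1.0 - decay)
    return (good_step - needed_step) / (good_step - timeout_step)

print(f"score at 10% loss: {steady_state_score(0.10):.1f}")
print(f"loss tolerated, timeout step -5: {max_tolerable_loss():.1%}")
print(f"loss tolerated, timeout step -2: {max_tolerable_loss(timeout_step=-2.0):.1%}")
```

Under these assumed constants, a node whose only fault is roughly one lost monitoring packet in twelve would hover around the cutoff, and softening the timeout penalty roughly doubles the loss it can absorb.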

This has been going on for half a year. What help can we offer you, @ask, and the wider NTP Pool admins? I feel like I keep on asking this but hear nothing back, whether it's on this forum or by email, and whether it's offers of DNS anycast nodes, monitoring VPSs in different countries, etc. I'm starting to get the feeling that these offers are not wanted...

I know he was talking about volunteers for additional monitoring servers in another thread.

The source code for the ntppool & geodns is on GitHub; anyone can contribute to make improvements.

https://github.com/abh

There are a lot of tweaks and features people ask for, but all of that requires quite a bit of time to code & test.

That reminds me that I was going to make a monitoring server. I think I installed the software but never actually set it up. I should get on that.

I'd love to contribute code. But:

  • Do we agree on what the problems might be for the CN zone? Do we think that what is happening is that nodes in the zone get punished, drop out, others take over the load and drop out, but meantime the first lot to have been kicked out come back? If so, the problem isn't the nodes, but the fact that they never get the chance to work together to serve the load of the zone.
  • Of course we want to run tests, but how do we even run tests at scale?
  • How quickly would a modification to how monitoring works, submitted as a PR on GitHub, actually get deployed to a production environment where it might be useful?

Piling more nodes into the zone has completely failed. Here we are, almost a year later, and the zone keeps on collapsing. By repeating the same actions, without changes, we will continue to fail.

Suggestions of "Just submit a PR" are BS. I'm not asking for "tweaks" or "features". I'm asking the project architect(s), and indeed the wider community, to help work out how we can help the project fulfil its core goal of serving NTP. That means problem management 101:

  • agreeing what we think is wrong, based on all the evidence we have (and most of us just running nodes have nothing but the output of our NTP daemon to go by: more "datum" than "evidence")
  • deciding an action plan
  • carrying out that action plan
  • seeing if what we did fixed it

If the action plan is "we need 100 nodes" then we've got a problem: until you have all 100 nodes, anybody chipping into the pool for CN is going to get battered, their node will be removed, and possibly they will grow weary with the project and take their freely donated resources with them. You'll never reach 100 nodes this way.

If the action plan is "we need to fix monitoring" then it'd be great if we could agree what needs fixing, where to get those resources and developer time, and who can roll it out into production.

So, here's my stab in the dark:

The problem isn't the number of nodes servicing the zone, or how much traffic they can serve. The problem is that the way they are monitored is causing the zone to crumble. My proposal is that CN node operators who believe their node(s) are solid can opt out of being removed from the zone automatically by the (what I believe to be broken) NTP Pool Project monitoring. However, we absolutely must let those operators remove their node instantly from the zone rather than having to wait ~4 days, in case they are getting so battered that they need to withdraw.
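As a conversation starter only, the rule I have in mind could be sketched like this. The `pinned` and `withdrawn` flags, the `Node` type, and the threshold of 10 are all hypothetical names and values invented for the sketch; nothing here reflects how the pool's code is actually structured.

```python
from dataclasses import dataclass

@dataclass
class Node:
    score: float             # latest monitoring score
    pinned: bool = False     # hypothetical: operator vouches for this node
    withdrawn: bool = False  # hypothetical: operator pulled it, effective now

def in_zone(node: Node, threshold: float = 10.0) -> bool:
    """Decide whether a node should be handed out for the zone."""
    # The operator's withdrawal always wins and applies immediately.
    if node.withdrawn:
        return False
    # A pinned node is not removed by monitoring, whatever its score.
    if node.pinned:
        return True
    # Everyone else keeps the usual score-based behaviour.
    return node.score >= threshold

print(in_zone(Node(score=3.0, pinned=True)))                   # True
print(in_zone(Node(score=18.0, pinned=True, withdrawn=True)))  # False
```

The ordering matters: checking `withdrawn` first is what lets an operator who is getting battered leave the zone instantly.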

Want me to turn that into a PR? I'll do that, gladly. But I'd love for some of the people who this issue affects to kick around the idea, and especially @ask to weigh in whether he's just going to reject it outright, before I spend more of my time on this project for naught.

This makes me think that we need to find a way to try and kickstart the zone. Maybe send all CN zone requests to the global pool until a majority of servers in the CN zone, including servers located outside China that have been added to the zone, have recovered to the point where they are back in the pool. Then slowly move CN zone requests back to the CN zone pool. It's not a long-term solution, or really any good, but it may be the best temporary solution available.

In which case, the DNS cycling must be a lot faster, because even with your node on the lowest setting (384 kbit/sec) you get bursts of up to 20 Mbit/sec of NTP traffic. Not everyone in the pool can afford or cope with that level of traffic. And I would argue that none of them consented to it when they added their node to e.g. UK or CH or DE.
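To put those two figures in packet terms, here is a back-of-the-envelope calculation. It assumes plain NTP packets with no extension fields over IPv4 (48 bytes of NTP payload, 8 bytes of UDP header, 20 bytes of IP header), counts one direction only, and ignores link-layer overhead.

```python
NTP_PAYLOAD_BYTES = 48   # basic NTP packet, no extension fields
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20   # no IP options
PACKET_BITS = (NTP_PAYLOAD_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES) * 8

def packets_per_second(bits_per_second: float) -> float:
    """Packets per second in one direction at a given bit rate."""
    return bits_per_second / PACKET_BITS

configured = packets_per_second(384_000)   # the 384 kbit/sec setting
burst = packets_per_second(20_000_000)     # the 20 Mbit/sec bursts

print(f"{configured:,.0f} packets/sec at 384 kbit/sec")
print(f"{burst:,.0f} packets/sec at 20 Mbit/sec")
print(f"burst is about {burst / configured:.0f}x the configured rate")
```

So a 20 Mbit/sec burst is on the order of tens of thousands of queries per second, around fifty times the configured figure.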

You're right about that. It may not be the best idea after all. Here is my question: suppose the China zone were implemented such that the score of every server in the zone was always 20. I'm not suggesting that we do that, but seeing how many requests there are, I wonder whether even with the entire zone active the servers would be able to handle the requests. Perhaps what we really need to do is push strongly for some alternate solution, whatever that may be. (Personally I'm looking forward to ISPs hosting stratum 3 servers and using the NTP DHCP option, but basically no one supports receiving it and no one sends it, so that is not the right thing to do right now.)