Great, added 22.214.171.124 and 2400:8901::f03c:91ff:fe44:131b to CN zone. Thanks.
OK, added to CN pool.
Looking at the results from the Zurich station, it seems we need a monitoring station in China.
Are there any instructions for setting up a monitoring station? I want to set one up for testing purposes.
Hey, getting back to this slightly old topic: I have a server to be added. If that works out with the load, I might have another one.
I’m glad to say my company, Tencent (a Chinese tech company), is serving the CN pool now.
5 servers at 1000 Mbps each, with GPS as the time source.
We need some time to optimize the network and fix an incorrect stratum issue.
This is amazing, thank you @maxmeng! I notice from the graphs today that there are periods where the monitoring is scoring them low. I know the name Tencent — you’re a huge operator, I doubt the problem is at your end.
@ask: is the problem here that the monitoring vantage point is outside China, and any packet loss is unfairly penalising their scores?
This is an instance where I wish the pool was working with the RIPE Atlas team, they have probes in almost every country and could query NTP servers for a more accurate local reading.
Heck, if RIPE doesn’t donate the resources, I would be willing to at least donate my credits to see something like this happen.
Using RIPE Atlas would be really interesting, indeed. (For China specifically their coverage is pretty limited though).
The work with RIPE Atlas is building something to use the API and then figure out how to cancel out “noise” appropriately (Atlas will always find problems so you have to figure out which are noise and which are “real”, or at least what the reasonable thresholds are).
(Other work is having the monitoring system deal better with many more test probe results without using too many resources or making the system more fragile; I’ve been fussing with that for way too long).
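To make the "noise" question concrete, here is a minimal sketch of one way to aggregate many probe results for a single server. It does not call the RIPE Atlas API; the function name `assess` and both thresholds are hypothetical and would need tuning against real data. The idea is simply that a few failing probes are treated as noise, and the server is only flagged when most probes fail or the median offset is bad.

```python
from statistics import median

def assess(results, max_offset_ms=100.0, max_fail_fraction=0.5):
    """Classify one server from many probe results.

    results: list of measured offsets in milliseconds, or None where a
    probe got no answer. Both thresholds are illustrative assumptions.
    """
    if not results:
        return "unknown"
    answered = [r for r in results if r is not None]
    fail_fraction = 1 - len(answered) / len(results)
    # Only call the server unreachable if most probes failed; a few
    # failing probes are treated as noise on the probe's side.
    if not answered or fail_fraction > max_fail_fraction:
        return "unreachable"
    # The median ignores a minority of wildly wrong probes.
    if abs(median(answered)) > max_offset_ms:
        return "bad_time"
    return "ok"

print(assess([1.0, None, 2.0, -1.5, None]))
print(assess([None, None, None, 5.0]))
```

Using the median rather than the mean is a deliberate choice here: one probe with a broken clock should not drag the whole result.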
Any update on my server?
We have a student-administered datacenter at Kajaani University of Applied Sciences. Currently we are looking for real-world usage for our servers and bandwidth. Serving the China pool would be an ideal way to test our 1 Gbit uplink and firewall.
You can add https://www.ntppool.org/scores/126.96.36.199 to China pool.
Please also add my server to the China zone:
You are welcome to add my servers as well: https://www.ntppool.org/user/pb8yvobuaj4oscre4yfw
I have more than enough bandwidth and can easily get more. I’m going to add a server with 10G transit in a few months.
As you clearly have a lot already, you may only want to add my two German ones, but I am happy for you to add all of them.
My theory: the monitoring of nodes, both those serving the CN zone from outside China and those inside China, is broken. Monitoring surely has to treat “behind-GFW” as a separate Internet; otherwise the CN zone will be completely unstable in the face of any packet loss crossing the GFW.
It feels like it has been almost a year since the zone started collapsing, and we’re still heroically throwing extra nodes around the world which flit in and out of the zone, rather than building on a stable foundation?
Did @ask ever answer whether the monitoring for CN could be “more forgiving”? Because the situation is going to keep on going (brave volunteers stepping up, monitoring taking some within-China nodes out of the pool, volunteers getting punished) until the monitoring actually reflects the reality of the zone.
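To put a number on what "more forgiving" could mean, here is a minimal sketch. It assumes a score that decays by a fixed factor on every check and then has a step added (positive for a good answer, negative for a timeout), with a node staying in the zone while its score is at or above a threshold. All constants below are illustrative assumptions, not values taken from the pool's code.

```python
def steady_state_score(loss, good_step=1.0, timeout_step=-5.0, decay=0.95):
    """Long-run average score for a healthy node whose checks are lost
    with probability `loss` (for example, dropped crossing the GFW)."""
    expected_step = (1 - loss) * good_step + loss * timeout_step
    # Fixed point of: score = decay * score + expected_step
    return expected_step / (1 - decay)

def max_tolerable_loss(threshold=10.0, good_step=1.0, timeout_step=-5.0, decay=0.95):
    """Loss rate at which the long-run average score equals `threshold`."""
    return (good_step - threshold * (1 - decay)) / (good_step - timeout_step)

print(max_tolerable_loss())                   # assumed default penalty
print(max_tolerable_loss(timeout_step=-2.0))  # a hypothetical gentler penalty
```

With these assumed constants the break-even loss rate is 1/12, so roughly 8% packet loss on the monitoring path would be enough to push a perfectly good node's average score down to the threshold. Softening the timeout penalty to -2 doubles that to 1/6. The exact figures depend entirely on the assumed constants; the point is only that the tolerable loss rate is a tunable property of the scoring, not of the node.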
This has been going on for half a year. What help can we offer you, @ask, and the wider NTP Pool admins? I feel like I keep on asking this, but hear nothing back — whether it’s on this forum or by email; whether it’s offers of DNS anycast nodes, monitoring VPSs in different countries, etc. I’m starting to get the feeling that these offers are not wanted…
I know he was talking about volunteers for additional monitoring servers in another thread.
The source code for ntppool and geodns is on GitHub; anyone can contribute improvements.
There are a lot of tweaks and features people ask for, but all of that requires quite a bit of time to code and test.
That reminds me that I was going to make a monitoring server. I think I installed the software but never actually set it up. I should get on that.
I’d love to contribute code. But:
- do we agree on what the problems might be for the CN zone? do we think that what is happening is that nodes in the zone get punished, drop out, others take over the load, drop out, but meantime the first lot to have been kicked out come back? if so, the problem isn’t the nodes, but the fact that they don’t ever get the chance to work together to serve the load of the zone
- of course we want to run tests, how do we even run tests at scale?
- how quickly would a modification to how monitoring works as submitted in a PR to github actually get deployed to a production environment that might be useful?
Piling more nodes into the zone has completely failed. Here we are, almost a year later, and the zone keeps on collapsing. By repeating the same actions, without changes, we will continue to fail.
Suggestions of “Just submit a PR” are BS. I’m not asking for “tweaks” or “features”. I’m asking the project architect(s), and indeed the wider community, to help work out how we can help the project fulfil its core goal of serving NTP. That means problem management 101:
- agreeing what we think is wrong, based on all the evidence we have (and most of us just running nodes have nothing but the output of our NTP daemon to go by — more “datum” than “evidence”)
- deciding an action plan
- carrying out that action plan
- seeing if what we did fixed it
If the action plan is “we need 100 nodes” then we’ve got a problem: until you have all 100 nodes, anybody chipping into the pool for CN is going to get battered, their node will be removed, and possibly they will grow weary with the project and take their freely donated resources with them. You’ll never reach 100 nodes this way.
If the action plan is “we need to fix monitoring” then it’d be great if we could agree what needs fixing, where to get those resources and developer time, and who can roll it out into production.
So, here’s my stab in the dark:
The problem isn’t the number of nodes servicing the zone, or how much traffic they can serve. The problem is how monitoring them is causing the zone to crumble. My proposal is that a bunch of CN node operators who believe their node(s) are solid do not get removed from the zone automatically by the (what I believe to be broken) NTP Pool Project monitoring. However, we absolutely must let those operators remove their node instantly from the zone rather than having to wait ~4 days, in case they are getting so battered that they need to withdraw.
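For anyone who wants to kick the idea around, the proposal above can be sketched as a membership rule. This is only an illustration: the `pinned` and `withdrawn` flags, the `Node` type, and the threshold of 10 are hypothetical names and values, not anything in the existing pool software.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    score: float
    pinned: bool = False     # hypothetical: operator vouches that the node is solid
    withdrawn: bool = False  # hypothetical: operator asked for immediate removal

def in_zone(node, threshold=10.0):
    """Decide whether a node is served in the zone's DNS answers."""
    # An operator's withdrawal always wins and takes effect at once.
    if node.withdrawn:
        return False
    # Pinned nodes are exempt from automatic removal by monitoring.
    if node.pinned:
        return True
    # Everyone else is subject to the usual score threshold (assumed value).
    return node.score >= threshold

nodes = [
    Node("cn-a", 3.0, pinned=True),
    Node("cn-b", 3.0),
    Node("cn-c", 20.0, pinned=True, withdrawn=True),
]
print([n.name for n in nodes if in_zone(n)])
```

The ordering of the checks is the design choice that matters: withdrawal is tested before pinning, so an operator who is getting battered can always get out.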
Want me to turn that into a PR? I’ll do that, gladly. But I’d love for some of the people who this issue affects to kick around the idea, and especially @ask to weigh in whether he’s just going to reject it outright, before I spend more of my time on this project for naught.
This makes me think that we need to find a way to try and kickstart the zone. Maybe send all CN zone requests globally until a majority of servers in the CN zone, including servers outside China that have been added to the zone, are back up to the point where they are in the pool. Then slowly move CN zone requests back to the CN zone pool. It’s not a long-term solution, or really a good one, but it may be the best temporary solution available.
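A minimal sketch of that kickstart, under stated assumptions: `next_cn_share` is a hypothetical function that would be evaluated once per adjustment round, the 10% step is an arbitrary choice, and "majority" is taken to mean strictly more than half of the zone's servers currently scoring well enough to be in the pool.

```python
def next_cn_share(current_share, healthy, total, step=0.1):
    """Share of CN queries (0.0 to 1.0) to answer from CN-zone servers
    in the next round; the remainder is answered from the global pool."""
    # No majority of zone servers back in the pool: send everything global.
    if total == 0 or healthy * 2 <= total:
        return 0.0
    # Majority is healthy: hand traffic back slowly, one step per round.
    return min(1.0, current_share + step)

share = 0.0
for healthy in [3, 4, 6, 7, 8, 8]:  # healthy servers out of 10, per round
    share = next_cn_share(share, healthy, total=10)
    print(healthy, round(share, 2))
```

Dropping straight back to zero when the majority is lost is a blunt choice; a gentler ramp down would also be possible, but the post only describes the ramp up.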
In which case, the DNS cycling must be a lot faster because even having your node on the lowest settings — 384kbit/sec — you get bursts of up to 20Mbit/sec of NTP traffic. Not everyone in the pool can afford or cope with that level of traffic. And I would argue that none of them consented to it when they added their node to e.g. UK or CH or DE.
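For scale, the figures above can be turned into packet rates. This sketch assumes plain NTP packets with no extension fields over IPv4 and Ethernet: 48 bytes of NTP payload plus 8 bytes of UDP header, 20 bytes of IPv4 header, and 14 bytes of Ethernet header, so 90 bytes per frame. The function name is hypothetical.

```python
NTP_PAYLOAD = 48  # bytes, NTP packet without extension fields
UDP_HEADER = 8
IPV4_HEADER = 20
ETH_HEADER = 14
FRAME_BYTES = NTP_PAYLOAD + UDP_HEADER + IPV4_HEADER + ETH_HEADER  # 90

def packets_per_second(bits_per_second, frame_bytes=FRAME_BYTES):
    """Packet rate in one direction for a given bit rate."""
    return bits_per_second / (frame_bytes * 8)

configured = 384_000    # the lowest pool setting mentioned above, bit/s
burst = 20_000_000      # the burst level mentioned above, bit/s

print(round(packets_per_second(configured)))
print(round(packets_per_second(burst)))
print(round(burst / configured, 1))
```

Under those assumptions the burst is about 52 times the configured rate, or on the order of 28,000 packets per second in one direction, which is the kind of load a small home or VPS firewall may struggle to track.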