More precise (sensible, sensitive) server monitoring score

As so often, this appears to reflect the rather narrow view of someone fortunate enough to reside in a zone where these kinds of criteria are easily met, and where getting the last bit of differentiation in server assessment is apparently the only improvement conceivable. It keeps ignoring that while the majority of servers are currently hosted in such zones (for reasons akin to why people from such zones are the most vocal in the forum), the majority of the Internet population arguably does not live there. And a large portion of the Internet population doesn’t even have the luxury of an abundant number of pool monitors in their zone, or even reasonably close, which increases the risk of packet drops or path asymmetries due to larger RTT. So as before, I fear tightening the criteria in this manner will make an already dire situation in large parts of the world even worse. And testing in the test system will not reflect that properly, as those parts of the Internet are even more underrepresented in the test pool than they are in the production pool, or this forum.

How I wish at least part of the energy that goes into tuning the pool for those lucky enough to reside in Anglo-American, European, or similarly well-equipped zones of the world were spent on making sure the pool works reasonably well everywhere and for everyone (server operators and clients alike), on properly supporting IPv6, on making sure that, e.g., reported GeoIP mislocations or requests to help out in other zones get processed in a predictable and somewhat timely manner, or on getting vendor zones handled in a predictable and timely manner (if that concept is to be kept going forward),…

I see your point. However, if a server is badly connected, its usefulness is also more or less equal to its scoring. That holds at least when there are as many well-connected monitors as the number of active monitors (currently 7).

One practical experiment is worth a million speculations. Let’s put some servers from the badly connected corners of the world into the test pool, and see how it goes.

If there is a sufficient number of well-connected monitors nearby, that may be true. But the score does not only reflect the server’s connectivity, but also the monitor’s, and the path in between.

So already the premise “server is badly connected, its usefulness is also more or less equal to its scoring” is wrong in the general case.

The server might be serving local clients perfectly, yet since it is scored from far away, its scores are worse than local monitors/clients would see it.

As mentioned before, slightly different case, but highlighting the issue: I have monitors in places like India, Singapore and the like. For noticeable portions of the time, they are not participating in scoring. Why? Because they are evaluated against a server in California showing an offset of -15ms, and another one in Sweden not responding most of the time, and some in Japan with an offset of +15ms, and so on. Yet comparing against a local reference clock, they are quite fine.
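To illustrate how a distant reference can show an offset of around 15 ms even when both clocks are fine, here is a minimal sketch. It assumes the standard NTP offset formula, which presumes equal delay in both directions; the delay values and the function name `measured_offset` are made up for illustration and are not taken from any actual monitor.

```python
def measured_offset(true_offset, fwd_delay, back_delay):
    """Offset an NTP client computes when the one-way delays differ.

    Simulates the four timestamps with the request sent at client time 0
    and zero server processing time, then applies the standard formula.
    All values are in seconds.
    """
    t1 = 0.0
    t2 = t1 + fwd_delay + true_offset      # server clock reading on arrival
    t3 = t2                                # server replies immediately
    t4 = (t3 - true_offset) + back_delay   # client clock reading on arrival
    return ((t2 - t1) + (t3 - t4)) / 2


# Both clocks are perfectly in sync (true offset 0) in both cases.
symmetric = measured_offset(0.0, 0.135, 0.135)    # same delay each way
asymmetric = measured_offset(0.0, 0.120, 0.150)   # 30 ms path asymmetry

print(f"symmetric path:  {symmetric * 1000:+.1f} ms")
print(f"asymmetric path: {asymmetric * 1000:+.1f} ms")
```

With these assumed numbers, half of the 30 ms asymmetry shows up as an apparent offset, although neither clock is wrong.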

I trust you can transfer that to how similar constellations would affect servers being scored mostly by monitors far away. And that is what the monitoring, and the scoring, should properly reflect, how a server will be seen by the majority of its clients, which are local or regional.

And yes, please add some of the servers you run in underserved countries to the test pool so that that experience is reflected there as well.

Having a limited number of testing monitors, located almost exclusively in the well-connected portion of the world, will give the worst-case score for servers in the badly connected parts of the world. I believe that in the production network (more monitors, probably more diverse locations) the scoring results would be better.

At this moment, the number of active monitors is 7. For a badly connected server, there might be a smaller number of monitors that give a good result. Maybe decreasing the number of active monitors could be a subject of discussion, but only after we have enough test results.

So what you are basically saying is that you’ll be ignoring that part, because however bad (or good) it may come out of the test on the test system, the production system is expected to be better, and thus considered good enough either way, because there doesn’t seem to be any attempt to really understand that part?

Anyhow, merits of a test, or the proposal itself, aside, the point remains that rather than tweaking the system to get a bit more differentiation in scoring for people in an arguably smaller part of the world (including myself), I’d rather see the effort spent in making the pool work somewhat decently at least everywhere, for everyone, in getting proper IPv6 support, and getting server location updates and vendor zone setups processed in a predictable and somewhat timely fashion.

It is possible to compare the quality of the set of production monitors with the set of testing monitors. Servers that are in both pools could serve as a reference.

Fair enough, resources may be spent more efficiently on other places, your observation is noted. That issue is only my pain point. However, other issues or improvement plans should be discussed in different threads.

I am not discussing those issues or improvement plans, simply mentioning them for reference. There’s already a sufficient number of threads on them, no need to add even more of them.

And just mentioning them as far as they intersect with your proposal, and are thus relevant in this context. I.e., that the proposal needs to consider, and work for, the entire pool, not just a portion of it.

And don’t get me wrong, I definitely don’t mind fine-tuning the behavior of the pool in specific areas. But as you concede yourself, there are maybe bigger fish to fry right now (the stuff mentioned; others may have further pain points that I forgot, but in-depth discussion of those, including my points, is out of scope here).

By the way, did you consider what this improvement would give to some other improvements, especially adding servers to the underpopulated zones? We could have a better reachability matrix, n servers to m monitors. A server with many badly scoring monitors but some good-quality, far-away monitors (high delay) could help locate which servers to add to the country’s pool. Good candidates are those servers that share the AS number with the well-connected but far-away monitors.

It will not matter or change one bit. Can we please stop this useless discussion.

As for the monitors: thanks to @ask, we now have 99 instead of just 1 before.

These 99 monitors score your server, and MOST OF THEM score your server pretty well, meaning it’s kept in the pool and not kicked out like before with just 1 monitor.

There is just 1 purpose to monitors: they check if your server is OK, TICKS WELL and is REACHABLE.

If 70 monitors agree and just a few do not, it means your server is in the pool serving time.
Regardless of where you are.

This has nothing to do with getting extra NTP servers…they are monitors, testing NTP servers.

Please leave the monitors alone, I’m getting sick of this non-discussion.

Underserved zones need more NTP servers, yes, but that has nothing to do with the monitors.

Can we move on and try to get a better NTP-server spread? As that is the real issue.

You will probably call me rude and email Ask over this. Monitors are NOT the problem, they are not.

I think the data is already there, and good enough. I don’t think the data would be any more helpful with the improvements you suggest. Rather, as discussed in another thread that you kicked off, it would be good to be able to access it more broadly. Or for it to be used at all now, rather than waiting even longer for some additional fine-tuning, thus further delaying action on the bigger issues.

This is rubbish. What improvement would this give all over the globe?

The best monitors that are selected are the ones that score you.

And no, they should not be in the same AS.

Please stop this discussion. The monitor system we have now is the best we ever had.

Did it kick you out of the pool for nothing? Like we had before with 1 monitor? No it doesn’t.

You are bringing up a non-issue.

We have a bigger problem…serving countries better, better serving-spread, but that has ZERO to do with the monitors.

Can we please stop this? Monitors are fine.

Sorry to say, but you did not understand. I meant a monitor that has servers in the same AS.

They should not, as they can be in the same datacenter.
No monitor in the same network/provider should score you.

I give up. You probably will never get it.

It is not about scoring those servers by a nearby monitor (even in the same datacenter).
Rather, it is about the following: if a monitor gives a good score to a far-away server (high round-trip delay) and that server is in an under-served zone, then the servers near that monitor (same AS) could be good candidates to add to that zone.
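As a rough sketch of that idea (not something the pool does today, as far as I know): filter the per-monitor results of a server in an under-served zone for monitors that are far away yet score it well, and collect their AS numbers. The `Observation` type, the thresholds, the monitor names, and the AS numbers (taken from the private-use range) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    monitor: str       # monitor name (hypothetical)
    monitor_asn: int   # AS number the monitor sits in (hypothetical)
    score: float       # score this monitor gives the server
    rtt_ms: float      # round-trip time between monitor and server


def candidate_asns(observations, min_score=10.0, min_rtt_ms=100.0):
    """Return the ASNs of far-away monitors that still score the server well.

    Both thresholds are illustrative assumptions, not pool settings.
    """
    return sorted({
        o.monitor_asn
        for o in observations
        if o.score >= min_score and o.rtt_ms >= min_rtt_ms
    })


observations = [
    Observation("near-a", 64512, 4.2, 35.0),    # close, but scores badly
    Observation("far-b", 64513, 19.8, 180.0),   # far away, scores well
    Observation("far-c", 64514, 18.5, 240.0),   # far away, scores well
    Observation("far-d", 64515, 2.0, 260.0),    # far away, scores badly
]
print(candidate_asns(observations))  # [64513, 64514]
```

Servers already in the pool that sit in the returned ASes would then be the ones to consider for the under-served zone.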

Do not give up, please. I am trying to explain to you what I mean.

Ok, you really do not get how it works.

First, there is no round-trip-delay. This is something you invented.

Second, it’s UDP, meaning send and forget.

Third, if the command (UDP) arrives it sends (UDP) back. If it doesn’t arrive it’s lost. Simple as that.

There is no round-trip, that only happens with TCP, not with UDP.

And for the last time: Monitors send a request, after being tested themselves to be OK on time, to an NTP server. The result can be anything: Time-out / Reject / Wrong-time / Good-time. A monitor does nothing else.

When Time is good, you are scored 20, when there is a problem, it drops.
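A minimal sketch of how a score could settle at 20 and then drop, assuming a decaying-average update. The decay factor, the step values, and the function name `update_score` are illustrative assumptions here, not something stated in this thread.

```python
def update_score(score, step, decay=0.95, ceiling=20.0):
    """One decaying-average update; all constants are illustrative assumptions."""
    return min(ceiling, score * decay + step)


# A long run of good responses (assumed step +1) approaches the ceiling of 20,
# because 1 / (1 - 0.95) = 20.
score = 0.0
for _ in range(200):
    score = update_score(score, step=1.0)
good_score = score

# A few timeouts in a row (assumed step -5) pull the score down quickly.
for _ in range(3):
    score = update_score(score, step=-5.0)
after_timeouts = score

print(f"after good run: {good_score:.2f}, after 3 timeouts: {after_timeouts:.2f}")
```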

If you put an NTP server online, you HOPE it has good time/routing all over the world. Sadly it doesn’t.
As the 1 monitor (San Jose) proved to us. Today the monitors work fine.

There is NO round-trip. It gets an answer or not. But when answers are NOT there, those monitors are NOT selected to score you.

Monitors that do get the right data are selected and score you.

My god, why do you not get this? It has been 1 year…so many explanations…how much more do you need to get it? You started Jan 3, 2025 ( I think)…

How long are you going on with this? Do you enjoy this crap? Are you for real? Come on :yawning_face:

Sorry, didn’t want to get in the middle of this any further than I already did, but really need to say, @bas, you are the one who clearly is not understanding things. Not that you are wrong about most points you raise, but the issue @NTPman raises simply is more subtle than the very rough approach that you take to the topic.

So please, just as with so many other topics, if you don’t think this is worth discussing, please simply stay out of it, and let those who think it is worth discussing do so amongst themselves, without interference.

(And while I don’t agree with the approach that @NTPman proposes due to the current shortcomings I tried to highlight, I think I certainly do understand the issue he is trying to address - though again I don’t necessarily agree regarding the relevance/urgency to address it.)

Apart from his very latest proposal being about addressing the issue that you also face yourself, so that should be welcome at least I would think, and not dismissed somewhat rudely without apparently even understanding it.

As yes, there is such a thing as RTT, just look at the page of any of your servers, and it’ll be shown very plainly for active and testing monitors, and by hovering above a monitor name for those monitors in candidate mode.

Thank you, @MagicNTP .

Sorry, but that is not an RTT (round trip time) as it’s not a round trip.
The monitor sends a request to the ntp-server in question, now a few things happen (as I understand it) :smile:

1: The server doesn’t reply = Timeout => lowers score because no/late response.

2: The server does respond, giving the time but also a timestamp. The difference between time and timestamp is the ‘ping-time’. If the time is OK, that is good enough: score 20, regardless of the ping.

3: The server responds with wrong time, but within limits, score drops, but ‘ping-time’ can be fine.

This is not an RTT, just a single trip. As far as I understand it.

The problem is, in my opinion, that we could move to 100000 monitors or use just 20; in the end, it makes no difference to the scoring needed to be accepted in the pool. It just gives the entire system more load to handle, but no extra accuracy.

Same as accuracy between Stratum 1 and Stratum 10 should be the same. The only big change is Stratum 0 to Stratum 1 depending on the quality of the PPS-signal, but after that the accuracy won’t change much.

A solution to get better accuracy is to only allow monitors that are stratum 0 or 1 and have very good precision. That, in my opinion, can improve the pool.
But then we have to ditch a lot of monitors (I think) and I don’t believe that will improve the global Pool system.

Beware, we are mostly better than 5ms; this may sound high.
But for most purposes this is far beyond the needs, as it’s still 0.005 seconds. And corrected all the time.

Systems that need better accuracy buy their own equipment.

Last remark, all monitors check ALL systems, so adding more monitors will only give a higher load.

I’m still puzzled…maybe you mean only stratum 0/1 monitors? As I believe that is the only way to get more accuracy, but we will lose monitors this way, resulting in fewer servers…which will overload more servers.

My 2 cts.


Not quite sure why you insist there’s no RTT. Round-trip time as a concept does not care at all about the protocol. If a laser beam is sent from Earth to a passive reflector on the Moon, which then reflects the light back to Earth, the time it takes for the light to travel to the Moon and back is the RTT. Similarly, for NTP, the RTT is the time it takes for the request packet to reach the server plus the time it takes for the response to come back. The measured RTTs for one of your servers are highlighted in the above image. The RTTs for the Candidate monitors are only shown when you hover the mouse over the monitor’s entry.

As for the NTP protocol, there are four timestamps in each NTP packet:

  • When the server’s clock was set (not related to this RTT discussion)
  • When the request was sent by the client
  • When the request was received by the server
  • When the response was sent by the server

The RTT can be calculated based on these numbers (edit: and by recording the time when the client received the response). All the fields are there in every NTP packet, but obviously the fields to be filled in by the server are left empty by the client.
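For illustration, the calculation described above can be sketched as follows, using the usual NTP on-wire formulas for delay and offset. The timestamp values are made up: 6 ms travel time each way, 1 ms of processing on the server, and a server clock that is 2 ms ahead of the client.

```python
def ntp_delay_and_offset(t1, t2, t3, t4):
    """Round-trip delay and clock offset from the four NTP timestamps.

    t1: request sent (client clock)    t2: request received (server clock)
    t3: response sent (server clock)   t4: response received (client clock)
    All values are in seconds.
    """
    delay = (t4 - t1) - (t3 - t2)          # time on the wire, both directions
    offset = ((t2 - t1) + (t3 - t4)) / 2   # server clock minus client clock
    return delay, offset


# Hypothetical timestamps, not taken from any real server.
t1 = 100.000   # client sends the request
t2 = 100.008   # server receives it (6 ms travel + 2 ms clock offset)
t3 = 100.009   # server sends the response 1 ms later
t4 = 100.013   # client receives it after another 6 ms of travel

delay, offset = ntp_delay_and_offset(t1, t2, t3, t4)
print(f"RTT: {delay * 1000:.1f} ms, offset: {offset * 1000:.1f} ms")
# prints: RTT: 12.0 ms, offset: 2.0 ms
```

Note that the server's processing time (t3 - t2) is subtracted, so the result is the time spent on the network in both directions.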

I believe it stands for Return-Time-Timing…not round trip.

12 ms, that is fast.

Pinging my own router already takes 2.5ms.

If I do a traceroute to my fastest server, it takes 28ms.

So how can it be faster than a traceroute?

I do not know how the NTP protocol works in total, but I do know it’s able to work out the delays in transport.

As such the ‘RTT’ is too fast to be Round-Trip-Time, when I can’t even get those times with a traceroute.

Therefore I believe it’s Return-Time-Timing…ergo, one-way trip back. It’s not a round trip.

As the monitor does this:

bas@workstation:~$ ntpdate -q ntp1.heppen.be
2025-12-24 21:19:32.527940 (+0100) -68.061499 +/- 0.001603 ntp1.heppen.be 185.142.225.68 s1 no-leap
bas@workstation:~$ ntpdate -q ntp2.heppen.be
2025-12-24 21:19:35.476507 (+0100) -68.062169 +/- 0.015213 ntp2.heppen.be 217.103.55.36 s2 no-leap
bas@workstation:~$ ntpdate -q ntp3.heppen.be
2025-12-24 21:19:38.582864 (+0100) -68.062891 +/- 0.014326 ntp3.heppen.be 87.118.104.17 s2 no-leap
bas@workstation:~$ ntpdate -q ntp4.heppen.be
2025-12-24 21:19:42.16401 (+0100) -68.060999 +/- 0.011316 ntp4.heppen.be 51.75.149.45 s2 no-leap

Then compares it to the time of the monitor, that has been checked BEFORE and stated to be within limits.

As you can see there is a little difference…I believe that is the ‘ping-time’.

But then again, I could be wrong. :grin: