More precise (sensible, sensitive) server monitoring score

Routers may give lower priority to responding to traceroute packets, i.e., responding to those may take longer. This will show up as longer RTTs. Another funny thing you may have noticed when doing traceroutes to hosts far away (i.e., >10 hops or so): it is possible that traceroute reports 14 ms for the first few hops, then maybe 10 ms for the next hops, and then 20 ms for the last hops. This should be “impossible”, but it really isn’t, because there’s no guarantee that traceroute requests are replied to immediately. Then there’s also the possibility that packets take one path when going towards the destination and another path when coming back to you. In short, take the results of traceroute RTTs with a large grain of salt.

As for the RTT in the NTP and NTP Pool contexts, it really is the round-trip time. The relation between ping times and NTP RTT measurements may be more visible when observing hosts further away. Let’s take one random example NTP server from Australia. Its RTT for fihel4 is (currently) shown as 309.2 ms, and when I ping the same server from fihel4 I get “rtt min/avg/max/mdev = 309.163/309.435/309.517/0.114 ms”. So I’d say the NTP Pool’s RTT matches the ping time pretty closely.


The monitor is on GitHub, as is the NTP client used by the monitor. You don’t have to guess what the value is: ntp/ntp.go at v1.5.0 · beevik/ntp · GitHub

The RTT is the receive time minus transmit time.

> Its RTT for fihel4 is (currently) shown as 309.2 ms, and when I ping the same server from fihel4 I get “rtt min/avg/max/mdev = 309.163/309.435/309.517/0.114 ms”. So I’d say the NTP Pool’s RTT matches the ping time pretty closely.

When doing multiple queries the monitoring client picks the response with the lowest RTT (assuming it’s the one least affected by network latency), so 309.163 ms and 309.2 ms match pretty closely indeed. :slight_smile:
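As a rough illustration of that selection logic (not the monitor’s actual code, just a sketch using the beevik/ntp client the monitor is built on; the server address is the Australian example from this thread):

```go
package main

import (
	"fmt"
	"time"

	"github.com/beevik/ntp"
)

func main() {
	// Query the server a few times and keep the response with the
	// lowest RTT, assuming it is the one least affected by queueing
	// delays along the path.
	var best *ntp.Response
	for i := 0; i < 3; i++ {
		resp, err := ntp.Query("110.232.114.22")
		if err != nil {
			continue
		}
		if best == nil || resp.RTT < best.RTT {
			best = resp
		}
		time.Sleep(time.Second)
	}
	if best != nil {
		fmt.Printf("lowest RTT %v, offset %v\n", best.RTT, best.ClockOffset)
	}
}
```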

Ok try again, I just tested:

```
bas@workstation:~$ ntpdate -q 110.232.114.22
2025-12-24 21:45:21.247389 (+0100) -68.074959 +/- 0.170928 110.232.114.22 s2 no-leap
bas@workstation:~$ ping 110.232.114.22
PING 110.232.114.22 (110.232.114.22) 56(84) bytes of data.
64 bytes from 110.232.114.22: icmp_seq=1 ttl=49 time=330 ms
64 bytes from 110.232.114.22: icmp_seq=2 ttl=49 time=329 ms
64 bytes from 110.232.114.22: icmp_seq=3 ttl=49 time=329 ms
```

Yes, the time from me to it is 330 ms. So what?

The active monitors have lower RTTs, and score it 20. My own time check on it is also good.

So what is the point? Its time is correct, regardless of the “RTT time”. Meaning this RTT DOES NOT MATTER.

I’ve been saying this all along: the “ping time” has no impact on timekeeping.

The problem is/was that in this topic the TS said we could get more precise time by adding more monitors in local ASes. The thing is… it doesn’t matter.

If I query a server FAR away, the time is still good, unless the routes change all the time or time out, in which case it can’t be calculated anymore.
But because we have many monitors now, we can be sure ALL time servers are scored correctly within limits. And those limits are (hopefully) <5 ms… ergo we produce time at 0.005 s accuracy.

Isn’t that the goal of the system? Or am I missing something? Sure, I’d like 1 ns… but that’s not realistic. Mind you, we serve the globe GOOD time… so 0.005 s is pretty good in my opinion.

My 2cts.

I would call that a timeout! Even at 100 ms. Maybe you set the response timeout too high.

Just my opinion.

I would dismiss any NTP server that doesn’t respond within 100 ms, like the San Jose monitor did before (if I’m not mistaken). That makes it easier to dismiss monitors from being considered or even becoming active.

100 ms would be a good figure for the timeout, in my opinion.

For NTP traffic, RTT may depend on the client UDP port. This manifests as multiple bands in RTT plots, due to multiple paths. A typical traceroute uses multiple UDP ports and may show similar path variations.

For ICMP traffic (e.g., Echo Request / Echo Reply), there are no ports.
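For the curious, here’s a minimal sketch of how one might probe for this per-port banding: it sends a bare SNTP request from a few different local UDP ports and times the replies. The port range and server are arbitrary examples, and a real measurement would need many samples per port:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	raddr, err := net.ResolveUDPAddr("udp", "110.232.114.22:123")
	if err != nil {
		panic(err)
	}

	for port := 40000; port < 40008; port++ {
		conn, err := net.DialUDP("udp", &net.UDPAddr{Port: port}, raddr)
		if err != nil {
			fmt.Printf("port %d: %v\n", port, err)
			continue
		}

		// Bare SNTP request: LI=0, VN=4, Mode=3 (client), rest zero.
		req := make([]byte, 48)
		req[0] = 0x23

		start := time.Now()
		if _, err := conn.Write(req); err != nil {
			fmt.Printf("port %d: %v\n", port, err)
			conn.Close()
			continue
		}
		conn.SetReadDeadline(time.Now().Add(2 * time.Second))
		buf := make([]byte, 48)
		_, err = conn.Read(buf)
		rtt := time.Since(start)
		conn.Close()

		if err != nil {
			fmt.Printf("port %d: no reply\n", port)
			continue
		}
		fmt.Printf("port %d: RTT %.1f ms\n", port, float64(rtt.Microseconds())/1000.0)
	}
}
```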

Network paths can be very asymmetric, though that is not common. Recently I found a network path where traffic from the NTP server (Australia) towards my NTP client (in a different city in Australia) was low latency, while traffic from my NTP client towards the NTP server travelled from Australia to Japan, to Los Angeles, and back to Australia. This is a good example of the maxim “time transfer uncertainty is at least 1/2 the RTT”.

Separately, the NTP round-trip time may differ from client Rx time minus client Tx time. NTP servers may internally buffer NTP requests. Normally the internal buffer time is sub-millisecond, but I took a random sample just now and see several servers with internal delays of multiple milliseconds. I’m omitting some messy details.


So what you’re basically saying is that you’d simply kick out pretty much all servers in Oceania, such as this one:

Because once a seventh active monitor has again been added from among the testing monitors, the above server, for example, would be kicked out according to your proposal of treating any RTT beyond 100 ms as a timeout.

And this one right away as well:

Or this one:

Or this one:

Or this one in South Africa:

Or this one in Argentina:

Or this one in Peru:

And so on and so forth.

I did not say that. I said that monitors measuring RTTs that high, above e.g. 100 ms, should be dismissed for that NTP server, so the monitor is taken off monitoring that NTP server and better ones are selected.
I used 100 ms as an example because San Jose did remove servers from the pool, but it was the only monitor deciding.
In the current system there are 99 monitors, so servers stay in the pool unless ALL monitors say the NTP server is bad. Ergo, all your listed servers would stay in the pool; just different monitors would be assigned, ones where the response time is less than e.g. 100 ms.

A score from a monitor on a wobbly path to a server isn’t a good score; you shouldn’t want to be scored based on that. I wouldn’t.

Well, while you dismiss it at the beginning of your sentence, you actually repeat it in the second half. Maybe actually take a look at the data I shared, and you would easily see that your proposal would result in pretty much all monitors being dismissed for the examples (and many more in the referenced parts of the world), as fewer than the needed quorum of seven monitors would be remaining, sometimes much fewer.

Well, I did that: I entered your sample server in Chrony and let it run for a bit. The offset is bigger than with my own servers, but that is to be expected.

Sourcestats:

```
bas@workstation:~$ chronyc sourcestats 
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
heppen.be                   9   6   395     +0.544      1.018    +34us    83us
heppen.be                   8   7   391     +0.666      3.218   -458us   179us
voip.sprintweb.be          10   7   395     +1.555     20.794  +2319us  1718us
ip-217-103-55-36.ip.prio>   8   5   330     +1.013      3.585   -492us   230us
mansfield.id.au             8   5   452     -3.838     77.779  +4673us  5252us
```

Sources:

```
bas@workstation:~$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^* heppen.be                     1   6   377    12   -159us[ -199us] +/- 1795us
^- heppen.be                     2   6   377    13   -671us[ -711us] +/-   18ms
^- voip.sprintweb.be             2   6   377    12    +66us[  +27us] +/-   23ms
^- ip-217-103-55-36.ip.prio>     2   6   333    76   -947us[ -994us] +/-   25ms
^- mansfield.id.au               2   6   377    19    +16ms[  +16ms] +/-  170ms
```

The error bound seems to be 170 ms, but then, it’s coming from the other side of the globe.
Timing seems pretty good.

So where are the RTTs coming from? As I see my own monitor scoring it a perfect 20: belgg3-19sfa9p 20

I did a UDP port scan… and yes, it’s the same number:

```
bas@workstation:~$ sudo nmap -sU -p 123 110.232.114.22 
Starting Nmap 7.94SVN ( https://nmap.org ) at 2025-12-26 17:50 CET
Nmap scan report for mansfield.id.au (110.232.114.22)
Host is up (0.32s latency).

PORT    STATE SERVICE
123/udp open  ntp

Nmap done: 1 IP address (1 host up) scanned in 0.89 seconds
```

Latency 0.32 s. So I believe the value has no meaning; just the error should be given.
If the error margin is too high, I would expect the monitor to be dismissed from being active for that NTP server.

As I see little error for a server so far away, I also fail to see why the ping time matters.
It looks to me like the error in time is far more important, as it increases with every hop if the peering between the hops is unstable. Apart from that, I do not see why my monitor should be selected (it isn’t) yet still test that server twice a day or serve as a backup tester, as the error between us is far too high in my opinion, as you can see compared to my other servers.

Maybe it’s best to replace the ping time with the error margin? Just my opinion.

The main difference is that the RTT is easy to measure with only one probe – send one query and record the time it takes for the response to arrive. Chrony’s estimated error, in contrast, is not based on a single measurement but takes multiple recent measurements into account, including the stability of the time.

This means that replacing “ping time” with “error margin” is not quite as straightforward as one might think. There is a relation between RTT and estimated error. I believe changing the monitor ranking to be based on estimated error would not change the order of the selected monitors. Therefore I don’t see a need to put any effort into making such a change.

Edit: Some data regarding our guinea pig NTP server 110.232.114.22 in Australia. The list is primarily sorted by RTT. You’ll notice that the sorting would remain the same if sorted by estimated error. It is possible that with a large enough list of monitoring hosts there could be some differences (i.e., some monitoring hosts might swap places in the list), but I believe the differences would most probably be minor. The monitoring hosts in the top 10 would still likely be in the top 10, regardless of how they are sorted.

| Host | RTT (ms) | Estimated error (ms) | Ratio |
|----------|--------:|-----:|-----:|
| au | 2.68 | 4.68 | 0.57 |
| sg | 153 | 80 | 1.91 |
| ph | 181.12 | 94 | 1.93 |
| us | 198.07 | 102 | 1.94 |
| nl | 308.82 | 158 | 1.95 |
| fi-1 | 309.12 | 158 | 1.96 |
| pl | 327.42 | 167 | 1.96 |
| be (Bas) | 329 | 170 | 1.94 |
| fi-2 | 339.58 | 172 | 1.97 |
Edit2: Another table of results, this time for a server in Germany (185.248.189.10). As with the above table, the sorting remains unaffected whether we sort by RTT or estimated error.

| Host | RTT (ms) | Estimated error (ms) | Ratio |
|------|--------:|-----:|-----:|
| nl | 8.09 | 4.04 | 2.00 |
| pl | 24.77 | 12 | 2.06 |
| fi1 | 25.69 | 13 | 1.98 |
| fi2 | 41.60 | 19 | 2.19 |
| us | 95.58 | 48 | 1.99 |
| sg | 178.05 | 81 | 2.20 |
| ph | 201.01 | 106 | 1.90 |
| au | 290.99 | 151 | 1.93 |

Edit3: Based on these numbers you can estimate the estimated error from RTT with 0.429 * RTT^1.03 + 1.02. YMMV.
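In code, that back-of-the-envelope fit would look like this (the coefficients are just the rough ones quoted above, so the same YMMV applies):

```go
package main

import (
	"fmt"
	"math"
)

// estError approximates the estimated error (in ms) from the RTT (in ms),
// using the rough fit quoted above: 0.429 * RTT^1.03 + 1.02.
func estError(rttMs float64) float64 {
	return 0.429*math.Pow(rttMs, 1.03) + 1.02
}

func main() {
	// A few RTTs from the tables above, for comparison.
	for _, rtt := range []float64{8.09, 95.58, 309.12} {
		fmt.Printf("RTT %7.2f ms -> estimated error ~%5.1f ms\n", rtt, estError(rtt))
	}
}
```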

The “proposal” also illustrates a certain level of ignorance as to how the RTT is currently actually used by the system. Namely, it is not used for deriving the score itself, but only to select which samples go into score calculation, as well as in monitor selection.

The only thing is, request time/rtt/whatever you call it…does not matter.

What matters is the correct time + stable path.

Monitors should be selected on that only.
Regardless of whether they are 300 ms away, as long as they stay at 300 ms all the time.

The problem is wobbly paths, as you can’t correct for those.

This happened with the single San Jose monitor before: it had wobbly paths, making your server look good one day and bad the next.

Monitors should be selected on stability, not on RTT (request-time) or distance. Why is this so important for some? I do not get it.

If the path is stable at 300ms, the monitor will score you good. If the path is bad, the monitor won’t work for you. Period, end of story.

In short, and put this in your head: it does not matter one bit where the monitor is IF the path is stable. It can and will compensate for the path delay, and stable gives GOOD time.

RTT is bullshit when it’s not stable. Why do you not get this? And taking too long is a timeout.
This is not hard to understand.

The problem is unstable paths, regardless of the distance. One day 100 ms, the next 200 ms: you cannot calculate good time from that. That is the problem for poorly scoring servers.

I understand and almost kind of agree with you in principle, but bear this in mind: The probability of error increases with increasing RTT. In other words, when the RTT is high, it is probable that the estimated error is also high. Here’s a plot of the numbers in my above tables:

Keeping this in mind we can simplify the monitor selection by picking those monitors with a low RTT to the target server, because it is probable that their estimated error is also low. I believe the system essentially works this way now.

In addition to time stability there’s another point of view – time offset caused by asymmetric routing. It is very possible that the RTT to some server stays stable at around 100 ms, but there may be a constant 5 ms offset if the packets are routed so that it takes 45 ms for the request to reach the server and 55 ms for the response to come back. In this case the estimated error could be low, but the monitor would think that the server’s clock is significantly off. I don’t think that’s a good outcome either. The effects and probability of asymmetric routing increase when the monitor and NTP server are further away from each other, i.e., higher RTT and more network hops.
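To make the arithmetic in that example explicit (illustrative numbers, not measurements): NTP assumes the one-way delay is RTT/2, so any difference between the two directions shows up directly as an apparent clock offset.

```go
package main

import "fmt"

func main() {
	// Illustrative one-way delays from the example above, in ms.
	toServer, fromServer := 45.0, 55.0

	rtt := toServer + fromServer
	// NTP assumes a symmetric path (one-way delay = RTT/2), so the
	// asymmetry appears as a clock offset even if the server is perfect.
	apparentOffset := (toServer - fromServer) / 2

	fmt.Printf("RTT %.0f ms, apparent offset %.0f ms\n", rtt, apparentOffset)
	// Output: RTT 100 ms, apparent offset -5 ms
}
```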

Do you have some examples (actual measurements) where you think using the estimated error would be better than using RTT? Using the estimated error instead of RTT would not have changed anything in my above example graph/tables.


The point of the monitoring is not to get the best possible score. The point of the monitoring is to assess a server as best as can be done from the point of view of local clients, or at least an approximation of that. As such, a monitor half-way around the world is of little use, even if it is good, because it does not reflect, in general, what local clients would see. Sure, if a far-away monitor scores a server well, then it is likely that also local clients may get good service. But even that is not a given, depending on the local and regional networking.

E.g., a server may be sitting in a datacenter close to an IXP with good international connectivity, so score well with far-away monitors. But maybe a lot of clients are with providers that do not connect well to that IXP. E.g., Deutsche Telekom in Germany has been accused of not providing good connectivity with some other national providers because DTAG does not connect to open IXPs, but is asking money from other providers to connect to them. And if another provider doesn’t pay up, packets take a detour and/or cross underprovisioned and overloaded interconnects, raising the likelihood for packet drops, path asymmetries, and higher jitter.

So despite the good score from a far-away monitor, local clients might still not get good service. A local monitor has a better chance of reflecting that. And even more so in the inverse case, i.e., some far away monitors scoring badly, but local clients getting excellent service.

Thus, the system makes heavy use of the RTT when selecting monitors, but the score itself still has the higher impact. It is hard to tell what the reason for a low score is, so to be on the safe side and not kick out servers that actually perform well but where some monitors may have issues, the score gets precedence when selecting monitors. There is also the gaming factor for operators: a low score is rather demotivating, as is being raised in one way or another time and time again (see the original topic of this thread, for example). So having a good score overall, and many good scores at the top of the monitor list, is important motivation for server operators.

Then, in general, every NTP client makes heavy use of the RTT, in the form of the root delay. Sure, a high RTT does not automatically and always mean bad connectivity (aka packet loss), or asymmetry and jitter. But as has been pointed out time and time again, most recently by @avij in his last post above, the chances of something going wrong simply increase with the distance/RTT. Not sure why you don’t get that, and that this is one reason why the pool prefers to serve clients from local servers, i.e., currently from within the same country zone. The current country zones are a simplification for the system, similar to what @avij highlights: in many cases, peers in the same country are close to each other, though not always, and being in a different country also doesn’t necessarily mean a large distance. But it is a quick and especially simple approximation, and the plans to loosen the current strict zone concept will still try to honor that. Just as NTP clients typically prefer upstreams with lower root delay, i.e., lower RTTs to the upstream servers, because it simply reduces the likelihood of issues, and puts a tighter bound on the maximal path asymmetry that can occur. Not sure why you still don’t get that.

And lastly, NTP clients also typically prefer samples with lower RTT over samples from about the same time but with higher RTT, because the assumption is those will typically have lower path asymmetries. Again, not always, but as a rule of thumb, and for simplification, because there is no good way to measure actual path asymmetry in a strict and anonymous client-server relationship. That is what @ask has been explaining above with respect to the system picking the sample with the lowest RTT - just as a typical NTP client would.

I don’t agree with you, sorry.
You do not know what the peering from and to your ISP is like.
I know that in Belgium we often have better peering with Germany/France/Holland than we have inside Belgium.

Linux Mint/Ubuntu has a nice tool with which you can measure the fastest APT source to get updates from.

But to show you how bad it is, and why I run my servers mostly outside Belgium:

So by looking locally only, you may not get the best server.
Mostly my ping times are better outside Belgium than inside.

But those servers do not serve Belgium unless I ask for them to be added.

The only purpose of a monitor is to check whether an NTP server is on time and online; it cannot know where the clients are.

I do agree that clients prefer faster responses and steady delay. But that can’t be the job of a monitor, as it simply does not know where the clients are. The monitors like mine are at the same location as my servers (the same machine, in fact), so they are excluded from scoring servers on the same network, as they should be.

I expect the monitors to score whether my systems are online and ticking properly. The rest is speculation about what is best for the clients. When testing various servers as backups on my own servers, German servers often score better.

To nitpick belatedly, it’s actually:

(server receive time - client transmit time)
+
(client receive time - server transmit time)

In other words, the server’s processing time between receipt of request and transmission of response is excluded. Also, the differing clocks of the server and client cancel out – either difference can be negative, and neither difference represents a one-way time.
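A small sketch of that arithmetic with the four standard NTP timestamps (the values are made up for illustration):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	base := time.Now()
	t1 := base                            // client transmit
	t2 := base.Add(46 * time.Millisecond) // server receive
	t3 := base.Add(47 * time.Millisecond) // server transmit
	t4 := base.Add(92 * time.Millisecond) // client receive

	// delay = (t2 - t1) + (t4 - t3): the server's processing time
	// between t2 and t3 drops out, and because each difference mixes
	// the two clocks, a constant clock offset cancels out as well.
	delay := t2.Sub(t1) + t4.Sub(t3)
	fmt.Println("round-trip delay:", delay) // 91ms
}
```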

You can’t rule that out, as only with hardware timestamping do you know when the packet was received and sent by the NIC. As most servers don’t have hardware timestamping, it’s the kernel doing the stamping.
That time is later than when the NIC would do it, so processing can’t be fully excluded.

It’s no more than a rough guess at best. It really doesn’t have much meaning other than showing whether a route is poor, bad, or good.

A stable path at 100 ms (e.g., via satellite) is far better than a wobbly path at 10 ms one second and 1000 ms the next. I presume that will also show in the time accuracy.

Maybe min/max RTT values should be given, so you can see how stable the path is, with an average value as the middle value. Because if min and max are wide apart, it is an unstable path (unless you caused it yourself: reboots, testing, etc.).
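A sketch of what that could look like, again using the beevik/ntp client (the sample count, interval, and server are arbitrary choices for illustration):

```go
package main

import (
	"fmt"
	"time"

	"github.com/beevik/ntp"
)

func main() {
	const samples = 5
	var min, max, sum time.Duration
	n := 0

	for i := 0; i < samples; i++ {
		resp, err := ntp.Query("110.232.114.22")
		if err != nil {
			continue // a lost reply also says something about stability
		}
		if n == 0 || resp.RTT < min {
			min = resp.RTT
		}
		if resp.RTT > max {
			max = resp.RTT
		}
		sum += resp.RTT
		n++
		time.Sleep(time.Second)
	}

	if n > 0 {
		// A wide min/max spread suggests an unstable path.
		fmt.Printf("rtt min/avg/max = %v/%v/%v (%d of %d replies)\n",
			min, sum/time.Duration(n), max, n, samples)
	}
}
```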