Server consistently off

I searched this forum but couldn’t find a situation similar to mine.

I have a server whose time is consistently a few ms off from its peers, both stratum 1 and 2:

server (loc      remote           refid      st t when poll reach   delay   offset  jitter
==========================================================================================
192.168.z.a -192.168.z.c    aaa.bbb.ccc.ddd   3 u  309 1024  377    0.272    9.275   1.759
192.168.z.a +192.168.z.b    eee.f.g.hh        2 s   27   64  377    0.166    7.333   0.219
_xxxx::yy   *xxxx::123:     .GPS.             1 u  459 1024  377   36.360    6.783   2.814
_xxxx::yy   +xxxx::e        iii.jjj.k.l       2 u  842 1024  377   30.356    5.561   1.871
_xxxx::yy   -xxxx::4011     mmm.n.ooo.pp      2 u  825 1024  377   36.003    9.773   1.842

This has been going on for days, even after wiping out the drift file and restarting the daemon. However, this server has been in sync before. Recently, I tried to add the huff and puff filter, with no results.

This server is not a VM and other NTP servers in the same subnetwork are working well.

Please advise.

Hi, what do the outputs of the other NTP servers on the same subnet look like?

server         remote           refid      st t when poll reach   delay   offset  jitter
========================================================================================
192.168.z.c -192.168.z.a     61.150.111.99    2 u  327 1024  377    0.390  -10.471   1.695
192.168.z.c -192.168.z.b     129.77.111.66    2 u  978 1024  377    0.336   -2.283   0.622
192.168.z.c *144.172.111.222 129.77.111.66    2 u  347 1024  377    9.691   -0.287   2.818
192.168.z.c -72.30.33.88     129.66.111.22    2 u   20 1024  377   48.213   -4.495   2.118
192.168.z.c +2001:2001:3:4:: 213.172.99.11    2 u  828 1024  377  132.521   -0.576   1.072
192.168.z.c +54.236.222.111  216.218.222.222  2 u    7 1024  177   40.191    0.430   2.332

server         remote           refid      st t when poll reach   delay   offset  jitter
========================================================================================
192.168.z.b -192.168.z.c     144.172.111.222  3 u  249 1024  377    0.300    2.614   0.835
192.168.z.b -192.168.z.a     61.150.111.99    2 s   51   64  377    0.174   -7.721   0.101
192.168.z.b +68.94.111.11    12.121.111.11    2 u 1004 1024  377    9.145    2.137   3.310
192.168.z.b *129.77.66.66    .GPS.            1 u  70m 1024  330   17.415    0.429   1.052
192.168.z.b +139.78.99.111   .GPS.            1 u  163 1024  377   15.629    1.800   1.688
192.168.z.b +128.206.44.111  .GNSS.           1 u  633 1024  377   33.407    0.790   5.633
192.168.z.b +2607:ffff::123: .GPS.            1 u  287 1024  377   35.063    0.367   2.363

BTW, the NTP versions of these servers are:

  • 192.168.z.a: 4.2.6p5
  • 192.168.z.b: 4.2.8p12
  • 192.168.z.c: 4.2.6p2

Please send the ntpq -npcrv output again without obfuscating the IP addresses. Sharing IP addresses poses no risk to you, and obfuscating them makes it harder to help you. Including your pool IP address would help as well, so that we can look at the raw stats in the CSV log.

For what it’s worth, it’s not uncommon to see graphs like that simply due to delay because of distance from the Newark monitoring station. Here’s one of mine: https://www.ntppool.org/scores/150.101.186.48 - it’s basically always less than 1 ms offset from LAN peers, but delay to the monitoring station changes the view from outside.

Here are its variables:

version="ntpd 4.2.6p5@1.2349-o Tue May 5 09:53:03 UTC 2020 (1)",
processor="x86_64", system="Linux/4.4.190.x86_64.1", leap=00, stratum=2,
precision=-23, rootdelay=35.513, rootdisp=63.506, refid=61.150.110.96,
reftime=e2b745f5.9f63765c Mon, Jul 13 2020 15:43:33.622,
clock=e2b74ae1.5b03391f Mon, Jul 13 2020 16:04:33.355, peer=31067,
tc=10, mintc=5, offset=9.388, frequency=1.085, sys_jitter=0.819,
clk_jitter=4.001, clk_wander=2.191

That looks pretty OK. Your frequency, jitter, & wander values are all reasonably low, and your root delay/dispersion is not terrible. I think it’s likely that this is normal behaviour for your server.

Relevant or not, for the last 2 or 3 weeks my stratum 1 server has looked like this:

As it is synced against GPS PPS, its offset rarely went beyond ±10 ms before. I suspect there was some trans-Pacific routing anomaly, but I cannot confirm it.

Hi, is it the local offset (6.783) or the pool offset you’re concerned about? As @paulgear says, if it’s the pool offset I wouldn’t worry about it. If it’s the local offset, then, as you say, the server was in sync before and is on the same LAN as other in-sync servers, so what has changed or is different about this server? The delay to the selected peer is higher on this server (36.360) than on the other two (9.691 and 17.415), but not excessively so. Does this server have a heavily loaded network interface? I’ve seen this before when network cabling affected traffic: either dodgy cables or connectors not fully inserted. Without the real IP details, it’s not clear whether all three servers are syncing to the same external source.

Aye - mine jumps up and down too. The IPv4 and IPv6 entries are the same locally GPS-synced server…

I am not worried about what the monitoring stations see, but about the local offset.

The peer of 192.168.z.c is also seen by 192.168.z.b, but it prefers a closer one.

It’s a new NAS, which locked the time as expected at first, but after a few days started to veer off and never got back.

As a NAS, it does see a lot of local traffic during the day, but the offset doesn’t get any better during the night. The other servers see only moderate traffic.

I’ve never seen anything like it in all my years using NTP.

Nothing else changed between when it was locked ok and now? My bets would still be network traffic / config / physical connection. Might be worth setting the peer list on z.a the same as one of the other two to see if they both give similar figures and prefer the same server. Otherwise maybe plug z.a into z.b or z.c’s network cable and see if that has any effect…

z.a synchronizes only with IPv6 servers, while z.b and z.c use mostly IPv4. Delay conditions that differ by IP version may explain the phenomenon.

It doesn’t seem to be the case, since it once synchronized with little offset and other servers in the same network see the same IPv6 servers with little offset. Shouldn’t NTP correct for network delays?

Try adding IPv4 servers to z.a to check this.

The NTP protocol compensates well for large delays. However, it assumes that the delay is symmetric, since it has no prior knowledge of any delay asymmetry. A delay asymmetry therefore produces a fixed offset. That offset is not visible unless the server has another reference, for example one with a different delay asymmetry because it is reached over a different route or protocol.
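
To make the asymmetry point concrete, here is a minimal sketch of the standard NTP on-wire calculation from the four timestamps of one exchange. The helper names and the 18/18 ms and 26/10 ms one-way delays are made-up illustration values, not measurements from this thread. With the clocks in perfect agreement, a path that is 16 ms slower in one direction shows up as an 8 ms offset, while the reported round-trip delay stays the same.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP calculation.
    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, fwd_delay, back_delay, t1=0.0, processing=0.0):
    """Hypothetical helper: simulate one exchange, all values in ms.
    true_offset is the server clock minus the client clock."""
    t2 = t1 + fwd_delay + true_offset    # arrival, read on the server clock
    t3 = t2 + processing                 # departure, read on the server clock
    t4 = t3 - true_offset + back_delay   # arrival, read on the client clock
    return ntp_offset_delay(t1, t2, t3, t4)


if __name__ == "__main__":
    # Clocks agree exactly (true_offset = 0) in both cases.
    print("symmetric  18/18 ms:", simulate_exchange(0.0, 18.0, 18.0))
    print("asymmetric 26/10 ms:", simulate_exchange(0.0, 26.0, 10.0))
```

In general the measured offset equals the true offset plus half the difference between the outbound and return delays, so the client cannot tell an asymmetric path from a real clock error using that one source alone.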

Do the offsets consistently have the same sign (positive)? That would be strange indeed. Any chance there is a different NTP client running on the system?
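
A quick way to check the sign question against a pasted peers billboard is sketched below. It assumes the usual ntpq column layout, with the offset as the second-to-last field, and uses the first table from this thread as sample input. The function names are just for illustration.

```python
SAMPLE = """\
192.168.z.a -192.168.z.c    aaa.bbb.ccc.ddd   3 u  309 1024  377    0.272    9.275   1.759
192.168.z.a +192.168.z.b    eee.f.g.hh        2 s   27   64  377    0.166    7.333   0.219
_xxxx::yy   *xxxx::123:     .GPS.             1 u  459 1024  377   36.360    6.783   2.814
_xxxx::yy   +xxxx::e        iii.jjj.k.l       2 u  842 1024  377   30.356    5.561   1.871
_xxxx::yy   -xxxx::4011     mmm.n.ooo.pp      2 u  825 1024  377   36.003    9.773   1.842
"""


def offsets_ms(billboard):
    """Return the offset column (ms) from ntpq peer lines.
    Header and separator lines are skipped."""
    result = []
    for line in billboard.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # separator or blank line
        try:
            result.append(float(fields[-2]))
        except ValueError:
            continue  # header line: the field is the word "offset"
    return result


def same_sign(offsets):
    """True if every offset is positive or every offset is negative."""
    return all(o > 0 for o in offsets) or all(o < 0 for o in offsets)


if __name__ == "__main__":
    offs = offsets_ms(SAMPLE)
    print("offsets:", offs)
    print("same sign:", same_sign(offs))
    print("mean offset (ms):", round(sum(offs) / len(offs), 3))
```

For the sample above, all five offsets are positive and average roughly 7.7 ms, which is the pattern that would be surprising if the sources were unrelated.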

Yes, it’s always running behind.

On z.a what statements are in the ntp.conf other than peer or server?

z.a and z.b share the same configuration file, except for the peers and servers. The other statements disable logging, specify the drift file, set restrictions, and apply the one tinker setting that I added a couple of days ago to see if it could bring z.a and z.b closer: huff-n'-puff.

After connecting z.a’s second interface to a different switch port using a different cable, it still displays the same persistent offset:

 remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-192.168.z.c     111.111.111.222  3 u  600 1024  377    0.519    7.749   2.682
+192.168.z.b     111.7.1.66       2 s   51   64  377    0.394    8.372   0.816
*2222:ffff::123: .GPS.            1 u  760 1024  377   34.453    8.054   1.973
-2222:111:bbbb:e 222.111.2.5      2 u  574 1024  377   31.686   10.712   5.204
+2222:0:eee:4444 222.7.222.33     2 u  400 1024  377   35.810    8.849   3.365