Newark monitor problems

Is there a problem with the Newark monitor service? My stratum 1 NTP server is suddenly not doing well anymore. Checking with dedicated tools shows that the server is fine :slight_smile:

https://www.ntppool.org/scores/77.68.139.83

http://ntpgps.greyboxdata.com/ (my server)

That drop in the score is not a problem with the monitor. Currently your server returns an offset of -1 second:

https://servertest.online/ntp/20210303-141644-f90b

I recommend that you restart gpsd and chrony and verify that the score rises again.
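For example, on a systemd-based Raspberry Pi setup, something like the sketch below should do it (the service names are assumptions; chrony is called chronyd on some distros):

# Restart the GPS feed first, then chrony.
sudo systemctl restart gpsd
sudo systemctl restart chrony

# Confirm the refclock sources recover and the offset looks sane again.
chronyc sources
chronyc tracking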

Could someone please tell me what is wrong with mine? It barely gets to 20 before plummeting down to minus figures for no apparent reason. Is it me or the monitoring server?

This is a stratum 1 server (LeoNTP)

https://www.ntppool.org/scores/86.10.253.124

The offset is fine for my server:
https://servertest.online/ntp/20210303-153638-bc6a

I’ve also added my server to the beta NTP system and I get a +20 from Amsterdam?
https://web.beta.grundclock.com/scores/86.10.253.124

This is looking more like a connectivity issue perhaps?

My chronyc is reporting fine:

root@raspberrypi:/home/pi# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#- NEMA                          0   4   377    20   -526ms[ -526ms] +/-  262ms
#? SOCK                          0   4     0     -     +0ns[   +0ns] +/-    0ns
#* PPS                           0   4   377    20   -841ns[ -928ns] +/-  444ns

root@raspberrypi:/home/pi# chronyc tracking
Reference ID : 50505300 (PPS)
Stratum : 1
Ref time (UTC) : Wed Mar 03 14:59:55 2021
System time : 0.000000054 seconds slow of NTP time
Last offset : -0.000000067 seconds
RMS offset : 0.000000044 seconds
Frequency : 16.209 ppm fast
Residual freq : -0.000 ppm
Skew : 0.001 ppm
Root delay : 0.000000001 seconds
Root dispersion : 0.000018778 seconds
Update interval : 16.0 seconds
Leap status : Normal

Where is the offset coming from?

My monitors also show that Jakob’s server is off by 1 second.
Can you provide some stratum 1 details plus the chrony.conf?

You can add a noselect server to chrony.conf; e.g., letting a.b.c.d be a nearby stratum 1:

server a.b.c.d noselect

This will show the offset to that nearby stratum 1.
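As a concrete sketch (a.b.c.d is still a placeholder), after adding that line you restart chrony and read the measured offset from the "Last sample" column:

# Pick up the new source (service name assumes a Debian-style setup;
# it may be chronyd elsewhere).
sudo systemctl restart chrony

# The noselect source is measured but never used for synchronisation.
# -v prints a legend explaining each column, including "Last sample".
chronyc sources -v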

Regarding 86.10.253.124: the NTP monitor is seeing high loss, so I suspect you’re correct.
Please run traceroute 139.178.64.42 from your server; I’ll do some monitoring from the Newark side.

We’ve seen a lot of NTP filtering for Pool servers in some regions.

Hi Steve

Please see the results of the traceroute from 86.10.253.124 to 139.178.64.42, as requested:

traceroute to 139.178.64.42 (139.178.64.42), 30 hops max, 60 byte packets
1 10.0.0.254 0.277 ms 0.312 ms 0.322 ms
2 10.53.37.117 10.976 ms 15.585 ms 13.725 ms
3 86.28.83.165 15.939 ms 16.248 ms 16.504 ms
4 * * *
5 * * *
6 62.254.85.86 27.339 ms 28.470 ms 28.030 ms
7 213.248.84.25 31.349 ms 25.715 ms 27.947 ms
8 62.115.120.74 97.456 ms 96.588 ms 97.016 ms
9 62.115.113.20 95.175 ms 94.855 ms *
10 213.155.130.28 90.564 ms 62.115.137.99 94.577 ms 94.980 ms
11 62.115.175.183 95.895 ms 97.810 ms 98.927 ms
12 192.80.8.10 113.598 ms 115.342 ms 198.16.7.206 108.696 ms
13 198.16.4.213 106.343 ms 112.932 ms 110.001 ms
14 147.75.98.107 129.041 ms 147.75.98.105 105.805 ms 115.009 ms
15 139.178.64.42 106.978 ms 108.068 ms 109.762 ms

I posted without any name resolution because, apparently, as a new member I’m only allowed to include two URLs in a post!

It doesn’t appear to show serious packet loss during this trace, just a firewall at hops 4 and 5 blocking ICMP inside Virgin Media (my ISP), apart from hop 9, which looks a bit suspect! That hop, however, is in the States (Telia Company AB), so nothing is wrong on the UK side.

Something is obviously wrong though, as my score is up and down like Zebedee.

Many thanks

Mark

A few more traceroutes show issues at 62.115.113.20, with some severe packet loss at times - this is the second hop into the US at Telia, as noted above.

Nothing much I can do about that I’m afraid!

Looks like I’m going to have boinging scores for the long term :frowning_face:

On one traceroute, there was even packet loss at 139.178.64.42 itself!!

traceroute to 139.178.64.42 (139.178.64.42), 30 hops max, 60 byte packets
1 10.0.0.254 0.183 ms 0.188 ms 0.209 ms
2 10.53.37.117 10.510 ms 15.499 ms 16.104 ms
3 86.28.83.165 16.302 ms 16.958 ms 17.236 ms
4 * * *
5 * * *
6 62.254.85.86 30.947 ms 30.438 ms 29.919 ms
7 213.248.84.25 33.645 ms 26.613 ms 26.757 ms
8 62.115.120.74 97.091 ms 62.115.122.180 108.791 ms 62.115.120.74 96.948 ms
9 62.115.113.20 97.021 ms 104.693 ms 62.115.112.244 100.041 ms
10 213.155.130.28 101.479 ms 62.115.137.99 100.501 ms 98.659 ms
11 62.115.175.183 93.794 ms 91.892 ms 96.988 ms
12 198.16.7.206 108.695 ms 106.858 ms 192.80.8.10 106.968 ms
13 198.16.4.211 118.905 ms 198.16.4.215 115.059 ms 198.16.4.213 109.710 ms
14 147.75.98.105 128.313 ms 129.731 ms 147.75.98.107 138.849 ms
15 139.178.64.42 108.868 ms 107.848 ms *

All very bizarre.

Just an update from my side: restarting chronyd did not help. Restarting the entire Raspberry Pi, and thus also gpsd, fixed the offset. My score is going up again. Next time I will try restarting gpsd only.

I don’t see problems in the Newark → 86.10.253.124 direction.
Your traceroute results strongly suggest that NTP filtering at Telia is the cause.

And back up to +18.9 in less than 48 hours! Sorry Steve, but I’m really not sure it’s to do with filtering at Telia. I also send ADS-B feeds to quite a few providers, who have noted no outages/downtime or packet loss, so I’m stumped.

Also look at the scores on the beta system: https://web.beta.grundclock.com/scores/86.10.253.124
+20 in Amsterdam and +20 in Newark, NJ but now -13.6 from LA. The scores seem all over the place.

The packet drops I’m describing are specific to UDP port 123.

I sent you a traceroute script if you’d like to investigate systematically.
A number of NTP Pool volunteers have run this script, and the NTP drops are usually seen at Telia or Zayo. Further, I monitor NTP traffic from a host located in London and see near-continuous drops attributable to those ISPs.
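The idea is roughly the sketch below (not the actual script): probe with UDP datagrams aimed at port 123, so NTP-specific filtering shows up where a default trace looks clean:

# Trace with UDP to destination port 123 (NTP) instead of the default
# incrementing ports; needs a reasonably modern Linux traceroute.
traceroute -U -p 123 139.178.64.42

# mtr can repeat the probes and report per-hop loss over 100 cycles.
mtr --udp --port 123 --report --report-cycles 100 139.178.64.42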

Hi

Yes, I’m seeing exactly the same.

Thus, it seems a worthless exercise to continue this. I’ve removed all 7 of my NTP servers from the pool (a shame, as 2 were stratum 1s).

Thank you anyway.

Regards

Mark Smith

Hi Mark, of course it’s your choice if you want to remove one or all of your servers from the pool, but as long as they score 10 or higher they’re helping spread the load. :+1:

Looking at the management page in the live system, it’s only that one server that bounces - all the others are (were!) showing solid 20s!

Internet routing is by its nature not 100% reliable, especially with NTP, which uses UDP packets whose delivery isn’t guaranteed. I’d guess the ADS-B feeds use TCP, which is designed to retransmit lost packets. Routing that doesn’t work now might come back to life tomorrow (and vice versa!).
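If you want to put a number on that loss, a minimal sketch (assuming ntpdate is installed; the address is just your server as an example) is to fire a batch of one-shot NTP queries and count the failures:

# Send 100 single NTP queries without setting the clock (-q) and count
# the ones that get no usable reply (ntpdate exits non-zero on failure).
fail=0
for i in $(seq 1 100); do
    ntpdate -q -p 1 -t 2 86.10.253.124 >/dev/null 2>&1 || fail=$((fail+1))
done
echo "lost $fail of 100 NTP queries"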

Laurence, one of the volunteer pool admins