Server score keeps dropping

monitoring
#1

I used Server Test - NTP to test my server and it gives back good results.
https://servertest.online/ntp

But my score on https://www.ntppool.org/scores/84.199.33.223 is still dropping.
This is the same server as https://www.ntppool.org/scores/2a02:1802:7e:152:4554:adc6:664c:a4ec; the only difference is the IP version. ntpq -p result:

remote           refid           st t when poll reach   delay  offset jitter
*ntp1.oma.be     .MRS.            1 u  811 1024   377  17.199   0.968  2.144
+ntp2.oma.be     .PPS.            1 u  771 1024   377  14.412   1.350  2.400
+d51a4c480.acces 193.67.79.202    2 u  291 1024   377  26.146   0.705  0.952
+d51a4c8ee.acces 193.190.230.65   2 u  550 1024   377  21.990  -1.069  2.376
-ptr-377wgf5b8vy 193.67.79.202    2 u  121 1024   377  18.922  -2.496  3.063
+ptr-377wgf593qt 193.190.230.65   2 u  232 1024   377  23.734  -0.659  2.210
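
For anyone who wants to cross-check a server directly, in the same spirit as the web test mentioned above, here is a minimal sketch of a single SNTP client query. It is not how the pool's monitor works; the function names (`build_request`, `parse_response`, `query`) and the 2-second timeout are my own choices for illustration.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request() -> bytes:
    """Build a 48-byte client request: LI=0, VN=4, Mode=3 (client)."""
    return bytes([0x23]) + bytes(47)


def parse_response(packet: bytes) -> dict:
    """Extract a few header fields and the transmit timestamp from a reply."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    first, stratum = packet[0], packet[1]
    tx_sec, tx_frac = struct.unpack("!II", packet[40:48])
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x7,
        "mode": first & 0x7,
        "stratum": stratum,
        "transmit_unix": tx_sec - NTP_EPOCH_OFFSET + tx_frac / 2**32,
    }


def query(host: str, timeout: float = 2.0) -> dict:
    """Send one request to host:123 over UDP (IPv4 or IPv6) and parse the reply.

    Raises a timeout exception if no reply arrives within `timeout` seconds.
    """
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, 123, type=socket.SOCK_DGRAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        started = time.monotonic()
        sock.sendto(build_request(), addr)
        data, _ = sock.recvfrom(512)
        rtt_ms = (time.monotonic() - started) * 1000
    result = parse_response(data)
    result["rtt_ms"] = rtt_ms
    return result
```

Calling `query("84.199.33.223")` and then `query("2a02:1802:7e:152:4554:adc6:664c:a4ec")` from a few different networks would show whether only the IPv4 path times out. A single successful reply says little, so it is worth repeating the query over several hours.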

#2

Did you see this thread? "Monitoring station has issue with the IPv4 part of my NTP"

Another person from Belgium seems to have similar network issues, so I would assume some link on the path between the monitoring station and your servers is causing the timeouts.

#3

Thanks, I was a bit worried there were some IPv4 issues, but our monitoring software doesn’t report any anomalies.
I hope this can be resolved soon. Thanks for your fast reply and the thread link.

#4

I have the same issue from Belgium. My IPv4 address stays negative and keeps dropping, while my IPv6 address has a perfect score. I tested everything with 2 different sites, one of which you also used, and both report that my server is doing fine.

Besides the servertest.online site you linked to, I also used this one: https://keetweej.vanheusden.com/query_ntp.php

Here’s my profile: https://www.ntppool.org/user/microchip

#5

First post here…
I have run my server ntp1.mgrey.se for 2+ years without any issues, always keeping a score above 10.
In the last 2-3 weeks I started to get a lot of ‘i/o timeout’ results from the Los Angeles station (every hour).
I cannot see any issues with my NTP server from outside using web tools and other NTP clients.
ntpdc -c sysstats localhost gives:

Only about 25-35 packets/s, but I also see about 13.5% ‘rate exceeded’
(I have discard average 5 minimum 1).

Any ideas? Bursts of bad traffic?
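
On the ‘rate exceeded’ figure: as I understand the ntpd documentation, `discard minimum 1` sets a 1-second minimum spacing between packets from one source, and `average 5` is a log2 value (a 32-second minimum average spacing). The sketch below is a much-simplified model of the minimum-spacing check only, not ntpd's actual algorithm; the function name and the sample arrival data are made up for illustration. It shows how a few clients sending bursts can push the flagged share up even at a modest overall packet rate.

```python
def rate_exceeded_fraction(arrivals, minimum=1.0):
    """Fraction of packets arriving less than `minimum` seconds after the
    previous packet from the same source.

    arrivals: iterable of (source, timestamp_seconds), sorted by timestamp.
    Simplified model: the average-spacing rule is ignored entirely.
    """
    last_seen = {}
    total = 0
    flagged = 0
    for source, ts in arrivals:
        total += 1
        previous = last_seen.get(source)
        if previous is not None and ts - previous < minimum:
            flagged += 1
        last_seen[source] = ts
    return flagged / total if total else 0.0


# Hypothetical traffic: client "a" sends in bursts, client "b" is well behaved.
example = [
    ("a", 0.0), ("a", 0.5), ("b", 0.6),
    ("a", 2.0), ("a", 2.2), ("b", 10.0),
]
print(f"{rate_exceeded_fraction(example):.1%} flagged")
```

If the flagged share comes from a handful of bursty sources, it would not by itself explain timeouts seen by a monitor that polls at a slow, regular interval.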

#6

In recent days my server has also faced similar problems; the score drops even under no load at all. Traceroute showed funny results:

Traceroute to 103.226.213.30
 1 gw-b.develooper.com (207.171.7.3) AS7012  0.239  0.195
 2 gi1-9.r01.lax2.phyber.com (207.171.30.13) AS7012  0.528  0.634
 3 te0-1-0-7.r04.lax02.as7012.net (207.171.30.61) AS7012  0.949  0.971
 4 hurricane-electric.as6939.any2ix.coresite.com (206.72.210.122)  0.653  0.606
 5 100ge2-2.core1.lax1.he.net (72.52.92.121) AS6939  0.705  0.751
 6 100ge12-1.core1.ash1.he.net (184.105.80.201) AS6939  65.373  65.384
 7 100ge5-1.core2.ash1.he.net (72.52.92.226) AS6939  55.394  55.400
 8 100ge8-1.core1.nyc5.he.net (184.105.81.149) AS6939  60.358  60.461
 9 100ge14-1.core1.nyc6.he.net (72.52.92.101) AS6939  60.981  61.103
10 peer1.nyc6.flagtel.com (198.32.160.88)  65.722  65.744
11 xe-2-2-0.0.pjr02.nyc007.flagtel.com (85.95.25.150) AS15412  189.079  189.012
12 (85.95.27.142) AS15412  189.758
12 ae2.0.pjr03.lax002.flagtel.com (85.95.27.30) AS15412  188.985
13 (85.95.27.102) AS15412  199.689
13 xe-2-0-3.0.pjr03.wad001.flagtel.com (85.95.27.138) AS15412  161.130
14 xe-0-0-1.0.eji01.tpe001.flagtel.com (85.95.26.122) AS15412  187.128  197.687
15 (80.77.2.202) AS15412  201.332
15 (80.77.2.198) AS15412  189.264
16  *  *
17  *  *
18 (103.31.197.122) AS131584  187.454  197.929
19 (103.31.197.86) AS131584  195.139  201.213
20 252-213-226-103-static.chief.net.tw (103.226.213.252) AS131584  199.671  190.570
21  *  *
22  *  *
23  *  *
24  *  *
25  *  *
26  *  *
27  *  *
28  *  *
29  *  *
30  *  *

So my packets crossed the American continent twice before crossing the Pacific… 😜
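
To find where the latency is added in a trace like the one above, a small script can pick out the largest RTT increase between consecutive responding hops. This is a rough sketch that assumes the line format shown in this thread (hop number first, RTTs in ms as the bare decimal tokens); the function names are mine.

```python
import re

# RTT values are bare decimals such as "65.373"; IP addresses are wrapped in
# parentheses in this format, so they never match as a whole token.
RTT = re.compile(r"\d+\.\d+")


def hop_rtts(traceroute_text):
    """Return {hop_number: minimum RTT in ms}, skipping hops that only show '*'."""
    best = {}
    for line in traceroute_text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].isdigit():
            continue  # header line or blank line
        rtts = [float(t) for t in tokens[1:] if RTT.fullmatch(t)]
        if not rtts:
            continue  # no reply at this hop
        hop = int(tokens[0])
        if hop in best:
            rtts.append(best[hop])  # same hop listed twice (load balancing)
        best[hop] = min(rtts)
    return best


def largest_jump(best):
    """Return (rtt_increase_ms, from_hop, to_hop) for the biggest increase."""
    hops = sorted(best)
    jumps = [(best[b] - best[a], a, b) for a, b in zip(hops, hops[1:])]
    return max(jumps) if jumps else None
```

Fed the first trace in this post, it should point at the step from hop 10 to hop 11 (roughly 123 ms added), which is where the path leaves New York on the flagtel network. Per-hop RTTs from traceroute are only indicative, since routers may deprioritise ICMP replies.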

#7

I cannot see any problem with my NTP server; my own external monitor looks fine.
I’m already out of the pool because of a low score, so I have scheduled my server for deletion now.

#8

We have the same problems in Finland at MIKES.
The attached image shows the number of IPv4 users for our public stratum-2 servers, which has fluctuated wildly since mid-February 2019 because the Los Angeles monitoring station produces erratic scores for us.

If there’s a mechanism for setting up a monitoring server we could contribute - but I would need detailed instructions.

regards,
AW

#9

And this caused an obvious penalty on my latency… 30+ ms added to the offset. It briefly returned to normal on Monday, as you can see on the graph.

#10

The traceroute is back to its normal state.

Traceroute to 103.226.213.30
 1 gw-b.develooper.com (207.171.7.3) AS7012  0.274  0.213
 2 gi1-9.r01.lax2.phyber.com (207.171.30.13) AS7012  0.498  0.545
 3 te0-1-0-7.r04.lax02.as7012.net (207.171.30.61) AS7012  0.765  0.779
 4 hurricane-electric.as6939.any2ix.coresite.com (206.72.210.122)  0.709  0.698
 5 flag-as-as15412.10gigabitethernet11.switch2.lax2.he.net (72.52.72.118) AS6939  0.718  0.681
 6 (85.95.25.246) AS15412  103.363
 6 xe-2-2-0.0.pjr03.wad001.flagtel.com (85.95.26.22) AS15412  96.200
 7 xe-0-0-1.0.eji01.tpe001.flagtel.com (85.95.26.122) AS15412  125.020  125.007
 8 (80.77.2.198) AS15412  131.369  124.196
 9  *  *
10  *  *
11 (103.31.197.122) AS131584  125.071  125.000
12 (103.31.197.86) AS131584  139.311  132.132
13 252-213-226-103-static.chief.net.tw (103.226.213.252) AS131584  125.388  125.653
14  *  *
15  *  *
16  *  *
17  *  *
18  *  *
19  *  *
20  *  *
21  *  *
22  *  *
23  *  *
24  *  *
25  *  *
26  *  *
27  *  *
28  *  *
29  *  *
30  *  *

#11

This seems to be a common problem. :-\

I recently tried adding https://www.ntppool.org/scores/54.232.82.232 which is an AWS instance in Brazil. My own monitoring of it suggests it’s perfectly reliable, but the monitoring server hates it.

Even adding it to the pool was unreliable; it first showed “Could not check NTP status” but worked the second time. I just got the same error trying to add it to the beta (is that still a thing?), which seems to be from LA as well. I’m inclined to just remove this server and not worry about it, but I can’t quite account for why I’m not seeing problems in my own monitoring, or why AWS’s network would be so unreliable without wider objection.

Traceroute was not too informative; AWS drops ICMP at most of its routers, but in general the path looks to be AWS - Seabone - NTT - Phyber outbound, and possibly straight Phyber - NTT - AWS inbound.

#12

For what it’s worth, I configured two US VPSes to use 54.232.82.232 as a server, just to keep an eye on it. One is an EC2 instance and one is a Linode.

I’m not carefully monitoring it, but it’s been reliable for the EC2 instance, and maybe 90% reliable for the Linode.

#13

Thanks! I ended up scheduling it for deletion, because it’s consistently had a score below 10. I double-checked that conntrack wasn’t secretly running and torpedoing connections, either. The monitoring server just hates this server.

#14

I still hardly make it to 10… (my own external monitoring is fine).

@ask, I think you should consider changing the penalty for ‘i/o timeouts’ if you want any servers still in the pool…
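
To get a feel for how much weight a timeout carries, here is a toy simulation. The update rule and its numbers (the score is multiplied by 0.95 on each probe, then +1 is added for a good reply or -5 for a timeout) are my assumption based on how the pool's scoring is commonly described; they are not taken from this thread, and the real monitor may differ.

```python
def simulate_score(results, start=0.0, decay=0.95, good_step=1.0, timeout_step=-5.0):
    """Return the score after each probe.

    results: iterable of booleans, True for a good reply, False for a timeout.
    All parameters are assumed values for illustration only.
    """
    score = start
    history = []
    for ok in results:
        score = score * decay + (good_step if ok else timeout_step)
        history.append(score)
    return history


# A server that always answers converges towards 1 / (1 - 0.95) = 20.
always_good = simulate_score([True] * 500)

# A server (or path) that loses one probe in six.
one_in_six = simulate_score(([True] * 5 + [False]) * 200)

print(round(always_good[-1], 2), round(max(one_in_six[-60:]), 2))
```

Under these assumed numbers, losing one probe in six keeps the simulated score below 10 indefinitely, even though about 83% of probes succeed. That matches the experience in this thread of a server that looks fine from elsewhere but never stays in the pool.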

#15

I have a server in China: https://www.ntppool.org/scores/166.111.68.210. It has the same problem. The score has never passed 10 since I added it to the pool; in fact, most of the time the score is below 0. But judging from my own monitoring results, I don’t think the server has any problem. Does anyone know why this happens? Thanks.
