I have spent days trying to find out why my server keeps getting bad scores.
I could not find the cause; no matter what I changed, the score never got better.
So I started tracing to the monitoring station and, lo and behold, somewhere between Zayo.com and Packet.net all packets are dropped.
It’s my belief that Zayo.com has problems, but I could be wrong.
Maybe others can run a trace as well?
This can’t be bad, can it?
root@server:/# chronyc tracking
Reference ID : 47505300 (GPS)
Stratum : 1
Ref time (UTC) : Fri Oct 11 13:25:44 2019
System time : 0.000012358 seconds fast of NTP time
Last offset : +0.000012565 seconds
RMS offset : 0.000442996 seconds
Frequency : 0.440 ppm fast
Residual freq : -0.177 ppm
Skew : 31.578 ppm
Root delay : 0.000000 seconds
Root dispersion : 0.001194 seconds
Update interval : 16.0 seconds
Leap status : Normal
root@server:/#
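For anyone who wants to compare these numbers over time instead of reading them by eye, here is a minimal Python sketch that pulls a few fields out of pasted `chronyc tracking` output. The function names (`parse_tracking`, `system_offset_seconds`, `summarize`) are made up for illustration, and the parser only assumes the "Name : value" layout shown above.

```python
import re

# One "Name : value" line of `chronyc tracking` output; shell prompt lines
# such as "root@server:/#" do not match and are skipped.
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z ()]*?)\s*:\s+(.+)$")

def parse_tracking(text):
    """Turn pasted `chronyc tracking` output into {field name: raw value}."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields

def system_offset_seconds(fields):
    """Signed offset: positive when the system clock is fast of NTP time."""
    parts = fields["System time"].split()
    magnitude = float(parts[0])
    return magnitude if "fast" in parts else -magnitude

def summarize(fields):
    """Pick out the values most people look at first."""
    return {
        "stratum": int(fields["Stratum"]),
        "offset_s": system_offset_seconds(fields),
        "root_dispersion_s": float(fields["Root dispersion"].split()[0]),
    }
```

This does not decide what counts as a good or bad value; it only makes the output easier to log and compare between runs.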
Hi Bas, sorry to hear you’re having problems. The best thing to do is log a ticket with your ISP so they can investigate and contact the peer and ask them to resolve / change their routing.
All of us whose traffic passes through that server have problems.
The server is in the USA near the monitoring station, well past the peering point near my ISP.
So contacting them is of no use.
Your data center should contact them, as they are probably a peer of theirs.
Opening a ticket here will do nothing; you need to contact packet-dot-net and tell them it’s not resolved at EWR1.
Hi Bas, only Ask has the relationship with the provider at the monitor end. He is unlikely to respond for some time, so the only options are to ignore the problem and wait for it to improve on its own, or for you to contact your ISP, as you have the relationship with them. Other people have contacted their providers, who have happily looked at and resolved routing issues between their ends and the monitoring server.
Sorry I can’t be of more help but you have to remember that the pool is a free service run by volunteers.
I understand that it’s all volunteers; I’m one too.
However, many of us get spammed with automated messages about being removed from the pool.
I have been investigating the matter for more than 2 weeks now and have tested everything possible.
And it turned out to be packet-dot-net, which hasn’t fixed its faulty server, as I posted before.
I’m running 2 servers, and every time we hit EWR1 the packets are lost 100%.
I know what my ISP will tell me: that they have no relationship with that peer and as such it’s not their problem.
However, I will contact Packet myself and tell them about the problem; maybe it helps.
I agree it’s annoying - I was just writing a proposed update to the alert email that gets sent, to add some troubleshooting steps, as we seem to be getting quite a few queries recently!
I think I would push the ISP harder - you pay them to deliver traffic and they’re not doing it properly! Having set up our own servers, we probably have more knowledge than those who haven’t - I don’t think ISPs could expect most people to be able to diagnose where routing isn’t working, or expect people to contact that peer directly! The routing is within the ISPs’ and peers’ control, not ours. As I say, we’ve had reports back from other people whose ISPs have contacted peers and got things fixed. Good luck!
Well, I used to be a troubleshooting engineer and a web hoster myself.
Doing this kind of stuff is fun!
Still hosting but just for friends, not to get paid.
I just got word from packet and they are going to contact NTPpool to sort it out.
Nice!
We are getting somewhere: there is a problem with the monitoring.
They asked me to use the beta server as well.
So I did.
On the normal server, ntp1.heppen.be scores rubbish, but on the beta server it has no issues at all; its score is near 20.
But my second server, ntp2.heppen.be, is perfect on the normal server but rubbish on the beta server, with a score of about 5.6.
As the beta server is in Amsterdam and not in the USA, the path to it is different.
In my opinion the monitor can’t be trusted if it has to follow a path that drops UDP messages.
The uptime testing should be more robust and not assume your server is bad just because the path to you is broken.
If you have troubles with the blue-line and thus bad scores, it is suggested you enter your server here:
Since nobody has mentioned it yet, I’ll point out that the line
18.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
is meaningless. It doesn’t mean that the node is dropping packets; it means that the node is not responding to the traceroute probes (which it is not required to do).
It’s clear that this is a non-issue because the subsequent lines in the trace show 0% loss. In general, people assign far too much value to traceroute. The best you can hope for is a series of nodes dropping packets, in which case you may be able to guess that the problems start at the first of those nodes. Without a trace from the other end, it’s a fairly weak guess because on the modern internet you’re very likely to have a completely different route in the other direction.
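That rule of thumb can be written down as a rough Python sketch: loss at a hop only counts if it carries on through every later hop to the end of the trace. The function names and the sample hostnames are made up for illustration, and the parser assumes report lines in the same layout as the one quoted above.

```python
import re

# One hop line, e.g. "18.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0"
# Captured: hop number, host, loss percentage (the "%" sign is optional).
HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+(\d+)\b")

def parse_hops(report):
    """Return a list of (hop number, host, loss percent) tuples."""
    hops = []
    for line in report.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2),
                         float(match.group(3))))
    return hops

def first_suspect_hop(hops, threshold=0.0):
    """Return the first hop of the lossy run that reaches the final hop.

    Returns None when the final hop shows no loss above the threshold,
    because loss at earlier hops then only means those routers did not
    answer the probes.
    """
    suspect = None
    for hop in reversed(hops):
        if hop[2] > threshold:
            suspect = hop
        else:
            break
    return suspect

# Hypothetical trace: hop 18 is silent, but later hops answer fine.
sample = """
 17.|-- a.example 0.0 10 90.1 90.3 90.0 91.0 0.3
 18.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
 19.|-- b.example 0.0 10 92.4 92.6 92.1 93.2 0.4
"""
print(first_suspect_hop(parse_hops(sample)))  # prints None
```

Even when this does return a hop, it is still only a guess about the forward path, since the return route may be completely different.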
One thing that might be nice to add in future would be the ability to see a trace from the monitoring station.