Server monitoring

ebahapo · September 28, 2021, 4:47pm

What happens with the monitoring out of San Jose? Is it only me?

The monitoring out of Amsterdam seems fine though.

NTPman · September 28, 2021, 4:49pm

A sudden jump in offset followed by a jump back normally signifies a change in the routing between your NTP server and the monitoring station.

stevesommars · September 28, 2021, 6:07pm

Some additional data to support NTPman’s response. This plot shows additional delay information as seen from the San Jose monitor:

The top plot shows the one-way delays (this assumes that both hosts have accurate time).
The middle plot shows the computed offset, similar to that shown in the server monitoring plot.
The bottom plot shows the round-trip time which is insensitive to the two host clocks.
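For anyone who wants to reproduce the three quantities from raw timestamps, here is a minimal sketch. It assumes the standard NTP on-wire calculation (RFC 5905) with four timestamps; the function name and the sample numbers are made up for illustration and are not taken from the plot.

```python
def ntp_metrics(t1, t2, t3, t4):
    """Derive the plotted quantities from the four NTP timestamps (seconds).

    t1: monitor sends request      (monitor clock)
    t2: server receives request    (server clock)
    t3: server sends response      (server clock)
    t4: monitor receives response  (monitor clock)
    """
    return {
        # One-way delays are only meaningful if both clocks are accurate.
        "request_delay": t2 - t1,
        "response_delay": t4 - t3,
        # Computed offset: assumes the two directions take equally long.
        "offset": ((t2 - t1) + (t3 - t4)) / 2,
        # Round-trip time: the clock error cancels out here.
        "rtt": (t4 - t1) - (t3 - t2),
    }


# Hypothetical exchange: both clocks are correct, but the request takes
# 40 ms and the response only 10 ms (1 ms of processing on the server).
m = ntp_metrics(0.000, 0.040, 0.041, 0.051)
print(m)
```

With these made-up numbers the 30 ms asymmetry shows up as an apparent offset of about 15 ms, even though neither clock is wrong. That is why a one-directional routing change looks like a step in the offset graph.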

The NTP response delay (your server → San Jose) decreased briefly. The impact on the NTP offset seen in San Jose is only 15 msec, which isn’t impacting scores. If you want to probe deeper, run traceroutes from your NTP server towards 2604:1380:1001:d600::1, San Jose’s IPv6 address.

Bas · October 1, 2021, 2:53pm

It’s not only you, it happens to all of us. The monitor(s) have poor/overloaded peers to the datacenter(s), which causes timeouts and, as a result, drops in the graph.
The monitors are not stable.

We all complained about them for years.

ebahapo · October 3, 2021, 2:47am

Ok, but what’s going on with the sampling period? It used to be 1024s, but now it seems to be variable?

ts_epoch,ts,offset,step,score,monitor_id,monitor_name,leap,error
1633226768,"2021-10-03 02:06:08",-0.01438497,1,20,10,"San Jose, CA, US",0,
1633226768,"2021-10-03 02:06:08",-0.01438497,1,20,,,0,
1633224895,"2021-10-03 01:34:55",-0.013908964,1,20,10,"San Jose, CA, US",0,
1633224895,"2021-10-03 01:34:55",-0.013908964,1,20,,,0,
1633219448,"2021-10-03 00:04:08",-0.014651828,1,20,10,"San Jose, CA, US",0,
1633219448,"2021-10-03 00:04:08",-0.014651828,1,20,,,0,
1633216445,"2021-10-02 23:14:05",-0.014050207,1,20,10,"San Jose, CA, US",0,
1633216445,"2021-10-02 23:14:05",-0.014050207,1,20,,,0,
1633214859,"2021-10-02 22:47:39",-0.015255415,1,20,10,"San Jose, CA, US",0,
1633214859,"2021-10-02 22:47:39",-0.015255415,1,20,,,0,
1633212638,"2021-10-02 22:10:38",-0.014175246,1,20,10,"San Jose, CA, US",0,
1633212638,"2021-10-02 22:10:38",-0.014175246,1,20,,,0,
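To put numbers on the sampling period, the gaps between consecutive checks can be computed from the log above. This is a rough sketch using only the rows pasted here; the function name is made up, and skipping the rows with an empty monitor_id is an assumption (they appear to duplicate the San Jose rows).

```python
import csv
import io

# The CSV rows exactly as pasted above.
LOG = '''\
ts_epoch,ts,offset,step,score,monitor_id,monitor_name,leap,error
1633226768,"2021-10-03 02:06:08",-0.01438497,1,20,10,"San Jose, CA, US",0,
1633226768,"2021-10-03 02:06:08",-0.01438497,1,20,,,0,
1633224895,"2021-10-03 01:34:55",-0.013908964,1,20,10,"San Jose, CA, US",0,
1633224895,"2021-10-03 01:34:55",-0.013908964,1,20,,,0,
1633219448,"2021-10-03 00:04:08",-0.014651828,1,20,10,"San Jose, CA, US",0,
1633219448,"2021-10-03 00:04:08",-0.014651828,1,20,,,0,
1633216445,"2021-10-02 23:14:05",-0.014050207,1,20,10,"San Jose, CA, US",0,
1633216445,"2021-10-02 23:14:05",-0.014050207,1,20,,,0,
1633214859,"2021-10-02 22:47:39",-0.015255415,1,20,10,"San Jose, CA, US",0,
1633214859,"2021-10-02 22:47:39",-0.015255415,1,20,,,0,
1633212638,"2021-10-02 22:10:38",-0.014175246,1,20,10,"San Jose, CA, US",0,
1633212638,"2021-10-02 22:10:38",-0.014175246,1,20,,,0,
'''


def sampling_intervals(csv_text, monitor_id="10"):
    """Return the gaps in seconds between consecutive checks, oldest first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Keep one timestamp per check for the chosen monitor only.
    epochs = sorted({int(r["ts_epoch"]) for r in rows
                     if r["monitor_id"] == monitor_id})
    return [later - earlier for earlier, later in zip(epochs, epochs[1:])]


print(sampling_intervals(LOG))  # [2221, 1586, 3003, 5447, 1873]
```

So in this sample the gaps range from roughly 26 to 91 minutes, and none of them is close to 1024 s.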

marco.davids · October 16, 2021, 8:29am

This seems to apply to IPv4 much more than to IPv6, and to Europe more than elsewhere. Last night it was bad again over here, to the extent that it starts posing a risk to the stability of the pool, because all of a sudden quite a few servers might be dropped from the pool.

It seems that the beta with multiple vantage points has fewer problems.

Screenshot 2021-10-16 at 10.27.12

Screenshot 2021-10-16 at 10.27.26

Bas · October 20, 2021, 5:10pm

The monitor is crap, has been for years.
I’m done.

All servers deleted again.

It will never be resolved.

Ask doesn’t do anything to fix his monitor.

ebahapo · October 20, 2021, 6:54pm

The monitor is the Achilles heel of the pool. I’d love to help, if only @ask asks for it.

aubergine · October 21, 2021, 7:57pm

It’s sad. Since the San Jose monitor came into use, things have gotten worse for me as well, and the Newark monitor was already bad.

The beta system is doing much better and also has an EU monitoring station implemented. Sadly it looks like it will be beta forever. No idea why all servers still get monitored from a US machine, while most pool servers are based in Europe.

Bas · October 23, 2021, 4:06pm

I wrote an entire paper on how the monitor should work a few years ago.
Nothing happened.

Loads of good servers are being de-listed only because the monitor fails to reach them.
I have spent a lot of time finding the problem, and it led back to the monitor every time.

My conclusion is that @ask has no interest in this project and ignores all input on improving it.

There are a lot of people here that want to fix the monitor issue, but ‘management’ doesn’t care one bit.

The pool is heading nowhere as there is no intention to fix issues.
This has been going on for years and years.
