Beta system now has multiple monitors

bhueske · April 6, 2018, 5:55am

Hi,

The same goes for me. In the production pool everything is OK with my servers, but from the beta system I get e-mails saying that my servers have a problem.

https://web.beta.grundclock.com/user/urmvj7ct6ryi2dx6zmq

My restrict settings:
restrict default limited kod nomodify notrap nopeer noquery
restrict -6 default limited kod nomodify notrap nopeer noquery

I have no idea what is going on there.

ask · April 8, 2018, 6:00am

Hi,

It looks like the “4 samples” monitor was getting “not in sync” responses. Maybe it’s being rate limited? Which version of ntpd do you run? I wonder if the monitor is more aggressive than ‘ntpdate’ (which is what I was trying to emulate with the 4 requests, 2 seconds between each).

https://web.beta.grundclock.com/scores/78.46.60.40/log?limit=500&monitor=12
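
For reference, the ntpdate-like behaviour described above (4 requests, 2 seconds between each) can be sketched roughly as follows. This is a minimal illustration in Python, not the monitor's actual code; the function names, the IPv4-only socket and the 1-second timeout are assumptions made for the example.

```python
import socket
import struct
import time

NTP_PORT = 123
SAMPLES = 4      # requests per check, as described in the post
SPACING = 2.0    # seconds between requests, as described in the post


def build_request(version=4):
    """Return a 48-byte NTP client request (LI=0, VN=version, Mode=3)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return struct.pack("!B", first_byte) + bytes(47)


def send_times(samples=SAMPLES, spacing=SPACING):
    """Nominal offsets (seconds from the start) at which requests are sent."""
    return [i * spacing for i in range(samples)]


def probe(host, samples=SAMPLES, spacing=SPACING, timeout=1.0):
    """Send spaced requests to host; return raw replies (None on timeout)."""
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for i in range(samples):
            if i > 0:
                # Pause before every request except the first, so the
                # server never sees two requests back to back.
                time.sleep(spacing)
            sock.sendto(build_request(), (host, NTP_PORT))
            try:
                data, _ = sock.recvfrom(512)
                replies.append(data)
            except socket.timeout:
                replies.append(None)
    return replies
```

Calling `probe()` with a server address sends the four spaced requests; dropping the `time.sleep()` call would turn it into a back-to-back burst, which is the kind of traffic a `limited` restriction is meant to catch.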

bhueske · April 8, 2018, 9:47am

Hi,

Both time servers run Ubuntu 16.04 with ntpd version 4.2.8p4.

After reviewing some online sources and this thread, I added the following to the configuration: discard average 2 minimum 1. Since then it seems to be OK. I think this was an issue with rate limiting.

ask · April 9, 2018, 6:00am

[ eh, I was quoting 4.1.1 docs – no wonder I was confused by this! So edited below ]

I put ‘limited’ in the default configuration suggestions on http://www.pool.ntp.org/join/configuration.html – so assuming that’s reasonable then the monitoring system should be working with that default configuration.

The system should be waiting 2 seconds between each query and the default “minimum” setting is that, so I don’t understand why it’s not working. @bhueske I haven’t taken time to read the tcpdump you sent properly. Can you summarize it for me (and everyone else)?

http://doc.ntp.org/current-stable/accopt.html
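
To make the expectation above concrete: if only the “minimum” guard time is considered, requests spaced 2 seconds apart should pass a 2-second minimum, while back-to-back requests should not. Below is a simplified sketch of that single check in Python. It deliberately ignores the “average” setting and is not ntpd's actual implementation; the function name and the strict less-than comparison are assumptions for illustration.

```python
def limited_flags(arrival_times, minimum=2.0):
    """Mark each request as rate limited if it arrives sooner than
    `minimum` seconds after the previous request from the same client.

    Simplified model: only the guard time is checked; the average
    headway ("discard average") is ignored.
    """
    flags = []
    last = None
    for t in arrival_times:
        flags.append(last is not None and (t - last) < minimum)
        last = t
    return flags


# Four requests spaced 2 seconds apart: none is flagged.
print(limited_flags([0, 2, 4, 6]))   # [False, False, False, False]

# Four requests with no delay: all but the first are flagged.
print(limited_flags([0, 0, 0, 0]))   # [False, True, True, True]
```

Under this model a burst with no delay is still flagged with `minimum=1`, so a monitor that really waits 2 seconds and one that does not should be easy to tell apart in a packet capture.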

mnordhoff · April 9, 2018, 6:12am

Do the monitoring systems for the normal pool and beta pool query independently from the same IPs? If so, and a server is in both pools, and they both happened to query around the same time, it could be a problem. But that sounds quite rare.

ask · April 9, 2018, 7:41am

The “4 samples” monitor should be using a separate IP. The “normal” one shares an IP between the normal pool and the beta pool (currently).

ask · April 10, 2018, 9:07am

Yikes! I think the “Los Angeles (4 samples)” beta monitor was running old code that wasn’t waiting 2 seconds between each query. I fixed it just now. Hopefully that solves the unexpected errors. :-/ Sorry about that!

ahferroin7 · April 13, 2018, 11:21am

It looks like there may still be some issues with the LA (4 samples) beta monitor (either that or I’m having inexplicable routing issues with only that monitoring node). Every other monitor in the beta system, plus all the ones on the main system, are having no issues with my systems, but that LA (4 samples) beta monitor is consistently failing to talk to my systems.

mlichvar · May 6, 2018, 12:45pm

I think I still see the “4 samples” monitor sending 4 requests with no delay between them. I wanted to look at the code to see how it works. The original post says the old monitor written in Perl was replaced, but there is no link to the new one.

ask · May 6, 2018, 10:24pm

Yikes! Indeed. I made the change in February, but either messed up pushing it to the server or restarting the right process. My workflow hadn’t caught up to two IPv4 monitors running on the same server, I think.

Thank you for pointing this out. I restarted it now and verified the behavior by staring at tcpdump for a bit.

The code for the monitor is on GitHub: ntppool/monitor (monitoring agent for the NTP Pool).

mlichvar · May 7, 2018, 4:19pm

Thanks for looking into it.

I’m not sure if it’s intended, but I no longer see the monitoring host making any bursts. In tcpdump output I see about 6 requests per hour and it’s a mix of NTPv3 and NTPv4 requests.
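
For anyone checking the same thing in a capture: the NTP version is carried in the first byte of each packet, together with the leap indicator and the mode. Here is a small Python sketch for decoding that byte. The function name is made up for this example, and the sample bytes are illustrative values, not taken from the capture mentioned above.

```python
def decode_first_byte(b):
    """Split the first byte of an NTP packet into (leap, version, mode)."""
    leap = (b >> 6) & 0x3      # leap indicator, top 2 bits
    version = (b >> 3) & 0x7   # version number, middle 3 bits
    mode = b & 0x7             # mode, low 3 bits (3 = client, 4 = server)
    return leap, version, mode


print(decode_first_byte(0x1B))  # (0, 3, 3): NTPv3 client request
print(decode_first_byte(0x23))  # (0, 4, 3): NTPv4 client request
```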

curbynet · June 20, 2018, 2:43pm

Is the beta site intended to be more strict when it comes to handing out scores?

E.g.

http://www.pool.ntp.org/scores/72.14.181.128
https://web.beta.grundclock.com/scores/72.14.181.128

I noticed that the graphs have different time scales, but even accounting for that, the scores from the new site are consistently lower. By the way, is there a means (like an HTTP GET variable) to sync up the x axes of the graphs between the two sites?

Thanks!

ask · August 11, 2018, 6:53am

This topic was automatically closed 51 days after the last reply. New replies are no longer allowed.
