Load feedback from the NTP servers to the pool infrastructure?

I think it is a valid concern. That got me thinking about the following:

What if the NTP servers were able to report the actual QPS value back to the pool system?

That would create a closed feedback loop and allow the pool system to fine-tune the traffic level sent to a particular NTP server.


That would probably miss the bozos who hit some arbitrary minute on the hour. Realistically it should be Qph, and there is no good option to decrease traffic levels other than opting out and hoping the bozos do not flood you. Particularly given clown clients that triple their query rate every time they’re told to go away (via KOD, ICMP administratively prohibited, or unnaturally lousy time). [/rant]

In the more innocent past, the Pool monitoring system could have used ntpdc/ntpq to scrape servers’ metrics…

I am thinking of submitting a JSON object with members for the packet count and the time period.
That would allow QPS or QPh, or any arbitrary period, for example since the previous submission. It is flexible. It would then be up to the pool infrastructure how to use those values.

Another member of the JSON object would be the list of IP addresses for multi-homed systems; I have a system with both IPv4 and IPv6 addresses, and the NTP daemon does not report the packet rate separately per protocol. One more member of the object would be a credential token, retrievable from the management interface after authenticating to it.
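
Something like this, for example (field names and values are placeholders I made up; nothing like this exists in the pool today):

```python
import json
import time

# Sketch of the proposed report: a packet count plus the period it covers
# (here: since the previous submission), the server's addresses for a
# multi-homed host, and a credential token from the management interface.
# All field names are invented for illustration.
report = {
    "token": "example-credential-token",          # placeholder
    "addresses": ["192.0.2.10", "2001:db8::10"],  # documentation-prefix examples
    "packets_received": 1234567,
    "period_seconds": 3600,
    "reported_at": int(time.time()),
}
print(json.dumps(report, indent=2))
```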

Try ntpq -c "keyid #" -c "passwd controlkeypass" -c ifstats

Cheers,
Dave Hart


The system is counting DNS queries per second, so that's not completely impossible; though typically for this type of work the thing being measured will just count, and the thing doing the measuring will calculate a rate from that. For per-second measurements, the thing being measured has to report the data with timestamps.
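
For example, a rough sketch of that rate calculation from two (counter, timestamp) samples:

```python
def query_rate(prev_count, prev_time, count, now):
    """Derive an average queries-per-second figure from two
    (counter, timestamp) samples reported by a server."""
    elapsed = now - prev_time
    if elapsed <= 0 or count < prev_count:
        return None  # clock went backwards or the counter was reset
    return (count - prev_count) / elapsed

# Example: 90,000 more queries seen over a 3,600-second reporting interval.
print(query_rate(1_000_000, 0, 1_090_000, 3_600))  # -> 25.0 QPS
```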

More than ten years ago I built a tool to try to figure out how many clients are behind each DNS server. The idea was to see if giving really popular DNS servers a shorter TTL (or more IPs, or something else) would distribute the load better. https://www.mapper.ntppool.org/

I can’t find the code now, but at some point I also looked into what it’d take to have a tool server operators could run to count the NTP queries. (I think I used pcap to count on the network device, but it was awkward, so if I tried again I’d use the chrony and ntpd control channels instead, assuming the daemons can count reliably – though as @NTPman pointed out, that won’t work well on dual-stack servers. Firewall rules with counters are another option, but that’d require manual configuration across a bunch of different platforms to set up the rules; I don’t think a tool like this should mess with your firewall rules.)
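
For chrony, that control-channel approach could be as simple as scraping `chronyc serverstats`, which exposes a cumulative “NTP packets received” counter. A rough sketch (the exact output wording may differ between chrony versions, so treat the parsing as illustrative):

```python
import re
import subprocess

def ntp_packets_received():
    """Read chrony's cumulative NTP packet counter via the control channel.
    Assumes `chronyc serverstats` prints a line like
    'NTP packets received : 12345'; adjust for your chrony version."""
    out = subprocess.run(["chronyc", "serverstats"],
                         capture_output=True, text=True, check=True).stdout
    match = re.search(r"NTP packets received\s*:\s*(\d+)", out)
    return int(match.group(1)) if match else None

# Sample this periodically and report the deltas, as discussed above.
```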

It was a surprising amount of work (tweaking and fussing over, I think, several years) to get the DNS query logs and the infrastructure that counts and processes them working reliably all the time, but those tools are in place to handle NTP query count data as well.

The new “zone generation” tool will use the DNS query logs (they are summarized by the country/continent of the client) as a proxy for “how many NTP queries”.

It’d be interesting to find out how accurate that actually is, but I don’t know how easily we could do that without having the tools set up to count on a bunch of servers (and the database tables, tools, etc. to load and query the data), so I’ve been pushing it down the list of things to explore.

ntpq’s ifstats does break down packets received and sent per local address, so you would be able to differentiate IPv4 and IPv6 counts, and ignore loopback traffic and private-net traffic as needed.

I think the end user system (the server volunteer) is best placed to determine what is an acceptable load for them, so how about something simpler: an API call to set a pool server to “monitoring only”, i.e. the “temporarily stop handing this IP out” state that is available from the web interface.

I understand from previous discussions that this does fairly rapidly reduce load, though obviously there is a long period of caching, abusers are perhaps more likely to hang on to IPs than typical clients, and nothing stops a focused attacker from keeping hold of past pool IPs forever.
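
As a sketch, a client for that could look something like this (the endpoint and parameters are entirely made up; no such API exists today):

```python
import urllib.parse
import urllib.request

def set_monitoring_only(server_ip, api_token, enabled=True):
    """Ask a hypothetical pool management API to stop handing out this IP,
    equivalent to the 'temporarily stop handing this IP out' web toggle."""
    data = urllib.parse.urlencode({
        "server": server_ip,
        "monitoring_only": "1" if enabled else "0",
        "token": api_token,  # credential tied to the operator's account
    }).encode()
    req = urllib.request.Request("https://manage.example.org/api/server-mode",
                                 data=data)
    # return urllib.request.urlopen(req)  # uncomment to actually call it
```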

If this were implemented, it would of course result in under-served zones like RU and CN emptying out even further. I have separately suggested that the pool DNS servers could be more willing to offer IPs from nearby geographic regions when the number of servers in a zone gets too low.

Problem: how do we report back? User/pass?

You need a system that keeps track but is also secure.
Otherwise hackers will inject values and bring the system down.

I would suggest a new function in the NTP/chrony code that allows reporting back.
It could be built into the server-verification function, so that it hands out a user/pass to report back with.

People who do not verify get no credentials and keep running on the old system.

We already have the verification step, so why not combine it with user/pass reporting of some sort?
Then the pool DNS can take you out based on REAL traffic rather than guesses.
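
Roughly, the report could then be authenticated with the user/pass handed out at verification, something like this (endpoint and credentials are invented for illustration):

```python
import base64
import json
import urllib.request

# Credentials hypothetically handed out by the server-verification step.
user, password = "pool-server-192.0.2.10", "secret-from-verification"

report = json.dumps({"packets_received": 1234567, "period_seconds": 3600}).encode()
req = urllib.request.Request(
    "https://manage.example.org/api/traffic-report",   # invented endpoint
    data=report,
    headers={
        "Content-Type": "application/json",
        # HTTP Basic auth with the issued user/pass, so the pool can reject
        # injected values from hosts that never verified.
        "Authorization": "Basic "
        + base64.b64encode(f"{user}:{password}".encode()).decode(),
    },
)
# urllib.request.urlopen(req)  # uncomment to actually submit
```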

Beware: DNS proxies can still make requests, but I think it would still be a lot better.

However, when a server is taken out, the monitor should STILL be able to access it… else you probably never get back online, or only way too late.