Perhaps it would not be unreasonable for the NTP Pool project to declare a per-IP query rate (queries per second) that a pool volunteer is expected to support, with everything beyond that accepted as being at risk of being dropped?
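To make that concrete, here is a rough sketch of what "support N queries/IP/second, drop the rest" could look like on a volunteer's side, using a per-IP token bucket. The class name, the `rate` and `burst` parameters and the numbers are all my own assumptions for illustration, not anything the pool specifies.

```python
import time


class PerIpLimiter:
    """Token bucket per client IP (hypothetical helper, not pool software)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # tokens (queries) refilled per second, per IP
        self.burst = burst    # maximum tokens an IP can accumulate
        self.clock = clock    # injectable so the behaviour can be tested
        self.buckets = {}     # ip -> (tokens, time of last update)

    def allow(self, ip):
        """Return True if the query should be answered, False if dropped."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed since this IP was last seen.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

A real deployment would also need to expire idle entries from `buckets`, and a single busy CGNAT address would hit the limit quickly, which is exactly the concern here.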
As IPv4 depletion gets worse and more clients get put behind CGNAT, I worry that the growth in per-IP query traffic is not sustainable for a volunteer project. Maybe it is acceptable to say that large CGNAT populations can’t be accommodated and should query using IPv6?
Of course, whether you drop the traffic or not, you can’t stop it arriving, which is also a problem. Can we brainstorm ways for pool volunteers to quickly and temporarily remove themselves from the pool when they get overwhelmed? A lower DNS TTL and a simple API for a volunteer to say “enough is enough, remove me until I say I want back in”?
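As a sketch of how small that API could be: the registry below keeps a volunteer on the books but stops handing out their address while they are paused, and attaches a short TTL to every answer. `PoolRegistry`, its method names and the 30-second TTL are hypothetical; the real pool's DNS and management systems are not described here.

```python
class PoolRegistry:
    """Tracks which volunteers are currently handed out in DNS answers."""

    def __init__(self, ttl_seconds=30):
        self.ttl_seconds = ttl_seconds  # short TTL so a pause takes effect fast
        self.volunteers = {}            # volunteer_id -> server IP
        self.paused = set()             # volunteers who have said "enough"

    def register(self, volunteer_id, ip):
        self.volunteers[volunteer_id] = ip

    def pause(self, volunteer_id):
        """'Enough is enough, remove me until I say I want back in.'"""
        if volunteer_id not in self.volunteers:
            raise KeyError(volunteer_id)
        self.paused.add(volunteer_id)

    def resume(self, volunteer_id):
        self.paused.discard(volunteer_id)

    def dns_answers(self):
        """Return (ip, ttl) pairs for volunteers that are not paused."""
        return [
            (ip, self.ttl_seconds)
            for vid, ip in sorted(self.volunteers.items())
            if vid not in self.paused
        ]
```

Pausing only stops new answers from including the address. Clients that already hold the IP keep sending, so this shortens the tail of traffic rather than cutting it off.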
If one were designing an NTP Pool v2 what radical changes would one make to try to address abusive clients?
Off the top of my head:
With the rise of cheap bring-your-own-IP VM providers like Vultr, how about an NTP pool that is dispersed across, say, 12 or so locations worldwide and anycasts its own IPv6 space?
The pool proxies to its volunteers at their own IPv4 /32 and/or IPv6 /128 but only ever presents its own IPs (just on IPv6 - it’s a new world out there) to the clients, and distributes client queries amongst its local volunteers. Since NTP clients keep track of servers by IP, you should not put different servers behind the same IP, so each volunteer would need a unique IPv6 address to be known to the world by, but there are 2^64 (about 1.8 × 10^19) of them in a /64.
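The address bookkeeping for this is simple enough to sketch. Below, each volunteer gets one public address carved out of a pool-owned /64, and the proxy looks up the volunteer's real address from it. The prefix is the IPv6 documentation range, and `public_address` and `ProxyTable` are names I made up for illustration.

```python
import ipaddress

# Assumption: a /64 owned by the pool (documentation prefix used here).
POOL_PREFIX = ipaddress.IPv6Network("2001:db8:0:1::/64")


def public_address(volunteer_index):
    """Map a volunteer number to its unique public address in the /64."""
    if not 1 <= volunteer_index < POOL_PREFIX.num_addresses:
        raise ValueError("volunteer index does not fit in the prefix")
    return POOL_PREFIX.network_address + volunteer_index


class ProxyTable:
    """One public pool address per volunteer, never shared between servers."""

    def __init__(self):
        self.backends = {}  # public pool address -> volunteer's real address

    def add(self, volunteer_index, backend):
        addr = public_address(volunteer_index)
        self.backends[addr] = ipaddress.ip_address(backend)
        return addr

    def lookup(self, public):
        """Return the backend for a public address, or None if unknown."""
        return self.backends.get(ipaddress.ip_address(public))
```

Because the mapping is one-to-one, a client that pins a pool address always reaches the same volunteer, which is what NTP clients expect.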
The thing is, if a volunteer says they have had enough they can disable the proxying and then the pool itself sinks the traffic, not the volunteer. In this way, the idea of allocating a given queries-per-second and monthly GB of traffic to be donated to the pool could be normalised, as once you say “enough”, you do not get any more traffic from the pool.
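Roughly, the per-volunteer decision at a pool PoP might look like the sketch below: forward while the volunteer is enabled and within their donated monthly volume, otherwise sink the packet at the pool. The `Donation` class and the byte accounting are assumptions of mine; the 96-byte figure assumes a plain 48-byte NTP packet with no extensions plus UDP and IPv6 headers, and that each request gets an equal-sized response.

```python
# Assumed on-the-wire size: 48-byte NTP payload + 8-byte UDP + 40-byte IPv6.
NTP_PACKET_BYTES = 48 + 8 + 40


class Donation:
    """A volunteer's donated monthly traffic, tracked by the pool's proxy."""

    def __init__(self, monthly_bytes):
        self.monthly_bytes = monthly_bytes
        self.used_bytes = 0
        self.enabled = True

    def enough(self):
        """Volunteer opts out: the pool sinks everything from now on."""
        self.enabled = False

    def route(self, request_bytes=NTP_PACKET_BYTES):
        """Return 'forward' to proxy the query, or 'sink' to drop it at the pool."""
        cost = 2 * request_bytes  # request plus an equal-sized response
        if not self.enabled or self.used_bytes + cost > self.monthly_bytes:
            return "sink"
        self.used_bytes += cost
        return "forward"
```

The monthly counter would need resetting on a schedule, and a queries-per-second cap could sit alongside it in the same check.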
A very understanding upstream may even let the pool blackhole an individual IP at its border using a BGP community, but that is an advanced topic.
There would be increased latency introduced by the proxying stage between the pool point-of-presence and the volunteer server, but as long as the pool was a bit selective about which volunteers it accepted at which PoP (e.g. automatically drop candidates whose latency/jitter grows too severe), I think the NTP protocol should cope with this. I think it would be better than entire regions going dark because all volunteers were DDoS’ed out.
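The "be a bit selective" part could be as simple as the check below, run per volunteer per PoP over recent round-trip measurements. The thresholds, the function name and the choice of standard deviation as the jitter measure are all assumptions on my part; a real monitor might well use something closer to what NTP itself computes.

```python
from statistics import mean, pstdev


def keep_volunteer(rtt_ms, max_rtt_ms=50.0, max_jitter_ms=5.0):
    """Decide whether a volunteer stays attached to a PoP.

    rtt_ms: recent PoP-to-volunteer round-trip times in milliseconds.
    Jitter is approximated here as the standard deviation of the samples.
    """
    if len(rtt_ms) < 2:
        return True  # not enough data to judge yet
    return mean(rtt_ms) <= max_rtt_ms and pstdev(rtt_ms) <= max_jitter_ms
```

For example, `keep_volunteer([10, 11, 10, 12])` keeps the volunteer, while a set of samples swinging between 10 ms and 95 ms fails the jitter test even if the average is under the latency limit.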
Fund the project through Patreon at the lowest feasible monthly contribution, or through another subscription method compatible with small transactions. If necessary, only give out per-client-org DNS names to paying users, in the same way that various DNSBLs do.
A pool volunteer would stop being able to tell who their clients were, though (they’d only see pool IPs). It could be taken a step further and every pool volunteer iBGP peers an IPv6 /64 with the pool. That way they directly receive the NTP traffic to their individual /64 but can tear down the BGP session whenever they like, becoming unreachable. The proxying step becomes a routing step.
There is probably a fatal flaw in all of this that I have overlooked.