What is a reasonable limit?

Ok, the thesis can be found here: https://www.sidnlabs.nl/en/publications#theses under ‘A Day in the Life of NTP: Analysis of NTP Pool Traffic’. It is quite a long read, so let me try to summarise some of the more relevant highlights.

Our graduate collected 24 hours of NTP-queries on our 30 globally ‘anycasted’ NTP-nodes, resulting in 13.67 billion NTP queries from ≈158.7 million unique clients. The data was anonymised and then analysed.

One of the findings is that a number of clients send far more NTP queries than they are expected to according to the RFCs and the polling interval, and that this is by no means all abusive traffic. Some clients send as much as ~100 qps on average over a 24-hour period. Bear in mind, though, that queries can come in bursts, going as high as 2500 qps or more. Besides clients being broken or abusive, we concluded that a considerable share of the ‘overly active’ clients are simply benign clients sitting behind (CG)NAT.
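To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It only uses the figures quoted above; the variable names are mine and the results are simple averages, not figures taken from the thesis.

```python
# Back-of-the-envelope arithmetic on the figures quoted in this post.
SECONDS_PER_DAY = 24 * 60 * 60

total_queries = 13.67e9    # queries captured in 24 hours
unique_clients = 158.7e6   # approximate number of unique clients
nodes = 30                 # anycast NTP nodes

# Average number of queries a single client sent during the day (~86).
avg_per_client = total_queries / unique_clients

# A client averaging 100 qps for the whole day sends 8.64 million queries.
heavy_per_day = 100 * SECONDS_PER_DAY

# Aggregate load, overall and per node (assuming an even spread, which
# anycast does not guarantee).
aggregate_qps = total_queries / SECONDS_PER_DAY
per_node_qps = aggregate_qps / nodes

print(f"average queries per client per day: {avg_per_client:.1f}")
print(f"queries per day at 100 qps:         {heavy_per_day:,}")
print(f"heavy client vs. average client:    {heavy_per_day / avg_per_client:,.0f}x")
print(f"aggregate load:                     {aggregate_qps:,.0f} qps")
print(f"per node (even spread assumed):     {per_node_qps:,.0f} qps")
```

So a source address averaging 100 qps sends roughly 100,000 times as many queries as the average client, which is what you would expect if it is really a large (CG)NAT pool rather than a single device.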

As a result, many queries, including numerous valid ones, are not being answered due to rate limiting. This behaviour occurs almost exclusively with IPv4 clients, whereas virtually all IPv6 clients behave properly (and are treated as such). This is yet another indication that (CG)NAT is getting in the way, perhaps more than we would hope. SNTP clients in particular, which tend to burst simultaneously at regular intervals, suffer from this.

For me this leads to the conclusion (as was already mentioned by others in this thread) that a reasonable rate limit (for IPv4) is hard to determine. There are simply too many legitimate requests from clients sitting behind (CG)NAT that cannot easily be distinguished from broken or abusive ones. Capping at 10,000 queries per hour is definitely too harsh, but the default, recommended settings are also causing problems for IPv4 clients in combination with large-scale (CG)NAT. Allowing a significant burst before blocking seems like the best approach.
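To illustrate what "allowing a significant burst before blocking" could look like, here is a minimal token-bucket sketch in Python. It is not how ntpd, chrony or our own nodes implement rate limiting. The class, the function and the burst sizes (100 and 3000) are hypothetical, as is the 1024-second gap between bursts. Only the 10,000 queries per hour and the 2500-query burst come from the discussion above.

```python
class TokenBucket:
    """Per-source token bucket: refills at `rate` tokens/s, holds at most `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = 0.0      # timestamp of the previous query, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a query arriving at time `now` should be answered."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_answered(bucket: TokenBucket, n: int, now: float) -> int:
    """Send `n` queries at the same instant and count how many are answered."""
    return sum(bucket.allow(now) for _ in range(n))


# Sustained rate equivalent to 10,000 queries per hour (about 2.78 qps).
SUSTAINED = 10_000 / 3600

tight = TokenBucket(SUSTAINED, burst=100)      # hypothetical small burst allowance
generous = TokenBucket(SUSTAINED, burst=3000)  # hypothetical large burst allowance

# One (CG)NAT address sending 2500 queries at once, then again 1024 s later.
for t in (0.0, 1024.0):
    print(t, count_answered(tight, 2500, t), count_answered(generous, 2500, t))
```

Both buckets enforce the same long-term average. The only difference is how many queries they absorb at once: with these made-up numbers the small bucket answers 100 of every 2500 queries in a burst, while the large one answers all of them.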
