We have a similar situation in the ru zone of the pool. Fewer than 10 active servers are left in the zone (because of the war and security concerns, NTP DDoS reflection attacks, and so on, and maybe some pool monitoring trouble). Now if I add my server to the pool and set the bandwidth to the minimum, 512k, I get “waves” of requests that grow from a “normal” 1-10-50-100 kpps to 1500 kpps (1,500,000 requests per second, yes) within 15 minutes. My old server can handle about 200-250 kpps without losing responses, at ~80-90% CPU load, but 1.5 million is too much (that is more than 500-600 Mbit/s of bandwidth). Any Raspberry Pi or other embedded server simply dies.
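For a rough sense of these numbers, here is a back-of-the-envelope sketch in Python. It assumes a plain 48-byte NTP packet without extension fields, IPv4 without options, and Ethernet framing without preamble or inter-frame gap; the function names are made up for illustration, and the figures are per direction (requests in, and about the same again for responses out).

```python
# Rough bandwidth estimate for NTP traffic at a given packet rate.
# Assumed sizes: basic NTP packet, UDP over IPv4, Ethernet framing.
NTP_PAYLOAD = 48    # bytes, NTP packet without extension fields
UDP_HEADER = 8      # bytes
IPV4_HEADER = 20    # bytes, no IP options
ETH_OVERHEAD = 18   # bytes, Ethernet header (14) + FCS (4)


def mbit_per_s(pps, bytes_per_packet):
    """Convert a packet rate and packet size to Mbit/s (10^6 bits)."""
    return pps * bytes_per_packet * 8 / 1e6


def estimate(pps):
    """Return the bandwidth in Mbit/s at three layers for one direction."""
    ip_size = NTP_PAYLOAD + UDP_HEADER + IPV4_HEADER
    return {
        "payload": mbit_per_s(pps, NTP_PAYLOAD),
        "ip": mbit_per_s(pps, ip_size),
        "ethernet": mbit_per_s(pps, ip_size + ETH_OVERHEAD),
    }


if __name__ == "__main__":
    for pps in (250_000, 1_500_000):
        print(pps, estimate(pps))
```

Under these assumptions, 1.5 million requests per second is 576 Mbit/s of NTP payload alone, about 912 Mbit/s at the IP level and about 1128 Mbit/s on the wire, so a 1 Gbit/s link is already saturated in one direction.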
When the server reaches 100% CPU it can’t handle that many requests, so it loses score and the pool removes it from rotation. The CPU load then drops, the server starts to cope, the score grows, and the pool makes the server active again and sends a million requests per second at it again. Again the server can’t handle all the requests, the score drops, and so on in a circle (or in waves).
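The loop can be illustrated with a toy simulation. This is only a sketch: the threshold, the maximum score, the penalty and the recovery step are illustrative assumptions, not the pool's real scoring algorithm, and the load is assumed to switch on and off instantly, while in reality it ramps up and down as DNS answers expire.

```python
def simulate(steps, capacity_pps, offered_pps,
             threshold=10.0, max_score=20.0,
             penalty=5.0, recovery=1.0):
    """Toy model of score flapping for one overloaded server.

    All scoring parameters are hypothetical values chosen for illustration.
    Returns a list of (in_pool, score_after_step) tuples.
    """
    score = max_score
    history = []
    for _ in range(steps):
        # The server only receives pool traffic while its score is high enough.
        in_pool = score >= threshold
        load = offered_pps if in_pool else 0
        if load > capacity_pps:
            # Overloaded: responses are lost and the monitor lowers the score.
            score -= penalty
        else:
            # Coping (or idle): the score slowly recovers up to the maximum.
            score = min(max_score, score + recovery)
        history.append((in_pool, score))
    return history


if __name__ == "__main__":
    # A 250 kpps server that is offered 1.5 Mpps whenever it is active.
    for step, (in_pool, score) in enumerate(simulate(20, 250_000, 1_500_000)):
        print(step, "active" if in_pool else "removed", score)
```

With these made-up parameters the server never settles: it is dropped after a few overloaded steps, recovers while idle, and is overloaded again as soon as it is re-added. A server whose capacity exceeds the offered load stays in the pool the whole time.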
Another pool member from the ru segment has the same problem: his server can handle ~0.5-1 million requests per second, but that load affects his other services. Unfortunately, we are forced to disconnect from the pool too.
We have been watching the servers in the ru zone, and we see that the score floats in waves for all of them. Apparently there are too many requests and the ru zone servers are simply overloaded, so they drop out of the pool again and again, in a cycle. All of this is bad for the pool and for the timekeeping community.
We suggest that when a zone has few servers, the pool should distribute its clients to a “more global” zone more actively. Otherwise the entire overloaded zone will be stuck with permanently flapping scores and servers, and will work poorly.
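To make the suggestion concrete, here is one possible shape of such a rule, sketched in Python. Everything in it is hypothetical: the function names, the `target_servers` value of 50 and the linear spill formula are assumptions for illustration and do not describe how the pool's DNS servers work today.

```python
import random


def spill_fraction(active_in_zone, target_servers=50):
    """Share of queries to hand to the next, more global zone.

    Hypothetical rule: a zone with at least target_servers active servers
    keeps all of its clients; an emptier zone passes on proportionally more.
    """
    if target_servers <= 0:
        raise ValueError("target_servers must be positive")
    return max(0.0, 1.0 - active_in_zone / target_servers)


def choose_zone(zone_chain, active_counts, target_servers=50, rng=random):
    """Pick the zone that answers one query.

    zone_chain is ordered from the most specific zone to the most global
    one; the last zone always accepts whatever reaches it.
    """
    if not zone_chain:
        raise ValueError("zone_chain must not be empty")
    for zone in zone_chain[:-1]:
        spill = spill_fraction(active_counts.get(zone, 0), target_servers)
        # Keep the query in this zone with probability (1 - spill).
        if rng.random() >= spill:
            return zone
    return zone_chain[-1]


if __name__ == "__main__":
    chain = ["ru", "europe", "global"]
    counts = {"ru": 10, "europe": 500}  # illustrative numbers only
    picks = [choose_zone(chain, counts) for _ in range(10_000)]
    print({zone: picks.count(zone) for zone in chain})
```

With 10 active servers and a target of 50, about 80% of the queries in this sketch would be answered from the wider zone, which would reduce the load on the few remaining ru servers roughly fivefold.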