Are there any server operators out there who have run Chrony instead of ntpd as a server in the pool?
I’m interested to see what it’s like for CPU performance compared with ntpd.
On a busy pool server I regularly see 60-80% of a CPU core used, and unfortunately ntpd isn’t multithreaded, so it can’t make use of more than one core on a box. From what I’ve read, chrony has the same limitation.
How much traffic are you seeing to cause that sort of load? My pool server on 100 Mbps barely cracks 2% CPU. (See http://serverfault.com/a/813122/129161 for further discussion.)
A “Net speed” of 100Mbps in an underserved pool region (e.g. AU) will result in a lot more traffic than a “Net speed” of 100Mbps in one of the larger regions like UK, US, etc.
Maybe I need to bump up my pool setting; during the peak of the Snapchat surge I saw around 14K qps maximum, and there were a couple of spikes up to 100% CPU, but most of the time it was well below 50%.
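For a sense of scale, here is a rough sketch of what 14K qps means in bandwidth terms. It assumes plain 48-byte NTP packets (no extension fields or authentication) over IPv4 and ignores Ethernet framing; the function name is made up for illustration.

```python
# Assumed sizes: basic NTP packet with no extension fields or MAC, over IPv4.
NTP_PAYLOAD = 48   # bytes
UDP_HEADER = 8     # bytes
IPV4_HEADER = 20   # bytes, no IP options


def ip_mbps(qps):
    """One-way IP-level traffic in Mbit/s for a given query rate."""
    packet_bytes = NTP_PAYLOAD + UDP_HEADER + IPV4_HEADER  # 76 bytes
    return qps * packet_bytes * 8 / 1_000_000


print(f"{ip_mbps(14_000):.1f} Mbit/s each way")  # about 8.5
```

Under those assumptions the peak is only around 8.5 Mbit/s in each direction, so at these rates it is probably the packet count rather than the bandwidth that costs CPU.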
In my experience chronyd needs slightly less CPU than ntpd, but most of the CPU time is the kernel receiving and sending the packets. Switching to chronyd will not help much with that.
To reduce the usage further, it may be necessary to raise the interrupt coalescing limits with ethtool -C in order to reduce the number of interrupts. Of course, this has a negative impact on the accuracy of NTP timestamps (unless the NIC supports HW timestamping and chronyd is configured to use it). For a pool server that may be acceptable. As a start, you could try setting rx-usecs to 100 and rx-frames to 20.
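A minimal sketch of applying those starting values from a script. The interface name eth0 and both function names are placeholders, not anything from the thread; by default the script only prints the command instead of running it.

```python
import subprocess


def coalesce_command(iface, rx_usecs=100, rx_frames=20):
    """Build the argv for `ethtool -C` with the starting values suggested above."""
    return ["ethtool", "-C", iface,
            "rx-usecs", str(rx_usecs),
            "rx-frames", str(rx_frames)]


def set_coalescing(iface, rx_usecs=100, rx_frames=20, dry_run=True):
    """Return the command line; run it only when dry_run is False."""
    cmd = coalesce_command(iface, rx_usecs, rx_frames)
    if not dry_run:
        # Needs root. Some drivers reject or silently ignore these settings.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface that carries NTP traffic.
    print(set_coalescing("eth0"))
```

Running `ethtool -c` (lowercase) on the same interface before and after shows whether the driver actually accepted the new values.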
Some NICs have adaptive interrupt coalescing, and changing the values with ethtool may not do anything. These should be able to handle larger packet rates in their default configuration. For example, a machine with an Intel i210 card that I use for testing can handle up to about 250kpps.
I’m not sure how useful this is in virtual machines.
As you can see, the CPU load is quite low even during peaks in the packet rate. Unfortunately, I have no Munin graphs from ntpd to compare against.
In addition, it seems to me that chrony stays a bit more closely synchronized, which you can also see in the graph here in the NTP pool.
I’ll keep watching this.
PS: At the moment chrony is running at 3.4% CPU with an average of 1 kpps.
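As a back-of-the-envelope reading of that figure, the sketch below turns "3.4% at 1 kpps" into a per-packet cost and a rough single-core ceiling. It assumes CPU use scales linearly with packet rate and that the 3.4% is all packet handling. Both are simplifications, and the helper names are made up.

```python
def cpu_per_packet_us(cpu_percent, pps):
    """CPU time per packet in microseconds, from percent of one core and packet rate."""
    return cpu_percent / 100.0 * 1_000_000 / pps


def saturation_pps(cpu_percent, pps):
    """Packet rate at which one core would reach 100%, assuming linear scaling."""
    return pps * 100.0 / cpu_percent


per_packet = cpu_per_packet_us(3.4, 1000)
limit = saturation_pps(3.4, 1000)
print(f"about {per_packet:.0f} us of CPU per packet")
print(f"one core would saturate near {limit:.0f} pps")
```

On that machine this comes to about 34 microseconds per packet, or somewhere under 30 kpps for a single core, if the linear assumption holds.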