Chrony as pool server


#1

Are there any server operators out there who have run Chrony instead of ntpd as a server in the pool?

I’m interested to see how its CPU usage compares with ntpd.

On a busy pool server I regularly see 60-80% of a CPU core used, and unfortunately ntpd isn’t multithreaded, so it can’t make use of more than 1 core on a box. From what I’ve read, Chrony has the same limitation.


#2

At the moment I’m setting up a server for the Africa zone.
I could use Chrony instead of ntpd and test it.


#3

I’m using chrony on my server. I average around 300 pps and my chrony CPU usage is around 2%.

Chrony is also single-threaded and will only use one core.

How much NTP traffic are you getting?


#4

Approx 5k to 10k pps.

I might spin up an Ubuntu VM and run the latest Chrony release, put it into the pool and see how it goes.

Given that Chrony is also single-threaded, there might not be much difference CPU-wise compared to ntpd.


#5

If you do that, it would be great if you posted some feedback here :thumbsup:

I’ve noticed that the Chrony version in most Linux distributions seems to be well behind the latest releases.


#6

How much traffic are you seeing to cause that sort of load? My pool server on 100 Mbps barely cracks 2% CPU. (See http://serverfault.com/a/813122/129161 for further discussion.)


#7

A “Net speed” of 100 Mbps in an underserved pool region (e.g. AU) will result in a lot more traffic than a “Net speed” of 100 Mbps in one of the larger regions like UK, US etc.

Nice article too!


#8

I don’t know about @josephb, but in my experience, a VM would use 70% CPU at 10,000 queries per second.


#9

Maybe I need to bump up my pool setting; during the peak of the Snapchat surge I saw around 14K qps maximum, and there were a couple of spikes up to 100% CPU, but most of the time it was well below 50%.


#10

Can confirm that’s about what I see.

It’s why I thought I might give Chrony a try to see if it performs any better, although it’s still a single-threaded application :dizzy_face:


#11

In my experience chronyd needs slightly less CPU than ntpd, but most of the CPU time is spent in the kernel receiving and sending the packets. Switching to chronyd will not help much with that.

To reduce the usage further, it may be necessary to increase the limits for interrupt coalescing with ethtool -C in order to reduce the number of interrupts. Of course, this has a negative impact on the accuracy of NTP timestamps (unless the NIC supports HW timestamping and chronyd is configured to use it). For a pool server that may be acceptable. As a start, you could try setting rx-usecs to 100 and rx-frames to 20.
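With those starting values, the command would look something like ethtool -C eth0 rx-usecs 100 rx-frames 20, where eth0 is a placeholder interface name. To get a feel for how much that could cut the interrupt rate, here is a rough sketch. It assumes evenly spaced packets and a simplified model in which the NIC raises an interrupt once rx-frames packets are queued or rx-usecs microseconds have passed since the first queued packet, whichever comes first. Real drivers differ in the details, and the function name is made up for illustration.

```python
import math

def interrupts_per_second(pps, rx_usecs, rx_frames):
    """Estimate the RX interrupt rate for evenly spaced packets.

    Simplified model (an assumption, not any driver's exact behaviour):
    an interrupt fires when rx_frames packets are queued or rx_usecs
    microseconds have passed since the first queued packet.
    """
    if pps <= 0:
        return 0.0
    gap_us = 1_000_000 / pps
    # The first packet opens the window; count the ones arriving inside it.
    by_timer = 1 + math.floor(rx_usecs / gap_us)
    per_interrupt = max(1, min(rx_frames, by_timer))
    return pps / per_interrupt

for pps in (1_000, 50_000, 250_000):
    rate = interrupts_per_second(pps, rx_usecs=100, rx_frames=20)
    print(f"{pps} pps -> about {rate:.0f} interrupts/s")
```

In this model the first queued packet can wait up to rx-usecs before the kernel sees it, which is where the loss of timestamp accuracy comes from. At low packet rates the timer expires before a second packet arrives, so coalescing changes little.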

Some NICs have adaptive interrupt coalescing, and changing the values with ethtool may not do anything. These should be able to handle larger packet rates in their default configuration. For example, a machine with an Intel i210 card that I use for testing can handle up to about 250 kpps.

I’m not sure how useful this is in virtual machines.


#12

Virtual Machines typically can’t set coalescing values:

# ethtool -c eth0
Coalesce parameters for eth0:
Cannot get device coalesce settings: Operation not supported

#13

So some interim results.

As you can see, the CPU load is quite low even during peaks in the packet rate. Unfortunately, I have no Munin graphs from ntpd for comparison.

In addition, it seems to me that chrony is a bit steadier when it comes to synchronization, which you can also see in the graph here in the NTP pool.

I’ll keep watching this.

PS: At the moment chrony is running at 3.4% CPU with an average of 1 kpps.


#14

Thanks for the graphs and feedback. Which Chrony version are you using?


#15

The newest one (2.4.1)


#16

There is currently a peak of about 10k pps at 30% CPU usage.
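As a rough cross-check, the figures quoted in this thread can be turned into CPU time per packet. The sketch below is only arithmetic on those numbers. The function name is made up for illustration, and the machines and the way CPU usage was measured differ between posters, so the comparison is indicative at best.

```python
def cpu_us_per_packet(cpu_percent, pps):
    """CPU time spent per packet, in microseconds of one core."""
    return cpu_percent / 100 * 1_000_000 / pps

# (description, CPU % of one core, packets per second), as quoted in the thread
reports = [
    ("ntpd in a VM", 70, 10_000),
    ("chrony, low-traffic server", 2, 300),
    ("chrony, average load", 3.4, 1_000),
    ("chrony, peak", 30, 10_000),
]

for label, cpu, pps in reports:
    print(f"{label}: {cpu_us_per_packet(cpu, pps):.1f} us per packet")
```

On these numbers the chrony peak works out to roughly 30 microseconds per packet, against roughly 70 for the ntpd VM, though different hardware is involved.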


#17

Good data.

I suspect the actual peak is higher; Munin is likely averaging it over several sampling periods for the graphs.
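To illustrate the averaging effect: Munin polls every 5 minutes by default, so a counter-based plugin effectively reports the mean rate over that interval. The sketch below uses made-up per-second samples (a 1k pps baseline with a 20-second burst at 14k pps), not data from this server.

```python
from statistics import mean

INTERVAL_S = 300  # Munin's default polling interval (5 minutes)

# Synthetic per-second packet rates: 1k pps baseline with a 20 s burst
# at 14k pps. These values are invented for illustration only.
samples = [1_000] * INTERVAL_S
for second in range(100, 120):
    samples[second] = 14_000

true_peak = max(samples)
graphed = mean(samples)  # what a 5-minute average would show

print(f"true peak: {true_peak} pps")
print(f"5-minute average: {graphed:.0f} pps")
```

A short burst like this barely moves the 5-minute average, so the graph can sit far below the real peak.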