Ubuntu will use ntpd-rs (and benchmark it against chrony)

Hi all, this is my first post in this community. I am running a little IPv6 OpenWrt router with chrony as a time server in a low-traffic setting (I have no idea whether its address is linked to my profile). Now I read an announcement and wanted to share it:

Ubuntu will use ntpd-rs in the future, and they want to improve it. Additionally, they want to benchmark it against chrony. Sounds interesting.

  • Benchmarking & Testing: Comprehensive benchmarking of long-term memory, CPU usage, and synchronization performance against chrony to give our cloud partners and enterprise users complete confidence in the transition.

Is there a list of public NTP servers running ntpd-rs?

I tried ntpd-rs. It is not feature-compatible with chrony or ntpd for a server. Might be good enough for a client.

To be honest, I read this German newsticker, which has an English translation. They state they want to improve ntpd-rs, so maybe it will be feature-compatible in the future, uniting NTP, NTS, and PTP under one roof. High claims, though. At least this means active development, and given that Ubuntu is a bigger player, this cannot be bad for the NTP protocol itself, can it? OK, they did things like snap and so on… but let’s be optimistic.

I used to run one on one of my few IPv4-only servers. My IPv6-enabled servers typically have multiple IPv6 addresses: a primary, often “native” one (e.g., stably autogenerated via SLAAC), and a “vanity” one for the NTP service. That multi-homing scenario is not supported today (as also mentioned in the announcement), which to me severely limits its current usefulness as a server in “production” environments.

It did not have something like the iburst option, so after each restart, it took quite a while to reach the synchronized state. It often had trouble synchronizing at all, even with enough samples (admittedly in a challenging zone, though). Neither is helpful in a server role.
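For context, iburst is a source option in chrony (and ntpd) that sends a quick burst of requests at startup so the clock reaches the synchronized state within seconds rather than minutes. A minimal chrony.conf sketch (the server name is a placeholder):

```
# chrony.conf sketch; ntp.example.com is a placeholder
# iburst: send a short burst of requests at startup for fast initial sync
server ntp.example.com iburst
```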

From a pool-monitoring perspective, time stability did seem worse than with established alternatives, but again, that might have been skewed by the zone it was in being a bit challenging in general.

My interest is not so much in the low-level timekeeping behavior, rather in the overall system and network architecture, so I didn’t do any systematic performance investigation of this implementation. If interested, I could set up an IPv4-only server in the US zone for someone else to run tests against.

Ubuntu switching from chrony to ntpd-rs is surprising to me. From what I have seen so far, I’d say it’s a regression in security, features, timekeeping performance and server performance. I realize I might be heavily biased.

For people on this forum, server performance would likely be most important. I ran some tests to compare it with chrony and rsntp. Here are the maximum response rates I observed with ntpperf on a machine with a 4-core CPU (configured to run at a constant frequency) and a 1 Gb onboard Broadcom NIC (2 RX queues):

rsntp (2 threads) 820 kpps
chrony 4.8 (2 instances, noclientlog) 720 kpps
chrony 4.8 (2 instances) 690 kpps
rsntp (1 thread) 510 kpps
chrony 4.8 (1 instance, noclientlog) 440 kpps
chrony 4.8 (1 instance) 400 kpps
chrony 4.8 + Fil-C (2 instances, noclientlog) 380 kpps
ntpd-rs 1.7.1 270 kpps
ntpd-rs 1.7.1 (rate-limiting-cache) 260 kpps
chrony 4.8 + Fil-C (1 instance, noclientlog) 200 kpps

ntpd-rs doesn’t seem to support multithreading or multiprocessing as a server. It seems to be about 40% slower than a single chronyd instance configured with noclientlog (matching the default ntpd-rs configuration). The default chronyd clientloglimit is useless on a public server; it should be either increased or disabled.
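For anyone tuning this, here is a chrony.conf sketch of the two approaches mentioned (the limit value is illustrative):

```
# chrony.conf sketch; pick one of the two approaches
# Raise the memory limit (in bytes) for the client access log:
clientloglimit 100000000
# Or disable client logging entirely (this also disables the
# rate limiting and per-client state that depend on it):
noclientlog
```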

If someone insisted on running a memory-safe version of chrony, it could be compiled with Fil-C. That should provide better memory safety than Rust, as it covers all code (there is no unsafe keyword), but the impact on performance is significant.


There is a bit of a problem with your test: the standard Linux kernel is preemptive, so you do not know how it schedules.
As such, the measurements can differ between runs, and therefore you do not know the exact speed of the software.

Making the CPU clock static is one thing, but you really need the realtime kernel to get more accurate readings, as this kernel will give the same schedule over and over again.

I use this one on all my NTP-servers: Linux server 6.12.69+deb13-rt-amd64 #1 SMP PREEMPT_RT Debian 6.12.69-1 (2026-02-08) x86_64 GNU/Linux

I know it can still vary a bit, but you can place the software in RT mode.

I use this kernel because I do not want programs to be able to hog/slow down the scheduling of other software. For normal use this isn’t the best kernel, but it does improve NTP.
Also, I always install irqbalance to prevent CPU0 being overloaded with IRQ requests meant for other cores.

I would love to see whether my changes affect your outcome. Can you run the tests again with the RT kernel and irqbalance installed?

I think the main point of providing those measurements was for relative performance between implementations and specific configurations, especially chronyd vs. ntpd-rs. While the absolute numbers are interesting as well, to get an impression of the ballpark figures one can expect, they depend too much on too many factors (as you write yourself as well) to be taken as “absolute truth”. I.e., one always needs to consider that they are for a specific situation.

Intuitively, I would expect the performance (as in achievable packet rates) with an RT kernel to be lower than with the standard kernel, because the RT kernel is inherently less efficient at scheduling tasks due to the additional constraints it has to honor. The NTP performance in the sense of accuracy, on the other hand, will be better, exactly because of the prompter handling of packets etc. I guess that is what you are referring to when you write that an RT kernel “does improve NTP”.

Yep; as the RT kernel is faster at e.g. interrupts, when you set the priority higher, it will receive more CPU time.
It makes the execution more precise.

I mean, most NTP servers have few other tasks to perform, so accuracy isn’t compromised.

As such the RT-kernel makes more sense to use. Also because Time is a Real-Time event :rofl:

The CPU cycles are more predictable. Is this good for a normal server/desktop? No, but it’s better for highly accurate, fast systems, like an NTP server.

That is why they say you should lock the CPU speed; the same can be said for CPU cycles. The RT kernel can lock them at a steady pace.

The numbers presented were about performance as in number of packets per second. You are now again referring to another performance metric. Please do not confuse the two.

I was referring to the number of CPU cycles they get; this also affects response times and thus the numbers.

A server that has more time can answer more requests. To be sure they are all handled the same, running RT at RT priority should give better results, as you know they get the same CPU-cycle time.

RT has nothing to do with the number of CPU cycles. The code needs what it needs. All the RT kernel does is service those requests in a more timely manner by preempting less critical/urgent tasks. That makes it less efficient, i.e., it is burning more cycles overall, e.g., because of the additional overhead of more frequent context switches. The cycles spent on those additional context switches cannot be spent on “productive” stuff like running an NTP daemon anymore, so overall I’d expect the achievable packet rate to go down with an RT kernel.


Sorry, but the RT kernel has a different scheduler than normal kernels; as such, the priorities are different, as well as the number of CPU cycles any program can claim and receive.

If you want a program to use the same amount of cycles all the time, this scheduler+kernel is the way to go.
On the other hand, this kernel allows programs to hog/slow down the system to a crawl, making it very hard to recover.

That is the difference between real-time and normal: in normal kernels you cannot schedule real-time, ergo get the same CPU time all the time; it varies based on the load.

The best would be core isolation: give at most 1-2 cores to NTP, then other programs can’t affect it.
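As a sketch of what core isolation could look like on a GRUB-based system (the core numbers and daemon name are illustrative, and the kernel parameters require a reboot):

```shell
# 1. Reserve cores 2 and 3 in /etc/default/grub, then reboot:
#    GRUB_CMDLINE_LINUX="... isolcpus=2,3 nohz_full=2,3"
#    sudo update-grub && sudo reboot
# 2. Pin the NTP daemon onto the isolated cores:
sudo taskset -c 2,3 chronyd
```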

Don’t forget we have multi-core/CPU systems now; in the past we had only one core/CPU, where it had far more impact. Today the system typically runs on core0 and the rest on the other cores, making RT a different animal.

Therefore I run the RT kernel, as this is a very slow quad-core CPU and I noticed it makes chrony more stable. Maybe it has less or even a negative impact on faster systems, but on the Intel J1900 it works fine.

My cores are locked at 2.4 GHz, also to improve timing stability. But it remains a pretty slow CPU. I can be wrong, have been before :wink:

It seems you keep confusing two things:

  • An RT kernel improves timekeeping performance (i.e., “accuracy”) because various time-sensitive operations in the kernel and in userspace are handled in a more timely manner.
  • Performance in the sense of maximal achievable packets per second is expected to be lower with an RT kernel, as RT handling implies higher scheduling overhead in many parts of the system, which leaves fewer resources (“CPU cycles”) to actually process packets.

Miroslav’s numbers were about the second aspect with a view to compare ntpd-rs against other implementations regarding that metric, as benchmarking ntpd-rs against chronyd but potentially also other implementations is one focus of this thread.

I’ll leave it at that.

He should have done core isolation, then measured.

As then the CPU/core can be dedicated to the task.

All I’m saying is that normal kernels do not give the same time to all processes.

RT does, and so does core isolation.

You cannot measure without steady parameters. That is my problem.

When you are benchmarking, you should ensure that parameters such as CPU cycles stay the same, no matter what you do.
Else it is a useless test, regardless of the outcome.

As such, I do not take his measurements as valid. Sorry.

I did 15 minutes of testing and found cases where ntpd-rs sometimes returned root dispersions that were too low by a factor of 10. Correctness should be a high priority.
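For anyone who wants to spot-check this themselves: root dispersion is a 32-bit unsigned 16.16 fixed-point value at byte offset 8 of the NTPv4 header (RFC 5905). A small self-contained Python sketch to decode it from a raw response packet (the sample packet below is synthetic, not a real server response):

```python
import struct

def root_dispersion(packet: bytes) -> float:
    """Return root dispersion in seconds from a raw NTPv4 packet.

    The field is an unsigned 16.16 fixed-point number at byte
    offset 8 of the 48-byte NTP header (RFC 5905).
    """
    (raw,) = struct.unpack_from(">I", packet, 8)
    return raw / 65536.0

# Synthetic 48-byte header with root dispersion set to 0.125 s:
pkt = bytearray(48)
struct.pack_into(">I", pkt, 8, int(0.125 * 65536))
print(root_dispersion(bytes(pkt)))  # → 0.125
```

Comparing that value against what e.g. chronyd reports for the same upstream source makes factor-of-10 discrepancies easy to spot.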


That’s the thing with rewriting code: it introduces correctness, performance, and security bugs. Rust may be safer than C, but that doesn’t make programs written in it devoid of such bugs.


Out of curiosity, I rebooted the machine to an RT kernel and ran some of the ntpperf tests again. The single-instance chrony-noclientlog rate decreased from 440 kpps to 350 kpps, and single-thread rsntp from 510 to 420 kpps.

irqbalance was enabled in both the earlier and this test. With the onboard NIC configured to use only two RX queues (i.e. half of the available CPU cores) I’m not sure if it makes a difference.


But did you set the instances to RT? Because if you didn’t, they get scheduled at lower priority than all other programs.
Typically, normal programs are not set to RT, so they suffer. Can you try again, but schedule them RT this time?
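For reference, on Linux this is typically done with chrt; a sketch (the priority value and daemon name are illustrative):

```shell
# Start a daemon under SCHED_FIFO priority 50:
sudo chrt --fifo 50 /usr/sbin/chronyd
# Or switch an already-running process to SCHED_FIFO:
sudo chrt --fifo --pid 50 "$(pidof chronyd)"
# Verify the scheduling policy and priority:
chrt --pid "$(pidof chronyd)"
```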

I tried it again with a few different SCHED_FIFO priorities. No change in the performance. There is no other process running which it could take CPU cycles from. It’s just the NTP daemon and kernel.
