How does one do maintenance on a time server?

How does one do maintenance on a time server, such as a firewall upgrade or reboot?

Does one remove the time server from the pool, wait a week (or a year, according to the “add a server” page) for the traffic to go away, and then take the server down?

Or does one do it live, as quickly as possible, and take the score hit?

I set the server’s speed to “monitoring only” on the management page a few days before the planned outage, then set it back to its normal value once the maintenance is complete.

This way most of the simple SNTP clients won’t try to access your NTP server during the outage, and for most clients that use an actual NTP daemon (ntpd, chrony, etc.) the brief outage won’t matter. Those are usually configured with multiple NTP sources anyway, so they’ll just get their time fix elsewhere. The scores will return to their normal values quickly, so don’t worry about those.

TBH I am often reluctant to wait a few days before updating/upgrading.
These updates often contain security fixes, and I want to apply those ASAP on internet-facing servers…
The ideal solution for me is to reroute (NAT) the traffic for server A to server B during updates/changes. The NTP clients don’t have a clue 😉

That is what I do, when possible. For instance, when I need to restart the NTP appliance, I move all traffic to my second appliance via firewall routing.

But when I have to reboot the firewall/router, that isn’t possible: it is the one device I can’t reroute around, and it is my internet connection…

The pool monitors handle this: they notice when a server stops responding and take it out of the pool. Don’t worry.

No need to do anything.

What I do depends on how long I expect the outage to be. For simple reboots, I don’t take any special measures. Full NTP clients will generally be using multiple servers and won’t care about lost responses from one of them. There might be some problem with clients that (a) only take a single address from the pool and (b) time out in less than the time it takes my servers to reboot, but I don’t expect there are many of them. Short outages like this are generally detected by the monitoring system but don’t reduce my score below 10, so I assume the pool thinks they’re OK.
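
To get a feel for why a short outage barely dents the score, here is a rough simulation. The only number taken from this thread is the threshold of 10. The update rule (multiply the old score by 0.95, then add +1 for a good check or -5 for a timeout) is my assumption about how the pool scores servers, and the names `simulate` and `stays_in_pool` are made up for this sketch, so treat the output as illustrative only.

```python
# Sketch of a pool-style score under ASSUMED constants (not confirmed here).
DECAY = 0.95              # assumption: old score is scaled by this each check
STEP_OK = 1.0             # assumption: step for a successful check
STEP_TIMEOUT = -5.0       # assumption: step for a missed check
IN_POOL_THRESHOLD = 10.0  # from the thread: scores below 10 count as "out"


def simulate(score, checks):
    """Return the score after each check; `checks` is a list of booleans."""
    history = []
    for ok in checks:
        score = score * DECAY + (STEP_OK if ok else STEP_TIMEOUT)
        history.append(round(score, 2))
    return history


def stays_in_pool(score, checks):
    """True if the score never drops below the threshold during `checks`."""
    return all(s >= IN_POOL_THRESHOLD for s in simulate(score, checks))


# One missed check (a quick reboot) followed by three good ones.
print(simulate(20.0, [False, True, True, True]))
# Two missed checks in a row (a longer outage).
print(simulate(20.0, [False, False]))
```

With these assumed values a single missed check leaves a healthy server above 10, while two in a row do not, which would match the experience described above.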

For longer outages that I know about ahead of time, I’ll put the servers into “monitoring only” mode in the Pool a few days in advance. That removes about half of the load, and the other half are probably competent NTP clients that will handle the outage well.

And then there are the occasions when an outage isn’t planned. For those, I depend on the monitors to spot the problem and remove the server from the pool because I’m usually trying to fix something more important.

Rerouting the traffic to a different server with potentially different clock characteristics is counterproductive; see RFC 8633, “Network Time Protocol Best Current Practices”. The RFC talks about anycast, but the same logic applies to load balancing as well. Clients using the pool are effectively using a superior anycast alternative already (one in which they are aware of the individual servers available for use).

Bottom line: there’s no need to do anything. A correctly configured client will just use another time server if it can’t reach yours, and the pool monitors will remove your server if it’s not available.
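
For anyone curious what “just use another time server” looks like from the client side, here is a minimal sketch of a one-shot SNTP query that falls through a list of servers. It is deliberately simplified: real daemons such as ntpd and chrony poll several sources at once and select among them rather than failing over one by one. The function names and the host names in the usage comment are placeholders I made up.

```python
import socket
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request() -> bytes:
    # First byte 0x23: LI=0, version 4, mode 3 (client); the rest is zero.
    return b"\x23" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit timestamp as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32


def query(server, timeout=2.0, port=123):
    """Send one request over UDP and return the server's time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, port))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)


def first_available(servers, query_fn=query):
    """Try each server in order and return (server, time) for the first reply."""
    for server in servers:
        try:
            return server, query_fn(server)
        except (OSError, ValueError):
            continue  # timeout, DNS failure, or bad reply: try the next one
    raise RuntimeError("no time server reachable")


# Usage (needs network access; host names are placeholders):
# first_available(["0.pool.ntp.org", "1.pool.ntp.org"])
```

A server that is down for maintenance simply shows up here as a timeout, and the client moves on to the next address.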
