No_sys_peer, panic stop

Sorry to post a basic question here but I’m not sure where else to ask.

Lately (twice, now) one of our servers has had the NTP daemon die.

I get:

Nov 25 19:58:29 replication_server ntpd[17559]: 0.0.0.0 0618 08 no_sys_peer
Nov 25 21:56:40 replication_server ntpd[17559]: 0.0.0.0 0617 07 panic_stop -1189 s; set clock manually within 1000 s.
Nov 25 21:56:41 replication_server systemd: ntpd.service: main process exited, code=exited, status=255/n/a
Nov 25 21:56:41 replication_server systemd: Unit ntpd.service entered failed state.
Nov 25 21:56:41 replication_server systemd: ntpd.service failed.
Nov 26 07:09:01 replication_server ntpd[9333]: ntpd 4.2.6p5@1.2349-o Thu Aug  8 11:47:59 UTC 2019 (1)

When I restart, I have peers.

[root@replication_server ~]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-107.170.29.136  128.59.0.245     2 u    2   64   77    9.947   -0.127   0.174
+64.113.44.54    129.6.15.29      2 u    2   64  167   31.182    0.226   0.257
*173.51.147.14   .GPS.            1 u   40   64   77   81.075    3.216  13.981
+45.56.74.200    128.138.140.44   2 u   30   64  177   45.659    3.256   0.118
[root@replication_server ~]#

These servers are in our local closet, on a business-grade cable internet connection, and I don’t see any network outages reported that would keep ntpd from contacting servers. The same ntp package is in use on the other Linux servers in the closet and they are fine.

NTP package is 4.2.6p5-29.el7.centos as released. No updates available.

I would appreciate any suggestions.

Hi, doesn’t sound like fun! :frowning_face:

A few suggestions here: No_sys_peer, panic stop

The other questions that come to mind are:

Are there any obvious differences between this server and the others that are ok?
Is there anything else obvious in /var/log - any other packages reporting errors?

Maybe change the network cable and switch port?

Is your server virtual, or running on bare metal? If it is virtualized, you may want to disable time synchronization of this guest to the host.

no_sys_peer means no source could be selected…
panic_stop means the clock error is more than 1000 seconds (the panic threshold shown in your log message)…
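
To make that concrete, here is a minimal Python sketch that pulls the offset out of the panic_stop line from the log above and compares it with the 1000 s limit the message itself mentions. The function name `panic_offset` and the constant `PANIC_THRESHOLD_S` are made up for illustration, and the regex is only an assumption based on this one log line, not a general ntpd log parser.

```python
import re

# Threshold taken from the log message itself ("set clock manually within 1000 s").
PANIC_THRESHOLD_S = 1000


def panic_offset(line):
    """Return the offset in seconds from a panic_stop log line, or None if absent."""
    match = re.search(r"panic_stop\s+([+-]?\d+(?:\.\d+)?)\s*s", line)
    return float(match.group(1)) if match else None


line = (
    "Nov 25 21:56:40 replication_server ntpd[17559]: 0.0.0.0 0617 07 "
    "panic_stop -1189 s; set clock manually within 1000 s."
)

offset = panic_offset(line)
# The sign only gives the direction of the error; the magnitude is what matters.
exceeded = offset is not None and abs(offset) > PANIC_THRESHOLD_S
print(offset, exceeded)
```

In this case the clock was off by 1189 s, which is past the limit, so ntpd exited instead of stepping the clock.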

Can you paste your ntp.conf?

If you have multiple local servers running NTP, you might want to consider adding some lines to have them “peer” with each other (don’t forget you will probably need to add a “restrict” line with different permissions than the one for internet sources). That way, if you lose internet or something else wonky goes on, at least your local group of servers will work to keep time among themselves as best they can.

Or, if you don’t want to do that, I would recommend adding more servers to your configuration. I typically have 7 NTP servers in my configurations. NTP uses so little bandwidth (48-byte packets), and the polling interval is so long (usually 1024 seconds), that bandwidth usage is not an issue. The general rule is to configure 2n+1 servers to protect against “n” falsetickers.
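
As a rough back-of-the-envelope check of the numbers above, here is a small Python sketch. The function names are hypothetical, and the traffic estimate counts only the 48-byte NTP payload in each direction, assuming every server is polled once per 1024 seconds; UDP/IP headers and the faster polling right after startup are ignored.

```python
def servers_needed(falsetickers):
    """Servers to configure under the 2n+1 rule for n falsetickers."""
    return 2 * falsetickers + 1


def tolerated_falsetickers(servers):
    """Largest n such that 2n+1 <= servers."""
    return max(servers - 1, 0) // 2


def payload_bytes_per_day(servers, poll_s=1024, packet_bytes=48):
    """NTP payload per day: one request plus one reply per poll, per server."""
    polls_per_day = 86400 / poll_s
    return servers * polls_per_day * packet_bytes * 2


print(servers_needed(3))          # 7
print(tolerated_falsetickers(4))  # 1
print(payload_bytes_per_day(7))   # 56700.0
```

So 7 servers cover up to 3 falsetickers under that rule, for well under 100 KB of NTP payload per day.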

Unfortunately, the old version of ntpd on CentOS 7 doesn’t fully support the “pool” feature, so it won’t pick up a new server (via DNS lookup) if one stops responding. If that is a critical feature, an alternative would be to use chrony instead of ntpd (chrony was the default time service for EL 7).

Depending on your ISP, they usually have a couple of their own NTP servers (since all their network hardware needs to get time too). Sometimes you can dig through your router’s config or logs to find them. They may or may not provide acceptable time for your requirements, though.

Other good sources of time around the world are:
Cloudflare: time.cloudflare.com ( https://www.cloudflare.com/time/ )
Apple: time.apple.com
PublicNTP: ( https://publicntp.org/ )

Google & Amazon also have NTP servers, but they use a leap smear method for leap seconds, so mixing them with non-smearing servers would cause conflicts.