If all of the dozen or so monitor servers think the time is OK, there is a good chance that the error, whatever it is, is on your side. Do you have further data or evidence regarding this incident, such as log entries? Is your server a virtual server? If so, have you verified that the host server’s clock was OK?
Well, the whole incident lasted only a couple of minutes, maybe two, and then the clock reverted to the correct time. I have Windows event logs showing this. The OS is Windows Server 2022 Core.
This is a graph from Zabbix, and the event log shows the time service received a new time offset. We had to rebuild several gMSA accounts because of this. The host is physical hardware and only allows local login; it is a normal ProLiant server, not home hardware. As I have said, this is not something that happened only once: last year we had a similar incident on these hosts, but the drift was much worse (several months).
Lesson learned on configuring a max offset for the time service. We switched to Google NTP after this.
My NTP server monitoring is independent of the NTP Pool’s infrastructure. I checked my logs for the indicated two Slovakia servers since 2025-01-01 00:00 and see no anomalies. The servers are running @ stratum 2 and have good stability.
Today at 22:12:39 we saw an offset of 1783487 seconds on all sk.pool.ntp.org servers; to be more precise:
185.242.56.5:123
213.81.129.99:123
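To put the reported offset into perspective, here is a minimal sketch that converts the 1783487 seconds quoted above into days and hours. The helper name `describe_offset` is made up for illustration and is not part of any tool mentioned in this thread.

```python
from datetime import timedelta

def describe_offset(seconds: int) -> str:
    """Render an offset given in seconds as 'D days, H:MM:SS' (hypothetical helper)."""
    return str(timedelta(seconds=seconds))

# Offset value taken from the post above.
print(describe_offset(1783487))  # prints: 20 days, 15:24:47
```

So the jump was a little under three weeks.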
Can you share with us the logs in which those servers and the measurement can be identified?
This is a graph from Zabbix, and the event log shows the time service received a new time offset.
When your system clock jumps, from the system’s point of view the time servers suddenly show a different offset. Are you sure the time was changed by the NTP service and not something else?
Was the incident last year with the exact same upstream IPs?
As others have said, the pool servers looked fine across the entire monitoring system, so the problem was likely client-related. If the client was the problem, changing the upstream to Google will not solve it.
It sounds like you’re using Microsoft’s Windows Time Service, AKA w32time. You might want to read up on a feature intended to break the cyclic dependency between client time and SSL certificate validity constraints that has been reported to cause short-lived but drastic time jumps.
Are you sure the time was changed by the NTP service and not something else?
Judging by the identical behaviour of these hosts last year, I am sure it was not something else. It is not possible to log in to the physical DC remotely to change the time, since all remoting is disabled on our hosts. It is also not possible to reboot the server, enter the password-protected BIOS, and boot it back up within the span of two minutes.
Was the incident last year with the exact same upstream IPs?
I don’t know that, unfortunately. That incident happened on a VM that hosts apps, not on a DC. This time it was a DC with really strict security measures, and it happened on only one DC, not the other two.
While I understand that you have the monitoring and keep saying that the pool looked good, I highly doubt that after these experiences. We do not need sub-second accuracy, or seconds, or even minutes, but jumping by several weeks or, like last year, months is a problem.
It sounds like you’re using Microsoft’s Windows Time Service, AKA w32time
So you are saying that our GPOs, which specify sk.pool.ntp.org,0x1 as the time source for the DC, are just not being applied, right?
Thank you for the screenshots. As I had suspected - the server first jumped the time forward, and then (42 seconds later) complained that the pool server indicates a huge offset. The second screenshot suggests that the pool was handing out the correct time at least after the jump, since the server with the wrong time now has an offset to the pool server identical to the distance it had jumped.
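The reasoning above can be sketched numerically. This is a simplified model that ignores network delay and assumes the offset is reported as server time minus client time; the function name and the timestamps are invented for illustration, and only the jump size comes from the thread.

```python
def measured_offset(server_time: float, client_time: float) -> float:
    """Offset as the client would see it: server clock minus local clock (delay ignored)."""
    return server_time - client_time

true_time = 1_000_000.0   # arbitrary "correct" time in seconds (assumption)
jump = 1_783_487.0        # size of the forward jump reported in the thread

# Before the jump the client agrees with the server.
before = measured_offset(true_time, true_time)

# After the local clock jumps forward, the same correct server suddenly
# appears to be off by exactly the distance of the jump.
after = measured_offset(true_time, true_time + jump)

print(before, after)
```

In this model a server that keeps handing out correct time shows an offset whose magnitude equals the jump, which is what the second screenshot suggests.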
The big question still is: what triggered the jump?
The log entry from “Kernel-General” with Event ID 1, two lines below the Time-Service ID 52 entry, should contain additional information about the change in the system time, and possibly even the process that initiated the change.
Are there additional log entries from “Time-Service” in the event log in the hours leading up to the jump? Not just in the System log, but also in the log under Applications and Services Logs / Microsoft / Windows / Time-Service / Operational? The Operational log should include detailed entries showing which upstream source your server is trying to synchronize with.
Note that 185.242.56.5 implements rate limiting; I didn’t try the other server.
If NTP requests are sent too rapidly, the NTP server will set LI=3 and Reference ID = RATE, and the client timestamp will simply be echoed back.
This is normal procedure.
Does the client software check the LI (leap indicator) field?
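For illustration, a check for the rate-limiting reply described above might look like the following. This is only a sketch based on the standard NTP header layout (LI in the top two bits of the first byte, Reference ID in bytes 12 to 15); the function names are hypothetical, and it says nothing about what w32time actually does.

```python
def parse_ntp_header(packet: bytes) -> dict:
    """Extract the header fields needed to recognise a rate-limit reply."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    first = packet[0]
    return {
        "leap": first >> 6,               # LI: 3 means clock unsynchronized
        "version": (first >> 3) & 0x7,
        "mode": first & 0x7,
        "stratum": packet[1],
        "ref_id": packet[12:16],          # ASCII kiss code in a rate-limit reply
    }

def is_rate_limited(packet: bytes) -> bool:
    """True if the reply carries LI=3 and Reference ID 'RATE'."""
    header = parse_ntp_header(packet)
    return header["leap"] == 3 and header["ref_id"] == b"RATE"

# Hand-built sample reply: LI=3, version 4, mode 4 (server), stratum 0.
sample = bytes([(3 << 6) | (4 << 3) | 4, 0, 0, 0]) + bytes(8) + b"RATE" + bytes(32)
print(is_rate_limited(sample))  # prints: True
```

A client that skips such a check and treats the reply as a normal time sample could compute a wildly wrong offset from it.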