During the NTP outage that happened on 3rd July (due to the DB migration), is it possible that some of the servers responded with a wrong timestamp to NTP requests from clients? We saw timestamps that were in the future (quite far in the future) and are trying to figure out whether that is possible during the outage. Has anyone experienced such a case?
Welcome to the forum, @Arun!
The topic came up in a thread that initially covered a related but different issue, which is why it has an unrelated title.
Thanks @MagicNTP for the pointer; it was quite a useful thread as well.
Another question: during the outage on 3rd July we also noticed that all of the NTP pool hostnames resolved to the address 23.155.40.38, which then returned the wrong (far-future) timestamps. Were all the server hostnames pointed at this particular server, which then had an issue with its timestamps?
As for all pool server DNS entries resolving to 23.155.40.38, no, that sounds unlikely. What kind of evidence do you have regarding this DNS oddity? Log entries or something might be informative.
Hi @avij
It is based on logs that dumped the output from ntpd; here are some of the entries:
"_ts":"2040-08-01T07:35:17.501Z","answer":{"delta":475884571.143786,"ip":"23.155.40.38","slop":317256380.762539,"stratum":1,"update_after":"2040-08-01T07:35:16.834Z","update_before":"2025-07-03T09:25:45.69Z","host":"2.android.pool.ntp.org"}
"_ts":"2036-09-29T21:33:43.501Z","answer":{"delta":354801080.848145,"ip":"23.155.40.38","slop":236534053.898778,"stratum":1,"update_after":"2036-09-29T21:33:43.206Z","update_before":"2025-07-03T09:42:22.358Z","host":"pool.ntp.org"},"unused":["172.16.128.2","172.16.111.2"]
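If the delta field in these records is an offset in seconds (an assumption on my part, the log format isn't documented), the numbers line up with the far-future timestamps: roughly 15 years ahead for the first entry and about 11 for the second. A quick sanity check in Python on the first entry:

```python
import json
from datetime import datetime

# Assumption: "delta" is the measured offset in seconds (log format undocumented).
entry = json.loads(
    '{"_ts": "2040-08-01T07:35:17.501Z", "delta": 475884571.143786,'
    ' "ip": "23.155.40.38", "update_before": "2025-07-03T09:25:45.690Z"}'
)

print(entry["delta"] / (365.25 * 86400))   # ~15.1 years

# Cross-check against the two timestamps carried in the same record
ts = datetime.fromisoformat(entry["_ts"].replace("Z", "+00:00"))
before = datetime.fromisoformat(entry["update_before"].replace("Z", "+00:00"))
print((ts - before).days / 365.25)         # also ~15.1 years
```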
From my independent monitoring.
From 2025-07-03 09:07 until 2025-07-03 12:48, the NTP responses from 23.155.40.38 contained Receive Timestamps with huge errors; the Transmit Timestamps appeared correct.
From 2025-07-03 12:50 until 2025-07-03 13:03, no NTP responses were seen.
From 2025-07-03 13:05 until 2025-07-03 13:09, both timestamps were in error by about 150 msec.
During these intervals the NTP server continued to report stratum 1, root dispersion = 0, and root distance = 0. In other words, 23.155.40.38 was a falseticker.
Normal operation resumed at 2025-07-03 13:11.
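For anyone who wants to reproduce this kind of check, a one-shot SNTP probe along these lines compares a server's Receive and Transmit timestamps against local time (a rough sketch, not the monitoring code behind the timeline above):

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between the NTP epoch (1900) and the Unix epoch (1970)

def query_sntp(host, timeout=2.0):
    """Send one SNTPv4 client request; return the server's Receive and
    Transmit timestamps as Unix seconds."""
    request = bytearray(48)
    request[0] = 0x23  # LI=0, VN=4, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(bytes(request), (host, 123))
        data, _ = sock.recvfrom(512)
    recv_s, recv_f = struct.unpack("!II", data[32:40])  # Receive Timestamp field
    xmit_s, xmit_f = struct.unpack("!II", data[40:48])  # Transmit Timestamp field
    def to_unix(secs, frac):
        return secs - NTP_EPOCH_OFFSET + frac / 2**32
    return to_unix(recv_s, recv_f), to_unix(xmit_s, xmit_f)

recv_ts, xmit_ts = query_sntp("pool.ntp.org")
now = time.time()
print(f"Receive  timestamp off by {recv_ts - now:+.3f} s")
print(f"Transmit timestamp off by {xmit_ts - now:+.3f} s")
```

During the first interval above, a probe like this would presumably have shown the Receive timestamp wildly ahead while the Transmit timestamp stayed close to real time.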
Nobody disputes that this NTP server was operating incorrectly for several hours and that this was detected by the NTP Pool Monitors.
At issue is the DNS aspect. Perhaps a local DNS server cached the DNS response and used it to respond to local clients.
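One way to gather the kind of evidence @avij asked about would be to log, from the affected network, what the pool hostnames resolve to over time; a resolver that keeps handing out a single cached address would show up quickly. A minimal sketch, using the hostnames from the logs above:

```python
import socket
import time

# Log which addresses the pool hostnames resolve to from this vantage point.
HOSTS = ("pool.ntp.org", "2.android.pool.ntp.org")

while True:
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    for host in HOSTS:
        try:
            infos = socket.getaddrinfo(host, 123, proto=socket.IPPROTO_UDP)
            addrs = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            addrs = [f"lookup failed: {exc}"]
        print(stamp, host, addrs)
    time.sleep(60)
```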
FWIW, all the pool monitors agree that IP is not providing good NTP service now. See https://www.ntppool.org/scores/23.155.40.38
Remember that during the July 3 outage, monitoring and DNS updates were broken – so while the pool DNS servers stayed up, the addresses they served were not being updated to reflect changes observed by the monitors. In that situation, it’s feasible 23.155.40.38 was being returned for *.pool.ntp.org queries despite misbehaving.
While that type of pool outage has been very rare, perhaps it shouldn't be.
I think it’s worthwhile for people using cron’d ntpdate or similar against the pool to understand how fragile and dangerous that configuration is if you depend on time even loosely matching between hosts/VMs or in relation to reality.
> FWIW, all the pool monitors agree that IP is not providing good NTP service now. See pool.ntp.org: Statistics for 23.155.40.38
True. My fault for not showing the overall timeline after the timestamp errors ended. I have seen zero responses from the server since 2025-07-03 13:36.
Speaking of which — I noticed that some of the monitoring systems include “TEST” as the RefID in their requests (though not all of them?), and I was wondering whether that’s actually a smart idea. If true, it could make it easier for malicious actors to return valid-looking responses to the monitors, while manipulating replies to (certain) regular clients.
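To make that concern concrete: if a monitor really does set the Reference ID field of its client-mode requests to "TEST" (an assumption on my part, I haven't checked the monitor source), such requests are trivial to fingerprint on the wire, and a misbehaving server could answer them correctly while sending bogus time to everyone else. A sketch of what that fingerprinting would look like:

```python
# Hypothetical client request with the Reference ID field set to "TEST",
# mirroring what the monitors are suspected (not confirmed) to send.
request = bytearray(48)
request[0] = 0x23          # LI=0, VN=4, Mode=3 (client)
request[12:16] = b"TEST"   # Reference ID field, bytes 12-15 of the NTP header

# A misbehaving server could single out monitor traffic with a check this simple:
def looks_like_monitor(packet: bytes) -> bool:
    return len(packet) >= 16 and packet[12:16] == b"TEST"

print(looks_like_monitor(bytes(request)))  # True
```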