Time four months off on 217.198.219.102

amplex · August 17, 2023, 11:46am

On the 15th of August, between 17:00 and 20:00 UTC, a number of Danish street light controllers got a time stamp from the above Stratum 1 server that was more than four months off (31 March). Has this been reported elsewhere, and should we expect that the issue has been resolved?

ask · August 17, 2023, 12:34pm

Hi @amplex,

That IP was removed from the pool at 2023-08-15 14:42:29 (UTC) when it wasn’t responding. When it came back online around 14:59 it had an unsynchronized time signal (“leap=3”) and remained out of the pool.

A couple recommendations:

Don’t use a single IP to get the time
Don’t hold on to a DNS result for an excessive period of time
Check the leap flag; when it’s 3, the clock isn’t synchronized
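
As an illustration of the last recommendation, here is a minimal sketch of reading the leap indicator from a raw (S)NTP response. The leap indicator is the top two bits of the first byte of the packet. The function names `parse_header` and `is_usable` are hypothetical, and the extra stratum 0 check is an assumption about what a cautious client would also want to reject.

```python
import struct

LEAP_UNSYNCHRONIZED = 3  # leap indicator 0b11: server clock is not synchronized


def parse_header(packet: bytes) -> dict:
    """Decode the first two bytes of an NTP packet (hypothetical helper)."""
    if len(packet) < 48:
        raise ValueError("an NTP packet is at least 48 bytes long")
    first, stratum = struct.unpack("!BB", packet[:2])
    return {
        "leap": first >> 6,             # top 2 bits
        "version": (first >> 3) & 0x7,  # next 3 bits
        "mode": first & 0x7,            # low 3 bits
        "stratum": stratum,
    }


def is_usable(packet: bytes) -> bool:
    """Reject responses from unsynchronized servers before touching the clock."""
    header = parse_header(packet)
    # Stratum 0 in a response means "unspecified" or kiss-o'-death.
    return header["leap"] != LEAP_UNSYNCHRONIZED and header["stratum"] != 0
```

A client would call `is_usable(response)` on every reply and drop the sample, without adjusting the clock, whenever it returns `False`.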

Bas · August 17, 2023, 2:29pm

Who programmed just 1 timeserver in those streetlights instead of a dns-pool?
That person should be held accountable.

Sorry, but the pool only hands out servers that are active and have been checked to be correct.

The problem is not caused by us, but by your programmer/supplier of those lights.

amplex · August 17, 2023, 2:49pm

Thank you – I see from logs on the units that leap was indeed 3, thank you for pointing that out. The problem seems to be the way we have been using a quite old version of chrony for the time synchronization. It does invalidate the response at one point, but still registers a time skew, which after a chrony “makestep” sets the time incorrectly.

The units do not use a single IP, but use a DNS result from the pool. They do, however, store the IP address for some time (which should only be as long as they receive valid results); this was originally done to save bandwidth. I believe I have the information to get this fixed, thank you.

amplex · August 17, 2023, 2:53pm

I don’t know where you got that they are programmed that way? They use a dns-pool, but due to bandwidth limitations they can’t do a dns update on every ntp request. If you only have a few available kB per month, that is indeed an issue. But as you see from my other answer, there is clearly a bug in that they don’t correctly discard the response with leap=3.

grifferz · August 17, 2023, 3:17pm

Trusting only one IP for time is not advisable. There is no SLA nor guarantee on the NTP Pool service. If you are going to trust a single IP you really need to be the one operating the software on that IP at the very least (and even then it’s not ideal).

What stops someone joining the pool with many IPs, waiting until they see one of your devices as a client, and then giving you false time with leap deliberately set to 0? I don’t think that anything the pool does, nor anything you do, can stop that, if you trust only one IP address at a time.

This is a very bad idea that should not be replicated by anyone else.
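
To make the point concrete, here is a minimal sketch of why several sources help: with at least three, a client can take the median offset and ignore any source that disagrees with the majority. This is a simplified illustration, not the selection algorithm that chrony or ntpd actually use; `select_offset`, `max_spread` and `min_sources` are hypothetical names and values.

```python
from statistics import median


def select_offset(offsets, max_spread=1.0, min_sources=3):
    """Return an agreed clock offset in seconds, or None if there is no majority.

    offsets: measured offsets (seconds), one per time source.
    max_spread: how far a source may be from the median and still count (assumed).
    min_sources: refuse to decide with fewer sources than this (assumed).
    """
    if len(offsets) < min_sources:
        return None
    mid = median(offsets)
    agreeing = [o for o in offsets if abs(o - mid) <= max_spread]
    # Require a strict majority of sources to agree with each other.
    if len(agreeing) * 2 <= len(offsets):
        return None
    return median(agreeing)
```

With three sane sources and one that is about 137 days behind, `select_offset([0.01, -0.02, 0.03, -11_836_800.0])` ignores the outlier. With only the bad source, `select_offset([-11_836_800.0])` returns `None` and the clock is left alone.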

Bas · August 17, 2023, 3:17pm

Weird.

You can request time but a dns-request is too expensive?
For several months?

Normally you have a DNS cache that holds a working request for some time, usually hours, but this can be extended.
However, DNS requests take just a few bytes.

Also, the ntppool doesn’t use the leap-system.

Looks to me your programmer is way wrong on NTP.

What I don’t understand is, if you have limited bandwidth, why not start your own NTP stratum 2/3/4 server on a static IP that you control yourself???

Then let that server connect to the pool. I mean, a simple RaspberryPi4 is up-to that task.

Placing streetlights all over town costs millions per street, but running a server to keep BW low is too expensive? For Denmark? One of the richest countries in the world…weird.

ask · August 17, 2023, 5:38pm

@amplex Sorry, from the scenario I assumed it was a (too) “dumb” SNTP client!

I’m surprised that chrony didn’t discard the answers from this server and use some from the other configured IPs instead.

If you can tell which version of chrony and share your configuration, @mlichvar might have suggestions for making it behave appropriately. (Though maybe “don’t run an ancient version” will be the first suggestion!)

ask · August 17, 2023, 5:40pm

Since they are using an NTP daemon (rather than an SNTP client) that’s less relevant; we expect NTP daemons to hold on to the IPs they get for weeks or longer.

No, we try to support leap seconds (as much as I’m looking forward to them going away)! (Not that it’s relevant in this scenario, leap=3 doesn’t have anything to do with leap seconds).

amplex · August 18, 2023, 8:33am

This is getting a bit out of context, but FWIW: It is not uncommon that M2M-simcards are limited to 1MB of data per month, even if this seems absurd. As a DNS request is normally around 500 bytes, while I figure an ntp request is around 50, it makes a lot of sense, to me at least, to limit the number of DNS requests. This doesn’t make the current handling correct, and as others have suggested the solution seems to be to switch to our own ntp server.
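
A rough worked version of that arithmetic, using the estimates above (500 bytes per DNS lookup, 50 bytes per NTP request, 1 MB taken as 1,000,000 bytes). Treating those figures as the full cost of each exchange is an assumption, and `syncs_per_month` is a hypothetical helper.

```python
def syncs_per_month(budget_bytes, ntp_bytes, dns_bytes, syncs_per_lookup):
    """Number of NTP exchanges that fit in the budget when one DNS lookup
    is made for every `syncs_per_lookup` NTP exchanges."""
    cost_per_sync = ntp_bytes + dns_bytes / syncs_per_lookup
    return int(budget_bytes // cost_per_sync)


BUDGET = 1_000_000  # 1 MB per month, assumed to mean 10**6 bytes

every_time = syncs_per_month(BUDGET, ntp_bytes=50, dns_bytes=500, syncs_per_lookup=1)
every_100 = syncs_per_month(BUDGET, ntp_bytes=50, dns_bytes=500, syncs_per_lookup=100)
print(every_time, every_100)
```

Under these assumptions, a DNS lookup before every request leaves room for about 1,800 synchronizations a month, against about 18,000 when the address is reused for 100 requests. That budget is also shared with whatever the unit actually has to report.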

amplex · August 18, 2023, 8:39am

No problem, and I understand why you would think that. The units that we are talking about are running a very old version of chrony (1.45), where requesting a “makestep” seems to not take the leap=3 into account. We will look into switching to a more recent version as well, there is probably no need to spend more time on the problem mentioned here. Thanks for your kind assistance, though.

Bas · August 18, 2023, 2:26pm

I don’t. My Chrony has the support turned off.

@ask please enlighten me, as leap=3 is something not to be found with google, so I assumed it’s the leap-second-support-system in the daemons and clients.

Please tell me what it is, thanks.

grifferz · August 18, 2023, 3:05pm

Ask already posted a link to what it is; it’s the Leap Indicator field as described by RFC 1361, Simple Network Time Protocol (SNTP).

(Ask had originally assumed SNTP so posted that RFC, but the same packet is relevant for NTP and is also described in RFC1305.)

ask · August 18, 2023, 11:10pm

Yes. I don’t know what chrony or ntpd do with leap=3 (“not synchronized”) packets, but it’d be reasonable if they’d be wary of them, in particular when stepping the clock. The monitoring system doesn’t give a negative score for “not in sync” responses if the responses are still accurate (though I have a todo item to do something else with them …).

@davehart or @mlichvar, what is the recommended client behavior when getting leap=“not in sync” responses? Should the NTP Pool monitor remove servers with that response (that aren’t also KoD responses, which are already being removed)?

@amplex I’d recommend you configure chrony with the pool keyword to make sure it’s using multiple servers (or multiple server statements if your version is too old to have pool).

davehart · August 18, 2023, 11:51pm

Ntpd ignores upstream leap=3/unsynched responses as far as timekeeping, but it considers the source still responsive as far as the reach register goes.

As the (S)NTP RFCs surely state, clients must not use the time when leap is 0b11 (i.e. 3). This occurs routinely during ntpd startup before the first system peer selection.
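
A simplified model of that behavior, not ntpd’s actual code: the reach register is an 8-bit shift register that moves left on every poll and gets its low bit set when a response arrives, while the offset is only kept for timekeeping if leap is not 3. The `Source` class and its method names are hypothetical.

```python
LEAP_UNSYNCHRONIZED = 3  # leap indicator 0b11


class Source:
    """Toy model of one upstream time source."""

    def __init__(self):
        self.reach = 0     # 8-bit reachability shift register
        self.samples = []  # offsets (seconds) accepted for timekeeping

    def poll_sent(self):
        # Each poll shifts the register left; the new low bit starts at 0.
        self.reach = (self.reach << 1) & 0xFF

    def response_received(self, leap, offset):
        # Any response counts as "reachable", even an unsynchronized one.
        self.reach |= 1
        # Only synchronized responses are used for timekeeping.
        if leap != LEAP_UNSYNCHRONIZED:
            self.samples.append(offset)
```

Feeding this model a leap=3 reply that is roughly 137 days off sets the reach bit but leaves `samples` empty, so there is nothing for a later clock step to act on.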
