Do not use "offset" value from KOD responses in offset/score graphs

For reasons out of scope here, one of my IPv6 NTP servers is occasionally responding with a RATE KOD packet to probe requests from the monitoring system.

With a KOD packet, various timestamp fields have undefined contents and must not be relied upon to have valid values.

Still, it seems that for display purposes in the offset/score graphs, the monitoring system calculates an offset from those undefined values and then includes the sample in the graph based on that bogus offset.

As the resulting offset values can become very large (e.g. 3841726559 seconds), the scale of the Y axis in the offset/score graph is drastically skewed. This makes it practically impossible to assess from the graph the actual offset performance of the server for the vast majority of requests.

[Screenshot: skewed offset scale due to KOD responses]
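(My guess as to why the values get so large: if the usual offset calculation, offset = ((T2 - T1) + (T3 - T4)) / 2, is applied to a KOD packet whose server-side timestamps T2 and T3 are zero or otherwise meaningless, the result ends up on the order of the full NTP timestamp, i.e. billions of seconds, which matches the magnitude above. That is only an assumption on my part, though.)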

Would it be possible to not display KOD response packets based on the bogus “offset” value calculated from them? E.g., display them with zero offset but in a distinct color to indicate the KOD condition?
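Purely for illustration, here is a rough sketch in Go (with made-up type and field names, not the pool monitor's actual code) of how a stratum-0 kiss-o'-death response could be recognized and kept out of the offset plot:

```go
package main

import "fmt"

// Sample is a hypothetical per-probe record; field names are illustrative,
// not the pool monitor's actual data model.
type Sample struct {
	Stratum  uint8
	RefID    [4]byte // reference ID bytes from the response packet
	Offset   float64 // offset computed from the packet timestamps (seconds)
	KissCode string  // ASCII kiss code, only meaningful for stratum-0 responses
}

// classify decides how a sample should be treated for graphing.
// Per RFC 5905, a stratum-0 response is a kiss-o'-death packet whose
// reference ID carries an ASCII kiss code such as "RATE"; its timestamp
// fields are not meaningful, so the offset derived from them should not
// be plotted.
func classify(s *Sample) string {
	if s.Stratum == 0 {
		s.KissCode = string(s.RefID[:]) // e.g. "RATE", "DENY", "RSTR"
		s.Offset = 0                    // suppress the bogus offset for plotting
		return "kod"
	}
	return "ok"
}

func main() {
	kod := &Sample{Stratum: 0, RefID: [4]byte{'R', 'A', 'T', 'E'}, Offset: 3.8e9}
	fmt.Println(classify(kod), kod.KissCode, kod.Offset) // prints: kod RATE 0
}
```

The key point is simply that a stratum-0 response is identified by the kiss code in its reference ID and never contributes its computed offset to the graph.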

I understand this seems to be a rare issue, so it is not the highest priority to fix. But maybe it is something that can be addressed when changes are implemented in that area anyway, e.g., in connection with the ongoing discussion about whether/how legacy scores should be displayed in the graph.

Thanks!


Yeah, that seems pretty reasonable.

It’s logging the offset because I thought it’d be worth knowing what offset is returned no matter what, but it’s fair to say that if a client is ignoring the kiss code, we can’t reasonably care about what offset it gets anyway.

Currently the kiss code gets lumped in with other errors. I’ll make a change so it’s tracked more distinctly, and then once the monitors have been upgraded I can fix the graphs to use that information.
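Roughly the shape I have in mind, as a sketch only (the actual schema and field names may end up different): each sample carries an explicit status and the kiss code, and no offset is stored at all for KoD responses, so the graph layer can plot them at zero in a distinct color as you suggest.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// PlotPoint sketches a per-sample record the graph layer could consume;
// the monitor's real schema may differ.
type PlotPoint struct {
	Timestamp time.Time `json:"ts"`
	Status    string    `json:"status"`           // "ok", "timeout", "error", or "kod"
	KissCode  string    `json:"kiss,omitempty"`   // e.g. "RATE"; set only when Status == "kod"
	Offset    *float64  `json:"offset,omitempty"` // nil (omitted entirely) for KoD samples
}

func main() {
	p := PlotPoint{Timestamp: time.Now(), Status: "kod", KissCode: "RATE"}
	b, _ := json.Marshal(p)
	fmt.Println(string(b)) // e.g. {"ts":"...","status":"kod","kiss":"RATE"}; no offset field
}
```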

I’m surprised, though, that the monitors are doing enough queries that your server is responding with KoD packets.


Many thanks for the quick reply!

Many thanks in advance; looking forward to the change, but no hurry.

Indeed, that is kind of puzzling to me as well. This is the only server on my side for which this happens, and it seems to happen for IPv6 only. Unfortunately, I recently lost my public IPv4 address, so I can no longer compare the behavior between the two IP versions.

It is running a pre-compiled version of NTP classic. There is nothing in the config that would be significantly different from what I am using on my other servers, so my current hypothesis is that something in that build causes this behavior, and I am trying to rebuild it myself.
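For reference while comparing configs: as far as I understand the ntpd documentation, RATE KoD responses are only sent when both the limited and kod flags are present on the matching restrict line, with the thresholds coming from the discard directive. Something like the following (values purely illustrative, not my actual config):

```
# illustrative ntp.conf fragment for classic ntpd; values are examples only
discard average 3 minimum 2                                   # rate limits checked by "limited"
restrict default kod limited nomodify notrap nopeer noquery   # "kod" + "limited" enables RATE KoDs
restrict -6 default kod limited nomodify notrap nopeer noquery
```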

Will also keep experimenting with the configuration; there might yet be some subtle difference from my other servers, owing to this one running on a different kind of platform/environment. (I just realized that I used the xmtnonce option for the upstream servers of this one, which I have not used anywhere else so far. That should only affect behavior towards upstream servers, not downstream clients, but who knows…)

Will also look a bit more into the pattern of incoming traffic to see whether anything stands out that correlates with this server's behavior.

Any other thoughts/suggestions?
