Asymmetric latency measurement tool

Hello,
I made a little tool (with ChatGPT doing most of the actual coding work) that measures the latency between two systems on the internet in each direction, to detect any asymmetry in the network latency. I’ve spent most of the day so far fiddling with it, and it seems to be working as intended.

What it does is exchange timestamps and offset values.

Basically, the client sends a UDP packet with the time it thinks it is, and the server sends back its own timestamp along with the offset it calculated from the client’s initial timestamp. When the client receives this, it in turn compares the server’s timestamp to its current time. If there’s a discrepancy between the offset reported by the server and the offset calculated by the client, we have detected asymmetric latency, and the tool will spit out ratio numbers and an offset to put in the Chrony config.
It runs 10 times back and forth, exchanging timestamps and offsets, and at the end it will spit out the median values as a conclusion.
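
As a minimal sketch of the arithmetic described above (not the tool's actual code; the function name, variable names, and sample timestamps are assumptions made for illustration), one exchange boils down to four timestamps, two read from each clock:

```python
from statistics import median

def exchange(client_send, server_recv, server_send, client_recv):
    """Evaluate one exchange; all times are in seconds.

    client_* values are read from the client's clock and server_* values
    from the server's clock. All names here are illustrative assumptions.
    """
    remote_offset = server_recv - client_send  # what the server would report
    local_offset = client_recv - server_send   # what the client calculates
    # Round trip time, excluding the time the server spent on the request.
    rtt = (client_recv - client_send) - (server_send - server_recv)
    discrepancy = remote_offset - local_offset
    return rtt, remote_offset, local_offset, discrepancy

# Three made-up exchanges; a real run would collect ten of them.
samples = [
    exchange(0.000, 0.110, 0.111, 0.201),
    exchange(1.000, 1.112, 1.113, 1.203),
    exchange(2.000, 2.109, 2.110, 2.200),
]
median_rtt = median(s[0] for s in samples)
median_discrepancy = median(s[3] for s in samples)
print(f"median RTT {median_rtt * 1000:.2f} ms, "
      f"median discrepancy {median_discrepancy * 1000:.2f} ms")
```

Each offset here contains the one-way delay in its direction plus or minus the difference between the two clocks, so the discrepancy equals the delay asymmetry plus twice the clock difference.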

I imagine this might be interesting to some people here. If anyone wants to test it, and maybe improve on it, that would be really cool!

Chrony does this by itself: ‘chronyc sourcestats’ will tell you the exact offset. You only need to put that number in the offset option and restart chronyd.

chronyc sourcestats 
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
GPS                         8   4    27   -100.254    947.357   +724us  4060us
PPS                        64  36   126     -0.001      0.050     -1ns  4257ns
MiniPC.fritz.box           41  16  138m     -0.002      0.016    -23us    43us
ntp-main-1.oma.be          12   9  241m     +0.009      0.102  -1619us   315us
ntp0.nl.uu.net              7   3  137m     +0.030      0.462  -3639us   483us
ptbtime1.ptb.de            11   5  224m     +0.038      0.174  -1467us   438us

You can correct the offset; however, the Internet is never the same speed, so it won’t help. On a local LAN it can be done, but the offset will be so low that it’s not useful.

Can you explain how this is different from what chrony already does?

Bas

I’m aware that Chrony does this. I’m not sure how it accounts for asymmetric latency though, which is what my tool is measuring. The point isn’t to measure the full round trip.

Afaik, there’s no easy way to specifically measure asymmetric latency using the NTP protocol. But it could very well be that Chrony is doing a better job of accounting for this by itself.

I can’t really think of a way to verify that my tool’s readings are actually accurate either, other than knowing with a 100% certainty that the two clocks are perfectly synchronised. Which I can’t do.

The problem with UDP is that you don’t have a reliable round trip.
It’s not for nothing that it’s called the SEND and DON’T care protocol :laughing:

You can do this with TCP and you get a pretty good figure, but not with UDP.

As such it will be an estimate. UDP is not a good protocol for measuring such things, as there is no acknowledgement that a packet was received, nor a resend.

UDP is a send-and-forget protocol. I fail to see how you measure better than Chrony does.

Chrony compares several sources, so it knows which is right. But you need to give it enough sources OR connect your own GPS with PPS.

I know. Sometimes the packets get lost. I’ve seen that while testing. It’s not a big deal, that’s why we’re sending more packets. And it is certainly possible to send a response in return.

No, you can’t actually do this properly with TCP because it has to communicate back and forth a lot just to deliver one message, in one direction, just to be sure everything arrived intact and in the chronological order. It first needs a full 3-way handshake, and during the transmit, control messages will go back and forth. It won’t reach the application until the OS is 100% certain nothing is lost or in the wrong order, and in the meantime, the app is trying to time the responses, rendering the measurements useless.

With UDP, you shoot a message to the server, and assuming it arrived correctly, it will read the message, compare the time, and shoot a message in return so the client can do its own comparisons again. Given how important the timing is here, UDP is the only sensible approach. NTP also uses UDP, for the same reasons.

The client and server timestamps are compared on both sides and the results are reported back. I don’t think the NTP protocol lets you do that. Which isn’t to say I’m necessarily doing better, I’m just tinkering.
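
A minimal sketch of such a UDP exchange, assuming a made-up wire format of big-endian doubles (this is not the tool's actual protocol, and `serve_once` and `probe` are hypothetical names). It runs over loopback so it is self-contained, and it treats a lost packet as a timeout that the caller can simply retry:

```python
import socket
import struct
import threading
import time

def serve_once(sock):
    """Answer one request with the client's timestamp plus our receive and send times."""
    data, addr = sock.recvfrom(64)
    server_recv = time.time()
    (client_send,) = struct.unpack("!d", data)
    reply = struct.pack("!ddd", client_send, server_recv, time.time())
    sock.sendto(reply, addr)

def probe(addr, timeout=1.0):
    """Send one timestamped request; return four timestamps, or None if it was lost."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(struct.pack("!d", time.time()), addr)
        try:
            data, _ = sock.recvfrom(64)
        except socket.timeout:
            return None  # request or reply was lost; the caller just sends another
        client_recv = time.time()
    client_send, server_recv, server_send = struct.unpack("!ddd", data)
    return client_send, server_recv, server_send, client_recv

# Loopback demo: the "server" is a thread bound to an ephemeral port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
threading.Thread(target=serve_once, args=(server,), daemon=True).start()
result = probe(server.getsockname())
server.close()
print(result)
```

On loopback both ends read the same clock, so this only demonstrates the message flow, not a meaningful asymmetry reading.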

Ok. Hopefully it will give results.

Well it does something.

Between Norway and Singapore:

./asymmetry.py sg.ntp.awhell.no
Initiating exchange with sg.ntp.awhell.no...
Sending exchange requests. Press Ctrl+C to stop.
Exchange 0 - RTT: 337.09 ms, Local Offset: -172.72 ms,  Remote Offset: 164.38 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 345.43 ms
Exchange 1 - RTT: 336.22 ms, Local Offset: -171.99 ms,  Remote Offset: 164.24 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.97 ms
Exchange 2 - RTT: 336.48 ms, Local Offset: -172.00 ms,  Remote Offset: 164.49 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 344.00 ms
Exchange 3 - RTT: 335.88 ms, Local Offset: -171.93 ms,  Remote Offset: 163.96 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.84 ms
Exchange 4 - RTT: 335.95 ms, Local Offset: -171.96 ms,  Remote Offset: 163.99 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.92 ms
Exchange 5 - RTT: 335.48 ms, Local Offset: -171.93 ms,  Remote Offset: 163.56 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.85 ms
Exchange 6 - RTT: 336.20 ms, Local Offset: -171.98 ms,  Remote Offset: 164.24 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.94 ms
Exchange 7 - RTT: 336.30 ms, Local Offset: -171.97 ms,  Remote Offset: 164.34 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.93 ms
Exchange 8 - RTT: 335.81 ms, Local Offset: -171.88 ms,  Remote Offset: 163.94 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.75 ms
Exchange 9 - RTT: 336.16 ms, Local Offset: -171.89 ms,  Remote Offset: 164.28 ms,    Asymmetry Ratio: 51/49 (-0.02, 51:49),    Skew: 343.78 ms

---- MEDIANS:
RTT: 336.18
Ratio: 49/51 (51:49)
Chrony offset: -0.02000

On LAN:

Initiating exchange with pi4...
Sending exchange requests. Press Ctrl+C to stop.
Exchange 0 - RTT: 3.52 ms, Local Offset: 17.04 ms,  Remote Offset: 20.55 ms,    Asymmetry Ratio: 45/55 (0.10, 9:11),    Skew: -34.07 ms
Exchange 1 - RTT: 1.00 ms, Local Offset: 16.74 ms,  Remote Offset: 17.74 ms,    Asymmetry Ratio: 49/51 (0.02, 49:51),    Skew: -33.48 ms
Exchange 2 - RTT: 0.00 ms, Local Offset: 17.62 ms,  Remote Offset: 17.62 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -35.24 ms
Exchange 3 - RTT: 0.00 ms, Local Offset: 17.12 ms,  Remote Offset: 17.12 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -34.24 ms
Exchange 4 - RTT: 0.00 ms, Local Offset: 17.51 ms,  Remote Offset: 17.51 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -35.02 ms
Exchange 5 - RTT: 1.00 ms, Local Offset: 16.93 ms,  Remote Offset: 17.93 ms,    Asymmetry Ratio: 49/51 (0.02, 49:51),    Skew: -33.86 ms
Exchange 6 - RTT: 0.00 ms, Local Offset: 17.31 ms,  Remote Offset: 17.31 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -34.63 ms
Exchange 7 - RTT: 1.00 ms, Local Offset: 16.74 ms,  Remote Offset: 17.73 ms,    Asymmetry Ratio: 49/51 (0.02, 49:51),    Skew: -33.47 ms
Exchange 8 - RTT: 0.00 ms, Local Offset: 17.11 ms,  Remote Offset: 17.11 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -34.21 ms
Exchange 9 - RTT: 0.00 ms, Local Offset: 17.51 ms,  Remote Offset: 17.51 ms,    Asymmetry Ratio: 50/50 (0.00, 1:1),    Skew: -35.03 ms

---- MEDIANS:
RTT: 0.00
Ratio: 50/50 (1:1)
Chrony offset: 0.00000

I’ve done extensive measurements using standard NTP. See Multiple Network Paths in my PTTI paper.
Some comments:

  • If the NTP server and NTP client have local times that differ significantly (milliseconds or more), the calculated asymmetry will be incorrect. This risk also exists with the custom UDP program described in this thread.

  • The unidirectional delay may depend on which UDP ports are used. I monitor NTP using multiple UDP client ports.

  • NTP server implementations can have weird bugs. Using custom UDP programs as done by Badeand might avoid many of these.

  • The UDP port used by NTP (123) is sometimes the target of middleboxes which can result in drops and unexpected delays. The custom UDP program may help.

I was pretty certain my approach shouldn’t have any trouble even if the local times differ, so long as the 32 bit signed integer used to transfer the offset reading doesn’t overflow?

The offsets I’m measuring contain both the network latency and the discrepancy between the local and remote time, but I don’t care what the actual offset between the systems is, just the discrepancy between the offset readings from each side.

If the connection is perfectly symmetric, both should arrive on the same offset, whatever it may be, while asymmetry would mean that there’s a shorter or longer delay in one direction, causing the offset to differ because it takes longer or shorter to reach the other way.

Sounds like you might have some interesting insights into when and why my approach might fail to make an accurate measurement?
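
On the 32-bit signed integer mentioned above: how large an offset fits depends on the unit it is sent in, which the thread does not state. The sketch below assumes microseconds or nanoseconds purely for illustration; `pack_offset` is a hypothetical helper, not part of the tool.

```python
import struct

INT32_MAX = 2**31 - 1

def pack_offset(offset_seconds, units_per_second):
    """Pack an offset as a big-endian signed 32-bit integer.

    Raises struct.error if the scaled value does not fit in 32 bits.
    """
    return struct.pack("!i", round(offset_seconds * units_per_second))

# Largest representable offset for two assumed units.
max_if_microseconds = INT32_MAX / 1_000_000      # about 2147 s (roughly 35.8 minutes)
max_if_nanoseconds = INT32_MAX / 1_000_000_000   # about 2.147 s

print(pack_offset(-0.17272, 1_000_000).hex())    # a -172.72 ms offset fits easily
try:
    pack_offset(3.0, 1_000_000_000)              # 3 s in nanoseconds does not fit
except struct.error as exc:
    print("overflow:", exc)
```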

Asymmetric latency cannot be measured without reference clocks or other accurate time sources on both ends. If it were possible, NTP would already be doing that.

Your program seems to be implementing an NTP-like client/server exchange, but it doesn’t use kernel timestamping and it’s in Python, so accuracy is not great.

Assuming you have both ends synchronized to an accurate local time source, you can measure the asymmetry simply by using an NTP client configured with the noselect option. From the measured offset and peer delay you can calculate delay in the two directions as delay / 2 + offset and delay / 2 - offset.
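
The split described above is simple enough to sketch in a few lines. The function name and the sample numbers are made up for illustration, and which of the two results belongs to which direction depends on the sign convention of the offset your NTP client reports, so check that against its documentation.

```python
def split_delay(peer_delay, offset):
    """Split a measured round-trip delay into two one-way delays.

    Only meaningful when both ends are synchronized to accurate local
    time sources, so that the measured offset is caused by path
    asymmetry rather than by clock error. Inputs are in seconds.
    """
    return peer_delay / 2 + offset, peer_delay / 2 - offset

# Hypothetical measurement: 20 ms peer delay, 3 ms measured offset.
one_way_a, one_way_b = split_delay(peer_delay=0.020, offset=0.003)
print(f"{one_way_a * 1000:.1f} ms one way, {one_way_b * 1000:.1f} ms the other")
```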

It exchanges timestamps to calculate the offset on both ends, and then compares the offsets rather than the timestamps to detect asymmetry. The idea seems plausible in my head, at least, and the measurements seem to make sense as far as I can tell.

Agree with Miroslav. If the two clocks have unknown accuracy, the offset can be calculated to an uncertainty of RTT/2 at best. The RTT split between the outbound and return path is unknown, hence the asymmetry is poorly known.
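
To make the point concrete, here is a small simulation (all names and numbers are assumptions chosen for illustration). It produces the four timestamps of one exchange for two different situations: a symmetric path with a 10 ms clock error, and an asymmetric path with perfectly synchronized clocks.

```python
def timestamps(theta, d_out, d_back, t1=100.0, processing=0.001):
    """Return the four timestamps of one client/server exchange.

    theta:  server clock minus client clock, in seconds
    d_out:  true one-way delay client -> server
    d_back: true one-way delay server -> client
    """
    t2 = t1 + d_out + theta   # request arrival, read from the server clock
    t3 = t2 + processing      # reply departure, read from the server clock
    t4 = t3 - theta + d_back  # reply arrival, read from the client clock
    return t1, t2, t3, t4

symmetric_path = timestamps(theta=0.010, d_out=0.050, d_back=0.050)
asymmetric_path = timestamps(theta=0.000, d_out=0.060, d_back=0.040)

# Compare with a small tolerance to allow for floating point rounding.
indistinguishable = all(
    abs(a - b) < 1e-9 for a, b in zip(symmetric_path, asymmetric_path)
)
print(indistinguishable)
```

Both situations yield the same timestamps, so no calculation on those timestamps alone can tell a clock error from a path asymmetry.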

If both endpoints are synchronized with a common time-base (e.g., sync to UTC using NTP) the asymmetry can be measured to within some error. NTP implementations strive to minimize this error.

A lot of work goes into measurement and improving IP network latency.
RFC 2679 and RFC 7679 may be of interest. Also see One-way Active Measurement Protocol (OWAMP) Requirements.

On top of that, there is no point in measuring the round trip, as the request for time doesn’t need the best speed or the lowest jitter. It’s only the answer that needs to be as jitter-free as possible.
Ergo, a short path will be better than a route around the globe.

The request for time is just that, a request; other than that, the client waits for an answer, and it’s the answer that matters, not so much the request.

This is UDP, not TCP, so it’s very hard to measure without some form of synchronization or feedback, as others have mentioned too.

How do you do that if you don’t have root access to the other server?

In my opinion the ‘noselect’ option in chrony is far more useful to see how good other NTP servers are.

Bas.

You don’t need root access, but you need access.

It sure is, especially if your system’s time is correct. But I’m not trying to measure the quality of NTP servers.

If Badeand is onto something, it may revolutionise the NTP protocol. New inventions always encounter resistance. Must look into this, but it’s Friday.

PS: Badeand has a cute nick.

I noticed a massive discrepancy when testing from different computers on the same local network against a server on the internet. I don’t know if that’s mainly because the implementation is flawed or because the method itself is. Maybe it’s indeed impossible. Let me know what you find :slight_smile:

<3

@mlichvar, it’s not clear to me why the information needed to compute 1-way delays isn’t all there.

If a pool NTP server is stratum 0 or stratum 1 and has a score above 10, then you can be fairly sure the server is synced quite well. The error is probably a tenth of a millisecond or less.

So if one stratum 0 or stratum 1 server (with a score above 10) is able to send NTP packets to another similar server, then those two machines are synced with less than a millisecond of error.

From there, the T1, T2, T3, and T4 times are all known to sub-millisecond accuracy. This image from here is a little more annotated than David Mills’ classic NTP delay sequence.

One-way delay_1 = T2 - T1
One-way delay_2 = T4 - T3

Why isn’t this implemented as an option in chronyd? It seems like a straightforward configurable option to select another server to use for 1-way delay estimation.

Yes, you can measure the asymmetry in the network delay between two NTP servers that are synchronized to a reference clock. All necessary information to split the measured peer delay in the two directions is in the ntpdata report and measurements log. It’s peer_delay / 2 + offset and peer_delay / 2 - offset.
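
A short sketch showing why that formula works, using the standard NTP on-wire calculation of offset and delay from the four timestamps. The timestamps are made up, and the sign of the offset in a given client's reports may be the opposite of the one used here, in which case the two results swap.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client send, t2: server receive, t3: server send, t4: client receive.
    A positive offset here means the server clock is ahead of the client.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Made-up timestamps from two clocks assumed to be accurately synchronized.
t1, t2, t3, t4 = 10.000, 10.172, 10.173, 10.337
offset, delay = ntp_offset_and_delay(t1, t2, t3, t4)

outbound = delay / 2 + offset   # algebraically identical to t2 - t1
inbound = delay / 2 - offset    # algebraically identical to t4 - t3
print(f"outbound {outbound * 1000:.1f} ms, inbound {inbound * 1000:.1f} ms")
```

So the split recovers exactly the T2 - T1 and T4 - T3 differences, which are true one-way delays only to the extent that both clocks are accurate.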

wow.

I’ve never heard it put so simply. :face_with_monocle: :star_struck:

Is this common knowledge? I’ve searched for this in the past but never quite found a satisfying answer.

I’ll give this a try and see what is discovered :thinking:

This also makes it possible to compare these asymmetrical delays with the One-way Active Measurement Protocol (OWAMP) mentioned earlier in this thread.