Monitors belgg1-19sfa9p and belgg2-19sfa9p having hiccups?

Hi @bas,

Could you kindly check your monitors belgg1-19sfa9p and belgg2-19sfa9p? For the past few days, they have occasionally been reporting unusually high offsets, sometimes larger than 1 second.

ts_epoch,ts,offset,step,score,monitor_id,monitor_name,leap,error
1730034823,2024-10-27 13:13:43,0.001598066,1,15.797745705,41,belgg2-19sfa9p,,
1730032413,2024-10-27 12:33:33,-0.315106736,-0.260426939,15.576574326,41,belgg2-19sfa9p,,
1730029998,2024-10-27 11:53:18,0.000873653,1,16.670526505,41,belgg2-19sfa9p,,
1730029686,2024-10-27 11:48:06,0.001589592,1,16.495292664,41,belgg2-19sfa9p,,
1730029234,2024-10-27 11:40:34,-1.213118746,-2,16.310832977,41,belgg2-19sfa9p,,
1730028873,2024-10-27 11:34:33,0.002238161,1,19.274560928,41,belgg2-19sfa9p,,
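A minimal sketch of how such outliers could be picked out of a CSV log like the one above. The function name `large_offsets` and the 0.1 s threshold are assumptions chosen for illustration only, not anything the pool itself uses:

```python
import csv
import io

# Sample rows copied from the log excerpt above.
LOG = """ts_epoch,ts,offset,step,score,monitor_id,monitor_name,leap,error
1730034823,2024-10-27 13:13:43,0.001598066,1,15.797745705,41,belgg2-19sfa9p,,
1730032413,2024-10-27 12:33:33,-0.315106736,-0.260426939,15.576574326,41,belgg2-19sfa9p,,
1730029998,2024-10-27 11:53:18,0.000873653,1,16.670526505,41,belgg2-19sfa9p,,
1730029686,2024-10-27 11:48:06,0.001589592,1,16.495292664,41,belgg2-19sfa9p,,
1730029234,2024-10-27 11:40:34,-1.213118746,-2,16.310832977,41,belgg2-19sfa9p,,
1730028873,2024-10-27 11:34:33,0.002238161,1,19.274560928,41,belgg2-19sfa9p,,
"""


def large_offsets(csv_text, threshold=0.1):
    """Return (ts, offset) pairs whose absolute offset exceeds `threshold` seconds.

    The threshold is a hypothetical value picked for this example.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    result = []
    for row in reader:
        offset = float(row["offset"])
        if abs(offset) > threshold:
            result.append((row["ts"], offset))
    return result


outliers = large_offsets(LOG)
for ts, offset in outliers:
    print(ts, offset)
```

With the six sample rows, this flags the two measurements at 12:33:33 and 11:40:34, which are also the only ones with a negative step.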

This affects all my servers across four locations in Germany and two in Singapore. It affects both IPv4 and IPv6. It also affects your own servers in multiple locations.

So I think it is very unlikely that this is a server-side issue, but rather points to a monitor-side one.

It is not a big issue from a functional point of view, as it happens only rarely and there obviously is a sufficient number of other monitors. But it badly skews the “Offset and scores” graphs.

It seems this roughly started around the time when belgg1-19sfa9p started monitoring IPv6 servers again. I obviously don’t know whether there is a causal connection, or whether the two things are completely unrelated.

Thanks!

Belg2 wasn’t supposed to work.

The problem is the router that is in use at the moment.
It cannot handle the number of requests.

I did change it to a faster router but our monopolistic ISP won’t allow the faster router.
However, from 1 Nov on they are forced to lift the restriction, and I will put the faster router back in place.

But it can’t affect your server, as timeouts only remove my testing-server from being used as test-server.

If you look:

https://www.ntppool.org/scores/87.118.104.17

and

https://www.ntppool.org/scores/2603:c020:8017:3e00::123

My server is taken OFF the scoring for your server. Whatever it produces, it will not be used to rate your scoring.

That is the beauty of the new scoring system, only active-monitors give scoring.
Any server, like mine at the moment, not scoring well all the time, is taken ‘offline’ for you.

Thanks for your reply!

If you reread my message more carefully, you’ll see that I wrote that it is not a functional issue for my or other servers, in the sense of affecting the scoring.

It is the graphs for many servers that are getting messed up to the point of being unusable.

Is there a way to disable your monitors until such time as the issue with your router has been addressed?

Yeah I read that, but you complain to me :rofl:

When in fact you must complain to @ask that he should stop including all servers in the graph.

As I can do nothing about it but take all monitors offline, but the issue isn’t constant.

It’s just the router, currently a FritzBox 7530AX, that simply doesn’t cope and slows down.
I tried to relieve it by moving the DNS server off it and onto internal servers, but that doesn’t help.

It’s the firewall/nat tables that grow too big in my opinion.

Next Friday I can use the FritzBox 5690Pro again, which is far faster and doesn’t have this problem.

Last week my ISP, not the monopolistic one, did a test on VDSL and my router was set back to 8 Mbit down and 512 Kbit up, while it is normally 100/35 Mbit.
But it will be fixed soon…then finally free modem-router choice is forced on the monopolistic ISP that controls the network.

Sorry my friend…not much I can do at the moment.

However, I did run over a month with the new router, but they noticed :face_with_symbols_over_mouth:

But still you responded only to the issue that I mentioned was a non-issue.

How about just stopping the monitoring processes on the respective monitor machines?

I can do that. No problem.

I have stopped them now.

Many thanks!

Good luck with the new old router then on Nov 1!

Help me remember to turn them back on by then :rofl:

Yes, I will. Might not be on Nov 1 right away, depending on when I get around to it. But I made a note of this.

We will see…can’t wait to put the new router in place, as it is so much faster it’s not funny anymore.
The forced routers simply hinder internet performance.

And it’s simply not possible to install anything better…can’t wait until 1 Nov.

Belgium really ***** on these matters.

As the fiber guys here face the same problem, they are forced to use the Proximus ONT instead of their own.

As the state owns the monopolistic ISP Proximus…they do not do anything unless Europe forces them. Thank god the EU forces this upon us :ok_hand:

Operators in Germany in the past also tried to redefine the network termination point to be an operator-provided device, rather than the end of the physical line. But so far, luckily, the regulator has resisted such lobbying (certainly for copper-based access, but I think for optical fiber as well)…

BELGG2 has been largely a random number generator from here. The only harm done would be the forced re-scaling of the entire graph.

I’ve got some code here that stops that from happening in a dynamic graphing system, but it’s now almost 30 years old, so unlikely to be a significant contribution.

I am a bit unsure how to deal with this. The current graphing already has some offset-dependent scaling, so that occasional “bigger” offsets don’t reduce the visibility and differentiation of the “lesser” offsets too much (the latter hopefully far outnumber the larger outliers). But that obviously has its limits. So how can the large outliers be kept in view, as they are interesting to see at a glance as well, while still keeping the better resolution in the more interesting, lower-offset areas?

Have you previously taken a look at the current approach, as documented in the GitHub repositories, to see whether you could build on that? I don’t understand enough of the topic and the code to contribute to this, and also Ask has previously signaled that the graphing is an area he prefers not to touch too much, given it is in a language that he doesn’t routinely deal with (if I understood correctly). But if there were specific proposals, I think he might generally be open to that (which, however, doesn’t mean he’ll be able to actually spend time on this, given his other constraints and priorities vs. available timeslots, so it also might lead nowhere after all; one never knows beforehand).
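One commonly used way to handle the trade-off described above is a symmetric log transform: linear close to zero, logarithmic further out, so large outliers stay on the plot without flattening the small offsets. The following is only a sketch of that general idea and not the pool’s actual graphing code; the name `symlog` and the 10 ms linear threshold `linthresh` are assumptions made for illustration:

```python
import math


def symlog(offset, linthresh=0.01):
    """Map an offset in seconds to a plot coordinate.

    Within +/- linthresh the mapping is linear (scaled to the range -1..1).
    Beyond that, each further factor of 10 in the offset adds 1 to the
    coordinate, keeping the sign of the offset.
    """
    magnitude = abs(offset)
    if magnitude <= linthresh:
        return offset / linthresh
    return math.copysign(1.0 + math.log10(magnitude / linthresh), offset)


# Offsets taken from the log excerpt earlier in the thread.
for value in (0.001598066, -0.315106736, -1.213118746):
    print(value, round(symlog(value), 3))
```

With this mapping, a 1 ms offset and a 1 s offset end up at roughly 0.1 and 3.0 on the transformed axis, so both remain visible in the same graph.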

I should state that it would need to be a user-specified behaviour, i.e. the user can set some specified upper limit to the allowable rescale, and specify other modifications to the display. This potentially moves the rendering operation from the server to the client.
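As a rough illustration of such a user-specified limit (a hedged sketch, not the 30-year-old code mentioned above and not the pool’s implementation): the axis auto-scales to the data, but never beyond a cap chosen by the user, and samples beyond the cap are reported separately so they can still be marked. The names `capped_limit`, `user_max`, and `headroom` are hypothetical:

```python
def capped_limit(offsets, user_max=0.05, headroom=1.1):
    """Return (y_limit, clipped) for a symmetric y-axis in seconds.

    y_limit is the auto-scaled half-range (largest absolute offset times
    `headroom`), but never larger than the user-chosen cap `user_max`.
    clipped lists the samples that fall outside that range.
    """
    peak = max((abs(o) for o in offsets), default=0.0)
    y_limit = min(peak * headroom, user_max)
    clipped = [o for o in offsets if abs(o) > y_limit]
    return y_limit, clipped


# Offsets taken from the log excerpt earlier in the thread.
sample = [0.001598066, -0.315106736, 0.000873653,
          0.001589592, -1.213118746, 0.002238161]
y_limit, clipped = capped_limit(sample)
print(y_limit, clipped)
```

The clipped samples could then be drawn as markers pinned to the edge of the plot, so that an outlier is still visible at a glance without forcing a re-scale of the entire graph.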

My own background is in data rendering for automotive engineers who have to deal with large volumes of multi-channel data. At an earlier time, there was the “eight brush” data recorder, which gave you miles of paper traces with up to 8 channels of data. When the engineers got a comma- or tab-delimited data file instead, the tool I gave them would let them see the data in an auto-scaled form, or with any arbitrary modifications to scale and graph position for any channel, as well as presenting the data in a tabular format. It was written for a VGA display on an MS-DOS platform when I was a contractor at Ford Motor Company back in the '80s/'90s. Code not NDA’d, so free to good home.

Hi @Bas, as promised, a friendly reminder to re-enable and restart the monitor instances that you want to be running after your router switch.

Thanks, they are back online for testing.

If they give problems again, I will take them offline.

Let’s see how it goes. Both are running IPv4/6.