Monitors belgg1-19sfa9p and belgg2-19sfa9p having hiccups?

Bas · November 10, 2024, 11:43am

Well, I have replaced the modem itself with a Zyxel, and so far it has stayed stable.

The Fritz is now router only and it seems to work better.

It looks to me like the internal bus of the Fritz simply can’t keep up with the workload.

You can also notice this in the interface: right after a restart it’s fast, but after a few hours of pushing UDP traffic it becomes dead slow, and at times it even reboots.

Now that I have split the modem and router parts, it seems to handle the load just fine.

Fingers crossed…

Kets_One · November 10, 2024, 5:59pm

@davehart
Thanks for the correction

Bas · November 11, 2024, 2:43pm

Thanks for the explanation. I had that impression, as the Fritzbox becomes slower and slower.
When that happens, the VDSL2 traffic starts to give errors.

So in my opinion the CPU is so busy it has no time to empty the DSL-chip-buffer.

I do know ARM-type CPUs have no hardware interrupt lines and have to poll the buses/chips, whereas Intel CPUs do have an interrupt controller… which is the reason we feed the PPS signal in via RS-232: to make the CPU aware that critical data is present.

This was already a problem back in the day with Sparc servers: overload them and they miss data.
I do know the Linux kernel has changed a lot to counter this problem, but I doubt it can do as good a job as the Intel design can.

So in my opinion the Fritzbox is so busy with its NAT and firewall tables that it simply can’t process the rest properly.

PoolMUC · November 11, 2024, 3:11pm

I don’t think this is true.

Indeed interrupts are good for making the CPU aware of data being ready somewhere. But only if that is a relatively infrequent event. Like the PPS signal. Or network traffic on devices with “relatively” little network traffic.

For devices whose primary purpose is handling network traffic, and/or that need to handle a lot of it, handling that via interrupt is not good. Interrupts are called interrupts because they interrupt the CPU in whatever it is doing to check what the interrupt is all about, also involving frequent and expensive switches between user and kernel modes. If interrupts happen too often, e.g., because the packet rate is “too high”, the CPU ends up doing nothing but mostly responding to interrupts, and doesn’t get to do much “actual” work anymore.

That is why concepts/technologies like interrupt moderation/interrupt coalescing, poll-mode drivers (DPDK), and similar things were conceived.
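
To put rough numbers on the interrupt-overhead argument above, here is a back-of-the-envelope model in Python. It is only a sketch: `cpu_seconds_per_second` is a hypothetical helper, and the per-packet and per-interrupt costs are made-up illustrative values, not measurements from a Fritz!Box or any real network chip.

```python
def cpu_seconds_per_second(packet_rate, per_packet_us, per_interrupt_us,
                           packets_per_interrupt=1):
    """CPU time (in seconds) consumed per second of wall-clock time.

    All costs are assumed, illustrative values in microseconds.
    packets_per_interrupt=1 models one interrupt per packet; larger values
    model interrupt coalescing (or a poll loop draining a batch per wake-up).
    """
    interrupts = packet_rate / packets_per_interrupt
    busy_us = packet_rate * per_packet_us + interrupts * per_interrupt_us
    return busy_us / 1_000_000


# Assumed numbers: 100,000 packets/s, 2 us of real work per packet,
# 8 us of overhead per interrupt (context save/restore, mode switches).
per_packet = cpu_seconds_per_second(100_000, 2, 8, packets_per_interrupt=1)
coalesced = cpu_seconds_per_second(100_000, 2, 8, packets_per_interrupt=32)

print(f"one interrupt per packet: {per_packet:.0%} of one core")
print(f"32 packets per interrupt: {coalesced:.1%} of one core")
```

With these assumed costs, one interrupt per packet saturates a whole core, while handling 32 packets per interrupt leaves it at roughly a quarter of that. The real ratio depends entirely on the hardware and driver.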

PoolMUC · November 11, 2024, 3:38pm

Imagine if the device wouldn’t need to do NAT anymore, and could also do just stateless packet filtering…

Bas · November 11, 2024, 4:01pm

Enjoy… the differences between ARM and Intel/AMD are big.
They are not the same.

Bas · November 11, 2024, 4:10pm

NAT has nothing to do with it, because without NAT you still have firewall tables.
Whatever the traffic is, the CPU will have to deal with it.

Not quite, the Intel way of sending an IRQ is different. In fact, you can make one core handle all IRQs if you want, while all other cores carry on working. You are right that when a CPU is single-core, it stops and looks at the interrupt. But with SMP/multi-core systems, a hardware IRQ is what you want for emptying buffers that can’t wait, like PPS or other low-latency real-time stuff. ARM doesn’t have this way of calling for attention. It was a problem in the old days, and still is. I’m not talking about software interrupts, but hardware ones, like RS-232 uses. That is why it’s used for GPS and is so much faster: it triggers IRQ 3/4 (normally) in hardware. In my opinion a DSL modem chip should be able to do the same… and not wait for the CPU to make time. Software and hardware IRQs are not the same.

PoolMUC · November 11, 2024, 4:12pm

Sorry, I am confused. Did anybody claim they were?

Bas · November 11, 2024, 4:15pm

No clue what you are on about.

I’m just reporting my problem and what I do to solve it.

As well as why I think it happens.

PoolMUC · November 11, 2024, 4:17pm

Sorry, I thought you had understood @davehart’s explanation of what NAT entails.

Yes, but they can be stateless. Not “setting up and tearing down mappings for each kilobit of NTP”, as @davehart explains.

Sure, the question is what this “deal with” entails. Just shuffling packets forth and back unmodified, or additionally rewriting parts of the packets, and having to manage a whole bunch of table entries to undo that rewriting when the response comes back.
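
As a rough illustration of that difference, the Python sketch below contrasts a stateless filter with a toy stateful port forward. Everything in it is an assumption for illustration: the `Packet` type, the function and class names, and the addresses are hypothetical, and it says nothing about how a Fritz!Box actually implements either. The toy also never expires entries, which a real device would do with timeouts.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def stateless_filter(pkt, server_ip, port=123):
    """Stateless: the decision depends only on the packet itself."""
    to_server = pkt.dst_ip == server_ip and pkt.dst_port == port
    from_server = pkt.src_ip == server_ip and pkt.src_port == port
    return to_server or from_server


class StatefulPortForward:
    """Toy port forward: rewrites packets and remembers every client flow."""

    def __init__(self, public_ip, internal_ip, port=123):
        self.public_ip = public_ip
        self.internal_ip = internal_ip
        self.port = port
        self.flows = set()  # one entry per (client_ip, client_port)

    def inbound(self, pkt):
        # Record the flow so the reply can be translated back later.
        self.flows.add((pkt.src_ip, pkt.src_port))
        return replace(pkt, dst_ip=self.internal_ip)

    def outbound(self, pkt):
        # Undo the rewrite only for flows that are still in the table.
        if (pkt.dst_ip, pkt.dst_port) not in self.flows:
            return None
        return replace(pkt, src_ip=self.public_ip)


# Assumed example addresses (documentation ranges and a private LAN address).
nat = StatefulPortForward("203.0.113.1", "192.168.178.20")
for i in range(1000):
    request = Packet(f"198.51.100.{i % 250 + 1}", 40000 + i, "203.0.113.1", 123)
    forwarded = nat.inbound(request)
    reply = Packet(forwarded.dst_ip, 123, request.src_ip, request.src_port)
    assert nat.outbound(reply).src_ip == "203.0.113.1"

print("table entries after 1000 NTP clients:", len(nat.flows))
```

The stateless filter needs no memory at all, whereas the port forward ends up with one table entry per client, each created for a single request/response pair.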

PoolMUC · November 11, 2024, 4:18pm

You pointing out that ARM and Intel were different in many respects, when nobody was claiming they weren’t.

Bas · November 11, 2024, 4:28pm

Who cares, I am just pointing out that the Fritzbox has an ARM and that it fails to empty buffers in time under heavy usage,
where an Intel/AMD CPU would have done the job.

This is typical for ARM CPU’s at this low range of the spectrum.
If it had an Intel/AMD CPU it would not happen.

That is my point. I opened a ticket at AVM to explain, but they do not want to look into this.

So I split the load to avoid the DSL chip being underserviced by the CPU.

That is my point. On an Intel this would never happen.

PoolMUC · November 11, 2024, 4:35pm

Hmm, what is it that you want them to do, then? Replace the ARM CPU in your device with an Intel one?

If that is what you want to believe, fine with me. But as long as you don’t seem to grasp the concepts that are involved here, as, e.g., @davehart has explained, not sure how much progress you’ll make.

But wishing you the best of luck, as while you keep confusing different causes of a similar set of symptoms (and thus what the respective solution spaces could look like), the issue as such (devices’ improved handling of high-rate NTP traffic) obviously is worthwhile solving.

Bas · November 11, 2024, 4:44pm

No. But they could rethink their software so that it services the DSL modem chip better.

Dave explained the problem of NAT tables growing too fast and too big.

This means they HOG the CPU and keep it from doing anything else.

That is the problem I tried to tell AVM, but they do not listen.

It’s my belief that they keep table entries too long and do not pay attention to servicing the DSL chip.

But I stopped mailing them, it is no use.

I am using a Zyxel as the modem now; the Fritz only does routing, and that seems to work well.

In my opinion AVM tries too hard to be a nice-looking modem rather than seeking performance under heavy use. Mind you, most users are light users with few connections.

WE in here are BIG users with easily 100K connections an hour or more… that is where it fails.
Maybe they don’t care, test it, or whatever. I have 3 Fritzboxes… all the same… they fail at high load with many connections, like with the NTP monitor/Chrony.

Kets_One · November 11, 2024, 6:18pm

@Bas I tried to tell them a few years ago to better optimize their Fritz!Boxes for many connections and small packets.
They told me they would look into it… I have heard nothing since.

I have been thinking of replacing the Fritz with something more professional, but I hesitate since I do fancy their simple setup. I’m not a networking professional, but an amateur with specific requirements (time-nut), lol.
If anyone has a suggestion for a professional router that is easy to set up and maintain (and has low power consumption), be my guest.

PoolMUC · November 11, 2024, 6:44pm

You’re hitting the (a) nail right on the head, I think. “Professional” devices, in the sense that they are better (but likely still far from perfect) able to handle our type of traffic, are not as simple to use. And also not available at the price point of a Fritz Box. Which isn’t cheap, either, compared to consumer devices from other vendors. But the money doesn’t go into optimizing for our use case (which I guess is quite a niche for the market that AVM target), but into the “simple setup”, nice GUI, and broad spectrum of functionality. NAS, telephony, Smart Home and all that kind of stuff, as you point out.

Coming back to the topic, I’d expect getting rid of NAT aka port forwarding could go a long way, because it costs performance (maintaining those tables @davehart was referring to), and leads to performance-unrelated (with performance in the sense of CPU cycles) packet drops when the table is full.
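
To get a feel for how quickly such a table fills up, here is a small Python sketch using Little's law (entries in the table = arrival rate × time each entry lives). The 100K flows per hour is the figure mentioned earlier in the thread; the timeouts and the table capacity of 8192 are purely assumed values, and the helpers are hypothetical. It also assumes every request comes from a new client address/port, so no entry gets reused.

```python
def steady_state_entries(new_flows_per_hour, entry_timeout_s):
    """Little's law: entries in table = arrival rate x lifetime of an entry."""
    return new_flows_per_hour * entry_timeout_s / 3600


def table_overflows(new_flows_per_hour, entry_timeout_s, table_capacity):
    """True if the steady-state entry count exceeds the (assumed) capacity."""
    return steady_state_entries(new_flows_per_hour, entry_timeout_s) > table_capacity


FLOWS_PER_HOUR = 100_000   # figure mentioned earlier in the thread
TABLE_CAPACITY = 8192      # assumed value, not a documented Fritz!Box limit

for timeout_s in (30, 180, 600):   # assumed entry timeouts in seconds
    entries = steady_state_entries(FLOWS_PER_HOUR, timeout_s)
    full = table_overflows(FLOWS_PER_HOUR, timeout_s, TABLE_CAPACITY)
    print(f"timeout {timeout_s:>3} s: ~{entries:,.0f} entries, table full: {full}")
```

Under these assumptions the same traffic fits comfortably with a short timeout but overflows the table with a long one, which is why how long entries are kept matters as much as the packet rate.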

Not sure whether the “Exposed Host” mode that the Fritz Box offers would do the trick. Maybe that is among the many things it sounds like you’ve tried already; I’ve never tried it myself. Unlike typical IPv4 port forwarding, IPv6 pinholing wouldn’t need state, but I have the impression that the FritzBox implementation still handles pinholing statefully as well. So I am not sure how they implement the “exposed host” functionality.

NTPman · November 11, 2024, 7:27pm

What about stateless NAT? No need to maintain any kind of table.
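
A minimal Python sketch of what that could look like, assuming a fixed one-to-one mapping between a public address and one internal NTP server. The `Packet` type, the function names, and the addresses are hypothetical, and this is not a description of any existing router feature.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Assumed example addresses and port for a fixed 1:1 mapping.
PUBLIC_IP = "203.0.113.1"
INTERNAL_IP = "192.168.178.20"
NTP_PORT = 123


def translate_inbound(pkt):
    """Rewrite the destination by a fixed rule; nothing is remembered."""
    if pkt.dst_ip == PUBLIC_IP and pkt.dst_port == NTP_PORT:
        return replace(pkt, dst_ip=INTERNAL_IP)
    return pkt


def translate_outbound(pkt):
    """Rewrite the source by the inverse fixed rule; no table lookup."""
    if pkt.src_ip == INTERNAL_IP and pkt.src_port == NTP_PORT:
        return replace(pkt, src_ip=PUBLIC_IP)
    return pkt


request = Packet("198.51.100.7", 40123, PUBLIC_IP, NTP_PORT)
forwarded = translate_inbound(request)
reply = Packet(forwarded.dst_ip, NTP_PORT, request.src_ip, request.src_port)
translated_reply = translate_outbound(reply)

print(forwarded)
print(translated_reply)
```

Because both directions are pure functions of the packet, the memory use stays constant no matter how many clients send requests. The trade-off is that the mapping is fixed: this only works for a service with a known address and port.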

Kets_One · November 11, 2024, 7:31pm

Exposed host works for at most one device (I have multiple). Also, it will expose the device completely to the big-bad-internet… so any implementation or configuration faults that are exposed may lead to compromise of the device and your network (question mark?).

PoolMUC · November 11, 2024, 7:31pm

Yeah, I think the challenge is to get that in a device “that is easy to set up and maintain”, like the AVM line of products.

PoolMUC · November 11, 2024, 7:34pm

Yeah, that is kind of what I figured. No luck then. You could ask AVM to implement stateless NAT, as @NTPman suggests. But as I said, I have the impression that they don’t even do it for IPv6, where it is even less needed, so it is difficult to imagine getting that for IPv4 anytime soon.

Yeah, it is a trade-off between eating the cake and keeping it. Anything in between is not impossible, but somewhat more difficult…

Don’t want to open that can of worms in this thread (because it does not bring us closer to a solution in this thread, but more as a reference, to put things in relation), but that is why some people are promoting protocols that were designed to not have that problem in the first place. But obviously, also that isn’t without tradeoffs, they’re just different ones, but in some people’s view at least fewer of them…
