The issue of NTP requests exceeding bandwidth load

NTPman · November 15, 2024, 12:59pm

Of course, this is always doable. If your small ISP’s domain name is small-isp.ru, you can configure the NTP service on the clients you distribute to point to ntp.small-isp.ru, and then add the following DNS entry:

ntp.small-isp.ru. IN CNAME 2.eu.pool.ntp.org.

This is very flexible, since it is easy to change where the hardcoded name points. However, the small-isp.ru domain must always remain reachable for the clients to get time synchronization.

vlad.bfly · November 15, 2024, 10:15pm

Found this thread by googling. I run one of the few Russian NTP servers left, and I must admit that I’m having trouble keeping the server up because the load exceeds my capacity. I’ve been a part of the project for over a decade, and never experienced any issues until recently.

It all began in October, with mysterious random Internet outages every few hours, each usually lasting up to ten minutes. I blamed my ISP at first, until their technician told me that during those outages my inbound traffic was maxing out my 100 Mbps bandwidth, so their shaper automatically throttled my traffic. Some quick research indicated that it was NTP traffic. Normally no more than a few thousand requests per minute, it peaked at half a million or more every once in a while, completely congesting my connection.

I put my server into monitoring mode for a day, and the situation gradually recovered. However, switching back to operational mode, even with the smallest bandwidth setting of 512 Kbit, almost immediately brought the outages back. I’m still trying to figure out a solution, but it seems like we are dealing with a snowball effect, where the excessive load makes more and more servers quit, further increasing the load on the remaining ones.

avij · November 15, 2024, 10:38pm

Hi @vlad.bfly, let’s continue this discussion in the other topic, as this topic was originally about a slightly different issue in China.

davehart · November 16, 2024, 3:50am

I very much doubt that is an issue today. ntpd’s stable release removed the massive-multiplier reflector functionality (ntpdc -c monlist) in 2014 or 2015. It was removed from the test releases (ntp-dev) in 2009 or 2010. The reflection DDoS wave in 2015 caused many firewalls to reject all NTP traffic, or all traffic except client (mode 3) and server (mode 4), when all that was needed was to block all ntpdc (mode 7) or specifically mode 7 monlist requests (a particular hex operation code value at the correct offset of the binary mode 7 protocol). Many of those blocks were never removed, causing ongoing carnage for NTP clients and servers for years.
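For readers unfamiliar with the mode numbers: the mode sits in the low three bits of the first byte of every NTP packet, so a filter can tell client (3), server (4) and ntpdc (7) traffic apart from that one byte. Here is a minimal Python sketch of that check. The function names and the drop policy are hypothetical illustrations of the narrower block described above, not any firewall’s actual rule, and the sketch does not inspect the monlist request code inside mode 7.

```python
def ntp_mode(packet: bytes) -> int:
    """Return the 3-bit mode field from the first byte of an NTP packet."""
    if not packet:
        raise ValueError("empty packet")
    return packet[0] & 0x07


def ntp_version(packet: bytes) -> int:
    """Return the 3-bit version field (bits 3..5 of the first byte)."""
    if not packet:
        raise ValueError("empty packet")
    return (packet[0] >> 3) & 0x07


def should_drop(packet: bytes) -> bool:
    """Hypothetical policy: drop only private/ntpdc (mode 7) packets.

    Client (mode 3) and server (mode 4) traffic is left alone, which is
    the point made in the post: blocking everything was never required.
    """
    return ntp_mode(packet) == 7
```

A plain client request starts with 0x23 (version 4, mode 3), so `should_drop(b"\x23" + bytes(47))` is false, while anything whose first byte ends in binary 111 is mode 7.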

It’s sad the ntp.org release management chose to go 5 years between stable releases despite the potential for DDoS around monlist being understood. I still have major problems with the way release management is being handled, but this isn’t the place to gripe about that.

kkursor · November 16, 2024, 8:48am

vlad.bfly:

Found this thread by googling. I run one of the few Russian NTP servers left, and I must admit that I’m having trouble keeping the server up because the load exceeds my capacity. I’ve been a part of the project for over a decade, and never experienced any issues until recently.

It all began in October, with mysterious random Internet outages every few hours, each usually lasting up to ten minutes. I blamed my ISP at first, until their technician told me that during those outages my inbound traffic was maxing out my 100 Mbps bandwidth, so their shaper automatically throttled my traffic. Some quick research indicated that it was NTP traffic. Normally no more than a few thousand requests per minute, it peaked at half a million or more every once in a while, completely congesting my connection.

I put my server into monitoring mode for a day, and the situation gradually recovered. However, switching back to operational mode, even with the smallest bandwidth setting of 512 Kbit, almost immediately brought the outages back. I’m still trying to figure out a solution, but it seems like we are dealing with a snowball effect, where the excessive load makes more and more servers quit, further increasing the load on the remaining ones.

Exactly the same. End of October, too.
I’ve rate-limited NTP traffic at the router. It’s better than nothing.

Bas · November 17, 2024, 8:45pm

Are you running NTPD or Chrony?

Chrony is pretty good at limiting traffic from abusive IPs.

Chrony can keep a log of IPs that come too often and then stop responding to them.

I could be wrong, but I do not believe NTPD can.

You may want to switch to counter this abuse.

Also, what router/connection do you use? I forgot…sorry.

davehart · November 18, 2024, 10:33pm

As is ntpd, as I pointed out here recently, including a link to the relevant documentation and an example discard minimum directive for ntp.conf. Search the forum for “discard minimum” if you have forgotten. Otherwise, I’m left with the impression you’re intentionally spreading misinformation about ntpd in a misguided effort to promote Chrony. Chrony has lots of other differences from ntpd; there’s no need to stretch the truth to find reasons one might choose it, such as the fact it’s the default in popular Linux distributions.

On the ntpd server, as I’ve also previously pointed out here recently, one can use:

$ ntpq -c "mrulist sort=avgint" | less

The first entries are the ones hitting the server most frequently. There are knobs available via ntp.conf to configure the size of the Most Recently Used list ntpd keeps to enforce rate limiting. This ntpq query displays that list. Without the sort parameter, the list is most recent first.
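To illustrate the idea of a bounded most-recently-used table that is used both for rate limiting and for listing the busiest clients, here is a toy Python sketch. It is not ntpd’s implementation: the class name, the size and interval defaults, and the simple lifetime-average calculation are all assumptions made for the example.

```python
from collections import OrderedDict


class MruTable:
    """Toy bounded table of recent clients, ordered by last-seen time."""

    def __init__(self, max_entries=1000, min_avg_interval=2.0):
        self.max_entries = max_entries            # hypothetical size limit
        self.min_avg_interval = min_avg_interval  # hypothetical threshold, seconds
        self._entries = OrderedDict()             # ip -> [first_seen, last_seen, count]

    def avg_interval(self, ip):
        """Average seconds between requests from ip (inf after one request)."""
        first, last, count = self._entries[ip]
        if count < 2:
            return float("inf")
        return (last - first) / (count - 1)

    def record(self, ip, now):
        """Note a request at time now; return True if it should be answered."""
        entry = self._entries.pop(ip, None)
        if entry is None:
            if len(self._entries) >= self.max_entries:
                self._entries.popitem(last=False)  # evict least recently seen
            entry = [now, now, 0]
        entry[1] = now
        entry[2] += 1
        self._entries[ip] = entry                  # re-insert as most recent
        return self.avg_interval(ip) >= self.min_avg_interval

    def sorted_by_avgint(self):
        """List (ip, avg_interval) pairs, most frequent clients first."""
        pairs = ((ip, self.avg_interval(ip)) for ip in self._entries)
        return sorted(pairs, key=lambda item: item[1])
```

Calling `record()` for each incoming request and `sorted_by_avgint()` on demand mirrors, very loosely, what the sorted mrulist query shows: the smallest average intervals come first.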

Bas · November 19, 2024, 9:43pm

Nobody asked you Dave.

Chrony keeps a log of abusive users when you enable it.
And keeps it as long as you want.

Got nothing with your ntpq command.

Chronyc command prompt is far more advanced.

As such, NTPD is NOT my favorite timeserver; it is far too complicated to get information from.

NTPD is old and outdated.

davehart · November 19, 2024, 11:46pm

This is bullshit, as well as being rude. You claimed ntpd didn’t have a capability and noted you could be wrong. I pointed out a similar capability (not logging, but querying recent). Suggesting I was wrong to respond is just disrespectful of the intelligence of every reader here, and a sign of why I claim your posts can be very similar to trolling.

This is repetitive, you’ve already stated that and I didn’t contradict you in my reply. Do you just like to make other people read your repetitive comments more than once?

I wouldn’t expect you to, given you run Chrony. If you ran that command on a ntpd host, or pointing to one that allows remote ntpq queries by tacking on its DNS or IP at the end of the command, you’d have seen its list of most active recent clients.

Repetitive. We all have heard you sing Chrony’s praises and (often misleadingly) run down ntpd before.

I’m not sure your opinion on software you don’t use carries much weight. It’s old, as the original NTP server and client, but development is ongoing, though it took a break from 2019 until 2023, when I became active in its development again after demands on my time were reduced.

davehart · November 19, 2024, 11:55pm

You should be able to see whether Cloudflare’s NTP servers are in fact down in Russia with a few checks.

First off, does nslookup time.cloudflare.com. return any IP addresses for you?

If you do get IP addresses, try using sntp or ntpdate from the NTP distribution to query time.cloudflare.com:

ntpdate -d time.cloudflare.com

or add them to your NTP server’s configuration with:

server time.cloudflare.com iburst

and then take a look at its status and see if it’s serving you time.

That would be a major help in determining whether the problem is some sort of flooding due to abuse or a bad client, or whether it’s due to Cloudflare’s Russian NTP service going down and shifting a much higher burden onto the rest of the pool’s Russian servers.

davehart · November 20, 2024, 4:31am

The stated policy is that vendors of equipment or software that hardcodes the pool, or uses it by default, should obtain a vendor zone from the pool, like ubuntu.pool.ntp.org. Yes, there are many who ignore this, and yes, people have complained about waiting ages for vendor zones to be approved, and I don’t know how much the latter is responsible for the former. It’s still the advice I’d recommend, as it leaves open the possibility of handling a vendor zone differently if it turns out their clients are misbehaving somehow.

Unfortunately that rules out using continent zones as @PoolMUC is suggesting is wise with the current pool DNS implementation. Or at least I couldn’t figure out how to query a vendor+continent zone, perhaps someone affiliated with the pool will speak up here.

Knot3n · November 20, 2024, 11:38pm

If you continue to behave like this, especially since you’ve already drawn attention to yourself multiple times, you’ll need to take a break.

davehart · November 24, 2024, 4:49am

No replies. It doesn’t need to be @kkursor, but would someone in Russia please check if Cloudflare servers are working from within Russia? They are included in *.ru.pool.ntp.org but keep in mind with their IP addresses being anycast from many different data centers, just because monitors outside Russia say it’s working and it is in the zone doesn’t mean it’s working for clients inside Russia. My hunch is it does indeed work inside Russia and that’s letting clients using *.ru.pool.ntp.org continue to work despite what I suspect is a malicious flood of NTP queries to that zone, but it would be groovy to know one way or the other.

First, check DNS:

nslookup time.cloudflare.com.

Does the response include the IP addresses below? If so, and you have access to ntpdate, please try:

ntpdate -d 162.159.200.1
ntpdate -d 162.159.200.123

and if you have IPv6, also:

ntpdate -d 2606:4700:f1::1
ntpdate -d 2606:4700:f1::123

The output should make it clear if you’re getting responses and how close they suggest your system clock is.

If you don’t have access to ntpdate, you can test by adding those addresses to your NTP server configuration and looking at the status after a few minutes to see if you’re getting useful responses.
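As another fallback when ntpdate is not available, a single query can be sent from a short Python script. This is a minimal sketch, not a replacement for a real NTP client: the function names are made up for the example, it sends one request with an all-zero transmit timestamp, and it does none of the sanity checks a proper client performs.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def build_request() -> bytes:
    """48-byte client request: LI=0, VN=4, mode=3 gives first byte 0x23."""
    return b"\x23" + bytes(47)


def ntp_to_unix(seconds: int, fraction: int) -> float:
    """Convert a 32.32 fixed-point NTP timestamp to Unix seconds."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def parse_reply(reply: bytes) -> dict:
    """Pull mode, stratum and the receive/transmit timestamps from a reply."""
    if len(reply) < 48:
        raise ValueError("short NTP reply")
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", reply[32:48])
    return {
        "mode": reply[0] & 0x07,
        "stratum": reply[1],
        "receive": ntp_to_unix(rx_s, rx_f),    # t2
        "transmit": ntp_to_unix(tx_s, tx_f),   # t3
    }


def query(host: str, timeout: float = 5.0) -> dict:
    """Send one request to host (name, IPv4 or IPv6) on UDP port 123."""
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, 123, type=socket.SOCK_DGRAM)[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), addr)
        reply, _ = sock.recvfrom(512)
        t4 = time.time()
    fields = parse_reply(reply)
    t2, t3 = fields["receive"], fields["transmit"]
    fields["offset"] = ((t2 - t1) + (t3 - t4)) / 2  # estimated clock offset
    fields["delay"] = (t4 - t1) - (t3 - t2)         # round-trip delay
    return fields
```

For example, `print(query("162.159.200.1"))` either raises a timeout error (no response) or prints the stratum along with the estimated offset and delay in seconds.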

timz · November 24, 2024, 7:54am

Tested using two different ISPs. No problems detected.

nslookup time.cloudflare.com
Server:         127.0.0.1
Address:        127.0.0.1#53

Non-authoritative answer:
Name:   time.cloudflare.com
Address: 162.159.200.1
Name:   time.cloudflare.com
Address: 162.159.200.123
Name:   time.cloudflare.com
Address: 2606:4700:f1::123
Name:   time.cloudflare.com
Address: 2606:4700:f1::1

ntpdate -uqd 162.159.200.1
ntpdig: querying 162.159.200.1 (162.159.200.1)
org t1: eaed561c.7dd6f000 rec t2: eaed561c.841cdd9b
xmt t3: eaed561c.84b6ea6c dst t4: eaed561c.8dce4800
org t1: 1732433820.491561 rec t2: 1732433820.516066
xmt t3: 1732433820.518416 dst t4: 1732433820.553929
rec-org t21: 0.024505  xmt-dst t34: -0.035513
2024-11-24 14:37:00.518416 (+0700) -0.005504 +/- 0.030010 162.159.200.1 s3 no-leap

ntpdate -uqd 162.159.200.123
ntpdig: querying 162.159.200.123 (162.159.200.123)
org t1: eaed5678.092aa000 rec t2: eaed5678.0f5f302b
xmt t3: eaed5678.0f62c520 dst t4: eaed5678.16127000
org t1: 1732433912.035807 rec t2: 1732433912.060046
xmt t3: 1732433912.060101 dst t4: 1732433912.086219
rec-org t21: 0.024240  xmt-dst t34: -0.026118
2024-11-24 14:38:32.60101 (+0700) -0.000939 +/- 0.025179 162.159.200.123 s3 no-leap

MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
...
^- 162.159.200.123               3  10   377   818   +115us[ +129us] +/-   40ms
^- 162.159.200.1                 3  10   377   20m   -108us[  -87us] +/-   40ms
...
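For anyone reading the debug output above, the reported offset can be reproduced from the four printed timestamps with the usual NTP on-wire formulas. The Python sketch below uses a made-up helper name and the values copied from the first ntpdate run (162.159.200.1); floating-point rounding at this magnitude limits it to roughly microsecond agreement.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP calculation from the four timestamps.

    t1: client transmit (org), t2: server receive (rec),
    t3: server transmit (xmt), t4: client receive (dst).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Timestamps from the first ntpdate run above (162.159.200.1).
offset, delay = offset_and_delay(
    1732433820.491561,  # org t1
    1732433820.516066,  # rec t2
    1732433820.518416,  # xmt t3
    1732433820.553929,  # dst t4
)
print(f"offset {offset:+.6f} s, delay {delay:.6f} s, delay/2 {delay / 2:.6f} s")
```

The computed offset agrees with the -0.005504 shown in that run to within rounding, and half the round-trip delay is close to the +/- 0.030010 error bound printed next to it.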

kkursor · November 24, 2024, 8:01am

No problems.

kkursor@dot:~$ nslookup time.cloudflare.com.
Server: 192.168.14.1
Address: 192.168.14.1#53

Non-authoritative answer:
Name: time.cloudflare.com
Address: 162.159.200.1
Name: time.cloudflare.com
Address: 162.159.200.123
Name: time.cloudflare.com
Address: 2606:4700:f1::123
Name: time.cloudflare.com
Address: 2606:4700:f1::1
kkursor@dot:~$ /sbin/ntpdate -q 162.159.200.1
2024-11-24 11:00:48.595625 (+0300) +0.001640 +/- 0.001534 162.159.200.1 s3 no-leap
kkursor@dot:~$ /sbin/ntpdate -q 162.159.200.123
2024-11-24 11:00:55.581503 (+0300) +0.001635 +/- 0.001247 162.159.200.123 s3 no-leap
