@umike would blocking ICMP traffic entirely not help?
yes
the server is getting better, but 1.8 million is too much for me.
Yes, I tried to limit them with iptables, like this:
- drop everything whose source is in the bad_icmp set
- if ICMP type 3 arrives, add its source to the bad_icmp set for some time
After 1-2 minutes the set contains 700,000 IPs and keeps growing.
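For reference, a minimal sketch of such rules using ipset and iptables (the set name, timeout, and size limit are illustrative):

```shell
# Create a set of offending sources; entries expire after 2 minutes.
ipset create bad_icmp hash:ip timeout 120 maxelem 1000000

# Drop everything whose source is already in the set.
iptables -A INPUT -m set --match-set bad_icmp src -j DROP

# When ICMP type 3 (destination unreachable) arrives, add its source
# to the set, refreshing the timeout if it is already there.
iptables -A INPUT -p icmp --icmp-type destination-unreachable \
  -j SET --add-set bad_icmp src --exist
```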
There is one more nuance here: part of the ICMP comes from transit routers/firewalls and looks like
TransitHost->me ICMP host NTPquerier port zzzz unreachable
where the ICMP source IP (TransitHost) does not match the NTPquerier IP. I don't know how this can be processed in the firewall. Any analysis in user space is too expensive.
I will try increasing clientloglimit, but… really, I don't see many queries from single IPs. Therefore, I don't think that daemon rate limiting or iptables limits will help me much.
Has the article been approved yet?
Where is it? Can you provide a link so we can vote for it? I see nothing relevant in the Sandbox, and zero articles or comments in your profile.
No.
Scheduled for publication on 24 November 2024 at 11:15
I've sent an invite to you so that you can post articles without pre-moderation.
Oh, thanks! ^^ The world is so small.
Years ago it was possible for the admin of a server to request that it be manually added to an underserved zone, but it seems that fell out of favor. Would the pool admins be amenable to allowing admins of servers in Europe to temporarily have their servers put in the Russian zone to try to help stabilize things?
I think this is clear evidence, along with the sheer volume of requests, that ru.pool.ntp.org is under sustained DDoS and that is the issue that needs to be resolved.
The fact that you're seeing so many port unreachables suggests strongly to me that the DDoS is happening via source IP spoofing from an ISP that doesn't implement BCP 38 [1] (note that's an HTTP-only link, no HTTPS available). You can also see the actual IETF document at the IETF BCP 38 RFC.
Quite a few ISPs do ensure spoofed source IPs don't leave their eyeball (consumer) networks, but there are exceptions. It's possible the spoofing is coming entirely from one or a few machines with a fast connection, sending each packet with a different random spoofed IP. It's also possible the actual sources are compromised devices, but only some fraction of a botnet is going to be connected to spoofing-friendly ISPs.
With a spoofed source IP address, the IP receiving the NTP server's response didn't send the NTP request, so it doesn't have the request's claimed port open waiting for a response, and so responds with port unreachable.
I am curious whether you're seeing only ICMP port unreachables, or both those and ICMP host unreachables. With spoofed source addresses, if they're generated randomly I'd expect some to not actually have a live machine at that address, so a host unreachable from a router in the target ISP (AS, autonomous system) would make sense. They could be careful to pick among a list of known-connected IPs, but I'm guessing they wouldn't bother.
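One way to see which unreachable codes are actually arriving is a capture filter like this (assuming tcpdump is available; the interface name is illustrative):

```shell
# Show only ICMP destination-unreachable packets; tcpdump's decoded
# output distinguishes "host ... unreachable" from "udp port ... unreachable".
tcpdump -ni eth0 'icmp[icmptype] == icmp-unreach'
```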
The bad news is that tracking down the actual source of spoofed-source traffic is difficult and sometimes practically impossible. You need the cooperation of each ISP along the path back from a target IP to the source, with each one needing to look for huge flows of NTP mode 3 requests and figure out where it's entering their system, to point back to the next AS/ISP on that path.
Correct.
You can disable the processing overhead of maintaining the MRU list in ntpd by adding `disable monitor` to ntp.conf and ensuring none of the `restrict` lines have `limited`. I don't think `kod` alone will enable the MRU list maintenance, but then `kod` without `limited` in a restrict entry does nothing, and will produce a warning in the log to that effect.
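As a sketch, an ntp.conf fragment along those lines (the restrict policy shown is just an example; adapt it to your own):

```
# Disable MRU list maintenance entirely.
disable monitor

# restrict lines without "limited" (and therefore without "kod").
restrict default nomodify noquery notrap
restrict 127.0.0.1
```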
The default maximum memory for the MRU list is 1 MB. Using the authentication-required `ntpq` command `monstats`, you can see MRU list stats. On an ntpd with no `mru` configuration in ntp.conf on x64, I get:
C:\Users\daveh>nq -c "monstats"
enabled: 0x3
addresses: 19
peak addresses: 20
maximum addresses: 11915
reclaim above count: 600
reclaim older than: 64
kilobytes: 2
maximum kilobytes: 1024
C:\Users\daveh>
So you can see each entry on x64 is consuming 1 MB / 11915, or 88 bytes. A gigabyte of memory would allow up to 12.2 million different addresses. However, that would require tuning the "reclaim above count" and "reclaim older than" thresholds, which cause ntpd to stop growing the number of entries once the total count exceeds 600 and the oldest entry is more than 64 seconds old. That's done with `mru` options in ntp.conf, `maxdepth` and `maxage` respectively.
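For example, a hypothetical tuning for a much larger MRU list might look like this (the exact numbers are illustrative; `maxmem` is in kilobytes):

```
# Allow up to ~200,000 entries, reclaim entries older than an hour,
# and cap MRU memory at 20 MB (20480 KB).
mru maxdepth 200000 maxage 3600 maxmem 20480
```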
The MRU list is maintained as a doubly-linked list indexed by an IP address hash table to minimize the per-packet work. This means the work is localized to the two hash table entries (lists) for the outgoing and incoming IP address, plus the back and forward list pointers of the entry being recycled to move it to the most-recent position in the MRU list. I've successfully configured it to keep at least 200,000 entries without noticeable impact on processing speed on a system handling 1-2 Mbps of NTP traffic. It will be a bit slower, triggering more CPU cache misses while manipulating various pages of the 17.6 MB of memory a 200,000-entry MRU list occupies.
Incidentally, the default `mru maxdepth 600` of ntpd is a holdover from the long-ago-removed `ntpdc` command `monlist`, as ntpd's response could only send 600 entries in a blast of packets likely to make it through to a remote without any being dropped. That `monlist` response functionality was the infamous ntpd traffic amplification that was widely exploited in 2014/2015, before people either updated to a newer ntpd without that functionality, or configured an older ntpd to drop ntpq and ntpdc requests via `restrict ... noquery`. It's probably time to increase that default `maxdepth` to something closer to the number that fits in the default maximum memory of 1 MB, or at least a more generous number like 2000.
[1] I tried to figure out the link syntax to use here that would let me change the link text, no luck yet, my apologies. EDIT: Thanks to @n1zyy for pointing out the correct syntax to me in a PM. He also pointed out it's Markdown, but I knew that and had searched for how to link in Markdown. Maybe I got my () and ][ confused, but it might also be that Discourse is picky about the order of the text vs. the link, which apparently isn't always the same in the ever-slippery Markdown universe.
Following up on messages in another thread about the problems for Russian pool server operators, thanks to @timz and @kkursor for verifying Cloudflare's anycast servers are working from within Russia.
For reference see my post followed by two responses.
The upshot is that while the flood of abusive queries to *.ru.pool.ntp.org is causing pain for most pool server operators in Russia, it's only degrading service, making the zone's utility essentially entirely reliant on Cloudflare. For those relying on that zone to maintain their clocks, it appears Cloudflare's infrastructure can handle the flood one way or another. They may have tracked it back to a particular AS they peer with and filtered NTP queries from that AS, or they may have some peer-facing firewalling that's dropping the abusive traffic before it hits their NTP servers. Given that providing a DDoS-proof web CDN is one of their core businesses, I'm sure they have all sorts of expertise and tools at their disposal to manage the problem.
Operators of pool servers may want to switch to monitoring-only mode as long as this mostly-futile attack continues. Or they may want to reach out to their ISPs to explain the situation and ask for their help back-tracing the flood to its sources.
… or it could be just a bug.
As mentioned earlier, kkursor posted about this on Habr and something interesting showed up in the comments.
With the help of some machine translation:
"On the night of October 24, the number of UDP broadcasts on all nodes at once increased sharply. That is, this is not just one node. […] A lot of sessions on port UDP 123. I took a specific subscriber and found out that a Yandex station is requesting NTP servers 6 times every 5 seconds. […] P.S. I checked it on my home 'Alice'. Exactly every 5 seconds, 4 NTP requests."
A followup response:
"The number of Yandex stations sold is 8 million by 2023 and +3.3 million in 2024 = 11.3 million.
Letās assume that the phenomenon is widespread, each one makes 4 NTP requests every 5 seconds.
This is 720 (3600 / 5) requests per hour, or (11,300,000 * 4 * 720) - 32.544 billion requests per hour or 9,040,000 requests to NTP servers per second."
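The arithmetic in that estimate holds up; here is a quick shell sanity check (the device count and per-device rate are the commenter's assumptions, not measurements):

```shell
devices=11300000                   # estimated Yandex stations in use
reqs_per_interval=4                # NTP requests per 5-second interval
intervals_per_hour=$((3600 / 5))   # 720

per_hour=$((devices * reqs_per_interval * intervals_per_hour))
per_second=$((per_hour / 3600))

echo "requests/hour:   $per_hour"     # 32544000000 (~32.5 billion)
echo "requests/second: $per_second"   # 9040000
```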
I would suggest investigating if those Yandex stations are to blame. You may need to contact the abuse address of some friendly ISP to troubleshoot this further, possibly with some tcpdumps of the offending traffic.
At a glance, this didnāt look like a DDoS to me, because:
- Heavy traffic load appears only when the monitoring system includes a server in the pool.
- Traffic drops to negligible values once the pool no longer includes the server.
In my understanding, this looks like legitimate clients making their first-time requests. If it were a DRDoS, then the traffic would remain indefinitely once the "attackers" became aware of the server's existence.
However, inspecting the MRU list gave me some thoughts:
$ ntpq -c 'mrulist sort=-count'
lstint avgint rstr r m v count rport remote address
==============================================================================
0 0 3d0 L 3 3 82834 437 171.22.215.174 (RLINE1 = AS35608)
0 0 bd0 K 3 4 43905 46178 80.76.106.190 (dynip6-190.tdsplus.ru)
1 0 3d0 L 3 4 41390 39532 80.76.96.53 (TDS+ = AS51547)
3 0 3d0 L 3 4 40596 52981 80.76.96.43 (TDS+ = AS51547)
0 0 3d0 L 3 3 38341 294 45.141.93.253 (RLINE1 = AS35608)
1 0 3d0 L 3 4 30959 40020 80.76.110.197 (dynip10-197.tdsplus.ru)
3 0 3d0 L 3 4 21990 22305 80.76.96.37 (etra-plus.ru)
1 0 3d0 L 3 4 21932 50388 80.76.96.35 (dkkonversiya.ru)
2 0 3d0 L 3 4 17815 42734 80.76.110.195 (dynip10-195.tdsplus.ru)
2 0 3d0 L 3 4 17723 33731 80.76.96.33 (TDS+ = AS51547)
5 0 3d0 L 3 4 17013 50980 80.76.96.39 (TDS+ = AS51547)
1 0 3d0 L 3 3 15484 19523 171.22.213.22 (RLINE1 = AS35608)
First of all, the most frequent addresses are from a small bunch of domestic ISPs. This fact alone does not indicate anything, as many users in Russia are behind NAT and thus share the same IP addresses. However, the ISPs that figure here are, AFAIK, nowhere near popular enough to generate such an amount of traffic, while none of the really popular ISPs showed up in the logs. This makes me think that some ISPs may be the target of an attack, or may be the source of some IoT devices that went out of control, etc.
Second, many requests "from" those clients have strange source port numbers: neither 123 nor 32768–65535, and sometimes even below 1024. I decided to block such requests on the firewall to decrease the probability of reflection attacks on third-party infrastructure (or at least to halve their intensity if the "source" port is chosen randomly by a spoofer). If those are legitimate legacy systems using ports starting from 1024, then I think it is acceptable "collateral damage" in the current desperate circumstances.
PS. Well, after inspecting the firewall logs during "peace time", I reconsidered and allowed ports 1024–32767 as well.
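As a sketch, that final policy (dropping only privileged "source" ports other than 123) could be expressed with iptables rules like these (placement within the ruleset is illustrative):

```shell
# Accept NTP requests with a plausible source port (123 or >= 1024)...
iptables -A INPUT -p udp --dport 123 --sport 123 -j ACCEPT
iptables -A INPUT -p udp --dport 123 --sport 1024:65535 -j ACCEPT

# ...and drop the remaining privileged source ports, which here are
# most likely spoofed.
iptables -A INPUT -p udp --dport 123 --sport 0:1023 -j DROP
```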
JFYI: for me, the bottleneck is not the NTP server itself (although its Atom D2500 is nearly fully loaded when incoming traffic reaches 20 to 50 Mbit/s), but the pfSense router based on a Celeron G3900 and an Intel NIC, which seems to generate a lot of interrupts, so that a single core is almost entirely eaten by handling them.
One of the big Russian hosting providers read the Habr article, contacted me to offer assistance, and provided 30 free VPSes to serve the RU zone.
Maybe we will resurrect soon.
This would be consistent with the attack targeting a hostname in *.ru.pool.ntp.org rather than IP addresses.
Itās not unusual for clients to use any UDP source port. Typically Linux systems would query from 123 or a port above 1024, but any source port is possible.
As far as the flood coming from less-popular ISPs, those might be ISPs which donāt protect against their customers spoofing othersā IP addresses. The unusually high level of ICMP unreachables suggests forged source IP addresses.
It is possible to set netspeed lower than 512k. Look at the GET requests that the "Manage servers" page on pool.ntp.org sends. I set 1 kbps and it is easy to handle.
upd: set 30 kbps, the load gets higher: 50% CPU, about 70k ppm, ~70 Mbit/s. Set 15 kbps, ~45 Mbit/s, a comfortable load.
Maybe due to the "huge" number of new servers in the RU zone you will appear less often in the DNS rotation and, as a result, get less traffic.
But it's funny that you can set the speed via a GET request.
And even clients using source UDP/123 will be translated to some other port number if behind a NAT gateway anyway.
Based on my experience, I suspect that the pool serves as a way to mine server addresses. My test involved adding a server to the pool, and I observed waves of pokes at several popular ports, such as SSH, SMB, Telnet, etc., besides NTP. If my anecdotal experience applies, the Russian pool would be an obvious resource for mining addresses of servers in Russia.
All of my public IPs receive this 24/7 whether they are NTP servers or not, and there are paid-for services like Shodan which scan the whole Internet to compile a database of open ports.
Are you sure that you see an increase in this immediately after you add IPs to the NTP pool?
The idea that people are doing DNS queries to gather lists of NTP servers (in a given region?) and then subjecting them to further scanning seems strange to me, when one can, for example, just download a list of all IP addresses allocated to entities in RU and scan those (or pay someone who has already scanned them).
Yes, I am.
Why not? I would, if I were thusly inclined. Itās a very low hanging fruit, especially to script kids.