Buggy ntp server?

Sometimes I get a huge burst of NTP packets from certain servers.
Here is part of the packet trace:

11:27:28.156386 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:27:28.156460 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:28:59.184211 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:28:59.184267 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:01.193556 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:01.193663 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:03.206575 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:03.206650 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:05.229603 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:05.229661 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:07.244921 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:07.245097 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.198813 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.198828 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.198869 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.198888 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.199473 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.199553 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.199748 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.199819 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.200372 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.200434 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.200839 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.200935 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.201280 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.201349 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.201985 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.202073 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.203261 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.203276 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.203333 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.203352 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.203897 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.203982 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.204062 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.204176 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.204707 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.204781 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.205086 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.205158 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.205875 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.205943 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48

The per-second statistics for the client 212.147.28.70 (the first column is the number of packets in both directions):

      2 11:27:28
      2 11:28:59
      2 11:29:01
      2 11:29:03
      2 11:29:05
      2 11:29:07
   3580 11:29:57
   6267 11:29:58
   6274 11:29:59
   6271 11:30:00
   6266 11:30:01
   6252 11:30:02
   6245 11:30:03
   6270 11:30:04
   6269 11:30:05
   6306 11:30:06
   1222 11:30:07
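
For anyone who wants to produce the same kind of per-second table from their own capture, here is a minimal Python sketch. It assumes the default (non-verbose) tcpdump text format shown in the trace above, one packet per line; the function name `per_second_counts` and the `SAMPLE` variable are made up for illustration.

```python
from collections import Counter

def per_second_counts(lines, client_ip):
    """Count packets per second (both directions) for one client.

    Assumes default tcpdump text output, e.g.
    11:29:57.198813 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[1] != "IP":
            continue  # not a packet line in the expected format
        # fields[2] is "src_ip.port", fields[4] is "dst_ip.port:"
        src = fields[2].rsplit(".", 1)[0]
        dst = fields[4].rstrip(":").rsplit(".", 1)[0]
        if client_ip in (src, dst):
            counts[fields[0][:8]] += 1  # HH:MM:SS bucket
    return sorted(counts.items())

# A few lines copied from the trace above.
SAMPLE = """\
11:29:07.244921 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:07.245097 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.198813 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.198828 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.198869 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
""".splitlines()

for second, n in per_second_counts(SAMPLE, "212.147.28.70"):
    print(f"{n:7d} {second}")
```

The sketch buckets on the first eight characters of the timestamp, so it only works when tcpdump prints wall-clock times in the HH:MM:SS.ffffff form.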

My ntp.conf has: restrict default nomodify notrap nopeer noquery

Do you know, by chance, of any broken NTP server implementation in use on the Internet that behaves like this?

I guess it is a broken NTP client, or a “crazy” client behind a firewall/NAT. It could also be multiple clients on a network behind a NAT that fails to translate the replies back to them, so the clients keep retrying. Or it could be a spoofed source IP and a reflection attack. We really cannot know.

You could try the abuse contact for the IP: http://whois.domaintools.com/212.147.28.70 Or their website has “Live Help” and other contact methods: https://www.vtx.ch/de

The top of the hour, the half hour, and so on are known to be busy times.

Could it be a loop, where somebody has set the pool as a time source and serves the pool at the same time?

Just wondering.

That must be a broken NTP server configured as a client; please concentrate on the timing details.

The source port is always 123. There was 1 request at 11:27:28. Then, 91 seconds later, came 5 more requests at 2-second spacing. Then, 50 seconds after the last of those, the storm starts.
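
The gaps quoted above can be checked mechanically. A small Python sketch, assuming the request times are taken at one-second granularity from the trace and that the capture does not cross midnight; `request_gaps` is an illustrative name:

```python
from datetime import datetime

def request_gaps(timestamps):
    """Return the gaps, in whole seconds, between consecutive HH:MM:SS times."""
    times = [datetime.strptime(t, "%H:%M:%S") for t in timestamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# Client request times from the packet trace (the last one is the start of the storm).
requests = ["11:27:28", "11:28:59", "11:29:01", "11:29:03",
            "11:29:05", "11:29:07", "11:29:57"]

print(request_gaps(requests))  # [91, 2, 2, 2, 2, 50]
```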

Yeah; I have sometimes worked with vendors on minimizing that. I’d like to have more of the DNS server metrics exposed on the website and with some analytics for people with a custom DNS zone to automatically alert them when their clients do this.

Yeah, depending on how often you run a cron job, it’s always good to have some sort of random sleep if it doesn’t absolutely have to be run at a precise time… ex: sleep $(expr $RANDOM % 900)
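
If the job is launched from a script rather than directly from cron, the same idea can be sketched in Python. This is only an illustration of the random-delay technique; `jitter_seconds` and `run_with_jitter` are hypothetical helper names, and the 900-second window mirrors the shell example.

```python
import random
import time

def jitter_seconds(max_delay=900):
    """Pick a random whole-second delay in the range [0, max_delay)."""
    return random.randrange(max_delay)

def run_with_jitter(job, max_delay=900, sleep=time.sleep):
    """Sleep for a random delay, then run job (a zero-argument callable).

    The sleep function is a parameter so the delay can be skipped in tests.
    """
    sleep(jitter_seconds(max_delay))
    return job()
```

Calling `run_with_jitter(sync_clock)` with your own `sync_clock` function spreads the start time over up to 15 minutes, so clients sharing the same cron schedule do not all hit the server in the same second.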

I’m seeing the same floods occasionally.
Those are coming from a handful of IP addresses. Sometimes there are regular requests at a reasonable rate, and then all of a sudden there are 20k+ packets per second.
Each packet has a new transmit timestamp value that moves slightly forward with each packet, so I guess these are not attempts to DDoS someone, because it would take unnecessary CPU time to put a new timestamp in each packet.
This also means the packets are likely generated by a single client, not by thousands of clients behind the same NAT.
The packets are always NTPv4, which points to a newer client implementation.

Based on the received TTL, this seems to be a Linux or Mac client.
From my point of view, the source IP is 12 hops away from my server, and I received packets with TTL 50. Assuming the initial TTL was 64, the packets crossed 14 hops on the way in, which is consistent with standard NAT and slightly asymmetric routing.
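
The TTL reasoning can be written down as a small Python sketch. It assumes the commonly seen initial TTL values (64 is typical for Linux and macOS, 128 for Windows, 255 for some network gear), which is a rule of thumb and not a reliable fingerprint; the function names are made up for illustration.

```python
# Commonly seen initial TTL values (a rule of thumb, not a guarantee).
COMMON_INITIAL_TTLS = (64, 128, 255)

def guess_initial_ttl(received_ttl):
    """Return the smallest common initial TTL that is >= the received TTL."""
    for ttl in COMMON_INITIAL_TTLS:
        if received_ttl <= ttl:
            return ttl
    raise ValueError("TTL cannot exceed 255")

def inbound_hops(received_ttl):
    """Estimate how many hops decremented the TTL on the way in."""
    return guess_initial_ttl(received_ttl) - received_ttl

print(guess_initial_ttl(50), inbound_hops(50))  # 64 14
```

With a received TTL of 50 this gives 14 inbound hops against the 12 hops measured outbound, so a difference of two hops has to be explained by NAT or by the return path taking a different route.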

Could this be a very, very buggy “iburst”-like configuration?

Can you post more detailed tcpdump output (-v option) for a few requests, or share a pcap file? Maybe we could guess the client implementation.

Here it is:

12:48:20.884388 IP (tos 0x0, ttl 52, id 13562, offset 0, flags [none], proto UDP (17), length 76)
    212.147.28.70.123 > 156.106.214.52.123: [udp sum ok] NTPv4, length 48
	Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 4 (16s), precision -6
	Root Delay: 1.000000, Root dispersion: 1.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3791015300.876820999 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3791015300.876820999 (2020/02/18 12:48:20)
12:48:20.884495 IP (tos 0xc0, ttl 64, id 60213, offset 0, flags [DF], proto UDP (17), length 76)
    156.106.214.52.123 > 212.147.28.70.123: [bad udp cksum 0x63c2 -> 0xb88c!] NTPv4, length 48
	Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 4 (16s), precision -24
	Root Delay: 0.014846, Root dispersion: 0.023849, Reference-ID: 131.188.3.222
	  Reference Timestamp:  3791014771.905195942 (2020/02/18 12:39:31)
	  Originator Timestamp: 3791015300.876820999 (2020/02/18 12:48:20)
	  Receive Timestamp:    3791015300.884388057 (2020/02/18 12:48:20)
	  Transmit Timestamp:   3791015300.884481105 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  +0.007567057
	    Originator - Transmit Timestamp: +0.007660105
12:48:20.884810 IP (tos 0x0, ttl 52, id 13563, offset 0, flags [none], proto UDP (17), length 76)
    212.147.28.70.123 > 156.106.214.52.123: [udp sum ok] NTPv4, length 48
	Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 4 (16s), precision -6
	Root Delay: 1.000000, Root dispersion: 1.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3791015300.877182999 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3791015300.877182999 (2020/02/18 12:48:20)
12:48:20.884940 IP (tos 0xc0, ttl 64, id 60214, offset 0, flags [DF], proto UDP (17), length 76)
    156.106.214.52.123 > 212.147.28.70.123: [bad udp cksum 0x63c2 -> 0x1c7d!] NTPv4, length 48
	Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 4 (16s), precision -24
	Root Delay: 0.014846, Root dispersion: 0.023849, Reference-ID: 131.188.3.222
	  Reference Timestamp:  3791014771.905195942 (2020/02/18 12:39:31)
	  Originator Timestamp: 3791015300.877182999 (2020/02/18 12:48:20)
	  Receive Timestamp:    3791015300.884810888 (2020/02/18 12:48:20)
	  Transmit Timestamp:   3791015300.884926261 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  +0.007627888
	    Originator - Transmit Timestamp: +0.007743261
12:48:20.885441 IP (tos 0x0, ttl 52, id 13564, offset 0, flags [none], proto UDP (17), length 76)
    212.147.28.70.123 > 156.106.214.52.123: [udp sum ok] NTPv4, length 48
	Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 4 (16s), precision -6
	Root Delay: 1.000000, Root dispersion: 1.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3791015300.877715999 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3791015300.877715999 (2020/02/18 12:48:20)
12:48:20.885555 IP (tos 0xc0, ttl 64, id 60215, offset 0, flags [DF], proto UDP (17), length 76)
    156.106.214.52.123 > 212.147.28.70.123: [bad udp cksum 0x63c2 -> 0x8df2!] NTPv4, length 48
	Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 4 (16s), precision -24
	Root Delay: 0.014846, Root dispersion: 0.023849, Reference-ID: 131.188.3.222
	  Reference Timestamp:  3791014771.905195942 (2020/02/18 12:39:31)
	  Originator Timestamp: 3791015300.877715999 (2020/02/18 12:48:20)
	  Receive Timestamp:    3791015300.885441947 (2020/02/18 12:48:20)
	  Transmit Timestamp:   3791015300.885540690 (2020/02/18 12:48:20)
	    Originator - Receive Timestamp:  +0.007725947
	    Originator - Transmit Timestamp: +0.007824690

That looks like requests from ntpdate. The transmit timestamps end with 999, which indicates a system clock with microsecond resolution.
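
To see why a microsecond clock produces fractions ending in 999, here is a Python sketch of the round trip. It assumes the client converts microseconds to the 32-bit NTP fraction by rounding down, and that the printing side converts the fraction back to nanoseconds by rounding down as well; the real ntpdate and tcpdump conversions may differ in detail, and the function names are illustrative.

```python
def usec_to_ntp_fraction(usec):
    """Convert microseconds to a 32-bit NTP fraction, rounding down."""
    return (usec << 32) // 1_000_000

def ntp_fraction_to_nsec(fraction):
    """Convert a 32-bit NTP fraction to nanoseconds, rounding down."""
    return (fraction * 1_000_000_000) >> 32

def printed_nsec(usec):
    """Nanoseconds shown after a microsecond value goes through both conversions."""
    return ntp_fraction_to_nsec(usec_to_ntp_fraction(usec))

# Microsecond values that would match the three client requests in the capture.
for usec in (876821, 877183, 877716):
    print(f"{usec} us -> 0.{printed_nsec(usec):09d}")
# 876821 us -> 0.876820999
# 877183 us -> 0.877182999
# 877716 us -> 0.877715999
```

Under these assumptions each rounding step loses a little, so almost every whole-microsecond value comes back one nanosecond short. A clock with nanosecond resolution would show arbitrary last digits instead.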

There is a NIST paper with usage patterns they observed.

I too see this flood of packets every few hours with peaks around 24k/second. I have rate limiting configured but do not know if these bursts are coming from the same or different set of IPs.

This has been happening for a few months now. I thought it was only my server being targeted. I’ll most likely run a packet capture to gather information and correlate with others.

While this has little impact to me, it’d be nice to stop it.

Yeah, without capturing IPs there’s really no way to tell. I’ve had issues in the past with specific IPs, and sometimes entire subnets. FWIW, my packet rate is pretty consistent…

I am catching the very talkative clients with something like this:
tcpdump -c 10000 -n -n dst port ntp 2> /dev/null | awk '/Client/ {print $3}' | cut -d. -f-4 | sort | uniq -c | sort -k 1nr | head
You can adjust the packet count limit (-c) depending on your load.
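
If you would rather post-process a saved capture, the same counting can be sketched in Python. This assumes the tcpdump -n text format from the traces above; `top_talkers` is an illustrative name, and 192.0.2.10 in the sample is a made-up address from the documentation range.

```python
from collections import Counter

def top_talkers(lines, limit=10):
    """Return the busiest client IPs as (ip, request_count) pairs.

    Mirrors the shell pipeline: keep "Client" lines, take the source
    field, drop the port, count, and sort by count descending.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or "Client," not in fields:
            continue
        src_ip = fields[2].rsplit(".", 1)[0]  # strip the ".port" suffix
        counts[src_ip] += 1
    return counts.most_common(limit)

SAMPLE = """\
11:29:57.198813 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.198869 IP 156.106.214.52.123 > 212.147.28.70.123: NTPv4, Server, length 48
11:29:57.199473 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.199600 IP 192.0.2.10.40123 > 156.106.214.52.123: NTPv4, Client, length 48
11:29:57.199748 IP 212.147.28.70.123 > 156.106.214.52.123: NTPv4, Client, length 48
""".splitlines()

for ip, n in top_talkers(SAMPLE):
    print(f"{n:7d} {ip}")
```

Passing `sys.stdin` instead of `SAMPLE` would let the script read tcpdump output from a pipe.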

I’m seeing these 10s spikes in traffic several times per hour on my server in the US zone (not so much on servers in other zones). Here is a graph with rate limiting disabled and enabled (at two different settings).

The spikes seem to be exactly 9 or 10 seconds long looking at ntptraf output:

Thanks for the additional info! Rate limiting helps and is necessary, but it is just a workaround. The real solution would be to find the client and fix it. In the end, I think a dedicated NTP community like this one could figure out what this broken client is.

Yes, that would be great. I exchanged a few emails with @stevesommars (who brought this issue to my attention). He tried emailing the abuse addresses of a few offending clients, but got no response. Maybe more people should try that.

The client’s requests look exactly like requests from ntpdate, but judging by the timing of the requests when the client is behaving normally, it doesn’t seem to be ntpdate (or maybe it’s a broken port).

I suspect it is the Windows version of ntpdate. It seems to show the same shifting request timing as observed for the offending addresses when they work correctly, and there is an explanation for it in the ntpdate code (select() not returning on timer).

Since the bursts have been seen with both NTP v3 and v4 packets, and with both poll 3 and poll 4, this suggests a mix of old and new ntpdate versions. Maybe a recent Windows update broke ntpdate?

However, this wouldn’t explain why they all seem to send 5 (and sometimes 6?) requests instead of the default 4 when working correctly.