Pool servers always get some HTTP requests, and other garbage traffic, but HTTP requests to my US Pool servers have recently been going way up.
Specifically, requests to http://north-america.pool.ntp.org/ for “HEAD / HTTP/1.1” from assorted Android user-agents.
For example, my servers each recently averaged about 200,000 of those requests per day. (Net speed: 500 Mbps.) (Edit: Or 1000 Mbps.)
Anyone have thoughts or information about it?
It’s not a problem for me at this point – though I did decide to tighten my log rotation – but if the traffic continues to increase, or if someone is low on disk space, doesn’t have good log rotation, or logs to an SD card, they could eventually have hundreds of MB of logs and run into trouble…
There are at least 20 000 distinct IP addresses being used (I see between 19 000 and 25 000 different addresses in a daily log for these requests).
On my systems at least, it’s exclusively IPv4 traffic, even though both servers are also on IPv6 in the pool.
The clients appear to be leaving the connections open until they time out on the server side.
It’s always Dalvik user agents, which means it’s either a lazy browser or something that’s automated (all the big browsers have their own name in the UA string).
It tends to happen in bursts that correlate with the systems being selected in the DNS rotation.
The first four points sound to me like a lazy attempt at a DDoS attack against the NTP pool, and on my systems at least it’s having an effect (nginx sometimes dies from memory exhaustion when these spikes happen). The fact that it’s only a HEAD request is somewhat suspicious too, as it implies either that they know there’s going to be a redirect, or that they have no intent to follow the redirect and just want to query the specific system. Things have gotten a bit better on my end since I adjusted the HTTP keepalive settings in nginx to somewhat more draconian values (which has no effect on the regular web service, since that’s largely just file hosting for downloads).
On my end, the plan is to wire up a couple of conditionals in the web server config to match the common user agent prefix and the exact request type, and just drop the connection immediately.
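Something along these lines is what I have in mind (an untested sketch, not final config; the vhost name and timeout values are placeholders, and the map needs to sit at the http level):

# http level: flag HEAD requests whose User-Agent starts with "Dalvik"
map "$request_method:$http_user_agent" $bogus_request {
    default          0;
    "~^HEAD:Dalvik"  1;
}

server {
    listen 80;
    server_name north-america.pool.ntp.org;   # placeholder vhost name

    # the "more draconian" keepalive values mentioned above
    keepalive_timeout  5s;
    keepalive_requests 10;

    # 444 is nginx's special code for closing the connection without a reply
    if ($bogus_request) {
        return 444;
    }
}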
I’ve also seen some odd traffic apparently coming from a handful of systems supposedly running Safari, initially trying ‘GET / HTTP/1.1’ and doing something bogus with the headers that got them a 400 all the time, but more recently showing a pattern like the above-mentioned traffic.
Good point about stuff using the Date header from HTTP responses; I had forgotten that there’s stuff that does that.
The issue here for me, though, is that it leaves the connection open. It’s requesting HTTP/1.1, apparently sending a Connection: keep-alive header (otherwise nginx would be closing the connection), and then not bothering to close the connection itself. While it may not be intentionally malicious, I’d still consider that to be unintentionally malicious by virtue of negligence. Looks like I need to figure out how to configure nginx to selectively disable keepalive for these clients.
If I can find the time, I may do some further digging to get a full list of HTTP headers it’s sending, maybe that will help identify where the traffic is actually coming from.
Yeah, I had forgotten that the keepalive_timeout setting is valid for servers and/or locations (anything I’ve dealt with before has just had it as a system-wide setting).
I’ve actually configured it so that the whole virtual server I’m running to redirect *.pool.ntp.org requests has keepalive disabled (there’s no reason requests should be pipelined in this case, especially since I’m issuing a 301). That has helped with the server load considerably: I would previously see spikes of up to 200 of these connections concurrently, and now it’s down to at most about 8.
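For anyone wanting to do the same, the relevant bit looks roughly like this (a sketch rather than my literal config):

server {
    listen 80;
    server_name .pool.ntp.org;    # pool.ntp.org plus any subdomain

    keepalive_timeout 0;          # disable keepalive for this vhost only
    return 301 http://www.pool.ntp.org$request_uri;
}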
First of all, these requests don’t cause any particular harm to my servers, but they are indeed an oddity. My North American server is currently set to 10 Mbit to keep its bandwidth usage below 3 TB/month. I guess the request rates would be higher with a higher bandwidth setting. At the moment I’m getting 2000-3000 north-america HTTP requests each day.
It turns out that the “clients” doing HEAD requests actually follow the redirects. As many of you do, I also had a redirect for *.pool.ntp.org to www.pool.ntp.org, but now I thought I’d temporarily change the destination to a different server that I control (as an afterthought, I could also have redirected to the same NTP server but to its actual hostname). I saw matching requests in the other server’s logfile. This means that Ask is probably seeing more than the usual amount of requests to www.pool.ntp.org due to this. Some of those “clients” seem to have IPv6, because some of the redirected requests came in via IPv6.
I have now (temporarily?) reconfigured my web server so that pool.ntp.org and ntppool.org are still redirected to www.pool.ntp.org like before, but requests for other subdomains will get a 403. I don’t think any human is going to see those 403s. 403s also have the Date field, so the “clients” should be able to get the date regardless of the 403.
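For the nginx users here, the equivalent is roughly the following sketch (exact server names take priority over the wildcard forms, so the ordering of the blocks doesn’t matter):

server {
    listen 80;
    server_name pool.ntp.org ntppool.org;     # the hosts that keep their redirect
    return 301 http://www.pool.ntp.org$request_uri;
}

server {
    listen 80;
    server_name .pool.ntp.org .ntppool.org;   # any other subdomain
    return 403;                                # the 403 still carries a Date header
}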
I see those requests on my US server, too.
I plotted the requests towards my system on a GeoIP map. Unfortunately, there is no real “hot spot”.
So far, the device types (if you trust the User-Agent header) are 100% Android, with many different makes and models. If it’s caused by an app: why is there no traffic from iOS devices?
It’s either an Android-exclusive app, or the iOS variant gets the current time some other way.
I will start monitoring HTTP traffic towards “north-america.pool.ntp.org” in our office tomorrow (Monday).
Maybe someone there is also using this specific app, so we can find out what is causing these requests.
@avij That’s interesting, I’ve not seen anything on IPv6 yet, and on top of that, I actually hadn’t seen any UA headers from any Android version before 5.1 either. I also find the presence of the Accept-Encoding header a bit odd, as gzip-encoding a response will save absolutely zero time on a HEAD request (which returns no body) unless there are a ton of headers being returned. As for the traffic volume, it’s enough, with the net speed set to 100 Mbit, that it was occasionally swamping the kernel memory for TCP connections on my systems before I disabled keepalive. As for configuration, I’m likely going to do something similar to what you’ve done, although I’m tempted to return 421 instead, as that more accurately describes the situation, and I’ll probably add some particularly draconian rate-limiting as well.
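For the rate-limiting part, I’m picturing something like this (nothing is tuned yet; the zone name, rate, and burst values are placeholders):

# http level: track the request rate per source address
limit_req_zone $binary_remote_addr zone=poolhttp:10m rate=6r/m;

server {
    listen 80;
    server_name .pool.ntp.org;

    limit_req zone=poolhttp burst=5 nodelay;   # excess requests get rejected

    # 421 (Misdirected Request) for the Dalvik clients; the response
    # still includes a Date header
    if ($http_user_agent ~ ^Dalvik) {
        return 421;
    }
}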
@lordgurke Thanks for the map! For what it’s worth, that’s actually a rather typical global distribution for a generic Android app, with the distribution itself suggesting very limited localization (my guess based on what’s shown is that it’s probably only one or two languages, most likely English and/or Spanish). The fact that they’re going for ‘north-america.pool.ntp.org’ combined with the heavy US and Mexico usage seems to indicate it was probably made in the US. As far as no iOS traffic, that doesn’t really surprise me either, as poorly thought-out stuff like this tends to be single-platform. Overall, it sounds like some college student wrote up a game and doesn’t trust the system time for some reason (or thinks he can’t trust the system time).
Clarification on IPv6: all requests to north-america.pool.ntp.org are made via IPv4, because that DNS entry does not have an IPv6 address, but when I redirect such a request to some other hostname that has both an IPv4 and an IPv6 address, some of those requests are made via IPv6. If north-america.pool.ntp.org had an IPv6 address in DNS, there would be some requests to that address over IPv6 as well. This does not matter much, but I just wanted to point out that some of those clients do have IPv6 capability.
Somewhat ironically, I got another burst of traffic on one of my systems, and the new configuration seems to keep things in a reasonably sane state on the server end: no more than about 8 connections at a time from the offending devices, each one lasting about 1 second at most and using almost no processing power (I think turning off gzip compression was what really changed that), and a 421 return code still sends the Date header. I’ve also explicitly called out a match for www.pool.ntp.org because, believe it or not, I’ve actually gotten a few stray requests with that in the Host header.
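The www carve-out is just an explicit vhost that wins over the wildcard, something like this (a sketch; answering those requests with a 421 is my own choice):

server {
    listen 80;
    server_name www.pool.ntp.org;   # exact match beats the .pool.ntp.org wildcard
    gzip off;
    return 421;                     # a 301 back to www here could loop
}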
Yup. That happened to me once a few years ago. And of course I redirected *.pool.ntp.org to http://www.pool.ntp.org/, so it caused an infinite redirect loop with the client until I changed the configuration…
@mnordhoff Good point. On the other hand, it’s almost certainly going to be a case of either a horribly misconfigured client or active malicious intent (or possibly DNS issues, but I doubt that that will be the case), so I may check that explicitly and return something else to flag the request as obviously bogus…
@ahferroin7: I can see some Android 4.x in my log files. Also, the Dalvik version numbers seem to be correct for those Android versions.
At the moment I can see some clients constantly hitting my pool server with these requests:
GET / HTTP/1.1
Connection: keep-alive
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
All requests originate from different IP addresses, but all from the Comcast network.
These requests get an HTTP 400 answer, since (I assume) the Host: header is missing but required for HTTP/1.1.
In today’s log file there are almost exactly 40, 80, or 120 requests logged per IP address, coming from several addresses within the range 98.223.99.0 - 98.223.231.255.
The whole range 98.192.0.0/10 is routed or used by Comcast: https://whois.arin.net/rest/net/NET-98-192-0-0-1
Outside that range there were 40 requests each from IP addresses in 68.50.200.0 and 68.60.240.0.
The mind-melting oddity is: There are always multiples of 40 requests per IP address. Not 15 or 27. Always 40, 80 or 120 requests. And all requests share the EXACT same user agent.
@lordgurke Yeah, I’ve yet to see the Dalvik version mismatch, and the handful of build IDs I was able to check over the weekend are correct relative to the Android and Dalvik versions too, so it looks like it isn’t spoofing the UA string (though that doesn’t mean it’s legitimate traffic; for every intelligent malicious actor out there, there are at least a dozen stupid ones).
I’ve also seen the same kind of apparently Mac-originated traffic. You’re right, it’s a 400 because of the missing Host header. I would not be surprised if the User-Agent header is bogus: that’s a five-and-a-half-year-old version of Google Chrome it’s claiming to be, but it’s also violating the protocol in a way that Chrome never has (unless there’s a plugin involved). I have not yet come up with a reasonable way to deal with these, as I want to avoid making it trivial for people to DoS my web server through automated reactive blocking of the offending addresses.
Ugh, this is annoying. You’d think having some headers from HTTP would help track down who’s doing it, but it doesn’t look like it.
The website is on a CDN (generously provided by Fastly) for this reason. I don’t actually keep logs, but if I’m reading the stats they keep correctly, there are a disproportionate number of redirects (say, from / to /en/?). Regular users would also download CSS, JS, etc.
The site gets an average of about 100 requests a second, with peaks of about 5000 on a per-minute average basis. That seems about right compared to what I remember from before the site was on Varnish (thousands of requests at the top of the hour).
I don’t really have a good suggestion for what to do about it. One “solution” would be to make HTTP time an explicit service we offer (on a different hostname obviously) …
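For illustration only, such a service could be as small as the following sketch (the hostname is made up, and this is not something the pool actually runs). An empty 204 is about as cheap as HTTP gets, and it still carries the Date header these clients appear to be after:

server {
    listen 80;
    server_name time.pool.example;   # hypothetical dedicated hostname

    keepalive_timeout 0;
    location / {
        return 204;                  # empty reply; nginx adds the Date header itself
    }
}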
Well, the given headers for the Android traffic appear to be the defaults you get from the HttpURLConnection class, so either the individual who wrote the app is actively trying to evade identification, or they’re really lazy.
As far as the macOS stuff, I’d be willing to bet that the headers are bogus and designed to just look like it’s macOS. There’s obviously some form of automation going on there, but the reported Chrome version didn’t really support this kind of thing very well, and Chrome has always followed the HTTP specs pretty much to the letter, so I think at least the UA header is spoofed. At the point at which that is spoofed, it’s reasonably safe to assume everything else is suspect too, so I don’t think there’s really anything to be found looking at that.
After a few days, the distribution of request IPs seems really global and “normal” to me.
I made some statistics with Wireshark over the requested Hosts and URLs (seriously, there’s nothing you can’t do with Wireshark!).
I’ve uploaded a summary file here: https://maxderdepp.de/files/ntp-http-hosts-1.yaml
The server on which I’ve captured this has the IP 192.96.202.120 (“dns-e.wdc-us.hosts.301-moved.de.”), so requests to that IP address or that hostname are most likely intrusion attempts.
The host “north-america” is not the only pool hostname being queried. There are plenty of queries towards, for example, the vendor zones for Fedora, CentOS, or Debian, among others.
EDIT: Here’s the latest map of the request distribution:
Huh, I’ve yet to see any vendor-zone requests on my systems, but that may just mean my systems have never been listed under the vendor zones.
In my case though, I actually have legitimate traffic to my two systems, so I may need to start handling this traffic more aggressively if it gets significantly higher, as much as I hate the prospect of just dropping requests instead of replying.