Getting beyond 10k qps?


#21

Q1: Works, thanks.
Q2: Correct, “interface ignore wildcard” doesn’t apply to IPv6, but if I start ntpd with -4 it starts without IPv6.
Q3: I looked up what the traffic was, and it’s my colo’s router sending ECMP traffic to hosts it shouldn’t.
Q4: That worked :grinning:
Q5: rsntp seems a bit fragile for the first few minutes after it has been started. It doesn’t happen often, but it has happened that it just stops processing requests without any message. I restart it until it’s stable.


#22

I have now tested rsntp for over 2 months, and I must say that it’s very stable and uses resources better than ntpd. I recommend it to others who need to reply to a few billion NTP requests.

Packet stats for 6 of my 8 NTP servers:

[image: vlan161-1-mini]

Traffic stats for 6 of my 8 NTP servers:

[image: vlan161-2-mini]


#23

So how can we monitor rsntp stats? Is the script from the old ntp-pool mailing list post still applicable?


#24

Can you link to the old post? I googled but could not find anything.

What kind of stats do you want to monitor exactly? Packets, Bytes, Concurrent Connections?


#25

The original script: https://lists.ntp.org/pipermail/pool/2012-July/006049.html
It can only monitor ntpd packets, and that’s fine with me. I am currently looking for ways to monitor rsntp before switching.


#26

I browsed through the rsntp source and it’s pretty minimalist. I don’t see any sort of stats or output that it can produce. I don’t know how to program in Rust, but if you could talk to the developer, I don’t think it would be too terribly difficult to add some counters for packets sent / received and some way to query them.

Alternatively, you could set up some rules in iptables and track packet count / byte count that way for some basic numbers, then query them with a custom script that feeds some RRD-based logging. You could probably do the same with tcpdump, but it might be a little more CPU intensive. I’m sure there are other ways too.
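To make the iptables idea a bit more concrete, here is a minimal Python sketch of the "custom script" part. It assumes you have added a counting rule for UDP port 123 to the INPUT chain and that the listing comes from `iptables -L INPUT -v -n -x`. The function names `parse_ntp_counters` and `read_input_chain` are made up for illustration, and the column layout is what I see on my systems, so check it against your own output first.

```python
import subprocess


def parse_ntp_counters(listing, port=123):
    """Sum the packet and byte counters of UDP rules matching the given port.

    `listing` is the text printed by `iptables -L <chain> -v -n -x`, where
    the first two columns of each rule line are packets and bytes.
    """
    token = f"dpt:{port}"
    packets = 0
    byte_count = 0
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and rules for other protocols or ports.
        if len(fields) < 2 or "udp" not in fields or token not in fields:
            continue
        if fields[0].isdigit() and fields[1].isdigit():
            packets += int(fields[0])
            byte_count += int(fields[1])
    return packets, byte_count


def read_input_chain():
    """Return the INPUT chain listing with exact counters (needs root)."""
    result = subprocess.run(
        ["iptables", "-L", "INPUT", "-v", "-n", "-x"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling `parse_ntp_counters(read_input_chain())` from cron every minute and feeding the two numbers into your RRD tool would be one way to use it.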

I wish I had some scripts that would work, but I monitor my NTP sent/received packets via parsing the ‘ntpdc -c iostats’ command.
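For what it is worth, the parsing side of that is simple. Below is a rough Python sketch, assuming the command prints one `label: number` pair per line (that is how it looks on my ntpd; labels may differ between versions, so treat the label names in the comment as an assumption). `parse_iostats` and `read_iostats` are hypothetical helper names.

```python
import subprocess


def parse_iostats(text):
    """Turn 'label:   number' lines into a {label: int} dictionary."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        value = value.strip()
        # Keep only lines that really look like "label: integer".
        if sep and value.isdigit():
            stats[label.strip()] = int(value)
    return stats


def read_iostats():
    """Run ntpdc against the local ntpd and return the parsed counters."""
    result = subprocess.run(
        ["ntpdc", "-c", "iostats"],
        check=True, capture_output=True, text=True,
    )
    return parse_iostats(result.stdout)


# Example (label names assumed, verify against your own output):
#   stats = read_iostats()
#   received = stats.get("received packets")
#   sent = stats.get("packets sent")
```

This only works for ntpd, of course, so it does not help with rsntp directly.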

On a similar note that might give you some inspiration… Recently I wanted to track the IPs making NTP requests to my server. So I created a couple of rules in iptables to log all incoming NTP, which I wrote to a separate file with rsyslog (so as not to fill up my main syslog file). Well, the amount of traffic made this file grow very large very fast (because of all the data each entry contains), so that wasn’t going to work. All I really wanted to know was how many queries each IP made, and just separating it by day was fine. So first I went to work and wrote a little script that was a basic datagram socket server. Rsyslog output the log entries directly to it. The script parsed each log entry and inserted it into a MySQL database. The table had the IP, hitcount, and date. Within about 30 minutes I was astonished at what I was seeing… Most IPs were okay, but there was a very small handful that had already made hundreds, if not thousands, of queries! For a protocol where a client should, over the long term (excluding iburst), be making about one query per minute, I was getting a few IPs that were continuously making multiple queries per second!
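I don’t have the original script handy, but the idea looks roughly like the Python sketch below. To keep it self-contained it uses SQLite instead of MySQL and listens on a local UDP port instead of whatever socket you point rsyslog at. The port number 5514, the table layout, and the function names are all assumptions for illustration. It relies on the `SRC=` field that the iptables LOG target writes into each entry.

```python
import re
import socket
import sqlite3
from datetime import date

# iptables LOG entries carry the source address as "SRC=<ip>".
SRC_RE = re.compile(r"\bSRC=(\S+)")


def open_db(path=":memory:"):
    """Open the database and create the per-IP, per-day hit table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "ip TEXT, day TEXT, hitcount INTEGER, PRIMARY KEY (ip, day))"
    )
    return db


def record(db, log_line, day=None):
    """Increment the hit counter for the source IP found in one log line."""
    match = SRC_RE.search(log_line)
    if match is None:
        return None
    ip = match.group(1)
    day = day or date.today().isoformat()
    cur = db.execute(
        "UPDATE hits SET hitcount = hitcount + 1 WHERE ip = ? AND day = ?",
        (ip, day),
    )
    if cur.rowcount == 0:
        # First query from this IP today.
        db.execute(
            "INSERT INTO hits (ip, day, hitcount) VALUES (?, ?, 1)", (ip, day)
        )
    return ip


def serve(db, host="127.0.0.1", port=5514):
    """Receive forwarded log entries as datagrams, one entry per packet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, _ = sock.recvfrom(8192)
        record(db, data.decode("utf-8", "replace"))
        db.commit()  # simple but slow; batch commits under heavy load
```

Storing one row per IP per day instead of one row per packet is what keeps the data small enough to live with.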

So from there I added a few rules and logs to my firewall to limit requests per IP using ‘hashlimit’. After altering my DB table and script slightly, I now logged how many requests per IP were ‘accepted’ and how many were ‘dropped’. Again, it was a very eye-opening experience to see this small handful of clients being so abusive. Looking up some of the top offenders offered no insight into a cause or common source. I did notice it was a little more common for the netblock to belong to various wireless carriers. Not just cell phones (there was a lot of China Telecom), but embedded-type devices (yay, which are probably hardcoded, with firmware that is never updated). One interesting block was Jasper Wireless (www.jasper.com). With requests coming from about half a dozen IPs in their netblock, I can only assume they were testing devices or something, but why those devices needed to update at a rate that borders on a DoS attack, I do not know. They earned a permanent block in my firewall weeks ago, yet still seem to be querying away at tens of thousands of packets an hour… Anyhow, crunching some numbers, I found that about 0.2% of the IP addresses making NTP requests were having a portion of their requests dropped (due to too-frequent requests), yet those IPs were generating about 4% of the overall inbound NTP traffic.
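The number crunching at the end is nothing fancy. As a sketch, assuming each row is `(ip, accepted, dropped)` as pulled from a table like the one I described, it boils down to something like this in Python. `abuse_summary` is a made-up name, and the rows in the usage note are invented, not my real data.

```python
def abuse_summary(rows):
    """Summarize how many IPs hit the rate limit and their traffic share.

    `rows` is an iterable of (ip, accepted, dropped) tuples. An IP counts
    as rate-limited if at least one of its requests was dropped.
    """
    rows = list(rows)
    total_ips = len(rows)
    total_requests = sum(accepted + dropped for _, accepted, dropped in rows)
    limited = [row for row in rows if row[2] > 0]
    limited_requests = sum(accepted + dropped for _, accepted, dropped in limited)
    return {
        "limited_ip_pct": 100.0 * len(limited) / total_ips if total_ips else 0.0,
        "limited_traffic_pct": (
            100.0 * limited_requests / total_requests if total_requests else 0.0
        ),
    }
```

With three well-behaved rows of `(ip, 10, 0)` and one abusive row of `(ip, 20, 50)`, this reports 25% of the IPs being rate-limited and those IPs accounting for 70% of the requests.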

Sorry, I kind of rambled… But my point is, where there’s a will, there’s a way… Adding a couple of rules to iptables, even if they have no action, will still give you a good count of packets and bytes which you can query (look at the -x option for exact counts). You just need to make sure your RRD stats can handle the counter resetting (most have an option to enable that so the graphs don’t go bonkers).
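If your graphing tool does not handle resets for you, the logic is easy to do yourself. Here is a small Python sketch with a hypothetical helper, `rate_per_second`, that assumes a counter going backwards means it was reset to zero (for example after a reboot or a firewall reload) between the two samples.

```python
def rate_per_second(previous, current, seconds):
    """Return the average rate between two counter samples.

    If the counter went backwards, assume it was reset to zero in between
    and count only what has accumulated since the reset. This undercounts
    slightly but avoids a huge negative or bogus spike in the graph.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if current >= previous:
        delta = current - previous
    else:
        delta = current  # counter reset: only the post-reset count is known
    return delta / seconds
```

For example, samples of 1000 and 1600 packets taken 60 seconds apart give 10 packets per second, while 5000 followed by 300 is treated as a reset and gives 5 packets per second.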


#27

Jasper is a massive IoT product managed by Cisco. Loads of huge mobile carriers all around the world are using the platform. Quite possibly this is the NTP traffic from tens of thousands of end devices, and that will only grow as more and more SIMs get issued.

:-/


#28

… and yet their “Contact Us” page is broken. Guess they don’t need any more business?


#29

They’re planning to have a billion devices on their network, in partnership with the mobile carriers. Maybe that’s enough, even for Cisco?!