Since Friday, the monitoring station has been reducing the score of my IPv6 server because some NTP requests “timeout.”
By the end of the day, the problem began to affect the same server over IPv4 as well. I noticed that incoming UDP traffic was still high (around 6-8 Kpps), but outgoing traffic dropped drastically to 2-3 Kpps, indicating that most packets were going unanswered.
The server is currently hosted on a DigitalOcean VPS. Is the problem related to the datacenter network, or to some kind of NTP traffic blocking? I’d appreciate suggestions for identifying and solving the problem.
The San Jose timeouts are due to not receiving responses rather than tardy responses.
If you can collect a couple of hours of tcpdump traffic to/from your VPS, we can determine whether the NTP requests or the responses are getting lost. Use this tcpdump filter:
host 139.178.70.122 or host 2604:1380:1001:d600::1
(I log all NTP packets on monsjc1.)
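For reference, a capture along these lines should do it (the interface name `eth0` is an assumption; check yours with `ip link`):

```shell
# Capture NTP-monitor traffic to a pcap file for later analysis.
# Interface name (eth0) is an assumption; run as root, stop with Ctrl-C.
tcpdump -i eth0 -w ntp-monsjc1.pcap \
  'host 139.178.70.122 or host 2604:1380:1001:d600::1'
```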
The lowest tier Digital Ocean droplet is “Basic”, which uses a shared CPU.
Perhaps the shared CPU is significantly over-subscribed.
I run NTP servers at other Digital Ocean sites. tcpdump on those servers shows typical NTP request-response times of ~50 usec. Your packet capture shows 200-300 usec. Are you using connection tracking?
I haven’t seen any network issues, but I’ll continue to check.
The VPS firewall is completely disabled. I only use DigitalOcean Cloud Firewalls. I don’t know whether it does any kind of tracking.
Requests from the San Jose monitoring station seem normal over IPv4, but almost all IPv6 requests still record “timeout.”
It’s only been a few hours for my new NY1 droplet. I’m getting 100% replies.
How many:
- packets/sec
- NTP ipv4 packets in, packets out
- NTP ipv6 packets in, packets out
are you seeing?
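One rough way to get those numbers is from the kernel’s UDP counters in `/proc/net/snmp` (system-wide UDP, not NTP-specific; `/proc/net/snmp6` has the IPv6 equivalents). A sketch:

```shell
#!/bin/sh
# Estimate system-wide IPv4 UDP packet rates from kernel counters.
# (Not filtered to port 123; see /proc/net/snmp6 for IPv6.)
interval=2
udp_counts() {
  # The second "Udp:" line in /proc/net/snmp holds the numbers:
  # $2 = InDatagrams, $5 = OutDatagrams
  awk '/^Udp:/ && $2 ~ /^[0-9]+$/ { print $2, $5 }' /proc/net/snmp
}
set -- $(udp_counts); in1=$1; out1=$2
sleep "$interval"
set -- $(udp_counts); in2=$1; out2=$2
echo "UDP in:  $(( (in2 - in1) / interval )) pps"
echo "UDP out: $(( (out2 - out1) / interval )) pps"
```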
Could you share ~10 seconds of unfiltered tcpdump?
Linux Connection Tracking is unrelated to the Digital Ocean firewalls.
Connection tracking allows the kernel to track traffic on a per-connection (flow) basis.
This can go wrong. See this and also this for example.
I don’t know much about connection tracking’s benefits, but have heard of several cases where it caused problems with NTP servers.
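If it helps, here is one way to check whether conntrack is active and how full its table is (a sketch; the `conntrack` tool comes from the conntrack-tools package and may not be installed):

```shell
# Is the conntrack module loaded?
lsmod | grep -i conntrack || echo "nf_conntrack not loaded"
# Current vs. maximum table size (these files are absent if
# connection tracking is not in use):
cat /proc/sys/net/netfilter/nf_conntrack_count 2>/dev/null
cat /proc/sys/net/netfilter/nf_conntrack_max 2>/dev/null
# Per-CPU stats, including drops (needs conntrack-tools and root):
conntrack -S 2>/dev/null || true
```

If `nf_conntrack_count` is near `nf_conntrack_max`, new flows get dropped, which would look exactly like unanswered requests.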
My NY1 droplet runs Ubuntu 20.04.2 LTS. I don’t see connection tracking anywhere.
I’m using the default Ubuntu configuration and haven’t made any changes related to Linux connection tracking, so I believe that’s not the issue. I also uninstalled the NetData monitoring service, which could have been interfering in some way, but nothing changed.
I am currently receiving about 6-7 Kpps of UDP and sending about 3-4 Kpps on the IPv4 stack (I don’t know why the received and sent counts differ so much).
Good news: I’ve done some performance testing with DigitalOcean Cloud Firewalls turned off, and it looks like the Cloud Firewall really is responsible for dropping much of the outbound UDP traffic.
I also used a stress test tool that fires many NTP requests in a short time. With the Cloud Firewall enabled, some requests simply went unanswered; with it disabled, far more responses arrived.
I believe that, just as Ubuntu’s stateful firewall stores connection state in the conntrack table, DigitalOcean Cloud Firewalls must keep their own state table, which can degrade performance over time depending on packet volume.
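This only applies to a firewall running on the droplet itself, not to the Cloud Firewall, but for anyone who does hit conntrack limits on a busy NTP server, a common mitigation is to exempt NTP from tracking entirely (an illustrative example, not something tested in this thread):

```shell
# Tell conntrack to ignore NTP traffic so it never fills the state
# table (run as root; repeat with ip6tables for IPv6):
iptables -t raw -A PREROUTING -p udp --dport 123 -j NOTRACK
iptables -t raw -A OUTPUT     -p udp --sport 123 -j NOTRACK
```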