My NTP server stopped responding to requests

Since Friday, the monitoring station has been reducing the score of my IPv6 server because some NTP requests time out.
By the end of the day, the problem also began to affect the same server over IPv4. I noticed that incoming UDP traffic was still high (around 6-8 Kpps), but outgoing traffic had dropped sharply to 2-3 Kpps, indicating that most packets were not being answered.
The server is currently hosted on a DigitalOcean VPS. Could the problem be related to the datacenter network or to some kind of NTP traffic blocking? I would appreciate suggestions on how to identify and solve the problem.

The San Jose timeouts are caused by responses never arriving, not by responses arriving late.

If you can collect tcpdumps for a couple of hours of traffic to/from your VPS, we can determine whether the NTP requests or responses are getting lost. Use tcpdump filter:
host 139.178.70.122 or host 2604:1380:1001:d600::1
(I log all NTP packets on monsjc1.)
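
Once a capture exists, requests and responses can be paired to see which side goes missing. The following is only a rough sketch: it assumes the capture has already been reduced to tuples of (timestamp, source IP, source port, destination IP, destination port), and the server address 192.0.2.10 is a placeholder, not the real VPS address. The function name `match_ntp` is made up for this illustration.

```python
from collections import deque


def match_ntp(packets, server_ip, ntp_port=123):
    """Pair NTP requests with responses seen in a capture taken on the server.

    packets: iterable of (timestamp, src_ip, src_port, dst_ip, dst_port)
             tuples in capture order (extracting them from the pcap is
             left out of this sketch).
    Returns (latencies, unanswered): per-reply delays in seconds and the
    number of requests that never got a reply.
    """
    pending = {}    # (client_ip, client_port) -> deque of request timestamps
    latencies = []
    for ts, src, sport, dst, dport in packets:
        if dst == server_ip and dport == ntp_port:
            # Request arriving at the server.
            pending.setdefault((src, sport), deque()).append(ts)
        elif src == server_ip and sport == ntp_port:
            # Response leaving the server: match the oldest open request.
            queue = pending.get((dst, dport))
            if queue:
                latencies.append(ts - queue.popleft())
    unanswered = sum(len(q) for q in pending.values())
    return latencies, unanswered


# Hypothetical sample: one answered and one unanswered request from the monitor.
sample = [
    (0.0, "139.178.70.122", 43808, "192.0.2.10", 123),
    (0.00025, "192.0.2.10", 123, "139.178.70.122", 43808),
    (1.0, "139.178.70.122", 48894, "192.0.2.10", 123),
]
latencies, unanswered = match_ntp(sample, "192.0.2.10")
print(len(latencies), "answered,", unanswered, "unanswered")
```

If a request shows up in the server-side capture with no response, the server did not answer it. If the response is in the capture but never reaches the monitor, it was lost somewhere after leaving the VPS. A request that the monitor logged but that is absent from the capture was lost on the way in.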

The lowest-tier DigitalOcean droplet is “Basic”, which uses a shared CPU.
Perhaps the shared CPU is significantly over-subscribed.

I’ve started some traceroutes from monsjc1.

I captured the packets and received requests from both hosts, but for some reason most IPv6 packets were not answered.

I’m sharing the Wireshark capture file in case you want to take a look: Gofile - Free file sharing and storage platform

I opened a support ticket with DigitalOcean and they requested an MTR report from my droplet to the client, so I provided that report:

Although the droplet has only 1 CPU, processor utilization always remains below 15%.

MTR and traceroute using UDP and port 123:

Traceroute to a non-NTP port, 124:

I run NTP servers at other DigitalOcean sites. tcpdump on those servers shows typical NTP request-to-response times of ~50 usec, whereas your packet capture shows 200-300 usec. Are you using connection tracking?

I haven’t seen any network issues, but I’ll continue to check.

The VPS firewall is completely disabled. I only use DigitalOcean Cloud Firewalls, and I don’t know whether they do some kind of connection tracking.
Requests from the San Jose monitoring station seem normal over IPv4, but almost all IPv6 requests continue to record “timeout.”

It’s only been a few hours for my new NY1 droplet. I’m getting 100% replies.

How many of the following are you seeing?
- packets/sec
- NTP IPv4 packets in, packets out
- NTP IPv6 packets in, packets out
Could you share ~10 seconds of unfiltered tcpdump?

Linux connection tracking is unrelated to the DigitalOcean firewalls.
Connection tracking lets the kernel keep state for each flow, identified by protocol, source and destination address, and port.
This can go wrong. See this and also this for example.

I don’t know much about connection tracking’s benefits, but have heard of several cases where it caused problems with NTP servers.

My NY1 droplet runs Ubuntu 20.04.2 LTS. I don’t see connection tracking anywhere.
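
One way to check whether connection tracking is active on a Linux host is to look for the netfilter counters under /proc. This is a minimal sketch, assuming the usual `nf_conntrack_count` and `nf_conntrack_max` files; the `proc_root` parameter and the function name are only there for illustration and testing.

```python
from pathlib import Path


def conntrack_status(proc_root="/proc"):
    """Return (count, maximum) of tracked flows, or None if the
    nf_conntrack counters are absent (module most likely not loaded)."""
    base = Path(proc_root) / "sys" / "net" / "netfilter"
    try:
        count = int((base / "nf_conntrack_count").read_text())
        maximum = int((base / "nf_conntrack_max").read_text())
    except FileNotFoundError:
        return None
    return count, maximum


if __name__ == "__main__":
    status = conntrack_status()
    if status is None:
        print("nf_conntrack does not appear to be loaded")
    else:
        count, maximum = status
        print(f"conntrack table: {count} of {maximum} entries in use")
```

A count close to the maximum on a busy NTP server would be a warning sign, since new flows are dropped once the table is full.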

monsjc1 does have connection tracking:

more /proc/net/nf_conntrack

ipv4 2 udp 17 30 src=139.178.70.122 dst=62.128.1.19 sport=43808 dport=123 [UNREPLIED] src=62.128.1.19 dst=139.178.70.122 sport=123 dport=43808 mark=0 zone=0 use=2
ipv4 2 tcp 6 8 CLOSE src=123.120.13.92 dst=139.178.70.122 sport=47830 dport=80 src=139.178.70.122 dst=123.120.13.92 sport=80 dport=47830 [ASSURED] mark=0 zone=0 use=2
ipv4 2 tcp 6 431994 ESTABLISHED src=139.178.70.122 dst=139.178.67.96 sport=47190 dport=443 src=139.178.67.96 dst=139.178.70.122 sport=443 dport=47190 [ASSURED] mark=0 zone=0 use=2
ipv4 2 udp 17 27 src=139.178.70.122 dst=185.220.101.25 sport=48894 dport=123 [UNREPLIED] src=185.220.101.25 dst=139.178.70.122 sport=123 dport=48894 mark=0 zone=0 use=2
ipv4 2 udp 17 17 src=139.178.70.122 dst=82.141.152.3 sport=53334 dport=123 src=82.141.152.3 dst=139.178.70.122 sport=123 dport=53334 mark=0 zone=0 use=2
ipv4 2 udp 17 17 src=139.178.70.122 dst=5.135.188.53 sport=40736 dport=123 src=5.135.188.53 dst=139.178.70.122 sport=123 dport=40736 mark=0 zone=0 use=2
ipv4 2 udp 17 6 src=106.241.133.16 dst=139.178.70.122 sport=25891 dport=53 src=139.178.70.122 dst=106.241.133.16 sport=53 dport=25891 mark=0 zone=0 use=2
ipv4 2 udp 17 13 src=139.178.70.122 dst=176.197.251.160 sport=40452 dport=123 src=176.197.251.160 dst=139.178.70.122 sport=123 dport=40452 mark=0 zone=0 use=2
But it is running CentOS 7.9.2009.
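
A listing like the one above can be summarised with a few lines of Python. This is a sketch that assumes the line layout shown in the listing (protocol name in the third field, the first `dport=` token belonging to the original direction); the function name is invented for the example.

```python
def summarize_ntp_conntrack(lines, ntp_port=123):
    """Count tracked UDP flows whose original destination port is the NTP
    port, and how many of those are still marked [UNREPLIED]."""
    total = unreplied = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[2] != "udp":
            continue
        # The first dport= token describes the original direction of the flow.
        dport = next(
            (f.split("=", 1)[1] for f in fields if f.startswith("dport=")),
            None,
        )
        if dport != str(ntp_port):
            continue
        total += 1
        if "[UNREPLIED]" in fields:
            unreplied += 1
    return total, unreplied


# Two entries copied from the listing above.
sample = [
    "ipv4 2 udp 17 30 src=139.178.70.122 dst=62.128.1.19 sport=43808 dport=123 "
    "[UNREPLIED] src=62.128.1.19 dst=139.178.70.122 sport=123 dport=43808 "
    "mark=0 zone=0 use=2",
    "ipv4 2 udp 17 17 src=139.178.70.122 dst=82.141.152.3 sport=53334 dport=123 "
    "src=82.141.152.3 dst=139.178.70.122 sport=123 dport=53334 mark=0 zone=0 use=2",
]
print(summarize_ntp_conntrack(sample))  # (2, 1)
```

On a real host the lines would come from /proc/net/nf_conntrack, which normally requires root to read. On monsjc1 these flows are the monitor's own outbound queries, so an [UNREPLIED] entry means the queried server had not answered by the time of the listing.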

I’m using the default Ubuntu configuration and haven’t made any modifications related to Linux connection tracking, so I don’t believe that is the cause. I also uninstalled the NetData monitoring service, which could have been interfering in some way, but nothing changed.

I am currently receiving about 6-7k UDP packets per second and sending about 3-4k packets per second over IPv4 (I don’t know why the numbers of packets received and sent differ so much).
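
To narrow down where the difference arises, the kernel's own UDP counters can be sampled twice and turned into rates. This is a rough sketch that assumes the standard two-line `Udp:` layout of /proc/net/snmp on Linux; the function names are invented for the example, and these counters cover IPv4 only (the IPv6 equivalents live in /proc/net/snmp6).

```python
import time


def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header and value lines of /proc/net/snmp into a dict."""
    rows = [ln.split() for ln in snmp_text.splitlines() if ln.startswith("Udp:")]
    if len(rows) < 2:
        raise ValueError("no Udp: header/value pair found")
    header, values = rows[0], rows[1]
    return dict(zip(header[1:], (int(v) for v in values[1:])))


def udp_rates(before, after, seconds):
    """Per-second change of every counter present in both snapshots."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in before
        if name in after
    }


def measure(interval=10.0, path="/proc/net/snmp"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_udp_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_udp_counters(f.read())
    return udp_rates(before, after, interval)
```

Calling `measure()` on the droplet returns rates for `InDatagrams`, `OutDatagrams`, and the error counters. If `OutDatagrams` roughly matches `InDatagrams`, the NTP daemon is answering and the responses are being lost after they leave the host. A rising `RcvbufErrors` would instead point at requests being dropped before the daemon reads them.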

Because my NTP IPv6 server has been removed from the Pool, I am currently receiving about 3 packets per second and sending 1 packet per second.

I’ve performed a simple test using the online NTP Server Test, and most IPv6 requests are failing. Only occasionally does the tool return a positive result.

image

Here is the unfiltered packet capture: Gofile - Free file sharing and storage platform

I appreciate your continued support.

Good news: I did some performance testing with DigitalOcean Cloud Firewalls turned off, and it looks like they really are responsible for dropping much of the outbound UDP traffic.

I also used a stress test tool that triggers multiple NTP requests in a short time. With Cloud Firewall enabled, some requests were simply unanswered, but with Cloud Firewall disabled, a larger number of responses arrived.
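
For anyone who wants to repeat this kind of test without a dedicated tool, here is a minimal sketch of a burst test in Python. It is not the tool I used. It sends a number of plain NTP client requests (first byte 0x23: version 4, mode 3) and counts the server-mode replies that come back within a time limit. The function name and defaults are arbitrary.

```python
import socket
import time

# 48-byte NTP request: LI=0, VN=4, Mode=3 (client), all other fields zero.
NTP_REQUEST = b"\x23" + bytes(47)


def burst_test(host, count=50, port=123, wait=2.0, family=socket.AF_UNSPEC):
    """Send `count` NTP requests back to back and return how many
    server-mode replies arrive within `wait` seconds."""
    info = socket.getaddrinfo(host, port, family, socket.SOCK_DGRAM)[0]
    af, socktype, proto, _, addr = info
    answered = 0
    with socket.socket(af, socktype, proto) as sock:
        for _ in range(count):
            sock.sendto(NTP_REQUEST, addr)
        deadline = time.monotonic() + wait
        while answered < count:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(512)
            except socket.timeout:
                break
            # Mode is the low three bits of the first byte; 4 means server.
            if len(data) >= 48 and (data[0] & 0x07) == 4:
                answered += 1
    return answered
```

For example, `burst_test("2001:db8::1", family=socket.AF_INET6)` would test an IPv6 address (the address here is a documentation placeholder). Keep in mind that many NTP servers rate-limit clients that send bursts, so some missing replies can be expected even without a firewall in the path. Comparing the count with the firewall on and off is more meaningful than the absolute number.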

I believe that, just like the Ubuntu stateful firewall, which stores tracking data in the conntrack table, DigitalOcean Cloud Firewalls probably also keep some kind of state table, which could degrade performance over time depending on the packet rate.

DigitalOcean Cloud Firewall ON:

image

DigitalOcean Cloud Firewall OFF:

image