After chasing my server issues without much progress, I decided the problem is likely on my provider’s side rather than with my servers. I built a couple of new servers at a different provider that mirror the ones I already had.
So far, testing shows that they are much more reliable. I have an EC2 instance that polls all the servers with ntpdate and records the results. In the output below, a line with stratum 0 and zero offset and delay is a poll that got no reply. ntp1 and ntp2 are the old servers (at Linode); ntp3 and ntp4 are the new servers (at Vultr):
[ec2-user@ip-172-30-0-163 ~]$ for H in 1 2 3 4; do echo "-- ntp${H}"; cat ntp${H}.bytestacker.com.log | grep strat | tail -10; done
-- ntp1
server 69.164.201.245, stratum 2, offset 0.060460, delay 0.18123
server 69.164.201.245, stratum 2, offset 0.003616, delay 0.08075
server 69.164.201.245, stratum 0, offset 0.000000, delay 0.00000
server 69.164.201.245, stratum 0, offset 0.000000, delay 0.00000
server 69.164.201.245, stratum 0, offset 0.000000, delay 0.00000
server 69.164.201.245, stratum 0, offset 0.000000, delay 0.00000
server 69.164.201.245, stratum 2, offset -0.011584, delay 0.08273
server 69.164.201.245, stratum 0, offset 0.000000, delay 0.00000
server 69.164.201.245, stratum 2, offset -0.000726, delay 0.05666
server 69.164.201.245, stratum 2, offset 0.001277, delay 0.06134
-- ntp2
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
server 172.104.187.12, stratum 2, offset -0.000296, delay 0.27011
server 172.104.187.12, stratum 2, offset -0.003444, delay 0.26508
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
server 172.104.187.12, stratum 2, offset 0.001656, delay 0.26199
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
server 172.104.187.12, stratum 2, offset -0.001426, delay 0.25827
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
server 172.104.187.12, stratum 0, offset 0.000000, delay 0.00000
-- ntp3
server 149.28.248.90, stratum 2, offset -0.001238, delay 0.05667
server 149.28.248.90, stratum 2, offset 0.000518, delay 0.05803
server 149.28.248.90, stratum 2, offset -0.001013, delay 0.05566
server 149.28.248.90, stratum 2, offset 0.000547, delay 0.05823
server 149.28.248.90, stratum 2, offset 0.000283, delay 0.05843
server 149.28.248.90, stratum 2, offset -0.001173, delay 0.05539
server 149.28.248.90, stratum 2, offset -0.001030, delay 0.05548
server 149.28.248.90, stratum 2, offset 0.000340, delay 0.05827
server 149.28.248.90, stratum 2, offset 0.000332, delay 0.05817
server 149.28.248.90, stratum 2, offset -0.001008, delay 0.05547
-- ntp4
server 149.28.156.244, stratum 2, offset -0.000084, delay 0.25479
server 149.28.156.244, stratum 2, offset -0.001035, delay 0.25525
server 149.28.156.244, stratum 2, offset -0.001262, delay 0.25494
server 149.28.156.244, stratum 2, offset -0.005068, delay 0.26225
server 149.28.156.244, stratum 2, offset -0.001136, delay 0.25523
server 149.28.156.244, stratum 2, offset -0.000106, delay 0.25693
server 149.28.156.244, stratum 2, offset -0.002571, delay 0.25841
server 149.28.156.244, stratum 2, offset -0.001345, delay 0.25487
server 149.28.156.244, stratum 2, offset -0.001028, delay 0.25371
server 149.28.156.244, stratum 2, offset -0.011647, delay 0.27618
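Eyeballing the last ten lines works, but a count over the whole log is easier to compare. Here is a rough sketch, assuming each poll appends one "server ..., stratum N, ..." line to the log and that stratum 0 marks a poll with no reply. The function name summarize_log is made up for this example.

```shell
#!/bin/sh
# Summarize one ntpdate log: total polls, failed polls, failure percentage.
# Assumption: one "server ..., stratum N, ..." line per poll, and
# "stratum 0" means the server did not answer that poll.
summarize_log() {
    awk '
        /stratum/ {
            total++
            if ($0 ~ /stratum 0,/) failed++
        }
        END {
            if (total == 0) { print "no samples"; exit }
            printf "%d polls, %d failed (%.0f%%)\n", total, failed, 100 * failed / total
        }
    ' "$1"
}

# Usage, with the same log names as the command above:
# for H in 1 2 3 4; do
#     printf 'ntp%s: ' "$H"
#     summarize_log "ntp${H}.bytestacker.com.log"
# done
```

Counting by hand over the excerpts above, ntp1 missed 5 of its last 10 polls and ntp2 missed 6 of 10, while ntp3 and ntp4 answered all 10.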
The pool monitoring server is showing similarly positive results. There was one timeout this morning to ntp4, which is in Shanghai… we’ll see if that keeps happening, I guess.
ntp4 hasn’t taken any production traffic yet. We’ll see what happens when it does.
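For anyone who wants to run the same kind of comparison, the polling on the EC2 instance could look something like the sketch below. This is not the exact script I use. It assumes ntpdate is installed and is run in query-only mode (-q), and the LOG_DIR variable, the poll_once function name, and the timestamp comment line are all made up for illustration.

```shell
#!/bin/sh
# Poll each server once with a query-only ntpdate and append the output,
# preceded by a UTC timestamp line, to that server's log file.
# LOG_DIR and the "# <timestamp>" line are assumptions for this sketch.
LOG_DIR="${LOG_DIR:-.}"

poll_once() {
    for host in "$@"; do
        {
            date -u '+# %Y-%m-%dT%H:%M:%SZ'
            # -q only queries the server; it does not set the local clock.
            ntpdate -q "$host" 2>&1
        } >> "$LOG_DIR/$host.log"
    done
}

# Example: run from cron every few minutes with the servers to compare.
# poll_once ntp1.bytestacker.com ntp2.bytestacker.com \
#           ntp3.bytestacker.com ntp4.bytestacker.com
```

Writing one log per hostname keeps the later grep and tail over each ntpN log simple.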