So we DO have a problem with monitoring, right?


Reading through a number of posts on the apparent problems with the monitoring system, I gather that something is actually going on and that it may be why my server keeps getting kicked out of the pool as soon as its score reaches 10 or 11. For once I had the opportunity to watch it “live” today, and as far as I can tell with my beginner’s knowledge and some tools like iftop, there was no outrageous traffic at the time it dipped below 10 (about 5 kb/s on port 123; my limit is set at 512kb/s). Mrulist does not show any particularly aggressive client. I confirmed that the config follows the recommendations on the Pool Project web site, and nothing has changed on that server for the past 2 years. I am located in Canada, near Montréal. The server is a small embedded device running OpenWrt in my network DMZ. Sooooo…

1- Can anyone confirm that what has been happening for the past few months is “normal” given the situation with the monitoring system?

2- Any recommendations as to what I could do on my side to further debug the problem?

3- Is there anything I could do to help?

Thank you for your time and I hope to be able to keep my little part of the internet on time! :wink:

What’s the IP(s)? What is the monitoring system complaining about?

If your server’s score is only going down when the score is above 10, it sounds like it’s failing under the traffic load – often because of an overloaded connection tracking firewall or NAT.
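To check the connection tracking theory, something like the following rough sketch could help. It assumes a Linux-based box (OpenWrt included) with the nf_conntrack module loaded, which exposes its counters under /proc/sys/net/netfilter. The function name `conntrack_usage` is just made up for this example.

```sh
#!/bin/sh
# Report how full the netfilter connection tracking table is.
# The default paths are the usual Linux locations; they only exist
# when the nf_conntrack module is loaded (an assumption here).
# Both paths can be overridden by passing them as arguments.
conntrack_usage() {
    count_file="${1:-/proc/sys/net/netfilter/nf_conntrack_count}"
    max_file="${2:-/proc/sys/net/netfilter/nf_conntrack_max}"

    if [ ! -r "$count_file" ] || [ ! -r "$max_file" ]; then
        echo "conntrack counters not readable" >&2
        return 1
    fi

    count=$(cat "$count_file")
    max=$(cat "$max_file")

    if [ "$max" -le 0 ]; then
        echo "invalid nf_conntrack_max" >&2
        return 1
    fi

    # Integer percentage of the table currently in use.
    echo "conntrack: $count / $max entries ($((count * 100 / max))% full)"
}
```

Run `conntrack_usage` with no arguments while the score is climbing past 10. If the percentage gets close to 100 around the time the score drops, or `dmesg` shows `nf_conntrack: table full, dropping packet`, the firewall is the likely culprit rather than the link itself.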

If the score is shaky no matter what it is, it could be a problem with the monitoring system, or anything else.

IP: From the csv, I see I/O Timeout errors when the score is at or above 10, and then it goes down. I agree with you that it does look like traffic load, but I have had my speed set at 512kb since the beginning and have only been having problems in the past few months. Any hints or suggestions on how I could prove that it is indeed traffic? Like I mentioned, I had iftop running when the score got to 10 about 1 hour ago, and the max traffic I saw was around 4-5kb/s. As far as the firewall is concerned, there has been no change on that front for the last couple of years.

There is definitely some monitoring problem!
I have the same problem to/from Sweden: it worked flawlessly for 2+ years, and now I have been out of the pool for the last 2-3 months.
Mine goes down even before reaching 10 most of the time.
I set up my own external monitoring every 5 minutes from another Linux machine at home, as a cron job:

*/5 * * * * /usr/sbin/ntpdate -q >>/tmp/ntptest.log 2>>/tmp/ntptest.err

I get about 0.03% errors from my own monitoring, versus around 30% from the ntppool monitoring…
So, I think my server is fine.
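For anyone who wants to put a number on their own external checks like this, here is a minimal sketch. It assumes the probe command exits non-zero when it gets no usable answer, which is my understanding of how `ntpdate -q` behaves, but please verify that on your system. The function names and the "epoch ok/fail" log format are invented for this example and are not ntpdate output.

```sh
#!/bin/sh
# Run a probe command and append "<epoch> ok" or "<epoch> fail" to a log.
# The probe command is supplied by the caller; success or failure is
# judged purely by its exit status (an assumption for this sketch).
probe_and_log() {
    log="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$(date +%s) ok" >>"$log"
    else
        echo "$(date +%s) fail" >>"$log"
    fi
}

# Print the percentage of failed probes recorded in the log,
# or "no data" if the log is empty.
error_rate() {
    awk '$2 == "fail" { bad++ }
         END {
             if (NR > 0) printf "%.2f\n", 100 * bad / NR
             else print "no data"
         }' "$1"
}
```

From cron, call `probe_and_log` with a log path followed by the same `ntpdate -q` command and server you already use, then run `error_rate` on that log to get a figure you can compare with the failure rate the pool monitor shows in the csv.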

After several years of always scoring 20, I have suffered regular dropouts every week over the last year. I was out of the pool most of the weekend.
By the time I joined the beta only LA was working, and it shows similar dropouts. I’m in London, UK using LeoNTP. My logs show that background traffic continues normally and the local link is not overloaded.


On my way back to 20 since the Newark monitoring station went online I guess…! So I DID have a problem with monitoring! :wink:

Thanks a lot for all the work and good to be back! :slight_smile:

Looking at, there are +225 servers since yesterday. Nice!

Newark is good news for me as well; I’ve not missed a beat since it went live.

For more than two months I had constant problems with my 5 (Croatian) pool servers getting low scores every afternoon/night in Europe (i.e. daytime in the Americas). Finally, as soon as the new monitoring station started, I got normal results again. From Croatia I never had any problems. The difference between the two monitoring stations is quite big :slight_smile:

There might be a problem on your side, but there’s definitely a problem on the monitoring side. My pool servers have both IPv4 and IPv6 addresses, and while the IPv4 addresses in Singapore and Taiwan keep getting low scores and dropping out of the pool, the IPv6 addresses on the very same hosts stay at or near 20.0. For example, right now the IPv4 addresses for the Taiwan pair show scores of 11.0 and 13.9, but the IPv6 addresses on the same hosts show 19.9 and 20.0.