I want to add my two cents here.
- The “Net Speed” setting is misleading.
It took me quite some time digging through the forums to understand it. Newcomers, especially those with little time (in big companies, etc.) or facing a language barrier, will just set it to their server’s wire speed and afterwards struggle with either a low score (and blame the monitoring system) or broken server networking (due to too much traffic). Eventually they will decide to remove the server and never come back.
I would suggest using a different name here, such as “Weight”, “Relative Bandwidth”, or whatever makes more sense. I am not a native speaker, so you may find a better one than what I have in mind. At the very least, the help text on the right should be changed to state explicitly that this is NOT the wire speed.
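To illustrate why the wire speed is the wrong value to enter, here is a rough sketch. It assumes the pool hands out servers in proportion to their net speed setting relative to the total for the zone, which is my understanding from the forums and not something I have checked in the pool's code. The function name, server names and numbers are all made up.

```python
def expected_share(net_speeds):
    """Return each server's fraction of the zone's traffic.

    Assumption: clients are distributed in proportion to the
    net speed setting, so only the ratio between servers matters.
    """
    total = sum(net_speeds.values())
    return {name: speed / total for name, speed in net_speeds.items()}


# Hypothetical zone: three modest servers and one newcomer who entered
# the wire speed of a 1 Gbps link (all values in the same made-up unit).
zone = {"a": 10_000, "b": 10_000, "c": 10_000, "newcomer": 1_000_000}

for name, share in expected_share(zone).items():
    print(f"{name}: {share:.1%} of the zone's queries")
```

Under that assumption the newcomer would receive roughly 97% of the zone's queries, no matter what the link can actually sustain. In a small zone this is how a "correct" wire speed ends up overloading the server.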
- ntpd is broken under high load.
This is actually hard to figure out and I struggled for quite some time. Please see the following chart:
Before the red bar my server had a broken firewall setup. From then until the green bar, it ran a single ntpd instance with iptables notrack. That ntpd instance dropped so many packets that the monitoring queries were dropped from time to time. For details please refer to my other thread: Dropped packets
Between the green and yellow bars it was running rsntp, which unfortunately produced blank lines in the CSV log and doomed the score. Since the yellow bar it has been running chrony and has not had a problem again.
I’m curious whether the other 1Gbps server admins are aware of the issue at all, since it can easily be mistaken for a monitoring system problem (the networking is fine, a few manual queries to the server will mostly work, and traceroutes to the monitoring station are okay in both directions). I suggest trying chrony if your server is in the same situation.
If this later turns out to be a common problem, I would suggest changing the software recommendation pages.
- The system should tolerate broken net speed settings.
I believe @LeoBodnar and everyone else in the thread are willing to help with the issues here, and most of us now have a good understanding of what’s going on in the China pool. However, there are many server admins who are unaware of the forum, don’t have time to read it, or simply don’t understand the setting. For a volunteer project I think it is important to make the system more tolerant of the fact that many servers are set to a wrong net speed. This will still be the case even if the “net speed” name is changed.
For these broken settings, would it be a good idea for the pool admins to go ahead and just lower the net speed setting for them? The problem is easy to identify: the server keeps going down right after reaching a score of 10. I know some server admins want to keep their load low, so raising the setting on their behalf would be very inappropriate, but I don’t see a problem in the opposite direction. If this turns out to be effective, we could even automate it.
In case it’s a human resource problem, I can offer some help here.
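To make the automation idea a bit more concrete, here is a minimal sketch of the check I have in mind. It is not how the pool works today. It assumes a list of recent monitoring scores per server is available and that a score of 10 is the point where the server starts receiving pool traffic. The function names, the window of 6 samples, the limit of 3 cycles and the "halve it" rule are all hypothetical choices of mine.

```python
from typing import Sequence


def count_overload_cycles(scores: Sequence[float],
                          threshold: float = 10.0,
                          window: int = 6) -> int:
    """Count how often the score rises to `threshold` and then falls
    below it again within the next `window` samples."""
    cycles = 0
    for i in range(1, len(scores)):
        crossed_up = scores[i - 1] < threshold <= scores[i]
        if not crossed_up:
            continue
        following = scores[i + 1:i + 1 + window]
        if any(s < threshold for s in following):
            cycles += 1
    return cycles


def suggest_net_speed(current: int, scores: Sequence[float],
                      min_cycles: int = 3, floor: int = 1) -> int:
    """Halve the setting for a server that keeps collapsing right after
    it enters the pool. The setting is never raised."""
    if count_overload_cycles(scores) >= min_cycles:
        return max(floor, current // 2)
    return current


# Made-up score histories for illustration only.
flapping = [5, 11, 12, 4, 8, 10.5, 3, 9, 10, 2]
healthy = [5, 8, 11, 14, 17, 20, 20, 20]

print(suggest_net_speed(1000, flapping))
print(suggest_net_speed(1000, healthy))
```

Because the sketch only ever lowers the value, it cannot push more load onto an admin who deliberately chose a small setting.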