Server access dead time characterisation

My NTP server has been online for two or three months now with 100% system uptime. I made some initial comments in the server operators list, surprised at the monitoring scores. Steve Sommars got in touch, requesting some tcpdump logs. I did this for a few days and it showed up some connectivity issues. To investigate NTP filtering, Steve requested that I set up a Time Protocol (UDP port 37) server. This confirmed his theory that NTP was being filtered: the port 37 polls ran at nearly 100% success rate, while the NTP (port 123) polls were down around the 80% level in some cases, with lots of dropouts.

I was interested in characterising the client/server path in a bit more detail, so I wrote a simple utility to poll a server and, on no reply, log the current time, then continually re-poll at intervals until a reply is received and calculate the elapsed time…

Results are interesting. Round-trip time from the UK to monwrl is consistent at around 75 ms when the path is good, but initial polls often require a retry after the initial 1-second timeout. Two retries will catch most cases, but 10 retries is frequent and, worst case, I have seen 80-100 retries, over a minute of dead time, before monwrl can again be reached. I need to add some code to save the data to a file for plotting, which may illustrate any time-of-day or day-of-week factors in the results.

The utility is command-line driven (ntpbin ip port poll_rate timeout retries) and is written in C, with a standard Makefile for GNU C and make. I'm running FreeBSD 12 here, but it also builds on Linux and Solaris 10. I can Dropbox a tar.gz if anyone else is interested in playing with the code…
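
The core of it is roughly the following. This is a stripped-down sketch of the idea rather than the full ntpbin source: IPv4 only, no poll_rate handling, and nothing saved to a file yet.

```c
/* Sketch of the poll / dead-time loop described above: send a
 * client-mode (mode 3) NTP query, and if there is no reply keep
 * re-polling until one arrives, then report how long the server
 * was unreachable.
 * Build: cc -o deadtime deadtime.c
 * Usage: ./deadtime <ipv4> <port> <timeout_secs> <max_retries>
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>

static int poll_once(int fd, const struct sockaddr_in *srv, int timeout_s)
{
    unsigned char pkt[48] = { 0x23 };   /* LI=0, VN=4, mode=3 (client) */
    unsigned char buf[48];
    struct timeval tv = { timeout_s, 0 };
    fd_set rfds;

    if (sendto(fd, pkt, sizeof pkt, 0,
               (const struct sockaddr *)srv, sizeof *srv) < 0)
        return -1;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;                      /* timed out: no reply */
    return recv(fd, buf, sizeof buf, 0) >= 48 ? 0 : -1;
}

int main(int argc, char **argv)
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s ip port timeout retries\n", argv[0]);
        return 1;
    }
    struct sockaddr_in srv = { .sin_family = AF_INET };
    srv.sin_port = htons((unsigned short)atoi(argv[2]));
    inet_pton(AF_INET, argv[1], &srv.sin_addr);

    int timeout = atoi(argv[3]), retries = atoi(argv[4]);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (poll_once(fd, &srv, timeout) == 0) {
        printf("reply to first poll\n");
        return 0;
    }
    time_t dead_start = time(NULL);     /* no reply: dead time starts here */
    for (int i = 1; i <= retries; i++) {
        if (poll_once(fd, &srv, timeout) == 0) {
            printf("reply after %d retries, ~%ld s dead time\n",
                   i, (long)(time(NULL) - dead_start));
            return 0;
        }
    }
    printf("still no reply after %d retries\n", retries);
    return 1;
}
```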

Chris

Hi, that is to be expected: NTP uses UDP, the “User Datagram Protocol”, tongue-in-cheek also called the “Unreliable Datagram Protocol”. UDP makes no guarantees of order or delivery of packets.

But I agree, the monitoring should at least retry once before penalising hosts in the pool.

I see a different aspect on my host: packets over IPv4 take a different path from the monitor than packets over IPv6, leading to a measured offset of about 1.8 ms between the graphs of the same host.
I hope that nobody with a dual-stacked server gets penalised and removed from the pool because the monitor thinks it deviates too much.

Thanks. UDP is unreliable, but NTP has been designed to mitigate the worst effects of that. In practice, a client polls a group of servers, and the loss of any single server for a short period, probably up to a few minutes, has little effect on the client’s clock accuracy. The Mills book has the details, but the algorithms that drive it are really quite smart.

Apparently, the present monitoring system polls 3 times at 5-second intervals. Without criticising the code etc., this seems a little wooden, where an approach better matched to the path characteristics and NTP requirements could work better. Tests here suggest that an initial timeout of 1 or 2 seconds would catch most cases, but several re-polls/retries at, say, one-second intervals would have a better chance of mitigating temporary path failure. The key thing is multiple retries, which would improve the odds of a poll and reply getting through. Code to do that would be a page or two of C, so not difficult to implement.
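
To make that concrete, here is one way the existing 15-second budget could be spent; the numbers are purely illustrative, not a claim about what the monitor actually does:

```c
/* Hypothetical probe schedule for a 15-second monitoring budget:
 * one attempt with a longer initial timeout, then short retries
 * until the budget is used up.  Values are illustrative only. */
#include <stdio.h>

int main(void)
{
    const int budget_s        = 15;  /* total time spent per server          */
    const int first_timeout_s = 2;   /* initial timeout catches most paths   */
    const int retry_timeout_s = 1;   /* later retries at a one-second rate   */

    int elapsed = 0, attempt = 0;
    while (elapsed < budget_s) {
        int timeout_s = (attempt == 0) ? first_timeout_s : retry_timeout_s;
        printf("attempt %2d starts at t=%2d s, waits up to %d s\n",
               attempt + 1, elapsed, timeout_s);
        elapsed += timeout_s;
        attempt++;
    }
    printf("=> %d attempts fit in the %d-second budget\n", attempt, budget_s);
    return 0;
}
```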

This obviously doesn’t fix the case where the path is down for a minute or more, but that could be resolved as well. Monitoring is of no use if it’s not accurate…

Chris

If the monitoring is to be accurate, it should send exactly one packet every 1024 seconds, to match the behavior of NTP client daemon polling (or up to one packet every 64 seconds to get a higher resolution picture of reachability).

Non-daemon NTP clients like ntpdate don’t want an accurate NTP pool monitor. 3 polls in 15 seconds is already very generous. A server that responds after 3 tries is OK for one-shot clients (like ntpdate) that just want to fire a few packets in a short burst and never ask for time again. The current monitor behavior is intended to model this kind of client (ntpdate sends 4 packets in 8 seconds) as this kind of client is most sensitive to short-term outages.

NTP daemons send their one packet, and if there’s no response, they’ll send a second packet 17 minutes later. If there’s no response in 136 minutes, they’ll ask the pool for a different server, or just live without time from that server, depending on the daemon software.

The NTP monitor and real NTP client daemons can widely disagree about the quality of an individual NTP server because the different distribution of packets over time triggers different router behavior, especially in stateful firewalls and layer 3 routers. If an NTP server is behind a gateway that can’t route UDP replies until after the 2nd or 3rd packet, then it’s 100% unreachable as far as an NTP daemon client is concerned. Such a server can only be used by ntpdate.

The best way to deal with individual path failures is to poll from multiple places (ideally from where the clients are, but if that’s not possible then more diverse is still better). We’ve known that since before the beginning of the NTP pool. 14 years after that, we have multiple monitors in beta. At this rate of development, they’ll be fully implemented some time around 2032.


What makes you so optimistic? :grinning:

Thanks for the reply. You mention 3 polls at 5-second intervals as being “generous”, but if that is the official line, it does explain part of the misunderstanding here. I’ve been assuming that the point of the monitor was to establish server availability, by whatever means necessary, but perhaps I was mistaken.

Let’s analyse a bit more: we know the path quality is chaotic, but accuracy is essential for monitoring to mean anything at all. Otherwise, why bother? A monitor would be expected to make a lot more effort than clients to contact a server, perhaps weighted by distance. Ideally, as you imply, we could do with monitors everywhere, but that’s impractical. Basic comms theory tells you that link integrity is proportional to signal-to-noise ratio, and UDP over the Internet is very noisy in that respect. NTP clients compensate for that in software, but monitoring makes little effort to compensate at all, so it’s not a real-world evaluation.

What monitoring is effectively doing at present is evaluating the path, not the server, which is why there are such wide swings in the scoring as the path quality varies. Path quality swamps the evaluation, with a negative bias with distance. I find it difficult to believe that is what was originally intended, as it really tells you nothing about the server at all, or its availability.

As an aside, I can no longer reach monwrl, other than the occasional “port 123 not reachable” reply. I don’t know if this is intentional or not, but the whole point of testing against the monitor node was to evaluate the path quality from a monitor’s point of view. It’s obviously not an absolute reciprocal test, but the results do seem to have some time correlation with the scoring graphs. That’s UK to east coast US, but the path to west coast servers is even worse, with, worst case, a minute or more of retries at a one-second rate before the server can be seen again. There are plenty of other servers to gather data from, though, as I would like to plot dead time against time of day and day of week. Seems to me that such data might contribute something to a better monitoring solution…

Chris

A monitor would be expected to make a lot more effort than clients to contact a server, perhaps weighted by distance.

Why? The monitor’s job is to determine whether an NTP server’s IP address should be returned as the result of a client DNS query. We don’t care about the server at the other end of the path at all, except to the extent that there must be one to generate the reply packet, and it must provide something close enough to correct time, though not with requirements so tight that volunteers have difficulty meeting them.

We can’t care very much about time accuracy–if we did, we’d lose all the servers running in people’s basements on DSL, cable modems, and other asymmetrical jittery slow links. There’s no need for the monitor to be a baremetal host or have a local reference clock–a VPS with a few nearby stratum 1’s is more than sufficient.

At least, that’s the theory. It would make more sense if there were a lot of monitoring nodes and they formed a consensus over a diverse sample of available network paths, but at the moment the monitoring code can only accept input from one sample point at a time. This makes the monitor too sensitive to path issues that are specific to the monitor and it’s why people in arbitrary parts of the world have problems keeping functional NTP servers in the pool.

It’s a non-trivial research project to figure out how to use more than one (not PhD thesis material, but not something obvious and simple to implement at scale either). We’d need to invent a completely new scoring system to support aggregate reachability data from multiple monitors and a new backend to send different subsets of the pool to different DNS servers based on local monitoring results (I suspect the final answer will look like “run almost entirely separate NTP pools in different zones, with only the user signup/admin page in common”).

At the moment only one person can do this work. Ask has nearly zero availability as it is–running a monitoring algorithm research project is not feasible given that constraint. Someone would have to build the whole thing in parallel with the NTP pool, and send a pull request when it’s done, if it is to be done at all.

Since yesterday the NTP pool’s monitor host is down (which is why you can no longer reach it). We’re all waiting for it to come back up. This is what Ask’s near-zero availability looks like.

Now if the goal is to measure how accurate NTP servers are, then sure, you will need deep path latency analysis and multiple entry points into the network and a baremetal host with beefy CPU and a local reference clock and lots and lots of NTP query samples and all the other stuff you suggested. That’s far beyond the NTP pool’s monitoring requirements, though.


Seems to me that server availability and DNS are separate functions. At the lowest level, the monitor’s job is to verify that a server can be reached via its IP address and an NTP poll, nothing else. Since the path is not always known at that level, good comms practice would suggest a smart retry mechanism, much as TCP is effectively UDP with retries. The layers above can make use of that data as required, but for partitioning and modularity reasons each layer should deal with just one function, the simpler the better. As you say, there’s no interest in absolute accuracy at that level, as an NTP client is designed to deal with that itself. Looks like the current monitoring would consider 1 millisecond or few acceptable, but no idea how that figures in the scoring values, or if it is considered at all.

As for regional monitors, finding volunteers to run monitors would probably get many offers, but there are obviously issues of security and trust. Also, how is it managed overall? A central host could interrogate each monitor in turn for results, or the monitors could send the data back asynchronously at intervals. It’s not a trivial project though. Meantime, a solution is needed. Since monitoring already uses retries and would have control of timeouts, it should be a simple task to tweak those values to make the scores more consistent and make more allowance for the path distance. I don’t see why it should need a major rewrite to do that.

If there is a frustrating aspect to this, it’s the fact that the system is 99% there already. I haven’t spoken to, or been contacted by, Ask, but if he was responsible for the system design and coding, it really was a herculean effort. It’s a medium- to large-scale project that must have taken thousands of hours of effort to build and debug. However, what the project has done is to dig itself into a classic key man syndrome hole, and not only that, but the system appears to have been written in Go, which is not exactly mainstream. I can find no system design docs either, which makes any knowledge transfer to future developers and maintainers difficult, other than by reverse engineering the code. One way to get round that might be for Ask to mentor a person of his choice, to bring them up to speed on the system and at the same time generate docs for others. I’m not volunteering for that here, though I may be able to contribute in other ways, such as building systems, putting them online, writing tools and test code, etc.

So, just a few thoughts and observations, not criticism. Stuff happens and projects can take on a life and direction of their own. No blame, but there needs to be a plan to get past the logjams for there to be progress…

Chris

OK, so send one packet every 17 minutes, drop the server score by 1.25 every time there is no reply, so after 8 reply failures the server is temporarily dropped from the pool.

Since the path is not always known at that level, good comms practice would suggest a smart retry mechanism, much as TCP is effectively UDP with retries

This is, intentionally, not how ntp polling works. NTP daemons send one packet only to keep network load down. Some NTP server operators will auto-ban users who send more than the minimum number of packets required by the protocol (it’s easy to do this in either NTP config or firewall config).

ntpdate does do retries. A client that runs ntpdate pool.ntp.org at startup is supported by the pool. Most server operators will allow an ntpdate-style packet burst (but some will not allow the iburst directive, which is 8 packets).

There’s a third class of NTP client which does a single DNS query and keeps trying to use the IP address forever (even if the server goes away). There’s not much we can do about these except try to ensure every DNS response includes at least one long-term stable server.

These transmission patterns are hardcoded, or would require manual intervention on every client device to change. We can’t consider a server reachable if it doesn’t respond successfully to one of these access patterns.

People have often proposed keeping any functional NTP server in the pool no matter how unreachable it is, but there are tradeoffs for this. Users complain when their ntpdate clients can’t get time sync. Operators mistype an IP address in the signup page, and/or select the highest bandwidth setting, and send some random victim enough traffic to knock their host off the network. If a server stops responding to NTP packets, we want them delisted from the NTP pool–and quickly, too. False positives are as bad as false negatives.

So I don’t see how we solve the problem any other way. We can only use the traffic patterns of real NTP clients for monitoring. If we measure anything else, we get meaningless results in the monitoring data. Other proposed solutions involve sending more packets or different types of packets (e.g. tracerouting to try to identify where in the path packets are lost to “correct” the monitoring results). These traffic patterns aren’t identical to NTP client traffic patterns, so they can produce results that don’t match a real NTP client’s experience. Of course the results will be wrong, because they’re measuring something entirely different and using it as a proxy for NTP server reachability.

We are too sensitive to network issues close to the monitor’s end of the network path, and the obvious solution is to use a diverse set of network entry points to eliminate as many common elements of the path as we can on the monitoring side. It’s obvious, and not particularly expensive, either–one monitoring host in every Linode data center would cost less than what I used to pay for cable TV (but don’t do that–it would be better for diversity to get one VPS from every ISP servicing one area than to get one VPS from every area serviced by one ISP). I’d be happy to donate some of these (run them or just pay the bill for them) if the software issues are worked out.

None of this solves problems like an NTP server on an island, where clients on the island can reach the server, but no NTP pool monitor can reach the server, because the island has a poor connection to the outside world and it’s too small to have its own NTP pool monitors. I don’t think we can reasonably solve this problem–if clients on the island can’t reach anything outside of the island, it follows that they can’t reach pool.ntp.org either, nor can clients in the outside world use an NTP server on the island, so arguably the island’s NTP servers should not be in the global NTP pool. We do need to solve this problem when the “island” in question is China or India, though.

Looks like the current monitoring would consider 1 millisecond or few acceptable, but no idea how that figures in the scoring values, or if it is considered at all.

If the NTP offset is out of range it’s treated as if the server failed to reply. Errors as high as 60 ms don’t impact server score, though over 50 ms produces a different color point on the server score graph. I don’t know off the top of my head what the accuracy cutoff is (maybe 100ms?).

If the monitor rejected replies that were only 1 ms off, the NTP pool would be orders of magnitude smaller. Asymmetric path jitter is usually much larger than that. Server operators sometimes (often?) put an NTP server in their residences or small offices, where it’s easy to add a GPS reference clock, but the NTP server is connected to the Internet via a DSL or cable modem. NTP daemons can reduce the offset error, while ntpdate users don’t care about that much accuracy, but the monitor is stuck making a decision with the instantaneous offset value.
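
In rough code terms, the offset handling described above amounts to something like the sketch below. The 50 ms figure is the graph-colour threshold mentioned; the 100 ms rejection cutoff is only my guess, as noted.

```c
/* Rough model of how the monitor appears to treat a measured offset.
 * The 50 ms "different colour" threshold is from the score graphs; the
 * 100 ms rejection cutoff is an assumption, not a documented value. */
#include <math.h>
#include <stdio.h>

enum verdict { SCORED_OK, SCORED_OK_BUT_FLAGGED, TREATED_AS_NO_REPLY };

static enum verdict classify_offset(double offset_ms)
{
    const double flag_ms   = 50.0;    /* above this: shown differently on the graph */
    const double reject_ms = 100.0;   /* guessed cutoff: counted as a failed poll   */

    if (fabs(offset_ms) > reject_ms)
        return TREATED_AS_NO_REPLY;
    if (fabs(offset_ms) > flag_ms)
        return SCORED_OK_BUT_FLAGGED;
    return SCORED_OK;
}

int main(void)
{
    const double samples_ms[] = { 1.8, 42.0, 60.0, 250.0 };
    for (int i = 0; i < 4; i++)
        printf("offset %6.1f ms -> verdict %d\n",
               samples_ms[i], classify_offset(samples_ms[i]));
    return 0;
}
```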

As for regional monitors, finding volunteers to run monitors would probably get many offers,

…provided the stated requirements are sane. A VPS host with reliable access to a stratum 1 would do, particularly if there were enough monitors to build a consensus (i.e. more than 4 monitors). The monitor doesn’t need to keep very accurate time by NTP standards, just good enough to know when an NTP server is so clearly wrong that it should be removed from the pool.

So far, the public descriptions of monitor requirements say “no VPS” which pushes the cost up by an order of magnitude while reducing product availability.

but there are obviously issues of security and trust.

I don’t think the list of IPs and bandwidth selections in the pool is particularly sensitive. It’s already possible to build a fairly extensive list of NTP server nodes for a region simply by hammering the DNS server and counting the number of times each IP comes back. Most of the remaining trust and security issues are already covered by the existing distributed DNS server fleet, i.e. if we can trust people to operate client-facing DNS servers, we can trust the same people to also run NTP monitors.

Also, how is it managed overall

I think there should be a central list of (ip_address, bandwidth, service_zones…) that is replicated to all the regional nodes. That part would be more or less unchanged from the current pool. Each regional node would run its own monitor on every ip address that matches one of its regional service areas (country, continent, and global zones) and feed the results to its local DNS server. This eliminates the need to send realtime monitoring data around except to produce graphs in the UI (and maybe it could be pulled on demand for that).
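
In other words, each entry in that replicated list would need to hold little more than the record sketched below. The field names are invented here for illustration; the real pool schema is whatever the pool's database actually uses.

```c
/* Hypothetical shape of one entry in the replicated server list.
 * Names and sizes are illustrative only. */
#include <stdio.h>

#define MAX_ZONES 4

struct pool_server {
    char ip_address[46];                  /* room for an IPv6 literal      */
    int  bandwidth_kbps;                  /* operator-selected "netspeed"  */
    char service_zones[MAX_ZONES][16];    /* e.g. "uk", "europe", "global" */
};

int main(void)
{
    struct pool_server s = {
        .ip_address     = "192.0.2.1",    /* documentation address */
        .bandwidth_kbps = 10000,
        .service_zones  = { "uk", "europe", "global" },
    };
    /* A regional node would monitor every entry whose zones intersect its
     * own service area and feed the results to its local DNS server. */
    printf("%s (%d kbit) zones: %s %s %s\n", s.ip_address, s.bandwidth_kbps,
           s.service_zones[0], s.service_zones[1], s.service_zones[2]);
    return 0;
}
```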

However, what the project has done is to dig itself into a classic key man syndrome hole

There are ways around that, if it ever became necessary. All the code is in github. In theory, if someone wants to run a region-specific NTP pool, all they have to do is download the code, run some servers, set up a domain name, and call for volunteer server operators. In practice, it turns out there are very few Asks in the world.

China is certainly big enough to run its own pool, and strongly motivated to do so because of network structure issues and years of terrible pool.ntp.org monitoring issues, so I sometimes wonder why they haven’t given up, found their own regional Ask, and built their own NTP pool.

There’s also no rule that says NTP servers or clients can only participate in one pool. Why solve the problem at the center of the network when it can be solved at the edges? Not proposing this, just exploring the extreme ends of the solution space.


This is, intentionally, not how ntp polling works.

Well, perhaps it should :-). If you want the monitor to behave like a client, then it needs to have the same robust algorithms as a real client, but imho, that’s not the function of the monitor. Since it’s not possible to duplicate the variable path quality seen by each of thousands of clients, you have to eliminate that common variable in the equation to get a meaningful result from monitoring. Like trying to decode a signal in a noisy channel, you only see part of the picture otherwise. I used a one-second timeout and one-second poll rate for some tests here, but they could be set to any convenient value. Using the current 15-second time limit, you could have an initial timeout of one or two seconds, then say twelve retries at a one-second rate, or six at two. Statistically, that would have much better odds (yes, that is the right word) of making contact over a noisy channel.

As for servers blocking fast polls, there are only retries until a reply, with the rest either assumed to be dropped or lost in either direction, so pretty much irrelevant. If that is found to be a problem, then tune the timeouts and retry rates to suit.

So, why can’t that be tried?

These transmission patterns are hardcoded, or would require manual intervention on every client device to change. We can’t consider a server reachable if it doesn’t respond successfully to one of these access patterns.

Once again, we seem to be confusing server accessibility evaluation with NTP client operation, which are not the same things.

Users complain when their ntpdate clients can’t get time sync.

I have an old Suse 11.4 install here on a laptop that gets used mainly as a terminal (minicom), but even back then, years ago, ntpdate was deprecated, so how much support should ntp provide for obsolete facilities?

If a server stops responding to NTP packets, we want them delisted from the NTP pool–and quickly, too. False positives are as bad as false negatives.

Why? An NTP client effectively doesn’t care if one of its basket of servers goes down for a while, so why does it need to be so draconian?

We are too sensitive to network issues close to the monitor’s end of the network path, and the obvious solution is to use a diverse set of network entry points to eliminate as many common elements of the path as we can on the monitoring side. It’s obvious, and not particularly expensive, either–one monitoring host in every Linode data center would cost less than what I used to pay for cable TV (but don’t do that–it would be better for diversity to get one VPS from every ISP servicing one area than to get one VPS from every area serviced by one ISP). I’d be happy to donate some of these (run them or just pay the bill for them) if the software issues are worked out.

I wonder if there is a case for a self-contained, standalone, keep-it-simple monitor application? Written in a baseline language like C and buildable anywhere using standard GNU tools (gcc, make, etc.), or even distributed as a binary for various Linux and BSD distros. Make it simple to install to encourage uptake, like an NTP client. It would work from a text file list of server IP addresses and save results to another text file in comma-separated format, for plotting or whatever. All key parameters would be settable via the command line for flexibility. Not too difficult to design and build something like that, and I would be happy to contribute time, design and coding, if a spec could be agreed. I can also run an example here with little added effort over the current NTP server.
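
To put a rough shape on it, the skeleton might be no more than the sketch below: IPv4 only, one query per server per pass, a fixed two-second timeout, and invented file formats, so purely a starting point rather than a finished design.

```c
/* Sketch of the "keep it simple" standalone monitor described above:
 * read server IPs from a text file, send one client-mode NTP query to
 * each, append one CSV line per result.
 * Build: cc -o simplemon simplemon.c
 * Usage: ./simplemon servers.txt results.csv
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>

/* One NTP query; returns round-trip time in ms, or -1.0 on no reply. */
static double probe(const char *ip, int timeout_s)
{
    unsigned char pkt[48] = { 0x23 }, buf[48];   /* VN=4, mode=3 (client) */
    struct sockaddr_in srv = { .sin_family = AF_INET, .sin_port = htons(123) };
    struct timeval t0, t1, tv = { timeout_s, 0 };
    fd_set rfds;
    double rtt = -1.0;

    if (inet_pton(AF_INET, ip, &srv.sin_addr) != 1)
        return -1.0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1.0;

    gettimeofday(&t0, NULL);
    sendto(fd, pkt, sizeof pkt, 0, (struct sockaddr *)&srv, sizeof srv);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0 &&
        recv(fd, buf, sizeof buf, 0) >= 48) {
        gettimeofday(&t1, NULL);
        rtt = (t1.tv_sec - t0.tv_sec) * 1000.0 +
              (t1.tv_usec - t0.tv_usec) / 1000.0;
    }
    close(fd);
    return rtt;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s server_list.txt results.csv\n", argv[0]);
        return 1;
    }
    FILE *in = fopen(argv[1], "r");
    FILE *out = fopen(argv[2], "a");
    if (!in || !out) {
        perror("fopen");
        return 1;
    }
    char line[128];
    while (fgets(line, sizeof line, in)) {
        line[strcspn(line, " \t\r\n")] = '\0';    /* trim the address     */
        if (line[0] == '\0' || line[0] == '#')
            continue;                             /* skip blanks/comments */
        double rtt = probe(line, 2);
        /* CSV: unix_time,ip,reachable,rtt_ms */
        fprintf(out, "%ld,%s,%d,%.1f\n",
                (long)time(NULL), line, rtt >= 0.0, rtt);
    }
    fclose(in);
    fclose(out);
    return 0;
}
```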

None of this solves problems like an NTP server on an island, where clients on the island can reach the server, but no NTP pool monitor…

A bit of a straw man example :-). No system should be expected to deal with extreme cases like that…

Chris

Well, perhaps it should :-). If you want the monitor to behave like a client, then it needs to have the same robust algorithms as a real client, … Once again, we seem to be confusing server accessibility evaluation with NTP client operation, which are not the same things.

I think your confusion comes from not realizing they are exactly the same thing.

TCP treats packet loss as a signal to decrease the packet transmission rate. If it did not, and every client attempted to retry transmissions at the same or a higher rate, a server host would quickly be flooded with packets it couldn’t possibly receive. Classic TCP cut the transmission rate in half every time a dropped packet was observed. TCP must deliver the content of every packet, so TCP retransmits packets. TCP does not have fixed requirements for when packets are transmitted, so it seeks the highest rate available whenever it has packets to transmit.

NTP uses a similar approach, but with a very much narrower range of transmission rates–just one. NTP daemons need a fixed number of packets transmitted at fixed times driven by the NTP PLL algorithm, so NTP always transmits packets at the minimum (and only) rate required for that algorithm. NTP daemons can lose a few packets and still work, so NTP daemons do not attempt any retransmission–the one packet they send every 17 minutes is all they will ever send (initially packets are sent every 64 seconds, and later sent every 1024 seconds, but this is driven by the requirements for timekeeping and not by network packet loss).
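
For reference, a stock ntpd poll interval is 2^tau seconds, with tau normally ramping between the defaults of 6 and 10, which is where the 64-second and 1024-second figures come from:

```c
/* Default ntpd-style poll interval range: 2^6 = 64 s up to 2^10 = 1024 s.
 * The real daemon moves the exponent based on clock stability, not a
 * simple counter; this only shows the range of intervals involved. */
#include <stdio.h>

int main(void)
{
    for (int tau = 6; tau <= 10; tau++)
        printf("poll exponent %2d -> one packet every %4d seconds\n",
               tau, 1 << tau);
    return 0;
}
```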

When NTP clients observe packet loss, they will first try to do without the lost packets and use different servers for time sync (the recommended 4 servers includes one potentially dead server). After a very long outage, modern clients will go back to DNS to try to find a different server to bring the client’s working server count back up to 4. While this occurs, the NTP pool should be directing new clients away from servers that are dropping packets, otherwise the client might get the same dead server again.

We want the NTP clients to drop servers that can’t respond. It is the core of NTP’s congestion avoidance algorithm. The NTP pool monitor should use the same packet loss threshold as the clients–if the monitor is different in any way, it will either waste server resources by not utilizing servers that clients would accept, or it will send NTP clients to servers the NTP clients will reject.

NTP daemons only need a 12.5% query reply rate–one reply to 8 queries from the PLL algorithm. ntpdate requires only 25%. These are extremely achievable targets on the modern Internet. If an NTP server can’t deliver that, something is wrong with it, and it should be removed from the pool until it’s fixed. Common examples of such problems include the application of stateful firewalls that cannot track all the NTP client sessions, or the use of a hosting provider that aggressively discards NTP packets, or a bandwidth setting several orders of magnitude too high.
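
That one-in-eight figure falls out of the daemon's 8-bit reach register: it shifts left on every poll, the low bit is set when a reply arrives, and the peer is only declared unreachable once all eight bits are zero. A toy illustration:

```c
/* ntpd-style 8-bit "reach" register: one reply within the last eight
 * polls keeps the register non-zero, i.e. a 12.5% reply rate is enough
 * to stay "reachable".  Printed in octal, as ntpq does. */
#include <stdio.h>

int main(void)
{
    unsigned reach = 0;
    /* Simulated poll outcomes: 1 = reply received, 0 = no reply. */
    const int replies[] = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
    const int n = sizeof replies / sizeof replies[0];

    for (int i = 0; i < n; i++) {
        reach = ((reach << 1) | replies[i]) & 0xff;
        printf("poll %2d: reach = %03o (%s)\n",
               i + 1, reach, reach ? "reachable" : "unreachable");
    }
    return 0;
}
```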

Unfortunately, the current NTP monitor itself has problems hitting this target with significant parts of the Internet. This needs to be fixed for the NTP pool to work properly.

Statistically, that would have much better odds (yes, that is the right word) of making contact over a noisy channel.

The point of monitoring is to remove servers from the pool whose channels are too noisy to be reachable by standard NTP clients. The point of the single-query-single-response NTP design is to keep the channels free of noise from retries and other non-time-synchronizing packets.

The current monitor has only one channel, and it’s full of nearly continuous noise. We can try to measure that noise and subtract it from the signal, but that’s some pretty heavy signal processing, and the active sensing required to make it happen from a single network entry point will itself add more noise in the same channel.

If we used multiple channels, we could easily and reliably correlate them to extract signal, without adding enough noise to change the signal. The trick is to build a diverse channel set, but that’s easy to do with volunteers who live in diverse places setting up monitor nodes near themselves.

As for servers blocking fast polls, there are only retries until a reply, with the rest either assumed to be dropped or lost in either direction, so pretty much irrelevant.

This is not how blocking works. If the first reply is lost, and the client retries, the server will not send another reply until the client’s sending rate drops to an acceptable level. Each retry restarts the ban timer.

Since it’s not possible to duplicate the variable path quality seen by each of thousands of clients, you have to eliminate that common variable in the equation to get a meaningful result from monitoring.

It’s not necessary to duplicate every possible path, just a representative sample of them that is big enough to keep the error rate below an acceptable threshold.

Like trying to decode a signal in a noisy channel, you only see part of the picture otherwise

This is like trying to take a picture of our house from across the street, but there is a tree in our yard blocking our view. We can take a lot of photos of the tree in front of the house, hoping the wind will blow leaves and branches around so that we can see some part of the entire house in each photo, and photoshop them all together to get a picture of the house, but to do this we need to be able to figure out which pixels in each individual photo are tree and which are house, which is an AI research project. The trunk is probably not going to move very much with just wind, so there are parts of our house that we can never see this way, even if we solved all the other problems.

We could do all that, or…we could just pick up our camera and move it so the tree isn’t in the way of our photo.

Most client network paths are very similar. Most of the reachability problems on a global scale occur in a handful of places (like underwater cables and oversubscribed/underfunded network peering links), so we only need to have two nodes on either side of those. We don’t even need to know where those problems are in the network–we can discover them automatically, and filter them out of the monitoring signal, by analyzing the monitoring data.

A couple of years ago, during another conversation about this topic, I built a proof-of-concept multi-monitoring system. I recruited a number of monitoring hosts with connections as diverse as I could beg, buy, borrow, or steal. I included residential broadband access points, multiple national carriers, carriers in multiple nations, nodes in business offices with commercial regional ISPs, nodes in friends’ and relatives’ homes, nodes in data centers. I purchased a few days of VPS hosting in data centers around the world. To simulate a “stolen” node, I set a monitor up in a relative’s apartment, using a coffee shop’s open wifi across the street (I did obtain permission to use this network beforehand, but the technical implementation would have been identical if I had not).

I had my pool of monitors monitoring a set of 50 NTP servers around the world, randomly chosen from the NTP pool so that I would have NTP pool monitoring data for comparison. I put their server scores from the NTP pool monitor and my own NTP monitors in a grid with an NTP monitor in each column and an NTP server in each row, colored each cell according to how reachable that monitor thought that server was, and made a stack of such grids over time.

Correlations along rows indicate server-side outages (every monitor sees the host is down because the host is down). Correlations along columns indicate failing or redundant monitors (the monitor thinks every server is down because the monitor is disconnected from everything). Inverse correlations along part of a column indicate a net split–monitors on one side of the split disagree with monitors on the other side of the split about the reachability of servers on the other side of the split. Noise indicates network congestion (can appear in either columns or rows, which identify which end of the path is most affected). Statistical algorithms can automatically analyze the data and recommend places to prune or expand the monitor pool, but given the small size of the data set I just turned the data into an animated visualization and analyzed it with my eyeballs.

I went through a few cycles of pruning redundant monitors whose reachability data correlated too strongly with other monitors, and spinning up monitors in new locations to bring the total back to 10. I kept monitor nodes that could reach servers that other monitors could not reach, and discarded monitors that could only see the same servers or fewer servers than other monitors. After two weeks I had 13 monitors returning 3-5 unique reachable server sets (depending on where the threshold of uniqueness was set and the time interval compared).

So we definitely don’t need thousands of monitors for the NTP pool–half a dozen could be enough, if they are in the right places. Double that number keeps things working when there are server failures and changes in global network structure (over time, some monitors will become redundant as network paths merge, and some new paths will be created requiring monitors in new locations to test them). If monitors provide duplicate results, discard all but the cheapest one and try new monitor locations.

I ranked the NTP monitors by how different their scores were from the consensus reachability (the highest reachability score for each server observed by any 2 monitors). The NTP pool’s production monitor was ranked 13th out of 13, i.e. it was the worst.
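
(In code terms, that consensus figure is essentially the second-highest score in each server's row, i.e. the best result that at least two monitors can vouch for. Something like the sketch below, with invented numbers:)

```c
/* Consensus reachability per server: the best score that at least two
 * monitors agree on, i.e. roughly the second-largest value in each
 * server's row.  The grid below is invented data for illustration. */
#include <stdio.h>

#define NMON 4
#define NSRV 3

static double consensus(const double *row, int n)
{
    double max1 = -1.0, max2 = -1.0;
    for (int i = 0; i < n; i++) {
        if (row[i] > max1)      { max2 = max1; max1 = row[i]; }
        else if (row[i] > max2) { max2 = row[i]; }
    }
    return max2;   /* highest score seen by at least two monitors */
}

int main(void)
{
    /* scores[server][monitor]: fraction of successful polls, 0..1 */
    const double scores[NSRV][NMON] = {
        { 0.99, 0.97, 0.98, 0.60 },   /* one monitor behind a bad peer */
        { 0.95, 0.10, 0.96, 0.94 },   /* ditto, a different monitor    */
        { 0.20, 0.15, 0.25, 0.18 },   /* genuinely unreachable server  */
    };
    for (int s = 0; s < NSRV; s++)
        printf("server %d consensus reachability: %.2f\n",
               s, consensus(scores[s], NMON));
    return 0;
}
```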

If we’re looking for a quick and easy fix for the NTP monitor, one possible solution is to move the NTP pool monitor to any randomly chosen data center in North America. 93% of the monitor locations in my experiment were better than where the monitor is now, and I was trying to find bad ones. I put “stolen wifi” in my monitoring pool because I was trying to build a monitor that was worse than Newark…and failed (I could have achieved it with an iptables rule to create 50%+ packet loss, but that would have been cheating).

ntpdate was deprecated, so how much support should ntp provide for obsolete facilities?

I mostly agree with you here, but there’s still a lot of single-shot NTP traffic in the pool (maybe not ntpdate any more, but smartphone apps with ntp query libraries embedded in them are apparently common).

It doesn’t matter much whether the cutoff is 75% unreachable (NTP daemon) or 87.5% unreachable (ntpdate)–both are extremely high packet loss rates by current Internet standards.

An NTP client effectively doesn’t care if one of its basket of servers goes down for a while, so why does it need to be so draconian

Clients do care. Dropping a server with a true packet loss rate over 50% isn’t draconian. If it’s much higher than that, clients will drop it themselves.

Also note that if monitoring is done correctly (along diverse paths) it’s not very draconian at all, because single-packet queries do accurately reflect server reachability if you are doing them from somewhere near the intended clients (like on the same side of the planet). Also, each monitor sends its own query packet, so we would get those extra tries to get a response that you were asking for, just in a much more useful way.

When it’s done incorrectly (from a single location next to a bad peer), a lot of good hosts get kicked out of the pool. The current NTP pool monitor was able to reach about 5% fewer servers than the other monitor nodes in my experiment. Traceroute said packets to all of the failing NTP servers were sent through the same peer right next to the Newark data center.

Of course there are lots of other problems with my experiment–too short, not enough nodes, not enough continents, I ran it two years ago instead of this week…but it’s enough to see the shape of what we are not seeing when we monitor the pool from only one point.

I wonder if there is a case for a self-contained, standalone, keep-it-simple monitor application

For my experiment I just put all the NTP servers in chrony, scraped reachability data from chrony sources, and used some perl to turn it into tables. Not the most efficient way to do it, but I only had a million cells of data to worry about, and I didn’t want to spend more than a few hours to build the data collection and visualization tools.

NTP server on an island…
No system should be expected to deal with extreme cases like that

Well, I agree it’s unsolvably hard for the NTP pool to do it with DNS service; however, it’s a trivial problem to solve using an anycast service. If, everywhere on the Internet, 123.456.78.90 was the IP address of a nearby working NTP server with +/- 50 ms accuracy, then the island just needs to drop one NTP server there, and configure its routers to rfc1546 it, and nobody needs to talk to the NTP pool’s DNS server. OK, we’d need 4 IP addresses, but you get the idea.


Bit late getting back to this, sorry, but other work to do.

I think your confusion comes from not realizing they are exactly the same thing.

I think we will have to agree to differ on that :-). As I said before, that idea is only valid if there are enough monitors to cancel out the effects of the path, which is not the case. From the amount of work you seem to have done, it looks like that was recognised some time ago, so why did you give up on it?

Some sort of distributed monitoring could solve the problem, perhaps even as an optional install for ntp itself, so that every server could optionally become part of the monitoring network.

Unfortunately, the current NTP monitor itself has problems hitting this target with significant parts of the Internet. This needs to be fixed for the NTP pool to work properly.

Statistically, that would have much better odds (yes, that is the right word) of making contact over a noisy channel.

The point of monitoring is to remove servers from the pool whose channels are too noisy to be reachable by standard NTP clients. The point of the single-query-single-response NTP design is to keep the channels free of noise from retries and other non-time-synchronizing packets.

Has any analysis been done with regard to the effect that a modified retry and timeout regime would have on traffic density, or is it just gut feeling?

But again, unless you can cancel the path effect, which is unique and indefinable for every connection, you really have no idea if a server is up and reachable or not. That is the fundamental fact that seems to be ignored.

I’ve been looking at the grundclock site recently, and the results from that make even less sense than the old site. The old site is consistently pessimistic, while the new site gives upper teens / 20 from California (unreal), and the Amsterdam site, much closer, is usually in low single figures?

Chris

For packet loss it is the network topology that counts, not the geographical distance. Between the monitoring station in LA and the monitored server there is no transit network with NTP rate limiting, unlike the case of the monitoring station at Amsterdam.

For packet loss it is the network topology that counts, not the geographical distance. Between the monitoring station in LA and the monitored server there is no transit network with NTP rate limiting, unlike the case of the monitoring station at Amsterdam.

As described upthread, I was curious to measure how long a server remains unreachable after an initial failed request. I wrote a simple utility to measure the initial round-trip time and also, where the server is unreachable, to measure the dead time until it can be reached again. Tests done here confirm what seems obvious: path reliability degrades significantly with distance, with more connections, switches and hosts in the path.

Located in the UK. Polls to UK servers average out at tens of milliseconds max, no retries; to the east coast US, 75 ms, some retries, but circuits to the west coast US are very unreliable, with anything up to one or two minutes of dead time before the node is again reachable. So whatever the theory, there is definitely a correlation between path reliability and distance.

If I quibble about the blanket statement of “no transit network with NTP rate limiting”, it’s because you cannot possibly know that, since the path can only be known for the particular client/server pair under test, and not for other pairs. So why are the results from the test site apparently so inverted?

I joined ntp.org expecting to find a lively and open group, teamwork, in-depth discussions about technical issues, a plan for future development and timescales to fix faults. Probably pushing my luck, but let’s review what I have found.

In practice, what I have found is an atmosphere of aloofness and secrecy that discourages any input and involvement from volunteers, and an unwillingness to consider suggestions for improvement. Checking back through the community archives, input from volunteers is either ignored, deflected, or met with technical replies that, if questioned, get no further reply at all. Blanked. There’s little information as to who runs the project, whether it is intended to be an open source project with leadership and governance, or whether commercial interests or influence are involved.

Many projects that start out as academic research projects draw upon volunteer goodwill and effort, but when a project grows to perhaps millions of users, there’s a responsibility to the volunteers and users to ensure that it works equitably and is transparently managed, with documentation and source listings, so that many eyes can incrementally improve the system. This is how open source teamwork works to advantage, but I see little evidence of that here.

Volunteers to the project devote time, resources and, yes, money to building and maintaining a server. No one expects any reward for that, but when there are obvious problems, it’s reasonable to expect that they should be addressed in a timely manner. Otherwise, why should volunteers make the effort, if there’s no support from the center? Excuses like “being too busy” really don’t cut it for a production system depended on by so many. Having worked freelance in industry for decades, I know situations like this would not last five minutes. Faults would either be fixed, or resources would be thrown at them until they were.

All in all, a bit of a disappointment. Nice web presentation, but little substance in some areas, so how about a plan and a bit of leadership to take it all forward? Apologies if I seem to be holding feet to fire here, but still a lot of work to do…


ntp.org isn’t the same as ntppool.org - guessing you mean the latter?

Yes, that’s correct…

Bit late getting back to this, sorry, but other work to do.

Not as late as me :wink:

Has any analysis been done with regard to the effect that a modified retry and timeout regime would have on traffic density, or is it just gut feeling?

The pool monitor cannot innovate new NTP protocols. It can only follow the traffic pattern of existing clients. The further it deviates from that pattern (or fails to keep up with changes in the pattern as deployed software changes over the years), the less useful it is as a monitor for clients that can only use the standard NTP traffic pattern.

Obviously people can and do run their own private experiments and build hacked up NTP clients and try avant garde new protocols (NTP packets in JSON format, anyone?), but as I understand it, there has been no effort to make changes to NTP client behavior wrt packet timing and timeouts and get them deployed at scale.

20 years ago, NTP clients couldn’t change NTP servers on the fly–they were stuck with the results of their first DNS queries more or less until reboot. Now clients can eliminate servers that can’t respond to 8 packets every few hours, and switch to new ones. That change did get implemented at scale. So I guess the answer to your question must be yes, someone did an analysis at some point, and the recent NTP client software’s behavior is the result of that analysis (“recent” being about a decade ago).

east coast US, 75 ms, some retries, but circuits to the west coast US are very unreliable

Transatlantic routing success depends heavily on who carries your packets to a US coast. One peer is awesome, another peer is awful, and their offices can be a few blocks apart. Assuming the packet arrives at all, it then has to cross that big empty space between the coasts…

up to one or two minutes of dead time before the node is again reachable

This is really common. I dug into someone’s NTP monitor problem a few years ago, and found a peer near the Newark monitor that was busted (multi-minute periods of severe traffic loss, while my other nodes in Newark but at different companies with different peering had zero problems at the same times). Just this morning someone posted here about Zayo, which is, yup, still broken.

Clearly, nobody cares about that network path–it’s either not being monitored, or the company behind it can’t afford to improve it. Broken paths on the Internet can stay broken for years, especially at national and geographic (and occasionally corporate oligopoly) boundaries. It is just a feature of the Internet that we all have to design our globally distributed server pools around.

There’s no need to tolerate that sort of issue when we have access to today’s commercial hosting market. $5/mo VPS nodes can monitor the NTP pool and they’re available everywhere that matters. If a node is underperforming a little, add a new node somewhere else and redirect traffic to avoid the bad one. If a node is underperforming a lot, send a courtesy note to the owner of that node to advise them their node is not performing as well as its peers, and give them time to fix it, but if nothing changes, go ahead and drop the node. Ultimately, it’s the node owner’s responsibility to resolve their own network issues. If the issues aren’t getting resolved, the tenant’s responsibility is to become a tenant of someone who is better at resolving network issues.

My guess is that the NTP pool monitor started using the Newark monitor node because it’s free with the sponsorship, and keeps using it because nobody has time to move it. It took days to reboot that node the last time it went down.

From the amount of work you seem to have done, it looks like that was recognised some time ago, so why did you give up on it?

I didn’t give up on it–I finished it. I built a prototype to work out how to solve the various accuracy, cost, and management problems and confirm or refute various theories floating around here about how monitoring should work and what network issues a distributed monitor can easily see vs a non-distributed monitor, and even whether those issues actually exist on the Internet today because there was some doubt about that (OK I may have been a little bored, and also this isn’t my first multi-continent network monitoring project). I posted my findings. That project is now done: the prototype served its research and educational purposes, and is no longer needed.

The obvious follow up project would be to build a production multi-node monitoring system based on the prototype, and connect it to the NTP pool project; however:

  • I don’t have (or want) power to deploy anything in the NTP pool infrastructure myself.
  • I don’t have the time or resources to successfully fork the entire project just to replace the monitor.
  • I have seen no recent evidence of capacity within the project to integrate and manage any kind of architectural change (mostly for obvious structural reasons–all the work is on the least available person, and there are no visible signs of recruiting).
  • I have not seen an emerging viable competitor project I could contribute to instead (there are several large sort-of-public NTP server pools, but most of them are run by large corporations that own all their server nodes and don’t need external help).
  • The collection of niche languages and tools the NTP pool project is built on does not intersect with the collection of niche languages and tools I normally work with and can use well.

Until any of those facts changes, I don’t see a way to build a production version of the prototype that anyone will ever get to use.

If someone wants to ask questions while they build their own multi-node NTP pool monitor and put it into service, I’ll be happy to answer them. Similarly, if someone wants to know how to make NTP pool monitoring even worse, I’ll be happy to answer that too. :wink:

Apologies if I seem to be holding feet to fire here, but still a lot of work to do…

Preaching to the choir…
