My home server was briefly offline yesterday and today due to me fiddling with the NIC driver.
Looks odd to see the outages as “blanks”. I think they used to show red score dots to show lost packets. Was this change intentional?
Also, the text “Jul 23” repeated six times on the horizontal scale isn’t very informative.
My first thought is that “blanks” are right, because the monitor can’t reach your server and doesn’t get any data.
All the red dots are back! Thanks @avij.
Last weekend I modernized ~all of the frontend JavaScript and how it’s built & deployed, and in the rewrite I missed a subtlety with zero-offset data.
@NTPman oops on the x-axis legend. I’ll figure that out.
Thanks!
As an aside, in case anyone wonders: as mentioned, I’m debugging one particular NIC driver and making my own development builds of it. I may need to reboot the server frequently during this development period, which may lead to getting a new DHCP lease more often than usual. The server’s IP address has changed again. I’m not putting this server in the main pool because it doesn’t have a static address, but it’s good enough for the test pool. Once the situation with the NIC driver stabilizes, I’d expect the IP address to stay more constant.
Barely working servers are great for the test pool. More scenarios to test!
I’ve written a simple NTP simulator (probably not the only one out there) that lets you tweak things however you like:
It was originally built for personal use, so there’s no documentation or anything, but it should be fairly self-explanatory.
Feel free (you or anyone else) to add an instance to the beta pool.
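For readers curious what such a simulator boils down to: here is a rough sketch in Python of answering NTP requests with a configurable stratum, precision and leap indicator. This is not the actual simulator (which is undocumented), just an illustration of the idea, and the parameter names are assumptions.

# Minimal, illustrative NTP responder sketch (not the real simulator).
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def to_ntp_timestamp(unix_time: float) -> int:
    """Convert a Unix timestamp to a 64-bit NTP timestamp (32.32 fixed point)."""
    seconds = int(unix_time) + NTP_EPOCH_OFFSET
    fraction = int((unix_time % 1) * (1 << 32))
    return (seconds << 32) | fraction


def build_response(client_packet: bytes, stratum: int = 2, precision: int = -20,
                   leap: int = 0, ref_id: bytes = b"XFUN") -> bytes:
    """Build a 48-byte NTP v4 server (mode 4) response."""
    li_vn_mode = (leap << 6) | (4 << 3) | 4        # LI, version 4, mode 4 (server)
    # Echo the client's transmit timestamp back as the originate timestamp.
    originate = struct.unpack("!Q", client_packet[40:48])[0]
    ts = to_ntp_timestamp(time.time())
    return struct.pack(
        "!BBbbII4sQQQQ",
        li_vn_mode, stratum, 6, precision,
        0, 0,                                      # root delay / dispersion (16.16)
        ref_id,
        ts,                                        # reference timestamp
        originate,                                 # originate timestamp
        ts,                                        # receive timestamp
        ts,                                        # transmit timestamp
    )


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 1123))                   # unprivileged port for testing
    while True:
        packet, addr = sock.recvfrom(512)
        if len(packet) >= 48:
            sock.sendto(build_response(packet), addr)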
I fixed the x-axis legend (it took two tries) and the negative score display.
Done!
Expect ‘horror’:
I’ve configured it as shown below, but it is already complaining about ‘bad stratum’, so perhaps I should configure that differently. What is the highest stratum that is deemed valid?
DISCLAIMER:
I can’t tell how long I can let it run like this, so future readers may miss out on the fun.
{
"port": 123,
"debug": true,
"min_poll": 6,
"max_poll": 10,
"min_precision": -29,
"max_precision": -20,
"max_ref_time_offset": 60,
"ref_id_type": "XFUN",
"min_stratum": 1,
"max_stratum": 15,
"leap_indicator": 0,
"version_number": 4,
"jitter_ms": 10,
"drift_model": "random_walk",
"drift_ppm": 50.0,
"drift_step_ppm": 50.0,
"drift_update_interval_sec": 10
}
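For anyone wondering what a config like this could translate to, here is a hypothetical sketch of a random-walk drift model. Since the simulator is undocumented, the semantics I assume for drift_ppm, drift_step_ppm, drift_update_interval_sec and jitter_ms are guesses, not the real implementation.

# Hypothetical random-walk clock model matching the config fields above.
import random


class RandomWalkClock:
    def __init__(self, drift_ppm=50.0, drift_step_ppm=50.0,
                 update_interval_sec=10, jitter_ms=10):
        self.drift_ppm = random.uniform(-drift_ppm, drift_ppm)  # current drift rate
        self.drift_step_ppm = drift_step_ppm
        self.update_interval_sec = update_interval_sec
        self.jitter_ms = jitter_ms
        self.offset_sec = 0.0                                    # accumulated offset

    def step(self):
        """Advance the simulated clock by one update interval."""
        # Accumulate offset at the current drift rate (ppm = microseconds/second).
        self.offset_sec += self.drift_ppm * 1e-6 * self.update_interval_sec
        # Random-walk the drift rate itself.
        self.drift_ppm += random.uniform(-self.drift_step_ppm, self.drift_step_ppm)

    def reported_offset_sec(self):
        """Offset a client would observe, with per-query jitter added."""
        return self.offset_sec + random.gauss(0, self.jitter_ms / 1000.0)


clock = RandomWalkClock()
for _ in range(6):          # simulate one minute at 10-second steps
    clock.step()
    print(f"{clock.reported_offset_sec() * 1000:+.2f} ms")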
For the x-axis legend I prefer YYYY-MM-DD (e.g., 2025-07-25).
The new status page looks really nice.
Hi @marco.davids – this is really helpful!
Your broken server actually helped identify a color problem with the charts. When everything’s failing like yours was, it became clear we couldn’t distinguish between different types of issues - everything just looked “broken.”
I changed the colors:
Scores are now purple/dark blue for negative steps (was red)
Offsets are hot pink when too high (was red)
Orange remains for “in between” values
The changes should also work better for colorblind users (tested with Sim Daltonism).
This also highlighted a quirk in the new selector: no monitors get chosen as primary because none are working. Selection uses recent “step” values, not scores, so once you fix the server it should recover (though it’ll take some time).
The stratum cutoff is 7; stratum 8+ gets negative steps.
(Edit: oh, the client had a lower threshold for just throwing an error. I’m updating the client to allow stratum up to 10 before it returns the test as an error, but the scoring will still require 7 and below. This all feels a little dubious. I think the limit is there because in the past I observed a correlation between high stratums and “soon the offsets will be nuts”, but the scoring system should react faster to the offsets now, so this is less necessary.)
In development, I’ve tightened scoring to require ≤25ms offset for perfect scores and ~100ms to stay in the pool. Production currently uses 75ms/250ms. In practice it’s stricter since timeouts or network blips will pull you down if you’re borderline. Your tester will be great for validating this.
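Purely as an illustration (not the pool’s actual scoring code), the thresholds described above could fit together roughly like this:

# Illustrative sketch of how the stratum and offset thresholds might combine.
PERFECT_OFFSET_MS = 25.0      # development values; production is 75/250 ms
MAX_OFFSET_MS = 100.0
MAX_CLIENT_STRATUM = 10       # client returns the test as an error above this
MAX_SCORED_STRATUM = 7        # scoring requires stratum 7 or below


def score_step(offset_ms: float, stratum: int) -> float:
    """Return a hypothetical per-test step for the server's score."""
    if stratum > MAX_CLIENT_STRATUM:
        raise ValueError("bad stratum")          # test counted as an error
    if stratum > MAX_SCORED_STRATUM:
        return -1.0                              # negative step
    abs_offset = abs(offset_ms)
    if abs_offset <= PERFECT_OFFSET_MS:
        return 1.0                               # full credit
    if abs_offset <= MAX_OFFSET_MS:
        # Linearly shrinking credit between the two thresholds.
        return 1.0 - (abs_offset - PERFECT_OFFSET_MS) / (MAX_OFFSET_MS - PERFECT_OFFSET_MS)
    return -1.0                                  # too far off to stay in the pool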
Yeah, me too. I fixed it up for now to use UTC and 24-hour clocks, but the YYYY-MM-DD format looked weird(er) when there’s at most 1.2 days of data shown.
I have a branch extending the data API to allow selecting date ranges, so graphs can cover longer periods. I haven’t figured out the downsampling yet; it’s pretty tricky with the amount of data. What I’ve been fussing with is using an average or median, having data points for the 5th/95th percentile offset values (high and low), and figuring out how to represent the score and maybe the volume of tests (or filter out monitors with too few tests in a period). A rough sketch of the idea follows below.
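As a sketch of that downsampling (the data layout here is assumed; nothing in it is the actual data API), grouping offset samples into time buckets and keeping the median plus a 5th/95th percentile band could look like this:

# Rough downsampling sketch: median + p5/p95 band per bucket, sparse buckets dropped.
from statistics import median, quantiles


def downsample(samples, bucket_sec=3600, min_tests=5):
    """samples: iterable of (unix_ts, offset_sec); returns one point per bucket."""
    buckets = {}
    for ts, offset in samples:
        buckets.setdefault(int(ts // bucket_sec) * bucket_sec, []).append(offset)

    points = []
    for bucket_ts in sorted(buckets):
        offsets = buckets[bucket_ts]
        if len(offsets) < min_tests:
            continue                              # too few tests to be meaningful
        q = quantiles(offsets, n=20)              # 5% steps -> q[0]=p5, q[18]=p95
        points.append({
            "ts": bucket_ts,
            "median": median(offsets),
            "p5": q[0],
            "p95": q[18],
            "count": len(offsets),
        })
    return points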
I might need to change the monitoring system to collect the current status of the monitor so we can filter on that after the fact, which will most likely lead to another bunch of yak shaving.
I see you implemented something I was going to suggest: with the much larger number of monitors, put them in two columns rather than one very long one (and move the overall score back to the top of the list). Nice, also with the delay (or RTT?) added!
Maybe the spacing in the table could be improved a bit. At least on a small-screen device (tablet) there is line wrapping, even though there would be enough space to avoid it (the graph takes up more horizontal space anyway). And even if everything were on one line, a bit more space before the score would improve legibility (at least I found that with the interim “thin” one-column design that directly preceded the two-column one; not sure what the spacing will be once the line wrapping is addressed).
I have a few more small nits for your consideration, but didn’t get around to compiling them today. A few off the top of my head:
That’s all that came to mind right now. Have another item for the monitors page, will post that tomorrow. And maybe more minor nits I forgot above.
@ask, I see you’ve already done some updates, looking good!
Very small items only on the server details page:
Thoughts regarding the monitors management page:
I understand some of the items mentioned are really small ones, and many, if not most, people wouldn’t care about them or even notice them. I happen to notice such things, so I’m just sharing them for your consideration. And I understand that the pages should look nice, and that a lot of effort could go into perfecting the web design. But the Pareto principle obviously applies, certainly for the visual aspects of the pages (and I’d rather have you spend your precious time on functional improvements).
A bit related to the status shown in the respective monitor cards: Even after reading some of the descriptions in the GitHub repository, I am not sure how/whether the status shown in the monitor cards relates to the status that a monitor has for a specific server, as shown in the table on a server’s detail page.
I guess that while the status word is similar (e.g., monitor status “testing” for not-yet-approved monitors vs. the “testing” category of monitors on the server detail pages), the two are separate things: a monitor that is not yet approved will already poll servers so its performance/health/suitability can be assessed, but obviously will not (and should not) be considered for actually evaluating/scoring a server.
I am just a bit confused, as I have the impression that sometimes some of my not-yet-approved monitors appeared in some servers’ “candidate” section. I haven’t found a pattern yet, though, as to when that happens versus the apparently more common case of not-yet-approved monitors not showing up on a server’s page at all (which currently makes more sense to me).
Similarly, I sometimes had the impression, even somewhat recently, that (some of) my own monitors were also appearing on some of my servers’ pages (which I understood should not happen with the new constraint system). But I couldn’t find a pattern, and I have not seen that in the most recent past, so it might not be happening anymore today.
Lots of text/thoughts, probably too long already, so I’ll leave it at that, and for your consideration.
EDIT: I just added a new server to the beta system, and multiple of my own monitors are among the “Candidate” or “Testing” monitors, including some that have not been approved yet. So maybe I misunderstood, and the “constraints” are not applied initially, but only over time as part of the overall selection mechanism, as other kinds of input data (e.g., measurement data) become available to complete the picture in all dimensions. Let’s see how that evolves…
Thanks for the list! Feel free to file GitHub issues, too; it might be easier to track (lots of small issues are fine). I’m going through the list as carefully as I can, but feel free to call out if I missed or misunderstood anything.
You might also be able to prototype fixes in the web inspector if you want.
I’ll push these changes now and continue on the rest of the list later!
I usually prefer that as well, exactly for those reasons (and it shows the intent better than a verbal description). Unfortunately, I continue to struggle to understand how the templating system dynamically assembles the pages, let alone modern web technologies…
I can’t immediately find it right now, and another example of a deleted server looks as intended. I’ll share it should I stumble across an example again. In any case, the point was less about the data no longer being there for a deleted server, and more generally about what is shown when something goes wrong (earlier examples in various threads were typically caused by some temporary issue in the backend for a server that did still exist). E.g., here, but also more recent examples.
As mentioned, some items are really just small things and not pressing, including the alignment topic. Maybe something for another day, when the more pressing functional aspects are done…