Monitoring from clients on more OSes (e.g. macOS)

Some people use Mac mini machines as home servers – they are after all quiet and low power – and home vantage points do provide much-needed views in countries with complex network conditions (cough cough, CN). I happen to have such an M4 Mac mini, so I will try to document what I run into as I get the monitor client running there. (I do have an actual retired server, but the power bills really hurt.)

I think I will need to write about:

  • Building binaries (done, nothing to write about really)
  • Writing launchd service files
  • Getting system time synced well enough (monitor uses local system time offsets)
  • Possible non-static IP woes on router restart?

Current status: Program runs and acts correctly. Network conditions prevent effective communication with the upstream and are producing way too many “local clock might not be okay” messages.


Build

Clone followed by ./scripts/run-goreleaser is sufficient (.goreleaser.yml already includes darwin arm64… maybe this is already planned?), though I imagine go build may be better. I am getting v0.0.0 because I was cloning from a master-only fork made by GitHub. Gotta see if it causes issues down the line.

A test-run (still using my main account, will migrate later) looks alright.

export MONITOR_STATE_DIR=/Users/a/work/.ntppool-agent
screen -- dist/ntppool-agent_darwin_arm64_v8.0/ntppool-agent --env prod monitor
dist/ntppool-agent_darwin_arm64_v8.0/ntppool-agent --env prod setup

The website says v0.0.0, which seems to be cosmetic only. Will fix later.
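
My guess is that the v0.0.0 comes from the fork carrying no tags: as far as I know, goreleaser takes the version from the latest git tag and falls back to v0.0.0 in snapshot builds when it finds none. A minimal sketch for checking that from inside the clone. The helper name latest_tag is made up here, and the upstream remote and its URL are placeholders for the original repository.

```shell
#!/bin/sh
# Print the most recent tag reachable from HEAD, or "none" if there is no tag.
latest_tag() {
    git describe --tags --abbrev=0 2>/dev/null || echo none
}

# Usage inside the clone (remote name and URL are placeholders):
#   latest_tag                      # "none" would explain the v0.0.0
#   git remote add upstream <URL of the original repository>
#   git fetch upstream --tags
#   latest_tag                      # should now print a real tag
```

If the tags were indeed missing, a rebuild after fetching them should pick up a proper version, assuming the release script does nothing unusual with versioning.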

Local time

Without time sync the local time is, as expected, not okay:

time=2026-04-28T17:10:40.016+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=ntp.nict.jp ip=2001:df0:232:eea0::fff4 err="offset too large: 245.82372ms" trace_id=0c4e82ddf78362113c9bde429c48af7f span_id=267cb8e9d4897da4
time=2026-04-28T17:10:40.334+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=time.apple.com ip=2620:149:a00:4000::31 err="offset too large: 240.746021ms" trace_id=0c4e82ddf78362113c9bde429c48af7f span_id=267cb8e9d4897da4
time=2026-04-28T17:10:40.342+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=usscz2-ntp-002.aaplimg.com ip=2620:149:a0c:4000::1f2 err="offset too large: 235.658443ms" trace_id=0c4e82ddf78362113c9bde429c48af7f span_id=267cb8e9d4897da4
time=2026-04-28T17:10:40.538+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=ntp1.net.berkeley.edu ip=2607:f140:ffff:8000:0:8006:0:a err="offset too large: 217.03777ms" trace_id=0c4e82ddf78362113c9bde429c48af7f span_id=267cb8e9d4897da4

So I installed ChronyControl. It’s obviously better, but still dubious:

time=2026-04-28T17:33:05.085+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=ntp.nict.jp ip=133.243.238.164 err="offset too large: -23.368166ms" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:05.389+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=ntp.inet.tele.dk ip=193.162.159.194 err="offset too large: -20.050336ms" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:05.472+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=ntp1.net.berkeley.edu ip=2607:f140:ffff:8000:0:8006:0:a err="offset too large: -29.038715ms" trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:05.594+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=time.fu-berlin.de ip=130.133.1.10 err="offset too large: -13.301816ms" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:08.108+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=ntp.nict.jp ip=2001:ce8:78::2 err="offset too large: -35.483382ms" trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:08.452+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=ntp.ripe.net ip=193.0.0.229 err="offset too large: 10.19097ms" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:08.583+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=ntp.ripe.net ip=2001:67c:2e8:14:ffff::229 err="offset too large: 50.487256ms" trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:11.413+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=sesto4-ntp-002.aaplimg.com ip=17.253.38.43 err="offset too large: 31.715595ms" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:16.574+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=time.google.com ip=216.239.35.12 err="network: i/o timeout" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:16.644+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=time.google.com ip=2001:4860:4806:8:: err="network: i/o timeout" trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:16.661+08:00 level=WARN msg="local-check failure" env=prod ip_version=v4 server=ntp.stupi.se ip=192.36.143.234 err="network: i/o timeout" trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:16.661+08:00 level=INFO msg=local-check env=prod ip_version=v4 failures=7 threshold=5 hosts=12 trace_id=98dfe77fb148b2e87563b90b1d6b61d4 span_id=287a49ca621b9f2d
time=2026-04-28T17:33:16.662+08:00 level=INFO msg="local clock might not be okay" env=prod ip_version=v4 monitor_ip=116.232.127.10 waiting=3m0s
time=2026-04-28T17:33:16.686+08:00 level=WARN msg="local-check failure" env=prod ip_version=v6 server=uklon5-ntp-002.aaplimg.com ip=2a01:b740:a16:4000::31 err="network: i/o timeout" trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:16.686+08:00 level=INFO msg=local-check env=prod ip_version=v6 failures=5 threshold=3 hosts=8 trace_id=f087385e8dce97448a72c86f1ab3b559 span_id=ffea86a4f742d98d
time=2026-04-28T17:33:16.686+08:00 level=INFO msg="local clock might not be okay" env=prod ip_version=v6 monitor_ip=240e:b8f:3aab:c100:8cc2:9ff9:f907:7e4b waiting=3m0s

Well what can I say, that’s China Telecom for you.
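
To separate real clock offsets from plain packet loss, the failures can be tallied by IP version and error kind. A rough sketch that reads agent log lines on stdin and relies only on the fields visible in the log above. The function name tally_local_check is my own, not something the agent provides.

```shell
#!/bin/sh
# Count "local-check failure" lines per IP version and error kind.
# Lines that are not local-check failures (e.g. the INFO summaries) are ignored.
tally_local_check() {
    awk '
        /msg="local-check failure"/ {
            ver = "unknown"; kind = "other"
            # Pick up the value of the ip_version=... field.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^ip_version=/) ver = substr($i, 12)
            if ($0 ~ /err="offset too large/) kind = "offset"
            else if ($0 ~ /i\/o timeout/)      kind = "timeout"
            count[ver " " kind]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Usage: tally_local_check < agent.log
```

For the run above this comes to 5 offset and 2 timeout failures on v4, and 3 offset and 2 timeout failures on v6. The offset failures alone therefore already equal the reported thresholds (5 and 3); I have not checked whether the agent counts reaching the threshold as a failure.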

Launchd service

Create a new user like so…

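# Group records take their numeric ID in PrimaryGroupID (UniqueID is the user-record attribute)
sudo dscl . -create /Groups/_ntpmon PrimaryGroupID 423
alias U='sudo dscl . -create /Users/_ntpmon'
# Create the user record, then fill in its attributes
U
U PrimaryGroupID 423
U UniqueID 423
U UserShell /bin/bash
U NFSHomeDirectory /var/lib/ntppool-agent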

A little migration…

sudo mv /Users/a/work/.ntppool-agent /var/lib/ntppool-agent
ln -s /var/lib/ntppool-agent /Users/a/work/.ntppool-agent
sudo chown -R 423 /var/lib/ntppool-agent
sudo cp dist/ntppool-agent_darwin_arm64_v8.0/ntppool-agent /opt

In /Library/LaunchDaemons/org.ntppool.agent-prod.plist

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>EnvironmentVariables</key>
        <dict>
                <key>MONITOR_STATE_DIR</key>
                <string>/var/lib/ntppool-agent</string>
        </dict>
        <key>KeepAlive</key>
        <dict>
                <key>Crashed</key>
                <true/>
                <key>SuccessfulExit</key>
                <false/>
        </dict>
        <key>Label</key>
        <string>org.ntppool.agent-prod</string>
        <key>Nice</key>
        <integer>-20</integer>
        <key>ProcessType</key>
        <string>Interactive</string>
        <key>Program</key>
        <string>/opt/ntppool-agent</string>
        <key>ProgramArguments</key>
        <array>
                <string>ntppool-agent</string>
                <string>--env</string>
                <string>prod</string>
                <string>monitor</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>StandardErrorPath</key>
        <string>/tmp/ntppool-agent.err.log</string>
        <key>StandardOutPath</key>
        <string>/tmp/ntppool-agent.out.log</string>
        <key>UserName</key>
        <string>_ntpmon</string>
</dict>
</plist>

Then I used LaunchControl to start it, which should be equivalent to sudo launchctl load -w /Library/LaunchDaemons/org.ntppool.agent-prod.plist.
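
For reference, launchctl load -w is the legacy spelling; recent macOS versions also accept the bootstrap/bootout subcommands. A sketch of the command-line route, using the label and plist path from above. The function names are mine, and the ownership step reflects my understanding that launchd rejects daemon plists that are not owned by root or are writable by others.

```shell
#!/bin/sh
PLIST=/Library/LaunchDaemons/org.ntppool.agent-prod.plist
LABEL=org.ntppool.agent-prod

# Validate the plist, fix ownership/permissions, then load it into the system domain.
agent_load() {
    plutil -lint "$PLIST" &&
    sudo chown root:wheel "$PLIST" &&
    sudo chmod 644 "$PLIST" &&
    sudo launchctl bootstrap system "$PLIST"
}

# Show launchd's view of the job (state, last exit status, environment).
agent_status() {
    sudo launchctl print "system/$LABEL"
}

# Stop the job and remove it from the system domain.
agent_unload() {
    sudo launchctl bootout "system/$LABEL"
}
```

Running agent_load and then agent_status is a quick way to confirm the job was picked up and is running as _ntpmon.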

Not exactly sure what you mean by this, but the monitors pretty much don’t care about the IP address anymore once they are set up. Neither does the system currently (apart from the detection of monitoring of a co-located server not working anymore when the address has changed - with the upside that you’ll have a monitor that’ll always score your server well, almost no matter what :laughing:).

Good luck with your endeavor! We can use some more monitors in the CN zone, and other under-represented zones and regions of the globe.

I am actually rather surprised that there is currently still only a single monitor in the CN zone (that I can see), after so many complaints over time about how monitoring servers in the CN zone from the outside puts them at a disadvantage due to the “complex network conditions”, as you so aptly put it. The new monitoring was, after all, built at least in part to address those concerns, and similar ones from other zones or arising from other challenging network conditions.

Does anybody have any thoughts as to why that is, why there aren’t any more monitors in the CN zone?

If support in setting one up is needed, feel free to reach out.

You’re right! It really does not seem to care much about the IP.

why there aren’t any more monitors in the CN zone?

Well, the “complex” conditions can be hostile to many things, including monitor client-server communication:

time=2026-04-28T18:29:55.430+08:00 level=INFO msg="traces export: Post \"https://api-buzz.mon.ntppool.dev/v1/traces\": processor export timeout"
time=2026-04-28T18:30:03.412+08:00 level=ERROR msg="mqtt connect" err="failed to connect to mqtts://mqtt.ntppool.net:1883/: context deadline exceeded"
time=2026-04-28T18:30:23.414+08:00 level=ERROR msg="mqtt connect" err="failed to connect to mqtts://mqtt.ntppool.net:1883/: context deadline exceeded"
time=2026-04-28T18:30:37.103+08:00 level=INFO msg="IPv4 protocol status changed" env=prod appconfig-manager.previous=false appconfig-manager.current=true appconfig-manager.status=testing appconfig-manager.ip=
time=2026-04-28T18:30:37.103+08:00 level=INFO msg="IPv6 protocol status changed" env=prod appconfig-manager.previous=false appconfig-manager.current=true appconfig-manager.status=testing appconfig-manager.ip=
time=2026-04-28T18:30:43.381+08:00 level=WARN msg="mqtt status publish error" err="connection with the MQTT server is currently down"
time=2026-04-28T18:30:43.416+08:00 level=ERROR msg="mqtt connect" err="failed to connect to mqtts://mqtt.ntppool.net:1883/: context deadline exceeded"
time=2026-04-28T18:31:02.349+08:00 level=INFO msg="mqtt connection up" clientID=cnsha1-3m66cyt.prod
time=2026-04-28T18:31:05.512+08:00 level=INFO msg="MQTT subscription completed successfully" reasons="\x01"
time=2026-04-28T18:31:07.380+08:00 level=INFO msg="failed to upload metrics: Post \"https://api-buzz.mon.ntppool.dev/v1/metrics\": reader collect and export timeout"

And this is fiber-to-the-home. The problem is not between me and the carrier, but between the carrier and the outside. [1] The monitor’s README does after all say “good IPv4 and/or IPv6 internet connectivity” and, well, that’s just not what you get on residential.

[1] The “outside” is not just overseas (though residential priority on the international exits is particularly bad). There are well-known problems between carriers in the same country too, because they bill each other for the data transit and everyone wishes to pay less. There is even intentional throttling between different provincial subsidiaries of the same carrier because, again, they have to pay each other.

Because of the data-transit bills mentioned in [1], the carriers love calculating your upload-download ratio and throttling your upload if you go out of line. Uploads are a significant cost for the carriers because some people join P2P CDN networks to get monetary rewards, and in response the carriers clamp down even harder on what little self-hosting and non-paid P2P networking there is. (Not that self-hosting has been a big thing at all: every carrier blocks your port 80 due to Internet Content Provider registration requirements.)

In any case, the non-techies are just not going to make it. The techies, on the other hand, tend to value their quality of life and set up a tun/vpn/whatever-you-call-it at the level of their routers. That’s good, but still hostile to P2P and to measurements of the network itself, and over time there ends up being little collective interest in these things. (My router happens to be too locked-down for that.)


I don’t work for some big IT company or some university’s IT department, but it might be worth contacting some of these people about setting up a monitor on their existing NTP machines. They tend to have better net.


Another thing I can try is perhaps to make the upstream comms go through a proxy (not the NTP bits!) and to make the “local clock might not be okay” check ignore network timeout failures.

Fun experiment running the Mac with Chrony (?)

The build system actually makes darwin/arm64 binaries too – https://builds.ntppool.dev/ntppool-agent/builds/test/26/ntppool-agent_v4.1.4-26_darwin_arm64.tar.gz