PPS Server Question


#1

Hi Everyone,
I could use an intelligence infusion from this group before I go back to drinking as my primary hobby. I built up a Raspberry Pi NTP server with Adafruit GPS hat a few weekends ago. After much trial and error I got it working to book specs with GPS reference and PPS precision. The numbers looked great and matched everything I had read about in the many tutorials available.

Problem is, I put it on the network and every client I have pointed at it throws it out of consideration. In all cases, the NTP clients will happily take GPS and stratum 2 and 3 servers over this PPS server.

Interestingly, I’ve floated around the internet and found other PPS providers and used them only to find that they, too, are thrown out quickly. Even though their numbers are great - offset decent, jitter low, etc. I’ve noticed that every PPS server I’ve tested has a root dispersion of 500. Mine included.

I have a handful of commercial NTP boxes here in the shop that I put on the wire (Time Machines, LeoNTP, and two enterprise-class boxes), and they all work wonderfully in testing. But this Pi PPS thing has me pulling my hair out.

Clearly, I’m a new learner. Am I following a bad path here?


#2

Please post your ntpq -p output (from both the server and a client, if possible) so we can help you debug.


#3

And an ntpq -c rv from the server would be helpful too…

If you are seeing a root dispersion of 500 on the Pi, then I would think the ntpq -p output should show that as jitter for the local refclock?

Maybe FIFO buffering is causing jitter? I’ve never tried this on a Pi, so I’m just taking a shot in the dark. Are you sure the GPS is locked? Some modules will emit a PPS before the module has a fix.


#4

Certainly! And thanks so much for your willingness to take a look. The output from ntpq is pasted just below from the Pi server.

Increasingly, I’m thinking I’m probably chasing an issue that doesn’t really exist. The numbers from the server look very good to my untrained eye. And, contrary to my previous post, this PPS server does get used on rare occasions; still, I’m surprised at how efficiently it gets ignored by most of my test boxes around the country.

Here’s the server’s ntpq output:


pi@xxxxxxxxx:~ $ ntpq -p -crv
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
oPPS(0)          .PPS.            0 l    3   16  377    0.000    0.000   0.001
*SHM(0)          .GPS.            1 l    5   16  377    0.000  -74.454  53.018
+SHM(2)          .SHM2.           0 l    3   16  377    0.000    0.000   0.001

associd=0 status=0115 leap_none, sync_pps, 1 event, clock_sync,
version="ntpd 4.2.8p12@1.3728 Mon Feb 18 11:58:54 UTC 2019 (1)",
processor="armv7l", system="Linux/4.14.79-v7+", leap=00, stratum=1,
precision=-21, rootdelay=0.000, rootdisp=500.030, refid=PPS,
reftime=e0196a63.f4737b0c Thu, Feb 21 2019 12:20:19.954,
clock=e0196a66.d6a95d95 Thu, Feb 21 2019 12:20:22.838, peer=53810, tc=4,
mintc=3, offset=0.000490, frequency=-4.935, sys_jitter=0.000539,
clk_jitter=0.001, clk_wander=0.000


And here’s a representative sample of what I see on test client boxes around my network. This is a bit of a dramatic example since this was taken from a PC here in my shop just a few switch layers away from the server. As you can see, my PPS server (the top line) is thrown out along with another Stratum 1 PPS box at Torix. I haven’t yet seen either of these servers used on this box.


 remote           refid      st t when poll reach   delay   offset  jitter

==============================================================================
-ip72-196-20-29. .PPS.            1 u  100  128  377    1.353    0.963   0.113
-ntp1.torix.ca   .PPS.            1 u   23  128  377   30.257    1.842   1.668
+vps6.ctyme.com  216.218.254.202  2 u   77  128  377   49.069    2.327  34.432
*eterna.binary.n 128.252.19.1     2 u   18  128  377   37.104    1.272   1.335


Clearly, the ntp client’s selection algorithms dislike something about the packets arriving from these servers. I’ve done a tremendous amount of reading on this in the past 24 hours, and the one thing I have learned is how much I still have to learn about this process/protocol. I believe the chances are high that I’m wasting some of your valuable time on an issue that probably isn’t much of an issue at all.

My intent is to put this server into the pool so I can return as much data as I take from the pool on a day to day basis - plus have some fun with it. Since I probably have more money than brains I’ll likely order another LeoNTP server (which is a great box, btw) and throw that on the network to fulfill my promise. Perhaps the $35 Pi hardware isn’t quite able to perform the way one would expect in this role.

Thanks again!
Skeet


#5

The problem is in the root dispersion. Clients generally prefer servers with shorter dispersion (and delay). I’m not sure what could cause the PPS refclock to have a 500 millisecond dispersion. It might help if we could see your ntp.conf.
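To make that preference concrete, here’s an illustrative sketch (not ntpd’s actual code) of the synchronization-distance comparison a client makes, roughly rootdelay/2 + rootdisp, ignoring the extra terms for the client’s own measured path, which would only add a few milliseconds here. The stratum-2 numbers are representative values like those in the client output above:

```python
# Simplified sketch of NTP's synchronization-distance comparison.
# A client generally prefers the candidate with the smaller distance,
# regardless of stratum.

def root_distance(rootdelay_ms: float, rootdisp_ms: float) -> float:
    """Synchronization distance, roughly: root delay / 2 + root dispersion."""
    return rootdelay_ms / 2 + rootdisp_ms

# The Pi PPS server: negligible delay, but ~500 ms of advertised root dispersion
pi_pps = root_distance(rootdelay_ms=0.0, rootdisp_ms=500.03)

# A typical stratum-2 pool server: tens of ms of delay, modest dispersion
stratum2 = root_distance(rootdelay_ms=37.0, rootdisp_ms=25.0)

print(f"Pi PPS server distance:    {pi_pps:.1f} ms")
print(f"stratum-2 server distance: {stratum2:.1f} ms")
```

The stratum-2 server wins by an order of magnitude, so the client prefers it despite the Pi’s better stratum and lower jitter.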


#6

Yes, sorry - I meant to include ntp.conf in the previous message but forgot to put it in. I do use a time1 setting of 0.500 on driver 28 (the GPS serial source). This is a suggested value in many of the tutorials I’ve used, and I’ve found that if I stray too far from 0.500, I’ll lose PPS lock.

# /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help

driftfile /var/lib/ntp/ntp.drift

# Note that this config works without an internet source.
# PPS kernel mode
server 127.127.22.0 minpoll 4 maxpoll 4 true
fudge 127.127.22.0 flag3 1 refid PPS

# GPS serial data reference
server 127.127.28.0 minpoll 4 maxpoll 4 iburst prefer
fudge 127.127.28.0 flag1 1 time1 0.500 refid GPS stratum 1

# Shared memory 2 source
server 127.127.28.2 minpoll 4 maxpoll 4
fudge 127.127.28.2 flag1 1 refid SHM2

# Fix false tickers
tos mindist 0.5

# By default, exchange time with everybody, but don't allow configuration.
restrict -4 default kod notrap nomodify nopeer noquery limited

# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1

# Needed for adding pool entries
restrict source notrap nomodify noquery


#7

I believe the tos mindist 0.5 line is what’s causing your root dispersion to be so high.

If the PPS is too jittery without the setting, try at least using a lower value like 0.010 and keep working your way smaller.

As I mentioned before, you probably want to make sure the UART FIFO buffers are disabled (every OS is different so you will have to do some googling on this one). Doing so typically decreases the jitter for NTP.

What is the SHM2 pointing to?

You also shouldn’t need iburst & prefer for the GPS line. I’m not even sure iburst works for local refclocks.
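For what it’s worth, a sketch of how the refclock section might look with those changes applied (untested, and the 0.010 floor is only a starting point to tune from). One caveat: the ntpd PPS driver (type 22) is documented to operate in conjunction with a prefer peer, so you may need to keep prefer on the GPS line even after dropping iburst:

```
# PPS kernel mode
server 127.127.22.0 minpoll 4 maxpoll 4 true
fudge 127.127.22.0 flag3 1 refid PPS

# GPS serial data reference (prefer kept so the PPS driver has a prefer peer)
server 127.127.28.0 minpoll 4 maxpoll 4 prefer
fudge 127.127.28.0 flag1 1 time1 0.500 refid GPS stratum 1

# Start low; raise only if the clock hops between sources
tos mindist 0.010
```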


#8

Okay, I just did a test on one of my GPS NTP servers. It is definitely the tos mindist 0.5 that is forcing the root dispersion to be so high, which in turn is why no clients want to use that source.

I did a little searching; the way they say to fix the jitter issue on Linux is to use:

setserial /dev/ttyS0 low_latency

or whichever serial port is being used. I’m not sure what you would do if your distro doesn’t have the setserial command.


#9

Any update @Skeet ? Would like to know if you are able to get this resolved as I’ve been considering getting a Pi to do similar.


#10

Yessir @littlejason99 - Actually spending some time with it this weekend. You were right on about the tos mindist setting having a direct impact on root dispersion. And @mlichvar was totally correct that high dispersion causes clients to throw the server out of consideration.

Here’s what I’ve observed so far with the Pi: setting tos mindist down to a starting value of 0.010 yielded a root dispersion of 10. All of my test clients immediately picked the server up and started using it in their selection process. Many of them even marked it as their current time source (*), and the rest at least included it in the candidate set (+).

However! With tos mindist at 0.010 I started getting serious clock hop. The server would use the PPS discipline for a while and then fall back to the GPS shared-memory source. I’m working that value uphill in small increments to find a stable point. I’ve found that one needs incredible patience here and must not move too quickly; I’m waiting an hour or two after each change and ntpd restart to let everything settle. At the moment I’m using a mindist setting of 0.090, and the clock hop has stabilized, at least for now. I’ll continue to watch it.

The 0.090 setting yields a root dispersion of 90, and as a result most of my test clients have again eliminated the server from consideration. So I’ll keep tuning over longer periods and see if I can find a sweet spot.

I have not messed with the FIFO latency setting just yet.

Skeet


#11

Yeah I want to say the mindist setting is in the “seconds” scale while the root dispersion is in the more common “milliseconds” scale.

I would try the fifo settings, you really don’t want to use the mindist setting if you don’t have to. On my little soekris with a GPS my root dispersion is about 1.4 (ms) on average.
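The scale relationship can be sanity-checked against the numbers reported earlier in the thread, where the advertised root dispersion (in milliseconds) tracked the mindist floor (in seconds) almost exactly:

```python
# "tos mindist" is given in seconds; the root dispersion a refclock-only
# server then advertises is shown in milliseconds, so the floor set by
# mindist maps roughly 1:1 after a factor of 1000.

def expected_rootdisp_ms(mindist_s: float) -> float:
    """Rough floor on advertised root dispersion for a given mindist."""
    return mindist_s * 1000.0

# (mindist setting, root dispersion observed in this thread)
for mindist, observed in [(0.5, 500.03), (0.010, 10.0), (0.090, 90.0)]:
    print(f"mindist {mindist} s -> expect ~{expected_rootdisp_ms(mindist):.0f} ms "
          f"(observed ~{observed})")
```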