Context
When I was working at the amazing Swiss 🇨🇭 hoster Infomaniak, I had the chance to set up their NTP infrastructure.
Previously, they had been using an old, forgotten server, a single point of failure, which acted as the main time provider. When you have thousands of clients, a time-sensitive infrastructure, and multiple DCs, this is not really acceptable.
So the goal was to create something better, and we ended up using some PCI-e cards with an OCXO and a GPS + PPS connection.
These servers are now part of the NTP pool and serve thousands of requests per second.
Can I haz too?
Well, this is great, but I no longer work there, and I kinda want the same stuff to play with.
Stratum 1
My first goal was to have a stratum 1 server. Of course, I could buy some Symmetricom appliance on eBay, but then it wouldn’t be the same.
I ended up choosing an old Raspberry Pi (RPi 3 Model B 1.2). Fitted with a u-blox MAX-M8Q GPS HAT, it provides an accurate 1 Hz “PPS” signal to the Pi.
To get good reception, the GPS antenna was mounted on the roof, a bit away from my 21 km P2P WiFi link in order to avoid interference.
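On the software side, a minimal time-daemon setup could look like this, assuming gpsd feeding chrony and the PPS line exposed as /dev/pps0 (a sketch, not an exact copy of my config):
# /etc/chrony/chrony.conf (sketch)
# Coarse time from the NMEA sentences, fed by gpsd through its SHM driver;
# too jittery to be selected on its own, hence "noselect".
refclock SHM 0 refid NMEA offset 0.2 noselect
# Precise second boundary from the GPS PPS pin, paired with the NMEA source.
refclock PPS /dev/pps0 refid PPS lock NMEA
# Answer NTP clients.
allow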

This is nice, but it needs an enclosure. I salvaged an old Cisco router I had lying around and made this:

The LCD is connected to the Pi through a PCF8574 I2C I/O expander and driven by a small Rust program.
Redundancy & BGP
What happens if the Raspberry Pi dies? The NTP pool itself is redundant, but my quality of service would suffer.
I decided to add two other hosts. As I have two different sites, they would be geographically redundant.
The two stratum 2 servers are simple VMs (1 CPU, 512 MB of RAM; turns out you don’t need much to tell the time). They are created with Terraform on my Proxmox hypervisors.
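As a rough sketch, such a VM could be described like this, assuming the community Telmate/proxmox Terraform provider (the names and attributes below are illustrative and vary between provider versions):
# Sketch of one of the stratum 2 VMs
resource "proxmox_vm_qemu" "ntp_s2" {
  name        = "ntp-s2-cra"       # hypothetical VM name
  target_node = "pve1"             # hypothetical Proxmox node
  clone       = "debian-template"  # hypothetical template to clone from
  cores       = 1
  memory      = 512                # MB; telling the time really is cheap
}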

Instead of adding their internal (though public) IPv6 addresses to the NTP pool, I went the BGP route.
Each server has a dedicated IP it announces via BGP. For my servers, the IPs are:
- 2a01:e0a:431:b527::a123 (chronos.ntp.k3s.fr)
- 2a01:e0a:431:b527::b123 (ntp-s2-cra.ntp.k3s.fr)
- 2a0e:e701:122c:fff0::a123 (ntp-s2-ces.ntp.k3s.fr)
The servers directly peer with my routers.
To make it redundant, each server also announces the other servers’ IPs. However, if we simply announced all the IPs on each server, it would be hard to keep the traffic balanced and dedicated: if the routes are not identical, the traffic would always go to the server behind the shortest path, possibly overloading it.
To fix that, the redundant IPs are announced with some AS path prepending. We artificially lengthen the path for these “backup” announcements, so the traffic naturally goes to the intended server.
The way it’s done is quite simple: each server has two dummy network interfaces, bgp and bgp-backup. For example, on chronos, we have:
4: bgp: <BROADCAST,NOARP,UP> mtu 1500 qdisc noqueue state UNKNOWN …
    inet6 2a01:e0a:431:b527::a123/128 scope global
       valid_lft forever preferred_lft forever
5: bgp-backup: <BROADCAST,NOARP,UP> mtu 1500 qdisc noqueue state UNKNOWN …
    inet6 2a01:e0a:431:b527::b123/128 scope global
       valid_lft forever preferred_lft forever
    inet6 2a0e:e701:122c:fff0::a123/128 scope global
       valid_lft forever preferred_lft forever
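For reference, such dummy interfaces can be created with plain iproute2 along these lines (a rough sketch using the addresses above):
# Dummy interface carrying this server's primary service IP
ip link add bgp type dummy
ip addr add 2a01:e0a:431:b527::a123/128 dev bgp
ip link set bgp up

# Second dummy interface carrying the other servers' IPs as backups
ip link add bgp-backup type dummy
ip addr add 2a01:e0a:431:b527::b123/128 dev bgp-backup
ip addr add 2a0e:e701:122c:fff0::a123/128 dev bgp-backup
ip link set bgp-backup up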
We then configure our BGP daemon (in my case FRR) to announce the IPs bound to these interfaces:
router bgp 64600
 bgp router-id 192.168.10.155
 bgp bestpath as-path multipath-relax
 bgp bestpath compare-routerid
 neighbor pg-leaf peer-group
 neighbor pg-leaf remote-as external
 neighbor fc00::1 peer-group pg-leaf
 !
 address-family ipv6 unicast
  redistribute connected route-map map-bgp
  neighbor pg-leaf activate
  neighbor pg-leaf soft-reconfiguration inbound
  neighbor pg-leaf route-map map-bgp out
 exit-address-family
!
route-map map-bgp permit 10
 match interface bgp
!
route-map map-bgp permit 20
 match interface bgp-backup
 set as-path prepend 64600 64600 64600
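On the router side, a minimal matching FRR configuration could be as simple as accepting the sessions (a sketch with illustrative ASN, router-id, and neighbor addresses, not my actual router config):
router bgp 64512
 bgp router-id 192.168.10.1
 neighbor pg-ntp peer-group
 neighbor pg-ntp remote-as external
 neighbor fc00::2 peer-group pg-ntp
 neighbor fc00::3 peer-group pg-ntp
 address-family ipv6 unicast
  neighbor pg-ntp activate
 exit-address-family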
This way, traffic for each IP naturally goes to its dedicated host:
$ show ipv6 route
…
B 2a01:e0a:431:b527::a123/128 [20/0] via fe80::ba27:ebff:fe72:7731, eth0, 4d8h15m
B 2a01:e0a:431:b527::b123/128 [20/0] via fe80::be24:11ff:fe68:534e, eth0.21, 16d23h42m
…
But when a server dies, its BGP session goes down and the traffic is redirected to the backup hosts:
$ show ipv6 route
…
B 2a01:e0a:431:b527::a123/128 [20/0] via fe80::be24:11ff:fe68:534e, eth0.21, 1m20s
B 2a01:e0a:431:b527::b123/128 [20/0] via fe80::be24:11ff:fe68:534e, eth0.21, 16d23h42m
…
The resulting network setup looks roughly like this:

As you may notice, the remaining SPOF is the internet uplink. Sadly, even though I have two different ISPs, I can’t really announce ISP A’s range through ISP B. This should be fixed once I have my own ASN :-)
Choosing NTP peers
Choosing NTP peers is tricky. You may be tempted to use the NTP pool for that, but it is not great when your server is itself part of the pool, and it could cause the pool’s overall time to drift if everybody did that.
The challenge is then to find stratum 1 NTP peers. Of course, I added my Pi to the list, but one server is not enough, as it may drift if its source becomes unavailable. You can see an example of a drifting peer in the following picture:

In order to find peers, I extracted the list of all the NTP pool participant servers and queried each of them directly. As an NTP reply contains the server’s stratum, I could cherry-pick the stratum 1 servers close to me.
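As a rough illustration of that probing step (not the exact tool I used), a tiny stratum checker in Rust could look like this:
use std::net::{SocketAddr, ToSocketAddrs, UdpSocket};
use std::time::Duration;

// Send a single SNTP client request and return the stratum byte of the reply.
fn query_stratum(host: &str) -> std::io::Result<u8> {
    // Resolve the hostname on port 123; try every returned address (v6 or v4).
    let addrs: Vec<SocketAddr> = (host, 123).to_socket_addrs()?.collect();
    let mut last_err = std::io::Error::new(std::io::ErrorKind::Other, "no address");
    for addr in addrs {
        // Bind a socket of the matching address family.
        let local = if addr.is_ipv6() { "[::]:0" } else { "0.0.0.0:0" };
        let socket = UdpSocket::bind(local)?;
        socket.set_read_timeout(Some(Duration::from_secs(2)))?;

        // Minimal 48-byte NTP packet: LI=0, version 4, mode 3 (client).
        let mut request = [0u8; 48];
        request[0] = 0b00_100_011;

        socket.send_to(&request, addr)?;
        let mut reply = [0u8; 48];
        match socket.recv_from(&mut reply) {
            // The second byte of the reply is the server's stratum.
            Ok(_) => return Ok(reply[1]),
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}

fn main() {
    // Usage: stratum-probe <host> [<host> ...]
    for host in std::env::args().skip(1) {
        match query_stratum(&host) {
            Ok(1) => println!("{host}: stratum 1, good candidate"),
            Ok(s) => println!("{host}: stratum {s}"),
            Err(e) => println!("{host}: no usable reply ({e})"),
        }
    }
}
Filtering on stratum 1 and keeping the servers with the lowest latency gives a short list of nearby candidates.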
Future upgrades
RPi
The Raspberry Pi’s Ethernet port is attached through an internal USB bus. This is not that great, because the USB hop adds latency. I’d like to try a BeagleBone, because its Ethernet controller talks directly to the CPU.
Also, I’d like to add an OCXO to the setup in case the GPS signal is lost due to interference, jamming, or a bug. However, this is more complicated, because it’d need a full PLL + VCO stack in the middle of the GPS-PPS-to-Pi path. Maybe someday…
And of course, get an ASN, announce the “BGP HA” prefix on both upstreams, and be fully HA :-)
IPv4
I originally made this infrastructure IPv6-only, because IPv4 is dying, but mostly because I don’t have many IPv4 addresses, and using them would involve NAT, which is suboptimal. Maybe one day :-)
Use the pool
Your best bet would be to use the NTP pool, but if you want to use the stratum 1 server (hosted near Geneva) as an uplink, you can use chronos.ntp.k3s.fr (warning: IPv6 only for now!)
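For example, with chrony, that is a single extra line in /etc/chrony/chrony.conf:
server chronos.ntp.k3s.fr iburst
(iburst simply speeds up the initial synchronization.)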
If you’re curious, you can access the Grafana dashboard here or the NTP pool stats page.
Source code & Dashboard
You can find the source code of the project here