DNS configuration tampering on one of our GeoDNS servers

We found that a volunteer who provided hosting for one of our GeoDNS servers used their access to manipulate DNS zone weights for the NTP Pool service domain. The server has been secured and removed from the DNS NS-set.

What happened

One of our geodns servers (ntpmnl1, in Manila) was hosted on a VM provided by a volunteer. When we set up the server, we followed our standard process: full administrative control, firewall rules, locked down access. The volunteer’s SSH key remained on the system from the initial VM provisioning. Later, they asked us to open a firewall exception so they could retrieve personal files from the machine. We made an exception. That was a mistake, and it’s not something we’ll do again.

The volunteer used that access to install a tool that modified the geodns zone data every two minutes, boosting the AAAA record weights of 42 specific IPv6 addresses, all NTP servers the same person had registered in the pool. Many of these servers were already active in the pool in Asian countries, the US, and Mexico.

They also installed a reverse proxy tunnel for persistent remote access and ran a packet capture tool to log IPv6 source addresses of DNS queries hitting the server.

Our configuration system refreshes zone data on geodns servers regularly, so the modifications were overwritten each time. The tool ran every two minutes to re-apply them between refreshes.

What was the actual impact?

The impact was limited. ntpmnl1 was one of many geodns servers and handled only 2-10% of AAAA traffic for any given country. The zone refresh cycle also kept overwriting the modifications, though the two-minute cron was often fast enough to re-inject before the next refresh. Some countries saw clean stretches of several hours (DE had a 5-hour gap on March 19), but it wasn’t a regular pattern.

For the first ~41 hours the tool only boosted weights of IPs that were already present in zone entries, similar to what an operator could do by adjusting netspeed on the pool management website. For countries where these servers weren’t registered (France, Poland, Sweden, Argentina, Nigeria, etc.) the net effect was effectively 0% of total AAAA queries, at most 0.5%. For countries where the servers were already registered (US, SG, AU, BR, etc.) the net effect was higher, around 0.5-7% of total queries, since ntpmnl1 handled 2-10% of each country’s traffic and the volunteer’s servers were heavily favored in its responses.
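
The arithmetic behind these ranges can be sketched roughly as follows. This is an illustration under stated assumptions, not the analysis that was actually run: it assumes the net effect is simply ntpmnl1's share of a country's queries multiplied by the fraction of its answers that were steered to the volunteer's servers. The function name `net_effect` and the `steered_fraction` values are hypothetical; only the 2-10% share comes from the figures above.

```python
def net_effect(server_share: float, steered_fraction: float) -> float:
    """Fraction of a country's total AAAA queries that were affected.

    server_share: fraction of the country's queries answered by the tampered
        server (2-10% according to the report).
    steered_fraction: fraction of that server's answers shifted toward the
        volunteer's IPs. ASSUMPTION: illustrative values, not measured data.
    """
    if not (0.0 <= server_share <= 1.0 and 0.0 <= steered_fraction <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return server_share * steered_fraction


if __name__ == "__main__":
    # Combine the low and high end of the server's traffic share with two
    # hypothetical steering levels.
    for share in (0.02, 0.10):
        for steered in (0.25, 0.70):
            effect = net_effect(share, steered)
            print(f"share={share:.0%} steered={steered:.0%} -> net effect {effect:.1%}")
```

The point of the sketch is only that a single server with a small share of traffic caps the overall effect: even if every answer from ntpmnl1 had been steered, the net effect could not exceed that server's share of the country's queries.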

Later versions of the tool tried to inject IPs into zone entries for countries where the servers weren’t registered. One test ran for 68 minutes before being reverted; the most aggressive version ran for about 20 minutes before being discovered. Neither had much time to take effect.

Even in affected responses, clients usually got a mix of the volunteer’s servers and regular pool servers. In the US, about half of affected queries had all 4 answer IPs from the volunteer; the other half included at least one normal server. In Canada, most affected responses had 3 regular servers and 1 from the volunteer. NTP clients using multiple servers (ntpd, chrony, systemd-timesyncd) would still have had unaffected time sources available.

All 42 servers are registered pool members. Most have the maximum monitoring score of 20, and we have no data indicating NTP queries to these servers weren’t answered accurately. The servers that were most prominent in affected responses matched their labeled geography: US-labeled IPs dominated US queries, SG-labeled IPs dominated Singapore queries, and so on.

None of that changes the fact that what this volunteer did was wildly inappropriate. Tampering with DNS infrastructure that hundreds of millions to billions of devices depend on, regardless of whether the NTP responses were accurate, is a serious breach of trust.

How we run our DNS infrastructure

Our policy is to maintain complete administrative control of our DNS servers. We don’t give outside access to anyone. The servers run either on infrastructure we acquire commercially or on machines hosted by long-standing, trusted community members.

The NTP Pool runs on limited resources. We depend on the community to help us operate, and that means trusting the people we work with. This volunteer had been a pool contributor for over a year, running NTP servers in parts of the world that are poorly served. That track record is why we worked with them and made the exception on the firewall.

Going forward, we’ll be more careful about who we work with on hosting, and we won’t be making exceptions to our access policies. If you’re a long-time pool participant and want to help with DNS server hosting, have a look at the “NTP Pool DNS servers” page on pool.ntp.org.

What we’re doing about it

  • The server has been secured and removed from the DNS NS-set
  • We’re reviewing access controls across all geodns nodes
  • We’re looking at the pool account that registered these servers
  • We’re adding integrity checks for zone data on geodns servers
  • We won’t be granting firewall exceptions for host access going forward
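
As an illustration of what an integrity check for zone data could look like, here is a minimal sketch of digest verification. It is not our implementation: the function names are hypothetical, and it assumes the configuration system publishes a SHA-256 digest of each zone file it pushes so that the deployed bytes can be compared against it.

```python
import hashlib
import hmac


def zone_digest(zone_data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw zone file bytes."""
    return hashlib.sha256(zone_data).hexdigest()


def zone_matches(zone_data: bytes, expected_digest: str) -> bool:
    """Check deployed zone data against the digest recorded at push time.

    ASSUMPTION: expected_digest comes from the central configuration system,
    not from the DNS host itself.
    """
    return hmac.compare_digest(zone_digest(zone_data), expected_digest)


if __name__ == "__main__":
    # Hypothetical zone content, for demonstration only.
    pushed = b'{"us": {"aaaa": [["2001:db8::1", 1000]]}}'
    expected = zone_digest(pushed)

    tampered = pushed.replace(b"1000", b"9000")
    print("clean copy matches:   ", zone_matches(pushed, expected))
    print("tampered copy matches:", zone_matches(tampered, expected))
```

A check like this is only useful if the comparison happens somewhere the host operator cannot reach, since anyone with root on the DNS server could otherwise disable it or replace the expected digest.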

Timeline (UTC, March 2026)

  • March 14: Volunteer installs a reverse proxy tunnel on the server for persistent remote access
  • March 17-18: DNS packet capture begins
  • March 18 08:16: First zone file modification. For the first ~41 hours, the tool only boosted weights of servers already present in zone entries, similar to changing netspeed on the manage website. Impact on countries where the servers weren’t registered was under 0.1%.
  • March 19 10:54: A 68-minute test of a more aggressive version that injected IPs into previously empty zone entries. Reverted afterward.
  • March 20 07:42: Most aggressive version starts, but only runs for about 20 minutes before discovery.
  • March 20 09:00: Data clean across all countries. Server secured and removed from the DNS NS-set.

Wow, that’s brazen, unbelievable :enraged_face:

I’d be curious how you caught it, I guess some of the independent pool monitoring played a role? But I’d understand if you were reluctant to share your tradecraft publicly.

Hi @ask. Good catch!

Could you please share what disciplinary measures were taken against the participant?

Great work Ask and it’s outrageous to hear somebody breached the DNS server for this important service to the whole internet.

If appropriate - whether that volunteer is an individual or a hosting company / data center etc.?

A rogue operator was logging addresses and injecting his servers? That’s why IPv6 with its privacy option is important. Too bad that IPv6 is not supported by the pool, except in a half, nay, quarter hearted way. Please, @ask, don’t go rogue on us and fully embrace IPv6 pronto!

F-hell, how bad can some people be to abuse others.

@ask If you need a VPS, Racknerd has KVM VPSes pretty cheap.

See here: RackNerd - New Year Sale!

I have a VPS there too. If you want, I have no problem sponsoring the 3.5 GB version.

Best option, you order it and send me the yearly bill for it, so you have total control.

It’s only $32.49 a year, and seems to cover your needs. I suggest you order it, then send me the invoice to pay. I do NOT want any access to that server, it’s all yours.

Let me know if you want me to sponsor. Happy to help.

I would want to know who this person is, such nasty people should be made public.
If it happened to me, I would make all info about such an individual public, even with surname and address etc.
People who do this should be publicly shamed! My 2 cents.

I do agree that having better IPv6 support is good for the Pool, considering that now >25% of internet traffic is routed as IPv6.

Oh man, this is exactly why we should keep our circle on the smaller side. I’ve been involved here since around 2018… I even sponsor a GeoDNS server, as well as time and monitoring servers. I’ve seen a lot in my time and this just confirms it once again. But this is something completely new and it’s incredibly intense.

The node started triggering performance alerts (high CPU from scripts running on the server), and logferry uploads were failing.

When we investigated we found the tools, immediately removed the server from the NS-set, and then analyzed the tools and logs to understand the impact. We had system logs on the server (also sent to our central OTLP/Loki infrastructure) and query logs from GeoDNS, including the answers given to each query.

The volunteer has responded and taken responsibility. He says he’s a student and that this was reckless curiosity about how global DNS traffic routing works at scale, not a targeted attack. I don’t think we can fully verify that either way, but on balance I believe this was more incompetent and inappropriate than deliberately malicious. Still completely unacceptable — he lied to get access, captured DNS client data he had no business collecting, and manipulated zone files to circumvent the load balancing and “find the best server” aspects of the service for the small percentage of users querying this server. The server has (of course) been removed from the NS-set.

I have ideas for catching zone file manipulation faster (weight anomaly detection, zone checksum verification), but no time to implement them. The Avro log files from the DNS servers are open source and in the infrastructure they go into Kafka, so if someone wants to build a tool that can detect anomalies from a stream (~150-400k entries per second) that’d be great.
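
As a possible starting point for anyone picking this up, here is a minimal sketch of weight anomaly detection on a stream of query log entries. Everything specific in it is an assumption: the class name, the idea that each entry can be reduced to a country code plus the list of answer IPs, and the thresholds. The real Avro schema and Kafka consumer are not shown. It flags an IP when its share of recent answers for a country is far above the share its configured weight predicts, or when it appears in a country where it is not configured at all.

```python
from collections import Counter, defaultdict, deque
from typing import Dict, List, Sequence, Tuple


class AnswerShareMonitor:
    """Sliding-window check of answer IP shares per country (hypothetical design)."""

    def __init__(self, expected_share: Dict[Tuple[str, str], float],
                 window: int = 10_000, max_ratio: float = 3.0) -> None:
        # expected_share maps (country, ip) to the share of answers the
        # configured zone weights would predict. ASSUMPTION: derived from
        # the authoritative zone data, not from the DNS host.
        self.expected_share = expected_share
        self.window = window
        self.max_ratio = max_ratio
        self.recent: Dict[str, deque] = defaultdict(deque)
        self.counts: Dict[str, Counter] = defaultdict(Counter)

    def observe(self, country: str, answers: Sequence[str]) -> None:
        """Record the answer IPs from one logged response."""
        queue, counts = self.recent[country], self.counts[country]
        for ip in answers:
            queue.append(ip)
            counts[ip] += 1
            if len(queue) > self.window:
                # Evict the oldest answer so counts cover only the window.
                old = queue.popleft()
                counts[old] -= 1
                if counts[old] == 0:
                    del counts[old]

    def anomalies(self, country: str) -> List[Tuple[str, float, float]]:
        """Return (ip, observed_share, expected_share) for suspicious IPs."""
        queue = self.recent[country]
        if len(queue) < self.window:
            return []  # not enough data yet to judge
        flagged = []
        for ip, n in self.counts[country].items():
            observed = n / len(queue)
            expected = self.expected_share.get((country, ip), 0.0)
            # expected == 0.0 means the IP is not configured for this country.
            if expected == 0.0 or observed > expected * self.max_ratio:
                flagged.append((ip, observed, expected))
        return sorted(flagged)


if __name__ == "__main__":
    monitor = AnswerShareMonitor(
        {("us", "2001:db8::a"): 0.5, ("us", "2001:db8::b"): 0.5},
        window=8, max_ratio=1.4)
    for _ in range(2):
        monitor.observe("us", ["2001:db8::a"] * 3 + ["2001:db8::b"])
    print(monitor.anomalies("us"))
```

A single Python process like this would not keep up with the full stream; it would need to be sharded (for example by country) or rewritten in something faster, but the windowed share comparison is the core idea.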

We’re going to be more careful vetting new contributors, obviously. But this comes back to the broader problem: too much of this is done by me alone. Many of the DNS servers I manage and pay for directly. The system’s resilience actually depends on distributing both infrastructure and management: NTP servers are very distributed, DNS servers somewhat distributed, and so on. Think of it as a pyramid: the more user-facing the layer, the more people can help run it, and the less we depend on any one person. That’s the design, but it only works if people are actually helping.

Sure, but that’s how we end up with the other topic with a bunch of people complaining I’m not doing enough? :man_shrugging:

I don’t think people complain that you aren’t doing enough. Quite the contrary: people are grateful for your work, dedication, expertise, and steady hand in running this and keeping things stable. I certainly am.

At the same time, as you yourself have mentioned in the past, you have limited time available to spend on the project, and a large part of that goes into various operational and maintenance topics.

Thus, moving the pool forward from a feature point of view obviously takes time. Long-outstanding feature wishes (e.g., IPv6, netspeed API, vendor zones, …) and performance topics (e.g., underserved zones) remain open for quite a while after being raised or identified, and I think that is what people are unhappy about.

Because some of those are actual restrictions, making life for operators and users more difficult. E.g., people not being able to join servers in certain zones, or clients having issues getting time in certain zones, better support for IPv6, people wanting to adhere to pool policies requesting vendor zones (including IPv6 support) but not getting them even after several years, a netspeed API, …

And people are also enthusiastic about the pool, and have ideas about how it could be improved. When there is limited feedback on those ideas (again, understandably, for the reasons mentioned above), that tends to stifle people’s enthusiasm: topics important to a lot of people get discussed time and time again, but there is no progress, or even feedback on what is being discussed.

So again, I don’t think the complaints are about you doing too little. Rather, that the project is not set up in a way that would share the load more widely, e.g., more maintenance topics, or smaller items offloaded to other people so you can focus your precious time and expertise on driving the pool system forward from a feature point of view.

I understand that given the current example, and others that also exploited trust in various ways, people are obviously reluctant to add additional people to the fray when it is difficult to assess whether they are really trustworthy. At the same time, there is a real risk that projects can also be killed by smothering them. I just think about NTP classic, where arguably a lack of moving forward quickly enough with features people needed, or even just basic maintenance such as bugs being fixed, led to NTPsec being spawned, and NTP classic being dropped from many Linux distributions.

At least I am concerned something similar could happen to the NTP pool project. I don’t think it will go away, it’s factually being relied upon too much by the Internet community (at least that part of the often-referred xkcd picture is true). At the same time, I feel many people who are enthusiastic about the project, and would like to contribute in one way or another, get frustrated, and many eventually turn away because, e.g., joining a server simply is not possible in large parts of the globe, or they feel the ideas they try to contribute are simply being ignored.

So I think it is worthwhile to at least have a discussion about how to improve that aspect of the project, and how to set it up so as to increase the momentum on new functionality.

I wouldn’t take it too personally. You could just “loop in” someone you can rely on, like Steve or myself, for certain tasks. You know we’re on your side and we all have the same goal in the end. If you tell me where you need the most help, I’ll see how I can support you.

Thanks @ask for fully disclosing this and for the post-mortem analysis. I can only imagine the amount of time it took to figure out what happened there.

This is actually an attack, which according to your description consists of:

  1. Shifting more traffic to 42 IPv6 servers controlled by the attacker (aka the volunteer)
  2. Adding NTP servers to empty zones
    • this is huge, given they would be the sole time provider for those empty countries. See Section 5 in this pdf
      • but ONLY for resolvers that contacted the malicious DNS server (ntpmnl1). FWIW pool.ntp.org has 9 NS records, each of them with multiple IP addresses. So it seems that it affected one of the, say, 40–50 GeoDNS instances.
  3. They captured DNS queries and IP addresses of affected clients without consent/permission and exfiltrated the data, both of which are crimes in many jurisdictions
  4. You don’t say how many “tests” they did on this server injection: you mention one of 68 min and another of 20 min.
  5. You say: “NTP clients using multiple servers (ntpd, chrony, systemd-timesyncd) would still have had unaffected time sources available”.
    • True for NTP clients, but not for SNTP clients such as systemd-timesyncd. If those chose the attacker’s IP address, they would use only that one server – see this presentation
  6. You say: “we have no data indicating NTP queries to these servers weren’t answered accurately”. Sure, but they could have hijacked and time-shifted the clocks of a large number of devices this way. They could have simply lied to your monitors, as previous attacks have shown

Follow-up questions:

  1. How many “events” (tests) happened? How long did they last? Were there any in addition to the 20- and 68-minute ones? From your timestamps it seems these were the only two
  2. What was the attacker’s (student’s) goal? Was it to have a fancy security paper?
    • I mean, if they are a PhD student trying fancy attacks on the NTP pool, there is no way a resulting paper would be accepted at any tier 1 academic conference, given the attack and the ethics. For instance, see USENIX Ethics, which is also used by ACM CCS.
  3. Do you plan to address this with their supervisor or institution?
  4. And this one I never understood: why do you still do everything yourself, alone? It’s great work, I could never pull this off myself. But maybe it’s time to finally distribute the dev/ops tasks to a bunch of trusted volunteers. For instance, this is what the Root DNS server operators do to operate the Root DNS zone.

Thanks a lot

An NS volunteer might capture the DNS packets and manipulate them in flight. NS servers should never rely on volunteers, but be closed to them. Only the pool team should manage them on hosts under their full control.

You misunderstood. My criticism is that you are not open enough. Several people made more than suggestions that you ignored, they submitted PRs and volunteered to help, which you also ignored. Let’s be honest about your attitude over the years. Yes, we’re grateful, but you shunned help many times before.

@ask

Dude, if I can help in any way, I’m more than happy to do so.

Just tell me what to do. I know I can be a pain in the ***, but I am never dishonest.

No, I’m not looking for a position; I’m more than happy to pay for a few servers for you.

Or help with moderating the forum. I was previously the admin (a long time ago) of the MSI.COM forum and had some April Fools’ jokes that you can look up… my best one was the RTFM CHIP joke.

https://forums.guru3d.com/threads/new-announcement-rtfm-chip-details-reveiled.318261/

That was me :wink:

If I can help…and to prove it was me…here is a picture…

It was given because they were not able to visit me at the time; later they came.

Long time ago, it was fun…but I was Admin of the MSI forum for a long time.

My Avatar at the time…at MSI:
Bas-avatar

People that know…know me…so much fun at the time to help others.

@ask will you please tell me how to help you pay for costs?

I need an IBAN etc.; mail me. I do not use PayPal, as those fuckers protect criminals.

Tell me how I can help. I will do so. Let me know, time is too important.

As I’m 60, I have not much time left :rofl:

I see that your main ask is for the Pool to implement IPv6 entirely. While I am a supporter of adding full IPv6 support to the Pool, this is a separate topic from DNS tampering. I wouldn’t suggest mixing the two together.

That would be a good reason to add DNSSEC to pool.ntp.org. It’s not a silver bullet, but it could help.
