[HN Gopher] The Sisyphean Task of DNS Client Config on Linux
___________________________________________________________________
The Sisyphean Task of DNS Client Config on Linux
Author : smitop
Score : 100 points
Date : 2021-04-15 15:01 UTC (8 hours ago)
(HTM) web link (tailscale.com)
(TXT) w3m dump (tailscale.com)
| jiqiren wrote:
| If you are just trying to use curl or some other straightforward
| small application - this is enough. But running Kubernetes,
| docker, or some other container system? This mess goes much
| deeper...
|
| For example: there are known issues with systemd-resolved &
| Kubernetes (at least on Ubuntu that defaults to systemd-resolved)
| https://kubernetes.io/docs/tasks/administer-cluster/dns-debu...
| dijit wrote:
| It also messes up docker-compose, probably in the same way.
|
| I have to disable systemd-resolved and manually configure
| resolv.conf to get my containers to resolve each other
| properly. (Also Ubuntu)
| denysvitali wrote:
| FYI, if you don't bother opening the link, systemd-resolved sets
| the /etc/resolv.conf to have only 127.0.0.1 as a DNS server.
| You can see how things get bad when the CoreDNS pod tries to
| get the "upstream" DNS servers from the host /etc/resolv.conf
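|
| A minimal sketch of the failure mode described above, assuming a
| plain resolv.conf-style file: it lists the `nameserver` entries and
| flags the case where every one of them is a loopback address, which
| is when copying the host file into a pod leaves no usable upstream.
| The helper names and sample addresses are made up for illustration
| and are not taken from CoreDNS or the kubelet.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses on `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        if line.startswith(("#", ";")):
            continue  # comment line
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def only_loopback(servers):
    """True if there is at least one server and all of them are loopback."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    addrs = [ipaddress.ip_address(s.split("%", 1)[0]) for s in servers]
    return bool(addrs) and all(a.is_loopback for a in addrs)


# Sample contents (made up): a local-stub-only file and a mixed one.
stub_only = "nameserver 127.0.0.1\noptions edns0\n"
mixed = "nameserver 192.0.2.53\nnameserver 127.0.0.1\n"

print(only_loopback(nameservers(stub_only)))  # True
print(only_loopback(nameservers(mixed)))      # False
```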
| kelnos wrote:
| I guess it shouldn't be doing that? I mean, that also breaks
| in the case where you're running something like dnsmasq as
| your local resolver, which will also require 127.0.0.1 in
| your resolv.conf.
|
| As much as I'm not always comfortable with the larger
| complexity, the CoreDNS pod really needs to be doing
| something like the flowchart described in the article. If
| systemd-resolved or NetworkManager are in play, it should be
| using D-Bus to talk to them to get the information it needs.
|
| resolv.conf is an old idea that is too inflexible for today's
| DNS needs; a more complex solution with more complex
| interface is unfortunately warranted here.
| vetinari wrote:
| > I mean, that also breaks in the case where you're running
| something like dnsmasq as your local resolver, which will
| also require 127.0.0.1 in your resolv.conf.
|
| Dnsmasq DNS support is redundant when you are running
| systemd-resolved. It can do everything dnsmasq does and
| more.
|
| > resolv.conf is an old idea that is too inflexible for
| today's DNS needs; a more complex solution with more
| complex interface is unfortunately warranted here.
|
| That's exactly what systemd-resolved is.
| venamresm__ wrote:
| I also tried once to make sense of domain name resolution:
| https://venam.nixers.net/blog/unix/2020/11/01/resolving-a-ho...
| Koffiepoeder wrote:
| To be honest I hate all of the aforementioned programs trying to
| battle over /etc/resolv.conf. That's why by default I have the
| file marked as immutable, and pointing to 127.0.0.1. This way I
| cannot have accidental DNS traffic leaking. If I need local
| network DNS I send a manual dhcp command to get the wifi's DNS
| and add it temporarily to my dnscrypt config. Similar thing for
| VPN DNS.
| mindslight wrote:
| I do something similar in that I just point DNS traffic at the
| gateway itself and configure it along with the rest of the WAN
| horizon on the router (eg direct goes to local recursor,
| wireguard tunnel goes to the recursor on the other side of the
| tunnel or suitable public resolver, commercial VPN gets DNATted
| to whatever their recommended resolver is, pihole eventually).
| Running multiple apps / security contexts / nyms (or whatever
| else you want to call them) on the same OS instance is just
| asking for trouble.
| corndoge wrote:
| I recommend the series "Anatomy of a Linux DNS lookup"[0].
|
| Everything you never wanted to know.
|
| [0] https://zwischenzugs.com/2018/06/08/anatomy-of-a-linux-dns-l...
| tyingq wrote:
| That appears to have been written prior to systemd-resolved
| becoming enabled by default on many distros.
| tptacek wrote:
| Tailscale is awesome, you should use it for everything.
|
| This Linux DNS stuff drives us batty at Fly.io. We run user
| containers as Firecracker VMs; users belong to "organizations",
| and organizations share a private IPv6 network. We do DNS for
| that private network under the fake "internal" TLD, so if you
| have an app "phoenix-frontend" and another app "rabbitmq-
| cluster", they can see each other at "phoenix-frontend.internal"
| and "rabbitmq-cluster.internal".
|
| What you'd want in a perfect world is an option in
| `/etc/resolv.conf` that sends `.internal` to a special
| nameserver, and everything else to a normal nameserver. But as
| far as I can tell, there's no way to take a bare Linux VM that
| can accept an arbitrary container and set that capability up.
|
| So instead, we end up (by default; you could override) serving
| _all_ customer DNS, and our `.internal` server has to forward
| recursive queries to things that aren't `.internal` somewhere
| else. This sucks; we shouldn't have to be inline for arbitrary
| customer DNS.
|
| If there's a clean way to resolve this, so that the VM itself can
| just send `.internal` queries to us, and everything else to
| `1.1.1.1` or `8.8.8.8` or whatever the customer's container had,
| I would _love_ to hear it. I've come pretty close to breaking out
| preload in anger over this problem.
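|
| A rough sketch of the per-suffix routing wished for above, as it
| might look inside a small forwarder: pick the upstream by longest
| matching domain suffix and fall back to a default otherwise. The
| function name, the route table, and the `2001:db8::53` address are
| hypothetical placeholders, not Fly.io's actual setup.

```python
def pick_upstream(qname, routes, default):
    """Choose a resolver for qname by longest matching domain suffix."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the longest suffix first: "a.b.internal", "b.internal", "internal".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default


# Hypothetical route table: one private TLD, everything else goes public.
ROUTES = {"internal": "2001:db8::53"}
DEFAULT = "1.1.1.1"

for name in ("phoenix-frontend.internal", "example.com."):
    print(name, "->", pick_upstream(name, ROUTES, DEFAULT))
```

| Matching whole labels rather than raw string suffixes keeps a name
| like "notinternal" from being routed to the private resolver.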
| js2 wrote:
| On macOS, you could do this trivially by creating
| /etc/resolver/internal with the name servers in it.
|
| Linux should copy that.
| vetinari wrote:
| Systemd-resolved does much more than MacOS resolver. It can
| make the additional resolvers active conditional on whether
| the link is up, for example (think VPN). MacOS resolver can't
| do that.
| ptomato wrote:
| ... I realize I'm a monster, but aren't y'all already
| extensively using BPF for things? Sticking in a BPF egress
| filter on the VM that rewrites outbound DNS packets with
| .internal in the question to point at your DNS server seems
| like it would be lighter weight than just handling all queries
| recursively.
| YarickR2 wrote:
| Why do you need a kernel-mode parser for that? iptables -t nat
| -I POSTROUTING -p udp --dport 53 -j DNAT --to <your internal
| recursor, sending .internal to authoritative dns server for
| that zone, and resolving globally available hostnames by
| itself>
| ptomato wrote:
| Well, because he doesn't want to be inline for
| non-.internal DNS queries.
| dave_universetf wrote:
| This is also what we're trying to do in Tailscale (to grab your
| MagicDNS domain, and whatever corporate split DNS you set, but
| not anything else). And yeah, on linux, basically systemd-
| resolved is the only thing that gets this right (with dynamic
| config - with fully static config, other recursors are an
| option, with varying tradeoffs), everything else assumes "one
| resolver should be enough for everything".
|
| So, your solution to proxy all the traffic and split out as
| needed is the right way to do it without adding more software
| into each VM :(
|
| Doing stuff via LD_PRELOAD would be hilarious, you should
| definitely do that and report back _ducks behind the blast
| shield_
| dijit wrote:
| I used to have a very similar problem (didn't want my company
| vpn to handle all dns traffic, only what was directed at the
| company), I managed to solve it with a local dnsmasq on my
| machine.
|
| Unlikely to fit your usecase. But it's not impossible for us
| plebs.
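|
| As a sketch of the local-dnsmasq approach, assuming dnsmasq's
| `server=/domain/address` and `no-resolv` options: the snippet below
| renders a config that sends one company domain to the VPN resolver
| and everything else to a default upstream. The domain name and the
| addresses are invented examples, and the function is hypothetical.

```python
def dnsmasq_split_config(routes, default_servers):
    """Render dnsmasq config lines for per-domain upstreams."""
    # no-resolv: take upstreams from this config, not /etc/resolv.conf.
    lines = ["no-resolv"]
    for domain, server in sorted(routes.items()):
        lines.append("server=/" + domain + "/" + server)
    for server in default_servers:
        lines.append("server=" + server)
    return "\n".join(lines) + "\n"


# Invented example values: a company domain behind the VPN, plus a default.
config = dnsmasq_split_config({"corp.example": "10.0.0.53"}, ["192.0.2.1"])
print(config, end="")
```

| With something like this in place, /etc/resolv.conf would point at
| the local dnsmasq instance instead of at either upstream directly.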
| geofft wrote:
| If you're willing to assume glibc, a reasonably clean approach
| is to write your own nss_fly module that returns results for
| .internal.
|
| The downside is that Go code will drop to cgo for all
| resolutions (because it will see something it can't handle in
| pure Go in resolv.conf) and non-glibc code (like Alpine
| containers using musl, or non-libc resolvers like ares) won't
| get anywhere at all.
|
| But you're kind of doomed in that latter case, anyway, because
| since there _currently_ isn't a standard for how resolv.conf
| should express the rules you want, the effort to get it adopted
| by glibc, musl, Go, Chrome, ares, and all the other various
| interpreters of resolv.conf will take way too long to make a
| practical impact on a startup's product.
| jeffbee wrote:
| Feels like the actual clean way to resolve this is to not use
| DNS for discovery of named services.
| 4rtyui9xd wrote:
| I can think of a variety of ways. Some involve running some
| code on the VM. For example, dnscache or dqcache would work.
| This can also be accomplished using tinydns. Tiny programs that
| use very little memory. But it sounds like you are trying to
| avoid asking the user to install anything other than flyctl on
| the "bare Linux VM". What is not clear is what programs are
| installed by default in the "bare Linux VM". The other question
| is how many ".internal" domains the VM will need to resolve.
| The simplest solution that comes to mind is when the user
| provisions an IP, flyctl writes the ".internal" domain(s) for
| that IP to /etc/hosts. /etc/resolv.conf can then point to
| whatever the user prefers. IMO, users should be running their
| own DNS servers, not setting /etc/resolv.conf to point to third
| party DNS addresses like 1.1.1.1 or 8.8.8.8 or whatever. In the
| case they are running their own DNS server on the VM, then it
| becomes trivial to segregate internal from external domains. The
| configuration for dnscache is so easy that it could be done by
| the flyctl program, requiring no user interaction. (Something
| like tinydns-config.) One could even have configurations for a
| variety of DNS servers in flyctl, in case the user prefers
| unbound, etc.
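|
| A minimal sketch of the /etc/hosts idea above, assuming the
| provisioning tool knows each address and its ".internal" names: it
| renders one hosts line per address. The function name and the
| sample mapping are hypothetical, and nothing here reflects what
| flyctl actually does.

```python
def hosts_lines(entries):
    """Render {address: [names...]} as /etc/hosts lines, one per address."""
    out = []
    for address, names in entries.items():
        out.append(address + "\t" + " ".join(names) + "\n")
    return "".join(out)


# Hypothetical mapping of private addresses to ".internal" names.
entries = {
    "2001:db8::10": ["phoenix-frontend.internal"],
    "2001:db8::11": ["rabbitmq-cluster.internal"],
}
print(hosts_lines(entries), end="")
```

| Static entries like these only help programs that consult
| /etc/hosts, and they go stale if an address changes without the
| file being rewritten.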
| mwcampbell wrote:
| > This sucks; we shouldn't have to be inline for arbitrary
| customer DNS.
|
| Why not? Someone has to do it, and if I were paying Fly for
| service, I'd rather have Fly handle my DNS queries than be
| another freeloader on Google or Cloudflare.
| mjevans wrote:
| Couldn't you run dnsmasq, unbound, or some other configurable
| resolver on the localhost? Though that's probably already
| what's done with the internal smart resolver.
| YarickR2 wrote:
| This is proper advice. My recursor of choice is pdns-recursor,
| but bind in forward-only mode with several forwarders for
| different zones will work too
| friseurtermin wrote:
| Funnily enough, that is the _exact_ same problem I'm facing
| right now, down to Firecracker and a custom internal TLD. I'm
| excited to see the solutions. I think the only difference is
| that I need to run this DNS service on the same host as my VMs,
| so I will need to use a different port than systemd-resolved.
| YarickR2 wrote:
| I'd recommend running it on the same port but on a different IP
| (127.0.1.2, for example), since some programs cannot use a
| non-standard port
| dnr wrote:
| systemd-resolved can do this, though I haven't played with it
| too much yet.
|
| Here's a random post I just found:
| https://gist.github.com/brasey/fa2277a6d7242cdf4e4b7c720d42b...
|
| (Ha, I did the thing where I read the comments before the post,
| and the post describes how to do this. So what's still
| missing?)
| tptacek wrote:
| I'm not sure how reasonable it is for us to run systemd-
| resolved (or systemd itself) on our VMs. We're trying to
| provide a clean environment for any random container to run
| on; we provide our own init, and that's almost the whole of
| it.
|
| What, I think, you really care about here is glibc's
| behavior. But then, you can't depend on any one libc, either.
| YarickR2 wrote:
| It's unreasonable to bring the whole cow just to get a
| gallon of milk. Use a standalone resolver.
| recuter wrote:
| The cliffs of cloud native software defined networking
| insanity await you:
|
| https://github.com/containernetworking/cni
|
| https://github.com/firecracker-microvm/firecracker-go-sdk
| Firecracker, by design, only supports Linux tap devices. The SDK
| provides facilities to:
|
| - Attach a pre-created tap device, optionally with static IP
|   configuration, to the VM. This is referred to as a "static
|   network interface".
| - Create a tap device via CNI plugins, which will then be
|   attached to the VM automatically by the SDK. This is referred
|   to as a "CNI-configured network interface".
| mdlowman wrote:
| It depends on what you're using for the resolver. I'm assuming
| you only care about gethostbyname(3) and friends. With glibc
| that means nss; generally you're also looking at libnss_dns.so,
| which uses glibc's resolv (copied from BIND). This doesn't
| include enough configuration to do what you suggest; it pretty
| much just points everything towards a server.
|
| So you have two options: use a different NSS module (maybe
| write your own?) or have a proxy DNS resolver that sends
| different requests to different places.
|
| systemd-resolved actually handles the first option pretty well
| (although it would prefer that you use the dbus interface over
| gai). It can handle multiple interfaces with separate domains
| and split DNS fairly well! (Not so good with reverse DNS,
| unfortunately. But I get it, reverse DNS is pretty hacky
| anyways.)
|
| If you prefer the forwarder route, dnsmasq seems to be fairly
| popular these days in the embedded world and elsewhere.
|
| If I were you, I think I'd write a short NSS module or use
| dnsmasq, depending upon your needs.
| bonzini wrote:
| The most interesting takeaway from this article is that,
| according to people who actually do the work, NetworkManager and
| systemd-resolved do get things right.
| kevinoid wrote:
| I agree! As a non-expert, I've also had good luck with
| NetworkManager, but wouldn't recommend systemd-resolved yet if
| you use DNSSEC:
|
| https://github.com/systemd/systemd/issues/6490 (fixed in v248)
|
| https://github.com/systemd/systemd/issues/8451
|
| https://github.com/systemd/systemd/issues/9867
|
| https://github.com/systemd/systemd/issues/12388
| YarickR2 wrote:
| Mostly right. The article never mentions /etc/hosts, which is
| still largely a thing, and works wonders in difficult cases
| (and makes other difficult cases much worse to debug)
| vetinari wrote:
| Because /etc/hosts is not really part of the chain handled by
| the resolver or nss-dns. It is handled by a different NSS module,
| which usually has higher priority.
| liveoneggs wrote:
| one reason resolv.conf is/was so sticky is that gethostbyname
| (and friends) is a libc thing and pretty low-level
| corty wrote:
| Problem is, modern browsers do things differently yet again.
| So the insanity continues, just on different levels.
| liveoneggs wrote:
| browsers are operating systems in user space. They will
| continue to replace kernel parts.
| vetinari wrote:
| Technically, the existing dns resolving mechanism is a
| part of glibc, so it is userspace already.
| denysvitali wrote:
| Ok, now let's talk about how messed up the same thing is in Mac
| OS.
|
| Hint: it is waaay worse
| xrd wrote:
| Dave, one of the authors of these posts, has been regularly
| documenting on Twitter his struggles with DNS on all the
| platforms.
|
| If you like sports, it is like watching a basketball game, with
| three teams and Dave as the play by play announcer.
|
| I'm a linux user, so I'm always cheering when OSX or Windows
| "loses" but then very quickly Dave will, paraphrasing say:
| "And, OSX hits a deep 3 pointer..." ("And, OSX does this right,
| and Linux sucks").
|
| But, it is great theater and very informative. On Dave's
| recommendation, I switched to systemd-resolved, and it is
| working flawlessly for me (with my own wireguard setup, not
| using Tailscale yet).
| dave_universetf wrote:
| It's a pity, because macOS got the general idea right, but
| seemingly every single particular wrong thereafter.
|
| In general, a modern DNS client wants:
|
| - a set of "default route" resolvers;
| - a set of "DNS routes" that point certain suffixes to other
|   resolver configs;
| - a set of search paths to expand single-label queries;
| - integration with mdns and LLMNR, for seamless zero-config
|   resolution on LANs (super important for printers, in
|   particular);
| - all of the above tied to interface lifetimes, so you can tie
|   resolver reachability to underlying network state;
| - very detailed documentation on the algorithm used to resolve a
|   name, and how you traverse all the above configuration.
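|
| To make the search-path item concrete, here is a simplified sketch
| modeled on the classic resolv.conf `search` and `ndots` behavior:
| names with fewer than `ndots` dots are expanded through the search
| list before being tried as-is. The function name and the sample
| domain are hypothetical, and real resolvers differ in the details.

```python
def candidates(name, search, ndots=1):
    """List the fully qualified names to try for `name`, in order."""
    if name.endswith("."):
        return [name]  # already absolute, no expansion
    expanded = [name + "." + domain + "." for domain in search]
    absolute = name + "."
    # Names with at least `ndots` dots are tried as-is first.
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# Made-up search list with a single corporate domain.
print(candidates("printer", ["corp.example"]))
print(candidates("www.example.com", ["corp.example"]))
```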
|
| macOS has default resolvers, DNS routes, mdns integration (but
| no LLMNR), interface-tied configs, and knows about search
| paths.
|
| But then you look at the NetworkExtension API for configuring
| DNS, and it turns out the search paths field doesn't actually
| configure the search paths in ways you'd expect, instead all
| the suffixes you install as "routes" end up also becoming
| search paths, and your only option is have all or none of them
| be search paths. Meanwhile, the search paths you specified do
| get installed... In an interface-scoped config that doesn't
| actually get used in the majority of name lookups that need
| name expansion.
|
| It's so frustrating because it's _this_ close to being
| excellent, and instead ends up being the most limiting of APIs
| we have to work with, because being Apple, it's either their
| API or go screw yourself and don't configure DNS.
|
| Oh dear, I've ranted again, haven't I. Anyway, every OS is its
| own beautiful little snowflake of weirdery and brokenness.
| Linux's particular flavor is "there's 15 ways to do it, most of
| which require polyfills". macOS's flavor is "we have an API
| that should be amazing but somehow does the wrong thing almost
| always". Windows's flavor is "we can do really cool things but
| the main source of documentation is people exchanging
| superstitions about registry keys on stack overflow".
|
| Given that choice, I think I prefer linux. It's way more code
| to write to make it work, but at least the code can be derived
| from documentation+source code, and has half a chance of
| working as desired.
| schmichael wrote:
| Where does /etc/nsswitch.conf factor in? I don't see it
| mentioned.
| LukeShu wrote:
| nsswitch.conf tells it whether to consult `/etc/hosts` (`hosts:
| files`), `/etc/resolv.conf` (`hosts: dns`), or systemd-resolved
| (`hosts: resolve`); and if multiple of those, in what order
| (`hosts: first second third...`).
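|
| A small sketch of reading that ordering, assuming the usual
| nsswitch.conf layout of one `hosts:` line with space-separated
| sources and optional bracketed action items: it returns the
| sources in lookup order. The function name and the sample file
| contents are made up for illustration.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the `hosts:` line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


# Made-up sample file contents.
sample = """\
passwd: files systemd
hosts:  files resolve [!UNAVAIL=return] dns
"""
print(hosts_sources(sample))  # ['files', 'resolve', 'dns']
```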
| daenney wrote:
| It's probably omitted as nsswitch doesn't affect DNS client
| configuration.
|
| It affects which sources are consulted to lookup a name
| (including maybe asking DNS), but it doesn't configure things
| like which DNS servers to ask or what options to set.
|
| > The Name Service Switch (NSS) configuration file,
| /etc/nsswitch.conf, is used by the GNU C Library and certain
| other applications to determine the sources from which to
| obtain name-service information in a range of categories, and
| in what order.
| LukeShu wrote:
| nsswitch.conf is what tells it whether to use the traditional
| nss_dns (the thing that looks at /etc/resolv.conf), or
| whether to use the newer nss_resolve (the thing that talks to
| systemd-resolved).
|
| So yeah, if the article is contrasting resolv.conf vs
| systemd-resolved, nsswitch.conf is how you select which of
| those approaches is used.
| daenney wrote:
| The article focussed on how /etc/resolv.conf is being
| managed and populated. That's not affected by nsswitch.
|
| I only addressed why nsswitch was likely not mentioned, not
| that it doesn't serve a purpose.
| vetinari wrote:
| Even with nss_resolve, you still have to have
| /etc/resolv.conf correctly populated, because some apps
| will ignore gethostbyname() and just parse resolv.conf and
| do DNS by themselves. For example, golang stdlib does that
| (but to the credit of golang runtime, it checks whether
| nsswitch.conf is in expected state and falls back to glibc
| if it is not).
|
| Some apps go even further and do their DNS entirely on
| their own (Firefox DoH controversy).
|
| nsswitch however doesn't just configure mechanism for
| public resolver, it is a hook where alternate mechanisms
| like nss-mdns or nss-mymachines can hook up.
| tgbugs wrote:
| An enlightening article that explains why I think I am losing my
| mind every time I try to get the network configured correctly on
| any number of distros. My main take home is to never install
| NetworkManager or a resolvconf and to avoid systemd if at all
| possible, unless I absolutely know that I am going to need those
| state of the art DNS capabilities.
|
| I have mostly managed to avoid resolv.conf issues since the
| default Gentoo image has a sane setup (even if using dhcp on a
| laptop and switching wifi and wired). However, every single time
| I have tried to set up a system using another distro something
| has gone wrong, and in support of the theory that NetworkManager
| is a major contributor to the problem the one Gentoo system with
| NetworkManager installed had the same issues.
|
| To echo the plea in the original article, in nearly every case
| the primary challenge has been to figure out exactly what
| documentation actually applies to the system at hand because
| distros seem to change this completely out of sync with any
| attempt to correct or align the documentation.
|
| I suspect that on an individual user level this leads to a happy
| path situation where everyone who does the right thing by
| accident is quiet and the ones who need something slightly
| different are never heard from because they weren't able to even
| connect to the internet (probably not quite that bad).
| vetinari wrote:
| That's exactly the wrong take home; the only sane way to handle
| dns is systemd-resolved, as the article said. If you use your
| machine as anything resembling desktop/laptop, you should
| configure your network using Network Manager, especially if you
| have connections that come and go. Switching between Lan and
| wifi with dhcp is really low bar.
|
| Anything else is just prolonging the agony that the linux
| networking configuration had to endure for years.
| bkus wrote:
| But wait, there's more! nsswitch.conf and nscd.conf also affect
| the stub resolver's behavior.
___________________________________________________________________
(page generated 2021-04-15 23:01 UTC)