[HN Gopher] Musl 1.2.4 adds TCP DNS fallback
___________________________________________________________________
Musl 1.2.4 adds TCP DNS fallback
Author : goranmoomin
Score : 164 points
Date : 2023-07-30 16:39 UTC (6 hours ago)
(HTM) web link (www.openwall.com)
(TXT) w3m dump (www.openwall.com)
| joelverhagen wrote:
| Really happy to see this. This caused random NuGet package
| restore issues when the CNAME chain for api.nuget.org exceeded a
| certain length.
|
| https://github.com/NuGet/NuGetGallery/issues/9396
|
| Our CDN provider ended up having a shedding mode in some hot
| areas that made the chain exceed the limit from time to time. Our
| multi-CDN setup saved us, so we could do geo-specific failovers.
| [deleted]
| [deleted]
| InvaderFizz wrote:
| Glad to see this finally come to fruition.
|
| This has been an issue plaguing Alpine for years, where the musl
| maintainer basically said the standard says "may" fall back, not
| "must" fall back. Let the rest of the internet change, we don't
| feel this is important. We're standards compliant.
|
| The change gained traction with the latest DNS RFC last year,
| which made TCP fallback mandatory [0]. The cloak of standards
| compliance could no longer be used.
|
| 0: https://datatracker.ietf.org/doc/html/rfc9210
| LexiMax wrote:
| > This has been an issue plaguing Alpine for years, where the
| musl maintainer basically said the standard says "may" fall
| back, not "must" fall back. Let the rest of the internet change,
| we don't feel this is important. We're standards compliant.
|
| I've never really understood the underlying mentality that
| causes maintainers to bend over backwards to _not_ provide
| popular functionality like this.
|
| Reminds me a bit of the strlcpy/strlcat debacle with glibc.
| yjftsjthsd-h wrote:
| > I've never really understood the underlying mentality that
| causes maintainers to bend over backwards to _not_ provide
| popular functionality like this.
|
| Probably trying to prevent feature creep; musl's home page
| describes it as
|
| > musl is lightweight, fast, simple, free, and strives to be
| correct in the sense of standards-conformance and safety
|
| And every single feature they add makes those things harder.
| tptacek wrote:
| In this case it's not so much "popular functionality" as
| "core mechanism of the protocol". TCP DNS isn't a quality of
| life feature; it's necessary in order to look up record sets
| with e.g. lots of IPv6 addresses, because UDP DNS has a
| sharply limited response size.
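
To make the mechanism concrete: a stub resolver sends the query over UDP, checks the TC ("truncated") bit in the reply header, and repeats the same query over TCP with a two-byte length prefix if the bit is set. The following is a minimal Python sketch of that flow, not musl's actual implementation; the function names (`build_query`, `is_truncated`, `lookup`) and the fixed query ID are assumptions made for illustration.

```python
import socket
import struct

TC_FLAG = 0x0200  # "truncated" bit in the DNS header flags word (RFC 1035)


def build_query(name: str, qtype: int = 28, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (qtype 28 = AAAA, 15 = MX, 1 = A)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Question: encoded name, root label, type, class IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def is_truncated(response: bytes) -> bool:
    """True when the server set TC, i.e. the answer did not fit in UDP."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("DNS server closed the TCP connection early")
        buf += chunk
    return buf


def lookup(name: str, server: str, qtype: int = 28, timeout: float = 3.0) -> bytes:
    """Return the raw DNS response, retrying over TCP if UDP was truncated."""
    query = build_query(name, qtype)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(512)  # classic UDP DNS size limit
    if not is_truncated(response):
        return response
    # TCP fallback: same query, framed with a 2-byte big-endian length.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

A resolver without the second half of `lookup` can only hand back whatever fit in the truncated UDP reply, which is the failure mode described in this thread. The sketch omits EDNS0, retries, and response parsing.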
| geocar wrote:
| Why would anyone want an RRset with more than 20 AAAA RRs?
|
| What would they be doing with it that they couldn't do
| better (faster, fewer bugs, less risk) another way?
| marcus0x62 wrote:
| I'm sure they have their reasons. Some might even be
| good. Here's something I _know_ I want: for my resolver
| to resolve valid DNS queries, no matter how stupid I
| believe the remote zone's maintainers to be.
| [deleted]
| tptacek wrote:
| Sorry, I missed the spot in the DNS RFCs where they said
| RRsets are limited to 20 records. Got a reference?
| Thanks!
| akira2501 wrote:
| People confuse purity with quality all the time.
| baq wrote:
| And since we're talking about Debian, stability with
| stagnation.
| nineteen999 wrote:
| The MX records for major email providers at the time (eg.
| Yahoo) didn't even fit into a UDP DNS packet back in 2002.
|
| That they only just implemented this is a joke, and Alpine/musl
| users are the punchline.
| yjftsjthsd-h wrote:
| > The cloak of standards compliance could no longer be used
|
| Or in other words they've always been standards-compliant, and
| when the standard changed they updated to match it.
| fanf2 wrote:
| TCP support has been MUST implement for stub resolvers since
| 2016, so musl is several years late:
| https://www.rfc-editor.org/rfc/rfc7766#page-6
| kelnos wrote:
| Once I started reading about these issues a few years ago, I
| stopped using Alpine as a container base image, and started
| using Debian (the 'debian-slim' variant). Slim is still larger
| than Alpine, but not by a lot, and does contain some extra
| functionality in the base image that's useful for debugging
| (most of which can be fairly easily removed for security
| hardening). Debugging random DNS issues is difficult enough;
| there's no need to make it harder by using intentionally-faulty
| software.
|
| While I wouldn't call myself a fan of Postel's Law (I think
| "being liberal with what you accept" can allow others to get
| away with sloppy implementations over long time periods,
| diluting the usefulness of standards and specifications), I
| think at some point you have to recognize the reality of how
| things are implemented in the real world, and that refusing to
| conform to the de-facto standard hurts your users.
|
| The fact that the maintainer only caved because the TCP
| fallback behavior is finally being made mandatory, and not
| because he's (very belatedly) recognizing he's harming his
| users with his stubbornness, also speaks volumes... and not in
| a good way.
| fefe23 wrote:
| [flagged]
| xyzzy_plugh wrote:
| Given the choice between maintaining a fork of a C standard
| library implementation and switching to an implementation
| that doesn't have this issue, the choice is pretty clear.
|
| It was pretty clear that musl was unwilling to support adding
| any TCP fallback code path regardless of a patch existing.
|
| Anyways, your comment is inflammatory and is full of straw-
| men. Try being less of a dick.
| fefe23 wrote:
| [flagged]
| xyzzy_plugh wrote:
| I don't use musl, so I don't have a horse in this race,
| but I wouldn't use it anyways because of issues like this
| one.
|
| Do they owe me anything? Of course not. Is anyone here
| claiming anything to the contrary? No. So what are you on
| about, exactly?
|
| This is the comment section on a website. Here we are
| discussing that the project maintainers kinda cocked this
| one up from a reputation and user trust perspective. If
| reputation, user trust and thus adoption are not a
| concern or goal of the project, then cool, good for them.
| It doesn't mean, however, that there isn't some
| reputation or user trust that was eroded. And it
| certainly doesn't mean we can't discuss it.
|
| For you to reply "what are you a child, just fork it and
| grow a pair, fix it yourself" is not constructive nor
| does it contribute to the discourse. Or did I put words
| in your mouth? Hm I wonder how that feels.
| vidarh wrote:
| Yikes. The first time I ran into an issue where email delivery
| for a major provider _was impossible_ without TCP fallback was
| 23 years ago. To treat this as optional this long is
| ridiculous.
| brmgb wrote:
| For all the good things Musl does, its authors can often be
| downright ridiculous out of a dogmatism which is very
| unpleasant to deal with as a user. See also their stubborn
| refusal to give you an easy way to detect that the libc is
| Musl while compiling.
| yjftsjthsd-h wrote:
| Well, if musl _is_ standards compliant, then wouldn't it
| make sense to detect when you're building on something
| different rather than musl? Like, if the standard didn't
| require TCP fallback, then in theory the application should
| handle that and only skip it if it detects itself being
| built on a libc with non-standardized extensions to obviate
| the effort.
| zajio1am wrote:
| Standards are not laws, they are reasonably well-written
| documents describing behavior that is supposed to be
| interoperable. If, for some reason, that is not true
| (perhaps because the standard is ambiguous, obsolete or
| some other reason), ensuring that there is
| interoperability is more important than just keeping it
| to the letter of the standard.
|
| See RFC 1925, The twelve networking truths:
|
| (1) It has to work
| brmgb wrote:
| Because as we all know the standard never changes
| potentially introducing new features, allows no
| variability and bugs don't exist.
|
| It must be nice living in your theoretical world. Here
| in the real world, I like libraries to actually help me
| do what I want to do rather than be needlessly annoying
| to take a stand no one actually cares about. But that's
| me. Clearly it serves Musl right given the massive amount
| of use it's seeing. Hmm, wait a minute...
| yjftsjthsd-h wrote:
| Okay, so nobody uses it and you can stop complaining
| because it'll never come up.
| brmgb wrote:
| Well I would like to use it. Its code is mostly sane,
| easier to understand than glibc and it would make sense
| in a lot of places where performance is not critical.
| That's why I am so annoyed by the nonsense its
| developers keep pulling out. It's an incomprehensible
| situation because they get absolutely nothing out of it.
| They are basically annoying people for the sake of it.
| It's incredible that changes like the one we are
| commenting on which are just sanity being restored have
| to come from an evolution of the standard.
| 0xcde4c3db wrote:
| > See also their stubborn refusal to give you an easy way
| to detect that the libc is Musl while compiling.
|
| I suspect that the primary use cases for that would be
| applying half-baked workarounds and refusing to work with
| an "unsupported" libc, much like HTTP User-Agent.
| brmgb wrote:
| Or, you know, taking into account the stupid things Musl
| does like not having TCP DNS fallback.
|
| I'm only half joking. Feature detection in Musl is
| generally a pain. I don't blame developers for sometimes
| taking the path of least resistance and not trying to
| support it.
| geocar wrote:
| Email servers cannot use gethostbyname anyway; they never
| would have been affected by this issue.
| vidarh wrote:
| Musl also includes res_send() etc., which _can_ be used for
| MX records, and e.g. Qmail _did_ use that back then. Here's
| the commit adding TCP support to those functions in
| Musl.
|
| http://git.musl-libc.org/cgit/musl/commit/src/network/res_ms...
|
| AOL was the specific case we ran into that at the time (ca.
| 2000) returned either MX or A records (can't remember which
| caused the problem) that required more than 512 bytes.
| geocar wrote:
| I wasn't aware musl implemented libres. I don't have a
| problem with libres doing TCP.
|
| Qmail's issue was a hardcoded buffer size so this patch
| to musl wouldn't have helped.
| kstrauser wrote:
| I'm skeptical of that. Why do you say so?
| vidarh wrote:
| gethostbyname() only returns an address, it can't be used
| to e.g. query for MX records. You also lose control over
| retries to different addresses if you use
| gethostbyname(), which some mail server software will
| also care about.
|
| However as I noted in my other reply, musl also
| implements the res_* functions, like res_send() etc., and
| _those_ can be used to address both.
| robinhoodexe wrote:
| We're currently consolidating all container images to run on
| Debian-slim instead of a mixture of Debian, Ubuntu and alpine.
| Sure, alpine is small, but with 70% of our 500 container images
| being used for Python, R or node the final image is so large (due
| to libraries/packages), that the difference between alpine (~30
| MB) and debian-slim (~80) is negligible. We've been experiencing
| the weird DNS behaviour of alpine and other issues with musl as
| well. Debian is rock solid and upgrading to bookworm from
| bullseye and even buster in many cases didn't cause any problems
| at all.
|
| I will admit though, that Debian-slim still has some non-
| essential stuff that usually isn't needed at runtime, a shell is
| still neat for debugging or local development. This trade off
| could be considered a security risk, but it's rather simple to
| restrict other stuff at runtime (such as run as non-privileged,
| non-root user with all capabilities dropped and a read-only file
| system except for /tmp).
|
| It's a balancing act between ease-of-use and security. I don't
| think I'd get popular with the developers by forcing them to use
| "FROM scratch" and let them figure out exactly what their
| application needs at runtime and what stuff to copy over from a
| previous build stage.
| adolph wrote:
| > the difference between alpine (~30 MB) and debian-slim (~80)
|
| Given that it's a different layer, your container runtime
| isn't going to redownload the layer anyway, right?
| kstrauser wrote:
| And even if it did, in an ancient data center that only uses
| gigabit ethernet, that's only a .5s longer download. And even
| a $4 DigitalOcean server comes with 10GB of storage, so that
| 50MB is only 1/200th of the instance's store. (I'd also bet
| that nearly no one uses instances that tiny for durable
| production work where 50MB is going to make a difference.)
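
The back-of-the-envelope numbers above can be checked in a few lines. This is only a sketch of the arithmetic: it assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), ignores protocol overhead and image compression, and the helper names are made up for illustration.

```python
def transfer_seconds(size_mb: float, link_gbps: float) -> float:
    """Seconds to move size_mb megabytes over a link_gbps link, no overhead."""
    bits = size_mb * 8_000_000  # decimal megabytes -> bits
    return bits / (link_gbps * 1_000_000_000)


def fraction_of_disk(size_mb: float, disk_gb: float) -> float:
    """Share of a disk_gb disk taken up by size_mb megabytes."""
    return size_mb / (disk_gb * 1000)


extra_mb = 80 - 30  # debian-slim (~80 MB) minus alpine (~30 MB)
print(transfer_seconds(extra_mb, 1.0))  # about 0.4 s on gigabit ethernet
print(fraction_of_disk(extra_mb, 10))   # about 0.005, i.e. 1/200 of 10 GB
```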
| robinhoodexe wrote:
| Exactly. Part of the appeal to consolidate all of our
| container images to use Debian-slim is the ability to
| optimise the caching of layers, both in our container
| registry but also on our kubernetes cluster's nodes (which
| can be done in a consistent manner with kube-fledged[1]).
|
| [1] https://github.com/senthilrch/kube-fledged
| robertlagrant wrote:
| Thanks for that - that operator sounds extremely useful!
| leononame wrote:
| Can you point me to where to look for more details on securing
| a container? I'm a developer myself, and for me, the main
| benefit of containers is being able to deploy the app myself
| easily because I can bundle all the dependencies.
|
| What would you suggest I restrict at runtime and can you point
| me to a tutorial or an article where I can go have a deeper
| read on how to do it?
| vbezhenar wrote:
| Basically you need to work as unprivileged user and with
| immutable file system (of course you can have ephemeral /tmp
| or persistent /data, but generally the entire system should
| be treated as read-only).
| robinhoodexe wrote:
| What you want to read is the kubernetes pod security context
| fields[1].
|
| In your Dockerfile, add a non-root user with UID and GID
| 1000, then in the end of your Dockerfile, right before a CMD
| or ENTRYPOINT you change to that user.
|
| In your kubernetes yaml manifest, you can now set
| runAsNonRoot to true and runAsUser & runAsGroup to 1000.
|
| Then there's the privileged and allowPrivilegeEscalation
| fields, these can nearly always be set to false unless you
| actually need the extra privileges (such as using a GPU) on a
| shared node.
|
| Then there's seccomp profiles and the system capabilities. If
| you can run your container as the non-root user you've
| created, and you don't need the extra privileges then these
| can safely also be set to the most restricted. Non-privileged
| non-root is the same as having all capabilities dropped.
|
| The tricky one is the readOnlyRootFilesystem field. This
| includes /tmp, which is considered a global writeable
| directory, so the workaround is to make an in-memory volume
| and mount it at /tmp to make it writable. Likewise, your
| $HOME/.cache and $HOME/.local directories (for the user you
| created in your Dockerfile) are usually used by third party
| packages, so creating mounts here can be useful as well (if
| for some reason you can't point it to /tmp instead).
|
| [1] https://kubernetes.io/docs/tasks/configure-pod-container/sec...
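
The settings described above fit together roughly as follows. This Python sketch builds a Pod manifest as a plain dictionary and prints it as JSON, which kubectl accepts alongside YAML. The pod name, image reference, and the `hardened_pod` helper are hypothetical; the field names are the standard Kubernetes security context fields, and UID/GID 1000 assumes the Dockerfile created that user as suggested.

```python
import json


def hardened_pod(name: str, image: str, uid: int = 1000) -> dict:
    """Return a Pod manifest with a restrictive security context."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level settings apply to every container in the pod.
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": uid,
                "runAsGroup": uid,
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "securityContext": {
                        "privileged": False,
                        "allowPrivilegeEscalation": False,
                        "readOnlyRootFilesystem": True,
                        "capabilities": {"drop": ["ALL"]},
                    },
                    # /tmp stays writable through the in-memory volume below.
                    "volumeMounts": [{"name": "tmp", "mountPath": "/tmp"}],
                }
            ],
            "volumes": [{"name": "tmp", "emptyDir": {"medium": "Memory"}}],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image; pipe the output to `kubectl apply -f -`.
    manifest = hardened_pod("hardened-app", "registry.example.com/app:1.0")
    print(json.dumps(manifest, indent=2))
```

Additional writable paths such as $HOME/.cache can be handled the same way, with one more volume and mount per path.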
| 4oo4 wrote:
| This was really helpful for me:
|
| https://cheatsheetseries.owasp.org/cheatsheets/Docker_Securi...
| Rapzid wrote:
| This here. Honestly most orgs with uhh.. Let's say a more
| mature sense of ROI tradeoffs were doing this from pretty much
| the very beginning.
|
| Also, Ubuntu 22.04 is only 28.17MB compressed right now so it
| looks equiv to debian-slim. There are also these new image
| lines, I can't recall the funky name for them, that are even
| smaller.
| kstrauser wrote:
| I'm pushing to go back to Debian from Ubuntu. Canonical's
| making decisions lately that don't appeal to me, and
| especially on the server I don't see a clear advantage for
| Ubuntu vs good ol' rock solid Debian.
| senknvd wrote:
| > There are also these new image lines, I can't recall the
| funky name for them, that are even smaller.
|
| You might be thinking of the chiselled images. An interesting
| idea but very much incomplete[1].
|
| [1]: https://github.com/canonical/chisel-releases/issues/34
| synergy20 wrote:
| why is it so much smaller than debian slim?
| yjftsjthsd-h wrote:
| Fewer features (including less "we've supported doing that
| for 20 years and we're not cutting it now"), packages
| separated into parts (ex. where debian might separate "foo"
| into "foo" for the main package and "foo-dev" for headers,
| alpine will also break out "foo-doc" with manpages and
| such), general emphasis on being small rather than full-
| featured.
| Rapzid wrote:
| It's not, Debian slim is ~28.8MB compressed.
| kachnuv_ocasek wrote:
| Do you have any tips regarding building R-based container
| images?
| robinhoodexe wrote:
| R is kinda difficult and I haven't cracked this one.
| Currently we're using the rocker based ones[1] but they are
| based on Ubuntu and include a lot of stuff we don't need at
| runtime. I'll look into creating a more minimal R base image
| that's based on Debian-slim.
|
| [1] https://github.com/rocker-org/rocker-versioned2
| nickjj wrote:
| I made the switch too around 4ish years ago. It has worked out
| nicely and I have no intention of moving away from Debian Slim.
| Everything "just works" and you get to re-use any previous
| Debian knowledge you may have picked up before using Docker.
| galangalalgol wrote:
| I've run into the same thing for large dev images, but using
| pure rust often means that musl allows for a single executable
| and a config file in a from scratch container for deployment.
| In cases where a slab or bump allocator are used, musl's
| deficiencies seem minimized.
|
| That means duplication of musl in lots of containers, but when
| they are all less than 10MB it's less of an issue. Linking
| against gnu libraries might get the same executable down to
| less than 2MB but you'll add that back and more for even the
| tiniest gnu base images.
| synergy20 wrote:
| What is pure rust? Rust needs its own libraries, which are a
| few MB as I recall.
| galangalalgol wrote:
| By pure, I just meant no dependencies on c or c++ bindings
| other than libc. If that is the case you can do a musl
| build that has no dynamic dependencies, as all rust
| dependencies are static. So then your only dependency is
| the kernel, which is provided via podman/docker. A decent
| sized rust program with hundreds of dependencies I can get
| to compile down to 1.5MB. But that is depending on gnu. So
| if you had 4 or 5 of those on a node, it might be less data
| to use one gnu base image that is really small like rhel
| micro, and build rust for gnu. But if you have cpu hungry
| services like I do, then you usually have only a couple per
| node, so from scratch musl can be a bit smaller.
| phamilton wrote:
| > cpu hungry services
|
| Have you benchmarked musl vs glibc in any way? Data I've
| seen is all over the place and I'm curious about your
| experience.
| jonwest wrote:
| In the same boat here as well. Especially when you're talking
| about container images using JavaScript or other interpreted
| languages that are bundling in a bunch of other dependencies,
| the developer experience is much better in my experience given
| that more developers are likely to have had experience working
| in a Debian based distro than an Alpine based one.
|
| Especially when you're also developing within the container as
| well, having that be unified is absolutely worth the
| convenience, and honestly security and reliability as well. I
| realize that a container with less installed on it is
| inherently more secure, but if the only people who are familiar
| with the system are a small infrastructure/platform/ops type of
| team, things are more likely to get missed.
| richardwhiuk wrote:
| This was nearly three months ago?
| Arnavion wrote:
| Yes, and for the people who link to musl dynamically in their
| Alpine containers, it's also in Alpine 3.18
| yjftsjthsd-h wrote:
| And https://alpinelinux.org/posts/Alpine-3.18.0-released.html
| puts that also nearly 3 months ago, FWIW.
| jake_morrison wrote:
| I use distroless images based on Debian or Ubuntu, e.g.,
| https://github.com/cogini/phoenix_container_example
|
| The result is images the same size as Alpine, or smaller, without
| the incompatibilities. I think Alpine is a dead end.
| suprjami wrote:
| I hadn't heard of "distroless" before. Confusing name for a
| container with just main process runtimes, but neat idea.
|
| https://github.com/GoogleContainerTools/distroless
| ecliptik wrote:
| While I'm glad this is finally addressed, this limits the
| usefulness of one of my favorite interview questions.
|
| Asking about Alpine in a production environment was always a good
| way of telling those who have container experience of watching
| C-Beams glitter in the dark from those who only just read a "10
| Docker Container Tricks; #8 will blow your mind!" blog post from
| 2017.
| vbezhenar wrote:
| I'm using alpine containers for two years on a moderately sized
| cluster and I've yet to encounter any issues caused by it.
| baq wrote:
| Famous last words right here.
|
| It's usually the case that everything works until it doesn't.
| When it's DNS that doesn't work, good luck debugging it
| unless you've got war stories to tell.
| InvaderFizz wrote:
| It's still going to be pretty common for at least a few years,
| and the now incorrect assumption that it is still broken I'm
| sure will persist for a decade or more among those who have
| been burned and thus moved on from Alpine and do not follow it.
|
| DNS is a fun rabbit hole for interviews, for sure.
|
| My favorite one to see on a resume is NIS. If you are listing
| NIS and don't have horror stories or other things to say about
| NIS, that's a really good indicator of the value of your
| resume.
|
| I intentionally list NIS on my resume because it is such a fun
| conversation topic to go on about how security models changed
| over time, all the ways NIS is terrible, but also how simple
| and useful it was.
| ecliptik wrote:
| NIS is a good one. I have UUCP as a skill on my resume as an
| easter egg but no one ever asks about it.
|
| For DNS, my favorite interview question goes like this,
|
| _How would you verify DNS is resolving from within a pod on
| Kubernetes?_
|
| After listening to the answer, add some constraints:
|
| 1. Common networking utilities like ping, nslookup, dig, etc
| are not available
|
| 2. Container user is unprivileged
|
| 3. su/sudo do not work
|
| This can lead to some elaborate k8s troubleshooting or the
| simple, and correct, answer of _getent hosts_.
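
Where a Python interpreter happens to be present in the image (an assumption, since many minimal images ship without one), a roughly comparable check is `socket.getaddrinfo`, which also goes through the C library's resolver rather than speaking DNS itself. A minimal sketch, with `resolve` as a hypothetical helper name:

```python
import socket


def resolve(host: str) -> list:
    """Return the distinct addresses the system resolver gives for host."""
    # Restricting to TCP avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for info in infos:
        address = info[4][0]  # sockaddr is the fifth field; the IP comes first
        if address not in addresses:
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    import sys

    # Example: python3 check_dns.py kubernetes.default.svc.cluster.local
    try:
        print(resolve(sys.argv[1]))
    except socket.gaierror as err:
        print(f"resolution failed: {err}")
```

Because this exercises the same libc code path as the application, it reflects musl or glibc resolver behaviour in a way that a standalone tool such as dig would not.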
| cmeacham98 wrote:
| After constraint 1, this devolves to a weird game of "does
| the interviewee realize I don't consider `getent hosts` to
| be a 'common networking utility' so it's still available?"
| geocar wrote:
| I still use NIS because hosts files are faster than DNS.
| yjftsjthsd-h wrote:
| > Asking about Alpine in a production environment was always a
| good way of telling those who have container experience of
| watching C-Beams glitter in the dark from those who only just
| read a "10 Docker Container Tricks; #8 will blow your mind!"
| blog post from 2017.
|
| I dunno, I've been running containers in prod for a while now
| and I don't recall Alpine being a problem. Maybe it varies by
| your workload?
| stefan_ wrote:
| glibc also has some fun behavior that few people know about
| because (1) distributions have been patching it and nobody ever
| actually ran the upstream version and/or (2) downstream
| software is papering it over:
|
| https://github.com/golang/go/issues/21083
| nathants wrote:
| i've stopped using containers because they are annoying.
|
| i've started using alpine on my laptop and ec2 because it's not.
|
| different strokes, different folks.
| develatio wrote:
| IIRC this was causing some exotic problems when deploying docker
| images based on musl.
| tyingq wrote:
| I think there are also still some potential problems because it
| still does some things differently than glibc. Musl defaults to
| parallel requests if you define more than one nameserver
| (multiple --dns=, for example, for the docker daemon)...where
| glibc uses them in the order you provide them.
|
| To be clear, that's not "wrong", but just different maybe from
| what docker was expecting.
| LaLaLand122 wrote:
| I really hope it does some things differently than glibc:
| https://sourceware.org/bugzilla/show_bug.cgi?id=19643
|
| In any case glibc uses NSS, so what glibc does depends on the
| configuration. It may well just forward the request to
| systemd-resolved.
___________________________________________________________________
(page generated 2023-07-30 23:00 UTC)