[HN Gopher] More Memory Safety for Let's Encrypt: Deploying ntpd-rs
       ___________________________________________________________________
        
       More Memory Safety for Let's Encrypt: Deploying ntpd-rs
        
       Author : Dunedan
       Score  : 110 points
       Date   : 2024-06-24 17:23 UTC (5 hours ago)
        
 (HTM) web link (letsencrypt.org)
 (TXT) w3m dump (letsencrypt.org)
        
       | skilled wrote:
       | > When we look at the general security posture of Let's Encrypt,
       | one of the things that worries us most is how much of the
       | operating system and network infrastructure is written in unsafe
       | languages like C and C++.
       | 
       | Here we go... I struggle to understand why they would say that as
       | the opening statement in such a matter-of-fact manner.
       | 
       | Drinking the Rust kool-aid by the sound of it.
        
         | mianosm wrote:
         | The Jonestown massacre was actually grape flavor-aid:
         | 
         | https://www.vox.com/2015/5/23/8647095/kool-aid-jonestown-fla...
         | 
         | They really do appear to be all in on avoiding memory leaks
         | from C/CPP:
         | 
         | > Over the next few years we plan to continue replacing C or
         | C++ software with memory safe alternatives in the Let's Encrypt
         | infrastructure: OpenSSL and its derivatives with Rustls, our
         | DNS software with Hickory, Nginx with River, and sudo with
         | sudo-rs. Memory safety is just part of the overall security
         | equation, but it's an important part and we're glad to be able
         | to make these improvements.
         | 
         | It seems like a really challenging endeavor, but I appreciate
         | their desire to maintain uptime and a public service like they
         | do.
        
         | _flux wrote:
         | Is it completely unwarranted, though? It seems most of the
         | issues listed here are indeed memory safety bugs that are more
         | difficult to pull off in memory-safe languages such as Rust:
         | https://www.cvedetails.com/vulnerability-list/vendor_id-2153...
        
         | pjmlp wrote:
         | Because since the Morris Worm in 1988, there are still plenty
         | of network-facing services that keep being written in C and
         | C++, and without the necessary sanitary precautions.
         | 
         | True, there are plenty of alternatives for many of those
         | networking services, not necessarily Rust.
        
           | danudey wrote:
           | I hate writing code in golang, because I have to pepper every
           | single function call with `if err != nil`, but then I think
           | about how much C code I've seen that doesn't do that and I
           | wonder how much of it should.
        
         | tialaramex wrote:
         | Correctness matters, in their particular game that's especially
         | true although I'm doubtful of common insistence that it's
         | better for this or that software to be fast than correct.
         | 
         | Rust is _really good_ for correctness. Take "Hello, World",
         | the obvious toy program. Someone tried giving it various error
         | states instead of (as would be usual) a normal happy terminal
         | environment. In C or C++ the canonical "Hello, World" program
         | terminates successfully despite any amount of errors, it just
         | doesn't care about correctness.
         | 
         | The default Rust Hello World, the one you get out of the box
         | when you make a new project, or you'd show people on a "My
         | First Rust Program" course, will complain about the errors when
         | they happen. Because doing so is correct.
         | 
         | It's the New Jersey style. The priority for these languages was
         | simplicity of implementation. It's more important that you can
         | cobble together a C compiler easily than that the results are
         | useful or worthwhile. This contributed to C's survival, but we
         | pay the price until we give it up.
        
           | akaletF wrote:
           | If _println_ panics if the write fails that is kind of
           | cheating, isn't it? Yes, toy C programs do not check the
           | return value of _printf_. So what.
        
             | dralley wrote:
             | I would venture to say that _most_ C programs don't check
             | the return value of printf, including ones that ought not
             | to be "toys".
        
               | akaletF wrote:
               | I don't know. Most serious programs will use _write()_
               | and then check the return value. In locations where it
               | does not matter (say a test suite that is guaranteed to
               | signal an error but an _fprintf()_ error message could
               | fail in theory) not checking is fine I think.
               | 
               | You will not see the message if you get a panic either
               | ...
        
             | toast0 wrote:
             | Setting up your environment so that you have to check for
             | errors or the errors propagate doesn't seem to be cheating
             | when the complaint is that other environments often omit
             | checking for errors even when it matters.
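
The `println!` point in this subthread can be made concrete. Rust's `println!` panics if writing to stdout fails, while `writeln!` on any `std::io::Write` returns an `io::Result` that the caller has to handle or propagate. The sketch below is only an illustration: `greet` and `FailingWriter` are hypothetical names, and `FailingWriter` stands in for a closed or full output device.

```rust
use std::io::{self, Write};

/// Hypothetical writer that always fails, simulating a full or closed device.
struct FailingWriter;

impl Write for FailingWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "device full"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes the greeting and reports any I/O error to the caller
/// instead of panicking (as `println!` would) or ignoring it.
fn greet<W: Write>(out: &mut W) -> io::Result<()> {
    writeln!(out, "Hello, World")?;
    out.flush()
}
```

A `main` built on this would call `greet` with a locked `io::stdout()` handle and exit with a non-zero status on `Err`, which is roughly the behavior the C version would get from checking the results of `printf` and `fflush`.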
        
         | itishappy wrote:
         | > I struggle to understand why they would say that as the
         | opening statement in such a matter-of-fact manner.
         | 
         | TFA's second sentence explains the facts of the matter along
         | with the flavor of Kool-Aid they stock:
         | 
         | > The CA software itself is written in memory safe Golang, but
         | from our server operating systems to our network equipment,
         | lack of memory safety routinely leads to vulnerabilities that
         | need patching.
        
       | akira2501 wrote:
       | Why does your ntpd have a json dependency?
        
         | danudey wrote:
         | This is a good question to ask, especially in the age of
         | everything pulling in every possible dependency just to get one
         | library function or an `isNumeric()` convenience function.
         | 
         | The answer is that there is observability functionality which
         | provides its results as JSON output via a UNIX socket[0]. As
         | far as I can see, there's no other JSON functionality anywhere
         | else in the code, so this is just to allow for easily querying
         | (and parsing) the daemon's internal state.
         | 
         | (I'm not convinced that JSON is the way to go here, but that's
         | the answer to the question)
         | 
         | [0] https://docs.ntpd-rs.pendulum-project.org/development/code-s...
        
           | motrm wrote:
           | If the pieces of state are all well known at build time - and
           | trusted in terms of their content - it may be feasible to
           | print out JSON 'manually' as it were, instead of needing to
           | use a JSON library,
           | 
           |     print "{"
           |     print "\"some_state\": \"";
           |     print GlobalState.Something.to_text();
           |     print "\", ";
           |     print "\"count_of_frobs\": ";
           |     print GlobalState.FrobsCounter;
           |     print "}";
           | 
           | Whether it's worth doing this just to rid yourself of a
           | dependency... who knows.
        
             | syncsynchalt wrote:
             | Even better to just use TSV. Hand-rolling XML or JSON is
             | always a smell to me, even if it's visibly safe.
        
               | hackernudes wrote:
               | Do you mean TLV (tag-length-value)? I can't figure out
               | what TSV is.
        
               | FredFS456 wrote:
               | Tab Separated Values, like CSV but tabs instead of
               | commas.
        
             | fiedzia wrote:
             | > If the pieces of state are all well known at build time -
             | and trusted in terms of their content
             | 
             | ... then use a library, because you should not rely on the
             | assumption that the next developer adding one more piece to
             | this code will magically remember to validate it against
             | the JSON spec.
        
               | maxbond wrote:
               | No magic necessary. Factor your hand-rolling into a
               | function that returns a string (instead of printing as in
               | the example), and write a test that parses its return
               | value with a proper JSON library. Preferably a property
               | test.
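
As a rough sketch of the "function that returns a string" idea, the hand-rolled output could look like the following. The names `escape_json` and `state_to_json` and the two fields are hypothetical, taken from the pseudocode earlier in the thread, and this is not how ntpd-rs produces its output. The escaping covers the characters JSON requires to be escaped inside strings: the quote, the backslash, and control characters below U+0020.

```rust
use std::fmt::Write;

/// Escapes a string for use inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out
}

/// Builds the JSON document as a string so that a test can parse it.
fn state_to_json(some_state: &str, count_of_frobs: u64) -> String {
    format!(
        "{{\"some_state\": \"{}\", \"count_of_frobs\": {}}}",
        escape_json(some_state),
        count_of_frobs
    )
}
```

Because the function returns a `String`, a unit or property test can feed it arbitrary inputs and check that a real JSON parser accepts the result, which addresses the concern about the next developer forgetting to validate.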
        
             | mananaysiempre wrote:
             | That's somewhat better than assembling, say, HTML or SQL
             | out of text fragments, but it's still not fantastic. A JSON
             | output DSL would be better still--it wouldn't have to be
             | particularly complicated. (Shame those usually only come
             | paired with parsers, libxo excepted.)
        
         | orf wrote:
         | Would you rather it had a JSON dependency to parse a config
         | file, or yet another poorly thought out, ad-hoc homegrown
         | config file format?
        
           | itishappy wrote:
           | It uses TOML for configuration.
        
           | akira2501 wrote:
           | > yet another poorly thought out, ad-hoc homegrown config
           | file format
           | 
           | OpenBSD style ntpd.conf:
           | 
           |     servers 0.gentoo.pool.ntp.org
           |     servers 1.gentoo.pool.ntp.org
           |     servers 2.gentoo.pool.ntp.org
           |     servers 3.gentoo.pool.ntp.org
           | 
           |     constraints from "https://www.google.com"
           | 
           |     listen on *
           | 
           | I mean, there's always the possibility that they used a
           | common, well known and pretty decent config file format. In
           | this particular case, this shouldn't be the thing that
           | differentiates your ntpd implementation anyways.
        
           | amiga386 wrote:
           | Poorly thought out, ad-hoc homegrown config file format,
           | please. Every time.
           | 
           | 1. Code doesn't change at the whims of others.
           | 
           | 2. The _entire_ parser for an INI-style config can be in
           | about 20 lines of C
           | 
           | 3. Attacker doesn't also get to exploit code you've never
           | read in the third party dependency (and its dependencies! The
           | JSON dependency now wants to pull in the ICU library... I
           | guess you're linking to that, too)
           | 
           | 4. Complexity of config file formats is usually format-
           | independent, the feature-set of the format itself only _adds_
           | complexity, rather than takes it away. To put it another way,
           | is _this_ any saner...
           | 
           |     {"user":"ams","host":"ALL","runas":["/bin/ls","/bin/df -h /","/bin/date \"\"","/usr/bin/","sudoedit /etc/hosts","OTHER_COMMANDS"]}
           | 
           | ... than ...
           | 
           |     # I may be crazy mad but at least I can have comments!
           |     ams ALL=/bin/ls, /bin/df -h /, /bin/date "", /usr/bin/, sudoedit /etc/hosts, OTHER_COMMANDS
           | 
           | All the magic in the example is in what those values _are_
           | and what they _imply_; the format doesn't improve if you
           | naively transpose it to JSON.
           | 
           | An example of an NTP server's config:
           | 
           |     # I can have comments too
           |     [Time]
           |     NTP=ntp.ubuntu.com
           |     RootDistanceMaxSec=5
           |     PollIntervalMinSec=32
           |     PollIntervalMaxSec=2048
           | 
           | If you _just_ want key-value pairs of strings/ints, nothing
           | more complex is needed. Using JSON is overdoing it.
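
To give a feel for how small such an INI-style parser can be, here is a sketch in Rust rather than the C the comment mentions. `parse_ini` is a hypothetical name, and the accepted grammar (`#` or `;` comments, `[section]` headers, `key=value` lines) is an assumption about what "INI-style" means here, not a description of any particular daemon's parser.

```rust
use std::collections::HashMap;

/// Parses INI-style text into a map keyed by (section, key).
/// Returns an error message naming the first line that is not understood.
fn parse_ini(text: &str) -> Result<HashMap<(String, String), String>, String> {
    let mut section = String::new();
    let mut values = HashMap::new();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') || line.starts_with(';') {
            continue;
        }
        if let Some(name) = line.strip_prefix('[').and_then(|rest| rest.strip_suffix(']')) {
            // A [section] header applies to the keys that follow it.
            section = name.trim().to_string();
        } else if let Some((key, value)) = line.split_once('=') {
            values.insert(
                (section.clone(), key.trim().to_string()),
                value.trim().to_string(),
            );
        } else {
            return Err(format!("line {}: expected `key=value` or `[section]`", idx + 1));
        }
    }
    Ok(values)
}
```

Everything is kept as strings. Converting values such as `PollIntervalMinSec` to integers and rejecting unknown keys would sit on top of this, which fits the comment's argument that the real complexity lies in what the values mean rather than in the file format.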
        
       | NelsonMinar wrote:
       | I like the idea of NTPD in Rust. Is there anything to read about
       | how well ntpd-rs performs? Would love a new column for chrony's
       | comparison: https://chrony-project.org/comparison.html
       | 
       | Particularly interested in the performance stats, how well the
       | daemon keeps time in the face of various network problems. Chrony
       | is very good at this. Some of the other NTP implementations (not
       | on that chart) are so bad they shouldn't be used in production.
        
       | ComputerGuru wrote:
       | Unlike say, coreutils, ntp is something very far from being a
       | solved problem and the memory safety of the solution is
       | unfortunately going to play second fiddle to its efficacy.
       | 
       | For example, we only use chrony because it's so much better than
       | whatever came with your system (especially on virtual machines).
       | ntpd-rs would have to come at least within spitting distance of
       | chrony's time keeping abilities to even be up for consideration.
       | 
       | (And I say this as a massive rust aficionado using it for both
       | work and pleasure.)
        
         | syncsynchalt wrote:
         | The biggest danger in NTP isn't memory safety (though good on
         | this project for tackling it), it's
         | 
         | (a) the inherent risks in implementing a protocol based on
         | trivially spoofable UDP that can be used to do amplification
         | and reflection
         | 
         | and
         | 
         | (b) emergent resonant behavior from your implementation that
         | will inadvertently DDOS critical infrastructure when all 100m
         | installed copies of your daemon decide to send a packet to NIST
         | in the same microsecond.
         | 
         | I'm happy to see more ntpd implementations but always a little
         | worried.
        
           | timmytokyo wrote:
           | I really wish more internet infrastructure would switch to
           | using NTS. It addresses these kinds of issues.
        
       | cogman10 wrote:
       | This seems like a weird place to be touting memory safety.
       | 
       | It's ntpd, it doesn't seem like a place for any sort of attack
       | vector and it's been running on many VMs without exploding memory
       | for a while now.
       | 
       | I'd think there are far more critical components to rewrite in a
       | memory safe language than the clock synchronizer.
        
         | luma wrote:
         | It's present on loads of systems, it's a very common service to
         | offer, it's a reasonably well-constrained use case, and the
         | fact that nobody thinks about it might be a good reason to
         | think about it. They can't boil the ocean but one service at a
         | time is a reasonable approach.
         | 
         | I'll flip the question around, why not start at ntpd?
        
           | cogman10 wrote:
           | > I'll flip the question around, why not start at ntpd?
           | 
           | Easy, because there is a lot of critical infrastructure
           | written in C++ that is commonly executed on pretty much every
           | VM and exposed in such a way that vulnerabilities are
           | disastrous.
           | 
           | For example, JEMalloc is used by nearly every app compiled in
           | *nix.
           | 
           | Perhaps systemd which is just about everywhere running
           | everything.
           | 
           | Maybe sshd, heaven knows it's been the root of many attacks.
        
         | oconnor663 wrote:
         | > it's been running on many VMs without exploding memory for a
         | while now
         | 
         | Most of the security bugs we hear about don't cause random
         | crashes on otherwise healthy machines, because that tends to
         | get them noticed and fixed. It's the ones that require
         | complicated steps to trigger that are really scary. When I look
         | at NTP, I see a service that:
         | 
         | - runs as root
         | 
         | - talks to the network
         | 
         | - doesn't usually authenticate its traffic
         | 
         | - uses a bespoke binary packet format
         | 
         | - almost all network security depends on (for checking cert
         | expiration)
         | 
         | That looks to me like an _excellent_ candidate for a memory-
         | safe reimplementation.
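
The "bespoke binary packet format" item is where memory safety tends to pay off most directly. The sketch below decodes the first two bytes of an NTP header (leap indicator, version, mode, stratum, laid out as in RFC 5905) using bounds-checked slice access. `NtpHeader` and `parse_header` are hypothetical names for illustration and are not taken from ntpd-rs.

```rust
/// Minimum NTP header length in bytes.
const NTP_HEADER_LEN: usize = 48;

/// A few fields from the start of an NTP packet header.
#[derive(Debug, PartialEq)]
struct NtpHeader {
    leap: u8,
    version: u8,
    mode: u8,
    stratum: u8,
}

/// Returns `None` for packets shorter than a full header.
fn parse_header(packet: &[u8]) -> Option<NtpHeader> {
    // `get` yields None on a short packet instead of reading out of bounds.
    let header = packet.get(..NTP_HEADER_LEN)?;
    let first = header[0];
    Some(NtpHeader {
        leap: first >> 6,              // top 2 bits
        version: (first >> 3) & 0b111, // next 3 bits
        mode: first & 0b111,           // low 3 bits
        stratum: header[1],
    })
}
```

A truncated or malformed packet yields `None` here. Even a plain out-of-range index in safe Rust would panic rather than read adjacent memory, which is the class of bug the comment is worried about in C implementations.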
        
           | cogman10 wrote:
           | > runs as root
           | 
           | ntpd can (and should) run as a user
           | 
           | > talks to the network
           | 
           | Makes outbound requests to the network. For it to be
           | compromised, the network itself or a downstream server needs
           | to be compromised. That's very different from something like
           | hosting an http server.
           | 
           | > doesn't usually authenticate its traffic
           | 
           | Yes it does. ntp uses TLS to communicate with its well-known
           | locations.
           | 
           | > uses a bespoke binary packet format
           | 
           | Not great but also see above where it's talking to well known
           | locations authenticated and running as a user.
           | 
           | It's a service that to be compromised requires state level
           | interference.
        
         | jaas wrote:
         | I'm the person driving this.
         | 
         | NTP is worth moving to a memory safe language but of course
         | it's not the single most critical thing in our entire stack to
         | make memory safe. I don't think anyone is claiming that. It's
         | simply the first component that got to production status, a
         | good place to start.
         | 
         | NTP is a component worth moving to a memory safe language
         | because it's a widely used critical service on a network
         | boundary. A quick Google for NTP vulnerabilities will show you
         | that there are plenty of memory safety vulnerabilities lurking
         | in C NTP implementations:
         | 
         | https://www.cvedetails.com/vulnerability-list/vendor_id-2153...
         | 
         | Some of these are severe, some aren't. It's only a matter of
         | time though until another severe one pops up.
         | 
         | I don't think any critical service on a network boundary should
         | be written in C/C++, we know too much at this point to think
         | that's a good idea. It will take a while to change that across
         | the board though.
         | 
         | If I had to pick the most important thing in the context of
         | Let's Encrypt to move to a memory safe language it would be
         | DNS. We have been investing heavily in Hickory DNS but it's not
         | ready for production at Let's Encrypt yet (our usage of DNS is
         | a bit more complex than the average use case).
         | 
         | https://github.com/hickory-dns/hickory-dns
         | 
         | Work is proceeding at a rapid pace and I expect Hickory DNS to
         | be deployed at Let's Encrypt in 2025.
        
           | landmarker10 wrote:
           | So, where are the memory safety issues in the last decade for
           | Postfix?
           | 
           | https://www.cvedetails.com/vulnerability-list/vendor_id-8450...
           | 
           | I see issues that could also occur in Rust. Should we rewrite
           | (presumably plagiarize) Postfix in Rust?
        
       | _joel wrote:
       | Reading this reminded me of ntpsec, anyone actually use that?
        
       | nubinetwork wrote:
       | The problem with ntp isn't the client, it's the servers having to
       | deal with forged UDP packets. Will ntpd ever become TCP-only?
       | Sadly I'm not holding my breath. I stopped running a public
       | stratum 3 server ~10 years ago.
        
         | Faaak wrote:
         | On the contrary, I'm hosting a stratum 1 and 2 stratum 2s (at
         | my previous company we offered 3 stratum 1s) on the ntp pool.
         | It's useful, used, and still needed :-)
        
         | brohee wrote:
         | When one can make a stratum 1 server for $100, there is very
         | little reason for the continued existence of public NTP
         | servers. ISPs can offer the service to their customers, and any
         | company with a semblance of an IT dept can have its own
         | stratum 1.
        
       ___________________________________________________________________
       (page generated 2024-06-24 23:00 UTC)