hngopher.com

       [HN Gopher] Zlib-rs is faster than C
       ___________________________________________________________________
        
       Zlib-rs is faster than C
        
       Author : dochtman
       Score  : 140 points
       Date   : 2025-03-16 19:35 UTC (3 hours ago)
        
 (HTM) web link (trifectatech.org)
 (TXT) w3m dump (trifectatech.org)
        
       | IshKebab wrote:
       | It's _barely_ faster. I would say it's more accurate to say it's
       | as fast as C, which is still a great achievement.
        
         | throwaway48476 wrote:
         | But it is faster. The closer to theoretical maximum the smaller
         | the gains become.
        
           | mananaysiempre wrote:
           | Zlib-ng is between a couple and multiple times away from the
           | state of the art[1], it's just that nobody has yet done the
           | (hard) work of adjusting libdeflate[2] to a richer API than
           | "complete buffer in, complete buffer out".
           | 
           | [1] https://github.com/zlib-ng/zlib-ng/issues/1486
           | 
           | [2] https://github.com/ebiggers/libdeflate
        
         | qweqwe14 wrote:
         | "Barely" or not is completely irrelevant. The fact is that it's
         | measurably faster than the C implementation with the more
         | common parameters. So the point that you're trying to make
         | isn't clear tbh.
         | 
         | Also I'm pretty sure that the C implementation had more man
         | hours put into it than the Rust one.
        
           | bee_rider wrote:
           | I think that would be really hard to measure. In particular,
           | for this sort of very optimized code, we'd want to separate
           | out the time spent designing the algorithms (which the Rust
           | version benefits from as well). Actually I don't think that
           | is possible at all (how will we separate out time spent
           | coding experiments in C, then learning from them).
           | 
           | Fortunately these "which language is best" SLOC measuring
           | contests are just frivolous little things that only silly
           | people take seriously.
        
         | ajross wrote:
         | It's... basically written in C. I'm no expert on zlib/deflate
         | or related algorithms, but digging around
         | https://github.com/trifectatechfoundation/zlib-rs/ almost every
         | block with meaningful logic is marked unsafe. There's raw
         | allocation management, raw slicing of arrays, etc... This code
         | looks and smells like C, and very much not like rust. I don't
         | know that this is a direct transcription of the C code, but if
         | you were to try something like that this is sort of what it
         | would look like.
         | 
         | I think there's lots of value in wrapping a raw/unsafe
         | implementation with a rust API, but that's not _quite_ what
         | most people think of when writing code "in rust".
        
           | hermanradtke wrote:
           | > basically written in C
           | 
           | Unsafe Rust still has to conform to many of Rust's rules. It
           | is meaningfully different than C.
        
             | est31 wrote:
             | It also has way less tooling available than C to analyze
             | its safety.
        
               | nindalf wrote:
               | The number of tools matters less than the quality of the
               | tools. Rust's inherent guarantees + miri + software
               | verification tools mean that in practice Rust code, even
               | with unsafe, ends up being higher quality.
        
             | ajross wrote:
             | Are there examples you're thinking about? The only good
             | ones I can think of are bits about undefined behavior
             | semantics, which frankly are very well covered in modern C
             | code via tools like ubsan, etc...
        
               | sedatk wrote:
               | This comment summarizes how unsafe Rust differs quite
               | well. Basically, it is mostly safe Rust with a few
               | exceptions, fewer than one would imagine:
               | https://news.ycombinator.com/item?id=43382176
        
               | steveklabnik wrote:
               | They're just fundamentally different languages. There's
               | semantics that exist in all four of these quadrants:
               | 
               | * defined in C, undefined in Rust
               | 
               | * undefined in C, undefined in Rust
               | 
               | * defined in Rust, undefined in C
               | 
               | * defined in Rust, defined in C
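
One illustration of the "defined in Rust, undefined in C" quadrant is signed integer overflow: it is undefined behavior in C, while Rust gives it defined behavior and offers explicit methods for each policy. A minimal sketch follows; the function name `overflow_behaviors` is made up for illustration and does not come from the thread.

```rust
/// Adds 1 to `i32::MAX` using Rust's explicit overflow-handling methods.
/// Each of these calls has fully defined behavior.
fn overflow_behaviors() -> (Option<i32>, i32, i32) {
    let x = i32::MAX;
    (
        x.checked_add(1),    // None: the overflow is reported to the caller
        x.wrapping_add(1),   // wraps around to i32::MIN
        x.saturating_add(1), // clamps at i32::MAX
    )
}
```

Calling `overflow_behaviors()` yields `(None, i32::MIN, i32::MAX)`; the equivalent `INT_MAX + 1` in C has no defined result.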
        
           | xxs wrote:
           | I mentioned it under another comment, and I consider myself
           | versed enough in deflate: comparing the library to zlib-ng is
           | quite weird, as the latter is generally hand-written assembly.
           | Beating it would take some oddity in the test itself.
        
           | oneshtein wrote:
           | I cannot understand your complaint. It is written in Rust, but
           | to you it looks like C. So what?
        
             | Alifatisk wrote:
             | So, it is basically like it was written in C.
        
             | ajross wrote:
             | It doesn't exploit (and in fact deliberately evades) Rust's
             | signature memory safety features. The impression from the
             | headline is "Rust is as fast as C now!", but in fact the
             | subset of the language that has been shown to be as fast as
             | C is the subset that is basically _isomorphic_ to C.
             | 
             | The impression a naive reader might take is that
             | idiomatic/safe/best-practices Rust has now closed the
             | performance gap. But clearly that's not happening here.
        
               | sedatk wrote:
               | Rust's many memory safety features (including the borrow
               | checker) are still enabled in unsafe Rust blocks.
               | 
               | For more information:
               | https://news.ycombinator.com/item?id=43382176
        
           | johnisgood wrote:
           | It does actually seem like what a C -> Rust transpiler would
           | spit out.
        
           | gf000 wrote:
           | C is not assembly, nor is it portable assembly at all in this
           | century, so your phrasing is very off.
           | 
           | C code will go through a huge number of transformations by
           | the compiler, and unless you are a compiler expert you will
           | have no idea how the resulting code looks. It's not targeting
           | the PDP-11 anymore.
        
       | johnisgood wrote:
       | "faster than C" almost always boils down to different designs,
       | implementations, algorithms, etc.
       | 
       | Perhaps it is faster than already-existing implementations, sure,
       | but not "faster than C", and it is odd to make such claims.
        
         | oneshtein wrote:
         | ... because by "C" we mean handwritten inline assembler.
         | 
         | Typical realworld C code uses \0 terminated strings and
         | strlen() with O(len^2) complexity.
        
         | qweqwe14 wrote:
         | The fact that it's faster than the C implementation that surely
         | had more time and effort put into it doesn't look good for C
         | here.
        
           | johnisgood wrote:
           | It says absolutely nothing about the programming language
           | though.
        
             | acdha wrote:
             | Doesn't it say something if Rust programmers routinely feel
             | more comfortable making aggressive optimizations and have
             | more time to do so? We maintain code for longer than the
             | time taken to write the first version and not having to pay
             | as much ongoing overhead cost is worth something.
        
           | vkou wrote:
           | I think you'll find that if you re-write an application,
           | feature-for-feature, _without_ changing its language, the re-
           | written version will be faster.
        
             | renewiltord wrote:
             | This is known as the Second System Effect: where Great
             | Rewrites always succeed in making a more performant thing.
        
         | xxs wrote:
         | zlib-ng is pretty much assembly - with a bit of C. There is
         | this quote: _but was not entirely fair because our rust
         | implementation could assume that certain SIMD capabilities
         | would be available, while zlib-ng had to check for them at
         | runtime_
         | 
         | zlib-ng can be compiled to whatever target arch is necessary,
         | and the original post doesn't mention how it was compiled and
         | what architecture and so on.
         | 
         | It's another case not to trust micro benchmarks
        
         | tdiff wrote:
         | Nevertheless, Russinovich actually says something along the
         | lines of "simple rewriting in Rust made some of our code 5-15%
         | faster (without deliberate optimizations)":
         | https://www.youtube.com/watch?v=1VgptLwP588&t=351s
        
           | pinkmuffinere wrote:
           | I'm sure I'm missing context, and presumably there are other
           | benefits, but 5-15% improvement is such a small step to
           | justify rewriting codebases.
           | 
           | I also wonder how much of an improvement you'd get by just
           | asking for a "simple rewrite" in the existing language. I
           | suspect there are often performance improvements to be had
           | with simple changes in the existing language
        
             | tdiff wrote:
             | I agree that simple rewriting could have given some if not
             | all perf benefits, but can it be the case that rust forces
             | us to structure code in a way that is for some reason more
             | performant in some cases?
             | 
             | 5-15% is a big deal for low-level foundational code,
             | especially if you get it along with some other guarantees,
             | which may be of greater importance.
        
             | turtletontine wrote:
             | Far better justification for a rewrite like this is if it
             | eases maintenance, or simplifies
             | building/testing/distribution. Taking an experienced and
             | committed team of C developers with a mature code base, and
             | retraining them to rewrite their project in Rust for its
             | own sake is pretty absurd. But if you have a team that's
             | more comfortable in Rust, then doing so could make a lot of
             | sense - and, yes, make it easier to ensure the product is
             | secure and memory-safe.
        
               | johnisgood wrote:
               | > if you have a team that's more comfortable in
               | 
               | As is the case with any language, of course; it is an
               | argument neither for nor against Rust.
        
             | sedatk wrote:
             | > 5-15% improvement is such a small step to justify
             | rewriting codebases
             | 
             | They hadn't expected any perf improvements at all. Quite
             | the opposite, in fact. They were surprised that they saw
             | perf improvements right away.
        
         | kgeist wrote:
         | I heard that aliasing in C prevents the compiler from
         | optimizing aggressively. I can believe Rust's compiler can
         | optimize more aggressively if there's no aliasing problem.
        
           | layer8 wrote:
           | C has the _restrict_ type qualifier to express non-aliasing,
           | hence it shouldn't be a fundamental impediment.
        
             | gf000 wrote:
             | Which is so underused that the whole compiler feature was
             | buggy as hell, and was only recently fixed because
             | compiling Rust where it is the norm exposed it.
        
         | layer8 wrote:
         | If anything, this should be "zlib-rs is faster than zlib-ng",
         | but not "$library is faster than $programming_language".
        
           | chjj wrote:
           | It should be, but you'll never convince the rust people of
           | that. It's always a competition with them.
        
       | kahlonel wrote:
       | You mean the implementation is faster than the one in C. Because
       | nothing is "faster than C".
        
         | arlort wrote:
         | Tachyons?
        
           | einpoklum wrote:
           | Maybe if you reverse the beam polarity and route them through
           | the main deflector array.
        
             | layer8 wrote:
             | But that requires rerouting auxiliary power from life
             | support to the shield generators. In Rust you would need to
             | use _unsafe_ for that.
        
         | mkoubaa wrote:
         | C after an optimizing compiler has chewed through it is faster
         | than C
        
         | Jaxan wrote:
         | Of course many things can be faster than C, because C is very
         | far from modern hardware. If you compile with optimisation
         | flags, the generated machine code looks nothing like what you
         | programmed in C.
        
         | dijit wrote:
         | The kind of code you can write in rust can indeed be faster
         | than C, but someone will wax poetic about how anything is
         | possible in C and they would be valid.
         | 
         | The major reason that Rust can be faster than C, though, is
         | that, due to the way the compiler is constructed, you can
         | lean on threading idiomatically. The same can be true for Go,
         | coroutines vs no coroutines in some cases is going to be faster
         | for the use case.
         | 
         | You _can_ write these things to be the same speed or even
         | faster in C, but you won't, because it's hard and you will
         | introduce more bugs per KLOC in C with concurrency vs Go or
         | Rust.
        
         | pornel wrote:
         | If you don't count manual SIMD intrinsics or inline assembly as
         | C, then Rust and FORTRAN can be faster than C. This is mainly
         | thanks to having pointer aliasing guarantees that C doesn't
         | have. They can get autovectorization optimizations where C's
         | semantics get in the way.
        
         | nindalf wrote:
         | Why can't something be faster than C? If a language is able to
         | convey more information to a backend like LLVM, the backend
         | could use that to produce more optimised code than what it
         | could do for C.
         | 
         | For example, if the language is able to say, for any two
         | pointers, the two pointers will not overlap - that would enable
         | the backend to optimise further. In C this requires an explicit
         | restrict keyword. In Rust, it's the default.
         | 
         | By the way this isn't theoretical. Image decoders written in
         | Rust are faster than ones written in C, probably because the
         | backend is able to autovectorise better
         | (https://www.reddit.com/r/rust/comments/1ha7uyi/memorysafe_pn...).
         | 
         | grep (C) is about 5-10x slower than ripgrep (Rust). That's why
         | ripgrep is used to execute all searches in VS Code and not
         | grep.
         | 
         | Or a different tack. If you wrote a program that needed to sort
         | data, the Rust version would probably be faster thanks to the
         | standard library sort being the fastest, across languages
         | (https://github.com/rust-lang/rust/pull/124032). Again, faster
         | than C.
         | 
         | Happy to give more examples if you're interested.
         | 
         | There's nothing special about C that entitles it to the crown
         | of "nothing faster". This would have made sense in 2005, not
         | 2025.
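
The non-overlap point above can be sketched in a few lines. This is an illustrative example, not code from any project mentioned in the thread, and `add_twice` is a hypothetical name. Because `dst` is a `&mut` and `src` is a shared `&`, the two can never refer to the same `i32`, so the compiler is allowed to load `*src` once; a C function taking `int *dst, const int *src` would need `restrict` to make the same promise.

```rust
/// Adds `*src` to `*dst` twice.
///
/// `dst` is an exclusive reference and `src` is a shared one, so they
/// cannot alias. The optimizer may therefore read `*src` a single time.
fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

// Note: `add_twice(&mut a, &a)` does not compile, because `a` would be
// borrowed mutably and immutably at the same time.
```

With `let mut a = 1; let b = 10; add_twice(&mut a, &b);` the value of `a` becomes 21. Whether this actually produces faster machine code in a given program depends on the backend and the surrounding code.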
        
           | burntsushi wrote:
           | Narrow correction on two points:
           | 
           | First, I would say that "ripgrep is generally faster than GNU
           | grep" is a true statement. But sometimes GNU grep is faster
           | than ripgrep and in many cases, performance is comparable or
           | only a "little" slower than ripgrep.
           | 
           | Secondly, VS Code using ripgrep because of its speed is only
           | one piece of the picture. Licensing was also a major
           | consideration. There is an issue about this where they
           | originally considered ripgrep (and ag if I recall correctly),
           | but I'm on mobile so I don't have the link handy.
        
         | kllrnohj wrote:
         | It is quite easy for C++ and Rust to both be faster than C in
         | things larger than toy projects. C is hardly a panacea of
         | efficiency, and the language makes useful things very hard to
         | do efficiently.
         | 
         | You can contort C to trick it into being fast[1], but it
         | quickly becomes an unmaintainable nightmare so almost nobody
         | does.
         | 
         | 1: eg, correct use of restrict, manually creating move
         | semantics, manually creating small string optimizations, etc...
        
         | gf000 wrote:
         | Wtf, since when?
         | 
         | Besides the famous "C is not a low-level language" blog post..
         | I don't even get what you are thinking. C is not even the
         | performance queen for large programs (the de facto standard
         | today is C++ for good reasons), let alone for tiny ultra hot
         | loops like codecs and stuff, which are all hand-written
         | assembly.
         | 
         | It's not even hard to beat C with something like Rust or C++,
         | because you can properly do high level optimizations as the
         | language is expressive enough for that.
        
       | YZF wrote:
       | I found out I already know Rust:
       | 
       |     unsafe {
       |         let x_tmp0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x10);
       |         xmm_crc0 = _mm_clmulepi64_si128(xmm_crc0, crc_fold, 0x01);
       |         xmm_crc1 = _mm_xor_si128(xmm_crc1, x_tmp0);
       |         xmm_crc1 = _mm_xor_si128(xmm_crc1, xmm_crc0);
       | 
       | Kidding aside, I thought the purpose of Rust was for safety but
       | the keyword unsafe is sprinkled liberally throughout this
       | library. At what point does it really stop mattering if this is C
       | or Rust?
       | 
       | Presumably with inline assembly both languages can emit what is
       | effectively the same machine code. Is the Rust compiler a better
       | optimizing compiler than C compilers?
        
         | oneshtein wrote:
         | > I thought the purpose of Rust was for safety but the keyword
         | unsafe is sprinkled liberally throughout this library.
         | 
         | What's wrong with that?
        
         | Filligree wrote:
         | The usual answer is: You only need to verify the unsafe blocks,
         | not every block. Though 'unsafe' in Rust is actually even less
         | safe than regular C, if a bit more predictable, so there's a
         | crossover point where you really shouldn't have bothered.
         | 
         | The Rust compiler is indeed better than the C one, largely
         | because of having more information and doing full-program
         | optimisation. A `vec_foo =
         | vec_foo.into_iter().map(...).collect::<Vec<Foo>>()`, for example,
         | isn't going to do any bounds checks _or_ allocate.
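
The `into_iter().map(...).collect()` claim can be checked with a small experiment. The sketch below assumes the standard library's in-place collection optimization applies; that is an implementation detail and not a documented guarantee, so the function reports what it observes instead of asserting it. The name `increment_all` is hypothetical.

```rust
/// Maps a vector element-wise and reports whether the result reuses the
/// original heap buffer. Reuse is a standard-library optimization, not a
/// documented guarantee, so callers should not rely on the flag.
fn increment_all(v: Vec<u32>) -> (Vec<u32>, bool) {
    // Remember where the original buffer lives before consuming the Vec.
    let before = v.as_ptr();
    let out: Vec<u32> = v.into_iter().map(|x| x.wrapping_add(1)).collect();
    // Compare addresses only; the old pointer is never dereferenced.
    let reused = out.as_ptr() == before;
    (out, reused)
}
```

For `increment_all(vec![1, 2, 3])` the returned vector is `[2, 3, 4]`; the boolean shows whether the allocation was reused on the toolchain at hand.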
        
           | johnisgood wrote:
           | I have been told that "unsafe" affects code outside of that
           | block, but hopefully steveklabnik may explain it better
           | (again).
           | 
           | > isn't going to do any bounds checks or allocate.
           | 
           | You need to add explicit bounds check or explicitly allocate
           | _in C_ though. It is not there if you do not add it yourself.
        
             | LegionMammal978 wrote:
             | > I have been told that "unsafe" affects code outside of
             | that block, but hopefully steveklabnik may explain it better
             | (again).
             | 
             | Poorly-written unsafe code can have effects extending out
             | into safe code. But correctly-written unsafe code does not
             | have any effects on safe code w.r.t. memory safety. So to
             | ensure memory safety, you just have to verify the
             | correctness of the unsafe code (and any helper functions,
             | etc., it depends on), rather than the entire codebase.
             | 
             | Also, some forms of unsafe code are far less dangerous than
             | others in practice. E.g., most of the SIMD functions are
             | practically safe to call in every situation, but they all
             | have 'unsafe' slapped on them due to being intrinsics.
             | 
             | > You need to add explicit bounds check or explicitly
             | allocate _in C_ though. It is not there if you do not add
             | it yourself.
             | 
             | Unfortunately, you do need to allocate a new buffer in C if
             | you change the type of the elements. The annoying side of
             | strict aliasing is that every buffer has a single type
             | that's set in stone for all time. (Unless you preemptively
             | use unions for everything.)
        
               | uecker wrote:
               | C has type-changing stores. If you store to a buffer with
               | a new type, it has the new type. Clang does not implement
               | this correctly though, but GCC does.
        
             | pornel wrote:
             | Buggy unsafe blocks can affect code anywhere (through
             | Undefined Behavior, or breaking the API contract).
             | 
             | However, if you verify that the unsafe blocks are correct,
             | and the safe API wrapping them rejects invalid inputs, then
             | they won't be able to cause unsafety anywhere.
             | 
             | This does reduce how much code you need to review for
             | memory safety issues. Once it's encapsulated in a safe API,
             | the compiler ensures it can't be broken.
             | 
             | This encapsulation also prevents combinatorial explosion of
             | complexity when multiple (unsafe) libraries interact.
             | 
             | I can take zlib-rs, and some multi-threaded job executor
             | (also unsafe internally), but I don't need to specifically
             | check how these two interact. zlib-rs needs to ensure they
             | use slices and lifetimes correctly, the threading library
             | needs to ensure it uses correct lifetimes and type bounds,
             | and then the compiler will check all interactions between
             | these two libraries for me. That's like (M+N) complexity to
             | deal with instead of (M*N).
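
The encapsulation idea described above can be sketched as a safe function that validates its input once and only then uses an unchecked access internally. This is a made-up example (`sum_first` is a hypothetical name, unrelated to zlib-rs); it shows the shape of the pattern, not how any particular library does it.

```rust
/// Sums the first `n` elements, or returns `None` if `n` is out of range.
///
/// The signature is safe: no caller input can trigger undefined behavior,
/// so only this function body needs a memory-safety review.
fn sum_first(data: &[u32], n: usize) -> Option<u32> {
    if n > data.len() {
        return None; // reject invalid input at the safe boundary
    }
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n and n <= data.len() was checked above.
        total = total.wrapping_add(unsafe { *data.get_unchecked(i) });
    }
    Some(total)
}
```

`sum_first(&[1, 2, 3, 4], 3)` returns `Some(6)` and `sum_first(&[1, 2], 5)` returns `None`.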
        
             | steveklabnik wrote:
             | > I have been told that "unsafe" affects code outside of
             | that block, but hopefully steveklabnik may explain it better
             | (again).
             | 
             | It's due to a couple of different things interacting with
             | each other: unsafe relies on invariants that safe code must
             | also uphold, and that the privacy boundary in Rust is the
             | module.
             | 
             | Before we get into the unsafe stuff, I want you to consider
             | an example. Is this Rust code okay?
             | 
             |     struct Foo {
             |         bar: usize,
             |     }
             | 
             |     impl Foo {
             |         fn set_bar(&mut self, bar: usize) {
             |             self.bar = bar;
             |         }
             |     }
             | 
             | No unsafe shenanigans here. This code is perfectly safe, if
             | a bit useless.
             | 
             | Let's talk about unsafe. The canonical example of unsafe
             | code being affected outside of unsafe itself is the
             | implementation of Vec<T>. Vecs look _something_ like this
             | (the real code is different for reasons that don't really
             | matter in this context):
             | 
             |     struct Vec<T> {
             |         ptr: *mut T,
             |         len: usize,
             |         cap: usize,
             |     }
             | 
             | The pointer is to a bunch of Ts in a row, the length is the
             | current number of Ts that are valid, and the capacity is
             | the total number of Ts. The length and the capacity are
             | different so that memory allocation is amortized; the
             | capacity is always greater than or equal to the length.
             | 
             | That property is very important! If the length is greater
             | than the capacity, when we try and index into the Vec, we'd
             | be accessing random memory.
             | 
             | So now, this function, which is the same as Foo::set_bar,
             | is no longer okay:
             | 
             |     impl<T> Vec<T> {
             |         fn set_len(&mut self, len: usize) {
             |             self.len = len;
             |         }
             |     }
             | 
             | This is because the unsafe code inside of other methods of
             | Vec<T> needs to be able to rely on the fact that len <=
             | capacity. And so you'll find that Vec<T>::set_len in Rust
             | is marked as unsafe, even though it doesn't contain unsafe
             | code. It still requires judicious use to avoid introducing
             | memory unsafety.
             | 
             | And this is why the module being the privacy boundary
             | matters: the only way to set len directly in safe Rust code
             | is code within the same privacy boundary as the Vec<T>
             | itself. And so, that's the same module, or its children.
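
The privacy-boundary argument can be made concrete with a toy type. The sketch below is a made-up fixed-capacity buffer, not the real `Vec<T>`: the module `fixed_buf`, the type `FixedBuf`, and its methods are all hypothetical. The `len` field is private, so code outside the module can only change it through `push` (which keeps the invariant) or through `set_len` (which is marked `unsafe` because it can break it).

```rust
mod fixed_buf {
    /// A tiny fixed-capacity byte buffer. Invariant: `len <= 8`.
    pub struct FixedBuf {
        data: [u8; 8],
        len: usize, // private: only code in this module can write it
    }

    impl FixedBuf {
        pub fn new() -> Self {
            FixedBuf { data: [0; 8], len: 0 }
        }

        /// Safe to call: refuses to break the invariant.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        /// Contains no unsafe code, yet is `unsafe` to call, because
        /// `as_slice` relies on the invariant this function can violate.
        ///
        /// # Safety
        /// `len` must not exceed 8.
        pub unsafe fn set_len(&mut self, len: usize) {
            self.len = len;
        }

        pub fn as_slice(&self) -> &[u8] {
            // SAFETY: this module maintains `len <= data.len()`.
            unsafe { self.data.get_unchecked(..self.len) }
        }
    }
}
```

Auditing `as_slice` therefore means auditing everything inside `fixed_buf` that can write `len`, and nothing outside it.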
        
         | dietr1ch wrote:
         | > I thought the purpose of Rust was for safety but the keyword
         | unsafe is sprinkled liberally throughout this library.
         | 
         | Which is exactly the point, other languages have unsafe
         | implicitly sprinkled in every single line.
         | 
         | Rust tries to bound and explicitly delimit where unsafe code is
         | to make review and verification efforts precise.
        
         | datadeft wrote:
         | I thought that the point of Rust is to have safe {} blocks
         | (implicit) as a default and unsafe {} when you need the
         | absolute maximum performance available. You can audit those few
         | lines of unsafe code very easily. With C everything is unsafe
         | and you can just forget to call free() or call it twice and you
         | are done.
        
           | steveklabnik wrote:
           | > unsafe {} when you need the absolute maximum performance
           | available.
           | 
           | Unsafe code is not inherently faster than safe code, though
           | sometimes, it is. Unsafe is for when you want to do something
           | that is legal, but the compiler cannot understand that it is
           | legal.
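
A common illustration of "legal, but the compiler cannot understand that it is legal" is handing out two mutable references into the same slice. The sketch below uses a hypothetical helper, `pair_mut`; the safe spelling `(&mut slice[i], &mut slice[j])` is rejected by the borrow checker even when `i != j`, because it cannot see that the two elements are distinct.

```rust
/// Returns mutable references to two distinct elements, or `None` if the
/// indices are equal or out of bounds.
fn pair_mut(slice: &mut [i32], i: usize, j: usize) -> Option<(&mut i32, &mut i32)> {
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    let base = slice.as_mut_ptr();
    // SAFETY: both indices are in bounds and differ, so the two references
    // point at different elements and never alias.
    unsafe { Some((&mut *base.add(i), &mut *base.add(j))) }
}
```

The unsafe block here is about expressing something the type system cannot prove, not about speed. The standard library's `split_at_mut` covers many of the same cases with a safe API.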
        
           | WD-42 wrote:
           | It's not about performance, it's about undefined behavior.
        
         | akx wrote:
         | To quote the Rust book
         | (https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html):
         | 
         |     In addition, unsafe does not mean the code inside the block
         |     is necessarily dangerous or that it will definitely have
         |     memory safety problems: the intent is that as the
         |     programmer, you'll ensure the code inside an unsafe block
         |     will access memory in a valid way.
         | 
         | Since you say you already know that much Rust, you can be that
         | programmer!
        
           | silisili wrote:
           | I feel like C programmers had the same idea, and well, we see
           | how that works out in practice.
        
             | dijit wrote:
             | the problem in those cases is that C can't help but be
             | unsafe always.
             | 
             | People can write memory safe code, just not 100% of the
             | time.
        
             | sunshowers wrote:
             | No, C lacks encapsulation of unsafe code. This is very
             | important. Encapsulation is the only way to scale local
             | reasoning into global correctness.
        
         | Aurornis wrote:
         | Using unsafe blocks in Rust is confusing when you first see it.
         | The idea is that you have to opt-out of compiler safety
         | guarantees for specific sections of code, but they're clearly
         | marked by the unsafe block.
         | 
         | In good practice it's used judiciously in a codebase where it
         | makes sense. Those sections receive extra attention and
         | analysis by the developers.
         | 
         | Of course you can find sloppy codebases where people reach for
         | unsafe as a way to get around Rust instead of writing code the
         | Rust way, but that's not the intent.
         | 
         | You can also find die-hard Rust users who think unsafe should
         | never be used and make a point to avoid libraries that use it,
         | but that's excessive.
        
           | timschmidt wrote:
           | Unsafe is a very distinct code smell. Like the mercaptan
           | odorant added to natural gas to allow folks to smell a gas
           | leak.
           | 
           | If you smell it when you're not working on the gas lines,
           | that's a signal.
        
             | cmrdporcupine wrote:
             | Look, no. Just go read the unsafe block in question. It's
             | just SIMD intrinsics. No memory access. No pointers. It's
             | unsafe in name only.
             | 
             | No need to get all moral about it.
        
               | kccqzy wrote:
               | By your line of reasoning, SIMD intrinsics functions
               | should not be marked as unsafe in the first place. Then
               | why are they marked as unsafe?
        
               | cmrdporcupine wrote:
               | There's no standardization of simd in Rust yet, they've
               | been sitting in nightly unstable for years:
               | 
               | https://doc.rust-lang.org/std/intrinsics/simd/index.html
               | 
               | So I suspect it's a matter of two things:
               | 
               | 1. You're calling out to what's basically assembly, so
               | buyer beware. This is basically FFI into C/asm.
               | 
               | 2. There's no guarantee that what comes out of those
               | 128-bit vectors afterwards follows any sanity or
               | expectations, so... buyer beware. Same reason
               | std::mem::transmute is marked unsafe.
               | 
               | It's really the weakest form of unsafe.
               | 
               | Still entirely within the bounds of a sane person to
               | reason about.
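
To show what wrapping such an intrinsic in a safe function can look like, here is a minimal sketch. It is not taken from zlib-rs, and `xor_u64_pairs` is a hypothetical name. It relies on SSE2 being part of the x86_64 baseline, and falls back to plain integer XOR on other architectures so that it builds everywhere.

```rust
/// XORs two pairs of u64 values using one 128-bit SSE2 operation.
#[cfg(target_arch = "x86_64")]
fn xor_u64_pairs(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_storeu_si128, _mm_xor_si128};
    let mut out = [0u64; 2];
    // SAFETY: SSE2 is always available on x86_64. Each array is exactly
    // 16 bytes, and the unaligned load/store variants are used, so the
    // pointer casts read and write only valid memory.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_xor_si128(va, vb));
    }
    out
}

/// Portable fallback with the same result on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn xor_u64_pairs(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    [a[0] ^ b[0], a[1] ^ b[1]]
}
```

The only part that needs human reasoning is the pointer handling around the load and store; the XOR itself cannot touch memory.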
        
               | pclmulqdq wrote:
               | > they've been sitting in nightly unstable for years
               | 
               | So many very useful features of Rust and its core library
               | spend years in "nightly" because the maintainers of those
               | features don't have the discipline to see them through.
        
               | cmrdporcupine wrote:
               | simd and allocator_api are the two that irritate me
               | enough to consider a different language for future
               | systems dev projects.
               | 
               | I don't have the personality or time to wade into
               | committee type work, so I have no idea what it would take
               | to get those two across the finish line, but the
               | allocator one in particular makes me question Rust for
               | lower level applications. I think it's just not going to
               | happen.
               | 
               | If Zig had proper ADTs and something equivalent to borrow
               | checker, I'd be inclined to poke at it more.
        
               | steveklabnik wrote:
               | > There's no standardization of simd in Rust yet
               | 
               | Of _safe_ SIMD, but some stuff in core::arch is
               | stabilized. Here's the first bit called in the example
               | of the OP:
               | https://doc.rust-lang.org/core/arch/x86/fn._mm_clmulepi64_si...
        
               | CryZe wrote:
               | They are in the process of marking them safe, which is
               | enabled through the target_feature 1.1 RFC.
               | 
               | In fact, it has already been merged two weeks ago:
               | https://github.com/rust-lang/stdarch/pull/1714
               | 
               | The change is already visible on nightly:
               | https://doc.rust-lang.org/nightly/core/arch/x86/fn._mm_xor_s...
               | 
               | Compared to stable:
               | https://doc.rust-lang.org/core/arch/x86/fn._mm_xor_si128.htm...
               | 
               | So this should be stable in 1.87 on May 15 (Rust's 10
               | year anniversary since 1.0)
        
               | timschmidt wrote:
               | I don't read any moralizing in my previous comment. And
               | it seems to mirror the relevant section in the book:
               | 
               | "People are fallible, and mistakes will happen, but by
               | requiring these five unsafe operations to be inside
               | blocks annotated with unsafe you'll know that any errors
               | related to memory safety must be within an unsafe block.
               | Keep unsafe blocks small; you'll be thankful later when
               | you investigate memory bugs."
               | 
               | I hope the SIMD intrinsics make it to stable soon so
               | folks can ditch unnecessary unsafes if that's the only
               | issue.
        
               | SkiFire13 wrote:
               | SIMD intrinsics are unsafe because they are available
               | only under some CPU features.
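
A minimal sketch of what the CPU-feature requirement looks like in practice, assuming an x86_64 target with a scalar fallback elsewhere. The names `xor16`, `xor16_sse2` and `xor16_scalar` are hypothetical and are not taken from zlib-rs. The unsafe part is confined to one function, and the safe wrapper only calls it after a runtime check:

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_storeu_si128, _mm_xor_si128};

/// XOR two 16-byte blocks with SSE2 (hypothetical helper).
/// Calling this on a CPU without SSE2 would be undefined behaviour,
/// which is why the function itself is `unsafe`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
#[allow(unused_unsafe)] // newer toolchains treat some of these intrinsics as safe here
unsafe fn xor16_sse2(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    // SAFETY: all three pointers come from 16-byte arrays, and the
    // unaligned load/store intrinsics have no alignment requirement.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_xor_si128(va, vb));
    }
    out
}

/// Portable fallback with the same result.
pub fn xor16_scalar(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// Safe wrapper: the SIMD path is taken only after a runtime feature check.
pub fn xor16(a: &[u8; 16], b: &[u8; 16]) -> [u8; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            // SAFETY: SSE2 support was just verified at runtime.
            return unsafe { xor16_sse2(a, b) };
        }
    }
    xor16_scalar(a, b)
}
```

Callers only ever see `xor16`, which cannot be misused from safe code regardless of which CPU it runs on.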
        
             | mrob wrote:
             | There's no standard recipe for natural gas odorant, but
             | it's typically a mixture of various organosulfur compounds,
             | not hydrogen sulfide. See:
             | 
             | https://en.wikipedia.org/wiki/Odorizer#Natural_gas_odorizers
        
               | timschmidt wrote:
               | TIL!
        
           | api wrote:
           | The idea is that you can trivially search the code base for
           | "unsafe" and closely examine all unsafe code, and unless you
           | are doing really low-level stuff there should not be much of
           | it. Higher level code bases should ideally have none.
           | 
           | It tends to be found in drivers, kernels, vector code, and
           | low-level implementations of data structures and allocators
           | and similar things. Not typical application code.
           | 
           | As a general rule it should be avoided unless there's a good
           | reason to do it. But it's there for a reason. It's almost
           | impossible to create a systems language that imposes any kind
           | of rules (like ownership etc.) that covers all possible cases
           | and all possible optimization patterns on all hardware.
        
             | timschmidt wrote:
             | To the extent that it's even possible to write bare metal
             | microcontroller firmware in Rust without unsafe, as the
             | embedded hal ecosystem wraps unsafe hardware interfaces in
             | a modular fairly universal safe API.
        
             | formerly_proven wrote:
             | My understanding from Aria Beingessner's and some other
             | writings is that unsafe{} rust is significantly harder to
             | get right in "non-trivial cases" than C, because the
             | semantics are more complex and less specified.
        
               | dwattttt wrote:
               | It's hard to compare. Rust has stricter requirements than
               | C, but looser requirements don't mean easier: ever bit
               | shifted by a variable amount? Hope you never relied on
               | shifting "entirely" out of a variable zeroing it.
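
For comparison, a small sketch of how the same shift question looks in safe Rust. In C, shifting a 32-bit value by 32 or more is undefined behaviour. In Rust, a plain `x << n` with an oversized `n` panics when overflow checks are enabled, and the integer types offer explicit alternatives. `shl_or_zero` and `shift_variants` are hypothetical helper names used only for illustration:

```rust
/// Shift left, yielding 0 when the shift amount is >= the bit width.
/// This is the "shift everything out" behaviour people often expect.
pub fn shl_or_zero(x: u32, n: u32) -> u32 {
    x.checked_shl(n).unwrap_or(0)
}

/// Returns the results of three explicit shift flavours side by side.
pub fn shift_variants(x: u32, n: u32) -> (Option<u32>, u32, u32) {
    (
        x.checked_shl(n),  // None if n >= 32
        x.wrapping_shl(n), // shift amount is masked to n % 32
        shl_or_zero(x, n), // 0 if n >= 32
    )
}
```

With these, `shift_variants(1, 32)` gives `(None, 1, 0)`: the checked form refuses, the wrapping form shifts by zero, and only the explicit helper zeroes the value.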
        
           | chongli wrote:
           | Isn't it the case that once you use unsafe even a single
           | time, you lose all of Rust's nice guarantees? As far as I'm
           | aware, inside the unsafe block you can do whatever you want
           | which means all of the nice memory-safety properties of the
           | language go away.
           | 
           | It's like letting a wet dog (who'd just been swimming in a
           | nearby swamp) run loose inside your hermetically sealed
           | cleanroom.
        
             | timschmidt wrote:
             | It seems like you've got it backwards. Even unsafe rust is
             | still more strict than C. Here's what the book has to say
             | (https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html)
             | 
             | "You can take five actions in unsafe Rust that you can't in
             | safe Rust, which we call unsafe superpowers. Those
             | superpowers include the ability to:
             | - Dereference a raw pointer
             | - Call an unsafe function or method
             | - Access or modify a mutable static variable
             | - Implement an unsafe trait
             | - Access fields of a union
             | 
             | It's important to understand that unsafe doesn't turn off
             | the borrow checker or disable any other of Rust's safety
             | checks: if you use a reference in unsafe code, it will
             | still be checked. The unsafe keyword only gives you access
             | to these five features that are then not checked by the
             | compiler for memory safety. You'll still get some degree of
             | safety inside of an unsafe block.
             | 
             | In addition, unsafe does not mean the code inside the block
             | is necessarily dangerous or that it will definitely have
             | memory safety problems: the intent is that as the
             | programmer, you'll ensure the code inside an unsafe block
             | will access memory in a valid way.
             | 
             | People are fallible, and mistakes will happen, but by
             | requiring these five unsafe operations to be inside blocks
             | annotated with unsafe you'll know that any errors related
             | to memory safety must be within an unsafe block. Keep
             | unsafe blocks small; you'll be thankful later when you
             | investigate memory bugs."
        
               | pclmulqdq wrote:
               | The way I have heard it described that I think is a bit
               | more succinct is "unsafe admits undefined behavior as
               | though it was safe."
        
               | Someone wrote:
               | But "Dereference a raw pointer", in combination with the
               | ability to create raw pointers pointing to arbitrary
               | memory addresses (that, you can do even in safe rust)
               | allows you to write arbitrary memory from unsafe rust.
               | 
               | So, _in theory_, unsafe rust opens the floodgates. _In
               | practice_, though, you can use small fragments of unsafe
               | code that programmers can fairly easily check to be safe.
               | 
               | Then, once you've convinced yourself that those fragments
               | are safe, you can be assured that your whole program is
               | safe (using 'safe' in the rust sense, of course)
               | 
               | So, there may be some small islands of unsafe code that
               | require extra attention from the programmer, but that
               | should be just a tiny fraction of all lines, and you
               | should be able to verify those islands in isolation.
        
               | steveklabnik wrote:
               | > allows you
               | 
               | This is where the rubber hits the road. Rust does not
               | allow you to do this, in the sense that this is possibly
               | undefined behavior. That "possibly" is why the compiler
               | allows you to write this code, because by saying
               | "unsafe", you are promising that this specific arbitrary
               | address is legal for you to write to. But that doesn't
               | mean that it's always legal to do so.
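
A short sketch of the distinction being drawn here, with hypothetical function names. Creating a raw pointer, even one to an arbitrary address, compiles in safe Rust. Only the dereference needs `unsafe`, and writing `unsafe` is the programmer's promise that this particular pointer is valid:

```rust
/// Safe: building a raw pointer from an arbitrary address needs no `unsafe`.
/// Nothing is read or written here, so nothing can go wrong yet.
pub fn arbitrary_pointer(addr: usize) -> *const u8 {
    addr as *const u8
}

/// Dereferencing needs `unsafe`. It is sound here only because of where
/// the pointer came from, which the compiler does not check for us.
pub fn read_via_raw(value: &u32) -> u32 {
    let p: *const u32 = value; // reference-to-raw coercion is safe
    // SAFETY: `p` was derived from a live shared reference, so it is
    // non-null, aligned, and points to an initialized `u32`.
    unsafe { *p }
}
```

Dereferencing the result of `arbitrary_pointer` would also compile inside an `unsafe` block, but it would almost certainly be undefined behaviour: the compiler accepts it, the language does not permit it.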
        
               | timschmidt wrote:
               | The compiler won't allow you to compile such code without
               | the unsafe. The unsafe is *you* promising the compiler
               | that *you* have checked to ensure that the address will
               | always be legal. So that the compiler will allow you to
               | compile the code.
        
               | steveklabnik wrote:
               | Right, I'm saying "allow" has two different connotations,
               | and only one of them, the one that you're talking about,
               | applies.
        
               | timschmidt wrote:
               | I gotcha. I misread and misunderstood. Yes, we agree.
        
               | uecker wrote:
               | This description is still misleading. The preconditions
               | for the correctness of an unsafe block can very much
               | depend on the correctness of the code outside and it is
               | easy to find Rust bugs where exactly this was the cause.
               | This is very similar to how C out-of-bounds accesses are
               | often caused by some logic error elsewhere. Also an unsafe
               | block has to maintain all the invariants the safe Rust
               | part needs to maintain correctness.
        
               | iknowstuff wrote:
               | No. Correctness of code _outside_ unsafe depends on
               | correctness inside those blocks, not the other way around
        
               | uecker wrote:
               | Sweet summer child.
        
               | iknowstuff wrote:
               | tf are you talking about
        
               | steveklabnik wrote:
               | They are (rudely) talking about
               | https://news.ycombinator.com/item?id=43382369
        
               | dwattttt wrote:
               | In a more helpful framing: safe Rust code doesn't need to
               | worry about its own correctness, it just is.
               | 
               | Unsafe code can be incorrect (or unsound), and needs to
               | be careful about it. Part of being careful is that safe
               | code can call the unsafe code in a way that triggers that
               | unsoundness; in that way, safe code can cause undefined
               | behaviour in unsafe code.
               | 
               | It's not always the case that this is possible; there are
               | unsafe blocks that don't need to depend on safe code for
               | its correctness.
        
               | dwattttt wrote:
               | It's true, but I think it's only fair if you hold Rust to
               | this analysis, other languages should too; the scrutiny
               | you're implying you need in an unsafe Rust block needs to
               | be applied to all C code, because all C code could depend
               | on code anywhere else for its safety characteristics.
               | 
               | In practice (in both languages) you check what the actual
               | unsafe code does (or "all" code in C's case), note code
               | that depends on external actors for safety (it's not all
               | C code, nor is it all unsafe Rust blocks), and check
               | their callers (and callers callers, etc).
        
               | uecker wrote:
               | What is true is that there are more operations in C which
               | can cause undefined behavior and those are more densely
               | distributed over the C code, making it harder to screen
               | for undefined behavior. This is true and Rust certainly
               | has an advantage, but it is not nearly as big of an
               | advantage as the "Rust is safe" (please do not look at
               | all the unsafe blocks we need to make it also fast!) and
               | "all C is unsafe" story wants you to believe.
        
               | dwattttt wrote:
               | The places where undefined behaviour can occur are also
               | limited in scope; you insist that that part isn't true,
               | because operations outside those unsafe blocks can impact
               | their safety.
               | 
               | That's only true at the same level of scrutiny as "all C
               | operations can cause undefined behaviour, regardless of
               | what they are", which I find similarly shallow.
        
               | gf000 wrote:
               | Rust is plenty fast, in fact there are countless examples
               | of _safe_ rust that will trivially beat out C in
               | performance due to no aliasing, enabling better
               | vectorization among others. Let alone being simply a more
               | expressive language and allowing writing better
               | optimizations (e.g. small strings, vs the absolutely
               | laughable c-strings that perform terribly, but also you
               | can actually get away with sharing more stuff in memory
               | vs doing defensive copies everywhere because it is safe
               | to do so, etc)
               | 
               | And there are not many things we have statistics on in CS,
               | but memory vulnerabilities being absolutely everywhere in
               | unsafe languages, and Rust cleaning up the absolute
               | majority of them even when only the new parts are written
               | in Rust are some of the few we _do_ know, based on
               | actual, real life projects at Google/Microsoft among
               | others.
               | 
               | A memory safe low-level language is as novel as it gets.
               | Rust is absolutely not just hype, it actually delivers
               | and you might want to get on with the times.
        
               | lambda wrote:
               | So, it's true that unsafe code can depend on
               | preconditions that need to be upheld by safe code.
               | 
               | But using ordinary module encapsulation and private
               | fields, you can scope the code that needs to uphold those
               | preconditions to a particular module.
               | 
               | So the "trusted computing base" for the unsafe code can
               | still be scoped and limited, allowing you to reduce the
               | amount of code you need to audit and be particularly
               | careful about for upholding safety guarantees.
               | 
               | Basically, when writing unsafe code, the actual unsafe
               | operations are scoped to only the unsafe blocks, and they
               | have preconditions that you need to scope to a particular
               | module boundary to ensure that there's a limited amount
               | of code that needs to be audited to ensure it upholds all
               | of the safety invariants.
               | 
               | Ralf Jung has written a number of good papers and blog
               | posts on this topic.
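
A minimal sketch of that scoping, assuming a hypothetical `FixedBuf` type. The unsafe block relies on the invariant `len <= 8`, and because `len` is a private field, only code inside the `fixed_buf` module can break it. That module is the whole audit surface:

```rust
mod fixed_buf {
    /// A tiny fixed-capacity byte buffer (hypothetical example type).
    pub struct FixedBuf {
        data: [u8; 8],
        len: usize, // invariant: len <= data.len()
    }

    impl FixedBuf {
        pub fn new() -> Self {
            FixedBuf { data: [0; 8], len: 0 }
        }

        /// Safe code inside the module upholds the invariant.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false; // full: refuse rather than overflow
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        pub fn len(&self) -> usize {
            self.len
        }

        /// The unsafe block depends on the invariant kept by `push`.
        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: 0 < len <= 8, so `len - 1` is in bounds. Only this
            // module can modify `len`, since the field is private.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}
```

If `len` were a public field, any safe caller could set it to 100 and turn `last` into an out-of-bounds read, which is the dependency on outside code discussed above. Keeping the field private is what keeps that dependency inside one module.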
        
               | uecker wrote:
               | And you think one cannot modularize C code and
               | encapsulate critical buffer operations in much safer
               | APIs? One can, the problem is that a lot of legacy C code
               | was not written this way. Also a lot of newly written C
               | code is not written this way, but the reason is often
               | that people cut corners when they need to get things done
               | with limited time and resources. The same you will see
               | with Rust.
        
               | gf000 wrote:
               | Even innocent looking C code can be chock-full of UBs
               | that can invalidate your "local reasoning" capabilities.
               | So, not even close.
        
               | wavemode wrote:
               | Care to share an example?
        
               | gf000 wrote:
               | This is technically correct, but a bit pedantic.
               | 
               | Sure, you can technically just write your own
               | vulnerability for your own program and inject it at an
               | unsafe and see the whole world crumble... but the exact
               | same is true for any form of FFI calls in any language.
               | Is Java memory safe? Yeah, just because I can grab a
               | random pointer and technically break anything I want
               | won't change that.
               | 
               | The fact that a memory vulnerability _error_ may either
               | appear at no place at all _OR_ at the couple hundred
               | lines of code throughout the whole project is a night and
               | day difference.
        
               | onnimonni wrote:
               | Would someone with more experience be able to explain to
               | me why can't these operations be "safe"? What is blocking
               | rust from producing the same machine code in a "safe"
               | way?
        
               | vlovich123 wrote:
               | Those specific functions are compiler builtin vector
               | intrinsics. The main reason is that they can easily read
               | past ends of arrays and have type safety and aliasing
               | issues.
               | 
               | By the way, the rust compiler does generate such code
               | because under the hood LLVM runs an autovectorizer when
               | you turn on optimizations. However, for the
               | autovectorizer to do a good job you have to write code in
               | a very special way and you have no way of controlling
               | whether or not it kicked in and once it did that it did a
               | good job.
               | 
               | There's work on creating safe abstractions (that also
               | transparently scale to the appropriate vector
               | instruction), but progress on that has felt slow to me
               | personally and it's not available outside nightly
               | currently.
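
As a rough illustration of code written with the autovectorizer in mind, here is a sketch with a hypothetical `xor_in_place` function. Iterating with `zip` instead of indexing avoids per-element bounds checks, which tends to make loops like this easier for LLVM to vectorize. Nothing in the language guarantees that it will, so the generated assembly still has to be inspected:

```rust
/// XOR `src` into `dst` element by element, stopping at the shorter slice.
/// Entirely safe code; whether it becomes SIMD is up to the optimizer.
pub fn xor_in_place(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        *d ^= *s;
    }
}
```

The explicit intrinsics trade this uncertainty for control: the instruction selection is fixed by the programmer rather than left to the optimizer.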
        
               | NobodyNada wrote:
               | Rust's raw pointers are more-or-less equivalent to C
               | pointers, with many of the same types of potential
               | problems like dangling pointers or out-of-bounds access.
               | Rust's references are the "safe" version of doing pointer
               | operations; raw pointers exist so that you can express
               | patterns that the borrow checker can't prove are sound.
               | 
               | Rust encourages using unsafe to "teach" the language new
               | design patterns and data structures; and uses this
               | heavily in its standard library. For example, the Vec
               | type is a wrapper around a raw pointer, length, and
               | capacity; and exposes a safe interface allowing you to
               | create, manipulate, and access vectors with no risk of
               | pointer math going wrong -- assuming the people who
               | implemented the unsafe code inside of Vec didn't make a
               | mistake, the external, safe interface is guaranteed to be
               | sound no matter what external code does.
               | 
               | Think of unsafe not as "this code is unsafe", but as
               | "I've proven this code to be safe, and the borrow checker
               | can rely on it to prove the safety of the rest of my
               | program."
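
To make the pointer, length and capacity description concrete, here is a small sketch that takes a `Vec` apart and rebuilds it using public standard-library calls. The function name `roundtrip` is hypothetical. This is not how `Vec` is implemented internally; it only shows the three pieces its safe interface is built on:

```rust
use std::mem::ManuallyDrop;

/// Decompose a vector into (pointer, length, capacity) and rebuild it.
pub fn roundtrip(v: Vec<u32>) -> Vec<u32> {
    // Prevent the original vector from freeing the allocation.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: the three parts came from a live `Vec<u32>` that will not be
    // dropped, so the new vector becomes the sole owner of the allocation.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Everything a normal user does with the vector (push, index, iterate) goes through the safe API, which checks bounds and manages the allocation on their behalf.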
        
               | adgjlsfhk1 wrote:
               | often the unsafe code is at the edges of the type system.
               | e.g. sometimes the proof of safety is that someone read
               | the source code of the c library that you are calling out
               | to. it's not useful to think of machine code as safe or
               | unsafe. safety often refers to whether the types of your
               | data match the lifetime dataflow.
        
               | rybosome wrote:
               | I believe the post you are replying to was referring to
               | the fact that you could take actions in that unsafe block
               | that would compromise the guarantees of rust; eg you
               | could do something silly, leave the unsafe block, then
               | hit an "impossible" condition later in the program.
               | 
               | A simple example might be modifying a const value deep
               | down in some class, where it only becomes apparent later
               | in the program's execution. Hence their analogy of the
               | wet dog in a clean room - whatever beliefs you have about
               | the structure of memory in your entire program, and
               | guaranteed by the compiler, could have been undone by a
               | rogue unsafe.
        
             | CooCooCaCha wrote:
             | I wouldn't go that far. Bevy for example, uses unsafe
             | internally but is VERY strict about it, and every use of
             | unsafe requires a comment explaining why the code is safe.
             | 
             | In other words, unsafe works if you use it carefully and
             | keep it contained.
        
               | tonyhart7 wrote:
               | right, the point is raising awareness and the assumption
               | that it's not a 100-or-0 problem
        
             | SkiFire13 wrote:
             | You lose the nice guarantees inside the `unsafe` block, but
             | the point is to write a sound and safe interface over it,
             | that is an API that cannot lead to UB no matter how other
             | safe code calls it. This is basically the encapsulation
             | concept, but for safety.
             | 
             | To continue the analogy of the dog, you let the dog get wet
             | (=you use unsafe), but you put a cleaning room (=the sound
             | and safe API) before your sealed room (=the safe code
             | world)
        
             | timeon wrote:
             | > unsafe even a single time, you lose all of Rust's nice
             | guarantees
             | 
             | Not sure why _one_ would result in _all_. One of Rust's
             | advantages is the clear boundary between safe/unsafe.
        
             | wongarsu wrote:
             | If your unsafe code violates invariants it was supposed to
             | uphold, that can wreck safety properties the compiler was
             | trying to uphold elsewhere. If you can achieve something
             | without unsafe you definitely should (safe, portable simd
             | is available in rust nightly, but it isn't stable yet).
             | 
             | At the same time, unsafe doesn't just turn off all compiler
             | checks, it just gives you tools to go around them, as well
             | as tools that happen to go around them because of the way
             | they work. Rust unsafe is this weird mix of being safer
             | than pure C, but harder to grasp; with lots of nuanced
             | invariants you have to uphold. If you want to ensure your
             | code still has all the nice properties the compiler
             | guarantees (which go way beyond memory safety) you would
             | have to carefully examine every unsafe block. Which few
             | people do, but you generally still end up with a better
             | status quo than C/C++ where _any_ code can in principle
             | break properties other code was trying to uphold.
        
             | sunshowers wrote:
             | What language is the JVM written in?
             | 
             |  _All_ safe code in existence running on von Neumann
             | architectures is built on a foundation of unsafe code. The
             | goal of _all_ memory-safe languages is to provide safe
             | abstractions on top of an unsafe core.
        
             | janice1999 wrote:
             | Claiming unsafe invalidates "all of the nice memory-safety
             | properties" is like saying having windows in your house
             | does away with all the structural integrity of your walls.
             | 
             | There's even unsafe usage in the standard library and it's
             | used a lot in embedded libraries.
        
             | vlovich123 wrote:
             | You lose those guarantees if and only if the code
             | within the unsafe block violates the rules of the Rust
             | language.
             | 
             | Normally in safe code you can't violate the language rules
             | because the compiler enforces various rules. In unsafe
             | mode, you can do several things the compiler would normally
             | prevent you from doing (e.g. dereferencing a raw
             | pointer). If you uphold all the preconditions of the
             | language, safety is preserved.
             | 
             | What's unfortunate is that the rules you are required to
             | uphold can be more complex than you might anticipate if
             | you're trying to use unsafe to write C-like code. What's
             | fortunate is that you rarely need to do this in normal code
             | and in SIMD which is what the snippet is representing
             | there's not much danger of violating the rules.
        
             | pdimitar wrote:
             | Where did you even get that weird extreme take from?
             | 
             | O_o
        
           | colonwqbang wrote:
           | Can't rust do safe simd? This is just vectorised
           | multiplication and xor, but it gets labelled as unsafe. I
           | imagine most code that wants to be fast would use simd to
           | some extent.
        
             | steveklabnik wrote:
             | It's still nightly-only.
        
         | pcwalton wrote:
         | > Presumably with inline assembly both languages can emit what
         | is effectively the same machine code. Is the Rust compiler a
         | better optimizing compiler than C compilers?
         | 
         | rustc uses LLVM just as clang does, so to a first approximation
         | they're the same. For any given LLVM IR you can _mostly_ write
         | equivalent Rust and C++ that causes the respective compiler to
         | emit it (the switch fallthrough thing mentioned in the article
         | is interesting though!) So if you're talking about what's
         | _possible_ (as opposed to what's _idiomatic_), the question
         | of "which language is faster" isn't very interesting.
        
         | AlotOfReading wrote:
         | The key difference is that there are invariants you can rely on
         | as a user of the library, and they'll be enforced by the
         | compiler outside the unsafe blocks. The corresponding C
         | invariants mostly aren't enforced by the compiler. Worse, many
         | C programmers will actively argue that some amount of undefined
         | behavior is "fine".
        
         | jdefr89 wrote:
         | Not to mention they link to libc.. All rust code does last I
         | checked...
        
           | techjamie wrote:
           | There is an option to not link to it for instances like OS
           | writing and embedded. Writing everything in pure Rust without
           | libc is entirely possible, even if it's an exercise in losing
           | sanity when you're reimplementing every syscall you need from
           | scratch.
           | 
           | But even then, your code is calling out to kernel functions
           | which are probably written in C or assembly, and therefore
           | "dangerous."
           | 
           | Rust code safety is overhyped frequently, but reducing an
           | attack surface is still an improvement over not doing so.
        
             | jdefr89 wrote:
             | I agree and binary exploitation/Vulnerability Research is
             | my area of expertise.. The whole "Let's port everything to
             | Rust" is so misguided. Binary exploitation has already
             | gotten 20x harder than say ten years ago.. Even so.. Most
             | big breaches happen because people reuse their password or
             | just give it out... Nation States are pretty much the only
             | parties capable of delivering full kill chains that
             | exploit, say chrome... That is why I moved to the embedded
             | space.. Still so insecure...
        
         | einpoklum wrote:
         | > At what point does it really stop mattering if this is C or
         | Rust?
         | 
         | That depends. If, for you, safety is something relative and
         | imperfect rather than absolute, guaranteed and reliable, then -
         | the answer is that once you have the first non-trivial unsafe
         | block that has not gotten standard-library-level of scrutiny.
         | But if that's your view, you should not be all that starry-eyed
         | about how "Rust is a safe language!" to begin with.
         | 
         | On the other hand, if you really do want to rely on Rust's
         | strong safety guarantees, then the answer is: From the moment
         | you use any library with unsafe code.
         | 
         | My 2 cents, anyway.
        
         | koito17 wrote:
         | The purpose of `unsafe` is for the compiler to assume a block
         | of code is correct. SIMD intrinsics are marked as unsafe
         | because they take raw pointers as arguments.
         | 
         | In safe Rust (the default), memory access is validated by the
         | borrow checker and type system. Rust's goal of soundness means
         | safe Rust should never cause out-of-bounds access, use-after-
         | free, etc; if it does, then there's a bug in the Rust compiler.
        
           | no_wizard wrote:
           | How do we know if Rust is safe unless Rust is written purely
           | in safe Rust?
           | 
           | Is that not true? Even validators have bugs or miss things
           | no?
        
             | steveklabnik wrote:
             | > Even validators have bugs
             | 
             | Yep! For example, https://github.com/Speykious/cve-rs
             | exploits a bug in the Rust compiler, which allows
             | something that it shouldn't. It's on its way to being
             | fixed.
             | 
             | > or miss things no?
             | 
             | This is the trickier part! Yes, even proofs have axioms,
             | that is, things that are accepted without proof, that the
             | rest of the proof is built on top of. If an axiom is
             | incorrect, so is the proof, even though we've proven it.
        
           | int_19h wrote:
           | Out of curiosity, _why_ do they take raw pointers as
           | arguments, rather than references?
        
             | steveklabnik wrote:
             | From the RFC: https://rust-lang.github.io/rfcs/2325-stable-
             | simd.html
             | 
             | > The standard library will not deviate in naming or type
             | signature of any intrinsic defined by an architecture.
             | 
             | I think this makes sense, just like any other intrinsic:
             | unsafe to use directly, but with safe wrappers.
             | 
             | I believe that there are also some SIMD things that would
             | have to inherently take raw pointers, as they work on
             | pointers that aren't aligned, and/or otherwise not valid
             | for references. In theory you could make only those take
             | raw pointers, but I think the blanket policy of "follow
             | upstream" is more important.
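
A minimal sketch of the "unsafe to use directly, but with safe wrappers" pattern described above. It uses the portable `std::ptr::read_unaligned` rather than an architecture-specific intrinsic, and the helper name `load_u32_le` is hypothetical (it is not part of stdarch or zlib-rs). The wrapper does the bounds check once, so callers never touch a raw pointer:

```rust
use std::ptr;

/// Hypothetical safe wrapper: reads a little-endian u32 starting at
/// `offset`, or returns `None` if fewer than 4 bytes are available.
pub fn load_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against overflow for very large offsets.
    let end = offset.checked_add(4)?;
    if end > bytes.len() {
        return None;
    }
    // SAFETY: the check above guarantees that the 4 bytes starting at
    // `offset` lie inside `bytes`. `read_unaligned` has no alignment
    // requirement, which a `&u32` reference would have.
    let raw = unsafe { ptr::read_unaligned(bytes.as_ptr().add(offset).cast::<u32>()) };
    Some(u32::from_le(raw))
}
```

The raw pointer here may be misaligned for `u32`, which is exactly the kind of value that could not be expressed as a reference; the unsafe part stays confined to one line behind a checked interface.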
        
         | sesm wrote:
         | Rust code emitter is Clang, the same one that Apple uses for C
         | on their platforms. I wouldn't expect any miracles there, as
         | Rust authors have zero influence over it. If any compiler is
         | using any secret Clang magic, that would be Swift or
         | Objective-C, since they are developed by Apple.
        
           | nindalf wrote:
           | You're conflating clang and LLVM.
        
             | sesm wrote:
             | Yes, you are right, should be 'code emitter is LLVM, the
             | same that Clang uses for C'
        
         | xxs wrote:
         | Oddly enough, that's not the most optimized version of crc32,
         | e.g. it's not an AVX-512 variant.
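
For reference, a plain scalar baseline for the checksum under discussion might look like the sketch below. This is not the zlib-rs or zlib-ng code; it is the textbook bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320, the variant zlib and gzip use), and the function name `crc32_bitwise` is made up for illustration. SIMD variants compute the same value, only many bytes at a time:

```rust
/// Bit-at-a-time CRC-32 with the reflected polynomial 0xEDB88320.
/// Slow, but useful as a reference to check faster variants against.
pub fn crc32_bitwise(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF; // standard initial value
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, else zero,
            // so the polynomial is XORed in without a branch.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc // final XOR with 0xFFFFFFFF
}
```

The standard check input `b"123456789"` yields `0xCBF43926`, which is a convenient way to validate any accelerated implementation.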
        
         | Shorel wrote:
         | Awesome find. This really means:
         | 
         | Assembly language faster than C. And faster than Rust. Assembly
         | can be very fast.
        
         | bitwize wrote:
         | You can use 'unsafe' blocks to delineate places on the hot path
         | where you _need_ to take the limiters off, then trust that the
         | rest of the code will be safe. In C, _all_ your code is unsafe.
         | 
         | We will see more and more Rust libraries trounce their C
         | counterparts in speed, because Rust is more fun to work in
         | because of the above. Rust has democratized high-speed and
         | concurrent systems programming. Projects in it will attract a
         | larger, more diverse developer base -- developers who would be
         | loath to touch a C code base for (very justified) fear of
         | breaking something.
        
         | dzaima wrote:
         | Looks like as of 2 weeks ago the unsafe block should no longer
         | be required: https://github.com/rust-lang/stdarch/pull/1714
         | 
         | ...at least outside of loads/stores. From a bit of looking at
         | the code, though, it seems like a good amount of those should
         | be doable in a safe way with some abstractions.
        
         | gf000 wrote:
         | Rust's borrow checker still checks within unsafe blocks, so
         | unless you are _only_ operating with raw pointers (and not
         | accessing certain references as raw pointers in some small,
         | well-defined blocks) across the whole program it will be
         | significantly more safe than C. Especially given all the other
         | language benefits, like a proper type system that can encode a
         | bunch of invariants, no footguns at every
         | line/initialization/cast, etc.
        
           | acdha wrote:
           | Yes. I think it's easy to underestimate how much the richer
           | language and library ecosystem chip away at the attack
           | surface area. So many past vulnerabilities have been in code
           | which isn't dealing with low-level interfaces or weird
           | performance optimizations and wouldn't need to use unsafe.
           | There've been so many vulnerabilities in crypto code which
           | weren't the encryption or hashing algorithms but things like
           | x509/ASN parsing, logging, or the kind of option/error
           | handling logic a Rust programmer would use the type system to
           | validate.
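
As a rough illustration of the point about parsing code, here is a sketch of a length-prefixed record parser where the type system forces every failure path to be handled. The record layout (one tag byte, one length byte, then the value) and the names `parse_tlv` and `ParseError` are invented for this example; it is not real X.509/ASN.1 parsing:

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// Fewer than two bytes: no room for tag and length.
    MissingHeader,
    /// The length byte claims more data than the input holds.
    ValueTruncated,
}

/// Parses one hypothetical record and returns (tag, value, remaining input).
/// Entirely safe Rust: a bad length becomes an `Err`, not an over-read.
pub fn parse_tlv(input: &[u8]) -> Result<(u8, &[u8], &[u8]), ParseError> {
    let (&tag, rest) = input.split_first().ok_or(ParseError::MissingHeader)?;
    let (&len, rest) = rest.split_first().ok_or(ParseError::MissingHeader)?;
    let len = usize::from(len);
    if len > rest.len() {
        return Err(ParseError::ValueTruncated);
    }
    let (value, remaining) = rest.split_at(len);
    Ok((tag, value, remaining))
}
```

A caller cannot get at `value` without first dealing with the `Result`, and an attacker-controlled length byte that overshoots the buffer ends up as `ValueTruncated` rather than reading adjacent memory.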
        
         | asveikau wrote:
         | > At what point does it really stop mattering if this is C or
         | Rust?
         | 
         | If I read TFA correctly, they came up with a library that is
         | API compatible with the C one, but which they've measured to
         | be faster.
         | 
         | At that point I think in addition to safety benefits in other
         | parts of the library (apart from unsafe micro optimizations as
         | quoted), what they're leveraging is better compiler technology.
         | Intuitively, I start to assume that the rust compiler can
         | perhaps get away with more optimizations that might not be safe
         | to assume in C.
        
       | cb321 wrote:
       | I think this _may_ not be a very high bar. zippy in Nim claims to
       | be about 1.5x to 2.0x faster than zlib:
       | https://github.com/guzba/zippy I think there are also faster
       | zlibs around in C than the standard install one, such as
       | https://github.com/ebiggers/libdeflate (EDIT: also mentioned
       | elsethread https://news.ycombinator.com/item?id=43381768 by
       | mananaysiempre)
       | 
       | zlib itself seems pretty antiquated/outdated these days, but it
       | does remain popular, even as a basis for newer parallel-friendly
       | formats such as https://www.htslib.org/doc/bgzip.html
        
         | hinkley wrote:
         | Zlib is unapologetically written to be portable rather than
         | fast. It is absolutely no wonder that a Rust implementation
         | would be faster. By contrast, Rust runs on a pathetically
         | small number of systems. This is not a dig at Rust, it's an
         | acknowledgement of how many systems exist out there, once you
         | include embedded, automotive, aerospace, telecom, industrial
         | control systems, and mainframes.
         | 
         | Richard Hipp denounces claims that SQLite is the most widely
         | used piece of code in the world and offers zlib as a candidate
         | for that title, which I believe he is entirely correct about.
         | I've been consciously using it for almost thirty years, and
         | for a few years before that without knowing I was.
        
         | lern_too_spel wrote:
         | They're comparing against zlib-ng, not zlib. zlib-ng is more
         | than twice as fast as zlib for decompression.
         | https://github.com/zlib-ng/zlib-ng/discussions/871
         | 
         | libdeflate is not zlib compatible. It doesn't support streaming
         | decompression.
        
         | mastax wrote:
         | The benchmarks in the parent post are comparing to zlib-ng,
         | which is substantially faster than zlib. The zippy claims are
         | against "zlib found on a fresh Linux install" which at least
         | for Debian is classic zlib.
        
         | JoshTriplett wrote:
         | The bar here is not zlib, it's zlib-ng, which aims primarily
         | for performance.
         | 
         | libdeflate is an impressive library, but it doesn't help if you
         | need to stream data rather than having it all in memory at
         | once.
        
       | jrockway wrote:
       | Chromium is kind of stuck with zlib because it's the algorithm
       | that's in the standards, but if you're making your own protocol,
       | you can do even better than this by picking a better algorithm.
       | Zstandard is faster and compresses better. LZ4 is much faster,
       | but not quite as small.
       | 
       | Some reading:
       | https://jolynch.github.io/posts/use_fast_data_algorithms/
       | 
       | (As an aside, at my last job container pushes / pulls were in the
       | development critical path for a lot of workflows. It turns out
       | that sha256 and gzip are responsible for a lot of the time spent
       | during container startup. Fortunately, Zstandard is allowed, and
       | blake3 digests will be allowed soon.)
        
         | jeffbee wrote:
         | Yeah I just discovered this a few days ago. All the docker-era
         | tools default to gzip but if using, say, bazel rules_oci
         | instead of rules_docker you can turn on zstd for large speedups
         | in push/pull time.
        
       | amorio2341 wrote:
       | Not surprised at all, Rust is the future.
        
       | akagusu wrote:
       | Bravo. Now Rust has its existence justified.
        
       ___________________________________________________________________
       (page generated 2025-03-16 23:00 UTC)