[HN Gopher] SHA-3 Buffer Overflow
       ___________________________________________________________________
        
       SHA-3 Buffer Overflow
        
       Author : cbzbc
       Score  : 393 points
       Date   : 2022-10-20 21:34 UTC (1 day ago)
        
 (HTM) web link (mouha.be)
 (TXT) w3m dump (mouha.be)
        
       | tinglymintyfrsh wrote:
       | Oh shit. I better check my native Ruby extension that I believe
       | uses the reference C code.
        
          | tinglymintyfrsh wrote:
          | Phew. It works.
          | 
          |     gem specific_install https://github.com/steakknife/digest-sha3-ruby
          |     git version 2.31.1
          |     http installing from https://github.com/steakknife/digest-sha3-ruby
          |     Cloning into '/var/folders/x6/87j2gpl54x79m0nvns2lsrpc0000gn/T/d20221020-79527-r1vite'...
          |     remote: Enumerating objects: 191, done.
          |     remote: Counting objects: 100% (3/3), done.
          |     remote: Compressing objects: 100% (3/3), done.
          |     remote: Total 191 (delta 0), reused 0 (delta 0), pack-reused 188
          |     Receiving objects: 100% (191/191), 6.10 MiB | 11.79 MiB/s, done.
          |     Resolving deltas: 100% (65/65), done.
          |     Successfully built RubyGem
          |     Name: digest-sha3
          |     Version: 2.0.0
          |     File: digest-sha3-2.0.0.gem
          |     Building native extensions. This could take a while...
          |     Successfully installed
          | 
          |     pry -rdigest/sha3
          |     [1] pry(main)> h = Digest::SHA3.new(224)
          |     => #<Digest::SHA3: 6b4e03423667dbb73b6e15454f0eb1abd4597f9a1b078e3f5b5a6bc7>
          |     [2] pry(main)> h.update("\x00")
          |     => #<Digest::SHA3: bdd5167212d2dc69665f5a8875ab87f23d5ce7849132f56371a19096>
          |     [3] pry(main)> h.update("\x00"*4294967295)
          |     => #<Digest::SHA3: c5bcc3bc73b5ef45e91d2d7c70b64f196fac08eee4e4acf6e6571ebe>
          |     [4] pry(main)> h.hexdigest
          |     => "c5bcc3bc73b5ef45e91d2d7c70b64f196fac08eee4e4acf6e6571ebe"
          |     [5] pry(main)>
        
        | all_these_years wrote:
        | SHA-3 in Ruby:
        | 
        |     $ gem install sha3
        |     $ irb
        |     > require 'sha3'
        |     > s = SHA3::Digest::SHA224.new
        |     > s.update("\x00")
        |     > s.update("\x00" * 4294967295)
        |     [ Segmentation fault... ]
        | 
        | Tested with Ruby 3.1.2
        | 
        | Gem's code (including C native extension):
        | https://github.com/johanns/sha3
        
         | abraham wrote:
         | Just over half a million total downloads. So some use but not
         | particularly popular.
        
         | [deleted]
        
       | [deleted]
        
       | Ptchd wrote:
       | Does Tor use SHA-3?
        
       | adunsulag wrote:
        | FYI, in the healthcare space, to meet ONC Cures Update
        | regulations for bulk patient export, it's a requirement to use
        | SHA384 (RS384 or ES384). I would think most implementations
        | will be safe due to the 4GB payload criteria, but if anyone is
        | doing anything funky they'll need to look at this closely.
        
         | jakobdabo wrote:
         | SHA-384 is a hash function from the SHA-2 family, and it has no
         | relation to SHA-3.
        
       | bobkazamakis wrote:
       | https://www.xilinx.com/products/intellectual-property/1-1mje...
       | 
       | https://documentation-service.arm.com/static/62bb3e4bb334256...
        
       | Donckele wrote:
        | "I've also shown how a specially constructed file can result in
        | arbitrary code execution, "
        | 
        | Ouch, that's not looking good for a reference implementation of
        | a piece of software deployed on billions and billions of
        | machines that was written by experts and reviewed (and
        | modified) by even bigger egg heads.
        
         | [deleted]
        
         | manholio wrote:
         | Just another nail in the long overdue C/C++ coffin. If not even
         | these experts can get it right, under the best development
         | conditions, public review and highest stakes, on a relatively
         | small component with the most strict specifications, then no
         | one can.
        
            | adev_ wrote:
            | > Just another nail in the long overdue C/C++ coffin
            | 
            | C, f**ing C. For the sake of god. Not C++.
            | 
            | Stop putting both in the same basket; it just shows you do
            | not know what you are talking about.
            | 
            | https://github.com/XKCP/XKCP/commit/fdc6fef075f4e81d6b1bc383...
            | 
            | Something like that in modern C++ would have been done
            | using std::span<>, which prevents this kind of
            | out-of-bound access and the party-time that comes with it.
        
             | eterevsky wrote:
             | Code like that in C++ would probably be over-optimized and
             | would skip boundary checks, exposing it to the same
             | problem.
        
             | pjmlp wrote:
              | As much as I like C++, std::span<> would only have
              | helped if the code was built with bounds checking turned
              | on.
              | 
              | ISO, in its wisdom, turned the safety defaults around
              | relative to classical C++ frameworks.
              | 
              | So secure C++ code either has to make use of Microsoft's
              | gsl::span<>, or again turn on bounds checking in release
              | builds; you cannot trust everyone to call .at().
        
             | saagarjha wrote:
             | C++ offers safer constructs but they have their own sharp
             | edges and you have to actually use them to get their
             | benefits. In security critical software these can still be
             | enough to cause significant problems.
        
             | boardwaalk wrote:
              | Mmmhm. Go look in the docs for std::span and see how long
              | it takes to find "The behavior is undefined if idx is out
              | of range" or similar. For example:
              | https://en.cppreference.com/w/cpp/container/span/operator_at
        
               | adev_ wrote:
               | > Mmmhm. Go look in the docs for std::span and see how
               | long it takes to find "The behaviour is undefined if idx
               | is out of range" or similar.
               | 
                | It is regrettable and being fixed, but:
                | 
                | - You generally have your own implementation with
                | more advanced bounds checks for safety-critical work.
                | We have our own where I work right now; Google has
                | its own in Abseil.
                | 
                | - span allows the use of range-for loops, which
                | combined with subspan() makes it possible to avoid
                | pointer arithmetic altogether.
        
               | manholio wrote:
               | When you are already in a deep hole, it's a good idea to
               | stop digging.
               | 
                | How much time did it take you to understand all this?
                | What chance does a beginner have to understand these
                | subtle differences, learn the particular way your
                | codebase uses std::span, and not footgun themselves?
                | How much productivity is lost to the sheer terror of
                | such footguns - not actual bugs, but time and effort
                | lost that did not move the product forward?
               | 
               | And most importantly, what is the point of accepting that
               | huge cost instead of writing in a language where unsafe
               | data access is impossible?
        
               | adev_ wrote:
               | > And most importantly, what is the point of accepting
               | that huge cost instead of writing in a language where
               | unsafe data access is impossible?
               | 
               | Because the world is a place where billions of lines of
               | code have been written. And these will never be
               | rewritten, but can be modernized.
               | 
                | That's the reality, and that's why your web browser
                | and most of your OS are still written in unsafe
                | languages right now. And will be for the next 15
                | years at least.
        
               | manholio wrote:
               | Surely "nails in the coffin of C/C++" does not imply an
               | all out effort to rewrite all this existing and working
               | software from scratch.
               | 
               | The beast will die in the same sense Fortran or Cobol are
               | dead: an outdated language that has no noticeable
               | strengths in the modern world, used by a bunch of
               | (difficult to hire) old timers to fix legacy systems.
        
               | adev_ wrote:
               | > Surely "nails in the coffin of C/C++" does not imply an
               | all out effort to rewrite all this existing and working
               | software from scratch.
               | 
                | Or you use the evolutionary approach: remove the
                | unsafe parts from C/C++ bit by bit, and do code
                | migration (Carbon style) when you can. This is what
                | some C and C++ committee members are trying to do,
                | and fortunately they are there.
                | 
                | Because that is way more likely to happen in a
                | reasonable time frame (the next 15 years) than
                | rewriting Chrome, Linux, OpenSSL, Office and the tens
                | of billions of lines of proprietary code we have
                | around in Rust.
        
               | manholio wrote:
               | > remove the unsafe part from C/C++ bit by bit.
               | 
               | You cannot do that without breaking backward
               | compatibility. If you maintain compatibility, what
               | results is a sprawling meta-language with a large
               | learning burden that has the theoretical ability to
               | emulate a safe language, and which in practice will be
               | used to mix both safe and unsafe idioms depending on the
               | experience and discipline of the programmer, in an unholy
               | mess that no static analyzer can prove as correct and no
               | programmer can say they really fully understand.
               | 
               | Maybe I'm mistaken, I've last programmed in C++ a good
               | number of years ago, but I have not encountered a (C++
               | programming paradigm, static linter) doublet that can
               | prove data integrity and thread safety to an extent even
               | close to the Rust compiler. And it would be useless for
               | existing codebases anyhow.
        
             | tialaramex wrote:
             | WG21 (the C++ Standards Committee) feels that since
             | idiomatic C++ was unsafe (e.g. array operations don't have
             | bounds checks) it is consistent for new C++ features to
             | also be unsafe by default, even if the whole _point_ of
             | those features in other languages is to provide an
             | idiomatic safe way to do things.
             | 
             | So for example in Rust you have a native slice type
             | reflecting a dynamically sized view into a contiguous
             | sequence, and of course it's safe -- five[20] will either
             | refuse to compile if the compiler can see this slice is too
             | short, or it will panic at runtime. In C++ as others have
             | mentioned actually std::span just isn't safe when used
             | ergonomically. five[20] in C++ is Undefined Behaviour.
             | std::optional has a safe interface for a Maybe / Option
             | type, but is presented with a more ergonomic unsafe
             | behaviour and that's what people use.
             | 
             | This is not an old-fashioned thing the C++ Committee grew
             | out of, std::expected is in C++ 23, that's a Result type
             | and it likewise offers an ergonomic unsafe API. The safe
             | APIs were seen as a nice-to-have which could miss the train
             | because who needs safety anyway? Just don't make any
             | mistakes.
        
               | pjmlp wrote:
                | Tragically, what I would call idiomatic C++ with
                | compiler-provided frameworks pre-ISO C++98 made use
                | of bounds checking in its collection classes, and
                | then the C++ Committee went completely the other way.
                | 
                | Naturally we now have ways to enable it on all major
                | compilers, but many still don't.
                | 
                | This is one point where I kind of agree with you.
        
               | tialaramex wrote:
                | The crucial trick in Rust is _not_ that the index
                | operations are bounds checked; that's easy, and as
                | you observed plenty of C++ toolchains can do that.
                | 
                | The crucial trick is providing the unchecked
                | operations (with improved performance) as unergonomic
                | alternatives. *(five.get_unchecked_mut(20)) = k;
                | // looks _horrible_. Nobody wants to write that, so
                | when they don't _need_ it they won't write it. The
                | fact that calling get_unchecked_mut requires _unsafe_
                | is part of how Rust could achieve its goals, but the
                | choice to not make this ergonomic is why it actually
                | delivers in practice.
               | 
               | What CppFront wants here, and P2687 proposes, and lots of
               | other C++ work has suggested, is roughly:
               | [[suppress(bounds_check)]] { five[20] = k; }
               | 
               | Thus imitating what they _think_ Rust does here, rather
               | than what it actually did, and in the process completely
               | missing the point.
        
               | pjmlp wrote:
                | Given that in some domains there is no way around C
                | or C++, unless one wants to be part of building the
                | ecosystem, I had big hopes for the clang and VC++
                | static analysis work for those kinds of scenarios.
                | 
                | I can't speak for clang, but VC++'s is still mostly
                | useless, unless one wants to annotate everything with
                | those kinds of annotations + SAL, and even then it is
                | only halfway there.
                | 
                | Which is not really inspiring.
        
           | hot_gril wrote:
           | What's to replace C, Rust?
        
         | hot_gril wrote:
         | At least it's not SHA-2, which is probably more widely used. Or
         | am I mixing them up?
        
           | turminal wrote:
           | You are correct, SHA-2 is much more prevalent.
        
       | jbaczuk wrote:
       | no bounty and still politely reports it. Good guys need more
       | praise.
        
         | baby wrote:
         | it's a public standard, who would pay such a bounty?
        
           | amenghra wrote:
           | A couple companies sponsor "the internet bug bounty":
           | https://www.hackerone.com/internet-bug-bounty
           | 
           | IMHO, sha3 should be part of that bounty.
        
           | morepork wrote:
           | I'm sorry, we lost millions of dollars due to an exploit.
           | 
           | Don't you have bug bounties to find and fix these things?
           | 
           | But this was in a public standard that we were using, we
           | don't cover those.
           | 
           | Do I look like I care?
        
             | baby wrote:
             | Are you telling me this? Or are you going to pay a bounty?
        
           | mhh__ wrote:
           | Someone who values security with real money
        
         | [deleted]
        
       | yieldcrv wrote:
       | Bro what? This is a big deal right?
       | 
       | Like NSO Group hacking everyone for the next 5 years deal?
        
         | raydiatian wrote:
         | > The vulnerable code was released in January 2011, so it took
         | well over a decade for this vulnerability to be found.
         | 
         | It would have already been a big deal for the past five years,
         | if anything.
         | 
          | Furthermore, you have to induce a buffer overflow first. I'm
          | no security researcher, but to my mind that means you're
          | either (a) trusting idiots to write sensitive code without
          | an IT Sec team reviewing it, or (b) you've got a malicious
          | actor on the inside who can slip it past review. I'm sure
          | there are plenty of other scenarios, but I'm saying it
          | doesn't feel likely, by my gut.
        
       | aliqot wrote:
       | Interesting they both say "Official Sha3" and "by its designers",
       | which as I remember it isn't that accurate. Keccak was chosen and
       | then NIST added what is affectionately known as the 'mystery
       | padding' before certification. What we know as official is not
       | the way the designers submitted the proposal.
       | 
       | This isn't an attempt at a scary accusation, but as a pedant,
       | this got me.
       | 
       | For those wondering, here is an explanation by a commenter:
       | The padding change is the only difference, this allows future
       | tree hashing modes as well as the current SHAKE outputs to
       | generate different digests given the same security parameters and
       | message inputs. Up to 4 additional bits are added, which keeps
       | the full padding inside a byte boundary, making implementations
       | with octet only input able to switch to SHA-3 from Keccak with
       | change to only a single line of code.
       | 
       | https://crypto.stackexchange.com/questions/10645/are-nists-c...
       | 
       | https://cdt.org/insights/what-the-heck-is-going-on-with-nist...
       | 
        | Keccak team's response:
       | https://keccak.team/2013/yes_this_is_keccak.html
        
         | baby wrote:
         | I'm not sure what's your point, the XKCP is the official SHA-3
         | implementation.
        
           | aliqot wrote:
           | > I'm not sure what's your point
           | 
           | You're just a baby. There's time.
        
         | encryptluks2 wrote:
        
           | tptacek wrote:
           | You're responding to a post that links to the Keccak
           | designers saying that the padding change isn't nefarious. You
           | have to be able to do better than "NIST bad" in comments on
           | threads like these.
        
             | aliqot wrote:
             | From Daniel J. Bernstein:
             | 
             | 2022.08.05: NSA, NIST, and post-quantum cryptography:
             | Announcing my second lawsuit against the U.S. government. :
             | https://blog.cr.yp.to/20220805-nsa.html
             | 
             | https://twitter.com/hashbreaker/status/1555625577989541888
        
               | tptacek wrote:
               | ... is a FOIA lawsuit about the recent PQ contest they
               | refereed that has literally nothing to do with SHA3.
        
             | jbirer wrote:
             | Appeal to Authority is not a very good argument either.
        
               | tptacek wrote:
               | It is in this case. Absent any argument other than "I
               | don't trust NIST", the designers of Keccak can refute
               | your comment by simply saying "no, this is fine", and
               | that's what they did.
        
               | kelnos wrote:
               | That's a bit of a stretch. I think the person who
               | initially designed something signing off on a
               | modification of that something is pretty compelling.
        
         | nullc wrote:
         | Who actually wrote the faulty code? I had the impression that
         | the padding change was written by the keccak team at the
         | request of NIST.
        
           | tptacek wrote:
           | The Keccak team wrote XKCP, for whatever that's worth.
        
           | aliqot wrote:
           | Keccak for xkcp and NIST for the padding, as far as I can
           | remember. As I recall the padding was added first, and the
           | explanation came later. There was great distrust at first
           | during this time as a result, hence the name 'mystery
           | padding' we used.
        
         | [deleted]
        
           | [deleted]
        
       | ouid wrote:
        | What does generating preimages mean here? The preimage is the
        | set of all messages that hash to some set of hashes. Is the
        | author saying that the hash function is reversible in
        | reasonable time?
        
         | SAI_Peregrinus wrote:
         | It's a buffer overflow. They can get a vulnerable
         | implementation to execute arbitrary code, so they can get it to
         | return any value they want. The actual SHA-3 algorithm wouldn't
         | return the same value as the vulnerable implementation, of
         | course.
        
           | enedil wrote:
           | I didn't see anybody mention arbitrary code execution
        
             | NicolaiS wrote:
              | It's in the post:
              | 
              |     a specially constructed file can result in arbitrary
              |     code execution
        
       | arisAlexis wrote:
       | Can someone ELI5 the severity of this over the whole internet?
       | What breaks/what not
        
         | flatiron wrote:
         | If you don't use that library then nothing. If you do and you
         | hash my crafted binary I probably could crash it.
        
         | capableweb wrote:
         | Most people limit the amount of input that gets passed to
         | whatever is doing the hashing (SHA-3 in this case) to some
         | amount of bytes that are way less than this issue requires.
         | 
         | Most web servers default to a 50MB limit or something like
         | that, I think, not many things accept by default the size
         | required for this exploit. Most cases would involve services
         | that accept large untrusted files from users.
        
         | csande17 wrote:
          | The full advisory is linked at
          | https://github.com/XKCP/XKCP/security/advisories/GHSA-6w4m-2....
          | In order for an application to be vulnerable, it must:
         | 
         | 1. Break the file into multiple chunks and pass them to SHA-3
         | individually.
         | 
         | 2. Make one of those chunks larger than 4 GB in size. (This
         | requires using 4 GB of memory.)
         | 
         | This is kind of an unlikely thing for an application to do. If
         | you're breaking the file into chunks, usually you'd use smaller
         | chunks. (And if your server is limited to less than 4 GB of
         | RAM, you might not be able to allocate enough memory to trigger
         | the vulnerability in any case.) But it could be exploitable in
         | some really weird or poorly written application somewhere.
        
           | fulafel wrote:
           | Nitpick: it requires using 4 GB of address space, not
           | necessarily physical memory - the app might be eg memory
           | mapping the file.
           | 
            | (mmap() can be reasonably used for streaming use cases
            | like this provided you use madvise() hints on whether you
            | want the data kept resident after use; non-Unix platforms
            | probably have similar APIs)
        
             | csande17 wrote:
             | Right, mmap() lets you get a big buffer without actually
             | using that much physical memory. Good catch!
             | 
             | I still think it's unlikely for an app to be vulnerable to
             | this but having less than 4 GB of RAM isn't an absolute
             | defense. Having a 32-bit address space probably is, though,
             | if you're running on old or embedded hardware.
        
         | Ayesh wrote:
          | SHA3 is used relatively infrequently compared to SHA-1/2
          | and MD5. SHA3 also has a few variants, and I don't know for
          | sure if this vulnerability is present in all of them.
          | 
          | The vulnerability is only present in the reference
          | implementation, so it's unlikely that other implementations
          | (such as Rust or Go) are vulnerable.
          | 
          | I tested with the latest released and master branches of
          | PHP, both of which segfaulted for the code samples mentioned
          | in the article. The article says Python is also vulnerable.
          | This is because both languages apparently use the reference
          | implementation.
         | 
         | Ethereum uses SHA3 quite extensively, but I doubt it's
         | vulnerable post migration from PoW, let alone it's unlikely
         | that they use the vulnerable implementation/variant.
        
       | throwaway81523 wrote:
       | Didn't they find a vulnerability like this in the official MD5
       | implementation when they tried to port it to SPARK/Ada and the
       | proofs didn't work? I wasn't able to just find it with web
       | search, but here's a release about something similar happening
       | with another one of the SHA3 candidates (Skein), before Keccak
       | was chosen:
       | 
       | https://www.adacore.com/press/spark-skein
       | 
       | See also: https://www.adacore.com/papers/sparkskein
        
         | [deleted]
        
       | hasa wrote:
        | It may be a dumb question, but is there any realistic use case
        | for using this vulnerability to reveal SHA-3 hashed secrets?
        | Or is it just that an attacker can crash systems with suitable
        | input?
        
         | 6LLvveMx2koXfwn wrote:
         | From the article:
         | 
         | I've shown how this vulnerability in XKCP can be used to
         | violate the cryptographic properties of the hash function to
         | create preimages, second preimages, and collisions. Moreover,
         | I've also shown how a specially constructed file can result in
         | arbitrary code execution, and the vulnerability can also impact
         | signature verification algorithms such as Ed448 that require
         | the use of SHA-3. The details of these attacks will be made
         | public at a later date.
        
       | computerfriend wrote:
       | LibreSSL-based implementations seem unaffected. I can calculate
       | that hash using hashlib without a segfault.
        
       | bvrmn wrote:
       | It looks pretty difficult to exploit. Untrusted 4GB input with
       | non-chunked update.
        
         | nequo wrote:
         | The purpose of running SHA is to determine whether the input
         | can be trusted though. But it is true that relatively few files
         | circulate at or above that size.
        
           | thayne wrote:
           | And even if they are, you probably aren't going to do a
           | digest on the whole thing all at once.
        
             | kevincox wrote:
              | IDK, I see a lot of "non-production" code that likes to
              | read whole files into an in-memory buffer rather than
              | bothering with streaming. With modern memory sizes it is
              | completely possible that you could read a 4GiB file into
              | memory without having issues.
        
         | mike_d wrote:
         | 4+ GB inputs are often fed to hashing functions. Verifying the
         | integrity of file downloads before running them for example.
        
           | hot_gril wrote:
           | But that'll probably run in chunks. Not gonna load 4GB into
           | memory.
           | 
           | Edit: Turns out there was another comment like this, and a
           | response was that the file could be mmap'd.
        
             | Cthulhu_ wrote:
             | You're optimistic, which is commendable, but unfortunately
             | a naive implementation would probably just read it all. I
             | probably would, because the happy path doesn't deal with 4
             | GB inputs.
        
               | hot_gril wrote:
               | I guess if your test machine always has a lot of
               | contiguous* free memory or you never test it with a large
               | enough file. Otherwise, it's impossible to ignore. One
               | large download, and it wouldn't just be slow, it'd not
               | work at all.
               | 
               | * Which is usually much harder than just free mem. I
               | recall macOS always being able to give a huge chunk of
               | memory, I guess by rearranging on the fly, but it's very
               | slow if you ask for a lot. And Linux just says no.
        
               | someweirdperson wrote:
               | > And Linux just says no.
               | 
               | Linux might say yes, even if the memory isn't available.
               | See notes of malloc:
               | 
               | "This means that when malloc() returns non-NULL there is
               | no guarantee that the memory really is available." [0]
               | 
               | [0] https://man7.org/linux/man-pages/man3/malloc.3.html
        
               | kevingadd wrote:
                | It doesn't need to be contiguous on any modern
                | operating system I'm aware of. You just need 4gb of
                | free contiguous _address space_, which virtually
                | every 64-bit application will have.
        
               | hot_gril wrote:
               | Oh, then I ought to try my experiment again sometime.
               | Last time was 8 years ago on Ubuntu Server, and maybe I'm
               | missing a detail, like unknowingly running a 32-bit OS.
               | But yeah the behavior you describe is more what I'd
               | expect.
        
               | YetAnotherNick wrote:
                | This won't trigger the overflow. It will only be
                | triggered if the file is broken into chunks.
        
           | YetAnotherNick wrote:
           | The vulnerability only triggers when you break the file into
           | chunks and make one chunk larger than 4GB.
        
             | magicalhippo wrote:
              | Could easily happen if you compute the combined hash of
              | a set of files, with one chunk per file (mmap'ed, for
              | example).
        
       | flatiron wrote:
        | So if we ported this library to Rust, SHA-3 seems peachy? No
        | doubt an interesting port, but it now seems inevitable.
        
         | wongarsu wrote:
         | Rust code already exclusively uses rust ports of the library.
         | There are a couple competing ones [1][2], but I couldn't even
         | find rust bindings for the official C library. The ecosystem
         | has a bit of a fetish for pure rust dependencies where
         | possible, and this vulnerability seems to vindicate that
         | stance.
         | 
         | I don't think any of those are published as libraries usable by
         | other languages, but doing that would be less than a hundred
         | lines for defining a C interface.
         | 
         | 1: https://lib.rs/crates/sha3
         | 
         | 2: https://lib.rs/crates/tiny-keccak
        
           | oconnor663 wrote:
           | I maintain some Rust bindings to the official implementation
           | here: https://crates.io/crates/kangarootwelve_xkcp. I've
            | published an update with today's patch
            | (https://github.com/XKCP/K12/commit/27a84ecb811200990add07a04...).
            | That said, I
           | don't know whether those overflows were exploitable with K12.
        
         | tialaramex wrote:
         | There are various crates with pure Rust implementations of SHA3
         | or of the Keccak functions, I would expect that they just don't
         | have the bug, although it's possible they instead panic under
         | the circumstances invoked in this exploit.
         | 
         | (safe) Rust writes actual bounds checks, so if you accidentally
         | overflow a buffer in some case you never tested that compiles,
         | it would just panic if the case you got wrong occurs in real
         | life.
         | 
         | A more specialised language like WUFFS doesn't write bounds
         | checks, it just constrains all the buffer access variables, so
         | when you write code that could overflow that doesn't compile,
         | preventing this problem.
         | 
         | There is a price for this, you can have code which you know,
         | intellectually, never hits the case where say k = 5, but maybe
         | the proof would be sixty pages of difficult maths, WUFFS just
         | won't compile that code until you add handling for k = 5, too
         | bad, safe Rust insists on behaving as though k _might_ be 5
         | (e.g. inserting runtime checks), unsafe Rust would allow you to
            | YOLO, with a lot of hoop jumping, C++ doesn't care. Of course
         | if your sixty page proof is _wrong_ then these outcomes feel
         | very different...
        
           | green_on_black wrote:
           | It seems that some memory safety bug in C/C++ makes HN
           | frontpage every other day.
        
         | tinglymintyfrsh wrote:
         | Not the same, but there are multiple implementations. Here's
         | the top one with 13 M downloads. https://crates.io/crates/sha3
         | 
         | If someone used ffi, linked to it, or attempted to reproduce
         | the behavior verbatim in unsafe code, obviously it would have
         | the same problems as the native code.
        
         | drexlspivey wrote:
         | what do you mean? the sha3 crate is 8 years old
        
       | brundolf wrote:
       | > The vulnerable code was released in January 2011, so it took
       | well over a decade for this vulnerability to be found
       | 
       | Ouch
        
         | galangalalgol wrote:
         | Between this and "beacown" I came to a realization. I work on
          | codebases worth way less than these and we do lots of different
          | static analyzers, two compilers, lots of unit tests with
         | coverage checks and asan/ubsan. The linux kernel and sha are
         | worth tons of money. Thus, someone wrote all the pipelines and
         | unit tests for that software too. The problem is, the people
          | who paid for all that work aren't the people that rely on this
          | software's security. It's paid for by people that rely on its
          | insecurity. So we don't know about this stuff until they or the
          | people they sold it to use these exploits.
         | 
         | Edit: in case the solution I propose isn't obvious,
          | organizations like Red Hat and Google need to pay for these
         | pipelines and unit tests. Getting good unit test coverage is
         | expensive and ubsan and asan don't work without that coverage.
         | So since no one has mentioned it yet, you could rewrite this
         | stuff in rust as a poor man's substitute. It will catch some of
          | the same things, but ultimately there is no substitute for test
         | coverage with sanitizers. Well, maybe formal analysis?
        
           | lmm wrote:
           | > So since no one has mentioned it yet, you could rewrite
           | this stuff in rust as a poor man's substitute. It will catch
            | some of the same things, but ultimately there is no
           | substitute for test coverage with sanitizers.
           | 
           | That's backwards. You'll catch more cases with a Rust-style
           | type system that naturally checks everything, than with
           | sanitisers that can only check the paths that get executed in
           | tests.
        
             | galangalalgol wrote:
             | Rust isn't perfect. UB is a bug in rust, but it
             | occasionally has bugs. Ideally you'd do rust _and_ asan
             | with good unit tests. And yes, if you only picked one, it
             | should be rust, but don 't just pick one. And just because
             | you are using rust is no excuse to skip static analysis
             | like prusti, coverage with gcov, llvm, or tarpaulin, and
             | certainly not unit tests.
        
               | [deleted]
        
               | lmm wrote:
               | > And just because you are using rust is no excuse to
               | skip static analysis like prusti, coverage with gcov,
               | llvm, or tarpaulin, and certainly not unit tests.
               | 
               | It absolutely is, and this kind of absolutism is what
               | holds back the adoption of things like rust.
               | 
               | In most real world software development, the target
               | defect rate is not zero. The value proposition of
               | something like rust for most businesses isn't that it
               | lets you lower your defect rate; it's that it lets you
               | maintain your existing (acceptable) defect rate at a much
               | lower cost, by letting you drop your static analysis and
               | most of your tests (reducing maintenance burdens) and
               | still have a better bottom-line defect rate.
        
               | consp wrote:
               | So your assumption is that most to all defects are
               | language related and not programmer error? I will bet the
               | farm your defect rate will remain the same without any
               | actual validation of what people write. Plenty of errors
               | due to language quirks but lots of times I see missing
               | statements, fixed values which should be variable, input
               | checking failures due to unknown input, incorrect
                | assumptions and lots of non-language-related bugs which
               | will not go away.
        
               | green_on_black wrote:
               | You're not accounting for the fact that Rust ships with a
               | mandatory static analyzer called rustc. Btw Google
               | recently gave a talk about switching from Java to Kotlin,
               | which significantly reduced bugs, despite less mature
               | tooling. So...
        
               | iopq wrote:
               | Rust does validation of what people write, it won't let
               | you make a dangling pointer, whereas C will happily
               | accept it if you hide it from the warning heuristics
        
               | lmm wrote:
               | > So your assumption is that most to all defects are
               | language related
               | 
               | What does that even mean?
               | 
               | > I will bet the farm your defect rate will remain the
               | same without any actual validation of what people write.
               | 
               | In what sense is a typechecker (or the rust borrow
               | checker) not "actual validation", but a static analyser
               | is?
               | 
               | > Plenty of errors due to language quirks but lots of
               | times I see missing statements, fixed values which should
               | be variable, input checking failures due to unknown
                | input, incorrect assumptions and lots of non-language-related
                | bugs
               | 
               | Most of those sound like type errors to me. Errors where
               | the program isn't doing what the programmer thought it
               | should can generally be avoided by using more precise
               | types. (The more insidious case is where the program is
               | doing exactly what the programmer thought it should, but
               | they've misunderstood the specification or not thought
               | through the implications - but no kind of testing can
               | catch that kind of bug).
        
               | galangalalgol wrote:
               | I'll give you the static analyzer is much less important
                | with Rust, but that is the easy part anyway. I stand firm
               | that tests are always important, and you don't know how
               | well you tested without coverage. I'm not even sure I'd
               | say you need less coverage with rust, not all errors are
               | resource safety related.
        
               | lmm wrote:
               | Coverage numbers are misleading (to the point that I find
               | they did more harm than good). There are definitely cases
               | where you can't (or aren't clever enough to) express what
               | makes your business logic valid in the type system and
               | need to write a test, but IME they're the exception
               | rather than the rule (both because you generally can
               | encode things in types, and because the majority of code
               | ends up being "plumbing" with no real business logic
               | anyway); I like to follow something like
                | https://spin.atomicobject.com/2014/12/09/typed-language-tdd-... .
        
               | tinglymintyfrsh wrote:
                | There is unlikely ever to be a perfect solution, but we can
                | get better by constraining behavior that breaks things. Throw
               | all of the compile-time, profiling, checked builds, and
               | binary-level tools at projects for defense-in-depth and
               | checklists, no matter the platform or the application.
               | Fuzzing, valgrind, gperftools, dtrace, {[amt],ub}san,
               | etc. and formal methods like seL4 if you can afford the
               | investment. :>
        
               | galangalalgol wrote:
               | Does valgrind add anything to asan given the same unit
               | tests? I haven't used in years because it is so slow.
        
               | saagarjha wrote:
               | Address sanitizer is not precise and can miss certain
               | classes of bugs that Valgrind would catch.
        
         | vermilingua wrote:
         | Took well over a decade for this vulnerability to be
         | _disclosed_.
        
           | [deleted]
        
       | [deleted]
        
       | ajsfoux234 wrote:
       | The vulnerability impacts 'the "official" SHA-3 implementation'.
       | How widely used is it for SHA-3 hashing compared to something
       | like OpenSSL?
        
         | duskwuff wrote:
         | SHA-3 is rarely used in general. Most applications still use
         | SHA-2 hashes (such as SHA256).
        
           | halostatue wrote:
           | I'd love to use SHA-3, but I'm waiting until I see SHA-3 in
           | standard libraries for my most-used languages.
        
           | PeterisP wrote:
           | An obvious issue is that many protocols do two-sided
           | negotiations of crypto algorithms during a handshake - so one
           | of "most applications" which uses SHA-2 hashes by default
           | could be vulnerable to a network connection saying "I don't
           | support SHA-2, let's use SHA-3 pls?" and then exploiting the
           | buffer overflow.
           | 
           | In a similar manner, apps that verify signatures/certificates
           | often let the certificate specify which algorithm is used, so
           | again effectively an attacker may be able to force the app to
           | use SHA3 as long as it uses a library that has some support
           | for it.
        
             | londons_explore wrote:
             | > two-sided negotiations of crypto algorithms
             | 
             | This is a common problem, and massively increases attack
             | surface, since if a vulnerability can be found in _any_
             | algorithm, it can be exploited. Downgrade attacks are a
              | variant of the same.
             | 
             | Let me present an alternative:
             | 
             | Each protocol has _one specific_ encryption algorithm and
              | set of parameters. However, that set isn't fixed, but
             | defined based on the current date and time.
             | 
             | For example, HTTPS could be defined as "SHA1 from 1996 till
             | 2006", "SHA256 from 2005 till 2018", "SHA3 from 2017 till
             | 2032", etc.
             | 
             | The dates can be written 10+ years into the future, so that
             | software deployed today is still usable that many years
             | into the future. As soon as the last date period expires,
             | the software is not usable until it has been upgraded
             | (unless both ends of the communication link choose to fake
             | the date).
             | 
             | There is precedent for software 'expiring' - that happens
             | whenever the root certificate of a trust store reaches an
             | expiry date.
             | 
             | The downside is sometimes an algorithm might be broken
             | before its date comes, and other times an algorithm may
             | still be secure at that date. But that still seems better
             | than our current mess of downgrade attacks.
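
[The date-based scheme proposed above could be sketched roughly as follows. The windows and algorithm names are hypothetical, taken from the comment's example; overlapping windows give deployed software a migration period:]

```python
from datetime import date

# Hypothetical schedule: (first_valid, last_valid, algorithm name).
SCHEDULE = [
    (date(1996, 1, 1), date(2006, 12, 31), "sha1"),
    (date(2005, 1, 1), date(2018, 12, 31), "sha256"),
    (date(2017, 1, 1), date(2032, 12, 31), "sha3_256"),
]

def allowed_algorithms(today):
    """Return every algorithm whose validity window covers `today`."""
    return [name for start, end, name in SCHEDULE if start <= today <= end]

print(allowed_algorithms(date(2017, 6, 1)))  # ['sha256', 'sha3_256']
print(allowed_algorithms(date(2033, 1, 1)))  # [] -> software has 'expired'
```

Note that inside an overlap window two algorithms are still valid, which is where the downgrade-attack surface quietly reappears.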
        
               | GTP wrote:
               | As you say, it could be that an algorithm gets broken
               | before its "expiry date" or remains secure well past that
               | (look e.g. at AES). There is a similar but better
               | alternative, usually called "opnionated cryptography". A
               | protocol specifies a single cryptographic primitive for
               | each type needed (hash function, block cipher etc) so no
               | negotiation is needed. If one of the primitives gets
               | broken, a new version of the protocol is released that
               | uses a different primitive.
        
               | remram wrote:
               | And then you still have a roll-out period during which
               | those two protocols are available, so the two parties
               | still need to negotiate which version they want. It's not
               | clear to me what the advantage over the current situation
               | is?
        
               | GTP wrote:
               | >the two parties still need to negotiate which version
               | they want
               | 
               | Not necessarily. If the client contacts the server using
               | version X, the server will reply with version X.
        
               | remram wrote:
               | What if the server doesn't support version X? Then the
               | client will try again with version X-1. This is a
               | negotiation, it is just an inefficient one (client might
               | have to try X-1, X-2, X-3 in turn if more versions are
               | still co-existing; and contrarily, if client doesn't
               | support any version the server does, you will not get a
               | detailed error about the version mismatch, because they
               | are entirely different protocols).
        
               | bhawks wrote:
               | That's sounds a lot like protocol negotiation, just
               | implicit vs explicit and limited to 2 algos.
        
               | kevincox wrote:
               | It seems to me that it may be better in many cases to
               | support exactly two configurations (or some other small
               | fixed number). This way there is no need to wait until a
               | new version is released and dropping the vulnerable one
               | won't cause any downtime. Ideally both codepaths would be
               | regularly exercised (maybe clients should pick randomly
               | when both choices are available). Then as soon as a
               | vulnerability is discovered (maybe even before it is
               | publicly revealed) all clients and servers could
               | immediately drop support for the one that was discovered
               | vulnerable.
               | 
               | Then the release of a new configuration can be done
               | slowly with reasonable caution to get back to the desired
               | redundancy level.
        
           | [deleted]
        
           | tinglymintyfrsh wrote:
           | Rare != bad && rare != unimportant.
           | 
           | The point of NIST standardizing on SHA-3 is to gradually
           | replace SHA-2 due to the rise of computing power and the
           | likelihood it will become as weak as SHA-1 is now in the near
           | future. Unfortunately, like American credit cards vs.
           | European chip & pin, it's going to take forever to adopt.
        
             | tialaramex wrote:
             | "near future"? I doubt that. Eventually, but likely not in
             | my lifetime.
             | 
             | https://www.imperialviolet.org/2017/05/31/skipsha3.html
        
             | tptacek wrote:
             | No. The "rise in computing power" doesn't jeopardize SHA2.
             | There are important design differences between SHA1 and
             | SHA2 (here's where in my younger days I'd pretend that I
             | could rattle off the implications of nonlinear message
             | expansion off the top of my head). SHA2 is secure; don't
                | take my word for it, though; you can find one of the Blake2
             | designers saying SHA2 is unlikely ever to be broken, or
             | Marc Stevens on a Twitter thread talking briefly about why
             | his attacks on SHA1 don't apply at all to SHA2.
        
               | dlubarov wrote:
               | I agree, SHA-2 is secure as far as we know. But since
               | it's based on Merkle-Damgard, it permits length-extension
               | attacks - i.e. given H(x), one can derive H(pad(x) || y)
               | without knowing x.
               | 
                | So we need to be careful not to use it in settings where
               | that would be problematic. Or we can use it with
               | workarounds, like double hashing (SHA-256d) or truncating
               | its output.
               | 
               | SHA-3 is sponge based, so its output is always truncated,
               | preventing length-extension attacks. So I think SHA-3 is
               | a better default, though it's fine to use SHA-2 if you
               | know what you're doing.
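
[The double-hashing workaround mentioned above is a one-liner with Python's hashlib. A sketch; "SHA-256d" is the H(H(x)) construction, used e.g. by Bitcoin:]

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """SHA-256d: hash the hash. The outer hash hides the inner
    Merkle-Damgard chaining state, so an attacker who only knows the
    digest cannot extend the message."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

msg = b"amount=100&to=alice"
print(sha256d(msg).hex())
```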
        
               | duskwuff wrote:
               | Truncated SHA512 hashes, such as SHA512/256, defeat
               | length extension attacks by omitting part of the hash
               | state from the output. They're also significantly faster
               | than classic SHA256 for large inputs.
        
               | baby wrote:
               | Blake2/3 also doesn't suffer from length extension
               | attacks, but SHA-256 is what everyone uses unfortunately.
        
               | dcow wrote:
               | We use Blake 3. So far so good...
        
           | baby wrote:
           | I've mostly seen SHA-3 used in more modern applications, it's
           | used a lot in cryptocurrencies (with BLAKE2)
        
         | derefr wrote:
         | The version in the Golang stdlib defaults to a pure-go
         | implementation... unless you're compiling for amd64, in which
         | case you get an assembler variant apparently directly derived
          | from the XKCP package
          | (https://github.com/golang/crypto/blob/master/sha3/keccakf_am...).
         | 
         | Slightly concerning news for the (mostly-Golang-based) Ethereum
         | ecosystem, which relies on SHA3-256 for pretty much
         | everything...
        
           | tptacek wrote:
            | Easy enough to test, right?
            | 
            |     func main() {
            |         h := sha3.New224()
            |         buf := make([]byte, 4294967295)
            |         h.Write(buf)
            |         sum := h.Sum(nil)
            |         fmt.Printf("%x\n", sum)
            |     }
           | 
           | Doesn't crash on my amd64 dev machine.
           | 
           |  _Later_
           | 
           | I could have just looked at the code, too: the assembly
           | you've linked to is just the Keccak permutation, not the
           | entire Go hash; the buffer management is done in Go, not in
           | assembly.
        
             | derefr wrote:
             | This is one of those cases where I'm actually more
              | concerned that it _doesn't_ crash. It's like seeing
             | clearly-syntactically-invalid code that somehow compiles
             | anyway -- you wonder what semantics the compiler could have
             | possibly ascribed to it.
             | 
             | Presumably this isn't not-crashing just because the
             | developers of the Golang stdlib somehow found+fixed this
             | bug back in 2015 when this assembler file was baked. The
              | error is _in_ that assembler code, I'm sure. It's
             | presumably getting masked by something. (Something that may
             | be benign, or might be subtly corrupting the runtime.)
             | 
             | For a benign case: maybe the Golang runtime isn't
             | allocating you precisely as many bytes as you're asking
             | for, but rather a little bit more? Perhaps rounding up to a
             | multiple of a page for more-than-page-sized allocations?
             | 
             | Not having access to an amd64 machine at the moment, I'll
             | have to ask you: does increasing the size by one, as in the
             | article, cause an infinite loop?
        
               | tptacek wrote:
               | No, it causes the string
               | c5bcc3bc73b5ef45e91d2d7c70b64f196fac08eee4e4acf6e6571ebe
               | 
               | No matter what Go is allocating under the hood, it has to
               | be feeding exactly that many bytes to the SHA3 algorithm.
        
               | derefr wrote:
               | > it has to be feeding exactly that many bytes to the
               | SHA3 algorithm
               | 
               | Well, yes, but this is supposed to be a buffer overflow
               | -- the algorithm itself is reading and/or writing past
               | the end of the buffer it's been handed. My hypothesis was
               | that Golang is allocating in such a way that
               | reading/writing a single byte past the received slice
               | bounds won't result in a protection fault in the way
               | you'd expect if the allocation were exact.
        
               | halffullbrain wrote:
               | The buffer overflow is in the C version of the algorithm
               | (and likely related to loop condition checks idiomatic to
               | C's for-loops). The Go version is a fresh implementation,
               | not a wrapping of the C version. I'm no Go programmer,
               | but if I'm not mistaken, the Go implementation just eats
               | little slices of the input buffer until no more buffer is
               | left, leaving all the overflow-danger to the Go array
               | implementation:
               | 
                | https://github.com/golang/crypto/blob/642fcc37f5043eadb2509c...
               | 
               | Possibly not as fast as C, but easier to reason about,
               | I'd say.
        
               | tptacek wrote:
               | Just for what it's worth, we're talking about the amd64
               | assembly version of the SHA3 code in the x/crypto/sha3
               | library; I didn't look carefully at how it's called, so
               | if it's used incrementally the way this Go code shows,
               | then yeah, it's fine regardless.
               | 
               |  _A second later_
               | 
               | Oh, wait, yeah, this is just the Keccak permutation in
               | assembly, not the entire hash. That was dumb of me. Yeah,
               | this code looks ok?
        
               | halffullbrain wrote:
                | Indeed, the two assembly variants are just of the 1600-bit
                | state bit fiddling (keccakF1600).
               | 
               | The outer loop of the Write method (function?) is the
               | same for all architectures.
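
[The split described above, a fixed-size permutation core plus an outer Write loop that feeds it rate-sized slices, can be sketched in a few lines. Simplified Python with a hypothetical stand-in for the permutation (real Keccak permutes a 1600-bit state and adds padding); the point is that all the buffer arithmetic lives in the outer loop:]

```python
import hashlib

RATE = 144  # sponge "rate" in bytes for SHA3-224 (1152 bits)

def core(state: bytes, block: bytes) -> bytes:
    # Stand-in for the keccakF1600 permutation: any fixed-size mixing
    # function illustrates the outer loop's job (hypothetical).
    return hashlib.sha3_224(state + block).digest()

def write(state: bytes, buffered: bytes, data: bytes):
    """Outer loop of a sponge-style Write: slice the input into
    rate-sized blocks and hand each to the permutation core, so the
    core never sees more than RATE bytes and bounds checking is left
    to the slice machinery."""
    buffered += data
    while len(buffered) >= RATE:
        block, buffered = buffered[:RATE], buffered[RATE:]
        state = core(state, block)
    return state, buffered  # leftover bytes wait for the next call

# Feeding the data in one call or in arbitrary chunks ends in the
# same state, which is the property the incremental API must keep.
data = bytes(range(256)) * 10
s1, b1 = write(b"", b"", data)
s2, b2 = write(*write(b"", b"", data[:5]), data[5:])
assert (s1, b1) == (s2, b2)
```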
        
             | tinglymintyfrsh wrote:
             | Off-by-one error. Didn't do the 1 byte update.
        
               | tptacek wrote:
               | See downthread, but I don't think it matters. I didn't
               | look at the code before writing the test case; the code
               | here is just the SHA3 transform, not the buffer
               | management code around it, which AFAIK is just Go code. I
               | don't see an opportunity for the overflow here?
        
             | retrocat wrote:
             | Are you sure this is the same code that triggers the
             | vulnerability? Is the double `update` in the post itself
             | unimportant (i.e., first updating with a 1-byte string,
             | then the 4,294,967,295-byte string)? As I don't think this
             | does that.
        
               | tptacek wrote:
               | Let's try! Gimme a sec. And: to be clear: I make _no
               | representation_ that this code proves anything, and
               | people should of course take a hard look at the SHA3
                | amd64 assembly code. I'm just having a message board
               | discussion.
               | 
               | One sec and I'll get you an answer to your thing about
               | the extra write.
        
               | tptacek wrote:
               | With the extra write:
               | c5bcc3bc73b5ef45e91d2d7c70b64f196fac08eee4e4acf6e6571ebe
               | 
               | With the extra write and an extra byte on the big write
               | (to try to trigger the loop condition):
               | ec66be1ebccf055f839fccf2d12e641dcbbda4f5c71a3bdee6509495
        
           | bugfix-66 wrote:
           | All the Go code is safe. No bugs.
        
             | [deleted]
        
           | bugfix-66 wrote:
           | I read the code carefully, both yesterday and today, and it's
           | correct.
           | 
           | Read it yourself here:
           | 
            | https://cs.opensource.google/go/x/crypto/+/refs/tags/v0.1.0:...
           | 
           | There's obviously no problem.
        
       | cipherboy wrote:
       | Does this impact most distros? I'd imagine Python and PHP's
       | native crypto bindings would be replaced with OpenSSL which
       | should have assembly variants of SHA-3.
        
         | ok_dad wrote:
         | I use pyenv and whatever it installed for 3.10.4 months ago
         | seems to be fine with the example code snippet run in the
         | console. It took a few seconds, but didn't crash.
        
       | metadat wrote:
       | If you're familiar with SHA-256 and this is your first encounter
       | with SHA-3:
       | 
       | The main differences between the older SHA-256 of the SHA-2
       | family of FIPS 180, and the newer SHA3-256 of the SHA-3 family of
       | FIPS 202, are:
       | 
       | * Resistance to length extension attacks.
       | 
       | * Performance. The SHA-2 functions--particularly SHA-512,
       | SHA-512/224, and SHA-512/256--generally have higher performance
       | than the SHA-3 functions. Partly this was out of paranoia and
       | political reasons in the SHA-3 design process.
       | 
       | Further reading:
       | https://crypto.stackexchange.com/questions/68307/what-is-the...
        
         | franky47 wrote:
         | Note that truncated SHA-2 (SHA-224, SHA-384, SHA-512/224,
          | SHA-512/256) are not susceptible to length-extension attacks
         | [1].
         | 
         | With the added benefit of better performance of SHA-512 (on 64
         | bit systems) [2], there's no good reason to use SHA-256 rather
         | than SHA-512/256 for new cryptographic designs.
         | 
         | [1] https://en.wikipedia.org/wiki/Length_extension_attack
         | 
         | [2]
         | https://crypto.stackexchange.com/questions/26336/sha-512-fas...
        
           | adrian_b wrote:
           | There is a good reason to use SHA-256 and not SHA-512.
           | 
           | Many modern CPUs, e.g. all AMD Zen, most Intel Atom, Intel
           | Ice Lake and newer, most 64-bit ARM, have hardware
           | implementations of SHA-256, which are much faster than
           | software computing SHA-512.
           | 
           | Only some more recent 64-bit ARM CPUs also have hardware for
           | SHA-512 and SHA-3.
           | 
           | Whenever the speed matters, SHA-256 is the best choice,
           | unless you choose different hash algorithms based on the
           | detected CPU.
           | 
           | On old 64-bit CPUs without SHA hardware, e.g. Intel Skylake,
           | SHA-512 is faster than SHA-256 (Intel introduced SHA in Atom
           | CPUs many years before also introducing it in Core CPUs, so
           | that Atom could compete in Geekbench scores with the ARM
           | CPUs, which already had SHA).
        
             | hot_gril wrote:
             | Allegedly, SHA-256 can often be faster than MD5
             | https://security.stackexchange.com/a/95697 Not to say the
             | algo is faster, it's not, but the CPU has specialized
             | hardware for SHA-256.
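
[This is easy to measure on your own machine with Python's hashlib, which goes through OpenSSL and uses SHA-NI where the CPU has it. A crude micro-benchmark sketch; the ordering of results is entirely CPU-dependent, so none is asserted here:]

```python
import hashlib
import time

def throughput(name: str, data: bytes, rounds: int = 20) -> float:
    """Rough throughput in MB/s for one hashlib algorithm."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return rounds * len(data) / elapsed / 1e6

data = b"\x00" * (8 * 1024 * 1024)
for name in ("md5", "sha1", "sha256", "sha512"):
    print(f"{name:8s} {throughput(name, data):8.0f} MB/s")
```

On CPUs with SHA extensions, sha256 will typically beat md5 here; on older cores the order flips.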
        
             | metadat wrote:
             | Interesting, in general I've noticed SHA-256 hashing is
             | relatively CPU intensive and slow.. is the 2024 Xeon CPU in
             | my server too old to include the hardware implementation?
        
               | metadat wrote:
               | *2014, yes it was a typo
        
               | adrian_b wrote:
               | I think that you might have mistyped the Xeon model
               | number.
               | 
               | Nevertheless, most existing Xeon CPUs are too old to have
               | SHA hardware.
               | 
               | In Intel server CPUs SHA was introduced many years later
               | than in AMD server CPUs or Intel desktop CPUs, i.e. only
               | in "the 3rd generation Xeon Scalable" based on the Ice
               | Lake Server cores, in Q2 of 2021 (Xeon model numbers
               | 83xx, 63xx and 53xx).
        
               | tedunangst wrote:
               | Is your software using SHA instructions?
        
               | metadat wrote:
               | I imagine the sha256sum program would use the best
               | available technique.
               | 
               | Reasonable assumption?
        
               | adrian_b wrote:
               | The last time I checked what sha256sum (from
               | coreutils) does was one year ago.
               | 
               | At that time, on x86 CPUs (I did not test on ARM),
               | neither sha256sum, sha224sum, nor sha1sum used the
               | hardware instructions (even when compiled from source
               | with the appropriate flags).
               | 
               | Because I have a Zen CPU (but that would also be true on
               | Atom CPUs or Core CPUs since Ice Lake), I have to use
               | "openssl dgst -r -sha256" instead of sha256sum and
               | "openssl dgst -r -sha1" instead of sha1sum.
               | 
               | The openssl implementation uses the hardware instructions
               | where available and in that case it is several times
               | faster.
               | 
               | (The "-r" option flag in "openssl dgst" selects an
               | output format closer to that of the sha???sum tools,
               | i.e. hash value followed by file name, but openssl
               | still outputs an extra character, which must be
               | accounted for in scripts.)
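For scripting around that difference: `openssl dgst -r` prints `<hash> *<file>` while coreutils prints `<hash>  <file>`, so the extra character is the `*` binary-mode marker. A minimal sketch of normalizing an openssl line to the coreutils format (the helper name is mine):

```python
def normalize_openssl_dgst_line(line):
    """Convert an "openssl dgst -r" output line ("<hash> *<file>")
    to coreutils checksum format ("<hash>  <file>")."""
    digest, name = line.rstrip("\n").split(" *", 1)
    return f"{digest}  {name}"

print(normalize_openssl_dgst_line(
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 *file.bin"))
```

GNU sha256sum -c accepts both spellings, so this matters mainly when diffing checksum lists produced by the two tools.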
        
               | metadat wrote:
               | I also see similarly slow performance when using the Go
               | built-in Sha256 hashing library.
               | 
               | Thanks for the info! Fortunately for my purposes of
               | integrity checking, slow is okay.
        
               | adrian_b wrote:
               | I also use them for file integrity checking, but I
               | frequently check large files, e.g. 50 GB files, and in
               | that case the difference in time until finish between
               | "sha256sum" and "openssl dgst -r -sha256" is extremely
               | noticeable on a Zen CPU.
               | 
               | On the other hand, on older Skylake/Kaby Lake/Coffee Lake
               | etc. CPUs, both programs have the same speed.
        
             | orangepurple wrote:
             | Makes me wonder if there is a microcode back door to these
             | hardware hashers such as one which steers the output after
             | the microprocessor is fed carefully crafted input to cause
             | it to run in a vulnerable state.
        
         | moonchild wrote:
         | My understanding was that sha-3 should be faster than sha-2, in
         | a general sense, but sha-2 has hardware acceleration. Is that
         | incorrect?
        
           | Vecr wrote:
           | I think SHA-3 is almost always slower in software. In
           | theory SHA-3 could be hardware accelerated, of course, but
           | on current AMD and Intel systems it isn't, whereas
           | SHA-2-256 is.
        
             | wahern wrote:
             | AArch64 supports accelerated SHA-3, available on production
             | systems since 2019 with the Apple A13, judging by
             | https://github.com/llvm/llvm-project/blob/c35ed40f4f1bd8afd7...
             | 
             | Power10 also supports accelerated SHA-3:
             | https://www.redbooks.ibm.com/redpapers/pdfs/redp5649.pdf
             | (p150)
             | 
             | Accelerated SHA-3 on x86_64 is probably an inevitability;
             | the question is more when than if.
        
               | adrian_b wrote:
               | Most Android smartphones support SHA3 starting with the
               | 2018 models (i.e. Cortex-A55, Cortex-A75 or newer).
        
               | gonzo wrote:
               | SHA-3 is available on QuickAssist.
               | 
               | https://csrc.nist.gov/projects/cryptographic-algorithm-valid...
        
               | adrian_b wrote:
               | For those not familiar with QuickAssist, it is a
               | cryptography and compression accelerator connected
               | through PCIe, which is available in some models of Intel
               | server CPUs or server chipsets and in a few models of
               | Intel server Ethernet NICs.
               | 
               | Moreover, the laptop variants of Alder Lake (H-series,
               | P-series and U-series) are said on the Intel Ark web
               | site to include a subset of QuickAssist, but I have not
               | been able to find any public Intel document explaining
               | which functions are supported by laptop Alder Lake
               | CPUs, or any report of someone testing these functions
               | on them.
        
         | baby wrote:
         | SHA-3 was also designed with hardware performance in mind,
         | IIRC, as in: if it becomes supported in hardware, it will be
         | much faster than hardware-accelerated SHA-2.
        
           | adrian_b wrote:
           | Some of the more recent 64-bit ARM CPUs have added hardware
           | support for SHA-3 (SHA3 is optional in architectures
           | Armv8.2-A and later, i.e. in Cortex-A55, Cortex-A75 and
           | later, i.e. starting with the smartphones from 2018).
        
         | a-dub wrote:
         | i thought sha-2 included the message length in the head padding
         | to prevent length extension attacks?
         | 
         | just read the link you provided: it's sha-224 and above.
         | 
         | i wonder if a similar issue exists in the sha-224+ padding
         | code.
        
           | tialaramex wrote:
           | I think you've badly misunderstood what's going on here.
           | SHA-2 includes SHA-224, SHA-256, SHA-384, SHA-512,
           | SHA-512/224, SHA-512/256. Six functions. Of these SHA-256 and
           | SHA-512 are subject to the length extension attack (and
           | SHA-224 only has 32 bits' worth of safety, which is clearly
           | inadequate). Since you probably only needed a 256-bit hash,
           | SHA-512/256 is a choice available today which produces a
           | 256-bit hash without the possibility of length extension.
           | 
           | Length extension works because the hash _is_ the entire
           | internal hash function state. This is not true for
           | SHA-224, SHA-384, SHA-512/224 or SHA-512/256, since for
           | these four functions not all of the internal state is
           | exposed.
           | 
           | No, the length is not included at the start of the data
           | to be hashed; that would mean committing to the entire
           | input before we hash it, which is a pretty unsatisfactory
           | API. The length is appended (with the padding).
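The state-exposure point can be illustrated with Python's hashlib: the SHA-256 digest is the full 256-bit chaining state, while the truncated variants publish only part of theirs (a sketch; sha512_256 availability depends on the OpenSSL build behind hashlib, hence the guard):

```python
import hashlib

# SHA-256 exposes all 256 bits of its internal state as the digest,
# which is what makes length extension possible given H(m) and len(m).
assert hashlib.sha256(b"secret").digest_size == 32   # 256 of 256 state bits

# SHA-384 runs the SHA-512 compression function (512-bit state) but
# publishes only 384 bits, so 128 state bits stay hidden.
assert hashlib.sha384(b"secret").digest_size == 48   # 384 of 512 state bits

# SHA-512/256: 512-bit state, 256-bit output -- no length extension.
if "sha512_256" in hashlib.algorithms_available:
    assert hashlib.new("sha512_256", b"secret").digest_size == 32
```

An attacker who knows a SHA-256 digest and the message length can reload that state and keep hashing; with the truncated variants the missing state bits would have to be guessed.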
        
             | a-dub wrote:
             | > I think you've badly misunderstood what's going on here.
             | 
             | that sounds right. been almost a decade since i've dug into
             | sha-2.
             | 
             | i do remember the lengths in sha-256 and sha-512 though and
             | i remember not being a huge fan of the chunking/padding
             | code.
             | 
             | come to think of it, there's another reason for the message
             | length in the padding. i forget what it is now.
        
       ___________________________________________________________________
       (page generated 2022-10-21 23:02 UTC)