[HN Gopher] GoFetch: New side-channel attack using data memory-d...
       ___________________________________________________________________
        
       GoFetch: New side-channel attack using data memory-dependent
       prefetchers
        
       Author : kingsleyopara
       Score  : 139 points
       Date   : 2024-03-21 14:51 UTC (8 hours ago)
        
 (HTM) web link (gofetch.fail)
 (TXT) w3m dump (gofetch.fail)
        
       | woadwarrior01 wrote:
       | Reminded me of the Augury attack[1] from 2022, which also
       | exploits the DMP prefetcher on Apple Silicon CPUs.
       | 
       | [1]: https://www.prefetchers.info
        
         | loeg wrote:
         | Yes, they specifically mention that in the article and FAQ.
        
         | Findecanor wrote:
          | BTW: Three of the authors of GoFetch were also behind Augury.
        
       | martinky24 wrote:
        | Why does every attack need its own branding, marketing page,
       | etc...? Genuine question.
        
         | xena wrote:
         | So people talk about it
        
         | sapiogram wrote:
         | Well, names are useful for the same reason people's names are
         | useful. The rest just kinda happens naturally, I think.
        
           | yborg wrote:
           | Yes, it saves time vs. starting a discussion on "that crypto
           | cache sidechannel attack that one team in China found".
        
           | martinky24 wrote:
           | Name makes enough sense. "Branding, marketing page, etc..."
           | was my question.
           | 
           | "Happens naturally" isn't really an answer.
        
             | ziddoap wrote:
             | Is your position that any write-up about an attack must be
             | plain text only, and must not use its own URL?
             | 
             | I truly cannot understand why this is brought up so often.
             | You aren't paying for it, it doesn't hurt you in any way,
             | it detracts nothing from the findings (in fact, it makes
             | the findings easier to discuss), etc. There is no downside
             | I can think of.
             | 
             | Can you share what the downsides of a picture of a puppy
             | and a $5 domain are? Sorry, "branding" and "marketing
             | page"?
             | 
              | Or at least, maybe you can share what you think would be a
              | preferable way?
        
         | fruktmix wrote:
          | It's science these days. They need funding, and one way to get
          | it is getting people to recognize the importance of their work.
        
         | modeless wrote:
         | Science isn't just about discovering information. Dissemination
         | is critical. Communicating ideas is just as important as
         | discovering them and promotion is part of effective
         | communication. It's natural and healthy for researchers to
         | promote their ideas.
        
         | FiloSottile wrote:
         | Names are critical to enable discussion.
         | 
         | The "marketing" page is where documentation is. Summaries that
          | don't require reading a whole academic paper are a good thing,
         | and they are the place where all the different links are
         | collected. Same reason software has READMEs.
         | 
         | Logos... are cute and take 10-60 minutes? If you spend months
          | on some research, you might as well take the satisfaction of
          | giving it a cute logo, why not?
        
         | saagarjha wrote:
          | Why do the comments on every such attack need a question
         | about why it has its own branding, marketing page, etc...?
         | Genuine question.
         | 
         | (Seriously, this comes up every time, just do a search for it
         | if you actually want to figure out why.)
        
       | jerf wrote:
       | As long as we're getting efficiency cores and such, maybe we need
        | some "crypto cores" added to modern architectures that make
       | promises specifically related to constant time algorithms like
       | this and promise not to prefetch, branch predict, etc. Sort of
       | like the Itanium, but confined to a "crypto processor". Given how
        | many features these things _wouldn't_ have, the cores themselves
        | wouldn't take much silicon, in principle.
       | 
       | This is the sort of thing that would metaphorically drive me to
       | drink if I were implementing crypto code. It's an uphill battle
       | at the best of times, but even if I finally get it all right,
       | there's dozens of processor features both current and future
       | ready to blow my code up at any time.
        
         | FiloSottile wrote:
         | Speaking as a cryptography implementer, yes, these drive us up
         | the wall.
         | 
         | However, crypto coprocessors would be a tremendously disruptive
         | solution: we'd need to build mountains of scaffolding to allow
         | switching to and off these cores, and to share memory with
         | them, etc.
         | 
         | Even more critically, you can't just move the RSA
         | multiplication to those cores and call it a day. The key is
         | probably parsed from somewhere, right? Does the parser need to
         | run on a crypto core? What if it comes over the network? And if
         | you even manage to protect all the keys, what if a CPU side
         | channel leaks the message you encrypted? Are you ok with it
         | just because it's not a key? The only reason we don't see these
         | attacks against non-crypto code is that finding targets is very
         | application specific, while in crypto libraries everyone can
         | agree leaking a key is bad.
         | 
         | No, processor designers "just" need to stop violating
         | assumptions, or at least talk to us before doing it.
        
           | bee_rider wrote:
           | I don't think the security community is also going to become
            | experts in chip design; these are two full skill sets that
           | are already very difficult to obtain.
           | 
           | We must stop running untrustworthy code on modern full-
           | performance chips.
           | 
           | The feedback loop that powers everything is: faster chips
           | allow better engineering and science, creating faster chips.
           | We're not inserting the security community into that loop and
           | slowing things down just so people can download random
           | programs onto their computers and run them at random. That's
           | just a stupid thing to do, there's no way to make it safe,
           | and there never will be.
           | 
            | I mean, we're talking about prefetching. If there were a way
            | to give RAM cache-like latencies, why wouldn't the hardware
            | folks have done it already?
        
             | FiloSottile wrote:
             | > download random programs onto their computers and run
             | them at random
             | 
             | To be clear that includes what we're all doing by
             | downloading and running Javascript to read HN.
             | 
              | Maybe _I_ can say "don't run adversarial code on the same
              | CPU as me" and only care about over-the-network CPU side-channels
             | (of which there are still some), because I write Go crypto,
             | but it doesn't sound like something my colleagues writing
             | browser code can do.
        
               | bee_rider wrote:
                | Unfortunately somebody has tricked users into leaving
                | JavaScript on for every site; it is a really bad
                | situation.
        
               | anonymous-panda wrote:
                | Security and utility are often in tension. The
               | safest possible computer is one buried far underground
               | without any cables in a faraday cage. Not very useful.
               | 
               | > We're not inserting the security community into that
               | loop and slowing things down just so people can download
               | random programs onto their computers and run them at
               | random. That's just a stupid thing to do, there's no way
               | to make it safe, and there never will be.
               | 
                | Setting aside JavaScript, you can see this today with
                | public clouds, which have largely displaced private
                | clouds. These run untrusted code on shared computers.
               | Fundamentally that's what they're doing because that's
               | what you need for economies of scale, durability,
               | availability, etc. So figuring out a way to run untrusted
               | code on another machine safely is fundamentally a
               | desirable goal. That's why people are trying to do
               | homomorphic encryption - so that the "safely" part can go
               | both ways and both the HW owner and the "untrusted" SW
               | don't need to trust each other to execute said code.
        
               | arp242 wrote:
               | Is this exploitable through JavaScript?
               | 
               | In general from what I've seen, most of these JS-based
               | CPU exploits didn't strike me as all that practical in
               | real world conditions. I mean, it _is_ a problem, but not
               | really all that worrying.
        
               | csande17 wrote:
               | Speak for yourself; I've got JavaScript disabled on
               | news.ycombinator.com and it works just fine.
        
             | titzer wrote:
              | I almost gave you an upvote until your third paragraph,
              | but now I have to give a hard disagree. We're running more
             | untrusted code than ever, and we absolutely should trust it
             | less than ever and have hardware and software designed with
             | security in mind. Security should be priority #1 from here
             | on out. We are absolutely awash in performance and memory
             | capacity but keep getting surprised by bad security
             | outcomes because it's been second fiddle for too long.
             | 
             | Software is now critical infrastructure in modern society,
             | akin to the power grid and telephone lines. It's a
              | strategic vulnerability to neglect security, and security
              | must happen at all levels of the software and hardware
              | stack. Meaning: an adversary could try to crash an enemy's
              | entire society by bricking all of its computers and sending
              | it back to the dark ages in milliseconds. I fundamentally
              | don't understand
             | the mindset of people who want to take that kind of risk
             | for a 10% boost in their games' FPS[1].
             | 
             | Part of that is paying back the debt that decades of
             | cutting corners has yielded us.
             | 
             | In reality, the vast majority of the 1000x increase in
             | performance and memory capacity over the past four decades
             | has come from shrinking transistors and increasing
             | clockspeeds and memory density--the 1 or 5 or 10% gains
             | from turning off bounds checks or prefetching aren't the
             | lion's share. And for the record, turning off bounds checks
             | is monumentally stupid, and people should be jailed for it.
             | 
             | [1] I'm exaggerating to make a point here. What we trade
             | for a little desktop or server performance is an enormous,
             | pervasive risk. Not just melting down in a cyberwar, but
             | the constant barrage of intrusion and leaks that costs the
             | economy billions upon billions of dollars per year. We're
             | paying for security, just at the wrong end.
        
               | bee_rider wrote:
               | > I fundamentally don't understand the mindset of people
               | who want to take that kind of risk for a 10% boost in
               | their games' FPS[1]
               | 
               | Me either. But, lots of engineers are out there just
               | writing single threaded matlab and python codes with lots
               | of data-dependencies and just hoping the system manages
               | to do a good job (for those operations that can't be
               | offloaded to BLAS). So I'm glad gamer dollars subsidize
               | the development of fast single threaded chips that handle
               | branchy codes well.
               | 
               | > In reality, the vast majority of the 1000x increase in
               | performance and memory capacity over the past four
               | decades has come from shrinking transistors and
               | increasing clockspeeds and memory density
               | 
               | I disagree, modern designs include deep pipelines, lots
               | of speculation, and complex caches _because_ that's the
                | only way to turn that higher transistor budget into
                | higher clocks and compensate for the fact that memory
               | latencies haven't kept up.
               | 
               | > Part of that is paying back the debt that decades of
               | cutting corners has yielded us.
               | 
               | It will be tough, but yeah, server and mainframe users
                | need to roll back the decision to repurpose
                | consumer-focused chips like the x86 and arm families.
                | RISC-V is
               | looking good though and seems open enough that maybe they
               | can pick-and-choose which features they take.
               | 
                | > I almost gave you an upvote until your third
                | paragraph, but now I have to give a hard disagree.
               | 
                | I'm not too worried about votes on this post; this site
                | has lots of web devs and cloud users, and pointing out
                | that the ecosystem they rely on is impossible to secure
                | is destined to get lots of downvotes-to-disagree.
        
               | saagarjha wrote:
               | How is RISC-V going to solve anything here?
        
               | bee_rider wrote:
               | It isn't a sure thing. Just, since it is a more open
               | ecosystem, maybe the designers of chips that need to be
               | able to safely run untrusted code can still borrow some
               | features from the general population.
               | 
               | I think it is basically impossible to run untrusted code
               | safely or to build sand-proof sandboxes, but I thought
               | the rest of my post was too pessimistic.
        
               | saagarjha wrote:
               | Turning off bounds checks is like a 5% performance
               | penalty. Turning off prefetching is like using a computer
               | from twenty years ago.
        
               | aseipp wrote:
                | I agree that hardware/software codesign is critical to
               | solving things like this, but features like prefetching,
               | speculation, and prediction are absolutely critical to
               | modern pipelines and broadly speaking are what enable
               | what we think of as "modern computer performance." This
               | has been true for over 20 years now. In terms of
               | "overhead" it's not in the same ballpark -- or even the
               | same sport, frankly -- as something like bounds checking
               | or even garbage collection. Hell, if the difference was
                | within even one order of magnitude, they'd have done it
               | already.
        
             | tadfisher wrote:
             | > The feedback loop that powers everything is: faster chips
             | allow better engineering and science, creating faster
             | chips. We're not inserting the security community into that
             | loop and slowing things down just so people can download
             | random programs onto their computers and run them at
             | random. That's just a stupid thing to do, there's no way to
             | make it safe, and there never will be.
             | 
             | Note that in the vast majority of cases, crypto-related
             | code isn't what we spend compute cycles on. If there was a
             | straightforward, cross-architecture mechanism to say, "run
             | this code on a single physical core with no branch
             | prediction, no shared caches, and using in-order execution"
             | then the real-world performance impact would be minimal,
             | but the security benefits would be huge.
        
               | bee_rider wrote:
                | I'm in favor of adding to chips some horrible in-order,
                | no-speculation, no-prefetching, 5-stage-pipeline
                | Architecture 101 core which can be completely verified
                | and made bulletproof.
               | 
               | But the presence of this bulletproof core would not solve
               | the problem of running bad code on modern hardware,
               | unless all untrusted code is run on it.
        
           | saagarjha wrote:
           | Processor designers are very unlikely to do that for you,
           | because everyone not working on constant time crypto gives
           | them a whole lot of money to keep doing this. The best you
           | might get is a mode where the set of assumptions they violate
           | is reduced.
        
         | gabrielhidasy wrote:
         | Many modern architectures have crypto extensions, usually to
         | accelerate a few common algorithms, maybe it would be good to
          | add a few crypto-primitive instructions to allow new
         | algorithms?
        
         | sargun wrote:
         | I think what's more likely is "mode switching" in which you can
         | disable these components of the CPU for a certain section of
         | executing code (the abstraction would probably be at the thread
         | level).
        
         | bee_rider wrote:
         | One option would be for people to stop downloading viruses and
         | then running them.
        
         | Joel_Mckay wrote:
          | Encrypted-bus MMUs have existed since the 1990s.
         | 
         | However, the trend to consumer-grade hardware for cost-
         | optimized cloud architecture ate the CPU market.
         | 
         | Thus, the only real choice now is consumer CPUs even in scaled
         | applications.
        
       | theobservor wrote:
        | The end result of these side-channel attacks would be CPUs that
        | perform no optimizations at all, where every opcode runs in the
        | same number of cycles in all situations. But that will never
        | happen. No one wants a slow CPU.
       | 
       | As long as these effects cannot be exploited remotely, it's not a
       | concern. Of course multi-tenant cloud-based virtualization would
        | be a no-go.
        
         | bee_rider wrote:
          | We need to drop all the untrusted code on some horrible
          | in-order, no-speculative-execution, no-prefetching,
          | 5-stage-pipeline core straight from an Architecture 101 class.
        
           | graemep wrote:
           | It might be preferable.
           | 
           | We have ridiculously fast hardware. In many use cases (client
           | machines in particular) we do not usually really need that. I
           | would gladly drop features for security.
        
             | bee_rider wrote:
             | It will also be good because users will become more annoyed
             | when people try to sneak full programs into their websites,
             | hopefully resulting in a generally less bloated internet.
        
         | _factor wrote:
         | This is why high core counts and isolation matter. Isolate the
         | code to a specific core. Assuming everything is working as
         | intended, an exploit won't compromise other tenants.
        
       | xiconfjs wrote:
       | From the paper: "OpenSSL reported that local side-channel attacks
       | (...) fall outside of their threat model. The Go Crypto team
       | considers this attack to be low severity".
        
       | john_alan wrote:
        | On reading, it seems a lib like libsodium can simply set the
        | disable bit on M3 and above prior to sensitive cryptographic
        | operations.
       | 
       | Also looks like they need to predetermine aspects of the key.
       | 
       | Very cool but I don't think it looks particularly practical.
        
       | saagarjha wrote:
       | > Can the DMP be disabled?
       | 
       | > Yes, but only on some processors. We observe that the DIT bit
       | set on m3 CPUs effectively disables the DMP. This is not the case
       | for the m1 and m2.
       | 
       | Surely there is a chicken bit somewhere to do this?
        
       | Shtirlic wrote:
       | Is it naive to ask whether implementing this mitigation would
       | impact performance and memory interaction speed?
        
       ___________________________________________________________________
       (page generated 2024-03-21 23:00 UTC)