[HN Gopher] ZenHammer: Rowhammer Attacks on AMD Zen-Based Platforms
       ___________________________________________________________________
        
       ZenHammer: Rowhammer Attacks on AMD Zen-Based Platforms
        
       Author : transpute
       Score  : 169 points
       Date   : 2024-03-25 18:20 UTC (4 hours ago)
        
 (HTM) web link (comsec.ethz.ch)
 (TXT) w3m dump (comsec.ethz.ch)
        
       | wmf wrote:
       | The real news appears to be that rowhammer is mostly fixed on
       | DDR5.
        
         | samtheprogram wrote:
         | Not news, and per the article:
         | 
         | > Furthermore, we show that ZenHammer can trigger Rowhammer bit
         | flips on a DDR5 device for the first time.
        
         | merb wrote:
         | Well they said that it needs further testing. If it were
         | mostly fixed, it would mean that ECC could help even more. I
         | mean the on-die ECC probably already helps.
        
         | dist-epoch wrote:
         | DDR5 is so fragile they had to include on-die ECC to make it
         | work, even when ECC is not exposed externally.
        
           | jeffbee wrote:
           | That only brings DRAM into alignment with flash and magnetic
           | storage, so it's not really a negative. Everything in your
           | computer is converging on semiconductor with bounded
           | probabilistic state + math.
        
             | titzer wrote:
             | It's always been that way, just how many nines of
             | reliability we're talking about. E.g. at Google scale,
             | bitflips in memory from cosmic rays and general noise happen
             | every day. Everything has checksums on it.
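
A minimal sketch of the "everything has checksums" idea, using Python's standard `zlib.crc32`. The payload and the `flip_bit` helper are illustrative assumptions, not anything described in the thread. A CRC only detects the flip; it does not correct it.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Hypothetical payload standing in for a block of stored data.
payload = b"row data stored in DRAM"
stored_crc = zlib.crc32(payload)

# Simulate a single bit flip (cosmic ray, noise, or Rowhammer).
corrupted = flip_bit(payload, 42)

# CRC-32 detects every single-bit error, so the checksums differ.
checksum_matches = zlib.crc32(corrupted) == stored_crc
print("checksum still matches:", checksum_matches)
```

A reader that recomputes the checksum before trusting the data would reject `corrupted` and fall back to a replica or retry.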
        
           | samstave wrote:
           | May you please ELI5 why DDR5 is 'fragile' as you put it?
           | 
           | Was its design pushing material sciences such that the theory
           | worked, but practical implementation required the 'crutch' of
           | ECC?
        
             | adgjlsfhk1 wrote:
             | basically. pushing the timing and sizes makes it likely
             | that some of your bits will fail to be built correctly.
             | rather than dropping the speed and sizes to get
             | reliability, you just throw an extra chip on to give you
             | redundancy.
        
       | DarkNova6 wrote:
       | I know far too little about hardware security. Is this one of the
       | many inevitable vulnerabilities that arise from CPU optimization
       | and are of little feasibility in the real world?
        
         | dist-epoch wrote:
         | This is a RAM problem, not a CPU one.
        
         | rocqua wrote:
         | Arguably worse. This arises from the physics of DRAM. This
         | occurs at a much lower level than an edge case of a feature
         | that lets you leak info over a side channel. Instead this is
         | just: the data is stored as a small charge in a grid, and by
         | flipping nearby points on the grid a lot you can leak some
         | charge into your target charge.
         | 
         | The smaller the charge, and the closer together the charges,
         | the easier rowhammer attacks are. Also, the smaller and closer
         | together the charges, the faster, cheaper, denser, and more
         | efficient your RAM gets.
         | 
         | There are mitigations, but they are pushed to the limit.
        
       | VHRanger wrote:
       | Serious question: as an average person, are those hardware
       | security issues (rowhammer, spectre, meltdown) an actual risk?
       | 
       | My understanding with spectre and meltdown was that it was an
       | issue for escaping VMs and similar attacks - something AWS
       | engineers should care about, but not me
        
         | transpute wrote:
         | First paragraph:
         | 
         | > This poses a significant risk as DRAM devices in the wild
         | cannot easily be fixed, and previous work showed that
         | Rowhammer attacks are practical, for example, in the browser,
         | on smartphones, across VMs, and even over the network.
        
           | ngneer wrote:
           | That is just one view, namely the authors' view. You may wish
           | to consider recent perfect 10 vulnerabilities for comparison,
           | as these are far more likely to cause problems.
        
         | hathawsh wrote:
         | From a security perspective, a web browser is a kind of VM
         | hypervisor, where each web site may have its own VMs. So yes,
         | everyone can be affected.
        
         | gary_0 wrote:
         | The solution is to disable JavaScript and not run any untrusted
         | apps. And then move to a shack in the woods and live off the
         | land, because you just cut yourself off from modern society.
        
           | transpute wrote:
           | Some browsers (including Brave on iOS) can disable Javascript
           | by default, to be enabled only on trusted sites where 3rd-
           | party ads are blocked.
        
           | rtehfm wrote:
           | Sounds like a recipe to become the next Ted Kaczynski.
        
             | bitwize wrote:
             | Ted Kaczynski's views are pretty popular on Hackernews.
        
               | rustcleaner wrote:
               | Too bad we won't see Uncle Ted give a TED Talk. :^(
        
             | bee_rider wrote:
             | Just install Firefox, then noscript, and skip the bit about
             | the shack.
        
           | bee_rider wrote:
           | Noscript is annoying for like a week until you get the sites
           | that you use frequently and basically trust whitelisted.
           | 
           | Sure, it isn't perfectly safe. If HN or my employer goes
           | evil, they can rowhammer me I guess. I'd expect it to cause a
           | big to-do, though, so I'm not that worried about it.
           | 
           | I don't really understand why people seem to think disabling
           | JS is a big hassle. Is this motivated reasoning by web devs
           | or something?
           | 
           | It is not a big problem, and the sort of "ambient shittiness"
           | of the internet is greatly improved by doing it. Most sites work
           | fine, they'll default to some (better) less dynamic state,
           | maybe some ads won't load. For those sites that don't work,
           | you can make an exception or leave. Personally I'm now mostly
           | visiting sites by people who don't enjoy over complicating
           | things, and who think about fallbacks. It is great!
        
           | spxneo wrote:
           | The year is 2024. Solar panels you installed from Alibaba
           | begin to search for cell towers. Your local instance of LLM
           | voice bot you built to keep you company is using a malicious
           | npm package that suddenly communicates with the solar panels
           | and starts sending packets to a Chinese server.
        
           | tycho-newman wrote:
           | The fact that JavaScript is essential to modern society makes
           | me want to cry.
        
         | rgbrenner wrote:
         | You run untrusted code everyday inside a VM: your browser.
        
         | cortesoft wrote:
         | Rowhammer has a javascript implementation that can run in the
         | browser: https://github.com/IAIK/rowhammerjs
        
         | ncann wrote:
         | The practical answer is that, if 99.9% of people out there have
         | systems that mitigate these issues, no one will bother using
         | these exploits in the wild and you can turn off these
         | mitigations to get the perf benefit and be reasonably sure that
         | you won't get exploited. Unless you're targeted of course.
        
           | magnoliakobus wrote:
           | If 99.9% of people can be exposed to the same malicious code
           | and not even be aware that it was running in the background,
           | it's all the more reason for a malicious actor to expose the
           | largest amount of people to it with relatively minimal risk.
        
         | Tuna-Fish wrote:
         | Before browsers got patched, meltdown could be used to steal
         | browser encryption keys using js. This absolutely would have
         | affected normal people.
        
           | mik1998 wrote:
           | "could", theoretically. In practice, there has never been an
           | observed exploitation of the supposed vulnerability.
           | 
           | mitigations=off
        
             | oynqr wrote:
             | Just don't do that on modern AMD processors, you'll lose
             | performance.
        
         | bee_rider wrote:
         | Everyone should install some kind of script whitelisting add-on
         | and only run JavaScript programs from websites that they really
         | trust. I like noscript. I'm not sure what the Chrome pick is.
         | 
         | Other than that... we don't often run random programs from the
         | internet, right?
         | 
         | They've only scratched the surface for these sorts of bugs.
         | Modern hardware is too complex to actually believe they'll ever
         | get them all.
        
           | sundvor wrote:
           | Definitely not a security expert here, but this is one of the
           | reasons why I at least run ublock origin on just about
           | everything - and recommend everyone do the same. The ad
           | delivery networks are just such a huge vulnerability surface.
           | 
           | Noscript would be much better of course, I guess I'm just too
           | lazy to go that extra step.
        
         | Dalewyn wrote:
         | If you really are an _average person_, then no: Like most
         | other supposed threats, you lose more to the fixes/mitigations
         | than to the threat itself. They just make for great headlines
         | and sensationalism, which is why you as an average person would
         | hear about them at all.
         | 
         | Note that the _average person_ wouldn't know WTF "DRAM" means,
         | let alone "Rowhammer" or "Zen" or other esoteric industry
         | terms.
        
         | ngneer wrote:
         | No. As a sober hardware security researcher, most exploited
         | vulnerabilities that would affect an average person are far
         | more mundane and mostly software driven.
        
         | ls612 wrote:
         | No. I've run the Rowhammer test in memtest86 on my PC after
         | building it (as part of the whole memtest package to verify my
         | XMP was stable) and got zero errors on 64GB of DDR5 memory over
         | all the passes. If Memtest couldn't do it when trying its
         | hardest to brute force it nobody doing drive-by javascript has
         | any chance to exploit it.
        
       | oldge wrote:
       | Does this work when full memory encryption, poisoning, and
       | address xor is turned on?
        
         | reliabilityguy wrote:
         | With memory encryption it won't lead to system exploitation,
         | just to a system crash.
         | 
         | So, with memory encryption you are safer.
        
       | ciupicri wrote:
       | I wonder: does Secure Memory Encryption [1] help against this?
       | 
       | [1]: https://www.amd.com/en/developer/sev.html
        
         | anticensor wrote:
         | Yes, but you might end up with a huge loss of stability. Even a
         | single bitflip might become a fatal error.
        
           | formerly_proven wrote:
           | If you get that many bitflips the system wasn't stable to
           | begin with.
        
             | wtallis wrote:
             | I think the implication was that memory encryption could
             | mean that a rowhammer-induced bitflip would be amplified
             | into scrambling the entire word of memory, which is more
             | likely to have catastrophic effects than a single bit flip.
             | That would be true for any reasonable definition of
             | "stable" that admits any susceptibility to rowhammer.
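
A toy illustration of the amplification described above, assuming a block cipher with good diffusion. The Feistel construction below (`encrypt`, `decrypt`, and the SHA-256 round function) is a hypothetical stand-in written for this sketch. It is not how any real memory-encryption engine is implemented and it is not secure. It only shows that one flipped ciphertext bit decrypts to a block with many changed bits.

```python
import hashlib

KEY = b"demo key"                 # hypothetical key for the sketch
PLAINTEXT = b"0123456789abcdef"   # one 16-byte "word" of memory


def _round(half: bytes, key: bytes, i: int) -> bytes:
    # Round function: 8 bytes derived from the key, round index and half-block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(rounds):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right


def decrypt(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(rounds)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right


def flip_bit(data: bytes, bit_index: int) -> bytes:
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def changed_bits(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


ciphertext = encrypt(PLAINTEXT, KEY)
# One bit flips in "DRAM", i.e. in the stored ciphertext.
garbled = decrypt(flip_bit(ciphertext, 3), KEY)
print("plaintext bits changed by one ciphertext flip:",
      changed_bits(PLAINTEXT, garbled))
```

Without encryption the same flip would change exactly one plaintext bit, which is what makes targeted exploitation possible and also what makes the damage smaller.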
        
       | axytol wrote:
       | They mention Zen 2 and 3, any info on Zen 1? Would it simply
       | apply as well?
        
       | crotchfire wrote:
       | _What about DIMMs with Error Correction Codes (ECC)? Previous
       | work on DDR3 showed that ECC cannot provide protection against
       | Rowhammer._
       | 
       | This is incredibly misleading. The paper they cite states:
       | 
       |  _When the ECC detection is used correctly 0.65%-7.42% of all bit
       | flips still cause silent corruptions... On setup AMD-1,
       | uncorrectable errors crash the system._
       | 
       | The attacker will need to cause dozens of machine halts in order
       | to achieve even a single exploitable bitflip. Dozens of machine
       | halts is not something that goes undetected.
       | 
       | Kudos for calling out JEDEC's terrible behavior on the rowhammer
       | question, but we should not be downplaying ECC as a near-term
       | solution.
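
A back-of-the-envelope sketch of the argument above, under a loudly stated assumption: each bit flip is independently "silent" with probability p, and every non-silent flip is noticed in some way (corrected and logged, or a halt). The function name is hypothetical, and the two p values are the 0.65% and 7.42% figures quoted from the cited paper. Under that model the number of noticed flips before the first silent one is geometric with mean (1 - p) / p.

```python
def expected_noticed_before_silent(p_silent: float) -> float:
    """Mean number of noticed flips before the first silent one,
    assuming independent flips (a simplifying assumption)."""
    if not 0.0 < p_silent <= 1.0:
        raise ValueError("p_silent must be in (0, 1]")
    return (1.0 - p_silent) / p_silent


for p in (0.0742, 0.0065):
    print(f"p_silent={p:.4f}: "
          f"{expected_noticed_before_silent(p):.1f} noticed flips on average")
```

With the quoted range this works out to roughly 12 to 153 noticed events per silent corruption. How many of those are halts rather than corrected errors depends on the platform, which this sketch does not model.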
        
         | transpute wrote:
         | Any recommendations for client devices with ECC memory?
        
           | wtallis wrote:
           | If it has ECC memory, it's going to be branded as a
           | workstation or server or industrial device, not marketed as a
           | consumer device.
           | 
           | Among consumer products, some AMD desktop CPUs and
           | motherboards support ECC memory, and that's about it.
        
         | reliabilityguy wrote:
         | --
        
           | crotchfire wrote:
           | It will detect (by crashing) enough to make exploitation
           | impractical. _That_ is the key point.
        
             | reliabilityguy wrote:
             | I would say that 60% success per trial is a good chance.
        
               | exmadscientist wrote:
               | In the process of generating one triple flip, many, many,
               | many, many, _many_ single and double flips will occur and
               | will be caught. That is why ECC is still an effective
               | defense. Attackers don't just get to go straight to
               | their end game.
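
A minimal sketch of why single and double flips are caught while a triple flip can slip through, using a tiny SECDED code (extended Hamming, 4 data bits in an 8-bit word). This is an illustrative assumption: real ECC DIMMs use much wider codes, and the `encode`, `decode`, and `flip` helpers are hypothetical names for this example only.

```python
def encode(nibble: int) -> list[int]:
    """Encode 4 data bits. Positions 1..7 form Hamming(7,4);
    position 0 is an overall parity bit."""
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def decode(code: list[int]) -> tuple[str, int]:
    """Return (status, data). Status is 'clean', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall_odd = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not overall_odd:
        status = "clean"
    elif overall_odd:
        # Looks like a single flip at `syndrome` (0 means the parity bit).
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even parity: a double flip, detected only.
        status = "uncorrectable"
    data = fixed[3] | fixed[5] << 1 | fixed[6] << 2 | fixed[7] << 3
    return status, data


def flip(code: list[int], positions: list[int]) -> list[int]:
    out = list(code)
    for p in positions:
        out[p] ^= 1
    return out


word = encode(0b1011)
print(decode(flip(word, [5])))        # single flip
print(decode(flip(word, [2, 5])))     # double flip
print(decode(flip(word, [1, 2, 5])))  # triple flip
```

In this toy code a triple flip looks like a single flip to the decoder, so it reports "corrected" while returning wrong data. The point made in the thread is that an attacker usually produces many single and double flips, which are reported, on the way to such a pattern.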
        
               | reliabilityguy wrote:
               | --
        
               | rightbyte wrote:
               | That's true for encryption too.
        
               | exmadscientist wrote:
               | The ECCploit paper has extensive discussion of all the
               | ways their work is detected, and how they even use
               | detection to probe the correction structure. This is
               | _not_ a silent attack. This is a proof that ECC is a
               | penetrable defense. Which we all know! The question is
               | how difficult it is and how stealthily it can be done.
               | 
               | But regardless, ECC still sounds the alarm when it's
               | being attacked. If no one listens, there's not much ECC
               | can do about that.
        
               | YetAnotherNick wrote:
               | You can cause any amount of single and double flip
               | without worry. It's not a defence as the attacker can
               | retry till ECC labels it as uncorrectable. AFAIK there is
               | no cost in retrying.
        
         | p1necone wrote:
         | > The attacker will need to cause dozens of machine halts in
         | order to achieve even a single exploitable bitflip. Dozens of
         | machine halts is not something that goes undetected.
         | 
         | _If_ you're targeting a specific machine. If you're throwing
         | the exploit at a few thousand machines shotgun style, then
         | you're still going to get your botnet - it'll just be smaller.
        
           | vlovich123 wrote:
           | I think the point is that people with thousands of machines
           | are probably going to notice if a meaningful chunk of them
           | start halting.
        
             | SAI_Peregrinus wrote:
             | Yep, and desktop users will certainly notice. Only AMD has
             | desktop (not workstation) ECC support.
        
               | riedel wrote:
               | If you are running Windows 10, random halts and the CPU
               | getting hot won't seem suspicious.
        
           | crotchfire wrote:
           | Can you point to _any_ botnets which were built using
           | rowhammer attacks?
           | 
           | Rowhammer and speculative execution attacks are _incredibly_
           | labor-intensive and target-specific. They are targeted
           | attacks for high-value targets.
        
         | wolpoli wrote:
         | > The attacker will need to cause dozens of machine halts in
         | order to achieve even a single exploitable bitflip. Dozens of
         | machine halts is not something that goes undetected.
         | 
         | Is there a process for the operations team managing the system
         | to figure out that it was an attack and not just flaky
         | hardware?
        
       | tdullien wrote:
       | Coauthor of the original Rowhammer exploit here. ECC remains a
       | highly effective method for turning this from a security issue to
       | a reliability issue, _mostly_. As an individual owner of a
       | server, if that server has ECC and you expect to notice machine
       | halts due to uncorrectable ECC errors, the security implications
       | for you are modest.
       | 
       | Now, if you are a cloud provider that provides VMs on multitenant
       | hosts, your threat model may be different.
       | 
       | Either way, avoid machines without ECC. TRR was a lame duck even
       | when Rowhammer was still fresh, and bits flipping in DRAM will
       | not go away unless the economics in DRAM manufacturing change
       | (i.e. not).
        
         | c2h5oh wrote:
         | I would expect that increased crash rate of multi-tenant hosts
         | would be something that would be detected and investigated by
         | the cloud provider. At the same time targeting a specific
         | tenant would require a lot of luck.
        
         | treprinum wrote:
         | AMD now took Intel's market segmentation approach and is
         | disabling ECC on most Ryzen CPUs. Only Pro and Threadrippers
         | have it guaranteed, then some boards with some desktop Ryzens.
        
           | wtallis wrote:
           | Is that a change? I think what you described applies equally
           | to every generation of Zen processors: Pro-branded chips have
           | ECC capability officially, laptop chips don't have it, and
           | consumer-branded chips have it unofficially with ECC
           | capability optional for motherboards.
        
           | c2h5oh wrote:
           | Nothing changed since first Ryzen launch:
           | 
           | - Desktop Ryzen CPUs support ECC, but implementation by
           | motherboard vendors is not mandatory
           | 
           | - Laptop and G-series Ryzen CPUs only support ECC in pro
           | variant
           | 
           | - Threadripper has ECC support
           | 
           | edit: not confirmed but supposedly laptop and APUs starting
           | with 6000 series all support ECC.
        
             | 0xcde4c3db wrote:
             | There are also a few oddball desktop SKUs that are actually
             | a G-series processor with the GPU disabled (primarily ones
             | below the "600" tier, e.g. the Ryzen 3 4100 or Ryzen 5
             | 5500), which also lack ECC support.
        
           | dralley wrote:
           | Is it disabled or is it simply not certified?
           | 
           | Last I checked ECC was not certified to work on most
           | "consumer" oriented hardware, but AMD didn't make any attempt
           | to actually disable it.
        
         | Arnavion wrote:
         | I would use ECC memory if I could. I used to use a TR 2920x
         | with ECC but now I'm on a Ryzen 7950x with non-ECC. Unbuffered
         | ECC memory is the only one supported by Ryzen, and it's slower
         | or more expensive or both compared to the equivalent-capacity
         | non-ECC memory. The latest Threadripper lineup supports
         | Registered ECC, but Threadripper is overkill (cost, threads,
         | PCIe lanes) for home users like myself.
        
       ___________________________________________________________________
       (page generated 2024-03-25 23:00 UTC)