[HN Gopher] AEPIC Leak: Architecturally leaking uninitialized da...
       ___________________________________________________________________
        
       AEPIC Leak: Architecturally leaking uninitialized data from the
       microarchitecture [pdf]
        
       Author : triska
       Score  : 86 points
       Date   : 2022-08-09 18:02 UTC (4 hours ago)
        
 (HTM) web link (aepicleak.com)
 (TXT) w3m dump (aepicleak.com)
        
       | triska wrote:
       | Quoting from the accompanying repository
       | https://github.com/IAIK/AEPIC :
       | 
       |  _"AEPIC Leak is the first architectural CPU bug that leaks
       | stale data from the microarchitecture without using a side
       | channel. It architecturally leaks stale data incorrectly returned
       | by reading undefined APIC-register ranges."_
        
       | [deleted]
        
       | pbsd wrote:
       | It bugs me that they classify Alder Lake as being Sunny Cove. It
       | is not. The code name for Alder Lake is Golden Cove / Gracemont
       | for performance/efficiency cores.
       | 
       | In fact it is quite strange that the attack skips Tiger Lake
       | (Willow Cove), which changes almost nothing from Sunny Cove
       | besides L2 and L3 cache sizes, but shows up again in Alder Lake
       | (which has two types of cores: does the attack work on both
       | efficiency and performance cores?)
        
         | titzer wrote:
         | > which changes almost nothing from Sunny Cove besides L2 and
         | L3 cache sizes
         | 
         | AFAICT the attack targets buffers that go between L2 and L3, so
         | it isn't surprising to me that it just happens to not work
         | with a slightly different cache geometry.
        
           | pbsd wrote:
           | Golden Cove has the same sizes as Willow Cove, though
           | different geometry (10 vs 20-way L2, for example). However,
           | Golden Cove's L3 has incredibly high latency compared to
           | predecessors, which might be what makes it work by forcing
           | data to stay in the superqueue longer.
        
       | tptacek wrote:
       | This is super neat.
       | 
       | The researchers built an informal taxonomy of all the CPU bugs
       | we've seen in the past several years, using the CWE (bleh) as a
       | reference. They noticed that anywhere they found a transient
       | execution vulnerability (like a cache side channel) for a given
       | CWE, they'd also find an architectural vulnerability (one exposed
       | directly through the ISA) --- except for CWE-665, Improper
       | Initialization, where their survey only identified transient
       | attacks.
       | 
       | Working from a hypothesis that these kinds of attacks always come
       | in pairs, they set out to find a CWE-665 vulnerability. They did
       | something clever: they built a scanner. From one hardware thread,
       | they kept known data patterns cached. From another, they used
       | ring 0 code to systematically map and read the I/O address space;
       | they could check all those reads for canary values from the first
       | grooming process to detect leaks.
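       | 
       | A minimal sketch of the canary check described above, as a toy
       | Python model and not the researchers' actual scanner. The canary
       | value, the helper name find_canary_hits, and the simulated reads
       | are all assumptions made for illustration; no hardware is
       | touched.

```python
# Toy model of the canary-scanning idea: no real hardware is accessed.
CANARY = bytes.fromhex("c0ffee11")  # hypothetical 4-byte marker pattern


def find_canary_hits(reads, canary=CANARY):
    """Return the addresses whose read-back bytes contain the canary.

    `reads` maps an address to the bytes a (simulated) read returned.
    """
    return sorted(addr for addr, data in reads.items() if canary in data)


# Simulated scan results: only one address returns the groomed pattern.
simulated_reads = {
    0x1000: bytes(16),
    0x1010: bytes(4) + CANARY + bytes(8),
    0x1020: b"\xff" * 16,
}
hits = find_canary_hits(simulated_reads)
```

       | In this toy run only address 0x1010 is flagged, because it is
       | the only simulated read that contains the marker bytes.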
       | 
       | And they found one! The APIC interface (which routes interrupts)
       | exposes successive 32-bit values (sometimes as 4-byte blocks of
       | 8- or 32-byte data structures), which are always aligned on
       | 128-bit boundaries; in other words, each 4-byte APIC value is
       | embedded in a range of addresses 16 bytes wide. Reads past 4
       | bytes in those ranges are undefined. When their scanner read
       | them, they caught canary values their grooming process wrote.
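       | 
       | The layout described above can be sketched as simple offset
       | arithmetic. This is only an illustration of the 16-byte stride
       | and 4-byte defined width stated in this comment; the constant
       | and function names are hypothetical, and offsets are relative to
       | an unspecified APIC base address.

```python
APIC_REG_STRIDE = 16  # each register sits on a 128-bit (16-byte) boundary
APIC_REG_WIDTH = 4    # only the first 4 bytes of each slot are defined


def is_defined_offset(offset):
    """True if `offset` falls inside the defined 4 bytes of its slot."""
    return offset % APIC_REG_STRIDE < APIC_REG_WIDTH


def undefined_ranges(first_slot, n_slots):
    """Yield (start, end) byte ranges, end exclusive, that are undefined."""
    for slot in range(first_slot, first_slot + n_slots):
        base = slot * APIC_REG_STRIDE
        yield (base + APIC_REG_WIDTH, base + APIC_REG_STRIDE)


# Slots 2 and 3 start at offsets 0x20 and 0x30.
ranges = list(undefined_ranges(2, 2))
```

       | Each slot leaves 12 of its 16 bytes undefined, which are the
       | bytes a scanner like the one above would be reading.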
       | 
       | The theory here is that the cache system in these CPUs, which is
       | organized around queues of buffers (to asynchronously handle
       | loads from L2 cache, with line fill buffers, and the LL cache,
       | with fill buffers in the "superqueue"), is also used by the APIC
       | system to satisfy reads: reads from the APIC are satisfied
       | through the superqueue. Their grooming process is filling the
       | whole superqueue up with canary values, and the APIC reads are
       | failing to clear out the superqueue buffers when using them for
       | APIC reads.
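       | 
       | As a deliberately simplified model of that theory (not a
       | description of how the hardware is actually built), the sketch
       | below reuses one buffer for ordinary fills and for APIC reads.
       | The ToyBuffer class, its 16-byte size, and the clear_first flag
       | are assumptions invented for illustration.

```python
class ToyBuffer:
    """A single reusable buffer standing in for one queue entry."""

    def __init__(self, size=16):
        self.data = bytearray(size)

    def fill(self, payload):
        # Ordinary cache traffic: the whole buffer is overwritten.
        self.data[:] = payload

    def apic_read(self, reg_value, clear_first=False):
        # Optionally zero the buffer before reuse (the hypothetical fix).
        if clear_first:
            self.data[:] = bytes(len(self.data))
        # Only the 4 defined register bytes are written into the buffer.
        self.data[:4] = reg_value
        return bytes(self.data)


buf = ToyBuffer()
buf.fill(b"CANARY!!" * 2)  # grooming thread's pattern fills the buffer
leaky = buf.apic_read(b"\x01\x02\x03\x04")
buf.fill(b"CANARY!!" * 2)
clean = buf.apic_read(b"\x01\x02\x03\x04", clear_first=True)
```

       | In this model the uncleared read returns the register bytes
       | followed by leftover canary bytes, while the cleared read
       | returns zeros after the first 4 bytes.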
       | 
       | You can only launch this attack from ring 0 (you need access to
       | physical memory). But that's enough to fatally break SGX, whose
       | whole purpose is running compute on CPUs that don't trust their
       | ring0.
       | 
       | Looks like it only works on Sunny Cove Intel.
        
         | cmroanirgo wrote:
         | I've been a long time away from CPU architecture, but isn't it
         | time that instructions are added that target the caches so that
         | they only fetch the size of the data allocated? That way
         | there's no leaking of errant memory? Or did I completely
         | misunderstand the problem set (which is likely)?
         | 
         | So far in our languages we have two bits of information, the
         | start pointer and the size (the latter being stored as a
         | variable, or intrinsically as a block size), whereas the
         | OS/framework itself often only needs the pointer...
         | 
         | If there was a system that allowed the tightly coupled block of
         | memory to be represented as an op code for the caches, wouldn't
         | that fix up all of this?
        
           | sbf501 wrote:
           | An instruction wouldn't fix that, right? The cacheline
           | fetched by the memory controller would still contain leaked
           | data. Perhaps fill the cacheline with zeros on invalidate...
        
       | joosters wrote:
       | How many times has SGX been broken now?
       | 
       | When (if ever) could a reasonable person think "ok, now they must
       | have got rid of all the vulnerabilities, time to trust this!" ?
       | 
       | ...and then Intel will add another new architectural feature that
       | will interact with SGX in some unforeseen way and break it yet
       | again. SGX is surely just too fragile?
        
         | Syonyk wrote:
         | > _How many times has SGX been broken now?_
         | 
         | Quite a few. L1TF/Foreshadow was pretty catastrophic, and
         | Plundervolt was just _funny._ In name and execution. Plus
         | various others.
         | 
         | > _When (if ever) could a reasonable person think "ok, now they
         | must have got rid of all the vulnerabilities, time to trust
         | this!" ?_
         | 
         | Never.
         | 
         | Nor should one trust Intel chips for sensitive computations,
         | given how badly they leak and how bad Intel seems to be at
         | reasoning about this. I'm getting rid of the last of my Intel
         | systems this month (beyond random compute nodes for BOINC
         | stuff; I run them with 'mitigations=off' for max performance
         | because they have literally nothing at all sensitive on them,
         | not even passwords - I use different passwords for them).
        
       | aritmo wrote:
       | Website: https://aepicleak.com/
        
       ___________________________________________________________________
       (page generated 2022-08-09 23:00 UTC)