https://gofetch.fail

GoFetch: Breaking Constant-Time Cryptographic Implementations Using Data Memory-Dependent Prefetchers

Overview of GoFetch Attack

GoFetch is a microarchitectural side-channel attack that can extract secret keys from constant-time cryptographic implementations via data memory-dependent prefetchers (DMPs). We show that DMPs are present in many Apple CPUs and pose a real threat to multiple cryptographic implementations, allowing us to extract keys from OpenSSL Diffie-Hellman, Go RSA, as well as CRYSTALS Kyber and Dilithium.

Cite GoFetch

@inproceedings{gofet...
  title = {GoFetch: B...
  author = {Boru Chen...
  booktitle = {USENIX...
  year = {2024},
}

Demo Videos

Go's RSA-2048 Key Extraction on Apple M1

People Behind GoFetch

* Boru Chen, University of Illinois Urbana-Champaign
* Yingchen Wang, University of Texas at Austin
* Pradyumna Shome, Georgia Institute of Technology
* Christopher W. Fletcher, University of California, Berkeley
* David Kohlbrenner, University of Washington
* Riccardo Paccagnella, Carnegie Mellon University
* Daniel Genkin, Georgia Institute of Technology

Contact us at info@gofetch.fail

Frequently Asked Questions

What is the mechanism behind GoFetch?

The GoFetch attack is based on a CPU feature called the data memory-dependent prefetcher (DMP), which is present in the latest Apple processors. We reverse-engineered DMPs on Apple M-series CPUs and found that the DMP activates on (and attempts to dereference) data loaded from memory that "looks like" a pointer. This explicitly violates a requirement of the constant-time programming paradigm, which forbids mixing data and memory access patterns. To exploit the DMP, we craft chosen inputs to cryptographic operations such that pointer-like values appear only if we have correctly guessed some bits of the secret key. We verify these guesses by using cache-timing analysis to monitor whether the DMP performs a dereference.
Once we make a correct guess, we proceed to guess the next batch of key bits. Using this approach, we show end-to-end key extraction attacks on popular constant-time implementations of classical cryptography (OpenSSL Diffie-Hellman Key Exchange, Go RSA decryption) and post-quantum cryptography (CRYSTALS-Kyber and CRYSTALS-Dilithium).

What processors have DMPs and are affected by GoFetch?

We have mounted end-to-end GoFetch attacks on Apple hardware equipped with M1 processors. We also tested DMP activation patterns on other Apple processors and found that M2 and M3 CPUs exhibit similar exploitable DMP behavior. While we have not tested other M-series variants (e.g., M2 Pro), we hypothesize that since these parts have the same microarchitecture as their simpler counterparts, they are likewise equipped with exploitable DMPs. Finally, we found that Intel's 13th Gen Raptor Lake microarchitecture also features a DMP. However, its activation criteria are more restrictive, making it robust to our attacks.

What is the difference between GoFetch and Augury?

The Apple M-series DMP was first discovered by Augury, which suggested that DMPs might mix data and addresses under some conditions. However, we found that the DMP activation criteria outlined by Augury are overly restrictive. As a result, Augury's findings alone are not sufficient to mount attacks on real-world constant-time cryptography. GoFetch shows that the DMP is significantly more aggressive than previously thought, and thus poses a much greater security risk. Specifically, we find that any value loaded from memory is a candidate for being dereferenced (literally!). This allows us to sidestep many of Augury's limitations and demonstrate end-to-end attacks on real constant-time code.

What is a cache side-channel attack?

Modern processors use caches to reduce a program's memory access latency. If data has been accessed before, it gets cached, which makes subsequent accesses to it faster.
Since the cache is shared by processes running on the same machine, attackers co-located on that machine can monitor the cache's state to deduce a victim's access pattern.

What is constant-time programming?

Constant-time programming is a paradigm that aims to harden code against side-channel attacks by ensuring that all operations take the same amount of time, regardless of their operands. In particular, constant-time code cannot contain secret-dependent branches, loops, or other control structures. Moreover, because whether an address is cached changes its access latency in an attacker-observable way, constant-time code cannot mix data and addresses in any way and prohibits the use of secret-dependent memory accesses or array indices. We show that even if a victim correctly separates data from addresses by following the constant-time paradigm, the DMP will generate secret-dependent memory accesses on the victim's behalf, resulting in variable-time code susceptible to our key-extraction attacks.

What is a data memory-dependent prefetcher?

Prefetchers are a hardware optimization that predicts which memory addresses will be accessed in the near future and fetches their data from main memory into the cache. To make a prediction, classical prefetchers use the address trace of previous demand accesses. This strategy performs poorly on irregular access patterns such as linked-list traversals. To handle such irregular patterns, data memory-dependent prefetchers (DMPs) also consider the content of memory when deciding what to fetch, which allows them to capture those indirect access patterns. Unfortunately, this behavior inherently mixes data and memory addresses at the hardware level, which makes the entire compute stack non-constant-time and enables our attack.

Are there other DMP-vulnerable cryptographic implementations?

We don't know.
Our attack relies on the fact that it is possible to craft inputs that control specific intermediate states, making them contain memory addresses in a key-dependent way. The DMP then serves as an oracle, allowing us to learn whether the intermediate state indeed looks like a pointer and thus leaks secret key bits. Unfortunately, assessing whether an implementation is vulnerable requires cryptanalysis and code inspection to understand when and how intermediate values can be made to look like pointers in a way that leaks secrets. This process is manual and slow, and it does not rule out other attack approaches.

Can the DMP be disabled?

Yes, but only on some processors. We observe that setting the DIT bit on M3 CPUs effectively disables the DMP. This is not the case for the M1 and M2. Intel's counterpart, the DOIT bit, can likewise be used to disable the DMP on Raptor Lake processors.

What can I do to protect myself against this attack?

For users, we recommend using the latest versions of software and performing updates regularly. Developers of cryptographic libraries can set the DIT and DOIT bits, which disable the DMP on some CPUs. Additionally, input blinding can help some cryptographic schemes avoid attacker-controlled intermediate values, which prevents key-dependent DMP activation. Finally, preventing attackers from measuring DMP activation in the first place, for example by avoiding hardware sharing, can further enhance the security of cryptographic protocols.

Is there proof-of-concept code?

Yes, we will release it soon.

Can I use the logo?

Yes, SVG and PNG versions of the GoFetch logo are free to use under a CC0 license.

When did you notify Apple?

We disclosed our findings to Apple on December 5, 2023 (107 days before public release).
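
To make the input-blinding countermeasure mentioned in the protection answer above more concrete, here is a minimal sketch of base blinding for textbook RSA decryption. It is an illustration under stated assumptions, not the mitigation used by any particular library: the function name `blinded_rsa_decrypt` and the toy parameters are hypothetical, there is no padding, and Python integers are not constant-time. The only point is the arithmetic: the private-key operation runs on a randomized value rather than on the attacker-chosen ciphertext, so the attacker no longer controls the intermediate values.

```python
import math
import secrets


def blinded_rsa_decrypt(c, d, e, n):
    """Textbook RSA decryption with base blinding (toy sketch, no padding)."""
    # Pick a random blinding factor r that is invertible modulo n.
    while True:
        r = secrets.randbelow(n - 2) + 2  # r in [2, n - 1]
        if math.gcd(r, n) == 1:
            break
    # Blind the attacker-supplied ciphertext: c' = c * r^e mod n.
    c_blinded = (c * pow(r, e, n)) % n
    # The private-key operation now runs on a value the attacker cannot predict.
    m_blinded = pow(c_blinded, d, n)  # equals m * r mod n
    # Unblind: m = m' * r^(-1) mod n.
    return (m_blinded * pow(r, -1, n)) % n


# Toy parameters (hypothetical, far too small for real use).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

message = 65
ciphertext = pow(message, e, n)
recovered = blinded_rsa_decrypt(ciphertext, d, e, n)
```

Because a fresh `r` is drawn on every call, two decryptions of the same ciphertext go through different intermediate values. Whether blinding is applicable, and sufficient, has to be assessed separately for each scheme.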
GoFetch in the News

* Unpatchable vulnerability in Apple chip leaks secret encryption keys
* Apple Silicon vulnerability leaks encryption keys, and can't be patched easily
* GoFetch Flaw Exposes Cryptographic Key Leakage Risk in Apple's M-Series Chips
* New chip flaw hits Apple Silicon and steals cryptographic keys from system cache -- 'GoFetch' vulnerability attacks Apple M1, M2, M3 processors, can't be fixed in hardware

Acknowledgments

This work was partially supported by the Air Force Office of Scientific Research (AFOSR) under award number FA9550-20-1-0425; the Defense Advanced Research Projects Agency (DARPA) under contract numbers W912CG-23-C-0022 and HR00112390029; the National Science Foundation (NSF) under grant numbers 1954712, 1954521, 2154183, 2153388, and 1942888; the Alfred P. Sloan Research Fellowship; and gifts from Intel, Qualcomm, and Cisco.

Copyright (c) 2024 Georgia Institute of Technology. All rights reserved.