[HN Gopher] Zenbleed
___________________________________________________________________
Zenbleed
Author : loeg
Score : 743 points
Date : 2023-07-24 14:34 UTC (8 hours ago)
(HTM) web link (lock.cmpxchg8b.com)
| ComputerGuru wrote:
| No details on the performance impact of the microcode update.
| _Presumably_ it disables speculative execution of vzeroupper?
| hinkley wrote:
| Or adds a guard.
|
| They mention perf issues for the workaround but they're notably
| absent from the microcode commentary.
|
| I wonder what this is going to do to the new AMD hardware AWS
| is trying to roll out, which is supposed to be a substantial
| performance bump over the previous generation.
| jeffffff wrote:
| shouldn't have any effect, the new amd hardware is zen 4 and
| this only affects zen 2
| infinityio wrote:
| It looks like this is a Zen 2-only exploit, so it shouldn't
| have any impact - AWS are likely already running hardware
| that isn't vulnerable to this
| hinkley wrote:
| The way Spectre and Meltdown played out, you'll have to
| excuse me if I stand outside the blast radius while we
| figure out if there's a chapter 2, 3 or 4 to this story.
|
| They've proven Zen 2 has this problem. They haven't proven
| no other AMD processors have it. A bunch of people looking
| to make names for themselves are probably busily testing
| every other AMD processor for a similar exploit.
| heywhatupboys wrote:
| > The way Spectre and Meltdown played out, you'll have to
| excuse me if I stand outside the blast radius while we
| figure out if there's a chapter 2, 3 or 4 to this story.
|
| I am OOTL on this one, do you have some information you
| could share?
| loeg wrote:
| There has been a long trickle of similar bugs to
| Spectre/Meltdown coming out long after the initial bugs
| and "fixes" were published. (The early fixes were all, in
| some sense, incomplete.)
| kzrdude wrote:
| There was a list of vulnerabilities in this comment up
| top: https://news.ycombinator.com/item?id=36849914
| darkclouds wrote:
| Nice catch!
|
| > If you can't apply the update for some reason, there is a
| software workaround: you can set the chicken bit DE_CFG[9].
|
| It reminds me of the compiler switches which can alter the way
| code at different levels (global, procedure, routine) can access
| variables declared at different levels and the change in scope
| that ensues.
|
| Maybe some of this HW caching should be left to the coders.
| codedokode wrote:
| I don't understand how a microcode update could fix this. I
| assume microcode is used for slow operations like trigonometric
| functions, and doesn't affect how registers are allocated or
| renamed. Or does the update simply disable some optimizations
| using "chicken bits"? And by the way, is there a list of such
| bits?
| sebzim4500 wrote:
| Everything a modern CPU runs is microcode. There are a few x86
| instructions that translate to a single microcode instruction,
| but most are translated to several.
| wolf550e wrote:
| The designers leave themselves an ability to override any
| instruction using the microcode so they can patch any
| instruction. They don't use the microcode only to implement
| complex instructions that require loops.
| mrpippy wrote:
| It feels like not-a-coincidence that OpenBSD added AMD microcode
| loading in the last 3 days.
|
| https://news.ycombinator.com/item?id=36838511
| dralley wrote:
| This may or may not also be relevant (I actually have no idea):
| https://www.phoronix.com/news/Fedora-Server-Alert-FW-Updates
| hammock wrote:
| Explain that like I'm 5?
| laverya wrote:
| The patch for this exploit is to load AMD's updated
| microcode.
| dumdumchan wrote:
| Is apt update && apt upgrade enough for pop-os users?
| CameronNemo wrote:
| Probably eventually yes, but if you are really concerned
| you need to discuss it with your distro maintainers.
| gabereiser wrote:
| This. Not everyone is as quick as say Arch or Fedora in
| updating/patching. Please reach out to the maintainers
| of the distro you use.
| [deleted]
| vladvasiliu wrote:
| Even Arch seems out of date as of 24 Jul 2023 17:55 UTC.
|
| The latest amd firmware version is 20230625.
| LtdJorge wrote:
| Gentoo already has it, however the latest ebuild is still
| masked, so one would need to put
| "sys-kernel/linux-firmware ~amd64" inside a file in
| /etc/portage/package.accept_keywords, or better yet,
| always run the git version, using * instead of ~amd64.
|
| Apart from that, it's necessary to "sudo emaint sync -A
| && sudo emerge -av sys-kernel/linux-firmware", while
| checking that the correct files are included in the
| savedconfig file if using it. After that, rebuild the
| kernel or the initramfs and reboot.
| kzrdude wrote:
| I think you'll need to reboot for the microcode to be
| updated
| jahsome wrote:
| I'm not sure five year olds know what microcode is. I'm 35,
| been in tech nearly 20 years and don't recall having heard
| that specific term before today.
| eindiran wrote:
| The whole "explain like I'm 5" thing is ridiculous. A
| huge percentage of topics simply cannot be broken down to
| an average 5 year old in a way that makes the
| conversation worth having at all. The 5 year old has no
| context about why in recent years there has been a huge
| push towards running your own code on other people's
| computers using various isolation techniques, or why
| people are trying to exploit that. The 5 year old has no
| context for what the exploits actually are, or how to
| mitigate them. Even if you break all of those things down
| into 5 year old bitesized chunks, you end up with boring
| word soup completely disconnected from the meaningful
| parts of the conversation.
|
| Really what ELI5 is, is a technique to allow the asker to
| not have to look anything up. From the parent comment,
| you can look up "patch", "AMD", "microcode"; or you can
| demand "ELI5!" and have someone else type up long,
| careful definitions that don't reference context or words
| that a 5 year old doesn't know.
|
| Regarding what microcode is, here is a good explanation
| of the differences between microcode and firmware:
|
| https://superuser.com/questions/1283788/what-exactly-is-micr...
| jahsome wrote:
| Sure, I can look it up (and I did) but this is a
| _discussion_ section, so why not prompt a discussion by
| asking for a simple explanation?
|
| Appreciate the link! I'm not OP but that's exactly what I
| was looking for.
| byvirtueof wrote:
| I agree that many topics are hard to explain to a five
| year old, but ELI5 can be very helpful in forcing people
| to simplify their writing. Many people explain things in
| an unnecessarily complex way, and ELI5 at least makes
| them think about the target audience.
| wolf550e wrote:
| A Grandchild's Guide to Using Grandpa's Computer a.k.a.
| "If Dr. Seuss were a Technical Writer" was written in
| 1994 and mentions microcode.
|
| Microcode updates are always discussed when talking about
| microarchitectural security vulnerabilities (and other
| scary CPU errata like
| https://lkml.org/lkml/2023/3/8/976).
|
| Microcode is always mentioned when discussing CPU design
| evolution.
| jahsome wrote:
| It's funny that it's "always" mentioned, yet it's not
| familiar to me. Also curious the Wikipedia article for
| CPU design doesn't mention it, since it's "always"
| referenced.
|
| Just because something is familiar to you, or even large
| swaths of a given population, doesn't mean everyone
| should be expected to know it.
|
| I love learning new things. I love discovering topics I
| know nothing about, and I love picking the brains of
| those passionate about them. But the condescension from a
| certain type of tech nerd sucks all the fun out of
| learning. I've certainly been guilty of this in the past.
| heywhatupboys wrote:
| > I'm not sure five year olds know what microcode is
|
| Sounds like cope after being outprogrammed by a kindergartner
| in Roblox
| enedil wrote:
| But well educated five year olds from good schools would
| know it.
| akyuu wrote:
| https://www.amd.com/en/resources/product-security/bulletin/a...
|
| According to AMD's security bulletin, firmware updates for non-
| EPYC CPUs won't be released until the end of the year. What
| should users do until then, set the chicken bit and take the
| performance hit?
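|
| A minimal sketch of what "setting the chicken bit" amounts to,
| assuming DE_CFG is the MSR at 0xC0011029 (that address comes from
| the advisory's workaround, not from this thread). It only shows the
| bit arithmetic on a value you have already read; actually reading
| and writing the MSR needs root and a tool such as rdmsr/wrmsr, and
| the function names here are made up for illustration.

```python
# Assumption: DE_CFG lives at MSR 0xC0011029; the thread itself only
# names the bit, DE_CFG[9].
DE_CFG_MSR = 0xC0011029
CHICKEN_BIT = 9


def with_chicken_bit(de_cfg: int) -> int:
    """Return the DE_CFG value with bit 9 set, leaving other bits untouched."""
    return de_cfg | (1 << CHICKEN_BIT)


def chicken_bit_is_set(de_cfg: int) -> bool:
    """Report whether bit 9 is already set in a DE_CFG value."""
    return bool((de_cfg >> CHICKEN_BIT) & 1)


if __name__ == "__main__":
    before = 0x40  # made-up register contents, for illustration only
    after = with_chicken_bit(before)
    print(hex(DE_CFG_MSR), hex(before), "->", hex(after), chicken_bit_is_set(after))
```

| The OR keeps every other bit as it was, which matters because the
| same register holds unrelated settings. The setting does not
| survive a reboot unless something reapplies it.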
| stefan_ wrote:
| Are they out of their mind? This is not a "medium".
| qhwudbebd wrote:
| Presumably classified as severity 'medium' in an attempt to
| look marginally less negligent when announcing that they
| can't be bothered to issue microcode updates for most CPU
| models until Nov or Dec.
| ItsTotallyOn wrote:
| What does this allow the attacker to do? Steal data? The post
| isn't very clear.
| timmaxw wrote:
| It allows the attacker to eavesdrop on the data going through
| operations like strcmp(), memcpy(), and strlen(). (These are
| the standard functions in C for working with strings; and many
| higher-level languages use them under the hood.) It works on
| any function that uses the XMM/YMM/ZMM registers.
|
| It's stochastic; the attacker randomly gets data from whatever
| happens to be using the XMM/YMM/ZMM registers at the time. So
| if the attacker could eavesdrop in the background constantly,
| they might eventually see a password. Or they might be able to
| trigger some system code that processes your password, then
| eavesdrop for the next few milliseconds.
|
| The attacker needs to run code on your machine. Unclear if
| running code in a web browser is sufficient or not. It requires
| an unusual sequence of machine instructions, which isn't
| necessarily possible in JS/WASM, but 'sounds' says they did it:
| https://news.ycombinator.com/item?id=36849767
| sounds wrote:
| Huh. The very first line seems pretty clear:
|
| > If you remove the first word from the string "hello world",
| what should the result be? This is the story of how we
| discovered that the answer could be your root password!
|
| Can you please expand on your question?
| bananapub wrote:
| I assume they meant "what does this do in normal
| vulnerability discussion terms", I don't know why tavis
| didn't just say "arbitrary memory read across processes" or
| whatever.
| ItsTotallyOn wrote:
| does it require physical access to the machine?
| xmodem wrote:
| No, only the ability to execute arbitrary code in an
| unprivileged context. Would probably have to be arbitrary
| x86_64 instructions - JavaScript wouldn't cut it for this
| one.
| sounds wrote:
| I was able to reproduce the vulnerability using javascript
| on a webpage. Therefore, no.
| Sohcahtoa82 wrote:
| PoC || GTFO
| IggleSniggle wrote:
| [flagged]
| Y_Y wrote:
| Not even an xor? Harsh.
| sounds wrote:
| OP here hadn't even bothered to read the article. That's
| the context of my reply. No PoCs going online so close to
| the disclosure, sorry.
| pests wrote:
| It's okay to admit you are wrong or don't have a working
| POC.
| LtdJorge wrote:
| What? The researcher that found it and wrote the article
| already posted a PoC that can be used to farm data from
| VMs in any VPS provider.
| pests wrote:
| Why is everyone claiming this is impossible in
| JavaScript? If you have a POC you should post it so
| others can learn of the danger.
|
| You've even been quoted elsewhere in this thread about
| this topic.
| 0xbadcafebee wrote:
| Some people think you need "the ability to execute
| arbitrary code in an unprivileged context" to perform
| this exploit. Which is of course a false assumption. The
| bug class in this case is basically a use-after-free,
| for a function which keeps its state per-cpu-core, for a
| function that is (for almost all intents and purposes)
| unprivileged.
|
| From the article:
|
| > We now know that basic operations like strlen, memcpy and
| strcmp will use the vector registers - so we can effectively
| spy on those operations happening anywhere on the system! It
| doesn't matter if they're happening in other virtual machines,
| sandboxes, containers, processes, whatever!
|
| All you need to do is write some JavaScript that will
| _"trigger something called the XMM Register Merge
| Optimization, followed by a register rename and a
| mispredicted vzeroupper"_. It's up to the hacker to
| determine how to do this explicitly in JS, but it's
| theoretically possible by literally any application at
| any time on any operating system. Even if some language
| or interpreter claims to prevent it, it's possible to
| find an exploit in that particular
| language/interpreter/etc to get it to happen.
|
| This is how exploit development works; if you can't go
| straight ahead, go sideways. I guarantee you that someone
| will find a way, if they haven't yet.
| _flux wrote:
| What javascript was that, or did you create your own? I
| did not find any from this post.
| KomoD wrote:
| I'll take this as bullshit until there's a POC
| crtasm wrote:
| Might you post a screen recording?
| CyberDildonics wrote:
| Might you explain how that would prove anything?
| heywhatupboys wrote:
| effort to lie on a text comment << effort to lie with a
| video
| CyberDildonics wrote:
| Might you think that source code would be much better
| proof and easier to send out?
| evandale wrote:
| We are on a tech site with highly intelligent individuals
| who have been programming computers since we've been in
| diapers.
|
| If you don't believe the text then how would you believe
| the video? Anything can be done in devtools beforehand
| and I can think of a million different ways to fake the
| video.
|
| Personally, if I didn't trust the text then an easily
| faked video wouldn't placate me either.
| kzrdude wrote:
| No, it requires unprivileged arbitrary code execution
| kristopolous wrote:
| Beyond what everyone else said, these types of exploits can
| break out of VMs. Unless I'm misreading it you could log
| into your $5 linode/digitalocean/aws machine and start
| reading other people's data on the host machine.
|
| There's tons of million dollar/month businesses on
| ~$20/month accounts on shared machines.
| rkrzr wrote:
| It allows the attacker to steal data like e.g. your (root)
| password.
| tremon wrote:
| Only while it's stored unencrypted in memory, right?
| saagarjha wrote:
| As is the case whenever you type it in, yes
| taneliv wrote:
| My reading of the article was that memory is not directly
| compromised, but CPU registers. So loaded unencrypted in
| one of the affected registers.
| beebmam wrote:
| It is very clear, you just didn't read it.
|
| >We now know that basic operations like strlen, memcpy and
| strcmp will use the vector registers - so we can effectively
| spy on those operations happening anywhere on the system! It
| doesn't matter if they're happening in other virtual machines,
| sandboxes, containers, processes, whatever!
|
| >This works because the register file is shared by everything
| on the same physical core. In fact, two hyperthreads even share
| the same physical register file.
|
| >It turns out that mispredicting on purpose is difficult to
| optimize! It took a bit of work, but I found a variant that can
| leak about 30 kb per core, per second.
|
| >This is fast enough to monitor encryption keys and passwords
| as users login!
| hinkley wrote:
| Literally the intro says it might contain the root password.
|
| TLDR: The vector registers this bug affects are used for
| string functions like strcmp, so anything could get loaded
| into them, including passwords.
| kristjank wrote:
| At least it's fixed in microcode, unlike some recent exploits
| (Spectre and Meltdown come to mind)
| sounds wrote:
| The site is getting hugged to death.
| https://web.archive.org/web/20230724143835/https://lock.cmpx...
| ksec wrote:
| It is a simple static HTML page, how is it possible in 2023 that
| a static site could be hugged to death? In most cases HN traffic
| barely hits 100 page views per second.
| jedberg wrote:
| It's a security writeup so it's probably run by a security
| expert who is not an expert at running high traffic websites.
| Most likely there is something on the page that causes a
| database hit. Possibly the page content itself.
| [deleted]
| taviso wrote:
| welp, that's unfortunate indeed.
|
| It's a single-core 128 MB VPS, which seemed fine for my
| boring static html articles. I guess I underestimated the
| interest.
| yakubin wrote:
| FWIW, enabling gzip/zstd compression in your HTTP server
| could help.
| ransackdev wrote:
| A single core machine already overloaded is going to get
| even worse introducing the cpu overhead of gzipping
| response bodies (assuming it's cpu bound and not IO
| bound)
|
| Cache control headers will help with return traffic
|
| More cpu cores
|
| If using nginx ensure sendfile is enabled and workers are
| set to auto or tuned for your setup
|
| Check ulimit file handle limits
|
| Offload static assets to cdn
|
| Since it's a static html site, you could even host on s3,
| netlify, etc
| jwilk wrote:
| It's a static file. You need to compress it only once,
| not for every response.
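|
| A rough sketch of the "compress once" idea, using Python's gzip
| module. The respond() helper and its header dict are hypothetical
| stand-ins for whatever the web server does; the point is only that
| the per-request path picks a stored variant and does no compression
| work.

```python
import gzip


def precompress(body: bytes) -> bytes:
    """Compress a static response body once, ahead of time."""
    return gzip.compress(body, compresslevel=9)


def respond(precompressed: bytes, accepts_gzip: bool, original: bytes):
    """Pick the stored variant per request; nothing is compressed here."""
    if accepts_gzip:
        return {"Content-Encoding": "gzip"}, precompressed
    return {}, original


if __name__ == "__main__":
    page = b"<html>" + b"<p>static article text</p>" * 1000 + b"</html>"
    stored = precompress(page)  # done once, e.g. at deploy time
    headers, payload = respond(stored, True, page)
    print(len(page), len(payload), headers)
```

| Since the CPU cost is paid once at deploy time, the maximum
| compression level is affordable even on a single-core box.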
| ptx wrote:
| ...and here's how to do it in Apache:
| https://httpd.apache.org/docs/2.4/mod/mod_deflate.html#preco...
| brazzledazzle wrote:
| Could even host on github pages with a cname.
| wolf550e wrote:
| Only with something like mod_asis
| (https://httpd.apache.org/docs/2.4/mod/mod_asis.html) to
| serve already compressed content. Actually running zlib
| on every request will only make it worse.
| javajosh wrote:
| As an aside, I'd be curious to know how your VPS failed.
| Memory? Bandwidth?
| account42 wrote:
| Interesting, do you mind sharing what software you use to
| serve the static html and what kind of traffic it's getting?
| loeg wrote:
| HTTP/1.1 200 OK
| Date: Mon, 24 Jul 2023 17:05:06 GMT
| Server: Apache
| brazzledazzle wrote:
| I do not miss performance tuning apache.
| cesarb wrote:
| In my personal experience, the first step in tuning
| Apache was "put a nginx server in front of it". Running
| out of workers (either processes in the prefork model, or
| threads otherwise) was in my experience way too easy,
| especially when keepalive is enabled (even a couple of
| seconds of keepalive can be painful). The async model
| used by nginx can handle a lot more connections before
| running out of resources.
| zokier wrote:
| Apache has been defaulting to event mpm for over a
| decade.
| tamimio wrote:
| Doesn't matter, great article!
| marcus0x62 wrote:
| I imagine they are also getting traffic from sources other
| than HN.
| winrid wrote:
| 100rps for most articles. I bet this is at least double that,
| and he's using apache which by default I think is still
| thread per connection.
| ComputerGuru wrote:
| Faster link: https://archive.is/QAwvQ
| AdmiralAsshat wrote:
| And now we've hugged the archive to death. Nice job!
| loeg wrote:
| The original still loads (eventually) for me. YMMV.
| nevi-me wrote:
| XMMV or ZMMV could also apply
| account42 wrote:
| [flagged]
| artisanspam wrote:
| Why does disabling SMT not fully prevent this? I don't know the
| details of Zen 2 architecture, but register files are usually
| implemented as SRAM on the CPU-die itself. So unless the core is
| running SMT, I don't understand how another thread could be
| accessing the register file to write a secret.
| adrian_b wrote:
| Because unless you pin the threads to certain CPU cores (e.g.
| in Linux by using the taskset command, or in Windows by using
| the Set Affinity command in Task Manager), they are migrated
| very frequently between cores.
|
| So even with SMT disabled, each core will execute sequentially
| many threads, switching every few milliseconds from one thread
| to another, and each context switch does not modify the hidden
| registers, it just restores the architecturally visible
| registers.
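|
| For reference, a small sketch of the pinning described here, done
| from inside a process rather than with taskset. It assumes Linux,
| where Python exposes os.sched_getaffinity and os.sched_setaffinity;
| the helper name is made up. It only restricts where this one
| process runs and does not stop the scheduler from putting other
| threads on the same core, so it is not a mitigation.

```python
import os


def pin_to_single_cpu() -> int:
    """Restrict the calling process to one of its currently allowed CPUs.

    Linux-only: os.sched_getaffinity/os.sched_setaffinity expose the
    CPU affinity mask that the taskset command manipulates.
    """
    allowed = os.sched_getaffinity(0)  # pid 0 means "the calling process"
    cpu = min(allowed)
    os.sched_setaffinity(0, {cpu})
    return cpu


if __name__ == "__main__":
    cpu = pin_to_single_cpu()
    print("pinned to CPU", cpu, "->", os.sched_getaffinity(0))
```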
| dontlaugh wrote:
| Pinning doesn't help either, since there will always be more
| threads than cores. Scheduling all those threads and even
| blocking on IO will cause context switches.
| wbl wrote:
| Because the context switch only affects architectural state not
| microarchitectural state.
| artisanspam wrote:
| Yes I understand that but I was struggling to think of a
| sequence of instructions that would cause this secret leaking
| on a single thread.
|
| But a simple example: `vzeroupper` followed by anything
| that writes a secret to the same register file entry, which
| would then be leaked on a subsequent flush.
| wbl wrote:
| That's not quite right. The attacker does the vzeroupper
| rollback. Any registers in the physical file that haven't
| been overwritten can be exposed as a result, regardless of
| what the victim did.
| [deleted]
| wzdd wrote:
| Really lovely writeup. I liked the discussion of how you can
| tell whether a randomly-generated program performed correctly.
| The obvious approach is to just run it on an "oracle" -- another
| processor or simulator -- and see if it behaves the same way. But
| if you're checking for microarchitectural effects with tight
| timing windows you can also write the same program with various
| stalls, fences, nops and so on -- things which shouldn't affect
| the output (for single-threaded code) but which will result in
| the CPU doing significantly different things
| microarchitecturally. That way the CPU can be its own oracle.
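|
| A toy sketch of that "CPU as its own oracle" idea. Everything here
| is invented for illustration: a three-instruction accumulator
| machine stands in for the CPU, and the buggy flag injects a made-up
| fault that only fires when a mul directly follows an add, as a
| stand-in for a timing-dependent hardware bug. The check runs each
| random program twice, once plain and once padded with nops, and
| flags any disagreement.

```python
import random

MASK = 0xFFFFFFFF


def execute(program, buggy=False):
    """Run a toy accumulator machine over ('add', k), ('mul', k), ('nop',) ops.

    With buggy=True, a 'mul' issued directly after an 'add' produces a
    corrupted result; this fault is hypothetical, not a real CPU bug.
    """
    acc, prev = 1, None
    for op in program:
        if op[0] == "add":
            acc = (acc + op[1]) & MASK
        elif op[0] == "mul":
            acc = (acc * op[1]) & MASK
            if buggy and prev == "add":
                acc ^= 1  # flip the low bit to simulate the fault
        prev = op[0]
    return acc


def pad_with_nops(program):
    """Insert a nop after every op; architecturally this changes nothing."""
    padded = []
    for op in program:
        padded.extend([op, ("nop",)])
    return padded


def consistent(program, buggy=False):
    """Self-oracle check: the plain and the padded run must agree."""
    return execute(program, buggy) == execute(pad_with_nops(program), buggy)


def random_program(rng, length=8):
    """Generate a random sequence of add/mul ops with small operands."""
    return [(rng.choice(["add", "mul"]), rng.randrange(1, 10)) for _ in range(length)]


if __name__ == "__main__":
    rng = random.Random(0)
    programs = [random_program(rng) for _ in range(100)]
    flagged = sum(not consistent(p, buggy=True) for p in programs)
    print(flagged, "of", len(programs), "programs flagged as anomalous")
```

| No second machine or simulator is needed: a correct machine always
| agrees with itself, so any mismatch is worth investigating.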
| weinzierl wrote:
| This part was super interesting, especially the differences
| between fuzzing software and hardware. I also liked the
| _chicken bit_.
| Shazshe wrote:
| [flagged]
| gavinhoward wrote:
| Off-topic question, but can some experts tell me why it is safe
| for `strlen()` and friends to use vector instructions when they
| can technically read out of bounds?
| loeg wrote:
| Essentially because memory mappings and RAM work at page
| granularity, rather than bytes. If a read from in-bounds in a
| page isn't going to fault, a read later in the same page isn't
| going to fault either (even if it is past the end of the
| particular object).
|
| You can see this in glibc's implementation, which checks for
| crossing page boundaries:
| https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...
| (line ~68)
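|
| The page-boundary arithmetic can be sketched roughly like this. It
| is not glibc's code, just the same idea in Python, assuming 4 KiB
| pages and 32-byte vector loads; the function names are made up.

```python
PAGE_SIZE = 4096  # assumed page size; typical for x86-64, not universal
VEC_SIZE = 32     # bytes per load, e.g. one 256-bit YMM register


def load_crosses_page(addr: int, width: int = VEC_SIZE) -> bool:
    """True if a width-byte load starting at addr would touch the next page."""
    return (addr % PAGE_SIZE) + width > PAGE_SIZE


def safe_first_load(addr: int, width: int = VEC_SIZE) -> int:
    """Address a strlen-style routine could load from first without faulting.

    If the unaligned load would spill into the next page, fall back to
    the width-aligned address at or below addr. An aligned load never
    straddles a page because PAGE_SIZE is a multiple of width. The
    routine then has to ignore the bytes that precede addr.
    """
    if load_crosses_page(addr, width):
        return addr - (addr % width)
    return addr


if __name__ == "__main__":
    for addr in (100, PAGE_SIZE - VEC_SIZE, PAGE_SIZE - 6):
        print(addr, load_crosses_page(addr), safe_first_load(addr))
```

| Either way the load stays inside a page that already holds at least
| one in-bounds byte of the string, so it cannot fault even when it
| reads past the terminator.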
| gavinhoward wrote:
| Ah, so _that's_ why there is special code in Valgrind to
| handle glibc and friends!
| loeg wrote:
| I think capability-pointer machines like CHERI might need
| in-bounds-only variants of these functions, too.
| saagarjha wrote:
| Generally CHERI tracks things for 16-byte regions
| loeg wrote:
| Implementations using 32- or 64-byte (256 or 512 bit)
| vector extensions would run afoul of 16-byte granularity.
| While it is not common yet, ARM SVE allows vector sizes
| larger than 128 bits -- e.g., Graviton3 has 256-bit SVE
| and Fujitsu A64FX has 512-bit.
| Liquid_Fire wrote:
| I think you might be confusing the tracking of validity
| of capabilities themselves (which could indeed be at a 16
| byte granularity for an otherwise 64-bit system) with the
| bounds of a capability, which can be as small as 1 byte.
| dtx1 wrote:
| > AMD have released an microcode update for affected processors.
| Your BIOS or Operating System vendor may already have an update
| available that includes it.
|
| Yes, I love flashing BIOS...
|
| _edit_ nvm, Microcode can get updated via system updates.
| Night_Thastus wrote:
| To be fair, flashing the bios isn't nearly as bad on most
| modern systems.
|
| Put the file on a USB drive, plug it in, restart and go into
| the bios, look for the flashing utility, select the file, done.
| As long as the machine is on a UPS in case of disaster,
| everything's accounted for.
| dtx1 wrote:
| From my experience: Better have a > 32gig USB flash drive,
| everything else doesn't work (MSI) and I don't have a UPS so
| it's always quite an exciting experience. Especially since
| motherboard manufacturers save almost a whole dollar by not
| having a display outputting anything. So it's blinkenlights
| and hope for the best.
| Night_Thastus wrote:
| Get a UPS!!
|
| Not just for convenience, but safety. You don't want to be
| caught out when something goes wrong, even without flashing
| the bios.
|
| A lot of boards these days have 7-segment displays. They're
| not great, but they're a good step up. Don't need to spend
| a lot, I think they show up on $300-ish boards. Mine
| definitely does.
| dtx1 wrote:
| I see no need for it. Living in Germany, any kind of
| power outages are exceptionally rare. I remember one in
| the last 10 years for a few hours and that was very
| local. If I am in a situation where a power outage
| occurs, I'll listen to my battery radio for a while and
| be fine. I work on nothing and rely on nothing that would
| actually require a UPS.
| Night_Thastus wrote:
| It's like not having a backup drive. Everything is fine
| until one day it isn't.
|
| A good UPS does more than just protect from outages. It
| also protects from surges and low-voltage situations that
| can both damage the equipment severely.
|
| A UPS doesn't cost much and will last many years. Buying
| a new motherboard and GPU because they got fried is much
| more expensive.
| baq wrote:
| Sometimes there's even a backup BIOS die available, so yeah,
| bricking is now much harder than in the past.
| ls612 wrote:
| My new computer takes a while to POST (z690 with ddr5 smh) so
| it's basically been continuously either on or in sleep since
| I built it 18 months ago and I've had an unexpected shutdown
| due to power loss once in that time according to the Event
| log. I think the risk of losing power while flashing the bios
| is very small in real life unless you are stuck in a place
| with third world electricity infrastructure.
| formerly_proven wrote:
| If POST takes a long time it's often memory training,
| backing off on the timings just slightly might make it go a
| lot quicker. Bios updates also often twiddle knobs in this
| area.
| ls612 wrote:
| It isn't this, it takes about a minute to train after a
| bios update or when I enable XMP but never trains after
| that. It just takes like 20-30 seconds to get all the way
| to the bios splash screen and only 5 seconds to return
| from sleep so I just use sleep instead of turning it off.
| Then the only time I need to wait through a boot is for
| windows updates.
| Night_Thastus wrote:
| Doesn't that only happen on the first boot with new
| memory? As well, I thought it was more of a concern on
| AMD, and less on Intel. (Z690 is Intel)
| yyyk wrote:
| Often the BIOS will allow reading the update file from EFI
| partition, so there's no need for the USB drive.
| astrange wrote:
| Mine loses its settings when you update the BIOS, so your fan
| curves go away.
| deaddodo wrote:
| Microcode updates haven't been managed in-BIOS for over a
| decade now. If you use Linux, you'll usually see them released
| as some package like "intel-microcode" or "amd-microcode".
|
| Even EFI updates rarely are very intrusive or dangerous, and
| can also be handled by the Operating System via an update.
| mrpippy wrote:
| They are managed both ways. I think updating in BIOS is
| preferable to ensure no CPU parameters change after (some
| part of) the kernel has already initialized.
|
| But of course BIOS updates have many downsides and often stop
| after a few years.
| dTP90pN wrote:
| > AMD have released an microcode update for affected processors.
|
| I don't think that is correct. AMD has released a microcode
| update[0] for family 17h models 0x31 and 0xa0, which corresponds
| to Rome, Castle Peak and Mendocino as per WikiChip [1].
|
| So far, there seems to be no microcode update for Renoir, Grey
| Hawk, Lucienne, Matisse and Van Gogh. Fortunately, the newly
| released kernels can and do simply set the chicken bit for those.
| [2]
|
| [0]
| https://git.kernel.org/pub/scm/linux/kernel/git/firmware/lin...
|
| [1] https://en.wikichip.org/wiki/amd/cpuid#Family_23_.2817h.29
|
| [2]
| https://github.com/torvalds/linux/commit/522b1d69219d8f08317...
| dTP90pN wrote:
| More details:
|
| `good_revs` as per the kernel:
| https://github.com/torvalds/linux/commit/522b1d69219d8f08317...
|
| Currently published revs ("Patch") (git HEAD):
|
| https://git.kernel.org/pub/scm/linux/kernel/git/firmware/lin...
|
| As of this writing, only two of the five `good_rev`s have been
| published.
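|
| A hedged sketch of how one might compare a machine against such a
| list, assuming Linux's /proc/cpuinfo format ("cpu family", "model"
| and "microcode" fields). The revision numbers in the example are
| placeholders, not the kernel's real good_revs values, and unlike
| the kernel (which matches model ranges) this simplification looks
| up one exact model.

```python
def parse_cpuinfo(text: str) -> dict:
    """Extract family, model and microcode revision of the first CPU entry."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the first processor block
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return {
        "family": int(fields["cpu family"]),
        "model": int(fields["model"]),
        "microcode": int(fields["microcode"], 16),
    }


def has_fixed_microcode(cpu: dict, good_revs: dict) -> bool:
    """good_revs maps (family, model) to the minimum fixed microcode revision."""
    minimum = good_revs.get((cpu["family"], cpu["model"]))
    return minimum is not None and cpu["microcode"] >= minimum


if __name__ == "__main__":
    # Family 23 (17h), model 49 (0x31); 0x1000 is a placeholder revision.
    placeholder_revs = {(23, 49): 0x1000}
    with open("/proc/cpuinfo") as f:
        cpu = parse_cpuinfo(f.read())
    print(cpu, has_fixed_microcode(cpu, placeholder_revs))
```

| A model missing from the table is reported as not fixed, which is
| the conservative answer when no microcode has been published.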
| cratermoon wrote:
| This link seems hugged to death, so here's an alternate source:
| AMD 'Zenbleed' Bug Allows Data Theft From Zen 2 Processors,
| Patches Coming:
| <https://www.tomshardware.com/news/zenbleed-bug-allows-data-t...>
| ItsTotallyOn wrote:
| This story has comments from AMD, too.
| [deleted]
| nemetroid wrote:
| The README in the tar file with the exploit (linked at "If you
| want to test the exploit, the code is available here") contains
| some more details, including a timeline:
|
| - `2023-05-09` A component of our CPU validation pipeline
| generates an anomalous result.
|
| - `2023-05-12` We successfully isolate and reproduce the issue.
| Investigation continues.
|
| - `2023-05-14` We are now aware of the scope and severity of the
| issue.
|
| - `2023-05-15` We draft a brief status report and share our
| findings with AMD PSIRT.
|
| - `2023-05-17` AMD acknowledge our report and confirm they can
| reproduce the issue.
|
| - `2023-05-17` We complete development of a reliable PoC and
| share it with AMD.
|
| - `2023-05-19` We begin to notify major kernel and hypervisor
| vendors.
|
| - `2023-05-23` We receive a beta microcode update for Rome from
| AMD.
|
| - `2023-05-24` We confirm the update fixes the issue and notify
| AMD.
|
| - `2023-05-30` AMD inform us they have sent a SN (security
| notice) to partners.
|
| - `2023-06-12` Meeting with AMD to discuss status and details.
|
| - `2023-07-20` AMD unexpectedly publish patches, earlier than an
| agreed embargo date.
|
| - `2023-07-21` As the fix is now public, we propose privately
| notifying major distributions that they should begin preparing
| updated firmware packages.
|
| - `2023-07-24` Public disclosure.
| sedatk wrote:
| > AMD unexpectedly publish patches, earlier than an agreed
| embargo date.
|
| > As the fix is now public, we propose privately notifying
| major distributions that they should begin preparing updated
| firmware packages.
|
| AMD had to drop the ball somewhere didn't it.
| klyrs wrote:
| It's _good_ that they published patches early, isn't it?
| robryk wrote:
| You'd want the delay between first publication of X and the
| microcode update making its way into releases of OSes to be
| smallest, for various values of X (mention of a
| vulnerability, microcode patch, description of
| vulnerability, PoC). Making various OS releasers aware that
| a microcode patch that fixes a vulnerability will be
| published on a given date before that date decreases that
| for most values of X.
| taviso wrote:
| Yes. It was unexpected, but good. Not a complaint.
| sedatk wrote:
| Uh, okay. I thought the embargo date was set so you could
| have enough time to inform the distros. Not the case,
| then.
| [deleted]
| LtdJorge wrote:
| This is as cool as it is scary. I managed to "exfiltrate"
| pieces of my Bitwarden password (could easily be reconstructed),
| ssh login password, and bank credentials in a minute of running
| from a 10MB sample.
| causi wrote:
| _AMD Ryzen 5000 Series Processors with Radeon Graphics_
|
| Does this mean Ryzen CPUs without integrated graphics are fine?
| formerly_proven wrote:
| No this means AMD's numbering scheme is intentionally obtuse.
| This has nothing to do with graphics, but with the CPU core,
| Zen 2.
| lgl wrote:
| The only 5000-series CPUs that are still using the Zen 2
| architecture are apparently the 5300U, 5500U and 5700U, which
| all use socket FP6 (mobile/embedded).
|
| So I'm guessing it shouldn't affect any of the more recent and
| very popular Zen 3 CPUs like the 5600, 5700 etc. I personally
| own a 5600, which is great bang for the buck.
| paulmd wrote:
| Lucienne (5700U/5500U/5300U) are the only Zen2s in the 5000
| series at present (afaik), but AMD continues to re-use the
| Zen2 architecture in the 7000 series (7520U, etc), as well as
| many semicustom products like Steam Deck.
|
| It's in rather a sweet-spot as far as performance-power-area,
| so this isn't entirely a bad thing. Zen3's main innovation
| was unifying the CCXs/caches, but if you only have a 4C, or
| you want to be able to power-gate a CCX (and its attendant IF
| links/caches) down entirely, Zen2 does that better, and it's
| slightly smaller. We'll be seeing Zen2 products for years to
| come, most likely.
| gruez wrote:
| No, it's all Zen 2 CPUs, which include desktop CPUs (with or
| without integrated graphics), laptop CPUs, and server CPUs.
| The reason why the product list is so confusing is that AMD
| reuses architectures across generations. You'd think that all
| Ryzen 5000 series CPUs have the same microarchitecture, but
| they don't. It's much easier to consult this list instead:
| https://en.wikipedia.org/wiki/Zen_2#Products
| paulmd wrote:
| FYI this list isn't exhaustive. And I went to recommend the
| WikiChip link and it's not exhaustive either.
|
| https://en.wikichip.org/wiki/amd/microarchitectures/zen_2#Al.
| ..
|
| Both of them are missing the newer 7000-family products with
| Zen2 like 7520U etc.
|
| https://www.amd.com/en/products/apu/amd-ryzen-5-7520u
|
| https://www.amd.com/en/products/apu/amd-ryzen-3-7320u
|
| https://www.amd.com/en/products/apu/amd-athlon-gold-7220u
| tremon wrote:
| _products/apu/amd-athlon_
|
| Wait... now there are also APUs under the AMD Athlon brand?
| I know that people are happy when AMD's product offerings
| are on par with or outperforming Intel, but they didn't have to
| outdo Intel in the consumer confusion arena as well.
| paulmd wrote:
| Has been for a while.
|
| https://www.techpowerup.com/cpu-specs/athlon-200ge.c2073
|
| Intel also used the Pentium branding for low-end
| processors (below i3 and in the Atom lineup), and
| followed it up with the rather perplexing move of using
| their company name as the sole branding for their worst
| products ("Intel Processor").
| neogodless wrote:
| The 7520U and 7530U are listed on the linked Wikipedia
| page. Look under "Ultra-mobile APUs".
|
| The Athlon is missing, though.
| lopkeny12ko wrote:
| Relevant snippet:
|
| This technique is CVE-2023-20593 and it works on all Zen 2 class
| processors, which includes at least the following products:
|
|   AMD Ryzen 3000 Series Processors
|   AMD Ryzen PRO 3000 Series Processors
|   AMD Ryzen Threadripper 3000 Series Processors
|   AMD Ryzen 4000 Series Processors with Radeon Graphics
|   AMD Ryzen PRO 4000 Series Processors
|   AMD Ryzen 5000 Series Processors with Radeon Graphics
|   AMD Ryzen 7020 Series Processors with Radeon Graphics
|   AMD EPYC "Rome" Processors
| kevin_thibedeau wrote:
| FYI, Ryzen 3000 APUs aren't Zen 2.
| neogodless wrote:
| > AMD Ryzen 3000 Series Processors
|
| The above are desktop. If they meant APUs, it would list
| "Ryzen 3000 Series Processors with Radeon Graphics."
| timw4mail wrote:
| They are Zen+, aren't they?
| justinclift wrote:
| Whew, my 5600X looks like it avoided this one too. :)
| tremon wrote:
| Do they mean "only confirmed on Zen2", or is the problem
| definitely confined to only this architecture?
|
| Is it likely that this same technique (or similar) also works
| on earlier (Zen/Zen+) or later (Zen3) cores, but they just
| haven't been able to demonstrate it yet?
| Arnavion wrote:
| Doesn't repro on 2920x (Zen+).
| rincebrain wrote:
| At least the stock exploit code he provided said "nope I
| can't get shit to leak" on my 5900X.
| zacmps wrote:
| I tested on a Zen 3 Epyc and wasn't able to get the POC to
| work, so I think it probably is just Zen 2.
| paulmd wrote:
| It's Tavis Ormandy, and he reported it to AMD, so _one would
| assume_ they tried it on related hardware and it's not
| working.
| ye-olde-sysrq wrote:
| So are Ryzen 5000's without Radeon not vulnerable? I guess said
| processors are zen 3?
|
| I have an "AMD Ryzen 9 5950x Desktop Processor" which appears
| to be Zen 3. I think I'm good?
|
| (Not that I'm running untrusted workloads, but yknow, fortune
| favors the prepared)
| Tuna-Fish wrote:
| You are likely frequently running untrusted workloads. As
| javascript in a browser. I don't know about this one, but at
| least meltdown was fully exploitable from js.
|
| But yes, you are fine, 5950x is Zen3.
| anarazel wrote:
| I wish Firefox would use PR_SCHED_CORE to reduce the
| likelihood of such leakage...
| CameronNemo wrote:
| I was under the impression that 5600g and 5600u were Zen3,
| but being the APU models they have Radeon graphics.
|
| Anecdotally, I tried to reproduce on my 5600g but couldn't.
| Which is surprising because they claim it works on 5700u...
|
| Edit: just discovered that while my 5600g is Zen3, the
| 5700u is Zen2. Lol.
| eugene3306 wrote:
| and how about playstation 5 ?
|
| and also xbox and that thing from valve?
| javajosh wrote:
| I mean, the PS5 is running a Zen 2 processor [0] so I would
| assume it's vulnerable. In general I would assume that AAA
| games are safe. Websites and smaller games made by
| malefactors will be the issue. (Note that AAA game makers
| have little interest in antagonizing the audience, OTOH they
| also will push limits to install anti-cheat mechanisms. On
| balance I'd trust them.)
|
| 0 - https://blog.playstation.com/2020/03/18/unveiling-new-
| detail...
| darkwater wrote:
| I think the interesting point here might be one could be
| able to extract some secret from memory of a PS5, like to
| break some kind of encryption
| tracker1 wrote:
| Interesting, could well be a path to jailbreaking the
| PS5... although, not sure if that has or hasn't already
| happened. For XBox Series, you can just use dev mode in
| the first place.
| FirmwareBurner wrote:
| What valuable secrets do people have on their PS5/Xbox?
| You also need a way to deploy the malicious payload on
| those platforms which, due to their closed nature, is
| very difficult to do.
| kmeisthax wrote:
| The valuable secret here would be the keys that let you
| decrypt and copy games. The threat models of locked-down
| platforms are incredibly strange.
| FirmwareBurner wrote:
| That's a good point but I can't believe that every
| console doesn't have its own unique set of keys so that
| if you compromise one before SW patches land, it won't be
| much use in the ecosystem.
| kmeisthax wrote:
| It depends. I'm going to speak in general terms, since I
| obviously don't know how every single system works, but
| per-console keys are used for pairing system storage to
| the motherboard and _maybe_ keeping save data from being
| copied from user to user. Most CDNs don't really provide
| the option for on-the-fly per user encryption, so instead
| you serve up games encrypted with title keys and then
| issue each console a title key that's encrypted with a
| per-console key. Disc games need to be encrypted with
| keys that every system already has, otherwise you can't
| actually use the disc to play the game.
|
| As for the value of being able to do 'hero attacks' on
| game consoles, let me point out that once you have a
| cleartext dump of a game, you've already done most of the
| work. The Xbox 360 was actually very well secured, to the
| point where it was easier to hack a disc drive to inject
| fake authentication data into a normal DVD-R than to
| actually hack a 360's CPU to run copied games. That's why
| we didn't have widely-accessible homebrew on that
| platform for the longest time. Furthermore, you can make
| emulators that just don't care about authenticating media
| (because why would they) and run cleartext games on
| those.
| javajosh wrote:
| Oh, I can imagine lots of uses for a bevy of PS5's,
| assuming you can gain remote control. What do you do with
| a botnet? What do you do with a botnet with a pretty good
| GPU? What do you do with an always-on microphone in
| people's living rooms?
| AdmiralAsshat wrote:
| At least with the PS3, I seem to recall that I couldn't
| extract any of my games' save data from the hard-drive of
| my PS3 unit that went dead due to RROD (or was it YLOD?)
| because the hard-drive was encrypted using the PS3's
| serial key as part of the encryption.
|
| I don't know if that mechanism persists into the PS4/PS5.
| winrid wrote:
| Looks like my 2700x narrowly misses this one, assuming 7020
| series is affected and not 7000 series.
| loeg wrote:
| Yeah -- Ryzen 2700x is Zen+, not Zen 2. Current understanding
| is that Zen+ is not affected.
| _flux wrote:
| The wording "at least" suggests the list might not be
| exhaustive.
| blinkingled wrote:
| On my Zen2 / Renoir based system the PoC exploit continues to
| work albeit slowly even after updating the microcode (linked from
| TFA) that has the fix for this issue. The wrmsr stops it fully in
| its tracks.
|
| Edit: just realized it must have been that the initramfs image is
| not updated with the manually updated firmware in /lib/firmware.
|
| Edit2: Updated the initramfs and even if the benchmark.sh fails,
| ./zenbleed -v2 still picks out and prints strings which doesn't
| happen with the wrmsr solution.
| johnp_ wrote:
| linux-firmware does not carry any microcode update for Renoir
| (yet). Or what do you mean by "TFA"?
|
| The fixed Renoir microcode should have revision >= 0x0860010b
| as per the kernel:
| https://github.com/torvalds/linux/commit/522b1d69219d8f08317...
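|
| A minimal sketch of checking that threshold yourself, assuming a
| Linux x86 system whose /proc/cpuinfo carries a "microcode" line per
| logical CPU. The helper names are hypothetical, and 0x0860010b is
| only the Renoir value quoted above; other Zen 2 models have their
| own fixed revisions, so the threshold is left as a parameter.

```python
import re

# Threshold quoted in the comment above; it applies to Renoir only.
RENOIR_FIXED_REVISION = 0x0860010B


def microcode_revisions(cpuinfo_text):
    """Return the microcode revision of every logical CPU as integers."""
    matches = re.findall(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return [int(m, 16) for m in matches]


def is_patched(cpuinfo_text, fixed_revision=RENOIR_FIXED_REVISION):
    """True only if revisions were found and all meet the threshold."""
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and all(r >= fixed_revision for r in revisions)
```

| To use it, read /proc/cpuinfo into a string and pass it to
| is_patched(). If the revision only changes after a reboot, the
| initramfs probably still carries the old microcode.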
| href wrote:
| Can anyone explain the `wrmsr -a 0xc0011029 $(($(rdmsr -c
| 0xc0011029) | (1<<9)))`? It seems to help on my system, but I
| don't understand what it does, and I don't know how to unset it.
| taviso wrote:
| An MSR is a "model specific register"; a chicken bit in one can
| configure CPU features.
|
| They don't persist across a reboot, so you can't break
| anything. You can undo what you just did without a reboot, just
| use `... & ~(1 << 9)` instead (unset the bit instead of set
| it).
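|
| The set/unset arithmetic can be sketched as below. This only
| illustrates the bit operations on a plain integer; it does not touch
| any MSR, and the function names are made up for the example.

```python
# Bit 9, matching the (1<<9) in the command quoted above.
CHICKEN_BIT = 1 << 9


def set_bit(value, mask=CHICKEN_BIT):
    """OR the mask in: the chosen bit becomes 1, all others are kept."""
    return value | mask


def clear_bit(value, mask=CHICKEN_BIT):
    """AND with the inverted mask: the chosen bit becomes 0, others kept."""
    return value & ~mask
```

| In shell terms, set_bit corresponds to the `| (1<<9)` in the
| original command and clear_bit to the `& ~(1 << 9)` variant.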
| mmastrac wrote:
| This sets the chicken bit:
| https://www.phoronix.com/news/Linux-AMD-Spectral-Chicken
| mike_hearn wrote:
| CPU designers know that some features are risky. Much like how
| web apps may often have "feature flags" that can be flipped on
| and off by operators in case a feature goes wrong, CPUs have
| "chicken bits" that control various performance enhancing
| tricks and exotic instructions. By flipping that bit you
| disable the optimization.
| HideousKojima wrote:
| [flagged]
| jrmg wrote:
| _AMD have released an microcode update for affected processors.
| Your BIOS or Operating System vendor may already have an update
| available that includes it._
|
| I don't really understand how CPU microcode updates work. If I'm
| keeping Ubuntu up to date, will this just happen automatically?
| naikrovek wrote:
| no.
|
| microcode changes are provided to the CPU at boot time and are
| only valid early in the boot process. the machine UEFI/BIOS
| must apply them.
| tremon wrote:
| Linux can (and does) apply microcode patches during kernel
| boot.
| kzrdude wrote:
| for example use journalctl -k -g microcode to see log
| messages related to this: (intel cpu, so revision does not
| relate to anything AMD)
|
| > microcode: microcode updated early to revision 0xa6, date
| = 2022-06-28
| sdht0 wrote:
| https://www.cyberciti.biz/faq/install-update-intel-microcode...
| tremon wrote:
| If you already have the package amd64-microcode installed
| (highly likely), then yes it will be updated automatically.
|
| https://packages.ubuntu.com/search?keywords=amd64-microcode
| jrmg wrote:
| Great, thanks.
|
| Sort of weirds me out that my OS can just silently update my
| CPU - I didn't realize I was giving it that level of
| control... I guess it's good vs the alternative of no-one
| actually updating for exploits like this though.
| Thaxll wrote:
| It does not upgrade your CPU, it loads the firmware
| when you boot Linux.
| jrmg wrote:
| That's reassuring, thanks (not sure why you're getting
| downvoted!)
| sp332 wrote:
| _Active microcode updates are stored in volatile memory and
| thus have to be applied during each system boot._
|
| https://wiki.gentoo.org/wiki/Microcode
| loeg wrote:
| As opposed to updating any other piece of software in the
| system directly? The OS has always had full control.
| eric__cartman wrote:
| This is incredibly scary. On my Zen 2 box (Ryzen 3600) logging
| the output of the exploit running as an unprivileged user while
| copying and pasting a string into a text editor in the background
| (I used Kate), resulted in pieces of the string being logged into
| the output of zenbleed. And this is after a few seconds of
| runtime mind you, not even a full minute.
|
| Thankfully the exploit is highly dependent on a specific asm
| routine so exploiting it from JS or WASM in a browser should be
| extremely difficult. Otherwise a nefarious tab left open for
| hours in the background could exfiltrate without an issue.
|
| I'm eagerly waiting for Fedora maintainers to push the new
| microcode so the kernel can update it during the boot process.
| zekica wrote:
| I tried on my Zen 2 box, and the same thing works even when
| the exploit is run in a KVM guest.
| loeg wrote:
| > Thankfully the exploit is highly dependent on a specific asm
| routine so exploiting it from JS or WASM in a browser should be
| extremely difficult. Otherwise a nefarious tab left open for
| hours in the background could exfiltrate without an issue.
|
| At least one commenter here claims to be able to reproduce this
| with javascript: https://news.ycombinator.com/item?id=36849767
| .
| IshKebab wrote:
| A very bold claim with zero evidence.
| saagarjha wrote:
| What about it is very bold? The instruction sequence
| mentioned seems pretty reasonable and not at all out of the
| question for a JavaScript JIT to generate.
| kludge41 wrote:
| How do you build the POC? I get "No such file or directory" and
| error 127 on Ubuntu.
| eric__cartman wrote:
| I had to run make on the uncompressed folder. Perhaps the
| build-essential package doesn't come with NASM in Ubuntu?
| I'll need a bit more info on the error if you want me to try
| and help you :)
| kludge41 wrote:
| After extracting the POC and installing build-essential, I
| still get this:
|
|   nasm -O0 -felf64 -o zenleak.o zenleak.asm
|   make: nasm: No such file or directory
|   make: *** [Makefile:11: zenleak.o] Error 127
| eric__cartman wrote:
| Install the nasm package. It's probably not included in
| build-essential.
| kludge41 wrote:
| Thank you. I guess I should've read the error better, but
| I thought nasm was the thing complaining.
| hprotagonist wrote:
| ah, not the color theme. Hamming distance strikes again!
|
| https://kippura.org/zenburnpage
| heywhatupboys wrote:
| I knew there were color schemes for the color blind.
|
| Schemes for the blind are news to me though
| sedatk wrote:
| I didn't expect it to as it's Zen3, but still tried: doesn't
| repro on my 5950X.
| [deleted]
| 0xbadcafebee wrote:
| This is super cool. This exploit will be one of the canonical
| examples that just running something in a VM does not mean it's
| safe. We've always known about VM breakout, but this is a no-
| breakout massive exploit that is simple to execute and gives big
| payoffs.
|
| Remember: just because this one bug gets fixed in microcode
| doesn't mean there's not another one of these waiting to be
| discovered. Many (most?) 0-days are known about by black-hats-
| for-hire well before they're made public.
|
| CPU vulnerabilities found in the past few years:
| https://en.wikipedia.org/wiki/Meltdown_(security_vulnerability)
| https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)
| https://aepicleak.com/
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#SGAxe
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#LVI
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#Plundervolt
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#MicroScope_replay_attack
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#Enclave_attack
| https://en.wikipedia.org/wiki/Software_Guard_Extensions#Prime+Probe_attack
| https://www.vusec.net/projects/crosstalk/
| https://en.wikipedia.org/wiki/Hertzbleed
| https://www.securityweek.com/amd-processors-expose-sensitive-data-new-squip-attack/
| zamadatix wrote:
| In the case of the VM won't registers be wiped when
| entering/exiting the VM?
| loeg wrote:
| The problem is the freed entries in the register file. A VM
| can, at least, use this bug to read registers from a non-VM
| thread running on the adjacent SMT/HT of a single physical
| core. I suspect a VM could also read registers from other
| processes scheduled on the same SMT/HT.
| astrange wrote:
| Are people running multiple untrusted VMs without turning
| SMT off? Even letting them share caches seems like asking
| for trouble.
| bbojan wrote:
| The fine article states that simply turning off SMT
| doesn't help with this particular exploit.
| zamadatix wrote:
| In the context of this conversation, SMT on/off is
| relevant to what scope the vulnerability has with VMs,
| beyond the claim in the article that the issue is in some
| way present inside VMs.
| Astronaut3315 wrote:
| This specific CVE still applies even if SMT is off, per
| the article.
| zamadatix wrote:
| In the context of this conversation, SMT on/off is
| relevant to what scope the vulnerability has with VMs,
| beyond the claim in the article that the issue is in some
| way present inside VMs.
| jeroenhd wrote:
| Not only do people do this, it's generally how VPS
| providers work. Most machines barely use the CPU most of
| the time (web servers etc.) so reserving a full CPU core
| for a VPS is horribly inefficient. It doesn't matter
| anyway, because SMT isn't relevant for this particular
| bug.
|
| With SMT allowing twice the cores on a CPU for most
| workloads, disabling it would double the cost for most
| providers!
|
| There are VPS providers that will let you rent dedicated
| CPU cores, but they often cost 4-5x more than a normal
| virtual CPU. Overprovisioning is how virtual servers are
| available for cheap!
| zamadatix wrote:
| SMT is relevant in the VM case of this bug because it
| determines whether this bug is restricted to data outside
| the VM or not.
|
| Providers usually won't disable SMT completely, they'd
| run a scheduler which only allows 1 VM to use both SMT
| threads of a core. Ultra cheap VPS providers may still
| find that not worth the pennies though as if you sell a
| majority of single core VPS then the majority of your SMT
| threads are still unavailable even with the scheduler
| approach.
|
| Fully dedicated cores aren't necessarily required because
| in the timesliced case the registers are unloaded and
| reloaded when different VMs are shuffled on and off the
| core. That said, they definitely prevent the cross-vm-
| data-leak case of this bug.
| toast0 wrote:
| > Fully dedicated cores aren't necessarily required
| because in the timesliced case the registers are unloaded
| and reloaded when different VMs are shuffled on and off
| the core. That said, they definitely prevent the cross-
| vm-data-leak case of this bug.
|
| Registers are unloaded and reloaded when different
| processes / threads are scheduled within a running VM
| too. That _should_ protect the register contents, but
| because of this issue, it doesn't, so I don't see why it
| would if it's a hypervisor switching VMs instead of an OS
| switching processes. If you're running a vulnerable
| processor on a vulnerable microcode, it seems like you
| can potentially read things put into the vulnerable
| registers by anything else running on the same physical
| core, regardless of context.
| KeplerBoy wrote:
| Well you don't have to reserve any CPU Cores per VM.
| There's no law saying you can't have more VMs than
| logical cores. They're just processes after all and we
| can have thousands of them.
| jeroenhd wrote:
| Of course not, but the vulnerability works by exploiting
| the shared register file so to mitigate this entire class
| of vulnerabilities, you'd need to dedicate a CPU core and
| as much of its associated cache as possible to a single
| VM.
| loeg wrote:
| Someone, somewhere is, of course. I don't know if the
| hyperscalers do, or not.
| zamadatix wrote:
| Ah, this is a good point for those still using hypervisor
| schedulers which allow mapping different VMs to the same
| core.
| crote wrote:
| The problem is that the _logical_ registers don't have a 1:1
| relation to the _physical_ registers.
|
| For example, let's imagine a toy architecture with two
| registers: r0 and r1. We can create a little assembly snippet
| using them: "r0 = load(addr1); r1 = load(addr2); r0 = r0 +
| r1; store(addr3, r0)". Pretty simple.
|
| Now, what happens if we want to do that _twice_? Well, we get
| something like "r0 = load(addr1); r1 = load(addr2); r0 = r0
| + r1; store(addr3, r0); r0 = load(addr4); r1 = load(addr5);
| r0 = r0 + r1; store(addr6, r0)". Because there is no overlap
| between the accessed memory sections, they are completely
| independent. In theory they could even execute at the same
| time - but that is impossible because they use the same
| registers.
|
| This can be solved by adding more physical registers to the
| CPU, let's call them R0-R6. During execution the CPU can now
| analyze and rewrite the original assembly into "R1 =
| load(addr1); R4 = load(addr4); R2 = load(addr2); R5 =
| load(addr5); R3 = R1 + R2; R6 = R4 + R5; store(addr3, R3);
| store(addr6, R6)". This means we can now start the loads for
| the second addition before the first addition is done, which
| means we have to wait less time for the data to arrive when
| we finally want to actually do the second addition. To the
| user nothing has changed and the results are identical!
|
| The issue here is that when entering/exiting a VM you can
| definitely clear the logical registers r0&r1, but there is no
| guarantee that you are _actually_ clearing the physical
| registers. On a hardware level, "clearing a register" now
| means "mark logical register as empty". The CPU makes sure
| that any future use of that _logical_ register results in it
| behaving _as if_ it has been cleared, but there is no need to
| touch the content of the _physical_ register. It just gets
| marked as "free for use". The only way that physical
| register becomes available again is after a write, after all,
| and that write would by definition overwrite the stale
| content - so clearing it would be pointless. Unless your CPU
| misbehaves and you run into this new bug, of course.
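|
| The renaming idea above can be sketched as a toy model. This is an
| illustration of the general technique only, not Zen 2's actual
| design: the class name, the free-list policy, and the use of None to
| mean "logically zero" are all assumptions made for the example.

```python
class ToyRenamer:
    """Toy register renamer: logical names map onto a physical pool."""

    def __init__(self, logical_names, num_physical):
        self.physical = [0] * num_physical      # physical register contents
        self.free = list(range(num_physical))   # indices available for reuse
        # None means "marked as zero"; no physical register is attached.
        self.mapping = {name: None for name in logical_names}

    def write(self, name, value):
        # Every write takes a fresh physical register (a real CPU would
        # stall if none were free; this toy would raise IndexError).
        new = self.free.pop(0)
        self.physical[new] = value
        old = self.mapping[name]
        if old is not None:
            self.free.append(old)
        self.mapping[name] = new

    def clear(self, name):
        # "Clearing" only drops the mapping. The old physical register is
        # returned to the free list with its contents left untouched.
        old = self.mapping[name]
        if old is not None:
            self.free.append(old)
        self.mapping[name] = None

    def read(self, name):
        idx = self.mapping[name]
        return 0 if idx is None else self.physical[idx]
```

| After write("r0", secret) followed by clear("r0"), read("r0") gives
| 0, yet the secret is still sitting in the physical array until a
| later write happens to reuse that slot. Correct hardware never lets
| software observe that stale value; the bug is a path that does.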
| cmrdporcupine wrote:
| In the end, I'm thinking _most_ of these are related to branch
| prediction?
|
| It strikes me that it's either that branch prediction is so
| inherently complex that it's always going to be vulnerable to
| this _and/or_ it just so defies the way most of us intuitively
| think about code paths / instruction execution that it's hard
| to conceive of the edge cases until too late?
|
| At what point does the complexity of CPU architectures become
| so difficult to reason about that we just accept the
| performance penalty of keeping it simpler?
| c7DJTLrn wrote:
| We demanded more performance and we got what we demanded. I
| doubt manufacturers are going to walk back on branch
| prediction no matter how flawed it is. They'll add some more
| mitigations and features which will be broken-on-arrival.
| Tuna-Fish wrote:
| > At what point does the complexity of CPU architectures
| become so difficult to reason about that we just accept the
| performance penalty of keeping it simpler?
|
| Never for branch prediction. It just gets you too much
| performance. If it becomes too much of a problem, the
| solution is greater isolation of workloads.
| hedgehog wrote:
| In certain cases isolation and simplicity overlap, I
| suspect for example that the dangers of SMT implementation
| complexity are part of why Apple didn't implement it for
| their respective CPUs. Likely we'll see this elsewhere too,
| for example Amazon may not ever push to have SMT in their
| Graviton chips (the early generations are off the shelf
| cores from ARM where they didn't have a readily available
| choice).
| loeg wrote:
| Speculative execution, not branch prediction.
| rcxdude wrote:
| >At what point does the complexity of CPU architectures
| become so difficult to reason about that we just accept the
| performance penalty of keeping it simpler?
|
| Basically never for anything that's at all CPU-bound, that
| growth in complexity is really the only thing that's been
| powering single-threaded CPU performance improvements since
| Dennard scaling stopped in about 2006 (and by that time they
| were already plenty complex: by the late 90s and early 2000's
| x86 CPUs were firmly superscalar, out-of-order, branch-
| predicting and speculative executing devices). If your
| workload can be made fast without needing that stuff (i.e. no
| branches and easily parallelised), you're probably using a
| GPU instead nowadays.
| paulmd wrote:
| More generally, most of them are related to speculative
| execution, where branch mis-prediction is a common gadget to
| induce speculative mis-execution.
|
| Speculation is hard, it's sort of akin to the idea of
| introducing multithreading into a program, you are explicitly
| choosing to tilt at the windmill of pure technical
| correctness because in a highly concurrent application every
| error will occur fairly routinely. Speculation is great too,
| in combination with out-of-order execution it's a
| multithreading-like boon to overall performance, because now
| you can resolve several chunks of code in parallel instead of
| one at a time. It's just also a minefield of correctness
| issues, but the alternative would be losing something like
| the equivalent of 10 years of performance gains (going back
| to like ARM A53 performance).
|
| The recent thing is that "observably correct" needs to
| include timings. If you can just guess at what the data might
| be, and the program runs faster if you're correct, that's
| basically the same thing as reading the data by another
| means. It's a timing oracle attack.
|
| (in this case AMD just fucked up though, there's no timing
| attack, this is just implemented wrong and this instruction
| can speculate against changes that haven't propagated to
| other parts of the pipeline yet)
|
| The cache is the other problem, modern processors are built
| with every tenant sharing this single big L3 cache and it
| turns out that it also needs to be proof against timing
| attacks for data present in the cache too.
| 0cf8612b2e1e wrote:
| If you pin the VM to a different core/CPU, would that do
| anything to mitigate? Or are the OS affinity guarantees not
| that strong?
| saagarjha wrote:
| In this case, it would avoid the exploit, because it
| requires a shared register file.
| c7DJTLrn wrote:
| Running untrusted code whether in a sandbox, container, or VM,
| has not been safe since at least Rowhammer, maybe before. I
| believe a lot of these exploits are down to software and
| hardware people not talking. Software people make assumptions
| about the isolation guarantees, hardware people don't speak up
| when said assumptions are made.
| saagarjha wrote:
| That is not true in this case. It's just a CPU bug; not even
| a side channel.
| Bluecobra wrote:
| Yup! I worked at a few companies that would co-mingle Internet
| facing/DMZ VMs with internal VMs. When pointing this out and
| recommending we should airgap these VMs onto their own dedicated
| hypervisor it always fell on deaf ears. Joke's on them I guess.
| Kwpolska wrote:
| I'm pretty sure AWS/Azure/GCP don't assign separate boxes to
| every customer, and somehow they're fine.
| Bluecobra wrote:
| Good point, I should have clarified that I was talking
| about on-prem VMs e.g. VMWare.
| yencabulator wrote:
| You can pay AWS a premium to make sure you're the only
| tenant on the physical machine. You can also split your own
| stuff into multiple tenants, and keep those separate too.
| nicolas_17 wrote:
| At which point you don't really need the flexibility of
| AWS and you might as well get a Dedicated Server
| elsewhere?
| yencabulator wrote:
| It'll still let you do the elastic scaling stuff, billing
| for actual usage instead of racked hardware.
| phendrenad2 wrote:
| The problem is, VMs aren't really "Virtual Machines" anymore.
| You're not parsing opcodes in a big switch statement, you're
| running instructions on the actual CPU, with a few hardware
| flags that the CPU says will guarantee no data or instruction
| overlap. It promises! But that's a hard promise to make in
| reality.
| msla wrote:
| This is because VM means two different things and has for a
| long time:
|
| IBM's VM was and is a hypervisor. It dates to the mid 1960s,
| in the form of CP-40, and it didn't run opcodes in software,
| but in hardware.
|
| https://en.wikipedia.org/wiki/IBM_CP-40
|
| p-code machines, which interpret bytecode, date back almost
| as far, such as the O-code machine for BCPL.
|
| https://en.wikipedia.org/wiki/BCPL
|
| Getting people to distinguish between these concepts is
| probably a lost cause.
| Joker_vD wrote:
| Looking at IBM's tech from the sixties is somehow
| weirdly depressing: it's unbelievable how much of the
| architectural stuff they'd already invented by 1970.
| meepmorp wrote:
| I remember seeing VMware for the first time and thinking
| that the PC world had finally entered the 1970s.
| nine_k wrote:
| Not depressing, but inspiring. So many great
| architectural ideas can be made accessible to millions of
| consumers, not limited to a few thousand megacorps.
| MuffinFlavored wrote:
| > you're running instructions on the actual CPU
|
| Just how many times is the average operating system workload
| (with or without a virtual machine also running a second
| average operating system workload) context switching a
| second?
|
| Like... unless I'm wrong... the kernel is the main process,
| and then it slices up processes/threads, and each time those
| run, they have their own EAX/EBX/ECX/ESP/EBP/EIP/etc. (I know
| it's RAX, etc. for 64-bit now)
|
| How many cycles is a thread/process given before it context
| switches to the next one? How is it managing all of the
| pushfd/popfd, etc. between them? Is this not how modern
| operating systems work, am I misunderstanding?
| toast0 wrote:
| > How many cycles is a thread/process given before it
| context switches to the next one?
|
| Depends on a lot of things. If it's a compute heavy task,
| and there's no I/O interrupts, the task gets one
| "timeslice", timeslices vary, but typical times are
| somewhere in the neighborhood of 1 ms to 100 ms. If it's an
| I/O heavy task, chances are the task returns from a syscall
| with new data to read (or because a write finished), does a
| little bit of work, then does another syscall with I/O.
| Lots of context switches in network heavy code (io_uring
| seems promising).
|
| > How is it managing all of the pushfd/popfd, etc. between
| them?
|
| The basic plan is when the kernel takes an interrupt (or
| gets a syscall, which is an interrupt on some systems and
| other mechanisms on others), the kernel (or the cpu) loads
| the kernel stack pointer for the current thread, then it
| pushes all the (relevant) cpu registers onto the stack,
| then the kernel business is taken care of, the scheduler
| decides which userspace thread to return to (which might be
| the same one that was interrupted or not), the destination
| thread's kernel stack is switched to, registers are popped,
| then the thread's userspace stack is switched to, then
| userspace execution resumes.
| saagarjha wrote:
| Usually a few hundred to a few thousand times a second.
| trebligdivad wrote:
| The comparisons to Meltdown/Spectre are a bit misleading though
| - they were a whole new form of attack based on timing where
| the CPU did exactly what it should have done; this Zenbleed
| case is a good old fashioned bug - data in a register
| that shouldn't be.
| stcredzero wrote:
| _this is a no-breakout massive exploit that is simple to
| execute and gives big payoffs_
|
| Wouldn't we be able to avoid the "big payoffs" of no-breakout
| exploits if we had specialized hardware handle the secrets?
___________________________________________________________________
(page generated 2023-07-24 23:00 UTC)