[HN Gopher] Reptar
       ___________________________________________________________________
        
       Reptar
        
       Author : abhi9u
       Score  : 193 points
       Date   : 2023-11-14 17:49 UTC (5 hours ago)
        
 (HTM) web link (lock.cmpxchg8b.com)
 (TXT) w3m dump (lock.cmpxchg8b.com)
        
       | saagarjha wrote:
       | See also Intel's advisory, which has a description of impact:
       | https://www.intel.com/content/www/us/en/security-center/advi...
       | 
       | > Sequence of processor instructions leads to unexpected behavior
       | for some Intel(R) Processors may allow an authenticated user to
       | potentially enable escalation of privilege and/or information
       | disclosure and/or denial of service via local access.
        
       | tedunangst wrote:
       | Their diagnosis reminds me of what happened when qemu ran into
       | repz ret. https://repzret.org/p/repzret/
        
       | Lammy wrote:
       | > the processor would begin to report machine check exceptions
       | and halt.
       | 
       | I get it https://www.youtube.com/watch?v=dXekDCcw2FE
        
       | doublerabbit wrote:
       | Any reason why it's named after the dinosaur from the cartoon
       | Rugrats? Or was that what was on TV at the time?
       | 
       | Maybe I should start hacking while watching Teenage Mutant Ninja
       | Turtles.
        
         | 2OEH8eoCRo0 wrote:
         | rep is an assembly instruction prefix
        
         | Blackthorn wrote:
         | I think from the memey line "Halt! I am Reptar!" Plus the rep
         | prefix
        
         | AdmiralAsshat wrote:
         | If you discover a major processor vulnerability and wanna name
         | it Shredder/Krang/Bebop/Rocksteady, I feel like you will have
         | earned that right!
        
       | xyst wrote:
       | Reading this makes me realize how little I know of the hardware
       | that runs my software
       | 
       | > Prefixes allow you to change how instructions behave by
       | enabling or disabling features
       | 
       | Why do we need "prefixes" to disable or enable features? Is this
       | for dynamically toggling feature so you don't have to go into
       | BIOS?
        
         | jeffbee wrote:
         | It's just because x86 as an ISA has accreted over the course of
         | 40+ years, and has variable-length instructions. Every time
         | they extend the ISA they carve out part of the opcode space to
         | squeeze in a new prefix. This will only continue, considering
         | that Intel has proposed another new scheme this year.
        
         | shenberg wrote:
         | Prefixes are modifiers to specific instructions executed by the
         | processor, e.g. to control the size of the operands or enable
         | locking for concurrency.
        
         | Tuna-Fish wrote:
         | x86 was designed in 78, basically for the purpose of running a
         | primitive laser printer (or other similar workloads). The big
         | problem with this is that the encoding space for instructions
         | was "efficiently utilized". When new instructions, or worse,
         | additional registers were later added, you had to fit the new
         | instruction variants in somehow, and you did this by tacking on
         | prefixes.
        
           | mschuster91 wrote:
           | Nah, x86 goes even earlier in its heritage - it was,
           | effectively, a bolt-on on Intel's way older designs, as a
           | huge part of the 8086 was being ASM source-compatible with
           | the older 8xxx chips, even as the instruction set itself
           | changed [1]. What utterly amazes me is that the original 8086
           | was mostly designed _by hand_ by a team of not even two dozen
           | people - and today, we got hundreds if not thousands of
           | people working on designing ASICs...
           | 
           | [1] https://en.wikipedia.org/wiki/Intel_8086#The_first_x86_de
           | sig...
        
         | db48x wrote:
         | Read
         | https://wiki.osdev.org/X86-64_Instruction_Encoding#Legacy_Pr...
         | 
         | The REP prefixes are the most common; they just let you perform
         | the same instruction a variable number of times. It looks in
         | the CX register for the count. This makes many common loops
         | really, really short, especially for moving objects around in
         | memory. The memcpy function is often inlined as a single REP
         | MOVS instruction, possibly with an instruction to copy the
         | count into CX if it isn't already there.
         | 
         | I suppose the REX (operand size) prefix is pretty common too,
         | since 64-bit programs will want to operate on 64-bit values and
         | addresses pretty frequently.
         | 
         | None of the prefixes toggle things that can be set globally, by
         | the BIOS or otherwise. They all just specify things that the
         | next instruction needs to do.
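
A minimal sketch of the REP MOVS behavior described above, written as a Python model rather than real assembly. It assumes the byte-sized variant (`rep movsb`) with the direction flag clear; the function name `rep_movsb` and the flat `bytearray` standing in for memory are illustrative choices, not anything from the comment.

```python
def rep_movsb(memory: bytearray, rsi: int, rdi: int, rcx: int):
    """Model `rep movsb` with the direction flag clear.

    Copies one byte from memory[rsi] to memory[rdi], advances both
    pointers, and decrements the count register until it reaches zero.
    Returns the final (rsi, rdi, rcx) register values.
    """
    while rcx != 0:
        memory[rdi] = memory[rsi]  # the MOVSB part: one byte moved
        rsi += 1
        rdi += 1
        rcx -= 1                   # the REP part: count lives in (R)CX
    return rsi, rdi, rcx


# Usage: a 12-byte "memcpy" from offset 0 to offset 12.
mem = bytearray(b"hello, world" + bytes(12))
rsi, rdi, rcx = rep_movsb(mem, rsi=0, rdi=12, rcx=12)
```

After the call, the count register is zero and both pointers sit one past the copied region, which is why a compiler only has to load the count before emitting the single prefixed instruction.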
        
           | pclmulqdq wrote:
           | The ModR/M and SIB prefixes are probably the most common
           | prefixes in instructions. They are so common that assemblers
           | elide their existence when you read code. REX is in the same
           | boat: so common that it's usually elided. The VEX prefix is
           | also really common (all of the V* AVX instructions, like
           | VMOVDQ), and then the LOCK prefix (all atomics).
           | 
           | After all of those, REP is not that uncommon of a prefix to
           | run into, although many people prefer SIMD memcpy/memset to
           | REP MOVSB/REP STOSB. It is slightly unusual.
        
             | bonzini wrote:
             | ModRM and SIB are not a prefix, they're part of the opcode
             | (second and third byte after all the prefixes and the
             | 0Fh/0F38h/0F3Ah opcode map selectors)
        
               | EarlKing wrote:
               | More specifically, they're affixed to _certain_ opcodes
               | that require them. There are a number of byte-sized
               | opcodes that do not require a ModRM or SIB byte (although
               | a number of those got gobbled up to make the REX prefix,
               | but that's another story).
               | 
               | TL;DR Weeee! Intel machine language is crazy!
        
             | EarlKing wrote:
             | There's a good reason for using vector instructions over
             | REP: Until relatively recently that was how you got maximum
             | performance in small, tight loops. REP is making a comeback
             | precisely because of ERMS and FSRM, so unfortunately this
             | will become a bigger problem going forward.
        
         | epcoa wrote:
         | That's a very poor summary of what prefixes are. My advice,
         | just skip the original article which isn't very good or
         | interesting and read taviso's blog that is linked in the top
         | comment (it gives a few concrete examples of these prefixes).
         | They are modifiers that are part of the CPU instruction.
        
         | ajross wrote:
         | "Prefixes" in this case mostly expand the instruction encoding
         | space.
         | 
         | So rarely-used addressing modes get a "segment prefix" that
         | causes them to use a segment other than DS. Or x86_64 added a
         | "REX" prefix that added more bits to the register fields
         | allowing for 16 GPRs. Likewise the "LOCK" prefix (though poorly
         | specified originally) causes (some!) memory operations to be
         | atomic with respect to the rest of the system (c.f. "LOCK
         | CMPXCHG" to effect a compare-and-set).
         | 
         | All these things are operations other CPU architectures
         | represent too, though they tend to pack them into the existing
         | instruction space, requiring more bits to represent every
         | instruction.
         | 
         | Notably the "REP" prefix in question turns out to be the one
         | exception. This is a microcoded repeat prefix left over from
         | the ancient days. But it represents operations (c.f.
         | memset/memmove) that are performance-sensitive even today, so
         | it's worthwhile for CPU vendors to continue to optimize them.
         | Which is how the bug in question seems to have happened.
        
       | rvba wrote:
       | It looks like Intel was cutting corners to be faster than AMD and
       | now all those things come out. How much slower will all those
       | processors be after multiple errata? 10%? 30%? 50%?
       | 
       | In a duopoly market there seems to be no real competition. And
       | yes I know that some (not all) bugs also happen for AMD.
        
         | mschuster91 wrote:
         | > And yes I know that some (not all) bugs also happen for AMD.
         | 
         | Some of these novel side-channel attacks actually even apply in
         | completely unrelated architectures such as ARM [1] or RISC-V
         | [2].
         | 
         | I think the problem is not (just) a lack of competition
         | (although you're right that the duopoly in desktop/laptop/non-
         | cloud servers for x86 brings its own serious issues, I've
         | written and ranted more often than I can count [3]), it rather
         | is that modern CPUs and SoCs have simply become so utterly
         | complex and loaded with decades worth of backwards-
         | compatibility baggage that it is impossible for any single
         | human, even a small team of the best experts you can bring
         | together, to fully grasp every tiny bit of them.
         | 
         | [1] https://www.zdnet.com/article/arm-cpus-impacted-by-rare-
         | side...
         | 
         | [2]
         | https://www.sciencedirect.com/science/article/pii/S004579062...
         | 
         | [3]
         | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
        
           | bobim wrote:
           | So no saving grace from the ISA... humans just lost ground on
           | CPU design, and I suspect the situation will worsen when AI
           | will enter the picture.
        
             | mschuster91 wrote:
             | > and I suspect the situation will worsen when AI will
             | enter the picture.
             | 
             | For now, AI lacks the contextual depth - but an AI that can
             | actually _design_ a CPU from scratch (and not just
             | rehashing prior-art VHDL it has ... learned? somehow), if
             | that happens we'll be at a Cambrian Explosion-style event
             | anyway, and all we can do is stand on the sides, munch
             | popcorn and remember this tiny quote from Star Wars [1].
             | 
             | [1] https://www.youtube.com/watch?v=Xr9s6-tuppI
        
         | arp242 wrote:
         | It's not clear to me this fix will have any performance impact.
         | I strongly suspect it will be negligible or zero.
         | 
         | This seems like a "simple" bug of the type that people write
         | every day, not deep architectural problems like Spectre and the
         | like, which also affected AMD (in roughly equal measure if I
         | recall correctly).
        
           | kmeisthax wrote:
           | Parent commenter might be thinking of Meltdown, a related
           | architectural bug that only bit Intel and IBM PPC. Everything
           | with speculative execution has Spectre[0], but you only have
           | Meltdown if you speculate across _security boundaries_.
           | 
           | The reason why Meltdown has a more dramatic name than
           | Spectre, despite being the same vulnerability, is that
           | hardware privilege boundaries are the only defensible
           | boundary against timing attacks. We already expect context
           | switches to be expensive, so we're allowed to make them a
           | little _more_ expensive. It'd be prohibitively expensive to
           | avoid leaking timing from, say, one executable library to a
           | block of JIT compiled JavaScript code within the same browser
           | content process.
           | 
           | [0] https://randomascii.wordpress.com/2018/01/07/finding-a-
           | cpu-d...
        
         | akoboldfrying wrote:
         | Not sure what other errata you're referring to, but this looks
         | like an off-by-one in the microcode. I would expect the fix to
         | have zero or minimal penalty.
        
       | varispeed wrote:
       | It's going to be a pain for cloud and shared hosting.
       | 
       | Most likely dedicated resources on demand will be the future.
       | Some companies already offer it.
        
       | Flow wrote:
       | Would it be possible to describe a modern CPU in something like TLA+
       | to find all non-electrical problems like these?
        
         | boxfire wrote:
         | There are still bit flipping tricks like rowhammer for RAM, I
         | wouldn't be surprised if there are such vulnerabilities in some
         | CPUs.
        
           | sterlind wrote:
           | Rowhammer is an electrical vulnerability though. PP specified
           | non-electrical vulns.
        
         | sterlind wrote:
         | I've heard Intel does use TLA+ extensively for specifying their
         | designs and verifying their specs. But TLA+ specs are extremely
         | high-level, so they don't capture implementation details that
         | can lead to bugs. And model checking isn't a formal proof, only
         | (tractably small) finite state spaces can be checked with TLC.
         | And even there, you're only checking the invariants you
         | specified.
         | 
         | That said, I'm sure there's some verification framework like
         | SPARK for VHDL, and this feels like exactly the kind of thing
         | it should catch.
        
         | dboreham wrote:
         | Formal methods have been used in CPU design for nearly 40 years
         | [1] but not yet for everything, and the methods tend to not
         | have "round-trip-engineering" properties (e.g. TLA+ is not
         | actually proving validity of the code you will run in
         | production, just your description of its behavior and your idea
         | of exhaustive test cases).
         | 
         | [1] https://www.academia.edu/60937699/The_IMS_T_800_Transputer
        
       | bobim wrote:
       | Is it even possible to design a cpu with out-of-order and
       | speculative execution that would have no security issue? Does the
       | future lead to a swarm of disconnected A55 cores each running a
       | single application?
        
         | SmoothBrain12 wrote:
         | Yes, but they won't clock as fast because they'll be waiting
         | for RAM.
        
           | bobim wrote:
           | We need to keep programs small so they fit in the cache.
        
             | moffkalast wrote:
             | We need 2 GBs of L1 cache, thus solving the cache miss
             | problem once and for all.
        
               | rep_lodsb wrote:
               | 640K should be enough for anyone ;)
        
         | Tuna-Fish wrote:
         | This vulnerability was not caused by OoO or speculative
         | execution. It was caused by the fact that x86 was designed 45
         | years ago, and has had feature after feature piled on the same
         | base, which has never been adequately rebuilt.
         | 
         | The more proximate cause is that some instructions with
         | multiple redundant prefixes (which is legal, but pointless)
         | have their length miscalculated by some Intel CPUs, which
         | results in wrong outcomes.
        
           | epcoa wrote:
           | Not entirely pointless, redundant prefixes are occasionally
           | the useful method for alignment.
        
             | TheCoreh wrote:
             | A more sensible approach for that use-case would be IMO to
             | have well-defined specialized prefixes for padding, instead
             | of relying on the case-by-case behavior of redundant
             | prefixes. (However I understand that there's almost
             | certainly a good historical reason why this was not the way
             | it was done)
        
               | bobim wrote:
               | Are new ISAs solving this? Time to move to RISC-V?
        
               | epcoa wrote:
               | N/A and No.
        
               | dontlaugh wrote:
               | RISC V is not great at this either, with the compression
               | extension being common and variable length.
               | 
               | ARM 64 gets this right, with fixed length 32 bit
               | instructions.
        
               | kccqzy wrote:
               | The easiest way of doing padding is to add a bunch of
               | `nop` instructions which are one byte each.
               | 
               | If you read the manual, Intel encourages minor variations
               | of the `nop` instructions that can be lengthened into
               | different number of bytes (like `nop dword ptr [eax]` or
               | `nop dword ptr [eax + eax*1 + 00000000h]`).
               | 
               | It is never recommended anywhere in my knowledge to rely
               | on redundant prefixes of random non-nop instructions.
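
A minimal sketch of the NOP-based padding described above, building alignment padding from the multi-byte NOP encodings recommended in the Intel manual. The two forms quoted in the comment correspond to the 3-byte and 8-byte entries. The helper names `nop_padding` and `pad_to_alignment` are hypothetical, and the greedy "longest NOP first" strategy is an assumption for illustration, not something the comment prescribes.

```python
# Recommended multi-byte NOP encodings, indexed by length in bytes.
NOPS = {
    1: bytes.fromhex("90"),                  # nop
    2: bytes.fromhex("6690"),                # 66 nop
    3: bytes.fromhex("0f1f00"),              # nop dword ptr [eax]
    4: bytes.fromhex("0f1f4000"),            # nop dword ptr [eax + 00h]
    5: bytes.fromhex("0f1f440000"),          # nop dword ptr [eax + eax*1 + 00h]
    6: bytes.fromhex("660f1f440000"),        # 66 nop word ptr [eax + eax*1 + 00h]
    7: bytes.fromhex("0f1f8000000000"),      # nop dword ptr [eax + 00000000h]
    8: bytes.fromhex("0f1f840000000000"),    # nop dword ptr [eax + eax*1 + 00000000h]
    9: bytes.fromhex("660f1f840000000000"),  # 66 nop word ptr [eax + eax*1 + 00000000h]
}


def nop_padding(n: int) -> bytes:
    """Return n bytes of padding using as few NOP instructions as possible."""
    out = bytearray()
    while n > 0:
        chunk = min(n, 9)  # greedy: take the longest available encoding
        out += NOPS[chunk]
        n -= chunk
    return bytes(out)


def pad_to_alignment(code: bytes, alignment: int) -> bytes:
    """Append NOP padding so that len(result) is a multiple of alignment."""
    gap = -len(code) % alignment
    return code + nop_padding(gap)


# Usage: pad a lone `ret` (0xC3) out to a 16-byte boundary.
padded = pad_to_alignment(b"\xc3", 16)
```

Using a few long NOPs instead of many single-byte ones keeps the number of instructions the front end has to decode small, which is the motivation for the multi-byte forms.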
        
               | epcoa wrote:
               | NOPs are not generally free.
               | 
               | It's a pretty old and well known technique:
               | 
               | https://stackoverflow.com/questions/48046814/what-
               | methods-ca...
               | 
               | Note that this technique is really only legitimate where
               | the used prefix already has defined behavior with the
               | given instruction ("Use of repeat prefixes and/or
               | undefined opcodes with other Intel 64 or IA-32
               | instructions is reserved; such use may cause
               | unpredictable behavior."), and of course the REX prefix
               | has special limitations. The key is redundant, not
               | spurious. It is not a good idea to be doing rep add for
               | example. But otherwise, there is no issue.
        
               | epcoa wrote:
               | The prefixes are _redundant_ so it's not really case-by-
               | case behavior. You're just repeating the prefix you would
               | be using anyway in that location.
               | 
               | Using specialized prefixes wastes encoding space for no
               | real gain. You realize on most common processors NOP
               | itself is a pseudo-instruction? Even the apparently meme-
               | worthy (see sibling comment) RISC-V, it's ADDI x0, x0, 0.
        
               | tedunangst wrote:
               | And then there are CPUs that retcon behavioral changes
               | onto nops.
               | 
               | > Moving a register to itself is functionally a nop, but
               | the processor overloads it to signal information about
               | priority.
               | 
               | https://devblogs.microsoft.com/oldnewthing/20180809-00/?p
               | =99...
        
             | iforgotpassword wrote:
             | Because they cost no cycles, or fewer cycles, compared to NOPs?
        
               | tedunangst wrote:
               | See http://repzret.org/p/repzret/
        
           | gumby wrote:
           | > It was caused by the fact that x86 was designed 45 years
           | ago, and has had feature after feature piled on the same
           | base, _which has never been adequately rebuilt_.
           | 
           | Itanic would like to object! Unfortunately it can't get
           | through the door.
        
         | nextaccountic wrote:
         | I think formal methods could help design such a machine, if
         | you can write a mathematical statement that amounts to "there
         | is no side channel between A and B"
         | 
         | Or at least put a practical bound on how many bits per second
         | at most you can extract from any such side channel (the reasoning
         | being, if you can get at most a bit for each million years, you
         | probably don't have an attack)
         | 
         | Then you verify if a given design meets this constraint
        
           | mgaunard wrote:
           | A program is itself a formal specification of what an
           | algorithm does.
        
           | bobim wrote:
           | What would be the typical size of such a constraint-based
           | problem, and do we have the compute power to translate the
           | rules into an implementation? And what if one forgot a rule
           | somewhere... Deeply interesting subject.
        
             | less_less wrote:
             | I think you'd want it to be a theorem (in Lean, Coq,
             | Isabelle/HOL or whatever) instead of a constraint problem.
             | So it would be more limited by developer effort than by
             | computational power.
             | 
             | Theoretically you can do this from software down to
             | (idealized) gates, but in practice the effort is so great
             | that it's only been done in extremely limited systems.
        
           | tsimionescu wrote:
           | Formal methods are widely used in processor design. It is
           | hard to formalize specs that assert the absence of bugs we
           | haven't thought about, at least while also preserving the
           | property of being a Turing machine.
        
             | nextaccountic wrote:
             | I know. I mean applying formal methods to this specific
             | problem of proving side channels don't exist (which seems a
             | very hard thing to do and might even require to modify the
             | whole design to be amenable to this analysis)
        
               | less_less wrote:
               | As a tidbit, this was part of how one of the teams
               | involved in the original Spectre paper found some of the
               | vulnerabilities. Basically the idea was to design a small
               | CPU that could be formally shown to be free of certain
               | timing attacks. In the process they found a bunch of
               | things that would have to change for the analysis to
               | work... maybe in a small system those wouldn't _actually_
               | lead to vulnerabilities, but they couldn't prove it (or
               | it would require lots of careful analysis). And in big
               | systems, those features do lead to vulnerabilities.
        
         | akoboldfrying wrote:
         | Well, the bug in this specific case (based on the article by
         | Tavis O. linked elsewhere in comments) looks to be the regular
         | kind -- probably an off-by-one in a microcode edge case. That
         | is, here it's _not_ the case that the CPU functions correctly
         | but leaves behind traces of things that should be private in
         | timing side channels, as was the case for Spectre.
        
           | trebligdivad wrote:
           | Yeh just a fun bug rather than anything too fundamental.
           | Still, it is a fun bug.
        
         | JohnBooty wrote:
         | > Does the future lead to a swarm of disconnected A55
         | cores each running a single application?
         | 
         | don't you dare tease me like that
        
           | bobim wrote:
           | And programmed in... Forth!
        
       | tasty_freeze wrote:
       | Benchmarking is always problematic -- what is a good
       | representative workload? All the same, I'd be curious if the
       | ucode update that plugs this bug has affected CPU performance,
       | eg, it diverts the "fast short rep move" path to just use the
       | "bad for short moves but great for long moves" version.
        
         | akoboldfrying wrote:
         | In the article by Tavis O. linked elsewhere in comments, he
         | suggests disabling the FSRM CPU feature _only as an expensive
         | workaround_ to be taken only if the microcode can't be updated
         | for some reason. That suggests to me that he, at least, expects
         | the update to do better.
        
         | ReactiveJelly wrote:
         | That would be the conservative thing to do. If there's no limit
         | on microcode updates, if I was Intel, I'd consider doing that
         | first and then speeding it up again later. Based on the
         | 5-second guess that people who update everything regularly will
         | care that we did the right thing for security, and people who
         | hate updates won't be happy anyway, so at least the first
         | update will be secure if they never get the next one.
         | 
         | (I think there is a limit on microcode, they seem conservative
         | to release new ones - I don't remember the details)
        
       | writeslowly wrote:
       | I noticed the Intel advisory [1] says the following
       | 
       | Intel would like to thank Intel employees:[...] for finding this
       | issue internally.
       | 
       | Intel would like to thank Google Employees: [...] for also
       | reporting this issue.
       | 
       | [1] https://www.intel.com/content/www/us/en/security-
       | center/advi...
        
         | narinxas wrote:
         | I wonder how much sooner than Google the Intel employees found
         | this issue
        
           | narinxas wrote:
           | but what I am really wondering about is how much money (if
           | any) the vulnerability was worth up to the moment when Google
           | also discovered it?
        
             | ajross wrote:
             | As described it's just a CPU crash exploit that requires
             | local binary execution. Getting to a vulnerability would
             | require understanding exactly how the corrupted microcode
             | state works, and that seems extremely difficult outside of
             | Intel.
             | 
             | So as described, this isn't a "valuable" bug.
        
               | derefr wrote:
               | This assumes that either 1. partners and interested
               | state-sponsored actors aren't kept abreast of Intel's
               | microcode backend architecture, or 2. that there hasn't
               | been at least one leak of this information from one of
               | these partners into the hands of interested APT
               | developers. I wouldn't put strong faith in either of
               | these assumptions.
        
               | dgacmu wrote:
               | It's not super-valuable yet, but it would let you mount
               | a really nasty DoS on cloud providers by triggering hard
               | resets of the physical machines. Some people would
               | probably pay for that, though it's obviously more
               | interesting to push on privilege or exfiltration.
               | 
               | Particularly since the MCEs triggered could prevent an
               | automatic reboot. Would depend what the hardware
               | management system did - do machines presenting MCEs get
               | pulled?
        
               | toast0 wrote:
               | If I'm a cloud provider and somebody's workflow is hard
               | resetting lots of my physical machines, I'm going to give
               | them free access to single tenant machines at the very
               | minimum. If they keep crashing the machines that only
               | they run on, I guess that's ok.
        
       | jefc1111 wrote:
       | This was a lot more fun than the Google puff piece.
        
       | frontalier wrote:
       | The date on the article is for tomorrow?
        
         | bitwize wrote:
         | Cereal Killer: Check this out, it's a memo about how they're
         | gonna deal with those oil spills on the 14th.
         | 
         | Acid Burn: What oil spills?
         | 
         | Lord Nikon: Yo, brain dead, today's the 13th.
         | 
         | Cereal Killer: Whoa, this hasn't happened yet!
        
       | quietpain wrote:
       | ...our validation pipeline produced an interesting assertion...
       | 
       | What is a validation pipeline?
        
         | tonfa wrote:
         | The blog has a link to
         | https://lock.cmpxchg8b.com/zenbleed.html#discovery which
         | presents the concept.
        
         | ForkMeOnTinder wrote:
         | It's described one paragraph earlier.
         | 
         | > I've written previously about a processor validation
         | technique called Oracle Serialization that we've been using.
         | The idea is to generate two forms of the same randomly
         | generated program and verify their final state is identical.
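
A toy sketch of the idea quoted above: build a random program, derive a second form of it, run both, and flag any difference in final state. Everything here is an assumption for illustration: the three-instruction toy machine, the `fence` no-op used to derive the second form, and the deliberately injected fault (`buggy=True`) that only fires on back-to-back `xor` instructions. It is not the actual validation pipeline, only the shape of the comparison.

```python
import random

MASK = 0xFFFFFFFF  # 32-bit toy registers


def random_program(rng, length=20, nregs=4):
    """Generate a random straight-line program for the toy machine."""
    ops = []
    for _ in range(length):
        kind = rng.choice(["add", "xor", "movi"])
        if kind == "movi":
            ops.append(("movi", rng.randrange(nregs), rng.randrange(1 << 32)))
        else:
            ops.append((kind, rng.randrange(nregs), rng.randrange(nregs)))
    return ops


def serialize(program):
    """Second form of the same program: a fence after every instruction."""
    out = []
    for op in program:
        out.append(op)
        out.append(("fence",))
    return out


def run(program, nregs=4, buggy=False):
    """Execute a program and return the final register state."""
    regs = [0] * nregs
    prev = None
    for op in program:
        if op[0] == "movi":
            regs[op[1]] = op[2]
        elif op[0] == "add":
            regs[op[1]] = (regs[op[1]] + regs[op[2]]) & MASK
        elif op[0] == "xor":
            result = regs[op[1]] ^ regs[op[2]]
            if buggy and prev == "xor":
                result ^= 1  # injected fault: back-to-back xor corrupts bit 0
            regs[op[1]] = result
        # "fence" changes no architectural state
        prev = op[0]
    return regs


def find_mismatch(trials=1000, seed=0, buggy=False):
    """Return the first program whose two forms disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        prog = random_program(rng)
        if run(prog, buggy=buggy) != run(serialize(prog), buggy=buggy):
            return prog
    return None
```

With the fault disabled the two forms always agree, so `find_mismatch()` returns `None`. With `buggy=True` the fenced form never has two adjacent `xor` instructions, so the forms diverge and the search returns a failing program without anyone having to specify the expected result in advance.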
        
           | 1f60c wrote:
           | Sounds like the real story should be that Google solved the
           | halting problem. :-P
        
             | kadoban wrote:
             | You're free to solve the halting problem for restricted
             | sets of programs, that doesn't break any rules of the
             | universe.
             | 
             | They also could be just discarding any where it runs for
             | longer than X time, or a bunch of other possibilities.
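
A minimal sketch of the "discard anything that runs too long" option mentioned above, using a step budget instead of wall-clock time. The tiny accumulator machine, its `add`/`jmp`/`jnz` instructions, and the name `run_with_budget` are all hypothetical; the point is only that a fuzzer can sidestep the halting problem by giving up on a program rather than deciding whether it halts.

```python
def run_with_budget(program, max_steps=10_000):
    """Run a toy program; return the accumulator, or None if over budget.

    program is a list of (op, arg) pairs:
      ("add", n)  adds n to the accumulator
      ("jmp", t)  jumps to instruction index t
      ("jnz", t)  jumps to t if the accumulator is non-zero
    Execution ends when the program counter leaves the program.
    """
    pc = 0
    acc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            return None  # budget exhausted: caller discards this test case
        op, arg = program[pc]
        if op == "add":
            acc += arg
            pc += 1
        elif op == "jmp":
            pc = arg
        elif op == "jnz":
            pc = arg if acc != 0 else pc + 1
        else:
            raise ValueError(f"unknown op: {op}")
        steps += 1
    return acc


# Usage: one program that terminates, one that never does.
halting = [("add", 3), ("add", -3), ("jnz", 0)]
looping = [("jmp", 0)]
halting_result = run_with_budget(halting)
looping_result = run_with_budget(looping)
```

Some programs that would eventually halt get thrown away too, but for random testing that only costs a little coverage.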
        
       | mike_d wrote:
       | The most awesome part:
       | 
       | > This bug was independently discovered by multiple research
       | teams within Google, including the silifuzz team and Google
       | Information Security Engineering.
        
       | yodon wrote:
       | Dupe: https://news.ycombinator.com/item?id=38268043
       | 
       | (As of this writing, this post has more votes, the other has more
       | comments)
        
         | dang wrote:
         | We'll merge that one hither. Please stand by!
        
       | metadat wrote:
       | Thanks, this is a way more detailed and informative explanation!
        
       | blauditore wrote:
       | Can someone give a TL;DR for non-CPU experts? All technical
       | articles seem pretty long and/or complex.
        
         | Arnavion wrote:
         | Some x86 instructions can have prefixes that modify their
         | behavior in a meaningful way. Such a prefix can be applied
         | generally to any instruction, but it's expected to have no
         | effect when applied to an instruction it doesn't make sense
         | with. But it turns out the CPU actually misbehaves in some
         | cases when this is done. Intel released a CPU firmware update
         | to fix it.
        
         | kmeisthax wrote:
         | x86 has a builtin memory copy instruction, provided by the
         | combination of the movsb instruction and a rep _prefix byte_ ,
         | that says you want the instruction to run in a loop until it
         | runs out of data to copy. This is "rep movsb". This instruction is
         | fairly old, meaning a lot of code still has it, even though
         | there's faster ways to copy memory in x86.
         | 
         | Intel added two features to modern x86 chips that detect rep
         | movsb and accelerates it to be as fast as those other ways.
         | However, those features have a bug. You see, because rep is a
         | prefix byte, you can just keep adding more prefix bytes to the
         | instruction (up to the 15-byte instruction length limit). x86 has
         | other prefix bytes too, such as rex (used to access registers 8-15), vex,
         | evex, etc. The part of the processor that recognizes a rep
         | movsb does NOT account for these other prefix bytes, which
         | makes the processor get confused in ways that are difficult to
         | understand. The processor can start executing garbage, take the
         | wrong branch in if statements, and so on.
         | 
         | Most disturbingly, when multiple physical cores are executing
         | these "rep rep rep rep movsb" instructions at the same time,
         | they will start generating machine check exceptions, which can
         | at worst force a physical machine reboot. This is very bad for
         | Google because they rent out compute time to different
         | companies and they all need to be able to share the same
         | machine. They don't want some prankster running these
         | instructions and killing someone else's compute jobs. We call
         | this a "Denial of Service" vulnerability because, while I can't
         | read someone else's computations or change them, I _can_ keep
         | them from completing, which is just as bad.
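
A minimal Python sketch of the encoding described above, for readers who want to see the bytes. It only builds byte strings and executes nothing. The helper names are made up for illustration, and the choice of rex.r (0x44) as the redundant prefix is an assumption, not a claim about the exact sequence that triggers the bug.

```python
# Byte values come from the x86-64 instruction encoding; the helper
# names below are hypothetical and exist only for this illustration.
REP = 0xF3          # rep / repe prefix byte
REX_R = 0x44        # REX prefix with only the R bit set (0x40 | 0x04)
MOVSB = 0xA4        # one-byte opcode for movsb
MAX_INSN_LEN = 15   # architectural limit on total instruction length


def encode(prefixes, opcode):
    """Return the raw bytes of a one-byte-opcode instruction with prefixes."""
    insn = bytes(prefixes) + bytes([opcode])
    if len(insn) > MAX_INSN_LEN:
        raise ValueError("x86 instructions cannot exceed 15 bytes")
    return insn


def is_rex(byte):
    """In 64-bit mode, bytes 0x40 through 0x4F are REX prefixes."""
    return 0x40 <= byte <= 0x4F


plain = encode([REP], MOVSB)         # rep movsb
odd = encode([REP, REX_R], MOVSB)    # rep rex.r movsb (redundant REX)
print(plain.hex(), odd.hex())
```

The REX byte sits directly before the opcode here because a REX prefix is only honored in that position; movsb has no register operand for it to extend, which is why it counts as redundant.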
        
           | BlueTemplar wrote:
           | > they all need to be able to share the same machine
           | 
           | Do they? As these issues keep piling up, it just seems that
           | it's not worth the hassle, and they should instead never do
           | sharing like this...
        
             | jrockway wrote:
             | To some extent, anyone with a web browser is sharing their
             | machine with other people. That's JavaScript.
             | 
             | If you ever download untrustworthy code and run it in a VM
             | to protect your main set of data, that's another case.
             | 
             | The success of cloud computing comes from the idea that
             | multiple people can share the same computer. You only need
             | one core, but CPUs come with 128; with the cloud you can
             | buy just that one core and share 1/128th of the power
             | supply, rack space, motherboard, ethernet cable, sysadmin
             | time, etc. and that reduces your costs. That assumption is
             | all based on virtualization working, though; nobody wants
             | 1/128th of someone else's computer, they want their own
             | computer that's 1/128th as fast. Bugs like these
             | demonstrate that you're just sharing a computer with
             | someone, which is bad for the business of cloud providers.
        
       | rep_lodsb wrote:
       | The REX prefix is redundant for 'movsb', but not 'movsd'/'movsq'
       | (moving either 32- or 64-bit words, depending on the prefix).
       | That may have something to do with the bug, if there is any
       | shared microcode between those instructions?
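
A small Python sketch of the point about REX and the movs family in 64-bit mode. The function name and parameters are hypothetical. It models only the documented operand-size rules and says nothing about how the microcode is actually shared.

```python
def movs_element_size(opcode, rex_w=False, opsize_prefix=False):
    """Bytes moved per iteration by a movs instruction in 64-bit mode.

    opcode        -- 0xA4 (movsb) or 0xA5 (movsw / movsd / movsq)
    rex_w         -- True if a REX prefix with the W bit set is present
    opsize_prefix -- True if the 0x66 operand-size prefix is present
    """
    if opcode == 0xA4:
        # movsb always moves one byte; REX.W has nothing to change.
        return 1
    if opcode == 0xA5:
        if rex_w:
            # REX.W selects 64-bit operands and takes precedence over 0x66.
            return 8
        return 2 if opsize_prefix else 4
    raise ValueError("not a movs opcode")


print(movs_element_size(0xA4, rex_w=True))   # movsb with a redundant REX.W
print(movs_element_size(0xA5))               # movsd
print(movs_element_size(0xA5, rex_w=True))   # movsq
```

The same opcode byte (0xA5) moves 4 or 8 bytes depending on REX.W, while 0xA4 ignores it, so a REX prefix is meaningful for one and redundant for the other.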
        
       | ZoomerCretin wrote:
       | Intel is a known partner of the NSA. If Intel was intentionally
       | creating backdoors at the behest of the NSA, how would they look
       | different from this vulnerability and the many other discovered
       | vulnerabilities before it?
        
         | rep_lodsb wrote:
         | My guess is that it would be something that could be exploited
         | via JavaScript. And no JIT would emit an instruction like the
         | one that causes this bug.
        
         | thelittleone wrote:
         | But so is Google. It would be some very crafty theatrics if
         | it's all coordinated.
        
         | gosub100 wrote:
         | The backdoor would just be an encrypted stream of "random" data
         | flowing right out of the RNG. There's some maxim of crypto that
         | encrypted data is indistinguishable from random bytes.
        
         | tedunangst wrote:
         | How would you distinguish this backdoor from one inserted by an
         | unknown partner of the NSA?
        
       | dang wrote:
       | Related: https://cloud.google.com/blog/products/identity-
       | security/goo...
       | 
       | (via https://news.ycombinator.com/item?id=38268043, but we merged
       | the comments hither)
        
       | quotemstr wrote:
       | If the problem really is that the processor is confused about
       | instruction length, I'm impressed that this problem can be fixed
       | in microcode without a huge performance hit: my intuition (which
       | could be totally wrong) is that computing the length of an
       | instruction would be something synthesized directly to logic
       | gates.
       | 
       | Actually, come to think of it, my hunch is that the uOP decoder
       | (presumably in hardware) is actually fine and that the microcoded
       | optimized copy routine is trying to infer things about the uOP
       | stream that just aren't true --- "Oh, this is a rep mov, so of
       | course I need to go backward two uOPs to loop" or something.
       | 
       | I expect Intel's CPU team isn't going to divulge the details
       | though. :-)
        
       | ShadowBanThis01 wrote:
       | Is what? Another useless title.
        
       ___________________________________________________________________
       (page generated 2023-11-14 23:00 UTC)