[HN Gopher] Intel and AMD Contemplate Different Replacements for...
___________________________________________________________________
Intel and AMD Contemplate Different Replacements for x86 Interrupt
Handling
Author : eklitzke
Score : 140 points
Date : 2021-06-04 18:10 UTC (4 hours ago)
(HTM) web link (www.eejournal.com)
(TXT) w3m dump (www.eejournal.com)
| korethr wrote:
| Somewhat off topic from the main thread of the article, but I
| have always wondered about the multiple privilege levels. What's
| the expected/intended use for them? The only thing I think of is
| separating out hardware drivers (assuming ring 1 can still
| directly read/write I/O ports or memory addresses mapped to
| hardware) so they can't crash the kernel should the drivers or
| hardware turn out to be faulty. But I don't think I've ever heard
| of such a design being used in practice. It seems everyone throws
| their driver code into ring 0 with the rest of the kernel, and if
| the driver or hardware faults and takes the kernel with it, too
| bad, so sad. Ground RESET and start over.
|
| What I find myself wondering is _why_? It seems like a good idea
| on paper, at least. Is it just a hangover from other CPU
| architectures that only had privileged /unprivileged modes, and
| programmers just ended up sticking with what they were already
| familiar and comfortable with? Was there some painful gotcha
| about multiple privilege modes that made them impractical to use,
| like the time overhead of switching privilege levels made it
| impossible to meet some hardware deadline? Silicon-level bugs?
| Something else?
| monocasa wrote:
| It works with the call gates. You could have had a sort of
| nested microkernel design, had it not been such a mess, with
| each ring able to include the less-privileged rings' address
| spaces. And it's not just a free-for-all: the kernel really is
| just the control plane, but it can set up all sorts of
| per-process descriptor tables (the LDTs).
|
| So you'd have a tiny kernel at ring 0 which could R/W
| everything but wasn't responsible for much.
|
| Under that you'd have drivers at ring 1 that can't see the
| kernel or other drivers, but can R/W user code at rings 2 and
| 3.
|
| Under that you'd have system daemons at ring 2 that can R/W
| regular programs but not other daemons at the same level, nor
| drivers or the kernel.
|
| And then under that you'd have regular processes at ring 3 that
| have generally the same semantics as today's processes.
|
| Each process of any ring can export a syscall like table
| through the call gates, and so user code could directly invoke
| drivers or daemons without going through the kernel at all.
| Basically IPC with about the same overhead as a C++ virtual
| method call.
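|
| For concreteness, a 386 call gate descriptor is, from memory,
| roughly the 8-byte structure below (field names are mine, not
| taken from any header):
|
|     #include <stdint.h>
|
|     /* 32-bit call gate descriptor (sketch) */
|     struct call_gate {
|         uint16_t offset_low;  /* target entry point, bits 0..15  */
|         uint16_t selector;    /* code segment of the target ring */
|         uint8_t  param_count; /* dwords copied between stacks    */
|         uint8_t  type_attr;   /* type (0xC), DPL, present bit    */
|         uint16_t offset_high; /* target entry point, bits 16..31 */
|     };
|
| User code reaches it with a far call through the gate's selector,
| and the CPU performs the ring and stack switch itself; that's the
| cheap IPC path.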
|
| So what happened? The underlying semantics didn't really match
| the OSs anyone wanted to build (particularly the OSs anyone
| cared about in the late 80s and early 90s). And you can enforce
| similar semantics entirely in software anyway, with the
| exception of the cheap IPC.
| dmitrygr wrote:
| I think OS/2 used every feature x86 had, including call gates
| and at least 3 privilege levels (0, 2, 3). That is why OS/2 is
| such a good test for any aspiring x86 emulator developer.
| vlovich123 wrote:
| Given the multi core, NUMA and Spectre/meltdown reality we're
| living in, and the clear benefits of the io_uring approach, why
| not just have a dedicated core(s) to handle "interrupts" which
| are nothing more than entries in a shared memory table?
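|
| A minimal sketch of what I mean (nothing to do with any real
| hardware interface; the names are made up):
|
|     #include <stdatomic.h>
|     #include <stdint.h>
|
|     struct intr_entry { uint32_t source; uint64_t payload; };
|
|     struct intr_ring {
|         _Atomic uint64_t  head;   /* advanced by the "device"    */
|         uint64_t          tail;   /* owned by the dedicated core */
|         struct intr_entry slots[256];
|     };
|
|     /* Runs forever on the dedicated core instead of taking IRQs. */
|     void intr_core(struct intr_ring *r) {
|         for (;;) {
|             uint64_t head = atomic_load_explicit(
|                 &r->head, memory_order_acquire);
|             while (r->tail != head) {
|                 struct intr_entry *e = &r->slots[r->tail++ % 256];
|                 (void)e;  /* dispatch(e->source, e->payload); */
|             }
|         }
|     }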
| devit wrote:
| There must be a way to alter a core's instruction pointer from
| another core or from hardware to support killing processes
| running untrusted machine code, and to support pre-emptive
| multithreading without needing to have compilers add a check
| for preemption on all backward branches and calls.
|
| These features are well worth the hassle of providing this
| capability (known as "IPIs"), and once you have that, hardware
| interrupts become pretty much free to support using the same
| capability; the OS/user can then decide whether to dedicate a
| core, to affinity them to a core, to load-balance them among all
| cores, or to disable the interrupts and poll instead.
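|
| For reference, on x86 the kernel fires one of these IPIs by
| writing the local APIC's interrupt command register; roughly
| this (xAPIC MMIO offsets quoted from memory, ring 0 only):
|
|     #include <stdint.h>
|
|     #define APIC_BASE   0xFEE00000UL  /* default local APIC base */
|     #define APIC_ICR_LO (APIC_BASE + 0x300)
|     #define APIC_ICR_HI (APIC_BASE + 0x310)
|
|     /* Send interrupt vector `vec` to the core with APIC ID `dest`.
|        Writing the low dword is what actually triggers the send. */
|     static void send_ipi(uint8_t dest, uint8_t vec) {
|         *(volatile uint32_t *)APIC_ICR_HI = (uint32_t)dest << 24;
|         *(volatile uint32_t *)APIC_ICR_LO = vec; /* fixed delivery */
|     }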
| vlovich123 wrote:
| I was thinking rather than mucking with instruction pointers
| you would just send a message back to the other CPU saying
| "pause & switch to context X". Technically an interrupt but
| one that can be handled internally within the CPU.
| [deleted]
| electricshampo1 wrote:
| This is essentially the approach taken by
|
| https://www.dpdk.org/ (network) and https://spdk.io/ (storage)
|
| Anything trying to squeeze perf doing IO intensive work should
| switch to this model (context permitting of course).
| knz42 wrote:
| This is the approach proposed in
| http://doi.org/10.1109/TPDS.2015.2492542
|
| (preprint:
| https://science.raphael.poss.name/pub/poss.15.tpds.pdf )
| DSingularity wrote:
| And old systems like Corey OS.
| eklitzke wrote:
| This approach works for I/O devices (and for things like
| network cards the kernel will typically poll them anyway), but
| I/O isn't the only thing that generates interrupts. For
| instance, a processor fault (e.g. divide by zero) should be
| handled immediately and synchronously since the CPU core
| generating the fault can't do any useful work until the fault
| is handled.
| vlovich123 wrote:
| Is that actually true? Wouldn't this imply you could launch a
| DOS attack against cloud providers just generating divisions
| by zero?
| bonzini wrote:
| You would only attack yourself. The CPU time you're paying
| for would be spent processing division by zero exceptions.
| vlovich123 wrote:
| Then the CPU isn't stopping and is moving on doing other
| work. Meaning the divide by zero could be processed by a
| background CPU and doesn't require immediate handling.
| Same for page faults.
| bananabreakfast wrote:
| Incorrect. The "CPU" is stopping and handling the fault.
| There is no background CPU from your perspective. In a
| cloud provider you are always in a virtual environment
| and using a vCPU which is constantly being preempted by
| the hypervisor.
|
| You cannot DOS a hypervisor just by bogging down your
| virtualized kernel.
| bonzini wrote:
| Your instance can still be preempted by the hypervisor.
| imtringued wrote:
| Serving a site that renders HTML is more computationally
| expensive than serving one that does nothing but divide by
| zero.
| justincormack wrote:
| Faults for divide by zero are a terrible legacy thing. Arm etc.
| don't do this: you test for zero if you want to, otherwise you
| just get a value.
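|
| That is, on such an ISA the check is just ordinary code,
| something like:
|
|     /* Divide without relying on a #DE-style fault (sketch). */
|     int safe_div(int a, int b, int *out) {
|         if (b == 0)
|             return -1;  /* caller decides what "divide by zero" means */
|         *out = a / b;
|         return 0;
|     }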
| DblPlusUngood wrote:
| A better example: a page fault for a non-present page.
| rwmj wrote:
| At university we designed an architecture[1] where you
| had to test for page not present yourself. It was all
| about seeing if we could make a simpler architecture
| where all interrupts could be handled synchronously, so
| you'd never have to save and restore the pipeline. Also
| division by zero didn't trap - you had to check before
| dividing. IIRC the conclusion was it was possible but
| somewhat tedious to write a compiler for[2], plus you had
| to have a trusted compiler which is a difficult sell.
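|
| To give a flavour, the compiler would wrap every memory access
| in something like this (the intrinsics are entirely made up,
| just to show the shape of the check):
|
|     #include <stdint.h>
|
|     extern int  __page_present(void *p); /* hypothetical ISA test  */
|     extern void __map_page(void *p);     /* synchronous pager call */
|
|     uint64_t checked_load(uint64_t *p) {
|         while (!__page_present(p))
|             __map_page(p);   /* bring the page in, then re-test */
|         return *p;           /* guaranteed not to fault now     */
|     }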
|
| [1] But sadly didn't implement it in silicon! FPGAs were
| much more primitive back then.
|
| [2] TCG in modern qemu has similar concerns in that they
| also need to worry about when code crosses page
| boundaries, and they also have a kind of "trusted"
| compiler (in as much as everything must go through TCG).
| warkdarrior wrote:
| Interesting. So what happens if the program does not test
| for the page? Doesn't the processor have to handle that
| as an exception of sorts?
| vlovich123 wrote:
| Could be handled by having the CPU switch to a different
| process while the kernel CPU faults the data in.
| ben509 wrote:
| The CPU doesn't know what processes are, that's handled
| by the OS. So there still needs to be a fault.
| vlovich123 wrote:
| You're thinking about computer architecture as designed
| today. There's no reason there couldn't be a common data
| structure defined that the CPU can use to select a backup
| process, much like how it uses page table data structures in
| main memory to resolve TLB misses.
| pjc50 wrote:
| Just to make it explicit for the people having trouble,
| the mechanism for switching processes in a pre-emptive
| multitasking system _is_ interrupts.
| DblPlusUngood wrote:
| In some cases, yes (if there are other runnable threads
| on this CPU's queue).
| eklitzke wrote:
| That was just an example, there are many other things the
| CPU can do that will generate a fault (for example, trying
| to execute an illegal instruction).
| mschuster91 wrote:
| Yes, but this one is so hard baked into everything that it
| would kill any form of backwards compatibility.
| bonzini wrote:
| Or interprocessor interrupts for flushing the TLB,
| terminating the scheduling quantum, or anything else.
| bostonsre wrote:
| Would guess it would be complicated and difficult to update the
| kernel to support something like that. Not sure Linus would
| entertain PRs for custom boards that do something like that.
| Would think it would need to be an industry wide push for that.
| But just speculation..
| vlovich123 wrote:
| We're talking about the CPU architecture here, not custom
| one-off ARM boards. Think x86 or ARM not a Qualcomm SoC.
|
| And yes of course. Linus' opinion would be needed.
| bogomipz wrote:
| The author states:
|
| >'The processor nominally maintained four separate stacks (one
| for each privilege level), plus a possible "shadow stack" for the
| operating system or hypervisor.'
|
| Can someone elaborate on what the "shadow stack" is and what it's
| for exactly? This is the first time I've heard this nomenclature.
| ChuckMcM wrote:
| Argh, why do authors write stuff like this -- _" It is, not to
| put too fine a point on it, a creaking old bit of wheezing
| ironmongery that, had the gods of microprocessor architecture
| been more generous, would have been smote into oblivion long
| ago."_
|
| Just because a technology is "old" doesn't mean it is useless, or
| needs to be replaced. I'm all in favor of fixing problems, and
| refactoring to improve flow and remove inefficiencies. I am _not_
| a fan of re-inventing the wheel because gee, we've had this
| particular wheel for 50 years and it's doing fine but hey let's
| reimagine it anyway.
|
| That said, the kink in x86 architecture was put there by "IBM PC
| Compatibility" and a Windows/Intel monopoly that went on way too
| long. But knowing _why_ the thing has these weird artifacts just
| means the engineers were working under constraints you don't
| understand; it doesn't give you license to dismiss what they've
| done as needing to be "wiped away."
|
| We are in a period where enthusiasts can design, build, and
| operate a completely bespoke ISA and micro-architecture with
| dense low cost FPGAs. Maybe they don't run at multi-GHz speeds
| but if you want to contribute positively to the question of
| computer architecture, there has never been a better time. You
| don't even have to build the whole thing! You can just add it
| into an existing architecture and compare how you do against it.
|
| Want to do flow control colored register allocation for
| speculative instruction retirement? You can build the entire
| execution unit in an FPGA and throw instructions at it to your
| heart's content and provide analysis of the results.
|
| Okay, enough ranting. I want AARCH64 to win so we can reset the
| problem set back to a smaller number of workarounds, but I think
| the creativity of people trying to advance the x86 architecture
| given the constraints is not something to be belittled; it is to be
| admired.
| gumby wrote:
| Also "smitten" is the passive past participle of "smite", not
| "smote", which is the active. This bothered me through the
| whole article.
|
| I suppose the author is not a native English speaker. Like the
| person who titled a film "honey I shrunk the kids". Makes me
| wince to even type it.
| Dylan16807 wrote:
| I propose that they are quite familiar with English and want
| to avoid the love-related connotations of "smitten".
| Scene_Cast2 wrote:
| It might be there to put a playful twist on things. Kind of
| like "shook" (slang) or "woke" (when it first appeared).
| failwhaleshark wrote:
| _And I woke from a terrible dream_
|
| _So I caught up my pal Jack Daniel 's_
|
| _And his partner Jimmy Beam_
|
| https://www.independent.co.uk/news/uk/home-news/woke-
| meaning...
|
| _AC /DC - You Shook Me All Night Long (Official Video)_
|
| https://youtu.be/Lo2qQmj0_h4
|
| (Rock n' roll stole the best vernacular.)
| gumby wrote:
| Those are both correct ("conventional" to descriptivists
| like me) uses of the respective terms.
| thanatos519 wrote:
| Ah, "not a native English speaker", usual written as
| "American".
|
| </troll>
| coldtea wrote:
| > _Just because a technology is "old" doesn't mean it is
| useless, or needs to be replaced._
|
| Sure. But if on top of being old it is "a creaking old bit of
| wheezing ironmongery" that "had the gods of microprocessor
| architecture been more generous, would have been smote into
| oblivion long ago", then it does need to be replaced.
|
| And the article both gives reasons why (it doesn't stop at mere
| age), and Intel/AMD share them.
| nwmcsween wrote:
| > Just because a technology is "old" doesn't mean it is
| useless, or needs to be replaced. I'm all in favor of fixing
| problems, and refactoring to improve flow and remove
| inefficiencies. I am not a fan of re-inventing the wheel
| because gee, we've had this particular wheel for 50 years and
| its doing fine but hey let's reimagine it anyway.
|
| Can't get promotions if you don't NIH a slow half broken
| postgres
| edoceo wrote:
| > slow half broken postgres
|
| Don't bring Postgres into this. Its fast and closer to zero-
| broken than to half-broken. ;)
| dkersten wrote:
| I think you misunderstood the comment. It wasn't calling
| Postgres slow and half broken, it was making a stab at
| other home-brewed (NIH, Not Invented Here
| https://en.wikipedia.org/wiki/Not_invented_here) databases,
| calling them slow, half baked copies of postgres, implying
| that they should have just used postgres, but that doing so
| wouldn't get them promotions.
| justicezyx wrote:
| I suspect that if you walk across the aisle to the engineers on
| another team and describe an obscure legacy issue to them,
| there's a decent chance they'd say something like "how could it
| be?", "was the original engineer dumb/incompetent?", "did the
| original team get reorged?", etc....
|
| Not saying the tech journalist is better in any sense. But
| let's be honest, there is no reason they should be doing
| better...
| failwhaleshark wrote:
| Sssh! "Old = bad" means job security for millions of engineers.
| We need yet another security attack surface, I mean "improved
| interrupt handling."
|
| I don't understand why people, engineers of all people, fall
| for the Dunning-Kruger/NIH/ageism-in-all-teh-things consumerism
| fallacy that everything else that came before is dumb,
| unusable, or can _always_ be done better.
|
| Code magically rusts after being exposed to air for 9 months,
| donja know? If it's not trivially-edited every 2 months, it's a
| "dead" project.
| lazide wrote:
| Part of it I think is that for many people, it's more fun to
| build something than to maintain something. It's also easier
| to write code than it is to read it (most of the time).
|
| So why not do the fun and easy thing? Especially if they
| aren't the one writing the checks!
| lazide wrote:
| Well, no one gets promoted/a raise by writing 'and everything
| is actually fine' right?
|
| On the engineering side, it's similar more often than not. You get
| promoted by solving the 'big problem'. The really enterprising
| (and ones you need to watch) often figure out how to make the
| big problem the one they are trying/well situated to solve -
| even if it isn't really a problem.
| Sebb767 wrote:
| > Well, no one gets promoted/a raise by writing 'and
| everything is actually fine' right?
|
| Well, that's actually not true. There are quite a few life
| coaches and the like making their living on positive writing.
| And there are quite a few writers, like Scott Alexander, who,
| while not being all positive, definitely don't need or want to
| paint an overly dark or dramatic picture.
|
| In the more conventional news sector, on the other hand, this
| is probably true.
| herpderperator wrote:
| > we've had this particular wheel for 50 years and its doing
| fine but hey let's reimagine it anyway
|
| How is it doing fine? Apple is running laps around Intel because --
| it seems -- of their choice to use ARM. Would they have been
| able to design an x86/amd64 CPU just as good?
| yyyk wrote:
| >Apple is doing laps over Intel because -- it seems -- of
| their choice to use ARM.
|
| AMD processors are essentially equivalent to the M1 in
| performance while still keeping x86, so probably yes (there is
| an ARM advantage in instruction decoding, but judging by the
| performance differences, it's probably small).
|
| Apple's advantage is mostly that Apple (via TSMC) can use a
| smaller processor node than Intel and optimize the entire Mac
| software stack for their processors.
| Someone wrote:
| > and optimize the entire Mac software stack for their
| processors.
|
| And vice versa.
| cycomanic wrote:
| I would argue the author is not really saying "old" is "bad",
| but instead that we have been piling more and more cruft onto
| the old so it has now become a significant engineering exercise
| to work around the idiosyncrasies of the old system every time
| you want to do something else.
|
| To use your wheel analogy: it's sort of like starting with the
| wheel of a horse cart and adding bits and pieces to that same
| wheel to make it somehow work as the landing gear wheel for a
| jumbo jet. At some point it might be a good idea to simply
| design a new wheel.
| gwbas1c wrote:
| Why continue with x86? Given how popular ARM is, why not just
| join the trend?
| ronsor wrote:
| x86 is not going anywhere, for performance computing, backwards
| compatibility, and a plethora of other reasons.
| young_unixer wrote:
| Linus' opinion:
| https://www.realworldtech.com/forum/?threadid=200812&curpost...
| bryanlarsen wrote:
| tldr: AMD is "fix the spec bugs". Intel is "replace with better
| approach". Linus: do both, please!
| bogomipz wrote:
| Thanks for posting this link. I was curious about a, b and d:
|
| >"(a) IDT itself is a horrible nasty format and you shouldn't
| have to parse memory in odd ways to handle exceptions. It was
| fundamentally bad from the 80286 beginnings, it got a tiny bit
| harder to parse for 32-bit, and it arguably got much worse in
| x86-64."
|
| What is it about IDT that requires parsing memory in odd ways?
| What is odd about it?
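|
| (For context, my understanding is that the 64-bit gate
| descriptor scatters the handler address across three separate
| fields, roughly the struct below; is that the oddness he means?)
|
|     #include <stdint.h>
|
|     /* One 16-byte x86-64 IDT entry (sketch, field names mine) */
|     struct idt_gate {
|         uint16_t offset_low;  /* handler address bits 0..15  */
|         uint16_t selector;    /* kernel code segment         */
|         uint8_t  ist;         /* interrupt stack table index */
|         uint8_t  type_attr;   /* gate type, DPL, present bit */
|         uint16_t offset_mid;  /* handler address bits 16..31 */
|         uint32_t offset_high; /* handler address bits 32..63 */
|         uint32_t reserved;
|     };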
|
| >"(b) %rsp not being restored properly by return-to-user mode."
|
| Does anyone know why this is? Is this a historical accident or
| something else?
|
| >"(d) several bad exception nesting problems (NMI, machine
| checks and STI-shadow handling at the very least)" Is this one
| these two exceptions are nested together or is this an issue
| when either one of these is present in the interrupt chain? Is
| there any good documentation on this?
| PopePompus wrote:
| Since cloud servers are a bigger market than users who want to
| run an old copy of VisiCalc, why doesn't either Intel or AMD
| produce a processor line that has none of the old 16- and 32-bit
| architectures (and long-forgotten vector extensions) implemented
| in silicon? Why not just make a clean (or as clean as possible)
| 64 bit x86 processor?
| jandrewrogers wrote:
| Intel did this with the cores for the Xeon Phi. While it was
| x86 compatible, they removed a bunch of the legacy modes.
| th3typh00n wrote:
| Because the number of transistors used for that functionality
| is absolutely negligible, so removing it has virtually no
| benefit.
| Symmetry wrote:
| The number of transistors sure. The engineer time to design
| new features that don't interfere with old features is high.
| The verification time to make sure every combination of
| features plays sensibly together is extremely high. To the
| extent that Intel and AMD are limited by the costs of
| employing and organizing large numbers of engineers it's a
| big deal. Though that's also the reason they'll never make a
| second, simplified, core.
| failwhaleshark wrote:
| It's never going to happen. The ISA is a hardware contract.
| PopePompus wrote:
| When things get to the point where AMD is considering
| making nonmaskable interrupts maskable (as the article
| states), maybe it's time to invoke "force majeure".
| PopePompus wrote:
| Even so, doesn't having a more complex instruction set,
| festooned with archaic features needed by very few users,
| increase the attack surface for hacking exploits and increase
| the likelihood of bugs being present? Isn't it a bad thing
| that the full boot process is understood in depth by only a
| tiny fraction of the persons programming for x86 systems (I'm
| certainly not one of them)?
| jcranmer wrote:
| Not really. Basically none of these features can be used
| outside of the kernel anyways, which means that the
| attacker already has _far_ more powerful capabilities they
| can employ.
| failwhaleshark wrote:
| Yep. Benefit = The cost of minimalism at every expense - the
| cost of incompatibility (in zillions): breaking things that
| cannot be rebuilt, breaking every compiler, breaking every
| debugger, breaking every disassembler, adding more feature
| flags, and it's no longer the Intel 64 / IA-32 ISA. Hardware
| != software.
| lizknope wrote:
| You mean like the Intel i860?
|
| https://en.wikipedia.org/wiki/Intel_i860
|
| Or the Intel Itanium?
|
| https://en.wikipedia.org/wiki/Itanium
|
| Or the AMD Am29000?
|
| https://en.wikipedia.org/wiki/AMD_Am29000
|
| Or the AMD K12 which was a 64-bit ARM?
|
| https://www.anandtech.com/show/7990/amd-announces-k12-core-c...
|
| All of these things were either rejected by the market or
| didn't even make it to the market.
|
| Binary compatibility is one of the major reasons, if not the
| major reason, that x86 has hung around so long. In the 1980s and 90s
| x86 was slower than the RISC workstation competitors but Intel
| and AMD really took the performance crown around 2000.
| fredoralive wrote:
| I think he's suggesting something more like the 80376, an
| obscure embedded 386 that booted straight into protected
| mode. So you'd have an x86-64 CPU that boots straight into
| Long Mode and thus could remove stuff like real mode and
| virtual 8086 mode. AFAIK with UEFI it's the boot firmware
| that handles switching to 32/ 64 bit mode, not the OS loader
| or kernel, so it would be transparent to the OS and programs.
|
| But in order to not break a lot of stuff on desktop Windows
| (and ancient unmaintained custom software on corporate
| servers) you'd still have to implement the "32 bit software
| on 64 bit OS" support. That probably means you don't actually
| simplify the CPU much.
|
| Of course some x86 extensions do get dropped occasionally,
| but only things like AMD 3DNow (I guess AMD market share
| meant few used it anyway) and that Intel transactional memory
| thing that was just broken.
| defaultname wrote:
| Binary compatibility kept x86 dominant, coupled with
| competing platforms not offering enough of a performance or
| price benefit to make them worth the trouble.
|
| That formula has completely changed. With the tremendous
| improvement in compilers, and the agility of development
| teams, the move has been long underway. People are firing up
| their Graviton2 instances at a blistering pace, my Mac runs
| on Apple Silicon (already, just months in, with zero x86 apps
| -- I thought Rosetta 2 would be the life vest, but everyone
| transitioned so quickly I could do fine without it).
|
| It's a very different world.
| ben509 wrote:
| GHC[1] is almost there >_<
|
| [1]: https://www.haskell.org/ghc/blog/20210309-apple-m1-sto
| ry.htm...
| monocasa wrote:
| No, there are a lot of PCisms that can be removed and
| still allow for x86 cores. User code doesn't care about PC
| compat really anymore (see the PS4 Linux port for the
| specifics of a non PC x86 platform that runs regular x86 user
| code like Steam, albeit one arguably worse designed than the
| PC somehow). Cleaning up ring 0 in a way that ring 3 code
| can't tell the difference with a vaguely modern kernel could
| be a huge win.
| raverbashing wrote:
| > Rather than use the IDT to locate the entry point of each
| handler, processor hardware will simply calculate an offset from
| a fixed base address
|
| So, wasn't the 8086 like this? Or at least some microprocessors
| jump to $BASE + OFFSET, to a spot where more or less one JMP
| instruction fits.
| sounds wrote:
| I have no idea how Intel's proposal handles this, but the 8086
| jump was to a fixed location. i.e. BASE was always 0.
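|
| On the 8086 the table lives at physical address 0 and each entry
| is just a 4-byte IP:CS pair, so dispatch amounts to roughly this
| (a sketch in C of what the hardware does):
|
|     #include <stdint.h>
|
|     struct ivt_entry { uint16_t ip; uint16_t cs; };
|
|     void dispatch_int(uint8_t vector) {
|         /* in real mode the vector table really is at 0000:0000 */
|         struct ivt_entry *ivt = (struct ivt_entry *)0;
|         struct ivt_entry  e   = ivt[vector];
|         /* hardware pushes FLAGS, CS, IP, then far-jumps to e.cs:e.ip */
|         (void)e;
|     }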
___________________________________________________________________
(page generated 2021-06-04 23:00 UTC)