[HN Gopher] A bug fix in the 8086 microprocessor, revealed in the die's silicon
       ___________________________________________________________________
        
       A bug fix in the 8086 microprocessor, revealed in the die's silicon
        
       Author : _Microft
       Score  : 353 points
       Date   : 2022-11-26 22:28 UTC (1 day ago)
        
 (HTM) web link (www.righto.com)
 (TXT) w3m dump (www.righto.com)
        
       | anyfoo wrote:
       | Once again, absolutely amazing. Those are more details of a
       | really interesting internal CPU bug than I could ever have hoped
       | for.
       | 
       | Ken, do you think at some point in the future it might be
       | feasible for a hobbyist (even if just a very advanced one like
       | you) to do some sort of precise x-ray imaging that would obviate
       | the need to destructively dismantle the chip? For a chip of that
       | vintage, I mean.
       | 
       | Obviously that's not an issue for the 8086 or 6502, since there
       | are more than plenty around. But if, for example, an engineering
       | sample were ever to appear, it would be incredibly interesting to
       | know what might have changed. But if it's the only one,
       | dissecting it could go very wrong, and you'd lose both the chip
       | and the insight it could have given.[1]
       | 
       | Also in terms of footnotes, I always meant to ask: I think they
       | make sense as footnotes, but unlike footnotes in a book or paper
       | (or in this short comment), I cannot just let my eyes jump down
       | and back up, which interrupts flow a little. I've seen at least
       | one website having footnotes on the _side_, i.e. in the margin
       | next to the text that they apply to. Maybe with a little JS or
       | CSS to fully unveil them. Would that work?
       | 
       | [1] Case in point, from that very errata, I don't know how rare
       | 8086s with (C)1978 are, but it's conceivable they could be rare
       | enough that dissolving them to compare the bugfix area isn't
       | desirable.
        
         | molticrystal wrote:
         | >some sort of precise x-ray imaging that would obviate the need
         | to destructively dismantle the chip
         | 
         | I don't know about a hobbyist, but are you talking about
         | something along the lines of "Ptychographic X-ray
         | Laminography"? [0] [1]
         | 
         | [0] https://spectrum.ieee.org/xray-tech-lays-chip-secrets-bare
         | 
         | [1] https://www.nature.com/articles/s41928-019-0309-z
        
           | anyfoo wrote:
           | I haven't ever looked into it myself, but what you just
           | pasted seems like a somewhat promising answer indeed. Except
           | for the synchrotron part, I guess? Maybe?
        
             | sbierwagen wrote:
             | >Except for the synchrotron part I guess? Maybe?
             | 
             | They were using 6.2keV x-rays from the Swiss Light Source,
             | a multi-billion Euro scientific installation. 6.2keV isn't
             | especially energetic by x-ray standards (a tungsten-target
             | x-ray tube will do ten times that), so either they needed
             | monochromacy or the high flux you can only get from a
             | building-sized synchrotron. The fact that the paper says
             | they poured 76 million grays of ionizing radiation into a
             | volume of 90,000 cubic micrometers over the course of 60
             | hours suggests the latter. (A fatal dose of whole-body
             | radiation to a human is about 5 grays. This is not a
             | tomography technique that will ever be applied to a living
             | subject, though there are no doubt some interesting things
             | being done right now to frozen bacteria or viruses.)
        
               | anyfoo wrote:
               | No question, but:
               | 
               | "Though the group tested the technique on a chip made
               | using a 16-nanometer process technology, it will be able
               | to comfortably handle those made using the new
               | 7-nanometer process technology, where the minimum
               | distance between metal lines is around 35 to 40
               | nanometers."
               | 
               | The resolution required for imaging CPUs of 6502 or even
               | 8086 vintage is much coarser than that. The latter was
               | apparently made with a 3 um process, so barring any
               | differences in process naming since, that's roughly two
               | orders of magnitude coarser.
               | 
               | Plus, as Ken's article says, the 8086 has only one metal
               | layer instead of a dozen.
               | 
               | So my (still ignorant) hope is that you maybe don't need
               | a synchrotron for that.
        
           | kens wrote:
           | That X-ray technique is very cool. One problem, though, is
           | that it only shows the metal layers. The doping of the
           | silicon is very important, but doesn't show up on X-rays.
        
       | caf wrote:
       | I believe this one-instruction disabling of interrupts is called
       | the 'interrupt shadow'.
        
       | _Microft wrote:
       | A Twitter thread by the author can be found here:
       | 
       | https://twitter.com/kenshirriff/status/1596622754593259526
        
         | raphlinus wrote:
         | Or, if you prefer Mastodon:
         | https://mastodon.online/@kenshirriff@oldbytes.space/10941232...
        
         | kens wrote:
         | Usually if I post a Twitter thread, people on HN want a blog
         | post instead. I'm not used to people wanting the Twitter thread
         | :)
        
           | jiggawatts wrote:
           | Blogs are too readable.
           | 
           | I prefer the challenge of mentally stitching together
           | individual sentences interspersed by ads.
        
             | IIAOPSW wrote:
             | Well I've got great news for you (1/4)
        
           | lazzlazzlazz wrote:
           | The Twitter thread is better. You get the full context of
           | other people's reactions, comments, references, etc. This is
           | lost with blogs.
           | 
           | It's the same reason why many people go to the Hacker News
           | comments before clicking the article. :)
        
             | kelnos wrote:
             | I won't quite say I don't care about other people's
             | reactions (seeing as I'm here on HN, caring about other
             | people's reactions), but Twitter makes it annoying enough
             | to follow the original multi-tweet material that I greatly
             | prefer the blog post.
        
             | Shared404 wrote:
             | I like having the twitter/mastodon thread easily linked,
             | but 10/10 times would rather see the article first.
        
             | serf wrote:
             | >The Twitter thread is better.
             | 
             | let's recognize subjectivity a bit here. it's different,
             | you may like it more -- that's great, but prove 'better' a
             | bit better.
             | 
             | i've said this dozens of times, and i'm sure people are
             | getting tired of the sentiment, but Twitter is a worse
             | reading experience than all but the very worst blog formats
             | for long-format content.
             | 
             | ads, forced long-form formatting that breaks flow
             | constantly, absolutely unrelated commentary half of the
             | time (from people that have absolutely no clout or
             | experience with whatever subject), and the forced
             | MeTooism/Whataboutism inherent with social media platforms
             | and their trolling audience.
             | 
             | It gets a lot of things right with communication, and many
             | people _love_ reading long-form things on Twitter -- but
             | really i'm just asking you to convince me here; there is
             | very little 'better' from what I see: help me to realize
             | the offerings.
             | 
             | tl;dr: old man says 'I just don't get IT.' w.r.t. long-form
             | tweeting.
        
               | anyfoo wrote:
               | > absolutely unrelated commentary half of the time (from
               | people that have absolutely no clout or experience with
               | whatever subject)
               | 
               | Yeah, you can see that by just looking at the comments
               | for the first tweet there. A few are maybe interesting
               | nuggets of information (but a comment on the blog or on
               | HN might have elaborated more); the rest is... exactly
               | what you say.
        
               | lazzlazzlazz wrote:
               | It should be very obvious that when someone says "X is
               | better" they mean that -- in their estimation and
               | according to their beliefs, X is better. Spelling it all
               | out isn't always necessary.
        
               | ilyt wrote:
               | For subjective media, sure. But in tech-focused social
               | media, not really. There are mostly objective things
               | (like too little contrast, or a site working slowly)
               | that can be said about readability, for example.
        
               | anyfoo wrote:
               | I don't think that's obvious. You could well mean that "X
               | is better" in general, and consequently even advocate
               | that "so Y, which is not better, is a waste of resources
               | and should not be done".
               | 
               | It's hard for an outside observer to judge which one you
               | mean, and just in case you meant the advocating one,
               | people might feel the need to discuss it to prevent
               | damage.
               | 
               | It's easy for you to clarify with "specifically for me"
               | or something similar.
               | 
               | And there even is something that I think is _generally_
               | better, namely that I've seen this blog (and others)
               | attract long-form comments from people who were also
               | there and who elaborate on more historical details in
               | them. On Twitter it's mostly condensed to the bare
               | minimum, which you can see on this very Twitter thread!
               | So losing the blog would be bad.
        
           | bhaak wrote:
           | Twitter threads force the author to be more concise and split
           | the thoughts into smaller pieces. And if there are pictures
           | attached to the tweets, that makes them even better.
           | 
           | It's a very good starting point, getting a summary of the
           | topic, before diving into a longer blog post knowing what to
           | expect and with some knowledge you probably didn't have
           | before.
           | 
           | The way I learn best is having some skeleton that can be
           | fleshed out. A concise Twitter thread builds such a skeleton
           | if I don't have it already.
        
             | ilyt wrote:
             | Forcing conciseness on nuanced topics usually does more
             | harm than good. And the format of Twitter is abhorrent.
             | 
             | Like, even your comment would need to be split into 2
             | twits.
             | 
             | Then again, I read pretty fast, so someone writing a
             | slightly longer sentence than absolutely necessary doesn't
             | really bother me in the first place.
        
               | bhaak wrote:
               | > Forcing conciseness on nuanced topics usually does
               | more harm than good.
               | 
               | Following that argument, you can't ever build up proper
               | knowledge unless you get a deep introduction to a topic.
               | 
               | > And the format of Twitter is abhorrent.
               | 
               | I disagree, but you need to write to its format. Just
               | writing a blog post and then splitting it up, cramming
               | as much as possible into each tweet, possibly even
               | breaking within a sentence -- yes, that is abhorrent.
               | 
               | > Like, even your comment would need to be split into 2
               | twits.
               | 
               | 3 even if you preserve the paragraphs I used.
               | 
               | > Then again, I read pretty fast, so someone writing a
               | slightly longer sentence than absolutely necessary
               | doesn't really bother me in the first place.
               | 
               | It's not about reading slowly, it's about getting small
               | pieces of information. Tweets usually don't have the
               | fluff that many blog posts do. Or even articles that
               | take up one third of the whole piece to describe a
               | person in a coffee shop, drinking a specific brand of
               | coffee, to illustrate some superficial point that has a
               | vague connection to the actual topic of the article.
        
       | userbinator wrote:
       | _The obvious workaround for this problem is to disable interrupts
       | while you're changing the Stack Segment register, and then turn
       | interrupts back on when you're done. This is the standard way to
       | prevent interrupts from happening at a "bad time". The problem is
       | that the 8086 (like most microprocessors) has a non-maskable
       | interrupt (NMI), an interrupt for very important things that
       | can't be disabled._
       | 
       | Although it's unclear whether the very first revisions of the
       | 8088 (not 8086) with this bug ended up in IBM PCs, since the PC
       | was introduced a few years later, the original PC and its
       | successors have the ability to disable NMI in external logic via
       | an I/O port.
        
         | _tom_ wrote:
         | This was in the early IBM PCs. I know, because I remember I had
         | to replace my 8088 when I got the 8087 floating point
         | coprocessor. I don't recall exactly why this caused it to hit
         | the interrupt bug, but it did.
        
           | anyfoo wrote:
           | Maybe because IBM connected the 8087's INT output to the
           | 8088's NMI, so the 8087 became a source of NMIs (and with
           | really trivial stuff like underflow/overflow, forced
           | rounding, etc., to boot).
           | 
           | Even if you could disable the NMI with external circuitry,
           | that workaround quickly becomes untenable, especially when
           | it's not fully synchronous.
        
             | _tom_ wrote:
             | That sounds right. It's been a while.
        
           | [deleted]
        
         | kaszanka wrote:
         | For those who are curious, it's the highest bit in port 70h
         | (this port is also used as the index port when accessing the
         | CMOS/RTC): https://wiki.osdev.org/CMOS#Non-Maskable_Interrupts
        
           | userbinator wrote:
           | Port 70h is where it went in the AT (which is what most
           | people are probably referring to when they say "PC
           | compatible" these days.) On the PC and XT, it's in port A0:
           | 
           | https://wiki.osdev.org/NMI
           | 
           | There's also another set of NMI control bits in port 61h on
           | an AT. It's worth noting that the port 70h and 61h controls
           | are still there in the 2022 Intel 600-series chipsets, almost
           | 40 years later; look at pages "959" and "960" of this:
           | https://cdrdv2.intel.com/v1/dl/getContent/710279?fileName=71...
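           | 
           | In code, on AT-class hardware that looks roughly like this (a
           | minimal sketch: bit 7 of the byte written to port 70h gates
           | NMI, and the low bits select a CMOS register):
           | 
           |     cli             ; mask INTR as well
           |     mov al, 80h     ; bit 7 set -> NMI masked
           |     out 70h, al
           |     in  al, 71h     ; conventional follow-up read of the
           |                     ; CMOS data port
           |     ; ... critical section ...
           |     mov al, 00h     ; bit 7 clear -> NMI unmasked
           |     out 70h, al
           |     in  al, 71h
           |     sti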
        
         | acomjean wrote:
         | We used to disable interrupts for some processes at work
         | (HP-UX/PA-RISC). It makes them quasi-real-time, though if
         | something goes wrong you have to reboot to get that CPU back...
         | 
         | psrset was the command; there is very little info on it since
         | HP gave up on HP-UX. I thought it was a good solution for some
         | problems.
         | 
         | They even sent a bunch of us to a red hat kernel internals
         | class to see if Linux had something comparable.
         | 
         | https://www.unix.com/man-page/hpux/1m/psrset/
        
           | azalemeth wrote:
           | I'd never heard of processor sets (probably because I've
           | never used HP-UX in anger), but they sound like a great
           | feature, especially in the days when one CPU had only one
           | execution engine on it. Modern Linux has quite a lot of
           | options for hard real time on SMP systems, some of them Free
           | and some of them not. Of all places, Xilinx has quite a good
           | overview:
           | https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/188424...
        
       | kklisura wrote:
       | > While reverse-engineering the 8086 from die photos, a
       | particular circuit caught my eye because its physical layout on
       | the die didn't match the surrounding circuitry.
       | 
       | Is there software that builds/reverse-engineers circuitry from
       | die photos, or is this all manual work?
        
         | kens wrote:
         | Commercial reverse-engineering places probably have software.
         | But in my case it's all manual.
        
         | speps wrote:
         | Check out the Visual 6502 project for some gruesome details:
         | http://www.visual6502.org/
         | 
         | The slides are a great summary:
         | http://www.visual6502.org/docs/6502_in_action_14_web.pdf
        
       | techwiz137 wrote:
       | Ah yes, the infamous mov ss / pop ss that caused many a debugger
       | to fail and be detected.
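       | 
       | (The one-instruction interrupt shadow after a load into SS is
       | exactly what the trick abuses: it also suppresses the single-step
       | trap, so a debugger tracing this sequence misses an instruction.
       | A rough sketch:)
       | 
       |     push ss
       |     pop  ss     ; interrupts -- including the trace trap -- are
       |                 ; inhibited for one instruction
       |     pushf       ; executes without the debugger's INT 1 firing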
        
       | sgtnoodle wrote:
       | Just brainstorming a software workaround. It seems like you could
       | abuse the division error interrupt, and manipulate the stack
       | pointer from inside its ISR? I assume the division error
       | interrupt can't be interrupted since it is interrupt 0?
        
         | kens wrote:
         | There's no stopping an NMI (non-maskable interrupt), even if
         | you're in a division error interrupt.
        
           | sgtnoodle wrote:
           | That's some funny priority inversion! Reading up on it, the
           | division error interrupt will be processed by the hardware
           | first (pushed to the stack), but then be interrupted by NMI
           | before the ISR runs.
        
         | colejohnson66 wrote:
         | "Interrupt 0" wasn't its priority, but it's index. IIRC, the
         | 8086 doesn't have the concept of interrupt priority. Upon
         | encountering a #DE, the processor would look up interrupt 0 in
         | the IVT (located at 0x00000 though 0x00003 in "real mode").
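         | 
         | A sketch of what that lookup amounts to (each IVT entry is four
         | bytes: a 16-bit handler offset followed by a 16-bit segment):
         | 
         |     xor ax, ax
         |     mov ds, ax       ; the IVT sits at linear address 0
         |     mov bx, 0*4      ; vector 0 (#DE): bytes 0x00000..0x00003
         |     mov cx, [bx]     ; handler offset (new IP)
         |     mov dx, [bx+2]   ; handler segment (new CS)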
        
       | StayTrue wrote:
       | Fantastic read. I'll note for others this blog has an RSS feed
       | (to which I'm now subscribed!).
        
       | fnordpiglet wrote:
       | I sincerely wish I were this person. Reading this makes me feel
       | I've fundamentally failed in my life decisions.
        
         | holoduke wrote:
         | I actually had some eye openers on how a cpu works in more
         | detail after reading this article. Is there any good material
         | for understanding cpu design for beginners?
        
           | johannes1234321 wrote:
           | Check out Ben Eater's series on building a breadboard
           | computer:
           | https://youtube.com/playlist?list=PLowKtXNTBypGqImE405J2565d...
           | 
           | In the series he builds all the things more or less from the
           | ground up. And if that isn't enough, in another video series
           | he builds a breadboard VGA graphics card.
        
         | hyperman1 wrote:
         | I am always awestruck by Ken's posts, and felt severely
         | underqualified to comment on anything, even as a regular if
         | basic 'Elektor' reader.
         | 
         | But the previous post, about the bootstrap drivers, gave me a
         | chance. I had an hour to spare, and the circuit was small
         | enough, so I just stared at it on my cell phone, scrolling back
         | and forth, back and forth. It took a while, but slowly I
         | started picking up some basic understanding. Big shoutout to
         | Ken, BTW; he did everything to make it easy for mortals to
         | follow along.
         | 
         | I am still clearly at the beginner level, and surely couldn't
         | redo that post on my own, but there is a path forward to learn
         | this.
         | 
         | If I were to redo that post, I'd advise having pen and paper
         | ready, and printing out the circuit a few times to doodle on.
         | And have lots of time to focus.
        
         | quickthrower2 wrote:
         | Whatever you decide to do next, please be gentle with yourself.
         | If you are in a position to even appreciate the post, you have
         | probably done well!
         | 
         | I get it. And I think I might know why: we are all smart here.
         | Smart enough to do outstandingly well at high school. As time
         | goes on, through university and career, the selection bias
         | reveals itself. We are now among millions of peers, and
         | statistically some of them will be well ahead by some measure.
         | In short: the feeling is to be expected, and it is by design of
         | modern life and high interconnectedness.
        
           | toofy wrote:
           | this is a really nice reply.
           | 
           | sometimes in the too often contrarian comment sections, it's
           | jarring to see someone's comment recognize the human being on
           | the other end. it's inspiring.
           | 
           | i'm not the op, but your comment was kinda rad. thanks.
        
         | Sirened wrote:
         | Hardware is fun! It's never too late :)
        
         | kens wrote:
         | I'm not sure if I should be complimented or horrified by this.
        
           | redanddead wrote:
           | why not onboard him into the initiative?
        
         | _Microft wrote:
         | It's never too late.
        
           | fnordpiglet wrote:
           | https://youtu.be/n3SsJdm8bMY
        
             | [deleted]
        
         | prettyStandard wrote:
         | I'm sure you could figure this out if you wanted to.
         | 
         | https://www.reddit.com/r/devops/comments/8wemc2/feeling_unsu...
        
       | albert_e wrote:
       | Fascinating!
       | 
       | Are there any good documentary-style videos that dive into such
       | details of chips and microprocessor architectures?
        
       | ajross wrote:
       | I love these so much.
       | 
       | kens: were you able to identify the original bug? I understand
       | the errata well enough; it implies that the processor state is
       | indeterminate between the segment assignment and the subsequent
       | update to SP. But this isn't a 286/386 with segment selectors; on
       | the 8086, SS and SP are just plain registers. At least
       | conceptually, whenever you access memory the processor does an
       | implicit shift/add with the relevant segment; there's no
       | intermediate state.
       | 
       | Unless there is? Was there a special optimization that pre-baked
       | the segment offset into SP, maybe? Would love to hear about
       | anything you saw.
        
         | wbl wrote:
         | It's not that the state is indeterminate; it's that a
         | programmer would have some serious trouble ensuring that it was
         | possible to dump registers at that moment.
        
           | ajross wrote:
           | No, that would just be a software bug (and indeed, interrupt
           | handlers absolutely can't just arbitrarily trust segment
           | state on 8086 code where app code messes with it, for exactly
           | that reason). The errata is a hardware error: the specified
           | behavior of the CPU after a segment assignment is clear per
           | the docs, but the behavior of the actual CPU is apparently
           | different.
        
             | mjw1007 wrote:
             | The 8086 didn't have a separate stack for interrupts.
             | 
             | So, as I understand it, the problem isn't that code running
             | in the interrupt handler might push data to a bad place;
             | the problem is that the processor itself would do so when
             | pushing the return address before branching to the
             | interrupt handler.
        
               | cesarb wrote:
               | > the problem is that the processor itself would do so
               | when pushing the return address before branching to the
               | interrupt handler.
               | 
               | I want to emphasize this point: when handling an
               | interrupt, the x86 processor itself pushes the return
               | address and the flags onto the stack. Other processor
               | architectures store the return address and the processor
               | state word into dedicated registers instead, and it's
               | code running in the interrupt handler which saves the
               | state on the stack; that would easily allow fixing the
               | problem in software, instead of requiring it to be fixed
               | in hardware (a simple software fix would be to use a
               | dedicated save area or interrupt stack, separate from the
               | normal stack; IIRC, that's the approach which later x86
               | processors followed).
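               | 
               | To make that concrete: on the 8086, accepting an
               | interrupt is roughly equivalent to this software sequence
               | (a sketch; the CPU also clears IF and TF before
               | vectoring):
               | 
               |     xor  ax, ax
               |     mov  ds, ax       ; the IVT lives at segment 0
               |     pushf             ; the CPU pushes FLAGS...
               |     call far [2*4]    ; ...then CS and IP, jumping via
               |                       ; vector 2 (the NMI)
               | 
               | All three pushes go through SS:SP, which is why a half-
               | updated SS:SP pair can't be protected purely in software.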
        
               | robocat wrote:
               | > with a separate interrupt stack
               | 
               | Some detail here:
               | https://www.kernel.org/doc/html/latest/x86/kernel-
               | stacks.htm...
        
             | anyfoo wrote:
             | The CPU _itself_ pushes flags and return address on the
             | stack during an interrupt. If it's an NMI, you can't
             | prevent an interrupt between mov/pop ss and mov/pop sp,
             | which most systems will always have a need to do
             | eventually.
             | 
             | Therefore, the CPU has consistent state, but an extra fix
             | was still necessary just to prevent interrupts after a
             | mov/pop ss.
        
         | kens wrote:
         | I believe this was a design bug: the hardware followed the
         | design but nobody realized that the design had a problem.
         | Specifically, it takes two instructions to move the stack
         | pointer to a different segment (updating the SS and the SP). If
         | you get an interrupt between these two instructions, everything
         | is deterministic and "works". The problem is that the
         | combination of the old SP and the new SS points to an
         | unexpected place in memory, so your stack frame is going to
         | clobber something. The hardware is doing the "right thing" but
         | the behavior is unusable.
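         | 
         | In code, the problematic switch is just two instructions (a
         | sketch; CLI masks INTR but not NMI, which is why the fixed
         | steppings instead inhibit all interrupts for one instruction
         | after a load into SS):
         | 
         |     cli             ; masks INTR, but not NMI
         |     mov ss, ax      ; new stack segment
         |                     ; <- an NMI accepted here pushes
         |                     ;    FLAGS/CS/IP at new SS : old SP
         |     mov sp, bx      ; new stack pointer
         |     sti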
        
           | ajross wrote:
           | Huh... well, that's disappointing. That doesn't sound like a
           | hardware bug to me at all. There's nothing unhandleable about
           | this circumstance that I see: any combination of SS/SP points
           | to a "valid" address; code just has to be sure the
           | combination always points to enough memory to store an
           | interrupt frame.
           | 
           | In the overwhelmingly common situation where your stack
           | started with an SP of 0, you just assign SP back to 0 before
           | setting SS and nothing can go wrong (you are, after all,
           | about to move off this stack and don't care about its
           | contents). Even the weirdest setups can be handled with a
           | quick thunk through an SP that overlaps in the same 64k
           | region as the target.
           | 
           | It's a clumsy interface but hardly something that Intel has
           | traditionally shied away from (I mean, good grief, fast
           | forward a half decade and they'll have inflicted protected
           | mode transitions on us). I'm just surprised they felt this
           | was worth a hardware patch.
           | 
           | I guess I was hoping for something juicier.
        
             | munch117 wrote:
             | > That doesn't sound like a hardware bug to me at all.
             | 
             | It isn't. It's a rare case of a software bug being worked
             | around in hardware. That alone makes it fascinating.
             | 
             | The natural way of handling it in software would be to
             | disable interrupts temporarily while changing the stack
             | segment and pointer. You _could_ just say that software
             | developers should do that, but there's a performance cost,
             | and more importantly: the rare random failures from
             | software already out there that didn't do this would give
             | your precious new CPU a reputation for being unstable.
             | Better to just make it work.
        
               | cesarb wrote:
               | As others have already noted: yes, software could disable
               | interrupts temporarily while changing the stack segment
               | and pointer, but on these processors, there's a second
               | kind of interrupt ("non-maskable interrupt" aka NMI)
               | which cannot be disabled.
               | 
               | (Some machines using these processors have hardware to
               | externally gate the NMI input to the processor, but
               | disabling the NMI through these methods defeats the whole
               | purpose of the NMI, which is being able to interrupt the
               | processor even when it has disabled interrupts.)
        
               | munch117 wrote:
               | I saw those other 8087+NMI comments, but the 8087 is from
               | 1980, and I thought this fix was older than that.
        
             | caf wrote:
             | _In the overwhelmingly common situation where your stack
             | started with an SP of 0, you just assign SP back to 0
             | before setting SS and nothing can go wrong (you are, after
             | all, about to move off this stack and don't care about its
             | contents)._
             | 
             | Consider the case when the MSDOS command interpreter is
             | handing off execution to a user program. It can't just set
             | SP to 0 first (well, really 0xFFFF since the stack grows
             | down), because it's somewhere several frames deep in its
             | own execution stack, an execution stack which will
             | hopefully resume when the user program exits.
             | 
             | Programs using the 'tiny' memory model, like all COM files,
             | have SS == CS == DS so new_ss:old_sp could easily clobber
             | the code or data from the loaded file.
        
               | ajross wrote:
               | Right, so you just thunk to a different region at the
               | bottom of your own stack in that case. I'm not saying
               | this is easy or simple, I'm saying it's possible to do it
               | reliably entirely in software at the OS level without the
               | need for patching a hardware mask.
               | 
               | Edit, just because all the folks jumping in to downvote
               | and tell me I'm wrong are getting under my skin: this
               | (untested, obviously) sequence should work to safely set
               | SP to zero in an interrupt-safe way. All it requires is
               | that you have enough space under your current stack to
               | align SP down to a 16-byte boundary and fit 16 bytes and
               | one interrupt frame below that.
               | 
               | Again, you wouldn't use this in production OS code; you'd
               | just make sure there was a frame of space at the top of
               | your stack or be otherwise able to recover from an
               | interrupt. But you absolutely can do this.
               | 
               |     mov %sp, %ax  // Align SP down to a segment boundary
               |     and %ax, ~0xf
               |     mov %ax, %sp
               |     loop:         // Iteratively:
               |     mov %ss, %ax  // + decrement the segment (safe)
               |     dec %ax
               |     mov %ax, %ss
               |     mov %sp, %ax  // + increment the stack pointer by
               |                   //   the same amount (safe)
               |     add %ax, 16
               |     mov %ax, %sp
               |     jnz loop      // + Until SP is zero (and points to
               |                   //   the same spot!)
        
             | pm215 wrote:
             | The problem is that in general "code" can't do that because
             | the code that's switching stack doesn't necessarily
             | conceptually control the stack it's switching to/from. One
             | SS:SP might point to a userspace process, for instance, and
             | the other SS:SP be for the kernel. Both stacks are "live"
             | in that they still have data that will later be context
             | switched back to. You might be able to come up with a
             | convoluted workaround, but since "switch between stacks"
             | was a design goal (and probably assumed by various OS
             | designs they were hoping to get ports for), the race that
             | meant you can't do it safely was definitely a design bug.
        
         | duskwuff wrote:
         | > I understand the errata well enough, it implies that the
         | processor state is indeterminate between the segment assignment
         | and subsequent update to SP.
         | 
         | The root cause was effectively a bug in the programming model,
         | not in the implementation -- there was no way to atomically
         | relocate the stack (by updating SS and SP). The processor state
         | wasn't _indeterminate_ between the two instructions that
         | updated SS and SP, but the behavior resulting from an interrupt
         | at that point would almost certainly be _unintended_, since the
         | CPU would push state onto a stack in the wrong location,
         | potentially overwriting other stack frames, and the
         | architecture as originally designed provided no way to prevent
         | this.
        
       | ilyt wrote:
       | I'm more surprised that it already had a microcode equivalent,
       | even if it was essentially hardcoded.
        
         | carry_bit wrote:
         | I have a textbook on processor design from 1979, and microcode
         | was the go-to technique described in it.
        
           | pwg wrote:
           | Wikipedia (https://en.wikipedia.org/wiki/Microcode#History)
           | dates it back to 1951 and Maurice Wilkes.
        
       | MBCook wrote:
       | So this was all to avoid having to re-layout everything and cut
       | new rubylith, right? And at this point it was all still done by
       | hand?
       | 
       | I suppose you'd have to re-test everything with either a new
       | layout or this fix, so no real cost to save there?
        
         | retrac wrote:
         | > And at this point was all still done by hand?
         | 
         | Sort of. From what I've gathered, at this point Intel was still
         | doing the actual layout with rubylith, yes. But in small
         | modular sections. The sheets were then digitized and stitched
         | into the final whole in software; there wasn't literally a
         | giant 8086 rubylith sheet pieced together by hand, unlike just
         | a few years before with the 8080. But the logic and circuits
         | etc. were on paper, and there was no computer model capable of
         | going from that to layout. The computerized mask was little
         | more than a digital image. So a hardware patch it would have to
         | be, unless you wanted to redo a lot.
         | 
         | Soon, designers would start creating and editing such masks
         | directly in CAD software. But those were just giant (for the
         | time) images, really, with specialized editors. Significant
         | software introspection and abstraction in handling them came
         | later. I don't think the modern approach of full synthesis from
         | a specification really came into use until the late 80s.
        
           | rasz wrote:
           | The 286 was an RTL model hand-converted, module by module, to
           | a transistor/gate-level schematic. AFAIR the 386 was the
           | first Intel CPU where they fully used synthesis (work out of
           | UC Berkeley,
           | https://vcresearch.berkeley.edu/faculty/alberto-sangiovanni-...
           | co-founder of a little company called Cadence) instead of
           | manual routing. Everything went through logic optimizers
           | (multi-level logic synthesis) and would most likely be
           | unrecognizable.
           | 
           | I found a paper on this, 'Coping with the Complexity of
           | Microprocessor Design at Intel - A CAD History':
           | https://www.researchgate.net/profile/Avinoam-Kolodny/publica...
           | 
           | >In the 80286 design, the blocks of RTL were manually
           | translated into the schematic design of gates and transistors
           | which were manually entered in the schematic capture system
           | which generated netlists of the design.
           | 
           | >Albert proposed to support the research at U.C. Berkeley,
           | introduce the use of multi-level logic synthesis and
           | automatic layout for the control logic of the 386, and to set
           | up an internal group to implement the plan, albeit Alberto
           | pointed out that multi-level synthesis had not been released
           | even internally to other research groups in U.C. Berkeley.
           | 
           | >Only the I/O ring, the data and address path, the microcode
           | array and three large PLAs were not taken through the
           | synthesis tool chain on the 386. While there were many early
           | skeptics, the results spoke for themselves. With layout of
           | standard cell blocks automatically generated, the layout and
           | circuit designers could myopically focus on the highly
           | optimized blocks like the datapath and I/O ring where their
           | creativity could yield much greater impact
           | 
           | >486 design:
           | 
           |     - A fully automated translation from RTL to layout (we
           |       called it RLS: RTL to Layout Synthesis)
           |     - No manual schematic design (direct synthesis of
           |       gate-level netlists from RTL, without graphical
           |       schematics of the circuits)
           |     - Multi-level logic synthesis for the control functions
           |     - Automated gate sizing and optimization
           |     - Inclusion of parasitic elements estimation
           |     - Full chip layout and floor planning tools
        
             | retrac wrote:
             | Thanks. Fascinating!
        
           | marcosdumay wrote:
           | I remember the Pentium was marketed as "the first computer
           | entirely designed on CAD" well into the 90s. But I'm not sure
           | how real that marketing message was.
        
             | eric__cartman wrote:
             | That's a good advertisement for 486s because they were
             | baller enough to pull that off.
        
         | dboreham wrote:
         | Yes. A hardware patch.
        
           | dezgeg wrote:
           | And even today this kind of manual hand-patching is done for
           | minor bug fixes (instead of 'recompiling' the layout from the
           | RTL code).
           | 
           | One motivator is that if the fixes are simple enough, the
           | modifications can be made to affect only the topmost metal
           | layer, so there is no need to remanufacture the masks for the
           | other layers, saving time and $$$.
        
       | klelatti wrote:
       | Fantastic work by Ken yet again.
       | 
       | It's striking how many of the early microprocessors shipped with
       | bugs. The 6502 and 8086 both did, and it was an even bigger
       | problem for processors like the Z8000 and 32016, where the bugs
       | really hindered their adoption.
       | 
       | And the problem of bugs was a motivation for both the Berkeley
       | RISC team and the Acorn ARM team when choosing RISC.
        
         | adrian_b wrote:
         | All current processors ship with known bugs, including all
         | Intel and AMD CPUs and all CPUs with ARM cores, and even all
         | the microcontrollers with which I have ever worked, regardless
         | of manufacturer.
         | 
         | The many bugs (usually from 20 to 100) of each CPU model are
         | enumerated in errata documents with euphemistic names like
         | Intel's "Specification Update" and AMD's "Revision Guide".
         | Some processor vendors have the stupid policy of providing the
         | errata list only under NDA.
         | 
         | Many of the known bugs have workarounds provided by microcode
         | updates, and a few are scheduled to be fixed in a later CPU
         | revision. Some affect only privileged code, and it is expected
         | that operating system kernels will include the workarounds
         | described in the errata document. Many bugs have the resolution
         | "Won't fix", either because they affect things that are not
         | essential for the correct execution of a program (e.g. the
         | values of the performance counters or the speed of execution),
         | or because those bugs are considered to happen only in very
         | peculiar circumstances that are unlikely to occur in any normal
         | program.
         | 
         | I recommend reading some of the "Specification Update"
         | documents from Intel to understand the difficulty of designing
         | a bug-free CPU, even though much of the important information
         | about the specific circumstances that cause the bugs is usually
         | omitted.
        
       | drmpeg wrote:
       | When I was at C-Cube Microsystems in the mid 90's, during the
       | bring-up of a new chip they would test fixes with a FIB (Focused
       | Ion Beam). Basically a direct edit of the silicon.
        
         | mepian wrote:
         | Intel is still using FIB for silicon debugging.
        
       | hota_mazi wrote:
       | I started reading this article and immediately thought "That's
       | something Ken Shirriff could have written", but I didn't
       | recognize the domain name.
       | 
       | Lo and behold... looks like Ken just changed his domain name.
       | 
       | Amazing work, as always, Ken.
        
         | kens wrote:
         | It's the same domain name that I've had since the 1990s.
        
           | iamtedd wrote:
           | Maybe the parent is from another dimension. What would be
           | your second choice for a domain?
        
       ___________________________________________________________________
       (page generated 2022-11-28 05:02 UTC)