[HN Gopher] Booting Modern Intel CPUs
___________________________________________________________________
Booting Modern Intel CPUs
Author : zdw
Score : 362 points
Date : 2023-04-17 04:07 UTC (18 hours ago)
(HTM) web link (mjg59.dreamwidth.org)
(TXT) w3m dump (mjg59.dreamwidth.org)
| lynguist wrote:
| There are so many steps and sidesteps there! Could someone who
| has access to GPT-4 API feed this text in and ask it to produce
| the topological or chronological step-by-step form of it? I'd be
| curious to see it.
| rep_lodsb wrote:
| Maybe you should read this article on the same site:
|
| https://mjg59.dreamwidth.org/64090.html
| twic wrote:
| Clearly ChatGPT has just reasoned out all the secret NSA-
| mandated additions to TPM 2.0 that are not listed in the
| specification.
| yuuta wrote:
| That's a pretty good article, but I expect more in-depth
| information on booting modern Intel CPUs ... I am very interested
| in modern UEFI / BIOS firmware development and how they bring
| up x86 CPUs, but unfortunately there is very little source (I
| guess, except for EDK2), and the majority (?) of x86 firmwares
| are proprietary. Booting x86 is much more complicated than
| writing a linker script with a vector table for your
| microcontroller ... so, this seems very interesting.
| smikhanov wrote:
| There's a very detailed but less emotional writeup on the same
| topic from 2018, highly recommended:
| https://binarydebt.wordpress.com/2018/10/06/how-does-an-x86-...
| ignoramous wrote:
| And a _how computers boot_ discussion from last month:
| https://news.ycombinator.com/item?id=35229045
| evilos wrote:
| So I guess this is what Bryan Cantrill meant when he said that a
| million kittens were slaughtered every time you boot your CPU.
| empiricus wrote:
| Sometimes I think our universe works by killing kittens (1 mil
| every hour). (The alternative explanation for all the dead
| kittens is just Moloch).
| porphyry3 wrote:
| > ...reprograms the CPU into a sensible mode (ie, one without all
| this segmentation bullshit)...
|
| Protected mode still has segmentation except that segment
| registers now index into either a global or local descriptor
| table (DT). Linux tries to make this transparent by setting up
| descriptors where logical addresses coincide with linear
| addresses (at least for CS and DS).
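|
| A rough sketch of that difference, in Python for readability. The
| Descriptor class, the FLAT_GDT table and the function names are made
| up for illustration; real descriptors also carry type and privilege
| bits that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    base: int   # linear address where the segment starts
    limit: int  # highest valid offset inside the segment


def real_mode_linear(segment: int, offset: int) -> int:
    """Real mode: the segment register is simply shifted left by 4 bits."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


def split_selector(selector: int) -> tuple[int, int, int]:
    """Protected mode: a selector is (table index, table indicator, RPL)."""
    return selector >> 3, (selector >> 2) & 1, selector & 3


def protected_mode_linear(selector: int, offset: int,
                          gdt: list, ldt: list) -> int:
    """Look the segment up in the GDT or LDT, then add its base."""
    index, table_indicator, _rpl = split_selector(selector)
    if table_indicator == 0 and index == 0:
        raise ValueError("the null selector cannot be used for access")
    table = ldt if table_indicator else gdt
    desc = table[index]
    if offset > desc.limit:
        raise ValueError("offset is beyond the segment limit")
    return (desc.base + offset) & 0xFFFFFFFF


# Hypothetical flat layout: null entry, then code and data, both with
# base 0 and a 4 GiB limit, so logical and linear addresses coincide.
FLAT_GDT = [
    Descriptor(0, 0),
    Descriptor(0, 0xFFFFFFFF),
    Descriptor(0, 0xFFFFFFFF),
]
```

| With base 0 in every descriptor, protected_mode_linear(0x08, x,
| FLAT_GDT, []) just hands x back, which is the "transparent" setup
| described above.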
| rkagerer wrote:
| Thanks for educating me as to the dumpster fire of patched
| together steps that take place in the first milliseconds of
| booting my PC.
|
| With regard to this:
|
| _Intel CPUs ship with built-in microcode, but it's frequently
| old and buggy and it's up to the system firmware to include a
| copy that's new enough that it's actually expected to work
| reliably._
|
| ...I wish companies put more effort into releasing finished,
| polished products. Not just Intel. I played with an ESP32-S3
| recently and learned not only was the entire digi controller for
| ADC2 'dropped' from support due to bad silicon, but using ADC1
| with DMA was fraught with gotchas and blatantly incorrect info in
| the documentation (reported at least a dozen to Espressif and
| they acknowledged).
| sgtnoodle wrote:
| The ESP32 ADC is a dumpster fire. I used one to sample an
| analog signal at 8 kHz, looking for 400 µs wide pulses (pacemaker
| capture pulses.) I originally started with an interrupt that
| triggered conversion from software. It would completely miss
| pulses 10-20% of the time and instead drift around, as if the
| ADC was floating rather than connected to anything. I finally
| got it to work reliably by switching to DMA through the I2S
| peripheral.
|
| Pretty much every hobby project I look at, people complain
| about how bad the ADC is, and then just proceed to take N
| samples and average them. It's not that the ADC is bad, it's
| that the underlying firmware is buggy. I suspect there's a race
| condition with the main CPU and a coprocessor that's using the
| ADC for some internal wifi stuff.
| rkagerer wrote:
| Yeah I felt that pain :-).
|
| On the ESP32-S3 ADC2 is shared with Wifi, but I don't think
| that's the case for ADC1. I made headway by throwing away all
| their boilerplate code and setting up the raw peripheral
| registers myself. I actually prefer it that way without all
| the abstraction underneath. Was getting nice results at 80kHz
| sample rate.
|
| Would love to turn it into a write-up as I'm not sure anyone
| has done this before with that chip (at least I couldn't find
| any sample code around). It's actually all in MicroPython at
| the moment to boot... but wouldn't be hard to convert to C
| (and might even clean things up due to the more natural fit
| of that language working with bitfields). What I'd really
| like is to make it into a DMA-based ADC module for
| MicroPython but I'm not familiar enough with compiling that
| platform from scratch, particularly on Windows.
| matthewfcarlson wrote:
| If you do end up writing it up, consider dropping it in the
| hackaday tip line (https://hackaday.com/submit-a-tip/). It
| sounds incredibly useful.
| gkhartman wrote:
| Thanks for the tip about the ESP32-S3 ADC2. Coincidentally, I
| was about to embark on that trail only to learn the hard way.
| You've likely saved me a good deal of frustration.
| ChuckNorris89 wrote:
| This is why companies still choose to buy expensive
| microcontrollers from big name reputable manufacturers
| (Microchip, TI, SiLabs, NXP, Nordic, STM, Infineon, Renesas,
| etc.) instead of going for the cheaper Chinese ARM
| microcontrollers at half the price.
|
| They tend to usually test their designs a lot more thoroughly,
| which adds to the final cost, and even when their products do
| come with flaws, they're more likely to be transparent about it
| and assist you with workarounds and sometimes send over field
| engineers on-site to help you.
|
| They also tend to be more honest about the specs, capabilities
| and limitations of their products in the datasheets. Finding
| out half-way through the design phase that some claims in the
| datasheet are bogus is a no-go for most companies.
|
| If you're just doing hobby work, or tinkering, or planning to
| ship millions of bottom of the barrel products on AliExpress
| with no intention of providing any warranty or customer support
| for them, then it's fine to go with whatever's the absolute
| cheapest, but serious companies like Apple, Sony, etc. who care
| about the customer experience, won't risk delaying a product
| launch because they wanted to save 50 cents on a new unproven
| cheap microcontroller whose ADCs don't work right.
| monocasa wrote:
| I don't know about all of that. TI's Stellaris
| microcontrollers were probably the worst chips I've had to
| deal with including sketchy Chinese chips.
| mmac_ wrote:
| I would generally agree with this, however I do find the
| esp32 on the whole has been as reliable as any of the major
| brands (although haven't had to use an ADC). We don't drive
| them too hard though, but their cheapness has pushed the
| bigger guys to get more competitive on pricing which is a
| good thing. TI in particular seems to have sharpened their
| pencil a bit, maybe due to all their new fabs coming online?
|
| The big guys still screw up, and over the journey I've
| noticed quite a few MCU subfamilies go EOL far before they
| should and it's usually due to silicon that has too many bugs
| in it. Maybe the big guys told them 'no' so there weren't any
| decent volumes on them anymore and they were forced to adapt.
|
| Sometimes they're a bit more subtle. You've probably seen
| quite a few 'A' revision part numbers recently where they
| clearly keep the same MCU but fix the bugs. See this on other
| ICs as well.
|
| For us, logistics (supply chain) and a solid support team are
| the highest importance. It's rare that we're locked into a
| single vendor due to a must-have-feature. These requirements
| narrow down our choices very quickly, and I'm sure it varies
| region-by-region (and how much $$$ you spend).
| RicoElectrico wrote:
| Uh, have you ever seen the length of a typical STM32 errata
| sheet? ;)
| voxadam wrote:
| I'd rather have a long errata sheet than no errata at all.
| I'll take truth in advertising any day over burying my head
| in the sand and pretending that everything's perfect.
| ChuckNorris89 wrote:
| Uh, have you read the part in my comments where I said that
| they're more likely to be honest about it? Having a short
| errata or no errata at all doesn't mean the product is
| flawless. It could be that they don't know all the faults
| yet, or aren't sharing them, or both.
|
| For ST it could be that they seem to be the most popular
| cheap ARM microcontrollers for hobbyists and consumer
| products, so it's easier to find faults in them since so
| many companies use them, similar to how the most used pieces
| of software also have the most vulnerabilities reported on
| them.
|
| Also, maybe ST was not a great example on my end, as they
| seem to have an obsession lately with outsourcing and
| farming out everything to the cheapest offshore location
| possible and penny pinching to the extreme for their
| consumer oriented parts. I don't blame them though,
| competition in the generic ARM microcontroller market is
| cutthroat and margins are slim and salaries are low and
| your major customers (Sony, Apple, Samsung, LG, etc) keep
| putting the pressure on you to lower your prices or
| threaten to look somewhere else.
| xobs wrote:
| I've most definitely run into issues with others in that
| list, and my reports to the companies never generated an
| errata.
|
| One fun one was that when accessing the External Memory
| Interface module (think: parallel ROM) and switching from
| Bank 1 to Bank 0, the HDMI controller would get reset,
| but only when configured for two banks of 32 MB.
|
| If I did one bank of 64 MB and just used the extra pin as
| CS, it worked just fine.
|
| There are lots of similar quirks in every chip.
| ChuckNorris89 wrote:
| From my time in the semi industry, when reporting issues
| for the errata, it depends who _you_ are.
|
| Are you some hobbyist or small company who sent your
| issue report to some generic company email address? Then
| your report most likely never reached anyone on the team
| working on that chip, but probably reached some clueless
| jobsworth who didn't know what to do with it because
| large semi companies are highly siloed and there's no
| centralized management for such things, so you need
| direct contact with the team responsible for that chip.
|
| If you're a large customer, you then have direct email
| addresses of support engineers, application engineers and
| the engineers who worked on the chip itself, and so your
| reports will definitely be taken seriously.
|
| There's also another issue. There are public datasheets and
| errata which are often not updated or fully transparent,
| and then there are confidential datasheets and errata,
| which are updated and issued under NDA to industrial
| customers on a need to know basis. As a hobbyist you
| rarely get all the truthful info.
|
| It's not ideal, but the semi industry mostly focuses on
| large customers who buy in large volumes of product, not
| on hobbyists and tinkerers.
| 1827162 wrote:
| And that cost cutting is likely due to the rise of
| Chinese STM32 clones from companies most of us have never
| heard of such as GigaDevice, CKS, Geehy, MindMotion, APEX
| Semiconductors, etc...
|
| Some die photos here:
| https://mecrisp-stellaris-folkdoc.sourceforge.io/clones-stm3...
| SassyGrapefruit wrote:
| I think my favorite part of this discussion is that most
| developers give this sort of process/code a mystical vibe. Like
| it's some sacred code built by a priesthood of the world's
| greatest and smartest programmers. Not accessible to mere
| mortals.
|
| I encourage developers that work for me to just read the code
| and documentation. My advice to them is usually something along
| the lines of...
|
| _Think of how you would have done it as a sophomore in college
| completing an overdue assignment... chances are it works just
| like that_
| taspeotis wrote:
| Sure but I have some sympathy here ... a contemporary CPU
| probably has a million different capabilities and if some not-
| even-single-digit-percentage of them don't work that's a lot of
| errata. Plus it's hardware! So it's hard to fix it once it's
| out.
| saagarjha wrote:
| Single digit percentage? CPUs are generally bug-free to
| several orders of magnitude greater than that.
| taspeotis wrote:
| Almost as if they have a not-even-single-digit-percentage
| of bugs...
| rbanffy wrote:
| > dumpster fire of patched together
|
| That's more or less the history of the x86, from the 8086 and
| on. I'd add a couple extra expletives as well. I can imagine a
| number of better ways of seeding CPU state prior to starting it in
| a way they could all be neatly started in parallel.
|
| > I wish companies put more effort into releasing finished,
| polished products.
|
| Some do. It's just that their hardware is expensive and not
| very available outside some niches.
| vaxman wrote:
| "It then reads the initial block of the firmware (the Initial
| Boot Block, or IBB) into RAM (or, well, cache, as previously
| described) and parses it. There's a block that contains a public
| key - it hashes that key and verifies that it matches the SHA256
| from the fuses. It then uses that key to validate a signature on
| the IBB. If it all checks out, it executes the IBB and everything
| starts looking like the nice simple model we had before."
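|
| (A hedged sketch of that check in Python. verify_signature is a
| placeholder the caller has to supply, since the standard library
| has no public-key signature verification; the function names are
| illustrative and not taken from any Intel document.)

```python
import hashlib
import hmac
from typing import Callable

# Hypothetical callback type: (public_key, data, signature) -> bool
Verifier = Callable[[bytes, bytes, bytes], bool]


def key_matches_fuses(public_key: bytes, fused_sha256: bytes) -> bool:
    """Hash the key found in the firmware and compare it with the fuses."""
    digest = hashlib.sha256(public_key).digest()
    return hmac.compare_digest(digest, fused_sha256)


def ibb_is_trusted(public_key: bytes, fused_sha256: bytes, ibb: bytes,
                   signature: bytes, verify_signature: Verifier) -> bool:
    """Accept the IBB only if the key is the fused one and it signed the IBB."""
    if not key_matches_fuses(public_key, fused_sha256):
        return False  # key in flash is not the one the fuses commit to
    return verify_signature(public_key, ibb, signature)
```

| (The point of the two steps: the fuses only need to hold a 32-byte
| hash, while the full key lives in ordinary flash.)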
|
| Should all be performed by ME/PSP now, but the Intel ME/AMD PSP
| can't be updated (even with new microcode) and older versions are
| known to contain serious vulnerabilities. Thus we can reduce all
| of our options down to either (a) not running with Secure Boot or
| (b) running with Secure Boot, but only on the very latest
| Intel/AMD processors. So if we have the very latest processors,
| then we only need to worry about validating "critical" microcode
| updates to them and that doesn't need to happen at power-on at
| all, it can happen in the ME/PSP, but we do need to know about
| them at the C-suite level, since it will impact purchasing of
| replacement gear, financial statements and risk analysis
| decisions.
|
| That means we should power-on directly into the on-die ME/PSP and
| it, in turn, can bring up the RAM, PCI, etc. and check (online or
| on attached storage) for "critical" updates and, if found, emit a
| "distress beacon" on the network while also checking for a signal
| (from a resistor, the network or attached storage) that will
| inform it to download and update (rather than simply halting) the
| processor. This allows Management (the ones who have to sign
| SarBox, not just the system admins) to be informed of emerging
| vulnerabilities in their infrastructure (because of the "distress
| beacons" triggering indicators on their enterprise dashboards) so
| that they may choose to take explicit action to allow processing
| to continue (by emitting the signal that all ME/PSP would look
| for to enable critical microcode updates instead of halting).
|
| Likewise, we need ME/PSP to also start verifying all connected
| communication interfaces (before and after boot) by checking
| digital signatures that chip manufacturers obtain from Intel/AMD
| --much like what developers have to do to get their code to run
| in kernel space on an Apple Mac. I mean, CVE-2022-21742, et al.
| --not just Thunderbolt and USB attacks.
|
| The only other rational solution really is to run with Secure
| Boot disabled (because older Intel/AMD processors cannot be
| trusted anyway --thanks to their non-updatable components like
| ME/PSP-- even with post-production microcode updates, even with
| Secure Boot).
|
| Anyway, 11 years later, will leave this here ..
| https://www.extremetech.com/defense/133773-rakshasa-the-hard...
| mjg59 wrote:
| > the Intel ME/AMD PSP can't be updated
|
| Yes they can, their firmware is in the same flash as the system
| firmware.
| booi wrote:
| This is insanity. I wonder if ARM CPUs were able to start from
| scratch or a better place.
| grishka wrote:
| For one, there never was, and still isn't, any universal BIOS-
| like spec for how an ARM CPU should boot and which peripherals
| it should have.
| [deleted]
| penguin_booze wrote:
| Arm v7 was a Wild West, but with v8, Arm tried to standardize
| a lot. The Arm Trusted Firmware is the reference boot
| firmware implementation for v8+ CPUs:
| https://github.com/ARM-software/arm-trusted-firmware.
|
| I'd think most of the reference documents can be discovered
| from that code base.
|
| Relatedly, from the perspective of hands-on programming, the
| System Programmer's guide is _the_ manual to start with:
| https://developer.arm.com/documentation/den0024/a/.
| yencabulator wrote:
| To drive home how far from sane common ARM bootup sequences
| are, the Raspberry Pi is started by its GPU. You can think of
| an RPi as a proprietary GPU with an auxiliary ARM CPU.
| jsmith45 wrote:
| Yeah, and the GPU code is designed to read the linux kernel
| into RAM, and set it up so it begins executing the linux
| kernel as the initial instructions of the ARM CPU. If you want
| some more normal bootloader like u-boot you need to jump
| through hoops to make sure the GPU based bootloader can treat
| it like a weird Linux Kernel.
|
| (In theory, with source for the GPU 2nd stage bootloader one
| could change things, but RPI foundation does not provide
| access to that source).
| [deleted]
| moose_man wrote:
| When technical debt strangles your entire business. "We'll fix
| this next release"
| mnd999 wrote:
| It's not technical debt, it's a feature. Being able to boot and
| run software from decades ago has served them well in the past.
| mjg59 wrote:
| That's an argument for including support for real mode,
| rather than for coming up in it. Modern systems booting
| legacy software are already transitioning into protected mode
| to run the UEFI stack, and then switching back to real mode
| before passing control to the Compatibility Support Module.
| LoganDark wrote:
| > That's an argument for including support for real mode,
| rather than for coming up in it.
|
| Not really. You have to come up in it to boot software that
| expects to already be in it.
| mjg59 wrote:
| No, firmware needs to hand off in the state the software
| you're booting needs. That says nothing about the mode
| the CPU needs to be in when it starts running firmware.
| LoganDark wrote:
| > No, firmware needs to hand off in the state the
| software you're booting needs. That says nothing about
| the mode the CPU needs to be in when it starts running
| firmware.
|
| This assumes that "firmware" doesn't count against
| backwards compatibility, which isn't necessarily the
| case. Maybe Intel (or AMD) doesn't have a 100% monopoly
| on firmware to be confident enough that the CPU mode is
| an implementation detail. Or maybe some customers do
| indeed run their own "firmware" (maybe embedded?). No way
| to be sure.
| mjg59 wrote:
| Firmware needs to know CPU-specific details (it needs to
| be able to program the memory controller, for instance),
| so skipping (or inverting) the real mode to protected
| mode code in the firmware is just another step in porting
| to a new platform.
| LoganDark wrote:
| Are we talking about the same "firmware" here? If you're
| talking about firmware _loaded directly onto the CPU_
| (like microcode updates are), that runs even before the
| motherboard gets to do anything, then the mode the CPU
| starts in can only be observed after that point anyway,
| so for all we know it probably could already be
| implementing your idea without anyone noticing.
|
| I have objections to changing the way the CPU
| _observably_ starts (i.e. mode in which the BIOS or
| bootloader starts in).
| mjg59 wrote:
| I'm talking about the firmware on the motherboard. Your
| BIOS is CPU-specific - if a future CPU changes the
| default CPU mode, you simply update your BIOS code to
| match while you're doing the rest of the work you need to
| do for that BIOS to run on the new CPU. If the BIOS
| expects to run in real mode (I'm not aware of any modern
| firmware that does, but) then you just add some code to
| switch back to real mode. Otherwise, you probably just
| delete the code that currently transitions from real mode
| to protected mode. That doesn't preclude you switching
| back to real mode if the bootloader expects that.
| LoganDark wrote:
| > I'm talking about the firmware on the motherboard.
|
| Then that's what I thought, yeah.
|
| I don't see why you're explaining how your idea would be
| implemented; I'm rather saying that implementing it in
| that way might be prohibitive if Intel or AMD still have
| customers that expect the CPU to act a certain way. And
| these customers aren't necessarily standard
| desktop/laptop motherboards.
|
| In other words, changing "what mode the CPU starts in"
| would be a big and observable breaking change and not
| _necessarily_ just an implementation detail that can be
| magically worked around by firmware updates like you
| describe.
| mjg59 wrote:
| You usually can't take existing firmware and run it on a
| new CPU, because the new CPU requires different bringup
| code anyway. Take a look at
| https://github.com/coreboot/coreboot/tree/master/src/soc/int...
| to get some idea of
| how many different implementations there are for modern
| Intel alone (there's a bunch more for the pre-SoC style
| Intels). If you already have to port your firmware to a
| new CPU, you can deal with the CPU starting in a
| different mode - it is _entirely_ an implementation
| detail that can be handled in the firmware.
| jeroenhd wrote:
| I believe one of the Playstations or Xboxes runs an AMD64 chip
| that does away with most of the legacy stuff. I read about it
| in an article about getting Linux to run on a homebrewed
| console.
|
| If I remember correctly, this required hacking around a lot of
| assumptions in the Linux kernel. I imagine the Windows kernel
| won't be that different.
|
| If Intel or AMD bring out a CPU that doesn't support any
| operating system in use today (or any UEFI firmware/BIOS
| implementation for that matter) they wouldn't be selling many
| chips. Many vendors outsource their driver update tools to
| third parties, which in turn use tiny operating systems like
| FreeDOS to flash firmware onto devices; it'd suck for them to
| end up needing to rebuild their operating systems.
|
| Likewise, a dedicated GPU also plays a role in the boot
| process, and taking away the legacy assumptions of the GPU boot
| ROM will probably also require flashing any consumer graphics
| card with new firmware as well. Then there's PXE network boot,
| which often still relies on separate firmware, which also
| brings its own expectations about the state of the CPU.
|
| Bringing up a modern CPU is going to be a terrible hack
| whatever you do. I don't see why Intel would need to re-
| engineer their entire boot process. The current system is hacky
| as hell but it works and it doesn't require much more work than
| putting the firmware and microcode images in the right place.
|
| I seriously doubt that redoing their entire boot process and
| guiding everyone from motherboard manufacturers to driver
| programmers on how to use the new system (and to iron out the
| bugs in the new process) will be more cost effective than
| letting all the old code work like it does today. Very rarely
| do complete rewrites make any business sense.
| mjg59 wrote:
| GPU option ROMs no longer make assumptions about legacy setup
| - UEFI option ROMs are executed in either 32-bit or 64-bit
| mode, and there's no need to implement any of the legacy VGA
| compatibility. Same for PXE, which just hooks into the UEFI
| network stack rather than having to deal with anything
| legacy.
|
| No OS assumes real-mode for the boot processor at this point.
| If you boot Linux on a UEFI system you'll jump straight into
| the kernel in 64-bit protected mode. The only time real mode
| comes into play is in the bringup of other CPUs (which is
| something that can be ignored now that ACPI specifies an
| alternative) and ACPI resume (which isn't relevant on systems
| that use S0ix rather than S3), so you could absolutely ship
| an x86 CPU that didn't support real mode and all you'd have
| to do is modify the firmware entry code. Modern operating
| systems would Just Work, as would hardware option ROMs.
|
| (And enough systems no longer ship with CSMs that people
| aren't using FreeDOS for firmware updates any more - it's
| either Linux or a UEFI executable)
| RulerOf wrote:
| >And enough systems no longer ship with CSMs that people
| aren't using FreeDOS for firmware updates any more - it's
| either Linux or a UEFI executable
|
| Is this only for OEM systems? I'm used to seeing these
| happen from inside of Windows, with the exception of
| motherboard firmware all happening inside of the setup
| program. It would make sense for much of that to be EFI
| applications nowadays, although there's not much in the way
| of context around these GUI wrappers to really indicate
| what's going on under the hood.
| userbinator wrote:
| Just like the locked-down mobile devices that have unfortunately
| perverted the nature of general-purpose computing, Boot Guard is
| a tool of planned obsolescence and manufacturer control disguised
| as "security". Want to fix something in the BIOS that they didn't
| want you to have[1][2][3]? Too bad, it's locked and they won't
| release a newer version to force you to buy another. Absolute
| bastards.
|
| [1] https://news.ycombinator.com/item?id=29837884
|
| [2] https://news.ycombinator.com/item?id=28254571
|
| [3] https://news.ycombinator.com/item?id=33650347
| cperciva wrote:
| Another fun thing with SMP: The x86 multiprocessor spec says that
| to start an AP you need to send an IPI, wait 10 ms, then send
| another IPI (IIRC it's an "INIT" followed by a "startup" IPI). On large
| systems, this adds up!
|
| Except that you don't need to wait 10 ms for each AP -- you can
| start up the APs in parallel. There's just one small problem: All
| of the APs start up in the same state -- executing from the same
| CS:IP, _and also the same stack pointer_. Good luck having
| hundreds of CPUs stomping over each other's stacks.
|
| Except that if you're careful, it doesn't matter -- you can even
| make a function call if you want _because all of the CPUs will
| push the same return address onto the stack_.
|
| Implementing this in FreeBSD is on my "speeding up the boot" to-
| do list. I know it's possible though, because someone told me
| that they had already done exactly this in a different (non open
| source) system.
|
| Lexicon for the non-x86 people: SMP = Symmetric MultiProcessing,
| aka more than one "virtual CPU". AP = Application Processor, any
| CPU other than the one which the BIOS starts up for you. IPI =
| InterProcessor Interrupt, how CPUs wake each other up. CS:IP =
| Code Segment + Instruction Pointer, where the CPU is reading
| instructions from.
| pantalaimon wrote:
| I thought Linux does this already, at least there is a patch:
| https://lore.kernel.org/lkml/20230414225551.858160935@linutr...
| __turbobrew__ wrote:
| I wonder if risc-v is much faster to boot since the
| architecture doesn't come with all of this legacy cruft that
| x86 needs to deal with?
| riceart wrote:
| What legacy cruft? (speaking specifically of SMP boot here)
| mrguyorama wrote:
| If successful it will inevitably accrue its own cruft.
| snvzz wrote:
| Boot process being codified in a specification minimizes
| the risk.
| layer8 wrote:
| You mean, like booting in real mode was codified for x86?
| toast0 wrote:
| > All of the APs start up in the same state -- executing from
| the same CS:IP, and also the same stack pointer. Good luck
| having hundreds of CPUs stomping over each other's stacks.
|
| I'm away from my hobby OS to double check, but isn't it the case
| that the Start IPI includes a page number which drives CS? If
| you send those out one by one, you can give each AP its own
| code page and set the stack page based on that (either using
| the CS value to index into something, or as an immediate value
| in the code, that you modify as you copy to the page). Of
| course, if you do a broadcast SIPI, then all of those are going
| to have the same CS. Depending on how much early boot code you
| fancy writing in assembler, you could maybe jump into
| protected/long mode, find the current cpu id, and lookup the
| proper stack pointer without using the stack at all, and only
| then jump into C code? Of course, one probably has nice C
| functions for some of those things, so it doesn't seem nice to
| also have it in assembly.
| toast0 wrote:
| > I'm away from my hobby OS to double check, but isn't it the
| case that the Start IPI includes a page number which drives
| CS?
|
| I double checked, and as I understand it, with traditional
| APIC start IPI, you get to pick the CS address to be (0-255)
| * 0x1000; although how much of the first 1MB of physical
| address space is available within that depends on the system
| memory map. I just start one AP at a time, and use the top of
| the code page as stack space until it switches to the
| intended kernel stack, and then that AP starts the next one.
| That's not time efficient though; you could pretty easily
| start as many APs as you've got low pages available; although
| the option where everybody starts from the same place and
| they figure it out among themselves is probably simpler;
| because there's never a need to wait for an AP to finish
| starting before starting more APs; just saying, you've got
| options, they don't all have to start at the same CS:IP.
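|
| To make the arithmetic concrete, a small Python sketch. The
| function names and the choice to return a tuple are mine; the
| "top of the code page as temporary stack" part is just the scheme
| described above.

```python
PAGE_SIZE = 0x1000  # each Start IPI vector selects one 4 KiB page


def sipi_entry(vector: int) -> tuple[int, int, int]:
    """Return (CS, IP, physical address) where an AP begins executing."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("the Start IPI vector is a single byte")
    cs = vector << 8            # real-mode segment value loaded into CS
    ip = 0                      # execution starts at offset 0
    physical = vector * PAGE_SIZE
    return cs, ip, physical


def temporary_stack_top(vector: int) -> int:
    """Top of the AP's own code page, usable as a stack that grows down."""
    _cs, _ip, physical = sipi_entry(vector)
    return physical + PAGE_SIZE
```

| For example, vector 0x08 starts the AP at 0800:0000, i.e. physical
| 0x8000, with a temporary stack growing down from 0x9000.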
| mananaysiempre wrote:
| If you know how many CPUs you are bringing up, then you can
| allocate a bunch of stacks contiguously and have the CPUs
| race to pick up the next one, say
|
|     mov rsp, STACKSIZE
|     lock xadd [currstack], rsp
|
| Of course, the contention on that xadd is going to cost you
| (if not 10ms per CPU... probably?), and this presumes you
| aren't using the kernel stack pointer for anything (like a
| stable CPU number). To fix that, you probably will need to
| traverse a CPU -> startup data map in assembly. But it's a
| start (no pun intended), and is not as horrendous a hack as
| having multiple CPUs push the same return address onto the
| same stack.
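|
| A simulation of the race in Python, with threads standing in for
| CPUs and a lock standing in for the "lock" prefix. The Cell class,
| bring_up() and the STACK_SIZE value are invented for this sketch.

```python
import threading

STACK_SIZE = 0x4000  # hypothetical per-CPU stack size in bytes


class Cell:
    """A memory word supporting an atomic fetch-and-add, like lock xadd."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, addend: int) -> int:
        # xadd leaves the old memory value in the register operand and
        # stores old + addend back to memory, as one indivisible step.
        with self._lock:
            old = self._value
            self._value = old + addend
            return old


def bring_up(n_cpus: int, first_stack_top: int) -> list:
    """Let n_cpus 'CPUs' race for stacks; return the stack top each one got."""
    currstack = Cell(first_stack_top)
    tops = [0] * n_cpus

    def ap_entry(slot: int) -> None:
        # Equivalent of: mov rsp, STACKSIZE / lock xadd [currstack], rsp
        tops[slot] = currstack.fetch_add(STACK_SIZE)

    threads = [threading.Thread(target=ap_entry, args=(i,))
               for i in range(n_cpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return tops
```

| Which CPU ends up with which stack is not deterministic, but no two
| ever get the same one. Since stacks grow down, currstack has to
| start at the top of the first stack, not the bottom.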
| klempner wrote:
| As an order of magnitude point, my experience has been that
| a bunch of CPUs trying to xadd has a throughput bottleneck
| on the scale of once per 50 to 100 nanoseconds.
|
| But even if you allow an entire extra order of magnitude,
| at one per microsecond, that's still 10000 over the course
| of 10 milliseconds which is plenty for this usecase, at
| least for now.
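|
| The same estimate as a couple of lines of Python, using only the
| figures above (the function name is arbitrary):

```python
WINDOW_NS = 10 * 1_000_000  # the 10 ms wait, expressed in nanoseconds


def xadds_in_window(ns_per_xadd: int) -> int:
    """How many serialized xadd operations fit into the 10 ms window."""
    return WINDOW_NS // ns_per_xadd
```

| xadds_in_window(1000) is the pessimistic one-per-microsecond case;
| 50 and 100 give the measured range.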
| unnah wrote:
| Do you need special case processing for SMT (simultaneous
| multithreading) in there, or is it actually completely
| transparent?
| cperciva wrote:
| As far as the startup process is concerned, SMT is two CPUs.
| I don't actually know how SMT works when one "CPU" has been
| started and the other hasn't... I guess it just pretends that
| it hit a hlt instruction on the unstarted "CPU"?
| JoshTriplett wrote:
| Also, on any modern system, you really don't need the second
| SIPI. The CPU will come up with the first SIPI, and then ignore
| the second SIPI. So you can just send a pile of INITs and then
| a pile of SIPIs (or in theory one broadcast SIPI), and expect
| the CPUs to come up.
|
| For the startup code, you shouldn't need to make a function
| call. A few lines of memory-less stack-less assembly could get
| the CPU number and then change the stack, assuming you have a
| global value that gives the base of a preallocated array of
| stacks.
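|
| For reference, a sketch of the two command words for the broadcast
| variant, computed in Python. The bit positions are the local APIC
| ICR layout as I understand it; the constant and function names are
| mine, and actually writing the words to the APIC (plus any delay
| between them) is not modelled.

```python
DELIVERY_INIT = 0b101       # delivery mode field: INIT
DELIVERY_STARTUP = 0b110    # delivery mode field: Start-up (SIPI)
LEVEL_ASSERT = 1 << 14      # level bit set to "assert"
ALL_EXCLUDING_SELF = 0b11 << 18  # destination shorthand


def icr_low(delivery_mode: int, vector: int = 0) -> int:
    """Build the low 32 bits of an ICR write that targets every other CPU."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in one byte")
    return ALL_EXCLUDING_SELF | LEVEL_ASSERT | (delivery_mode << 8) | vector


def broadcast_startup_sequence(sipi_vector: int) -> list:
    """One INIT to everyone, then a single SIPI carrying the start page."""
    return [icr_low(DELIVERY_INIT),
            icr_low(DELIVERY_STARTUP, sipi_vector)]
```

| With a start page of 0x08 this yields 0x000C4500 followed by
| 0x000C4608.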
| Dwedit wrote:
| How "modern" are we talking here? Core 2 Duo? Arrandale?
| Haswell? Skylake?
| mananaysiempre wrote:
| Per the OSDev Wiki, Pentium Pro and later[1]:
|
| > For newer CPUs (P6, Pentium 4) one SIPI is enough, but
| I'm not sure if older Intel CPUs (Pentium) or CPUs from
| other manufacturers need a second SIPI or not.
|
| When (and if) that became officially sanctioned behaviour
| is another question.
|
| [1] https://wiki.osdev.org/Symmetric_Multiprocessing#Initia
| lisat...
| vardump wrote:
| > A few lines of memory-less stack-less assembly could get
| the CPU number and then change the stack...
|
| Except that TSC_AUX (the MSR that stores the CPU ID number) is
| going to be 0 for all of the cores. Unless you know some
| other way to get the CPU number?
| dfox wrote:
| Core ID in TSC_AUX is essentially a concession to
| userspace. As an OS and firmware you are supposed to
| identify CPU cores by means of their LAPIC ID (as read from
| APIC configurations MSR or from CPUID). A small issue there
| is that APIC IDs are structured according to HT/NUMA
| topology and thus not necessarily consecutive.
|
| On the other hand, as an OS on a PC-like platform you know
| how many cores there are supposed to be and what their
| APIC IDs are beforehand, because they were already enumerated
| by firmware (which is the reason why you can do the one by
| one AP startup sequence in the first place).
| JoshTriplett wrote:
| Exactly: either read the APIC ID and use that to look up
| the CPU number in a table you already have, or arrange a
| location in memory to use xadd to assign a sequential CPU
| number, whichever your OS prefers.
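|
| A small sketch of the table-lookup option, assuming the OS
| already holds the firmware-reported list of APIC IDs. The
| helper names and the example ID list are hypothetical; the
| gaps in the list only illustrate that APIC IDs need not be
| consecutive.

```python
def build_cpu_table(apic_ids):
    """Map each firmware-reported APIC ID to a dense CPU number."""
    return {apic_id: cpu for cpu, apic_id in enumerate(apic_ids)}


def stack_top_for(apic_id, table, stack_base, stack_size):
    """Pick the stack for a CPU from a preallocated array of stacks."""
    cpu = table[apic_id]
    # Stacks grow down, so CPU n uses the top of the n-th slot.
    return stack_base + (cpu + 1) * stack_size


# Hypothetical, non-consecutive APIC IDs as firmware might report them.
table = build_cpu_table([0, 1, 4, 5, 16, 17])
```

| With such a table each CPU gets a stable number regardless of
| the order in which the CPUs start, which the xadd approach
| does not guarantee.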
| jeffbee wrote:
| Huh, I did not realize until reading this that the IME is also
| x86. I assumed it was just whatever was most convenient, which
| seems like it rules out x86, but I guess not.
| mjg59 wrote:
| It was initially ARC, but transitioned to x86 with version 11.
| p_l wrote:
| IIRC some of the variants used embedded SPARC, though they
| are a rare find.
| usr1106 wrote:
| AMD uses ARM for that purpose IIRC.
| anonymfus wrote:
| No, the role of AMD's Platform Security Processor in the
| boot process is completely different from Intel's IME, as the PSP
| is located on the CPU side, so the entire boot process and
| security checks are completely different and would require a
| separate writeup.
| usr1106 wrote:
| Use the cache as RAM. So I guess you could run a small Linux
| system just in cache without any RAM?
|
| Why? As a fun project with some old motherboard for example :)
| JoshTriplett wrote:
| With a fair bit of effort; notably, DMA and IOMMUs probably
| won't work, and most modern devices don't really support PIO.
| You might be able to boot a really simple environment that runs
| out of an initramfs though. It's also entirely not obvious to
| what degree you can use the paging system.
|
| It'd likely be a substantial effort to port Linux.
| usr1106 wrote:
| Sure, I would not expect to do any advanced IO. But right,
| probably there is no serial console really close to the CPU
| either, this is not a Raspberry Pi. No idea what signals
| could be used for that.
|
| Page tables I am not sure either. Could you still do it like
| in Linux 1.0? No idea what things looked like then, but I
| assume much less dedicated hardware support.
| JoshTriplett wrote:
| > probably there is no serial console really close to the
| CPU either
|
| outb to 0x3f8 _might_ work.
|
| I know there were once versions of Linux that supported
| running without an MMU. Those versions, with very limited
| hardware support, _might_ work in this mode.
| mschuster91 wrote:
| It's still part of the kernel code:
| https://www.kernel.org/doc/Documentation/nommu-mmap.txt
|
| And apparently, good enough to run DOOM:
| https://hackaday.com/2022/12/07/a-tiny-risc-v-emulator-
| runs-...
| wtallis wrote:
| Hasn't Intel supported DMA into L3 cache for something like a
| decade now?
| adrian_b wrote:
| It is supported on true Xeon systems.
|
| I believe that it is not supported in Core CPUs, not even
| in those of them which were branded as Xeon E or Xeon W.
| usr1106 wrote:
| Right, I did not specify what L ;) And the original article
| did not mention either.
| usr1106 wrote:
| Substantial effort I don't doubt. First probably years of
| learning how things work under the hood for most normal
| mortals.
| lamp987 wrote:
| DMA on x86 does update contents of CPU caches.
| JoshTriplett wrote:
| Yes, it does, but that doesn't mean the hardware
| necessarily supports doing it without a memory controller.
| Klinky wrote:
| With some of the new AMD Epyc CPUs having over 1GB of L3
| cache, you could run a pretty full-featured Linux distro + app
| entirely in cache.
| undersuit wrote:
| If anyone can verify that Cache as RAM actually works on AMD,
| though, or explain how AMD boots without it, that would be
| great:
|
| >Cache-as-RAM (CAR) is no longer a supportable feature in AMD
| hardware.
|
| https://git.furworks.de/coreboot-
| mirror/coreboot/commit/a245...
| mjg59 wrote:
| The PSP sets up the memory controller before the x86 cores
| are started. It's not implausible that the PSP has some
| sort of cache as RAM stage, but that's before Coreboot
| starts.
| layer8 wrote:
| For a moment I was confused that the PlayStation Portable
| would have x86 cores.
| TacticalCoder wrote:
| That is very interesting.
|
| > I'm also missing out the fact that this entire process only
| kicks off after the Management Engine says it can, which means
| we're waiting for an entirely independent x86 to boot an entire
| OS before our CPU even starts pretending to execute the system
| firmware.
|
| I take it that that OS is Minix?
|
| > But what verifies the first component in the boot chain? You
| can't simply ask the BIOS to verify itself - if an attacker can
| replace the BIOS, they can replace it with one that simply lies
| about having done so. Intel's solution to this is called Boot
| Guard.
|
| Wait... How can an attacker replace the BIOS? Aren't motherboards
| nowadays protected from unwarranted BIOS flashing?
|
| Say I'm an attacker and I got root on some PC (Intel or AMD)
| running Linux, how do I replace the BIOS with a backdoored BIOS
| without the user noticing?
| mjg59 wrote:
| > I take it that that OS is Minix?
|
| It's the Minix kernel, I don't think the userland contains much
| Minix.
|
| > Wait... How can an attacker replace the BIOS?
|
| If you have physical access you can just attach to the flash
| directly and reprogram it. This is very much in-scope for
| various people.
| LoganDark wrote:
| > If you have physical access you can just attach to the
| flash directly and reprogram it. This is very much in-scope
| for various people.
|
| Not to mention just replacing the motherboard since the CPU
| is socketed and could go anywhere.
| superkuh wrote:
| https://archive.is/XoghM
|
| These days dreamwidth.org is harder to view and interact with
| than facebook.com if you don't have an account. We really need to
| stop linking to it and link instead to an archive.is copy or the
| like.
|
| After the run-around to the archived copy I see mgj is still
| complaining about people being able to boot in modes other than
| UEFI. I'm glad these options still exist. Throwing away all the
| legacy computing options would remove many abilities no longer
| possible on modern hardware and software stacks.
| mjg59 wrote:
| > I see mgj is still complaining about people being able to
| boot in modes other than UEFI
|
| I'm not sure how you get that impression, since I'm mostly
| talking about what happens before you get to that point. Having
| the CPU hand off control to the firmware in protected mode
| doesn't preclude the firmware switching back to real mode.
| cronix wrote:
| > These days dreamwidth.org is harder to view and interact with
| than facebook.com if you don't have an account. We really need
| to stop linking to it and link instead to an archive.is copy or
| the like.
|
| I've made the suggestion to Dang in the past to just
| autogenerate an archive.is link for every story/link posted to
| HN and have that be an "alternate link" after the main one. I
| think it's kind of silly some people just post an archive link
| for every post and get a buttload of points for it as everyone
| upvotes it, which games the system. I think it would also be
| good in general to preserve the article as it appeared at the
| time when initially discussed and hasn't been edited, or
| removed, since.
| masfuerte wrote:
| It's fine with js disabled.
| superkuh wrote:
| How do you get through the captcha? It's what blocks me.
| masfuerte wrote:
| It didn't show me one. FWIW, I also have cookies and third-
| party domains disabled.
| [deleted]
| rwmj wrote:
| Intel actually released a variant of the 80386 which booted into
| protected mode and lacked real mode entirely. It was, as far as I
| know, not very successful:
| https://en.wikipedia.org/wiki/Intel_80376
| mjg59 wrote:
| Simultaneously lacking real mode and paging support did kind of
| restrict it to embedded use cases
| senko wrote:
| In 1989, people very much used real mode, so it's not
| surprising this failed. (also, it was for embedded systems only
| according to that Wikipedia article)
|
| In 2023, not so much.
|
| It's baffling to me this is still supported, when all the other
| changes in the hardware basically make it impossible to run
| anything that old on the modern hardware (you can do that in
| QEMU, but you can then emulate x86 in software fast enough
| anyway).
| kevin_thibedeau wrote:
| I boot FreeDOS on a Ryzen system so I can use a parallel port
| device whose program won't work correctly under 64-bit
| Dosemu2. It is an EPROM programmer whose timing requirements
| won't tolerate non-virtualized emulation.
| senko wrote:
| Thanks for providing an actual and relevant use case.
|
| TBH (and I'm wildly speculating here, I'm not involved in
| embedded dev at all), if there were no other options, I
| believe the SW emulation would rise to the challenge to
| make it workable.
|
| If people can faithfully reproduce behaviours of ages old
| consoles to make sure the old games' bugs are still
| preserved, and for free, I'm guessing someone would step up
| in case of x86 if there was a business need.
|
| But since you can still use actual HW to do that, there
| isn't any.
| voxadam wrote:
| What's the actual reason for real mode _still_ being
| supported on modern processors in this day and age? Why
| didn't it die with the advent of AMD64 (aka x86-64)? Why didn't
| AMD skip real mode and boot directly into something more
| modern?
| pwg wrote:
| Most likely because for protected mode to function, there
| is a certain minimal set of housekeeping data tables
| that need to be set up properly (i.e. LDT, GDT, IDT, etc),
| otherwise you'll just immediately take a double fault and
| the CPU will halt.
|
| Real mode exists today as a gateway to setting up all those
| housekeeping data tables so that once the "protected mode"
| switch is flipped on, the CPU will actually find code to
| execute.
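|
| To give a feel for the kind of housekeeping data involved,
| here is a sketch that packs the fields of a legacy 8-byte GDT
| segment descriptor into a 64-bit value. The function name
| gdt_entry is hypothetical; the bit layout follows the
| standard x86 segment descriptor format, and the example
| values describe flat 4 GiB code and data segments.

```python
def gdt_entry(base, limit, access, flags):
    """Pack base, 20-bit limit, access byte and 4-bit flags into 64 bits."""
    return (
        (limit & 0xFFFF)                   # limit bits 0..15
        | ((base & 0xFFFFFF) << 16)        # base bits 0..23
        | ((access & 0xFF) << 40)          # access byte
        | (((limit >> 16) & 0xF) << 48)    # limit bits 16..19
        | ((flags & 0xF) << 52)            # granularity, size, etc.
        | (((base >> 24) & 0xFF) << 56)    # base bits 24..31
    )


NULL_DESCRIPTOR = gdt_entry(0, 0, 0, 0)
# Flat ring-0 segments: base 0, limit 0xFFFFF pages, 32-bit, 4 KiB granularity.
FLAT_CODE = gdt_entry(0, 0xFFFFF, 0x9A, 0xC)
FLAT_DATA = gdt_entry(0, 0xFFFFF, 0x92, 0xC)
```

| A minimal table holds the null descriptor followed by the
| code and data descriptors; the scattered base and limit
| fields are a leftover from the 80286 descriptor layout.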
| toast0 wrote:
| When AMD64 came out, BIOS booting was dominant. You need
| real mode for that.
|
| In today's world, you could probably release a UEFI-only
| CPU and few would notice. But I doubt it would save enough
| space to make a difference. And you'd open yourself to
| criticism from those few that still use real mode: this
| processor is fake, they'd say, because it has no real mode.
| sgjohnson wrote:
| > this processor is fake, they'd say, because it has no
| real mode.
|
| They could also claim that it's not PC compatible.
| Because it literally wouldn't be.
|
| Apple also achieved their dream of the Mac no longer
| being a PC with the release of M1.
| gpderetta wrote:
| - There is likely very little to no cost in keeping it in
|
| - There might be even a cost in removing it.
|
| - Complexity is a barrier to entry to any competitor that
| wants to produce compatible CPUs.
| wg0 wrote:
| This might seem far-fetched, but a RISC-V takeover in a decade or
| two is imminent, even more so looking at the geopolitical vectors
| and trajectories.
| concerned_ wrote:
| [flagged]
| yjftsjthsd-h wrote:
| ? The article very specifically talks about how the process has
| changed over the years, even to the point of UEFI _not_
| starting the OS/bootloader in real mode anymore.
| concerned_ wrote:
| [flagged]
| yjftsjthsd-h wrote:
| > It does not
|
| It literally does:
|
| >>> For modern UEFI systems, the firmware that's launched
| from the reset vector then reprograms the CPU into a
| sensible mode (ie, one without all this segmentation
| bullshit),
|
| > you could not use this article to boot any Intel CPU.
|
| I mean, the article doesn't contain machine code or
| assembly listings, but it does a decent job of describing
| the process.
|
| What is your actual criticism, either of the article or the
| described CPUs?
| concerned_ wrote:
| [flagged]
| MichaelZuo wrote:
| What exactly does "...the CPU is hardcoded to start reading
| instructions from when power is applied." mean?
|
| How do the hardcoded parts read anything with just an electrical
| current?
| deepspace wrote:
| I am not sure what you mean by "just an electrical current".
| The CPU is typically kept in reset until the clock is stable,
| so it has a valid clock signal on startup. It is therefore able
| to start executing the microcode which performs the power-on-
| reset sequence, i.e. start reading instructions from the reset
| vector and so on.
| MichaelZuo wrote:
| > CPU is typically kept in reset until the clock is stable
|
| > It is therefore able to start executing the microcode
|
| At the very beginning, what is keeping it in 'reset' and
| initiating the execution of the microcode? Through what
| means?
|
| From what I understand, in the first few hundred nanoseconds
| it's just an electrical current that's flowing through the
| CPU and nothing else.
| convolvatron wrote:
| yeah, ok, reset would normally be held low - whether that's
| the presence or absence of a current isn't important. in
| the old days there was usually a hardware power
| controller that doesn't raise reset until the power is
| stable. on those machines I think that just took long
| enough that the clock chain had settled out.
|
| these days it's more likely that there is a system
| controller orchestrating the bringup. potentially waiting
| for all the voltage converters and the clock generator to
| settle before raising the reset line. depending on the
| architecture it may not really be a simple pin, but
| addressed by the system board controller through the scan
| network.
| MichaelZuo wrote:
| How is the system controller initiated? I assume via a
| simpler process?
| convolvatron wrote:
| using something like the 'brownout detector' power
| controller above. I'm pretty sure I've seen designs (and
| done this myself) that just put a little RC on the
| reset pin. 100ms oughta be enough for anyone!
| microcontrollers (like those used for board controllers)
| have a much simpler power and clock structure than big
| cpus...and many of them are built to just come up on
| their own.
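|
| As a rough illustration of the RC-on-the-reset-pin trick, the
| sketch below computes how long a charging RC network takes to
| reach the reset pin's threshold, using the standard
| first-order charging formula. The function name and the
| component values are hypothetical, and a real reset input
| would have its own threshold and tolerances.

```python
import math


def rc_reset_delay(r_ohms, c_farads, v_supply, v_threshold):
    """Seconds until a capacitor charged through R reaches v_threshold."""
    if not 0 < v_threshold < v_supply:
        raise ValueError("threshold must lie between 0 and the supply voltage")
    # V(t) = v_supply * (1 - exp(-t / RC)), solved for t.
    return -r_ohms * c_farads * math.log(1 - v_threshold / v_supply)


# Hypothetical parts: 100 kOhm and 1 uF give a time constant of 100 ms.
TAU = 100e3 * 1e-6
# At about 63% of the supply, the delay equals one time constant.
delay = rc_reset_delay(100e3, 1e-6, 3.3, 3.3 * (1 - math.exp(-1)))
```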
___________________________________________________________________
(page generated 2023-04-17 23:02 UTC)