[HN Gopher] Reverse engineering Dell iDRAC to get rid of GPU thr...
       ___________________________________________________________________
        
       Reverse engineering Dell iDRAC to get rid of GPU throttling
        
       Author : f_devd
       Score  : 160 points
       Date   : 2023-05-10 17:29 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | amir734jj wrote:
        | I'm dealing with something similar. I wanted to use Redfish to
        | clear out hard drives, but storage is not standardized across
        | vendors. Dell has a secure erase, HPE Gen10 has Smart Storage,
        | and anything older doesn't have any useful functionality in its
        | Redfish API. What a mess. So I need to use PXE booting and
        | probably WinPE to do this.
        
       | Terminal135 wrote:
       | The repo claims that the servers themselves throttle the GPUs,
       | but isn't it the GPUs themselves that can throttle or maybe the
       | OS? Neither of those are controlled by the server (hopefully) so
       | is there a different system at play here?
        
         | csdvrx wrote:
          | No, that's controlled by the server: try lspci -vv on any
          | Linux system and look at the link speed and width, e.g.
          | "LnkSta: Speed 8GT/s, Width x2" (x2 means two lanes).
         | 
          | Try:
          | 
          | `sudo lspci -vv | grep -P "[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]|downgrad" | grep -B1 downgrad`
         | 
          | Besides the speed, you can have another problem with lane
          | limitations.
          | 
          | For example, AMD CPUs have a lot of lanes, but unless you have
          | an EPYC, most of them are not exposed, so the PCH tries to
          | spread its meager set among the devices connected to your PCI
          | bus, and if you have a x16 GPU, but also a WIFI adapter, a WWAN
          | card and a few identical NVMe drives, you may find that only
          | one of them benchmarks at the throughput you expect.
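A minimal runnable sketch of what that pipeline matches, using canned `lspci -vv` output (the device and link values below are made up); on a real system, pipe `sudo lspci -vv` in instead:

```shell
# Sketch: flag PCIe links running below their capability. lspci marks
# these with "(downgraded)" in the LnkSta line; the sample text here is
# an assumption standing in for real `sudo lspci -vv` output.
lspci_sample='01:00.0 VGA compatible controller: Example GPU
	LnkCap:	Speed 16GT/s, Width x16
	LnkSta:	Speed 8GT/s (downgraded), Width x8 (downgraded)'

# First grep keeps device headers and any "downgrad" lines; second grep
# prints each downgraded link together with the device line above it.
echo "$lspci_sample" \
  | grep -E "^[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]|downgrad" \
  | grep -B1 downgrad
# prints the "01:00.0" device line followed by its downgraded LnkSta
```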
        
           | ilyt wrote:
            | > For example, AMD CPUs have a lot of lanes, but unless you
            | have an EPYC, most of them are not exposed, so the PCH tries
            | to spread its meager set among the devices connected to your
            | PCI bus, and if you have a x16 GPU, but also a WIFI adapter,
            | a WWAN card and a few identical NVMe drives, you may find
            | that only one of them benchmarks at the throughput you
            | expect.
           | 
           | example from my X670E board
           | 
           | * first NVME = 4x gen 5
           | 
           | * second= 4x gen 4
           | 
           | * 2 USB ports connected to CPU (10/5 Gbit)
           | 
           | and EVERYTHING ELSE goes thru 4x gen 4 PCIE bus, including
           | additional 3x nvme, 7 SATA ports, a bunch of USBs, few 1x
           | PCIE ports, network, etc.
        
           | toast0 wrote:
            | > For example, AMD CPUs have a lot of lanes, but unless you
            | have an EPYC, most of them are not exposed, so the PCH tries
            | to spread its meager set among the devices connected to your
            | PCI bus, and if you have a x16 GPU, but also a WIFI adapter,
            | a WWAN card and a few identical NVMe drives, you may find
            | that only one of them benchmarks at the throughput you
            | expect.
           | 
           | Most AM4 boards put an x16 slot direct to the CPU, and an x4
           | direct linked NVMe slot. That's 20 of the 24 lanes; the other
           | 4 lanes go to the chipset, which all the rest of the
           | peripherals are behind. (There's some USB and other I/O from
           | the cpu, too). AM5 CPUs added another 4 lanes, which is
           | usually a second cpu x4 slot.
           | 
           | Early AM4 boards might not have a cpu x4 NVMe slot, and those
           | 4 cpu lanes might not be exposed, and the a300/x300
           | chipsetless boards don't tend to expose everything, but where
           | else are you seeing AMD boards where all the CPU lanes aren't
           | exposed?
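The arithmetic in that layout, spelled out (the counts are the comment's claims about typical AM4 boards, not a spec quote):

```shell
# AM4 CPU lane budget as described above: 24 lanes total, x16 to the
# GPU slot, x4 to a direct NVMe slot, and x4 to the chipset link.
cpu_lanes=24
gpu_slot=16
nvme_slot=4
chipset_link=4
left=$((cpu_lanes - gpu_slot - nvme_slot - chipset_link))
echo "CPU lanes left over: $left"  # 0: everything else shares the x4 chipset link
```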
        
             | ilyt wrote:
             | > Most AM4 boards put an x16 slot direct to the CPU, and an
             | x4 direct linked NVMe slot. That's 20 of the 24 lanes; the
             | other 4 lanes go to the chipset, which all the rest of the
             | peripherals are behind. (There's some USB and other I/O
             | from the cpu, too). AM5 CPUs added another 4 lanes, which
             | is usually a second cpu x4 slot.
             | 
             | Mine just go to second NVMe weirdly enough.
        
               | toast0 wrote:
               | I meant to say the additional x4 is usually a second cpu
               | x4 [NVMe] slot. Not a pci-e x4 slot.
        
             | csdvrx wrote:
             | > Early AM4 boards might not have a cpu x4 NVMe slot, and
             | those 4 cpu lanes might not be exposed, and the a300/x300
             | chipsetless boards don't tend to expose everything
             | 
             | I'm sorry, I oversimplified, and said "most of them" while
             | I should have said "not all of them" as 20/24 is more
             | correct for B550 chipsets (the most common for AM4) instead
             | of trying to generalize.
             | 
              | Your explanation is more correct than mine.
             | 
             | For anyone who might want extra details about the number of
             | lanes per CPU, https://pcguide101.com/motherboard/how-many-
             | pcie-lanes-does-... is a good read that shows the
             | difference for APUs.
        
               | toast0 wrote:
               | I'm still not quite sure what you're trying to say?
               | 
               | Lanes behind the chipset are multiplexed, and you can't
               | get more than x4 throughput through the chipset (and the
               | link speed between the cpu and the chipset varies
               | depending on the chipset and cpu). But that's not a
               | problem of the CPU lanes not being exposed, it's a
               | problem of "not enough lanes" or more likely, lanes not
               | arranged how you'd like. On AM4, if your GPU uses x16,
               | and one NVMe uses x4, then everything else is going to be
               | squeezed through the chipset. On AM5, you usually get two
               | x4 NVMe slots, but again everything else is squeezed
               | through the chipset; x670 is particularly constrained
               | because it just puts a second chipset downstream of the
               | first chipset, so you're just adding more stuff to
               | squeeze through the same x4 link to the CPU.
               | 
               | Personally, I found that link to be more confusing than
               | just reading through the descriptions on wikipedia for a
               | particular Zen version. For example
               | https://en.wikipedia.org/wiki/Zen_3 ... just text search
               | in the page for "lanes" and it explains for all the
               | flavors of chips how many lanes, and how many go to the
               | chipset. Similarly the page for AMD chipsets is pretty
               | succinct https://en.wikipedia.org/wiki/List_of_AMD_chipse
               | ts#AM5_chips...
        
               | formerly_proven wrote:
               | There's a reason why so many motherboard makers avoid
               | putting a block diagram in their manuals and go for
               | paragraphs of legalese instead, and laziness is only half
               | of it.
        
         | formerly_proven wrote:
          | PCIe devices can only draw a limited wattage until the host
          | clears them for higher power. There is also a separate power
          | brake mechanism (an optional part of PCIe) mentioned in the
          | article, which was proposed by Nvidia, so it seems likely
          | their GPUs support it.
        
         | f_devd wrote:
          | I can actually answer this (it's how I stumbled onto the
          | repo): it's through a signal from the motherboard called
          | PWRBRK (Power Brake), pin 30 on the PCIe connector. It tells
          | the PCIe device to maintain a low-power mode; in the case of
          | Nvidia GPUs it's about 50W (300 MHz out of 2100 MHz in my
          | case).
          | 
          | You can check if it's active using `nvidia-smi -q | grep
          | Slowdown` as shown in the post.
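A sketch of what that check looks like in practice; the sample line below is an assumption standing in for real `nvidia-smi -q` output (the exact field wording can vary by driver version):

```shell
# Sketch: detect an active power brake in nvidia-smi's throttle
# reasons. The sample line is made up; on a real box, pipe the output
# of `nvidia-smi -q` in instead.
sample='        HW Power Brake Slowdown        : Active'
if echo "$sample" | grep -q 'Power Brake Slowdown.*Active'; then
  echo 'PWRBRK asserted: GPU is being power-braked by the host'
fi
```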
        
       | somat wrote:
        | BMCs in general leave me uneasy.
        | 
        | I like the idea: a small computer that is used to monitor and
        | control your big computer. But I hate the implementation. Why
        | are they all super-secret special firmware blobs? Why can't I
        | just install my Linux of choice and run the manufacturer's
        | software? This would still suck, but not as badly as the
        | full-stack nonsense they foist on you at this point.
        
         | WaitWaitWha wrote:
          | I was responsible for poking at BMC security in data centers.
          | 
          | BMCs lack security fundamentals and often behave like cheap
          | IoT knock-offs. They often use outdated kernels, libraries,
          | and security mechanisms.
        
           | jdwithit wrote:
            | Yeah, they are utter garbage. For _years_ you had to use
            | Java 6 with absolutely every modern security measure turned
            | off in both the JVM runtime itself and your browser to
            | access Dell DRACs. Accept expired certs, run unsigned code,
            | I'm sure this is all fine ...
           | 
           | I mostly work in the cloud now but when I last had to manage
           | a bunch of physical machines we had a physically separate
           | network accessed via its own VPN to get onto the BMCs.
           | Because yeah, the security situation was a joke.
        
             | hsbauauvhabzb wrote:
              | I found that if you leave your BMC unplugged on a
              | Supermicro, it'll conveniently bridge itself to whatever
              | other Ethernet is plugged in, meaning an outage of your
              | management network may roll over to another network
              | unintentionally.
              | 
              | I'd put money on there being preauth vulnerabilities in
              | those things, judging by the engineering quality.
        
           | kabdib wrote:
           | Ran a fleet of servers with terrible BMCs. We kept those on a
           | well-sealed-off private network.
           | 
           | Woe betide you if you run into one of the BMC implementations
           | that shares a host network interface; no separate cable!
           | These things are terrible from a security standpoint.
        
             | septune wrote:
              | I hope you disabled OS passthrough, because it could get
              | gory.
        
             | [deleted]
        
         | scifi wrote:
         | To add to this, manufacturers (HPE) are now requiring an
         | expensive annual fee in order for customers to use the
         | hardware.
        
         | yjftsjthsd-h wrote:
         | Actually, why aren't they literally normal little computers?
         | Like, fully open, bring your own OS computers. All it needs is
         | some peripherals - Ethernet, its own host USB, gadget mode USB
         | to present mouse/kb/storage to the main computer, video capture
         | card, some GPIO to control power - but there's nothing all that
         | special there; then you just install Debian or w/e and control
         | the perfectly standard interfaces at will.
        
           | wmf wrote:
           | The interface between the BMC and motherboard is unique to
           | each motherboard, especially for "added value" features that
           | some servers have. DC-SCM is working on standardizing this
           | but I don't know how interoperable it will be.
        
         | [deleted]
        
         | donalhunt wrote:
          | I was responsible for trying to improve the management and
          | operation of a large fleet of BMCs for a while. Plenty of
          | bugs, and the pace of releases is slow. :(
         | 
         | Definitely an area where a more open ecosystem would improve
         | the pace of innovation.
        
           | chasil wrote:
           | Sometimes the bugs end up at a security conference.
           | 
           | https://airbus-seclab.github.io/ilo/BHUSA2021-Slides-
           | hpe_ilo...
        
           | hinkley wrote:
           | We need a Linux for BMCs. Oxide is working on one, but I'd
           | like to see a contender fielded from the seL4 community,
           | along with some other folks. For example, why doesn't Wind
           | River have one already?
        
             | NexRebular wrote:
             | Why linux? Why not *BSD?
        
               | gaius_baltar wrote:
               | > Why linux? Why not *BSD?
               | 
               | GPL can force manufacturers to cooperate with users. Of
               | course, they can still use closed source binary modules
               | and userland programs ...
        
               | NexRebular wrote:
                | GPL can also turn manufacturers away. I would rather
                | have variation in the possible BMC operating systems
                | than stick Linux everywhere and contribute to a
                | monoculture.
        
               | ilyt wrote:
                | ... which you wouldn't get with BSD, as they'd just
                | close it down and ship you a binary; they would have no
                | obligation or reason to do otherwise.
                | 
                | The BSD "freedom" is not for the user of the software;
                | it's for corporations to take.
        
               | NexRebular wrote:
               | And how would this affect the original open BSD-based BMC
               | firmware which would be available? If the corporations
               | want to maintain their own fork, it's on them.
               | 
               | The original BSD release will stay open and free for
               | everyone to utilize.
        
               | IntelMiner wrote:
                | How can you trust that a closed-source OpenBSD fork is
                | secure when there's no way to audit the quality (_or
                | lack thereof_) of the firmware the vendor gives you?
                | 
                | If it's GPL, you can at least interrogate the code
                | release and make an informed decision.
        
               | wmf wrote:
               | Software diversity is a luxury customers won't pay for.
        
               | taneq wrote:
               | That sure sounds appealing to manufacturers.
        
             | wmf wrote:
             | OpenBMC is the leader at this point.
        
         | theideaofcoffee wrote:
          | They're special firmware blobs because generally the OEMs
          | aren't building their BMCs from scratch as they might with
          | their main boards and other components. They're generally
          | getting the BMC SoC from the likes of ASPEED and others, who
          | are the ones keeping them closed up. I've tried to get the
          | magic binaries and source for various projects but have
          | given up because there are so many layers of NDAs and sales
          | gatekeepers. I'm not entirely sure who makes the Dell BMCs,
          | but I know Supermicro bundles ASPEED (at least they did with
          | older generations of their main boards).
          | 
          | I agree with you that you should be able to run whatever,
          | since in the end it's just another computer, but the
          | manufacturers believe otherwise since there's "valuable IP"
          | or whatever nonsense (insert rolleyes emoji here).
          | 
          | There are open specs like Redfish, but that still doesn't
          | get to the heart of the matter.
        
         | Y_Y wrote:
          | It may not be "enterprise" enough for a given employer, but
          | it's not hard to replicate most of this functionality with
          | cheap (and open) hardware. For example, I had a case that
          | called for a BMC that I resolved with a spare Raspberry Pi
          | 3B, a very cheap capture device, and a "smart plug". Total
          | cost of materials was about 30 euros, and (for me) it wasn't
          | any harder to operate than an iDRAC.
        
         | ilyt wrote:
          | It would also be sooo much easier to automate everything if
          | it just ran a slightly custom Debian install instead of ...
          | whatever the fuck abomination the manufacturer made.
        
       | [deleted]
        
       | bubblethink wrote:
        | Dell is peak asshole design. They also blast fans at full speed
        | if you install GPUs that you didn't buy from them. Fuck them.
        
         | ryanjshaw wrote:
         | They also stop producing laptop batteries after a few years
         | while refusing to let the laptop charge 3rd party battery
         | replacements, significantly limiting the useful life of their
         | laptops.
        
           | ComputerGuru wrote:
            | You can actually call a small business rep and get them to
            | order one for you. They still produce them, or at least
            | have them in stock; they just don't sell them online.
        
         | flykespice wrote:
            | Isn't that illegal? Hijacking your customer's PC if they
            | install something that isn't from you?
        
           | Maxburn wrote:
            | Nominally it's done for your protection. Lowering power
            | (clocks) and heat load (fast fans) for unapproved gear
            | prevents things from going dead and getting people REALLY
            | mad, and likely reduces warranty claims.
        
         | causi wrote:
         | Don't forget how Dell included/includes DRM in their laptop
         | chargers to prevent customers from buying cheap aftermarket
         | replacements. Of course the wire for the DRM functionality is
         | as thin as possible and is always the first thing to break.
        
           | aeadio wrote:
           | They don't prevent you from using aftermarket chargers. They
           | just display a warning, which can be disabled in the BIOS.
        
             | martijnvds wrote:
             | They also set CPU speed to minimum.
        
         | josephh wrote:
         | In case you're referring to their servers (from
         | https://dl.dell.com/manuals/common/poweredge_pcie_cooling.pd...
         | ):
         | 
         | > The automatic system cooling response for third-party cards
         | provisions airflow based on common industry PCIe requirements,
          | regulating the inlet air to the card to a maximum of 55°C.
         | The algorithm also approximates airflow in linear foot per
         | minute (LFM) for the card based on card power delivery
         | expectations for the slot (not actual card power consumption)
         | and sets fan speeds to meet that LFM expectation. Since the
         | airflow delivery is based on limited information from the
         | third-party card, it is possible that this estimated airflow
         | delivery may result in overcooling or undercooling of the card.
         | Therefore, Dell EMC provides airflow customization for third-
         | party PCIe adapters installed in PowerEdge platforms.
         | 
         | You need to use their RACADM interface to update the minimum
         | LFM for your card.
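A hedged sketch of what that RACADM override might look like. The attribute path (`System.PCIeSlotLFM.<slot>`) and the `LFMMode`/`CustomLFM` names are assumptions drawn from Dell's PCIe cooling document; verify them with `racadm get System.PCIeSlotLFM.1` on your own iDRAC before running anything:

```shell
# Sketch only: attribute names are assumptions, not verified commands.
slot=3      # PCIe slot holding the third-party card (assumption)
lfm=300     # minimum airflow in linear feet per minute (card-specific)

# Print the commands instead of executing them; swap the echo for real
# execution (e.g. over ssh to the iDRAC) once the names are verified.
for cmd in \
  "racadm set System.PCIeSlotLFM.${slot}.LFMMode Custom" \
  "racadm set System.PCIeSlotLFM.${slot}.CustomLFM ${lfm}"
do
  echo "would run: $cmd"
done
```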
        
         | somehnguy wrote:
          | Restrictive nonsense seems common in the server space,
          | unfortunately. HPE does similar things. IIRC they disabled
          | certain features if you used non-HPE "approved" hard drives.
        
           | formerly_proven wrote:
            | Are there even any good server vendors? Dell, HPE, and
            | Lenovo do their lock-in shit. Supermicro's BMC is pretty
            | bad. xFusion is totally-not-Huawei-I-pwomise. There are a
            | few more that come to mind, but all of them are in niches
            | like HPC and don't really do sales on a small scale.
        
           | bg46z wrote:
            | Dell also does this with their EMC storage arrays; it's
            | meant to push you towards their pro services. You are
            | supposed to tell the array to order drives for you from
            | pro services, and someone from some nameless MSP contracted
            | with Dell installs them for you at a 10x markup.
        
         | csdvrx wrote:
          | Other manufacturers do worse and prevent boot if your PCI IDs
          | aren't on a positive list.
          | 
          | This is, for example, present on ThinkPads, and while you
          | could patch the BIOS before, Intel Boot Guard now prevents
          | you from doing that "for your own protection" :)
          | 
          | I hope the MSI leak contains actual Boot Guard keys for Intel
          | 11th gen+ and can be used to allow "unauthorized" PCI modules
          | on modern ThinkPads!
        
           | Avery3R wrote:
            | Boot Guard keys are OEM-specific.
        
             | csdvrx wrote:
              | The MSI leak comments like https://sizeof.cat/post/leak-
              | intel-private-keys-msi-firmware... mention the Boot Guard
              | keys may have been common to other manufacturers: _"It is
              | assumed that the keys for downloading the guard are not
              | limited to compromising MSI products and can also be used
              | to attack equipment from other manufacturers using
              | Intel's 11th, 12th and 13th generation processors (for
              | example, Intel, Lenovo and Supermicro boards are
              | mentioned)"_
        
               | mjg59 wrote:
                | Those are boards that MSI OEMed for other vendors.
        
               | csdvrx wrote:
               | So servers and desktops only, which crushes my hope of
               | getting rid of bootguard on my laptop.
        
           | undersuit wrote:
            | I'd love to buy an AMD Ryzen 3 PRO 5350G, but I don't want
            | to deal with the stupid vendor-locked CPUs floating around
            | from Lenovo pulls.
        
           | wokkel wrote:
            | What worked for some older Think-branded machines (I
            | haven't tried it on my ThinkPad, though) is to update the
            | BIOS (it can be to the same version) and change the serial
            | number to all zeroes (the update script asks if you want
            | that). That got rid of the WiFi whitelist I encountered.
        
             | csdvrx wrote:
              | Very interesting!
              | 
              | Could you please explain which serial? Can you run
              | dmidecode and tell me which handle/UUID is all zeroes?
              | 
              | Even if it may not directly apply to current ThinkPads,
              | it implies the UEFI module might have other conditionals
              | before it goes on to check the positive list - something
              | that should be easy to check by reversing the
              | LenovoWmaPolicyDxe.sct PE32.
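For reference, the serial and UUID in question surface in dmidecode's System Information (DMI type 1) record; a sketch over canned output (the values below are made up):

```shell
# Sketch: where the serial and UUID show up. The sample record is an
# assumption; on a real machine run `sudo dmidecode -t system` instead.
dmi_sample='Handle 0x0001, DMI type 1, 27 bytes
System Information
	Manufacturer: LENOVO
	Serial Number: 00000000
	UUID: 03000200-0400-0500-0006-000700080009'
echo "$dmi_sample" | grep -E 'Serial Number|UUID'
```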
        
         | burnte wrote:
          | HP is way worse than Dell. Fans at full speed I can handle.
          | Servers permanently throwing errors because a part isn't HP
          | branded - that's peak asshole design. So is refusing to
          | report the status of drives if I use a third-party drive
          | sled (which is some folded metal and some LEDs on a flex
          | PCB) AND throwing errors about them.
        
           | BizarroLand wrote:
            | Lenovo soft-bricks their workstations if they detect
            | non-branded cards in them.
            | 
            | No upgrading my WiFi, that's a no-no!
        
             | throwway120385 wrote:
              | The WiFi thing is slightly understandable, because the
              | FCC requires you to limit your radiated emissions, and
              | when you do the certification you have to control the
              | entire configuration to pass the testing, which means
              | that your radio is paired with your antenna cable and
              | antenna in the body of the device. Allowing people to
              | replace the radio without a paired antenna cable and
              | antenna could cause radiated emissions to fall outside
              | the spectrum allowed by the FCC. It's dumb in practice
              | but at least somewhat understandable in principle.
        
           | pxmpxm wrote:
            | Hmm, I've got an HP Z6 that doesn't seem to care?
            | 
            | Are you sure this isn't a "1U rack will fry itself if you
            | put a space heater inside it" type of thing?
        
             | burnte wrote:
              | It's a 5U system, an ML350 Gen9. I ignore the errors,
              | but of course that means that if a real error pops up
              | there, I won't know. It's a lower-urgency server of my
              | own, so it's OK, but it would be annoying in production.
              | 
              | I see the Z6 is a workstation unit; they're going to be
              | more flexible there.
        
         | JLCarveth wrote:
         | > E.g. it is well known that adding a third-party PCIe NIC
         | makes fans run at the maximum speed.
         | 
         | From the article
        
         | otherjason wrote:
         | I observed this behavior in a Dell system about a decade ago,
         | but based on experience over the last 5 years or so with
         | PowerEdge servers, installing a third-party GPU no longer
         | triggers the (extremely loud) maximum fan-speed response.
        
           | BizarroLand wrote:
            | Even if that's the case, the damage is done.
            | 
            | I still tell people about the capacitor plague and how we
            | should have class-actioned Dell out of existence over it.
        
       ___________________________________________________________________
       (page generated 2023-05-10 23:00 UTC)