[HN Gopher] AMD Open-Source GPU Kernel Driver Above 5M Lines, En...
___________________________________________________________________
AMD Open-Source GPU Kernel Driver Above 5M Lines, Entire Linux
Kernel at 34.8M
Author : TangerineDream
Score : 114 points
Date : 2023-08-31 16:47 UTC (6 hours ago)
(HTM) web link (www.phoronix.com)
(TXT) w3m dump (www.phoronix.com)
| pixelesque wrote:
| It's not completely clear from the article, but: are the files
| generated 'on-the-fly' during the build process (and therefore
| not in git), or generated once (by AMD), and then committed?
| mpreda wrote:
| Pre-generated by AMD and committed, I assume.
|
| If they were generated as part of the build, they would not be
| counted as SLOC (not being "source").
| rjsw wrote:
| Not read the article but the files in the Linux tree have been
| generated once by AMD.
| tgsovlerkhgsel wrote:
| ... and it doesn't work right. When you start googling for your
| syslog entries you find countless reports spanning many kernel
| versions of identical looking crashes, likely with different root
| causes since all the message basically says is "the GPU hung".
| shmerl wrote:
| This could be expressed in a binary format using far less space,
| but expressing it as code / text, I suppose, makes it more
| suitable to call it source.
| 1letterunixname wrote:
| Corporations don't incentivize good engineering, they incentivize
| functionality at any cost. This leads to giant codebases, over-
| engineering, bad engineering, fragility, unmaintainable, useless
| code, and duplication. The FOSS/FLOSS community must push back
| against the hot mess turds corporations want to dump into their
| source.
| Kab1r wrote:
| I would much rather have a large amount of in-tree driver source
| over a small driver with a "large" firmware binary.
| Thaxll wrote:
| The Linux kernel is not made of 34M loc; most of it is drivers,
| which I hardly consider kernel code.
| agloe_dreams wrote:
| I'm not sure I get why the comparison to the Kernel is needed.
| GPUs are wildly complex. Rendering is wildly complex. Managing
| memory and data is complex. Managing connected hardware is
| complex. I am not sure why anyone would expect a GPU Driver to be
| small while also doing a billion things and playing games as well
| as mature gaming platforms.
| sylware wrote:
| It is said that NVIDIA's hardware programming interface is much
| simpler than AMD's.
|
| If true, AMD is doing something wrong here. And yes, gigatons of
| generated headers related to registers.
| harry8 wrote:
| If you're not intimately familiar with GPU drivers and what
| goes on, this gives you a very quick, back-of-the-envelope sense
| of the size and complexity of the work involved. 1/7th the size
| and complexity of the kernel for this one driver.
|
| I raised an eyebrow but I have only the vaguest notion of how
| the hardware works and what a driver might have to manage.
| dralley wrote:
| As the article pointed out, the vast majority of the lines of
| code in the driver are autogenerated header files for things
| like defining hardware registers. There's not much complexity
| or logic in that type of code.
|
| Probably if AMD wanted to spend the time, they could compress
| it down to a fraction of its current size.
| jmole wrote:
| Right, if you have 7 different architectures, each with
| its own register map, and then model-specific tweaks,
| you're going to have a ton of code like that.
| undersuit wrote:
| We could just compile it into a proprietary blob like
| Nvidia! /s
| scns wrote:
| > if you have 7 different architectures
|
| GPU or CPU? If talking about the latter, only two [four]
| should count (ARM & x86 [+ [* 2 64BitVersion]]). If you
| meant the former, forget my comment.
| benlwalker wrote:
| Is it really that much code? I don't know GPU hardware,
| but the NVMe spec header file in SPDK is around 4k
| lines[0]. If there are 7 of them and they're twice as
| complicated each, we're still well under 100k from
| register map headers. I didn't actually look through
| Linux to see how big they are, so maybe it is that much
| more complex.
|
| 0: https://github.com/spdk/spdk/blob/master/include/spdk/nvme_s...
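
The estimate in the comment above can be written out as a quick calculation. This is only a sketch of that arithmetic: the 4k-line header size, the 7 architectures, and the "twice as complicated" factor are the commenter's rough figures, and the variable names are illustrative.

```python
# Rough upper bound on register-map header size, using the figures
# quoted in the comment (all of them approximations).
nvme_spec_header_lines = 4_000   # approximate size of the SPDK NVMe spec header
architectures = 7                # architecture count assumed upthread
complexity_factor = 2            # "twice as complicated each"

estimated_header_lines = (
    nvme_spec_header_lines * architectures * complexity_factor
)

print(estimated_header_lines)            # 56000
print(estimated_header_lines < 100_000)  # True
```

Under those assumptions the headers would come to about 56k lines, which is why the commenter finds a multi-million-line figure surprising.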
| tester756 wrote:
| >If you're not intimately familiar with GPU drivers and what
| goes on, this gives you a very quick, back-of-the-envelope sense
| of the size and complexity of the work involved. 1/7th the size
| and complexity of the kernel for this one driver.
|
| ehh, no.
|
| almost all of this is header files
|
| >Meanwhile the open-source NVIDIA "Nouveau" driver is around
| 201k (21.7k blank lines, 24.3k lines of comments, and 155k
| lines of code). Or the Intel i915 DRM kernel graphics driver
| is around 381k lines via the same cloc judgment.
|
| so it seems like a GPU driver is around 1% of the kernel's code
|
| and you start wondering why the kernel actually has this much
| code if a GPU driver (of all software) needs just around 1%.
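
The ratios being argued about here can be checked directly from the line counts quoted in the thread. A minimal sketch, assuming the headline figures (5M is a floor for the AMD driver, 34.8M for the whole kernel tree) and the cloc totals quoted above for Nouveau and i915; the dictionary keys are just labels.

```python
# Line counts as quoted in the thread and the article headline.
kernel_lines = 34_800_000
driver_lines = {
    "AMD": 5_000_000,    # "above 5M", mostly auto-generated headers
    "Nouveau": 201_000,  # total lines, including blanks and comments
    "i915": 381_000,
}

# Share of the whole kernel tree taken up by each driver.
shares = {name: lines / kernel_lines for name, lines in driver_lines.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the kernel tree")
```

With these inputs the AMD driver comes out at roughly 14% (about 1/7th), Nouveau at roughly 0.6%, and i915 at roughly 1.1%, so both the "1/7th" and the "around 1%" figures are consistent with the quoted counts.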
| nvm0n2 wrote:
| The NVIDIA proprietary driver is about the same size
| compiled as the Linux kernel, iirc.
|
| The reason is, GPU drivers are basically complete operating
| systems, just for the secondary computer we call the GPU
| instead of the CPU.
| boppo1 wrote:
| Nouveau barely works iirc
| 1-6 wrote:
| "Of course, much of that is auto-generated header files... A
| large portion of it with AMD continuing to introduce new auto-
| generated header files with each new generation/version of a
| given block. These verbose header files has been AMD's
| alternative to creating exhaustive public documentation on their
| GPUs that they were once known for."
|
| So what's the point of saying that it's large?
| dijit wrote:
| I read it as a bit of a negative situation. So the reason for
| mentioning it is to shame AMD into doing a more correct or sane
| thing instead of spewing out enormous amounts of what is
| basically repetitive noise.
|
| Pointing out that enormity is important because source files
| need to be stored, interpreted, versioned and parsed by
| humans/IDEs. It has an externalised cost (but, then again,
| isn't capitalism all _about_ externalising costs?)
| throwaway193439 wrote:
| Because it's large and large is difficult to maintain.
|
| AMD maintains it but do we know how they are generated?
| Probably not.
|
| It's like a gift that stinks but you can't complain about
| because it's a gift.
| FirmwareBurner wrote:
| _> but do we know how they are generated? Probably not_
|
| Having worked in the semi industry, I can hazard a guess:
| It's a spaghetti mess of cascading Perl scripts that parse
| the Verilog/VHDL design files, with their development going
| back 20+ years, full of comments like "don't touch this line
| because it breaks another line, nobody knows why", and
| maintained by a team where a gray-beard "Gandalf" engineer
| wearing an ATI t-shirt, has most of the deep-down low-level
| knowledge on how to un-fuck them whenever they get fucked,
| pardon my french.
| trollied wrote:
| There's bound to be some tcl in there too...
| baq wrote:
| Having also worked in the semi industry, I can say this is spot on
| sshine wrote:
| I haven't worked in the semi industry, but I've worked
| with EEs and Perl programmers, and they do love that
| undocumented lore. And the universe does reward you with
| a grey beard after enough Perl.
| trws wrote:
| I have not seen these scripts, but can confirm that AMD has
| a long history of such Perl scripts. Look at hipcc for a
| current, moderately frustrating, example of this. Also the
| last time I met one of the open source driver team in
| person he was, in fact, wearing a classic red ATI
| t-shirt straight in from Markham. Much of that team is
| European now though from what I hear, and they're generally
| a good bunch.
| FirmwareBurner wrote:
| _> ATI t-shirt straight in from Markham._
|
| Curious how much of AMD Radeon GPU development is now
| being done in Markham, Canada, as AFAIK the modern Radeon
| architecture stems from ATI's acquisition of ArtX[1], a
| US-based spin-off of SGI, which was responsible for the
| GPUs in the Nintendo GameCube, Wii and many other
| innovations like programmable shaders, later found in
| ATI/AMD GPUs.
|
| _> Much of that team is European now though from what I
| hear, and they're generally a good bunch._
|
| I didn't know AMD has a GPU design team in Europe. Where?
| I know they had a fab in Germany and they have an office
| for the Ryzen and Infinity Fabric R&D in Romania, but I
| had no idea they do GPU stuff as well in Europe. Where is
| that office?
|
| [1] https://en.wikipedia.org/wiki/ArtX
| ahartmetz wrote:
| >I didn't know AMD has a GPU design team in Europe
|
| AFAIK they don't, but the Linux driver guys seem to be
| mostly German and Polish and such. And yeah, they are
| doing good work. I half-expect AMD to reboot their
| Windows driver from the Linux driver code base at some
| point.
| SXX wrote:
| > AMD maintains it but do we know how they are generated?
| Probably not.
|
| Basically those files are generated from AMD GPU register
| data files where the majority of registers are documented, but
| there are of course a bunch of magic numbers as well, probably
| because they belong to HDCP or other cases where
| documentation is only available under NDA.
|
| There have been a number of leaks of AMD internal documentation,
| so anyone who is into GPU drivers can really find a lot of
| information on their GPUs' internal workings.
|
| I archived some of it many years ago and it was never
| DMCA'ed:
|
| https://github.com/ArseniyShestakov/rai-bonaire
|
| The source was a talk at CCC.
| mrweasel wrote:
| So while the AMD driver is open source, the community is
| basically excluded from contributing?
|
| Should someone decide to start working through the code,
| removing duplicate code and cleaning up headers, functions and
| abstractions, their work would either be rejected, or undone
| with the next AMD code dump?
| kube-system wrote:
| A lot of open source projects work that way. Open source means
| you get access to the source and get to make changes for your
| own use. It doesn't mean you get to force anyone else to merge
| your code.
| mrweasel wrote:
| > It doesn't mean you get to force anyone else to merge your
| code.
|
| Sure, you can fork the code if you really feel that strongly
| about it. My main "issue" is that it basically removes one of
| the big benefits of open source, that we can collaborate and
| do better as a collective. If it's just a big code dump that
| other kernel developers can't really touch it's more "source
| code is available" than actual open source.
| acrispino wrote:
| It's not open source unless you can have it your way?
| That's too picky, for me
| elteto wrote:
| The fundamental premise of open source is full access to
| the source with the possibility to make changes and
| redistribute those changes [0]. Anything else, including
| collaboration to improve the code, is a nice cherry on top
| but not consequential to the concept of open source.
|
| [0] https://opensource.org/osd/
| kube-system wrote:
| Open source is a licensing model, not a community
| organization model. Collaboration is not a benefit of open
| source, it's a benefit of collaboration software and a
| group of people who welcome collaboration. Almost all of
| the people who like collaborating on software use open
| source licensing. But there are _plenty_ of people who use
| open source licensing who are not interested in
| collaborating. For example, it is very normal for projects
| maintained by someone with a narrow focus, or projects with
| limited or formally organized resources to not accept PRs.
|
| When you send someone a PR, you are demanding that they do
| work for you to review and merge. Open source licensing
| does not mandate that they do this. Heck, most open source
| licenses even disclaim warranty to avoid obligating the
| authors to do even the work that _the law_ would otherwise
| require them to do. Now yes, some people will help you with
| problems. This is because they're nice, not because it's
| open source.
| guardiangod wrote:
| I am working on the kernel right now, the code is very pleasant
| (as far as C code goes) to work with.
|
| Whereas I worked on Chrome's V8 C++ code for a year and I still
| could not say I understand more than half of it. Its complexity
| is a factor more than the Linux kernel.
| tiffanyh wrote:
| As a comparison:
|
| FreeBSD: ~9M loc
| NetBSD:  ~7M loc
| OpenBSD: ~3M loc
|
| And this _includes_ the base userland (not just kernel)
|
| https://www.csoonline.com/article/564373/is-the-bsd-os-dying...
| rjsw wrote:
| NetBSD currently contains an older version of this driver, from
| Linux 5.6. Checking just now, it comes to 2.2M loc. Running the
| same test on the Linux 6.4 source tree does give me the
| reported 5M loc.
|
| Maybe the figures you quote exclude things imported from
| elsewhere, like gcc and llvm; I get a figure of 75M loc for base
| + kernel of NetBSD-10.
| somat wrote:
| The OpenBSD situation is even worse: over there the driver is
| bigger than the rest of the kernel.
|
| Don't get me wrong, I use the driver every day, AMD is
| definitely one of the good guys for making an open source driver,
| and those who ported it are absolute heroes. However... sometimes
| I wish AMD had tied down the ISA of their cards a little better.
| Narrowed the interface, if you will. As it is, the driver is so
| big because there is this combinatorial explosion of generated
| header files.
|
| https://flak.tedunangst.com/post/watc
| sroussey wrote:
| There is no business reason to restrict themselves on the ISA,
| and it would make their hardware less performant compared to
| the competition, which would not be so bound.
| dogma1138 wrote:
| The competition is very much bound to a rather narrow ISA
| which is why CUDA is forward and backwards compatible whilst
| ROCm isn't.
|
| ROCm will be pointless until at least forward compatibility
| is guaranteed by design.
| nimish wrote:
| CUDA is subsequently compiled to the hardware assembly at
| runtime, isn't it? Like precompiled shaders.
|
| C++ -> PTX -> Hardware ISA
| Laaas wrote:
| The issue is that the generated code is checked in. Surely
| there's a better solution.
| tux3 wrote:
| They _could_ in theory post the (no doubt) Perl scripts that
| generate those headers from the HDL along with the relevant
| source files, but I imagine that would be a _very_ hard sell.
| And probably not much more helpful to the kernel, as no one
| reads those headers anyways, and the compile time will not
| improve by shuffling where the generation step happens.
|
| It may be more practical to rework the scripts to try to find
| ways to reduce the verbosity and redundancy. The actual .c
| driver code probably doesn't need every copy of every line
| in all those .h files.
| alfalfasprout wrote:
| As long as it's deterministic there should be no issue
| checking in the generators right?
| tux3 wrote:
| The generators themselves probably not, but the
| definition of all those registers comes from the hardware.
| These kinds of code generators convert input source files
| that describe the hardware into C header files for the
| software.
|
| But I expect AMD would be skittish about open-sourcing
| anything that could even remotely be construed as HDL,
| even if it's just dry lists of registers. Open sourcing
| the drivers is one thing, but the hardware itself is
| another.
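
To make the kind of generator being discussed concrete, here is a minimal sketch of a script that turns a register description into a C header. Everything in it is hypothetical: the register names, offsets, field layouts, and the macro naming scheme are invented for illustration, and the thread itself only guesses that AMD's real tooling is Perl driven by HDL sources.

```python
# Hypothetical register description. In a real flow this would be
# derived from hardware design data, not written by hand.
REGISTERS = [
    {"name": "EXAMPLE_CTRL", "offset": 0x0000,
     "fields": [("ENABLE", 0, 1), ("MODE", 1, 3)]},   # (field, shift, width)
    {"name": "EXAMPLE_STATUS", "offset": 0x0004,
     "fields": [("BUSY", 0, 1)]},
]


def generate_header(block, registers):
    """Return the text of a C header with offset, shift and mask macros."""
    guard = f"_{block.upper()}_REGS_H_"
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for reg in registers:
        lines.append(f"#define reg{reg['name']} 0x{reg['offset']:04x}")
        for field, shift, width in reg["fields"]:
            mask = ((1 << width) - 1) << shift
            prefix = f"{reg['name']}__{field}"
            lines.append(f"#define {prefix}__SHIFT 0x{shift:x}")
            lines.append(f"#define {prefix}_MASK 0x{mask:08x}")
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"


header = generate_header("example", REGISTERS)
print(header)
```

Each register contributes one line plus two per field, so a few thousand registers repeated per hardware block and per generation adds up quickly. The script itself reveals little, but the input table is exactly the hardware description the comment above expects AMD to be reluctant to publish.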
| trevithick wrote:
| Does a graphical representation of the files in the Linux kernel
| exist anywhere? Like a graphical file explorer but for the
| different kernel components.
| fooker wrote:
| Yeah the files are organized into a directory hierarchy, pretty
| cool tech! :-)
|
| And there are great tools for exploring directories of files,
| my current favorite is dolphin with two or three panes.
| Sjonny wrote:
| you can run windirstat (or similar tool) on a checkout to get
| an idea
| trevithick wrote:
| Yeah, I guess this is the answer. When I posted the question
| I had this[1] in mind, and was thinking of something like
| that with simplified labels maybe. But I guess the file
| structure is so organized it would explain itself to anyone
| interested in this kind of thing.
|
| [1] https://upload.wikimedia.org/wikipedia/commons/d/d5/GNOME_Di...
| pavon wrote:
| Wikipedia has a graph showing a high-level breakdown of the
| kernel tree, and the size of the components[1]
|
| [1]
| https://en.wikipedia.org/wiki/Linux_kernel#/media/File:Sanke...
| Osiris wrote:
| Why are GPU drivers baked into the kernel?
|
| Wouldn't it be better to load them in such a way that a crash in
| the GPU driver can be recovered from as opposed to crashing the
| whole system?
|
| Other operating systems load the GPU drivers separately.
| bendhoefs wrote:
| Why should having the GPU drivers checked into the same
| repository mean that they can't be loaded and unloaded
| dynamically?
| amelius wrote:
| How much of it is generated code?
___________________________________________________________________
(page generated 2023-08-31 23:01 UTC)