[HN Gopher] Giving Rust a chance for in-kernel codecs
___________________________________________________________________
Giving Rust a chance for in-kernel codecs
Author : orf
Score : 73 points
Date : 2024-04-26 21:28 UTC (1 days ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| jeffbee wrote:
| > raw pointer arithmetic and problematic memcpy() calls can be
| eliminated, array accesses can be checked at run time, and error
| paths can be greatly simplified. Complicated algorithms can be
| expressed more succinctly through the use of more modern
| abstractions such as iterators, ranges, generics, and the like.
|
| I know why people choose Rust _today_, but we could have had all
| of those benefits decades ago if not for the recalcitrance of
| reflexive C++ haters.
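
For readers unfamiliar with the quoted claim, here is a minimal sketch of what "checked accesses instead of memcpy()" and "iterators instead of pointer arithmetic" can look like in Rust. The function names `checked_copy` and `block_sums` are invented for illustration and do not come from the article or the kernel; the code uses only the standard library, whereas kernel code would build on `core` and the kernel's own crates.

```rust
/// Copies `src` into the front of `dst`.
/// Returns None instead of writing past the end of `dst`
/// (hypothetical helper, standing in for an unchecked memcpy()).
fn checked_copy(dst: &mut [u8], src: &[u8]) -> Option<()> {
    // `get_mut` yields None when the requested range is out of bounds.
    dst.get_mut(..src.len())?.copy_from_slice(src);
    Some(())
}

/// Sums every complete 4-byte block of `data`.
/// `chunks_exact` replaces manual pointer arithmetic; a trailing
/// partial block is skipped rather than read out of bounds.
fn block_sums(data: &[u8]) -> Vec<u32> {
    data.chunks_exact(4)
        .map(|block| block.iter().map(|&b| u32::from(b)).sum::<u32>())
        .collect()
}
```

The out-of-range case becomes an ordinary `None` that the caller has to handle, which is the sense in which the quote says error paths get simpler.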
| pas wrote:
| Of course, but ergonomics, vibe, or look-and-feel are
| important. C++ doesn't have a package manager, the .hpp thing
| is a mess (modules took ages), SFINAE, preprocessor macros vs
| proc-macros, and so on.
|
| That said, Linux is a very conservative and idiosyncratic
| project. It's really the bazaar. (No issue tracker, firehose of
| emails, etc.)
| vlovich123 wrote:
| is the kernel using cargo? I don't think the package manager
| is relevant.
| steveklabnik wrote:
| It is not using Cargo.
| pengaru wrote:
| absolutely not
| jsheard wrote:
| > (modules took ages)
|
| They're arguably still not there yet, sure the standard was
| ratified years ago but the implementations are still a mess
| and uptake is almost nonexistent.
| flohofwoe wrote:
| Not to mention: what C++ subset to use, and how to deal with
| the endless discussion-circlejerks around that one topic
| alone.
|
| For instance most of the C++ stdlib is useless for kernel (or
| embedded) development, so you end up with a C++ that's not
| much more than a "C with namespaces". At least Rust brings a
| couple of actual language improvements to the table (not a
| big fan of Rust, but if the only other option is C++, then
| Rust is the clear winner).
| jeffbee wrote:
| None of the C stdlib can be used in the kernel, either, and
| yet the kernel is written in C.
|
| Zircon is an existence proof that you can use the C++ std
| library in a kernel. Do they use every damned thing in std?
| No, but they do use array, pair, unique_ptr, iterators, and
| more.
| SAI_Peregrinus wrote:
| > None of the C stdlib can be used in the kernel, either,
|
| Except the bits allowed in a freestanding environment and
| (stretching the definition of "used in") the "nolibc"
| partial C standard library the kernel includes for
| environments that don't have any other libc available
| (`tools/include/nolibc`, mostly used for kernel tests
| where there's no userspace at all).
| dale_glass wrote:
| C++ even today has plenty of footguns that will make code unsafe,
| and some of the safety comes at performance costs, or uses
| features that are ill-suited for the kernel.
| b20000 wrote:
| there is nothing problematic about memcpy, the problem is that
| you need to know what you are doing and rust won't solve that
| problem.
| jeroenhd wrote:
| It took C++ decades to get good, in my opinion. If the kernel
| team had switched to C++ decades ago, we probably would've
| ended up with a worse kernel code base. For me, C++11 and C++14
| created the modern C++ that's actually usable, but compilers
| took a while to implement all those features (efficiently and
| correctly).
|
| I find the C++ equivalent of the Rust iterators and such to be
| even harder to read (almost an accomplishment, given the
| density of Rust code); I don't think features like ranges would
| be expressed more succinctly using std::ranges the same way it
| can be done in Rust, for instance. I also find C++'s iterators'
| API design rather verbose, and I don't think there's much good
| to be said of implementing generics through C++ templates.
| There are good reasons for why the language was designed this
| way, but I get the impression succinctness didn't seem to be a
| primary objective designing them. Rather, I get the feeling
| that the language designers prioritised making the features
| accessible to people who already knew C++ and were used to the
| more complex side of C++.
| akira2501 wrote:
| > the recalcitrance of reflexive C++ haters.
|
| I don't reflexively hate C++, just all the implementations of
| it.
| codedokode wrote:
| I don't understand why video codecs must be in kernel and run
| with supervisor privileges. Why can't they run in a userspace?
| ben-schaaf wrote:
| This is for hardware encoding/decoding.
| ec109685 wrote:
| I believe the reason is that it's not safe to send arbitrary
| bitstreams directly to hardware decoders and given the
| "stateless" nature of them, you need something trusted to run
| the full video encoder / decoder logic.
| fulafel wrote:
| What happens if you send bad bitstreams to the hardware?
| Ecoste wrote:
| Explosions
| adastra22 wrote:
| You can have it overwrite kernel memory.
| screcth wrote:
| Isn't it possible to restrict what memory regions a
| device may have access to?
| adastra22 wrote:
| No, it's on the PCIe bus. It sends data straight to RAM,
| circumventing the memory controller on the CPU.
|
| _SOME_ systems have write protection logic on the RAM
| controllers themselves, but this is not universal.
| Firerouge wrote:
| > SOME systems have write protection logic on the RAM
| controllers themselves
|
| How can one tell if a system has RAM controller based
| security, what name does this write protection go by?
| stefan_ wrote:
| Maybe the grandparent is trying to refer to _IOMMUs_.
| adastra22 wrote:
| This. Your system will almost certainly have an IOMMU.
| But that can't be said for all systems that the Linux
| kernel supports.
| surajrmal wrote:
| Iommus are very common on PC grade hardware, as well as
| premium smart phones. They just aren't that common on
| lower end phones and iot devices, but pcie is also fairly
| uncommon on those devices which makes your comment a bit
| confusing.
| AlotOfReading wrote:
| There's a _lot_ of moderately powerful embedded Linux
| systems out there that either don't have an MMU
| equivalent or have one that the vendor BSP doesn't use by
| default. To give one example, Xilinx doesn't set up the
| SMMU for AXI DMA devices on Zynq by default, iirc.
| johntb86 wrote:
| In theory the hardware should return corrupted video,
| return an error, or at least hang, but not anything worse.
| It's worse if the data structures specifying memory buffers
| are incorrect; then you may be able to read/write arbitrary
| memory.
| Animats wrote:
| Which hardware decoders are not callable from user space
| and unsafe to call from kernel space?
| saagarjha wrote:
| Badly-implemented ones
| gary_0 wrote:
| I had the same question. If your chip can do, say, the DCT in
| hardware, why not just expose that unit directly to userspace?
| And if userspace sends invalid data to that unit, surely the
| kernel can just handle the fault and return an error? I must be
| missing something.
|
| At any rate, it's unfortunate that entire media file formats
| have to run in kernel space in order to implement hardware
| acceleration. There's no better way to do it?
| jeffbee wrote:
| The kernel has to interpose at least a little bit because
| some of these hardware devices can read and write anywhere,
| so you can't just let random users send them commands.
| gary_0 wrote:
| Why would a hardware codec need the ability to arbitrarily
| read/write the entire address space, though? That seems
| like a needlessly dangerous design when it could easily
| have a register that restricts it to a kernel-designated
| address range.
| drdaeman wrote:
| It shouldn't, but we live in a world where cheapest
| products that just barely work overwhelmingly win the
| market (most obvious proof is IoT, but it applies to just
| about everything else). General consumer market doesn't
| care about proper designs, doesn't understand geeky
| security concerns and whatever - if something somehow
| works satisfactorily enough for acceptable number of
| situations - it gets released and marketed, successfully.
|
| Sorry. I hate it too.
| gary_0 wrote:
| Ah. I see. And then the kernel needs to bend over
| backwards to cover up the security holes and design
| flaws. :(
| toast0 wrote:
| To avoid the cost of copying the data from wherever it
| was to where it needs to be...
| Veserv wrote:
| The general answer is so that you can get the output at a
| program-configurable location rather than a hardware-
| fixed location. For instance, I want to shove out a frame
| of data and then get the decoded data back into an
| appropriately-sized static buffer. The hardware could
| always dump that at memory address 0x5000 and then I
| could copy it into my buffer, but from a programming
| perspective it is a lot cleaner if it just shoves the
| bytes directly into my buffer.
|
| It is analogous to passing a pointer to a buffer to a
| library function so that the library can write the
| contents of the buffer directly. You rely on the library
| (device) operating properly and not writing outside of
| the designated buffer. That is the baseline.
|
| As you state, you could add an enforcement mechanism that
| defines the extents of memory the external actor is
| allowed to store like how many language runtimes check
| and disallow out-of-bounds accesses. However, if you have
| multiple outstanding buffers, control structures, or
| other complex device-accessible structures then enforcing
| precise "bounds" checking rapidly demands very complex
| "bounds" definition/enforcement. Language runtimes can do
| this because they support arbitrary code, but enforcing
| in hardware that, say, every access lies within the nodes
| of a "user-constructed" red-black tree rapidly becomes
| infeasible.
|
| You basically get one of two options at that point,
| either you rely on the hardware working properly and do
| nothing, or you design your driver to only require
| coarser isolation that fits within hardware-definable
| boundaries. Most opt for the former. If you do the latter
| then there are various ways of actually enforcing the
| isolation such as IOMMUs or you could have a DMA
| controller that basically functions as an IO MPU (define
| ranges the device can access) (I am not actually aware of
| any DMA controllers that actually do this as a security
| measure, but it is theoretically possible).
|
| You do have to be careful that they actually enforce the
| isolation. For instance, I question if the x86-64 IOMMU
| implementation is actually safe against a malicious
| device due to certain supported features such as device
| IO TLBs, but I do not know about the actual hardware
| implementation to know for certain.
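
The "IO MPU" idea above, coarse ranges the device is allowed to access, can be sketched in software as follows. This is only an illustration of the check such a mechanism would perform, not a model of any real DMA controller or IOMMU; the `Window` type and the `dma_allowed` function are assumptions made up for this example.

```rust
/// A half-open range [start, end) of bus addresses the device may touch
/// (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy)]
struct Window {
    start: u64,
    end: u64,
}

/// Returns true only if the whole transfer [addr, addr + len)
/// lies inside a single permitted window.
fn dma_allowed(windows: &[Window], addr: u64, len: u64) -> bool {
    // Reject empty transfers and any request whose end address overflows.
    let end = match addr.checked_add(len) {
        Some(end) if len > 0 => end,
        _ => return false,
    };
    windows.iter().any(|w| w.start <= addr && end <= w.end)
}
```

A flat list of windows is cheap to check, which is the point made above: isolation is feasible when the driver only needs coarse ranges, and gets impractical once the permitted set has to follow an arbitrary, user-built data structure.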
| binary132 wrote:
| and even a correct hardware specification could have an
| implementation bug...
| ok123456 wrote:
| How is this interposition different from what the virtual
| memory subsystem already does? Many other memory-mapped
| devices have kernel support. Access to these is usually
| mediated through udev or an equivalent on Linux.
| singron wrote:
| I have no specific knowledge, but it seems similar to
| issues we had before iommus. E.g. maybe the device
| accesses physical memory without memory protection and/or
| virtual address mapping, so the kernel needs to do that
| on its behalf.
| saagarjha wrote:
| This is correct, typically these devices need some page
| tables/IOMMU be set up to be told where they should read
| and write from.
| surajrmal wrote:
| Even without an iommu, there are still benefits to placing
| this logic in user space. It doesn't have to be all or
| nothing. Ideally if and when it becomes commonplace, iommus
| will also become more common.
| akira2501 wrote:
| It's because the value proposition of Rust and the reality of
| its implementation don't match. So, they went through all this
| effort to put it in the kernel, but then realized, it's not
| really useful for much, outside of making "safe" drivers that
| no one really needs.
| saagarjha wrote:
| Well, for example, if you only have one hardware decoder and
| two processes that want to use it someone's got to mediate
| access to it
| b20000 wrote:
| Rust must be removed from the kernel. It is a huge time sink for
| smaller companies to deal with another language and frameworks.
| Effort should be focused on getting more hardware support
| mainlined such that device manufacturers have less work which
| will increase adoption.
|
| Memory safety can be addressed via kernel tools or frameworks and
| should not be the job of a language IMHO.
| Faaak wrote:
| Sorry for this non-hn reply, but "lol"
| phi-go wrote:
| Isn't Rust only used for kernel modules? So no one needs to
| depend on Rust code if not needed?
| steveklabnik wrote:
| That is correct, yes.
| hamandcheese wrote:
| As a user I am not particularly sympathetic to the economic
| concerns of some small company if that comes at the expense of
| my security.
|
| And Rust does seem to be increasing the rate and level of
| support small outfits can offer - see Asahi Linux for an
| example of that.
| IshKebab wrote:
| I see these frankly crazy opinions fairly regularly and I'm
| genuinely curious how you come to these conclusions.
|
| Have you written much C or C++? What kind of kernel tools or
| frameworks are you thinking of? Have you ever used Rust? Are
| you familiar with the different kinds of memory errors?
|
| I really struggle to imagine how anyone who is actually
| familiar with all this stuff could say things like this but you
| aren't the first...
| jeroenhd wrote:
| This reads like a comment from twenty years ago, to be honest.
|
| Memory safety could've been addressed through kernel tools and
| frameworks for decades, but it hasn't. And it _should_ be part
| of the language, as even low level languages like C try not to
| clobber memory and specify undefined behaviour in cases where
| you may accidentally end up doing it anyway.
|
| There are good arguments for and against other Rust features
| such as the way panic!() works and the strictness of the borrow
| checker. However, "C and tooling can do everything your fancy
| pants new language does" has been said for longer than I've
| been alive and yet every month I see CVE reports about major
| projects caused by bugs that would never have passed the Rust
| compiler.
|
| Out of every reason I can think of, a lack of hardware support
| seems like the least likely reason for a hardware manufacturer
| not to upstream hardware support. Look at companies like
| Qualcomm, with massive ranges of devices and working kernel
| drivers, hacked together because upstreaming doesn't benefit
| them. Look at companies like Apple, who doesn't care if their
| software works on Linux or not. The Linux kernel supports
| everything from 90s supercomputers to drones to smart
| toothbrushes, there's no lack of hardware support.
| jeroenhd wrote:
| I think these types of drivers, taking care of parsing in a place
| where I would personally say parsing should be avoided where
| possible, is an excellent place to move towards Rust. The
| language may not be great at doing things like low-level memory
| management, but parsing data and forwarding it to hardware seems
| like an excellent use case here.
|
| And yes, you can parse safely in C. Unfortunately, correctness in
| C has been proven to be rather difficult to achieve.
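
As a rough illustration of the "parse, then forward to hardware" use case, here is a minimal sketch of bounds-checked header parsing. The 8-byte header layout, `FrameHeader`, `ParseError`, and `parse_frame` are all invented for this example and do not correspond to any real codec or kernel API.

```rust
/// Hypothetical 8-byte little-endian frame header:
/// width (u16), height (u16), payload length (u32).
#[derive(Debug, PartialEq)]
struct FrameHeader {
    width: u16,
    height: u16,
    payload_len: u32,
}

#[derive(Debug, PartialEq)]
enum ParseError {
    Truncated,
    BadDimensions,
    PayloadTooShort,
}

/// Parses the header and returns it together with the payload slice.
/// Every access is bounds-checked; malformed input yields an Err
/// rather than an out-of-bounds read.
fn parse_frame(buf: &[u8]) -> Result<(FrameHeader, &[u8]), ParseError> {
    let h = buf.get(..8).ok_or(ParseError::Truncated)?;
    let header = FrameHeader {
        width: u16::from_le_bytes([h[0], h[1]]),
        height: u16::from_le_bytes([h[2], h[3]]),
        payload_len: u32::from_le_bytes([h[4], h[5], h[6], h[7]]),
    };
    if header.width == 0 || header.height == 0 {
        return Err(ParseError::BadDimensions);
    }
    // The length field is untrusted: guard against overflow first,
    // then against a claimed size larger than the data we have.
    let end = 8usize
        .checked_add(header.payload_len as usize)
        .ok_or(ParseError::PayloadTooShort)?;
    let payload = buf.get(8..end).ok_or(ParseError::PayloadTooShort)?;
    Ok((header, payload))
}
```

The same validation can be written in C; the difference is that here a forgotten length check shows up as a missing `?` or an explicit index the reviewer can see, not as a silent overread.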
| andrepd wrote:
| Why is rust not good at low-level memory management? Can you
| give an example?
| jmull wrote:
| This seems like a case of two classic mistakes to me... #1
| starting with a solution, and then applying it to problems
| (imperfectly); #2 solving the problem at the wrong level (related
| to problem #1) - in this case at too low a level, which can work,
| but is a lot more work than a solution at the right level.
|
| That is: why not sandboxing rather than a rewrite?
| saagarjha wrote:
| Sandboxing an _in-kernel_ codec?
| blipvert wrote:
| eBPF?
| st_goliath wrote:
| The kernel has API in place to run code in userspace from
| within a kernel module. That would in theory be one way
| something complex in the kernel could be isolated. The
| obvious downside being the additional context switches and
| round trips for the data adding latency.
|
| Here's a blog post that demonstrates embedding an entire Go
| program as a blob into a kernel module and running it in
| userspace from the module:
|
| https://www.sigma-star.at/blog/2023/07/embedded-go-prog/
___________________________________________________________________
(page generated 2024-04-27 23:01 UTC)