[HN Gopher] Rust for Filesystems
       ___________________________________________________________________
        
       Rust for Filesystems
        
       Author : drakerossman
       Score  : 237 points
       Date   : 2024-07-15 09:39 UTC (13 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | gritzko wrote:
       | From the minutes I conclude that Rust-in-the-kernel looks like an
       | additional complexity tax. I mean, if you write an OS from
       | scratch, you can use the full power of your language. Plastering
       | it to the side of an already _vast_ codebase creates additional
       | issues, as we see here.
        
         | thesnide wrote:
         | While I agree there are benefits to rust, I tend to think all
         | reason cannot fight hype.
         | 
         | The tax will be seen as a necessity to embrace future and
         | progress.
         | 
         | I'm wondering why do not restrict ourselves to a safe subset
         | instead of jumping into a huge bandwagon of unknown bugs and
         | tradeoffs
        
           | Yoric wrote:
           | Safe subset of what?
        
           | pjc50 wrote:
           | There is no "safe subset" of C. MISRA is fairly close, but
           | all sorts of things that you might need, like integer
           | arithmetic, have potential UB in C.
           | 
           | (The best current effort is https://sel4.systems/ , which is
           | written in C but has a large proof of safety attached. The
           | language design question is basically: should the proof be
           | part of the language?)
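
For comparison with the integer-arithmetic point above: signed overflow is undefined behavior in C, while Rust defines it (a panic in debug builds, wrapping in release builds) and lets the programmer name the policy explicitly. A minimal sketch; the helper name `add_len` is hypothetical and only there for illustration.

```rust
/// Hypothetical helper: add two sizes and report overflow instead of wrapping.
fn add_len(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

fn main() {
    // In C, `INT_MAX + 1` on a signed int is undefined behavior.
    // In Rust each overflow policy is spelled out by the method name.
    assert_eq!(add_len(1, 2), Some(3));
    assert_eq!(add_len(i32::MAX, 1), None); // overflow detected
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN); // two's-complement wrap
    assert_eq!(i32::MAX.saturating_add(1), i32::MAX); // clamp at the bound
    assert_eq!(i32::MAX.overflowing_add(1), (i32::MIN, true)); // wrap plus flag
    println!("all overflow policies behaved as documented");
}
```

None of this proves a program correct the way the seL4 proofs do; it only removes one class of UB from ordinary arithmetic.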
        
             | regularfry wrote:
             | Given that undefined behaviour just means "undefined by the
             | standard", do you get usefully closer to being able to
             | identify a safe subset with the (MISRA/alternative,
             | specific compiler, specific architecture) triple?
        
               | pjc50 wrote:
               | No, undefined behavior does _not_ mean  "not defined by
               | the standard", it means those places where the standard
               | says "undefined behavior". And then the long and
               | complicated war over "the compiler may assume that UB
               | does not happen and then optimize on that basis".
               | 
               | You might be able to tighten it up in some specific
               | cases, and those battles are being fought elsewhere, but
               | there's stuff like lock lifetimes which you cannot do
               | without substantial extra annotations inside or outside
               | the language.
        
               | regularfry wrote:
               | Sorry, yes - poor wording on my part.
        
             | PhilipRoman wrote:
             | I found frama-c to be pretty good, including all the
             | integer quirks
        
           | germandiago wrote:
           | I like Linus's view on evolution. Evolution will tell what the
           | most sensible choices are over time. It is like "the market"
           | in some way. Let everyone make their bets, wait, see,
           | analyze, research. That's it.
        
         | mnau wrote:
         | > additional complexity tax
         | 
         | Yes, but that should be offset by easier driver development.
         | See the blog post about the Rust GPU driver for Asahi Linux,
         | done in one month. EDIT: Google "tales of the m1 gpu" (the
         | author has very negative opinions about Hacker News; read it
         | if you like by clicking the link
         | https://asahilinux.org/2022/11/tales-of-the-m1-gpu/)
         | 
         | Is it universal? We'll see in coming years.
        
           | eru wrote:
           | Alas, that link just gets you a rant about politics, if you
           | click on it directly.
           | 
           | Copy-and-pasting works.
        
             | olivermuty wrote:
             | "Rant about politics", haha. Or as other people like to
             | call it: "A real concern described in an apt manner".
             | 
             | I have observed these inflammatory sub-graphs of comments
             | myself and have thought to myself that this must be a huge
             | growing grounds for unmoderated and unwanted behaviour
             | because it more or less becomes invisible once flagged
             | enough.
        
               | erdii wrote:
               | In this specific case complaining about "politics" leaves
               | the sour aftertaste of enabling (or at least not
               | condemning) harassment to the point of individual people
               | taking their own lives over it.
               | 
               | Why?! Even if you're not sure what to think about the
               | queer movement; even if you have already made up your
               | mind about the queer movement and oppose their ideas or
               | some of them; I refuse to believe that any single person
               | would not want to stop someone from bullying someone else
               | into their own suicide!
               | 
               | hastily jotted rant for the folks who'd like to complain
               | about "politics" from creeping into every discussion
               | everywhere:
               | 
               | It's really sad to see so many folks disconnecting and
               | immediately dismissing whole groups of other folks as
               | soon as they start complaining about an issue they have
               | because of "politics". :(
               | 
               | I get that you don't want to get involved in shit
               | flinging shows and that it's tedious to figure out who's
               | in the right and who's in the wrong. Especially because
               | there are never clear answers. If you feel like this and
               | then proceed to complain about 'politics' creeping
               | everywhere, please beware of this:
               | 
               | Pretending to be apolitical doesn't work most of the
               | time, as politics is basically another word for "acting
               | (or deliberately not-acting) in some kind of public
               | sphere" which you all do, and when the "politics" have
               | arrived at a topic, then they'll stay there at least in
               | that specific case you are witnessing! You just are part
               | of a hyperconnected and confusing world with a lot of
               | conflict, whether you like it or not.
               | 
               | Pretending to be apolitical also serves the upholding of
               | whatever status quo is currently in place because
               | anything that has even a slight chance of changing
               | anything is inherently a political topic.
               | 
               | Please don't turn your heads on "political" topics or, at
               | least, don't complain about it in that way as it mostly
               | enables unjust behaviour to continue. (It doesn't even
               | matter if it's the person who brought up the "political"
               | stuff who is acting unjust or the folks they're
               | complaining about.) In both cases it's probably better to
               | either avoid commenting at all or to convey your critical
               | thoughts to that "political" conversation.
        
               | pjc50 wrote:
               | > I refuse to believe that any single person would not
               | want to stop someone from bullying someone else into
               | their own suicide!
               | 
               | There are plenty of people who do want the freedom to say
               | exactly what they choose, including a lengthy period of
               | directed harassment, and shrug their shoulders if someone
               | commits suicide over it. There's not much that can be
               | done other than ban them from civilized spaces.
        
               | pessimizer wrote:
               | "Bullying" is a judgement. Intrinsic to the word is a
               | judgement that what is being done is bad, and the person
               | doing it would not describe it that way.
               | 
               | And by that I don't mean that it's not bad to bully
               | people (it's rather tautological), I mean that talk of
               | bullying often begs the question, and is intentionally
               | done in order to elide past the actual events that
               | occurred. _Lese majeste_ laws against talking about the
               | King, elected officials, or even cops or bureaucrats now
               | get justified as anti-bullying.
               | 
               | I missed any rant about politics in the blog however. But
               | this thread has a smell of "my politics aren't political
               | because they are true, and your politics are political
               | because they are lies."
        
               | jcranmer wrote:
               | What the referer-replacement page was talking about is
               | Kiwi Farms, which is doing the kind of stuff that even the
               | US First Amendment's very expansive protections fail to
               | protect. (The criminal liability here is "intentional
               | inflection of emotional distress", although note that
               | most lawsuits that allege that are groundless lawsuits
               | that largely fail to make it past the motion to dismiss
               | for failure to state a claim stage as "they made me feel
               | bad" isn't sufficient to allege an IIED).
        
               | nyssos wrote:
               | > The criminal liability here is "intentional inflection
               | of emotional distress",
               | 
               | Intentional infliction of emotional distress is a tort,
               | not criminal.
        
               | keybored wrote:
               | I mean this is correct. Bullies, or at least the adult
               | ones, don't call what they do bullying (except the
               | absolute geniuses that think "actually bullying has a
               | social corrective behavior, I'm just helping actually").
               | 
               | We're all just busting each other's balls, right? Look at
               | George, he's laughing! He's totally in on the joke and
               | not at all just showing submissive deference in order to
               | not lose face.
        
               | logicprog wrote:
               | I'm glad at least some other people on this horrible site
               | feel this way.
        
               | im3w1l wrote:
               | To me, you have to distinguish between harassment
               | directed _at_ a person, and on the other hand discussion
               | _about_ a person.
               | 
               | It is not possible to use hacker news to send messages to
               | someone's email. It is not even possible to send a DM to
               | another hacker news user. You could potentially imagine
               | hacker news being used to organize harassment of someone,
               | but I have never seen either accusations or evidence of
               | such a thing.
               | 
               | So then we have established, that since harassment
               | directed at them is impossible, the issue they have is
               | that people on hacker news write bad things about them.
               | 
               | Next, those things seem to often be flagged or downvoted,
               | reducing exposure. But that is apparently not enough,
               | because they can be found on google. So here we arrive at
               | the core issue. There is content on Google about this
               | person, that they would rather not be on Google. This is
               | the complaint. So this person is basically saying that if
               | there is unfavorable coverage of them findable on google,
               | that is harassment, and it needs to go. If it isn't
               | purged it's bullying that could lead to suicide.
               | 
               | This is a very ambitious "landgrab" if you will, and it
               | starts to seriously infringe on other people's rights.
               | 
               | It's similar in that manner to other things like "stop
               | terrorism" or "think of the children". Yes clearly
               | harassment is bad, and terrorism is bad, and pedophiles
               | are no good. But we can't completely give up on our
               | freedoms because of that.
        
               | lelanthran wrote:
               | > Or as other people like to call it: "A real concern
               | described in an apt manner".
               | 
               | Oh please. Every activist for every marginal issue says
               | the same thing. Doesn't make it true.
        
             | aniviacat wrote:
             | I can't find a rant in the link. Was the link changed or
             | did I overlook something?
        
               | jeroenhd wrote:
               | The author of the website details their issue with the
               | way HN does moderation (which I can't say I disagree
               | with, especially after HN intentionally disabled referrer
               | headers for websites that take issue with HN). This only
               | shows up if HN is in the referer URL.
               | 
               | I wouldn't call it a rant, but rather a polite request
               | for HN policy to change.
        
               | yjftsjthsd-h wrote:
               | > I wouldn't call it a rant, but rather a polite request
               | for HN policy to change.
               | 
               | Which is made by blocking people with no ability to do
               | anything about it.
        
               | mcronce wrote:
               | What? All you need to do is resubmit the request without
               | a Referer header. For me, using Firefox, this meant
               | clicking in the address bar, changing nothing, and
               | hitting enter.
               | 
               | That's hardly "no ability to do anything about it".
        
               | yjftsjthsd-h wrote:
               | I mean people with no ability to alter HN moderation
               | policy.
               | 
               | Consider it like this:
               | 
               |     1. I think, "oh, that looks relevant, let me open that
               |        link".
               |     2. I get a screenful of objections to HN moderation.
               |     3. I shrug and close the tab.
               | 
               | Since I'm just a normal user who can't change HN
               | moderation, the outcome is that HN doesn't change but I
               | walk away with a worse opinion of the Asahi Linux folks.
        
               | lanstin wrote:
               | Weird: HN is supposed to be technical but the people
               | behind Asahi Linux have proven themselves to be technical
               | wizards; meanwhile HN seems more interested in money and
               | pseudo-libertarian politics than transformative technical
               | excellence. But such is our age.
        
               | amiga386 wrote:
               | If you click the link - any link to asahilinux.org from
               | HN - it should start "Hi! It looks like you might have
               | come from Hacker News.", followed by Hector Martin ranting
               | that he isn't in charge of the moderation policy of HN.
               | 
               | The response given in Arkell v. Pressdram is appropriate.
        
               | Phelinofist wrote:
               | Also didn't show for me with disabled JS
        
           | KallDrexx wrote:
           | Maybe I'm reading something wrong but the discussion this HN
           | posting is about sounds very much about trying to make a
           | Linux subsystem and API in Rust, so that Rust's type system
           | can enforce conformance to safety mechanisms via its
           | abstraction.
           | 
           | That's fundamentally different and harder than a driver being
           | written in Rust that uses the Linux subsystems' C APIs.
           | 
           | I can see a lot of drivers taking on the complexity tax of
           | being written in Rust easily. The complexity tax on writing a
           | whole subsystem in Rust seems like an exponentially harder
           | problem.
        
             | wongarsu wrote:
             | You could just write rust code that calls the C APIs, and
             | that would probably avoid a lot of discussions like the one
             | in the article.
             | 
             | But making good wrappers would make development of
             | downstream components even easier. As the opponents in the
             | discussion said: there's about 50 filesystem drivers. If
             | you make the interface better that's a boon for every file
             | system (well, every file system that uses rust, but it
             | doesn't take many to make the effort pay off). You pay the
             | complexity tax once for the interface, and get complexity
             | benefits in every component that uses the interface.
             | 
             | We would have the same discussions about better C APIs, if
             | only C was expressive enough to allow good abstractions.
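
To make the "pay the complexity tax once for the interface" argument concrete, here is a minimal typestate sketch of an interface whose misuse fails to compile. All names (`Handle`, `Uninit`, `Ready`, `init`, `read`) are hypothetical and are not taken from the kernel or from the article; this only illustrates the general technique of encoding API rules in types.

```rust
use std::marker::PhantomData;

// Zero-sized marker types for the two states of the (hypothetical) object.
struct Uninit;
struct Ready;

struct Handle<State> {
    id: u32,
    _state: PhantomData<State>,
}

impl Handle<Uninit> {
    fn new(id: u32) -> Self {
        Handle { id, _state: PhantomData }
    }

    /// Consumes the uninitialised handle, so the old value cannot be reused.
    fn init(self) -> Handle<Ready> {
        Handle { id: self.id, _state: PhantomData }
    }
}

impl Handle<Ready> {
    /// Only available once `init` has been called.
    fn read(&self) -> u32 {
        self.id * 2
    }
}

fn main() {
    let h = Handle::new(21);
    // h.read(); // would not compile: `read` exists only on Handle<Ready>
    let h = h.init();
    println!("{}", h.read());
}
```

The rule "initialise before reading" is written once in the wrapper; every user of the wrapper gets it checked at compile time, which is the kind of payoff described above.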
        
           | nicce wrote:
           | > author has a very negative opinions about hacker news
           | 
           | I am not sure if author (Asahi Lina) has, but the project
           | lead Hector Martin definitely has.
        
           | josephcsible wrote:
           | Warning: Don't click that link. Copy and paste the URL
           | instead. That site serves only verbal abuse and harassment to
           | people that it detects are HN users.
        
         | vollbrecht wrote:
         | One can argue that any additional code is introducing
         | complexity, not only writing Rust. Does that mean we should
         | just stop innovating and go into an indefinite state of
         | maintenance, since we are already so vast?
         | 
         | A tax in one place may not be a net negative, if it's used like
         | in the real world to offset other problems. And just saying it
         | will not offset any problems because of a single discussion,
         | that does not have a definite conclusion, comes off as a short
         | argument.
        
           | another2another wrote:
           | >Does that mean we should just stop innovating and go into an
           | indefinite state of maintenance
           | 
           | If you mean that not using Rust (or maybe some other
           | languages e.g. Zig or Ada?) means that there can be no
           | innovation in the Linux kernel, I would have to disagree
           | since there's been plenty of progress in plain old c (see for
           | instance io_uring), not to mention the fact that the c
           | language itself could change to make developer ergonomics
           | better - since that seems to be the nub of the problem.
           | 
           | It also raises the question of what happens in the future
           | when Rust is no longer the language du jour - how do we know
           | it's going to last the course? And now there's 2 different
           | codebases, potentially maintained by 2 different diminishing
           | sets of active maintainers.
        
             | vollbrecht wrote:
             | > If you mean that not using Rust (or maybe some other
             | languages e.g. Zig or Ada?) means that there can be no
             | innovation in the Linux kernel, I would have to disagree
             | since there's been plenty of progress in plain old c.
             | 
             | No, I didn't mean that. If I understand OP correctly here,
             | he argued that using Rust is a tax, that a tax is always
             | bad, and thus should be avoided.
             | 
             | We obviously can't know the future. We also can't know what
             | future maintainers will look like, or whether there will be
             | a bigger abundance of people understanding kernel-level C,
             | kernel-level Rust, or both.
             | 
             | I also don't think that any one developer can claim to
             | fully get every part of the Linux kernel. So if one person
             | wants to work on a particular subsection they need to make
             | themselves familiar with it, independent of the language
             | used. And then we are back at the argument, is the
             | additional tax bad, or what does it bring to the table.
        
           | riku_iki wrote:
           | > One can argue that any additional code is introducing
           | complexity
           | 
           | additional code can replace more complex existing or future
           | code, thus can reduce complexity
        
         | Already__Taken wrote:
         | reads a lot like letting perfect be the enemy of good.
        
       | ysw0145 wrote:
       | Having more options available in the Linux kernel is always
       | beneficial. However, Rust may not be the solution for everything.
       | While Rust does its best to ensure its programming model is safe,
       | it is still a limited model. Memory issues? Use Rust! Concurrency
       | problems? Switch to Rust! But you can't do everything that C does
       | without using unsafe blocks. Rust can offer a fresh perspective
       | to these problems, but it's not a complete solution.
        
         | drdo wrote:
         | But unsafe blocks are available! And you should use them when
         | you have to, but only when you have to.
         | 
         | Using an unsafe block with a very limited blast radius doesn't
         | negate all the guarantees you get in all the rest of your code.
        
           | sanxiyn wrote:
           | Note that unsafe blocks don't have limited blast radius.
           | Blast that can be caused by a single incorrect unsafe block
           | is unlimited, at least in theory. (In practice there could be
           | correlation of amount of incorrectness to effect, but same
           | also could be said about C undefined behavior.)
           | 
           | Unsafe blocks limit amount you need to get correct, but you
           | need to get all of them correct. It is not a blast limiter.
        
             | weinzierl wrote:
             | Yes, they don't contain the blast, but they limit the
             | places where a bomb can be, and that is their worth.
        
               | foldr wrote:
               | Generally speaking yes, but there could be a logic error
               | somewhere in safe code that causes an unsafe block to do
               | something it shouldn't. For example, a safe function that
               | is expected to return an integer less than n is called
               | within an unsafe block to obtain an index, but the return
               | value isn't actually less than n. In that case the 'bomb'
               | may be in the unsafe block, but the bug is in the safe
               | code.
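
The scenario described above can be sketched as follows. This is an illustrative example under stated assumptions: `pick_index`, `read_trusting` and `read_validated` are hypothetical names, and the off-by-one bug is planted on purpose. The out-of-bounds path is never executed here.

```rust
/// Safe function that is *expected* to return an index < n.
/// Planted bug: `% (n + 1)` can return n itself.
fn pick_index(key: usize, n: usize) -> usize {
    key % (n + 1)
}

/// Trusts the safe helper. The `unsafe` block is where the out-of-bounds
/// read would happen, but the bug that triggers it lives in safe code.
fn read_trusting(data: &[u8], key: usize) -> u8 {
    let i = pick_index(key, data.len());
    // UNSOUND: nothing here proves i < data.len().
    unsafe { *data.get_unchecked(i) }
}

/// Re-checks the invariant next to the unsafe block instead of trusting it.
fn read_validated(data: &[u8], key: usize) -> Option<u8> {
    let i = pick_index(key, data.len());
    if i < data.len() {
        // SAFETY: i < data.len() was checked on the line above.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}

fn main() {
    let data = [10u8, 20, 30, 40];
    // key = 1 happens to be fine for both versions.
    assert_eq!(read_trusting(&data, 1), 20);
    assert_eq!(read_validated(&data, 1), Some(20));
    // key = 4 makes pick_index return 4 == data.len(); read_trusting would
    // be undefined behavior here, so only the validated version is called.
    assert_eq!(pick_index(4, data.len()), 4);
    assert_eq!(read_validated(&data, 4), None);
    println!("ok");
}
```

Auditing only the `unsafe` line in `read_trusting` is not enough; the audit has to include every safe function whose result the block relies on.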
        
               | Klonoar wrote:
               | I cannot imagine writing a method to return a value less
               | than n, and not verifying that constraint somewhere in
               | the safe method.
        
               | foldr wrote:
               | It's just a simple example to illustrate the point.
               | Realistic bugs would probably involve more complex logic.
               | 
               | The prevalence of buffer overrun bugs in C code shows
               | that it very definitely is possible for programmers to
               | screw up when calculating indices. Rust removes a lot of
               | the footguns that make that both easy to do and dangerous
               | in C. But in unsafe Rust code, you're still fundamentally
               | vulnerable to any arithmetic bug in any function that you
               | call as part of the computation of an index.
        
               | nicce wrote:
               | > yes, but there could be a logic error somewhere in safe
               | code that causes an unsafe block to do something it
               | shouldn't.
               | 
               | Sounds like bad design. You can typically limit the use
               | of unsafe to such a small area that you can verify the
               | ranges of parameters which would cause memory problems.
               | Check for invalid values and panic. Still "memory safe",
               | even if it panics.
        
               | foldr wrote:
               | Sure, it may be bad design. The point is that nothing in
               | the Rust language itself guarantees that memory safety
               | bugs will be localized to unsafe blocks. If your code has
               | that property it's because you wrote it in a disciplined
               | way, not because Rust forced you to write it that way
               | (though it may have given some moral support).
               | 
               | Let me emphasize that I am not criticizing Rust here. I
               | am just pointing out an incontrovertible fact about how
               | unsafe blocks in Rust work: memory safety bugs are not
               | guaranteed to be localized to unsafe blocks.
        
             | neysofu wrote:
             | I believe this is technically true, but somewhat myopic
             | when it comes to how maintainers approach unsafe blocks in
             | Rust.
             | 
             | UBs have unlimited blast radius by definition, and you'll
             | need to write correct code in all your unsafe blocks to
             | ensure your application is 100% memory-safe. There's no
             | debate around that. From this perspective, there's no
             | difference between a C application and a Rust one which
             | contains a single, incorrect unsafe block.
             | 
             | The appreciable difference between the two, however, is how
             | much more debuggable and auditable an unsafe block is.
             | There's usually not that many of them, and they're easily
             | greppable. Those (hopefully) very few lines of code in your
             | entire application benefit from a level of attention and
             | scrutiny that teams can hardly afford for entire C
             | codebases.
             | 
             | EDIT: hardy -> hardly (typo)
        
             | drdo wrote:
             | That is of course correct.
             | 
             | The main value is that you only have to make sure that a
             | small amount of code surrounding the unsafe block is safe,
             | and hopefully you provide a safe API for the rest of the
             | code to use.
        
           | CraigJPerry wrote:
           | I'd word that differently: it reduces the search space for a
           | bug when something goes wrong but it doesn't limit the blast
           | radius - you can still spectacularly blow up safe rust code
           | with an unsafe block (that no aliases rule is seriously tough
           | to adhere to!)
           | 
           | This is definitely a strong benefit though.
        
         | bilekas wrote:
         | > Concurrency problems?
         | 
         | I have to admit, while I do enjoy rust in the sense that it
         | makes sense and can really "click" sometimes. For anything
         | asynchronous I find it really rough around the edges. It's not
         | intuitive what's happening under the hood.
        
           | the_duke wrote:
           | Async != concurrency.
           | 
           | One of the major wins of Rust is encoding thread safety in
           | the type system with the `Send` and `Sync` traits.
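
A minimal sketch of what that looks like in practice, using only the standard library. The function name `parallel_count` is hypothetical. `thread::spawn` requires its closure to be `Send`, so replacing `Arc<Mutex<_>>` with `Rc<RefCell<_>>` here is rejected at compile time rather than racing at run time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment a shared counter from several threads and return the total.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Arc<Mutex<u64>> is Send + Sync, so it may be shared across threads.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // The closure must be Send; an Rc captured here would not compile.
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```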
        
             | bilekas wrote:
             | > Async != concurrency.
             | 
             | Right, but tasks are sharing the same thread which is fine,
             | but when we need to expand on that with them actually
               | working async, i.e. non-blocking, fire and quasi-forget, it's
             | tricky. That's all I'm saying.
        
               | the_duke wrote:
               | The Rust async experience indeed has lots of pitfalls,
               | very much agree there.
        
               | dboreham wrote:
               | s/The Rust/All/
        
             | duped wrote:
             | async == concurrency, concurrency != parallelism.
        
               | sophacles wrote:
               | async == concurrency in the same way square == rectangle
               | - that is, it's not a symmetric '==', since there are
               | plenty of rectangles that are not squares.
        
           | asyx wrote:
           | I really hate async rust. It's really great that rust forces
           | you on a compiler level to use mutexes but async is a disease
           | that is spreading through your whole project and introduces a
           | lot of complexity that I don't feel in C#, Python or JS/TS.
        
             | John23832 wrote:
             | Eh, syntactically async rust is the exact same as C#. It's
             | all task based concurrency.
             | 
             | Now, lifetimes attached to function signatures is
             | definitely a problem.
        
               | colejohnson66 wrote:
               | Not really. C#'s Task/Task<T> are based on background
               | execution. Once something is awaited, control is returned
               | to the caller. OTOH, Rust's Future<T> is, by default,
               | based on polling/stepping, a bit like IEnumerable<T> in
               | C#; If you never poll/await the Future<T>, it never
               | executes. Executor libraries like Tokio allow running
               | futures in the background, but that's not built-in.
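
The "never polled, never executes" behaviour can be observed with the standard library alone. This is a sketch, not how an executor such as Tokio is implemented: `NoopWake` and `lazy_future_demo` are hypothetical names, and the waker does nothing because the future below completes on its first poll.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that ignores wake-ups; enough for a future that never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns (body ran before the first poll, body ran after the first poll).
fn lazy_future_demo() -> (bool, bool) {
    let ran = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&ran);

    // Creating the future runs none of its body.
    let mut fut: Pin<Box<dyn Future<Output = ()>>> = Box::pin(async move {
        flag.store(true, Ordering::SeqCst);
    });
    let before = ran.load(Ordering::SeqCst);

    // Polling by hand is what an executor does on the caller's behalf.
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let finished = matches!(fut.as_mut().poll(&mut cx), Poll::Ready(()));
    assert!(finished);

    let after = ran.load(Ordering::SeqCst);
    (before, after)
}

fn main() {
    println!("{:?}", lazy_future_demo());
}
```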
        
               | John23832 wrote:
               | I don't want to "well actually" the "well actually", but
               | I think you missed the word syntactically.
               | 
               | > C#'s Task/Task<T> are based on background execution.
               | Once something is awaited, control is returned to the
               | caller.
               | 
               | Async/await in any language happens in the background.
               | 
               | What happens during a Task.Yield() (C#)? Control is
               | yielded to another awaiting task in the work queue.
               | Same as Rust.
               | 
               | > OTOH, Rust's Future<T> is, by default, based on
               | polling/stepping,
               | 
               | The await syntax abstracts over Future/Stream polling.
               | The real difference is that Rust introduced the Future
               | type/concept of polling at all (which is a result of not
               | having a standard async runtime). There is a concept of
               | "is this task available to proceed on" in C# too, it's
               | just not exposed to the user and handled by the CLR.
        
               | merb wrote:
               | > Task.Yield()
               | 
               | In c# you probably never call yield.
        
               | John23832 wrote:
               | It was just an example. In practice, you're right.
        
               | neonsunset wrote:
               | Yield in C# is frequently used for the same reasons as in
               | Rust, although implementation details between fine-
               | grained C# Tasks and even finer grained Rust Futures
               | aggregated into large Tasks differ quite a bit.
               | 
               | Synchronous part of an async method in C# will run
               | "inline". This means that should there be a
               | computationally expensive or blocking code, a caller will
               | not be able to proceed even if it doesn't await it
               | immediately. For example:
               | 
               |     var ptask = Primes.Calculate(n); // returns Task<ulong[]>
               |     // Do other things...right?
               |     // Why are we stuck calculating the primes then?
               |     Console.WriteLine("Started.");
               | 
               | In order for the .Calculate to be able to continue
               | execution "elsewhere" in a free worker thread, it would
               | have to yield.
               | 
               | If a caller does not control .Calculate, the most common
               | (and, sadly, frequently abused) solution is to simply do
               | 
               |     var task = Task.Run(Primes.Calculate);
               |     // Do something else
               |     var text = string.Join(',', await task);
               | 
               | If a return signature of a delegate is also Task, the
               | return type will be flattened - just a Task<T>, but
               | nonetheless the returned task will be a proxy that will
               | complete once the original task completes. This
               | successfully deals with badly behaved code.
               | 
               | However, a better solution is to instead insert
               | `Task.Yield()` to allow the caller to proceed and not be
               | blocked, before continuing a long-running operation:
               | 
               |     var ptask = Primes.Calculate(n); // returns Task<ulong[]>
               |     // Successfully prints the message
               |     Console.WriteLine("Started.");
               | 
               |     static async Task<int[]> CalculatePrimes(int n)
               |     {
               |         await Task.Yield();
               |         // Continue execution in a free worker thread
               |         // If the caller immediately awaits us, most likely
               |         // the caller's thread will end up doing so, as the
               |         // continuation will be scheduled in the local queue,
               |         // so it is unlikely for the work item to be stolen this
               |         // quickly by another worker thread.
               |     }
        
               | brigadier132 wrote:
               | How do you imagine async works otherwise? Also, in case
               | you misunderstand how polling works in practice in rust,
               | it's not polling in the traditional web development sense
               | where it polls every 5 ms to check if a future is
               | completed (although you can do this if you want to for
               | some reason). There are typically "wakers" that are
               | "awoken" by the os when data is ready and when they are
               | "awoken" _then_ they poll. And since they are only awoken
               | by the OS when the information is ready it really never
               | has to poll more than once unless there are multiple
               | bundled futures.
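
A rough sketch of the waker flow described above, using only the standard
library. The "OS" here is simulated by a plain function call, and all of
the names (`Shared`, `WaitForData`, `CountingWaker`, `data_arrived`) are
assumptions made for illustration. The future is polled once, parks
itself, and is polled again only after it has been woken.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// State shared between the future and the (simulated) event source.
struct Shared {
    ready: AtomicBool,
    waker: Mutex<Option<Waker>>,
}

/// Future that completes once `Shared::ready` has been set.
struct WaitForData {
    shared: Arc<Shared>,
    polls: usize,
}

impl Future for WaitForData {
    type Output = usize; // number of polls it took to complete

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<usize> {
        self.polls += 1;
        if self.shared.ready.load(Ordering::SeqCst) {
            Poll::Ready(self.polls)
        } else {
            // Not ready: leave our waker with the event source and go idle.
            *self.shared.waker.lock().unwrap() = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Executor-side waker that just counts how often it was woken.
struct CountingWaker {
    wakeups: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakeups.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in for the OS or reactor: marks data ready and wakes the task.
fn data_arrived(shared: &Shared) {
    shared.ready.store(true, Ordering::SeqCst);
    if let Some(w) = shared.waker.lock().unwrap().take() {
        w.wake();
    }
}

/// Returns (total polls, total wakeups).
fn run() -> (usize, usize) {
    let shared = Arc::new(Shared {
        ready: AtomicBool::new(false),
        waker: Mutex::new(None),
    });
    let mut fut = WaitForData { shared: shared.clone(), polls: 0 };
    let counter = Arc::new(CountingWaker { wakeups: AtomicUsize::new(0) });
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);

    // First poll: nothing is ready, so the future parks itself.
    assert!(Pin::new(&mut fut).poll(&mut cx).is_pending());

    // No spinning in between: the next poll happens after the wakeup.
    data_arrived(&shared);
    let polls = match Pin::new(&mut fut).poll(&mut cx) {
        Poll::Ready(n) => n,
        Poll::Pending => unreachable!("data was marked ready"),
    };
    (polls, counter.wakeups.load(Ordering::SeqCst))
}

fn main() {
    let (polls, wakeups) = run();
    println!("polls: {}, wakeups: {}", polls, wakeups);
}
```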
        
           | wongarsu wrote:
           | Rust async isn't all that pleasant to use. On the other hand
           | for normal threaded concurrency Rust is one of the best
           | languages around. The type system prevents a lot of
           | concurrency bugs. "Effortless concurrency" is a tagline the
           | language really has earned.
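
As a small illustration of the threaded case, here is a sketch in which
several threads add into one shared total. `parallel_sum` is a name
invented for this example. The point is that the shared value can only be
reached through the `Mutex`; swapping `Arc<Mutex<u64>>` for a
non-thread-safe type such as `Rc<RefCell<u64>>` is rejected at compile
time rather than racing at run time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Sums 1..=n using `workers` threads that share one total.
fn parallel_sum(n: u64, workers: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));
    let mut handles = Vec::new();

    for w in 0..workers {
        let total = Arc::clone(&total);
        handles.push(thread::spawn(move || {
            // Worker `w` takes w+1, w+1+workers, w+1+2*workers, ...
            let mut local = 0u64;
            let mut i = w + 1;
            while i <= n {
                local += i;
                i += workers;
            }
            // The only way to touch the shared value is through the lock.
            *total.lock().unwrap() += local;
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("sum of 1..=1000 = {}", parallel_sum(1000, 4));
}
```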
        
         | tialaramex wrote:
         | > But you can't do everything that C does without using unsafe
         | blocks
         | 
         | For this particular work the huge benefit of Rust is its
         | enthusiasm for _encapsulating_ such safety problems in types.
         | Which is indeed what this article is about.
         | 
         | C and particularly the way C is used in the kernel makes it
         | everybody's responsibility to have total knowledge of the tacit
         | rules. That cannot scale. A room full of kernel developers
         | didn't entirely agree on the rules for a data structure they
         | all use!
         | 
         | Rust is very good at making you aware of rules you need to
         | know, and making it not your problem when it can be somebody
         | else's problem to ensure rules are followed. Sometimes the
         | result will be less optimal, but even in the Linux kernel sub-
         | optimal is often the right default and we can provide an
         | (unsafe) escape hatch for people who can afford to learn six
         | more weird rules to maybe get better performance.
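
A toy sketch of that pattern: the type owns the rule (the cursor never
passes the end of the buffer), the default method is safe and checks it,
and an `unsafe` escape hatch exists for callers who can prove the rule
themselves. `Reader` and its methods are hypothetical and have nothing to
do with actual kernel code.

```rust
/// Owns a buffer and a cursor; the invariant `pos <= buf.len()` is private,
/// so code outside this type cannot break it.
pub struct Reader {
    buf: Vec<u8>,
    pos: usize,
}

impl Reader {
    pub fn new(buf: Vec<u8>) -> Self {
        Reader { buf, pos: 0 }
    }

    /// Number of bytes left to read.
    pub fn remaining(&self) -> usize {
        self.buf.len() - self.pos
    }

    /// Safe default: the bounds rule is checked here, not by the caller.
    pub fn read_u8(&mut self) -> Option<u8> {
        if self.pos < self.buf.len() {
            // SAFETY: `pos` was checked against the length just above.
            let b = unsafe { *self.buf.get_unchecked(self.pos) };
            self.pos += 1;
            Some(b)
        } else {
            None
        }
    }

    /// Escape hatch that skips the check.
    ///
    /// # Safety
    /// The caller must guarantee that `remaining() > 0`.
    pub unsafe fn read_u8_unchecked(&mut self) -> u8 {
        // SAFETY: guaranteed by the caller, see the contract above.
        let b = unsafe { *self.buf.get_unchecked(self.pos) };
        self.pos += 1;
        b
    }
}

fn main() {
    let mut r = Reader::new(vec![10, 20]);
    println!("{:?} {:?} {:?}", r.read_u8(), r.read_u8(), r.read_u8());
}
```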
        
           | mjburgess wrote:
           | > That cannot scale.
           | 
           | lol... you're talking about the linux kernel, _written in C_.
           | 
           | The vast majority of software over many decades "bottoms out"
           | in C whether in VMs, operating systems, device drivers, etc.
           | 
           | The scale of the success of C is unparalleled.
        
             | dxroshan wrote:
             | I agree with you.
        
             | pjc50 wrote:
             | The scale of C adoption is certainly unparalleled over the
             | past 40 or so years, but so are the safety issues in the
             | cyberwarfare era.
             | 
             | https://www.whitehouse.gov/oncd/briefing-
             | room/2024/02/26/pre...
             | 
             | If, somehow, we'd got to an era where (a) operating systems
             | were widely deployed in a different language, and (b) the
             | Morris Worm of 1988 had happened due to buffer overflow
             | issues, then C in its current form would never have been
             | adopted.
        
               | mjburgess wrote:
               | C is just convenient assembly. In an era where
               | performance mattered, and much software was written _for_
               | hardware, and controlling hardware, it's hard to see an
               | alternative.
               | 
               | C's choices were for performance on hardware-limited
               | systems. I don't really see what other ones made sense
               | historically.
        
               | another2another wrote:
               | >In an era where performance mattered, and much software
               | was written for hardware, and controlling hardware, it's
               | hard to see an alternative
               | 
               | Actually, what made sense _was_ assembly when performance
               | mattered above all. C was actually seen as a higher level
               | language.
               | 
               | However C's advantage was the fact that it was cross
               | platform, so you could compile or quite easily port the
               | same code to many different platforms with a C compiler
               | (Solaris, Windows, BSD, Linux and latterly Mac OS X). That
               | was its strength (Pascal shared this too, but it didn't
               | survive).
               | 
               | You can see this in the legacy of software that's still
               | in use today - lots of GNU utilities, shells, X Windows,
               | the zlib library, GCC, OpenSSL and, discussed fairly
               | recently, POV-Ray, which has been going since the 80's.
        
               | pjc50 wrote:
               | C is, in some important cases, _less_ convenient than
               | assembly in ways which have to be worked around by either
               | fooling the compiler or adding intrinsics. A recent
               | example: https://justine.lol/endian.html
               | 
               | Is the huge macro more convenient than the "bswap"
               | instruction? No, but it's portable.
               | 
               | > I don't really see what other ones made sense
               | historically.
               | 
               | Pascal chose differently in a couple of places. In
               | particular, carrying the length with strings.
               | 
               | C refused to define semantics for arithmetic. This gave
               | you programs which were "portable" so long as you didn't
               | mind different behavior on different platforms. Good for
               | adoption, bad for sanity. It was only relatively recently
               | (C23) that signed integers were defined to be two's
               | complement.
               | 
               | 16-bit Windows even used C with the Pascal calling
               | convention. http://www.c-jump.com/CIS77/ASM/Procedures/P7
               | 7_0070_pascal_s...
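
For comparison with the points about byte swapping and arithmetic
semantics, this is roughly how the same operations look in Rust, where
the standard integer types spell out the behaviour in the method name.
The helper function names are made up for this sketch; the integer
methods themselves are from the standard library.

```rust
/// Portable big-endian decode; no macro or intrinsic needed.
fn read_be32(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Explicit byte swap, the moral equivalent of a `bswap` instruction.
fn swap(x: u32) -> u32 {
    x.swap_bytes()
}

/// Overflow is reported as `None` instead of being undefined.
fn checked_increment(x: i32) -> Option<i32> {
    x.checked_add(1)
}

/// Two's-complement wrap-around, requested explicitly.
fn wrapping_decrement(x: i32) -> i32 {
    x.wrapping_sub(1)
}

fn main() {
    println!("{:#x}", read_be32([0x12, 0x34, 0x56, 0x78]));
    println!("{:#x}", swap(0x1234_5678));
    println!("{:?}", checked_increment(i32::MAX));
    println!("{}", wrapping_decrement(i32::MIN));
}
```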
        
               | kelnos wrote:
               | > _C is just convenient assembly._
               | 
               | I'm not sure if you're being facetious here, but that's
               | absurd. It is certainly one of our lowest-level options
               | before reaching for assembly, but it's still a high-level
               | language that abstracts machine details from the
               | programmer.
               | 
               | > _In an era where performance mattered, and much
               | software was written for hardware, and controlling
               | hardware, it's hard to see an alternative._
               | 
               | During that era, people who really needed to care about
               | performance used assembly. The optimizations done by C
               | compilers at that time were not nothing, but they were
               | fairly primitive compared to what they do now.
        
             | freeone3000 wrote:
             | But it doesn't have to. We can choose any other language
             | that compiles to native, including memory-safe ones.
        
         | pjc50 wrote:
         | > But you can't do everything that C does without using unsafe
         | blocks
         | 
         | How much of this is actually 100% unambiguously _necessary_? Is
         | there a good reason why anything in the filesystem code at all
         | _needs_ to be unsafe?
         | 
         | I suspect it's a very small subset needed in a few places.
        
           | nicce wrote:
           | Usually avoidance of copying or moving data is the primary
           | reason. In filesystems, this is quite highlighted.
        
         | bigstrat2003 wrote:
         | > But you can't do everything that C does without using unsafe
         | blocks. Rust can offer a fresh perspective to these problems,
         | but it's not a complete solution.
         | 
         | It's true that you need to have unsafe code to do low level
         | things. But it's a misconception that if you have to use unsafe
         | then Rust isn't a good fit. The point of the safe/unsafe
         | dichotomy in Rust is to clearly mark which bits of the code are
         | unsafe, so that you can focus all your attention on auditing
         | those small pieces and have confidence that everything else
         | will work if you get those bits right.
        
       | pjmlp wrote:
       | The disconnect section of the article is a good example of
       | exactly how not to do things, and how things can turn sour
       | if the existing community isn't taken along for the ride.
        
       | pornel wrote:
       | I don't get how can each file system have a custom lifecycle for
       | inodes, but still use the same functions for inode lifecycle
       | management, but apparently with different semantics? That sounds
       | like the opposite of an abstraction layer, if the same function
       | must be used in different ways depending on implementation
       | details.
       | 
       | If the lifecycle of inodes is filesystem-specific, it should be
       | managed via filesystem-specific functions.
        
         | phkahler wrote:
         | >> I don't get how can each file system have a custom lifecycle
         | for inodes, but still use the same functions for inode
         | lifecycle management, but apparently with different semantics?
         | 
         | I had the same question. They're trying to understand (or even
         | document) all the C APIs in order to do the rust work. It
         | sounds like collecting all that information might lead to some
         | [WTFs and] refactoring so questions like this don't come up in
         | the first place, and that would be a good thing.
        
         | crest wrote:
         | I assume it's supposed to work by having the compiler track the
         | lifetime of the inodes. The compiler is expected to help with
         | ephemeral references (the file system still has to store the
         | link count to disk).
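
One way to picture "the compiler tracks ephemeral references" is a handle
type that takes a reference when created and gives it back when dropped.
This is only a user-space sketch under made-up names (`Inode`, `InodeRef`,
`grab`); it is not the kernel's Rust API. The borrow in `InodeRef<'a>`
stops a handle from outliving its inode, and `Drop` makes the release
automatic.

```rust
use std::cell::Cell;

/// Stand-in for an in-memory inode with a manual reference count.
struct Inode {
    ino: u64,
    refcount: Cell<u32>,
}

/// Handle that holds one reference for exactly as long as it is alive.
struct InodeRef<'a> {
    inode: &'a Inode,
}

impl Inode {
    fn new(ino: u64) -> Inode {
        Inode { ino, refcount: Cell::new(0) }
    }

    /// Takes a reference; the matching release happens in `Drop`.
    fn grab(&self) -> InodeRef<'_> {
        self.refcount.set(self.refcount.get() + 1);
        InodeRef { inode: self }
    }
}

impl<'a> InodeRef<'a> {
    fn ino(&self) -> u64 {
        self.inode.ino
    }
}

impl<'a> Drop for InodeRef<'a> {
    fn drop(&mut self) {
        // Runs automatically on every path out of scope.
        let c = &self.inode.refcount;
        c.set(c.get() - 1);
    }
}

/// Returns the count inside a nested scope, after it, and after all drops.
fn demo() -> (u32, u32, u32) {
    let inode = Inode::new(7);
    let a = inode.grab();
    let during = {
        let _b = inode.grab();
        inode.refcount.get()
    };
    let after_inner = inode.refcount.get();
    drop(a);
    (during, after_inner, inode.refcount.get())
}

fn main() {
    let inode = Inode::new(7);
    println!("inode {} -> counts {:?}", inode.grab().ino(), demo());
}
```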
        
         | sandywaffles wrote:
         | I understood it as they're working to abstract as much as is
         | generally and widely possible in the VFS layer, but there will
         | still be (many?) edge cases that don't fit and will need to be
         | handled in FS-specific layers. Perhaps the inode lifecycle was
         | just an initial starting point for discussion?
        
         | DSMan195276 wrote:
         | > but still use the same functions for inode lifecycle
         | management
         | 
         | I'm not an expert by any means, but I'm somewhat knowledgeable:
         | there are different functions that can be used to create inodes
         | and then insert them into the cache. `iget_locked()` that's
         | focused on here is a particular pattern of doing it, but not
         | every FS uses that for one reason or another (or doesn't use it
         | in every situation). Ex: FAT doesn't use it because the inode
         | numbers get made-up and the FS maintains its own mapping of FAT
         | position to inodes. There's then also file systems like `proc`
         | which never cache their inode objects (I'm pretty sure that's
         | the case, I don't claim to understand proc :P )
         | 
         | The inode objects themselves still have the same state flow
         | regardless of where they come from, AFAIK, so from a consumer
         | perspective the usage of the `inode` doesn't change. It's only
         | the creation and internal handling of the inode objects by the
         | FS layer that depends on what the FS needs.
        
         | seanhunter wrote:
         | If you haven't seen it before, you might find this useful
         | https://www.kernel.org/doc/html/latest/filesystems/vfs.html
         | 
         | It's an overview of the VFS layer, which is how they do all the
         | filesystem-specific stuff while maintaining a consistent
         | interface from the kernel.
        
       | hu3 wrote:
       | Some of the comments below the lwn.net page are rather
       | disrespectful.
       | 
       | Imagine getting this comment about the open source project you
       | contribute to:
       | 
       | "Science advances one funeral at a time"
        
       | gwbas1c wrote:
       | Maybe they are asking the wrong questions?
       | 
       | Does Rust need to change to make it easier to call C?
       | 
       | I've done a bit of Rust, and (as a hobbyist,) it's still not
       | clear (to me) how to interoperate with C. (I'm sure someone
       | reading this has done it.) In contrast, in C++ and Objective C,
       | all you need to do is include the right header and call the
       | function. Swift lets you include Objective C files, and you can
       | call C from them.
       | 
       | Maybe Rust as a language needs to bend a little in this case,
       | instead of expecting the kernel developers to bend to the
       | language?
        
         | tupshin wrote:
         | This is not a notable challenge in rust, nor relevant to the
         | article.
         | 
         | The article is about finding ways of using rust to actually
         | implement kernel fs drivers/etc. Note that any rust code in the
         | kernel is necessarily consuming C interfaces.
         | 
         | Bindgen works quite well for the use case that you are
         | thinking of.
         | 
         | https://github.com/rust-lang/rust-bindgen
        
           | moomin wrote:
           | Yeah, the Rust proponents are being significantly more
           | ambitious. Not just the ability to code a file system in
           | Rust, but do it in a way that catches a lot of the
           | correctness issues relating to the complex (and changing)
           | semantics of FS development.
        
         | codetrotter wrote:
         | I've written Rust code that called C++
         | 
         | It wasn't completely straightforward, but on the whole I
         | figured out everything I needed to within a few days in order
         | to be able to do it.
         | 
         | Calling C would surely be very similar.
        
         | duped wrote:
         | It's actually pretty easy. All you need is to declare `extern "C"
         | fn foo() -> T` to be able to call it from Rust, and to pass the
         | link flags either by adding a #[link] attribute or by adding it
         | in a build.rs.
         | 
         | You can use the bindgen crate to generate bindings ahead of
         | time, or in a build.rs and include!() the generated bindings.
         | 
         | Normally what people do is create a `-sys` crate that contains
         | only bindings, usually generated. Then their code can `use` the
         | bindings from the sys crate as normal.
         | 
         | > in contrast, in C++ and Objective C, all you need to do is
         | include the right header
         | 
         |  _and_ link against the library.
        
         | lambda wrote:
         | Calling C from Rust can be quite simple. You just declare the
         | external function and call it. For example, straight out of the
         | Rust book https://doc.rust-lang.org/book/ch19-01-unsafe-
         | rust.html#usin... :
         | 
         |     extern "C" {
         |         fn abs(input: i32) -> i32;
         |     }
         | 
         |     fn main() {
         |         unsafe {
         |             println!("Absolute value of -3 according to C: {}", abs(-3));
         |         }
         |     }
         | 
         | Now, if you have a complex library and don't want to write all
         | of the declarations by hand, you can use a tool like bindgen to
         | automatically generate those extern declarations from a C
         | header file: https://github.com/rust-lang/rust-bindgen
         | 
         | There's an argument to be made that something like bindgen
         | could be included in Rust, not requiring a third party
         | dependency and setting up build.rs to invoke it, but that's not
         | really the issue at hand in this article.
         | 
         | The issue is not the low-level bindings, but higher level
         | wrappers that are more idiomatic in Rust. There's no way you're
         | going to be able to have a general tool that can automatically
         | do that from arbitrary C code.
        
           | jiripospisil wrote:
           | There's also cbindgen for going the other way around.
           | https://github.com/mozilla/cbindgen
        
           | varjag wrote:
           | That's not really "simple", it's on par with C FFI in about
           | any other language (except C++), with same drawbacks.
        
             | gizmo686 wrote:
             | ... And? Most languages make C interop simple.
        
               | varjag wrote:
               | They quickly become unwieldy on non-trivial APIs, with
               | hundreds of definitions across dozens of files and with
               | macros to boot. Naturally people would still get the job
               | done but it's beyond simple.
        
               | mcronce wrote:
               | That's what bindgen is for, as was mentioned in the
               | original comment you replied to.
        
               | varjag wrote:
               | How well does it handle preprocessor macros in APIs?
        
               | marshray wrote:
               | I have used it successfully against header files for
               | Win32 COM interfaces generated from IDL which include
               | major parts of the infamous "windows.h". Almost every
               | type is a macro.
               | 
               | This is an extremely well-understood space.
               | 
               | Just open the docs and do it.
        
             | commodoreboxer wrote:
             | It's on par with C++, too. In C++ you need an `extern "C"`,
             | because C++ linkage isn't guaranteed to be the same as C
             | linkage. You can get away with wrapping that around it in a
             | preprocessor conditional, but that's not all that much
             | easier than Rust's bindgen.
             | 
             | A lot of C to C++ interop is actually done wrong without
             | knowing it. Throwing a C++ static function as a callback
             | into a C function usually works, but it's not technically
             | correct because the linkage isn't guaranteed to be the same
             | without an extern "C". In practice, it usually is the same,
             | but this is implementation-defined, and C++ could use a
             | different calling convention from C (e.g. cdecl vs fastcall
             | vs stdcall. The Borland C++ compiler uses fastcall by
             | default for C++ functions, which will make them illegal
             | callbacks for C functions).
             | 
             | The major difference between Objective-C and C++'s C
             | interop and that of other languages is the preprocessor,
             | which the other languages lack. Macros will just work
             | because they use the same preprocessor. That's really not
             | easy to paper over in other languages that can't speak the
             | C preprocessor.
        
               | spacechild1 wrote:
               | I think you're confusing some terms here.
               | 
               | > In C++ you need an `extern "C"`, because C++ linkage
               | isn't guaranteed to be the same as C linkage.
               | 
               | `extern "C"` has nothing to do with linkage, all it does
               | is disable name mangling, so you get the same symbol name
               | as with a C compiler.
               | 
               | > Throwing a C++ static function as a callback into a C
               | function usually works, but it's not technically correct
               | because the linkage isn't guaranteed to be the same
               | without an extern "C".
               | 
               | Again, linkage is not relevant here. Your C++ callbacks
               | don't have to be declared as extern "C" either, because
               | the symbol name doesn't matter. As you noted correctly,
               | the calling conventions must match, but in practice this
               | only matters on x86 Windows. (One notable example is
               | passing callbacks to Win32 API functions, which use
               | `stdcall` by default.) Fortunately, x86_64 and ARM did
               | away with this madness and only have a single calling
               | convention (per platform).
        
             | kelnos wrote:
             | How is that not simple? You just declare the function and
             | then call it. I find it hard to imagine how it could be any
             | more simple than that.
        
               | varjag wrote:
               | Now imagine a hundred or two functions, structures and
               | callbacks, some of them exposed only as CPP macros over
               | internal implementation. PJSIP low level API is one
               | example.
        
           | jacobgorm wrote:
           | Passing integers around is easy, sharing structs or strings
           | and context pointers for use in callbacks crossing the
           | language barrier etc is typically much harder.
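
To make the callback-plus-context-pointer case concrete, here is a sketch
of the usual trampoline pattern. `for_each_item` stands in for a C library
function and is written in Rust only so the example runs on its own; it
and the other names are hypothetical. The extra care is in the `unsafe`
cast: the type behind the `void *` is checked by convention, not by the
compiler.

```rust
use std::os::raw::c_void;

/// Callback shape a C library might use: opaque context plus a value.
type Callback = extern "C" fn(ctx: *mut c_void, value: i32);

/// Stand-in for a C API that calls back once per item (hypothetical).
fn for_each_item(items: &[i32], cb: Callback, ctx: *mut c_void) {
    for &item in items {
        cb(ctx, item);
    }
}

/// Trampoline with the C ABI: recovers the Rust state from `ctx`.
extern "C" fn collect(ctx: *mut c_void, value: i32) {
    // SAFETY: `ctx` was created from `&mut Vec<i32>` in `collect_all`,
    // which keeps the vector alive and untouched during the call.
    let out = unsafe { &mut *(ctx as *mut Vec<i32>) };
    out.push(value);
}

/// Safe wrapper: callers never see the raw pointer.
fn collect_all(items: &[i32]) -> Vec<i32> {
    let mut out: Vec<i32> = Vec::new();
    let ctx = &mut out as *mut Vec<i32> as *mut c_void;
    for_each_item(items, collect, ctx);
    out
}

fn main() {
    println!("{:?}", collect_all(&[1, 2, 3]));
}
```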
        
             | Someone wrote:
             | For rust code calling C, sharing structs is doable with
             | _#[repr(C)]_. See https://doc.rust-lang.org/reference/type-
             | layout.html#reprc-s...
             | 
             | (Nitpick: I don't think it technically is correct to call
             | this "The C representation", as struct layout in C depends
             | on the C compiler/ABI. I wouldn't trust this to be good
             | enough for serializing data between 32-bit and 64-bit
             | systems, for example. For calling code on the same system,
             | it's good enough, though)
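
A small sketch of what `#[repr(C)]` gives you. The `Header` struct is
invented for illustration, and the numbers in the comments assume a
typical target where `u32` is 4-byte aligned (for example x86_64 or
aarch64), which is exactly the platform dependence the nitpick above is
about.

```rust
use std::mem::{align_of, size_of};

/// Fields are laid out in declaration order with C-style padding.
#[repr(C)]
struct Header {
    tag: u8,    // offset 0, followed by 3 bytes of padding
    len: u32,   // offset 4
    flags: u16, // offset 8, followed by 2 bytes of tail padding
}

/// Byte offsets of the three fields, measured from a live value.
fn offsets() -> (usize, usize, usize) {
    let h = Header { tag: 0, len: 0, flags: 0 };
    let base = &h as *const Header as usize;
    (
        &h.tag as *const u8 as usize - base,
        &h.len as *const u32 as usize - base,
        &h.flags as *const u16 as usize - base,
    )
}

/// Returns (size, alignment, field offsets).
fn layout() -> (usize, usize, (usize, usize, usize)) {
    (size_of::<Header>(), align_of::<Header>(), offsets())
}

fn main() {
    println!("{:?}", layout());
}
```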
        
         | Smaug123 wrote:
         | The point is that Rust can model invariants that C can't. You
         | can call both ways, but if C is incapable of expressing what
         | Rust can, that has important implications for the design of
         | APIs which must be common to both.
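
A sketch of the kind of invariant meant here, loosely modelled on the
"existing inode or new, not yet initialised inode" situation that a
lookup like `iget_locked()` has to communicate. In C the caller has to
remember to check and initialise; below, the two cases are different
types, so forgetting is a compile error. All types and the in-memory
cache are assumptions for illustration, not the API from the talk.

```rust
use std::collections::HashMap;

/// Fully initialised inode, safe to hand to the rest of the system.
#[derive(Debug, Clone, PartialEq)]
struct Inode {
    ino: u64,
    size: u64,
}

/// Freshly allocated inode. It exposes no `size`, so uninitialised
/// data cannot be read by mistake.
struct NewInode {
    ino: u64,
}

/// Result of a lookup: the caller must handle both cases.
enum Lookup {
    Cached(Inode),
    New(NewInode),
}

struct InodeCache {
    map: HashMap<u64, Inode>,
}

impl InodeCache {
    fn new() -> InodeCache {
        InodeCache { map: HashMap::new() }
    }

    fn get_or_create_inode(&self, ino: u64) -> Lookup {
        match self.map.get(&ino) {
            Some(inode) => Lookup::Cached(inode.clone()),
            None => Lookup::New(NewInode { ino }),
        }
    }
}

impl NewInode {
    /// Consumes the new inode; afterwards only the initialised form exists.
    fn init(self, size: u64, cache: &mut InodeCache) -> Inode {
        let inode = Inode { ino: self.ino, size };
        cache.map.insert(inode.ino, inode.clone());
        inode
    }
}

/// Typical caller: reads from "disk" only when the inode is new.
fn open(cache: &mut InodeCache, ino: u64, size_on_disk: u64) -> Inode {
    match cache.get_or_create_inode(ino) {
        Lookup::Cached(inode) => inode,
        Lookup::New(new) => new.init(size_on_disk, cache),
    }
}

fn main() {
    let mut cache = InodeCache::new();
    println!("{:?}", open(&mut cache, 5, 100));
    println!("{:?}", open(&mut cache, 5, 999)); // served from the cache
}
```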
        
           | gwbas1c wrote:
           | That's not how I interpreted it: There is a clear need to be
           | able to write filesystems in Rust, and the kernel
           | developer(s) who write the filesystem API don't want to have
           | to maintain the bindings to Rust.
        
         | kelnos wrote:
         | > _Does Rust need to change to make it easier to call C?_
         | 
         | No, because it's already dirt-simple to do. You just declare
         | the C function as 'extern "C"', and then call it. (You will
         | often need to use 'unsafe' and convert or cast references to
         | raw pointers, but that's simple syntax as well.)
         | 
         | There are tools (bindgen being the most used) that can scan C
         | header files and produce the declarations for you, so you don't
         | have to manually copy/paste and type them yourself.
         | 
         | > _Maybe Rust as a language needs to bend a little in this
         | case, instead of expecting the kernel developers to bend to the
         | language?_
         | 
         | I think you maybe misunderstood the article? There's nothing
         | wrong with the language here. The argument is around how Rust
         | should be used. The Rust-for-Linux developers want to encode
         | semantics into their API calls, using Rust's features and type
         | system, to make these calls safer and less error-prone to use.
         | The people on the C side are afraid that doing so will make it
         | harder for them to evolve the behavior and semantics of their C
         | APIs, because then the Rust APIs will need to be updated as
         | well, and they don't want to sign up for that work.
         | 
         | An alternative that might be more palatable is to not make use
         | of Rust features and the type system in order to encode
         | semantics into the Rust API. That way, it will be easier for C
         | developers, as updating the Rust API when the C API changes will
         | be mechanical and simple to do. But then we might wonder what
         | the point is of all this Rust work if the Rust-for-Linux
         | developers can't use some Rust features to make better, safer
         | APIs.
         | 
         | > _I've done a bit of Rust, and (as a hobbyist,) it's still
         | not clear (to me) how to interoperate with C._
         | 
         | Kinda weird that you currently have the top-voted comment when
         | you admit you don't understand the language well enough to have
         | an informed opinion on the topic at hand.
        
         | emporas wrote:
         | If you like to see some examples of C bindings:
         | 
         | https://github.com/tree-sitter/tree-sitter/blob/25c718918084...
        
       | BiteCode_dev wrote:
       | Given how those discussions usually go, and the scale of the
       | change, I find that discussion extraordinarily civil.
       | 
       | I disagree with the negative tone of this thread, I'm quite
       | optimistic given how clearly the parties involved were able to
       | communicate the pain points with zero BS.
        
         | nickparker wrote:
         | I found myself reading this more for the excellent notetaking
         | than for the content.
         | 
         | I suspect the discussion was about as charged, meandering, and
         | nitpicky as we all expect a PL debate among deeply opinionated
         | geeks to be, and Jake Edge (who wrote this summary) is
         | exceptionally good at removing all that and writing down
         | substance.
        
           | BiteCode_dev wrote:
           | Certainly.
           | 
           | We are talking about extremely competent people who worked on
           | a critical piece of software for years and invested a lot of
           | their lives in it, with all pain, effort, experience, and
           | responsibilities that come with that.
           | 
           | That this debate is inscribed in a process that is still
           | ongoing, and in fact, progressing, is a testament to how
           | healthy the situation is.
           | 
           | I was expecting the whole Rust thing to be shut down 10
           | times, in a flow of distasteful remarks, already.
           | 
           | This means that not only is Rust vindicated as promising for
           | the job, but both teams are willing and up to the task of
           | working on the integration.
           | 
           | Those projects are exhausting, highly under-pressure
           | situations, and they last a long time.
           | 
           | I still find that the report is showing a positive outcome.
           | What do people expect? Move fast and break things?
           | 
           | A barrage of "no" is how it's supposed to go.
        
             | 0cf8612b2e1e wrote:
             | I am definitely of the opinion we need to rush away from C.
             | Rust, Go, Zig, etc.: which one does not matter, but anything
             | which can catch some of the mistakes that squishy humans
             | keep repeating.
             | 
             | That being said, the file system is one of those
             | infrastructure bits where you cannot make a mistake.
             | Introduce a memory corruption bug leading to crashes every
             | Thursday? Whatever. Total loss of data for 0.1% of users
             | during a leap year at a total eclipse? Apocalypse.
             | 
             | There is no amount of being too careful when interfacing
             | with storage. C may have a lot of foibles, but it is the
             | devil we know.
        
             | structural wrote:
             | I agree. And ideally, every time you raise the question and
             | get the "no" response, you learn something about the system
             | you're modifying or the reviewer learns something about
             | your solution. Then you improve your solution, and come
             | back.
             | 
             | Eventually consensus is built - either the solution becomes
             | good enough, or both the developers and the reviewers agree
             | that it's not going to work out and the line of development
             | gets abandoned.
             | 
             | Large-scale change in production is hard, and messy, and
             | involves a lot of imperfect humans that we hope are mostly
             | well-intentioned.
        
             | emporas wrote:
             | Moving forward with 1970s technology breaks things. Rust is
             | tested already on other OSes like Windows, Mac (or iOS) and
             | Android and solves several pitfalls of C and C++. Some
             | quotes from the Android team [1]:
             | 
             | "To date, there have been zero memory safety
             | vulnerabilities discovered in Android's Rust code."
             | 
             | "Safety measures make memory-unsafe languages slow"
             | 
             | Not saying Rust is the perfect solution to every problem,
             | but it is definitely not an outlandish proposition to use
             | it where it makes sense.
             | 
             | [1] https://security.googleblog.com/2022/12/memory-safe-
             | language...
        
       | sandywaffles wrote:
       | It wasn't clear to me, and I am not familiar enough with the
       | Linux filesystem code to know, whether this Rust API would be
       | wrapping or re-implementing the C APIs. If it's re-implementing
       | them (or rather adding an additional API), it seems keeping the
       | names the same as the C API would be problematic and lead to
       | more confusion over time, even if initially it helped
       | already-familiar developers grok what's going on faster.
        
         | CGamesPlay wrote:
         | > Almeida put up a slide with the equivalent of iget_locked()
         | in Rust, which was called get_or_create_inode().
         | 
         | Seems like the answer is that it's reimplementing and doesn't
         | use the same names.
        
           | swfsql wrote:
           | I'm not familiar with those functions, but I had the
           | impression they actually shouldn't have the same name.
           | 
           | Since the Rust function has implicit/automatic behavior
           | depending on its state and on how it's used at the call
           | site, and since the C one doesn't have any
           | implicit/automatic behavior (as in, separate/explicit
           | lifecycle calls must be made "manually"), I don't even see
           | a reason for them to have the same name.
           | 
           | That is to say, having the same name would be somewhat wrong
           | since the functions do different things and serve different
           | purposes.
           | 
           | But it would make sense, at least from the Rust side, to have
           | documentation referring to the original C name.
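
The point above can be illustrated with a minimal userspace sketch. This is not the kernel API: `Inode`, `NewInode`, `Lookup` and `InodeCache` are hypothetical stand-ins, and only the names `get_or_create_inode()` and `iget_locked()` come from the article. It assumes the Rust call returns a type that distinguishes a cached inode from a freshly allocated one, so the "initialize it" step is enforced by the type system rather than by a separate manual call. The `#[doc(alias = ...)]` attribute is a real rustdoc feature that makes the item searchable under the C name.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an inode; not the kernel's type.
#[derive(Debug)]
pub struct Inode {
    pub ino: u64,
    pub size: u64,
}

/// A freshly allocated inode that is not yet visible in the cache.
/// The only way to publish it is `init`, which consumes the value.
pub struct NewInode<'a> {
    ino: u64,
    cache: &'a mut HashMap<u64, Inode>,
}

impl<'a> NewInode<'a> {
    /// Fill in the inode and publish it to the cache.
    pub fn init(self, size: u64) -> &'a Inode {
        let ino = self.ino;
        self.cache.entry(ino).or_insert(Inode { ino, size })
    }
}

/// Result of a lookup: the caller must handle both cases.
pub enum Lookup<'a> {
    Cached(&'a Inode),
    New(NewInode<'a>),
}

/// Hypothetical inode cache keyed by inode number.
#[derive(Default)]
pub struct InodeCache {
    inodes: HashMap<u64, Inode>,
}

impl InodeCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Look up an inode, or hand back a `NewInode` to initialize.
    ///
    /// Loosely corresponds to the C function `iget_locked()`; the
    /// doc alias lets a rustdoc search for the C name land here.
    #[doc(alias = "iget_locked")]
    pub fn get_or_create_inode(&mut self, ino: u64) -> Lookup<'_> {
        if self.inodes.contains_key(&ino) {
            Lookup::Cached(&self.inodes[&ino])
        } else {
            Lookup::New(NewInode { ino, cache: &mut self.inodes })
        }
    }
}
```

In this sketch, dropping a `NewInode` without calling `init` simply leaves the cache untouched, which is the kind of automatic behavior the C call sequence does not have.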
        
       | brodouevencode wrote:
       | > about the disconnect between the names in the C API and the
       | Rust API, which means that developers cannot look at the C code
       | and know what the equivalent Rust call would be
       | 
       | Ah, the struggle of legacy naming conventions. I've had success
       | in keeping the same name but when I wanted an alternative name I
       | would just wrap the old name with the new name.
       | 
       | But yeah, naming things is hard.
        
         | adastra22 wrote:
         | One of the two major problems in computer science (the other
         | two being concurrency and off-by-one errors).
        
       | simon04 wrote:
       | tl;dr?
        
       ___________________________________________________________________
       (page generated 2024-07-15 23:01 UTC)