[HN Gopher] Rust for Filesystems
___________________________________________________________________
Rust for Filesystems
Author : drakerossman
Score : 237 points
Date : 2024-07-15 09:39 UTC (13 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| gritzko wrote:
| From the minutes I conclude that Rust-in-the-kernel looks like an
| additional complexity tax. I mean, if you write an OS from
| scratch, you can use the full power of your language. Plastering
| it to the side of an already _vast_ codebase creates additional
| issues, as we see here.
| thesnide wrote:
| While I agree there are benefits to rust, I tend to think all
| reason cannot fight hype.
|
| The tax will be seen as a necessity to embrace future and
| progress.
|
| I'm wondering why we do not restrict ourselves to a safe subset
| instead of jumping onto a huge bandwagon of unknown bugs and
| tradeoffs.
| Yoric wrote:
| Safe subset of what?
| pjc50 wrote:
| There is no "safe subset" of C. MISRA is fairly close, but
| all sorts of things that you might need, like integer
| arithmetic, have potential UB in C.
|
| (The best current effort is https://sel4.systems/ , which is
| written in C but has a large proof of safety attached. The
| language design question is basically: should the proof be
| part of the language?)
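
As a point of comparison for the remark about integer arithmetic: signed overflow is undefined behavior in C, whereas Rust asks the programmer to pick an overflow policy. A minimal sketch using standard-library integer methods; the function names are made up for illustration.

```rust
/// Adds two values, reporting overflow as `None` instead of leaving it undefined.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Adds two values with explicit two's-complement wraparound.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Adds two values, clamping the result at the numeric bounds.
fn add_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}
```

Each policy is spelled out at the call site, so a reviewer can see which one was intended. Plain `+` on integers panics on overflow in debug builds and wraps in default release builds.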
| regularfry wrote:
| Given that undefined behaviour just means "undefined by the
| standard", do you get usefully closer to being able to
| identify a safe subset with the (MISRA/alternative,
| specific compiler, specific architecture) triple?
| pjc50 wrote:
| No, undefined behavior does _not_ mean "not defined by
| the standard", it means those places where the standard
| says "undefined behavior". And then the long and
| complicated war over "the compiler may assume that UB
| does not happen and then optimize on that basis".
|
| You might be able to tighten it up in some specific
| cases, and those battles are being fought elsewhere, but
| there's stuff like lock lifetimes which you cannot do
| without substantial extra annotations inside or outside
| the language.
| regularfry wrote:
| Sorry, yes - poor wording on my part.
| PhilipRoman wrote:
| I found frama-c to be pretty good, including all the
| integer quirks
| germandiago wrote:
| I like Linus's view on evolution. Evolution will tell what the
| most sensible choices are over time. It is like "the market"
| in some way. Let everyone make their bets, wait, see,
| analyze, research. That's it.
| mnau wrote:
| > additional complexity tax
|
| Yes, but that should be offset by easier driver development.
| See the blog about the Rust GPU driver for Asahi Linux, done in
| one month. EDIT: Google "tales of the m1 gpu" (the author has
| very negative opinions about Hacker News; read if you like by
| clicking the link
| https://asahilinux.org/2022/11/tales-of-the-m1-gpu/)
|
| Is it universal? We'll see in coming years.
| eru wrote:
| Alas, that link just gets you a rant about politics, if you
| click on it directly.
|
| Copy-and-pasting works.
| olivermuty wrote:
| "Rant about politics", haha. Or as other people like to
| call it: "A real concern described in an apt manner".
|
| I have observed these inflammatory sub-graphs of comments
| myself and have thought that this must be a huge breeding
| ground for unmoderated and unwanted behaviour, because it
| more or less becomes invisible once flagged enough.
| erdii wrote:
| In this specific case, complaining about "politics" leaves
| the sour aftertaste of enabling (or at least not condemning)
| harassment to the point of single folks taking their own
| lives over it.
|
| Why?! Even if you're not sure what to think about the
| queer movement; even if you have already made up your
| mind about the queer movement and oppose their ideas or
| some of them; I refuse to believe that any single person
| would not want to stop someone from bullying someone else
| into their own suicide!
|
| hastily jotted rant for the folks who'd like to complain
| about "politics" creeping into every discussion
| everywhere:
|
| It's really sad to see so many folks disconnecting and
| immediately dismissing whole groups of other folks as
| soon as they start complaining about an issue they have
| because of "politics". :(
|
| I get that you don't want to get involved in shit
| flinging shows and that it's tedious to figure out who's
| in the right and who's in the wrong. Especially because
| there are never clear answers. If you feel like this and
| then proceed to complain about 'politics' creeping
| everywhere, please beware of this:
|
| Pretending to be apolitical doesn't work most of the
| time, as politics is basically another word for "acting
| (or deliberately not acting) in some kind of public
| sphere", which you all do, and when the "politics" have
| arrived at a topic, then they'll stay there, at least in
| that specific case you are witnessing! You just are part
| of a hyperconnected and confusing world with a lot of
| conflict, whether you like it or not.
|
| Pretending to be apolitical also serves the upholding of
| whatever status quo is currently in place because
| anything that has even a slight chance of changing
| anything is inherently a political topic.
|
| Please don't turn your heads on "political" topics or, at
| least, don't complain about it in that way, as it mostly
| enables unjust behaviour to continue. (It doesn't even
| matter if it's the person who brought up the "political"
| stuff who is acting unjustly or the folks they're
| complaining about.) In both cases it's probably better to
| either avoid commenting at all or to convey your critical
| thoughts to that "political" conversation.
| pjc50 wrote:
| > I refuse to believe that any single person would not
| want to stop someone from bullying someone else into
| their own suicide!
|
| There are plenty of people who do want the freedom to say
| exactly what they choose, including a lengthy period of
| directed harassment, and shrug their shoulders if someone
| commits suicide over it. There's not much that can be
| done other than ban them from civilized spaces.
| pessimizer wrote:
| "Bullying" is a judgement. Intrinsic to the word is a
| judgement that what is being done is bad, and the person
| doing it would not describe it that way.
|
| And by that I don't mean that it's not bad to bully
| people (it's rather tautological), I mean that talk of
| bullying often begs the question, and is intentionally
| done in order to elide past the actual events that
| occurred. _Lese majeste_ laws against talking about the
| King, elected officials, or even cops or bureaucrats now
| get justified as anti-bullying.
|
| I missed any rant about politics in the blog however. But
| this thread has a smell of "my politics aren't political
| because they are true, and your politics are political
| because they are lies."
| jcranmer wrote:
| What the referer-replacement page was talking about is Kiwi
| Farms, which is doing the kind of stuff that even the US
| First Amendment's very expansive protections fail to
| protect. (The criminal liability here is "intentional
| inflection of emotional distress", although note that
| most lawsuits that allege that are groundless lawsuits
| that largely fail to make it past the motion to dismiss
| for failure to state a claim stage, as "they made me feel
| bad" isn't sufficient to allege an IIED).
| nyssos wrote:
| > The criminal liability here is "intentional inflection
| of emotional distress",
|
| Intentional infliction of emotional distress is a tort,
| not criminal.
| keybored wrote:
| I mean this is correct. Bullies, or at least the adult
| ones, don't call what they do bullying (except the
| absolute geniuses that think "actually bullying has a
| social corrective behavior, I'm just helping actually").
|
| We're all just busting each other's balls, right? Look at
| George, he's laughing! He's totally in on the joke and
| not at all just showing submissive deference in order to
| not lose face.
| logicprog wrote:
| I'm glad at least some other people on this horrible site
| feel this way.
| im3w1l wrote:
| To me, you have to distinguish between harassment
| directed _at_ a person, and on the other hand discussion
| _about_ a person.
|
| It is not possible to use hacker news to send messages to
| someone's email. It is not even possible to send a dm to
| another hacker news user. You could potentially imagine
| hacker news being used to organize harassment of someone,
| but I have never seen either accusations or evidence of
| such a thing.
|
| So then we have established, that since harassment
| directed at them is impossible, the issue they have is
| that people on hacker news write bad things about them.
|
| Next, those things seem to often be flagged or downvoted,
| reducing exposure. But that is apparently not enough,
| because they can be found on google. So here we arrive at
| the core issue. There is content on Google about this
| person, that they would rather not be on Google. This is
| the complaint. So this person is basically saying that if
| there is unfavorable coverage of them findable on google,
| that is harassment, and it needs to go. If it isn't
| purged it's bullying that could lead to suicide.
|
| This is a very ambitious "landgrab" if you will, and it
| starts to seriously infringe on other people's rights.
|
| It's similar in that manner to other things like "stop
| terrorism" or "think of the children". Yes clearly
| harassment is bad, and terrorism is bad, and pedophiles
| are no good. But we can't completely give up on our
| freedoms because of that.
| lelanthran wrote:
| > Or as other people like to call it: "A real concern
| described in an apt manner".
|
| Oh please. Every activist for every marginal issue says
| the same thing. Doesn't make it true.
| aniviacat wrote:
| I can't find a rant in the link. Was the link changed or
| did I overlook something?
| jeroenhd wrote:
| The author of the website details their issue with the
| way HN does moderation (which I can't say I disagree
| with, especially after HN intentionally disabled referrer
| headers for websites that take issue with HN). This only
| shows up if HN is in the referer URL.
|
| I wouldn't call it a rant, but rather a polite request
| for HN policy to change.
| yjftsjthsd-h wrote:
| > I wouldn't call it a rant, but rather a polite request
| for HN policy to change.
|
| Which is made by blocking people with no ability to do
| anything about it.
| mcronce wrote:
| What? All you need to do is resubmit the request without
| a Referer header. For me, using Firefox, this meant
| clicking in the address bar, changing nothing, and
| hitting enter.
|
| That's hardly "no ability to do anything about it".
| yjftsjthsd-h wrote:
| I mean people with no ability to alter HN moderation
| policy.
|
| Consider it like this:
|
| 1. I think, "oh, that looks relevant, let me open that link".
| 2. I get a screenful of objections to HN moderation.
| 3. I shrug and close the tab.
|
| Since I'm just a normal user who can't change HN
| moderation, the outcome is that HN doesn't change but I
| walk away with a worse opinion of the Asahi Linux folks.
| lanstin wrote:
| Weird: HN is supposed to be technical but the people
| behind Asahi Linux have proven themselves to be technical
| wizards; meanwhile HN seems more interested in money and
| pseudo-libertarian politics than transformative technical
| excellence. But such is our age.
| amiga386 wrote:
| If you click the link - any link to asahilinux.org from
| HN - it should start "Hi! It looks like you might have
| come from Hacker News.", followed by Hector Martin ranting
| that he isn't in charge of the moderation policy of HN.
|
| The response given in Arkell v. Pressdram is appropriate.
| Phelinofist wrote:
| Also didn't show for me with disabled JS
| KallDrexx wrote:
| Maybe I'm reading something wrong, but the discussion this HN
| posting is about sounds very much like an attempt to make a
| Linux subsystem and API in Rust, so that Rust's type system
| can enforce conformance to safety mechanisms via its
| abstraction.
|
| That's fundamentally different and harder than a driver being
| written in Rust that uses the Linux subsystems' C APIs.
|
| I can see a lot of drivers taking on the complexity tax of
| being written in Rust easily. The complexity tax on writing a
| whole subsystem in Rust seems like an exponentially harder
| problem.
| wongarsu wrote:
| You could just write rust code that calls the C APIs, and
| that would probably avoid a lot of discussions like the one
| in the article.
|
| But making good wrappers would make development of
| downstream components even easier. As the opponents in the
| discussion said: there's about 50 filesystem drivers. If
| you make the interface better that's a boon for every file
| system (well, every file system that uses rust, but it
| doesn't take many to make the effort pay off). You pay the
| complexity tax once for the interface, and get complexity
| benefits in every component that uses the interface.
|
| We would have the same discussions about better C APIs, if
| only C was expressive enough to allow good abstractions.
| nicce wrote:
| > author has a very negative opinions about hacker news
|
| I am not sure if author (Asahi Lina) has, but the project
| lead Hector Martin definitely has.
| josephcsible wrote:
| Warning: Don't click that link. Copy and paste the URL
| instead. That site serves only verbal abuse and harassment to
| people that it detects are HN users.
| vollbrecht wrote:
| One can argue that any additional code is introducing
| complexity, not only writing Rust. Does that mean we should
| just stop innovating and go into an indefinite state of
| maintenance, since we are already so vast?
|
| A tax in one place may not be a net negative if, as in the
| real world, it is used to offset other problems. And just
| saying it will not offset any problems because of a single
| discussion that does not have a definite conclusion comes off
| as a weak argument.
| another2another wrote:
| >Does that mean we should just stop innovating and go into an
| indefinite state of maintenance
|
| If you mean that not using Rust (or maybe some other
| languages e.g. Zig or Ada?) means that there can be no
| innovation in the Linux kernel, I would have to disagree
| since there's been plenty of progress in plain old C (see for
| instance io_uring), not to mention the fact that the C
| language itself could change to make developer ergonomics
| better - since that seems to be the nub of the problem.
|
| It also raises the question of what happens in the future
| when Rust is no longer the language du jour - how do we know
| it's going to last the course? And now there's 2 different
| codebases, potentially maintained by 2 different diminishing
| sets of active maintainers.
| vollbrecht wrote:
| > If you mean that not using Rust (or maybe some other
| languages e.g. Zig or Ada?) means that there can be no
| innovation in the Linux kernel, I would have to disagree
| since there's been plenty of progress in plain old c.
|
| No, I didn't mean that. If I understand OP correctly here,
| he argued that it is a tax to use Rust, a tax is always
| bad, and thus should be avoided.
|
| We obviously can't know the future. We also can't know what
| future maintainers will look like, and whether there will be
| a bigger abundance of people understanding kernel-level C,
| kernel-level Rust, or both.
|
| I also don't think that any one developer can claim to
| fully get every part of the Linux kernel. So if one person
| wants to work on a particular subsection they need to make
| themselves familiar with it, independent of the language
| used. And then we are back at the argument: is the
| additional tax bad, or what does it bring to the table?
| riku_iki wrote:
| > One can argue that any additional code is introducing
| complexity
|
| additional code can replace more complex existing or future
| code, thus can reduce complexity
| Already__Taken wrote:
| reads a lot like letting perfect be the enemy of good.
| ysw0145 wrote:
| Having more options available in the Linux kernel is always
| beneficial. However, Rust may not be the solution for everything.
| While Rust does its best to ensure its programming model is safe,
| it is still a limited model. Memory issues? Use Rust! Concurrency
| problems? Switch to Rust! But you can't do everything that C does
| without using unsafe blocks. Rust can offer a fresh perspective
| to these problems, but it's not a complete solution.
| drdo wrote:
| But unsafe blocks are available! And you should use them when
| you have to, but only when you have to.
|
| Using an unsafe block with a very limited blast radius doesn't
| negate all the guarantees you get in all the rest of your code.
| sanxiyn wrote:
| Note that unsafe blocks don't have a limited blast radius.
| The blast that can be caused by a single incorrect unsafe
| block is unlimited, at least in theory. (In practice there
| could be a correlation between the amount of incorrectness
| and the effect, but the same could also be said about C
| undefined behavior.)
|
| Unsafe blocks limit the amount of code you need to get
| correct, but you need to get all of it correct. They are not
| a blast limiter.
| weinzierl wrote:
| Yes, they don't contain the blast, but they limit the
| places where a bomb can be, and that is their worth.
| foldr wrote:
| Generally speaking yes, but there could be a logic error
| somewhere in safe code that causes an unsafe block to do
| something it shouldn't. For example, a safe function that
| is expected to return an integer less than n is called
| within an unsafe block to obtain an index, but the return
| value isn't actually less than n. In that case the 'bomb'
| may be in the unsafe block, but the bug is in the safe
| code.
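
The scenario above can be sketched in a few lines. This is a made-up illustration rather than code from any real project: `pick_index`, `read_trusting` and `read_checked` are hypothetical names, and the off-by-one bug is planted on purpose.

```rust
/// Safe helper that is *supposed* to return an index less than `n`.
/// Planted bug: the modulus is off by one, so it can return `n` itself.
fn pick_index(seed: usize, n: usize) -> usize {
    seed % (n + 1) // BUG: should be `seed % n`
}

/// Reads an element using an index that safe code promised was in range.
/// Sound only if `pick_index` upholds its contract, which it does not.
fn read_trusting(data: &[u8], seed: usize) -> u8 {
    let i = pick_index(seed, data.len());
    // SAFETY (claimed): `i < data.len()`. The claim rests on safe code being correct.
    unsafe { *data.get_unchecked(i) }
}

/// The same read with the contract re-checked at the boundary.
fn read_checked(data: &[u8], seed: usize) -> Option<u8> {
    let i = pick_index(seed, data.len());
    data.get(i).copied()
}
```

With a three-element slice and `seed == 3`, `pick_index` returns 3. `read_checked` then returns `None`, while `read_trusting` would read out of bounds even though the faulty line sits in safe code.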
| Klonoar wrote:
| I cannot imagine writing a method to return a value less
| than n, and not verifying that constraint somewhere in
| the safe method.
| foldr wrote:
| It's just a simple example to illustrate the point.
| Realistic bugs would probably involve more complex logic.
|
| The prevalence of buffer overrun bugs in C code shows
| that it very definitely is possible for programmers to
| screw up when calculating indices. Rust removes a lot of
| the footguns that make that both easy to do and dangerous
| in C. But in unsafe Rust code, you're still fundamentally
| vulnerable to any arithmetic bug in any function that you
| call as part of the computation of an index.
| nicce wrote:
| > yes, but there could be a logic error somewhere in safe
| code that causes an unsafe block to do something it
| shouldn't.
|
| Sounds like bad design. You can typically limit the use
| of unsafe to so small an area that you can verify the
| ranges of parameters which would cause memory problems.
| Check for invalid values and panic. Still
| "memory safe", even if it panics.
| foldr wrote:
| Sure, it may be bad design. The point is that nothing in
| the Rust language itself guarantees that memory safety
| bugs will be localized to unsafe blocks. If your code has
| that property it's because you wrote it in a disciplined
| way, not because Rust forced you to write it that way
| (though it may have given some moral support).
|
| Let me emphasize that I am not criticizing Rust here. I
| am just pointing out an incontrovertible fact about how
| unsafe blocks in Rust work: memory safety bugs are not
| guaranteed to be localized to unsafe blocks.
| neysofu wrote:
| I believe this is technically true, but somewhat myopic
| when it comes to how maintainers approach unsafe blocks in
| Rust.
|
| UBs have unlimited blast radius by definition, and you'll
| need to write correct code in all your unsafe blocks to
| ensure your application is 100% memory-safe. There's no
| debate around that. From this perspective, there's no
| difference between a C application and a Rust one which
| contains a single, incorrect unsafe block.
|
| The appreciable difference between the two, however, is how
| much more debuggable and auditable an unsafe block is.
| There's usually not that many of them, and they're easily
| greppable. Those (hopefully) very few lines of code in your
| entire application benefit from a level of attention and
| scrutiny that teams can hardly afford for entire C
| codebases.
|
| EDIT: hardy -> hardly (typo)
| drdo wrote:
| That is of course correct.
|
| The main value is that you only have to make sure that a
| small amount of code surrounding the unsafe block is safe,
| and hopefully you provide a safe API for the rest of the
| code to use.
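
One way to picture "a small amount of code surrounding the unsafe block" plus a safe API is a type whose private fields carry the invariant. A minimal sketch; `SmallBuf` and `CAP` are hypothetical names invented for this example.

```rust
/// Capacity of the hypothetical buffer below.
const CAP: usize = 8;

/// A fixed-capacity byte buffer. The fields are private, so only the
/// methods in this `impl` can affect the invariant `len <= CAP`.
pub struct SmallBuf {
    bytes: [u8; CAP],
    len: usize, // invariant: len <= CAP
}

impl SmallBuf {
    pub fn new() -> Self {
        SmallBuf { bytes: [0; CAP], len: 0 }
    }

    /// Appends a byte, handing it back as `Err` when the buffer is full.
    pub fn push(&mut self, b: u8) -> Result<(), u8> {
        if self.len == CAP {
            return Err(b);
        }
        self.bytes[self.len] = b;
        self.len += 1;
        Ok(())
    }

    /// Returns the filled prefix of the buffer.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `new` and `push` keep `len <= CAP`, so the range is in bounds.
        unsafe { self.bytes.get_unchecked(..self.len) }
    }
}
```

Auditing the unsafe block means reading `new` and `push`, not every caller. Callers outside the module cannot set `len` directly.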
| CraigJPerry wrote:
| I'd word that differently: it reduces the search space for a
| bug when something goes wrong but it doesn't limit the blast
| radius - you can still spectacularly blow up safe Rust code
| with an unsafe block (that no-aliasing rule is seriously tough
| to adhere to!)
|
| This is definitely a strong benefit though.
| bilekas wrote:
| > Concurrency problems?
|
| I have to admit, while I do enjoy Rust in the sense that it
| makes sense and can really "click" sometimes, for anything
| asynchronous I find it really rough around the edges. It's not
| intuitive what's happening under the hood.
| the_duke wrote:
| Async != concurrency.
|
| One of the major wins of Rust is encoding thread safety in
| the type system with the `Send` and `Sync` traits.
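
A small sketch of what that looks like with plain threads, using only the standard library. The function name `parallel_sum` is invented for the example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Sums `1..=n` on each of `workers` threads into one shared counter.
fn parallel_sum(workers: usize, n: u64) -> u64 {
    // `Arc<Mutex<u64>>` is `Send + Sync`, so it may be shared across threads.
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                let local: u64 = (1..=n).sum();
                *total.lock().unwrap() += local;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}
```

Replacing `Arc` with `Rc` here is rejected at compile time, because `Rc` is not `Send` and `thread::spawn` requires its closure to be `Send`.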
| bilekas wrote:
| > Async != concurrency.
|
| Right, but tasks are sharing the same thread, which is fine,
| but when we need to expand on that with them actually
| working async, i.e. non-blocking, fire and quasi-forget, it's
| tricky. That's all I'm saying.
| the_duke wrote:
| The Rust async experience indeed has lots of pitfalls,
| very much agree there.
| dboreham wrote:
| s/The Rust/All/
| duped wrote:
| async == concurrency, concurrency != parallelism.
| sophacles wrote:
| async == concurrency in the same way square == rectangle
| - that is, it's not a symmetric '==', since there are
| plenty of rectangles that are not squares.
| asyx wrote:
| I really hate async rust. It's really great that rust forces
| you on a compiler level to use mutexes but async is a disease
| that is spreading through your whole project and introduces a
| lot of complexity that I don't feel in C#, Python or JS/TS.
| John23832 wrote:
| Eh, syntactically async rust is the exact same as C#. It's
| all task based concurrency.
|
| Now, lifetimes attached to function signatures are
| definitely a problem.
| colejohnson66 wrote:
| Not really. C#'s Task/Task<T> are based on background
| execution. Once something is awaited, control is returned
| to the caller. OTOH, Rust's Future<T> is, by default,
| based on polling/stepping, a bit like IEnumerable<T> in
| C#; if you never poll/await the Future<T>, it never
| executes. Executor libraries like Tokio allow running
| futures in the background, but that's not built-in.
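
The "never polled, never executes" behaviour can be checked by hand without any executor library. A minimal sketch using only the standard library; `NoopWake` and `demo_lazy_future` are names invented for this example.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for polling a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Builds a future, records whether its body ran before and after one poll.
/// Returns (ran_before_poll, ran_after_poll).
fn demo_lazy_future() -> (bool, bool) {
    let ran = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&ran);

    // Constructing the future does not execute the async block.
    let mut fut = Box::pin(async move {
        flag.store(true, Ordering::SeqCst);
    });
    let before = ran.load(Ordering::SeqCst);

    // Poll once by hand; this is what an executor would do for us.
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let ready = matches!(fut.as_mut().poll(&mut cx), Poll::Ready(()));
    assert!(ready);

    (before, ran.load(Ordering::SeqCst))
}
```

The body runs only inside `poll`, which is the contrast with a C# `Task` drawn above.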
| John23832 wrote:
| I don't want to "well actually" the "well actually", but
| I think you missed the word syntactically.
|
| > C#'s Task/Task<T> are based on background execution.
| Once something is awaited, control is returned to the
| caller.
|
| Async/await in any language happens in the background.
|
| What happens during a Task.Yield() (C#)? The task is
| yielded to another awaiting task in the work queue.
| Same as Rust.
|
| > OTOH, Rust's Future<T> is, by default, based on
| polling/stepping,
|
| The await syntax abstracts over Future/Stream polling.
| The real difference is that Rust introduced the Future
| type/concept of polling at all (which is a result of not
| having a standard async runtime). There is a concept of
| "is this task available to proceed on" in C# too, it's
| just not exposed to the user and handled by the CLR.
| merb wrote:
| > Task.Yield()
|
| In C# you probably never call yield.
| John23832 wrote:
| It was just an example. In practice, you're right.
| neonsunset wrote:
| Yield in C# is frequently used for the same reasons as in
| Rust, although implementation details between fine-
| grained C# Tasks and even finer grained Rust Futures
| aggregated into large Tasks differ quite a bit.
|
| The synchronous part of an async method in C# will run
| "inline". This means that should there be
| computationally expensive or blocking code, a caller will
| not be able to proceed even if it doesn't await it
| immediately. For example:
|
|     var ptask = Primes.Calculate(n); // returns Task<ulong[]>
|     // Do other things...right?
|     // Why are we stuck calculating the primes then?
|     Console.WriteLine("Started.");
|
| In order for the .Calculate to be able to continue
| execution "elsewhere" in a free worker thread, it would
| have to yield.
|
| If a caller does not control .Calculate, the most common
| (and, sadly, frequently abused) solution is to simply do
|
|     var task = Task.Run(() => Primes.Calculate(n));
|     // Do something else
|     var text = string.Join(',', await task);
|
| If a return signature of a delegate is also Task, the
| return type will be flattened - just a Task<T>, but
| nonetheless the returned task will be a proxy that will
| complete once the original task completes. This
| successfully deals with badly behaved code.
|
| However, a better solution is to instead insert
| `Task.Yield()` to allow the caller to proceed and not be
| blocked, before continuing a long-running operation:
|
|     var ptask = Primes.Calculate(n); // returns Task<ulong[]>
|     // Successfully prints the message
|     Console.WriteLine("Started.");
|
|     // In class Primes:
|     static async Task<ulong[]> Calculate(int n) {
|         await Task.Yield();
|         // Continue execution in a free worker thread
|         // If the caller immediately awaits us, most likely
|         // the caller's thread will end up doing so, as the
|         // continuation will be scheduled in the local queue,
|         // so it is unlikely for the work item to be stolen this
|         // quickly by another worker thread.
|     }
| brigadier132 wrote:
| How do you imagine async works otherwise? Also, in case
| you misunderstand how polling works in practice in rust,
| it's not polling in the traditional web development sense
| where it polls every 5 ms to check if a future is
| completed (although you can do this if you want to for
| some reason). There are typically "wakers" that are
| "awoken" by the os when data is ready and when they are
| "awoken" _then_ they poll. And since they are only awoken
| by the OS when the information is ready it really never
| has to poll more than once unless there are multiple
| bundled futures.
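
The wake-then-poll cycle described here can be imitated with a toy executor. A sketch under loud assumptions: `CountingWake`, `ReadyOnSecondPoll` and `run_to_completion` are invented names, and the future wakes itself where a real one would be woken by an OS readiness notification.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Records how many wake-ups are pending for the single future we run.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Pending on the first poll, Ready on the second.
struct ReadyOnSecondPoll {
    polled_once: bool,
}

impl Future for ReadyOnSecondPoll {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.polled_once {
            Poll::Ready("data")
        } else {
            self.polled_once = true;
            // Stand-in for the OS signalling that data has arrived.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Toy executor: polls only after a wake-up has been recorded.
/// Returns the output and the number of polls it took.
fn run_to_completion() -> (&'static str, usize) {
    let wake = Arc::new(CountingWake(AtomicUsize::new(1))); // 1 = initial poll
    let waker = Waker::from(Arc::clone(&wake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(ReadyOnSecondPoll { polled_once: false });
    let mut polls = 0;
    loop {
        // A real executor would sleep here instead of spinning.
        if wake.0.swap(0, Ordering::SeqCst) == 0 {
            continue;
        }
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

The future is polled exactly twice: once to start it and once after the wake-up. There is no timer-driven re-polling.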
| wongarsu wrote:
| Rust async isn't all that pleasant to use. On the other hand
| for normal threaded concurrency Rust is one of the best
| languages around. The type system prevents a lot of
| concurrency bugs. "Effortless concurrency" is a tagline the
| language really has earned.
| tialaramex wrote:
| > But you can't do everything that C does without using unsafe
| blocks
|
| For this particular work the huge benefit of Rust is its
| enthusiasm for _encapsulating_ such safety problems in types.
| Which is indeed what this article is about.
|
| C and particularly the way C is used in the kernel makes it
| everybody's responsibility to have total knowledge of the tacit
| rules. That cannot scale. A room full of kernel developers
| didn't entirely agree on the rules for a data structure they
| all use!
|
| Rust is very good at making you aware of rules you need to
| know, and making it not your problem when it can be somebody
| else's problem to ensure rules are followed. Sometimes the
| result will be less optimal, but even in the Linux kernel sub-
| optimal is often the right default and we can provide an
| (unsafe) escape hatch for people who can afford to learn six
| more weird rules to maybe get better performance.
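
A tiny illustration of a rule becoming "somebody else's problem": in C, "take the lock before touching this field" is a comment, while in Rust the data can live inside the lock. This sketch uses the standard library's `Mutex`, not the kernel's lock types, and `FileState` is a hypothetical type.

```rust
use std::sync::Mutex;

/// Hypothetical per-file state. `size` is only reachable through the mutex,
/// so "hold the lock while touching size" is enforced by the type.
struct FileState {
    size: Mutex<u64>,
}

impl FileState {
    fn new() -> Self {
        FileState { size: Mutex::new(0) }
    }

    /// Grows the file and returns the new size.
    fn append(&self, bytes: u64) -> u64 {
        let mut size = self.size.lock().unwrap();
        *size += bytes;
        *size
        // The guard is dropped here, which releases the lock.
    }
}
```

There is no way to write `state.size += 1` without going through `lock()`, so a caller does not need to know the locking rule in advance.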
| mjburgess wrote:
| > That cannot scale.
|
| lol... you're talking about the linux kernel, _written in C_.
|
| The vast majority of software over many decades "bottoms out"
| in C whether in VMs, operating systems, device drivers, etc.
|
| The scale of the success of C is unparalleled.
| dxroshan wrote:
| I agree with you.
| pjc50 wrote:
| The scale of C adoption is certainly unparalleled over the
| past 40 or so years, but so are the safety issues in the
| cyberwarfare era.
|
| https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/pre...
|
| If, somehow, we'd got to an era where (a) operating systems
| were widely deployed in a different language, and (b) the
| Morris Worm of 1988 had happened due to buffer overflow
| issues, then C in its current form would never have been
| adopted.
| mjburgess wrote:
| C is just convenient assembly. In an era where
| performance mattered, and much software was written _for_
| hardware, and controlling hardware, it's hard to see an
| alternative.
|
| C's choices were for performance on hardware-limited
| systems. I don't really see what other ones made sense
| historically.
| another2another wrote:
| >In an era where performance mattered, and much software
| was written for hardware, and controlling hardware, it's
| hard to see an alternative
|
| Actually, what made sense _was_ assembly when performance
| mattered above all. C was actually seen as a higher level
| language.
|
| However, C's advantage was that it was cross-platform,
| so you could compile or quite easily port the
| same code to many different platforms with a C compiler
| (Solaris, Windows, BSD, Linux and latterly Mac OS X). That
| was its strength (Pascal shared this too, but it didn't
| survive).
|
| You can see this in the legacy of software that's still
| in use today - lots of GNU utilities, shells, X Windows,
| the zlib library, gcc, openssl and, as discussed fairly
| recently, POV-Ray, which has been going since the 80's.
| pjc50 wrote:
| C is, in some important cases, _less_ convenient than
| assembly in ways which have to be worked around by either
| fooling the compiler or adding intrinsics. A recent
| example: https://justine.lol/endian.html
|
| Is the huge macro more convenient than the "bswap"
| instruction? No, but it's portable.
|
| > I don't really see what other ones made sense
| historically.
|
| Pascal chose differently in a couple of places. In
| particular, carrying the length with strings.
|
| C refused to define semantics for arithmetic. This gave
| you programs which were "portable" so long as you didn't
| mind different behavior on different platforms. Good for
| adoption, bad for sanity. It was only relatively recently
| (C23) that signed integers were required to be two's
| complement, and signed overflow is still undefined.
|
| 16-bit Windows even used C with the Pascal calling
| convention.
| http://www.c-jump.com/CIS77/ASM/Procedures/P77_0070_pascal_s...
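
For contrast with the macro-versus-`bswap` point, this is roughly how the same byte-order handling looks in Rust, where it is a standard-library method. A minimal sketch; the helper names `read_be32` and `swap32` are invented.

```rust
/// Decodes a big-endian u32 from the first four bytes of `buf`, if present.
fn read_be32(buf: &[u8]) -> Option<u32> {
    if buf.len() < 4 {
        return None;
    }
    Some(u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]))
}

/// Reverses the byte order of a 32-bit value. Compilers typically lower
/// this to a single instruction where the target has one.
fn swap32(x: u32) -> u32 {
    x.swap_bytes()
}
```

The slice carries its own length, much like the Pascal strings mentioned above, so the short-buffer case is handled explicitly instead of reading past the end.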
| kelnos wrote:
| > _C is just convenient assembly._
|
| I'm not sure if you're being facetious here, but that's
| absurd. It is certainly one of our lowest-level options
| before reaching for assembly, but it's still a high-level
| language that abstracts machine details from the
| programmer.
|
| > _In an era where performance mattered, and much
| software was written for hardware, and controlling
| hardware, it's hard to see an alternative._
|
| During that era, people who really needed to care about
| performance used assembly. The optimizations done by C
| compilers at that time were not nothing, but they were
| fairly primitive compared to what they do now.
| freeone3000 wrote:
| But it doesn't have to. We can choose any other language
| that compiles to native, including memory-safe ones.
| pjc50 wrote:
| > But you can't do everything that C does without using unsafe
| blocks
|
| How much of this is actually 100% unambiguously _necessary_? Is
| there a good reason why anything in the filesystem code at all
| _needs_ to be unsafe?
|
| I suspect it's a very small subset needed in a few places.
| nicce wrote:
| Usually avoidance of copying or moving data is the primary
| reason. In filesystems, this is especially pronounced.
| bigstrat2003 wrote:
| > But you can't do everything that C does without using unsafe
| blocks. Rust can offer a fresh perspective to these problems,
| but it's not a complete solution.
|
| It's true that you need to have unsafe code to do low level
| things. But it's a misconception that if you have to use unsafe
| then Rust isn't a good fit. The point of the safe/unsafe
| dichotomy in Rust is to clearly mark which bits of the code are
| unsafe, so that you can focus all your attention on auditing
| those small pieces and have confidence that everything else
| will work if you get those bits right.
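|
| A minimal sketch of that idea (illustrative only, not kernel code; the
| helper name `first_and_last` is made up). The function has a safe
| signature, and the single `unsafe` block with its SAFETY comment is
| the only part a reviewer has to audit:

```rust
/// Returns mutable references to the first and last elements of a slice,
/// or `None` if the slice has fewer than two elements.
fn first_and_last(items: &mut [u32]) -> Option<(&mut u32, &mut u32)> {
    let len = items.len();
    if len < 2 {
        return None;
    }
    let ptr = items.as_mut_ptr();
    // SAFETY: `len >= 2`, so index 0 and index `len - 1` are in bounds and
    // distinct, which means the two mutable references never alias.
    unsafe { Some((&mut *ptr, &mut *ptr.add(len - 1))) }
}

fn main() {
    let mut data: [u32; 4] = [1, 2, 3, 4];
    if let Some((first, last)) = first_and_last(&mut data) {
        // Callers stay in safe code; the borrow checker still applies here.
        std::mem::swap(first, last);
    }
    println!("{:?}", data);
}
```

| Safe Rust alone would reject two simultaneous `&mut` borrows into
| the same slice, which is why the small unsafe core is needed at all.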
| pjmlp wrote:
| The disconnect section of the article is a good example of
| exactly how not to do things, and how things can turn
| sour if the existing community isn't taken along for the ride.
| pornel wrote:
| I don't get how can each file system have a custom lifecycle for
| inodes, but still use the same functions for inode lifecycle
| management, but apparently with different semantics? That sounds
| like the opposite of an abstraction layer, if the same function
| must be used in different ways depending on implementation
| details.
|
| If the lifecycle of inodes is filesystem-specific, it should be
| managed via filesystem-specific functions.
| phkahler wrote:
| >> I don't get how can each file system have a custom lifecycle
| for inodes, but still use the same functions for inode
| lifecycle management, but apparently with different semantics?
|
| I had the same question. They're trying to understand (or even
| document) all the C APIs in order to do the rust work. It
| sounds like collecting all that information might lead to some
| [WTFs and] refactoring so questions like this don't come up in
| the first place, and that would be a good thing.
| crest wrote:
| I assume it's supposed to work by having the compiler track the
| lifetime of the inodes. The compiler is expected to help with
| ephemeral references (the file system still has to store the
| link count to disk).
| sandywaffles wrote:
| I understood it as they're working to abstract as much as is
| generally and widely possible in the VFS layer, but there will
| still be (many?) edge cases that don't fit and will need to be
| handled in FS-specific layers. Perhaps the inode lifecycle was
| just an initial starting point for discussion?
| DSMan195276 wrote:
| > but still use the same functions for inode lifecycle
| management
|
| I'm not an expert by any means, but I'm somewhat knowledgeable.
| There are different functions that can be used to create inodes
| and then insert them into the cache. `iget_locked()`, which is
| the focus here, is one particular pattern for doing it, but not
| every FS uses it, for one reason or another (or doesn't use it
| in every situation). Ex: FAT doesn't use it because the inode
| numbers get made-up and the FS maintains its own mapping of FAT
| position to inodes. There's then also file systems like `proc`
| which never cache their inode objects (I'm pretty sure that's
| the case, I don't claim to understand proc :P )
|
| The inode objects themselves still have the same state flow
| regardless of where they come from, AFAIK, so from a consumer
| perspective the usage of the `inode` doesn't change. It's only
| the creation and internal handling of the inode objects by the
| FS layer that depends on what the FS needs.
| seanhunter wrote:
| If you haven't seen it before, you might find this useful
| https://www.kernel.org/doc/html/latest/filesystems/vfs.html
|
| It's an overview of the VFS layer, which is how they do all the
| filesystem-specific stuff while maintaining a consistent
| interface from the kernel.
| hu3 wrote:
| Some of the comments below the lwn.net page are rather
| disrespectful.
|
| Imagine getting this comment about the open source project you
| contribute to:
|
| "Science advances one funeral at a time"
| gwbas1c wrote:
| Maybe they are asking the wrong questions?
|
| Does Rust need to change to make it easier to call C?
|
| I've done a bit of Rust, and (as a hobbyist,) it's still not
| clear (to me) how to interoperate with C. (I'm sure someone
| reading this has done it.) In contrast, in C++ and Objective C,
| all you need to do is include the right header and call the
| function. Swift lets you include Objective C files, and you can
| call C from them.
|
| Maybe Rust as a language needs to bend a little in this case,
| instead of expecting the kernel developers to bend to the
| language?
| tupshin wrote:
| This is not a notable challenge in rust, nor relevant to the
| article.
|
| The article is about finding ways of using rust to actually
| implement kernel fs drivers/etc. Note that any rust code in the
| kernel is necessarily consuming C interfaces.
|
| Bindgen works quite well for the use case that you are
| thinking.
|
| https://github.com/rust-lang/rust-bindgen
| moomin wrote:
| Yeah, the Rust proponents are being significantly more
| ambitious. Not just the ability to code a file system in
| Rust, but do it in a way that catches a lot of the
| correctness issues relating to the complex (and changing)
| semantics of FS development.
| codetrotter wrote:
| I've written Rust code that called C++
|
| It wasn't completely straightforward, but on the whole I
| figured out everything I needed to within a few days in order
| to be able to do it.
|
| Calling C would surely be very similar.
| duped wrote:
| It's actually pretty easy. All you need is declare `extern "C"
| fn foo() -> T` to be able to call it from Rust, and to pass the
| link flags either by adding a #[link] attribute or by adding it
| in a build.rs.
|
| You can use the bindgen crate to generate bindings ahead of
| time, or in a build.rs and include!() the generated bindings.
|
| Normally what people do is create a `-sys` crate that contains
| only bindings, usually generated. Then their code can `use` the
| bindings from the sys crate as normal.
|
| > in contrast, in C++ and Objective C, all you need to do is
| include the right header
|
| _and_ link against the library.
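|
| A minimal sketch of the declare-and-call pattern, assuming a recent
| toolchain (older ones spell the block plain `extern "C"`) and a
| target where the standard library already links the platform C
| library, so no #[link] attribute or build.rs is needed for `strlen`.
| The wrapper name `c_strlen` is made up for the example:

```rust
use std::ffi::{c_char, CString};

// Declaration of the C standard library's `strlen`. `size_t` is assumed
// to match Rust's `usize`, which holds on the usual targets.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: `CString` guarantees a NUL-terminated buffer with no
/// interior NUL bytes, which is what `strlen` requires.
fn c_strlen(text: &str) -> Option<usize> {
    // Fails (returns None) if `text` contains an interior NUL byte.
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}

fn main() {
    println!("{:?}", c_strlen("inode"));
}
```

| For a library that is not linked by default, this is where the
| #[link] attribute or the build.rs link flags would come in.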
| lambda wrote:
| Calling C from Rust can be quite simple. You just declare the
| external function and call it. For example, straight out of the
| Rust book https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#usin... :
|
|     extern "C" {
|         fn abs(input: i32) -> i32;
|     }
|
|     fn main() {
|         unsafe {
|             println!("Absolute value of -3 according to C: {}", abs(-3));
|         }
|     }
|
| Now, if you have a complex library and don't want to write all
| of the declarations by hand, you can use a tool like bindgen to
| automatically generate those extern declarations from a C
| header file: https://github.com/rust-lang/rust-bindgen
|
| There's an argument to be made that something like bindgen
| could be included in Rust, not requiring a third party
| dependency and setting up build.rs to invoke it, but that's not
| really the issue at hand in this article.
|
| The issue is not the low-level bindings, but higher level
| wrappers that are more idiomatic in Rust. There's no way you're
| going to be able to have a general tool that can automatically
| do that from arbitrary C code.
| jiripospisil wrote:
| There's also cbindgen for going the other way around.
| https://github.com/mozilla/cbindgen
| varjag wrote:
| That's not really "simple", it's on par with C FFI in about
| any other language (except C++), with same drawbacks.
| gizmo686 wrote:
| ... And? Most languages make C interop simple.
| varjag wrote:
| They quickly become unwieldy on non-trivial APIs, with
| hundreds of definitions across dozens of files and with
| macros to boot. Naturally people would still get the job
| done but it's beyond simple.
| mcronce wrote:
| That's what bindgen is for, as was mentioned in the
| original comment you replied to.
| varjag wrote:
| How well does it handle preprocessor macros in APIs?
| marshray wrote:
| I have used it successfully against header files for
| Win32 COM interfaces generated from IDL which include
| major parts of the infamous "windows.h". Almost every
| type is a macro.
|
| This is an extremely well-understood space.
|
| Just open the docs and do it.
| commodoreboxer wrote:
| It's on par with C++, too. In C++ you need an `extern "C"`,
| because C++ linkage isn't guaranteed to be the same as C
| linkage. You can get away with wrapping that around it in a
| preprocessor conditional, but that's not all that much
| easier than Rust's bindgen.
|
| A lot of C to C++ interop is actually done wrong without
| knowing it. Throwing a C++ static function as a callback
| into a C function usually works, but it's not technically
| correct because the linkage isn't guaranteed to be the same
| without an extern "C". In practice, it usually is the same,
| but this is implementation-defined, and C++ could use a
| different calling convention from C (e.g. cdecl vs fastcall
| vs stdcall. The Borland C++ compiler uses fastcall by
| default for C++ functions, which will make them illegal
| callbacks for C functions).
|
| The major difference between Objective-C and C++'s C
| interop and other languages is the lack of the
| preprocessor. Macros will just work because they use the
| same preprocessor. That's really not easy to paper over in
| other languages that can't speak the C preprocessor.
| spacechild1 wrote:
| I think you're confusing some terms here.
|
| > In C++ you need an `extern "C"`, because C++ linkage
| isn't guaranteed to be the same as C linkage.
|
| `extern "C"` has nothing to do with linkage, all it does
| is disable name mangling, so you get the same symbol name
| as with a C compiler.
|
| > Throwing a C++ static function as a callback into a C
| function usually works, but it's not technically correct
| because the linkage isn't guaranteed to be the same
| without an extern "C".
|
| Again, linkage is not relevant here. Your C++ callbacks
| don't have to be declared as extern "C" either, because
| the symbol name doesn't matter. As you noted correctly,
| the calling conventions must match, but in practice this
| only matters on x86 Windows. (One notable example is
| passing callbacks to Win32 API functions, which use
| `stdcall` by default.) Fortunately, x86_64 and ARM did
| away with this madness and only have a single calling
| convention (per platform).
| kelnos wrote:
| How is that not simple? You just declare the function and
| then call it. I find it hard to imagine how it could be any
| more simple than that.
| varjag wrote:
| Now imagine a hundred or two functions, structures and
| callbacks, some of them exposed only as CPP macros over
| internal implementation. PJSIP low level API is one
| example.
| jacobgorm wrote:
| Passing integers around is easy, sharing structs or strings
| and context pointers for use in callbacks crossing the
| language barrier etc is typically much harder.
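|
| A rough sketch of the context-pointer pattern, to show where the
| difficulty sits. `for_each` is a stand-in for a C library function,
| written in Rust here only so the example is self-contained; all names
| are hypothetical. The "C side" sees an opaque `void *`, and the Rust
| callback has to cast it back to the right type by convention:

```rust
use std::ffi::c_void;

/// C-style callback type: a value plus an opaque context pointer.
type Visitor = extern "C" fn(value: i32, ctx: *mut c_void);

/// Stand-in for a C API. It knows nothing about the type behind `ctx`.
unsafe extern "C" fn for_each(
    values: *const i32,
    len: usize,
    visit: Visitor,
    ctx: *mut c_void,
) {
    for i in 0..len {
        // SAFETY: the caller promises `values` points to `len` readable i32s.
        let value = unsafe { *values.add(i) };
        visit(value, ctx);
    }
}

/// Rust callback: recovers the typed state from the opaque pointer.
extern "C" fn add_to_sum(value: i32, ctx: *mut c_void) {
    // SAFETY: `sum_all` always passes a pointer to a live `i64` that
    // nothing else accesses during the call.
    let sum = unsafe { &mut *(ctx as *mut i64) };
    *sum += i64::from(value);
}

/// Safe wrapper that owns the context for the duration of the call.
fn sum_all(values: &[i32]) -> i64 {
    let mut sum: i64 = 0;
    // SAFETY: pointer and length describe `values`; `sum` outlives the call.
    unsafe {
        for_each(
            values.as_ptr(),
            values.len(),
            add_to_sum,
            &mut sum as *mut i64 as *mut c_void,
        );
    }
    sum
}

fn main() {
    println!("{}", sum_all(&[1, 2, 3]));
}
```

| Nothing in the types stops `add_to_sum` from being paired with the
| wrong context; the safe wrapper is what keeps the two in sync.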
| Someone wrote:
| For rust code calling C, sharing structs is doable with
| _#[repr(C)]_. See https://doc.rust-lang.org/reference/type-
| layout.html#reprc-s...
|
| (Nitpick: I don't think it technically is correct to call
| this "The C representation", as struct layout in C depends
| on the C compiler/ABI. I wouldn't trust this to be good
| enough for serializing data between 32-bit and 64-bit
| systems, for example. For calling code on the same system,
| it's good enough, though)
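|
| A small sketch of what #[repr(C)] gives you, assuming a typical
| target where `u32` is 4-byte aligned. The `Record` struct and the
| `offsets` helper are made up for illustration:

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

/// Laid out like the equivalent C struct on the same target:
/// `struct Record { uint8_t tag; uint32_t id; uint16_t flags; };`
#[repr(C)]
struct Record {
    tag: u8,
    id: u32,
    flags: u16,
}

/// Byte offsets of `id` and `flags` from the start of the struct.
fn offsets() -> (usize, usize) {
    let r = Record { tag: 0, id: 0, flags: 0 };
    let base = addr_of!(r) as usize;
    (
        addr_of!(r.id) as usize - base,
        addr_of!(r.flags) as usize - base,
    )
}

fn main() {
    println!("size = {}, align = {}", size_of::<Record>(), align_of::<Record>());
    println!("offsets of id, flags = {:?}", offsets());
}
```

| Fields keep their declaration order and get C-style padding, so on
| such a target `id` lands at offset 4 and the struct is 12 bytes.
| Without the attribute the Rust compiler is free to reorder fields.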
| Smaug123 wrote:
| The point is that Rust can model invariants that C can't. You
| can call both ways, but if C is incapable of expressing what
| Rust can, that has important implications for the design of
| APIs which must be common to both.
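|
| A toy sketch of what "modeling an invariant" can look like, loosely
| inspired by the get_or_create_inode() idea from the article. All the
| types here (`Inode`, `NewInode`, `Lookup`, `Cache`) are hypothetical
| and this is not the actual Rust-for-Linux API. In C, the caller of
| iget_locked() checks a flag by convention; here the return type
| forces the caller to handle the "new, uninitialized" case:

```rust
use std::collections::HashMap;

/// Fully initialized inode; the only kind callers can read from.
#[derive(Debug, Clone, PartialEq)]
struct Inode {
    ino: u64,
    size: u64,
}

/// Freshly allocated inode that has not been filled in yet.
/// It exposes no data, so it cannot be used before `init` is called.
struct NewInode {
    ino: u64,
}

impl NewInode {
    /// Consumes the uninitialized inode and yields a usable one.
    fn init(self, size: u64) -> Inode {
        Inode { ino: self.ino, size }
    }
}

/// Result of a lookup: the type tells the caller which case applies.
enum Lookup {
    Cached(Inode),
    New(NewInode),
}

/// Toy stand-in for an inode cache.
struct Cache {
    inodes: HashMap<u64, Inode>,
}

impl Cache {
    fn get_or_create(&self, ino: u64) -> Lookup {
        match self.inodes.get(&ino) {
            Some(inode) => Lookup::Cached(inode.clone()),
            None => Lookup::New(NewInode { ino }),
        }
    }
}

/// Returns the cached inode, or initializes and caches a new one.
fn load(cache: &mut Cache, ino: u64, size_on_disk: u64) -> Inode {
    match cache.get_or_create(ino) {
        Lookup::Cached(inode) => inode,
        Lookup::New(new) => {
            let inode = new.init(size_on_disk);
            cache.inodes.insert(ino, inode.clone());
            inode
        }
    }
}

fn main() {
    let mut cache = Cache { inodes: HashMap::new() };
    let first = load(&mut cache, 7, 4096);
    let second = load(&mut cache, 7, 0); // served from the cache
    println!("{:?} {:?}", first, second);
}
```

| Forgetting the `Lookup::New` arm is a compile error, whereas a C
| header has no way to state that rule.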
| gwbas1c wrote:
| That's not how I interpreted it: There is a clear need to be
| able to write filesystems in Rust, and the kernel
| developer(s) who write the filesystem API don't want to have
| to maintain the bindings to Rust.
| kelnos wrote:
| > _Does Rust need to change to make it easier to call C?_
|
| No, because it's already dirt-simple to do. You just declare
| the C function as 'extern "C"', and then call it. (You will
| often need to use 'unsafe' and convert or cast references to
| raw pointers, but that's simple syntax as well.)
|
| There are tools (bindgen being the most used) that can scan C
| header files and produce the declarations for you, so you don't
| have to manually copy/paste and type them yourself.
|
| > _Maybe Rust as a language needs to bend a little in this
| case, instead of expecting the kernel developers to bend to the
| language?_
|
| I think you maybe misunderstood the article? There's nothing
| wrong with the language here. The argument is around how Rust
| should be used. The Rust-for-Linux developers want to encode
| semantics into their API calls, using Rust's features and type
| system, to make these calls safer and less error-prone to use.
| The people on the C side are afraid that doing so will make it
| harder for them to evolve the behavior and semantics of their C
| APIs, because then the Rust APIs will need to be updated as
| well, and they don't want to sign up for that work.
|
| An alternative that might be more palatable is to not make use
| of Rust features and the type system in order to encode
| semantics into the Rust API. That way, it will be easier for C
| developers, as updating the Rust API when the C API changes will be
| mechanical and simple to do. But then we might wonder what the
| point is of all this Rust work if the Rust-for-Linux developers
| can't use some Rust features to make better, safer APIs.
|
| > _I've done a bit of Rust, and (as a hobbyist,) it's still
| not clear (to me) how to interoperate with C._
|
| Kinda weird that you currently have the top-voted comment when
| you admit you don't understand the language well enough to have
| an informed opinion on the topic at hand.
| emporas wrote:
| If you like to see some examples of C bindings:
|
| https://github.com/tree-sitter/tree-sitter/blob/25c718918084...
| BiteCode_dev wrote:
| Given how those discussions usually go, and the scale of the
| change, I find that discussion extraordinarily civil.
|
| I disagree with the negative tone of this thread, I'm quite
| optimistic given how clearly the parties involved were able to
| communicate the pain points with zero BS.
| nickparker wrote:
| I found myself reading this more for the excellent notetaking
| than for the content.
|
| I suspect the discussion was about as charged, meandering, and
| nitpicky as we all expect a PL debate among deeply opinionated
| geeks to be, and Jake Edge (who wrote this summary) is
| exceptionally good at removing all that and writing down
| substance.
| BiteCode_dev wrote:
| Certainly.
|
| We are talking about extremely competent people who worked on
| a critical piece of software for years and invested a lot of
| their lives in it, with all pain, effort, experience, and
| responsibilities that come with that.
|
| That this debate is inscribed in a process that is still
| ongoing, and in fact, progressing, is a testament to how
| healthy the situation is.
|
| I was expecting the whole Rust thing to be shut down 10
| times, in a flow of distasteful remarks, already.
|
| This means that not only Rust is vindicated as promising for
| the job, but both teams are willing and up to the task of
| working on the integration.
|
| Those projects are exhausting, highly under-pressure
| situations, and they last a long time.
|
| I still find that the report is showing a positive outcome.
| What do people expect? Move fast and break things?
|
| A barrage of "no" is how it's supposed to go.
| 0cf8612b2e1e wrote:
| I am definitely of the opinion we need to rush away from C.
| Rust, Go, Zig, etc does not matter, but anything which can
| catch some of the repeated mistakes that squishy humans
| keep repeating.
|
| That being said, the file system is one of those
| infrastructure bits where you cannot make a mistake.
| Introduce a memory corruption bug leading to crashes every
| Thursday? Whatever. Total loss of data for 0.1% of users
| during a leap year at a total eclipse? Apocalypse.
|
| There is no amount of being too careful when interfacing
| with storage. C may have a lot of foibles, but it is the
| devil we know.
| structural wrote:
| I agree. And ideally, every time you raise the question and
| get the "no" response, you learn something about the system
| you're modifying or the reviewer learns something about
| your solution. Then you improve your solution, and come
| back.
|
| Eventually consensus is built - either the solution becomes
| good enough, or both the developers and the reviewers agree
| that it's not going to work out and the line of development
| gets abandoned.
|
| Large-scale change in production is hard, and messy, and
| involves a lot of imperfect humans that we hope are mostly
| well-intentioned.
| emporas wrote:
| Moving using a 70's technology breaks things. Rust is
| tested already on other OSes like Windows, Mac (or iOS) and
| Android and solves several pitfalls of C and C++. Some
| quotes from the Android team [1]:
|
| "To date, there have been zero memory safety
| vulnerabilities discovered in Android's Rust code."
|
| "Safety measures make memory-unsafe languages slow"
|
| Not saying Rust is the perfect solution to every problem,
| but it is definitely not an outlandish proposition to use
| it where it makes sense.
|
| [1] https://security.googleblog.com/2022/12/memory-safe-
| language...
| sandywaffles wrote:
| I wasn't clear and am not familiar enough with the Linux FS
| systems to know if this Rust API would be wrapping or re-
| implementing the C APIs? If it's re-implementing (or rather an
| additional API) it seems keeping the names the same as the C API
| would be problematic and lead to more confusion over time, even
| if initially it helped already-familiar-developers grok what's
| going on faster.
| CGamesPlay wrote:
| > Almeida put up a slide with the equivalent of iget_locked()
| in Rust, which was called get_or_create_inode().
|
| Seems like the answer is that it's reimplementing and doesn't
| use the same names.
| swfsql wrote:
| I'm not familiar with those functions, but I had the
| impression they actually shouldn't have the same name.
|
| Since the Rust function has implicit/automatic behavior
| depending on what its state is and how it's used by the
| callsite, and since the C one doesn't have any
| implicit/automatic behavior (as in, separate/explicit
| lifecycle calls must be made "manually"), I don't even see
| the reason for them to have the same name.
|
| That is to say, having the same name would be somehow wrong
| since the functions do different things and serve different purposes.
|
| But it would make sense, at least from the Rust side, to have
| documentation referring to the original C name.
| brodouevencode wrote:
| > about the disconnect between the names in the C API and the
| Rust API, which means that developers cannot look at the C code
| and know what the equivalent Rust call would be
|
| Ah, the struggle of legacy naming conventions. I've had success
| in keeping the same name but when I wanted an alternative name I
| would just wrap the old name with the new name.
|
| But yeah, naming things is hard.
| adastra22 wrote:
| One of the two major problems in computer science (the other
| two being concurrency and off-by-one errors).
| simon04 wrote:
| tl;dr?
___________________________________________________________________
(page generated 2024-07-15 23:01 UTC)