[HN Gopher] Uninitialized memory: Unsafe Rust is too hard
___________________________________________________________________
Uninitialized memory: Unsafe Rust is too hard
Author : drrlvn
Score : 97 points
Date : 2022-01-30 10:51 UTC (12 hours ago)
(HTM) web link (lucumr.pocoo.org)
(TXT) w3m dump (lucumr.pocoo.org)
| notpopcorn wrote:
| Without any unsafe code this is simply:
|
|     let role = Role {
|         name: "basic",
|         flag: 1,
|         disabled: false,
|     };
|
| The language tries to prevent you from interacting with a `Role`
| object that's not fully initialized. `mem::zeroed()` could work,
| but then you'll have to turn the `&'static str` into an
| `Option<&'static str>` or a raw pointer, to indicate that it
| might be null. You could also add `#[derive(Default)]` to the
| struct, to automatically get a `Role::default()` function to
| create a `Role` with and then modify the fields afterwards, if
| you want to set the fields in separate statements for some
| reason:
|
|     let mut role = Role::default();
|     role.name = "basic";
|     role.flag = 1;
|     role.disabled = false;
|
| And even with `MaybeUninit` you can initialize the whole struct
| (without `unsafe`!) with `MaybeUninit::write`. It's just that
| _partially_ initializing something is hard to get right, which is
| the point of the article I guess. But I wonder how commonly you
| would really want that, as it easily leads to mistakes.
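|
| A minimal sketch of the three safe-ish routes mentioned above, assuming
| a `Role` struct shaped like the article's (the `u32` type for `flag` is
| an assumption). Note that `MaybeUninit::write` itself is a safe call;
| only taking the value back out with `assume_init` needs `unsafe`.

```rust
use std::mem::MaybeUninit;

// Assumed shape of the article's struct; field types are a guess.
#[derive(Debug, Default, PartialEq)]
struct Role {
    name: &'static str,
    flag: u32,
    disabled: bool,
}

// Plain struct literal: every field is set at once.
fn role_literal() -> Role {
    Role { name: "basic", flag: 1, disabled: false }
}

// Start from `Default`, then overwrite fields one statement at a time.
fn role_from_default() -> Role {
    let mut role = Role::default();
    role.name = "basic";
    role.flag = 1;
    role.disabled = false;
    role
}

// Initialize the whole value through `MaybeUninit::write`.
fn role_via_maybe_uninit() -> Role {
    let mut slot = MaybeUninit::<Role>::uninit();
    slot.write(Role { name: "basic", flag: 1, disabled: false });
    // SAFETY: `write` above fully initialized `slot`.
    unsafe { slot.assume_init() }
}
```

| All three functions return the same `Role`; the difference is only in
| how much ceremony each one needs.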
| ATsch wrote:
| A much better way to do partial initialization is by splitting
| up the struct into multiple parts. This can be easily done in
| safe rust with Option, or MaybeUninit if you're really
| desperate for performance.
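|
| One way to read the `Option` suggestion is a builder whose fields stay
| optional until the final struct is assembled. This is only a sketch;
| `RoleBuilder` and `finish` are hypothetical names, not something from
| the article.

```rust
#[derive(Debug, PartialEq)]
struct Role {
    name: &'static str,
    flag: u32,
    disabled: bool,
}

// Hypothetical builder: each field is `None` until it has been set.
#[derive(Default)]
struct RoleBuilder {
    name: Option<&'static str>,
    flag: Option<u32>,
    disabled: Option<bool>,
}

impl RoleBuilder {
    // Returns `None` if any field was never set.
    fn finish(self) -> Option<Role> {
        Some(Role {
            name: self.name?,
            flag: self.flag?,
            disabled: self.disabled?,
        })
    }
}

fn build_basic_role() -> Option<Role> {
    let mut builder = RoleBuilder::default();
    builder.name = Some("basic");
    builder.flag = Some(1);
    builder.disabled = Some(false);
    builder.finish()
}
```

| The partially filled state lives in `RoleBuilder`, so a `Role` value
| never exists with a missing field.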
| [deleted]
| bodhiandpysics1 wrote:
| When dealing with unix syscalls, you actually sometimes need
| to pass structs that aren't fully initialized, or are
| initialized to zero with the exception of some fields. The
| quintessential example is the sigaction struct.
| pjmlp wrote:
| Another good example is Win32, in many cases only the
| length is initialized and the API does the rest, this
| allows them to change the ABI across versions without
| impacting the caller.
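|
| A rough sketch of the "zero everything, then set a few fields" pattern
| described in the two comments above. `QueryInfo` is a made-up struct,
| loosely modelled on C APIs that expect the caller to fill in a size
| field; it is not a real Win32 or POSIX type.

```rust
use std::mem;

// Hypothetical C-layout struct containing only integer data.
#[repr(C)]
struct QueryInfo {
    size: u32,
    flags: u32,
    reserved: [u8; 16],
}

fn new_query_info() -> QueryInfo {
    // SAFETY: every field is an integer or an array of integers, so the
    // all-zero bit pattern is a valid `QueryInfo`.
    let mut info: QueryInfo = unsafe { mem::zeroed() };
    // Only the size field is filled in; everything else stays zero.
    info.size = mem::size_of::<QueryInfo>() as u32;
    info
}
```

| This only works because zero is valid for every field; a struct with a
| reference field (such as `&'static str`) could not be created this way.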
| flohofwoe wrote:
| Also, the C version should really look like this:
|     const struct role r = {
|         .name = "basic",
|         .flag = 1,
|         .disabled = false,
|     };
|
| (of course this doesn't give you uninitialized memory in case
| new items are added to the struct, but why would one ever want
| that?)
| ______-_-______ wrote:
| Exactly this. I'm not sure what the author's practical goal is
| with that code. He rejects #[repr(C)] a few times, so it's not
| FFI.
|
| Yes, working with uninitialized memory is tedious. But that
| isn't something you ever have to do. If you're translating some
| C to Rust, write it using Rust idioms, instead of trying to
| preserve every call to malloc/free and every access to
| uninitialized memory.
| jerf wrote:
| The point is to make points in a simple environment.
|
| Anything small enough to clearly make points about unsafe
| Rust is almost certainly small enough to be done in safe
| Rust, defeating the purpose.
| malf wrote:
| If it's too big for an example, it's almost certainly too
| big for "trust me, I know this is safe".
| Ericson2314 wrote:
| The write_unaligned is pure FUD. Regular unpacked structs don't
| violate alignment on fields!
| raphlinus wrote:
| In the spirit of being charitable, I would say that the article
| is highlighting a place where the guarantees are incompletely
| documented. And not having a solid specification and clear
| documentation is absolutely one of the things that makes unsafe
| Rust hard.
|
| I'd actually like to qualify that a bit. Doing unsafe is not
| especially hard, just use pointers everywhere instead of
| references. That's exactly like C (which doesn't even have
| references), but the syntax is clunkier, you have to use
| function calls instead of concise operators like * and ->. What
| _is_ hard is finely interleaving safe and unsafe Rust, as the
| author is trying to do. That's difficult because the unsafe
| code has to uphold all the safety invariants of safe Rust, and
| those are indeed complicated.
| Ericson2314 wrote:
| > In the spirit of being charitable
|
| You are being nice, but even if there is a documentation
| error, we can prove that if _safe_ Rust isn't completely
| broken, and `&x.field` is allowed in safe Rust, then fields
| must be aligned. It is just preposterous that Rust would be more
| broken than C in this regard.
|
| > but the syntax is clunkier, you have to use function calls
| instead of concise operators like * and ->.
|
| Yes I agree, the syntax does suck. I see the macros use an
| unstable &raw, that would be more concise.
|
| I think what would be _really_ good is if x->y in Rust matched
| &x->y in C. That is nicely orthogonal to dereferencing, and
| always safe.
| DSMan195276 wrote:
| I agree, but IMO it's not all that clear to me that this is
| guaranteed. The documentation for the default layout pretty
| clearly says[0]:
|
| > There are no guarantees of data layout made by this
| representation.
|
| So that being the case, there's not really anything _stopping_
| them from introducing a situation where an unaligned field in a
| struct is created in the future. Of course I can't imagine why
| they would do that, but then maybe my imagination just isn't
| good enough. I think the author's point here (which is a good
| one in my opinion) is that when writing `unsafe` you're not
| supposed to rely on stuff that seems like it should be true,
| you're supposed to rely on stuff that's guaranteed to always be
| true, which with Rust isn't all that clearly defined.
|
| [0]: https://doc.rust-lang.org/reference/type-layout.html#the-
| def...
| Ericson2314 wrote:
| If struct fields weren't aligned, then _safe_ Rust would be
| completely and utterly broken. And really obviously so.
|
| This isn't just a matter of things "seeming". It is quite
| literally an implication of Safe Rust works => field offsets
| must be aligned. There is no other way for safe Rust to be
| safe, other than alignment not mattering at all because all
| accesses are careful to pessimistically not rely on it.
|
| I am sorry, but this hypothesis is just completely outside
| the Overton Window.
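|
| The argument above can be illustrated with a small check: taking
| `&m.field` is allowed in safe code for an ordinary (non-packed) struct,
| and references must be aligned, so each field address should be a
| multiple of its type's alignment. `Mixed` is a hypothetical struct
| chosen to mix field sizes.

```rust
use std::mem::align_of;

// Default (Rust) representation, deliberately mixing field sizes.
struct Mixed {
    a: u8,
    b: u64,
    c: u16,
    d: u32,
}

// True if the reference's address is a multiple of `T`'s alignment.
fn is_aligned<T>(r: &T) -> bool {
    (r as *const T as usize) % align_of::<T>() == 0
}

fn all_fields_aligned() -> bool {
    let m = Mixed { a: 1, b: 2, c: 3, d: 4 };
    // Each `&m.field` is created in safe code.
    is_aligned(&m.a) && is_aligned(&m.b) && is_aligned(&m.c) && is_aligned(&m.d)
}
```

| This demonstrates the behaviour rather than proving a documented
| guarantee, which is the distinction the thread is arguing about.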
| sharikous wrote:
| What is the reason for the rule that objects have to be always in a
| good state even inside unsafe?
| TheCycoONE wrote:
| Because the compiler makes assumptions on the valid layouts of
| a type, e.g. packing Option<bool> in a single byte. Prior to
| 2018 it was assumed the UB was only on read like C but that
| turned out not to be the case. To avoid this the type has to
| reflect that it might be uninitialized - hence the transparent
| MaybeUninit wrapper.
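|
| The layout assumption mentioned above can be observed directly. As a
| sketch: `bool` only uses two of the 256 byte values, so current
| compilers store `None` for `Option<bool>` in one of the unused ones.
| The `Option<&T>` case is documented as guaranteed; the `Option<bool>`
| size is what rustc does in practice rather than a written promise.

```rust
use std::mem::size_of;

// Size of `Option<bool>` in bytes on the current compiler.
fn option_bool_size() -> usize {
    size_of::<Option<bool>>()
}

// `Option<&T>` uses the null pointer for `None`, so it is pointer-sized.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u32>>() == size_of::<&u32>()
}
```

| A `bool` holding some other byte value would break this packing, which
| is why an uninitialized `bool` is a problem even if it is never read.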
| judofyr wrote:
| One of the core ideas behind unsafe blocks is that they don't
| actually change any semantics. All they do is _allow_ more
| operations. This makes it a lot easier to reason about (for
| both the programmer and the compiler) since there aren't two
| different sets of rules to remember.
|
| It does, however, make things a bit clunky since the unsafe bits
| need to ensure a "safe" state _throughout_ the whole block and
| not only by the end of it.
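|
| A small sketch of "unsafe only allows more operations": creating a raw
| pointer is safe, dereferencing it is one of the extra operations, and
| everything else inside the block is checked as usual. The function name
| is made up for illustration.

```rust
fn read_through_raw(value: &u32) -> u32 {
    // Creating a raw pointer from a reference is a safe operation.
    let ptr: *const u32 = value;
    // SAFETY: `ptr` was just derived from a live reference, so it is
    // non-null, aligned, and points to an initialized `u32`.
    unsafe { *ptr }
}

// The borrow checker is not switched off inside `unsafe`. Something like
// the following is still rejected because `v` is mutably borrowed twice:
//
//     let mut v = Vec::new();
//     unsafe {
//         let a = &mut v;
//         let b = &mut v;
//         a.push(1);
//         b.push(2);
//     }
```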
| xorvoid wrote:
| I've never understood why they don't just support partially
| initialized structs. They're already doing control flow
| analysis for initializing variables. It seems like a natural
| extension to do this for aggregate types (product and sum).
| From a type theory perspective, a struct T is not a T in
| uninitialized state, it's a "partial T" that at some point gets
| transformed into a T. So, you'd not be able to use the T in
| the normal sense until the compiler can prove that all fields
| have been init on all possible control flow paths.
|
| Why is this problematic? (I presume there is a fatal flaw as
| it seems too obvious of a solution..)
| xorvoid wrote:
| As an example, in C the following is quite common:
|
| Data dat;
|
| dat.a = 1;
|
| dat.b = "foobar";
|
| do_thing(&dat);
|
| It seems like the compiler should be able to treat "dat" as
| a "maybe uninit" type until after "dat.b" gets assigned.
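|
| For comparison, this is the control flow analysis Rust already does for
| whole local variables: a binding may be declared without a value as
| long as every path assigns it before use. The sketch below uses a
| made-up `describe` function.

```rust
fn describe(n: i32) -> &'static str {
    // Declared but not yet initialized.
    let label;
    if n < 0 {
        label = "negative";
    } else if n == 0 {
        label = "zero";
    } else {
        label = "positive";
    }
    // Accepted: the compiler can see every branch assigned `label`.
    label
}

// The per-field version asked about above is what is currently rejected:
//
//     let dat: Data;
//     dat.a = 1;        // error: `dat` is not fully initialized
```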
| comex wrote:
| Rust already supports partially _de_initialized structs -
| that is, moving out of a struct field by field - and
| there's no fundamental reason it can't support partial
| initialization too.
|
| Indeed, there's an open issue for it:
|
| https://github.com/rust-lang/rust/issues/54987
|
| But there are a lot of desired features with open issues,
| so don't expect this to be implemented anytime soon unless
| someone takes an interest in it.
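|
| A short sketch of the partial deinitialization that already works:
| fields can be moved out one at a time, and the compiler tracks which
| ones are still usable. `Pair` is a hypothetical struct; it must not
| implement `Drop` for this to be allowed.

```rust
struct Pair {
    left: String,
    right: String,
}

fn split_and_join(p: Pair) -> String {
    // Moves `left` out; `p` is now partially moved.
    let left = p.left;
    // `p` as a whole can no longer be used, but `p.right` is still intact.
    let right = p.right;
    format!("{}-{}", left, right)
}
```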
| staticassertion wrote:
| It's not problematic, it's just not work that's been done
| yet. Rust isn't finished. I bet you could get an RFC pushed
| through and implement it if you wanted to, it certainly is
| a promising idea.
| duped wrote:
| To the OP - why should creating uninitialized references with
| static lifetimes be easy? That is a recipe for undefined behavior
| - borrows aren't pointers, if you want a pointer to be zero
| initialized, then use a pointer.
|
| If you want safe access to that pointer then wrap it in a struct
| with an accessor method
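|
| A sketch of the "raw pointer behind an accessor" idea, using made-up
| names (`Slot`, `ONE`). The pointer may be null, and the accessor turns
| it back into an `Option` of a reference. In practice a plain
| `Option<&'static u32>` gives the same null-pointer layout without any
| `unsafe`, so this is mostly illustrative.

```rust
use std::ptr;

// Hypothetical static used as the pointee.
static ONE: u32 = 1;

struct Slot {
    // Invariant: either null or derived from a `&'static u32`.
    ptr: *const u32,
}

impl Slot {
    fn empty() -> Self {
        Slot { ptr: ptr::null() }
    }

    fn set(&mut self, value: &'static u32) {
        self.ptr = value as *const u32;
    }

    fn get(&self) -> Option<&'static u32> {
        // SAFETY: by the invariant above, `ptr` is null or points to a
        // `u32` that lives for the whole program.
        unsafe { self.ptr.as_ref() }
    }
}
```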
| remram wrote:
| > Because that raw pointer does not implement deref and because
| Rust has no -> operator we now need to dereference the pointer
| permanently to assign the fields with that awkward syntax.
|
| Absolutely not, you can still use a mutable reference:
|
|     let role = &mut *uninit.as_mut_ptr();
|     role.name = "basic";
| mkeeter wrote:
| The article says
|
| > A mutable reference must also never point to an invalid
| object, so doing let role = &mut *uninit.as_mut_ptr() if that
| object is not fully initialized is also wrong.
|
| I'm curious who's right here, because I've seen your pattern in
| code recently!
| remram wrote:
| Aren't all the `(*role)` in their code "mutable references"
| too?
|
|     (*role).name = "basic";
| steveklabnik wrote:
| The docs for as_mut_ptr: https://doc.rust-
| lang.org/stable/std/mem/union.MaybeUninit.h...
|
| > Incorrect usage of this method:
|
|     let mut x = MaybeUninit::<Vec<u32>>::uninit();
|     let x_vec = unsafe { &mut *x.as_mut_ptr() };
|     // We have created a reference to an uninitialized vector!
|     // This is undefined behavior.
|
| Also, above, it explicitly describes the intended API for
| partially initializing a struct: https://doc.rust-
| lang.org/stable/std/mem/union.MaybeUninit.h...
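|
| One pattern for this, which the standard library documentation for
| `MaybeUninit` also shows, is to write each field through a raw pointer
| obtained with `ptr::addr_of_mut!`, so that no reference to the
| uninitialized struct is ever created. A sketch, assuming a `Role`
| struct like the article's (the `u32` type for `flag` is a guess):

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Role {
    name: &'static str,
    flag: u32,
    disabled: bool,
}

fn init_role_field_by_field() -> Role {
    let mut uninit = MaybeUninit::<Role>::uninit();
    let ptr = uninit.as_mut_ptr();
    unsafe {
        // `addr_of_mut!` yields a raw pointer to the field without
        // creating an intermediate `&mut Role`.
        addr_of_mut!((*ptr).name).write("basic");
        addr_of_mut!((*ptr).flag).write(1);
        addr_of_mut!((*ptr).disabled).write(false);
        // SAFETY: all three fields have been written above.
        uninit.assume_init()
    }
}
```

| Here `(*ptr).name` is only a place expression inside the macro; unlike
| `&mut *ptr`, it does not assert that the whole struct is valid.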
| mkeeter wrote:
| Thanks Steve!
|
| It turns out I had misremembered; the cast I was thinking
| of is
|
| https://github.com/oxidecomputer/hubris/blob/master/sys/ker
| n...
|
| from &mut MaybeUninit<[T]> to &mut [MaybeUninit<T>], which
| doesn't construct a reference to something uninitialized.
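|
| A related sketch with a fixed-size array rather than the slice cast in
| the linked code: a collection of `MaybeUninit<T>` slots is itself a
| valid value, so it can be filled element by element and converted
| once everything is written. The `squares` function is made up for
| illustration.

```rust
use std::mem::{self, MaybeUninit};

fn squares() -> [u32; 4] {
    // An array of uninitialized slots; no `u32` is claimed to exist yet.
    let mut slots = [MaybeUninit::<u32>::uninit(); 4];
    for (i, slot) in slots.iter_mut().enumerate() {
        // `write` is safe: it initializes one slot.
        slot.write((i * i) as u32);
    }
    // SAFETY: every slot was written above, and `MaybeUninit<u32>` has
    // the same size and alignment as `u32`.
    unsafe { mem::transmute::<[MaybeUninit<u32>; 4], [u32; 4]>(slots) }
}
```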
| sAbakumoff wrote:
| I am under the impression that even _safe_ Rust is really hard to
| learn. Several years ago I started with GoLang and it was so easy
| to start programming even advanced things almost instantly. Rust
| drives me crazy. The syntax seems overcomplicated, the compiler
| errors are cryptic, the IDE is not helpful.
| sidkshatriya wrote:
| > I am under the impression that even _safe_ Rust is really
| hard to learn.
|
| [...]
|
| > The syntax seems overcomplicated, the compiler errors are
| cryptic, the IDE is not helpful.
|
| Yes, Rust is hard to learn. Rust does _seem_ over-complicated.
|
| However I like to compare Rust to exercise. You need to do a
| bit of it before you start reaping the benefits of it.
|
| If you suspend your judgement for a bit and try to write some
| Rust, starting from the very beginning you will find that:
|
| - Rust is actually a small language at its core, unlike the
| monstrosity that is C++ . You don't really need Advanced Rust
| to be productive. Use Advanced Rust only when you're...
| advanced
|
| - Rust actually is very consistent
|
| - The Rust compiler is actually very helpful. It's the least
| cryptic compiler I've met. But it's OK if you feel that way now, as
| you're just beginning your journey with Rust
|
| Avoid the temptation to "read" Rust from a book. Try to _do_
| Rust. Otherwise it might overwhelm you. Simply keep adding Rust
| techniques to your arsenal as you mature in your usage of Rust.
|
| Learning Rust changed the way I look at programming. Rust is a
| beautiful language. As a random example, just look at the
| Firecracker VMM written in Rust --
| https://github.com/firecracker-microvm/firecracker . It would
| have been very difficult for me to understand the
| codebase if it were written in C/C++!
|
| Rust is one of those rare languages I've encountered that if
| the code compiles, there is a high probability it will work.
| The type system is that good!
|
| TL;DR Persist and you will reap the rewards with Rust.
| FpUser wrote:
| >"Avoid the temptation to "read" Rust from a book. Try to
| _do_ Rust. Otherwise it might overwhelm you. Simply keep
| adding Rust techniques to your arsenal as you mature in your
| usage of Rust."
|
| I used to be the opposite in the ancient times. Would read a
| book and then start programming. But then the books were
| relatively tiny. Now most languages have matured to the state
| of having an insane amount of features. And the result is
| definitely what you say. Just read some brief overview on
| basic language constructs and then proceed by learning on an
| as-needed basis as you progress.
|
| Among others, I program in C++, for example, but I would
| shoot myself if asked to read something resembling its
| complete description. Sorry, I have a life to live. And I am
| using only the subset of C++ that solves my particular needs. If
| I feel my code does not look nice when doing some particular
| stuff then the time comes to do some more reading.
| Arch-TK wrote:
| >It would have been very difficult for me to
| understand the codebase if it were written in C/C++!
|
| I find rust and modern C++ codebases equally hard to
| understand. How can you be so sure that the reason you find
| it easier to understand is not simply because you know rust
| better than C or C++?
| sAbakumoff wrote:
| That is inspiring. Thanks a lot!
|
| Perhaps the problem was that I was overconfident and
| immediately started with pretty advanced Rust - writing a web
| assembly that performs big data analysis using the "polars"
| library.
| jeroenhd wrote:
| I consider myself reasonably competent in Rust but the
| webassembly stuff always trips me up, and so do most cross-
| language tools. You might've just had an unfortunate
| experience. I'm not sure if I'd pick Rust for WASM unless
| you already know it, but I suppose it's a good reason to
| learn the language.
|
| The problem with Rust is that to write good Rust, you need
| to accept that you can't use the same approach to solve a
| problem you could use in other languages. The borrow
| checker and its implications aren't necessarily difficult,
| but they're different.
|
| I'd compare the experience to someone who started in Python
| learning Haskell for the first time: it can take weeks or
| more before one can truly understand a monad. If you start
| with basic functional programming you can be quite
| productive before you need such code structures, but when
| you open a code base where monads are mixed liberally, your
| head will be spinning.
|
| Rust may be a lot closer to traditional imperative
| languages, but the implications of the safety mechanisms in
| the language are something you can't just skip over. It'll
| take time and practice to write code that the Rust compiler
| likes.
|
| If you're getting started, I'd recommend writing some toy
| programs for the CLI instead. I also recommend using
| "clippy" to warn you of code smells (it'll suggest
| improvements when it can!) and to get links to specific
| problems you might not be aware of. As for the IDE, I
| recommend the rust-analyzer plugin over the native "Rust"
| plugins found in most IDEs, because the basic ones fail to
| do proper macro expansion and leave you guessing on how to
| use libraries.
| sAbakumoff wrote:
| >> I'm not sure if I'd pick Rust for WASM
|
| What would be an alternative if I plan to develop a
| high-performance data analysis tool for WASM (similar to
| Perspective[0])? I looked at the list of supported
| languages[1] and Rust seems to be a good choice.
|
| [0] - https://github.com/finos/perspective
|
| [1] - https://github.com/appcypher/awesome-wasm-langs
| jeroenhd wrote:
| I can't say I'm too experienced in writing code for WASM,
| but I have to say I was pleasantly surprised when I
| experimented with Go.
|
| It has to be said that I have many (mostly subjective)
| problems with the Go language and the ecosystem, but with
| the help of GoLand I was productive in minutes. The layer
| for exchanging arguments between the browser and the
| "native" code is a bit weird, but once you get past
| that, it's easy to get going.
|
| The Rust problem with WASM is more about learning Rust
| well and picking the right libraries (many of them have
| dependencies that don't work well in the browser!).
| Setting up tools like cargo to compile usable WASM files
| also takes a little practice, but that's at most an
| afternoon of messing around before you should be
| reasonably comfortable with it. In my opinion, the main
| improvements Rust brings to the table are the (memory)
| security features and the fearless multithreading, but
| neither of them are of much use within the WASM runtime.
| The borrow checker will still help you write correct
| code, but it can be an unnecessary pain in the ass when
| it doesn't need to be. Rust is a great systems
| programming language, but I'm not so sure about it
| becoming the de-facto WASM standard.
|
| Of course, if you already know Rust, or know a library
| that would be super useful to you, it's great that Rust
| can Just Work (TM) with the right setup. First-party
| tooling support is pretty great for a language to have!
|
| If I had to choose, I think I'd pick a language that I'm
| comfortable with (C#, Kotlin, Java) and has the necessary
| libraries easily available, and see if the tooling works
| well for my use cases.
|
| I'm also watching Zig evolve with interest; it's not
| quite there yet, but its integration with C libraries
| and some of its more modern language features are very
| promising. WASM code doesn't need many of the
| complexities modern languages bring, but older languages
| like C can lead to dangerous programming paradigms, so I
| think a mix between the two can produce clean, performant
| and fully-featured code. I wouldn't recommend it for
| production use yet, though, as the language is still in
| constant development with breaking changes between point
| releases!
| pjmlp wrote:
| To be honest, I would even consider C for WASM despite my
| usual rants, after all with the security sales pitch for
| Webassembly it shouldn't matter after all.
| jeroenhd wrote:
| Agreed, without the risk of memory corruption and the
| associated security risks, C can be used without too much
| hesitation (as long as your complicated pointer code
| doesn't break your own data structures and mess up your
| program, of course!).
|
| I think it would be a challenge to compile _all_
| dependencies for a fully featured C program into WASM,
| but if you pick your libraries well, C could be an
| excellent language for speeding up complex calculations
| in Javascript.
| jeroenhd wrote:
| I much prefer Rust over C++, but I find that the problems of
| C++ have little to do with the language itself.
|
| I've been watching the videos by Andreas Kling on SerenityOS
| and the code is so clean that at first I wondered what
| programming language I was even looking at. I see the
| SerenityOS codebase as proof that if C++ programmers wanted
| to write modern, elegant, readable code, they definitely
| could.
|
| In practice, though, most C++ programs are full of legacy
| code or are written by people who don't necessarily know
| about or agree with modern ways to program C++. It's easy to
| write beautiful code if you also wrote the memory manager and
| standard library in modern C++, but most people don't have
| that luxury.
|
| By being created with a more modern standard library, Rust
| has an advantage over C++. There is no legacy code to remain
| compatible with and there is no real way to write "old-
| fashioned" Rust because the project hasn't existed for long
| enough. I've seen plenty of terrible, ugly Rust, most of it
| in my own personal projects. The strictness of the language
| and standard toolset helps, but it's far from a guarantee
| that enterprise Rust will be readable and clear.
| pjmlp wrote:
| Having acquired C++ into my toolbox in 1993, and thus lived
| through its adoption over C (which still owns several
| domains after 50 years), I am betting that Rust, 30 years
| into production, will suffer a similar fate.
| tialaramex wrote:
| Even though Rust's Editions don't solve _everything_ they
| make a huge difference and they also change the nature of
| the conversation around such evolution.
|
| I think the built-in array type is illustrative. In both
| languages (C++ and Rust) the initial 1.0 language offers
| a built in array type that is provided with built-in
| syntax and parsing but isn't as good as the user-made
| container types, so on day one the situation is OK, yeah,
| we do have arrays but you should likely avoid them.
|
| In C++ that just remains the case, C++ 20 has poor built-
| in arrays and a note saying we built another array type
| that you should actually use, it's in our standard
| library.
|
| Meanwhile in Rust they've been _improving_ their built-in
| arrays, using const generics, implementing IntoIterator
| for arrays, and so on. Rust 2021 in a compiler today has
| pretty nice built-in arrays that behave how you'd expect
| for a container, a sophisticated programmer might notice
| that Default isn't implemented for your array of 64
| integers, but such sharp corners are now few and far
| between and further refinements continue.
|
| The resulting conversation is more open to change, even
| though Editions can't actually do magic they can
| _conceal_ some pretty deep compiler magic like the hack
| to enable IntoIterator for arrays yet keep working Rust
| 2015 and Rust 2018 code that assumed into_iter() on an
| array will go via a reference. Being able to get to 90%
| of what people wanted with no magic meant the
| conversation about extra magic happened and it might
| otherwise not have.
|
| Editions also spur language innovations that make further
| edition work easier. Rust 1.0 did not have any way to
| talk about an identifier if it collided with a keyword,
| which of course means if you reserve a new keyword now
| you can't access identifiers which used the now-reserved
| name. Rust 2018 introduces raw identifiers to fix that,
| if you really insist on naming your function "try" you
| can write r#try despite the existence of the keyword try.
|
| I think these benefits are cumulative, and although Rust
| 2045 might have some cruft it will have a _lot_ less than
| C++ 23 let alone C++ 44.
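|
| Two of the features mentioned above are easy to show in a few lines: a
| const-generic function that accepts arrays of any length and iterates
| them by value, and a raw identifier for a name that collides with a
| reserved keyword. The function names are made up; the `for` loop form
| is used because it iterates by value regardless of edition.

```rust
// Const generics: one function for arrays of every length `N`.
fn total<const N: usize>(values: [u32; N]) -> u32 {
    let mut sum = 0;
    // By-value iteration over the array via its `IntoIterator` impl.
    for v in values {
        sum += v;
    }
    sum
}

// `try` is reserved since the 2018 edition; `r#try` still names it.
fn r#try() -> &'static str {
    "tried"
}
```

| Calling `total([1, 2, 3])` and `total([10u32; 64])` both work with the
| same definition, which is the kind of array ergonomics the comment
| describes.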
| jeroenhd wrote:
| Definitely. I think Rust's more restrictive nature will
| leave it a little better than C and C++, but as
| technology evolves, every language will eventually show
| its age.
| ccleve wrote:
| Why was this downvoted? Rust may be the "most loved" language,
| but after a few weeks with it, I don't love it. We've got to
| face the reality that it makes simple things way too
| complicated.
|
| In fairness, I have found the compiler errors to be extremely
| helpful. They often tell me exactly what to fix. But honestly,
| they shouldn't have to do that. The syntax should have been
| obvious from the beginning, as it is in most programming
| languages.
| spacechild1 wrote:
| I guess because it's off topic. TFA is about a specific Rust
| issue and a generic comment about "Rust is hard to learn"
| doesn't add anything to the conversation. It's like the
| generic "C++ is way too complicated" comment under every C++
| post. They just incite language flame wars and distract from
| the actually interesting stuff.
| staticassertion wrote:
| It's because, for some reason, the rust community is
| extremely, overly aggressive about downvoting. I think we
| can all just admit that - I've been using Rust since 2014
| and it's always been an issue and it needs to be called
| out.
| sAbakumoff wrote:
| Right, the HN comments are all about being on topic.
| bitwize wrote:
| Prior exposure to C appears to be negatively correlated with
| ease in learning Rust. This is why I say Rust is not a
| language for C programmers -- it's for their replacements.
| Young programmers with only a few years' experience in
| JavaScript or Ruby are now coding circles around the old C
| wizards, contributing bare-metal, bit-banging code that is
| guaranteed to be free of several classes of bugs those C
| wizards are still struggling with.
| kaashif wrote:
| That's surprising. I would've thought most C programmers
| have at least some experience writing C++, and C++
| programmers (and C programmers too, but I don't know anyone
| who programs primarily in C so I can't comment) already do
| many of the things Rust does for you. Like thinking about
| lifetimes, const by default, moving instead of copying,
| avoiding raw pointers like the plague, etc.
|
| Someone who has never thought about that stuff must surely
| find it harder to appreciate Rust.
| creata wrote:
| What syntax "should have been obvious from the beginning"? I
| always felt like most things in Rust are around as simple as
| they could be, for the things the language is trying to do.
| howinteresting wrote:
| Rust also makes complicated things really simple, which is
| why it's been the most loved for many years in a row.
| eminence32 wrote:
| I suspect you might start to receive a bunch of replies from
| people with the opposite experience (replies from people who
| find the syntax easy to read, and the compiler errors clear and
| understandable). I would like to attempt to preempt that by
| noting that it's totally reasonable for two people to feel
| drastically different things about a programming language.
| Neither view is more correct than the other.
|
| That said, I am curious why different people have these
| different feelings. One aspect is likely rooted in the fact all
| of our brains are different. But I also wonder if first
| impressions play a big role here. A good example of a cryptic
| rust error is the `expected type Foo, but found type Foo` error
| message which is very inscrutable, especially to new users.
| There are also some lifetime errors that can be hard to
| understand.
|
| I wonder if someone encounters these types of messages very
| early on in their learning experiences, the unpleasantness of
| having to decipher them colors the rest of their learning
| experiences.
| kaashif wrote:
| > A good example of a cryptic rust error is the `expected
| type Foo, but found type Foo` error message which is very
| inscrutable, especially to new users.
|
| Does Rust actually give an error like `expected type Foo, but
| found type Foo`, as in both types are the same in the error?
| I don't think I've seen that before, but I don't write much
| Rust.
|
| If both types are the same, what does the error mean?
| masonium wrote:
| I've seen this error a lot when working with two different
| versions of the same library. Specifically, you can
| directly include version A, but a different library depends
| on a version B, with some of that exposed in the public
| API.
| staticassertion wrote:
| I think the compiler will now give you a hint that they
| may be from different versions. If not, next time you see
| it you should open up a bug in rustc.
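|
| The situation described above can be imitated inside one file, as a
| rough sketch: two modules stand in for two versions of the same crate,
| each defining its own `Foo`. The names match, but the compiler treats
| them as unrelated types. `v1`, `v2` and `takes_v1` are hypothetical.

```rust
use std::any::TypeId;

// Stand-ins for two versions of the same library.
mod v1 {
    pub struct Foo;
}
mod v2 {
    pub struct Foo;
}

// An API built against the first version.
fn takes_v1(_foo: v1::Foo) -> &'static str {
    "accepted v1::Foo"
}

// Same name, different type: the type IDs do not match.
fn same_type() -> bool {
    TypeId::of::<v1::Foo>() == TypeId::of::<v2::Foo>()
}

// `takes_v1(v2::Foo)` would fail to compile with a mismatched-types
// error even though both types are called `Foo`.
```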
|
| To anyone who gets a cryptic error message - that's a
| bug, report it.
| dragonwriter wrote:
| > That said, I am curious why different people have these
| different feelings. One aspect is likely rooted in the fact
| all of our brains are different. But I also wonder if first
| impressions play a big role here.
|
| Path dependency has a _big_ impact on what seems natural,
| intuitive, etc. Part of that is what you've done before, and
| part is first impressions, and part of it is your approach to
| learning (or the approach taken to teaching you) the subject
| at hand.
|
| I've approached Rust via different books and tutorials before
| and come up with the "it's awesome, but too hard" feeling and
| set it aside.
|
| Recently I've been trying _Hands-on Rust_ [0] and going off
| to the side from it and Rust is clicking pretty well. Not
| sure if the book is a better fit for me, if the past false
| starts have prepared the ground, or what specifically
| changed.
|
| [0] https://pragprog.com/titles/hwrust/hands-on-rust/
| littlestymaar wrote:
| > Path dependency has big impact on what seems natural,
| intuitive, etc
|
| So much this. One example I hit a lot is when people keep
| saying "async/await is too hard, green threads are much
| more intuitive": coming from a JavaScript background I feel
| async/await and all the future combinators much more
| intuitive than dealing with threads and channels. Before
| async rust was stabilized I had to learn how to use threads
| and I was always frustrated how clunky it felt, and now
| that Rust has async/await I use it for everything IO
| related, because to me it's just much more familiar.
| Subsentient wrote:
| The big issue I had with learning Rust is not the simple
| stuff -- I could write safe Rust fairly quickly. The problem
| I continue to have is that many internal behaviors of the
| language are totally undocumented and inexplicable. For
| example, implicit reborrows. They happen everywhere, but it's
| still just a pile of unicorn farts as far as documentation.
| The idea of a language with _no_ documentation for a given
| syntax is horrifying. Rust does very well in documentation
| for beginners, but if you actually want to understand the
| language, if you want to know what's really happening,
| you're usually fucked.
| SloopJon wrote:
| Despite being contemporaries, both nominally for systems
| programming, Rust and Go are very different, starting with
| memory management. If you're used to writing (correct) C++
| programs, particularly those that exploit move semantics in
| C++11 and later, then you can appreciate what Rust is doing
| for you, weird syntax and all.
|
| On the other hand, if you're fine with a garbage collector,
| which most people are most of the time, then Go is going to
| feel more natural. For some people, Go is more comparable to
| Python than Rust, because of this one big difference.
| Tozen wrote:
| I believe that different programming languages resonate better
| with some more than others. Often it's a matter of preferences,
| comfort, perceived usability, and previous experiences.
|
| So, it's quite possible for various people to feel more
| comfortable using Golang than Rust, and the opposite can be
| true as well, where using Rust is preferred over Golang. At the
| end of the day, it will usually come down to individual or
| corporate priorities.
|
| Another of the newer languages that fall into a similar usage
| space would be Vlang (https://github.com/vlang/v). It is
| debatably easier to learn and use, in the context of languages
| that more easily interact with C.
| scrubs wrote:
| Agree. I don't need to get 10 Phds figuring out Rust. At this
| point in my career distributed algos/systems are much more
| interesting to understand and implement. And writing those in a
| way that are bindable to multiple clients means staying close
| to C/Zig/C++.
|
| Over the years the following C++ aspects have and continue to
| drive me nuts. On a scale of (Sarcasm coming next) 10:
|
| * (7/10) build time, and dependency management with the 81
| million tools, formats, approaches to deal with this.
|
| * (1/10) 21 page error sets resulting from single word typos in
| templated code: even Egyptologists ask how do we put up with
| that? At least we have nice rocks to look at. Yah, they have
| rocks. How come we don't have rocks?
|
| * (1/10) Code decl duplication between .h/.cpp
|
| * (1/10) Long, pointless C++ errors. If you have an orthodox
| background the guilt trip C++ lays on for mismatched function
| calls is legion; you really feel it. It tells you such-n-such
| function could not be matched ... but look at the 42 million
| function calls you _could have made_ ... _should have made_ ...
| _and it's killing me you didn't make; you seeing the effort
| I'm putting into this? It's killing me_ ... I just need the cop
| to say it straight: you screwed up. Ticket. See in court. Have
| a nice day. I prefer one liners.
|
| I'm in no rush to waste time on Rust. I'd rather stick to GO
| where I can, Zig when I can. At the office we're 65% C++, 25%
| C. Stack overflows, memory corruptions do occur periodically;
| I've sorted several of those out. But the new code is C++ which
| makes heavy use of STL. Much better. There is growing adoption
| in GO and Rust, but the vast amount of C/C++ code means the
| apple will not fall too far from the tree.
| staticassertion wrote:
| Rust initially was much harder to learn. The compiler was
| considerably more strict - rejecting a lot of programs that
| were entirely valid. That earned rust a very negative "hard
| language" reputation early on.
|
| That hasn't been the case for years. The 2018 edition
| officially stabilized Non-Lexical Lifetimes, allowing tons of
| valid programs to work. There have been a lot of other
| improvements since then to address papercuts.
|
| At this point Rust is a pretty easy language to learn imo.
| jcranmer wrote:
| Here's another perspective on why things are the way they are:
|
| One of the central philosophies of Rust is that it should not be
| possible to execute undefined behavior using only safe code.
| Rust's underlying core semantics end up being _very_ similar to
| C's semantics, at least in terms of where undefined behavior can
| arise, and we can imagine Rust's references as being wrappers
| around the underlying pointer type that have extra requirements
| to ensure that they can be safely dereferenced in safe code
| without _ever_ causing UB.
|
| So consider a simple pointer dereference in C (*p)... how could
| that cause UB? Well, the obvious ones are that the pointer could
| be out-of-bounds or pointing to an expired memory location. So
| references (& and &mut) must point to a live memory location,
| even in unsafe code. Also pretty obviously, the pointer would be
| UB were it unaligned, so a Rust reference must be properly
| aligned.
|
| Another one that should be familiar from the C context is that
| the memory location must be initialized. So the & reference in
| Rust means that the memory location must also be initialized...
| and since &mut _implies_ &, so must &mut. This part is probably
| genuinely surprising, since it's a rule that _doesn't_ apply to
| C.
|
| The most surprising rule that applies here as well is that the
| memory location cannot be a trap representation (to use C's
| terminology). Yes--C has the same requirement here, but most
| people probably don't come across a platform that has trap
| representations in C. The reason why std::mem::uninitialized was
| deprecated in favor of MaybeUninit was that Rust has a type all
| of whose representations are trap representations (that's the !
| type).
|
| In short, the author is discovering two related issues here.
| First, the design of Rust is to push all of the burden of
| undefined behavior into unsafe code blocks, and the downside of
| that is that most programmers probably aren't sufficiently
| cognizant of UB rules to do that well. Rust also pushes the UB of
| pointers to reference construction, whereas C makes most of its
| UB happen only on pointer dereference (constructing unaligned
| pointers being the exception).
|
| The second issue is that Rust's syntax is geared to making _safe_
| Rust ergonomic, not unsafe Rust. This means that using the
| "usual" syntax rules in unsafe Rust blocks is more often than not
| UB, even when you're trying to avoid the inherent UB construction
| patterns. Struct projection (given a pointer/reference to a
| struct, get a pointer/reference to a field) is especially
| implicated here.
|
| These combine when you deal with uninitialized memory references.
| This is a reasonably common pattern, but designing an always-safe
| abstraction for uninitialized memory is challenging. And Rust did
| screw this up, and the stability guidelines mean the bad
| implementations are baked in for good (see, e.g., std::io::Read).
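|
| A minimal sketch of the point above that the validity rules attach
| to references rather than raw pointers. The `Packet` struct and
| `read_value` helper are hypothetical, chosen only to produce a
| misaligned field; they are not from the article.
|

```rust
use std::ptr::addr_of;

// Hypothetical packed layout: `value` sits at offset 1, so it is not
// 4-byte aligned.
#[repr(C, packed)]
struct Packet {
    tag: u8,
    value: u32,
}

fn read_value(p: &Packet) -> u32 {
    // Writing `&p.value` here is rejected by current compilers, because
    // a `&u32` must always be aligned. A raw pointer has no such
    // requirement until it is actually read through.
    let field: *const u32 = addr_of!(p.value);
    // SAFETY: `field` points into a live, initialized `Packet`, and
    // `read_unaligned` does not assume any alignment.
    unsafe { field.read_unaligned() }
}
```

| The raw pointer is created without ever materializing a reference,
| which is the distinction the comment draws between Rust and C.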
| nyanpasu64 wrote:
| If Rust's syntax is geared to making safe Rust ergonomic, they
| should start by not requiring method calls to read or write
| from a &Cell<T> (because in C++ you don't need method calls to
| read or write from a T&).
| gpm wrote:
| Needing to use a `Cell` in rust is incredibly rare in my
| experience. It's very far from clear to me that the increased
| complexity coming from allowing overriding the meaning of
| `=`, and worse of
| `variable_name_that_happens_to_contain_a_certain_type` would
| be remotely worth it.
| pjmlp wrote:
| Unless GUI code comes up with Rc<RefCell<>> all over the
| place.
| gpm wrote:
| RefCell is distinct from Cell, more common, but much
| worse tradeoffs for implicit access, because accessing it
| can crash your program.
| nyanpasu64 wrote:
| I want Cell to be more ergonomic because it addresses the
| same Rust weak points as RefCell but without runtime
| overhead and panicking. I think RefCell<struct> should be
| an infrequently used type that you reach for when you
| specifically want to guard against reentrancy, and
| struct{Cell} should be the primary replacement for shared
| access in other languages. The latter is sound but
| introduces a lot of boilerplate syntax.
| notpopcorn wrote:
| > So we use a &'static str here instead of a C string so there
| are some changes to the C code.
|
| > [..]
|
| > So why does this type not support zero initialization? What do
| we have to change? Can zeroed not be used at all? Some of you
| might think that the answer is #[repr(C)] on the struct to force
| a C layout but that won't solve the problem.
|
| The type of the first field was switched to a type (&str) that
| specifically promises it is never null. If the original type (a
| pointer) was kept, or an Option<&str> was used, mem::zeroed would've
| worked fine.
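|
| A small sketch of the "keep the original pointer type" variant.
| The field names follow the article's `Role` struct, but this
| definition with a `*const c_char` is an assumption made here for
| illustration.
|

```rust
use std::mem;
use std::os::raw::c_char;

// Assumed C-style layout: the name is a nullable raw pointer rather
// than a `&'static str`.
#[repr(C)]
struct Role {
    name: *const c_char,
    disabled: bool,
    flag: u32,
}

fn zeroed_role() -> Role {
    // SAFETY: all-zero bytes are a valid value for every field:
    // a null raw pointer, `false`, and `0`.
    unsafe { mem::zeroed() }
}
```

| With `name: &'static str` the same call is undefined behaviour,
| because a reference promises to be non-null.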
| eddyb wrote:
| > For instance `(*role).name` creates a `&mut &'static str`
| behind the scenes which is illegal, even if we can't observe it
| because the memory where it points to is not initialized.
|
| Where is this coming from? It's literally not true. The MIR for
| this has:
|
|     ((*_3).0: &str) = const "basic";
|     ((*_3).2: u32) = const 1_u32;
|     ((*_3).1: bool) = const false;
|
| So it's only going to do a raw offset and then assign to it,
| which is identical to `*ptr::addr_of_mut!((*role).field) =
| value`.
|
| Sadly there's no way to tell miri to consider `&mut T` valid only
| if `T` is valid (that choice is not settled yet, AFAIK, at the
| language design level), in order to demonstrate the difference
| (https://github.com/rust-lang/miri/issues/1638).
|
| The other claim, "dereferencing is illegal", is more likely, but
| unlike popular misconception, "dereference" is a _syntactic_
| concept, that turns a (pointer/reference) "value" into a
| "place".
|
| There's no "operation" of "dereference" to attach _dynamic
| semantics_ to. After all, `ptr::addr_of_mut!(*p).write(x)` has to
| remain as valid as `p.write(x)`, and it does literally contain a
| "dereference" operation (and so do your field projections).
|
| So it's still inaccurate. I _believe_ what you want is to say
| that in `place = value` the destination `place` has to hold a
| valid value, as if we were doing `mem::replace(&mut place,
| value)`. This is indeed true for types that have destructors in
| them, since those would need to run (which in itself is why
| `write` on pointers exists - it long existed before any of the
| newer ideas about "indirect validity" in recent years).
|
| However, you have `Copy` types there, and those are _definitely_
| not different from `<*mut T>::write` to assign to, today. I
| don't see us having to change that, but I'm also not seeing any
| references to where these ideas are coming from.
|
| > I'm pretty sure we can depend on things being aligned
|
| What do you mean "pretty sure"? Of course you can, otherwise it
| would be UB to allow safe references to those fields! Anything
| else _would be unsound_. In fact, this goes hand in hand with the
| main significant omission of this post: this is _not_ how you're
| supposed to use `MaybeUninit`.
|
| All of this raw pointer stuff is a distraction from the fact that
| what you want is `&mut MaybeUninit<FieldType>`. Then all of the
| things about reference validity are _necessarily_ true, and you
| can _safely_ initialize the value. The only `unsafe` operation in
| this entire blog post, that isn't unnecessarily added in, is
| `assume_init`.
|
| What the author doesn't mention is that Rust fails to let you
| convert between `&mut MaybeUninit<Struct>` and some hypothetical
| `&mut StructBut<replace Field with MaybeUninit<Field>>` because
| the language isn't powerful enough to do it automatically. This
| was one of the saddest things about `MaybeUninit` (and we tried
| to rectify it for at least arrays).
|
| This is where I was going to link to a custom derive that someone
| has written to generate that kind of transform manually (with the
| necessary check for safe field access wrt alignment). To my
| shock, I can't find one. Did I see one and did it have a funny
| name? (the one thing I did find was a macro crate but unlike a
| derive those have a harder time checking everything so I had to
| report https://github.com/youngspe/project-uninit/issues/1)
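|
| A hand-written sketch of the "struct with every field wrapped in
| `MaybeUninit`" idea described above. `RoleUninit`, `init_name` and
| `build_role` are hypothetical names; nothing here is generated by
| a derive, and the `Role` layout is assumed from the article.
|

```rust
use std::mem::MaybeUninit;

struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

// Hypothetical mirror of `Role`, written out by hand because the
// language cannot derive it from `MaybeUninit<Role>`.
struct RoleUninit {
    name: MaybeUninit<&'static str>,
    disabled: MaybeUninit<bool>,
    flag: MaybeUninit<u32>,
}

// A `&mut MaybeUninit<T>` is an ordinary, valid reference, so the
// field can be initialized from safe code.
fn init_name(slot: &mut MaybeUninit<&'static str>) {
    slot.write("basic");
}

fn build_role() -> Role {
    let mut parts = RoleUninit {
        name: MaybeUninit::uninit(),
        disabled: MaybeUninit::uninit(),
        flag: MaybeUninit::uninit(),
    };
    init_name(&mut parts.name);
    parts.disabled.write(false);
    parts.flag.write(1);
    // SAFETY: all three fields were written just above.
    unsafe {
        Role {
            name: parts.name.assume_init(),
            disabled: parts.disabled.assume_init(),
            flag: parts.flag.assume_init(),
        }
    }
}
```

| The only `unsafe` left is the `assume_init` calls, which matches
| the claim in the comment.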
| staticassertion wrote:
| I think that the premise here is correct - writing unsafe Rust
| _is_ too hard. There are lots of footguns.
|
| This isn't a very good motivating example but I suppose it does
| the job of showing the various hoops one has to jump through when
| using unsafe.
|
| I think right now the approach is to make unsafe "safe" (ie
| std::mem::uninitialized -> MaybeUninit) at the cost of complexity,
| and eventually to build out improved helpers and abstractions.
| Obviously this is still ongoing.
|
| But also, just don't write unsafe? It's very easy to avoid.
| smoldesu wrote:
| > But also, just don't write unsafe? It's very easy to avoid.
|
| Yeah, there's a weird subset of developers who insist on mixing
| unsafe and safe code even when they're presented equally
| performant, safe alternatives. One such example was the Actix
| framework, where the lead dev refused to merge any fixes for
| his unsafe code. Eventually, so many merge requests showed up
| to fix his broken code that he just gave up the project
| altogether and let the community take over.
|
| If you want to write unsafe code, I think that's perfectly
| fine, but Rust is not going to cater to your desires. C and C++
| will give you the tools you need with the conveniences you
| want.
| DSMan195276 wrote:
| > If you want to write unsafe code, I think that's perfectly
| fine, but Rust is not going to cater to your desires. C and
| C++ will give you the tools you need with the conveniences
| you want.
|
| This is a bit of an odd suggestion to me, honestly, you're
| basically saying Rust is not intended to be a C/C++
| replacement. There is definitely a reality that not
| everything can be written in completely safe Rust, and a lot
| of the places that Rust could be the most beneficial (Ex.
| Linux Kernel) are going to require using it.
| ivraatiems wrote:
| Calling Rust "a C/C++ replacement" is, I think, slightly
| underselling what it's supposed to do. As I understand it,
| the goal of Rust is more "take all these things that were
| historically hard to do safely and make them easier to do
| safely" than "replace C/C++ 1 for 1".
| DSMan195276 wrote:
| Is it actually achieving that goal though if the advice
| is "use C if you have to do `unsafe` stuff"? That's
| really my point. IMO the language was always intended to
| be able to be used in any situation C is used (with maybe
| a few exceptions), which includes situations where
| `unsafe` is necessary.
| smoldesu wrote:
| For starters, the example I gave (Actix framework) had no
| such situations where unsafe code was imperative to
| having it function. _That's_ where the drama originated
| from there.
|
| Secondly, I don't think the goal of Rust is to rewrite
| the entire kernel (or replace C/C++ for that matter).
| Rust is optimized around writing ergonomic, safe code.
| Its compiler is designed to produce high-performance
| binaries with minimal UB. Sure, they _could_ designate a
| team for making the unsafe Rust experience better, but
| why bother? C++ already does that and does it well. The
| idea of "oxidation", or slowly replacing safety-critical
| portions of a program in Rust, doesn't necessitate fully
| replacing any of the languages that came before it, and
| while there _are_ situations where unsafe code is
| undeniably necessary, I don't really think there's much
| of a reason to replace C or C++ for those uses. Any of
| the replacements you could write in Rust wouldn't be
| ergonomic, easy to read or well optimized.
| queuebert wrote:
| When I want to write unsafe Rust code, I usually find it easier
| just to use C. The whole purpose for Rust in my work is to be
| safe.
| mlindner wrote:
| This author doesn't even seem to know C properly so it's hard to
| accept their reasoning.
|
| This in Rust:
|
|     let mut role: Role = mem::zeroed();
|
| Is not the same as this in C:
|
|     struct role r;
|
| C does not zero initialize.
| andreareina wrote:
| I don't know rust, but why isn't the answer, don't try to do what
| you'd do in C like construct uninitialized structs?
| TheCycoONE wrote:
| They are pretty useful for a number of data structures, and
| Rust uses them heavily in the standard library.
|
| I'm not aware of as many uses of field by field initialization
| of a struct but there is an example similar to this blog in the
| docs[1] (without the alignment considerations.)
|
| That said my read has been the complexity is accidental as a
| result of language decisions to improve safe rust. MaybeUninit
| was only defined 3 years ago when it was discovered that
| mem::uninitialized/zeroed resulted in undefined behavior when
| used with bools. [2][3]
|
| [1] https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#ini...
|
| [2] https://github.com/rust-lang/rust/issues/53491
|
| [3] https://doc.rust-lang.org/std/mem/fn.uninitialized.html
| joeatwork wrote:
| Interoperating with C libraries can often force you into this
| sort of thing.
| dathinab wrote:
| It's the correct answer.
|
| You still sometimes need it like:
|
| - when highly optimizing some algorithms
|
| - doing FFI
|
| So places you find it include some async runtimes, some
| algorithm libraries, the standard library.
|
| Still often times you initialize it by fully writing it, not by
| writing fields.
|
| Anyway rules are simple:
|
| 1. use `ptr::write` instead of `*ptr =`
|
| 2. use `addr_of_mut!(ptr.x)` instead of `&*ptr.x` to get field
| pointers
|
| 3. uhm, `packed` structs are a mess, if you have some you need
| to take a lot of additional care, this is not limited to rust
| but also true for C/C++
|
| Also you do not need `#[repr(C)]`, while the rust-specification
| is pending and as such `repr(Rust)` is pretty much undefined
| you still can expect fields to be aligned (as else you would
| have unaligned-`&` which is quite a problem and would likely
| cause a bunch of breakage through the eco-system).
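|
| The first two rules above can be sketched roughly as follows.
| The `Role` layout is assumed from the article and `make_role` is
| a hypothetical helper. Note that with a raw pointer the field
| path is spelled `(*ptr).name` inside `addr_of_mut!`.
|

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Role {
    name: &'static str,
    disabled: bool,
    flag: u32,
}

fn make_role() -> Role {
    let mut slot = MaybeUninit::<Role>::uninit();
    let ptr: *mut Role = slot.as_mut_ptr();
    unsafe {
        // Rule 2: `addr_of_mut!` yields a raw field pointer without
        // creating an intermediate reference to uninitialized memory.
        // Rule 1: `write` stores the value without reading or dropping
        // whatever bytes were there before.
        addr_of_mut!((*ptr).name).write("basic");
        addr_of_mut!((*ptr).disabled).write(false);
        addr_of_mut!((*ptr).flag).write(1);
        // SAFETY: every field has been written.
        slot.assume_init()
    }
}
```

| This mirrors the field-by-field example in the `MaybeUninit`
| documentation; a packed struct would additionally need
| `write_unaligned`.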
| kaba0 wrote:
| There may of course be rare cases where having it
| uninitialized helps, but I would wager that even then the
| compiler could optimize it more often than not.
| dathinab wrote:
| It's not that simple.
|
| Like the unused but allocated space in a Vec is basically a
| `[MaybeUninit<T>]`.
|
| Or in async runtimes you often have an unsized type with a
| future trait object inlined (though unsized types anyway
| need a bit more love ;=) ).
|
| Or some C FFI patterns.
|
| But yes I would say in pure rust the use cases for
| `MaybeUninit` are rare, and the cases where you need
| pointers to fields even rarer.
|
| Though while rare in comparison to the amount of code not
| needing it, still needed enough to be somewhere used (e.g.
| in a dependency) in many projects even if ignoring std.
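|
| A short sketch of the `Vec` case mentioned above, using the
| standard `Vec::spare_capacity_mut` method, which exposes the
| unused capacity as `&mut [MaybeUninit<T>]`. The `first_n` helper
| is a made-up name for illustration.
|

```rust
use std::mem::MaybeUninit;

fn first_n(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    // The allocated-but-unused tail of the vector.
    let spare: &mut [MaybeUninit<u32>] = v.spare_capacity_mut();
    // The capacity may be larger than requested, so stop after `n`.
    for (i, slot) in spare.iter_mut().take(n).enumerate() {
        slot.write(i as u32);
    }
    // SAFETY: capacity is at least `n` and the first `n` slots were
    // initialized by the loop above.
    unsafe { v.set_len(n) };
    v
}
```

| Only the final `set_len` is unsafe; the writes themselves go
| through ordinary references to `MaybeUninit<u32>`.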
| dathinab wrote:
| The scary thing is:
|
| Handling uninitialized memory is hard in C++ (and C), too.
|
| You just don't notice and accidentally do it slightly wrong
| (mainly in C++, in C it's harder to mess up).
| queuebert wrote:
| Exactly this. And also avoiding initializing memory with zeroes
| or something is often premature optimization. Very few programs
| are performant enough to notice the difference.
| dathinab wrote:
| In my experience the most common use case for zero-memset is
| not optimizations but to reduce fallout if you happen to forget
| to initialize a field..., like a newly added field.
| jcranmer wrote:
| In C, when you declare 'struct role r' (not as a static
| variable), it is not zeroed. The immediate Rust equivalent would
| be to use std::mem::uninitialized(), not std::mem::zeroed.
| wyldfire wrote:
| But an idiom from C which might inspire some unsafe rust is to
| memset a struct to zero after declaration in order to guarantee
| that all fields are initialized before anything would access
| them.
| MaxBarraclough wrote:
| If I understand correctly, reading [0], in C99 (or later) you
| can do that with:
|
|     struct MyStruct foo = {};
|
| This has the effect of initializing all members to zero (or,
| more precisely, the value which is _the same as for objects
| that have static storage duration_ [0]).
|
| [0] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
|
| See also _Stop Memsetting Structures_ :
| https://news.ycombinator.com/item?id=19766930
| Lvl999Noob wrote:
| Is initializing a NonZero field to 0 really initializing it?
| dathinab wrote:
| no not at all
|
| IMHO requiring all types in C to have a valid "all zero"
| variant so that this pattern can be used isn't great
| either, somewhat of an anti-pattern even. But an anti-
| pattern needed for ergonomic C.
|
| The post is a good example for trying to program "like in a
| different language" just because some tools somewhat allow
| it. Like in this case "programming rust like it's C".
|
| And if you do rust FFI and have a lot of C experience it is
| tempting.
| staticassertion wrote:
| Sort of. It's definitely better than _not_ initializing it.
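|
| To make the NonZero question concrete, here is a small sketch
| using `std::num::NonZeroU32`. The two helper functions are
| hypothetical names; the zeroed-`Option` behaviour is the layout
| guarantee documented for `Option` in the standard library.
|

```rust
use std::mem;
use std::num::NonZeroU32;

// The safe constructor refuses zero outright.
fn zero_is_rejected() -> bool {
    NonZeroU32::new(0).is_none()
}

// Zeroing an `Option<NonZeroU32>` is fine: the all-zero bit pattern is
// exactly how `None` is stored.
fn zeroed_option_is_none() -> bool {
    // SAFETY: the standard library documents that all-zero bytes are a
    // valid `Option<NonZeroU32>` and represent `None`.
    let v: Option<NonZeroU32> = unsafe { mem::zeroed() };
    v.is_none()
}
```

| Zeroing a bare `NonZeroU32` field, by contrast, produces an
| invalid value, so it does not count as initializing it.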
___________________________________________________________________
(page generated 2022-01-30 23:02 UTC)