[HN Gopher] What Is Rust's Unsafe? (2019)
___________________________________________________________________
What Is Rust's Unsafe? (2019)
Author : luu
Score : 100 points
Date : 2022-04-10 17:36 UTC (5 hours ago)
(HTM) web link (nora.codes)
(TXT) w3m dump (nora.codes)
| verdagon wrote:
| Interestingly enough, unsafe is the root reason Rust couldn't add
| Vale's Seamless Concurrency feature [0], which is basically a way
| to add a "parallel" loop that can access any existing data
| without refactoring it or any other existing code.
|
| If Rust didn't have unsafe, it could have that feature. Instead,
| the Rust compiler assumes a lot of data is !Sync, such as
| anything that might indirectly contain a trait object (unless
| explicitly given + Sync) which might contain a Cell or RefCell,
| both of which are only possible because of unsafe.
|
| Without those, without unsafe, shared borrow references would be
| true immutable borrow references, and Sync would go away, and we
| could have Seamless Concurrency in Rust.
|
| I often wonder what else would emerge in an unsafe-less Rust!
|
| Still, given Rust's priorities (low-level development), the
| borrow checker's occasional need for workarounds, and the sheer
| usefulness of shared mutability, it was a wise decision for Rust
| to include unsafe.
|
| [0] https://verdagon.dev/blog/seamless-fearless-structured-concu...
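|
| A minimal sketch of the !Sync assumption mentioned above (the
| Widget trait and CellWidget type are invented for illustration):
|
|     use std::cell::Cell;
|
|     trait Widget {
|         fn poke(&self);
|     }
|
|     struct CellWidget {
|         count: Cell<u32>, // Cell<u32> is !Sync
|     }
|
|     impl Widget for CellWidget {
|         fn poke(&self) {
|             self.count.set(self.count.get() + 1);
|         }
|     }
|
|     // Only accepts data that can be shared across threads.
|     fn requires_sync<T: Sync + ?Sized>(_: &T) {}
|
|     fn main() {
|         let w: Box<dyn Widget> =
|             Box::new(CellWidget { count: Cell::new(0) });
|         w.poke();
|         requires_sync(&0u32); // fine: u32 is Sync
|         // requires_sync(&*w); // error: `dyn Widget` cannot be shared
|         //                     // between threads safely -- the compiler
|         //                     // must assume it could hide a Cell.
|     }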
| ______-_-______ wrote:
| I'm not sure I follow this. If Rust wanted the feature in that
| blog post, they could restrict it to only accessing data that
| is Sync. They wouldn't have to throw out the concept of Sync
| entirely, in fact cases like this are the reason it exists.
| Rust just chooses to leave this kind of feature up to libraries
| instead of building it into the language.
|
| And even without unsafe, you still couldn't assume all data is
| Sync. Counter-examples include references to data in thread-
| local storage, and most data used with ffi.
| zozbot234 wrote:
| > Without those, without unsafe, shared borrow references would
| be true immutable borrow
|
| Rust devs have thought about implementing "true immutable"
| before and found it to be problematic. It would come in quite
| handy for almost anything related to FP or referential
| transparency/purity, but these things turn out to be very hard
| to reconcile with the "systems" orientation of Rust. Perhaps
| the answer will reside in some expanded notion of "equality" of
| values and objects, which might allow for trivial variation
| while verifying that the code you write respects the same
| notion of "equality".
| celeritascelery wrote:
| Rust without unsafe would just be another obscure academic
| language like Haskell. The ability to bypass the type system
| when the programmer needs to is what makes Rust work.
| whateveracct wrote:
| Haskell also allows you to bypass the type system plenty
| tialaramex wrote:
| Crucially, unsafe is also about the _social contract_. Rust's
| compiler can't tell whether you wrote a safety rationale
| adequately explaining why this use of unsafe was appropriate,
| only other members of the community can decide that. Rust's
| compiler doesn't prefer an equally fast _safe_ way to do a thing
| over the unsafe way, but the community does. You could imagine a
| language with exactly the same technical features but a different
| community, where unsafe use is rife and fewer of the benefits
| accrue.
|
| One use of "unsafe" that was not mentioned by Nora but is
| important for the embedded community in particular is the use of
| "unsafe" to flag things which from the point of view of Rust
| itself are fine, but are dangerous enough to be worth steering
| a human programmer away from unless they know what
| they're doing. From Rust's point of view,
| "HellPortal::unchecked_open()" is safe, it's thread-safe, it's
| memory safe... but it will summon demons that could destroy
| mankind if the warding field isn't up, so, that might need an
| "unsafe" marker and we can write a _checked_ version which
| verifies that the warding is up and the stand-by wizard is
| available to close the portal before we actually open it, the
| checked one will be safe.
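|
| A sketch of what that split might look like (the warding/wizard
| helper functions are invented for the example):
|
|     pub struct HellPortal;
|
|     impl HellPortal {
|         /// # Safety
|         ///
|         /// The warding field must be up and a stand-by wizard must
|         /// be available to close the portal again.
|         pub unsafe fn unchecked_open(&mut self) {
|             // memory-safe, thread-safe, demon-unsafe work...
|         }
|
|         /// Checked wrapper: verifies the human invariants first.
|         pub fn open(&mut self) -> Result<(), &'static str> {
|             if !warding_field_is_up() {
|                 return Err("warding field is down");
|             }
|             if !standby_wizard_available() {
|                 return Err("no stand-by wizard");
|             }
|             // SAFETY: both preconditions were just checked.
|             unsafe { self.unchecked_open() };
|             Ok(())
|         }
|     }
|
|     fn warding_field_is_up() -> bool { true }
|     fn standby_wizard_available() -> bool { true }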
| OtomotO wrote:
| When I read posts like yours, I wish unsafe had a different
| name like "human invariant" or whatever.
|
| Something that would make it harder to water down the meaning
| of the clearly defined unsafe keyword to suddenly mean
| something else.
|
| Using the unsafe keyword to mark a function as "potentially
| dangerous" is just wrong.
|
| Just prefix your functions with something like
| "dangerous_call", but don't misuse unsafe!
| pjmlp wrote:
| This is a complete misuse of unsafe as a language concept in
| high-integrity computing.
| vitno wrote:
| This person isn't wrong. A lot of serious Rust users don't
| agree with what the GP is suggesting. `unsafe` has an
| explicit meaning: the user must uphold some invariant or
| check something about the environment, otherwise it is memory
| unsafe.
|
| I have several times in code review prevented people from
| marking safe interfaces as "unsafe" because they are "special
| and concerning"; overloading the usage of unsafe is itself
| dangerous.
| zozbot234 wrote:
| True, but I think allowing potentially UB-invoking code to
| not use "unsafe" (e.g. because the use is in the context of
| FFI, so the unsafety is thought to be "obvious" and not
| worth marking as such) might be even less advisable. This
| makes it harder to ensure the "social rule" mentioned by
| GP, that every potential source of UB should be endowed with a
| "Safety" annotation describing the conditions for it to be
| safe.
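|
| The typical shape of that "Safety" annotation for an FFI call
| (the extern function here is hypothetical, declared only for the
| sketch):
|
|     use std::os::raw::c_int;
|
|     extern "C" {
|         // Hypothetical C API: UB if `channel` is out of range.
|         fn read_sensor(channel: c_int) -> c_int;
|     }
|
|     pub fn sensor_value(channel: u8) -> i32 {
|         // SAFETY: `channel` is a u8, so it always falls inside the
|         // 0..=255 range the (hypothetical) C API documents as valid.
|         unsafe { read_sensor(channel as c_int) as i32 }
|     }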
| ______-_-______ wrote:
| Your comment gave me an idea for a lint that might help
| prevent those mistakes. Right now rustc flags `unsafe {}`
| with an "unused_unsafe" warning. However it doesn't warn
| for `unsafe fn foo() {}`. Maybe it should.
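|
| Concretely, the asymmetry looks like this (function names
| invented):
|
|     pub fn safe_stuff() {
|         // rustc warns today: "unnecessary `unsafe` block"
|         // (lint: unused_unsafe)
|         unsafe {
|             let _x = 1 + 1;
|         }
|     }
|
|     // No warning today, even though the body performs no unsafe
|     // operations; the proposed lint would flag cases like this.
|     pub unsafe fn no_unsafe_ops_inside() {
|         let _y = 2 + 2;
|     }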
| gpm wrote:
| I think a lint as described would produce false positives,
| because `unsafe fn foo() { body that performs no unsafe
| operations }` can still be unsafe to call if it touches private
| fields whose invariants are relied on by unsafe operations
| inside other, safe-to-call functions on the same data
| structure. I expect you would end up with a reasonably high
| number of those.
|
| For an example, consider Vec::set_len in the standard
| library, which only contains safe code but lets you
| access uninitialized memory and reach beyond the length of
| your allocation by modifying the length field of the vector:
| https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#1264
|
| You might be able to fix this with a lint that looked at
| a bit more context though, `unsafe fn foo()` in a module
| (or even crate) with no actually unsafe operations is
| very likely wrong. Likewise `unsafe fn foo()` which
| performs no unsafe operations and only accesses fields,
| statics, functions, and methods that are public.
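|
| A stripped-down version of that Vec::set_len situation (TinyVec
| is invented; the real case is linked above):
|
|     pub struct TinyVec {
|         buf: [u8; 16],
|         len: usize, // private invariant: len <= 16
|     }
|
|     impl TinyVec {
|         pub fn get(&self, i: usize) -> u8 {
|             assert!(i < self.len);
|             // SAFETY: the invariant on `len` keeps `i` in bounds.
|             unsafe { *self.buf.get_unchecked(i) }
|         }
|
|         /// # Safety
|         ///
|         /// `new_len` must not exceed 16. The body is plain safe
|         /// code, but lying here breaks the invariant that `get`
|         /// relies on -- exactly the false positive the proposed
|         /// lint would hit.
|         pub unsafe fn set_len(&mut self, new_len: usize) {
|             self.len = new_len;
|         }
|     }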
| infogulch wrote:
| Real formal verification is clearly a step up from Rust's
| meaning of "safe", but I don't think it's wrong to try to add
| another rung to the verification ladder at a different
| height. Verification technologies have a lot of space to
| improve in the UX department; here Rust trades off some
| verification guarantees for a practical system that is still
| meaningful.
| wheelerof4te wrote:
| "From Rust's point of view, "HellPortal::unchecked_open()" is
| safe, it's thread-safe, it's memory safe... but it will summon
| demons that could destroy mankind if the warding field isn't
| up, so, that might need an "unsafe" marker and we can write a
| checked version which verifies that the warding is up and the
| stand-by wizard is available to close the portal before we
| actually open it, the checked one will be safe."
|
| _" The only thing they fear is you"_
|
| playing in the background.
| furyofantares wrote:
| Would you be able to provide a real example of that HellPortal
| thing? I'm not really following
| seba_dos1 wrote:
| I'm not convinced that it's a great use of Rust's `unsafe`,
| but since you want an example... dealing with voltage
| regulators maybe? Where an invalid value put into some
| register could fry your hardware? There's a ton of such cases
| in embedded.
| oconnor663 wrote:
| You could probably make the case that any function that
| might physically destroy your memory is memory unsafe :)
| mikepurvis wrote:
| Similar to the sibling, stuff where you're dealing with
| parallel state in hardware, like talking to a device over i2c
| or something, where you know certain things are _supposed_ to
| happen but you don't, like, know know.
| wheelerof4te wrote:
| Imagine the code screaming
|
| _" Rip and tear, until it's done!"_
| tialaramex wrote:
| I'm assuming "real example" means of such unsafe-means-
| actually-unsafe behaviour in embedded Rust, as opposed to a
| real example of summoning demons?
|
| For example, volatile_register is a crate for representing
| MMIO hardware registers. It will do the actual MMIO for you:
| tell it just once where your registers are in "memory" and
| whether they're read-write, read-only, or write-only, and it
| provides a nice Rust interface to the registers.
|
| https://docs.rs/volatile-register/0.2.1/volatile_register/st...
|
| The low-level stuff it's doing is inherently unsafe, but it
| is wrapping that. So when you call register.read() that's
| safe, and it will... read the register. However, even though
| it's a wrapper, it chooses to label the register.write() call
| as unsafe, reminding you that this is a hardware register and
| that's on you.
|
| In many cases you'd add a further wrapper. E.g. maybe there's
| a register controlling the clock frequency of another part,
| and you know that part malfunctions below 5kHz and is not
| warrantied above 60kHz. Your wrapper can take a value, check
| it's between 5 and 60 inclusive, then do the arithmetic and
| set the frequency register using that unsafe
| register.write() function. You would probably decide that
| your wrapper is now actually safe.
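|
| Roughly, in code -- the register type below is a hand-rolled
| stand-in rather than the real volatile_register API, and the
| divisor arithmetic is made up:
|
|     pub struct ClockReg {
|         addr: *mut u32, // MMIO address of the frequency register
|     }
|
|     impl ClockReg {
|         pub fn read(&self) -> u32 {
|             // SAFETY: `addr` points at a valid, mapped register.
|             unsafe { core::ptr::read_volatile(self.addr) }
|         }
|
|         /// # Safety
|         ///
|         /// The caller must pick a value the hardware tolerates.
|         pub unsafe fn write(&mut self, value: u32) {
|             core::ptr::write_volatile(self.addr, value);
|         }
|     }
|
|     pub struct ClockDriver {
|         reg: ClockReg,
|     }
|
|     impl ClockDriver {
|         /// Safe wrapper: reject anything outside 5..=60 kHz.
|         pub fn set_frequency_khz(&mut self, khz: u32) -> Result<(), ()> {
|             if !(5..=60).contains(&khz) {
|                 return Err(());
|             }
|             let divisor = 60_000 / khz; // made-up arithmetic
|             // SAFETY: `khz` was range-checked, so `divisor` is a
|             // value the part is documented to accept.
|             unsafe { self.reg.write(divisor) };
|             Ok(())
|         }
|     }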
| furyofantares wrote:
| > I'm assuming "real example" means of such unsafe-means-
| actually-unsafe behaviour in embedded Rust, as opposed to a
| real example of summoning demons?
|
| That was what I meant, thanks for the answer! Though if you
| have an example of the other thing I'd be open to that too
| zozbot234 wrote:
| The social contract is manageable precisely because the unsafe
| subset of typical Rust codebases is a tiny fraction of the
| code. This is also why I'm wary about expanding the use of
| `unsafe` beyond things that are actually UB. Something that's
| "merely" security sensitive or tricky should use Rust's
| existing features (modules, custom types etc.) to ensure that
| any use of the raw feature is flagged by the compiler, while
| sticking to actual Safe Rust and avoiding the `unsafe` keyword
| altogether.
| saghm wrote:
| I agree with this mindset a lot. One example I like of how
| to deal with exposing a "memory- and type-safe but still
| potentially dangerous" API is how rustls allows custom
| verification of certificates. By default, no such API
| exists, and server certificates will be verified by the
| client when connecting, with an error being returned if the
| validation fails. However, they expose an optional feature
| for the crate called "dangerous_configuration" which allows
| writing custom code that inspects a certificate and
| determines for itself whether or not the certificate is
| valid. This is useful because you often might want to test
| something locally or in a trusted environment with a self-
| signed certificate but not want to actually deploy code that
| would potentially allow an untrusted certificate to be
| accepted.
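|
| The shape of that opt-in pattern in plain safe Rust (a
| simplified mock, not the actual rustls types):
|
|     pub trait CertVerifier {
|         fn verify(&self, cert_der: &[u8]) -> bool;
|     }
|
|     pub struct ClientConfig {
|         verifier: Option<Box<dyn CertVerifier>>, // None = built-in checks
|     }
|
|     impl ClientConfig {
|         pub fn new() -> Self {
|             // None = use the built-in verification path.
|             ClientConfig { verifier: None }
|         }
|
|         /// The only route to the risky knob goes through a type
|         /// whose name makes the risk obvious at every call site.
|         pub fn dangerous(&mut self) -> DangerousClientConfig<'_> {
|             DangerousClientConfig { cfg: self }
|         }
|     }
|
|     pub struct DangerousClientConfig<'a> {
|         cfg: &'a mut ClientConfig,
|     }
|
|     impl<'a> DangerousClientConfig<'a> {
|         pub fn set_certificate_verifier(&mut self, v: Box<dyn CertVerifier>) {
|             self.cfg.verifier = Some(v);
|         }
|     }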
| EE84M3i wrote:
| As mentioned in the article, it's entirely possible to create a
| raw pointer to an object in safe Rust, you just can't do much
| with it. One thing you can do with it though is convert it to an
| integer and reveal a randomized base address, which isn't
| possible in some other languages without their "unsafe" features.
| Of course, this follows naturally from Rust's definition of what
| is safe, but I remember being kind of surprised by it when I
| first learned Rust and didn't understand those definitions yet.
|
| It would be pretty interesting to me if someone wrote a survey of
| what different languages consider to be "unsafe", including
| specific operations like this. For example, it looks like
| "sizeof" is relegated to the "unsafe" package in Go, which
| strikes me as strange.[1] I'd love to read a big comparison.
|
| [1]: https://pkg.go.dev/unsafe#Sizeof
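|
| To illustrate the pointer-to-integer point above, all of this is
| accepted with no `unsafe` block:
|
|     fn main() {
|         let x = 42u32;
|         // Creating a raw pointer is safe; only dereferencing it
|         // requires `unsafe`.
|         let p: *const u32 = &x;
|         // Casting it to an integer is safe too, and reveals the
|         // (ASLR-randomized) stack address of `x`.
|         let addr = p as usize;
|         println!("x lives at {:#x}", addr);
|     }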
| ridiculous_fish wrote:
| Sometimes unsafe is used differently, for example when setting up
| a signal handler or a post-fork handler. Command::pre_exec is
| unsafe to indicate that even safe code may crash or produce UB
| within this function: only async-signal-safe functions should be
| used, and Rust does not model that in its type system.
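|
| A minimal Unix-only sketch of that contract (the spawned command
| is arbitrary):
|
|     use std::io;
|     use std::os::unix::process::CommandExt;
|     use std::process::Command;
|
|     fn main() -> io::Result<()> {
|         let mut cmd = Command::new("true");
|         // SAFETY: the closure only does async-signal-safe work
|         // (here, nothing at all). Allocating, locking, or printing
|         // in the forked child could deadlock or corrupt state, and
|         // the type system can't check that for us.
|         unsafe {
|             cmd.pre_exec(|| Ok(()));
|         }
|         cmd.status()?;
|         Ok(())
|     }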
| oxff wrote:
| It is a very cool abstraction that makes possible lots of things
| that are (to my understanding) impossible in other languages,
| which lack this explicit safety contract.
|
| Like this: https://www.infoq.com/news/2021/11/rudra-rust-safety/,
| and I quote: "In C/C++, getting a confirmation from the
| maintainers whether certain behavior is a bug or an intended
| behavior is necessary in bug reporting, because there are no
| clear distinctions between an API misuse and a bug in the API
| itself. In contrast, Rust's safety rule provides an objective
| standard to determine whose fault a bug is."
|
| People bring up `unsafe` Rust as an argument against the
| language, but to me it appears to be an argument `for` it.
| likeabbas wrote:
| My only complaint is I wish they didn't name it `unsafe` and
| instead named it something like `compiler_unverifiable` so that
| people could more properly understand that we can make safe
| abstractions around what the compiler can't verify.
| dwohnitmok wrote:
| I like the word `unsafe`. It's a nice red flag to newbies
| that "you almost certainly shouldn't use this" and once you
| have enough experience to use `unsafe` well you'll know its
| subtleties well enough.
| likeabbas wrote:
| I don't think newbies using it would be a problem. There is
| an issue with people/companies evaluating Rust and stopping
| when they see `unsafe` without looking much further into
| it.
| mynameisash wrote:
| I just saw a tweet from Esteban Kuber[0] the other day that
| made me rethink a silly thing I did:
|
| "I started learning Rust in 2015. I've been using it since
| 2016. I've been a compiler team member since 2017. I've
| been paid to write Rust code since 2018. I have needed to
| use `unsafe` <5 times. _That's_ why Rust's safety
| guarantees matter despite the existence of `unsafe`."
|
| For me, I was being a bit lazy with loading some config on
| program startup that would never change, so I used a
| `static mut` which requires `unsafe` to access. Turns out I
| was able to figure out a way to pass my data around with an
| `Arc<T>`. I think either way would have worked, but I
| figured I should avoid the unsafe approach anyway.
|
| [0] https://twitter.com/ekuber/status/1511762429226590215
| oconnor663 wrote:
| Yeah if you don't need mutation after initialization is
| done, Arc<T> is a good option for sharing. Lazy<T>
| (https://docs.rs/once_cell/latest/once_cell/sync/struct.Lazy...)
| is also nice, especially if you want to keep treating
| it like a global. I believe something like Lazy<T> is
| eventually going to be included in the standard library.
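|
| A minimal sketch with once_cell (assuming it's in Cargo.toml;
| the Config shape and env var are invented):
|
|     use once_cell::sync::Lazy;
|
|     struct Config {
|         verbose: bool,
|     }
|
|     // Initialized on first access, immutable afterwards:
|     // no `static mut`, no `unsafe`.
|     static CONFIG: Lazy<Config> = Lazy::new(|| Config {
|         verbose: std::env::var("APP_VERBOSE").is_ok(),
|     });
|
|     fn main() {
|         if CONFIG.verbose {
|             println!("verbose mode");
|         }
|     }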
| shepmaster wrote:
| I'll be even lazier (heh) and just straight up leak
| memory for true once-set read-only config values.
|
| https://github.com/shepmaster/stack-overflow-relay/blob/273a...
| kibwen wrote:
| If I were to go back and do it again I'd propose to rename
| unsafe blocks to something like `promise`, to indicate that
| the programmer is promising to uphold some invariants (the
| keyword on function declarations would still be `unsafe`).
| (Of course the idea of a "promise" means something different
| in JavaScript land, but I'm not worried about confusion
| there.)
| gpm wrote:
| I still like `trust_me`, maybe a bit unprofessional in some
| contexts though.
| ChadNauseam wrote:
| While we're at it, I also would have preferred `&uniq` over
| `&mut`.
| kevincox wrote:
| Or just `unchecked`. It is similarly short but holds a
| similar meaning. Of course `unchecked_by_the_compiler` would
| be more accurate.
| kibwen wrote:
| That's inaccurate, because Rust still performs all the
| usual checks inside of unsafe blocks. The idea that unsafe
| blocks are "turning off" some aspect of the language is a
| persistent misconception, so it's best to pick a keyword
| that doesn't reinforce that.
| nyanpasu64 wrote:
| Shrug. I really don't think unchecked is "wrong" in
| spirit, since adding an unsafe block makes it trivial to
| write code which ignores the usual boundary and pointer
| lifetime checks, which is not very different in practice
| from turning off checks. Also, unsafe code is usually
| wrong, and deeply difficult to write correctly, much more
| so than in C.
|
| Also, I dislike the word "unsafe" since "unsafe code" is
| easily (mis)interpreted to mean "invalid/UB code", but
| "invalid/UB code" is officially called "unsound" rather
| than "unsafe". Unsafe blocks are used to call unsafe
| functions etc. (they pass information into unchecked
| operations via arguments), and unsafe trait impls are
| used by generic/etc. unsafe code like threading
| primitives (unsafe trait impls pass information into
| unchecked operations via return values or side effects).
|
| unchecked would make a good keyword name for blocks
| calling unchecked operations, but I'm not so sure about
| using it for functions or traits.
| retrac wrote:
| I believe the term "unsafe" in this sense predates Rust.
| Haskell's GHC has unsafePerformIO [1], which is very similar
| conceptually. It's generally used to implement an algorithm
| in a way that is pure/safe in practice, but where this
| cannot be proved by the type system.
|
| [1] https://stackoverflow.com/questions/10529284/is-there-ever-a...
| pjmlp wrote:
| The unsafe concept has existed in systems programming
| languages since the early '60s.
|
| The real issue here is Rust fans trying to turn Rust into a
| dependently typed language while abusing unsafe's original
| purpose.
| pornel wrote:
| hold_my_beer { }
| verdagon wrote:
| Jon Goodwin's Cone [0] language had a nice idea of calling it
| "trust", as in, "Compiler, trust me, I know what I'm doing"
|
| [0] https://cone.jondgoodwin.com/coneref/reftrust.html
| bitbckt wrote:
| I like the sound of "trusted".
| MereInterest wrote:
| I think the benefit of the name "unsafe" is that it
| immediately tells newer users that the code inside has
| deeper magic than they may be comfortable using. Where
| "trusted" is what the writer attests to the compiler,
| "unsafe" is what writer warnings to a future reader.
| [deleted]
| seba_dos1 wrote:
| > People bring up `unsafe` Rust as an argument against the
| language
|
| Those people usually don't understand what `unsafe` is.
| zozbot234 wrote:
| This article fails to mention that the true semantics of Rust's
| `unsafe` subset have been very much up in the air for a long
| time. Nowadays the `miri` interpreter is supposed to give us a
| workable model of when `unsafe` code will or won't trigger UB,
| but many things are still uncertain, sometimes intentionally so,
| as some properties may depend on idiosyncratic workings of the
| underlying OS, platform and/or hardware. These
| factors all make it harder to turn what's currently `unsafe` into
| future features of safe Rust, perhaps guarded by some sort of
| formalized proof-carrying code.
| kibwen wrote:
| Progress towards this is always being made, though it's a long
| road with much still to determine. I encourage anyone
| interested in this to join the Zulip channel for Rust's unsafe
| code guidelines working group:
| https://rust-lang.zulipchat.com/#narrow/stream/t-lang.2Fwg-u...
|
| Some recent developments:
| https://gankra.github.io/blah/tower-of-weakenings/
| [deleted]
___________________________________________________________________
(page generated 2022-04-10 23:00 UTC)