[HN Gopher] Rust 1.50
___________________________________________________________________
Rust 1.50
Author : joseluisq
Score : 322 points
Date : 2021-02-11 14:43 UTC (8 hours ago)
(HTM) web link (blog.rust-lang.org)
(TXT) w3m dump (blog.rust-lang.org)
| AlchemistCamp wrote:
| Really too bad that after another minor version, I'm still
| seeing: Warning: can't set `control_brace_style =
| ClosingNextLine`, unstable features are only available in nightly
| channel.
|
| It seems like a trivial and desirable feature but it's only been
| available in nightly for over a year now. The poor (and rigidly
| enforced) default on this same issue was why I ripped prettier
| out of every project I work on.
|
| Also, I don't know how hard it is to contribute to this kind of
| issue, but I'm glad to put in some time (with my still newbie
| level of Rust skills) if that can help resolve it.
| varajelle wrote:
| I am happy that bool::then is stable. I can simplify quite some
| code by removing the `else { None }` branch
| k__ wrote:
| Could you give a before/after example?
| Fiahil wrote:
| Before:
|
| if predicate() { Some(thing) } else { None }
|
| After:
|
| predicate().then(|| thing)
| trevor-e wrote:
| Can you also please explain what the "||" is doing? That
| looks really weird to me coming from Swift, and I don't see it
| in the docs.
|
| edit: thanks all, should have known it was a lambda :D, in
| Swift we can omit this. My first thought seeing "||" was
| some type of OR logic since it's in the context of if/else
| varajelle wrote:
| It is a lambda function with no parameters.
|
| The argument of then() is a function which is evaluated
| only if the bool is true.
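
To make the lazy evaluation described above concrete, here is a minimal sketch. The helper `double_if_even` and the `calls` counter are made up for illustration; only `bool::then` itself comes from the standard library (stable since 1.50).

```rust
// Hypothetical helper: Some(n * 2) for even n, None otherwise.
fn double_if_even(n: u32) -> Option<u32> {
    (n % 2 == 0).then(|| n * 2)
}

fn main() {
    // Count how often the closure body actually runs.
    let mut calls = 0;

    // The bool is false, so the closure is never called.
    let skipped = false.then(|| {
        calls += 1;
        42
    });
    assert_eq!(skipped, None);
    assert_eq!(calls, 0);

    // The bool is true, so the closure runs exactly once.
    let taken = true.then(|| {
        calls += 1;
        42
    });
    assert_eq!(taken, Some(42));
    assert_eq!(calls, 1);

    println!("{:?} {:?}", double_if_even(4), double_if_even(3));
}
```

The closure form matters when computing the value is expensive or has side effects, since an eagerly evaluated argument would run either way.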
| Sharlin wrote:
| Rust lambdas look like this:
|
|     |ar, gs| expr
|
| So `|| expr` is just a lambda function that takes no
| arguments and evaluates to `expr`.
| laszlokorte wrote:
| empty parameter list of a closure.
|
| rust: |x| x+x
|
| swift: {x in x+x}
|
| rust: || 42
|
| swift: { 42 }
| Shadonototro wrote:
| i prefer the Before one, it is clear what it does, it's
| branching, common, even in plain english or any latin
| language
|
| the second, that magic is not clear, i'll keep using the
| Before personally
| momothereal wrote:
| IMO it depends if you're already in a chaining context. I
| wouldn't use it in a simple if-else branch, but it also
| removes the need to use if-else when you're already in a
| 10-function long chain.
|
| Before:
|
|     if a().b().c().d().e().f() { Some(g().h().i()) } else { None }
|
| Now:
|
| a().b().c().d().e().f().then(|| g().h().i())
|
| Even better if you have multiple boolean returns in the
| chain.
| csomar wrote:
| The naming can be confusing since "then" is usually used
| for async operations.
| steveklabnik wrote:
| That ship had already sailed in Rust:
| https://doc.rust-lang.org/stable/std/?search=then
| hobofan wrote:
| Option and Result already have a `and_then` in addition
| to the futures methods you mention, so it should just
| always be assumed that then/and_then have slightly
| different meanings depending on the context.
| oblio wrote:
| Rust is awesome but I swear the language developers have no
| aesthetic eye. I mean, they do have one, but it definitely
| isn't that of Elvis (and definitely not that of Mort).
|
| https://web.archive.org/web/20080218051638/http://www.nikhi
| l...
| pjungwir wrote:
| I don't use Rust often (sadly), but I really appreciate its goals
| and its steady progress. I read every one of these Rust 1.x
| stories on HN. Around Christmas I made an n-player chess clock
| with Rust and websockets [0] (because my friends are really
| slooow Agricola players :-), and it was incredibly easy compared
| to my first projects 5 years ago. I can't believe it's been that
| long. So thank you to all the folks working on it, and also thank
| you to everyone helping out 5-year newbies like me. Especially
| steveklabnik, I think you've helped me personally several times.
| :-)
|
| [0] https://github.com/pjungwir/multiclock
| haolez wrote:
| I was just wondering... is there any hope to have minimalist Rust
| compilers? Something like TinyCC for C[0]?
|
| [0] https://tiny.cc/
| twic wrote:
| The closest thing i am aware of is mrustc:
|
| https://github.com/thepowersgang/mrustc
|
| But AIUI that is not aimed at being a Rust compiler you would
| actually use day-to-day, just at being a way to bootstrap a
| Rust toolchain without needing a Rust compiler.
| masklinn wrote:
| An interpreter might make more sense, as compiling rust in a
| straightforward manner (without optimisations) is really slow
| and takes a lot of space. It might not be slower to just
| interpret it, and avoiding binary generation might be
| advantageous.
| pornel wrote:
| I don't think so. The type system is pretty advanced, and type
| inference is required to compile it. Macros, proc macros, and
| modules handle tons of details, and aren't just textual
| inclusion like in C.
|
| The borrow checker is technically optional (as proven by
| mrustc), but if you wanted to implement it, it's no longer a
| set of simple scope rules, but more like a flavor of Prolog
| inside the compiler.
|
| These things are awesome for the power and usability of the
| language, but aren't tiny.
| steveklabnik wrote:
| I am not aware of anyone pursuing any.
| kibwen wrote:
| Your link there is to a link shortener, I think you mean TCC.
| :)
|
| It really depends on what you're actually asking for. Do you
| want, for example, a "slimmer" Rust compiler that jettisons all
| the stuff that supports older language editions? Do you want a
| "simpler" compiler that only uses straightforward algorithms at
| the expense of compiler speed? Or do you want a faster compiler
| that does less optimization at the expense of code generation
| quality? The latter, at least, is something that the Cranelift
| backend for rustc hopes to achieve.
| haolez wrote:
| Yes and I've lost the window to edit my comment. Thanks for
| the correction :)
| weinzierl wrote:
| I've been doing Rust for a while now, but this is the first time
| I've heard about the _niche_ concept[1]. Sounds really useful.
|
| I know that other languages have subranges or refinement types
| but _niches_ seem to solve a different problem - namely to use
| the "holes" in a data type for optimization. Does any other
| language have some comparable concept to _niches_ in Rust?
| da_big_ghey wrote:
| I am looking to learn Rust, but am wondering, what resources are
| good for learning the latest language version? I don't want to
| pick up something that teaches me an old version so I then have
| to go and get used to the new one.
| baq wrote:
| pick any stable version released in the last few months, or
| perhaps the one shipped with a recent version of your favorite
| distro (if running linux) - changes are very rarely important
| enough to absolutely have to know them.
|
| 1.50 is a nice round number and for that reason i can recommend
| it :)
| vincenv wrote:
| I found the Rust Programming Language book [1] very helpful for
| learning, and with rust installed you can open it by typing
| rustup doc --book.
|
| [1] https://doc.rust-lang.org/stable/book/
| steveklabnik wrote:
| New versions of Rust come out every six weeks, so you basically
| can never find something that is covering the absolute latest
| head.
|
| However, new features are additive, so you won't learn stuff
| that's incorrect, you just may not be aware of later, new
| things.
| joseluisq wrote:
| Yeah, however I think that a good one could be 2018 Edition
| IMO.
| lmkg wrote:
| I agree that _versions_ aren't worth worrying about, but
| which _edition_ is something I would pay attention to.
| no_wizard wrote:
| Disclaimer: I'm new to Rust and lower-level programming of this
| type in general. The 'lowest' level language I've worked with on
| the regular before this is C#[2]
|
| I know Rust already has Tuples[0], so I'm assuming Const Generic
| Indexing For Arrays[1] is a happy path for the compiler to
| optimize what amounts to a finite sized Generic Tuple? (Finite
| size in terms of memory not elements)
|
| Excellent feature, I just see some (admittedly only high level)
| surface overlap in this language feature.
|
| [0]: https://doc.rust-lang.org/rust-by-
| example/primitives/tuples....
|
| [1]: https://blog.rust-
| lang.org/2021/02/11/Rust-1.50.0.html#const...
|
| [2]: Which, while generally accepted to be a strongly typed
| language, is not near the level of Rust
| zucker42 wrote:
| I don't know if you have the right impression of what the
| feature does.
|
| Consider if you write a generic function which takes as an
| argument any type that has an indexing operation. In the past
| that function couldn't take an array as an argument. Now it
| can.
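
A rough sketch of the point above, assuming the change in question is the `Index`/`IndexMut` implementations that Rust 1.50 added for arrays of any length. The helper `sum_first` is hypothetical and exists only for this example.

```rust
use std::ops::Index;

// Hypothetical generic helper: sums the first `n` elements of anything
// that can be indexed by usize and yields i32 values.
fn sum_first<C>(items: &C, n: usize) -> i32
where
    C: Index<usize, Output = i32> + ?Sized,
{
    (0..n).map(|i| items[i]).sum()
}

fn main() {
    let array: [i32; 4] = [1, 2, 3, 4];
    let vector: Vec<i32> = vec![10, 20, 30];

    // Since 1.50 the array itself satisfies the Index bound,
    // so no conversion to a slice is needed at the call site.
    println!("{}", sum_first(&array, 4));
    println!("{}", sum_first(&vector, 2));
}
```

Before 1.50, passing the array would have needed an explicit slice such as `&array[..]`, because indexing on arrays only worked through coercion to a slice.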
| DasIch wrote:
| Arrays and tuples are very different. Arrays are homogeneous
| structures, all entries have the same type. Tuples are
| heterogeneous, so entries can have different types.
| Blikkentrekker wrote:
| They're not very different in that arrays can be thought of
| as a more specialized form of tuples that can thus coerce to
| slices.
|
| In theory, arrays do not need to exist, and could in theory
| be expressed as tuples, but that would lead to rather
| unpleasant syntax. `(A, A, A, A, A, A, A, A, A, A)` is
| certainly not as nice as `[A; 10]`; -- _Rust_ could even have
| chosen to make the latter syntactic sugar for the former and
| make them interchangeable.
|
| The crux to this, is that in _Rust_ , unlike in many other
| languages, the length of an array is static and known at
| compile time. Many languages lack such a datatype, as it
| isn't strictly needed per the aforementioned reasons.
| steveklabnik wrote:
| ... and in some sense, tuples are _also_ isomorphic to
| structs.
|
| Product types: woo!
| no_wizard wrote:
| Thanks, after further thought, I misread the example code
| they posted, and realized this. I'm still getting used to how
| Rust specifies generic vs the way you would specify something
| as generic in TypeScript (a language I am far more familiar
| with as I use it every day in my job and all my other
| projects). That's what got me.
|
| _Addendum Explanation (feel free to skip readers)_ :
|
| When I say that, this is what I mean:
|
| In TypeScript, an accepted thing about generics is that I can
| have multiple declared types either via a union e.g.
|
| `TypeA | TypeB`
|
| or intersection
|
| `TypeA & TypeB`
|
| In general practice, it is assumed you can do this, and not
| vice versa. If this is supposed to be avoided, you typically
| will use a constraining Type or specify a specific interface
| / type instead of making it outright generic.
|
| This is not _always_ the case with a language like Rust;
| it's much finer grained, so I must try and get out of the
| habit of thinking this way. It's the opposite: generic typing
| is constrained by the structure you put them in, as I
| understand it.
|
| This, of course, is a _super_ general overview of what I'm
| getting at, and there is a lot more nuance to TypeScript as
| well, but my everyday practice of using the language for the
| last 6 years or so has proven this to be the case often. I'm
| finding (and I do like this for what it's worth) Rust is not
| like this outwardly.
| matt_kantor wrote:
| I'm interested in understanding your addendum because I
| often find myself teaching programmers new languages by
| mapping concepts from languages they already know.
|
| Here's a TypeScript analogy that might help clarify things:
| https://www.typescriptlang.org/play?#code/C4TwDgpgBASgrgZ2A
| G...
|
| I'm not sure what you're getting at when mentioning unions
| & intersections. Are you comparing a type like `Foo<number
| | boolean>` with `Bar<number, boolean>`? Those are
| conceptually very different (even in TypeScript), but maybe
| I'm misunderstanding.
|
| The big difference is structural vs nominal typing and the
| fact that Rust doesn't really have subtyping (besides
| lifetimes). TypeScript types are defined by their structure
| --an object type you define can be compatible with a
| different object type from a separate library that never
| heard of you, just by nature of sharing the same
| properties. However Rust types are delineated by their
| names/identities--two separate types are not directly
| interchangeable even if they're defined identically.
| no_wizard wrote:
| Yeah they're definitely different. My point being that in
| TypeScript, though, it's generally assumed that (unless a
| constraint type or an explicit type / interface is used)
| you can get away with using a union or an intersection to
| specify a generic type value.
|
| So for instance, `Foo<number | boolean>` or `Bar<number,
| boolean>` wouldn't be out of place on say, a function's
| return type depending on what it does. Not that you
| _should_ but you most certainly _can_. In some cases
| (like dealing with fetch results) it's infinitely useful
| when doing foundational library work on a project / app
| / library. In other cases, I may want to use something
| like an intersection type to compose a return type or
| value type of two interfaces that are similar enough to
| be merged into one, or I have a composition function that
| merges data structures together etc.
|
| Fundamentally, with Rust, I have found (again, disclaimer:
| I'm really new at this language) that saying something is
| generic does not imply you can saturate type values like
| the examples I gave above (the real thing I was getting
| at). It's not commonplace to see this (at least, not that
| I've read or seen yet). Typically, there is a level of
| explicitness even within generics, as mentioned with the
| array. It has to be _homogeneous_, so a union type is a
| no-go (at least, homogeneous in my mind means of one
| uniform type). An intersection type _might_ be idiomatic,
| but I don't know if Rust even has an equivalent. The way I
| think of it myself is that the compiler is trying to do
| the right thing not just for the developer but for the
| _program_ , so these constraints likely tie back to
| memory safety and efficiency, so yes, the
| interchangeability aspect throws me for a loop when
| reading (it's the same words but a vastly different
| context) sometimes.
|
| Even C# was less strict in this way (but still a bit more
| strict than TypeScript, though you could just box and
| unbox the _very annoying to see in practice most of the
| time_ `object` type, since all types derive from the C#
| object type). F# was better in that you could do type
| aliasing, which made certain things better (like generic
| Record types). I still believe F# is a better language
| overall in terms of ergonomics. I wonder if I could just
| leverage that instead of Rust, but I think the binaries
| would be much bigger (easily 50 MB plus) for what I'm
| trying to achieve, which is web dev tools, a la things
| like SWC.
| matt_kantor wrote:
| TypeScript-style untagged unions don't exist in (safe)
| Rust. Instead, "or" is done with enums (which are
| tagged/discriminated unions--
| https://en.wikipedia.org/wiki/Tagged_union). One
| difference is that in TypeScript `number | number`
| reduces to just `number`, but Rust enums don't work that
| way (`enum Foo { A(f64), B(f64) }` has two distinct
| variants).
|
| Check out these equivalent-ish programs:
|
| TypeScript:
| https://www.typescriptlang.org/play?#code/C4TwDgpgBA6gTgQzJO...
|
|     type Wrapper<T> = { value: T }
|
|     function toBoolean(x: Wrapper<boolean | number>): boolean {
|       if (typeof x.value === "number") {
|         return x.value !== 0
|       } else {
|         return x.value
|       }
|     }
|
|     function unwrap<T>(x: Wrapper<T>): T {
|       return x.value
|     }
|
|     const a = { value: 1.23 }
|     const b = { value: true }
|     toBoolean(a)
|     toBoolean(b)
|
|     const c = { value: "whatever" }
|     unwrap(c)
|
| Rust:
| https://play.rust-lang.org/?version=stable&edition=2018&gist...
|
|     struct Wrapper<T> {
|         value: T,
|     }
|
|     enum BooleanOrNumber {
|         Boolean(bool),
|         Number(f64),
|     }
|
|     fn to_bool(x: Wrapper<BooleanOrNumber>) -> bool {
|         match x.value {
|             BooleanOrNumber::Boolean(b) => b,
|             BooleanOrNumber::Number(n) => n != 0.0,
|         }
|     }
|
|     fn unwrap<T>(x: Wrapper<T>) -> T {
|         x.value
|     }
|
|     fn main() {
|         let a = Wrapper { value: BooleanOrNumber::Number(1.23) };
|         let b = Wrapper { value: BooleanOrNumber::Boolean(true) };
|         to_bool(a);
|         to_bool(b);
|
|         let c = Wrapper { value: "whatever" };
|         unwrap(c);
|     }
|
| In Rust you also have traits to play with:
| https://play.rust-lang.org/?version=stable&edition=2018&gist...
|
|     struct Wrapper<T> {
|         value: T,
|     }
|
|     trait Boolable {
|         fn to_bool(self) -> bool;
|     }
|
|     impl Boolable for bool {
|         fn to_bool(self) -> bool {
|             self
|         }
|     }
|
|     impl Boolable for f64 {
|         fn to_bool(self) -> bool {
|             self != 0.0
|         }
|     }
|
|     fn to_bool(x: Wrapper<impl Boolable>) -> bool {
|         x.value.to_bool()
|     }
|
|     fn unwrap<T>(x: Wrapper<T>) -> T {
|         x.value
|     }
|
|     fn main() {
|         let a = Wrapper { value: 1.23 };
|         let b = Wrapper { value: true };
|         to_bool(a);
|         to_bool(b);
|
|         let c = Wrapper { value: "whatever" };
|         unwrap(c);
|     }
|
| I find there's a decent mental shift switching between
| TypeScript and Rust, despite some superficial
| similarities. I use both languages in different codebases
| and it usually takes me a bit to adjust when hopping back
| and forth. Both type systems have useful worldviews, but
| they're pretty different, and they nudge you towards
| different ways of structuring your code.
| staticassertion wrote:
| const generics are a feature I've been looking forward to since
| 1.0, really cool to see that work making it to stable.
|
| This solves another of the ergonomic issues in Rust. It really
| feels like within 2021 Rust will hit a point where it feels
| totally consistent.
| CodesInChaos wrote:
| Unfortunately the version of const generics stabilized in 1.51
| has many limitations. The core problem seems to be that
| evaluating const functions can panic, while generics should
| only produce errors when constraints are not met, not when
| instantiating them.
|
| The current limitation I find the most annoying is that you
| can't use associated constants in generics.
| est31 wrote:
| const generics themselves will only become stable in the
| upcoming release, 1.51. This release only adds implementations
| of specific traits to arrays of all lengths. Users themselves
| can't declare types yet that are generic on numbers or write
| impl blocks generic on numbers.
| staticassertion wrote:
| Ah! Bummer, but also, that's enough to deal with what felt
| like a big inconsistency in Rust - inability to work well
| with arrays.
| conradludgate wrote:
| I'm already using them in nightly and they are amazing.
|
| The only thing left that I want with generics is
| variadics/tuples. The only way currently to implement a
| trait over a generic tuple is to write a macro and call it
| for however many sized tuples you want. I've not seen any
| convincing RFCs about it yet though so I'm not confident
| we'll get them any time soon
| felipellrocha wrote:
| I hit this limitation recently, and although it was a bummer,
| it's good to hear that a solution is being worked on
| mkesper wrote:
| Seems to have vanished, was put out too early?
|
| OK, can see it now, too. Caches...
| steveklabnik wrote:
| No, it's there, not too early. Best guess is that you have an
| old cache. This happens sometimes, from what I hear from
| release team folks. Should sort itself soon.
| no_wizard wrote:
| (I'm noting this separately from my other comment)
|
| I found that not using the ISP DNS on your router helps with
| this problem on several sites, this being one of them, among
| other benefits
|
| I recommend everyone use something like OpenDNS, Google's
| public DNS resolver (if you're comfortable with that) or
| CloudFlare via 1.1.1.1 (and associated addresses), or any other
| myriad of reputable third party DNS services
| no_wizard wrote:
| Here's the full release notes on GitHub as an alternate if
| you're having trouble with the link:
|
| https://github.com/rust-lang/rust/blob/master/RELEASES.md#ve...
| PartiallyTyped wrote:
| Still vanished for me, here is a webarchive link.
|
| https://web.archive.org/web/20210211144406/https://blog.rust...
| cevans01 wrote:
| Couldn't any negative number be used as a niche for file
| descriptors in Unix? Could Option<File> use -2 to specify None?
| duckerude wrote:
| Based on https://internals.rust-lang.org/t/can-the-standard-
| library-s... it looks like that's probably the case, but that
| it's not beyond all doubt that it holds across all Unix-likes.
|
| > On Linux, the answer is pretty obviously no. Linux file
| descriptors are stored in an array of structs that has its
| capacity bounded to INT_MAX, so any negative int would either
| be considered nonsense (if treated as negative) or be higher
| than INT_MAX (if it was bit-reinterpreted as an unsigned
| value).
|
| > The Single UNIX Specification explicitly says that open can't
| return a negative file descriptor, and says in the page for
| dup2 that you should get EBADF if you try to claim a negative
| file descriptor.
|
| > Unfortunately, their description of file descriptor
| allocation never explicitly says that the negative range is out
| of bounds, but open references this algorithm while
| simultaneously claiming that it never gives a negative result.
|
| > Also, Wikipedia says it can't be negative, but they don't
| give a source on that particular claim (urgh!).
|
| -1 was picked as a conservative choice that's enough for the
| common case of Option.
| baq wrote:
| i don't know, but -1 has the prettiest two's complement
| representation of all negative ints :)
| slmjkdbtl wrote:
| I'm not familiar with Rust's rfc / implementation / go stable
| flow, but I'm curious how f32::clamp took so many years to go
| stable
| steveklabnik wrote:
| You can read the relevant history here:
| https://github.com/rust-lang/rust/issues/44095
|
| Short summary:
|
| First delay was ecosystem breakage; many people had defined
| their own functions with the same name. We're allowed to make
| these changes but tend to try not to unnecessarily break
| people.
|
| It then sat for about a year. The original author was a bit
| tired from all of the work to get to that point, and reasonably
| let it sit.
|
| There was then a small discussion about taking individual
| parameters vs ranges.
|
| Six months after that, it finally actually landed, thanks to
| some other changes that would help mitigate the breakage.
|
| It then sat for a while, until the libs team proposed merging.
| That brought up a lot of the previous design questions, which
| some people thought weren't resolved to their satisfaction.
|
| It then finally got stabilized in October of last year, and
| then it had to wait for the release trains.
|
| So, TL;DR: a surprising amount of details for such a small
| feature, which can lead to burnout, along with not enough
| people wanting to push it over the line.
| vlang1dot0 wrote:
| Looks like the two main issues were:
|
| 1. Some crates in the ecosystem have "extension traits" that
| add the same method to `f32`. Adding this into std will cause
| conflicts with those methods so users will need to disambiguate
| or remove their use of those extension traits. (This is allowed
| breakage under the stability guidelines)
|
| 2. Should this take two arguments or a range
| (`x.clamp(1.0, 5.0)` or `x.clamp(1.0..5.0)`)?
|
| I think part of the reason this took so long was that by having
| widely used crates like `numtools` in the ecosystem that
| provided this functionality, it took a lot of the pressure off
| having this in std/core.
| coldtea wrote:
| pub fn clamp(self, min: f64, max: f64) -> f64
|
| Rust has generics iirc, so why is there this?
| masklinn wrote:
| I guess because floats implement PartialOrd, not Ord, and what
| would you return if one of self, min, and max is non-ordered?
| Returning an `Option<T>` would usually be inconvenient.
|
| Therefore clamp is implemented on Ord (meaning there's always a
| value to return), and there's an efficient implementation on
| floats which can define its behaviour with respect to NaN.
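
A small sketch of the split described above: `Ord::clamp` for totally ordered types, and the inherent `f64::clamp` with its documented NaN behaviour (a NaN input is returned as NaN). The helper `clamp_percent` is a made-up name for illustration.

```rust
// Hypothetical helper: restricts a reading to the 0..=100 range.
fn clamp_percent(x: f64) -> f64 {
    x.clamp(0.0, 100.0)
}

fn main() {
    // Ord::clamp: integers always have a well-defined answer.
    assert_eq!(15_i32.clamp(0, 10), 10);
    assert_eq!((-3_i32).clamp(0, 10), 0);
    assert_eq!(7_i32.clamp(0, 10), 7);

    // f64::clamp: same shape, but defined directly on the float type.
    assert_eq!(clamp_percent(150.0), 100.0);
    assert_eq!(clamp_percent(-1.5), 0.0);

    // A NaN input stays NaN instead of forcing an Option return type.
    assert!(clamp_percent(f64::NAN).is_nan());

    println!("{}", clamp_percent(42.0));
}
```

Note that `f64::clamp` panics if `min > max` or if either bound is NaN, so the bounds themselves should be known-good values.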
|
| If you want the actual details,
|
| * https://internals.rust-lang.org/t/clamp-function-for-
| primiti...
|
| * https://github.com/rust-lang/rust/issues/44095
| pornel wrote:
| Also the float implementation was chosen to optimize well
| with SSE, even in presence of NaNs.
| pimeys wrote:
| I hope 1.51 brings the new cargo resolver to stable. I was
| fighting with a problem recently that the new resolver solves.
| Consider a dependency such as:
|
|     [target.'cfg(target_os = "macos")'.dependencies]
|     foo = { version = "1", features = ["x"] }
|
|     [target.'cfg(not(target_os = "macos"))'.dependencies]
|     foo = { version = "1", features = [] }
|
| With the current resolver, the dependency `foo` will always be
| compiled with a feature `x`, no matter which target os you're
| using. Debugging this took me a while and I was surprised it's
| actually just how the current resolver works.
| candied_scarf wrote:
| I'm still waiting for fixes for
| https://github.com/rust-lang/rust/issues/40552 and
| https://github.com/rust-lang/rust/issues/75263 in one of these
| releases.
|
| Lots of upvotes for them too, since it's for privacy in the
| compiler.
| steveklabnik wrote:
| I am not aware of anyone actively working on them, though I am
| also not on the compiler team.
|
| Looks like someone who cares about this needs to step up! You
| can make it happen sooner than later. It is one of the best
| things about open source.
| brundolf wrote:
| > Some types in Rust have specific limitations on what is
| considered a valid value, which may not cover the entire range of
| possible memory values. We call any remaining invalid value a
| niche, and this space may be used for type layout optimizations.
| For example, in Rust 1.28 we introduced NonZero integer types
| (like NonZeroU8) where 0 is a niche, and this allowed
| Option<NonZero> to use 0 to represent None with no extra memory.
|
| I didn't know about this, and it's super cool. It's part of a
| broader pattern where the rigid constraints Rust can impose on
| code allow both the compiler and the user to do things that would
| be wildly dangerous in a less-strict language.
| TazeTSchnitzel wrote:
| Not only is it more efficient, it aligns with a safety
| principle I like: "make invalid states unrepresentable".
| steveklabnik wrote:
| For some history here, the most famous example is Option<&T>.
| References cannot be null, and so the Option can use the null
| pointer to represent None.
|
| At first, this was special cased in the compiler, but then
| generalized.
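
The niche optimization discussed above can be observed with `size_of`. This is only a sketch; `option_is_free` is a hypothetical helper name, and the checks use the reference and `NonZeroU8` cases mentioned in the thread.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

// Hypothetical helper: true when wrapping T in Option costs no extra memory.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References can never be null, so None is stored as the null pointer.
    assert!(option_is_free::<&u8>());

    // NonZeroU8 can never be 0, so None is stored as 0.
    assert!(option_is_free::<NonZeroU8>());

    // A plain u8 uses all 256 bit patterns, so Option<u8> needs extra space.
    assert!(!option_is_free::<u8>());

    println!(
        "Option<&u8>: {} bytes, Option<u8>: {} bytes",
        size_of::<Option<&u8>>(),
        size_of::<Option<u8>>()
    );
}
```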
| Drup wrote:
| Hi steve, since you are around: is there a (formal?)
| description of how this "niche" layout optimization behaves
| on ADTs in general, or any documented work on the topic?
| steveklabnik wrote:
| I think there is, but I cannot find it right now. If you
| make a post on internals.rust-lang.org, I bet someone can
| point you to the right place.
| drran wrote:
| Sadly, it's not generalized enough to represent C-like enums
| in Rust, which can be declared as (in _theory_):
|     enum Foo {
|         A = 0,
|         B = 1,
|         C = 2,
|         Unknown(i32),
|     }
|
| where the Unknown variant could not contain the values 0, 1,
| or 2, so the Rust compiler would be able to pack this enum
| into a single i32, like in C.
| azornathogron wrote:
| For that use-case can you get away with just setting
| #[repr(i32)] on the enum and leaving Unknown out of the
| list?
|
| I guess the compiler can't help you keep known and unknown
| values separate with that structure though, so maybe it's
| not enough.
| kzrdude wrote:
| If you do that, then 3 etc is not a valid value for the
| enum. Only valid values for that type are the variants.
| johnsoft wrote:
| It's not really an enum if you can't enumerate all the
| values at compile time. Rust enums are not C enums.
|
| I'd represent it like this:
|     #[repr(transparent)]
|     struct Foo(i32);
|
|     impl Foo {
|         const A: Foo = Foo(0);
|         const B: Foo = Foo(1);
|         const C: Foo = Foo(2);
|     }
| drran wrote:
| 1) Enums in C are implemented that way, so C-like enums
| in Rust must implement it in the same way to be
| compatible with C.
|
| 2) This is required for forward compatibility. For
| example protobuf explicitly requires it.
|
| 3) Your code is not forward compatible. The proper way to
| implement it is as i32 and then unwrap it, or:
|
|     enum Foo {
|         A = 0,
|         B = 1,
|         C = 2,
|         _UNKNOWN_3 = 3,
|         _UNKNOWN_4 = 4,
|         _UNKNOWN_5 = 5,
|         _UNKNOWN_6 = 6,
|         // ... repeat few dozen times, to be forward compatible
|     }
|
| Example: https://github.com/apoelstra/rust-
| bitcoin/blob/c37ab1f9c2392...
|
| P.S.
|
| IMHO, Rust should allow defining enums as:
|
|     enum Foo {
|         A = 0,
|         B = 1,
|         C = 2,
|         _, // Unknown values which must be handled by default case
|     }
|
| or
|
|     enum Foo {
|         A = 0,
|         B = 1,
|         C = 2,
|         _OTHER, // Unknown values which must be handled
|     }
|
| because the current workaround is usable for i8, maybe even
| for i16, but not for i32.
| pornel wrote:
| Rust enums are not compatible with C, even with
| `#[repr(C)]`. Values not explicitly present in the enum
| are not allowed. Casting them to a Rust enum is UB, and
| it does actually cause miscompilation, because `match`
| can be a jump table.
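
Since casting an out-of-range integer straight into a Rust enum is undefined behaviour, one common workaround is to validate the raw value first. The following is only a sketch of that idea under assumed names (`Foo`, its variants, and the 0/1/2 numbering are taken from the examples in this thread, not from any real protocol); it gives up the single-i32 layout in exchange for safety.

```rust
// Known values plus a catch-all that preserves the raw number.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Foo {
    A,
    B,
    C,
    // Any raw value this version of the code does not know about.
    Unknown(i32),
}

impl From<i32> for Foo {
    // Safe decoding: every i32 maps to some variant, so no UB is possible.
    fn from(raw: i32) -> Self {
        match raw {
            0 => Foo::A,
            1 => Foo::B,
            2 => Foo::C,
            other => Foo::Unknown(other),
        }
    }
}

impl From<Foo> for i32 {
    // Encoding back to the wire value round-trips unknown numbers unchanged.
    fn from(value: Foo) -> Self {
        match value {
            Foo::A => 0,
            Foo::B => 1,
            Foo::C => 2,
            Foo::Unknown(raw) => raw,
        }
    }
}

fn main() {
    let decoded: Vec<Foo> = [0, 2, 7].iter().map(|&raw| Foo::from(raw)).collect();
    println!("{:?}", decoded);
}
```

The type system does not stop someone from constructing `Foo::Unknown(1)` by hand, which is exactly the gap the niche-based layout wished for earlier in the thread would close.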
| andrewaylett wrote:
| Rust has an annotation to mark that an enum is non-
| exhaustive[1], and a mechanism for declaring an enum as
| being laid out in a manner compatible with C[2]. I've not
| tried using them together :).
|
| In general, it _is_ possible for any C data layout to be
| represented in Rust -- but it's not necessarily the case
| that the representation has the same name. And it's also
| not the case that we can safely pass memory from C to
| Rust without validating the content, even if the
| representation is equivalent for all valid values.
|
| [1] https://blog.rust-
| lang.org/2019/12/19/Rust-1.40.0.html#non_e...
|
| [2] https://rust-lang.github.io/unsafe-code-
| guidelines/layout/en...
| steveklabnik wrote:
| Yes, if you look at the actual PR for this change, you'll
| see that doing this is using a rustc-specific interface,
| for similar reasons.
| k__ wrote:
| Do C/C++ use such optimizations or is their type system too
| brittle?
| lpapez wrote:
| When it comes to std::optional, the answer is no. In fact,
| std::optional<T&> is explicitly forbidden (see here:
| https://www.fluentcpp.com/2018/10/05/pros-cons-optional-
| refe...)
|
| It is used in some other places though. For instance,
| std::string implementations typically employ small-string
| optimization (SSO) which means that there might be a limit
| on string::size() since some of the bits inside that value
| are used for different purposes (example:
| https://akrzemi1.wordpress.com/2014/04/14/common-
| optimizatio...). The "niche" in this case is the
| realization that strings are unlikely to be exabytes in
| size, so there is no reason to always store zeros in high
| bits of _size.
| steveklabnik wrote:
| Fun trivia fact: Rust's stdlib String type cannot, and
| therefore does not, support SSO.
| drran wrote:
| If you need [no_std] version, use SmallString/SmallVec.
| Otherwise, use SmartString, SmolStr, etc.
| pdimitar wrote:
| I recently wondered if a library like `smol_str` will be
| upstreamed one day. I feel people would rejoice if they
| can use stack-allocated strings (up to a certain size).
|
| Do you think that's possible?
| steveklabnik wrote:
| Anything is possible, but it's not clear to me what
| advantage being in the standard library would actually
| bring. It is a pretty niche use-case (and I work in
| embedded these days), and it is pretty trivial to use a
| package.
|
| Only the libs team can really say, and I'm not on that
| team.
| pdimitar wrote:
| I'm looking at it through the lens of "let Rust have the
| best possible newcomer experience", which of course isn't
| a sentiment that has to be shared by the creators.
|
| I suppose I appreciate SmallString's ability to alias
| itself to Rust's String.
|
| But I'd immediately agree that this adds hidden
| complexity and the potential for it to bite you in the
| worst possible moment.
| steveklabnik wrote:
| Even with that perspective, you can disagree. Offering
| _even more_ string types may be more confusing for users.
| pdimitar wrote:
| Yep. I was kind of thinking out loud but was curious
| about your perspective. Thank you.
| steveklabnik wrote:
| Totally! That's how progress is made, thinking out loud
| :) Thank you too!
| pornel wrote:
| I like that Rust's String is not trying to be smart. It
| makes it predictable and easy to reason about.
|
| Because Rust has &str as the lowest common denominator
| for all string types, "non-standard" strings aren't
| difficult to use.
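|
| A small sketch of what that looks like in practice, assuming a
| made-up helper called word_count: because it takes &str, callers can
| pass a String, a Box<str>, or a Cow<str> without any conversion
| code, thanks to deref coercion.
|
```rust
use std::borrow::Cow;

/// Counts whitespace-separated words in any string slice.
pub fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Calls `word_count` with several different string types.
pub fn demo_counts() -> [usize; 4] {
    let literal: &str = "plain string slice";
    let owned: String = String::from("heap allocated string");
    let boxed: Box<str> = "boxed str".into();
    let cow: Cow<'_, str> = Cow::Borrowed("maybe owned, maybe borrowed");

    // Each `&owned`, `&boxed`, `&cow` coerces to `&str` automatically.
    [
        word_count(literal),
        word_count(&owned),
        word_count(&boxed),
        word_count(&cow),
    ]
}
```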
| pdimitar wrote:
| I agree that less smartness is a good thing, it's just
| that sometimes I get a bit frustrated even with basic
| usages of Rust's String and &str. It gets easier with
| time but the first few months were a big struggle.
| brundolf wrote:
| Going along with the other replies, I disagree that this
| would improve the experience for newcomers. SmallString
| is a hyper-optimization for specialized use-cases which
| (I would assume) is rarely worthwhile. I would actively
| discourage people from using SmallString without first
| profiling to identify Strings as their bottleneck,
| especially newcomers. String will almost always be "fast
| enough".
| pdimitar wrote:
| Agreed that it's a very likely unneeded optimization. As
| shared in another sibling comment, I got frustrated with
| Rust's strings and figured that maybe one more complexity
| on top wouldn't be that bad. But thinking of it now,
| yeah, it's definitely a bad idea.
| colejohnson66 wrote:
| > The "niche" in this case is the realization that
| strings are unlikely to be exabytes in size,...
|
| That's what we say now, but a few dozen years down the
| line? Old Mac OS versions (pre X) would store extra bits
| of information in the MSBs of a pointer. "32 bit clean"
| Mac OS had to be created to allow usage on computers with
| more than a few dozen megabytes. That was an all too
| common problem when multiple gigabytes of RAM became
| commonplace.
|
| Granted, I _highly_ doubt we'll see _exabyte_ level
| string lengths, but you never know.
|
| Fun fact: x86-64 mandates pointers be in "canonical" form
| where _all_ unused bits of the address are the same as
| the MSB. So on a processor that has 48 address lines, the
| upper 16 _must_ match the 48th bit. If you try to access
| memory with a "non canonical" pointer, the processor will
| throw an exception. This is because they don't want
| people using them for flag bits and having a repeat of
| the "unclean" software problem.
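|
| The canonical-form rule can be sketched as a check in Rust. This
| assumes the 48-bit case described above, and is_canonical_48 is a
| name invented for the example: the address is sign-extended from
| bit 47 and compared with the original value.
|
```rust
/// Returns true if `addr` is in canonical form for 48-bit virtual
/// addresses, i.e. bits 48..=63 are all copies of bit 47.
pub fn is_canonical_48(addr: u64) -> bool {
    // Shift the low 48 bits to the top, then shift back arithmetically
    // so that bit 47 is replicated into the upper 16 bits.
    let sign_extended = (((addr << 16) as i64) >> 16) as u64;
    sign_extended == addr
}
```
|
| Under that assumption, 0x0000_7FFF_FFFF_FFFF and
| 0xFFFF_8000_0000_0000 pass the check, while 0x0000_8000_0000_0000
| does not.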
|
| TL;DR: I agree, but never say never
| KMag wrote:
| And it's too bad that most kernels (including Linux and
| Windows) put userspace in the low half of the address
| space. In the upper half of a 48-bit address space (or
| even 51-bit), all valid pointers, if interpreted as
| IEEE-754 doubles, are NaN, so you get NaN boxing without
| performing any arithmetic. (Though, if you want NaN-boxed
| nullable pointers, you'll need to pick a special value
| for nullptr.)
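|
| A quick way to see this, as a sketch with an invented helper name:
| reinterpret the address bits as an f64 and ask whether the result is
| NaN. For upper-half canonical addresses the sign and exponent bits
| are all ones and the mantissa is non-zero, which is exactly the NaN
| encoding.
|
```rust
/// Reinterprets a 64-bit address as an IEEE-754 double and reports
/// whether that bit pattern is a NaN.
pub fn pointer_bits_are_nan(addr: u64) -> bool {
    f64::from_bits(addr).is_nan()
}
```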
| Dylan16807 wrote:
| There are two problems with using high pointer bits as
| flags.
|
| The small problem is that your program will be limited in
| memory space. That's usually okay. If you want a vast
| expansion of memory space you probably need to start
| using different algorithms anyway.
|
| The big problem is that your program depends on the CPU
| ignoring those bits, and now it can't run at all on
| updated hardware. As you noted, x86-64 solves this
| problem by forcing programs to clean up pointers before
| actually using them.
| lpapez wrote:
| I think we have quite some time before us until the need for
| these high bits becomes reality :)
|
| Now that you mention pointers, Apple is using unused bits
| in their values for signing and authentication purposes
| https://googleprojectzero.blogspot.com/2019/02/examining-
| poi...
|
| Pretty neat in my opinion
| brundolf wrote:
| There are some cases, like the Option<&T>::None case, where
| C/C++ do roughly the same thing just without any type
| guards. So in this case: pointers in C/C++ can simply be
| null, which is not expressed at a type level but is lumped
| in as a special case of the pointer value itself instead of
| being a separate bit. i.e. the optimization in this case is
| only necessary for Rust because it normally represents the
| enum variant as a _separate_ component of the value in
| addition to its contents. The optimization brings it back
| in line with C/C++ from a memory usage standpoint.
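|
| The effect is easy to observe with size_of. A minimal sketch (the
| function names are made up for the example): Option around a
| reference or a NonZero integer costs nothing extra, while Option
| around a plain integer needs room for the tag.
|
```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// `None` is stored as the null pointer, so no extra tag is needed.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u64>>() == size_of::<&u64>()
}

/// `None` is stored as the forbidden value 0.
pub fn option_nonzero_is_free() -> bool {
    size_of::<Option<NonZeroU32>>() == size_of::<u32>()
}

/// Every bit pattern of `u32` is valid, so the tag needs its own space.
pub fn option_plain_int_needs_tag() -> bool {
    size_of::<Option<u32>>() > size_of::<u32>()
}
```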
|
| However, because there are no type guards for this stuff in
| C/C++ you're generally constrained by
| convention/intuitiveness, when picking these special
| values, in order to (hopefully) guard against accidental
| misuse. So in the NonZero case above, you probably wouldn't
| do that because it's pretty unusual, and you'd therefore
| use more memory.
|
| And then, even when the "special values" _are_ chosen
| "reasonably", they sometimes get misused anyway and we end
| up with bugs like last year's trivial sudo exploit:
| https://bit-sentinel.com/to-sudo-or-not-to-sudo-
| demystifying...
|
| > Reading the manual for those functions we learn that -1
| is a special value
|
| > Supplying a value of -1 for either the real or effective
| user ID forces the system to leave that ID unchanged.
|
| > If one of the arguments equals -1, the corresponding
| value is not changed.
|
| > So, what happens when we send -1 as a parameter? Well,
| somewhere along the underlying code lines is a convention
| that -1 is represented as 4294967295, thus either values
| will get the wanted result.
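|
| The same reinterpretation can be reproduced in Rust as a sketch
| (reinterpret_as_unsigned is an invented name, and this is not the
| sudo code itself): casting a signed -1 to a 32-bit unsigned integer
| keeps the bit pattern and yields 4294967295.
|
```rust
/// Reinterprets a signed 32-bit ID as unsigned, keeping the bit pattern.
pub fn reinterpret_as_unsigned(id: i32) -> u32 {
    id as u32
}
```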
|
| Edit: In practice the above may be more common in C than
| modern C++
| steveklabnik wrote:
| I am a non-expert in the exact semantics of C and C++
| layout, but as far as I can recall off the top of my head,
| they do not, because the general semantics are "lay out the
| struct in the order as declared, add padding for alignment
| reasons, done."
|
| Whereas the semantics in Rust are "we can do whatever we
| want unless you add an explicit #[repr] attribute, in which
| case you choose the semantics of layout and we don't mess
| with it."
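|
| A minimal sketch of the difference, with made-up struct names: the
| same three fields are declared twice, once with #[repr(C)] and once
| with the default representation. On common targets where u32 is
| 4-byte aligned the repr(C) version is 12 bytes; the default one is
| typically smaller because the compiler may reorder fields, but its
| exact size is not guaranteed.
|
```rust
use std::mem::size_of;

/// Laid out in declaration order, with padding for alignment.
#[repr(C)]
pub struct DeclaredOrder {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Default representation: the compiler is free to reorder fields.
pub struct CompilerChosen {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Returns (size with repr(C), size with the default representation).
pub fn layout_sizes() -> (usize, usize) {
    (size_of::<DeclaredOrder>(), size_of::<CompilerChosen>())
}
```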
|
| It's not about the strength of the type system, it's about
| history.
| MaulingMonkey wrote:
| Theoretically you could roll your own such optimizations
| manually in a C++ codebase by abusing partial
| specialization. That is, you could write:
|     template <typename T>
|     class optional<std::vector<T>> {
|         // ...
|     };
|
| And write an implementation that (ab)uses knowledge of
| the exact layout of `std::vector<T>` to avoid an extra
| bool. This would of course be a breaking change to the
| ABI of optional<std::vector<T>> if implemented by
| std::optional, so compiler vendors tend to avoid such
| things.
|
| There is one stdlib example of this, but sadly it's more
| infamous than inspirational: std::vector<bool>. The
| standard relaxes several guarantees of std::vector<T> in
| the case of T == bool, to allow it to bit-pack the
| booleans instead of using sizeof(bool) - e.g. an entire
| byte - per element.
|
| Some of those relaxed guarantees make std::vector<T>
| unsuitable for interop with C style APIs taking pointers
| + lengths if T might be a bool. Worse still, many (most?)
| implementations don't even implement the space
| optimization those guarantees were relaxed for - a worst
| of both worlds situation. If you actually need such space
| optimizations, you'll reach for an alternative such as
| boost::dynamic_bitset which provides them consistently.
| steveklabnik wrote:
| This is a good point; given that it's automatic in Rust,
| I assumed that the parent was talking about it happening
| automatically, but you are right that you absolutely can
| do this manually, and it's good to have that nuance.
| After all, this PR required manual intervention to get
| the automatic parts to kick in!
| tjalfi wrote:
| That particular optimization is unlikely but there have
| been C compilers that optimized struct layout.
|
| 179.art, one of the SPEC2000 benchmarks, has some poorly
| laid out structs. Sun introduced targeted optimizations
| for this benchmark and several other vendors also did so.
| I have also read papers about profile-guided struct
| reordering but don't have a citation on hand.
|
| GCC also had a pass[0] for this optimization but it may
| have been removed.
|
| [0] https://www.research.ibm.com/haifa/dept/_svt/papers/g
| olovane...
| MereInterest wrote:
| Sort of. You're right on what the abstract C++ machine
| does, but C++ also has the "as if" rule. A compiler can
| make whatever changes it wants so long as all observable
| effects are _as if_ it had generated exactly the program
| as requested. This is the rule that allows for compiler
| optimizations to be done. Values stored in memory are not
| considered to be observable effects, so the compiler is
| allowed to make whatever changes it wants on that end.
|
| However, to the best of my understanding, most types are
| exposed to other compilation units, so it is hard to
| establish invariants about how they are used. Unless the
| compiler can verify that nowhere in the program ever
| takes the address of a boolean variable in a struct, it
| isn't allowed to rearrange the struct to avoid having
| that extra boolean. That might be possible at link-time
| for statically linked programs, but I'm not sure.
|
| Also, my knowledge is mostly from programming in C++ and
| watching CppCon talks, so I may be out of date on the
| latest optimization techniques.
| k__ wrote:
| I see, thanks.
|
| Pretty cool to see that Rust can do more optimizations
| than other languages.
|
| And it seems to play out well. I fondly remember Java
| being touted as being superior to C/C++ because it can do
| runtime optimizations, which will outperform C/C++.
| Somehow this never happened.
| volta83 wrote:
| I write a lot of code using the
|     #[rustc_layout_scalar_valid_range_start(x)]
|     #[rustc_layout_scalar_valid_range_end(y)]
|     struct MyInt(i32);
|
| to create integers with only valid values in range [x, y) that
| benefit from the niche optimizations.
|
| Like, I literally have a project with almost 100k LOC which is
| all built on top of this.
| brundolf wrote:
| Wow, I did not know about those!
|
| I'm surprised how little a Google search turns up for them;
| the best I found was a mention (but no description or
| examples) in docs.rs: https://docs.rs/rustc-ap-
| syntax_pos/634.0.0/rustc_ap_syntax_...
|
| Do these enforce against out-of-bounds values by panicking,
| the way arrays do? Or are they unsafe?
| tmzt wrote:
| It would be nice to have a #[range(a..b)] version of this
| that could be standardized.
| jleahy wrote:
| They're not only unsafe, but as far as I'm aware they're a
| compiler implementation detail that isn't really supposed
| to be used in user code (along with
| rustc_nonnull_optimization_guaranteed). It's actually how
| NonZero is implemented.
|
| See https://github.com/rust-
| lang/rust/blob/master/library/core/s...
| brundolf wrote:
| Ah, that explains things
| oblio wrote:
| The revenge of Pascal integer ranges! :-D
| vlovich123 wrote:
| Are there any plans to extend the ability so that user code can
| use niche values that aren't 0? Unless I failed at figuring out
| how to find the relevant docs, I only see NonZero formally
| documented but this release is clearly utilizing -1 instead (&
| I'm assuming it's not just doing +1/-1 math when accessing just
| to leverage NonZero).
| steveklabnik wrote:
| Off the top of my head, I don't think there are, not because of
| objections to the feature, but because this is a pretty niche
| feature (pun absolutely intended). Nobody has been motivated
| enough to do the design and consensus-building work.
| vlovich123 wrote:
| This came up a _ton_ when I was doing particle filters on
| mobile. We had optional doubles that were initialized to NaN
| but NaN isn't a valid value. Being able to use optionals
| instead would be great (to avoid having to backtrace where an
| accidental NaN propagation may have started from) but the
| memory hit was impractical. I bet some parts of the science
| community might need this but the need is so spread out (each
| individual application) that it might not be as obvious as a
| popular library identifying the need.
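|
| Until something like that exists, one workaround can be sketched by
| hand. This is not the compiler-level niche being asked for, and
| PackedOptF64 is a name invented here: it stores NaN as the "none"
| marker in a plain f64 but only exposes the value as Option<f64>, so
| a stray NaN can't be read back as a real number.
|
```rust
/// An optional f64 stored in 8 bytes, using NaN to mean "no value".
#[derive(Clone, Copy, Debug)]
pub struct PackedOptF64(f64);

impl PackedOptF64 {
    pub const NONE: Self = PackedOptF64(f64::NAN);

    /// Packs an optional value; a NaN input is treated as `None`.
    pub fn new(value: Option<f64>) -> Self {
        match value {
            Some(v) if !v.is_nan() => PackedOptF64(v),
            _ => Self::NONE,
        }
    }

    /// Unpacks into a regular `Option`, forcing callers to handle `None`.
    pub fn get(self) -> Option<f64> {
        if self.0.is_nan() {
            None
        } else {
            Some(self.0)
        }
    }
}
```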
|
| I wish I knew language design & compiler development better &
| had the time to propose it & see it through.
| conradludgate wrote:
| FiniteFloat or NotNaN would be nice additions to the stdlib
| jsheard wrote:
| NonMax types would be nice to have, since the natural "null"
| value for an index into some linear data structure is the
| highest one
|
| In the meantime it can be done by wrapping NonZero and
| storing !value, but that's a (tiny) runtime overhead on each
| get/set
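|
| A minimal sketch of that workaround, with NonMaxU32 as a made-up
| name: the wrapper stores the bitwise complement of the value in a
| NonZeroU32, so u32::MAX becomes the forbidden value and
| Option<NonMaxU32> stays the size of a u32.
|
```rust
use std::num::NonZeroU32;

/// A `u32` that can hold every value except `u32::MAX`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NonMaxU32(NonZeroU32);

impl NonMaxU32 {
    /// Returns `None` when `value` is `u32::MAX`.
    pub fn new(value: u32) -> Option<Self> {
        // `!value` is zero only when `value == u32::MAX`.
        NonZeroU32::new(!value).map(NonMaxU32)
    }

    /// Undoes the complement applied in `new`.
    pub fn get(self) -> u32 {
        !self.0.get()
    }
}
```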
| Waterluvian wrote:
| So all this const work doesn't break any existing usage, but
| promotes the functions to "we can know more about and do more
| with at compile time"?
| steveklabnik wrote:
| Correct.
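|
| As a sketch of what that means, using a made-up function rather
| than one of the stdlib functions from this release: marking a
| function const lets it be used in constants and array lengths,
| while ordinary runtime calls keep working unchanged.
|
```rust
/// Usable both at compile time and at runtime.
pub const fn square(x: u32) -> u32 {
    x * x
}

/// Evaluated by the compiler.
pub const SIDE_SQUARED: u32 = square(12);

/// A const fn can also appear in an array length.
pub static LOOKUP: [u8; square(4) as usize] = [0; square(4) as usize];

/// The same function called on a value known only at runtime.
pub fn square_at_runtime(input: &str) -> Option<u32> {
    input.trim().parse::<u32>().ok().map(square)
}
```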
| crazypython wrote:
| I'm flagging this because virtually anything about Rust gets on
| the front page of Hacker News, interesting or not.
| ibraheemdev wrote:
| Hacker news is a democracy in that the people decide what gets
| to the top. Just because you do not like the topic does not
| mean others share your sentiment.
| password321 wrote:
| Seems like anything to do with Rust these days.
___________________________________________________________________
(page generated 2021-02-11 23:01 UTC)