[HN Gopher] Rust Cookbook
___________________________________________________________________
Rust Cookbook
Author : smusamashah
Score : 217 points
Date : 2021-02-11 13:09 UTC (9 hours ago)
(HTM) web link (rust-lang-nursery.github.io)
(TXT) w3m dump (rust-lang-nursery.github.io)
| pixel_tracing wrote:
| Was hoping I could find something on changing a string in place
| (and efficiently). Why is this such a difficult thing to do in
| Rust?
|
| I am trying to build a parser combinator library in Rust.
| [deleted]
| tupshin wrote:
| Rust strings are UTF-8, and UTF-8 characters vary in byte
| length, so replacing a character sequence with another of the
| same character count isn't guaranteed to fit in the same number
| of bytes.
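The byte-length point can be checked directly with the standard library. This is a minimal sketch; `utf8_lengths` is a hypothetical helper name, not something from the thread.

```rust
/// Hypothetical helper: the UTF-8 encoded size, in bytes, of each char in `s`.
fn utf8_lengths(s: &str) -> Vec<usize> {
    // char::len_utf8 reports how many bytes a char occupies when encoded as UTF-8.
    s.chars().map(char::len_utf8).collect()
}
```

Two strings with the same number of characters can therefore have different `len()` values, since `str::len` counts bytes rather than characters.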
| jessegrosjean wrote:
| Rust uses a very similar API to other languages for this:
|
| https://doc.rust-lang.org/std/string/struct.String.html#meth...
|
| The problems you are likely running into are higher-level
| mutability issues. The above linked function requires &mut
| self, which means you need a mutable string. The above linked
| documentation has an example of using the `mut` keyword to
| declare a mutable string that you can call replace_range on.
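A minimal sketch of that pattern, assuming the goal is simply to overwrite part of a string. The function name `replace_first_word` is made up for illustration.

```rust
/// Hypothetical demo: overwrite the first word of a mutable String in place.
fn replace_first_word() -> String {
    // `mut` is required because replace_range takes `&mut self`.
    let mut s = String::from("hello world");
    // The range is given in bytes and must fall on char boundaries,
    // otherwise replace_range panics.
    s.replace_range(0..5, "HELLO");
    s
}
```

Without `mut` on the binding, the call to `replace_range` is rejected at compile time.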
| bluejekyll wrote:
| You can modify strings in place, but not their length, which
| makes what you're talking about complex if you need to support
| UTF-8.
|
| Here's the str method:
| https://doc.rust-lang.org/std/primitive.str.html#method.as_b...
|
| If you want a string that is mutable, can be extended, etc.,
| you want to use String, which has plenty of methods for
| mutation.
|
| Remember, Rust strings are all utf8, and part of the safety
| contract is that they're always valid.
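One safe standard-library method that mutates a `&mut str` without changing its length is `make_ascii_uppercase`. It is used here purely as an illustration and is not necessarily the method linked above; `shout` is a hypothetical name.

```rust
/// Hypothetical helper: upper-case the ASCII letters of `s` in place.
fn shout(s: &mut str) {
    // Only ASCII bytes are touched, so the byte length and UTF-8 validity
    // of the string are both preserved.
    s.make_ascii_uppercase();
}
```

Non-ASCII characters are left as they are, which is what lets this operate on a borrowed `str` without ever resizing it.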
| the8472 wrote:
| clear() + push_str() to reuse the storage to assemble a new
| string. replace_range() to splice something in the middle. This
| may require the tail to be moved around when lengths don't
| match.
|
| Note that Strings are stored as utf8, so a substring with the
| same number of characters can still result in a different
| number of bytes and thus require memmoves.
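Both techniques can be sketched as follows. `refill` and `splice` are hypothetical helper names chosen for this example.

```rust
/// Hypothetical helper: overwrite `buf` with `text`, keeping its allocation.
fn refill(buf: &mut String, text: &str) {
    buf.clear(); // length becomes 0, capacity is kept
    buf.push_str(text); // reallocates only if `text` exceeds the capacity
}

/// Hypothetical helper: replace the byte range `start..end` with `text`.
fn splice(s: &mut String, start: usize, end: usize, text: &str) {
    // If `text` is longer or shorter than the range being replaced,
    // the bytes after `end` have to be shifted to make it fit.
    s.replace_range(start..end, text);
}
```

As long as the new contents fit in the existing capacity, `refill` does not allocate.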
| protoman3000 wrote:
| I like this book a lot, but the writers use an external package
| to handle errors.
|
| Why don't you just return a Result<T, Box<dyn Error>> instead?
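For reference, a minimal sketch of what returning `Result<T, Box<dyn Error>>` looks like using only the standard library. The function `parse_port` and its rules are invented for this example.

```rust
use std::error::Error;

/// Hypothetical example: parse a non-zero TCP port from text.
fn parse_port(text: &str) -> Result<u16, Box<dyn Error>> {
    // `?` converts the ParseIntError into Box<dyn Error> automatically.
    let port: u16 = text.trim().parse()?;
    if port == 0 {
        // A plain &str can also be turned into a boxed error via `into()`.
        return Err("port must be non-zero".into());
    }
    Ok(port)
}
```

The caller gets a readable message through `to_string()`, but has to downcast to find out which kind of error actually occurred.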
| volta83 wrote:
| > Why don't you just return a Result<T, Box<dyn Error>>
| instead?
|
| Because this book is old and outdated.
| swsieber wrote:
| Disclaimer: take this with a grain of salt, as I do use Box<dyn
| Error>. But I have a couple of things to note.
|
| 1. error-chain is not the recommended way, despite being in the
| book. It needs updating!
|
| 2. The Error trait in std has come a long way since the
| book was originally written.
|
| 3. You still might want some niceties that Box<dyn Error>
| doesn't provide you, and so you'd define an enum. You can do it
| manually, or use a library like anyhow or thiserror depending
| on whether you're writing an app or a library.
|
| 4. Rust error handling has been in a lot of flux, but generally
| improving. It probably does deserve a dedicated section in the
| book.
|
| I recommend this blog post:
| https://nick.groenen.me/posts/rust-error-handling/
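Point 3 above, defining the enum manually, might look roughly like this with only the standard library. `ConfigError` and `read_port` are hypothetical names used for illustration.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type with one variant per failure mode.
#[derive(Debug)]
enum ConfigError {
    MissingKey(String),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing key `{}`", key),
            ConfigError::BadNumber(err) => write!(f, "invalid number: {}", err),
        }
    }
}

impl Error for ConfigError {
    // Expose the underlying cause so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::MissingKey(_) => None,
            ConfigError::BadNumber(err) => Some(err),
        }
    }
}

// This conversion is what lets `?` turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::BadNumber(err)
    }
}

/// Hypothetical parser for a line of the form `port=<number>`.
fn read_port(line: &str) -> Result<u16, ConfigError> {
    let value = line
        .strip_prefix("port=")
        .ok_or_else(|| ConfigError::MissingKey("port".to_string()))?;
    Ok(value.trim().parse::<u16>()?)
}
```

Unlike a boxed error, callers can `match` on the variants. The boilerplate `impl` blocks are the part that a derive-based crate would generate.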
| gsserge wrote:
| > I recommend this blog post:
| https://nick.groenen.me/posts/rust-error-handling/
|
| This blog post is good indeed. The discussion on /r/rust
| linked at the end of the post is also worth reading.
|
| Another good and light introduction is
| http://www.sheshbabu.com/posts/rust-error-handling/
|
| And a more in-depth look:
| https://blog.burntsushi.net/rust-error-handling/
| michael_j_ward wrote:
| Just adding this video [0] that both gives a mental framework
| for designing errors and a "lay-of-the-land" for current
| libraries. Jane, the speaker, also heads up Rust's error
| handling project group [1], and gives her thoughts on future
| design / work.
|
| [0] https://www.youtube.com/watch?v=rAF8mLI0naQ
|
| [1] https://blog.rust-lang.org/inside-rust/2020/09/18/error-hand...
| gsserge wrote:
| One thing that is useful to keep in mind when considering boxed
| errors or error-like objects, e.g. anyhow::Error, is that they
| famously do not implement the std::error::Error trait [1].
|
| This could be a problem if you want to use a third-party API
| where generic functions or structs use std::error::Error in
| their trait bounds. For example, fn use_error<E:
| std::error::Error>(e: E) { ... } will not compile with a boxed
| error as the parameter.
|
| [1] https://users.rust-lang.org/t/shouldnt-box-t-where-t-error-i...
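One possible workaround, sketched under the assumption that you control the call site, is a newtype that wraps the boxed error and implements the trait itself. `Wrapped` is a hypothetical name, and `use_error` mirrors the signature quoted above with a return value added so it can be observed.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for a third-party generic function bounded on std::error::Error.
fn use_error<E: Error>(e: E) -> String {
    e.to_string()
}

/// Hypothetical newtype that makes a boxed error usable where `E: Error`
/// is required.
#[derive(Debug)]
struct Wrapped(Box<dyn Error + Send + Sync>);

impl fmt::Display for Wrapped {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate formatting to the inner error.
        fmt::Display::fmt(&self.0, f)
    }
}

impl Error for Wrapped {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Forward to the inner error's own source, if it has one.
        self.0.source()
    }
}
```

Passing the box directly, as in `use_error(boxed)`, is rejected because the box itself does not satisfy the `Error` bound; `use_error(Wrapped(boxed))` compiles.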
| steveklabnik wrote:
| This actually goes to the root of why this project exists.
|
| There is a pretty deep tension in the way that Rust deals with
| stability. That is, the project itself takes stability very
| seriously, and as such, has to minimize, to some degree, what
| things we accept as official, because we need to be able to
| maintain them. So what tends to happen is, stuff happens in the
| ecosystem, and then, if it makes sense, moves slowly into the
| project itself. Furthermore, because being blessed by the
| project commands some weight, if we refer to something, it'll
| get used. But if we want the ecosystem to explore the design
| space and develop solutions, endorsing something in the midst
| of that process could cut it short.
|
| This also means that the documentation, generally, doesn't
| refer to any external projects. We need stuff to be solid and
| foundational, but some random package can change at whatever
| pace they'd like. We also don't want to interfere with the
| community development process by picking winners.
|
| But you, as a Rust developer, programming something, need to
| know how to get stuff done. And that very often will mean
| interacting with the ecosystem. That's a good thing! But that
| comes in tension with the docs provided by the project. Because
| you won't learn about serde from reading the book.
|
| ---------------
|
| There is a second issue here, which is the _kind_ of
| documentation provided by the project. We have good beginner
| documentation. We have good expert documentation. But how do
| you get from point A to point B? It's a bit less clear.
| Additionally, our beginner documentation _tends_ to be focused
| on _doing_, and our expert documentation tends to be focused
| on more abstract theory or implementation details.
|
| Where we are weakest is "okay I read the book... now what?"
|
| ---------------
|
| One method of resolving these tensions is the Cookbook. The
| idea was to provide a resource based around helping
| intermediate Rust developers solve specific tasks. The usual
| guidelines around referring to packages are relaxed a bit. It
| wouldn't be pointing to the absolute latest fancy stuff, but it
| would be allowed to refer to packages that have mostly settled
| out. If you're doing serialization, you may not be using serde,
| but you probably are, and nobody is going to be surprised if you
| do. So on some level, we want to provide a resource to folks
| that are saying "Hey I need to deserialize this JSON, what
| should I do?" and be able to say "Use serde."
|
| This may mean that the cookbook would need changing if the
| ecosystem shifts significantly. There's an art to doing this
| enough to be useful, but not enough to cause constant churn.
|
| ----------------
|
| Unfortunately, a lot of folks put a lot of effort into the
| cookbook, but then moved on to other things. It's needed some
| revitalization for a while, and never _quite_ made it over the
| hump from "in-progress work" to "thing we're willing to
| publish and promote." It got close! It's still really good in
| many places! It's also out of date in places.
|
| ----------------
|
| Anyway, that's a really long way of saying "because real-world
| programs don't tend to use Result<T, Box<dyn Error>> except in
| the exploratory phase."
| mwww2 wrote:
| Thank you for this resource!
| atbpaca wrote:
| I love Rust, I just wish it could get rid of semicolons. Quite a
| lot of discussions around this.
| paavohtl wrote:
| As far as I know, a (C-like) language must either have:
|
| - A statement separator (semicolons)
|
| - Significant whitespace
|
| Otherwise parsing the language is very likely ambiguous, e.g.
| JavaScript [1] [2].
|
| [1] https://flaviocopes.com/javascript-automatic-semicolon-inser...
|
| [2] https://medium.com/better-programming/you-might-need-those-s...
| jolux wrote:
| Swift and Go don't have either. Neither does Kotlin. The
| difference is that Rust is an expression language (much like
| ML), and the semicolon determines whether the resulting
| expression is returned or thrown away. OCaml needs semicolons
| when you're doing imperative programming too.
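The role of the semicolon described here can be seen in a few lines. This is only a sketch; `double`, `with_value` and `without_value` are hypothetical names.

```rust
fn double(n: i32) -> i32 {
    n * 2
}

/// No trailing semicolon: the value of `double(21)` becomes the value
/// of the function body.
fn with_value() -> i32 {
    double(21)
}

/// Trailing semicolon: the result is computed and thrown away,
/// so the body evaluates to the unit type `()`.
fn without_value() {
    double(21);
}
```

Adding a semicolon to the body of `with_value` would make it fail to compile, because the body would then have type `()` instead of `i32`.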
| johncolanduoni wrote:
| Swift, Go, and Kotlin all take newlines into account. I
| guess one could have a definition of "significant
| whitespace" that excludes that but in any case I strongly
| suspect that's what they were referring to.
| jolux wrote:
| Yes, but I believe they support semicolons as a statement
| separator on the same line.
| hu3 wrote:
| Go doesn't have either: https://play.golang.org/
| paavohtl wrote:
| Ah, but it does. Statements are separated by newlines. That
| is the same as significant whitespace (or a statement
| separator; you can interpret it either way).
| hu3 wrote:
| Technically correct.
|
| But the point is about ergonomics. Go requires neither
| semicolons nor indentation. That, coupled with the gofmt
| auto-formatter, makes coding more pleasant.
| duckerude wrote:
| JavaScript messed it up, but Python did it well. A newline is
| a statement separator, unless it's e.g. between parentheses
| where a statement separator wouldn't make sense. Unlike
| JavaScript, that means you don't have to look at the next
| line to figure out whether the statement continues.
|
| JavaScript assumes a newline doesn't end a statement, until
| proven otherwise. Python assumes it does end a statement
| unless it very clearly can't.
|
| It's independent from Python's other significant whitespace,
| so you could use it in a more "C-like" syntax.
|
| But it wouldn't be a good fit for Rust. In Rust, leaving the
| semicolon off the final statement in a block already changes
| its meaning, so that would become ambiguous. It also uses
| method chaining syntax a great deal more than Python, which
| is a bit awkward because you have to wrap the expression in
| parentheses (though autoformatting helps). And of course it
| would break most existing code.
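To illustrate the method chaining point: in current Rust a chain can be spread over many lines with no wrapping parentheses, because a newline by itself never ends the expression. A minimal sketch with a hypothetical function name:

```rust
/// Hypothetical example: squares of the even numbers from 1 to `limit`.
fn even_squares(limit: u32) -> Vec<u32> {
    // Each `.method()` starts a new line, yet this is one expression.
    // It is also the tail expression of the block, so it has no semicolon.
    (1..=limit)
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .collect()
}
```

Under a newline-terminates-statement rule, the first line would already be a complete expression, which is the awkwardness described above.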
| kthxb wrote:
| of all the complaints to have...
| Tade0 wrote:
| Any such move will eventually result in automatic semicolon
| insertion in some shape or form.
|
| What's wrong with semicolons anyway?
| matesz wrote:
| Nothing wrong with semicolons. IMO they make code look a
| little bit less readable - you need to look at a semicolon
| when you could be just looking at the empty line. That's one
| of the reasons people like Python.
| colejohnson66 wrote:
| This is a valid point. Without the context, the "everything
| is an expression" design forces one to look for a line
| _without_ a semicolon to see the expression result. In a
| 5-line function, that's easy, but in a nested match, it can
| be hard.
| unanswered wrote:
| It can only possibly be at the end, so it's not really
| hard?
| deetsb wrote:
| I wasn't a fan at first (coming from Python), but I really like
| that it allows you to make expressions into statements (I'm
| sure there's more going on than just that, but it was pretty
| cool to see that if/match are generally expressions but if I
| don't want them to be, ";" can change that!).
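A small sketch of both uses, with made-up function names: `match` used for its value, and an `if`/`else` whose value is discarded by a trailing `;`.

```rust
/// Hypothetical helper: increments the counter and returns the new value.
fn bump(counter: &mut i32) -> i32 {
    *counter += 1;
    *counter
}

/// `match` as an expression: its value is the function's return value.
fn as_expression(n: i32) -> &'static str {
    match n {
        0 => "zero",
        _ => "nonzero",
    }
}

/// `if`/`else` as a statement: both arms produce an i32, and the trailing
/// `;` discards it, so only the side effect of `bump` remains.
fn as_statement(flag: bool, counter: &mut i32) {
    if flag { bump(counter) } else { 0 };
}
```

Dropping the `;` in `as_statement` would turn the `if` into the tail expression of a function that is declared to return `()`, which is a type error.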
| colejohnson66 wrote:
| I'd like that too, but: if you get rid of semicolons, how do
| you avoid something like JavaScript's automatic semicolon
| insertion mess? For example:
|
|     return
|         x;
|
| turns into:
|
|     return;
|     x;
|
|
| which is clearly not what was intended. Sure, if Rust copied
| JavaScript's algorithm, the compiler would throw a type error
| (where () is not compatible with typeof(x)) and complain of
| dead code instead of silently failing, but it seems like a hard
| task. Then again, I'm not a compiler writer.
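For contrast, Rust as it exists today does not treat the newline after `return` as a terminator, so the split form still returns `x`. A minimal sketch (the function name `answer` is made up):

```rust
/// Hypothetical example: the expression after `return` sits on the next line.
fn answer() -> i32 {
    let x = 42;
    // The statement only ends at the `;`, so this parses as `return x;`.
    return
        x;
}
```

This works only because the semicolon, not the line break, marks the end of the statement.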
| bcoates wrote:
| Python's solution to this is that a randomly indented block
| is an error.
|
| Many languages restrict statement-expressions to function
| calls and other expressions with explicit side-effects.
|
| Another choice would be to avoid syntax where you can
| optionally append expressions to an otherwise complete
| statement (so only one of "return" and "return _expr_" is
| allowed, and the other has an alternate spelling).
|
| Or the parser could forbid line continuation in the absence
| of a trailing binary operator, which makes mid-statement
| newlines illegal outside of a highly visible special case so
| users would be trained out of inserting them at random before
| they introduced errors.
|
| Both JavaScript and Python do something like the last one but
| both screw it up; JavaScript also allows leading operators
| and open-brackets to continue the previous line, which is a
| huge mess, and Python uses brackets to continue lines but
| parens are both optional around most expressions and
| overloaded (generators, tuples). If you wanted to go that way
| you would need a bracket that meant "this is syntactically a
| single expression and has no other meaning".
| zozbot234 wrote:
| The easiest solution might be to simply allow \ as a
| line-continuation operator. So you'd write:
|
|     return \
|         x;
| colejohnson66 wrote:
| But then you've just swapped one set of trade-offs for
| another. It seems like a weak solution. IIRC semicolons
| were added to the languages of old to avoid the need for
| line continuation marks.
| kungito wrote:
| What a coincidence. I've just used it 3 times in the past 2 hours
| to find code for some uuid and statistical cases. I have to
| admit, it's amazing for a quick lookup of common use cases when
| you don't want to think too much or remember things.
| carols10cents wrote:
| Note: The Rust Cookbook has existed for 4 years now, but it's
| still unofficial and somewhat out of date because we don't have
| enough people maintaining it.
| liquidify wrote:
| If this is already out of date, what will prevent it from
| complete rot over time?
| steveklabnik wrote:
| More people interested in contributing to bring it back up to
| date.
___________________________________________________________________
(page generated 2021-02-11 23:02 UTC)