[HN Gopher] Writing Pythonic Rust
___________________________________________________________________
Writing Pythonic Rust
Author : zbentley
Score : 151 points
Date : 2021-05-24 17:18 UTC (5 hours ago)
(HTM) web link (www.cmyr.net)
(TXT) w3m dump (www.cmyr.net)
| xvedejas wrote:
| What is the idiomatic way to mimic inheritance in Rust, when
| writing bindings for OO languages? I'm new to Rust but I imagine
| there are ways to mimic upcasting/downcasting its types?
| bluejekyll wrote:
| I actually end up generally only using polymorphism in Rust and
| have never needed inheritance. For polymorphism, as other
| comments have mentioned, you should look at trait objects
| (`&dyn T`) or generic parameters for monomorphization.
|
| If absolutely necessary, Objects can be downcast by using Any
| (though I've personally not needed to write Rust code that
| needs this): https://doc.rust-lang.org/std/any/trait.Any.html.
| Edit: instead, I tend to rely on Rust enums (for their algebraic
| type features).
|
| Also, in some cases you may end up wanting to implement Deref
| or DerefMut, https://doc.rust-lang.org/std/ops/trait.Deref.html,
| but this shouldn't be used to create inheritance; it's more used
| for getting references from one type to another similar type
| (like String derefs to &str for example).
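A minimal sketch of the two tools named in the comment above: a trait object for polymorphism, and `Any` for the rare case where a downcast is needed. The `Shape`, `Circle`, and `Square` names and the `as_any` helper method are hypothetical, chosen only for illustration.

```rust
use std::any::Any;

// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
    // Exposes the concrete value as `Any` so callers can attempt a downcast.
    fn as_any(&self) -> &dyn Any;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Polymorphism through trait objects: no inheritance involved.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Downcasting: yields the side length only if the shape really is a `Square`.
fn square_side(shape: &dyn Shape) -> Option<f64> {
    shape.as_any().downcast_ref::<Square>().map(|sq| sq.side)
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("total area: {}", total_area(&shapes));
    println!("second shape as square: {:?}", square_side(&*shapes[1]));
}
```

When the set of variants is closed, an enum with a `match` usually replaces the downcast entirely, which is the alternative the comment says it prefers.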
| rav wrote:
| You can take a look at how the Gtk bindings handle the
| inheritance hierarchy in Gtk objects:
| https://gtk-rs.org/docs-src/tutorial/object_oriented
| andrewzah wrote:
| With the 'as' keyword you can cast between types [0]. Instead
| of inheritance, you can implement Traits for objects [1], which
| works more like composition.
|
| [0]: https://doc.rust-lang.org/reference/expressions/operator-
| exp...
|
| [1]: https://doc.rust-lang.org/rust-by-example/trait.html
| ChrisSD wrote:
| Note that `from` and `try_from` are generally preferred to
| using `as`, where possible. This also allows casting between
| non-primitive types.
|
| https://doc.rust-
| lang.org/std/convert/trait.TryFrom.html#exa...
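To illustrate why `from`/`try_from` tend to be preferred over `as`, here is a small sketch. The function names `lossy` and `checked` are made up for this example; the conversions themselves are standard library behaviour.

```rust
use std::convert::TryFrom;

// `as` never fails: a value that does not fit is silently truncated.
fn lossy(n: i32) -> u8 {
    n as u8
}

// `TryFrom` reports the failure, so the caller has to handle it.
fn checked(n: i32) -> Option<u8> {
    u8::try_from(n).ok()
}

fn main() {
    println!("{}", lossy(300)); // prints 44 (300 modulo 256)
    println!("{:?}", checked(300)); // prints None
    println!("{:?}", checked(200)); // prints Some(200)
    // `From` covers conversions that can never fail, such as widening.
    println!("{}", i64::from(7_i32));
}
```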
| xvedejas wrote:
| How expensive is this kind of casting in Rust? Like, if I
| have a struct which has a superset of attributes that I
| need for the type I'm casting to, do I need to construct a
| new struct instance just to have something to treat as my
| target type, or is there a simpler way that doesn't involve
| new allocation?
| msbarnett wrote:
| You have options.
|
| Given Superset struct A, some other struct B that happens
| to have a subset of A's fields, and some function foo
| which takes an argument of type B, you could
|
| 1) implement From<A> for B, and use that to convert
| instances of A when you want to call foo. This would
| probably involve some copying and/or allocation.
|
| 2) back up and turn foo into something that takes a trait
| argument instead. Now instead of taking instances of B,
| what foo needs a struct to provide is defined by a trait,
| which you can provide implementations of for both A and
| B. Now you can pass either to foo.
|
| 3) potentially, depending on the layout of structs,
| create a union and do some type punning to convert
| between them. this requires unsafe and you'd better be
| right about the struct layouts.
|
| 4) just make foo take a tagged union of the two structs.
| this isn't unsafe but has different storage tradeoffs.
| also foo would need to handle each case separately.
|
| 5) find some other way to skin this cat. there's plenty
|
| #2 would be idiomatic in most circumstances I think?
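Option 1 from the list above could look roughly like this. The field names are invented for the example; only `A`, `B`, and `foo` come from the comment. Because the conversion consumes the `A`, the `String` is moved rather than copied, so this particular version does no new heap allocation.

```rust
// Hypothetical fields: `A` has a superset of `B`'s fields, as in the comment.
#[allow(dead_code)]
struct A {
    name: String,
    age: u32,
    email: String,
}

struct B {
    name: String,
    age: u32,
}

// Option 1: consume an `A` and keep only what `B` needs.
// Moving `name` hands over the existing heap buffer instead of copying it.
impl From<A> for B {
    fn from(a: A) -> Self {
        B { name: a.name, age: a.age }
    }
}

fn foo(b: B) -> String {
    format!("{} ({})", b.name, b.age)
}

fn main() {
    let a = A {
        name: "Ada".to_string(),
        age: 36,
        email: "ada@example.com".to_string(),
    };
    // `.into()` uses the `From<A> for B` impl above.
    println!("{}", foo(a.into()));
}
```

If the caller still needs the `A` afterwards, the conversion would have to clone the fields instead, which is where the copying cost mentioned in the comment comes in.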
| rackjack wrote:
| Yeah, I would do (2) if possible.
|
| ```
|
| struct Person { name: String, }
|
| struct Engineer { name: String, boss: String, }
|
| trait Name { fn get_name(&self) -> &str; }
|
| impl Name for Person { fn get_name(&self) -> &str { &self.name } }
|
| impl Name for Engineer { fn get_name(&self) -> &str { &self.name } }
|
| ```
| herbstein wrote:
| > How expensive is this kind of casting in Rust?
|
| It really depends on how you're implementing it: whether
| you're doing a clone/deepcopy-style conversion, or whether
| you're consuming the value you're casting from. However,
| since most values in Rust are placed on the stack,
| "allocating" them is incredibly cheap. Moving them around
| is often also cheap, and involves just a few `mov`s from
| registers to registers, or stack-address to stack-
| address.
|
| A very simple example: https://godbolt.org/z/WMePqMd71
| phantom_oracle wrote:
| I checked this thread and the first thing I searched for was nim-
| lang[1]. Looks like nobody mentioned it yet.
|
| Nim/Nim-lang makes it much easier to write Pythonic code. Minus
| all the complex things that only smart(er) programmers get (like
| GC, etc.), with Nim you can write code that is much more
| readable.
|
| Another win for it is that it has very limited support for OOP,
| which IMHO sucks in terms of readability and understanding what
| the code is doing (as compared to imperative code).
|
| [1]https://nim-lang.org/
| aazaa wrote:
| The example code is:
|
|     font = Font.open("MyFont.ufo")
|     glyphA = font.layers.defaultLayer["A"]
|     point = glyphA.contours[0].points[0]
|     point.x = 404;
|     assert point.x == glyphA.contours[0].points[0].x
|
| The author apparently wants to be able to perform ad hoc
| modifications to an existing font.
|
| Unfortunately, it's not clear from the article what the use case
| is. Most of the time I've worked with fonts, they're treated as
| immutable values.
|
| If the idea is that a font needs to be built from a serialization
| format, then an alternative approach would be to eliminate the
| python mutable interface and replace it with a build function
| that calls the underlying Rust to build a Font. That way Python
| never needs to deal with mutating a font.
|
| Rust lets you take unlimited numbers of references to an
| immutable value. That's not a problem, other than defining
| lifetimes if you pass those references around and/or hold them in
| structs.
|
| The problem arises the instant you want to mutate the value
| behind a reference. Hold just one active immutable reference at
| the same time, and that won't be possible. Note that mutable and
| immutable references can in some situations exist in the same
| block thanks to non-lexical lifetimes.
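The remark about non-lexical lifetimes can be sketched as follows. The function name `bump_first` and the values are arbitrary; the point is only that a shared borrow ends at its last use, not at the end of the block.

```rust
// With non-lexical lifetimes, the shared borrow `first` ends at its last use,
// so the later mutation is accepted even though both live in the same block.
fn bump_first(points: &mut Vec<i32>) -> i32 {
    let first = &points[0]; // shared borrow starts
    let seen = *first; // last use of `first`: the borrow ends here
    points[0] = seen + 1; // mutation is allowed from this point on
    points[0]
}

fn main() {
    let mut points = vec![404, 10];
    println!("{}", bump_first(&mut points));
    // Reading `first` again after the assignment would keep the shared
    // borrow alive across the mutation, and the compiler would reject it.
}
```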
| cmyr wrote:
| There are a lot of existing python scripts that run on top of
| the existing python `ufoLib2` library, and the initial
| motivation for this work was seeing what would be involved in
| creating a wrapper API that would allow all of these existing
| scripts to work transparently on top of an existing rust crate.
| alkonaut wrote:
| Exactly. If I do glyphA.contours[0].points[0].x = 1 what
| happens to someone else holding a reference to the same glyph?
| To a python programmer it might not be a surprise to see such
| an API (a shared mutable value), but to most OO programmers I
| hope this API looks very strange.
| dexwiz wrote:
| The conclusion in the article that using a getter/setter
| would fix most of the issues seems much more
| straightforward. Maybe I don't write enough python, but I rarely
| think modifying properties on a library object is preferable
| over clearly named methods.
| alkonaut wrote:
| _How_ the object is mutated isn't really relevant to the
| smell. It's that an object can be mutated while someone
| else has a reference to it (and that the API is designed
| that way).
| mjw1007 wrote:
| I think the use case for modifications is letting people write
| custom extensions to a font editor.
| adamnemecek wrote:
| You probably don't want to do that.
| batterylow wrote:
| Coming from Python, I was interested in using Rust for data
| analysis within Jupyter notebooks... Evcxr made that possible,
| and I've self-published a book on the topic
| https://datacrayon.com/shop/product/data-analysis-with-rust-...
| waffletower wrote:
| The world needs fewer python idioms rather than more. Python is a
| sprawling language with an identity crisis stemming from its
| orthogonal object oriented and functional features. Why impose
| this mindset on Rust? Perhaps python, or at the very minimum the
| python developer, has much to gain from other languages instead.
| cmyr wrote:
| This question has been answered at least twice in comments in
| this thread.
| lalaithion wrote:
| > python expects you to share references to things. When you
| create a binding to a point in a contour, that binding refers to
| the same data as the original point, and modifying the binding
| modifies the collection.
|
| Honestly, as someone who has written way more Python than Rust,
| this seems like bad API design regardless of language. This
| screams "it's impossible to take a font, make two separate
| modifications to it, and then work with those separate
| modifications at the same time", because deepcopying objects is
| usually very difficult.
| Orou wrote:
| Naive question: in Python, isn't this solved by .copy and
| .deepcopy if you really _have_ to get a copy of an object
| rather than create a new instance of something? I'm curious
| where the bad API design is, unless you're saying that Python's
| assignment behavior itself is bad and everything should be
| immutable by default.
| Blikkentrekker wrote:
| Why would deep copying be difficult?
|
| Even in _Rust_, the `Clone` trait produces a deep copy.
|
| Copying only up to a specific depth in a general way is what is
| difficult.
|
| I would argue that shallow copies, _id est_ the depth being 1,
| are far more difficult to realize than copies of unbounded
| depth until a value of a `Copy` type is reached, _id est_ a type
| that is purely encoded by the data it takes up on the stack,
| owns no further data, and can thus be fully `Clone`d by
| copying its stack bits.
| kzrdude wrote:
| > Even in Rust, the `Clone` trait produces a deep copy.
|
| Clone is not really a deep copy. I like the description that
| says it is "deep enough to give you an object of
| identical/independent ownership (of what you started with),
| but no deeper".
|
| Example: Rc<String> when cloned only gives you another handle
| to the Rc, the string data is not duplicated (for this reason
| we don't call it a real deep copy). You get a new rc handle
| that is on equal footing with the old handle.
|
| There are plenty of Rust types that consist of tree-shaped
| ownership relations with no frills - in these cases clone and
| deep copy are identical. Let's say for example a HashMap<i32,
| String>.
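The two cases described above can be checked directly. This is a small sketch with made-up function names; `Rc::ptr_eq` and `Rc::strong_count` are standard library functions.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Cloning an `Rc<String>` yields a second handle to the same string data.
// Returns (do both handles point at the same allocation?, handle count).
fn rc_clone_shares() -> (bool, usize) {
    let a = Rc::new(String::from("glyph"));
    let b = Rc::clone(&a);
    (Rc::ptr_eq(&a, &b), Rc::strong_count(&a))
}

// Cloning a `HashMap<i32, String>` duplicates every key and value, so a
// later change to the original is not visible through the copy.
fn map_clone_is_independent() -> (String, String) {
    let mut original: HashMap<i32, String> = HashMap::new();
    original.insert(1, String::from("one"));
    let copy = original.clone();
    original.insert(1, String::from("uno"));
    (original[&1].clone(), copy[&1].clone())
}

fn main() {
    println!("{:?}", rc_clone_shares());
    println!("{:?}", map_clone_is_independent());
}
```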
| oconnor663 wrote:
| I think every "deep" copy ultimately has some limitation
| like this. Even in Python, an object can implement
| __deepcopy__() to prevent copy.deepcopy() from "copying too
| much", and things like file objects, classes, and modules
| get special treatment.
| kzrdude wrote:
| I found that in Rust, you don't have to think about the
| distinction so much - not like you have to in for example
| Python and C# (pervasive/transparent object reference
| languages). In Rust it's predictable and mostly obvious
| which data is shared or has shared ownership, and which is
| not. We don't need to talk about deep vs shallow copy
| semantics much in Rust. :)
| wk_end wrote:
| It's not _intrinsically difficult_, but in Python it's
| arguably un-Pythonic and somewhat awkward. Like, sharing
| references has built-in syntax that's taught to most Python
| programmers within their first hour with the language,
| whereas copying - without writing a tedious manual
| implementation - requires importing the `copy` module [0],
| which outlines a number of pitfalls, quirks, and caveats with
| the process, and that I've literally never used so far as I
| can recall in a decade of professional Python development.
|
| [0] https://docs.python.org/3/library/copy.html
| adrianN wrote:
| You have to pass around the whole object you want to copy for
| that to work. For example in a compiler, if your function only
| sees an AstNode, it can't make a deep copy of the whole AST
| easily. You also have to be careful with things like objects
| that contain refcounted pointers.
| xapata wrote:
| Passing pointers around is a pretty common pattern. It avoids
| allocating memory and copying data unnecessarily. Seemed like a
| decent choice in the late 80's and early 90's, probably all the
| way up through mid-2000's. It's still fine as the default for
| many situations.
| TheCoelacanth wrote:
| > It avoids allocating memory and copying data unnecessarily
|
| My experience is the exact opposite. Pervasive shared
| mutability leads to developers making lots of unnecessary
| copies out of fear that some other part of the code will
| change values out from under them.
| skrtskrt wrote:
| Maybe it's just the kind of work I have done (web services,
| where objects don't live longer than a request), but I have
| never had a problem of "some other part of the code
| changing values out from under me" in Python or Go.
| sgt wrote:
| Speaking of.. is there an article about writing Pythonic
| _python_? I think that would be extremely useful to various team
| members. Sometimes the official documentation is too verbose.
| kzrdude wrote:
| Raymond Hettinger's talks are great. I know it's not a succinct
| blog post, but they are out there and watching a video is very
| approachable.
|
| Maybe https://www.youtube.com/watch?v=T-TwcmT6Rcw (Dataclasses!
| We could cheekily say Python gets better at something Rust does
| - dataclasses makes Python better at records.)
|
| And https://www.youtube.com/watch?v=S_ipdVNSFlo
|
| This talk (2013): https://www.youtube.com/watch?v=OSGv2VnC0go
| Unfortunately this video is now a bit out of touch with modern
| Python.
|
| Another (2013) classic:
| https://www.youtube.com/watch?v=HTLu2DFOdTg it is very well
| known, and it has the very memorable advice: what's a class
| that only has one method? That should be a function!
| dec0dedab0de wrote:
| Another by Raymond Hettinger that is fantastic.
|
| Beyond PEP 8 -- Best practices for beautiful intelligible
| code - PyCon 2015
|
| https://www.youtube.com/watch?v=wf-BqAjZb8M
| kzrdude wrote:
| This is the best one :)
| whalesalad wrote:
| It's a journey, to be honest. Very hard to prescribe one
| solution because there are different ways people think about
| this question (syntax or architecture?) and different cultures
| around tools (like basing a web app on a heavier tool such as
| Django vs a lighter tool like Flask) and then of course battles
| over line lengths, types vs no types, etc...
|
| (I also divide Python into two big camps: building
| software/apps vs. data science and analysis, which further
| subdivides the community - if you ever read a post on how to do
| XYZ from the perspective of a data hacker it will usually fall
| into the non-pythonic category)
|
| There is a fantastic book in the Ruby world called "Eloquent
| Ruby" but I have yet to encounter an analog for Python.
|
| RealPython has some cool posts, though. I think they're doing
| the best job pushing more modern/clean practices forward at the
| moment.
| fnord123 wrote:
| Pythonic guide, not too verbose?
|
|     >>> import this
|     The Zen of Python, by Tim Peters
|
|     Beautiful is better than ugly.
|     Explicit is better than implicit.
|     Simple is better than complex.
|     Complex is better than complicated.
|     Flat is better than nested.
|     Sparse is better than dense.
|     Readability counts.
|     Special cases aren't special enough to break the rules.
|     Although practicality beats purity.
|     Errors should never pass silently.
|     Unless explicitly silenced.
|     In the face of ambiguity, refuse the temptation to guess.
|     There should be one-- and preferably only one --obvious way to do it.
|     Although that way may not be obvious at first unless you're Dutch.
|     Now is better than never.
|     Although never is often better than *right* now.
|     If the implementation is hard to explain, it's a bad idea.
|     If the implementation is easy to explain, it may be a good idea.
|     Namespaces are one honking great idea -- let's do more of those!
| throwaway894345 wrote:
| I've often wished Python indexed more on "simple is better
| than complex" and "There should be one-- and preferably only
| one --obvious way to do it". Python has become quite complex
| over the years, including its object system, its tooling,
| ecosystem, and even many of the standard library APIs
| (thinking of you, `subprocess`!).
| pletnes wrote:
| The _Fluent Python_ book is fantastic for exactly that.
| adsharma wrote:
| If you have Python code and are trying to generate equivalent Rust
| for performance, please give py2many a try. Looking for feedback.
|
| Also experimental support for pyo3 extensions. But really the
| main idea of the project is static python and complete
| recompilation (not interoperability with dynamic python code).
| amelius wrote:
| I wonder what Rustaceous Python would look like.
| Twisol wrote:
| Aggressively shared-xor-mutable. If an object A has a reference
| to some other object B, either B is (deeply) immutable or A has
| the only reference to B. This means that in general, data is
| only held by some code as long as is needed to operate on it,
| at which point the data is relinquished.
|
| Personally, I find OO designs to be enhanced by this principle,
| so I don't think it's only something one does in Rust. I
| certainly learned it from Rust though.
| nerdponx wrote:
| For what it's worth, following this pattern informally is
| usually a good idea anyway in Python (and Lua and Ruby and
| Javascript etc).
|
| Even if B is technically mutable (which 99% of the time it is
| because almost everything in Python is mutable), just don't
| mutate it and pretend like you're not allowed to do so.
| Twisol wrote:
| > Even if B is technically mutable (which 99% of the time
| it is because almost everything in Python is mutable), just
| don't mutate it and pretend like you're not allowed to do
| so.
|
| This gets you pretty far, but it's hard to ensure that
| nobody else ever mutates B. If B is technically mutable,
| and there are multiple mutable references to B, then when
| you invoke some other object in the course of your work,
| control might come back to you with B mutated without you
| realizing it. This is why the XOR is so important: if B is
| shared (with you) and mutable (not by you), B could change
| under you while you've passed control temporarily to
| someone else.
|
| This is certainly less problematic than having multiple
| mutable references, but it's still a source of complexity.
| As a very small toy example, consider iterating forward
| over an array while deleting elements.
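For the toy example just mentioned, Rust's rules turn the hazard into a compile error: a `for x in &values` loop holds a shared borrow of the vector, so removing elements inside it is rejected. A sketch of the usual alternative follows; the function name `drop_even` is made up, while `Vec::retain` is a standard library method.

```rust
// Removing elements inside `for x in &values { ... }` does not compile,
// because the loop holds a shared borrow of `values` for its whole duration.
// `Vec::retain` performs the filtering in a single call instead.
fn drop_even(mut values: Vec<i32>) -> Vec<i32> {
    values.retain(|v| v % 2 != 0);
    values
}

fn main() {
    println!("{:?}", drop_even(vec![1, 2, 2, 3, 4]));
}
```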
| macintux wrote:
| One of my frustrations when working with Python is that
| it has been adding very useful FP-inspired tools, but any
| random library I invoke can pull the rug out from
| underneath me.
|
| I have long felt that retrofitting a language to add
| support for immutability/FP constructs is better than
| nothing but significantly worse than starting with
| immutability as a core principle.
| the__alchemist wrote:
| I do this! Dataclasses and Enums as the foundation of program
| layout. No inheritance. No getters/setters. No @property.
| Judicious use of typing and function / class doc comments.
| simias wrote:
| This article demonstrates well something I've seen a lot with
| people coming from GC languages getting into Rust: they just
| write the code the way they're used to and work around the borrow
| checker by slapping `Arc<Mutex<_>>` all over the place.
|
| Then that leads to frustration and wondering why you even got rid
| of the GC in the first place if you end up with a crappier non-
| transparent reference-counted garbage collection with all these
| Arc/Rc. Some devs even seem to think that non-GC devs are silly
| poseurs who refuse to have a garbage collector only to
| reimplement it manually with reference counting. We are silly
| poseurs alright, but we're more efficient than that!
|
| I share the author's conclusions: don't do that. If you find
| yourself slapping Mutexes and Arc/Rc all over the place it
| probably means that there's something messed up with the way you
| modeled data ownership within your program.
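One way to read "model data ownership" in practice, sketched under the assumption that only one worker needs the data at a time: move the value into the thread and take it back from the join handle, rather than wrapping it in `Arc<Mutex<_>>`. The function name is hypothetical.

```rust
use std::thread;

// Instead of sharing the vector behind Arc<Mutex<_>>, move it into the
// worker and receive the result as the thread's return value.
fn double_in_thread(data: Vec<i32>) -> Vec<i32> {
    let handle = thread::spawn(move || {
        data.into_iter().map(|x| x * 2).collect::<Vec<i32>>()
    });
    handle.join().expect("worker panicked")
}

fn main() {
    println!("{:?}", double_in_thread(vec![1, 2, 3]));
}
```

This only works when ownership can really be handed over; data that several threads must see at the same time is the case `Arc` exists for.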
| cogman10 wrote:
| Yup, definitely a frustration I have reading through rust
| tutorials as well. So many of them will get out of memory
| problems by "Oh, just add Arc around it".
| throwaway894345 wrote:
| In fairness, "just use Rc<>/Arc<>/clone()/etc" is common advice
| from the Rust community in response to criticism that the
| borrow checker puts an undue burden on code paths which aren't
| performance sensitive.
|
| > If you find yourself slapping Mutexes and Arc/Rc all over the
| place it probably means that there's something messed up with
| the way you modeled data ownership within your program.
|
| It only means that the data model doesn't agree with Rust's
| rules for modeling data (which exist to ensure memory safety in
| the absence of a GC). This doesn't mean that the programs the
| user wants to express are invalid. And this really matters
| because very often it doesn't make economic sense to appease
| the borrow checker--there are a lot of code paths for which the
| overhead of a GC is just fine but lots of time spent battling
| the borrow checker is not fine, and I think Rust could use a
| better story for "gracefully degrading" here. I say this as a
| Rust enthusiast.
|
| EDIT: I can also appreciate that Rust is waiting some years to
| figure out how far it can get on its ownership system alone as
| opposed to building some poorly-conceived escape hatch that
| people slap on everything.
| fpgaminer wrote:
| > very often it doesn't make economic sense to appease the
| borrow checker--there are a lot of code paths for which the
| overhead of a GC is just fine but lots of time spent battling
| the borrow checker is not fine
|
| To me it's not about performance. A little bit of time spent
| now appeasing the borrow checker will pay off ten fold later
| when you don't have to deal with exploding memory usage and
| GC stalls in production.
|
| GC is great for quick hack jobs, scripts, or niches like
| machine learning, but I believe at this point it's a failed
| experiment for anything else.
| throwaway894345 wrote:
| > To me it's not about performance. A little bit of time
| spent now appeasing the borrow checker will pay off ten
| fold later when you don't have to deal with exploding
| memory usage and GC stalls in production.
|
| I'm confused by the "it's not about performance. [reasons
| why it is, in fact, about performance]" phrasing, but in
| general a lot of applications aren't bottlenecked by memory
| and a GC works just fine. Even when that's not entirely the
| case, they often only have one or two critical paths that
| _are_ bottlenecked on memory, and those paths can be
| optimized to reduce allocations.
|
| > GC is great for quick hack jobs, scripts, or niches like
| machine learning, but I believe at this point it's a failed
| experiment for anything else.
|
| That sounds kind of crazy considering how much of the world
| runs on GC (certainly much more than the other way around).
| I feel the need to reiterate that I'm not a GC purist by
| any means--I've done a fair amount of C and C++ including
| some embedded real time. But the idea that GC is a failed
| experiment is utterly unsupported.
| steveklabnik wrote:
| > In fairness, "just use Rc<>/Arc<>/clone()/etc" is common
| advice from the Rust community in response to criticism that
| the borrow checker puts an undue burden on code paths which
| aren't performance sensitive.
|
| Yes, and I think it's good to push back on that. I personally
| feel it's pretty misguided. I've been meaning to write about
| this, but haven't... so I'll just leave a comment, heh.
| Someday...
|
| A related comment I wrote a while back about this:
| https://news.ycombinator.com/item?id=24992747
| Animats wrote:
| The recommended solution uses "scoped_threadpool".
|
| But "scoped_threadpool" uses "unsafe".[1] _They could not
| actually do this in Rust. They had to cheat._ The language
| has difficulty expressing that items in an array can be
| accessed in parallel. You can't write that loop in Rust
| itself without getting the error "error[E0499]: cannot
| borrow `v` as mutable more than once at a time".
|
| And, sure enough, the "unsafe" code once had a hole in
| it.[2] It's supposedly fixed.
|
| If you look at the example with "excessive boilerplate"
| closely, you can see that it doesn't achieve concurrency at
| all. It locks the _entire_ array to work on _one element_
| of the array, so all but one of the threads are blocked at
| any time. To illustrate that, I put in a "sleep" and some
| print statements.[3]
|
| You could put a lock on each array element. That would be a
| legitimate solution that did not require unsafe code.
|
| To do this right you need automatic aliasing analysis,
| which is hard but not impossible. (Been there, done that.)
| Or a special case for "for foo in bar", which needs to be
| sure that all elements of "bar" are disjoint.
|
| [1] https://github.com/reem/rust-scoped-
| pool/blob/master/src/lib...
|
| [2] https://www.reddit.com/r/rust/comments/3hyv3i/safe_stab
| le_an...
|
| [3] https://play.rust-
| lang.org/?version=stable&mode=debug&editio...
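For readers today, a sketch of per-element parallel mutation without a lock: `std::thread::scope` was added to the standard library in Rust 1.63, after this thread was written, and `iter_mut` hands out one `&mut` per element. Like the crate discussed above, both are implemented with `unsafe` internally, so this does not contradict the point about the language itself; it only shows what the safe surface looks like. The function name `increment_all` is made up.

```rust
use std::thread;

// Each scoped thread receives a `&mut` to a different element, handed out by
// `iter_mut`, so no lock is needed and the elements are updated in parallel.
fn increment_all(values: &mut [u32]) {
    thread::scope(|s| {
        for v in values.iter_mut() {
            s.spawn(move || {
                *v += 1;
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}

fn main() {
    let mut values = vec![1, 2, 3];
    increment_all(&mut values);
    println!("{:?}", values);
}
```

One thread per element is only for illustration; real code would more likely split the slice with `chunks_mut`.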
| throwaway894345 wrote:
| If all you do is publish that link, it would not be a
| disservice to your readers IMHO.
| oconnor663 wrote:
| "Just use clone" seems like reasonable advice to give
| people on day 1 or week 1. I guess the hazard there is that
| they might end up writing `fn foo(s: String)` everywhere
| instead of `fn foo(s: &str)`, but they can gradually learn
| the better thing through case-by-case feedback, and
| correcting this is usually a small change at each callsite
| rather than a total program rewrite.
|
| On the other hand "just use Arc" is definitely smelly
| advice. I think going down that route, a new Rust
| programmer is likely to try to do something that really
| won't ever compile, and wind up really frustrated. Maybe we
| can distinguish this often-really-bad advice from the other
| mostly-ok-at-first advice?
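The `String` versus `&str` point in the first paragraph can be seen in a few lines. The function names are hypothetical.

```rust
// Takes ownership: a caller that still needs its `String` must clone it.
fn shout_owned(s: String) -> String {
    s.to_uppercase()
}

// Borrows: accepts `&String` and string literals alike, with no clone.
fn shout_borrowed(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let name = String::from("ferris");
    let a = shout_owned(name.clone()); // clone needed to keep using `name`
    let b = shout_borrowed(&name); // the only call-site change: add `&`
    println!("{} {} {}", a, b, name);
}
```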
| timClicks wrote:
| Agree. Starting with .clone() is very useful for
| beginners, in my opinion. It gives people a chance to
| learn what ownership is gradually.
| steveklabnik wrote:
| "Just use clone" is absolutely fine. I even put it in the
| book!
|
| Yes, I guess to me they're distinct, but I can totally
| see how it may seem similar. I'll try to make sure to
| make that explicit when I talk about this, thanks!
| bsder wrote:
| Thank you for that example. You really should put that
| somewhere public.
|
| However, that example commits the Rust "sin" I always see
| with "spawn" explanations in that it packs everything into
| a closure.
|
| Please, please, please ... for the sake of newbies
| everywhere ... please define a function and call that
| function from inside the spawn closure. This is one of the
| Rust things I spend the most time explaining and unpacking
| for junior engineers.
|
| The separate function makes explicit the variables that are
| moving/borrowing/cloning around, which names are
| external/passed/closure-specific, and how it all ties
| together. It breaks apart the machinery that gets all
| conflated with the closure. They can always make it
| succinct again _later_ once they understand the machinery
| more completely.
|
| This gets particularly bad with "event loops" (I have a
| _loooooong_ rant I need to write up about event loops and
| why they are evil--and it's not Rust specific) and deep
| "builder chains" (which I consider a Rust misfeature that
| desperately needs to go away).
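A sketch of the style being asked for here, with invented names (`worker`, `run`): the thread body is a named function whose signature lists exactly what is moved in, and the closure passed to `spawn` does nothing but forward its captures.

```rust
use std::thread;

// The thread body is an ordinary function: its signature spells out which
// values are moved into the worker.
fn worker(id: usize, words: Vec<String>) -> String {
    format!("worker {}: {}", id, words.join(" "))
}

fn run() -> Vec<String> {
    let batches = vec![
        vec!["hello".to_string(), "world".to_string()],
        vec!["goodbye".to_string()],
    ];

    let mut handles = Vec::new();
    for (id, words) in batches.into_iter().enumerate() {
        // The closure only forwards `id` and `words` to `worker`.
        handles.push(thread::spawn(move || worker(id, words)));
    }

    let mut results = Vec::new();
    for handle in handles {
        results.push(handle.join().expect("worker panicked"));
    }
    results
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```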
| steveklabnik wrote:
| That is a very interesting idea that I find viscerally
| unsettling but will consider, thanks.
|
| It is probably because of all my years with Ruby that
| this seems so off to me, but also, I hear you. Hmm.
| bsder wrote:
| I get why it's unsettling. I hear you, too.
|
| The issue in Rust is that "spawn" is generally the first
| time that a programmer is _forced_ to confront closures--
| before that they can completely ignore them.
|
| It is a quirk of Rust that someone who "just wants to
| spawn a thread" suddenly gets all this language overhead
| dumped on them in a block.
| saghm wrote:
| What about iterators? My instinct is that people would
| run into wanting to use `map` or `filter` or something
| before they feel the need to spawn threads when using a
| new language, although that might be my bias coming from
| a more functional background before learning Rust. The
| types of closures used as predicates tend to be a lot
| simpler though, so I guess this may not be what you mean
| by "confronting" them.
| gbear605 wrote:
| Map/filter closures almost never actually enclose
| variables, instead just acting on the elements from the
| iterator. They're basically a one-off nameless function.
| On the other hand, closures used for spawning threads
| almost always enclose variables from the outer context.
| That's an important distinction that makes them seem
| quite different even if they're the same underneath.
|
| When I started with Rust, I actually thought that the
| common
|
| spawn(|| ...)
|
| syntax was something special, not just a closure with no
| variables.
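The distinction drawn above, in a short sketch with made-up function names: the first closure uses only its own parameter, the second reads a variable from the enclosing scope.

```rust
// The closure uses only its parameter: effectively a nameless function.
fn doubled(xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|x| x * 2).collect()
}

// The closure captures `threshold` from the enclosing function.
fn above(xs: &[i32], threshold: i32) -> Vec<i32> {
    xs.iter().copied().filter(|x| *x > threshold).collect()
}

fn main() {
    println!("{:?}", doubled(&[1, 2, 3]));
    println!("{:?}", above(&[1, 5, 10], 4));
}
```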
| bsder wrote:
| Iterators? Nope. You can happily live in "for x in foo"
| land without ever touching map/filter/collect.
|
| > might be my bias coming from a more functional
| background before learning Rust
|
| This is exactly the problem. The people I'm dealing with
| are coming from "imperative land" and haven't had a
| functional background. Someone stepping up to Rust as "C
| with memory safety" does not have any of that functional
| background.
|
| Please do remember that a "closure" is built on a lot of
| prior abstractions. What is a "scope"? What is "variable
| capture"? What is "heap" and "stack"? Why does that
| matter here and not normally?
|
| No programming language is only used by experts, and Rust
| is no exception.
|
| The issue is that "spawn" smacks you with a bunch of that
| baggage all at once.
| mlindner wrote:
| > In fairness, "just use Rc<>/Arc<>/clone()/etc" is common
| advice from the Rust community in response to criticism that
| the borrow checker puts an undue burden on code paths which
| aren't performance sensitive.
|
| I hypothesize that this mostly comes from a laziness in
| response as it's an easy response to give to people who are
| used to having a garbage collector. I come from the C world
| (still learning Rust) and every time I see one of these
| pieces of advice given I'm forced to facepalm.
| forrestthewoods wrote:
| > It only means that the data model doesn't agree with Rust's
| rules for modeling data (which exist to ensure memory safety
| in the absence of a GC).
|
| This doesn't seem correct at all?
|
| GC merely solves freeing memory when it is no longer needed.
| But it does not solve parallel access.
|
| One of the beauties of Rust is that you can write highly
| parallel code and if it compiles it works. Meanwhile Python
| languishes behind the GIL. GC does not even attempt to solve
| multithreaded memory access.
|
| I'm happy to be corrected if I'm wrong or missing something
| here.
| verdagon wrote:
| That's a fair point.
|
| I'd add: some GC'd languages solve this with sending around
| deeply immutable objects (functional languages mostly), in
| a way that can be more flexible than the way Rust handles
| immutable objects.
| throwaway894345 wrote:
| You're right on the local point that Rust's ownership rules
| guarantee correctness in the case of parallel access while
| GCs do not; however, the broader point is that Rust's
| ownership rules also reject many programs which would be
| correct with GC.
|
| For example:
|
|     fn foo<'a>() -> &'a [u8] {
|         let v = vec![0, 1, 2];
|         // returns a value referencing data owned by the current function
|         v.as_slice()
|     }
|
| vs Go's:
|
|     func foo() []uint8 {
|         // subject to escape analysis, but code like this would likely return
|         // a fat pointer into the heap--no complaints because there's a GC.
|         return []uint8{0, 1, 2}
|     }
|
| > One of the beauties of Rust is that you can write highly
| parallel code and if it compiles it works. Meanwhile Python
| languishes behind the GIL. GC does not even attempt to
| solve multithreaded memory access.
|
| Python's GIL is unrelated to GC, but yes, Rust's borrow
| checker guarantees correct parallel access of memory. But I
| think this benefit is overblown for a couple reasons:
|
| 1. contrary to popular opinion, if you've learned how to
| write parallel programs, it's not tremendously difficult to
| write them correctly without a borrow checker. In my
| experience, whatever time I've saved debugging pernicious
| data races is lost by the upfront cost of pacifying the
| borrow checker. _Maybe_ this wouldn't hold for people who
| aren't experienced with writing parallel code (but I
| imagine such people would have a harder time grokking the
| borrow checker as well).
|
| 2. most data races in my experience aren't single threads
| on a host accessing a piece of shared memory, but rather
| many threads on many hosts accessing some network resource
| (e.g., an S3 object). The borrow checker doesn't help here
| at all, but you still have to "pay the borrow checker tax".
|
| Again, this isn't a tirade against the borrow checker, but
| an insistence that tradeoffs exist and it's not just a
| "you're just doing it wrong" sort of thing.
| richardwhiuk wrote:
| The equivalent of the Go code is returning the Vec.
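|
| A minimal sketch of that suggestion, assuming the caller is happy
| to own the buffer. The `sum` helper is hypothetical and only shows
| that a slice can still be borrowed from the returned Vec:

```rust
/// Counterpart to the Go `foo`: return the owned buffer instead of a slice.
fn foo() -> Vec<u8> {
    // Ownership of the heap allocation moves to the caller, so nothing dangles.
    vec![0, 1, 2]
}

/// Hypothetical consumer: callers that only need a view borrow a slice.
fn sum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}
```

| Calling `sum(&foo())` compiles because `&Vec<u8>` coerces to
| `&[u8]`; the borrow simply cannot outlive the Vec it points into.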
| layoutIfNeeded wrote:
| >Rust's ownership rules guarantee correctness in the case
| of parallel access while GCs do not
|
| That's a dangerous misconception. The Rust ownership
| model only guarantees that the program is free of data
| races. That's a necessary but not sufficient condition of
| program _correctness_.
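|
| To illustrate the distinction, here is a sketch with a made-up
| `Account` type (not from the article). Both methods compile and are
| free of data races, but only the second one is free of the
| check-then-act race:

```rust
use std::sync::Mutex;

/// Hypothetical account type, used only for illustration.
struct Account {
    balance: Mutex<i64>,
}

impl Account {
    /// Data-race free, yet logically racy: the lock is released between the
    /// check and the update, so two threads can both pass the check.
    fn withdraw_check_then_act(&self, amount: i64) -> bool {
        let enough = *self.balance.lock().unwrap() >= amount; // guard dropped here
        if enough {
            *self.balance.lock().unwrap() -= amount; // separate critical section
        }
        enough
    }

    /// Holds a single guard across check and update, preserving the invariant.
    fn withdraw_atomic(&self, amount: i64) -> bool {
        let mut balance = self.balance.lock().unwrap();
        if *balance >= amount {
            *balance -= amount;
            true
        } else {
            false
        }
    }
}
```

| The compiler accepts both versions, which is the point: the type
| system rules out unsynchronized access, not a wrong locking scope.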
| raverbashing wrote:
| Clone is "okish", but I don't remember a case where I had to
| use Rc or Arc (sure, there are use cases for that, but not
| for basic stuff)
|
| Some people just try to force their way into a new language
| and don't realize that if you keep doing something that looks
| stupid or weird it probably is (and no, yours is not a
| special case)
| throwaway894345 wrote:
| > Clone is "okish", but I don't remember a case where I had
| to use Rc or Arc (sure, there are use cases for that, but
| not for basic stuff)
|
| I think the use case is "I haven't yet completely grokked
| the borrow checker and/or I don't have time to pacify it,
| but I would prefer not to copy potentially large data
| structures all over with Clone".
|
| > Some people just try to force their way into a new
| language and don't realize that if you keep doing something
| that looks stupid or weird it probably is (and no, yours is
| not a special case)
|
| You're responding to my comment which is about _the Rust
| community_ prescribing this as a solution to newcomers. We're
| not talking about newcomers obstinately refusing to
| learn new idioms in the language they allegedly want to
| learn (although no doubt this happens, especially if the
| language in question is Go :p ).
| raverbashing wrote:
| > I think the use case is "I haven't yet completely
| grokked the borrow checker and/or I don't have time to
| pacify it, but I would prefer not to copy potentially
| large data structures all over with Clone".
|
| For basic stuff I agree, though I wouldn't use it.
|
| > about the Rust community prescribing this as a solution
| to newcomers.
|
| There's probably a sweet spot for using those constructs
| in not so obvious places while going through the simpler
| stuff in a more idiomatic way
| PragmaticPulp wrote:
| This is also a problem with Rust conversations and tutorials
| online: There's a lot of talk about what _not_ to do or what
| you can't do in Rust, but it's rarely followed up with an
| explanation of what users should do.
|
| Mutexes aren't an entirely foreign concept to many programmers.
| Obviously if someone can architect their system in a way that
| doesn't require Arc<Mutex<_>> then go for it, but we need to be
| careful about giving blanket advice without alternatives.
| rmdashrfstar wrote:
| Indeed, this blanket advice does not apply in situations
| where shared resources that require mutable access need to be
| synchronized. If my async HTTP router is serving two requests
| that are using the same database client to persist changes,
| that needs to be synchronized. How does one do this without a
| mutex?
| waffletower wrote:
| There are STM (Software Transactional Memory)
| implementations in Rust.
| ibraheemdev wrote:
| Use a database pool, although some pool implementations may
| use mutexes internally.
| barsonme wrote:
| By acquiring and releasing resources. I'm not intimately
| familiar with Rust, so I'll use Go as an example.
|
| Create a one-buffered channel: `make(chan T, 1)`. To
| acquire, receive from the channel. If the object is in use,
| the receive will block (goroutine put to sleep) until it's
| available. To release, send the acquired object to the
| channel.
|
| On a language level, no mutexes.
| Dzugaru wrote:
| I come from C# and Rust absolutely drives me bonkers. I try to
| reference all the things all the time, because it's ingrained in
| my brain as more efficient than copying stuff around and/or
| requesting things from hashmaps. The first thing I've tried to
| implement was a string interning module and I was stuck on it
| for three days (I still don't know how to do it without
| duplicating each string twice). Rust is completely orthogonal
| to the concept of a nice garbage collected graph of objects.
| cmyr wrote:
| On this very specific point, for this specific problem, I
| think that `Rc<str>` is criminally underused. It's a great
| type if you need a heap allocated string, but want to hold on
| to copies of it in various places; and it derefs to `&str`,
| so you can use it transparently with most of `std`.
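|
| A small sketch of what that looks like in practice. The function
| names here are made up for the example; only `Rc<str>` and its
| standard conversions come from `std`:

```rust
use std::rc::Rc;

/// Takes a plain `&str`, as most of `std` does.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

/// Builds one heap allocation and hands out two handles to it.
fn shared_pair(text: &str) -> (Rc<str>, Rc<str>) {
    let first: Rc<str> = Rc::from(text); // copies the bytes once
    let second = Rc::clone(&first); // bumps the refcount, no new string
    (first, second)
}
```

| Passing `&first` to `shout` works through deref coercion, so the
| shared handle can be used wherever a `&str` is expected.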
| nicoburns wrote:
| Maybe check out one of the existing string interning modules?
| There are some good ones. I believe it's common to internally
| use indexes instead of references to get around the borrow
| checker in these kind of situations.
| Dzugaru wrote:
| That's my impression with Rust for now - you use indexes
| instead and borrow as needed. Long lived borrows are a no
| go, I was shocked to figure out you cannot return a new
| thing and a reference to it from a function.
| [deleted]
| CraigJPerry wrote:
| I've never tried that so i don't know the challenge but my
| first thought was that it doesn't seem incompatible with
| rust's ownership model - assuming an interned string has
| static lifetime and then hand out immutable references?
| Dzugaru wrote:
| I wanted to drop strings which are not used anymore, you do
| need Rc for that (no way around it I think), but my problem
| was that I also needed those strings as keys to a
| HashMap<String, Weak<String>>. So I use that HashMap to
| check if the string is interned and give a new counted
| reference as needed and also periodically check any weak
| refs that should be dropped. This works, but the key better
| be immutable (I think?) so you can't do
| HashMap<Weak<String>, Weak<String>> (HashSet?) and hence
| need a second copy as a key.
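|
| One possible way around the second copy, sketched under different
| assumptions than the design above: the set holds strong `Rc<str>`
| handles instead of weak ones, and a periodic purge drops entries
| nobody else references. `Interner` and its methods are hypothetical
| names, not an existing crate's API:

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical interner: the set owns one `Rc<str>` per distinct string.
#[derive(Default)]
struct Interner {
    strings: HashSet<Rc<str>>,
}

impl Interner {
    /// Returns a shared handle, allocating only the first time a string is seen.
    fn intern(&mut self, s: &str) -> Rc<str> {
        // `Rc<str>` implements `Borrow<str>`, so the set accepts a `&str` lookup.
        if let Some(existing) = self.strings.get(s) {
            return Rc::clone(existing);
        }
        let rc: Rc<str> = Rc::from(s);
        self.strings.insert(Rc::clone(&rc));
        rc
    }

    /// Drops entries that only the interner itself still references.
    fn purge_unused(&mut self) {
        self.strings.retain(|rc| Rc::strong_count(rc) > 1);
    }

    /// Number of distinct strings currently stored.
    fn len(&self) -> usize {
        self.strings.len()
    }
}
```

| The tradeoff is that unused strings live until the next purge
| rather than being freed the moment the last user drops them.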
| iudqnolq wrote:
| That's an interesting case. This won't solve your whole
| problem, but you should be able to make the key the hash
| of the string and then have a HashMap<u64, Weak<String>>.
| I feel like there's a pretty solution where you stick the
| whole map in a Rc and stick that in every interned
| string. Then strings can remove/lookup themselves. To
| make that work you'll want some sort of concurrent hash
| map.
| axegon_ wrote:
| Couldn't have summed it up better if I tried. Rust is an
| awesome language but at the end of the day it's a systems
| language. As such it's not particularly suitable for writing
| just any type of application. Seems it shares one of my many
| problems with javascript: The fact that everything is being
| written and re-written in javascript (even if there are
| different reasons for both). As I've said a million times,
| "just because you can, doesn't mean you should". The endless
| Arc<Mutex<_>> is a symptom of it. In larger applications this
| turns into an endless hell of read/write locks and sooner
| rather than later it blows up straight into your face. Which is
| further amplified by the somewhat recently introduced
| async/await. And let's not overlook the many incompatible tokio
| versions and the fact that tons and tons of crates use
| different versions of tokio: Just as everything looks buttery
| smooth, BOOM: thread 'main' panicked at 'not currently running
| on the Tokio runtime.' Damn it... Awesome... Which one is it
| this time...
|
| On the subject of python, I think Rust has a very powerful
| niche it could take: slightly modifying an old Google
| philosophy (which they can't stick to anymore for a million and
| one reasons of course): "Python where we can, Rust where we
| must". To my mind it could be a very pleasant recipe for a
| large number of startups.
| ragnese wrote:
| I'm ambivalent about it.
|
| In my opinion, Rust is optimized to be a systems programming
| language. As such, we'd expect that it shouldn't really be your
| first choice for writing an "application". (I'm not referring
| to the OP at all here)
|
| HOWEVER, Rust is such an excellent language that we all want to
| use it to write applications ANYWAY. That's kind of amazing in
| and of itself- that people want to deal with a non-GC'd
| language at all to write applications. Because, really, garbage
| collection is awesome and there's almost zero reason to avoid
| it unless you need extremely _predictable_ performance, or very
| low runtime overhead, etc.
|
| As far as wrapping everything in Arcs and Mutexes is concerned:
| Yes, that's ugly and it's a lot of extra typing. On the other
| hand, your performance is still likely to be orders of
| magnitude faster than Python for some general application-type
| tasks, and it _will_ likely avoid headaches with the borrow
| checker, etc.
|
| So, honestly, I don't know if I recommend doing that or not.
| What I _want_ is to be able to tell people to not use Rust for
| what they're doing. I'd _like_ to have a different
| recommendation for a "garbage collected Rust" but there really
| isn't anything that I think is good enough for the title. Maybe
| Scala 3 (I haven't played with it yet) or OCaml when its multi-
| core stuff is done. Maybe F# or C# are good enough, too.
| amyjess wrote:
| > I'd like to have a different recommendation for a "garbage
| collected Rust" but there really isn't anything that I think
| is good enough for the title.
|
| D and Nim are both good candidates IMO.
| Zababa wrote:
| The problem with OCaml is not multicore. If it had an easier
| to use build system and a larger community (so more
| libraries), it could be great for server-side software. But
| dune is way harder to use than Cargo and you can often miss
| libraries. Here's an example of Dark moving away from OCaml
| to F# https://blog.darklang.com/leaving-ocaml/, which doesn't
| have as much this problem because you can use C# stuff. Scala
| also doesn't suffer as much from it, part of it because it
| was really popular at some point, and part of it because you
| can use Java anyways.
|
| I hope at some point we'll have this "garbage collected
| Rust", but for now OCaml, Scala and F# all have a worse
| developer experience than Rust. I could add Haskell to that
| too.
|
| I've been thinking for some time that something like Go but
| based on a ML would fill this "garbage collected Rust" niche
| quite well. Maybe something built over Rust itself to
| leverage all the ecosystem? You could write all of the glue
| code in this language, have access to a large ecosystem of
| libraries and have an option for high performance code. This
| would also complement Rust nicely: I know that OCaml has a
| really fast compiler, which would be a breath of fresh air
| for the community.
| dexwiz wrote:
| > HOWEVER, Rust is such an excellent language that we all
| want to use it to write applications ANYWAY.
|
| Is this really the reason, or is it because people think they
| will get free performance wins by choosing the correct
| language? The plethora of articles detailing a developer's
| story trying to write X-style language in Rust shows that
| most are not approaching Rust with the correct mindset.
|
| I have limited experience in Rust, but it seems like most people
| are attracted by shiny new toys in the language. Many people
| have issues with OO languages, and the separation of data and
| function is attractive. But it lacks the run time most
| application developers have come to know, specifically GC and
| easy references.
| steveklabnik wrote:
| I'm sure there's a bit of both. Many people talk about how
| they love the tooling, the type system, enums, traits...
| stuff unrelated to performance.
| bjornjajayaja wrote:
| I think the broader problem is that: one _cannot_ just
| arbitrarily "go" from a higher level language to a systems
| language.
|
| Folks really ought to read the C/C++ literature to understand
| why Rust evolved in a unique direction. That gives better
| compare/contrast.
|
| Also, a lot of Python idioms are actually from C++ (e.g.
| mixins, iterators, operator overloads, etc).
|
| Anyway, to reiterate: folks interested in systems languages
| should read the C/C++ literature and actually become a systems
| engineer first.
| whatshisface wrote:
| > _Folks really ought to read the C/C++ literature to
| understand why Rust evolved in a unique direction. That gives
| better compare/contrast._
|
| Sounds like a good idea for the post-singularity age of
| infinite lifespans, but what folks really need to do is learn
| what they need to learn as best they can when they need it.
| bjornjajayaja wrote:
| Maybe if one just wants to scrape by. I wouldn't want
| systems code from those folks to be honest... Maybe that's
| fine up the layers but the whole point of Rust is
| correctness and stability--to get to that point, maybe
| invest a month hitting the books?
| heresie-dabord wrote:
| Whenever I see the word "pythonic" I am cautious. Not all
| languages are Python, nor should they all become Python, nor
| are they worse for _not being Python_.
| radicalbyte wrote:
| Python developers try to write idiomatic Python in
| whichever language they use.
|
| Learning the idioms and conforming to a new language are
| hard, especially at the beginning, and people are lazy.
| amyjess wrote:
| Much of Python's influence is actually from Modula-3, not
| C++. The only reason C++ is similar is because C++ also
| borrowed liberally from Modula-3.
|
| In particular, Python's import, exception handling, and
| object systems are based on Modula-3's (though IIRC it did
| also borrow some C++ innovations for the object system).
|
| (Modula-3 is the Velvet Underground of programming languages:
| your average programmer/listener has never heard of it, but
| it influenced so many languages/musicians you _have_ heard
| of)
| nicoburns wrote:
| > Folks really ought to read the C/C++ literature to
| understand why Rust evolved in a unique direction. That gives
| better compare/contrast.
|
| Unfortunately the C/C++ literature is dense and not at all
| approachable. The Rust literature is much, much better for
| newcomers to systems programming (partly because it doesn't
| have to cover a load of weird failure cases that simply don't
| exist in Rust).
|
| > one _cannot_ just arbitrarily "go" from a higher level
| language to a systems language.
|
| I mean, it's like learning anything new. You have to do a bit
| of unlearning, and grasp the core concepts. I don't think
| there is anything especially difficult about systems
| languages. I had a background of JavaScript and PHP, and I
| was able to pick up Rust well enough to use it in my day job
| in a couple of weeks.
| loeg wrote:
| I think OP's first approach (with some combination of Arc/Rc and
| Mutex or just plain RefCell) was probably on the right track and
| I'm not sure why they chose something else. They don't really
| elaborate on this?
|
| > This was my initial approach, but it started to become pretty
| verbose, pretty quickly. In hindsight it may have ultimately been
| simpler than where I did end up, but, well, that's hindsight.
| granduh wrote:
| can someone comment on the OP's website platform? it looks like a
| plain ftp explorer but it seems to support inline pages on
| mobile.
| scoopertrooper wrote:
| Looks like a custom job.
| whalesalad wrote:
| I don't agree with a lot of this post but I think it does a good
| job pointing out that porting a library from ecosystem A to
| ecosystem B is not always a good idea. A good API design in one
| language can be a horrible one in another.
|
| _Python_ doesn't expect you to share references to things - it's
| just that the API that was being copied follows an imperative
| style that mutates objects.
| Spivak wrote:
| But the predominant concern in a lot of porting efforts is
| interoperability with developers who are already familiar with
| the current API. Porting the API to feel very Rusty would be a
| mistake when your audience is people who already know the
| Python API. Python itself has adopted some decidedly non-
| pythonic APIs for the same reason.
|
| In the Python world if this bothers you you're supposed to use
| the adapter pattern to Pythonify the mechanically ported API.
| whalesalad wrote:
| I would first start by asking - why am I porting an identical
| API? Why am I not going back to first principles and
| rethinking from ground zero? Usually there is a compelling
| reason to switch from A to B, likely for performance in this
| case (going from py -> rust). Any programmer worth their salt
| knows that nothing in life is free - and everything has a
| tradeoff. In this case the tradeoff (should) be that the API
| might change, but the performance benefits will be
| worthwhile.
|
| For instance - let's look at sequential operations against a
| DB versus batches. In a sequential style, you might iterate
| your items and just write them one by one. Alternatively, in
| a batched style you might need to do things like prepare
| queries or store your queries alongside the actual data
| values, then hand it all to a batch mechanism that will
| perform the write. The ergonomics are completely different
| but at the end of the day the result (rows in the table) will
| be the same. So even in the same language you will see
| totally different ways to go about solving the same problem.
| This is why I do not really see this as a python/rust
| argument and more of a generic program architecture argument.
|
| In the above db case - the programmer isn't disappointed that
| they need to rewrite their iterative loop approach because
| they know that the batching approach is going to be much
| faster and will achieve their goals.
|
| I think the same parallel can be drawn here: it is not a bad
| thing to have a slightly different API. Especially in the
| context of going from a dynamic language to one that is
| compiled and more performant.
|
| All of this is kind of a moot point though because the Rust
| lib in question doesn't have any instructions/documentation
| or design docs so it is hard to say for certain what the
| intention is here other than a port for the sake of porting.
| cmyr wrote:
| Mentioned this briefly below, but this was largely an
| academic exercise: there is a lot of existing code written
| (in python) against the python API in question, and I was
| curious to see how feasible it would be to reimplement that
| API in such a way that this existing code would continue to
| work with minimal modification, but now running on top of a
| rust implementation.
|
| What's additionally challenging in this case is that the
| design of the underlying rust API (the norad crate[1]) was
| also more or less done, so this really was just a matter of
| trying to shim.
|
| In any case, I think we more or less agree; just trying to
| provide a bit more background on the motivation. This was
| originally just circulated as a gist between a few
| interested parties, who were largely familiar with the
| motivations; it certainly didn't occur to me that it might
| be interesting to a general audience.
|
| [1]: https://docs.rs/norad/0.4.0/norad/
| tomnipotent wrote:
| > interoperability with developers who are already familiar
|
| I wouldn't say it's "predominant". Some instances are code
| authors trying to bring their library to new audiences (c/c++
| frameworks with multiple language bindings), or developers
| unfamiliar with the language but want that specific API in
| their own (Python's requests library is a great example,
| cloned in many languages now).
| redrobein wrote:
| Arguably you'd want to write in idiomatic style for the
| ecosystem you're working in. It doesn't help when the
| standard library plus any other libraries you're using offer
| rusty apis and this one library is pythonic.
| oconnor663 wrote:
| The `Arc<Mutex<T>>` approach here would make me worried about
| reference cycles, unless the type relationships here look like a
| tree/DAG. Out of curiosity, if I store references to other things
| as opaque `pyo3::PyObject` objects, would the CPython cycle
| collector be able to see these references and collect their
| cycles?
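|
| For the tree-shaped case, a common way to avoid strong cycles is
| to point down with `Arc` and back up with `Weak`. The sketch below
| uses a hypothetical `Node` type and helper functions; it does not
| address the pyo3 question, only the plain-Rust side:

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical tree node: strong references point down, weak ones point up.
struct Node {
    name: String,
    parent: Mutex<Weak<Node>>,
    children: Mutex<Vec<Arc<Node>>>,
}

/// Creates a detached node with no parent and no children.
fn new_node(name: &str) -> Arc<Node> {
    Arc::new(Node {
        name: name.to_string(),
        parent: Mutex::new(Weak::new()),
        children: Mutex::new(Vec::new()),
    })
}

/// Links `child` under `parent` without creating a strong cycle.
fn add_child(parent: &Arc<Node>, child: Arc<Node>) {
    *child.parent.lock().unwrap() = Arc::downgrade(parent);
    parent.children.lock().unwrap().push(child);
}

/// Looks up the parent's name, if the parent is still alive.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.lock().unwrap().upgrade().map(|p| p.name.clone())
}
```

| Because the back-pointer is weak, dropping the last strong handle
| to the root frees it even while a child still remembers it.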
___________________________________________________________________
(page generated 2021-05-24 23:00 UTC)