[HN Gopher] (Risp (In (Rust) (Lisp)))
___________________________________________________________________
(Risp (In (Rust) (Lisp)))
Author : tosh
Score : 103 points
Date : 2021-07-12 13:45 UTC (9 hours ago)
(HTM) web link (stopachka.essay.dev)
(TXT) w3m dump (stopachka.essay.dev)
| trutannus wrote:
| Reminds me of this: https://github.com/kanaka/mal Highly
| recommend trying it out. Making a lisp is surprisingly easy and
| fun. Plus you get your own programming language out of it.
| forgotpwd16 wrote:
| >(In (Rust) (Lisp))
|
| Is this "Rust in Lisp" or "Lisp in Rust"?
| C-x_C-f wrote:
| Prefix notation preserves the order of the arguments, so it
| would technically translate to "Rust in Lisp." Huh, guess that
| was not the intended meaning...
| aaron-santos wrote:
| We're lucky that language diffusion works both ways. I've
| kept my eye on Carp[1] because it takes a cue from Rust and
| brings deterministic memory management & ownership into a
| Lisp.
|
| [1] https://github.com/carp-lang/Carp
| mastax wrote:
| One of my pathologies as a developer is that I want to do things
| "properly" at all times. When I see things like
|         fn tokenize(expr: String) -> Vec<String> {
|             expr.replace("(", " ( ")
|                 .replace(")", " ) ")
|                 .split_whitespace()
|                 .map(|x| x.to_string())
|                 .collect()
|         }
|
| my instinctive reaction is: _those .replace() calls will allocate
| unnecessarily, what you really need is a little state machine,
| maybe use the nom or parsec crate ..._.
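|
| Something like this single pass (an untested sketch, not the
| article's code) is what that instinct wants -- no intermediate
| Strings, no parser crate:
|         fn tokenize(expr: &str) -> Vec<String> {
|             let mut tokens = Vec::new();
|             let mut current = String::new();
|             for c in expr.chars() {
|                 match c {
|                     // parens become their own tokens
|                     '(' | ')' => {
|                         if !current.is_empty() {
|                             tokens.push(std::mem::take(&mut current));
|                         }
|                         tokens.push(c.to_string());
|                     }
|                     // whitespace just ends the current token
|                     c if c.is_whitespace() => {
|                         if !current.is_empty() {
|                             tokens.push(std::mem::take(&mut current));
|                         }
|                     }
|                     _ => current.push(c),
|                 }
|             }
|             if !current.is_empty() {
|                 tokens.push(current);
|             }
|             tokens
|         }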
|
| Wanting to do things The Right Way is a good instinct to have as
| an engineer, and it's something that Rust encourages by design.
| However, I've noticed that I'll often spend a lot of time getting
| in the weeds trying to optimize or elegant-ize a bit of code
| which ends up being unnecessary. I'll take a step back and
| realize that the performance of that code does not matter, or
| that the implementation was the wrong approach and I need to
| delete it all and do something else, or just that it wasn't very
| important and I should've done the easy solution and moved on.
|
| When I'm in a flow state writing code it's hard to step back and
| evaluate what I'm working on in the context of the bigger
| picture; I haven't been successful at training myself to do that.
| I think a better solution would be to deliberately write "first
| draft" code that's biased toward being quick and easy to write.
| When the code is done there's a natural pause to test and review
| it in the context of the big picture.
|
| Does anybody else struggle with this? What have you done to
| mitigate it?
| kenniskrag wrote:
| > those .replace() calls will allocate unnecessarily,
|
| All the time or only in Rust or Lisp? I would guess the
| compiler can optimize it because it is an XValue.
| mastax wrote:
| `str::replace(from, to)` replaces all instances of the
| pattern `from` in the string with `to` returning the result
| as a newly allocated String.
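|
| A small illustration of what that means for the tokenizer input
| (standard library behavior, nothing specific to the article):
|         fn main() {
|             let s = String::from("(+ 1 2)");
|             // each replace() walks the string and allocates a
|             // fresh String; `s` itself is untouched
|             let spaced = s.replace("(", " ( ").replace(")", " ) ");
|             assert_eq!(spaced, " ( + 1 2 ) ");
|         }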
|
| In theory the compiler could notice that the spaces added to
| the newly allocated strings are only used to influence the
| behavior of `split_whitespace` and do not otherwise affect
| the output of the function. In practice, it does not:
| https://rust.godbolt.org/z/z573Mvjzz The compiler does not
| decide to inline the call to `replace` and therefore does not
| do any special reasoning about it.
| lmkg wrote:
| Those replace calls are growing the strings--one character is
| being replaced by several. That has the potential for re-
| allocation in any language, depending on how much spare space
| is present. Unless the optimizer is smart enough to figure
| out the space characters are never used because they're
| splitting points.
| rtoway wrote:
| In Rust, many times this kind of code will compile to the same
| code using plain old loops and if statements
| mastax wrote:
| Yes, however for this specific situation `replace()` is not
| an iterator adapter but a method of `str` which returns a
| newly allocated `String`. In practice, the compiler does not
| do something smarter than that:
| https://rust.godbolt.org/z/z573Mvjzz
| rtoway wrote:
| Ah you are right about that. Yes, it would be a nice place
| to avoid an allocation
| jcelerier wrote:
| > I'll take a step back and realize that the performance of
| that code does not matter, or that the implementation was the
| wrong approach and I need to delete it all and do something
| else, or just that it wasn't very important and I should've
| done the easy solution and moved on.
|
| You're lucky. I can't remember many times when following
| exactly this process did not lead to having to rewrite it 6
| months later, because it turns out some user has a corner case
| which ends up requiring data to go through that exact codepath
| in as little time as possible.
| rpsw wrote:
| I think in a real parser your instincts would likely serve you
| well.
|
| For example, if a string data type was added, our tokenize
| method could not be this simple.
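|
| A quick illustration of what I mean (hypothetical input, same
| tokenizer as the article):
|         fn main() {
|             let src = r#"(print "a (b) c")"#;
|             let tokens: Vec<String> = src
|                 .replace("(", " ( ")
|                 .replace(")", " ) ")
|                 .split_whitespace()
|                 .map(|x| x.to_string())
|                 .collect();
|             // the string literal gets torn apart:
|             // ["(", "print", "\"a", "(", "b", ")", "c\"", ")"]
|             println!("{:?}", tokens);
|         }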
|
| To the actual question, I think one good approach when cutting
| corners or doing a quicker sub-optimal approach, is to document
| the limitations and assumptions in some way. You could then
| evaluate and prioritise your efforts and time from this, rather
| than getting stuck in the weeds early in the process.
| nerpderp82 wrote:
| Barring the use of unsafe, Rust gives you more guarantees
| about containing low quality code behind good interfaces, so
| in Rust, one should do the thing in the most expedient manner
| and then can revisit it after the program is working.
|
| Working code can be iterated on, by multiple people. Non-
| functioning code cannot. Having a repo in a state that can be
| shown functional by a CI/CD pipeline is the difference between
| achieving a chain reaction and freezing to death. Light that
| fire.
| sreque wrote:
| I consider myself a bit of an optimization geek who sometimes
| errs on the side of over-optimizing from the start. I think
| there are two
| major concerns.
|
| First, you want to write code whose style fits in with the
| team, project, and organization. If other people are writing
| high-level code chaining method calls like above and not
| worrying about performance, then you should write your code
| that way as well. It will fit in fine and no one will notice
| the performance impact because all the rest of the code is
| already equally slow.
|
| On the other hand, there are certain architectural decisions
| that affect performance that can be hard to undo after the
| fact. Those decisions can be a little harder, but it usually
| boils down to meeting short-term requirements. Unless your boss
| or org appreciates optimizations and you can sell it on your
| quarterly review that you saved X dollars due to optimized
| code, why bother?
|
| Your optimization may even cause conflict with people who don't
| like or appreciate it and don't like the extra complexity it
| might bring to the project. Or, if you ship with a fast-enough
| solution and the product's popularity grows to the point that
| scaling becomes hard, you can optimize it and look like a hero!
| Whereas, if you had optimized it in the first place pre-launch,
| your work would likely have gone unappreciated.
|
| It's a bit cynical of a take, but a useful one to consider, I
| think.
| 6keZbCECT2uB wrote:
| I 100% struggle with this, and it makes me a slower developer.
| As I said elsewhere, the dirty data / control flow of two
| passes to do a one character substitution is as painful to me
| as synchronizing two copies of the same function for different
| character substitutions.
|
| Probably the biggest help is to work somewhere that it's a
| productive use of your time to care about the details. Second,
| getting the details right can steal time from getting broad
| strokes right. Who cares if you reduce a predictable branch if
| you are sitting on top of a massive bucketed hash table that
| serializes all keys to strings by allocating? Fight the battles
| worth winning.
|
| One of the things that helps is to work in a language with an
| execution model that I don't understand, e.g. Python or
| JavaScript.
| [deleted]
| lisper wrote:
| The most important thing is not to get the code right but to
| get the interfaces right. If you do that, you can isolate the
| ugliness and the inefficiency and fix it all incrementally as
| needed and as the mood strikes you.
|
| Getting the interfaces right is not easy, but it's a hell of a
| lot easier than getting _everything_ right.
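|
| A rough sketch of what that can look like in Rust (made-up
| trait and names, just to illustrate the point):
|         trait Tokenizer {
|             fn tokenize(&self, expr: &str) -> Vec<String>;
|         }
|
|         // the quick-and-dirty version from the article
|         struct NaiveTokenizer;
|
|         impl Tokenizer for NaiveTokenizer {
|             fn tokenize(&self, expr: &str) -> Vec<String> {
|                 expr.replace("(", " ( ")
|                     .replace(")", " ) ")
|                     .split_whitespace()
|                     .map(|x| x.to_string())
|                     .collect()
|             }
|         }
|
|         // callers depend only on the interface, so a faster
|         // single-pass implementation can be swapped in later
|         fn parse_with(t: &impl Tokenizer, src: &str) -> Vec<String> {
|             t.tokenize(src)
|         }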
| dec0dedab0de wrote:
| My problem is I'll catch myself playing code golf in the name
| of efficient expression, or writing mini frameworks when
| they'll never be used again.
|
| In general, I agree with your first draft idea. Get the thing
| working before you optimize for whatever is bothering you.
|
| Though, I also find that when I do waste time doing something
| that doesn't matter, I end up staying engaged longer than if I
| just stuck to the problem at hand. That is to say, if you're
| going to procrastinate it's better to procrastinate by writing
| code.
| todd3834 wrote:
| Over the years I've just learned to recognize that the "right
| way" is really about weighing trade-offs. Is this hot code that
| needs maximum efficiency? Or is it code that is going to be
| touched a lot, where readability and maintainability matter
| more? If the latter, the "right" way might be a less-optimized
| version, but given the trade-offs it is still the right way.
| nomel wrote:
| And, benchmarks always tell you if there's a problem and
| exactly where it is, and squash any unneeded discussion.
|
| I once had a new-to-the-group developer pointing out how
| things could be so much better if we optimized various code
| paths (I was them at one point in my career). I returned to the
| next meeting with some benchmarks showing that we spent a
| total of 100ms in these code paths, in an application that
| ran for hours.
| 6keZbCECT2uB wrote:
| "benchmarks always tell you if there's a problem and
| exactly where it is, and squash any unneeded discussion."
| Unfortunately, always is too strong of a word. Cache
| eviction in one part can cause memory stalls on another
| part, indirection in the caller can prevent speculation.
| Type erasure can prevent inlining, resulting in the called
| function being blamed for a problem in the caller.
|
| Your problem might not even be CPU: it could be contention
| related or timing related, overloaded queues, not pushing back
| at the right places, being I/O bound, or the bottleneck being
| work which is queued and executed elsewhere... Causal profiling
| is a technique which is relevant specifically because
| profiling can miss the forest for the trees:
| https://github.com/plasma-umass/coz
|
| It's really easy to write a benchmark which measures a
| different scenario from what your application is doing. A
| classic example might be benchmarking a hashmap in a loop
| when that hashmap is usually used when cold.
|
| I definitely agree about directing efforts to where you can
| make an impact and guiding that through measurement, but
| benchmarks can miss that there's a problem and blame the
| wrong part of the application.
|
| If the difference is large enough, ms vs hours, you'd have
| to really screw up methodology to get the wrong result
| (I've done it almost that badly before).
| stagger87 wrote:
| If the "right way" leads to
|
| - Wasting your time and your company's time.
|
| - More complex code which is harder to maintain in the long
| term.
|
| did it ever deserve to be called the "right way"?
| dan-robertson wrote:
| Well I think this code _is_ harder to maintain in the long
| term than a proper parser. For example you may want foo
| "bar"baz to parse as two symbols and a string, or (a.b) as a
| cons. It also looks scary to me because I need to think a bit
| to be convinced that it works. But looking at code for a
| parser is surely hard too, and I don't think any of those are
| strong enough arguments not to use the code above.
| nomel wrote:
| To me the "right way" is that it's not technical debt that
| will self-destruct in the near future. If it will in the far
| future, a comment with an explanation for that future person
| is plenty.
| mumblemumble wrote:
| Spending a month or two doing really scrupulous test-driven
| development, perhaps? I don't personally like TDD as an end in
| and of itself, but I love it as a sort of kata-like practice to
| help ingrain good habits.
|
| In this case, where I see it potentially helping is that the
| tests give you a very clear, objective way of deciding when the
| code is done, and it's time to stop futzing with it. Though you
| would have to be careful about remembering that the "refactor"
| phase is about refactoring for maintainability, not
| performance.
|
| I don't know what the tooling situation is like in Rust, but
| another one is to try and be really scrupulous about never
| making a change whose purpose is performance without first
| profiling the code, so that you can objectively measure the
| impact of the change. You don't need to do this forever,
| either. But, if your experience is anything like my own, I'm
| guessing that a lot of what motivates you is the feeling that
| you've done something of tangible value when you optimize the
| code. Actually seeing some hard numbers might change your
| feelings a bit. It's a lot easier to feel good about, "I spent
| an afternoon making the code more efficient," than it is to
| feel good about, "I spent an afternoon reducing the time it
| takes this program to execute from 50ms to 49ms."
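|
| Even a crude measurement harness makes that feeling concrete (a
| sketch; something like criterion is the better tool for real
| work):
|         use std::time::Instant;
|
|         fn time_it<F: FnMut()>(label: &str, mut f: F) {
|             let start = Instant::now();
|             f();
|             println!("{label}: {:?}", start.elapsed());
|         }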
| 6keZbCECT2uB wrote:
| "But, if your experience is anything like my own, I'm
| guessing that a lot of what motivates you is the feeling that
| you've done something of tangible value when you optimize the
| code."
|
| This is not at all how it is for me. The execution model of
| the code is core to how I read programs in languages where I
| am familiar with their execution model. So reading code that
| has dirty data or control flow, such as copying an entire
| array twice to change one character in two passes is as
| repulsive to me as maintaining and synchronizing two separate
| functions, one for each character swap.
| mumblemumble wrote:
| In that case, perhaps the best option is to lean into it
| and get onto a more Chuck Moore career trajectory?
|
| (By which I mean, there are some corners of the profession
| where this sort of mindset is very highly valued.
| Aerospace, some high frequency trading shops, embedded
| development. Nothing wrong with playing to your strengths.)
| thurn wrote:
| Very nice! I think there could definitely be a market for a nice
| scripting language that natively integrates with Rust and doesn't
| require crazy huge compile times, the way Lua is used in game
| engines.
| eointierney wrote:
| This is a great thread.
|
| A cousin of mine, a superb programmer, is C++ fluent. As in able
| to express any good idea in any other language in C++
| efficiently. He is deep in C++.
|
| He knows a poet. They are critical, as he told me, of his ability
| to understand poetry. When I suggested he show them the lambda
| calculus he smiled.
|
| We need more poetry in code as much as we need more proof in
| code.
|
| I know many of us program to build, quickly and efficiently,
| solutions to problems, and mainly problems defined in terms of
| business or regulatory requirements.
|
| Nonetheless we need more poetry in code. Terse, referential,
| contextualised, meaningful code that uses every suitable math
| technique for proofiness.
|
| We build our future with code. Let's make beautiful code.
|
| In a nutshell: write less, read more. Risp is great.
| brundolf wrote:
| One weird thing I noticed is that they use a Vec instead of a
| linked-list to store lists. This is going to really mess with
| anyone who tries to do a lot of car/cdr/cons-ing. Maybe that's
| out of scope for this guide.
| dcsommer wrote:
| There are some good thoughts concerning when to use a linked
| list here: https://rust-unofficial.github.io/too-many-lists/
| brundolf wrote:
| Your link talks about the general case. For better or worse,
| (most) Lisps give linked-list operations a special status as
| a fundamental primitive. For example, getting the tail of a
| list over and over is commonly used for iterating through its
| elements. If you try and do this with a Vec you'll be copying
| and re-allocating almost the whole thing on every loop
| iteration.
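|
| To make that concrete (a rough sketch, not the OP's types):
|         use std::rc::Rc;
|
|         // a minimal cons list: taking the tail is O(1) and the
|         // tail is shared, which is what cdr-walking code assumes
|         enum List {
|             Cons(i64, Rc<List>),
|             Nil,
|         }
|
|         fn cdr(list: &List) -> Option<Rc<List>> {
|             match list {
|                 List::Cons(_, tail) => Some(Rc::clone(tail)),
|                 List::Nil => None,
|             }
|         }
|
|         // "cdr" on a Vec-backed list builds a new Vec instead,
|         // copying n-1 elements on every call (assumes non-empty)
|         fn vec_cdr(list: &[i64]) -> Vec<i64> {
|             list[1..].to_vec()
|         }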
|
| Nothing about Lisp _requires_ that you write code this way of
| course, but it's the epitome of what could be called "Lispy"
| (literally: "list processing language"), and any preexisting
| code or habits will hit a brick wall if those lists are
| implemented as Vecs
| dgb23 wrote:
| To add, Clojure uses a sequence interface that is typically
| backed by a vector. Conj (instead of cons) is used to
| derive a new sequence with the thing added in front for
| lists (which are rarely used) and at the back for vectors.
| brundolf wrote:
| It supports both, as it should; lists are still first-class
| citizens, it's just that vectors are too.
|
| Though also in Clojure's case vectors aren't really
| vectors, they're immutable persistent data structures
| that share memory as much as possible, which I think
| would actually solve most of the performance problem
| here. But the same is not true of Rust's Vec<>
| nerpderp82 wrote:
| https://github.com/bodil/im-rs
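|
| Rough sketch of the selling point (assuming I remember im's
| Vector API correctly):
|         use im::Vector;
|
|         fn main() {
|             let mut a: Vector<i64> = Vector::new();
|             for i in 0..1000 {
|                 a.push_back(i);
|             }
|             let mut b = a.clone(); // cheap: shares structure
|             b.push_back(1000);     // `a` is unaffected
|             assert_eq!((a.len(), b.len()), (1000, 1001));
|         }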
| brundolf wrote:
| I'm solely commenting on the implementation in the OP,
| which uses a regular Vec<>
| tialaramex wrote:
| > you'll be copying and re-allocating almost the whole
| thing on every loop iteration.
|
| This is true if the list is writeable, but if so, surely a
| Lisp also has to keep duplicating the list or else it will
| get into trouble?
|
| For reading the list why shouldn't Rust use a _slice_ of
| the vector? The slice can't own anything, but that's OK,
| we aren't changing anything. The slice is very cheap, it's
| basically a pointer into the vector plus a length count.
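|
| Untested sketch of what I mean:
|         // walking a "list" without owning it: each tail is just
|         // pointer-plus-length arithmetic on the same Vec storage
|         fn sum_by_cdr(mut rest: &[i64]) -> i64 {
|             let mut total = 0;
|             while let [head, tail @ ..] = rest {
|                 total += *head;
|                 rest = tail; // no copy, no allocation
|             }
|             total
|         }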
| brundolf wrote:
| > surely a Lisp has to also keep duplicating the list or
| else it will get into trouble?
|
| Nope, it doesn't traditionally duplicate the list. It
| certainly is possible to get into trouble in your logic,
| but those are the presented semantics, and debating their
| virtue is out of scope
|
| > why shouldn't Rust use a slice of the vector?
|
| You're gonna get into ownership-hell if you can't give a
| separate Rc to each list tail, because those can get
| passed around wherever
| tangent128 wrote:
| The bytes[0] crate is a popular library in the Rust
| ecosystem for taking reference-counted slices of a shared
| byte buffer.
|
| [0]: https://docs.rs/bytes/1.0.1/bytes/struct.Bytes.html
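|
| Rough illustration (relying only on Bytes' documented basics):
|         use bytes::Bytes;
|
|         fn main() {
|             // slices are cheap, reference-counted views into one
|             // shared buffer
|             let buf = Bytes::from_static(b"(+ 1 (* 2 3))");
|             let tail = buf.slice(1..); // no copy, bumps a refcount
|             assert_eq!(&tail[..], &b"+ 1 (* 2 3))"[..]);
|         }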
| wizzwizz4 wrote:
| That's what Cow is for, surely?
| brundolf wrote:
| For the ownership thing? No, I don't think so. I haven't
| done much with Cow but my understanding is it doesn't do
| anything to help with cases where you have "multiple
| owners" of a value. It might allow you to use a slice up
| _until_ it needs to be cloned, but you'll still have to
| clone it at that point.
| tialaramex wrote:
| Unlike the slice, a Cow made from that slice can be
| Owned. Unlike reference counting the underlying Vector,
| the Cow won't duplicate the slice until you modify it.
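|
| Sketch of that shape (illustrative names, assumes a non-empty
| list):
|         use std::borrow::Cow;
|
|         // readers just borrow a view of the tail...
|         fn tail(list: &[i64]) -> Cow<'_, [i64]> {
|             Cow::Borrowed(&list[1..])
|         }
|
|         // ...and the first writer pays for exactly one clone
|         fn set_head(mut t: Cow<'_, [i64]>) -> Cow<'_, [i64]> {
|             t.to_mut()[0] = 42; // clones into a Vec only here
|             t
|         }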
|
| This forces you to confront the reality you'd been
| dodging. _Either_ you actually mutate this list in your
| program, and the "magic" of linked lists dissolves when
| it consumes all your memory, or as seems far more likely
| you get good performance from the better underlying data
| structure anyway and the "magic" of linked lists
| dissolves that way.
|
| You only get good performance from linked lists today on
| the rare occasion when their lack of data locality is
| outweighed by some other factor. Sprinkling the Lisp
| idiom over things doesn't change that.
|
| Here's an example where it's worth it: In highly
| concurrent systems you can't afford to use any sort of
| locking to protect data structures, the contention for
| the locks hurts too much, and you can't afford to
| reference count everything in those structures because
| even the contention on the reference counts also costs
| too much (everything looking at an item is storing to the
| reference count). So you use Hazard Pointers to avoid
| prematurely dropping anything. But any type of locking
| for your Hazard Pointers structure would have too much
| contention also, so you store the Hazard Pointers in a
| linked list, new ones can be slotted into place at the
| start of the list with an atomic compare-exchange. Each
| CPU core is writing to the Hazard Pointers it "owns" a
| lot, but they're deliberately too big to share with
| another CPU's cache, and any CPU cores that need to check
| the Hazard Pointers read from them all but never write so
| modern caches cope admirably.
| brundolf wrote:
| You're getting really into the weeds. All I'm saying is
| the OP will behave unexpectedly for people expecting
| "just a lisp" (a Scheme-like and/or CL-like lisp). It is
| not some kind of contract-preserving optimization over a
| naive linked-list (though those exist!).
|
| The way those languages chose to handle this stuff has
| well-trod advantages and disadvantages. I don't love it
| personally, but I get it. Certainly it's worked well
| enough for a whole lot of software!
|
| But I'm _not_ commenting on it here. I'm _just_
| stating the divergence with the OP. Critiquing Lisp's
| 40+ year history and what parts of it should or shouldn't
| have been different is out of scope as far as I'm
| concerned. You're arguing against something that isn't
| being stated.
| oh_sigh wrote:
| Does anyone actually program like that besides in lisp
| tutorials?
| klyrs wrote:
| Yes, linked lists are great for certain use cases. They make
| for very clear and performant code, especially if your
| language can special-case their memory management. In other
| cases, they're slow garbage, and that's where vectors shine.
| jhgb wrote:
| Of course people do that; it's very convenient.
| mumblemumble wrote:
| It is, but I do wonder if it's the most performant option.
| For grabbing individual elements, cadddr and friends are
| O(N) for linked lists while an O(1) implementation is
| possible for arrays.
|
| List eaters are an elegant design pattern, especially for
| picking apart recursive algorithms, but eating a linked list
| wouldn't be nearly as cache-friendly as map/reduce over an
| array. It wouldn't get rid of all the pointer chasing, but
| even reducing one level of it is worth something.
|
| Perhaps, with modern language implementation technology, we
| could produce something like that, while also preserving
| persistence (at the semantic level), by, in the background,
| choosing mutation when there's only a single active
| reference to the list, and falling back to a copy-on-write
| discipline when there are multiple concurrent references.
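|
| Rust's Rc already gives a small taste of that idea (a sketch,
| not a full persistent structure):
|         use std::rc::Rc;
|
|         // mutate in place when uniquely referenced, clone the
|         // Vec only if other Rc handles still point at it
|         fn push_front(list: &mut Rc<Vec<i64>>, x: i64) {
|             Rc::make_mut(list).insert(0, x);
|         }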
|
| For what it's worth, I find I never program that way in
| Clojure. I realize Clojure not being as list-oriented is a
| point of contention for many people, and I'm not all that
| much of a lisper, so perhaps my opinion isn't worth much,
| but I've generally found that Clojure gives me most of what
| I want. I am more annoyed by the lack of reader macros.
| Though I also understand and respect Rich Hickey's decision
| there.
| tialaramex wrote:
| In _theory_ a linked list has lots of attractive features as a
| data structure and when LISP Machines were a thing that theory
| translated into reality pretty well.
|
| Today losing data locality hurts really badly. Rust does
| provide a linked list, for the handful of cases where that
| really is what you wanted, but almost always even though _in
| theory_ a vector should be worse, in _practice_ it's better.
|
| For example if you build a toy system with a thousand small
| integers on the list, you might reason that removing the 500th
| via a vector instead of a linked list would be awful - you'd
| need to shuffle 500 of them forward compared to just swapping a
| pointer. But wait, on today's hardware the linked list approach
| involves about five hundred dependent _cache misses_. Before
| you find the 500th on the list to remove it, you have stalled
| the CPU for so long the vector implementation would have
| already finished doing its shuffling of 500 items forward in
| memory and moved on to something else.
|
| Now, if your list has 500 _million_ items in it, and you often
| want to remove just the first item while appending to the far
| end, a Vector is indeed a bad fit. But there are still better
| alternatives than a linked list. Rust provides a Deque, but
| unlike the one a Lisp-enthusiastic professor may have shown
| you in data structures class, it's actually a structure
| similar to a vector, not a linked list.
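|
| Sketch of the comparison (toy sizes, untimed):
|         use std::collections::VecDeque;
|
|         fn main() {
|             // Vec: removing the 500th element shuffles ~500
|             // contiguous values forward -- cache-friendly
|             let mut v: Vec<u32> = (0..1000).collect();
|             v.remove(500);
|
|             // "pop at the front, push at the back" is what
|             // VecDeque (a growable ring buffer) is for
|             let mut q: VecDeque<u32> = (0..1000).collect();
|             q.pop_front();
|             q.push_back(1000);
|         }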
| brundolf wrote:
| I'm not here to debate the virtues of linked-lists, I'm just
| pointing out that they're extremely idiomatic in Lisp, so
| it's weird to implement a basic Lisp that doesn't use them,
| because anybody running idiomatic code in it will be in for a
| big surprise.
| karmakaze wrote:
| Clojure already started the trend with renaming car/cdr and
| favouring vectors in places.
| nerpderp82 wrote:
| It would be interesting to have a List trait and a cons
| operation on that trait and one could swap out for different
| list implementations. A DebugList might keep track of
| operations on the list and recommend allocation hints or
| operations that could be sped up.
|
| One could absolutely use a Deque or some other optimized
| structure for a Lisp list; it really depends on how the code
| is written.
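|
| Rough sketch of that trait (made-up names):
|         trait List<T> {
|             fn cons(self, head: T) -> Self;
|             fn head(&self) -> Option<&T>;
|         }
|
|         // the Vec-backed implementation the article effectively
|         // uses; a linked or instrumented one could be swapped in
|         impl<T> List<T> for Vec<T> {
|             fn cons(mut self, head: T) -> Self {
|                 self.insert(0, head); // O(n) here, O(1) if linked
|                 self
|             }
|             fn head(&self) -> Option<&T> {
|                 self.first()
|             }
|         }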
| rjsw wrote:
| Lisp Machines could also implement lists using vectors [1].
|
| [1] https://en.wikipedia.org/wiki/CDR_coding
| fulafel wrote:
| How does this end up managing memory? I would guess things are
| just allocated from the Rust heap and not freed?
| brundolf wrote:
| Looks like they use Rc<>
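|
| Which means values are freed as soon as the last handle drops
| (reference cycles being the usual caveat), e.g.:
|         use std::rc::Rc;
|
|         fn main() {
|             let list = Rc::new(vec![1, 2, 3]);
|             let alias = Rc::clone(&list);
|             drop(list);                              // not freed yet
|             assert_eq!(Rc::strong_count(&alias), 1); // freed once
|                                                      // `alias` drops
|         }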
| skavi wrote:
| Previous discussion here:
| https://news.ycombinator.com/item?id=19810504
| C-x_C-f wrote:
| That title though :-P
|
| Wish they'd gone with something like (define Risp (in Rust Lisp))
| IMHO
| dgb23 wrote:
| Interestingly the title illustrates the big tradeoff in lisp
| notation. It doesn't read like an English sentence from left to
| right; it is a tree data structure. This has upsides and
| downsides.
|
| Rust, for example, reads much more like sequential sentences,
| with dot notation to chain methods and trait
| implementations and the like.
|
| Luckily, in Clojure for example, we use let bindings and
| threading macros (-> and ->>) to express the sequentiality of
| code more clearly. But it takes some time to get used to Lisp
| syntax in general.
| Zelphyr wrote:
| I applaud the author for choosing to use the "Risp" portmanteau
| over "Lust".
| akkartik wrote:
| Curious that it doesn't implement _quote_. Would the current
| approach have any issues with _quote_?
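|
| The usual approach would just be an extra match arm in eval --
| something like this sketch (made-up type names, roughly
| paralleling the article's enum):
|         #[derive(Clone, Debug)]
|         enum Exp {
|             Symbol(String),
|             Number(f64),
|             List(Vec<Exp>),
|         }
|
|         fn eval(exp: &Exp) -> Result<Exp, String> {
|             match exp {
|                 Exp::Number(_) => Ok(exp.clone()),
|                 Exp::List(forms) => match forms.as_slice() {
|                     // (quote x) evaluates to x, unevaluated
|                     [Exp::Symbol(s), quoted] if s.as_str() == "quote" => {
|                         Ok(quoted.clone())
|                     }
|                     _ => Err("other forms elided in this sketch".to_string()),
|                 },
|                 Exp::Symbol(_) => Err("env lookup elided".to_string()),
|             }
|         }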
___________________________________________________________________
(page generated 2021-07-12 23:01 UTC)