[HN Gopher] We rewrote a high-performance database in Rust
___________________________________________________________________
We rewrote a high-performance database in Rust
Author : gk1
Score : 90 points
Date : 2022-10-18 18:51 UTC (4 hours ago)
(HTM) web link (www.pinecone.io)
(TXT) w3m dump (www.pinecone.io)
| ayewo wrote:
| Rewriting in Rust is not a meme, it's a cycle.
|
| Before Rust became viable, rewrites were done in Go.
|
| From the archives:
|
| - Rewriting a large production system in Go
| https://news.ycombinator.com/item?id=6234736 (2013)
|
| - How We Moved Our API From Ruby to Go
| https://news.ycombinator.com/item?id=9693743 (2015)
|
| - Matrix and Riot Confirmed as the Basis for France's Secure
| Instant Messenger App
| https://news.ycombinator.com/item?id=16938545 (2018)
|
| - Toward Vagrant 3.0
| https://news.ycombinator.com/item?id=27476676 (2021)
|
| - I'm porting the TypeScript type checker tsc to Go
| https://news.ycombinator.com/item?id=30074414 (2022)
| pjmlp wrote:
| Which is why old timers eventually learn to just deliver with
| boring technology.
| riskable wrote:
| > Which is why old timers eventually learn to just deliver
| with ~~boring~~ _buggy_ technology.
|
| There's a reason why folks take the time to rewrite things in
| Rust. No matter how good you are at C/C++ you will encounter
| bugs that you would not have if you had written it in Rust.
| pjmlp wrote:
| Assuming there is even a Rust library replacement to start
| with.
|
| People keep forgetting C++ has 30 years of being deployed
| in production.
|
| Rust in 2022 is like using C++ in the 1990s in terms of
| ecosystem.
| darksaints wrote:
| The C++ IDEs available up until about 10 years ago were
| complete garbage. C++ _still doesn't even have a good
| package manager_. All the build systems are pure chaos.
| The largest C++ package manager has 1500 packages. In
| comparison, rust's package manager and build system are
| way easier to use and already have 94,000 packages
| available to users.
|
| That's not exactly fair to C++ because entire categories
| of dev tools (like build systems, package managers, IDEs,
| debuggers, version control, static analyzers, etc.) have
| matured after C++ did. And let's not forget that when C++
| was new, most libraries were proprietary licensed and
| paid for, whereas today almost all libs are open source.
| And those improvements (along with general size of the
| programming community) mean that a trendy language today
| is going to develop and mature a lot faster than a
| formerly trendy language did 30 years ago.
|
| IMO Rust in 2022 is a lot closer to java in 2005 than C++
| in the 1990s.
| pjmlp wrote:
| Another one that never used Borland, Apple, IBM IDEs.
|
| Where is the Rust IDE that is half as capable as C++
| Builder, MPW/Metrowerks, Visual Age, Zortech?
|
| Considering all features they offered across the board in
| the box, not only code completion.
| darksaints wrote:
| I've used Borland. I'm sure it was marvelous at the time,
| but it doesn't hold a candle to CLion or Visual Studio in
| the 2010s or later.
| pjmlp wrote:
| CLion now ships a cross platform C++ framework with it?
|
| As for Visual Studio, yeah it is great in all aspects,
| except having nothing else beyond MFC to offer on the GUI
| department, WinUI is still a mess after UWP.
|
| In any case your examples are for C++ IDEs, reinforcing
| my case of C++ tooling versus Rust.
| howinteresting wrote:
| This is just plain false. C++ in the 1990s had nothing
| like serde for example.
| jerf wrote:
| The minimum bar for a language has moved up significantly
| since the 1990s. It isn't enough to just have a neat new
| idea, you need to ship with nearly-best-of-breed JSON
| serialization, a web server, a huge standard library with
| not just strings but things like compression and a lot of
| networking, and a laundry list of other things (give or
| take a few things) just to make it to the "barely viable
| alternate choice" point.
|
| Nice as a language consumer, but a bummer that building
| new languages and getting some attention is so much
| harder than it used to be.
| marcosdumay wrote:
| Well, Rust checks all of those boxes.
|
| Yes, it's a problem for that language you plan on
| creating (try specializing into a niche). But it's not
| something that should impact Rust's adoption.
| pjmlp wrote:
| With various levels of completeness.
| rastignack wrote:
| Where is the production-grade, pure-Rust TLS library? Key-value
| store? LDAP client? SSH client?
| pitaj wrote:
| Aren't the most commonly used libraries for all of those
| written in C, not C++?
|
| Regardless, I'm surprised you haven't heard of rustls -
| https://github.com/rustls/rustls
| rastignack wrote:
| You have great c++ libraries for those. Not in pure rust
| though.
| jhgg wrote:
| > production grade and pure rust tls library
|
| You mean rustls? https://github.com/rustls/rustls
| rastignack wrote:
| Pure Rust? No.
| bogeholm wrote:
| You're right, I see some *.md-files in the GitHub repo
| howinteresting wrote:
| I haven't used them much, but sled, ldap3 and thrussh do
| exist. As Rust gains further in popularity I'd expect
| more of these to become production ready. Meanwhile
| there's always C and C++ interop.
| rastignack wrote:
| Sled and thrussh are not production grade. I don't
| particularly want to delve into details as I think the
| effort is laudable. I can explain my position in private
| if need be.
| howinteresting wrote:
| OK. I mean they're probably less mature, yeah. But high-
| quality Rust bindings to libssh2 and RocksDB do exist so
| _shrug_
| pjmlp wrote:
| And? A drop in the ocean of libraries.
| howinteresting wrote:
| And virtually no one should be starting new projects in
| C++ and everyone should switch to Rust.
| pjmlp wrote:
| Start by removing C++ from Rust compiler.
|
| Then go around to Khronos, NVidia, Microsoft, Sony,
| Nintendo, Unreal, Godot, ... and get them to support Rust in
| their SDKs.
| nicoburns wrote:
| I honestly feel like rust _is_ boring technology in most
| senses of the word. It "just works" more than almost any
| other technology that I've used. The ownership system is new
| and different, but that's really the only thing.
| lijogdfljk wrote:
| Honestly, so is Go. I used to use Go, and it's boring as
| hell. I hate it for a few choices they made, but they
| definitely achieved their goal. It is quite boring.
|
| I agree with you though, so is Rust. The less boring areas
| imo these days aren't languages (at least none i see), as
| all the good languages are boring. Zig for example, is
| pretty mundane too.
|
| The older i get the more i value confidence in a product.
| Confidence that it won't crash at runtime. Confidence that
| i won't be bugged over the weekend. etc
| notriddle wrote:
| Rust is not boring technology. There's too much ecosystem
| churn, and new language features are deployed too often.
|
| C++ isn't boring technology, either. If you just want to
| deliver value, I'd recommend Java.
| zozbot234 wrote:
| > There's too much ecosystem churn, and new language
| features are deployed too often.
|
| Not much of an issue if you stick to the stable subset of
| the language, and libraries that work within that subset.
| lijogdfljk wrote:
| > There's too much ecosystem churn, and new language
| features are deployed too often.
|
| That kinda feels like saying Linux is too crazy because
| new apps get made for Linux frequently.
|
| You can use the same part of the language tomorrow that
| you used today. Nothing is changing out from under you.
| If you're afraid of libraries, don't use them. You'd have
| the same problem in any ecosystem that is new, no?
| notriddle wrote:
| > That kinda feels like saying Linux is too crazy because
| new apps get made for Linux frequently.
|
| Apps are okay, but other parts of userland that roll out
| breaking changes on a regular basis are definitely a
| problem [1] [2] [3]. Even if they aren't technically part
| of the kernel, they are usually used with it to provide a
| complete working system, and they break stuff all the
| time.
|
| [1]: https://lwn.net/Articles/904892/
|
| [2]: https://lwn.net/Articles/840430/
|
| [3]: https://lwn.net/Articles/777595/
| howinteresting wrote:
| I've led and been on teams that have written multiple
| production-grade Rust services that have together
| delivered 100MM+ USD of value. The number of production
| bugs has been in the single digits, with exactly one
| outage that lasted more than a few minutes in the last 3
| years. How about yourself?
|
| In my experience, Rust delivers by far the fewest number
| of bugs in production out of any mainstream language. It
| gets the fundamentals right like nothing before it. &,
| &mut, Send and Sync take care of many classes of bugs in
| the inner loop of productivity.
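|
| A minimal sketch of the Send/Sync point, assuming a toy shared
| counter (the function name `parallel_count` is made up for
| illustration). The compiler only accepts the spawned closure because
| `Arc<Mutex<u64>>` is safe to send and share across threads:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawn `threads` workers that each add 1 to a
/// shared counter `per_thread` times, then return the final total.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Arc gives shared ownership across threads; Mutex gives exclusive access.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            // `move` transfers this clone of the Arc into the new thread.
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the value first so the lock guard is released before `total` drops.
    let result = *total.lock().unwrap();
    result
}
```

| Swapping `Arc` for `Rc` here is rejected at compile time, because
| `Rc` is not `Send`; so is mutating the counter without the `Mutex`.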
| doliveira wrote:
| With my (admittedly limited) experience with the Hadoop
| ecosystem, I'd sincerely beg for people to stop writing
| databases in Java... Apart from the way bigger system
| requirements, dependency version hell, having to monitor
| GC pauses is just so, so annoying
| avgcorrection wrote:
| The stable release of Go was maybe four years or so before
| Rust. So what you're saying seems to be that people like to
| rewrite their tech in young and hyped (for good or bad or
| neutral) languages. Because there is little connection between
| Rust and Go (other than chronology).
| sbdivuvu wrote:
| [deleted]
| einpoklum wrote:
| The authors of this post rewrote _their own_ DBMS in Rust. Which
| is perfectly ok, but I'm not sure I would trust them to decide
| that theirs is a "high-performance" DBMS. They don't have any
| benchmark results except images of their own internal performance
| measures; they don't offer any way of comparing their performance
| with other DBMSes (e.g. Vectorwise/Actian Vector, ClickHouse,
| DuckDB etc. - not to mention Oracle, MS or SAP offerings); and
| they only have marketing blurb about their numbers: "Up to 10x
| performance" (with no baseline of course).
|
| So, they took some DBMS (which is probably not so hot in terms of
| performance) and rewrote it in Rust. Surely possible, possibly
| useful, but not much to write home about if one is interested in
| DBMS performance.
| 22SAS wrote:
| Glanced through the article, and I see no comparisons on how
| performance of the DB is in Rust versus their current C++
| implementation, no mention of if maintaining the Rust code is
| easier than their C++ codebase, no stats on how devs are ramping
| up and how it's tackling their "hard to find a dev who knows both
| C++ and Python well" issue.
| jeroenhd wrote:
| The next paragraph they state:
|
| > We looked at and compared several languages - Go, Java, C++, and
| > Rust. We knew that C++ was harder to scale and maintain high
| > quality as you build a dev team; that Java doesn't provide the
| > flexibility and systems programming language we needed; and that
| > Go is also a garbage collected language. This left us with Rust.
| > With Rust, the pros around performance, memory management, and
| > ease of use outweighed the cons of it not yet being a very
| > established language.
|
| In other words, they wanted to unify the programming languages
| and evaluated several. Rust won out of those for performance
| reasons.
|
| The article is a short recap of a 40 minute video. The video
| has more context and explains the intentions much better than
| the web page.
|
| They show a graph of performance over time as the rewrite
| progressed. There were some small optimisations and problems, a
| few big regressions, and then a huge improvement that was
| maintained. Looks like the rewrite process made the database
| perform significantly better. There's nothing on how much this
| was caused by the language switch itself, but that's
| functionally impossible: nobody is rewriting their application
| twice to see what rewrite is better.
| lijogdfljk wrote:
| > but that's functionally impossible: nobody is rewriting
| their application twice to see what rewrite is better.
|
| Agreed, and hypothetically the 2nd rewrite should _still_ be
| better than the first. So the language would have to make it
| significantly worse to outweigh the experience gained from
| improving things yet again.
|
| To be clear though i'm not stating that every rewrite is
| assured to be better. However a carefully considered rewrite
| has a much easier time making decisions learned from any
| warts discovered in previous implementations. God knows
| there's always _some_ warts.
|
| As a Rust fanatic, i wouldn't expect the performance gains to be
| due to Rust itself. It's not expected to be _faster_ than C/C++
| typically. Just comparable.
| dxhdr wrote:
| Article also states that the switch from C++ to Rust improves
| "low level optimized instruction sets, memory layout, and
| running async tasks."
|
| The first two are also strengths of C++, and for the third the
| article says that "Rust is async, and Tokio is the one of the
| most popular async providers ... However, it's not great for
| running CPU intensive workloads, like with Pinecone." Puzzling.
| rwaksmunski wrote:
| I've had more luck with async-std over Tokio for more CPU
| intensive workloads. But then again, I ran it on a kqueue
| platform so my experience is probably not representative.
| pclmulqdq wrote:
| My past experience with Rust async code is that both async-
| std and Tokio are fairly unimpressive on performance (as
| async code goes), particularly if you compare to ScyllaDB's
| runtime or other similar C++ async runtimes.
| kaladin_1 wrote:
| I have no problem with people rewriting their projects in
| whatever language they see fit.
|
| What stood out for me in the article is him saying that it's
| difficult getting developers with experience in both Python and
| C++.
|
| So, I wonder, if his in-house devs could pick up Rust that they
| previously couldn't write, why does he think he cannot hire a
| good programmer and charge him to learn the stack the company
| uses? Why must they employ someone that already writes Python or
| C++?
|
| Is Rust such a straightforward language that people new to the
| language can write a very performant programme?
| pornel wrote:
| Despite Rust's steep learning curve, it's also paradoxically
| easy to add novice Rust programmers to a project.
|
| This is because inexperienced Rust programmers are relatively
| harmless. Noob mistakes won't compile, rather than running into
| dangerous gotchas. You can tell noobs not to use `unsafe` (and
| there are ways to enforce that), and mostly they'll just write
| inefficient or non-idiomatic code, but the code will be free
| from data races and memory corruption.
|
| The strictness of the Rust compiler is quite the opposite of
| something like the C++ Core Guidelines where the majority of
| the rules aren't enforced by the compiler, and have to be in
| the programmers' head first.
|
| Noobs make lifetime errors and fight the Rust compiler, but
| imagine working with a compiler that _doesn't_ tell you when
| you have lifetime errors.
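|
| A small sketch of the kind of noob mistake that won't compile,
| assuming a made-up helper `first_then_push`. The commented-out
| version keeps a reference into a `Vec` while pushing to it, which in
| C++ could leave a dangling reference after reallocation:

```rust
/// Hypothetical helper: return the first element (if any), then append `extra`.
fn first_then_push(v: &mut Vec<i32>, extra: i32) -> Option<i32> {
    // Rejected by the borrow checker (E0502): `first` would still borrow
    // the vector while `push` needs exclusive access and may reallocate.
    //
    //     let first = v.first();
    //     v.push(extra);
    //     first.copied()

    // Accepted: copy the value out, so no borrow is alive during the push.
    let first = v.first().copied();
    v.push(extra);
    first
}
```

| The fix is often what a beginner stumbles into anyway: copy or clone
| the value, which may be less efficient but cannot dangle.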
| krona wrote:
| Memory sanitizers, address sanitizers, leak sanitizers,
| threading sanitizers, undefined behaviour sanitizers. The
| visual studio core guidelines checker. The clang-tidy core
| guideline checker. I could go on but my point is, the
| landscape does not really look like how you've painted it.
| atoav wrote:
| All of these tools and we still have exploitable buffer
| overflows in 2022. So either the tools are not working,
| people are not using them or they are using them but can
| simply ignore critical warnings.
|
| Your mileage may vary, but I think Rust offers a well
| considered step in the right direction. Stupid and
| dangerous code of the kind _every_ developer will produce
| once in a while just won't compile in Rust. You cannot
| forget to run a check, you can't hide behind not knowing a
| tool. You can't ignore the warning if you want a running
| program.
|
| That is not nothing.
| abc_lisper wrote:
| Yeah, but Rust guarantees sanity by design. Sanitizers are
| a patch, and hence not comprehensive.
| lijogdfljk wrote:
| They painted it like reality, though, no?
|
| _You_ seem to paint the landscape as full of tools and
| imply that they're used. Either they're insufficient or
| they're often underutilized, simply due to the number of
| bugs we see. No?
| pornel wrote:
| I know about these, but there is a marked difference
| between Rust and these tools.
|
| Static analysis tools have a much harder job analyzing C++
| (aliasing and escape analysis are way harder, and static
| analysis of thread-safety is basically impossible due to
| lack of thread-safety info in the type system). The results
| are a trade-off between being sparse or having false
| positives.
|
| The sanitizers only catch issues they can observe at run
| time, and that relies on having sufficient test and fuzz
| coverage. Some data races are incredibly hard to reproduce,
| and might depend on a timing difference that won't happen
| in your test harness.
|
| OTOH Rust proves absence of these issues by construction,
| at compile time.
|
| It's like a difference between dynamically-typed and
| statically-typed languages. Sure, you can fuzz type errors
| out of JS or Python, but in statically-typed languages such
| errors are eliminated entirely at compile time. Rust
| extends this experience to more classes of errors.
| djwatson24 wrote:
| > and that relies on having sufficient test and fuzz
| coverage
|
| At the faang I worked at, some small portion of servers
| ran the sanitizers in prod, so you're not reliant on test
| coverage nearly so much for catching rare issues.
| krona wrote:
| _The results are a trade-off between sparse or having
| false positives._
|
| Rust just takes the other side of the trade-off, and will
| reject valid programs. Hence why the unsafe keyword
| exists, and why tools like Miri
| (https://github.com/rust-lang/miri) exist specifically for rust.
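|
| A sketch of a valid program the borrow checker rejects, assuming a
| made-up helper `pair_mut`. Taking two `&mut` to different elements
| of one slice is sound, but the compiler cannot prove the indices
| differ; the standard library's `split_at_mut` (itself built on
| `unsafe`) offers a safe way to express it:

```rust
/// Hypothetical helper: mutable references to two distinct elements.
/// Requires `i < j < v.len()`.
fn pair_mut<T>(v: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    // Rejected (E0499), even though i != j makes it sound:
    //
    //     let a = &mut v[i];
    //     let b = &mut v[j];
    //     (a, b)

    assert!(i < j && j < v.len(), "need i < j < len");
    // Split into two non-overlapping halves: v[..j] and v[j..].
    let (left, right) = v.split_at_mut(j);
    (&mut left[i], &mut right[0])
}
```

| Usage would look like `let (a, b) = pair_mut(&mut v, 0, 3);
| std::mem::swap(a, b);`.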
| timeon wrote:
| > Rust just takes the other side of the trade-off, and
| will reject valid programs.
|
| Are we still talking about the ease of adding novice Rust
| programmers to a project?
| dymk wrote:
| It takes longer to learn how to use C++ to the same level of
| proficiency and correctness compared to Rust, in my experience.
| It's harder to write an incorrect program in Rust.
| lupire wrote:
| What are the main correctness risks in C++ if you just never
| use a raw pointer?
| bcrosby95 wrote:
| One thing off the top of my head, from experience:
|
| std::string s(s);
|
| To be fair, compilers will warn you about this nowadays.
| But when I converted a C codebase to C++ 20 years ago they
| didn't.
|
| IIRC, references can also refer to de-allocated memory.
| Also, if you don't pass-by-reference or pointer, you can
| literally "slice" the dynamic doohickies off your instance
| so your AlbinoCat behaves like a Cat because all that extra
| special stuff is gone as far as the function is concerned.
|
| This is just off the top of my head after not working with
| C++ for 20 years. I'm sure with all the new features it's
| gained over the past 20 years there are whole new exciting
| ways to blow your leg off.
| steveklabnik wrote:
| I don't know about "main", but like, you don't need raw
| pointers to have UB. unique_ptr is nullptr after you move it.
|
| And even then, my understanding is that raw pointers are
| still intended to be used in Modern C++: they're there for
| when you don't want to transfer ownership.
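|
| For contrast, a rough sketch of how the same situation looks in
| Rust (the names `consume` and `move_demo` are made up). Using a
| moved-from `Box` is a compile error rather than a null dereference,
| and an "emptied" owner has to be modelled explicitly with `Option`:

```rust
/// Hypothetical function that takes ownership of the box.
fn consume(b: Box<String>) -> usize {
    b.len()
}

/// Returns the combined length of both strings and whether the slot is empty.
fn move_demo() -> (usize, bool) {
    let owner = Box::new(String::from("hello"));
    let n = consume(owner); // ownership moves into `consume`

    // Rejected at compile time (E0382: use of moved value):
    //
    //     let again = owner.len();

    // An owner that may be emptied is spelled out as Option<Box<_>>.
    let mut slot: Option<Box<String>> = Some(Box::new(String::from("hi")));
    let taken = slot.take(); // leaves None behind, visibly in the type
    let m = taken.map_or(0, |b| b.len());

    (n + m, slot.is_none())
}
```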
| mamcx wrote:
| Well...
|
|     UNSAFE {
|       // TODO: Verify all the lines, all the time, are ok
|       // Just like you do testing, documentation, security and all that
|       // ok?
|       #include <iostream>
|       using namespace std;
|       int main() {
|         // YOUR CODE
|       }
|     }
| lupire wrote:
| What are you saying?
| mamcx wrote:
| Well, if the question is:
|
| > What are the main correctness risks in C++ if you just
| never use a raw pointer?
|
| All code in C/C++ IS a correctness "risk". Only
| constant, manual inspection could (maybe) say otherwise.
|
| What Rust gives is significant reduction of the risks.
| ekidd wrote:
| I've written quite a bit of production code in C++, Python and
| Rust, and currently work on a hybrid Rust/Python system. Here's
| my experience:
|
| - C++ is an unusually large language. And it has many historic
| footguns, requiring a higher level of vigilance and code
| review. If I were starting a brand new project today, I
| wouldn't try to build a team of C++ programmers.
|
| - Untyped Python becomes more difficult to refactor and
| maintain once you reach 50k to 80k lines on a group project.
| Typed Python, however, scales nicely beyond this size.
|
| - Rust is a "medium-sized" language. It requires developers to
| learn more than Go or Python does, but less than C++. And Rust
| has far fewer traps for the unwary and the reckless than C++.
| Rust's tooling is also very good in many areas.
|
| - It's tempting to split a project into a fast "core" language,
| and high-level "glue" language. There are real advantages to
| this. (Which is why I've done it on one recent project!) But
| this also comes with costs: everyone needs to be fairly good at
| two languages, and switch back and forth. And you pay a tax at
| the boundary.
|
| If I were building a brand new database (and a team to maintain
| it), I'd actually be strongly tempted to use Rust exclusively.
| But this is partly because databases rarely have a "business
| logic" layer that changes constantly, so there's less need for
| a high level scripting language.
|
| But with a different team or different constraints, C++ could
| also be the right choice.
| hutzlibu wrote:
| "But this also comes with costs: everyone needs to be fairly
| good at two languages, and switch back and forth."
|
| Why does everyone need to be good at both languages? You can
| separate and have the core people writing efficient low level
| code - and you have higher level scripting/gluing code.
| pornel wrote:
| You will have Conway's law in your codebase. Coordination
| between teams is hard, so teams will prefer to implement
| features entirely in their language, even where that is
| technically suboptimal.
|
| You will get hot loops in Python, because a Rust programmer
| wasn't around, and Rust programmers implementing whole
| complex business logic in Rust behind a single `do_it()`
| Python call.
| hutzlibu wrote:
| Communication and coordination is surely hard and things
| like that surely happen, but this is why project
| management exist.
|
| If it is doing things right, then the rust people don't
| do complex business logic, because it is not assigned to
| them and they would not even have the details.
|
| And if the python people were too eager and have core
| stuff implemented and it is affecting performance, then
| you can always reimplement it low level.
|
| It all depends on the project of course, of what would be
| the best mix.
| lupire wrote:
| A brand new project doesn't need legacy C++ footguns. It can
| use modern C++.
|
| The part of Python (usually) is that you don't _need_ to be
| "good at it" if you aren't trying to write a super polymorphic
| core that runs super efficient computations like scipy. If
| you have a fast core engine for the inner loop, a slow
| Python management layer is plenty fast.
| dkarl wrote:
| > Typed Python, however, scales nicely beyond this size.
|
| Could you say more about what tools and practices make this
| possible, beyond simply adding type annotations in your code?
| Asking for a friend.
|
| > It's tempting to split a project into a fast "core"
| language, and high-level "glue" language.
|
| I did this with C++ and Boost Python back in the day and
| loved the experience. I wonder if Rust will someday get a
| high-level language for writing applications and scripts on
| top of a Rust codebase, like Boost Python for C++ or Tcl for
| C.
| heavyset_go wrote:
| > _Could you say more about what tools and practices make
| this possible, beyond simply adding type annotations in
| your code? Asking for a friend._
|
| Python with type annotations works really well with type
| checkers like Mypy, along with LSP servers, and both of
| those integrate with most development environments.
|
| Using a Python-oriented IDE like Pycharm with type
| annotated Python also allows for better refactoring
| options. It reduces the uncertainty and guesswork an IDE's
| static analyzer must engage in for even basic features
| you'd take for granted with IDEs and statically typed
| languages.
|
| In practice, developers don't have to keep what can be a
| massively complex application running in their heads to
| modify code accurately. A nicely typed project makes it
| easy to see exactly what types of data are being passed around
| and modified. Before gradual typing, you'd have to
| backtrack to all of a function's call sites to understand
| exactly what kind of data it takes and returns. With
| gradual typing, you can just look at types and rely on Mypy
| to ensure the right data is actually being shuffled around.
|
| > _I did this with C++ and Boost Python back in the day and
| loved the experience. I wonder if Rust will someday get a
| high-level language for writing applications and scripts on
| top of a Rust codebase, like Boost Python for C++ or Tcl
| for C._
|
| I haven't used Boost Python, but there are some options for
| Rust and Python that work well and seem to suit this use
| case like PyO3.
| exceptione wrote:
| Especially business logic should be taken in the firm grip of
| static compile time guarantees that the hand of a strong type
| system delivers. Even more so if it changes constantly!
| Refactoring without fear.
|
| Only software that does not have to run correctly
| (prototypes, personal hobby projects) can get away with a
| non-static type system.
|
| When I have to pick a tool and I see it is written in Python
| I will have a look for an alternative if possible. Because I
| know it will have many bugs: some known, lots hidden.
| MisterTea wrote:
| > What stood out for me in the article is him saying that it's
| difficult getting developers with experience in both Python and
| C++.
|
| More like they had difficulty finding cheap experienced
| c++/python devs.
| arriu wrote:
| I agree, rust has a difficult learning curve. I've often heard
| at least a year is required to really feel confident.
| gotts wrote:
| 6 months is what I heard. I'm currently at ~2 and 6 sounds
| like a pretty good estimate.
| heavyset_go wrote:
| It takes a relatively short time to be proficient enough to
| make useful contributions, maybe half a year to a year to be
| confident. You can give an experienced developer the Rust
| book and have them contributing to a Rust codebase quickly.
| heavyset_go wrote:
| The footgun-to-appropriate-feature ratio is higher with C++
| than Rust. Rust also has some excellent Python integration
| options that are relatively easy to use.
| smitty1e wrote:
| What is a vector database?
|
| https://www.pinecone.io/learn/vector-database/
|
| ...was less than informative.
| jiggawatts wrote:
| Standard row-oriented databases store columns on disk like so:
| ABCABCABCABC
|
| Vector databases store them like this:
| AAAABBBBCCCC
|
| This allows faster queries if you just need one (or a few)
| columns, because unrelated columns don't have to be processed
| at all. Caches are more efficient, vector CPU instructions can
| be used, etc...
|
| The downside is that random single row access is more expensive
| because a row has to be reassembled from many locations.
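|
| The two layouts above (the second is usually called column-oriented,
| or struct-of-arrays) can be sketched like this. The `Row` and
| `Columns` types and the three `u32` fields are assumptions made for
| illustration, not anything from the article:

```rust
/// Row-oriented: each record keeps its fields together (ABCABC...).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Row {
    a: u32,
    b: u32,
    c: u32,
}

/// Column-oriented: each field lives in its own contiguous array (AAA.. BBB.. CCC..).
struct Columns {
    a: Vec<u32>,
    b: Vec<u32>,
    c: Vec<u32>,
}

fn to_columns(rows: &[Row]) -> Columns {
    Columns {
        a: rows.iter().map(|r| r.a).collect(),
        b: rows.iter().map(|r| r.b).collect(),
        c: rows.iter().map(|r| r.c).collect(),
    }
}

/// Summing one column over rows walks past the unrelated `b` and `c` fields.
fn sum_a_rows(rows: &[Row]) -> u64 {
    rows.iter().map(|r| u64::from(r.a)).sum()
}

/// Summing the same column here touches only the `a` array.
fn sum_a_columns(cols: &Columns) -> u64 {
    cols.a.iter().map(|&x| u64::from(x)).sum()
}

/// Single-row access has to gather one value from each column.
fn row_at(cols: &Columns, i: usize) -> Row {
    Row { a: cols.a[i], b: cols.b[i], c: cols.c[i] }
}
```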
| [deleted]
| makmanalp wrote:
| > If you're using a higher level language, you're not going to
| have access to how the memory is laid out. A simple change, like
| removing indirection in our list, was an order of magnitude
| improvement in our latencies since there's memory prefetching in
| the compiler and the CPU can anticipate which vectors are going
| to be loaded next in order to improve the memory footprint.
|
| This is a common experience and I'm still surprised by the choice
| I constantly see to use managed-memory languages to build a
| database - one of a very small set of special cases where having
| full control over the memory layout might just be a reasonable
| thing to want. In this universe (absent doing something
| completely absurd) it's not algorithmic complexity but managing
| data locality in the cache hierarchy (e.g. reading things from L3
| vs main memory vs disk) that makes things orders of magnitude
| faster, especially if you're in the realm of doing things like
| SIMD operations to speed things up.
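|
| A rough sketch of what "removing indirection" can mean, under the
| assumption that the vectors all share one dimension. `FlatVectors`,
| `from_nested` and `best_match` are hypothetical names, not the
| article's code. Instead of a list of separately allocated vectors,
| everything sits in one contiguous buffer that is scanned linearly:

```rust
/// All vectors stored back to back in one allocation, `dim` floats each.
struct FlatVectors {
    dim: usize,
    data: Vec<f32>,
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

impl FlatVectors {
    /// Copy a pointer-chasing `Vec<Vec<f32>>` layout into one flat buffer.
    fn from_nested(nested: &[Vec<f32>]) -> Self {
        let dim = nested.first().map_or(0, |v| v.len());
        let mut data = Vec::with_capacity(dim * nested.len());
        for v in nested {
            assert_eq!(v.len(), dim, "all vectors must share one dimension");
            data.extend_from_slice(v);
        }
        FlatVectors { dim, data }
    }

    /// The i-th vector is a slice at a computed offset, with no extra pointer hop.
    fn get(&self, i: usize) -> &[f32] {
        &self.data[i * self.dim..(i + 1) * self.dim]
    }

    /// Index of the stored vector with the largest dot product against `query`.
    fn best_match(&self, query: &[f32]) -> Option<usize> {
        if self.dim == 0 || query.len() != self.dim {
            return None;
        }
        let mut best: Option<(usize, f32)> = None;
        // Sequential walk over contiguous memory, friendly to prefetching.
        for (i, v) in self.data.chunks_exact(self.dim).enumerate() {
            let score = dot(v, query);
            if best.map_or(true, |(_, s)| score > s) {
                best = Some((i, score));
            }
        }
        best.map(|(i, _)| i)
    }
}
```

| Whether this gives an order-of-magnitude win depends on the
| workload; the sketch only shows the layout change, not a benchmark.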
|
| Perhaps there's some level of suck we're willing to tolerate for
| all the other benefits you get, but I've been noticing a pattern
| of "align things just so at the higher level and hope they mostly
| turn out the way you want at the lower level" (e.g. also with the
| Apache java-y databases like hadoop / hbase / cassandra which I
| guess were mostly supposed to derive their total throughput from
| massive scale rather than per-node performance) which is a bit
| funny.
|
| But also it seems like part of Rust's promise was "low level but
| make it high level" which seems to be succeeding (zero-cost
| abstractions and whatnot), so I imagine this will get better over
| time - having not attempted a project like this myself, I'm not
| sure what the limitations you'd run up against are in terms of
| laying things out in memory in a favorable way - I imagine the
| kind of massive manually managed arena allocations and ad-hoc
| pointers going everywhere that one normally does doesn't really
| fly.
| pjmlp wrote:
| Because it is a fake dichotomy.
|
| D, Nim, C#, Swift, not to count all of those that existed since
| Xerox PARC days.
| lesuorac wrote:
| > As you can see in the above graph, a commit was merged that
| caused a huge spike. However, with Criterion, an open source
| benchmarking tool, we were easily able to identify it, mitigate
| it, and push a fix.
|
| Wonder what the commit was that caused a more than 2x regression
| and got a fix instead of an undo.
| menaerus wrote:
| sbdivuvu wrote:
| TehCorwiz wrote:
| I care.
|
| Programming languages give us different frameworks and
| guardrails to express computational tasks similarly to how
| written and spoken languages give us a different set of
| concepts with which to express ideas. New languages mean a
| potentially different way of thinking about a problem. Some
| ideas which are difficult to express in one language are
| trivial in another.
|
| Discovering these differences is one of the joys of language
| learning. Language learning requires practice, and rewriting a
| known work (or translating it you might say) is a great way to
| deepen your understanding and test which ideas are easier or
| harder.
| [deleted]
| lupire wrote:
| I liked the part where they said Python is too slow because it's
| garbage collected, and didn't show any metrics, and then built a
| new solution in Rust and didn't show metrics to compare to the
| original system.
|
| Makes me think the eng lead just wanted to do Rust, and made up a
| rationalization.
| viig99 wrote:
| Same, "We knew that C++ was harder to scale and maintain high
| quality as you build a dev team" this just sounds arbitrary and
| a weak excuse to use rust, C++20 is as scalable as rust with a
| very rich ecosystem.
| cercatrova wrote:
| Well, we already know Python is inherently slower than Rust or
| any compiled language really, so does one really need metrics
| to know that the Rust implementation was faster?
| codespin wrote:
| a perf change without perf numbers is a bug
| ketralnis wrote:
| Yes. If your bottleneck is magnetic disc seeking times, no
| amount of language change is going to move the needle (hah!).
| rs_rs_rs_rs_rs wrote:
| >In addition, it's challenging to find developers with experience
| in both Python and C++
|
| So you decided on a language that makes it even harder to find
| experienced developers?
| masklinn wrote:
| Anecdotally, a lot of rust-curious people seem to know python.
| Projects like pyo3 help a lot as they make it much easier (=
| safe) to build native modules compared to C, let alone C++.
| pclmulqdq wrote:
| Rust is seen as more approachable by Javascript and Python
| devs, so they tend to learn it more often than C or C++.
|
| It is a lot more similar to JS than C++ is.
| 8jy89hui wrote:
| > It is a lot more similar to JS than C++ is.
|
| That is strange as I have experienced the opposite. I've
| written all three languages and I've noticed that JS
| patterns don't translate well to Rust. Many C++ patterns
| translate well to Rust (albeit after a bit of borrow
| checker fighting).
|
| Thoughts?
| smilekzs wrote:
| Rust iterators give you JS vibes, with gotchas mostly
| related to lambda capture lifetimes. Once you accept
| that sometimes a `collect` is the easiest way out, it
| feels okay at the end of the day.
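To make the iterator point above concrete, here is a minimal sketch in Rust. The function name `long_word_lengths` and the sample data are made up for illustration. The chain reads much like JS `map`/`filter`, but it is lazy and borrows `words`; calling `collect` produces an owned `Vec`, which is usually the simplest way to end that borrow.

```rust
// Returns the lengths of all words longer than `min_len`.
// The iterator chain is lazy and borrows `words`; `collect` turns it
// into an owned Vec so the borrow ends when the function returns.
fn long_word_lengths(words: &[String], min_len: usize) -> Vec<usize> {
    words
        .iter()
        .map(|w| w.len())
        .filter(|&n| n > min_len) // this closure captures `min_len` by reference
        .collect()
}

fn main() {
    let mut words = vec![
        "rust".to_string(),
        "javascript".to_string(),
        "go".to_string(),
    ];
    let lengths = long_word_lengths(&words, 3);
    // `lengths` owns its data, so `words` can be mutated again here.
    words.push("python".to_string());
    println!("{:?} from {} words", lengths, words.len());
}
```

If the uncollected iterator were kept in a variable and used after `words.push(...)`, the borrow checker would reject the program, which is the kind of capture and lifetime gotcha the comment refers to.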
| goodpoint wrote:
| mistrial9 wrote:
| it is arguable that C++ in the modern days is no longer "one
| language" due to style, libraries, language features and code-
| base legacy; you have to find a coder that will fit your C++
| world, not just C++
| pjmlp wrote:
| Just like it will happen to Rust when it achieves 30 years of
| history, getting features every six weeks.
|
| How many epochs will exist in 30 years?
| Thaxll wrote:
| fwip wrote:
| > First of all, Python is a garbage collected language, which
| means it can be extremely slow for writing anything high
| performance at scale.
|
| I don't think garbage collection is in the top 3 causes of why
| Python is slow.
| 0x457 wrote:
| > First of all, Python is a garbage, which means it can be
| extremely slow for writing anything high performance at scale.
|
| Fixed it.
| pjmlp wrote:
| Thankfully Fintech and military weapon control systems aren't
| high performance.
| mattnewton wrote:
| This might be a difference of semantics- there is a difference
| between garbage collection as a concept being slow and python's
| GIL approach. My understanding is that the GIL would almost
| always make the top 3 reasons why Python is slow in practice
| - it works for a very specific single threaded execution model
| but can't really take advantage of modern processors.
| slt2021 wrote:
| I think GIL is not the reason for slowness, it just specifies
| single threaded interpreter execution model. You can always
| spin up more interpreters to take advantage of multiple
| cores.
|
| The reason for slowness - is the weak dynamic type system of
| Python.
|
| Every single instruction needs to be type checked at runtime,
| thus making everything slow.
|
| Compare to C#/Java, which have GC but are both amazingly fast
| because these languages have stricter type systems. Adding a
| JIT on top (which can selectively replace MSIL/Java opcodes
| with native machine instructions) brings perf on par with
| natively compiled languages like C++.
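The per-operation overhead claimed above can be sketched in Rust. This is an illustration only, not a description of how CPython is implemented: the `Value` enum, `dynamic_add`, and `static_add` are hypothetical names. The dynamic version inspects a runtime tag on both operands for every addition, while the static version compiles to a plain integer add.

```rust
// A hypothetical dynamically typed value: each value carries a runtime tag.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
    Str(String),
}

// Dynamic add: the tags of both operands are inspected on every call.
fn dynamic_add(a: &Value, b: &Value) -> Result<Value, String> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x + y)),
        (Value::Float(x), Value::Float(y)) => Ok(Value::Float(x + y)),
        (Value::Int(x), Value::Float(y)) | (Value::Float(y), Value::Int(x)) => {
            Ok(Value::Float(*x as f64 + y))
        }
        (Value::Str(x), Value::Str(y)) => Ok(Value::Str(format!("{x}{y}"))),
        // Mixing a number and a string is an error, not a silent coercion.
        _ => Err(format!("unsupported operand types: {a:?} and {b:?}")),
    }
}

// Static add: the types are known at compile time, so no tag check is needed.
fn static_add(a: i64, b: i64) -> i64 {
    a + b
}

fn main() {
    println!("{:?}", dynamic_add(&Value::Int(2), &Value::Int(3)));
    println!("{:?}", dynamic_add(&Value::Int(1), &Value::Str("a".to_string())));
    println!("{}", static_add(2, 3));
}
```

A JIT can often remove the tag checks once it has observed the operand types in a hot loop, so this sketch shows only the unoptimized interpreter case.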
| riku_iki wrote:
| also, is py still interpreted or jit-compiled?
| kstrauser wrote:
| I'd argue Python is strongly, dynamically typed, and that
| "weak" and "dynamic" are on different axes.
| [deleted]
| PartiallyTyped wrote:
| > I think GIL is not the reason for slowness, it is the
| weak dynamic types of Python that make it slow.
|
| Python is structurally typed, that makes it dynamic, but it
| is not weak as there is no type coercion.
|
| > Every single instruction needs to be type checked at
| runtime, thus making everything slow.
|
| This is also wrong, python does not type check anything,
| not in the "regular" manner of typechecking. It relies on
| structural typing, if it quacks like a duck, then it is
| treated like a duck.
|
| In fact, PyPy is an argument against your position as it
| still allows the same (more or less) behaviour that python
| has while operating a lot faster due to JIT.
|
| Python doesn't have the luxury of compiling that C# and
| Java have, nor is the VM intended to be high performing.
| pjmlp wrote:
| PyPy is still slower than the competition and largely
| ignored by the Python community.
| ordiel wrote:
| This only proves some skilled people have too much free time and
| no creativity...
| jalino23 wrote:
| rewriting everything in rust is not just a meme?
| codegeek wrote:
| It's like everyone is trying to do things in "Rust" because
| it's the new thing to do.
| dymk wrote:
| Rust is 12 years old. 1.0 was released in 2015.
|
| We've been past the "it's the shiny new thing" phase for a
| while.
| likeabbas wrote:
| Memes typically have some basis in reality. If your project
| reaches a point where it could benefit from fearless
| concurrency or better memory control, Rust is probably your
| best bet at the moment.
|
| I could see huge benefits from Kafka and Cassandra being re-
| written in Rust.
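As a rough illustration of what "fearless concurrency" refers to, here is a minimal sketch that uses only the standard library. The function name `parallel_sum` and the chunking scheme are assumptions made for this example. Scoped threads borrow slices of `data` directly, and the compiler checks that the threads cannot outlive the borrowed data.

```rust
use std::thread;

// Sums `data` by splitting it into chunks and summing each chunk on its own
// scoped thread. The threads only read from `data`, so no locking is needed.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Round up so that at most `workers` chunks are created; never zero.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Collect the handles first so all threads are spawned before joining.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("{}", parallel_sum(&data, 4));
}
```

If one of the spawned closures tried to mutate `data` while the others were reading it, the program would fail to compile instead of racing at runtime.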
| dominotw wrote:
| kafka clone redpanda is written in rust ?
| tilt_error wrote:
| No. C++ and the C* (seastar) framework.
| eatonphil wrote:
| Hmm? I'm pretty sure it's written in C++.
|
| See also, their install dependencies script.
|
| https://github.com/redpanda-
| data/redpanda/blob/dev/install-d...
| tilt_error wrote:
| They are rewritten in C++ already; Redpanda and ScyllaDB,
| respectively. Why waste the effort of rewriting it once
| again?
| agallego wrote:
| I tried in 2017 writing it in rust and found some compiler
| bugs. I also found compiler bugs in c++ tho to be honest,
| but I felt more comfortable in c++ so decided to write the
| first version of it in c++. The huge advantage is that
| storage engines in particular need to be more conservative
| in many dimensions and having seen success with scylla,
| seastar was appealing to me as a 'tried and tested' for
| storage systems.
|
| Prior systems I had built with facebook folly (c++ lib) and
| had also written my own eventing systems in the past, but
| the real value is having seastar being battle tested since
| 2016. Largely it has been the right decision for us as
| redpanda for its young age has benefited from the
| stability of seastar.
| arcticbull wrote:
| C++ isn't better in the ways outlined.
|
| I'm not necessarily advocating it [1] but the parent's
| claim was that those programs could benefit from memory
| safety, thread safety and better concurrency and C++ does
| not deliver along that axis.
|
| [1] https://www.joelonsoftware.com/2000/04/06/things-you-
| should-...
| dymk wrote:
| Why do you think the meme exists?
___________________________________________________________________
(page generated 2022-10-18 23:01 UTC)