[HN Gopher] Comparing our Rust-based indexing and querying pipel...
       ___________________________________________________________________
        
       Comparing our Rust-based indexing and querying pipeline to
       Langchain
        
       Author : tinco
       Score  : 92 points
       Date   : 2024-10-01 15:09 UTC (7 hours ago)
        
 (HTM) web link (bosun.ai)
 (TXT) w3m dump (bosun.ai)
        
       | pjmlp wrote:
        | Most of the Python libraries are, in any case, bindings to
        | native libraries.
       | 
        | Any other ecosystem can plug into the same underlying native
        | libraries, or even call them directly when it is the same
        | language.
        | 
        | In a way, the performance pressure on the Python world is
        | interesting; without it, the CPython folks would never have
        | reconsidered their stance on performance.
        
         | OptionOfT wrote:
          | Most of these native libraries' output isn't 1-to-1 mappable
          | to Python. Depending on the data, you need to write native
          | data wrappers, or worse, marshal the data into managed
          | memory. The overhead can be high.
          | 
          | It gets worse because Python doesn't expose memory
          | management to you. Initially this is an advantage, but later
          | on it causes bloat.
          | 
          | Python is an incredibly easy interface over these native
          | libraries, but it has a lot of runtime costs.
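          | 
          | A minimal stdlib sketch of that marshalling cost: a native
          | buffer of doubles is compact, while converting it into
          | managed Python float objects multiplies the memory
          | footprint (the exact ratio depends on the CPython build).

```python
import sys
from array import array

n = 100_000
native = array("d", range(n))    # one contiguous C buffer of doubles
managed = native.tolist()        # marshalled into individual Python float objects

# ~8 bytes per double in the native buffer
native_bytes = native.buffer_info()[1] * native.itemsize
# list header + one pointer and one boxed float object per element
managed_bytes = sys.getsizeof(managed) + sum(sys.getsizeof(x) for x in managed)

print(native_bytes, managed_bytes)
```

          | On a typical 64-bit CPython the managed copy is roughly 4x
          | the size of the native buffer, before counting allocator
          | overhead.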
        
           | pjmlp wrote:
           | Yet another reason to use native compiled languages with
           | bindings to the same C and C++ libraries.
           | 
            | If using C++20 onwards, it is relatively easy to have
            | similar high-level abstractions; one just needs to let go
            | of the C-isms that many insist on using.
            | 
            | Here Rust clearly has an advantage in that it doesn't
            | allow copy-pasting of C-like code.
            | 
            | Naturally, D and Swift, with their safety and C++
            | interop, would be options as well.
        
           | __coaxialcabal wrote:
           | Have you had any success using LLMs to rewrite Python to
           | rust?
        
             | throwup238 wrote:
              | They're very good at porting code between languages,
              | but going from a dynamically typed language with a
              | large standard library to a static one with a large
              | library ecosystem requires a bit more hand-holding. It
              | helps to specify the Rust libraries you want to use
              | (and their versions), and you'll probably want to give
              | a few rounds of feedback and error correction before
              | the code is ready.
        
           | nicce wrote:
           | > Python is an incredibly easy interface over these native
           | libraries, but has a lot of runtime costs.
           | 
            | It also means that many people use Python without
            | understanding which part of the code is actually fast.
            | They mix Python code with wrappers to native libraries,
            | and sometimes the Python code slows down the overall work
            | substantially without people knowing where the fault is.
            | E.g. they use Python math mixed with NumPy bindings when
            | they could do it with NumPy alone.
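            | 
            | A minimal sketch of that effect: calling math.sqrt in a
            | Python-level loop crosses the interpreter/native boundary
            | once per element, while a single NumPy call keeps the
            | whole loop in native code. Both produce identical
            | results; only the second stays fast at scale.

```python
import math
import numpy as np

arr = np.arange(1.0, 100_001.0)

# Slow: one interpreter round-trip per element.
mixed = np.array([math.sqrt(x) for x in arr])

# Fast: a single call into NumPy's compiled loop.
vectorized = np.sqrt(arr)

assert np.allclose(mixed, vectorized)
```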
        
         | oersted wrote:
          | Indeed, but Python is used to orchestrate all these
          | lower-level libraries. If you have Python on top, you often
          | want to call these libraries in a loop, or more often,
          | within parallelized multi-stage pipelines.
          | 
          | Overhead and parallelization limitations become a serious
          | issue then. Frameworks like PySpark take your Python code
          | and are able to distribute it better, but it's still
          | (relatively) very slow and clunky. Or they can limit what
          | you can do to a natively implemented DSL (often SQL, or
          | some DataFrame API, or an API to define DAGs and execute
          | them within a native engine), but you can't do much serious
          | data work without UDFs, where again Python comes in. There
          | are tricks, but you can never really avoid the limitations
          | of the Python interpreter.
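          | 
          | The native-DSL-versus-Python-UDF split can be sketched with
          | the stdlib's sqlite3 (standing in for a heavier engine):
          | the plain SQL aggregate runs entirely inside SQLite's C
          | engine, while the registered Python function pulls every
          | row back through the interpreter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(float(i),) for i in range(1000)])

# Native path: the whole aggregation stays inside the C engine.
(native_sum,) = conn.execute("SELECT SUM(x) FROM t").fetchone()

# UDF path: every row crosses back into the Python interpreter.
def py_double(x):
    return 2.0 * x

conn.create_function("py_double", 1, py_double)
(udf_sum,) = conn.execute("SELECT SUM(py_double(x)) FROM t").fetchone()

assert udf_sum == 2 * native_sum
```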
        
       | lmeyerov wrote:
        | At least for Louie.ai, basically genAI-native computational
        | notebooks where operational analysts ask for intensive
        | analytics tasks, like pulling Splunk/Databricks/Neo4j data,
        | wrangling it in some runtime, clustering/graphing/etc. it,
        | and generating interactive viz, Python has ups and downs:
       | 
       | On the plus side, it means our backend gets to handle small/mid
       | datasets well. Apache Arrow adoption in analytics packages is
       | strong, so zero copy & and columnar flows on many rows is normal.
       | Pushing that to the GPU or another process is also great.
       | 
        | OTOH, one of our biggest issues is the GIL. Yes, it shows up
        | a bit in single-user code (not discussed in the post),
        | especially when doing divide-and-conquer flows for a user.
        | However, the bigger issue is stuffing many concurrent users
        | onto the same box to avoid blowing your budget. We would like
        | the memory-sharing benefits of threading, but because of the
        | GIL we end up wanting the isolation benefits of
        | multiprocessing. A bit same-but-different: we stream results
        | to the browser as agents progress in your investigation, and
        | that has not been as smooth as it has been with other
        | languages.
       | 
        | And moving to multiprocess is no panacea. E.g., running a
        | local embedding engine in-process per worker is expensive
        | because modern models have high RAM needs. That biases us
        | toward a local inference server for what is meant to be an
        | otherwise local call, which is doable, but representative of
        | the extra work needed for production-grade software.
       | 
       | Interesting times!
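        | 
        | A minimal sketch of the threading-versus-multiprocessing
        | tradeoff described above, assuming a hypothetical in-memory
        | "model": threads share the single copy for free, but the GIL
        | serializes the Python-level work, which is what pushes
        | multi-tenant setups like this toward multiprocessing.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "model": one large structure every worker needs to read.
model = {"weights": list(range(1_000_000))}

def score(query: int) -> int:
    # All threads read the shared model with zero copies, but the GIL
    # means this Python-level work never runs truly in parallel.
    return model["weights"][query] * 2

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(score, range(8)))

print(results)
```

        | Swapping in ProcessPoolExecutor buys real parallelism, at
        | the cost of each worker process paying for its own copy of
        | the model, which is exactly the RAM problem above.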
        
       | dmezzetti wrote:
       | I've covered this before in articles such as this:
       | https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...
       | 
        | You can make anything performant if you know the right
        | buttons to push. While Rust makes it easy in some ways, Rust
        | is also a difficult language for many developers to work
        | with. There is a tradeoff.
        | 
        | I'd also say LangChain's primary goal isn't performance; it's
        | convenience and functionality coverage.
        
         | timonv wrote:
          | Cool, that's a fun read! I recently added sparse vector
          | support to fastembed-rs, with SPLADE, not BM25. Still, it
          | would be nice to compare the two.
        
       | swyx wrote:
        | i mean LLM-based or not has nothing to do with it; this is a
        | standard optimization story, scripting lang vs systems lang.
        
         | godelski wrote:
         | Shhhh, let this one go. So many people don't get optimization
         | and why it is needed that I'll take anything we can get. Hell,
         | I routinely see people saying no one needs to know C because
         | python calls C in "the backend" (who the fuck writes "the
         | backend" then?). The more people that can learn some HPC and
         | parallelism, the better.
        
           | pjmlp wrote:
            | Even better if they learned about those amazing managed
            | languages where we can introspect the machine code
            | generated by their dynamic compilers.
        
             | godelski wrote:
             | Agree, but idk what the gateway in is since I'm so
             | desperate for people to just get the basic concepts.
        
           | dboreham wrote:
           | Obviously AI writes the backend.
        
       | serjester wrote:
        | I'm surprised they don't talk about the business side of this
        | - did they have users complaining about the speed? At the end
        | of the day they only increased performance by 50%.
        | 
        | These kinds of optimizations seem awesome once you have a
        | somewhat mature product, but you really have to wonder if
        | this is the best use of a startup's very limited bandwidth.
        
         | godelski wrote:
         | > At the end of day they only increased performance by 50%.
         | > only 50%.
         | 
          | I'm sorry... what?! That's a lot of improvement and will
          | save you a lot of money. Even 10% increases are quite
          | large!
          | 
          | Think about it this way: if you have a task that takes an
          | hour and you turn that into 59 minutes and 59 seconds, it
          | might seem like nothing (about 0.03%). But now consider you
          | have a million users: that's a million seconds saved, or
          | about 278 hours! This can save you money; you are often
          | paying by the hour in one way or another (even if you own
          | the system, your energy has a cost that's dynamic). If this
          | is a task run frequently, you're saving a lot of time in
          | aggregate, despite not a lot per person. But even for a
          | single person, this is helpful if more devs do this. Death
          | by a thousand cuts.
          | 
          | But in the specific case, if a task takes an hour and you
          | save 50%, your task takes 30 minutes. Maybe the task here
          | took only a few minutes, but people will be chaining these
          | together quite a lot.
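          | 
          | The arithmetic above, spelled out:

```python
task_seconds = 3600   # an hour-long task
saved_per_run = 1     # shaving it to 59:59 saves one second

pct_saved = saved_per_run / task_seconds * 100   # per-run saving in percent
users = 1_000_000
total_hours_saved = users * saved_per_run / 3600  # aggregate saving

print(f"{pct_saved:.3f}% per run, {total_hours_saved:.1f} hours across all users")
```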
        
           | lpapez wrote:
           | Maybe these optimizations benefit the two users who do the
           | operation three times a year.
           | 
           | In such an extreme case no amount of optimization work would
           | be profitable.
           | 
           | So the parent comment asks a very valid question: how much
           | total time was saved by this and who asked for it to be saved
           | (paying or free tier customers for example)?
           | 
              | People who see the business side of things rightfully
              | fear the word "optimization": it's often not the best
              | use of limited development resources, especially in an
              | early-stage product under development.
        
             | sroussey wrote:
             | I do wish that when people write about optimization that
             | they would then multiply by usage, or something similar.
             | 
             | Another way is to show CPU usage over a fleet of servers
             | before and after. And then reshuffle the servers and use
             | fewer and use the number of servers no longer needed as the
             | metric.
             | 
             | Number of servers have direct costs, as well as indirect
             | costs, so you can even derive a dollar value. More so if
             | you have a growth rate.
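              | 
              | A back-of-the-envelope sketch of that metric, with
              | entirely made-up numbers (fleet size, CPU levels, and
              | per-server cost are all hypothetical):

```python
import math

fleet = 200                           # servers before the optimization
cpu_before, cpu_after = 0.70, 0.45    # average fleet CPU, before and after

# Keep the same utilization target; the freed headroom becomes fewer servers.
servers_after = math.ceil(fleet * cpu_after / cpu_before)
servers_freed = fleet - servers_after

annual_cost_per_server = 3000.0       # hypothetical all-in yearly cost
annual_savings = servers_freed * annual_cost_per_server

print(servers_freed, annual_savings)
```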
        
               | godelski wrote:
               | > I do wish that when people write about optimization
               | that they would then multiply by usage, or something
               | similar.
               | 
               | How? You can give specific examples and then people make
               | the same complaints because it isn't relevant to their
               | use case. It's fairly easy to extrapolate the numbers to
               | specific cases. We are humans, and we can fucking
               | generalize. I'll agree there isn't much to the article,
               | but I find this ask a bit odd. Do you not have all the
               | information to make that calculation yourself? They
               | should have done that if they're addressing their
               | manager, but it looks like a technical blog where I think
               | it is fair to assume the reader is technical and can make
               | these extrapolations themselves.
        
             | godelski wrote:
             | > So the parent comment asks a very valid question: how
             | much total time was saved by this and who asked for it to
             | be saved (paying or free tier customers for example)?
             | 
             | That is a hard question to answer because it very much
             | depends on the use case, which is why I gave a vague
             | response in my comment. Truth be told, __there is no
             | answer__ BECAUSE it depends on context. In the case of AI
             | agents, yeah, 50% is going to save you a ton of money. If
             | you make LLM calls once a day, then no, probably not. Part
              | of being the developer is determining this tradeoff.
              | That's specifically what technical managers are for:
              | communicating technical stuff to business people (sure,
              | your technical manager might not be technical, but
              | someone being bad at their job doesn't make the point
              | irrelevant, it just means someone else needs to do the
              | job).
             | 
              | You're right about early-stage products, but there are
              | lots of moderate and large businesses (and yes,
              | startups) that don't optimize but should. Most software
              | is never optimized, and that has led to a lot of
              | enshittification. Yes, move fast and break things, but
              | go back and clean up, optimize, and reduce your tech
              | debt, because you left a mess of broken stuff in your
              | wake. But it is weird to pigeonhole this to early-stage
              | startups.
        
           | jahewson wrote:
           | > 10% increases are quite large!
           | 
              | You have to ask yourself: 10% of what? I don't usually
              | mind throwing 10% more compute or memory at a problem,
              | but I do mind if it's 10x more. I've shipped 100x perf
              | improvements in the past where 1.5x would have been a
              | waste of engineering time. A more typical case is a 10x
              | or 20x improvement that's worth a few days of coding.
              | Now, if I'm working on a mature system that's had tens
              | of thousands of engineering hours devoted to it, and is
              | used by thousands of users, then I might be quite happy
              | with 10%. Though I also may not! The broader context
              | matters.
        
             | godelski wrote:
              | Sure, but I didn't shy away from the fact that it is
              | case-dependent. In fact, you're just talking about the
              | meta-optimization, which needs to be considered for any
              | optimization too.
        
         | timonv wrote:
          | Core maintainer of Swiftide here. That's a fair comment!
          | Additionally, it's interesting to note that almost all the
          | time is spent in FastEmbed / ONNX in the Swiftide
          | benchmark. A more involved follow-up with chunking and
          | transformation could be very interesting and, anecdotally,
          | shows far bigger differences. We have not had the time yet
          | to fully dive into this.
          | 
          | Personally, I just love code being fast, and Rust is
          | incredible to work with. Exceptions granted, I'm more
          | productive with Rust than in any other language. And it's
          | fun.
        
       | satvikpendem wrote:
        | I was asking the same question; it turns out mistral.rs [0]
        | has pretty good abstractions that avoid having to depend on
        | and package llama.cpp for every platform.
       | 
       | [0] https://github.com/EricLBuehler/mistral.rs
        
       | RcouF1uZ4gsC wrote:
       | Why not use C++?
       | 
        | For the most part, these aren't security-critical
        | components.
        | 
        | You already have a massive amount of code you can use, like,
        | say, llama.cpp.
       | 
       | You get the performance that you do with Rust.
       | 
       | Compared to Python, in addition to performance, you also get a
       | much easier deployment story.
        
         | oersted wrote:
          | If you already have substantial experience with C++, this
          | could be a good option. But I'd say that nowadays learning
          | to use Rust *well* is much easier than learning to use C++
          | *well*. And the ecosystem, even if it's a lot less mature,
          | I'd say is already better in Rust for these use cases.
          | 
          | Indeed, here security (safety, generally) is a secondary
          | concern and is not the main reason for choosing Rust,
          | although it is welcome. It's just that Rust has everything
          | that C++ gives you, but in a more modern and ergonomic
          | package. Although, again, I can see how someone already
          | steeped in C/C++ for years might not feel that, and
          | reasonably so. But I think I can fairly safely say that
          | Rust is just "a better C++" from the perspective of
          | someone starting from scratch now.
        
           | outworlder wrote:
           | Indeed.
           | 
            | Plus, one doesn't usually just 'learn C++'. It's a
            | herculean effort, and I've yet to meet anyone, even
            | people using C++ exclusively for their whole careers,
            | who could confidently say they "know C++". They may be
            | comfortable with whatever subset of C++ their company
            | uses, while another company's codebase will look
            | completely alien, often with entire features they used
            | being ignored, and vice versa.
            | 
            | Despite that, it's still a substantial time commitment,
            | to the point that many (if not most) people working in
            | C++ have made it their career; it's not just a tool
            | anymore at that point. They may be more willing to jump
            | entire industries rather than jump to another language.
            | It is a generalization, but I have seen it far too often
            | at this point.
           | 
           | If someone is making a significant time investment starting
           | today, I too would suggest investing in Rust instead. It also
           | requires a decent time investment, but the rewards are great.
           | Instead of learning where all the (hidden) landmines are, you
           | learn how to write code that can't have those landmines in
           | the first place. You aren't losing much either, other than
           | the ability to read existing C++ codebases.
        
           | riku_iki wrote:
           | > But I'd say nowadays that learning to use Rust _well_ is
           | much easier than learning to use C++ _well_.
           | 
            | For someone (me) who was making that choice recently, it
            | is not that obvious. I tried to learn through Rust
            | examples and ecosystems, and there were many more WTF
            | moments compared to when I write C++ as C-with-classes
            | plus Boost. Especially when writing close-to-the-metal
            | performance code, Rust has many abstractions with
            | unobvious performance implications.
        
             | tcfhgj wrote:
             | > rust has many abstractions with unobvious performance
             | implications.
             | 
             | such as?
        
               | riku_iki wrote:
               | this article has several examples:
               | https://blog.polybdenum.com/2021/08/09/when-zero-cost-
               | abstra...
        
         | IshKebab wrote:
          | Rust is much better than C++ overall and far easier to
          | debug (C++ is prone to _very_ difficult-to-debug memory
          | errors which don't happen in Rust).
         | 
         | The main reasons to use C++ these days are compatibility with
         | existing code (C++ and Rust are a bit of a pain to mix), and if
         | a big dependency is C++ (e.g. Qt).
        
           | pjmlp wrote:
            | Additionally, the industry-standard GPGPU APIs and the
            | tooling ecosystem.
           | 
           | Maybe one day we get Live++ or Visual Studio debugging
           | experience for Rust, given that now plenty of Microsoft
           | projects use Rust.
        
         | Philpax wrote:
         | Why use C++? What's the benefit over Rust here?
        
         | timonv wrote:
          | I've worked with C++ in the past; it's subject to taste. I
          | like how Rust's rigidity empowers rapid change _without_
          | breaking things.
          | 
          | Besides, the Rust ML ecosystem is also quite mature:
          | llama.cpp has native bindings (which Swiftide supports),
          | there are ONNX bindings, ndarray (NumPy in Rust) works
          | great, there's Candle, and lots of processing utilities.
          | Additionally, many ecosystems are rewriting parts in Rust,
          | and more often than not these are available from Rust as
          | well.
        
         | roca wrote:
          | Lots of reasons, but a big one is that dependency and
          | build management in C++ is absolutely hellish unless you
          | use something like Conan, which nobody knows. In Rust, you
          | use Cargo and everyone is happy.
        
           | pjmlp wrote:
           | There are lots of things I don't know until I learn how to
           | use them, duh.
           | 
           | Cargo is great, for pure Rust codebases, otherwise it is
           | build.rs or having to learn another build system, and then
           | people aren't that happy any longer.
        
           | riku_iki wrote:
            | You can always use something as simple as Make for your
            | C++ project, manually vendoring dependencies into some
            | libs folder.
        
       | zie1ony wrote:
       | DSPy is in Python, so it must be Python. Sorry bro :P
        
       | bborud wrote:
       | It would be helpful to move to a compiled language with a decent
       | toolchain. Rust and Go are good candidates.
        
       | sandGorgon wrote:
       | this is very cool!
       | 
        | we built something for our internal consumption (and it is
        | now used in quite a few places in India).
        | 
        | Edgechains is declarative (Jsonnet-based), so chains +
        | prompts are declarative. And we built a WASM compiler (in
        | Rust, based on WasmEdge).
       | 
       | https://github.com/arakoodev/EdgeChains/actions/runs/1039197...
        
       | zozbot234 wrote:
       | Am I the only one who thinks a Swift IDE project should be called
       | Taylor?
        
         | Svoka wrote:
         | I would name it Tailor
        
         | giancarlostoro wrote:
          | Sure, but this is a Rust library for building LLM
          | applications called Swiftide, not a Swift IDE...
         | 
         | https://swiftide.rs/what-is-swiftide/
        
       | zitterbewegung wrote:
        | This is a comparison of apples to oranges. Langchain has an
        | order of magnitude more examples, integrations, and
        | features, and it also rewrote its whole architecture to try
        | to make the chaining more understandable. I don't see enough
        | documentation in this pipeline to understand how to migrate
        | my app to it. I also realize it would take me at least a
        | week just to migrate my own app to Langchain's rewrite.
        | 
        | Langchain is used because it was a first mover; that is both
        | the reason for its adoption and its Achilles' heel. It is
        | not used for speed at all.
        
       | elpalek wrote:
        | Langchain and other frameworks are too bloated. They're good
        | for demos, but I highly recommend building your own pipeline
        | in production. It's not really that complicated, and you get
        | much better control over the implementation. Plus, you don't
        | need the 99% of packages that come with Langchain, which
        | reduces security vulnerabilities.
        | 
        | I've written a series of RAG notebooks on how to implement
        | RAG in Python directly, with minimal packages. I know it's
        | not in Rust or C++, but it can give you some ideas on how to
        | do things directly.
       | 
       | https://github.com/yudataguy/RawRAG
        
         | cpill wrote:
          | trouble is that the Langchain community is large and jumps
          | on the latest research papers almost immediately as they
          | come out, which is a big advantage if you're a small team
        
       ___________________________________________________________________
       (page generated 2024-10-01 23:01 UTC)