[HN Gopher] Choosing vector database: a side-by-side comparison
       ___________________________________________________________________
        
       Choosing vector database: a side-by-side comparison
        
       Author : emilfroberg
       Score  : 121 points
       Date   : 2023-10-04 13:04 UTC (9 hours ago)
        
 (HTM) web link (benchmark.vectorview.ai)
 (TXT) w3m dump (benchmark.vectorview.ai)
        
       | BenoitP wrote:
       | Pricing for pg should be easy to compute
       | 
       | 20M vectors @768 is about 62GB, for 32bit, not even quantized.
       | AWS RDS will put it at 83USD/m (db.t4g.small, 2vcpu 2GB RAM). But
       | that's not with egress, backups, etc
       | 
       | Seems acceptable at least for a POC?
       | 
        | It's a better option if you already have the data in the same
        | instance, but the low developer experience rating scares me.
        | Anyone tried it? How did it go?
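The sizing estimate above is easy to verify; a quick back-of-envelope sketch (the figures are the ones from the comment, nothing else assumed):

```python
# 20M vectors at 768 dimensions, stored as unquantized 32-bit floats
# (4 bytes each), as in the comment above.
n_vectors = 20_000_000
dims = 768
bytes_per_float32 = 4

total_bytes = n_vectors * dims * bytes_per_float32
print(f"{total_bytes / 1e9:.1f} GB")  # -> 61.4 GB, i.e. "about 62GB"
```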
        
         | LunaSea wrote:
          | You will be able to store the data but not query it.
         | 
         | Vector indexes are very large, almost the size of the original
         | data, and that needs to fit into the database memory ideally.
        
       | NicoJuicy wrote:
        | I'm actually curious how the new vector DB from Cloudflare
        | compares.
        
         | emilfroberg wrote:
         | Me too! Couldn't find a lot of information on it yet, but I
         | might have to try it myself to get some benchmarks
        
       | emilfroberg wrote:
       | I made this table to compare vector databases in order to help me
       | choose the best one for a new project. I spent quite a few hours
       | on it, so I wanted to share it here too in hopes it might help
       | others as well. My main criteria when choosing vector DB were the
       | speed, scalability, dx, community and price. You'll find all of
       | the comparison parameters in the article.
        
         | andre-z wrote:
          | I'm curious where you got the numbers on QPS? They are pretty
          | different from our experience. Reached out on LinkedIn. ;)
        
           | emilfroberg wrote:
           | Happy to connect. The benchmark numbers are mostly from ANN
           | Benchmarks. For my use case, the nytimes-256 dataset was most
           | relevant so I used that for the QPS benchmark. I also took a
           | look at the benchmarks you've made at
           | https://qdrant.tech/benchmarks/ and there qdrant seems to be
           | outperforming many others. If I've gotten something wrong
           | here, I'm glad to update the article :)
        
       | panarky wrote:
       | I'd love to know how vector databases compare in their ability to
       | do hybrid queries, vector similarity filtered by metadata values.
       | For example, find the 100 items with the closest cosine
       | similarity where genre = jazz and publication date between 1990
       | and 2000.
       | 
       | Can the vector index operate on a subset of records? Or when
       | searching for 100 closest matches does the database have to find
       | 1000 matches and then apply the metadata filter, and hope that
       | doesn't reduce the result set down to zero and exclude relevant
       | vectors?
       | 
       | It seems like measuring precision and recall for hybrid queries
       | would be illuminating.
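The semantics being asked about can be made concrete with a toy sketch: filter by metadata first, then rank the surviving records by cosine similarity. Real engines do this inside the ANN index ("pre-filtering"); this brute-force version, with made-up records, only illustrates the intended behavior:

```python
import math

# Toy "pre-filter then rank" hybrid query: genre = jazz, year in
# [1990, 2000], then closest vectors by cosine similarity.
records = [
    {"id": 1, "genre": "jazz", "year": 1994, "vec": [0.9, 0.1]},
    {"id": 2, "genre": "rock", "year": 1995, "vec": [0.9, 0.1]},
    {"id": 3, "genre": "jazz", "year": 1985, "vec": [0.8, 0.2]},
    {"id": 4, "genre": "jazz", "year": 1999, "vec": [0.1, 0.9]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

query = [1.0, 0.0]
# Metadata filter happens first, so relevant vectors can't be lost.
subset = [r for r in records if r["genre"] == "jazz"
          and 1990 <= r["year"] <= 2000]
top = sorted(subset, key=lambda r: cosine(query, r["vec"]), reverse=True)
print([r["id"] for r in top])  # -> [1, 4]
```

The post-filter strategy the comment worries about would instead take the nearest vectors first and drop non-matching ones afterwards, which can indeed return an empty result set.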
        
         | mvcalder wrote:
         | I can't speak to the others, but pgvector indices can "break"
         | hybrid queries. For example, if you select using a where clause
         | specifying metadata (where genre = jazz) and order by distance
         | from a vector (embedding of sound clip); if the index doesn't
         | have a lot (or any) vectors in the sphere of the query vector
         | that also match the metadata it can return no results. I
         | discuss this in a blog post here [1].
         | 
         | [1]: https://www.polyscale.ai/blog/pgvector-bigger-boat/
        
         | mistrial9 wrote:
         | > do hybrid queries
         | 
         | "no" - the graph objects after training are opaque AFAIK
        
           | hobs wrote:
           | Actually a lot of the databases offer filtering before or
           | after similarity search.
        
             | esafak wrote:
             | I'd say it's table stakes today.
        
         | andre-z wrote:
          | There is an in-place filtering approach with an extended HNSW:
         | https://qdrant.tech/articles/filtrable-hnsw/
        
       | brigadier132 wrote:
       | None of these vector dbs seem economical outside of enterprise.
        
         | emilfroberg wrote:
         | Many of them are open source and you can host them yourself.
         | That would make it more cost effective. Also someone mentioned
         | https://turbopuffer.com/. That seems like a good alternative if
         | you're looking for something economical.
        
         | esafak wrote:
         | They have open source versions.
        
       | kesor wrote:
       | Redis is definitely missing in the comparison.
        
       | alter123 wrote:
       | You might want to add https://turbopuffer.com/ as well now in the
       | benchmarks.
        
         | softwaredoug wrote:
         | I agree this looks really promising.
        
         | emilfroberg wrote:
         | Turbopuffer looks like something I would consider. And the
         | pricing looks to be lowest on the list from what I can see
        
           | Sirupsen wrote:
           | Emil if you email me at info@turbopuffer.com I can let you
           | into the alpha :)
        
       | dmezzetti wrote:
       | I'll add txtai to the list: https://github.com/neuml/txtai
       | 
       | txtai is an all-in-one embeddings database for semantic search,
       | LLM orchestration and language model workflows.
       | 
       | Embeddings databases are a union of vector indexes (sparse and
       | dense), graph networks and relational databases. This enables
       | vector search with SQL, topic modeling and retrieval augmented
       | generation.
       | 
       | txtai adopts a local-first approach. A production-ready instance
       | can be run locally within a single Python instance. It can also
       | scale out when needed.
       | 
        | txtai can use Faiss, Hnswlib or Annoy as its vector index
        | backend. This is relevant in terms of the ANN-Benchmarks scores.
       | 
       | Disclaimer: I am the author of txtai
        
         | emilfroberg wrote:
         | Txtai looks interesting, maybe you could help me collect some
          | of the comparison parameters for it?
        
           | dmezzetti wrote:
           | Sure, I'd be happy to do so. Easiest way is probably using
           | the public Slack channel that's accessible via the GitHub
           | page.
        
       | [deleted]
        
       | citruscomputing wrote:
       | Strongly disagree with PGVector's DX being worse than Chroma.
       | Installing, configuring, and working with Chroma was infuriating
       | -- it's alpha software and has the bugs and rough edges to prove
       | it. The tools to support and interface with postgres are battle-
       | tested and _so_ much nicer by comparison; getting Chroma working
       | took over a week, ripping it out and replacing with PGVector took
       | a couple hours.
       | 
       | Also agree with this[0] article that vector search is only one
       | type of search, and even for RAG isn't necessarily the one you
       | want to start with.
       | 
       | [0]: https://colinharman.substack.com/p/beware-tunnel-vision-
       | in-a...
        
         | fzliu wrote:
          | Shameless self-plug for milvus-lite:
          | 
          |     $ pip install milvus
          |     $ python
          |     >>> import milvus
          |     >>> milvus.start()
        
           | m00x wrote:
           | Gonna add some information here since this isn't very
           | descriptive.
           | 
           | milvus-lite is a bit like sqlite where it runs in-process.
           | Here are some scenarios you'd want to use it in:
           | 
            | - You want to use Milvus directly without having installed
            |   it using Milvus Operator, Helm, or Docker Compose, etc.
            | - You do not want to launch any virtual machines or
            |   containers while you are using Milvus.
            | - You want to embed Milvus features in your Python
            |   applications.
        
         | emilfroberg wrote:
         | Thanks for your input, I've only tried Chroma a little bit so
         | far and had a pretty good experience. What they also have going
         | for them is a big community on discord that can be helpful.
        
         | luckyt wrote:
         | Yeah, I had a similar experience with Chroma DB. On paper, it
         | checked all my boxes. But yea, it's alpha software with the
         | first non-prerelease version only coming out in July 2023 (so
         | it's 3 months old).
         | 
         | I ran into some dumb issues during install like the SQLite
         | version being incorrect, and there wasn't much guidance on how
         | to fix these problems, so gave up after struggling for a few
         | hours. Switched to PGVector which was much simpler to setup. I
         | hope Chroma DB improves, but I wouldn't recommend it for now.
        
       | magden wrote:
       | I don't think we need specialized databases for vectors.
       | Relational databases can easily be expanded by vector data types
       | and operations. They will eventually catch up by supporting what
       | was once a unique feature of the new system:
       | https://medium.com/@magda7817/two-things-to-keep-in-mind-bef...
        
         | emilfroberg wrote:
         | Yeah, maybe they will.. But for now, the best options are the
         | purpose-built vector databases, so why not use them?
        
         | rnk wrote:
         | Yeah, this is my sense too. They will be slower to add these
         | new requirements but they should be able to add these vector
         | capabilities within a year or so. It's then a question of
         | ability of smaller vector db companies to mature and add
         | regular db capabilities, while innovating.
        
       | krishadi wrote:
        | Latency from embedding models is still going to be the
        | bottleneck for performance, however fast the DB is. Plus
        | adding all the overhead of synthesising answers and summaries
        | from an LLM is going to weigh you down.
        
       | Pandabob wrote:
       | I've been wondering about Redis as vector database [0].
       | 
       | [0]: https://twitter.com/sh_reya/status/1661136833848438784
        
         | emilfroberg wrote:
          | I quickly took a look at the redisearch ANN Benchmarks and
          | they seem to stack up well against the others (more or less
          | the same level as Milvus) when it comes to QPS and latency.
        
         | esafak wrote:
         | Apparently it's possible:
         | https://redis.io/docs/interact/search-and-query/search/vecto...
         | 
         | Euclidean distance, inner product, and cosine similarity are
         | supported.
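The three metrics named above can be computed by hand for a pair of toy vectors, just to make the definitions concrete (the vectors are made up):

```python
import math

# Euclidean distance, inner product, and cosine similarity for two
# small example vectors.
a, b = [1.0, 2.0, 2.0], [2.0, 0.0, 1.0]

euclidean = math.dist(a, b)                         # straight-line distance
inner = sum(x * y for x, y in zip(a, b))            # dot product
cosine = inner / (math.hypot(*a) * math.hypot(*b))  # angle-based, in [-1, 1]

print(round(euclidean, 3), inner, round(cosine, 3))  # -> 2.449 4.0 0.596
```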
        
       | drewbug01 wrote:
       | I really appreciate comparisons like this, although I find myself
       | wanting to know more about why certain things are listed the way
       | they are.
       | 
       | For example, pgvector is listed as not having role-based access
       | control, but the Postgres manual dedicates an entire chapter to
       | it: https://www.postgresql.org/docs/current/user-manag.html
       | 
        | Hence, I'd be interested to know more about the supporting
       | details for the different categories. It may help uncover some
       | inadvertent errors in the analysis, but also would just serve as
       | a useful jumping-off point for people doing their own research as
       | well.
        
         | proleisuretour wrote:
         | Totally agree with the puzzling assortment of a rubric.
         | PostgreSQL supports role based-access control, RBAC. Not to
         | mention, with PostgreSQL and the pgvector extension, you have a
         | whole list of languages ready to use it:
         | 
          | C++           pgvector-cpp
          | C#            pgvector-dotnet
          | Crystal       pgvector-crystal
          | Dart          pgvector-dart
          | Elixir        pgvector-elixir
          | Go            pgvector-go
          | Haskell       pgvector-haskell
          | Java, Scala   pgvector-java
          | Julia         pgvector-julia
          | Lua           pgvector-lua
          | Node.js       pgvector-node
          | Perl          pgvector-perl
          | PHP           pgvector-php
          | Python        pgvector-python
          | R             pgvector-r
          | Ruby          pgvector-ruby, Neighbor
          | Rust          pgvector-rust
          | Swift         pgvector-swift
         | 
         | Wonder how many of those other Vector databases play nice.
        
         | mritchie712 wrote:
         | Same for Developer experience. If you used Postgres or any
         | other relational db (which I think covers a large % of devs),
         | you could easily argue the dev experience is 3/3 for pgvector.
        
         | sojournerc wrote:
         | That stood out to me as well. I've been playing with pgvector,
         | and there's no reason you can't use row/table role-based
         | security.
         | 
         | I think there's an unmentioned benefit to using something like
         | pgvector also. You don't need a separate relational database!
         | In fact you can have foreign keys to your vectors/embeddings
         | which is super powerful to me.
        
         | hereonout2 wrote:
         | Possibly / quite probably whoever wrote this knows very little
         | about postgres.
        
       | BeetleB wrote:
       | Let me half hijack to ask a related question:
       | 
       | I'm building a RAG for my personal use: Say I have a lot of notes
       | on various topics I've compiled over the years. They're scattered
       | over a lot of text files (and org nodes). I want to be able to
       | ask questions in a natural language and have the system query my
       | notes and give me an answer.
       | 
       | The approach I'm going for is to store those notes in a vector
       | DB. When I ask my query, a search is performed and, say, the top
       | 5 vectors are sent to GPT for parsing (along with my query). GPT
       | will then come back with an answer.
       | 
       | I can build something like this, but I'm struggling in figuring
       | out metrics for how _good_ my system is. There are many variables
       | (e.g. amount of content in a given vector, amount of overlap
       | amongst vectors, number of vectors to send to GPT, and many
       | more). I 'd like to tweak them, but I also want some objective
       | way to compare different setups. Right now all I do is ask a
       | question, look at the answer, and try to subjectively gauge
       | whether I think it did a good job.
       | 
       | Any tips on how people measure the performance/effectiveness for
       | these types of problems?
        
         | civilitty wrote:
         | I use lots and lots of domain specific test cases at several
         | layers, numbering in the hundreds or thousands. The score is
         | the number of test cases that pass so it requires a different
         | approach than all or nothing tests. The layers depend on your
         | RAG "architecture" but I test the RAG query generation and
         | scoring (comparing ordered lists is the simplest but I also
         | include a lot of fuzzy comparisons), the LLM scoring the
         | relevance of retrieved snippets before feeding into the final
          | answering prompt, and the final answer. The most annoying
          | part is the prompt to score the final answer, since it tends
          | to come out looking like a CollegeBoard AP test scoring
          | rubric.
         | 
         | This requires a lot of domain specific work. For example, two
         | of my test cases are "Is it [il]legal to build an atomic bomb"
         | run against the entire USCode [1] so I have a list of sections
         | that are relevant to the question that I've scored before
          | eventually getting an answer of "it is illegal" followed
         | several prompts that evaluate nuance in the answer ("it's
         | illegal except for..."). I have hundreds of these test cases,
         | approaching a thousand. It's a slog.
         | 
         | [1] 42 U.S.C. 2122 is one of the "right" sections in case
         | anyone is wondering. Another step tests whether 2121 is pulled
         | in based on the mention in 2122
        
         | TrueDuality wrote:
          | For small personal projects it's kind of hard to build
          | metrics like this because the volume of indexed content in
          | the database tends to be pretty low. If you're indexing
          | paragraphs you might consistently be able to fit all relevant
          | paragraphs in the context itself.
         | 
         | What I can recommend is to take the coffee tasting approach.
          | Don't try to test and evaluate individual responses;
          | instead, lock the seed used in generation, and use the same
          | prompt for two different runs. Change one variable and do a
          | relative comparison of the two outputs. The variables
          | probably worth testing for you, off the top of my head:
         | 
         | * Choice of models and/or tunes
         | 
         | * System prompts
         | 
         | * Temperature of the model against your queries
         | 
         | * Threshold for similarity for document inclusions (you only
         | want relevant documents from your RAG, set it too low and
         | you'll get some extra distractions, too high and useful
         | information might be left out of the context).
         | 
          | If you set up a system to track the comparisons, either
          | automatically or by hand, that just records which side of
          | the change worked better for your use case, and test that
          | same change across a bunch of different prompts, you should
          | be able to tally up whether the control or the change was
          | preferred.
         | 
         | Keep those data points! The data points are your bench log and
         | can be invaluable later on for anything you do with the system
         | to see what changed in aggregate, what had the most outsized
         | impact, etc and can guide you to build useful tooling for
         | testing or finding existing solutions out there.
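The tally-and-log idea above can be sketched in a few lines; the field names and example prompts here are made up, and a real bench log would persist these records rather than keep them in memory:

```python
from collections import Counter

# Minimal A/B ("coffee tasting") tally: for each prompt, record only
# which side of a single-variable change won, then count in aggregate.
comparisons = [
    {"prompt": "q1", "change": "similarity threshold 0.7 -> 0.8", "winner": "control"},
    {"prompt": "q2", "change": "similarity threshold 0.7 -> 0.8", "winner": "change"},
    {"prompt": "q3", "change": "similarity threshold 0.7 -> 0.8", "winner": "change"},
]

tally = Counter(c["winner"] for c in comparisons)
print(tally.most_common())  # -> [('change', 2), ('control', 1)]
```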
        
         | screye wrote:
         | You might like this -
         | https://www.youtube.com/watch?v=fWC4VxolWAk
         | 
         | Blog on the same topic - https://blog.langchain.dev/evaluating-
         | rag-pipelines-with-rag...
        
         | hobs wrote:
         | For the normal ones
         | https://en.wikipedia.org/wiki/Evaluation_measures_(informati...
         | 
         | The main thing is that there's no "objective" way, but if you
         | rank and label your own data then you can certainly get a
         | ranking that's subjectively well performing according to you.
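The simplest of those measures, precision@k and recall@k against your own labeled judgments, looks like this (doc ids and labels are made up):

```python
# Precision@k: fraction of the top-k results that are relevant.
# Recall@k: fraction of all relevant docs that appear in the top k.
def precision_recall_at_k(retrieved, relevant, k):
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k, hits / len(relevant)

retrieved = ["d3", "d1", "d7", "d2", "d9"]  # ranked output of your system
relevant = {"d1", "d2", "d5"}               # docs you hand-labeled relevant

p, r = precision_recall_at_k(retrieved, relevant, k=5)
print(p, round(r, 3))  # -> 0.4 0.667
```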
        
       | softwaredoug wrote:
       | Everyone I talk to who is building some vector db based thing
       | sooner or later realizes they also care about the features of a
       | full-text search engine.
       | 
       | They care about filtering, they care to some degree about direct
       | lexical matches, they care about paging, getting groups / facet
       | counts, etc.
       | 
       | Vectors, IMO, are just one feature that a regular search engine
       | should have. IMO currently Vespa does the best job of this,
       | though lately it seems Lucene (Elasticsearch and Opensearch) are
       | really working hard to compete
        
         | emilfroberg wrote:
         | Vespa looks interesting, hadn't seen it before but will
         | definitely take a look at it
        
         | pmc00 wrote:
         | Agreed, vector search is great but it's only one of many tools
         | you can use to create a great search solution.
         | 
         | We recently did a bunch of evaluation work to quantify the
         | differences between keyword search, vector search, hybrid,
         | reranking, etc. across a few datasets. We shared the results
         | here: https://techcommunity.microsoft.com/t5/azure-ai-services-
         | blo...
         | 
         | Disclosure - I work in the Azure Search team.
        
         | vosper wrote:
         | My company is using vector search with Elasticsearch. It's
         | working well so far. IMO Elastic will eat most vector-
         | first/only products because of its strength at full-text
         | search, plus all the other stuff it does.
        
           | treprinum wrote:
           | Amazon was already working on getting rid of ElasticSearch
           | with their Kendra NLP search. Are you sure ElasticSearch has
              | a rosy future?
        
             | m00x wrote:
             | They have beef with ES since they took the software, made a
             | bunch of cash on it, then never contributed back. ES called
             | them out and it started a feud.
             | 
             | I'd go on ES over Amazon-built software any day. I worked
             | on RDS and I've used RDS at several companies, it's a mess.
             | 
              | Longer story: One day one of our tables went missing on
              | Aurora, we couldn't figure out why, it was in the schema,
             | etc. Devops panicked and restarted the instance, and then
             | another table was missing. We ended up creating 10 empty
             | tables and restarted it until it hit one of those.
             | 
             | We contacted RDS support after that, and the conclusion of
             | their 3 month investigation is: "Yeah, it's not supposed to
             | do that."
             | 
              | There's some really smart people working at Amazon;
              | unfortunately the incentive is to push new stuff out and
              | get promoted ASAP. If you can do that better than others
              | and before your house of cards falls, you're safe. If the
              | house of cards crumbles after you're gone, it's their
              | problem.
        
               | vmfunction wrote:
               | >Longer story: One day one of our table went missing on
               | Aurora, we couldn't figure out why, it was in the schema,
               | etc. Devops panicked and restarted the instance, and then
               | another table was missing. We ended up creating 10 empty
               | tables and restarted it until it hit one of those.
               | 
                | Are there any reports of this? How come this is the
                | first time I've heard of it? How can companies trust
                | this kind of managed DB service?
        
             | vosper wrote:
             | Amazon forked ElasticSearch into OpenSearch. When deciding
             | which platform to go with (we are an AWS customer) I
             | decided to stick with the company whose future depends on
             | their search product (Elastic), not the one that could lose
             | interest and walk away and suffer almost no consequences
             | (AWS). If OpenSearch is still around in 5 years, and
             | keeping pace with ElasticSearch, then maybe I'd consider it
             | the next time I'm making this choice.
             | 
             | Also there's a lot more to ElasticSearch than full-text
             | search (aggregations, lifecycle management, Kibana).
             | Doesn't seem like Kendra is going to be a replacement for
             | our use case.
        
           | lordofmoria wrote:
           | I tend to agree - search, and particularly search-for-humans,
           | is really a team sport - meaning, very rarely do you have a
           | single search algo operating in isolation. You have multiple
           | passes, you filter results through business logic.
           | 
           | Having said that, I think pgvector has a chance for less
           | scale-intense needs - embedding as a column in your existing
           | DB and a join away from your other models is where you want
           | search.
           | 
           | I don't get why you'd want to bolt RBAC onto these new vector
           | dbs, unless it's because they've caused this problem in the
           | first place...
        
           | dathinab wrote:
            | it also has tons of subtle issues and we are constantly
            | looking for potential replacements
        
         | deepsquirrelnet wrote:
         | Until very recently, "dense retrieval" was not even as good as
         | bm25, and still is not always better.
         | 
         | I think a lot of people use dense retrieval in applications
         | where sparse retrieval is still adequate and much more
         | flexible, because it has the hype behind it. Hybrid approaches
         | also exist and can help balance the strengths and weaknesses of
         | each.
         | 
         | Vectors can also work in other tasks, but largely people seem
         | to be using them for retrieval only, rather than applying them
         | to multiple tasks.
        
           | dathinab wrote:
            | more commonly you use approximate KNN vector search with
            | LLM-based embeddings, which can find many fitting documents
            | that bm25 and similar would never manage to
            | 
            | the tricky part is to properly combine the results
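One common recipe for that combination step is reciprocal rank fusion (RRF), which merges a lexical (bm25) ranking with a vector-search ranking using only rank positions, sidestepping their incompatible score scales. A minimal sketch with made-up result lists:

```python
# Reciprocal Rank Fusion: each doc scores sum(1 / (k + rank)) over the
# rankings it appears in; docs found by both lists rise to the top.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d1", "d2", "d3"]    # lexical results, best first
vector_hits = ["d3", "d4", "d1"]  # ANN results, best first
fused = rrf([bm25_hits, vector_hits])
print(fused)  # d1 and d3 appear in both lists, so they rank highest
```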
        
           | marginalia_nu wrote:
           | A lot of these things are use-case dependent. Like the
           | characteristics even of BM-25 varies a lot depending on
           | whether the query is over or under specified, the nature of
           | the query and so on.
           | 
           | I don't think there will ever be an answer to what is the
           | best way of doing information retrieval for a search engine
           | scale corpus of document that is superior for every type of
           | queries.
        
         | donretag wrote:
         | Vector search is not exclusively in the domain of text search.
         | There is always image/video search.
         | 
          | But pre-filtering is important, since you want to reduce the
          | set of items to be matched on, and it feels like
          | Elasticsearch/OpenSearch are faring better in this regard.
          | Mixed scoring derived from both sparse and dense calculations
          | is also important, which is another strength of ES/OS.
        
         | ruslandanilin wrote:
         | Vespa.ai does a great job. Absolutely stunning thing!
        
           | esafak wrote:
           | What do you like about it relative to alternatives? How fast
           | is it?
        
             | dathinab wrote:
              | much more mature and feature-rich than much of the
              | competition listed in the article
              | 
              | to some degree it's more a platform you can use to
              | efficiently and flexibly build your own more complicated
              | search system, which is both a benefit and a drawback
              | 
              | some good parts:
              | 
              | - very flexible text search (bm25), more so than
              | Elasticsearch (or at least easier to use/better
              | documented when it comes to advanced features)
              | 
              | - fast, flexible enough vector search, with good
              | filtering capabilities
              | 
              | - built-in support for defining more complicated search
              | pipelines, including multi-phase search (also known as
              | reranking)
              | 
              | - quite a nice approach for finer control over what kind
              | of indices are built for which fields
              | 
              | - when doing schema changes it has safety checks to make
              | sure you don't accidentally break anything, which you
              | can override if you are sure you want that
              | 
              | - a ton of control in a cluster over where which search
              | system resources get allocated (e.g. which schemas get
              | stored on which storage clusters, which cluster nodes
              | should act as storage nodes, and which should e.g. only
              | do preprocessing or post-processing steps in a search
              | pipeline, or calculate embeddings using some LLM or
              | similar). Not something you need for demos, but
              | definitely something you need once your customers have
              | enough data.
              | 
              | - child documents, and document references
              | 
              | - multiple vectors per document
              | 
              | - quite an interesting set of data types for fields and
              | related ways you can use them in a search pipeline
              | 
              | - a flexible, reasonably easy-to-use system for
              | plugins/extensions (though Java only)
              | 
              | - support for building search pipelines which have sub-
              | searches in external, potentially non-Vespa systems
              | 
              | - really well documented
              | 
              | Though the main benefit *and drawback* is that it's not
              | just a vector database, but a full-fledged search system
              | platform.
        
             | bratao wrote:
             | +1 for Vespa. For me it is VERY resilient and production
                | ready. It is such a dream compared to Elasticsearch,
                | which we migrated from.
        
               | vinni2 wrote:
               | Does Vespa have an equivalent of Kibana? and how hard was
               | the migration?
        
         | [deleted]
        
       | deepsquirrelnet wrote:
       | What advantage are vector databases providing above using an
       | index in conjunction with a mature database? I'm not sold on this
       | as a separate technology.
       | 
       | Vector search is useful, but I don't understand why I would go
       | out of my way when I could implement FAISS or HNSWlib as an
       | adjunct to postgres or a document store.
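That adjunct pattern reduces to: an ANN index returns ids, and the full row is fetched from the database by id. A sketch where a brute-force function stands in for FAISS/HNSWlib and an in-memory dict stands in for Postgres (all names and data are illustrative):

```python
import math

# Stand-in for a Postgres table / document store keyed by id.
rows = {
    1: {"text": "jazz history", "vec": [0.9, 0.1]},
    2: {"text": "rock guide",   "vec": [0.1, 0.9]},
}

def cos(a, b):
    return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))

def knn(query, k):
    # Brute-force nearest neighbors; swap in faiss/hnswlib at scale.
    return sorted(rows, key=lambda i: cos(query, rows[i]["vec"]),
                  reverse=True)[:k]

ids = knn([1.0, 0.0], k=1)          # index step: ids only
print([rows[i]["text"] for i in ids])  # fetch step -> ['jazz history']
```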
        
          | dathinab wrote:
          | The thing is, if you need a vector _database_ there is no
          | reason it can't be a pg extension. And if your project is
          | only small scale, there is probably some HNSW pg extension
          | you could use.
          | 
          | But what is most often needed, instead of a vector database,
          | is an efficient, fast, responsive approximate KNN vector
          | search system with fast attribute filtering, which overlaps
          | with a fast and efficient text search system (e.g. bm25
          | based).
          | 
          | And if you then go to billion-vector scale, things become
          | tricky performance-wise.
          | 
          | And then you reach the point where companies take a
          | warehouse-style approach: a read-only, extremely
          | read-optimized, mostly in-memory variant of their db that is
          | accessed for searches only, with changes from the main db
          | streamed to the read-only search instance, potentially at
          | the cost of snapshot views, transactions and the like.
          | 
          | You could say that approximate KNN vector search is the new
          | must-have feature for unstructured fuzzy text search, and
          | while you can have unstructured fuzzy text search in pg,
          | it's often not the go-to solution if your database exists
          | just to serve that search.
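The pattern described in the comment above, nearest-neighbour vector search combined with attribute filtering, can be sketched in plain Python. This is a hypothetical brute-force stand-in, not a real ANN index (a production system would push the filter into an HNSW-style index traversal); all names and data are made up:

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def filtered_knn(query, docs, k, predicate):
    # Exact KNN with attribute pre-filtering. A real system would apply
    # the predicate during ANN index traversal instead of scanning.
    candidates = [d for d in docs if predicate(d["attrs"])]
    candidates.sort(key=lambda d: cosine(query, d["vec"]), reverse=True)
    return candidates[:k]

docs = [
    {"id": 1, "vec": [1.0, 0.0], "attrs": {"lang": "en"}},
    {"id": 2, "vec": [0.9, 0.1], "attrs": {"lang": "de"}},
    {"id": 3, "vec": [0.0, 1.0], "attrs": {"lang": "en"}},
]
top = filtered_knn([1.0, 0.0], docs, k=1,
                   predicate=lambda a: a["lang"] == "en")
```

Here document 2 is the second-closest vector but is excluded by the language filter before ranking, which is exactly why fast attribute filtering matters.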
        
         | spullara wrote:
          | Vector extensions to your current database or search engine
          | make far more sense than adding yet another dependency to
          | manage and operate. The vector database folks will have to
         | become a real database or full featured search engine to
         | survive and compete with the incumbents that will all have good
         | solutions for vector similarity search.
        
         | dmezzetti wrote:
         | If you're interested in an approach like this, take a look at
         | txtai.
         | 
         | 1. https://neuml.github.io/txtai/embeddings/indexing/
         | 
         | 2. https://neuml.hashnode.dev/external-database-integration
        
           | deepsquirrelnet wrote:
           | I love this idea. It seems like a very practical approach.
           | I'm going to give this a try on my next project.
        
       | ldjkfkdsjnv wrote:
       | Postgres vector store has been the most simple, and will be if
       | you are at a lower scale. You can just use it directly with
       | something like spring boot.
        
         | avthar wrote:
         | Agreed on pgvector being simple and a great choice for POCs and
         | low scale, especially if you're familiar with Postgres. Our
         | team released something new last week built for folks looking
         | to use PostgreSQL at scale as a vector store [0], featuring a
         | DiskANN index type.
         | 
         | [0]: https://www.timescale.com/blog/how-we-made-postgresql-the-
         | be...
        
       | la64710 wrote:
        | Somehow I felt that at least part of the article was generated
        | by an LLM. It's unfortunate to see that a new bias has started
        | to creep up. Whatever I read now, I second-guess and _feel_ it
        | may be partially or fully generated by LLMs.
        
       | jn2clark wrote:
        | As others have correctly pointed out, building a vector search
        | or recommendation application requires a lot more than
        | similarity alone. We have seen HNSW become commoditised and
        | the real
       | value lies elsewhere. Just because a database has vector
       | functionality doesn't mean it will actually service anything
       | beyond "hello world" type semantic search applications. IMHO
       | these have questionable value, much like the simple Q and A RAG
       | applications that have proliferated. The elephant in the room
       | with these systems is that if you are relying on machine learning
       | models to produce the vectors you are going to need to invest
       | heavily in the ML components of the system. Domain specific
       | models are a must if you want to be a serious contender to an
       | existing search system and all the usual considerations still
       | apply regarding frequent retraining and monitoring of the models.
       | Currently this is left as an exercise to the reader - and a very
       | large one at that. We (https://github.com/marqo-ai/marqo, I am a
       | co-founder) are investing heavily into making the ML production
       | worthy and continuous learning from feedback of the models as
       | part of the system. Lots of other things to think about in how
       | you represent documents with multiple vectors, multimodality,
       | late interactions, the interplay between embedding quality and
       | HNSW graph quality (i.e. recall) and much more.
        
       | donretag wrote:
       | Curious about the lack of Vespa, especially given the
       | thoroughness of the article and its long-time reputation.
        | OpenSearch is also missing, but perhaps it can be lumped in
        | with Elasticsearch, since both are based on Lucene. The
        | products are starting to diverge, though, so it would be nice
        | to see, especially since it is open-source.
       | 
       | For the performance-based columns, would be also helpful to see
       | which versions were tested. There is so much attention lately for
       | vector databases, that they all are making great strides forward.
       | The Lucene updates are notable.
        
         | emilfroberg wrote:
          | Someone else also pointed out that Vespa was missing. I'll
          | have to look into it and add it to the article!
        
       | lazy_moderator1 wrote:
       | also, typesense
        
       | dathinab wrote:
        | Their definition of Hybrid Search is, I think, wrong.
        | 
        | Though these terms tend not to be defined consistently at all,
        | so "wrong" is maybe the wrong word.
        | 
        | Their definition seems to be about filtering results during
        | (approximate) KNN vector search.
        | 
        | But that is filtering, not hybrid search. Though it might
        | sometimes be implemented as a form of hybrid search, that's an
        | internal implementation detail, and you should probably hope
        | it's not implemented that way.
        | 
        | Hybrid search is when you do both a vector search and a more
        | classical text-based search (e.g. bm25) and combine the two
        | result sets in a reasonable way.
        
         | emilfroberg wrote:
         | The way you explain hybrid search aligns with my understanding.
         | Pinecone has a good article about it here
         | https://www.pinecone.io/learn/hybrid-search-intro/. From my
         | understanding, all vector DBs support this.
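One common way to do the "combine both results in a reasonable way" step is Reciprocal Rank Fusion (RRF), which several of the databases in the comparison use for hybrid search. The sketch below is illustrative, with made-up document ids; `k=60` is the constant conventionally used with RRF:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: each ranked list contributes 1/(k + rank)
    # per document, and documents found by both searches accumulate
    # score from each list.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d3", "d1", "d2"]  # ranked output of the KNN search
bm25_hits = ["d1", "d4", "d3"]    # ranked output of the bm25 search
fused = rrf([vector_hits, bm25_hits])  # ["d1", "d3", "d4", "d2"]
```

Note how "d1" wins the fused ranking: it is not first in either list, but it ranks highly in both, which is the behaviour hybrid search is after.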
        
       | J_Shelby_J wrote:
       | Nice post! I think this could be a very good page to bookmark.
       | 
       | There is also this series of articles detailing the options and
       | it includes some that the OP is missing:
       | https://thedataquarry.com/posts/vector-db-1/#key-takeaways
       | 
        | I'm currently in the market for a self-hosted DB for a
        | personal project: an app you can run on your own system that
        | does QA over your text files. So I'm looking for something
        | lightweight, but I'm also looking for the best possible
        | search, and ANN retrieval is just a single part of that.
        
       | Havoc wrote:
       | 16x difference between pg and milvus?
       | 
       | I thought for most use cases this would be quite performance
       | sensitive
        
         | emilfroberg wrote:
          | Yeah, that's the difference we've seen according to the QPS
          | figures in the ANN-Benchmarks results. The same story seems
          | to hold for other datasets too. We're looking at 0.9 recall.
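For context on the "0.9 recall" figure: recall@k measures what fraction of the true k nearest neighbours the approximate index actually returned, and ANN-Benchmarks compares QPS at a fixed recall level. A small illustrative sketch with made-up ids:

```python
def recall_at_k(approx_ids, true_ids):
    # Fraction of the true nearest neighbours that the approximate
    # index returned. Benchmarks report throughput (QPS) at a fixed
    # recall level such as 0.9 so speed comparisons are apples-to-apples.
    return len(set(approx_ids) & set(true_ids)) / len(true_ids)

# Hypothetical query: exact search finds neighbours 1..10, while the
# ANN index returns 11 in place of 10, missing one of the ten.
true_ids = list(range(1, 11))
approx_ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
recall = recall_at_k(approx_ids, true_ids)  # 0.9
```

Comparing raw QPS numbers without pinning recall like this would be meaningless, since any index can go faster by returning worse neighbours.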
        
       ___________________________________________________________________
       (page generated 2023-10-04 23:01 UTC)