[HN Gopher] Rerank-2.5 and rerank-2.5-lite: instruction-followin...
___________________________________________________________________
Rerank-2.5 and rerank-2.5-lite: instruction-following rerankers
Author : fzliu
Score : 43 points
Date : 2025-08-12 06:12 UTC (1 days ago)
(HTM) web link (blog.voyageai.com)
(TXT) w3m dump (blog.voyageai.com)
| skerit wrote:
| I read the introductory post but I still don't quite understand
| daemonologist wrote:
| A re-ranker takes a query and a chunk of text and assigns them
| a relevance score according to how well the text answers the
| query. (Generally - in theory you could have some other metric
| of relevance.)
|
| They're called "re"rankers specifically because they're usually
| downstream of a faster but less accurate relevance algorithm
| (some kind of full text search and/or vector similarity) in a
| search pipeline. Rerankers have to run from scratch on every
| query-document pair and are relatively computationally
| expensive, and so are practical to run only on a small number
| of documents.
|
| An "instruction following" reranker basically just has a third
| input which is intended to be used kind of like a system prompt
| for an LLM - to provide additional context to all comparisons.
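
A minimal sketch of the pipeline described above, in Python: a cheap first stage narrows the corpus, then a pairwise scorer runs only on the survivors, with an optional instruction applied to every comparison. All names here (`first_stage`, `toy_pair_score`, `rerank`) are hypothetical, and the token-overlap scoring is a toy stand-in for a real retriever and a real reranker model. It is not Voyage's API.

```python
from typing import Callable, List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a crude stand-in for real tokenization."""
    return set(text.lower().split())


def first_stage(query: str, docs: List[str], k: int) -> List[str]:
    """Cheap candidate retrieval: rank every document by token overlap with the query."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def toy_pair_score(query: str, doc: str, instruction: Optional[str] = None) -> float:
    """Hypothetical stand-in for a reranker model call.

    A real reranker would run a model over the (instruction, query, doc) inputs.
    Here the instruction's words are simply added to the query's words.
    """
    wanted = tokens(query) | (tokens(instruction) if instruction else set())
    doc_tokens = tokens(doc)
    return len(wanted & doc_tokens) / max(len(doc_tokens), 1)


def rerank(
    query: str,
    candidates: List[str],
    instruction: Optional[str] = None,
    score: Callable[[str, str, Optional[str]], float] = toy_pair_score,
) -> List[Tuple[float, str]]:
    """Score every (query, candidate) pair from scratch and sort best-first."""
    scored = [(score(query, doc, instruction), doc) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "how to reset a router password",
    "router firmware release notes",
    "baking bread at home",
    "reset your email password in settings",
]
query = "reset router password"

candidates = first_stage(query, docs, k=2)   # the expensive scorer only sees 2 docs
plain = rerank(query, candidates)
steered = rerank(query, candidates, instruction="email settings")
```

In this toy data the instruction changes which candidate ranks first, which is the point of the third input: the same query and documents, compared under different context.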
| gnulinux wrote:
| Rerankers are used downstream from an embedding model.
| Embedding models are "coarse", so they return false positives:
| texts that are less relevant than other candidates. A reranker
| scores a bunch of texts against a query to find the most
| relevant ones. You can then feed those as context to some other
| query.
| sroussey wrote:
| Not really sure why they have a HuggingFace presence.
|
| https://huggingface.co/voyageai/rerank-2.5-lite
| xfalcox wrote:
| Having a public tokenizer is quite useful, especially for
| embeddings. It allows you to do the chunking locally without
| going to the internet.
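
A rough sketch of what chunking locally by token count can look like, in Python. The function name `chunk_by_tokens` and its parameters are hypothetical, and `str.split` is only a placeholder: in practice you would pass the tokenizer that matches the model, so chunk sizes line up with the model's token limits.

```python
from typing import Callable, List


def chunk_by_tokens(
    text: str,
    tokenize: Callable[[str], List[str]],
    max_tokens: int,
    overlap: int = 0,
) -> List[List[str]]:
    """Split text into windows of at most max_tokens tokens.

    Consecutive windows share `overlap` tokens. `tokenize` is whatever
    tokenizer you have locally (assumed here to return a list of tokens).
    """
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")

    toks = tokenize(text)
    step = max_tokens - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(toks), step):
        chunks.append(toks[start:start + max_tokens])
        # Stop once a window reaches the end, so no trailing chunk is pure overlap.
        if start + max_tokens >= len(toks):
            break
    return chunks


# Whitespace splitting is a placeholder for a real tokenizer.
chunks = chunk_by_tokens("a b c d e f g", str.split, max_tokens=3, overlap=1)
```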
| jpctan wrote:
| Hi fzliu,
|
| Have you considered adding keyword search and other factors to
| the re-ranker?
|
| Other factors could include text formatting such as bold,
| headings, and bullet points, as well as the bunch of factors
| typically seen in web search techniques.
| mediaman wrote:
| Keyword search (or something similar in concept, like BM25)
| would typically be the first stage, rather than the second,
| since it can be done with an inverted index.
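
To illustrate why keyword search fits the first stage, here is a small Python sketch of BM25 over an inverted index. The function names and the `k1`/`b` defaults are illustrative choices, and the IDF uses the common `log(1 + ...)` variant. Only documents that share a term with the query are ever touched, which is what makes this cheap enough to run over a whole corpus.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def build_index(docs: List[str]) -> Tuple[Dict[str, Dict[int, int]], List[int]]:
    """Map each term to {doc_id: term frequency}; also return document lengths."""
    postings: Dict[str, Dict[int, int]] = defaultdict(dict)
    lengths: List[int] = []
    for doc_id, text in enumerate(docs):
        terms = text.lower().split()
        lengths.append(len(terms))
        for term, tf in Counter(terms).items():
            postings[term][doc_id] = tf
    return postings, lengths


def bm25_search(
    query: str,
    postings: Dict[str, Dict[int, int]],
    lengths: List[int],
    k1: float = 1.5,
    b: float = 0.75,
    top_k: int = 10,
) -> List[Tuple[int, float]]:
    """Score only the documents found in the query terms' posting lists."""
    n_docs = len(lengths)
    avg_len = sum(lengths) / n_docs  # assumes a non-empty corpus
    scores: Dict[int, float] = defaultdict(float)
    for term in set(query.lower().split()):
        hits = postings.get(term, {})
        if not hits:
            continue
        idf = math.log(1 + (n_docs - len(hits) + 0.5) / (len(hits) + 0.5))
        for doc_id, tf in hits.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]


docs = [
    "how to reset a router password",
    "router firmware release notes",
    "baking bread at home",
    "reset your email password in settings",
]
postings, lengths = build_index(docs)
results = bm25_search("reset router password", postings, lengths)
```

The document about baking never gets scored, because none of its terms appear in the query's posting lists.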
___________________________________________________________________
(page generated 2025-08-13 23:00 UTC)