text-ranking (cross-encoder reranker)
HF group: NLP · Status: ❌ not built · Tier 1 (high leverage)
What it is
Given a query and a list of candidate documents, a cross-encoder re-scores each (query, document) pair jointly. Cross-encoders generally beat bi-encoders on top-K re-ranking quality and are the standard “second stage” in RAG pipelines:
- First stage: bi-encoder retrieves top-N (cheap, see text-to-embedding).
- Second stage: cross-encoder re-scores top-N to top-K (expensive but accurate).
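
The two stages above can be sketched as follows. This is a minimal illustration, not an implementation of any existing API: `Doc`, `topBy`, `toyPairScore`, and `retrieveThenRerank` are hypothetical names, the dot product stands in for a bi-encoder similarity over precomputed embeddings, and the token-overlap function stands in for a real cross-encoder forward pass.

```typescript
// All names here are hypothetical and exist only for illustration.
type Doc = { id: string; text: string; embedding: number[] };
type Scored = { id: string; score: number };

// Bi-encoder stand-in: query and documents are embedded independently,
// so first-stage scoring is a dot product over stored vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cross-encoder stand-in: it sees the query and the document text together.
// A real model would run one forward pass per pair; token overlap is used
// here only so the sketch runs without a model.
function toyPairScore(query: string, text: string): number {
  const queryTerms = new Set(query.toLowerCase().split(/\s+/));
  const words = text.toLowerCase().split(/\s+/);
  return words.filter((w) => queryTerms.has(w)).length / words.length;
}

// Score every item, sort best-first, keep the first k.
function topBy<T>(
  items: T[],
  score: (item: T) => number,
  k: number
): { item: T; score: number }[] {
  return items
    .map((item) => ({ item, score: score(item) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

function retrieveThenRerank(
  query: string,
  queryEmbedding: number[],
  corpus: Doc[],
  n: number,
  k: number
): Scored[] {
  // Stage 1: cheap vector scoring over the whole corpus, keep top-N.
  const firstStage = topBy(corpus, (d) => dot(queryEmbedding, d.embedding), n);
  // Stage 2: expensive joint scoring on the N survivors only, keep top-K.
  const secondStage = topBy(
    firstStage.map((r) => r.item),
    (d) => toyPairScore(query, d.text),
    k
  );
  return secondStage.map((r) => ({ id: r.item.id, score: r.score }));
}
```

The point of the split is cost: the pairwise scorer runs N times per query rather than once per document in the corpus, so N bounds second-stage latency.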
Open-weight models
| Model | Params | Released | License | Quality | Notes |
|---|---|---|---|---|---|
| BGE-Reranker-Base | 280 M | 2023 | MIT | Solid baseline | Self-hostable. |
| BGE-Reranker-V2-M3 | 568 M | 2024 | MIT | Lightweight, multilingual | Good practical baseline. |
| BGE-Reranker-V2-Gemma | 2 B | 2024 | Gemma | Strong | LLM-based reranker (Gemma-2B backbone). |
| ColBERT-v2 | 110 M | 2022 | Apache-2.0 | Late interaction | Higher infra complexity than cross-encoders. |
| ZeroEntropy zerank | various | 2025 | MIT | Top open in their bench | Newer entrant. |
| mxbai-rerank-large-v1 | 435 M | 2024 | Apache-2.0 | Strong | Competitive with Cohere Rerank. |
Infrastructure required
Inference
- ❌ Cross-encoder mode in locara-llama (llama.cpp supports reranker models). Same shape as embedding inference but with a different output (a single score per pair instead of a vector).
Input
- Query string + list of candidate documents.
Output
- Candidates sorted by descending relevance score, each with its score.
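
As a rough sketch of this input/output contract, the snippet below pairs one raw score per candidate with the candidate text and returns a best-first list. The type and field names (`RerankInput`, `RankedCandidate`, `index`) are assumptions for illustration, not a defined schema.

```typescript
// Hypothetical shapes; the field names are assumptions, not a defined schema.
interface RerankInput {
  query: string;
  candidates: string[];
}

interface RankedCandidate {
  index: number; // position in the original candidates array
  text: string;
  score: number; // one scalar per (query, candidate) pair
}

// Pair each candidate with its score and sort best-first.
// Ties keep the original candidate order.
function toRankedList(
  input: RerankInput,
  scores: number[],
  topK?: number
): RankedCandidate[] {
  if (scores.length !== input.candidates.length) {
    throw new Error("expected exactly one score per candidate");
  }
  const ranked = input.candidates
    .map((text, index) => ({
      index,
      text,
      score: scores[index] ?? Number.NEGATIVE_INFINITY,
    }))
    .sort((a, b) => b.score - a.score || a.index - b.index);
  return topK === undefined ? ranked : ranked.slice(0, topK);
}
```

Carrying the original `index` in each result lets the caller map ranked entries back to its own records (ids, metadata) without relying on text equality.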
Storage
- ❌ Weights cache.
- App-side: the reranker is typically invoked at search time, so no per-call persistence is needed.
Interaction (IPC + SDK)
- ❌ rerank.score({ query, candidates }) IPC.
- App pattern: vector search retrieves top-N, reranker re-orders to top-K.
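
Since the IPC is not built yet, the following is only a guess at how an app might consume it. Only the call shape `rerank.score({ query, candidates })` comes from the text above; the `RerankApi` interface, the result fields, the promise-based return, and the `fakeRerank` stub (which counts matching query terms instead of running a model) are all assumptions made so the sketch runs.

```typescript
// Hypothetical request/response types for the proposed rerank.score IPC.
type RerankRequest = { query: string; candidates: string[] };
type RerankResult = { index: number; score: number };

interface RerankApi {
  // Assumed to resolve with results already sorted best-first.
  score(req: RerankRequest): Promise<RerankResult[]>;
}

// In-memory fake standing in for the not-yet-built backend.
// It counts how many query terms appear in each candidate.
const fakeRerank: RerankApi = {
  async score({ query, candidates }) {
    const terms = query
      .toLowerCase()
      .split(/\s+/)
      .filter((t) => t.length > 0);
    return candidates
      .map((text, index) => {
        const lower = text.toLowerCase();
        return { index, score: terms.filter((t) => lower.includes(t)).length };
      })
      .sort((a, b) => b.score - a.score);
  },
};

// App pattern: top-N hits from vector search in, top-K reranked hits out.
async function rerankHits<T extends { text: string }>(
  rerank: RerankApi,
  query: string,
  hits: T[],
  k: number
): Promise<T[]> {
  if (hits.length === 0) return [];
  const results = await rerank.score({
    query,
    candidates: hits.map((h) => h.text),
  });
  return results
    .slice(0, k)
    .map((r) => hits[r.index])
    .filter((h): h is T => h !== undefined);
}
```

Passing the API in as a parameter keeps the app code testable against a fake until a real backend exists.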
Capabilities (manifest)
capabilities.models[] for the reranker.
Gaps
A reranker is the retrieval-quality multiplier for any RAG-style
app. DocVault would benefit directly. It probably fits in
locara-llama (llama.cpp supports reranker models) or a small
new crate. Tier 1 BACKLOG.
See also
- text-to-embedding
- visual-document-retrieval
- Crates: locara-llama (likely host), locara-storage
- Index: ../modalities-and-models-survey.md