`text-to-embedding`

HF group: NLP · Status: ✅ shipped

HF aliases: feature-extraction, sentence-similarity.

What it is

Text → fixed-size float vector for retrieval, clustering, classification, similarity. The default storage primitive for any Locara app that needs RAG-style retrieval.

Open-weight models

Model	Params	Released	License	Quality	Notes
BGE-M3	568 M	2024	MIT	MTEB 72 % retrieval	Multilingual, dense + sparse + multi-vector in one model. Best self-host pick.
Nomic Embed v2	137 M (305 M MoE active / 475 M total)	2025	Apache-2.0	Strong on multilingual	First MoE embedder. CPU-friendly.
E5-Large-v2	560 M	2023	MIT	Strong English	Solid baseline.
E5-Small	33 M	2023	MIT	Punch-above-weight	384 dims, 512 ctx — good for laptops.
GTE-multilingual-base	305 M	2024	Apache-2.0	Strong, fast	10× faster than decoder-only embedders.
Snowflake-Arctic-Embed-L	335 M	2024	Apache-2.0	Strong English	Best open-weight English-only.
mxbai-embed-large-v1	335 M	2024	Apache-2.0	Strong	Drop-in BGE alternative.

Infrastructure required

Inference

✅ Encoder-only mode in locara-llama (llama.cpp’s embedding mode).
❌ Some models (Nomic Embed v2 MoE, BGE-M3 multi-vector) need a separate ONNX or Candle path.

Input

Plain UTF-8 text strings (single or batch). No special capture infrastructure.

Output

✅ Vec<f32> per input string returned synchronously.
Output dimensionality is model-dependent and declared in the manifest so storage allocates the right schema.

Storage

✅ Weights via locara-models::Cache.
✅ Vector store: sqlite-vec extension via locara-storage (per-app SQLite + vec0 virtual tables for ANN search).
❌ Multi-vector storage (for ColPali-style late interaction) — sqlite-vec supports it via custom SQL but no clean SDK helper yet.

Interaction (IPC + SDK)

✅ IPC: embed.embed accepting string | string[].
✅ SDK: embed.embed(text) / embed.embed(texts) in packages/sdk/src/embed.ts.
❌ Streaming embedding for long-doc ingestion (currently best-effort batching in apps).

Capabilities (manifest)

✅ capabilities.models[] lists the embedding model.
App-side schema declared in storage.schema (sql file) to allocate the right vector dimension.

Gaps

Bigger curated model list (we ship one default).
Async batch embedding for large corpora (currently best-effort in apps).
Streaming embedding for long-doc ingestion.
Multi-vector mode for ColPali-class retrieval.

text-to-embedding