Locara

text-to-embedding

HF group: NLP · Status: ✅ shipped

HF aliases: feature-extraction, sentence-similarity.

What it is

Maps text to a fixed-size float vector for retrieval, clustering, classification, and similarity. It is the default storage primitive for any Locara app that needs RAG-style retrieval.
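As a minimal sketch of what "similarity" means for these vectors, the snippet below computes cosine similarity and ranks candidates against a query. The function names `cosineSimilarity` and `rankBySimilarity` are illustrative and not part of the Locara SDK; the vectors are assumed to come from the same embedding model so their dimensions match.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Hypothetical helper for illustration; not a Locara SDK function.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate vectors by similarity to the query, best match first.
function rankBySimilarity(
  query: Float32Array,
  candidates: Float32Array[],
): { index: number; score: number }[] {
  return candidates
    .map((vec, index) => ({ index, score: cosineSimilarity(query, vec) }))
    .sort((x, y) => y.score - x.score);
}
```

In practice the ranking would be done by the vector store rather than in application code; the sketch only shows the underlying comparison.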

Open-weight models

| Model | Params | Released | License | Quality | Notes |
| --- | --- | --- | --- | --- | --- |
| BGE-M3 | 568 M | 2024 | MIT | MTEB 72 % retrieval | Multilingual, dense + sparse + multi-vector in one model. Best self-host pick. |
| Nomic Embed v2 | 137 M (305 M MoE active / 475 M total) | 2025 | Apache-2.0 | Strong on multilingual | First MoE embedder. CPU-friendly. |
| E5-Large-v2 | 560 M | 2023 | MIT | Strong English | Solid baseline. |
| E5-Small | 33 M | 2023 | MIT | Punch-above-weight | 384 dims, 512 ctx; good for laptops. |
| GTE-multilingual-base | 305 M | 2024 | Apache-2.0 | Strong, fast | 10× faster than decoder-only embedders. |
| Snowflake-Arctic-Embed-L | 335 M | 2024 | Apache-2.0 | Strong English | Best open-weight English-only. |
| mxbai-embed-large-v1 | 335 M | 2024 | Apache-2.0 | Strong | Drop-in BGE alternative. |

Infrastructure required

Inference

  • ✅ Encoder-only mode in locara-llama (llama.cpp’s embedding mode).
  • ❌ Some models (Nomic Embed v2 MoE, BGE-M3 multi-vector) need a separate ONNX or Candle path.

Input

  • Plain UTF-8 text strings (single or batch). No special capture infrastructure.

Output

  • One Vec<f32> per input string, returned synchronously.
  • Output dimensionality is model-dependent and declared in the manifest so storage allocates the right schema.

Storage

  • ✅ Weights via locara-models::Cache.
  • ✅ Vector store: sqlite-vec extension via locara-storage (per-app SQLite + vec0 virtual tables for nearest-neighbor search).
  • ❌ Multi-vector storage (for ColPali-style late interaction): sqlite-vec can hold it via custom SQL, but there is no clean SDK helper yet.
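To make the vec0 storage shape concrete, here is a hedged sketch of the SQL an app might use and of how a vector can be packed for binding. The `CREATE VIRTUAL TABLE ... USING vec0(...)` and `MATCH ... AND k = ?` forms follow sqlite-vec's documented syntax; the helper names (`vecTableDdl`, `knnQuerySql`, `toVecBlob`), the column name `embedding`, and the table name in the usage note are assumptions for illustration, not locara-storage APIs.

```typescript
// Build the DDL for a vec0 virtual table with a fixed embedding dimension.
// The dimension must match the model's output size (e.g. 384 for E5-Small).
function vecTableDdl(table: string, dims: number): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new Error(`invalid table name: ${table}`);
  }
  if (!Number.isInteger(dims) || dims <= 0) {
    throw new Error(`invalid dimension: ${dims}`);
  }
  return `CREATE VIRTUAL TABLE ${table} USING vec0(embedding float[${dims}]);`;
}

// KNN query: bind the query vector first, then k (number of neighbors).
function knnQuerySql(table: string): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new Error(`invalid table name: ${table}`);
  }
  return `SELECT rowid, distance FROM ${table} WHERE embedding MATCH ? AND k = ? ORDER BY distance;`;
}

// Pack a vector as little-endian float32 bytes, the compact binary form
// sqlite-vec accepts on common (little-endian) platforms.
function toVecBlob(vec: number[], dims: number): Uint8Array {
  if (vec.length !== dims) {
    throw new Error(`expected ${dims} dims, got ${vec.length}`);
  }
  const buf = new ArrayBuffer(dims * 4);
  const view = new DataView(buf);
  vec.forEach((x, i) => view.setFloat32(i * 4, x, true));
  return new Uint8Array(buf);
}
```

For example, `vecTableDdl("chunks", 384)` yields a statement suitable for a schema file sized for a 384-dimension model. Checking the length in `toVecBlob` catches a model/schema mismatch before the insert reaches SQLite.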

Interaction (IPC + SDK)

  • ✅ IPC: embed.embed accepting string | string[].
  • ✅ SDK: embed.embed(text) / embed.embed(texts) in packages/sdk/src/embed.ts.
  • ❌ Streaming embedding for long-doc ingestion (currently best-effort batching in apps).
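Until streaming lands, the app-side "best-effort batching" mentioned above might look roughly like the sketch below. The embedding call is injected as a parameter so the example stays self-contained; the `EmbedFn` type, its `Promise<number[][]>` return shape, and the helper names are assumptions, not the actual signature of embed.embed. The character-based chunker is deliberately naive; a real ingestion path would likely split on tokens or sentence boundaries.

```typescript
// Assumed shape of a batch embedding call: one vector per input string.
// This is a hypothetical type, not the real SDK signature.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Naive fixed-size chunker: splits by character count, ignoring word
// and sentence boundaries.
function chunkText(text: string, maxChars: number): string[] {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new Error(`invalid maxChars: ${maxChars}`);
  }
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// Embed texts in sequential batches so a large document never goes
// through a single oversized call. Output order matches input order.
async function embedInBatches(
  embed: EmbedFn,
  texts: string[],
  batchSize: number,
): Promise<number[][]> {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error(`invalid batchSize: ${batchSize}`);
  }
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    const vectors = await embed(batch);
    if (vectors.length !== batch.length) {
      throw new Error(`expected ${batch.length} vectors, got ${vectors.length}`);
    }
    out.push(...vectors);
  }
  return out;
}
```

An app could pass a thin wrapper around its SDK call as `embed`, then write each returned vector to storage alongside its chunk.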

Capabilities (manifest)

  • capabilities.models[] lists the embedding model.
  • App-side schema declared in storage.schema (sql file) to allocate the right vector dimension.

Gaps

  • Bigger curated model list (we ship one default).
  • Async batch embedding for large corpora (currently best-effort in apps).
  • Streaming embedding for long-doc ingestion.
  • Multi-vector mode for ColPali-class retrieval.
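For context on the multi-vector gap: late-interaction retrieval in the ColBERT/ColPali style keeps one vector per token (or image patch) and scores a document by summing, for each query vector, its best match among the document's vectors (often called MaxSim). The sketch below illustrates that scoring rule only; `dotProduct` and `maxSimScore` are illustrative names, and nothing here reflects an existing Locara helper.

```typescript
// Dot product of two vectors of equal length.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Late-interaction (MaxSim) score: for every query vector, take the
// highest dot product against any document vector, then sum those maxima.
function maxSimScore(queryVecs: number[][], docVecs: number[][]): number {
  if (docVecs.length === 0) return 0;
  let total = 0;
  for (const q of queryVecs) {
    let best = -Infinity;
    for (const d of docVecs) {
      best = Math.max(best, dotProduct(q, d));
    }
    total += best;
  }
  return total;
}
```

Because each document contributes many vectors, storage needs a one-to-many layout (document id to vectors), which is why a single-vector table does not cover this case directly.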

See also