Locara

text-to-image

HF group: Computer Vision · Status: ❌ not built

What it is

Generates an image from a text prompt: the classic diffusion task.

Open-weight models

| Model | Params | Released | License | Quality | Notes |
|---|---|---|---|---|---|
| FLUX.1 [schnell] | 12 B | 2024-08 | Apache-2.0 | Best open-weight image quality | Distilled to 4 steps; fast. |
| FLUX.1 [dev] | 12 B | 2024-08 | FLUX-1-dev (non-comm.) | Top quality | Non-commercial only. |
| FLUX.1 Kontext [dev] | 12 B | 2025 | FLUX-1-dev | Image editing | See image-text-to-image. |
| Stable Diffusion 3.5 Medium | 2.5 B | 2024-10 | Stability community | Strong, much smaller than FLUX | Best fit for edge devices / 16 GB Macs. |
| Stable Diffusion 3.5 Large | 8 B | 2024-10 | Stability community | Competitive with FLUX | Heavier. |
| Stable Diffusion XL Lightning | 3.5 B | 2024 | OpenRAIL++-M | Fast (4-step) | Workhorse for quick generations. |

Infrastructure required

Inference

  • Diffusion runtime (typically Diffusers, mlx-diffusion, or Candle's diffusers). Cleanest path: mlx-diffusion running SD 3.5 Medium as the default model, wrapped in a new locara-diffusion crate.
  • ❌ Quantization path to fit 12 B FLUX on a 24 GB Mac.
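
As a back-of-the-envelope check on why a quantization path is needed, the sketch below estimates weight size from parameter count and bits per weight. It counts only the 12 B diffusion transformer weights; text encoders, the VAE, activations, and OS overhead are ignored. The function name `weightBytes` is illustrative and not part of any Locara API.

```typescript
// Rough weight-memory estimate: params * bitsPerWeight / 8 bytes.
// Assumption: only the diffusion transformer's weights are counted.
function weightBytes(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8;
}

const GB = 1e9; // decimal gigabytes

const fluxParams = 12e9; // 12 B parameters, from the model table
for (const bits of [16, 8, 4]) {
  const gb = weightBytes(fluxParams, bits) / GB;
  console.log(`${bits}-bit: ${gb.toFixed(1)} GB`);
}
// prints:
// 16-bit: 24.0 GB
// 8-bit: 12.0 GB
// 4-bit: 6.0 GB
```

At 16 bits per weight the weights alone would take all of a 24 GB Mac's unified memory, so the 12 B FLUX models only become plausible at 8 bits or fewer.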

Input

  • Plain text prompt (optionally with negative prompt).

Output

  • ❌ Image bytes streamed back during sampling as a progressive preview: each diffusion step's latent is decoded so the UI can update live.
  • Final image saved to disk.

Storage

  • ❌ Weights via locara-models::Cache (large — 12 B FLUX is several GB even quantized).
  • Output written to fs.user-folder when the user saves.

Interaction (IPC + SDK)

  • image.generate({ prompt, options }) IPC with progress events (Tauri Channel<DiffusionStep>).
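
One possible shape for this call from the SDK side is sketched below. Apart from `prompt`, `options`, and the `DiffusionStep` name, everything here is an assumption: the option and event fields are hypothetical, and `generateImage` is a local stand-in that emits fake progress events rather than going through Tauri IPC.

```typescript
// Hypothetical shapes; the real Locara SDK types may differ.
interface GenerateOptions {
  steps?: number;          // number of sampling steps
  negativePrompt?: string; // optional negative prompt
}

interface DiffusionStep {
  step: number;         // 1-based index of the step just completed
  totalSteps: number;
  preview?: Uint8Array; // preview image bytes, if decoded for this step
}

interface GenerateResult {
  image: Uint8Array; // final encoded image bytes
}

type StepHandler = (event: DiffusionStep) => void;

// Stand-in for the backend: emits one progress event per step, then resolves.
// A real implementation would forward the request over IPC instead of looping locally.
async function generateImage(
  request: { prompt: string; options?: GenerateOptions },
  onStep: StepHandler,
): Promise<GenerateResult> {
  const totalSteps = request.options?.steps ?? 4;
  for (let step = 1; step <= totalSteps; step++) {
    // Placeholder preview: a single byte carrying the step index.
    onStep({ step, totalSteps, preview: new Uint8Array([step]) });
  }
  return { image: new Uint8Array([totalSteps]) };
}

// Usage: log progress as steps arrive, then report the final image size.
generateImage(
  { prompt: "a lighthouse at dusk", options: { steps: 4 } },
  (e) => console.log(`step ${e.step}/${e.totalSteps}`),
).then((result) => console.log(`done: ${result.image.length} bytes`));
```

Keeping progress on a separate callback from the final result mirrors the split between the channel events and the command's return value described above.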

Capabilities (manifest)

  • capabilities.fs.user-folder write for save location.
  • capabilities.models[] for the diffusion model.
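
To make the two requirements concrete, here is a minimal sketch of a check that a manifest grants what text-to-image needs. The manifest shape is only inferred from the capability names above, and the `"read" | "write"` values, the `missingCapabilities` helper, and the model id `sd-3.5-medium` are all hypothetical.

```typescript
// Hypothetical manifest shape inferred from the capability names above;
// the actual Locara manifest schema may differ.
interface Manifest {
  capabilities: {
    fs?: { "user-folder"?: "read" | "write" };
    models?: string[];
  };
}

// Returns the capabilities text-to-image needs but the manifest lacks.
function missingCapabilities(manifest: Manifest, modelId: string): string[] {
  const missing: string[] = [];
  if (manifest.capabilities.fs?.["user-folder"] !== "write") {
    missing.push("capabilities.fs.user-folder (write)");
  }
  if ((manifest.capabilities.models ?? []).indexOf(modelId) === -1) {
    missing.push(`capabilities.models[] entry for ${modelId}`);
  }
  return missing;
}

const exampleManifest: Manifest = {
  capabilities: {
    fs: { "user-folder": "write" },
    models: ["sd-3.5-medium"], // placeholder model id
  },
};
console.log(missingCapabilities(exampleManifest, "sd-3.5-medium")); // prints []
```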

Gaps

The whole stack is missing. The cleanest path is mlx-diffusion (Apple's MLX port of Diffusers) running the small Stable Diffusion 3.5 Medium as the default model. That requires a new locara-diffusion crate, new IPC commands, and a picker UI.

Building this stack unlocks at least 6 other modalities; see “Cross-cutting infrastructure” in ../modalities-and-models-survey.md.

See also