Locara

image-to-3d

HF group: Computer Vision · Status: ❌ not built

What it is

Single image (+ optional text) → 3D mesh. Most “text-to-3D” pipelines today actually go text → image → 3D, so this is the heavy-lift stage.

Open-weight models

| Model | Params | Released | License | Quality | Notes |
|---|---|---|---|---|---|
| Hunyuan3D-2.1 | ~5 B | 2025-06 | Apache-2.0 | PBR-ready meshes | Same model as text-to-3D. 6 GB VRAM. |
| TripoSR | ~1 B | 2024 | MIT | Half-second image-to-mesh | Bakes lighting into texture; static-asset only. |
| InstantMesh | ~1 B | 2024 | Apache-2.0 | 512x512 mesh | 10× faster than optimization-based methods. |
| CRM (Convolutional Reconstruction Model) | ~600 M | 2024 | Apache-2.0 | Strong on objects | Image → 6 views → mesh. |

Infrastructure required

Inference

  • ❌ 3D-specific runtime.

Input

Output

  • ❌ 3D file save + 3D viewer component.
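Since the file-save half of this item is not built yet, the following is only a rough sketch of what serializing a mesh might look like. It assumes a hypothetical in-memory `Mesh` type and uses Wavefront OBJ purely because the format is simple; the source does not name an output format, and PBR materials would likely need something richer such as glTF.

```typescript
// Hypothetical mesh shape (not from the source): vertex positions plus
// triangles that index into the vertex list, 0-based.
interface Mesh {
  vertices: [number, number, number][];
  faces: [number, number, number][];
}

// Serialize a mesh to Wavefront OBJ text. OBJ face indices are 1-based,
// so every index is shifted by one on output.
function meshToObj(mesh: Mesh): string {
  const lines: string[] = [];
  for (const [x, y, z] of mesh.vertices) {
    lines.push(`v ${x} ${y} ${z}`);
  }
  for (const [a, b, c] of mesh.faces) {
    for (const i of [a, b, c]) {
      // Reject faces that point outside the vertex list.
      if (!Number.isInteger(i) || i < 0 || i >= mesh.vertices.length) {
        throw new RangeError(`face index ${i} is out of range`);
      }
    }
    lines.push(`f ${a + 1} ${b + 1} ${c + 1}`);
  }
  return lines.join("\n") + "\n";
}

// A single triangle as a smoke test.
const triangle: Mesh = {
  vertices: [[0, 0, 0], [1, 0, 0], [0, 1, 0]],
  faces: [[0, 1, 2]],
};
console.log(meshToObj(triangle));
```

The returned string could then be written to the user folder through whatever file API the runtime ends up exposing; that part is left out because it is not specified here.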

Storage

  • ❌ Weights cache.
  • Output: fs.user-folder.

Interaction (IPC + SDK)

  • mesh.from_image({ image, prompt? }) IPC.
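As an illustration of the call shape, here is a minimal sketch of building the request on the SDK side. Only the method name `mesh.from_image` and the `image` / `prompt?` parameters come from the line above; the `IpcMessage` envelope, the assumption that `image` is a file path string, and the helper name `buildMeshFromImageMessage` are all hypothetical.

```typescript
// Parameters as named in the IPC signature; the field types are assumptions.
interface MeshFromImageParams {
  image: string;   // assumed: path of the user-selected image
  prompt?: string; // optional text hint
}

// Hypothetical message envelope; the real IPC wire format is not specified.
interface IpcMessage<P> {
  method: string;
  params: P;
}

// Build the message, rejecting an empty image and dropping a blank prompt
// so the receiving side never has to special-case "".
function buildMeshFromImageMessage(
  params: MeshFromImageParams
): IpcMessage<MeshFromImageParams> {
  if (params.image.trim() === "") {
    throw new Error("image is required");
  }
  const prompt = params.prompt?.trim();
  return {
    method: "mesh.from_image",
    params: prompt ? { image: params.image, prompt } : { image: params.image },
  };
}

const message = buildMeshFromImageMessage({
  image: "chair.png",
  prompt: "  wooden chair ",
});
console.log(JSON.stringify(message));
```

Keeping message construction separate from transport makes the validation testable before any 3D runtime exists.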

Capabilities (manifest)

  • capabilities.fs.user-selected for input image.
  • capabilities.fs.user-folder for save.
  • capabilities.models[] for the model.
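The three entries above might translate into a manifest fragment along these lines. This is a sketch under assumptions: the key names follow the list, but the boolean and string-array value types, the `Capabilities` interface, the model identifier, and the `missingCapabilities` helper are illustrative, not the actual manifest schema.

```typescript
// Hypothetical manifest shape; only the capability key names come from the list above.
interface Capabilities {
  fs: { "user-selected": boolean; "user-folder": boolean };
  models: string[];
}

const capabilities: Capabilities = {
  fs: {
    "user-selected": true, // read the input image the user picked
    "user-folder": true,   // save the generated mesh
  },
  models: ["hunyuan3d-2.1"], // placeholder id, not a confirmed model identifier
};

// List the capabilities an image-to-3d app needs that the manifest does not declare.
function missingCapabilities(c: Capabilities): string[] {
  const missing: string[] = [];
  if (!c.fs["user-selected"]) missing.push("capabilities.fs.user-selected");
  if (!c.fs["user-folder"]) missing.push("capabilities.fs.user-folder");
  if (c.models.length === 0) missing.push("capabilities.models[]");
  return missing;
}

console.log(missingCapabilities(capabilities));
```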

Gaps

  • Image input pipeline.
  • 3D file output + viewer component.

See also