Locara

text-to-3d

HF group: Computer Vision · Status: ❌ not built

What it is

Generates a 3D mesh or Gaussian splat from a text prompt. Most “text-to-3D” pipelines today actually run text → image → 3D, so the heavy lifting happens in image-to-3d.
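
The text → image → 3D chaining described above can be sketched as a composition of two stages. This is only an illustration: the names TextToImage, ImageTo3D, Mesh3D and chainTextTo3D are hypothetical and do not come from Locara, and the two stub stages stand in for real models.

```typescript
// Hypothetical stage signatures; none of these names are part of Locara.
type ImageBytes = Uint8Array;

interface Mesh3D {
  format: "glb" | "obj" | "ply";
  data: Uint8Array;
}

type TextToImage = (prompt: string) => Promise<ImageBytes>;
type ImageTo3D = (image: ImageBytes) => Promise<Mesh3D>;

// Compose the two stages into a single text-to-3D function.
function chainTextTo3D(toImage: TextToImage, to3D: ImageTo3D) {
  return async (prompt: string): Promise<Mesh3D> => {
    if (prompt.trim().length === 0) {
      throw new Error("prompt must not be empty");
    }
    const image = await toImage(prompt); // stage 1: text -> image
    return to3D(image);                  // stage 2: image -> 3D
  };
}

// Stub stages so the sketch runs without any model weights.
const fakeToImage: TextToImage = async (prompt) =>
  Uint8Array.from(prompt, (ch) => ch.charCodeAt(0) & 0xff);
const fakeTo3D: ImageTo3D = async (image) => ({ format: "glb", data: image });

const textTo3D = chainTextTo3D(fakeToImage, fakeTo3D);
```

Keeping the stages as separate functions means a model that accepts text directly could replace the whole chain without changing the caller.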

Open-weight models

| Model | Params | Released | License | Quality | Notes |
|---|---|---|---|---|---|
| Hunyuan3D-2.1 | ~5 B | 2025-06 | Apache-2.0 | First production-ready open 3D | PBR-ready meshes, 6 GB VRAM minimum. Both text-to-3D and image-to-3D. |
| MeshAnything | ~350 M | 2024 | CC-BY-NC | High quality but small assets | Auto-regressive mesh generation. |
| 3DGen-Arena open models (variants) | varies | 2025 | various | Active research area | Niche; image-to-3D dominates. |

Infrastructure required

Inference

  • ❌ 3D-specific diffusion / mesh-generation runtime.

Input

  • Plain text prompt; optionally chained through text-to-image first.

Output

  • ❌ 3D file (.glb / .obj / .ply) saved to disk.
  • ❌ 3D viewer component in @locara/components (three.js-based) for in-app preview.
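
Of the three output formats listed, Wavefront OBJ is the simplest to produce by hand: it is plain text with one "v x y z" line per vertex and one "f i j k" line per triangle, using 1-based vertex indices. The sketch below serializes a triangle mesh to that text. The TriangleMesh type and the toObj function are hypothetical names for illustration, and writing the string to disk is left to whatever file API the app uses.

```typescript
// Hypothetical in-memory mesh shape; not a Locara type.
interface TriangleMesh {
  vertices: [number, number, number][];
  faces: [number, number, number][]; // zero-based indices into vertices
}

// Serialize a triangle mesh as Wavefront OBJ text.
function toObj(mesh: TriangleMesh): string {
  const lines: string[] = [];
  for (const [x, y, z] of mesh.vertices) {
    lines.push(`v ${x} ${y} ${z}`);
  }
  for (const face of mesh.faces) {
    for (const index of face) {
      if (!Number.isInteger(index) || index < 0 || index >= mesh.vertices.length) {
        throw new RangeError(`face index ${index} is out of range`);
      }
    }
    // OBJ face indices start at 1, so shift each zero-based index.
    lines.push(`f ${face[0] + 1} ${face[1] + 1} ${face[2] + 1}`);
  }
  return lines.join("\n") + "\n";
}

const triangle: TriangleMesh = {
  vertices: [[0, 0, 0], [1, 0, 0], [0, 1, 0]],
  faces: [[0, 1, 2]],
};
const objText = toObj(triangle);
```

Binary formats such as .glb need a dedicated exporter, so this sketch only covers the text case.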

Storage

  • ❌ Weights cache.
  • Output: fs.user-folder.

Interaction (IPC + SDK)

  • mesh.generate({ prompt }) IPC.
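
A typed wrapper around the mesh.generate call might look like the sketch below. Only the channel name and the prompt field come from this page. The IpcInvoke transport, the result shape (path and format) and the createMeshClient helper are assumptions made for illustration, since the IPC is not built yet.

```typescript
type MeshFormat = "glb" | "obj" | "ply";

interface MeshGenerateRequest {
  prompt: string;
}

// Assumed result shape: where the file was saved and in which format.
interface MeshGenerateResult {
  path: string;
  format: MeshFormat;
}

// Assumed transport: any function that sends a payload on a named channel.
type IpcInvoke = (channel: string, payload: unknown) => Promise<unknown>;

// Runtime check, because IPC responses arrive untyped.
function isMeshGenerateResult(value: unknown): value is MeshGenerateResult {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const format = record["format"];
  return (
    typeof record["path"] === "string" &&
    (format === "glb" || format === "obj" || format === "ply")
  );
}

function createMeshClient(invoke: IpcInvoke) {
  return {
    async generate(request: MeshGenerateRequest): Promise<MeshGenerateResult> {
      if (request.prompt.trim().length === 0) {
        throw new Error("prompt must not be empty");
      }
      const response = await invoke("mesh.generate", request);
      if (!isMeshGenerateResult(response)) {
        throw new Error("unexpected mesh.generate response");
      }
      return response;
    },
  };
}
```

Passing the transport in as a parameter keeps the wrapper testable with a fake invoke function before the real IPC exists.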

Capabilities (manifest)

  • capabilities.fs.user-folder for save.
  • capabilities.models[] for the model.
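
The two manifest entries above can be checked before generation is attempted. The sketch below assumes a manifest shape inferred only from the key paths on this page (capabilities.fs.user-folder as a boolean flag, capabilities.models as a list of model ids). The actual schema may differ, and AppManifest, missingCapabilities and the model id are hypothetical.

```typescript
// Assumed manifest shape, inferred from the key paths named above.
interface AppManifest {
  capabilities?: {
    fs?: { "user-folder"?: boolean };
    models?: string[];
  };
}

// Return the capability paths this feature needs but the manifest lacks.
function missingCapabilities(manifest: AppManifest): string[] {
  const missing: string[] = [];
  if (manifest.capabilities?.fs?.["user-folder"] !== true) {
    missing.push("capabilities.fs.user-folder");
  }
  if ((manifest.capabilities?.models ?? []).length === 0) {
    missing.push("capabilities.models[]");
  }
  return missing;
}

// Placeholder model id, for illustration only.
const exampleManifest: AppManifest = {
  capabilities: {
    fs: { "user-folder": true },
    models: ["example-3d-model"],
  },
};
const missing = missingCapabilities(exampleManifest);
```

An empty result means both capabilities are declared. Otherwise the returned paths can be shown to the developer as-is.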

Gaps

  • 3D viewer component.
  • 3D file output IPC.
  • Most pipelines route through image-to-3d, so that should land first.

See also