Locara

10 — Tools (Sandboxed Tool Execution)

Tools are callable actions the LLM or app code can invoke to do work in the world: OCR a PDF, search the filesystem, run Python, transform an image. Locara treats tooling as first-class — see 04-modalities.md for the developer-facing declaration model.

This document covers the runtime that executes tools.

Architecture

App declares tooling: ["ocr", "code-exec.python"]

Locara fetches signed wasm tools into shared cache

At call time:
  1. App invokes tools.ocr(...) via SDK
  2. SDK forwards to Rust runtime (capability-checked)
  3. Runtime spins up Wasmtime instance
  4. Imports configured per tool's declared WASI capabilities
  5. Tool runs to completion or abort
  6. Result returned to app

Why Wasmtime + WASI

  • Capability-based by design. Wasm modules have no syscalls except those imported by the host.
  • Fast cold start. Instantiating a precompiled module takes tens of microseconds, versus hundreds of milliseconds to start a container.
  • Cross-platform. One binary runs on any Wasmtime host.
  • Standard. WASI is a stable specification with growing adoption.
  • Memory-isolated. Tool state cannot leak into the host process.
  • Resource-limited. Fuel-based execution caps prevent infinite loops; memory cap prevents allocation bombs.

See ../notes/wasmtime-wasi.md for background.

Tool registry

The Locara tool registry mirrors the model registry — curated, versioned, signed wasm artifacts:

registry/tools/<tool-id>.json
{
  "id": "ocr",
  "version": "1.2.0",
  "displayName": "Optical Character Recognition",
  "description": "Extract text and structure from images and PDFs.",
  "license": "Apache-2.0",
  
  "artifact": {
    "url": "https://cdn.locara.app/tools/ocr-v1.2.0.wasm",
    "sha256": "abc123...",
    "size_bytes": 2400000
  },

  "model_dependencies": [
    "glm-ocr-1.5-q8@sha256:..."
  ],

  "capability_requirements": [
    "fs.user-selected (when source is a file path)"
  ],

  "wasi_imports": {
    "fs": { "scope": "$TOOL_TEMP" },  // tool can only write to a temp dir
    "stdio": "read-write"
  },

  "input_schema": {
    "type": "object",
    "properties": {
      "source": { "$ref": "#/definitions/blob_or_path" },
      "language": { "type": "string", "enum": ["en", "auto"] }
    },
    "required": ["source"]
  },

  "output_schema": {
    "type": "object",
    "properties": {
      "text": { "type": "string" },
      "blocks": { "type": "array" },
      "confidence": { "type": "number" }
    }
  },

  "validated_at": "2026-04-01",
  "validated_by": "locara-team"
}

Apps reference tools by <id>@<version>. At install time the runtime fetches each artifact and verifies its SHA-256 digest against the value in the registry manifest.
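
As a rough illustration of the <id>@<version> reference form, the sketch below splits a reference into its two parts. The ToolRef type and parseToolRef function are hypothetical names used only for this example, not part of the Locara SDK, and the format of the version string is left unchecked because this document does not define it.

```typescript
// Hypothetical helper for illustration; not part of the Locara SDK.
interface ToolRef {
  id: string;      // e.g. "ocr" or "code-exec.python"
  version: string; // e.g. "1.2.0" (format not validated here)
}

// Split "<id>@<version>" at the last "@" and reject references
// that are missing either half.
function parseToolRef(ref: string): ToolRef {
  const at = ref.lastIndexOf("@");
  if (at <= 0 || at === ref.length - 1) {
    throw new Error(`invalid tool reference: ${ref}`);
  }
  return { id: ref.slice(0, at), version: ref.slice(at + 1) };
}

const parsed = parseToolRef("ocr@1.2.0");
console.log(parsed.id, parsed.version);
```

A bare name such as "ocr" is rejected by this sketch; how unversioned names in the tooling list resolve to a pinned version is outside what it covers.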

Built-in tool catalog (v1)

Each tool below ships in the curated registry on day 1 of phase 3:

Document & content

  • ocr — Extract text from images / PDFs using GLM-OCR or RapidOCR.
  • pdf.extract-text — Extract text layer from PDFs.
  • pdf.split — Split PDFs by page.
  • image.resize — Resize / crop / format-convert images.
  • image.format-convert — JPEG ↔ PNG ↔ WebP.

Filesystem

  • filesystem.search — Search a user-selected directory by filename or content (FTS-backed).
  • filesystem.read — Read a user-selected file.
  • filesystem.write — Write to a user-selected path.
  • bash.read-only — Whitelist of safe shell commands (ls, grep, find, wc).

Code execution

  • code-exec.python — Pyodide-based Python in wasm. No native deps; pure Python + a small standard library subset. Memory-capped, time-bounded.
  • code-exec.js — JS execution in a separate wasm sandbox.
  • code-exec.lua — Optional; small footprint scripting.

Network (capability-gated)

  • web.fetch — Fetch a URL. Only available if the app declares net with allowed hosts.

LLM-derived

  • text.summarize — Summarize via the app’s declared LLM.
  • text.translate — Translate via the app’s LLM.
  • text.classify — Categorize text via the app’s LLM.

Audio / Video

  • audio.transcribe — Use the app’s STT modality to transcribe.
  • audio.detect-silence — Detect silence regions for chunking.

This catalog is the starting set. Adding a new tool takes a single PR containing the wasm artifact, the manifest entry, and a signed validation.

Sandbox boundaries

A tool’s wasm module runs in a Wasmtime store with only the WASI imports the tool’s manifest declares. By default:

| Capability | Default | Notes |
|---|---|---|
| fs.read | scope = $TOOL_TEMP only | A temp dir created per invocation |
| fs.write | scope = $TOOL_TEMP only | Wiped after invocation |
| net | none | Tools cannot make outbound calls unless the tool and the host app declare net |
| clocks | yes | Read-only |
| random | yes | Cryptographic-grade |
| env | none | No env vars |
| process | none | No subprocess |
| stdio | per-tool | Some tools read/write structured JSON |

For a tool to gain a capability beyond the default, two conditions must hold:

  1. The tool declares the capability in its registry manifest.
  2. The hosting app also holds that capability (composition rule, see 04-modalities.md).

The composition rule means a tool cannot do anything the app couldn’t already do. Adding a tool never silently expands the app’s reach.
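
One way to read the composition rule is as a set intersection: a tool ends up with only those non-default capabilities that both it and the app declare. The sketch below is a minimal illustration under that reading. The function names and the string-based capability representation are assumptions, and this document does not say whether the runtime narrows a tool's capabilities or rejects the tool outright when the app lacks one, so both views are shown.

```typescript
// Hypothetical sketch of the composition rule; names are illustrative only.

// Capabilities the tool may actually use: declared by the tool AND held by the app.
function grantedCapabilities(toolDeclared: string[], appHeld: string[]): string[] {
  const held = new Set(appHeld);
  return toolDeclared.filter((cap) => held.has(cap));
}

// Capabilities the tool asks for that the app does not hold.
// A runtime could use a non-empty result to refuse to load the tool.
function missingCapabilities(toolDeclared: string[], appHeld: string[]): string[] {
  const held = new Set(appHeld);
  return toolDeclared.filter((cap) => !held.has(cap));
}

const toolCaps = ["net", "fs.user-selected"];
const appCaps = ["fs.user-selected"];
console.log(grantedCapabilities(toolCaps, appCaps));
console.log(missingCapabilities(toolCaps, appCaps));
```

Either way, the result never contains a capability that is absent from the app's own list, which is the property the rule is meant to guarantee.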

Resource limits

Per invocation, tools are bounded:

| Limit | Default | Configurable per tool? |
|---|---|---|
| Memory | 256 MB | Yes, up to 1 GB |
| Wall-clock time | 30 s | Yes, up to 5 min |
| Wasmtime fuel | 10⁹ instructions | Yes |
| Output size | 10 MB | Yes |
| File descriptors | 32 | No |

If a tool exceeds any limit, the runtime aborts the invocation and returns a ToolError. The wasm instance is destroyed; no state persists.
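
The defaults and ceilings above could be resolved per tool roughly as follows. This is a sketch under stated assumptions: the Limits shape, the resolveLimits name, the reading of MB as 2^20 bytes, and the choice to clamp an oversized request rather than reject it are all illustrative and not taken from the Locara runtime.

```typescript
// Hypothetical limit resolution; field names and clamping behavior are assumptions.
interface Limits {
  memoryBytes: number;
  wallClockMs: number;
  fuel: number;
  outputBytes: number;
  fileDescriptors: number;
}

const MB = 1024 * 1024; // assumes binary megabytes

const DEFAULT_LIMITS: Limits = {
  memoryBytes: 256 * MB,
  wallClockMs: 30_000,
  fuel: 1e9,
  outputBytes: 10 * MB,
  fileDescriptors: 32,
};

// Ceilings stated in the table: 1 GB of memory, 5 minutes of wall-clock time.
const MAX_MEMORY_BYTES = 1024 * MB;
const MAX_WALL_CLOCK_MS = 5 * 60_000;

// File descriptors are not configurable, so they are excluded from overrides.
type LimitOverrides = Partial<Omit<Limits, "fileDescriptors">>;

function resolveLimits(overrides: LimitOverrides): Limits {
  return {
    memoryBytes: Math.min(overrides.memoryBytes ?? DEFAULT_LIMITS.memoryBytes, MAX_MEMORY_BYTES),
    wallClockMs: Math.min(overrides.wallClockMs ?? DEFAULT_LIMITS.wallClockMs, MAX_WALL_CLOCK_MS),
    fuel: overrides.fuel ?? DEFAULT_LIMITS.fuel,
    outputBytes: overrides.outputBytes ?? DEFAULT_LIMITS.outputBytes,
    fileDescriptors: DEFAULT_LIMITS.fileDescriptors,
  };
}

console.log(resolveLimits({ memoryBytes: 4096 * MB }).memoryBytes === MAX_MEMORY_BYTES);
```

The table gives no ceiling for fuel or output size, so the sketch passes those overrides through unchanged.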

Tool invocation flow

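// Pseudocode: error conversions and JSON marshalling into wasm memory are elided.
// ToolState, build_linker and call_json are hypothetical helpers, not Wasmtime APIs.
async fn invoke_tool(app: &AppCtx, tool_id: &str, args: Value) -> Result<Value> {
    // 1. Verify tool is declared in app manifest
    let tool_decl = app.manifest.tooling.get(tool_id)
        .ok_or(CapabilityDenied)?;

    // 2. Load the compiled module from cache (or fetch if missing)
    let module = tool_cache.load(&tool_decl.artifact_sha)?;

    // 3. Validate args against tool input schema
    schema_validate(&args, &tool_decl.input_schema)?;

    // 4. Build a fresh Wasmtime store; its host state holds the WASI
    //    context and the per-invocation limits
    let mut store = Store::new(&engine, ToolState::new(tool_decl, app));

    // 5. Set fuel + memory limits (the engine must be built with fuel
    //    consumption enabled for set_fuel to succeed)
    store.set_fuel(tool_decl.fuel_limit)?;
    store.limiter(|state| &mut state.limits); // StoreLimits carrying the memory cap

    // 6. Instantiate with only the imports the tool's wasi_imports allow
    let linker = build_linker(&engine, &tool_decl.wasi_imports, app)?;
    let instance = linker.instantiate_async(&mut store, &module).await?;

    // 7. Invoke entry point, bounded by the wall-clock timeout
    let call = call_json(&mut store, &instance, "invoke", &args);
    let result = tokio::time::timeout(tool_decl.timeout, call)
        .await
        .map_err(|_| ToolError::Timeout)??;

    // 8. Validate result against output schema
    schema_validate(&result, &tool_decl.output_schema)?;

    // The store and instance are dropped on return, so no tool state persists.
    Ok(result)
}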

SDK exposure

import { tools } from '@locara/sdk'

// Direct call
const result = await tools.ocr({ source: pdfBlob, language: 'en' })

// LLM tool calling
const response = await llm.chat({
  model: 'qwen2.5-7b-q4',
  messages: [...],
  tools: ['ocr', 'filesystem.search']  // names must match declared tools
})
// If model wants to call a tool, response.tool_calls is populated
// SDK auto-loops or app handles the tool-calling protocol
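
For apps that handle the tool-calling protocol themselves, the loop might look roughly like the sketch below. Every type here (ToolCall, ChatResponse, Message) and the runToolLoop function are hypothetical stand-ins written for this example; the real SDK shapes are not specified in this document and may differ. The model and the tool are stubbed so the sketch runs on its own.

```typescript
// Hypothetical shapes; the real SDK types may differ.
interface ToolCall { name: string; args: unknown }
interface ChatResponse { content: string; tool_calls: ToolCall[] }
interface Message { role: "user" | "assistant" | "tool"; content: string }

type Chat = (messages: Message[]) => Promise<ChatResponse>;
type ToolFn = (args: unknown) => Promise<unknown>;

// Call the model, run any tools it requests, append the results to the
// history, and repeat until it answers without tool calls or maxTurns is hit.
async function runToolLoop(
  chat: Chat,
  toolTable: Record<string, ToolFn>,
  messages: Message[],
  maxTurns = 5,
): Promise<string> {
  const history = [...messages];
  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await chat(history);
    if (response.tool_calls.length === 0) return response.content;
    history.push({ role: "assistant", content: response.content });
    for (const call of response.tool_calls) {
      const tool = toolTable[call.name];
      // Only tools present in the table (that is, declared by the app) may run.
      if (tool === undefined) throw new Error(`undeclared tool: ${call.name}`);
      const result = await tool(call.args);
      history.push({ role: "tool", content: JSON.stringify(result) });
    }
  }
  throw new Error("tool loop exceeded maxTurns");
}

// Stub model: requests "ocr" once, then answers after seeing a tool result.
const stubChat: Chat = async (history) =>
  history.some((m) => m.role === "tool")
    ? { content: "done", tool_calls: [] }
    : { content: "", tool_calls: [{ name: "ocr", args: { source: "scan.pdf" } }] };

const stubTools: Record<string, ToolFn> = { ocr: async () => ({ text: "hello" }) };

runToolLoop(stubChat, stubTools, [{ role: "user", content: "Read this file" }])
  .then((answer) => console.log(answer));
```

The maxTurns bound is a design choice of the sketch: without it, a model that keeps requesting tools would loop indefinitely.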

Tools as wasm components (forward-looking)

Tools are currently specified as core wasm modules plus WASI. As the WebAssembly Component Model matures, tools will migrate to components:

  • Strongly typed inputs/outputs at the wasm level.
  • Cross-language: tools compiled from Rust, Go, C, and AssemblyScript all interoperate.
  • Composition: tools can be chained without the host having to re-validate types.

v1 ships core wasm; Component Model migration is a v2 task that doesn’t break existing tools.

Custom tools (app-specific)

Apps can ship their own wasm tools alongside the curated registry:

my-app/
├── locara.json
└── tools/
    ├── pdf-classifier.wasm
    └── pdf-classifier.locara-tool.json

The app manifest then lists each custom tool next to the registry tools:

"tooling": [
  "ocr",
  {
    "name": "pdf-classifier",
    "path": "./tools/pdf-classifier.wasm",
    "manifest": "./tools/pdf-classifier.locara-tool.json",
    "signature": "..."
  }
]

The tool manifest follows the same shape as registry tools (input_schema, output_schema, wasi_imports). Custom tools:

  • Must be signed by the publisher.
  • Cannot exceed app capabilities (composition rule).
  • Are reviewed alongside the app at submission.
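
Because the tooling array mixes plain registry names with object entries for bundled tools, a loader has to tell the two apart. The sketch below shows one way to do that with a union type. The type and function names are hypothetical, and the field set is copied from the manifest fragment above rather than from a published schema.

```typescript
// Hypothetical types modeled on the manifest fragment above.
interface CustomToolEntry {
  name: string;
  path: string;
  manifest: string;
  signature: string;
}

// A tooling entry is either a registry tool name or a bundled custom tool.
type ToolingEntry = string | CustomToolEntry;

interface ResolvedTool {
  name: string;
  source: "registry" | "bundled";
}

// Normalize the mixed list into one shape the rest of a loader can consume.
function resolveTooling(entries: ToolingEntry[]): ResolvedTool[] {
  return entries.map((entry): ResolvedTool =>
    typeof entry === "string"
      ? { name: entry, source: "registry" }
      : { name: entry.name, source: "bundled" },
  );
}

const resolved = resolveTooling([
  "ocr",
  {
    name: "pdf-classifier",
    path: "./tools/pdf-classifier.wasm",
    manifest: "./tools/pdf-classifier.locara-tool.json",
    signature: "placeholder-signature", // illustrative value only
  },
]);
console.log(resolved.map((t) => `${t.name}:${t.source}`).join(", "));
```

Signature verification and the composition-rule check would run on the bundled entries after this step; neither is shown here.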

Why not Docker / subprocess / native code?

| Sandbox | Pro | Con | v1 verdict |
|---|---|---|---|
| Wasmtime | Cross-platform, fast cold start, capability-based, no native deps | Can’t run native libs (no pip install pandas with C extensions) | Default |
| Subprocess | Run native code, full Python ecosystem | Weak isolation, hard to limit, security holes | Opt-in only with explicit capability flag |
| Docker | Strong isolation | Heavy dependency, slow startup, not assumed installed | Not v1 |
| macOS containerization | Strong isolation, Apple-supported | Mac-only, complex API | Future research |

Subprocess execution is opt-in via a tool.exec.subprocess capability that triggers strong warnings during review. Most apps never need it.

Open questions

  • (open) Should LLM-derived tools (text.summarize, etc.) be tools at all, or just SDK functions? Treating them as tools lets LLMs call them in agent loops. Treating them as SDK functions is simpler. Probably both — expose as both, document the duality.
  • (open) Tool versioning when called by LLM — does the LLM see only the tool name, or name + version? Leaning name only at call time, with version pinned in lockfile.
  • (open) How do we handle tool deprecation? When ocr@1.2 is replaced by ocr@2.0 (breaking changes), how do existing apps migrate? Probably: support both indefinitely, mark @1.x as deprecated in registry, surface in locara doctor.

Cross-references