10 — Tools (Sandboxed Tool Execution)

Tools are callable actions the LLM or app code can invoke to do work in the world: OCR a PDF, search the filesystem, run Python, transform an image. Locara treats tooling as first-class — see 04-modalities.md for the developer-facing declaration model.

This document covers the runtime that executes tools.

Architecture

App declares tooling: ["ocr", "code-exec.python"]
       ↓
Locara fetches signed wasm tools into shared cache
       ↓
At call time:
  1. App invokes tools.ocr(...) via SDK
  2. SDK forwards to Rust runtime (capability-checked)
  3. Runtime spins up Wasmtime instance
  4. Imports configured per tool's declared WASI capabilities
  5. Tool runs to completion or abort
  6. Result returned to app

Why Wasmtime + WASI

Capability-based by design. A wasm module can reach the outside world only through functions the host explicitly imports into it.
Fast cold start. Instantiating a precompiled module takes tens of microseconds, versus hundreds of milliseconds for containers.
Cross-platform. One binary runs on any Wasmtime host.
Standard. WASI is an open, versioned specification with growing adoption.
Memory-isolated. Each instance has its own linear memory, so tool state cannot leak into the host process.
Resource-limited. Fuel-based execution caps stop infinite loops; a memory cap stops allocation bombs.
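
To make the fuel idea concrete, here is a minimal sketch of a fuel-metered interpreter for a toy instruction set. It is not how Wasmtime is implemented: the instruction set, `runWithFuel`, and the flat cost of one unit per instruction are assumptions chosen for illustration. The point is only that a loop which never terminates still stops once its budget is spent.

```typescript
// Toy fuel metering. All names here are hypothetical; Wasmtime's real accounting differs in detail.
type Instr =
  | { op: "push"; value: number }
  | { op: "add" }
  | { op: "jump"; target: number } // unconditional jump to an instruction index
  | { op: "halt" };

type RunResult =
  | { status: "ok"; top: number | undefined; fuelUsed: number }
  | { status: "out_of_fuel"; fuelUsed: number };

function runWithFuel(program: Instr[], fuel: number): RunResult {
  const stack: number[] = [];
  let pc = 0;
  let fuelUsed = 0;
  while (pc < program.length) {
    const instr = program[pc];
    if (instr === undefined) break; // jumped outside the program
    // Charge one unit before executing; abort when the budget is exhausted.
    if (fuelUsed >= fuel) return { status: "out_of_fuel", fuelUsed };
    fuelUsed += 1;
    switch (instr.op) {
      case "push":
        stack.push(instr.value);
        pc += 1;
        break;
      case "add": {
        const b = stack.pop() ?? 0;
        const a = stack.pop() ?? 0;
        stack.push(a + b);
        pc += 1;
        break;
      }
      case "jump":
        pc = instr.target;
        break;
      case "halt":
        return { status: "ok", top: stack[stack.length - 1], fuelUsed };
    }
  }
  return { status: "ok", top: stack[stack.length - 1], fuelUsed };
}

// A terminating program and an infinite loop, both run under a budget.
const sum = runWithFuel(
  [{ op: "push", value: 2 }, { op: "push", value: 3 }, { op: "add" }, { op: "halt" }],
  100,
);
const spin = runWithFuel([{ op: "jump", target: 0 }], 1000);
```

The first program finishes well within its budget; the second would loop forever without metering and instead returns `out_of_fuel` after exactly its budget.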

See ../notes/wasmtime-wasi.md for background.

Tool registry

The Locara tool registry mirrors the model registry — curated, versioned, signed wasm artifacts:

registry/tools/<tool-id>.json

{
  "id": "ocr",
  "version": "1.2.0",
  "displayName": "Optical Character Recognition",
  "description": "Extract text and structure from images and PDFs.",
  "license": "Apache-2.0",
  
  "artifact": {
    "url": "https://cdn.locara.app/tools/ocr-v1.2.0.wasm",
    "sha256": "abc123...",
    "size_bytes": 2400000
  },

  "model_dependencies": [
    "glm-ocr-1.5-q8@sha256:..."
  ],

  "capability_requirements": [
    "fs.user-selected (when source is a file path)"
  ],

  "wasi_imports": {
    "fs": { "scope": "$TOOL_TEMP" },
    "stdio": "read-write"
  },

  "input_schema": {
    "type": "object",
    "properties": {
      "source": { "$ref": "#/definitions/blob_or_path" },
      "language": { "type": "string", "enum": ["en", "auto"] }
    },
    "required": ["source"]
  },

  "output_schema": {
    "type": "object",
    "properties": {
      "text": { "type": "string" },
      "blocks": { "type": "array" },
      "confidence": { "type": "number" }
    }
  },

  "validated_at": "2026-04-01",
  "validated_by": "locara-team"
}

Apps reference tools by <id>@<version>. At install time the runtime fetches the artifact and verifies its SHA-256 digest against the value in the manifest.
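
As a rough illustration of those two steps, the sketch below parses an `<id>@<version>` reference and checks an artifact's digest. `parseToolRef` and `verifyArtifact` are hypothetical names, not SDK functions, and the real check happens in the Rust runtime; this only shows the shape of the logic using Node's built-in `crypto` module.

```typescript
import { createHash } from "node:crypto";

interface ToolRef {
  id: string;
  version: string;
}

// Parse "<id>@<version>". Ids may contain dots and dashes (e.g. "code-exec.python"),
// so split on the last "@" rather than on punctuation in general.
function parseToolRef(ref: string): ToolRef {
  const at = ref.lastIndexOf("@");
  if (at <= 0 || at === ref.length - 1) {
    throw new Error(`expected <id>@<version>, got "${ref}"`);
  }
  return { id: ref.slice(0, at), version: ref.slice(at + 1) };
}

// Compare the artifact's SHA-256 (hex) with the digest recorded in the manifest.
function verifyArtifact(bytes: Uint8Array, expectedSha256: string): boolean {
  const actual = createHash("sha256").update(bytes).digest("hex");
  return actual === expectedSha256.toLowerCase();
}
```

A runtime would refuse to cache or instantiate an artifact for which `verifyArtifact` returns false.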

Built-in tool catalog (v1)

Each tool below ships in the curated registry on day 1 of phase 3:

Document & content

ocr — Extract text from images / PDFs using GLM-OCR or RapidOCR.
pdf.extract-text — Extract text layer from PDFs.
pdf.split — Split PDFs by page.
image.resize — Resize / crop / format-convert images.
image.format-convert — JPEG ↔ PNG ↔ WebP.

Filesystem

filesystem.search — Search a user-selected directory by filename or content (FTS-backed).
filesystem.read — Read a user-selected file.
filesystem.write — Write to a user-selected path.
bash.read-only — Whitelist of safe shell commands (ls, grep, find, wc).

Code execution

code-exec.python — Pyodide-based Python in wasm. No native deps; pure Python + a small standard library subset. Memory-capped, time-bounded.
code-exec.js — JS execution in a separate wasm sandbox.
code-exec.lua — Optional; small footprint scripting.

Network (capability-gated)

web.fetch — Fetch a URL. Only available if the app declares net with allowed hosts.
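
One way the host-side gate for web.fetch could look is sketched below. The `NetCapability` shape, the exact-hostname matching, and the restriction to http/https are all assumptions; the source only states that the app must declare net with allowed hosts.

```typescript
// Hypothetical shape of an app's `net` declaration.
interface NetCapability {
  allowedHosts: string[];
}

function isFetchAllowed(rawUrl: string, net: NetCapability | undefined): boolean {
  if (net === undefined) return false; // the app did not declare `net`
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  // Exact match on the parsed hostname, so "api.example.com.evil.net" does not pass.
  return net.allowedHosts.some((host) => host.toLowerCase() === url.hostname);
}
```

Matching on the parsed hostname rather than on a string prefix of the URL avoids lookalike hosts slipping through.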

LLM-derived

text.summarize — Summarize via the app’s declared LLM.
text.translate — Translate via the app’s LLM.
text.classify — Categorize text via the app’s LLM.

Audio / Video

audio.transcribe — Use the app’s STT modality to transcribe.
audio.detect-silence — Detect silence regions for chunking.

This catalog is the starting set. Adding a new tool takes a one-time PR containing the wasm artifact, a manifest entry, and a signed validation.

Sandbox boundaries

A tool’s wasm module runs in a Wasmtime store with only the WASI imports the tool’s manifest declares. By default:

| Capability | Default | Notes |
|---|---|---|
| `fs.read` | scope = `$TOOL_TEMP` only | A temp dir created per invocation |
| `fs.write` | scope = `$TOOL_TEMP` only | Wiped after invocation |
| `net` | none | Tools cannot make outbound calls unless the tool and the host app declare `net` |
| `clocks` | yes | Read-only |
| `random` | yes | Cryptographic-grade |
| `env` | none | No env vars |
| `process` | none | No subprocess |
| `stdio` | per-tool | Some tools read/write structured JSON |
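
The `$TOOL_TEMP` lifecycle in the first two rows (created per invocation, wiped afterwards) can be sketched as follows. `withToolTemp` is a hypothetical helper written against Node's filesystem API purely for illustration; the actual runtime is Rust and may manage the directory differently.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: create a fresh scratch directory for one invocation,
// hand it to the tool body, and remove it even if the body throws.
function withToolTemp<T>(toolId: string, body: (toolTemp: string) => T): T {
  const toolTemp = mkdtempSync(join(tmpdir(), `locara-${toolId}-`));
  try {
    return body(toolTemp);
  } finally {
    rmSync(toolTemp, { recursive: true, force: true });
  }
}

// Usage: the file exists during the call and is gone afterwards.
let scratchPath = "";
const existedDuringCall = withToolTemp("ocr", (dir) => {
  scratchPath = dir;
  writeFileSync(join(dir, "page-1.txt"), "extracted text");
  return existsSync(join(dir, "page-1.txt"));
});
const existsAfterCall = existsSync(scratchPath);
```

Putting the cleanup in `finally` is what keeps state from persisting across invocations when a tool fails partway through.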

For a tool to gain a capability beyond the default, two conditions must hold:

The tool declares the capability in its registry manifest.
The hosting app also holds that capability (composition rule, see 04-modalities.md).

The composition rule means a tool cannot do anything the app could not already do. Adding a tool never silently expands the app’s reach.
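
The rule amounts to an intersection: a capability beyond the defaults is granted only if the tool declares it and the app holds it. A minimal sketch, assuming capabilities are plain strings and that the default set mirrors the table above; `effectiveCapabilities` is a hypothetical name, and what the runtime does with a denied capability (for example, failing installation) is not specified here.

```typescript
type Capability = string;

interface Grant {
  granted: Capability[];
  denied: Capability[];
}

// Defaults every tool receives, per the table above (fs access is scoped to $TOOL_TEMP).
const DEFAULT_CAPS: readonly Capability[] = ["fs.read", "fs.write", "clocks", "random"];

function effectiveCapabilities(toolDeclared: Capability[], appHolds: Capability[]): Grant {
  const app = new Set<Capability>(appHolds);
  const granted = new Set<Capability>(DEFAULT_CAPS);
  const denied: Capability[] = [];
  for (const cap of toolDeclared) {
    if (granted.has(cap)) continue; // already part of the defaults
    if (app.has(cap)) {
      granted.add(cap); // declared by the tool AND held by the app
    } else {
      denied.push(cap); // the tool asked for more than the app can do
    }
  }
  return { granted: Array.from(granted), denied };
}
```

Note that capabilities the app holds but the tool never declared are not granted either, so the tool's reach is bounded from both sides.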

Resource limits

Per invocation, tools are bounded:

| Limit | Default | Configurable per tool? |
|---|---|---|
| Memory | 256 MB | Yes, up to 1 GB |
| Wall-clock time | 30 s | Yes, up to 5 min |
| Wasmtime fuel | 10⁹ units (roughly one per wasm instruction) | Yes |
| Output size | 10 MB | Yes |
| File descriptors | 32 | No |

If a tool exceeds any limit, the runtime aborts the invocation and returns a ToolError. The wasm instance is destroyed; no state persists.
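
A sketch of how per-tool overrides might be resolved against these defaults and ceilings. The `Limits` shape and `resolveLimits` are hypothetical, MB is taken as 2^20 bytes, and clamping an over-large request (rather than rejecting the manifest) is an assumption; the table only states the ceilings.

```typescript
interface Limits {
  memoryBytes: number;
  wallClockMs: number;
  fuel: number;
  outputBytes: number;
  fileDescriptors: number;
}

const MB = 1024 * 1024; // assumption: binary megabytes

const DEFAULT_LIMITS: Limits = {
  memoryBytes: 256 * MB,
  wallClockMs: 30_000,
  fuel: 1_000_000_000,
  outputBytes: 10 * MB,
  fileDescriptors: 32,
};

// Ceilings stated in the table; fuel and output size have no stated ceiling.
const MAX_MEMORY_BYTES = 1024 * MB;
const MAX_WALL_CLOCK_MS = 5 * 60_000;

function resolveLimits(overrides: Partial<Limits>): Limits {
  return {
    memoryBytes: Math.min(overrides.memoryBytes ?? DEFAULT_LIMITS.memoryBytes, MAX_MEMORY_BYTES),
    wallClockMs: Math.min(overrides.wallClockMs ?? DEFAULT_LIMITS.wallClockMs, MAX_WALL_CLOCK_MS),
    fuel: overrides.fuel ?? DEFAULT_LIMITS.fuel,
    outputBytes: overrides.outputBytes ?? DEFAULT_LIMITS.outputBytes,
    // File descriptors are not configurable, so any override is ignored.
    fileDescriptors: DEFAULT_LIMITS.fileDescriptors,
  };
}
```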

Tool invocation flow

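// Pseudocode. Wasmtime calls follow the shape of the current Rust API; AppCtx, ToolError,
// ToolState, tool_cache, schema_validate, build_linker, and call_json are placeholders.
// Assumes `engine` was configured with consume_fuel(true) and async_support(true).
use tokio::time::timeout;
use wasmtime::Store;

async fn invoke_tool(app: &AppCtx, tool_id: &str, args: Value) -> Result<Value, ToolError> {
    // 1. Verify tool is declared in app manifest
    let tool_decl = app.manifest.tooling.get(tool_id)
        .ok_or(ToolError::CapabilityDenied)?;

    // 2. Load compiled module from cache (or fetch if missing)
    let module = tool_cache.load(&tool_decl.artifact_sha)?;

    // 3. Validate args against tool input schema
    schema_validate(&args, &tool_decl.input_schema)?;

    // 4. Build Wasmtime store; its state holds the WASI context and the memory limits
    let mut store = Store::new(&engine, ToolState::new(&tool_decl.wasi_imports, app));

    // 5. Set fuel + memory limits (state.limits is a StoreLimits built from the tool's memory cap)
    store.set_fuel(tool_decl.fuel_limit)?;
    store.limiter(|state| &mut state.limits);

    // 6. Instantiate with only the imports the tool's wasi_imports allow
    let linker = build_linker(&engine, &tool_decl.wasi_imports)?;
    let instance = linker.instantiate_async(&mut store, &module).await?;

    // 7. Invoke entry point under the wall-clock timeout.
    //    The guest must yield periodically (fuel or epoch based) for the timeout to fire.
    let invoke = instance.get_func(&mut store, "invoke")
        .ok_or(ToolError::MissingEntryPoint)?;
    let result = timeout(tool_decl.timeout, call_json(&mut store, &invoke, &args))
        .await
        .map_err(|_| ToolError::Timeout)??;

    // 8. Validate result against output schema
    schema_validate(&result, &tool_decl.output_schema)?;

    Ok(result)
}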
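
Steps 3 and 8 both validate a JSON value against a schema from the tool manifest. The sketch below shows the idea for a very small subset of JSON Schema (`required`, `type`, and `enum` on top-level properties). It is an illustration only: `validateAgainst` is a hypothetical name, `$ref` resolution is left out, and a real runtime would presumably use a complete JSON Schema validator.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface PropertySchema {
  type?: "string" | "number" | "boolean" | "array" | "object";
  enum?: Json[];
}

interface ObjectSchema {
  type: "object";
  properties?: { [name: string]: PropertySchema };
  required?: string[];
}

function jsonType(value: Json): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

// Returns a list of human-readable errors; an empty list means the value is valid.
function validateAgainst(schema: ObjectSchema, value: Json): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["expected an object"];
  }
  const errors: string[] = [];
  const has = (name: string): boolean => Object.prototype.hasOwnProperty.call(value, name);
  for (const name of schema.required ?? []) {
    if (!has(name)) errors.push(`missing required property "${name}"`);
  }
  const properties = schema.properties ?? {};
  for (const name of Object.keys(properties)) {
    const rule = properties[name];
    const actual = value[name];
    if (rule === undefined || actual === undefined || !has(name)) continue;
    if (rule.type !== undefined && jsonType(actual) !== rule.type) {
      errors.push(`"${name}" must be of type ${rule.type}`);
    }
    const allowed = rule.enum;
    if (allowed !== undefined && !allowed.some((candidate) => candidate === actual)) {
      errors.push(`"${name}" must be one of ${JSON.stringify(allowed)}`);
    }
  }
  return errors;
}

// The ocr input schema from the registry example, minus "$ref" resolution.
const ocrInputSchema: ObjectSchema = {
  type: "object",
  properties: {
    source: {},
    language: { type: "string", enum: ["en", "auto"] },
  },
  required: ["source"],
};
```

Validating before instantiation (step 3) means malformed arguments never reach the sandbox, and validating after (step 8) keeps a misbehaving tool from handing the app a result of unexpected shape.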

SDK exposure

import { llm, tools } from '@locara/sdk'

// Direct call
const result = await tools.ocr({ source: pdfBlob, language: 'en' })

// LLM tool calling
const response = await llm.chat({
  model: 'qwen2.5-7b-q4',
  messages: [...],
  tools: ['ocr', 'filesystem.search']  // names must match declared tools
})
// If model wants to call a tool, response.tool_calls is populated
// SDK auto-loops or app handles the tool-calling protocol
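
For apps that handle the tool-calling protocol themselves, the loop might look roughly like the sketch below. The message and response shapes, `runToolLoop`, and the round limit are assumptions for illustration and are not the SDK's actual types; `chat` stands in for `llm.chat` and the map stands in for the declared tools.

```typescript
interface ToolCall {
  name: string;
  args: unknown;
}

interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface ChatResponse {
  content: string;
  tool_calls?: ToolCall[];
}

type Chat = (messages: ChatMessage[]) => Promise<ChatResponse>;
type ToolFn = (args: unknown) => Promise<unknown>;

// Call the model, run any tools it requests, feed the results back,
// and repeat until the model answers without requesting a tool.
async function runToolLoop(
  chat: Chat,
  declaredTools: Map<string, ToolFn>,
  messages: ChatMessage[],
  maxRounds: number = 5,
): Promise<string> {
  const history = messages.slice();
  for (let round = 0; round < maxRounds; round++) {
    const response = await chat(history);
    const calls = response.tool_calls ?? [];
    if (calls.length === 0) return response.content; // final answer
    history.push({ role: "assistant", content: response.content });
    for (const call of calls) {
      const tool = declaredTools.get(call.name);
      if (tool === undefined) {
        throw new Error(`model requested undeclared tool "${call.name}"`);
      }
      const result = await tool(call.args);
      history.push({ role: "tool", content: JSON.stringify({ name: call.name, result }) });
    }
  }
  throw new Error(`no final answer after ${maxRounds} tool rounds`);
}
```

The round limit guards against a model that keeps requesting tools indefinitely, and looking tools up in a map of declared tools keeps the model from invoking anything the app did not list.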

Tools as wasm components (forward-looking)

Tools are currently specified as core wasm modules plus WASI imports. As the WebAssembly Component Model matures, tools will migrate to components, which offer:

Strongly typed inputs/outputs at the wasm level.
Cross-language: tools compiled from Rust, Go, C, AssemblyScript all interop.
Composition: tools can be chained without the host having to re-validate types.

v1 ships core wasm; Component Model migration is a v2 task that doesn’t break existing tools.

Custom tools (app-specific)

Apps can ship their own wasm tools alongside the curated registry:

my-app/
├── locara.json
└── tools/
    ├── pdf-classifier.wasm
    └── pdf-classifier.locara-tool.json

"tooling": [
  "ocr",
  {
    "name": "pdf-classifier",
    "path": "./tools/pdf-classifier.wasm",
    "manifest": "./tools/pdf-classifier.locara-tool.json",
    "signature": "..."
  }
]

The tool manifest follows the same shape as registry tools (input_schema, output_schema, wasi_imports). Custom tools:

Must be signed by the publisher.
Cannot exceed app capabilities (composition rule).
Are reviewed alongside the app at submission.
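
The source does not say which signature scheme the "signature" field uses. As a sketch under the assumption of a detached Ed25519 signature over the raw wasm bytes, encoded as base64, signing and verification with Node's built-in `crypto` module could look like this (`signTool` and `verifyTool` are hypothetical names):

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Assumption: Ed25519 detached signature over the wasm bytes, base64-encoded.
function signTool(wasm: Uint8Array, privateKey: KeyObject): string {
  // Ed25519 takes no separate digest algorithm, hence `null`.
  return sign(null, wasm, privateKey).toString("base64");
}

function verifyTool(wasm: Uint8Array, signatureB64: string, publicKey: KeyObject): boolean {
  return verify(null, wasm, publicKey, Buffer.from(signatureB64, "base64"));
}

// Usage with a throwaway key pair; a real publisher key would come from the publisher's signing setup.
const publisherKeys = generateKeyPairSync("ed25519");
const wasmBytes = Buffer.from([0x00, 0x61, 0x73, 0x6d]); // "\0asm" magic, standing in for a tool binary
const toolSignature = signTool(wasmBytes, publisherKeys.privateKey);
```

Any change to the wasm bytes after signing makes `verifyTool` return false, which is what lets review cover the exact artifact the app ships.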

Why not Docker / subprocess / native code?

| Sandbox | Pro | Con | v1 verdict |
|---|---|---|---|
| Wasmtime | Cross-platform, fast cold start, capability-based, no native deps | Can’t run native libs (no `pip install pandas` with C extensions) | Default |
| Subprocess | Run native code, full Python ecosystem | Weak isolation, hard to limit, security holes | Opt-in only with explicit capability flag |
| Docker | Strong isolation | Heavy dependency, slow startup, not assumed installed | Not v1 |
| macOS containerization | Strong isolation, Apple-supported | Mac-only, complex API | Future research |

Subprocess execution is opt-in via a tool.exec.subprocess capability that triggers strong warnings during review. Most apps never need it.

Open questions

(open) Should LLM-derived tools (text.summarize, etc.) be tools at all, or just SDK functions? Treating them as tools lets LLMs call them in agent loops. Treating them as SDK functions is simpler. Probably both — expose as both, document the duality.
(open) Tool versioning when called by LLM — does the LLM see only the tool name, or name + version? Leaning name only at call time, with version pinned in lockfile.
(open) How do we handle tool deprecation? When ocr@1.2 is replaced by ocr@2.0 (breaking changes), how do existing apps migrate? Probably: support both indefinitely, mark @1.x as deprecated in registry, surface in locara doctor.

Cross-references

Modality + tooling declarations: 04-modalities.md
SDK tools module: 05-sdk.md
Capability composition rule: 03-capabilities.md
Models that some tools depend on: 09-models.md
Wasmtime + WASI background: ../notes/wasmtime-wasi.md