10 — Tools (Sandboxed Tool Execution)
Tools are callable actions the LLM or app code can invoke to do work in the world: OCR a PDF, search the filesystem, run Python, transform an image. Locara treats tooling as first-class — see 04-modalities.md for the developer-facing declaration model.
This document covers the runtime that executes tools.
Architecture
App declares tooling: ["ocr", "code-exec.python"]
↓
Locara fetches signed wasm tools into shared cache
↓
At call time:
1. App invokes tools.ocr(...) via SDK
2. SDK forwards to Rust runtime (capability-checked)
3. Runtime spins up Wasmtime instance
4. Imports configured per tool's declared WASI capabilities
5. Tool runs to completion or abort
6. Result returned to app
Why Wasmtime + WASI
- Capability-based by design. A wasm module has no direct syscall access; it can only call the functions the host explicitly provides as imports.
- Fast cold start. Instantiating a precompiled module takes tens of microseconds, versus hundreds of milliseconds to start a container.
- Cross-platform. One binary runs on any Wasmtime host.
- Standard. WASI is an open, versioned specification (Preview 1, then 0.2) with growing adoption.
- Memory-isolated. Tool state cannot leak into the host process.
- Resource-limited. Fuel-based execution caps prevent infinite loops; memory cap prevents allocation bombs.
See ../notes/wasmtime-wasi.md for background.
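To make the fuel idea above concrete, here is a toy sketch of fuel metering. It does not use Wasmtime; `Outcome` and `run_with_fuel` are hypothetical names, and the one-unit-per-step cost is an assumption chosen for illustration. It only shows why a fixed budget turns an infinite loop into a bounded abort.

```rust
/// Outcome of running a guest under a fuel budget (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    Finished { steps: u64 },
    OutOfFuel,
}

/// Toy fuel meter: every guest step costs one unit of fuel.
/// `guest_step` returns `true` while the guest still has work to do.
fn run_with_fuel(mut fuel: u64, mut guest_step: impl FnMut() -> bool) -> Outcome {
    let mut steps = 0u64;
    loop {
        // Abort before executing a step the budget cannot pay for.
        if fuel == 0 {
            return Outcome::OutOfFuel;
        }
        fuel -= 1;
        steps += 1;
        if !guest_step() {
            return Outcome::Finished { steps };
        }
    }
}
```

A guest that never returns `false` is stopped once the budget is spent, which is the property the runtime relies on. Fuel does not bound time spent blocked in host calls, so a wall-clock limit is still needed alongside it.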
Tool registry
The Locara tool registry mirrors the model registry — curated, versioned, signed wasm artifacts:
registry/tools/<tool-id>.json
{
  "id": "ocr",
  "version": "1.2.0",
  "displayName": "Optical Character Recognition",
  "description": "Extract text and structure from images and PDFs.",
  "license": "Apache-2.0",
  "artifact": {
    "url": "https://cdn.locara.app/tools/ocr-v1.2.0.wasm",
    "sha256": "abc123...",
    "size_bytes": 2400000
  },
  "model_dependencies": [
    "glm-ocr-1.5-q8@sha256:..."
  ],
  "capability_requirements": [
    "fs.user-selected (when source is a file path)"
  ],
  "wasi_imports": {
    "fs": { "scope": "$TOOL_TEMP" },
    "stdio": "read-write"
  },
  "input_schema": {
    "type": "object",
    "properties": {
      "source": { "$ref": "#/definitions/blob_or_path" },
      "language": { "type": "string", "enum": ["en", "auto"] }
    },
    "required": ["source"]
  },
  "output_schema": {
    "type": "object",
    "properties": {
      "text": { "type": "string" },
      "blocks": { "type": "array" },
      "confidence": { "type": "number" }
    }
  },
  "validated_at": "2026-04-01",
  "validated_by": "locara-team"
}
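Apps reference tools by <id>@<version>. The runtime fetches the artifact and verifies its SHA-256 digest against the manifest at install time. In the example above, the fs scope of $TOOL_TEMP means the tool can only write to a temp dir.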
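As an illustration of the `<id>@<version>` reference format, a minimal parser might look like the sketch below. `ToolRef` and `parse_tool_ref` are hypothetical names, and rejecting a bare id without a version is an assumption; the real runtime may resolve unversioned names some other way.

```rust
/// A parsed tool reference such as "ocr@1.2.0" (hypothetical type).
#[derive(Debug, PartialEq)]
struct ToolRef {
    id: String,
    version: String,
}

/// Split "<id>@<version>" at the first '@' and reject empty parts.
fn parse_tool_ref(s: &str) -> Result<ToolRef, String> {
    let (id, version) = s
        .split_once('@')
        .ok_or_else(|| format!("missing '@<version>' in tool reference: {s}"))?;
    if id.is_empty() || version.is_empty() {
        return Err(format!("empty id or version in tool reference: {s}"));
    }
    Ok(ToolRef {
        id: id.to_string(),
        version: version.to_string(),
    })
}
```

For example, `parse_tool_ref("ocr@1.2.0")` yields the id `ocr` and the version `1.2.0`, while `parse_tool_ref("ocr")` is an error under this sketch's strict reading.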
Built-in tool catalog (v1)
Each tool below ships in the curated registry on day 1 of phase 3:
Document & content
- ocr: Extract text from images / PDFs using GLM-OCR or RapidOCR.
- pdf.extract-text: Extract text layer from PDFs.
- pdf.split: Split PDFs by page.
- image.resize: Resize / crop / format-convert images.
- image.format-convert: JPEG ↔ PNG ↔ WebP.
Filesystem
- filesystem.search: Search a user-selected directory by filename or content (FTS-backed).
- filesystem.read: Read a user-selected file.
- filesystem.write: Write to a user-selected path.
- bash.read-only: Whitelist of safe shell commands (ls, grep, find, wc).
Code execution
- code-exec.python: Pyodide-based Python in wasm. No native deps; pure Python + a small standard library subset. Memory-capped, time-bounded.
- code-exec.js: JS execution in a separate wasm sandbox.
- code-exec.lua: Optional; small footprint scripting.
Network (capability-gated)
- web.fetch: Fetch a URL. Only available if the app declares net with allowed hosts.
LLM-derived
- text.summarize: Summarize via the app’s declared LLM.
- text.translate: Translate via the app’s LLM.
- text.classify: Categorize text via the app’s LLM.
Audio / Video
- audio.transcribe: Use the app’s STT modality to transcribe.
- audio.detect-silence: Detect silence regions for chunking.
This catalog is the starting set. Adding a new tool takes a single PR containing the wasm artifact, the manifest entry, and a signed validation.
Sandbox boundaries
A tool’s wasm module runs in a Wasmtime store with only the WASI imports the tool’s manifest declares. By default:
| Capability | Default | Notes |
|---|---|---|
| fs.read | scope = $TOOL_TEMP only | A temp dir created per invocation |
| fs.write | scope = $TOOL_TEMP only | Wiped after invocation |
| net | none | Tools cannot make outbound calls unless the tool and the host app declare net |
| clocks | yes | Read-only |
| random | yes | Cryptographic-grade |
| env | none | No env vars |
| process | none | No subprocess |
| stdio | per-tool | Some tools read/write structured JSON |
For a tool to gain a capability beyond the default, two conditions must hold:
- The tool declares the capability in its registry manifest.
- The hosting app also holds that capability (composition rule, see 04-modalities.md).
The composition rule means a tool cannot do anything the app couldn’t already do. Adding a tool never silently expands the app’s reach.
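One way to read the composition rule is as a set intersection: a tool ends up with only the capabilities that both it and the app declare. The sketch below assumes capabilities are plain strings; `effective_capabilities` and `denied_capabilities` are hypothetical helper names, not part of the runtime.

```rust
use std::collections::BTreeSet;

/// Capabilities a tool actually receives: declared by the tool AND held by the app.
fn effective_capabilities<'a>(
    tool_declared: &BTreeSet<&'a str>,
    app_granted: &BTreeSet<&'a str>,
) -> BTreeSet<&'a str> {
    tool_declared.intersection(app_granted).copied().collect()
}

/// Capabilities the tool asks for that the app lacks (useful for review diagnostics).
fn denied_capabilities<'a>(
    tool_declared: &BTreeSet<&'a str>,
    app_granted: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    tool_declared.difference(app_granted).copied().collect()
}
```

With a tool that declares `net` and `fs.user-selected` inside an app that only holds `fs.user-selected`, the effective set contains `fs.user-selected` alone and `net` is reported as denied. Whether a denied capability should fail installation or just be withheld is left open here.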
Resource limits
Per invocation, tools are bounded:
| Limit | Default | Configurable per tool? |
|---|---|---|
| Memory | 256 MB | Yes, up to 1 GB |
| Wall-clock time | 30 s | Yes, up to 5 min |
| Wasmtime fuel | 10⁹ units (roughly one per wasm instruction) | Yes |
| Output size | 10 MB | Yes |
| File descriptors | 32 | No |
If a tool exceeds any limit, the runtime aborts the invocation and returns a ToolError. The wasm instance is destroyed; no state persists.
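The table above can be sketched as a small resolution step that merges per-tool overrides with the defaults. Everything named here (`Limits`, `Overrides`, `resolve_limits`) is hypothetical. The sketch assumes MB means MiB and that an override above the ceiling is rejected instead of clamped. The table gives no ceilings for fuel or output size, so none are enforced.

```rust
const MIB: u64 = 1024 * 1024;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Limits {
    memory_bytes: u64,
    wall_clock_secs: u64,
    fuel: u64,
    output_bytes: u64,
    file_descriptors: u32,
}

/// Defaults taken from the table above.
const DEFAULTS: Limits = Limits {
    memory_bytes: 256 * MIB,
    wall_clock_secs: 30,
    fuel: 1_000_000_000,
    output_bytes: 10 * MIB,
    file_descriptors: 32,
};

const MAX_MEMORY_BYTES: u64 = 1024 * MIB; // "up to 1 GB"
const MAX_WALL_CLOCK_SECS: u64 = 5 * 60; // "up to 5 min"

/// Per-tool overrides; file descriptors are deliberately absent (not configurable).
#[derive(Debug, Default, Clone, Copy)]
struct Overrides {
    memory_bytes: Option<u64>,
    wall_clock_secs: Option<u64>,
    fuel: Option<u64>,
    output_bytes: Option<u64>,
}

fn resolve_limits(overrides: &Overrides) -> Result<Limits, String> {
    let memory_bytes = overrides.memory_bytes.unwrap_or(DEFAULTS.memory_bytes);
    if memory_bytes > MAX_MEMORY_BYTES {
        return Err(format!("memory limit {memory_bytes} exceeds the 1 GB ceiling"));
    }
    let wall_clock_secs = overrides.wall_clock_secs.unwrap_or(DEFAULTS.wall_clock_secs);
    if wall_clock_secs > MAX_WALL_CLOCK_SECS {
        return Err(format!("timeout {wall_clock_secs}s exceeds the 5 min ceiling"));
    }
    Ok(Limits {
        memory_bytes,
        wall_clock_secs,
        fuel: overrides.fuel.unwrap_or(DEFAULTS.fuel),
        output_bytes: overrides.output_bytes.unwrap_or(DEFAULTS.output_bytes),
        file_descriptors: DEFAULTS.file_descriptors,
    })
}
```

Resolving once at install time, instead of on every call, would let a manifest with an out-of-range limit fail early; that placement is a design choice this sketch does not settle.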
Tool invocation flow
// Pseudocode. AppCtx, tool_cache, schema_validate, build_linker, ToolState, and
// call_json are illustrative helpers, not real APIs; timeout is tokio::time::timeout.
async fn invoke_tool(app: &AppCtx, tool_id: &str, args: Value) -> Result<Value> {
    // 1. Verify the tool is declared in the app manifest
    let tool_decl = app.manifest.tooling.get(tool_id)
        .ok_or(CapabilityDenied)?;
    // 2. Load the compiled module from cache (or fetch if missing)
    let module = tool_cache.load(&tool_decl.artifact_sha)?;
    // 3. Validate args against the tool's input schema
    schema_validate(&args, &tool_decl.input_schema)?;
    // 4. Build a Wasmtime store, with imports based on the tool's wasi_imports
    let mut store = Store::new(&engine, ToolState::new(tool_decl));
    let linker = build_linker(&engine, &tool_decl.wasi_imports, app)?;
    // 5. Set fuel + memory limits
    store.set_fuel(tool_decl.fuel_limit)?;
    store.limiter(|state| &mut state.limits);
    // 6. Instantiate
    let instance = linker.instantiate_async(&mut store, &module).await?;
    // 7. Invoke the entry point, bounded by the wall-clock timeout
    let invoke = instance.get_func(&mut store, "invoke")
        .ok_or(MissingEntryPoint)?;
    let result = timeout(tool_decl.timeout, call_json(&mut store, invoke, args))
        .await??;
    // 8. Validate the result against the output schema
    schema_validate(&result, &tool_decl.output_schema)?;
    Ok(result)
}
SDK exposure
import { tools } from '@locara/sdk'
// Direct call
const result = await tools.ocr({ source: pdfBlob, language: 'en' })
// LLM tool calling
const response = await llm.chat({
model: 'qwen2.5-7b-q4',
messages: [...],
tools: ['ocr', 'filesystem.search'] // names must match declared tools
})
// If model wants to call a tool, response.tool_calls is populated
// SDK auto-loops or app handles the tool-calling protocol
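The auto-loop mentioned above could work roughly as follows. This is a language-neutral sketch written in Rust to match the runtime code, not the SDK's actual implementation: `ModelTurn`, `run_tool_loop`, the string-based transcript, and the `max_steps` cap are all assumptions made for illustration.

```rust
/// What the model produced on one turn (hypothetical shape).
enum ModelTurn {
    Final(String),
    ToolCall { name: String, args: String },
}

/// Drive the model until it returns a final answer, executing requested tools in between.
/// Only tools named in `declared` may run; `max_steps` bounds the loop.
fn run_tool_loop(
    declared: &[&str],
    max_steps: usize,
    mut model: impl FnMut(&[String]) -> ModelTurn,
    mut call_tool: impl FnMut(&str, &str) -> Result<String, String>,
) -> Result<String, String> {
    let mut transcript: Vec<String> = Vec::new();
    for _ in 0..max_steps {
        match model(&transcript) {
            ModelTurn::Final(text) => return Ok(text),
            ModelTurn::ToolCall { name, args } => {
                // Names must match declared tools; anything else is refused.
                if !declared.iter().any(|d| *d == name.as_str()) {
                    return Err(format!("tool not declared: {name}"));
                }
                let output = call_tool(&name, &args)?;
                // Feed the tool result back so the next turn can use it.
                transcript.push(format!("{name} -> {output}"));
            }
        }
    }
    Err(format!("no final answer after {max_steps} steps"))
}
```

The step cap keeps a model that calls tools indefinitely from running forever. An app that handles the protocol itself would replace this loop with its own logic.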
Tools as wasm components (forward-looking)
Tools are currently specified as core wasm modules plus WASI. As the WebAssembly Component Model matures, tools will migrate to components:
- Strongly typed inputs/outputs at the wasm level.
- Cross-language: tools compiled from Rust, Go, C, AssemblyScript all interop.
- Composition: tools can be chained without the host having to re-validate types.
v1 ships core wasm; Component Model migration is a v2 task and must not break existing tools.
Custom tools (app-specific)
Apps can ship their own wasm tools alongside the curated registry:
my-app/
├── locara.json
└── tools/
    ├── pdf-classifier.wasm
    └── pdf-classifier.locara-tool.json

"tooling": [
  "ocr",
  {
    "name": "pdf-classifier",
    "path": "./tools/pdf-classifier.wasm",
    "manifest": "./tools/pdf-classifier.locara-tool.json",
    "signature": "..."
  }
]
The tool manifest follows the same shape as registry tools (input_schema, output_schema, wasi_imports). Custom tools:
- Must be signed by the publisher.
- Cannot exceed app capabilities (composition rule).
- Are reviewed alongside the app at submission.
Why not Docker / subprocess / native code?
| Sandbox | Pro | Con | v1 verdict |
|---|---|---|---|
| Wasmtime | Cross-platform, fast cold start, capability-based, no native deps | Can’t run native libs (no pip install pandas with C extensions) | Default |
| Subprocess | Run native code, full Python ecosystem | Weak isolation, hard to limit, security holes | Opt-in only with explicit capability flag |
| Docker | Strong isolation | Heavy dependency, slow startup, not assumed installed | Not v1 |
| macOS containerization | Strong isolation, Apple-supported | Mac-only, complex API | Future research |
Subprocess execution is opt-in via a tool.exec.subprocess capability that triggers strong warnings during review. Most apps never need it.
Open questions
- (open) Should LLM-derived tools (text.summarize, etc.) be tools at all, or just SDK functions? Treating them as tools lets LLMs call them in agent loops. Treating them as SDK functions is simpler. Probably both: expose as both, document the duality.
- (open) Tool versioning when called by LLM: does the LLM see only the tool name, or name + version? Leaning name only at call time, with version pinned in lockfile.
- (open) How do we handle tool deprecation? When ocr@1.2 is replaced by ocr@2.0 (breaking changes), how do existing apps migrate? Probably: support both indefinitely, mark @1.x as deprecated in registry, surface in locara doctor.
Cross-references
- Modality + tooling declarations: 04-modalities.md
- SDK tools module: 05-sdk.md
- Capability composition rule: 03-capabilities.md
- Models that some tools depend on: 09-models.md
- Wasmtime + WASI background: ../notes/wasmtime-wasi.md