08 — Storage
Overview
Each Locara app has a per-app SQLite database for relational data and a per-app vector store (sqlite-vec or zvec) for embeddings. Storage is isolated by container — apps cannot read each other’s data.
Storage hierarchy
~/Library/Containers/<app-id>/Data/Documents/
├── app.sqlite # Main app database
├── app.sqlite-wal # Write-ahead log
├── app.sqlite-shm # Shared memory file
└── (additional DBs if app declares them)
The app sees this as $APP_DATA via the SDK. Apps don’t manage paths directly; they use db.run, db.all, etc.
SQLite as default (committed)
Locara uses SQLite for all relational storage:
- Single file per app — trivial backups, exports, deletes.
- Zero-ops — no daemon, no setup.
- WAL mode by default — concurrent reads + single writer; good throughput.
- FTS5 enabled — full-text search built in.
- JSON1 enabled — structured data without separate JSON DB.
- Loadable extensions enabled (for sqlite-vec, zvec adapter).
Why not Postgres / DuckDB / etc.: see ../notes/sqlite.md. Short answer: long-term stability + zero-ops + single-file > additional power for v1’s app shapes.
Vector storage
Two backends:
sqlite-vec (default)
- Lives inside the same SQLite file as relational data.
- Single source of truth, single backup, single export.
- Capable up to ~1M vectors, hundreds of QPS.
- Apache 2.0.
- Good enough for ~95% of local-app use cases.
zvec (opt-in)
- Separate in-process library; embedded but not inside SQLite.
- Sparse + dense + hybrid search.
- Better for >1M vectors or sophisticated hybrid search.
- Apache 2.0.
Apps declare which engine they want:
"storage": {
"vector": { "engine": "sqlite-vec" }
}
The SDK abstracts both behind db.vec.* and db.search.hybrid. Apps that switch engines change one config line; SDK calls don’t change. (Some advanced features may be engine-specific; surfaced via runtime warnings.)
Schema declaration
Each app’s schema lives in db/schema.sql:
-- db/schema.sql
CREATE TABLE transcripts (
id TEXT PRIMARY KEY,
title TEXT NOT NULL,
text TEXT NOT NULL,
duration INTEGER NOT NULL,
created_at INTEGER NOT NULL DEFAULT (unixepoch())
);
CREATE INDEX idx_transcripts_created ON transcripts (created_at DESC);
-- FTS5 virtual table (auto-synced via triggers)
CREATE VIRTUAL TABLE transcripts_fts USING fts5(
title, text,
content='transcripts',
content_rowid='rowid'
);
-- Vector table (sqlite-vec)
CREATE VIRTUAL TABLE transcripts_vec USING vec0(
embedding FLOAT[768]
);
-- Triggers to keep fts/vec in sync
CREATE TRIGGER transcripts_ai AFTER INSERT ON transcripts BEGIN
INSERT INTO transcripts_fts(rowid, title, text) VALUES (new.rowid, new.title, new.text);
END;
-- (similar AU, AD triggers)
The CLI parses this schema and:
- Generates TS types in
node_modules/@locara/sdk/db.d.ts. - Computes diffs between published versions for migrations.
Migrations
Each app version’s schema is the source of truth. Migrations between versions are derived.
db/
├── schema.sql # Current schema (the truth)
└── migrations/
├── 0001_initial.sql # Auto-generated by `locara db migrate`
├── 0002_add_user_table.sql # Generated; can be hand-edited
└── 0003_add_embedding_idx.sql
Workflow:
# Edit schema.sql, e.g., add a new column
# Then:
locara db migrate
# CLI computes diff: schema.sql vs last applied migration
# Outputs proposed migration SQL
# Developer reviews, optionally edits, commits
For non-trivial changes (column type changes, data backfills), the developer hand-writes the migration. CLI helps with the obvious cases (add column, add table, add index).
On user app update:
1. Runtime detects new app version's schema.sql differs from current DB schema.
2. Runtime applies pending migrations in order, in a transaction.
3. If migration fails: roll back, refuse update, surface error.
4. Backups: before each migration, snapshot DB to .backup-<version>.sqlite (kept for 7 days).
Migration safety is non-negotiable. A broken migration that destroys user data is the fastest way to kill trust.
Transactions
import { db } from '@locara/sdk'
await db.transaction(async (tx) => {
await tx.run`INSERT INTO transcripts (id, title) VALUES (${id}, ${title})`
await tx.run`INSERT INTO transcripts_fts (rowid, title) VALUES (last_insert_rowid(), ${title})`
})
Auto-commit on resolved promise; rollback on rejection.
Concurrency
WAL mode permits one writer + many readers. The SDK serializes writes through a single mutex per database. For most apps this is fine — local single-user apps have low write contention.
Heavy-write workloads: consider partitioning into multiple databases (e.g., per-project DBs) declared in manifest.
Per-app isolation
Apps cannot access each other’s databases. Each container has its own Data/Documents/ and SQLite files. The macOS App Sandbox enforces this at the kernel level.
If two apps need to share data, they declare an ipc capability with each other; the runtime mediates. (See 03-capabilities.md.)
Backups
Apps can opt into Locara-mediated backups:
"storage": {
"backups": {
"enabled": true,
"frequency": "daily",
"retention_days": 30
}
}
Backups go to ~/Library/Application Support/Locara/Backups/<app-id>/<timestamp>.sqlite. Apps can prompt the user to restore from a backup via SDK (db.restore.list, db.restore.apply).
Cloud backup (iCloud Drive, etc.) is not v1. Apps can’t ask for it because the privacy story complicates.
Encryption
(open) SQLite databases are unencrypted by default. macOS FileVault encrypts the disk; container is per-app-isolated. Is per-app DB encryption (SQLCipher etc.) worth the complexity?
Leaning no for v1: rely on FileVault + container isolation. Apps with extreme privacy needs can opt into SQLCipher (declare in manifest, runtime supports). Cost: SQLCipher is GPL/proprietary, complicates build.
Common patterns
RAG / hybrid search
const queryEmbedding = await embed.one({ model: 'nomic-embed-text-v1.5', text: userQuery })
const results = await db.search.hybrid('transcripts', {
query: userQuery,
embedding: queryEmbedding,
weights: { keyword: 0.3, vector: 0.7 },
k: 10,
})
The runtime translates this into a query joining transcripts_fts (FTS5) and transcripts_vec (sqlite-vec) with rank fusion.
JSON columns
CREATE TABLE settings (
id INTEGER PRIMARY KEY,
data TEXT -- JSON
);
const settings = await db.one<{ data: string }>`SELECT data FROM settings WHERE id = 1`
const parsed = JSON.parse(settings.data)
For typed JSON, use Zod or similar in app code; SDK doesn’t try to add a JSON-typing layer.
What’s not in v1
- Distributed / multi-device sync
- Cloud backups
- Real-time replication
- Multi-tenant within a single app
- DuckDB or analytical engines
These are post-v1 considerations. The privacy thesis benefits from staying single-device.
Cross-references
- SDK API: 05-sdk.md
- Capability model (storage capabilities): 03-capabilities.md
- SQLite design philosophy:
../notes/sqlite.md - Vector backend choices:
../notes/huggingface-hub.md(mentioned)