Security
Threat model and accepted risks for OptimalVault and the surrounding Fabric auth surface. Synthesizes the Phase 10a-6 audit at ~/.optimalos/transfers/fabric-design/06-vault-auth-threat-rerun.md (511 lines, code-cited per finding).
The audit was a code-read review against clenis @ caa0b34, not a penetration test or fuzz harness. All findings cite files on disk at the audited commit.
Posture summary
The shipped design substantially implements the original threat model: ciphertext-only cloud, multi-recipient age routing, BIP39 recovery, soft-revoke plus eager re-wrap, scoped JWTs, single-use pairing tokens, mode-0600 device keys, browser-side passphrase plus WebAuthn key derivation.
| Severity | Count | Status |
|---|---|---|
| P0 | 5 | All cleared in Phase 10a-7 (CSP middleware, device-revoke cross-check, WebAuthn canary, secrets sweep, logout JWT-revoke). Vault ceremony walked end-to-end on iPad Safari on 2026-05-04. |
| P1 | 8 | 0 closed today. None are in the kanban project yet. |
| P2 | 10 | 1 in the kanban; remainder tracked here only. |
What "P0 cleared" means in practice: the design is approved for Carlos's personal Pi self-host today, and approved for invite-only beta with named users (10 or fewer) once the corresponding tests are added. Public hosted mode remains explicitly NOT approved until the full Phase 13b plus 14 plus external pen-test sequence completes (see §7 of the audit).
Cleared P0s (Phase 10a-7)
| # | Finding | Fix shipped at |
|---|---|---|
| 1 | No CSP header on /vault/* HTML routes | src/server.ts middleware mounts default-src 'self'; script-src 'self' 'wasm-unsafe-eval'; style-src 'self' 'unsafe-inline'; connect-src 'self'; frame-ancestors 'none'; base-uri 'self'. Tests: tests/server/csp-headers.test.ts (7 cases). |
| 2 | authMiddleware did not cross-check vault_recipients.revoked_at for kind=device JWTs | src/auth/session.ts now does the lookup with a 60s in-memory cache plus an invalidation hook. Tests: tests/auth/device-revoke.test.ts (5 cases). |
| 3 | First-unlock pubkey-mismatch silently corrupted state for non-deterministic authenticators | Canary blob round-trip on vault_recipients.canary_ciphertext (migration 20260503235721_fabric_vault_canary_phase_10a_7.sql). Tests: tests/vault/canary.test.ts (4 cases). |
| 4 | No automated check that test fixtures or .env files were free of real secrets | lint:secrets script plus a CI gate. Tests: tests/server/lint-secrets.test.ts (2 cases). |
| 5 | localStorage Bearer-token replay risk paired with no logout endpoint | POST /api/auth/logout plus AuthStore.revokeJti / isJtiRevoked consulted on every Fabric-JWT verify. Tests: tests/auth/logout-flow.test.ts (7 cases). |
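
A minimal sketch of how the policy from P0 #1 might be assembled and scoped to /vault/* routes. The directive values are copied from the table above; the names `VAULT_CSP_DIRECTIVES`, `buildCspHeader`, and `needsVaultCsp` are hypothetical and are not taken from src/server.ts.

```typescript
// Directive values as listed in the P0 #1 row; the structure is illustrative only.
const VAULT_CSP_DIRECTIVES: ReadonlyArray<readonly [string, string]> = [
  ["default-src", "'self'"],
  ["script-src", "'self' 'wasm-unsafe-eval'"],
  ["style-src", "'self' 'unsafe-inline'"],
  ["connect-src", "'self'"],
  ["frame-ancestors", "'none'"],
  ["base-uri", "'self'"],
];

// Serialize the directives into one Content-Security-Policy header value.
function buildCspHeader(
  directives: ReadonlyArray<readonly [string, string]>,
): string {
  return directives.map(([name, value]) => `${name} ${value}`).join("; ");
}

// Hypothetical route predicate: true for paths under /vault/.
function needsVaultCsp(pathname: string): boolean {
  return pathname.startsWith("/vault/");
}

const vaultCspHeader = buildCspHeader(VAULT_CSP_DIRECTIVES);
```

Keeping the directives as data rather than one long string makes it easier for a test to assert on individual directives, which is presumably what a suite like tests/server/csp-headers.test.ts would want.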
Outstanding P1s (must close before Phase 11 harness catalog ships)
| ID | Finding | Why it matters | Remediation |
|---|---|---|---|
| T2 | RLS absent (single-tenant only) | Fine for personal self-host; "two friends share an instance" becomes a configuration accident at hosted-beta scale | Phase 14 |
| T4 | Device-JWT revocation cross-check on the daemon side | Cloud soft-revokes a recipient, but the device's 30-day JWT keeps passing daemon-side validation. Stale JWT remains usable for session:claim and telemetry:write for the remainder of its TTL | Daemon-side SELECT revoked_at cross-check, cached short TTL |
| T5b | localStorage trust marker | Trusted-device unlocks read deviceBindingB64 from localStorage. CSP plus canary mitigate XSS, but the second factor still effectively lives in the DOM | Promote to WebAuthn PRF extension or a non-extractable WebCrypto AES-GCM key in IndexedDB |
| T6b | Lock-file SRI pinning | Vendored JS deps are bundle-served by Vite (good) but no Subresource-Integrity check exists | Pin pnpm-lock.yaml plus pnpm audit in CI |
| T7 | Cloud TLS pubkey pinning on device fetch | Device daemon fetch calls trust the system CA store. CA compromise plus MITM yields the device JWT | Persist cloud TLS pubkey SHA-256 at pairing in ~/.config/optimalos/keys/cloud-pin.txt; verify on every fetch |
| T8 | Postgres RPC for atomic re-wrap | Today's re-wrap is multi-statement on Supabase; mid-batch network failure leaves a half-state. Self-heals via dirty-detect but the gap is visible | vault_rewrap_batch(p_items jsonb) plpgsql function in a single transaction |
| T11 | Argon2 salt rotation policy | Per-install salt is set; no documented rotation schedule | Document or persist a rotation timestamp |
| T13 | Access-log payload validation | x-session-id header accepted as-is, no UUID check, no binding to verified subject | isUuid() validator (already exists) plus bind to verified.sub |
Two of these (T4 and T7) are explicitly carried over to the Hetzner handoff as the highest-priority "left to test" items.
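
The T4 remediation (a daemon-side `revoked_at` cross-check behind a short-TTL cache) could look roughly like the sketch below. It is an illustration under assumptions, not the shipped code: `RevocationCache`, `isDeviceRevoked`, and the `RevokedAtLookup` signature are hypothetical, and the actual TTL is left to the caller because the audit only says "short".

```typescript
interface CacheEntry {
  revoked: boolean;
  expiresAt: number; // epoch milliseconds
}

// Small TTL cache so the daemon does not query vault_recipients on every request.
class RevocationCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached verdict, or undefined on a miss or an expired entry.
  get(recipientId: string): boolean | undefined {
    const entry = this.entries.get(recipientId);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(recipientId);
      return undefined;
    }
    return entry.revoked;
  }

  set(recipientId: string, revoked: boolean): void {
    this.entries.set(recipientId, { revoked, expiresAt: this.now() + this.ttlMs });
  }

  // Invalidation hook: call when a revoke event is observed.
  invalidate(recipientId: string): void {
    this.entries.delete(recipientId);
  }
}

// Hypothetical lookup: resolves to the row's revoked_at value, or null if active.
type RevokedAtLookup = (recipientId: string) => Promise<string | null>;

async function isDeviceRevoked(
  cache: RevocationCache,
  recipientId: string,
  lookup: RevokedAtLookup,
): Promise<boolean> {
  const cached = cache.get(recipientId);
  if (cached !== undefined) return cached;
  const revoked = (await lookup(recipientId)) !== null;
  cache.set(recipientId, revoked);
  return revoked;
}
```

With this shape, a revoked device JWT stays usable for at most one TTL after the cloud-side soft-revoke, instead of the remainder of its 30-day lifetime.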
Tracked P2s
The full P2 list lives in the audit at §3 and §4.3. Highlights:
- Recovery-phrase DOM and clipboard zeroize (§3-A, §3-B). Words live in DOM and clipboard until natural GC and OS clipboard eviction. Mitigation: clear `state.recovery = null` plus a 60-second clipboard auto-clear after "Copy phrase."
- Pair-complete cross-check of `invite` claim against active browser recipient (§3-O). Pairing tokens issued by a now-revoked browser remain redeemable for the 10-minute TTL.
- `vault_access_log.ip` accepts forwarded headers without proxy allowlist (§3-J). Audit-quality issue, not a security boundary.
- JWT `jti` replay window inside natural TTL (§3-K). Logout is implemented, but no active replay-detection list. Acceptable for HS256 REST auth; revisit if Phase 13b WebAuthn-for-destructive needs fresh assertions regardless.
- Setup state holds passphrase in plain string until `handleChallenge` consumes it (§3-T). Window is the duration of the setup wizard; mitigated by CSP.
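
For the §3-J item, one possible shape of a proxy allowlist is sketched below. It assumes the forwarded header follows the common comma-separated X-Forwarded-For convention and that trusted proxies are matched by exact address string (no CIDR ranges); `resolveClientIp` and its parameters are hypothetical names, not existing code.

```typescript
// Resolve the address to record in vault_access_log.ip.
// The forwarded header is honored only when the direct peer is an allowlisted proxy.
function resolveClientIp(
  peerAddress: string,
  forwardedFor: string | undefined,
  trustedProxies: ReadonlySet<string>,
): string {
  if (forwardedFor === undefined || !trustedProxies.has(peerAddress)) {
    return peerAddress;
  }
  const hops = forwardedFor
    .split(",")
    .map((hop) => hop.trim())
    .filter((hop) => hop.length > 0);
  // Walk from the right, skipping trusted proxies; the first untrusted hop is
  // the nearest address our own infrastructure actually observed.
  for (let i = hops.length - 1; i >= 0; i--) {
    const hop = hops[i];
    if (hop !== undefined && !trustedProxies.has(hop)) return hop;
  }
  return peerAddress;
}
```

Reading from the right matters: entries further left are supplied by the client and can be spoofed, so they are never preferred over a hop appended by a trusted proxy.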
Acceptable risks (documented)
The audit §5 enumerates 13 risks the design knowingly accepts. The most important to remember:
| Risk | Rationale | Revisit trigger |
|---|---|---|
| Argon2id 5-7s on Pi 5; longer on iPhone Safari | UX cost vs security; locked m=64MiB, t=3, p=1 per RFC 9106 | If iPhone Safari unlock exceeds 30s in real-world testing, re-tune to t=2 |
| Best-effort plaintext zeroization on device | V8 string interning; no truly synchronous wipe | TPM-sealed device key (Phase 12-2) |
| In-memory trust plus passkey store | Cloud restart drops trust, forcing fresh ceremony; fine on the Pi, observable in hosted mode | Hosted-mode rollout (Phase 17) |
| Single-tenant cloud, no per-user RLS | Decision-ledger #3 | Multi-user surfacing (Phase 14) |
| Trusted-fingerprint = passphrase only inside 30-day window | UX vs Bloomberg-tight tradeoff (Decision-ledger amend #12) | First report of XSS or extension-based vault read; immediately move to PRF (P1 #6) |
| vault_entries.label plus metadata cleartext | Same posture as 1Password Watchtower | Multi-tenant |
| vault_access_log.ip plus user_agent cleartext | Audit trail by design | Privacy review for hosted mode |
| Recovery phrase shown ONCE in DOM | UX necessity | After the §3-A P2 zeroize lands |
| WebAuthn signature determinism assumed | Trust-marker caches the first signature, sidestepping the issue at unlock; canary detects mismatch at registration | Non-deterministic authenticator report; PRF migration closes this |
| Best-effort Supabase rewrap rollback | Self-heals via dirty-detect | Postgres RPC migration (P1 T8) |
| HS256 JWT (symmetric) over asymmetric | Single-tenant, key only on cloud | Multi-instance or shared-key threat (Phase 14) |
| No jti revocation on session JWTs beyond logout | Replay window is the natural TTL | After Phase 13b WebAuthn-for-destructive |
| Cloud TLS not pinned by device daemon | Tunneled origin trust plus system CAs | Hosted mode (P1 T7) |
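
The Argon2id row can be read as a fixed parameter set plus a single re-tune rule. The sketch below encodes only that decision; `Argon2idParams`, `LOCKED_PARAMS`, and `suggestParams` are hypothetical names, and no particular Argon2 library is assumed.

```typescript
interface Argon2idParams {
  memoryKiB: number;   // m
  iterations: number;  // t
  parallelism: number; // p
}

// Locked values from the table: m=64MiB, t=3, p=1.
const LOCKED_PARAMS: Argon2idParams = {
  memoryKiB: 64 * 1024,
  iterations: 3,
  parallelism: 1,
};

// Revisit trigger from the table: unlock exceeding 30s on iPhone Safari.
const RETUNE_THRESHOLD_MS = 30000;

// Given a measured unlock time, return the parameters the table would call for.
function suggestParams(
  current: Argon2idParams,
  measuredUnlockMs: number,
): Argon2idParams {
  if (measuredUnlockMs > RETUNE_THRESHOLD_MS) {
    return { ...current, iterations: 2 };
  }
  return current;
}
```

Argon2 output depends on its parameters, so a re-tune is not a transparent switch: any key derived under t=3 would have to be re-derived under the new setting.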
Recovery is one-way
The 24-word BIP39 phrase is shown once during setup and is not persisted in plaintext anywhere. Lose it and the vault is unrecoverable for any recipient that depended on it.
Two practical implications:
- Print it or write it down at setup time. Do not save it to a password manager that might itself be in the vault.
- Add a second recipient (paired device) before you forget the phrase. Two `vault_recipients` rows mean that losing one identity is recoverable from the other; one row plus a lost phrase is terminal.
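
The "two rows" rule can be expressed as a small check, for example to drive a dashboard warning. This is a sketch under the assumption that every identity (phrase-derived or paired device) is one `vault_recipients` row and that a null `revoked_at` means active; `RecipientRow` and `survivesLossOf` are hypothetical names.

```typescript
interface RecipientRow {
  id: string;
  revoked_at: string | null; // null means the recipient is still active
}

// True if at least one other active recipient remains after losing `lostId`.
function survivesLossOf(
  lostId: string,
  rows: ReadonlyArray<RecipientRow>,
): boolean {
  return rows.some((row) => row.id !== lostId && row.revoked_at === null);
}
```

A revoked row does not count as a fallback, which is why the check filters on `revoked_at` rather than just counting rows.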
Escalation paths
- Vault key compromise suspected. Revoke all recipients via the dashboard (`UPDATE vault_recipients SET revoked_at = now()`), force re-wrap by re-registering each recipient. T4 fix needed before this is fully effective on devices.
- Hetzner box compromise. Rotate `JWT_SIGNING_KEY` plus `INVITE_PASSWORD` in `/opt/optimalos/secrets.env`, restart `optimalos.service`, force all browsers and devices to re-pair. Vault ciphertext stays safe (server cannot decrypt).
- Pi compromise. Rotate the device's vault keypair via dashboard revoke plus re-pair. Vault ciphertext on Hetzner stays safe.
Where to read more
- Audit (full code-cited findings): `~/.optimalos/transfers/fabric-design/06-vault-auth-threat-rerun.md`
- Vault design (canonical threat model): `~/.optimalos/transfers/fabric-design/02-vault-design.md` §1
- Decision ledger (amendments include the trusted-fingerprint trade-off): `~/.optimalos/transfers/fabric-design/03-decision-ledger.md`
- Manual smoke checklist: `~/.openclaw/workspace/optimalOS/tests/vault/SMOKE.md`
- Vault user guide (for the daily-use story): OptimalVault