Cross-Node Dispatch
Phase 9 of the Loom charter. The hub enqueues work for compute nodes that have the right capability tags; remote workers on those nodes claim the jobs, run them, and write results back. Validated end-to-end on the Pi (oracle) and pop-os (carlos@pop-os) mesh.
The wire
| Layer | What it does |
|---|---|
| loom_job_queue (Supabase table) | Holds pending and in-flight remote jobs. |
| optimalOS/src/loom/remote-dispatch.ts | Hub-side enqueue. Writes a row with target_host, workflow_id, step_id, payload. |
| optimalOS/src/loom/remote-worker.ts | Per-node poller. Claims rows where target_host matches the local hostname, runs the step locally, writes result back. |
| Tailscale | Provides the private transport. Each node's MagicDNS hostname (e.g. oracle.tail-scale.ts.net) is what target_host resolves to. |
There is no realtime subscription — the worker polls. Polling interval is set by POLL_INTERVAL_MS in the worker.
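The enqueue, claim, run, write-back cycle can be modelled in memory as a rough sketch. Only target_host, workflow_id, step_id and payload are named above; the id, status and result fields, and all three function names, are assumptions for illustration. The real worker talks to Supabase rather than to an array.

```typescript
// Hypothetical row shape: id, status and result are assumptions, the other
// four fields are the ones the hub-side enqueue is described as writing.
type JobStatus = "pending" | "claimed" | "done";

interface LoomJob {
  id: number;
  target_host: string;
  workflow_id: string;
  step_id: string;
  payload: unknown;
  status: JobStatus;
  result?: unknown;
}

// Hub side: append a pending row addressed to one node.
function enqueue(queue: LoomJob[], job: Omit<LoomJob, "id" | "status">): LoomJob {
  const row: LoomJob = { ...job, id: queue.length + 1, status: "pending" };
  queue.push(row);
  return row;
}

// Worker side: claim the oldest pending row whose target_host is this node.
function claimNext(queue: LoomJob[], localHost: string): LoomJob | undefined {
  for (const job of queue) {
    if (job.status === "pending" && job.target_host === localHost) {
      job.status = "claimed";
      return job;
    }
  }
  return undefined;
}

// One poll tick: claim, run, write the result back. Returns false when idle.
function pollOnce(
  queue: LoomJob[],
  localHost: string,
  runStep: (job: LoomJob) => unknown,
): boolean {
  const job = claimNext(queue, localHost);
  if (!job) return false;
  job.result = runStep(job);
  job.status = "done";
  return true;
}
```

A worker would call something like pollOnce on a timer every POLL_INTERVAL_MS. Against a shared table the claim step has to be atomic (for example a conditional update), which this single-process sketch does not need to address.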
Capability matching
The scheduler's capability matcher is designed to decide which compute node should run a given step (enforcement is not live yet; see Capability matching status):
- Each compute node reports capabilities: text[] to openclaw_instances on heartbeat (e.g. ["has-gpu", "has-returnpro-creds", "arch-x86_64"]).
- A workflow declares requires: string[] at the top level, or per step.
- The scheduler picks the first healthy node whose capabilities is a superset of the requirement.
Code references: optimalOS/src/loom/scheduler.ts (manual + cron path) and optimalOS/src/loom/remote-dispatch.ts (enqueue path).
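The superset rule described above can be sketched as follows. This is an illustration of the matching logic only, not the code in scheduler.ts: the ComputeNode shape, the healthy flag, the function names and the arch-arm64 tag on oracle are all assumptions.

```typescript
// Hypothetical node shape; capabilities mirrors the column on openclaw_instances,
// while name and healthy are assumptions made for this sketch.
interface ComputeNode {
  name: string;
  healthy: boolean;
  capabilities: string[];
}

// True when every required tag is present in the node's capabilities.
function satisfies(capabilities: string[], requires: string[]): boolean {
  return requires.every((tag) => capabilities.indexOf(tag) !== -1);
}

// First healthy node whose capabilities are a superset of the requirement.
function pickNode(nodes: ComputeNode[], requires: string[]): ComputeNode | undefined {
  for (const node of nodes) {
    if (node.healthy && satisfies(node.capabilities, requires)) return node;
  }
  return undefined;
}

// Example mesh; the oracle tag is invented for illustration.
const meshNodes: ComputeNode[] = [
  { name: "oracle", healthy: true, capabilities: ["arch-arm64"] },
  { name: "pop-os", healthy: true, capabilities: ["has-gpu", "has-returnpro-creds", "arch-x86_64"] },
];

// Only pop-os carries both tags, so it is the node selected here.
const gpuNode = pickNode(meshNodes, ["has-gpu", "arch-x86_64"]);
```

Because "first" depends on list order, an empty requires list selects whichever healthy node comes first, so the ordering of the node list becomes part of the routing policy.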
Capability matching status
As of 2026-04-27, the capability column on openclaw_instances is wired and heartbeats populate it, but the scheduler does not yet enforce capability requirements when picking a target_host. The MVP path is defaultHost or an explicit hostHint; capability-driven routing is a follow-up. See Charter §7 and §12.
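The MVP routing rule might look like the sketch below. Only the names defaultHost and hostHint come from the text; the options shape, the function name, and the assumption that a non-empty hostHint takes precedence over defaultHost are mine.

```typescript
// Hypothetical options shape for the MVP path.
interface RouteOptions {
  defaultHost: string;
  hostHint?: string;
}

// Assumed precedence: an explicit, non-blank hostHint wins,
// otherwise the job goes to defaultHost.
function chooseTargetHost(opts: RouteOptions): string {
  const hint = opts.hostHint?.trim();
  return hint ? hint : opts.defaultHost;
}
```

Under that assumption, a workflow with no hint always runs on the default node, regardless of the capabilities it declares.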
Tailscale resolution
The Pi can't always rely on MagicDNS for outbound calls to pop-os, and vice versa. optimalOS/src/routes/_tailscale.ts provides the resolver shim that maps a node name to its Tailscale IPv4 address. This is why workflows reference host: "pop-os" rather than a raw IP: the resolver does the lookup at dispatch time.
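One way such a name-to-IPv4 lookup could work is to read the JSON printed by `tailscale status --json`. The sketch below is not the contents of _tailscale.ts. It assumes the Self, Peer, HostName and TailscaleIPs fields of that output (worth verifying against the installed Tailscale version, since the JSON format is not guaranteed stable), and the addresses in the sample are made up.

```typescript
// Subset of the `tailscale status --json` output that this sketch reads.
interface TsPeer {
  HostName: string;
  TailscaleIPs?: string[] | null;
}
interface TsStatus {
  Self: TsPeer;
  Peer?: { [nodeKey: string]: TsPeer } | null;
}

// Loose dotted-quad check, enough to tell IPv4 from IPv6 entries.
const IPV4 = /^\d{1,3}(\.\d{1,3}){3}$/;

// Map a node name such as "pop-os" to its Tailscale IPv4, or undefined if unknown.
function resolveTailscaleIPv4(status: TsStatus, nodeName: string): string | undefined {
  const peers = status.Peer ?? {};
  const candidates: TsPeer[] = [status.Self];
  for (const key of Object.keys(peers)) candidates.push(peers[key]);
  const wanted = nodeName.toLowerCase();
  for (const peer of candidates) {
    if (peer.HostName.toLowerCase() !== wanted) continue;
    for (const ip of peer.TailscaleIPs ?? []) {
      if (IPV4.test(ip)) return ip;
    }
  }
  return undefined;
}

// Made-up sample; real output has many more fields.
const sampleStatus: TsStatus = {
  Self: { HostName: "oracle", TailscaleIPs: ["100.64.0.1", "fd7a:115c:a1e0::1"] },
  Peer: {
    "nodekey:example": { HostName: "pop-os", TailscaleIPs: ["fd7a:115c:a1e0::2", "100.64.0.2"] },
  },
};

const popOsAddress = resolveTailscaleIPv4(sampleStatus, "pop-os");
```

Resolving at dispatch time, rather than storing IPs in workflow files, means a node can be re-registered on the tailnet without any workflow edits.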
SSH-driven deploys
The popos-deploy workflow (optimalOS/workflows/popos-deploy.ts) runs on the hub but executes commands on pop-os over SSH. It uses a dedicated key:
- Key file: ~/.ssh/id_popos-deploy (private), public key on pop-os.
- SSH config: connects as User carlos.
- Restricted to deploy-related commands by the agent launcher's command allowlist.
The same key is reused by the cross-node smoke workflow (optimalOS/workflows/cross-node-smoke.ts), which periodically pings the remote node and verifies that a round-trip job completes.
Verifying the mesh
Manual smoke test:

    # From the Pi, with both nodes online:
    curl -X POST http://localhost:3000/api/loom/strands/cross-node-smoke/run \
      -H "Authorization: Bearer $OPTIMALOS_TOKEN"

The run should produce step rows with two different host values: one for the local step and one for the step that gets dispatched to pop-os. Watch the Loom tab; the strand should pulse across both nodes.
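The "two different host values" check can also be done programmatically once the step rows are fetched. This is a small sketch: the StepRow shape and both function names are hypothetical, and only the host value is described in the text.

```typescript
// Hypothetical step-row shape; step_id is included only to make rows readable.
interface StepRow {
  step_id: string;
  host: string;
}

// Distinct host values seen across a run's step rows, in first-seen order.
function distinctHosts(rows: StepRow[]): string[] {
  const hosts: string[] = [];
  for (const row of rows) {
    if (hosts.indexOf(row.host) === -1) hosts.push(row.host);
  }
  return hosts;
}

// A passing smoke run has steps recorded on at least two different hosts.
function crossedNodes(rows: StepRow[]): boolean {
  return distinctHosts(rows).length >= 2;
}
```

If crossedNodes returns false for a completed run, every step ran on one node and the remote dispatch never happened.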
If the remote step stays in pending, check:
- optimalos.service is running on pop-os (not just the Pi).
- loom_job_queue row in Supabase has target_host matching the pop-os hostname exactly.
- Tailscale: tailscale status shows both nodes online.
- The pop-os worker has SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY set so it can poll.
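Two of the checks above are easy to automate on the worker side. This is a minimal sketch with hypothetical function names. The case-sensitive comparison is my reading of "exactly"; the real worker may compare hostnames differently.

```typescript
// The two variables the pop-os worker needs in order to poll.
const REQUIRED_WORKER_ENV = ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY"];

// Names of required variables that are unset or empty in the given environment.
function missingWorkerEnv(env: { [key: string]: string | undefined }): string[] {
  return REQUIRED_WORKER_ENV.filter((key) => !env[key]);
}

// Exact, case-sensitive comparison between a row's target_host and the local hostname.
function hostMatches(targetHost: string, localHostname: string): boolean {
  return targetHost === localHostname;
}
```

In Node, missingWorkerEnv(process.env) at worker startup would catch a missing key before the first poll.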