Cross-Node Dispatch

Phase 9 of the Loom charter. The hub enqueues work for compute nodes that have the right capability tags; remote workers on those nodes claim the jobs, run them, and write results back. Validated end-to-end on the Pi (oracle) and pop-os (carlos@pop-os) mesh.

The wire

| Layer | What it does |
| --- | --- |
| loom_job_queue (Supabase table) | Holds pending and in-flight remote jobs. |
| optimalOS/src/loom/remote-dispatch.ts | Hub-side enqueue. Writes a row with target_host, workflow_id, step_id, payload. |
| optimalOS/src/loom/remote-worker.ts | Per-node poller. Claims rows where target_host matches the local hostname, runs the step locally, writes the result back. |
| Tailscale | Provides the private transport. Each node's MagicDNS hostname (e.g. oracle.tail-scale.ts.net) is what target_host resolves to. |

There is no realtime subscription — the worker polls. Polling interval is set by POLL_INTERVAL_MS in the worker.
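
The claim-and-run cycle can be sketched as a single poll tick. This is a minimal illustration, not the code in remote-worker.ts: the in-memory queue stands in for the loom_job_queue table, and the id, status, and result fields, as well as the names InMemoryQueue and pollOnce, are assumptions made for the example. Only target_host, workflow_id, step_id, and payload come from the table above. The step runner is synchronous here for brevity; a real worker would presumably await it.

```typescript
// Hypothetical row shape. Only target_host, workflow_id, step_id and payload
// are named in the docs; id, status and result are assumptions.
type JobStatus = "pending" | "running" | "done" | "failed";

interface JobRow {
  id: number;
  target_host: string;
  workflow_id: string;
  step_id: string;
  payload: unknown;
  status: JobStatus;
  result?: unknown;
}

// In-memory stand-in for loom_job_queue so the sketch runs without Supabase.
class InMemoryQueue {
  private rows: JobRow[] = [];
  private nextId = 1;

  enqueue(job: Omit<JobRow, "id" | "status" | "result">): JobRow {
    const row: JobRow = { ...job, id: this.nextId++, status: "pending" };
    this.rows.push(row);
    return row;
  }

  // Claim the oldest pending row addressed to this host, if any.
  claim(host: string): JobRow | undefined {
    const row = this.rows.find(
      (r) => r.status === "pending" && r.target_host === host
    );
    if (row) row.status = "running";
    return row;
  }

  finish(id: number, status: "done" | "failed", result: unknown): void {
    const row = this.rows.find((r) => r.id === id);
    if (row) {
      row.status = status;
      row.result = result;
    }
  }

  get(id: number): JobRow | undefined {
    return this.rows.find((r) => r.id === id);
  }
}

// One poll tick: claim at most one job for this host, run it, write the
// result back. Returns true if a job was claimed.
function pollOnce(
  queue: InMemoryQueue,
  localHost: string,
  runStep: (job: JobRow) => unknown
): boolean {
  const job = queue.claim(localHost);
  if (!job) return false;
  try {
    queue.finish(job.id, "done", runStep(job));
  } catch (err) {
    queue.finish(job.id, "failed", String(err));
  }
  return true;
}
```

A worker would call a tick like this on a timer, for example `setInterval(() => pollOnce(queue, host, runStep), POLL_INTERVAL_MS)`. Against a real table the claim would need to be atomic so that two pollers cannot take the same row.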

Capability matching

The scheduler's capability matcher is designed to decide which compute node runs a given step (see "Capability matching status" below for what is enforced today):

  1. Each compute node reports capabilities: text[] to openclaw_instances on heartbeat (e.g. ["has-gpu", "has-returnpro-creds", "arch-x86_64"]).
  2. A workflow declares requires: string[] at the top level, or per step.
  3. The scheduler picks the first healthy node whose capabilities are a superset of the requirements.
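
The superset rule in the steps above can be sketched as follows. This is an illustration under assumptions, not the matcher in scheduler.ts: the NodeInfo shape, the healthy flag, and the function name pickNode are hypothetical, and the sample nodes and their tags are made up for the example.

```typescript
// Hypothetical view of a compute node as the matcher might see it.
interface NodeInfo {
  hostname: string;
  healthy: boolean;
  capabilities: string[];
}

// Return the first healthy node whose capabilities include every required tag.
// An empty `requires` list matches the first healthy node.
function pickNode(nodes: NodeInfo[], requires: string[]): NodeInfo | undefined {
  return nodes.find((node) => {
    if (!node.healthy) return false;
    const have = new Set(node.capabilities);
    return requires.every((tag) => have.has(tag));
  });
}

// Sample data for illustration only.
const sampleNodes: NodeInfo[] = [
  { hostname: "node-a", healthy: true, capabilities: ["arch-arm64"] },
  { hostname: "node-b", healthy: false, capabilities: ["has-gpu", "arch-x86_64"] },
  { hostname: "node-c", healthy: true, capabilities: ["has-gpu", "arch-x86_64"] },
];

const chosen = pickNode(sampleNodes, ["has-gpu"]);
console.log(chosen ? chosen.hostname : "no match");
```

Because the rule takes the first match, the order in which nodes are listed acts as an implicit priority. Returning undefined leaves it to the caller to decide whether to fail the step or fall back to a default host.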

Code references: optimalOS/src/loom/scheduler.ts (manual + cron path) and optimalOS/src/loom/remote-dispatch.ts (enqueue path).

Capability matching status

As of 2026-04-27 the capabilities column on openclaw_instances is wired and heartbeats populate it. The scheduler does not yet enforce capability requirements when picking a target_host. The MVP path uses defaultHost or an explicit hostHint; capability-driven routing is a follow-up. See Charter §7 and §12.
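
The MVP selection can be pictured as a simple fallback. The sketch below assumes that an explicit hostHint takes precedence over defaultHost; the source names both options but does not state the order, and the function name chooseTargetHost is hypothetical.

```typescript
// Assumption: a non-empty hostHint wins; otherwise fall back to defaultHost.
function chooseTargetHost(defaultHost: string, hostHint?: string): string {
  const hint = hostHint?.trim();
  return hint ? hint : defaultHost;
}

console.log(chooseTargetHost("oracle", "pop-os"));
console.log(chooseTargetHost("oracle"));
```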

Tailscale resolution

MagicDNS is not always reliable for outbound calls from the Pi to pop-os, or vice versa. optimalOS/src/routes/_tailscale.ts provides the resolver shim that maps a node name to its Tailscale IPv4 address. This is why workflows reference host: "pop-os" rather than a raw IP: the resolver does the lookup at dispatch time.
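
One way such a name-to-IPv4 lookup could work is sketched below. It is not the code in _tailscale.ts. It assumes the resolver has the parsed output of `tailscale status --json` and that this output contains Self and Peer entries with HostName and TailscaleIPs fields; the function name resolveTailscaleIPv4 and the sample addresses are made up for the example.

```typescript
// Assumed subset of the JSON printed by `tailscale status --json`.
interface TailscaleNode {
  HostName: string;
  TailscaleIPs?: string[];
}

interface TailscaleStatus {
  Self?: TailscaleNode;
  Peer?: Record<string, TailscaleNode>;
}

// Dotted-quad shape check only; enough to tell IPv4 from IPv6 entries.
const IPV4_SHAPE = /^\d{1,3}(\.\d{1,3}){3}$/;

// Map a node name (e.g. "pop-os") to its Tailscale IPv4, or undefined.
function resolveTailscaleIPv4(
  name: string,
  status: TailscaleStatus
): string | undefined {
  const nodes = [status.Self, ...Object.values(status.Peer ?? {})];
  const wanted = name.toLowerCase();
  for (const node of nodes) {
    if (!node || node.HostName.toLowerCase() !== wanted) continue;
    return node.TailscaleIPs?.find((ip) => IPV4_SHAPE.test(ip));
  }
  return undefined;
}

// Sample status for illustration only.
const sampleStatus: TailscaleStatus = {
  Self: { HostName: "oracle", TailscaleIPs: ["100.64.0.1"] },
  Peer: {
    "nodekey:example": {
      HostName: "pop-os",
      TailscaleIPs: ["fd7a:115c:a1e0::2", "100.64.0.2"],
    },
  },
};

console.log(resolveTailscaleIPv4("pop-os", sampleStatus));
```

Resolving at dispatch time, rather than caching an address in the workflow, keeps workflow definitions valid if a node's Tailscale address changes.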

SSH-driven deploys

The popos-deploy workflow (optimalOS/workflows/popos-deploy.ts) runs on the hub but executes commands on pop-os over SSH. It uses a dedicated key:

  • Key file: ~/.ssh/id_popos-deploy (private), public key on pop-os.
  • SSH config: connects as User carlos.
  • Restricted to deploy-related commands by the agent launcher's command allowlist.

The same key is reused by the cross-node smoke workflow (optimalOS/workflows/cross-node-smoke.ts) which periodically pings the remote node and verifies a round-trip job completes.

Verifying the mesh

Manual smoke test:

```bash
# From the Pi, with both nodes online:
curl -X POST http://localhost:3000/api/loom/strands/cross-node-smoke/run \
  -H "Authorization: Bearer $OPTIMALOS_TOKEN"
```

The run should produce step rows with two different host values — one for the local step and one for the step that gets dispatched to pop-os. Watch the Loom tab; the strand should pulse across both nodes.
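
The "two different host values" check can also be done programmatically once the step rows have been fetched. The sketch below assumes each step row exposes step_id and host fields; the StepRow shape, the function names, and the sample rows are hypothetical.

```typescript
// Hypothetical shape of a step row from a finished run.
interface StepRow {
  step_id: string;
  host: string;
}

// Distinct hosts that executed at least one step, sorted for stable output.
function hostsUsed(rows: StepRow[]): string[] {
  return Array.from(new Set(rows.map((row) => row.host))).sort();
}

// A cross-node run should have touched at least two different hosts.
function ranAcrossNodes(rows: StepRow[]): boolean {
  return hostsUsed(rows).length >= 2;
}

// Sample rows for illustration only.
const sampleRun: StepRow[] = [
  { step_id: "local-step", host: "oracle" },
  { step_id: "remote-step", host: "pop-os" },
];

console.log(hostsUsed(sampleRun), ranAcrossNodes(sampleRun));
```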

If the remote step stays in pending, check:

  1. optimalos.service is running on pop-os (not just the Pi).
  2. The loom_job_queue row in Supabase has a target_host that matches the pop-os hostname exactly.
  3. tailscale status shows both nodes online.
  4. The pop-os worker has SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY set so it can poll.
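
Item 4 in the checklist above is easy to verify at worker startup. The helper below is a hypothetical sketch, not part of remote-worker.ts: it takes the environment as a plain object and reports which of the two variables named above are missing or blank.

```typescript
// The two variables the checklist says the pop-os worker needs.
const REQUIRED_WORKER_ENV = ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY"] as const;

// Return the names of required variables that are unset or blank.
function missingWorkerEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_WORKER_ENV.filter((key) => !env[key]?.trim());
}

// Example with a deliberately incomplete environment.
const missing = missingWorkerEnv({ SUPABASE_URL: "https://example.supabase.co" });
console.log(missing);
```

In a Node process the call would be `missingWorkerEnv(process.env)`. Logging the missing names, never the values, keeps the service role key out of the logs.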

Built by Carlos Lenis in Miami