fix(cli): lazy-load model auth runtime

docs(bluebubbles): rewrite with Steps for setup, Tabs for DM/groups and coalescing, AccordionGroup for actions and config
docs(troubleshooting): rewrite with AccordionGroup for symptom signatures, Steps for fix flows, and Warning callouts
2026-06-22 15:03:05 +08:00 · 2026-04-25 23:49:06 -07:00 · 2026-04-25 23:48:13 -07:00 · 2026-04-25 23:44:25 -07:00 · 2026-04-26 07:42:14 +01:00 · 2026-04-25 23:40:24 -07:00
50 changed files with 2865 additions and 1327 deletions
--- a/.agents/skills/openclaw-pr-maintainer/SKILL.md
+++ b/.agents/skills/openclaw-pr-maintainer/SKILL.md
@@ -7,6 +7,22 @@ description: Review, triage, close, label, comment on, or land OpenClaw PRs/issu

 Use this skill for maintainer-facing GitHub workflow, not for ordinary code changes.

+## Start issue and PR triage with ghcrawl
+
+- Anytime you inspect OpenClaw issues or PRs, check local `ghcrawl` data first for related threads, duplicate attempts, and already-landed fixes.
+- Use `ghcrawl` for candidate discovery and clustering; use `gh`, `gh api`, and the current checkout to verify live state before commenting, labeling, closing, or landing.
+- If `ghcrawl` is missing, stale, lacks the target thread, or has no embeddings for neighbor/search commands, fall back to the GitHub search workflow below.
+- Do not run expensive/update commands such as `ghcrawl refresh`, `ghcrawl embed`, or `ghcrawl cluster` unless the user asked to update the local store or the stale data is blocking the decision.
+
+Common read-only path:
+
+```bash
+ghcrawl threads openclaw/openclaw --numbers <issue-or-pr-number> --include-closed --json
+ghcrawl neighbors openclaw/openclaw --number <issue-or-pr-number> --limit 12 --json
+ghcrawl search openclaw/openclaw --query "<scope or title keywords>" --mode hybrid --json
+ghcrawl cluster-detail openclaw/openclaw --id <cluster-id> --member-limit 20 --body-chars 280 --json
+```
+
 ## Apply close and triage labels correctly

 - If an issue or PR matches an auto-close reason, apply the label and let `.github/workflows/auto-response.yml` handle the comment/close/lock flow.
@@ -59,9 +75,9 @@ Use this skill for maintainer-facing GitHub workflow, not for ordinary code chan

 ## Search broadly before deciding

-- Prefer targeted keyword search before proposing new work or closing something as duplicate.
-- Use `--repo openclaw/openclaw` with `--match title,body` first.
-- Add `--match comments` when triaging follow-up discussion.
+- Prefer `ghcrawl` first. Then use targeted GitHub keyword search to verify gaps, live status, comments, and candidates not present in the local store.
+- Use `--repo openclaw/openclaw` with `--match title,body` first when using `gh search`.
+- Add `--match comments` when triaging follow-up discussion or closed-as-duplicate chains.
 - Do not stop at the first 500 results when the task requires a full search.

 Examples:
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -26,6 +26,7 @@ Docs: https://docs.openclaw.ai
 - Diagnostics/OTEL: align model-call GenAI span attributes with OpenTelemetry stability opt-in semantics, keeping legacy `gen_ai.system` by default while emitting `gen_ai.provider.name` under `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`. Thanks @vincentkoc.
 - Diagnostics/OTEL: support signal-specific OTLP endpoint overrides for traces, metrics, and logs via config or standard OTEL environment variables. Thanks @vincentkoc.
 - Diagnostics/OTEL: emit bounded telemetry exporter health diagnostics for startup and log-export failures without exporting raw error text. Thanks @vincentkoc.
+- Diagnostics/OTEL: export agent harness lifecycle telemetry as bounded `openclaw.harness.run` spans and `openclaw.harness.duration_ms` metrics so QA-lab, Codex, and future harnesses share one trace shape. Thanks @vincentkoc.
 - Plugins/CLI: add `openclaw plugins registry` for explicit persisted-registry inspection and `--refresh` repair without making normal startup rescan plugin locations. Thanks @vincentkoc.
 - Plugins/CLI: make `openclaw plugins list` read the cold persisted registry snapshot by default, leaving module-aware diagnostics to `plugins doctor` and `plugins inspect`. Thanks @vincentkoc.
 - Plugins/startup: move gateway startup plugin planning onto the versioned cold registry index, with postinstall repair for older registry files that predate startup metadata. Thanks @vincentkoc.
@@ -37,6 +38,7 @@ Docs: https://docs.openclaw.ai
 - CLI/models: use OpenClaw Provider Index preview rows as the final cold fallback for installable providers, while keeping user config, installed manifests, and refreshed cache rows above provider-index metadata. Thanks @vincentkoc.
 - Providers/plugins: keep onboarding and auth-choice setup lists on cold manifest/install metadata and add Provider Index install metadata for not-yet-installed provider plugins. Thanks @vincentkoc.
 - Providers/plugins: keep provider setup guidance and configure auth imports on cold manifest metadata, with a regression guard against static provider-runtime imports on setup/configure list paths. Thanks @vincentkoc.
+- CLI/capabilities: keep capability command registration from importing the models auth runtime until `model auth login` actually runs. Thanks @vincentkoc.
 - Plugins/chat commands: refresh the persisted plugin registry after `/plugins enable` and `/plugins disable`, matching the CLI mutation path. Thanks @vincentkoc.
 - Plugins/compat: mark `OPENCLAW_DISABLE_PERSISTED_PLUGIN_REGISTRY` as a deprecated break-glass switch and point operators at registry repair instead. Thanks @vincentkoc.
 - Plugins/registry: ignore stale persisted registry reads when plugin policy no longer matches current config, and stamp generated registry files with a do-not-edit warning. Thanks @vincentkoc.
@@ -74,12 +76,17 @@ Docs: https://docs.openclaw.ai

 ### Fixes

+- Gateway/install: refresh loaded gateway service installs when the current service embeds stale gateway auth instead of returning already-installed, avoiding LaunchAgent token-mismatch loops after token rotation. Fixes #70752. Thanks @hyspacex.
+- Update: ignore bundled plugin `.openclaw-install-stage` directories during global install verification and packaged dist pruning so leftover runtime-dep staging files do not turn successful updates into `unexpected packaged dist file` failures. Fixes #71752. Thanks @waynegault.
 - Gateway/plugins: stop persisted WhatsApp auth state from activating bundled channel runtime-dependency repair during startup when `channels.whatsapp` is absent, avoiding npm/git stalls on packaged Linux installs. Fixes #71994. Thanks @xiao398008.
+- Gateway/device tokens: enforce caller-scope containment inside token rotation and revocation so pairing-only sessions cannot mutate higher-scope operator tokens. Fixes #71990. Thanks @coygeek.
 - CLI/model runs: keep `openclaw infer model run` on explicit OpenRouter models from loading the full provider catalog or inheriting chat-agent silent-reply policy, restoring non-empty one-shot probe output. Fixes #68791. Thanks @limpredator.
 - Installer/macOS: rerun Homebrew install steps without the gum spinner when raw-mode ioctl failures occur, and avoid claiming `node@24` was installed when the Homebrew keg binary is missing. Fixes #70411. Thanks @1fanwang and @dad-io.
 - Installer: load nvm before Node.js detection so `curl | bash` installs respect nvm-managed Node instead of stale system Node. Fixes #49556. Thanks @heavenlxj.
+- Installer/Windows: route PowerShell install failures through a top-level handler so `iwr ... | iex` returns control to the current shell while direct script-file runs still exit non-zero. Fixes #38054. Thanks @PwrSrg.
 - CLI/Volta: respawn raw `openclaw` CLI runs through the named `node` shim when the current Node executable resolves to `volta-shim`, avoiding direct shim execution failures in non-interactive shells. Fixes #68672. Thanks @sanchezm86.
 - Installer: warn when multiple npm global roots contain OpenClaw installs, showing active Node/npm/openclaw plus each install path and version so stale version-manager installs are visible. Fixes #40839. Thanks @zhixianio.
+- Cron/tasks: recover completed cron task ledger records from durable run logs and job state before marking them `lost`, reducing false `backing session missing` audit errors for isolated cron runs and keeping offline CLI audit from treating its empty local cron active-job set as authoritative. Fixes #71963.
 - Docker: copy patched dependency files into runtime images so downstream `pnpm install` layers keep working. Fixes #69224. Thanks @gucasbrg.
 - Agents/runtime: submit heartbeat, cron, and exec wakeups as transient runtime context instead of visible user prompts, keeping synthetic system work out of chat transcripts. Fixes #66496 and #66814. Thanks @jeades and @mandomaker.
 - Telegram: include native quote excerpts automatically for threaded replies and reply tags when the original Telegram text is available, without adding another config knob. Fixes #6975. Thanks @rex05ai.
--- a/docs/automation/cron-jobs.md
+++ b/docs/automation/cron-jobs.md
@@ -51,7 +51,7 @@ Cron is the Gateway's built-in scheduler. It persists jobs, wakes the agent at t
 <a id="maintenance"></a>

 <Note>
-Task reconciliation for cron is runtime-owned: an active cron task stays live while the cron runtime still tracks that job as running, even if an old child session row still exists. Once the runtime stops owning the job and the 5-minute grace window expires, maintenance can mark the task `lost`.
+Task reconciliation for cron is runtime-owned first, durable-history-backed second: an active cron task stays live while the cron runtime still tracks that job as running, even if an old child session row still exists. Once the runtime stops owning the job and the 5-minute grace window expires, maintenance checks persisted run logs and job state for the matching `cron:<jobId>:<startedAt>` run. If that durable history shows a terminal result, the task ledger is finalized from it; otherwise Gateway-owned maintenance can mark the task `lost`. Offline CLI audit can recover from durable history, but it does not treat its own empty in-process active-job set as proof that a Gateway-owned cron run is gone.
 </Note>

 ## Schedule types
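
Reviewer sketch, not part of the patch: the reconciliation order described in the `<Note>` above can be read as a small decision function. The TypeScript below is a minimal illustration under stated assumptions. `reconcileCronTask`, `CronTaskView`, `Decision`, and every field name are hypothetical and are not taken from the OpenClaw codebase. Only the ordering comes from the text: runtime ownership first, then the 5-minute grace window, then durable history, with `lost` reserved for Gateway-owned maintenance. Skipping the grace-window check for the offline CLI is a simplification made for this sketch.

```typescript
// Hypothetical types for illustration only; not OpenClaw's real API.
type TerminalStatus = "succeeded" | "failed" | "timed_out" | "cancelled";

type Decision =
  | { kind: "keep_active" }
  | { kind: "finalize"; status: TerminalStatus }
  | { kind: "mark_lost" };

interface CronTaskView {
  // Who is running maintenance. Only the Gateway owns the in-memory active-job set.
  caller: "gateway" | "offline-cli";
  // True while the cron runtime still tracks the job as running.
  runtimeOwnsJob: boolean;
  // Milliseconds since the runtime stopped owning the job.
  msSinceRuntimeReleased: number;
  // Terminal result recovered from persisted run logs or job state, if any.
  durableResult?: TerminalStatus;
}

// The 5-minute grace window mentioned in the note.
const GRACE_WINDOW_MS = 5 * 60 * 1000;

function reconcileCronTask(task: CronTaskView): Decision {
  if (task.caller === "gateway") {
    // Runtime ownership is authoritative first.
    if (task.runtimeOwnsJob) return { kind: "keep_active" };
    // Do not act until the grace window has expired.
    if (task.msSinceRuntimeReleased < GRACE_WINDOW_MS) return { kind: "keep_active" };
  }

  // Durable history is consulted second, by both Gateway and offline CLI audit.
  if (task.durableResult !== undefined) {
    return { kind: "finalize", status: task.durableResult };
  }

  // Only Gateway-owned maintenance may fall back to `lost`. The offline CLI
  // does not treat its own empty active-job set as proof the run is gone.
  return task.caller === "gateway" ? { kind: "mark_lost" } : { kind: "keep_active" };
}
```

In this reading, an offline audit with no durable record leaves the task untouched, which matches the stated goal of avoiding false `backing session missing` errors.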
--- a/docs/automation/tasks.md
+++ b/docs/automation/tasks.md
@@ -25,8 +25,12 @@ Not every agent run creates a task. Heartbeat turns and normal interactive chat
 - Tasks are **records**, not schedulers — cron and heartbeat decide _when_ work runs, tasks track _what happened_.
 - ACP, subagents, all cron jobs, and CLI operations create tasks. Heartbeat turns do not.
 - Each task moves through `queued → running → terminal` (succeeded, failed, timed_out, cancelled, or lost).
-- Cron tasks stay live while the cron runtime still owns the job; chat-backed CLI tasks stay live only while their owning run context is still active.
-- Completion is push-driven: detached work can notify directly or wake the requester session/heartbeat when it finishes, so status polling loops are usually the wrong shape.
+- Cron tasks stay live while the cron runtime still owns the job; if the
+  in-memory runtime state is gone, task maintenance first checks durable cron
+  run history before marking a task lost.
+- Completion is push-driven: detached work can notify directly or wake the
+  requester session/heartbeat when it finishes, so status polling loops are
+  usually the wrong shape.
 - Isolated cron runs and subagent completions best-effort clean up tracked browser tabs/processes for their child session before final cleanup bookkeeping.
 - Isolated cron delivery suppresses stale interim parent replies while descendant subagent work is still draining, and it prefers final descendant output when that arrives before delivery.
 - Completion notifications are delivered directly to a channel or queued for the next heartbeat.
@@ -143,8 +147,14 @@ Agent run completion is authoritative for active task records. A successful deta

 - ACP tasks: backing ACP child session metadata disappeared.
 - Subagent tasks: backing child session disappeared from the target agent store.
-- Cron tasks: the cron runtime no longer tracks the job as active.
-- CLI tasks: isolated child-session tasks use the child session; chat-backed CLI tasks use the live run context instead, so lingering channel/group/direct session rows do not keep them alive. Gateway-backed `openclaw agent` runs also finalize from their run result, so completed runs do not sit active until the sweeper marks them `lost`.
+- Cron tasks: the cron runtime no longer tracks the job as active and durable
+  cron run history does not show a terminal result for that run. Offline CLI
+  audit does not treat its own empty in-process cron runtime state as authority.
+- CLI tasks: isolated child-session tasks use the child session; chat-backed
+  CLI tasks use the live run context instead, so lingering
+  channel/group/direct session rows do not keep them alive. Gateway-backed
+  `openclaw agent` runs also finalize from their run result, so completed runs
+  do not sit active until the sweeper marks them `lost`.

 ## Delivery and notifications

@@ -236,7 +246,7 @@ openclaw tasks notify <lookup> state_changes
    Reconciliation is runtime-aware:

    - ACP/subagent tasks check their backing child session.
-    - Cron tasks check whether the cron runtime still owns the job.
+    - Cron tasks check whether the cron runtime still owns the job, then recover terminal status from persisted cron run logs/job state before falling back to `lost`. Only the Gateway process is authoritative for the in-memory cron active-job set; offline CLI audit uses durable history but does not mark a cron task lost solely because that local Set is empty.
    - Chat-backed CLI tasks check the owning live run context, not just the chat session row.

    Completion cleanup is also runtime-aware:
--- a/docs/channels/bluebubbles.md
+++ b/docs/channels/bluebubbles.md
@@ -5,14 +5,14 @@ read_when:
  - Troubleshooting webhook pairing
  - Configuring iMessage on macOS
 title: "BlueBubbles"
+sidebarTitle: "BlueBubbles"
 ---

 Status: bundled plugin that talks to the BlueBubbles macOS server over HTTP. **Recommended for iMessage integration** due to its richer API and easier setup compared to the legacy imsg channel.

-## Bundled plugin
-
-Current OpenClaw releases bundle BlueBubbles, so normal packaged builds do not
-need a separate `openclaw plugins install` step.
+<Note>
+Current OpenClaw releases bundle BlueBubbles, so normal packaged builds do not need a separate `openclaw plugins install` step.
+</Note>

 ## Overview

@@ -21,113 +21,119 @@ need a separate `openclaw plugins install` step.
 - OpenClaw talks to it through its REST API (`GET /api/v1/ping`, `POST /message/text`, `POST /chat/:id/*`).
 - Incoming messages arrive via webhooks; outgoing replies, typing indicators, read receipts, and tapbacks are REST calls.
 - Attachments and stickers are ingested as inbound media (and surfaced to the agent when possible).
-- Auto-TTS replies that synthesize MP3 or CAF audio are delivered as iMessage
-  voice memo bubbles instead of plain file attachments.
+- Auto-TTS replies that synthesize MP3 or CAF audio are delivered as iMessage voice memo bubbles instead of plain file attachments.
 - Pairing/allowlist works the same way as other channels (`/channels/pairing` etc) with `channels.bluebubbles.allowFrom` + pairing codes.
 - Reactions are surfaced as system events just like Slack/Telegram so agents can "mention" them before replying.
 - Advanced features: edit, unsend, reply threading, message effects, group management.

 ## Quick start

-1. Install the BlueBubbles server on your Mac (follow the instructions at [bluebubbles.app/install](https://bluebubbles.app/install)).
-2. In the BlueBubbles config, enable the web API and set a password.
-3. Run `openclaw onboard` and select BlueBubbles, or configure manually:
+<Steps>
+  <Step title="Install BlueBubbles">
+    Install the BlueBubbles server on your Mac (follow the instructions at [bluebubbles.app/install](https://bluebubbles.app/install)).
+  </Step>
+  <Step title="Enable the web API">
+    In the BlueBubbles config, enable the web API and set a password.
+  </Step>
+  <Step title="Configure OpenClaw">
+    Run `openclaw onboard` and select BlueBubbles, or configure manually:

-   ```json5
-   {
-     channels: {
-       bluebubbles: {
-         enabled: true,
-         serverUrl: "http://192.168.1.100:1234",
-         password: "example-password",
-         webhookPath: "/bluebubbles-webhook",
-       },
-     },
-   }
-   ```
+    ```json5
+    {
+      channels: {
+        bluebubbles: {
+          enabled: true,
+          serverUrl: "http://192.168.1.100:1234",
+          password: "example-password",
+          webhookPath: "/bluebubbles-webhook",
+        },
+      },
+    }
+    ```

-4. Point BlueBubbles webhooks to your gateway (example: `https://your-gateway-host:3000/bluebubbles-webhook?password=<password>`).
-5. Start the gateway; it will register the webhook handler and start pairing.
+  </Step>
+  <Step title="Point webhooks at the gateway">
+    Point BlueBubbles webhooks to your gateway (example: `https://your-gateway-host:3000/bluebubbles-webhook?password=<password>`).
+  </Step>
+  <Step title="Start the gateway">
+    Start the gateway; it will register the webhook handler and start pairing.
+  </Step>
+</Steps>

-Security note:
+<Warning>
+**Security**

 - Always set a webhook password.
 - Webhook authentication is always required. OpenClaw rejects BlueBubbles webhook requests unless they include a password/guid that matches `channels.bluebubbles.password` (for example `?password=<password>` or `x-password`), regardless of loopback/proxy topology.
 - Password authentication is checked before reading/parsing full webhook bodies.
+  </Warning>

 ## Keeping Messages.app alive (VM / headless setups)

-Some macOS VM / always-on setups can end up with Messages.app going “idle” (incoming events stop until the app is opened/foregrounded). A simple workaround is to **poke Messages every 5 minutes** using an AppleScript + LaunchAgent.
+Some macOS VM / always-on setups can end up with Messages.app going "idle" (incoming events stop until the app is opened/foregrounded). A simple workaround is to **poke Messages every 5 minutes** using an AppleScript + LaunchAgent.

-### 1) Save the AppleScript
+<Steps>
+  <Step title="Save the AppleScript">
+    Save this as `~/Scripts/poke-messages.scpt`:

-Save this as:
+    ```applescript
+    try
+      tell application "Messages"
+        if not running then
+          launch
+        end if

-- `~/Scripts/poke-messages.scpt`
+        -- Touch the scripting interface to keep the process responsive.
+        set _chatCount to (count of chats)
+      end tell
+    on error
+      -- Ignore transient failures (first-run prompts, locked session, etc).
+    end try
+    ```

-Example script (non-interactive; does not steal focus):
+  </Step>
+  <Step title="Install a LaunchAgent">
+    Save this as `~/Library/LaunchAgents/com.user.poke-messages.plist`:

-```applescript
-try
-  tell application "Messages"
-    if not running then
-      launch
-    end if
+    ```xml
+    <?xml version="1.0" encoding="UTF-8"?>
+    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+    <plist version="1.0">
+      <dict>
+        <key>Label</key>
+        <string>com.user.poke-messages</string>

-    -- Touch the scripting interface to keep the process responsive.
-    set _chatCount to (count of chats)
-  end tell
-on error
-  -- Ignore transient failures (first-run prompts, locked session, etc).
-end try
-```
+        <key>ProgramArguments</key>
+        <array>
+          <string>/bin/bash</string>
+          <string>-lc</string>
+          <string>/usr/bin/osascript &quot;$HOME/Scripts/poke-messages.scpt&quot;</string>
+        </array>

-### 2) Install a LaunchAgent
+        <key>RunAtLoad</key>
+        <true/>

-Save this as:
+        <key>StartInterval</key>
+        <integer>300</integer>

-- `~/Library/LaunchAgents/com.user.poke-messages.plist`
+        <key>StandardOutPath</key>
+        <string>/tmp/poke-messages.log</string>
+        <key>StandardErrorPath</key>
+        <string>/tmp/poke-messages.err</string>
+      </dict>
+    </plist>
+    ```

-```xml
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-  <dict>
-    <key>Label</key>
-    <string>com.user.poke-messages</string>
+    This runs **every 300 seconds** and **on login**. The first run may trigger macOS **Automation** prompts (`osascript` → Messages). Approve them in the same user session that runs the LaunchAgent.

-    <key>ProgramArguments</key>
-    <array>
-      <string>/bin/bash</string>
-      <string>-lc</string>
-      <string>/usr/bin/osascript &quot;$HOME/Scripts/poke-messages.scpt&quot;</string>
-    </array>
-
-    <key>RunAtLoad</key>
-    <true/>
-
-    <key>StartInterval</key>
-    <integer>300</integer>
-
-    <key>StandardOutPath</key>
-    <string>/tmp/poke-messages.log</string>
-    <key>StandardErrorPath</key>
-    <string>/tmp/poke-messages.err</string>
-  </dict>
-</plist>
-```
-
-Notes:
-
-- This runs **every 300 seconds** and **on login**.
-- The first run may trigger macOS **Automation** prompts (`osascript` → Messages). Approve them in the same user session that runs the LaunchAgent.
-
-Load it:
-
-```bash
-launchctl unload ~/Library/LaunchAgents/com.user.poke-messages.plist 2>/dev/null || true
-launchctl load ~/Library/LaunchAgents/com.user.poke-messages.plist
-```
+  </Step>
+  <Step title="Load it">
+    ```bash
+    launchctl unload ~/Library/LaunchAgents/com.user.poke-messages.plist 2>/dev/null || true
+    launchctl load ~/Library/LaunchAgents/com.user.poke-messages.plist
+    ```
+  </Step>
+</Steps>

 ## Onboarding

@@ -139,11 +145,21 @@ openclaw onboard

 The wizard prompts for:

-- **Server URL** (required): BlueBubbles server address (e.g., `http://192.168.1.100:1234`)
-- **Password** (required): API password from BlueBubbles Server settings
-- **Webhook path** (optional): Defaults to `/bluebubbles-webhook`
-- **DM policy**: pairing, allowlist, open, or disabled
-- **Allow list**: Phone numbers, emails, or chat targets
+<ParamField path="Server URL" type="string" required>
+  BlueBubbles server address (e.g., `http://192.168.1.100:1234`).
+</ParamField>
+<ParamField path="Password" type="string" required>
+  API password from BlueBubbles Server settings.
+</ParamField>
+<ParamField path="Webhook path" type="string" default="/bluebubbles-webhook">
+  Webhook endpoint path.
+</ParamField>
+<ParamField path="DM policy" type="string">
+  `pairing`, `allowlist`, `open`, or `disabled`.
+</ParamField>
+<ParamField path="Allow list" type="string[]">
+  Phone numbers, emails, or chat targets.
+</ParamField>

 You can also add BlueBubbles via CLI:

@@ -153,19 +169,20 @@ openclaw channels add bluebubbles --http-url http://192.168.1.100:1234 --passwor

 ## Access control (DMs + groups)

-DMs:
-
-- Default: `channels.bluebubbles.dmPolicy = "pairing"`.
-- Unknown senders receive a pairing code; messages are ignored until approved (codes expire after 1 hour).
-- Approve via:
-  - `openclaw pairing list bluebubbles`
-  - `openclaw pairing approve bluebubbles <CODE>`
-- Pairing is the default token exchange. Details: [Pairing](/channels/pairing)
-
-Groups:
-
-- `channels.bluebubbles.groupPolicy = open | allowlist | disabled` (default: `allowlist`).
-- `channels.bluebubbles.groupAllowFrom` controls who can trigger in groups when `allowlist` is set.
+<Tabs>
+  <Tab title="DMs">
+    - Default: `channels.bluebubbles.dmPolicy = "pairing"`.
+    - Unknown senders receive a pairing code; messages are ignored until approved (codes expire after 1 hour).
+    - Approve via:
+      - `openclaw pairing list bluebubbles`
+      - `openclaw pairing approve bluebubbles <CODE>`
+    - Pairing is the default token exchange. Details: [Pairing](/channels/pairing)
+  </Tab>
+  <Tab title="Groups">
+    - `channels.bluebubbles.groupPolicy = open | allowlist | disabled` (default: `allowlist`).
+    - `channels.bluebubbles.groupAllowFrom` controls who can trigger in groups when `allowlist` is set.
+  </Tab>
+</Tabs>

 ### Contact name enrichment (macOS, optional)

@@ -361,21 +378,23 @@ BlueBubbles supports advanced message actions when enabled in config:
 }
 ```

-Available actions:
-
-- **react**: Add/remove tapback reactions (`messageId`, `emoji`, `remove`). iMessage's native tapback set is `love`, `like`, `dislike`, `laugh`, `emphasize`, and `question`. When an agent picks an emoji outside that set (for example `👀`), the reaction tool falls back to `love` so the tapback still renders instead of failing the whole request. Configured ack reactions still validate strictly and error on unknown values.
-- **edit**: Edit a sent message (`messageId`, `text`)
-- **unsend**: Unsend a message (`messageId`)
-- **reply**: Reply to a specific message (`messageId`, `text`, `to`)
-- **sendWithEffect**: Send with iMessage effect (`text`, `to`, `effectId`)
-- **renameGroup**: Rename a group chat (`chatGuid`, `displayName`)
-- **setGroupIcon**: Set a group chat's icon/photo (`chatGuid`, `media`) — flaky on macOS 26 Tahoe (API may return success but the icon does not sync).
-- **addParticipant**: Add someone to a group (`chatGuid`, `address`)
-- **removeParticipant**: Remove someone from a group (`chatGuid`, `address`)
-- **leaveGroup**: Leave a group chat (`chatGuid`)
-- **upload-file**: Send media/files (`to`, `buffer`, `filename`, `asVoice`)
-  - Voice memos: set `asVoice: true` with **MP3** or **CAF** audio to send as an iMessage voice message. BlueBubbles converts MP3 → CAF when sending voice memos.
-- Legacy alias: `sendAttachment` still works, but `upload-file` is the canonical action name.
+<AccordionGroup>
+  <Accordion title="Available actions">
+    - **react**: Add/remove tapback reactions (`messageId`, `emoji`, `remove`). iMessage's native tapback set is `love`, `like`, `dislike`, `laugh`, `emphasize`, and `question`. When an agent picks an emoji outside that set (for example `👀`), the reaction tool falls back to `love` so the tapback still renders instead of failing the whole request. Configured ack reactions still validate strictly and error on unknown values.
+    - **edit**: Edit a sent message (`messageId`, `text`).
+    - **unsend**: Unsend a message (`messageId`).
+    - **reply**: Reply to a specific message (`messageId`, `text`, `to`).
+    - **sendWithEffect**: Send with iMessage effect (`text`, `to`, `effectId`).
+    - **renameGroup**: Rename a group chat (`chatGuid`, `displayName`).
+    - **setGroupIcon**: Set a group chat's icon/photo (`chatGuid`, `media`) — flaky on macOS 26 Tahoe (API may return success but the icon does not sync).
+    - **addParticipant**: Add someone to a group (`chatGuid`, `address`).
+    - **removeParticipant**: Remove someone from a group (`chatGuid`, `address`).
+    - **leaveGroup**: Leave a group chat (`chatGuid`).
+    - **upload-file**: Send media/files (`to`, `buffer`, `filename`, `asVoice`).
+      - Voice memos: set `asVoice: true` with **MP3** or **CAF** audio to send as an iMessage voice message. BlueBubbles converts MP3 → CAF when sending voice memos.
+    - Legacy alias: `sendAttachment` still works, but `upload-file` is the canonical action name.
+  </Accordion>
+</AccordionGroup>

 ### Message IDs (short vs full)

@@ -406,54 +425,56 @@ The two webhooks arrive at OpenClaw ~0.8-2.0 s apart on most setups. Without coa

 `channels.bluebubbles.coalesceSameSenderDms` opts a DM into merging consecutive same-sender webhooks into a single agent turn. Group chats continue to key per-message so multi-user turn structure is preserved.

-### When to enable
+<Tabs>
+  <Tab title="When to enable">
+    Enable when:

-Enable when:
+    - You ship skills that expect `command + payload` in one message (dump, paste, save, queue, etc.).
+    - Your users paste URLs, images, or long content alongside commands.
+    - You can accept the added DM turn latency (see below).

-- You ship skills that expect `command + payload` in one message (dump, paste, save, queue, etc.).
-- Your users paste URLs, images, or long content alongside commands.
-- You can accept the added DM turn latency (see below).
+    Leave disabled when:

-Leave disabled when:
+    - You need minimum command latency for single-word DM triggers.
+    - All your flows are one-shot commands without payload follow-ups.

-- You need minimum command latency for single-word DM triggers.
-- All your flows are one-shot commands without payload follow-ups.
-
-### Enabling
-
-```json5
-{
-  channels: {
-    bluebubbles: {
-      coalesceSameSenderDms: true, // opt in (default: false)
-    },
-  },
-}
-```
-
-With the flag on and no explicit `messages.inbound.byChannel.bluebubbles`, the debounce window widens to **2500 ms** (the default for non-coalescing is 500 ms). The wider window is required — Apple's split-send cadence of 0.8-2.0 s does not fit in the tighter default.
-
-To tune the window yourself:
-
-```json5
-{
-  messages: {
-    inbound: {
-      byChannel: {
-        // 2500 ms works for most setups; raise to 4000 ms if your Mac is slow
-        // or under memory pressure (observed gap can stretch past 2 s then).
-        bluebubbles: 2500,
+  </Tab>
+  <Tab title="Enabling">
+    ```json5
+    {
+      channels: {
+        bluebubbles: {
+          coalesceSameSenderDms: true, // opt in (default: false)
+        },
      },
-    },
-  },
-}
-```
+    }
+    ```

-### Trade-offs
+    With the flag on and no explicit `messages.inbound.byChannel.bluebubbles`, the debounce window widens to **2500 ms** (the default for non-coalescing is 500 ms). The wider window is required — Apple's split-send cadence of 0.8-2.0 s does not fit in the tighter default.

-- **Added latency for DM control commands.** With the flag on, DM control-command messages (like `Dump`, `Save`, etc.) now wait up to the debounce window before dispatching, in case a payload webhook is coming. Group-chat commands keep instant dispatch.
-- **Merged output is bounded** — merged text caps at 4000 chars with an explicit `…[truncated]` marker; attachments cap at 20; source entries cap at 10 (first-plus-latest retained beyond that). Every source `messageId` still reaches inbound-dedupe so a later MessagePoller replay of any individual event is recognized as a duplicate.
-- **Opt-in, per-channel.** Other channels (Telegram, WhatsApp, Slack, …) are unaffected.
+    To tune the window yourself:
+
+    ```json5
+    {
+      messages: {
+        inbound: {
+          byChannel: {
+            // 2500 ms works for most setups; raise to 4000 ms if your Mac is slow
+            // or under memory pressure (observed gap can stretch past 2 s then).
+            bluebubbles: 2500,
+          },
+        },
+      },
+    }
+    ```
+
+  </Tab>
+  <Tab title="Trade-offs">
+    - **Added latency for DM control commands.** With the flag on, DM control-command messages (like `Dump`, `Save`, etc.) now wait up to the debounce window before dispatching, in case a payload webhook is coming. Group-chat commands keep instant dispatch.
+    - **Merged output is bounded** — merged text caps at 4000 chars with an explicit `…[truncated]` marker; attachments cap at 20; source entries cap at 10 (first-plus-latest retained beyond that). Every source `messageId` still reaches inbound-dedupe so a later MessagePoller replay of any individual event is recognized as a duplicate.
+    - **Opt-in, per-channel.** Other channels (Telegram, WhatsApp, Slack, …) are unaffected.
+  </Tab>
+</Tabs>

 ### Scenarios and what the agent sees

@@ -470,27 +491,35 @@ To tune the window yourself:

 If the flag is on and split-sends still arrive as two turns, check each layer:

-1. **Config actually loaded.**
+<AccordionGroup>
+  <Accordion title="Config actually loaded">
+    ```bash
+    grep coalesceSameSenderDms ~/.openclaw/openclaw.json
+    ```

-   ```
-   grep coalesceSameSenderDms ~/.openclaw/openclaw.json
-   ```
+    Then run `openclaw gateway restart`; the flag is read when the debouncer registry is created, so a running gateway does not pick up the change.

-   Then `openclaw gateway restart` — the flag is read at debouncer-registry creation.
+  </Accordion>
+  <Accordion title="Debounce window wide enough for your setup">
+    Look at the BlueBubbles server log under `~/Library/Logs/bluebubbles-server/main.log`:

-2. **Debounce window wide enough for your setup.** Look at the BlueBubbles server log under `~/Library/Logs/bluebubbles-server/main.log`:
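+    ```bash
+    # Show the 20 most recent webhook dispatches from the BlueBubbles server log
+    grep -E "Dispatching event to webhook" ~/Library/Logs/bluebubbles-server/main.log | tail -20
+    ```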

-   ```
-   grep -E "Dispatching event to webhook" main.log | tail -20
-   ```
+    Measure the gap between the `"Dump"`-style text dispatch and the `"https://..."; Attachments:` dispatch that follows. Raise `messages.inbound.byChannel.bluebubbles` to comfortably cover that gap.

-   Measure the gap between the `"Dump"`-style text dispatch and the `"https://..."; Attachments:` dispatch that follows. Raise `messages.inbound.byChannel.bluebubbles` to comfortably cover that gap.
-
-3. **Session JSONL timestamps ≠ webhook arrival.** Session event timestamps (`~/.openclaw/agents/<id>/sessions/*.jsonl`) reflect when the gateway hands a message to the agent, **not** when the webhook arrived. A queued-second message tagged `[Queued messages while agent was busy]` means the first turn was still running when the second webhook arrived — the coalesce bucket had already flushed. Tune the window against the BB server log, not the session log.
-
-4. **Memory pressure slowing reply dispatch.** On smaller machines (8 GB), agent turns can take long enough that the coalesce bucket flushes before the reply completes, and the URL lands as a queued second turn. Check `memory_pressure` and `ps -o rss -p $(pgrep openclaw-gateway)`; if the gateway is over ~500 MB RSS and the compressor is active, close other heavy processes or bump to a larger host.
-
-5. **Reply-quote sends are a different path.** If the user tapped `Dump` as a **reply** to an existing URL-balloon (iMessage shows a "1 Reply" badge on the Dump bubble), the URL lives in `replyToBody`, not in a second webhook. Coalescing does not apply — that's a skill/prompt concern, not a debouncer concern.
+  </Accordion>
+  <Accordion title="Session JSONL timestamps ≠ webhook arrival">
+    Session event timestamps (`~/.openclaw/agents/<id>/sessions/*.jsonl`) reflect when the gateway hands a message to the agent, **not** when the webhook arrived. A queued second message tagged `[Queued messages while agent was busy]` means the first turn was still running when the second webhook arrived; the coalesce bucket had already flushed by then. Tune the window against the BlueBubbles server log, not the session log.
+  </Accordion>
+  <Accordion title="Memory pressure slowing reply dispatch">
+    On smaller machines (8 GB), agent turns can take long enough that the coalesce bucket flushes before the reply completes, and the URL lands as a queued second turn. Check `memory_pressure` and `ps -o rss -p $(pgrep openclaw-gateway)`; if the gateway is over ~500 MB RSS and the memory compressor is active, close other heavy processes or move to a larger host.
+  </Accordion>
+  <Accordion title="Reply-quote sends are a different path">
+    If the user tapped `Dump` as a **reply** to an existing URL-balloon (iMessage shows a "1 Reply" badge on the Dump bubble), the URL lives in `replyToBody`, not in a second webhook. Coalescing does not apply — that's a skill/prompt concern, not a debouncer concern.
+  </Accordion>
+</AccordionGroup>
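+
+A minimal sketch of the two settings these checks revolve around, assuming the dotted paths above map to nested keys in `~/.openclaw/openclaw.json`. The 4000 ms value only illustrates a slow host and is not a recommendation:
+
+```json5
+{
+  channels: {
+    bluebubbles: {
+      // Opt in to merging consecutive same-sender DM webhooks.
+      coalesceSameSenderDms: true,
+    },
+  },
+  messages: {
+    inbound: {
+      byChannel: {
+        // Explicit window in ms; keep it above the gap measured in main.log.
+        bluebubbles: 4000,
+      },
+    },
+  },
+}
+```
+
+Restart the gateway after editing so the debouncer registry is recreated with the new values.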

 ## Block streaming

@@ -516,30 +545,40 @@ Control whether responses are sent as a single message or streamed in blocks:

 Full configuration: [Configuration](/gateway/configuration)

-Provider options:
-
- `channels.bluebubbles.enabled`: Enable/disable the channel.
- `channels.bluebubbles.serverUrl`: BlueBubbles REST API base URL.
- `channels.bluebubbles.password`: API password.
- `channels.bluebubbles.webhookPath`: Webhook endpoint path (default: `/bluebubbles-webhook`).
- `channels.bluebubbles.dmPolicy`: `pairing | allowlist | open | disabled` (default: `pairing`).
- `channels.bluebubbles.allowFrom`: DM allowlist (handles, emails, E.164 numbers, `chat_id:*`, `chat_guid:*`).
- `channels.bluebubbles.groupPolicy`: `open | allowlist | disabled` (default: `allowlist`).
- `channels.bluebubbles.groupAllowFrom`: Group sender allowlist.
- `channels.bluebubbles.enrichGroupParticipantsFromContacts`: On macOS, optionally enrich unnamed group participants from local Contacts after gating passes. Default: `false`.
- `channels.bluebubbles.groups`: Per-group config (`requireMention`, etc.).
- `channels.bluebubbles.sendReadReceipts`: Send read receipts (default: `true`).
- `channels.bluebubbles.blockStreaming`: Enable block streaming (default: `false`; required for streaming replies).
- `channels.bluebubbles.textChunkLimit`: Outbound chunk size in chars (default: 4000).
- `channels.bluebubbles.sendTimeoutMs`: Per-request timeout in ms for outbound text sends via `/api/v1/message/text` (default: 30000). Raise on macOS 26 setups where Private API iMessage sends can stall for 60+ seconds inside the iMessage framework; for example `45000` or `60000`. Probes, chat lookups, reactions, edits, and health checks currently keep the shorter 10s default; broadening coverage to reactions and edits is planned as a follow-up. Per-account override: `channels.bluebubbles.accounts.<accountId>.sendTimeoutMs`.
- `channels.bluebubbles.chunkMode`: `length` (default) splits only when exceeding `textChunkLimit`; `newline` splits on blank lines (paragraph boundaries) before length chunking.
- `channels.bluebubbles.mediaMaxMb`: Inbound/outbound media cap in MB (default: 8).
- `channels.bluebubbles.mediaLocalRoots`: Explicit allowlist of absolute local directories permitted for outbound local media paths. Local path sends are denied by default unless this is configured. Per-account override: `channels.bluebubbles.accounts.<accountId>.mediaLocalRoots`.
- `channels.bluebubbles.coalesceSameSenderDms`: Merge consecutive same-sender DM webhooks into one agent turn so Apple's text+URL split-send arrives as a single message (default: `false`). See [Coalescing split-send DMs](#coalescing-split-send-dms-command--url-in-one-composition) for scenarios, window tuning, and trade-offs. Widens the default inbound debounce window from 500 ms to 2500 ms when enabled without an explicit `messages.inbound.byChannel.bluebubbles`.
- `channels.bluebubbles.historyLimit`: Max group messages for context (0 disables).
- `channels.bluebubbles.dmHistoryLimit`: DM history limit.
- `channels.bluebubbles.actions`: Enable/disable specific actions.
- `channels.bluebubbles.accounts`: Multi-account configuration.
+<AccordionGroup>
+  <Accordion title="Connection and webhook">
+    - `channels.bluebubbles.enabled`: Enable/disable the channel.
+    - `channels.bluebubbles.serverUrl`: BlueBubbles REST API base URL.
+    - `channels.bluebubbles.password`: API password.
+    - `channels.bluebubbles.webhookPath`: Webhook endpoint path (default: `/bluebubbles-webhook`).
+  </Accordion>
+  <Accordion title="Access policy">
+    - `channels.bluebubbles.dmPolicy`: `pairing | allowlist | open | disabled` (default: `pairing`).
+    - `channels.bluebubbles.allowFrom`: DM allowlist (handles, emails, E.164 numbers, `chat_id:*`, `chat_guid:*`).
+    - `channels.bluebubbles.groupPolicy`: `open | allowlist | disabled` (default: `allowlist`).
+    - `channels.bluebubbles.groupAllowFrom`: Group sender allowlist.
+    - `channels.bluebubbles.enrichGroupParticipantsFromContacts`: On macOS, optionally enrich unnamed group participants from local Contacts after gating passes. Default: `false`.
+    - `channels.bluebubbles.groups`: Per-group config (`requireMention`, etc.).
+  </Accordion>
+  <Accordion title="Delivery and chunking">
+    - `channels.bluebubbles.sendReadReceipts`: Send read receipts (default: `true`).
+    - `channels.bluebubbles.blockStreaming`: Enable block streaming (default: `false`; required for streaming replies).
+    - `channels.bluebubbles.textChunkLimit`: Outbound chunk size in chars (default: 4000).
+    - `channels.bluebubbles.sendTimeoutMs`: Per-request timeout in ms for outbound text sends via `/api/v1/message/text` (default: 30000). Raise on macOS 26 setups where Private API iMessage sends can stall for 60+ seconds inside the iMessage framework; for example `45000` or `60000`. Probes, chat lookups, reactions, edits, and health checks currently keep the shorter 10s default; broadening coverage to reactions and edits is planned as a follow-up. Per-account override: `channels.bluebubbles.accounts.<accountId>.sendTimeoutMs`.
+    - `channels.bluebubbles.chunkMode`: `length` (default) splits only when exceeding `textChunkLimit`; `newline` splits on blank lines (paragraph boundaries) before length chunking.
+  </Accordion>
+  <Accordion title="Media and history">
+    - `channels.bluebubbles.mediaMaxMb`: Inbound/outbound media cap in MB (default: 8).
+    - `channels.bluebubbles.mediaLocalRoots`: Explicit allowlist of absolute local directories permitted for outbound local media paths. Local path sends are denied by default unless this is configured. Per-account override: `channels.bluebubbles.accounts.<accountId>.mediaLocalRoots`.
+    - `channels.bluebubbles.coalesceSameSenderDms`: Merge consecutive same-sender DM webhooks into one agent turn so Apple's text+URL split-send arrives as a single message (default: `false`). See [Coalescing split-send DMs](#coalescing-split-send-dms-command--url-in-one-composition) for scenarios, window tuning, and trade-offs. Widens the default inbound debounce window from 500 ms to 2500 ms when enabled without an explicit `messages.inbound.byChannel.bluebubbles`.
+    - `channels.bluebubbles.historyLimit`: Max group messages for context (0 disables).
+    - `channels.bluebubbles.dmHistoryLimit`: DM history limit.
+  </Accordion>
+  <Accordion title="Actions and accounts">
+    - `channels.bluebubbles.actions`: Enable/disable specific actions.
+    - `channels.bluebubbles.accounts`: Multi-account configuration.
+  </Accordion>
+</AccordionGroup>
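+
+A sketch combining several of the options above, assuming they nest under `channels.bluebubbles` the way the dotted paths suggest. The server URL, password, and the account id `work` are placeholders, not defaults:
+
+```json5
+{
+  channels: {
+    bluebubbles: {
+      enabled: true,
+      serverUrl: "http://bluebubbles.example:1234", // placeholder REST API base URL
+      password: "example-password", // placeholder API password
+      webhookPath: "/bluebubbles-webhook", // documented default
+      dmPolicy: "pairing", // documented default
+      groupPolicy: "allowlist", // documented default
+      sendTimeoutMs: 45000, // raised from the 30000 default for slow Private API sends
+      accounts: {
+        // Hypothetical account id showing a per-account override.
+        work: { sendTimeoutMs: 60000 },
+      },
+    },
+  },
+}
+```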

 Related global options:

@@ -582,8 +621,8 @@ For general channel workflow reference, see [Channels](/channels) and the [Plugi

 ## Related

- [Channels Overview](/channels) — all supported channels
- [Pairing](/channels/pairing) — DM authentication and pairing flow
- [Groups](/channels/groups) — group chat behavior and mention gating
 - [Channel Routing](/channels/channel-routing) — session routing for messages
+- [Channels Overview](/channels) — all supported channels
+- [Groups](/channels/groups) — group chat behavior and mention gating
+- [Pairing](/channels/pairing) — DM authentication and pairing flow
 - [Security](/gateway/security) — access model and hardening
--- a/docs/channels/pairing.md
+++ b/docs/channels/pairing.md
@@ -83,6 +83,8 @@ That bootstrap token carries the built-in pairing bootstrap profile:
 - bootstrap scope checks are role-prefixed, not one flat scope pool:
  operator scope entries only satisfy operator requests, and non-operator roles
  must still request scopes under their own role prefix
+- later token rotation/revocation remains bounded by both the device's approved
+  role contract and the caller session's operator scopes

 Treat the setup code like a password while it is valid.

--- a/docs/cli/devices.md
+++ b/docs/cli/devices.md
@@ -95,9 +95,9 @@ If you omit `--scope`, later reconnects with the stored rotated token reuse that
 token's cached approved scopes. If you pass explicit `--scope` values, those
 become the stored scope set for future cached-token reconnects.
 Non-admin paired-device callers can rotate only their **own** device token.
-Also, any explicit `--scope` values must stay within the caller session's own
-operator scopes; rotation cannot mint a broader operator token than the caller
-already has.
+The target token scope set must stay within the caller session's own operator
+scopes; rotation cannot mint or preserve a broader operator token than the
+caller already has.

 ```
 openclaw devices rotate --device <deviceId> --role operator --scope operator.read --scope operator.write
@@ -111,6 +111,8 @@ Revoke a device token for a specific role.

 Non-admin paired-device callers can revoke only their **own** device token.
 Revoking some other device's token requires `operator.admin`.
+The target token scope set must also fit within the caller session's own
+operator scopes; pairing-only callers cannot revoke admin/write operator tokens.

 ```
 openclaw devices revoke --device <deviceId> --role node
@@ -135,12 +137,15 @@ Pass `--token` or `--password` explicitly. Missing explicit credentials is an er
 - These commands require `operator.pairing` (or `operator.admin`) scope.
 - `gateway.nodes.pairing.autoApproveCidrs` is an opt-in Gateway policy for
  fresh node device pairing only; it does not change CLI approval authority.
- Token rotation stays inside the approved pairing role set and approved scope
-  baseline for that device. A stray cached token entry does not grant a new
-  rotate target.
+- Token rotation and revocation stay inside the approved pairing role set and
+  approved scope baseline for that device. A stray cached token entry does not
+  grant a token-management target.
 - For paired-device token sessions, cross-device management is admin-only:
  `remove`, `rotate`, and `revoke` are self-only unless the caller has
  `operator.admin`.
+- Token mutation is also caller-scope contained: a pairing-only session cannot
+  rotate or revoke a token that currently carries `operator.admin` or
+  `operator.write`.
 - `devices clear` is intentionally gated by `--yes`.
 - If pairing scope is unavailable on local loopback (and no explicit `--url` is passed), list/approve can use a local pairing fallback.
 - `devices approve` requires an explicit request ID before minting tokens; omitting `requestId` or passing `--latest` only previews the newest pending request.
--- a/docs/cli/tasks.md
+++ b/docs/cli/tasks.md
@@ -84,6 +84,10 @@ openclaw tasks maintenance [--apply] [--json]
 ```

 Previews or applies task and Task Flow reconciliation, cleanup stamping, and pruning.
+For cron tasks, reconciliation uses persisted run logs/job state before marking an
+old active task `lost`, so completed cron runs do not become false audit errors
+just because the in-memory Gateway runtime state is gone. Offline CLI audit is
+not authoritative for the Gateway's process-local cron active-job set.

 ### `flow`

--- a/docs/concepts/model-providers.md
+++ b/docs/concepts/model-providers.md
@@ -4,35 +4,42 @@ read_when:
  - You need a provider-by-provider model setup reference
  - You want example configs or CLI onboarding commands for model providers
 title: "Model providers"
+sidebarTitle: "Model providers"
 ---

 Reference for **LLM/model providers** (not chat channels like WhatsApp/Telegram). For model selection rules, see [Models](/concepts/models).

 ## Quick rules

- Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
- `agents.defaults.models` acts as an allowlist when set.
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- `models.providers.*.models[].contextWindow` is native model metadata; `contextTokens` is the effective runtime cap.
- Fallback rules, cooldown probes, and session-override persistence: [Model failover](/concepts/model-failover).
- OpenAI-family routes are prefix-specific: `openai/<model>` uses the direct
-  OpenAI API-key provider in PI, `openai-codex/<model>` uses Codex OAuth in PI,
-  and `openai/<model>` plus `agents.defaults.embeddedHarness.runtime: "codex"`
-  uses the native Codex app-server harness. See [OpenAI](/providers/openai)
-  and [Codex harness](/plugins/codex-harness). If the provider/runtime split is
-  confusing, read [Agent runtimes](/concepts/agent-runtimes) first.
- Plugin auto-enable follows that same boundary: `openai-codex/<model>` belongs
-  to the OpenAI plugin, while the Codex plugin is enabled by
-  `embeddedHarness.runtime: "codex"` or legacy `codex/<model>` refs.
- CLI runtimes use the same split: choose canonical model refs such as
-  `anthropic/claude-*`, `google/gemini-*`, or `openai/gpt-*`, then set
-  `agents.defaults.embeddedHarness.runtime` to `claude-cli`,
-  `google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
-  Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate
-  back to canonical provider refs with the runtime recorded separately.
- GPT-5.5 is available through `openai/gpt-5.5` for direct API-key traffic,
-  `openai-codex/gpt-5.5` in PI for Codex OAuth, and the native Codex
-  app-server harness when `embeddedHarness.runtime: "codex"` is set.
+<AccordionGroup>
+  <Accordion title="Model refs and CLI helpers">
+    - Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
+    - `agents.defaults.models` acts as an allowlist when set.
+    - CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
+    - `models.providers.*.models[].contextWindow` is native model metadata; `contextTokens` is the effective runtime cap.
+    - Fallback rules, cooldown probes, and session-override persistence: [Model failover](/concepts/model-failover).
+  </Accordion>
+  <Accordion title="OpenAI provider/runtime split">
+    OpenAI-family routes are prefix-specific:
+
+    - `openai/<model>` uses the direct OpenAI API-key provider in PI.
+    - `openai-codex/<model>` uses Codex OAuth in PI.
+    - `openai/<model>` plus `agents.defaults.embeddedHarness.runtime: "codex"` uses the native Codex app-server harness.
+
+    See [OpenAI](/providers/openai) and [Codex harness](/plugins/codex-harness). If the provider/runtime split is confusing, read [Agent runtimes](/concepts/agent-runtimes) first.
+
+    Plugin auto-enable follows the same boundary: `openai-codex/<model>` belongs to the OpenAI plugin, while the Codex plugin is enabled by `embeddedHarness.runtime: "codex"` or legacy `codex/<model>` refs.
+
+    GPT-5.5 is available through `openai/gpt-5.5` for direct API-key traffic, `openai-codex/gpt-5.5` in PI for Codex OAuth, and the native Codex app-server harness when `embeddedHarness.runtime: "codex"` is set.
+
+  </Accordion>
+  <Accordion title="CLI runtimes">
+    CLI runtimes use the same split: choose canonical model refs such as `anthropic/claude-*`, `google/gemini-*`, or `openai/gpt-*`, then set `agents.defaults.embeddedHarness.runtime` to `claude-cli`, `google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
+
+    Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate back to canonical provider refs with the runtime recorded separately.
+
+  </Accordion>
+</AccordionGroup>
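+
+As a sketch of the native Codex harness route, assuming `agents.defaults.embeddedHarness.runtime` nests as its dotted path suggests and that an `openai/<model>` ref is already selected (for example via `openclaw models set openai/gpt-5.5`):
+
+```json5
+{
+  agents: {
+    defaults: {
+      // Run canonical `openai/<model>` refs on the native Codex app-server harness
+      // instead of the direct OpenAI API-key provider in PI.
+      embeddedHarness: { runtime: "codex" },
+    },
+  },
+}
+```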

 ## Plugin-owned provider behavior

@@ -46,25 +53,28 @@ Provider runtime `capabilities` is shared runner metadata (provider family, tran

 ## API key rotation

- Supports generic provider rotation for selected providers.
- Configure multiple keys via:
-  - `OPENCLAW_LIVE_<PROVIDER>_KEY` (single live override, highest priority)
-  - `<PROVIDER>_API_KEYS` (comma or semicolon list)
-  - `<PROVIDER>_API_KEY` (primary key)
-  - `<PROVIDER>_API_KEY_*` (numbered list, e.g. `<PROVIDER>_API_KEY_1`)
- For Google providers, `GOOGLE_API_KEY` is also included as fallback.
- Key selection order preserves priority and deduplicates values.
- Requests are retried with the next key only on rate-limit responses (for
-  example `429`, `rate_limit`, `quota`, `resource exhausted`, `Too many
-concurrent requests`, `ThrottlingException`, `concurrency limit reached`,
-  `workers_ai ... quota limit exceeded`, or periodic usage-limit messages).
- Non-rate-limit failures fail immediately; no key rotation is attempted.
- When all candidate keys fail, the final error is returned from the last attempt.
+<AccordionGroup>
+  <Accordion title="Key sources and priority">
+    Configure multiple keys via:
+
+    - `OPENCLAW_LIVE_<PROVIDER>_KEY` (single live override, highest priority)
+    - `<PROVIDER>_API_KEYS` (comma or semicolon list)
+    - `<PROVIDER>_API_KEY` (primary key)
+    - `<PROVIDER>_API_KEY_*` (numbered list, e.g. `<PROVIDER>_API_KEY_1`)
+
+    For Google providers, `GOOGLE_API_KEY` is also included as fallback. Key selection order preserves priority and deduplicates values.
+
+  </Accordion>
+  <Accordion title="When rotation kicks in">
+    - Requests are retried with the next key only on rate-limit responses (for example `429`, `rate_limit`, `quota`, `resource exhausted`, `Too many concurrent requests`, `ThrottlingException`, `concurrency limit reached`, `workers_ai ... quota limit exceeded`, or periodic usage-limit messages).
+    - Non-rate-limit failures fail immediately; no key rotation is attempted.
+    - When all candidate keys fail, the final error is returned from the last attempt.
+  </Accordion>
+</AccordionGroup>

 ## Built-in providers (pi-ai catalog)

-OpenClaw ships with the pi‑ai catalog. These providers require **no**
-`models.providers` config; just set auth + pick a model.
+OpenClaw ships with the pi-ai catalog. These providers require **no** `models.providers` config; just set auth and pick a model.

 ### OpenAI

@@ -72,8 +82,7 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - Auth: `OPENAI_API_KEY`
 - Optional rotation: `OPENAI_API_KEYS`, `OPENAI_API_KEY_1`, `OPENAI_API_KEY_2`, plus `OPENCLAW_LIVE_OPENAI_KEY` (single override)
 - Example models: `openai/gpt-5.5`, `openai/gpt-5.4-mini`
- Verify account/model availability with `openclaw models list --provider openai`
-  if a specific install or API key behaves differently.
+- Verify account/model availability with `openclaw models list --provider openai` if a specific install or API key behaves differently.
 - CLI: `openclaw onboard --auth-choice openai-api-key`
 - Default transport is `auto` (WebSocket-first, SSE fallback)
 - Override per model via `agents.defaults.models["openai/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
@@ -81,11 +90,8 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - OpenAI priority processing can be enabled via `agents.defaults.models["openai/<model>"].params.serviceTier`
 - `/fast` and `params.fastMode` map direct `openai/*` Responses requests to `service_tier=priority` on `api.openai.com`
 - Use `params.serviceTier` when you want an explicit tier instead of the shared `/fast` toggle
- Hidden OpenClaw attribution headers (`originator`, `version`,
-  `User-Agent`) apply only on native OpenAI traffic to `api.openai.com`, not
-  generic OpenAI-compatible proxies
- Native OpenAI routes also keep Responses `store`, prompt-cache hints, and
-  OpenAI reasoning-compat payload shaping; proxy routes do not
+- Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) apply only on native OpenAI traffic to `api.openai.com`, not generic OpenAI-compatible proxies
+- Native OpenAI routes also keep Responses `store`, prompt-cache hints, and OpenAI reasoning-compat payload shaping; proxy routes do not
 - `openai/gpt-5.3-codex-spark` is intentionally suppressed in OpenClaw because live OpenAI API requests reject it and the current Codex catalog does not expose it

 ```json5
@@ -102,8 +108,10 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - Example model: `anthropic/claude-opus-4-6`
 - CLI: `openclaw onboard --auth-choice apiKey`
 - Direct public Anthropic requests support the shared `/fast` toggle and `params.fastMode`, including API-key and OAuth-authenticated traffic sent to `api.anthropic.com`; OpenClaw maps that to Anthropic `service_tier` (`auto` vs `standard_only`)
- Anthropic note: Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and `claude -p` usage as sanctioned for this integration unless Anthropic publishes a new policy.
- Anthropic setup-token remains available as a supported OpenClaw token path, but OpenClaw now prefers Claude CLI reuse and `claude -p` when available.
+
+<Note>
+Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and `claude -p` usage as sanctioned for this integration unless Anthropic publishes a new policy. Anthropic setup-token remains available as a supported OpenClaw token path, but OpenClaw now prefers Claude CLI reuse and `claude -p` when available.
+</Note>

 ```json5
 {
@@ -119,16 +127,12 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - Native Codex app-server harness ref: `openai/gpt-5.5` with `agents.defaults.embeddedHarness.runtime: "codex"`
 - Native Codex app-server harness docs: [Codex harness](/plugins/codex-harness)
 - Legacy model refs: `codex/gpt-*`
- Plugin boundary: `openai-codex/*` loads the OpenAI plugin; the native Codex
-  app-server plugin is selected only by the Codex harness runtime or legacy
-  `codex/*` refs.
+- Plugin boundary: `openai-codex/*` loads the OpenAI plugin; the native Codex app-server plugin is selected only by the Codex harness runtime or legacy `codex/*` refs.
 - CLI: `openclaw onboard --auth-choice openai-codex` or `openclaw models auth login --provider openai-codex`
 - Default transport is `auto` (WebSocket-first, SSE fallback)
 - Override per PI model via `agents.defaults.models["openai-codex/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
 - `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Hidden OpenClaw attribution headers (`originator`, `version`,
-  `User-Agent`) are only attached on native Codex traffic to
-  `chatgpt.com/backend-api`, not generic OpenAI-compatible proxies
+- Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) are only attached on native Codex traffic to `chatgpt.com/backend-api`, not generic OpenAI-compatible proxies
 - Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
 - `openai-codex/gpt-5.5` uses the Codex catalog native `contextWindow = 400000` and default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
 - Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
@@ -154,9 +158,17 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**

 ### Other subscription-style hosted options

- [Qwen Cloud](/providers/qwen): Qwen Cloud provider surface plus Alibaba DashScope and Coding Plan endpoint mapping
- [MiniMax](/providers/minimax): MiniMax Coding Plan OAuth or API key access
- [GLM models](/providers/glm): Z.AI Coding Plan or general API endpoints
+<CardGroup cols={3}>
+  <Card title="GLM models" href="/providers/glm">
+    Z.AI Coding Plan or general API endpoints.
+  </Card>
+  <Card title="MiniMax" href="/providers/minimax">
+    MiniMax Coding Plan OAuth or API key access.
+  </Card>
+  <Card title="Qwen Cloud" href="/providers/qwen">
+    Qwen Cloud provider surface plus Alibaba DashScope and Coding Plan endpoint mapping.
+  </Card>
+</CardGroup>

 ### OpenCode

@@ -180,29 +192,54 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - Example models: `google/gemini-3.1-pro-preview`, `google/gemini-3-flash-preview`
 - Compatibility: legacy OpenClaw config using `google/gemini-3.1-flash-preview` is normalized to `google/gemini-3-flash-preview`
 - CLI: `openclaw onboard --auth-choice gemini-api-key`
- Thinking: `/think adaptive` uses Google dynamic thinking. Gemini 3/3.1 omit a fixed
-  `thinkingLevel`; Gemini 2.5 sends `thinkingBudget: -1`.
- Direct Gemini runs also accept `agents.defaults.models["google/<model>"].params.cachedContent`
-  (or legacy `cached_content`) to forward a provider-native
-  `cachedContents/...` handle; Gemini cache hits surface as OpenClaw `cacheRead`
+- Thinking: `/think adaptive` uses Google dynamic thinking. Gemini 3/3.1 omit a fixed `thinkingLevel`; Gemini 2.5 sends `thinkingBudget: -1`.
+- Direct Gemini runs also accept `agents.defaults.models["google/<model>"].params.cachedContent` (or legacy `cached_content`) to forward a provider-native `cachedContents/...` handle; Gemini cache hits surface as OpenClaw `cacheRead`
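+
+A sketch of forwarding a cached-content handle, assuming the `agents.defaults.models` path nests as written above. The handle stays as the elided `cachedContents/...` placeholder and must be replaced with a real one from your own Google project:
+
+```json5
+{
+  agents: {
+    defaults: {
+      models: {
+        "google/gemini-3.1-pro-preview": {
+          params: {
+            cachedContent: "cachedContents/...", // placeholder handle, not a real value
+          },
+        },
+      },
+    },
+  },
+}
+```
+
+Because `agents.defaults.models` acts as an allowlist when set, list every other model you still want available alongside this entry.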

 ### Google Vertex and Gemini CLI

 - Providers: `google-vertex`, `google-gemini-cli`
 - Auth: Vertex uses gcloud ADC; Gemini CLI uses its OAuth flow
- Caution: Gemini CLI OAuth in OpenClaw is an unofficial integration. Some users have reported Google account restrictions after using third-party clients. Review Google terms and use a non-critical account if you choose to proceed.
- Gemini CLI OAuth is shipped as part of the bundled `google` plugin.
-  - Install Gemini CLI first:
-    - `brew install gemini-cli`
-    - or `npm install -g @google/gemini-cli`
-  - Enable: `openclaw plugins enable google`
-  - Login: `openclaw models auth login --provider google-gemini-cli --set-default`
-  - Default model: `google-gemini-cli/gemini-3-flash-preview`
-  - Note: you do **not** paste a client id or secret into `openclaw.json`. The CLI login flow stores
-    tokens in auth profiles on the gateway host.
-  - If requests fail after login, set `GOOGLE_CLOUD_PROJECT` or `GOOGLE_CLOUD_PROJECT_ID` on the gateway host.
-  - Gemini CLI JSON replies are parsed from `response`; usage falls back to
-    `stats`, with `stats.cached` normalized into OpenClaw `cacheRead`.
+
+<Warning>
+Gemini CLI OAuth in OpenClaw is an unofficial integration. Some users have reported Google account restrictions after using third-party clients. Review Google terms and use a non-critical account if you choose to proceed.
+</Warning>
+
+Gemini CLI OAuth is shipped as part of the bundled `google` plugin.
+
+<Steps>
+  <Step title="Install Gemini CLI">
+    <Tabs>
+      <Tab title="brew">
+        ```bash
+        brew install gemini-cli
+        ```
+      </Tab>
+      <Tab title="npm">
+        ```bash
+        npm install -g @google/gemini-cli
+        ```
+      </Tab>
+    </Tabs>
+  </Step>
+  <Step title="Enable plugin">
+    ```bash
+    openclaw plugins enable google
+    ```
+  </Step>
+  <Step title="Login">
+    ```bash
+    openclaw models auth login --provider google-gemini-cli --set-default
+    ```
+
+    Default model: `google-gemini-cli/gemini-3-flash-preview`. You do **not** paste a client id or secret into `openclaw.json`. The CLI login flow stores tokens in auth profiles on the gateway host.
+
+  </Step>
+  <Step title="Set project (if needed)">
+    If requests fail after login, set `GOOGLE_CLOUD_PROJECT` or `GOOGLE_CLOUD_PROJECT_ID` on the gateway host.
+  </Step>
+</Steps>
+
+Gemini CLI JSON replies are parsed from `response`; usage falls back to `stats`, with `stats.cached` normalized into OpenClaw `cacheRead`.

 ### Z.AI (GLM)

@@ -217,8 +254,7 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**

 - Provider: `vercel-ai-gateway`
 - Auth: `AI_GATEWAY_API_KEY`
- Example models: `vercel-ai-gateway/anthropic/claude-opus-4.6`,
-  `vercel-ai-gateway/moonshotai/kimi-k2.6`
+- Example models: `vercel-ai-gateway/anthropic/claude-opus-4.6`, `vercel-ai-gateway/moonshotai/kimi-k2.6`
 - CLI: `openclaw onboard --auth-choice ai-gateway-api-key`

 ### Kilo Gateway
@@ -228,11 +264,8 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
 - Example model: `kilocode/kilo/auto`
 - CLI: `openclaw onboard --auth-choice kilocode-api-key`
 - Base URL: `https://api.kilo.ai/api/gateway/`
- Static fallback catalog ships `kilocode/kilo/auto`; live
-  `https://api.kilo.ai/api/gateway/models` discovery can expand the runtime
-  catalog further.
- Exact upstream routing behind `kilocode/kilo/auto` is owned by Kilo Gateway,
-  not hard-coded in OpenClaw.
+- Static fallback catalog ships `kilocode/kilo/auto`; live `https://api.kilo.ai/api/gateway/models` discovery can expand the runtime catalog further.
+- Exact upstream routing behind `kilocode/kilo/auto` is owned by Kilo Gateway, not hard-coded in OpenClaw.

 See [/providers/kilocode](/providers/kilocode) for setup details.

@@ -264,28 +297,35 @@ See [/providers/kilocode](/providers/kilocode) for setup details.
 | xAI                     | `xai`                            | `XAI_API_KEY`                                                | `xai/grok-4`                                    |
 | Xiaomi                  | `xiaomi`                         | `XIAOMI_API_KEY`                                             | `xiaomi/mimo-v2-flash`                          |

-Quirks worth knowing:
+#### Quirks worth knowing

- **OpenRouter** applies its app-attribution headers and Anthropic `cache_control` markers only on verified `openrouter.ai` routes. DeepSeek, Moonshot, and ZAI refs are cache-TTL eligible for OpenRouter-managed prompt caching but do not receive Anthropic cache markers. As a proxy-style OpenAI-compatible path, it skips native-OpenAI-only shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat). Gemini-backed refs keep proxy-Gemini thought-signature sanitation only.
- **Kilo Gateway** Gemini-backed refs follow the same proxy-Gemini sanitation path; `kilocode/kilo/auto` and other proxy-reasoning-unsupported refs skip proxy reasoning injection.
- **MiniMax** API-key onboarding writes explicit text-only M2.7 chat model definitions; image understanding stays on the plugin-owned `MiniMax-VL-01` media provider.
- **xAI** uses the xAI Responses path. `/fast` or `params.fastMode: true` rewrites `grok-3`, `grok-3-mini`, `grok-4`, and `grok-4-0709` to their `*-fast` variants. `tool_stream` defaults on; disable via `agents.defaults.models["xai/<model>"].params.tool_stream=false`.
- **Cerebras** GLM models use `zai-glm-4.7` / `zai-glm-4.6`; OpenAI-compatible base URL is `https://api.cerebras.ai/v1`.
+<AccordionGroup>
+  <Accordion title="OpenRouter">
+    Applies its app-attribution headers and Anthropic `cache_control` markers only on verified `openrouter.ai` routes. DeepSeek, Moonshot, and ZAI refs are cache-TTL eligible for OpenRouter-managed prompt caching but do not receive Anthropic cache markers. As a proxy-style OpenAI-compatible path, it skips native-OpenAI-only shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat). Gemini-backed refs keep proxy-Gemini thought-signature sanitation only.
+  </Accordion>
+  <Accordion title="Kilo Gateway">
+    Gemini-backed refs follow the same proxy-Gemini sanitation path; `kilocode/kilo/auto` and other proxy-reasoning-unsupported refs skip proxy reasoning injection.
+  </Accordion>
+  <Accordion title="MiniMax">
+    API-key onboarding writes explicit text-only M2.7 chat model definitions; image understanding stays on the plugin-owned `MiniMax-VL-01` media provider.
+  </Accordion>
+  <Accordion title="xAI">
+    Uses the xAI Responses path. `/fast` or `params.fastMode: true` rewrites `grok-3`, `grok-3-mini`, `grok-4`, and `grok-4-0709` to their `*-fast` variants. `tool_stream` defaults on; disable via `agents.defaults.models["xai/<model>"].params.tool_stream=false`.
+  </Accordion>
+  <Accordion title="Cerebras">
+    GLM models use `zai-glm-4.7` / `zai-glm-4.6`; OpenAI-compatible base URL is `https://api.cerebras.ai/v1`.
+  </Accordion>
+</AccordionGroup>

 ## Providers via `models.providers` (custom/base URL)

-Use `models.providers` (or `models.json`) to add **custom** providers or
-OpenAI/Anthropic‑compatible proxies.
+Use `models.providers` (or `models.json`) to add **custom** providers or OpenAI/Anthropic‑compatible proxies.

-Many of the bundled provider plugins below already publish a default catalog.
-Use explicit `models.providers.<id>` entries only when you want to override the
-default base URL, headers, or model list.
+Many of the bundled provider plugins below already publish a default catalog. Use explicit `models.providers.<id>` entries only when you want to override the default base URL, headers, or model list.

 ### Moonshot AI (Kimi)

-Moonshot ships as a bundled provider plugin. Use the built-in provider by
-default, and add an explicit `models.providers.moonshot` entry only when you
-need to override the base URL or model metadata:
+Moonshot ships as a bundled provider plugin. Use the built-in provider by default, and add an explicit `models.providers.moonshot` entry only when you need to override the base URL or model metadata:

 - Provider: `moonshot`
 - Auth: `MOONSHOT_API_KEY`
@@ -359,29 +399,26 @@ Volcano Engine (火山引擎) provides access to Doubao and other models in Chin
 }
 ```

-Onboarding defaults to the coding surface, but the general `volcengine/*`
-catalog is registered at the same time.
+Onboarding defaults to the coding surface, but the general `volcengine/*` catalog is registered at the same time.

-In onboarding/configure model pickers, the Volcengine auth choice prefers both
-`volcengine/*` and `volcengine-plan/*` rows. If those models are not loaded yet,
-OpenClaw falls back to the unfiltered catalog instead of showing an empty
-provider-scoped picker.
+In onboarding/configure model pickers, the Volcengine auth choice prefers both `volcengine/*` and `volcengine-plan/*` rows. If those models are not loaded yet, OpenClaw falls back to the unfiltered catalog instead of showing an empty provider-scoped picker.

-Available models:
-
- `volcengine/doubao-seed-1-8-251228` (Doubao Seed 1.8)
- `volcengine/doubao-seed-code-preview-251028`
- `volcengine/kimi-k2-5-260127` (Kimi K2.5)
- `volcengine/glm-4-7-251222` (GLM 4.7)
- `volcengine/deepseek-v3-2-251201` (DeepSeek V3.2 128K)
-
-Coding models (`volcengine-plan`):
-
- `volcengine-plan/ark-code-latest`
- `volcengine-plan/doubao-seed-code`
- `volcengine-plan/kimi-k2.5`
- `volcengine-plan/kimi-k2-thinking`
- `volcengine-plan/glm-4.7`
+<Tabs>
+  <Tab title="Standard models">
+    - `volcengine/doubao-seed-1-8-251228` (Doubao Seed 1.8)
+    - `volcengine/doubao-seed-code-preview-251028`
+    - `volcengine/kimi-k2-5-260127` (Kimi K2.5)
+    - `volcengine/glm-4-7-251222` (GLM 4.7)
+    - `volcengine/deepseek-v3-2-251201` (DeepSeek V3.2 128K)
+  </Tab>
+  <Tab title="Coding models (volcengine-plan)">
+    - `volcengine-plan/ark-code-latest`
+    - `volcengine-plan/doubao-seed-code`
+    - `volcengine-plan/kimi-k2.5`
+    - `volcengine-plan/kimi-k2-thinking`
+    - `volcengine-plan/glm-4.7`
+  </Tab>
+</Tabs>

 ### BytePlus (International)

@@ -400,27 +437,24 @@ BytePlus ARK provides access to the same models as Volcano Engine for internatio
 }
 ```

-Onboarding defaults to the coding surface, but the general `byteplus/*`
-catalog is registered at the same time.
+Onboarding defaults to the coding surface, but the general `byteplus/*` catalog is registered at the same time.

-In onboarding/configure model pickers, the BytePlus auth choice prefers both
-`byteplus/*` and `byteplus-plan/*` rows. If those models are not loaded yet,
-OpenClaw falls back to the unfiltered catalog instead of showing an empty
-provider-scoped picker.
+In onboarding/configure model pickers, the BytePlus auth choice prefers both `byteplus/*` and `byteplus-plan/*` rows. If those models are not loaded yet, OpenClaw falls back to the unfiltered catalog instead of showing an empty provider-scoped picker.

-Available models:
-
- `byteplus/seed-1-8-251228` (Seed 1.8)
- `byteplus/kimi-k2-5-260127` (Kimi K2.5)
- `byteplus/glm-4-7-251222` (GLM 4.7)
-
-Coding models (`byteplus-plan`):
-
- `byteplus-plan/ark-code-latest`
- `byteplus-plan/doubao-seed-code`
- `byteplus-plan/kimi-k2.5`
- `byteplus-plan/kimi-k2-thinking`
- `byteplus-plan/glm-4.7`
+<Tabs>
+  <Tab title="Standard models">
+    - `byteplus/seed-1-8-251228` (Seed 1.8)
+    - `byteplus/kimi-k2-5-260127` (Kimi K2.5)
+    - `byteplus/glm-4-7-251222` (GLM 4.7)
+  </Tab>
+  <Tab title="Coding models (byteplus-plan)">
+    - `byteplus-plan/ark-code-latest`
+    - `byteplus-plan/doubao-seed-code`
+    - `byteplus-plan/kimi-k2.5`
+    - `byteplus-plan/kimi-k2-thinking`
+    - `byteplus-plan/glm-4.7`
+  </Tab>
+</Tabs>

 ### Synthetic

@@ -458,14 +492,13 @@ MiniMax is configured via `models.providers` because it uses custom endpoints:
 - MiniMax OAuth (CN): `--auth-choice minimax-cn-oauth`
 - MiniMax API key (Global): `--auth-choice minimax-global-api`
 - MiniMax API key (CN): `--auth-choice minimax-cn-api`
- Auth: `MINIMAX_API_KEY` for `minimax`; `MINIMAX_OAUTH_TOKEN` or
-  `MINIMAX_API_KEY` for `minimax-portal`
+- Auth: `MINIMAX_API_KEY` for `minimax`; `MINIMAX_OAUTH_TOKEN` or `MINIMAX_API_KEY` for `minimax-portal`

 See [/providers/minimax](/providers/minimax) for setup details, model options, and config snippets.

-On MiniMax's Anthropic-compatible streaming path, OpenClaw disables thinking by
-default unless you explicitly set it, and `/fast on` rewrites
-`MiniMax-M2.7` to `MiniMax-M2.7-highspeed`.
+<Note>
+On MiniMax's Anthropic-compatible streaming path, OpenClaw disables thinking by default unless you explicitly set it, and `/fast on` rewrites `MiniMax-M2.7` to `MiniMax-M2.7-highspeed`.
+</Note>

 Plugin-owned capability split:

@@ -492,9 +525,7 @@ Then set a model (replace with one of the IDs returned by `http://localhost:1234
 }
 ```

-OpenClaw uses LM Studio's native `/api/v1/models` and `/api/v1/models/load`
-for discovery + auto-load, with `/v1/chat/completions` for inference by default.
-See [/providers/lmstudio](/providers/lmstudio) for setup and troubleshooting.
+OpenClaw uses LM Studio's native `/api/v1/models` and `/api/v1/models/load` for discovery + auto-load, with `/v1/chat/completions` for inference by default. See [/providers/lmstudio](/providers/lmstudio) for setup and troubleshooting.

 ### Ollama

@@ -518,21 +549,17 @@ ollama pull llama3.3
 }
 ```

-Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with
-`OLLAMA_API_KEY`, and the bundled provider plugin adds Ollama directly to
-`openclaw onboard` and the model picker. See [/providers/ollama](/providers/ollama)
-for onboarding, cloud/local mode, and custom configuration.
+Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with `OLLAMA_API_KEY`, and the bundled provider plugin adds Ollama directly to `openclaw onboard` and the model picker. See [/providers/ollama](/providers/ollama) for onboarding, cloud/local mode, and custom configuration.

 ### vLLM

-vLLM ships as a bundled provider plugin for local/self-hosted OpenAI-compatible
-servers:
+vLLM ships as a bundled provider plugin for local/self-hosted OpenAI-compatible servers:

 - Provider: `vllm`
 - Auth: Optional (depends on your server)
 - Default base URL: `http://127.0.0.1:8000/v1`

-To opt in to auto-discovery locally (any value works if your server doesn’t enforce auth):
+To opt in to auto-discovery locally (any value works if your server doesn't enforce auth):

 ```bash
 export VLLM_API_KEY="vllm-local"
@@ -552,15 +579,13 @@ See [/providers/vllm](/providers/vllm) for details.

 ### SGLang

-SGLang ships as a bundled provider plugin for fast self-hosted
-OpenAI-compatible servers:
+SGLang ships as a bundled provider plugin for fast self-hosted OpenAI-compatible servers:

 - Provider: `sglang`
 - Auth: Optional (depends on your server)
 - Default base URL: `http://127.0.0.1:30000/v1`

-To opt in to auto-discovery locally (any value works if your server does not
-enforce auth):
+To opt in to auto-discovery locally (any value works if your server does not enforce auth):

 ```bash
 export SGLANG_API_KEY="sglang-local"
@@ -613,31 +638,28 @@ Example (OpenAI‑compatible):
 }
 ```

-Notes:
+<AccordionGroup>
+  <Accordion title="Default optional fields">
+    For custom providers, `reasoning`, `input`, `cost`, `contextWindow`, and `maxTokens` are optional. When omitted, OpenClaw defaults to:

- For custom providers, `reasoning`, `input`, `cost`, `contextWindow`, and `maxTokens` are optional.
-  When omitted, OpenClaw defaults to:
-  - `reasoning: false`
-  - `input: ["text"]`
-  - `cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }`
-  - `contextWindow: 200000`
-  - `maxTokens: 8192`
- Recommended: set explicit values that match your proxy/model limits.
- For `api: "openai-completions"` on non-native endpoints (any non-empty `baseUrl` whose host is not `api.openai.com`), OpenClaw forces `compat.supportsDeveloperRole: false` to avoid provider 400 errors for unsupported `developer` roles.
- Proxy-style OpenAI-compatible routes also skip native OpenAI-only request
-  shaping: no `service_tier`, no Responses `store`, no Completions `store`, no
-  prompt-cache hints, no OpenAI reasoning-compat payload shaping, and no hidden
-  OpenClaw attribution headers.
- For OpenAI-compatible Completions proxies that need vendor-specific fields,
-  set `agents.defaults.models["provider/model"].params.extra_body` (or
-  `extraBody`) to merge extra JSON into the outbound request body.
- For vLLM chat-template controls, set
-  `agents.defaults.models["provider/model"].params.chat_template_kwargs`.
-  OpenClaw automatically sends `enable_thinking: false` and
-  `force_nonempty_content: true` for `vllm/nemotron-3-*` when the session
-  thinking level is off.
- If `baseUrl` is empty/omitted, OpenClaw keeps the default OpenAI behavior (which resolves to `api.openai.com`).
- For safety, an explicit `compat.supportsDeveloperRole: true` is still overridden on non-native `openai-completions` endpoints.
+    - `reasoning: false`
+    - `input: ["text"]`
+    - `cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }`
+    - `contextWindow: 200000`
+    - `maxTokens: 8192`
+
+    Recommended: set explicit values that match your proxy/model limits.
+
+  </Accordion>
+  <Accordion title="Proxy-route shaping rules">
+    - For `api: "openai-completions"` on non-native endpoints (any non-empty `baseUrl` whose host is not `api.openai.com`), OpenClaw forces `compat.supportsDeveloperRole: false` to avoid provider 400 errors for unsupported `developer` roles.
+    - Proxy-style OpenAI-compatible routes also skip native OpenAI-only request shaping: no `service_tier`, no Responses `store`, no Completions `store`, no prompt-cache hints, no OpenAI reasoning-compat payload shaping, and no hidden OpenClaw attribution headers.
+    - For OpenAI-compatible Completions proxies that need vendor-specific fields, set `agents.defaults.models["provider/model"].params.extra_body` (or `extraBody`) to merge extra JSON into the outbound request body.
+    - For vLLM chat-template controls, set `agents.defaults.models["provider/model"].params.chat_template_kwargs`. OpenClaw automatically sends `enable_thinking: false` and `force_nonempty_content: true` for `vllm/nemotron-3-*` when the session thinking level is off.
+    - If `baseUrl` is empty/omitted, OpenClaw keeps the default OpenAI behavior (which resolves to `api.openai.com`).
+    - For safety, an explicit `compat.supportsDeveloperRole: true` is still overridden on non-native `openai-completions` endpoints.
+  </Accordion>
+</AccordionGroup>
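
The native versus non-native distinction in the rules above can be sketched as a small shell function. This is an illustration of the documented rule only, not OpenClaw's actual implementation. The function name `classify_base_url` is hypothetical, and the host parsing is deliberately simple (it does not handle userinfo or IPv6 literals):

```bash
# Illustrative only: classify a baseUrl according to the documented rule.
# Empty or omitted baseUrl keeps default OpenAI behavior ("native").
# Any non-empty baseUrl whose host is not api.openai.com is "non-native".
classify_base_url() {
  cbu_url="${1:-}"
  if [ -z "$cbu_url" ]; then
    echo "native"
    return 0
  fi
  cbu_host="${cbu_url#*://}"   # strip the scheme, if present
  cbu_host="${cbu_host%%/*}"   # strip the path
  cbu_host="${cbu_host%%:*}"   # strip the port
  if [ "$cbu_host" = "api.openai.com" ]; then
    echo "native"
  else
    echo "non-native"
  fi
}

classify_base_url ""
classify_base_url "https://api.openai.com/v1"
classify_base_url "http://127.0.0.1:8000/v1"
```

For a `non-native` result on `api: "openai-completions"`, the rules above say `compat.supportsDeveloperRole` is forced to `false`, even if it was explicitly set to `true`.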

 ## CLI examples

@@ -651,7 +673,7 @@ See also: [Configuration](/gateway/configuration) for full configuration example

 ## Related

- [Models](/concepts/models) — model configuration and aliases
- [Model failover](/concepts/model-failover) — fallback chains and retry behavior
 - [Configuration reference](/gateway/config-agents#agent-defaults) — model config keys
+- [Model failover](/concepts/model-failover) — fallback chains and retry behavior
+- [Models](/concepts/models) — model configuration and aliases
 - [Providers](/providers) — per-provider setup guides
--- a/docs/concepts/qa-e2e-automation.md
+++ b/docs/concepts/qa-e2e-automation.md
@@ -59,9 +59,9 @@ pnpm qa:otel:smoke
 That script starts a local OTLP/HTTP trace receiver, runs the
 `otel-trace-smoke` QA scenario with the `diagnostics-otel` plugin enabled, then
 decodes the exported protobuf spans and asserts the release-critical shape:
-`openclaw.run`, `openclaw.model.call`, `openclaw.context.assembled`, and
-`openclaw.message.delivery` must be present; model calls must not export
-`StreamAbandoned` on successful turns; raw diagnostic IDs and
+`openclaw.run`, `openclaw.harness.run`, `openclaw.model.call`,
+`openclaw.context.assembled`, and `openclaw.message.delivery` must be present;
+model calls must not export `StreamAbandoned` on successful turns; raw diagnostic IDs and
 `openclaw.content.*` attributes must stay out of the trace. It writes
 `otel-smoke-summary.json` next to the QA suite artifacts.

--- a/docs/gateway/doctor.md
+++ b/docs/gateway/doctor.md
--- a/docs/gateway/protocol.md
+++ b/docs/gateway/protocol.md
@@ -360,8 +360,8 @@ enumeration of `src/gateway/server-methods/*.ts`.
  <Accordion title="Device pairing and device tokens">
    - `device.pair.list` returns pending and approved paired devices.
    - `device.pair.approve`, `device.pair.reject`, and `device.pair.remove` manage device-pairing records.
-    - `device.token.rotate` rotates a paired device token within its approved role and scope bounds.
-    - `device.token.revoke` revokes a paired device token.
+    - `device.token.rotate` rotates a paired device token within its approved role and caller scope bounds.
+    - `device.token.revoke` revokes a paired device token within its approved role and caller scope bounds.
  </Accordion>

  <Accordion title="Node pairing, invoke, and pending work">
@@ -549,15 +549,15 @@ rather than the pre-handshake defaults.
  reused when the client is reusing the stored per-device token.
 - Device tokens can be rotated/revoked via `device.token.rotate` and
  `device.token.revoke` (requires `operator.pairing` scope).
- Token issuance/rotation stays bounded to the approved role set recorded in
-  that device's pairing entry; rotating a token cannot expand the device into a
-  role that pairing approval never granted.
+- Token issuance, rotation, and revocation stay bounded to the approved role set
+  recorded in that device's pairing entry; token mutation cannot expand or
+  target a device role that pairing approval never granted.
 - For paired-device token sessions, device management is self-scoped unless the
  caller also has `operator.admin`: non-admin callers can remove/revoke/rotate
  only their **own** device entry.
- `device.token.rotate` also checks the requested operator scope set against the
-  caller's current session scopes. Non-admin callers cannot rotate a token into
-  a broader operator scope set than they already hold.
+- `device.token.rotate` and `device.token.revoke` also check the target operator
+  token scope set against the caller's current session scopes. Non-admin callers
+  cannot rotate or revoke a broader operator token than they already hold.
 - Auth failures include `error.details.code` plus recovery hints:
  - `error.details.canRetryWithDeviceToken` (boolean)
  - `error.details.recommendedNextStep` (`retry_with_device_token`, `update_auth_configuration`, `update_auth_credentials`, `wait_then_retry`, `review_auth_configuration`)
--- a/docs/gateway/troubleshooting.md
+++ b/docs/gateway/troubleshooting.md
@@ -4,12 +4,10 @@ read_when:
  - The troubleshooting hub pointed you here for deeper diagnosis
  - You need stable symptom based runbook sections with exact commands
 title: "Troubleshooting"
+sidebarTitle: "Troubleshooting"
 ---

-# Gateway troubleshooting
-
-This page is the deep runbook.
-Start at [/help/troubleshooting](/help/troubleshooting) if you want the fast triage flow first.
+This page is the deep runbook. Start at [/help/troubleshooting](/help/troubleshooting) if you want the fast triage flow first.

 ## Command ladder

@@ -27,20 +25,13 @@ Expected healthy signals:

 - `openclaw gateway status` shows `Runtime: running`, `Connectivity probe: ok`, and a `Capability: ...` line.
 - `openclaw doctor` reports no blocking config/service issues.
- `openclaw channels status --probe` shows live per-account transport status and,
-  where supported, probe/audit results such as `works` or `audit ok`.
+- `openclaw channels status --probe` shows live per-account transport status and, where supported, probe/audit results such as `works` or `audit ok`.

 ## Split brain installs and newer config guard

-Use this when a gateway service unexpectedly stops after an update, or logs show
-that one `openclaw` binary is older than the version that last wrote
-`openclaw.json`.
+Use this when a gateway service unexpectedly stops after an update, or logs show that one `openclaw` binary is older than the version that last wrote `openclaw.json`.

-OpenClaw stamps config writes with `meta.lastTouchedVersion`. Read-only commands
-can still inspect a config written by a newer OpenClaw, but process and service
-mutations refuse to continue from an older binary. Blocked actions include
-gateway service start, stop, restart, uninstall, forced service reinstall,
-service-mode gateway startup, and `gateway --force` port cleanup.
+OpenClaw stamps config writes with `meta.lastTouchedVersion`. Read-only commands can still inspect a config written by a newer OpenClaw, but process and service mutations refuse to continue from an older binary. Blocked actions include gateway service start, stop, restart, uninstall, forced service reinstall, service-mode gateway startup, and `gateway --force` port cleanup.

 ```bash
 which openclaw
@@ -49,27 +40,31 @@ openclaw gateway status --deep
 openclaw config get meta.lastTouchedVersion
 ```

-Fix options:
+<Steps>
+  <Step title="Fix PATH">
+    Fix `PATH` so `openclaw` resolves to the newer install, then rerun the action.
+  </Step>
+  <Step title="Reinstall the gateway service">
+    Reinstall the intended gateway service from the newer install:

-1. Fix `PATH` so `openclaw` resolves to the newer install, then rerun the action.
-2. Reinstall the intended gateway service from the newer install:
+    ```bash
+    openclaw gateway install --force
+    openclaw gateway restart
+    ```

-   ```bash
-   openclaw gateway install --force
-   openclaw gateway restart
-   ```
+  </Step>
+  <Step title="Remove stale wrappers">
+    Remove stale system packages or old wrapper entries that still point at an old `openclaw` binary.
+  </Step>
+</Steps>
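
To spot a split-brain install before changing `PATH`, it can help to list every `openclaw` executable on `PATH` in resolution order. The following is a sketch using only POSIX shell features. The function name `list_openclaw_binaries` is hypothetical and is not an OpenClaw command:

```bash
# Illustrative helper: print every openclaw executable found on PATH,
# in the order the shell would resolve them. The first line is the one
# that `which openclaw` reports.
list_openclaw_binaries() (
  # The function body is a subshell, so these changes do not leak to the caller.
  set -f   # disable globbing while splitting PATH
  IFS=:
  for dir in $PATH; do
    candidate="${dir:-.}/openclaw"   # an empty PATH entry means the current directory
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      echo "$candidate"
    fi
  done
)

list_openclaw_binaries
```

More than one line of output suggests that an older binary may still be reachable. The first entry is the one the shell runs.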

-3. Remove stale system package or old wrapper entries that still point at an old
-   `openclaw` binary.
-
-For intentional downgrade or emergency recovery only, set
-`OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1` for the single command.
-Leave it unset for normal operation.
+<Warning>
+For intentional downgrade or emergency recovery only, set `OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1` for the single command. Leave it unset for normal operation.
+</Warning>
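
"For the single command" means a per-command environment assignment rather than an `export`. The sketch below uses `sh -c` as a stand-in for the real `openclaw` invocation, so it runs anywhere and only demonstrates the scoping:

```bash
# The assignment prefix applies to this one command only.
# In practice, replace the `sh -c ...` stand-in with the openclaw command you need.
OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS=1 \
  sh -c 'echo "inside: ${OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS:-unset}"'

# The variable is not left behind in the calling shell.
echo "after: ${OPENCLAW_ALLOW_OLDER_BINARY_DESTRUCTIVE_ACTIONS:-unset}"
```

Avoid `export` here: an exported override would silently apply to every later command in the session.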

 ## Anthropic 429 extra usage required for long context

-Use this when logs/errors include:
-`HTTP 429: rate_limit_error: Extra usage is required for long context requests`.
+Use this when logs/errors include: `HTTP 429: rate_limit_error: Extra usage is required for long context requests`.

 ```bash
 openclaw logs --follow
@@ -85,9 +80,17 @@ Look for:

 Fix options:

-1. Disable `context1m` for that model to fall back to the normal context window.
-2. Use an Anthropic credential that is eligible for long-context requests, or switch to an Anthropic API key.
-3. Configure fallback models so runs continue when Anthropic long-context requests are rejected.
+<Steps>
+  <Step title="Disable context1m">
+    Disable `context1m` for that model to fall back to the normal context window.
+  </Step>
+  <Step title="Use an eligible credential">
+    Use an Anthropic credential that is eligible for long-context requests, or switch to an Anthropic API key.
+  </Step>
+  <Step title="Configure fallback models">
+    Configure fallback models so runs continue when Anthropic long-context requests are rejected.
+  </Step>
+</Steps>

 Related:

@@ -116,38 +119,26 @@ Look for:

 - direct tiny calls succeed, but OpenClaw runs fail only on larger prompts
 - backend errors about `messages[].content` expecting a string
- backend crashes that appear only with larger prompt-token counts or full agent
-  runtime prompts
+- backend crashes that appear only with larger prompt-token counts or full agent runtime prompts

-Common signatures:
-
- `messages[...].content: invalid type: sequence, expected a string` → backend
-  rejects structured Chat Completions content parts. Fix: set
-  `models.providers.<provider>.models[].compat.requiresStringContent: true`.
- direct tiny requests succeed, but OpenClaw agent runs fail with backend/model
-  crashes (for example Gemma on some `inferrs` builds) → OpenClaw transport is
-  likely already correct; the backend is failing on the larger agent-runtime
-  prompt shape.
- failures shrink after disabling tools but do not disappear → tool schemas were
-  part of the pressure, but the remaining issue is still upstream model/server
-  capacity or a backend bug.
-
-Fix options:
-
-1. Set `compat.requiresStringContent: true` for string-only Chat Completions backends.
-2. Set `compat.supportsTools: false` for models/backends that cannot handle
-   OpenClaw's tool schema surface reliably.
-3. Lower prompt pressure where possible: smaller workspace bootstrap, shorter
-   session history, lighter local model, or a backend with stronger long-context
-   support.
-4. If tiny direct requests keep passing while OpenClaw agent turns still crash
-   inside the backend, treat it as an upstream server/model limitation and file
-   a repro there with the accepted payload shape.
+<AccordionGroup>
+  <Accordion title="Common signatures">
+    - `messages[...].content: invalid type: sequence, expected a string` → backend rejects structured Chat Completions content parts. Fix: set `models.providers.<provider>.models[].compat.requiresStringContent: true`.
+    - direct tiny requests succeed, but OpenClaw agent runs fail with backend/model crashes (for example Gemma on some `inferrs` builds) → OpenClaw transport is likely already correct; the backend is failing on the larger agent-runtime prompt shape.
+    - failures shrink after disabling tools but do not disappear → tool schemas were part of the pressure, but the remaining issue is still upstream model/server capacity or a backend bug.
+  </Accordion>
+  <Accordion title="Fix options">
+    1. Set `compat.requiresStringContent: true` for string-only Chat Completions backends.
+    2. Set `compat.supportsTools: false` for models/backends that cannot handle OpenClaw's tool schema surface reliably.
+    3. Lower prompt pressure where possible: smaller workspace bootstrap, shorter session history, lighter local model, or a backend with stronger long-context support.
+    4. If tiny direct requests keep passing while OpenClaw agent turns still crash inside the backend, treat it as an upstream server/model limitation and file a repro there with the accepted payload shape.
+  </Accordion>
+</AccordionGroup>

 Related:

- [Local models](/gateway/local-models)
 - [Configuration](/gateway/configuration)
+- [Local models](/gateway/local-models)
 - [OpenAI-compatible endpoints](/gateway/configuration-reference#openai-compatible-endpoints)

 ## No replies
@@ -177,10 +168,10 @@ Common signatures:
 Related:

 - [Channel troubleshooting](/channels/troubleshooting)
- [Pairing](/channels/pairing)
 - [Groups](/channels/groups)
+- [Pairing](/channels/pairing)

-## Dashboard control ui connectivity
+## Dashboard control UI connectivity

 When dashboard/control UI will not connect, validate URL, auth mode, and secure context assumptions.

@@ -198,32 +189,21 @@ Look for:
 - Auth mode/token mismatch between client and gateway.
 - HTTP usage where device identity is required.

-Common signatures:
-
- `device identity required` → non-secure context or missing device auth.
- `origin not allowed` → browser `Origin` is not in `gateway.controlUi.allowedOrigins`
-  (or you are connecting from a non-loopback browser origin without an explicit
-  allowlist).
- `device nonce required` / `device nonce mismatch` → client is not completing the
-  challenge-based device auth flow (`connect.challenge` + `device.nonce`).
- `device signature invalid` / `device signature expired` → client signed the wrong
-  payload (or stale timestamp) for the current handshake.
- `AUTH_TOKEN_MISMATCH` with `canRetryWithDeviceToken=true` → client can do one trusted retry with cached device token.
- That cached-token retry reuses the cached scope set stored with the paired
-  device token. Explicit `deviceToken` / explicit `scopes` callers keep their
-  requested scope set instead.
- Outside that retry path, connect auth precedence is explicit shared
-  token/password first, then explicit `deviceToken`, then stored device token,
-  then bootstrap token.
- On the async Tailscale Serve Control UI path, failed attempts for the same
-  `{scope, ip}` are serialized before the limiter records the failure. Two bad
-  concurrent retries from the same client can therefore surface `retry later`
-  on the second attempt instead of two plain mismatches.
- `too many failed authentication attempts (retry later)` from a browser-origin
-  loopback client → repeated failures from that same normalized `Origin` are
-  locked out temporarily; another localhost origin uses a separate bucket.
- repeated `unauthorized` after that retry → shared token/device token drift; refresh token config and re-approve/rotate device token if needed.
- `gateway connect failed:` → wrong host/port/url target.
+<AccordionGroup>
+  <Accordion title="Connect / auth signatures">
+    - `device identity required` → non-secure context or missing device auth.
+    - `origin not allowed` → browser `Origin` is not in `gateway.controlUi.allowedOrigins` (or you are connecting from a non-loopback browser origin without an explicit allowlist).
+    - `device nonce required` / `device nonce mismatch` → client is not completing the challenge-based device auth flow (`connect.challenge` + `device.nonce`).
+    - `device signature invalid` / `device signature expired` → client signed the wrong payload (or stale timestamp) for the current handshake.
+    - `AUTH_TOKEN_MISMATCH` with `canRetryWithDeviceToken=true` → client can do one trusted retry with the cached device token.
+    - That cached-token retry reuses the cached scope set stored with the paired device token. Explicit `deviceToken` / explicit `scopes` callers keep their requested scope set instead.
+    - Outside that retry path, connect auth precedence is explicit shared token/password first, then explicit `deviceToken`, then stored device token, then bootstrap token.
+    - On the async Tailscale Serve Control UI path, failed attempts for the same `{scope, ip}` are serialized before the limiter records the failure. Two bad concurrent retries from the same client can therefore surface `retry later` on the second attempt instead of two plain mismatches.
+    - `too many failed authentication attempts (retry later)` from a browser-origin loopback client → repeated failures from that same normalized `Origin` are locked out temporarily; another localhost origin uses a separate bucket.
+    - repeated `unauthorized` after that retry → shared token/device token drift; refresh token config and re-approve/rotate device token if needed.
+    - `gateway connect failed:` → wrong host/port/url target.
+  </Accordion>
+</AccordionGroup>
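
The connect auth precedence described above (outside the cached-token retry path) amounts to "first credential that is set wins". The sketch below is a hedged illustration of that ordering only. The function name `pick_connect_auth` and its output labels are hypothetical and do not correspond to OpenClaw identifiers:

```bash
# Illustrative only: return which credential would be used, in the documented order.
# Arguments: $1 shared token/password, $2 explicit deviceToken,
#            $3 stored device token,   $4 bootstrap token.
# An empty string means "not provided".
pick_connect_auth() {
  if [ -n "${1:-}" ]; then
    echo "shared"
  elif [ -n "${2:-}" ]; then
    echo "explicit-device-token"
  elif [ -n "${3:-}" ]; then
    echo "stored-device-token"
  elif [ -n "${4:-}" ]; then
    echo "bootstrap-token"
  else
    echo "none"
  fi
}

# No shared credential and no explicit deviceToken: the stored device token is used.
pick_connect_auth "" "" "stored-abc" "bootstrap-xyz"
```

When debugging repeated `unauthorized` errors, this ordering helps identify which credential the client actually presented.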

 ### Auth detail codes quick map

@@ -236,11 +216,9 @@ Use `error.details.code` from the failed `connect` response to pick the next act
 | `AUTH_DEVICE_TOKEN_MISMATCH` | Cached per-device token is stale or revoked.                                                                                                                                                 | Rotate/re-approve device token using [devices CLI](/cli/devices), then reconnect.                                                                                                                                                                                                        |
 | `PAIRING_REQUIRED`           | Device identity needs approval. Check `error.details.reason` for `not-paired`, `scope-upgrade`, `role-upgrade`, or `metadata-upgrade`, and use `requestId` / `remediationHint` when present. | Approve pending request: `openclaw devices list` then `openclaw devices approve <requestId>`. Scope/role upgrades use the same flow after you review the requested access.                                                                                                               |

-Direct loopback backend RPCs authenticated with the shared gateway
-token/password should not depend on the CLI's paired-device scope baseline. If
-subagents or other internal calls still fail with `scope-upgrade`, verify the
-caller is using `client.id: "gateway-client"` and `client.mode: "backend"` and
-is not forcing an explicit `deviceIdentity` or device token.
+<Note>
+Direct loopback backend RPCs authenticated with the shared gateway token/password should not depend on the CLI's paired-device scope baseline. If subagents or other internal calls still fail with `scope-upgrade`, verify the caller is using `client.id: "gateway-client"` and `client.mode: "backend"` and is not forcing an explicit `deviceIdentity` or device token.
+</Note>

 Device auth v2 migration check:

@@ -252,24 +230,30 @@ openclaw gateway status

 If logs show nonce/signature errors, update the connecting client and verify it:

-1. waits for `connect.challenge`
-2. signs the challenge-bound payload
-3. sends `connect.params.device.nonce` with the same challenge nonce
+<Steps>
+  <Step title="Wait for connect.challenge">
+    Client waits for the gateway-issued `connect.challenge`.
+  </Step>
+  <Step title="Sign the payload">
+    Client signs the challenge-bound payload.
+  </Step>
+  <Step title="Send the device nonce">
+    Client sends `connect.params.device.nonce` with the same challenge nonce.
+  </Step>
+</Steps>

 If `openclaw devices rotate` / `revoke` / `remove` is denied unexpectedly:

-- paired-device token sessions can manage only **their own** device unless the
-  caller also has `operator.admin`
-- `openclaw devices rotate --scope ...` can only request operator scopes that
-  the caller session already holds
+- paired-device token sessions can manage only **their own** device unless the caller also has `operator.admin`
+- `openclaw devices rotate --scope ...` can only request operator scopes that the caller session already holds

 Related:

-- [Control UI](/web/control-ui)
 - [Configuration](/gateway/configuration) (gateway auth modes)
-- [Trusted proxy auth](/gateway/trusted-proxy-auth)
-- [Remote access](/gateway/remote)
+- [Control UI](/web/control-ui)
 - [Devices](/cli/devices)
+- [Remote access](/gateway/remote)
+- [Trusted proxy auth](/gateway/trusted-proxy-auth)

 ## Gateway service not running

@@ -291,12 +275,14 @@ Look for:
 - Extra launchd/systemd/schtasks installs when `--deep` is used.
 - `Other gateway-like services detected (best effort)` cleanup hints.

-Common signatures:
-
-- `Gateway start blocked: set gateway.mode=local` or `existing config is missing gateway.mode` → local gateway mode is not enabled, or the config file was clobbered and lost `gateway.mode`. Fix: set `gateway.mode="local"` in your config, or re-run `openclaw onboard --mode local` / `openclaw setup` to restamp the expected local-mode config. If you are running OpenClaw via Podman, the default config path is `~/.openclaw/openclaw.json`.
-- `refusing to bind gateway ... without auth` → non-loopback bind without a valid gateway auth path (token/password, or trusted-proxy where configured).
-- `another gateway instance is already listening` / `EADDRINUSE` → port conflict.
-- `Other gateway-like services detected (best effort)` → stale or parallel launchd/systemd/schtasks units exist. Most setups should keep one gateway per machine; if you do need more than one, isolate ports + config/state/workspace. See [/gateway#multiple-gateways-same-host](/gateway#multiple-gateways-same-host).
+<AccordionGroup>
+  <Accordion title="Common signatures">
+    - `Gateway start blocked: set gateway.mode=local` or `existing config is missing gateway.mode` → local gateway mode is not enabled, or the config file was clobbered and lost `gateway.mode`. Fix: set `gateway.mode="local"` in your config, or re-run `openclaw onboard --mode local` / `openclaw setup` to restamp the expected local-mode config. If you are running OpenClaw via Podman, the default config path is `~/.openclaw/openclaw.json`.
+    - `refusing to bind gateway ... without auth` → non-loopback bind without a valid gateway auth path (token/password, or trusted-proxy where configured).
+    - `another gateway instance is already listening` / `EADDRINUSE` → port conflict.
+    - `Other gateway-like services detected (best effort)` → stale or parallel launchd/systemd/schtasks units exist. Most setups should keep one gateway per machine; if you do need more than one, isolate ports + config/state/workspace. See [/gateway#multiple-gateways-same-host](/gateway#multiple-gateways-same-host).
+  </Accordion>
+</AccordionGroup>

 Related:

@@ -323,46 +309,43 @@ Look for:
 - A timestamped `openclaw.json.clobbered.*` file beside the active config
 - A main-agent system event that starts with `Config recovery warning`

-What happened:
-
-- The rejected config did not validate during startup or hot reload.
-- OpenClaw preserved the rejected payload as `.clobbered.*`.
-- The active config was restored from the last validated last-known-good copy.
-- The next main-agent turn is warned not to blindly rewrite the rejected config.
-- If all validation issues were under `plugins.entries.<id>...`, OpenClaw would
-  not restore the whole file. Plugin-local failures stay loud while unrelated
-  user settings remain in the active config.
-
-Inspect and repair:
-
-```bash
-CONFIG="$(openclaw config file)"
-ls -lt "$CONFIG".clobbered.* "$CONFIG".rejected.* 2>/dev/null | head
-diff -u "$CONFIG" "$(ls -t "$CONFIG".clobbered.* 2>/dev/null | head -n 1)"
-openclaw config validate
-openclaw doctor
-```
-
-Common signatures:
-
-- `.clobbered.*` exists → an external direct edit or startup read was restored.
-- `.rejected.*` exists → an OpenClaw-owned config write failed schema or clobber checks before commit.
-- `Config write rejected:` → the write tried to drop required shape, shrink the file sharply, or persist invalid config.
-- `missing-meta-vs-last-good`, `gateway-mode-missing-vs-last-good`, or `size-drop-vs-last-good:*` → startup treated the current file as clobbered because it lost fields or size compared with the last-known-good backup.
-- `Config last-known-good promotion skipped` → the candidate contained redacted secret placeholders such as `***`.
-
-Fix options:
-
-1. Keep the restored active config if it is correct.
-2. Copy only the intended keys from `.clobbered.*` or `.rejected.*`, then apply them with `openclaw config set` or `config.patch`.
-3. Run `openclaw config validate` before restarting.
-4. If you edit by hand, keep the full JSON5 config, not just the partial object you wanted to change.
+<AccordionGroup>
+  <Accordion title="What happened">
+    - The rejected config did not validate during startup or hot reload.
+    - OpenClaw preserved the rejected payload as `.clobbered.*`.
+    - The active config was restored from the last validated last-known-good copy.
+    - The next main-agent turn is warned not to blindly rewrite the rejected config.
+    - If every validation issue is under `plugins.entries.<id>...`, OpenClaw does not restore the whole file. Plugin-local failures stay loud while unrelated user settings remain in the active config.
+  </Accordion>
+  <Accordion title="Inspect and repair">
+    ```bash
+    CONFIG="$(openclaw config file)"
+    ls -lt "$CONFIG".clobbered.* "$CONFIG".rejected.* 2>/dev/null | head
+    diff -u "$CONFIG" "$(ls -t "$CONFIG".clobbered.* 2>/dev/null | head -n 1)"
+    openclaw config validate
+    openclaw doctor
+    ```
+  </Accordion>
+  <Accordion title="Common signatures">
+    - `.clobbered.*` exists → an external direct edit or startup read was restored.
+    - `.rejected.*` exists → an OpenClaw-owned config write failed schema or clobber checks before commit.
+    - `Config write rejected:` → the write tried to drop required shape, shrink the file sharply, or persist invalid config.
+    - `missing-meta-vs-last-good`, `gateway-mode-missing-vs-last-good`, or `size-drop-vs-last-good:*` → startup treated the current file as clobbered because it lost fields or size compared with the last-known-good backup.
+    - `Config last-known-good promotion skipped` → the candidate contained redacted secret placeholders such as `***`.
+  </Accordion>
+  <Accordion title="Fix options">
+    1. Keep the restored active config if it is correct.
+    2. Copy only the intended keys from `.clobbered.*` or `.rejected.*`, then apply them with `openclaw config set` or `config.patch`.
+    3. Run `openclaw config validate` before restarting.
+    4. If you edit by hand, keep the full JSON5 config, not just the partial object you wanted to change.
+  </Accordion>
+</AccordionGroup>

 Related:

-- [Configuration: strict validation](/gateway/configuration#strict-validation)
-- [Configuration: hot reload](/gateway/configuration#config-hot-reload)
 - [Config](/cli/config)
+- [Configuration: hot reload](/gateway/configuration#config-hot-reload)
+- [Configuration: strict validation](/gateway/configuration#strict-validation)
 - [Doctor](/gateway/doctor)

 ## Gateway probe warnings
@@ -394,7 +377,7 @@ Related:
 - [Multiple gateways on the same host](/gateway#multiple-gateways-same-host)
 - [Remote access](/gateway/remote)

-## Channel connected messages not flowing
+## Channel connected, messages not flowing

 If channel state is connected but message flow is dead, focus on policy, permissions, and channel-specific delivery rules.

@@ -421,9 +404,9 @@ Common signatures:
 Related:

 - [Channel troubleshooting](/channels/troubleshooting)
-- [WhatsApp](/channels/whatsapp)
-- [Telegram](/channels/telegram)
 - [Discord](/channels/discord)
+- [Telegram](/channels/telegram)
+- [WhatsApp](/channels/whatsapp)

 ## Cron and heartbeat delivery

@@ -443,23 +426,25 @@ Look for:
 - Job run history status (`ok`, `skipped`, `error`).
 - Heartbeat skip reasons (`quiet-hours`, `requests-in-flight`, `alerts-disabled`, `empty-heartbeat-file`, `no-tasks-due`).

-Common signatures:
-
-- `cron: scheduler disabled; jobs will not run automatically` → cron disabled.
-- `cron: timer tick failed` → scheduler tick failed; check file/log/runtime errors.
-- `heartbeat skipped` with `reason=quiet-hours` → outside active hours window.
-- `heartbeat skipped` with `reason=empty-heartbeat-file` → `HEARTBEAT.md` exists but only contains blank lines / markdown headers, so OpenClaw skips the model call.
-- `heartbeat skipped` with `reason=no-tasks-due` → `HEARTBEAT.md` contains a `tasks:` block, but none of the tasks are due on this tick.
-- `heartbeat: unknown accountId` → invalid account id for heartbeat delivery target.
-- `heartbeat skipped` with `reason=dm-blocked` → heartbeat target resolved to a DM-style destination while `agents.defaults.heartbeat.directPolicy` (or per-agent override) is set to `block`.
+<AccordionGroup>
+  <Accordion title="Common signatures">
+    - `cron: scheduler disabled; jobs will not run automatically` → cron disabled.
+    - `cron: timer tick failed` → scheduler tick failed; check file/log/runtime errors.
+    - `heartbeat skipped` with `reason=quiet-hours` → outside active hours window.
+    - `heartbeat skipped` with `reason=empty-heartbeat-file` → `HEARTBEAT.md` exists but only contains blank lines / markdown headers, so OpenClaw skips the model call.
+    - `heartbeat skipped` with `reason=no-tasks-due` → `HEARTBEAT.md` contains a `tasks:` block, but none of the tasks are due on this tick.
+    - `heartbeat: unknown accountId` → invalid account id for heartbeat delivery target.
+    - `heartbeat skipped` with `reason=dm-blocked` → heartbeat target resolved to a DM-style destination while `agents.defaults.heartbeat.directPolicy` (or per-agent override) is set to `block`.
+  </Accordion>
+</AccordionGroup>

 Related:

-- [Scheduled tasks: troubleshooting](/automation/cron-jobs#troubleshooting)
-- [Scheduled tasks](/automation/cron-jobs)
 - [Heartbeat](/gateway/heartbeat)
+- [Scheduled tasks](/automation/cron-jobs)
+- [Scheduled tasks: troubleshooting](/automation/cron-jobs#troubleshooting)

-## Node paired tool fails
+## Node paired, tool fails

 If a node is paired but tools fail, isolate foreground, permission, and approval state.

@@ -486,9 +471,9 @@ Common signatures:

 Related:

+- [Exec approvals](/tools/exec-approvals)
 - [Node troubleshooting](/nodes/troubleshooting)
 - [Nodes](/nodes/index)
-- [Exec approvals](/tools/exec-approvals)

 ## Browser tool fails

@@ -509,95 +494,104 @@ Look for:
 - CDP profile reachability.
 - Local Chrome availability for `existing-session` / `user` profiles.

-Common signatures:
-
-- `unknown command "browser"` or `unknown command 'browser'` → the bundled browser plugin is excluded by `plugins.allow`.
-- browser tool missing / unavailable while `browser.enabled=true` → `plugins.allow` excludes `browser`, so the plugin never loaded.
-- `Failed to start Chrome CDP on port` → browser process failed to launch.
-- `browser.executablePath not found` → configured path is invalid.
-- `browser.cdpUrl must be http(s) or ws(s)` → the configured CDP URL uses an unsupported scheme such as `file:` or `ftp:`.
-- `browser.cdpUrl has invalid port` → the configured CDP URL has a bad or out-of-range port.
-- `Could not find DevToolsActivePort for chrome` → Chrome MCP existing-session could not attach to the selected browser data dir yet. Open the browser inspect page, enable remote debugging, keep the browser open, approve the first attach prompt, then retry. If signed-in state is not required, prefer the managed `openclaw` profile.
-- `No Chrome tabs found for profile="user"` → the Chrome MCP attach profile has no open local Chrome tabs.
-- `Remote CDP for profile "<name>" is not reachable` → the configured remote CDP endpoint is not reachable from the gateway host.
-- `Browser attachOnly is enabled ... not reachable` or `Browser attachOnly is enabled and CDP websocket ... is not reachable` → attach-only profile has no reachable target, or the HTTP endpoint answered but the CDP WebSocket still could not be opened.
-- `Playwright is not available in this gateway build; '<feature>' is unsupported.` → the current gateway install lacks the bundled browser plugin's `playwright-core` runtime dependency; run `openclaw doctor --fix`, then restart the gateway. ARIA snapshots and basic page screenshots can still work, but navigation, AI snapshots, CSS-selector element screenshots, and PDF export stay unavailable.
-- `fullPage is not supported for element screenshots` → screenshot request mixed `--full-page` with `--ref` or `--element`.
-- `element screenshots are not supported for existing-session profiles; use ref from snapshot.` → Chrome MCP / `existing-session` screenshot calls must use page capture or a snapshot `--ref`, not CSS `--element`.
-- `existing-session file uploads do not support element selectors; use ref/inputRef.` → Chrome MCP upload hooks need snapshot refs, not CSS selectors.
-- `existing-session file uploads currently support one file at a time.` → send one upload per call on Chrome MCP profiles.
-- `existing-session dialog handling does not support timeoutMs.` → dialog hooks on Chrome MCP profiles do not support timeout overrides.
-- `existing-session type does not support timeoutMs overrides.` → omit `timeoutMs` for `act:type` on `profile="user"` / Chrome MCP existing-session profiles, or use a managed/CDP browser profile when a custom timeout is required.
-- `existing-session evaluate does not support timeoutMs overrides.` → omit `timeoutMs` for `act:evaluate` on `profile="user"` / Chrome MCP existing-session profiles, or use a managed/CDP browser profile when a custom timeout is required.
-- `response body is not supported for existing-session profiles yet.` → `responsebody` still requires a managed browser or raw CDP profile.
-- stale viewport / dark-mode / locale / offline overrides on attach-only or remote CDP profiles → run `openclaw browser stop --browser-profile <name>` to close the active control session and release Playwright/CDP emulation state without restarting the whole gateway.
+<AccordionGroup>
+  <Accordion title="Plugin / executable signatures">
+    - `unknown command "browser"` or `unknown command 'browser'` → the bundled browser plugin is excluded by `plugins.allow`.
+    - browser tool missing / unavailable while `browser.enabled=true` → `plugins.allow` excludes `browser`, so the plugin never loaded.
+    - `Failed to start Chrome CDP on port` → browser process failed to launch.
+    - `browser.executablePath not found` → configured path is invalid.
+    - `browser.cdpUrl must be http(s) or ws(s)` → the configured CDP URL uses an unsupported scheme such as `file:` or `ftp:`.
+    - `browser.cdpUrl has invalid port` → the configured CDP URL has a bad or out-of-range port.
+    - `Playwright is not available in this gateway build; '<feature>' is unsupported.` → the current gateway install lacks the bundled browser plugin's `playwright-core` runtime dependency; run `openclaw doctor --fix`, then restart the gateway. ARIA snapshots and basic page screenshots can still work, but navigation, AI snapshots, CSS-selector element screenshots, and PDF export stay unavailable.
+  </Accordion>
+  <Accordion title="Chrome MCP / existing-session signatures">
+    - `Could not find DevToolsActivePort for chrome` → Chrome MCP existing-session could not attach to the selected browser data dir yet. Open the browser inspect page, enable remote debugging, keep the browser open, approve the first attach prompt, then retry. If signed-in state is not required, prefer the managed `openclaw` profile.
+    - `No Chrome tabs found for profile="user"` → the Chrome MCP attach profile has no open local Chrome tabs.
+    - `Remote CDP for profile "<name>" is not reachable` → the configured remote CDP endpoint is not reachable from the gateway host.
+    - `Browser attachOnly is enabled ... not reachable` or `Browser attachOnly is enabled and CDP websocket ... is not reachable` → attach-only profile has no reachable target, or the HTTP endpoint answered but the CDP WebSocket still could not be opened.
+  </Accordion>
+  <Accordion title="Element / screenshot / upload signatures">
+    - `fullPage is not supported for element screenshots` → screenshot request mixed `--full-page` with `--ref` or `--element`.
+    - `element screenshots are not supported for existing-session profiles; use ref from snapshot.` → Chrome MCP / `existing-session` screenshot calls must use page capture or a snapshot `--ref`, not CSS `--element`.
+    - `existing-session file uploads do not support element selectors; use ref/inputRef.` → Chrome MCP upload hooks need snapshot refs, not CSS selectors.
+    - `existing-session file uploads currently support one file at a time.` → send one upload per call on Chrome MCP profiles.
+    - `existing-session dialog handling does not support timeoutMs.` → dialog hooks on Chrome MCP profiles do not support timeout overrides.
+    - `existing-session type does not support timeoutMs overrides.` → omit `timeoutMs` for `act:type` on `profile="user"` / Chrome MCP existing-session profiles, or use a managed/CDP browser profile when a custom timeout is required.
+    - `existing-session evaluate does not support timeoutMs overrides.` → omit `timeoutMs` for `act:evaluate` on `profile="user"` / Chrome MCP existing-session profiles, or use a managed/CDP browser profile when a custom timeout is required.
+    - `response body is not supported for existing-session profiles yet.` → `responsebody` still requires a managed browser or raw CDP profile.
+    - stale viewport / dark-mode / locale / offline overrides on attach-only or remote CDP profiles → run `openclaw browser stop --browser-profile <name>` to close the active control session and release Playwright/CDP emulation state without restarting the whole gateway.
+  </Accordion>
+</AccordionGroup>

 Related:

-- [Browser troubleshooting](/tools/browser-linux-troubleshooting)
 - [Browser (OpenClaw-managed)](/tools/browser)
+- [Browser troubleshooting](/tools/browser-linux-troubleshooting)

 ## If you upgraded and something suddenly broke

 Most post-upgrade breakage is config drift or stricter defaults now being enforced.

-### 1) Auth and URL override behavior changed
+<AccordionGroup>
+  <Accordion title="1. Auth and URL override behavior changed">
+    ```bash
+    openclaw gateway status
+    openclaw config get gateway.mode
+    openclaw config get gateway.remote.url
+    openclaw config get gateway.auth.mode
+    ```

-```bash
-openclaw gateway status
-openclaw config get gateway.mode
-openclaw config get gateway.remote.url
-openclaw config get gateway.auth.mode
-```
+    What to check:

-What to check:
+    - If `gateway.mode=remote`, CLI calls may be targeting remote while your local service is fine.
+    - Explicit `--url` calls do not fall back to stored credentials.

-- If `gateway.mode=remote`, CLI calls may be targeting remote while your local service is fine.
-- Explicit `--url` calls do not fall back to stored credentials.
+    Common signatures:

-Common signatures:
+    - `gateway connect failed:` → wrong URL target.
+    - `unauthorized` → endpoint reachable but wrong auth.

-- `gateway connect failed:` → wrong URL target.
-- `unauthorized` → endpoint reachable but wrong auth.
+  </Accordion>
+  <Accordion title="2. Bind and auth guardrails are stricter">
+    ```bash
+    openclaw config get gateway.bind
+    openclaw config get gateway.auth.mode
+    openclaw config get gateway.auth.token
+    openclaw gateway status
+    openclaw logs --follow
+    ```

-### 2) Bind and auth guardrails are stricter
+    What to check:

-```bash
-openclaw config get gateway.bind
-openclaw config get gateway.auth.mode
-openclaw config get gateway.auth.token
-openclaw gateway status
-openclaw logs --follow
-```
+    - Non-loopback binds (`lan`, `tailnet`, `custom`) need a valid gateway auth path: shared token/password auth, or a correctly configured non-loopback `trusted-proxy` deployment.
+    - Old keys like `gateway.token` do not replace `gateway.auth.token`.

-What to check:
+    Common signatures:

-- Non-loopback binds (`lan`, `tailnet`, `custom`) need a valid gateway auth path: shared token/password auth, or a correctly configured non-loopback `trusted-proxy` deployment.
-- Old keys like `gateway.token` do not replace `gateway.auth.token`.
+    - `refusing to bind gateway ... without auth` → non-loopback bind without a valid gateway auth path.
+    - `Connectivity probe: failed` while runtime is running → gateway alive but inaccessible with current auth/url.

-Common signatures:
+  </Accordion>
+  <Accordion title="3. Pairing and device identity state changed">
+    ```bash
+    openclaw devices list
+    openclaw pairing list --channel <channel> [--account <id>]
+    openclaw logs --follow
+    openclaw doctor
+    ```

-- `refusing to bind gateway ... without auth` → non-loopback bind without a valid gateway auth path.
-- `Connectivity probe: failed` while runtime is running → gateway alive but inaccessible with current auth/url.
+    What to check:

-### 3) Pairing and device identity state changed
+    - Pending device approvals for dashboard/nodes.
+    - Pending DM pairing approvals after policy or identity changes.

-```bash
-openclaw devices list
-openclaw pairing list --channel <channel> [--account <id>]
-openclaw logs --follow
-openclaw doctor
-```
+    Common signatures:

-What to check:
+    - `device identity required` → device auth not satisfied.
+    - `pairing required` → sender/device must be approved.

-- Pending device approvals for dashboard/nodes.
-- Pending DM pairing approvals after policy or identity changes.
-
-Common signatures:
-
-- `device identity required` → device auth not satisfied.
-- `pairing required` → sender/device must be approved.
+  </Accordion>
+</AccordionGroup>

 If the service config and runtime still disagree after checks, reinstall service metadata from the same profile/state directory:

@@ -608,12 +602,12 @@ openclaw gateway restart

 Related:

-- [Gateway-owned pairing](/gateway/pairing)
 - [Authentication](/gateway/authentication)
 - [Background exec and process tool](/gateway/background-process)
+- [Gateway-owned pairing](/gateway/pairing)

 ## Related

-- [Gateway runbook](/gateway)
 - [Doctor](/gateway/doctor)
 - [FAQ](/help/faq)
+- [Gateway runbook](/gateway)
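
Reviewer sketch (not part of the patch): the auth signatures and the detail-code quick map in the troubleshooting doc above amount to a small decision table keyed on `error.details.code`. A minimal TypeScript illustration follows, assuming a hypothetical `ConnectErrorDetails` shape and a hypothetical `nextActionFor` helper. Only the code strings, `canRetryWithDeviceToken`, `requestId`, and the `openclaw devices` commands come from the doc; treating `canRetryWithDeviceToken` and `requestId` as fields of the same details object is a guess, and the real client may structure this differently.

```typescript
// Hypothetical shape of `error.details` on a failed `connect` response.
// The field names appear in the doc; grouping them in one interface is an assumption.
interface ConnectErrorDetails {
  code: string;
  canRetryWithDeviceToken?: boolean;
  requestId?: string;
}

// Hypothetical action labels, used only for this illustration.
type NextAction =
  | { kind: "retry-with-cached-device-token" }
  | { kind: "refresh-shared-token" }
  | { kind: "rotate-device-token" }
  | { kind: "approve-pairing"; command: string }
  | { kind: "inspect-logs" };

function nextActionFor(details: ConnectErrorDetails, alreadyRetried: boolean): NextAction {
  switch (details.code) {
    case "AUTH_TOKEN_MISMATCH":
      // The doc allows exactly one trusted retry with the cached device token.
      if (details.canRetryWithDeviceToken === true && !alreadyRetried) {
        return { kind: "retry-with-cached-device-token" };
      }
      // Repeated failures after that retry point at shared token drift.
      return { kind: "refresh-shared-token" };
    case "AUTH_DEVICE_TOKEN_MISMATCH":
      // Cached per-device token is stale or revoked.
      return { kind: "rotate-device-token" };
    case "PAIRING_REQUIRED":
      // Approve by request id when one is present; otherwise list pending requests first.
      return {
        kind: "approve-pairing",
        command: details.requestId
          ? `openclaw devices approve ${details.requestId}`
          : "openclaw devices list",
      };
    default:
      // Codes not covered here: fall back to reading the logs.
      return { kind: "inspect-logs" };
  }
}
```

The `alreadyRetried` flag keeps the cached-token retry to a single attempt, matching the "one trusted retry" wording in the doc.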
--- a/docs/install/installer.md
+++ b/docs/install/installer.md
@@ -292,6 +292,9 @@ by default, plus git-checkout installs under the same prefix flow.
    - Refreshes a loaded gateway service best-effort (`openclaw gateway install --force`, then restart)
    - Runs `openclaw doctor --non-interactive` on upgrades and git installs (best effort)
  </Step>
+  <Step title="Handle failures">
+    `iwr ... | iex` and scriptblock installs report a terminating error without closing the current PowerShell session. Direct `powershell -File` / `pwsh -File` installs still exit non-zero for automation.
+  </Step>
 </Steps>

 ### Examples (install.ps1)
--- a/extensions/diagnostics-otel/src/service.test.ts
+++ b/extensions/diagnostics-otel/src/service.test.ts
@@ -1140,6 +1140,28 @@ describe("diagnostics-otel service", () => {
        traceFlags: "01",
      },
    });
+    emitDiagnosticEvent({
+      type: "harness.run.completed",
+      runId: "run-1",
+      sessionKey: "session-key",
+      sessionId: "session-1",
+      provider: "codex",
+      model: "gpt-5.4",
+      channel: "qa",
+      harnessId: "codex",
+      pluginId: "codex-plugin",
+      outcome: "completed",
+      durationMs: 90,
+      resultClassification: "reasoning-only",
+      yieldDetected: true,
+      itemLifecycle: { startedCount: 3, completedCount: 2, activeCount: 1 },
+      trace: {
+        traceId: TRACE_ID,
+        spanId: GRANDCHILD_SPAN_ID,
+        parentSpanId: CHILD_SPAN_ID,
+        traceFlags: "01",
+      },
+    });
    emitDiagnosticEvent({
      type: "tool.execution.error",
      runId: "run-1",
@@ -1160,7 +1182,12 @@ describe("diagnostics-otel service", () => {

    const spanNames = telemetryState.tracer.startSpan.mock.calls.map((call) => call[0]);
    expect(spanNames).toEqual(
-      expect.arrayContaining(["openclaw.run", "openclaw.model.call", "openclaw.tool.execution"]),
+      expect.arrayContaining([
+        "openclaw.run",
+        "openclaw.model.call",
+        "openclaw.harness.run",
+        "openclaw.tool.execution",
+      ]),
    );

    const runCall = telemetryState.tracer.startSpan.mock.calls.find(
@@ -1207,6 +1234,36 @@ describe("diagnostics-otel service", () => {
    });
    expect(modelCall?.[2]).toBeUndefined();

+    const harnessCall = telemetryState.tracer.startSpan.mock.calls.find(
+      (call) => call[0] === "openclaw.harness.run",
+    );
+    expect(harnessCall?.[1]).toMatchObject({
+      attributes: {
+        "openclaw.harness.id": "codex",
+        "openclaw.harness.plugin": "codex-plugin",
+        "openclaw.outcome": "completed",
+        "openclaw.provider": "codex",
+        "openclaw.model": "gpt-5.4",
+        "openclaw.channel": "qa",
+        "openclaw.harness.result_classification": "reasoning-only",
+        "openclaw.harness.yield_detected": true,
+        "openclaw.harness.items.started": 3,
+        "openclaw.harness.items.completed": 2,
+        "openclaw.harness.items.active": 1,
+      },
+      startTime: expect.any(Number),
+    });
+    expect(harnessCall?.[1]).toEqual({
+      attributes: expect.not.objectContaining({
+        "openclaw.runId": expect.anything(),
+        "openclaw.sessionId": expect.anything(),
+        "openclaw.sessionKey": expect.anything(),
+        "openclaw.traceId": expect.anything(),
+      }),
+      startTime: expect.any(Number),
+    });
+    expect(harnessCall?.[2]).toBeUndefined();
+
    const toolCall = telemetryState.tracer.startSpan.mock.calls.find(
      (call) => call[0] === "openclaw.tool.execution",
    );
@@ -1244,6 +1301,25 @@ describe("diagnostics-otel service", () => {
        "openclaw.runId": expect.anything(),
      }),
    );
+    expect(
+      telemetryState.histograms.get("openclaw.harness.duration_ms")?.record,
+    ).toHaveBeenCalledWith(
+      90,
+      expect.objectContaining({
+        "openclaw.harness.id": "codex",
+        "openclaw.harness.plugin": "codex-plugin",
+        "openclaw.outcome": "completed",
+      }),
+    );
+    expect(
+      telemetryState.histograms.get("openclaw.harness.duration_ms")?.record,
+    ).toHaveBeenCalledWith(
+      90,
+      expect.not.objectContaining({
+        "openclaw.runId": expect.anything(),
+        "openclaw.sessionKey": expect.anything(),
+      }),
+    );
    expect(
      telemetryState.histograms.get("openclaw.tool.execution.duration_ms")?.record,
    ).toHaveBeenCalledWith(
--- a/extensions/diagnostics-otel/src/service.ts
+++ b/extensions/diagnostics-otel/src/service.ts
@@ -81,6 +81,10 @@ type ModelCallLifecycleDiagnosticEvent = Extract<
  DiagnosticEventPayload,
  { type: "model.call.completed" | "model.call.error" }
 >;
+type HarnessRunLifecycleDiagnosticEvent = Extract<
+  DiagnosticEventPayload,
+  { type: "harness.run.completed" | "harness.run.error" }
+>;
 type TelemetryExporterDiagnosticEvent = Extract<
  DiagnosticEventPayload,
  { type: "telemetry.exporter" }
@@ -720,6 +724,10 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
        unit: "ms",
        description: "Agent run duration",
      });
+      const harnessDurationHistogram = meter.createHistogram("openclaw.harness.duration_ms", {
+        unit: "ms",
+        description: "Agent harness lifecycle duration",
+      });
      const contextHistogram = meter.createHistogram("openclaw.context.tokens", {
        unit: "1",
        description: "Context window size and usage",
@@ -1426,6 +1434,82 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
        span.end(evt.ts);
      };

+      const harnessRunMetricAttrs = (evt: HarnessRunLifecycleDiagnosticEvent) => ({
+        "openclaw.harness.id": lowCardinalityAttr(evt.harnessId, "unknown"),
+        "openclaw.harness.plugin": lowCardinalityAttr(evt.pluginId),
+        "openclaw.outcome": evt.type === "harness.run.error" ? "error" : evt.outcome,
+        "openclaw.provider": lowCardinalityAttr(evt.provider, "unknown"),
+        "openclaw.model": lowCardinalityAttr(evt.model, "unknown"),
+        ...(evt.channel ? { "openclaw.channel": lowCardinalityAttr(evt.channel) } : {}),
+      });
+
+      const recordHarnessRunCompleted = (
+        evt: Extract<DiagnosticEventPayload, { type: "harness.run.completed" }>,
+        metadata: DiagnosticEventMetadata,
+      ) => {
+        harnessDurationHistogram.record(evt.durationMs, harnessRunMetricAttrs(evt));
+        if (!tracesEnabled) {
+          return;
+        }
+        const spanAttrs: Record<string, string | number | boolean> = {
+          ...harnessRunMetricAttrs(evt),
+        };
+        if (evt.resultClassification) {
+          spanAttrs["openclaw.harness.result_classification"] = lowCardinalityAttr(
+            evt.resultClassification,
+          );
+        }
+        if (typeof evt.yieldDetected === "boolean") {
+          spanAttrs["openclaw.harness.yield_detected"] = evt.yieldDetected;
+        }
+        if (evt.itemLifecycle) {
+          spanAttrs["openclaw.harness.items.started"] = evt.itemLifecycle.startedCount;
+          spanAttrs["openclaw.harness.items.completed"] = evt.itemLifecycle.completedCount;
+          spanAttrs["openclaw.harness.items.active"] = evt.itemLifecycle.activeCount;
+        }
+        const span = spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
+          parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
+          endTimeMs: evt.ts,
+        });
+        if (evt.outcome === "error") {
+          span.setStatus({
+            code: SpanStatusCode.ERROR,
+            message: "error",
+          });
+        }
+        span.end(evt.ts);
+      };
+
+      const recordHarnessRunError = (
+        evt: Extract<DiagnosticEventPayload, { type: "harness.run.error" }>,
+        metadata: DiagnosticEventMetadata,
+      ) => {
+        const errorType = lowCardinalityAttr(evt.errorCategory, "other");
+        const attrs = {
+          ...harnessRunMetricAttrs(evt),
+          "openclaw.harness.phase": evt.phase,
+          "openclaw.errorCategory": errorType,
+        };
+        harnessDurationHistogram.record(evt.durationMs, attrs);
+        if (!tracesEnabled) {
+          return;
+        }
+        const spanAttrs: Record<string, string | number | boolean> = {
+          ...attrs,
+          "error.type": errorType,
+          ...(evt.cleanupFailed ? { "openclaw.harness.cleanup_failed": true } : {}),
+        };
+        const span = spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
+          parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
+          endTimeMs: evt.ts,
+        });
+        span.setStatus({
+          code: SpanStatusCode.ERROR,
+          message: errorType,
+        });
+        span.end(evt.ts);
+      };
+
      const recordContextAssembled = (
        evt: Extract<DiagnosticEventPayload, { type: "context.assembled" }>,
        metadata: DiagnosticEventMetadata,
@@ -1746,6 +1830,12 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
            case "run.completed":
              recordRunCompleted(evt, metadata);
              return;
+            case "harness.run.completed":
+              recordHarnessRunCompleted(evt, metadata);
+              return;
+            case "harness.run.error":
+              recordHarnessRunError(evt, metadata);
+              return;
            case "context.assembled":
              recordContextAssembled(evt, metadata);
              return;
@@ -1781,6 +1871,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
              return;
            case "tool.execution.started":
            case "run.started":
+            case "harness.run.started":
            case "model.call.started":
            case "payload.large":
              return;
--- a/qa/scenarios/runtime/otel-trace-smoke.md
+++ b/qa/scenarios/runtime/otel-trace-smoke.md
@@ -13,6 +13,7 @@ objective: Verify a QA-lab gateway run emits bounded OpenTelemetry trace spans t
 successCriteria:
  - The diagnostics-otel plugin starts with trace export enabled.
  - A minimal QA-channel agent turn completes.
+  - The trace includes the selected agent harness lifecycle span.
  - The run emits low-cardinality OpenTelemetry trace spans without content or raw diagnostic identifiers.
 plugins:
  - diagnostics-otel
@@ -33,6 +34,7 @@ docsRefs:
  - docs/concepts/qa-e2e-automation.md
 codeRefs:
  - extensions/diagnostics-otel/src/service.ts
+  - src/agents/harness/v2.ts
  - extensions/qa-lab/src/suite.ts
 execution:
  kind: flow
--- a/scripts/e2e/npm-telegram-live-docker.sh
+++ b/scripts/e2e/npm-telegram-live-docker.sh
@@ -153,10 +153,50 @@ export NPM_CONFIG_PREFIX="/npm-global"
 export PATH="$NPM_CONFIG_PREFIX/bin:$PATH"
 export OPENCLAW_NPM_TELEGRAM_REPO_ROOT="/app"

+dump_hotpath_logs() {
+  local status="$1"
+  echo "installed npm onboarding recovery hot path failed with exit code $status" >&2
+  for file in \
+    /tmp/openclaw-npm-telegram-onboard.json \
+    /tmp/openclaw-npm-telegram-channel-add.log \
+    /tmp/openclaw-npm-telegram-doctor-fix.log \
+    /tmp/openclaw-npm-telegram-doctor-check.log; do
+    if [ -f "$file" ]; then
+      echo "--- $file ---" >&2
+      sed -n '1,220p' "$file" >&2 || true
+    fi
+  done
+}
+trap 'status=$?; dump_hotpath_logs "$status"; exit "$status"' ERR
+
 command -v openclaw
 openclaw --version

+echo "Running installed npm onboarding recovery hot path..."
+OPENAI_API_KEY="${OPENAI_API_KEY:-sk-openclaw-npm-telegram-hotpath}" openclaw onboard --non-interactive --accept-risk \
+  --mode local \
+  --auth-choice openai-api-key \
+  --secret-input-mode ref \
+  --gateway-port 18789 \
+  --gateway-bind loopback \
+  --skip-daemon \
+  --skip-ui \
+  --skip-skills \
+  --skip-health \
+  --json >/tmp/openclaw-npm-telegram-onboard.json </dev/null
+
+openclaw channels add --channel telegram --token "123456:openclaw-npm-telegram-hotpath" >/tmp/openclaw-npm-telegram-channel-add.log 2>&1 </dev/null
+openclaw doctor --fix --non-interactive >/tmp/openclaw-npm-telegram-doctor-fix.log 2>&1 </dev/null
+openclaw doctor --non-interactive >/tmp/openclaw-npm-telegram-doctor-check.log 2>&1 </dev/null
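+if grep -F -q "Bundled plugin runtime deps are missing." /tmp/openclaw-npm-telegram-doctor-check.log; then
+  dump_hotpath_logs 1; exit 1 # an explicit exit does not fire the ERR trap, so dump logs here
+fi
+if grep -F -q "Failed to install bundled plugin runtime deps" /tmp/openclaw-npm-telegram-doctor-fix.log; then
+  dump_hotpath_logs 1; exit 1 # an explicit exit does not fire the ERR trap, so dump logs here
+fi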
+
 export OPENCLAW_NPM_TELEGRAM_SUT_COMMAND="$(command -v openclaw)"
+trap - ERR
 node --import tsx scripts/e2e/npm-telegram-live-runner.ts
 EOF

--- a/scripts/install.ps1
+++ b/scripts/install.ps1
@@ -384,6 +384,29 @@ function Add-ToPath {
    }
 }

+$script:InstallExitCode = 0
+
+function Fail-Install {
+    param([int]$Code = 1)
+
+    $script:InstallExitCode = $Code
+    return $false
+}
+
+function Complete-Install {
+    param([bool]$Succeeded)
+
+    if ($Succeeded) {
+        return
+    }
+
+    if ($PSCommandPath) {
+        exit $script:InstallExitCode
+    }
+
+    throw "OpenClaw installation failed with exit code $($script:InstallExitCode)."
+}
+
 # Main
 function Main {
    Write-Banner
@@ -394,16 +417,16 @@ function Main {
    if (!(Ensure-ExecutionPolicy)) {
        Write-Host ""
        Write-Host "Installation cannot continue due to execution policy restrictions" -Level error
-        exit 1
+        return (Fail-Install)
    }
    
    if (!(Ensure-Node)) {
-        exit 1
+        return (Fail-Install)
    }
    
    if ($InstallMethod -eq "git") {
        if (!(Ensure-Git)) {
-            exit 1
+            return (Fail-Install)
        }
        
        if ($DryRun) {
@@ -421,7 +444,7 @@ function Main {
            Write-Host "[DRY RUN] Would install OpenClaw via npm ($((Resolve-PackageInstallSpec -Target $Tag)))" -Level info
        } else {
            if (!(Install-OpenClawNpm -Target $Tag)) {
-                exit 1
+                return (Fail-Install)
            }
        }
    }
@@ -446,6 +469,8 @@ function Main {
    
    Write-Host ""
    Write-Host "🦞 OpenClaw installed successfully!" -Level success
+    return $true
 }

-Main
+$installSucceeded = Main
+Complete-Install -Succeeded:$installSucceeded
--- a/scripts/postinstall-bundled-plugins.mjs
+++ b/scripts/postinstall-bundled-plugins.mjs
@@ -187,8 +187,8 @@ function assertSafeInstalledDistPath(relativePath, params) {
  return candidatePath;
 }

-function isStagedRuntimeNodeModulesPath(relativePath) {
-  return /^dist\/extensions\/[^/]+\/node_modules(?:\/|$)/u.test(
+function isStagedRuntimeDependencyPath(relativePath) {
+  return /^dist\/extensions\/[^/]+\/(?:node_modules|\.openclaw-install-stage(?:-[^/]+)?)(?:\/|$)/u.test(
    normalizeRelativePath(relativePath),
  );
 }
@@ -208,7 +208,7 @@ function listInstalledDistFiles(params = {}) {
      continue;
    }
    const relativeCurrentDir = normalizeRelativePath(relative(packageRoot, currentDir));
-    if (isStagedRuntimeNodeModulesPath(relativeCurrentDir)) {
+    if (isStagedRuntimeDependencyPath(relativeCurrentDir)) {
      continue;
    }
    for (const entry of readDir(currentDir, { withFileTypes: true })) {
@@ -247,7 +247,7 @@ function pruneEmptyDistDirectories(params = {}) {

  function prune(currentDir) {
    const relativeCurrentDir = normalizeRelativePath(relative(packageRoot, currentDir));
-    if (isStagedRuntimeNodeModulesPath(relativeCurrentDir)) {
+    if (isStagedRuntimeDependencyPath(relativeCurrentDir)) {
      return;
    }
    for (const entry of readDir(currentDir, { withFileTypes: true })) {
--- a/scripts/qa-otel-smoke.ts
+++ b/scripts/qa-otel-smoke.ts
@@ -80,6 +80,7 @@ type CapturedSpan = {
 const DEFAULT_SCENARIO_ID = "otel-trace-smoke";
 const REQUIRED_SPAN_NAMES = [
  "openclaw.run",
+  "openclaw.harness.run",
  "openclaw.model.call",
  "openclaw.context.assembled",
  "openclaw.message.delivery",
--- a/src/agents/harness/v2.test.ts
+++ b/src/agents/harness/v2.test.ts
@@ -1,5 +1,11 @@
 import type { Api, Model } from "@mariozechner/pi-ai";
-import { describe, expect, it, vi } from "vitest";
+import { afterEach, describe, expect, it, vi } from "vitest";
+import {
+  onInternalDiagnosticEvent,
+  resetDiagnosticEventsForTest,
+  type DiagnosticEventMetadata,
+  type DiagnosticEventPayload,
+} from "../../infra/diagnostic-events.js";
 import type { EmbeddedRunAttemptResult } from "../pi-embedded-runner/run/types.js";
 import type { AgentHarness, AgentHarnessAttemptParams } from "./types.js";
 import type { AgentHarnessV2 } from "./v2.js";
@@ -9,6 +15,7 @@ function createAttemptParams(): AgentHarnessAttemptParams {
  return {
    prompt: "hello",
    sessionId: "session-1",
+    sessionKey: "session-key",
    runId: "run-1",
    sessionFile: "/tmp/session.jsonl",
    workspaceDir: "/tmp/workspace",
@@ -19,9 +26,19 @@ function createAttemptParams(): AgentHarnessAttemptParams {
    authStorage: {} as never,
    modelRegistry: {} as never,
    thinkLevel: "low",
+    messageChannel: "qa",
+    trigger: "manual",
  } as AgentHarnessAttemptParams;
 }

+function createDiagnosticTrace() {
+  return {
+    traceId: "11111111111111111111111111111111",
+    spanId: "2222222222222222",
+    traceFlags: "01",
+  };
+}
+
 function createAttemptResult(): EmbeddedRunAttemptResult {
  return {
    aborted: false,
@@ -32,6 +49,7 @@ function createAttemptResult(): EmbeddedRunAttemptResult {
    promptError: null,
    promptErrorSource: null,
    sessionIdUsed: "session-1",
+    diagnosticTrace: createDiagnosticTrace(),
    messagesSnapshot: [],
    assistantTexts: ["ok"],
    toolMetas: [],
@@ -46,7 +64,28 @@ function createAttemptResult(): EmbeddedRunAttemptResult {
  };
 }

+async function flushDiagnosticEvents(): Promise<void> {
+  await new Promise<void>((resolve) => setImmediate(resolve));
+}
+
+function captureDiagnosticEvents(): {
+  events: Array<{ event: DiagnosticEventPayload; metadata: DiagnosticEventMetadata }>;
+  unsubscribe: () => void;
+} {
+  const events: Array<{ event: DiagnosticEventPayload; metadata: DiagnosticEventMetadata }> = [];
+  const unsubscribe = onInternalDiagnosticEvent((event, metadata) => {
+    if (event.type.startsWith("harness.run.")) {
+      events.push({ event, metadata });
+    }
+  });
+  return { events, unsubscribe };
+}
+
 describe("AgentHarness V2 compatibility adapter", () => {
+  afterEach(() => {
+    resetDiagnosticEventsForTest();
+  });
+
  it("executes prepare/start/send/outcome/cleanup as one bounded lifecycle", async () => {
    const params = createAttemptParams();
    const result = createAttemptResult();
@@ -102,6 +141,112 @@ describe("AgentHarness V2 compatibility adapter", () => {
    ]);
  });

+  it("emits trusted harness lifecycle diagnostics for successful attempts", async () => {
+    resetDiagnosticEventsForTest();
+    const params = createAttemptParams();
+    const result = {
+      ...createAttemptResult(),
+      agentHarnessResultClassification: "reasoning-only",
+      yieldDetected: true,
+      itemLifecycle: { startedCount: 3, completedCount: 2, activeCount: 1 },
+    } as EmbeddedRunAttemptResult;
+    const harness: AgentHarnessV2 = {
+      id: "codex",
+      label: "Codex",
+      pluginId: "codex-plugin",
+      supports: () => ({ supported: true }),
+      prepare: async () => ({
+        harnessId: "codex",
+        label: "Codex",
+        pluginId: "codex-plugin",
+        params,
+        lifecycleState: "prepared",
+      }),
+      start: async (prepared) => ({ ...prepared, lifecycleState: "started" }),
+      send: async () => result,
+      resolveOutcome: async (_session, rawResult) => rawResult,
+      cleanup: async () => {},
+    };
+    const diagnostics = captureDiagnosticEvents();
+    try {
+      await runAgentHarnessV2LifecycleAttempt(harness, params);
+      await flushDiagnosticEvents();
+    } finally {
+      diagnostics.unsubscribe();
+    }
+
+    expect(diagnostics.events.map(({ event }) => event.type)).toEqual([
+      "harness.run.started",
+      "harness.run.completed",
+    ]);
+    expect(diagnostics.events.every(({ metadata }) => metadata.trusted)).toBe(true);
+    expect(diagnostics.events[1]?.event).toMatchObject({
+      type: "harness.run.completed",
+      runId: "run-1",
+      sessionKey: "session-key",
+      sessionId: "session-1",
+      provider: "codex",
+      model: "gpt-5.4",
+      channel: "qa",
+      trigger: "manual",
+      harnessId: "codex",
+      pluginId: "codex-plugin",
+      outcome: "completed",
+      resultClassification: "reasoning-only",
+      yieldDetected: true,
+      itemLifecycle: { startedCount: 3, completedCount: 2, activeCount: 1 },
+      durationMs: expect.any(Number),
+    });
+  });
+
+  it("emits trusted harness error diagnostics with the failing lifecycle phase", async () => {
+    resetDiagnosticEventsForTest();
+    const params = createAttemptParams();
+    const sendError = new Error("codex app-server send failed");
+    const harness: AgentHarnessV2 = {
+      id: "codex",
+      label: "Codex",
+      supports: () => ({ supported: true }),
+      prepare: async () => ({
+        harnessId: "codex",
+        label: "Codex",
+        params,
+        lifecycleState: "prepared",
+      }),
+      start: async (prepared) => ({ ...prepared, lifecycleState: "started" }),
+      send: async () => {
+        throw sendError;
+      },
+      resolveOutcome: async (_session, rawResult) => rawResult,
+      cleanup: async () => {
+        throw new Error("cleanup failed");
+      },
+    };
+    const diagnostics = captureDiagnosticEvents();
+    try {
+      await expect(runAgentHarnessV2LifecycleAttempt(harness, params)).rejects.toThrow(
+        "codex app-server send failed",
+      );
+      await flushDiagnosticEvents();
+    } finally {
+      diagnostics.unsubscribe();
+    }
+
+    expect(diagnostics.events.map(({ event }) => event.type)).toEqual([
+      "harness.run.started",
+      "harness.run.error",
+    ]);
+    expect(diagnostics.events.every(({ metadata }) => metadata.trusted)).toBe(true);
+    expect(diagnostics.events[1]?.event).toMatchObject({
+      type: "harness.run.error",
+      phase: "send",
+      errorCategory: "Error",
+      cleanupFailed: true,
+      harnessId: "codex",
+      durationMs: expect.any(Number),
+    });
+  });
+
  it("runs cleanup with the original failure and preserves that failure", async () => {
    const params = createAttemptParams();
    const sendError = new Error("codex app-server send failed");
--- a/src/agents/harness/v2.ts
+++ b/src/agents/harness/v2.ts
@@ -1,3 +1,10 @@
+import { diagnosticErrorCategory } from "../../infra/diagnostic-error-metadata.js";
+import {
+  emitTrustedDiagnosticEvent,
+  type DiagnosticHarnessRunErrorEvent,
+  type DiagnosticHarnessRunOutcome,
+} from "../../infra/diagnostic-events.js";
+import type { DiagnosticTraceContext } from "../../infra/diagnostic-trace-context.js";
 import { formatErrorMessage } from "../../infra/errors.js";
 import { createSubsystemLogger } from "../../logging/subsystem.js";
 import { applyAgentHarnessResultClassification } from "./result-classification.js";
@@ -13,6 +20,7 @@ import type {
 } from "./types.js";

 const log = createSubsystemLogger("agents/harness/v2");
+type AgentHarnessV2LifecyclePhase = DiagnosticHarnessRunErrorEvent["phase"];

 type AgentHarnessV2RunBase = {
  harnessId: string;
@@ -95,6 +103,87 @@ export function adaptAgentHarnessToV2(harness: AgentHarness): AgentHarnessV2 {
  };
 }

+function agentHarnessDiagnosticBase(
+  harness: AgentHarnessV2,
+  params: AgentHarnessAttemptParams,
+  trace?: DiagnosticTraceContext,
+) {
+  return {
+    runId: params.runId,
+    sessionId: params.sessionId,
+    provider: params.provider,
+    model: params.modelId,
+    harnessId: harness.id,
+    ...(harness.pluginId ? { pluginId: harness.pluginId } : {}),
+    ...(params.sessionKey ? { sessionKey: params.sessionKey } : {}),
+    ...(params.trigger ? { trigger: params.trigger } : {}),
+    ...(params.messageChannel ? { channel: params.messageChannel } : {}),
+    ...(trace ? { trace } : {}),
+  };
+}
+
+function agentHarnessRunOutcome(result: AgentHarnessAttemptResult): DiagnosticHarnessRunOutcome {
+  if (result.promptError) {
+    return "error";
+  }
+  if (result.externalAbort || result.aborted) {
+    return "aborted";
+  }
+  if (result.timedOut || result.idleTimedOut || result.timedOutDuringCompaction) {
+    return "timed_out";
+  }
+  return "completed";
+}
+
+function emitAgentHarnessRunStarted(
+  harness: AgentHarnessV2,
+  params: AgentHarnessAttemptParams,
+): void {
+  emitTrustedDiagnosticEvent({
+    type: "harness.run.started",
+    ...agentHarnessDiagnosticBase(harness, params),
+  });
+}
+
+function emitAgentHarnessRunCompleted(params: {
+  harness: AgentHarnessV2;
+  attemptParams: AgentHarnessAttemptParams;
+  result: AgentHarnessAttemptResult;
+  startedAt: number;
+}): void {
+  const { harness, attemptParams, result, startedAt } = params;
+  emitTrustedDiagnosticEvent({
+    type: "harness.run.completed",
+    ...agentHarnessDiagnosticBase(harness, attemptParams, result.diagnosticTrace),
+    durationMs: Date.now() - startedAt,
+    outcome: agentHarnessRunOutcome(result),
+    ...(result.agentHarnessResultClassification
+      ? { resultClassification: result.agentHarnessResultClassification }
+      : {}),
+    ...(typeof result.yieldDetected === "boolean" ? { yieldDetected: result.yieldDetected } : {}),
+    ...(result.itemLifecycle ? { itemLifecycle: { ...result.itemLifecycle } } : {}),
+  });
+}
+
+function emitAgentHarnessRunError(params: {
+  harness: AgentHarnessV2;
+  attemptParams: AgentHarnessAttemptParams;
+  startedAt: number;
+  phase: AgentHarnessV2LifecyclePhase;
+  error: unknown;
+  cleanupFailed?: boolean;
+}): void {
+  const { harness, attemptParams, startedAt, phase, error, cleanupFailed } = params;
+  emitTrustedDiagnosticEvent({
+    type: "harness.run.error",
+    ...agentHarnessDiagnosticBase(harness, attemptParams),
+    durationMs: Date.now() - startedAt,
+    phase,
+    errorCategory: diagnosticErrorCategory(error),
+    ...(cleanupFailed ? { cleanupFailed: true } : {}),
+  });
+}
+
 export async function runAgentHarnessV2LifecycleAttempt(
  harness: AgentHarnessV2,
  params: AgentHarnessAttemptParams,
@@ -103,13 +192,21 @@ export async function runAgentHarnessV2LifecycleAttempt(
  let session: AgentHarnessV2Session | undefined;
  let rawResult: AgentHarnessAttemptResult | undefined;
  let result: AgentHarnessAttemptResult;
+  let phase: AgentHarnessV2LifecyclePhase = "prepare";
+  const startedAt = Date.now();

+  emitAgentHarnessRunStarted(harness, params);
  try {
+    phase = "prepare";
    prepared = await harness.prepare(params);
+    phase = "start";
    session = await harness.start(prepared);
+    phase = "send";
    rawResult = await harness.send(session);
+    phase = "resolve";
    result = await harness.resolveOutcome(session, rawResult);
  } catch (error) {
+    let cleanupFailed = false;
    try {
      await harness.cleanup({
        prepared,
@@ -118,6 +215,7 @@ export async function runAgentHarnessV2LifecycleAttempt(
        ...(rawResult === undefined ? {} : { result: rawResult }),
      });
    } catch (cleanupError) {
+      cleanupFailed = true;
      // Preserve the user-visible harness failure. Cleanup errors after a
      // failed lifecycle stage must not mask the actionable runtime error.
      log.warn("agent harness cleanup failed after attempt failure", {
@@ -128,9 +226,30 @@ export async function runAgentHarnessV2LifecycleAttempt(
        originalError: formatErrorMessage(error),
      });
    }
+    emitAgentHarnessRunError({
+      harness,
+      attemptParams: params,
+      startedAt,
+      phase,
+      error,
+      cleanupFailed,
+    });
    throw error;
  }

-  await harness.cleanup({ prepared, session, result });
+  try {
+    phase = "cleanup";
+    await harness.cleanup({ prepared, session, result });
+  } catch (error) {
+    emitAgentHarnessRunError({
+      harness,
+      attemptParams: params,
+      startedAt,
+      phase,
+      error,
+    });
+    throw error;
+  }
+  emitAgentHarnessRunCompleted({ harness, attemptParams: params, result, startedAt });
  return result;
 }
--- a/src/cli/capability-cli.test.ts
+++ b/src/cli/capability-cli.test.ts
@@ -162,10 +162,13 @@ vi.mock("../agents/memory-search.js", () => ({
    mocks.resolveMemorySearchConfig as typeof import("../agents/memory-search.js").resolveMemorySearchConfig,
 }));

-vi.mock("../commands/models.js", () => ({
+vi.mock("../commands/models/auth.js", () => ({
  modelsAuthLoginCommand: vi.fn(),
+}));
+
+vi.mock("../commands/models/list.js", () => ({
  modelsStatusCommand:
-    mocks.modelsStatusCommand as typeof import("../commands/models.js").modelsStatusCommand,
+    mocks.modelsStatusCommand as typeof import("../commands/models/list.js").modelsStatusCommand,
 }));

 vi.mock("../gateway/call.js", () => ({
--- a/src/cli/capability-cli.ts
+++ b/src/cli/capability-cli.ts
@@ -13,7 +13,7 @@ import {
 import { updateAuthProfileStoreWithLock } from "../agents/auth-profiles/store.js";
 import { resolveMemorySearchConfig } from "../agents/memory-search.js";
 import { loadModelCatalog } from "../agents/model-catalog.js";
-import { modelsAuthLoginCommand, modelsStatusCommand } from "../commands/models.js";
+import { modelsStatusCommand } from "../commands/models/list.js";
 import { loadConfig } from "../config/config.js";
 import { resolveAgentModelPrimaryValue } from "../config/model-input.js";
 import type { OpenClawConfig } from "../config/types.openclaw.js";
@@ -1522,6 +1522,7 @@ export function registerCapabilityCli(program: Command) {
    .requiredOption("--provider <id>", "Provider id")
    .action(async (opts) => {
      await runCommandWithRuntime(defaultRuntime, async () => {
+        const { modelsAuthLoginCommand } = await import("../commands/models/auth.js");
        await modelsAuthLoginCommand({ provider: String(opts.provider) }, defaultRuntime);
      });
    });
--- a/src/cli/daemon-cli/install.test.ts
+++ b/src/cli/daemon-cli/install.test.ts
@@ -361,6 +361,89 @@ describe("runDaemonInstall", () => {
    expect(actionState.emitted.at(-1)).toMatchObject({ result: "already-installed" });
  });

+  it("reinstalls when the loaded service still embeds OPENCLAW_GATEWAY_TOKEN", async () => {
+    service.isLoaded.mockResolvedValue(true);
+    service.readCommand.mockResolvedValue({
+      programArguments: ["openclaw", "gateway", "run"],
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "stale-service-token",
+      },
+    } as never);
+
+    await runDaemonInstall({ json: true });
+
+    expect(installDaemonServiceAndEmitMock).toHaveBeenCalledTimes(1);
+    expect(actionState.warnings).toContain(
+      "Gateway service OPENCLAW_GATEWAY_TOKEN differs from the current install plan; refreshing the install.",
+    );
+  });
+
+  it("returns already-installed when the embedded gateway token matches the install plan", async () => {
+    service.isLoaded.mockResolvedValue(true);
+    service.readCommand.mockResolvedValue({
+      programArguments: ["openclaw", "gateway", "run"],
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "durable-token",
+      },
+    } as never);
+    buildGatewayInstallPlanMock.mockResolvedValueOnce({
+      programArguments: ["openclaw", "gateway", "run"],
+      workingDirectory: "/tmp",
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "durable-token",
+      },
+    });
+
+    await runDaemonInstall({ json: true });
+
+    expect(buildGatewayInstallPlanMock).toHaveBeenCalledTimes(1);
+    expect(writeConfigFileMock).not.toHaveBeenCalled();
+    expect(installDaemonServiceAndEmitMock).not.toHaveBeenCalled();
+    expect(actionState.emitted.at(-1)).toMatchObject({ result: "already-installed" });
+  });
+
+  it("reinstalls when the embedded gateway token differs from the install plan", async () => {
+    service.isLoaded.mockResolvedValue(true);
+    service.readCommand.mockResolvedValue({
+      programArguments: ["openclaw", "gateway", "run"],
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "stale-service-token",
+      },
+    } as never);
+    buildGatewayInstallPlanMock.mockResolvedValueOnce({
+      programArguments: ["openclaw", "gateway", "run"],
+      workingDirectory: "/tmp",
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "fresh-token",
+      },
+    });
+
+    await runDaemonInstall({ json: true });
+
+    expect(installDaemonServiceAndEmitMock).toHaveBeenCalledTimes(1);
+    expect(actionState.warnings).toContain(
+      "Gateway service OPENCLAW_GATEWAY_TOKEN differs from the current install plan; refreshing the install.",
+    );
+  });
+
+  it("does not reinstall when OPENCLAW_GATEWAY_TOKEN comes from an env file", async () => {
+    service.isLoaded.mockResolvedValue(true);
+    service.readCommand.mockResolvedValue({
+      programArguments: ["openclaw", "gateway", "run"],
+      environment: {
+        OPENCLAW_GATEWAY_TOKEN: "env-file-token",
+      },
+      environmentValueSources: {
+        OPENCLAW_GATEWAY_TOKEN: "file",
+      },
+    } as never);
+
+    await runDaemonInstall({ json: true });
+
+    expect(installDaemonServiceAndEmitMock).not.toHaveBeenCalled();
+    expect(actionState.emitted.at(-1)).toMatchObject({ result: "already-installed" });
+  });
+
  it("reinstalls when an existing service is missing the nvm TLS CA bundle", async () => {
    service.isLoaded.mockResolvedValue(true);
    resolveNodeStartupTlsEnvironmentMock.mockReturnValue({
--- a/src/cli/daemon-cli/install.ts
+++ b/src/cli/daemon-cli/install.ts
@@ -3,12 +3,16 @@ import { buildGatewayInstallPlan } from "../../commands/daemon-install-helpers.j
 import {
  DEFAULT_GATEWAY_DAEMON_RUNTIME,
  isGatewayDaemonRuntime,
+  type GatewayDaemonRuntime,
 } from "../../commands/daemon-runtime.js";
 import { resolveGatewayInstallToken } from "../../commands/gateway-install-token.js";
 import { resolveFutureConfigActionBlock } from "../../config/future-version-guard.js";
 import { readConfigFileSnapshotForWrite } from "../../config/io.js";
 import { resolveGatewayPort } from "../../config/paths.js";
+import type { OpenClawConfig } from "../../config/types.js";
+import { readEmbeddedGatewayToken } from "../../daemon/service-audit.js";
 import { resolveGatewayService } from "../../daemon/service.js";
+import type { GatewayServiceCommandConfig } from "../../daemon/service.js";
 import { isNonFatalSystemdInstallProbeError } from "../../daemon/systemd.js";
 import {
  isDangerousHostEnvOverrideVarName,
@@ -16,6 +20,7 @@ import {
  normalizeEnvVarKey,
 } from "../../infra/host-env-security.js";
 import { defaultRuntime } from "../../runtime.js";
+import { normalizeOptionalString } from "../../shared/string-coerce.js";
 import { formatCliCommand } from "../command-format.js";
 import { buildDaemonServiceSnapshot, installDaemonServiceAndEmit } from "./response.js";
 import {
@@ -98,6 +103,7 @@ export async function runDaemonInstall(opts: DaemonInstallOptions) {
  const service = resolveGatewayService();
  let loaded = false;
  let existingServiceEnv: Record<string, string> | undefined;
+  let existingServiceCommand: GatewayServiceCommandConfig | null = null;
  try {
    loaded = await service.isLoaded({ env: process.env });
  } catch (err) {
@@ -109,7 +115,8 @@ export async function runDaemonInstall(opts: DaemonInstallOptions) {
    }
  }
  if (loaded) {
-    existingServiceEnv = (await service.readCommand(process.env).catch(() => null))?.environment;
+    existingServiceCommand = await service.readCommand(process.env).catch(() => null);
+    existingServiceEnv = existingServiceCommand?.environment;
  }
  const installEnv = mergeInstallInvocationEnv({
    env: process.env,
@@ -117,12 +124,20 @@ export async function runDaemonInstall(opts: DaemonInstallOptions) {
  });
  if (loaded) {
    if (!opts.force) {
-      if (await gatewayServiceNeedsAutoNodeExtraCaCertsRefresh({ service, env: process.env })) {
-        const message = "Gateway service is missing the nvm TLS CA bundle; refreshing the install.";
+      const autoRefreshMessage = await getGatewayServiceAutoRefreshMessage({
+        currentCommand: existingServiceCommand,
+        env: process.env,
+        installEnv,
+        port,
+        runtime: runtimeRaw,
+        existingEnvironment: existingServiceEnv,
+        config: cfg,
+      });
+      if (autoRefreshMessage) {
        if (json) {
-          warnings.push(message);
+          warnings.push(autoRefreshMessage);
        } else {
-          defaultRuntime.log(message);
+          defaultRuntime.log(autoRefreshMessage);
        }
      } else {
        emit({
@@ -196,18 +211,40 @@ export async function runDaemonInstall(opts: DaemonInstallOptions) {
  });
 }

-async function gatewayServiceNeedsAutoNodeExtraCaCertsRefresh(params: {
-  service: ReturnType<typeof resolveGatewayService>;
+async function getGatewayServiceAutoRefreshMessage(params: {
+  currentCommand: GatewayServiceCommandConfig | null;
  env: Record<string, string | undefined>;
-}): Promise<boolean> {
+  installEnv: NodeJS.ProcessEnv;
+  port: number;
+  runtime: GatewayDaemonRuntime;
+  existingEnvironment?: Record<string, string | undefined>;
+  config: OpenClawConfig;
+}): Promise<string | undefined> {
  try {
-    const currentCommand = await params.service.readCommand(params.env);
+    const currentCommand = params.currentCommand;
    if (!currentCommand) {
-      return false;
+      return undefined;
+    }
+    const currentEmbeddedToken = readEmbeddedGatewayToken(currentCommand);
+    if (currentEmbeddedToken) {
+      const plannedInstall = await buildGatewayInstallPlan({
+        env: params.installEnv,
+        port: params.port,
+        runtime: params.runtime,
+        existingEnvironment: params.existingEnvironment,
+        warn: () => undefined,
+        config: params.config,
+      });
+      const plannedEmbeddedToken = normalizeOptionalString(
+        plannedInstall.environment.OPENCLAW_GATEWAY_TOKEN,
+      );
+      if (currentEmbeddedToken !== plannedEmbeddedToken) {
+        return "Gateway service OPENCLAW_GATEWAY_TOKEN differs from the current install plan; refreshing the install.";
+      }
    }
    const currentExecPath = currentCommand.programArguments[0]?.trim();
    if (!currentExecPath) {
-      return false;
+      return undefined;
    }
    const currentEnvironment = currentCommand.environment ?? {};
    const currentNodeExtraCaCerts = currentEnvironment.NODE_EXTRA_CA_CERTS?.trim();
@@ -221,10 +258,13 @@ async function gatewayServiceNeedsAutoNodeExtraCaCertsRefresh(params: {
      includeDarwinDefaults: false,
    }).NODE_EXTRA_CA_CERTS;
    if (!expectedNodeExtraCaCerts) {
-      return false;
+      return undefined;
    }
-    return currentNodeExtraCaCerts !== expectedNodeExtraCaCerts;
+    if (currentNodeExtraCaCerts !== expectedNodeExtraCaCerts) {
+      return "Gateway service is missing the nvm TLS CA bundle; refreshing the install.";
+    }
+    return undefined;
  } catch {
-    return false;
+    return undefined;
  }
 }
--- a/src/cli/model-auth-runtime-boundary.test.ts
+++ b/src/cli/model-auth-runtime-boundary.test.ts
@@ -0,0 +1,16 @@
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+import { describe, expect, it } from "vitest";
+
+const repoRoot = fileURLToPath(new URL("../..", import.meta.url));
+
+describe("model auth runtime boundary", () => {
+  it("keeps capability CLI command registration off the models auth runtime", () => {
+    const source = fs.readFileSync(path.join(repoRoot, "src/cli/capability-cli.ts"), "utf8");
+
+    expect(source).not.toMatch(/\bfrom\s+["'][^"']*commands\/models\.js["']/);
+    expect(source).not.toMatch(/\bfrom\s+["'][^"']*commands\/models\/auth\.js["']/);
+    expect(source).toMatch(/\bawait\s+import\(["'][^"']*commands\/models\/auth\.js["']\)/);
+  });
+});
--- a/src/commands/status.summary.test.ts
+++ b/src/commands/status.summary.test.ts
@@ -58,6 +58,7 @@ vi.mock("../infra/system-events.js", () => ({
 }));

 vi.mock("../tasks/task-registry.maintenance.js", () => ({
+  configureTaskRegistryMaintenance: vi.fn(),
  getInspectableTaskRegistrySummary: vi.fn(() => ({
    total: 0,
    active: 0,
--- a/src/commands/status.summary.ts
+++ b/src/commands/status.summary.ts
@@ -4,6 +4,7 @@ import { resolveStorePath } from "../config/sessions/paths.js";
 import { readSessionStoreReadOnly } from "../config/sessions/store-read.js";
 import { resolveSessionTotalTokens, type SessionEntry } from "../config/sessions/types.js";
 import type { OpenClawConfig } from "../config/types.js";
+import { resolveCronStorePath } from "../cron/store.js";
 import { listGatewayAgentsBasic } from "../gateway/agent-list.js";
 import { resolveHeartbeatSummaryForAgent } from "../infra/heartbeat-summary.js";
 import { peekSystemEvents } from "../infra/system-events.js";
@@ -151,6 +152,9 @@ export async function getStatusSummary(
  const mainSessionKey = resolveMainSessionKey(cfg);
  const queuedSystemEvents = peekSystemEvents(mainSessionKey);
  const taskMaintenanceModule = await loadTaskRegistryMaintenanceModule();
+  taskMaintenanceModule.configureTaskRegistryMaintenance({
+    cronStorePath: resolveCronStorePath(cfg.cron?.store),
+  });
  const tasks = taskMaintenanceModule.getInspectableTaskRegistrySummary();
  const taskAudit = taskMaintenanceModule.getInspectableTaskAuditSummary();

--- a/src/commands/tasks.ts
+++ b/src/commands/tasks.ts
@@ -1,3 +1,5 @@
+import { loadConfig } from "../config/config.js";
+import { resolveCronStorePath } from "../cron/store.js";
 import type { RuntimeEnv } from "../runtime.js";
 import { normalizeOptionalString } from "../shared/string-coerce.js";
 import { getTaskById, updateTaskNotifyPolicyById } from "../tasks/runtime-internal.js";
@@ -24,6 +26,7 @@ import { compareTaskAuditFindingSortKeys } from "../tasks/task-registry.audit.sh
 import {
  getInspectableTaskAuditSummary,
  getInspectableTaskRegistrySummary,
+  configureTaskRegistryMaintenance,
  previewTaskRegistryMaintenance,
  runTaskRegistryMaintenance,
 } from "../tasks/task-registry.maintenance.js";
@@ -44,10 +47,16 @@ const RUN_PAD = 10;
 const info = theme.info;

 async function loadTaskCancelConfig() {
-  const { loadConfig } = await import("../config/config.js");
  return loadConfig();
 }

+function configureTaskMaintenanceFromConfig(): void {
+  const cfg = loadConfig();
+  configureTaskRegistryMaintenance({
+    cronStorePath: resolveCronStorePath(cfg.cron?.store),
+  });
+}
+
 function truncate(value: string, maxChars: number) {
  if (value.length <= maxChars) {
    return value;
@@ -417,6 +426,7 @@ export async function tasksAuditCommand(
  },
  runtime: RuntimeEnv,
 ) {
+  configureTaskMaintenanceFromConfig();
  const severityFilter = opts.severity?.trim() as TaskSystemAuditSeverity | undefined;
  const codeFilter = opts.code?.trim() as TaskSystemAuditCode | undefined;
  const { allFindings, filteredFindings, taskFindings, summary } = toSystemAuditFindings({
@@ -491,6 +501,7 @@ export async function tasksMaintenanceCommand(
  opts: { json?: boolean; apply?: boolean },
  runtime: RuntimeEnv,
 ) {
+  configureTaskMaintenanceFromConfig();
  const auditBefore = getInspectableTaskAuditSummary();
  const flowAuditBefore = getInspectableTaskFlowAuditSummary();
  const taskMaintenance = opts.apply
--- a/src/cron/run-log.test.ts
+++ b/src/cron/run-log.test.ts
@@ -9,6 +9,7 @@ import {
  getPendingCronRunLogWriteCountForTests,
  readCronRunLogEntries,
  readCronRunLogEntriesPage,
+  readCronRunLogEntriesSync,
  resolveCronRunLogPruneOptions,
  resolveCronRunLogPath,
 } from "./run-log.js";
@@ -96,6 +97,36 @@ describe("cron run log", () => {
    });
  });

+  it("reads run-log entries synchronously for task reconciliation", async () => {
+    await withRunLogDir("openclaw-cron-log-sync-", async (dir) => {
+      const logPath = path.join(dir, "runs", "job-1.jsonl");
+      await appendCronRunLog(logPath, {
+        ts: 1000,
+        jobId: "job-1",
+        action: "finished",
+        status: "ok",
+        runAtMs: 900,
+        durationMs: 100,
+      });
+      await appendCronRunLog(logPath, {
+        ts: 2000,
+        jobId: "job-2",
+        action: "finished",
+        status: "error",
+      });
+
+      expect(readCronRunLogEntriesSync(logPath, { jobId: "job-1" })).toEqual([
+        expect.objectContaining({
+          jobId: "job-1",
+          status: "ok",
+          runAtMs: 900,
+          durationMs: 100,
+        }),
+      ]);
+      expect(readCronRunLogEntriesSync(path.join(dir, "runs", "missing.jsonl"))).toEqual([]);
+    });
+  });
+
  it.skipIf(process.platform === "win32")(
    "writes run log files with secure permissions",
    async () => {
--- a/src/cron/run-log.ts
+++ b/src/cron/run-log.ts
@@ -1,4 +1,5 @@
 import { randomBytes } from "node:crypto";
+import fsSync from "node:fs";
 import fs from "node:fs/promises";
 import path from "node:path";
 import { parseByteSize } from "../cli/parse-bytes.js";
@@ -198,6 +199,23 @@ export async function readCronRunLogEntries(
  return page.entries.toReversed();
 }

+export function readCronRunLogEntriesSync(
+  filePath: string,
+  opts?: { limit?: number; jobId?: string },
+): CronRunLogEntry[] {
+  const limit = Math.max(1, Math.min(5000, Math.floor(opts?.limit ?? 200)));
+  let raw: string;
+  try {
+    raw = fsSync.readFileSync(path.resolve(filePath), "utf-8");
+  } catch (error) {
+    if (typeof error === "object" && error !== null && "code" in error && error.code === "ENOENT") {
+      return [];
+    }
+    throw error;
+  }
+  return parseAllRunLogEntries(raw, { jobId: opts?.jobId }).slice(-limit);
+}
+
 function normalizeRunStatusFilter(status?: string): CronRunLogStatusFilter {
  if (status === "ok" || status === "error" || status === "skipped" || status === "all") {
    return status;
--- a/src/cron/store.test.ts
+++ b/src/cron/store.test.ts
@@ -3,7 +3,7 @@ import os from "node:os";
 import path from "node:path";
 import { setTimeout as scheduleNativeTimeout } from "node:timers";
 import { afterAll, afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
-import { loadCronStore, resolveCronStorePath, saveCronStore } from "./store.js";
+import { loadCronStore, loadCronStoreSync, resolveCronStorePath, saveCronStore } from "./store.js";
 import type { CronStoreFile } from "./types.js";

 let fixtureRoot = "";
@@ -125,6 +125,19 @@ describe("cron store", () => {
    });
  });

+  it("loads split cron state synchronously for task reconciliation", async () => {
+    const { storePath } = await makeStorePath();
+    await saveCronStore(storePath, makeStore("job-sync", true));
+
+    const loaded = loadCronStoreSync(storePath);
+
+    expect(loaded.jobs[0]).toMatchObject({
+      id: "job-sync",
+      state: expect.any(Object),
+      updatedAtMs: expect.any(Number),
+    });
+  });
+
  it("does not create a backup file when saving unchanged content", async () => {
    const store = await makeStorePath();
    const payload = makeStore("job-1", true);
--- a/src/cron/store.ts
+++ b/src/cron/store.ts
@@ -114,6 +114,39 @@ async function loadStateFile(statePath: string): Promise<CronStateFile | null> {
  }
 }

+function loadStateFileSync(statePath: string): CronStateFile | null {
+  let raw: string;
+  try {
+    raw = fs.readFileSync(statePath, "utf-8");
+  } catch (err) {
+    if ((err as { code?: unknown })?.code === "ENOENT") {
+      return null;
+    }
+    throw new Error(`Failed to read cron state at ${statePath}: ${String(err)}`, {
+      cause: err,
+    });
+  }
+
+  try {
+    const parsed = parseJsonWithJson5Fallback(raw);
+    if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) {
+      return null;
+    }
+    const record = parsed as Record<string, unknown>;
+    if (
+      record.version !== 1 ||
+      typeof record.jobs !== "object" ||
+      record.jobs === null ||
+      Array.isArray(record.jobs)
+    ) {
+      return null;
+    }
+    return { version: 1, jobs: record.jobs as Record<string, CronStateFileEntry> };
+  } catch {
+    return null;
+  }
+}
+
 function hasInlineState(jobs: Array<Record<string, unknown> | null | undefined>): boolean {
  return jobs.some(
    (job) =>
@@ -219,6 +252,60 @@ export async function loadCronStore(storePath: string): Promise<CronStoreFile> {
  }
 }

+export function loadCronStoreSync(storePath: string): CronStoreFile {
+  try {
+    const raw = fs.readFileSync(storePath, "utf-8");
+    let parsed: unknown;
+    try {
+      parsed = parseJsonWithJson5Fallback(raw);
+    } catch (err) {
+      throw new Error(`Failed to parse cron store at ${storePath}: ${String(err)}`, {
+        cause: err,
+      });
+    }
+    const parsedRecord =
+      parsed && typeof parsed === "object" && !Array.isArray(parsed)
+        ? (parsed as Record<string, unknown>)
+        : {};
+    const jobs = Array.isArray(parsedRecord.jobs) ? (parsedRecord.jobs as never[]) : [];
+    const store = {
+      version: 1 as const,
+      jobs: jobs.filter(Boolean) as never as CronStoreFile["jobs"],
+    };
+
+    const stateFile = loadStateFileSync(resolveStatePath(storePath));
+    const hasLegacyInlineState =
+      !stateFile && hasInlineState(jobs as unknown as Array<Record<string, unknown>>);
+
+    if (stateFile) {
+      for (const job of store.jobs) {
+        const entry = stateFile.jobs[job.id];
+        if (entry) {
+          job.updatedAtMs = resolveUpdatedAtMs(job, entry.updatedAtMs);
+          job.state = (entry.state ?? {}) as never;
+        } else {
+          backfillMissingRuntimeFields(job);
+        }
+      }
+    } else if (!hasLegacyInlineState) {
+      for (const job of store.jobs) {
+        backfillMissingRuntimeFields(job);
+      }
+    }
+
+    for (const job of store.jobs) {
+      ensureJobStateObject(job);
+    }
+
+    return store;
+  } catch (err) {
+    if ((err as { code?: unknown })?.code === "ENOENT") {
+      return { version: 1, jobs: [] };
+    }
+    throw err;
+  }
+}
+
 type SaveCronStoreOptions = {
  skipBackup?: boolean;
 };
--- a/src/gateway/server-methods/devices.test.ts
+++ b/src/gateway/server-methods/devices.test.ts
@@ -181,7 +181,10 @@ describe("deviceHandlers", () => {
  });

  it("disconnects active clients after revoking a device token", async () => {
-    revokeDeviceTokenMock.mockResolvedValue({ role: "operator", revokedAtMs: 456 });
+    revokeDeviceTokenMock.mockResolvedValue({
+      ok: true,
+      entry: { role: "operator", revokedAtMs: 456 },
+    });
    const opts = createOptions("device.token.revoke", {
      deviceId: " device-1 ",
      role: " operator ",
@@ -193,6 +196,7 @@ describe("deviceHandlers", () => {
    expect(revokeDeviceTokenMock).toHaveBeenCalledWith({
      deviceId: " device-1 ",
      role: " operator ",
+      callerScopes: [],
    });
    expect(opts.context.disconnectClientsForDevice).toHaveBeenCalledWith("device-1", {
      role: "operator",
@@ -205,7 +209,10 @@ describe("deviceHandlers", () => {
  });

  it("allows admin-scoped callers to revoke another device's token", async () => {
-    revokeDeviceTokenMock.mockResolvedValue({ role: "operator", revokedAtMs: 456 });
+    revokeDeviceTokenMock.mockResolvedValue({
+      ok: true,
+      entry: { role: "operator", revokedAtMs: 456 },
+    });
    const opts = createOptions(
      "device.token.revoke",
      { deviceId: "device-2", role: "operator" },
@@ -217,6 +224,7 @@ describe("deviceHandlers", () => {
    expect(revokeDeviceTokenMock).toHaveBeenCalledWith({
      deviceId: "device-2",
      role: "operator",
+      callerScopes: ["operator.admin"],
    });
    expect(opts.respond).toHaveBeenCalledWith(
      true,
@@ -226,7 +234,10 @@ describe("deviceHandlers", () => {
  });

  it("treats normalized device ids as self-owned for token revocation", async () => {
-    revokeDeviceTokenMock.mockResolvedValue({ role: "operator", revokedAtMs: 456 });
+    revokeDeviceTokenMock.mockResolvedValue({
+      ok: true,
+      entry: { role: "operator", revokedAtMs: 456 },
+    });
    const opts = createOptions(
      "device.token.revoke",
      { deviceId: " device-1 ", role: "operator" },
@@ -238,6 +249,7 @@ describe("deviceHandlers", () => {
    expect(revokeDeviceTokenMock).toHaveBeenCalledWith({
      deviceId: " device-1 ",
      role: "operator",
+      callerScopes: ["operator.pairing"],
    });
    expect(opts.respond).toHaveBeenCalledWith(
      true,
@@ -272,6 +284,7 @@ describe("deviceHandlers", () => {
      deviceId: " device-1 ",
      role: " operator ",
      scopes: ["operator.pairing"],
+      callerScopes: ["operator.pairing"],
    });
    expect(opts.context.disconnectClientsForDevice).toHaveBeenCalledWith("device-1", {
      role: "operator",
@@ -308,6 +321,7 @@ describe("deviceHandlers", () => {
      deviceId: " device-1 ",
      role: "operator",
      scopes: ["operator.pairing"],
+      callerScopes: ["operator.pairing"],
    });
    expect(opts.respond).toHaveBeenCalledWith(
      true,
@@ -324,6 +338,7 @@ describe("deviceHandlers", () => {

  it("rejects rotating a token for a role that was never approved", async () => {
    mockPairedOperatorDevice();
+    rotateDeviceTokenMock.mockResolvedValue({ ok: false, reason: "unknown-device-or-role" });
    const opts = createOptions(
      "device.token.rotate",
      {
@@ -341,7 +356,12 @@ describe("deviceHandlers", () => {

    await deviceHandlers["device.token.rotate"](opts);

-    expect(rotateDeviceTokenMock).not.toHaveBeenCalled();
+    expect(rotateDeviceTokenMock).toHaveBeenCalledWith({
+      deviceId: "device-1",
+      role: "node",
+      scopes: undefined,
+      callerScopes: ["operator.pairing"],
+    });
    expect(opts.context.disconnectClientsForDevice).not.toHaveBeenCalled();
    expect(opts.respond).toHaveBeenCalledWith(
      false,
@@ -351,7 +371,7 @@ describe("deviceHandlers", () => {
  });

  it("does not disconnect clients when token revocation fails", async () => {
-    revokeDeviceTokenMock.mockResolvedValue(null);
+    revokeDeviceTokenMock.mockResolvedValue({ ok: false, reason: "unknown-device-or-role" });
    const opts = createOptions("device.token.revoke", {
      deviceId: "device-1",
      role: "operator",
@@ -363,7 +383,7 @@ describe("deviceHandlers", () => {
    expect(opts.respond).toHaveBeenCalledWith(
      false,
      undefined,
-      expect.objectContaining({ message: "unknown deviceId/role" }),
+      expect.objectContaining({ message: "device token revocation denied" }),
    );
  });

--- a/src/gateway/server-methods/devices.ts
+++ b/src/gateway/server-methods/devices.ts
@@ -1,20 +1,17 @@
 import {
  approveDevicePairing,
  formatDevicePairingForbiddenMessage,
-  getPairedDevice,
  getPendingDevicePairing,
-  listApprovedPairedDeviceRoles,
  listDevicePairing,
  removePairedDevice,
  type DeviceAuthToken,
+  type RevokeDeviceTokenDenyReason,
  type RotateDeviceTokenDenyReason,
  rejectDevicePairing,
  revokeDeviceToken,
  rotateDeviceToken,
  summarizeDeviceTokens,
 } from "../../infra/device-pairing.js";
-import { normalizeDeviceAuthScopes } from "../../shared/device-auth.js";
-import { resolveMissingRequestedScope } from "../../shared/operator-scope-compat.js";
 import {
  ErrorCodes,
  errorShape,
@@ -29,11 +26,7 @@ import {
 import type { GatewayClient, GatewayRequestHandlers } from "./types.js";

 const DEVICE_TOKEN_ROTATION_DENIED_MESSAGE = "device token rotation denied";
-
-type DeviceTokenRotateTarget = {
-  pairedDevice: NonNullable<Awaited<ReturnType<typeof getPairedDevice>>>;
-  normalizedRole: string;
-};
+const DEVICE_TOKEN_REVOCATION_DENIED_MESSAGE = "device token revocation denied";

 type DeviceSessionAuthz = {
  callerDeviceId: string | null;
@@ -62,11 +55,7 @@ function logDeviceTokenRotationDenied(params: {
  log: { warn: (message: string) => void };
  deviceId: string;
  role: string;
-  reason:
-    | RotateDeviceTokenDenyReason
-    | "caller-missing-scope"
-    | "unknown-device-or-role"
-    | "device-ownership-mismatch";
+  reason: RotateDeviceTokenDenyReason | "unknown-device-or-role" | "device-ownership-mismatch";
  scope?: string | null;
 }) {
  const suffix = params.scope ? ` scope=${params.scope}` : "";
@@ -75,23 +64,17 @@ function logDeviceTokenRotationDenied(params: {
  );
 }

-async function loadDeviceTokenRotateTarget(params: {
+function logDeviceTokenRevocationDenied(params: {
+  log: { warn: (message: string) => void };
  deviceId: string;
  role: string;
-  log: { warn: (message: string) => void };
-}): Promise<DeviceTokenRotateTarget | null> {
-  const normalizedRole = params.role.trim();
-  const pairedDevice = await getPairedDevice(params.deviceId);
-  if (!pairedDevice || !listApprovedPairedDeviceRoles(pairedDevice).includes(normalizedRole)) {
-    logDeviceTokenRotationDenied({
-      log: params.log,
-      deviceId: params.deviceId,
-      role: params.role,
-      reason: "unknown-device-or-role",
-    });
-    return null;
-  }
-  return { pairedDevice, normalizedRole };
+  reason: RevokeDeviceTokenDenyReason | "device-ownership-mismatch";
+  scope?: string | null;
+}) {
+  const suffix = params.scope ? ` scope=${params.scope}` : "";
+  params.log.warn(
+    `device token revocation denied device=${params.deviceId} role=${params.role} reason=${params.reason}${suffix}`,
+  );
 }

 function resolveDeviceManagementAuthz(
@@ -354,50 +337,19 @@ export const deviceHandlers: GatewayRequestHandlers = {
      );
      return;
    }
-    const rotateTarget = await loadDeviceTokenRotateTarget({
+    const rotated = await rotateDeviceToken({
      deviceId,
      role,
-      log: context.logGateway,
+      scopes,
+      callerScopes: authz.callerScopes,
    });
-    if (!rotateTarget) {
-      respond(
-        false,
-        undefined,
-        errorShape(ErrorCodes.INVALID_REQUEST, DEVICE_TOKEN_ROTATION_DENIED_MESSAGE),
-      );
-      return;
-    }
-    const { pairedDevice, normalizedRole } = rotateTarget;
-    const requestedScopes = normalizeDeviceAuthScopes(
-      scopes ?? pairedDevice.tokens?.[normalizedRole]?.scopes ?? pairedDevice.scopes,
-    );
-    const missingScope = resolveMissingRequestedScope({
-      role,
-      requestedScopes,
-      allowedScopes: authz.callerScopes,
-    });
-    if (missingScope) {
-      logDeviceTokenRotationDenied({
-        log: context.logGateway,
-        deviceId,
-        role,
-        reason: "caller-missing-scope",
-        scope: missingScope,
-      });
-      respond(
-        false,
-        undefined,
-        errorShape(ErrorCodes.INVALID_REQUEST, DEVICE_TOKEN_ROTATION_DENIED_MESSAGE),
-      );
-      return;
-    }
-    const rotated = await rotateDeviceToken({ deviceId, role, scopes });
    if (!rotated.ok) {
      logDeviceTokenRotationDenied({
        log: context.logGateway,
        deviceId,
        role,
        reason: rotated.reason,
+        scope: rotated.scope,
      });
      respond(
        false,
@@ -448,15 +400,27 @@ export const deviceHandlers: GatewayRequestHandlers = {
      respond(
        false,
        undefined,
-        errorShape(ErrorCodes.INVALID_REQUEST, "device token revocation denied"),
+        errorShape(ErrorCodes.INVALID_REQUEST, DEVICE_TOKEN_REVOCATION_DENIED_MESSAGE),
      );
      return;
    }
-    const entry = await revokeDeviceToken({ deviceId, role });
-    if (!entry) {
-      respond(false, undefined, errorShape(ErrorCodes.INVALID_REQUEST, "unknown deviceId/role"));
+    const revoked = await revokeDeviceToken({ deviceId, role, callerScopes: authz.callerScopes });
+    if (!revoked.ok) {
+      logDeviceTokenRevocationDenied({
+        log: context.logGateway,
+        deviceId,
+        role,
+        reason: revoked.reason,
+        scope: revoked.scope,
+      });
+      respond(
+        false,
+        undefined,
+        errorShape(ErrorCodes.INVALID_REQUEST, DEVICE_TOKEN_REVOCATION_DENIED_MESSAGE),
+      );
      return;
    }
+    const entry = revoked.entry;
    const normalizedDeviceId = deviceId.trim();
    context.logGateway.info(`device token revoked device=${normalizedDeviceId} role=${entry.role}`);
    respond(
--- a/src/gateway/server-startup-early.ts
+++ b/src/gateway/server-startup-early.ts
@@ -1,6 +1,7 @@
 import { registerSkillsChangeListener } from "../agents/skills/refresh.js";
 import type { GatewayTailscaleMode } from "../config/types.gateway.js";
 import type { OpenClawConfig } from "../config/types.openclaw.js";
+import { resolveCronStorePath } from "../cron/store.js";
 import { getMachineDisplayName } from "../infra/machine-name.js";
 import {
  primeRemoteSkillsCache,
@@ -8,7 +9,10 @@ import {
  setSkillsRemoteRegistry,
 } from "../infra/skills-remote.js";
 import type { PluginRegistry } from "../plugins/registry-types.js";
-import { startTaskRegistryMaintenance } from "../tasks/task-registry.maintenance.js";
+import {
+  configureTaskRegistryMaintenance,
+  startTaskRegistryMaintenance,
+} from "../tasks/task-registry.maintenance.js";
 import { startGatewayDiscovery } from "./server-discovery-runtime.js";
 import { startGatewayMaintenanceTimers } from "./server-maintenance.js";

@@ -77,6 +81,10 @@ export async function startGatewayEarlyRuntime(params: {
  if (!params.minimalTestGateway) {
    setSkillsRemoteRegistry(params.nodeRegistry);
    void primeRemoteSkillsCache();
+    configureTaskRegistryMaintenance({
+      cronStorePath: resolveCronStorePath(params.cfgAtStart.cron?.store),
+      cronRuntimeAuthoritative: true,
+    });
    startTaskRegistryMaintenance();
  }

--- a/src/gateway/server.device-token-rotate-authz.test.ts
+++ b/src/gateway/server.device-token-rotate-authz.test.ts
@@ -7,6 +7,7 @@ import {
  issueOperatorToken,
  openTrackedWs,
  pairDeviceIdentity,
+  resolveDeviceIdentityPath,
 } from "./device-authz.test-helpers.js";
 import {
  connectOk,
@@ -200,7 +201,53 @@ describe("gateway device.token.rotate/revoke ownership guard (IDOR)", () => {
  });
 });

-describe("gateway device.token.rotate caller scope guard", () => {
+describe("gateway device.token.rotate/revoke caller scope guard", () => {
+  test("rejects shared-token callers rotating or revoking above their session scopes", async () => {
+    const started = await startServer("secret");
+    const target = await issueOperatorToken({
+      name: "shared-pairing-target",
+      approvedScopes: ["operator.admin"],
+      clientId: GATEWAY_CLIENT_NAMES.TEST,
+      clientMode: GATEWAY_CLIENT_MODES.TEST,
+    });
+
+    let pairingWs: WebSocket | undefined;
+    try {
+      pairingWs = await openTrackedWs(started.port);
+      await connectOk(pairingWs, {
+        token: "secret",
+        scopes: ["operator.pairing"],
+        deviceIdentityPath: resolveDeviceIdentityPath("shared-pairing-caller"),
+      });
+
+      const rotate = await rpcReq(pairingWs, "device.token.rotate", {
+        deviceId: target.deviceId,
+        role: "operator",
+      });
+      expect(rotate.ok).toBe(false);
+      expect(rotate.error?.message).toBe("device token rotation denied");
+
+      const afterRotate = await getPairedDevice(target.deviceId);
+      expect(afterRotate?.tokens?.operator?.token).toBe(target.token);
+      expect(afterRotate?.tokens?.operator?.revokedAtMs).toBeUndefined();
+
+      const revoke = await rpcReq(pairingWs, "device.token.revoke", {
+        deviceId: target.deviceId,
+        role: "operator",
+      });
+      expect(revoke.ok).toBe(false);
+      expect(revoke.error?.message).toBe("device token revocation denied");
+
+      const afterRevoke = await getPairedDevice(target.deviceId);
+      expect(afterRevoke?.tokens?.operator?.token).toBe(target.token);
+      expect(afterRevoke?.tokens?.operator?.revokedAtMs).toBeUndefined();
+    } finally {
+      pairingWs?.close();
+      await started.server.close();
+      started.envSnapshot.restore();
+    }
+  });
+
  test("rejects rotating an admin-approved device token above the caller session scopes", async () => {
    const started = await startServer("secret");
    const attacker = await issueOperatorToken({
--- a/src/infra/device-pairing.test.ts
+++ b/src/infra/device-pairing.test.ts
@@ -638,6 +638,77 @@ describe("device pairing tokens", () => {
    expect(after?.approvedScopes).toEqual(["operator.read"]);
  });

+  test("rejects omitted-scope rotation when caller cannot hold the current token scopes", async () => {
+    const baseDir = await makeDevicePairingDir();
+    await setupPairedOperatorDevice(baseDir, ["operator.admin"]);
+    const before = await getPairedDevice("device-1", baseDir);
+
+    const rotated = await rotateDeviceToken({
+      deviceId: "device-1",
+      role: "operator",
+      callerScopes: ["operator.pairing"],
+      baseDir,
+    });
+    expect(rotated).toEqual({
+      ok: false,
+      reason: "caller-missing-scope",
+      scope: "operator.admin",
+    });
+
+    const after = await getPairedDevice("device-1", baseDir);
+    expect(after?.tokens?.operator?.token).toEqual(before?.tokens?.operator?.token);
+    expect(after?.tokens?.operator?.scopes).toEqual([
+      "operator.admin",
+      "operator.read",
+      "operator.write",
+    ]);
+    expect(after?.tokens?.operator?.revokedAtMs).toBeUndefined();
+  });
+
+  test("rejects token revocation when caller cannot hold the target token scopes", async () => {
+    const baseDir = await makeDevicePairingDir();
+    await setupPairedOperatorDevice(baseDir, ["operator.admin"]);
+    const before = await getPairedDevice("device-1", baseDir);
+
+    const revoked = await revokeDeviceToken({
+      deviceId: "device-1",
+      role: "operator",
+      callerScopes: ["operator.pairing"],
+      baseDir,
+    });
+    expect(revoked).toEqual({
+      ok: false,
+      reason: "caller-missing-scope",
+      scope: "operator.admin",
+    });
+
+    const after = await getPairedDevice("device-1", baseDir);
+    expect(after?.tokens?.operator?.token).toEqual(before?.tokens?.operator?.token);
+    expect(after?.tokens?.operator?.revokedAtMs).toBeUndefined();
+  });
+
+  test("allows token revocation when caller holds the target token scopes", async () => {
+    const baseDir = await makeDevicePairingDir();
+    await setupPairedOperatorDevice(baseDir, ["operator.admin"]);
+
+    const revoked = await revokeDeviceToken({
+      deviceId: "device-1",
+      role: "operator",
+      callerScopes: ["operator.admin"],
+      baseDir,
+    });
+    expect(revoked).toEqual({
+      ok: true,
+      entry: expect.objectContaining({
+        role: "operator",
+        revokedAtMs: expect.any(Number),
+      }),
+    });
+
+    const after = await getPairedDevice("device-1", baseDir);
+    expect(after?.tokens?.operator?.revokedAtMs).toBeTypeOf("number");
+  });
+
  test("rejects scope escalation when ensuring a token and leaves state unchanged", async () => {
    const baseDir = await makeDevicePairingDir();
    await setupPairedOperatorDevice(baseDir, ["operator.read"]);
--- a/src/infra/device-pairing.ts
+++ b/src/infra/device-pairing.ts
@@ -60,11 +60,18 @@ export type DeviceAuthTokenSummary = {
 export type RotateDeviceTokenDenyReason =
  | "unknown-device-or-role"
  | "missing-approved-scope-baseline"
-  | "scope-outside-approved-baseline";
+  | "scope-outside-approved-baseline"
+  | "caller-missing-scope";

 export type RotateDeviceTokenResult =
  | { ok: true; entry: DeviceAuthToken }
-  | { ok: false; reason: RotateDeviceTokenDenyReason };
+  | { ok: false; reason: RotateDeviceTokenDenyReason; scope?: string };
+
+export type RevokeDeviceTokenDenyReason = "unknown-device-or-role" | "caller-missing-scope";
+
+export type RevokeDeviceTokenResult =
+  | { ok: true; entry: DeviceAuthToken }
+  | { ok: false; reason: RevokeDeviceTokenDenyReason; scope?: string };

 export type PairedDevice = {
  deviceId: string;
@@ -970,6 +977,7 @@ export async function rotateDeviceToken(params: {
  deviceId: string;
  role: string;
  scopes?: string[];
+  callerScopes?: readonly string[];
  baseDir?: string;
 }): Promise<RotateDeviceTokenResult> {
  return await withLock(async () => {
@@ -999,6 +1007,16 @@ export async function rotateDeviceToken(params: {
    ) {
      return { ok: false, reason: "scope-outside-approved-baseline" };
    }
+    if (params.callerScopes) {
+      const missingScope = resolveMissingRequestedScope({
+        role,
+        requestedScopes,
+        allowedScopes: params.callerScopes,
+      });
+      if (missingScope) {
+        return { ok: false, reason: "caller-missing-scope", scope: missingScope };
+      }
+    }
    const now = Date.now();
    const next = buildDeviceAuthToken({
      role,
@@ -1018,28 +1036,39 @@ export async function rotateDeviceToken(params: {
 export async function revokeDeviceToken(params: {
  deviceId: string;
  role: string;
+  callerScopes?: readonly string[];
  baseDir?: string;
-}): Promise<DeviceAuthToken | null> {
+}): Promise<RevokeDeviceTokenResult> {
  return await withLock(async () => {
    const state = await loadState(params.baseDir);
-    const device = state.pairedByDeviceId[normalizeDeviceId(params.deviceId)];
-    if (!device) {
-      return null;
+    const context = resolveDeviceTokenUpdateContext({
+      state,
+      deviceId: params.deviceId,
+      role: params.role,
+    });
+    if (!context || !context.existing) {
+      return { ok: false, reason: "unknown-device-or-role" };
    }
-    const role = normalizeRole(params.role);
-    if (!role) {
-      return null;
+    const { device, role, tokens, existing } = context;
+    const targetScopes = normalizeDeviceAuthScopes(
+      Array.isArray(existing.scopes) ? existing.scopes : device.scopes,
+    );
+    if (params.callerScopes) {
+      const missingScope = resolveMissingRequestedScope({
+        role,
+        requestedScopes: targetScopes,
+        allowedScopes: params.callerScopes,
+      });
+      if (missingScope) {
+        return { ok: false, reason: "caller-missing-scope", scope: missingScope };
+      }
    }
-    if (!device.tokens?.[role]) {
-      return null;
-    }
-    const tokens = { ...device.tokens };
-    const entry = { ...tokens[role], revokedAtMs: Date.now() };
+    const entry = { ...existing, revokedAtMs: Date.now() };
    tokens[role] = entry;
    device.tokens = tokens;
    state.pairedByDeviceId[device.deviceId] = device;
    await persistState(state, params.baseDir, "paired");
-    return entry;
+    return { ok: true, entry };
  });
 }

--- a/src/infra/diagnostic-events.ts
+++ b/src/infra/diagnostic-events.ts
@@ -256,6 +256,47 @@ export type DiagnosticRunCompletedEvent = DiagnosticRunBaseEvent & {
  errorCategory?: string;
 };

+export type DiagnosticHarnessRunPhase = "prepare" | "start" | "send" | "resolve" | "cleanup";
+export type DiagnosticHarnessRunOutcome = "completed" | "aborted" | "timed_out" | "error";
+
+type DiagnosticHarnessRunBaseEvent = DiagnosticBaseEvent & {
+  type: "harness.run.started" | "harness.run.completed" | "harness.run.error";
+  runId: string;
+  sessionKey?: string;
+  sessionId?: string;
+  provider?: string;
+  model?: string;
+  trigger?: string;
+  channel?: string;
+  harnessId: string;
+  pluginId?: string;
+};
+
+export type DiagnosticHarnessRunStartedEvent = DiagnosticHarnessRunBaseEvent & {
+  type: "harness.run.started";
+};
+
+export type DiagnosticHarnessRunCompletedEvent = DiagnosticHarnessRunBaseEvent & {
+  type: "harness.run.completed";
+  durationMs: number;
+  outcome: DiagnosticHarnessRunOutcome;
+  resultClassification?: "empty" | "reasoning-only" | "planning-only";
+  yieldDetected?: boolean;
+  itemLifecycle?: {
+    startedCount: number;
+    completedCount: number;
+    activeCount: number;
+  };
+};
+
+export type DiagnosticHarnessRunErrorEvent = DiagnosticHarnessRunBaseEvent & {
+  type: "harness.run.error";
+  durationMs: number;
+  phase: DiagnosticHarnessRunPhase;
+  errorCategory: string;
+  cleanupFailed?: boolean;
+};
+
 type DiagnosticModelCallBaseEvent = DiagnosticBaseEvent & {
  type: "model.call.started" | "model.call.completed" | "model.call.error";
  runId: string;
@@ -392,6 +433,9 @@ export type DiagnosticEventPayload =
  | DiagnosticExecProcessCompletedEvent
  | DiagnosticRunStartedEvent
  | DiagnosticRunCompletedEvent
+  | DiagnosticHarnessRunStartedEvent
+  | DiagnosticHarnessRunCompletedEvent
+  | DiagnosticHarnessRunErrorEvent
  | DiagnosticModelCallStartedEvent
  | DiagnosticModelCallCompletedEvent
  | DiagnosticModelCallErrorEvent
@@ -446,6 +490,9 @@ const ASYNC_DIAGNOSTIC_EVENT_TYPES = new Set<DiagnosticEventPayload["type"]>([
  "model.call.started",
  "model.call.completed",
  "model.call.error",
+  "harness.run.started",
+  "harness.run.completed",
+  "harness.run.error",
  "context.assembled",
  "log.record",
 ]);
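
Review note on the diff above: the three harness event types share a base and are told apart by their `type` literal, so consumers can narrow with a `switch`. The sketch below uses trimmed-down copies of the event shapes (most optional fields omitted) and a hypothetical `summarizeHarnessEvent` helper; it is an illustration, not code from the repository.

```typescript
type HarnessRunPhase = "prepare" | "start" | "send" | "resolve" | "cleanup";
type HarnessRunOutcome = "completed" | "aborted" | "timed_out" | "error";

// Reduced stand-ins for the event types added in the diff.
type HarnessBase = { runId: string; harnessId: string };
type HarnessEvent =
  | (HarnessBase & { type: "harness.run.started" })
  | (HarnessBase & {
      type: "harness.run.completed";
      durationMs: number;
      outcome: HarnessRunOutcome;
    })
  | (HarnessBase & {
      type: "harness.run.error";
      durationMs: number;
      phase: HarnessRunPhase;
      errorCategory: string;
    });

// Hypothetical helper: each case sees only the fields of its own variant.
function summarizeHarnessEvent(event: HarnessEvent): string {
  switch (event.type) {
    case "harness.run.started":
      return `${event.harnessId} started run ${event.runId}`;
    case "harness.run.completed":
      return `${event.harnessId} ${event.outcome} in ${event.durationMs}ms`;
    case "harness.run.error":
      return `${event.harnessId} failed during ${event.phase}: ${event.errorCategory}`;
  }
}
```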
--- a/src/infra/package-dist-inventory.ts
+++ b/src/infra/package-dist-inventory.ts
@@ -26,6 +26,7 @@ const OMITTED_PRIVATE_QA_DIST_PREFIXES = ["dist/qa-runtime-"];
 const OMITTED_DIST_SUBTREE_PATTERNS = [
  /^dist\/extensions\/node_modules(?:\/|$)/u,
  /^dist\/extensions\/[^/]+\/node_modules(?:\/|$)/u,
+  /^dist\/extensions\/[^/]+\/\.openclaw-install-stage(?:-[^/]+)?(?:\/|$)/u,
  /^dist\/extensions\/[^/]+\/\.openclaw-runtime-deps-[^/]+(?:\/|$)/u,
  /^dist\/extensions\/qa-matrix(?:\/|$)/u,
  new RegExp(`^dist/plugin-sdk/extensions/${LEGACY_QA_LAB_DIR}(?:/|$)`, "u"),
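
Review note on the diff above: the new pattern omits both the bare `.openclaw-install-stage` directory and suffixed variants such as `.openclaw-install-stage-retry`, but only directly under a single plugin directory. A small sketch of what it accepts and rejects follows; the regular expression is copied from the diff, while `isInstallStagePath` and the sample paths are hypothetical.

```typescript
// Pattern copied from the diff above.
const INSTALL_STAGE_PATTERN =
  /^dist\/extensions\/[^/]+\/\.openclaw-install-stage(?:-[^/]+)?(?:\/|$)/u;

// Hypothetical helper name, for illustration only.
function isInstallStagePath(relativePath: string): boolean {
  return INSTALL_STAGE_PATTERN.test(relativePath);
}

const stagePathChecks = {
  // Bare stage directory and a file nested inside it.
  bareDir: isInstallStagePath("dist/extensions/brave/.openclaw-install-stage"),
  nested: isInstallStagePath(
    "dist/extensions/brave/.openclaw-install-stage/node_modules/typebox/build/compile/code.mjs",
  ),
  // Suffixed stage directory.
  retry: isInstallStagePath("dist/extensions/brave/.openclaw-install-stage-retry/x.mjs"),
  // A name that merely starts with the stage prefix is not omitted.
  lookalike: isInstallStagePath("dist/extensions/brave/.openclaw-install-stagex/x.mjs"),
  // Ordinary plugin files are not omitted.
  regular: isInstallStagePath("dist/extensions/brave/index.js"),
};
```

The first three checks evaluate to `true` and the last two to `false`, which matches the two stage directory names exercised by the tests in this change.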
--- a/src/infra/update-global.test.ts
+++ b/src/infra/update-global.test.ts
@@ -425,6 +425,34 @@ describe("update global helpers", () => {
    });
  });

+  it("ignores bundled plugin install stages during installed dist verification", async () => {
+    await withTempDir({ prefix: "openclaw-update-global-plugin-stage-" }, async (packageRoot) => {
+      await writeGlobalPackageJson(packageRoot);
+      await writeCompatSidecars(packageRoot);
+      await fs.mkdir(path.join(packageRoot, "dist", "extensions", "brave"), { recursive: true });
+      await writePackageDistInventory(packageRoot);
+
+      for (const stageDir of [".openclaw-install-stage", ".openclaw-install-stage-retry"]) {
+        const stagedFile = path.join(
+          packageRoot,
+          "dist",
+          "extensions",
+          "brave",
+          stageDir,
+          "node_modules",
+          "typebox",
+          "build",
+          "compile",
+          "code.mjs",
+        );
+        await fs.mkdir(path.dirname(stagedFile), { recursive: true });
+        await fs.writeFile(stagedFile, "export {};\n", "utf8");
+      }
+
+      await expect(collectInstalledGlobalPackageErrors({ packageRoot })).resolves.toEqual([]);
+    });
+  });
+
  it("does not require private QA sidecars when the inventory is missing", async () => {
    await withTempDir({ prefix: "openclaw-update-global-legacy-" }, async (packageRoot) => {
      await writeGlobalPackageJson(packageRoot);
--- a/src/logging/diagnostic-stability.ts
+++ b/src/logging/diagnostic-stability.ts
@@ -305,6 +305,34 @@ function sanitizeDiagnosticEvent(event: DiagnosticEventPayload): DiagnosticStabi
      record.outcome = event.outcome;
      assignReasonCode(record, event.errorCategory);
      break;
+    case "harness.run.started":
+      record.source = event.harnessId;
+      record.pluginId = event.pluginId;
+      record.provider = event.provider;
+      record.model = event.model;
+      record.channel = event.channel;
+      break;
+    case "harness.run.completed":
+      record.source = event.harnessId;
+      record.pluginId = event.pluginId;
+      record.provider = event.provider;
+      record.model = event.model;
+      record.channel = event.channel;
+      record.durationMs = event.durationMs;
+      record.outcome = event.outcome;
+      record.count = event.itemLifecycle?.completedCount;
+      break;
+    case "harness.run.error":
+      record.source = event.harnessId;
+      record.pluginId = event.pluginId;
+      record.provider = event.provider;
+      record.model = event.model;
+      record.channel = event.channel;
+      record.durationMs = event.durationMs;
+      record.outcome = "error";
+      record.action = event.phase;
+      assignReasonCode(record, event.errorCategory);
+      break;
    case "model.call.started":
      record.provider = event.provider;
      record.model = event.model;
--- a/src/tasks/task-registry.maintenance.issue-60299.test.ts
+++ b/src/tasks/task-registry.maintenance.issue-60299.test.ts
@@ -1,6 +1,8 @@
 import { afterEach, describe, expect, it, vi } from "vitest";
 import type { AcpSessionStoreEntry } from "../acp/runtime/session-meta.js";
 import type { SessionEntry } from "../config/sessions.js";
+import type { CronRunLogEntry } from "../cron/run-log.js";
+import type { CronStoreFile } from "../cron/types.js";
 import type { ParsedAgentSessionKey } from "../routing/session-key.js";
 import {
  resetDetachedTaskLifecycleRuntimeForTests,
@@ -9,6 +11,7 @@ import {
 } from "./detached-task-runtime.js";
 import {
  previewTaskRegistryMaintenance,
+  reconcileInspectableTasks,
  resetTaskRegistryMaintenanceRuntimeForTests,
  runTaskRegistryMaintenance,
  setTaskRegistryMaintenanceRuntimeForTests,
@@ -53,11 +56,15 @@ function createTaskRegistryMaintenanceHarness(params: {
  acpEntry?: AcpSessionStoreEntry["entry"];
  activeCronJobIds?: string[];
  activeRunIds?: string[];
+  cronStore?: CronStoreFile;
+  cronRunLogEntries?: Record<string, CronRunLogEntry[]>;
+  cronRuntimeAuthoritative?: boolean;
 }) {
  const sessionStore = params.sessionStore ?? {};
  const acpEntry = params.acpEntry;
  const activeCronJobIds = new Set(params.activeCronJobIds ?? []);
  const activeRunIds = new Set(params.activeRunIds ?? []);
+  const cronRunLogEntries = params.cronRunLogEntries ?? {};
  const currentTasks = new Map(params.tasks.map((task) => [task.taskId, { ...task }]));

  const runtime: TaskRegistryMaintenanceRuntime = {
@@ -113,6 +120,24 @@ function createTaskRegistryMaintenanceHarness(params: {
      currentTasks.set(patch.taskId, next);
      return next;
    },
+    markTaskTerminalById: (patch) => {
+      const current = currentTasks.get(patch.taskId);
+      if (!current) {
+        return null;
+      }
+      const next = {
+        ...current,
+        status: patch.status,
+        endedAt: patch.endedAt,
+        lastEventAt: patch.lastEventAt ?? patch.endedAt,
+        ...(patch.error !== undefined ? { error: patch.error } : {}),
+        ...(patch.terminalSummary !== undefined
+          ? { terminalSummary: patch.terminalSummary ?? undefined }
+          : {}),
+      } satisfies TaskRecord;
+      currentTasks.set(patch.taskId, next);
+      return next;
+    },
    maybeDeliverTaskTerminalUpdate: async () => null,
    resolveTaskForLookupToken: () => undefined,
    setTaskCleanupAfterById: (patch) => {
@@ -124,6 +149,11 @@ function createTaskRegistryMaintenanceHarness(params: {
      currentTasks.set(patch.taskId, next);
      return next;
    },
+    isCronRuntimeAuthoritative: () => params.cronRuntimeAuthoritative ?? true,
+    resolveCronStorePath: () => "/tmp/openclaw-test-cron/jobs.json",
+    loadCronStoreSync: () => params.cronStore ?? { version: 1, jobs: [] },
+    resolveCronRunLogPath: ({ jobId }) => jobId,
+    readCronRunLogEntriesSync: (jobId) => cronRunLogEntries[jobId] ?? [],
  };

  setTaskRegistryMaintenanceRuntimeForTests(runtime);
@@ -164,6 +194,112 @@ describe("task-registry maintenance issue #60299", () => {
    expect(currentTasks.get(task.taskId)).toMatchObject({ status: "running" });
  });

+  it("does not mark cron tasks lost when the current process is not the cron runtime authority", async () => {
+    const task = makeStaleTask({
+      runtime: "cron",
+      sourceId: "cron-job-offline-audit",
+      childSessionKey: undefined,
+    });
+
+    const { currentTasks } = createTaskRegistryMaintenanceHarness({
+      tasks: [task],
+      cronRuntimeAuthoritative: false,
+    });
+
+    expect(previewTaskRegistryMaintenance()).toMatchObject({ reconciled: 0 });
+    expect(await runTaskRegistryMaintenance()).toMatchObject({ reconciled: 0 });
+    expect(currentTasks.get(task.taskId)).toMatchObject({ status: "running" });
+  });
+
+  it("recovers finished cron tasks from durable run logs before marking them lost", async () => {
+    const startedAt = Date.now() - GRACE_EXPIRED_MS;
+    const task = makeStaleTask({
+      runtime: "cron",
+      sourceId: "cron-job-run-log-ok",
+      runId: `cron:cron-job-run-log-ok:${startedAt}`,
+      startedAt,
+      lastEventAt: startedAt,
+    });
+
+    const { currentTasks } = createTaskRegistryMaintenanceHarness({
+      tasks: [task],
+      cronRunLogEntries: {
+        "cron-job-run-log-ok": [
+          {
+            ts: startedAt + 1250,
+            jobId: "cron-job-run-log-ok",
+            action: "finished",
+            status: "ok",
+            summary: "done",
+            runAtMs: startedAt,
+            durationMs: 1250,
+          },
+        ],
+      },
+    });
+
+    expect(reconcileInspectableTasks()).toEqual([
+      expect.objectContaining({
+        taskId: task.taskId,
+        status: "succeeded",
+        endedAt: startedAt + 1250,
+        terminalSummary: "done",
+      }),
+    ]);
+    expect(previewTaskRegistryMaintenance()).toMatchObject({ reconciled: 0, recovered: 1 });
+    expect(await runTaskRegistryMaintenance()).toMatchObject({ reconciled: 0, recovered: 1 });
+    expect(currentTasks.get(task.taskId)).toMatchObject({
+      status: "succeeded",
+      endedAt: startedAt + 1250,
+      terminalSummary: "done",
+    });
+  });
+
+  it("recovers interrupted cron tasks from durable cron job state when run logs are absent", async () => {
+    const startedAt = Date.now() - GRACE_EXPIRED_MS;
+    const task = makeStaleTask({
+      runtime: "cron",
+      sourceId: "cron-job-state-error",
+      runId: `cron:cron-job-state-error:${startedAt}`,
+      startedAt,
+      lastEventAt: startedAt,
+    });
+
+    const { currentTasks } = createTaskRegistryMaintenanceHarness({
+      tasks: [task],
+      cronStore: {
+        version: 1,
+        jobs: [
+          {
+            id: "cron-job-state-error",
+            name: "state error",
+            enabled: true,
+            createdAtMs: startedAt - 60_000,
+            updatedAtMs: startedAt,
+            schedule: { kind: "every", everyMs: 60_000, anchorMs: startedAt - 60_000 },
+            sessionTarget: "isolated",
+            wakeMode: "next-heartbeat",
+            payload: { kind: "agentTurn", message: "work" },
+            state: {
+              lastRunAtMs: startedAt,
+              lastRunStatus: "error",
+              lastError: "cron: job interrupted by gateway restart",
+              lastDurationMs: 5000,
+            },
+          },
+        ],
+      },
+    });
+
+    expect(previewTaskRegistryMaintenance()).toMatchObject({ reconciled: 0, recovered: 1 });
+    expect(await runTaskRegistryMaintenance()).toMatchObject({ reconciled: 0, recovered: 1 });
+    expect(currentTasks.get(task.taskId)).toMatchObject({
+      status: "failed",
+      endedAt: startedAt + 5000,
+      error: "cron: job interrupted by gateway restart",
+    });
+  });
+
  it("marks chat-backed cli tasks lost after the owning run context disappears", async () => {
    const channelKey = "agent:main:workspace:channel:C1234567890";
    const task = makeStaleTask({
--- a/src/tasks/task-registry.maintenance.ts
+++ b/src/tasks/task-registry.maintenance.ts
@@ -1,6 +1,10 @@
 import { readAcpSessionEntry } from "../acp/runtime/session-meta.js";
 import { loadSessionStore, resolveStorePath } from "../config/sessions.js";
 import { isCronJobActive } from "../cron/active-jobs.js";
+import { readCronRunLogEntriesSync, resolveCronRunLogPath } from "../cron/run-log.js";
+import type { CronRunLogEntry } from "../cron/run-log.js";
+import { loadCronStoreSync, resolveCronStorePath } from "../cron/store.js";
+import type { CronJob, CronStoreFile } from "../cron/types.js";
 import { getAgentRunContext } from "../infra/agent-events.js";
 import { parseAgentSessionKey } from "../routing/session-key.js";
 import { deriveSessionChatType } from "../sessions/session-chat-type.js";
@@ -12,6 +16,7 @@ import {
  getTaskById,
  listTaskRecords,
  markTaskLostById,
+  markTaskTerminalById,
  maybeDeliverTaskTerminalUpdate,
  resolveTaskForLookupToken,
  setTaskCleanupAfterById,
@@ -23,7 +28,7 @@ import {
 } from "./task-registry.audit.js";
 import type { TaskAuditSummary } from "./task-registry.audit.js";
 import { summarizeTaskRecords } from "./task-registry.summary.js";
-import type { TaskRecord, TaskRegistrySummary } from "./task-registry.types.js";
+import type { TaskRecord, TaskRegistrySummary, TaskStatus } from "./task-registry.types.js";

 const TASK_RECONCILE_GRACE_MS = 5 * 60_000;
 const TASK_RETENTION_MS = 7 * 24 * 60 * 60_000;
@@ -38,6 +43,8 @@ const SWEEP_YIELD_BATCH_SIZE = 25;
 let sweeper: NodeJS.Timeout | null = null;
 let deferredSweep: NodeJS.Timeout | null = null;
 let sweepInProgress = false;
+let configuredCronStorePath: string | undefined;
+let configuredCronRuntimeAuthoritative = false;

 type TaskRegistryMaintenanceRuntime = {
  readAcpSessionEntry: typeof readAcpSessionEntry;
@@ -51,9 +58,15 @@ type TaskRegistryMaintenanceRuntime = {
  getTaskById: typeof getTaskById;
  listTaskRecords: typeof listTaskRecords;
  markTaskLostById: typeof markTaskLostById;
+  markTaskTerminalById: typeof markTaskTerminalById;
  maybeDeliverTaskTerminalUpdate: typeof maybeDeliverTaskTerminalUpdate;
  resolveTaskForLookupToken: typeof resolveTaskForLookupToken;
  setTaskCleanupAfterById: typeof setTaskCleanupAfterById;
+  isCronRuntimeAuthoritative: () => boolean;
+  resolveCronStorePath: typeof resolveCronStorePath;
+  loadCronStoreSync: typeof loadCronStoreSync;
+  resolveCronRunLogPath: typeof resolveCronRunLogPath;
+  readCronRunLogEntriesSync: typeof readCronRunLogEntriesSync;
 };

 const defaultTaskRegistryMaintenanceRuntime: TaskRegistryMaintenanceRuntime = {
@@ -68,9 +81,15 @@ const defaultTaskRegistryMaintenanceRuntime: TaskRegistryMaintenanceRuntime = {
  getTaskById,
  listTaskRecords,
  markTaskLostById,
+  markTaskTerminalById,
  maybeDeliverTaskTerminalUpdate,
  resolveTaskForLookupToken,
  setTaskCleanupAfterById,
+  isCronRuntimeAuthoritative: () => configuredCronRuntimeAuthoritative,
+  resolveCronStorePath: () => configuredCronStorePath ?? resolveCronStorePath(),
+  loadCronStoreSync,
+  resolveCronRunLogPath,
+  readCronRunLogEntriesSync,
 };

 let taskRegistryMaintenanceRuntime: TaskRegistryMaintenanceRuntime =
@@ -83,6 +102,32 @@ export type TaskRegistryMaintenanceSummary = {
  pruned: number;
 };

+type CronExecutionId = {
+  jobId: string;
+  startedAt: number;
+};
+
+type CronTerminalRecovery = {
+  status: Extract<TaskStatus, "succeeded" | "failed" | "timed_out">;
+  endedAt: number;
+  lastEventAt: number;
+  error?: string;
+  terminalSummary?: string;
+};
+
+type CronRecoveryContext = {
+  storePath: string;
+  store?: CronStoreFile | null;
+  runLogsByJobId: Map<string, CronRunLogEntry[]>;
+};
+
+function createCronRecoveryContext(): CronRecoveryContext {
+  return {
+    storePath: taskRegistryMaintenanceRuntime.resolveCronStorePath(),
+    runLogsByJobId: new Map<string, CronRunLogEntry[]>(),
+  };
+}
+
 function findSessionEntryByKey(store: Record<string, unknown>, sessionKey: string): unknown {
  const direct = store[sessionKey];
  if (direct) {
@@ -110,6 +155,142 @@ function hasLostGraceExpired(task: TaskRecord, now: number): boolean {
  return now - referenceAt >= TASK_RECONCILE_GRACE_MS;
 }

+function parseCronExecutionId(task: TaskRecord): CronExecutionId | undefined {
+  const runId = task.runId?.trim();
+  if (!runId?.startsWith("cron:")) {
+    return undefined;
+  }
+  const separator = runId.lastIndexOf(":");
+  if (separator <= "cron:".length) {
+    return undefined;
+  }
+  const startedAt = Number(runId.slice(separator + 1));
+  if (!Number.isFinite(startedAt)) {
+    return undefined;
+  }
+  const jobId = runId.slice("cron:".length, separator).trim();
+  if (!jobId || (task.sourceId?.trim() && task.sourceId.trim() !== jobId)) {
+    return undefined;
+  }
+  return { jobId, startedAt };
+}
+
+function isTimeoutCronError(error: string | undefined): boolean {
+  return error === "cron: job execution timed out";
+}
+
+function mapCronTerminalStatus(status: unknown, error?: string): CronTerminalRecovery["status"] {
+  if (status === "ok" || status === "skipped") {
+    return "succeeded";
+  }
+  return isTimeoutCronError(error) ? "timed_out" : "failed";
+}
+
+function getCronRunLogEntries(context: CronRecoveryContext, jobId: string): CronRunLogEntry[] {
+  const cached = context.runLogsByJobId.get(jobId);
+  if (cached) {
+    return cached;
+  }
+  let entries: CronRunLogEntry[] = [];
+  try {
+    const logPath = taskRegistryMaintenanceRuntime.resolveCronRunLogPath({
+      storePath: context.storePath,
+      jobId,
+    });
+    entries = taskRegistryMaintenanceRuntime.readCronRunLogEntriesSync(logPath, {
+      jobId,
+      limit: 5000,
+    });
+  } catch {
+    entries = [];
+  }
+  context.runLogsByJobId.set(jobId, entries);
+  return entries;
+}
+
+function getCronStore(context: CronRecoveryContext): CronStoreFile | null {
+  if (context.store !== undefined) {
+    return context.store;
+  }
+  try {
+    context.store = taskRegistryMaintenanceRuntime.loadCronStoreSync(context.storePath);
+  } catch {
+    context.store = null;
+  }
+  return context.store;
+}
+
+function resolveCronRunLogRecovery(
+  execution: CronExecutionId,
+  context: CronRecoveryContext,
+): CronTerminalRecovery | undefined {
+  const entries = getCronRunLogEntries(context, execution.jobId);
+  const entry = entries.findLast(
+    (candidate) =>
+      candidate.jobId === execution.jobId &&
+      candidate.action === "finished" &&
+      candidate.runAtMs === execution.startedAt &&
+      (candidate.status === "ok" || candidate.status === "skipped" || candidate.status === "error"),
+  );
+  if (!entry) {
+    return undefined;
+  }
+  const durationMs =
+    typeof entry.durationMs === "number" && Number.isFinite(entry.durationMs)
+      ? Math.max(0, entry.durationMs)
+      : undefined;
+  const endedAt = durationMs === undefined ? entry.ts : execution.startedAt + durationMs;
+  return {
+    status: mapCronTerminalStatus(entry.status, entry.error),
+    endedAt,
+    lastEventAt: endedAt,
+    ...(entry.error !== undefined ? { error: entry.error } : {}),
+    ...(entry.summary !== undefined ? { terminalSummary: entry.summary } : {}),
+  };
+}
+
+function resolveCronJobStateRecovery(
+  execution: CronExecutionId,
+  context: CronRecoveryContext,
+): CronTerminalRecovery | undefined {
+  const store = getCronStore(context);
+  const job: CronJob | undefined = store?.jobs.find((entry) => entry.id === execution.jobId);
+  if (!job || job.state.lastRunAtMs !== execution.startedAt) {
+    return undefined;
+  }
+  const status = job.state.lastRunStatus ?? job.state.lastStatus;
+  if (status !== "ok" && status !== "skipped" && status !== "error") {
+    return undefined;
+  }
+  const durationMs =
+    typeof job.state.lastDurationMs === "number" && Number.isFinite(job.state.lastDurationMs)
+      ? Math.max(0, job.state.lastDurationMs)
+      : 0;
+  const endedAt = execution.startedAt + durationMs;
+  return {
+    status: mapCronTerminalStatus(status, job.state.lastError),
+    endedAt,
+    lastEventAt: endedAt,
+    ...(job.state.lastError !== undefined ? { error: job.state.lastError } : {}),
+  };
+}
+
+function resolveDurableCronTaskRecovery(
+  task: TaskRecord,
+  context: CronRecoveryContext,
+): CronTerminalRecovery | undefined {
+  if (task.runtime !== "cron" || !isActiveTask(task)) {
+    return undefined;
+  }
+  const execution = parseCronExecutionId(task);
+  if (!execution) {
+    return undefined;
+  }
+  return (
+    resolveCronRunLogRecovery(execution, context) ?? resolveCronJobStateRecovery(execution, context)
+  );
+}
+
 function hasActiveCliRun(task: TaskRecord): boolean {
  const candidateRunIds = [task.sourceId, task.runId];
  for (const candidate of candidateRunIds) {
@@ -123,6 +304,9 @@ function hasActiveCliRun(task: TaskRecord): boolean {

 function hasBackingSession(task: TaskRecord): boolean {
  if (task.runtime === "cron") {
+    if (!taskRegistryMaintenanceRuntime.isCronRuntimeAuthoritative()) {
+      return true;
+    }
    const jobId = task.sourceId?.trim();
    return jobId ? taskRegistryMaintenanceRuntime.isCronJobActive(jobId) : false;
  }
@@ -204,6 +388,41 @@ function markTaskLost(task: TaskRecord, now: number): TaskRecord {
  return updated;
 }

+function markTaskRecovered(task: TaskRecord, recovery: CronTerminalRecovery): TaskRecord {
+  const updated =
+    taskRegistryMaintenanceRuntime.markTaskTerminalById({
+      taskId: task.taskId,
+      status: recovery.status,
+      endedAt: recovery.endedAt,
+      lastEventAt: recovery.lastEventAt,
+      ...(recovery.error !== undefined ? { error: recovery.error } : {}),
+      ...(recovery.terminalSummary !== undefined
+        ? { terminalSummary: recovery.terminalSummary }
+        : {}),
+    }) ?? projectTaskRecovered(task, recovery);
+  void taskRegistryMaintenanceRuntime.maybeDeliverTaskTerminalUpdate(updated.taskId);
+  return updated;
+}
+
+function projectTaskRecovered(task: TaskRecord, recovery: CronTerminalRecovery): TaskRecord {
+  const projected: TaskRecord = {
+    ...task,
+    status: recovery.status,
+    endedAt: recovery.endedAt,
+    lastEventAt: recovery.lastEventAt,
+    ...(recovery.error !== undefined ? { error: recovery.error } : {}),
+    ...(recovery.terminalSummary !== undefined
+      ? { terminalSummary: recovery.terminalSummary }
+      : {}),
+  };
+  return {
+    ...projected,
+    ...(typeof projected.cleanupAfter === "number"
+      ? {}
+      : { cleanupAfter: resolveCleanupAfter(projected) }),
+  };
+}
+
 function projectTaskLost(task: TaskRecord, now: number): TaskRecord {
  const projected: TaskRecord = {
    ...task,
@@ -220,7 +439,14 @@ function projectTaskLost(task: TaskRecord, now: number): TaskRecord {
  };
 }

-export function reconcileTaskRecordForOperatorInspection(task: TaskRecord): TaskRecord {
+export function reconcileTaskRecordForOperatorInspection(
+  task: TaskRecord,
+  context: CronRecoveryContext = createCronRecoveryContext(),
+): TaskRecord {
+  const cronRecovery = resolveDurableCronTaskRecovery(task, context);
+  if (cronRecovery) {
+    return projectTaskRecovered(task, cronRecovery);
+  }
  const now = Date.now();
  if (!shouldMarkLost(task, now)) {
    return task;
@@ -230,9 +456,10 @@ export function reconcileTaskRecordForOperatorInspection(task: TaskRecord): Task

 export function reconcileInspectableTasks(): TaskRecord[] {
  taskRegistryMaintenanceRuntime.ensureTaskRegistryReady();
+  const cronRecoveryContext = createCronRecoveryContext();
  return taskRegistryMaintenanceRuntime
    .listTaskRecords()
-    .map((task) => reconcileTaskRecordForOperatorInspection(task));
+    .map((task) => reconcileTaskRecordForOperatorInspection(task, cronRecoveryContext));
 }

 configureTaskAuditTaskProvider(reconcileInspectableTasks);
@@ -253,15 +480,21 @@ export function reconcileTaskLookupToken(token: string): TaskRecord | undefined
 }

 // Preview is synchronous and cannot call the async detached-task recovery hook,
-// so recovered tasks are counted under reconciled here. The real sweep
-// in runTaskRegistryMaintenance splits them into reconciled vs recovered.
+// so hook-recovered tasks are counted under reconciled here. Durable cron
+// recovery is synchronous and can be previewed exactly.
 export function previewTaskRegistryMaintenance(): TaskRegistryMaintenanceSummary {
  taskRegistryMaintenanceRuntime.ensureTaskRegistryReady();
  const now = Date.now();
  let reconciled = 0;
+  let recovered = 0;
  let cleanupStamped = 0;
  let pruned = 0;
+  const cronRecoveryContext = createCronRecoveryContext();
  for (const task of taskRegistryMaintenanceRuntime.listTaskRecords()) {
+    if (resolveDurableCronTaskRecovery(task, cronRecoveryContext)) {
+      recovered += 1;
+      continue;
+    }
    if (shouldMarkLost(task, now)) {
      reconciled += 1;
      continue;
@@ -274,7 +507,7 @@ export function previewTaskRegistryMaintenance(): TaskRegistryMaintenanceSummary
      cleanupStamped += 1;
    }
  }
-  return { reconciled, recovered: 0, cleanupStamped, pruned };
+  return { reconciled, recovered, cleanupStamped, pruned };
 }

 /**
@@ -305,12 +538,25 @@ export async function runTaskRegistryMaintenance(): Promise<TaskRegistryMaintena
  let cleanupStamped = 0;
  let pruned = 0;
  const tasks = taskRegistryMaintenanceRuntime.listTaskRecords();
+  const cronRecoveryContext = createCronRecoveryContext();
  let processed = 0;
  for (const task of tasks) {
    const current = taskRegistryMaintenanceRuntime.getTaskById(task.taskId);
    if (!current) {
      continue;
    }
+    const cronRecovery = resolveDurableCronTaskRecovery(current, cronRecoveryContext);
+    if (cronRecovery) {
+      const next = markTaskRecovered(current, cronRecovery);
+      if (next.status !== current.status) {
+        recovered += 1;
+      }
+      processed += 1;
+      if (processed % SWEEP_YIELD_BATCH_SIZE === 0) {
+        await yieldToEventLoop();
+      }
+      continue;
+    }
    if (shouldMarkLost(current, now)) {
      const recovery = await tryRecoverTaskBeforeMarkLost({
        taskId: current.taskId,
@@ -412,6 +658,18 @@ export function setTaskRegistryMaintenanceRuntimeForTests(

 export function resetTaskRegistryMaintenanceRuntimeForTests(): void {
  taskRegistryMaintenanceRuntime = defaultTaskRegistryMaintenanceRuntime;
+  configuredCronStorePath = undefined;
+  configuredCronRuntimeAuthoritative = false;
+}
+
+export function configureTaskRegistryMaintenance(options: {
+  cronStorePath?: string;
+  cronRuntimeAuthoritative?: boolean;
+}): void {
+  configuredCronStorePath = options.cronStorePath?.trim() || undefined;
+  if (options.cronRuntimeAuthoritative !== undefined) {
+    configuredCronRuntimeAuthoritative = options.cronRuntimeAuthoritative;
+  }
 }

 export function getReconciledTaskById(taskId: string): TaskRecord | undefined {
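
Review note on the diff above: durable cron recovery depends on run IDs of the form `cron:<jobId>:<startedAt>`, split at the last colon so that job IDs may themselves contain colons. The sketch below restates that parsing as a standalone function. `parseCronRunId` is a hypothetical simplification that takes plain strings rather than a `TaskRecord`; the checks mirror `parseCronExecutionId` in the diff.

```typescript
type CronExecutionId = { jobId: string; startedAt: number };

// Simplified stand-in for the diff's parseCronExecutionId.
function parseCronRunId(
  runId: string | undefined,
  sourceId?: string,
): CronExecutionId | undefined {
  const trimmed = runId?.trim();
  if (!trimmed?.startsWith("cron:")) {
    return undefined;
  }
  // Split at the last colon: everything before it (after the prefix) is the job ID.
  const separator = trimmed.lastIndexOf(":");
  if (separator <= "cron:".length) {
    return undefined;
  }
  const startedAt = Number(trimmed.slice(separator + 1));
  if (!Number.isFinite(startedAt)) {
    return undefined;
  }
  const jobId = trimmed.slice("cron:".length, separator).trim();
  const expectedJobId = sourceId?.trim();
  // Reject the run ID when the task's source ID names a different job.
  if (!jobId || (expectedJobId && expectedJobId !== jobId)) {
    return undefined;
  }
  return { jobId, startedAt };
}

const parsedNightly = parseCronRunId("cron:nightly:1700000000000", "nightly");
const parsedWithColon = parseCronRunId("cron:team:nightly:42");
const parsedMismatch = parseCronRunId("cron:nightly:42", "other-job");
```

Here `parsedNightly` carries `jobId: "nightly"`, `parsedWithColon` carries `jobId: "team:nightly"`, and `parsedMismatch` is `undefined` because the source ID disagrees with the job ID in the run ID.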
--- a/src/tasks/task-registry.test.ts
+++ b/src/tasks/task-registry.test.ts
@@ -129,6 +129,7 @@ function configureTaskRegistryMaintenanceRuntimeForTest(params: {
      params.currentTasks.set(patch.taskId, next);
      return next;
    },
+    markTaskTerminalById: () => null,
    maybeDeliverTaskTerminalUpdate: async () => null,
    resolveTaskForLookupToken: () => undefined,
    setTaskCleanupAfterById: (patch: { taskId: string; cleanupAfter: number }) => {
@@ -143,6 +144,11 @@ function configureTaskRegistryMaintenanceRuntimeForTest(params: {
      params.currentTasks.set(patch.taskId, next);
      return next;
    },
+    isCronRuntimeAuthoritative: () => true,
+    resolveCronStorePath: () => "/tmp/openclaw-test-cron/jobs.json",
+    loadCronStoreSync: () => ({ version: 1, jobs: [] }),
+    resolveCronRunLogPath: ({ jobId }) => jobId,
+    readCronRunLogEntriesSync: () => [],
  });
 }

@@ -1625,9 +1631,15 @@ describe("task-registry", () => {
          throw new Error("maintenance boom");
        },
        markTaskLostById: () => null,
+        markTaskTerminalById: () => null,
        maybeDeliverTaskTerminalUpdate: async () => null,
        resolveTaskForLookupToken: () => undefined,
        setTaskCleanupAfterById: () => null,
+        isCronRuntimeAuthoritative: () => true,
+        resolveCronStorePath: () => "/tmp/openclaw-test-cron/jobs.json",
+        loadCronStoreSync: () => ({ version: 1, jobs: [] }),
+        resolveCronRunLogPath: ({ jobId }) => jobId,
+        readCronRunLogEntriesSync: () => [],
      });

      try {
--- a/test/scripts/install-ps1.test.ts
+++ b/test/scripts/install-ps1.test.ts
@@ -0,0 +1,113 @@
+import { spawnSync } from "node:child_process";
+import { chmodSync, readFileSync, writeFileSync } from "node:fs";
+import { join } from "node:path";
+import { describe, expect, it } from "vitest";
+import { createScriptTestHarness } from "./test-helpers";
+
+const SCRIPT_PATH = "scripts/install.ps1";
+
+function extractFunctionBody(source: string, name: string): string {
+  const match = source.match(
+    new RegExp(`^function ${name} \\{\\r?\\n([\\s\\S]*?)^\\}\\r?\\n`, "m"),
+  );
+  expect(match?.[1]).toBeDefined();
+  return match![1];
+}
+
+function findPowerShell(): string | undefined {
+  for (const candidate of ["pwsh", "powershell"]) {
+    const result = spawnSync(
+      candidate,
+      ["-NoLogo", "-NoProfile", "-Command", "$PSVersionTable.PSVersion"],
+      {
+        encoding: "utf8",
+      },
+    );
+    if (result.status === 0) {
+      return candidate;
+    }
+  }
+  return undefined;
+}
+
+function toPowerShellSingleQuotedLiteral(value: string): string {
+  return `'${value.replaceAll("'", "''")}'`;
+}
+
+function createFailingNodeFixture(source: string): string {
+  const scriptWithoutEntryPoint = source.replace(
+    /\r?\n\$installSucceeded = Main\r?\nComplete-Install -Succeeded:\$installSucceeded\s*$/m,
+    "",
+  );
+  expect(scriptWithoutEntryPoint).not.toBe(source);
+
+  return [
+    scriptWithoutEntryPoint,
+    "",
+    "function Write-Banner { }",
+    "function Ensure-ExecutionPolicy { return $true }",
+    "function Ensure-Node { return $false }",
+    "",
+    "$installSucceeded = Main",
+    "Complete-Install -Succeeded:$installSucceeded",
+    "",
+  ].join("\n");
+}
+
+describe("install.ps1 failure handling", () => {
+  const harness = createScriptTestHarness();
+  const source = readFileSync(SCRIPT_PATH, "utf8");
+  const powershell = findPowerShell();
+  const runIfPowerShell = powershell ? it : it.skip;
+
+  it("does not exit directly from inside Main", () => {
+    const mainBody = extractFunctionBody(source, "Main");
+    expect(mainBody).not.toMatch(/\bexit\b/i);
+    expect(mainBody).toContain("return (Fail-Install)");
+  });
+
+  it("keeps failure termination in the top-level completion handler", () => {
+    const completeInstallBody = extractFunctionBody(source, "Complete-Install");
+    expect(completeInstallBody).toMatch(/\$PSCommandPath/);
+    expect(completeInstallBody).toMatch(/\bexit \$script:InstallExitCode\b/);
+    expect(completeInstallBody).toMatch(/\bthrow "OpenClaw installation failed with exit code/);
+  });
+
+  runIfPowerShell("exits non-zero when run as a script file", () => {
+    const tempDir = harness.createTempDir("openclaw-install-ps1-");
+    const scriptPath = join(tempDir, "install.ps1");
+    writeFileSync(scriptPath, createFailingNodeFixture(source));
+    chmodSync(scriptPath, 0o755);
+
+    const result = spawnSync(
+      powershell!,
+      ["-NoLogo", "-NoProfile", "-ExecutionPolicy", "Bypass", "-File", scriptPath],
+      { encoding: "utf8" },
+    );
+
+    expect(result.status).toBe(1);
+  });
+
+  runIfPowerShell("throws without killing the caller when run as a scriptblock", () => {
+    const tempDir = harness.createTempDir("openclaw-install-ps1-");
+    const scriptPath = join(tempDir, "install.ps1");
+    writeFileSync(scriptPath, createFailingNodeFixture(source));
+    chmodSync(scriptPath, 0o755);
+
+    const command = [
+      "try {",
+      `  & ([scriptblock]::Create((Get-Content -LiteralPath ${toPowerShellSingleQuotedLiteral(scriptPath)} -Raw)))`,
+      "} catch {",
+      '  Write-Output "caught=$($_.Exception.Message)"',
+      "}",
+      'Write-Output "alive-after-install"',
+    ].join("\n");
+    const result = spawnSync(powershell!, ["-NoLogo", "-NoProfile", "-Command", command], {
+      encoding: "utf8",
+    });
+
+    expect(result.status).toBe(0);
+    expect(result.stdout).toContain("caught=OpenClaw installation failed with exit code 1.");
+    expect(result.stdout).toContain("alive-after-install");
+  });
+});
--- a/test/scripts/postinstall-bundled-plugins.test.ts
+++ b/test/scripts/postinstall-bundled-plugins.test.ts
@@ -550,11 +550,39 @@ describe("bundled plugin postinstall", () => {
    const staleFile = path.join(packageRoot, "dist", "stale-runtime.js");
    const packageJson = path.join(packageRoot, "dist", "extensions", "slack", "package.json");
    const binDir = path.join(packageRoot, "dist", "extensions", "slack", "node_modules", ".bin");
+    const installStageFile = path.join(
+      packageRoot,
+      "dist",
+      "extensions",
+      "slack",
+      ".openclaw-install-stage",
+      "node_modules",
+      "typebox",
+      "build",
+      "compile",
+      "code.mjs",
+    );
+    const retryInstallStageFile = path.join(
+      packageRoot,
+      "dist",
+      "extensions",
+      "slack",
+      ".openclaw-install-stage-retry",
+      "node_modules",
+      "typebox",
+      "build",
+      "compile",
+      "code.mjs",
+    );
    await fs.mkdir(path.dirname(staleFile), { recursive: true });
    await fs.mkdir(path.dirname(packageJson), { recursive: true });
    await fs.mkdir(binDir, { recursive: true });
+    await fs.mkdir(path.dirname(installStageFile), { recursive: true });
+    await fs.mkdir(path.dirname(retryInstallStageFile), { recursive: true });
    await fs.writeFile(staleFile, "export {};\n");
    await fs.writeFile(packageJson, "{}\n");
+    await fs.writeFile(installStageFile, "export {};\n");
+    await fs.writeFile(retryInstallStageFile, "export {};\n");
    await fs.symlink("../fxparser/bin.js", path.join(binDir, "fxparser"));

    expect(
@@ -564,6 +592,8 @@ describe("bundled plugin postinstall", () => {
        log: { log: vi.fn(), warn: vi.fn() },
      }),
    ).toEqual(["dist/stale-runtime.js"]);
+    await expect(fs.stat(installStageFile)).resolves.toBeDefined();
+    await expect(fs.stat(retryInstallStageFile)).resolves.toBeDefined();
  });

  it("unlinks stale files instead of recursive pruning them", () => {
Author	SHA1	Message	Date
Vincent Koc	c7ead7d8a9	fix(cli): lazy-load model auth runtime	2026-04-25 23:49:06 -07:00
Vincent Koc	62869c8502	docs(bluebubbles): rewrite with Steps for setup, Tabs for DM/groups and coalescing, AccordionGroup for actions and config	2026-04-25 23:48:13 -07:00
Vincent Koc	bb0ef5ef18	docs(troubleshooting): rewrite with AccordionGroup for symptom signatures, Steps for fix flows, and Warning callouts	2026-04-25 23:44:25 -07:00
Harry Xie	77719899f3	fix(gateway): refresh stale embedded service tokens. Refresh loaded gateway service installs when the current service embeds stale gateway auth instead of returning already-installed, avoiding LaunchAgent token-mismatch loops after token rotation. Fixes #70752. Thanks @hyspacex. Co-authored-by: Harry Xie <harryhsieh963@yahoo.com>	2026-04-26 07:42:14 +01:00
Vincent Koc	8c87a637e9	docs(doctor): rewrite with Tabs for headless flags and AccordionGroup for the 19+ detailed behaviors	2026-04-25 23:40:24 -07:00
Vincent Koc	c4a39a6819	docs(model-providers): rewrite with AccordionGroup, CardGroup, Tabs, and Steps for cleaner provider scan	2026-04-25 23:36:01 -07:00
Vincent Koc	82ddcf24f5	feat(diagnostics): add harness lifecycle telemetry	2026-04-25 23:34:34 -07:00
Peter Steinberger	8bbb143ab8	fix: enforce device token scope containment	2026-04-26 07:28:21 +01:00
Peter Steinberger	26e4eb8e40	fix(update): ignore plugin install stages in dist verify	2026-04-26 07:28:02 +01:00
Peter Steinberger	8368026986	fix(installer): preserve PowerShell host on failure	2026-04-26 07:23:48 +01:00
Peter Steinberger	1fae716a04	fix: recover stale cron task records	2026-04-26 07:23:39 +01:00
Ayaan Zaidi	9d21200049	test(e2e): cover npm onboarding runtime deps	2026-04-26 11:53:17 +05:30
Peter Steinberger	7091dbe2bf	docs: prefer ghcrawl for OpenClaw issue triage	2026-04-26 07:19:00 +01:00