fix: mark approval gateway calls as runtime clients

fix: wrap Mac menu gateway errors
fix(telegram): repair desktop proof login
2026-06-21 14:32:03 +08:00 · 2026-05-17 21:24:42 -07:00 · 2026-05-18 05:21:19 +01:00 · 2026-05-18 09:49:21 +05:30 · 2026-05-18 05:19:02 +01:00 · 2026-05-18 04:53:40 +01:00
487 changed files with 18558 additions and 4995 deletions
--- a/.agents/skills/autoreview/SKILL.md
+++ b/.agents/skills/autoreview/SKILL.md
@@ -7,6 +7,8 @@ description: "Autoreview closeout: local dirty changes, PR branch vs main, paral

 Run Codex's built-in code review as a closeout check. This is code review (`codex review`), not Guardian `auto_review` approval routing.

+Codex native review mode performs best and is recommended. Non-Codex reviewers are fallback/second-opinion paths that receive a generated diff prompt, not the full Codex review-mode runtime.
+
 Use when:
 - user asks for Codex review / autoreview / second-model review
 - after non-trivial code edits, before final/commit/ship
@@ -21,7 +23,7 @@ Use when:
 - Prefer small fixes at the right ownership boundary; no refactor unless it clearly improves the bug class.
 - Keep going until the selected review path returns no accepted/actionable findings.
 - If a review-triggered fix changes code, rerun focused tests and rerun the review helper.
- Default to Codex review. If Codex is unavailable or exits with an error, the helper may fall back to `claude -p`; `pi -p` and `opencode run` are explicit reviewer/fallback options. The helper runs nested Codex review in yolo/full-access mode by default; use `--no-yolo` only when intentionally testing sandbox behavior.
+- Default to Codex review. If Codex is unavailable or exits with an error, the helper falls back to the first configured CLI from `claude -p`, `pi -p`, `opencode run`, `droid exec`, or `copilot`. Prefer Codex for final closeout because it uses native review mode; non-Codex reviewers use a Codex-inspired generated diff prompt. The helper runs nested Codex review in yolo/full-access mode by default; use `--no-yolo` only when intentionally testing sandbox behavior.
 - Stop as soon as the review command/helper exits 0 with no accepted/actionable findings. Do not run an extra direct `codex review` just to get a nicer "clean" line, a second opinion, or clearer closeout wording.
 - Treat the helper's successful exit plus absence of actionable findings as the clean review result, even if the underlying Codex CLI output is terse.
 - If rejecting a finding as intentional/not worth fixing, add a brief inline code comment only when it explains a real invariant or ownership decision that future reviewers should know.
@@ -107,12 +109,12 @@ The helper:
 - otherwise uses `origin/main` for non-main branches
 - use `--mode commit --commit <ref>` for already-committed work, especially clean `main` after landing
 - should be left in `--mode auto` or forced to `--mode branch` for PR/branch work; do not force `--mode local` after committing
- supports `--reviewer codex|claude|pi|opencode|auto`; `auto` runs Codex first
- supports `--fallback-reviewer claude|pi|opencode|none`; default is `claude`
+- supports `--reviewer codex|claude|pi|opencode|droid|copilot|auto`; `auto` means Codex first
+- supports `--fallback-reviewer auto|claude|pi|opencode|droid|copilot|none`; default is `auto`, which picks the first configured CLI fallback
 - falls back only when Codex is unavailable or exits nonzero, not when Codex reports findings
 - writes only to stdout unless `--output` or `AUTOREVIEW_OUTPUT` is set
 - supports `--dry-run`, `--parallel-tests`, and commit refs
- runs nested review with `--dangerously-bypass-approvals-and-sandbox` by default
+- runs nested review with `--dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access` by default
 - keeps accepting `--full-access`; use `--no-yolo` or `AUTOREVIEW_YOLO=0` to opt out
 - still accepts legacy `CODEX_REVIEW_*` env vars when the matching `AUTOREVIEW_*` var is unset
 - prints `autoreview clean: no accepted/actionable findings reported` when the selected review command exits 0
--- a/.agents/skills/autoreview/scripts/autoreview
+++ b/.agents/skills/autoreview/scripts/autoreview
@@ -10,14 +10,16 @@ Options:
                              Target selection. Default: auto.
  --base REF                 Base ref for branch review. Default: PR base or origin/main.
  --commit REF               Commit ref for commit review. Default: HEAD.
-  --reviewer codex|claude|pi|opencode|auto
-                              Review engine. Default: auto (Codex, fallback reviewer on error).
-  --fallback-reviewer claude|pi|opencode|none
-                              Fallback when Codex is unavailable or exits nonzero. Default: claude.
+  --reviewer codex|claude|pi|opencode|droid|copilot|auto
+                              Review engine. Default: Codex with configured fallback on error.
+  --fallback-reviewer auto|claude|pi|opencode|droid|copilot|none
+                              Fallback when Codex is unavailable or exits nonzero. Default: auto.
  --codex-bin PATH           Codex binary. Default: codex.
  --claude-bin PATH          Claude binary. Default: claude.
  --pi-bin PATH              Pi binary. Default: pi.
  --opencode-bin PATH        OpenCode binary. Default: opencode.
+  --droid-bin PATH           Droid binary. Default: droid.
+  --copilot-bin PATH         GitHub Copilot binary. Default: copilot.
  --full-access              Keep yolo/full-access mode enabled. Default.
  --no-yolo                  Run nested Codex review with normal sandbox/approval prompts.
  --output FILE              Also save output to file.
@@ -37,11 +39,13 @@ mode=auto
 base_ref=
 commit_ref=HEAD
 reviewer=${AUTOREVIEW_REVIEWER:-${CODEX_REVIEW_REVIEWER:-auto}}
-fallback_reviewer=${AUTOREVIEW_FALLBACK_REVIEWER:-${CODEX_REVIEW_FALLBACK_REVIEWER:-claude}}
+fallback_reviewer=${AUTOREVIEW_FALLBACK_REVIEWER:-${CODEX_REVIEW_FALLBACK_REVIEWER:-auto}}
 codex_bin=${CODEX_BIN:-codex}
 claude_bin=${CLAUDE_BIN:-claude}
 pi_bin=${PI_BIN:-pi}
 opencode_bin=${OPENCODE_BIN:-opencode}
+droid_bin=${DROID_BIN:-droid}
+copilot_bin=${COPILOT_BIN:-copilot}
 codex_args=()
 yolo=${AUTOREVIEW_YOLO:-${CODEX_REVIEW_YOLO:-1}}
 output=${AUTOREVIEW_OUTPUT:-${CODEX_REVIEW_OUTPUT:-}}
@@ -86,6 +90,14 @@ while [[ $# -gt 0 ]]; do
      opencode_bin=${2:-}
      shift 2
      ;;
+    --droid-bin)
+      droid_bin=${2:-}
+      shift 2
+      ;;
+    --copilot-bin)
+      copilot_bin=${2:-}
+      shift 2
+      ;;
    --full-access)
      yolo=1
      shift
@@ -119,7 +131,7 @@ done

 case "$yolo" in
  0|false|False|FALSE|no|No|NO|off|Off|OFF) ;;
-  *) codex_args+=(--dangerously-bypass-approvals-and-sandbox) ;;
+  *) codex_args+=(--dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access) ;;
 esac

 case "$mode" in
@@ -131,7 +143,7 @@ case "$mode" in
 esac

 case "$reviewer" in
-  auto|codex|claude|pi|opencode) ;;
+  auto|codex|claude|pi|opencode|droid|copilot) ;;
  *)
    echo "invalid --reviewer: $reviewer" >&2
    exit 2
@@ -139,7 +151,7 @@ case "$reviewer" in
 esac

 case "$fallback_reviewer" in
-  claude|pi|opencode|none) ;;
+  auto|claude|pi|opencode|droid|copilot|none) ;;
  *)
    echo "invalid --fallback-reviewer: $fallback_reviewer" >&2
    exit 2
@@ -194,10 +206,17 @@ printf 'branch: %s\n' "${current_branch:-detached}"
 if [[ -n "$pr_url" ]]; then
  printf 'pr: %s\n' "$pr_url"
 fi
-printf 'reviewer: %s\n' "$reviewer"
 if [[ "$reviewer" == auto ]]; then
-  printf 'fallback-reviewer: %s\n' "$fallback_reviewer"
+  printf 'reviewer: codex\n'
+else
+  printf 'reviewer: %s\n' "$reviewer"
 fi
+case "$reviewer" in
+  codex|auto) ;;
+  *)
+    printf 'note: Codex native review mode is the recommended and best-supported review path; %s uses a generated diff prompt.\n' "$reviewer"
+    ;;
+esac
 if [[ "$reviewer" == auto || "$reviewer" == codex ]]; then
  printf 'review:'
  printf ' %q' "${review_cmd[@]}"
@@ -284,10 +303,14 @@ Base: ${base_ref:-}
 Commit: ${commit_ref:-}

 Rules:
- Review only the diff below.
+- Review the proposed code change as a closeout reviewer.
+- Focus on the diff below. If your CLI exposes read-only repository tools, inspect surrounding code and tests to verify findings; never modify files.
 - Do not modify files.
- Prioritize correctness bugs, regressions, security issues, and missing tests.
- Ignore speculative edge cases and broad rewrites.
+- Report only discrete, actionable issues introduced by this change.
+- Prioritize correctness, regressions, security, data loss, performance cliffs, and missing tests that would catch a real bug.
+- Do not report pre-existing issues, speculative risks, broad rewrites, style nits, changelog gaps, or findings that depend on unstated assumptions.
+- Identify the concrete scenario where the issue appears, and keep the line reference as small as possible.
+- A finding should overlap changed code or clearly cite changed code as the cause.
 - For each accepted/actionable finding, use exactly this format:
  [P<0-3>] Short title
  File: path:line
@@ -302,8 +325,15 @@ EOF
  } > "$prompt_file" || return
 }

+reviewer_output_has_clean_marker() {
+  local path=$1
+  grep -Eq '^[^[:alnum:]]*autoreview clean: no accepted/actionable findings reported[[:space:]]*$' "$path"
+}
+
 run_prompt_reviewer() {
  local selected=$1
+  local copilot_prompt=
+  local prompt_bytes=0
  local reviewer_output
  local status=0

@@ -343,13 +373,46 @@ run_prompt_reviewer() {
        echo "fallback reviewer unavailable: $opencode_bin" >&2
        status=127
      elif printf 'fallback: opencode run\n' | tee -a "$review_output"; then
-        "$opencode_bin" run --pure --dir "$(dirname "$prompt_file")" --file "$prompt_file" \
-          "Review the attached prompt file. Do not modify files." 2>&1 | tee -a "$review_output" "$reviewer_output"
+        "$opencode_bin" run --pure --dir "$repo_root" \
+          "Review the attached prompt file. Do not modify files." \
+          --file "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
        status=$?
      else
        status=$?
      fi
      ;;
+    droid)
+      if ! command -v "$droid_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $droid_bin" >&2
+        status=127
+      elif printf 'fallback: droid exec\n' | tee -a "$review_output"; then
+        "$droid_bin" exec --cwd "$repo_root" -f "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
+        status=$?
+      else
+        status=$?
+      fi
+      ;;
+    copilot)
+      if ! command -v "$copilot_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $copilot_bin" >&2
+        status=127
+      elif printf 'fallback: copilot\n' | tee -a "$review_output"; then
+        prompt_bytes=$(wc -c < "$prompt_file" | tr -d '[:space:]')
+        if (( prompt_bytes > 120000 )); then
+          echo "copilot reviewer unavailable: generated prompt is too large for copilot -p; use codex, droid, or another file/stdin-capable reviewer" \
+            2>&1 | tee -a "$review_output" "$reviewer_output"
+          status=1
+        else
+          copilot_prompt=$(< "$prompt_file")
+          "$copilot_bin" -C "$repo_root" --available-tools=none --stream off --output-format text --silent \
+            -p "$copilot_prompt" \
+            2>&1 | tee -a "$review_output" "$reviewer_output"
+          status=$?
+        fi
+      else
+        status=$?
+      fi
+      ;;
    *)
      echo "unsupported prompt reviewer: $selected" >&2
      status=2
@@ -360,7 +423,7 @@ run_prompt_reviewer() {
      status=1
    elif ! grep -q '[^[:space:]]' "$reviewer_output"; then
      status=1
-    elif ! grep -Fxq 'autoreview clean: no accepted/actionable findings reported' "$reviewer_output"; then
+    elif ! reviewer_output_has_clean_marker "$reviewer_output"; then
      status=1
    fi
  fi
@@ -380,7 +443,7 @@ run_selected_review() {
      fi
      run_review
      ;;
-    claude|pi|opencode)
+    claude|pi|opencode|droid|copilot)
      run_prompt_reviewer "$selected"
      ;;
    *)
@@ -390,6 +453,36 @@ run_selected_review() {
  esac
 }

+fallback_reviewer_is_available() {
+  local selected=$1
+  case "$selected" in
+    claude) command -v "$claude_bin" >/dev/null 2>&1 ;;
+    pi) command -v "$pi_bin" >/dev/null 2>&1 ;;
+    opencode) command -v "$opencode_bin" >/dev/null 2>&1 ;;
+    droid) command -v "$droid_bin" >/dev/null 2>&1 ;;
+    copilot) command -v "$copilot_bin" >/dev/null 2>&1 ;;
+    *) return 1 ;;
+  esac
+}
+
+run_auto_fallback_review() {
+  local selected
+  if [[ "$fallback_reviewer" != auto ]]; then
+    run_selected_review "$fallback_reviewer"
+    return $?
+  fi
+
+  for selected in claude pi opencode droid copilot; do
+    if fallback_reviewer_is_available "$selected"; then
+      run_selected_review "$selected"
+      return $?
+    fi
+  done
+
+  echo "fallback reviewer unavailable: no configured fallback CLI found" >&2
+  return 127
+}
+
 run_auto_review() {
  run_selected_review codex
  local status=$?
@@ -405,8 +498,12 @@ run_auto_review() {
  if [[ "$fallback_reviewer" == none ]]; then
    return "$status"
  fi
-  printf 'autoreview warning: codex exited %s; falling back to %s\n' "$status" "$fallback_reviewer" >&2
-  run_selected_review "$fallback_reviewer"
+  if [[ "$fallback_reviewer" == auto ]]; then
+    printf 'autoreview warning: codex exited %s; trying configured fallback reviewers\n' "$status" >&2
+  else
+    printf 'autoreview warning: codex exited %s; falling back to %s\n' "$status" "$fallback_reviewer" >&2
+  fi
+  run_auto_fallback_review
 }

 elapsed_since() {
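
A minimal sketch of how the relaxed clean-marker check in this file's hunks behaves. It reuses the `grep -Eq` pattern from `reviewer_output_has_clean_marker`, but the helper name `line_is_clean_marker` and the sample lines are illustrative assumptions, not part of the script. The old `grep -Fxq` check accepted only an exact line. The new pattern also tolerates leading non-alphanumeric decoration (for example a list bullet or quote marker) and trailing whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: applies the same pattern as
# reviewer_output_has_clean_marker to a single line instead of a file.
line_is_clean_marker() {
  printf '%s\n' "$1" |
    grep -Eq '^[^[:alnum:]]*autoreview clean: no accepted/actionable findings reported[[:space:]]*$'
}

marker='autoreview clean: no accepted/actionable findings reported'

# Sample reviewer lines, made up for illustration.
for line in "$marker" "- $marker" "> $marker  " "Result: $marker" "$marker."; do
  if line_is_clean_marker "$line"; then
    printf 'clean:     %s\n' "$line"
  else
    printf 'not clean: %s\n' "$line"
  fi
done
```

In this sketch the first three samples are accepted. `Result: ...` is rejected because alphanumeric text precedes the marker, and the variant with a trailing period is rejected because only whitespace may follow it.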
--- a/.agents/skills/openclaw-pr-maintainer/SKILL.md
+++ b/.agents/skills/openclaw-pr-maintainer/SKILL.md
@@ -24,6 +24,36 @@ gitcrawl search openclaw/openclaw --query "<scope or title keywords>" --mode hyb
 gitcrawl cluster-detail openclaw/openclaw --id <cluster-id> --member-limit 20 --body-chars 280 --json
 ```

+## Claim specific review targets
+
+When a maintainer asks Codex to review, triage, fix, or land a specific OpenClaw issue/PR, check assignment before deep work.
+
+- Identify the requesting maintainer's GitHub login. In this environment, default Peter to `steipete`; if another maintainer is clearly the requester, use that maintainer's bare login.
+- Read current assignees with live `gh issue view` / `gh pr view`; `gitcrawl` is not enough for assignment state.
+- If unassigned, assign the requester before deep review. This is allowed for specific requested targets; do not auto-assign broad discovery candidates or shortlists.
+- If assigned to someone else, say so clearly before analysis and include assignment age:
+  - fresh: assigned within 6h; treat as actively owned unless user explicitly asks to continue or reassign
+  - stale: assigned 6h+ ago; treat as ownership hint, not a hard block; continue only with that caveat
+- If assigned to requester plus others, mention co-assignees and continue.
+- If assignment event time is unavailable, say `assigned, time unknown`; treat as assigned, not stale.
+- Never remove or replace assignees unless explicitly asked.
+
+Assignment time proof:
+
+```bash
+gh api "repos/openclaw/openclaw/issues/<number>/timeline" --paginate \
+  -H "Accept: application/vnd.github+json" \
+  --jq '[.[] | select(.event=="assigned") | {assignee:.assignee.login, assigner:.assigner.login, actor:.actor.login, created_at}]'
+```
+
+Use the newest `assigned` event for each current assignee. Issue timeline events expose `created_at`; GitHub GraphQL `AssignedEvent.createdAt` is also valid when REST pagination is awkward.
+
+Claim command for issues or PRs:
+
+```bash
+gh api -X POST "repos/openclaw/openclaw/issues/<number>/assignees" -f 'assignees[]=<login>' >/dev/null
+```
+
 ## Surface opener identity

 - For every reviewed, triaged, closed, or landed issue/PR, show the opener's human name when available, GitHub login, and account age.
@@ -217,6 +247,7 @@ gh search issues --repo openclaw/openclaw --match title,body --limit 50 \
  not correctness findings.
 - If bot review conversations exist on your PR, address them and resolve them yourself once fixed.
 - Leave a review conversation unresolved only when reviewer or maintainer judgment is still needed.
+- Before landing any PR with non-trivial code changes, run `$autoreview` until no accepted/actionable findings remain, unless equivalent manual review already covered it, the change is trivial/docs-only, or the user opts out.
 - When landing or merging any PR, follow the global `/landpr` process.
 - Use `scripts/committer "<msg>" <file...>` for scoped commits instead of manual `git add` and `git commit`.
 - Keep commit messages concise and action-oriented.
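
A minimal sketch of the fresh/stale rule from the "Claim specific review targets" section, assuming the newest `assigned` event's `created_at` has already been converted to Unix epoch seconds. The function name `classify_assignment` and the sample timestamps are hypothetical. The conversion from ISO 8601 is left out because `date` flags differ between GNU and BSD.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an assignment by age using the 6h rule.
# Arguments are Unix epoch seconds; an empty assigned time means no
# assignment event time could be found.
classify_assignment() {
  local assigned_epoch=$1
  local now_epoch=$2
  local fresh_window=$((6 * 60 * 60))

  if [[ -z "$assigned_epoch" ]]; then
    # Per the skill text: treat as assigned, not stale.
    echo "assigned, time unknown"
  elif (( now_epoch - assigned_epoch < fresh_window )); then
    echo "fresh"
  else
    echo "stale"
  fi
}

now=1750000000                                  # illustrative "current" time
classify_assignment "$((now - 3600))" "$now"    # 1h old
classify_assignment "$((now - 21600))" "$now"   # exactly 6h old
classify_assignment "" "$now"                   # no timeline event found
```

The sketch treats an assignment that is exactly 6h old as stale, which is one reading of "assigned 6h+ ago".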
--- a/.github/codex/prompts/mantis-telegram-desktop-proof.md
+++ b/.github/codex/prompts/mantis-telegram-desktop-proof.md
@@ -16,8 +16,11 @@ Hard limits:
 - Do not finish with tiny, cropped-wrong, off-bottom, or sidebar-heavy GIFs.
 - Do not invent a generic proof. The proof must match the PR behavior.
 - Do not force GIFs for internal-only, workflow-only, test-only, docs-only, or
-  otherwise non-visual PRs. A no-visual-proof manifest is a successful outcome
-  when GIFs would be misleading.
+  otherwise non-visual PRs. A no-visual-proof manifest is a successful workflow
+  outcome when GIFs would be misleading, but it is not proof that the PR passed.
+- Keep public-facing manifest summaries short and user-domain. Do not mention
+  harness internals, mock-provider limits, secret/trust boundaries, local paths,
+  transcript seeding, or workflow implementation details in the summary.

 Inputs are provided as environment variables:

@@ -42,9 +45,10 @@ Required workflow:
   before/after. If it does not, write
   `${MANTIS_OUTPUT_DIR}/mantis-evidence.json` with `comparison.pass: true`, no
   artifacts, and a summary that starts with
-   `Mantis did not generate before/after GIFs because`. Include the concrete
-   reason in the summary. Use this manifest shape and do not create worktrees
-   or start Crabbox for this case:
+   `Mantis did not generate before/after GIFs because`. Include a short
+   public reason, such as `the PR changes internal session bookkeeping rather
+than Telegram-visible behavior`. Use this manifest shape and do not create
+   worktrees or start Crabbox for this case:

   ```json
   {
@@ -73,6 +77,14 @@ Required workflow:
   }
   ```

+   If the PR appears visual but proof is blocked by Telegram Desktop session
+   state, authorization, credentials, Crabbox, or another capture-infrastructure
+   issue, do not describe it as a no-visual PR. Write a manifest with
+   `comparison.pass: false`, skipped lanes, no artifacts, and a summary that
+   starts with `Mantis could not capture Telegram Desktop proof because`. The
+   publisher will keep that out of PR comments so the failure stays in the
+   workflow logs and artifacts.
+
 4. Decide what Telegram message, mock model response, command, callback, button,
   media, or sequence best proves the PR. Use `MANTIS_INSTRUCTIONS` as extra
   maintainer guidance, not as a replacement for reading the PR.
@@ -134,4 +146,6 @@ Expected final state:
  `Main` and `This PR`.
 - No-visual-proof manifests contain no artifacts and have `comparison.pass:
 true`.
+- Capture-infrastructure failure manifests contain no artifacts and have
+  `comparison.pass: false`.
 - The worktree can be dirty only under `.artifacts/`.
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -20,6 +20,8 @@ on:
      - "docs/**"
  pull_request:
    types: [opened, reopened, synchronize, ready_for_review, converted_to_draft]
+    paths-ignore:
+      - "CHANGELOG.md"

 permissions:
  contents: read
@@ -38,7 +40,7 @@ jobs:
    permissions:
      contents: read
    if: github.event_name != 'pull_request' || !github.event.pull_request.draft
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 20
    outputs:
      checkout_revision: ${{ steps.checkout_ref.outputs.sha }}
@@ -301,7 +303,7 @@ jobs:
    permissions:
      contents: read
    if: github.event_name != 'pull_request' || !github.event.pull_request.draft
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 20
    env:
      PRE_COMMIT_HOME: .cache/pre-commit-security-fast
@@ -394,7 +396,7 @@ jobs:
    permissions:
      contents: read
    if: github.event_name != 'pull_request' || !github.event.pull_request.draft
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 10
    steps:
      - name: Checkout
@@ -419,7 +421,7 @@ jobs:
    permissions: {}
    needs: [security-scm-fast, security-dependency-audit]
    if: ${{ !cancelled() && always() && (github.event_name != 'pull_request' || !github.event.pull_request.draft) }}
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 5
    steps:
      - name: Verify fast security jobs
@@ -641,6 +643,15 @@ jobs:
            echo "${name}-result=${results[$name]}" >> "$GITHUB_OUTPUT"
          done

+          failures=0
+          for name in channels core-support-boundary gateway-watch; do
+            if [ "${results[$name]}" = "failure" ]; then
+              echo "::error title=${name} failed::${name} failed"
+              failures=1
+            fi
+          done
+          exit "$failures"
+
      - name: Upload gateway watch regression artifacts
        if: always() && needs.preflight.outputs.run_check_additional == 'true'
        uses: actions/upload-artifact@v7
@@ -828,28 +839,6 @@ jobs:
          EOF
          OPENCLAW_VITEST_INCLUDE_FILE="$include_file" pnpm test:contracts:plugins

-  checks-fast-plugin-contracts:
-    permissions:
-      contents: read
-    name: checks-fast-contracts-plugins
-    needs: [preflight, checks-fast-plugin-contracts-shard]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_plugin_contracts_shards == 'true' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify plugin contract shards
-        env:
-          SHARD_RESULT: ${{ needs.checks-fast-plugin-contracts-shard.result }}
-        run: |
-          if [ "$SHARD_RESULT" = "cancelled" ]; then
-            echo "Plugin contract shards were cancelled, usually because a newer commit superseded this run." >&2
-            exit 1
-          fi
-          if [ "$SHARD_RESULT" != "success" ]; then
-            echo "Plugin contract shards failed: $SHARD_RESULT" >&2
-            exit 1
-          fi
-
  checks-fast-channel-contracts-shard:
    permissions:
      contents: read
@@ -934,35 +923,13 @@ jobs:
          EOF
          OPENCLAW_VITEST_INCLUDE_FILE="$include_file" pnpm test:contracts:channels

-  checks-fast-channel-contracts:
-    permissions:
-      contents: read
-    name: checks-fast-contracts-channels
-    needs: [preflight, checks-fast-channel-contracts-shard]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_checks_fast == 'true' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify channel contract shards
-        env:
-          SHARD_RESULT: ${{ needs.checks-fast-channel-contracts-shard.result }}
-        run: |
-          if [ "$SHARD_RESULT" = "cancelled" ]; then
-            echo "Channel contract shards were cancelled, usually because a newer commit superseded this run." >&2
-            exit 1
-          fi
-          if [ "$SHARD_RESULT" != "success" ]; then
-            echo "Channel contract shards failed: $SHARD_RESULT" >&2
-            exit 1
-          fi
-
  checks-fast-protocol:
    permissions:
      contents: read
    name: "checks-fast-protocol"
    needs: [preflight]
    if: needs.preflight.outputs.run_checks_fast == 'true'
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 30
    steps:
      - name: Checkout
@@ -1021,38 +988,6 @@ jobs:
      - name: Run protocol check
        run: pnpm protocol:check

-  checks:
-    permissions:
-      contents: read
-    name: ${{ matrix.check_name }}
-    needs: [preflight, build-artifacts]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_checks == 'true' && needs.build-artifacts.result == 'success' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    strategy:
-      fail-fast: false
-      matrix: ${{ fromJson(needs.preflight.outputs.checks_matrix) }}
-    steps:
-      - name: Verify ${{ matrix.task }} (${{ matrix.runtime }})
-        env:
-          TASK: ${{ matrix.task }}
-          CHANNELS_RESULT: ${{ needs.build-artifacts.outputs['channels-result'] }}
-        shell: bash
-        run: |
-          set -euo pipefail
-          case "$TASK" in
-            channels)
-              if [ "$CHANNELS_RESULT" != "success" ]; then
-                echo "Channel tests failed in build-artifacts: $CHANNELS_RESULT" >&2
-                exit 1
-              fi
-              ;;
-            *)
-              echo "Unsupported checks task: $TASK" >&2
-              exit 1
-              ;;
-          esac
-
  checks-node-compat:
    permissions:
      contents: read
@@ -1240,63 +1175,6 @@ jobs:
          }
          EOF

-  checks-node-core-test-dist-shard:
-    permissions:
-      contents: read
-    name: ${{ matrix.check_name }}
-    needs: [preflight, build-artifacts]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_checks_node_core_dist == 'true' && needs.build-artifacts.result == 'success' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    strategy:
-      fail-fast: false
-      matrix: ${{ fromJson(needs.preflight.outputs.checks_node_core_dist_matrix) }}
-    steps:
-      - name: Verify Node test shard
-        env:
-          CORE_SUPPORT_BOUNDARY_RESULT: ${{ needs.build-artifacts.outputs['core-support-boundary-result'] }}
-          SHARD_NAME: ${{ matrix.shard_name }}
-        shell: bash
-        run: |
-          set -euo pipefail
-          case "$SHARD_NAME" in
-            core-support-boundary)
-              if [ "$CORE_SUPPORT_BOUNDARY_RESULT" != "success" ]; then
-                echo "Core support boundary shard failed in build-artifacts: $CORE_SUPPORT_BOUNDARY_RESULT" >&2
-                exit 1
-              fi
-              ;;
-            *)
-              echo "Unsupported built-artifact shard: $SHARD_NAME" >&2
-              exit 1
-              ;;
-          esac
-
-  checks-node-core-test:
-    permissions:
-      contents: read
-    name: checks-node-core
-    needs: [preflight, checks-node-core-test-nondist-shard, checks-node-core-test-dist-shard]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_checks == 'true' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify node test shards
-        env:
-          DIST_SHARD_RESULT: ${{ needs.checks-node-core-test-dist-shard.result }}
-          NONDIST_SHARD_RESULT: ${{ needs.checks-node-core-test-nondist-shard.result }}
-          RUN_DIST_SHARDS: ${{ needs.preflight.outputs.run_checks_node_core_dist }}
-          RUN_NONDIST_SHARDS: ${{ needs.preflight.outputs.run_checks_node_core_nondist }}
-        run: |
-          if [ "$RUN_NONDIST_SHARDS" = "true" ] && [ "$NONDIST_SHARD_RESULT" != "success" ]; then
-            echo "Node non-dist test shards failed: $NONDIST_SHARD_RESULT" >&2
-            exit 1
-          fi
-          if [ "$RUN_DIST_SHARDS" = "true" ] && [ "$DIST_SHARD_RESULT" != "success" ]; then
-            echo "Node dist test shards failed: $DIST_SHARD_RESULT" >&2
-            exit 1
-          fi
-
  # Types, lint, and format check shards.
  check-shard:
    permissions:
@@ -1312,7 +1190,7 @@ jobs:
        include:
          - check_name: check-preflight-guards
            task: preflight-guards
-            runner: ubuntu-24.04
+            runner: blacksmith-4vcpu-ubuntu-2404
          - check_name: check-prod-types
            task: prod-types
            runner: blacksmith-4vcpu-ubuntu-2404
@@ -1321,16 +1199,16 @@ jobs:
            runner: blacksmith-16vcpu-ubuntu-2404
          - check_name: check-dependencies
            task: dependencies
-            runner: ubuntu-24.04
+            runner: blacksmith-8vcpu-ubuntu-2404
          - check_name: check-policy-guards
            task: policy-guards
-            runner: ubuntu-24.04
+            runner: blacksmith-4vcpu-ubuntu-2404
          - check_name: check-test-types
            task: test-types
            runner: blacksmith-4vcpu-ubuntu-2404
          - check_name: check-strict-smoke
            task: strict-smoke
-            runner: ubuntu-24.04
+            runner: blacksmith-4vcpu-ubuntu-2404
    steps:
      - name: Checkout
        shell: bash
@@ -1442,24 +1320,6 @@ jobs:
          path: .artifacts/deadcode
          if-no-files-found: ignore

-  check:
-    permissions:
-      contents: read
-    name: "check"
-    needs: [preflight, check-shard]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_check == 'true' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify check shards
-        env:
-          SHARD_RESULT: ${{ needs.check-shard.result }}
-        run: |
-          if [ "$SHARD_RESULT" != "success" ]; then
-            echo "Check shards failed: $SHARD_RESULT" >&2
-            exit 1
-          fi
-
  check-additional-shard:
    permissions:
      contents: read
@@ -1637,59 +1497,13 @@ jobs:

          exit "$failures"

-  check-additional:
-    permissions:
-      contents: read
-    name: "check-additional"
-    needs: [preflight, check-additional-shard, build-artifacts]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_check_additional == 'true' }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify additional check shards
-        env:
-          SHARD_RESULT: ${{ needs.check-additional-shard.result }}
-          BUILD_ARTIFACTS_RESULT: ${{ needs.build-artifacts.result }}
-          GATEWAY_RESULT: ${{ needs.build-artifacts.outputs.gateway-watch-result }}
-        run: |
-          if [ "$SHARD_RESULT" != "success" ]; then
-            echo "Additional check shards failed: $SHARD_RESULT" >&2
-            exit 1
-          fi
-          if [ "$BUILD_ARTIFACTS_RESULT" != "success" ]; then
-            echo "Build artifact job failed: $BUILD_ARTIFACTS_RESULT" >&2
-            exit 1
-          fi
-          if [ "$GATEWAY_RESULT" != "success" ]; then
-            echo "Gateway topology check failed: $GATEWAY_RESULT" >&2
-            exit 1
-          fi
-
-  build-smoke:
-    permissions:
-      contents: read
-    name: "build-smoke"
-    needs: [preflight, build-artifacts]
-    if: ${{ !cancelled() && always() && needs.preflight.outputs.run_build_smoke == 'true' && (github.event_name != 'push' || needs.build-artifacts.result == 'success') }}
-    runs-on: ubuntu-24.04
-    timeout-minutes: 5
-    steps:
-      - name: Verify build smoke
-        env:
-          BUILD_ARTIFACTS_RESULT: ${{ needs.build-artifacts.result }}
-        run: |
-          if [ "$BUILD_ARTIFACTS_RESULT" != "success" ]; then
-            echo "Build smoke checks failed in build-artifacts: $BUILD_ARTIFACTS_RESULT" >&2
-            exit 1
-          fi
-
  # Validate docs (format, lint, broken links) only when docs files changed.
  check-docs:
    permissions:
      contents: read
    needs: [preflight]
    if: needs.preflight.outputs.run_check_docs == 'true'
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 20
    steps:
      - name: Checkout
@@ -1763,7 +1577,7 @@ jobs:
      contents: read
    needs: [preflight]
    if: needs.preflight.outputs.run_skills_python_job == 'true'
-    runs-on: ubuntu-24.04
+    runs-on: ${{ github.event_name == 'workflow_dispatch' && 'ubuntu-24.04' || (github.repository == 'openclaw/openclaw' && 'blacksmith-4vcpu-ubuntu-2404' || 'ubuntu-24.04') }}
    timeout-minutes: 20
    steps:
      - name: Checkout
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -6,6 +6,7 @@ on:
    paths:
      - "**/*.md"
      - "docs/**"
+      - "!CHANGELOG.md"

 permissions:
  contents: read
--- a/.github/workflows/full-release-validation.yml
+++ b/.github/workflows/full-release-validation.yml
@@ -638,6 +638,7 @@ jobs:
    name: Run package Telegram E2E
    needs: [resolve_target, prepare_release_package]
    if: ${{ always() && contains(fromJSON('["all","npm-telegram"]'), inputs.rerun_group) && (inputs.npm_telegram_package_spec != '' || inputs.release_package_spec != '' || (inputs.rerun_group == 'all' && inputs.release_profile == 'full')) }}
+    continue-on-error: ${{ startsWith(github.ref, 'refs/heads/tideclaw/alpha/') }}
    runs-on: ubuntu-24.04
    timeout-minutes: ${{ inputs.release_profile == 'full' && 120 || 60 }}
    outputs:
@@ -955,6 +956,8 @@ jobs:

          if [[ "$NPM_TELEGRAM_RESULT" == "skipped" && -z "${NPM_TELEGRAM_RUN_ID// }" ]]; then
            check_child "npm_telegram" "" 0 || failed=1
+          elif [[ "$CHILD_WORKFLOW_REF" =~ ^tideclaw/alpha/[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}Z$ ]]; then
+            check_child "npm_telegram" "$NPM_TELEGRAM_RUN_ID" 0 || echo "::warning::npm_telegram is advisory for Tideclaw alpha validation."
          else
            check_child "npm_telegram" "$NPM_TELEGRAM_RUN_ID" 1 || failed=1
          fi
--- a/.github/workflows/mantis-discord-status-reactions.yml
+++ b/.github/workflows/mantis-discord-status-reactions.yml
@@ -46,9 +46,8 @@ jobs:
          github.event_name == 'issue_comment' &&
          github.event.issue.pull_request &&
          (
-            contains(github.event.comment.body, '@Mantis') ||
-            contains(github.event.comment.body, '@mantis') ||
-            contains(github.event.comment.body, '/mantis')
+            contains(github.event.comment.body, '@openclaw-mantis') ||
+            contains(github.event.comment.body, '/openclaw-mantis')
          )
        )
      }}
@@ -128,7 +127,7 @@ jobs:

            const normalized = body.toLowerCase();
            const requested =
-              (normalized.includes("@mantis") || normalized.includes("/mantis")) &&
+              (normalized.includes("@openclaw-mantis") || normalized.includes("/openclaw-mantis")) &&
              normalized.includes("discord") &&
              normalized.includes("status") &&
              normalized.includes("reaction");
@@ -574,3 +573,44 @@ jobs:
            --artifact-url "$ARTIFACT_URL" \
            --run-url "https://github.com/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}" \
            --request-source "$REQUEST_SOURCE"
+
+  clear_issue_comment_reaction:
+    name: Clear Mantis command reaction
+    needs: [resolve_request, validate_refs, run_status_reactions]
+    if: ${{ always() && github.event_name == 'issue_comment' && needs.resolve_request.outputs.request_source == 'issue_comment' }}
+    runs-on: ubuntu-24.04
+    permissions:
+      issues: write
+    steps:
+      - name: Remove workflow eyes reaction
+        uses: actions/github-script@v8
+        with:
+          script: |
+            const { owner, repo } = context.repo;
+            const commentId = context.payload.comment?.id;
+            if (!commentId) {
+              core.info("No issue comment id found; skipping reaction cleanup.");
+              return;
+            }
+
+            const reactions = await github.paginate(github.rest.reactions.listForIssueComment, {
+              owner,
+              repo,
+              comment_id: commentId,
+              per_page: 100,
+            });
+            const eyes = reactions.filter(
+              (reaction) => reaction.content === "eyes" && reaction.user?.login === "github-actions[bot]",
+            );
+            for (const reaction of eyes) {
+              await github.rest.reactions.deleteForIssueComment({
+                owner,
+                repo,
+                comment_id: commentId,
+                reaction_id: reaction.id,
+              });
+              core.info(`Removed eyes reaction ${reaction.id} from comment ${commentId}.`);
+            }
+            if (eyes.length === 0) {
+              core.info(`No workflow eyes reaction found on comment ${commentId}.`);
+            }
--- a/.github/workflows/mantis-discord-thread-attachment.yml
+++ b/.github/workflows/mantis-discord-thread-attachment.yml
@@ -46,9 +46,8 @@ jobs:
          github.event_name == 'issue_comment' &&
          github.event.issue.pull_request &&
          (
-            contains(github.event.comment.body, '@Mantis') ||
-            contains(github.event.comment.body, '@mantis') ||
-            contains(github.event.comment.body, '/mantis')
+            contains(github.event.comment.body, '@openclaw-mantis') ||
+            contains(github.event.comment.body, '/openclaw-mantis')
          )
        )
      }}
@@ -128,7 +127,7 @@ jobs:

            const normalized = body.toLowerCase();
            const requested =
-              (normalized.includes("@mantis") || normalized.includes("/mantis")) &&
+              (normalized.includes("@openclaw-mantis") || normalized.includes("/openclaw-mantis")) &&
              normalized.includes("discord") &&
              normalized.includes("thread") &&
              (normalized.includes("attachment") ||
@@ -596,3 +595,44 @@ jobs:
        run: |
          echo "Mantis comparison failed." >&2
          exit 1
+
+  clear_issue_comment_reaction:
+    name: Clear Mantis command reaction
+    needs: [resolve_request, validate_candidate, run_thread_attachment]
+    if: ${{ always() && github.event_name == 'issue_comment' && needs.resolve_request.outputs.request_source == 'issue_comment' }}
+    runs-on: ubuntu-24.04
+    permissions:
+      issues: write
+    steps:
+      - name: Remove workflow eyes reaction
+        uses: actions/github-script@v8
+        with:
+          script: |
+            const { owner, repo } = context.repo;
+            const commentId = context.payload.comment?.id;
+            if (!commentId) {
+              core.info("No issue comment id found; skipping reaction cleanup.");
+              return;
+            }
+
+            const reactions = await github.paginate(github.rest.reactions.listForIssueComment, {
+              owner,
+              repo,
+              comment_id: commentId,
+              per_page: 100,
+            });
+            const eyes = reactions.filter(
+              (reaction) => reaction.content === "eyes" && reaction.user?.login === "github-actions[bot]",
+            );
+            for (const reaction of eyes) {
+              await github.rest.reactions.deleteForIssueComment({
+                owner,
+                repo,
+                comment_id: commentId,
+                reaction_id: reaction.id,
+              });
+              core.info(`Removed eyes reaction ${reaction.id} from comment ${commentId}.`);
+            }
+            if (eyes.length === 0) {
+              core.info(`No workflow eyes reaction found on comment ${commentId}.`);
+            }
--- a/.github/workflows/mantis-telegram-desktop-proof.yml
+++ b/.github/workflows/mantis-telegram-desktop-proof.yml
@@ -3,7 +3,7 @@ name: Mantis Telegram Desktop Proof
 on:
  issue_comment:
    types: [created]
-  pull_request_target:
+  pull_request_target: # zizmor: ignore[dangerous-triggers] maintainer-owned Mantis label trigger; trusted base workflow validates refs before checkout/use
    types: [labeled]
  workflow_dispatch:
    inputs:
@@ -120,6 +120,7 @@ jobs:
      publish_run_id: ${{ steps.resolve.outputs.publish_run_id }}
      pr_number: ${{ steps.resolve.outputs.pr_number }}
      request_source: ${{ steps.resolve.outputs.request_source }}
+      should_run: ${{ steps.resolve.outputs.should_run }}
    steps:
      - name: Resolve refs and target PR
        id: resolve
@@ -145,24 +146,52 @@ jobs:
              return;
            }

-            const { owner, repo } = context.repo;
-            const { data: pr } = await github.rest.pulls.get({
-              owner,
-              repo,
-              pull_number: Number(prNumber),
-            });
            const body =
              eventName === "workflow_dispatch"
                ? inputs.instructions || ""
                : eventName === "issue_comment"
                  ? context.payload.comment?.body || ""
                  : "";
+            if (eventName === "issue_comment") {
+              const normalized = body.toLowerCase();
+              const requestedDesktopProof =
+                (normalized.includes("@openclaw-mantis") || normalized.includes("/openclaw-mantis")) &&
+                (normalized.includes("desktop proof") ||
+                  normalized.includes("desktop-proof") ||
+                  normalized.includes("telegram desktop") ||
+                  normalized.includes("native telegram") ||
+                  normalized.includes("visible proof") ||
+                  normalized.includes("visible-proof") ||
+                  normalized.includes("telegram-visible-proof"));
+              if (!requestedDesktopProof) {
+                core.notice("Comment mentioned Mantis but did not request Telegram desktop proof.");
+                setOutput("should_run", "false");
+                setOutput("baseline_ref", "");
+                setOutput("candidate_ref", "");
+                setOutput("pr_number", "");
+                setOutput("instructions", "");
+                setOutput("crabbox_provider", "");
+                setOutput("lease_id", "");
+                setOutput("publish_artifact_name", "");
+                setOutput("publish_run_id", "");
+                setOutput("request_source", "unsupported_issue_comment");
+                return;
+              }
+            }
+
+            const { owner, repo } = context.repo;
+            const { data: pr } = await github.rest.pulls.get({
+              owner,
+              repo,
+              pull_number: Number(prNumber),
+            });
            const provider = inputs.crabbox_provider || "aws";
            if (!["aws", "hetzner"].includes(provider)) {
              core.setFailed(`Unsupported Crabbox provider for Mantis Telegram desktop proof: ${provider}`);
              return;
            }

+            setOutput("should_run", "true");
            setOutput("baseline_ref", pr.base.sha);
            setOutput("candidate_ref", pr.head.sha);
            setOutput("pr_number", String(pr.number));
@@ -185,7 +214,7 @@ jobs:
  validate_refs:
    name: Validate selected refs
    needs: resolve_request
-    if: needs.resolve_request.outputs.publish_artifact_name == ''
+    if: needs.resolve_request.outputs.should_run == 'true' && needs.resolve_request.outputs.publish_artifact_name == ''
    runs-on: ubuntu-24.04
    outputs:
      baseline_revision: ${{ steps.validate.outputs.baseline_revision }}
@@ -264,7 +293,7 @@ jobs:
  run_telegram_desktop_proof:
    name: Run agentic native Telegram proof
    needs: [resolve_request, validate_refs]
-    if: needs.resolve_request.outputs.publish_artifact_name == ''
+    if: needs.resolve_request.outputs.should_run == 'true' && needs.resolve_request.outputs.publish_artifact_name == ''
    runs-on: blacksmith-16vcpu-ubuntu-2404
    timeout-minutes: 360
    environment: qa-live-shared
@@ -429,6 +458,7 @@ jobs:
          codex-home: /tmp/mantis-codex-home-${{ github.run_id }}
          safety-strategy: unprivileged-user
          codex-user: codex
+          allow-bot-users: clawsweeper[bot]

      - name: Inspect Mantis evidence manifest
        id: inspect
@@ -513,7 +543,7 @@ jobs:
  publish_existing_telegram_desktop_proof:
    name: Publish existing native Telegram proof
    needs: resolve_request
-    if: needs.resolve_request.outputs.publish_artifact_name != ''
+    if: needs.resolve_request.outputs.should_run == 'true' && needs.resolve_request.outputs.publish_artifact_name != ''
    runs-on: ubuntu-24.04
    environment: qa-live-shared
    steps:
@@ -598,3 +628,44 @@ jobs:
            --artifact-url "$PUBLISH_ARTIFACT_URL" \
            --run-url "https://github.com/${GITHUB_REPOSITORY}/actions/runs/${PUBLISH_RUN_ID}" \
            --request-source "$REQUEST_SOURCE"
+
+  clear_issue_comment_reaction:
+    name: Clear Mantis command reaction
+    needs: [resolve_request, validate_refs, run_telegram_desktop_proof]
+    if: ${{ always() && github.event_name == 'issue_comment' && needs.resolve_request.outputs.request_source == 'issue_comment' }}
+    runs-on: ubuntu-24.04
+    permissions:
+      issues: write
+    steps:
+      - name: Remove workflow eyes reaction
+        uses: actions/github-script@v8
+        with:
+          script: |
+            const { owner, repo } = context.repo;
+            const commentId = context.payload.comment?.id;
+            if (!commentId) {
+              core.info("No issue comment id found; skipping reaction cleanup.");
+              return;
+            }
+
+            const reactions = await github.paginate(github.rest.reactions.listForIssueComment, {
+              owner,
+              repo,
+              comment_id: commentId,
+              per_page: 100,
+            });
+            const eyes = reactions.filter(
+              (reaction) => reaction.content === "eyes" && reaction.user?.login === "github-actions[bot]",
+            );
+            for (const reaction of eyes) {
+              await github.rest.reactions.deleteForIssueComment({
+                owner,
+                repo,
+                comment_id: commentId,
+                reaction_id: reaction.id,
+              });
+              core.info(`Removed eyes reaction ${reaction.id} from comment ${commentId}.`);
+            }
+            if (eyes.length === 0) {
+              core.info(`No workflow eyes reaction found on comment ${commentId}.`);
+            }
--- a/.github/workflows/mantis-telegram-live.yml
+++ b/.github/workflows/mantis-telegram-live.yml
@@ -56,9 +56,8 @@ jobs:
          github.event_name == 'issue_comment' &&
          github.event.issue.pull_request &&
          (
-            contains(github.event.comment.body, '@Mantis') ||
-            contains(github.event.comment.body, '@mantis') ||
-            contains(github.event.comment.body, '/mantis')
+            contains(github.event.comment.body, '@openclaw-mantis') ||
+            contains(github.event.comment.body, '/openclaw-mantis')
          )
        )
      }}
@@ -140,9 +139,18 @@ jobs:
            }

            const normalized = body.toLowerCase();
+            const requestedDesktopProof =
+              normalized.includes("desktop proof") ||
+              normalized.includes("desktop-proof") ||
+              normalized.includes("telegram desktop") ||
+              normalized.includes("native telegram") ||
+              normalized.includes("visible proof") ||
+              normalized.includes("visible-proof") ||
+              normalized.includes("telegram-visible-proof");
            const requested =
-              (normalized.includes("@mantis") || normalized.includes("/mantis")) &&
-              normalized.includes("telegram");
+              (normalized.includes("@openclaw-mantis") || normalized.includes("/openclaw-mantis")) &&
+              normalized.includes("telegram") &&
+              !requestedDesktopProof;
            if (!requested) {
              core.notice("Comment mentioned Mantis but did not request Telegram live QA.");
              setOutput("should_run", "false");
@@ -532,3 +540,44 @@ jobs:
        run: |
          echo "Mantis Telegram live failed: comparison=${COMPARISON_STATUS:-unset} telegram_exit=${TELEGRAM_EXIT:-unset}." >&2
          exit 1
+
+  clear_issue_comment_reaction:
+    name: Clear Mantis command reaction
+    needs: [resolve_request, validate_ref, run_telegram_live]
+    if: ${{ always() && github.event_name == 'issue_comment' && needs.resolve_request.outputs.request_source == 'issue_comment' }}
+    runs-on: ubuntu-24.04
+    permissions:
+      issues: write
+    steps:
+      - name: Remove workflow eyes reaction
+        uses: actions/github-script@v8
+        with:
+          script: |
+            const { owner, repo } = context.repo;
+            const commentId = context.payload.comment?.id;
+            if (!commentId) {
+              core.info("No issue comment id found; skipping reaction cleanup.");
+              return;
+            }
+
+            const reactions = await github.paginate(github.rest.reactions.listForIssueComment, {
+              owner,
+              repo,
+              comment_id: commentId,
+              per_page: 100,
+            });
+            const eyes = reactions.filter(
+              (reaction) => reaction.content === "eyes" && reaction.user?.login === "github-actions[bot]",
+            );
+            for (const reaction of eyes) {
+              await github.rest.reactions.deleteForIssueComment({
+                owner,
+                repo,
+                comment_id: commentId,
+                reaction_id: reaction.id,
+              });
+              core.info(`Removed eyes reaction ${reaction.id} from comment ${commentId}.`);
+            }
+            if (eyes.length === 0) {
+              core.info(`No workflow eyes reaction found on comment ${commentId}.`);
+            }
--- a/.github/workflows/npm-telegram-beta-e2e.yml
+++ b/.github/workflows/npm-telegram-beta-e2e.yml
@@ -40,8 +40,18 @@ on:
        description: Optional comma-separated Telegram scenario ids
        required: false
        type: string
+      advisory:
+        description: Treat package Telegram failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
  workflow_call:
    inputs:
+      advisory:
+        description: Treat package Telegram failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
      package_spec:
        description: Published OpenClaw package spec to test when no artifact is supplied
        required: true
@@ -100,6 +110,7 @@ jobs:
  run_package_telegram_e2e:
    name: Run package Telegram E2E
    runs-on: blacksmith-32vcpu-ubuntu-2404
+    continue-on-error: ${{ inputs.advisory }}
    timeout-minutes: 60
    environment: qa-live-shared
    permissions:
--- a/.github/workflows/openclaw-cross-os-release-checks-reusable.yml
+++ b/.github/workflows/openclaw-cross-os-release-checks-reusable.yml
@@ -86,8 +86,18 @@ on:
        required: false
        default: ""
        type: string
+      advisory:
+        description: Treat failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
  workflow_call:
    inputs:
+      advisory:
+        description: Treat failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
      ref:
        description: Public OpenClaw ref to validate (tag, branch, or full commit SHA)
        required: true
@@ -191,6 +201,7 @@ env:
 jobs:
  prepare:
    runs-on: ubuntu-24.04
+    continue-on-error: ${{ inputs.advisory }}
    outputs:
      baseline_file_name: ${{ steps.baseline_metadata.outputs.file_name }}
      baseline_spec: ${{ steps.baseline.outputs.value }}
@@ -513,6 +524,7 @@ jobs:
  cross_os_release_checks:
    name: "${{ matrix.display_name }} / ${{ matrix.suite_label }}"
    needs: prepare
+    continue-on-error: ${{ inputs.advisory }}
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.prepare.outputs.matrix) }}
--- a/.github/workflows/openclaw-live-and-e2e-checks-reusable.yml
+++ b/.github/workflows/openclaw-live-and-e2e-checks-reusable.yml
@@ -97,8 +97,18 @@ on:
          - beta
          - stable
          - full
+      advisory:
+        description: Treat failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
  workflow_call:
    inputs:
+      advisory:
+        description: Treat failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
      ref:
        description: Ref, tag, or SHA to validate
        required: true
@@ -455,6 +465,7 @@ jobs:
  validate_release_live_cache:
    needs: validate_selected_ref
    if: inputs.include_live_suites && !inputs.live_models_only && (inputs.live_suite_filter == '' || inputs.live_suite_filter == 'live-cache')
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-8vcpu-ubuntu-2404' }}
    timeout-minutes: 20
    env:
@@ -505,6 +516,7 @@ jobs:
  validate_repo_e2e:
    needs: validate_selected_ref
    if: inputs.include_repo_e2e && inputs.live_suite_filter == ''
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-8vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ inputs.release_test_profile == 'full' && 90 || 60 }}
    env:
@@ -534,6 +546,7 @@ jobs:
  validate_special_e2e:
    needs: validate_selected_ref
    if: inputs.include_repo_e2e && (inputs.live_suite_filter == '' || inputs.live_suite_filter == 'openshell-e2e')
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-8vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ matrix.timeout_minutes }}
    strategy:
@@ -608,6 +621,7 @@ jobs:
    needs: [validate_selected_ref, prepare_docker_e2e_image]
    if: inputs.include_release_path_suites && inputs.docker_lanes == ''
    name: Docker E2E (${{ matrix.label }})
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ matrix.timeout_minutes }}
    strategy:
@@ -876,6 +890,7 @@ jobs:
  plan_docker_lane_groups:
    needs: validate_selected_ref
    if: inputs.docker_lanes != ''
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-4vcpu-ubuntu-2404' }}
    timeout-minutes: 5
    outputs:
@@ -903,6 +918,7 @@ jobs:
    needs: [validate_selected_ref, prepare_docker_e2e_image, plan_docker_lane_groups]
    if: inputs.docker_lanes != ''
    name: Docker E2E targeted lanes (${{ matrix.group.label }})
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: 60
    strategy:
@@ -1112,6 +1128,7 @@ jobs:
    needs: [validate_selected_ref, prepare_docker_e2e_image]
    if: inputs.include_openwebui && !inputs.include_release_path_suites && inputs.docker_lanes == ''
    name: Docker E2E (openwebui)
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: 60
    env:
@@ -1239,6 +1256,7 @@ jobs:
  prepare_docker_e2e_image:
    needs: validate_selected_ref
    if: inputs.include_release_path_suites || inputs.include_openwebui || inputs.docker_lanes != ''
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ inputs.release_test_profile == 'full' && 90 || 60 }}
    permissions:
@@ -1483,6 +1501,7 @@ jobs:
  prepare_live_test_image:
    needs: validate_selected_ref
    if: inputs.include_live_suites && (inputs.live_suite_filter == '' || startsWith(inputs.live_suite_filter, 'live-') || startsWith(inputs.live_suite_filter, 'docker-live-models'))
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: 60
    permissions:
@@ -1556,6 +1575,7 @@ jobs:
    name: Docker live models (${{ matrix.provider_label }})
    needs: [validate_selected_ref, prepare_live_test_image]
    if: inputs.include_live_suites && inputs.live_model_providers == '' && (inputs.live_suite_filter == '' || inputs.live_suite_filter == 'docker-live-models')
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: 45
    strategy:
@@ -1708,6 +1728,7 @@ jobs:
    name: Docker live models (selected providers)
    needs: [validate_selected_ref, prepare_live_test_image]
    if: inputs.include_live_suites && inputs.live_model_providers != '' && (inputs.live_suite_filter == '' || inputs.live_suite_filter == 'docker-live-models')
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: 45
    env:
@@ -1883,6 +1904,7 @@ jobs:
  validate_live_provider_suites:
    needs: validate_selected_ref
    if: inputs.include_live_suites && !inputs.live_models_only && (inputs.live_suite_filter == '' || (startsWith(inputs.live_suite_filter, 'native-live-') && !startsWith(inputs.live_suite_filter, 'native-live-extensions-media') && inputs.live_suite_filter != 'native-live-extensions-a-k'))
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-8vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ matrix.timeout_minutes }}
    strategy:
@@ -2204,6 +2226,7 @@ jobs:
    name: Docker live suites (${{ matrix.label }})
    needs: [validate_selected_ref, prepare_live_test_image]
    if: inputs.include_live_suites && !inputs.live_models_only && (inputs.live_suite_filter == '' || startsWith(inputs.live_suite_filter, 'live-'))
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-32vcpu-ubuntu-2404' }}
    timeout-minutes: ${{ matrix.timeout_minutes }}
    strategy:
@@ -2423,6 +2446,7 @@ jobs:
    name: Live media suites (${{ matrix.label }})
    needs: validate_selected_ref
    if: inputs.include_live_suites && !inputs.live_models_only && (inputs.live_suite_filter == '' || startsWith(inputs.live_suite_filter, 'native-live-extensions-media') || inputs.live_suite_filter == 'native-live-extensions-a-k')
+    continue-on-error: ${{ inputs.advisory }}
    runs-on: ${{ github.event_name == 'workflow_call' && 'ubuntu-24.04' || 'blacksmith-8vcpu-ubuntu-2404' }}
    container:
      image: ghcr.io/openclaw/openclaw-live-media-runner:ubuntu-24.04
--- a/.github/workflows/openclaw-release-checks.yml
+++ b/.github/workflows/openclaw-release-checks.yml
@@ -534,6 +534,7 @@ jobs:
    permissions: read-all
    uses: ./.github/workflows/openclaw-cross-os-release-checks-reusable.yml
    with:
+      advisory: ${{ startsWith(github.ref, 'refs/heads/tideclaw/alpha/') }}
      ref: ${{ needs.resolve_target.outputs.revision }}
      provider: ${{ needs.resolve_target.outputs.provider }}
      mode: ${{ needs.resolve_target.outputs.mode }}
@@ -565,6 +566,7 @@ jobs:
      pull-requests: read
    uses: ./.github/workflows/openclaw-live-and-e2e-checks-reusable.yml
    with:
+      advisory: ${{ startsWith(github.ref, 'refs/heads/tideclaw/alpha/') }}
      ref: ${{ needs.resolve_target.outputs.revision }}
      include_repo_e2e: true
      include_release_path_suites: false
@@ -630,6 +632,7 @@ jobs:
      pull-requests: read
    uses: ./.github/workflows/openclaw-live-and-e2e-checks-reusable.yml
    with:
+      advisory: ${{ startsWith(github.ref, 'refs/heads/tideclaw/alpha/') }}
      ref: ${{ needs.resolve_target.outputs.revision }}
      include_repo_e2e: false
      include_release_path_suites: true
@@ -650,6 +653,7 @@ jobs:
      pull-requests: read
    uses: ./.github/workflows/package-acceptance.yml
    with:
+      advisory: ${{ startsWith(github.ref, 'refs/heads/tideclaw/alpha/') }}
      workflow_ref: ${{ github.ref_name }}
      source: ${{ (needs.resolve_target.outputs.package_acceptance_package_spec != '' || needs.resolve_target.outputs.release_package_spec != '') && 'npm' || 'artifact' }}
      package_spec: ${{ needs.resolve_target.outputs.package_acceptance_package_spec || needs.resolve_target.outputs.release_package_spec || 'openclaw@beta' }}
@@ -660,7 +664,7 @@ jobs:
      published_upgrade_survivor_baselines: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'last-stable-4 2026.4.23 2026.5.2 2026.4.15' || '' }}
      published_upgrade_survivor_scenarios: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'reported-issues' || '' }}
      telegram_mode: mock-openai
-      telegram_scenarios: telegram-help-command,telegram-commands-command,telegram-tools-compact-command,telegram-whoami-command,telegram-status-command,telegram-other-bot-command-gating,telegram-context-command,telegram-mentioned-message-reply,telegram-reply-chain-exact-marker,telegram-stream-final-single-message,telegram-long-final-reuses-preview,telegram-mention-gating
+      telegram_scenarios: telegram-help-command,telegram-commands-command,telegram-tools-compact-command,telegram-whoami-command,telegram-status-command,telegram-other-bot-command-gating,telegram-context-command,telegram-mentioned-message-reply,telegram-long-final-reuses-preview,telegram-mention-gating
    secrets:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      OPENAI_BASE_URL: ${{ secrets.OPENAI_BASE_URL }}
@@ -897,6 +901,7 @@ jobs:
        run: pnpm build

      - name: Run runtime parity lane
+        id: runtime_parity_lane
        run: |
          set -euo pipefail
          pnpm openclaw qa suite \
@@ -908,6 +913,19 @@ jobs:
            --runtime-pair pi,codex \
            --output-dir ".artifacts/qa-e2e/runtime-parity"

+      - name: Run standard runtime parity tier
+        if: ${{ always() && steps.runtime_parity_lane.outcome != 'skipped' && steps.runtime_parity_lane.outcome != 'cancelled' }}
+        run: |
+          set -euo pipefail
+          pnpm openclaw qa suite \
+            --provider-mode mock-openai \
+            --runtime-parity-tier standard \
+            --concurrency "${QA_PARITY_CONCURRENCY}" \
+            --model "${OPENCLAW_CI_OPENAI_MODEL}" \
+            --alt-model "openai/gpt-5.5-alt" \
+            --runtime-pair pi,codex \
+            --output-dir ".artifacts/qa-e2e/runtime-parity-standard"
+
      - name: Generate runtime parity report
        if: always()
        run: |
@@ -918,6 +936,16 @@ jobs:
            --summary .artifacts/qa-e2e/runtime-parity/qa-suite-summary.json \
            --output-dir .artifacts/qa-e2e/runtime-parity-report

+      - name: Generate standard runtime parity report
+        if: always()
+        run: |
+          set -euo pipefail
+          pnpm openclaw qa parity-report \
+            --repo-root . \
+            --runtime-axis \
+            --summary .artifacts/qa-e2e/runtime-parity-standard/qa-suite-summary.json \
+            --output-dir .artifacts/qa-e2e/runtime-parity-standard-report
+
      - name: Upload runtime parity artifacts
        if: always()
        uses: actions/upload-artifact@v4
@@ -927,6 +955,57 @@ jobs:
          retention-days: 14
          if-no-files-found: warn

+  runtime_tool_coverage_release_checks:
+    name: Enforce QA Lab runtime tool coverage
+    needs: [resolve_target, qa_lab_runtime_parity_release_checks]
+    if: always() && contains(fromJSON('["all","qa","qa-parity"]'), needs.resolve_target.outputs.rerun_group)
+    runs-on: ubuntu-24.04
+    timeout-minutes: 15
+    permissions:
+      contents: read
+      actions: read
+    env:
+      OPENCLAW_BUILD_PRIVATE_QA: "1"
+      OPENCLAW_ENABLE_PRIVATE_QA_CLI: "1"
+    steps:
+      - name: Checkout selected ref
+        uses: actions/checkout@v6
+        with:
+          persist-credentials: false
+          ref: ${{ needs.resolve_target.outputs.revision }}
+          fetch-depth: 1
+
+      - name: Setup Node environment
+        uses: ./.github/actions/setup-node-env
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+          pnpm-version: ${{ env.PNPM_VERSION }}
+          install-bun: "true"
+
+      - name: Download runtime parity artifacts
+        uses: actions/download-artifact@v4
+        with:
+          name: release-qa-runtime-parity-${{ needs.resolve_target.outputs.revision }}
+          path: .artifacts/qa-e2e/
+
+      - name: Enforce standard runtime tool coverage
+        run: |
+          set -euo pipefail
+          pnpm openclaw qa coverage \
+            --repo-root . \
+            --tools \
+            --summary .artifacts/qa-e2e/runtime-parity-standard/qa-suite-summary.json \
+            --output .artifacts/qa-e2e/runtime-parity-standard-report/qa-runtime-tool-coverage-report.md
+
+      - name: Upload runtime tool coverage artifacts
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: release-qa-runtime-tool-coverage-${{ needs.resolve_target.outputs.revision }}
+          path: .artifacts/qa-e2e/runtime-parity-standard-report/
+          retention-days: 14
+          if-no-files-found: warn
+
  qa_live_matrix_release_checks:
    name: Run QA Lab live Matrix lane
    needs: [resolve_target]
@@ -1406,6 +1485,7 @@ jobs:
      - qa_lab_parity_lane_release_checks
      - qa_lab_parity_report_release_checks
      - qa_lab_runtime_parity_release_checks
+      - runtime_tool_coverage_release_checks
      - qa_live_matrix_release_checks
      - qa_live_telegram_release_checks
      - qa_live_discord_release_checks
@@ -1418,9 +1498,15 @@ jobs:
    steps:
      - name: Verify release check results
        shell: bash
+        env:
+          WORKFLOW_REF: ${{ github.ref }}
        run: |
          set -euo pipefail
          failed=0
+          tideclaw_alpha=false
+          if [[ "${WORKFLOW_REF}" =~ ^refs/heads/tideclaw/alpha/[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}Z$ ]]; then
+            tideclaw_alpha=true
+          fi
          for item in \
            "prepare_release_package=${{ needs.prepare_release_package.result }}" \
            "install_smoke_release_checks=${{ needs.install_smoke_release_checks.result }}" \
@@ -1431,6 +1517,7 @@ jobs:
            "qa_lab_parity_lane_release_checks=${{ needs.qa_lab_parity_lane_release_checks.result }}" \
            "qa_lab_parity_report_release_checks=${{ needs.qa_lab_parity_report_release_checks.result }}" \
            "qa_lab_runtime_parity_release_checks=${{ needs.qa_lab_runtime_parity_release_checks.result }}" \
+            "runtime_tool_coverage_release_checks=${{ needs.runtime_tool_coverage_release_checks.result }}" \
            "qa_live_matrix_release_checks=${{ needs.qa_live_matrix_release_checks.result }}" \
            "qa_live_telegram_release_checks=${{ needs.qa_live_telegram_release_checks.result }}" \
            "qa_live_discord_release_checks=${{ needs.qa_live_discord_release_checks.result }}" \
@@ -1440,6 +1527,15 @@ jobs:
            name="${item%%=*}"
            result="${item#*=}"
            if [[ "$result" != "success" && "$result" != "skipped" ]]; then
+              if [[ "$tideclaw_alpha" == "true" ]]; then
+                case "$name" in
+                  prepare_release_package|install_smoke_release_checks) ;;
+                  *)
+                    echo "::warning::${name} ended with ${result}; Tideclaw alpha treats non-package-safety release-check lanes as advisory."
+                    continue
+                    ;;
+                esac
+              fi
              if [[ "$name" == qa_* ]]; then
                echo "::warning::${name} ended with ${result}; QA release-check lanes are advisory and do not block release validation."
                continue
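Review note: the gating order in the hunk above can be sketched as a standalone function. This is a minimal sketch, not the workflow's actual code. The helper names `is_tideclaw_alpha` and `classify_lane` are hypothetical, and the final `block` branch assumes that the lines following this hunk (not shown here) report an error and set `failed=1`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds when the ref matches the Tideclaw alpha
# branch pattern used in the "Verify release check results" step.
is_tideclaw_alpha() {
  local ref="$1"
  [[ "$ref" =~ ^refs/heads/tideclaw/alpha/[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}Z$ ]]
}

# Hypothetical helper: prints "ok", "advisory", or "block" for one lane.
# Arguments: workflow ref, lane name, lane result.
classify_lane() {
  local ref="$1" name="$2" result="$3"

  # Successful and skipped lanes never affect the outcome.
  if [[ "$result" == "success" || "$result" == "skipped" ]]; then
    echo "ok"
    return
  fi

  # On Tideclaw alpha refs only the package-safety lanes stay blocking.
  if is_tideclaw_alpha "$ref"; then
    case "$name" in
      prepare_release_package|install_smoke_release_checks) ;;
      *)
        echo "advisory"
        return
        ;;
    esac
  fi

  # QA lanes (qa_* prefix) are advisory on every ref.
  if [[ "$name" == qa_* ]]; then
    echo "advisory"
    return
  fi

  # Assumption: everything else fails the verifier.
  echo "block"
}
```

Under these assumptions, `runtime_tool_coverage_release_checks` does not carry the `qa_` prefix, so a failure in that job would block on ordinary refs and become advisory only on a Tideclaw alpha ref.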
--- a/.github/workflows/package-acceptance.yml
+++ b/.github/workflows/package-acceptance.yml
@@ -93,8 +93,18 @@ on:
        required: false
        default: ""
        type: string
+      advisory:
+        description: Treat acceptance failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
  workflow_call:
    inputs:
+      advisory:
+        description: Treat acceptance failures as advisory for the caller
+        required: false
+        default: false
+        type: boolean
      workflow_ref:
        description: Trusted repo ref for workflow scripts and Docker E2E harness
        required: false
@@ -509,6 +519,7 @@ jobs:
    needs: resolve_package
    uses: ./.github/workflows/openclaw-live-and-e2e-checks-reusable.yml
    with:
+      advisory: ${{ inputs.advisory }}
      ref: ${{ needs.resolve_package.outputs.package_source_sha || inputs.workflow_ref }}
      include_repo_e2e: false
      include_release_path_suites: ${{ needs.resolve_package.outputs.include_release_path_suites == 'true' }}
@@ -573,6 +584,7 @@ jobs:
    if: needs.resolve_package.outputs.telegram_enabled == 'true'
    uses: ./.github/workflows/npm-telegram-beta-e2e.yml
    with:
+      advisory: ${{ inputs.advisory }}
      package_spec: ${{ inputs.package_spec }}
      package_artifact_name: ${{ needs.resolve_package.outputs.package_artifact_name }}
      package_label: openclaw@${{ needs.resolve_package.outputs.package_version }}
@@ -599,6 +611,7 @@ jobs:
        shell: bash
        run: |
          set -euo pipefail
+          advisory="${{ inputs.advisory }}"
          failed=0
          for item in \
            "resolve_package=${RESOLVE_RESULT}" \
@@ -608,6 +621,10 @@ jobs:
            name="${item%%=*}"
            result="${item#*=}"
            if [[ "$result" != "success" && "$result" != "skipped" ]]; then
+              if [[ "$advisory" == "true" && "$name" != "resolve_package" ]]; then
+                echo "::warning::${name} ended with ${result}; package acceptance is advisory for this caller."
+                continue
+              fi
              echo "::error::${name} ended with ${result}"
              failed=1
            fi
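Review note: the `advisory` input added above downgrades every failing job except `resolve_package` to a warning. A minimal sketch of that decision, assuming the boolean input is rendered as the string `true` or `false` by the `${{ inputs.advisory }}` expression. The helper name `acceptance_verdict` and the job name `docker_acceptance` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prints "ok", "warning", or "error" for one job.
# Arguments: advisory flag ("true"/"false"), job name, job result.
acceptance_verdict() {
  local advisory="$1" name="$2" result="$3"
  if [[ "$result" == "success" || "$result" == "skipped" ]]; then
    echo "ok"
  elif [[ "$advisory" == "true" && "$name" != "resolve_package" ]]; then
    # Advisory callers see a warning instead of a failed summary job.
    echo "warning"
  else
    # resolve_package failures always fail, advisory or not.
    echo "error"
  fi
}

acceptance_verdict true resolve_package failure
acceptance_verdict true docker_acceptance failure
acceptance_verdict false docker_acceptance failure
```

Keeping `resolve_package` blocking means an advisory caller still fails when no package could be resolved at all.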
--- a/.github/workflows/qa-live-transports-convex.yml
+++ b/.github/workflows/qa-live-transports-convex.yml
@@ -229,6 +229,96 @@ jobs:
          retention-days: 14
          if-no-files-found: warn

+  run_live_runtime_token_efficiency:
+    name: Run live runtime token-efficiency lane
+    needs: [authorize_actor, validate_selected_ref]
+    if: github.event_name == 'schedule'
+    runs-on: blacksmith-8vcpu-ubuntu-2404
+    timeout-minutes: 45
+    environment: qa-live-shared
+    env:
+      QA_PARITY_CONCURRENCY: "1"
+      OPENCLAW_QA_TRANSPORT_READY_TIMEOUT_MS: "180000"
+      OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
+    steps:
+      - name: Checkout selected ref
+        uses: actions/checkout@v6
+        with:
+          persist-credentials: false
+          ref: ${{ needs.validate_selected_ref.outputs.selected_revision }}
+          fetch-depth: 1
+
+      - name: Setup Node environment
+        uses: ./.github/actions/setup-node-env
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+          pnpm-version: ${{ env.PNPM_VERSION }}
+          install-bun: "true"
+
+      - name: Validate required QA credential env
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          if [[ -z "${OPENAI_API_KEY:-}" ]]; then
+            echo "Missing required OPENAI_API_KEY." >&2
+            exit 1
+          fi
+
+      - name: Build private QA runtime
+        env:
+          NODE_OPTIONS: --max-old-space-size=8192
+        run: pnpm build
+
+      - name: Run live runtime parity lane
+        id: run_lane
+        shell: bash
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+          OPENCLAW_LIVE_OPENAI_KEY: ${{ secrets.OPENAI_API_KEY }}
+        run: |
+          set -euo pipefail
+
+          output_dir=".artifacts/qa-e2e/runtime-token-efficiency-live-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
+          echo "output_dir=${output_dir}" >> "$GITHUB_OUTPUT"
+
+          pnpm openclaw qa suite \
+            --repo-root . \
+            --provider-mode live-frontier \
+            --runtime-parity-tier standard \
+            --runtime-parity-tier live-only \
+            --concurrency "${QA_PARITY_CONCURRENCY}" \
+            --model "${OPENCLAW_CI_OPENAI_MODEL}" \
+            --alt-model "${OPENCLAW_CI_OPENAI_MODEL}" \
+            --runtime-pair pi,codex \
+            --fast \
+            --allow-failures \
+            --output-dir "${output_dir}/runtime-suite"
+
+      - name: Generate live runtime token-efficiency report
+        if: always() && steps.run_lane.outcome != 'skipped' && steps.run_lane.outcome != 'cancelled'
+        shell: bash
+        run: |
+          set -euo pipefail
+
+          pnpm openclaw qa parity-report \
+            --repo-root . \
+            --runtime-axis \
+            --token-efficiency \
+            --summary "${{ steps.run_lane.outputs.output_dir }}/runtime-suite/qa-suite-summary.json" \
+            --output-dir "${{ steps.run_lane.outputs.output_dir }}/runtime-report"
+
+      - name: Upload live runtime token-efficiency artifacts
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: qa-live-runtime-token-efficiency-${{ github.run_id }}-${{ github.run_attempt }}
+          path: ${{ steps.run_lane.outputs.output_dir }}
+          retention-days: 14
+          if-no-files-found: warn
+
  run_live_matrix:
    name: Run Matrix live QA lane
    needs: [authorize_actor, validate_selected_ref]
--- a/.github/workflows/workflow-sanity.yml
+++ b/.github/workflows/workflow-sanity.yml
@@ -2,8 +2,12 @@ name: Workflow Sanity

 on:
  pull_request:
+    paths-ignore:
+      - "CHANGELOG.md"
  push:
    branches: [main]
+    paths-ignore:
+      - "CHANGELOG.md"
  workflow_dispatch:

 permissions:
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,7 @@ Docs: https://docs.openclaw.ai
 - Mac app: redesign Settings pages with consistent card layouts, cached navigation, cleaner permissions/voice/skills/cron/exec/debug panes, and steadier spacing around the native sidebar.
 - Skills: rename the repo-local Codex closeout review skill and helper to `autoreview` while preserving the Codex-first fallback behavior.
 - Skills: add a meme-maker skill for curated template search, local SVG/PNG rendering, Imgflip hosted rendering, and Know Your Meme provenance links.
+- Browser: surface pending and recently handled modal dialogs in snapshots, return `blockedByDialog` when an action opens a modal, and allow `browser dialog --dialog-id` to answer pending dialogs.
 - Agents/tools: shorten built-in tool descriptions and schema hints across media, messaging, sessions, cron, Gateway, web, image/PDF, TTS, nodes, and plan tools while preserving routing guardrails.
 - Skills: add node inspector debugging, fused diagram generation, and throwaway spike workflow skills.
 - CLI/plugins: add `defineToolPlugin` plus `openclaw plugins build`, `validate`, and `init` for typed simple tool plugins with generated manifest metadata, optional tool declarations, and context factories.
@@ -18,30 +19,64 @@ Docs: https://docs.openclaw.ai
 - Plugins/messages: add presentation capability limits for channel renderers, adapt rich message controls before native rendering, and mark legacy `interactive`/Slack directive producer APIs as deprecated.
 - Proxy: support HTTPS managed forward-proxy endpoints and scoped `proxy.tls.caFile` CA trust for proxy endpoint TLS. (#79171) Thanks @jesse-merhi.
 - QA-Lab: add first-hour 20-turn and optional 100-turn runtime parity scenarios, with tier metadata for standard and soak QA gates. Fixes #80338; refs #80337. Thanks @100yenadmin.
+- QA-Lab: add `openclaw qa suite --runtime-parity-tier` and wire the standard Codex-vs-Pi tier into release checks separately from optional/live-only/soak lanes. Fixes #80337. Thanks @100yenadmin.
 - QA-Lab: add a live-only Codex Pi-shaped Read vocabulary canary so runtime parity catches native workspace-read prompt compatibility drift. (#80323) Thanks @100yenadmin.
 - QA-Lab: add live-only harness self-health scenarios for plugin hook crashes, manifest contract errors, and WebChat direct-reply self-message routing. (#80323) Thanks @100yenadmin.
 - QA-Lab: add runtime tool fixture scenarios and coverage reporting for Codex-native workspace tools, OpenClaw dynamic tools, and optional plugin-backed tools. Fixes #80173. Thanks @100yenadmin.
 - QA-Lab: expose runtime tool fixture coverage through `openclaw qa coverage --tools`, with optional suite-summary evaluation for parity gate artifacts. Thanks @100yenadmin.
+- QA-Lab: schedule a live-frontier Codex-vs-Pi runtime token-efficiency artifact lane in the all-lanes QA workflow. Fixes #80175. Thanks @100yenadmin.
+- QA-Lab: hard-gate required OpenClaw dynamic runtime-tool drift in the standard Codex-vs-Pi tier with a blocking release-check verifier and publish the tool coverage report artifact. Fixes #80339; refs #80319. Thanks @100yenadmin.
 - QA-Lab: add the personal-agent approval-denial scenario so the benchmark pack verifies denied local reads stop cleanly without tool progress or fixture leaks. (#83150) Thanks @iFiras-Max1.

 ### Fixes

+- Gateway/skills: preflight remote macOS skill-bin refreshes with a WebSocket connectivity check so stale node sessions skip quickly instead of logging slow `system.which` timeout warnings.
+- GitHub Copilot: drop unsafe native Responses reasoning replay items with non-replayable IDs before dispatch, preventing affected Copilot sessions from failing with `invalid_request_body`. Fixes #83220. Thanks @galiniliev.
+- QA-Lab: make runtime tool coverage fail on missing required tool exercise instead of treating pass/pass parity envelope drift as missing coverage.
+- Core/plugins: harden clawpatch-reported edge cases across gateway auth cleanup, Claude session id paths, plugin activation policy, apply-patch hunk handling, diagnostic redaction, and plugin metadata validation.
+- Mac app: prefer explicit private/Tailscale/LAN Gateway endpoints over SSH tunnels, preserve legacy loopback tunnel configs, persist transport choices, and show captured SSH stderr when tunneling really fails.
+- Gateway/sessions: keep ACP/acpx and runtime child sessions visible in configured-only session lists when their owner or parent session belongs to a configured agent.
+- Mac app: keep app-level menu commands and Dashboard failure states reachable when the remote Gateway is disconnected, and keep the Settings sidebar toggle in the leading titlebar area.
+- Mac app: allow longer Gateway and Context errors to wrap in the menu instead of truncating the useful failure detail.
+- Gateway/webchat: hide internal runtime-context and other `display: false` transcript messages from Chat history and live message events. Fixes #83216. Thanks @EmpireCreator.
+- CLI/help: keep `gateway`, `doctor`, `status`, and `health` help registration out of action/runtime imports so subcommand `--help` stays lightweight in constrained terminals. Fixes #83228. Thanks @dfguerrerom.
+- Cron/Discord: keep explicit announce runs in message-tool-only source-reply mode so scheduled agent turns post once instead of also echoing through automatic visible replies. Fixes #83261. Thanks @Theralley.
+- Telegram: preserve forum-topic origin targets in inbound, audio-preflight, and skipped-message hook contexts so follow-up delivery stays bound to the originating topic. Fixes #83302. Thanks @M00zyx.
+- Telegram: retry HTTP 421 Misdirected Request send failures on a fresh fallback transport so transient edge-node routing errors no longer drop outbound replies. Fixes #48892. (#48908) Thanks @MarsDoge.
+- Telegram: fail topic sends closed when Telegram reports `message thread not found` instead of retrying without `message_thread_id` into the base chat. Refs #83302.
+- Mac app: align the Sessions settings pane with the standard Settings page gutter and row spacing.
+- OpenAI/Codex: stop rejecting available `openai-codex` GPT-5.1, GPT-5.2, and GPT-5.3 model refs during config validation, while keeping removed Spark aliases suppressed. Fixes #83303.
+- Plugins/xAI: complete OAuth-backed xAI login and sidecar auth fixes, including guarded loopback callback CORS handling, video generation polling/defaults, and native-host User-Agent attribution. (#83322) Thanks @Jaaneek.
+- Codex app-server: preserve streamed native command output in mirrored transcripts and trajectory exports when final snapshots omit aggregated output. (#83200) Thanks @rozmiarD.
+- Codex app-server: fail closed when chat or sender policy denies tools, disabling native code, app, environment, and user MCP surfaces for restricted turns. (#82374) Thanks @VACInc.
+- Codex app-server: keep recent context-engine messages when oversized projected history is truncated, so short follow-ups in long channel sessions do not fall back to stale earlier turns. (#83127) Thanks @VACInc.
+- Feishu: return bound subagent delivery origins from session thread setup so Feishu subagent completions route back to the same DM or topic. (#83190) Thanks @100menotu001.
+- CLI/update: tailor post-update Gateway recovery hints by platform, showing systemd, LaunchAgent, Scheduled Task, or generic service-manager guidance instead of macOS-only recovery text. (#83096) Thanks @rubencu.
 - Plugins: apply a default 15-second timeout to legacy `before_agent_start` hooks so hung plugin handlers no longer block agent startup. Fixes #48534. (#83136) Thanks @therahul-yo.
 - Feishu: refresh inbound session delivery context for DM, group, and broadcast turns so later replies do not inherit stale WebChat routing. Fixes #78274.
+- Agents/subagents: require the initial subagent registry save before reporting spawn accepted, returning a spawn error instead of losing an untracked run when the registry write fails. (#83146) Thanks @yetval.
 - QA-Lab/qa-channel: attach redacted agent tool-start traces to outbound `QaBusMessage` records so scenarios can assert actual tool use instead of relying only on reply text. Fixes #67637. Thanks @100yenadmin.
 - QA-Lab: fail live runtime parity reports when assistant-message usage is missing, preventing `0 vs 0` live token rows from being reported as passing proof. Fixes #80411. Thanks @100yenadmin.
+- QA-Lab: add a runtime token-efficiency sidecar report that classifies Codex savings separately from regressions and fails only positive Codex-over-Pi live token deltas above threshold. Fixes #81093. Thanks @100yenadmin.
 - QA-Lab: fail Codex-backed OpenAI live runtime-pair runs before launching isolated workers when no portable Codex auth is available, while staging API-key fallbacks and configured Codex keys for isolated QA agents. Fixes #80412. Thanks @100yenadmin.
 - QA-Lab: refresh parity gates, mock frontier fixtures, model scenarios, and workflow artifact lanes to compare GPT-5.5 against Claude Opus 4.7. Fixes #74262. Thanks @100yenadmin.
 - QA-Lab: make mock parity dispatch provider-aware for source discovery and subagent scenarios so OpenAI and Anthropic lanes no longer share identical canned plans. Fixes #64879. Thanks @100yenadmin.
 - QA-Lab: stop returning Control UI bearer tokens from unauthenticated bootstrap payloads and bind Docker harness ports to loopback-only host addresses. (#66355) Thanks @pgondhi987.
 - Mac app: avoid a SwiftUI metadata crash when rendering the Cron Jobs settings pane.
+- Agents/subagents: preserve run-mode keep subagent registry entries past the session sweep TTL, so kept subagent runs remain visible after cleanup completes. Fixes #83132. (#83168) Thanks @yetval.
 - Agents/OpenAI streams: yield via `setTimeout(0)` instead of `setImmediate` between bursty Responses chunks so abort timers can fire during the yield, keeping cancel-on-timeout responsive on hot streams. Refs #82462.
+- Agents/Codex: keep legacy `oauthRef`-backed OAuth profiles usable while `openclaw doctor --fix` migrates them back to inline credentials, without creating new sidecar credentials. (#83312) Thanks @joshavant.
 - CLI/config: send SecretRef diagnostics to stderr so JSON command stdout remains parseable.
+- CLI/doctor: seed Control UI allowed origins when migrating legacy non-loopback gateway bind host aliases like `0.0.0.0`. Fixes #83286. Thanks @giodl73-repo.
 - CLI/plugins: ship the bundled memory CLI as a package entry so package-installed `openclaw memory` commands register correctly.
 - CLI/update: defer doctor-time plugin package installs during package swaps and seed post-core repair from the updated install registry, preventing duplicate reinstall failures.
 - Feishu: detect SecretRef top-level credentials as a configured default account instead of treating object-backed app secrets as missing.
+- Gateway/restart: keep ordinary unmanaged SIGUSR1/config restarts in-process instead of detach-spawning an orphaned child, preserving custom supervisor PID tracking while leaving update restarts on the fresh-process path. Fixes #65668.
 - CLI/completion: resolve concrete PowerShell profile paths and reload commands during setup and doctor completion installation. Fixes #44296. (#83059) Thanks @yu-xin-c.
+- Telegram: keep isolated long polling below the hard `getUpdates` request guard so idle bot accounts with high `timeoutSeconds` do not false-disconnect and restart-loop. Fixes #83264. Thanks @riccodecarvalho.
 - Providers/Google: preserve and recover Gemini 3 tool-call thought signatures during native replay so function-calling turns no longer fail with missing `thought_signature` 400s. Fixes #72879. (#80358) Thanks @abnershang.
+- Telegram: skip transcript-only delivery mirrors and gateway-injected rows when resolving latest assistant text, preventing retained previews from replacing final replies with stale fragments. Fixes #83159. (#83362) Thanks @joshavant.
+- Memory/QMD: keep lexical search on raw hyphenated queries while normalizing semantic QMD sub-searches, avoiding fallback to the builtin index for dashed identifiers and dates. Fixes #81328.
 - Memory-core: distinguish sqlite-vec load failures from missing semantic vector embeddings in degraded `memory index` warnings, so vector recall diagnostics point at unresolved dimensions instead of blaming sqlite-vec when the store is ready. Fixes #75624. (#83056) Thanks @xuruiray and @Noah3521.
 - Agents/subagents: preserve sandbox-peer controller ownership while routing completion announcements back to the originating run session, keeping subagent control and completion delivery scoped correctly. Fixes #80201. (#80242) Thanks @Jerry-Xin.
 - Gateway: continue restarting remaining channels when one hot-reload channel restart fails, while still reporting aggregate reload failure and rolling back plugin pre-replace stops. Fixes #83054. Thanks @zqchris.
@@ -77,6 +112,7 @@ Docs: https://docs.openclaw.ai
 - Codex: avoid spawning native hook relay subprocesses for post-tool/finalize events with no registered hook handlers while preserving pre-tool safety and approval relays. Fixes #76552. (#78004) Thanks @evgyur.
 - Channel accounts: keep top-level default channel accounts visible when named accounts are added alongside default credential material, so mixed legacy/new account configs keep resolving `default` instead of silently dropping it.
 - Codex/Telegram: synthesize native Codex tool progress from final turn snapshots so Telegram `/verbose` stays visible when command events arrive only at completion.
+- Codex/Telegram: deliver Codex verbose tool summaries in direct message-tool-only turns while suppressing message-send and activity-log noise. (#83186) Thanks @kurplunkin.
 - Mac app: make Channels settings open faster by deferring config-schema work, avoiding startup channel probes, caching decoded channel status rows, and showing only compact quick settings instead of the full generated channel schema.
 - Control UI: include the Control UI and Gateway protocol versions in protocol-mismatch errors so stale app/dashboard pairings identify which side needs rebuilding or restarting.
 - Gateway/protocol: restore Gateway WS protocol v4 and keep `message.action` room-event metadata on the existing `inboundTurnKind` wire field while preserving internal inbound-event classification.
@@ -114,6 +150,8 @@ Docs: https://docs.openclaw.ai
 - Android: prompt before replacing a changed Gateway TLS thumbprint, showing the old and new SHA-256 fingerprints so users can accept expected certificate rotations instead of hard failing on pin mismatch. (#83077) Thanks @sliekens.
 - CLI/status: render extra gateway-like service diagnostics as warning/info output instead of error output. Fixes #46930. (#82922) Thanks @giodl73-repo.
 - Agents/failover: classify Moonshot/Kimi exhausted-balance HTTP 429 payloads as billing instead of generic rate limits, preserving billing guidance and fallback behavior. Fixes #43447. (#83079) Thanks @leno23.
+- Plugin SDK: bundle `openclaw/plugin-sdk/zod` into the published package artifact and verify the packed zod subpath stays self-contained, so pnpm global installs can register plugins without a package-local `zod` symlink. Fixes #78398. (#78515) Thanks @ggzeng.
+- Providers/Google: drop compaction-truncated Gemini thought signatures before replay so malformed Base64 no longer aborts the next assistant turn. (#82995) Thanks @wAngByg.

 ## 2026.5.17

@@ -1414,6 +1452,7 @@ Docs: https://docs.openclaw.ai
 - Agents/compaction: cap summarization output reserve tokens to the selected model's `maxTokens` so 1M-context Anthropic compactions do not request more output than the API permits. Fixes #54383.
 - Control UI/login: replace raw connection failures with structured, actionable login guidance for auth, pairing, insecure HTTP, origin, protocol, and transport failures. Thanks @BunsDev.
 - Agents/tools: fail `exec host=node` before `system.run` when the selected node is known to be disconnected, with an actionable reconnect message instead of a raw node invoke failure. Thanks @BunsDev.
+- Agents/tool-result guard: ignore internal tool-result `details` when estimating model-visible context, so large diagnostic metadata no longer triggers unnecessary truncation or compaction even though the provider boundary already strips `details` before model conversion. (#75525) Thanks @zqchris.
 - Agents/models: accept legacy `anthropic-cli/*` model refs as Claude CLI runtime refs instead of failing model resolution with `Unknown model`. Thanks @BunsDev.
 - Agents/tools: keep restrictive-profile tool-section warnings scoped to the configured sections whose tools are still missing from `alsoAllow`, so already re-allowed filesystem tools do not make exec-only fixes look broader than they are. Thanks @BunsDev.
 - Agents/tools: avoid warning messaging-only agents about inherited global `tools.exec` or `tools.fs` sections when the agent profile did not configure those tool sections itself. Thanks @BunsDev.
--- a/apps/macos/Sources/OpenClaw/AppState.swift
+++ b/apps/macos/Sources/OpenClaw/AppState.swift
@@ -363,9 +363,11 @@ final class AppState {
        }

        let configRoot = OpenClawConfigFile.loadDict()
-        let configRemoteUrl = GatewayRemoteConfig.resolveUrlString(root: configRoot)
        let configRemoteToken = GatewayRemoteConfig.resolveTokenValue(root: configRoot)
-        let configRemoteTransport = GatewayRemoteConfig.resolveTransport(root: configRoot)
+        let configRemoteResolution = GatewayRemoteConfig.resolveTransportResolution(root: configRoot)
+        let configRemoteTransport = configRemoteResolution.transport
+        let configRemoteUrl = configRemoteResolution.directURL?.absoluteString
+            ?? GatewayRemoteConfig.resolveUrlString(root: configRoot)
        let resolvedConnectionMode = ConnectionModeResolver.resolve(root: configRoot).mode
        self.remoteTransport = configRemoteTransport
        self.connectionMode = resolvedConnectionMode
@@ -532,7 +534,10 @@ final class AppState {
            }

        case .ssh:
-            changed = Self.updateGatewayString(&remote, key: "transport", value: nil) || changed
+            changed = Self.updateGatewayString(
+                &remote,
+                key: "transport",
+                value: RemoteTransport.ssh.rawValue) || changed

            let sanitizedTarget = Self.sanitizeSSHTarget(draft.remoteTarget)
            let expectedRemoteHost = CommandResolver.parseSSHTarget(sanitizedTarget)?.host ?? draft.remoteHost
@@ -576,7 +581,8 @@ final class AppState {
        let hasRemoteUrl = !(remoteUrl?
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .isEmpty ?? true)
-        let remoteTransport = GatewayRemoteConfig.resolveTransport(root: root)
+        let remoteResolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+        let remoteTransport = remoteResolution.transport

        let desiredMode: ConnectionMode? = switch modeRaw {
        case "local":
@@ -600,7 +606,7 @@ final class AppState {
        if remoteTransport != self.remoteTransport {
            self.remoteTransport = remoteTransport
        }
-        let remoteUrlText = remoteUrl ?? ""
+        let remoteUrlText = remoteResolution.directURL?.absoluteString ?? remoteUrl ?? ""
        if remoteUrlText != self.remoteUrl {
            self.remoteUrl = remoteUrlText
        }
--- a/apps/macos/Sources/OpenClaw/ContextRootMenuLabelView.swift
+++ b/apps/macos/Sources/OpenClaw/ContextRootMenuLabelView.swift
@@ -23,7 +23,7 @@ struct ContextRootMenuLabelView: View {

                if self.usesStackedLayout {
                    self.subtitleText
-                        .lineLimit(3)
+                        .lineLimit(5)
                        .fixedSize(horizontal: false, vertical: true)
                }
            }
--- a/apps/macos/Sources/OpenClaw/ControlChannel.swift
+++ b/apps/macos/Sources/OpenClaw/ControlChannel.swift
@@ -265,9 +265,10 @@ final class ControlChannel {

    private static func isLikelyLocalNetworkPermissionBlock() -> Bool {
        let root = OpenClawConfigFile.loadDict()
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
        guard ConnectionModeResolver.resolve(root: root).mode == .remote,
-              GatewayRemoteConfig.resolveTransport(root: root) == .direct,
-              let url = GatewayRemoteConfig.resolveGatewayUrl(root: root),
+              resolution.transport == .direct,
+              let url = resolution.directURL,
              url.scheme?.lowercased() == "ws",
              let host = url.host,
              GatewayRemoteConfig.isTrustedPlaintextRemoteHost(host),
--- a/apps/macos/Sources/OpenClaw/DashboardManager.swift
+++ b/apps/macos/Sources/OpenClaw/DashboardManager.swift
@@ -10,6 +10,7 @@ final class DashboardManager {
    static let shared = DashboardManager()

    private var controller: DashboardWindowController?
+    private static let failureURL = URL(string: "about:blank")!

    private init() {}

@@ -69,6 +70,19 @@ final class DashboardManager {
        Task { _ = try? await ControlChannel.shared.health(timeout: 3) }
    }

+    func showFailure(_ error: Error) {
+        let message = (error as NSError).localizedDescription
+        dashboardManagerLogger.error("dashboard setup failed error=\(message, privacy: .public)")
+        let controller = self.controller ?? DashboardWindowController(
+            url: Self.failureURL,
+            auth: DashboardWindowAuth(gatewayUrl: nil, token: nil, password: nil))
+        self.controller = controller
+        controller.showFailure(
+            title: "Dashboard unavailable",
+            message: message,
+            detail: "Check Settings → Connection or use Debug → Reset Remote Tunnel, then try again.")
+    }
+
    func close() {
        self.controller?.closeDashboard()
    }
@@ -101,9 +115,10 @@ final class DashboardManager {

    private func immediateDashboardConfig(mode: AppState.ConnectionMode) -> GatewayConnection.Config? {
        let root = OpenClawConfigFile.loadDict()
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
        if mode == .remote,
-           GatewayRemoteConfig.resolveTransport(root: root) == .direct,
-           let url = GatewayRemoteConfig.resolveGatewayUrl(root: root)
+           resolution.transport == .direct,
+           let url = resolution.directURL
        {
            return (
                url,
--- a/apps/macos/Sources/OpenClaw/DashboardWindowController.swift
+++ b/apps/macos/Sources/OpenClaw/DashboardWindowController.swift
@@ -80,6 +80,17 @@ final class DashboardWindowController: NSWindowController, WKNavigationDelegate,
        self.window?.performClose(nil)
    }

+    func showFailure(title: String, message: String, detail: String? = nil) {
+        self.currentURL = URL(string: "about:blank")!
+        self.auth = DashboardWindowAuth(gatewayUrl: nil, token: nil, password: nil)
+        self.refreshNativeAuthScript(url: self.currentURL, auth: self.auth)
+        self.webView.stopLoading()
+        self.webView.loadHTMLString(
+            Self.failureHTML(title: title, message: message, detail: detail, url: nil),
+            baseURL: nil)
+        self.show()
+    }
+
    private func load(_ url: URL) {
        dashboardWindowLogger.debug("dashboard load \(url.absoluteString, privacy: .public)")
        self.webView.load(URLRequest(url: url))
@@ -282,54 +293,107 @@ final class DashboardWindowController: NSWindowController, WKNavigationDelegate,
        if nsError.domain == NSURLErrorDomain, nsError.code == NSURLErrorCancelled { return }
        dashboardWindowLogger.error(
            "dashboard load failed url=\(self.currentURL.absoluteString, privacy: .public) error=\(error.localizedDescription, privacy: .public)")
-        let html = Self.failureHTML(url: self.currentURL, message: error.localizedDescription)
+        let html = Self.failureHTML(
+            title: "Dashboard unavailable",
+            message: error.localizedDescription,
+            detail: "The dashboard window is open, but the web UI could not load from this endpoint.",
+            url: self.currentURL)
        self.webView.loadHTMLString(html, baseURL: nil)
    }

-    private static func failureHTML(url: URL, message: String) -> String {
-        """
+    private static func failureHTML(title: String, message: String, detail: String?, url: URL?) -> String {
+        let detailHTML = detail.map { "<p class=\"detail\">\(self.htmlEscape($0))</p>" } ?? ""
+        let urlHTML = url.map { "<code>\(self.htmlEscape($0.absoluteString))</code>" } ?? ""
+        return """
        <!doctype html>
        <html>
        <head>
          <meta charset="utf-8">
          <style>
            :root { color-scheme: light dark; }
+            * { box-sizing: border-box; }
            body {
              margin: 0;
              min-height: 100vh;
              display: grid;
              place-items: center;
-              background: Canvas;
-              color: CanvasText;
-              font: -apple-system-body;
+              background: #101114;
+              color: rgba(255,255,255,.92);
+              font: 15px -apple-system, BlinkMacSystemFont, "SF Pro Text", system-ui, sans-serif;
            }
            main {
-              width: min(520px, calc(100vw - 64px));
-              line-height: 1.4;
+              width: min(540px, calc(100vw - 72px));
+              padding: 34px;
+              border: 1px solid rgba(255,255,255,.12);
+              border-radius: 22px;
+              background: rgba(255,255,255,.035);
+              box-shadow: 0 28px 90px rgba(0,0,0,.36);
+              line-height: 1.45;
+            }
+            .badge {
+              width: 44px;
+              height: 44px;
+              display: grid;
+              place-items: center;
+              margin-bottom: 20px;
+              border-radius: 14px;
+              background: rgba(255,255,255,.07);
+              color: #ff746b;
+              font-size: 24px;
            }
            h1 {
-              margin: 0 0 10px;
-              font: -apple-system-title2;
-              font-weight: 650;
+              margin: 0 0 12px;
+              font-size: 24px;
+              line-height: 1.16;
+              font-weight: 700;
+              letter-spacing: 0;
+            }
+            p {
+              margin: 0;
+              color: rgba(255,255,255,.76);
+              font-size: 16px;
+            }
+            .detail {
+              margin-top: 14px;
+              color: rgba(255,255,255,.56);
+              font-size: 13px;
            }
-            p { margin: 8px 0; color: color-mix(in srgb, CanvasText 72%, transparent); }
            code {
              display: block;
-              margin-top: 14px;
+              margin-top: 18px;
              padding: 12px;
-              border-radius: 8px;
-              background: color-mix(in srgb, CanvasText 8%, transparent);
-              color: CanvasText;
+              border: 1px solid rgba(255,255,255,.08);
+              border-radius: 10px;
+              background: rgba(0,0,0,.26);
+              color: rgba(255,255,255,.76);
              overflow-wrap: anywhere;
              font: 12px ui-monospace, SFMono-Regular, Menlo, monospace;
            }
+            @media (prefers-color-scheme: light) {
+              body { background: #f5f6f8; color: rgba(0,0,0,.86); }
+              main {
+                background: rgba(255,255,255,.84);
+                border-color: rgba(0,0,0,.1);
+                box-shadow: 0 28px 90px rgba(0,0,0,.12);
+              }
+              .badge { background: rgba(0,0,0,.06); }
+              p { color: rgba(0,0,0,.68); }
+              .detail { color: rgba(0,0,0,.54); }
+              code {
+                background: rgba(0,0,0,.05);
+                border-color: rgba(0,0,0,.08);
+                color: rgba(0,0,0,.68);
+              }
+            }
          </style>
        </head>
        <body>
          <main>
-            <h1>Dashboard unavailable</h1>
+            <div class="badge">!</div>
+            <h1>\(self.htmlEscape(title))</h1>
            <p>\(self.htmlEscape(message))</p>
-            <code>\(self.htmlEscape(url.absoluteString))</code>
+            \(detailHTML)
+            \(urlHTML)
          </main>
        </body>
        </html>
--- a/apps/macos/Sources/OpenClaw/DeepLinks.swift
+++ b/apps/macos/Sources/OpenClaw/DeepLinks.swift
@@ -184,7 +184,7 @@ final class DeepLinkHandler {
        do {
            try await DashboardManager.shared.show()
        } catch {
-            self.presentAlert(title: "Dashboard unavailable", message: error.localizedDescription)
+            DashboardManager.shared.showFailure(error)
        }
    }

--- a/apps/macos/Sources/OpenClaw/GatewayDiscoveryHelpers.swift
+++ b/apps/macos/Sources/OpenClaw/GatewayDiscoveryHelpers.swift
@@ -41,21 +41,31 @@ enum GatewayDiscoveryHelpers {
    static func directUrl(for gateway: GatewayDiscoveryModel.DiscoveredGateway) -> String? {
        self.directGatewayUrl(
            serviceHost: gateway.serviceHost,
-            servicePort: gateway.servicePort)
+            servicePort: gateway.servicePort,
+            gatewayTls: gateway.gatewayTls)
    }

    static func directGatewayUrl(
        serviceHost: String?,
-        servicePort: Int?) -> String?
+        servicePort: Int?,
+        gatewayTls: Bool = false) -> String?
    {
        // Security: do not route using unauthenticated TXT hints (tailnetDns/lanHost/gatewayPort).
        // Prefer the resolved service endpoint (SRV + A/AAAA).
        guard let endpoint = self.serviceEndpoint(serviceHost: serviceHost, servicePort: servicePort) else {
            return nil
        }
-        // Security: for non-loopback hosts, force TLS to avoid plaintext credential/session leakage.
-        let scheme = self.isLoopbackHost(endpoint.host) ? "ws" : "wss"
-        let portSuffix = endpoint.port == 443 ? "" : ":\(endpoint.port)"
+        let scheme: String
+        if gatewayTls {
+            scheme = "wss"
+        } else if self.isLoopbackHost(endpoint.host)
+            || GatewayRemoteConfig.isTrustedPlaintextRemoteHost(endpoint.host)
+        {
+            scheme = "ws"
+        } else {
+            return nil
+        }
+        let portSuffix = scheme == "wss" && endpoint.port == 443 ? "" : ":\(endpoint.port)"
        return "\(scheme)://\(endpoint.host)\(portSuffix)"
    }

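Review note: the scheme selection in `directGatewayUrl` above can be read as a three-way decision. The sketch below is a standalone approximation, not the app's code: `isLoopback` and the `trustedPlaintextHosts` allowlist are hypothetical stand-ins for `isLoopbackHost` and `GatewayRemoteConfig.isTrustedPlaintextRemoteHost`, whose implementations are not part of this diff.

```swift
import Foundation

// Hypothetical stand-in for the app's loopback check.
func isLoopback(_ host: String) -> Bool {
    ["localhost", "127.0.0.1", "::1"].contains(host.lowercased())
}

// Mirrors the new branch order: advertised TLS wins, plaintext is allowed only
// for loopback or explicitly trusted hosts, and everything else yields no URL.
func directGatewayUrl(
    host: String,
    port: Int,
    gatewayTls: Bool,
    trustedPlaintextHosts: Set<String> = []) -> String?
{
    let scheme: String
    if gatewayTls {
        scheme = "wss"
    } else if isLoopback(host) || trustedPlaintextHosts.contains(host.lowercased()) {
        scheme = "ws"
    } else {
        return nil
    }
    // The default port is elided only for wss on 443.
    let portSuffix = (scheme == "wss" && port == 443) ? "" : ":\(port)"
    return "\(scheme)://\(host)\(portSuffix)"
}
```

Under these assumptions, a TLS gateway on port 443 resolves to `wss://gw.example.com`, a loopback gateway on port 18789 resolves to `ws://127.0.0.1:18789`, and an untrusted non-TLS host resolves to `nil`. Before this change, any non-loopback host was given `wss` unconditionally; with the new branches, a non-loopback host gets `wss` only when `gatewayTls` is set, `ws` when the host is trusted for plaintext, and otherwise no direct URL at all.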
--- a/apps/macos/Sources/OpenClaw/GatewayDiscoverySelectionSupport.swift
+++ b/apps/macos/Sources/OpenClaw/GatewayDiscoverySelectionSupport.swift
@@ -25,14 +25,14 @@ enum GatewayDiscoverySelectionSupport {
        state.remoteTarget = GatewayDiscoveryHelpers.sshTarget(for: gateway) ?? ""

        if preferredTransport == .direct {
-            if let endpoint = GatewayDiscoveryHelpers.serviceEndpoint(for: gateway) {
-                OpenClawConfigFile.setRemoteGatewayUrl(
-                    host: endpoint.host,
-                    port: endpoint.port)
+            OpenClawConfigFile.setRemoteGatewayTransport(AppState.RemoteTransport.direct.rawValue)
+            if !state.remoteUrl.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
+                OpenClawConfigFile.setRemoteGatewayUrlString(state.remoteUrl)
            } else {
                OpenClawConfigFile.clearRemoteGatewayUrl()
            }
        } else {
+            OpenClawConfigFile.setRemoteGatewayTransport(AppState.RemoteTransport.ssh.rawValue)
            OpenClawConfigFile.setRemoteGatewayUrlString(state.remoteUrl)
        }
    }
@@ -65,9 +65,10 @@ enum GatewayDiscoverySelectionSupport {
        for gateway: GatewayDiscoveryModel.DiscoveredGateway) -> Bool
    {
        guard GatewayDiscoveryHelpers.directUrl(for: gateway) != nil else { return false }
-        if gateway.stableID.hasPrefix("tailscale-serve|") {
+        if gateway.gatewayTls || gateway.gatewayDirectReachable {
            return true
        }
+
        guard let host = GatewayDiscoveryHelpers.resolvedServiceHost(for: gateway)?
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .lowercased()
--- a/apps/macos/Sources/OpenClaw/GatewayEndpointStore.swift
+++ b/apps/macos/Sources/OpenClaw/GatewayEndpointStore.swift
@@ -306,8 +306,9 @@ actor GatewayEndpointStore {
                password: password))
        case .remote:
            let root = OpenClawConfigFile.loadDict()
-            if GatewayRemoteConfig.resolveTransport(root: root) == .direct {
-                guard let url = GatewayRemoteConfig.resolveGatewayUrl(root: root) else {
+            let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+            if resolution.transport == .direct {
+                guard let url = resolution.directURL else {
                    self.cancelRemoteEnsure()
                    self.setState(.unavailable(
                        mode: .remote,
@@ -470,8 +471,9 @@ actor GatewayEndpointStore {

    private func resolveDirectRemoteURL() throws -> URL? {
        let root = OpenClawConfigFile.loadDict()
-        guard GatewayRemoteConfig.resolveTransport(root: root) == .direct else { return nil }
-        guard let url = GatewayRemoteConfig.resolveGatewayUrl(root: root) else {
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+        guard resolution.transport == .direct else { return nil }
+        guard let url = resolution.directURL else {
            throw NSError(
                domain: "GatewayEndpoint",
                code: 1,
--- a/apps/macos/Sources/OpenClaw/GatewayRemoteConfig.swift
+++ b/apps/macos/Sources/OpenClaw/GatewayRemoteConfig.swift
@@ -5,6 +5,18 @@ import Darwin
 #endif

 enum GatewayRemoteConfig {
+    enum TransportSource: Equatable {
+        case explicit
+        case inferredRemoteURL
+        case legacySSH
+    }
+
+    struct TransportResolution: Equatable {
+        let transport: AppState.RemoteTransport
+        let source: TransportSource
+        let directURL: URL?
+    }
+
    enum TokenValue: Equatable {
        case missing
        case plaintext(String)
@@ -28,14 +40,49 @@ enum GatewayRemoteConfig {
    }

    static func resolveTransport(root: [String: Any]) -> AppState.RemoteTransport {
+        self.resolveTransportResolution(root: root).transport
+    }
+
+    static func resolveTransportResolution(root: [String: Any]) -> TransportResolution {
+        let explicit = self.resolveExplicitTransport(root: root)
+        switch explicit {
+        case .direct:
+            return TransportResolution(
+                transport: .direct,
+                source: .explicit,
+                directURL: self.resolveGatewayUrl(root: root))
+        case .ssh:
+            return TransportResolution(transport: .ssh, source: .explicit, directURL: nil)
+        case nil:
+            break
+        }
+
+        if let url = self.resolveGatewayUrl(root: root),
+           let host = url.host,
+           !LoopbackHost.isLoopbackHost(host)
+        {
+            return TransportResolution(transport: .direct, source: .inferredRemoteURL, directURL: url)
+        }
+
+        return TransportResolution(transport: .ssh, source: .legacySSH, directURL: nil)
+    }
+
+    private static func resolveExplicitTransport(root: [String: Any]) -> AppState.RemoteTransport? {
        guard let gateway = root["gateway"] as? [String: Any],
              let remote = gateway["remote"] as? [String: Any],
              let raw = remote["transport"] as? String
        else {
-            return .ssh
+            return nil
        }
        let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
-        return trimmed == AppState.RemoteTransport.direct.rawValue ? .direct : .ssh
+        switch trimmed {
+        case AppState.RemoteTransport.direct.rawValue:
+            return .direct
+        case AppState.RemoteTransport.ssh.rawValue:
+            return .ssh
+        default:
+            return .ssh // Unrecognized values keep the previous SSH fallback.
+        }
    }

    static func resolveUrlString(root: [String: Any]) -> String? {
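Review note: `resolveTransportResolution` above applies a precedence of explicit setting, then inferred non-loopback URL, then legacy SSH. A minimal standalone sketch of that ordering follows. It assumes the raw values of `AppState.RemoteTransport` are `"ssh"` and `"direct"`, uses a hypothetical `isLoopback` helper in place of `LoopbackHost.isLoopbackHost`, and parses the URL directly instead of going through `resolveGatewayUrl`, which may apply validation not shown in this diff.

```swift
import Foundation

enum RemoteTransport: String { case ssh, direct } // raw values are an assumption
enum TransportSource { case explicit, inferredRemoteURL, legacySSH }

struct TransportResolution {
    let transport: RemoteTransport
    let source: TransportSource
    let directURL: URL?
}

// Hypothetical stand-in for LoopbackHost.isLoopbackHost.
func isLoopback(_ host: String) -> Bool {
    ["localhost", "127.0.0.1", "::1"].contains(host.lowercased())
}

func resolve(explicit: String?, urlString: String?) -> TransportResolution {
    let url = urlString.flatMap { URL(string: $0) }

    // 1. An explicit transport value always wins; anything other than "direct" maps to SSH.
    if let raw = explicit {
        let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
        if trimmed == RemoteTransport.direct.rawValue {
            return TransportResolution(transport: .direct, source: .explicit, directURL: url)
        }
        return TransportResolution(transport: .ssh, source: .explicit, directURL: nil)
    }

    // 2. With no explicit value, a non-loopback gateway URL implies direct transport.
    if let url = url, let host = url.host, !isLoopback(host) {
        return TransportResolution(transport: .direct, source: .inferredRemoteURL, directURL: url)
    }

    // 3. Otherwise keep the legacy default.
    return TransportResolution(transport: .ssh, source: .legacySSH, directURL: nil)
}
```

One consequence worth checking in review: an unrecognized transport string is reported as an explicit SSH choice, so it also suppresses URL inference.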
--- a/apps/macos/Sources/OpenClaw/MenuBar.swift
+++ b/apps/macos/Sources/OpenClaw/MenuBar.swift
@@ -91,8 +91,11 @@ struct OpenClawApp: App {
        }
    }

-    private func applyStatusItemAppearance(paused: Bool, sleeping: Bool) {
-        self.statusItem?.button?.appearsDisabled = paused || sleeping
+    private func applyStatusItemAppearance(paused _: Bool, sleeping _: Bool) {
+        // Keep the status item actionable even when the Gateway is paused or disconnected.
+        // The SwiftUI label already renders those states; AppKit's disabled appearance can
+        // leak into menu item validation and grey out app-level commands like Settings.
+        self.statusItem?.button?.appearsDisabled = false
    }

    private static func applyAttachOnlyOverrideIfNeeded() {
@@ -180,10 +183,7 @@ struct OpenClawApp: App {
            do {
                try await DashboardManager.shared.show()
            } catch {
-                let alert = NSAlert()
-                alert.messageText = "Dashboard unavailable"
-                alert.informativeText = error.localizedDescription
-                alert.runModal()
+                DashboardManager.shared.showFailure(error)
            }
        }
    }
@@ -302,10 +302,7 @@ final class AppDelegate: NSObject, NSApplicationDelegate {
                do {
                    try await DashboardManager.shared.show()
                } catch {
-                    let alert = NSAlert()
-                    alert.messageText = "Dashboard unavailable"
-                    alert.informativeText = error.localizedDescription
-                    alert.runModal()
+                    DashboardManager.shared.showFailure(error)
                }
            }
        }
--- a/apps/macos/Sources/OpenClaw/MenuHeaderCard.swift
+++ b/apps/macos/Sources/OpenClaw/MenuHeaderCard.swift
@@ -38,7 +38,7 @@ struct MenuHeaderCard<Content: View>: View {
                    .font(.caption)
                    .foregroundStyle(.secondary)
                    .multilineTextAlignment(.leading)
-                    .lineLimit(3)
+                    .lineLimit(5)
                    .truncationMode(.tail)
                    .fixedSize(horizontal: false, vertical: true)
            }
--- a/apps/macos/Sources/OpenClaw/OpenClawConfigFile.swift
+++ b/apps/macos/Sources/OpenClaw/OpenClawConfigFile.swift
@@ -301,6 +301,16 @@ enum OpenClawConfigFile {
        }
    }

+    static func setRemoteGatewayTransport(_ value: String) {
+        let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines)
+        guard !trimmed.isEmpty else { return }
+        self.updateGatewayDict { gateway in
+            var remote = gateway["remote"] as? [String: Any] ?? [:]
+            remote["transport"] = trimmed
+            gateway["remote"] = remote
+        }
+    }
+
    static func clearRemoteGatewayUrl() {
        self.updateGatewayDict { gateway in
            guard var remote = gateway["remote"] as? [String: Any] else { return }
--- a/apps/macos/Sources/OpenClaw/RemotePortTunnel.swift
+++ b/apps/macos/Sources/OpenClaw/RemotePortTunnel.swift
@@ -16,6 +16,32 @@ final class RemotePortTunnel: @unchecked Sendable {
    let localPort: UInt16?
    private let stderrHandle: FileHandle?

+    private final class StderrCapture: @unchecked Sendable {
+        private let lock = NSLock()
+        private var text = ""
+        private let limit = 4096
+
+        func append(_ chunk: String) {
+            let trimmed = chunk.trimmingCharacters(in: .whitespacesAndNewlines)
+            guard !trimmed.isEmpty else { return }
+            self.lock.lock()
+            defer { self.lock.unlock() }
+            if !self.text.isEmpty {
+                self.text += "\n"
+            }
+            self.text += trimmed
+            if self.text.count > self.limit {
+                self.text = String(self.text.suffix(self.limit))
+            }
+        }
+
+        func snapshot() -> String {
+            self.lock.lock()
+            defer { self.lock.unlock() }
+            return self.text.trimmingCharacters(in: .whitespacesAndNewlines)
+        }
+    }
+
    private init(process: Process, localPort: UInt16?, stderrHandle: FileHandle?) {
        self.process = process
        self.localPort = localPort
@@ -93,6 +119,7 @@ final class RemotePortTunnel: @unchecked Sendable {
        let pipe = Pipe()
        process.standardError = pipe
        let stderrHandle = pipe.fileHandleForReading
+        let stderrCapture = StderrCapture()

        // Consume stderr so ssh cannot block if it logs.
        stderrHandle.readabilityHandler = { handle in
@@ -106,6 +133,7 @@ final class RemotePortTunnel: @unchecked Sendable {
                .trimmingCharacters(in: .whitespacesAndNewlines),
                !line.isEmpty
            else { return }
+            stderrCapture.append(line)
            Self.logger.error("ssh tunnel stderr: \(line, privacy: .public)")
        }
        process.terminationHandler = { _ in
@@ -114,7 +142,11 @@ final class RemotePortTunnel: @unchecked Sendable {

        try process.run()

-        try await Self.waitForListener(process: process, localPort: localPort, stderrHandle: stderrHandle)
+        try await Self.waitForListener(
+            process: process,
+            localPort: localPort,
+            stderrHandle: stderrHandle,
+            stderrCapture: stderrCapture)

        // Track tunnel so we can clean up stale listeners on restart.
        Task {
@@ -131,12 +163,13 @@ final class RemotePortTunnel: @unchecked Sendable {
    private static func waitForListener(
        process: Process,
        localPort: UInt16,
-        stderrHandle: FileHandle) async throws
+        stderrHandle: FileHandle,
+        stderrCapture: StderrCapture) async throws
    {
        let deadline = Date().addingTimeInterval(6)
        repeat {
            if !process.isRunning {
-                let stderr = Self.drainStderr(stderrHandle)
+                let stderr = Self.drainStderr(stderrHandle, captured: stderrCapture.snapshot())
                let msg = stderr.isEmpty ? "ssh tunnel exited before listening" : "ssh tunnel failed: \(stderr)"
                throw NSError(domain: "RemotePortTunnel", code: 4, userInfo: [NSLocalizedDescriptionKey: msg])
            }
@@ -152,7 +185,7 @@ final class RemotePortTunnel: @unchecked Sendable {
        } while Date() < deadline

        process.terminate()
-        let stderr = Self.drainStderr(stderrHandle)
+        let stderr = Self.drainStderr(stderrHandle, captured: stderrCapture.snapshot())
        let msg = stderr.isEmpty ? "ssh tunnel did not open local port \(localPort)" : "ssh tunnel failed: \(stderr)"
        throw NSError(domain: "RemotePortTunnel", code: 4, userInfo: [NSLocalizedDescriptionKey: msg])
    }
@@ -311,16 +344,27 @@ final class RemotePortTunnel: @unchecked Sendable {
    }

    private static func drainStderr(_ handle: FileHandle) -> String {
+        self.drainStderr(handle, captured: "")
+    }
+
+    private static func drainStderr(_ handle: FileHandle, captured: String) -> String {
        handle.readabilityHandler = nil
        defer { try? handle.close() }

        do {
            let data = try handle.readToEnd() ?? Data()
-            return String(data: data, encoding: .utf8)?
+            let remaining = String(data: data, encoding: .utf8)?
                .trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
+            if captured.isEmpty {
+                return remaining
+            }
+            if remaining.isEmpty {
+                return captured
+            }
+            return captured + "\n" + remaining
        } catch {
            self.logger.debug("Failed to drain ssh stderr: \(error, privacy: .public)")
-            return ""
+            return captured
        }
    }

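Review note: the `StderrCapture` change appears to address the case where the readability handler has already consumed ssh's stderr by the time `drainStderr` runs, leaving `readToEnd()` with nothing to report. The sketch below isolates the two pieces, a lock-protected buffer that keeps only the newest characters and the merge of captured text with whatever is still unread. `BoundedLog` and `merge` are illustrative names, not identifiers from the app.

```swift
import Foundation

// Thread-safe text buffer that retains at most `limit` trailing characters.
final class BoundedLog: @unchecked Sendable {
    private let lock = NSLock()
    private var text = ""
    private let limit: Int

    init(limit: Int) { self.limit = limit }

    func append(_ chunk: String) {
        let trimmed = chunk.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmed.isEmpty else { return }
        lock.lock()
        defer { lock.unlock() }
        if !text.isEmpty { text += "\n" }
        text += trimmed
        // Drop the oldest characters once the buffer exceeds the limit.
        if text.count > limit { text = String(text.suffix(limit)) }
    }

    func snapshot() -> String {
        lock.lock()
        defer { lock.unlock() }
        return text.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}

// Combines text captured earlier with text drained at exit, skipping empty parts.
func merge(captured: String, remaining: String) -> String {
    if captured.isEmpty { return remaining }
    if remaining.isEmpty { return captured }
    return captured + "\n" + remaining
}
```

Keeping the suffix rather than the prefix means the most recent ssh output survives truncation, which is usually where the final failure reason is printed.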
--- a/apps/macos/Sources/OpenClaw/SessionsSettings.swift
+++ b/apps/macos/Sources/OpenClaw/SessionsSettings.swift
@@ -23,8 +23,9 @@ struct SessionsSettings: View {
            self.content
            Spacer()
        }
-        .frame(maxWidth: .infinity, alignment: .leading)
-        .padding(.horizontal, 12)
+        .frame(maxWidth: .infinity, maxHeight: .infinity, alignment: .topLeading)
+        .padding(.leading, 18)
+        .padding(.trailing, SettingsLayout.scrollbarGutter)
        .task {
            guard !self.hasLoaded else { return }
            guard !self.isPreview else { return }
@@ -34,16 +35,16 @@ struct SessionsSettings: View {
    }

    private var header: some View {
-        HStack(alignment: .top, spacing: 12) {
+        HStack(alignment: .top, spacing: 16) {
            VStack(alignment: .leading, spacing: 4) {
                Text("Sessions")
-                    .font(.headline)
+                    .font(.title3.weight(.semibold))
                Text("Peek at the stored conversation buckets the CLI reuses for context and rate limits.")
-                    .font(.footnote)
+                    .font(.callout)
                    .foregroundStyle(.secondary)
                    .fixedSize(horizontal: false, vertical: true)
            }
-            Spacer()
+            Spacer(minLength: 16)
            SettingsRefreshButton(isLoading: self.loading) {
                Task { await self.refresh() }
            }
@@ -58,21 +59,30 @@ struct SessionsSettings: View {
                    .foregroundStyle(.secondary)
                    .padding(.top, 6)
            } else {
-                List(self.rows) { row in
-                    self.sessionRow(row)
+                ScrollView(.vertical) {
+                    LazyVStack(alignment: .leading, spacing: 0) {
+                        ForEach(Array(self.rows.enumerated()), id: \.element.id) { index, row in
+                            self.sessionRow(row)
+                                .padding(.horizontal, 8)
+                                .padding(.vertical, 8)
+
+                            if index != self.rows.count - 1 {
+                                Divider()
+                                    .padding(.leading, 8)
+                            }
+                        }
+                    }
+                    .frame(maxWidth: .infinity, alignment: .leading)
                }
-                .listStyle(.inset)
                .overlay(alignment: .topLeading) {
                    if let errorMessage {
                        Text(errorMessage)
                            .font(.footnote)
                            .foregroundStyle(.red)
-                            .padding(.leading, 4)
+                            .padding(.leading, 8)
                            .padding(.top, 4)
                    }
                }
-                // The view already applies horizontal padding; keep the list aligned with the text above.
-                .padding(.horizontal, -12)
            }
        }
    }
@@ -136,7 +146,7 @@ struct SessionsSettings: View {
                }
            }
        }
-        .padding(.vertical, 6)
+        .frame(maxWidth: .infinity, alignment: .leading)
    }

    private func label(icon: String, text: String) -> some View {
--- a/apps/macos/Sources/OpenClaw/SettingsRootView.swift
+++ b/apps/macos/Sources/OpenClaw/SettingsRootView.swift
@@ -8,6 +8,7 @@ struct SettingsRootView: View {
    @State private var monitoringPermissions = false
    @State private var selectedTab: SettingsTab = .general
    @State private var cachedTabs: Set<SettingsTab>
+    @State private var columnVisibility: NavigationSplitViewVisibility = .all
    @State private var snapshotPaths: (configPath: String?, stateDir: String?) = (nil, nil)
    let updater: UpdaterProviding?
    private let isPreview = ProcessInfo.processInfo.isPreview
@@ -22,7 +23,7 @@ struct SettingsRootView: View {
    }

    var body: some View {
-        NavigationSplitView {
+        NavigationSplitView(columnVisibility: self.$columnVisibility) {
            List(selection: self.$selectedTab) {
                ForEach(self.visibleGroups) { group in
                    Section(group.title) {
@@ -46,6 +47,7 @@ struct SettingsRootView: View {
            .padding(.horizontal, 22)
            .padding(.vertical, 18)
        }
+        .navigationSplitViewStyle(.balanced)
        .frame(width: SettingsTab.windowWidth, height: SettingsTab.windowHeight, alignment: .topLeading)
        .frame(maxWidth: .infinity, maxHeight: .infinity, alignment: .topLeading)
        .background(SettingsWindowChromeConfigurator())
--- a/apps/macos/Sources/OpenClaw/SettingsSidebarScroll.swift
+++ b/apps/macos/Sources/OpenClaw/SettingsSidebarScroll.swift
@@ -10,5 +10,6 @@ struct SettingsSidebarScroll<Content: View>: View {
                .padding(.horizontal, 10)
        }
        .settingsSidebarCardLayout()
+        .padding(.leading, 16)
    }
 }
--- a/apps/macos/Sources/OpenClawDiscovery/GatewayDiscoveryModel.swift
+++ b/apps/macos/Sources/OpenClawDiscovery/GatewayDiscoveryModel.swift
@@ -30,6 +30,8 @@ public final class GatewayDiscoveryModel {
        public var tailnetDns: String?
        public var sshPort: Int
        public var gatewayPort: Int?
+        public var gatewayTls: Bool
+        public var gatewayDirectReachable: Bool
        public var cliPath: String?
        public var stableID: String
        public var debugID: String
@@ -43,6 +45,8 @@ public final class GatewayDiscoveryModel {
            tailnetDns: String? = nil,
            sshPort: Int,
            gatewayPort: Int? = nil,
+            gatewayTls: Bool = false,
+            gatewayDirectReachable: Bool = false,
            cliPath: String? = nil,
            stableID: String,
            debugID: String,
@@ -55,6 +59,8 @@ public final class GatewayDiscoveryModel {
            self.tailnetDns = tailnetDns
            self.sshPort = sshPort
            self.gatewayPort = gatewayPort
+            self.gatewayTls = gatewayTls
+            self.gatewayDirectReachable = gatewayDirectReachable
            self.cliPath = cliPath
            self.stableID = stableID
            self.debugID = debugID
@@ -184,6 +190,8 @@ public final class GatewayDiscoveryModel {
                tailnetDns: beacon.tailnetDns,
                sshPort: beacon.sshPort ?? 22,
                gatewayPort: beacon.gatewayPort,
+                gatewayTls: beacon.gatewayTls,
+                gatewayDirectReachable: beacon.gatewayDirectReachable,
                cliPath: beacon.cliPath,
                stableID: stableID,
                debugID: "\(beacon.instanceName)@\(beacon.host):\(beacon.port)",
@@ -210,6 +218,8 @@ public final class GatewayDiscoveryModel {
                tailnetDns: beacon.tailnetDns,
                sshPort: 22,
                gatewayPort: beacon.port,
+                gatewayTls: true,
+                gatewayDirectReachable: true,
                cliPath: nil,
                stableID: stableID,
                debugID: "\(beacon.host):\(beacon.port)",
@@ -282,6 +292,8 @@ public final class GatewayDiscoveryModel {
                tailnetDns: parsedTXT.tailnetDns,
                sshPort: parsedTXT.sshPort,
                gatewayPort: parsedTXT.gatewayPort,
+                gatewayTls: parsedTXT.gatewayTls,
+                gatewayDirectReachable: parsedTXT.gatewayDirectReachable,
                cliPath: parsedTXT.cliPath,
                stableID: stableID,
                debugID: GatewayEndpointID.prettyDescription(result.endpoint),
@@ -445,6 +457,8 @@ public final class GatewayDiscoveryModel {
        public var tailnetDns: String?
        public var sshPort: Int
        public var gatewayPort: Int?
+        public var gatewayTls: Bool
+        public var gatewayDirectReachable: Bool
        public var cliPath: String?
    }

@@ -453,6 +467,8 @@ public final class GatewayDiscoveryModel {
        var tailnetDns: String?
        var sshPort = 22
        var gatewayPort: Int?
+        var gatewayTls = false
+        var gatewayDirectReachable = false
        var cliPath: String?

        if let value = txt["lanHost"] {
@@ -475,6 +491,14 @@ public final class GatewayDiscoveryModel {
        {
            gatewayPort = parsed
        }
+        if let value = txt["gatewayTls"] {
+            let normalized = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
+            gatewayTls = normalized == "1" || normalized == "true" || normalized == "yes"
+        }
+        if let value = txt["gatewayDirectReachable"] {
+            let normalized = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
+            gatewayDirectReachable = normalized == "1" || normalized == "true" || normalized == "yes"
+        }
        if let value = txt["cliPath"] {
            let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines)
            cliPath = trimmed.isEmpty ? nil : trimmed
@@ -485,6 +509,8 @@ public final class GatewayDiscoveryModel {
            tailnetDns: tailnetDns,
            sshPort: sshPort,
            gatewayPort: gatewayPort,
+            gatewayTls: gatewayTls,
+            gatewayDirectReachable: gatewayDirectReachable,
            cliPath: cliPath)
    }

--- a/apps/macos/Sources/OpenClawDiscovery/WideAreaGatewayDiscovery.swift
+++ b/apps/macos/Sources/OpenClawDiscovery/WideAreaGatewayDiscovery.swift
@@ -9,6 +9,8 @@ struct WideAreaGatewayBeacon: Equatable {
    var lanHost: String?
    var tailnetDns: String?
    var gatewayPort: Int?
+    var gatewayTls: Bool
+    var gatewayDirectReachable: Bool
    var sshPort: Int?
    var cliPath: String?
 }
@@ -83,6 +85,8 @@ enum WideAreaGatewayDiscovery {
                lanHost: txt["lanHost"],
                tailnetDns: txt["tailnetDns"],
                gatewayPort: parseInt(txt["gatewayPort"]),
+                gatewayTls: parseBool(txt["gatewayTls"]),
+                gatewayDirectReachable: parseBool(txt["gatewayDirectReachable"]),
                sshPort: parseInt(txt["sshPort"]),
                cliPath: txt["cliPath"])
            beacons.append(beacon)
@@ -246,6 +250,12 @@ enum WideAreaGatewayDiscovery {
        return Int(trimmed)
    }

+    private static func parseBool(_ value: String?) -> Bool {
+        guard let value else { return false }
+        let normalized = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
+        return normalized == "1" || normalized == "true" || normalized == "yes"
+    }
+
    private static func isTailnetIPv4(_ value: String) -> Bool {
        let parts = value.split(separator: ".")
        if parts.count != 4 { return false }
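
Review note: the hunks above apply the same TXT-flag normalization in three places (twice inline in `GatewayDiscoveryModel`, once as `parseBool` in `WideAreaGatewayDiscovery`). A minimal standalone sketch of that rule follows, mainly to make the accepted values explicit. The free function name `parseTXTBool` is hypothetical and does not appear in the diff; only the accepted values ("1", "true", "yes", trimmed and case-insensitive) are taken from the added lines.

```swift
import Foundation

/// Hypothetical helper (not part of the diff): mirrors the normalization the
/// added lines use for `gatewayTls` and `gatewayDirectReachable`.
/// A missing key, or any value other than "1", "true", or "yes", is false.
func parseTXTBool(_ value: String?) -> Bool {
    guard let value else { return false }
    let normalized = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
    return ["1", "true", "yes"].contains(normalized)
}

// Sample TXT record, shaped like the dictionaries used in the tests.
let txt = ["gatewayTls": " yes ", "gatewayDirectReachable": "0"]
print(parseTXTBool(txt["gatewayTls"]))             // true
print(parseTXTBool(txt["gatewayDirectReachable"])) // false
print(parseTXTBool(txt["missing"]))                // false
```

Defaulting to false keeps older beacons that do not advertise these keys on the conservative path, which matches the `#expect(!parsed.gatewayTls)` and `#expect(!parsed.gatewayDirectReachable)` assertions for empty TXT records.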
--- a/apps/macos/Sources/OpenClawMacCLI/DiscoverCommand.swift
+++ b/apps/macos/Sources/OpenClawMacCLI/DiscoverCommand.swift
@@ -41,6 +41,8 @@ struct DiscoveryOutput: Encodable {
        var tailnetDns: String?
        var sshPort: Int
        var gatewayPort: Int?
+        var gatewayTls: Bool
+        var gatewayDirectReachable: Bool
        var cliPath: String?
        var stableID: String
        var debugID: String
@@ -106,6 +108,8 @@ func runDiscover(_ args: [String]) async {
                    tailnetDns: $0.tailnetDns,
                    sshPort: $0.sshPort,
                    gatewayPort: $0.gatewayPort,
+                    gatewayTls: $0.gatewayTls,
+                    gatewayDirectReachable: $0.gatewayDirectReachable,
                    cliPath: $0.cliPath,
                    stableID: $0.stableID,
                    debugID: $0.debugID,
@@ -139,6 +143,8 @@ func runDiscover(_ args: [String]) async {
        if let port = gateway.gatewayPort {
            print("  gatewayPort: \(port)")
        }
+        print("  gatewayTls: \(gateway.gatewayTls)")
+        print("  gatewayDirectReachable: \(gateway.gatewayDirectReachable)")
        if let cliPath = gateway.cliPath {
            print("  cliPath: \(cliPath)")
        }
--- a/apps/macos/Tests/OpenClawIPCTests/AppStateRemoteConfigTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/AppStateRemoteConfigTests.swift
@@ -51,7 +51,7 @@ struct AppStateRemoteConfigTests {
                remoteTokenDirty: false))

        #expect(remote["url"] as? String == "ws://127.0.0.1:18789")
-        #expect((remote["transport"] as? String) == nil)
+        #expect(remote["transport"] as? String == "ssh")
        #expect(remote["sshTarget"] as? String == "alice@gateway.example")
    }

@@ -161,6 +161,29 @@ struct AppStateRemoteConfigTests {
        }
    }

+    @Test
+    func `app state init preserves legacy SSH tunnel config until transport is explicit`() async {
+        let configPath = TestIsolation.tempConfigPath()
+        await TestIsolation.withIsolatedState(
+            env: ["OPENCLAW_CONFIG_PATH": configPath],
+            defaults: [remoteTargetKey: nil])
+        {
+            OpenClawConfigFile.saveDict([
+                "gateway": [
+                    "mode": "remote",
+                    "remote": [
+                        "url": "ws://127.0.0.1:18789",
+                        "sshTarget": "steipete@192.168.0.202",
+                    ],
+                ],
+            ])
+
+            let state = AppState(preview: true)
+            #expect(state.remoteTransport == .ssh)
+            #expect(state.remoteUrl == "ws://127.0.0.1:18789")
+        }
+    }
+
    @Test
    func `synced gateway root preserves object token across mode and transport changes when untouched`() {
        let initialRoot: [String: Any] = [
--- a/apps/macos/Tests/OpenClawIPCTests/DashboardWindowSmokeTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/DashboardWindowSmokeTests.swift
@@ -37,4 +37,18 @@ struct DashboardWindowSmokeTests {
        let url = try #require(URL(string: "http://[fd12:3456:789a::1]:18789/control/"))
        #expect(DashboardWindowController.originString(for: url) == "http://[fd12:3456:789a::1]:18789")
    }
+
+    @Test func `dashboard failure state opens in dashboard window`() throws {
+        let url = try #require(URL(string: "http://127.0.0.1:18789/control/"))
+        let controller = DashboardWindowController(
+            url: url,
+            auth: DashboardWindowAuth(gatewayUrl: nil, token: nil, password: nil))
+        controller.showFailure(
+            title: "Dashboard unavailable",
+            message: "Remote control tunnel failed",
+            detail: "Reset the remote tunnel and try again.")
+        #expect(controller.window?.isVisible == true)
+        #expect(controller.window?.styleMask.contains(.closable) == true)
+        controller.closeDashboard()
+    }
 }
--- a/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoveryHelpersTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoveryHelpersTests.swift
@@ -10,7 +10,8 @@ struct GatewayDiscoveryHelpersTests {
        lanHost: String? = "txt-host.local",
        tailnetDns: String? = "txt-host.ts.net",
        sshPort: Int = 22,
-        gatewayPort: Int? = 18789) -> GatewayDiscoveryModel.DiscoveredGateway
+        gatewayPort: Int? = 18789,
+        gatewayTls: Bool = false) -> GatewayDiscoveryModel.DiscoveredGateway
    {
        GatewayDiscoveryModel.DiscoveredGateway(
            displayName: "Gateway",
@@ -20,6 +21,7 @@ struct GatewayDiscoveryHelpersTests {
            tailnetDns: tailnetDns,
            sshPort: sshPort,
            gatewayPort: gatewayPort,
+            gatewayTls: gatewayTls,
            cliPath: "/tmp/openclaw",
            stableID: UUID().uuidString,
            debugID: UUID().uuidString,
@@ -70,13 +72,14 @@ struct GatewayDiscoveryHelpersTests {
    @Test func `direct url uses resolved service endpoint only`() {
        let tlsGateway = self.makeGateway(
            serviceHost: "resolved.example.ts.net",
-            servicePort: 443)
+            servicePort: 443,
+            gatewayTls: true)
        #expect(GatewayDiscoveryHelpers.directUrl(for: tlsGateway) == "wss://resolved.example.ts.net")

        let wsGateway = self.makeGateway(
            serviceHost: "resolved.example.ts.net",
            servicePort: 18789)
-        #expect(GatewayDiscoveryHelpers.directUrl(for: wsGateway) == "wss://resolved.example.ts.net:18789")
+        #expect(GatewayDiscoveryHelpers.directUrl(for: wsGateway) == "ws://resolved.example.ts.net:18789")

        let localGateway = self.makeGateway(
            serviceHost: "127.0.0.1",
@@ -84,6 +87,15 @@ struct GatewayDiscoveryHelpersTests {
        #expect(GatewayDiscoveryHelpers.directUrl(for: localGateway) == "ws://127.0.0.1:18789")
    }

+    @Test func `direct url rejects public plaintext service endpoint`() {
+        let gateway = self.makeGateway(
+            serviceHost: "gateway.example",
+            servicePort: 18789,
+            gatewayTls: false)
+
+        #expect(GatewayDiscoveryHelpers.directUrl(for: gateway) == nil)
+    }
+
    @Test func `direct url rejects txt only fallback`() {
        let gateway = self.makeGateway(
            serviceHost: nil,
--- a/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoveryModelTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoveryModelTests.swift
@@ -87,12 +87,16 @@ struct GatewayDiscoveryModelTests {
            "tailnetDns": "  peters-mac-studio-1.ts.net  ",
            "sshPort": " 2222 ",
            "gatewayPort": " 18799 ",
+            "gatewayTls": " yes ",
+            "gatewayDirectReachable": " true ",
            "cliPath": " /opt/openclaw ",
        ])
        #expect(parsed.lanHost == "studio.local")
        #expect(parsed.tailnetDns == "peters-mac-studio-1.ts.net")
        #expect(parsed.sshPort == 2222)
        #expect(parsed.gatewayPort == 18799)
+        #expect(parsed.gatewayTls)
+        #expect(parsed.gatewayDirectReachable)
        #expect(parsed.cliPath == "/opt/openclaw")
    }

@@ -107,6 +111,8 @@ struct GatewayDiscoveryModelTests {
        #expect(parsed.tailnetDns == nil)
        #expect(parsed.sshPort == 22)
        #expect(parsed.gatewayPort == nil)
+        #expect(!parsed.gatewayTls)
+        #expect(!parsed.gatewayDirectReachable)
        #expect(parsed.cliPath == nil)
    }

--- a/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoverySelectionSupportTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/GatewayDiscoverySelectionSupportTests.swift
@@ -11,6 +11,8 @@ struct GatewayDiscoverySelectionSupportTests {
        servicePort: Int?,
        tailnetDns: String? = nil,
        sshPort: Int = 22,
+        gatewayTls: Bool = false,
+        gatewayDirectReachable: Bool = false,
        stableID: String) -> GatewayDiscoveryModel.DiscoveredGateway
    {
        GatewayDiscoveryModel.DiscoveredGateway(
@@ -21,6 +23,8 @@ struct GatewayDiscoverySelectionSupportTests {
            tailnetDns: tailnetDns,
            sshPort: sshPort,
            gatewayPort: servicePort,
+            gatewayTls: gatewayTls,
+            gatewayDirectReachable: gatewayDirectReachable,
            cliPath: nil,
            stableID: stableID,
            debugID: UUID().uuidString,
@@ -40,6 +44,7 @@ struct GatewayDiscoverySelectionSupportTests {
                    serviceHost: tailnetHost,
                    servicePort: 443,
                    tailnetDns: tailnetHost,
+                    gatewayTls: true,
                    stableID: "tailscale-serve|\(tailnetHost)"),
                state: state)

@@ -61,6 +66,7 @@ struct GatewayDiscoverySelectionSupportTests {
                    serviceHost: tailnetHost,
                    servicePort: 443,
                    tailnetDns: tailnetHost,
+                    gatewayTls: true,
                    stableID: "wide-area|openclaw.internal.|gateway-host"),
                state: state)

@@ -69,12 +75,33 @@ struct GatewayDiscoverySelectionSupportTests {
        }
    }

-    @Test func `selecting nearby lan gateway keeps ssh transport`() async {
+    @Test func `legacy tailnet discovery without reachability flags still switches to direct transport`() async {
+        let tailnetHost = "gateway-host.tailnet-example.ts.net"
+        let configPath = TestIsolation.tempConfigPath()
+        await TestIsolation.withEnvValues(["OPENCLAW_CONFIG_PATH": configPath]) {
+            let state = AppState(preview: true)
+            state.remoteTransport = .ssh
+
+            GatewayDiscoverySelectionSupport.applyRemoteSelection(
+                gateway: self.makeGateway(
+                    serviceHost: tailnetHost,
+                    servicePort: 18789,
+                    tailnetDns: tailnetHost,
+                    stableID: "wide-area|openclaw.internal.|gateway-host"),
+                state: state)
+
+            #expect(state.remoteTransport == .direct)
+            #expect(state.remoteUrl == "ws://\(tailnetHost):18789")
+        }
+    }
+
+    @Test func `selecting nearby lan gateway keeps ssh without direct reachability signal`() async {
        let configPath = TestIsolation.tempConfigPath()
        await TestIsolation.withEnvValues(["OPENCLAW_CONFIG_PATH": configPath]) {
            let state = AppState(preview: true)
            state.remoteTransport = .ssh
            state.remoteTarget = "user@old-host"
+            state.remoteUrl = "ws://localhost:29876"

            GatewayDiscoverySelectionSupport.applyRemoteSelection(
                gateway: self.makeGateway(
@@ -84,16 +111,17 @@ struct GatewayDiscoverySelectionSupportTests {
                state: state)

            #expect(state.remoteTransport == .ssh)
-            #expect(state.remoteUrl == "ws://127.0.0.1:18789")
+            #expect(state.remoteUrl == "ws://127.0.0.1:29876")
            #expect(CommandResolver.parseSSHTarget(state.remoteTarget)?.host == "nearby-gateway.local")

            let configRoot = OpenClawConfigFile.loadDict()
            let remote = ((configRoot["gateway"] as? [String: Any])?["remote"] as? [String: Any]) ?? [:]
-            #expect(remote["url"] as? String == "ws://127.0.0.1:18789")
+            #expect(remote["transport"] as? String == "ssh")
+            #expect(remote["url"] as? String == "ws://127.0.0.1:29876")
        }
    }

-    @Test func `selecting nearby lan gateway preserves existing ssh tunnel port`() async {
+    @Test func `selecting direct reachable lan gateway ignores stale local tunnel port`() async {
        let configPath = TestIsolation.tempConfigPath()
        await TestIsolation.withEnvValues(["OPENCLAW_CONFIG_PATH": configPath]) {
            let state = AppState(preview: true)
@@ -104,15 +132,17 @@ struct GatewayDiscoverySelectionSupportTests {
                gateway: self.makeGateway(
                    serviceHost: "nearby-gateway.local",
                    servicePort: 19999,
+                    gatewayDirectReachable: true,
                    stableID: "bonjour|nearby-gateway-custom"),
                state: state)

-            #expect(state.remoteTransport == .ssh)
-            #expect(state.remoteUrl == "ws://127.0.0.1:29876")
+            #expect(state.remoteTransport == .direct)
+            #expect(state.remoteUrl == "ws://nearby-gateway.local:19999")

            let configRoot = OpenClawConfigFile.loadDict()
            let remote = ((configRoot["gateway"] as? [String: Any])?["remote"] as? [String: Any]) ?? [:]
-            #expect(remote["url"] as? String == "ws://127.0.0.1:29876")
+            #expect(remote["transport"] as? String == "direct")
+            #expect(remote["url"] as? String == "ws://nearby-gateway.local:19999")
        }
    }
 }
--- a/apps/macos/Tests/OpenClawIPCTests/GatewayEndpointStoreTests.swift
+++ b/apps/macos/Tests/OpenClawIPCTests/GatewayEndpointStoreTests.swift
@@ -315,6 +315,54 @@ struct GatewayEndpointStoreTests {
        #expect(url?.absoluteString == "ws://100.123.224.76:18789")
    }

+    @Test func `missing transport infers direct from private remote URL`() {
+        let root: [String: Any] = [
+            "gateway": [
+                "remote": [
+                    "url": "ws://192.168.0.202:18789",
+                ],
+            ],
+        ]
+
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+        #expect(resolution.transport == .direct)
+        #expect(resolution.source == .inferredRemoteURL)
+        #expect(resolution.directURL?.absoluteString == "ws://192.168.0.202:18789")
+    }
+
+    @Test func `legacy loopback URL keeps SSH even with trusted SSH target`() {
+        let root: [String: Any] = [
+            "gateway": [
+                "remote": [
+                    "url": "ws://127.0.0.1:18789",
+                    "sshTarget": "steipete@192.168.0.202",
+                ],
+            ],
+        ]
+
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+        #expect(resolution.transport == .ssh)
+        #expect(resolution.source == .legacySSH)
+        #expect(resolution.directURL == nil)
+    }
+
+    @Test func `explicit ssh keeps legacy tunnel even when target is direct capable`() {
+        let root: [String: Any] = [
+            "gateway": [
+                "remote": [
+                    "transport": "ssh",
+                    "url": "ws://127.0.0.1:18789",
+                    "sshTarget": "steipete@192.168.0.202",
+                ],
+            ],
+        ]
+
+        let resolution = GatewayRemoteConfig.resolveTransportResolution(root: root)
+        #expect(resolution.transport == .ssh)
+        #expect(resolution.source == .explicit)
+        #expect(resolution.directURL == nil)
+    }
+
    @Test func `normalize gateway url rejects public host ws`() {
        let url = GatewayRemoteConfig.normalizeGatewayUrl("ws://gateway.example:18789")
        #expect(url == nil)
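
Review note: the three new `GatewayEndpointStoreTests` cases pin down a precedence order for transport resolution. The sketch below is one way to read that order, not the actual `GatewayRemoteConfig.resolveTransportResolution` implementation, which is not shown in this diff. The type names `Transport`, `Source`, `Resolution` and the functions `isLoopback` and `resolveTransport` are all hypothetical. The sketch also skips the public-host `ws://` rejection that `normalizeGatewayUrl` is tested for.

```swift
import Foundation

// Hypothetical stand-ins for the types the tests reference.
enum Transport { case ssh, direct }
enum Source { case explicit, inferredRemoteURL, legacySSH }

struct Resolution {
    let transport: Transport
    let source: Source
    let directURL: URL?
}

/// Assumed loopback check; the real rule may be stricter or broader.
func isLoopback(_ host: String) -> Bool {
    host == "localhost" || host == "::1" || host.hasPrefix("127.")
}

/// Precedence as read from the tests:
/// 1. An explicit `transport` key always wins.
/// 2. Otherwise a non-loopback remote URL implies direct.
/// 3. Otherwise fall back to the legacy SSH tunnel.
func resolveTransport(remote: [String: Any]) -> Resolution {
    let url = (remote["url"] as? String).flatMap { URL(string: $0) }
    if let raw = remote["transport"] as? String {
        let transport: Transport = raw == "direct" ? .direct : .ssh
        return Resolution(
            transport: transport,
            source: .explicit,
            directURL: transport == .direct ? url : nil)
    }
    if let url, let host = url.host, !isLoopback(host) {
        return Resolution(transport: .direct, source: .inferredRemoteURL, directURL: url)
    }
    return Resolution(transport: .ssh, source: .legacySSH, directURL: nil)
}

let inferred = resolveTransport(remote: ["url": "ws://192.168.0.202:18789"])
print(inferred.transport, inferred.source) // direct inferredRemoteURL

let legacy = resolveTransport(remote: [
    "url": "ws://127.0.0.1:18789",
    "sshTarget": "steipete@192.168.0.202",
])
print(legacy.transport, legacy.source) // ssh legacySSH
```

Note that `sshTarget` plays no role in this reading: a loopback URL keeps SSH whether or not the SSH target itself would be directly reachable, which is what the "legacy loopback URL keeps SSH even with trusted SSH target" case asserts.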
--- a/docs/.generated/plugin-sdk-api-baseline.sha256
+++ b/docs/.generated/plugin-sdk-api-baseline.sha256
@@ -1,2 +1,2 @@
-d979b8c2721eeb83380a38853309e9ba0f2c28e040a9ad2ee1e7b2ab10c547db  plugin-sdk-api-baseline.json
-4815f711fe2481483159137cfb97ce3d1c173e0b50a364a2353f49888f4d53df  plugin-sdk-api-baseline.jsonl
+048d8ff5e4455d16f75f6762a916f67c982e1211fb7085456647234255567466  plugin-sdk-api-baseline.json
+2d46a9660c9143f823a47df3c7ecfd315a4999e96af5eddb4ba4e71d9bb377a6  plugin-sdk-api-baseline.jsonl
--- a/docs/automation/tasks.md
+++ b/docs/automation/tasks.md
@@ -102,7 +102,7 @@ Not every agent run creates a task. Heartbeat turns and normal interactive chat
  <Accordion title="Notify defaults for cron and media">
    Main-session cron tasks use `silent` notify policy by default - they create records for tracking but do not generate notifications. Isolated cron tasks also default to `silent` but are more visible because they run in their own session.

-    Session-backed `image_generate`, `music_generate`, and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Group/channel completions follow the normal visible-reply policy, so the agent uses the message tool when source delivery requires it. If the completion agent fails to produce message-tool delivery evidence in a tool-only route, OpenClaw sends the completion fallback directly to the original channel instead of leaving the media private.
+    Session-backed `image_generate`, `music_generate`, and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Generated-media completion events require message-tool delivery: the agent must send the finished media with the `message` tool, then reply `NO_REPLY`. If the completion agent only writes a private final reply or omits the media attachment, OpenClaw marks the completion handoff as failed; it does not auto-post the generated media as a fallback.

  </Accordion>
  <Accordion title="Concurrent media-generation guardrail">
--- a/docs/channels/discord.md
+++ b/docs/channels/discord.md
@@ -1631,7 +1631,7 @@ openclaw logs --follow
          // Molty listens to all bot-authored Discord messages.
          allowBots: true,
          mentionAliases: {
-            // Lets Molty write "@Mantis" and send a real Discord mention.
+            // Lets Molty mention Mantis on Discord using the configured user ID.
            Mantis: "MANTIS_DISCORD_USER_ID",
          },
          botLoopProtection: {
--- a/docs/ci.md
+++ b/docs/ci.md
@@ -12,39 +12,39 @@ OpenClaw CI runs on every push to `main` and every pull request. The `preflight`

 ## Pipeline overview

-| Job                              | Purpose                                                                                                   | When it runs                       |
-| -------------------------------- | --------------------------------------------------------------------------------------------------------- | ---------------------------------- |
-| `preflight`                      | Detect docs-only changes, changed scopes, changed extensions, and build the CI manifest                   | Always on non-draft pushes and PRs |
-| `security-scm-fast`              | Private key detection and workflow audit via `zizmor`                                                     | Always on non-draft pushes and PRs |
-| `security-dependency-audit`      | Dependency-free production lockfile audit against npm advisories                                          | Always on non-draft pushes and PRs |
-| `security-fast`                  | Required aggregate for the fast security jobs                                                             | Always on non-draft pushes and PRs |
-| `check-dependencies`             | Production Knip dependency-only pass plus the unused-file allowlist guard                                 | Node-relevant changes              |
-| `build-artifacts`                | Build `dist/`, Control UI, built-artifact checks, and reusable downstream artifacts                       | Node-relevant changes              |
-| `checks-fast-core`               | Fast Linux correctness lanes such as bundled/plugin-contract/protocol checks                              | Node-relevant changes              |
-| `checks-fast-contracts-channels` | Sharded channel contract checks with a stable aggregate check result                                      | Node-relevant changes              |
-| `checks-node-core-test`          | Core Node test shards, excluding channel, bundled, contract, and extension lanes                          | Node-relevant changes              |
-| `check`                          | Sharded main local gate equivalent: prod types, lint, guards, test types, and strict smoke                | Node-relevant changes              |
-| `check-additional`               | Architecture, sharded boundary/prompt drift, extension guards, package boundary, and gateway watch        | Node-relevant changes              |
-| `build-smoke`                    | Built-CLI smoke tests and startup-memory smoke                                                            | Node-relevant changes              |
-| `checks`                         | Verifier for built-artifact channel tests                                                                 | Node-relevant changes              |
-| `checks-node-compat-node22`      | Node 22 compatibility build and smoke lane                                                                | Manual CI dispatch for releases    |
-| `check-docs`                     | Docs formatting, lint, and broken-link checks                                                             | Docs changed                       |
-| `skills-python`                  | Ruff + pytest for Python-backed skills                                                                    | Python-skill-relevant changes      |
-| `checks-windows`                 | Windows-specific process/path tests plus shared runtime import specifier regressions                      | Windows-relevant changes           |
-| `macos-node`                     | macOS TypeScript test lane using the shared built artifacts                                               | macOS-relevant changes             |
-| `macos-swift`                    | Swift lint, build, and tests for the macOS app                                                            | macOS-relevant changes             |
-| `android`                        | Android unit tests for both flavors plus one debug APK build                                              | Android-relevant changes           |
-| `test-performance-agent`         | Daily Codex slow-test optimization after trusted activity                                                 | Main CI success or manual dispatch |
-| `openclaw-performance`           | Daily/on-demand Kova runtime performance reports with mock-provider, deep-profile, and GPT 5.5 live lanes | Scheduled and manual dispatch      |
+| Job                                | Purpose                                                                                                   | When it runs                       |
+| ---------------------------------- | --------------------------------------------------------------------------------------------------------- | ---------------------------------- |
+| `preflight`                        | Detect docs-only changes, changed scopes, changed extensions, and build the CI manifest                   | Always on non-draft pushes and PRs |
+| `security-scm-fast`                | Private key detection and workflow audit via `zizmor`                                                     | Always on non-draft pushes and PRs |
+| `security-dependency-audit`        | Dependency-free production lockfile audit against npm advisories                                          | Always on non-draft pushes and PRs |
+| `security-fast`                    | Required aggregate for the fast security jobs                                                             | Always on non-draft pushes and PRs |
+| `check-dependencies`               | Production Knip dependency-only pass plus the unused-file allowlist guard                                 | Node-relevant changes              |
+| `build-artifacts`                  | Build `dist/`, Control UI, built-CLI smoke checks, embedded built-artifact checks, and reusable artifacts | Node-relevant changes              |
+| `checks-fast-core`                 | Fast Linux correctness lanes such as bundled and CI-routing checks                                        | Node-relevant changes              |
+| `checks-fast-protocol`             | Gateway protocol compatibility check                                                                      | Node-relevant changes              |
+| `checks-fast-contracts-plugins-*`  | Two sharded plugin contract checks                                                                        | Node-relevant changes              |
+| `checks-fast-contracts-channels-*` | Two sharded channel contract checks                                                                       | Node-relevant changes              |
+| `checks-node-core-*`               | Core Node test shards, excluding channel, bundled, contract, and extension lanes                          | Node-relevant changes              |
+| `check-*`                          | Sharded main local gate equivalent: prod types, lint, guards, test types, and strict smoke                | Node-relevant changes              |
+| `check-additional-*`               | Architecture, sharded boundary/prompt drift, extension guards, package boundary, and runtime topology     | Node-relevant changes              |
+| `checks-node-compat-node22`        | Node 22 compatibility build and smoke lane                                                                | Manual CI dispatch for releases    |
+| `check-docs`                       | Docs formatting, lint, and broken-link checks                                                             | Docs changed                       |
+| `skills-python`                    | Ruff + pytest for Python-backed skills                                                                    | Python-skill-relevant changes      |
+| `checks-windows`                   | Windows-specific process/path tests plus shared runtime import specifier regressions                      | Windows-relevant changes           |
+| `macos-node`                       | macOS TypeScript test lane using the shared built artifacts                                               | macOS-relevant changes             |
+| `macos-swift`                      | Swift lint, build, and tests for the macOS app                                                            | macOS-relevant changes             |
+| `android`                          | Android unit tests for both flavors plus one debug APK build                                              | Android-relevant changes           |
+| `test-performance-agent`           | Daily Codex slow-test optimization after trusted activity                                                 | Main CI success or manual dispatch |
+| `openclaw-performance`             | Daily/on-demand Kova runtime performance reports with mock-provider, deep-profile, and GPT 5.5 live lanes | Scheduled and manual dispatch      |

 ## Fail-fast order

 1. `preflight` decides which lanes exist at all. The `docs-scope` and `changed-scope` logic are steps inside this job, not standalone jobs.
-2. `security-scm-fast`, `security-dependency-audit`, `security-fast`, `check`, `check-additional`, `check-docs`, and `skills-python` fail quickly without waiting on the heavier artifact and platform matrix jobs.
+2. `security-scm-fast`, `security-dependency-audit`, `security-fast`, `check-*`, `check-additional-*`, `check-docs`, and `skills-python` fail quickly without waiting on the heavier artifact and platform matrix jobs.
 3. `build-artifacts` overlaps with the fast Linux lanes so downstream consumers can start as soon as the shared build is ready.
-4. Heavier platform and runtime lanes fan out after that: `checks-fast-core`, `checks-fast-contracts-channels`, `checks-node-core-test`, `checks`, `checks-windows`, `macos-node`, `macos-swift`, and `android`.
+4. Heavier platform and runtime lanes fan out after that: `checks-fast-core`, `checks-fast-contracts-plugins-*`, `checks-fast-contracts-channels-*`, `checks-node-core-*`, `checks-windows`, `macos-node`, `macos-swift`, and `android`.

-GitHub may mark superseded jobs as `cancelled` when a newer push lands on the same PR or `main` ref. Treat that as CI noise unless the newest run for the same ref is also failing. Aggregate shard checks use `!cancelled() && always()` so they still report normal shard failures but do not queue after the whole workflow has already been superseded. The automatic CI concurrency key is versioned (`CI-v7-*`) so a GitHub-side zombie in an old queue group cannot indefinitely block newer main runs. Manual full-suite runs use `CI-manual-v1-*` and do not cancel in-progress runs.
+GitHub may mark superseded jobs as `cancelled` when a newer push lands on the same PR or `main` ref. Treat that as CI noise unless the newest run for the same ref is also failing. Matrix jobs use `fail-fast: false`, and `build-artifacts` reports embedded channel, core-support-boundary, and gateway-watch failures directly instead of queuing tiny verifier jobs. The automatic CI concurrency key is versioned (`CI-v7-*`) so a GitHub-side zombie in an old queue group cannot indefinitely block newer main runs. Manual full-suite runs use `CI-manual-v1-*` and do not cancel in-progress runs.

 The `ci-timings-summary` job uploads a compact `ci-timings-summary` artifact for each non-draft CI run. It records wall time, queue time, slowest jobs, and failed jobs for the current run, so CI health checks do not need to scrape the full Actions payload repeatedly.

@@ -56,7 +56,7 @@ Scope logic lives in `scripts/ci-changed-scope.mjs` and is covered by unit tests
 - **CI routing-only edits, selected cheap core-test fixture edits, and narrow plugin contract helper/test-routing edits** use a fast Node-only manifest path: `preflight`, security, and a single `checks-fast-core` task. That path skips build artifacts, Node 22 compatibility, channel contracts, full core shards, bundled-plugin shards, and additional guard matrices when the change is limited to the routing or helper surfaces the fast task exercises directly.
 - **Windows Node checks** are scoped to Windows-specific process/path wrappers, npm/pnpm/UI runner helpers, package manager config, and the CI workflow surfaces that execute that lane; unrelated source, plugin, install-smoke, and test-only changes stay on the Linux Node lanes.

-The slowest Node test families are split or balanced so each job stays small without over-reserving runners: channel contracts run as three weighted Blacksmith-backed shards with the standard GitHub runner fallback, core unit fast/support lanes run separately, core runtime infra is split between state, process/config, cron, and shared shards, auto-reply runs as balanced workers (with the reply subtree split into agent-runner, dispatch, and commands/state-routing shards), and agentic gateway/server configs are split across chat/auth/model/http-plugin/runtime/startup lanes instead of waiting on built artifacts. Broad browser, QA, media, and miscellaneous plugin tests use their dedicated Vitest configs instead of the shared plugin catch-all. Include-pattern shards record timing entries using the CI shard name, so `.artifacts/vitest-shard-timings.json` can distinguish a whole config from a filtered shard. `check-additional` keeps package-boundary compile/canary work together and separates runtime topology architecture from gateway watch coverage; the boundary guard list is striped across four matrix shards, each running selected independent guards concurrently and printing per-check timings. The expensive Codex happy-path prompt snapshot drift check runs as its own additional job for manual CI and for prompt-affecting changes only, so normal unrelated Node changes do not wait behind cold prompt snapshot generation and the boundary shards stay balanced while prompt drift is still pinned to the PR that caused it; the same flag skips prompt snapshot Vitest generation inside the built-artifact core support-boundary shard. Gateway watch, channel tests, and the core support-boundary shard run concurrently inside `build-artifacts` after `dist/` and `dist-runtime/` are already built.
+The slowest Node test families are split or balanced so each job stays small without over-reserving runners: plugin contracts and channel contracts each run as two weighted Blacksmith-backed shards with the standard GitHub runner fallback, core unit fast/support lanes run separately, core runtime infra is split between state, process/config, cron, and shared shards, auto-reply runs as balanced workers (with the reply subtree split into agent-runner, dispatch, and commands/state-routing shards), and agentic gateway/server configs are split across chat/auth/model/http-plugin/runtime/startup lanes instead of waiting on built artifacts. Broad browser, QA, media, and miscellaneous plugin tests use their dedicated Vitest configs instead of the shared plugin catch-all. Include-pattern shards record timing entries using the CI shard name, so `.artifacts/vitest-shard-timings.json` can distinguish a whole config from a filtered shard. `check-additional-*` keeps package-boundary compile/canary work together and separates runtime topology architecture from gateway watch coverage; the boundary guard list is striped across four matrix shards, each running selected independent guards concurrently and printing per-check timings. The expensive Codex happy-path prompt snapshot drift check runs as its own additional job for manual CI and for prompt-affecting changes only, so normal unrelated Node changes do not wait behind cold prompt snapshot generation and the boundary shards stay balanced while prompt drift is still pinned to the PR that caused it; the same flag skips prompt snapshot Vitest generation inside the built-artifact core support-boundary shard. Gateway watch, channel tests, and the core support-boundary shard run concurrently inside `build-artifacts` after `dist/` and `dist-runtime/` are already built.

 Android CI runs both `testPlayDebugUnitTest` and `testThirdPartyDebugUnitTest` and then builds the Play debug APK. The third-party flavor has no separate source set or manifest; its unit-test lane still compiles the flavor with the SMS/call-log BuildConfig flags, while avoiding a duplicate debug APK packaging job on every Android-relevant push.

@@ -81,7 +81,7 @@ Treat GitHub titles, comments, bodies, review text, branch names, and commit mes

 ## Manual dispatches

-Manual CI dispatches run the same job graph as normal CI but force every non-Android scoped lane on: Linux Node shards, bundled-plugin shards, channel contracts, Node 22 compatibility, `check`, `check-additional`, build smoke, docs checks, Python skills, Windows, macOS, and Control UI i18n. Standalone manual CI dispatches run Android only with `include_android=true`; the full release umbrella enables Android by passing `include_android=true`. Plugin prerelease static checks, the release-only `agentic-plugins` shard, the full extension batch sweep, and plugin prerelease Docker lanes are excluded from CI. The Docker prerelease suite runs only when `Full Release Validation` dispatches the separate `Plugin Prerelease` workflow with the release-validation gate enabled.
+Manual CI dispatches run the same job graph as normal CI but force every non-Android scoped lane on: Linux Node shards, bundled-plugin shards, plugin and channel contract shards, Node 22 compatibility, `check-*`, `check-additional-*`, built-artifact smoke checks, docs checks, Python skills, Windows, macOS, and Control UI i18n. Standalone manual CI dispatches run Android only with `include_android=true`; the full release umbrella enables Android by passing `include_android=true`. Plugin prerelease static checks, the release-only `agentic-plugins` shard, the full extension batch sweep, and plugin prerelease Docker lanes are excluded from CI. The Docker prerelease suite runs only when `Full Release Validation` dispatches the separate `Plugin Prerelease` workflow with the release-validation gate enabled.

 Manual runs use a unique concurrency group so a release-candidate full suite is not cancelled by another push or PR run on the same ref. The optional `target_ref` input lets a trusted caller run that graph against a branch, tag, or full commit SHA while using the workflow file from the selected dispatch ref.

@@ -93,15 +93,15 @@ gh workflow run full-release-validation.yml --ref main -f ref=<branch-or-sha>

 ## Runners

-| Runner                           | Jobs                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
-| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `ubuntu-24.04`                   | `preflight`, fast security jobs and aggregates (`security-scm-fast`, `security-dependency-audit`, `security-fast`), fast protocol/contract/bundled checks, sharded channel contract checks, `check` shards except lint, `check-additional` aggregates, Node test aggregate verifiers, docs checks, Python skills, workflow-sanity, labeler, auto-response; install-smoke preflight also uses GitHub-hosted Ubuntu so the Blacksmith matrix can queue earlier |
-| `blacksmith-4vcpu-ubuntu-2404`   | `CodeQL Critical Quality`, lower-weight extension shards, `checks-fast-core`, `checks-node-compat-node22`, `check-prod-types`, and `check-test-types`                                                                                                                                                                                                                                                                                                        |
-| `blacksmith-8vcpu-ubuntu-2404`   | build-smoke, Linux Node test shards, bundled plugin test shards, `check-additional` shards, `android`                                                                                                                                                                                                                                                                                                                                                        |
-| `blacksmith-16vcpu-ubuntu-2404`  | `build-artifacts`, `check-lint` (CPU-sensitive enough that 8 vCPU cost more than they saved); install-smoke Docker builds (32-vCPU queue time cost more than it saved)                                                                                                                                                                                                                                                                                       |
-| `blacksmith-16vcpu-windows-2025` | `checks-windows`                                                                                                                                                                                                                                                                                                                                                                                                                                             |
-| `blacksmith-6vcpu-macos-latest`  | `macos-node` on `openclaw/openclaw`; forks fall back to `macos-latest`                                                                                                                                                                                                                                                                                                                                                                                       |
-| `blacksmith-12vcpu-macos-latest` | `macos-swift` on `openclaw/openclaw`; forks fall back to `macos-latest`                                                                                                                                                                                                                                                                                                                                                                                      |
+| Runner                           | Jobs                                                                                                                                                                                                                                                                                                                              |
+| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `ubuntu-24.04`                   | `preflight`, fast security jobs and aggregates (`security-scm-fast`, `security-dependency-audit`, `security-fast`), fast protocol/contract/bundled checks, docs checks, Python skills, workflow-sanity, labeler, auto-response; install-smoke preflight also uses GitHub-hosted Ubuntu so the Blacksmith matrix can queue earlier |
+| `blacksmith-4vcpu-ubuntu-2404`   | `CodeQL Critical Quality`, lower-weight extension shards, `checks-fast-core`, `checks-fast-protocol`, plugin/channel contract shards, `checks-node-compat-node22`, `check-prod-types`, and `check-test-types`                                                                                                                     |
+| `blacksmith-8vcpu-ubuntu-2404`   | Linux Node test shards, bundled plugin test shards, `check-additional-*` shards, `android`                                                                                                                                                                                                                                        |
+| `blacksmith-16vcpu-ubuntu-2404`  | `build-artifacts`, `check-lint` (CPU-sensitive enough that 8 vCPU cost more than they saved); install-smoke Docker builds (32-vCPU queue time cost more than it saved)                                                                                                                                                            |
+| `blacksmith-16vcpu-windows-2025` | `checks-windows`                                                                                                                                                                                                                                                                                                                  |
+| `blacksmith-6vcpu-macos-latest`  | `macos-node` on `openclaw/openclaw`; forks fall back to `macos-latest`                                                                                                                                                                                                                                                            |
+| `blacksmith-12vcpu-macos-latest` | `macos-swift` on `openclaw/openclaw`; forks fall back to `macos-latest`                                                                                                                                                                                                                                                           |

 Canonical-repo CI keeps Blacksmith as the default runner path. During `preflight`, `scripts/ci-runner-labels.mjs` checks recent queued and in-progress Actions runs for queued Blacksmith jobs. If a specific Blacksmith label already has queued jobs, downstream jobs that would use that exact label fall back to the matching GitHub-hosted runner (`ubuntu-24.04`, `windows-2025`, or `macos-latest`) for that run only. Other Blacksmith sizes in the same OS family stay on their primary labels. If the API probe fails, no fallback is applied.
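
The per-label fallback rule described above can be sketched as follows. This is a minimal illustration, not the contents of `scripts/ci-runner-labels.mjs`: the function names, the `queuedLabels` input shape, and the substring-based OS detection are assumptions made for the example.

```typescript
// Hypothetical sketch of the fallback rule; names and shapes are illustrative only.
type OsFamily = "ubuntu" | "windows" | "macos";

// GitHub-hosted labels named in the paragraph above.
const GITHUB_HOSTED: Record<OsFamily, string> = {
  ubuntu: "ubuntu-24.04",
  windows: "windows-2025",
  macos: "macos-latest",
};

// Assumption: the OS family can be read off the Blacksmith label text.
function osFamily(label: string): OsFamily | undefined {
  if (label.includes("ubuntu")) return "ubuntu";
  if (label.includes("windows")) return "windows";
  if (label.includes("macos")) return "macos";
  return undefined;
}

// queuedLabels: Blacksmith labels that already have queued jobs,
// or null when the API probe failed.
function resolveRunner(primary: string, queuedLabels: Set<string> | null): string {
  if (queuedLabels === null) return primary; // probe failed: no fallback is applied
  if (!queuedLabels.has(primary)) return primary; // only the exact queued label falls back
  const family = osFamily(primary);
  return family ? GITHUB_HOSTED[family] : primary;
}
```

Under these assumptions, a queued `blacksmith-8vcpu-ubuntu-2404` label moves only its own jobs to `ubuntu-24.04`, while `blacksmith-16vcpu-ubuntu-2404` jobs stay on their primary label.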

@@ -121,7 +121,7 @@ pnpm test:changed                             # cheap smart changed Vitest targe
 pnpm test:channels
 pnpm test:contracts:channels
 pnpm check:docs                               # docs format + lint + broken links
-pnpm build                                    # build dist when CI artifact/build-smoke lanes matter
+pnpm build                                    # build dist when CI artifact/smoke checks matter
 pnpm ci:timings                               # summarize the latest origin/main push CI run
 pnpm ci:timings:recent                        # compare recent successful main CI runs
 node scripts/ci-run-timings.mjs <run-id>      # summarize wall time, queue time, and slowest jobs
@@ -203,7 +203,7 @@ Docker release-path soak; `full` forces soak on.

 The umbrella records the dispatched child run ids, and the final `Verify full validation` job re-checks current child run conclusions and appends slowest-job tables for each child run. If a child workflow is rerun and turns green, rerun only the parent verifier job to refresh the umbrella result and timing summary.

-For recovery, both `Full Release Validation` and `OpenClaw Release Checks` accept `rerun_group`. Use `all` for a release candidate, `ci` for only the normal full CI child, `plugin-prerelease` for only the plugin prerelease child, `release-checks` for every release child, or a narrower group: `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, `qa-live`, or `npm-telegram` on the umbrella. This keeps a failed release box rerun bounded after a focused fix. For one failed cross-OS lane, combine `rerun_group=cross-os` with `cross_os_suite_filter`, for example `windows/packaged-upgrade`; long cross-OS commands emit heartbeat lines and packaged-upgrade summaries include per-phase timings. QA release-check lanes are advisory, so QA-only failures warn but do not block the release-check verifier.
+For recovery, both `Full Release Validation` and `OpenClaw Release Checks` accept `rerun_group`. Use `all` for a release candidate, `ci` for only the normal full CI child, `plugin-prerelease` for only the plugin prerelease child, `release-checks` for every release child, or a narrower group: `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, `qa-live`, or `npm-telegram` on the umbrella. This keeps a failed release box rerun bounded after a focused fix. For one failed cross-OS lane, combine `rerun_group=cross-os` with `cross_os_suite_filter`, for example `windows/packaged-upgrade`; long cross-OS commands emit heartbeat lines and packaged-upgrade summaries include per-phase timings. QA release-check lanes are advisory except the standard runtime tool coverage gate, which blocks when required OpenClaw dynamic tools drift or disappear from the standard tier summary.

 `OpenClaw Release Checks` uses the trusted workflow ref to resolve the selected ref once into a `release-package-under-test` tarball, then passes that artifact to cross-OS checks and Package Acceptance, plus the live/E2E release-path Docker workflow when soak coverage runs. That keeps the package bytes consistent across release boxes and avoids repacking the same candidate in multiple child jobs.

--- a/docs/cli/browser.md
+++ b/docs/cli/browser.md
@@ -204,12 +204,17 @@ openclaw browser upload /tmp/openclaw/uploads/file.pdf --ref <ref>
 openclaw browser waitfordownload
 openclaw browser download <ref> report.pdf
 openclaw browser dialog --accept
+openclaw browser dialog --dismiss --dialog-id d1
 ```

 Managed Chrome profiles save ordinary click-triggered downloads into the OpenClaw
 downloads directory (`/tmp/openclaw/downloads` by default, or the configured temp
 root). Use `waitfordownload` or `download` when the agent needs to wait for a
 specific file and return its path; those explicit waiters own the next download.
+When an action opens a modal dialog, the action response returns
+`blockedByDialog` with `browserState.dialogs.pending`; pass `--dialog-id` to
+answer it directly. Dialogs handled outside OpenClaw appear under
+`browserState.dialogs.recent`.

 ## State and storage

--- a/docs/cli/daemon.md
+++ b/docs/cli/daemon.md
@@ -35,7 +35,7 @@ openclaw daemon uninstall
 ## Common options

 - `status`: `--url`, `--token`, `--password`, `--timeout`, `--no-probe`, `--require-rpc`, `--deep`, `--json`
-- `install`: `--port`, `--runtime <node|bun>`, `--runtime-path <path>`, `--token`, `--force`, `--json`
+- `install`: `--port`, `--runtime <node|bun>`, `--token`, `--force`, `--json`
 - `restart`: `--safe`, `--skip-deferral`, `--force`, `--wait <duration>`, `--json`
 - lifecycle (`uninstall|start|stop`): `--json`

@@ -52,7 +52,6 @@ Notes:
 - When token auth requires a token and `gateway.auth.token` is SecretRef-managed, `install` validates that the SecretRef is resolvable but does not persist the resolved token into service environment metadata.
 - If token auth requires a token and the configured token SecretRef is unresolved, install fails closed.
 - If both `gateway.auth.token` and `gateway.auth.password` are configured and `gateway.auth.mode` is unset, install is blocked until mode is set explicitly.
-- `install --runtime-path <path>` pins the managed service to an absolute Node or Bun executable for the selected `--runtime` and persists `OPENCLAW_DAEMON_RUNTIME_PATH` for later forced reinstalls, updates, and doctor repairs.
 - On macOS, `install` keeps LaunchAgent plists owner-only and loads managed service environment values through an owner-only file and wrapper instead of serializing API keys or auth-profile env refs into `EnvironmentVariables`.
 - If you intentionally run multiple gateways on one host, isolate ports, config/state, and workspaces; see [/gateway#multiple-gateways-same-host](/gateway#multiple-gateways-same-host).
 - `restart --safe` asks the running Gateway to preflight active work and schedule one coalesced restart after active work drains. Plain `restart` keeps the existing service-manager behavior; `--force` remains the immediate override path.
--- a/docs/cli/doctor.md
+++ b/docs/cli/doctor.md
@@ -15,13 +15,34 @@ Related:
 - Troubleshooting: [Troubleshooting](/gateway/troubleshooting)
 - Security audit: [Security](/gateway/security)

+## Why Use It
+
+`openclaw doctor` is the OpenClaw health surface. Use it when the gateway,
+channels, plugins, skills, model routing, local state, or config migrations are
+not behaving as expected and you want one command that can explain what is
+wrong.
+
+Doctor has three postures:
+
+| Posture | Command                  | Behavior                                                                        |
+| ------- | ------------------------ | ------------------------------------------------------------------------------- |
+| Inspect | `openclaw doctor`        | Human-oriented checks and guided prompts.                                       |
+| Repair  | `openclaw doctor --fix`  | Applies supported repairs, using prompts unless non-interactive repair is safe. |
+| Lint    | `openclaw doctor --lint` | Read-only structured findings for CI, preflight, and review gates.              |
+
+Prefer `--lint` when automation needs a stable result. Prefer `--fix` when a
+human operator intentionally wants doctor to edit config or state.
+
 ## Examples

 ```bash
 openclaw doctor
-openclaw doctor --repair
+openclaw doctor --lint
+openclaw doctor --lint --json
+openclaw doctor --lint --severity-min warning
 openclaw doctor --deep
-openclaw doctor --repair --non-interactive
+openclaw doctor --fix
+openclaw doctor --fix --non-interactive
 openclaw doctor --generate-gateway-token
 ```

@@ -44,13 +65,134 @@ The targeted Discord capabilities probe reports the bot's effective channel perm
 - `--non-interactive`: run without prompts; safe migrations and non-service repairs only
 - `--generate-gateway-token`: generate and configure a gateway token
 - `--deep`: scan system services for extra gateway installs and report recent Gateway supervisor restart handoffs
+- `--lint`: run modernized health checks in read-only mode and emit diagnostic findings
+- `--json`: with `--lint`, emit JSON findings instead of human output
+- `--severity-min <level>`: with `--lint`, drop findings below `info`, `warning`, or `error`
+- `--skip <id>`: with `--lint`, skip a check id; repeat to skip more than one
+- `--only <id>`: with `--lint`, run only a check id; repeat to run a small selected set
+
+## Lint mode
+
+`openclaw doctor --lint` is the read-only automation posture for doctor checks.
+It uses the structured health-check path, does not prompt, and does not repair
+or rewrite config/state. Use it in CI, preflight scripts, and review workflows
+when you want machine-readable findings instead of guided repair prompts.
+Lint-output options such as `--json`, `--severity-min`, `--only`, and `--skip`
+are only accepted with `--lint`.
+
+```bash
+openclaw doctor --lint
+openclaw doctor --lint --severity-min warning
+openclaw doctor --lint --json
+openclaw doctor --lint --only core/doctor/gateway-config --json
+```
+
+Human output is compact:
+
+```text
+doctor --lint: ran 6 check(s), 1 finding(s)
+  [warning] core/doctor/gateway-config gateway.mode - gateway.mode is unset; gateway start will be blocked.
+    fix: Run `openclaw configure` and set Gateway mode (local/remote), or `openclaw config set gateway.mode local`.
+```
+
+JSON output is the scripting surface for lint runs:
+
+```json
+{
+  "ok": false,
+  "checksRun": 5,
+  "checksSkipped": 0,
+  "findings": [
+    {
+      "checkId": "core/doctor/gateway-config",
+      "severity": "warning",
+      "message": "gateway.mode is unset; gateway start will be blocked.",
+      "path": "gateway.mode",
+      "fixHint": "Run `openclaw configure` and set Gateway mode (local/remote), or `openclaw config set gateway.mode local`."
+    }
+  ]
+}
+```
+
+Exit behavior:
+
+- `0`: no findings at or above the selected severity threshold
+- `1`: at least one finding meets the selected threshold
+- `2`: command/runtime failure before lint findings can be produced
+
+`--severity-min` controls both visible findings and the exit threshold. For
+example, `openclaw doctor --lint --severity-min error` can print no findings and
+exit `0` even when lower-severity `info` or `warning` findings exist.
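
The interaction between `--severity-min`, visible findings, and the exit code can be sketched like this. It is a minimal model of the behavior described above, not doctor's implementation: the `HealthFinding` fields follow the JSON example, while `SEVERITY_RANK`, `applySeverityMin`, and `lintExitCode` are hypothetical names. Exit code `2` (command/runtime failure) is not modeled.

```typescript
// Hypothetical sketch; helper names are illustrative, not part of the OpenClaw API.
type Severity = "info" | "warning" | "error";

interface HealthFinding {
  checkId: string;
  severity: Severity;
  message: string;
  path?: string;
  fixHint?: string;
}

// Assumed ordering: info < warning < error.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, error: 2 };

// Findings below the threshold are dropped from the visible output.
function applySeverityMin(findings: HealthFinding[], min: Severity): HealthFinding[] {
  return findings.filter((f) => SEVERITY_RANK[f.severity] >= SEVERITY_RANK[min]);
}

// The same threshold decides between exit code 0 and 1.
function lintExitCode(findings: HealthFinding[], min: Severity): 0 | 1 {
  return applySeverityMin(findings, min).length > 0 ? 1 : 0;
}
```

With the single `warning` finding from the JSON example, this sketch yields exit `1` at a `warning` threshold and exit `0` with no visible findings at an `error` threshold.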
+
+## Structured Health Checks
+
+Modern doctor checks use a small structured contract:
+
+```ts
+detect(ctx, scope?) -> HealthFinding[]
+repair?(ctx, findings) -> HealthRepairResult
+```
+
+`detect()` powers `doctor --lint`. `repair()` is optional and is only considered
+by `doctor --fix` / `doctor --repair`. Checks that have not migrated to this
+shape continue to use the legacy doctor contribution flow.
+
+The split is intentional: `detect()` owns diagnosis, while `repair()` owns
+reporting what it changed or would change. Repair contexts can carry
+`dryRun`/`diff` requests, and repair results can return structured `diffs` for
+config/file edits plus `effects` for service, process, package, state, or other
+side effects. That lets converted checks grow toward `doctor --fix --dry-run`
+and diff reporting without moving mutation planning into `detect()`.
+
+`repair()` reports whether it attempted the requested repair with `status:
+"repaired" | "skipped" | "failed"`. Omitted status means `repaired`, so simple
+repair checks only need to return changes. When repair returns `skipped` or
+`failed`, doctor reports the reason and does not run validation for that check.
+
+After a successful structured repair, doctor re-runs `detect()` with the
+repaired findings as scope. Checks can use selected findings, paths, or `ocPath`
+values for focused validation. If the finding is still present, doctor reports a
+repair warning instead of treating the change as silently complete.
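
The repair-then-validate flow described above can be sketched as follows. This is an illustration under stated assumptions, not doctor's code: the synchronous signatures, the `reason` field, the `RepairOutcome` type, passing the findings array directly as the `detect()` scope, and the `repairAndValidate` helper are all hypothetical.

```typescript
// Hypothetical sketch of the structured repair flow; shapes are simplified assumptions.
type Severity = "info" | "warning" | "error";

interface HealthFinding {
  checkId: string;
  severity: Severity;
  message: string;
  path?: string;
}

interface HealthRepairResult {
  status?: "repaired" | "skipped" | "failed"; // omitted status means "repaired"
  reason?: string; // assumed field carrying the skip/failure reason
}

interface HealthCheck<Ctx> {
  detect(ctx: Ctx, scope?: HealthFinding[]): HealthFinding[];
  repair?(ctx: Ctx, findings: HealthFinding[]): HealthRepairResult;
}

type RepairOutcome =
  | { kind: "no-repair" } // check has no repair(); legacy flow would apply
  | { kind: "not-applied"; status: "skipped" | "failed"; reason: string | undefined }
  | { kind: "verified" }
  | { kind: "still-present"; remaining: HealthFinding[] };

function repairAndValidate<Ctx>(
  check: HealthCheck<Ctx>,
  ctx: Ctx,
  findings: HealthFinding[],
): RepairOutcome {
  if (!check.repair) return { kind: "no-repair" };
  const result = check.repair(ctx, findings);
  const status = result.status ?? "repaired";
  // Skipped or failed repairs are reported and not validated.
  if (status !== "repaired") return { kind: "not-applied", status, reason: result.reason };
  // Re-run detect() scoped to the repaired findings.
  const remaining = check.detect(ctx, findings);
  return remaining.length > 0 ? { kind: "still-present", remaining } : { kind: "verified" };
}
```

In this sketch, a `still-present` outcome corresponds to the repair warning, and `not-applied` corresponds to reporting the reason without running validation.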
+
+A finding includes:
+
+| Field             | Purpose                                                |
+| ----------------- | ------------------------------------------------------ |
+| `checkId`         | Stable id for skip/only filters and CI allowlists.     |
+| `severity`        | `info`, `warning`, or `error`.                         |
+| `message`         | Human-readable problem statement.                      |
+| `path`            | Config, file, or logical path when available.          |
+| `line` / `column` | Source location when available.                        |
+| `ocPath`          | Precise `oc://` address when a check can point to one. |
+| `fixHint`         | Suggested operator action or repair summary.           |
+
+This release registers the modernized core doctor checks on the structured
+health path. The `openclaw/plugin-sdk/health` subpath exposes the same
+contract for bundled follow-up consumers, but plugin-backed checks only run
+after their owning package registers them in the active command path.
+
+## Check Selection
+
+Use `--only` and `--skip` when a workflow wants a focused gate:
+
+```bash
+openclaw doctor --lint --only core/doctor/gateway-config --json
+openclaw doctor --lint --skip core/doctor/skills-readiness
+```
+
+`--only` and `--skip` accept full check ids and may be repeated. If an `--only`
+id is not registered, no check runs for that id; use the command's `checksRun`
+and `checksSkipped` fields to verify a focused gate is selecting the checks you
+expect.

 Notes:

 - In Nix mode (`OPENCLAW_NIX_MODE=1`), read-only doctor checks still work, but `doctor --fix`, `doctor --repair`, `doctor --yes`, and `doctor --generate-gateway-token` are disabled because `openclaw.json` is immutable. Edit the Nix source for this install instead; for nix-openclaw, use the agent-first [Quick Start](https://github.com/openclaw/nix-openclaw#quick-start).
 - Interactive prompts (like keychain/OAuth fixes) only run when stdin is a TTY and `--non-interactive` is **not** set. Headless runs (cron, Telegram, no terminal) will skip prompts.
-- Performance: non-interactive `doctor` runs skip eager plugin loading so headless health checks stay fast. Interactive sessions still fully load plugins when a check needs their contribution.
+- Performance: non-interactive `doctor` runs skip eager plugin loading so headless health checks stay fast. Interactive doctor sessions still load the plugin surfaces needed by the legacy health and repair flow.
+- `--lint` is stricter than `--non-interactive`: it is always read-only, never prompts, and never applies safe migrations. Run `doctor --fix` or `doctor --repair` when you want doctor to make changes.
 - `--fix` (alias for `--repair`) writes a backup to `~/.openclaw/openclaw.json.bak` and drops unknown config keys, listing each removal.
+- Modernized health checks can expose a `repair()` path for `doctor --fix`; checks that do not expose one continue through the existing doctor repair flow.
 - `doctor --fix --non-interactive` reports missing or stale gateway service definitions but does not install or rewrite them outside update repair mode. Run `openclaw gateway install` for a missing service, or `openclaw gateway install --force` when you intentionally want to replace the launcher.
 - State integrity checks now detect orphan transcript files in the sessions directory. Archiving them as `.deleted.<timestamp>` requires an interactive confirmation; `--fix`, `--yes`, and headless runs leave them in place.
 - Doctor also scans `~/.openclaw/cron/jobs.json` (or `cron.store`) for legacy cron job shapes and can rewrite them in place before the scheduler has to auto-normalize them at runtime.
--- a/docs/cli/gateway.md
+++ b/docs/cli/gateway.md
@@ -448,22 +448,6 @@ openclaw gateway restart
 openclaw gateway uninstall
 ```

-### Install with a pinned runtime path
-
-Use `--runtime-path` when the managed service must run with a specific Node or Bun executable.
-This is useful for launchd/systemd/schtasks services because they do not load your interactive
-shell startup files.
-
-```bash
-openclaw gateway install --runtime node --runtime-path "$(mise which node)" --force
-openclaw gateway restart
-```
-
-`--runtime` selects the runtime family (`node` or `bun`). `--runtime-path` must point to an
-absolute executable path for that runtime. The installer validates the executable, writes it into
-the service command, and persists `OPENCLAW_DAEMON_RUNTIME_PATH` so forced reinstalls, updates, and
-doctor repairs keep the same operator-selected runtime path.
-
 ### Install with a wrapper

 Use `--wrapper` when the managed service must start through another executable, for example a
@@ -502,7 +486,7 @@ openclaw gateway restart
 <AccordionGroup>
  <Accordion title="Command options">
    - `gateway status`: `--url`, `--token`, `--password`, `--timeout`, `--no-probe`, `--require-rpc`, `--deep`, `--json`
-    - `gateway install`: `--port`, `--runtime <node|bun>`, `--runtime-path <path>`, `--token`, `--wrapper <path>`, `--force`, `--json`
+    - `gateway install`: `--port`, `--runtime <node|bun>`, `--token`, `--wrapper <path>`, `--force`, `--json`
    - `gateway restart`: `--safe`, `--skip-deferral`, `--force`, `--wait <duration>`, `--json`
    - `gateway uninstall|start`: `--json`
    - `gateway stop`: `--disable`, `--json`
--- a/docs/cli/node.md
+++ b/docs/cli/node.md
@@ -100,18 +100,8 @@ Options:
 - `--node-id <id>`: Override node id (clears pairing token)
 - `--display-name <name>`: Override the node display name
 - `--runtime <runtime>`: Service runtime (`node` or `bun`)
-- `--runtime-path <path>`: Absolute executable path for the selected service runtime
 - `--force`: Reinstall/overwrite if already installed

-Use `--runtime-path` when a managed node host should run with a specific Node or Bun executable:
-
-```bash
-openclaw node install --runtime node --runtime-path "$(mise which node)" --force
-```
-
-The installer validates that the path matches the selected runtime and persists it in the service
-environment as `OPENCLAW_DAEMON_RUNTIME_PATH`.
-
 Manage the service:

 ```bash
--- a/docs/concepts/mantis.md
+++ b/docs/concepts/mantis.md
@@ -252,7 +252,7 @@ Telegram Web login state is not required for normal Mantis automation.

 `Mantis Telegram Desktop Proof` is the agentic native Telegram Desktop
 before/after wrapper. A maintainer can trigger it from a PR comment with
-`@Mantis telegram desktop proof`, from the Actions UI with freeform
+`@openclaw-mantis telegram desktop proof`, from the Actions UI with freeform
 instructions, or through the generic `Mantis Scenario` dispatcher. The workflow
 hands the PR, baseline ref, candidate ref, and maintainer instructions to Codex.
 The agent reads the PR, decides what Telegram-visible behavior proves the
@@ -351,7 +351,7 @@ region, and public URL values directly. The reusable publisher requires:
 You can also trigger the status-reactions run directly from a PR comment:

 ```text
-@Mantis discord status reactions
+@openclaw-mantis discord status reactions
 ```

 The comment trigger is intentionally narrow. It only runs on pull request
@@ -361,15 +361,15 @@ and the current PR head SHA as the candidate. Maintainers can override either
 ref:

 ```text
-@Mantis discord status reactions baseline=origin/main candidate=HEAD
+@openclaw-mantis discord status reactions baseline=origin/main candidate=HEAD
 ```

 Telegram live QA can also be triggered from a PR comment:

 ```text
-@Mantis telegram
-@Mantis telegram scenario=telegram-status-command
-@Mantis telegram scenarios=telegram-status-command,telegram-mentioned-message-reply
+@openclaw-mantis telegram
+@openclaw-mantis telegram scenario=telegram-status-command
+@openclaw-mantis telegram scenarios=telegram-status-command,telegram-mentioned-message-reply
 ```

 By default it uses the current PR head SHA as the candidate and runs
--- a/docs/concepts/model-providers.md
+++ b/docs/concepts/model-providers.md
@@ -149,7 +149,7 @@ Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so Ope
 - Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
 - For the common subscription plus native Codex runtime route, sign in with `openai-codex` auth but configure `openai/gpt-5.5`; OpenAI agent turns select Codex by default.
 - Use provider/model `agentRuntime.id: "pi"` only when you want a compatibility route through PI; otherwise keep `openai/gpt-5.5` on the default Codex harness.
- Older `openai-codex/gpt-5.1*`, `openai-codex/gpt-5.2*`, and `openai-codex/gpt-5.3*` refs are suppressed because ChatGPT/Codex OAuth accounts reject them; use `openai-codex/gpt-5.5` or the native Codex runtime route instead.
+- `openai-codex/gpt-*` refs remain a legacy PI route. Prefer `openai/gpt-5.5` on the native Codex runtime for new agent config, and run `openclaw doctor --fix` when you want to migrate old `openai-codex/*` refs to canonical `openai/*` refs.

 ```json5
 {
--- a/docs/concepts/qa-e2e-automation.md
+++ b/docs/concepts/qa-e2e-automation.md
@@ -34,7 +34,7 @@ script aliases; both forms are supported.
 | `qa run`                                            | Bundled QA self-check; writes a Markdown report.                                                                                                                                                                                                                        |
 | `qa suite`                                          | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM.                                                                                                                                  |
 | `qa coverage`                                       | Print the markdown scenario-coverage inventory (`--json` for machine output).                                                                                                                                                                                           |
-| `qa parity-report`                                  | Compare two `qa-suite-summary.json` files and write the agentic parity report.                                                                                                                                                                                          |
+| `qa parity-report`                                  | Compare two `qa-suite-summary.json` files and write the agentic parity report, or use `--runtime-axis --token-efficiency` to write Codex-vs-Pi runtime parity and token-efficiency reports from one runtime-pair summary.                                               |
 | `qa character-eval`                                 | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting).                                                                                                                                                            |
 | `qa manual`                                         | Run a one-off prompt against the selected provider/model lane.                                                                                                                                                                                                          |
 | `qa ui`                                             | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`).                                                                                                                                                                                                    |
--- a/docs/gateway/doctor.md
+++ b/docs/gateway/doctor.md
@@ -26,17 +26,28 @@ openclaw doctor
    Accept defaults without prompting (including restart/service/sandbox repair steps when applicable).

  </Tab>
-  <Tab title="--repair">
+  <Tab title="--fix">
    ```bash
-    openclaw doctor --repair
+    openclaw doctor --fix
    ```

    Apply recommended repairs without prompting (repairs + restarts where safe).

  </Tab>
-  <Tab title="--repair --force">
+  <Tab title="--lint">
    ```bash
-    openclaw doctor --repair --force
+    openclaw doctor --lint
+    openclaw doctor --lint --json
+    ```
+
+    Run structured health checks for CI or preflight automation. This mode is
+    read-only: it does not prompt, repair, migrate config, restart services, or
+    touch state.
+
+  </Tab>
+  <Tab title="--fix --force">
+    ```bash
+    openclaw doctor --fix --force
    ```

    Apply aggressive repairs too (overwrites custom supervisor configs).
@@ -66,6 +77,57 @@ If you want to review changes before writing, open the config file first:
 cat ~/.openclaw/openclaw.json
 ```

+## Read-only lint mode
+
+`openclaw doctor --lint` is the automation-friendly sibling of
+`openclaw doctor --fix`. Both use doctor health checks, but their posture is
+different:
+
+| Mode                     | Prompts   | Writes config/state     | Output                 | Use it for                      |
+| ------------------------ | --------- | ----------------------- | ---------------------- | ------------------------------- |
+| `openclaw doctor`        | yes       | no                      | friendly health report | a human checking status         |
+| `openclaw doctor --fix`  | sometimes | yes, with repair policy | friendly repair log    | applying approved repairs       |
+| `openclaw doctor --lint` | no        | no                      | structured findings    | CI, preflight, and review gates |
+
+Modernized health checks may provide an optional `repair()` implementation.
+`doctor --fix` applies those repairs when they exist and continues to use the
+existing doctor repair flow for checks that have not migrated yet.
+The structured repair contract also separates repair reporting from detection:
+`detect()` reports current findings, while `repair()` can report changes,
+config/file diffs, and non-file side effects. That keeps the migration path open
+for a future `doctor --fix --dry-run` and diff output without requiring lint
+checks to plan mutations.
+
+Examples:
+
+```bash
+openclaw doctor --lint
+openclaw doctor --lint --severity-min warning
+openclaw doctor --lint --json
+openclaw doctor --lint --only core/doctor/gateway-config --json
+```
+
+JSON output includes:
+
+- `ok`: `true` when no visible finding met the selected severity threshold
+- `checksRun`: number of health checks executed
+- `checksSkipped`: checks skipped by `--only` or `--skip`
+- `findings`: structured diagnostics with `checkId`, `severity`, `message`, and
+  optional `path`, `line`, `column`, `ocPath`, and `fixHint`
+
+Exit codes:
+
+- `0`: no findings at or above the selected threshold
+- `1`: one or more findings met the selected threshold
+- `2`: command/runtime failure before lint findings could be emitted
+
+Use `--severity-min info|warning|error` to control both what is printed and what
+causes a non-zero lint exit. Use `--only <id>` for narrow preflight gates and
+`--skip <id>` to temporarily exclude a noisy check while keeping the rest of the
+lint run active.
+Lint-output options such as `--json`, `--severity-min`, `--only`, and `--skip`
+must be paired with `--lint`; regular doctor and repair runs reject them.
+
 ## What it does (summary)

 <AccordionGroup>
@@ -112,7 +174,7 @@ cat ~/.openclaw/openclaw.json
    - Codex route repair for legacy `openai-codex/*` model refs in primary models, fallbacks, heartbeat/subagent/compaction overrides, hooks, channel model overrides, and session route pins; `--fix` rewrites them to `openai/*`, removes stale session/whole-agent runtime pins, and leaves canonical OpenAI agent refs on the default Codex harness.
    - Supervisor config audit (launchd/systemd/schtasks) with optional repair.
    - Embedded proxy environment cleanup for gateway services that captured shell `HTTP_PROXY` / `HTTPS_PROXY` / `NO_PROXY` values during install or update.
-    - Gateway runtime best-practice checks (Node vs Bun, version-manager paths). Services installed with an explicit `OPENCLAW_DAEMON_RUNTIME_PATH` keep that operator-selected executable during doctor repairs.
+    - Gateway runtime best-practice checks (Node vs Bun, version-manager paths).
    - Gateway port collision diagnostics (default `18789`).

  </Accordion>
@@ -471,8 +533,8 @@ That stages grounded durable candidates into the short-term dreaming store while

    - `openclaw doctor` prompts before rewriting supervisor config.
    - `openclaw doctor --yes` accepts the default repair prompts.
-    - `openclaw doctor --repair` applies recommended fixes without prompts.
-    - `openclaw doctor --repair --force` overwrites custom supervisor configs.
+    - `openclaw doctor --fix` applies recommended fixes without prompts (`--repair` is an alias).
+    - `openclaw doctor --fix --force` overwrites custom supervisor configs.
    - `OPENCLAW_SERVICE_REPAIR_POLICY=external` keeps doctor read-only for gateway service lifecycle. It still reports service health and runs non-service repairs, but skips service install/start/restart/bootstrap, supervisor config rewrites, and legacy service cleanup because an external supervisor owns that lifecycle.
    - On Linux, doctor does not rewrite command/entrypoint metadata while the matching systemd gateway unit is active. It also ignores inactive non-legacy extra gateway-like units during the duplicate-service scan so companion service files do not create cleanup noise.
    - If token auth requires a token and `gateway.auth.token` is SecretRef-managed, doctor service install/repair validates the SecretRef but does not persist resolved plaintext token values into supervisor service environment metadata.
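Reviewer sketch for the `doctor --lint --json` output documented in the `docs/gateway/doctor.md` changes above. This is a minimal sketch that assumes the JSON payload has roughly the fields the doc lists; the `LintReport` and `LintFinding` types and the helper names are hypothetical and not part of OpenClaw. `ok` and `checksSkipped` are left out because the diff does not pin down their exact types. The sketch recomputes the gate from `findings`, using the documented rule that exit code `1` means at least one finding met the threshold.

```typescript
// Hypothetical shapes assembled from the fields listed in the doc change;
// the real payload may carry more fields or differ in detail.
type Severity = "info" | "warning" | "error";

interface LintFinding {
  checkId: string;
  severity: Severity;
  message: string;
  path?: string;
  line?: number;
  column?: number;
  ocPath?: string;
  fixHint?: string;
}

interface LintReport {
  checksRun: number;
  findings: LintFinding[];
}

// Ordering assumed from `--severity-min info|warning|error`.
const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, error: 2 };

// Keep only findings at or above the threshold, mirroring `--severity-min`.
function findingsAtOrAbove(report: LintReport, min: Severity): LintFinding[] {
  return report.findings.filter(
    (f) => SEVERITY_RANK[f.severity] >= SEVERITY_RANK[min],
  );
}

// Map a parsed report to the documented exit codes 0 and 1. Exit code 2
// (command/runtime failure) occurs before a report exists, so it is not
// modelled here.
function gateExitCode(report: LintReport, min: Severity): 0 | 1 {
  return findingsAtOrAbove(report, min).length === 0 ? 0 : 1;
}

// Render one finding as a single CI log line.
function formatFinding(f: LintFinding): string {
  const line = f.line !== undefined ? `:${f.line}` : "";
  const where = f.path ? ` (${f.path}${line})` : "";
  return `[${f.severity}] ${f.checkId}: ${f.message}${where}`;
}

// Usage with an illustrative report; the message text is made up.
const sample: LintReport = {
  checksRun: 1,
  findings: [
    {
      checkId: "core/doctor/gateway-config",
      severity: "warning",
      message: "example finding",
    },
  ],
};

for (const finding of findingsAtOrAbove(sample, "warning")) {
  console.log(formatFinding(finding));
}
console.log("exit code:", gateExitCode(sample, "warning"));
```

In a CI step, a wrapper like this would only matter when the gate needs custom formatting; otherwise the command's own exit code already carries the same decision.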
--- a/docs/help/testing.md
+++ b/docs/help/testing.md
@@ -335,9 +335,9 @@ start it from the Actions UI through `Mantis Scenario` (`scenario_id:
 telegram-live`) or directly from a pull request comment:

 ```text
-@Mantis telegram
-@Mantis telegram scenario=telegram-status-command
-@Mantis telegram scenarios=telegram-status-command,telegram-mentioned-message-reply
+@openclaw-mantis telegram
+@openclaw-mantis telegram scenario=telegram-status-command
+@openclaw-mantis telegram scenarios=telegram-status-command,telegram-mentioned-message-reply
 ```

 `Mantis Telegram Desktop Proof` is the agentic native Telegram Desktop
@@ -346,7 +346,7 @@ freeform `instructions`, through `Mantis Scenario` (`scenario_id:
 telegram-desktop-proof`), or from a PR comment:

 ```text
-@Mantis telegram desktop proof
+@openclaw-mantis telegram desktop proof
 ```

 The Mantis agent reads the PR, decides what Telegram-visible behavior proves the
--- a/docs/install/raspberry-pi.md
+++ b/docs/install/raspberry-pi.md
@@ -145,6 +145,8 @@ EOF
 source ~/.bashrc
 ```

+`OPENCLAW_NO_RESPAWN=1` keeps routine Gateway restarts in-process, which avoids extra process handoffs and keeps PID tracking simple on small hosts.
+
 **Reduce memory usage** -- For headless setups, free GPU memory and disable unused services:

 ```bash
--- a/docs/plugins/building-plugins.md
+++ b/docs/plugins/building-plugins.md
@@ -2,276 +2,196 @@
 summary: "Create your first OpenClaw plugin in minutes"
 title: "Building plugins"
 sidebarTitle: "Getting Started"
+doc-schema-version: 1
 read_when:
  - You want to create a new OpenClaw plugin
  - You need a quick-start for plugin development
-  - You are adding a new channel, provider, tool, or other capability to OpenClaw
+  - You are choosing between channel, provider, CLI backend, tool, or hook docs
 ---

-Plugins extend OpenClaw with new capabilities: channels, model providers,
-speech, realtime transcription, realtime voice, media understanding, image
-generation, video generation, web fetch, web search, agent tools, or any
-combination.
+Plugins extend OpenClaw without changing core. A plugin can add a messaging
+channel, model provider, local CLI backend, agent tool, hook, media provider,
+or another plugin-owned capability.

-You do not need to add your plugin to the OpenClaw repository. Publish to
-[ClawHub](/clawhub) and users install with
-`openclaw plugins install clawhub:<package-name>`. Bare package specs still
-install from npm during the launch cutover.
+You do not need to add an external plugin to the OpenClaw repository. Publish
+the package to [ClawHub](/clawhub) and users install it with:

-## Prerequisites
+```bash
+openclaw plugins install clawhub:<package-name>
+```

- Node >= 22 and a package manager (npm or pnpm)
- Familiarity with TypeScript (ESM)
- For in-repo plugins: repository cloned and `pnpm install` done. Source
-  checkout plugin development is pnpm-only because OpenClaw loads bundled
-  plugins from the `extensions/*` workspace packages.
+Bare package specs still install from npm during the launch cutover. Use the
+`clawhub:` prefix when you want ClawHub resolution.

-## What kind of plugin?
+## Requirements

-<CardGroup cols={3}>
+- Use Node 22 or newer and a package manager such as `npm` or `pnpm`.
+- Be familiar with TypeScript ESM modules.
+- For in-repo bundled plugin work, clone the repository and run `pnpm install`.
+  Source-checkout plugin development is pnpm-only because OpenClaw loads bundled
+  plugins from `extensions/*` workspace packages.
+
+## Choose the plugin shape
+
+<CardGroup cols={2}>
  <Card title="Channel plugin" icon="messages-square" href="/plugins/sdk-channel-plugins">
-    Connect OpenClaw to a messaging platform (Discord, IRC, etc.)
+    Connect OpenClaw to a messaging platform.
  </Card>
  <Card title="Provider plugin" icon="cpu" href="/plugins/sdk-provider-plugins">
-    Add a model provider (LLM, proxy, or custom endpoint)
+    Add a model, media, search, fetch, speech, or realtime provider.
  </Card>
  <Card title="CLI backend plugin" icon="terminal" href="/plugins/cli-backend-plugins">
-    Map a local AI CLI into OpenClaw's text fallback runner
+    Run a local AI CLI through OpenClaw model fallback.
  </Card>
  <Card title="Tool plugin" icon="wrench" href="/plugins/tool-plugins">
-    Add simple typed agent tools with generated manifest metadata
-  </Card>
-  <Card title="Hook plugin" icon="plug" href="/plugins/hooks">
-    Register event hooks, services, or advanced runtime integrations
+    Register agent tools.
  </Card>
 </CardGroup>

-For a channel plugin that isn't guaranteed to be installed when onboarding/setup
-runs, use `createOptionalChannelSetupSurface(...)` from
-`openclaw/plugin-sdk/channel-setup`. It produces a setup adapter + wizard pair
-that advertises the install requirement and fails closed on real config writes
-until the plugin is installed.
+## Quickstart

-## Quick start: tool plugin
-
-This walkthrough creates a minimal plugin that registers an agent tool. Channel
-and provider plugins have dedicated guides linked above.
-For the detailed tool-only workflow, see [Tool Plugins](/plugins/tool-plugins).
+Build a minimal tool plugin by registering one required agent tool. This is the
+shortest useful plugin shape and covers the package metadata, manifest, entry
+point, and a local verification step.

 <Steps>
-  <Step title="Create the package and manifest">
+  <Step title="Create package metadata">
    <CodeGroup>
-    ```json package.json
-    {
-      "name": "@myorg/openclaw-my-plugin",
-      "version": "1.0.0",
-      "type": "module",
-      "openclaw": {
-        "extensions": ["./index.ts"],
-        "compat": {
-          "pluginApi": ">=2026.3.24-beta.2",
-          "minGatewayVersion": "2026.3.24-beta.2"
-        },
-        "build": {
-          "openclawVersion": "2026.3.24-beta.2",
-          "pluginSdkVersion": "2026.3.24-beta.2"
-        }
-      }
-    }
-    ```

-    ```json openclaw.plugin.json
-    {
-      "id": "my-plugin",
-      "name": "My Plugin",
-      "description": "Adds a custom tool to OpenClaw",
-      "contracts": {
-        "tools": ["my_tool"]
-      },
-      "activation": {
-        "onStartup": true
-      },
-      "configSchema": {
-        "type": "object",
-        "additionalProperties": false
-      }
+```json package.json
+{
+  "name": "@myorg/openclaw-my-plugin",
+  "version": "1.0.0",
+  "type": "module",
+  "openclaw": {
+    "extensions": ["./index.ts"],
+    "compat": {
+      "pluginApi": ">=2026.3.24-beta.2",
+      "minGatewayVersion": "2026.3.24-beta.2"
+    },
+    "build": {
+      "openclawVersion": "2026.3.24-beta.2",
+      "pluginSdkVersion": "2026.3.24-beta.2"
    }
-    ```
+  }
+}
+```
+
+```json openclaw.plugin.json
+{
+  "id": "my-plugin",
+  "name": "My Plugin",
+  "description": "Adds a custom tool to OpenClaw",
+  "contracts": {
+    "tools": ["my_tool"]
+  },
+  "activation": {
+    "onStartup": true
+  },
+  "configSchema": {
+    "type": "object",
+    "additionalProperties": false
+  }
+}
+```
+
    </CodeGroup>

-    Every plugin needs a manifest, even with no config. Runtime-registered tools
-    must be listed in `contracts.tools` so OpenClaw can discover the owning
-    plugin without loading every plugin runtime. For simple tool-only plugins,
-    prefer `defineToolPlugin` plus `openclaw plugins build` so tool names and
-    the empty config schema are generated from one source of truth. Plugins
-    should also declare `activation.onStartup` intentionally. This example sets
-    it to `true`. See [Manifest](/plugins/manifest) for the full schema. The
-    canonical ClawHub publish snippets live in `docs/snippets/plugin-publish/`.
+    This example points `openclaw.extensions` at `./index.ts`; published external
+    plugins should point runtime entries at built JavaScript files instead. See
+    [SDK entry points](/plugins/sdk-entrypoints) for the full entry point contract.
+
+    Every plugin needs a manifest, even when it has no config. Runtime tools
+    must appear in `contracts.tools` so OpenClaw can discover ownership without
+    eagerly loading every plugin runtime. Set `activation.onStartup`
+    intentionally. This example starts on Gateway startup.
+
+    For every manifest field, see [Plugin manifest](/plugins/manifest).

  </Step>

-  <Step title="Write the entry point">
-
-    ```typescript
-    // index.ts
+  <Step title="Register the tool">
+    ```typescript index.ts
    import { Type } from "typebox";
-    import { defineToolPlugin } from "openclaw/plugin-sdk/tool-plugin";
+    import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

-    export default defineToolPlugin({
+    export default definePluginEntry({
      id: "my-plugin",
      name: "My Plugin",
      description: "Adds a custom tool to OpenClaw",
-      tools: (tool) => [
-        tool({
+      register(api) {
+        api.registerTool({
          name: "my_tool",
-          description: "Do a thing",
+          description: "Echo one input value",
          parameters: Type.Object({ input: Type.String() }),
-          async execute({ input }) {
-            return { message: `Got: ${input}` };
+          async execute(_id, params) {
+            return {
+              content: [{ type: "text", text: `Got: ${params.input}` }],
+            };
          },
-        }),
-      ],
+        });
+      },
    });
    ```

-    `defineToolPlugin` is for simple agent-tool plugins. For providers, hooks,
-    services, and other advanced non-channel plugins, use `definePluginEntry`.
-    For channels, use `defineChannelPluginEntry` - see
-    [Channel Plugins](/plugins/sdk-channel-plugins). For the full
-    `defineToolPlugin` workflow, see [Tool Plugins](/plugins/tool-plugins). For
-    full entry point options, see [Entry Points](/plugins/sdk-entrypoints).
+    Use `definePluginEntry` for non-channel plugins. Channel plugins use
+    `defineChannelPluginEntry`.

  </Step>

-  <Step title="Generate and validate metadata">
+  <Step title="Test the runtime">
+    For an installed or external plugin, inspect the loaded runtime:

    ```bash
-    npm run build
-    openclaw plugins build --entry ./dist/index.js
-    openclaw plugins validate --entry ./dist/index.js
+    openclaw plugins inspect my-plugin --runtime --json
    ```

-    `openclaw plugins build` writes `openclaw.plugin.json` and keeps
-    `package.json` `openclaw.extensions` pointed at the entry module. For
-    published packages, point it at built JavaScript such as `./dist/index.js`.
-    The generated manifest is the cold-load contract that OpenClaw reads before
-    runtime import. `openclaw plugins validate` imports the entry only during
-    author validation and checks that the manifest and package metadata match
-    the static `defineToolPlugin` metadata.
+    If the plugin registers a CLI command, run that command too. For example,
+    a plugin that registers a `demo-plugin` command group can be verified with
+    `openclaw demo-plugin ping`.
+
+    For a bundled plugin in this repository, OpenClaw discovers source-checkout
+    plugin packages from the `extensions/*` workspace. Run the closest targeted
+    test:
+
+    ```bash
+    pnpm test -- extensions/my-plugin/
+    pnpm check
+    ```

  </Step>

-  <Step title="Test and publish">
-
-    **External plugins:** validate and publish with ClawHub, then install:
+  <Step title="Publish">
+    Validate the package before publishing:

    ```bash
    clawhub package publish your-org/your-plugin --dry-run
    clawhub package publish your-org/your-plugin
-    openclaw plugins install clawhub:@myorg/openclaw-my-plugin
    ```

-    Bare package specs like `@myorg/openclaw-my-plugin` install from npm during
-    the launch cutover. Use `clawhub:` when you want ClawHub resolution.
+    The canonical ClawHub snippets live in `docs/snippets/plugin-publish/`.

-    **In-repo plugins:** place under the bundled plugin workspace tree - automatically discovered.
+  </Step>
+
+  <Step title="Install">
+    Install the published package through ClawHub:

    ```bash
-    pnpm test -- <bundled-plugin-root>/my-plugin/
+    openclaw plugins install clawhub:your-org/your-plugin
    ```

  </Step>
 </Steps>

-## Plugin capabilities
+<a id="registering-agent-tools"></a>

-A single plugin can register any number of capabilities via the `api` object:
+## Registering tools

-| Capability             | Registration method                              | Detailed guide                                                                  |
-| ---------------------- | ------------------------------------------------ | ------------------------------------------------------------------------------- |
-| Text inference (LLM)   | `api.registerProvider(...)`                      | [Provider Plugins](/plugins/sdk-provider-plugins)                               |
-| CLI inference backend  | `api.registerCliBackend(...)`                    | [CLI Backend Plugins](/plugins/cli-backend-plugins)                             |
-| Channel / messaging    | `api.registerChannel(...)`                       | [Channel Plugins](/plugins/sdk-channel-plugins)                                 |
-| Speech (TTS/STT)       | `api.registerSpeechProvider(...)`                | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Realtime transcription | `api.registerRealtimeTranscriptionProvider(...)` | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Realtime voice         | `api.registerRealtimeVoiceProvider(...)`         | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Media understanding    | `api.registerMediaUnderstandingProvider(...)`    | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Image generation       | `api.registerImageGenerationProvider(...)`       | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Music generation       | `api.registerMusicGenerationProvider(...)`       | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Video generation       | `api.registerVideoGenerationProvider(...)`       | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Web fetch              | `api.registerWebFetchProvider(...)`              | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Web search             | `api.registerWebSearchProvider(...)`             | [Provider Plugins](/plugins/sdk-provider-plugins#step-5-add-extra-capabilities) |
-| Tool-result middleware | `api.registerAgentToolResultMiddleware(...)`     | [SDK Overview](/plugins/sdk-overview#registration-api)                          |
-| Agent tools            | `api.registerTool(...)`                          | Below                                                                           |
-| Custom commands        | `api.registerCommand(...)`                       | [Entry Points](/plugins/sdk-entrypoints)                                        |
-| Plugin hooks           | `api.on(...)`                                    | [Plugin hooks](/plugins/hooks)                                                  |
-| Internal event hooks   | `api.registerHook(...)`                          | [Entry Points](/plugins/sdk-entrypoints)                                        |
-| HTTP routes            | `api.registerHttpRoute(...)`                     | [Internals](/plugins/architecture-internals#gateway-http-routes)                |
-| CLI subcommands        | `api.registerCli(...)`                           | [Entry Points](/plugins/sdk-entrypoints)                                        |
-
-For the full registration API, see [SDK Overview](/plugins/sdk-overview#registration-api).
-
-Bundled plugins can use `api.registerAgentToolResultMiddleware(...)` when they
-need async tool-result rewriting before the model sees the output. Declare the
-targeted runtimes in `contracts.agentToolResultMiddleware`, for example
-`["pi", "codex"]`. This is a trusted bundled-plugin seam; external
-plugins should prefer regular OpenClaw plugin hooks unless OpenClaw grows an
-explicit trust policy for this capability.
-
-If your plugin registers custom gateway RPC methods, keep them on a
-plugin-specific prefix. Core admin namespaces (`config.*`,
-`exec.approvals.*`, `wizard.*`, `update.*`) stay reserved and always resolve to
-`operator.admin`, even if a plugin asks for a narrower scope.
-
-`openclaw/plugin-sdk/gateway-method-runtime` is a reserved control-plane bridge
-for plugin HTTP routes that declare
-`contracts.gatewayMethodDispatch: ["authenticated-request"]`. It is an
-intentional-use guard for reviewed native plugins, not a sandbox boundary.
-
-Hook guard semantics to keep in mind:
-
- `before_tool_call`: `{ block: true }` is terminal and stops lower-priority handlers.
- `before_tool_call`: `{ block: false }` is treated as no decision.
- `before_tool_call`: `{ requireApproval: true }` pauses agent execution and prompts the user for approval via the exec approval overlay, Telegram buttons, Discord interactions, or the `/approve` command on any channel.
- `before_install`: `{ block: true }` is terminal and stops lower-priority handlers.
- `before_install`: `{ block: false }` is treated as no decision.
- `message_sending`: `{ cancel: true }` is terminal and stops lower-priority handlers.
- `message_sending`: `{ cancel: false }` is treated as no decision.
- `message_received`: prefer the typed `threadId` field when you need inbound thread/topic routing. Keep `metadata` for channel-specific extras.
- `message_sending`: prefer typed `replyToId` / `threadId` routing fields over channel-specific metadata keys.
-
-The `/approve` command handles both exec and plugin approvals with bounded fallback: when an exec approval id is not found, OpenClaw retries the same id through plugin approvals. Plugin approval forwarding can be configured independently via `approvals.plugin` in config.
-
-If custom approval plumbing needs to detect that same bounded fallback case,
-prefer `isApprovalNotFoundError` from `openclaw/plugin-sdk/error-runtime`
-instead of matching approval-expiry strings manually.
-
-See [Plugin hooks](/plugins/hooks) for examples and the hook reference.
-
-## Registering agent tools
-
-Tools are typed functions the LLM can call. They can be required (always
-available) or optional (user opt-in):
-
-For simple plugins that only own a fixed set of tools, prefer
-[`defineToolPlugin`](/plugins/tool-plugins). It generates manifest metadata and
-keeps `contracts.tools` aligned. Use the lower-level `api.registerTool(...)`
-surface when the plugin also owns channels, providers, hooks, services,
-commands, or fully dynamic tool registration.
+Tools can be required or optional. Required tools are always available when the
+plugin is enabled. Optional tools require user opt-in.

 ```typescript
 register(api) {
-  // Required tool - always available
-  api.registerTool({
-    name: "my_tool",
-    description: "Do a thing",
-    parameters: Type.Object({ input: Type.String() }),
-    async execute(_id, params) {
-      return { content: [{ type: "text", text: params.input }] };
-    },
-  });
-
-  // Optional tool - user must add to allowlist
  api.registerTool(
    {
      name: "workflow_tool",
@@ -286,21 +206,13 @@ register(api) {
 }
 ```

-Tool factories receive a runtime-supplied context object. Use
-`ctx.activeModel` when a tool needs to log, display, or adapt to the active
-model for the current turn. The object can include `provider`, `modelId`, and
-`modelRef`. Treat it as informational runtime metadata, not as a security
-boundary against the local operator, installed plugin code, or a modified
-OpenClaw runtime. For sensitive local tools, keep an explicit plugin or operator
-opt-in and fail closed when the active model metadata is missing or unsuitable.
-
 Every tool registered with `api.registerTool(...)` must also be declared in the
 plugin manifest:

 ```json
 {
  "contracts": {
-    "tools": ["my_tool", "workflow_tool"]
+    "tools": ["workflow_tool"]
  },
  "toolMetadata": {
    "workflow_tool": {
@@ -310,110 +222,74 @@ plugin manifest:
 }
 ```

-OpenClaw captures and caches the validated descriptor from the registered tool,
-so plugins do not duplicate `description` or schema data in the manifest. The
-manifest contract only declares ownership and discovery; execution still calls
-the live registered tool implementation.
-Set `toolMetadata.<tool>.optional: true` for tools registered with
-`api.registerTool(..., { optional: true })` so OpenClaw can avoid loading that
-plugin runtime until the tool is explicitly allowlisted.
-
-Users enable optional tools in config:
+Users opt in with `tools.allow`:

 ```json5
 {
-  tools: { allow: ["workflow_tool"] },
+  tools: { allow: ["workflow_tool"] }, // or ["my-plugin"] for all tools from one plugin
 }
 ```

- Tool names must not clash with core tools (conflicts are skipped)
- Tools with malformed registration objects, including missing `parameters`, are skipped and reported in plugin diagnostics instead of breaking agent runs
- Use `optional: true` for tools with side effects or extra binary requirements
- Users can enable all tools from a plugin by adding the plugin id to `tools.allow`
+Use optional tools for side effects, extra binary requirements, or capabilities
+that should not be exposed by default. Tool names must not conflict with core
+tools; conflicts are skipped and reported in plugin diagnostics. Malformed
+registrations, including tool descriptors without `parameters`, are skipped and
+reported the same way. Registered tools are typed functions the model can call
+after policy and allowlist checks pass.

-## Registering CLI commands
+Tool factories receive a runtime-supplied context object. Use `ctx.activeModel`
+when a tool needs to log, display, or adapt to the active model for the current
+turn. The object can include `provider`, `modelId`, and `modelRef`. Treat it as
+informational runtime metadata, not as a security boundary against the local
+operator, installed plugin code, or a modified OpenClaw runtime. Sensitive local
+tools should still require an explicit plugin or operator opt-in and fail closed
+when active-model metadata is missing or unsuitable.

-Plugins can add root `openclaw` command groups with `api.registerCli`. Provide
-`descriptors` for every top-level command root so OpenClaw can show and route
-the command without eagerly loading every plugin runtime.
-
-```typescript
-register(api) {
-  api.registerCli(
-    ({ program }) => {
-      const demo = program
-        .command("demo-plugin")
-        .description("Run demo plugin commands");
-
-      demo
-        .command("ping")
-        .description("Check that the plugin CLI is executable")
-        .action(() => {
-          console.log("demo-plugin:pong");
-        });
-    },
-    {
-      descriptors: [
-        {
-          name: "demo-plugin",
-          description: "Run demo plugin commands",
-          hasSubcommands: true,
-        },
-      ],
-    },
-  );
-}
-```
-
-After install, verify the runtime registration and execute the command:
-
-```bash
-openclaw plugins inspect demo-plugin --runtime --json
-openclaw demo-plugin ping
-```
+The manifest declares ownership and discovery; execution still calls the live
+registered tool implementation. Keep `toolMetadata.<tool>.optional: true`
+aligned with `api.registerTool(..., { optional: true })` so OpenClaw can avoid
+loading that plugin runtime until the tool is explicitly allowlisted.

 ## Import conventions

-Always import from focused `openclaw/plugin-sdk/<subpath>` paths:
+Import from focused SDK subpaths:

 ```typescript
 import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
 import { createPluginRuntimeStore } from "openclaw/plugin-sdk/runtime-store";
-
-// Wrong: monolithic root (deprecated, will be removed)
-import { ... } from "openclaw/plugin-sdk";
 ```

-For the full subpath reference, see [SDK Overview](/plugins/sdk-overview).
+Do not import from the deprecated root barrel:

-Within your plugin, use local barrel files (`api.ts`, `runtime-api.ts`) for
-internal imports - never import your own plugin through its SDK path.
+```typescript
+import { definePluginEntry } from "openclaw/plugin-sdk";
+```

-For provider plugins, keep provider-specific helpers in those package-root
-barrels unless the seam is truly generic. Current bundled examples:
+Within your plugin package, use local barrel files such as `api.ts` and
+`runtime-api.ts` for internal imports. Do not import your own plugin through an
+SDK path. Provider-specific helpers should stay in the provider package unless
+the seam is truly generic.

-- Anthropic: Claude stream wrappers and `service_tier` / beta helpers
-- OpenAI: provider builders, default-model helpers, realtime providers
-- OpenRouter: provider builder plus onboarding/config helpers
+Custom Gateway RPC methods are an advanced entry point. Keep them on a
+plugin-specific prefix; core admin namespaces such as `config.*`,
+`exec.approvals.*`, `operator.admin.*`, `wizard.*`, and `update.*` stay reserved
+and resolve to `operator.admin`. The
+`openclaw/plugin-sdk/gateway-method-runtime` bridge is reserved for plugin HTTP
+routes that declare `contracts.gatewayMethodDispatch: ["authenticated-request"]`.

-If a helper is only useful inside one bundled provider package, keep it on that
-package-root seam instead of promoting it into `openclaw/plugin-sdk/*`.
-
-Some generated `openclaw/plugin-sdk/<bundled-id>` helper seams still exist for
-bundled-plugin maintenance when they have tracked owner usage. Treat those as
-reserved surfaces, not as the default pattern for new third-party plugins.
+For the full import map, see [Plugin SDK overview](/plugins/sdk-overview).

 ## Pre-submission checklist

 <Check>**package.json** has correct `openclaw` metadata</Check>
 <Check>**openclaw.plugin.json** manifest is present and valid</Check>
-<Check>Entry point uses `defineToolPlugin`, `defineChannelPluginEntry`, or `definePluginEntry`</Check>
+<Check>Entry point uses `defineChannelPluginEntry` or `definePluginEntry`</Check>
 <Check>All imports use focused `plugin-sdk/<subpath>` paths</Check>
 <Check>Internal imports use local modules, not SDK self-imports</Check>
 <Check>Tests pass (`pnpm test -- <bundled-plugin-root>/my-plugin/`)</Check>
 <Check>`pnpm check` passes (in-repo plugins)</Check>

-## Beta release testing
+## Test against beta releases

 1. Watch for GitHub release tags on [openclaw/openclaw](https://github.com/openclaw/openclaw/releases) and subscribe via `Watch` > `Releases`. Beta tags look like `v2026.3.N-beta.1`. You can also turn on notifications for the official OpenClaw X account [@openclaw](https://x.com/openclaw) for release announcements.
 2. Test your plugin against the beta tag as soon as it appears. The window before stable is typically only a few hours.
@@ -450,8 +326,5 @@ reserved surfaces, not as the default pattern for new third-party plugins.

 ## Related

-- [Plugin Architecture](/plugins/architecture) - internal architecture deep dive
-- [SDK Overview](/plugins/sdk-overview) - Plugin SDK reference
-- [Manifest](/plugins/manifest) - plugin manifest format
-- [Channel Plugins](/plugins/sdk-channel-plugins) - building channel plugins
-- [Provider Plugins](/plugins/sdk-provider-plugins) - building provider plugins
+- [Plugin hooks](/plugins/hooks)
+- [Plugin architecture](/plugins/architecture)
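
Review note on the `ctx.activeModel` guidance in the plugin docs hunk above: a minimal sketch of the "fail closed" rule, assuming a context shape with only the documented `activeModel` fields (`provider`, `modelId`, `modelRef`). The `gateSensitiveTool` helper, the `operatorOptIn` flag, and the `allowedProviders` list are hypothetical names invented for illustration and are not part of the OpenClaw SDK.

```typescript
// Hypothetical shapes: only `activeModel`, `provider`, `modelId`, and
// `modelRef` are named in the docs; everything else here is illustrative.
type ActiveModel = { provider?: string; modelId?: string; modelRef?: string };
type ToolContext = { activeModel?: ActiveModel };

type GateResult =
  | { allowed: true; modelRef: string }
  | { allowed: false; reason: string };

// Fail closed: refuse unless the operator opted in AND the active-model
// metadata is present and matches the (hypothetical) provider allowlist.
function gateSensitiveTool(
  ctx: ToolContext,
  opts: { operatorOptIn: boolean; allowedProviders: string[] },
): GateResult {
  if (!opts.operatorOptIn) {
    return { allowed: false, reason: "operator opt-in required" };
  }
  const provider = ctx.activeModel?.provider;
  const modelRef = ctx.activeModel?.modelRef;
  if (!provider || !modelRef) {
    return { allowed: false, reason: "active-model metadata missing" };
  }
  if (!opts.allowedProviders.includes(provider)) {
    return { allowed: false, reason: `provider ${provider} not allowed` };
  }
  return { allowed: true, modelRef };
}
```

As the docs text says, this metadata is informational; a check like this guards against accidental exposure, not against a modified runtime or a hostile local operator.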
--- a/docs/providers/openai.md
+++ b/docs/providers/openai.md
@@ -248,10 +248,10 @@ Choose your preferred auth method and follow the setup steps.
    | `codex-cli/gpt-5.5` | repaired by doctor | Legacy CLI route rewritten to `openai/gpt-5.5` | Codex app-server auth |

    <Warning>
-    Do not configure older `openai-codex/gpt-5.1*`, `openai-codex/gpt-5.2*`, or
-    `openai-codex/gpt-5.3*` model refs. ChatGPT/Codex OAuth accounts now reject
-    those models. Use `openai/gpt-5.5`; OpenAI agent turns now select the Codex
-    runtime by default.
+    Prefer `openai/gpt-5.5` for new subscription-backed agent config. Older
+    `openai-codex/gpt-*` refs are legacy PI routes, not the native Codex runtime
+    path; run `openclaw doctor --fix` when you want to migrate them to canonical
+    `openai/*` refs.
    </Warning>

    <Note>
--- a/docs/reference/RELEASING.md
+++ b/docs/reference/RELEASING.md
@@ -185,10 +185,10 @@ vYYYY.M.D-beta.N` from the matching `release/YYYY.M.D` branch. The helper runs
  - `custom`: exact `docker_lanes` selection for a focused rerun
 - Run the manual `CI` workflow directly when you only need full normal CI
  coverage for the release candidate. Manual CI dispatches bypass changed
-  scoping and force the Linux Node shards, bundled-plugin shards, channel
-  contracts, Node 22 compatibility, `check`, `check-additional`, build smoke,
-  docs checks, Python skills, Windows, macOS, Android, and Control UI i18n
-  lanes.
+  scoping and force the Linux Node shards, bundled-plugin shards, plugin and
+  channel contract shards, Node 22 compatibility, `check-*`, `check-additional-*`,
+  built-artifact smoke checks, docs checks, Python skills, Windows, macOS,
+  Android, and Control UI i18n lanes.
  Example: `gh workflow run ci.yml --ref release/YYYY.M.D`
 - Run `pnpm qa:otel:smoke` when validating release telemetry. It exercises
  QA-lab through a local OTLP/HTTP receiver and verifies the exported trace
@@ -442,16 +442,19 @@ Focused `npm-telegram` reruns require `release_package_spec` or
 `npm_telegram_package_spec`; full/all runs with `release_profile=full` use the
 release-checks package artifact. Focused
 cross-OS reruns can add `cross_os_suite_filter=windows/packaged-upgrade` or
-another OS/suite filter. QA release-check failures are advisory; a QA-only
-failure does not block release validation.
+another OS/suite filter. QA release-check failures are advisory except the
+standard runtime tool coverage gate, which blocks release validation when
+required OpenClaw dynamic tools drift or disappear from the standard tier
+summary.

 ### Vitest

 The Vitest box is the manual `CI` child workflow. Manual CI intentionally
 bypasses changed scoping and forces the normal test graph for the release
-candidate: Linux Node shards, bundled-plugin shards, channel contracts, Node 22
-compatibility, `check`, `check-additional`, build smoke, docs checks, Python
-skills, Windows, macOS, Android, and Control UI i18n.
+candidate: Linux Node shards, bundled-plugin shards, plugin and channel contract
+shards, Node 22 compatibility, `check-*`, `check-additional-*`,
+built-artifact smoke checks, docs checks, Python skills, Windows, macOS,
+Android, and Control UI i18n.

 Use this box to answer "did the source tree pass the full normal test suite?"
 It is not the same as release-path product validation. Evidence to keep:
--- a/docs/reference/full-release-validation.md
+++ b/docs/reference/full-release-validation.md
@@ -44,7 +44,7 @@ only when Package Acceptance should intentionally prove a different package.
 | Stage                | Details                                                                                                                                                                                                                                                                                                                                                                                                                                        |
 | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | Target resolution    | **Job:** `Resolve target ref`<br />**Child workflow:** none<br />**Proves:** resolves the release branch, tag, or full commit SHA and records selected inputs.<br />**Rerun:** rerun the umbrella if this fails.                                                                                                                                                                                                                               |
-| Vitest and normal CI | **Job:** `Run normal full CI`<br />**Child workflow:** `CI`<br />**Proves:** manual full CI graph against the target ref, including Linux Node lanes, bundled plugin shards, channel contracts, Node 22 compatibility, `check`, `check-additional`, build smoke, docs checks, Python skills, Windows, macOS, Control UI i18n, and Android via the umbrella.<br />**Rerun:** `rerun_group=ci`.                                                  |
+| Vitest and normal CI | **Job:** `Run normal full CI`<br />**Child workflow:** `CI`<br />**Proves:** manual full CI graph against the target ref, including Linux Node lanes, bundled plugin shards, plugin and channel contract shards, Node 22 compatibility, `check-*`, `check-additional-*`, built-artifact smoke checks, docs checks, Python skills, Windows, macOS, Control UI i18n, and Android via the umbrella.<br />**Rerun:** `rerun_group=ci`.             |
 | Plugin prerelease    | **Job:** `Run plugin prerelease validation`<br />**Child workflow:** `Plugin Prerelease`<br />**Proves:** release-only plugin static checks, agentic plugin coverage, full extension batch shards, plugin prerelease Docker lanes, and a non-blocking `plugin-inspector-advisory` artifact for compatibility triage.<br />**Rerun:** `rerun_group=plugin-prerelease`.                                                                          |
 | Release checks       | **Job:** `Run release/live/Docker/QA validation`<br />**Child workflow:** `OpenClaw Release Checks`<br />**Proves:** install smoke, cross-OS package checks, Package Acceptance, QA Lab parity, live Matrix, and live Telegram. With `run_release_soak=true` or `release_profile=full`, also runs exhaustive live/E2E suites and Docker release-path chunks.<br />**Rerun:** `rerun_group=release-checks` or a narrower release-checks handle. |
 | Package artifact     | **Job:** `Prepare release package artifact`<br />**Child workflow:** none<br />**Proves:** creates the parent `release-package-under-test` tarball early enough for package-facing checks that do not need to wait for `OpenClaw Release Checks`.<br />**Rerun:** rerun the umbrella or provide `release_package_spec` for published-package reruns.                                                                                           |
@@ -166,9 +166,10 @@ summaries include per-phase timings for packaged upgrade lanes, and long-running
 commands print heartbeat lines so a stuck Windows update is visible before the
 job timeout.

-QA release-check lanes are advisory. A QA-only failure is reported as a warning
-and does not block the release-check verifier; rerun `rerun_group=qa`,
-`qa-parity`, or `qa-live` when you need fresh QA evidence.
+QA release-check lanes are advisory except the standard runtime tool coverage
+gate. Required OpenClaw dynamic tool drift in the standard tier blocks the
+release-check verifier; other QA-only failures are reported as warnings. Rerun
+`rerun_group=qa`, `qa-parity`, or `qa-live` when you need fresh QA evidence.

 ## Evidence to keep

--- a/docs/tools/browser-control.md
+++ b/docs/tools/browser-control.md
@@ -193,6 +193,7 @@ openclaw browser waitfordownload report.pdf
 openclaw browser upload /tmp/openclaw/uploads/file.pdf
 openclaw browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'
 openclaw browser dialog --accept
+openclaw browser dialog --dismiss --dialog-id d1
 openclaw browser wait --text "Done"
 openclaw browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"
 openclaw browser evaluate --fn '(el) => el.textContent' --ref 7
@@ -228,7 +229,7 @@ openclaw browser set device "iPhone 14"

 Notes:

-- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
+- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog. If an action opens a modal, the action response includes `blockedByDialog` and `browserState.dialogs.pending`; pass that `dialogId` to respond directly. Dialogs handled outside OpenClaw appear under `browserState.dialogs.recent`.
 - `click`/`type`/etc require a `ref` from `snapshot` (numeric `12`, role ref `e12`, or actionable ARIA ref `ax12`). CSS selectors are intentionally not supported for actions. Use `click-coords` when the visible viewport position is the only reliable target.
 - Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
 - `upload` can also set file inputs directly via `--input-ref` or `--element`.
--- a/docs/tools/browser.md
+++ b/docs/tools/browser.md
@@ -646,7 +646,8 @@ Compared to the managed `openclaw` profile, existing-session drivers are more co

 - **Screenshots** - page captures and `--ref` element captures work; CSS `--element` selectors do not. `--full-page` cannot combine with `--ref` or `--element`. Playwright is not required for page or ref-based element screenshots.
 - **Actions** - `click`, `type`, `hover`, `scrollIntoView`, `drag`, and `select` require snapshot refs (no CSS selectors). `click-coords` clicks visible viewport coordinates and does not require a snapshot ref. `click` is left-button only. `type` does not support `slowly=true`; use `fill` or `press`. `press` does not support `delayMs`. `type`, `hover`, `scrollIntoView`, `drag`, `select`, `fill`, and `evaluate` do not support per-call timeouts. `select` accepts a single value.
-- **Wait / upload / dialog** - `wait --url` supports exact, substring, and glob patterns; `wait --load networkidle` is not supported. Upload hooks require `ref` or `inputRef`, one file at a time, no CSS `element`. Dialog hooks do not support timeout overrides.
+- **Wait / upload / dialog** - `wait --url` supports exact, substring, and glob patterns; `wait --load networkidle` is not supported. Upload hooks require `ref` or `inputRef`, one file at a time, no CSS `element`. Dialog hooks do not support timeout overrides or `dialogId`.
+- **Dialog visibility** - Managed browser action responses include `blockedByDialog` and `browserState.dialogs.pending` when an action opens a modal dialog; snapshots also include pending dialog state. Respond with `browser dialog --accept/--dismiss --dialog-id <id>` while a dialog is pending. Dialogs handled outside OpenClaw appear under `browserState.dialogs.recent`.
 - **Managed-only features** - batch actions, PDF export, download interception, and `responsebody` still require the managed browser path.

 </Accordion>
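
Review note on the dialog-visibility docs above: a rough sketch of how a caller might turn a blocked action response into a follow-up `browser dialog` command. The response type is an assumption pieced together from the documented field names (`blockedByDialog`, `browserState.dialogs.pending`, `browserState.dialogs.recent`) and the record fields used in the tests (`id`, `type`, `message`); `dialogCommandFor` is a hypothetical helper, not an OpenClaw API.

```typescript
// Assumed response shape; the client types in this diff keep
// `browserState` as `unknown`, so callers would need to validate it.
type DialogRecord = { id: string; type: string; message: string };
type ActionResponse = {
  blockedByDialog?: boolean;
  browserState?: {
    dialogs?: { pending?: DialogRecord[]; recent?: DialogRecord[] };
  };
};

// Returns the CLI command that answers the first pending dialog,
// or undefined when the action was not blocked by a dialog.
function dialogCommandFor(
  response: ActionResponse,
  accept: boolean,
): string | undefined {
  if (!response.blockedByDialog) {
    return undefined;
  }
  const first = (response.browserState?.dialogs?.pending ?? [])[0];
  if (!first) {
    return undefined;
  }
  const flag = accept ? "--accept" : "--dismiss";
  return `openclaw browser dialog ${flag} --dialog-id ${first.id}`;
}
```

Per the existing-session limits listed above, `dialogId` applies to the managed browser path only.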
--- a/docs/vps.md
+++ b/docs/vps.md
@@ -90,7 +90,7 @@ source ~/.bashrc
 ```

 - `NODE_COMPILE_CACHE` improves repeated command startup times.
-- `OPENCLAW_NO_RESPAWN=1` avoids extra startup overhead from a self-respawn path.
+- `OPENCLAW_NO_RESPAWN=1` keeps routine Gateway restarts in-process, which avoids extra process handoffs and keeps PID tracking simple on small hosts.
 - First command run warms the cache; subsequent runs are faster.
 - For Raspberry Pi specifics, see [Raspberry Pi](/install/raspberry-pi).

--- a/extensions/bonjour/index.test.ts
+++ b/extensions/bonjour/index.test.ts
@@ -72,6 +72,7 @@ describe("bonjour plugin entry", () => {
        gatewayPort: 3210,
        gatewayTlsEnabled: true,
        gatewayTlsFingerprintSha256: "abc123",
+        gatewayDirectReachable: true,
        canvasPort: 9876,
        sshPort: 22,
        tailnetDns: "dev.tailnet.ts.net",
@@ -88,6 +89,7 @@ describe("bonjour plugin entry", () => {
        gatewayPort: 3210,
        gatewayTlsEnabled: true,
        gatewayTlsFingerprintSha256: "abc123",
+        gatewayDirectReachable: true,
        canvasPort: 9876,
        sshPort: 22,
        tailnetDns: "dev.tailnet.ts.net",
--- a/extensions/bonjour/index.ts
+++ b/extensions/bonjour/index.ts
@@ -32,6 +32,7 @@ export default definePluginEntry({
            gatewayPort: ctx.gatewayPort,
            gatewayTlsEnabled: ctx.gatewayTlsEnabled,
            gatewayTlsFingerprintSha256: ctx.gatewayTlsFingerprintSha256,
+            gatewayDirectReachable: ctx.gatewayDirectReachable,
            canvasPort: ctx.canvasPort,
            sshPort: ctx.sshPort,
            tailnetDns: ctx.tailnetDns,
--- a/extensions/bonjour/src/advertiser.test.ts
+++ b/extensions/bonjour/src/advertiser.test.ts
@@ -180,6 +180,7 @@ describe("gateway bonjour advertiser", () => {
    const started = await startAdvertiser({
      gatewayPort: 18789,
      sshPort: 2222,
+      gatewayDirectReachable: true,
      tailnetDns: "host.tailnet.ts.net",
      cliPath: "/opt/homebrew/bin/openclaw",
      minimal: false,
@@ -195,6 +196,7 @@ describe("gateway bonjour advertiser", () => {
    expect(gatewayCall?.[0]?.hostname).toBe("test-host");
    expect((gatewayCall?.[0]?.txt as Record<string, string>)?.lanHost).toBe("test-host.local");
    expect((gatewayCall?.[0]?.txt as Record<string, string>)?.gatewayPort).toBe("18789");
+    expect((gatewayCall?.[0]?.txt as Record<string, string>)?.gatewayDirectReachable).toBe("1");
    expect((gatewayCall?.[0]?.txt as Record<string, string>)?.sshPort).toBe("2222");
    expect((gatewayCall?.[0]?.txt as Record<string, string>)?.tailnetDns).toBe(
      "host.tailnet.ts.net",
--- a/extensions/bonjour/src/advertiser.ts
+++ b/extensions/bonjour/src/advertiser.ts
@@ -22,6 +22,7 @@ export type GatewayBonjourAdvertiseOpts = {
  sshPort?: number;
  gatewayTlsEnabled?: boolean;
  gatewayTlsFingerprintSha256?: string;
+  gatewayDirectReachable?: boolean;
  canvasPort?: number;
  tailnetDns?: string;
  cliPath?: string;
@@ -451,6 +452,9 @@ export async function startGatewayBonjourAdvertiser(
        txtBase.gatewayTlsSha256 = opts.gatewayTlsFingerprintSha256;
      }
    }
+    if (opts.gatewayDirectReachable) {
+      txtBase.gatewayDirectReachable = "1";
+    }
    if (typeof opts.canvasPort === "number" && opts.canvasPort > 0) {
      txtBase.canvasPort = String(opts.canvasPort);
    }
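
Review note on the advertiser hunk above: the new TXT key is written only when the flag is truthy and always as the string `"1"`, so an absent key means "not advertised as directly reachable". A small sketch of that encoding, plus a hypothetical reader for the consuming side; `buildTxt` and `readDirectReachable` are illustrative names and the option type is trimmed to two fields.

```typescript
// Trimmed, illustrative option type; the real one has more fields.
type AdvertiseOpts = { gatewayPort: number; gatewayDirectReachable?: boolean };

// Mirrors the hunk: the key is omitted entirely unless the flag is truthy.
function buildTxt(opts: AdvertiseOpts): Record<string, string> {
  const txt: Record<string, string> = { gatewayPort: String(opts.gatewayPort) };
  if (opts.gatewayDirectReachable) {
    txt.gatewayDirectReachable = "1";
  }
  return txt;
}

// Hypothetical consumer: treat anything other than "1" as not reachable.
function readDirectReachable(txt: Record<string, string>): boolean {
  return txt.gatewayDirectReachable === "1";
}
```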
--- a/extensions/browser/browser-doctor.ts
+++ b/extensions/browser/browser-doctor.ts
@@ -1 +1,6 @@
-export { noteChromeMcpBrowserReadiness } from "./src/doctor-browser.js";
+export {
+  detectLegacyClawdBrowserProfileResidue,
+  maybeArchiveLegacyClawdBrowserProfileResidue,
+  noteChromeMcpBrowserReadiness,
+} from "./src/doctor-browser.js";
+export type { LegacyClawdBrowserProfileResidue } from "./src/doctor-browser.js";
--- a/extensions/browser/src/browser-tool.actions.ts
+++ b/extensions/browser/src/browser-tool.actions.ts
@@ -404,6 +404,31 @@ export async function executeSnapshotAction(params: {
  }
  params.onTabActivity?.(readStringValue(snapshot.targetId) ?? targetId);
  if (snapshot.format === "ai") {
+    const dialogStateFields = {
+      ...(snapshot.blockedByDialog ? { blockedByDialog: true } : {}),
+      ...(snapshot.browserState !== undefined ? { browserState: snapshot.browserState } : {}),
+    };
+    if (snapshot.blockedByDialog) {
+      const wrapped = wrapBrowserExternalJson({
+        kind: "snapshot",
+        payload: {
+          format: snapshot.format,
+          targetId: snapshot.targetId,
+          url: snapshot.url,
+          ...dialogStateFields,
+        },
+      });
+      return {
+        content: [{ type: "text" as const, text: wrapped.wrappedText }],
+        details: {
+          ...wrapped.safeDetails,
+          format: snapshot.format,
+          targetId: snapshot.targetId,
+          url: snapshot.url,
+          ...dialogStateFields,
+        },
+      };
+    }
    const extractedText = snapshot.snapshot ?? "";
    const wrappedSnapshot = wrapExternalContent(extractedText, {
      source: "browser",
@@ -423,6 +448,7 @@ export async function executeSnapshotAction(params: {
      imagePath: snapshot.imagePath,
      imageType: snapshot.imageType,
      refsFallback,
+      ...dialogStateFields,
      externalContent: {
        untrusted: true,
        source: "browser",
@@ -457,6 +483,8 @@ export async function executeSnapshotAction(params: {
        targetId: snapshot.targetId,
        url: snapshot.url,
        nodeCount: snapshot.nodes.length,
+        ...(snapshot.blockedByDialog ? { blockedByDialog: true } : {}),
+        ...(snapshot.browserState !== undefined ? { browserState: snapshot.browserState } : {}),
        externalContent: {
          untrusted: true,
          source: "browser",
--- a/extensions/browser/src/browser-tool.schema.ts
+++ b/extensions/browser/src/browser-tool.schema.ts
@@ -118,6 +118,7 @@ export const BrowserToolSchema = Type.Object({
  paths: Type.Optional(Type.Array(Type.String())),
  inputRef: Type.Optional(Type.String()),
  timeoutMs: Type.Optional(Type.Number()),
+  dialogId: Type.Optional(Type.String()),
  accept: Type.Optional(Type.Boolean()),
  promptText: Type.Optional(Type.String()),
  // Legacy flattened act params (preferred: request={...})
--- a/extensions/browser/src/browser-tool.test.ts
+++ b/extensions/browser/src/browser-tool.test.ts
@@ -1230,6 +1230,35 @@ describe("browser tool external content wrapping", () => {
    expect(details.nodeCount).toBe(1);
  });

+  it("preserves pending dialog state in ai snapshot results", async () => {
+    browserClientMocks.browserSnapshot.mockResolvedValueOnce({
+      ok: true,
+      format: "ai",
+      targetId: "t1",
+      url: "https://example.com",
+      snapshot: "",
+      blockedByDialog: true,
+      browserState: {
+        dialogs: {
+          pending: [{ id: "d1", type: "confirm", message: "Continue?" }],
+          recent: [],
+        },
+      },
+    });
+
+    const tool = createBrowserTool();
+    const result = await tool.execute?.("call-1", { action: "snapshot", snapshotFormat: "ai" });
+    const text = firstResultText(result);
+    expect(text).toContain('"blockedByDialog": true');
+    expect(text).toContain('"id": "d1"');
+    const details = externalContentDetails(result, "snapshot") as {
+      blockedByDialog?: unknown;
+      browserState?: { dialogs?: { pending?: Array<{ id?: string }> } };
+    };
+    expect(details.blockedByDialog).toBe(true);
+    expect(details.browserState?.dialogs?.pending?.[0]?.id).toBe("d1");
+  });
+
  it("wraps tabs output as external content", async () => {
    browserClientMocks.browserTabs.mockResolvedValueOnce([
      {
--- a/extensions/browser/src/browser-tool.ts
+++ b/extensions/browser/src/browser-tool.ts
@@ -867,6 +867,7 @@ export function createBrowserTool(opts?: {
        case "dialog": {
          const accept = Boolean(params.accept);
          const promptText = readStringValue(params.promptText);
+          const dialogId = readStringValue(params.dialogId);
          const { targetId, timeoutMs } = readOptionalTargetAndTimeout(params);
          if (proxyRequest) {
            const result = await proxyRequest({
@@ -876,6 +877,7 @@ export function createBrowserTool(opts?: {
              body: {
                accept,
                promptText,
+                dialogId,
                targetId,
                timeoutMs,
              },
@@ -885,6 +887,7 @@ export function createBrowserTool(opts?: {
          const result = await browserToolDeps.browserArmDialog(baseUrl, {
            accept,
            promptText,
+            dialogId,
            targetId,
            timeoutMs,
            profile,
--- a/extensions/browser/src/browser/client-actions-core.ts
+++ b/extensions/browser/src/browser/client-actions-core.ts
@@ -19,6 +19,8 @@ type BrowserActResponse = {
  url?: string;
  result?: unknown;
  results?: Array<{ ok: boolean; error?: string }>;
+  blockedByDialog?: boolean;
+  browserState?: unknown;
 };

 const BROWSER_ACT_REQUEST_TIMEOUT_SLACK_MS = 5_000;
@@ -66,6 +68,7 @@ export async function browserArmDialog(
  opts: {
    accept: boolean;
    promptText?: string;
+    dialogId?: string;
    targetId?: string;
    timeoutMs?: number;
    profile?: string;
@@ -78,6 +81,7 @@ export async function browserArmDialog(
    body: JSON.stringify({
      accept: opts.accept,
      promptText: opts.promptText,
+      dialogId: opts.dialogId,
      targetId: opts.targetId,
      timeoutMs: opts.timeoutMs,
    }),
--- a/extensions/browser/src/browser/client.ts
+++ b/extensions/browser/src/browser/client.ts
@@ -44,6 +44,8 @@ export type SnapshotResult =
      targetId: string;
      url: string;
      nodes: SnapshotAriaNode[];
+      blockedByDialog?: boolean;
+      browserState?: unknown;
    }
  | {
      ok: true;
@@ -64,6 +66,8 @@ export type SnapshotResult =
      labelsSkipped?: number;
      imagePath?: string;
      imageType?: "png" | "jpeg";
+      blockedByDialog?: boolean;
+      browserState?: unknown;
    };

 export async function browserStatus(
--- a/extensions/browser/src/browser/pw-ai.ts
+++ b/extensions/browser/src/browser/pw-ai.ts
@@ -10,9 +10,16 @@ export {
  ensurePageState,
  forceDisconnectPlaywrightForTarget,
  focusPageByTargetIdViaPlaywright,
+  createObservedDialogAbortSignalForPage,
+  getObservedBrowserStateForPage,
+  getObservedBrowserStateViaPlaywright,
  getPageForTargetId,
+  isBrowserObservedDialogBlockedError,
  listPagesViaPlaywright,
+  markObservedDialogsHandledRemotelyForPage,
  refLocator,
+  respondToObservedDialogOnPage,
+  respondToObservedDialogViaPlaywright,
 } from "./pw-session.js";

 export {
--- a/extensions/browser/src/browser/pw-session.dialogs.test.ts
+++ b/extensions/browser/src/browser/pw-session.dialogs.test.ts
@@ -0,0 +1,139 @@
+import type { Dialog, Page } from "playwright-core";
+import { afterEach, describe, expect, it, vi } from "vitest";
+import {
+  armObservedDialogResponseOnPage,
+  createObservedDialogAbortSignalForPage,
+  ensurePageState,
+  getObservedBrowserStateForPage,
+  isBrowserObservedDialogBlockedError,
+  markObservedDialogsHandledRemotelyForPage,
+  respondToObservedDialogOnPage,
+} from "./pw-session.js";
+
+type Handler = (arg: unknown) => void;
+
+function createPageHarness() {
+  const handlers = new Map<string, Handler[]>();
+  const page = {
+    on: (event: string, handler: Handler) => {
+      handlers.set(event, [...(handlers.get(event) ?? []), handler]);
+      return page;
+    },
+  };
+  return {
+    page: page as unknown as Page,
+    emit: (event: string, arg: unknown) => {
+      for (const handler of handlers.get(event) ?? []) {
+        handler(arg);
+      }
+    },
+  };
+}
+
+function createDialog(
+  overrides: Partial<{
+    type: string;
+    message: string;
+    defaultValue: string;
+  }> = {},
+) {
+  return {
+    type: vi.fn(() => overrides.type ?? "confirm"),
+    message: vi.fn(() => overrides.message ?? "Continue?"),
+    defaultValue: vi.fn(() => overrides.defaultValue ?? ""),
+    accept: vi.fn(async (_promptText?: string) => {}),
+    dismiss: vi.fn(async () => {}),
+  } as unknown as Dialog & {
+    accept: ReturnType<typeof vi.fn>;
+    dismiss: ReturnType<typeof vi.fn>;
+  };
+}
+
+describe("observed browser dialogs", () => {
+  afterEach(() => {
+    vi.useRealTimers();
+  });
+
+  it("surfaces pending dialogs and lets callers respond by id", async () => {
+    const { page, emit } = createPageHarness();
+    ensurePageState(page);
+    const dialog = createDialog({ message: "Ship it?" });
+
+    emit("dialog", dialog);
+
+    expect(getObservedBrowserStateForPage(page).dialogs.pending).toMatchObject([
+      { id: "d1", type: "confirm", message: "Ship it?" },
+    ]);
+
+    const closed = await respondToObservedDialogOnPage({
+      page,
+      dialogId: "d1",
+      accept: true,
+      promptText: "yes",
+    });
+
+    expect(dialog.accept).toHaveBeenCalledWith("yes");
+    expect(closed.closedBy).toBe("agent");
+    expect(getObservedBrowserStateForPage(page).dialogs.pending).toEqual([]);
+    expect(getObservedBrowserStateForPage(page).dialogs.recent).toMatchObject([
+      { id: "d1", closedBy: "agent" },
+    ]);
+  });
+
+  it("keeps arm-next-dialog behavior through the observed dialog path", async () => {
+    const { page, emit } = createPageHarness();
+    ensurePageState(page);
+    const dialog = createDialog({ type: "alert", message: "Heads up" });
+    const observed = createObservedDialogAbortSignalForPage({ page });
+
+    armObservedDialogResponseOnPage({ page, accept: false, timeoutMs: 1000 });
+    emit("dialog", dialog);
+    await Promise.resolve();
+
+    expect(observed.signal.aborted).toBe(false);
+    expect(dialog.dismiss).toHaveBeenCalledOnce();
+    expect(getObservedBrowserStateForPage(page).dialogs.pending).toEqual([]);
+    expect(getObservedBrowserStateForPage(page).dialogs.recent).toMatchObject([
+      { id: "d1", type: "alert", closedBy: "armed" },
+    ]);
+    observed.cleanup();
+  });
+
+  it("aborts in-flight actions while keeping unarmed dialogs pending", async () => {
+    const { page, emit } = createPageHarness();
+    ensurePageState(page);
+    const dialog = createDialog({ type: "alert", message: "Heads up" });
+    const observed = createObservedDialogAbortSignalForPage({ page });
+
+    emit("dialog", dialog);
+
+    expect(observed.signal.aborted).toBe(true);
+    expect(isBrowserObservedDialogBlockedError(observed.signal.reason)).toBe(true);
+    expect(getObservedBrowserStateForPage(page).dialogs.pending).toMatchObject([
+      { id: "d1", type: "alert", message: "Heads up" },
+    ]);
+
+    expect(dialog.dismiss).not.toHaveBeenCalled();
+    await respondToObservedDialogOnPage({ page, dialogId: "d1", accept: false });
+    observed.cleanup();
+
+    expect(dialog.dismiss).toHaveBeenCalledOnce();
+    expect(getObservedBrowserStateForPage(page).dialogs.pending).toEqual([]);
+    expect(getObservedBrowserStateForPage(page).dialogs.recent).toMatchObject([
+      { id: "d1", type: "alert", closedBy: "agent" },
+    ]);
+  });
+
+  it("moves remotely handled pending dialogs into recent state", () => {
+    const { page, emit } = createPageHarness();
+    ensurePageState(page);
+    emit("dialog", createDialog({ type: "confirm", message: "Continue?" }));
+
+    const state = markObservedDialogsHandledRemotelyForPage(page);
+
+    expect(state.dialogs.pending).toEqual([]);
+    expect(state.dialogs.recent).toMatchObject([
+      { id: "d1", type: "confirm", message: "Continue?", closedBy: "remote" },
+    ]);
+  });
+});
--- a/extensions/browser/src/browser/pw-session.ts
+++ b/extensions/browser/src/browser/pw-session.ts
@@ -5,6 +5,7 @@ import type {
  Browser,
  BrowserContext,
  ConsoleMessage,
+  Dialog,
  Page,
  Request,
  Response,
@@ -67,6 +68,52 @@ export type BrowserNetworkRequest = {
  failureText?: string;
 };

+export type BrowserObservedDialogRecord = {
+  id: string;
+  type: string;
+  message: string;
+  defaultValue?: string;
+  openedAt: string;
+  closedAt?: string;
+  closedBy?: "agent" | "armed" | "auto" | "timeout" | "remote";
+};
+
+export type BrowserObservedDialogState = {
+  pending: BrowserObservedDialogRecord[];
+  recent: BrowserObservedDialogRecord[];
+};
+
+export type BrowserObservedState = {
+  dialogs: BrowserObservedDialogState;
+};
+
+export class BrowserObservedDialogBlockedError extends Error {
+  readonly browserState: BrowserObservedState;
+
+  constructor(browserState: BrowserObservedState) {
+    super("Browser action blocked by a modal dialog.");
+    this.name = "BrowserObservedDialogBlockedError";
+    this.browserState = browserState;
+  }
+}
+
+export function isBrowserObservedDialogBlockedError(
+  err: unknown,
+): err is BrowserObservedDialogBlockedError {
+  return err instanceof BrowserObservedDialogBlockedError;
+}
+
+type PendingObservedDialog = BrowserObservedDialogRecord & {
+  dialog: Dialog;
+};
+
+type ArmedDialogResponse = {
+  accept: boolean;
+  promptText?: string;
+  expiresAt: number;
+  timer?: ReturnType<typeof setTimeout>;
+};
+
 type TargetInfoResponse = {
  targetInfo?: {
    targetId?: string;
@@ -86,9 +133,13 @@ type PageState = {
  requestIds: WeakMap<Request, string>;
  nextRequestId: number;
  armIdUpload: number;
-  armIdDialog: number;
  armIdDownload: number;
  downloadWaiterDepth: number;
+  nextObservedDialogId: number;
+  pendingDialogs: PendingObservedDialog[];
+  recentDialogs: BrowserObservedDialogRecord[];
+  armedDialogResponse?: ArmedDialogResponse;
+  dialogAbortControllers: Set<AbortController>;
  /**
   * Role-based refs from the last role snapshot (e.g. e1/e2).
   * Mode "role" refs are generated from ariaSnapshot and resolved via getByRole.
@@ -123,6 +174,8 @@ const MAX_ROLE_REFS_CACHE = 50;
 const MAX_CONSOLE_MESSAGES = 500;
 const MAX_PAGE_ERRORS = 200;
 const MAX_NETWORK_REQUESTS = 500;
+const MAX_RECENT_DIALOGS = 20;
+const OBSERVED_DIALOG_TIMEOUT_MS = 120_000;

 const cachedByCdpUrl = new Map<string, ConnectedBrowser>();
 const connectingByCdpUrl = new Map<string, Promise<ConnectedBrowser>>();
@@ -183,6 +236,135 @@ function findNetworkRequestById(state: PageState, id: string): BrowserNetworkReq
  return undefined;
 }

+function appendRecentDialog(state: PageState, record: BrowserObservedDialogRecord): void {
+  state.recentDialogs.push(record);
+  while (state.recentDialogs.length > MAX_RECENT_DIALOGS) {
+    state.recentDialogs.shift();
+  }
+}
+
+function serializeDialogRecord(dialog: BrowserObservedDialogRecord): BrowserObservedDialogRecord {
+  return {
+    id: dialog.id,
+    type: dialog.type,
+    message: dialog.message,
+    ...(dialog.defaultValue !== undefined ? { defaultValue: dialog.defaultValue } : {}),
+    openedAt: dialog.openedAt,
+    ...(dialog.closedAt !== undefined ? { closedAt: dialog.closedAt } : {}),
+    ...(dialog.closedBy !== undefined ? { closedBy: dialog.closedBy } : {}),
+  };
+}
+
+function serializePendingDialog(dialog: PendingObservedDialog): BrowserObservedDialogRecord {
+  return serializeDialogRecord(dialog);
+}
+
+function serializeObservedBrowserState(state: PageState): BrowserObservedState {
+  return {
+    dialogs: {
+      pending: state.pendingDialogs.map(serializePendingDialog),
+      recent: state.recentDialogs.map(serializeDialogRecord),
+    },
+  };
+}
+
+function clearArmedDialogResponse(state: PageState): void {
+  if (state.armedDialogResponse?.timer) {
+    clearTimeout(state.armedDialogResponse.timer);
+  }
+  state.armedDialogResponse = undefined;
+}
+
+function abortActionsBlockedByDialog(state: PageState): void {
+  if (state.dialogAbortControllers.size === 0) {
+    return;
+  }
+  const err = new BrowserObservedDialogBlockedError(serializeObservedBrowserState(state));
+  for (const controller of state.dialogAbortControllers) {
+    if (!controller.signal.aborted) {
+      controller.abort(err);
+    }
+  }
+  state.dialogAbortControllers.clear();
+}
+
+function isNoDialogShowingError(err: unknown): boolean {
+  const message = err instanceof Error ? err.message : String(err);
+  return message.toLowerCase().includes("no dialog is showing");
+}
+
+async function settleObservedDialog(params: {
+  state: PageState;
+  pending: PendingObservedDialog;
+  accept: boolean;
+  promptText?: string;
+  closedBy: NonNullable<BrowserObservedDialogRecord["closedBy"]>;
+}): Promise<BrowserObservedDialogRecord> {
+  const { state, pending } = params;
+  state.pendingDialogs = state.pendingDialogs.filter((dialog) => dialog.id !== pending.id);
+
+  let closedBy = params.closedBy;
+  try {
+    if (params.accept) {
+      await pending.dialog.accept(params.promptText);
+    } else {
+      await pending.dialog.dismiss();
+    }
+  } catch (err) {
+    if (!isNoDialogShowingError(err)) {
+      if (params.closedBy === "agent") {
+        state.pendingDialogs.push(pending);
+      }
+      throw err;
+    }
+    closedBy = "remote";
+  }
+
+  const record: BrowserObservedDialogRecord = {
+    id: pending.id,
+    type: pending.type,
+    message: pending.message,
+    ...(pending.defaultValue !== undefined ? { defaultValue: pending.defaultValue } : {}),
+    openedAt: pending.openedAt,
+    closedAt: new Date().toISOString(),
+    closedBy,
+  };
+  appendRecentDialog(state, record);
+  return record;
+}
+
+function observeDialog(pageState: PageState, dialog: Dialog): void {
+  pageState.nextObservedDialogId += 1;
+  const type = dialog.type();
+  const defaultValue = dialog.defaultValue();
+  const pending: PendingObservedDialog = {
+    id: `d${pageState.nextObservedDialogId}`,
+    type,
+    message: dialog.message(),
+    openedAt: new Date().toISOString(),
+    dialog,
+    ...(type === "prompt" ? { defaultValue } : {}),
+  };
+  pageState.pendingDialogs.push(pending);
+
+  const armed = pageState.armedDialogResponse;
+  if (armed && armed.expiresAt >= Date.now()) {
+    clearArmedDialogResponse(pageState);
+    void settleObservedDialog({
+      state: pageState,
+      pending,
+      accept: armed.accept,
+      ...(armed.promptText !== undefined ? { promptText: armed.promptText } : {}),
+      closedBy: "armed",
+    }).catch(() => {});
+    return;
+  }
+  if (armed) {
+    clearArmedDialogResponse(pageState);
+  }
+  abortActionsBlockedByDialog(pageState);
+}
+
 function targetKey(cdpUrl: string, targetId: string) {
  return `${normalizeCdpUrl(cdpUrl)}::${targetId}`;
 }
@@ -380,9 +562,12 @@ export function ensurePageState(page: Page): PageState {
    requestIds: new WeakMap(),
    nextRequestId: 0,
    armIdUpload: 0,
-    armIdDialog: 0,
    armIdDownload: 0,
    downloadWaiterDepth: 0,
+    nextObservedDialogId: 0,
+    pendingDialogs: [],
+    recentDialogs: [],
+    dialogAbortControllers: new Set(),
  };
  pageStates.set(page, state);

@@ -451,6 +636,9 @@ export function ensurePageState(page: Page): PageState {
      rec.failureText = req.failure()?.errorText;
      rec.ok = false;
    });
+    page.on("dialog", (dialog: Dialog) => {
+      observeDialog(state, dialog);
+    });
    page.on(
      "download",
      (download: {
@@ -481,6 +669,14 @@ export function ensurePageState(page: Page): PageState {
      },
    );
    page.on("close", () => {
+      clearArmedDialogResponse(state);
+      for (const controller of state.dialogAbortControllers) {
+        if (!controller.signal.aborted) {
+          controller.abort(new Error("Page closed before browser action completed."));
+        }
+      }
+      state.dialogAbortControllers.clear();
+      state.pendingDialogs = [];
      pageStates.delete(page);
      observedPages.delete(page);
    });
@@ -489,6 +685,158 @@ export function ensurePageState(page: Page): PageState {
  return state;
 }

+export function getObservedBrowserStateForPage(page: Page): BrowserObservedState {
+  const state = ensurePageState(page);
+  return serializeObservedBrowserState(state);
+}
+
+export async function getObservedBrowserStateViaPlaywright(opts: {
+  cdpUrl: string;
+  targetId?: string;
+  ssrfPolicy?: SsrFPolicy;
+}): Promise<BrowserObservedState> {
+  const page = await getPageForTargetId(opts);
+  return getObservedBrowserStateForPage(page);
+}
+
+function resolvePendingDialogForResponse(params: {
+  state: PageState;
+  dialogId?: string;
+}): PendingObservedDialog {
+  const dialogId = normalizeOptionalString(params.dialogId);
+  if (dialogId) {
+    const found = params.state.pendingDialogs.find((dialog) => dialog.id === dialogId);
+    if (found) {
+      return found;
+    }
+    throw new Error(`Dialog "${dialogId}" is not pending.`);
+  }
+  if (params.state.pendingDialogs.length === 1) {
+    return params.state.pendingDialogs[0];
+  }
+  if (params.state.pendingDialogs.length > 1) {
+    throw new Error("Multiple dialogs are pending; pass dialogId.");
+  }
+  throw new Error("No dialog is pending.");
+}
+
+export async function respondToObservedDialogOnPage(opts: {
+  page: Page;
+  dialogId?: string;
+  accept: boolean;
+  promptText?: string;
+  closedBy?: "agent" | "armed";
+}): Promise<BrowserObservedDialogRecord> {
+  const state = ensurePageState(opts.page);
+  const pending = resolvePendingDialogForResponse({
+    state,
+    ...(opts.dialogId !== undefined ? { dialogId: opts.dialogId } : {}),
+  });
+  return await settleObservedDialog({
+    state,
+    pending,
+    accept: opts.accept,
+    ...(opts.promptText !== undefined ? { promptText: opts.promptText } : {}),
+    closedBy: opts.closedBy ?? "agent",
+  });
+}
+
+export async function respondToObservedDialogViaPlaywright(opts: {
+  cdpUrl: string;
+  targetId?: string;
+  dialogId?: string;
+  accept: boolean;
+  promptText?: string;
+  ssrfPolicy?: SsrFPolicy;
+}): Promise<BrowserObservedDialogRecord> {
+  const page = await getPageForTargetId(opts);
+  return await respondToObservedDialogOnPage({
+    page,
+    accept: opts.accept,
+    ...(opts.dialogId !== undefined ? { dialogId: opts.dialogId } : {}),
+    ...(opts.promptText !== undefined ? { promptText: opts.promptText } : {}),
+  });
+}
+
+export function markObservedDialogsHandledRemotelyForPage(page: Page): BrowserObservedState {
+  const state = ensurePageState(page);
+  const pending = state.pendingDialogs.splice(0);
+  const closedAt = new Date().toISOString();
+  for (const dialog of pending) {
+    appendRecentDialog(state, {
+      id: dialog.id,
+      type: dialog.type,
+      message: dialog.message,
+      ...(dialog.defaultValue !== undefined ? { defaultValue: dialog.defaultValue } : {}),
+      openedAt: dialog.openedAt,
+      closedAt,
+      closedBy: "remote",
+    });
+  }
+  return serializeObservedBrowserState(state);
+}
+
+export function armObservedDialogResponseOnPage(opts: {
+  page: Page;
+  accept: boolean;
+  promptText?: string;
+  timeoutMs?: number;
+}): void {
+  const state = ensurePageState(opts.page);
+  clearArmedDialogResponse(state);
+  const timeoutMs = Math.max(1, Math.floor(opts.timeoutMs ?? OBSERVED_DIALOG_TIMEOUT_MS));
+  const response: ArmedDialogResponse = {
+    accept: opts.accept,
+    expiresAt: Date.now() + timeoutMs,
+    ...(opts.promptText !== undefined ? { promptText: opts.promptText } : {}),
+  };
+  response.timer = setTimeout(() => {
+    if (state.armedDialogResponse === response) {
+      state.armedDialogResponse = undefined;
+    }
+  }, timeoutMs);
+  state.armedDialogResponse = response;
+}
+
+export function createObservedDialogAbortSignalForPage(opts: {
+  page: Page;
+  parentSignal?: AbortSignal;
+}): { signal: AbortSignal; cleanup: () => void } {
+  const state = ensurePageState(opts.page);
+  const controller = new AbortController();
+  const abortForCurrentDialog = () => {
+    if (!controller.signal.aborted) {
+      controller.abort(new BrowserObservedDialogBlockedError(serializeObservedBrowserState(state)));
+    }
+  };
+  const abortForParent = () => {
+    if (!controller.signal.aborted) {
+      controller.abort(opts.parentSignal?.reason ?? new Error("aborted"));
+    }
+  };
+
+  if (state.pendingDialogs.length > 0) {
+    abortForCurrentDialog();
+  } else {
+    state.dialogAbortControllers.add(controller);
+  }
+  if (opts.parentSignal) {
+    if (opts.parentSignal.aborted) {
+      abortForParent();
+    } else {
+      opts.parentSignal.addEventListener("abort", abortForParent, { once: true });
+    }
+  }
+
+  return {
+    signal: controller.signal,
+    cleanup: () => {
+      state.dialogAbortControllers.delete(controller);
+      opts.parentSignal?.removeEventListener("abort", abortForParent);
+    },
+  };
+}
+
 function observeContext(context: BrowserContext) {
  if (observedContexts.has(context)) {
    return;
--- a/extensions/browser/src/browser/pw-tools-core.browser-ssrf-guard.test.ts
+++ b/extensions/browser/src/browser/pw-tools-core.browser-ssrf-guard.test.ts
@@ -17,7 +17,9 @@ const sessionMocks = vi.hoisted(() => ({
    return pageState.page;
  }),
  gotoPageWithNavigationGuard: vi.fn(async () => null),
+  isBrowserObservedDialogBlockedError: vi.fn(() => false),
  isPolicyDenyNavigationError: vi.fn(() => false),
+  markObservedDialogsHandledRemotelyForPage: vi.fn(() => ({})),
  refLocator: vi.fn(() => {
    if (!pageState.locator) {
      throw new Error("missing locator");
--- a/extensions/browser/src/browser/pw-tools-core.downloads.ts
+++ b/extensions/browser/src/browser/pw-tools-core.downloads.ts
@@ -5,13 +5,14 @@ import { resolvePreferredOpenClawTmpDir } from "../infra/tmp-openclaw-dir.js";
 import { writeExternalFileWithinOutputRoot } from "./output-files.js";
 import { DEFAULT_UPLOAD_DIR, resolveStrictExistingPathsWithinRoot } from "./paths.js";
 import {
+  armObservedDialogResponseOnPage,
  ensurePageState,
  getPageForTargetId,
  refLocator,
+  respondToObservedDialogOnPage,
  restoreRoleRefsForTarget,
 } from "./pw-session.js";
 import {
-  bumpDialogArmId,
  bumpDownloadArmId,
  bumpUploadArmId,
  normalizeTimeoutMs,
@@ -191,32 +192,34 @@ export async function armFileUploadViaPlaywright(opts: {
 export async function armDialogViaPlaywright(opts: {
  cdpUrl: string;
  targetId?: string;
+  dialogId?: string;
  accept: boolean;
  promptText?: string;
  timeoutMs?: number;
 }): Promise<void> {
  const page = await getPageForTargetId(opts);
-  const state = ensurePageState(page);
  const timeout = normalizeTimeoutMs(opts.timeoutMs, 120_000);
-
-  state.armIdDialog = bumpDialogArmId();
-  const armId = state.armIdDialog;
-
-  void page
-    .waitForEvent("dialog", { timeout })
-    .then(async (dialog) => {
-      if (state.armIdDialog !== armId) {
-        return;
-      }
-      if (opts.accept) {
-        await dialog.accept(opts.promptText);
-      } else {
-        await dialog.dismiss();
-      }
-    })
-    .catch(() => {
-      // Ignore timeouts; the dialog may never appear.
+  try {
+    await respondToObservedDialogOnPage({
+      page,
+      accept: opts.accept,
+      closedBy: "agent",
+      ...(opts.dialogId !== undefined ? { dialogId: opts.dialogId } : {}),
+      ...(opts.promptText !== undefined ? { promptText: opts.promptText } : {}),
    });
+    return;
+  } catch (err) {
+    if (opts.dialogId || !(err instanceof Error && err.message.includes("No dialog is pending"))) {
+      throw err;
+    }
+  }
+
+  armObservedDialogResponseOnPage({
+    page,
+    accept: opts.accept,
+    timeoutMs: timeout,
+    ...(opts.promptText !== undefined ? { promptText: opts.promptText } : {}),
+  });
 }

 export async function waitForDownloadViaPlaywright(opts: {
--- a/extensions/browser/src/browser/pw-tools-core.interactions.batch.test.ts
+++ b/extensions/browser/src/browser/pw-tools-core.interactions.batch.test.ts
@@ -14,6 +14,8 @@ const getPageForTargetId = vi.fn(async () => {
 const ensurePageState = vi.fn(() => {});
 const assertPageNavigationCompletedSafely = vi.fn(async () => {});
 const forceDisconnectPlaywrightForTarget = vi.fn(async () => {});
+const isBrowserObservedDialogBlockedError = vi.fn(() => false);
+const markObservedDialogsHandledRemotelyForPage = vi.fn(() => ({}));
 const refLocator = vi.fn(() => {
  throw new Error("test: refLocator should not be called");
 });
@@ -27,6 +29,8 @@ vi.mock("./pw-session.js", () => ({
  ensurePageState,
  forceDisconnectPlaywrightForTarget,
  getPageForTargetId,
+  isBrowserObservedDialogBlockedError,
+  markObservedDialogsHandledRemotelyForPage,
  refLocator,
  restoreRoleRefsForTarget,
 }));
--- a/extensions/browser/src/browser/pw-tools-core.interactions.evaluate.abort.test.ts
+++ b/extensions/browser/src/browser/pw-tools-core.interactions.evaluate.abort.test.ts
@@ -13,6 +13,10 @@ const getPageForTargetId = vi.fn(async () => {
 const ensurePageState = vi.fn(() => {});
 const assertPageNavigationCompletedSafely = vi.fn(async () => {});
 const restoreRoleRefsForTarget = vi.fn(() => {});
+const isBrowserObservedDialogBlockedError = vi.fn(
+  (err: unknown) => err instanceof Error && err.name === "BrowserObservedDialogBlockedError",
+);
+const markObservedDialogsHandledRemotelyForPage = vi.fn(() => ({}));
 const refLocator = vi.fn(() => {
  if (!locator) {
    throw new Error("test: locator not set");
@@ -26,6 +30,8 @@ vi.mock("./pw-session.js", () => {
    ensurePageState,
    forceDisconnectPlaywrightForTarget,
    getPageForTargetId,
+    isBrowserObservedDialogBlockedError,
+    markObservedDialogsHandledRemotelyForPage,
    refLocator,
    restoreRoleRefsForTarget,
  };
@@ -93,4 +99,37 @@ describe("evaluateViaPlaywright (abort)", () => {
    await expect(p).rejects.toThrow("aborted by test");
    expect(forceDisconnectPlaywrightForTarget).toHaveBeenCalled();
  });
+
+  it("does not disconnect when evaluate is blocked by an observed dialog", async () => {
+    const ctrl = new AbortController();
+    const pending = createPendingEval();
+    let resolveEval: (value: unknown) => void = () => {};
+    const pendingPromise = new Promise((resolve) => {
+      resolveEval = resolve;
+    });
+    page = {
+      evaluate: vi.fn(() => {
+        pending.resolveEvalCalled();
+        return pendingPromise;
+      }),
+      url: vi.fn(() => "https://example.com/current"),
+    };
+
+    const p = evaluateViaPlaywright({
+      cdpUrl: "http://127.0.0.1:9222",
+      fn: "() => alert('x')",
+      signal: ctrl.signal,
+    });
+
+    await pending.evalCalledPromise;
+    const err = new Error("blocked by dialog");
+    err.name = "BrowserObservedDialogBlockedError";
+    ctrl.abort(err);
+
+    await expect(p).rejects.toThrow("blocked by dialog");
+    expect(forceDisconnectPlaywrightForTarget).not.toHaveBeenCalled();
+    resolveEval(true);
+    await Promise.resolve();
+    expect(markObservedDialogsHandledRemotelyForPage).toHaveBeenCalled();
+  });
 });
--- a/extensions/browser/src/browser/pw-tools-core.interactions.set-input-files.test.ts
+++ b/extensions/browser/src/browser/pw-tools-core.interactions.set-input-files.test.ts
@@ -11,6 +11,8 @@ const getPageForTargetId = vi.fn(async () => {
 });
 const ensurePageState = vi.fn(() => ({}));
 const restoreRoleRefsForTarget = vi.fn(() => {});
+const isBrowserObservedDialogBlockedError = vi.fn(() => false);
+const markObservedDialogsHandledRemotelyForPage = vi.fn(() => ({}));
 const refLocator = vi.fn(() => {
  if (!locator) {
    throw new Error("test: locator not set");
@@ -27,6 +29,8 @@ vi.mock("./pw-session.js", () => {
    ensurePageState,
    forceDisconnectPlaywrightForTarget,
    getPageForTargetId,
+    isBrowserObservedDialogBlockedError,
+    markObservedDialogsHandledRemotelyForPage,
    refLocator,
    restoreRoleRefsForTarget,
  };
--- a/extensions/browser/src/browser/pw-tools-core.interactions.ts
+++ b/extensions/browser/src/browser/pw-tools-core.interactions.ts
@@ -19,9 +19,12 @@ import {
 import { DEFAULT_UPLOAD_DIR, resolveStrictExistingPathsWithinRoot } from "./paths.js";
 import {
  assertPageNavigationCompletedSafely,
+  createObservedDialogAbortSignalForPage,
  ensurePageState,
  forceDisconnectPlaywrightForTarget,
  getPageForTargetId,
+  isBrowserObservedDialogBlockedError,
+  markObservedDialogsHandledRemotelyForPage,
  refLocator,
  restoreRoleRefsForTarget,
 } from "./pw-session.js";
@@ -66,6 +69,16 @@ async function getRestoredPageForTarget(opts: TargetOpts) {
  return page;
 }

+function toFriendlyInteractionError(err: unknown, label: string): Error {
+  return isBrowserObservedDialogBlockedError(err) ? err : toAIFriendlyError(err, label);
+}
+
+function reconcileRemoteDialogAfterActionSettled(page: Page, signal?: AbortSignal): void {
+  if (isBrowserObservedDialogBlockedError(signal?.reason)) {
+    markObservedDialogsHandledRemotelyForPage(page);
+  }
+}
+
 const resolveInteractionTimeoutMs = resolveActInteractionTimeoutMs;

 // Returns true only when the URL change indicates a cross-document navigation
@@ -426,6 +439,7 @@ async function assertInteractionNavigationCompletedSafely<T>(opts: {
 async function awaitActionWithAbort<T>(
  actionPromise: Promise<T>,
  abortPromise?: Promise<never>,
+  onActionResolvedAfterAbort?: () => void,
 ): Promise<T> {
  if (!abortPromise) {
    return await actionPromise;
@@ -434,7 +448,10 @@ async function awaitActionWithAbort<T>(
    return await Promise.race([actionPromise, abortPromise]);
  } catch (err) {
    // If abort wins the race, the action may reject later; avoid unhandled rejections.
-    void actionPromise.catch(() => {});
+    void actionPromise.then(
+      () => onActionResolvedAfterAbort?.(),
+      () => {},
+    );
    throw err;
  }
 }
@@ -448,7 +465,7 @@ function createAbortPromise(signal?: AbortSignal): {

 function createAbortPromiseWithListener(
  signal?: AbortSignal,
-  onAbort?: () => void,
+  onAbort?: (reason: unknown) => void,
 ): {
  abortPromise?: Promise<never>;
  cleanup: () => void;
@@ -459,12 +476,12 @@ function createAbortPromiseWithListener(
  let abortListener: (() => void) | undefined;
  const abortPromise: Promise<never> = signal.aborted
    ? (() => {
-        onAbort?.();
+        onAbort?.(signal.reason);
        return Promise.reject(signal.reason ?? new Error("aborted"));
      })()
    : new Promise((_, reject) => {
        abortListener = () => {
-          onAbort?.();
+          onAbort?.(signal.reason);
          reject(signal.reason ?? new Error("aborted"));
        };
        signal.addEventListener("abort", abortListener, { once: true });
@@ -490,7 +507,7 @@ export async function highlightViaPlaywright(opts: {
  try {
    await refLocator(page, ref).highlight();
  } catch (err) {
-    throw toAIFriendlyError(err, ref);
+    throw toFriendlyInteractionError(err, ref);
  }
 }

@@ -525,6 +542,9 @@ export async function clickViaPlaywright(opts: {
    });
    void abortPromise.catch(() => {});
    const disconnect = () => {
+      if (isBrowserObservedDialogBlockedError(signal.reason)) {
+        return;
+      }
      void forceDisconnectPlaywrightForTarget({
        cdpUrl: opts.cdpUrl,
        targetId: opts.targetId,
@@ -545,6 +565,7 @@ export async function clickViaPlaywright(opts: {
      throw signal.reason ?? new Error("aborted");
    }
  }
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, signal);
  try {
    await assertInteractionNavigationCompletedSafely({
      action: async () => {
@@ -554,7 +575,11 @@ export async function clickViaPlaywright(opts: {
          ACT_MAX_CLICK_DELAY_MS,
        );
        if (delayMs > 0) {
-          await awaitActionWithAbort(locator.hover({ timeout }), abortPromise);
+          await awaitActionWithAbort(
+            locator.hover({ timeout }),
+            abortPromise,
+            reconcileRemoteDialog,
+          );
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
        if (opts.doubleClick) {
@@ -565,6 +590,7 @@ export async function clickViaPlaywright(opts: {
              modifiers: opts.modifiers,
            }),
            abortPromise,
+            reconcileRemoteDialog,
          );
          return;
        }
@@ -575,6 +601,7 @@ export async function clickViaPlaywright(opts: {
            modifiers: opts.modifiers,
          }),
          abortPromise,
+          reconcileRemoteDialog,
        );
      },
      cdpUrl: opts.cdpUrl,
@@ -584,7 +611,7 @@ export async function clickViaPlaywright(opts: {
      targetId: opts.targetId,
    });
  } catch (err) {
-    throw toAIFriendlyError(err, label);
+    throw toFriendlyInteractionError(err, label);
  } finally {
    if (signal && abortListener) {
      signal.removeEventListener("abort", abortListener);
@@ -602,23 +629,30 @@ export async function clickCoordsViaPlaywright(opts: {
  delayMs?: number;
  timeoutMs?: number;
  ssrfPolicy?: SsrFPolicy;
+  signal?: AbortSignal;
 }): Promise<void> {
  const page = await getRestoredPageForTarget(opts);
  const previousUrl = page.url();
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  await assertInteractionNavigationCompletedSafely({
    action: async () => {
-      await page.mouse.click(opts.x, opts.y, {
-        button: opts.button,
-        clickCount: opts.doubleClick ? 2 : 1,
-        delay: resolveBoundedDelayMs(opts.delayMs, "clickCoords delayMs", ACT_MAX_CLICK_DELAY_MS),
-      });
+      await awaitActionWithAbort(
+        page.mouse.click(opts.x, opts.y, {
+          button: opts.button,
+          clickCount: opts.doubleClick ? 2 : 1,
+          delay: resolveBoundedDelayMs(opts.delayMs, "clickCoords delayMs", ACT_MAX_CLICK_DELAY_MS),
+        }),
+        abortPromise,
+        reconcileRemoteDialog,
+      );
    },
    cdpUrl: opts.cdpUrl,
    page,
    previousUrl,
    ssrfPolicy: opts.ssrfPolicy,
    targetId: opts.targetId,
-  });
+  }).finally(cleanup);
 }

 export async function hoverViaPlaywright(opts: {
@@ -627,6 +661,7 @@ export async function hoverViaPlaywright(opts: {
  ref?: string;
  selector?: string;
  timeoutMs?: number;
+  signal?: AbortSignal;
 }): Promise<void> {
  const resolved = requireRefOrSelector(opts.ref, opts.selector);
  const page = await getRestoredPageForTarget(opts);
@@ -634,12 +669,20 @@ export async function hoverViaPlaywright(opts: {
  const locator = resolved.ref
    ? refLocator(page, requireRef(resolved.ref))
    : page.locator(resolved.selector!);
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  try {
-    await locator.hover({
-      timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
-    });
+    await awaitActionWithAbort(
+      locator.hover({
+        timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
+      }),
+      abortPromise,
+      reconcileRemoteDialog,
+    );
  } catch (err) {
-    throw toAIFriendlyError(err, label);
+    throw toFriendlyInteractionError(err, label);
+  } finally {
+    cleanup();
  }
 }

@@ -651,6 +694,7 @@ export async function dragViaPlaywright(opts: {
  endRef?: string;
  endSelector?: string;
  timeoutMs?: number;
+  signal?: AbortSignal;
 }): Promise<void> {
  const resolvedStart = requireRefOrSelector(opts.startRef, opts.startSelector);
  const resolvedEnd = requireRefOrSelector(opts.endRef, opts.endSelector);
@@ -663,12 +707,20 @@ export async function dragViaPlaywright(opts: {
    : page.locator(resolvedEnd.selector!);
  const startLabel = resolvedStart.ref ?? resolvedStart.selector!;
  const endLabel = resolvedEnd.ref ?? resolvedEnd.selector!;
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  try {
-    await startLocator.dragTo(endLocator, {
-      timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
-    });
+    await awaitActionWithAbort(
+      startLocator.dragTo(endLocator, {
+        timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
+      }),
+      abortPromise,
+      reconcileRemoteDialog,
+    );
  } catch (err) {
-    throw toAIFriendlyError(err, `${startLabel} -> ${endLabel}`);
+    throw toFriendlyInteractionError(err, `${startLabel} -> ${endLabel}`);
+  } finally {
+    cleanup();
  }
 }

@@ -680,6 +732,7 @@ export async function selectOptionViaPlaywright(opts: {
  values: string[];
  timeoutMs?: number;
  ssrfPolicy?: SsrFPolicy;
+  signal?: AbortSignal;
 }): Promise<void> {
  const resolved = requireRefOrSelector(opts.ref, opts.selector);
  if (!opts.values?.length) {
@@ -691,12 +744,18 @@ export async function selectOptionViaPlaywright(opts: {
    ? refLocator(page, requireRef(resolved.ref))
    : page.locator(resolved.selector!);
  const previousUrl = page.url();
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  try {
    await assertInteractionNavigationCompletedSafely({
      action: async () => {
-        await locator.selectOption(opts.values, {
-          timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
-        });
+        await awaitActionWithAbort(
+          locator.selectOption(opts.values, {
+            timeout: resolveInteractionTimeoutMs(opts.timeoutMs),
+          }),
+          abortPromise,
+          reconcileRemoteDialog,
+        );
      },
      cdpUrl: opts.cdpUrl,
      page,
@@ -705,7 +764,9 @@ export async function selectOptionViaPlaywright(opts: {
      targetId: opts.targetId,
    });
  } catch (err) {
-    throw toAIFriendlyError(err, label);
+    throw toFriendlyInteractionError(err, label);
+  } finally {
+    cleanup();
  }
 }

@@ -715,6 +776,7 @@ export async function pressKeyViaPlaywright(opts: {
  key: string;
  delayMs?: number;
  ssrfPolicy?: SsrFPolicy;
+  signal?: AbortSignal;
 }): Promise<void> {
  const key = normalizeOptionalString(opts.key) ?? "";
  if (!key) {
@@ -723,18 +785,28 @@ export async function pressKeyViaPlaywright(opts: {
  const page = await getPageForTargetId(opts);
  ensurePageState(page);
  const previousUrl = page.url();
-  await assertInteractionNavigationCompletedSafely({
-    action: async () => {
-      await page.keyboard.press(key, {
-        delay: Math.max(0, Math.floor(opts.delayMs ?? 0)),
-      });
-    },
-    cdpUrl: opts.cdpUrl,
-    page,
-    previousUrl,
-    ssrfPolicy: opts.ssrfPolicy,
-    targetId: opts.targetId,
-  });
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
+  try {
+    await assertInteractionNavigationCompletedSafely({
+      action: async () => {
+        await awaitActionWithAbort(
+          page.keyboard.press(key, {
+            delay: Math.max(0, Math.floor(opts.delayMs ?? 0)),
+          }),
+          abortPromise,
+          reconcileRemoteDialog,
+        );
+      },
+      cdpUrl: opts.cdpUrl,
+      page,
+      previousUrl,
+      ssrfPolicy: opts.ssrfPolicy,
+      targetId: opts.targetId,
+    });
+  } finally {
+    cleanup();
+  }
 }

 export async function typeViaPlaywright(opts: {
@@ -747,6 +819,7 @@ export async function typeViaPlaywright(opts: {
  slowly?: boolean;
  timeoutMs?: number;
  ssrfPolicy?: SsrFPolicy;
+  signal?: AbortSignal;
 }): Promise<void> {
  const resolved = requireRefOrSelector(opts.ref, opts.selector);
  const text = opts.text ?? "";
@@ -756,15 +829,29 @@ export async function typeViaPlaywright(opts: {
    ? refLocator(page, requireRef(resolved.ref))
    : page.locator(resolved.selector!);
  const timeout = resolveInteractionTimeoutMs(opts.timeoutMs);
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  try {
    const previousUrl = page.url();
    if (opts.slowly) {
      await assertInteractionNavigationCompletedSafely({
        action: async () => {
-          await locator.click({ timeout });
-          await locator.type(text, { timeout, delay: 75 });
+          await awaitActionWithAbort(
+            locator.click({ timeout }),
+            abortPromise,
+            reconcileRemoteDialog,
+          );
+          await awaitActionWithAbort(
+            locator.type(text, { timeout, delay: 75 }),
+            abortPromise,
+            reconcileRemoteDialog,
+          );
          if (opts.submit) {
-            await locator.press("Enter", { timeout });
+            await awaitActionWithAbort(
+              locator.press("Enter", { timeout }),
+              abortPromise,
+              reconcileRemoteDialog,
+            );
          }
        },
        cdpUrl: opts.cdpUrl,
@@ -776,9 +863,17 @@ export async function typeViaPlaywright(opts: {
    } else {
      await assertInteractionNavigationCompletedSafely({
        action: async () => {
-          await locator.fill(text, { timeout });
+          await awaitActionWithAbort(
+            locator.fill(text, { timeout }),
+            abortPromise,
+            reconcileRemoteDialog,
+          );
          if (opts.submit) {
-            await locator.press("Enter", { timeout });
+            await awaitActionWithAbort(
+              locator.press("Enter", { timeout }),
+              abortPromise,
+              reconcileRemoteDialog,
+            );
          }
        },
        cdpUrl: opts.cdpUrl,
@@ -789,7 +884,9 @@ export async function typeViaPlaywright(opts: {
      });
    }
  } catch (err) {
-    throw toAIFriendlyError(err, label);
+    throw toFriendlyInteractionError(err, label);
+  } finally {
+    cleanup();
  }
 }

@@ -799,31 +896,60 @@ export async function fillFormViaPlaywright(opts: {
  fields: BrowserFormField[];
  timeoutMs?: number;
  ssrfPolicy?: SsrFPolicy;
+  signal?: AbortSignal;
 }): Promise<void> {
  const page = await getRestoredPageForTarget(opts);
  const timeout = resolveInteractionTimeoutMs(opts.timeoutMs);
-  for (const field of opts.fields) {
-    const ref = field.ref.trim();
-    const type = (field.type || DEFAULT_FILL_FIELD_TYPE).trim() || DEFAULT_FILL_FIELD_TYPE;
-    const rawValue = field.value;
-    const value =
-      typeof rawValue === "string"
-        ? rawValue
-        : typeof rawValue === "number" || typeof rawValue === "boolean"
-          ? String(rawValue)
-          : "";
-    if (!ref) {
-      continue;
-    }
-    const locator = refLocator(page, ref);
-    if (type === "checkbox" || type === "radio") {
-      const checked =
-        rawValue === true || rawValue === 1 || rawValue === "1" || rawValue === "true";
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
+  try {
+    for (const field of opts.fields) {
+      const ref = field.ref.trim();
+      const type = (field.type || DEFAULT_FILL_FIELD_TYPE).trim() || DEFAULT_FILL_FIELD_TYPE;
+      const rawValue = field.value;
+      const value =
+        typeof rawValue === "string"
+          ? rawValue
+          : typeof rawValue === "number" || typeof rawValue === "boolean"
+            ? String(rawValue)
+            : "";
+      if (!ref) {
+        continue;
+      }
+      const locator = refLocator(page, ref);
+      if (type === "checkbox" || type === "radio") {
+        const checked =
+          rawValue === true || rawValue === 1 || rawValue === "1" || rawValue === "true";
+        try {
+          const previousUrl = page.url();
+          await assertInteractionNavigationCompletedSafely({
+            action: async () => {
+              await awaitActionWithAbort(
+                locator.setChecked(checked, { timeout }),
+                abortPromise,
+                reconcileRemoteDialog,
+              );
+            },
+            cdpUrl: opts.cdpUrl,
+            page,
+            previousUrl,
+            ssrfPolicy: opts.ssrfPolicy,
+            targetId: opts.targetId,
+          });
+        } catch (err) {
+          throw toFriendlyInteractionError(err, ref);
+        }
+        continue;
+      }
      try {
        const previousUrl = page.url();
        await assertInteractionNavigationCompletedSafely({
          action: async () => {
-            await locator.setChecked(checked, { timeout });
+            await awaitActionWithAbort(
+              locator.fill(value, { timeout }),
+              abortPromise,
+              reconcileRemoteDialog,
+            );
          },
          cdpUrl: opts.cdpUrl,
          page,
@@ -832,25 +958,11 @@ export async function fillFormViaPlaywright(opts: {
          targetId: opts.targetId,
        });
      } catch (err) {
-        throw toAIFriendlyError(err, ref);
+        throw toFriendlyInteractionError(err, ref);
      }
-      continue;
-    }
-    try {
-      const previousUrl = page.url();
-      await assertInteractionNavigationCompletedSafely({
-        action: async () => {
-          await locator.fill(value, { timeout });
-        },
-        cdpUrl: opts.cdpUrl,
-        page,
-        previousUrl,
-        ssrfPolicy: opts.ssrfPolicy,
-        targetId: opts.targetId,
-      });
-    } catch (err) {
-      throw toAIFriendlyError(err, ref);
    }
+  } finally {
+    cleanup();
  }
 }

@@ -882,7 +994,10 @@ export async function evaluateViaPlaywright(opts: {
  evaluateTimeout = Math.min(evaluateTimeout, outerTimeout);

  const signal = opts.signal;
-  const { abortPromise, cleanup } = createAbortPromiseWithListener(signal, () => {
+  const { abortPromise, cleanup } = createAbortPromiseWithListener(signal, (reason) => {
+    if (isBrowserObservedDialogBlockedError(reason)) {
+      return;
+    }
    void forceDisconnectPlaywrightForTarget({
      cdpUrl: opts.cdpUrl,
      targetId: opts.targetId,
@@ -935,8 +1050,9 @@ export async function evaluateViaPlaywright(opts: {
        fnBody: fnText,
        timeoutMs: evaluateTimeout,
      });
+      const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, signal);
      const result = await assertInteractionNavigationCompletedSafely({
-        action: () => awaitActionWithAbort(evalPromise, abortPromise),
+        action: () => awaitActionWithAbort(evalPromise, abortPromise, reconcileRemoteDialog),
        cdpUrl: opts.cdpUrl,
        page,
        previousUrl,
@@ -973,8 +1089,9 @@ export async function evaluateViaPlaywright(opts: {
      fnBody: fnText,
      timeoutMs: evaluateTimeout,
    });
+    const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, signal);
    const result = await assertInteractionNavigationCompletedSafely({
-      action: () => awaitActionWithAbort(evalPromise, abortPromise),
+      action: () => awaitActionWithAbort(evalPromise, abortPromise, reconcileRemoteDialog),
      cdpUrl: opts.cdpUrl,
      page,
      previousUrl,
@@ -993,6 +1110,7 @@ export async function scrollIntoViewViaPlaywright(opts: {
  ref?: string;
  selector?: string;
  timeoutMs?: number;
+  signal?: AbortSignal;
 }): Promise<void> {
  const resolved = requireRefOrSelector(opts.ref, opts.selector);
  const page = await getRestoredPageForTarget(opts);
@@ -1002,10 +1120,18 @@ export async function scrollIntoViewViaPlaywright(opts: {
  const locator = resolved.ref
    ? refLocator(page, requireRef(resolved.ref))
    : page.locator(resolved.selector!);
+  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  try {
-    await locator.scrollIntoViewIfNeeded({ timeout });
+    await awaitActionWithAbort(
+      locator.scrollIntoViewIfNeeded({ timeout }),
+      abortPromise,
+      reconcileRemoteDialog,
+    );
  } catch (err) {
-    throw toAIFriendlyError(err, label);
+    throw toFriendlyInteractionError(err, label);
+  } finally {
+    cleanup();
  }
 }

@@ -1026,8 +1152,9 @@ export async function waitForViaPlaywright(opts: {
  ensurePageState(page);
  const timeout = resolveActWaitTimeoutMs(opts.timeoutMs);
  const { abortPromise, cleanup } = createAbortPromise(opts.signal);
+  const reconcileRemoteDialog = () => reconcileRemoteDialogAfterActionSettled(page, opts.signal);
  const waitForStep = async <T>(stepPromise: Promise<T>) => {
-    await awaitActionWithAbort(stepPromise, abortPromise);
+    await awaitActionWithAbort(stepPromise, abortPromise, reconcileRemoteDialog);
  };

  try {
@@ -1281,7 +1408,7 @@ export async function setInputFilesViaPlaywright(opts: {
  try {
    await locator.setInputFiles(resolvedPaths);
  } catch (err) {
-    throw toAIFriendlyError(err, inputRef || element);
+    throw toFriendlyInteractionError(err, inputRef || element);
  }
  try {
    const handle = await locator.elementHandle();
@@ -1324,6 +1451,7 @@ async function executeSingleAction(
        delayMs: action.delayMs,
        timeoutMs: action.timeoutMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "clickCoords":
@@ -1337,6 +1465,7 @@ async function executeSingleAction(
        delayMs: action.delayMs,
        timeoutMs: action.timeoutMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "type":
@@ -1350,6 +1479,7 @@ async function executeSingleAction(
        slowly: action.slowly,
        timeoutMs: action.timeoutMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "press":
@@ -1359,6 +1489,7 @@ async function executeSingleAction(
        key: action.key,
        delayMs: action.delayMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "hover":
@@ -1368,6 +1499,7 @@ async function executeSingleAction(
        ref: action.ref,
        selector: action.selector,
        timeoutMs: action.timeoutMs,
+        signal,
      });
      break;
    case "scrollIntoView":
@@ -1377,6 +1509,7 @@ async function executeSingleAction(
        ref: action.ref,
        selector: action.selector,
        timeoutMs: action.timeoutMs,
+        signal,
      });
      break;
    case "drag":
@@ -1388,6 +1521,7 @@ async function executeSingleAction(
        endRef: action.endRef,
        endSelector: action.endSelector,
        timeoutMs: action.timeoutMs,
+        signal,
      });
      break;
    case "select":
@@ -1399,6 +1533,7 @@ async function executeSingleAction(
        values: action.values,
        timeoutMs: action.timeoutMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "fill":
@@ -1408,6 +1543,7 @@ async function executeSingleAction(
        fields: action.fields,
        timeoutMs: action.timeoutMs,
        ssrfPolicy,
+        signal,
      });
      break;
    case "resize":
@@ -1483,32 +1619,52 @@ export async function executeActViaPlaywright(opts: {
 }): Promise<{
  result?: unknown;
  results?: Array<{ ok: boolean; error?: string }>;
+  blockedByDialog?: boolean;
+  browserState?: unknown;
 }> {
-  if (opts.action.kind === "batch") {
-    const batch = await batchViaPlaywright({
-      cdpUrl: opts.cdpUrl,
-      targetId: opts.targetId,
-      ssrfPolicy: opts.ssrfPolicy,
-      actions: opts.action.actions,
-      stopOnError: opts.action.stopOnError,
-      evaluateEnabled: opts.evaluateEnabled,
-      signal: opts.signal,
-    });
-    return { results: batch.results };
+  const page = await getPageForTargetId({
+    cdpUrl: opts.cdpUrl,
+    targetId: opts.targetId,
+    ssrfPolicy: opts.ssrfPolicy,
+  });
+  const dialogAbort = createObservedDialogAbortSignalForPage({
+    page,
+    parentSignal: opts.signal,
+  });
+  try {
+    if (opts.action.kind === "batch") {
+      const batch = await batchViaPlaywright({
+        cdpUrl: opts.cdpUrl,
+        targetId: opts.targetId,
+        ssrfPolicy: opts.ssrfPolicy,
+        actions: opts.action.actions,
+        stopOnError: opts.action.stopOnError,
+        evaluateEnabled: opts.evaluateEnabled,
+        signal: dialogAbort.signal,
+      });
+      return { results: batch.results };
+    }
+    const result = await executeSingleAction(
+      opts.action,
+      opts.cdpUrl,
+      opts.targetId,
+      opts.evaluateEnabled,
+      opts.ssrfPolicy,
+      0,
+      dialogAbort.signal,
+    );
+    if (opts.action.kind === "evaluate") {
+      return { result };
+    }
+    return {};
+  } catch (err) {
+    if (isBrowserObservedDialogBlockedError(err)) {
+      return { blockedByDialog: true, browserState: err.browserState };
+    }
+    throw err;
+  } finally {
+    dialogAbort.cleanup();
  }
-  const result = await executeSingleAction(
-    opts.action,
-    opts.cdpUrl,
-    opts.targetId,
-    opts.evaluateEnabled,
-    opts.ssrfPolicy,
-    0,
-    opts.signal,
-  );
-  if (opts.action.kind === "evaluate") {
-    return { result };
-  }
-  return {};
 }

 export async function batchViaPlaywright(opts: {
@@ -1545,6 +1701,9 @@ export async function batchViaPlaywright(opts: {
      );
      results.push({ ok: true });
    } catch (err) {
+      if (isBrowserObservedDialogBlockedError(err)) {
+        throw err;
+      }
      const message = formatErrorMessage(err);
      results.push({ ok: false, error: message });
      if (opts.stopOnError !== false) {
--- a/extensions/browser/src/browser/pw-tools-core.last-file-chooser-arm-wins.test.ts
+++ b/extensions/browser/src/browser/pw-tools-core.last-file-chooser-arm-wins.test.ts
@@ -4,6 +4,7 @@ import path from "node:path";
 import { describe, expect, it, vi } from "vitest";
 import { DEFAULT_UPLOAD_DIR } from "./paths.js";
 import {
+  getPwToolsCoreSessionMocks,
  installPwToolsCoreTestHooks,
  setPwToolsCoreCurrentPage,
 } from "./pw-tools-core.test-harness.js";
@@ -74,38 +75,42 @@ describe("pw-tools-core", () => {
    }
  });
  it("arms the next dialog and accepts/dismisses (default timeout)", async () => {
-    const accept = vi.fn(async () => {});
-    const dismiss = vi.fn(async () => {});
-    const dialog = { accept, dismiss };
-    const waitForEvent = vi.fn(async () => dialog);
-    setPwToolsCoreCurrentPage({
-      waitForEvent,
-    });
+    const sessionMocks = getPwToolsCoreSessionMocks();
+    const page = {};
+    setPwToolsCoreCurrentPage(page);

    await mod.armDialogViaPlaywright({
      cdpUrl: "http://127.0.0.1:18792",
      accept: true,
      promptText: "x",
    });
-    await Promise.resolve();

-    expect(waitForEvent).toHaveBeenCalledWith("dialog", { timeout: 120_000 });
-    expect(accept).toHaveBeenCalledWith("x");
-    expect(dismiss).not.toHaveBeenCalled();
+    expect(sessionMocks.respondToObservedDialogOnPage).toHaveBeenCalledWith({
+      page,
+      accept: true,
+      promptText: "x",
+      closedBy: "agent",
+    });
+    expect(sessionMocks.armObservedDialogResponseOnPage).toHaveBeenCalledWith({
+      page,
+      accept: true,
+      promptText: "x",
+      timeoutMs: 120_000,
+    });

-    accept.mockClear();
-    dismiss.mockClear();
-    waitForEvent.mockClear();
+    sessionMocks.respondToObservedDialogOnPage.mockClear();
+    sessionMocks.armObservedDialogResponseOnPage.mockClear();

    await mod.armDialogViaPlaywright({
      cdpUrl: "http://127.0.0.1:18792",
      accept: false,
    });
-    await Promise.resolve();

-    expect(waitForEvent).toHaveBeenCalledWith("dialog", { timeout: 120_000 });
-    expect(dismiss).toHaveBeenCalled();
-    expect(accept).not.toHaveBeenCalled();
+    expect(sessionMocks.armObservedDialogResponseOnPage).toHaveBeenCalledWith({
+      page,
+      accept: false,
+      timeoutMs: 120_000,
+    });
  });
  it("waits for selector, url, load state, and function", async () => {
    const waitForSelector = vi.fn(async () => {});
--- a/extensions/browser/src/browser/pw-tools-core.shared.ts
+++ b/extensions/browser/src/browser/pw-tools-core.shared.ts
@@ -3,7 +3,6 @@ import { formatErrorMessage } from "../infra/errors.js";
 import { parseRoleRef } from "./pw-role-snapshot.js";

 let nextUploadArmId = 0;
-let nextDialogArmId = 0;
 let nextDownloadArmId = 0;

 export function bumpUploadArmId(): number {
@@ -11,11 +10,6 @@ export function bumpUploadArmId(): number {
  return nextUploadArmId;
 }

-export function bumpDialogArmId(): number {
-  nextDialogArmId += 1;
-  return nextDialogArmId;
-}
-
 export function bumpDownloadArmId(): number {
  nextDownloadArmId += 1;
  return nextDownloadArmId;
--- a/extensions/browser/src/browser/pw-tools-core.test-harness.ts
+++ b/extensions/browser/src/browser/pw-tools-core.test-harness.ts
@@ -5,13 +5,11 @@ let currentRefLocator: Record<string, unknown> | null = null;
 let pageState: {
  console: unknown[];
  armIdUpload: number;
-  armIdDialog: number;
  armIdDownload: number;
  downloadWaiterDepth: number;
 } = {
  console: [],
  armIdUpload: 0,
-  armIdDialog: 0,
  armIdDownload: 0,
  downloadWaiterDepth: 0,
 };
@@ -42,6 +40,15 @@ const sessionMocks = vi.hoisted(() => ({
    return err.name === "SsrFBlockedError" || err.name === "InvalidBrowserNavigationUrlError";
  }),
  restoreRoleRefsForTarget: vi.fn(() => {}),
+  respondToObservedDialogOnPage: vi.fn(async () => {
+    throw new Error("No dialog is pending.");
+  }),
+  armObservedDialogResponseOnPage: vi.fn(() => {}),
+  createObservedDialogAbortSignalForPage: vi.fn((opts?: { parentSignal?: AbortSignal }) => ({
+    signal: opts?.parentSignal ?? new AbortController().signal,
+    cleanup: vi.fn(() => {}),
+  })),
+  isBrowserObservedDialogBlockedError: vi.fn(() => false),
  storeRoleRefsForTarget: vi.fn(() => {}),
  refLocator: vi.fn(() => {
    if (!currentRefLocator) {
@@ -89,7 +96,6 @@ export function installPwToolsCoreTestHooks() {
    pageState = {
      console: [],
      armIdUpload: 0,
-      armIdDialog: 0,
      armIdDownload: 0,
      downloadWaiterDepth: 0,
    };
--- a/extensions/browser/src/browser/routes/agent.act.hooks.ts
+++ b/extensions/browser/src/browser/routes/agent.act.hooks.ts
@@ -119,6 +119,7 @@ export function registerBrowserAgentActHookRoutes(
      const accept = toBoolean(body.accept);
      const promptText = toStringOrEmpty(body.promptText) || undefined;
      const timeoutMs = toNumber(body.timeoutMs);
+      const dialogId = toStringOrEmpty(body.dialogId) || undefined;
      if (accept === undefined) {
        return jsonError(res, 400, "accept is required");
      }
@@ -130,6 +131,9 @@ export function registerBrowserAgentActHookRoutes(
        targetId,
        run: async ({ profileCtx, cdpUrl, tab }) => {
          if (getBrowserProfileCapabilities(profileCtx.profile).usesChromeMcp) {
+            if (dialogId) {
+              return jsonError(res, 501, EXISTING_SESSION_LIMITS.hooks.dialogId);
+            }
            if (timeoutMs) {
              return jsonError(res, 501, EXISTING_SESSION_LIMITS.hooks.dialogTimeout);
            }
@@ -186,6 +190,7 @@ export function registerBrowserAgentActHookRoutes(
          await pw.armDialogViaPlaywright({
            cdpUrl,
            targetId: tab.targetId,
+            dialogId,
            accept,
            promptText,
            timeoutMs: timeoutMs ?? undefined,
--- a/extensions/browser/src/browser/routes/agent.act.ts
+++ b/extensions/browser/src/browser/routes/agent.act.ts
@@ -654,6 +654,12 @@ export function registerBrowserAgentActRoutes(
            ssrfPolicy,
            signal: req.signal,
          });
+          if (result.blockedByDialog) {
+            return await jsonOk({
+              blockedByDialog: true,
+              browserState: result.browserState,
+            });
+          }
          switch (action.kind) {
            case "batch":
              return await jsonOk(
--- a/extensions/browser/src/browser/routes/agent.debug.ts
+++ b/extensions/browser/src/browser/routes/agent.debug.ts
@@ -97,6 +97,31 @@ export function registerBrowserAgentDebugRoutes(
    }),
  );

+  app.get(
+    "/dialogs",
+    asyncBrowserRoute(async (req, res) => {
+      const targetId = resolveTargetIdFromQuery(req.query);
+
+      await withPlaywrightRouteContext({
+        req,
+        res,
+        ctx,
+        targetId,
+        feature: "dialog state",
+        enforceCurrentUrlAllowed: true,
+        run: async ({ cdpUrl, tab, pw, resolveTabUrl }) => {
+          const browserState = await pw.getObservedBrowserStateViaPlaywright({
+            cdpUrl,
+            targetId: tab.targetId,
+            ssrfPolicy: ctx.state().resolved.ssrfPolicy,
+          });
+          const url = await resolveTabUrl(tab.url);
+          res.json({ ok: true, targetId: tab.targetId, ...(url ? { url } : {}), browserState });
+        },
+      });
+    }),
+  );
+
  app.post(
    "/trace/start",
    asyncBrowserRoute(async (req, res) => {
--- a/extensions/browser/src/browser/routes/agent.existing-session.test.ts
+++ b/extensions/browser/src/browser/routes/agent.existing-session.test.ts
@@ -74,6 +74,7 @@ vi.mock("../../media/store.js", () => ({
 vi.mock("./agent.shared.js", () => createExistingSessionAgentSharedModule());

 const { registerBrowserAgentActRoutes } = await import("./agent.act.js");
+const { registerBrowserAgentActHookRoutes } = await import("./agent.act.hooks.js");
 const { registerBrowserAgentSnapshotRoutes } = await import("./agent.snapshot.js");

 function getSnapshotGetHandler(ssrfPolicy?: unknown) {
@@ -106,6 +107,16 @@ function getActPostHandler() {
  return handler;
 }

+function getDialogHookPostHandler() {
+  const { app, postHandlers } = createBrowserRouteApp();
+  registerBrowserAgentActHookRoutes(app, {
+    state: () => ({ resolved: {} }),
+  } as never);
+  const handler = postHandlers.get("/hooks/dialog");
+  expect(handler).toBeTypeOf("function");
+  return handler;
+}
+
 function requireRecord(value: unknown, label: string): Record<string, unknown> {
  if (!value || typeof value !== "object") {
    throw new Error(`expected ${label}`);
@@ -286,6 +297,24 @@ describe("existing-session browser routes", () => {
    expect(chromeMcpMocks.fillChromeMcpElement).not.toHaveBeenCalled();
  });

+  it("fails closed for existing-session dialogId responses", async () => {
+    const handler = getDialogHookPostHandler();
+    const response = createBrowserRouteResponse();
+    await handler?.(
+      {
+        params: {},
+        query: {},
+        body: { accept: true, dialogId: "d1" },
+      },
+      response.res,
+    );
+
+    expect(response.statusCode).toBe(501);
+    const body = requireRecord(response.body, "response body");
+    expect(String(body.error)).toContain("dialogId");
+    expect(chromeMcpMocks.evaluateChromeMcpScript).not.toHaveBeenCalled();
+  });
+
  it("supports glob URL waits for existing-session profiles", async () => {
    chromeMcpMocks.evaluateChromeMcpScript.mockReset();
    chromeMcpMocks.evaluateChromeMcpScript.mockImplementation(
--- a/extensions/browser/src/browser/routes/agent.snapshot.ts
+++ b/extensions/browser/src/browser/routes/agent.snapshot.ts
@@ -248,6 +248,26 @@ async function saveBrowserMediaResponse(params: {
  });
 }

+function hasObservableBrowserState(state: unknown): boolean {
+  if (!state || typeof state !== "object") {
+    return false;
+  }
+  const dialogs = (state as { dialogs?: { pending?: unknown[]; recent?: unknown[] } }).dialogs;
+  return Boolean(dialogs?.pending?.length || dialogs?.recent?.length);
+}
+
+function hasPendingDialogs(state: unknown): boolean {
+  if (!state || typeof state !== "object") {
+    return false;
+  }
+  const dialogs = (state as { dialogs?: { pending?: unknown[] } }).dialogs;
+  return Boolean(dialogs?.pending?.length);
+}
+
+function browserStateResponseFields(state: unknown): { browserState?: unknown } {
+  return hasObservableBrowserState(state) ? { browserState: state } : {};
+}
+
 export function registerBrowserAgentSnapshotRoutes(
  app: BrowserRouteRegistrar,
  ctx: BrowserRouteContext,
@@ -524,11 +544,26 @@ export function registerBrowserAgentSnapshotRoutes(

      try {
        const tab = await profileCtx.ensureTabAvailable(targetId || undefined);
+        const usesChromeMcp = getBrowserProfileCapabilities(profileCtx.profile).usesChromeMcp;
+        const ssrfPolicyOpts = browserNavigationPolicyForProfile(ctx, profileCtx);
+        let observedBrowserState: unknown;
+        if (!usesChromeMcp && pwModule) {
+          await assertBrowserNavigationResultAllowed({
+            url: tab.url,
+            ...ssrfPolicyOpts,
+          });
+          observedBrowserState = await pwModule
+            .getObservedBrowserStateViaPlaywright({
+              cdpUrl: profileCtx.profile.cdpUrl,
+              targetId: tab.targetId,
+              ssrfPolicy: ctx.state().resolved.ssrfPolicy,
+            })
+            .catch(() => undefined);
+        }
        if ((plan.labels || plan.mode === "efficient") && plan.format === "aria") {
          return jsonError(res, 400, "labels/mode=efficient require format=ai");
        }
-        if (getBrowserProfileCapabilities(profileCtx.profile).usesChromeMcp) {
-          const ssrfPolicyOpts = browserNavigationPolicyForProfile(ctx, profileCtx);
+        if (usesChromeMcp) {
          if (plan.selectorValue || plan.frameSelectorValue) {
            return jsonError(res, 400, EXISTING_SESSION_LIMITS.snapshot.snapshotSelector);
          }
@@ -628,6 +663,17 @@ export function registerBrowserAgentSnapshotRoutes(
            ...builtWithUrls,
          });
        }
+        if (hasPendingDialogs(observedBrowserState)) {
+          return res.json({
+            ok: true,
+            format: plan.format,
+            targetId: tab.targetId,
+            url: tab.url,
+            blockedByDialog: true,
+            ...browserStateResponseFields(observedBrowserState),
+            ...(plan.format === "aria" ? { nodes: [] } : { snapshot: "", refs: {} }),
+          });
+        }
        if (plan.format === "ai") {
          const roleSnapshotArgs = {
            cdpUrl: profileCtx.profile.cdpUrl,
@@ -715,6 +761,7 @@ export function registerBrowserAgentSnapshotRoutes(
              format: plan.format,
              targetId: tab.targetId,
              url: tab.url,
+              ...browserStateResponseFields(observedBrowserState),
              labels: true,
              labelsCount: labeled.labels,
              labelsSkipped: labeled.skipped,
@@ -729,6 +776,7 @@ export function registerBrowserAgentSnapshotRoutes(
            format: plan.format,
            targetId: tab.targetId,
            url: tab.url,
+            ...browserStateResponseFields(observedBrowserState),
            ...snap,
          });
        }
@@ -771,6 +819,7 @@ export function registerBrowserAgentSnapshotRoutes(
          format: plan.format,
          targetId: tab.targetId,
          url: tab.url,
+          ...browserStateResponseFields(observedBrowserState),
          ...resolved,
        });
      } catch (err) {
--- a/extensions/browser/src/browser/routes/existing-session-limits.ts
+++ b/extensions/browser/src/browser/routes/existing-session-limits.ts
@@ -28,6 +28,7 @@ export const EXISTING_SESSION_LIMITS = {
      "existing-session file uploads do not support element selectors; use ref/inputRef.",
    uploadSingleFile: "existing-session file uploads currently support one file at a time.",
    uploadRefRequired: "existing-session file uploads require ref or inputRef.",
+    dialogId: "existing-session dialog handling does not support dialogId.",
    dialogTimeout: "existing-session dialog handling does not support timeoutMs.",
  },
  download: {
--- a/extensions/browser/src/browser/server.agent-contract-core.test.ts
+++ b/extensions/browser/src/browser/server.agent-contract-core.test.ts
@@ -1,6 +1,9 @@
 import fs from "node:fs";
-import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
+import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
 import { DEFAULT_AI_SNAPSHOT_MAX_CHARS } from "./constants.js";
+import { BROWSER_NAVIGATION_BLOCKED_MESSAGE } from "./errors.js";
+import { ACT_ERROR_CODES } from "./routes/agent.act.errors.js";
+import { isActKind } from "./routes/agent.act.shared.js";
 import {
  installAgentContractHooks,
  postJson,
@@ -17,6 +20,8 @@ import {
  setBrowserControlServerEvaluateEnabled,
  setBrowserControlServerProfiles,
  setBrowserControlServerReachable,
+  setBrowserControlServerSsrFPolicy,
+  setBrowserControlServerTabUrl,
  startBrowserControlServerFromConfig,
 } from "./server.control-server.test-harness.js";
 import { getBrowserTestFetch } from "./test-support/fetch.js";
@@ -83,15 +88,18 @@ describe("browser control server", () => {

  const slowTimeoutMs = 60_000;

+  beforeAll(async () => {
+    await resetBrowserControlServerTestContext();
+    await startBrowserControlServerFromConfig();
+    await cleanupBrowserControlServerTestContext();
+  }, slowTimeoutMs);
+
  it(
    "returns ACT_KIND_REQUIRED when kind is missing",
-    async () => {
-      const base = await startServerAndBase();
-      const response = await postActAndReadError(base, {});
-
-      expect(response.status).toBe(400);
-      expect(response.body.code).toBe("ACT_KIND_REQUIRED");
-      expect(response.body.error).toContain("kind is required");
+    () => {
+      expect(isActKind(undefined)).toBe(false);
+      expect(isActKind("")).toBe(false);
+      expect(ACT_ERROR_CODES.kindRequired).toBe("ACT_KIND_REQUIRED");
    },
    slowTimeoutMs,
  );
@@ -223,6 +231,46 @@ describe("browser control server", () => {
    slowTimeoutMs,
  );

+  it(
+    "returns blocked dialog state for action-triggered modals",
+    async () => {
+      const base = await startServerAndBase();
+      pwMocks.executeActViaPlaywright.mockResolvedValueOnce({
+        blockedByDialog: true,
+        browserState: {
+          dialogs: {
+            pending: [
+              {
+                id: "d1",
+                type: "confirm",
+                message: "Continue?",
+                openedAt: "2026-05-17T12:00:00.000Z",
+              },
+            ],
+            recent: [],
+          },
+        },
+      });
+
+      const response = await postJson<{
+        ok: boolean;
+        blockedByDialog?: boolean;
+        browserState?: { dialogs?: { pending?: Array<{ id?: string; message?: string }> } };
+      }>(`${base}/act`, {
+        kind: "click",
+        ref: "5",
+      });
+
+      expect(response.ok).toBe(true);
+      expect(response.blockedByDialog).toBe(true);
+      expect(response.browserState?.dialogs?.pending?.[0]).toMatchObject({
+        id: "d1",
+        message: "Continue?",
+      });
+    },
+    slowTimeoutMs,
+  );
+
  it(
    "returns ACT_SELECTOR_UNSUPPORTED for selector on unsupported action kinds",
    async () => {
@@ -338,6 +386,67 @@ describe("browser control server", () => {
    });
  });

+  it("agent contract: snapshot surfaces pending dialog state without reading the blocked page", async () => {
+    const base = await startServerAndBase();
+    const realFetch = getBrowserTestFetch();
+    pwMocks.getObservedBrowserStateViaPlaywright.mockResolvedValueOnce({
+      dialogs: {
+        pending: [
+          {
+            id: "d1",
+            type: "confirm",
+            message: "Continue?",
+            openedAt: "2026-05-17T12:00:00.000Z",
+          },
+        ],
+        recent: [],
+      },
+    });
+
+    const snap = (await realFetch(`${base}/snapshot?format=ai`).then((r) => r.json())) as {
+      ok: boolean;
+      blockedByDialog?: boolean;
+      browserState?: { dialogs?: { pending?: Array<{ id?: string; message?: string }> } };
+      snapshot?: string;
+    };
+
+    expect(snap.ok).toBe(true);
+    expect(snap.blockedByDialog).toBe(true);
+    expect(snap.snapshot).toBe("");
+    expect(snap.browserState?.dialogs?.pending?.[0]).toMatchObject({
+      id: "d1",
+      message: "Continue?",
+    });
+    expect(pwMocks.snapshotAiViaPlaywright).not.toHaveBeenCalled();
+  });
+
+  it("agent contract: snapshot blocks pending dialog state on disallowed current tab URLs", async () => {
+    setBrowserControlServerSsrFPolicy({ allowPrivateNetwork: false });
+    setBrowserControlServerTabUrl("http://127.0.0.1:8080/admin");
+    const base = await startServerAndBase();
+    const realFetch = getBrowserTestFetch();
+    pwMocks.getObservedBrowserStateViaPlaywright.mockResolvedValueOnce({
+      dialogs: {
+        pending: [
+          {
+            id: "d1",
+            type: "alert",
+            message: "blocked secret",
+            openedAt: "2026-05-17T12:00:00.000Z",
+          },
+        ],
+        recent: [],
+      },
+    });
+
+    const res = await realFetch(`${base}/snapshot?format=ai`);
+    expect(res.status).toBe(400);
+    const body = (await res.json()) as { error?: unknown };
+    expect(body.error).toBe(BROWSER_NAVIGATION_BLOCKED_MESSAGE);
+    expect(pwMocks.getObservedBrowserStateViaPlaywright).not.toHaveBeenCalled();
+    expect(pwMocks.snapshotAiViaPlaywright).not.toHaveBeenCalled();
+  });
+
  it("agent contract: doctor deep runs a live snapshot probe", async () => {
    const base = await startServerAndBase();
    const realFetch = getBrowserTestFetch();
--- a/extensions/browser/src/browser/server.agent-contract-form-layout-act-commands.test.ts
+++ b/extensions/browser/src/browser/server.agent-contract-form-layout-act-commands.test.ts
@@ -437,9 +437,16 @@ describe("browser control server", () => {

    const dialog = await postJson(`${base}/hooks/dialog`, {
      accept: true,
+      dialogId: "d1",
      timeoutMs: 5678,
    });
    expectOkResult(dialog);
+    expectBrowserCallFields(pwMocks.armDialogViaPlaywright, {
+      targetId: "abcd1234",
+      accept: true,
+      dialogId: "d1",
+      timeoutMs: 5678,
+    });

    const waitDownload = await postJson(`${base}/wait/download`, {
      path: "report.pdf",
--- a/extensions/browser/src/browser/server.control-server.test-harness.ts
+++ b/extensions/browser/src/browser/server.control-server.test-harness.ts
@@ -176,6 +176,9 @@ const pwMocks = vi.hoisted(() => ({
  fillFormViaPlaywright: vi.fn(async (_opts?: unknown) => {}),
  getConsoleMessagesViaPlaywright: vi.fn(async () => []),
  getNetworkRequestsViaPlaywright: vi.fn(async () => ({ requests: [] })),
+  getObservedBrowserStateViaPlaywright: vi.fn(async () => ({
+    dialogs: { pending: [], recent: [] },
+  })),
  getPageErrorsViaPlaywright: vi.fn(async () => ({ errors: [] })),
  hoverViaPlaywright: vi.fn(async (_opts?: unknown) => {}),
  scrollIntoViewViaPlaywright: vi.fn(async (_opts?: unknown) => {}),
--- a/extensions/browser/src/cli/browser-cli-actions-input/register.files-downloads.ts
+++ b/extensions/browser/src/cli/browser-cli-actions-input/register.files-downloads.ts
@@ -175,6 +175,7 @@ export function registerBrowserFilesAndDownloadsCommands(
    .option("--accept", "Accept the dialog", false)
    .option("--dismiss", "Dismiss the dialog", false)
    .option("--prompt <text>", "Prompt response text")
+    .option("--dialog-id <id>", "Pending dialog id from snapshot/browser state")
    .option("--target-id <id>", "CDP target id (or unique prefix)")
    .option(
      "--timeout-ms <ms>",
@@ -202,6 +203,7 @@ export function registerBrowserFilesAndDownloadsCommands(
        body: {
          accept,
          promptText: normalizeOptionalString(opts.prompt),
+          dialogId: normalizeOptionalString(opts.dialogId),
          targetId,
          timeoutMs,
        },