SDK list helpers now send an empty params object when filters are omitted while preserving explicit invalid params for Gateway validation.
Verification:
- git diff --check origin/main...HEAD
- node --check packages/sdk/src/client.ts
- codex review --base origin/main
- GitHub Actions CI release gate 27855603923 succeeded on 353f13c0d1
Summary:
- The branch changes config write preparation and doctor regression coverage so `doctor --fix` persists repair ... rams under canonical `openai/*` with Codex runtime policy, plus a prerelease lane timeout assertion update.
- PR surface: Source +9, Tests +107. Total +116 across 4 files.
- Reproducibility: yes, at source level: current main can re-preserve stale source-authored `openai-codex/*` m ... the candidate config, while the PR body supplies after-fix command proof for the narrowed persistence path.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 7b5bc00f31.
- Required merge gates passed before the squash merge.
Prepared head SHA: 7b5bc00f31
Review: https://github.com/openclaw/openclaw/pull/94478#issuecomment-4739605890
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Bound gateway model pricing catalog reads through the shared streaming byte-limit helper so no-content-length LiteLLM/OpenRouter responses cannot be fully buffered past the 5 MiB cap before rejection. Adds a regression for streamed LiteLLM overflow while preserving OpenRouter fallback pricing.
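A minimal sketch of the streaming byte-limit idea, assuming chunks arrive one at a time from a body with no Content-Length. `BoundedChunkCollector` and `ByteLimitExceededError` are hypothetical names for illustration, not the shared helper this change uses; only the 5 MiB cap comes from the description above.

```typescript
// Hypothetical sketch: count bytes as chunks arrive and fail before the cap is passed,
// instead of buffering the whole body and checking its size afterwards.
const FIVE_MIB = 5 * 1024 * 1024;

class ByteLimitExceededError extends Error {
  limit: number;

  constructor(limit: number) {
    super(`response body exceeded ${limit} bytes`);
    this.name = "ByteLimitExceededError";
    this.limit = limit;
  }
}

class BoundedChunkCollector {
  private chunks: Uint8Array[] = [];
  private total = 0;
  private maxBytes: number;

  constructor(maxBytes: number = FIVE_MIB) {
    this.maxBytes = maxBytes;
  }

  // Call once per streamed chunk. Throws before retaining a chunk that would pass the cap.
  push(chunk: Uint8Array): void {
    if (this.total + chunk.byteLength > this.maxBytes) {
      throw new ByteLimitExceededError(this.maxBytes);
    }
    this.total += chunk.byteLength;
    this.chunks.push(chunk);
  }

  // Concatenate the accepted chunks into one buffer.
  finish(): Uint8Array {
    const out = new Uint8Array(this.total);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return out;
  }
}
```

A caller would feed each chunk from the response stream into `push` and cancel the stream when it throws, so an oversized catalog is rejected without being held in memory.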
Keep the OpenAI Realtime WebRTC smoke's SDP offer request in the browser fetch path while moving the browser-side SDP answer reader into a testable helper. Reject unsafe decimal Content-Length values before acquiring a body reader and preserve streamed byte limiting for responses without a safe declared length.
Proof: direct bounded-reader repro rejects unsafe content-length before getReader and cancels the body; node --check --experimental-strip-types scripts/dev/realtime-talk-live-smoke.ts; node --check --experimental-strip-types test/scripts/dev-tooling-safety.test.ts; git diff --check origin/main...HEAD; autoreview clean overall 0.84; exact-head release gate succeeded at https://github.com/openclaw/openclaw/actions/runs/27848673438.
Reject unsafe decimal Content-Length values in the E2E bounded response text helper before streaming response bodies. Keep non-decimal values on the streaming byte-limit path and add regression coverage proving unsafe declared lengths cancel without starting a read.
Proof: direct patched repro rejects before reading with code ETOOBIG; origin/main comparison entered the reader first; node --check scripts/e2e/lib/bounded-response-text.mjs; git diff --check origin/main...HEAD; autoreview clean overall 0.86; exact-head release gate succeeded at https://github.com/openclaw/openclaw/actions/runs/27846197115.
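The Content-Length pre-check described in the entries above could look roughly like this. It is a sketch under stated assumptions: `shouldRejectBeforeRead` and `assertReadableLength` are hypothetical names, and "unsafe" is assumed to mean a decimal value that is either not a safe integer or already larger than the byte cap. The `ETOOBIG` code is the one named in the proof above.

```typescript
// Hypothetical pre-read check. Returns true only for decimal Content-Length values
// that should be rejected before any body reader is acquired.
function shouldRejectBeforeRead(contentLength: string | null, maxBytes: number): boolean {
  if (contentLength === null) return false; // no declared length: stream with a byte limit
  const value = contentLength.trim();
  if (!/^[0-9]+$/.test(value)) return false; // non-decimal: stay on the streaming path
  const declared = Number(value);
  return !Number.isSafeInteger(declared) || declared > maxBytes;
}

// Throws an error carrying code ETOOBIG when the declared length is unsafe.
function assertReadableLength(contentLength: string | null, maxBytes: number): void {
  if (shouldRejectBeforeRead(contentLength, maxBytes)) {
    throw Object.assign(
      new Error(`declared Content-Length ${contentLength} is not safely readable`),
      { code: "ETOOBIG" },
    );
  }
}
```

Values that pass this check would still be read through the streaming byte limit, since a declared length is only a claim by the server.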
* feat(cli): add `sessions compact` command and fail loudly on CLI `/compact`
`sessions.compact` was reachable only as an internal Gateway RPC — no CLI
command, no docs — and `openclaw agent --message '/compact'` silently no-opped
with exit 0 because the slash-command handler rejects CLI-originated senders,
so the message fell through to an ordinary agent turn that compacted nothing.
- Add `openclaw sessions compact <key>` wrapping the existing `sessions.compact`
RPC; exit non-zero on a transport error or an `ok:false` payload so automation
never mistakes a silent no-op for success.
- Reject `openclaw agent --message '/compact'` with a redirect to the new
command and exit 1 instead of a silent exit 0. The shared chat-side `/compact`
handler is left untouched (no compatibility / message-delivery blast radius).
- Strictly validate `--max-lines` and `--timeout` (positive integers only).
- Document the command and the `sessions.compact` RPC in docs/cli/sessions.md.
Fixes #90640.
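As an illustration of the strict `--max-lines` / `--timeout` validation, here is a minimal sketch. `parsePositiveIntegerOption` is a hypothetical helper, and rejecting leading zeros is an assumption of this sketch rather than something the commit states.

```typescript
// Hypothetical strict parser: accepts only plain positive decimal integers.
// Values such as "0", "-1", "1.5", "50abc", and "" are rejected rather than coerced.
function parsePositiveIntegerOption(flag: string, raw: string): number {
  if (!/^[1-9][0-9]*$/.test(raw)) {
    throw new Error(`${flag} must be a positive integer (received "${raw}")`);
  }
  const value = Number(raw);
  if (!Number.isSafeInteger(value)) {
    throw new Error(`${flag} is too large (received "${raw}")`);
  }
  return value;
}
```

Matching the whole string first avoids the lenient behavior of `parseInt`, which would accept `50abc` as `50`.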
* fix(cli): inherit parent `sessions` options for `compact`
`openclaw sessions compact <key>` did not merge the parent `sessions`
command options the way its sibling subcommands (list/cleanup/info/…) do,
so a parent-level `--agent`/`--json` was silently dropped. In particular
`openclaw sessions --agent work compact <key>` compacted the default
agent's session instead of the work agent's — a wrong-target session-state
mutation.
Merge the parent options in the compact action (parent `--agent`/`--json`,
with the compact-level option taking precedence) and add regression
coverage for parent `--agent`, parent `--json`, and the compact-level
override.
Refs #90640.
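The precedence rule can be sketched as follows. `InheritableSessionsOptions` and `mergeParentOptions` are hypothetical names, and the sketch covers only the two options named above.

```typescript
// Hypothetical shape for the options that `compact` inherits from the parent `sessions` command.
interface InheritableSessionsOptions {
  agent?: string;
  json?: boolean;
}

// The compact-level value wins; the parent value is used only when the subcommand omits it.
function mergeParentOptions(
  parent: InheritableSessionsOptions,
  own: InheritableSessionsOptions,
): InheritableSessionsOptions {
  return {
    agent: own.agent ?? parent.agent,
    json: own.json ?? parent.json,
  };
}
```

With this merge, a parent-level `--agent work` yields `agent: "work"` instead of silently falling back to the default agent.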
* fix(cli): report pending Codex compaction and reject unsupported parent options
Address two ClawSweeper review findings on the `sessions compact` command:
- `sessions-compact.ts`: the Codex app-server `thread/compact/start` path
returns `ok:true / compacted:false` with a pending marker, meaning the
compaction was *started* asynchronously. The formatter collapsed every
non-compacted success into "No compaction needed", so Codex users were told
nothing happened. Report it as a started/pending compaction instead.
- `register.status-health-sessions.ts`: the parent `sessions` command defines
list-only options (`--store`/`--all-agents`/`--active`/`--limit`) that the
compact action previously ignored. Silently dropping a parent `--store` is
dangerous — the gateway resolves the target store itself, so a user could
believe they targeted one store while another is mutated. Reject any
unsupported inherited parent option with a clear error and a non-zero exit.
Add regression tests for the pending-compaction message and the rejected
parent options.
Refs #90640.
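A rough sketch of both findings, under stated assumptions: the `pending` field name and all message strings except "No compaction needed" are hypothetical, as are the function names. The four list-only flags are the ones listed above.

```typescript
// Hypothetical result shape; `pending` stands in for the pending marker mentioned above.
interface CompactResult {
  ok: boolean;
  compacted: boolean;
  pending?: boolean;
}

// Distinguish "started asynchronously" from "nothing to do".
function describeCompactResult(result: CompactResult): string {
  if (!result.ok) return "Compaction failed";
  if (result.compacted) return "Session compacted";
  if (result.pending === true) return "Compaction started (pending)";
  return "No compaction needed";
}

// Parent `sessions` options that only make sense for listing.
const LIST_ONLY_PARENT_FLAGS: readonly string[] = ["--store", "--all-agents", "--active", "--limit"];

// Throws when the user passed any parent option that `compact` cannot honor.
function assertNoUnsupportedParentFlags(parentFlagsSeen: readonly string[]): void {
  const unsupported = parentFlagsSeen.filter((flag) => LIST_ONLY_PARENT_FLAGS.indexOf(flag) !== -1);
  if (unsupported.length > 0) {
    throw new Error(`sessions compact does not support: ${unsupported.join(", ")}`);
  }
}
```

Failing on the unsupported flag, rather than ignoring it, keeps the user from believing a specific store was targeted when the gateway chose a different one.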
* fix(gateway): guard sessions.compact maxLines truncation against active runs
The non-maxLines (LLM) compact branch interrupts an active session run before
compacting, but the maxLines truncate branch read the tail, archived, and
overwrote the transcript in place without that guard. Exposing `--max-lines`
as a documented CLI command (this PR) would make the active-run data-loss mode
tracked by #72765 easy to trigger from ordinary CLI usage.
Run the same interruptSessionRunIfActive guard in the maxLines branch before
reading the tail and truncating, matching the LLM compact path. Add gateway
regression coverage over a real in-process Gateway: with no active run, the
maxLines branch truncates the on-disk transcript 500 -> 50 and preserves the
original 500 lines in the .bak archive; with an active embedded run, the
maxLines branch fires the same interrupt (abort + wait-for-end) before
archiving and truncating.
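The ordering this commit enforces can be sketched in memory. This is a simplification under stated assumptions: `truncateTranscriptTail` is a hypothetical name, the real guard (abort plus wait-for-end) is asynchronous while the callback here is synchronous, and the archive is modeled as an array rather than a `.bak` file.

```typescript
interface TruncateOutcome {
  archived: string[]; // full original transcript, standing in for the .bak archive
  kept: string[]; // tail that remains in the live transcript
}

// Run the active-run guard first, then archive, then truncate to the last `maxLines` lines.
function truncateTranscriptTail(
  lines: readonly string[],
  maxLines: number,
  interruptIfActive: () => void,
): TruncateOutcome {
  if (!Number.isInteger(maxLines) || maxLines <= 0) {
    throw new Error("maxLines must be a positive integer");
  }
  interruptIfActive(); // must happen before the tail is read
  const archived = lines.slice();
  const kept = lines.slice(-maxLines);
  return { archived, kept };
}
```

Calling the guard before reading the tail is the point of the fix: an active run can no longer append to a transcript that is about to be overwritten.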
* docs(cli): move sessions compact section above related links
The new "Compact a session" section was inserted between the cleanup
section's inline "Related:" list and the page's final "## Related"
block, splitting related-link content around the command docs. Move the
compact section above the related-links area and merge the orphaned
"Session config" link into the single final "## Related" block.
* fix(gateway): avoid no-op compact aborts
Signed-off-by: sallyom <somalley@redhat.com>
* fix(gateway): satisfy compact preflight lint
Signed-off-by: sallyom <somalley@redhat.com>
* fix(sessions): preserve compacted transcript structure
---------
Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: sallyom <somalley@redhat.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Reject unsafe numeric Content-Length values in the OpenAI chat tools E2E client before waiting on the response stream.
Also hardens Docker E2E heartbeat timing coverage after the exact-head release gate exposed a brittle zero-padded heartbeat assertion.
Verification: direct mock gateway repro, docker heartbeat shell proof, autoreview clean, and exact-head CI release gate https://github.com/openclaw/openclaw/actions/runs/27843455246.
Keep plugin tool discovery request-local, preserve active provider/channel registries, and carry the prepared registry through MCP and catalog resolution.
Co-authored-by: 郑苏波 (Super Zheng) <superzheng@tencent.com>
Distinguish validated gateway reachability from pre-open and TLS-validation failures, and sanitize close diagnostics before terminal output.
Fixes #79099.
Co-authored-by: xialonglee <li.xialong@xydigit.com>
Clarify that `networkidle` is supported for managed and raw-CDP browser sessions but rejected for existing-session mode.
Fixes #80587.
Co-authored-by: ZengWen-DT <ceng.wen@xydigit.com>
Show elapsed session duration in the status footer using the canonical session lifecycle timestamps and compact formatter.
Fixes #68226.
Co-authored-by: Alix-007 <li.long15@xydigit.com>
Use the watchOS application API for text input, remove simulator-only Debug architecture restrictions, and document the standard Watch bundle location. Refs #92477.
Co-authored-by: Sash Zats <sash@zats.io>
Summary:
- The PR changes isolated cron delivery resolution to reject keyless implicit delivery inherited from the shar ... targets into delivery context resolution, and cleans up direct cron sessions on unresolved delivery exits.
- PR surface: Source +57, Tests +496. Total +553 across 8 files.
- Reproducibility: yes, from source inspection: current resolver can inherit the shared agent-main last target ... ls or sends based on that resolved target; I did not run live Matrix reproduction in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(cron): clean up deleteAfterRun session when keyless cron delivery…
- PR branch already contained follow-up commit before automerge: Merge remote-tracking branch 'upstream/main' into fix/91613-isolated-…
- PR branch already contained follow-up commit before automerge: Merge upstream main into fix/91613-isolated-cron-delivery-identity
- PR branch already contained follow-up commit before automerge: chore: retrigger PR CI after upstream base fix
Validation:
- ClawSweeper review passed for head f129375dd7.
- Required merge gates passed before the squash merge.
Prepared head SHA: f129375dd7
Review: https://github.com/openclaw/openclaw/pull/91685#issuecomment-4659309145
Co-authored-by: nxmxbbd <32288+nxmxbbd@users.noreply.github.com>
Summary:
- The PR changes Telegram legacy HTML rendering so raw HTML table tags are converted to `<pre><code>` pipe-tab ... ks before unsupported-tag escaping, while preserving pre/code literals and rich-message table sanitization.
- PR surface: Source +38, Tests +31. Total +69 across 2 files.
- Reproducibility: yes. Source inspection shows current main's legacy HTML renderer sends raw tables directly ... the linked issue describes that same escaped output; I did not run tests because this review was read-only.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 5944f8e4d2.
- Required merge gates passed before the squash merge.
Prepared head SHA: 5944f8e4d2
Review: https://github.com/openclaw/openclaw/pull/94856#issuecomment-4749452707
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: zhangqueping <3436352+zhangqueping@users.noreply.github.com>
Summary:
- Adds saved CLI startup benchmark report comparison flags to `scripts/bench-cli-startup.ts`, plus JSON output coverage and changed-target routing expectations for the new test-helper importer.
- PR surface: Tests +77, Other +109. Total +186 across 4 files.
- Reproducibility: not applicable, as this is a feature/tooling PR. The prior PR defects were source-proven in review comments and the current head addresses them; I did not run local tests because this review was read-only.
Automerge notes:
- Ran the ClawSweeper repair loop before final review.
- Included post-review commit in the final squash: test(perf): compare saved CLI startup benchmarks
Validation:
- ClawSweeper review passed for head 1afa110f1b.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1afa110f1b
Review: https://github.com/openclaw/openclaw/pull/94812#issuecomment-4748785428
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: Felix Isaac Lim <38658663+FelixIsaac@users.noreply.github.com>
Summary:
- The PR adds provider-internal/server_error classification in reply failure handling and regression tests for classifier output plus pre-reply external-channel copy.
- PR surface: Source +21, Tests +58. Total +79 across 3 files.
- Reproducibility: yes, source-reproducible. Current main sanitizes generic provider internal errors to a stab ... and conversation-state branches, so pre-reply chat failures can fall through to generic session-reset copy.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 8265fc71f3.
- Required merge gates passed before the squash merge.
Prepared head SHA: 8265fc71f3
Review: https://github.com/openclaw/openclaw/pull/94737#issuecomment-4747506983
Co-authored-by: snowzlm <snowzlm@noreply.codeberg.org>
Approved-by: vincentkoc
Summary:
- The PR changes Telegram sendChatAction 401 detection to trust structured Telegram `error_code` values before an unauthorized-text fallback and adds regression tests for false 401 suspension cases.
- PR surface: Source +14, Tests +90. Total +104 across 2 files.
- Reproducibility: yes. Source inspection shows current main and the latest release classify any rendered erro ... before transient handling, matching the linked issue's structured 429 `retry_after=401` reproduction path.
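A minimal sketch of the detection order described in this summary. `isUnauthorizedTelegramError` and the `TelegramErrorLike` shape are hypothetical; `error_code` and `description` are the standard Telegram Bot API error fields, and the fallback regex is an assumption about what "unauthorized-text" matching looks like.

```typescript
// Hypothetical error shape: structured Bot API fields plus a rendered message.
interface TelegramErrorLike {
  error_code?: number;
  description?: string;
  message?: string;
}

// Trust the structured code when present; only fall back to text matching without it.
function isUnauthorizedTelegramError(err: TelegramErrorLike): boolean {
  if (typeof err.error_code === "number") {
    return err.error_code === 401;
  }
  const text = `${err.description ?? ""} ${err.message ?? ""}`;
  return /\b401\b|unauthorized/i.test(text);
}
```

Under this ordering, a 429 whose text happens to contain `retry after 401` is treated as a rate limit, not as a revoked token.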
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 0ffee85d17.
- Required merge gates passed before the squash merge.
Prepared head SHA: 0ffee85d17
Review: https://github.com/openclaw/openclaw/pull/94810#issuecomment-4748778567
Co-authored-by: 徐闻涵0668001344 <xu.wenhan1@xydigit.com>
Approved-by: vincentkoc
Summary:
- The PR expands `src/cron/parse.test.ts` with grouped `parseAbsoluteTimeMs` coverage for epoch, ISO timezone/offset, precision, whitespace, invalid-format, and cron example cases.
- PR surface: Tests +233. Total +233 across 1 file.
- Reproducibility: not applicable: this is a test coverage PR, not a runtime bug report with user steps. Source inspection confirms the requested parser coverage is still added only by this open PR path.
Automerge notes:
- Ran the ClawSweeper repair loop before final review.
- Included post-review commit in the final squash: test(cron): expand parseAbsoluteTimeMs test coverage to 39 cases
Validation:
- ClawSweeper review passed for head 69a49d9512.
- Required merge gates passed before the squash merge.
Prepared head SHA: 69a49d9512
Review: https://github.com/openclaw/openclaw/pull/91656#issuecomment-4657254372
Co-authored-by: 刘江0668001123 <liu.jiang2@xydigit.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
Summary:
- The branch adds a bounded task-registry predicate and tests so successful delegated ACP parent-review comple ... with a Discord channel target and threadId send the parent-review terminal message directly to that thread.
- PR surface: Source +24, Tests +142. Total +166 across 2 files.
- Reproducibility: yes, at source level. Current main queues successful ACP parent-review completions through ... annel/group owner keys, and the linked canonical issue includes matching Discord thread-bound ACP evidence.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 04ad66b23d.
- Required merge gates passed before the squash merge.
Prepared head SHA: 04ad66b23d
Review: https://github.com/openclaw/openclaw/pull/89279#issuecomment-4597994374
Co-authored-by: anyech <anyech@gmail.com>
Summary:
- The branch stamps Gateway chat run registrations and abort markers with ordering metadata, uses freshness checks for chat projection suppression, and updates abort/restart/maintenance tests and related types.
- PR surface: Source +79, Tests +103. Total +182 across 13 files.
- Reproducibility: yes, at source level: on current main, seed abortedRuns for a client run id, register a same-k ... end; the presence-only checks suppress both projections. I did not execute tests in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: ci: re-trigger checks against current main
- PR branch already contained follow-up commit before automerge: Merge upstream/main into stale-abort marker fix
- PR branch already contained follow-up commit before automerge: Merge remote-tracking branch 'upstream/main' into nex/91013-conflict-…
Validation:
- ClawSweeper review passed for head 6f13d6f7c2.
- Required merge gates passed before the squash merge.
Prepared head SHA: 6f13d6f7c2
Review: https://github.com/openclaw/openclaw/pull/91013#issuecomment-4640475472
Co-authored-by: nxmxbbd <32288+nxmxbbd@users.noreply.github.com>
Adds stdout and both-mode diagnostics OTEL log export, with focused QA Lab smoke coverage and docs/config updates.
Prepared head SHA: efa2ef07ab
Verification: CI 27808480969 passed for the prepared head.
Reviewed-by: @jesse-merhi
Summary:
- The PR adds descriptor-backed CLI command suggestions for unknown root commands, wires them into Commander parse errors and early unowned-root diagnostics, and covers both paths with focused CLI tests.
- PR surface: Source +104, Tests +71. Total +175 across 5 files.
- Reproducibility: yes, for the behavior gap: current main's formatter and early unowned-root path emit generic diagnostics without closest-command hints, and the PR proof shows the after-fix CLI output.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: suppress suggestions for plugin policy diagnostics
- PR branch already contained follow-up commit before automerge: Merge remote-tracking branch 'origin/main' into fix/83999-cli-command…
- PR branch already contained follow-up commit before automerge: test: align agent model expectations
- PR branch already contained follow-up commit before automerge: test: restore unrelated agent test fixture
Validation:
- ClawSweeper review passed for head b98f5b59e6.
- Required merge gates passed before the squash merge.
Prepared head SHA: b98f5b59e6
Review: https://github.com/openclaw/openclaw/pull/91345#issuecomment-4646215016
Co-authored-by: Glenn-Agent <glenn_agent@163.com>
Summary:
- The branch replaces iOS notification permission display-string state with a typed SettingsNotificationStatus ... n value, and opens the app notification Settings page with UIApplication.openNotificationSettingsURLString.
- PR surface: Other +51. Total +51 across 5 files.
- Reproducibility: yes. Current main has a source-level reproduction path where the Notifications settings act ... n display strings and opens the general app Settings URL instead of the notification-specific Settings URL.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 1a2fdeeac5.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1a2fdeeac5
Review: https://github.com/openclaw/openclaw/pull/91923#issuecomment-4669439195
Co-authored-by: Sash Zats <sash@zats.io>
Summary:
- The branch replaces Feishu's module-load Axios `handlers` reset with public request-interceptor registration and adds tests that throw on private handler access.
- PR surface: Source +7, Tests +48. Total +55 across 2 files.
- Reproducibility: yes, for the source/dependency boundary: current main still writes `interceptors.request.ha ... l on that access before the production change. No live authenticated Feishu request failure was reproduced.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head b87083193b.
- Required merge gates passed before the squash merge.
Prepared head SHA: b87083193b
Review: https://github.com/openclaw/openclaw/pull/89806#issuecomment-4611809953
Co-authored-by: Cornna <96944678+ymylive@users.noreply.github.com>
Summary:
- The PR wires the macOS Dashboard and Canvas WKWebViews to WKUIDelegate and presents NSOpenPanel for HTML file inputs.
- PR surface: Other +61. Total +61 across 3 files.
- Reproducibility: yes, at source level: current main renders the affected file inputs while the macOS Dashboa ... fore-fix packaged macOS app in this read-only review, but the after-fix screenshots show the real app path.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 4f477c4ed0.
- Required merge gates passed before the squash merge.
Prepared head SHA: 4f477c4ed0
Review: https://github.com/openclaw/openclaw/pull/94612#issuecomment-4743165861
Co-authored-by: bbblending <li.mingkang@xydigit.com>
Summary:
- This PR wraps embedded-agent tool-handler onExecutionPhase and per-run onAgentEvent emissions in best-effort warning guards and adds regression tests for throwing and rejecting callbacks.
- PR surface: Source +31, Tests +44. Total +75 across 2 files.
- Reproducibility: yes. Current main directly invokes the relevant callbacks in the tool-start and tool-event ... sync observer can leak unless guarded; I did not run a failing current-main repro in this read-only review.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 65de17d9e0.
- Required merge gates passed before the squash merge.
Prepared head SHA: 65de17d9e0
Review: https://github.com/openclaw/openclaw/pull/81696#issuecomment-4448200659
Co-authored-by: xuyi1243 <maginaxwhz@gmail.com>
Summary:
- The branch adds a Slack subsystem INFO receipt formatter/logger for accepted non-DM app_mention events before dispatch, plus direct log tests and a test-harness team id.
- PR surface: Source +37, Tests +81. Total +118 across 3 files.
- Reproducibility: yes. from source inspection. Current main and v2026.6.8 route accepted Slack app_mention ev ... andleSlackMessage without a per-inbound INFO receipt, while Telegram emits an inbound line before dispatch.
Automerge notes:
- PR branch already contained follow-up commit before automerge: feat(slack): log INFO receipt for inbound app_mention events
Validation:
- ClawSweeper review passed for head b174201e0a.
- Required merge gates passed before the squash merge.
Prepared head SHA: b174201e0a
Review: https://github.com/openclaw/openclaw/pull/94790#issuecomment-4748509343
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: ZengWen-DT <290981215+ZengWen-DT@users.noreply.github.com>
Summary:
- The PR adds `curl` to the bundled Trello skill's `metadata.openclaw.requires.bins` entry.
- PR surface: Docs 0. Total 0 across 1 file.
- Reproducibility: yes, at source level. Current main and v2026.6.8 declare only `jq` for Trello while the skill body uses `curl`, and the shared requirement evaluator checks only declared bins.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 83ae5e8bef.
- Required merge gates passed before the squash merge.
Prepared head SHA: 83ae5e8bef
Review: https://github.com/openclaw/openclaw/pull/94729#issuecomment-4747397470
Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
Summary:
- The PR updates Codex context projection fitting so non-positive context budgets still return turn/start text within the app-server input cap while preserving the current user request tail.
- PR surface: Source +23, Tests +87. Total +110 across 2 files.
- Reproducibility: yes. Current main is source-reproducible: when `beforeContext.length + afterContext.length ... ll-over-limit text; the linked diagnostic also shows the real Codex app-server rejects that pre-fix string.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 1510a3d13a.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1510a3d13a
Review: https://github.com/openclaw/openclaw/pull/94756#issuecomment-4747889774
Co-authored-by: Anas <anaselghoudane@gmail.com>
Summary:
- The branch adds Vitest coverage for browser action-input CLI request bodies across element, navigation/resize, fill/evaluate, and upload paths, plus blank-ref validation.
- PR surface: Tests +278. Total +278 across 4 files.
- Reproducibility: yes, for a source-level coverage gap: current main exposes the browser action-input command ... isting tests still lack broad success-path request-body assertions. This is not a runtime bug reproduction.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test(browser): cover click-coords action body
Validation:
- ClawSweeper review passed for head c070a8d51b.
- Required merge gates passed before the squash merge.
Prepared head SHA: c070a8d51b
Review: https://github.com/openclaw/openclaw/pull/92574#issuecomment-4697124920
Co-authored-by: Stellar鱼 <2182712990@qq.com>
Co-authored-by: yu-xin-c <2182712990@qq.com>
Summary:
- The PR changes gateway chat-history byte-budget fallback behavior to return a small metadata-free unavailable sentinel instead of an empty transcript, with focused budget tests.
- PR surface: Source +20, Tests +73. Total +93 across 2 files.
- Reproducibility: yes. Source inspection shows current main reaches `messages: []` when the full history, las ... d copied oversized placeholder all exceed `maxBytes`; I did not run tests because this review is read-only.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test: access __openclaw via bracket notation for no-underscore-dangle
Validation:
- ClawSweeper review passed for head f2fa246ab7.
- Required merge gates passed before the squash merge.
Prepared head SHA: f2fa246ab7
Review: https://github.com/openclaw/openclaw/pull/92383#issuecomment-4688688923
Co-authored-by: Hidetsugu55 <183473679+Hidetsugu55@users.noreply.github.com>
Summary:
- The PR reorganizes the Android Settings home rows into titled intent sections and adds ShellScreen logic tests for section title mapping and section ordering.
- PR surface: Other +106. Total +106 across 2 files.
- Reproducibility: not applicable: this is a UI organization cleanup rather than a bug report. The relevant ve ... ion path is the before/after Android emulator screenshot proof plus source comparison against current main.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head da9bf5c5b5.
- Required merge gates passed before the squash merge.
Prepared head SHA: da9bf5c5b5
Review: https://github.com/openclaw/openclaw/pull/94539#issuecomment-4741795253
Co-authored-by: Tosko4 <tosko4@gmail.com>
Summary:
- The PR extends TUI session info to carry `totalTokensFresh`, maps fresh missing totals to `0`, and adds a focused regression test for the footer merge path.
- PR surface: Source +15, Tests +38. Total +53 across 4 files.
- Reproducibility: yes, at source level: `chat.history` returns session info with `totalTokensFresh`, but curr ... `null` before footer formatting. I did not run local tests or a live TUI session in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Merge branch 'main' into fix/followup-93798
Validation:
- ClawSweeper review passed for head 43657b52c8.
- Required merge gates passed before the squash merge.
Prepared head SHA: 43657b52c8
Review: https://github.com/openclaw/openclaw/pull/94337#issuecomment-4737123127
Co-authored-by: 杨浩宇0668001029 <yang.haoyu@xydigit.com>
Co-authored-by: mushuiyu_xydt <yang.haoyu@xydigit.com>
Summary:
- The PR retargets stale generated plugin-skill symlinks when their old target disappeared and adds regression coverage for that case.
- PR surface: Source +11, Tests +17. Total +28 across 2 files.
- Reproducibility: no. No high-confidence current-main failure was run in this read-only review. The linked issue ... ased-build filesystem state and source inspection confirms the runtime publisher path that this PR changes.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Merge remote-tracking branch 'upstream/main' into fix/plugin-skill-st…
- PR branch already contained follow-up commit before automerge: fix(skills): unlink generated plugin skill symlinks
Validation:
- ClawSweeper review passed for head 94a9765735.
- Required merge gates passed before the squash merge.
Prepared head SHA: 94a9765735
Review: https://github.com/openclaw/openclaw/pull/86719#issuecomment-4539047343
Co-authored-by: Steven Palmer <palmer.e.steven@gmail.com>
Summary:
- The PR widens the virtual Clack output columns for wrapped terminal notes and adds a rendered-output regression test for copy-sensitive session-lock paths.
- PR surface: Source +8, Tests +28. Total +36 across 2 files.
- Reproducibility: yes. Current source routes session lock paths through `note()`, and the pinned Clack note renderer hard-wraps final content from `getColumns(output) - 6` after OpenClaw's first wrapping pass.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test(note): add rendered-output regression test for copy-sensitive to…
Validation:
- ClawSweeper review passed for head b17a4ff571.
- Required merge gates passed before the squash merge.
Prepared head SHA: b17a4ff571
Review: https://github.com/openclaw/openclaw/pull/94746#issuecomment-4747714518
Co-authored-by: Dirk <0668000837@xydigit.com>
* fix: default cron runMode to 'due' instead of 'force'
When the runMode parameter is omitted from a cron 'run' action,
the default value now respects schedule guards ('due') instead
of bypassing them ('force'). This prevents unintended execution
of scheduled jobs outside their configured time windows.
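The default can be sketched as below. `resolveRunMode` and `shouldExecute` are hypothetical names for illustration; only the `'due'` and `'force'` values and their meaning come from this commit message.

```typescript
type CronRunMode = "due" | "force";

// An omitted runMode now resolves to "due", so schedule guards apply by default.
function resolveRunMode(requested?: CronRunMode): CronRunMode {
  return requested ?? "due";
}

// "force" bypasses the schedule guard; "due" runs only when the job is actually due.
function shouldExecute(mode: CronRunMode, isDueNow: boolean): boolean {
  return mode === "force" || isDueNow;
}
```

A caller that wants the old behavior has to pass `'force'` explicitly, which makes out-of-window execution an opt-in.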
Fixes #94270
Co-Authored-By: Claude <noreply@anthropic.com>
* test: update runMode expectations for default 'due' (#94270)
* ci: trigger re-evaluation of real behavior proof
* fix(cron): document due-by-default agent runs
Signed-off-by: sallyom <somalley@redhat.com>
---------
Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: sallyom <somalley@redhat.com>
Summary:
- The PR changes the WhatsApp auto-reply first-media failure fallback to resend the saved leading caption chunk and adds a multi-chunk regression test for that failure path.
- PR surface: Source 0, Tests +26. Total +26 across 2 files.
- Reproducibility: yes. Source inspection of current main gives a deterministic path: the first chunk is shift ... fallback shifts `remainingText` again before checking `caption`; this read-only review did not rerun tests.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head b609e44654.
- Required merge gates passed before the squash merge.
Prepared head SHA: b609e44654
Review: https://github.com/openclaw/openclaw/pull/93823#issuecomment-4724923171
Co-authored-by: yetval <yetvald@gmail.com>
Summary:
- The branch changes `formatMessageCliText` to render dry-run message output from `result.dryRun` instead of only `handledBy === "dry-run"`.
- PR surface: Source 0. Total 0 across 1 file.
- Reproducibility: yes, source-reproducible. The linked issue has captured CLI output, and current main shows ... e the formatter still checks `handledBy === "dry-run"`; I did not execute the CLI in this read-only review.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head dce6d6a0d3.
- Required merge gates passed before the squash merge.
Prepared head SHA: dce6d6a0d3
Review: https://github.com/openclaw/openclaw/pull/94684#issuecomment-4746101038
Co-authored-by: lizeyu-xydt <li.zeyu@xydigit.com>
Summary:
- This PR replaces the generated Documentation prompt wording with self-knowledge docs-authority guidance and updates prompt tests plus the system-prompt docs.
- PR surface: Source 0, Tests +27, Docs +6. Total +33 across 4 files.
- Reproducibility: yes. from source for the prompt gap: current main and v2026.6.8 have only broad docs-first ... ledge failure example. I did not run a fresh current-main live model conversation in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: strengthen self-knowledge docs prompt
- PR branch already contained follow-up commit before automerge: test: narrow cli prompt tool assertion
- PR branch already contained follow-up commit before automerge: fix: condense self-knowledge docs prompt
- PR branch already contained follow-up commit before automerge: fix: clarify self-knowledge docs authority
- PR branch already contained follow-up commit before automerge: Merge branch 'main' into sutrah/self-knowledge-docs-prompt
Validation:
- ClawSweeper review passed for head 88a7db5d2a.
- Required merge gates passed before the squash merge.
Prepared head SHA: 88a7db5d2a
Review: https://github.com/openclaw/openclaw/pull/90882#issuecomment-4637990339
Co-authored-by: Sutra Hsing <sutrahsing@163.com>
Co-authored-by: sutra <sutrahsing@163.com>
* Add /name chat command to rename the current session
Adds a `/name <title>` slash command so users can name or rename the
current session directly from any chat channel, instead of only through
the web/admin session manager. This keeps parallel sessions easy to tell
apart from within the chat flow.
Behaviour:
- `/name <title>` sets the session label, reusing the canonical
`parseSessionLabel` validation (trim, non-empty, max 512 chars) and the
same cross-store uniqueness rule enforced by the web `sessions.patch`
path, so chat naming behaves identically to the session manager.
- `/name` with no argument shows the current name plus a locally derived
`deriveSessionTitle` suggestion without mutating anything (no LLM).
- Only authorized senders can rename (rejectUnauthorizedCommand), matching
/goal. The label surfaces everywhere sessions.list is shown (TUI, web,
CLI, MCP).
The handler resolves the session via resolveSessionStoreEntry so renames
land on the canonical entry even when the store still holds a legacy or
case-folded key alias, and excludes those aliases from the uniqueness scan
to avoid false conflicts. Failed renames skip the store write.
Registers the command in commands-registry.shared.ts and the handler in
loadCommandHandlers, documents it in docs/tools/slash-commands.md, and adds
unit tests covering rename, no-arg suggestion, duplicate-label rejection,
unauthorized senders, disabled text commands, and persisted-name re-read.
Part of the chat-native session naming feature (follows the web in-chat
rename PR). Relates to openclaw#85502 and openclaw#54397.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(name): seed native sessions and persist renames via canonical key
Address Codex review on PR #88581:
- Fall back to the in-memory params.sessionEntry when the store has no row
yet, so a brand-new native slash session can be named from its first
/name command instead of failing with 'no active session to name'.
- Persist the rename through resolved.normalizedKey and drop legacy/
case-folded alias keys (mirroring persistResolvedSessionEntry) so the
canonical entry is updated and sessions.list stops surfacing the stale
alias row.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(name): emit session metadata changes
Route successful /name renames through the shared command session metadata seam so subscribed session lists receive sessions.changed like /goal.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(commands): add /name to rename the current session from chat
* fix(docs): document the /name slash command
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Agent <agent@example.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
* fix(skills/1password): stop forcing tmux for desktop app auth (#52540)
The bundled skill currently mandates that every `op` invocation run inside
a fresh tmux session. That guidance is wrong on every desktop-app-integration
setup (macOS/Windows/Linux) because the 1Password app exposes the CLI over
a per-user Unix domain socket the gateway exec env can reach but tmux
subshells generally cannot — wrapping in tmux produces "1Password CLI
couldn't connect to the 1Password desktop app" failures.
Rewrite the skill to detect auth mode first and only use tmux for the one
case where it actually helps:
- Service account (`OP_SERVICE_ACCOUNT_TOKEN`): direct exec, no signin.
- Desktop app integration: direct exec, never tmux. Note the macOS socket
location (`~/Library/Group Containers/2BUA8C4S2C.com.1password/t/`) so
agents can recognize the failure mode.
- Standalone interactive signin: tmux is the right tool because it
preserves the per-shell session token written by `op signin`.
Update Guardrails and the get-started reference accordingly. Drop the
blanket 'do not run op outside tmux' rule.
Fixes #52540
* fix(skills/1password): correct desktop-app IPC wording and signin example
Address PR #75090 review:
- Replace the blanket 'per-user Unix domain socket' description with
per-platform wording: XPC via the 1Password Browser Helper on macOS,
a Unix domain socket on Linux, a named pipe on Windows. Keep the macOS
group-container path as a symptom indicator only, not as a transport
claim. Mirror the same correction in the get-started reference and the
changelog entry.
- Fix the standalone-signin tmux example: `op signin` was being sent as
a plain command, so its eval-style export was printed but never applied.
Subsequent `op whoami` and `op vault list` calls would fail because
the OP_SESSION_* env var was never set. Wrap the call in
`eval "$(op signin ...)"` so the session token is exported into the
tmux pane environment as the surrounding text describes.
Same direct-exec direction; tighter and more accurate.
* docs(1password): clarify Windows standalone signin
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(skills/1password): repair auth-mode guidance
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
* fix(scripts): render auth monitor unit before install
Render the auth monitor service into temporary files instead of editing the tracked template. Quote the generated ExecStart safely, including spaces and literal dollars, then atomically install the rendered unit.
* fix(scripts): avoid mutating tracked auth-monitor template during setup
---------
Co-authored-by: JackWuGlobal <JackWuGlobal@users.noreply.github.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
* fix(slack): stop leaking bot token into /api/auth.test request body
The bot token is already passed as an `Authorization` header,
so we don't need to send it in the request body when calling `/api/auth.test`.
See [Slack API documentation](https://api.slack.com/methods/auth.test).
For example, `curl` confirms that `/api/auth.test` succeeds when the bot token is passed only in the `Authorization` header, with nothing in the request body:
```
curl -X POST https://slack.com:443/api/auth.test -H "Authorization: Bearer xoxb-..."
{"ok":true,"url":"https://xcoulonworkspace.slack.com/","team":"xcoulon",...}
```
Signed-off-by: Xavier Coulon <xcoulon@redhat.com>
* add test for slack auth.test token handling
verify that the bot token is not passed in the request body when calling `/api/auth.test`.
Signed-off-by: Xavier Coulon <xcoulon@redhat.com>
---------
Signed-off-by: Xavier Coulon <xcoulon@redhat.com>
* fix(exec): resume agent turn for native chat exec approvals (issue #93918)
Extend the inline approval-pending path that PR #85239 added for webchat to
every bundled chat channel that ships an `approval-handler.runtime`
adapter (Telegram, Discord, Slack, Signal, WhatsApp, iMessage, Matrix,
Google Chat, QQ Bot, plus webchat). When the originating turn can be
approved in the same chat, the gateway resolves the approval in place and
the agent waits inline for the command output instead of terminating the
run on the "approval-pending" tool result.
Before this fix, native chat approvals landed in the fire-and-forget
`sendExecApprovalFollowup` path. The followup either failed silently
against the agent dispatch and fell through to a direct delivery to the
operator, or never reached the agent at all; either way the model never
saw an "Exec running / Exec finished / Exec denied" event. The operator
had to send a follow-up message to recover the turn, and a new approval
was minted because the original run had already ended.
The change:
- Introduces `NATIVE_APPROVAL_CHANNELS` and `isNativeApprovalChannel`
in `src/utils/message-channel-constants.ts`, listing the channels that
ship a native chat approval client. `webchat` is included so the
single-channel check inside `shouldAwaitGatewayApprovalInline` can
move from "this one id" to "any native approval client".
- Replaces the `INTERNAL_MESSAGE_CHANNEL` equality check in
`shouldAwaitGatewayApprovalInline` with `isNativeApprovalChannel`,
preserving the `approvalFollowupMode` opt-out and the existing
`unavailableReason === null` gate.
- Adds unit tests asserting inline resolution and inline denial for
every native approval channel, plus a regression test that
non-native channels (e.g. feishu) and explicit `approvalFollowupMode`
settings still take the fire-and-forget path.
- Adds a `NATIVE_APPROVAL_CHANNELS` test in
`src/utils/message-channel.test.ts` to lock the membership and the
negative cases.
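A minimal sketch of the gate described above. The channel id spellings and the simplified `ApprovalGateInput` shape are assumptions for illustration; only `NATIVE_APPROVAL_CHANNELS`, `isNativeApprovalChannel`, and `shouldAwaitGatewayApprovalInline` are names taken from this change.

```typescript
// Assumed id spellings; the real list lives in src/utils/message-channel-constants.ts.
const NATIVE_APPROVAL_CHANNELS = [
  "webchat", "telegram", "discord", "slack", "signal",
  "whatsapp", "imessage", "matrix", "googlechat", "qqbot",
] as const;

type NativeApprovalChannel = (typeof NATIVE_APPROVAL_CHANNELS)[number];

function isNativeApprovalChannel(channel: string | undefined): channel is NativeApprovalChannel {
  return channel !== undefined
    && (NATIVE_APPROVAL_CHANNELS as readonly string[]).includes(channel);
}

// Hypothetical, simplified view of the inputs the real gate inspects.
interface ApprovalGateInput {
  channel?: string;
  approvalFollowupMode?: string; // any explicit setting opts out of inline waiting
  unavailableReason: string | null;
}

function shouldAwaitGatewayApprovalInline(input: ApprovalGateInput): boolean {
  if (input.approvalFollowupMode !== undefined) return false; // opt-out preserved
  if (input.unavailableReason !== null) return false;         // existing gate preserved
  return isNativeApprovalChannel(input.channel);              // was: channel === webchat only
}
```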
Refs https://github.com/openclaw/openclaw/issues/93918
* fix(lint): restore InternalMessageChannel type export lost during rebase
Rebase on upstream/main dropped the InternalMessageChannel type alias
from message-channel-constants.ts, breaking the plugin-sdk boundary
.dts check ('has no exported member named InternalMessageChannel').
message-channel.ts was also re-importing the type only to re-export
it, triggering the oxlint no-unused-vars rule.
- Re-add 'export type InternalMessageChannel = typeof INTERNAL_MESSAGE_CHANNEL'
in message-channel-constants.ts so the public re-export is valid.
- Drop the redundant 'type InternalMessageChannel' from the local
import in message-channel.ts; the value-side import is what the
file body actually needs.
* test(exec): align native approval routing expectations
* fix(openai-embedding): preserve openai/ prefix for non-native base URLs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(openai-embedding): normalize model before maxInputTokens lookup so qualified models retain token cap
* fix(openai-embedding): use semantic hostname check for native OpenAI URL detection
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Stabilize timeout-sensitive hosted QA by removing wedged synthetic Codex run-attempt integrations while keeping narrower dynamic-tool and thread-start coverage. Refresh root and Discord shrinkwraps after the current undici security floor landed on main.
Proof: git diff --check origin/main...HEAD; node scripts/pre-commit/pnpm-audit-prod.mjs --audit-level=high; node scripts/generate-npm-shrinkwrap.mjs --all --check; CI run 27774571070; Plugin Prerelease run 27774571273.
The lane timeout sliding window expires during long-running tool
execution (e.g. exec commands >5min) because noteLaneTaskProgress()
is never called during tool execution. Add a periodic 30s interval
that calls noteLaneTaskProgress() while the embedded attempt runs,
keeping the lane alive until the attempt completes.
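A rough sketch of the keepalive pattern, assuming a hypothetical `withLaneKeepalive` wrapper; the real call site and the signature of `noteLaneTaskProgress()` are not shown in this message, so the progress callback is passed in as a parameter.

```typescript
// Hypothetical wrapper: runs `attempt` and pings `noteProgress` on a fixed interval
// so a sliding-window timeout keeps seeing activity during long tool calls.
async function withLaneKeepalive<T>(
  attempt: () => Promise<T>,
  noteProgress: () => void,
  intervalMs = 30_000,
): Promise<T> {
  const timer = setInterval(noteProgress, intervalMs);
  try {
    return await attempt();
  } finally {
    // Stop pinging once the attempt settles, whether it resolved or threw.
    clearInterval(timer);
  }
}
```

Clearing the timer in `finally` matters: a leaked interval would keep a finished lane looking alive.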
Closes: openclaw/openclaw#94033
Co-Authored-By: Claude <noreply@anthropic.com>
Address ClawSweeper P1 (Carry the effective git channel into finalize):
an unconfigured git/source update runs the core update on the git/dev channel
(runGatewayUpdate: opts.channel ?? "dev"), but the finalizer received no channel
and fell back to the stable package channel, so plugin convergence could resolve
official plugins on the wrong channel.
Mirror the CLI post-core resume's effective/requested channel split: the RPC
finalize path now passes the effective channel (configChannel ?? DEFAULT_GIT_CHANNEL)
to update finalize via OPENCLAW_UPDATE_EFFECTIVE_CHANNEL (convergence-only), never
as --channel. update finalize uses it as a convergence fallback but never persists
update.channel unless the user actually requested one.
Codex follow-up: defaulting the channel to dev and passing --channel made
update finalize persist update.channel into openclaw.json (persistRequestedUpdateChannel
treats any --channel as an explicit request). Only forward --channel when the
caller has a configured channel so the finalizer never writes a channel the user
did not request; when omitted it converges on the stored/default channel and the
reconcile still resolves a host-compatible version. Keeps the per-step vs
whole-process timeout decoupling.
Address codex PR-review findings:
- Default the post-core finalize channel to the git/dev channel (matching
runGatewayUpdate's git default) instead of letting update finalize fall back to
the stable package channel, so official plugins converge on the same channel as
the core update for default source updates.
- Decouple the finalizer's whole-process spawn timeout from the per-step
--timeout so a valid multi-step finalize is not killed prematurely and falsely
reported as post-core-plugin-finalize-failed.
- Strip gateway service identity (OPENCLAW_SERVICE_MARKER/KIND/PID) from the
finalizer child so it is not mistaken for the managed service, matching the
CLI post-core spawn.
- Skip finalize for no-op git updates (unchanged SHA and version), mirroring the
CLI resume gate, to avoid an unnecessary doctor/convergence run.
The gateway update.run RPC updated git/source installs via runGatewayUpdate
but, unlike the openclaw update CLI, never resumed the post-core plugin
convergence that runGatewayUpdate's doctor pass defers. As a result a
git/source core update would restart on the new core with official managed
plugins still pinned to versions built against removed core APIs.
Spawn the rebuilt binary's update finalize entrypoint after a successful
git update so official plugins reconcile to a host-compatible version, and
block the restart if convergence fails (mirroring the CLI).
Local extension before_tool_call/after_tool_call hooks registered but
never fired after a scoped mid-run plugin activation (harness or memory
ensure) rebound the global hook runner to a narrow registry, dropping
hooks unique to the broader registry (#91918).
The runner is now created once and resolves hooks live on every dispatch
from the composed set of currently-live registries (the most recently
initialized registry, the active registry, and the pinned channel and
http-route surfaces) instead of freezing one registry. The loader's
one-shot preserve gate is removed since activation order no longer
matters. Per-plugin ownership prefers loaded records so a failed scoped
reload cannot shadow a healthy pinned registration (including a
fail-closed tool-call gate), and the explicitly initialized registry
stays highest precedence so SDK callers keep an authoritative registry.
Reuses the live-registry collector the agent-event bridge already uses
so both dispatch surfaces agree on what is live.
One-time maintainer-authorized bootstrap merge for the release-gate verifier policy. Exact hosted CI and all supporting workflow gates passed on 66133de419.
When ensureAgentWorkspace is called for an already-configured workspace
(setupCompletedAt is set), skip creating optional bootstrap files
(SOUL.md, USER.md, IDENTITY.md, HEARTBEAT.md) at the root level.
This prevents subagent spawns from recreating root-level optional
bootstrap markdown files in repository workspaces where these files
were removed intentionally or only exist under agent-specific
subdirectories (e.g., main/).
Fixes #83593
* fix(session): prevent stale finalizer from recreating deleted session rows
After sessions.delete removes a session row, updateSessionStoreAfterAgentRun
could still recreate it via the fallbackEntry path in patchSessionEntry when
preserveUserFacingRunState was false. Changed the guard from only checking
preserveUserFacingRunState to checking whether the session key exists in the
in-memory store but not on disk — indicating the session was intentionally
deleted mid-run.
Fixes #40840
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(session): cover deleted session finalizer fence
* fix(session): fence post-run writes after deletion
* fix(session): guard post-run transcript persistence
* fix(session): fence metadata after session reset
---------
Co-authored-by: Peter Lee <22994703+xialonglee@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Mount the configured package Telegram output directory into the Docker runtime and pass the container path to the harness, avoiding host `/home/runner` paths inside Docker.
Proof:
- pnpm test test/scripts/npm-telegram-live.test.ts
- git diff --check
- https://github.com/openclaw/openclaw/actions/runs/27685093647
Set TMPDIR=/tmp inside the package Telegram Docker runner so runtime scratch files are written to a writable container path.
Proof:
- pnpm test test/scripts/npm-telegram-live.test.ts
- git diff --check
Compaction summarization consumes the model stream via result() only (no
iteration), so it never emitted model.call diagnostic spans. Observe the
stream's result() in the diagnostic wrapper and wire the wrapper into the
direct compaction path so these LLM calls are traced (request/response
content, byte accounting, traceparent).
Decouple underlying-iterator cleanup from terminal-event dedup. The agent
loop awaits result() on the terminal event then abandons the iterator, so
once result() also emits the terminal event, gating safeReturnIterator on
terminalEventEmitted skipped provider cleanup (idle-timeout abort listeners
on the long-lived run signal, SSE readers). Track iterator settlement
separately so return() cleanup always runs; emit dedup stays on
terminalEventEmitted.
Parent compaction model-call spans to the active run/harness trace rather
than a phantom child trace that emits no span of its own.
Clarify that `openclaw mcp list`, `show`, `set`, and `unset` manage the OpenClaw `mcp.servers` registry and do not include the separate mcporter registry.
Co-authored-by: Alix-007 <li.long15@xydigit.com>
The generic dmPolicy/allowFrom warning read only the canonical top-level
allowFrom, so channels that keep their wildcard under the legacy dm.allowFrom
alias (e.g. Discord/Slack, mode=topOnly/topOrNested) got a false 'all DMs
dropped' warning even though runtime honors dm.allowFrom. Resolve policy and
allowFrom through the shared resolveChannelDm* helpers with the channel's
dmAllowFromMode (matching runtime and doctor), and skip nestedOnly channels
whose canonical fields live under dm.* and do not match this warning's
top-level paths. Adds a Discord legacy-alias regression test.
Addresses ClawSweeper review finding P1 (false positives on legacy dm.allowFrom).
Replace the hardcoded Mattermost-only open-DM config check with a generic,
plugin-agnostic warning driven by a single shared evaluator
(evaluateDmPolicyAllowFromDependency) reused by the Zod refinements and the
CLI validator. Surface warnings at 'config validate' and on config load.
Remove the Mattermost-specific status-issues module now covered generically;
keep the runtime drop-log diagnostic.
* fix(feishu): fetch quoted content before empty-message guard
Moves the quoted/replied message content fetching before the empty-message
early return so a reply with only @bot mention (no text, no media) is not
dropped when it quotes a message with meaningful content. The guard now also
checks that quoted text is empty before skipping.
Note: because the fetch is now unconditional on parentId after passing the
group admission/mention gate, an empty-text reply that quotes a parent in an
open group (requireMention: false) without mentioning the bot will now be
dispatched, where before it was dropped. This is the intended behavior for
open groups — any non-empty turn (including one where context comes from a
quote) should reach the agent. For requireMention:true groups, unmentioned
messages still exit at the mention gate before the fetch, so no over-fetch
occurs.
Adds group-based regression tests for the #90177 scenario:
- Positive: mention-only reply in requireMention:true group with quoted
parent — dispatches with [Replying to: "..."] in the body.
- Negative: empty reply with no bot mention in requireMention:true group —
getMessageFeishu is never called and nothing is dispatched.
* fix(feishu): fetch quoted content before empty-message guard (#90192) (thanks @bladin)
---------
Co-authored-by: 黑承亮0668000844 <bladin@users.noreply.github.com>
Co-authored-by: sliverp <870080352@qq.com>
* fix(cron): reject invalid absolute timestamps
* fix(cron): preserve ISO end of day
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(reasoning-tags): accept MiniMax mm: prefix in silent-detection and stream gates
PR #93767 added MiniMax `mm:`-namespaced reasoning-tag support across the
shared sanitizer and Telegram lane coordinator, but two production reasoning-tag
recognizers were missed and still only matched the `antml:` namespace:
- src/auto-reply/tokens.ts: `taggedReasoningPrefixRe` / `openReasoningPrefixRe`
drive `stripLeadingReasoningBlocks` and `isSilentReplyPayloadText`, which 14+
call sites use to detect NO_REPLY silent payloads. A `<mm:think>…</mm:think>NO_REPLY`
reply was not recognized as silent, leaking the wrapper into delivery.
- src/agents/embedded-agent-subscribe.handlers.messages.ts: `REASONING_TAG_RE`
gates `shouldRecomputeFullStream`. A `<mm:think>` streaming chunk failed the
test, so the visible stream was not recomputed and the hidden reasoning leaked.
Add the `mm:` alternative alongside `antml:` in all three regexes, matching the
exact `(?:antml:|mm:)?` form used by #93767. Identification-only change, no other
regex logic touched.
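To illustrate the `(?:antml:|mm:)?` form, here is a simplified sketch. The regex and function bodies are stand-ins written for this note, not the production `taggedReasoningPrefixRe` or `REASONING_TAG_RE`; only the optional-namespace group and the function names come from the text above.

```typescript
// Simplified stand-in: one leading reasoning block with an optional namespace prefix.
const LEADING_REASONING_BLOCK_RE =
  /^\s*<(?:antml:|mm:)?(think|thinking|thought)>[\s\S]*?<\/(?:antml:|mm:)?\1>\s*/i;

function stripLeadingReasoningBlocks(text: string): string {
  let current = text;
  // Peel off consecutive leading blocks until visible text starts.
  while (LEADING_REASONING_BLOCK_RE.test(current)) {
    current = current.replace(LEADING_REASONING_BLOCK_RE, "");
  }
  return current;
}

function isSilentReplyPayloadText(text: string): boolean {
  // A reply is silent when only NO_REPLY remains after the reasoning wrapper.
  return stripLeadingReasoningBlocks(text).trim() === "NO_REPLY";
}
```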
* test(agents): cover MiniMax reasoning regressions
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(browser): use openTab return value to prevent wsUrl race in ensureTabAvailable
When ensureTabAvailable opens a new tab on empty list, the return value
from openTab was discarded. A subsequent listTabs() call may return tabs
without webSocketDebuggerUrl populated yet, causing the wsUrl filter to
eliminate the newly opened tab and throw BrowserTabNotFoundError.
Fix: capture openTab's return value and merge it into candidates if the
wsUrl filter excluded it. openTab's internal discovery loop already
resolves wsUrl, so the returned tab is always valid.
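A minimal sketch of the merge step, assuming a hypothetical `BrowserTab` shape and `selectCandidates` helper; the real `ensureTabAvailable` has more steps than shown here.

```typescript
// Hypothetical tab shape for illustration.
interface BrowserTab {
  targetId: string;
  wsUrl?: string;
}

// Keep listed tabs that already expose a wsUrl, then re-add the freshly opened
// tab if the filter dropped it because listTabs() had not populated wsUrl yet.
function selectCandidates(listed: BrowserTab[], opened?: BrowserTab): BrowserTab[] {
  const candidates = listed.filter((tab) => Boolean(tab.wsUrl));
  if (opened && !candidates.some((tab) => tab.targetId === opened.targetId)) {
    candidates.push(opened);
  }
  return candidates;
}
```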
* fix(browser): harden tab selection discovery
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(feishu): paginate wiki node and space listing (fixes #37626)
client.wiki.spaceNode.list / wiki.space.list return at most one page (max
50 items); the tool ignored has_more/page_token and silently dropped every
node past the first page. Drain both endpoints via a bounded shared helper
that loops on has_more with a 100-page safety cap.
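A sketch of a bounded drain loop under assumed names: `Page`, `drainPages`, and the `fetchPage` callback are hypothetical, and the response is flattened to `items`/`has_more`/`page_token` rather than the SDK's actual envelope.

```typescript
// Hypothetical flattened page shape.
interface Page<T> {
  items: T[];
  has_more: boolean;
  page_token?: string;
}

const MAX_PAGES = 100; // safety cap so a misbehaving cursor cannot loop forever

async function drainPages<T>(
  fetchPage: (pageToken?: string) => Promise<Page<T>>,
  maxPages: number = MAX_PAGES,
): Promise<T[]> {
  const all: T[] = [];
  let pageToken: string | undefined;
  for (let page = 0; page < maxPages; page += 1) {
    const res = await fetchPage(pageToken);
    all.push(...res.items);
    // Stop when the server reports no more pages or gives no cursor to follow.
    if (!res.has_more || !res.page_token) break;
    pageToken = res.page_token;
  }
  return all;
}
```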
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(feishu): expose wiki pagination cursors
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
The previous `read_text(encoding="utf-8")` call left the UTF-8 byte
order mark (EF BB BF, three bytes) in the content string if the file
was saved by a tool that emits a BOM. The first line check
(`lines[0].strip() != "---"`) then saw "\ufeff---" and rejected the
file as "Invalid frontmatter format", even though the document was
otherwise valid frontmatter.
Co-authored-by: Zo Bot <github-automation@zo.computer>
When more than maxMissedJobsPerRestart cron jobs are overdue after gateway
downtime, runMissedJobs defers the overflow jobs to a near-future staggered
catch-up slot. start()'s second maintenance pass then recomputed each overflow
cron deferral to its natural schedule slot, because it ran future-slot repair
with the default-enabled flag. For a daily 0 9 * * * job the now+stagger
catch-up was clobbered to the next 09:00, dropping the missed run for a full
period.
Scope the exemption instead of disabling repair wholesale: runMissedJobs now
returns the ids it deferred this startup, recomputeNextRunsForMaintenance gains
skipFutureRepairJobIds to exempt exactly those ids, and start() threads them
into its pass. Overflow catch-up deferrals survive until their staggered tick
while ordinary stale-future cron slots are still repaired on startup.
* fix(status): show 0/1.0m instead of ?/1.0m on a fresh session
On a brand-new /new session the persisted totalTokens is absent
(undefined), so /status rendered the context numerator as ? via
formatTokens(null, ...). A fresh session with no usage is a known
zero, not an unknown total, so normalize undefined-but-not-stale
totals to 0 before formatting while leaving the intentional
totalTokensFresh === false stale guard (which must keep ?) intact.
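A minimal sketch of the normalization, using hypothetical helper names (`resolveDisplayTotalTokens`, `formatContextUsage`) rather than the real `/status` formatter:

```typescript
// null means "unknown" and renders as "?"; a number renders as-is.
function resolveDisplayTotalTokens(
  totalTokens: number | undefined,
  totalTokensFresh: boolean | undefined,
): number | null {
  if (totalTokensFresh === false) return null; // stale total: keep "?"
  return totalTokens ?? 0; // fresh session with no usage is a known zero
}

function formatContextUsage(total: number | null, limitLabel: string): string {
  return `${total === null ? "?" : String(total)}/${limitLabel}`;
}
```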
Fixes #93771
* fix(status): persist fresh-session zero usage
* fix(status): identify fresh empty sessions
* fix(status): persist fresh empty session usage
* fix(status): preserve fork and compaction token state
* fix(status): preserve queued compaction token state
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(google): keep parallel Gemini tool responses in the turn after the model
On Gemini < 3 vision models, a parallel tool-call turn whose non-last result
returns an image split function responses across user turns. The merge heuristic
only inspected contents[last], so the separate "Tool result image:" turn landed
between two parallel responses and stranded the second one in a fresh turn. The
turn right after the model then carried fewer functionResponse parts than the
model issued functionCall parts, so Gemini returned 400 INVALID_ARGUMENT. Because
the malformed turn is persisted, every later turn re-400s and the session sticks.
Replace the contents[last] heuristic with a run-scoped accumulator: all responses
for one model turn merge into the single user turn after it, and Gemini < 3 image
turns defer to the end of the tool-result run so they trail that response turn.
Covers both google.ts and google-vertex.ts, which share this convertMessages.
* fix(google): align provider transport tool result turns
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(memory): await search-sync before returning results to prevent stale index
When the gateway process has been running for a while, memory_search
returns stale results because startAsyncSearchSync fires off the index
sync as a background task (void ... .catch()) without waiting for it
to complete. Search results are then read from the old index state.
Change startAsyncSearchSync from sync/fire-and-forget to async/await
so that the index is synced before search results are returned. This
ensures memory_search reflects the current filesystem state, matching
the behavior of the CLI command, which creates a fresh manager each time.
Fixes #52115
* test(memory): prove search waits for dirty sync
* test(memory): align search with synchronous sync
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Treat refreshable manifest catalog rows as non-authoritative and load the owning plugin for runtime/cache-backed discovery. Adds focused regression coverage for entries-only and full discovery paths.
* fix(feishu): recover CJK filenames from JSON file_name field
Apply recoverUtf8FileNameFromLatin1Header to JSON-derived filenames in
extractFeishuDownloadMetadata, matching the behavior already present for
Content-Disposition headers in decodeDispositionFileName.
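For context, a sketch of the general Latin-1-to-UTF-8 recovery technique. `recoverUtf8FromLatin1` is a hypothetical stand-in written for this note; the behavior of the real `recoverUtf8FileNameFromLatin1Header` may differ in its details.

```typescript
// If every code unit fits in one byte, the string may be UTF-8 bytes that were
// decoded as Latin-1. Re-interpret those bytes as UTF-8; fall back on failure.
function recoverUtf8FromLatin1(name: string): string {
  for (let i = 0; i < name.length; i += 1) {
    if (name.charCodeAt(i) > 0xff) return name; // already real Unicode text
  }
  const bytes = Uint8Array.from(name, (ch) => ch.charCodeAt(0));
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return name; // not valid UTF-8, so it was probably genuine Latin-1
  }
}
```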
Fixes #81103
* fix(feishu): recover inbound CJK filenames
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(reasoning-tags): strip MiniMax `mm:` namespaced reasoning tags
MiniMax M3 (e.g. via Fireworks) emits its chain-of-thought inline in the
content stream wrapped in `<mm:think>…</mm:think>` rather than in a separate
`reasoning_content` field. The reasoning-tag stripper only recognized the
`antml:` namespace, so `mm:`-namespaced tags slipped through QUICK_TAG_RE and
leaked the model's hidden reasoning into visible chat output.
Accept the `mm:` prefix alongside `antml:` in the shared sanitizer
(reasoning-tags.ts) and in the Telegram reasoning-lane coordinator's tag regex
and prefix list. Adds unit tests covering mm: think/thinking/thought blocks,
truncated-open orphan close recovery, and code-fence preservation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(reasoning): handle MiniMax tags in streams
---------
Co-authored-by: DrHack1 <DrHack1@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* feat(inbound-meta): expose message_type in trusted inbound metadata (fixes #50482)
Add resolveInboundMessageType() that extracts the media type prefix
(e.g. 'audio' from 'audio/ogg') from MediaType or MediaTypes fields.
Expose it as message_type in the inbound metadata JSON so agents can
distinguish voice messages from typed text for turn-completion heuristics.
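A small sketch of the prefix extraction. The `InboundMediaFields` shape, the fallback to the first `MediaTypes` entry, and the lowercasing are assumptions; only the function name and the `'audio'` from `'audio/ogg'` behavior come from the description.

```typescript
// Assumed context shape; the real inbound context has many more fields.
interface InboundMediaFields {
  MediaType?: string;
  MediaTypes?: string[];
}

function resolveInboundMessageType(ctx: InboundMediaFields): string | undefined {
  // Assumption: prefer the single MediaType, else the first MediaTypes entry.
  const mediaType = ctx.MediaType ?? ctx.MediaTypes?.[0];
  const prefix = mediaType?.split("/")[0]?.trim().toLowerCase();
  return prefix ? prefix : undefined;
}
```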
* fix(inbound-meta): preserve per-turn source modality
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* feat(memory): apply outputDimensionality truncation to local GGUF embeddings
The outputDimensionality config field was passed through to the local
embedding provider but never applied. Local GGUF models (e.g.
Qwen3-Embedding-0.6B) always returned their full dimension vector.
Apply slice(0, N) after normalization so MRL-capable models can benefit
from dimension truncation — matching the behavior already supported by
Gemini embedding-2 and OpenAI providers.
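A minimal sketch of "normalize, then `slice(0, N)`", with hypothetical helper names. Note that the truncated vector is no longer unit length; whether the provider renormalizes afterwards is not stated here, so this sketch does not.

```typescript
// L2-normalize a vector; a zero vector is returned unchanged (as a copy).
function l2Normalize(vector: number[]): number[] {
  const norm = Math.hypot(...vector);
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm);
}

// Hypothetical helper: apply the configured outputDimensionality, if any.
function applyOutputDimensionality(embedding: number[], outputDimensionality?: number): number[] {
  const normalized = l2Normalize(embedding);
  if (!outputDimensionality || outputDimensionality >= normalized.length) {
    return normalized; // nothing to truncate
  }
  return normalized.slice(0, outputDimensionality);
}
```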
Fixes #58765
* fix(memory): preserve local embedding dimensions through worker
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
The isToolDocBlockStart function checked normalized === normalized.toUpperCase()
but normalized is already uppercased from line 24, making the condition always true.
This caused mixed-case lines ending with ':' to be incorrectly detected as doc block
starts, truncating tool descriptions unnecessarily.
Compare the original line instead to correctly detect all-uppercase headings.
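A simplified sketch of the corrected check. The body is a stand-in, not the real `isToolDocBlockStart`, and the "must contain a letter" guard is an assumption added so that a bare `:` does not count as a heading.

```typescript
function isToolDocBlockStart(line: string): boolean {
  const trimmed = line.trim();
  if (!trimmed.endsWith(":")) return false;
  // Compare the original text. Comparing an already-uppercased copy with its
  // own toUpperCase() is always true, which was the bug.
  return trimmed === trimmed.toUpperCase() && /[A-Z]/.test(trimmed);
}
```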
Co-authored-by: Gautam Kumar <gautamkumarofficial@users.noreply.github.com>
* fix(usage): reject invalid explicit dates in usage RPC date parsing
usage.cost and sessions.usage accepted shape-valid but impossible dates such as 2026-02-30: parseDateParts validated only the YYYY-MM-DD regex, so Date.* silently rolled them over (2026-02-30 -> 2026-03-02) and the RPC returned cost/usage for the wrong day. Out-of-range parts now fail a UTC round-trip check, and an explicitly provided unparseable date (bad format or impossible calendar date) returns INVALID_REQUEST instead of silently falling back to the default range. Absent/valid dates are unchanged.
[AI-assisted]
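The UTC round-trip check can be sketched as follows. This is an illustrative stand-in that reuses the name `parseDateParts`; the real return shape and error mapping to INVALID_REQUEST are not shown in this message.

```typescript
interface DateParts {
  year: number;
  month: number; // 1-12
  day: number;
}

function parseDateParts(value: string): DateParts | undefined {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) return undefined; // bad format
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Date.UTC silently rolls impossible dates over, so read the parts back
  // and reject the input if any of them changed.
  const utc = new Date(Date.UTC(year, month - 1, day));
  if (
    utc.getUTCFullYear() !== year ||
    utc.getUTCMonth() !== month - 1 ||
    utc.getUTCDate() !== day
  ) {
    return undefined;
  }
  return { year, month, day };
}
```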
* fix(usage): reject non-string explicit dates
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Duplicate user-message detection ran over the full branch, so when a prompt
was re-sent within the 60s window its earlier copy in the summarized prefix
and the later copy in the kept tail were both removed: the summarized copy via
summarizedBranchIds and the kept copy as a duplicate. With
truncateAfterCompaction enabled the prompt then vanished from the successor
transcript entirely. Restrict dedup to the kept region so the first surviving
copy is preserved.
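A rough sketch of dedup restricted to the kept region, under the assumption that duplicates are matched by identical text inside a 60s window. The `UserMessage` shape and `dedupeKeptMessages` name are hypothetical:

```typescript
// Hypothetical message shape for illustration.
interface UserMessage {
  id: string;
  text: string;
  timestampMs: number;
}

/** Drop repeats within `windowMs`, looking only at the kept tail. */
function dedupeKeptMessages(kept: UserMessage[], windowMs = 60_000): UserMessage[] {
  const firstSeenAt = new Map<string, number>();
  const result: UserMessage[] = [];
  for (const message of kept) {
    const previous = firstSeenAt.get(message.text);
    if (previous !== undefined && message.timestampMs - previous <= windowMs) {
      continue; // duplicate of an earlier kept copy
    }
    firstSeenAt.set(message.text, message.timestampMs);
    result.push(message);
  }
  return result;
}
```

Because the summarized prefix is never passed in, a prompt whose earlier copy was summarized still survives once in the kept tail.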
* fix: pin plugin workspace dir for sessions.list to avoid O(rows) memo busting
sessions.list was O(rows) slow under concurrent agent/cron load because
each row read a process-global active plugin-registry workspace dir
that was mutated by other turns between rows. The per-row memo key
changed every time, so loadPluginMetadataSnapshot scanned fresh
(~100ms per row).
Fix:
1. Add AsyncLocalStorage-based workspace dir pinning to
runtime-workspace-state.ts — withPinnedActivePluginRegistryWorkspaceDir()
snapshots the current workspace dir for the duration of a callback.
2. Wrap listSessionsFromStoreAsync body in the pin so all per-row
metadata lookups use a stable memo key.
Fixes #90814
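The pinning idea can be sketched with Node's `AsyncLocalStorage`. All names below (`setGlobalWorkspaceDir`, `getActiveWorkspaceDir`, `withPinnedWorkspaceDir`) are hypothetical simplifications and not the code in runtime-workspace-state.ts:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Process-global value that other turns may mutate at any time (illustrative).
let globalWorkspaceDir: string | undefined = "/ws/a";

const pinnedWorkspace = new AsyncLocalStorage<{ workspaceDir: string | undefined }>();

function setGlobalWorkspaceDir(dir: string | undefined): void {
  globalWorkspaceDir = dir;
}

/** Prefer the pinned snapshot when one is active for this call chain. */
function getActiveWorkspaceDir(): string | undefined {
  const store = pinnedWorkspace.getStore();
  return store ? store.workspaceDir : globalWorkspaceDir;
}

/** Snapshot the current workspace dir for the duration of `fn`. */
function withPinnedWorkspaceDir<T>(fn: () => T): T {
  return pinnedWorkspace.run({ workspaceDir: getActiveWorkspaceDir() }, fn);
}
```

Inside the callback every lookup sees the same directory, so a memo key derived from it stays stable even if another turn changes the global value between rows.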
* test(plugins): cover request-scoped workspace pins
* fix(plugins): pin canonical runtime workspace reads
* fix(plugins): preserve workspace pins across reloads
---------
Co-authored-by: lsr911 <lsr911@github.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(ollama): preserve configured API during discovery
* fix(ollama): keep compatible discovery base URL
* fix(ollama): route compatible APIs through configured transport
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix: scope assistant avatar override to agent ID
The local assistant avatar override was stored globally in
localStorage without an agentId, causing the same avatar to
apply to all agents. Setting an avatar for agent A would
overwrite the avatar for agent B.
Fix: include agentId when saving the local avatar override,
and filter by agentId when loading. An override saved for one
agent no longer bleeds into other agents.
Fixes #90890
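A small sketch of the save and load filter described above. The storage key, record shape, and function names are assumptions for illustration; the `KeyValueStore` interface stands in for the `getItem`/`setItem` part of `localStorage`:

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const AVATAR_OVERRIDE_KEY = "assistantAvatarOverride"; // hypothetical key

/** Persist the override together with the agent it belongs to. */
function saveAvatarOverride(store: KeyValueStore, agentId: string, avatar: string): void {
  store.setItem(AVATAR_OVERRIDE_KEY, JSON.stringify({ agentId, avatar }));
}

/** Return the override only when it was saved for this agent. */
function loadAvatarOverride(store: KeyValueStore, agentId: string): string | undefined {
  const raw = store.getItem(AVATAR_OVERRIDE_KEY);
  if (raw === null) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const record = parsed as { agentId?: unknown; avatar?: unknown };
    return record.agentId === agentId && typeof record.avatar === "string"
      ? record.avatar
      : undefined;
  } catch {
    return undefined; // corrupt or legacy value
  }
}
```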
* fix(ui): persist assistant avatars per agent
* fix(ui): satisfy scoped avatar checks
---------
Co-authored-by: lsr911 <lsr911@github.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
The 'can preserve asynchronous provider model discovery' test was
flaky because resolveModelAsyncMock in beforeEach delegates to
resolveModelMock. When useAsyncModelResolution=true, the test
asserted resolveModelMock was not called, but the delegation
caused it to be called, failing CI on two lanes.
Fix: use a standalone vi.fn() for the async resolver in this
test, and explicitly reset resolveModelMock before the assertion
to guard against mock state leakage from prior tests.
Fixes #92117
Co-authored-by: lsr911 <lsr911@github.com>
MiniMax TTS API returns HTTP 200 even on quota/billing errors, with the
error encoded in base_resp.status_code. Without this check, placeholder
audio returned alongside the error is silently accepted, preventing the
TTS dispatcher from falling back to a configured secondary provider.
This follows the same pattern used by all other MiniMax providers:
- image-generation-provider.ts
- video-generation-provider.ts
- music-generation-provider.ts
- minimax-web-search-provider.runtime.ts
Fixes #76904
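A minimal sketch of the body-level check, assuming that `base_resp.status_code` is `0` on success and that the error text sits in `status_msg`. The function name and error wording are hypothetical:

```typescript
// Assumed response envelope; only the fields used here are modeled.
interface MiniMaxBaseResp {
  status_code?: number;
  status_msg?: string;
}

/** Throw when an HTTP 200 body still carries a provider-level error. */
function assertMiniMaxBaseRespOk(body: { base_resp?: MiniMaxBaseResp }): void {
  const code = body.base_resp?.status_code;
  if (typeof code === "number" && code !== 0) {
    const detail = body.base_resp?.status_msg ?? "unknown error";
    throw new Error(`MiniMax TTS error ${code}: ${detail}`);
  }
}
```

Throwing before the audio payload is accepted gives a dispatcher the chance to fall back to a secondary provider.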
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(whatsapp): extract GIF metadata and distinguish gifPlayback in media placeholders (fixes #49099)
- Add escapeAttr() helper to sanitize quotes and angle brackets in XML attribute values
- Add extractExternalAdReplyMetadata() to extract title, sourceUrl, body from contextInfo.externalAdReply
- Distinguish GIFs from videos using videoMessage.gifPlayback flag (media:gif vs media:video)
- Enrich image and video placeholders with externalAdReply metadata when available
- Add 5 test cases covering GIF detection, metadata extraction, attribute escaping, and empty fields
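An attribute-escaping helper of the kind listed above might look like the sketch below. `escapeXmlAttr` is a hypothetical name, and escaping `&` first is an added assumption so that the generated entities are not ambiguous:

```typescript
// Hypothetical sketch; the actual escapeAttr() may escape a different set.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so later entities stay intact
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

/** Example placeholder using the gifPlayback distinction (format is illustrative). */
function mediaPlaceholder(isGif: boolean, title: string): string {
  const kind = isGif ? "media:gif" : "media:video";
  return `<${kind} title="${escapeXmlAttr(title)}">`;
}
```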
* fix(whatsapp): keep GIF metadata in untrusted context
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
QQBot is the only channel that root-sandboxes outbound local files. Its three
gate sites (resolveOutboundMediaPath, the voice send re-check, and
structured-payload validation) only trusted the QQ Bot media storage roots, so
framework-generated scratch media written under OpenClaw's hardened temp root
(e.g. cron auto-TTS voice files from speech-core) was rejected. The send then
returned a no-identity error, the message was silently lost, yet cron still
recorded it as delivered.
Add one shared resolver (resolveTrustedOutboundMediaPath) that also trusts the
preferred OpenClaw temp root — already a sanctioned media root in core
(buildMediaLocalRoots) — and route all three gates through it so the trust set
agrees everywhere. Fixes #92816.
Co-authored-by: zengwen <zeng_wen@foxmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
When an assistant message's `content` is a raw string at runtime (JSONL
transcript replay passes it through even though the type declares an array),
the OpenAI-compatible completions path crashes:
- `transformMessages` called `assistantMsg.content.flatMap(...)` ->
`TypeError: ... .flatMap is not a function` (first crash, always hit).
- Two `hasToolHistory` helpers (`openai-transport-stream.ts` and
`openai-completions.ts`) called `content.some(...)` -> `TypeError: ...
.some is not a function` (siblings, surface once the flatMap crash is fixed).
Normalize a string assistant content to an equivalent single text block
before transforming (matching the string->text handling already used in
anthropic-payload-policy.ts), and `Array.isArray`-guard both `hasToolHistory`
helpers so a string assistant simply does not count toward tool history.
Verified end-to-end through the real `buildOpenAICompletionsParams` and
`streamOpenAICompletions` entry points: before the fix a string-content
assistant followed by a toolResult throws TypeError; after the fix params are
produced correctly (string preserved as text, tool history detected). Normal
array content is unaffected.
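The normalization and the `Array.isArray` guard can be sketched as follows. The block shapes (including the `"toolCall"` type tag) and helper names are assumptions for illustration, not the real transport types:

```typescript
// Hypothetical content block shapes.
type TextBlock = { type: "text"; text: string };
type ToolCallBlock = { type: "toolCall"; id: string; name: string };
type ContentBlock = TextBlock | ToolCallBlock;

/** Turn a raw string into a single text block; arrays pass through unchanged. */
function normalizeAssistantContent(content: string | ContentBlock[]): ContentBlock[] {
  return typeof content === "string" ? [{ type: "text", text: content }] : content;
}

/** A string content never counts toward tool history. */
function contentHasToolCall(content: unknown): boolean {
  if (!Array.isArray(content)) return false;
  return content.some(
    (block: unknown) =>
      typeof block === "object" &&
      block !== null &&
      (block as { type?: unknown }).type === "toolCall",
  );
}
```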
* fix(respawn): rewrite pnpm versioned entry paths to stable wrapper
During self-update the pnpm versioned directory (node_modules/.pnpm/openclaw@<ver>/)
may be removed. If process.argv contains the versioned path, the respawned child
fails to start because the entrypoint no longer exists.
Detect pnpm versioned realpaths in spawnDetachedGatewayProcess and rewrite them
to the stable node_modules/<pkg>/openclaw.mjs wrapper before spawning.
Fixes #52313
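A sketch of the path rewrite, assuming the usual pnpm layout `node_modules/.pnpm/<pkg>@<ver>/node_modules/<pkg>/...`. The function name is hypothetical and the real detection works on realpaths inside spawnDetachedGatewayProcess:

```typescript
// Hypothetical helper for illustration only.
function rewritePnpmVersionedEntry(entryPath: string, pkg = "openclaw"): string {
  const normalized = entryPath.replace(/\\/g, "/");
  const marker = `/node_modules/.pnpm/${pkg}@`;
  const index = normalized.indexOf(marker);
  if (index === -1) return entryPath; // not a versioned entry for this package
  // Point at the stable wrapper that survives removal of the versioned directory.
  return `${normalized.slice(0, index)}/node_modules/${pkg}/openclaw.mjs`;
}
```

Paths that belong to other packages are returned unchanged, which mirrors the follow-up commit that scopes the rewrite to openclaw.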
* fix(respawn): scope pnpm entry rewrite to openclaw
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(wizard): preserve existing default model during setup auth choice
Without preserveExistingDefaultModel: true, the setup wizard
overwrites the user's configured default model when a new provider
auth is selected. This causes existing heartbeat turns to silently
consume paid API quota (e.g. Google Gemini) instead of the user's
original model.
The configure.gateway-auth.ts path already passes this flag; the
setup wizard path was missing it.
Fixes #64129
* fix(wizard): add type assertion for preserveExistingDefaultModel test
Summary:
- This PR changes the docs i18n Codex command-output preview to keep a short head plus retained tail, and adds Go unit coverage for stdout and stderr tails.
- PR surface: Other +20. Total +20 across 2 files.
- Reproducibility: yes. Source inspection of current main and `v2026.6.6` shows long output is truncated to the prefix only, and the PR's focused tests model the stdout/stderr tail cases that lose final API details.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head b510b598c6.
- Required merge gates passed before the squash merge.
Prepared head SHA: b510b598c6
Review: https://github.com/openclaw/openclaw/pull/93687#issuecomment-4720840859
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: Mason Huang <8814856+hxy91819@users.noreply.github.com>
Approved-by: hxy91819
* fix(agents): handle string assistant content in getLastAssistantText
PR #93456 added an `if (!Array.isArray(message.content)) return false` guard
to hasAssistantToolCallArguments, acknowledging that a persisted/legacy
assistant message can carry a string `content` at runtime even though the
type is declared as an array. buildSessionContext pushes such entries through
unchanged, so the string can reach agent.state.messages.
getLastAssistantText() still assumed an array: iterating a string `content`
yields individual characters, none of which has `type === "text"`, so the
assistant's text was silently dropped and the function returned undefined.
Mirror extractTextContent(): when `content` is a string, treat it as the text
itself; otherwise iterate the content blocks as before. The aborted/empty
check is left untouched because `.length === 0` is already correct for both an
empty array and an empty string.
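The string branch can be sketched like this. The message and block shapes are simplified assumptions, and `lastAssistantText` is a hypothetical stand-in for getLastAssistantText():

```typescript
// Simplified, hypothetical message shapes.
type Block = { type: string; text?: string };
interface AgentMessage {
  role: "user" | "assistant";
  content: string | Block[];
}

function lastAssistantText(messages: AgentMessage[]): string | undefined {
  for (const message of [...messages].reverse()) {
    if (message.role !== "assistant") continue;
    // Persisted or legacy entries can carry a raw string at runtime.
    if (typeof message.content === "string") return message.content;
    // Otherwise collect the text blocks as before.
    return message.content
      .filter((block) => block.type === "text" && typeof block.text === "string")
      .map((block) => block.text ?? "")
      .join("");
  }
  return undefined;
}
```

Iterating the string directly would visit single characters, none of which is a text block, which is why the string case needs its own branch.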
* fix(agents): safely read persisted assistant text
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
When the block reply pipeline streamed partial content, buildReplyPayloads()
unconditionally dropped all text-only final payloads. This suppressed the
complete final reply when the pipeline only streamed a partial block and
never sent the exact final text.
The fix checks hasSentPayload() for text-only payloads too, preserving
unsent finals instead of dropping them unconditionally.
The async Clipboard API is only available in secure contexts (HTTPS or
localhost). On plain-HTTP deployments navigator.clipboard is undefined, so the
code block copy button threw synchronously and silently failed. Add a shared
copyToClipboard helper that guards the secure-context path and falls back to the
legacy execCommand copy, reuse it for the code block button and the copy-as-
markdown affordance, and cover it with a unit test plus a real-browser e2e that
simulates the non-secure context.
Fixes #93628
Co-authored-by: Pick-cat <266665499+Pick-cat@users.noreply.github.com>
Summary:
- The PR changes `/status` context-window selection to ignore stale runtime snapshots after manual model switches while preserving fallback/runtime-alias context windows.
- PR surface: Source +6, Tests +128. Total +134 across 2 files.
- Reproducibility: yes. source-reproducible: current main trusts explicit runtime context before checking fall ... fer. I did not run a local failing repro, but the PR fixture models the stale prior-runtime state directly.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test(status): make context fixtures type-correct
Validation:
- ClawSweeper review passed for head f14fda4279.
- Required merge gates passed before the squash merge.
Prepared head SHA: f14fda4279
Review: https://github.com/openclaw/openclaw/pull/93306#issuecomment-4708596208
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Approved-by: hxy91819
description: "Use when previewing local channel message flow fixtures."
description: "Use when running QA Lab channel message flow evidence."
---
# Channel Message Flows
Use this from the OpenClaw repo root to send canned channel preview flows while iterating on message UX. These are real sends/edits/deletes against the configured channel target.
Use this from the OpenClaw repo root to run the QA Lab evidence for Telegram
draft/final delivery sequencing. This skill no longer launches a standalone
script; the behavior is owned by the QA scenario and its Vitest-backed e2e test.
## Telegram
## QA Scenario
Native Telegram `sendMessageDraft` tool progress, then a final answer:
description: Audit or refresh OpenClaw maturity scorecard docs from root taxonomy, maturity scores, and QA evidence artifacts without using maintainer discrawl data or committed inventory reports.
---
# claw-score
Use this skill when working on the OpenClaw maturity scorecard in this repo.
This is the openclaw-local version of the maintainer `claw-score` workflow:
it keeps the taxonomy and scorecard concepts, but excludes discrawl and the old
committed `inventory/` report tree.
## Authority
This skill owns the operational workflow for:
- `taxonomy.yaml`
- `docs/maturity-scores.yaml`
- `docs/concepts/qa-e2e-automation.md`
- `qa/scenarios/index.yaml`
Keep person-specific, maintainer-private, Discord archive, and discrawl facts
out of this repo. If a score needs private evidence, use the redacted
`qa-evidence.json` artifact shape generated by OpenClaw QA workflows.
## Source Model
- `taxonomy.yaml` is the hand-edited source of truth for surfaces, levels,
- Hosted Provider Execution: Hosted provider turns, Provider-specific model options, Hosted tool use, Reasoning and cache controls, Hosted streaming and replies
- Local and Self-hosted Providers: Local provider profiles, Tool-capability flags, Timeouts and context windows, Local smoke checks, Local failure handling
- Model and Runtime Selection: Model reference selection, Provider and runtime overrides, Thinking and context settings, Invalid route recovery
- Provider Auth: Login and API-key setup, Auth profile selection, Credential health checks, Auth failover, Provider fallback recovery, Rate-limit and capacity recovery, Missing-key and OAuth guidance, Restart and stale-route recovery, Structured provider diagnostics, Subagent credential propagation
- Streaming and Progress: Streaming replies, Progress visibility
- Tool Calls and Response Handling: Tool-call handling, Usage and response reporting, Failure recovery
Use this rubric when assigning category Completeness scores for the
`cli-install-update-onboard-doctor` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can a normal operator complete the job end to end from the CLI?
- Are the expected environments represented where they matter for the category,
such as local installs, remote gateway use, supervised services, or
Windows/WSL2?
- Are the main lifecycle stages present where relevant: setup, inspection,
change, repair, and upgrade?
- Are common recovery and troubleshooting branches present, or does the
workflow dead-end after the happy path?
- Are major documented operator expectations still unimplemented?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the CLI operator journey for installation, onboarding, configuration, repair, and upgrade across expected environments and recovery branches.
- Score the CLI against the full operator journey, not only installation or the happy path.
- Repair, migration, remote, and platform-specific branches are expected where a category exposes them.
- For Windows and WSL2, score against the intended supported experience rather than parity with macOS/Linux internals.
- Gateway Service Management: Foreground gateway runs, Service install and control, Service auth wiring, Drift and reinstall recovery, Service health checks
- CLI Observability: Status snapshots, Health snapshots, Remote log tailing, Diagnostics export, Support-safe redaction
- Doctor: Interactive repair, Config migration, Auth and SecretRef checks, Plugin validation and repair, Lint and JSON findings, Extra gateway discovery, Supervisor drift repair, Port and startup diagnosis, Runtime path checks, Restart guidance
- Updates and Upgrades: Update channels, Install-kind switching, Managed gateway restart, Update status and RPC, Plugin convergence
Use this rubric when assigning category Completeness scores for the
`discord` surface.
## Category Scope
- Channel Setup and Operations: Application and bot setup, Token and application ID configuration, Setup wizard and account inspection, Status, doctor, and intent checks, Multi-account bot configuration, Account monitor startup, Gateway WebSocket lifecycle, Reconnect and heartbeat handling, Rate limits and gateway metadata, Status, probe, and health-monitor recovery
- Access and Identity: DM policy modes, Allowlist inheritance, Pairing-code approval, Sender authorization, Access-group authorization, Group DM authorization
- Conversation Routing and Delivery: Guild and channel admission, Mention gating, Session key isolation, Configured and runtime routing, Inbound context visibility, Forum and media-channel thread posts, Thread actions, Target parsing, Thread context resolution, Thread-bound session routing, ACP agent routing, Routing lifecycle, Discord forum/media channel posts created as, CLI and message-tool thread actions, Discord target parsing for `channel:<id>`, Thread context resolution, Thread-bound session routing for `/focus`, `/unfocus`, `/agents`, `/session idle`, `/session max-age`, `sessions_spawn({ thread, ACP current-conversation bindings and ACP thread, Binding lifecycle behavior, Direct and thread sends, Text chunking and reply mode, Draft and progress edits, Mention and embed rendering, REST retry and final delivery, File uploads, Component file and media-gallery blocks, Video caption follow-up, Voice-message upload, Inbound attachment context
- Media and Rich Content: Direct and thread sends, Text chunking and reply mode, Draft and progress edits, Mention and embed rendering, REST retry and final delivery, File uploads, Component file and media-gallery blocks, Video caption follow-up, Voice-message upload, Inbound attachment context, Direct and thread sends, Text chunking and reply mode, Draft and progress edits, Mention and embed rendering, REST retry and final delivery, File uploads, Component file and media-gallery blocks, Video caption follow-up, Voice-message upload, Inbound attachment context, Outbound file uploads from URLs and, Component v2 file and media-gallery blocks, Video caption handling and follow-up media-only delivery, Discord voice-message sends with OGG/Opus conversion, Inbound media/attachment-aware debounce behavior, Realtime voice-channel conversations, General text-only delivery
- Native Controls and Approvals: Native slash command registration, Native slash command execution, Model Picker Commands, Components v2 messages, Callback TTL, Native Discord exec/plugin approvals, Sensitive owner-only command routing for prompts, Discord message actions, Action gates under channels.discord.actions.\*
- Realtime Voice and Calls: Voice Channel Lifecycle, Auto-join and follow-users, Realtime voice modes, Wake, barge-in, and echo handling, Voice codec and DAVE recovery
- Hosted Web Surface: Control UI, WebChat hosting, Plugin web routes, Canvas and A2UI routes
- Gateway RPC APIs and Events: Health APIs, Identity and presence APIs, Model APIs, Usage and memory APIs, Session APIs, Chat APIs, Channel APIs, Web login and wake APIs, Config and secrets APIs, Update and setup APIs, Agent and artifact APIs, Task and automation APIs, Tool and skill APIs, Request and event envelopes, Idempotent side effects, Method discovery, Event discovery, Accepted-then-final results, Event ordering, State refresh after gaps
- Health, Diagnostics, and Repair: Health snapshots, Channel readiness, Stability diagnostics, Payload diagnostics, Diagnostics exports, Doctor checks, Log tailing
- Protocol Compatibility: Published protocol schema, Runtime request validation, JSON Schema export, Swift client models, Version negotiation, Client transport defaults, Backward-compatible evolution
- Roles and Permissions: Role negotiation, Operator permissions, Approval-gated actions, Untrusted node declarations, Event scoping
- Gateway Lifecycle: Foreground startup, Service installation, Restart and stop, Service status, Bind and port settings, Config reload, Multi-gateway isolation
Use this rubric when assigning category Completeness scores for the
`image-video-music-generation-tools` surface.
## Category Scope
- Media Routing and Discovery: default media model config, per-call model refs and fallbacks, auth-backed tool discovery, action=list provider inspection
Use this rubric when assigning category Completeness scores for the
`imessage-bluebubbles` surface.
## Category Scope
- Channel Setup and Operations: Translate legacy config, Cut over safely, Handle migration caveats, Run local imsg, Run through SSH wrapper, Grant macOS permissions, Probe runtime health, Account setup prompts, Account status checks, Doctor repair checks, Account Config
- Access and Identity: Authorize direct senders, Route direct conversations, Bind ACP sessions, Group Policy, Mentions, System Prompts
- Conversation Routing and Delivery: Watch live messages, Coalesce split-send DMs, Replay missed messages, Seed conversation history, Authorize direct senders, Route direct conversations, Bind ACP sessions, Group Policy, Mentions, System Prompts
- Media and Rich Content: Media, Attachments, Remote Fetch, Chunking, Native Actions, Private API, Message Tool
Use this rubric when assigning category Completeness scores for the
`kubernetes-hosting` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an operator deploy and manage OpenClaw on Kubernetes end to end?
- Are the taxonomy features present as supported manifests, commands, and docs rather than examples only?
- Are setup, normal operation, status or inspection, redeploy, teardown, and secret rotation represented where relevant?
- Are local Kind validation, namespace/image customization, provider secrets, and secure exposure branches covered?
- Do known gaps leave major cluster-hosting capability branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the Kubernetes operator workflow for deployment, configuration, secrets, access, exposure, lifecycle, security posture, status, and recovery.
- A complete Kubernetes category lets an operator deploy, expose, secure, update, troubleshoot, and remove the Gateway without relying on Docker-only assumptions.
- Happy-path port-forwarding, missing secret/config rotation, or omitted exposed-service security posture are material completeness gaps.
- Gateway Connectivity: Local Gateway attach and status, Gateway pairing and auth, Remote mode, Local and remote resource boundaries
- Chat and Sessions: Native Linux chat window, Transcript, Gateway chat transport
- Desktop Capabilities: Linux desktop permissions, Secret storage, Sandbox/package posture, Linux native node identity, Host command execution, Desktop tools, Linux native Talk, Microphone capture, Native media permissions
- Status and Diagnostics: Native Linux app readiness, Gateway health/status display, Log/transcript opening, Doctor/repair affordances, Linux tray/status item, Runtime status row, Desktop-environment integration
Use this rubric when assigning category Completeness scores for the
`linux-gateway-host` surface.
## Category Scope
- Host Setup and Updates: Linux CLI install, Node runtime prerequisites, Package-manager policy, Update path
- Gateway Runtime and Service Control: Foreground Gateway Runtime, Process Control, Systemd User Service Lifecycle setup, Systemd User Service Lifecycle operation, Systemd User Service Lifecycle status, Systemd User Service Lifecycle recovery
- Provider Setup, Lifecycle, and Diagnostics: Provider Selection, Onboarding, localService configuration, Process startup and readiness, Request leases and idle shutdown, Health checks and restart, Provider recipes, Local provider status, Backend reachability probes, Model availability errors, Memory readiness diagnostics, Provider troubleshooting docs
- Native Provider Plugins: Ollama setup and model pulling, Model discovery, Streaming and vision, Ollama embeddings, Web-search support, LM Studio setup, Model discovery and auth, Model preload and JIT loading, Streaming compatibility, LM Studio embeddings
- Media Intake and Access: Local and remote media references, MIME and type detection, Size caps and bounded reads, Safe remote fetch, Local root policy, Inbound media store, PDF/document extraction dispatch, QR and media helper classification
- Channel Media Handling: Inbound attachment staging, Sandbox media rewrites, Reply media templating, Message-tool attachment delivery, Duplicate delivery suppression
- Media Configuration: Media capability configuration
- Media Understanding: Audio attachment selection, Batch STT provider and CLI fallback, Voice-note mention preflight, Transcript insertion and echo, Audio proxy and limit handling, Inbound image summarization, Active vision model bypass, Text-only model media offload, Vision provider fallback, Image and PDF input routing, Video Understanding, Direct Video Analysis
- Media Generation: Image generation tool invocation, Provider and model selection, Reference image editing, Generated image task lifecycle, Generated image persistence and delivery, Music generation tool invocation, Provider and model selection, Lyrics, instrumental, duration, and format controls, Reference inputs where supported, Music task lifecycle and duplicate status, Generated audio persistence and delivery, Video generation tool invocation, Mode and provider capability selection, Reference image, video, and audio inputs, Provider option validation, Video task lifecycle and status, Generated video persistence and delivery
Use this rubric when assigning category Completeness scores for the
`microsoft-teams` surface.
## Category Scope
- Channel Setup and Operations: Teams CLI app creation, Bot registration and manifest upload, Credential configuration, Teams app install verification, Setup status, Probe and scope reporting, Teams app doctor, Webhook and health diagnostics, Operator repair paths, Text formatting and chunking, Adaptive and presentation cards, Progress streaming, Delivery receipts and errors, Queued and proactive replies, Webhook Runtime, SDK Lifecycle, Proactive Cloud Boundary
- Access and Identity: DM pairing, Stable sender identity, Allowlists and access groups, Invoke and command authorization, Teams-originated config writes, Bot Framework SSO invokes, Delegated token storage, Graph directory lookup, Member profile lookup
- Conversation Routing and Delivery: Team and channel allowlists, Deterministic channel replies, Mention-gated group access, Session routing, Reply and thread context, Text formatting and chunking, Adaptive and presentation cards, Progress streaming, Delivery receipts and errors, Queued and proactive replies, Webhook Runtime, SDK Lifecycle, Proactive Cloud Boundary
- Media and Rich Content: Inbound attachments, Graph-hosted media, File consent, SharePoint and OneDrive sharing, Media fetch safety
- Native Controls and Approvals: Message action discovery, Polls and reactions, Read, edit, delete, and pin, Native approval cards, Feedback and group actions
Use this rubric when assigning category Completeness scores for the
`multi-agent-orchestration` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an operator configure and run the category workflow end to end?
- Are the taxonomy features present as supported user paths rather than partial config fragments?
- Are setup, normal operation, status or inspection, recovery, and removal paths represented where relevant?
- Are channel, account, workspace, auth, task, and delegate variants covered where the category expects them?
- Do known gaps leave major coordination or isolation branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the operator-facing system for setup, isolation, conversation routing, account routing, specialist lanes, delegate identity, status, recovery, and safe defaults.
- A complete category lets multiple agents be created, isolated, routed, delegated, and inspected without implicit cross-agent leakage.
- Undocumented config, nondeterministic routing, or unclear ownership of state, credentials, and outbound delivery are material completeness gaps.
Use this rubric when assigning category Completeness scores for the
`native-windows-companion-app` surface.
## Category Scope
- Installation and Updates: Official app download, MSI/MSIX/App Installer/winget-style packaging, Windows architecture handling for x64, App release channel
Use this rubric when assigning category Completeness scores for the
`openclaw-app-sdk` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an external app developer complete the category workflow using public SDK APIs?
- Are the taxonomy features represented by stable client contracts rather than protocol-only fragments?
- Are setup, authentication, streaming, result handling, error behavior, and compatibility expectations documented?
- Are browser, Node, React, testing, and custom transport variants covered where the category expects them?
- Do known gaps leave major external-app capability branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the external app-developer workflow from connection through agent runs, sessions, events, approvals, resources, compatibility, and operational error handling.
- A complete SDK category exposes typed, documented, reusable client APIs instead of requiring low-level Gateway protocol work.
- Manual Gateway frame construction or reliance on internal package shapes is a material completeness gap.
- Can the intended plugin task be completed end to end by an author or
operator?
- Are the important plugin variants present for this category, such as channel,
provider, tool, bundled, local, npm, or ClawHub flows?
- Are the main lifecycle stages present where relevant: create, configure,
validate, run, update, and remove or roll back?
- Are compatibility, approval, or safety branches present when the category
implies them?
- Are important author/operator-visible gaps still forcing workarounds or
unsupported paths?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the plugin author or operator lifecycle for authoring, packaging, installing, running, approving, publishing, and testing plugins, not just SDK or runtime primitives.
- Score the plugin surface against the full plugin journey, not only one import path, packaging mode, or runtime path.
- Bundled-only support or support for only selected plugin families is incomplete when the category implies broader plugin capability.
- Publishing and testing categories should include expected lifecycle support, not just raw commands or fixtures.
Use this rubric when assigning category Completeness scores for the
`signal` surface.
## Category Scope
- Setup and Account Health: QR link setup, SMS registration, Installer and binary setup, Container account provisioning, Status probes, Setup diagnostics, Account safety guardrails
- Conversation Access and Routing: DM pairing, DM allowlists, Sender identity normalization, Group allowlists, Mention gates, Pending group history
- Message Delivery and Actions: Text delivery targets, Media delivery and limits, Typing and read receipts, Styled/chunked output, Reaction action discovery, Add/remove reactions, Group reaction targeting
Use this rubric when assigning category Completeness scores for the
`telegram` surface.
## Category Scope
- Channel Setup and Operations: BotFather token creation, TELEGRAM_BOT_TOKEN, Setup wizard credential capture, Startup getMe, Doctor/status surfacing, Named account configuration, CLI/message-tool targets, Directory adapters, Channel status, Account-scoped outbound, Long polling runner startup, Webhook listener startup, Reconnect, Restart, Named account configuration, Directory adapters and configured peers/groups for, Channel status, Account-scoped outbound, Long polling runner startup, Reconnect, Restart
- Access and Identity: dmPolicy modes, Pairing-code approval, Numeric Telegram user ID normalization with telegram, allowFrom, Unauthorized DM, Group allowlists, Supergroup negative chat IDs, Forum topic session keys, ACP topic routing, Session key construction
- Conversation Routing and Delivery: dmPolicy modes, Pairing-code approval, Numeric Telegram user ID normalization with telegram, allowFrom, Unauthorized DM, Group allowlists, Supergroup negative chat IDs, Forum topic session keys, ACP topic routing, Session key construction, Inbound media download, Voice notes, Location, Poll sending, Reactions, Text, Preview streaming, Reply threading tags, Durable outbound message recording, Voice notes, Poll sending, Reply threading tags, Durable outbound message recording
- Media and Rich Content: Inbound media download, Voice notes, Location, Poll sending, Reactions, Text, Preview streaming, Reply threading tags, Durable outbound message recording, Voice notes, Poll sending, Reply threading tags, Durable outbound message recording, Inbound media download, Voice notes, Location and venue extraction into channel context, Poll sending, Reactions
- Native Controls and Approvals: Inline keyboard rendering, Exec approvals in DMs, Message actions, Action capability discovery, Native setMyCommands startup sync, Command name/description normalization, Built-in commands, Command authorization in DMs, Model buttons, Native `setMyCommands` startup sync, Command name/description normalization, Built-in commands such as `/help`, Command authorization in DMs, Model buttons and command UI helpers
Use this rubric when assigning category Completeness scores for the
`whatsapp` surface.
## Category Scope
- Channel Setup and Operations: Official @openclaw/whatsapp plugin metadata, openclaw plugin install whatsapp, Channel config schema, Baileys socket lifecycle, Operator troubleshooting, Baileys socket lifecycle, Operator troubleshooting for reconnect loops
- Access and Identity: QR login, Baileys multi-file auth persistence, DM pairing challenge, Multi-account/default-account resolution, Direct-message dmPolicy, Sender identity extraction, Privacy controls for plugin hooks, Direct-message `dmPolicy`, Sender identity extraction, Privacy controls for plugin hooks and
- Conversation Routing and Delivery: Group allowlists, Group session keys, Outbound text sends, Provider-accepted receipts, Outbound text sends, Provider-accepted receipts and durable delivery identifiers
- Media and Rich Content: Inbound media download, Outbound image
- Native Controls and Approvals: Native exec, Approver target resolution
- Gateway Service Lifecycle: Onboarded systemd install, Gateway service install, systemd user unit rendering, WSL-aware systemd unavailable hints, Doctor service repair, WSL user-service linger, Systemd availability after Windows boot, Windows startup task for WSL, Verification before Windows sign-in, Clear expectations around PC power
- Gateway Access and Exposure: Gateway token/password auth, Provider credentials, Gateway auth SecretRefs, Remote URL credential precedence, WSL virtual network, Windows portproxy setup, Windows Firewall rules, Reachable Gateway URLs, Loopback and LAN exposure, WSL2 IPv4 networking, Tailscale remote access
- Diagnostics and Repair: openclaw doctor, openclaw status, openclaw logs, SecretRef, WSL/systemd unavailable hints, Operator repair guidance after WSL2 service
- Browser and Control UI: WSL2 Gateway with Windows browser, Windows Control UI URL, Raw remote CDP to Windows Chrome, Host-local Chrome MCP, Browser profile cdpUrl, Layered diagnostics
@@ -16,19 +16,33 @@ Use this with `$release-openclaw-maintainer` and `$openclaw-testing` when a rele
- Watch one parent run plus compact child summaries. Avoid broad `gh run view` polling loops; REST quota is easy to burn.
- Fetch logs only for failed or currently-blocking jobs. If quota is low, stop polling and wait for reset.
- Treat live-provider flakes separately from code failures: prove key validity, provider HTTP status, retry evidence, and exact failing lane before editing code.
- Anthropic release lanes support both API keys and OAuth. When API keys are
exhausted but a maintainer-owned OAuth token passes a live Anthropic probe,
set `ANTHROPIC_OAUTH_TOKEN` for provider/runtime lanes and
refreshable `OPENCLAW_CLAUDE_CREDENTIALS_JSON` or
`CLAUDE_CODE_OAUTH_TOKEN` for Claude CLI subscription lanes before rerunning
the matrix. Revalidate short-lived OAuth immediately before dispatch. Never
keep retrying a known exhausted API key. Live-cache validation must prefer
the proven OAuth token instead of leaving an exhausted API key first in the
runtime key pool.
- A model-list response proves authentication, not billing or inference
entitlement. Mandatory live providers must pass a real completion probe
before release dispatch. Fix the credential first; do not add an alternate
auth path merely to bypass a failed release credential.
- Full Release Validation parent monitors fail fast: once a required child job
fails, the parent cancels the remaining child matrix and prints the failed
job summary. Inspect that first red job instead of waiting for unrelated
matrix tails.
- In a sparse worktree or Testbox source sync, first confirm `package.json`,
`pnpm-lock.yaml`, and every source path the selected check reads. If any are
absent, that checkout cannot validate a release dependency or Docker lane:
stop and use the repo remote changed gate or a full task worktree. When the
inputs are present and a release fix changes `package.json` or
`pnpm-lock.yaml`, rebuild only the task-owned disposable box with
`CI=true pnpm install --frozen-lockfile`, then run an explicit
`require.resolve()` probe before Docker or focused tests. The CI flag permits
pnpm to recreate a prewarmed modules directory without an interactive
confirmation. Do not weaken the lockfile or label sparse-checkout failures
as product/Docker failures.
- If the candidate is rebased or its base SHA changes after warmup, stop the
task-owned box and warm a fresh one before testing. Testbox source sync is
relative to the warmed source tree; continuing can mix an old base file with
a new candidate diff and produce false lockfile or Docker failures.
- For a committed release candidate, warm the box with
`blacksmith testbox warmup ... --ref <candidate-branch-or-sha>`. Do not rely
on source sync to overlay committed branch changes onto the workflow's
default ref.
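The sparse-checkout precondition in the list above can be sketched as a small shell helper. This is a minimal sketch under assumptions: the function name `require_release_inputs` and its argument layout are hypothetical and not part of the repository's tooling; only `package.json` and `pnpm-lock.yaml` come from the text.

```shell
# Hypothetical helper: confirm every input a release check reads is present
# in the checkout before running the check.
# Usage: require_release_inputs <checkout-root> [extra paths the check reads...]
require_release_inputs() {
  root="$1"
  shift
  missing=0
  for path in package.json pnpm-lock.yaml "$@"; do
    if [ ! -e "$root/$path" ]; then
      # Report the gap as a checkout problem, not a product/Docker failure.
      printf 'missing release input: %s\n' "$path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_release_inputs . src/index.ts || echo 'use a full task worktree'`, where `src/index.ts` stands in for whichever source paths the selected check actually reads.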
## Preflight
@@ -45,8 +59,8 @@ git rev-parse HEAD
preflight. Inject those exact targeted keys first, then run the verifier; use
ambient env only when it was already intentionally injected for this release.
The script prints only provider status and HTTP class, never tokens.
For Anthropic it prefers `ANTHROPIC_OAUTH_TOKEN` and validates it with bearer
OAuth headers when present; otherwise it checks API-key-shaped credentials.
The Anthropic check performs a tiny message completion so exhausted or
non-billable credentials fail before the expensive release matrix.
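The "provider status and HTTP class, never tokens" reporting described above might look like the following. This is an illustrative sketch, not the verifier script itself: `http_class` and `report_provider_status` are hypothetical names, and the output format is an assumption.

```shell
# Hypothetical helper: reduce an HTTP status code to its class (e.g. 401 -> 4xx).
http_class() {
  case "$1" in
    [1-5][0-9][0-9]) printf '%sxx\n' "${1%??}" ;;
    *) printf 'unknown\n' ;;
  esac
}

# Hypothetical reporter: takes only a provider name and a status code.
# The credential is never passed in, so it cannot end up in the output.
report_provider_status() {
  printf '%s: %s\n' "$1" "$(http_class "$2")"
}
```

Keeping the credential out of the reporter's arguments is the design point: the function that formats output has no access to the secret it is reporting on.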
## Dispatch
@@ -62,7 +76,7 @@ gh workflow run openclaw-performance.yml \
-f repeat=3 \
-f deep_profile=false \
-f live_openai_candidate=false \
-f fail_on_regression=false
-f fail_on_regression=true
```
- Do not wait for full release validation to start this early perf signal.
@@ -71,8 +85,9 @@ gh workflow run openclaw-performance.yml \
- Call out any regression in the release proof. Treat a major regression as a
release blocker until it is fixed, waived by the operator, or proven to be
infrastructure noise.
- Full Release Validation also records advisory product-performance evidence;
the early standalone run is for overlap and faster regression discovery.
- Full Release Validation records blocking product-performance evidence. The
early standalone run is for overlap and faster regression discovery, but a
regression or missing child run blocks the parent validation.
Prefer the trusted workflow on `main`, target the exact release SHA:
@@ -94,7 +109,7 @@ gh workflow run full-release-validation.yml \
-f rerun_group=all
```
Use `release_profile=stable` unless the operator explicitly asks for the broad advisory provider/media matrix. Use narrow `rerun_group` after focused fixes.
Use `release_profile=stable` unless the operator explicitly asks for the broad advisory provider/media matrix. Stable and full profiles force the release soak; the beta profile may opt in with `run_release_soak=true`. Use narrow `rerun_group` after focused fixes.
Publish with `openclaw-release-publish.yml` using `release_profile=from-validation`
unless a maintainer intentionally wants to cross-check a specific profile; the
publish workflow reads the effective profile from the full-validation manifest.
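The soak rule stated above (stable and full force the release soak, beta may opt in with `run_release_soak=true`) can be written out as a small decision function. This is a sketch only: `release_soak_enabled` is a hypothetical name, and rejecting any other profile value is an assumption rather than documented workflow behavior.

```shell
# Hypothetical helper mirroring the soak rule in the text.
# $1 = release_profile, $2 = run_release_soak (defaults to false)
release_soak_enabled() {
  profile="$1"
  opt_in="${2:-false}"
  case "$profile" in
    stable|full)
      # Stable and full profiles always run the soak, regardless of opt-in.
      echo true
      ;;
    beta)
      # Beta runs the soak only when explicitly requested.
      if [ "$opt_in" = "true" ]; then echo true; else echo false; fi
      ;;
    *)
      # Assumption: treat anything else as an error instead of guessing.
      echo "unknown profile: $profile" >&2
      return 2
      ;;
  esac
}
```

For example, `release_soak_enabled beta true` prints `true`, while `release_soak_enabled beta` prints `false`.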
@@ -124,13 +139,25 @@ Stop watchers before ending the turn or switching strategy.
--jq '.jobs[] | select(.conclusion=="failure" or .conclusion=="timed_out" or .conclusion=="cancelled") | [.databaseId,.name,.conclusion,.url] | @tsv'
```
3. Fetch one failed job log. If rate-limited, note reset time and avoid more REST calls.
4. For secret-looking failures, validate the provider endpoint from the same secret source before editing code.
For Docker CLI-backend failures, also validate
`OPENCLAW_CLAUDE_CREDENTIALS_JSON` or `CLAUDE_CODE_OAUTH_TOKEN` in a
clean-home Claude CLI probe; that lane should use subscription mode when
either credential exists.
4. For secret-looking failures, validate a real completion from the same secret source before editing code. A successful model-list request is insufficient.
Claude CLI subscription credentials are a separate native auth path; prove
them in a clean-home CLI probe, never as a substitute for a required
Anthropic API-key lane.
5. For live-cache failures, inspect whether it is missing/invalid key, empty text, provider refusal, timeout, or baseline miss. Do not weaken release gates without clear provider evidence.
6. Fix narrowly, run local/changed proof, commit, push, rerun the smallest matching group.
7. If a required PR CI run is capacity-stalled with queued jobs and no active
jobs, do not cancel unrelated work or accept a generic manual dispatch.
From the PR head branch, dispatch the explicit exact-SHA fallback:
`gh workflow run ci.yml --repo openclaw/openclaw --ref <pr-head-branch> -f
@@ -17,6 +17,10 @@ Use this skill for release and publish-time workflow. Load `$release-private` if
- This skill should be sufficient to drive the normal release flow end-to-end.
- Use the private maintainer release docs for credentials, recovery steps, and mac signing/notary specifics, and use `docs/reference/RELEASING.md` for public policy.
- Core `openclaw` publish is manual `workflow_dispatch`; creating or pushing a tag does not publish by itself.
- Do not edit the root `README.md` as release prep, release closeout, or a
substitute for release notes. Package-root README validation is a hard
packaging gate, but a release only changes README content when an actual
user-facing documentation contract changed.
- Normal release work happens on a branch cut from `main`, not directly on
`main`. Use `release/YYYY.M.PATCH` for the branch name.
- If the operator asks for a release without saying stable/full, default to
@@ -76,6 +80,44 @@ Use this skill for release and publish-time workflow. Load `$release-private` if
or clawgrit reports. Report regressions explicitly. A major regression is a
release blocker unless the operator waives it or the data clearly proves
infrastructure noise.
- Heal CI before tagging or publishing. The exact candidate SHA must have green
`Full Release Validation`, including the root Dockerfile/install-smoke path.
Treat a red Docker, package, or release workflow lane as a release-branch
defect until the smallest correct fix is landed and proven; do not waive it
because npm preflight or another sibling lane passed.
- Keep the canonical `scripts/pr` runner authoritative for prepare and merge
artifacts. A release-gate policy change may use focused candidate tests and
exact-SHA hosted CI for proof, but never route `prepare-*` or `merge-*`
through PR-controlled scripts or synthesize prepare artifacts to bootstrap
the change. If the current canonical gate cannot validate the new policy,
stop for explicit maintainer direction rather than weakening that boundary.
- In maintainer Testbox mode, use `OPENCLAW_TESTBOX=1 scripts/pr prepare-run
<PR>` only after the exact PR head has passed `CI` and every scheduled
hosted gate. For a workflow change, that means `Blacksmith Testbox`,
`Blacksmith ARM Testbox`, `Blacksmith Build Artifacts Testbox`, and
`Workflow Sanity`; only gates GitHub actually scheduled for that exact head
are required. This preserves the canonical prepare artifacts while avoiding
a redundant broad local suite. A
literal `CHANGELOG.md`-only head gets a clean diff check instead because
those workflows intentionally do not dispatch. Documentation and README
changes still require CI. If `merge-run` requires a mainline sync, run
`OPENCLAW_TESTBOX=1 scripts/pr prepare-sync-head <PR>`, wait for those hosted
gates on the newly pushed SHA, then run `prepare-run` again.
- If an exact PR-head CI run has no active jobs because Blacksmith capacity is
stalled, a maintainer may dispatch the explicit GitHub-hosted fallback from
the PR head branch:
`gh workflow run ci.yml --repo openclaw/openclaw --ref <pr-head-branch> -f
Add a visible `Closes #<issue-number>` or `Related: #<issue-number>` line
below this comment.
What problem does this PR solve?
Required PR title:
type: user-facing description
Use a parenthesized scope only when it adds clarity:
fix(auth): login redirect loops when session cookie is expired
Why does this matter now?
Types: feat, fix, improve, refactor, docs, chore.
For fixes, describe the user-visible symptom and trigger:
fix: task list fails to load when user has no environments
Avoid implementation details such as:
fix: add null check to task query
-->
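The title format described in the comment above can be checked mechanically with a rough pattern. This is a minimal sketch: `valid_pr_title` is a hypothetical helper, and the characters allowed inside the parenthesized scope are an assumption; only the type list and the `type: description` / `type(scope): description` shapes come from the template.

```shell
# Hypothetical check for the "type: user-facing description" title shape.
# Types come from the template; the scope character set is an assumption.
valid_pr_title() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|improve|refactor|docs|chore)(\([a-z0-9-]+\))?: .+'
}
```

A pattern like this only checks the shape. It accepts `fix: add null check to task query` just as readily as `fix: task list fails to load when user has no environments`, so whether the description is user-facing still needs a human reviewer.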
What is the intended outcome?
## What Problem This Solves
What is intentionally out of scope?
<!--
Describe the concrete user, product, or operational problem.
For fixes, begin with:
"Fixes an issue where users <do X> would <experience Y> when <condition>."
or:
"Resolves a problem where..."
What does success look like?
Name the affected UI surface or workflow. Do not describe the code-level cause here.
-->
What should reviewers focus on?
## Why This Change Was Made
<details>
<summary>Summary guidance</summary>
<!--
In one or two sentences, explain the complete shipped solution, key design
decisions, and relevant boundaries or non-goals. Include implementation detail
only when it helps reviewers understand user-visible behavior or risk.
Avoid file-by-file narration.
-->
This PR description is the contributor's durable explanation of the change. Write it for human maintainers first; ClawSweeper and Barnacle use the same text to understand intent, proof, risk, and current review state.
## User Impact
Describe the intent and outcome in 2-5 bullets. Avoid restating the diff; reviewers and bots can read the changed files.
<!--
State what users, operators, or developers can now do or expect. Lead with the
concrete benefit and use user-facing language. If there is no user-visible
impact, say so plainly.
-->
If this PR fixes a plugin beta-release blocker, title it `fix(<plugin-id>): beta blocker - <summary>` and link the matching `Beta blocker: <plugin-name> - <summary>` issue labeled `beta-blocker`. Contributors cannot label PRs, so the title is the PR-side signal for maintainers and automation.
## Evidence
</details>
<!--
Show the most useful proof that this change works. Screenshots, screencasts,
terminal output, focused tests, CI results, live observations, redacted logs,
and artifact links are all useful. Include before/after evidence for visual
changes when it clarifies the result.
## Linked context
Which issue does this close?
Closes #
Which issues, PRs, or discussions are related?
Related #
Was this requested by a maintainer or owner?
<details>
<summary>Linked context guidance</summary>
Link the issue, PR, discussion, maintainer request, or owner request that explains why this PR should exist. Maintainer context helps reviewers and automation distinguish intended work from drive-by churn.
</details>
## Real behavior proof (required for external PRs)
- Behavior or issue addressed:
- Real environment tested:
- Exact steps or command run after this patch:
- Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
- Observed result after fix:
- What was not tested:
- Proof limitations or environment constraints:
- Before evidence (optional but encouraged):
<details>
<summary>Real behavior proof guidance</summary>
External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only.
Screenshots are encouraged even for CLI, console, text, or log changes. Terminal screenshots, copied live output, redacted runtime logs, recordings, and linked artifacts count.
If your environment cannot produce the ideal proof, explain that under `Proof limitations or environment constraints` so reviewers and ClawSweeper can direct the next step properly.
Be mindful of private information like IP addresses, API keys, phone numbers, non-public endpoints, or other private details when providing evidence.
</details>
## Tests and validation
Which commands did you run?
What regression coverage was added or updated?
What failed before this fix, if known?
If no test was added, why not?
<details>
<summary>Testing guidance</summary>
List focused commands, not every incidental check. CI is useful support, but external PRs still need real behavior proof above when behavior changes.
</details>
## Risk checklist
Did user-visible behavior change? (`Yes/No`)
Did config, environment, or migration behavior change? (`Yes/No`)
Did security, auth, secrets, network, or tool execution behavior change? (`Yes/No`)
What is the highest-risk area?
How is that risk mitigated?
<details>
<summary>Risk guidance</summary>
Use this for author judgment that is not obvious from the diff. ClawSweeper can see touched files, but it cannot know which behavior you think is risky, why the risk is acceptable, or what mitigation reviewers should verify.
</details>
## Current review state
What is the next action?
What is still waiting on author, maintainer, CI, or external proof?
Which bot or reviewer comments were addressed?
<details>
<summary>Review state guidance</summary>
Keep this as the durable state for review progress. If useful information appears in comments, fold the current next action or blocker back here so maintainers and ClawSweeper do not need to reconstruct state from comment history.
</details>
Reviewers will inspect the code, tests, and CI. Use this section to make the
validation easy to understand, not to restate the diff.
Some files were not shown because too many files have changed in this diff