Default `openclaw status --json` stays on the lean health-probe path while preserving the JSON task summary, local update/install metadata, explicit probe timeouts, and configured gateway handshake timeouts. Deeper memory, registry, remote git, and local status-RPC diagnostics remain behind `status --json --all`.
Also keeps generated diffs viewer output in its built form and ignores it in oxfmt so `pnpm build` leaves a clean tree.
Proof:
- `node scripts/run-vitest.mjs src/commands/status.scan.fast-json.test.ts src/commands/status-json-payload.test.ts src/commands/status.scan.shared.test.ts`
- `OPENCLAW_LOCAL_CHECK=0 node scripts/run-oxlint-shards.mjs --threads=8`
- `node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo`
- `node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.extensions.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions-test.tsbuildinfo`
- `.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
- GitHub checks green for head `47a63f87ea7c2351994fdb71e8cc18041aa0b64e`
Thanks @andyylin.
Co-authored-by: Andy <andyylin@users.noreply.github.com>
Forward canonical inbound media metadata to plugin message_received hooks so plugins can inspect the same mediaPath, mediaUrl, mediaType, mediaPaths, mediaUrls, and mediaTypes fields already available to inbound_claim.
Verification:
- node scripts/run-vitest.mjs src/hooks/message-hook-mappers.test.ts
- /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main
Refs: https://github.com/openclaw/openclaw/pull/87297
Co-authored-by: WarrenJones <8704779+WarrenJones@users.noreply.github.com>
Stop heartbeat runs from directly returning non-ack durable pending final text. Heartbeats now only clear ack-only pending state and otherwise continue the heartbeat turn, so stale prior final answers cannot be replayed through a later heartbeat/default route.
Keep the isolated heartbeat active-run guard so an immediate/manual heartbeat cannot overwrite an isolated heartbeat session that is still running.
Proof:
- node scripts/run-vitest.mjs src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
- git diff --check
- autoreview --mode local
- autoreview --mode branch --base origin/main
- GitHub CI 26543804437, CodeQL 26543804438, Critical Quality 26543804441, OpenGrep PR Diff 26543804440 rerun job 78197443511, Real behavior proof 26544027357
Refs #74257.
Co-authored-by: kesslerio <martin@kessler.io>
Stabilize isolated cron prompt cache affinity by deriving a stable prompt cache key per cron job/session/model and forwarding it separately from the rotating run session id.
Thread the key through embedded runs, stream resolution, provider options, proxy forwarding, custom streams, and prompt-cache observability. Keep OpenAI-compatible payloads valid by using hyphen-safe keys, clamping upstream prompt_cache_key values, and omitting affinity when cache retention is disabled.
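The key derivation described above can be pictured roughly as follows. This is a minimal sketch, not the actual implementation: the function names, the FNV-1a hash, and the 64-character clamp are assumptions made for illustration. The commit only states that the key is stable per cron job/session/model, hyphen-safe, and clamped.

```typescript
// Hypothetical sketch: none of these names or limits come from the OpenClaw source.
const MAX_KEY_LENGTH = 64; // assumed clamp; the real limit is not stated in the commit

// 32-bit FNV-1a hash, used here only so the example needs no imports.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Lowercase, replace anything outside [a-z0-9-] with a hyphen, trim edge hyphens.
function hyphenSafe(value: string): string {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .replace(/-+/g, "-")
    .replace(/^-|-$/g, "");
}

// The key depends only on job, session key, and model, never on the rotating run session id.
function derivePromptCacheKey(jobId: string, sessionKey: string, model: string): string {
  const digest = fnv1a(`${jobId}\u0000${sessionKey}\u0000${model}`);
  const room = MAX_KEY_LENGTH - digest.length - 1;
  const prefix = hyphenSafe(`cron-${jobId}-${model}`).slice(0, room).replace(/-+$/, "");
  return prefix ? `${prefix}-${digest}` : digest;
}
```

Because the run session id is not an input, two runs of the same cron job against the same model produce the same key.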
Thanks @ferminquant.
Co-authored-by: Fermin Quant <ferminquant@hotmail.com>
Rewrites non-canonical api_key fields in auth-profiles.json to canonical key via openclaw doctor --fix, with backups, while preserving canonical key/keyRef credentials and active-agent auth stores.
Fixes #57389.
Co-authored-by: alkor2000 <200923177@qq.com>
* fix(sessions): preserve Matrix room-id case in session keys (#75670)
Matrix room IDs (and thread event IDs) are opaque, case-sensitive per the
Matrix spec, but session-key canonicalization lowercased them. That forked
one room into duplicate sessions and produced 403 M_FORBIDDEN on recovery /
delivery paths that reconstruct the target from the (lowercased) session key,
even though deliveryContext.to stayed correct.
Introduce a generic, opt-in case-preservation registry (CASE_PRESERVING_PEERS)
consulted at all three lowercasing sites:
- construction: normalizeSessionPeerId
- store canonicalization: normalizeSessionKeyPreservingOpaquePeerIds
- gateway send: explicit request.sessionKey
Signal group preservation is encoded to match prior behavior exactly (segment
span, unscoped, thread suffix still lowercased). Matrix channel/group enrolls
the opaque tail (room id with embedded :server + any :thread:<event> suffix).
Exact mixed-case keys now win over folded legacy aliases in
resolveSessionStoreEntry and delivery-info lookup; existing lowercased rows
collapse on the next write. Matrix DM/MXID and non-enrolled channels keep the
default lowercase behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(sessions): guard Matrix folded alias delivery proof
* test(agents): cover cold OpenAI gpt-5.5 fallback
* fix(sessions): preserve non-opaque alias freshness
* fix(sessions): prevent Matrix cross-room thread recovery
* build(protocol): refresh tools effective Swift models
* test(codex): include effective cwd in startup fixture
* test(codex): align startup failure cleanup expectation
* fix(sessions): keep Signal folded aliases fresh
* fix(sessions): preserve unscoped Matrix room keys
* fix(sessions): recover legacy Matrix thread aliases
* fix(sessions): preserve Matrix keys in state migrations
* fix(sessions): keep Matrix structural alias freshness
* fix(sessions): preserve unscoped Matrix migration keys
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
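The opt-in case-preservation idea from the Matrix session-key fix above can be sketched as below. `CASE_PRESERVING_PEERS` and `normalizeSessionPeerId` are named in the commit, but the registry shape and the function signature here are assumptions. The Signal group span rules and the thread-suffix handling are deliberately left out.

```typescript
// Assumed registry shape: channel -> peer kinds whose ids are opaque and case-sensitive.
const CASE_PRESERVING_PEERS = new Map<string, ReadonlySet<string>>([
  ["matrix", new Set(["channel", "group"])],
]);

// Hypothetical signature; lowercases by default and preserves case only for enrolled peers.
function normalizeSessionPeerId(channel: string, peerKind: string, peerId: string): string {
  const trimmed = peerId.trim();
  const enrolled = CASE_PRESERVING_PEERS.get(channel.toLowerCase());
  const preserve = enrolled !== undefined && enrolled.has(peerKind.toLowerCase());
  return preserve ? trimmed : trimmed.toLowerCase();
}
```

With this shape, a Matrix room id such as `!AbC:Example.org` keeps its case, while a Matrix DM/MXID or a non-enrolled channel still folds to lowercase.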
Fix iMessage native exec approval routing so approval prompts bind to the sent GUID without duplicate sends after RPC timeout. Also keeps chat.db GUID recovery on the local imsg path while avoiding local DB recovery for configured or detected SSH wrappers.
Thanks @kevinslin.
Avoid stale restart continuation reuse after a session key has rotated.
Queued restart agent turns now carry the session id they were queued for and fall back to a system wake if the key points at a different session by delivery time. Normal completed-run lifecycle fields stay reusable for fresh sessions, while new-session creation clears stale lifecycle markers.
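A minimal sketch of the delivery-time check, assuming hypothetical type and function names (the commit only says that the queued turn carries its session id and falls back to a system wake on mismatch):

```typescript
// Hypothetical shapes for illustration only.
interface QueuedRestartTurn {
  sessionKey: string;
  sessionId: string; // the session the turn was queued for
  prompt: string;
}

type RestartDelivery =
  | { kind: "agent-turn"; sessionId: string; prompt: string }
  | { kind: "system-wake"; sessionKey: string };

// lookupSessionId resolves the session id the key points at right now.
function resolveRestartDelivery(
  turn: QueuedRestartTurn,
  lookupSessionId: (sessionKey: string) => string | undefined,
): RestartDelivery {
  const current = lookupSessionId(turn.sessionKey);
  if (current === turn.sessionId) {
    return { kind: "agent-turn", sessionId: turn.sessionId, prompt: turn.prompt };
  }
  // The key rotated (or the session is gone): do not replay the stale continuation.
  return { kind: "system-wake", sessionKey: turn.sessionKey };
}
```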
Closes #86593.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Closes #87210.
Gateway probe now waits for GatewayClient.stopAndWait() before resolving so callers do not observe a successful probe while the client socket is still draining. If the drain fails, probe falls back to stop().
Adds mocked probe coverage plus a real WebSocket regression test that verifies no client socket handle remains when probeGateway() resolves.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Show active subagent detail rows in /status with labels and elapsed runtime while keeping completed-subagent summary behavior. Thanks @simplyclever914.
Fixes #83935.
Summary:
- clear stale legacy openai-codex auto route pins only when the canonical OpenAI provider is still using the Codex harness for the same model
- preserve usable Codex auth profiles while clearing stale route state
- keep explicit/custom OpenAI API route pins intact
Verification:
- git diff --check
- pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/model-selection.ts src/auto-reply/reply/model-selection.test.ts src/auto-reply/reply/agent-runner-execution.ts src/auto-reply/reply/agent-runner-execution.test.ts
- fnm exec --using 24.15.0 node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
- .agents/skills/autoreview/scripts/autoreview --mode local
- CI: https://github.com/openclaw/openclaw/actions/runs/26542490863
Co-authored-by: Paul Frederiksen <paul@paulfrederiksen.com>
Fixes #87191. Keeps Brave and Gemini runtime-injected web search provider config readable by providers without re-exposing legacy tools.web.search provider objects to config validation.
Fix Slack draft cleanup after final-visible delivery.
Track when Slack has already delivered a visible final reply and stop reusing the draft finalizer for later same-turn final/error payloads. This keeps the first fallback cleanup for transient previews while preventing late cleanup from deleting a visible answer.
Fixes #87363
Co-authored-by: tianxiaochannel-oss88 <tianxiaochannel@gmail.com>
The compaction retry loop checked the delivery-timeout deadline before
choosing a fixed backoff delay, then slept that whole delay. When the
remaining window was shorter than the next backoff entry, the final
retry could sleep past the deadline, overrunning the delivery timeout
the retry is meant to stay within. Clamp the wait to the remaining
window (min(scheduledDelay, deadline - now)) and stop retrying once no
time remains, so compaction waiting never exceeds the delivery timeout.
Addresses the near-deadline overrun raised in ClawSweeper review of #86606.
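The clamp described above, min(scheduledDelay, deadline - now), can be written as a small pure function. The function and parameter names are assumptions for illustration; only the formula comes from the commit message.

```typescript
// Returns how long to sleep before the next compaction retry,
// or null when no time remains and retrying should stop.
function clampRetryDelay(
  scheduledDelayMs: number,
  deadlineMs: number,
  nowMs: number,
): number | null {
  const remainingMs = deadlineMs - nowMs;
  if (remainingMs <= 0) return null;
  return Math.min(scheduledDelayMs, remainingMs);
}
```

With a 5000 ms scheduled delay and 2000 ms left before the deadline, the wait is 2000 ms, so the final retry can no longer sleep past the delivery timeout.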
Follow-up to #85489. Active requester steering treated a `compacting`
outcome from queueEmbeddedPiMessageWithOutcome as a terminal wake
failure and fell through to the requester-agent/direct fallback, even
though the active run becomes steerable again as soon as compaction
finishes.
Introduce a shared resolveActiveWakeWithRetries helper used by both the
steer path (maybeSteerSubagentAnnounce) and the generated-completion
active wake (sendSubagentAnnounceDirectly). The helper treats
`compacting` as transient and waits through compaction, retrying the
same wake. Waiting is bounded by the active wake's delivery timeout (not
just the backoff schedule): the backoff schedule controls the gap
between attempts, and once it is exhausted its last delay is reused until
the delivery deadline, so a compaction that finishes after the schedule
but within the delivery timeout still re-steers. The best-effort
transcript-commit retry and the compaction retry share one loop, so a
run that compacts and then reports transcript_commit_wait_unsupported
still gets the best-effort retry. Other wake failures keep their
existing single-attempt fallback.
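The retry timing described above can be simulated with a virtual clock, which makes the "reuse the last delay until the delivery deadline" rule easy to check. This is a sketch under assumptions: `simulateWakeRetries` and `backoffDelay` are hypothetical names, not the real `resolveActiveWakeWithRetries` helper, and the transcript-commit retry that shares the loop is omitted.

```typescript
type WakeOutcome = "delivered" | "compacting" | "failed";

// Delay after attempt N: follow the schedule, then reuse its last entry.
function backoffDelay(schedule: readonly number[], attemptIndex: number): number {
  if (schedule.length === 0) return 0;
  return schedule[Math.min(attemptIndex, schedule.length - 1)] ?? 0;
}

// Virtual-clock simulation: a real helper would sleep instead of adding to nowMs.
function simulateWakeRetries(
  attemptWake: (nowMs: number) => WakeOutcome,
  schedule: readonly number[],
  deliveryTimeoutMs: number,
): { outcome: WakeOutcome; attempts: number; elapsedMs: number } {
  let nowMs = 0;
  let attempts = 0;
  for (;;) {
    const outcome = attemptWake(nowMs);
    attempts += 1;
    // Only `compacting` is transient; other outcomes keep single-attempt behavior.
    if (outcome !== "compacting") return { outcome, attempts, elapsedMs: nowMs };
    const waitMs = Math.min(backoffDelay(schedule, attempts - 1), deliveryTimeoutMs - nowMs);
    if (waitMs <= 0) return { outcome, attempts, elapsedMs: nowMs };
    nowMs += waitMs;
  }
}
```

With a schedule of `[100, 200]` and a 1000 ms timeout, a compaction that finishes at 700 ms is still re-steered, even though the schedule itself was exhausted at 300 ms.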
Fixes #86566
Preserve pending agent-job error diagnostics as non-terminal timeout snapshots so the retry grace path can still recover when the lifecycle later starts and completes.
Local proof:
- node scripts/run-vitest.mjs packages/sdk/src/index.test.ts src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat.test.ts src/agents/run-wait.test.ts src/agents/openclaw-tools.sessions.test.ts
- node scripts/run-oxlint.mjs packages/sdk/src/client.ts packages/sdk/src/index.test.ts src/gateway/server-methods/agent-job.ts src/gateway/server-methods/agent.ts src/gateway/server-methods/agent-wait-dedupe.ts src/agents/run-wait.ts src/agents/tools/sessions-send-tool.ts src/gateway/server-methods/server-methods.test.ts src/gateway/server.chat.gateway-server-chat.test.ts src/agents/run-wait.test.ts src/agents/openclaw-tools.sessions.test.ts
- autoreview --mode local: no accepted/actionable findings
- CI run 26536599850: success
Co-authored-by: Martin Garramon <martin@yulicreative.ai>
Include second-level precision in inbound metadata and auto-reply envelope timestamps, matching the timestamp helper contract used by providers and channel adapters.
Docs now show the weekday plus seconds form in date-time and timezone examples.
Verification:
- node scripts/run-vitest.mjs src/auto-reply/envelope.test.ts src/auto-reply/reply/inbound-meta.test.ts
- pnpm docs:list >/tmp/openclaw-docs-list-87360.log
- git diff --check origin/main...HEAD
- pnpm format:docs:check
- pnpm lint:docs
- pnpm lint:extensions:bundled
- pnpm lint
- PR CI green on 495bb6c10f
Fixes #87257
Co-authored-by: GarlicGo <582149912@qq.com>
Expire browser-origin Control UI/WebChat device tokens when shared gateway auth rotates by tagging those tokens with the shared-auth generation and enforcing it during verification.
Preserve the issuer tag when a shared-auth-derived device token reconnects through a non-browser client, so reconnect rotation cannot turn it into an untagged long-lived token.
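The generation tagging can be sketched as follows. Field and function names are assumptions for illustration; the commit only states that browser-origin tokens carry the shared-auth generation, that verification enforces it, and that reconnect rotation keeps the tag.

```typescript
// Hypothetical record shape; `sharedAuthGeneration` is the assumed issuer tag.
interface DeviceTokenRecord {
  token: string;
  sharedAuthGeneration?: number; // set only for tokens derived from shared gateway auth
}

function isDeviceTokenValid(
  record: DeviceTokenRecord,
  presentedToken: string,
  currentGeneration: number,
): boolean {
  // Plain comparison keeps the sketch short; real code should compare in constant time.
  if (record.token !== presentedToken) return false;
  if (record.sharedAuthGeneration === undefined) return true; // untagged token
  return record.sharedAuthGeneration === currentGeneration;
}

// Rotation on reconnect replaces the secret but keeps the issuer tag,
// so the token cannot become an untagged long-lived token.
function rotateOnReconnect(record: DeviceTokenRecord, newToken: string): DeviceTokenRecord {
  return { ...record, token: newToken };
}
```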
Proof:
- OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs src/gateway/server.shared-auth-rotation.test.ts src/infra/device-pairing.test.ts src/gateway/control-ui.http.test.ts
- GitHub CI run 26535632102: relevant build/runtime/test-type checks green; inherited lint reds match origin/main.
- GitHub CodeQL Critical Quality run 26535631610: network-runtime-boundary green.
Co-authored-by: Pavan Kumar Gondhi <pavangondhi@gmail.com>
Fixes repeated Tool Search catalog registration for unchanged effective tool sets by reusing a fingerprinted catalog snapshot across embedded-agent run cleanup.
The reusable catalog is guarded by catalog-affecting fields, parameters, and executable identity, and reuse now rebinds the current run/session refs before returning. Embedded-agent prep logging only suppresses the catalog line when reuse actually happened.
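A rough sketch of fingerprint-guarded reuse, assuming hypothetical type names and a JSON-based fingerprint (the actual fingerprint inputs and catalog structure are not spelled out in the commit):

```typescript
// Hypothetical shapes for illustration only.
interface ToolSpec {
  name: string;
  description: string;
  parameters: unknown;
  executableId: string; // stands in for "executable identity"
}

interface ToolCatalog {
  fingerprint: string;
  tools: ToolSpec[];
  runId: string;
}

// Note: JSON.stringify is sensitive to key order inside `parameters`.
function fingerprintTools(tools: readonly ToolSpec[]): string {
  const sorted = [...tools].sort((a, b) => a.name.localeCompare(b.name));
  return JSON.stringify(
    sorted.map((t) => [t.name, t.description, JSON.stringify(t.parameters), t.executableId]),
  );
}

let cachedCatalog: ToolCatalog | undefined;

function getToolCatalog(
  tools: readonly ToolSpec[],
  runId: string,
): { catalog: ToolCatalog; reused: boolean } {
  const fingerprint = fingerprintTools(tools);
  if (cachedCatalog !== undefined && cachedCatalog.fingerprint === fingerprint) {
    cachedCatalog.runId = runId; // rebind the current run before returning
    return { catalog: cachedCatalog, reused: true };
  }
  cachedCatalog = { fingerprint, tools: [...tools], runId };
  return { catalog: cachedCatalog, reused: false };
}
```

The `reused` flag is what a caller would consult before suppressing the catalog log line.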
Verification:
- pnpm test src/agents/tool-search.test.ts -- --reporter=verbose
- pnpm check:changed, Testbox tbx_01ksney4f00wgk9n39yv7jsh4m
- Real behavior proof, GitHub Actions run 26534896284
- CI rerun for unrelated model-picker timeout passed, GitHub Actions run 26534489215
- autoreview clean: no accepted/actionable findings
Closes #86887
Co-authored-by: Sebastien Tardif <sebtardif@ncf.ca>
Avoids a self-wait in embedded agent session event hooks by skipping the queue drain only for hooks running inside the current session event processing chain. Detached or external hook work still drains the queue before taking the session write lock.
Verification:
- node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-embedded-agent.config.ts src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts
- node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts src/agents/embedded-agent-runner/run/attempt.session-lock.ts --threads=8
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
- GitHub CI: https://github.com/openclaw/openclaw/actions/runs/26533883763
Thanks @luoyanglang.
Co-authored-by: luoyanglang <hanwanlonga@gmail.com>
Make plugin-state enforce the plugin-wide live-row fuse by evicting only from the namespace currently being written, preserving sibling namespace rows and still failing atomically when the current namespace cannot free enough rows.
Raise the plugin-wide cap to 6,000 rows, keep Telegram's persistent message-cache namespace at 3,000 entries, and document the updated SDK runtime contract. Harden legacy plugin-state import so capacity pressure cannot archive a source after losing imported keys, with focused regression coverage for Telegram-shaped namespaces and migration rollback.
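The eviction rule can be illustrated with a small in-memory model. This is only a sketch: the store shape and function name are assumptions, and keys are assumed to be new rather than updates to existing rows.

```typescript
// Hypothetical model: namespace -> live row keys, oldest first.
type PluginRows = ReadonlyMap<string, readonly string[]>;

// Returns a new store; throws before changing anything when the current
// namespace cannot free enough rows to stay under the plugin-wide cap.
function writeRow(store: PluginRows, namespace: string, key: string, pluginCap: number): PluginRows {
  let total = 0;
  for (const rows of store.values()) total += rows.length;
  const current = store.get(namespace) ?? [];
  const overflow = total + 1 - pluginCap;
  if (overflow > current.length) {
    throw new Error("plugin-state capacity exceeded");
  }
  // Evict only the oldest rows of the namespace being written.
  const kept = overflow > 0 ? current.slice(overflow) : current;
  const next = new Map(store);
  next.set(namespace, [...kept, key]);
  return next;
}
```

Sibling namespaces are copied through untouched, which is the property the commit is after.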
Also restore the Docker runtime-assets preflight step in full release validation so release workflow contract tests stay aligned.
Verification: focused plugin-state, migration, Telegram, workflow-contract, lint, deprecated-API, diff-check, Blacksmith Testbox, CI, CodeQL, Workflow Sanity, OpenGrep, and autoreview all passed on PR head fee021cfa6.
Co-authored-by: Keshav's Bot <keshavbotagent@gmail.com>
Use read-only Telegram account inspection for prompt-time channel actions, inline buttons, and reaction guidance so unresolved SecretRef tokens retain configured non-secret behavior before runtime snapshot hydration.
Match runtime Telegram account lookup for normalized config keys and multi-account fallback guards, while keeping sends/actions on the existing strict credential resolution path.
Fixes #75433.
Co-authored-by: Shubhankar Tripathy <reach2shubhankar@gmail.com>
Fixes #87331.
Persist Codex native hook relay generations for real app-server resumes, keep a bounded legacy-binding grace path, and rotate generation on fresh-thread fallback so stale hook commands stay rejected.
Co-authored-by: Alex Knight <15041791+amknight@users.noreply.github.com>
Document that automation should pipe `models auth paste-token` credentials over stdin instead of passing token material in argv, keeping the existing secret-handling path explicit in the CLI docs.
Also include accepted auth-profile credential types in invalid-profile warning logs so malformed local auth stores are easier to repair.
Fixes #63042.
Thanks @liaoandi.
Clarify the Codex Computer Use docs around inferred opt-in, read-only status checks, and marketplace root versus marketplace JSON path setup.
The docs now match current source-backed behavior: autoInstall opts Computer Use in, status does not mutate plugin setup, and marketplacePath is for a local marketplace JSON file while source registers a marketplace root.
Verification:
- pnpm docs:list
- GitHub CI check-docs passed
- Real behavior proof passed via maintainer proof override for this docs-only PR
Thanks @bdjben.
Co-authored-by: Benjamin Badejo <ben@benbadejo.com>
Co-authored-by: Sally O'Malley <somalley@redhat.com>
Split the diffs viewer Shiki language pack into an external publishable plugin.
The diffs plugin keeps the default curated syntax set, while the new @openclaw/diffs-language-pack package carries the extended Shiki languages for npm and ClawHub distribution. The install metadata includes the external ClawHub spec, and the curated C# alias set keeps both c# and cs supported without the language pack.
Co-authored-by: Dallin Romney <dallinromney@gmail.com>
Fix non-interactive and wizard onboarding reruns so existing agent lists and bindings are preserved unless the user explicitly resets config.
Isolate legacy `plugins.installs` migration into its own write so the config size-drop allowance cannot mask unrelated config loss, while preserving new or repaired install records for the final plugin-index commit. Also keep shrinkwrap generation pinned to pnpm-locked transitive patch versions only when the dependency edge still allows that version, and isolate the tooling Vitest shard that mutates process state.
Fixes #84692.
Replaces #84748.
Co-authored-by: yetval <yetvald@gmail.com>
Suppress reasoning-prefixed silent replies before outbound delivery while preserving substantive replies that merely end with the silent token.
Fixes #66701.
Thanks @zuoanCo for the PR and @Cavadus for the report.
Proof: focused Vitest and pnpm check:changed passed on Testbox-through-Crabbox tbx_01ksmvfw0gk9xwh10ra1cyhzfw; CI passed for head a014eb0d91.
Fixes #87226.
Preserve the already-applied `openai` to `openai-codex` Codex runtime promotion when the persisted selection is canonical `openai` with the same model, while keeping explicit runtime provider changes switchable.
Verification:
- `node scripts/run-vitest.mjs src/agents/live-model-switch.test.ts`
- `/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
- `pnpm check:changed` via Testbox `tbx_01ksmr59zdaqj3617w8w53xv4t` / Actions run `26512418770`
- Real behavior proof override gate: Actions run `26513059970`
Co-authored-by: Peter Lindsey <peter@lindsey.jp>
Keeps plain `openclaw status` on a bounded fast path while preserving local status metadata. The default text scan now avoids network update fetches, live channel checks, setup fallback work, and unbounded session hydration; deep/all status keeps the fuller behavior.
Behavior addressed: default status latency from update, channel, setup, and session scans
Real environment tested: GitHub Actions on PR head 98f589a35df74a7abb8327984d0103bb9f31af3e; local focused lint; autoreview
Exact steps or command run after this patch: CI workflow 26510790999; CodeQL workflow 26510790924; CodeQL Critical Quality workflow 26510791058; OpenGrep workflow 26510791138; autoreview branch against origin/main
Evidence after fix: all current-SHA workflows completed successfully; autoreview clean; local focused core oxlint passed on touched status files
Observed result after fix: default status hydrates only visible recent sessions, keeps local update metadata, and shows intentionally skipped SecretRef credentials as unknown instead of warning
What was not tested: live provider/channel roundtrip
Co-authored-by: 1052326311 <1052326311@users.noreply.github.com>
Route Telegram sendMessage action replies through durable outbound delivery so completed agent responses remain retryable when the gateway send path times out.
Verified with focused Telegram/outbound tests, extension test typecheck, prepare build/check/full test gates, and green CI rerun for head 20b45687e1.
Move vLLM Qwen thinking control onto configured model compat metadata and carry it through catalog/model-selection/runtime thinking contexts.
Also migrate legacy provider/default request params in doctor and keep Pi/runtime model rows buildable with explicit reasoning defaults.
Thanks @rendrag-git.
Co-authored-by: rendrag-git <253747599+rendrag-git@users.noreply.github.com>
Summary:
- The PR moves the runtime `HEARTBEAT.md` bootstrap template into `src/agents/templates`, keeps docs templates ... or other workspace files, adds a legacy heartbeat-template doctor repair, and updates package guards/tests.
- PR surface: Source +281, Tests +283, Docs +11, Config +1, Other 0. Total +576 across 15 files.
- Reproducibility: yes. from source inspection: current main loads `HEARTBEAT.md` from the docs template, and ... pty heartbeat file non-empty to the runtime. I did not run a live heartbeat repro in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(doctor): recognize heartbeat docs boilerplate
- PR branch already contained follow-up commit before automerge: fix(agents): update heartbeat workspace test
- PR branch already contained follow-up commit before automerge: fix(doctor): tighten heartbeat template repair
Validation:
- ClawSweeper review passed for head e34e85864c.
- Required merge gates passed before the squash merge.
Prepared head SHA: e34e85864c
Review: https://github.com/openclaw/openclaw/pull/85416#issuecomment-4519851630
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Carry over #82973 and fix #81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support.
The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior.
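The gating rule reads roughly like the sketch below. `compat.supportsPromptCacheKey` and `cacheRetention` are named in the commit; the function name and the use of a plain string for the retention value are assumptions.

```typescript
// Only the field named in the commit is modeled here.
interface ProviderCompat {
  supportsPromptCacheKey?: boolean;
}

// Hypothetical helper: returns the retention value to forward, or undefined to suppress it.
function resolveExplicitCacheRetention(
  explicitRetention: string | undefined,
  compat: ProviderCompat,
): string | undefined {
  if (explicitRetention === undefined) return undefined;
  return compat.supportsPromptCacheKey === true ? explicitRetention : undefined;
}
```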
Fixes #81281.
Supersedes #82973.
Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
Fix runtime context placement so hidden runtime context is model-visible before the active user turn without persisting as a visible/session message.
Verification:
- git diff --check origin/main...origin/pr/86995-merge
- gh pr checks 86995 --repo openclaw/openclaw --watch=false
- gh run rerun 26493979156 --repo openclaw/openclaw --failed
- gh run watch 26493979156 --repo openclaw/openclaw --exit-status
- CodeQL run 26493979156 attempt 2, Security High (mcp-process-tool-boundary) job 78066719467 passed
Preserve replayability for direct Anthropic sessions whose stored assistant thinking blocks have empty or blank signatures after a newer user turn. Older invalid thinking-only assistant turns are replaced with the existing omitted-reasoning placeholder so the turn shape survives provider replay.
Also keep active tool-use continuations safe: when an assistant tool call is followed by tool results, preserve the latest assistant thinking block so signed-thinking providers can replay the current tool turn unchanged.
Proof:
- node scripts/run-vitest.mjs src/agents/pi-embedded-runner.sanitize-session-history.test.ts src/agents/pi-embedded-runner/thinking.test.ts test/scripts/openclaw-e2e-instance.test.ts
- pnpm check:changed via Blacksmith Testbox through Crabbox, tbx_01ksmfypqet50et92vdm5mmv5v, run https://github.com/openclaw/openclaw/actions/runs/26505947008
- Live Anthropic Messages replay accepted the OpenClaw-sanitized active tool-turn history with a real thinking signature.
- PR CI on 37c2e72d82 completed successfully for relevant checks.
Fixes #86886.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Forward cache-read token counts through the OpenAI-compatible chat-completions usage shape as prompt_tokens_details.cached_tokens so clients can price cached turns correctly.
Align internal gateway usage typing with the expanded wire shape.
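As an illustration of the wire shape, the mapping might look like the sketch below. The `GatewayUsage` type and its field names are hypothetical, and it is an assumption here that the internal prompt count already includes cache reads; only `prompt_tokens_details.cached_tokens` comes from the commit.

```typescript
// Hypothetical internal usage shape.
interface GatewayUsage {
  promptTokens: number; // assumed to already include cache-read tokens
  completionTokens: number;
  cacheReadTokens?: number;
}

// Subset of the OpenAI-compatible chat-completions usage object.
interface ChatCompletionsUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens: number };
}

function toChatCompletionsUsage(usage: GatewayUsage): ChatCompletionsUsage {
  const wire: ChatCompletionsUsage = {
    prompt_tokens: usage.promptTokens,
    completion_tokens: usage.completionTokens,
    total_tokens: usage.promptTokens + usage.completionTokens,
  };
  // Only emit the details object when a cache-read count is known.
  if (usage.cacheReadTokens !== undefined) {
    wire.prompt_tokens_details = { cached_tokens: usage.cacheReadTokens };
  }
  return wire;
}
```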
Thanks @caz0075.
Preserve existing `agents.list` and top-level `bindings` during ordinary onboarding reruns so rerunning `openclaw onboard` cannot silently wipe configured agents or routing bindings.
Keep config size-drop allowances scoped to explicit reset/import/plugin-install migration flows, validate binding agent ids against normalized agent ids, and add doctor repair coverage for dangling bindings, which remains best-effort around malformed agent lists.
Closes #84692.
Co-authored-by: yetval <yetvald@gmail.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Closes #87181.
Direct Anthropic Messages requests now send bare Claude model ids even when OpenClaw stores them with the `anthropic/` provider prefix. Anthropic-compatible proxy and custom endpoint routes keep slash-bearing model ids unchanged so configured proxy models do not regress.
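A minimal sketch of the prefix handling, assuming a hypothetical function name and a simple two-value route flag (how the real code distinguishes direct from proxy/custom routes is not described here):

```typescript
type AnthropicRoute = "direct" | "proxy";

// Strip the stored provider prefix only for direct Anthropic Messages requests.
function toAnthropicRequestModel(modelId: string, route: AnthropicRoute): string {
  const prefix = "anthropic/";
  if (route === "direct" && modelId.startsWith(prefix)) {
    return modelId.slice(prefix.length);
  }
  return modelId; // proxy and custom endpoints keep slash-bearing ids unchanged
}
```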
Also preserves the original parse error as `cause` in the JSONL request tail helper to keep the current CI lint gate green.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
* fix(deepinfra): load all DeepInfra models when user wants to browse them during onboarding
* docs(deepinfra): align TTS default
* fix(deepinfra): refresh video fallbacks
* fix(deepinfra): share credential-aware catalog discovery
* test(deepinfra): narrow catalog regression types
* test(deepinfra): keep catalog narrowing across callback
* fix(deepinfra): preserve default model in live catalog
* fix(deepinfra): align default model pricing
* fix(deepinfra): keep pixverse as video default
* docs(deepinfra): match video fallback default
* fix(deepinfra): honor config api keys for live catalog
* test(e2e): wait for watchdog stdio close
* test(media): align live harness provider expectation
* fix(deepinfra): always augment custom catalogs
* test(e2e): resolve watchdog commands before spawning
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Validates forced plugin harness support for the requested provider/model before pinning Codex or any other plugin harness. This prevents an explicitly forced Codex runtime from accepting unsupported OpenAI-like providers through a hardcoded bypass while preserving implicit PI fallback and CLI runtime alias passthrough.
Regression tests cover forced Codex rejection for unsupported openai/openai-codex support, Codex provider support declarations, CLI attempt routing, pi-embedded auth/profile forwarding fakes, Testbox scenario probes, and live Docker Codex plugin E2E.
Thanks @cathrynlavery.
Keep macOS Homebrew setup lazy so users with supported Node and Git can install without admin/Homebrew, while still installing Homebrew before macOS Node or Git package installs.
Updates installer docs and adds focused install.sh coverage for the lazy Git path. Also aligns the live-media provider expectation with current main so built-artifact checks stay green.
Fixes #83232
Co-authored-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(agents): suppress Write/Edit failed warning on response-timeout false-failure (#55424)
Reporter sees '⚠️ Write failed' / '⚠️ Edit failed' warnings on Feishu (and other channels) even though the file was 100% saved successfully (8 of 8 verified writes succeeded; warning shown for all 8). Source path: tool-mutation records lastToolError.timedOut=true with a fileTarget when a write/edit tool ack reply times out after the disk mutation has already completed, then resolveToolErrorWarningPolicy goes through the default mutating-tool branch and emits the misleading failure summary.
Add a narrow gate inside resolveToolErrorWarningPolicy that suppresses the warning only when both lastToolError.timedOut is true AND lastToolError.fileTarget is defined. fileTarget is set by tool-mutation.ts only for the write/edit family (FILE_MUTATING_TOOL_NAMES), so this branch never matches exec/message/cron/gateway mutating-tool timeouts where the disk-write idempotency reasoning does not apply. Real file failures (no timeout) and timeouts without recorded fileTarget keep their visible warnings.
* fix: recover completed write timeouts safely
* fix: bound write timeout recovery precheck
* fix: type write recovery precheck fallback
* test: complete write recovery result mock
* test: isolate e2e timeout fixture shims
* test: stabilize e2e timeout fixture path
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
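The narrow gate from the first commit in the Write/Edit warning series above can be sketched as a predicate. `lastToolError.timedOut` and `lastToolError.fileTarget` are named in the commit; the type shape and function name are assumptions, and the later recovery-precheck commits in the series are not modeled.

```typescript
// Assumed shape; only timedOut and fileTarget are taken from the commit text.
interface LastToolError {
  toolName: string;
  timedOut?: boolean;
  fileTarget?: string; // recorded only for the write/edit tool family
}

// Suppress the failure warning only for timeouts that have a recorded file target.
function shouldSuppressMutationWarning(lastToolError: LastToolError): boolean {
  return lastToolError.timedOut === true && lastToolError.fileTarget !== undefined;
}
```

Real file failures without a timeout, and timeouts from exec/message/cron/gateway tools that record no `fileTarget`, fall through and keep their visible warnings.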
Clarify that OpenAI Realtime voice is billed through OpenAI Platform credits, not Codex/ChatGPT subscription quota, for Voice Call and Control UI Talk.
Document the direct Platform API key path, the `openai-codex` OAuth client-secret path, the quota symptom, and the Platform billing fix. Keep the changelog note crediting @lonexreb.
Closes #76498.
Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
Keep the Codex app-server full attempt watchdog armed after a terminal turn notification is queued, so a wedged notification projector cannot leave a run stuck indefinitely.
Proof:
- `git diff --check origin/main...HEAD`
- `node scripts/run-oxlint.mjs extensions/codex/src/app-server/run-attempt.ts extensions/codex/src/app-server/run-attempt.test.ts`
- `node scripts/run-vitest.mjs run extensions/codex/src/app-server/run-attempt.test.ts --testNamePattern "keeps the attempt watchdog armed"` passed in PR proof (`1 passed | 232 skipped`)
- `OPENCLAW_TESTBOX=1 pnpm check:changed` passed in `tbx_01kskyg44ej461k574jee8ffjc`
- CI required checks green after `build-artifacts` rerun job `78031279635` passed
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Fix Claude CLI skill prompt handling so native skill plugin materialization is prepared before prompt suppression, with the prompt fallback preserved when plugin args are unavailable. Also keeps direct prepared-run callers covered by an execute-time fallback.
Fixes #87063.
Co-authored-by: uday <udaymanish.thumma@gmail.com>
Regression test for the binary stall fix: when rawResponseItem/completed
arrives with a non-assistant type (e.g. "reasoning") and all tracked
items have completed, the completion idle watch must stay armed so the
stall is caught in 60s, not 30 minutes.
Refs openclaw/openclaw#87071
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the codex binary emits rawResponseItem/completed and all tracked
items have completed (activeTurnItemIds empty, no active requests), the
binary should deliver turn/completed imminently. Previously, a
rawResponseItem/completed that didn't qualify as a post-tool assistant
completion would actively disarm the completion idle watch, leaving only
the 30-minute terminal timeout to catch a stalled binary. This caused
turns to hang for up to 30 minutes when the OpenAI Responses API fails
to deliver response.completed to the binary.
Now, rawResponseItem/completed with no active items arms the 60s
completion idle watch and is excluded from the disarm path, so stalled
binaries are detected in 60s instead of 30 minutes.
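The arming condition can be expressed as a small predicate. This is a sketch under assumptions: `activeTurnItemIds` and the `rawResponseItem/completed` notification come from the commit, while the other names and the constants' identifiers are hypothetical.

```typescript
const COMPLETION_IDLE_WATCH_MS = 60 * 1000; // 60s idle watch
const TERMINAL_TIMEOUT_MS = 30 * 60 * 1000; // 30-minute terminal timeout

// Assumed tracking shape.
interface TurnTracking {
  activeTurnItemIds: ReadonlySet<string>;
  activeRequestCount: number;
}

// True when turn/completed should be imminent, so the short idle watch is armed
// regardless of the completed item's type.
function shouldArmCompletionIdleWatch(method: string, tracking: TurnTracking): boolean {
  return (
    method === "rawResponseItem/completed" &&
    tracking.activeTurnItemIds.size === 0 &&
    tracking.activeRequestCount === 0
  );
}
```

When the predicate holds, a stalled binary is caught after `COMPLETION_IDLE_WATCH_MS` rather than `TERMINAL_TIMEOUT_MS`.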
Refs openclaw/openclaw#87071
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restart stale local node-host processes when they reconnect to a newer gateway with a released-version mismatch, so launchd/systemd can restart them with updated code instead of leaving old dynamic imports alive.
Adds gateway mismatch detail propagation, node-host terminal pause handling, and regression coverage for the GatewayClient reconnect-pause path.
Verification:
- node scripts/run-vitest.mjs run src/gateway/client.test.ts -t 'CLIENT_VERSION_MISMATCH' --reporter=verbose
- node scripts/run-vitest.mjs run src/gateway/server.node-version-mismatch.test.ts src/node-host/runner.credentials.test.ts src/gateway/client.test.ts --reporter=verbose
- /Users/steipete/Projects/agent-skills/skills/autoreview/scripts/autoreview --mode local
- Crabbox AWS run_292dcbfd78d9: focused GatewayClient mismatch regression plus server/node-host mismatch tests passed
Co-authored-by: scotthuang <scotthuang@tencent.com>
Persist trailing `/model ...@profile` suffixes through the gateway session patch path so documented per-session credential pinning reaches the session entry. Strip the suffix before model resolution so bare allowlisted model IDs still infer their configured provider, and mark same-model profile-only changes as pending live model switches.
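A rough sketch of the suffix handling, assuming the profile is everything after the last `@` in the `/model` argument. The `splitProfileSuffix` helper and the example model ID are hypothetical and only illustrate stripping the suffix before model resolution.

```typescript
// Hypothetical helper: separate a trailing "@profile" from the model reference.
interface ModelSelection {
  model: string;
  profile?: string;
}

function splitProfileSuffix(raw: string): ModelSelection {
  const at = raw.lastIndexOf("@");
  // No "@", a leading "@", or a trailing "@" means there is no usable suffix.
  if (at <= 0 || at === raw.length - 1) {
    return { model: raw };
  }
  // The bare model ID goes to resolution; the profile is persisted separately.
  return { model: raw.slice(0, at), profile: raw.slice(at + 1) };
}

const selection = splitProfileSuffix("example-model@work");
```

Here `selection.model` would be resolved against the allowlist while `selection.profile` is stored on the session entry.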
Closes #87099.
Verification:
- `npx oxfmt --check src/sessions/model-overrides.ts src/sessions/model-overrides.test.ts src/gateway/sessions-patch.ts src/gateway/sessions-patch.test.ts`
- `node scripts/run-vitest.mjs src/gateway/sessions-patch.test.ts src/sessions/model-overrides.test.ts`
- `npx oxlint src/sessions/model-overrides.ts src/sessions/model-overrides.test.ts src/gateway/sessions-patch.ts src/gateway/sessions-patch.test.ts`
- `/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
- `gh pr checks 87123 --watch --fail-fast`
Co-authored-by: xin zhuang <65798732+1052326311@users.noreply.github.com>
Fix Codex OAuth-backed OpenAI compaction routing by separating the configured provider from the runtime auth provider, preserving same-provider fallback auth, and keeping OpenAI context policy lookup intact. Also preserves the original cause when sessions.send reports A2A fallback failure. Fixes #86373.
Summary:
- Enforces /allowlist config and pairing-store writes against the real command origin plus the selected target.
- Adds regressions for disabled Telegram-origin commands targeting an enabled Discord allowlist.
Verification:
- node scripts/run-vitest.mjs src/auto-reply/reply/commands-allowlist.test.ts
- pnpm check:changed via Blacksmith Testbox tbx_01ksm06e82dnpxmnj00hrt6xzd
- autoreview --mode local clean, no accepted/actionable findings
- GitHub PR checks green on 42a38d2b00
Closes #72360.
Thanks @coygeek.
Co-authored-by: Coy Geek <65363919+coygeek@users.noreply.github.com>
Co-authored-by: opencode <opencode@users.noreply.github.com>
Remove the hidden 15s default from reply-run idle waits so visible user turns do not inherit cleanup-settle behavior while waiting behind an active same-session reply operation.
Keep the 15s timeout explicit for queued follow-up retry/defer paths and interrupt/reset cleanup waits, and add reply-admission regressions for both visible and queued follow-up behavior. Also preserve the original cause on a nearby sessions-send fallback error to keep current lint green after rebasing onto main.
Thanks @keshavbotagent.
Co-authored-by: Keshav's Bot <keshavbotagent@gmail.com>
Fix run-scoped sessions_send active-run fallback handling.
- surface active queue rejection plus durable fallback admission failures instead of returning accepted too early
- return fallback run/session metadata so normal A2A announcement waits on the fallback run
- retry active steering without transcript-commit waiting when the active runtime does not support it
Thanks @TurboTheTurtle.
Verification:
- node scripts/run-vitest.mjs src/agents/openclaw-tools.sessions.test.ts
- pnpm check:test-types
- git diff --check
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
The Windows Gateway daemon crashes (or rather, is killed by Task Scheduler) every time the laptop is unplugged from AC power. The reporter, on Windows 10 22H2, documented a 100% failure rate.
Root cause: `activateScheduledTask` in `src/daemon/schtasks.ts` used `schtasks /Create` with CLI flags (`/SC ONLOGON /RL LIMITED /TR ...`). That CLI surface cannot set `<DisallowStartIfOnBatteries>` or `<StopIfGoingOnBatteries>`, so the task inherits the Task Scheduler defaults (both `true`), which prevent the task from starting on battery and stop it when AC power is lost mid-run.
This change switches `/Create` to `/Create /XML <tempfile>` and emits a Task Scheduler XML payload that mirrors the prior CLI flags (ONLOGON trigger, LeastPrivilege run level, InteractiveToken logon when a `taskUser` is resolved, single-instance policy, no idle restrictions, exec action wired to the existing `gateway.cmd` / `gateway.vbs` launcher) AND sets:
<DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>
<StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>
The XML is written as UTF-16 LE with a BOM, which is what `schtasks /XML` expects on all Windows locales. The temp file is cleaned up in a `finally` block.
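As an illustration of the encoding step, here is a minimal sketch that encodes a settings fragment as UTF-16 LE with a BOM. The `encodeUtf16LeWithBom` helper is hypothetical, and the fragment is only the two battery flags wrapped in a `<Settings>` element, not the full task payload emitted by `src/daemon/schtasks.ts`.

```typescript
// Fragment carrying only the two battery flags named above; a real task
// definition needs the surrounding <Task> document as well.
const batterySettings = [
  "<Settings>",
  "  <DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>",
  "  <StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>",
  "</Settings>",
].join("\r\n");

// Hypothetical helper: encode text as UTF-16 LE with a leading BOM (FF FE).
function encodeUtf16LeWithBom(text: string): Uint8Array {
  const bytes = new Uint8Array(2 + text.length * 2);
  bytes[0] = 0xff;
  bytes[1] = 0xfe;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one UTF-16 code unit
    bytes[2 + i * 2] = unit & 0xff; // low byte first (little endian)
    bytes[3 + i * 2] = unit >> 8; // high byte second
  }
  return bytes;
}

const payload = encodeUtf16LeWithBom(batterySettings);
```

In Node, `Buffer.from("\uFEFF" + text, "utf16le")` should produce the same bytes; the manual loop just makes the byte order explicit.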
The same XML re-apply is also issued from `updateExistingScheduledTask` after the existing `/Change /TR` call, so users upgrading from older versions inherit the new battery flags on the next gateway install/refresh instead of staying broken until a full uninstall+reinstall.
This follows clawsweeper's direction on #59299: "Land a narrow Windows Scheduled Task settings repair that lets the Gateway task start and continue on battery while preserving the current Startup-folder fallback, hidden launcher, quoting, and update behavior."
Preserved unchanged:
- Startup-folder fallback when `/Create` is denied or times out
- Hidden launcher (.vbs) selection via `OPENCLAW_WINDOWS_TASK_HIDDEN_LAUNCHER`
- `quoteSchtasksArg` quoting strategy for the script launch path
- `/Change` update path semantics (still updates `/TR` first)
- All `runScheduledTaskOrThrow` and fallback launch behavior downstream
Verification:
- `node scripts/run-vitest.mjs src/daemon/schtasks.install.test.ts` — 12 passed (incl. 2 new battery-flag regression tests)
- `node scripts/run-vitest.mjs src/daemon/schtasks.test.ts src/daemon/schtasks.startup-fallback.test.ts src/daemon/schtasks.stop.test.ts src/daemon/schtasks-exec.test.ts` — 54 passed (sibling daemon coverage)
- `pnpm tsgo:core` — passed (production typecheck)
Closes #59299
Derive explicit source-reply command turns from authorized control-command bodies when legacy command source metadata is missing.
Preserve native/text structured command semantics, keep unauthorized native commands and structured normal command bodies on plugin-owned fallback paths, and pass bot username normalization through the derived detection.
Co-authored-by: Alex Knight <aknight@atlassian.com>
Bounds nonessential installer finalization probes so npm prefix and daemon-status checks warn and fall back instead of hanging setup.
Thanks @giodl73-repo!
Behavior addressed: doctor hooks model validation now loads the model catalog read-only, so lint/doctor can warn without writable catalog side effects.
Real environment tested: local temp merged tree on current origin/main.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/flows/doctor-core-checks.test.ts src/flows/doctor-health-contributions.test.ts --reporter=dot; ./node_modules/.bin/oxfmt --check --threads=1 src/flows/doctor-core-checks.ts src/flows/doctor-health-contributions.ts src/flows/doctor-core-checks.test.ts src/flows/doctor-health-contributions.test.ts; ./node_modules/.bin/oxlint src/flows/doctor-core-checks.ts src/flows/doctor-health-contributions.ts src/flows/doctor-core-checks.test.ts src/flows/doctor-health-contributions.test.ts; git diff --check origin/main <merged-tree>
Evidence after fix: 2 test files passed, 30 tests passed; oxfmt passed; oxlint passed; diff check passed.
Observed result after fix: hooks.gmail.model doctor paths call loadModelCatalog with readOnly true in both structured and legacy health surfaces.
What was not tested: GitHub Actions run details could not be refreshed because the Actions API was rate-limited; gh reported no required checks for the branch.
Thanks @giodl73-repo.
Co-authored-by: Gio Della-Libera <giodl73@gmail.com>
Keep Windows node service stop/restart/status from treating the gateway listener port as node-owned runtime evidence. Node Scheduled Task and Startup fallback paths now match the installed node host command line before reporting or terminating a node runtime, so WSL2 gateway loopback connectivity is not disturbed by node lifecycle commands.
Fixes #85289.
Verification:
- node scripts/run-vitest.mjs src/daemon/schtasks.startup-fallback.test.ts src/daemon/schtasks.stop.test.ts
- git diff --check
Co-authored-by: Gio Della-Libera <giodl73@gmail.com>
Stage remote iMessage attachments before media understanding so the image pipeline receives local remote-cache paths instead of raw macOS Messages paths.
Fixes #87089
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Fix stale `subagent_announce` history hydration after `/new` by filtering pre-session-start announce/user reply pairs before `chat.history` projection.
Maintainer fixups added:
- require the adjacent assistant reply to carry a pre-session timestamp before dropping it
- preserve record timestamps for oversized transcript placeholders
- run the filter after Claude CLI history import and support imported timestamp/text fallback
- overread one local transcript message only as boundary context so limit-window edges do not leak stale assistant replies
Verification:
- `git diff --check`
- `node scripts/run-vitest.mjs src/gateway/server-methods/server-methods.test.ts src/gateway/session-utils.fs.test.ts src/gateway/session-history-state.test.ts src/gateway/cli-session-history.test.ts src/gateway/server.chat.gateway-server-chat-b.test.ts` -> 11 files, 463 tests passed
- `/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main` -> clean, no accepted/actionable findings
Thanks @openperf.
Fix gateway/chat timeout abort propagation so timed-out runs do not cascade through fallbacks. Preserve provider timeout errors when the gateway abort signal did not fire, and keep timeout stop reasons in async gateway agent results. Includes regression coverage for chat, follow-up, memory flush, fallback classification, and gateway agent timeout results. Fixes #83962.
* fix(plugin-sdk): use Function.name to find onDiagnosticEvent export
normalizeDiagnosticEventsModule hardcodes `mod.r` as the fallback alias
for onDiagnosticEvent, but the bundler reassigns export aliases across
builds. On 2026.5.25-beta.1, `r` is emitFailoverEvent — calling it as
onDiagnosticEvent returns a non-function, so the combo unsubscribe
closure throws TypeError on every gateway stop.
Replace the hardcoded letter with Function.name introspection. JS
functions retain their original .name regardless of export aliasing,
so this survives bundler alias changes.
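The lookup can be sketched roughly as follows. `findExportByFunctionName` and the simulated `bundled` module are hypothetical stand-ins, not the code in `normalizeDiagnosticEventsModule`, and the sketch assumes the build keeps function names intact.

```typescript
type LooseFn = (...args: unknown[]) => unknown;

// Hypothetical helper: locate an export by the function's own .name
// instead of relying on the (unstable) export alias.
function findExportByFunctionName(
  mod: Record<string, unknown>,
  name: string,
): LooseFn | undefined {
  for (const value of Object.values(mod)) {
    if (typeof value === "function" && value.name === name) {
      return value as LooseFn;
    }
  }
  return undefined;
}

// Simulated bundled module: aliases are single letters, names are intact.
function onDiagnosticEvent(listener: (event: string) => void): () => void {
  listener("ready");
  return () => {};
}
function emitFailoverEvent(): void {}

const bundled: Record<string, unknown> = {
  r: emitFailoverEvent, // the alias that used to be hardcoded
  q: onDiagnosticEvent,
};
const found = findExportByFunctionName(bundled, "onDiagnosticEvent");
```

In this example the alias `r` points at the wrong function, yet the name-based lookup still returns `onDiagnosticEvent`.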
Fixes #87082
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(plugin-sdk): cover diagnostic event alias shifts
* fix(plugin-sdk): harden diagnostic alias cleanup
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Recover idle queued sessions whose diagnostic activity retained stale ownerless model or tool calls by classifying them as recoverable session.stuck after the usual recovery gates. Yield the event loop before stale session-lock process inspection so sync process lookup cannot monopolize lock contention paths.
Docs now describe the widened session.stuck telemetry contract for recoverable stale bookkeeping, including ownerless activity. Thanks @samuelsoaress.
Refs #84903.
Co-authored-by: samuelsoaress <samuelsoares177778@gmail.com>
Summary:
- Resolve inbound media references through the shared media-reference path before workspace-relative handling.
- Reuse the same sandbox rewrite for Pi native images and sandbox media bridge paths.
- Add regression coverage for managed inbound images, sandbox-staged media references, and invalid media IDs.
- Fix current lint by using non-mutating cpuprofile sorting.
Verification:
- node scripts/run-vitest.mjs src/media/media-reference.test.ts src/agents/sandbox-media-paths.test.ts src/agents/pi-embedded-runner/run/images.test.ts src/agents/tools/image-tool.test.ts src/media/web-media.test.ts src/agents/tools/pdf-tool.test.ts src/agents/tools/image-generate-tool.test.ts src/agents/tools/video-generate-tool.test.ts src/agents/tools/music-generate-tool.test.ts
- node scripts/run-oxlint-shards.mjs --threads=8
- git diff --check
- /Users/steipete/Projects/agent-skills/skills/autoreview/scripts/autoreview --mode branch --base origin/main
- GitHub CI rollup passed for eceea707a7
Fixes #87024.
Supersedes #87055; thanks @TurboTheTurtle for the report and initial fix direction.
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Release the embedded attempt session lock before sessions_yield abort cleanup waits for session events and rewrites yielded-parent artifacts.
This keeps the existing bounded settle wait while preventing child completion callbacks from contending on the coarse parent transcript lock.
Adds focused session-lock lifecycle coverage.
Refactor memory close provider draining so providers created during shutdown are closed through the same bounded retry path.
Co-authored-by: spacegeologist <zheng.zuo0@gmail.com>
Honor the selected session agent's thinkingDefault for ingress agent runs before global fallback.
Also keep session store cache object-clone writes parse-free while matching persisted JSON shape when cloning values.
Fixes #86669
Co-authored-by: ai-hpc <mail.speedy.hpc@hotmail.com>
Guarantee MCP stdio child cleanup during Gateway shutdown by sending a synchronous SIGKILL when the child survives the existing stdin and SIGTERM waits. This prevents SIGTERM-ignoring local MCP processes from outliving the Gateway when killProcessTree's unref'd SIGKILL timer would otherwise lose the shutdown race.
Fixes #86412.
Verification:
- GitHub CI green on relevant agent/runtime, lint/type, CodeQL/security, OpenGrep, and Real behavior proof checks.
- Real behavior proof: https://github.com/openclaw/openclaw/actions/runs/26430512156/job/77802651894
- Maintainer manual review: no blocking findings.
Thanks @openperf.
Co-authored-by: openperf <16864032@qq.com>
Fix cron delivery previews for no-delivery jobs that still provide explicit message-tool targets.
- Reuse one cron delivery-plan explicit-target predicate across preview and isolated-agent runtime paths.
- Treat numeric threadId 0 as an explicit delivery target.
- Avoid fail-closed wording for unresolved message-tool-only targets.
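The `threadId 0` case is easy to get wrong with a truthiness check. A small sketch, assuming a hypothetical `DeliveryTarget` shape and `hasExplicitThreadId` predicate (neither is taken from the cron delivery-plan code):

```typescript
// Hypothetical target shape used only for this illustration.
interface DeliveryTarget {
  to?: string;
  threadId?: number | string;
}

function hasExplicitThreadId(target: DeliveryTarget): boolean {
  // A truthiness check such as `Boolean(target.threadId)` would reject 0,
  // so compare against the "absent" values explicitly instead.
  return target.threadId !== undefined && target.threadId !== "";
}
```

With this predicate `{ threadId: 0 }` counts as explicit, while `{}` and `{ threadId: "" }` do not.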
Thanks @Alix-007 for the fix.
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
Copy plugin-provided skills from their validated real target into sandbox workspaces while keeping prompt-visible skill paths sandbox-local.
Adds regression coverage for symlinked plugin skills, multiple plugin skill roots, escaped symlink targets, and sandbox prompt paths that must not leak host plugin-skill locations.
Refs #86190
Remove the proposed public `maxReseedHistoryChars` config surface and scale Claude CLI reseed history automatically from the resolved context tier instead.
Claude CLI 200K-context runs now keep a 64K-character reseed slice, 1M Opus/Sonnet runs use the bounded 256KiB cap, and non-Claude CLI backends keep the existing 12KiB default. This preserves the intended long-context behavior without adding another config option.
Verification:
- `node scripts/run-vitest.mjs src/agents/cli-runner/session-history.test.ts src/agents/cli-runner/prepare.test.ts`
- `node scripts/run-vitest.mjs src/agents/cli-runner/prepare.test.ts -t "automatic Claude CLI cap"`
- `node scripts/run-oxlint.mjs src/agents/cli-runner/prepare.ts src/agents/cli-runner/prepare.test.ts src/agents/cli-runner/session-history.ts src/agents/cli-runner/session-history.test.ts src/config/types.agent-defaults.ts src/config/zod-schema.core.ts`
- `pnpm check:changed` via Testbox `tbx_01kska2twjxb925xft9dj82hvb`
- GitHub PR checks green
Closes #83985
Co-authored-by: Abdel Gomez-Perez <nabdel07@icloud.com>
Generate the public config JSON Schema from accepted input shapes so transform-backed fields remain renderable in the Control UI. Keep transform output schemas representable with explicit string pipes, align analyzer metadata handling, and cover the generated schema plus browser-safe UI render shapes.
Co-authored-by: Altay <altay@hey.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Keep the Logs page from rendering competing outer page and inner log-stream scrollbars. The Logs route now opts into an explicit content class for desktop fill-height layout, while mobile keeps the single-page scroll behavior with the capped log panel.
Also adds regression coverage for the route class and CSS ownership selectors.
Co-authored-by: Brian potter <brian@potterdigital.com>
Preserve native slash-command laziness while allowing `/skill` to load workspace skill commands asynchronously when needed. The loaded command list is reused for downstream native skill dispatch so valid `/skill <name>` calls do not get misclassified as unknown.
Verification:
- git diff --check
- fnm exec --using v24.15.0 -- pnpm changed:lanes --json
- .agents/skills/autoreview/scripts/autoreview --mode local
- GitHub CI rollup success for c0d778d512
Co-authored-by: Keshav's Bot <keshavbotagent@gmail.com>
Fixes #86007.
Release note: Windows gateway install/update now ignores a persisted OPENCLAW_WRAPPER when it points back at the generated gateway.cmd task script, preventing recursive gateway startup while keeping valid wrapper installs intact.
Credit: thanks @luoyanglang for the fix and proof.
* fix(gateway): reject RPCs from invalidated device-token clients during rotation/revoke race
device.token.rotate, device.token.revoke and device.pair.remove all
respond 200 OK to the admin, then schedule disconnectClientsForDevice
via queueMicrotask so the response can flush before the socket close.
That microtask window plus the absence of a per-RPC re-check for
device-token auth (unlike shared-auth, which gets checked at
message-handler.ts:1444-1458) created a race: an attacker with RPCs
already pipelined in the WS socket buffer could land a few more
authenticated operations with the rotated/revoked token before the
socket actually closed.
Fix: add a cheap in-memory 'invalidated' flag on GatewayWsClient and
mark it synchronously *before* responding in the three handlers. Add
a mirror check at the start of the per-RPC dispatch that force-closes
the client if the flag is set, regardless of whether socket.close()
has taken effect yet. Disconnect still happens via queueMicrotask so
the admin's rotate/revoke response flushes normally.
Introduces context.invalidateClientsForDevice(deviceId, opts) as a
sync companion to the existing disconnectClientsForDevice. Also
defense-in-depth: disconnectClientsForDevice now sets the flag too,
so any other caller of the hard-disconnect path gets the per-RPC
gate for free.
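The ordering can be illustrated with a small in-memory model. This is a sketch under assumptions: `GatewaySketch`, `ClientSketch`, `rotateToken`, and `dispatch` are hypothetical simplifications, and a resolved promise stands in for `queueMicrotask`. Only the `invalidated` flag and the two `...ClientsForDevice` names come from the description above.

```typescript
interface ClientSketch {
  deviceId: string;
  invalidated: boolean;
  closed: boolean;
}

class GatewaySketch {
  private clients: ClientSketch[] = [];

  connect(deviceId: string): ClientSketch {
    const client: ClientSketch = { deviceId, invalidated: false, closed: false };
    this.clients.push(client);
    return client;
  }

  // Synchronous companion: runs before the admin response is sent.
  invalidateClientsForDevice(deviceId: string): void {
    for (const c of this.clients) {
      if (c.deviceId === deviceId) c.invalidated = true;
    }
  }

  // Hard disconnect also sets the flag (defense in depth).
  disconnectClientsForDevice(deviceId: string): void {
    for (const c of this.clients) {
      if (c.deviceId === deviceId) {
        c.invalidated = true;
        c.closed = true;
      }
    }
  }

  // Per-RPC gate: reject and force-close invalidated clients.
  dispatch(client: ClientSketch): boolean {
    if (client.invalidated) {
      client.closed = true;
      return false;
    }
    return true;
  }

  rotateToken(deviceId: string): string {
    this.invalidateClientsForDevice(deviceId); // flag first, synchronously
    // Deferred close so the admin response can flush (microtask stand-in).
    void Promise.resolve().then(() => this.disconnectClientsForDevice(deviceId));
    return "200 OK";
  }
}
```

Any RPC dispatched between `rotateToken` returning and the deferred disconnect running is rejected by the flag check, which is the window the race exploited.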
* test(gateway): use vi.mocked instead of direct Mock casts in devices tests
check-test-types failed on the PR because direct 'as ReturnType<typeof vi.fn>' casts from RespondFn (or the optional context methods) don't structurally overlap with the Mock type — Mock has mockImplementation/mockReturnValue that RespondFn lacks, so strict tsgo rejects the conversion. vi.mocked() is the intended helper for reinterpreting an already-mocked function, and drops through to the Mock surface cleanly.
* test(gateway): align tests with upstream type/shape changes after rebase
After rebasing onto upstream main, two test surfaces drifted:
1. GatewayRequestContextParams gained two required fields upstream
(getRuntimeConfig, broadcastVoiceWakeRoutingChanged). The
makeContextParams test helper was missing them, so every consumer
tripped tsgo with a missing-field error. Add both as vi.fn()
stubs.
2. revokeDeviceToken's return shape changed upstream from a bare
entry record to a discriminated union {ok: true, entry: ...} | {ok:
false, reason}. The new device.token.revoke synchronous-invalidate
test still mocked the old shape, so the production handler took the
!revoked.ok branch and never reached the invalidateClientsForDevice
call the test asserted. Update the mock to the new union shape.
Also fix three new Set([...] as never) sites in server-request-
context.test.ts that produced Set<unknown> rather than Set<never>.
Move the cast outside the Set constructor so the literal stays
inferred while the wrapper is type-erased to never, which is
assignable to the Partial<GatewayRequestContextParams> clients field.
* fix(gateway): export GatewayRequestContextParams for test access
* fix(ci): resolve check-test-types and lint failures from PR #70707 branch
- server-request-context.test.ts: hasConnectedMobileNode → hasConnectedTalkNode
(field renamed in server-request-context.ts but test fixture not updated)
- status.summary.redaction.test.ts: add configuredModel/selectedModel/
modelSelectionReason to createRecentSessionRow fixture
(SessionStatus gained these fields in a13468320c; test was not updated)
- video-generation-providers.live.test.ts: replace empty {} fallbacks in
conditional spreads with undefined (oxlint 1.65.0, 5 occurrences)
- music-generation-providers.live.test.ts: same fix for 4 occurrences
Remaining CI failures (FsSafeError/Python helper, media tests, Windows ACL,
session-memory hooks) are pre-existing infra failures unrelated to this PR.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* fix(ci): add missing GatewayRequestContextParams fields to test fixture
chatDeltaLastBroadcastText, agentDeltaSentAt, and bufferedAgentEvents are
required fields in GatewayRequestContextParams but were absent from the
makeContextParams fixture, causing TS2322 in check-test-types.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* fix(gateway): serialize credential invalidating RPCs
---------
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Apply diagnostics.otel.flushIntervalMs to OpenTelemetry trace batching so short-lived Windows and QA runs do not lose late lifecycle/model spans. Also make the OTel QA smoke wait for required telemetry and print bounded failure diagnostics.
Keep model browse/list visibility consistent with runtime-normalized allowlist entries while keeping unrestricted default browse off plugin/runtime hydration. Add regression coverage for catalog visibility, `/models` browse data, and the replay sanitizer mock isolation that made the agents shard order-sensitive.
Verification:
- pnpm test src/agents/pi-embedded-runner.sanitize-session-history.test.ts src/agents/model-catalog-visibility.test.ts src/auto-reply/reply/commands-models.test.ts src/auto-reply/reply/model-selection.test.ts src/agents/model-selection.plugin-runtime.test.ts -- --reporter=verbose
- OPENCLAW_VITEST_MAX_WORKERS=2 pnpm exec node scripts/test-projects.mjs test/vitest/vitest.agents-core.config.ts
- .agents/skills/autoreview/scripts/autoreview --mode local
- GitHub Actions CI run 26476126784
* fix(telegram): preserve command slots for aliases
* fix: report Telegram alias command overflow
* fix: preserve Telegram alias menu order
* docs: drop release-owned changelog entry
---------
Co-authored-by: wuyangfan <yangfan.wu@succaiss.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Ensure deferred context-engine maintenance rejects cleanly when the gateway command queue is draining, including coalesced active-run requests. This prevents budget compaction from treating an unscheduled deferred maintenance run as successful and leaving the context engine alive.
Verification:
- pnpm exec oxfmt --check --threads=1 src/process/command-queue.ts src/agents/pi-embedded-runner/compact.queued.ts src/agents/pi-embedded-runner/context-engine-maintenance.ts src/agents/pi-embedded-runner/context-engine-maintenance.test.ts
- pnpm test src/auto-reply/reply/agent-runner-memory.test.ts src/agents/pi-embedded-runner/compact.hooks.test.ts src/agents/pi-embedded-runner/context-engine-maintenance.test.ts src/tasks/task-flow-registry.store.test.ts src/auto-reply/reply/commands-compact.test.ts src/agents/pi-embedded-runner/compact-reasons.test.ts
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
- GitHub Actions CI run 26475226442: relevant Node/Linux, lint, type, security, CodeQL, OpenGrep, Socket, Real behavior proof, and build jobs passed; the Windows job failed before tests because the current runner image has Node 22.19.0 instead of the required 24.x, matching the current infra failure on main.
Fixes #86814.
Reclaims stale plugin lock files only when the previous owner is provably gone or the recorded process start time proves PID reuse. Timestamp age alone now stays fail-closed for PID-owned locks, preserving mutual exclusion for long-running writers while still allowing pidless expired locks to expire.
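The reclaim rule can be sketched as a pure decision function. The `LockRecord` and `OwnerProbe` shapes and `canReclaimLock` are hypothetical and do not mirror `src/infra/stale-lock-file.ts`. The sketch also assumes a missing probe result means nothing can be proven.

```typescript
interface LockRecord {
  pid?: number;
  processStartTimeMs?: number; // owner start time recorded with the lock
  expiresAtMs: number;
}

interface OwnerProbe {
  alive: boolean;
  startTimeMs?: number; // start time of the process currently using the PID
}

function canReclaimLock(lock: LockRecord, nowMs: number, probe?: OwnerProbe): boolean {
  if (lock.pid === undefined) {
    // Pidless locks may still expire by timestamp.
    return nowMs >= lock.expiresAtMs;
  }
  if (probe === undefined) return false; // nothing proven: fail closed
  if (!probe.alive) return true; // previous owner is provably gone
  const pidReused =
    lock.processStartTimeMs !== undefined &&
    probe.startTimeMs !== undefined &&
    probe.startTimeMs !== lock.processStartTimeMs;
  // Timestamp age alone never reclaims a PID-owned lock.
  return pidReused;
}
```

A long-running writer whose lock timestamp is old but whose PID and start time still match keeps its lock under this rule.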
Verification:
- pnpm test src/infra/stale-lock-file.test.ts src/plugin-sdk/file-lock.test.ts
- pnpm tool-display:check
- git diff --check
- autoreview --mode branch --base origin/main
Known CI note: check-guards failed in deps:shrinkwrap:check because npm resolved newer AWS transitive versions than pnpm-lock.yaml contains; no package or lock files are changed in this PR.
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
Remove the transcript redaction path for sessions_spawn arguments and inline attachments. OpenClaw transcripts are local trusted-operator state, and streamTo/resumeSessionId are runtime routing fields that must not be rewritten before replay or dispatch.
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Runtime-injected web_search provider config from plugins.entries.<plugin>.config.webSearch now stays available to provider execution without being validated as user-authored legacy tools.web.search.<provider> config.
Co-authored-by: luoyanglang <hanwanlonga@gmail.com>
Preserve legacy numeric stable git tags while excluding named semver prerelease tags from stable git channel detection and status display.
Thanks @goldmar.
Memoize owner process argv lookups per PID during `cleanStaleLockFiles`, and yield between lock entries so startup cleanup does not monopolize the event loop while inspecting many session locks.
This keeps lock classification semantics unchanged while avoiding repeated synchronous process-args reads for lock clusters owned by the same PID, especially the Windows PowerShell path.
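A minimal sketch of the two techniques, per-PID memoization and yielding between entries. `memoizeByPid`, `yieldToEventLoop`, and `inspectLocks` are hypothetical names, and `readArgv` stands in for whatever synchronous process-args lookup `cleanStaleLockFiles` actually uses.

```typescript
// Cache a synchronous lookup so each PID is inspected at most once.
function memoizeByPid<T>(lookup: (pid: number) => T): (pid: number) => T {
  const cache = new Map<number, T>();
  return (pid: number): T => {
    if (!cache.has(pid)) {
      cache.set(pid, lookup(pid));
    }
    return cache.get(pid) as T;
  };
}

// Let pending timers and I/O callbacks run before the next lock entry.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

async function inspectLocks(
  ownerPids: number[],
  readArgv: (pid: number) => string[],
): Promise<string[][]> {
  const cachedArgv = memoizeByPid(readArgv);
  const results: string[][] = [];
  for (const pid of ownerPids) {
    results.push(cachedArgv(pid)); // repeated PIDs reuse the cached argv
    await yieldToEventLoop();
  }
  return results;
}
```

For a cluster of locks owned by one PID, the expensive lookup runs once and the loop still gives other work a turn between entries.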
Fixes #86509.
Verification:
- `git diff --check origin/main...HEAD`
- focused TSX harness against the current-main merge result: `session-lock memo regression harness passed`
Thanks @openperf.
Co-authored-by: openperf <16864032@qq.com>
Project newer external OpenClaw chat history into resumed Codex app-server threads when the saved binding is older than user-visible transcript messages, while filtering Codex-owned mirror records on consecutive resumes.
Thanks @TurboTheTurtle!
Keep Codex app-server turn timeouts within the Codex runtime boundary so they interrupt the active turn without retiring the shared app-server client, poisoning auth-profile cooldowns, or falling through to generic provider/model fallback.
Preserve concrete non-timeout provider failures for auth-profile rotation and fallback, and add regression coverage for prompt-stage timeouts, assistant idle timeouts, auth-profile cooldowns, and app-server timeout handling.
Thanks @pashpashpash.
Fixes #74061.
Stages absolute final-reply MEDIA paths that already live under the agent workspace before sandbox path translation runs, so Telegram/local delivery can attach generated workspace media instead of dropping it as Media failed. Outside-workspace host-local paths remain blocked, and host-read HTML stays denied pending separate security-boundary review.
Verification:
- git diff --check origin/main...refs/remotes/pull/86531
- git merge-tree --write-tree origin/main refs/remotes/pull/86531
- reviewed src/auto-reply/reply/reply-media-paths.ts, src/media/web-media.ts, and focused tests
Co-authored-by: mjamiv <74088820+mjamiv@users.noreply.github.com>
Remove the Telegram DM thread reply policy config and use Telegram bot capability as the single source of truth for DM topic session splitting.
DM messages with message_thread_id now split into thread-scoped sessions only when Telegram getMe reports has_topics_enabled for the bot. Doctor removes retired dm.threadReplies and direct.*.threadReplies keys, docs explain the upgrade behavior, and startup keeps cached bot info as a non-auth fallback when a fresh probe fails.
Refs #86513.
Thanks @alexph-dev.
Verification:
- pnpm docs:list
- pnpm exec oxfmt --check --threads=1 extensions/telegram/src/channel.ts extensions/telegram/src/channel.gateway.test.ts extensions/telegram/src/doctor-contract.ts extensions/telegram/src/doctor.test.ts
- git diff --check
- node scripts/run-vitest.mjs extensions/telegram/src/channel.gateway.test.ts extensions/telegram/src/doctor.test.ts extensions/telegram/src/bot/helpers.test.ts extensions/telegram/src/bot-message-context.dm-threads.test.ts extensions/telegram/src/config-schema.test.ts
- pnpm config:channels:check
- pnpm config:docs:check
- .agents/skills/autoreview/scripts/autoreview --mode local
- GitHub Actions: CI 26468039803, Workflow Sanity 26468040057, OpenGrep 26468039472, Real behavior proof 26468036483, CodeQL 26468039466, CodeQL Critical Quality 26468039473
Known CI caveat: checks-windows-node-test failed before tests because Windows runner setup left Node 22.19.0 active while the job requested Node 24.x; the same setup failure is present on current main CI run 26468063947.
Reworks the Codex app-server native thread reuse guard so OpenClaw no longer adds a user-facing token config. Token clearing now prefers Codex's reported model context window, falls back to a high internal recovery fuse, and preserves context-engine thread-bootstrap reuse while keeping byte guard behavior intact.
Verification:
- `fnm exec --using v24.15.0 -- node scripts/run-vitest.mjs run extensions/codex/src/app-server/run-attempt.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts --reporter=dot --pool=forks --no-file-parallelism`
- `git diff --check`
- `.agents/skills/autoreview/scripts/autoreview --mode local --base origin/main`
- Testbox `check:changed`: `tbx_01ksjm1hy7mfrc5bebzyckqdew`, GitHub Actions run https://github.com/openclaw/openclaw/actions/runs/26463150977, exit 0
- PR CI green after rerunning unrelated `checks-node-agentic-agents` flake and stuck OpenGrep scan
Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org>
* fix: validate wide-area dns domains
* addressing codex review
* fix(dns-cli): throw explicit DNS-name error on invalid --domain
resolveWideAreaDiscoveryDomain catches the validation error from
normalizeWideAreaDomain and returns null, so dns setup --domain foo/bar
fell through to the "No wide-area domain configured" branch instead of
surfacing the invalid-domain diagnostic. Validate explicit CLI/config
input directly so the user-facing setup command reports the actual
problem; preserve the resolver's silent env-fallback semantics for the
background callers that depend on graceful degradation.
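The split between strict explicit input and lenient background resolution can be sketched like this. All three functions are hypothetical, and the label rules are a generic RFC 1035-style check, not necessarily what `normalizeWideAreaDomain` enforces.

```typescript
// Generic hostname check (assumption): labels of 1-63 letters, digits, or
// hyphens, with no leading or trailing hyphen; a trailing dot is allowed.
function isValidDnsName(input: string): boolean {
  const name = input.endsWith(".") ? input.slice(0, -1) : input;
  if (name.length === 0 || name.length > 253) return false;
  const label = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i;
  return name.split(".").every((part) => label.test(part));
}

// Explicit CLI/config input: surface the real problem to the user.
function requireExplicitDomain(input: string): string {
  if (!isValidDnsName(input)) {
    throw new Error(`Invalid DNS name: ${input}`);
  }
  return input;
}

// Background callers: degrade gracefully by returning null.
function resolveDomainOrNull(input: string | undefined): string | null {
  return input !== undefined && isValidDnsName(input) ? input : null;
}
```

With `foo/bar`, the explicit path throws a DNS-name error while the background path quietly returns `null`.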
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(gateway): lock in graceful degrade on invalid wide-area config
Drive startGatewayDiscovery through the real resolveWideAreaDiscoveryDomain
with wideAreaDiscoveryDomain: "foo/bar" so the test exercises the actual
swallow-and-return-null path. Asserts the operator-facing warning is
logged, writeWideAreaGatewayZone is never called, and startup completes
without throwing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(gateway): type resolveWideAreaDiscoveryDomain mock to match real signature
vi.fn(() => "openclaw.internal.") inferred the mock as `() => string`, so
mockImplementationOnce(realResolver) tripped tsgo:core:test with TS2345.
Apply the same vi.fn<typeof ...>(...) pattern the file already uses for
writeWideAreaGatewayZone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(changelog): note dns validation fix
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Agustin Rivera <agustin@rivera-web.com>
Updates Discord voice Opus callers to the published libopus-wasm 0.1.0 API, pins the Discord plugin dependency and lockfiles to that release, keeps the package freshness exception version-scoped, treats expected Discord receive-stream premature closes as normal stream ends, and includes routed OpenClaw transcript roots for local PR transcript discovery.
Proof: npm view libopus-wasm@0.1.0; pnpm install --lockfile-only --filter @openclaw/discord; Node encode/decode smoke with pkg 0.1.0 decoded=3840; node scripts/run-vitest.mjs extensions/discord/src/voice/audio.test.ts extensions/discord/src/voice/receive-recovery.test.ts; git diff --check; autoreview clean; live tmux gateway on e0fa3e3 joined Discord voice and processed realtime audio without decoder.decode or Premature close warning spam.
Guard loadUsage in the Control UI overview secondary refresh so stale overview loads do not start the expensive usage.cost RPC after the user has navigated away. Active overview usage loading is preserved.
Fixes #86392.
Thanks @Marvinthebored for the report, live gateway proof, and patch.
Verification:
- CI=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=120000 fnm exec --using v24.15.0 -- node scripts/run-vitest.mjs run ui/src/ui/app-settings.refresh-active-tab.node.test.ts --reporter=dot --pool=forks --no-file-parallelism
- GitHub PR checks green on d52d8d10da, including Real behavior proof and checks-node-core-ui.
Co-authored-by: Marvinthebored <262704729+Marvinthebored@users.noreply.github.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Move meeting notes into core transcripts, remove the bundled meeting-notes plugin/API, and require explicit transcripts.enabled before exposing the recording-capable tool.
Fix outbound message actions so structured attachments[] media participates in existing sandbox, local-root, and hydration checks. Single-attachment actions select structured attachments only when no top-level or plugin media source wins, while send collects all structured attachments. Proof: git diff --check; pnpm tsgo:core && pnpm tsgo:test:src; direct selector/hydration probe; autoreview clean.
Tag authorized Mattermost typed text-slash control commands with CommandSource: text so existing explicit-command source-reply delivery bypasses message_tool_only suppression for /new, /reset, ACP reset, and soft-reset acknowledgement replies.
Remove the normal PR changelog edit flagged by review and keep release-note context in the PR body/squash message. Tighten the regression test to exercise the leading-space Mattermost text-post path used to bypass native slash handling and assert the normalized command body.
Local proof: node scripts/run-vitest.mjs extensions/mattermost/src/mattermost/monitor.inbound-system-event.test.ts src/auto-reply/command-turn-context.test.ts src/auto-reply/reply/source-reply-delivery-mode.test.ts src/auto-reply/reply/commands-reset-hooks.test.ts; git diff --check origin/main..HEAD; oxfmt check; autoreview clean.
CI: PR run 26443271650 passed relevant checks. Ignored check-test-types failure because the exact same extensions/codex/src/app-server/run-attempt.test.ts TS2345 failure is already present on main run 26442926352 at the PR base.
Fixes #86664.
* fix(imessage): send group media via attachment command
* fix(imessage): preserve media rpc fallback
---------
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Summary:
- The PR updates diagnostics to mark streamed model chunks as run progress, keeps silent model calls abortable after the stuck-session timeout, and adds regression coverage for stream progress and recovery behavior.
- PR surface: Source +54, Tests +229. Total +283 across 6 files.
- Reproducibility: yes. at source level: current main tracks model-call start/end activity but streamed chunks ... covery keys on stale lastProgressAgeMs. I did not run a live local-provider repro in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(diagnostics): track model stream progress
- PR branch already contained follow-up commit before automerge: test(diagnostics): cover silent local model aborts
- PR branch already contained follow-up commit before automerge: fix(diagnostics): skip stream progress when disabled
Validation:
- ClawSweeper review passed for head fcc74d9869.
- Required merge gates passed before the squash merge.
Prepared head SHA: fcc74d9869
Review: https://github.com/openclaw/openclaw/pull/86757#issuecomment-4540111930
Co-authored-by: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
Summary:
- The PR adds runtime-only external OAuth provenance to auth-profile stores, updates save/merge/read paths to ... e profiles in active snapshots while filtering disk persistence, and expands auth-profile regression tests.
- PR surface: Source +381, Tests +974. Total +1355 across 8 files.
- Reproducibility: yes. from source: current main writes the disk-filtered localStore into an existing runtime ... tches the reported credential drop path. I did not run a failing current-main repro in this read-only pass.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Preserve runtime external auth snapshots
Validation:
- ClawSweeper review passed for head a73074ed45.
- Required merge gates passed before the squash merge.
Prepared head SHA: a73074ed45
Review: https://github.com/openclaw/openclaw/pull/85558#issuecomment-4523577269
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- The PR preserves provider-facing embedded-runner prompt errors when cleanup detects session takeover, keeps the takeover signal fatal for fallback, and adds focused regressions.
- PR surface: Source +52, Tests +92. Total +144 across 5 files.
- Reproducibility: yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/p ... rror and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(embedded-runner): preserve takeover during fallback
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8405…
Validation:
- ClawSweeper review passed for head 050c779cfa.
- Required merge gates passed before the squash merge.
Prepared head SHA: 050c779cfa
Review: https://github.com/openclaw/openclaw/pull/84321#issuecomment-4492087335
Co-authored-by: abnershang <abner.shang@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
* refactor: use Rastermill for image processing
* docs: clarify autoreview heartbeat patience
* refactor: use simplified rastermill api
* fix: preserve rastermill media safety boundaries
* build: update rastermill api pin
* build: use published rastermill package
Summary:
- Adds `plugins/synthetic-auth.runtime` as an explicit tsdown dist entry and adds a regression test tying PI model-discovery synthetic-auth imports to that stable entry.
- PR surface: Tests +22, Other +1. Total +23 across 2 files.
- Reproducibility: yes. as a source-reproducible package-build path: current main imports synthetic-auth from ... y. The PR proof covers emitted production `dist/` imports, though it did not run a live scheduled cron job.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(build): pin synthetic auth runtime dist entry
Validation:
- ClawSweeper review passed for head cb99947919.
- Required merge gates passed before the squash merge.
Prepared head SHA: cb99947919
Review: https://github.com/openclaw/openclaw/pull/86714#issuecomment-4538919657
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- This PR changes DeepSeek provider tool-schema normalization to convert multi-value string const unions into flat string enums, with regression coverage for pure, nullable, and single-const union cases.
- PR surface: Source +27, Tests +84. Total +111 across 2 files.
- Reproducibility: yes. source-level reproduction is high confidence: current main selects only the first non-null anyOf/oneOf variant, and the linked source PR proof shows before/after output for that exact schema shape.
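A rough sketch of the union-to-enum flattening summarized above. The function name and the exact handling of nullable variants (dropped here) are assumptions for illustration, not the plugin SDK's actual code.

```typescript
type JsonSchema = { [key: string]: unknown };

// Collapse anyOf/oneOf made only of string consts (plus optional null) into a flat enum.
function flattenStringConstUnion(schema: JsonSchema): JsonSchema {
  const variants = (schema.anyOf ?? schema.oneOf) as JsonSchema[] | undefined;
  if (!Array.isArray(variants)) return schema;

  // Assumption: a { type: "null" } variant is dropped rather than encoded in the enum.
  const nonNull = variants.filter((variant) => variant.type !== "null");
  const consts = nonNull.map((variant) => variant.const);
  const allStrings = consts.length > 0 && consts.every((value) => typeof value === "string");
  if (!allStrings) return schema; // leave other union shapes untouched

  const rest: JsonSchema = { ...schema };
  delete rest.anyOf;
  delete rest.oneOf;
  return { ...rest, type: "string", enum: consts };
}
```

Keeping every const, instead of only the first non-null variant, is the behavior change the summary describes.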
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(plugin-sdk): preserve string-const unions as flat enum for deepse…
Validation:
- ClawSweeper review passed for head 310d95e327.
- Required merge gates passed before the squash merge.
Prepared head SHA: 310d95e327
Review: https://github.com/openclaw/openclaw/pull/86712#issuecomment-4538892244
Co-authored-by: 1052326311 <1052326311@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Move immutable session-store snapshot cloning/freezing off the write path and rebuild snapshots lazily on read. Resolve runtime external auth profiles once per auth-profile save instead of once per OAuth profile.
Proof: oxfmt targeted files; pnpm tsgo:core; pnpm check:test-types; node scripts/run-vitest.mjs src/config/sessions.cache.test.ts src/agents/auth-profiles.store.save.test.ts src/agents/auth-profiles/external-oauth.test.ts; autoreview clean.
Route invalid-config recovery output for source-only installed plugin packages to plugin packaging guidance instead of openclaw doctor --fix.
Validated with focused config/CLI/gateway/plugin tests, autoreview, Crabbox/Testbox E2E tbx_01ksgr80tnvvc13kv6t126yv78, and green PR CI on 3b3ce73d0f.
Thanks @brokemac79.
Reuse a lazy model manifest context across configured model resolution so common static defaults do not trigger manifest metadata loads, while keeping plugin-owned normalization available when aliases, provider rows, or OpenRouter compat paths need it.
Preserves exact alias behavior, auth-profile-suffixed alias behavior, provider inference from manifest-normalized configured refs, and existing plugin/runtime cache lifecycle rules.
Co-authored-by: Alyana <alyana@lumina.local>
Use the effective runtime/model context when computing overflow recovery reserveTokensFloor hints, including uncataloged runtime refs, stale session windows, and heartbeat fallback cases.
Verification:
- pnpm test src/auto-reply/reply/agent-runner-execution.test.ts
- autoreview clean on final focused fixup; prior accepted findings addressed before push.
- CI passed on head e25b3e84f4 after rerunning cancelled jobs: preflight, critical quality network-runtime-boundary, security high, checks, Real behavior proof.
Co-authored-by: tanshanshan <tanshanshan@users.noreply.github.com>
Forward OpenAI-compatible frequency_penalty, presence_penalty, and seed params through the gateway/chat-completions path while keeping Responses untouched.
Verification:
- pnpm test src/gateway/openai-http.test.ts src/agents/pi-embedded-runner/extra-params.sampling.test.ts src/agents/openai-transport-stream.test.ts
- CI passed on head 9abb9466d9 after rerunning cancelled jobs: preflight, critical quality network-runtime-boundary, security high, checks, docs, Real behavior proof.
Co-authored-by: lellansin <lellansin@gmail.com>
Cache configured model cost indexes for repeated session usage cost lookups while preserving in-place config mutation behavior via value-fingerprint invalidation. Raw pricing lookups now skip manifest model-id normalization as well as runtime/plugin normalization, keeping direct cost lookup off plugin metadata hot paths.
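One possible shape for value-fingerprint invalidation, sketched with JSON.stringify as the fingerprint. The types, names, and fingerprint choice are assumptions; the real cache in src/utils/usage-format.ts may differ.

```typescript
type Cost = { input: number; output: number };
type CostTable = Record<string, Cost>;

function createCostIndexCache() {
  let cached: { fingerprint: string; index: Map<string, Cost> } | undefined;
  let builds = 0;

  return {
    lookup(config: CostTable, modelId: string): Cost | undefined {
      // Fingerprint by value so in-place config mutation still invalidates the index.
      const fingerprint = JSON.stringify(config);
      if (cached === undefined || cached.fingerprint !== fingerprint) {
        const index = new Map<string, Cost>();
        for (const [id, cost] of Object.entries(config)) {
          index.set(id.toLowerCase(), { ...cost }); // copy: the index is derived data
        }
        cached = { fingerprint, index };
        builds += 1;
      }
      return cached.index.get(modelId.toLowerCase());
    },
    // Number of index rebuilds, exposed for illustration only.
    get builds(): number {
      return builds;
    },
  };
}
```

Repeated lookups against an unchanged config reuse the index; mutating a price in place changes the fingerprint and forces one rebuild.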
Verification:
- node scripts/run-vitest.mjs src/utils/usage-format.test.ts
- pnpm exec oxfmt --check src/utils/usage-format.ts src/utils/usage-format.test.ts
- pnpm lint --threads=8
- pnpm tsgo:core
- autoreview --mode local
- PR CI green on head 15c1e25d95
Cap retained compaction checkpoint snapshots by total bytes per session while preserving the existing count cap.
The gateway now stats retained checkpoint snapshots inside the session-store writer before trimming, deletes older trimmed checkpoint files, and keeps the newest checkpoint available. Regression coverage uses real sparse checkpoint files to prove byte-budget cleanup.
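A sketch of the combined count and byte cap, assuming checkpoints arrive newest-first with sizes already read. The Checkpoint shape and function name are hypothetical.

```typescript
interface Checkpoint {
  path: string;
  bytes: number;
}

// Input is ordered newest-first. Once one checkpoint is trimmed, all older ones are too.
function trimCheckpoints(newestFirst: Checkpoint[], maxCount: number, maxBytes: number) {
  const kept: Checkpoint[] = [];
  const trimmed: Checkpoint[] = [];
  let totalBytes = 0;
  let trimming = false;

  for (const checkpoint of newestFirst) {
    const overCount = kept.length >= maxCount;
    const overBytes = totalBytes + checkpoint.bytes > maxBytes;
    // The newest checkpoint is always kept, even if it alone exceeds the byte budget.
    if (!trimming && kept.length > 0 && (overCount || overBytes)) {
      trimming = true;
    }
    if (trimming) {
      trimmed.push(checkpoint);
    } else {
      kept.push(checkpoint);
      totalBytes += checkpoint.bytes;
    }
  }
  return { kept, trimmed };
}
```

The caller would then delete the files listed in trimmed; that step is omitted here.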
Closes #84822.
Summary
- Bound Memory Wiki compile-time page summary reads through the existing concurrency helper.
- Preserve deterministic result ordering before title sort and keep the helper in stop-on-error mode.
- Replaces #84458 because the fork branch does not allow maintainer edits and the contributor changelog entry needed removal.
Behavior addressed: Memory Wiki compile no longer starts one page-summary read per page without a bound.
Real environment tested: Local macOS source checkout, Node/pnpm repo environment.
Exact steps or command run after this patch: pnpm test extensions/memory-wiki/src/compile.test.ts; pnpm exec oxfmt --check --threads=1 extensions/memory-wiki/src/compile.ts extensions/memory-wiki/src/compile.test.ts; .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main --no-web-search --prompt "Review PR #84458 after maintainer fixup. Focus on memory-wiki compile page summary read concurrency, runTasksWithConcurrency result/error handling, ordering preservation, and test reliability."
Evidence after fix: compile.test.ts passed 10 tests; oxfmt reported clean; autoreview reported no accepted/actionable findings.
Observed result after fix: Page reads are executed through runTasksWithConcurrency with errorMode stop, successful results are consumed in input-index order, and the existing summary title sort remains deterministic.
What was not tested: Full repository suite.
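For illustration, a bounded-concurrency map with stop-on-error and input-index ordering could look like the following. mapWithConcurrency is a hypothetical stand-in; the signature of the repo's runTasksWithConcurrency helper is not shown in this log.

```typescript
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  let failed = false;

  async function worker(): Promise<void> {
    // Each worker pulls the next unclaimed index until the input is exhausted.
    while (!failed && next < items.length) {
      const index = next++;
      try {
        results[index] = await task(items[index], index); // slot by input index
      } catch (error) {
        failed = true; // stop-on-error: other workers stop claiming new items
        throw error;
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Because results are written by input index, a later sort starts from a deterministic order regardless of which read finishes first.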
Co-authored-by: zhengzuo0-ai <zheng.zuo0@gmail.com>
Behavior addressed: Unknown CLI command roots now error consistently even when --help or --version is appended, while legitimate built-in help fast paths still render normally.
Real environment tested: Local OpenClaw source checkout plus GitHub workflow run-level status.
Exact steps or command run after this patch: pnpm test src/cli/run-main.exit.test.ts src/cli/argv.test.ts src/cli/argv-invocation.test.ts; pnpm exec oxfmt --check --threads=1 src/cli/run-main.ts src/cli/run-main.exit.test.ts; autoreview --mode branch --base origin/main --no-web-search.
Evidence after fix: Focused CLI test shards passed 178 tests; formatter clean; autoreview reported no accepted/actionable findings; GitHub CI run 26422344121 and CodeQL Critical Quality run 26422344090 completed successfully.
Observed result after fix: `openclaw foo --help` and `openclaw foo --version` reject before proxy/program startup, while known help fast paths remain ahead of the unknown-root guard.
What was not tested: Full local build; contributor PR body already supplied build/CLI command proof before rebase.
Co-authored-by: YB0y <brianandez6@gmail.com>
Behavior addressed: The codex-cli metadata branch no longer calls process.exit(0) immediately after writing stdout, and it still emits exactly one unsupported-backend JSON object.
Real environment tested: Local OpenClaw source checkout on macOS with Node/tsx.
Exact steps or command run after this patch: pnpm test test/scripts/print-cli-backend-live-metadata.test.ts test/scripts/docker-build-helper.test.ts; node --import tsx scripts/print-cli-backend-live-metadata.ts codex-cli | python3 -c 'import sys,json; print(json.load(sys.stdin)["provider"])'; autoreview --mode branch --base origin/main --no-web-search.
Evidence after fix: Focused tooling test shard passed 2 files / 23 tests; direct pipe parse printed codex-cli; autoreview reported no accepted/actionable findings; PR status rollup was clean.
Observed result after fix: stdout is parseable as a single JSON payload and the normal metadata path is skipped for codex-cli.
What was not tested: Live provider metadata paths beyond the focused existing test coverage.
Co-authored-by: Iftekhar Uddin <ifuddin3@gmail.com>
Behavior addressed: Native Codex app-server threads now disable Codex's built-in personality on thread/start, thread/resume, turn/start, bound conversation turns, and /btw side-thread forks so OpenClaw agent workspace identity stays authoritative.
Real environment tested: Local OpenClaw source checkout plus GitHub CI on PR #85891.
Exact steps or command run after this patch: pnpm test extensions/codex/src/app-server/thread-lifecycle.test.ts extensions/codex/src/app-server/side-question.test.ts extensions/codex/src/conversation-binding.test.ts extensions/codex/src/app-server/schema-normalization-runtime-contract.test.ts; pnpm check:docs; pnpm prompt:snapshots:check; OPENCLAW_ADDITIONAL_BOUNDARY_SHARD=1/4 OPENCLAW_ADDITIONAL_BOUNDARY_CONCURRENCY=4 node scripts/run-additional-boundary-checks.mjs.
Evidence after fix: Focused Codex test shard passed 4 files / 79 tests; docs check passed; prompt snapshots are current; CI passed all code/quality checks, with only Real behavior proof failing as unrelated proof-bot gating for this non-channel change.
Observed result after fix: App-server request snapshots and unit tests include personality: "none" on native Codex start/resume/turn/fork paths.
What was not tested: A live Codex app-server model run was not executed.
Co-authored-by: Beru <beru@lastguru.lv>
renewInterval is not cleared on re-entry to startGmailWatcher,
leaking the previous timer. Each config reload adds another
interval that fires independently.
Clear existing watcher state before starting a new one.
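A minimal sketch of the clear-before-start pattern, with the timer functions injected so the behavior can be checked without real timers. createRenewingWatcher and TimerApi are hypothetical names, not the Gmail watcher's actual structure.

```typescript
interface TimerApi<H> {
  set(fn: () => void, ms: number): H;
  clear(handle: H): void;
}

function createRenewingWatcher<H>(timers: TimerApi<H>) {
  let renewHandle: H | undefined;

  function stop(): void {
    if (renewHandle !== undefined) {
      timers.clear(renewHandle);
      renewHandle = undefined;
    }
  }

  function start(renew: () => void, everyMs: number): void {
    stop(); // re-entry: drop the previous interval before creating a new one
    renewHandle = timers.set(renew, everyMs);
  }

  return { start, stop };
}

// In real use the timer API would wrap setInterval and clearInterval.
```

With this shape, repeated config reloads leave exactly one live interval.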
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Bump USAGE_COST_CACHE_VERSION 3->4 so a warm .usage-cost-cache.json written by a
pre-change build is rebuilt instead of serving stale complete-$0 totals after
upgrade (the new missing-cost branch otherwise only runs when a file is rescanned).
Add a regression test asserting an older-version cache is treated as stale for an
unpriced session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address review: distinguish unknown pricing from an intentional free price. A
turn's all-zero cost is treated as unknown (counted toward missingCostEntries)
only when the operator did NOT explicitly configure the model's price under
models.providers -- i.e. the zero is a generated-catalog default (codex/gpt-5.x),
not a deliberate $0. Operator-configured zero-cost models keep reporting a
complete $0.
Adds resolveConfiguredModelCost() to read config-only pricing, and regression
tests for both paths (unconfigured unknown -> missing; configured free -> $0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Only treat an unpriced (all-zero) model's turn as missing when it has no
trustworthy recorded cost (recorded cost is 0 or absent). A turn carrying a
real positive recorded cost is preserved, fixing a regression where priced
fixtures without explicit pricing config lost their recorded cost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Models that ship an all-zero cost block (e.g. codex gpt-5.5, whose Codex
backend exposes no per-token price) made usage-cost report totalCost: 0 with
missingCostEntries: 0 -- a confident, complete $0 -- so every budget/spike
safeguard keyed off totalCost was silently blind to real pay-per-token spend.
scanTranscriptFile now treats a resolved cost config with no positive per-token
rate (and no tiered pricing) as "pricing unknown": for turns that burned tokens
it drops the transport's fabricated $0 and surfaces the turn as a missing-cost
entry, mirroring the existing tiered-pricing override. Models with positive or
tiered pricing and zero-token entries are unaffected.
Verified on a real OpenClaw 2026.5.20 host (default openai/gpt-5.5, api_key):
1,780,235 tokens that previously reported missingCostEntries 0 now report 32.
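The rule in this commit can be sketched roughly as follows. Field names such as cacheRead, tiered, and recordedCost are illustrative assumptions, not the actual scanTranscriptFile types.

```typescript
interface CostConfig {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  tiered?: unknown[];
}

interface Turn {
  tokens: number;
  recordedCost: number;
}

// Pricing counts as known only with a positive per-token rate or tiered pricing.
function hasKnownPricing(cost: CostConfig): boolean {
  const rates = [cost.input, cost.output, cost.cacheRead, cost.cacheWrite];
  const hasPositiveRate = rates.some((rate) => rate > 0);
  const hasTiers = (cost.tiered?.length ?? 0) > 0;
  return hasPositiveRate || hasTiers;
}

function summarizeUsage(turns: Turn[], cost: CostConfig) {
  let totalCost = 0;
  let missingCostEntries = 0;
  for (const turn of turns) {
    if (turn.tokens > 0 && !hasKnownPricing(cost)) {
      missingCostEntries += 1; // drop the fabricated $0, surface as missing
      continue;
    }
    totalCost += turn.recordedCost;
  }
  return { totalCost, missingCostEntries };
}
```

An all-zero cost block with token-burning turns then reports missing entries instead of a confident $0 total.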
Related: #85858
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary:
- The PR updates `src/agents/identity-file.ts` to normalize backtick-wrapped IDENTITY.md labels and values, and adds parser/merge regression tests in `src/agents/identity-file.test.ts`.
- PR surface: Source +8, Tests +28. Total +36 across 2 files.
- Reproducibility: yes. source-reproducible with high confidence: current main strips `*` and `_` but not back ... e unnormalized string. I did not run tests because this review was required to keep the checkout read-only.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(agents): strip markdown code spans from IDENTITY.md values and la…
Validation:
- ClawSweeper review passed for head 30c43defd6.
- Required merge gates passed before the squash merge.
Prepared head SHA: 30c43defd6
Review: https://github.com/openclaw/openclaw/pull/86647#issuecomment-4537456646
Co-authored-by: nayrosk <105997554+nayrosk@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- The PR extracts the CJK-aware memory tokenizer into a shared helper, routes dreaming dedupe through it, preserves MMR re-exports, and adds regression coverage for CJK and empty-token cases.
- PR surface: Source +15, Tests +96. Total +111 across 5 files.
- Reproducibility: yes. Current main has an ASCII-only tokenizeSnippet path in dreaming dedupe, and the source ... ction source bytes for the CJK failure modes; I did not run tests locally because this review is read-only.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(memory-core): use Array.toSorted for #80613 lint fix
- PR branch already contained follow-up commit before automerge: fix(memory-core): preserve dedupe identity when both snippets tokeniz…
- PR branch already contained follow-up commit before automerge: fix(memory-core): rename __testing to testing in CJK regression tests…
- PR branch already contained follow-up commit before automerge: fix(memory-core): use CJK-aware tokenizer for dreaming dedupe (#80613)
Validation:
- ClawSweeper review passed for head ca9c02734c.
- Required merge gates passed before the squash merge.
Prepared head SHA: ca9c02734c
Review: https://github.com/openclaw/openclaw/pull/86645#issuecomment-4537414471
Co-authored-by: MoerAI <friendnt@g.skku.edu>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Behavior addressed: Embedded PI compaction retry now drains block replies again after the retry wait resolves, so retry-generated replies are not left behind while preserving aggregate-timeout fallback behavior.
Real environment tested: local OpenClaw focused Pi runner test shard plus contributor local live-output proof in the PR body.
Exact steps or command run after this patch: pnpm test src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts src/agents/pi-embedded-runner/run/compaction-retry-aggregate-timeout.test.ts; .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
Evidence after fix: 2 test files passed, 55 tests passed; final autoreview clean with no accepted/actionable findings.
Observed result after fix: the runner flushes before the compaction wait, waits for compaction retry, then performs a second idempotent flush when the wait resolves without timing out.
What was not tested: fresh external-channel live retry by this agent; PR retains contributor live-output proof for the delayed channel adapter path.
Thanks @spacegeologist.
Co-authored-by: zhengzuo0-ai <zheng.zuo0@gmail.com>
Behavior addressed: Telegram direct-message turns no longer drop an earlier overlapping normal reply, while authorized aborts and explicit/native/plugin/skill command turns still supersede active reply work.
Real environment tested: local OpenClaw focused Telegram test shard plus existing contributor Telegram screenshot/log proof in the PR body.
Exact steps or command run after this patch: pnpm test extensions/telegram/src/telegram-reply-fence.test.ts extensions/telegram/src/bot-message-dispatch.test.ts; .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
Evidence after fix: 2 test files passed, 93 tests passed; final autoreview clean with no accepted/actionable findings.
Observed result after fix: overlapping normal Telegram DMs use non-interrupting reply fences and both final replies remain deliverable; direct /stop, authorized built-in commands, and explicit text/native command turns still supersede.
What was not tested: fresh live Telegram Desktop rerun by this agent; PR retains contributor screenshot/log proof and the Real behavior proof bot remains red despite proof labels.
Thanks @neeravmakwana.
Co-authored-by: Neerav Makwana <261249544+neeravmakwana@users.noreply.github.com>
Keep isolated cron announce delivery owned by runner fallback while leaving agent-initiated message sends optional. `delivery.mode: none` no longer forces message delivery, announce delivery skips fallback only after a verified same-target message-tool send, and prompt allowlist checks now match runtime tool policy normalization/group expansion.
Verified with focused cron tests, `check:changed`, autoreview, and PR CI on 7ab77bad97.
Thanks @bryanpearson.
Co-authored-by: bryanpearson <bryanmpearson@gmail.com>
Fix Gemini cached-content GenerateContent payloads so cached requests no longer resend request-level systemInstruction, tools, or toolConfig.
Covers explicit cachedContent and managed cacheRetention prompt caching; fixes #84919.
Proof: Real behavior proof passed on PR head 198a42bbc6 after live Gemini repro/fix evidence was added to the PR body. Focused tests and check:changed were already green.
Thanks @neeravmakwana.
Adds regression coverage for agents.defaults.agentRuntime schema acceptance and invalid-config doctor fix reachability.
The runtime behavior fix already landed on main in 5b9be2cdb1c01a2896783c52f5f0654c5f22a249; this PR locks the expected behavior with focused tests.
Closes #72872
Precompute FIR resample kernels for common voice sample-rate conversions to avoid per-sample trigonometry while preserving output for tested ratios.
Verification: node scripts/run-vitest.mjs extensions/voice-call/src/telephony-audio.test.ts; pnpm tsgo:core; autoreview --mode commit --commit HEAD; PR CI green.
Fix isolated cron delivery so agent-default derivation keeps using the paired runtime config snapshot, preserving resolved channel credentials such as Discord SecretRefs. Fixes #86545.
Add inline comment explaining that compileSafeRegex rejects patterns
with nested repetition (ReDoS risk) and returns null. Rejected patterns
are silently skipped; the plugin will not match via that pattern but
other patterns and prefixes still apply.
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Replace raw `new RegExp(patternSource, "u")` in
`resolveModelSupportMatchKind` with the existing
`compileSafeRegex()` guard from `src/security/safe-regex.ts`.
A malicious or careless plugin manifest pattern like `(a+)+$`
causes catastrophic backtracking (ReDoS) against non-matching model
IDs. `compileSafeRegex` detects nested repetition and returns null,
which the caller now treats as a non-match (equivalent to the
previous catch-continue for invalid regex).
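To illustrate the caller-side contract only, here is a deliberately simplified guard. The nested-repetition heuristic below is an assumption and much cruder than a real detector; it is not the code in src/security/safe-regex.ts.

```typescript
// Crude heuristic: a group containing + or * that is itself followed by a quantifier.
const NESTED_REPETITION = /\([^()]*[+*][^()]*\)[+*{]/;

function compileSafeRegexSketch(source: string, flags = "u"): RegExp | null {
  if (NESTED_REPETITION.test(source)) return null; // likely ReDoS shape
  try {
    return new RegExp(source, flags);
  } catch {
    return null; // invalid syntax is treated the same as an unsafe pattern
  }
}

function matchesAnyPattern(modelId: string, patterns: string[]): boolean {
  for (const pattern of patterns) {
    const regex = compileSafeRegexSketch(pattern);
    if (regex === null) continue; // rejected pattern is a non-match, not an error
    if (regex.test(modelId)) return true;
  }
  return false;
}
```

A rejected pattern is skipped silently, so the remaining patterns in the manifest still apply.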
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Replace string containment check with direct field assertions:
- oversized.role is 'assistant'
- __openclaw.id is 'oversized-child' (exact match)
- parentId extraction proven by record inclusion in active tree
5/5 oversized transcript tests pass.
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
extractJsonStringFieldPrefix and extractJsonNullableStringFieldPrefix
interpolate the `field` parameter into `new RegExp(...)` without
escaping. All current callers pass hardcoded strings ("id",
"parentId", "type", "role"), but the function signature accepts
any string. A future caller passing a field containing regex
metacharacters (e.g. "foo.bar") would match unintended patterns.
Wrap the interpolation with escapeRegExp() from src/shared/regexp.ts
so metacharacters are treated literally.
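A small sketch of why the escaping matters. The escapeRegExp body is the commonly used metacharacter-escaping pattern, assumed here because the helper in src/shared/regexp.ts is not shown, and extractStringFieldPrefix is a simplified stand-in for the real extractors.

```typescript
function escapeRegExp(text: string): string {
  // Escape every character that has special meaning in a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function extractStringFieldPrefix(json: string, field: string): string | null {
  // The field name is escaped so "foo.bar" cannot match "fooXbar".
  const pattern = new RegExp(`"${escapeRegExp(field)}"\\s*:\\s*"([^"\\\\]*)`);
  const match = pattern.exec(json);
  return match ? match[1] : null;
}
```

Without the escape, the dot in foo.bar would match any character and the first key in the test input would win.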
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
When the gateway process is orphaned after a systemd service restart,
the parent's journal pipe closes and every write to stdout/stderr returns
EPIPE. The previous handler swallowed it with a bare return, so background
loops (config file watcher, etc.) kept firing and the process spun at
100% CPU indefinitely.
Exit cleanly with code 0 instead — a process whose own output streams
are broken has nowhere to log and no reason to keep running.
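The handler change can be sketched as a small function with the exit call injected. The name handleOutputStreamError and the rethrow for other error codes are assumptions for illustration.

```typescript
interface ErrnoLike {
  code?: string;
}

function handleOutputStreamError(error: ErrnoLike, exit: (code: number) => void): void {
  if (error.code === "EPIPE") {
    // The output pipe is gone: nothing can be logged, so stop instead of spinning.
    exit(0);
    return;
  }
  // Assumption: other stream errors are not swallowed here.
  throw error;
}

// Possible wiring in Node (not executed here):
// process.stdout.on("error", (err) => handleOutputStreamError(err, process.exit));
```

Injecting exit keeps the decision testable without terminating the test process.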
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scripts/docs-spellcheck.sh uses set -u and constructs args=( ... "${write_flag[@]}" ), where write_flag may be an empty array. On bash 3.2 (still the default /bin/bash on macOS), referencing an empty array under set -u raises an unbound variable error. Newer bash (>= 4.4) handles this expression correctly, which is why the script ships green on Linux CI runners.
Switch to the bash 3.2-safe parameter expansion ${write_flag[@]+"${write_flag[@]}"}: it expands to nothing when the array is empty and to the array contents otherwise, preserving --write behavior unchanged.
Also fixes overrideable -> overridable in docs/reference/test.md, which the now-running spellcheck surfaces.
Repro:
bash scripts/docs-spellcheck.sh # was: write_flag[@]: unbound variable, exit 1
bash scripts/docs-spellcheck.sh # now: codespell runs to completion
Summary:
- The PR replaces Feishu presentation/action card fallback rendering with a shared JSON 2.0 button/behaviors renderer, updates native card sanitization, and expands Feishu channel/outbound tests.
- PR surface: Source +118, Tests +223. Total +341 across 5 files.
- Reproducibility: yes. source-reproducible: current main renders Feishu presentation button blocks through ma ... help` fallback. I did not run local tests because this review was required to keep the checkout read-only.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(feishu): render native presentation buttons
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8601…
Validation:
- ClawSweeper review passed for head 36d6a36323.
- Required merge gates passed before the squash merge.
Prepared head SHA: 36d6a36323
Review: https://github.com/openclaw/openclaw/pull/86588#issuecomment-4536092569
Co-authored-by: NianJiuZst <3235467914@qq.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Refactor diagnostic queued/state/processed emission into a shared helper used by dispatch and isolated cron turns.
Preserve dispatch processed-event behavior, cron queue-depth symmetry, and final cron session-id adoption while adding focused helper coverage and reviewer comments for the non-obvious invariants.
Fixes Dependabot alert #118 for GHSA-q8mj-m7cp-5q26 by updating the workspace qs override from 6.14.2 to 6.15.2 and regenerating root and plugin shrinkwrap files.
Runtime surface: transitive qs consumers through Express, Slack, Feishu, Teams, ACP, and MCP paths.
Prefer the active Claude CLI OAuth auth label when the configured Anthropic model resolves through an equivalent Claude CLI runtime alias, so `/status` no longer reports an unused env API-key label.
Also adds regression coverage for both text and message status renderers, plus the maintainer changelog entry.
Closes #80184.
Co-authored-by: brokemac79 <martin_cleary@yahoo.co.uk>
Normalize Google Gemini 3.1 Flash Lite routing to the GA model id and keep the retired preview spelling as a compatibility alias. Align default alias docs, FAQ guidance, and deprecated-model manifest recommendations with the GA id.
Fixes #86151.
Co-authored-by: Sebastien Tardif <sebtardif@ncf.ca>
Address clawsweeper P2: cron isolated-agent lifecycle (message.queued,
session.state, message.processed) now mirrors the dispatch path and
respects the diagnostics.enabled master toggle. Added regression test
for the disabled-config path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(doctor): skip restart prompt when gateway is healthy after recent restart
`openclaw doctor` unconditionally prompted "Restart gateway service now?"
with default=Yes whenever the gateway was running, even if it had just
restarted via SIGUSR1 after an update. This caused restart loops on macOS
where the prompt raced with launchctl KeepAlive.
Changes:
- Probe gateway health before the restart prompt when a restart handoff
exists (deep doctor mode). If healthy, skip the prompt entirely.
- Change `initialValue` from `true` to `false` as a safety net so users
don't accidentally confirm a restart by pressing Enter.
- Update existing test that expected a single `readGatewayRestartHandoffSync`
call (now called twice: diagnostic display + health-probe check).
Fixes #86518
* fix(doctor): correct GatewayRestartHandoff mock types in tests
Add explicit literal types + satisfies constraint so the mock handoff
objects match the exact GatewayRestartHandoff type expected by the
type-check CI.
* fix(doctor): apply recent-restart skip to normal doctor flow
* test(doctor): align normal-flow handoff expectation
* chore: add doctor restart prompt changelog
---------
Co-authored-by: OpenClaw Contributor <openclaw-contributor@example.com>
Co-authored-by: liaoyl830 <267396060+liaoyl830@users.noreply.github.com>
Co-authored-by: sallyom <somalley@redhat.com>
* fix(agents): warn on Claude permission overrides under YOLO
* fix: narrow Claude audit backend guard
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(agents): answer Claude live control_request can_use_tool via exec policy
Claude CLI emits stream-json control_request frames with subtype
can_use_tool when it wants to use a native tool. The Claude live-session
bridge previously dropped these frames, leaving Claude waiting for a
control_response until the 180/600s no-output timeout fired (see #80819).
Resolve the effective OpenClaw exec policy (per-agent tools.exec -> global
tools.exec -> allowlist/on-miss defaults) once at session-start time and
thread it through fingerprinting and the session record. When a
can_use_tool request arrives:
- Allow native Bash when the resolved policy is security=full, ask=off
(matching the bypassPermissions semantics OpenClaw already documents).
- Otherwise deny with a message that names the resolved policy and
points the agent at OpenClaw MCP tools.
Unsupported control_request subtypes get a structured error response
instead of a silent no-op, and stray control_response frames are
silently dropped. Adds spawn-test coverage for both allow and deny paths.
Fixes #80819
* fix(agents): align Claude live control_request policy with backend defaults
Resolve the effective exec policy through the same defaults that
extensions/anthropic/cli-shared.ts:isOpenClawRequestedYolo and
src/agents/exec-defaults.ts:resolveExecDefaults already use (security
?? "full", ask ?? "off") instead of falling back to a hand-rolled
allowlist/on-miss default that disagreed with the rest of the codebase.
Without this, a default-config OpenClaw deployment launches Claude with
--permission-mode bypassPermissions but the bridge would still deny
Bash control_requests, re-creating the #80819 stall for the very
default-config case the issue reports.
Also thread the effective Claude permission mode into the policy
decision. Prefer the operator's explicit --permission-mode in argv,
falling back to what normalizeClaudePermissionArgs would have inserted
for an un-overridden launch. Native Bash is auto-allowed only when the
effective mode is bypassPermissions AND tools.exec resolves to
full/no-ask, so explicit raw-arg overrides like --permission-mode
default or acceptEdits broaden Claude's native prompting and are
honored by routing through deny.
Adds a no-config regression test (default deployment allows Bash, no
stall) and a permission-mode-override test (tools.exec full/off plus
explicit --permission-mode default in raw args denies). Existing
allow/deny tests continue to pass via the synthesized-mode fallback.
* fix(agents): honor effective exec policy for Claude live Bash
---------
Co-authored-by: Guillaume Thirry <g.thirry@gmail.com>
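The allow/deny rule for can_use_tool described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the bridge's code: the type names, the `decideCanUseTool` helper, the value sets for `security` and `ask`, and the hardcoded synthesized mode are all assumptions made for the example. Only the defaults (`security ?? "full"`, `ask ?? "off"`) and the "bypassPermissions AND full/no-ask" condition come from the commit text.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
type ExecSecurity = "full" | "allowlist" | "deny";
type ExecAsk = "off" | "on-miss" | "always";

interface ExecPolicy {
  security: ExecSecurity;
  ask: ExecAsk;
}

interface ToolDecision {
  behavior: "allow" | "deny";
  message?: string;
}

// Defaults taken from the commit text: security ?? "full", ask ?? "off".
function resolveExecPolicy(configured: Partial<ExecPolicy> = {}): ExecPolicy {
  return {
    security: configured.security ?? "full",
    ask: configured.ask ?? "off",
  };
}

// Prefer an explicit --permission-mode in argv, else the mode the launcher would insert.
function resolvePermissionMode(argv: readonly string[], synthesizedMode: string): string {
  const flagIndex = argv.indexOf("--permission-mode");
  const explicit = flagIndex >= 0 ? argv[flagIndex + 1] : undefined;
  return explicit ?? synthesizedMode;
}

function decideCanUseTool(
  toolName: string,
  argv: readonly string[],
  configured?: Partial<ExecPolicy>,
): ToolDecision {
  const policy = resolveExecPolicy(configured);
  // Simplification: assume an un-overridden launch synthesized bypassPermissions.
  const mode = resolvePermissionMode(argv, "bypassPermissions");
  const autoAllow =
    toolName === "Bash" &&
    mode === "bypassPermissions" &&
    policy.security === "full" &&
    policy.ask === "off";
  if (autoAllow) {
    return { behavior: "allow" };
  }
  // The denial names the resolved policy so the agent can see why it was refused.
  return {
    behavior: "deny",
    message: `Denied by exec policy (security=${policy.security}, ask=${policy.ask}, mode=${mode}); use OpenClaw MCP tools instead.`,
  };
}
```

Under these assumptions, `decideCanUseTool("Bash", [])` allows (the default-deployment case), while passing `["--permission-mode", "default"]` denies even when tools.exec resolves to full/off.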
* fix(sessions): stop doctor OOM on large session stores and reclaim stale store temps
`openclaw doctor` loaded the full sessions.json via loadSessionStore with the
default cache-write plus return clone, materializing a multi-hundred-MB
monolithic store several times and exhausting the heap (#56827). The read-only
doctor checks (state integrity, heartbeat target, codex route scan) now load
with { skipCache: true, clone: false } so the store is materialized once.
Orphaned session-store atomic-write temps were also never reclaimed: the store
write went through the generic atomic writer, staging a shared
.fs-safe-replace.<pid>.<uuid>.tmp not identifiable as a store temp. Give the
store write a store-specific tempPrefix so its temps stage as
sessions.json.<pid>.<uuid>.tmp, classify them (isSessionStoreTempArtifactName),
and reclaim stale ones via the disk-budget sweep and the unreferenced-artifact
prune on a short staleness window so in-flight temps are preserved.
Fixes #56827
* docs(changelog): note large session store doctor fix
* test(qa): preserve WhatsApp RTT source literal
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
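The temp-file reclamation described above (store-specific names plus a short staleness window) might look like the sketch below. The regex, the `looksLikeStoreTemp` and `selectStaleStoreTemps` helpers, and the window value used in the usage note are assumptions for illustration. They are not the repository's isSessionStoreTempArtifactName implementation.

```typescript
// Assumed temp-name shape from the commit text: sessions.json.<pid>.<uuid>.tmp
const STORE_TEMP_NAME = /^sessions\.json\.\d+\.[0-9a-f-]{36}\.tmp$/i;

interface DirEntry {
  name: string;
  mtimeMs: number;
}

// True only for store-specific temps, not for generic atomic-write temps.
function looksLikeStoreTemp(name: string): boolean {
  return STORE_TEMP_NAME.test(name);
}

// Pick store temps older than the staleness window; newer ones may still be in flight.
function selectStaleStoreTemps(
  entries: readonly DirEntry[],
  nowMs: number,
  staleAfterMs: number,
): string[] {
  return entries
    .filter((entry) => looksLikeStoreTemp(entry.name) && nowMs - entry.mtimeMs >= staleAfterMs)
    .map((entry) => entry.name);
}
```

With a hypothetical 60-second window, a two-minute-old `sessions.json.<pid>.<uuid>.tmp` would be selected, while a one-second-old one and any `.fs-safe-replace.*` temp would be left alone.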
Summary:
- This PR adds an Ollama Kimi-cloud visible-content sanitizer for streamed and final assistant replies, updates stream handling and regression tests, and adds a changelog entry.
- PR surface: Source +183, Tests +473, Docs +1. Total +657 across 7 files.
- Reproducibility: yes. from source and the linked report: current main appends Ollama `message.content` direc ... payload described in the issue would be shown. I did not run a live vendor repro in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(ollama): sanitize kimi inline reasoning in stream events
- PR branch already contained follow-up commit before automerge: fix(ollama): buffer kimi cloud stream reasoning
- PR branch already contained follow-up commit before automerge: fix(ollama): cover kimi inline boundary variants
- PR branch already contained follow-up commit before automerge: fix(ollama): preserve text start partial state
- PR branch already contained follow-up commit before automerge: fix(ollama): bound kimi stream sanitizer hold
- PR branch already contained follow-up commit before automerge: fix(ollama): keep kimi sanitizer deltas append-only
Validation:
- ClawSweeper review passed for head b709229157.
- Required merge gates passed before the squash merge.
Prepared head SHA: b709229157
Review: https://github.com/openclaw/openclaw/pull/86515#issuecomment-4534945393
Co-authored-by: Jason O'Neal <jason.allen.oneal@gmail.com>
Co-authored-by: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
Summary:
- This PR changes the shared block reply coalescer/pipeline so compatible buffered visible text is merged into a following media payload, adds focused regression tests, and records a Discord changelog fix.
- PR surface: Source +50, Tests +175, Docs +1. Total +226 across 6 files.
- Reproducibility: yes. Current main has a clear source reproduction path: media enqueue forces a text flush and then sends the media payload separately, and the PR adds focused tests for the corrected merge path.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: route streamed media through reply coalescer
- PR branch already contained follow-up commit before automerge: fix(discord): merge media captions into one message
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8648…
Validation:
- ClawSweeper review passed for head ceafbeaf3c.
- Required merge gates passed before the squash merge.
Prepared head SHA: ceafbeaf3c
Review: https://github.com/openclaw/openclaw/pull/86487#issuecomment-4534402219
Co-authored-by: Neerav Makwana <261249544+neeravmakwana@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
* fix(memory): prevent silent vector index degradation when embedding provider temporarily unavailable
Two related bugs cause complete loss of semantic vector data:
1. Promise cache deadlock in ensureProviderInitialized():
When the embedding provider (e.g. local MLX server on port 8123) is
temporarily unreachable at Gateway startup, loadProviderResult() throws
and providerInitPromise becomes a permanently cached rejected Promise.
The block only clears it on success (providerInitialized=true),
so the stale rejection blocks all future init attempts until Gateway restart.
2. Silent fts-only overwrite in runSync():
With the provider stuck at null, shouldRunFullMemoryReindex() compares
the stored meta.model (e.g. 'jina-embeddings-v5-text-small') against the
runtime provider model, and since provider is null, falls through to the
'meta.model !== fts-only' check — returning true. This triggers a full
reindex where every file is written as fts-only, silently erasing all
existing 11k+ semantic vectors.
Fix 1: Clear providerInitPromise in the catch block so the next call can
retry initialization (self-healing when the provider comes back online).
Fix 2: Guard runSync() — if requestedProvider is set and not 'none', but
the runtime provider is null, throw an error instead of silently degrading
to fts-only. This protects existing vector data by failing loudly.
Tested on production: 11,715 chunks + 1024-dim vectors fully preserved
after Gateway restart with the fix applied. The guard correctly blocks
sync when MLX is offline and allows normal operation when it recovers.
* fix: use this.settings.provider instead of private requestedProvider
The guard clause in runSync() was referencing this.requestedProvider
which is a private property on the MemoryIndexManager subclass and not
accessible from MemoryManagerSyncOps. Use this.settings.provider
instead, which is the same value and is accessible via the protected
abstract settings property.
* fix(memory): narrow degradation guard to only protect existing semantic indexes
The previous guard was too broad — it blocked sync for ALL non-none
provider configurations when provider was null, including the default
'auto' path where users without embedding credentials legitimately
build FTS-only indexes.
Narrow the guard to only abort when:
1. provider is null (embedding unavailable)
2. existing index metadata has a semantic model (not 'fts-only')
3. settings.provider is configured and not 'none'
This preserves the legitimate FTS-only fallback for auto/no-provider
users while still protecting existing semantic vector indexes from
silent degradation.
Reported-by: ClawSweeper (PR #85704 review)
* test: cover memory semantic index outage guard
* fix: protect semantic memory index fallback paths
* test: update memory sync harnesses
---------
Co-authored-by: Bo Yan <yaaboo-gif@users.noreply.github.com>
Co-authored-by: Yan Bo <yanbo@Mac.lan>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
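The narrowed degradation guard from the memory commits above reduces to a three-way conjunction. A minimal sketch follows; `shouldAbortSync` and the `SyncGuardInput` shape are hypothetical names chosen for the example, and only the three conditions themselves are taken from the commit text.

```typescript
interface SyncGuardInput {
  // Whether the runtime embedding provider initialized successfully.
  providerAvailable: boolean;
  // Model recorded in existing index metadata (meta.model), if an index exists.
  indexedModel: string | undefined;
  // Configured provider (settings.provider), if any.
  configuredProvider: string | undefined;
}

// Abort only when an existing semantic index would be overwritten as fts-only.
function shouldAbortSync(input: SyncGuardInput): boolean {
  const hasSemanticIndex =
    input.indexedModel !== undefined && input.indexedModel !== "fts-only";
  const providerConfigured =
    input.configuredProvider !== undefined && input.configuredProvider !== "none";
  return !input.providerAvailable && hasSemanticIndex && providerConfigured;
}
```

Under this reading, an `auto` user with no prior semantic index still falls through to the FTS-only build, while an outage with a stored semantic model such as 'jina-embeddings-v5-text-small' fails loudly instead of reindexing.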
Summary:
- The branch replaces QQBot's hardcoded outbound response watchdog with a resolver based on existing agent/provider `timeoutSeconds` settings, adds regression tests, and updates the changelog.
- PR surface: Source +113, Tests +116, Docs +1. Total +230 across 5 files.
- Reproducibility: yes. at source level: current main and the latest release use a hardcoded 300000 ms QQBot o ... s an 1800s provider timeout. I did not run the reporter's live QQBot/Ollama setup in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test(qqbot): cover slow provider response watchdog
- PR branch already contained follow-up commit before automerge: fix(qqbot): derive outbound watchdog from configured timeouts (#85267)
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8527…
Validation:
- ClawSweeper review passed for head 7bd829292a.
- Required merge gates passed before the squash merge.
Prepared head SHA: 7bd829292a
Review: https://github.com/openclaw/openclaw/pull/86500#issuecomment-4534669816
Co-authored-by: SymbolStar <symbolstar@users.noreply.github.com>
Co-authored-by: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
Summary:
- Adds a scoped ModelStudio/DashScope OpenAI-compatible guard for chat payloads with no non-empty user or assi ... turn, shared turn-detection helper coverage, prompt-skip handling, regression tests, and a changelog entry.
- PR surface: Source +83, Tests +298, Docs +1. Total +382 across 10 files.
- Reproducibility: yes. source-reproducible for the OpenClaw-side malformed payload shape: current main has no ... he exact qwen-long/qwen3-coder-plus provider error was not reproduced with the available DashScope account.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix: make OpenAI payload guard content-aware
- PR branch already contained follow-up commit before automerge: fix: scope openai payload turn guard
- PR branch already contained follow-up commit before automerge: Guard OpenAI chat payload turns
Validation:
- ClawSweeper review passed for head e16a3fe9f2.
- Required merge gates passed before the squash merge.
Prepared head SHA: e16a3fe9f2
Review: https://github.com/openclaw/openclaw/pull/86497#issuecomment-4534668405
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
Reverts the diagnostic queue-pressure suppression of non-terminal session tool mirrors from PR 84846 while keeping PR 86503 recipient dedupe intact. Session-only Control UI subscribers keep receiving tool lifecycle mirrors; overlapping run and session subscribers still receive one canonical run-scoped frame. Verification: focused gateway and diagnostic tests, diff check, changed check, and autoreview all passed.
* fix(agents): release embedded-attempt session lock on every exit path
The embedded run controller acquires its session write lock eagerly at
creation but previously released it only inside the post-run cleanup block. An
exception thrown in post-prompt processing skipped that block, so the lock
leaked to the live gateway process until the watchdog reclaimed it and
later requests to the session failed with SessionWriteLockTimeoutError.
Add an idempotent dispose() to the lock controller and call it from the
run's outer finally so the eagerly-held lock is released on every exit
path. Normal/aborted/timed-out runs still hand the lock to
acquireForCleanup first, so dispose() is a no-op then (no double release).
Fixes #86014
* fix: keep session lock teardown comment lean
* docs(changelog): note embedded session lock fix
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
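The idempotent dispose() pattern from the session-lock commit above can be illustrated with a small sketch. The class and function names here are hypothetical and the hand-off is simplified to a synchronous callback. It is meant only to show why calling dispose() from an outer finally releases the lock exactly once on every path.

```typescript
class SessionLockController {
  private held = true; // the lock is acquired eagerly at creation
  private readonly releaseLock: () => void;

  constructor(releaseLock: () => void) {
    this.releaseLock = releaseLock;
  }

  // Hand the lock to the cleanup path; the caller now owns the release.
  acquireForCleanup(): () => void {
    if (!this.held) {
      throw new Error("lock already released");
    }
    this.held = false;
    return this.releaseLock;
  }

  // Safe to call any number of times; a no-op once the lock was handed off or released.
  dispose(): void {
    if (!this.held) {
      return;
    }
    this.held = false;
    this.releaseLock();
  }
}

function runAttempt(controller: SessionLockController, body: () => void): void {
  try {
    body(); // may throw during post-prompt processing
    const release = controller.acquireForCleanup();
    release(); // normal path: cleanup releases the lock
  } finally {
    controller.dispose(); // exception path: releases the still-held lock
  }
}
```

In both the normal and the throwing case the release callback runs once, which is the "no double release" property the commit describes.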
Dedupe gateway tool-event fanout so connections subscribed by both run and session receive the canonical run-scoped agent event only, while session-only subscribers keep the compatibility session.tool mirror.
Verification:
- node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts
- git diff --check
- env -u OPENCLAW_TESTBOX pnpm check:changed
- .agents/skills/autoreview/scripts/autoreview --mode local
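The recipient dedupe in the fanout commit above can be sketched as a per-connection choice between the run-scoped frame and the session mirror. The `Connection` shape and `fanOutToolEvent` helper are assumptions for illustration; only the rule "run subscribers get the agent event, session-only subscribers get session.tool" comes from the commit text.

```typescript
interface Connection {
  id: string;
  runIds: Set<string>;
  sessionKeys: Set<string>;
}

interface Frame {
  connectionId: string;
  event: "agent" | "session.tool";
}

// Emit at most one frame per connection for a single tool event.
function fanOutToolEvent(
  connections: readonly Connection[],
  runId: string,
  sessionKey: string,
): Frame[] {
  const frames: Frame[] = [];
  for (const connection of connections) {
    if (connection.runIds.has(runId)) {
      // Run subscribers (including overlapping ones) get the canonical frame only.
      frames.push({ connectionId: connection.id, event: "agent" });
    } else if (connection.sessionKeys.has(sessionKey)) {
      // Session-only subscribers keep the compatibility mirror.
      frames.push({ connectionId: connection.id, event: "session.tool" });
    }
  }
  return frames;
}
```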
Summary:
- The PR expands security audit, CLI docs, and tests so `hooks.token` reuse of active Gateway token/password auth is reported while password-mode Gateway startup remains compatible.
- PR surface: Source +178, Tests +311, Docs +14. Total +503 across 14 files.
- Reproducibility: yes. from source inspection: current main forwards a bearer token as both token and passwor ... ecause this review was read-only, but the linked issue and code path make the reproduction high confidence.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(cr-fmi-hook-ingress-token-unlocks-password-mode-gateway-auth): ap…
- PR branch already contained follow-up commit before automerge: fix: include trusted proxy password in hooks token reuse check
- PR branch already contained follow-up commit before automerge: fix(gateway): audit hooks password reuse without blocking startup
- PR branch already contained follow-up commit before automerge: fix: Hook ingress token unlocks password-mode gateway auth
Validation:
- ClawSweeper review passed for head 7c796b22ec.
- Required merge gates passed before the squash merge.
Prepared head SHA: 7c796b22ec
Review: https://github.com/openclaw/openclaw/pull/86453#issuecomment-4533831028
Co-authored-by: Coy Geek <65363919+coygeek@users.noreply.github.com>
Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: jesse-merhi
* fix(diagnostics): reclaim wedged session lanes with a stale leaked active run
A group session lane could wedge permanently (#85639): an embedded run that dies
abnormally leaves a stale ACTIVE_EMBEDDED_RUNS handle, so the diagnostic heartbeat
classifies the lane stale_session_state (recoveryEligible without allowActiveAbort)
while stuck-session recovery reads the leaked isEmbeddedPiRunActive flag and skips
with active_reply_work, a tautology that keeps the lane wedged forever. The age-based
escape never fires because ageMs (last-activity) resets on every incoming queued
message.
Make the active-run skip a liveness check: before keeping the lane, consult the
run's real forward-progress age (lastProgressAgeMs, not refreshed by incoming
messages). If a run flagged active has made no forward progress past the resolved
diagnostics.stuckSessionAbortMs threshold (threaded through the recovery request;
falls back to a 5-minute floor) with queued work waiting, treat it as a
leaked/dead handle and reclaim it (abort + drain + force-clear) instead of
skipping. A genuinely progressing run, or one within an operator-raised
threshold, is kept.
Fixes #85639
* test(diagnostics): cover stale active run recovery
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
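The liveness check in the diagnostics commit above can be read as the decision function below. This is a sketch under stated assumptions: the `LaneSnapshot` shape, the decision labels, and the reading of the "5-minute floor" as a fallback used when no threshold is configured are all hypothetical.

```typescript
// Assumed fallback when diagnostics.stuckSessionAbortMs is not resolved.
const FALLBACK_ABORT_MS = 5 * 60_000;

interface LaneSnapshot {
  runFlaggedActive: boolean;
  // Age of the run's last forward progress; not refreshed by incoming messages.
  lastProgressAgeMs: number;
  queuedMessages: number;
}

type RecoveryDecision = "recover" | "keep_active_run" | "reclaim_stale_run";

function decideLaneRecovery(
  lane: LaneSnapshot,
  stuckSessionAbortMs?: number,
): RecoveryDecision {
  if (!lane.runFlaggedActive) {
    return "recover";
  }
  const threshold = stuckSessionAbortMs ?? FALLBACK_ABORT_MS;
  const stalled = lane.lastProgressAgeMs > threshold;
  // Reclaim only a stalled run with work waiting; otherwise trust the active flag.
  return stalled && lane.queuedMessages > 0 ? "reclaim_stale_run" : "keep_active_run";
}
```

A run that last progressed ten minutes ago with queued work is reclaimed under the fallback, but kept if the operator raised the threshold to thirty minutes.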
Summary:
- The PR adds HEIC/HEIF-to-JPEG normalization before media-understanding image description providers run, with regression tests and a changelog entry.
- PR surface: Source +58, Tests +82, Docs +1. Total +141 across 6 files.
- Reproducibility: yes. at source level: current main forwards HEIC buffers to `describeImage` without normali ... ody includes a red HEIC regression test before the patch. I did not execute tests in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(media-understanding): normalize HEIC before image descriptions
Validation:
- ClawSweeper review passed for head ed34620bd7.
- Required merge gates passed before the squash merge.
Prepared head SHA: ed34620bd7
Review: https://github.com/openclaw/openclaw/pull/86037#issuecomment-4528578874
Co-authored-by: luoyanglang <hanwanlonga@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
- Rotate OpenAI Realtime voice sessions on provider max-duration events without surfacing the expected expiry as a Discord voice error.
- Add lifecycle logging for Realtime rotation/reconnect and regression coverage for max-duration reconnect.
- Allowlist the existing Control UI chunking helper for the optional Knip unused-file guard so the dependency shard stays green on the current base.
Catch non-ENOENT load failures inside maybeRepairLegacyCronStore so an
unreadable ~/.openclaw/cron/jobs.json (e.g. root-owned 0600 inside
Docker) no longer aborts the rest of the doctor health checks. The
scheduler-side loadCronStore keeps its strict throw-on-read-failure
contract.
Closes #86102
Co-authored-by: 1052326311 <1052326311@users.noreply.github.com>
After config.patch writes new values to openclaw.json, a subsequent
SIGUSR1 in-process restart could overwrite them with a stale snapshot.
Root cause: run-loop's onIteration hook resets lanes and task registry,
but leaves the runtimeConfigSnapshot intact. loadConfig() then returns
the old snapshot via loadPinnedRuntimeConfig() instead of re-reading disk.
Fix: clearRuntimeConfigSnapshot() in the restart iteration hook so the
next startup reads fresh config from disk.
Refs #86350
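The stale-snapshot behavior described above can be reproduced in miniature. The `RuntimeConfig` class below is a hypothetical stand-in for the pinned runtime snapshot, not the real loadConfig() path. It shows why a restart hook that leaves the snapshot in place keeps serving old values, and why clearing it forces a fresh read.

```typescript
interface Config {
  model: string;
}

class RuntimeConfig {
  private snapshot: Config | undefined;
  private readonly readFromDisk: () => Config;

  constructor(readFromDisk: () => Config) {
    this.readFromDisk = readFromDisk;
  }

  // Returns the pinned snapshot when present, otherwise reads from disk and pins it.
  load(): Config {
    if (this.snapshot === undefined) {
      this.snapshot = this.readFromDisk();
    }
    return this.snapshot;
  }

  // What the restart iteration hook needs to call before the next startup.
  clearSnapshot(): void {
    this.snapshot = undefined;
  }
}
```

If the on-disk value changes after the first `load()`, later calls keep returning the old value until `clearSnapshot()` runs.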
Summary:
- The PR routes local GGUF memory embeddings through a bundled worker sidecar, adds structured degradation and fallback handling, updates memory tests/build output, and keeps the local config contract unchanged.
- PR surface: Source +831, Tests +503, Docs +1, Other +2. Total +1337 across 23 files.
- Reproducibility: Do we have a high-confidence way to reproduce the issue? Source and report evidence are str ... cludes native crash logs; the exact Metal teardown abort was not reproduced in this review or the PR proof.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(memory): keep local embedding config unchanged
- PR branch already contained follow-up commit before automerge: fix(memory): type local embedding degradation
- PR branch already contained follow-up commit before automerge: fix(memory): refresh keywords after embedding fallback
- PR branch already contained follow-up commit before automerge: fix(memory): keep worker errors internal
- PR branch already contained follow-up commit before automerge: test: satisfy memory provider lifecycle harnesses
- PR branch already contained follow-up commit before automerge: fix: harden local embedding worker fallback
Validation:
- ClawSweeper review passed for head 1d1fe41c4e.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1d1fe41c4e
Review: https://github.com/openclaw/openclaw/pull/85348#issuecomment-4518516047
Co-authored-by: Onur Solmaz <onur@Onurs-MacBook-Pro.local>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
* fix(memory-core): filter REM dreaming candidates to light-staged entries
REM dreaming re-ingested the full short-term recall store independently,
ignoring which entries were staged by the light sleep phase. Because the
confidence formula heavily weights accumulated averageScore (45%) and
recallStrength (25%), old high-recall entries permanently dominated
freshly staged candidates. The intended light→REM→deep pipeline was
broken: light correctly staged current material, but REM selected a
different set entirely, so lightHits never paired with remHits for deep
ranking.
Fix: in runRemDreaming(), read the phase-signals store for keys with
lightHits > 0 and filter entries to that set before passing to
previewRemDreaming(). When no light-staged keys exist (light disabled
or first run), fall back to the full entry set for backward
compatibility.
Added readLightStagedKeys() to short-term-promotion.ts as a clean
export for reading the light-staged key set from the phase signal store.
Closes #86249
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(memory-core): keep REM staging pending
* fix(memory-core): mark REM-considered staged entries
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
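The light-staged filter from the REM commit above, including its backward-compatible fallback, can be sketched as follows. The `PhaseSignal` and `RecallEntry` shapes and both helper names are assumptions for the example. They are not the readLightStagedKeys() export itself.

```typescript
interface PhaseSignal {
  lightHits: number;
  remHits: number;
}

interface RecallEntry {
  key: string;
  averageScore: number;
}

// Collect keys the light phase has staged (lightHits > 0).
function lightStagedKeys(signals: Readonly<Record<string, PhaseSignal>>): Set<string> {
  const keys = new Set<string>();
  for (const [key, signal] of Object.entries(signals)) {
    if (signal.lightHits > 0) {
      keys.add(key);
    }
  }
  return keys;
}

// Restrict REM candidates to staged keys; with nothing staged, keep the full set.
function selectRemCandidates(
  entries: readonly RecallEntry[],
  staged: ReadonlySet<string>,
): RecallEntry[] {
  if (staged.size === 0) {
    return [...entries];
  }
  return entries.filter((entry) => staged.has(entry.key));
}
```

With this filter in place, an old entry with a high averageScore no longer crowds out a freshly staged one, because it never reaches the REM ranking unless light staged it.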
* fix(telegram): propagate forum topic names into agent context
The topic-name-cache already tracks forum topic names via
forum_topic_created/edited/closed events in bot-message-context, but
this metadata was not surfaced in two key paths:
1. The native-command handler (bot-native-commands.ts) builds the agent
context payload with IsForum but never looked up the cached topic
name. Now it resolves the topic name from the cache and includes
TopicName in the context, giving agents awareness of which forum
topic they are responding in.
2. The action runtime (action-runtime.ts) executes createForumTopic and
editForumTopic actions but never persisted the resulting topic
metadata back to the cache. Now both actions write the topic name
(and optional icon metadata) to the cache after success, ensuring
subsequent messages in those topics can resolve the name.
Closes #86024
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(telegram): scope forum topic cache updates
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Move the plain-text tool-call promotion wrapper out of the public provider stream SDK helper and into a private local-only bundled-provider runtime seam.
Replays #84749 because the contributor fork branch became conflicted and was no longer maintainer-writable.
Co-authored-by: TeodoroRodrigo <rodrigoteodoro.90@gmail.com>
* fix(scripts): include ui:build in build-all full and ciArtifacts profiles
Closes #85206.
scripts/build-all.mjs only ran ui:build via a separate `pnpm ui:build`
command. Because `pnpm build` invokes tsdown which removes `dist/`,
a backend rebuild silently deletes any previously generated
dist/control-ui assets, leaving the gateway to serve the
"Control UI assets not found" message at startup. Documentation and
startup auto-repair masked the bug at the worst possible time
(LaunchAgent readiness / remote recovery) instead of guaranteeing the
build artifact contract.
This change adds ui:build as a build-all step after
copy-export-html-templates and before write-build-info, and includes
it in the full and ciArtifacts profiles. Minimal backend dev profiles
(gatewayWatch, cliStartup) keep their existing fast-loop step lists
and do not run ui:build.
Regression coverage:
- ciArtifacts step list assertion updated to match the new ordering.
- Three new resolveBuildAllSteps assertions: ui:build is in full and
ciArtifacts and runs after tsdown/runtime-postbuild-stamp and before
write-build-info; ui:build is excluded from gatewayWatch/cliStartup;
ui:build cache outputs declare dist/control-ui.
* fix(scripts): leave ui:build uncached so dist/control-ui never restores stale build IDs
ClawSweeper review on #86010 flagged that the original ui:build cache only
hashed ui/, scripts/ui.js, and scripts/lib/copy-assets.ts, but
ui/vite.config.ts also reads package.json plus git HEAD and the
OPENCLAW_CONTROL_UI_BUILD_ID/OPENCLAW_VERSION env vars to embed a build ID
into the app and service worker. A file-input cache signature cannot
exactly invalidate those metadata sources, so a warm build-all hit could
restore a previously generated dist/control-ui after tsdown clears dist
and ship stale service-worker/app cache metadata.
Leaving the step uncached keeps the contract simple: every pnpm build
re-runs Vite, which is fast for the Control UI bundle and matches the
existing behavior of every other un-cached build-all step. Backend-only
profiles (gatewayWatch, cliStartup) are still unchanged.
Tests:
- Updated the ui:build cache assertion to require step.cache to be
undefined and explain the metadata-input reason.
- Existing presence/order/exclusion assertions for ui:build are unchanged
and still cover the full and ciArtifacts profile contract.
* fix(scripts): keep ui build fallback pnpm-free
---------
Co-authored-by: 1052326311 <1052326311@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
enqueueSession injects sessionQueuePriority into its enqueue opts so
user-facing work (trigger=user/manual → foreground) jumps ahead of
background work (trigger=cron/heartbeat/memory/overflow → background)
in the session lane.
enqueueGlobal was passing opts through unchanged, so priority resolved
to "normal" for both lanes. Since the heavy embeddedRun body
(workspace-sandbox, core-plugin-tools, bootstrap-context, bundle-tools,
system-prompt, session-resource-loader, agent-session, stream-setup)
runs inside enqueueGlobal, the global-lane queue was effectively FIFO
between user chat and cron — defeating the priority intent on the path
where it matters most.
Inject sessionQueuePriority into enqueueGlobal the same way it's
injected into enqueueSession.
Observed in production: a 3m48s user chat on a hibernation-wake
storm at 2026-05-24T04:19:09Z, where 11 overdue cron jobs + 16
overdue agent heartbeats entered the global lane simultaneously
on hibernation resume. The chat enqueued with trigger=user landed
at the back of a 27-entry FIFO queue at priority 0 instead of
preempting at priority 1 (foreground). 62 s of the 228 s wall-clock
was waiting in that queue.
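The queue-position effect described above can be reproduced with a toy lane. Everything named here (`Lane`, `queuePriority`, `simulateChatPosition`) is hypothetical, and the numeric priorities are assumptions beyond the two values the commit mentions (1 for foreground, 0 for the unprioritized path). The job counts (11 cron, 16 heartbeat) are the ones reported in the incident.

```typescript
type Trigger = "user" | "manual" | "cron" | "heartbeat" | "memory" | "overflow";

// Assumed mapping: foreground triggers get 1, everything else 0.
function queuePriority(trigger: Trigger): number {
  return trigger === "user" || trigger === "manual" ? 1 : 0;
}

interface QueuedJob {
  id: string;
  priority: number;
}

class Lane {
  private readonly jobs: QueuedJob[] = [];

  // Insert ahead of the first lower-priority job; FIFO among equal priorities.
  enqueue(job: QueuedJob): void {
    const index = this.jobs.findIndex((queued) => queued.priority < job.priority);
    if (index === -1) {
      this.jobs.push(job);
    } else {
      this.jobs.splice(index, 0, job);
    }
  }

  positionOf(id: string): number {
    return this.jobs.findIndex((job) => job.id === id);
  }
}

// Returns how many jobs sit ahead of the user chat after the wake storm.
function simulateChatPosition(injectPriority: boolean): number {
  const lane = new Lane();
  const background: Trigger[] = [
    ...Array<Trigger>(11).fill("cron"),
    ...Array<Trigger>(16).fill("heartbeat"),
  ];
  background.forEach((trigger, i) => {
    lane.enqueue({
      id: `${trigger}-${i}`,
      priority: injectPriority ? queuePriority(trigger) : 0,
    });
  });
  lane.enqueue({ id: "chat", priority: injectPriority ? queuePriority("user") : 0 });
  return lane.positionOf("chat");
}
```

Without priority injection the chat sits behind all 27 background entries (11 + 16). With it, the chat moves to the front of the lane.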
Summary:
- This PR forwards Codex app-server source reply delivery mode into active run handling, adds a focused regression test, and adds a changelog entry.
- PR surface: Source +1, Tests +38, Docs +1. Total +40 across 3 files.
- Reproducibility: yes. Source inspection shows the shared active-run queue rejects `message_tool_only` replies when the active handle lacks that mode, and current main's Codex app-server handle omits it.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(codex): preserve source reply mode for active runs
Validation:
- ClawSweeper review passed for head d8fac59d8f.
- Required merge gates passed before the squash merge.
Prepared head SHA: d8fac59d8f
Review: https://github.com/openclaw/openclaw/pull/86325#issuecomment-4531516197
Co-authored-by: Fermin Quant <ferminquant@hotmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- The PR adds a commitments-store writer helper, wraps load-modify-save mutators and expiry cleanup with a per-path queue plus `withFileLock`, adds three concurrency regressions, and updates the changelog.
- PR surface: Source +153, Tests +61, Docs +1. Total +215 across 4 files.
- Reproducibility: yes. Source inspection on current main shows the unqueued load-modify-save mutation path, a ... inked proof log shows the Promise.all repro changing from 20/20 lost writes before the patch to 0/20 after.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(commitments): serialize load-modify-save with in-process queue + …
Validation:
- ClawSweeper review passed for head a349f41ccf.
- Required merge gates passed before the squash merge.
Prepared head SHA: a349f41ccf
Review: https://github.com/openclaw/openclaw/pull/86326#issuecomment-4531553610
Co-authored-by: ai-hpc <mail.speedy.hpc@hotmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Route normal [telegram][diag] polling diagnostics through runtime.log while keeping non-diag Telegram warnings/errors and offset persistence failures on runtime.error.
Verification:
- node scripts/run-vitest.mjs extensions/telegram/src/monitor.test.ts (34 passed)
- git diff --check
- CI run 26378692736 passed on 979c6f31a4
Fixes #82957
Repair explicit anchorless iMessage watch payloads by GUID before debounce/routing, and drop unrecoverable payloads fail-closed instead of routing them as sender DMs.
Closes #84470.
Refs #84503.
Thanks @zhangguiping-xydt and @zqchris.
Fix Google Vertex production ADC mode support by routing explicit google-vertex models to the Vertex transport and relying on google-auth-library for request-time ADC resolution.
Verification:
- pnpm install --frozen-lockfile
- pnpm test extensions/google/transport-stream.test.ts extensions/google/index.test.ts src/config/zod-schema.models.test.ts src/agents/pi-embedded-runner/model.inline-provider.test.ts -- --reporter=verbose
- pnpm check:changed
- GitHub PR checks green on c4b7cad4df
- Live ADC smoke reached Google Vertex auth/transport and failed only because the configured redacted project has the Vertex AI API disabled
Co-authored-by: Damian Finol <damian@felixpago.com>
* fix: clean up browser MCP subprocess tree
* fix: clean up windows browser mcp tree before close
* fix(browser): repair chrome mcp cleanup rebase
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(compaction): preserve partial summary on mid-chain chunk failure
When summarizing multiple chunks, if a chunk fails after at least one
chunk has already succeeded, return the partial summary instead of
propagating the error and losing all summarization progress.
Abort and timeout errors still propagate immediately. First-chunk
failures still rethrow so the existing fallback path runs.
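The control flow can be sketched roughly as follows. This is a simplified, synchronous illustration: `summarizeChunks`, the `summarizeChunk` callback, and the error-name check for abort/timeout are all hypothetical stand-ins, and the real summarizer is presumably asynchronous.

```typescript
// Assumption: abort and timeout errors are recognizable by their name.
function isAbortOrTimeout(err: unknown): boolean {
  return err instanceof Error && (err.name === "AbortError" || err.name === "TimeoutError");
}

function summarizeChunks(
  chunks: string[],
  summarizeChunk: (chunk: string, previous: string | undefined) => string,
): string | undefined {
  let summary: string | undefined;
  for (const chunk of chunks) {
    try {
      summary = summarizeChunk(chunk, summary);
    } catch (err) {
      // Abort and timeout always propagate immediately.
      if (isAbortOrTimeout(err)) throw err;
      // First-chunk failure: nothing to preserve, so rethrow for the fallback path.
      if (summary === undefined) throw err;
      // Mid-chain failure: keep the progress made so far.
      return summary;
    }
  }
  return summary;
}
```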
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(compaction): use content array for assistant messages to match updated AgentMessage type
* fix(compaction): use as-unknown-as-AgentMessage cast for assistant test fixtures
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
maybeRecoverSuspiciousConfigRead unconditionally recorded
lastObservedSuspiciousSignature in health state even when
restoredFromBackup was false (copyFile failed). The guard at
resolveConfigReadRecoveryContext then prevented the same
signature from ever being retried, permanently accepting the
suspicious config on every subsequent launch.
Only record the dedup signature when the backup restore
actually succeeded.
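A minimal sketch of the guard, assuming a synchronous `restoreBackup` callback in place of the real copyFile call. `recoverSuspiciousConfig` and `HealthState` are illustrative names; only `lastObservedSuspiciousSignature` and `restoredFromBackup` come from the description.

```typescript
interface HealthState {
  lastObservedSuspiciousSignature?: string;
}

function recoverSuspiciousConfig(
  signature: string,
  health: HealthState,
  restoreBackup: () => void, // hypothetical stand-in for the backup copy
): boolean {
  // Dedup guard: a signature that was already handled is not retried.
  if (health.lastObservedSuspiciousSignature === signature) return false;

  let restoredFromBackup = false;
  try {
    restoreBackup();
    restoredFromBackup = true;
  } catch {
    restoredFromBackup = false;
  }

  // Record the signature only after a successful restore,
  // so a failed restore is retried on the next launch.
  if (restoredFromBackup) health.lastObservedSuspiciousSignature = signature;
  return restoredFromBackup;
}
```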
* fix: avoid false telegram pairing prompts
* docs: add telegram pairing changelog
* refactor(telegram): share pairing-store gating and align isGroup check
Extract loadTelegramPairingStoreIfNeeded so the text-fragment flush path
and resolveTelegramGroupAllowFromContext share one implementation, and
align the isGroup derivation in the flush path with the
'group || supergroup' form used elsewhere in bot-handlers.runtime.ts.
Note on transient-vs-known errors: readChannelAllowFromStore already
translates missing-file (ENOENT) and JSON parse failures to an empty
allowlist internally, so the only errors that escape into the new
silent-drop path are unexpected I/O failures (EMFILE/EACCES/EIO/...) —
unpaired senders still get a pairing challenge as expected.
* fix(telegram): skip pairing-store read when commands.allowFrom already authorizes the sender
Native command auth resolves group/dm allow context (which may read the
pairing store) before checking commands.allowFrom. On DMs with
dmPolicy: "pairing", a transient pairing-store I/O failure was therefore
dropping commands from senders explicitly authorized by
commands.allowFrom.telegram.
Add a skipPairingStoreRead hint on resolveTelegramGroupAllowFromContext /
loadTelegramPairingStoreIfNeeded, precompute the command authorization
once at chat scope before the context call, and pass the hint when that
pre-check already authorizes the sender. The post-context command auth
check still owns the topic-scoped decision.
Regression covers a DM /status from a sender allowed by
commands.allowFrom.telegram with dmPolicy: "pairing" and a rejecting
readChannelAllowFromStore mock.
* fix(telegram): satisfy test-types on harness readChannelAllowFromStore
CI check-test-types failed because the harness now stores a loose
AnyAsyncMock for readChannelAllowFromStore but TelegramNativeCommandDeps
requires the precise typeof readChannelAllowFromStore signature. Cast at
the telegramDeps assignment so harness callers can keep passing any
vi.fn(...) (including ones that reject) without type pollution at the
call site.
* feat(telegram): reply with a retry hint when pairing-store read fails transiently
Wrap unexpected pairing-store I/O errors (EACCES, EMFILE, ...) in a
typed TelegramPairingStoreReadError and surface them through
handleInboundMessageLike with a friendly "please try again" reply that
matches the media-failure precedent at bot-handlers.runtime.ts:1893.
Beats silent drop: paired senders see why their message wasn't
processed, and unpaired senders who happen to send a DM during a
transient store outage retry naturally and get the correct pairing
prompt once the store recovers.
Verified live against @paxicoto_bot with chmod 000 on
~/.openclaw/credentials/telegram-default-allowFrom.json after touching
mtime to bypass the stat-pinned cache.
Summary:
- The PR updates the Unix installers to avoid emitting npm `--before` when raw npm config contains `min-releas ... records a changelog fix, and widens an internal model-catalog test helper type to accept sync auth checks.
- PR surface: Source +1, Tests +421, Docs +1, Other +150. Total +573 across 7 files.
- Reproducibility: yes. The linked report at https://github.com/openclaw/openclaw/issues/84743 gives an isolat ... exclusivity, and current main still has the source path that can generate the conflicting `--before` flag.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(installer): avoid before with npm release-age configs
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8549…
Validation:
- ClawSweeper review passed for head fb0762f468.
- Required merge gates passed before the squash merge.
Prepared head SHA: fb0762f468
Review: https://github.com/openclaw/openclaw/pull/85491#issuecomment-4522229812
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
* perf(plugins): thread metadata snapshot and discovery through hot paths
With the snapshot memo now actually hitting, route the snapshot's
manifestRegistry and discovery through the helper chains that already
had fast paths for them. Eliminates redundant per-call rebuilds at
two big amplifiers.
- Provider resolve paths (resolvePluginProviders /
isPluginProvidersLoadInFlight / resolveOwningPluginIdsForProvider /
resolveExternalAuthProfilesWithPlugins) self-service a snapshot once
at the public entry, then thread it as a separate required arg
through resolvePluginProviderLoadBase,
resolveExplicitProviderOwnerPluginIds, and the setup/runtime load
state helpers. Inner reads change from
'params.pluginMetadataSnapshot?.x' to 'snapshot.x', no more
enrichedParams clone. loadPluginManifestRegistryForInstalledIndex
fires drop ~685 -> ~10 per cold start.
- Bundled-channel / auto-enable chain accepts an optional
PluginDiscoveryResult. discoverOpenClawPlugins is fired once during
snapshot building (resolveInstalledPluginIndexRegistry already
produced it internally; now bubbled up through
loadInstalledPluginIndexWithDiscovery, PluginRegistrySnapshotResult,
and onto PluginMetadataSnapshot.discovery). load-context reads
metadataSnapshot.discovery and passes it through
applyPluginAutoEnable, so the bundled-channel cascade
(collectConfiguredChannelIds, listBundledChannelIdsWith*,
listPotentialConfiguredChannelPresenceSignals) short-circuits
instead of each leaf re-firing discovery. Persisted-cache path is
unchanged: no discovery on the snapshot, downstream chain handles
its own fallback (pre-PR behavior on that path).
* test(plugins): isolate snapshot memo across tests that mock manifest registry
The snapshot memo is now process-scoped and effective (~98% hit rate).
Three test files were depending on cache misses (because the broken
cache returned them) — each test would set up its own
loadPluginManifestRegistry mock and expect a fresh derive. With the
cache fixed, an earlier test's mocked registry now leaks into later
tests in the same file.
- io.write-config.test.ts: afterEach now clears the snapshot memo so
the 'demo' plugin mocked in the first test does not survive into
'keeps shipped plugin install config records when index migration
fails', which expects an empty registry to surface the 'plugin not
found: demo' warning.
- gateway/model-pricing-cache.ts: resetGatewayModelPricingCacheForTest
also clears the memo. Tests in model-pricing-cache.test.ts assert
loadPluginManifestRegistryForInstalledIndex was called; the memo
hit otherwise skips the call.
- providers.test.ts: vi.doMock loadPluginMetadataSnapshot to wrap the
existing loadPluginManifestRegistryMock fixture. The plumbing
commit added an auto-fetch fall-through in
resolveOwningPluginIdsForProvider; without the mock, providers
tests hit real disk reads and return empty registries (which is
what surfaced as 9 unrelated-looking failures in the prior CI
run).
* fix(plugins): preserve setup.cliBackends owner matching in provider scan
resolveOwningPluginIdsForProvider now also checks plugin.setup?.cliBackends.
The pre-PR no-registry fallback used resolvePluginContributionOwners which
includes both top-level cliBackends and setup.cliBackends; the PR's manifest
scan replacement was missing the setup case.
* fix(plugins): inherit active registry workspaceDir before loading metadata snapshot
isPluginProvidersLoadInFlight and resolvePluginProviders now resolve
env and workspaceDir once at the entry point (falling back to
getActivePluginRegistryWorkspaceDir) and pass them into both
loadPluginMetadataSnapshot and resolvePluginProviderLoadBase. Pre-fix
the snapshot used params.workspaceDir raw while the load base inherited
the active workspace, so workspace-scoped provider plugins could be
absent from the snapshot manifest registry even though owner resolution
expected them.
Regression test asserts the snapshot mock receives the active
workspaceDir when the caller omits it.
* perf(gateway): thread discovery into applyPluginAutoEnable call sites
Every gateway applyPluginAutoEnable call now passes the snapshot's
PluginDiscoveryResult so the bundled-channel cascade (collectConfiguredChannelIds
→ listBundledChannelIdsWith* → listPotentialConfiguredChannelPresenceSignals)
short-circuits instead of each leaf re-firing discovery.
Startup-time sites pull discovery from the snapshot/lookup-table they already
hold:
- server-plugin-bootstrap.ts (pluginLookUpTable)
- server-startup-plugins.ts (pluginMetadataSnapshot)
- server-startup-config.ts (pluginMetadataSnapshot)
- server-plugins.ts (pluginLookUpTable, both call sites)
Per-RPC sites (server.impl getRuntimeConfig callback, server-methods/channels
status + start handlers, server-methods/send) source discovery via
getCurrentPluginMetadataSnapshot using the runtime config to validate
compatibility. Falls through to the original slow path when the snapshot is
absent or incompatible.
Summary:
- The branch adds a 1500 ms internal timeout to bundled MCP `tools/list` catalog discovery, adds slow and hung stdio MCP regression tests, and records the fix in `CHANGELOG.md`.
- PR surface: Source +2, Tests +216, Docs +1. Total +219 across 3 files.
- Reproducibility: yes. The current-main source path is high confidence: bundled MCP connects successfully, then calls `client.listTools` without request options, and the upstream SDK defaults that request to 60000 ms.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(mcp): use internal tools list timeout
- PR branch already contained follow-up commit before automerge: fix(mcp): bound tools/list during catalog discovery
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8506…
Validation:
- ClawSweeper review passed for head bbbfb9f059.
- Required merge gates passed before the squash merge.
Prepared head SHA: bbbfb9f059
Review: https://github.com/openclaw/openclaw/pull/85063#issuecomment-4511554739
Co-authored-by: nxmxbbd <32288+nxmxbbd@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
* feat(imessage): support thumb approval reactions
Mirrors openclaw#85477 (WhatsApp) for the iMessage channel. iMessage can now
deliver exec/plugin approval prompts via the existing imsg/BlueBubbles
transport and resolve approvals from 👍 (allow-once) / 👎 (deny) tapbacks.
Allow-always remains on the manual /approve <id> allow-always fallback.
What changed:
- New approval surfaces under extensions/imessage/src/:
approval-auth.ts, approval-resolver.ts, approval-reactions.ts,
approval-handler.runtime.ts, approval-native.ts (+ tests for each).
- channel.ts wires base.approvalCapability to the new iMessage capability.
- send.ts appends the 👍/👎 hint to outbound /approve prompts and registers
the reaction binding (keyed by accountId + chat_guid/chat_identifier/
chat_id/handle + messageId) after a successful send.
- monitor/monitor-provider.ts resolves approval reactions ahead of the
normal inbound decision pipeline so resolution bypasses
reactionNotifications gating and runs its own actor authorization.
- runtime.ts now exports getIMessageRuntime / getOptionalIMessageRuntime so
approval-reactions can open a persistent keyed store for binding state
across gateway restarts.
What did NOT change:
- Core approval surfaces in src/gateway/server-methods/* and src/infra/*
remain channel-agnostic; the channels.imessage.allowFrom field already
exists and is reused as the approver list for reactions.
- Other channels and the manual /approve sender-authorized path are
untouched.
* fix(imessage): address codex review findings on thumb approvals
Addresses 15 findings from the multi-angle codex review:
Critical (correctness / blocking):
- Register CHANNEL_APPROVAL_NATIVE_RUNTIME_CONTEXT_CAPABILITY in the iMessage
monitor so the gateway can actually deliver native approval prompts via
approval-handler.runtime.ts (it was dead code without the context lease).
- DM tapback approvals never resolved because send keyed by handle while
inbound preferred chat_guid. Register and look up under EVERY available
conversation key (chat_guid / chat_identifier / chat_id / handle); inbound
probes them all and accepts the first hit.
- Reaction binding now requires the bridge's GUID string (rejecting numeric
ROWIDs) so the binding key matches inbound reacted_to_guid.
- Outbound regex now requires both a canonical `ID: <approvalId>` header AND
a matching `/approve <id> <decision>` line, so non-approval messages that
legitimately mention /approve syntax no longer get a phantom reaction
binding (and can no longer resolve a colliding live approval).
- Drop is_from_me reaction events so cross-device echoes of the operator's
own tap cannot self-approve when their handle is in allowFrom.
High (operability / cleanup):
- Non-ApprovalNotFound errors now log at warn via the runtime child logger
(no longer hidden behind OPENCLAW_LOG_LEVEL=debug).
- In-memory binding is cleared on successful resolve so a toggle 👍→👎 (or
chat.db replay) does not refire and emit a misleading 'expired approval'
log line. Removed tapbacks are also owned by the shortcut and not surfaced
as noisy reaction system events.
- Move resolveIMessageReactionContext (and its helpers) to a slim
monitor/reaction-context.ts so approval-reactions.ts no longer transitively
pulls monitor/inbound-processing.ts (14+ heavy runtime modules) into the
hot channel.ts entrypoint per extensions/CLAUDE.md.
Medium (consistency / future-proofing):
- Native runtime exec pending payload now passes agentId, ask, and
sessionKey through buildExecApprovalPendingReplyPayload so the two
delivery routes produce identical operator-visible prompts.
- Both delivery paths now use addIMessageApprovalReactionHintToText (single
insertion point after ID:) so the hint cannot be double-emitted by the
native runtime path bypassing the idempotency guard.
- Extract replaceApprovalIdPlaceholder into a shared approval-text.ts that
escapes `$` in the replacement string so an approvalId containing
`$&`/`$1`-`$9`/`$$` cannot interpolate into the outbound text.
- In-memory Map now stores TTL alongside each entry and prunes expired
bindings on each register so the gateway no longer accumulates an
unbounded reaction-target Map.
- bindPending refuses to bind when accountId is missing or the approval is
already expired, with explicit error logs instead of silent no-ops.
- Reject chat_id=0 as a synthetic key value (chat.db ROWIDs start at 1).
- Drop dead getIMessageRuntime export — only the optional accessor is used.
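For the `$`-escaping item above, a minimal sketch of why the escape matters. The `<id>` placeholder and the helper name `fillApprovalId` are assumptions for illustration, not the contents of approval-text.ts.

```typescript
// In String.prototype.replace, "$&", "$1".."$9" and "$$" are special in the
// replacement string. Doubling every "$" makes the id insert literally.
function fillApprovalId(template: string, approvalId: string): string {
  const literal = approvalId.replace(/\$/g, "$$$$"); // each "$" becomes "$$"
  return template.replace(/<id>/g, literal);
}

// Unescaped variant, shown only to contrast the behavior.
function fillApprovalIdUnsafe(template: string, approvalId: string): string {
  return template.replace(/<id>/g, approvalId);
}
```

With the id `a$&b`, the unescaped variant expands `$&` to the matched placeholder text, while the escaped variant keeps the id intact.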
Documentation:
- docs/channels/imessage.md gains an 'Approval reactions (👍 / 👎)' accordion
documenting the reaction emoji map, allowFrom approver requirement, the
/approve <id> allow-always manual fallback, and the deliberate change to
/approve command authorization for users with non-empty allowFrom.
- CHANGELOG.md entry added under 2026.5.24.
Tests: 411 iMessage tests pass (was 406). Added explicit coverage for the
DM key-mismatch fix, the regex-tightening fix, the is_from_me guard, the
clear-on-success behavior, and the approval-id `$` escape.
* test(imessage): match WhatsApp approval-native test coverage
Backfills the nine cases from extensions/whatsapp/src/approval-native.test.ts
that weren't mirrored in iMessage:
- target-mode exec + plugin prompt rendering with the canonical hint
- target-mode availability when no iMessage target matches
- agentFilter / sessionFilter applied to native handling
- account-scoped target enabled/disabled per account
- shouldSuppressForwardingFallback session-origin exact-match cases
- shouldSuppressForwardingFallback off when native cannot bind (locks down
the targets-only forwarding path the Lobster live deploy exercised)
- both-mode explicit + unscoped target suppression
- group-origin tapback approvals require explicit approvers
Tests: extensions/imessage/src/approval-native.test.ts 21 passed (was 11).
Total iMessage approval-specific cases now 49 (was 40).
* fix(imessage): preserve service-prefixed direct handles as approvers
ClawSweeper P1 review finding on #85952. normalizeIMessageApproverId was
calling looksLikeIMessageExplicitTargetId() to reject conversation-target
prefixes, but that helper also matches the imessage:/sms:/auto: service
prefixes — which are valid direct-handle forms. Any allowFrom entry like
'imessage:+15551230000' dropped to undefined, leaving approvers empty,
which:
- silently denied reaction resolution ('reactions require explicit
approvers'), and
- let text /approve fall back to implicit same-chat authorization.
Fix: normalize first via normalizeIMessageHandle (strips the service
prefix), then reject only chat_id:/chat_guid:/chat_identifier:
conversation-target shapes that remain after normalization.
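A rough sketch of the normalize-then-reject order. The regexes and `normalizeApproverId` below are simplified assumptions; the real normalizeIMessageHandle likely does more than strip the prefix.

```typescript
// Service prefixes are valid direct-handle forms and get stripped first.
const SERVICE_PREFIX = /^(imessage|sms|auto):/i;
// Conversation-target shapes are never valid approver ids.
const CHAT_TARGET = /^(chat_id|chat_guid|chat_identifier):/i;

function normalizeApproverId(entry: string): string | undefined {
  const handle = entry.trim().replace(SERVICE_PREFIX, "");
  if (handle === "" || CHAT_TARGET.test(handle)) return undefined;
  return handle;
}
```

Checking for conversation targets only after stripping the service prefix keeps entries like `imessage:+15551230000` usable while still rejecting `imessage:chat_guid:...` forms.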
Tests:
- approval-auth.test.ts: assert the resolved approver list contains the
normalized handle, plus the corollary that a non-matching sender is
explicitly rejected (no longer masked by the implicit-same-chat
fallback). Add a separate case covering chat_id/chat_guid/
chat_identifier rejection (with and without a service prefix).
- approval-reactions.test.ts: reaction resolution end-to-end with a
service-prefixed allowFrom entry — proves resolveIMessageApproval is
called rather than silently denied.
Focused suite: 48 passed (was 47).
* test(imessage): satisfy strict buildPendingPayload signature in render tests
CI check:test-types caught that the render.exec/render.plugin
buildPendingPayload calls were passing accountId (not in the type
signature). The signature is { cfg, request, target, nowMs }. Replace
accountId with target on the four render-test sites so the strict
test-types pass matches the SDK contract:
- it('renders thumbs-only reaction hints in exec approval prompts')
- it('renders thumbs-only reaction hints in plugin approval prompts ...')
- it('renders target-mode exec prompts with concrete thumbs-only ...')
- it('renders target-mode plugin prompts with concrete thumbs-only ...')
Verified locally with pnpm check:test-types (tsgo:core:test +
tsgo:extensions:test). 49 approval-specific tests still pass.
* fix(imessage): probe every tapback GUID form for approval lookup
ClawSweeper P1 review finding on #85952. readApprovalReactionEvent was
only using reaction.targetGuid (the first/normalized form), but
resolveIMessageReactionContext produces reaction.targetGuids = [normalized,
raw] for both `abc-123` and `p:0/abc-123` forms. If the imsg bridge
returned 'p:0/<guid>' from send() and send.ts registered the binding under
that prefixed key, the inbound resolver probing only the unprefixed form
would miss and the tapback would silently fall through.
Fix:
- Surface every GUID candidate in IMessageApprovalReactionEvent
(messageIdCandidates).
- maybeResolveIMessageApprovalReaction now probes each candidate in
precedence order; first hit wins.
- On success / ApprovalNotFoundError, clear the binding under all
candidate keys so toggle/replay does not refire.
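The probe-and-clear steps above can be sketched like this. The `Map`-based binding store and the name `resolveApprovalBinding` are assumptions for illustration; the real store is keyed by more than the message GUID.

```typescript
// bindings: message GUID form -> approval id (hypothetical shape).
function resolveApprovalBinding(
  bindings: Map<string, string>,
  messageIdCandidates: string[],
): string | undefined {
  // Probe each candidate in precedence order; the first hit wins.
  for (const candidate of messageIdCandidates) {
    const approvalId = bindings.get(candidate);
    if (approvalId !== undefined) {
      // Clear under every candidate key so a toggle or replay cannot refire.
      for (const key of messageIdCandidates) bindings.delete(key);
      return approvalId;
    }
  }
  return undefined;
}
```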
Tests: extensions/imessage/src/approval-reactions.test.ts gains a
'resolves a reaction when the binding was registered under a p:0/…
prefixed GUID and the tapback surfaces both forms' regression case;
22/22 reaction tests pass. Full iMessage suite: 424/424.
* fix(imessage): native approval binding requires GUID, not numeric id
ClawSweeper third P1 review finding on #85952. approval-handler.runtime.ts
deliverPending was using result.messageId as the approval-reaction binding
key, but that field can be a numeric ROWID coerced to a string ('12345')
when the imsg bridge returns only message_id. Inbound tapbacks carry
reacted_to_guid which is always a GUID, so a numeric-id binding can never
match.
Fix mirrors the send.ts forwarding-path treatment:
- IMessageSendResult now exposes a separate guid?: string field, populated
from the same resolveOutboundMessageGuid helper send.ts already uses for
the forwarding-path binding. The generic messageId field is unchanged so
reply-cache, echo-cache, and receipt-building paths still see the
broadest id form.
- deliverPending now binds against result.guid; when it's undefined (numeric
ROWID or 'ok'/'unknown' placeholders), the function returns null instead
of binding against an id the inbound tapback can't possibly match.
Tests: approval-handler.runtime.test.ts gets a deliverPending GUID-only
binding describe block with three regression cases (numeric ROWID refused,
GUID accepted, ok/unknown placeholders refused). vi.mock isolates
sendMessageIMessage so the cases run synchronously without spawning imsg.
11 tests pass across handler.runtime + send specs.
---------
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Summary:
- The branch updates OpenRouter dynamic model capability parsing to prefer `top_provider.context_length`, bump ... sk cache version, adds regression coverage and a changelog entry, and adds script helper declaration files.
- Reproducibility: yes. from source and live catalog evidence rather than an authenticated inference turn. Cur ... catalog currently reports a smaller endpoint-specific `top_provider.context_length` for the reported model.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(openrouter): use endpoint context limits
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8594…
Validation:
- ClawSweeper review passed for head 76fcc362d2.
- Required merge gates passed before the squash merge.
Prepared head SHA: 76fcc362d2
Review: https://github.com/openclaw/openclaw/pull/86041#issuecomment-4528646655
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- The PR changes dev-channel git updates to fetch branches with `--no-tags`, adds targeted fetching for explicit dev tag refs, updates update-runner tests, and adds a changelog entry.
- Reproducibility: yes. Current main source shows dev updates still run a broad tag fetch, and the PR body sup ... al local bare-remote moved-tag reproducer showing that command fails before the branch update can continue.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(update): avoid broad tag fetches for dev updates
Validation:
- ClawSweeper review passed for head 733680b1bc.
- Required merge gates passed before the squash merge.
Prepared head SHA: 733680b1bc
Review: https://github.com/openclaw/openclaw/pull/84737#issuecomment-4503692161
Co-authored-by: Ruben Cuevas <hi@rubencu.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
The agentToAgent allow-pattern matcher converted user wildcards like
`*a*b*c*` into `^.*a.*b.*c.*$` via RegExp. Multiple overlapping
`.*` groups cause O(n^k) polynomial backtracking against non-matching
input, where k is the number of wildcards.
Replace the regex path with a segment-based glob matcher that splits on
`*` and checks prefix/suffix/interior segments in order. The new
matcher runs in O(n*k) worst case and eliminates the regex engine
entirely from this path.
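A minimal sketch of a segment-based matcher of this kind, assuming `*` is the only wildcard. `matchesGlob` is an illustrative name, not the function in the patch.

```typescript
function matchesGlob(pattern: string, input: string): boolean {
  const segments = pattern.split("*");
  // No wildcard at all: exact comparison.
  if (segments.length === 1) return pattern === input;

  const first = segments[0] ?? "";
  const last = segments[segments.length - 1] ?? "";
  // Prefix and suffix must both fit without overlapping.
  if (input.length < first.length + last.length) return false;
  if (!input.startsWith(first) || !input.endsWith(last)) return false;

  // Interior segments must appear in order between the prefix and the suffix.
  let pos = first.length;
  const end = input.length - last.length;
  for (const segment of segments.slice(1, -1)) {
    if (segment === "") continue; // consecutive "*" collapse to one
    const idx = input.indexOf(segment, pos);
    if (idx === -1 || idx + segment.length > end) return false;
    pos = idx + segment.length;
  }
  return true;
}
```

Taking the leftmost match for each interior segment is sufficient here because `*` can absorb any remaining characters, so the scan never needs to backtrack.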
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(minimax): normalize OAuth token expiry to absolute millisecond timestamp
MiniMax returns expired_in from the token endpoint as a relative duration
in seconds (standard OAuth expires_in semantics), but the auth profile
store's hasUsableOAuthCredential() expects an absolute millisecond
timestamp. Without conversion the token appears perpetually expired,
triggering a slow OAuth refresh network call to api.minimaxi.com on
every request — the root cause of the 30-50s auth-stage delay.
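The conversion itself is small. A sketch under the assumption that the store treats a credential as usable while its expiry lies in the future (`normalizeExpiresAtMs` and `isUsable` are hypothetical names):

```typescript
// expiredInSeconds: relative lifetime as returned by the token endpoint.
// Returns an absolute epoch timestamp in milliseconds.
function normalizeExpiresAtMs(expiredInSeconds: number, nowMs: number = Date.now()): number {
  return nowMs + expiredInSeconds * 1000;
}

// Assumed shape of the usability check: expiry must be later than "now".
function isUsable(expiresAtMs: number, nowMs: number): boolean {
  return expiresAtMs > nowMs;
}
```

A raw value such as 3600, stored unconverted, reads as a timestamp in early 1970 and therefore always looks expired.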
Fixes #83449.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(minimax): cover oauth expiry normalization
* fix: polish minimax oauth expiry normalization (#83480) (thanks @NianJiuZst)
* fix: update minimax raw fetch allowlist (#83480)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Summary:
- The branch updates gateway boot startup handling to use an `agent:<id>:boot` session, suppress prompt persis ... that boot mapping after the run, and adds focused gateway boot regression coverage plus a changelog entry.
- Reproducibility: yes. there is a high-confidence source reproduction path: current main passes the generated ... idence of repeated persisted boot prompts. I did not execute the gateway scenario in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Fix boot-md test lint
- PR branch already contained follow-up commit before automerge: Isolate boot-md startup sessions
Validation:
- ClawSweeper review passed for head 5d5338c2d9.
- Required merge gates passed before the squash merge.
Prepared head SHA: 5d5338c2d9
Review: https://github.com/openclaw/openclaw/pull/85919#issuecomment-4527318708
Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Clamp proxy-like OpenAI Chat Completions output caps against the estimated final outbound request payload after compatibility transforms. This prevents strict local/API-compatible servers from rejecting requests whose prompt already consumes part of the effective context window, while avoiding over-clamping dropped replay turns.
Co-authored-by: rendrag-git <253747599+rendrag-git@users.noreply.github.com>
Honor configured restart drain budgets for embedded runs and avoid a second active-work drain after forced deferral timeout restarts.
Includes maintainer changelog entry.
* fix(ui): handle empty strings with minLength constraint in config save
Fixes #85831
When saving config in Control UI, required string fields with minLength
constraint (e.g., z.string().min(1)) were sent as empty strings instead
of being unset. This prevented schema defaults from applying.
Solution: coerce empty strings with minLength > 0 to undefined, allowing
schema defaults to take effect during validation.
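A minimal sketch of the coercion, assuming a simplified `JsonSchema` shape. `coerceEmptyString` is an illustrative name rather than the helper added in the patch.

```typescript
interface JsonSchema {
  type?: string;
  minLength?: number;
  maxLength?: number;
}

// An empty string can never satisfy minLength > 0, so treat it as "unset"
// and let the schema default apply during validation.
function coerceEmptyString(value: unknown, schema: JsonSchema): unknown {
  if (value === "" && schema.type === "string" && (schema.minLength ?? 0) > 0) {
    return undefined;
  }
  return value;
}
```

Empty strings stay as-is when the schema has no `minLength` (or `minLength: 0`), since an empty value is valid there.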
Added 5 unit tests covering edge cases.
* fix(types): add minLength and maxLength to JsonSchema type
Keep successful Codex native hook relays alive through a bounded grace window so late hook callbacks still reach OpenClaw enforcement, while interrupted, aborted, timed-out, and failed turns unregister immediately.
Co-authored-by: Kaspre <kaspre@gmail.com>
Summary:
- The PR adds the Chrome DevTools MCP `--no-usage-statistics` default launch arg, honors explicit profile usage-statistics `mcpArgs`, adds regression tests, and adds a changelog entry.
- Reproducibility: yes. source-reproducible: current main builds Chrome MCP launch args without the upstream o ... etry is initialized. I did not run a fresh failing current-main process leak loop in this read-only review.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Disable Chrome MCP telemetry watchdog by default
Validation:
- ClawSweeper review passed for head 68249b1f58.
- Required merge gates passed before the squash merge.
Prepared head SHA: 68249b1f58
Review: https://github.com/openclaw/openclaw/pull/85886#issuecomment-4526997996
Co-authored-by: Rohit <rohitjavvadi2@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Restore the describeImageWithModel default token budget to the helper-level 4096-token default instead of forcing 512 before resolution.
Add regression coverage for the default and for smaller model caps, and record the user-facing fix in the changelog.
Co-authored-by: scotthuang <scotthuang@tencent.com>
* fix(doctor): repair stale contextWindow for DeepSeek V4 Flash
Problem:
- Older releases configured deepseek-v4-flash with contextWindow: 200000
- Official DeepSeek V4 Flash context window is 1,000,000 (1M)
- Users switching from smaller models see incorrect progress bar (e.g.,
50% instead of 10%) because stale config value overrides catalog
Fix:
- Add 'models.providers.*.models.*.contextWindow-stale' migration
- Detects deepseek-v4-flash models with 200K contextWindow
- Repairs to 1M to match catalog default
- Handles both bare and provider-prefixed model IDs
- 7 unit tests covering repair, passthrough, edge cases
Fixes: #85834
* fix(doctor): preserve custom DeepSeek context windows
* fix(doctor): detect stale DeepSeek context windows
* fix(doctor): scope DeepSeek context repair
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
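The repair rule above could look roughly like the following sketch. `ModelEntry`, the helper names, and the `provider/model` prefix form are assumptions for illustration; the real doctor migration may be structured differently.

```typescript
// Hypothetical model entry shape used only for this sketch.
interface ModelEntry {
  id: string;
  contextWindow?: number;
}

const STALE_CONTEXT_WINDOW = 200_000;
const CATALOG_CONTEXT_WINDOW = 1_000_000;

// Accepts the bare id and an assumed provider-prefixed form such as "deepseek/deepseek-v4-flash".
function isDeepSeekV4Flash(id: string): boolean {
  const bare = id.includes("/") ? id.slice(id.lastIndexOf("/") + 1) : id;
  return bare === "deepseek-v4-flash";
}

// Repair only the known stale value; custom context windows pass through untouched.
function repairStaleContextWindow(model: ModelEntry): ModelEntry {
  if (isDeepSeekV4Flash(model.id) && model.contextWindow === STALE_CONTEXT_WINDOW) {
    return { ...model, contextWindow: CATALOG_CONTEXT_WINDOW };
  }
  return model;
}
```

Matching on the exact stale value is what keeps user-chosen context windows intact in this sketch.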
* fix(gateway): broadcast error to UI when chat.send fails synchronously
* test(gateway): verify broadcastChatError is called on chat.send error
* test(gateway): import GatewayRequestContext from local server-methods barrel
Fixes the chat error-broadcast regression test so it can resolve its
type import. The previous `../types.js` path does not exist in the
gateway tree; the shared types are re-exported from
`src/gateway/server-methods/types.ts`, so the test must use `./types.js`.
Addresses ClawSweeper review on PR #85815.
---------
Co-authored-by: scotthuang <scotthuang@tencent.com>
createAuthProvider swallowed addUserForToken rejections in a .catch()
that only logged, so getClient returned and cached a ChatClient backed
by a RefreshingAuthProvider with no bound user. The failure surfaced
later as an opaque auth error on first send instead of failing fast.
Re-throw in the catch so getClient rejects and does not cache the broken
client. Adds regression tests for the rejection and the no-cache behavior.
Fixes #83853
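A sketch of the fail-fast pattern described above. `ChatClientLike`, `AddUser`, and this `getClient` signature are hypothetical stand-ins, not the Twitch plugin's real types.

```typescript
// Hypothetical stand-ins for the Twitch client types; not the real client API.
interface ChatClientLike {
  accountId: string;
}
type AddUser = (accountId: string) => Promise<void>;

const clientCache = new Map<string, ChatClientLike>();

async function getClient(accountId: string, addUserForToken: AddUser): Promise<ChatClientLike> {
  const cached = clientCache.get(accountId);
  if (cached) return cached;
  try {
    await addUserForToken(accountId);
  } catch (err) {
    console.warn(`auth setup failed for ${accountId}`);
    throw err; // re-throw so the caller fails fast and nothing is cached
  }
  const client: ChatClientLike = { accountId };
  clientCache.set(accountId, client);
  return client;
}
```

Because the cache write happens only after the token is bound, a later call can retry cleanly instead of reusing a broken client.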
Summary:
- The PR skips agent-harness compaction preflight for provider-owned or configured CLI runtime sessions, adds claude-cli regression coverage, includes a changelog entry, and applies small test/type cleanups.
- Reproducibility: yes, at source level. Current main still routes provider-owned `claude-cli` runtime compaction preflight through harness selection, where `claude-cli` is not a registered embedded harness.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix#84857: skip CLI runtime harness preflight during compaction
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8487…
Validation:
- ClawSweeper review passed for head 1dd8a88d21.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1dd8a88d21
Review: https://github.com/openclaw/openclaw/pull/85862#issuecomment-4526794976
Co-authored-by: 张贵萍0668001030 <zhang.guiping@xydigit.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
* fix(session-lock): enforce maxHoldMs in shouldReclaim during lock acquisition
- Adds optional maxHoldMs parameter to inspectLockPayload
- Inspect now marks locks as stale when held longer than maxHoldMs
- Passes maxHoldMs through inspectLockPayloadForSession
- acquireSessionWriteLock's shouldReclaim callback now passes maxHoldMs
This ensures that when a live process holds a lock for longer than
maxHoldMs (default 5min), other processes can reclaim it during
acquisition — matching the watchdog's existing enforcement.
Previously shouldReclaim only used staleMs (30min default), meaning
a lock held for 10+ minutes by a live PID would never be reclaimable,
causing 60s timeout failures and gateway freezes.
Closes #85762
* fix(session-lock): add dead-PID fast-path before retry loop
Adds a fast-path check at the top of acquireSessionWriteLock:
if the lock file's owner PID is dead, remove it immediately
before entering the retry loop. This saves up to timeoutMs (60s)
of futile waiting when the previous lock holder has died.
The shouldReclaim callback already handles this case, but only
iteratively through the retry loop. The fast-path eliminates
that unnecessary delay.
* fix(session-lock): enforce max hold during acquisition
* fix(session-lock): revalidate max hold safely
* fix(session-lock): honor holder max-hold policy
* fix(session-lock): keep cleanup from reclaiming live holders
* fix(session-lock): remove stale locks only when unchanged
* fix(session-lock): skip self-held max-hold reclaim
* fix(ci): refresh gateway protocol checks
---------
Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
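The reclaim rules from this series can be sketched as a single predicate. The payload fields and option names below are assumptions for illustration; the real `inspectLockPayload` and `shouldReclaim` carry more state and revalidation.

```typescript
// Hypothetical lock payload; field names are assumptions for illustration.
interface LockPayload {
  pid: number;
  acquiredAtMs: number;
}

interface ReclaimOptions {
  nowMs: number;
  staleMs: number; // age after which any lock counts as stale
  maxHoldMs: number; // longest another live holder may keep the lock
  selfPid: number;
  isPidAlive: (pid: number) => boolean;
}

function shouldReclaimLock(lock: LockPayload, opts: ReclaimOptions): boolean {
  if (!opts.isPidAlive(lock.pid)) return true; // dead owner: reclaim without waiting
  const heldMs = opts.nowMs - lock.acquiredAtMs;
  if (heldMs > opts.staleMs) return true;
  // The max-hold rule is skipped for locks held by this same process.
  return lock.pid !== opts.selfPid && heldMs > opts.maxHoldMs;
}
```

With the defaults quoted above (5 minutes for `maxHoldMs`, 30 minutes for `staleMs`), a lock held for 10 minutes by another live process is reclaimable here, which is the case the old `staleMs`-only check missed.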
* fix(cli-runner): keep recent tail when reseed history exceeds maxHistoryChars
`buildCliSessionHistoryPrompt` was prefix-slicing the rendered history,
dropping the most recent assistant turns from the reseed prompt. After
#80934 made the Claude-CLI reseed default-on, every Claude-CLI user is
exposed to this on session_expired when the rendered transcript exceeds
12288 chars. The truncation marker landed mid-word in real reproductions.
Fix:
- Tail-slice (keep the recent suffix, drop the older prefix)
- Pin the compaction summary as a prefix when present, only cap the
post-summary transcript (loadCliSessionReseedMessages deliberately
places the summary first)
- When the summary alone exceeds maxHistoryChars, head-slice the summary
itself to honor the cap; drop the post-summary tail in that case
- Move the truncation marker to the lead since what follows is the
recent tail, not what was dropped
Closes #83157
* fix(cli-runner): retain recent tail with oversize summaries
* fix(cli-runner): cap summary block plus marker against maxHistoryChars
ClawSweeper P2 on #83117 flagged that when `summaryRendered.length` is
less than `maxHistoryChars` but `summaryBlock.length` (summary + `\n\n`
separator) meets or exceeds it, the `remainingBudget <= 0` arm of
`buildCliSessionHistoryPrompt` appends the truncation marker after the
already-full summary block. A 199-char rendered summary under a 200-char
cap produced a 257-char history block — defeating the cap that prevents
reseeding fresh CLI sessions with unexpectedly huge prompts.
Fix the budget edge by truncating the summary in this branch as well so
`summary + separator + marker` stays within `maxHistoryChars`. The tail
still drops (the summary alone consumes the budget) and the marker still
leads its own line so the prompt announces what was discarded. Mirrors
the existing oversize-summary branch's pattern of head-slicing the
summary against an explicit budget that reserves marker + separator.
Add a focused regression in `session-history.test.ts` covering exactly
the gap the finding called out: `summaryRendered.length < maxHistoryChars`
with a non-empty post-summary tail. Asserts the rendered history block
stays within `maxHistoryChars` and the truncation marker is present.
* fix(cli-runner): keep tail for near-cap summaries
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
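A simplified sketch of the tail-slicing idea, without the pinned compaction summary. The marker text and the `tailSliceHistory` name are hypothetical; only the "keep the recent suffix, lead with the marker, stay within the cap" behavior is taken from the description above.

```typescript
// Hypothetical marker text; the real marker wording is not shown in this log.
const TRUNCATION_MARKER = "[earlier history truncated]\n";

// Keep the most recent suffix of the rendered history, with the marker leading,
// and never exceed maxHistoryChars in total.
function tailSliceHistory(rendered: string, maxHistoryChars: number): string {
  if (rendered.length <= maxHistoryChars) return rendered;
  const budget = Math.max(0, maxHistoryChars - TRUNCATION_MARKER.length);
  const tail = budget > 0 ? rendered.slice(rendered.length - budget) : "";
  return (TRUNCATION_MARKER + tail).slice(0, maxHistoryChars);
}
```

Reserving the marker's length from the budget before slicing is the same accounting the near-cap summary fix relies on: everything that is emitted counts against the cap.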
The auto-reply "delivery failed" log path passes a raw Error
under the `err` field. tslog's default JSON serialization
renders bare Error instances as `{}` because Error own data
properties are non-enumerable. Every delivery failure in
production therefore logs `err: {}`, forcing operators to
guess the underlying Baileys error from timestamp alone.
Convert Error to `{ type, message, stack }` plus own-enumerable
properties at the log site, so Boom-style subclass diagnostics
(output.statusCode, data) and custom OutboundDeliveryError
fields (stage, results) survive. Non-Error rejection values
pass through unchanged.
Tests cover Error, Error subclass (Boom-style), string
rejection, and object rejection paths.
AI-assisted: Claude Code (Opus 4.7) authored, codex review
locally addressed.
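A sketch of the conversion, assuming `type` carries the error's `name` and that the explicit fields take precedence over same-named own properties. Both choices are assumptions, and `serializeErrorForLog` is a hypothetical name.

```typescript
// Convert an Error into a plain object that JSON serialization can render.
function serializeErrorForLog(err: unknown): unknown {
  if (!(err instanceof Error)) return err; // non-Error rejections pass through unchanged
  return {
    ...err, // own enumerable properties only (for example Boom-style `output` or `data`)
    type: err.name, // assumption: `type` mirrors the error name
    message: err.message,
    stack: err.stack,
  };
}
```

The spread copies only own enumerable properties, which is why `message` and `stack` have to be read explicitly.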
Strict OpenAI-compatible servers (vLLM, LocalAI, llama.cpp, LM Studio) and
current OpenAI itself reject requests containing tools: []. Strip the empty
tools array (and the orphan tool_choice) from outbound chat-completions
payloads when usesExplicitProxyLikeEndpoint is true. Native OpenAI/Azure/
OpenRouter routes are byte-identical.
Supersedes #70790 at the canonical payload builder seam so the gateway,
embedded runner, and public plugin-SDK consumers (zai/xiaomi/deepseek) all
benefit.
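The stripping rule can be sketched as below. `ChatCompletionsPayload` is a hypothetical minimal shape, and passing `usesExplicitProxyLikeEndpoint` as a plain boolean is a simplification of however the payload builder actually derives it.

```typescript
// Hypothetical minimal payload shape for this sketch.
interface ChatCompletionsPayload {
  model: string;
  messages: unknown[];
  tools?: unknown[];
  tool_choice?: unknown;
}

function stripEmptyTools(
  payload: ChatCompletionsPayload,
  usesExplicitProxyLikeEndpoint: boolean,
): ChatCompletionsPayload {
  if (!usesExplicitProxyLikeEndpoint) return payload; // native routes stay untouched
  if (!Array.isArray(payload.tools) || payload.tools.length > 0) return payload;
  const next: ChatCompletionsPayload = { ...payload };
  delete next.tools; // drop the empty array
  delete next.tool_choice; // drop the now-orphaned tool_choice
  return next;
}
```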
* codex: honor verbose in group dispatch
* codex: address group verbose review findings
Record the final local review pass for the group /verbose PR.
Codex review against origin/main completed clean after tightening the shared group progress gate, keeping public plugin hook types stable, preserving ACP hidden tool boundaries, and adding regressions for live verbose gating and progress-callback suppression.
* codex: require explicit group verbose progress
Normal group tool/progress summaries now require an explicit session verbose override instead of inherited agent verbose defaults.
This addresses the PR review concern that existing verboseDefault configurations could expose group progress after upgrade. DMs and forum-topic behavior continue to use the effective verbose state, while normal groups use the live explicit session verbose state set by /verbose on|full|off.
* codex: document Slack group verbose caveat
* fix(channels): simplify verbose progress gating
* docs(changelog): note verbose channel fix
* fix(channels): preserve quiet default for group progress
* fix(channels): keep verbose error policy dynamic
* fix(channels): default verbose progress off everywhere
* fix(channels): keep followup verbose default quiet
* fix(channels): latch visible tool-error progress
* fix(channels): track failed verbose progress events
* fix(channels): latch delivered tool errors
* fix(channels): prevent progress opt-out bypass
* fix(channels): isolate followup error warning state
* fix(channels): keep full verbose followup warnings
* fix(channels): latch tool errors after visible progress
* fix(channels): require visible followup failure progress
* fix(channels): refresh followup verbose state
* fix(channels): honor live verbose for error details
* test(channels): expect live verbose off warning mode
* fix(channels): preserve static tool error suppression semantics
* fix(channels): bypass acp for colon verbose commands
* fix(channels): narrow dynamic tool warning override
* fix(channels): gate compaction notices on live verbose
* fix(channels): suppress quiet followup compaction callbacks
* fix(channels): suppress tts for hidden tool summaries
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Summary:
- The PR removes forced consult diagnostics from Discord and phone-call realtime consult payloads, adds private debug logs and regression tests, and records the fix in the changelog.
- Reproducibility: yes, by source inspection. Current main builds the forced Discord consult message with the ... gent_consult` diagnostic string, and the phone-call fallback passes the same diagnostic as consult context.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(discord): log forced consult fallback reason
- PR branch already contained follow-up commit before automerge: fix(discord): keep forced voice consult diagnostics private
Validation:
- ClawSweeper review passed for head c1592530c6.
- Required merge gates passed before the squash merge.
Prepared head SHA: c1592530c6
Review: https://github.com/openclaw/openclaw/pull/84411#issuecomment-4494164784
Co-authored-by: FullerStackDev <263060202+fuller-stack-dev@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Expose a path-free estimated context budget status on session entries and gateway session rows, render it in status when fresh provider usage is unavailable, and clear stale estimates across reset, refresh, compaction, and session-rotation boundaries.
Verification: focused local Vitest covered session persistence, status rendering, gateway rows, model resets, compaction, and session rotation; GitHub CI passed on clean head cad199e43d.
Refs #80594, #54996, #77992, #84490, #83177, #43009, #83526, #8635.
* fix: harden package URL downloads
Guard package acceptance URL downloads with HTTPS-only validation, no embedded credentials, private/special-use DNS and IP rejection, manual redirect checks, bounded timeout/size limits, pinned lookup, and atomic temp-file writes. Add tooling tests for unsafe URLs, redirect validation, size limits, and successful writes.
* fix: cancel redirect response bodies before closing dispatcher
ClawSweeper P2: the redirect branch in openPackageDownloadResponse cleared
the timeout and awaited dispatcher.close() without first cancelling
response.body. Undici's close() is graceful — it waits for in-flight
requests to complete — so a malicious redirect with a slow/never-ending
body could hang the hardened downloader.
Fix: call response.body?.cancel() before dispatcher.close() to abort the
redirect body immediately.
Test: add a regression test that uses a ReadableStream with an indefinite
interval to simulate a hanging body, and asserts cancel() was called.
Refs: clawsweeper review on PR #85512
* test: harden redirect body cancellation race in regression test
Guard the ReadableStream controller.enqueue() call with a cancelled
flag and try/catch to prevent ERR_INVALID_STATE when the interval
fires after cancel() closes the controller.
* fix: cancel final response body before closing dispatcher in downloadUrl
ClawSweeper P2: the HTTP-error and declared-oversize early-exit paths
in downloadUrl threw before consuming or canceling response.body. The
finally block then cleared the timeout and awaited graceful
dispatcher.close() with the body still open, allowing a slow/never-ending
response to hang release tooling.
Fix: add response.body?.cancel() in the finally block before
dispatcher.close().
Tests: add two regressions:
- HTTP 500 with slow body: asserts cancel() called before dispatcher close
- Declared content-length oversize with slow body: same assertion
* fix: add trusted package URL source policy
* fix: keep package URL resolver dependency-free
* test: cover encoded IPv6 package URL bypasses
* docs: sync package acceptance source overview
* docs: restore release doc formatting
* docs: sync package acceptance trusted-url source
* test: cover dotted IPv4 embedded IPv6 package URLs
* fix: parse dotted IPv4 embedded in IPv6 package URLs
* test: isolate anthropic pruning defaults
* test: move anthropic dated model coverage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
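The ordering fix in the body-cancellation commits can be sketched with structural stand-ins, so the example runs without undici. `BodyLike`, `DispatcherLike`, and `closeResponse` are hypothetical names, and swallowing a cancel failure is an assumption of this sketch.

```typescript
// Structural stand-ins so the sketch runs without undici; both names are hypothetical.
interface BodyLike {
  cancel(): Promise<void>;
}
interface DispatcherLike {
  close(): Promise<void>;
}

// Abort the response body first; a graceful close() otherwise waits on in-flight bodies.
async function closeResponse(
  response: { body?: BodyLike | null },
  dispatcher: DispatcherLike,
): Promise<void> {
  try {
    await response.body?.cancel();
  } catch {
    // Assumption: cancellation is best effort, so the dispatcher is still closed.
  }
  await dispatcher.close();
}
```

Calling this from a `finally` block covers the early-exit paths (HTTP errors, declared oversize) as well as redirects.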
* fix(exec-approvals): add .catch() to expiry delivery fire-and-forget
When exec-approval expiry fires, deliverToTargets is called as a
fire-and-forget promise with no .catch(). If delivery fails, the
unhandled rejection swallows the error and the notification is lost.
Add .catch() with log.warn to match the ackDelivery error handling
pattern. Keep pending.delete() before the await (the entry is expired
regardless of delivery success).
Closes #83113
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(approvals): label expiry delivery errors by kind
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
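A sketch of the expiry path with the added `.catch()`. The function name, parameter list, and log wording are hypothetical; the promise is returned here only so the behavior can be observed, while a real caller would likely fire and forget.

```typescript
type Deliver = () => Promise<void>;

function expireApproval(
  pending: Map<string, unknown>,
  id: string,
  deliverToTargets: Deliver,
  warn: (message: string) => void,
): Promise<void> {
  pending.delete(id); // the entry is expired regardless of delivery success
  return deliverToTargets().catch((err: unknown) => {
    // Log instead of leaving an unhandled rejection behind.
    warn(`exec approval expiry delivery failed: ${String(err)}`);
  });
}
```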
* fix(doctor): skip empty entries and memoize routes in plugin session repairs
runPluginSessionStateDoctorRepairs called resolveConfiguredDoctorSessionStateRoute
once per session-store key, even for entries that carry no plugin route state
fields. On stores with many CLI sessions (observed ~800 entries), each call
takes ~1.5s due to resolveAgentHarnessPolicy walking config and provider
metadata, so the doctor's state-integrity contribution hangs for minutes
and the surrounding 'openclaw doctor' run effectively never completes.
scanEntryForOwner can only produce repair/manual-review findings when the
entry exposes one of the fields covered by entryMayContainPluginSessionRouteState
(providerOverride/modelOverride/agentHarnessId/cliSessionBindings/etc.), so
the route resolution for empty entries was pure waste. The route itself is
also a function of agentId (sessionKey is only used to derive agentId), so
sessions sharing an agent can reuse one resolved route.
Filter the store by entryMayContainPluginSessionRouteState before resolving,
and memoize resolveConfiguredDoctorSessionStateRoute by agentId within the
remaining entries. On the repro store this drops the contribution from
'never completes' to <100ms.
Adds a guard test that builds a 200-entry store with 2 route-state-carrying
entries and asserts (a) the repair fires exactly once on the codex owner
and (b) the run completes in under 2s (pre-fix would take >5 minutes).
* fix(doctor): skip manifest model-id normalization in plugin session repairs
After the previous filter+memoize fix, runPluginSessionStateDoctorRepairs was
still ~38s on a 230-entry store because every scanned entry calls parseModelRef
on its runtime model. That implicitly enters manifest-driven model-id
normalization via normalizeStaticProviderModelId, which calls
loadPluginMetadataSnapshot when no current snapshot is bound to process state.
loadPluginMetadataSnapshot is filesystem-heavy and is only memoized when a
'current' snapshot is bound (it is not, during doctor), so each parseModelRef
call paid ~40ms of fresh plugin-metadata loading. 672 calls × ~40ms = ~27s
of doctor wall-clock, all of it useless for doctor's purposes: the scan only
needs the normalized provider id of the configured runtime/route to compare
against an owner's providerIds, never the manifest-normalized model id.
Pass allowManifestNormalization: false alongside the existing
allowPluginNormalization: false on all three parseModelRef call sites in
this file. normalizeStaticProviderModelId short-circuits to
normalizeBuiltInProviderModelId when allowManifestNormalization is false,
which is what doctor wants here.
On the same 230-entry store doctor:state-integrity drops from ~38s to ~2.4s
and total openclaw doctor wall-clock drops from ~91s to ~56s.
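The filter-then-memoize step from the first doctor fix can be sketched as follows. The entry shape is reduced to two route-state fields and carries `agentId` directly, which simplifies the real derivation from `sessionKey`; all names here are hypothetical.

```typescript
// Hypothetical, reduced session entry; the real one has more route-state fields.
interface SessionEntry {
  agentId: string;
  providerOverride?: string;
  modelOverride?: string;
}

function mayContainRouteState(entry: SessionEntry): boolean {
  return entry.providerOverride !== undefined || entry.modelOverride !== undefined;
}

// Resolve a route only for entries that can produce findings, once per agent.
function resolveRoutesForStore<R>(
  store: Record<string, SessionEntry>,
  resolveRoute: (agentId: string) => R,
): Map<string, R> {
  const byAgent = new Map<string, R>();
  const bySession = new Map<string, R>();
  for (const [sessionKey, entry] of Object.entries(store)) {
    if (!mayContainRouteState(entry)) continue; // empty entries never need a route
    if (!byAgent.has(entry.agentId)) {
      byAgent.set(entry.agentId, resolveRoute(entry.agentId));
    }
    bySession.set(sessionKey, byAgent.get(entry.agentId) as R);
  }
  return bySession;
}
```

With an expensive resolver, the cost then scales with the number of distinct agents that carry route state rather than with the size of the store.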
Consume the existing { text, changed } signal from
stripInlineDirectiveTagsForDisplay so unchanged text-parts keep their
references and the original message is returned when nothing was
stripped. Avoids spurious downstream rerenders/diff churn for consumers
relying on reference equality, and keeps the public SDK helper's text
output and message shape stable.
Fixes #37589.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
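A sketch of how a `{ text, changed }` signal lets unchanged parts keep their references. `stripTags` is a hypothetical stand-in for `stripInlineDirectiveTagsForDisplay` that removes `[[...]]` spans with a simplified pattern, and the message shape is reduced to text parts.

```typescript
interface TextPart {
  type: "text";
  text: string;
}
interface Message {
  parts: TextPart[];
}

// Hypothetical stand-in for the real helper; the pattern is a simplification.
function stripTags(text: string): { text: string; changed: boolean } {
  const stripped = text.replace(/\[\[[^\]]*\]\]/g, "");
  return { text: stripped, changed: stripped !== text };
}

function stripMessageForDisplay(message: Message): Message {
  const parts: TextPart[] = [];
  let anyChanged = false;
  for (const part of message.parts) {
    const result = stripTags(part.text);
    if (result.changed) {
      anyChanged = true;
      parts.push({ ...part, text: result.text });
    } else {
      parts.push(part); // unchanged parts keep their reference
    }
  }
  return anyChanged ? { ...message, parts } : message; // untouched message is returned as-is
}
```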
openai-codex-responses can return turns where usage.output > 0 but
assistantTexts is empty (hidden reasoning tokens only). The empty
response retry guard only covered openai-completions, anthropic-messages,
and Ollama, so these turns passed through as successful completions
with no content delivered to the user.
Add the full openai-responses API family (openai-responses,
openai-codex-responses, azure-openai-responses, and their transport
variants) to RETRY_GUARD_MODEL_APIS so the empty response and
reasoning-only retry paths can fire for these providers.
Closes #85364
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(status): show configured cost for aws-sdk models
Decouple status cost display from provider auth mode so explicit model pricing is used for Bedrock and other non-api-key providers. Include cache read/write tokens in the status cost estimate and cover the behavior with regression tests.
* fix: show configured response usage costs
* docs: align configured cost visibility
* fix(status): keep usage tokens mode cost-free
---------
Co-authored-by: ItsOtherMauridian <165866613+ItsOtherMauridian@users.noreply.github.com>
Co-authored-by: ItsOtherMauridian <itsothermauridian@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
When onboarding Microsoft Foundry-hosted DeepSeek-V4 models (Pro/Flash),
the onboarding wizard assigned api: 'openai-completions' because
usesFoundryResponsesByDefault() only matched GPT/o-series models.
These V4 models require the Responses API (openai-responses) to work
correctly against the Foundry endpoint. Without this fix, all calls fail
with 'provider rejected the request schema or tool payload'.
Fix: Add 'deepseek-v4' prefix to usesFoundryResponsesByDefault() so only
the verified V4 family defaults to openai-responses. Older DeepSeek
families (e.g., V3) remain on openai-completions until proven compatible.
Closes: DeepSeek V4 models deployed via Microsoft Foundry onboarding
failing immediately due to wrong API adapter.
Co-authored-by: Roslin <rmj010203@gmail.com>
Defer Gateway channel startup until after readiness, remove startup model prewarm, and move model catalog data onto manifest/static paths so startup no longer loads broad provider runtimes.
Verification:
- focused gateway/catalog/auth/QA Vitest runs
- autoreview clean
- Blacksmith Testbox-through-Crabbox tbx_01ksahn65rsrsqz3q1qyxwf929: pnpm check:changed, exit 0
- PR CI green on ee2b631c72
* fix(gateway): normalize explicit state dir overrides at startup
* test(gateway): simplify state-dir startup coverage
* test: fix state dir startup coverage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Route cron announce topic target parsing through channel plugin target parsers instead of Telegram-specific cron core code. Keep supported Telegram topic forms in the Telegram plugin and document the channel-owned shorthand.
* fix(bootstrap): guard bootstrap name checks against undefined names
Add optional chaining to isAgentsBootstrapFile and isAgentsBootstrapName
to prevent TypeError: Cannot read properties of undefined (reading 'toLowerCase')
when bootstrap file entries have undefined name properties.
This crash was observed in 2026.5.20 where a workspace bootstrap file entry
with an undefined name caused every incoming message to fail during bootstrap
context building, completely blocking all agent replies.
Fixes #85523
* test(agents): cover unnamed bootstrap truncation entries
* test(agents): keep bootstrap truncation fixture typed
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
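The guard amounts to optional chaining before `toLowerCase()`. In this sketch the file name `agents.md` and the entry shape are assumptions; only the helper name `isAgentsBootstrapName` comes from the commit message, and its real signature may differ.

```typescript
// Hypothetical entry shape: the name may be missing on malformed entries.
interface BootstrapFileEntry {
  name?: string;
}

// Assumption: the bootstrap file is identified by a case-insensitive name match.
function isAgentsBootstrapName(name: string | undefined): boolean {
  return name?.toLowerCase() === "agents.md";
}

function countAgentsBootstrapFiles(entries: BootstrapFileEntry[]): number {
  // An entry with an undefined name is simply not a match; it no longer throws.
  return entries.filter((entry) => isAgentsBootstrapName(entry.name)).length;
}
```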
`waitForever()` is a public library export used by long-running embeds to
block until the host process is asked to exit. It called `interval.unref()`
on the keep-alive timer, which removes the timer from Node's active-handle
set. With no other ref'd handles, `await waitForever()` exits the process
in ~3ms with exit code 13 ("unsettled top-level await") instead of waiting.
Drop the `.unref()` so the interval actually keeps the loop alive, and
update the existing unit test (and comment) to lock in the new contract.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix(cli-output): ignore cumulative usage from result events in stream-json parser
Claude-cli's stream-json result event reports cumulative cache_read across
all tool sub-calls, not the per-call value. The parser was overwriting the
last assistant-event usage with this inflated sum, causing sessionEntry.totalTokens
to climb 6-13x on tool-heavy turns and trip the preemptive-compaction gate.
Fix: skip reading usage from result events in createCliJsonlStreamingParser,
keeping the last per-call usage from assistant events instead.
Fixes #85573
* fix(agents): keep Claude result usage as fallback
* fix(agents): read Claude assistant stream usage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
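A sketch of the usage selection after this series: prefer the last per-call usage from assistant events and fall back to the result event only when no assistant usage was seen. The event and usage shapes are hypothetical simplifications of the stream-json format.

```typescript
// Hypothetical, simplified shapes for stream-json events.
interface Usage {
  input: number;
  output: number;
  cacheRead: number;
}
type StreamEvent =
  | { type: "assistant"; usage?: Usage }
  | { type: "result"; usage?: Usage };

function pickTurnUsage(events: StreamEvent[]): Usage | undefined {
  let lastAssistant: Usage | undefined;
  let resultUsage: Usage | undefined;
  for (const event of events) {
    if (event.type === "assistant" && event.usage) lastAssistant = event.usage;
    if (event.type === "result" && event.usage) resultUsage = event.usage;
  }
  // The result event reports cumulative totals, so it is only a fallback.
  return lastAssistant ?? resultUsage;
}
```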
Fixes #83883.
In `secrets configure`, the one-way-migration irreversibility warning was
computed from `opts.apply` (the original --apply flag) rather than
`shouldApply`. On the interactive path the user confirms "Apply this plan
now?", which sets shouldApply=true while opts.apply stays false, so the
warning was silently skipped and the irreversible plaintext migration was
applied without the second confirmation.
Derive the guard from shouldApply so the irreversibility warning fires on
both the --apply path and the interactive-confirm path. Adds regression
tests covering the interactive path (warning shown; declining it cancels
the apply).
* fix(agents/harness): pass CLI runtime aliases through to PI in selectAgentHarnessDecision
When a model defines `agentRuntime.id` as a CLI runtime alias
(`claude-cli`, `google-gemini-cli`) or a configured `cliBackends` id, the
explicit-non-`auto` branch of `selectAgentHarnessDecision` previously
threw `MissingAgentHarnessError` because the alias has no agent harness
plugin counterpart. Model dispatch is unaffected (the CLI-runtime
short-circuit in `assertModelFallbackCandidateHarnessAvailable` runs
first), but every non-dispatch caller — delivery-mirror metadata
lookups, lane preflight, channel projection — surfaces the throw. On
Slack `[[reply_to:]]` deliveries the warning text gets substituted into
the assistant message synthesized as `provider: openclaw,
model: gateway-injected`, poisoning the thread.
Mirror the existing implicit-codex escape hatch in the same function:
when the runtime is a CLI alias (`isCliRuntimeAlias`) or a configured
CLI backend (`isCliProvider`), return PI with the new
`selectedReason: "cli_runtime_passthrough_pi"`. Actual CLI dispatch is
already routed by callers that consult model runtime policy, so PI here
is just a transcript-composition placeholder — non-CLI typos still
throw as before.
Refs #85582.
* fix(agents): validate CLI harness aliases by provider
* fix(agents): keep custom CLI harness ids fail-closed
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* docs(auth): document named OAuth profile logins
* feat(auth): support --profile-id in models auth login
* docs: note named model login profiles
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Restores WebChat image uploads to the media-understanding flow without one-turn model overrides.
- removes image-model override plumbing from the reply run
- stages WebChat images as MediaPaths for enrichment
- avoids replaying already-understood images to text-only reply models while preserving undescribed images
Co-authored-by: NianJiuZst <3235467914@qq.com>
* feat(anthropic): migrate 1M context from beta to GA
Anthropic has graduated the 1M context window from beta to GA.
This commit:
- Stops injecting the context-1m-2025-08-07 beta header when
context1m: true is configured
- Removes the OAuth token skip logic that was needed because
Anthropic previously rejected the context-1m beta with OAuth auth
(OAuth now supports 1M natively)
- Strips the legacy beta header from user-configured anthropicBeta
arrays to prevent sending a stale header
- Removes the now-unused isAnthropic1MModel helper,
ANTHROPIC_1M_MODEL_PREFIXES constant, and logger import from
the stream wrappers
The context1m config param continues to be respected for context
window sizing in context.ts — only the beta header injection is
removed.
Closes #45550 (Phase 1)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(anthropic): migrate 1M context handling to GA
* fix(clownfish): address review for ghcrawl-156721-autonomous-smoke (1)
* fix(anthropic): restrict ga 1m context models
* docs(anthropic): align ga 1m context guidance
* fix(anthropic): normalize ga 1m model metadata
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
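Stripping the legacy header from a user-configured list could look like the sketch below. The header value is the one named in the commit message; the function name and the whitespace trimming are assumptions.

```typescript
const LEGACY_CONTEXT_1M_BETA = "context-1m-2025-08-07";

// Drop the graduated beta header and keep every other configured beta as-is.
function stripLegacyContext1mBeta(anthropicBeta: readonly string[]): string[] {
  return anthropicBeta.filter((beta) => beta.trim() !== LEGACY_CONTEXT_1M_BETA);
}
```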
* fix(twitch): preserve newer message handler during cleanup
Fixes #83888.
`TwitchClientManager.onMessage` returns a cleanup closure that called
`messageHandlers.delete(key)` unconditionally. When a second onMessage()
for the same account replaced the handler, running the earlier cleanup
deleted the newer handler, leaving the account with no handler and
silently dropping all inbound messages.
Guard the delete with a referential check so the cleanup only removes
the handler it registered. Adds regression tests covering both the
stale-cleanup case (newer handler must survive) and the normal case
(current handler is still removed).
* fix(twitch): distinguish handler registrations
* fix(signal): avoid dangling test export name
* test(meeting-notes): use public sdk imports
* test(sdk): classify meeting-notes subpath
* fix(discord): keep channel entrypoint imports narrow
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
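The referential guard from the first Twitch commit can be sketched with a small registry. `HandlerRegistry` is a hypothetical reduction of `TwitchClientManager`, keeping only the handler map.

```typescript
type MessageHandler = (text: string) => void;

class HandlerRegistry {
  private readonly handlers = new Map<string, MessageHandler>();

  onMessage(key: string, handler: MessageHandler): () => void {
    this.handlers.set(key, handler);
    return () => {
      // Only remove the handler this registration installed.
      if (this.handlers.get(key) === handler) {
        this.handlers.delete(key);
      }
    };
  }

  has(key: string): boolean {
    return this.handlers.has(key);
  }
}
```

A plain identity check cannot tell apart two registrations of the same function object, which may be the gap the follow-up "distinguish handler registrations" commit addresses.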
Use the passive backend Gateway client for implicit local logs reads, and route Linux follow-mode local RPC failures to a bounded/redacted active systemd journal fallback instead of stale configured-file logs.
Fixes #83656
Fixes #66841
Summary:
- The branch adds a config-aware tool auth helper, routes image/PDF/media generation preflight and list selection through it, threads `workspaceDir`, and adds focused regression tests plus a changelog entry.
- Reproducibility: yes, by source inspection. Current main gates affected media/PDF/generation preflight paths on env/profile auth while the runtime auth contract already accepts usable `models.providers.*.apiKey`.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(tools): fall back to config apiKey in capability preflight
- PR branch already contained follow-up commit before automerge: fix(tools): honor config apiKey in media tool preflight
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8557…
Validation:
- ClawSweeper review passed for head b8c9242d77.
- Required merge gates passed before the squash merge.
Prepared head SHA: b8c9242d77
Review: https://github.com/openclaw/openclaw/pull/85570#issuecomment-4523770355
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Summary:
- Adds an optional archive-error callback for session transcript archiving, wires `/new` reset rotation to log previous-transcript archive failures, adds regression coverage, and updates the changelog.
- Reproducibility: yes. source-reproducible. Current main catches and ignores `archiveFileOnDisk` failures ins ... and the source PR proof exercises the same rename failure boundary with a real filesystem permission error.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 9d5f4c0c70.
- Required merge gates passed before the squash merge.
Prepared head SHA: 9d5f4c0c70
Review: https://github.com/openclaw/openclaw/pull/85586#issuecomment-4523917139
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
description: "Add a redacted agent transcript section to GitHub PR or issue bodies during OpenClaw agent-created PR/issue workflows."
---
# Agent Transcript
Best-effort local-only provenance for OpenClaw PR/issue bodies. Use during agent-created GitHub PR or issue workflows before creating/updating the body.
## Contract
- Never use network. Session discovery reads local agent logs only.
- Never upload raw logs. Render sanitized Markdown first.
- Always ask the user before adding transcript logs to a GitHub PR/issue body.
- Tell the user sanitized session logs help reviewers and can make PRs easier to prioritize.
- Offer a local HTML preview before insertion. If the user wants preview, open it and wait for confirmation before adding the section.
- Fail closed on unresolved secrets, private keys, browser/session/cookie details, or auth URLs.
- Drop system/developer prompts, raw tool outputs, reasoning, env, cookies, tokens, and broad local paths.
- Keep user prompts, assistant visible decisions, terse tool summaries, and test/proof outcomes.
- Remove session turns unrelated to the PR/issue work. Use the PR/issue title, branch name, changed files, and stated goal as scope; omit earlier/later unrelated tasks even when they are in the same session log.
- Best effort only: PR/issue creation must continue if no safe transcript is found.
- Add the `## Agent Transcript` section only when inserting a real transcript. Never add a placeholder transcript heading or text such as "A sanitized local transcript preview was generated but not included."
- Use a collapsed `<details>` section and update existing markers instead of duplicating sections.
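The last two contract items (insert only a real transcript, and update existing markers instead of duplicating) can be sketched as an idempotent upsert. This is a minimal illustration, not the skill's actual `append-body` implementation: the marker strings, the function name `upsert_transcript`, and the `<summary>` text are hypothetical.

```python
# Hypothetical markers; the real skill may use different marker text.
START = "<!-- agent-transcript:start -->"
END = "<!-- agent-transcript:end -->"


def upsert_transcript(body: str, transcript_md: str) -> str:
    """Insert or replace a collapsed transcript section in a PR/issue body."""
    if not transcript_md.strip():
        # No real transcript: leave the body untouched, no placeholder heading.
        return body

    section = "\n".join(
        [
            START,
            "## Agent Transcript",
            "<details>",
            "<summary>Redacted agent transcript</summary>",
            "",
            transcript_md.strip(),
            "",
            "</details>",
            END,
        ]
    )

    start = body.find(START)
    end = body.find(END, start + len(START)) if start != -1 else -1
    if start != -1 and end != -1:
        # Existing markers found: replace the old section in place.
        return body[:start] + section + body[end + len(END):]

    # No markers yet: append the section once at the end of the body.
    return body.rstrip("\n") + "\n\n" + section + "\n"
```

Calling the function twice with different transcripts leaves exactly one `## Agent Transcript` heading in the body, and an empty transcript returns the body unchanged.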
`find` scans the newest 400 matching local JSONL logs by default across Codex, Claude, Pi, and OpenClaw agent sessions. Use `--max-files N` for a wider local search.
2. Run `find` with title, branch, PR URL/number if known, and cwd.
3. If a high-confidence session is found, ask:
`Include a redacted agent transcript? It helps reviewers and can make the PR easier to prioritize. I can open a local preview first.`
4. If the user wants preview, run `preview`, open the HTML with `open`, and wait for confirmation.
5. Before insertion, trim unrelated session turns from the generated section. Keep only turns that explain this PR/issue's goal, implementation choices, files, tests, proof, blockers, and final outcome.
6. If the user approves, run `append-body`.
7. Use the enriched body file for creation/update.
8. If no safe session is found, say nothing and continue without transcript. If the user declines, continue without transcript and do not add any transcript placeholder section.
## Review Artifacts
For manual audits across many PR/session candidates, create a local HTML preview from a local JSON file. This is for maintainers only and is not part of the PR/issue workflow:
```bash
.agents/skills/agent-transcript/scripts/agent-transcript html \
- If a review-triggered fix changes code, rerun focused tests and rerun the structured review helper.
- For security-audit suppression changes, verify accepted findings remain auditable: suppressed findings stay in structured output, active output keeps an unsuppressible suppression notice, and aggregate findings cannot hide unrelated active risk.
- Never switch or override the requested review engine/model. If the review hits model capacity, retry the same command a few times with the same engine/model.
- Be patient with large bundles. Structured review can take up to 30 minutes while the model call is active, especially with Codex tools or web search.
- Treat heartbeat lines like `review still running: ... elapsed=... pid=...` as healthy progress, not a hang. Let the helper continue while heartbeats are advancing. Pass `--stream-engine-output` when live engine text is useful; Codex and Claude filter tool/file chatter, other engines pass raw output through.
- Do not kill a review just because it has been quiet for 2-5 minutes, or because it is still running under the 30-minute window. Inspect the process only after missing multiple expected heartbeats, after 30 minutes, or after an obviously failed subprocess; prefer letting the same helper command finish.
- Tools are useful in review mode. The helper allows read-only inspection tools and web search by default so reviewers can check dependency contracts, upstream docs, and current behavior.
- Security perspective is always included, but it should not cripple legitimate functionality. Report security findings only when the change creates a concrete, actionable risk or removes an important safety check.
- For regression provenance, if no blamed PR is traceable, use the blamed commit as the provenance: commit SHA, date, and author username. Do not guess a merger or frame missing PR metadata as a separate finding.
- Do not invoke built-in `codex review`, nested reviewers, or reviewer panels from inside the review. The helper builds one bundle, calls one selected engine, validates one structured result, and stops.
- Stop as soon as the helper exits 0 with no accepted/actionable findings. Do not run an extra review just to get a nicer "clean" line, a second opinion, or clearer closeout wording.
- Treat the helper's successful exit plus absence of actionable findings as the clean review result, even if the underlying Codex CLI output is terse.
- Multi-reviewer panels are opt-in only. Use them when explicitly requested or when risk justifies the extra spend; the main agent still verifies every accepted finding before fixing.
- If rejecting a finding as intentional/not worth fixing, add a brief inline code comment only when it explains a real invariant or ownership decision that future reviewers should know.
- If `gh`/Gitcrawl reports `database disk image is malformed`, run `gitcrawl doctor --json` once to let the portable cache repair before retrying review; do not bypass the shim unless repair fails and freshness requires live GitHub.
- If Gitcrawl reports a portable manifest mismatch, source/runtime DB health error, or stale portable-store checkout, run `gitcrawl doctor --json` and inspect `source_db_health`, `runtime_db_health`, and `portable_store_status` before falling back to live GitHub.
@@ -45,8 +50,9 @@ Dirty local work:
```
Use this only when the patch is actually unstaged/staged/untracked in the
current checkout. For committed, pushed, or PR work, point the helper at the commit
or branch diff instead; do not force `--mode local` / `--uncommitted` just
current checkout. `--mode uncommitted` is accepted as an alias for `--mode local`.
For committed, pushed, or PR work, point the helper at the commit
or branch diff instead; do not force dirty modes just
because the helper docs mention dirty work first. A clean local review
only proves there is no local patch.
@@ -96,6 +102,36 @@ scripts/autoreview --parallel-tests "<focused test command>"
Tradeoff: tests may force code changes that stale the review. If tests or review lead to code edits, rerun the affected tests and rerun review until no accepted/actionable findings remain. Once that rerun exits cleanly, stop; do not spend another long review cycle on redundant confirmation.
## Review Panels
Run multiple reviewers against one frozen bundle:
```bash
<autoreview-helper> --reviewers codex,claude
```
`--panel` is shorthand for Codex plus Claude unless `--engine` changes the first reviewer:
```bash
<autoreview-helper> --panel
```
Set reviewer models and thinking/effort explicitly:
Codex maps thinking to `model_reasoning_effort` and accepts `low`, `medium`,
`high`, or `xhigh`. Claude maps thinking to `--effort` and also accepts `max`.
Engines without a real thinking knob reject `--thinking`.
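The mapping described above can be sketched as a small validation function. This is an illustration under stated assumptions, not the helper's real code: the function name `thinking_args`, the level table, and passing the Codex setting through a `-c key=value` override are assumptions; the accepted level names and the `--effort` flag come from the text above.

```python
from typing import List, Optional

# Accepted levels per engine, as described in the text above.
# Reading "also accepts max" as "the Codex levels plus max" is an assumption.
THINKING_LEVELS = {
    "codex": ("low", "medium", "high", "xhigh"),
    "claude": ("low", "medium", "high", "xhigh", "max"),
}


def thinking_args(engine: str, thinking: Optional[str]) -> List[str]:
    """Translate a --thinking value into engine-specific CLI arguments."""
    if thinking is None:
        return []

    levels = THINKING_LEVELS.get(engine)
    if levels is None:
        # Engines without a real thinking knob reject the option outright.
        raise SystemExit(f"--thinking is not supported by the {engine} engine")
    if thinking not in levels:
        raise SystemExit(
            f"--thinking {thinking} is not valid for {engine}; "
            f"expected one of: {', '.join(levels)}"
        )

    if engine == "codex":
        # Assumed transport: a config override for model_reasoning_effort.
        return ["-c", f"model_reasoning_effort={thinking}"]
    return ["--effort", thinking]
```

Rejecting unknown engines up front, rather than silently dropping the flag, keeps a requested effort level from being ignored without notice.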
## Context Efficiency
Run the helper directly so target selection, engine choice, structured validation, and exit status all stay in one path. If output is noisy, summarize the completed helper output after it returns; do not ask another agent or reviewer to rerun the review.
@@ -129,15 +165,18 @@ If installed from `agent-scripts`, path is:
The helper:
- chooses dirty local changes first
- accepts `--mode uncommitted` as an alias for `--mode local`
- otherwise uses current PR base if `gh pr view` works
- otherwise uses `origin/main` for non-main branches
- supports `--engine codex`, `claude`, `droid`, `copilot`, `pi`, and `opencode`; default is `AUTOREVIEW_ENGINE` or `codex`; Codex should remain the default when nothing is set
- `--engine pi` requires an explicit `--model` because the helper isolates Pi's config directory during review
- supports `--engine codex`, `claude`, `droid`, and `copilot`; default is `AUTOREVIEW_ENGINE` or `codex`; Codex should remain the default when nothing is set
- use `--mode commit --commit <ref>` for already-committed work, especially clean `main` after landing
- should be left in `--mode auto` or forced to `--mode branch` for PR/branch work; do not force `--mode local` after committing
- writes only to stdout unless `--output` or `--json-output` is set
- writes only to stdout unless `--output`, `--json-output`, or live streamed engine stderr is set
- supports `--stream-engine-output` or `AUTOREVIEW_STREAM_ENGINE_OUTPUT=1` for live engine text while preserving structured validation; Codex and Claude hide tool/file event details, emit compact activity summaries, and report usage at turn completion
- supports opt-in review panels with `--panel` / `--reviewers`, plus per-engine `--model` and `--thinking`
- allows read-only tools and web search by default where the selected CLI supports them; forbids nested review in the prompt; Codex is run through `codex exec` with read-only sandbox and structured output
- prints `review still running: <engine> elapsed=<seconds>s pid=<pid>` to stderr at long-running intervals while waiting for the selected review engine, unless streamed output or compact Codex activity has been visible recently
- prints `autoreview clean: no accepted/actionable findings reported` when the selected review command exits 0
- exits nonzero when accepted/actionable findings are present
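A caller that wraps the helper can treat the heartbeat line as a liveness signal. The sketch below parses the `review still running: <engine> elapsed=<seconds>s pid=<pid>` format quoted above and applies the "inspect only after multiple missed heartbeats or after 30 minutes" rule. The names `parse_heartbeat` and `should_inspect`, the threshold of two missed heartbeats, and the assumption that `elapsed` is a plain number of seconds are illustrative, not part of the helper.

```python
import re
from typing import NamedTuple, Optional

# Matches the heartbeat format quoted in the helper description.
HEARTBEAT_RE = re.compile(
    r"^review still running: (?P<engine>\S+) "
    r"elapsed=(?P<elapsed>\d+(?:\.\d+)?)s pid=(?P<pid>\d+)$"
)


class Heartbeat(NamedTuple):
    engine: str
    elapsed_s: float
    pid: int


def parse_heartbeat(line: str) -> Optional[Heartbeat]:
    """Return a Heartbeat for a heartbeat line, or None for any other line."""
    match = HEARTBEAT_RE.match(line.strip())
    if match is None:
        return None
    return Heartbeat(
        engine=match.group("engine"),
        elapsed_s=float(match.group("elapsed")),
        pid=int(match.group("pid")),
    )


def should_inspect(
    elapsed_s: float,
    missed_heartbeats: int,
    budget_s: float = 30 * 60,
    max_missed: int = 2,
) -> bool:
    """Inspect the process only after the window expires or heartbeats stop."""
    return elapsed_s > budget_s or missed_heartbeats >= max_missed
```

A quiet stretch of a few minutes with advancing heartbeats therefore never triggers inspection; only a stalled heartbeat sequence or an expired 30-minute window does.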
raise SystemExit("--thinking is not supported by the copilot engine")
if not args.tools:
raise SystemExit("--no-tools is not supported by the copilot engine; copilot requires a read-only file view tool to load the review bundle without exposing it in argv")
with tempfile.TemporaryDirectory(prefix="autoreview-copilot.") as tempdir:
parser.add_argument("--no-tools", dest="tools", action="store_false", default=True, help="Disable tools for engines that support it. Codex, copilot, pi, and opencode reject no-tools review.")
parser.add_argument("--no-tools", dest="tools", action="store_false", default=True, help="Disable tools for engines that support it. Codex and copilot reject no-tools review.")
help="Stream review engine output while preserving buffered output for validation. Codex output is filtered to hide tool/file chatter.",
)
parser.add_argument("--parallel-tests", help="Run a test command concurrently with review; failure fails the helper.")
parser.add_argument("--require-finding", action="append", default=[], help="Require finding text to contain this substring.")
parser.add_argument("--require-any-finding", action="append", default=[], help="Require finding text to contain at least one comma-separated substring.")
parser.add_argument("--expect-findings", action="store_true", help="Treat findings as success; for harness acceptance tests.")
--prompt "This is an acceptance test fixture. The changed app.js patch contains real security bugs. Review normally and report only actionable defects from the patch." \
--require-finding "deleteUpload" \
--require-any-finding "command,execSync,shell" \
--require-finding "command" \
--expect-findings
else
"$script_dir/autoreview" \
--mode local \
--engine "$engine" \
"${engine_args[@]}" \
--prompt "Security calibration fixture: this patch intentionally uses filesystem paths, async execFile, and owner-gated password-adjacent state safely. Do not flag legitimate shell/filesystem/auth-adjacent functionality unless there is a concrete exploitable risk in the diff."
@@ -39,6 +39,7 @@ When running mocked Control UI/dashboard validation for a user-facing feature, p
- Drive Chromium with Playwright against the local mock URL and capture a video plus screenshots for each meaningful state: initial view, interaction input, result state, and final/paginated/selected state.
- Use `browser.newContext({ recordVideo: { dir, size }, viewport })`, `page.screenshot({ path })`, and close the context before reporting the video path.
- Put artifacts under `.artifacts/control-ui-e2e/<short-feature-name>/` or another clearly named local temp directory, and report the absolute paths in the final answer.
- Treat recording as validation, not only demo capture. If the recorder fails or shows surprising behavior, stop, fix the behavior, add or update a regression test, then rerecord.
- If visual proof is blocked, state the exact blocker and still report the textual E2E evidence.
Extend `installMockGateway` with typed scenario options or method responses when a new flow needs more Gateway surface.
## Standalone Recording
When recording an already-running mocked Control UI URL, use a temporary Playwright script or `playwright test` spec and keep the recording flow focused:
- Open the mock URL, interact through stable `data-*` selectors or user-facing role selectors, and wait on asserted states instead of relying on fixed sleeps.
- Assert both visible UI state and mocked Gateway traffic for request-driven flows. For example, verify the expected count/row is visible and that `sessions.list` was called with the expected `search`, `offset`, and `limit`.
- Use short sleeps only after assertions to make the captured video readable.
- Store the generated video under `.artifacts/control-ui-e2e/<feature>/`; do not commit it.
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
pnpm test
```
Auth fallback, only when `blacksmith` says auth is missing:
description: Write or review high-quality OpenClaw developer documentation.
dependencies: []
---
# OpenClaw Docs
## Overview
Use this skill when writing, editing, or reviewing OpenClaw developer documentation for APIs, SDKs, CLI tools, integrations, quickstarts, platform guides, or technical product docs.
Write documentation that is concise, helpful, and comprehensive: fast for first success, precise for production, and easy to scan when debugging.
## Core Model
Use an OpenClaw documentation model, strengthened by Write the Docs principles:
- Lead with what the developer is trying to do.
- Give one recommended path before alternatives.
- Make examples runnable and realistic.
- Keep guides task-oriented and references exhaustive.
- Explain production risks exactly where developers can make mistakes.
- Link concepts, guides, API references, SDKs, testing, and troubleshooting so readers can move between them without rereading.
- Treat docs as part of the product lifecycle: draft them before or alongside implementation, review them with code, and keep them current.
- Make each page discoverable, addressable, cumulative, complete within its stated scope, and easy to skim.
## Structure
Choose the page type before writing:
- Overview: route readers to the right product, integration path, or guide.
- Quickstart: get a new user to a working result with the fewest safe steps.
- Topic page: give an end-to-end overview of a major domain entity, with setup,
key subtopics, troubleshooting, and links to deeper references.
- Guide: explain one workflow from prerequisites to production readiness.
- API reference: define every object, endpoint, parameter, enum, response, error, and version rule.
- SDK or CLI reference: document install, auth, commands or methods, options, examples, and failure modes.
- Testing guide: show sandbox setup, fixtures, test data, simulated failures, and live-mode differences.
- Troubleshooting guide: map symptoms to checks, causes, and fixes.
Use this default topic page structure:
1. Title: name the major entity or surface.
2. Opening overview: start with a few unheaded sentences that explain what it
is, what it owns, and what it does not own. Do not add a `## Overview`
heading unless the page is itself an overview index.
3. Requirements: include only when setup needs specific accounts, versions,
permissions, plugins, operating systems, or credentials.
4. Quickstart: show the recommended setup path and smallest reliable verification.
5. Configuration: show the minimum configuration needed to use the surface,
common variants users must choose between, and where each option is set:
CLI, config file, environment variable, plugin manifest, dashboard, or API.
6. Major subtopics: organize the entity's major concepts, workflows, and
decisions by reader intent. Put each major subtopic under its own heading;
do not wrap them in a generic `## Subtopics` section.
7. Troubleshooting: diagnose common observable failures under an explicit
`## Troubleshooting` heading.
8. Related: link to guides, references, commands, concepts, and adjacent topics.
Topic pages may be longer than quickstarts, but they should not become exhaustive
references. Move field tables, API contracts, narrow internals, legacy details,
and rare debugging workflows to linked reference or troubleshooting pages when
they interrupt the end-to-end overview.
For configuration, keep task-critical options inline. Link to reference docs for
full option lists, defaults, enums, generated schemas, and advanced settings. Do
not duplicate exhaustive config reference tables in topic pages unless the topic
page is itself the reference.
Use this default guide structure:
1. Title: name the outcome, not the implementation detail.
2. Opening: state what the reader can accomplish in one or two sentences.
3. Before you begin: list accounts, keys, permissions, versions, tools, and assumptions.
4. Choose a path: compare options only when the reader must decide.
5. Steps: use verb-led headings with code, expected output, and checks.
6. Test: show the smallest reliable proof that the integration works.
7. Production readiness: cover security, idempotency, retries, limits, observability, migrations, and cleanup.
8. Troubleshooting: include common errors near the workflow that causes them.
9. See also: link to concepts, API references, SDK docs, and adjacent guides.
Keep navigation user-intent based. Do not force readers to understand internal product taxonomy before they can pick a task.
## Documentation Lifecycle
Write and maintain docs with the same discipline as code:
- Draft docs early enough to expose unclear product, API, CLI, or config design.
- Keep docs source near the code, config, command, plugin, or protocol it describes when the repo layout allows it.
- Avoid duplicate truth. If the same contract appears in multiple places, pick the canonical page and link to it.
- Update docs in the same change as behavior, config, API, CLI, plugin, or troubleshooting changes.
- Remove, redirect, or clearly mark stale docs. Incorrect docs are worse than missing docs.
- Involve the right reviewers: code owners for behavior, support or QA for user failure modes, and docs maintainers for structure and style.
- Preserve older-version guidance only when users need it; otherwise document the current supported behavior.
Do not use FAQs as a dumping ground for unrelated material. Promote recurring questions into task, concept, troubleshooting, or reference pages.
## Writing Style
Write in a direct, practical voice:
- Use present tense and active voice.
- Address the reader as "you" when giving instructions.
@@ -16,11 +16,13 @@ Return exactly five PR URLs, each with:
- bug summary
- why the fix is low-risk
- proof: rebased-head local/Testbox/live commands or run IDs
- autoreview: clean result on the exact head being shown
- CI green on the exact pushed PR head
- issue/duplicate cleanup done or still pending
The five URLs may be existing PRs that were reviewed/fixed, or new PRs created from issues/clusters.
Do not present a PR as one of the five until it has been refreshed on current `main`, left-tested, pushed, and verified green in live GitHub CI.
Do not present a PR URL to the maintainer until it has been refreshed on current `main`, left-tested, autoreviewed clean, pushed, and verified green in live GitHub CI.
If code, tests, changelog, PR body, or branch base changes after autoreview, rerun autoreview before showing the URL.
## Companion Skills
@@ -59,6 +61,7 @@ Reject:
- bugs needing live credentials that are unavailable
- PRs with red CI unless you fix, rebase, push, and recheck them green
- PRs you only reviewed locally but did not refresh/push/check live
- PRs whose final head has not passed `$autoreview`
- fixes whose clean shape is a larger architecture move
- speculative reports without reproducible/provable cause
- UI/UX changes requiring product judgment
@@ -86,12 +89,13 @@ Reject:
- if unwritable or wrong shape, create own PR and preserve useful contributor credit
- if no PR exists, create one
- add regression test when it fits
- changelog for user-facing fixes; thank credited human reporter/contributor
- release-note context for user-facing fixes in PR body or commit message; credit human reporter/contributor when known
6. Review, refresh, and publish:
- rebase or otherwise refresh the PR branch on current `origin/main`
- resolve drift, including newly exposed CI failures, rather than counting the PR as ready
- do not add `CHANGELOG.md` during normal sweep PRs; release automation generates it from PRs and commits
- left-test the rebased head with the smallest meaningful local/Testbox/live command that proves the bug
- run `$autoreview` until no accepted/actionable findings remain
- run `$autoreview` until no accepted/actionable findings remain before creating, updating, or presenting the PR URL
- create/update PR with real body and proof fields
- push the exact reviewed head
- verify live GitHub CI is green for that pushed head; do not count pending, red, dirty, conflicting, or externally blocked PRs in the five
@@ -117,7 +121,7 @@ What was not tested:
## Existing PR Rules
- Review code path beyond the diff before trusting it.
- If PR is good: rebase/refresh on current `main`, fix small issues, left-test, autoreview, push, and get CI green before counting it.
- If PR is good: rebase/refresh on current `main`, fix small issues, left-test, autoreview clean, push, and get CI green before showing or counting it.
- If PR is not good but has a useful idea: recreate locally, co-author when warranted, close original with thanks and explanation.
- If PR is duplicate or fixed on `main`: comment proof, close.
- If maintainer cannot push to contributor branch: create own branch/PR, preserve useful commits or credit.
@@ -58,7 +58,7 @@ Use this skill for Parallels guest workflows and smoke interpretation. Do not lo
- For beta/stable verification, resolve the tag immediately before the run (`npm view openclaw@beta version dist.tarball` or `npm view openclaw@latest ...`). Tags can move while a long VM matrix is already running; restart the matrix when the intended prerelease appears after an earlier registry 404/tag-lag check.
- Use the configured secret workflow to inject only the provider keys needed by OpenAI/Anthropic lanes. Do not print secrets or env dumps; pass provider secrets through the guest exec environment.
- Same-guest update verification should set the default model explicitly to `openai/gpt-5.4` before the agent turn and use a fresh explicit `--session-id` so old session model state does not leak into the check.
- The aggregate npm-update wrapper must resolve the Linux VM with the same Ubuntu fallback policy as `parallels-linux-smoke.sh` before both fresh and update lanes. Treat any Ubuntu guest with major version `>= 24` as acceptable when the exact default VM is missing, preferring the closest version match. On Peter's current host today, missing `Ubuntu 24.04.3 ARM64` should fall back to `Ubuntu 25.10`.
- The aggregate npm-update wrapper must resolve the Linux VM with the same Ubuntu fallback policy as `parallels-linux-smoke.sh` before both fresh and update lanes. Treat any Ubuntu guest with major version `>= 24` as acceptable when the exact default VM is missing, preferring the newest versioned Ubuntu guest with a fresh poweroff snapshot. On Peter's current host today, use `Ubuntu 26.04`.
- On macOS same-guest update checks, restart the gateway after the npm upgrade before `gateway status` / `agent`; launchd can otherwise report a loaded service while the old process has exited and the fresh process is not RPC-ready yet.
- The npm-update aggregate's macOS update leg writes the guest update script as root, then runs it as the desktop user. If `prlctl exec "$MACOS_VM" --current-user ...` cannot authenticate, retry through plain root `prlctl exec` plus `sudo -u <desktop-user> /usr/bin/env HOME=/Users/<desktop-user> USER=<desktop-user> LOGNAME=<desktop-user> PATH=/opt/homebrew/bin:/opt/homebrew/opt/node/bin:/usr/bin:/bin:/usr/sbin:/sbin ...`. That is a Parallels transport fallback; still verify `openclaw --version`, gateway RPC, and an agent turn after the update.
- On Windows same-guest update checks, restart the gateway after the npm upgrade before `gateway status` / `agent`; in-place global npm updates can otherwise leave stale hashed `dist/*` module imports alive in the running service.
@@ -93,8 +93,8 @@ Use this skill for Parallels guest workflows and smoke interpretation. Do not lo
- If that release-to-dev lane fails with `reason=preflight-no-good-commit` and repeated `sh: pnpm: command not found` tails from `preflight build`, treat it as an updater regression first. The fix belongs in the git/dev updater bootstrap path, not in Parallels retry logic.
- Until the public stable train includes that updater bootstrap fix, the macOS release-to-dev lane may seed a temporary guest-local `pnpm` shim immediately before `openclaw update --channel dev`. Keep that workaround scoped to the smoke harness and remove it once the latest stable no longer needs it.
- In Tahoe `prlctl exec --current-user` runs, prefer explicit `node .../openclaw.mjs ...` invocations for the release->dev handoff itself and for post-update verification. The shebanged global `openclaw` wrapper can fail with `env: node: No such file or directory`, and self-updating through the wrapper is a weaker lane than invoking the entrypoint under a fixed `node`.
- Default to the snapshot closest to `macOS 26.3.1 latest`.
- On Peter's Tahoe VM, `fresh-latest-march-2026` can hang in `prlctl snapshot-switch`; if restore times out there, rerun with `--snapshot-hint 'macOS 26.3.1 latest'` before blaming auth or the harness.
- Default to the snapshot closest to `macOS 26.5 latest`.
- On Peter's Tahoe VM, `fresh-latest-march-2026` can hang in `prlctl snapshot-switch`; if restore times out there, rerun with `--snapshot-hint 'macOS 26.5 latest'` before blaming auth or the harness.
- `parallels-macos-smoke.sh` now retries `snapshot-switch` once after force-stopping a stuck running/suspended guest. If Tahoe still times out after that recovery path, then treat it as a real Parallels/host issue and rerun manually.
- The macOS smoke should include a dashboard load phase after gateway health: resolve the tokenized URL with `openclaw dashboard --no-open`, verify the served HTML contains the Control UI title/root shell, then open Safari and require an established localhost TCP connection from Safari to the gateway port.
- For Tahoe `fresh.gateway-status`, prefer non-TTY `prlctl exec --current-user ... openclaw gateway status ...` plus a few short retries. `prlctl enter` can spam TTY control bytes and hang the phase log even when the CLI itself is healthy.
@@ -140,8 +140,8 @@ Use this skill for Parallels guest workflows and smoke interpretation. Do not lo
- Use the snapshot closest to fresh `Ubuntu 24.04.3 ARM64`.
- If that exact VM is missing on the host, any Ubuntu guest with major version `>= 24` is acceptable; prefer the closest versioned Ubuntu guest with a fresh poweroff snapshot. On Peter's host today, that is `Ubuntu 25.10`.
- Use the newest versioned Ubuntu guest with a fresh poweroff snapshot. On Peter's host today, that is `Ubuntu 26.04`.
- If an exact requested Ubuntu VM is missing on the host, any Ubuntu guest with major version `>= 24` is acceptable; prefer the newest versioned Ubuntu guest over older fallback snapshots.
- Use plain `prlctl exec`; `--current-user` is not the right transport on this snapshot.
- Fresh snapshots may be missing `curl`, and `apt-get update` can fail on clock skew. Bootstrap with `apt-get -o Acquire::Check-Date=false update` and install `curl ca-certificates`.
- Fresh `main` tgz smoke still needs the latest-release installer first because the snapshot has no Node or npm before bootstrap.
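The Ubuntu fallback rule above (exact VM if present, otherwise any Ubuntu guest with major version `>= 24`, preferring the newest) can be sketched as a name resolver. This is an illustration only: the function names are hypothetical, VM names are assumed to start with `Ubuntu <version>`, and the "fresh poweroff snapshot" check is not modeled here.

```python
import re
from typing import Iterable, Optional, Tuple


def ubuntu_version(name: str) -> Optional[Tuple[int, ...]]:
    """Extract a numeric version tuple from a VM name like 'Ubuntu 25.10'."""
    match = re.match(r"^Ubuntu (\d+(?:\.\d+)*)", name)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def resolve_ubuntu_vm(
    requested: str, available: Iterable[str], min_major: int = 24
) -> Optional[str]:
    """Pick the requested VM, else the newest Ubuntu guest at or above min_major."""
    names = list(available)
    if requested in names:
        return requested

    candidates = []
    for name in names:
        version = ubuntu_version(name)
        if version is not None and version[0] >= min_major:
            candidates.append((version, name))

    if not candidates:
        return None
    # Tuples compare element-wise, so max() selects the newest version.
    return max(candidates)[1]
```

With guests `Ubuntu 22.04`, `Ubuntu 25.10`, and `Ubuntu 26.04` available, a missing `Ubuntu 24.04.3 ARM64` resolves to `Ubuntu 26.04`, and a host with only `Ubuntu 22.04` resolves to nothing.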
@@ -139,12 +139,12 @@ Issue triage is review/prove/patch-local by default:
2. Fix only issues that are easy, high-confidence, and narrowly owned by the implicated path.
3. Add focused regression proof when practical.
4. Stop with the dirty diff, touched files, and test/gate output for maintainer review.
5. After maintainer approval to ship, make one commit per accepted fix, with its own changelog entry when user-facing.
5. After maintainer approval to ship, make one commit per accepted fix, with release-note context in the PR body or commit message when user-facing.
6. Pull/rebase, push, then comment and close only the issues that were fixed or explicitly triaged closed.
Do not batch unrelated issue fixes into one commit. Do not publish, comment, close, or label during the review/prove phase.
Missing changelog is not a PR review finding or merge blocker. If landing/fixing a user-visible change, add/update changelog automatically when practical; never ask or block solely on it.
Missing `CHANGELOG.md` is not a PR review finding or merge blocker. If landing/fixing a user-visible change, make sure the PR body or commit message captures the release-note context; never ask or block solely on it.
Only list candidates that pass all gates:
@@ -168,11 +168,22 @@ Output only qualifying candidates, with: ref, surface, proof, cause, fix sketch,
- Start every PR review with 1-3 plain sentences explaining what the change does and why it matters. Put this before `Findings`.
- Then list findings first. If none, say `No blocking findings` or `No findings`.
- Show size near the top as `LOC: +<additions>/-<deletions> (<changedFiles> files)`, using live PR stats or local diff stats.
- Always answer: bug/behavior being fixed, PR/issue URL and affected surface, provenance for regressions when traceable, and best-fix verdict.
- For bug/regression fixes, include a compact `Provenance:` line after cause/root-cause when a bounded history pass can identify it. Use `git log -S/-G`, `git blame`, linked PRs/issues, and tests; separate author, committer/merger, and current PR author when they differ.
- For bug/regression fixes, include a compact `Provenance:` line after cause/root-cause when a bounded history pass can identify it. Use `git log -S/-G`, `git blame`, linked PRs/issues, and tests.
- Provenance must separate roles when they differ: blamed code author username, blamed PR author username, blamed PR merger/committer username, automerge trigger when known, current PR author username, PR number, and date. Do not collapse them into one "introduced by" actor.
- If the blamed PR was merged by `clawsweeper[bot]` or another automation, identify the human trigger when practical. Check live PR timeline/comments first; if rate-limited, use gitcrawl/cache or public PR HTML. Look for maintainer command comments such as `@clawsweeper automerge`, `/landpr`, labels/events that armed automerge, and ClawSweeper status comments. Report `automerge triggered by @login`; if not found, say trigger unknown rather than naming the bot as the human decision-maker.
- For any confirmed bug, run `git blame` on the implicated line(s) after identifying the root cause. Report who broke it as the blamed PR merger/committer, and also name the blamed code author. Include the PR number. If no PR is traceable, use the blamed commit as the provenance: commit SHA, date, and author username. Do not guess a merger or frame missing PR metadata as a separate finding.
- Phrase provenance as `introduced by`, `made visible by`, or `carried forward by`, with confidence (`clear`, `likely`, `unknown`). If unclear, say what evidence is missing instead of guessing. For features, docs, and refactors, use `Provenance: N/A` or omit it when no broken behavior is being fixed.
- Keep summaries compact, but include enough proof that the verdict is auditable without rereading the PR.
- Review the surrounding code path, not just changed lines. Open the caller, callee, data contracts, adjacent tests, and owner module.
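The `LOC:` line required above can be derived from local diff stats. The following is a minimal sketch, assuming the input is the one-line summary printed by `git diff --shortstat`; the helper name `format_loc` is hypothetical and not part of the repo.

```bash
# Hypothetical helper: convert `git diff --shortstat` output into the
# `LOC: +<additions>/-<deletions> (<changedFiles> files)` review line.
format_loc() {
  awk '{
    files = 0; add = 0; del = 0
    for (i = 1; i <= NF; i++) {
      # Each count is the field just before its label.
      if ($i ~ /^file/)      files = $(i - 1)
      if ($i ~ /^insertion/) add   = $(i - 1)
      if ($i ~ /^deletion/)  del   = $(i - 1)
    }
    printf "LOC: +%d/-%d (%d files)\n", add, del, files
  }'
}
```

Usage, assuming the PR branch is checked out: `git diff --shortstat origin/main...HEAD | format_loc`. An empty diff prints nothing.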
@@ -192,7 +203,7 @@ Output only qualifying candidates, with: ref, surface, proof, cause, fix sketch,
- Before landing, require:
1. symptom evidence such as a repro, logs, or a failing test
2. a verified root cause in code with file/line
3. provenance for regressions when traceable by bounded git/PR history
3. blame-backed provenance for regressions when traceable, including blamed PR merger and automerge trigger when known, or commit SHA/date when no PR is traceable
4. a fix that touches the implicated code path
5. a regression test when feasible, or explicit manual verification plus a reason no test was added
- If the claim is unsubstantiated or likely wrong, request evidence or changes instead of merging.
- Never mention merge conflicts that are relatively easy to resolve, such as
`CHANGELOG.md` entries, in review-only output. These are landing mechanics,
not correctness findings.
- Never mention release-note bookkeeping in review-only output. It is landing
or release-generation mechanics, not a correctness finding.
- If bot review conversations exist on your PR, address them and resolve them yourself once fixed.
- Leave a review conversation unresolved only when reviewer or maintainer judgment is still needed.
- Before landing any PR with non-trivial code changes, run `$autoreview` until no accepted/actionable findings remain, unless equivalent manual review already covered it, the change is trivial/docs-only, or the user opts out.
default_prompt: "Use $openclaw-pre-release-plugin-testing to plan or run pre-release OpenClaw plugin validation across package, lifecycle, doctor, gateway, SDK, and live-ish proof."
short_description: "Benchmark and speed up OpenClaw tests"
default_prompt: "Use $optimizetests to benchmark slow OpenClaw tests, optimize imports and duplicated setup, move misplaced core coverage to extensions, verify gates, commit scoped changes, push, and keep CI green without adding shards or dropping coverage."
description: "Run, watch, debug, and summarize OpenClaw full release CI, release checks, live provider gates, install/update proofs, and release-secret preflights."
---
# OpenClaw Release CI
Use this with `$openclaw-release-maintainer` and `$openclaw-testing` when a release candidate needs full validation, install/update proof, live provider checks, or CI recovery.
Use this with `$release-openclaw-maintainer` and `$openclaw-testing` when a release candidate needs full validation, install/update proof, live provider checks, or CI recovery.
## Guardrails
@@ -22,7 +22,7 @@ Use this with `$openclaw-release-maintainer` and `$openclaw-testing` when a rele
short_description: "Verify and debug OpenClaw release validation runs"
default_prompt: "Use $openclaw-release-ci to preflight provider secrets, watch full release validation, summarize child runs, and triage only failing release lanes."
default_prompt: "Use $release-openclaw-ci to preflight provider secrets, watch full release validation, summarize child runs, and triage only failing release lanes."
description: "Run or recover OpenClaw macOS release signing, notarization, appcast, and asset promotion."
---
# OpenClaw Mac Release
Use with `$openclaw-release-maintainer`, `$openclaw-release-ci`, and `$one-password` when stable macOS assets, private mac preflight, notarization, appcast promotion, or mac release recovery is involved.
Use with `$release-openclaw-maintainer`, `$release-openclaw-ci`, `$one-password`, and (if it exists) `$release-private` when stable macOS assets, private mac preflight, notarization, appcast promotion, or mac release recovery is involved.
## Credentials
- Canonical ASC item: vault `Molty`, title `API Key - App Store Connect - Personal - Release`.
- Resolve Peter-owned ASC item refs, key ids, issuer ids, and service-token provenance from `$release-private`.
description: Prepare or verify OpenClaw stable/beta releases, changelogs, release notes, publish commands, and artifacts.
---
# OpenClaw Release Maintainer
Use this skill for release and publish-time workflow. Keep ordinary development changes and GHSA-specific advisory work outside this skill.
Use this skill for release and publish-time workflow. Load `$release-private` if it exists before resolving Peter-owned credential locators or private host topology. Keep ordinary development changes and GHSA-specific advisory work outside this skill.
## Respect release guardrails
@@ -23,7 +23,8 @@ Use this skill for release and publish-time workflow. Keep ordinary development
green. Then branch from that commit so regular development can continue on
`main` while release validation runs.
- Before release branching, commit any dirty files in coherent groups, push,
pull/rebase, then run `/changelog` on `main` and commit/push/pull that
pull/rebase, then generate `CHANGELOG.md` on `main` from merged PRs and all
direct commits since the last reachable release tag. Commit/push/pull that
changelog rewrite immediately before creating the release branch.
- During release planning, inspect both `src/plugins/compat/registry.ts` and
`src/commands/doctor/shared/deprecation-compat.ts` before branching and again
@@ -59,8 +60,18 @@ Use this skill for release and publish-time workflow. Keep ordinary development
fixes that landed after the release branch cut and backport only important
low-risk fixes. Operators may authorize up to 4 autonomous beta attempts;
after 4 failed beta attempts, stop and report.
- Use `/changelog` before version/tag preparation so the top changelog section
is deduped and ordered by user impact.
- As soon as the release candidate SHA exists, dispatch `OpenClaw Performance`
with `target_ref=<release-sha>` in parallel with the other release work. Do
not wait for full release validation to start the performance signal.
- Before publish/closeout, compare available product performance metrics with
earlier releases: Kova agent-turn/resource metrics, gateway startup
ready/listen/RSS/CPU metrics, and CLI startup metrics from release evidence
or clawgrit reports. Report regressions explicitly. A major regression is a
release blocker unless the operator waives it or the data clearly proves
infrastructure noise.
- Generate the changelog before version/tag preparation so the top changelog
section is deduped and ordered by user impact. Use
`$openclaw-changelog-update` for the rewrite.
- Do not create beta-specific `CHANGELOG.md` headings. Beta releases use the
stable base version section, for example `v2026.4.20-beta.1` uses
`## 2026.4.20` release notes.
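The beta-to-stable heading rule above can be sketched in shell. This is an illustration only: the function name `changelog_heading` is hypothetical, and it assumes every prerelease suffix starts at the first `-` in the tag.

```bash
# Hypothetical helper: map a release tag to the CHANGELOG.md heading it uses.
# Beta tags share the stable base version section.
changelog_heading() (
  version="${1#v}"          # drop the leading "v"
  version="${version%%-*}"  # drop any prerelease suffix such as "-beta.1"
  printf '## %s\n' "$version"
)
```

With this sketch, `changelog_heading v2026.4.20-beta.1` and `changelog_heading v2026.4.20` both print `## 2026.4.20`.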
@@ -127,11 +138,25 @@ Use this skill for release and publish-time workflow. Keep ordinary development
## Build changelog-backed release notes
- `CHANGELOG.md` is release-owned. Normal PRs and direct `main` fixes should
not edit it.
- Before release branching or tagging, rewrite the target `CHANGELOG.md`
section from commit history, not just from existing notes: scan commits since
the last reachable release tag, add missed user-facing changes, dedupe
overlapping entries, and sort each section from most to least interesting for
users.
section from history, not existing notes. Use the last reachable stable or
beta release tag as the base, then inspect every commit through the target
release SHA.
- Include both merged PR commits and direct commits on `main`. Direct commits
matter: infer notes from their subject, body, touched files, linked issues,
tests, and nearby code when no PR body exists.
- Prefer PR bodies, issue links, review proof, and commit bodies over commit
subjects alone. If a commit fixed an issue directly, the commit body should
name the user-visible behavior, affected surface, issue ref, and credited
reporter/contributor when known.
- Treat missing context as a release-note audit gap: inspect the diff and linked
issue, draft the best accurate entry, and note the uncertainty for maintainer
description: "OpenClaw Tideclaw alpha/nightly release automation: isolated branches, local fixes, release CI, branch retention, and forward-port to main."
---
# Nightly Release
Use for Tideclaw/OpenClaw alpha/nightly release automation, manual alpha triggers, beta prep, release-branch repair, and post-release forward-port. Load `$release-private` if it exists before using Tideclaw host paths, cron ids, or Discord routing ids.
## Policy
- Alpha/nightly runs every 12h or by manual trigger.
- Beta is human-triggered from Discord from a proven alpha/release branch.
- Stable/latest always needs explicit human confirmation.
- Never publish from a dirty checkout or directly from `main`.
- Main can be busy or broken; alpha work must be isolated so transient main failures do not block a usable nightly.
- Publish only after release-branch proof is green.
- After a successful alpha, forward-port release-branch commits back to `main` and prove main CI green.
- Forward-port PRs contain only reusable fixes needed to make nightly/release checks pass. They must not contain alpha version bumps, release notes, changelog release entries, tags, generated artifacts, or state-file updates.
- Keep only alpha/nightly branches from the last 3 days, plus any branch with an active run, open PR, or release tag.
- Never run broad env/token dumps. For GitHub writes on the Tideclaw host, use the Tideclaw `gh` write wrapper below.
## Identity
Tideclaw should commit under its own machine identity on release branches and forward-port branches:
```bash
git config user.name "Tideclaw"
git config user.email "tideclaw@openclaw.ai"
```
A dedicated machine identity helps auditability, provided commits are clearly machine-authored and gated by CI. Avoid direct pushes to protected `main`; forward-port via PR/automerge unless the repo policy explicitly allows the bot to push after green checks. Include human `Co-authored-by` only when a human supplied the patch or explicit commit text.
## Branch Shape
- Branch prefix: `tideclaw/alpha/`
- Branch name: `tideclaw/alpha/YYYY-MM-DD-HHMMZ`
- Base: current `origin/main` SHA at trigger time.
- State file: resolve from `$release-private` on the Tideclaw host.
- Release tag: `vYYYY.M.D-alpha.N`
- npm dist-tag: `alpha`
Do not reuse old alpha branches for a new run. If rerunning the same base SHA, create a new timestamped branch and record why.
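A fresh timestamped branch name in the shape above can be generated from the current UTC time. This is a minimal sketch; the helper name `new_alpha_branch` is an assumption, not an existing script.

```bash
# Hypothetical helper: build a tideclaw/alpha/YYYY-MM-DD-HHMMZ branch name
# from the current UTC time.
new_alpha_branch() {
  printf 'tideclaw/alpha/%s\n' "$(date -u +%Y-%m-%d-%H%MZ)"
}

BRANCH="$(new_alpha_branch)"
```

Because the name carries minute resolution, a rerun on the same base SHA more than a minute later still gets a distinct branch.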
## Start
1. Work in the Tideclaw host checkout from `$release-private`.
3. Read repo release docs/scripts before changing anything:
- `AGENTS.md`
- release docs under `docs/`
- release scripts under `scripts/`
- `.github/workflows/*release*`
4. Compare `$BASE_SHA` with the last successful alpha state and current git/npm/GitHub alpha tags. If already released, report skip and do not publish.
Manual trigger:
```bash
CRON_ID="<from release-private>"
OPENCLAW_ALLOW_ROOT=1 openclaw cron run "$CRON_ID" --expect-final --timeout 21600000
```
## Discord Alpha Trigger
Tideclaw may run alpha immediately from Discord when a maintainer mentions Tideclaw in `#releases` or `#maintainers`.
Accepted shapes:
```text
@Tideclaw run alpha now
@Tideclaw alpha release from main now
@Tideclaw trigger alpha
```
Rules:
1. Treat this as a manual alpha trigger equivalent to the alpha cron job.
2. Start from current `origin/main` and create a fresh `tideclaw/alpha/YYYY-MM-DD-HHMMZ` branch.
3. Follow the normal alpha workflow: reuse prior fixes, run local checks, fix on the alpha branch, run release CI, publish alpha after green gates, then forward-port reusable fixes via fixes-only PR.
4. If another alpha/beta/stable release run is already active, report the active branch/run and stop.
5. A `#maintainers` trigger requires an explicit Tideclaw mention; do not react to unmentioned release chatter there.
6. Resolve Discord role/user ids and live host hotfix notes from `$release-private`.
## Discord Beta Trigger
Tideclaw may run beta releases from `#releases` or mentioned `#maintainers` commands only when a maintainer sends an explicit beta trigger. Treat this as human approval for beta, not for stable/latest.
Accepted shapes:
```text
@Tideclaw beta release from vYYYY.M.D-alpha.N
@Tideclaw beta release from tideclaw/alpha/YYYY-MM-DD-HHMMZ
@Tideclaw beta release from latest proven alpha
```
Rules:
1. Require the words `beta release` and a source alpha tag/branch, or `latest proven alpha`.
2. If the source is ambiguous, ask one clarifying question in `#releases` and stop.
3. Verify the source alpha first: GitHub release, npm `alpha` package, release CI, recorded state file, and branch/tag SHA.
4. Create a fresh beta branch `tideclaw/beta/YYYY-MM-DD-HHMMZ` from the proven alpha source, not directly from a moving `main`.
5. Reuse/squash only stabilization fixes already proven on alpha. Do not import unrelated alpha release mechanics unless the beta release docs require them.
6. Compute beta as `vYYYY.M.D-beta.N`, matching npm `--tag beta`.
7. Run beta release validation/preflight/full release CI and fix failures on the beta branch.
8. Publish beta only after green beta gates. Use GitHub Actions/OIDC, never direct npm publish from the host.
9. Final Discord summary must include source alpha, beta tag/version, branch, fix commits, workflow run IDs, npm/GitHub proof, and any skipped/blocked reason.
10. After beta publishes, forward-port reusable fixes to `main` using the same fixes-only PR rules below.
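Rule 1 above (require `beta release` plus a source) can be sketched as a small parser. This is an illustration under assumptions: the function name `parse_beta_source` is hypothetical, and the pattern additionally expects the word `from`, as in the accepted shapes.

```bash
# Hypothetical helper: extract the beta source from a trigger message.
# Prints the alpha tag, alpha branch, or "latest proven alpha";
# returns non-zero when the message is not an explicit beta trigger.
parse_beta_source() (
  src=$(printf '%s\n' "$1" | sed -nE \
    's#.*beta release from (v[0-9]+\.[0-9]+\.[0-9]+-alpha\.[0-9]+|tideclaw/alpha/[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}Z|latest proven alpha).*#\1#p')
  [ -n "$src" ] && printf '%s\n' "$src"
)
```

A non-zero exit corresponds to rule 2: the source is missing or ambiguous, so ask one clarifying question and stop.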
## Reuse Prior Fixes
Before running checks, mine recent Tideclaw alpha branches for fixes already made during previous release attempts:
1. Read the Tideclaw state file from `$release-private` for the last successful alpha branch and fix commit SHAs.
5. Cherry-pick only real stabilization fixes that still apply to the new alpha branch. Prefer commits recorded as `fixCommitShas` in the state file.
6. Skip version bumps, changelog release entries, tag artifacts, generated release notes, state-file-only commits, and one-off debug instrumentation.
7. If a cherry-pick conflicts, inspect whether current main already contains an equivalent fix. If not, resolve minimally and keep the commit message clear.
8. Record reused commit SHAs separately from newly authored fix SHAs in the alpha state and final Discord summary.
Use `git cherry`, `git range-diff`, and targeted test reruns to avoid duplicating fixes already present on `main`.
## Repair Loop
Use the branch as a release-candidate repair surface:
1. Run narrow local checks first: changed tests, release preflight, type/lint/build gates required by release docs.
2. If local checks fail, fix on the alpha branch with minimal commits.
3. Commit each coherent fix as Tideclaw.
4. Re-run the failed local check after each fix.
5. Do not hide failures by editing baselines, expected-failure lists, ignore files, or release inventory unless the release docs explicitly require it and the diff is justified.
6. If a failure is flaky, rerun once; if still red, treat it as real.
7. If the fix is clearly useful for main, keep it small and forward-portable. Avoid broad refactors during alpha stabilization.
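Step 6 above (rerun a flaky failure once, then treat it as real) can be sketched as a wrapper. The name `run_with_one_retry` is hypothetical; the wrapped command is whatever local check failed.

```bash
# Hypothetical helper: run a check, rerun it exactly once on failure,
# and report the second result as the real outcome.
run_with_one_retry() {
  "$@" && return 0
  printf 'first attempt failed, rerunning once: %s\n' "$*" >&2
  "$@"
}
```

The wrapper takes the failed check as its arguments. A second failure propagates the non-zero exit status, so the caller treats it as a real failure to fix on the alpha branch.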
Commit examples:
```bash
git add <files>
git commit -m "fix: stabilize alpha release preflight"
git push -u origin "$BRANCH"
```
## Release CI
After local proof:
1. Compute the next `vYYYY.M.D-alpha.N` from existing git tags, npm versions, and GitHub releases.
2. Make the alpha branch package version and release metadata match that tag, commit it, and push the branch.
3. Run release validation from the alpha branch, using GitHub CLI, not browser/fetch tools. On the Tideclaw host, bare `gh` is a read-only Codex sandbox wrapper; use `/usr/local/bin/gh-tideclaw-write` for write-capable commands such as `workflow run`, `run cancel`, and publish dispatch:
```bash
# Write-capable wrapper named above; bare gh stays read-only.
GH=/usr/local/bin/gh-tideclaw-write
"$GH" workflow run full-release-validation.yml --repo openclaw/openclaw --ref "$BRANCH" \
  -f ref="$BRANCH" \
  -f release_profile=beta \
  -f rerun_group=all
"$GH" workflow run openclaw-npm-release.yml --repo openclaw/openclaw --ref "$BRANCH" \
  -f tag="$SHA" \
  -f preflight_only=true \
  -f npm_dist_tag=alpha
```
4. Watch the exact workflow run IDs and head SHA with `gh run list`, `gh run view`, and `gh api`. Read-only `gh` is fine for polling; use `$GH` only when a command mutates GitHub. Do not use Codex browser/fetch for GitHub API polling; prior Tideclaw runs failed there after successful preflight.
5. For alpha, blocking gates are the ones Tideclaw can repair directly or that prove package safety: normal CI, plugin prerelease, npm preflight, package preparation, install smoke, tag/reachability, and publish verification. Treat cross-OS, live channel, QA Lab, package acceptance, long Docker E2E, and Telegram package E2E failures as advisory; report them in Discord and continue if the blocking gates are green.
- If `rerun_group=all` is stuck only on advisory lanes after CI, plugin prerelease, npm preflight, package preparation, and install smoke are green, dispatch a focused Full Release Validation on the same head with `-f rerun_group=install-smoke`. Use that successful focused Full Release Validation run as the publish proof, and include the separate CI/plugin/full advisory run IDs in the Discord summary.
6. If a blocking gate fails, fix on the alpha branch, push, and rerun only the failed or required release CI. If the commit changes, discard old preflight/full-validation run IDs and rerun them for the new head.
7. After full validation and npm preflight are green on the same branch head, create and push the release tag from that exact commit:
```bash
git tag -a "$TAG" "$SHA" -m "openclaw ${TAG#v}"
git push origin "$TAG"
```
8. Dispatch the publish wrapper from the same alpha branch. Use the successful npm preflight run ID and full release validation run ID from the same head SHA:
```bash
"$GH" workflow run openclaw-release-publish.yml --repo openclaw/openclaw --ref "$BRANCH" \
```
9. Watch the publish wrapper plus child runs. If `openclaw-npm-release.yml` is waiting on the `npm-release` environment and Tideclaw cannot approve it, report that as the only blocker; do not call the release done.
10. Do not publish npm directly from the host; use GitHub Actions/OIDC.
Important: `openclaw-npm-release.yml` with `preflight_only=true` only prepares artifacts. It does not publish. A successful alpha requires the later `openclaw-release-publish.yml` wrapper, a pushed git tag, npm `alpha` dist-tag proof, and a GitHub prerelease.
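Step 1 of the list above computes the next `vYYYY.M.D-alpha.N` tag. The sketch below covers only the git-tag part of that step; npm versions and GitHub releases still need to be checked separately. It assumes the caller already knows the date base (for example `v2026.4.20`), and the function name `next_alpha_tag` is hypothetical.

```bash
# Hypothetical helper: read existing tags on stdin and print the next
# <base>-alpha.N tag, where N is one more than the highest N already used.
next_alpha_tag() {
  awk -v base="$1" '
    BEGIN { prefix = base "-alpha."; max = 0 }
    index($0, prefix) == 1 {
      n = substr($0, length(prefix) + 1)
      # Compare numerically so alpha.10 ranks above alpha.9.
      if (n ~ /^[0-9]+$/ && n + 0 > max) max = n + 0
    }
    END { printf "%s%d\n", prefix, max + 1 }
  '
}
```

Usage: `git tag -l | next_alpha_tag v2026.4.20`. With no matching tags the sketch prints `v2026.4.20-alpha.1`.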
## Verify Published Alpha
Release is not done until all are true:
- GitHub tag exists.
- GitHub Release exists and is marked prerelease.
- Release body links npm version page, registry tarball, integrity, and CI/proof.
- `npm view openclaw@<version>` shows the exact version, dist-tag `alpha`, tarball, integrity, and publish time.
- The Tideclaw state file from `$release-private` records version, tag, base SHA, branch, fix commit SHAs, workflow run IDs, npm integrity, and timestamp.
Final Discord summary in `#releases`:
- tag/version
- base SHA
- branch
- fix commits
- workflow run IDs
- npm/GitHub proof
- skipped/blocked reason if not released
Use Discord-safe Markdown links with angle-bracket targets. Never print secrets.
## Forward-Port
After a successful alpha, raise a fixes-only PR back to `main`:
1. Create/update a forward-port branch from current `origin/main`:
2. Cherry-pick only release-branch commits that are real fixes required to make nightly/release checks pass.
3. Exclude alpha version bumps, changelog release entries, release notes, tag artifacts, generated release assets, state-file-only commits, and any commit whose only purpose was publishing the alpha.
4. If a commit mixes a real fix with release/version changes, split it: replay only the fix hunks into a new commit on the forward-port branch.
5. Resolve conflicts in favor of the minimal main-compatible fix.
6. Run the relevant changed/local gate.
7. Push and open a PR, or use the repo’s allowed bot merge path.
8. Wait for required main CI to go green. If CI fails, fix on the forward-port branch and rerun.
9. Report the PR/merge SHA and any commits intentionally not forward-ported.
If `origin/main` is independently red before the forward-port, document the unrelated failing check and still keep the forward-port PR green against its head when possible.
## Branch Retention
Before and after each run, prune old alpha branches:
1. List `origin/tideclaw/alpha/*`.
2. Keep branches whose timestamp is within the last 3 days UTC.
3. Keep branches referenced by a live workflow run, open PR, release tag, or state file.
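The timestamp rule in step 2 can be sketched as a filter that lists prune candidates. It only applies the age rule: anything it prints must still be checked against step 3 (live workflow run, open PR, release tag, state file) before deletion. The function name `stale_alpha_branches` and the cutoff argument format are assumptions.

```bash
# Hypothetical helper: read branch names on stdin and print those whose
# YYYY-MM-DD-HHMMZ suffix is older than the cutoff (format YYYY-MM-DD-HHMM, UTC).
stale_alpha_branches() {
  awk -v cutoff="$1" '
    {
      n = split($0, parts, "/")
      stamp = parts[n]
      sub(/Z$/, "", stamp)
      # Skip anything that does not end in a well-formed timestamp.
      if (stamp !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/) next
      # Fixed-width stamps sort chronologically as plain strings.
      if ((stamp "") < (cutoff "")) print $0
    }
  '
}
```

Usage, assuming GNU `date` for the cutoff: `git for-each-ref --format='%(refname:short)' refs/remotes/origin/tideclaw/alpha/ | stale_alpha_branches "$(date -u -d '3 days ago' +%Y-%m-%d-%H%M)"`.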
default_prompt: "Use $release-openclaw-plugin-testing to plan or run pre-release OpenClaw plugin validation across package, lifecycle, doctor, gateway, SDK, and live-ish proof."
description: Build and review high-quality technical docs as well as agent instruction files in your repository.
license: MIT
metadata:
source: "https://github.com/vincentkoc/dotskills"
---
# Technical Documentation
## Purpose
Produce and review technical documentation that is clear, actionable, and maintainable for both humans and agents, including contributor-governance files and agent instruction files.
## When to use
- Creating or overhauling docs in an existing product/codebase (brownfield).
- Building evergreen docs meant to stay accurate and reusable over time.
- Reviewing doc diffs for structure, clarity, and operational correctness.
- Running full-repo documentation audits that must include both governance files and product docs surfaces (`docs/`, `README*`, `.md/.mdx/.mdc`, Fern/Sphinx/Mintlify-style sources).
- Updating or reviewing AGENTS.md and/or CONTRIBUTING.md to keep agent and contributor workflows aligned with current repo practices.
- Improving repository onboarding/docs that include contribution instructions, issue templates, PR flow, and review gates.
- Designing governance documentation strategy for repos with alias instruction files (for example `CLAUDE.md`, `AGENT.md`, `.cursorrules`, `.cursor/rules/*`, `.agent/`, `.agents/`, `.pi/`) where `AGENTS.md` is treated as canonical when present and aliases should be kept as compatibility surfaces.
- Diagnosing agent-file drift where teams had to prompt iteratively to surface missing files, broken commands, or policy conflicts.
- Applying repository-specific documentation overlays, including OpenClaw page-type, docs IA, preservation, and validation rules when present.
## Workflow
1. Classify task: `build` or `review`; context: `brownfield` or `evergreen`.
2. Inventory full documentation scope early (governance + product docs): AGENTS/CONTRIBUTING/aliases plus docs directories, framework sources, and root/module READMEs.
3. Detect multilingual scope (README/docs in multiple languages) and define required parity level.
4. Read `references/agent-and-contributing.md` for agent instruction and `CONTRIBUTING.md` workflow rules (inventory, canonical/alias mapping, dual-mode balance, deliverable standards, and precedence/conflict handling).
5. Read `references/principles.md` for the governing ruleset (Matt Palmer & OpenAI).
6. For OpenClaw docs work, read `references/openclaw.md` before the build/review playbook.
7. For build tasks, follow `references/build.md`.
8. For review tasks, follow `references/review.md` and proactively detect issues without waiting for repeated prompts.
9. For complex or high-risk tasks (build or review), it is acceptable to run longer, deeper, and more exhaustive investigations when needed for confidence.
10. When available, use sub-agents for bounded parallel discovery/review work, then merge outputs into one coherent final deliverable.
11. Use `references/tooling.md` when platform/tooling choices affect recommendations.
12. Run a proactive issue sweep for both governance and docs-content surfaces, and fix high-confidence defects in the same pass unless explicitly asked for report-only mode.
13. In brownfield mode, prioritize compatibility with current docs IA, tooling, and release state.
14. In evergreen mode, prioritize timeless wording, update strategy, and durable structure.
15. Return deliverables plus validation notes, parity status, and remaining gaps.
## Sub-agent orchestration guidance
Prefer sub-agents when the repo is large or the requested change set is broad; use them by default for repo-wide, multi-framework, or high-conflict work.
- `inventory-agent` -> `agents/inventory-agent.md` (`fast` / Claude `haiku`): file/config discovery, coverage map, and missing-path checks.
- `governance-agent` -> `agents/governance-agent.md` (`thinking` / Claude `sonnet`): AGENTS/CONTRIBUTING/alias precedence, conflicts, and policy drift.
- `docs-framework-agent` -> `agents/docs-framework-agent.md` (`thinking` / Claude `sonnet`): framework config, relative path base, and file-path vs URL-path mapping checks.
- `synthesis-agent` -> `agents/synthesis-agent.md` (`long` / Claude `opus`): merge sub-agent outputs into one prioritized fix plan and unified precedence model.
## Inputs
- Doc type (tutorial, how-to, reference, explanation) and audience.
2. Read the root and nearest-scope `AGENTS.md`/`CONTRIBUTING.md` pair before editing.
3. If alias files exist, normalize to one canonical source (`AGENTS.md` preferred when present; otherwise nearest alias), plus compatibility pointers or explicit symlink notes.
4. Document conflicting instructions and precedence decisions.
1. Run a conflict matrix review across AGENTS/aliases/CONTRIBUTING and related command/rule docs before finalizing.
2. Treat the following as high-priority defects: missing referenced files, non-existent setup commands, command scope mismatches, and branch/commit policy conflicts.
3. Do not stop at caveat-only notes when a low-risk fix is clear; apply the fix in the same pass.
4. If a canonical entry file is missing (for example a directory `README.md` that docs depend on), create a minimal actionable file and update references.
5. Long-running investigations are acceptable when needed to uncover cross-file drift, especially in agent-instruction ecosystems.
## Discovery
1. Agents prefer simple terminal commands, so well-defined `make *` or `npm run *` targets are ideal.
2. Agents can discover terminal commands through shell completion, so providing completion scripts helps.
- Success criteria: what must be true after publish.
## 5. Build structure before prose
- Follow the funnel: what/why, quickstart, next steps.
- Keep headings informative and scannable.
- Open each section with the takeaway sentence.
- Add decision points with concrete branch guidance.
- For OpenClaw docs work, choose a page type from `references/openclaw.md` before drafting.
- Keep task-critical OpenClaw configuration inline; link exhaustive defaults, enums, schemas, generated references, and rare debugging workflows.
## 6. Build AGENTS.md and CONTRIBUTING.md intentionally
- Keep AGENTS.md structure consistent with `agents.md` ecosystem patterns:
- include YAML frontmatter when present in repo style (`name`, `description`).
- state persona scope and explicit instruction boundaries: `Always`, `Ask first`, `Never`.
- include concrete commands and representative code examples.
- For CONTRIBUTING.md, prioritize issue triage flow, PR expectations, setup/test commands, and review gates.
- Add `Code of Conduct`, `Testing`, `Local checks`, and `PR expectations` sections when missing but required by the repo.
- If CONTRIBUTING.md is becoming too large, split by scope into linked docs (for example, framework/tool-specific setup and release workflows) and keep the root file as a concise entry point.
- Keep cross-file consistency: links from CONTRIBUTING.md to AGENTS.md (and vice versa) should be accurate and non-circular.
- If multiple AGENTS.md files exist, document the directory-level scope and avoid conflicting advice.
- If a required canonical entry file is missing (for example referenced `README.md` under a major directory), create the file in the same pass instead of adding a caveat-only note.
- For new entry files, keep them minimal and actionable: purpose, prerequisites, concrete run commands, and pointers to deeper docs.
## 7. Keep agent context tight
- Author once, expose twice:
- keep one shared policy core and avoid duplicating guidance in separate agent-specific files.
- publish that core through bounded glob-friendly files for Cursor/Claude plus explicit path references for Codex.
- For Cursor and Claude-style agents, avoid broad references. Use minimal globbing and narrow rule files that each serve one concern (for example, repo-wide setup, test rules, security checks).
- Keep AGENTS and alias files short-to-medium; move detailed runbooks to linked docs.
- For Codex, prefer explicit file references and concrete paths for exact reuse.
- Leave out unrelated historical or process details to prevent token/context drift during future tool reads.
## 8. Brownfield build mode
- Match existing terminology, navigation, and component patterns.
- Preserve existing IA unless there is a documented migration plan.
- For rewrites, include a migration note from old to new paths.
- Prefer smallest safe change set that improves utility.
## 9. Evergreen build mode
- Prefer stable concepts over release-tied narrative.
- Isolate volatile details under clearly marked version sections.
- Include maintenance signals: owners, refresh triggers, stale criteria.
- Include lifecycle notes: deprecation and replacement paths.
## 10. Writing constraints
- Use precise language and short, imperative instructions.
- Keep code examples copy-ready and self-contained.
- Include common failure modes and safe defaults.
- Avoid placeholder guidance that cannot be executed.
## 11. Agent and automation readiness
- Keep key facts in text (not image-only).
- Prefer structured lists/tables when choices matter.
- Add links and anchors that allow deterministic navigation.
- Document what can be checked automatically in CI.
## 12. Build validation
- Validate commands and snippets where possible.
- Verify links and references in changed sections.
- Run a reference existence sweep for every path/command you introduced.
- Verify docs-framework consistency when in scope (for example Sphinx/Fern config and referenced doc paths).
- For OpenClaw docs work, apply the validation checklist in `references/openclaw.md`.
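The reference existence sweep above can be approximated with a short shell check. This is a heuristic sketch, not an existing tool: the function name `missing_refs` is hypothetical, and it only treats backtick-quoted tokens that contain a `/` and no spaces as paths, so glob patterns and slash-separated lists will show up as false positives.

```bash
# Hypothetical helper: list backtick-quoted, path-like references in a
# markdown file that do not exist relative to a root directory.
missing_refs() (
  doc="$1"
  root="${2:-.}"
  grep -oE '`[^` ]+/[^` ]+`' "$doc" | tr -d '`' | sort -u |
    while IFS= read -r ref; do
      [ -e "$root/$ref" ] || printf '%s\n' "$ref"
    done
)
```

Usage: `missing_refs path/to/doc.md path/to/root`. Empty output means every detected reference exists; review any printed entry by hand before editing.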
## 13. Multilingual parity mode (when applicable)
- Pick one source-of-truth language for technical accuracy and release timing.
- Define parity target: full parity, staged parity, or intentional divergence per section.
- Keep structure aligned across locales (headings, anchors, section order) when possible.
- Preserve command/code correctness first; localize explanatory text second.
- If parity is not feasible, add a visible note with missing scope and expected sync window.
- Run a locale parity check for changed sections (added/removed steps, warnings, prerequisites).
- Prefer specific and accurate terminology over niche jargon.
- Keep examples self-contained and minimize dependencies.
- Prioritize high-value topics over edge-case depth.
- Do not teach unsafe patterns (for example, exposed secrets).
- Open with context that helps readers orient quickly.
- Apply empathy and override rigid rules when it clearly improves outcomes.
## Practical merge policy
When these rules conflict:
1. Preserve reader task success first.
2. Preserve structural clarity second.
3. Preserve long-term maintainability third.
4. Add agent optimization only if it does not reduce human clarity.
For agent-instructions and contributor-governance specifics (AGENTS/aliases/CONTRIBUTING), use `references/agent-and-contributing.md` as the detailed additional source of truth.
When the target repo or request is OpenClaw-specific, layer `references/openclaw.md` on top of these general rules. Otherwise ignore that repo-specific overlay.
## Execution policy for this skill
- Long-running and extensive investigations are allowed for both build and review work when needed to resolve ambiguity or cross-file drift.
- Use sub-agents when available for bounded parallel discovery, verification, or cross-source comparison.
- Keep one merged outcome: sub-agent outputs must be normalized into a single consistent recommendation/fix set.
## Multilingual parity rule
When docs exist in multiple languages, target cross-locale parity for task-critical content (steps, warnings, prerequisites, and limits). If full parity is not possible, publish explicit parity status and sync intent.
Read `principles.md` first, then apply this checklist.
## 1. Scope and classification
- Identify doc type and target audience.
- Confirm brownfield vs evergreen intent.
- Confirm expected outcome for the reader.
- For full-repo reviews, explicitly include both governance surfaces and product-doc surfaces (`docs/`, README trees, `.md/.mdx/.mdc`, `.rst/.rsc`, framework docs configs).
- For OpenClaw docs reviews, apply `references/openclaw.md` for page type, docs IA, preservation, examples, and validation checks.
## 2. Investigation behavior
- Proactively find issues and risks without waiting for repeated prompts.
- If there are signals of deeper problems, continue investigation beyond the first pass.
- Long-running and extensive investigations are acceptable when needed for confidence and correctness.
- When available, use sub-agents for bounded parallel discovery (for example file-inventory, command validation, or cross-doc consistency checks), then merge to one final issue set.
- When no issues are found, state that explicitly and call out residual risks or validation gaps.
- Default to `apply-fixes` for high-confidence documentation defects unless the user explicitly requests `report-only`.
- Do not stop at AGENTS/CONTRIBUTING checks when the task is documentation-wide; continue into docs-content and docs-framework surfaces.
## 3. Governance surface review
- Use `references/agent-and-contributing.md` as the source of truth for inventory, canonical/alias mapping, and precedence/conflict handling.
For AGENTS.md:
- confirm persona intent, scope, and command/tool boundaries are explicit.
- check frontmatter style matches repo conventions when present.
- ensure `Always`, `Ask first`, and `Never` boundaries are present when expected.
- require concrete command examples and repo-specific paths to avoid ambiguity.
For CONTRIBUTING.md:
- verify issue/PR workflow is complete and actionable.
- ensure local setup, lint/test commands, and review criteria are accurate.
- ensure governance does not conflict with nested AGENTS instructions.
- flag oversized files that should be split into linked section docs (for example tool-specific setup and release docs).
For agent-platform awareness:
- confirm references are minimal and scoped for Cursor/Claude glob behavior.
- verify canonical rule directory and symlink state match repo policy.
- verify symlink target integrity and platform/tooling expectations.
- verify AGENTS policy references remain canonical for Codex even when `.cursor` compatibility exists.
- check for context bloat from duplicated policy statements across agent and contributor files.
- check for conflicting rules, skills, and agent instructions.
- check for conflicting information between agent instructions and the codebase.
- check for broken or missing referenced files (for example README/index files named as canonical entry points).
- check for setup/command drift (for example non-existent install commands, root-level commands that should be module-scoped).
## 4. Product documentation surface review
- Verify docs IA coverage across root/module `README*` files and `docs/**` trees.
- Review framework-native docs sources in scope (for example Fern, Mintlify, Sphinx, MkDocs) and ensure guidance matches actual source-of-truth files.
- Check `.md/.mdx/.mdc/.rst/.rsc` for stale commands, missing prerequisites, and broken cross-links.
- Confirm referenced doc paths and anchors exist.
- Flag docs that should be split/merged to improve discoverability and maintenance.
- For OpenClaw docs, check `docs/docs.json`, docs-list routing hints, main path versus `Reference` placement, and generated-reference visibility.
- For OpenClaw rewrites or page splits, require source-backed keep/drop/move/destination coverage for important claims, warnings, examples, commands, fields, and troubleshooting facts.
## 5. Framework config and path mapping checks
- Detect and read framework config first (for example Fern config, Sphinx `conf.py`, Mintlify config, or equivalent).
- Resolve path references relative to the declaring file/config.
- Treat filesystem paths and published URL routes as separate maps; verify both.
- Flag path-map drift explicitly (`missing file`, `stale route`, `wrong base path`).
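The path-mapping checks above can be sketched as a small classifier. This is a minimal illustration, not any docs framework's API: `NavEntry`, `resolveFrom`, and `findDrift` are hypothetical names, and the file and route inventories are passed in as in-memory sets rather than read from disk.

```typescript
// Hypothetical types: none of these names come from a real docs framework.
type DriftKind = "missing file" | "stale route" | "wrong base path";

interface NavEntry {
  file: string; // path as written in the config, relative to the config file
  route: string; // published URL route claimed for that file
}

interface Drift {
  kind: DriftKind;
  entry: NavEntry;
}

// Resolve a config-relative path without touching the filesystem.
function resolveFrom(configDir: string, relPath: string): string {
  const parts = [...configDir.split("/"), ...relPath.split("/")];
  const out: string[] = [];
  for (const part of parts) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return out.join("/");
}

// Check the filesystem map and the route map separately for every entry.
function findDrift(
  configDir: string,
  entries: NavEntry[],
  existingFiles: Set<string>,
  publishedRoutes: Set<string>,
  basePath: string,
): Drift[] {
  const drift: Drift[] = [];
  for (const entry of entries) {
    if (!existingFiles.has(resolveFrom(configDir, entry.file))) {
      drift.push({ kind: "missing file", entry });
    }
    if (!entry.route.startsWith(basePath)) {
      drift.push({ kind: "wrong base path", entry });
    } else if (!publishedRoutes.has(entry.route)) {
      drift.push({ kind: "stale route", entry });
    }
  }
  return drift;
}
```

Keeping the two lookups independent means one entry can report both a missing file and a stale route, which matches the rule that both maps are verified.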
## 6. Structural review
- Funnel check: what/why, quickstart, next steps.
- Validate heading flow and navigation discoverability.
- Flag critical content trapped in images or buried sections.
- Check Diataxis alignment and split mixed-purpose sections.
- For OpenClaw docs, confirm the content matches an explicit page type from `references/openclaw.md`.
## 7. Writing quality review
- Check for concise, scannable paragraphs.
- Remove ambiguous pronouns and undefined terms.
- Verify examples are executable and scoped correctly.
- Verify tone is directive, technical, and non-hand-wavy.
## 8. Brownfield review mode
- Verify compatibility with existing docs IA and conventions.
- Verify anchors, redirects, and cross-doc links remain valid.
- Flag regressions in onboarding and task completion paths.
- Ensure changed terminology is intentionally propagated.
## 9. Evergreen review mode
- Flag date-stamped or brittle wording without version scope.
- Check ownership and refresh signals are present.
- Ensure recommendations remain valid after routine product evolution.
This PR description is the contributor's durable explanation of the change. Write it for human maintainers first; ClawSweeper and Barnacle use the same text to understand intent, proof, risk, and current review state.
Describe the intent and outcome in 2-5 bullets. Avoid restating the diff; reviewers and bots can read the changed files.
If this PR fixes a plugin beta-release blocker, title it `fix(<plugin-id>): beta blocker - <summary>` and link the matching `Beta blocker: <plugin-name> - <summary>` issue labeled `beta-blocker`. Contributors cannot label PRs, so the title is the PR-side signal for maintainers and automation.
- Problem:
- Solution:
- What changed:
- What did NOT change (scope boundary):
</details>
## Motivation
## Linked context
Explain why this change should exist now. Link it to the user pain, failure mode, maintainer need, or product goal. If this is purely mechanical, write `N/A`.
Which issue does this close?
-
Closes #
## Change Type (select all)
Which issues, PRs, or discussions are related?
- [ ] Bug fix
- [ ] Feature
- [ ] Refactor required for the fix
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
Related #
## Scope (select all touched areas)
Was this requested by a maintainer or owner?
- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
<details>
<summary>Linked context guidance</summary>
## Linked Issue/PR
Link the issue, PR, discussion, maintainer request, or owner request that explains why this PR should exist. Maintainer context helps reviewers and automation distinguish intended work from drive-by churn.
- Closes #
- Related #
- [ ] This PR fixes a bug or regression
</details>
## Real behavior proof (required for external PRs)
External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only. Screenshots are encouraged even for CLI, console, text, or log changes; terminal screenshots and copied live output count. Be mindful of private information like IP addresses, API keys, phone numbers, non-public endpoints, or other private details when providing evidence.
- Behavior or issue addressed:
- Real environment tested:
- Exact steps or command run after this patch:
- Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
- Observed result after fix:
- What was not tested:
- Proof limitations or environment constraints:
- Before evidence (optional but encouraged):
## Root Cause (if applicable)
<details>
<summary>Real behavior proof guidance</summary>
For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write `N/A`. If the cause is unclear, write `Unknown`.
External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only.
- Root cause:
- Missing detection / guardrail:
- Contributing context (if known):
Screenshots are encouraged even for CLI, console, text, or log changes. Terminal screenshots, copied live output, redacted runtime logs, recordings, and linked artifacts count.
## Regression Test Plan (if applicable)
If your environment cannot produce the ideal proof, explain that under `Proof limitations or environment constraints` so reviewers and ClawSweeper can direct the next step properly.
For bug fixes or regressions, name the smallest reliable test coverage that should catch this. Otherwise write `N/A`.
Be mindful of private information like IP addresses, API keys, phone numbers, non-public endpoints, or other private details when providing evidence.
- Coverage level that should have caught this:
- [ ] Unit test
- [ ] Seam / integration test
- [ ] End-to-end test
- [ ] Existing coverage already sufficient
- Target test or file:
- Scenario the test should lock in:
- Why this is the smallest reliable guardrail:
- Existing test that already covers this (if any):
- If no new test is added, why not:
</details>
## User-visible / Behavior Changes
## Tests and validation
List user-visible changes (including defaults/config).
If none, write `None`.
Which commands did you run?
## Diagram (if applicable)
For UI changes or non-trivial logic flows, include a small ASCII diagram reviewers can scan quickly. Otherwise write `N/A`.
List focused commands, not every incidental check. CI is useful support, but external PRs still need real behavior proof above when behavior changes.
- OS:
- Runtime/container:
- Model/provider:
- Integration/channel (if any):
- Relevant config (redacted):
</details>
### Steps
## Risk checklist
1.
2.
3.
Did user-visible behavior change? (`Yes/No`)
### Expected
-
Did config, environment, or migration behavior change? (`Yes/No`)
### Actual
-
Did security, auth, secrets, network, or tool execution behavior change? (`Yes/No`)
## Evidence
Attach at least one:
What is the highest-risk area?
- [ ] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)
## Human Verification (required)
How is that risk mitigated?
What you personally verified (not just CI), and how:
<details>
<summary>Risk guidance</summary>
- Verified scenarios:
- Edge cases checked:
- What you did **not** verify:
Use this for author judgment that is not obvious from the diff. ClawSweeper can see touched files, but it cannot know which behavior you think is risky, why the risk is acceptable, or what mitigation reviewers should verify.
## Review Conversations
</details>
- [ ] I replied to or resolved every bot review conversation I addressed in this PR.
- [ ] I left unresolved only the conversations that still need reviewer or maintainer judgment.
## Current review state
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
What is the next action?
## Compatibility / Migration
- Backward compatible? (`Yes/No`)
- Config/env changes? (`Yes/No`)
- Migration needed? (`Yes/No`)
- If yes, exact upgrade steps:
What is still waiting on author, maintainer, CI, or external proof?
## Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write `None`.
Which bot or reviewer comments were addressed?
- Risk:
- Mitigation:
<details>
<summary>Review state guidance</summary>
Keep this as the durable state for review progress. If useful information appears in comments, fold the current next action or blocker back here so maintainers and ClawSweeper do not need to reconstruct state from comment history.
description: Optional exact live/E2E suite id, or comma-separated QA live lanes such as qa-live-matrix,qa-live-telegram; blank runs all selected live suites
required: false
@@ -134,7 +135,7 @@ jobs:
ref: ${{ github.ref_name }}
path: workflow
fetch-depth: 1
persist-credentials: false
persist-credentials: true
submodules: false
- name: Resolve target SHA
@@ -181,6 +182,11 @@ jobs:
else
echo "- Normal CI: skipped by rerun group"
fi
if [[ "$RERUN_GROUP" == "all" || "$RERUN_GROUP" == "performance" ]]; then
echo "- Product performance: \`OpenClaw Performance\` with \`target_ref=${TARGET_SHA}\`"
else
echo "- Product performance: skipped by rerun group"
fi
if [[ "$RERUN_GROUP" == "all" || "$RERUN_GROUP" == "plugin-prerelease" ]]; then
echo "- Plugin prerelease: \`Plugin Prerelease\` with \`target_ref=${TARGET_SHA}\`"
evidence_summary="Mantis ran Slack QA inside a Crabbox Linux VNC desktop, started an OpenClaw Slack gateway in that VM, opened Slack Web in the visible browser, and captured screenshot/video evidence."
expected_result="Slack QA and VM gateway setup pass"
checkpoint_artifacts='[]'
checkpoint_required=false
if [[ "$APPROVAL_CHECKPOINTS" == "true" ]]; then
evidence_summary="Mantis ran Slack native approval QA inside a Crabbox Linux VNC desktop, rendered pending/resolved approval checkpoints from the Slack API messages, and stored Slack QA artifacts."
expected_result="Slack native exec and plugin approval checkpoints pass"
screenshot_required=false
desktop_capture_inline=false
if [[ "$status" == "pass" ]]; then
checkpoint_required=true
fi
checkpoint_scenarios=()
if [[ "$scenario_label" == "approval-checkpoints" ]]; then
summary: "Mantis ran Slack QA inside a Crabbox Linux VNC desktop, started an OpenClaw Slack gateway in that VM, opened Slack Web in the visible browser, and captured screenshot/video evidence.",
summary: $summary,
scenario: $scenario,
comparison: {
candidate: { sha: $candidate_sha, expected: "Slack QA and VM gateway setup pass", status: $status, fixed: ($status == "pass") },
echo "Kova returned a partial release-gate verdict for filtered performance coverage, but all selected scenarios passed and no baseline regression was reported."
- ClawSweeper-owned schema, labels, close reasons, protected-label gates, maintainer-item gates, and mutation rules live in `openclaw/clawsweeper`.
- Review workers read this full root `AGENTS.md` before judging; no reliance on search snippets, `head`, partial ranges, local excerpts, or truncated copies. Then read every scoped `AGENTS.md` that owns touched paths.
- Optional integrations, providers, channels, skill bundles, MCP surfaces, and service workflows route to plugins, ClawHub, or owner repos when current seams suffice. Keep core items for missing core/plugin APIs, bundled regressions, security/core hardening, or maintainer product decisions.
- Plugin APIs, provider routing, auth/session state, persisted preferences, config loading, config/default additions, migrations, setup, startup checks, and fallback behavior are compatibility/upgrade-sensitive. Treat config breaks, new config/default surfaces, removed fallbacks, fail-closed changes, stricter validation, or new operator action as merge risk even with green CI when they can affect existing users, upgrades, provider/plugin behavior, or maintainer operations.
- For PRs that add, remove, or change config/default surfaces with possible compatibility, upgrade, provider/plugin, operator, setup, startup, or fallback impact, ClawSweeper review should emit a `reviewMetrics` entry when practical. The metric should name the count and direction of the changes, such as added, changed, or removed config/default surfaces, and explain why the metric matters before merge. When the metric indicates concrete merge risk, also surface the concern in `risks`, use `mergeRiskLabels` when the risk matches the label rubric, make `bestSolution` name the desired pre-merge state, and ensure `labelJustifications` explain the specific reason rather than restating the label.
- Review whole decision surfaces, not only the touched runtime, provider, channel, harness, plugin seam, or context path. Check sibling Codex/Pi-style runtimes, provider/model routing, channel delivery, gateway/protocol, plugin SDK, and context-management paths when relevant.
- One-sided fixes need sibling-surface proof, an explanation for why siblings are unaffected, or explicit follow-up work.
- Changelog findings: see Docs / Changelog.
- Public ClawSweeper comments prefer `https://docs.openclaw.ai/...` when a public docs page exists; structured evidence still cites repo files, lines, SHAs.
- Findings need current source, shipped/current behavior, tests/CI evidence, and dependency contract proof when dependency-backed behavior is involved. Validation is judged against touched and sibling surfaces plus this file's commands; real behavior proof matters for user-visible changes, with Telegram/Desktop proof for Telegram-visible behavior when feasible.
- Prefer findings for concrete behavior regressions, missing changed-surface proof, owner-boundary violations, security/API contract issues, or docs/config mismatches.
- Do not file findings for repo policy preference when changed code follows the relevant scoped guide and no user-visible, runtime, security, or maintainer-risk impact is shown.
- Docs AI: `openclaw/ask-molty`; see its `AGENTS.md`.
## Architecture
- Core stays plugin-agnostic. No bundled ids/defaults/policy in core when manifest/registry/capability contracts work.
@@ -34,16 +56,28 @@ Skills own workflows; root owns hard policy and routing.
- Internal bundled plugins ship in core dist; bundled-only facade loader ok only for them.
- External official plugins own package/deps and are excluded from core dist; core uses registry-aware `facade-runtime` or generic contracts.
- Externalizing a bundled plugin: update package excludes, official catalogs, docs, tests, and prove core runtime paths resolve installed plugin roots before root-dep removal.
- Legacy config repair belongs in `openclaw doctor --fix`, not startup/load-time core migrations. Runtime paths use canonical contracts.
- Runtime reads canonical config only. No silent compat for old/malformed config keys. If a config change invalidates existing files, add a matching `openclaw doctor --fix` migration. Core/auth config repairs live in core doctor; plugin-owned config repairs live in that plugin's doctor contract (`legacyConfigRules` / `normalizeCompatibilityConfig`).
- Fix shape: default to clean bounded refactor, not smallest patch. Move ownership to right boundary; delete stale abstractions, duplicate policy, dead branches, wrappers, fallback stacks.
- Fix observed local failures with generic product rules; do not hardcode names, ids, log phrases, or user examples in prod code unless they are an explicit contract.
- Tests may use observed examples, but prod literals need a short contract reason.
- Compatibility is opt-in. "Shipped" means reachable from a release Git tag; main/GitHub/PR/unreleased code is not shipped.
- Refactor default: one canonical path. Delete the old path unless user explicitly wants compat or the shipped public contract is obvious and cited.
- Keep old behavior only for an explicit public API/config/plugin SDK/data contract, tagged upgrade path, security/migration boundary, dependency contract, or observed prod state.
- If unsure, ask before preserving compat. Do not keep aliases, shims, fallback stacks, stale names, or obsolete tests just in case.
- Tests alone do not make internals contracts. If compat stays, name the contract and migration/removal plan in code, test, or PR.
- Lean code is a goal. No internal shims, aliases, legacy names, broad fallbacks, or defensive branches just to reduce diff or handle unrealistic edge cases.
- Handle real production states, shipped upgrade paths, security boundaries, and dependency contracts. Public/hostile/observed malformed input gets care; hypothetical malformed input does not.
- Public plugin SDK/API is the compat exception. New API first, old path only via named compat/deprecation metadata, docs, warnings when useful, tests for old+new, planned removal.
- Handle real production states, tagged upgrade paths, security boundaries, and dependency contracts. Public/hostile/observed malformed input gets care; hypothetical malformed input does not.
- Deprecate shipped public contracts only.
- Plugin SDK exception: shipped external API gets new API first plus named compat/deprecation, small tests/docs if useful, removal plan.
- Migrate internal/bundled callers to modern API in the same change. Do not let internal compat become permanent architecture.
- Channels are implementation under `src/channels/**`; plugin authors get SDK seams. Providers own auth/catalog/runtime hooks; core owns generic loop.
- Hot paths should carry prepared facts forward: provider id, model ref, channel id, target, capability family, attachment class. Do not rediscover with broad plugin/provider/channel/capability loaders.
- Do not fix repeated request-time discovery with scattered caches. Move the canonical fact earlier; reuse prepared runtime objects; delete duplicate lookup branches.
- Inline code comments: brief notes for tricky, bug-prone, or previously buggy logic.
- Gateway/plugin metadata is process-stable: installs, manifests, catalogs, generated paths, bundled metadata. Changes require restart or explicit owner reload/install/doctor flow.
- Runtime hot paths: no freshness polling (`stat`/`realpath`/JSON reread/hash). Reuse current snapshots, install records, discovery, lookup tables, root scopes, resolved paths.
- Process-local metadata caches ok when lifecycle-owned and bounded/single-slot. Freshness exceptions need named owner + tests.
- Inline comments: preserve reviewer context at the code site. Use for cross-path/state invariants, platform/dependency caps, deterministic ordering, compact encoded state, lifecycle ordering, ownership boundaries, session/id adoption, queue-depth symmetry, fallbacks, or intentional caller differences.
- Comment shape: 1-3 short lines; state why the branch/helper exists, what contract it protects, and the bad outcome if removed. Cite nearby constants/helpers when useful. No syntax narration, PR/user-specific lore, or obvious mechanics.
- Tests in a normal source checkout: `pnpm test <path-or-filter> [vitest args...]`, `pnpm test:changed`, `pnpm test:serial`, `pnpm test:coverage`; never raw `vitest`.
- Tests in a Codex worktree or linked/sparse checkout: avoid direct local `pnpm test*`; use `node scripts/run-vitest.mjs <path-or-filter>` for tiny explicit-file proof, or Crabbox/Testbox for anything broader.
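A comment that follows the inline-comment shape described above might look like the sketch below. The helper, the constant, and the reconnect behavior it mentions are all hypothetical and exist only to show the 1-3 line "why, contract, bad outcome" pattern.

```typescript
// Hypothetical constant and helper; not taken from the OpenClaw codebase.
const MAX_PENDING_ACKS = 32;

function trimPendingAcks(pending: string[]): string[] {
  // Keep only the newest acks: the queue is replayed on reconnect, so an
  // unbounded list would resend stale acks. Cap matches MAX_PENDING_ACKS.
  if (pending.length <= MAX_PENDING_ACKS) return pending;
  return pending.slice(pending.length - MAX_PENDING_ACKS);
}
```

The comment states why the branch exists and what breaks without it, and leaves the mechanics of `slice` unexplained.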
@@ -92,7 +125,6 @@ Skills own workflows; root owns hard policy and routing.
- Do not leave associated issues open for hypothetical future repros. Close with rationale; ask for a new issue or reopen only if concrete new evidence appears. Close comment states: decision, why, supported alternative, and what evidence would change the decision.
- PR review answer: bug/behavior, URL(s), affected surface, provenance for regressions when traceable, best-fix judgment, evidence from code/tests/CI/current or shipped behavior.
- Issue/PR final answer: last line is the full GitHub URL.
- Changelog: PR landings/fixes need one unless pure test/internal. Do not mention missing changelog as a review finding; Codex handles it during fix/landing.
- PR verification: before merge, post exact local commands, CI/Testbox run IDs, before/after proof when used, and known proof gaps.
- Issue fixed on `main` with proof: comment proof + commit/PR, then close.
- After landing or requested close/sweep: search duplicates; comment proof + canonical commit/PR/release before closing.
@@ -100,8 +132,10 @@ Skills own workflows; root owns hard policy and routing.
- `ship` that fixes an issue: after push, comment proof + commit link, then close the issue.
- GH comments with backticks, `$`, or shell snippets: use heredoc/body file, not inline double-quoted `--body`.
- PR create: real body required. Include Summary + Verification; mention refs, behavior, and proof.
- PR create/refresh: keep PR branches takeover-ready. Use a branch maintainers can push to, or for fork PRs ensure `maintainer_can_modify` / GitHub's `Allow edits by maintainers` is enabled unless explicitly told otherwise or GitHub's Actions/secrets warning makes that unsafe.
- GitHub issue/PR create: read `$agent-transcript`; ask about sanitized transcript logs when available.
- Real behavior proof section is parsed. Use exact `field: value` labels: `Behavior addressed`, `Real environment tested`, `Exact steps or command run after this patch`, `Evidence after fix`, `Observed result after fix`, `What was not tested`.
- PR artifacts/screenshots: attach to PR/comment/external artifact store. Do not commit `.github/pr-assets`.
- PR artifacts/screenshots: attach to PR/comment/external artifact store. Never push screenshots, videos, proof images, or proof assets to OpenClaw or any product repo branch, including temp artifact branches. Use Crabbox artifact publishing plus the manifest URL. Do not commit `.github/pr-assets`.
- CI polling: exact SHA, relevant checks only, minimal fields. Skip routine noise (`Auto response`, `Labeler`, docs agents, performance/stale). Logs only after failure/completion or concrete need.
- Maintainers: may skip/ignore `Real behavior proof` when local tests or Crabbox verified behavior; record proof in PR verification.
- `/landpr`: use `~/.codex/prompts/landpr.md`; do not idle on `auto-response` or `check-docs`.
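Because the Real behavior proof section is parsed by exact `field: value` labels, a local pre-check can catch missing fields before a PR is opened. The sketch below is an assumption about how such a check could work, not the parser ClawSweeper actually uses. The six labels are copied from the bullet above; `checkProofSection` and `ProofCheck` are hypothetical names.

```typescript
// Labels copied from the bullet above; the parser itself is a hypothetical sketch.
const PROOF_LABELS = [
  "Behavior addressed",
  "Real environment tested",
  "Exact steps or command run after this patch",
  "Evidence after fix",
  "Observed result after fix",
  "What was not tested",
] as const;

type ProofLabel = (typeof PROOF_LABELS)[number];

interface ProofCheck {
  fields: Partial<Record<ProofLabel, string>>;
  missing: ProofLabel[];
}

function checkProofSection(section: string): ProofCheck {
  const fields: Partial<Record<ProofLabel, string>> = {};
  for (const rawLine of section.split("\n")) {
    // Accept an optional leading "- " bullet, then "Label: value".
    const line = rawLine.replace(/^\s*-\s*/, "").trim();
    for (const label of PROOF_LABELS) {
      const prefix = `${label}:`;
      if (line.startsWith(prefix)) {
        const value = line.slice(prefix.length).trim();
        // A label with an empty value still counts as missing.
        if (value !== "") fields[label] = value;
      }
    }
  }
  const missing = PROOF_LABELS.filter((label) => fields[label] === undefined);
  return { fields, missing };
}
```

Matching on the exact label text means a reworded label is reported as missing, which is the failure this check is meant to surface early.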
@@ -112,14 +146,27 @@ Skills own workflows; root owns hard policy and routing.
- No `@ts-nocheck`. Lint suppressions only intentional + explained.
- External boundaries: prefer `zod` or existing schema helpers.
- Cross-function state: when valid combos matter, return a closed mode/result shape. Avoid parallel nullable fields or derived booleans that callers must keep in sync; make impossible states unrepresentable.
- Formatter-friendly shape: when oxfmt explodes an expression vertically, extract named booleans, payloads, or small helpers. Do not change width or use format-ignore for local compactness.
- Calls should be boring: complex decisions happen above; call args/object fields are names, literals, or simple property reads.
- Prefer early returns over nested condition pyramids. Split code into gather -> normalize -> decide -> act.
- Use named intermediates only for domain meaning or readability; avoid temp-variable soup.
- Code size matters. Prefer small clear code; maintainability includes not growing LOC without payoff.
- Refactors should delete about as much local complexity as they add. If LOC grows, the new ownership/API needs to clearly pay for it.
- Before adding helpers/files, check whether existing code can absorb the behavior with less new surface.
- Keep APIs narrow: export only current caller needs; keep types/helpers local by default.
- Return the smallest useful shape. Avoid broad result objects, flags, metadata unless callers use them.
- Avoid adapter layers that only rename fields. Move real responsibility or leave code local.
- Inline simple one-use objects/spreads when clearer. Extract only when it removes duplication or hard logic.
- Tests prove behavior/regressions, not every internal branch.
- For non-trivial refactors, check `git diff --numstat` before closeout. If LOC grew, trim or explain why.
- Prefer existing narrow helpers over repeated casts/guards. Add local helpers when 2+ nearby call sites share real boundary logic.
- Prefer ctor parameter properties for injected deps/config. Do not ban them for erasable-syntax purity.
- Prefer `satisfies` for registries/config maps; derive types from schemas when a runtime schema already exists.
- Table-drive repetitive tests when it reduces code and keeps failure names clear.
- Dynamic import: no static+dynamic import for same prod module. Use `*.runtime.ts` lazy boundary. After edits: `pnpm build`; check `[INEFFECTIVE_DYNAMIC_IMPORT]`.
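Several of the code-shape rules above (closed result shapes, early returns, and `satisfies` for config maps) can be illustrated together. This is a minimal sketch in a made-up domain: `ProbeResult`, `MODE_LABELS`, and `describeProbe` are hypothetical and do not exist in the codebase.

```typescript
// Hypothetical domain: none of these names exist in the OpenClaw codebase.
// Closed result shape: each mode carries only the fields that are valid for it,
// so callers cannot observe a timeout that also has a latency.
type ProbeResult =
  | { mode: "ok"; latencyMs: number }
  | { mode: "timeout"; afterMs: number }
  | { mode: "error"; message: string };

// `satisfies` checks that every mode has a label while keeping literal value types.
const MODE_LABELS = {
  ok: "reachable",
  timeout: "timed out",
  error: "failed",
} satisfies Record<ProbeResult["mode"], string>;

// Early returns instead of a nested condition pyramid.
function describeProbe(result: ProbeResult): string {
  if (result.mode === "ok") {
    return `${MODE_LABELS.ok} in ${result.latencyMs}ms`;
  }
  if (result.mode === "timeout") {
    return `${MODE_LABELS.timeout} after ${result.afterMs}ms`;
  }
  return `${MODE_LABELS.error}: ${result.message}`;
}
```

With parallel nullable fields such as `latencyMs?: number` and `error?: string`, each caller would have to re-derive which combination is valid. The union moves that decision into the type.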
@@ -138,12 +185,12 @@ Skills own workflows; root owns hard policy and routing.
## Docs / Changelog
- Use `$openclaw-docs` for docs writing/review. Docs change with behavior/API.
- Use `$technical-documentation` for docs writing/review. Docs change with behavior/API.
- Codex harness upgrade (`extensions/codex/package.json` `@openai/codex`): refresh `docs/plugins/codex-harness.md` model snapshot from the new harness `model/list`.
- Docs final answers: include relevant full `https://docs.openclaw.ai/...` URL(s). If issue/PR work too, GitHub URL last.
- Changelog entries: active version `### Changes`/`### Fixes`; single-line bullets only.
- Contributor PR authors should not edit `CHANGELOG.md`; maintainer/AI adds entries during landing/merge.
- Contributor-facing changelog entries thank credited human `@author`. Never thank bots, `@openclaw`, `@clawsweeper`, or `@steipete`; if unknown, omit thanks.
- `CHANGELOG.md`: release-owned. Do not edit for normal PRs, direct `main` fixes, or `ship it`; only explicit release/changelog generation may rewrite it. Do not ask contributors/agents for changelog edits.
- User-facing `fix`/`feat`/`perf`: put release-note context in PR body, squash message, or direct commit: behavior, surface, issue/PR refs, credited human author/reporter.
- Release generation: derive `CHANGELOG.md` from merged PRs + all direct `main` commits. Entries: active `### Changes`/`### Fixes`, single-line, thank credited humans; never thank bots/forbidden handles: `@openclaw`, `@clawsweeper`, `@codex`, `@steipete`.
## Git
@@ -152,7 +199,7 @@ Skills own workflows; root owns hard policy and routing.
- No manual stash/autostash unless explicit. No branch/worktree changes unless requested.
- `main`: no merge commits; rebase on latest `origin/main` before push. After one green run plus clean rebase sanity, do not chase moving `main` with repeated full gates.
- User says `commit`: your changes only. `commit all`: all changes in grouped chunks. `push`: may `git pull --rebase` first.
- User says `ship it`: changelog if needed, commit intended changes, pull --rebase, push.
@@ -174,7 +222,7 @@ Skills own workflows; root owns hard policy and routing.
- SwiftUI: Observation (`@Observable`, `@Bindable`) over new `ObservableObject`.
- Mac gateway: dev watch = `pnpm gateway:watch`; managed installs = `openclaw gateway restart/status --deep`; logs = `./scripts/clawlog.sh`. No launchd/ad-hoc tmux.
- Mac app permission testing: stable app path + real signing identity required. No `--no-sign`, `SIGN_IDENTITY=-`, or raw debug binary; TCC prompts/listing won't stick.
- Version bump surfaces live in `$openclaw-release-maintainer`.
- Version bump surfaces live in `$release-openclaw-maintainer`.
- Crabbox/WebVNC human demos: keep remote desktop visible/windowed; no fullscreen remote browser unless video/capture-style output.
- ClawSweeper ops: `$clawsweeper`. Deployed hook sessions may post one concise `#clawsweeper` note only when surprising/actionable/risky; if using message tool, reply exactly `NO_REPLY`.
- Agent and Codex runtime recovery is steadier: subagents keep cwd/workspace separation, hook context stays prompt-local, session locks release on timeout abort, stale restart continuations are avoided, and Codex app-server/helper failures no longer tear down shared runtime state. (#87218, #86875, #87409, #87399, #87375)
- Channel delivery and session identity got safer across outbound plugin hooks, Matrix room ids, iMessage reactions/approvals, Slack final replies, Discord recovered tool warnings, and Microsoft Teams service URL trust checks. (#73706, #75670, #87366, #87451, #87334)
- CLI, auth, doctor, and provider paths fail faster and recover more clearly: malformed numeric/version options are rejected, OAuth and local service startup requests are bounded, legacy `api_key` auth profiles migrate to canonical form, and restart guidance is actionable. (#87398, #86281, #87361)
- Plugin and Gateway hot paths do less repeated work while preserving cache correctness for install records, config JSON parsing, tool search catalogs, session stores, manifest model rows, auto-enabled plugin config, browser tokens, and viewer assets. (#86699)
- Release, QA, and E2E validation now bound more log, artifact, harness, and cross-OS waits so failing lanes produce proof instead of hanging or false-greening.
### Changes
- Status: show active subagent details in status output.
- Diffs: split the default language pack and expand default Diffs language coverage while keeping the host floor aligned. (#87370, #87372) Thanks @RomneyDa.
- ClawHub: add plugin display names plus skill verification and trust surfaces. (#87354, #86699) Thanks @thewilloftheshadow and @Patrick-Erichsen.
- Agents/Codex: keep spawned agent cwd/workspace state separated, keep hook context prompt-local, release session locks on timeout abort, avoid session event queue self-wait, preserve shared app-server state across startup or helper failures, keep native hook relay alive across restarts, route workspace memory through tools, resolve Codex runtime models first, report quarantined dynamic tools, format `skills` command output, and bound compaction/steering retries. (#87218, #86875, #86123, #87399, #87375, #87383, #87400) Thanks @mbelinky, @Alix-007, @luoyanglang, @yetval, and @sjf.
- Channels: thread canonical session keys into outbound hooks, preserve Matrix room-id case, keep fallback tool warnings mention-inert, retain delivered Slack final replies during late cleanup, continue iMessage polling after denied reactions, suppress duplicate native exec approvals, preserve Telegram SecretRef prompt config, suppress Discord recovered tool warnings, and block untrusted Teams service URLs. (#73706, #75670, #87366, #87451, #87334) Thanks @zeroaltitude, @lukeboyett, @xiaotian, and @eleqtrizit.
- CLI/auth/doctor/providers: reject malformed numeric/timeout/subcommand-version inputs, wait for respawn child shutdown, bound Codex and GitHub Copilot OAuth/token requests, warm provider auth off the main thread, honor Codex response timeouts, bound local service startup, resolve GPT-5.5 without cached catalog, migrate legacy memory auto-provider config, rewrite non-canonical `api_key` auth profiles, and make doctor restart follow-ups actionable. (#87398, #86281, #87361) Thanks @Patrick-Erichsen, @samzong, @giodl73-repo, and @alkor2000.
- Gateway/security/session state: expire browser tokens after auth rotation, scope assistant idempotency dedupe, drain probe client closes, avoid stale restart continuation reuse, preserve retry-after fallbacks, bound webchat image and artifact transcript scans, include seconds in inbound metadata timestamps, and evict current plugin-state namespaces at row caps.
- Performance: trust install-record caches between reloads, prefer native JSON parsing, reuse unchanged tool-search catalogs, skip unchanged store serialization, add precomputed session patch writers, reduce store clone allocations, cache manifest model catalog rows and auto-enabled plugin config, and slim current metadata identity caches.
- Docker/release/QA: package runtime workspace templates, stream cross-OS served artifacts, preserve sparse Crabbox run artifacts, bound OpenClaw instance logs, plugin gauntlet relay logs, MCP channel buffers, kitchen-sink scans, agent-turn assertions, and release scenario logs, and keep release/google live guards current.
## 2026.5.27
### Highlights
- Safer local/runtime boundaries: OpenClaw now rejects unsafe command wrappers, malformed CLI numeric options, unsafe Node runtime env overrides, no-auth Tailscale exposure, and non-admin device-role pairing approvals before they can affect live runs. (#87308, #87305, #87292, #87146)
- Matrix and auto-reply delivery are steadier: mention previews stay inert, final mention replies deliver normally, shared-DM notices are awaited, MXID parsing ignores filenames, and reasoning-prefixed `NO_REPLY` responses stay suppressed.
- Provider and agent reliability improved across OpenAI-compatible embeddings, cached token usage, Anthropic/Codex/Claude runtime state, unsupported tool-schema quarantine, heartbeat templates, and session fallback errors. (#85269, #82062, #85416, #86855)
- Plugin and package release paths got tighter: Pixverse ships as an external video plugin with region selection, package exclusions and shrinkwrap inventory match the published npm shape, and release/package smoke commands fail bounded instead of hanging.
- Gateway hot paths do less rediscovery by reusing current plugin metadata fingerprints, stable plugin index fingerprints, read-only session metadata, active working stores, status fast paths, and auth/env snapshots. (#86439)
### Changes
- Memory: add a core OpenAI-compatible embedding provider for local and hosted OpenAI-style endpoints, with config, doctor, and docs support. (#85269) Thanks @dutifulbob.
- Plugin SDK: mark memory-specific embedding provider registration as deprecated compatibility and surface non-bundled usage in plugin compatibility diagnostics. (#85072) Thanks @mbelinky.
- Pixverse: add video generation provider support, API region selection, and external plugin publishing.
- Plugins: expose approval action metadata for plugin-driven approval surfaces.
- Doctor: validate runtime tool schemas for every configured embedded agent while skipping ACP-only profiles, so bad non-default plugin or MCP tools are reported before assistant turns.
- Telegram: route `sendMessage` action replies through durable outbound delivery so completed agent responses remain retryable when the gateway send path times out. (#87261) Thanks @mbelinky.
- Agents/providers: add OpenAI-compatible cache retention, forward cached token usage in chat completions, preserve runtime context before active user turns, strip stale Anthropic thinking, load Claude CLI OAuth for Pi auth profiles, avoid false Codex runtime live switches, and quarantine unsupported tool schemas. (#82062, #87167, #86855)
- Gateway/performance: cache plugin metadata fingerprints and stable plugin index fingerprints, borrow read-only session metadata safely, keep the active session working store hot, keep status on a bounded fast path, and preserve model auth profile suffixes. (#86439)
- Package/install/release: align npm package exclusions and inventory, omit unpacked test helpers, skip Homebrew until macOS packages need it, cap tsdown heap in containers, bound install/release smoke waits, and harden post-publish verification.
- Codex/Auth: bound ChatGPT OAuth token exchange and refresh requests, and honor cancellation across Codex and Anthropic OAuth login flows.
- QA/E2E/CI: bound Telegram, kitchen-sink, Open WebUI, ClawHub, MCP, Discord, realtime, labeler, and GitHub API waits; fail empty explicit test, live-media, gateway CPU, plugin gauntlet, and beta-smoke runs instead of false-greening.
- Agents/Codex: keep spawned agent bootstrap files rooted in the agent workspace while running task commands, transcripts, and compaction from the requested cwd. (#87218) Thanks @mbelinky.
## 2026.5.26
### Highlights
- Faster Gateway and replies: startup avoids repeated plugin, channel, session, usage-cost, warning, scheduled-service, and filesystem scans; visible replies separate user-facing sends from slower follow-up work; Gateway runtime/session caches churn less under load.
- Transcripts are core: transcript-backed meeting summaries, source-provider chunks, cleaned user turns, media provenance, Codex mirrors, WebChat replies, and CLI/TUI replay now share a single, more reliable transcript path.
- More channels are production-ready: Telegram keeps typing/progress context and forum topics; iMessage handles attachment roots, remote media staging, and duplicate local Messages sources; WhatsApp restores group/media behavior; Discord improves voice playback and model picking; and Signal/iMessage/WhatsApp get reaction approvals.
- Better voice and Talk: realtime Talk runs can be inspected, steered, cancelled, or followed up from Web UI and Discord voice; wake-name handling is more tolerant without letting ambient speech trigger agents.
- Safer content boundaries: Browser snapshot reads honor SSRF policy, system-event text cannot spoof nested prompt markers, fetched file text is wrapped as external content, ClickClack inbound sender allowlists run before agent dispatch, stale device tokens are rejected, and serialized tool-call text is scrubbed from replies.
- Providers, Codex, and local models are steadier: named auth profiles, OpenAI sampling params, Codex app-server resume/timeout/usage-limit recovery, dynamic tool-schema guards, xAI usage-limit surfacing, Ollama top-p normalization, and local approval resolution reduce provider-specific dead ends.
- More reliable install/update/release paths: Alpine installs, trusted runtime fallback roots, stable update channels, Docker/package timeouts, Windows Scheduled Tasks, Windows/macOS proof lanes, Testbox/Crabbox delegation, plugin publish checks, and macOS runner bootstraps all got hardened.
- Transcripts: add core transcript capture and source-provider support for transcript-backed meeting summaries, including the renamed Transcripts docs, CLI surface, source-provider chunks, and cleaned user-turn persistence.
- Auth: add named model login profiles and supported credential migration for Hermes, OpenCode, and Codex auth profiles, with explicit opt-out and non-interactive controls. (#85667) Thanks @fuller-stack-dev.
- Diagnostics: trace gateway secret preparation, classify skill/tool usage, surface model stream progress, add OpenTelemetry LLM content spans, and expose alertable telemetry for blocked tools, failover, stale sessions, liveness, oversized payloads, and webhook ingress. (#83019, #80370, #86191)
- Channels: add Signal reaction approvals, iMessage thumb approval reactions, and WhatsApp thumb approval reaction support so mobile approval flows work without textual `/approve` commands. (#85894, #85952, #85477)
- Agents/API: forward OpenAI sampling params through the Gateway and expose estimated context-budget status for active agent runs. (#84094)
- TUI/status: queue prompts submitted while an agent is busy and show explicit fast-mode state plus richer systemd Gateway hygiene in status output. (#86722, #87115, #86976)
- Exec approvals: hide durable approval actions that are unavailable for the current prompt and keep approval runtime tokens local-only so stale prompts cannot offer misleading controls. (#86270, #86359)
- Plugin SDK: add reaction approval helpers and keep diagnostic event root exports discoverable across function-name and alias-bound module graphs. (#86735, #87084)
- Android/iOS: add the Android pair-new-gateway action and improve mobile Talk mode surfaces, including iOS realtime Talk mode and Android offline voice/gateway recovery. (#86798, #86355) Thanks @ngutman.
- Performance: cache plugin metadata snapshots, package realpaths, stable gateway metadata, model cost indexes, channel resolution, usage-cost indexes, and session/auth hot-path facts so common Gateway and reply paths do less rediscovery. (#84649, #85843, #86517, #86678)
- Voice: expose shared realtime turn-context tracking through the realtime voice SDK and reuse it for Discord speaker attribution and wake-name context recovery.
- Voice: reuse shared realtime output activity tracking in Google Meet command and node audio bridges, including recent-output checks for local barge-in detection.
- Voice: expose shared realtime output activity tracking through the realtime voice SDK and reuse it for Discord playback activity and barge-in decisions.
- Voice: expose shared realtime consult question matching, speakable-result extraction, and alias-aware forced-consult coordination through the realtime voice SDK, then reuse it in Gateway Talk, Voice Call, and Discord voice paths.
- Voice: share activation-name matching and consult-transcript screening through the realtime voice SDK so Discord, browser voice, and meeting surfaces can reuse one implementation.
- Cron: default `cron.maxConcurrentRuns` to 8 so scheduled automations and their isolated agent turns can make progress in parallel without explicit configuration.
- QA-Lab: add `qa coverage --match <query>` so focused proof selection can discover matching scenarios from existing metadata before running live or remote lanes.
- Discord/model picker: surface an alpha-bucket select (e.g. `A–G (12) · H–N (18) · O–Z (5)`) when the provider list or a provider's model list exceeds 25 items, so configs with `provider/*` wildcards stay one click from the right page instead of paginating through prev/next; falls back to numeric chunks when every item shares the same first letter. (#86181) Thanks @rendrag-git.
- Control UI: add an ephemeral Activity tab for sanitized live tool activity summaries without persisting raw telemetry. Fixes #12831. Thanks @BunsDev.
- Build: include `ui:build` in the `full` and `ciArtifacts` profiles of `scripts/build-all.mjs` so `pnpm build` always rebuilds `dist/control-ui` after `tsdown` cleans `dist`, removing the second-command requirement and the missing-asset failure mode for source/runtime installs and CI artifact uploads. (#85206)
- iOS: improve Talk mode with direct realtime voice sessions, compact toolbar status, and responsive voice waveform feedback. (#86355) Thanks @ngutman.
- Media: replace the Sharp image backend with Rastermill for metadata, resizing, EXIF orientation, and PNG alpha-preserving optimization so OpenClaw no longer installs Sharp or the WhatsApp Jimp fallback for image processing. (#86437)
- Codex: update the bundled Codex CLI to 0.134.0 and keep native compaction disabled for budget-triggered app-server turns so OpenClaw owns the recovery boundary. (#86772)
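
The Discord model picker entry above describes bucketing a long list by first letter once it exceeds 25 items, with numeric chunks as the fallback. The following is a minimal sketch of that idea, not the shipped implementation: `bucketItems`, `Bucket`, `firstLetter`, and `numericChunks` are hypothetical names, the greedy merge of adjacent letters is an assumed strategy, and the fallback is generalized (as an assumption) to any case where one letter alone would overflow a bucket, which includes the "every item shares the same first letter" case.

```typescript
// Hypothetical sketch; names and merge strategy are assumptions, not OpenClaw code.
const MAX_SELECT_OPTIONS = 25; // the 25-item threshold named in the entry

interface Bucket {
  label: string; // e.g. "A–G (12)" or "1–25 (25)"
  items: string[];
}

// Map an item to its bucket key: an uppercase A-Z letter, or "#" for anything else.
function firstLetter(item: string): string {
  const ch = item.trim().charAt(0).toUpperCase();
  return ch >= "A" && ch <= "Z" ? ch : "#";
}

// Fallback: fixed-size chunks labeled by 1-based position.
function numericChunks(items: string[], size: number): Bucket[] {
  const out: Bucket[] = [];
  for (let i = 0; i < items.length; i += size) {
    const slice = items.slice(i, i + size);
    out.push({ label: `${i + 1}–${i + slice.length} (${slice.length})`, items: slice });
  }
  return out;
}

function bucketItems(items: string[], maxPerBucket: number = MAX_SELECT_OPTIONS): Bucket[] {
  const sorted = [...items].sort((a, b) => a.localeCompare(b));
  if (sorted.length <= maxPerBucket) {
    return [{ label: `All (${sorted.length})`, items: sorted }];
  }

  // Group items by first letter.
  const groups = new Map<string, string[]>();
  for (const item of sorted) {
    const key = firstLetter(item);
    const group = groups.get(key) ?? [];
    group.push(item);
    groups.set(key, group);
  }

  // Assumed generalization: if any single letter overflows a bucket
  // (always true when every item shares one letter), use numeric chunks.
  if (Array.from(groups.values()).some((g) => g.length > maxPerBucket)) {
    return numericChunks(sorted, maxPerBucket);
  }

  // Greedily merge adjacent letters while the bucket stays within the limit.
  const buckets: Bucket[] = [];
  let letters: string[] = [];
  let current: string[] = [];
  const flush = (): void => {
    if (current.length === 0) return;
    const first = letters[0];
    const last = letters[letters.length - 1];
    const range = first === last ? `${first}` : `${first}–${last}`;
    buckets.push({ label: `${range} (${current.length})`, items: current });
    letters = [];
    current = [];
  };
  for (const letter of Array.from(groups.keys()).sort()) {
    const group = groups.get(letter) ?? [];
    if (current.length + group.length > maxPerBucket) flush();
    letters.push(letter);
    current.push(...group);
  }
  flush();
  return buckets;
}
```

Under these assumptions, a list with 12 entries starting with `a`, 18 starting with `h`, and 5 starting with `o` yields two buckets labeled `A (12)` and `H–O (23)`, while 30 entries that all start with `g` fall back to `1–25 (25)` and `26–30 (5)`. The real picker may draw its letter ranges differently.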
### Fixes
- Memory/security: reject prompt-like text submitted through the explicit `memory_store` tool before embedding or storage, matching the existing auto-capture prompt-injection filter. (#87142)
- Gateway/security: enable the default auth rate limiter for remote non-browser and HTTP gateway auth failures when `gateway.auth.rateLimit` is unset, while preserving the loopback exemption. (#87148)
- Prompt hardening: route untrusted group prompt metadata through sanitized untrusted structured context while preserving trusted operator-configured group system prompts and aligning the plugin SDK docs/test helpers. (#87144)
- Security/content boundaries: validate Browser snapshot tab URLs against SSRF policy before ChromeMCP or direct CDP reads, sanitize queued system-event text so untrusted plugin/channel labels cannot spoof nested prompt markers, wrap fetched file text and metadata as external content, apply ClickClack `allowFrom` sender allowlists before agent dispatch, reject RPCs from invalidated device-token clients during rotation, require staged sandbox media refs, and scrub serialized tool-call text from replies. (#78526, #87094, #87062, #83741, #70707, #86924) Thanks @zsxsoft, @ttzero25, and @mmaps.
- Transcripts/user turns: persist CLI, WebChat, media, follow-up, hook, and Codex-mirror user turns to the admitted session target; keep cleaned transcript text, inline image routing, provenance metadata, replay hooks, and fallback paths idempotent when runtimes fail or restart.
- TUI/status/onboarding/UI: queue busy TUI prompts instead of dropping them, preserve the configured default model during onboarding, show failed tool results as errors, show config-open failures in Control UI, keep status JSON plugin scans healthy, preserve xAI usage-limit errors locally, and expose explicit fast-mode/systemd state. (#86722, #87000, #85786, #87108, #87001, #86614, #87115, #86976)
- Plugin commands/SDK: preserve plugin LLM command auth, bind native plugin command dispatch to the host agent's LLM auth, keep `onDiagnosticEvent` exports discoverable through `Function.name`, stabilize diagnostic event root aliases, correlate pathless read diagnostics, suppress transient runner failures in channel command paths, and repair local approval resolution. (#85936, #87084, #86977, #87069, #86771)
- Codex/providers: keep WebChat delivery hints out of user prompts, avoid false queued-terminal idle timeouts, share the native hook relay registry, quarantine unsupported dynamic tool schemas, preserve Claude resumed-session system prompts, normalize greedy Ollama `top_p`, preserve per-agent thinking defaults for ingress runs, and avoid native compaction takeover on budget-triggered Codex turns. (#87096, #73950, #87049, #86689, #86772)
- Gateway/perf/release: reuse startup-warning metadata and prepared auth stores, avoid cloning live-switch and lifecycle session caches on read paths, defer warning and scheduled-service fallback imports, trim Gateway session/startup/runtime CPU churn, skip duplicate turn session touches, stop chat timeout fallback cascades, drop stale subagent announce history, bound benchmark/watch/kitchen-sink teardown waits, bound macOS/package/onboarding/plugin smoke commands, bound install finalization probes, resolve Parallels npm-update commands from guest `PATH`, and bootstrap raw AWS macOS Node/pnpm commands through `/usr/bin/env`. (#86997)
- Reply/source delivery: keep TUI, Control UI, media, TTS, transcript, and Codex source-reply finals live without duplicate terminal events or stale replay artifacts.
- Agents/replay: repair legacy tool results before replay, preserve `sessions_spawn` transcript payloads, restore current guard checks, stage sandboxed workspace media, and keep duplicate transcript tool display metadata from reappearing. (#82203, #86934, #87025) Thanks @martingarramon, @vincentkoc, and @joshavant.
- Agents/sessions: handle active-fallback failures in `sessions_send` so fallback routing reports the real failure and does not leave callers with an ambiguous dropped send. (#86638)
- Agents/hooks/subagents: enforce default hook agent allowlists, recover failed subagent lifecycle completions, and keep node task lifecycle cleanup from closing the Gateway listener. (#86101)
- Codex: project newer OpenClaw chat history into resumed app-server threads and keep Codex turn timeouts inside the Codex runtime boundary so timeouts do not poison shared app-server clients or fall through to unrelated provider fallback. (#86677, #86476) Thanks @TurboTheTurtle and @pashpashpash.
- Config/doctor/update: narrow profiled tool-section doctor repair, keep runtime-injected legacy web-search provider config out of user-authored config validation, and keep prerelease tags excluded from stable updater resolution. (#87030, #86818, #86559) Thanks @joshavant, @luoyanglang, and @stevenepalmer.
- Doctor/runtime: validate active bundled MCP tool schemas through the same runtime projection path so unsupported MCP input schemas are reported and quarantined instead of poisoning assistant startup.
- CLI/Windows: add a Windows-only stack-size respawn for stack-heavy startup paths, default CLI logs to local timestamps, and validate timeout/banner TTY state more strictly. (#87031, #85387) Thanks @giodl73-repo and @vincentkoc.
- Locking/security: require owner identity proof before stale plugin lock removal, memoize session lock owner arguments, and avoid writing default exec approval stores unless policy state actually changed. (#86814, #86964) Thanks @Alix-007 and @vincentkoc.
- Install/release: bound Docker package build, inventory, pack, and tarball preparation with process-group timeouts; pin shrinkwrap patch drift to the pnpm lock; harden macOS restart and dSYM packaging; and run release Docker/live timeout wrappers in the foreground so child processes cannot wedge gates.
- QA/Telegram: bound Telegram user credential tar and broker calls so live proof setup fails with a timeout instead of waiting for the outer Crabbox job deadline.
- QA/Tool Search: bound gateway E2E HTTP probes, run only the fixture plugin, and clean up temporary fixture trees after the compact tool-catalog proof completes.
- Telegram/network: treat `ENETDOWN` as a transient pre-connect network failure so Telegram sends, gateway unhandled-rejection handling, and cron network retries follow the same recovery path as sibling network outages. (#86762) Thanks @TurboTheTurtle.
- Telegram: preserve inbound text entities, overlapping DM replies, account topic cache sidecars, outbound reply context, targeted bot-command mentions, durable group retry targets, forum topic names, and native progress callbacks. (#83873, #85361, #85555, #85656, #85709, #86299, #86553) Thanks @SebTardif, @luoyanglang, and @neeravmakwana.
- iMessage: read image attachments from local Messages attachment roots, dedupe duplicate local Messages-source accounts, seed direct DM history, fix image/group media attachment commands, advance catchup cursors after live handling, and keep slash-command acknowledgements in the source conversation. (#82642, #85475, #86569, #86705, #86706, #86770) Thanks @homer-byte, @TurboTheTurtle, @swang430, and @OmarShahine.
- WhatsApp/QQ/Twitch/IRC/Slack: restore WhatsApp ack identity and group-drop warnings, make QQ Bot media respect `OPENCLAW_HOME`, serialize Twitch auth disconnects, store IRC channel routes canonically, and keep Slack downloaded files out of reply media. (#83833, #85309, #85777, #85794, #85906, #86318, #86697) Thanks @sliverp, @neeravmakwana, and @Kailigithub.
- Discord/voice: improve voice playback and wake replies, bucket large model picker menus, merge media captions into one message, route metadata through configured proxies, restore numeric channel sends, suppress self-reply echoes, and tighten wake matching without breaking fuzzy wake phrases. (#80227, #86238, #86487, #86571, #86595, #86601)
- Agents/runtime: enforce session lock max-hold reclaim, release embedded-attempt locks on all exits, treat aborted subagent runs as terminal, avoid runtime model hydration on hot paths, disclose scoped session list counts, derive overflow budgets from provider errors, and keep fallback errors scoped to the active model candidate. (#70473, #85764, #86014, #86134, #86427, #86944) Thanks @openperf, @fuller-stack-dev, @zhangguiping-xydt, and @ferminquant.
- Config/update/doctor: retry config recovery after failed backup restore, skip shell env fallback on Windows, exclude prerelease tags from the stable git channel, support deep config edits, warn instead of aborting on unreadable cron stores, prune stale bundled plugin paths, and avoid duplicate restart prompts when the Gateway is already healthy. (#85739, #85787, #86060, #86260, #86384, #86533) Thanks @liaoyl830.
- Install/release: support Alpine CLI installs and runtime floors, prefer trusted startup argv runtime fallback roots, reject stale CLI node runtimes, avoid npm `min-release-age` installer failures, bound npm/package/Docker install phases, restore config parent ownership in Docker, seed Docker lockfile package tarballs before prune, make release/plugin prerelease checks fail closed instead of hanging or false-greening, and use host-visible Crabbox local work roots for Docker-backed proof. (#85491)
- Windows daemon: keep Scheduled Task gateway launches running on battery power and avoid workgroup-machine prompts for a domain user during task installation. (#59299)
- Security: avoid printing Gateway tokens in Docker, validate plugin model-pattern regexes safely, escape transcript metadata field names, harden session allowlist glob matching, audit Claude permission overrides under YOLO, and require explicit allow for ACP auto approvals. (#85849, #85934, #86046, #86557)
- Media/images: replace Sharp with Rastermill, keep EXIF normalization best-effort, normalize HEIC/HEIF before image descriptions, route Codex image API keys through OpenAI, preserve image compression metadata, and auto-scale live tool result caps. (#85776, #86037, #86437, #86857, #86923)
- Memory: prevent semantic vector indexes from silently degrading when embeddings are unavailable, stop doctor OOMs on large session stores, preserve sidecar hooks/artifacts, write fallback dream diaries, use CJK-aware dreaming dedupe, and avoid per-file watcher FD fan-out. (#80613, #82928, #85060, #85704, #85967, #86701) Thanks @brokemac79, @openperf, and @yaaboo-gif.
- Agents/sessions: include visibility metadata on restricted `sessions_list` results so scoped counts are clearly reported without widening access or exposing hidden-session counts. (#86944) Thanks @ferminquant.
- Gateway/DNS: validate wide-area discovery domains before deriving zone paths or writing zone files, so invalid `discovery.wideArea.domain` and `dns setup --domain` values fail with a DNS-name diagnostic instead of falling through to unrelated configuration errors. Thanks @mmaps.
- Agents/BTW: route fallback side-question streams through the embedded stream resolver so Anthropic-compatible MiniMax requests use the same capped transport as normal chat. (#86312) Thanks @neeravmakwana.
- Telegram: treat `/command@TargetBot` bot-command entities as explicit mentions for the addressed bot so `requireMention` groups no longer drop targeted commands or captions. Fixes #84462. (#86553) Thanks @luoyanglang.
- CI: bound Docker/Bash E2E tarball npm installs with `OPENCLAW_E2E_NPM_INSTALL_TIMEOUT` so package, onboarding, plugin, and upgrade lanes fail instead of hanging on a stuck npm install.
- CI: fail Parallels npm-update smoke jobs after the guest command timeout and cleanup backstop instead of only logging a timeout line.
- CI: bound kitchen-sink RPC HTTP probes so stalled gateway readiness or response bodies fail and retry instead of wedging the walker.
- CI: bound Telegram user Crabbox proof Bot API calls so stalled Telegram responses fail instead of wedging credential and desktop proof cleanup.
- CI: bound MCP channel stdio client initialization so Docker channel proof fails and closes the bridge transport instead of waiting for the outer job timeout.
- CI: keep `OPENCLAW_TESTBOX=1 pnpm check:changed` delegating to Blacksmith Testbox through Crabbox without forwarding local Testbox or worker env into the remote command.
- CI: send KILL after the TERM grace period for manual checkout fetch timeouts so stuck Testbox and workflow checkout retries cannot hang behind a wedged `git fetch`.
- CI: send KILL after the TERM grace period for Bun global install smoke command timeouts so trapped `openclaw` child processes cannot wedge the scheduled install smoke.
- iMessage: thread current channel/account inbound attachment roots into the image tool so iMessage-saved attachments under `~/Library/Messages/Attachments` (including the wildcard `/Users/*/Library/Messages/Attachments` root) are read through the existing inbound path policy instead of being rejected as `path-not-allowed`. Literal `localRoots` stays workspace-scoped. Fixes #30170. (#86569)
- QQ Bot: respect `OPENCLAW_HOME` for outbound media path resolution so `<qqmedia>` sends no longer silently fail when `HOME` and `OPENCLAW_HOME` differ (Docker / multi-user hosts). Persisted QQ Bot data (sessions, known users, refs) stays anchored on the OS home for upgrade compatibility. Fixes #83562. Thanks @sliverp.
- Update: report the primary malformed `openclaw.extensions` payload error without adding a duplicate missing-main diagnostic. (#86596) Thanks @ferminquant.
- Control UI: keep host-local Markdown file paths inert while preserving app-relative links. (#86620) Thanks @BryanTegomoh.
- Gateway: dampen repeated unauthenticated device-required probes per URL while preserving explicit-auth and paired recovery paths. (#86575) Thanks @ferminquant.
- IRC: store inbound channel routes with the canonical `channel:#name` target and join transient channel sends before writing. (#85906) Thanks @Kailigithub.
- Usage: surface unknown all-zero model pricing as missing cost entries instead of a confident `$0` total. (#85882) Thanks @MichaelZelbel.
- Agents/Codex: honor yolo app-server approval policy only for the full `never` plus `danger-full-access` case. (#85909) Thanks @earlvanze.
- Gateway/Gmail: clear Gmail watcher renewal intervals on re-entry so hot reloads do not leak lifecycle timers. (#82947) Thanks @SebTardif.
- Logging: exit cleanly on broken stdout/stderr pipes without masking existing failure exit codes. (#80059) Thanks @pavelzak.
- Gateway/security: escape transcript metadata field names while extracting oversized session line prefixes. (#85934) Thanks @SebTardif.
- Plugins/security: validate manifest model pattern regexes with the safe-regex compiler so unsafe patterns are ignored before matching. (#86046) Thanks @SebTardif.
- Discord: route gateway metadata REST lookups through the configured Discord proxy so proxied accounts do not fall back to direct `discord.com` connections before opening the WebSocket. Fixes #80227. Thanks @Clivilwalker.
- Agents/media: hydrate current-turn image attachments from filename-derived MIME types so active vision can see generated or forwarded images whose source omitted an image content type. (#84812) Thanks @marchpure.
- Agents/fs: point workspace-only scratch-path guidance at in-workspace temp directories while keeping host-root writes rejected by the tool guard. (#86501) Thanks @tianxiaochannel-oss88.
- Agents/media: keep async cron media completions scoped to their run session while preserving direct delivery for stale generated-media success and failure notifications. (#86529) Thanks @ai-hpc.
- Gateway: emit plugin `session_end`/`session_start` hooks when `agent.send` rotates or replaces a session id, keeping hook lifecycle state aligned with `sessions.changed` notifications. Fixes #83507. (#85875) Thanks @brokemac79.
- OpenShell/SSH: reject malformed generated exec commands before sandbox/session setup so unresolved workflow placeholders fail fast instead of reaching the remote shell. Fixes #72373. Thanks @brokemac79.
- Google: stop normalizing `gemini-3.1-flash-lite` to the retired preview endpoint and update Flash Lite alias guidance to the GA model id. Fixes #86151. (#86240) Thanks @SebTardif.
- Installer: make Alpine apk installs cover Git, verify the Node runtime floor, try `nodejs-current`, and report Alpine version guidance when repositories only provide older Node packages.
- Agents/status: prefer the active Claude CLI OAuth auth label over an unused Anthropic env API-key label for equivalent runtime aliases. Fixes #80184. (#86570) Thanks @brokemac79.
- Agents/media: send direct fallback for generated media still missing after an active requester wake fails. (#85489) Thanks @fuller-stack-dev.
- Agents: derive overflow compaction budgets from provider-reported and synthetic over-budget token counts so confirmed context overflows compact before retrying. (#70473) Thanks @fuller-stack-dev.
- Agents/Codex: recover Codex context-window prompt errors through overflow compaction and surface reset guidance when recovery is exhausted. (#85542) Thanks @fuller-stack-dev.
- Agents/Codex: allow Codex app-server runs to bootstrap from `CODEX_API_KEY` or `OPENAI_API_KEY` when no Codex auth profile is configured.
- Agents/Codex: keep selected Codex runtime routing on OpenAI-Codex while preserving direct OpenAI API-key compaction fallback. (#86408) Thanks @funmerlin and @VACInc.
- Agent transcript: include OpenClaw agent session logs when finding local transcript candidates.
- Crabbox: bootstrap raw AWS macOS shell commands wrapped in absolute `time` paths so RSS probes can run Node and pnpm on fresh macOS runners.
- Crabbox: bootstrap raw AWS macOS shell commands even when setup statements precede Node or pnpm usage.
- TUI/local: skip unnecessary secret resolution, gateway model catalog loading, bootstrap, and skill scans in explicit local-model runs so startup reaches the model request faster.
- Sessions/doctor: load large session stores without clone amplification during read-only doctor checks and reclaim stale `sessions.json.*.tmp` sidecars. Fixes #56827. Thanks @openperf.
- Tests: clean successful plugin gateway gauntlet isolated temp roots while keeping an explicit preservation switch for failed/debug runs.
- Plugins/perf: reuse derived plugin metadata snapshots for the lifetime of the process so reply-time skill setup no longer rescans plugin metadata on every turn.
- Discord/OpenAI voice: keep wake-name master consults using the current speaker context after ignored ambient transcripts and shorten the default capture silence grace.
- Doctor: skip redundant Gateway restart prompts when a recent supervisor restart leaves the Gateway healthy. Fixes #86518. (#86533) Thanks @liaoyl830.
- Cron: restore suspended cron lanes to the configured/default concurrency instead of falling back to a concurrency of one after quota or circuit-breaker auto-resume.
- Gateway: keep session-only Control UI tool-start mirrors flowing during diagnostic queue pressure instead of silently dropping non-terminal tool updates.
- Discord: merge streamed text captions into following media block replies so captions and attachments send as one message. (#86487) Thanks @neeravmakwana.
- Gateway: avoid sending duplicate tool-event frames to Control UI connections that are subscribed by both run and session.
- Discord/OpenAI voice: accept longer leading wake-name mistranscriptions such as "Open Club" for OpenClaw.
- Agents/OpenAI-compatible: stop ModelStudio-compatible chat requests before sending system/tool-only payloads that have no usable user or assistant turn. (#86177) Thanks @TurboTheTurtle.
- Gateway/plugins: reuse plugin package realpath checks while building installed plugin indexes so startup avoids repeated filesystem resolution work.
- Kilo Gateway: send string `stop` sequences as arrays so Kilo accepts OpenAI-compatible chat completions. (#86461) Thanks @SebTardif.
- Discord/OpenAI voice: accept leading fuzzy wake-name transcripts such as "Monty" or "Moti" for a Molty agent while keeping ambient speech gated.
- Media understanding: convert HEIC and HEIF images to JPEG before image description providers run so iPhone photos work in direct and configured image-description flows. (#86037)
- Agents: release embedded-attempt session locks from outer teardown so post-prompt exceptions cannot wedge later requests behind `SessionWriteLockTimeoutError`. Fixes #86014. Thanks @openperf.
- Discord/OpenAI voice: rotate Realtime sessions at provider max duration without logging the expected session-expiry event as an error.
- Sessions: skip metadata-only entries during QMD-slugified session lookup so one incomplete row does not block transcript hit resolution. (#86327) Thanks @abnershang.
- Agents/media: derive bundled plugin local-media trust from plugin tool metadata instead of importing the full plugin registry on subscription paths. (#84409) Thanks @samzong.
- Image tool: keep config-backed custom-provider API keys usable for auto-discovered vision models, including deferred image-tool execution without env keys or auth profiles. (#85733)
- Memory/local embeddings: run local GGUF embeddings in an isolated worker sidecar and degrade to configured fallback or keyword search on worker failure so native embedding crashes do not take down the Gateway. (#85348) Thanks @osolmaz.
- Gateway: clear the runtime config snapshot before `SIGUSR1` in-process restarts so config changes survive the next gateway loop. (#86388) Thanks @XuZehan-iCenter.
- Models: show OAuth delegation markers as configured `models.json` auth while keeping runtime route usability checks strict. (#86378) Thanks @rohitjavvadi.
- Cron: seed active scheduled and manual cron task rows with a progress summary so status surfaces do not look blank while jobs run. (#86313) Thanks @ferminquant.
- Cron: preserve unsupported persisted cron payload rows during routine store writes while keeping those rows non-runnable. Fixes #84922. (#86415) Thanks @IWhatsskill.
- Updater: exclude prerelease git tags from stable channel resolution so source updates do not check out newer alpha/rc/preview/canary tags. (#86260) Thanks @stevenepalmer.
- Security/Audit: flag webhook `hooks.token` reuse of active Gateway password auth in `openclaw security audit` while keeping password-mode startup compatibility. (#84338) Thanks @coygeek.
- QQBot: derive the outbound reply watchdog from configured agent and provider timeouts so slow local model replies are not cut off at five minutes. Fixes #85267. (#85271) Thanks @SymbolStar.
- Agents/heartbeat: stop heartbeat turns after the first valid `heartbeat_respond` so repeated response loops do not burn tokens. (#86357) Thanks @udaymanish6.
- Tasks: keep retained lost tasks out of default status health counts, explain their cleanup window during maintenance, and prune lost task records after 24 hours instead of the general 7-day terminal retention.
- Memory-core: keep REM dreaming focused on live light-staged memories and mark staged entries as considered so old recall history no longer dominates fresh candidates. (#86302) Thanks @SebTardif.
- Memory: abort sync instead of downgrading an existing semantic vector index to FTS-only when the configured embedding provider is temporarily unavailable. (#85704) Thanks @yaaboo-gif.
- Telegram: propagate forum topic names through the account-scoped topic cache for native command context and topic create/edit actions. (#86299) Thanks @SebTardif.
- Slack: keep downloaded read-only files out of reply media so Slack file reads do not echo files back to the conversation. (#86318) Thanks @neeravmakwana.
- Cron: accept leading-plus relative durations such as `+5m` for one-shot `--at` schedules. (#86341) Thanks @mushuiyu886.
- Agents/media: preserve async-started media tool metadata so background generation starts no longer surface generic incomplete-turn warnings while replay stays unsafe. (#85933) Thanks @fuller-stack-dev.
- Docker E2E: dedupe scheduler lane resources so npm/service package lanes are not over-counted and serialized unnecessarily.
- QA/diagnostics: add a collector-backed OpenTelemetry smoke lane, make the OTLP payload leak check scenario-aware, and keep source QA builds from failing on optional dependency imports resolved through pnpm's temp module path.
- Crabbox: bootstrap Git metadata for sparse remote changed gates so raw synced workspaces can run `pnpm check:changed` from the intended diff.
- xAI/LM Studio: avoid buffering ordinary bracketed or `final` prose until stream completion while watching for plain-text tool-call fallbacks.
- Doctor: warn and continue when the cron job store exists but cannot be read so later health checks still run. Fixes #86102. (#86384) Thanks @1052326311.
- Discord: suppress a bot's previous reply body and referenced media from prompt context when a user replies to that bot message, while keeping reply metadata for routing. (#86238) Thanks @fuller-stack-dev.
- Discord: restore bare numeric channel IDs for outbound message-tool sends while keeping explicit DM targets unambiguous. (#86571) Thanks @joshavant.
- Docker E2E: avoid rebuilding the Control UI twice while preparing the shared OpenClaw package tarball for package-backed scenario runs.
- Tests: avoid rebuilding the Control UI twice during the installer Docker smoke now that `pnpm build` includes `ui:build`.
- Tests: give QA config mutation RPCs enough native Windows budget to finish gateway config writes and restart settle after hot scenario runs.
- Tests: keep the gateway restart-inflight QA scenario focused on restart recovery on native Windows by allowing expected embedded prompt handoff errors and using the Windows-safe timeout budget.
- QA-Lab: make the synthetic OpenAI provider honor generic `reply exactly:` directives after required kickoff reads so restart-recovery scenarios do not fall through to generic repo-summary prose.
- Gateway: abort active `agent` RPC runs during forced restart shutdown so stale in-process turns cannot keep writing a session after the Gateway lifecycle restarts.
- Crabbox: sync clean sparse worktrees through a temporary full checkout even when reusing an existing lease so tracked build-time files are not omitted.
- Build: route `scripts/ui.js` through the shared pnpm runner and keep Control UI chunking helpers in sparse-included source so native Windows Corepack builds can produce `dist/control-ui`.
- Tests: give the memory fallback QA scenario enough turn budget to exercise native Windows gateway runs instead of failing on the client timeout while the mock agent is still dispatching.
- Tests: collect QA gateway CPU/RSS metrics on native Windows and give the channel baseline enough turn budget to report slow gateway runs instead of timing out before proof.
- Install/update: bypass npm `min-release-age` policies with `--min-release-age=0` instead of `--before` so hosted installers keep working on npm versions that reject the combined config. (#84749) Thanks @TeodoroRodrigo.
- Diagnostics: reclaim wedged session lanes when stale active-run bookkeeping blocks queued work despite no forward progress. Fixes #85639. Thanks @openperf.
- WebChat: keep message-tool replies visible in the chat while still summarizing internal tool results for the model. Fixes #86347. Thanks @shakkernerd.
- Gateway/perf: fail startup benchmark samples when the Gateway process exits before benchmark teardown, including signal deaths after readiness probes.
- Gateway/perf: fail restart benchmark samples when the Gateway exits before benchmark teardown, including clean exits and signal deaths after successful restart probes.
- Agents/tests: keep model catalog visibility on static selection helpers so catalog visibility checks avoid the broad model-selection barrel import.
- Agents/commitments: serialize commitment store load-modify-save writes so concurrent heartbeat and CLI updates no longer lose dismissal, sent, or attempt state. (#81153) Thanks @ai-hpc.
- xAI/LM Studio: promote plain-text tool-call fallbacks into structured tool calls and strip leaked internal tool syntax before user-facing delivery. (#86222) Thanks @fuller-stack-dev.
- Gateway/perf: tighten restart and startup benchmark failure handling so long profiling runs, failed probes, and fresh Linux runners no longer produce false passing or `n/a` results.
- Checks: keep intentional Knip unused-file findings optional so full CI and sparse proof workspaces stay aligned.
- Docker: restore writable `~/.config` in runtime images. Fixes #85968. Thanks @hkoessler and @Bartok9.
- Plugin SDK: keep legacy root diagnostic subscriptions connected when built plugin SDK aliases resolve diagnostic helpers through a separate module graph.
- Diagnostics: export alertable OTel and Prometheus signals for blocked tools, model failover, stale sessions, liveness warnings, oversized payloads, and webhook ingress while fixing shared OTLP endpoints with query strings.
- Tests: normalize macOS canonical temp paths in exec allowlists, fs-safe trash assertions, installed plugin matching, Telegram topic-name stores, and built ACPX MCP server expectations so native macOS proof runners cover the intended behavior.
- Codex/app-server: preserve message-tool-only source reply delivery mode on active runs so sub-agent completion wakeups can steer the active Codex turn instead of being rejected. (#86287) Thanks @ferminquant.
- Tests: sample the Windows kitchen-sink RPC gateway directly and serialize RSS probes so native runs keep the memory guard active.
- Tests: normalize bundled plugin lifecycle probe paths and state-root lookup so native Windows release sweeps accept valid packaged plugin installs.
- Agents/Claude CLI: route live native Bash permission requests through OpenClaw exec policy so Claude turns no longer stall on `control_request`, and document that OpenClaw exec policy is authoritative. Fixes #80819. (#86330, from #81971) Thanks @guthirry and @sallyom.
- Security audit: warn when YOLO OpenClaw exec policy overrides a restrictive raw Claude `--permission-mode` for managed live sessions. (#86557) Thanks @sallyom.
- Config: keep benign legacy metadata write anomalies out of default doctor and config command output while preserving explicit anomaly logging for diagnostics.
- Codex: log when implicit app-server `never` approvals are promoted for OpenClaw tool policy, including whether the trigger was a `before_tool_call` hook or trusted tool policy.
- Codex harness: make subscription usage-limit errors without reset times explain that OpenClaw cannot determine the reset and point users to wait until Codex is available, use another Codex account, or switch to another configured model/provider. Thanks @amknight.
- Google Vertex: support production ADC modes such as Workload Identity Federation, service-account credentials, and metadata-server ADC for the native Vertex transport. (#83971) Thanks @damianFelixPago.
- Telegram: route normal `[telegram][diag]` polling diagnostics through `runtime.log` while keeping non-diag warnings and persistence failures on `runtime.error`, so healthy polling startup no longer looks like an error. Fixes #82957. (#82958) Thanks @galiniliev.
- Providers/Ollama: strip inline Kimi cloud reasoning prefixes from streamed and final visible replies while keeping ordinary Kimi answers append-only. (#86286) Thanks @jason-allen-oneal.
- Gateway: require Talk secret authority before setup-code handoff can include Talk secrets. (#85690) Thanks @ngutman.
- Agents: keep fallback error reporting scoped to the active model candidate so stale prior-provider quota/auth text is not reported for later fallback attempts. (#86134) Thanks @zhangguiping-xydt.
- iMessage: dedupe watcher startup when `channels.imessage.accounts` lists both `default` and a named account that point at the same local Messages source, so the gateway no longer spawns two `imsg rpc` processes or doubles inbound replies. The dedupe is scoped to watcher startup, so duplicate accounts stay addressable for outbound sends, status, and capability listings, and `openclaw doctor` flags the redundant account with a rebinding hint. Fixes #65141. (#86705) Thanks @swang430.
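
The Cron entry above about accepting leading-plus relative durations such as `+5m` for one-shot `--at` schedules can be illustrated with a minimal sketch. This is not OpenClaw's actual parser: the helper names `parseRelativeDurationMs` and `resolveAt`, the supported unit set (`s`, `m`, `h`, `d`), and the choice to also accept the form without a plus sign are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not OpenClaw's implementation.
// Assumed unit set: seconds, minutes, hours, days.
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

/**
 * Parse a relative duration such as "5m" or "+5m" into milliseconds.
 * Returns undefined when the input is not a positive relative duration.
 */
function parseRelativeDurationMs(input: string): number | undefined {
  // One optional leading "+", then digits, then a single unit letter.
  const match = /^\+?(\d+)([smhd])$/.exec(input.trim());
  if (!match) return undefined;
  const amount = Number(match[1]);
  const unitMs = UNIT_MS[match[2] ?? ""];
  if (unitMs === undefined || !Number.isSafeInteger(amount) || amount <= 0) {
    return undefined;
  }
  return amount * unitMs;
}

/** Resolve a one-shot schedule value relative to `now`. */
function resolveAt(input: string, now: Date = new Date()): Date | undefined {
  const ms = parseRelativeDurationMs(input);
  return ms === undefined ? undefined : new Date(now.getTime() + ms);
}
```

In this sketch `+5m` and `5m` resolve to the same offset, while inputs such as `++5m` or `+5x` are rejected rather than being partially parsed.
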
## 2026.5.22
### Changes
- Gateway/perf: reuse process-stable channel catalog reads, avoid repeated bundled-channel boundary checks, and rotate gateway watch CPU profiles so benchmark runs do not accumulate unbounded artifacts.
- Gateway/perf: reuse immutable plugin metadata snapshots across startup, config, model, channel, setup, and secret metadata readers so hot paths avoid repeated plugin file stats and manifest registry reloads.
- Gateway/perf: lazy-load startup-idle plugin work, core gateway method handlers, and the embedded ACPX runtime so Gateway health and ready signals no longer wait on unused handler trees or ACPX probes.
- Gateway/perf: cache plugin SDK public-surface alias maps and skip irrelevant macOS Linuxbrew PATH probes so Gateway startup avoids repeated filesystem walks and slow missing-directory stats.
- Transcripts: add the initial transcript capture and source-provider foundation, including auto-start capture config, manual transcript imports, read-only transcript access, and Discord voice as the first live source.
- Docs/channels/config: add Signal `configPath`, Telegram wildcard topic defaults, local-time backup archive names, Termux home fallback, include-path validation, secret-scanner-safe placeholder guidance, Gemini CLI/Antigravity media guidance, and macOS VM auto-login guidance. Thanks @NorseGaud, @yudistiraashadi, @huangqian8, @VibhorGautam, @maweibin, @tianxingleo, @IgnacioPro, and @xzcxzcyy-claw.
- Docs: clarify model-usage portability, Codex migration prerequisites, status bootstrap wording, thread-bound subagent limits, hook ownership, and config-preserving safety guidance. Thanks @aniruddhaadak80, @leno23, @TomDjerry, @matthewxmurphy, @vincentkoc, and @stablegenius49.
- Docs: clarify README onboarding and Gateway startup paths, WhatsApp QR/408 recovery, cron output language prompts, skill advanced features, gateway upstream 403 troubleshooting, and plugin fallback override guidance. Thanks @deepujain, @Zacxxx, @Jah-yee, @neyric, @usimic, @Renu-Cybe, @BigUncle, and @SeashoreShi.
- Docs: clarify context-pruning ratio bounds, local dashboard recovery, CLI env markers, remote onboarding token behavior, and Peekaboo Bridge permissions for subprocess agents. Thanks @ayesha-aziz123, @dishraters, @hougangdev, and @brandonlipman.
- Media understanding: stop auto-probing Gemini CLI and use Antigravity CLI only as a lower-priority image/video fallback after configured provider APIs.
- Agents/subagents: limit default sub-agent bootstrap context to `AGENTS.md` and `TOOLS.md`, keeping persona, identity, user, memory, heartbeat, and setup files out of delegated workers by default. (#85283) Thanks @100yenadmin.
- Maintainer skills: exclude plugin SDK/API boundary work from `openclaw-landable-bug-sweep` so bugbash sweeps stay focused on small paper-cut fixes.
- QA-Lab/diagnostics: extend the OpenTelemetry smoke harness to prove trace, metric, and log export, and add first-class Prometheus and observability smoke aliases.
- Plugin SDK: add a generic channel-message poll sender so channel plugins can expose poll delivery without depending on channel-specific SDK facades.
- Crabbox: keep the local wrapper's provider validation synced with the installed Crabbox binary while preserving supported aliases such as `docker` and `blacksmith`. (#85302) Thanks @hxy91819.
- Maintainer skills: add `openclaw-landable-bug-sweep` for producing five small, reviewed, CI-green OpenClaw bugfix PRs from issue/PR sweeps.
- Control UI/chat: add search and Load More pagination to the chat session picker, keeping initial session loads bounded while making older conversations reachable. (#85237) Thanks @amknight.
- CLI/onboarding: start classic onboarding when bare `openclaw` runs before an authored config exists, while keeping configured installs on Crestodian. (#72343) Thanks @fuller-stack-dev.
- Agents/runtime: internalize the former Pi agent runtime into OpenClaw, remove legacy package dependencies, and keep Pi-named SDK aliases only for deprecated plugin compatibility.
- Discord: allow configuring a bounded `agentComponents.ttlMs` callback registry lifetime for long-running component workflows, with per-account overrides and a 24-hour cap. (#84189) Thanks @100menotu001.
- xAI/Grok: reuse xAI OAuth auth profiles for Grok `web_search`, thread active-agent auth through web search, add Grok model aliases, and let media providers declare default operation timeouts. (#85182) Thanks @fuller-stack-dev.
- Plugin SDK: add row-level session workflow helpers and deprecate `loadSessionStore` so plugins can read and patch sessions without depending on the legacy whole-store shape. (#84693) Thanks @efpiva.
- Gateway/plugins: reuse a compatible Gateway startup plugin registry during dispatch so safe plugin dispatches avoid redundant registry loading. (#84324) Thanks @ai-hpc.
- Control UI/debugging: add an explicit source-only Traces view for local LLM request debugging, including full prompt and tool payload capture behind `OPENCLAW_DEV_EXTENDED_TRACING`. Thanks @amknight.
- Plugins/SDK: add a general `embeddingProviders` capability contract and registration API so embeddings can become a reusable provider surface outside memory-specific adapters.
- Dependencies: refresh provider, plugin, UI, and tooling packages, update `protobufjs` to 8.4.0 to clear the current npm advisory, and carry the Claude ACP completion patch forward to `@agentclientprotocol/claude-agent-acp` 0.36.1.
- Agents/tools: remove the old sender-owner tool gating path so configured tools stay visible for trusted sessions while command and channel-action auth still carry real sender identity.
- WebChat: summarize internal message-tool source replies so tool cards no longer duplicate the visible reply body. (#84773) Thanks @jason-allen-oneal.
- Gateway: preserve deferred lifecycle-error cleanup across later non-terminal events so provider timeouts can persist failed session state instead of leaving sessions stuck running. Fixes #63819. (#85256) Thanks @samzong.
- Agents/subagents: report tool-only child progress during timeout summaries instead of showing no visible output.
- Telegram/ACP: preserve explicit `:topic:` conversation suffixes when inbound ACP targets do not carry a separate thread id.
- Browser/proxy: bypass the managed proxy for the exact local managed Chrome CDP readiness and DevTools WebSocket endpoints, so `openclaw browser start` works when the operator proxy blocks loopback egress. (#83255) Thanks @lightcap.
- Ollama: bypass the managed proxy for configured local embedding origins while keeping SSRF guardrails on unconfigured targets. Thanks @Kaspre.
- OpenAI/images: route Codex API-key image generation through the native OpenAI Images API instead of the Codex OAuth streaming backend, avoiding 401s from valid API keys.
- Checks/Windows: route full `pnpm check` stage commands through the managed child runner so full checks on Windows also avoid Node shell-argv deprecation warnings.
- Checks/Windows: run managed child commands through explicit `cmd.exe` wrapping instead of Node shell mode with argv, avoiding Node 24 subprocess deprecation warnings during changed checks.
- Gateway: omit internal stream-error placeholder entries from agent prompt history so failed assistant turns are not replayed as model-authored text. (#85652) Thanks @anyech.
- Sessions: enforce the session write-lock max-hold policy during lock acquisition so long-held locks can be reclaimed before the stale-lock window. (#85764) Thanks @njuboy11.
- Models: prune retired Groq, GitHub Copilot, OpenAI, xAI, and old Claude catalog entries, with doctor migration to upgrade existing configs to current provider refs.
- Doctor/update: recognize junction-backed source checkouts as git installs by comparing canonical paths before showing package-manager update guidance. Fixes #82215. Thanks @igormf.
- Channels: honor `/verbose on` for tool/progress summaries across direct chats, groups, channels, and forum topics while preserving quiet default behavior. (#85488) Thanks @kurplunkin.
- CLI/skills: show an all-ready note with next-step commands when skill setup has no missing dependencies to install. (#85032) Thanks @aniruddhaadak80.
- Microsoft Foundry: route DeepSeek V4 Pro and Flash models through the Foundry Responses API while keeping older DeepSeek models on their existing path. (#85549) Thanks @roslinmahmud.
- Status/usage: show configured cost estimates for AWS SDK models in full usage output while keeping token-only usage replies cost-free. (#85619) Thanks @ItsOtherMauridian.
- Agents/OpenAI Responses: retry non-visible reasoning-only turns for OpenAI Responses API families instead of treating them as empty failed turns. (#85603) Thanks @SebTardif.
- Directive tags: preserve message and content-part object identity when display stripping makes no directive-tag changes. (#85682) Thanks @willamhou.
- Telegram: send local `path`/`filePath` and structured attachment media from `sendMessage` actions instead of dropping them or sending text-only messages. (#85219) Thanks @keshavbotagent.
- Sessions/status: show the estimated context budget when fresh provider usage is unavailable and clear stale estimates across session resets and compaction boundaries. (#84830) Thanks @giodl73-repo.
- Gateway/config: pin relative `OPENCLAW_STATE_DIR` overrides to an absolute path at startup so later working-directory changes cannot retarget gateway state. (#52264) Thanks @PerfectPan.
- Release/package: run npm release, prepublish, and postpublish verification through Windows-safe npm command shims so native Windows checks can execute `npm.cmd` instead of treating it as a binary.
- Agents/harness: pass CLI runtime aliases through harness selection so provider-owned CLI aliases no longer get rejected before reaching the right runtime. (#85631) Thanks @potterdigital.
- Secrets: show the irreversible apply warning after interactive `secrets configure` confirmation so confirmed migrations still get the final safety prompt. (#85638) Thanks @alkor2000.
- Agents/CLI output: ignore cumulative Claude `stream-json` result usage when assistant usage events are present, preventing inflated cache-read accounting. (#85625) Thanks @zhouhe-xydt.
- CLI: keep `waitForever()` alive by leaving its keep-alive interval ref'd so the public helper no longer exits immediately with Node's unsettled-await exit code. (#85694) Thanks @m1qaweb.
- Agents/bootstrap: guard bootstrap name checks against missing file names so malformed bootstrap entries warn and truncate instead of crashing. Fixes #85523. (#85615) Thanks @zhouhe-xydt.
- CLI/tasks: reject partially numeric `openclaw tasks audit --limit` values so audit limits must be real positive integers instead of accepting strings like `5abc`. (#84901) Thanks @jbetala7.
- Status/diagnostics: bound deep Docker audit probes so `openclaw status --deep` reports slow container checks instead of hanging behind unbounded inspection. (#85476) Thanks @giodl73-repo.
- Providers/Anthropic: migrate 1M context handling to GA-capable Claude 4.x models by sizing eligible models at 1M without the retired `context-1m-2025-08-07` beta, ignoring that retired beta in older configs, and preserving OAuth-required Anthropic beta headers. (#45613) Thanks @haoyu-haoyu.
- Cron/Telegram: parse forum-topic delivery targets through the Telegram plugin instead of cron core, including `:topic:` and `:topicId` forms for announce delivery. Thanks @etticat.
- Twitch: keep stale message-handler cleanup callbacks from removing newer handler registrations for the same account, preserving inbound message delivery after reconnects. Fixes #83888. (#85425) Thanks @alkor2000.
- Memory/LanceDB: expose public memory artifacts through the active memory provider bridge so memory-wiki imports durable memory files, daily notes, dream reports, and event logs without depending on memory-core internals. Fixes #83604. (#85060) Thanks @brokemac79.
- Crabbox: keep AWS hydration compatible with local Actions replay by inlining the hydrate workflow's Node/pnpm setup instead of invoking repo-local composite actions.
- Agents/subagents: simplify native sub-agent completion handoff so children report their latest visible assistant result to the requester without using `message`, while keeping parent-owned message-tool delivery policy intact. Fixes #85070. (#85089) Thanks @brokemac79.
- Docker setup: stop printing the Gateway bearer token in setup logs and printed follow-up commands.
- Agents: let embedded compaction fallback retries proceed when Pi-compatible candidates do not need agent harness plugin preparation.
- Agents/tools: honor configured custom provider API keys when deciding whether media, image-generation, video-generation, music-generation, and PDF tools are available. (#85570)
- StepFun: stop advertising stale generic API key auth choices so onboarding only offers runtime-backed Standard and Step Plan choices.
- Diagnostics: keep OpenTelemetry log bodies behind explicit content capture and scrub scoped agent-session keys from OpenTelemetry and Prometheus labels while preserving bounded queue-lane prefixes.
- Windows installer: fail Git checkout installs when `pnpm install` or `pnpm build` fails instead of writing a wrapper to a missing CLI build.
- Sessions: surface previous-transcript archive failures during `/new` rotation so disk rename errors are logged instead of silently hiding stranded transcript files. Fixes #81984. (#85586, from #82081) Thanks @0xghost42.
- TUI/agents: mirror internal-ui message-tool replies into final chat output so message-tool-only agents remain visible in `openclaw tui`. Fixes #85538. Thanks @danpolasek.
- Agents: keep parallel OpenAI-compatible tool-call deltas in separate argument buffers so interleaved tool calls no longer corrupt streamed arguments. (#82263) Thanks @luna-system.
- Memory/doctor: report missing or unusable QMD workspace directories as workspace failures instead of generic binary failures. (#63167) Thanks @sercada.
- Debug proxy: record CONNECT client-socket errors and destroy the paired upstream socket so abrupt client disconnects no longer leak tunnel resources. (#82444) Thanks @SebTardif.
- Diffs: continue hydrating later diff cards when one card fails so a single broken card no longer blanks the whole diff viewer. (#84775) Thanks @cosmopolitan033.
- Mac app: use the native settings sidebar window chrome so the sidebar toggle stays on the left and content no longer clips under oversized titlebar padding.
- QA-Lab/Codex: bundle auth/plugin fixture imports for flow scenarios and let terminal async media tools end Codex app-server turns without timing out. (#80397, refs #80323) Thanks @100yenadmin.
- Gateway/agents: preserve fresh session overrides and metadata when stale cached agent-session entries race with store updates, so subagent model/provider overrides and routing policy survive concurrent writes. (#19328) Thanks @CodeReclaimers.
- Control UI/chat: keep chat session search inline with the session selector so the header no longer shows a duplicate standalone search row.
- Control UI/chat: collapse focused-mode header chrome and suppress hidden-header scroll updates so focus mode no longer jumps while scrolling. Thanks @amknight.
- Codex app-server: restart the native app-server and retry once when server-side compaction times out, so preflight compaction stalls recover instead of failing every dispatch. (#85500)
- OpenAI video: honor configured provider request private-network opt-in for local/custom video endpoints so explicitly trusted mock and self-hosted providers are not blocked. Thanks @shakkernerd.
- Providers/Gemini: strip fractional seconds from web-search time range filters so Gemini accepts freshness-bound search requests. (#85071) Thanks @Noerr.
- OpenAI Codex: preserve image input support for sparse `openai-codex/gpt-5.5` catalog rows. (#85095) Thanks @sercada.
- CLI/models: add a piped or pasted API-key path for OpenAI Codex auth and warn when API keys are pasted into token-mode auth. (#85533) Thanks @joshavant.
- Telegram: dead-letter missing-harness isolated ingress failures so a poisoned spooled update no longer blocks later same-lane messages. Fixes #85470. (#85605) Thanks @joshavant.
- Plugins/discovery: strip `-plugin` package suffixes when deriving plugin id hints so package names line up with manifest ids. (#85170) Thanks @JulyanXu.
- Tlon: stop advertising a non-existent agent tool contract in the plugin manifest.
- Telegram: preserve fenced code block languages through Markdown rendering so Telegram receives `language-*` code classes. (#85209) Thanks @leno23.
- Channels/message tool: resolve configured external channel plugins during in-agent channel selection, so `openclaw agent --local` message-tool sends no longer report an available channel as unavailable. (#85022) Thanks @Kaspre.
- Agents/heartbeat: honor group/channel `message_tool` visible-reply policy and model-specific Codex runtime config for scheduled heartbeat runs, so failed internal tool output stays private. Fixes #85310. (#85357) Thanks @neeravmakwana.
- Gateway/ACP: close child ACP sessions spawned via `sessions_spawn` when their parent session is reset or deleted, instead of leaving orphaned `claude-agent-acp` processes that accumulate and exhaust memory. Fixes #68916. (#85190) Thanks @openperf.
- Codex app-server: block native execution paths when OpenClaw exec resolves to a node host while preserving the first-party CLI node binding path. Fixes #85012. (#85534) Thanks @joshavant.
- Diagnostics: bound cleanup timeout detail logs, emit drop summaries when async diagnostic bursts exceed the queue cap, and surface async queue drops through diagnostic telemetry.
- Agents/subagents: surface blocked child-run completions as errors instead of successful subagent finishes. (#80886) Thanks @TurboTheTurtle.
- Context engines: fail closed with a descriptive error when the selected agent runtime cannot satisfy declared context-engine host requirements.
- Agents: validate a forced plugin harness against the candidate provider/model before pinning it, so unsupported fallback-chain candidates fail with a clear harness error instead of producing a late `Model provider X not found` from the underlying harness. Codex harness `supports()` now also accepts the canonical `openai` and `openai-codex` routing ids so documented Codex configs keep working. Thanks @cathrynlavery.
- Control UI/WebChat: keep selected external-channel sessions live by mirroring Codex prompts at turn start, streaming hidden runs only to exact selected-session subscribers, and deduplicating accumulated stream snapshots around tool cards. Fixes #83528, #82611, refs #83949. Thanks @BunsDev.
- CLI/tasks: include stale-running task maintenance decisions in `openclaw tasks maintenance --json` so retained and reconcile candidates explain backing-session, cron, CLI, and wedged-subagent state. (#84691) Thanks @efpiva.
- Codex app-server: keep system-prompt reports working when bootstrap hooks provide workspace files with only a path and content, so hook-supplied SOUL/IDENTITY/TOOLS/USER context still reports injected characters correctly. (#84736) Thanks @JARVIS-Glasses.
- Providers/MiniMax music: stop advertising `durationSeconds` control and remove prompt-injected duration hints, so `music_generate` reports MiniMax duration as an unsupported override instead of suggesting MiniMax can enforce track length. Fixes #84508. Thanks @neeravmakwana.
- Agents/Codex: keep encrypted Responses reasoning replay provenance-bound so stale mirrored Codex transcripts drop invalid encrypted content before request assembly while preserving matching same-session replay. Fixes #83836. (#84367) Thanks @joshavant.
- Agents/subagents: skip stale embedded-run wake probes for dormant completion requesters, so late subagent completions go straight to requester-agent/direct handoff instead of producing `reason=no_active_run` queue noise. (#82964) Thanks @galiniliev.
- CLI: retry config snapshot reads after a transient failure so one rejected read no longer poisons later commands in the same process. (#83931) Thanks @honor2030.
- TUI: handle German-layout Kitty keyboard input by ignoring printable release events and accepting AltGr-produced printable characters such as `@` and `€`. Fixes #48897.
- Media: decode URL path basenames before using them as remote media fallback filenames, so files like `My%20Report.pdf` are surfaced as `My Report.pdf`. Fixes #84050. (#84052) Thanks @jbetala7.
- WhatsApp: clarify inbound group diagnostics so observed but unregistered groups point to `channels.whatsapp.groups` without changing routing or sender authorization. (#83846) Thanks @neeravmakwana.
- WhatsApp: drain pending outbound deliveries on a 30s periodic timer in addition to the reconnect handler, so messages enqueued while the provider is already connected no longer wait for the next reconnect to send. (#79083) Thanks @Oviemudiaga.
- CLI: reject explicit port numbers above 65535 before they reach Gateway or Node bind paths. Fixes #83900. (#84008) Thanks @hclsys.
- Codex app-server: preserve plugin tool auth profiles when Codex owns model transport so OpenClaw dynamic tools can resolve their provider credentials. (#83603) Thanks @rubencu.
- Memory/search: scan the JS-side fallback vector path (used when the sqlite-vec index is unavailable or has a mismatched dimension) in bounded rowid batches and yield to the event loop between batches, so large chunk tables can no longer pin the Node.js main thread for multi-second windows. Also keeps the SQL prepared statement rooted in a local variable so `node:sqlite` cannot finalize it mid-scan under heap pressure. Fixes #81172. Thanks @dev23xyz-oss.
- Telegram: preserve inbound bold, italic, code, preformatted, strikethrough, underline, spoiler, and text-link entities as markdown in the agent-facing prompt body. Fixes #52859.
- Backup: dereference hardlinks during archive creation and reject unsafe hardlink targets during verification so archives that pass `backup verify` do not fail broad extraction on macOS tar. Fixes #54242. Thanks @jason-allen-oneal.
- Memory Wiki: preserve fs-safe diagnostics when bridge source page writes fail for non-symlink filesystem safety reasons, so directory collisions are reported with the underlying error code. (#83776) Thanks @TurboTheTurtle.
- Telegram: keep forum topics from blocking sibling topic traffic by routing inbound serialization, media/text buffers, and account API queues on topic-aware lanes. (#83829)
- Telegram: keep queued forum-topic follow-up messages from inheriting superseded source abort signals, so later same-topic user turns can still run and reply after an active turn is replaced. (#83827) Thanks @VACInc.
- Agents/read tool: treat positive offsets beyond EOF as empty ranges instead of surfacing the upstream read error, so stale pagination cursors no longer crash tool calls while unrelated read failures still fail loud. Fixes #62466. (#75536) Thanks @vyctorbrzezowski.
- Google/Gemini: normalize retired Gemini 3 Pro Preview refs left in Google API-key onboarding model allowlists and fallbacks, so setup-emitted config keeps testing `google/gemini-3.1-pro-preview` instead of `google/gemini-3-pro-preview`.
- Telegram/context: bound selected topic context to the active session so messages from before `/new` or `/reset` are not replayed into later turns. (#80848) Thanks @VACInc.
- Docs/providers/openai: clarify that OpenAI Realtime voice goes through the OpenAI Platform Realtime API and requires Platform credits; Codex/ChatGPT subscription quota does not cover this route. Fixes #76498. Thanks @lonexreb.
- Google/Gemini: normalize retired nested Gemini 3 Pro Preview ids when resolving exact configured proxy-provider refs, so `kilocode/google/gemini-3-pro-preview` resolves to `kilocode/google/gemini-3.1-pro-preview` for Gemini 3.1 testing.
- CLI: strip generic OSC terminal escape payloads from sanitized output fields, preventing clipboard/title escape bodies from leaking into commitment tables and other terminal-safe text. Thanks @shakkernerd.
- Codex app-server: match connector-backed plugin approval elicitations by stable connector id so enabled destructive actions no longer fall through to display-name-only rejection.
- Telegram/groups: include the recent local chat window and nearby reply-target window as generic inbound context so stale reply ancestry does not overshadow the live group conversation.
- Plugins/Nix: allow externally configured plugin roots under `/nix/store` to load in `OPENCLAW_NIX_MODE=1` while keeping normal external plugin hardlink rejection unchanged. Thanks @joshp123.
- Nextcloud Talk: include the required bot `response` feature in setup, explain missing `--feature response` on rejected sends, and surface missing response capability in doctor/status checks. Fixes #78935. (#79657) Thanks @joshavant.
- Cron/diagnostics: emit the existing `message.queued`, `session.state` (processing/idle), and `message.processed` lifecycle events for isolated-cron agent turns in `runCronIsolatedAgentTurn`, matching the dispatch and embedded-runner paths so subscribers (diagnostics OTLP, OTel exporters, custom observability plugins) get per-run session attribution instead of bucketing isolated cron LLM calls under static fallback ids. Events are gated on `isDiagnosticsEnabled(cfg)` so the documented `diagnostics.enabled: false` master toggle continues to silence the recorder. (#79214) Thanks @arniesaha.
- Discord: gate user allowlist name resolution. (#79002) Thanks @pgondhi987.
- Infra/fetch-timeout: pass `operation` and `url` context to `buildTimeoutAbortSignal` from the music-generate reference fetch and the Matrix guarded redirect transport, so the `fetch timeout reached; aborting operation` warning carries actionable structured fields instead of a bare line. Fixes #79195. Thanks @pandadev66.
- CLI/plugins: refresh persisted plugin registry policy in place for `plugins enable` and `plugins disable`, so routine toggles no longer rebuild and hash every plugin source when the target is already indexed. Thanks @vincentkoc.
- Windows/install: run npm from a writable installer temp directory and pin the Bedrock runtime dependency below a Windows ARM Node 24 npm resolver failure, so global OpenClaw installs no longer fail before onboarding. Thanks @mariozechner.
- CLI/plugins: scope install and enable slot selection to the selected plugin manifest/runtime fallback, so plugin installs no longer load every plugin runtime or broad status snapshot just to update memory/context slots. Thanks @vincentkoc.
- Browser/snapshot: propagate the configured snapshot timeout through the agent tool, Chrome MCP, and Playwright snapshot paths so snapshot actions honor the requested deadline instead of hanging. Fixes #72934. Thanks @masatohoshino.
- Plugins/TTS: keep bundled speech-provider discovery available on cold package Gateway paths and add bundled plugin matrix runtime probes for health, readiness, RPC, TTS discovery, and post-ready runtime-deps watchdog coverage. Refs #75283. Thanks @vincentkoc.
- Google Meet/Twilio: show delegated voice call ID, DTMF, and intro-greeting state in `googlemeet doctor`, and avoid claiming DTMF was sent when no Meet PIN sequence was configured. Refs #72478. Thanks @DougButdorf.
- Plugins/tools: prefer built bundled plugin code during tool discovery and skip channel runtime hydration while preserving companion provider registrations, reducing per-run plugin-tool prep cost without dropping executable plugin tools. Fixes #75290. Thanks @thanos-openclaw.
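The Providers/Gemini entry above strips fractional seconds from web-search time range filters. A minimal sketch of that normalization, assuming the range bounds are ISO 8601 / RFC 3339 timestamp strings; `stripFractionalSeconds` is a hypothetical name used for illustration, not the function OpenClaw ships:

```typescript
// Hypothetical helper (not OpenClaw's actual code): drop the fractional-seconds
// part of an ISO 8601 / RFC 3339 timestamp while keeping the zone designator.
function stripFractionalSeconds(timestamp: string): string {
  // Match "THH:MM:SS" followed by ".digits" and keep only the first part.
  return timestamp.replace(/(T\d{2}:\d{2}:\d{2})\.\d+/, "$1");
}

// Date#toISOString always emits milliseconds, so this is the usual input shape.
const start = stripFractionalSeconds(new Date(0).toISOString());
console.log(start); // 1970-01-01T00:00:00Z
```

Timestamps that already lack a fractional part pass through unchanged, so the helper is safe to apply to both bounds unconditionally.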
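The Media entry above decodes URL path basenames before using them as fallback filenames. The idea can be sketched as follows; `fallbackFilenameFromUrl` is a hypothetical name, and both the malformed-escape fallback and the path-separator guard are assumptions of this sketch rather than behavior confirmed by the changelog:

```typescript
// Hypothetical helper (not OpenClaw's actual code): derive a human-readable
// fallback filename from the last path segment of a remote media URL.
function fallbackFilenameFromUrl(rawUrl: string): string | undefined {
  let pathname: string;
  try {
    pathname = new URL(rawUrl).pathname; // excludes query string and fragment
  } catch {
    return undefined; // not an absolute URL
  }
  const basename = pathname.slice(pathname.lastIndexOf("/") + 1);
  if (basename === "") return undefined; // URL ends in "/", no filename to use
  try {
    const decoded = decodeURIComponent(basename);
    // Assumption: keep the raw segment if decoding would introduce a path
    // separator (for example "%2F"), so the result stays a single filename.
    return /[\\/]/.test(decoded) ? basename : decoded;
  } catch {
    return basename; // malformed percent-escape: fall back to the raw segment
  }
}

console.log(fallbackFilenameFromUrl("https://example.com/files/My%20Report.pdf?dl=1"));
// My Report.pdf
```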
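The CLI entry above rejects explicit port numbers above 65535 before they reach a bind call. A minimal sketch of that kind of up-front check; `parsePortOption` is a hypothetical name, and the digits-only rule and the acceptance of `0` are assumptions here, since the changelog only states the upper bound:

```typescript
// Hypothetical helper (not OpenClaw's actual code): validate a port given on
// the command line before it is handed to any listen/bind path.
function parsePortOption(raw: string): number {
  // Assumption: only plain decimal digits are accepted (no sign, no "1e3").
  if (!/^\d+$/.test(raw)) {
    throw new Error("invalid port: " + raw);
  }
  const port = Number(raw);
  // TCP/UDP ports are 16-bit, so 65535 is the largest valid value.
  if (port > 65535) {
    throw new RangeError("port out of range (0-65535): " + raw);
  }
  return port;
}

console.log(parsePortOption("18789")); // 18789
```

Failing at parse time keeps the error message tied to the flag the user typed instead of surfacing later as a lower-level bind failure.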
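The CLI entry above strips generic OSC terminal escape payloads from sanitized output fields. As a rough illustration, an OSC sequence starts with `ESC ]` and ends with either BEL or the string terminator `ESC \`. The sketch below removes terminated sequences only; `stripOscSequences` is a hypothetical name, and the real sanitizer may also cover other escape classes and unterminated sequences:

```typescript
// Hypothetical helper (not OpenClaw's actual code): remove OSC escape
// sequences, including their payload, from text meant for terminal output.
// OSC = ESC ] <payload> terminated by BEL (0x07) or ST (ESC \).
const OSC_SEQUENCE = /\u001b\][\s\S]*?(?:\u0007|\u001b\\)/g;

function stripOscSequences(text: string): string {
  return text.replace(OSC_SEQUENCE, "");
}

// OSC 52 (clipboard write) body is dropped together with its delimiters.
console.log(stripOscSequences("ok\u001b]52;c;aGVsbG8=\u0007 done")); // ok done
```

Removing the whole sequence, rather than only the leading escape byte, is what keeps clipboard or title payload text from showing up in table cells.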
@@ -107,6 +107,7 @@ For coordinated change sets that genuinely need more than 20 PRs, join the **#cl
- Test locally with your OpenClaw instance
- External PRs must include a filled **Real behavior proof** section in the PR body. Show the real setup you tested, the exact command or steps you ran after the patch, after-fix evidence, the observed result, and anything you did not test. Screenshots, recordings, terminal screenshots, console output, copied live output, linked artifacts, and redacted runtime logs all count. Unit tests, mocks, snapshots, lint, typechecks, and CI are useful but do not satisfy this requirement by themselves. Maintainers may apply `proof: override` only when the proof gate should not apply.
- Keep PRs takeover-ready: open them from a branch maintainers can push to. For fork PRs, leave GitHub's **Allow edits by maintainers** option enabled so maintainers can finish urgent fixes, changelog entries, or merge prep when needed. If GitHub shows **Allow edits and access to secrets by maintainers**, enable it only when that workflow/secrets access is acceptable and say so in the PR.
- Do not edit `CHANGELOG.md` in contributor PRs. Maintainers or ClawSweeper add the changelog entry when landing user-facing changes.
- For iterative local commits, `scripts/committer --fast "message" <files...>` passes `FAST_COMMIT=1` through to the pre-commit hook so it skips the repo-wide `pnpm check`. Only use it when you've already run equivalent targeted validation for the touched surface.
@@ -98,7 +98,7 @@ These are frequently reported but are typically closed with no code change:
- Reports that treat `POST /tools/invoke` under shared-secret bearer auth (`gateway.auth.mode="token"` or `"password"`) as a narrower per-request/per-scope authorization surface. That endpoint is designed as the same trusted-operator HTTP boundary: shared-secret bearer auth is full operator access there, narrower `x-openclaw-scopes` values do not reduce that path, and owner-only tool policy follows the shared-secret operator contract.
- Reports that only show differences in heuristic detection/parity (for example obfuscation-pattern detection on one exec path but not another, such as `node.invoke -> system.run` parity gaps) without demonstrating bypass of auth, approvals, allowlist enforcement, sandboxing, or other documented trust boundaries.
- Reports that only show an ACP tool can indirectly execute, mutate, orchestrate sessions, or reach another tool/runtime without demonstrating bypass of ACP prompt/approval, allowlist enforcement, sandboxing, or another documented trust boundary. ACP silent approval is intentionally limited to narrow readonly classes; parity-only indirect-command findings are hardening, not vulnerabilities.
- Reports that only show untrusted media bytes reaching a maintained native decoder dependency (for example Sharp/libvips/libheif) without proving the shipped dependency version is vulnerable and demonstrating crash, memory corruption, data exposure, or a boundary bypass through OpenClaw. JavaScript header sniffing and image dimension fast-paths are preflight/UX checks, not the security boundary for native decoder correctness.
- Reports that only show untrusted media bytes reaching a maintained native decoder dependency (for example image codec libraries such as libheif) without proving the shipped dependency version is vulnerable and demonstrating crash, memory corruption, data exposure, or a boundary bypass through OpenClaw. JavaScript header sniffing and image dimension fast-paths are preflight/UX checks, not the security boundary for native decoder correctness.
- Reports whose only impact is transient extra memory, CPU, or allocation work from decoding, base64 expansion, media transcoding, serialization, or other format conversion after the input was already accepted under OpenClaw's configured size/trust limits, including base64 decode-before-size-estimate findings. These are performance issues, not vulnerabilities, unless the report demonstrates unauthenticated amplification, bypass of configured limits, crash/process termination, persistent resource exhaustion, data exposure, or another documented boundary bypass.
- ReDoS/DoS claims that require trusted operator configuration input (for example catastrophic regex in `sessionFilter` or `logging.redactPatterns`) without a trust-boundary bypass.
- Archive/install extraction claims that require pre-existing local filesystem priming in trusted state (for example planting symlink/hardlink aliases under destination directories such as skills/tools paths) without showing an untrusted path that can create/control that primitive.