Local extension before_tool_call/after_tool_call hooks were registered but
never fired after a scoped mid-run plugin activation (harness or memory
ensure) rebound the global hook runner to a narrow registry, dropping
hooks unique to the broader registry (#91918).
The runner is now created once and resolves hooks live on every dispatch
from the composed set of currently-live registries (the most recently
initialized registry, the active registry, and the pinned channel and
http-route surfaces) instead of freezing one registry. The loader's
one-shot preserve gate is removed since activation order no longer
matters. Per-plugin ownership prefers loaded records so a failed scoped
reload cannot shadow a healthy pinned registration (including a
fail-closed tool-call gate), and the explicitly initialized registry
stays highest precedence so SDK callers keep an authoritative registry.
Reuses the live-registry collector the agent-event bridge already uses
so both dispatch surfaces agree on what is live.
One-time maintainer-authorized bootstrap merge for the release-gate verifier policy. Exact hosted CI and all supporting workflow gates passed on 66133de419.
When ensureAgentWorkspace is called for an already-configured workspace
(setupCompletedAt is set), skip creating optional bootstrap files
(SOUL.md, USER.md, IDENTITY.md, HEARTBEAT.md) at the root level.
This prevents subagent spawns from recreating root-level optional
bootstrap markdown files in repository workspaces where these files
were removed intentionally or only exist under agent-specific
subdirectories (e.g., main/).
Fixes #83593
* fix(session): prevent stale finalizer from recreating deleted session rows
After sessions.delete removes a session row, updateSessionStoreAfterAgentRun
could still recreate it via the fallbackEntry path in patchSessionEntry when
preserveUserFacingRunState was false. Changed the guard from only checking
preserveUserFacingRunState to checking whether the session key exists in the
in-memory store but not on disk — indicating the session was intentionally
deleted mid-run.
Fixes #40840
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(session): cover deleted session finalizer fence
* fix(session): fence post-run writes after deletion
* fix(session): guard post-run transcript persistence
* fix(session): fence metadata after session reset
---------
Co-authored-by: Peter Lee <22994703+xialonglee@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Mount the configured package Telegram output directory into the Docker runtime and pass the container path to the harness, avoiding host `/home/runner` paths inside Docker.
Proof:
- pnpm test test/scripts/npm-telegram-live.test.ts
- git diff --check
- https://github.com/openclaw/openclaw/actions/runs/27685093647
Set TMPDIR=/tmp inside the package Telegram Docker runner so runtime scratch files are written to a writable container path.
Proof:
- pnpm test test/scripts/npm-telegram-live.test.ts
- git diff --check
Compaction summarization consumes the model stream via result() only (no
iteration), so it never emitted model.call diagnostic spans. Observe the
stream's result() in the diagnostic wrapper and wire the wrapper into the
direct compaction path so these LLM calls are traced (request/response
content, byte accounting, traceparent).
Decouple underlying-iterator cleanup from terminal-event dedup. The agent
loop awaits result() on the terminal event then abandons the iterator, so
once result() also emits the terminal event, gating safeReturnIterator on
terminalEventEmitted skipped provider cleanup (idle-timeout abort listeners
on the long-lived run signal, SSE readers). Track iterator settlement
separately so return() cleanup always runs; emit dedup stays on
terminalEventEmitted.
Parent compaction model-call spans to the active run/harness trace rather
than a phantom child trace that emits no span of its own.
Clarify that `openclaw mcp list`, `show`, `set`, and `unset` manage the OpenClaw `mcp.servers` registry and do not include the separate mcporter registry.
Co-authored-by: Alix-007 <li.long15@xydigit.com>
The generic dmPolicy/allowFrom warning read only the canonical top-level
allowFrom, so channels that keep their wildcard under the legacy dm.allowFrom
alias (e.g. Discord/Slack, mode=topOnly/topOrNested) got a false 'all DMs
dropped' warning even though runtime honors dm.allowFrom. Resolve policy and
allowFrom through the shared resolveChannelDm* helpers with the channel's
dmAllowFromMode (matching runtime and doctor), and skip nestedOnly channels
whose canonical fields live under dm.* and do not match this warning's
top-level paths. Adds a Discord legacy-alias regression test.
Addresses ClawSweeper review finding P1 (false positives on legacy dm.allowFrom).
Replace the hardcoded Mattermost-only open-DM config check with a generic,
plugin-agnostic warning driven by a single shared evaluator
(evaluateDmPolicyAllowFromDependency) reused by the Zod refinements and the
CLI validator. Surface warnings at 'config validate' and on config load.
Remove the Mattermost-specific status-issues module now covered generically;
keep the runtime drop-log diagnostic.
* fix(feishu): fetch quoted content before empty-message guard
Moves the quoted/replied message content fetching before the empty-message
early return so a reply with only @bot mention (no text, no media) is not
dropped when it quotes a message with meaningful content. The guard now also
checks that quoted text is empty before skipping.
Note: because the fetch is now unconditional on parentId after passing the
group admission/mention gate, an empty-text reply that quotes a parent in an
open group (requireMention: false) without mentioning the bot will now be
dispatched, where before it was dropped. This is the intended behavior for
open groups — any non-empty turn (including one where context comes from a
quote) should reach the agent. For requireMention:true groups, unmentioned
messages still exit at the mention gate before the fetch, so no over-fetch
occurs.
Adds group-based regression tests for the #90177 scenario:
- Positive: mention-only reply in requireMention:true group with quoted
parent — dispatches with [Replying to: "..."] in the body.
- Negative: empty reply with no bot mention in requireMention:true group —
getMessageFeishu is never called and nothing is dispatched.
* fix(feishu): fetch quoted content before empty-message guard (#90192) (thanks @bladin)
---------
Co-authored-by: 黑承亮0668000844 <bladin@users.noreply.github.com>
Co-authored-by: sliverp <870080352@qq.com>
* fix(cron): reject invalid absolute timestamps
* fix(cron): preserve ISO end of day
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(reasoning-tags): accept MiniMax mm: prefix in silent-detection and stream gates
PR #93767 added MiniMax `mm:`-namespaced reasoning-tag support across the
shared sanitizer and Telegram lane coordinator, but two production reasoning-tag
recognizers were missed and still only matched the `antml:` namespace:
- src/auto-reply/tokens.ts: `taggedReasoningPrefixRe` / `openReasoningPrefixRe`
drive `stripLeadingReasoningBlocks` and `isSilentReplyPayloadText`, which 14+
call sites use to detect NO_REPLY silent payloads. A `<mm:think>…</mm:think>NO_REPLY`
reply was not recognized as silent, leaking the wrapper into delivery.
- src/agents/embedded-agent-subscribe.handlers.messages.ts: `REASONING_TAG_RE`
gates `shouldRecomputeFullStream`. A `<mm:think>` streaming chunk failed the
test, so the visible stream was not recomputed and the hidden reasoning leaked.
Add the `mm:` alternative alongside `antml:` in all three regexes, matching the
exact `(?:antml:|mm:)?` form used by #93767. Identification-only change, no other
regex logic touched.
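A minimal sketch of the optional-namespace form, assuming a simplified tag pattern. `REASONING_TAG_SKETCH` and `hasReasoningTag` are hypothetical names for illustration; only the `(?:antml:|mm:)?` fragment comes from the change, and the production regexes cover more cases than this one.

```typescript
// Hypothetical, simplified recognizer: only the `(?:antml:|mm:)?` namespace
// fragment is taken from the change description. The tag names and the rest
// of the pattern are illustrative assumptions.
const REASONING_TAG_SKETCH =
  /<\s*\/?\s*(?:antml:|mm:)?(?:think(?:ing)?|thought)\b[^>]*>/i;

// True when the chunk contains an opening or closing reasoning tag, with or
// without one of the two accepted namespaces.
function hasReasoningTag(chunk: string): boolean {
  return REASONING_TAG_SKETCH.test(chunk);
}
```

Because the namespace group is optional and closed, an unrelated prefix such as `mmx:` does not match under this sketch.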
* test(agents): cover MiniMax reasoning regressions
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(browser): use openTab return value to prevent wsUrl race in ensureTabAvailable
When ensureTabAvailable opens a new tab on empty list, the return value
from openTab was discarded. A subsequent listTabs() call may return tabs
without webSocketDebuggerUrl populated yet, causing the wsUrl filter to
eliminate the newly opened tab and throw BrowserTabNotFoundError.
Fix: capture openTab's return value and merge it into candidates if the
wsUrl filter excluded it. openTab's internal discovery loop already
resolves wsUrl, so the returned tab is always valid.
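The merge step can be sketched as follows, assuming a simplified tab shape. `TabSketch` and `selectCandidates` are hypothetical names, not the real browser module API.

```typescript
// Hypothetical tab shape; only the wsUrl notion comes from the change.
interface TabSketch {
  targetId: string;
  wsUrl?: string;
}

// Keep tabs that already expose a wsUrl, then re-add the freshly opened tab
// if the filter dropped it (for example because listTabs() raced discovery).
function selectCandidates(listed: TabSketch[], opened?: TabSketch): TabSketch[] {
  const candidates = listed.filter((tab) => Boolean(tab.wsUrl));
  if (opened && !candidates.some((tab) => tab.targetId === opened.targetId)) {
    candidates.push(opened);
  }
  return candidates;
}
```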
* fix(browser): harden tab selection discovery
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(feishu): paginate wiki node and space listing (fixes #37626)
client.wiki.spaceNode.list / wiki.space.list return at most one page (max
50 items); the tool ignored has_more/page_token and silently dropped every
node past the first page. Drain both endpoints via a bounded shared helper
that loops on has_more with a 100-page safety cap.
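A minimal sketch of a bounded drain loop, assuming a generic page fetcher. `PageSketch`, `drainPages`, and the `items` field are hypothetical; only `has_more`, `page_token`, and the 100-page cap come from the description above.

```typescript
// Hypothetical page shape mirroring the has_more/page_token fields named above.
interface PageSketch<T> {
  items: T[];
  has_more: boolean;
  page_token?: string;
}

const MAX_PAGES = 100; // safety cap from the change description

// Follows page_token while has_more is set, stopping at the page cap.
async function drainPages<T>(
  fetchPage: (pageToken?: string) => Promise<PageSketch<T>>,
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  for (let page = 0; page < MAX_PAGES; page++) {
    const res = await fetchPage(token);
    all.push(...res.items);
    // Stop when the server reports no more pages or gives no cursor.
    if (!res.has_more || !res.page_token) break;
    token = res.page_token;
  }
  return all;
}
```

The cap keeps a misbehaving endpoint that always reports `has_more` from looping indefinitely.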
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(feishu): expose wiki pagination cursors
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
The previous `read_text(encoding="utf-8")` call left the UTF-8 byte
order mark (EF BB BF, three bytes) in the content string if the file
was saved by a tool that emits a BOM. The first line check
(`lines[0].strip() != "---"`) then saw "\ufeff---" and rejected the
file as "Invalid frontmatter format", even though the document was
otherwise valid frontmatter.
Co-authored-by: Zo Bot <github-automation@zo.computer>
When more than maxMissedJobsPerRestart cron jobs are overdue after gateway
downtime, runMissedJobs defers the overflow jobs to a near-future staggered
catch-up slot. start()'s second maintenance pass then recomputed each overflow
cron deferral to its natural schedule slot, because it ran future-slot repair
with the default-enabled flag. For a daily 0 9 * * * job the now+stagger
catch-up was clobbered to the next 09:00, dropping the missed run for a full
period.
Scope the exemption instead of disabling repair wholesale: runMissedJobs now
returns the ids it deferred this startup, recomputeNextRunsForMaintenance gains
skipFutureRepairJobIds to exempt exactly those ids, and start() threads them
into its pass. Overflow catch-up deferrals survive until their staggered tick
while ordinary stale-future cron slots are still repaired on startup.
* fix(status): show 0/1.0m instead of ?/1.0m on a fresh session
On a brand-new /new session the persisted totalTokens is absent
(undefined), so /status rendered the context numerator as ? via
formatTokens(null, ...). A fresh session with no usage is a known
zero, not an unknown total, so normalize undefined-but-not-stale
totals to 0 before formatting while leaving the intentional
totalTokensFresh === false stale guard (which must keep ?) intact.
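The normalization can be sketched as below, assuming a simplified session entry. `SessionUsageSketch` and `resolveDisplayTotal` are hypothetical names, and returning null for every stale entry is an assumption of this sketch.

```typescript
// Hypothetical session shape; field names follow the description above.
interface SessionUsageSketch {
  totalTokens?: number;
  totalTokensFresh?: boolean;
}

// Returns the numerator to display, or null when the total is unknown.
function resolveDisplayTotal(entry: SessionUsageSketch): number | null {
  // Stale totals stay unknown so the formatter keeps rendering "?".
  if (entry.totalTokensFresh === false) return null;
  // A fresh session with no recorded usage is a known zero.
  return entry.totalTokens ?? 0;
}
```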
Fixes #93771
* fix(status): persist fresh-session zero usage
* fix(status): identify fresh empty sessions
* fix(status): persist fresh empty session usage
* fix(status): preserve fork and compaction token state
* fix(status): preserve queued compaction token state
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(google): keep parallel Gemini tool responses in the turn after the model
On Gemini < 3 vision models, a parallel tool-call turn whose non-last result
returns an image split function responses across user turns. The merge heuristic
only inspected contents[last], so the separate "Tool result image:" turn landed
between two parallel responses and stranded the second one in a fresh turn. The
turn right after the model then carried fewer functionResponse parts than the
model issued functionCall parts, so Gemini returned 400 INVALID_ARGUMENT. Because
the malformed turn is persisted, every later turn re-400s and the session sticks.
Replace the contents[last] heuristic with a run-scoped accumulator: all responses
for one model turn merge into the single user turn after it, and Gemini < 3 image
turns defer to the end of the tool-result run so they trail that response turn.
Covers both google.ts and google-vertex.ts, which share this convertMessages.
* fix(google): align provider transport tool result turns
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(memory): await search-sync before returning results to prevent stale index
When the gateway process has been running for a while, memory_search
returns stale results because startAsyncSearchSync fires off the index
sync as a background task (void ... .catch()) without waiting for it
to complete. Search results are then read from the old index state.
Change startAsyncSearchSync from sync/fire-and-forget to async/await
so that the index is synced before search results are returned. This
ensures memory_search reflects the current filesystem state, matching
the behavior of the CLI command which creates
a fresh manager each time.
Fixes #52115
* test(memory): prove search waits for dirty sync
* test(memory): align search with synchronous sync
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Treat refreshable manifest catalog rows as non-authoritative and load the owning plugin for runtime/cache-backed discovery. Adds focused regression coverage for entries-only and full discovery paths.
* fix(feishu): recover CJK filenames from JSON file_name field
Apply recoverUtf8FileNameFromLatin1Header to JSON-derived filenames in
extractFeishuDownloadMetadata, matching the behavior already present for
Content-Disposition headers in decodeDispositionFileName.
Fixes #81103
* fix(feishu): recover inbound CJK filenames
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(reasoning-tags): strip MiniMax `mm:` namespaced reasoning tags
MiniMax M3 (e.g. via Fireworks) emits its chain-of-thought inline in the
content stream wrapped in `<mm:think>…</mm:think>` rather than in a separate
`reasoning_content` field. The reasoning-tag stripper only recognized the
`antml:` namespace, so `mm:`-namespaced tags slipped through QUICK_TAG_RE and
leaked the model's hidden reasoning into visible chat output.
Accept the `mm:` prefix alongside `antml:` in the shared sanitizer
(reasoning-tags.ts) and in the Telegram reasoning-lane coordinator's tag regex
and prefix list. Adds unit tests covering mm: think/thinking/thought blocks,
truncated-open orphan close recovery, and code-fence preservation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(reasoning): handle MiniMax tags in streams
---------
Co-authored-by: DrHack1 <DrHack1@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* feat(inbound-meta): expose message_type in trusted inbound metadata (fixes #50482)
Add resolveInboundMessageType() that extracts the media type prefix
(e.g. 'audio' from 'audio/ogg') from MediaType or MediaTypes fields.
Expose it as message_type in the inbound metadata JSON so agents can
distinguish voice messages from typed text for turn-completion heuristics.
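A rough sketch of the prefix extraction, assuming a simplified context shape. `InboundCtxSketch` and `resolveInboundMessageTypeSketch` are hypothetical, and falling back to the first `MediaTypes` entry is an assumption rather than the confirmed behavior.

```typescript
// Hypothetical context shape; MediaType/MediaTypes are the fields named above.
interface InboundCtxSketch {
  MediaType?: string;
  MediaTypes?: string[];
}

// Returns the part of the media type before the slash, or undefined when no
// media type is present.
function resolveInboundMessageTypeSketch(ctx: InboundCtxSketch): string | undefined {
  // Assumption: prefer the single MediaType, else the first MediaTypes entry.
  const mime = ctx.MediaType ?? ctx.MediaTypes?.[0];
  const prefix = mime?.split("/")[0]?.trim().toLowerCase();
  return prefix ? prefix : undefined;
}
```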
* fix(inbound-meta): preserve per-turn source modality
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* feat(memory): apply outputDimensionality truncation to local GGUF embeddings
The outputDimensionality config field was passed through to the local
embedding provider but never applied. Local GGUF models (e.g.
Qwen3-Embedding-0.6B) always returned their full dimension vector.
Apply slice(0, N) after normalization so MRL-capable models can benefit
from dimension truncation — matching the behavior already supported by
Gemini embedding-2 and OpenAI providers.
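The order of operations can be sketched as follows. `normalizeThenTruncate` is a hypothetical helper; the description only states that slice(0, N) is applied after normalization, so whether the real code re-normalizes afterwards is left open here.

```typescript
// L2-normalize, then keep the first `dims` components, in the order the
// change describes. This sketch does not re-normalize after truncation
// because the description does not say the real code does.
function normalizeThenTruncate(vec: number[], dims?: number): number[] {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  const unit = norm > 0 ? vec.map((x) => x / norm) : vec.slice();
  // No truncation when the dimensionality is unset or not positive.
  return dims !== undefined && dims > 0 ? unit.slice(0, dims) : unit;
}
```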
Fixes #58765
* fix(memory): preserve local embedding dimensions through worker
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
The isToolDocBlockStart function checked normalized === normalized.toUpperCase()
but normalized is already uppercased from line 24, making the condition always true.
This caused mixed-case lines ending with ':' to be incorrectly detected as doc block
starts, truncating tool descriptions unnecessarily.
Compare the original line instead to correctly detect all-uppercase headings.
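The difference between the two checks can be illustrated with a simplified pair of functions. Both names are hypothetical and the trailing-colon test is inferred from the description; the real isToolDocBlockStart likely does more.

```typescript
// Buggy form: `normalized` is already uppercased, so the second comparison
// is always true and any line ending with ':' looks like a heading.
function isHeadingBuggy(line: string): boolean {
  const normalized = line.trim().toUpperCase();
  return normalized.endsWith(":") && normalized === normalized.toUpperCase();
}

// Fixed form: compare the original (trimmed) line against its uppercase form,
// so only lines that were written in all caps qualify.
function isHeadingFixed(line: string): boolean {
  const trimmed = line.trim();
  return trimmed.endsWith(":") && trimmed === trimmed.toUpperCase();
}
```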
Co-authored-by: Gautam Kumar <gautamkumarofficial@users.noreply.github.com>
* fix(usage): reject invalid explicit dates in usage RPC date parsing
usage.cost and sessions.usage accepted shape-valid but impossible dates such as 2026-02-30: parseDateParts validated only the YYYY-MM-DD regex, so Date.* silently rolled them over (2026-02-30 -> 2026-03-02) and the RPC returned cost/usage for the wrong day. Out-of-range parts now fail a UTC round-trip check, and an explicitly provided unparseable date (bad format or impossible calendar date) returns INVALID_REQUEST instead of silently falling back to the default range. Absent/valid dates are unchanged.
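A minimal sketch of a UTC round-trip check. `parseStrictUtcDate` is a hypothetical stand-in, not the real parseDateParts, and returning null in place of INVALID_REQUEST is a simplification.

```typescript
// Parses YYYY-MM-DD and rejects impossible calendar dates via a UTC round trip.
function parseStrictUtcDate(value: string): Date | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) return null; // bad format
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls out-of-range parts over, so the parts only survive the
  // round trip when they named a real calendar date.
  const roundTrips =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return roundTrips ? date : null;
}
```

Under this sketch `parseStrictUtcDate("2026-02-30")` returns null instead of silently resolving to 2026-03-02.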
[AI-assisted]
* fix(usage): reject non-string explicit dates
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Duplicate user-message detection ran over the full branch, so when a prompt
was re-sent within the 60s window its earlier copy in the summarized prefix
and the later copy in the kept tail were both removed: the summarized copy via
summarizedBranchIds and the kept copy as a duplicate. With
truncateAfterCompaction enabled the prompt then vanished from the successor
transcript entirely. Restrict dedup to the kept region so the first surviving
copy is preserved.
* fix: pin plugin workspace dir for sessions.list to avoid O(rows) memo busting
sessions.list was O(rows) slow under concurrent agent/cron load because
each row read a process-global active plugin-registry workspace dir
that was mutated by other turns between rows. The per-row memo key
changed every time, so loadPluginMetadataSnapshot scanned fresh
(~100ms per row).
Fix:
1. Add AsyncLocalStorage-based workspace dir pinning to
runtime-workspace-state.ts — withPinnedActivePluginRegistryWorkspaceDir()
snapshots the current workspace dir for the duration of a callback.
2. Wrap listSessionsFromStoreAsync body in the pin so all per-row
metadata lookups use a stable memo key.
Fixes #90814
* test(plugins): cover request-scoped workspace pins
* fix(plugins): pin canonical runtime workspace reads
* fix(plugins): preserve workspace pins across reloads
---------
Co-authored-by: lsr911 <lsr911@github.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(ollama): preserve configured API during discovery
* fix(ollama): keep compatible discovery base URL
* fix(ollama): route compatible APIs through configured transport
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix: scope assistant avatar override to agent ID
The local assistant avatar override was stored globally in
localStorage without an agentId, causing the same avatar to
apply to all agents. Setting an avatar for agent A would
overwrite the avatar for agent B.
Fix: include agentId when saving the local avatar override,
and filter by agentId when loading. An override saved for one
agent no longer bleeds into other agents.
Fixes #90890
* fix(ui): persist assistant avatars per agent
* fix(ui): satisfy scoped avatar checks
---------
Co-authored-by: lsr911 <lsr911@github.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
The 'can preserve asynchronous provider model discovery' test was
flaky because resolveModelAsyncMock in beforeEach delegates to
resolveModelMock. When useAsyncModelResolution=true, the test
asserted resolveModelMock was not called, but the delegation
caused it to be called, failing CI on two lanes.
Fix: use a standalone vi.fn() for the async resolver in this
test, and explicitly reset resolveModelMock before the assertion
to guard against mock state leakage from prior tests.
Fixes #92117
Co-authored-by: lsr911 <lsr911@github.com>
MiniMax TTS API returns HTTP 200 even on quota/billing errors, with the
error encoded in base_resp.status_code. Without this check, placeholder
audio returned alongside the error is silently accepted, preventing the
TTS dispatcher from falling back to a configured secondary provider.
This follows the same pattern used by all other MiniMax providers:
- image-generation-provider.ts
- video-generation-provider.ts
- music-generation-provider.ts
- minimax-web-search-provider.runtime.ts
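The body-level check might look like the sketch below. The response shape and `assertMiniMaxOk` are hypothetical, and treating status_code 0 as success is an assumption of this sketch.

```typescript
// Hypothetical response shape; base_resp.status_code is the field named above.
interface MiniMaxTtsResponseSketch {
  base_resp?: { status_code?: number; status_msg?: string };
  data?: { audio?: string };
}

// Throws when the body reports an API-level error despite HTTP 200, so a
// caller can fall back to another provider instead of accepting the audio.
// Assumption: status_code 0 means success.
function assertMiniMaxOk(body: MiniMaxTtsResponseSketch): void {
  const code = body.base_resp?.status_code;
  if (typeof code === "number" && code !== 0) {
    const msg = body.base_resp?.status_msg ?? "unknown error";
    throw new Error(`MiniMax TTS API error (${code}): ${msg}`);
  }
}
```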
Fixes #76904
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(whatsapp): extract GIF metadata and distinguish gifPlayback in media placeholders (fixes #49099)
- Add escapeAttr() helper to sanitize quotes and angle brackets in XML attribute values
- Add extractExternalAdReplyMetadata() to extract title, sourceUrl, body from contextInfo.externalAdReply
- Distinguish GIFs from videos using videoMessage.gifPlayback flag (media:gif vs media:video)
- Enrich image and video placeholders with externalAdReply metadata when available
- Add 5 test cases covering GIF detection, metadata extraction, attribute escaping, and empty fields
* fix(whatsapp): keep GIF metadata in untrusted context
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
QQBot is the only channel that root-sandboxes outbound local files. Its three
gate sites (resolveOutboundMediaPath, the voice send re-check, and
structured-payload validation) only trusted the QQ Bot media storage roots, so
framework-generated scratch media written under OpenClaw's hardened temp root
(e.g. cron auto-TTS voice files from speech-core) was rejected. The send then
returned a no-identity error, the message was silently lost, yet cron still
recorded it as delivered.
Add one shared resolver (resolveTrustedOutboundMediaPath) that also trusts the
preferred OpenClaw temp root — already a sanctioned media root in core
(buildMediaLocalRoots) — and route all three gates through it so the trust set
agrees everywhere. Fixes #92816.
Co-authored-by: zengwen <zeng_wen@foxmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
When an assistant message's `content` is a raw string at runtime (JSONL
transcript replay passes it through even though the type declares an array),
the OpenAI-compatible completions path crashes:
- `transformMessages` called `assistantMsg.content.flatMap(...)` ->
`TypeError: ... .flatMap is not a function` (first crash, always hit).
- Two `hasToolHistory` helpers (`openai-transport-stream.ts` and
`openai-completions.ts`) called `content.some(...)` -> `TypeError: ...
.some is not a function` (siblings, surface once the flatMap crash is fixed).
Normalize a string assistant content to an equivalent single text block
before transforming (matching the string->text handling already used in
anthropic-payload-policy.ts), and `Array.isArray`-guard both `hasToolHistory`
helpers so a string assistant simply does not count toward tool history.
Verified end-to-end through the real `buildOpenAICompletionsParams` and
`streamOpenAICompletions` entry points: before the fix a string-content
assistant followed by a toolResult throws TypeError; after the fix params are
produced correctly (string preserved as text, tool history detected). Normal
array content is unaffected.
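Both guards can be sketched as below, assuming simplified block types. `BlockSketch`, `normalizeAssistantContent`, and `hasToolCall` are hypothetical names, not the helpers in the transport files.

```typescript
// Hypothetical block types for illustration.
type TextBlockSketch = { type: "text"; text: string };
type ToolCallBlockSketch = { type: "toolCall"; id: string };
type BlockSketch = TextBlockSketch | ToolCallBlockSketch;

// Normalize runtime content that may be a raw string into a block array, so
// later array methods such as flatMap are safe to call.
function normalizeAssistantContent(content: unknown): BlockSketch[] {
  if (typeof content === "string") return [{ type: "text", text: content }];
  return Array.isArray(content) ? (content as BlockSketch[]) : [];
}

// Array.isArray guard: a string assistant never counts toward tool history.
function hasToolCall(content: unknown): boolean {
  return (
    Array.isArray(content) &&
    content.some((block) => (block as BlockSketch).type === "toolCall")
  );
}
```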
* fix(respawn): rewrite pnpm versioned entry paths to stable wrapper
During self-update the pnpm versioned directory (node_modules/.pnpm/openclaw@<ver>/)
may be removed. If process.argv contains the versioned path, the respawned child
fails to start because the entrypoint no longer exists.
Detect pnpm versioned realpaths in spawnDetachedGatewayProcess and rewrite them
to the stable node_modules/<pkg>/openclaw.mjs wrapper before spawning.
Fixes #52313
* fix(respawn): scope pnpm entry rewrite to openclaw
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(wizard): preserve existing default model during setup auth choice
Without preserveExistingDefaultModel: true, the setup wizard
overwrites the user's configured default model when a new provider
auth is selected. This causes existing heartbeat turns to silently
consume paid API quota (e.g. Google Gemini) instead of the user's
original model.
The configure.gateway-auth.ts path already passes this flag; the
setup wizard path was missing it.
Fixes #64129
* fix(wizard): add type assertion for preserveExistingDefaultModel test
Summary:
- This PR changes the docs i18n Codex command-output preview to keep a short head plus retained tail, and adds Go unit coverage for stdout and stderr tails.
- PR surface: Other +20. Total +20 across 2 files.
- Reproducibility: yes. Source inspection of current main and `v2026.6.6` shows long output is truncated to the prefix only, and the PR's focused tests model the stdout/stderr tail cases that lose final API details.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head b510b598c6.
- Required merge gates passed before the squash merge.
Prepared head SHA: b510b598c6
Review: https://github.com/openclaw/openclaw/pull/93687#issuecomment-4720840859
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: Mason Huang <8814856+hxy91819@users.noreply.github.com>
Approved-by: hxy91819
* fix(agents): handle string assistant content in getLastAssistantText
PR #93456 added an `if (!Array.isArray(message.content)) return false` guard
to hasAssistantToolCallArguments, acknowledging that a persisted/legacy
assistant message can carry a string `content` at runtime even though the
type is declared as an array. buildSessionContext pushes such entries through
unchanged, so the string can reach agent.state.messages.
getLastAssistantText() still assumed an array: iterating a string `content`
yields individual characters, none of which has `type === "text"`, so the
assistant's text was silently dropped and the function returned undefined.
Mirror extractTextContent(): when `content` is a string, treat it as the text
itself; otherwise iterate the content blocks as before. The aborted/empty
check is left untouched because `.length === 0` is already correct for both an
empty array and an empty string.
* fix(agents): safely read persisted assistant text
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
When the block reply pipeline streamed partial content, buildReplyPayloads()
unconditionally dropped all text-only final payloads. This suppressed the
complete final reply when the pipeline only streamed a partial block and
never sent the exact final text.
The fix checks hasSentPayload() for text-only payloads too, preserving
unsent finals instead of dropping them unconditionally.
The async Clipboard API is only available in secure contexts (HTTPS or
localhost). On plain-HTTP deployments navigator.clipboard is undefined, so the
code block copy button threw synchronously and silently failed. Add a shared
copyToClipboard helper that guards the secure-context path and falls back to the
legacy execCommand copy, reuse it for the code block button and the copy-as-
markdown affordance, and cover it with a unit test plus a real-browser e2e that
simulates the non-secure context.
Fixes #93628
Co-authored-by: Pick-cat <266665499+Pick-cat@users.noreply.github.com>
Summary:
- The PR changes `/status` context-window selection to ignore stale runtime snapshots after manual model switches while preserving fallback/runtime-alias context windows.
- PR surface: Source +6, Tests +128. Total +134 across 2 files.
- Reproducibility: yes. source-reproducible: current main trusts explicit runtime context before checking fall ... fer. I did not run a local failing repro, but the PR fixture models the stale prior-runtime state directly.
Automerge notes:
- PR branch already contained follow-up commit before automerge: test(status): make context fixtures type-correct
Validation:
- ClawSweeper review passed for head f14fda4279.
- Required merge gates passed before the squash merge.
Prepared head SHA: f14fda4279
Review: https://github.com/openclaw/openclaw/pull/93306#issuecomment-4708596208
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Approved-by: hxy91819
Preserve concurrent prompt-time transcript updates across stale session managers, side appends, transcript navigation, nested owned writes, and doctor repair.
Fixes #93193.
Thanks @snowzlm for the report and original fix.
Co-authored-by: snowzlm <snowzlm@noreply.codeberg.org>
* fix(reply): preserve pending thread evidence when reconciling partial send results
extractMessagingToolSendResult re-derived threadId/threadImplicit/threadSuppressed
straight from the provider result. Mattermost is the only production provider that
implements extractToolSendResult, and for an implicitly threaded send it reports only
{ to }, so the reconciler overwrote the correct pending thread evidence with undefined.
That defeated same-thread reply suppression in reply-payloads dedupe and delivered the
agent's final reply twice in the thread, on both the native and Codex harnesses.
A partial provider result now keeps the pending thread evidence it does not speak to: a
provider-reported threadId still wins (and clears the implicit flag), but an absent one
no longer erases the pending threadId/threadImplicit/threadSuppressed.
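The reconcile rule can be sketched as a small pure function. `ThreadEvidenceSketch` and `reconcileThreadEvidence` are hypothetical, and representing a cleared implicit flag as false is an assumption of this sketch.

```typescript
// Hypothetical evidence shape using the field names from the description.
interface ThreadEvidenceSketch {
  threadId?: string;
  threadImplicit?: boolean;
  threadSuppressed?: boolean;
}

// A provider-reported threadId wins and clears the implicit flag; an absent
// one leaves the pending evidence untouched.
function reconcileThreadEvidence(
  pending: ThreadEvidenceSketch,
  reported: { threadId?: string },
): ThreadEvidenceSketch {
  if (reported.threadId !== undefined) {
    return { ...pending, threadId: reported.threadId, threadImplicit: false };
  }
  return { ...pending };
}
```

A partial result such as `{ to }` carries no threadId, so under this sketch the pending values pass through unchanged.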
Regression introduced by c67dc59b02 (#90943).
* test(reply): use a core-local stub provider instead of the bundled Mattermost import
The reconcile-thread regression test deep-imported extensions/mattermost from a
core test, which trips the core/extension package boundary (boundary-invariants
"keeps core tests off bundled extension deep imports", extension-test-boundary,
and check-tsgo-core-boundary pulling extensions/mattermost transitively).
Replace it with a core-local channel test plugin that reproduces the same
contract: an implicit-threading extractToolSend, a partial extractToolSendResult
that reports only { to, threadId? }, and no targetsMatchForReplySuppression
matcher. The test now exercises the generic reconciler contract with no
extension dependency. It still fails on pristine main and passes with the fix.
* fix(reply): reconcile thread evidence atomically
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Adds Codex as a selectable hosted web-search provider, routes native Codex search safely across model overrides, and isolates bounded hosted-search workers from configured tools.
Verification: focused post-merge regression suite passed 202/202 tests on exact head 23824af49a.
* fix(ollama): repair retired cloud provider endpoint
Route configured Ollama Cloud provider ids through plugin doctor compatibility migrations so doctor --fix can rewrite the retired ai.ollama.com endpoint before runtime reads persisted config.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(doctor): align provider fixture with typed config
Ensure the doctor registry provider-scoped migration test uses a fully typed provider fixture so the test type-check shard validates the intended behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(ollama): align doctor fixture with typed config
Use fully typed provider and model fixtures in the Ollama doctor contract tests so the extension test type-check shard validates the migration behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(ollama): preserve custom cloud provider base url
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(ollama): avoid logging retired endpoint secrets
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
A literal null `workspace` field in an agent entry failed schema validation at
startup, producing a crash loop that `openclaw doctor --fix` could not recover
from because the compatibility pipeline never normalized the malformed field.
Add a narrow doctor migration that removes null `workspace` values from
`agents.list` entries and relies on the existing fallback path (defaults or
stateDir-derived workspace) at runtime.
Fixes #77718.
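The migration described above can be sketched as follows. This is a minimal illustration, assuming a simplified config shape; `dropNullWorkspaces`, `AgentEntry`, and `SketchConfig` are hypothetical names, and only the `agents.list[].workspace` path comes from the commit message.

```typescript
// Hypothetical, simplified config shape for illustration only.
type AgentEntry = { id: string; workspace?: string | null };
type SketchConfig = { agents?: { list?: AgentEntry[] } };

// Returns a copy of the config with null `workspace` values removed from
// agents.list entries, so the runtime fallback path can supply a workspace.
function dropNullWorkspaces(config: SketchConfig): { config: SketchConfig; changed: boolean } {
  const list = config.agents?.list;
  if (!Array.isArray(list)) return { config, changed: false };

  let changed = false;
  const nextList = list.map((entry) => {
    if (entry.workspace !== null) return entry; // string or absent: leave untouched
    const copy: AgentEntry = { ...entry };
    delete copy.workspace; // remove the malformed field instead of guessing a value
    changed = true;
    return copy;
  });

  if (!changed) return { config, changed: false };
  return { config: { ...config, agents: { ...config.agents, list: nextList } }, changed: true };
}
```

Removing the key rather than substituting a path keeps the migration narrow: the existing defaults decide the workspace, not the doctor step.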
PR #88496 routed /config show and /config set chat output through the
shared schema-aware redaction path, but the sibling /debug commands in
the same handler were left untouched. /debug show JSON-stringified the
full runtime override tree verbatim and /debug set echoed the raw value,
so a secret-shaped override (e.g. gateway.auth.token, channels.*.botToken)
set via /debug set was rendered in plaintext to chat-visible output.
Apply redactConfigObject(overrides, schema.uiHints) to the override tree
before rendering /debug show, and reuse formatConfigSetValueLabel for the
/debug set acknowledgement, matching the existing /config redaction
contract. Non-secret fields and env placeholders are preserved.
* fix(ui): restore provider usage pill in desktop chat composer (#93041)
Composer refactors dropped the quota pill from renderChatControls and left the
desktop renderChatSessionSelect wrapper orphaned, so it rendered nowhere on
desktop. Re-attach the existing pill, add modelAuthStatusResult to the guarded
controls dep list so it updates when usage windows arrive async, and hide it on
the 2-col mobile composer grid.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(ui): add real-browser e2e proof for chat quota pill (#93041)
Playwright/Chromium test that mocks models.authStatus usage windows and asserts
the restored provider usage pill renders in the desktop chat composer (and is
absent without usage). Skips gracefully when Chromium is unavailable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(ui): write quota-pill e2e screenshots to ignored .artifacts path (#93041)
Match the control-ui-e2e convention (.artifacts/control-ui-e2e/...) so the proof
run does not leave untracked root-level files. Addresses ClawSweeper review.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(discord): apply tool status emojis immediately to avoid override by thinking reactions
Tool emoji reactions (🛠️, 🌐, 🔎, etc.) during Discord tool/skill execution
were not appearing because setTool() used a 700ms debounce shared with
setThinking(). Rapid onReasoningStream calls from overlapping reasoning
would repeatedly overwrite the pending tool emoji with 🧠, so the tool
emoji never reached Discord.
Fix by making setTool() apply emojis immediately (skip debounce). Tool
transitions are user-facing state changes that should be visible without
delay, and the terminal done/error transitions already flush any pending
state.
Fixes #92715.
* fix(discord): forward quiet tool lifecycle status
* fix(slack): preserve tool status reactions
* test(channels): type quiet tool lifecycle options
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(cron): clear delivery routing fields from cron edit
cron edit could set delivery channel/to/thread-id/account but could not unset them: an empty value (e.g. --to "") builds delivery.X = undefined, which is omitted from the JSON-RPC patch, so mergeCronDelivery never sees the key and the field is silently kept. The gateway RPC already accepts an explicit null to clear each field (CronDeliveryPatchSchema + mergeCronDelivery via normalizeOptionalString); the CLI just never sent it.
Add --clear-channel/--clear-to/--clear-thread-id/--clear-account, each emitting null (mirroring the existing --clear-model), with mutual-exclusion guards against the matching set flag and against --webhook.
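The undefined-versus-null distinction behind this fix can be shown in a short sketch. The patch shape and the `mergeDelivery` helper below are illustrative assumptions, not the real `CronDeliveryPatchSchema` or `mergeCronDelivery`; the only behavior relied on is that `JSON.stringify` omits keys whose value is `undefined`.

```typescript
// Hypothetical patch shape: `undefined` means "not mentioned", `null` means "clear".
type DeliveryPatch = { channel?: string | null | undefined; to?: string | null | undefined };
type Delivery = { channel?: string; to?: string };

// What crosses the wire: JSON.stringify drops undefined-valued keys entirely.
function encodePatch(patch: DeliveryPatch): string {
  return JSON.stringify({ delivery: patch });
}

// Illustrative merge (not the real mergeCronDelivery):
// an explicit null clears the field, an absent key keeps the stored value.
function mergeDelivery(existing: Delivery, patch: DeliveryPatch): Delivery {
  const next: Delivery = { ...existing };
  for (const key of ["channel", "to"] as const) {
    if (!(key in patch)) continue;
    const value = patch[key];
    if (value === null) delete next[key];
    else if (typeof value === "string") next[key] = value;
  }
  return next;
}
```

With this model, a patch built from `--to ""` as `{ to: undefined }` arrives as an empty object and changes nothing, while `{ to: null }` survives serialization and clears the field.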
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(cron): preserve delivery defaults when clearing routes
* fix(cron): validate cleared prefixed routes
---------
Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* [AI] fix(feishu): guard against missing inbound in channelRuntime fallback
When channelRuntime from gateway context is truthy but lacks the inbound
property, the ?? operator still selects it over getFeishuRuntime().channel,
causing TypeError at core.channel.inbound.run().
The ChannelGatewayContext types channelRuntime as ChannelRuntimeSurface
(only guarantees runtimeContexts), but channel.ts casts it to
PluginRuntimeChannel via type assertion. If a partial runtime object
without inbound is provided, the type lie becomes a runtime crash.
Fix: check channelRuntime?.inbound before using it; fall back to
getFeishuRuntime().channel when inbound is absent.
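A minimal sketch of the guard, assuming simplified stand-in types: `PartialRuntime`, `FullRuntime`, and `pickChannelRuntime` are hypothetical names, and the fallback callback stands in for `getFeishuRuntime().channel`.

```typescript
// Simplified stand-ins for the real runtime types (assumptions for illustration).
type InboundSurface = { run: (text: string) => string };
type PartialRuntime = { runtimeContexts: unknown; inbound?: InboundSurface };
type FullRuntime = { runtimeContexts: unknown; inbound: InboundSurface };

function pickChannelRuntime(
  fromContext: PartialRuntime | undefined,
  fallback: () => FullRuntime,
): FullRuntime {
  // `fromContext ?? fallback()` would select any non-nullish object,
  // including one that has no `inbound`, and crash later at inbound.run().
  if (fromContext?.inbound) {
    return { ...fromContext, inbound: fromContext.inbound };
  }
  return fallback();
}
```

Checking the property that is actually dereferenced, rather than the object's truthiness, is what closes the gap the type assertion left open.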
Related to #93453
* [AI] test(feishu): add regression for partial channelRuntime lacking inbound
When channelRuntime has runtimeContexts but no inbound, the guard in
bot.ts should fall back to getFeishuRuntime().channel. Add a test that
passes a partial channelRuntime and verifies dispatch does not crash.
Refs #93453
Carry prepared manifest model-id normalization records through the runtime bridge so hot callers reuse existing metadata instead of consulting the snapshot fallback.
The final change preserves the existing no-prepared-record behavior, adds focused forwarding coverage, and removes the one-off proof script before landing.
Thanks @zeroaltitude.
Verification:
- 224 focused tests
- full CI run 27594070734
- real behavior proof run 27594081022
- final whole-branch autoreview clean
Co-authored-by: zeroaltitude <zeroaltitude@gmail.com>
Suppress each raw commentary echo paired with a typed Codex item completion by protocol order, while preserving later raw-only notes and contributor-rewritten completion text.
Fixes #93296.
Thanks @Marvinthebored.
Verification:
- 95 focused projector tests
- full CI run 27593515603
- real behavior proof run 27593522821
- local and whole-branch autoreview clean
Co-authored-by: Peter Lindsey <peter@lindsey.jp>
Pinned session-extension registries now remain the owner even when empty, preventing later active registry churn from leaking agent-owned extensions into the gateway surface.
Inbound PDF/document text already flows to agents through the canonical
media-understanding pipeline (applyMediaUnderstanding -> extractFileBlocks),
but it inherited the OpenResponses input_file limits (5MB / 4 pages), so large
managed PDFs from channels/Control UI were skipped and locked-down agents saw
only an attachment marker.
- Size inbound file extraction from agents.defaults.mediaMaxMb (default 20MB,
cap 25MB) and pdfMaxPages (default 20, cap 150) via a new
resolveFileExtractionLimits; explicit gateway responses.files config still
wins per-field. (#90096)
- chat.send: let oversized (>5MB) managed inbound PDFs pass through sandbox
staging with their managed media path instead of a 4xx, so host-side
extraction reaches sandboxed agents without copying the file into every
sandbox; non-PDF oversize files are still rejected. (#90097)
Reuses the existing extraction/injection path; no parallel module or extra
prompt-injection sites.
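The default, cap, and per-field override rules in the first bullet can be sketched like this. It is not the real `resolveFileExtractionLimits`; the function name, the input shapes, and the 1 MB = 1024 * 1024 bytes conversion are assumptions made for the example.

```typescript
// Hypothetical input shapes; only the numeric defaults and caps come from the commit message.
type AgentDefaults = { mediaMaxMb?: number; pdfMaxPages?: number };
type ExplicitFileLimits = { maxBytes?: number; maxPages?: number };

const BYTES_PER_MB = 1024 * 1024; // assumption: binary megabytes

function sketchExtractionLimits(
  defaults: AgentDefaults,
  explicit: ExplicitFileLimits,
): { maxBytes: number; maxPages: number } {
  // Agent defaults: 20 MB capped at 25 MB, 20 pages capped at 150.
  const mediaMb = Math.min(defaults.mediaMaxMb ?? 20, 25);
  const pages = Math.min(defaults.pdfMaxPages ?? 20, 150);
  // Explicit gateway config wins independently for each field.
  return {
    maxBytes: explicit.maxBytes ?? mediaMb * BYTES_PER_MB,
    maxPages: explicit.maxPages ?? pages,
  };
}
```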
When a child openclaw process is spawned via a backgrounded subshell that
exits before the new process reaches the stale-pid sweep, the new process
is reparented to the supervisor (PID 1 / launchd) and the ancestor walk
in getSelfAndAncestorPidsSync can no longer see the running gateway. The
running gateway then shows up on lsof as an unrelated sibling on the
port and gets SIGKILL'd by cleanStaleGatewayProcessesSync, recreating
the issue #68451 supervisor restart loop across a reparent boundary.
Real-world trigger: a user ~/.zshrc auto-start block
if ! pgrep -x openclaw-gateway >/dev/null; then
(openclaw gateway >/dev/null 2>&1 &)
fi
combined with codex per-turn `zsh -c "set -e; . shell_snapshot"` invocations
caused every chat turn on rh-bot to SIGKILL its launchd-managed gateway,
producing HTTP 000 errors and ~33 kill events captured by a forensic
launchd unified-log tracker before the zshrc was patched.
Fix: gateway-cli captures OPENCLAW_GATEWAY_SERVICE_PID from inherited env
BEFORE overwriting it with process.pid, then threads the captured PID
through cleanStaleGatewayProcessesSync into getSelfAndAncestorPidsSync's
exclusion set. The protection is opt-in per call site so existing
maintainer paths (openclaw update / openclaw doctor restart helpers) keep
their ability to terminate a running gateway intentionally.
The inherited-PID parser is strict positive-integer only: a malformed
inherited env value (`"123abc"`, `"123.4"`, `"0x7b"`, etc.) is rejected
rather than silently protecting PID 123 from cleanup and leaving the
stale listener alive. New focused unit tests cover the parser
contract.
Existing regression tests cover the reparent suicide-kill scenario and
the defensive ignore-non-positive-PID contract on the cleanup side.
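A strict positive-integer parser of the kind described above might look like the sketch below. `parseInheritedServicePid` is a hypothetical name, and rejecting leading zeros and surrounding whitespace is an assumption about how strict the real parser is. The contrast is with `parseInt("123abc", 10)`, which returns 123 because it accepts a numeric prefix.

```typescript
// Hypothetical helper; the real parser in gateway-cli may differ in name and details.
function parseInheritedServicePid(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  // Decimal digits only: no sign, no fraction, no radix prefix, no trailing junk.
  if (!/^[1-9][0-9]*$/.test(raw)) return undefined;
  const pid = Number(raw);
  return Number.isSafeInteger(pid) ? pid : undefined;
}

// Build the exclusion set for the stale-process sweep: own PID plus the
// inherited service PID when (and only when) it parsed cleanly.
function buildProtectedPids(selfPid: number, inheritedRaw: string | undefined): Set<number> {
  const protectedPids = new Set<number>([selfPid]);
  const inherited = parseInheritedServicePid(inheritedRaw);
  if (inherited !== undefined) protectedPids.add(inherited);
  return protectedPids;
}
```

Failing closed on malformed input means a bad environment value protects nothing, so a genuinely stale listener is still cleaned up.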
Preserve rollback journaling for NFS and SMB-backed stores, refuse SSHFS after symlink-aware mount classification, and close Workboard database handles when filesystem policy rejects initialization.
Use transactionally consistent VACUUM INTO snapshots for every state-root SQLite database and exclude original journal sidecars so verified backups cannot restore torn plugin or memory state.
* test(qa): add smoke ci primary coverage evidence
* test(qa): remove overstated primary coverage claims
* test(qa): make release profile include smoke ci
* test(qa): trim taxonomy formatting churn
* test(qa): avoid hardcoded profile names in coverage test
* test(qa): make release profile cover taxonomy
* test(qa): type profile fixture all category flag
* test(qa): include channel delivery in smoke ci profile
Archive the canonical legacy database before SQLite sidecars, then detect and finish pending sidecar cleanup on retry without reopening the migrated database.
Allow the auth-profile read-only SQLite bootstrap path through the Kysely guardrail. The runtime already wraps reads with Kysely; the raw DatabaseSync boundary is the short-lived read-only bootstrap.
Co-authored-by: Alex Knight <15041791+amknight@users.noreply.github.com>
Add exec approvals artifact evidence to Policy.
- add the execApprovals policy namespace and check IDs for required artifact presence, default/per-agent security posture, autoAllowSkills, and allowlist drift
- read the active exec-approvals.json artifact only when execApprovals policy rules are configured, honoring OPENCLAW_STATE_DIR before the default ~/.openclaw path
- emit redacted posture evidence and stable oc:// references without socket tokens, command text, resolved paths, timestamps, or approval-session details
- document the public policy surface and add focused scanner, doctor, conformance, and CLI coverage
Validation:
- GitHub Actions for head b82eefe492 are green, including Real behavior proof.
- ClawSweeper re-review completed for the same head with proof: sufficient and status: ready for maintainer look.
- Maintainer artifact-boundary acceptance is recorded in the PR discussion and body.
Co-authored-by: Gio Della-Libera <235387111+giodl73-repo@users.noreply.github.com>
* fix(tui): show activity indicator for system-injected runs
System-injected runs (bridge-notify, webhook, cron) never go through the
TUI submit path, so no active/pending run id exists when their lifecycle
"start" event arrives. handleAgentEvent dropped events for untracked runs,
leaving the status bar idle until the response landed.
Adopt an untracked lifecycle "start" for the current session (lifecycle
events always carry sessionKey) so the activity indicator shows work is
happening, mirroring how chat deltas adopt runs in handleChatEvent. Local
side-question (btw) runs never claim the active slot.
Closes #51825
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(tui): preserve concurrent injected run activity
---------
Co-authored-by: zengwen <zeng_wen@foxmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(line): cap carousel column text at 60 chars with title or image
LINE limits a carousel column's text to 60 characters when the column has
a title or thumbnail image, and 120 characters otherwise. createCarouselColumn
always truncated to 120, so a column with a title/image and 61-120 char text
exceeded the limit and made LINE reject the entire carousel reply (HTTP 400).
Apply the conditional limit (mirroring the buttons template) and drop the now
redundant slice in createProductCarousel.
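The conditional limit can be sketched as below. The column shape and function names are hypothetical, and the plain `slice` counts UTF-16 code units, which is a simplification that can split a surrogate pair at the boundary.

```typescript
// Hypothetical column shape for illustration; not the real createCarouselColumn signature.
type SketchColumn = { title?: string; thumbnailImageUrl?: string; text: string };

// 60 characters when the column has a title or thumbnail image, 120 otherwise.
function columnTextLimit(column: SketchColumn): number {
  return column.title || column.thumbnailImageUrl ? 60 : 120;
}

function truncateColumnText(column: SketchColumn): SketchColumn {
  // Simplification: slice() counts UTF-16 code units, not code points or graphemes.
  return { ...column, text: column.text.slice(0, columnTextLimit(column)) };
}
```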
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(line): apply conditional text limits across templates
* fix(line): truncate template text by code point
* fix(line): preserve grapheme clusters when truncating
* fix(line): apply compact limit for default actions
* fix(line): follow title and thumbnail text limits
* fix(line): truncate template text within UTF-16 limits
* fix(line): preserve required text within template limits
* fix(line): preserve carousel product prices
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Restore readable standard Telegram text delivery by default after Bot API 10.1 rich messages rendered as unsupported in current clients. Keep native rich tables and structured messages available through the account-level richMessages opt-in, with account-aware capability advertising and documented structural limits.
Fixes #93263.
* fix(whatsapp): preserve auth on passive terminal stops
* fix(whatsapp): recover stale web auth during relink
* fix(gateway): defer channel stop until qr takeover
Apply the canonical SQLite busy timeout to short-lived read-only auth profile reads so a brief rollback-journal exclusive lock cannot make valid persisted credentials appear missing.
The atomic reindex file ops hardcoded the WAL sidecar pair (-wal/-shm)
when moving, removing, and backing up index files. NFS-backed memory
stores run SQLite under journal_mode=DELETE, which produces a
rollback-journal (-journal) sidecar instead. As a result an index swap
left the previous targets stale -journal next to the freshly published
The inline-code/fence restore step matched the placeholder index with a
greedy `(\d+)`, so a digit in user text immediately after a code span
(e.g. `code`5) was absorbed into the index, resolved to undefined, and
`?? ""` deleted both the code span and the digit. Terminate the
placeholder index with the existing NUL marker so the index boundary is
unambiguous.
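The failure and the fix can be reproduced with a small mask/restore pair. The placeholder format (`NUL` + `CODE` + index) is an assumption made for this sketch, as are the function names; only the greedy `(\d+)` match and the NUL terminator come from the description above.

```typescript
const NUL = "\u0000";

// Swap each inline code span for an index placeholder so later passes cannot touch it.
function maskCode(text: string, terminated: boolean): { masked: string; spans: string[] } {
  const spans: string[] = [];
  const masked = text.replace(/`[^`]*`/g, (span) => {
    spans.push(span);
    const index = spans.length - 1;
    return terminated ? `${NUL}CODE${index}${NUL}` : `${NUL}CODE${index}`;
  });
  return { masked, spans };
}

// Put the code spans back. Without a terminator, (\d+) also swallows digits
// that belong to the user's text and the lookup misses.
function restoreCode(masked: string, spans: string[], terminated: boolean): string {
  const pattern = terminated ? /\u0000CODE(\d+)\u0000/g : /\u0000CODE(\d+)/g;
  return masked.replace(pattern, (_match, index: string) => spans[Number(index)] ?? "");
}

function roundTrip(text: string, terminated: boolean): string {
  const { masked, spans } = maskCode(text, terminated);
  return restoreCode(masked, spans, terminated);
}
```

In the unterminated variant, the input `` `code`5 `` masks to a placeholder with index 0 followed by the literal 5, which the restore step reads as index 05 and replaces with the empty string.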
Co-authored-by: Dr Rushindra Sinha <5796457+rushindrasinha@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Classify owned silent model calls as long-running until the abort threshold while preserving stalled handling for ownerless stale activity, with diagnostics tests and docs.
Stop media writes from triggering opportunistic pruning and leave retention cleanup to the configured maintenance timer. Preserve explicit cleanup options and cover shallow/root/recursive cleanup behavior.
Track platform-incompatible skills separately from missing requirements, keep doctor --fix from treating them as broken installs, and cover the status output.
Use alias-aware credential compatibility before clearing auth-profile overrides, preventing compatible CLI sessions from flapping auth profiles. Includes regression coverage.
Use the shared suppressed-control-reply detector for cron delivery so NO_REPLY, ANNOUNCE_SKIP, and REPLY_SKIP do not leak to outbound channels, with direct/text delivery coverage.
Keep workboard card titles visible when a column overflows by pinning implicit rows to content height, and add e2e coverage for the overflow case.
Fixes #91717
* fix(telegram): control group history context
* fix(telegram): keep history mode type local
* fix(telegram): respect history mode during forum recovery
Avoid repeated full JSONL parsing and cloning on every embedded-agent turn by keeping a bounded, validated transcript cache and advancing repair incrementally.
The final implementation preserves lock ownership and exact fingerprint validation, publishes only verified writes, handles header rewrites and unterminated JSONL safely, and adds focused regression coverage.
Fixes #83943.
Co-authored-by: Alix-007 <li.long15@xydigit.com>
The fresh-tokens path of runPreflightCompactionIfNeeded fed the prompt-only
entry.totalTokens snapshot straight into the budget threshold check, dropping
the current user prompt estimate and the previous turn's output. The sibling
memory-flush gate and this function's own stale branch already project
base + output + estimate via resolveEffectivePromptTokens, so the preflight
gate under-triggered and let over-budget requests through to overflow-retry.
Project the fresh persisted base the same way: read transcript output when near
the threshold (mirroring the memory-flush gate's buffer) and run the fresh base
through resolveEffectivePromptTokens before the threshold check.
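The arithmetic behind the under-triggering can be shown with a small sketch. `projectPromptTokens` and `shouldCompact` are hypothetical stand-ins, not the real `resolveEffectivePromptTokens`; the field names and the `>=` comparison are assumptions.

```typescript
// Hypothetical inputs to the projection; names are illustrative only.
type TokenInputs = {
  persistedPromptTokens: number; // prompt-only snapshot (entry.totalTokens)
  lastOutputTokens: number; // previous turn's output, read from the transcript
  currentPromptEstimate: number; // estimate for the incoming user prompt
};

// base + output + estimate, as described for the sibling gates.
function projectPromptTokens(inputs: TokenInputs): number {
  return inputs.persistedPromptTokens + inputs.lastOutputTokens + inputs.currentPromptEstimate;
}

function shouldCompact(inputs: TokenInputs, thresholdTokens: number): boolean {
  return projectPromptTokens(inputs) >= thresholdTokens;
}
```

For example, a 90,000-token snapshot with 6,000 output tokens and a 5,000-token prompt estimate projects to 101,000 tokens: the raw snapshot sits under a 100,000-token threshold while the projected value crosses it.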
Claude Code built-ins ScheduleWakeup and CronCreate schedule a deferred
re-invocation managed by the persistent CLI runtime. In OpenClaw's
one-shot `claude -p` invocations the process exits at end_turn, so any
wakeup or cron registered during the run has no host to fire into and is
silently lost. Symptom: a CLI session spawns a background sub-agent,
calls ScheduleWakeup to poll for completion, ends the turn, and never
picks up the result — the work finishes unreviewed.
Append `--disallowedTools "ScheduleWakeup,CronCreate"` to both `args`
and `resumeArgs` in the anthropic CLI backend so the model cannot reach
for tools that don't survive the run mode. The right pattern in CLI
sessions is Monitor on the background output file, or a synchronous
sub-agent.
* fix(gateway): pass managed inbound PDFs through when sandbox staging fails
chat.send force-stages offloaded non-image media into the sandbox workspace when
one exists. If that optional staging was unavailable or incomplete,
prestageMediaPathOffloads deleted the media buffers and failed the whole send
with a 5xx — even for already-managed inbound PDFs that are safe to read
host-side. A Control-UI-uploaded PDF could fail to send.
When staging throws or is incomplete, fall back to the absolute managed paths
iff every non-image offloaded ref is a managed-inbound application/pdf (reusing
the existing resolveInboundMediaReference allow-check + the PDF mime type). This
mirrors the existing no-sandbox passthrough: with MediaWorkspaceDir unset the
managed media dir is a default media-understanding local root, so the absolute
path resolves host-side. Gated all-or-nothing so a single non-managed or non-PDF
ref keeps the previous delete + 5xx behavior. Success path and oversized 4xx are
unchanged; managed buffers are not deleted on the fallback.
Fixes #90097
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(gateway): exempt managed PDFs from staging cap
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* docs(windows): fix WSL gateway-autostart recipe for WSL ≥ 2.6.1.0
Replace /bin/true with dbus-launch true to work around the WSL ≥ 2.6.1.0
idle-termination regression (microsoft/WSL #13416): the distro exits 15-20 s
after the last wsl.exe client detaches even with loginctl linger and an active
user service. dbus-launch true keeps a child-of-init process alive (workaround
from microsoft/WSL discussion #9245, validated on WSL 2.7.3.0).
Also replace /ru SYSTEM with /ru "$env:USERNAME". Per-user WSL distros (the
default setup) are not enumerable by the SYSTEM account — the task runs
silently without starting the distro. Running as the installing user account
fixes this; Windows prompts for the password at task creation time.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
* docs(windows): add dbus-x11 prerequisite for WSL keepalive
dbus-launch is provided by dbus-x11, which is not installed by default
on fresh Ubuntu WSL distros. Without it the scheduled task hits
command-not-found silently. Add the apt-get install step before the
linger and gateway-install steps so the recipe is self-contained.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Rebase onto current upstream/main (head 4780546c12). Resolves the conflict from upstream's two-line Current time + Reference UTC helper output: appendCronStyleCurrentTimeLine now refreshes/collapses any prior helper-injected block via CURRENT_TIME_LINE_RE instead of returning early on a stale base.includes('Current time:') match. Preserves upstream-added doc comments. 16/16 current-time.test.ts pass; tsgo core clean.
Thread the existing agents.defaults.timeFormat setting through the Control UI
bootstrap config so WebChat/Control UI timestamps render in the configured
12h/24h clock instead of always using the browser locale default. "auto"
keeps the browser default, so existing deployments are unchanged.
Closes #58147
Co-authored-by: zengwen <zeng_wen@foxmail.com>
resolveCronChannelOutputPolicy checked deliveryRequested === false
when there is no channel. Since deliveryRequested is optional
(?: boolean), undefined and missing opts both returned false,
blocking the hasRecoveredToolWarning rescue path for --no-deliver
cron runs whose agent recovered successfully.
Change === false to !== true: when no channel exists, prefer the
agent's final visible text unless delivery was explicitly
requested.
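The difference between the two comparisons on an optional boolean can be sketched as follows. The function names and the option shape are hypothetical and reduce the no-channel branch of `resolveCronChannelOutputPolicy` to a single predicate.

```typescript
// Hypothetical option shape; deliveryRequested is optional as in the description.
type OutputPolicyOpts = { deliveryRequested?: boolean | undefined };

// Before: only an explicit `false` counts, so undefined and missing opts yield false.
function prefersFinalTextBefore(opts?: OutputPolicyOpts): boolean {
  return opts?.deliveryRequested === false;
}

// After: anything except an explicit `true` prefers the agent's final visible text.
function prefersFinalTextAfter(opts?: OutputPolicyOpts): boolean {
  return opts?.deliveryRequested !== true;
}
```

The two predicates agree for explicit `true` and `false`, and differ only for `undefined` and missing options, which is the case the fix targets.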
Fixes #90664
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
cron list/show printed "idle" for a job whose status is ok/error/skipped
when only lastRunStatus (the primary field) was set: formatStatus used
`lastStatus ?? "idle"` and omitted lastRunStatus, diverging from computeStatus
(the --json status resolver) whose JSDoc says it mirrors the human output.
Delete the duplicate formatStatus and render via the canonical computeStatus.
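The divergence comes down to which field is consulted first. The sketch below assumes a simplified job state and reduces both resolvers to their fallback chains; the real `computeStatus` may consider more inputs, and the function names here are hypothetical.

```typescript
type RunStatus = "ok" | "error" | "skipped";
// Simplified job state (assumption): lastRunStatus is the primary field.
type JobState = { lastRunStatus?: RunStatus; lastStatus?: RunStatus };

// Old human-output formatter: ignores lastRunStatus entirely.
function formatStatusOld(state: JobState): string {
  return state.lastStatus ?? "idle";
}

// Canonical-style resolver: primary field first, then the older field, then "idle".
function resolveStatusLabel(state: JobState): string {
  return state.lastRunStatus ?? state.lastStatus ?? "idle";
}
```

Rendering through a single resolver removes the chance of the table and `--json` output disagreeing again.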
Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Guard memory index identity resolution against empty or whitespace provider models by falling back to fts-only, and use fts-only as the fallback source model when an adapter fallback cannot resolve a model.
This prevents empty expectedModel mismatch reasons that can leave memory search dirty while preserving registered adapter default-model resolution.
Refs #90787
Two code-review findings. (1) gateway taskMatchesAgent fell through to a requester/owner/child session-key scan even when the task had an explicit agentId, so a worker subagent task owned by agent:main:main also matched agentId:main; make explicit task.agentId authoritative and keep the session-key fallback only for legacy records without an agentId, with a gateway tasks.list regression. (2) the cross-agent attribution test passed async (root) to the zero-arg withTaskRegistryTempDir helper (TS2345/TS7006); drop the unused parameter and redundant env assignment.
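Finding (1) can be sketched as a matcher in which an explicit `agentId` short-circuits the session-key scan. The task shape and the `agent:<id>:` key-prefix test are assumptions for illustration; only the `agent:main:main` key example comes from the text above.

```typescript
// Hypothetical task record; legacy records have no agentId.
type SketchTask = {
  agentId?: string;
  requesterSessionKey?: string;
  ownerKey?: string;
  childSessionKey?: string;
};

function sketchTaskMatchesAgent(task: SketchTask, agentId: string): boolean {
  // An explicit agentId is authoritative: no fallthrough to session keys.
  if (task.agentId) return task.agentId === agentId;
  // Legacy fallback: infer ownership from any related session key.
  const prefix = `agent:${agentId}:`;
  return [task.requesterSessionKey, task.ownerKey, task.childSessionKey].some(
    (key) => key?.startsWith(prefix) === true,
  );
}
```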
Summary:
- The PR adds artifact, installed skill-file, source URL, and verification-envelope fields to ClawHub skill origin/lock metadata while keeping install telemetry restricted to the older version/registry shape.
- PR surface: Source +144, Tests +139. Total +283 across 2 files.
- Reproducibility: not applicable as a bug reproduction. Source inspection shows current main lacks the richer `.clawhub` provenance fields, and the PR body provides after-patch live output from a ClawHub install.
Automerge notes:
- PR branch already contained follow-up commit before automerge: Persist ClawHub skill install provenance
Validation:
- ClawSweeper review passed for head 65774f4f4b.
- Required merge gates passed before the squash merge.
Prepared head SHA: 65774f4f4b
Review: https://github.com/openclaw/openclaw/pull/93283#issuecomment-4707787041
Co-authored-by: momothemage <niuzhengnan@163.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: momothemage
cleanupAgedMemoryReindexTempFiles only removed WAL sidecars (-wal/-shm) of orphaned reindex temp DBs. On NFS-backed stores configureMemorySqliteWalMaintenance -> requireRollbackJournalMode forces journal_mode=DELETE, so the reindex temp DB uses a rollback journal; a hard crash leaves an orphaned .tmp-<uuid>-journal that leaked forever (cleanup neither deleted nor even discovered it). Add -journal to both the delete set (memoryIndexFileSuffixes) and the discovery set (reindexTempEntrySuffixes), with regression tests for the temp-plus-journal and stranded-journal cases.
Summary:
- The PR filters stale session/live context-token values when rendering `/status`, threads existing per-agent/default context caps into status rendering, and adds regression tests for status message and summary output.
- PR surface: Source +107, Tests +155. Total +262 across 7 files.
- Reproducibility: yes. Source inspection shows current main forwards stale live and persisted context-token v ... atus`, and the PR comments include live gateway output validating the Kimi/DeepSeek mismatch after the fix.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(status): avoid stale session context windows
Validation:
- ClawSweeper review passed for head 4a8e9299a3.
- Required merge gates passed before the squash merge.
Prepared head SHA: 4a8e9299a3
Review: https://github.com/openclaw/openclaw/pull/93220#issuecomment-4705953238
Co-authored-by: masonxhuang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Summary:
- This PR changes pinned-session `/status` guidance, model-selection docs, and status tests to recommend `/model default` instead of `/model <configured>` or `/reset` for clearing a session model pin.
- PR surface: Source 0, Tests 0, Docs +4. Total +4 across 7 files.
- Reproducibility: yes, from source inspection. Current main and v2026.6.6 emit the old `/reset` hint, while `/model default` clears persisted model overrides and `/reset` intentionally preserves user-selected overrides.
Automerge notes:
- PR branch already contained follow-up commit before automerge: docs: align model clear hint docs
- PR branch already contained follow-up commit before automerge: fix(status): correct pinned model clear hint
Validation:
- ClawSweeper review passed for head 1181624daa.
- Required merge gates passed before the squash merge.
Prepared head SHA: 1181624daa
Review: https://github.com/openclaw/openclaw/pull/93231#issuecomment-4706327717
Co-authored-by: masonxhuang <masonxhuang@tencent.com>
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
In dispatchReplyFromConfig the user-message success branch ran
throwIfDispatchOperationAborted() *before* clearPendingFinalDeliveryAfterSuccess().
If stuck-session recovery aborted the run in the window between the final reply
shipping and the clear, the message was delivered but pendingFinalDelivery stayed
true forever — the get-reply redelivery short-circuit then silently blocked every
future inbound and the agent "went silent" (#89115).
Reorder so the durable pending-final bookkeeping is cleared first, then honor the
abort afterwards (preserving abort reporting). Also clear the stranded
pendingFinalDeliveryIntentId field — agent-command.ts already clears it but the
success helper did not.
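The reordering can be sketched as a success helper that clears durable state before it honors the abort. The entry shape and function name are hypothetical; the two field names come from the description above.

```typescript
// Hypothetical session entry holding the pending-final bookkeeping.
type PendingEntry = { pendingFinalDelivery: boolean; pendingFinalDeliveryIntentId?: string };

function finishSuccessfulDispatch(entry: PendingEntry, aborted: boolean): void {
  // 1. The reply has already shipped, so clear the durable bookkeeping first.
  entry.pendingFinalDelivery = false;
  delete entry.pendingFinalDeliveryIntentId;
  // 2. Only then report the abort, so abort reporting is preserved
  //    without stranding the pending flag.
  if (aborted) throw new Error("dispatch operation aborted");
}
```

With the old order, the throw in step 2 ran first and step 1 never executed, which is how the flag stayed true after a delivered reply.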
Keep the setup TUI parent stdin paused after its inherited-stdio child exits so Docker and PTY setup parents terminate cleanly. Align pre/post setup terminal cleanup with the cleanup-then-exit contract and add lifecycle regression coverage.
Thanks @fuller-stack-dev.
Recover assistant turns that complete tool work without producing a visible final answer, while preserving intentional silent replies.
Use concrete tool-instance replay safety across embedded, Codex, and Copilot runtimes so unknown, mutating, async-started, and durable recall operations fail closed. Preserve genuine empty Codex final items without promoting commentary or tool-progress echoes.
Supersedes #90872. Thanks @fuller-stack-dev.
Co-authored-by: fuller-stack-dev <263060202+fuller-stack-dev@users.noreply.github.com>
* fix(agents): resolve "current" session alias locally without gateway round-trip
The system prompt tells agents to use sessionKey="current" to refer to
their own session. Previously, resolveSessionReference sent the literal
string "current" to the gateway sessions.resolve action, which rejected
it with INVALID_REQUEST and logged a noisy error line on every tool call.
The wrapper fell back to requesterInternalKey and succeeded, so the tool
worked — but the gateway error was spurious.
Add "current" to the well-known client alias check in
resolveCurrentSessionClientAlias so it is resolved locally to the
requester's session key, matching how TUI/CLI/WebChat client labels are
handled. This eliminates the unnecessary gateway round-trip and the
error log line.
Fixes #78424
* test: update session_status tests for local current-key resolution
* test: update session_status tests for local current-key resolution
* Revert "test: update session_status tests for local current-key resolution"
This reverts commit d9f6c8b5248921c99f43dc222667ffa429b34401.
* Revert "test: update session_status tests for local current-key resolution"
This reverts commit 40bf77d06711833c1beaeedf562b60a765a559d6.
* Revert "fix(agents): resolve "current" session alias locally without gateway round-trip"
This reverts commit d92bc9b91e0840ea5823cd44223c139e434c5ec4.
* fix(agents): preserve literal current session resolution
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
* fix(feishu): pass card_msg_content_type to get full card content
When reading Feishu interactive card messages via getMessageFeishu,
the API returns a degraded structure (title + 'upgrade client' prompt)
unless card_msg_content_type=user_card_content is passed in params.
Fixes #78289
* fix(feishu): request full card content for message reads
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Telegram's rich-markdown renderer treats a lone "\n" as a soft break
(rendered as a space), so streamed tool-progress draft lines joined by a
single newline collapsed onto one line. Pass "\n\n" as the progress-draft
line separator for Telegram; it renders a blank line as a single break, so
each tool/thinking/commentary line gets its own line again. Other channels
keep the single-newline default, so Discord and the rest are unaffected.
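
A minimal sketch of the per-channel separator choice, assuming a simple channel identifier. `ChannelId`, `progressDraftSeparator`, and `joinProgressLines` are hypothetical names, not the project's actual API.

```typescript
// Hypothetical channel identifiers for illustration only.
type ChannelId = "telegram" | "discord" | "slack";

// Telegram's rich-markdown renderer shows a lone "\n" as a space, so a blank
// line ("\n\n") is needed to get one visible break; other channels keep "\n".
function progressDraftSeparator(channel: ChannelId): string {
  return channel === "telegram" ? "\n\n" : "\n";
}

// Join streamed tool/thinking/commentary lines into one draft message.
function joinProgressLines(channel: ChannelId, lines: string[]): string {
  return lines.join(progressDraftSeparator(channel));
}
```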
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a required database-first legacy-store guard and regression coverage for legacy runtime state write patterns.
The guard is wired into architecture/preflight/changed checks, narrows the documented guard contract to the implemented filesystem-write scope, and tightens extension migration exemptions to explicit owner APIs. Also includes a small memory-core lint unblocker after current CI flagged an unnecessary non-null assertion.
Verification:
- pnpm check:database-first-legacy-stores
- pnpm lint:scripts
- node scripts/run-vitest.mjs test/scripts/check-database-first-legacy-stores.test.ts -- --reporter=verbose
- node scripts/run-oxlint.mjs extensions/memory-core/src/memory/manager-embedding-ops.ts
- git diff --check
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main
- GitHub CI green for PR head 34dde2c620
Closes #91628.
Summary:
- The PR updates the voice-call plugin to preserve live `speaking`/`listening` calls without `answeredAt`, backfill max-duration enforcement for live/restored call paths, and add regression tests.
- PR surface: Source +90, Tests +223. Total +313 across 9 files.
- Reproducibility: yes. source-level: current main and v2026.6.6 still reap aged non-terminal calls solely bec ... king` or `listening` without setting it. I did not run a live Twilio carrier call in this read-only review.
Automerge notes:
- Ran the ClawSweeper repair loop before final review.
- Included post-review commit in the final squash: fix(voice-call): preserve live Twilio streams in stale reaper
- Included post-review commit in the final squash: fix(clawsweeper): address review for automerge-openclaw-openclaw-9062…
Validation:
- ClawSweeper review passed for head 5fee2ff7a1.
- Required merge gates passed before the squash merge.
Prepared head SHA: 5fee2ff7a1
Review: https://github.com/openclaw/openclaw/pull/90812#issuecomment-4637047870
Co-authored-by: Sahibzada Allahyar <sahibzada@fastino.ai>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Summary:
- The PR adds an abort-signal-specific timeout classifier, switches two embedded attempt abort handlers to it, and adds focused failover tests.
- PR surface: Source +5, Tests +32. Total +37 across 3 files.
- Reproducibility: yes. from source inspection and a focused Node abort-reason check, but not from a live 180- ... ault AbortController abort reason through the broad timeout classifier used by the embedded abort handlers.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(agents): do not misclassify client-disconnect abort as run timeout
Validation:
- ClawSweeper review passed for head 2708b0a37d.
- Required merge gates passed before the squash merge.
Prepared head SHA: 2708b0a37d
Review: https://github.com/openclaw/openclaw/pull/90936#issuecomment-4638919394
Co-authored-by: openperf <16864032@qq.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
* fix(memory): accept local default model path migration
Treat the official local default embedding model's hf URI and downloaded GGUF path identities as equivalent so upgraded local memory indexes do not pause solely on path-format changes.
* fix(memory): satisfy local identity lint
Avoid filtered array tail access in the local model filename helper while preserving the same compatibility behavior.
* fix(memory): preserve local embedding identity aliases
---------
Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com>
Fix gateway-routed one-shot Codex app-server teardown so owned shared clients are retired after run cleanup. Verified with focused tests, Showboat proof, and green PR CI.
Register OpenCode Go's provider-owned static catalog so lifecycle cache warmup supplies the correct context window to memory flush and compaction without persisting catalog rows in user config.
Fixes #92912.
Co-authored-by: kumaxs <45620232+kumaxs@users.noreply.github.com>
Avoid assuming every runtime model exposes a string `baseUrl` before provider attribution checks. Preserve OpenRouter and Cloudflare attribution behavior while allowing Bedrock session setup to reach provider routing.
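
A minimal sketch of the guard, assuming a runtime model whose `baseUrl` may be missing or non-string. `RuntimeModel`, `readBaseUrl`, and `isOpenRouterModel` are hypothetical names, and the substring match is a simplification rather than the real attribution logic.

```typescript
// Hypothetical shape: some runtime models (for example Bedrock) may not
// expose a string baseUrl at all.
interface RuntimeModel {
  id: string;
  baseUrl?: unknown;
}

// Narrow baseUrl to a string before any attribution check touches it.
function readBaseUrl(model: RuntimeModel): string | undefined {
  return typeof model.baseUrl === "string" ? model.baseUrl : undefined;
}

// Illustrative attribution check; a missing baseUrl simply means "not OpenRouter"
// instead of throwing, so setup can continue to provider routing.
function isOpenRouterModel(model: RuntimeModel): boolean {
  const baseUrl = readBaseUrl(model);
  return baseUrl !== undefined && baseUrl.includes("openrouter.ai");
}
```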
Fixes #92974.
Co-authored-by: Sami Rusani <sr@samirusani>
Prevent duplicate `before_tool_call` execution when an already wrapped tool passes through schema normalization and coding-tool assembly. Preserve the normalized schema while replacing stale wrapper context with the current agent/session/run context.
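
One way to get this behavior, sketched under the assumption that a wrapped tool remembers its unwrapped `execute`. The `WRAPPED` marker, `wrapWithBeforeToolCall`, and the tool and context shapes are hypothetical, not the project's actual types.

```typescript
type ToolContext = { agentId: string; runId: string };
type Execute = (input: string) => string;
interface Tool {
  name: string;
  execute: Execute;
}

// Hypothetical marker that stores the unwrapped execute, so wrapping an
// already wrapped tool replaces the stale context instead of stacking hooks.
const WRAPPED = Symbol("beforeToolCallWrapped");
type WrappedTool = Tool & { [WRAPPED]?: Execute };

function wrapWithBeforeToolCall(
  tool: WrappedTool,
  ctx: ToolContext,
  hook: (toolName: string, ctx: ToolContext) => void,
): WrappedTool {
  // Reuse the original execute if this tool was wrapped before.
  const original = tool[WRAPPED] ?? tool.execute;
  return {
    ...tool, // keeps other fields (such as a normalized schema) intact
    [WRAPPED]: original,
    execute: (input) => {
      hook(tool.name, ctx); // fires exactly once, with the current context
      return original(input);
    },
  };
}
```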
Fixes #92973.
Co-authored-by: zengLingbiao <zeng.lingbiao@xydigit.com>
Resolve explicit relative SQLite DB paths before caching handles and centralize durable SQLite connection pragmas so busy_timeout is applied before WAL/NFS negotiation.
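
A minimal sketch of both ideas, assuming an in-process handle cache keyed by path. `DbHandle`, `getHandle`, the `DURABLE_PRAGMAS` list, and the 5000 ms timeout are hypothetical; a real implementation would open an actual SQLite connection and execute the pragmas in this order.

```typescript
import * as path from "node:path";

// Hypothetical handle; stands in for a real SQLite connection.
interface DbHandle {
  file: string;
  pragmas: string[];
}

// Single shared pragma list. busy_timeout comes first so lock waits are
// already covered when the journal mode is negotiated. The 5000 ms value
// is an assumed placeholder.
const DURABLE_PRAGMAS = ["busy_timeout = 5000", "journal_mode = WAL"];

const handles = new Map<string, DbHandle>();

// Resolve the path before using it as the cache key, so "./data/state.db"
// and "data/state.db" share one handle instead of opening the file twice.
function getHandle(dbPath: string, cwd: string = process.cwd()): DbHandle {
  const key = path.resolve(cwd, dbPath);
  let handle = handles.get(key);
  if (!handle) {
    handle = { file: key, pragmas: [...DURABLE_PRAGMAS] };
    handles.set(key, handle);
  }
  return handle;
}
```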
Use the active runtime snapshot for Discord and Slack native command routing and Discord autocomplete after config hot writes.
Fixes #39605
Co-authored-by: Peter Steinberger <steipete@gmail.com>
- Prefer small fixes at the right ownership boundary; no refactor unless it clearly improves the bug class.
- When an accepted finding shows a bug class or repeated pattern, inspect the current PR scope for sibling instances before fixing.
- Fix the scoped bug class at once when practical; stop at touched surfaces, owner boundaries, and clear follow-up territory.
- Keep going until structured review returns no accepted/actionable findings.
- Keep going until structured review returns no accepted/actionable findings only while the work remains inside the original task scope.
- If a review-triggered fix changes code, rerun focused tests and rerun the structured review helper.
- For security-audit suppression changes, verify accepted findings remain auditable: suppressed findings stay in structured output, active output keeps an unsuppressible suppression notice, and aggregate findings cannot hide unrelated active risk.
- Never switch or override the requested review engine/model. If the review hits model capacity, retry the same command a few times with the same engine/model.
@@ -43,6 +43,42 @@ Use when:
- If Gitcrawl reports a portable manifest mismatch, source/runtime DB health error, or stale portable-store checkout, run `gitcrawl doctor --json` and inspect `source_db_health`, `runtime_db_health`, and `portable_store_status` before falling back to live GitHub.
- Do not push just to review. Push only when the user requested push/ship/PR update.
## Scope Governor
Autoreview is a closeout gate, not permission to rewrite the task.
Before the first review, freeze a scope baseline: original request or issue, target branch, intended behavior, owner boundary, changed files, and non-test LOC. For inherited or already-bloated branches, use the intended PR diff as the baseline rather than accepting all existing branch drift.
Before patching a finding, classify it:
- **In-scope blocker**: the finding is introduced by the current diff, affects the same owner boundary, and can be fixed without changing the task's contract.
- **Follow-up**: the finding is real but belongs to an adjacent bug class, sibling surface, cleanup, or broader hardening track.
- **Stop-and-escalate**: the finding requires a new protocol/config/storage/public API contract, a different owner boundary, a release-process change, or a design choice outside the original request.
Stop patching and report the scope break instead of continuing when:
- a narrow PR turns into an architecture change, protocol change, migration, or release-process change;
- the diff grows past 2x the original files or non-test LOC without explicit approval to expand scope;
- two review-triggered patch cycles have not converged; pause and reclassify every remaining finding before another edit;
- the best fix is "define the canonical contract first" rather than another local inference layer;
- fixing the accepted finding would make the PR no longer describe the same behavior, issue, or owner boundary.
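
The two measurable stop conditions above (2x growth and two non-converging patch cycles) can be sketched as a small check. This is an illustration only: `ScopeBaseline`, `ScopeState`, and `checkScope` are hypothetical names, and the judgment-based conditions are not modeled.

```typescript
interface ScopeBaseline {
  files: number;
  nonTestLoc: number;
}

interface ScopeState {
  files: number;
  nonTestLoc: number;
  patchCycles: number; // review-triggered patch cycles so far
  expansionApproved: boolean; // explicit approval to expand scope
}

type ScopeDecision = "continue" | "pause-and-reclassify" | "stop-and-report";

function checkScope(baseline: ScopeBaseline, current: ScopeState): ScopeDecision {
  // Growth past 2x the baseline files or non-test LOC needs explicit approval.
  const grewPast2x =
    current.files > 2 * baseline.files || current.nonTestLoc > 2 * baseline.nonTestLoc;
  if (grewPast2x && !current.expansionApproved) return "stop-and-report";
  // After two patch cycles, reclassify every remaining finding before editing again.
  if (current.patchCycles >= 2) return "pause-and-reclassify";
  return "continue";
}
```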
After the two-cycle pause, continue only when every remaining accepted finding is still an in-scope blocker. Otherwise preserve the useful analysis, identify the smallest safe landed subset if one exists, and open or request a follow-up for the larger fix. Do not keep committing speculative fixes just to satisfy the reviewer.
Do not stack or push review-triggered fix commits while scope classification or focused proof is unresolved. Keep exploratory edits local until the cycle is proven in scope; if scope breaks, remove them from the landing lane instead of preserving them as branch history.
Critical exceptions must be explicit: active data loss, crash, broken install/upgrade, release blocker, or concrete security exposure. If the exception is not one of those, it is not critical enough to blow up scope.
## Release Branches And Release Process
On release, beta, stable, hotfix, signing, notarization, appcast, package-publish, or release-check work, use freeze discipline even when the branch name is not release-like:
- Fix only release blockers, failed release infrastructure, exact backports, install/upgrade breakage, data loss, crashes, or concrete security exposure.
- Treat non-blocking autoreview findings as follow-ups for `main`, not reasons to broaden the release branch.
- Do not introduce new product behavior, config surface, protocol shape, migration, plugin ownership, docs narrative, or process policy unless it directly unblocks the release.
- Keep proof tied to the release target: exact branch/ref, failing check or shipped-risk reason, smallest command/proof, and whether the fix must also forward-port to `main`.
- If review discovers a real but non-critical design problem during release closeout, stop with a follow-up issue/PR plan; do not use the release branch as the refactor lane.
description: Audit or refresh OpenClaw maturity scorecard docs from root taxonomy, maturity scores, and QA evidence artifacts without using maintainer discrawl data or committed inventory reports.
---
# claw-score
Use this skill when working on the OpenClaw maturity scorecard in this repo.
This is the openclaw-local version of the maintainer `claw-score` workflow:
it keeps the taxonomy and scorecard concepts, but excludes discrawl and the old
committed `inventory/` report tree.
## Authority
This skill owns the operational workflow for:
- `taxonomy.yaml`
- `docs/maturity-scores.yaml`
- `docs/maturity-scorecard.md`
- `docs/taxonomy.md`
- `docs/taxonomy-outline.md`
- `scripts/render-maturity-docs.mjs`
- `.github/workflows/maturity-scorecard.yml`
Keep person-specific, maintainer-private, Discord archive, and discrawl facts
out of this repo. If a score needs private evidence, use the redacted
`qa-evidence.json` artifact shape generated by OpenClaw QA workflows.
## Source Model
- `taxonomy.yaml` is the hand-edited source of truth for surfaces, levels,
- Hosted Provider Execution: Hosted provider turns, Provider-specific model options, Hosted tool use, Reasoning and cache controls, Hosted streaming and replies
- Local and Self-hosted Providers: Local provider profiles, Tool-capability flags, Timeouts and context windows, Local smoke checks, Local failure handling
- Model and Runtime Selection: Model reference selection, Provider and runtime overrides, Thinking and context settings, Invalid route recovery
- Provider Auth: Login and API-key setup, Auth profile selection, Credential health checks, Auth failover, Provider fallback recovery, Rate-limit and capacity recovery, Missing-key and OAuth guidance, Restart and stale-route recovery, Structured provider diagnostics, Subagent credential propagation
- Streaming and Progress: Streaming replies, Progress visibility
- Tool Calls and Response Handling: Tool-call handling, Usage and response reporting, Failure recovery
Use this rubric when assigning category Completeness scores for the
`cli-install-update-onboard-doctor` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can a normal operator complete the job end to end from the CLI?
- Are the expected environments represented where they matter for the category,
such as local installs, remote gateway use, supervised services, or
Windows/WSL2?
- Are the main lifecycle stages present where relevant: setup, inspection,
change, repair, and upgrade?
- Are common recovery and troubleshooting branches present, or does the
workflow dead-end after the happy path?
- Are major documented operator expectations still unimplemented?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the CLI operator journey for installation, onboarding, configuration, repair, and upgrade across expected environments and recovery branches.
- Score the CLI against the full operator journey, not only installation or the happy path.
- Repair, migration, remote, and platform-specific branches are expected where a category exposes them.
- For Windows and WSL2, score against the intended supported experience rather than parity with macOS/Linux internals.
- Gateway Service Management: Foreground gateway runs, Service install and control, Service auth wiring, Drift and reinstall recovery, Service health checks
- CLI Observability: Status snapshots, Health snapshots, Remote log tailing, Diagnostics export, Support-safe redaction
- Doctor: Interactive repair, Config migration, Auth and SecretRef checks, Plugin validation and repair, Lint and JSON findings, Extra gateway discovery, Supervisor drift repair, Port and startup diagnosis, Runtime path checks, Restart guidance
- Updates and Upgrades: Update channels, Install-kind switching, Managed gateway restart, Update status and RPC, Plugin convergence
Use this rubric when assigning category Completeness scores for the
`discord` surface.
## Category Scope
- Channel Setup and Operations: Application and bot setup, Token and application ID configuration, Setup wizard and account inspection, Status, doctor, and intent checks, Multi-account bot configuration, Account monitor startup, Gateway WebSocket lifecycle, Reconnect and heartbeat handling, Rate limits and gateway metadata, Status, probe, and health-monitor recovery
- Access and Identity: DM policy modes, Allowlist inheritance, Pairing-code approval, Sender authorization, Access-group authorization, Group DM authorization
- Conversation Routing and Delivery: Guild and channel admission, Mention gating, Session key isolation, Configured and runtime routing, Inbound context visibility, Forum and media-channel thread posts, Thread actions, Target parsing, Thread context resolution, Thread-bound session routing, ACP agent routing, Routing lifecycle, Discord forum/media channel posts created as, CLI and message-tool thread actions, Discord target parsing for `channel:<id>`, Thread context resolution, Thread-bound session routing for `/focus`, `/unfocus`, `/agents`, `/session idle`, `/session max-age`, `sessions_spawn({ thread, ACP current-conversation bindings and ACP thread, Binding lifecycle behavior, Direct and thread sends, Text chunking and reply mode, Draft and progress edits, Mention and embed rendering, REST retry and final delivery, File uploads, Component file and media-gallery blocks, Video caption follow-up, Voice-message upload, Inbound attachment context
- Media and Rich Content: Direct and thread sends, Text chunking and reply mode, Draft and progress edits, Mention and embed rendering, REST retry and final delivery, File uploads, Component file and media-gallery blocks, Video caption follow-up, Voice-message upload, Inbound attachment context, Outbound file uploads from URLs and, Component v2 file and media-gallery blocks, Video caption handling and follow-up media-only delivery, Discord voice-message sends with OGG/Opus conversion, Inbound media/attachment-aware debounce behavior, Realtime voice-channel conversations, General text-only delivery
- Native Controls and Approvals: Native slash command registration, Native slash command execution, Model Picker Commands, Components v2 messages, Callback TTL, Native Discord exec/plugin approvals, Sensitive owner-only command routing for prompts, Discord message actions, Action gates under channels.discord.actions.\*
- Realtime Voice and Calls: Voice Channel Lifecycle, Auto-join and follow-users, Realtime voice modes, Wake, barge-in, and echo handling, Voice codec and DAVE recovery
- Hosted Web Surface: Control UI, WebChat hosting, Plugin web routes, Canvas and A2UI routes
- Gateway RPC APIs and Events: Health APIs, Identity and presence APIs, Model APIs, Usage and memory APIs, Session APIs, Chat APIs, Channel APIs, Web login and wake APIs, Config and secrets APIs, Update and setup APIs, Agent and artifact APIs, Task and automation APIs, Tool and skill APIs, Request and event envelopes, Idempotent side effects, Method discovery, Event discovery, Accepted-then-final results, Event ordering, State refresh after gaps
- Health, Diagnostics, and Repair: Health snapshots, Channel readiness, Stability diagnostics, Payload diagnostics, Diagnostics exports, Doctor checks, Log tailing
- Protocol Compatibility: Published protocol schema, Runtime request validation, JSON Schema export, Swift client models, Version negotiation, Client transport defaults, Backward-compatible evolution
- Roles and Permissions: Role negotiation, Operator permissions, Approval-gated actions, Untrusted node declarations, Event scoping
- Gateway Lifecycle: Foreground startup, Service installation, Restart and stop, Service status, Bind and port settings, Config reload, Multi-gateway isolation
Use this rubric when assigning category Completeness scores for the
`image-video-music-generation-tools` surface.
## Category Scope
- Media Routing and Discovery: default media model config, per-call model refs and fallbacks, auth-backed tool discovery, action=list provider inspection
Use this rubric when assigning category Completeness scores for the
`imessage-bluebubbles` surface.
## Category Scope
- Channel Setup and Operations: Translate legacy config, Cut over safely, Handle migration caveats, Run local imsg, Run through SSH wrapper, Grant macOS permissions, Probe runtime health, Account setup prompts, Account status checks, Doctor repair checks, Account Config
- Access and Identity: Authorize direct senders, Route direct conversations, Bind ACP sessions, Group Policy, Mentions, System Prompts
- Conversation Routing and Delivery: Watch live messages, Coalesce split-send DMs, Replay missed messages, Seed conversation history, Authorize direct senders, Route direct conversations, Bind ACP sessions, Group Policy, Mentions, System Prompts
- Media and Rich Content: Media, Attachments, Remote Fetch, Chunking, Native Actions, Private API, Message Tool
Use this rubric when assigning category Completeness scores for the
`kubernetes-hosting` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an operator deploy and manage OpenClaw on Kubernetes end to end?
- Are the taxonomy features present as supported manifests, commands, and docs rather than examples only?
- Are setup, normal operation, status or inspection, redeploy, teardown, and secret rotation represented where relevant?
- Are local Kind validation, namespace/image customization, provider secrets, and secure exposure branches covered?
- Do known gaps leave major cluster-hosting capability branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the Kubernetes operator workflow for deployment, configuration, secrets, access, exposure, lifecycle, security posture, status, and recovery.
- A complete Kubernetes category lets an operator deploy, expose, secure, update, troubleshoot, and remove the Gateway without relying on Docker-only assumptions.
- Happy-path port-forwarding, missing secret/config rotation, or omitted exposed-service security posture are material completeness gaps.
- Gateway Connectivity: Local Gateway attach and status, Gateway pairing and auth, Remote mode, Local and remote resource boundaries
- Chat and Sessions: Native Linux chat window, Transcript, Gateway chat transport
- Desktop Capabilities: Linux desktop permissions, Secret storage, Sandbox/package posture, Linux native node identity, Host command execution, Desktop tools, Linux native Talk, Microphone capture, Native media permissions
- Status and Diagnostics: Native Linux app readiness, Gateway health/status display, Log/transcript opening, Doctor/repair affordances, Linux tray/status item, Runtime status row, Desktop-environment integration
Use this rubric when assigning category Completeness scores for the
`linux-gateway-host` surface.
## Category Scope
- Host Setup and Updates: Linux CLI install, Node runtime prerequisites, Package-manager policy, Update path
- Gateway Runtime and Service Control: Foreground Gateway Runtime, Process Control, Systemd User Service Lifecycle setup, Systemd User Service Lifecycle operation, Systemd User Service Lifecycle status, Systemd User Service Lifecycle recovery
- Provider Setup, Lifecycle, and Diagnostics: Provider Selection, Onboarding, localService configuration, Process startup and readiness, Request leases and idle shutdown, Health checks and restart, Provider recipes, Local provider status, Backend reachability probes, Model availability errors, Memory readiness diagnostics, Provider troubleshooting docs
- Native Provider Plugins: Ollama setup and model pulling, Model discovery, Streaming and vision, Ollama embeddings, Web-search support, LM Studio setup, Model discovery and auth, Model preload and JIT loading, Streaming compatibility, LM Studio embeddings
- Media Intake and Access: Local and remote media references, MIME and type detection, Size caps and bounded reads, Safe remote fetch, Local root policy, Inbound media store, PDF/document extraction dispatch, QR and media helper classification
- Channel Media Handling: Inbound attachment staging, Sandbox media rewrites, Reply media templating, Message-tool attachment delivery, Duplicate delivery suppression
- Media Configuration: Media capability configuration
- Media Understanding: Audio attachment selection, Batch STT provider and CLI fallback, Voice-note mention preflight, Transcript insertion and echo, Audio proxy and limit handling, Inbound image summarization, Active vision model bypass, Text-only model media offload, Vision provider fallback, Image and PDF input routing, Video Understanding, Direct Video Analysis
- Media Generation: Image generation tool invocation, Provider and model selection, Reference image editing, Generated image task lifecycle, Generated image persistence and delivery, Music generation tool invocation, Provider and model selection, Lyrics, instrumental, duration, and format controls, Reference inputs where supported, Music task lifecycle and duplicate status, Generated audio persistence and delivery, Video generation tool invocation, Mode and provider capability selection, Reference image, video, and audio inputs, Provider option validation, Video task lifecycle and status, Generated video persistence and delivery
Use this rubric when assigning category Completeness scores for the
`microsoft-teams` surface.
## Category Scope
- Channel Setup and Operations: Teams CLI app creation, Bot registration and manifest upload, Credential configuration, Teams app install verification, Setup status, Probe and scope reporting, Teams app doctor, Webhook and health diagnostics, Operator repair paths, Text formatting and chunking, Adaptive and presentation cards, Progress streaming, Delivery receipts and errors, Queued and proactive replies, Webhook Runtime, SDK Lifecycle, Proactive Cloud Boundary
- Access and Identity: DM pairing, Stable sender identity, Allowlists and access groups, Invoke and command authorization, Teams-originated config writes, Bot Framework SSO invokes, Delegated token storage, Graph directory lookup, Member profile lookup
- Conversation Routing and Delivery: Team and channel allowlists, Deterministic channel replies, Mention-gated group access, Session routing, Reply and thread context, Text formatting and chunking, Adaptive and presentation cards, Progress streaming, Delivery receipts and errors, Queued and proactive replies, Webhook Runtime, SDK Lifecycle, Proactive Cloud Boundary
- Media and Rich Content: Inbound attachments, Graph-hosted media, File consent, SharePoint and OneDrive sharing, Media fetch safety
- Native Controls and Approvals: Message action discovery, Polls and reactions, Read, edit, delete, and pin, Native approval cards, Feedback and group actions
Use this rubric when assigning category Completeness scores for the
`multi-agent-orchestration` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an operator configure and run the category workflow end to end?
- Are the taxonomy features present as supported user paths rather than partial config fragments?
- Are setup, normal operation, status or inspection, recovery, and removal paths represented where relevant?
- Are channel, account, workspace, auth, task, and delegate variants covered where the category expects them?
- Do known gaps leave major coordination or isolation branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the operator-facing system for setup, isolation, conversation routing, account routing, specialist lanes, delegate identity, status, recovery, and safe defaults.
- A complete category lets multiple agents be created, isolated, routed, delegated, and inspected without implicit cross-agent leakage.
- Undocumented config, nondeterministic routing, or unclear ownership of state, credentials, and outbound delivery are material completeness gaps.
Use this rubric when assigning category Completeness scores for the
`native-windows-companion-app` surface.
## Category Scope
- Installation and Updates: Official app download, MSI/MSIX/App Installer/winget-style packaging, Windows architecture handling for x64, App release channel
Use this rubric when assigning category Completeness scores for the
`openclaw-app-sdk` surface.
## Surface-Specific Scoring Questions
For each category, ask:
- Can an external app developer complete the category workflow using public SDK APIs?
- Are the taxonomy features represented by stable client contracts rather than protocol-only fragments?
- Are setup, authentication, streaming, result handling, error behavior, and compatibility expectations documented?
- Are browser, Node, React, testing, and custom transport variants covered where the category expects them?
- Do known gaps leave major external-app capability branches missing?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the external app-developer workflow from connection through agent runs, sessions, events, approvals, resources, compatibility, and operational error handling.
- A complete SDK category exposes typed, documented, reusable client APIs instead of requiring low-level Gateway protocol work.
- Manual Gateway frame construction or reliance on internal package shapes is a material completeness gap.
- Can the intended plugin task be completed end to end by an author or
operator?
- Are the important plugin variants present for this category, such as channel,
provider, tool, bundled, local, npm, or ClawHub flows?
- Are the main lifecycle stages present where relevant: create, configure,
validate, run, update, and remove or roll back?
- Are compatibility, approval, or safety branches present when the category
implies them?
- Are important author/operator-visible gaps still forcing workarounds or
unsupported paths?
## Surface-Specific Guidance
Variation from the default completeness process:
- Completeness is the plugin author or operator lifecycle for authoring, packaging, installing, running, approving, publishing, and testing plugins, not just SDK or runtime primitives.
- Score the plugin surface against the full plugin journey, not only one import path, packaging mode, or runtime path.
- Bundled-only support or support for only selected plugin families is incomplete when the category implies broader plugin capability.
- Publishing and testing categories should include expected lifecycle support, not just raw commands or fixtures.
Use this rubric when assigning category Completeness scores for the
`signal` surface.
## Category Scope
- Setup and Account Health: QR link setup, SMS registration, Installer and binary setup, Container account provisioning, Status probes, Setup diagnostics, Account safety guardrails
- Conversation Access and Routing: DM pairing, DM allowlists, Sender identity normalization, Group allowlists, Mention gates, Pending group history
- Message Delivery and Actions: Text delivery targets, Media delivery and limits, Typing and read receipts, Styled/chunked output, Reaction action discovery, Add/remove reactions, Group reaction targeting
Use this rubric when assigning category Completeness scores for the
`telegram` surface.
## Category Scope
- Channel Setup and Operations: BotFather token creation, TELEGRAM_BOT_TOKEN, Setup wizard credential capture, Startup getMe, Doctor/status surfacing, Named account configuration, CLI/message-tool targets, Directory adapters, Channel status, Account-scoped outbound, Long polling runner startup, Webhook listener startup, Reconnect, Restart, Named account configuration, Directory adapters and configured peers/groups for, Channel status, Account-scoped outbound, Long polling runner startup, Reconnect, Restart
- Access and Identity: dmPolicy modes, Pairing-code approval, Numeric Telegram user ID normalization with telegram, allowFrom, Unauthorized DM, Group allowlists, Supergroup negative chat IDs, Forum topic session keys, ACP topic routing, Session key construction
- Conversation Routing and Delivery: dmPolicy modes, Pairing-code approval, Numeric Telegram user ID normalization with telegram, allowFrom, Unauthorized DM, Group allowlists, Supergroup negative chat IDs, Forum topic session keys, ACP topic routing, Session key construction, Inbound media download, Voice notes, Location, Poll sending, Reactions, Text, Preview streaming, Reply threading tags, Durable outbound message recording
- Media and Rich Content: Inbound media download, Voice notes, Location and venue extraction into channel context, Poll sending, Reactions, Text, Preview streaming, Reply threading tags, Durable outbound message recording
- Native Controls and Approvals: Inline keyboard rendering, Exec approvals in DMs, Message actions, Action capability discovery, Native `setMyCommands` startup sync, Command name/description normalization, Built-in commands such as `/help`, Command authorization in DMs, Model buttons and command UI helpers
Use this rubric when assigning category Completeness scores for the
`whatsapp` surface.
## Category Scope
- Channel Setup and Operations: Official @openclaw/whatsapp plugin metadata, openclaw plugin install whatsapp, Channel config schema, Baileys socket lifecycle, Operator troubleshooting for reconnect loops
- Access and Identity: QR login, Baileys multi-file auth persistence, DM pairing challenge, Multi-account/default-account resolution, Direct-message `dmPolicy`, Sender identity extraction, Privacy controls for plugin hooks
- Conversation Routing and Delivery: Group allowlists, Group session keys, Outbound text sends, Provider-accepted receipts and durable delivery identifiers
- Media and Rich Content: Inbound media download, Outbound image
- Native Controls and Approvals: Native exec, Approver target resolution
- Gateway Service Lifecycle: Onboarded systemd install, Gateway service install, systemd user unit rendering, WSL-aware systemd unavailable hints, Doctor service repair, WSL user-service linger, Systemd availability after Windows boot, Windows startup task for WSL, Verification before Windows sign-in, Clear expectations around PC power
- Gateway Access and Exposure: Gateway token/password auth, Provider credentials, Gateway auth SecretRefs, Remote URL credential precedence, WSL virtual network, Windows portproxy setup, Windows Firewall rules, Reachable Gateway URLs, Loopback and LAN exposure, WSL2 IPv4 networking, Tailscale remote access
- Diagnostics and Repair: openclaw doctor, openclaw status, openclaw logs, SecretRef, WSL/systemd unavailable hints, Operator repair guidance after WSL2 service
- Browser and Control UI: WSL2 Gateway with Windows browser, Windows Control UI URL, Raw remote CDP to Windows Chrome, Host-local Chrome MCP, Browser profile cdpUrl, Layered diagnostics
description: Post an approved message as the logged-in Discord user through the Discord desktop app. Use for release announcements or other direct user-authored Discord posts; not for OpenClaw channel sends, bots, webhooks, relays, agent sessions, or archive search.
---
# Discord User Post
Use `$computer-use` to operate `/Applications/Discord.app` in the user's
existing logged-in session. This workflow represents the user directly.
## Prepare
1. Draft the complete final message outside Discord.
2. Confirm the intended server and channel with the user when either is
ambiguous.
3. Open Discord and navigate to the exact destination without entering the
message.
4. Verify the visible server name, channel header, and logged-in account.
Do not infer the target from unrelated Discord content. Stop if Discord is not
logged in, the account is wrong, or the exact destination cannot be verified.
## Confirm and Post
Posting is representational communication. Follow the `$computer-use`
confirmation policy even when the user previously asked for an announcement:
1. Show the user the exact final body and verified destination.
2. Request action-time confirmation before typing into Discord.
3. After confirmation, enter the approved body unchanged.
4. Visually inspect the composed message and destination again.
5. Send once.
If the body or destination changes after confirmation, request confirmation
again before sending.
## Verify
- Confirm the message appears once, from the user's account, in the intended
channel.
- Report the server, channel, and visible send result.
- Do not edit, delete, react, or send a follow-up without the corresponding
user instruction and confirmation.
## Guardrails
- Never use `openclaw message`, an OpenClaw agent, a Discord bot, webhook, relay,
or token for this workflow.
- Never expose private Discord content or account details in public output.
- Never send a draft, partial message, duplicate, or unreviewed attachment.
- For Discord archive/history/search, use `$discrawl` instead.
`This audited record covers the complete ${base}..${target} history: ${pullRequests.length} PRs and ${issues.length} linked issues. The grouped notes above prioritize user impact; this ledger preserves every contribution reference and eligible human credit.`,
- If bot review conversations exist on your PR, address them and resolve them yourself once fixed.
- Leave a review conversation unresolved only when reviewer or maintainer judgment is still needed.
- Before landing any PR with non-trivial code changes, run `$autoreview` until no accepted/actionable findings remain, unless equivalent manual review already covered it, the change is trivial/docs-only, or the user opts out.
- When landing or merging any PR, follow the global `/landpr` process.
- When an agent is landing or merging a PR targeting `main`, use only the repo-native `scripts/pr` wrapper: run `scripts/pr review-init <PR>`, follow its emitted checkout/guard guidance, initialize and complete review artifacts with `scripts/pr review-artifacts-init <PR>`, validate them with `scripts/pr review-validate-artifacts <PR>`, then run `scripts/pr prepare-run <PR>` and `scripts/pr merge-run <PR>`.
- Use `scripts/committer "<msg>" <file...>` for scoped commits instead of manual `git add` and `git commit`.
- Keep commit messages concise and action-oriented.
- Group related changes; avoid bundling unrelated refactors.
- Judges default to `openai/gpt-5.4,thinking=xhigh,fast` and `anthropic/claude-opus-4-6,thinking=high`.
- Report includes judge ranking, run stats, durations, and full transcripts; do not include raw judge replies. Duration is benchmark context, not a grading signal.
- Candidate and judge concurrency default to 16. Use `--concurrency <n>` and `--judge-concurrency <n>` to override when local gateways or provider limits need a gentler lane.
- Scenario source is YAML-only under `qa/scenarios/`: use `index.yaml` and
per-scenario `*.yaml` files with top-level `title`, `scenario`, and optional
`flow`. Never add fenced `qa-scenario` / `qa-flow` Markdown files.
- For isolated character/persona evals, write the persona into `SOUL.md` and blank `IDENTITY.md` in the scenario flow. Use `SOUL.md + IDENTITY.md` only when intentionally testing how the normal OpenClaw identity combines with the character.
- Keep prompts natural and task-shaped. The candidate model should receive character setup through `SOUL.md`, then normal user turns such as chat, workspace help, and small file tasks; do not ask "how would you react?" or tell the model it is in an eval.
- Prefer at least one real task, such as creating or editing a tiny workspace artifact, so the transcript captures character under normal tool use instead of pure roleplay.
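The top-level scenario shape described above can be checked with a small sketch like the following. It assumes the YAML file has already been parsed into a plain object by some YAML parser; `scenarioShapeErrors` is a hypothetical helper, not part of the QA runner, and it checks key presence only because the source does not state the value types.

```javascript
// Hypothetical helper (not from the repo): checks only the top-level keys
// that the guidance names. `flow` is optional, so it is never required.
function scenarioShapeErrors(doc) {
  if (doc === null || typeof doc !== "object" || Array.isArray(doc)) {
    return ["scenario document must be a mapping"];
  }
  const errors = [];
  for (const key of ["title", "scenario"]) {
    if (!(key in doc)) {
      errors.push(`missing top-level key: ${key}`);
    }
  }
  return errors;
}
```

An empty result means the required keys are present; it says nothing about whether the runner would accept the values.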
@@ -234,7 +236,8 @@ pnpm openclaw qa manual \
## Repo facts
- Seed scenarios live in `qa/scenarios/index.yaml` and
`qa/scenarios/<theme>/*.yaml`.
- Main live runner: `extensions/qa-lab/src/suite.ts`
@@ -16,10 +16,33 @@ Use this with `$release-openclaw-maintainer` and `$openclaw-testing` when a rele
- Watch one parent run plus compact child summaries. Avoid broad `gh run view` polling loops; REST quota is easy to burn.
- Fetch logs only for failed or currently-blocking jobs. If quota is low, stop polling and wait for reset.
- Treat live-provider flakes separately from code failures: prove key validity, provider HTTP status, retry evidence, and exact failing lane before editing code.
- A model-list response proves authentication, not billing or inference
entitlement. Mandatory live providers must pass a real completion probe
before release dispatch. Fix the credential first; do not add an alternate
auth path merely to bypass a failed release credential.
- Full Release Validation parent monitors fail fast: once a required child job
fails, the parent cancels the remaining child matrix and prints the failed
job summary. Inspect that first red job instead of waiting for unrelated
matrix tails.
- In a sparse worktree or Testbox source sync, first confirm `package.json`,
`pnpm-lock.yaml`, and every source path the selected check reads. If any are
absent, that checkout cannot validate a release dependency or Docker lane:
stop and use the repo remote changed gate or a full task worktree. When the
inputs are present and a release fix changes `package.json` or
`pnpm-lock.yaml`, rebuild only the task-owned disposable box with
`CI=true pnpm install --frozen-lockfile`, then run an explicit
`require.resolve()` probe before Docker or focused tests. The CI flag permits
pnpm to recreate a prewarmed modules directory without an interactive
confirmation. Do not weaken the lockfile or label sparse-checkout failures
as product/Docker failures.
- If the candidate is rebased or its base SHA changes after warmup, stop the
task-owned box and warm a fresh one before testing. Testbox source sync is
relative to the warmed source tree; continuing can mix an old base file with
a new candidate diff and produce false lockfile or Docker failures.
- For a committed release candidate, warm the box with
`blacksmith testbox warmup ... --ref <candidate-branch-or-sha>`. Do not rely
on source sync to overlay committed branch changes onto the workflow's
default ref.
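The sparse-checkout input check and the `require.resolve()` probe mentioned above could look roughly like this. It is a minimal sketch in CommonJS; `missingInputs` and `probeResolve` are hypothetical names, and the lists of paths and module names are whatever the selected check actually reads.

```javascript
// Hypothetical helpers; names are illustrative and not part of the repo.
const fs = require("node:fs");
const path = require("node:path");

// Return the required inputs that are absent from the checkout root.
function missingInputs(root, relativePaths) {
  return relativePaths.filter((rel) => !fs.existsSync(path.join(root, rel)));
}

// Try to resolve each module name; report failures instead of throwing.
function probeResolve(moduleNames) {
  return moduleNames.map((name) => {
    try {
      return { name, ok: true, resolved: require.resolve(name) };
    } catch (error) {
      return { name, ok: false, code: error.code };
    }
  });
}
```

If `missingInputs(root, ["package.json", "pnpm-lock.yaml"])` returns anything, the checkout cannot validate the lane. A failed entry from `probeResolve` points at the install step rather than at Docker or product code.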
## Preflight
@@ -36,6 +59,8 @@ git rev-parse HEAD
preflight. Inject those exact targeted keys first, then run the verifier; use
ambient env only when it was already intentionally injected for this release.
The script prints only provider status and HTTP class, never tokens.
The Anthropic check performs a tiny message completion so exhausted or
non-billable credentials fail before the expensive release matrix.
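One way to report only provider status and HTTP class is sketched below. This is not the verifier script itself; `httpClass` and `statusLine` are hypothetical names, and the helper never receives a credential, so it cannot print one.

```javascript
// Hypothetical reporting helper; not the release verifier script.
// Collapse a numeric HTTP status into its class, e.g. 401 -> "4xx".
function httpClass(status) {
  if (!Number.isInteger(status) || status < 100 || status > 599) {
    return "unknown";
  }
  return `${Math.floor(status / 100)}xx`;
}

// Build a log line from the provider name and status only.
function statusLine(provider, status) {
  return `${provider}: ${httpClass(status)}`;
}
```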
## Dispatch
@@ -51,7 +76,7 @@ gh workflow run openclaw-performance.yml \
  -f repeat=3 \
  -f deep_profile=false \
  -f live_openai_candidate=false \
  -f fail_on_regression=true
```
- Do not wait for full release validation to start this early perf signal.
@@ -60,11 +85,19 @@ gh workflow run openclaw-performance.yml \
- Call out any regression in the release proof. Treat a major regression as a
release blocker until it is fixed, waived by the operator, or proven to be
infrastructure noise.
- Full Release Validation records blocking product-performance evidence. The
early standalone run is for overlap and faster regression discovery, but a
regression or missing child run blocks the parent validation.
Prefer the trusted workflow on `main`, target the exact release SHA:
- Keep trusted-workflow checks compatible with frozen release targets. If
`main` adds a target-owned guard script or package command after the release
branch cut, make the trusted workflow skip only when that target surface is
absent. Heal the trusted workflow before rerunning validation; do not port an
unrelated runtime refactor or mutate the release candidate just to satisfy a
newer `main`-only check.
```bash
gh workflow run full-release-validation.yml \
--repo openclaw/openclaw \
@@ -76,7 +109,7 @@ gh workflow run full-release-validation.yml \
-f rerun_group=all
```
Use `release_profile=stable` unless the operator explicitly asks for the broad advisory provider/media matrix. Stable and full profiles force the release soak; the beta profile may opt in with `run_release_soak=true`. Use narrow `rerun_group` after focused fixes.
Publish with `openclaw-release-publish.yml` using `release_profile=from-validation`
unless a maintainer intentionally wants to cross-check a specific profile; the
publish workflow reads the effective profile from the full-validation manifest.
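The soak rule stated above (stable and full force the soak, beta opts in) reduces to a small decision function. This is a sketch only: `runsReleaseSoak` is a hypothetical name, and returning `false` for any other profile value is an assumption, since the source does not say how other values behave.

```javascript
// Hypothetical helper; mirrors the stated rule, not the workflow's code.
function runsReleaseSoak(releaseProfile, runReleaseSoak = false) {
  // Stable and full profiles always run the release soak.
  if (releaseProfile === "stable" || releaseProfile === "full") {
    return true;
  }
  // Beta runs the soak only when explicitly opted in.
  if (releaseProfile === "beta") {
    return runReleaseSoak === true;
  }
  // Assumption: any other profile value does not trigger the soak.
  return false;
}
```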
@@ -106,9 +139,25 @@ Stop watchers before ending the turn or switching strategy.
--jq '.jobs[] | select(.conclusion=="failure" or .conclusion=="timed_out" or .conclusion=="cancelled") | [.databaseId,.name,.conclusion,.url] | @tsv'
```
3. Fetch one failed job log. If rate-limited, note reset time and avoid more REST calls.
4. For secret-looking failures, validate a real completion from the same secret source before editing code. A successful model-list request is insufficient.
Claude CLI subscription credentials are a separate native auth path; prove
them in a clean-home CLI probe, never as a substitute for a required
Anthropic API-key lane.
5. For live-cache failures, inspect whether it is missing/invalid key, empty text, provider refusal, timeout, or baseline miss. Do not weaken release gates without clear provider evidence.
6. Fix narrowly, run local/changed proof, commit, push, rerun the smallest matching group.
7. If a required PR CI run is capacity-stalled with queued jobs and no active
jobs, do not cancel unrelated work or accept a generic manual dispatch.
From the PR head branch, dispatch the explicit exact-SHA fallback:
`gh workflow run ci.yml --repo openclaw/openclaw --ref <pr-head-branch> -f
@@ -17,6 +17,10 @@ Use this skill for release and publish-time workflow. Load `$release-private` if
- This skill should be sufficient to drive the normal release flow end-to-end.
- Use the private maintainer release docs for credentials, recovery steps, and mac signing/notary specifics, and use `docs/reference/RELEASING.md` for public policy.
- Core `openclaw` publish is manual `workflow_dispatch`; creating or pushing a tag does not publish by itself.
- Do not edit the root `README.md` as release prep, release closeout, or a
substitute for release notes. Package-root README validation is a hard
packaging gate, but a release only changes README content when an actual
user-facing documentation contract changed.
- Normal release work happens on a branch cut from `main`, not directly on
`main`. Use `release/YYYY.M.PATCH` for the branch name.
- If the operator asks for a release without saying stable/full, default to
@@ -76,6 +80,44 @@ Use this skill for release and publish-time workflow. Load `$release-private` if
or clawgrit reports. Report regressions explicitly. A major regression is a
release blocker unless the operator waives it or the data clearly proves
infrastructure noise.
- Heal CI before tagging or publishing. The exact candidate SHA must have green
`Full Release Validation`, including the root Dockerfile/install-smoke path.
Treat a red Docker, package, or release workflow lane as a release-branch
defect until the smallest correct fix is landed and proven; do not waive it
because npm preflight or another sibling lane passed.
- Keep the canonical `scripts/pr` runner authoritative for prepare and merge
artifacts. A release-gate policy change may use focused candidate tests and
exact-SHA hosted CI for proof, but never route `prepare-*` or `merge-*`
through PR-controlled scripts or synthesize prepare artifacts to bootstrap
the change. If the current canonical gate cannot validate the new policy,
stop for explicit maintainer direction rather than weakening that boundary.
- In maintainer Testbox mode, use `OPENCLAW_TESTBOX=1 scripts/pr prepare-run
<PR>` only after the exact PR head has passed `CI` and every scheduled
hosted gate. For a workflow change, that means `Blacksmith Testbox`,
`Blacksmith ARM Testbox`, `Blacksmith Build Artifacts Testbox`, and
`Workflow Sanity`; only gates GitHub actually scheduled for that exact head
are required. This preserves the canonical prepare artifacts while avoiding
a redundant broad local suite. A
literal `CHANGELOG.md`-only head gets a clean diff check instead because
those workflows intentionally do not dispatch. Documentation and README
changes still require CI. If `merge-run` requires a mainline sync, run
`OPENCLAW_TESTBOX=1 scripts/pr prepare-sync-head <PR>`, wait for those hosted
gates on the newly pushed SHA, then run `prepare-run` again.
- If an exact PR-head CI run has no active jobs because Blacksmith capacity is
stalled, a maintainer may dispatch the explicit GitHub-hosted fallback from
the PR head branch:
`gh workflow run ci.yml --repo openclaw/openclaw --ref <pr-head-branch> -f
workflow_run: # zizmor:ignore[dangerous-triggers] trusted PR commenter; job gates repository, source event, workflow name, live open PR, and exact current head before reading artifacts or writing comments
if (escaped.length + encoded.length > maxEncodedLength) {
break;
}
escaped += encoded;
}
return `\`${escaped || "-"}\``;
};
const rows = (findings ?? []).map((finding) => {
const location = String(finding.location ?? "");
const [file, line] = location.split(":");
return {
file: file ? `apps/ios/${file}` : "",
line: line || "",
kind: String(finding.kind ?? ""),
name: String(finding.name ?? ""),
};
});
let mode = "failure";
let body = `${marker}\n`;
if (scanSkipped) {
mode = "skipped";
body += [
"### iOS Periphery",
"",
"Periphery scan skipped because the pull request is a draft or no longer touches iOS scan scope.",
].join("\n");
} else if (findings === null) {
body += [
"### iOS Periphery",
"",
"Periphery did not complete or its report could not be safely read. Check the workflow run for details.",
].join("\n");
} else if (rows.length === 0 && status === 0) {
mode = "success";
body += [
"### iOS Periphery",
"",
"No dead Swift code found.",
].join("\n");
} else if (rows.length > 0) {
const shown = rows.slice(0, 50);
body += [
"### iOS Periphery",
"",
`Found ${rows.length} dead Swift code ${rows.length === 1 ? "symbol" : "symbols"}. Remove the code or add a narrow Periphery exemption with a comment explaining why it must stay.`,
rows.length > shown.length ? `Showing first ${shown.length}; full JSON is in the workflow artifact.` : null,
].filter(Boolean).join("\n");
} else {
body += [
"### iOS Periphery",
"",
"Periphery exited with a non-zero status before producing findings. Check the workflow artifact for stdout/stderr.",
].join("\n");
}
body += "\n";
const maxCommentChars = 60_000;
if (body.length > maxCommentChars) {
body = [
marker,
"### iOS Periphery",
"",
`Found ${rows.length} dead Swift code ${rows.length === 1 ? "symbol" : "symbols"}. The rendered report exceeded the safe comment limit; use the workflow artifact for details.`,