Compare commits

277 Commits

Author SHA1 Message Date
Vincent Koc
d49a014424 fix(plugins): stabilize registry package paths 2026-04-25 12:33:17 -07:00
Vincent Koc
0c3a0a3682 fix(plugins): satisfy registry repair lint 2026-04-25 12:33:17 -07:00
Vincent Koc
ee19c91591 fix(plugins): add doctor registry repair 2026-04-25 12:33:16 -07:00
Vincent Koc
64582bb3a7 docs(diagnostics-otel): clarify genai semconv exports 2026-04-25 12:30:14 -07:00
Peter Steinberger
d4971aad2c docs: require feasible live verification 2026-04-25 20:27:21 +01:00
Peter Steinberger
30325f567c fix: use prompt snapshots for live context diagnostics 2026-04-25 20:25:44 +01:00
Peter Steinberger
b732f21a86 fix: clarify voice-call setup diagnostics 2026-04-25 20:24:36 +01:00
Vincent Koc
44648440a5 fix(diagnostics-otel): stabilize genai token metric model attr 2026-04-25 12:22:55 -07:00
Peter Steinberger
75d64cd4b8 feat: expose generic image background option 2026-04-25 20:21:46 +01:00
Peter Steinberger
03fd7df929 fix: remove duplicate diagnostic stability case 2026-04-25 20:21:39 +01:00
Peter Steinberger
d757396785 test(ui): consolidate chat jsdom suites 2026-04-25 20:17:23 +01:00
Peter Steinberger
7436e395d5 test(node-host): cache native binary fixture lookup 2026-04-25 20:17:23 +01:00
Peter Steinberger
f34513ac66 perf(memory): avoid duplicate session store reads 2026-04-25 20:17:22 +01:00
Vincent Koc
5815ca93d9 fix(diagnostics-otel): honor genai usage semconv opt-in 2026-04-25 12:13:50 -07:00
Peter Steinberger
86d897cfaa feat(android): expose talk mode
Co-authored-by: alex-latitude <213670856+alex-latitude@users.noreply.github.com>
2026-04-25 20:12:38 +01:00
Peter Steinberger
791ad0864a fix: strip invalid thinking replay signatures
Fixes #45010.
Supersedes #70054.

Co-authored-by: Chris Staples <chris.staples@sophos.com>
Co-authored-by: Fourier <yang.fourier@gmail.com>
2026-04-25 20:12:30 +01:00
Peter Steinberger
47a63f7acf fix(logging): merge duplicate context diagnostic case 2026-04-25 20:11:08 +01:00
Peter Steinberger
e6ab61762a fix(check): pass lock env to changed lint lanes 2026-04-25 20:11:08 +01:00
Peter Steinberger
1e7ae07772 fix(cli): dedupe onboard auth flags for completion cache 2026-04-25 20:11:08 +01:00
Peter Steinberger
d9486c683b fix: stabilize macos npm update smoke 2026-04-25 20:09:32 +01:00
Peter Steinberger
17401e31de fix: avoid changed gate lint self-lock 2026-04-25 20:09:00 +01:00
mushuiyu_xydt
0e1ef93e84 fix(minimax): use dedicated image generation endpoint (#61155)
* fix(minimax): use dedicated image generation endpoint

MiniMax image generation uses a dedicated API endpoint
(api.minimax.io/v1/image_generation) that is separate from the
text/chat API endpoint (api.minimax.io/anthropic).

Previously, the resolveMinimaxImageBaseUrl function would extract
the origin from the provider's configured baseUrl. If a user had
configured their baseUrl to the chat endpoint (e.g.,
api.minimax.chat/anthropic), the image generation would incorrectly
use that endpoint, resulting in "invalid api key" errors.

This fix always uses the dedicated image generation endpoint,
ignoring the provider's baseUrl configuration for image generation.

Fixes #61149

* fix(minimax): support CN endpoint for image generation

Respect MINIMAX_API_HOST environment variable to determine whether
to use the global (api.minimax.io) or CN (api.minimaxi.com) endpoint
for image generation.

This ensures that CN users who configure MINIMAX_API_HOST to use
api.minimaxi.com will continue to use the CN endpoint for image
generation, while global users continue to use api.minimax.io.

The original bug was caused by the code extracting the origin from
the provider's configured baseUrl, which could be set to incorrect
endpoints like api.minimax.chat. This fix uses the dedicated image
generation endpoints instead.

Fixes #61149

* fix(minimax): infer CN endpoint from provider config when env is unset

When MINIMAX_API_HOST is not set, fall back to checking the provider's
configured baseUrl to determine whether to use the CN or global image
endpoint. This ensures CN users who went through onboarding (which sets
models.providers.minimax.baseUrl to https://api.minimaxi.com/anthropic)
are correctly routed to the CN image endpoint.

The isMinimaxCnHost check ensures we only use the baseUrl origin for
CN detection - invalid endpoints like api.minimax.chat would not match
minimaxi.com and would correctly fall through to the global default.

Fixes #61149

* test(minimax): cover dedicated image endpoints

* fix(logging): handle context assembly diagnostics

* Revert "fix(logging): handle context assembly diagnostics"

This reverts commit f51d2f7d67f8193268dd37553ac77e80a0423390.

* test(minimax): isolate image endpoint env

* docs(changelog): credit minimax image fix

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 20:07:52 +01:00
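The endpoint selection described in 0e1ef93e84 can be sketched as follows. This is a minimal illustration, not the merged code: `resolveMinimaxImageBaseUrl`, `isMinimaxCnHost`, `MINIMAX_API_HOST`, and the two hosts come from the commit message, while the function signatures, the `hostnameOf` helper, and the accepted input shapes (bare host or full URL) are assumptions. The `/v1/image_generation` path named in the message would be appended to the returned origin.

```typescript
// Image-generation origins named in the commit message.
const GLOBAL_IMAGE_ORIGIN = "https://api.minimax.io";
const CN_IMAGE_ORIGIN = "https://api.minimaxi.com";

// Hypothetical helper: extract a lowercase hostname from a bare host or a full URL.
function hostnameOf(value: string): string {
  const withoutScheme = value.trim().replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  return (withoutScheme.split(/[/:?#]/)[0] ?? "").toLowerCase();
}

// True only for minimaxi.com and its subdomains; api.minimax.chat does not match.
function isMinimaxCnHost(value: string | undefined): boolean {
  if (!value) return false;
  const host = hostnameOf(value);
  return host === "minimaxi.com" || host.endsWith(".minimaxi.com");
}

// Assumed signature: the env map and provider baseUrl are passed in explicitly.
function resolveMinimaxImageBaseUrl(
  env: Record<string, string | undefined>,
  providerBaseUrl?: string,
): string {
  const envHost = env["MINIMAX_API_HOST"]?.trim();
  // The env var decides when set; otherwise the provider baseUrl is used
  // only to detect the CN host, never as the image endpoint itself.
  const useCn = envHost ? isMinimaxCnHost(envHost) : isMinimaxCnHost(providerBaseUrl);
  return useCn ? CN_IMAGE_ORIGIN : GLOBAL_IMAGE_ORIGIN;
}
```

Under these assumptions, a baseUrl misconfigured as api.minimax.chat/anthropic falls through to the global origin instead of being reused for image requests.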
Quratulain-bilal
7d58362f3f docs(browser): note tilde expansion also covers per-profile paths (#71601)
* docs(browser): note tilde expansion also covers per-profile paths

The 95a2c9b fix expanded "~" for both `browser.executablePath` and
per-profile `profiles.<name>.executablePath` (config.ts:382 calls
`normalizeExecutablePath` for profile overrides). Per-profile
`userDataDir` on existing-session profiles is also tilde-expanded
(config.ts:391 via `resolveUserPath`). The configuration reference
only mentioned the top-level `browser.executablePath` case.

* docs(browser): align tilde path config help

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 20:05:03 +01:00
Vincent Koc
5671fdca87 feat(diagnostics-otel): add genai usage span identity 2026-04-25 12:03:10 -07:00
Peter Steinberger
5eab16e086 fix: improve google meet setup diagnostics 2026-04-25 20:01:24 +01:00
Peter Steinberger
e36b77c13e docs(changelog): drop self-thanks 2026-04-25 20:01:00 +01:00
Peter Steinberger
d68574653e docs(changelog): split 2026.4.24 and 2026.4.25 notes 2026-04-25 19:59:54 +01:00
Quratulain-bilal
8170df9127 docs(browser): document local startup timeout bounds (#71672)
* docs(browser): document local startup timeout bounds

The new browser.localLaunchTimeoutMs and browser.localCdpReadyTimeoutMs
options are clamped to MAX_BROWSER_STARTUP_TIMEOUT_MS (120000 ms) by
normalizeStartupTimeoutMs in extensions/browser/src/browser/config.ts,
and zero/negative/non-finite values fall back to the defaults. Without
this in the configuration reference, users setting a higher value see
no error and silently get the 120 s ceiling, or set 0 expecting 'no
timeout' and silently get the default.

* docs(browser): clarify startup timeout validation

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:59:53 +01:00
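The clamping rule documented in 8170df9127 can be illustrated with a short sketch. `normalizeStartupTimeoutMs` and `MAX_BROWSER_STARTUP_TIMEOUT_MS` (120000 ms) are named in the commit message; the signature and the explicit `defaultMs` parameter are assumptions, since the actual defaults are not stated in the log.

```typescript
// Ceiling stated in the commit message (120 s).
const MAX_BROWSER_STARTUP_TIMEOUT_MS = 120000;

// Assumed signature: the real helper lives in extensions/browser/src/browser/config.ts.
function normalizeStartupTimeoutMs(value: unknown, defaultMs: number): number {
  // Zero, negative, and non-finite values fall back to the default.
  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return defaultMs;
  }
  // Larger values are silently clamped to the ceiling rather than rejected.
  return Math.min(value, MAX_BROWSER_STARTUP_TIMEOUT_MS);
}
```

With a hypothetical default of 15000 ms, `normalizeStartupTimeoutMs(300000, 15000)` yields 120000 and `normalizeStartupTimeoutMs(0, 15000)` yields 15000, which is the behavior the docs change makes visible to users.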
Peter Steinberger
b66f01bdca fix: expose transparent image infer options 2026-04-25 19:58:41 +01:00
Vincent Koc
cd7a8f870b feat(diagnostics-otel): add genai usage span attrs 2026-04-25 11:56:13 -07:00
91wan
bb2b68b34e fix(acp): pass Codex ACP model thinking overrides
Fix ACP Codex model/thinking override propagation.

Thanks @91wan.
2026-04-25 19:56:03 +01:00
Peter Steinberger
e9d9726f2d fix: handle context assembled diagnostics 2026-04-25 19:54:28 +01:00
Peter Steinberger
a018db771d fix: preserve omitted thinking replay turns 2026-04-25 19:54:28 +01:00
Peter Steinberger
690c98ad99 test(plugins): align install ledger mocks 2026-04-25 19:54:12 +01:00
Vincent Koc
c410e48382 fix(plugins): keep onboarding install records out of config 2026-04-25 11:52:19 -07:00
Peter Steinberger
bbc0884e23 docs(changelog): restore 2026.4.24 release notes 2026-04-25 19:51:11 +01:00
Vincent Koc
9bd348fdec fix(plugins): harden install ledger path handling 2026-04-25 11:48:17 -07:00
Vincent Koc
dc19069d71 feat(diagnostics-otel): add genai operation duration metric 2026-04-25 11:48:10 -07:00
Peter Steinberger
81307fc11d test: hoist backup archive mocks 2026-04-25 19:48:03 +01:00
Peter Steinberger
599ae7fed8 docs: clarify tool result details persistence 2026-04-25 19:47:19 +01:00
Peter Steinberger
fecf1e9b8f fix: align plugin install tests with ledger store 2026-04-25 19:44:11 +01:00
Peter Steinberger
4c0e9a4b2e fix(plugins): honor inferred agent model defaults 2026-04-25 19:40:32 +01:00
Peter Steinberger
cd8cb8254a fix(logging): remove duplicate context diagnostic case 2026-04-25 19:39:20 +01:00
Peter Steinberger
2055e6ceba fix(logging): include context assembly diagnostics in stability log 2026-04-25 19:39:20 +01:00
Peter Steinberger
8ea3099cd3 test(codex): accept visible session model reply 2026-04-25 19:39:20 +01:00
Peter Steinberger
e4f544790c test: isolate gateway live model sessions 2026-04-25 19:39:20 +01:00
Peter Steinberger
02639d3ec8 fix(plugins): alias wildcard runtime dependency exports 2026-04-25 19:39:20 +01:00
Peter Steinberger
14c9cfb637 fix(plugins): alias runtime dependency export subpaths 2026-04-25 19:39:20 +01:00
Peter Steinberger
9e9aa4722a fix(plugins): load mirrored runtime deps through ESM-safe aliases 2026-04-25 19:39:20 +01:00
Peter Steinberger
d2ab6b4fd5 fix(plugins): preserve package deps for runtime mirrors 2026-04-25 19:39:19 +01:00
Troy Hitch
63241bf1e0 fix(bonjour): suppress ciao cancellation across plugin runtime copies
Fix the bundled Bonjour gateway discovery crash-loop caused by ciao probe cancellation rejections after the Bonjour plugin migration.

The plugin entry now wires the existing rejection handler into the advertiser, and the unhandled-rejection handler registry is anchored on globalThis so staged plugin SDK module copies register into the same process-level handler set used by the host.

Verification:
- pnpm test:serial extensions/bonjour/src/advertiser.test.ts src/infra/unhandled-rejections.fatal-detection.test.ts
- OPENCLAW_LOCAL_CHECK_MODE=throttled pnpm check:changed partially completed: conflict markers plus core/core-test/extensions/extension-test typecheck passed; local lint lane hit a self-lock and was stopped.
2026-04-25 11:38:30 -07:00
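The globalThis anchoring described in 63241bf1e0 can be sketched roughly as below. Everything here is illustrative: the registry key, the handler type, and the function names are hypothetical, and the real registry may differ. The point is only that a `Symbol.for` key on `globalThis` resolves to the same `Set` from every copy of a module, so staged plugin SDK copies and the host share one handler set.

```typescript
// Hypothetical handler shape: returns true when it recognizes and suppresses a rejection.
type RejectionHandler = (reason: unknown) => boolean;

// Hypothetical key. Symbol.for returns the same symbol in every module copy.
const REGISTRY_KEY = Symbol.for("example.unhandledRejectionHandlers");

function getHandlerRegistry(): Set<RejectionHandler> {
  const store = globalThis as unknown as Record<symbol, Set<RejectionHandler> | undefined>;
  let registry = store[REGISTRY_KEY];
  if (!registry) {
    // The first caller creates the set; later callers (including other copies) reuse it.
    registry = new Set<RejectionHandler>();
    store[REGISTRY_KEY] = registry;
  }
  return registry;
}

// Registers a handler and returns a function that unregisters it.
function registerRejectionHandler(handler: RejectionHandler): () => void {
  const registry = getHandlerRegistry();
  registry.add(handler);
  return () => {
    registry.delete(handler);
  };
}

// True when any registered handler claims the rejection.
function isRejectionHandled(reason: unknown): boolean {
  return Array.from(getHandlerRegistry()).some((handler) => handler(reason));
}
```

A module-local `Set` would not give this behavior, because each staged copy of the module would get its own instance.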
Vincent Koc
888448facc feat(plugins): move install records to managed ledger 2026-04-25 11:37:10 -07:00
Peter Steinberger
e473577eaa test(voice): harden live STT transcript checks 2026-04-25 19:36:01 +01:00
Vincent Koc
f204f0c999 docs(logging): document new OTEL metrics and spans from recent diagnostics-otel feats
Five recent diagnostics-otel feat commits added user-facing OpenTelemetry
surfaces but did not update docs/logging.md, so the listed metrics and
spans drifted out of sync with what the plugin actually exports:

- 7bbd47349e adds gen_ai.client.token.usage histogram (GenAI semconv)
- b8a41739d5 adds memory heap/rss histograms, pressure counter and span
- d6ef1fcf24 adds openclaw.tool.loop counters and span
- ff172f46a5 adds openclaw.context.assembled span
- 44114328b4 adds openclaw.provider.request_id_hash attr on
  openclaw.model.call spans

Append the new metrics under existing model-usage and exec sections,
add a 'Diagnostics internals' subsection for memory + tool-loop
metrics, and add the three new spans (context.assembled, tool.loop,
memory.pressure) plus the request-id-hash attribute to the spans
listing.
2026-04-25 11:35:20 -07:00
Vincent Koc
7bbd47349e feat(diagnostics-otel): add genai token usage metric 2026-04-25 11:31:45 -07:00
Peter Steinberger
73706ca244 test: stabilize QA session memory ranking 2026-04-25 19:30:28 +01:00
Peter Steinberger
de0097a23c fix: support transparent OpenAI image generation 2026-04-25 19:28:56 +01:00
Peter Steinberger
0bf4876add fix: sanitize assembled diagnostic context 2026-04-25 19:23:51 +01:00
Peter Steinberger
a00c225899 test: split pure tool-card coverage 2026-04-25 19:23:51 +01:00
Peter Steinberger
e1495c3372 test: streamline memory and tts suites 2026-04-25 19:23:51 +01:00
Peter Steinberger
75fcb8c56d perf: lazy-load heavy test imports 2026-04-25 19:23:51 +01:00
Peter Steinberger
31456e3326 fix(providers): handle proxied DeepSeek V4 replay 2026-04-25 19:23:15 +01:00
Vincent Koc
b8a41739d5 feat(diagnostics-otel): export memory diagnostics 2026-04-25 11:22:19 -07:00
Peter Steinberger
1380dc170e fix(browser): avoid restart hint for external profiles 2026-04-25 19:18:06 +01:00
Vincent Koc
d6ef1fcf24 feat(diagnostics-otel): export tool loop events 2026-04-25 11:11:56 -07:00
Peter Steinberger
830bd2e236 fix: recover stale runtime deps locks 2026-04-25 19:09:09 +01:00
Poo-Squirry
fd3840cb00 Fix context usage display and active-run reload interruptions
Fixes context usage display regressions and prevents active runs from being interrupted by channel reloads. Adds persisted tool-result detail bounds so large tool metadata stays out of model/session payloads.
2026-04-25 19:07:52 +01:00
Chris Zhang
c3bfd328ad feat(litellm): add image generation provider (#70246)
* feat(litellm): add image generation provider

Registers litellm as an image-generation provider so model refs like
litellm/gpt-image-2 route through the LiteLLM proxy, and
agents.defaults.imageGenerationModel.fallbacks entries of the form
litellm/... resolve without "No image-generation provider registered
for litellm" errors.

Implementation uses the OpenAI-compatible /images/generations and
/images/edits endpoints that LiteLLM proxies for. BaseUrl resolves from
models.providers.litellm.baseUrl (default http://localhost:4000). Private
network is auto-allowed when baseUrl is a loopback/RFC1918 address, which
covers the common self-hosted LiteLLM proxy case without needing
OPENCLAW_PROVIDER_ALLOW_PRIVATE_NETWORK. Public baseUrls keep normal SSRF
defaults.

Default model is gpt-image-2 (matching upstream 4.21+ OpenAI default).
Advertises the same 2K/4K sizes OpenAI now exposes, plus legacy
256/512/1024 for dall-e-3. Supports both generate and edit.

Local patch. LiteLLM has no upstream image-generation support yet; revisit
if upstream adds one.

* ci: rerun after upstream main hot-fix

* fix(litellm): harden image generation provider

---------

Co-authored-by: Chris Zhang <chris@ChrisdeMac-mini.local>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:06:51 +01:00
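The "loopback/RFC1918" condition mentioned in c3bfd328ad can be illustrated with a simplified host check. This is a sketch under stated assumptions, not the provider's actual SSRF logic: the function name is hypothetical, it takes an already-extracted hostname, and it covers only `localhost`, `::1`, 127.0.0.0/8, and the three RFC 1918 IPv4 ranges.

```typescript
// Hypothetical helper: true for loopback names/addresses and RFC 1918 private IPv4 ranges.
function isLoopbackOrRfc1918Host(hostname: string): boolean {
  // Normalize case and strip IPv6 brackets such as "[::1]".
  const host = hostname.trim().toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host === "::1") return true;

  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;

  const octets = [match[1], match[2], match[3], match[4]].map((part) => Number(part));
  if (octets.some((octet) => octet > 255)) return false;

  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  return (
    a === 127 || // 127.0.0.0/8 loopback
    a === 10 || // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}
```

Under this sketch the default `http://localhost:4000` baseUrl would qualify, while a public host would keep the normal SSRF defaults.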
Chunyue Wang
930d81aa41 fix(agents): prevent Bedrock replay death loop on empty assistant content (#71627)
* fix(agents): prevent Bedrock replay death loop on empty assistant content

  Fixes #71572

* docs: document Bedrock replay repair (#71627) (thanks @openperf)

* fix(diagnostics): share diagnostic event state across sdk graphs

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:04:40 +01:00
Vincent Koc
ff172f46a5 feat(diagnostics-otel): add context assembly spans 2026-04-25 11:03:46 -07:00
Peter Steinberger
afd6b5d6fc fix(opencode-go): route DeepSeek V4 through OpenAI transport 2026-04-25 18:58:08 +01:00
Vincent Koc
275c128e99 feat(plugins): add sanitized model call hooks 2026-04-25 10:56:40 -07:00
Peter Steinberger
9ffe764416 fix(whatsapp): send voice note text separately 2026-04-25 18:55:03 +01:00
Peter Steinberger
617e1dd6bf fix(browser): honor remote CDP open timeouts 2026-04-25 18:52:57 +01:00
Peter Steinberger
d623354a0e fix(infra): share diagnostic event state across loaders 2026-04-25 18:52:38 +01:00
Vincent Koc
44114328b4 feat(diagnostics): surface provider request id hashes 2026-04-25 10:46:10 -07:00
Peter Steinberger
2e0ae56b1a test(plugins): satisfy readonly index lint 2026-04-25 18:44:29 +01:00
Peter Steinberger
cd6c64d2ee test(plugins): avoid readonly index mutation 2026-04-25 18:42:25 +01:00
Peter Steinberger
649a645492 test(core): trim sync test overhead 2026-04-25 18:41:21 +01:00
Peter Steinberger
39488dfd68 test(pairing): reduce fixture io overhead 2026-04-25 18:41:20 +01:00
Peter Steinberger
8c93745f0f test(memory): speed up host fixture setup 2026-04-25 18:41:20 +01:00
Vincent Koc
f56bf63b06 fix(plugins): reject stale registry policy reads 2026-04-25 10:35:36 -07:00
Vincent Koc
61b3c04424 test(plugins): cover registry refresh mutations 2026-04-25 10:35:36 -07:00
Vincent Koc
3ec92dfac0 fix(plugins): deprecate registry disable break glass 2026-04-25 10:35:36 -07:00
Vincent Koc
4324855a9d docs(plugins): document persisted registry repair 2026-04-25 10:35:35 -07:00
Vincent Koc
fd8a8789d0 fix(plugins): satisfy registry lint 2026-04-25 10:35:35 -07:00
Vincent Koc
2f622acec6 fix(plugins): normalize startup config from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
f14aa65bcc fix(plugins): refresh registry after chat toggles 2026-04-25 10:35:35 -07:00
Vincent Koc
29988335fc feat(plugins): resolve provider owners from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
674d188153 feat(plugins): plan gateway startup from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
feb8d3a4bd fix(plugins): label registry list state as enabled 2026-04-25 10:35:34 -07:00
Vincent Koc
5677a26385 docs(changelog): note registry-backed plugin list 2026-04-25 10:35:34 -07:00
Vincent Koc
5859dcd298 feat(plugins): list from registry snapshot 2026-04-25 10:35:34 -07:00
Vincent Koc
caf25fac91 feat(plugins): add registry repair command 2026-04-25 10:35:34 -07:00
Vincent Koc
521e75dea0 feat(plugins): prefer persisted registry reads 2026-04-25 10:35:09 -07:00
Vincent Koc
a7de722f4f fix(diagnostics-otel): align GenAI semconv attrs 2026-04-25 10:33:13 -07:00
Peter Steinberger
5f4bc6ec02 fix: surface external agent errors 2026-04-25 18:30:16 +01:00
Peter Steinberger
f545872cbc test(ui): streamline session controls async tests 2026-04-25 18:27:23 +01:00
Peter Steinberger
847c00d409 test(ui): speed up chat icon mocks 2026-04-25 18:27:23 +01:00
Peter Steinberger
88df8fe09d fix(browser): clarify Browserless CDP attach handling 2026-04-25 18:26:57 +01:00
Peter Steinberger
0bbb0eb735 fix(image): honor generation timeout config 2026-04-25 18:25:26 +01:00
Peter Steinberger
80739731dd docs: clarify pi-ai generic failover (#71647) 2026-04-25 18:22:06 +01:00
willamhou
4b5c2f9aa3 fix(agents/failover): classify bare pi-ai stream wrapper as timeout regardless of provider (#71620) 2026-04-25 18:22:06 +01:00
Vincent Koc
dcdf97685b fix(diagnostics): trust internal trace parents (#71574)
* fix(diagnostics): trust internal trace parents

* fix(diagnostics): harden trusted trace metadata

* fix(tooling): honor explicit oxlint threads

* fix(agents): use stable nonmutating sort helpers

* chore(plugin-sdk): refresh api baseline

* fix(diagnostics): gate internal event subscriptions

* fix(diagnostics): isolate listener event copies

* chore(plugin-sdk): refresh internal diagnostics baseline

* chore(plugin-sdk): refresh diagnostics event baseline

* fix(diagnostics): keep event state module local

* fix(diagnostics): harden internal subscription capability

* fix(diagnostics): freeze listener metadata
2026-04-25 10:18:52 -07:00
Peter Steinberger
8e7d382c37 refactor(tts): clarify text media directives 2026-04-25 18:18:34 +01:00
Peter Steinberger
67506ac2a9 fix(xai): support video reference images 2026-04-25 18:14:51 +01:00
Peter Steinberger
768bbc7cc0 docs: update OpenAI GPT-5.5 API guidance 2026-04-25 18:14:10 +01:00
Peter Steinberger
390be8138f fix: add OpenCode Go DeepSeek V4 models 2026-04-25 18:11:59 +01:00
Vincent Koc
0d274ef6c2 docs(control-ui): note assistant avatar uploads stay browser-local
Val Alexander's c65aa1d2a6 (#71639) changed assistant avatar uploads
from gateway config persistence to localStorage, mirroring the existing
user-avatar pattern. CHANGELOG covered it but docs/web/control-ui.md
'Personal identity (browser-local)' section only documented the user
identity. Add a paragraph noting the assistant avatar override follows
the same browser-local pattern, while keeping the ui.assistant.avatar
config field reachable for non-UI clients writing the field directly.
2026-04-25 10:08:59 -07:00
Peter Steinberger
6b3e4b88d6 test: update QA parity fixtures for GPT-5.5 2026-04-25 18:05:28 +01:00
Peter Steinberger
39343088ed fix(tts): keep media-only no-reply payloads 2026-04-25 18:04:54 +01:00
Peter Steinberger
f3ba962fd0 fix(subagents): explain browser tool profile filtering 2026-04-25 17:59:05 +01:00
Peter Steinberger
e27e29c66e refactor: split Crestodian planner backend selection 2026-04-25 17:56:46 +01:00
Peter Steinberger
60f9358348 fix(tts): preserve legacy tool voice hints 2026-04-25 17:56:37 +01:00
Peter Steinberger
dc7c703425 test: lazy-load global cleanup helpers 2026-04-25 17:49:16 +01:00
Peter Steinberger
8bead989da fix(telegram): frame audio transcripts as untrusted 2026-04-25 17:45:40 +01:00
Peter Steinberger
8659495384 test: make live cron probe agent-generic 2026-04-25 17:42:32 +01:00
Val Alexander
c65aa1d2a6 fix(control-ui): persist assistant avatar override locally (#71639)
* fix(control-ui): rebalance quick settings into stable 3-col bento

Pair Appearance with Automations and let Channels stand alone in the
middle column so all three top-row columns reach similar heights.
Promote Personal to a full-width row with a horizontal body
(identity tiles | emoji + actions) so the avatar block stops fighting
for half-width space. Drops the unused .qs-stack--wide hook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(control-ui): rebalance Personal card with symmetric User↔Assistant identity pair

Restructure Personal card layout to present User and Assistant as 2 balanced identity cards instead of separate User tile + form controls. Mirrors the visual hierarchy and UI pattern across both identities.

Changes:
- Move User avatar text input into User identity card's .__repair section (mirroring Assistant's structure)
- Inline "Choose image" and "Clear avatar" buttons as flex-wrapped action group
- Remove .qs-personal-body and .qs-personal-form wrapper divs
- Update Personal card's .qs-identity-grid to 2-column layout with balanced spacing
- Responsive collapse to 1-column at ≤760px

Tests:
- config-quick.test.ts updated to expect 2 stacks (no longer wrapping Personal in form)
- config-quick.test.ts validates identity card layout now has symmetric User↔Assistant structure
- All 10 quick settings view tests passing
- All 20 schema regression tests passing

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* chore: ignore .vmux worktree paths

* fix(control-ui): persist assistant avatar override locally instead of via gateway config

Mirrors the user-avatar pattern: assistant avatar uploads now go to
localStorage and overlay the gateway-resolved identity at bootstrap and on
agent.identity.get refreshes. Sidesteps the ui.assistant.avatar zod cap
that rejected uploaded data URLs as 'Too big: expected string to have
<=200 characters', removes one config.patch RPC from the avatar path, and
collapses the upload handler from a 44-line async/loadConfig dance into a
plain synchronous setter.

Also lifts the gateway-side ui.assistant.avatar schema cap from 200 to
2,000,000 to match the user-avatar size budget for non-UI clients writing
the field directly, and adds a content-aware text/image normalizer in
ui/src/ui/assistant-identity.ts so short-text avatars stay short while
data URLs survive round-tripping.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 11:17:48 -05:00
Darshan Paccha
95b7a85f06 fix(ui): remove duplicate config section headers
Fix duplicate section title and description rendering in single-section Control UI config pages.

Keeps root multi-section card headers intact, keeps single-section hero copy as the only visible section title, and adds browser coverage for both single-section and root views.

Fixes #68003.

Thanks @d1rshan.
2026-04-25 09:43:50 -05:00
Vincent Koc
c070509b7f fix(security): bound archive and MIME parser work (#71561)
* fix(security): bound archive and MIME parser work

* fix(security): harden zip preflight accounting

* fix(plugins): keep update channel sync on bundled path helpers

* fix(lint): avoid boolean literal comparisons

* fix(lint): keep agent spawn assertion immutable

* test(auto-reply): relax slow model directive regression timeout
2026-04-25 06:22:56 -07:00
Peter Steinberger
4e3bf7ce6a test: scope gateway restart signal assertion 2026-04-25 14:09:31 +01:00
Peter Steinberger
5c6a5afe81 test: use non-mutating sort in cli runner spec 2026-04-25 14:07:24 +01:00
Peter Steinberger
cd392b947c test: dedupe memory and context suites 2026-04-25 14:06:26 +01:00
Peter Steinberger
2413c0f5a5 perf: split chat UI test dependencies 2026-04-25 14:06:26 +01:00
Peter Steinberger
3db60f7eab perf: trim agent workspace imports 2026-04-25 14:06:26 +01:00
Vincent Koc
9b1dd9e573 docs(browser): document Chrome MCP per-profile mcpCommand/mcpArgs and cdpUrl mapping
Vincent's commit ab1d1a5c9e (#71560) added user-facing config keys to
existing-session profiles for the Chrome DevTools MCP launch path:

- browser.profiles.<name>.mcpCommand
- browser.profiles.<name>.mcpArgs

Plus runtime behavior changes:

- cdpUrl http(s) -> --browserUrl, cdpUrl ws(s) -> --wsEndpoint
- endpoint flags and userDataDir are mutually exclusive

The CHANGELOG entry covered the change, but docs/tools/browser.md
existing-session reference did not. Add a 'Custom Chrome MCP launch'
subsection describing the new fields and the cdpUrl endpoint mapping
rules.
2026-04-25 05:54:54 -07:00
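The cdpUrl mapping listed in 9b1dd9e573 can be sketched as a scheme switch. The flag names `--browserUrl` and `--wsEndpoint` come from the commit message; the function name and return shape are hypothetical, and the sketch does not model how the flag is serialized onto the command line or how the mutual exclusion with userDataDir is enforced.

```typescript
type EndpointFlag = {
  flag: "--browserUrl" | "--wsEndpoint";
  value: string;
};

// Hypothetical helper: pick the Chrome MCP endpoint flag from the cdpUrl scheme.
function mapCdpUrlToEndpointFlag(cdpUrl: string): EndpointFlag | undefined {
  const value = cdpUrl.trim();
  const scheme = /^([a-z][a-z0-9+.-]*):\/\//i.exec(value)?.[1]?.toLowerCase();
  switch (scheme) {
    case "http":
    case "https":
      return { flag: "--browserUrl", value };
    case "ws":
    case "wss":
      return { flag: "--wsEndpoint", value };
    default:
      // Unknown or missing scheme: no endpoint flag is produced in this sketch.
      return undefined;
  }
}
```

For example, `mapCdpUrlToEndpointFlag("ws://127.0.0.1:9222/devtools/browser/abc")` selects `--wsEndpoint`, while an `http://` URL selects `--browserUrl`.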
Chunyue Wang
bc73141e82 fix(cli): key gemini cli auth epoch on google account identity (#71076)
Fixes openclaw#70973. Adds a `google-gemini-cli` branch to `getLocalCliCredentialFingerprint` that lifts OpenID `id_token` `sub`/`email` claims from `~/.gemini/oauth_creds.json` onto `GeminiCliCredential` so the shared `encodeOAuthIdentity` produces an identity-keyed auth-epoch matching the Claude/Codex contract, plus bumps `CLI_AUTH_EPOCH_VERSION` from 3 to 4 so existing v3 Gemini bindings without an `authEpoch` ride the existing `cli-session.ts` version-gate instead of forcing a one-time invalidation.
2026-04-25 20:47:58 +08:00
Vincent Koc
ab1d1a5c9e fix(browser): configure Chrome MCP existing-session launch (#71560) 2026-04-25 05:46:39 -07:00
Peter Steinberger
dd78b7f773 fix: harden OpenCode ACP bind dispatch 2026-04-25 13:38:58 +01:00
Peter Steinberger
42514156e0 fix: yield while waiting for subagent completions 2026-04-25 13:29:47 +01:00
skylee-01
f7b71abf48 fix(agents): pass Claude system prompt via file 2026-04-25 17:59:25 +05:30
Vincent Koc
ed650b652f fix(test): detect partial sparse core roots 2026-04-25 05:18:25 -07:00
Peter Steinberger
b26367e22f test: add Crestodian QA lab setup scenario 2026-04-25 13:15:11 +01:00
Peter Steinberger
c977643460 perf(browser): precompute browser help 2026-04-25 13:07:15 +01:00
Sahil Satralkar
3064ea78ab fix(telegram): recover incomplete preview finalization (#71554)
Fix Telegram partial-stream preview finalization so ambiguous final edit failures fall back to a final send when the visible preview is a strict prefix of the answer.

Includes archived-preview regression coverage and generated config metadata refresh.

Thanks @sahilsatralkar.

Co-authored-by: Sahil Satralkar <62758655+sahilsatralkar@users.noreply.github.com>
2026-04-25 13:01:10 +01:00
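The "strict prefix" condition in 3064ea78ab can be stated precisely with a small predicate. The function name is hypothetical and the sketch covers only the comparison itself, not how the Telegram channel detects an ambiguous edit failure or performs the final send.

```typescript
// Hypothetical helper: true when the visible preview is a proper (strict) prefix
// of the final answer, i.e. the answer starts with the preview and is longer.
function isStrictPrefix(visiblePreview: string, finalAnswer: string): boolean {
  return (
    visiblePreview.length < finalAnswer.length &&
    finalAnswer.startsWith(visiblePreview)
  );
}
```

A preview equal to the full answer is not a strict prefix, so under this reading an already-complete preview would not trigger the fallback send.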
Peter Steinberger
e25b3c6056 fix(browser): align bare ws cdp readiness 2026-04-25 13:00:22 +01:00
Vincent Koc
2b822f6ed0 fix(plugins): preserve default enablement for relocation 2026-04-25 04:59:53 -07:00
Vincent Koc
f70d77b0bd docs(plugins): clarify registry-derived relocations 2026-04-25 04:59:53 -07:00
Vincent Koc
0abb2a571f fix(plugins): derive bundled relocation from registry 2026-04-25 04:59:53 -07:00
Vincent Koc
7177492487 fix(plugins): keep enabled-only registry migration fresh 2026-04-25 04:59:53 -07:00
Vincent Koc
0cc2b0e283 feat(plugins): refresh registry after plugin mutations 2026-04-25 04:59:53 -07:00
Vincent Koc
53c3c949d0 feat(plugins): bridge externalized bundled updates 2026-04-25 04:59:52 -07:00
Vincent Koc
ad8296e685 fix(plugins): harden registry migration guards 2026-04-25 04:59:52 -07:00
Vincent Koc
f22a2f7e8b fix(plugins): migrate only enabled registry entries 2026-04-25 04:59:52 -07:00
Vincent Koc
d7cf803705 fix(plugins): preflight registry install migration 2026-04-25 04:59:52 -07:00
Vincent Koc
81aefb9a18 feat(plugins): migrate plugin registry on install 2026-04-25 04:59:52 -07:00
Peter Steinberger
a48998d8c8 test(qqbot): cover voice utility contracts 2026-04-25 12:57:23 +01:00
Peter Steinberger
c307700db0 test(whatsapp): cover group generated media delivery 2026-04-25 12:56:53 +01:00
Peter Steinberger
d6e9ae53fe perf: split chat strip helper 2026-04-25 12:52:27 +01:00
Peter Steinberger
56573185f2 perf: split canvas a2ui shared imports 2026-04-25 12:52:27 +01:00
Peter Steinberger
40e4a00c8e perf: slim crestodian rescue tests 2026-04-25 12:52:27 +01:00
Peter Steinberger
2b8105598e perf: lazy load support bundle zip 2026-04-25 12:52:27 +01:00
Peter Steinberger
1888242bd3 perf: split trajectory export paths 2026-04-25 12:52:27 +01:00
Peter Steinberger
4a76a66872 perf: slim memory host imports 2026-04-25 12:52:27 +01:00
Peter Steinberger
6eec38ad5a feat(discord): allow voice model override 2026-04-25 12:47:46 +01:00
Ayaan Zaidi
d0ed938351 fix: make subagent session errors actionable (#67790) (thanks @stainlu) 2026-04-25 17:15:36 +05:30
stainlu
835f768036 fix(agents): make sessions_spawn mode=session errors actionable when thread binding is unavailable 2026-04-25 17:15:36 +05:30
Peter Steinberger
3507efa4ec fix(media): preserve oversized video generation delivery 2026-04-25 12:41:43 +01:00
Roman Godz
150f3e472b fix: sync Claude CLI OAuth credentials (#70902) (thanks @starvex) 2026-04-25 17:07:27 +05:30
Peter Steinberger
84dc9f12f1 test(agents): cover single image generation media delivery 2026-04-25 12:32:43 +01:00
Vincent Koc
e174d96cc0 refactor(media): move sharp image ops into media runtime (#71519)
* refactor(media): move sharp image ops into plugin

* fix(media): pass image pixel budget to sharp plugin

* refactor(media): reuse media understanding sharp runtime

* test(build): allow staged runtime core graphs
2026-04-25 04:31:10 -07:00
Peter Steinberger
b2b898c2a8 feat(browser): configure local startup timeouts 2026-04-25 12:30:35 +01:00
Peter Steinberger
4ac6729d12 test: expand Crestodian first-run Docker smoke 2026-04-25 12:30:26 +01:00
Peter Steinberger
9ab51bb66e test: stabilize qa lab live scenarios 2026-04-25 12:30:08 +01:00
Peter Steinberger
c5fe80ad58 fix: make qa config apply retries idempotent 2026-04-25 12:30:07 +01:00
Peter Steinberger
67436918f3 fix: deliver subagent completions via external requester route 2026-04-25 12:30:07 +01:00
Vincent Koc
924271385b fix(cron): record interrupted startup runs
* fix(cron): record interrupted startup runs

* test(cron): update interrupted startup expectations
2026-04-25 04:28:11 -07:00
Val Alexander
fc5920fb51 fix(ui): polish assistant identity settings
Polishes the basic config identity layout, aligns assistant avatar rendering with chat, and adds a Control UI assistant avatar override with IDENTITY.md fallback.
2026-04-25 06:27:22 -05:00
Vincent Koc
443b837bd5 fix(build): harden bundled plugin runtime staging
Copy bundled plugin skill trees into dist-runtime, broaden Windows symlink-copy fallbacks, and harden runtime-deps fingerprinting.
2026-04-25 04:27:17 -07:00
Donetta Flatley
f408bba9de fix(memory-host-sdk): use TRUSTED_ENV_PROXY mode for remote embeddings in proxy environments (#71506)
* fix(memory-host-sdk): use TRUSTED_ENV_PROXY mode in withRemoteHttpResponse

When an HTTP/HTTPS proxy is configured via environment variables
(HTTPS_PROXY, HTTP_PROXY, ALL_PROXY), the withRemoteHttpResponse
function now passes mode=TRUSTED_ENV_PROXY to fetchWithSsrFGuard.

This causes DNS resolution to skip the local resolver and route
through the configured proxy, fixing 'fetch failed' errors for
remote memory embeddings (including GitHub Copilot embeddings) in
proxy environments (e.g. Clash TUN, corporate proxies).

Previously, without an explicit mode, fetchWithSsrFGuard defaulted
to STRICT mode which performs local DNS pre-resolution via
resolvePinnedHostnameWithPolicy(), failing in proxy environments
where DNS must go through the proxy.

Fixes: openclaw/openclaw#52162

* fix: harden memory env proxy guard (#71506) (thanks @DhtIsCoding)

---------

Co-authored-by: Dht <dht@openclaw.ai>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 12:24:09 +01:00
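The mode selection described in this commit message can be sketched as follows. This is a minimal illustration, not the actual `withRemoteHttpResponse` code: the helper name `pickGuardMode` and the `GuardMode` type are hypothetical, and only the environment variable names and the two mode names (`STRICT`, `TRUSTED_ENV_PROXY`) come from the message above. The follow-up hardening commit may apply stricter conditions than shown here.

```typescript
// Hypothetical sketch: choose the SSRF guard mode from proxy environment variables.
// Only the variable names and the two mode names are taken from the commit message.
type GuardMode = "STRICT" | "TRUSTED_ENV_PROXY";

const PROXY_ENV_VARS = ["HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY"] as const;

function pickGuardMode(env: Record<string, string | undefined>): GuardMode {
  // A variable counts as configured only if it is set to a non-blank value.
  const hasProxy = PROXY_ENV_VARS.some((name) => (env[name] ?? "").trim() !== "");
  // With a proxy configured, DNS must go through the proxy, so local
  // pre-resolution (STRICT) is skipped in favor of TRUSTED_ENV_PROXY.
  return hasProxy ? "TRUSTED_ENV_PROXY" : "STRICT";
}
```

A caller would presumably pass the result as the `mode` option of the guarded fetch; that wiring is not shown because its exact signature is not given in the source.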
Peter Steinberger
f1470b52fb fix(agents): fall back for threadless completion delivery 2026-04-25 12:23:42 +01:00
Ayaan Zaidi
bdba4fa1bf fix: isolate active memory auth health (#71539)
* fix(agents): scope helper auth failures

* fix(active-memory): isolate recall auth health

* fix: isolate active memory auth health (#71539)

* fix: avoid auth policy import cycle (#71539)
2026-04-25 16:50:38 +05:30
Peter Steinberger
be1d716427 refactor(plugin-sdk): narrow CLI runtime exports 2026-04-25 12:20:34 +01:00
Vincent Koc
f8a41e5e9c fix(test): serialize changed checks locally 2026-04-25 04:19:09 -07:00
Peter Steinberger
b511250e5c feat(media): add voice conversion and speech plugins 2026-04-25 12:12:33 +01:00
Peter Steinberger
16b7dee1ef test(crestodian): complete tui overview mock 2026-04-25 12:07:52 +01:00
Peter Steinberger
de652afffd fix: use random restart intent temp suffix 2026-04-25 12:04:37 +01:00
Peter Steinberger
e6fd1ccfd7 perf(ui): trim chat test imports 2026-04-25 12:04:17 +01:00
Peter Steinberger
4484772e7d test(logger): isolate rolling file cleanup 2026-04-25 12:04:17 +01:00
Peter Steinberger
4d00c47072 perf(crestodian): reduce test import overhead 2026-04-25 12:04:17 +01:00
Vincent Koc
84a22a64be fix(feishu): finish streaming card closeout 2026-04-25 04:04:03 -07:00
Peter Steinberger
935cd34e9f fix(openai): omit Azure image deployment model body 2026-04-25 12:02:26 +01:00
Peter Steinberger
89755d1c79 refactor(browser): simplify lazy CLI placeholders 2026-04-25 11:48:59 +01:00
deepkilo
df6c58cf30 fix(gateway): use secure dashboard links when TLS is enabled (#71499)
Fixes #71494.

- Render Control UI links with https:// when gateway TLS is enabled.
- Render websocket links with wss:// through the shared link resolver.
- Add daemon status handoff coverage and TLS scheme docs.

Co-authored-by: deepkilord <wang_hgang@msn.com>
2026-04-25 11:45:15 +01:00
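The scheme rule in the bullets above can be sketched like this. It is an illustration under assumptions: `resolveGatewayLinks` and its parameters are hypothetical names, not the shared link resolver the commit refers to; only the `https://`/`wss://` behavior under TLS comes from the message.

```typescript
// Hypothetical sketch of a shared link resolver: one TLS flag drives both schemes,
// so the dashboard link and the websocket link cannot disagree.
interface GatewayLinks {
  controlUi: string;
  websocket: string;
}

function resolveGatewayLinks(host: string, port: number, tlsEnabled: boolean): GatewayLinks {
  const httpScheme = tlsEnabled ? "https" : "http";
  const wsScheme = tlsEnabled ? "wss" : "ws";
  return {
    controlUi: `${httpScheme}://${host}:${port}/`,
    websocket: `${wsScheme}://${host}:${port}`,
  };
}
```

Deriving both URLs from a single flag is the design point: a status command that prints `http://` while the gateway only accepts TLS is the kind of mismatch the fix addresses.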
Peter Steinberger
8cbb62d93c docs(browser): document headless start override 2026-04-25 11:42:04 +01:00
Peter Steinberger
c52ec520c7 feat(browser): add one-shot headless start override 2026-04-25 11:42:03 +01:00
Peter Steinberger
51e6f9c27e fix(reply): narrow empty-body history guard 2026-04-25 11:41:36 +01:00
jindongfu
1559e28d6b fix(get-reply): include inboundUserContext in empty-body guard (#71489)
The empty-body guard only checked baseBodyFinal (current message body)
and softResetTail, ignoring inboundUserContext which includes
InboundHistory from group chat context. This caused the bot to reject
bare @mentions in Feishu group chats where prior messages provided the
conversation context via InboundHistory.

Now hasUserBody also checks whether inboundUserContext has content,
matching the behavior before the 2026.4.12 refactor.
2026-04-25 11:41:36 +01:00
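A minimal sketch of the guard described above, assuming all three inputs are plain strings. The field names `baseBodyFinal`, `softResetTail`, `inboundUserContext` and the name `hasUserBody` come from the commit message; the `ReplyInputs` shape and the trimming rule are assumptions, and the later commit that narrows this guard is not reflected here.

```typescript
// Hypothetical input shape; the real types are not shown in the source.
interface ReplyInputs {
  baseBodyFinal: string;
  softResetTail?: string;
  inboundUserContext?: string;
}

function hasUserBody(inputs: ReplyInputs): boolean {
  const nonEmpty = (s?: string): boolean => (s ?? "").trim().length > 0;
  // Before the fix only the first two checks existed, so a bare @mention
  // with group history in inboundUserContext was treated as empty.
  return (
    nonEmpty(inputs.baseBodyFinal) ||
    nonEmpty(inputs.softResetTail) ||
    nonEmpty(inputs.inboundUserContext)
  );
}
```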
Vincent Koc
1549ded4ac docs(control-ui): document PWA install and web push
Eduardo Cruz's PWA web push feat (21b7ad5805, #44590) added a substantial
user-facing surface — manifest.webmanifest, sw.js, gateway push.web.*
methods, persisted vapid-keys.json/web-push-subscriptions.json, and
OPENCLAW_VAPID_* env overrides — but did not touch any docs/.

Add a 'PWA install and web push' section to docs/web/control-ui.md
covering the new persisted state files, env vars, and the four scope-gated
gateway methods (push.web.vapidPublicKey, push.web.subscribe,
push.web.unsubscribe, push.web.test). Distinguish from the existing
APNS relay-backed iOS push path.
2026-04-25 03:40:38 -07:00
Peter Steinberger
776d2ab65d fix(browser): lazy-load browser CLI runtime
Co-authored-by: pandego <7780875+pandego@users.noreply.github.com>
Co-authored-by: Tianworld <3580442280@qq.com>
2026-04-25 11:40:20 +01:00
Ayaan Zaidi
27aae62d99 fix: stop heartbeat prompt leaking into user runs (#69278) (thanks @stainlu) 2026-04-25 16:09:56 +05:30
stainlu
06c058b21d fix(agents): stop injecting heartbeat system prompt on non-heartbeat runs (#69079) 2026-04-25 16:09:56 +05:30
Val Alexander
151befb90b chore: keep superpowers plans local (#71530)
* docs: add control ui setup guidance design

* chore: keep superpowers plans local
2026-04-25 05:35:23 -05:00
Vincent Koc
0c9dacf902 fix(test): ignore local check opt-out in dev wrappers 2026-04-25 03:32:01 -07:00
Peter Steinberger
87aa0f813c fix(cli): forward video generation options 2026-04-25 11:31:09 +01:00
Val Alexander
b85b106b10 docs: add application modernization plan (#71528)
* docs: add application modernization plan

* docs: clarify frontend skill target
2026-04-25 05:29:57 -05:00
Vincent Koc
e0546edd98 fix(cron): normalize flat legacy job rows 2026-04-25 03:29:30 -07:00
Ayaan Zaidi
bbd6dfbe92 fix: cover CLI session prompt hash reuse (#69236) (thanks @stainlu) 2026-04-25 15:58:19 +05:30
Peter Steinberger
7711df0669 fix: default proxy completions tool choice (#71472) (thanks @Speed-maker) 2026-04-25 11:23:33 +01:00
Speed-maker
9a6b769e6e fix(agents): default proxy completions tool choice 2026-04-25 11:23:33 +01:00
Peter Steinberger
6a71c19839 fix: simplify Crestodian startup greeting 2026-04-25 11:20:59 +01:00
Peter Steinberger
a0c70c4f5a fix(google): guard veo rest polling 2026-04-25 11:17:23 +01:00
Peter Steinberger
9b48e4c0b6 fix(browser): fall back to headless on Linux without display 2026-04-25 11:13:42 +01:00
Peter Steinberger
b5a1b7d44d fix(google): guard veo video downloads 2026-04-25 11:12:49 +01:00
Peter Steinberger
978f869fcd fix(google): type veo fallback operation state 2026-04-25 11:11:14 +01:00
Peter Steinberger
94686c63fb fix(google): fall back to rest for veo sdk 404 2026-04-25 11:11:14 +01:00
Vincent Koc
814409a3b3 fix(test): keep local Vitest checks serialized 2026-04-25 03:07:27 -07:00
Peter Steinberger
5e0cca5e24 fix(google): narrow veo api key for uri download 2026-04-25 11:07:16 +01:00
Peter Steinberger
c11337149b fix(google): download direct veo video uri 2026-04-25 11:07:16 +01:00
Vincent Koc
455eba7f94 fix(feishu): coalesce streaming card final delivery 2026-04-25 03:06:38 -07:00
Peter Steinberger
38703ed9a1 fix(discord): identify voice attachment metadata 2026-04-25 11:05:38 +01:00
Peter Steinberger
5985e1d8b9 test: speed up import-heavy tests 2026-04-25 11:04:16 +01:00
Peter Steinberger
b9ea631b4b fix(openai): use gpt 5.5 for codex image responses 2026-04-25 11:03:53 +01:00
Eduardo Cruz
21b7ad5805 feat: add Control UI PWA web push support (#44590)
Adds browser PWA manifest and service worker support for the Control UI, plus gateway RPC methods and persisted Web Push subscription handling.

Maintainer verification:
- OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test src/infra/push-web.test.ts src/gateway/server-methods/push.test.ts src/gateway/control-ui.test.ts src/gateway/protocol/push.test.ts
- pnpm check:changed passed before final GitHub update-branch merge commit
- pnpm build

Source head: 0720024368
2026-04-25 05:03:00 -05:00
Peter Steinberger
385da2db60 feat: run Crestodian in TUI shell 2026-04-25 10:59:49 +01:00
Peter Steinberger
9fe35a0c62 fix(discord): restore voice note audio preflight 2026-04-25 10:57:37 +01:00
Peter Steinberger
936f27dcab docs: clarify minimax music changelog scope 2026-04-25 10:55:54 +01:00
Peter Steinberger
e6713c0a61 test(minimax): cover default music model normalization 2026-04-25 10:55:54 +01:00
Peter Steinberger
ed8384d32d fix(minimax): default music generation to music 2.6 2026-04-25 10:55:54 +01:00
Vincent Koc
c1f359c276 fix(test): reuse heavy-check lock in boundary prep 2026-04-25 02:49:45 -07:00
Vincent Koc
678d2c327c docs(changelog): backfill missing PR refs and reporter credits in top Unreleased
Three of my (vincentkoc) entries were missing closing PR refs, and
several maintainer-fix entries were missing credit for the user who
reported the underlying issue:

- Diagnostics/OTEL outbound delivery: add (#71471) and credit @jlapenna
  whose #70424 framed the broader tracing work.
- Cron malformed legacy jobs: add (#71509).
- OpenAI/Codex OAuth region failures: add (#71501) and credit reporter
  @wulala-xjj (#51175).
- Telegram duplicate pollers: credit reporter @Co-Messi (#56230).
- MCP/CLI one-shot retire: credit reporter @spartoviMD (#71457).
- OpenAI/Codex image baseUrl canonicalize: credit reporter @GodsBoy
  (#71460).
- Feishu TTS Ogg/Opus: credit reporters @sg1416-zg (#61249) and
  @ycjlb2023-peteryi (#37868).
- MiniMax TTS portal OAuth: credit reporter @zx15210404690-hash
  (#55017).
- MCP config reload disposal: credit reporter @xieyuanqing (#60656).
2026-04-25 02:49:37 -07:00
Peter Steinberger
815e9b493c fix: improve openrouter model scan fallback 2026-04-25 10:46:20 +01:00
Peter Steinberger
da2c61fe6e fix: render authenticated control ui avatars 2026-04-25 10:46:14 +01:00
Yunsu
9c64a0ca23 fix(google): avoid doubled media generation API version
Strip configured trailing /v1beta from Google music/video generation base URLs before calling the Google GenAI SDK.

Fixes #63240.

Thanks @Hybirdss.
2026-04-25 10:45:38 +01:00
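The normalization this commit describes can be sketched as below. The helper name `stripTrailingApiVersion` and its `version` parameter are hypothetical; the only detail taken from the commit is that a configured trailing `/v1beta` is removed so the SDK does not append the version a second time.

```typescript
// Hypothetical sketch: remove a trailing API version segment from a base URL.
// Trailing slashes are dropped as well, including on URLs that carry no version.
function stripTrailingApiVersion(baseUrl: string, version: string = "v1beta"): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  const suffix = `/${version}`;
  return trimmed.endsWith(suffix) ? trimmed.slice(0, -suffix.length) : trimmed;
}
```

Matching only at the end of the string keeps a path that merely contains `v1beta` somewhere in the middle untouched.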
Val Alexander
0bef73d151 chore: remove repo PR assets (#71510) 2026-04-25 04:40:29 -05:00
Vincent Koc
2896107153 fix(cron): tolerate malformed legacy jobs 2026-04-25 02:39:06 -07:00
Peter Steinberger
a7604f8170 fix(minimax): support token plan tts auth 2026-04-25 10:36:12 +01:00
Peter Steinberger
7fcefd56b7 chore: bump version to 2026.4.25 2026-04-25 10:31:52 +01:00
Vincent Koc
65ea6a0d94 fix(auth): clarify Codex OAuth region failures (#71501) 2026-04-25 02:31:42 -07:00
Peter Steinberger
c6770d3694 fix: align native think menus with session models 2026-04-25 10:30:49 +01:00
Peter Steinberger
4f91d81e1d fix(googlechat): preserve reply text after typing update failures
Preserve Google Chat reply text when typing indicator cleanup or update fails.

- Extract Google Chat reply delivery into a focused module
- Retry the failed first text chunk as a new message after placeholder update failure
- Cover media caption and chunk fallback regressions

Thanks @colin-lgtm.
2026-04-25 10:30:41 +01:00
Vincent Koc
0ee9e8188d Merge branch 'main' of https://github.com/openclaw/openclaw
* 'main' of https://github.com/openclaw/openclaw:
  feat: add crestodian local planner fallback
  fix(control-ui): clarify chat context details
  fix(telegram): keep polling watchdog active for wedged runner
2026-04-25 02:22:02 -07:00
Peter Steinberger
9056d4f708 feat: add crestodian local planner fallback 2026-04-25 10:20:02 +01:00
Val Alexander
388270ffce fix(control-ui): clarify chat context details
Summary:
- Show full date and time in Control UI chat message footers.
- Collapse assistant model/token/context metadata behind an explicit Context disclosure.
- Update changelog attribution guidance to allow multi-author credited entries.

Validation:
- OPENCLAW_LOCAL_CHECK=0 pnpm test ui/src/ui/chat/grouped-render.test.ts
- OPENCLAW_LOCAL_CHECK=0 pnpm test src/commands/gateway-status/helpers.test.ts
- OPENCLAW_LOCAL_CHECK=0 pnpm check:changed
- GitHub CI passed on f071a38177
2026-04-25 04:19:56 -05:00
Vincent Koc
c52c161f5a refactor(plugins): compact package json index metadata 2026-04-25 02:18:56 -07:00
Vincent Koc
c959c18fc7 fix(plugins): persist registry enabled snapshot 2026-04-25 02:18:56 -07:00
Vincent Koc
00f47f01fe refactor(plugins): trim persisted plugin registry state 2026-04-25 02:18:56 -07:00
Vincent Koc
3556f8441a feat(plugins): add plugin registry facade 2026-04-25 02:18:56 -07:00
Vincent Koc
36219b0ffc fix(plugins): invalidate index on policy changes 2026-04-25 02:18:56 -07:00
Vincent Koc
b001b8c947 feat(plugins): inspect persisted plugin index state 2026-04-25 02:18:55 -07:00
Vincent Koc
74a384d887 feat(plugins): persist installed plugin index snapshots 2026-04-25 02:18:55 -07:00
Vincent Koc
dfac36ee01 feat(plugins): add cold installed index owner APIs 2026-04-25 02:18:55 -07:00
Vincent Koc
ceace83556 fix(telegram): keep polling watchdog active for wedged runner 2026-04-25 02:18:49 -07:00
Peter Steinberger
f6a3b42cfa fix(browser): keep transient fetch errors retryable
Co-authored-by: jriff <jriff@users.noreply.github.com>
2026-04-25 10:09:15 +01:00
Peter Steinberger
2483d1dc12 fix(browser): drop redundant setuid sandbox flag
Co-authored-by: Sebastian Krueger <150018+sebykrueger@users.noreply.github.com>
2026-04-25 10:09:15 +01:00
Peter Steinberger
41ed7fa535 fix(browser): manage isolated downloads
Co-authored-by: Pearce Kieser <5055971+Pearcekieser@users.noreply.github.com>
2026-04-25 10:09:13 +01:00
Peter Steinberger
b756dfcb2b perf: speed up boundary and provider tests 2026-04-25 10:08:46 +01:00
Vincent Koc
c5e6f4bbc0 docs(agents): document sparse changed gate 2026-04-25 02:07:15 -07:00
Peter Steinberger
2377f1a4cd test(elevenlabs): cover eleven_v3 tts catalog 2026-04-25 10:06:42 +01:00
itsuzef
0fc68a5ed4 feat(elevenlabs): register eleven_v3 in TTS model allowlist
eleven_v3 already works end-to-end (model_id passes through to the API
without validation), but was missing from ELEVENLABS_TTS_MODELS so it
never appeared in the in-product model picker or catalog metadata.
2026-04-25 10:06:42 +01:00
hcl
fd74fc5a4f fix(heartbeat): clamp scheduler delay to Node setTimeout cap (#71414) (#71478)
* fix(heartbeat): clamp scheduler delay to Node setTimeout cap (#71414)

When `agents.defaults.heartbeat.every` resolves to >2_147_483_647 ms
(~24.85d), the previous scheduleNext() called setTimeout with the raw
delay. Node clamps any delay > 2^31-1 to 1 ms, fires the callback, and
the heartbeat re-arms with the same oversized value - a tight loop that
floods the log with TimeoutOverflowWarning and crashes the gateway with
exit code 1.

Clamp the computed delay to HEARTBEAT_MAX_TIMEOUT_MS (2_147_483_647)
before calling setTimeout. The worst case is now one heartbeat every
~24.85d instead of crash-loop. Warn once per process when clamping
fires, so a misconfigured "365d" remains visible without flooding.

This is a defense-in-depth fix at the scheduler layer; loadConfig-level
rejection is a broader change with more blast radius and a separate
question (some users may legitimately want "every: 365d" to mean
"effectively never"). The clamped behaviour is closer to that intent
than the crash is.

Test: new scheduler test sets heartbeat.every="365d" with fake timers,
advances 60s, and asserts runSpy was never called (with the bug, it
would be called ~60_000 times).

* style: format heartbeat scheduler clamp

* fix: share safe timeout delay clamp (#71478) (thanks @hclsys)

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 10:03:43 +01:00
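The clamp described in this commit message can be sketched as follows. The names `MAX_TIMEOUT_MS` and `clampTimeoutDelay` are hypothetical stand-ins (the message mentions `HEARTBEAT_MAX_TIMEOUT_MS` and a shared safe-delay helper without showing them); the cap value and the warn-once behavior are taken from the message.

```typescript
// Hypothetical sketch of a safe setTimeout delay clamp.
// 2_147_483_647 ms is 2^31 - 1, roughly 24.85 days.
const MAX_TIMEOUT_MS = 2_147_483_647;

let warnedAboutClamp = false;

function clampTimeoutDelay(
  delayMs: number,
  warn: (message: string) => void = console.warn,
): number {
  if (delayMs <= MAX_TIMEOUT_MS) {
    return Math.max(0, delayMs);
  }
  // Warn once per process so a misconfigured interval stays visible without flooding logs.
  if (!warnedAboutClamp) {
    warnedAboutClamp = true;
    warn(`timer delay ${delayMs}ms exceeds ${MAX_TIMEOUT_MS}ms; clamping`);
  }
  return MAX_TIMEOUT_MS;
}
```

A scheduler would then call `setTimeout(run, clampTimeoutDelay(delay))`, so an interval such as "365d" fires at most once per ~24.85 days instead of re-arming in a tight loop.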
Vincent Koc
a33f7b7d05 fix(test): make changed typechecks sparse-safe 2026-04-25 02:02:57 -07:00
Peter Steinberger
ed0210a187 test: streamline slow import-heavy suites 2026-04-25 10:02:02 +01:00
Peter Steinberger
f7d276b842 perf: cache guard inventory checks 2026-04-25 10:02:02 +01:00
Peter Steinberger
70b3ba2fed test: speed up Docker live scheduling 2026-04-25 10:01:50 +01:00
Val Alexander
6bdf87de87 fix(ui): remove quick config API keys card
Remove the misleading API Keys card from the quick settings page.

The card was hardcoded to a fixed env-var provider list and routed all actions to the broad Environment config section, which made the Add/Change affordances look more precise than they were. This removes the dead surface and keeps the quick settings grid focused on meaningful controls.

Verified:
- pnpm test ui/src/ui/views/config-quick.test.ts
- CI passed on PR #71496
2026-04-25 04:00:53 -05:00
Chinar Amrutkar
bf34fde235 fix(telegram): remove offset confirmation getUpdates call
Remove the startup persisted-offset getUpdates preflight so polling restarts do not self-conflict before the grammY runner starts.

Fixes #69304.

Thanks @chinar-amrutkar.
2026-04-25 09:53:50 +01:00
Peter Steinberger
19017bad96 docs(browser): explain actionable aria snapshot refs 2026-04-25 09:51:34 +01:00
Peter Steinberger
ec8dbc4595 feat(tts): add xiaomi mimo speech provider 2026-04-25 09:48:05 +01:00
Peter Steinberger
e10f20032a fix(browser): resolve aria snapshot refs via DOM markers
Co-authored-by: MrKipler <mrkipler@kiphausen.com>
2026-04-25 09:44:31 +01:00
Behnam Shahbazi
207f0341e0 docs: use target flag in README message example 2026-04-25 01:44:21 -07:00
Vincent Koc
01bf61fcfd fix(media): remove express from media host (#71436)
* fix(media): remove express from media host

* fix(media): harden media host responses

* fix(msteams): stage express runtime dependency

* fix(browser): align profile facade exports

* fix(msteams): keep setup entry narrow

* fix(types): satisfy extension setup gates

* fix(msteams): use generic setup config type
2026-04-25 01:39:42 -07:00
Peter Steinberger
3169886a21 fix(telegram): guard duplicate polling leases 2026-04-25 09:38:51 +01:00
Vincent Koc
c88c2328c2 docs: align Node minimum requirement 2026-04-25 01:36:01 -07:00
Vincent Koc
ec1f72b6c5 fix(gateway): preserve restart drain for active runs
Fixes https://github.com/openclaw/openclaw/issues/65485
2026-04-25 01:35:47 -07:00
Vincent Koc
734748d4f4 fix(test): cap native worker pools for serial Vitest 2026-04-25 01:31:30 -07:00
Peter Steinberger
bc21f500d4 fix: align Codex Responses instructions payload 2026-04-25 09:30:34 +01:00
Peter Steinberger
bf0221c5b3 fix(plugins): preserve bundled cli metadata skip 2026-04-25 09:29:16 +01:00
Peter Steinberger
87e92c71a4 docs(release): require changelog rewrite from commits 2026-04-25 09:29:16 +01:00
Peter Steinberger
689a353621 fix(plugins): load packaged runtime mirrors from canonical sources 2026-04-25 09:29:16 +01:00
Peter Steinberger
8503935a21 test: speed up changed unit checks 2026-04-25 09:27:59 +01:00
Peter Steinberger
9ad14f3639 fix: restore msteams channel plugin api type 2026-04-25 09:27:59 +01:00
Vincent Koc
bf0d2d70be fix(session): clean up rollover resources 2026-04-25 01:27:16 -07:00
Peter Steinberger
b0c55eb659 fix(feishu): transcode voice TTS audio 2026-04-25 09:26:42 +01:00
Vincent Koc
bd32b1a906 feat(diagnostics): add outbound delivery lifecycle events
Add bounded outbound message delivery lifecycle diagnostics and OTEL export without message body, recipient, room, media path, or raw channel result data.
2026-04-25 01:26:34 -07:00
Peter Steinberger
9e149519fe fix: keep control ui bundle browser-safe 2026-04-25 09:22:49 +01:00
Peter Steinberger
65b607245a fix(browser): ignore handled route navigation races
Co-authored-by: Richard Steadman <198648604+Steady-ai@users.noreply.github.com>
2026-04-25 09:22:31 +01:00
1058 changed files with 45626 additions and 9853 deletions


@@ -68,6 +68,7 @@ gh search issues --repo openclaw/openclaw --match title,body --limit 50 \
- Keep commit messages concise and action-oriented.
- Group related changes; avoid bundling unrelated refactors.
- Use `.github/pull_request_template.md` for PR submissions and `.github/ISSUE_TEMPLATE/` for issues.
- Do not commit PR-only artifacts such as screenshots under `.github/pr-assets`; attach them to the PR/comment or use an external artifact store instead.
## Extra safety


@@ -97,6 +97,11 @@ Use this skill for release and publish-time workflow. Keep ordinary development
## Build changelog-backed release notes
- Before release branching or tagging, rewrite the target `CHANGELOG.md`
section from commit history, not just from existing notes: scan commits since
the last reachable release tag, add missed user-facing changes, dedupe
overlapping entries, and sort each section from most to least interesting for
users.
- Changelog entries should be user-facing, not internal release-process notes.
- GitHub release and prerelease bodies must use the full matching
`CHANGELOG.md` version section, not highlights or an excerpt. When creating

.github/labeler.yml

@@ -315,6 +315,11 @@
- changed-files:
- any-glob-to-any-file:
- "extensions/lmstudio/**"
"extensions: litellm":
- changed-files:
- any-glob-to-any-file:
- "extensions/litellm/**"
- "docs/providers/litellm.md"
"extensions: openai":
- changed-files:
- any-glob-to-any-file:
@@ -351,6 +356,11 @@
- changed-files:
- any-glob-to-any-file:
- "extensions/qianfan/**"
"extensions: senseaudio":
- changed-files:
- any-glob-to-any-file:
- "extensions/senseaudio/**"
- "docs/providers/senseaudio.md"
"extensions: synthetic":
- changed-files:
- any-glob-to-any-file:
@@ -367,6 +377,11 @@
- changed-files:
- any-glob-to-any-file:
- "extensions/together/**"
"extensions: tts-local-cli":
- changed-files:
- any-glob-to-any-file:
- "extensions/tts-local-cli/**"
- "docs/tools/tts.md"
"extensions: venice":
- changed-files:
- any-glob-to-any-file:

Binary file not shown (before: 86 KiB).

Binary file not shown (before: 44 KiB).

.gitignore

@@ -128,15 +128,14 @@ dist/protocol.schema.json
# Synthing
**/.stfolder/
.dev-state
docs/superpowers/plans/2026-03-10-collapsed-side-nav.md
docs/superpowers/specs/2026-03-10-collapsed-side-nav-design.md
docs/superpowers
.superpowers/
.gitignore
test/config-form.analyze.telegram.test.ts
ui/src/ui/theme-variants.browser.test.ts
ui/src/ui/__screenshots__
ui/src/ui/views/__screenshots__
ui/.vitest-attachments
docs/superpowers
# Generated docs baseline artifacts (locally generated, only hashes tracked)
docs/.generated/*.json
@@ -147,6 +146,7 @@ changelog/fragments/
# Local scratch workspace
.tmp/
.vmux*
.artifacts/
test/fixtures/openclaw-vitest-unit-report.json
analysis/


@@ -9,6 +9,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
- Run docs list first: `pnpm docs:list` if available; read relevant docs only.
- High-confidence answers only when fixing/triaging: verify source, tests, shipped/current behavior, and dependency contracts before deciding.
- Dependency-backed behavior: read upstream dependency docs/source/types first. Do not assume APIs, defaults, errors, timing, or runtime behavior.
- Live-verify when feasible. Check env/`~/.profile` for keys before assuming live tests are blocked; keep secret output redacted.
- Missing deps: `pnpm install`, retry once, then report first actionable error.
- CODEOWNERS: maint/refactor/tests ok. Larger behavior/product/security/ownership: owner ask/review.
- Wording: product/docs/UI/changelog say "plugin/plugins"; `extensions/` is internal.
@@ -44,6 +45,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
- Install: `pnpm install` (keep Bun lock/patches aligned if touched).
- CLI: `pnpm openclaw ...` or `pnpm dev`; build: `pnpm build`.
- Smart gate: `pnpm check:changed`; explain `pnpm changed:lanes --json`; staged preview `pnpm check:changed --staged`.
- Sparse worktrees: `pnpm check:changed` is sparse-safe and may skip sparse-missing typecheck projects; do not expand sparse checkout just to satisfy changed-gate tsgo. Direct `pnpm tsgo*` remains strict; use a fuller worktree when you need direct typecheck proof.
- Prod sweep: `pnpm check`; tests: `pnpm test`, `pnpm test:changed`, `pnpm test:serial`, `pnpm test:coverage`.
- Extension tests: `pnpm test:extensions`, `pnpm test extensions`, `pnpm test extensions/<id>`.
- Targeted tests: `pnpm test <path-or-filter> [vitest args...]`; never raw `vitest`.
@@ -59,6 +61,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
- PR shortlist: `gh pr list ...`; then `gh pr view <n> --json number,title,body,closingIssuesReferences,files,statusCheckRollup,reviewDecision`.
- After landing PR: search duplicate open issues/PRs. Before closing: comment why + canonical link.
- GH comments with markdown backticks, `$`, or shell snippets: avoid inline double-quoted `--body`; use single quotes or `--body-file`.
- PR execution artifacts/screenshots: attach them to the PR, comment, or an external artifact store. Do not add `.github/pr-assets` or other PR-only assets to the repo.
- PR review answer must explicitly cover: what bug/behavior we are trying to fix; PR/issue URL(s) and affected endpoint/surface; whether this is the best possible fix, with high-certainty evidence from code, tests, CI, and shipped/current behavior.
- CI polling: exact SHA, needed fields only. Example: `gh api repos/<owner>/<repo>/actions/runs/<id> --jq '{status,conclusion,head_sha,updated_at,name,path}'`.
- Post-land wait: minimal. Exact landed SHA only. If superseded on `main`, same-branch `cancel-in-progress` cancellations are expected; stop once local touched-surface proof exists. Never wait for newer unrelated `main` unless asked.
@@ -119,7 +122,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
- Docs change with behavior/API. Use docs list/read_when hints; docs links per `docs/AGENTS.md`.
- Changelog user-facing only; pure test/internal usually no entry.
-- Changelog placement: active version `### Changes`/`### Fixes`; at most one contributor mention, prefer `Thanks @user`.
+- Changelog placement: active version `### Changes`/`### Fixes`; every added entry must include at least one `Thanks @author` attribution, using credited GitHub username(s).
## Git

File diff suppressed because it is too large.


@@ -96,7 +96,7 @@ Model note: while many providers and models are supported, prefer a current flag
## Install (recommended)
-Runtime: **Node 24 (recommended) or Node 22.16+**.
+Runtime: **Node 24 (recommended) or Node 22.14+**.
```bash
npm install -g openclaw@latest
@@ -109,7 +109,7 @@ OpenClaw Onboard installs the Gateway daemon (launchd/systemd user service) so i
## Quick start (TL;DR)
-Runtime: **Node 24 (recommended) or Node 22.16+**.
+Runtime: **Node 24 (recommended) or Node 22.14+**.
Full beginner guide (auth, pairing, channels): [Getting started](https://docs.openclaw.ai/start/getting-started)
@@ -119,7 +119,7 @@ openclaw onboard --install-daemon
openclaw gateway --port 18789 --verbose
# Send a message
-openclaw message send --to +1234567890 --message "Hello from OpenClaw"
+openclaw message send --target +1234567890 --message "Hello from OpenClaw"
# Talk to the assistant (optionally deliver back to any connected channel: WhatsApp/Telegram/Slack/Discord/Google Chat/Signal/iMessage/BlueBubbles/IRC/Microsoft Teams/Matrix/Feishu/LINE/Mattermost/Nextcloud Talk/Nostr/Synology Chat/Tlon/Twitch/Zalo/Zalo Personal/WeChat/QQ/WebChat)
openclaw agent --message "Ship checklist" --thinking high


@@ -288,7 +288,7 @@ OpenClaw's web interface (Gateway Control UI + HTTP endpoints) is intended for *
### Node.js Version
-OpenClaw requires **Node.js 22.12.0 or later** (LTS). This version includes important security patches:
+OpenClaw requires **Node.js 22.14.0 or later** (LTS). This version includes important security patches:
- CVE-2025-59466: async_hooks DoS vulnerability
- CVE-2026-21636: Permission model bypass vulnerability
@@ -296,7 +296,7 @@ OpenClaw requires **Node.js 22.12.0 or later** (LTS). This version includes impo
Verify your Node.js version:
```bash
-node --version # Should be v22.12.0 or later
+node --version # Should be v22.14.0 or later
```
### Docker Security


@@ -65,8 +65,8 @@ android {
applicationId = "ai.openclaw.app"
minSdk = 31
targetSdk = 36
-versionCode = 2026042400
-versionName = "2026.4.24"
+versionCode = 2026042500
+versionName = "2026.4.25"
ndk {
// Support all major ABIs — native libs are tiny (~47 KB per ABI)
abiFilters += listOf("armeabi-v7a", "arm64-v8a", "x86", "x86_64")


@@ -3,6 +3,7 @@
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_MICROPHONE" />
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
<uses-permission
android:name="android.permission.NEARBY_WIFI_DEVICES"
@@ -52,7 +53,7 @@
<service
android:name=".NodeForegroundService"
android:exported="false"
-android:foregroundServiceType="dataSync" />
+android:foregroundServiceType="dataSync|microphone" />
<service
android:name=".node.DeviceNotificationListenerService"
android:label="@string/app_name"


@@ -101,7 +101,8 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
val onboardingCompleted: StateFlow<Boolean> = prefs.onboardingCompleted
val canvasDebugStatusEnabled: StateFlow<Boolean> = prefs.canvasDebugStatusEnabled
val speakerEnabled: StateFlow<Boolean> = prefs.speakerEnabled
-val micEnabled: StateFlow<Boolean> = prefs.talkEnabled
+val voiceCaptureMode: StateFlow<VoiceCaptureMode> = runtimeState(initial = VoiceCaptureMode.Off) { it.voiceCaptureMode }
+val micEnabled: StateFlow<Boolean> = runtimeState(initial = false) { it.micEnabled }
val micCooldown: StateFlow<Boolean> = runtimeState(initial = false) { it.micCooldown }
val micStatusText: StateFlow<String> = runtimeState(initial = "Mic off") { it.micStatusText }
@@ -111,6 +112,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
val micConversation: StateFlow<List<VoiceConversationEntry>> = runtimeState(initial = emptyList()) { it.micConversation }
val micInputLevel: StateFlow<Float> = runtimeState(initial = 0f) { it.micInputLevel }
val micIsSending: StateFlow<Boolean> = runtimeState(initial = false) { it.micIsSending }
val talkModeEnabled: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeEnabled }
val talkModeListening: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeListening }
val talkModeSpeaking: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeSpeaking }
val talkModeStatusText: StateFlow<String> = runtimeState(initial = "Off") { it.talkModeStatusText }
val chatSessionKey: StateFlow<String> = runtimeState(initial = "main") { it.chatSessionKey }
val chatSessionId: StateFlow<String?> = runtimeState(initial = null) { it.chatSessionId }
@@ -283,6 +288,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
ensureRuntime().setMicEnabled(enabled)
}
fun setTalkModeEnabled(enabled: Boolean) {
ensureRuntime().setTalkModeEnabled(enabled)
}
fun setSpeakerEnabled(enabled: Boolean) {
ensureRuntime().setSpeakerEnabled(enabled)
}


@@ -3,12 +3,14 @@ package ai.openclaw.app
import android.app.Notification
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.app.PendingIntent
import android.app.Service
import android.content.Context
import android.content.Intent
import android.content.pm.ServiceInfo
import androidx.core.app.NotificationCompat
import androidx.core.app.ServiceCompat
import androidx.core.content.ContextCompat
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
@@ -21,6 +23,7 @@ class NodeForegroundService : Service() {
private val scope: CoroutineScope = CoroutineScope(SupervisorJob() + Dispatchers.Main)
private var notificationJob: Job? = null
private var didStartForeground = false
private var voiceCaptureMode = VoiceCaptureMode.Off
override fun onCreate() {
super.onCreate()
@@ -36,22 +39,51 @@ class NodeForegroundService : Service() {
notificationJob =
scope.launch {
combine(
runtime.statusText,
runtime.serverName,
runtime.isConnected,
runtime.micEnabled,
runtime.micIsListening,
) { status, server, connected, micEnabled, micListening ->
Quint(status, server, connected, micEnabled, micListening)
}.collect { (status, server, connected, micEnabled, micListening) ->
val title = if (connected) "OpenClaw Node · Connected" else "OpenClaw Node"
val micSuffix =
if (micEnabled) {
if (micListening) " · Mic: Listening" else " · Mic: Pending"
} else {
""
combine(
runtime.statusText,
runtime.serverName,
runtime.isConnected,
runtime.voiceCaptureMode,
) { status, server, connected, mode ->
VoiceNotificationBase(
status = status,
server = server,
connected = connected,
mode = mode,
)
},
combine(
runtime.micEnabled,
runtime.micIsListening,
runtime.talkModeListening,
runtime.talkModeSpeaking,
) { micEnabled, micListening, talkListening, talkSpeaking ->
VoiceNotificationCapture(
micEnabled = micEnabled,
micListening = micListening,
talkListening = talkListening,
talkSpeaking = talkSpeaking,
)
},
) { base, capture ->
VoiceNotificationState(base = base, capture = capture)
}.collect { state ->
voiceCaptureMode = state.mode
val title =
when {
state.connected && state.mode == VoiceCaptureMode.TalkMode -> "OpenClaw Node · Talk"
state.connected -> "OpenClaw Node · Connected"
else -> "OpenClaw Node"
}
val text = (server?.let { "$status · $it" } ?: status) + micSuffix
val text =
(state.server?.let { "${state.status} · $it" } ?: state.status) +
voiceNotificationSuffix(
mode = state.mode,
manualMicEnabled = state.capture.micEnabled,
manualMicListening = state.capture.micListening,
talkListening = state.capture.talkListening,
talkSpeaking = state.capture.talkSpeaking,
)
startForegroundWithTypes(
notification = buildNotification(title = title, text = text),
@@ -60,13 +92,27 @@ class NodeForegroundService : Service() {
}
}
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
override fun onStartCommand(
intent: Intent?,
flags: Int,
startId: Int,
): Int {
when (intent?.action) {
ACTION_STOP -> {
(application as NodeApp).peekRuntime()?.disconnect()
stopSelf()
return START_NOT_STICKY
}
ACTION_SET_VOICE_CAPTURE_MODE -> {
voiceCaptureMode = intent.getStringExtra(EXTRA_VOICE_CAPTURE_MODE).toVoiceCaptureMode()
startForegroundWithTypes(
notification =
buildNotification(
title = "OpenClaw Node",
text = if (voiceCaptureMode == VoiceCaptureMode.TalkMode) "Talk mode active" else "Connected",
),
)
}
}
// Keep running; connection is managed by NodeRuntime (auto-reconnect + manual).
return START_STICKY
@@ -127,17 +173,13 @@ class NodeForegroundService : Service() {
.build()
}
private fun updateNotification(notification: Notification) {
val mgr = getSystemService(Context.NOTIFICATION_SERVICE) as NotificationManager
mgr.notify(NOTIFICATION_ID, notification)
}
private fun startForegroundWithTypes(notification: Notification) {
val serviceTypes = foregroundServiceTypesForVoiceMode(voiceCaptureMode)
if (didStartForeground) {
updateNotification(notification)
ServiceCompat.startForeground(this, NOTIFICATION_ID, notification, serviceTypes)
return
}
startForeground(NOTIFICATION_ID, notification, ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC)
ServiceCompat.startForeground(this, NOTIFICATION_ID, notification, serviceTypes)
didStartForeground = true
}
@@ -146,6 +188,8 @@ class NodeForegroundService : Service() {
private const val NOTIFICATION_ID = 1
private const val ACTION_STOP = "ai.openclaw.app.action.STOP"
private const val ACTION_SET_VOICE_CAPTURE_MODE = "ai.openclaw.app.action.SET_VOICE_CAPTURE_MODE"
private const val EXTRA_VOICE_CAPTURE_MODE = "ai.openclaw.app.extra.VOICE_CAPTURE_MODE"
fun start(context: Context) {
val intent = Intent(context, NodeForegroundService::class.java)
@@ -156,7 +200,85 @@ class NodeForegroundService : Service() {
val intent = Intent(context, NodeForegroundService::class.java).setAction(ACTION_STOP)
context.startService(intent)
}
fun setVoiceCaptureMode(
context: Context,
mode: VoiceCaptureMode,
) {
val intent =
Intent(context, NodeForegroundService::class.java)
.setAction(ACTION_SET_VOICE_CAPTURE_MODE)
.putExtra(EXTRA_VOICE_CAPTURE_MODE, mode.name)
if (mode == VoiceCaptureMode.TalkMode) {
ContextCompat.startForegroundService(context, intent)
} else {
context.startService(intent)
}
}
}
}
private data class Quint<A, B, C, D, E>(val first: A, val second: B, val third: C, val fourth: D, val fifth: E)
internal fun foregroundServiceTypesForVoiceMode(mode: VoiceCaptureMode): Int {
val base = ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC
return if (mode == VoiceCaptureMode.TalkMode) {
base or ServiceInfo.FOREGROUND_SERVICE_TYPE_MICROPHONE
} else {
base
}
}
internal fun voiceNotificationSuffix(
mode: VoiceCaptureMode,
manualMicEnabled: Boolean,
manualMicListening: Boolean,
talkListening: Boolean,
talkSpeaking: Boolean,
): String {
return when (mode) {
VoiceCaptureMode.TalkMode ->
when {
talkSpeaking -> " · Talk: Speaking"
talkListening -> " · Talk: Listening"
else -> " · Talk: On"
}
VoiceCaptureMode.ManualMic ->
if (manualMicEnabled) {
if (manualMicListening) " · Mic: Listening" else " · Mic: Pending"
} else {
""
}
VoiceCaptureMode.Off -> ""
}
}
private fun String?.toVoiceCaptureMode(): VoiceCaptureMode {
return VoiceCaptureMode.entries.firstOrNull { it.name == this } ?: VoiceCaptureMode.Off
}
private data class VoiceNotificationBase(
val status: String,
val server: String?,
val connected: Boolean,
val mode: VoiceCaptureMode,
)
private data class VoiceNotificationCapture(
val micEnabled: Boolean,
val micListening: Boolean,
val talkListening: Boolean,
val talkSpeaking: Boolean,
)
private data class VoiceNotificationState(
val base: VoiceNotificationBase,
val capture: VoiceNotificationCapture,
) {
val status: String
get() = base.status
val server: String?
get() = base.server
val connected: Boolean
get() = base.connected
val mode: VoiceCaptureMode
get() = base.mode
}

@@ -64,6 +64,8 @@ class NodeRuntime(
private val json = Json { ignoreUnknownKeys = true }
private val externalAudioCaptureActive = MutableStateFlow(false)
private val _voiceCaptureMode = MutableStateFlow(VoiceCaptureMode.Off)
val voiceCaptureMode: StateFlow<VoiceCaptureMode> = _voiceCaptureMode.asStateFlow()
private val discovery = GatewayDiscovery(appContext, scope = scope)
val gateways: StateFlow<List<GatewayEndpoint>> = discovery.gateways
@@ -428,6 +430,18 @@ class NodeRuntime(
)
}
val talkModeEnabled: StateFlow<Boolean>
get() = talkMode.isEnabled
val talkModeListening: StateFlow<Boolean>
get() = talkMode.isListening
val talkModeSpeaking: StateFlow<Boolean>
get() = talkMode.isSpeaking
val talkModeStatusText: StateFlow<String>
get() = talkMode.statusText
private fun syncMainSessionKey(agentId: String?) {
val resolvedKey = resolveNodeMainSessionKey(agentId)
// Always push the resolved session key into TalkMode, even when the
@@ -599,17 +613,8 @@ class NodeRuntime(
prefs.loadGatewayToken()
}
scope.launch {
prefs.talkEnabled.collect { enabled ->
// MicCaptureManager handles STT + send to gateway, while the dedicated
// reply speaker handles TTS for assistant replies in the voice tab.
micCapture.setMicEnabled(enabled)
if (enabled) {
talkMode.ttsOnAllResponses = false
scope.launch { talkMode.ensureChatSubscribed() }
}
externalAudioCaptureActive.value = enabled
}
if (prefs.voiceMicEnabled.value) {
setVoiceCaptureMode(VoiceCaptureMode.ManualMic, persistManualMic = false)
}
scope.launch(Dispatchers.Default) {
@@ -643,7 +648,7 @@ class NodeRuntime(
if (value) {
reconnectPreferredGatewayOnForeground()
} else {
stopActiveVoiceSession()
stopManualVoiceSession()
}
}
@@ -757,21 +762,17 @@ class NodeRuntime(
fun setVoiceScreenActive(active: Boolean) {
if (!active) {
stopActiveVoiceSession()
stopManualVoiceSession()
}
// Don't re-enable on active=true; mic toggle drives that
}
fun setMicEnabled(value: Boolean) {
prefs.setTalkEnabled(value)
if (value) {
// Tapping mic on interrupts any active TTS (barge-in)
stopVoicePlayback()
talkMode.ttsOnAllResponses = false
scope.launch { talkMode.ensureChatSubscribed() }
}
micCapture.setMicEnabled(value)
externalAudioCaptureActive.value = value
setVoiceCaptureMode(if (value) VoiceCaptureMode.ManualMic else VoiceCaptureMode.Off)
}
fun setTalkModeEnabled(value: Boolean) {
setVoiceCaptureMode(if (value) VoiceCaptureMode.TalkMode else VoiceCaptureMode.Off)
}
val speakerEnabled: StateFlow<Boolean>
@@ -786,11 +787,72 @@ class NodeRuntime(
talkMode.setPlaybackEnabled(value)
}
private fun setVoiceCaptureMode(
mode: VoiceCaptureMode,
persistManualMic: Boolean = true,
) {
if (mode == VoiceCaptureMode.TalkMode && !hasRecordAudioPermission()) {
_voiceCaptureMode.value = VoiceCaptureMode.Off
externalAudioCaptureActive.value = false
return
}
if (_voiceCaptureMode.value == mode) return
_voiceCaptureMode.value = mode
when (mode) {
VoiceCaptureMode.Off -> {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
stopVoicePlayback()
micCapture.setMicEnabled(false)
if (persistManualMic) {
prefs.setVoiceMicEnabled(false)
}
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.Off)
externalAudioCaptureActive.value = false
}
VoiceCaptureMode.ManualMic -> {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.ManualMic)
if (persistManualMic) {
prefs.setVoiceMicEnabled(true)
}
// Tapping mic on interrupts any active TTS (barge-in).
stopVoicePlayback()
scope.launch { talkMode.ensureChatSubscribed() }
micCapture.setMicEnabled(true)
externalAudioCaptureActive.value = true
}
VoiceCaptureMode.TalkMode -> {
if (persistManualMic) {
prefs.setVoiceMicEnabled(false)
}
micCapture.setMicEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.TalkMode)
talkMode.ttsOnAllResponses = true
talkMode.setPlaybackEnabled(speakerEnabled.value)
scope.launch { talkMode.ensureChatSubscribed() }
talkMode.setEnabled(true)
externalAudioCaptureActive.value = true
}
}
}
private fun stopManualVoiceSession() {
if (_voiceCaptureMode.value != VoiceCaptureMode.ManualMic) return
setVoiceCaptureMode(VoiceCaptureMode.Off)
}
private fun stopActiveVoiceSession() {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
stopVoicePlayback()
micCapture.setMicEnabled(false)
prefs.setTalkEnabled(false)
prefs.setVoiceMicEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.Off)
_voiceCaptureMode.value = VoiceCaptureMode.Off
externalAudioCaptureActive.value = false
}
@@ -970,6 +1032,7 @@ class NodeRuntime(
}
fun disconnect() {
stopActiveVoiceSession()
connectedEndpoint = null
activeGatewayAuth = null
_pendingGatewayTrust.value = null

@@ -37,6 +37,7 @@ class SecurePrefs(
private const val notificationsForwardingMaxEventsPerMinuteKey =
"notifications.forwarding.maxEventsPerMinute"
private const val notificationsForwardingSessionKeyKey = "notifications.forwarding.sessionKey"
private const val voiceMicEnabledKey = "voice.micEnabled"
}
private val appContext = context.applicationContext
@@ -162,8 +163,8 @@ class SecurePrefs(
private val _voiceWakeMode = MutableStateFlow(loadVoiceWakeMode())
val voiceWakeMode: StateFlow<VoiceWakeMode> = _voiceWakeMode
private val _talkEnabled = MutableStateFlow(plainPrefs.getBoolean("talk.enabled", false))
val talkEnabled: StateFlow<Boolean> = _talkEnabled
private val _voiceMicEnabled = MutableStateFlow(plainPrefs.getBoolean(voiceMicEnabledKey, false))
val voiceMicEnabled: StateFlow<Boolean> = _voiceMicEnabled
private val _speakerEnabled = MutableStateFlow(plainPrefs.getBoolean("voice.speakerEnabled", true))
val speakerEnabled: StateFlow<Boolean> = _speakerEnabled
@@ -478,9 +479,9 @@ class SecurePrefs(
_voiceWakeMode.value = mode
}
fun setTalkEnabled(value: Boolean) {
plainPrefs.edit { putBoolean("talk.enabled", value) }
_talkEnabled.value = value
fun setVoiceMicEnabled(value: Boolean) {
plainPrefs.edit { putBoolean(voiceMicEnabledKey, value) }
_voiceMicEnabled.value = value
}
fun setSpeakerEnabled(value: Boolean) {

@@ -0,0 +1,7 @@
package ai.openclaw.app
enum class VoiceCaptureMode {
Off,
ManualMic,
TalkMode,
}

@@ -35,10 +35,11 @@ import androidx.compose.foundation.lazy.rememberLazyListState
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.MicOff
import androidx.compose.material.icons.automirrored.filled.VolumeOff
import androidx.compose.material.icons.automirrored.filled.VolumeUp
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.MicOff
import androidx.compose.material.icons.filled.RecordVoiceOver
import androidx.compose.material3.Button
import androidx.compose.material3.ButtonDefaults
import androidx.compose.material3.Icon
@@ -69,6 +70,7 @@ import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleEventObserver
import androidx.lifecycle.compose.LocalLifecycleOwner
import ai.openclaw.app.MainViewModel
import ai.openclaw.app.VoiceCaptureMode
import ai.openclaw.app.voice.VoiceConversationEntry
import ai.openclaw.app.voice.VoiceConversationRole
import kotlin.math.max
@@ -81,6 +83,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val listState = rememberLazyListState()
val gatewayStatus by viewModel.statusText.collectAsState()
val voiceCaptureMode by viewModel.voiceCaptureMode.collectAsState()
val micEnabled by viewModel.micEnabled.collectAsState()
val micCooldown by viewModel.micCooldown.collectAsState()
val speakerEnabled by viewModel.speakerEnabled.collectAsState()
@@ -90,12 +93,15 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val micConversation by viewModel.micConversation.collectAsState()
val micInputLevel by viewModel.micInputLevel.collectAsState()
val micIsSending by viewModel.micIsSending.collectAsState()
val talkModeEnabled by viewModel.talkModeEnabled.collectAsState()
val talkModeListening by viewModel.talkModeListening.collectAsState()
val talkModeSpeaking by viewModel.talkModeSpeaking.collectAsState()
val hasStreamingAssistant = micConversation.any { it.role == VoiceConversationRole.Assistant && it.isStreaming }
val showThinkingBubble = micIsSending && !hasStreamingAssistant
var hasMicPermission by remember { mutableStateOf(context.hasRecordAudioPermission()) }
var pendingMicEnable by remember { mutableStateOf(false) }
var pendingVoicePermissionAction by remember { mutableStateOf<PendingVoicePermissionAction?>(null) }
DisposableEffect(lifecycleOwner, context) {
val observer =
@@ -107,7 +113,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
lifecycleOwner.lifecycle.addObserver(observer)
onDispose {
lifecycleOwner.lifecycle.removeObserver(observer)
// Stop TTS when leaving the voice screen
// Manual mic is tied to the Voice tab; Talk Mode is explicit and can continue.
viewModel.setVoiceScreenActive(false)
}
}
@@ -115,10 +121,14 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val requestMicPermission =
rememberLauncherForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
hasMicPermission = granted
if (granted && pendingMicEnable) {
viewModel.setMicEnabled(true)
if (granted) {
when (pendingVoicePermissionAction) {
PendingVoicePermissionAction.ManualMic -> viewModel.setMicEnabled(true)
PendingVoicePermissionAction.TalkMode -> viewModel.setTalkModeEnabled(true)
null -> Unit
}
}
pendingMicEnable = false
pendingVoicePermissionAction = null
}
LaunchedEffect(micConversation.size, showThinkingBubble) {
@@ -161,12 +171,12 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
tint = mobileTextTertiary,
)
Text(
"Tap the mic to start",
"Tap mic or Talk",
style = mobileHeadline,
color = mobileTextSecondary,
)
Text(
"Each pause sends a turn automatically.",
"Mic sends turns; Talk keeps the conversation open.",
style = mobileCallout,
color = mobileTextTertiary,
)
@@ -263,7 +273,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
if (hasMicPermission) {
viewModel.setMicEnabled(true)
} else {
pendingMicEnable = true
pendingVoicePermissionAction = PendingVoicePermissionAction.ManualMic
requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
}
},
@@ -287,11 +297,39 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
}
// Invisible spacer to balance the row (matches speaker column width)
Column(horizontalAlignment = Alignment.CenterHorizontally) {
Box(modifier = Modifier.size(48.dp))
Column(horizontalAlignment = Alignment.CenterHorizontally, verticalArrangement = Arrangement.spacedBy(4.dp)) {
IconButton(
onClick = {
if (talkModeEnabled) {
viewModel.setTalkModeEnabled(false)
return@IconButton
}
if (hasMicPermission) {
viewModel.setTalkModeEnabled(true)
} else {
pendingVoicePermissionAction = PendingVoicePermissionAction.TalkMode
requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
}
},
modifier = Modifier.size(48.dp),
colors =
IconButtonDefaults.iconButtonColors(
containerColor = if (talkModeEnabled) mobileSuccessSoft else mobileSurface,
),
) {
Icon(
imageVector = Icons.Default.RecordVoiceOver,
contentDescription = if (talkModeEnabled) "Turn Talk Mode off" else "Turn Talk Mode on",
modifier = Modifier.size(22.dp),
tint = if (talkModeEnabled) mobileSuccess else mobileTextSecondary,
)
}
Spacer(modifier = Modifier.height(4.dp))
Text("", style = mobileCaption2)
Text(
if (talkModeEnabled) "Talk on" else "Talk",
style = mobileCaption2,
color = if (talkModeEnabled) mobileSuccess else mobileTextTertiary,
)
}
}
@@ -299,6 +337,9 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val queueCount = micQueuedMessages.size
val stateText =
when {
voiceCaptureMode == VoiceCaptureMode.TalkMode && talkModeSpeaking -> "Talk speaking"
voiceCaptureMode == VoiceCaptureMode.TalkMode && talkModeListening -> "Talk listening"
voiceCaptureMode == VoiceCaptureMode.TalkMode -> "Talk on"
queueCount > 0 -> "$queueCount queued"
micIsSending -> "Sending"
micCooldown -> "Cooldown"
@@ -307,14 +348,15 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
val stateColor =
when {
voiceCaptureMode == VoiceCaptureMode.TalkMode -> mobileSuccess
micEnabled -> mobileSuccess
micIsSending -> mobileAccent
else -> mobileTextSecondary
}
Surface(
shape = RoundedCornerShape(999.dp),
color = if (micEnabled) mobileSuccessSoft else mobileSurface,
border = BorderStroke(1.dp, if (micEnabled) mobileSuccess.copy(alpha = 0.3f) else mobileBorder),
color = if (micEnabled || talkModeEnabled) mobileSuccessSoft else mobileSurface,
border = BorderStroke(1.dp, if (micEnabled || talkModeEnabled) mobileSuccess.copy(alpha = 0.3f) else mobileBorder),
) {
Text(
"$gatewayStatus · $stateText",
@@ -353,6 +395,11 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
}
private enum class PendingVoicePermissionAction {
ManualMic,
TalkMode,
}
@Composable
private fun VoiceTurnBubble(entry: VoiceConversationEntry) {
val isUser = entry.role == VoiceConversationRole.User

@@ -2,6 +2,7 @@ package ai.openclaw.app
import android.app.Notification
import android.content.Intent
import android.content.pm.ServiceInfo
import org.junit.Assert.assertEquals
import org.junit.Assert.assertNotNull
import org.junit.Test
@@ -30,6 +31,35 @@ class NodeForegroundServiceTest {
assertEquals(expectedFlags, savedIntent.flags and expectedFlags)
}
@Test
fun foregroundServiceTypesForVoiceMode_addsMicrophoneOnlyForTalkMode() {
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.Off),
)
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.ManualMic),
)
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC or ServiceInfo.FOREGROUND_SERVICE_TYPE_MICROPHONE,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.TalkMode),
)
}
@Test
fun voiceNotificationSuffixReflectsActiveCaptureMode() {
assertEquals("", voiceNotificationSuffix(VoiceCaptureMode.Off, false, false, false, false))
assertEquals(
" · Mic: Listening",
voiceNotificationSuffix(VoiceCaptureMode.ManualMic, true, true, false, false),
)
assertEquals(
" · Talk: Speaking",
voiceNotificationSuffix(VoiceCaptureMode.TalkMode, false, false, true, true),
)
}
private fun buildNotification(service: NodeForegroundService): Notification {
val method =
NodeForegroundService::class.java.getDeclaredMethod(

@@ -2,7 +2,9 @@ package ai.openclaw.app
import android.content.Context
import org.junit.Assert.assertEquals
import org.junit.Assert.assertFalse
import org.junit.Assert.assertNull
import org.junit.Assert.assertTrue
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.RobolectricTestRunner
@@ -22,6 +24,32 @@ class SecurePrefsTest {
assertEquals("whileUsing", plainPrefs.getString("location.enabledMode", null))
}
@Test
fun voiceMicEnabled_ignoresOldTalkEnabledKey() {
val context = RuntimeEnvironment.getApplication()
val plainPrefs = context.getSharedPreferences("openclaw.node", Context.MODE_PRIVATE)
plainPrefs.edit().clear().putBoolean("talk.enabled", true).commit()
val prefs = SecurePrefs(context)
assertFalse(prefs.voiceMicEnabled.value)
assertFalse(plainPrefs.contains("voice.micEnabled"))
}
@Test
fun setVoiceMicEnabled_persistsNewKeyOnly() {
val context = RuntimeEnvironment.getApplication()
val plainPrefs = context.getSharedPreferences("openclaw.node", Context.MODE_PRIVATE)
plainPrefs.edit().clear().putBoolean("talk.enabled", false).commit()
val prefs = SecurePrefs(context)
prefs.setVoiceMicEnabled(true)
assertTrue(prefs.voiceMicEnabled.value)
assertTrue(plainPrefs.getBoolean("voice.micEnabled", false))
assertFalse(plainPrefs.getBoolean("talk.enabled", false))
}
@Test
fun saveGatewayBootstrapToken_persistsSeparatelyFromSharedToken() {
val context = RuntimeEnvironment.getApplication()

@@ -1,6 +1,6 @@
# OpenClaw iOS Changelog
## 2026.4.24 - 2026-04-24
## 2026.4.25 - 2026-04-25
Maintenance update for the current OpenClaw development release.

@@ -2,8 +2,8 @@
// Source of truth: apps/ios/version.json
// Generated by scripts/ios-sync-versioning.ts.
OPENCLAW_IOS_VERSION = 2026.4.24
OPENCLAW_MARKETING_VERSION = 2026.4.24
OPENCLAW_IOS_VERSION = 2026.4.25
OPENCLAW_MARKETING_VERSION = 2026.4.25
OPENCLAW_BUILD_VERSION = 1
#include? "../build/Version.xcconfig"

@@ -1,3 +1,3 @@
{
"version": "2026.4.24"
"version": "2026.4.25"
}

@@ -15,9 +15,9 @@
<key>CFBundlePackageType</key>
<string>APPL</string>
<key>CFBundleShortVersionString</key>
<string>2026.4.24</string>
<string>2026.4.25</string>
<key>CFBundleVersion</key>
<string>2026042400</string>
<string>2026042500</string>
<key>CFBundleIconFile</key>
<string>OpenClaw</string>
<key>CFBundleURLTypes</key>

@@ -723,17 +723,26 @@ public struct AgentIdentityResult: Codable, Sendable {
public let agentid: String
public let name: String?
public let avatar: String?
public let avatarsource: String?
public let avatarstatus: String?
public let avatarreason: String?
public let emoji: String?
public init(
agentid: String,
name: String?,
avatar: String?,
avatarsource: String?,
avatarstatus: String?,
avatarreason: String?,
emoji: String?)
{
self.agentid = agentid
self.name = name
self.avatar = avatar
self.avatarsource = avatarsource
self.avatarstatus = avatarstatus
self.avatarreason = avatarreason
self.emoji = emoji
}
@@ -741,6 +750,9 @@ public struct AgentIdentityResult: Codable, Sendable {
case agentid = "agentId"
case name
case avatar
case avatarsource = "avatarSource"
case avatarstatus = "avatarStatus"
case avatarreason = "avatarReason"
case emoji
}
}

@@ -723,17 +723,26 @@ public struct AgentIdentityResult: Codable, Sendable {
public let agentid: String
public let name: String?
public let avatar: String?
public let avatarsource: String?
public let avatarstatus: String?
public let avatarreason: String?
public let emoji: String?
public init(
agentid: String,
name: String?,
avatar: String?,
avatarsource: String?,
avatarstatus: String?,
avatarreason: String?,
emoji: String?)
{
self.agentid = agentid
self.name = name
self.avatar = avatar
self.avatarsource = avatarsource
self.avatarstatus = avatarstatus
self.avatarreason = avatarreason
self.emoji = emoji
}
@@ -741,6 +750,9 @@ public struct AgentIdentityResult: Codable, Sendable {
case agentid = "agentId"
case name
case avatar
case avatarsource = "avatarSource"
case avatarstatus = "avatarStatus"
case avatarreason = "avatarReason"
case emoji
}
}

@@ -1,4 +1,4 @@
83677b2666da2169511e5372f26c20c794001ec8acc7e9c2e1935043010c05d6 config-baseline.json
fa38a1bde88d8858ae0a11e7e17fa42fe107c34268b568f51877afbde81922e8 config-baseline.core.json
d72032762ab46b99480b57deb81130a0ab5b1401189cfbaf4f7fef4a063a7f6c config-baseline.channel.json
0504c4f38d4c753fffeb465c93540d829df6b0fcef921eb0e2226ac16bdbbe07 config-baseline.plugin.json
6ed33ef102e7c92816243bfabc3626222a679c3270c12ec5ea47b28b66204b3b config-baseline.json
f86cb4d57ec1f5fd75008be0ab86151194945eb013a47ab4bdeaddafd3780da7 config-baseline.core.json
7cd9c908f066c143eab2a201efbc9640f483ab28bba92ddeca1d18cc2b528bc3 config-baseline.channel.json
7825b56a5b3fcdbe2e09ef8fe5d9f12ac3598435afebe20413051e45b0d1968e config-baseline.plugin.json

@@ -1,2 +1,2 @@
56ccee3ef8ff3b0ba7e2e765ae631b59254464585d5fef9db7e905f2c4c34ded plugin-sdk-api-baseline.json
39184cf8afaec691f0352d1a113e30a7099b87c0748237a3c7307e903ba24eee plugin-sdk-api-baseline.jsonl
f813474b1623f06e1465daacd56db970e8e92ab1be122faee0fa2a1dc2d4fc43 plugin-sdk-api-baseline.json
b3ea88c0c9b4cf6d9a46f0d34149063303853e78ef9708224608e4da79b23190 plugin-sdk-api-baseline.jsonl

@@ -961,14 +961,23 @@ Discord has two distinct voice surfaces: realtime **voice channels** (continuous
### Voice channels
Requirements:
Setup checklist:
- Enable native commands (`commands.native` or `channels.discord.commands.native`).
- Configure `channels.discord.voice`.
- The bot needs Connect + Speak permissions in the target voice channel.
1. Enable Message Content Intent in the Discord Developer Portal.
2. Enable Server Members Intent when role/user allowlists are used.
3. Invite the bot with `bot` and `applications.commands` scopes.
4. Grant Connect, Speak, Send Messages, and Read Message History in the target voice channel.
5. Enable native commands (`commands.native` or `channels.discord.commands.native`).
6. Configure `channels.discord.voice`.
Use `/vc join|leave|status` to control sessions. The command uses the account default agent and follows the same allowlist and group policy rules as other Discord commands.
```bash
/vc join channel:<voice-channel-id>
/vc status
/vc leave
```
Auto-join example:
```json5
@@ -977,6 +986,7 @@ Auto-join example:
discord: {
voice: {
enabled: true,
model: "openai/gpt-5.4-mini",
autoJoin: [
{
guildId: "123456789012345678",
@@ -987,7 +997,7 @@ Auto-join example:
decryptionFailureTolerance: 24,
tts: {
provider: "openai",
openai: { voice: "alloy" },
openai: { voice: "onyx" },
},
},
},
@@ -998,12 +1008,24 @@ Auto-join example:
Notes:
- `voice.tts` overrides `messages.tts` for voice playback only.
- `voice.model` overrides the LLM used for Discord voice channel responses only. Leave it unset to inherit the routed agent model.
- STT uses `tools.media.audio`; `voice.model` does not affect transcription.
- Voice transcript turns derive owner status from Discord `allowFrom` (or `dm.allowFrom`); non-owner speakers cannot access owner-only tools (for example `gateway` and `cron`).
- Voice is enabled by default; set `channels.discord.voice.enabled=false` to disable it.
- `voice.daveEncryption` and `voice.decryptionFailureTolerance` pass through to `@discordjs/voice` join options.
- `@discordjs/voice` defaults are `daveEncryption=true` and `decryptionFailureTolerance=24` if unset.
- OpenClaw also watches receive decrypt failures and auto-recovers by leaving/rejoining the voice channel after repeated failures in a short window.
- If receive logs repeatedly show `DecryptionFailed(UnencryptedWhenPassthroughDisabled)`, this may be the upstream `@discordjs/voice` receive bug tracked in [discord.js #11419](https://github.com/discordjs/discord.js/issues/11419).
- If receive logs repeatedly show `DecryptionFailed(UnencryptedWhenPassthroughDisabled)` after updating, collect a dependency report and logs. The bundled `@discordjs/voice` line includes the upstream padding fix from discord.js PR #11449, which closed discord.js issue #11419.
Voice channel pipeline:
- Discord PCM capture is converted to a WAV temp file.
- `tools.media.audio` handles STT, for example `openai/gpt-4o-mini-transcribe`.
- The transcript is sent through normal Discord ingress and routing.
- `voice.model`, when set, overrides only the response LLM for this voice-channel turn.
- `voice.tts` is merged over `messages.tts`; the resulting audio is played in the joined channel.
Credentials are resolved per component: LLM route auth for `voice.model`, STT auth for `tools.media.audio`, and TTS auth for `messages.tts`/`voice.tts`.
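
The `voice.tts` over `messages.tts` merge described above can be pictured with a minimal sketch. The `TtsConfig` shape, the `mergeTts` helper, and the `example-tts-model` placeholder are hypothetical and only illustrate the override order; OpenClaw's actual merge may be deeper or structured differently.

```typescript
// Hypothetical config shape for illustration; not OpenClaw's real types.
type TtsConfig = {
  provider?: string;
  openai?: { voice?: string; model?: string };
};

// Merge voice-only overrides over the message-level defaults.
// Assumption: provider sections are merged field by field, so an override
// that only sets `voice` keeps the inherited `model`.
function mergeTts(base: TtsConfig, override?: TtsConfig): TtsConfig {
  if (!override) return { ...base };
  return {
    ...base,
    ...override,
    openai: { ...base.openai, ...override.openai },
  };
}

// messages.tts (applies everywhere) and channels.discord.voice.tts (voice playback only).
const messagesTts: TtsConfig = {
  provider: "openai",
  openai: { voice: "alloy", model: "example-tts-model" }, // placeholder model name
};
const voiceTts: TtsConfig = { openai: { voice: "onyx" } };

// Effective settings for voice-channel playback.
const effective = mergeTts(messagesTts, voiceTts);
```

In this sketch `messagesTts` is left untouched, which matches the note that `voice.tts` applies to voice playback only.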
### Voice messages
@@ -1130,7 +1152,7 @@ openclaw logs --follow
- watch logs for:
- `discord voice: DAVE decrypt failures detected`
- `discord voice: repeated decrypt failures; attempting rejoin`
- if failures continue after automatic rejoin, collect logs and compare against [discord.js #11419](https://github.com/discordjs/discord.js/issues/11419)
- if failures continue after automatic rejoin, collect logs and compare against the upstream DAVE receive history in [discord.js #11419](https://github.com/discordjs/discord.js/issues/11419) and [discord.js #11449](https://github.com/discordjs/discord.js/pull/11449)
</Accordion>
</AccordionGroup>

@@ -16,7 +16,7 @@ Feishu/Lark is an all-in-one collaboration platform where teams chat, share docu
## Quick start
> **Requires OpenClaw 2026.4.24 or above.** Run `openclaw --version` to check. Upgrade with `openclaw update`.
> **Requires OpenClaw 2026.4.25 or above.** Run `openclaw --version` to check. Upgrade with `openclaw update`.
<Steps>
<Step title="Run the channel setup wizard">
@@ -424,6 +424,14 @@ Full configuration: [Gateway configuration](/gateway/configuration)
- ✅ Interactive cards (including streaming updates)
- ⚠️ Rich text (post-style formatting; doesn't support full Feishu/Lark authoring capabilities)
Native Feishu/Lark audio bubbles use the Feishu `audio` message type and require
Ogg/Opus upload media (`file_type: "opus"`). Existing `.opus` and `.ogg` media
is sent directly as native audio. MP3/WAV/M4A and other likely audio formats are
transcoded to 48kHz Ogg/Opus with `ffmpeg` only when the reply requests voice
delivery (`audioAsVoice` / message tool `asVoice`, including TTS voice-note
replies). Ordinary MP3 attachments stay regular files. If `ffmpeg` is missing or
conversion fails, OpenClaw falls back to a file attachment and logs the reason.
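
The delivery rules in the paragraph above can be summarized as a small decision function. This is a sketch under stated assumptions: `chooseFeishuAudioDelivery`, the `AudioDelivery` labels, and the extension lists are hypothetical names, and only the three formats named in the text are modeled as transcodable.

```typescript
// Hypothetical labels for the three outcomes described in the text.
type AudioDelivery = "native-audio" | "transcode-then-native-audio" | "file-attachment";

// Extensions sent directly as native audio, and the named formats that can be transcoded.
const NATIVE_EXTENSIONS = [".opus", ".ogg"];
const TRANSCODABLE_EXTENSIONS = [".mp3", ".wav", ".m4a"];

// Lower-cased extension including the dot, or "" when there is none.
function extensionOf(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot === -1 ? "" : fileName.slice(dot).toLowerCase();
}

function chooseFeishuAudioDelivery(
  fileName: string,
  asVoice: boolean, // reply requested voice delivery (audioAsVoice / asVoice)
  ffmpegAvailable: boolean,
): AudioDelivery {
  const ext = extensionOf(fileName);
  // Existing Ogg/Opus media is sent as a native audio bubble.
  if (NATIVE_EXTENSIONS.indexOf(ext) !== -1) return "native-audio";
  // Other audio is transcoded only when voice delivery was requested.
  if (asVoice && TRANSCODABLE_EXTENSIONS.indexOf(ext) !== -1) {
    // Without ffmpeg the reply falls back to a regular file attachment.
    return ffmpegAvailable ? "transcode-then-native-audio" : "file-attachment";
  }
  // Ordinary attachments stay regular files.
  return "file-attachment";
}
```

A failed conversion would take the same `file-attachment` path as a missing `ffmpeg`; the sketch does not model the conversion itself.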
### Threads and replies
- ✅ Inline replies

@@ -147,6 +147,11 @@ STT and TTS support two-level configuration with priority fallback:
Set `enabled: false` on either to disable.
Inbound QQ voice attachments are exposed to agents as audio media metadata while
keeping raw voice files out of generic `MediaPaths`. `[[audio_as_voice]]` plain
text replies synthesize TTS and send a native QQ voice message when TTS is
configured.
Outbound audio upload/transcode behavior can also be tuned with
`channels.qqbot.audioFormatPolicy`:

@@ -257,6 +257,7 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- Group sessions are isolated by group ID. Forum topics append `:topic:<threadId>` to keep topics isolated.
- DM messages can carry `message_thread_id`; OpenClaw routes them with thread-aware session keys and preserves thread ID for replies.
- Long polling uses grammY runner with per-chat/per-thread sequencing. Overall runner sink concurrency uses `agents.defaults.maxConcurrent`.
- Long polling is guarded inside each gateway process so only one active poller can use a bot token at a time. If you still see `getUpdates` 409 conflicts, another OpenClaw gateway, script, or external poller is likely using the same token.
- Long-polling watchdog restarts trigger after 120 seconds without completed `getUpdates` liveness by default. Increase `channels.telegram.pollingStallThresholdMs` only if your deployment still sees false polling-stall restarts during long-running work. The value is in milliseconds and is allowed from `30000` to `600000`; per-account overrides are supported.
- Telegram Bot API has no read-receipt support (`sendReadReceipts` does not apply).
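
As a rough illustration of the `pollingStallThresholdMs` bounds above, the sketch below resolves an effective value from an optional per-account override and an optional channel-level value. The function name, the override precedence, and rejecting (rather than clamping) out-of-range values are assumptions, not OpenClaw's documented behavior.

```typescript
// Values taken from the text: 120 seconds by default, allowed range 30000..600000 ms.
const DEFAULT_STALL_THRESHOLD_MS = 120000;
const MIN_STALL_THRESHOLD_MS = 30000;
const MAX_STALL_THRESHOLD_MS = 600000;

// Hypothetical resolver: account override first, then channel value, then the default.
function resolvePollingStallThresholdMs(channelValue?: number, accountValue?: number): number {
  const value = accountValue ?? channelValue ?? DEFAULT_STALL_THRESHOLD_MS;
  // The negated range check also rejects NaN.
  if (!(value >= MIN_STALL_THRESHOLD_MS && value <= MAX_STALL_THRESHOLD_MS)) {
    throw new RangeError(
      `pollingStallThresholdMs must be between ${MIN_STALL_THRESHOLD_MS} and ${MAX_STALL_THRESHOLD_MS}, got ${value}`,
    );
  }
  return value;
}
```

For example, `resolvePollingStallThresholdMs(300000)` keeps a five-minute channel-level threshold, while a value of `1000` is rejected in this sketch.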
@@ -274,7 +275,7 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- `channels.telegram.streaming` is `off | partial | block | progress` (default: `partial`)
- `progress` maps to `partial` on Telegram (compat with cross-channel naming)
- `streaming.preview.toolProgress` controls whether tool/progress updates reuse the same edited preview message (default: `true` when preview streaming is active)
- legacy `channels.telegram.streamMode` and boolean `streaming` values are auto-mapped
- legacy `channels.telegram.streamMode` and boolean `streaming` values are detected; run `openclaw doctor --fix` to migrate them to `channels.telegram.streaming.mode`
Tool-progress preview updates are the short "Working..." lines shown while tools run, for example command execution, file reads, planning updates, or patch summaries. Telegram keeps these enabled by default to match released OpenClaw behavior from `v2026.4.22` and later. To keep the edited preview for answer text but hide tool-progress lines, set:
@@ -545,6 +546,9 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- default: audio file behavior
- tag `[[audio_as_voice]]` in agent reply to force voice-note send
- inbound voice-note transcripts are framed as machine-generated,
untrusted text in the agent context; mention detection still uses the raw
transcript so mention-gated voice messages continue to work.
Message action example:

@@ -362,9 +362,10 @@ When the linked self number is also present in `allowFrom`, WhatsApp self-chat s
<Accordion title="Outbound media behavior">
- supports image, video, audio (PTT voice-note), and document payloads
- reply payloads preserve `audioAsVoice`; WhatsApp sends audio media as Baileys PTT voice notes
- `audio/ogg` is rewritten to `audio/ogg; codecs=opus` for voice-note compatibility
- non-Ogg audio, including Microsoft Edge TTS MP3/WebM output, is transcoded to Ogg/Opus before PTT delivery
- native Ogg/Opus audio is sent with `audio/ogg; codecs=opus` for voice-note compatibility
- animated GIF playback is supported via `gifPlayback: true` on video sends
- captions are applied to the first media item when sending multi-media reply payloads
- captions are applied to the first media item when sending multi-media reply payloads; PTT voice notes are the exception: the audio is sent first and the visible text separately, because WhatsApp clients do not render voice-note captions consistently
- media source can be HTTP(S), `file://`, or local paths
</Accordion>
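
The voice-note audio rules above can be sketched as a small decision helper. This is a hedged Python illustration only: the function name and return shape are hypothetical, and it assumes that anything labelled `audio/ogg` already carries Opus audio (an Ogg container with another codec would still need transcoding).

```python
def plan_voice_note_audio(mime_type: str) -> dict:
    """Decide how audio should be prepared for PTT delivery (illustrative only)."""
    # Ignore parameters such as "; codecs=opus" when checking the container type.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base == "audio/ogg":
        # Assumption: Ogg input is already Opus, so only the MIME label changes.
        return {"transcode": False, "mime_type": "audio/ogg; codecs=opus"}
    # Non-Ogg input (for example MP3 or WebM TTS output) is transcoded first.
    return {"transcode": True, "mime_type": "audio/ogg; codecs=opus"}
```

Either way the payload ends up labelled `audio/ogg; codecs=opus`; the only difference is whether a transcode step runs first.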

@@ -10,7 +10,7 @@ The CI runs on every push to `main` and every pull request. It uses smart scopin
QA Lab has dedicated CI lanes outside the main smart-scoped workflow. The
`Parity gate` workflow runs on matching PR changes and manual dispatch; it
builds the private QA runtime and compares the mock GPT-5.4 and Opus 4.6
builds the private QA runtime and compares the mock GPT-5.5 and Opus 4.6
agentic packs. The `QA-Lab - All Lanes` workflow runs nightly on `main` and on
manual dispatch; it fans out the mock parity gate, live Matrix lane, and live
Telegram lane as parallel jobs. The live jobs use the `qa-live-shared`

@@ -56,6 +56,7 @@ Detailed guidance: [Browser troubleshooting](/tools/browser#cdp-startup-failure-
openclaw browser status
openclaw browser doctor
openclaw browser start
openclaw browser start --headless
openclaw browser stop
openclaw browser --browser-profile openclaw reset-profile
```
@@ -67,6 +68,14 @@ Notes:
OpenClaw did not launch the browser process itself.
- For local managed profiles, `openclaw browser stop` stops the spawned browser
process.
- `openclaw browser start --headless` applies only to that start request and
only when OpenClaw launches a local managed browser. It does not rewrite
`browser.headless` or profile config, and it is a no-op for an already-running
browser.
- On Linux hosts without `DISPLAY` or `WAYLAND_DISPLAY`, local managed profiles
run headless automatically unless `OPENCLAW_BROWSER_HEADLESS=0`,
`browser.headless=false`, or `browser.profiles.<name>.headless=false`
explicitly requests a visible browser.
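
The automatic headless fallback in the last note can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function and parameter names are hypothetical, an empty `DISPLAY` value is treated as unset, and the sketch covers only the automatic fallback (it does not model an explicit `--headless` flag or `headless: true` config).

```python
from typing import Mapping, Optional


def auto_headless(
    platform: str,
    env: Mapping[str, str],
    browser_headless: Optional[bool] = None,
    profile_headless: Optional[bool] = None,
) -> bool:
    """Return True when the automatic Linux headless fallback would apply."""
    if platform != "linux":
        return False
    # A display server is available, so there is no need to force headless.
    if env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"):
        return False
    # Any explicit request for a visible browser disables the fallback.
    explicit_visible = (
        env.get("OPENCLAW_BROWSER_HEADLESS") == "0"
        or browser_headless is False
        or profile_headless is False
    )
    return not explicit_visible
```

The `platform` string follows Python's `sys.platform` convention (`"linux"`, `"darwin"`), which is a choice made for this sketch only.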
## If the command is missing
@@ -185,6 +194,11 @@ openclaw browser download <ref> report.pdf
openclaw browser dialog --accept
```
Managed Chrome profiles save ordinary click-triggered downloads into the OpenClaw
downloads directory (`/tmp/openclaw/downloads` by default, or the configured temp
root). Use `waitfordownload` or `download` when the agent needs to wait for a
specific file and return its path; those explicit waiters own the next download.
## State and storage
Viewport + emulation:

@@ -17,20 +17,22 @@ Running `openclaw crestodian` starts the same helper explicitly.
## What Crestodian shows
On startup, Crestodian prints a compact system overview:
On startup, interactive Crestodian opens the same TUI shell used by
`openclaw tui`, with a Crestodian chat backend. The chat log starts with a short
greeting:
- config path and validity
- configured agents and the default agent
- default model
- local Codex and Claude Code CLI availability
- OpenAI and Anthropic API-key presence
- planner mode (`deterministic` or model-assisted through the configured model)
- local docs path or the public docs URL
- local source path for Git checkouts, otherwise the OpenClaw GitHub source URL
- gateway reachability
- the immediate recommended next step
- when to start Crestodian
- the model or deterministic planner path Crestodian is actually using
- config validity and the default agent
- Gateway reachability from the first startup probe
- the next debug action Crestodian can take
It does not dump secrets or load plugin CLI commands just to start.
It does not dump secrets or load plugin CLI commands just to start. The TUI
still provides the normal header, chat log, status line, footer, autocomplete,
and editor controls.
Use `status` for the detailed inventory with config path, docs/source paths,
local CLI probes, API-key presence, agents, model, and Gateway details.
Crestodian uses the same OpenClaw reference discovery as regular agents. In a Git checkout,
it points itself at local `docs/` and the local source tree. In an npm package install, it
@@ -51,7 +53,7 @@ openclaw crestodian --message "set default model openai/gpt-5.5" --yes
openclaw onboard --modern
```
Inside the interactive prompt:
Inside the Crestodian TUI:
```text
status
@@ -105,7 +107,7 @@ Read-only operations can run immediately:
- show the audit-log path
Persistent operations require conversational approval in interactive mode unless
you pass `--yes` for a one-shot command:
you pass `--yes` for a direct command:
- write config
- run `config set`
@@ -153,14 +155,22 @@ model unset. Install or log into Codex/Claude Code, or expose
## Model-Assisted Planner
Crestodian always starts in deterministic mode. Once a valid OpenClaw model is
configured, local Crestodian can make one bounded model call for fuzzy commands
that the deterministic parser does not understand.
Crestodian always starts in deterministic mode. For fuzzy commands that the
deterministic parser does not understand, local Crestodian can make one bounded
planner turn through OpenClaw's normal runtime paths. It first uses the
configured OpenClaw model. If no configured model is usable yet, it can fall
back to local runtimes already present on the machine:
- Claude Code CLI: `claude-cli/claude-opus-4-7`
- Codex app-server harness: `openai/gpt-5.5` with `embeddedHarness.runtime: "codex"`
- Codex CLI: `codex-cli/gpt-5.5`
The model-assisted planner cannot mutate config directly. It must translate the
request into one of Crestodian's typed commands, then the normal approval and
audit rules apply. Crestodian prints the model it used and the interpreted
command before it runs anything.
command before it runs anything. Configless fallback planner turns are
temporary, tool-disabled where the runtime supports it, and use a temporary
workspace/session.
Message-channel rescue mode does not use the model-assisted planner. Remote
rescue stays deterministic so a broken or compromised normal agent path cannot
@@ -275,12 +285,34 @@ Remote rescue is covered by the Docker lane:
pnpm test:docker:crestodian-rescue
```
Configless local planner fallback is covered by:
```bash
pnpm test:docker:crestodian-planner
```
An opt-in live channel command-surface smoke checks `/crestodian status` plus a
persistent approval roundtrip through the rescue handler:
```bash
pnpm test:live:crestodian-rescue-channel
```
Fresh configless setup through Crestodian is covered by:
```bash
pnpm test:docker:crestodian-first-run
```
That lane starts with an empty state dir, routes bare `openclaw` to Crestodian,
sets the default model, creates an additional agent, configures Discord through
a plugin enablement plus token SecretRef, validates config, and checks the audit
log. QA Lab also has a repo-backed scenario for the same Ring 0 flow:
```bash
pnpm openclaw qa suite --scenario crestodian-ring-zero-setup
```
## Related
- [CLI reference](/cli)

@@ -18,6 +18,8 @@ openclaw dashboard --no-open
Notes:
- `dashboard` resolves configured `gateway.auth.token` SecretRefs when possible.
- `dashboard` follows `gateway.tls.enabled`: TLS-enabled gateways print/open
`https://` Control UI URLs and connect over `wss://`.
- For SecretRef-managed tokens (resolved or unresolved), `dashboard` prints/copies/opens a non-tokenized URL to avoid exposing external secrets in terminal output, clipboard history, or browser-launch arguments.
- If `gateway.auth.token` is SecretRef-managed but unresolved in this command path, the command prints a non-tokenized URL and explicit remediation guidance instead of embedding an invalid token placeholder.
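
The TLS note above amounts to a scheme switch. The Python sketch below is illustrative only: the function name, the host and port arguments, and the URL layout are hypothetical, and the `http://` / `ws://` pair for non-TLS gateways is an assumption rather than something the notes state.

```python
from typing import Tuple


def control_ui_endpoints(host: str, port: int, tls_enabled: bool) -> Tuple[str, str]:
    """Return (Control UI URL, WebSocket URL) for a gateway (illustrative only)."""
    # TLS-enabled gateways use https:// and wss://, per the note above.
    # Assumption: gateways without TLS use the plain http:// and ws:// schemes.
    http_scheme, ws_scheme = ("https", "wss") if tls_enabled else ("http", "ws")
    return f"{http_scheme}://{host}:{port}/", f"{ws_scheme}://{host}:{port}/"
```

Neither URL carries a token here, which matches the non-tokenized form the command prints for SecretRef-managed tokens.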

@@ -114,7 +114,7 @@ This table maps common inference tasks to the corresponding infer command.
| Describe an image file | `openclaw infer image describe --file ./image.png --json` | `--model` must be an image-capable `<provider/model>` |
| Transcribe audio | `openclaw infer audio transcribe --file ./memo.m4a --json` | `--model` must be `<provider/model>` |
| Synthesize speech | `openclaw infer tts convert --text "..." --output ./speech.mp3 --json` | `tts status` is gateway-oriented |
| Generate a video | `openclaw infer video generate --prompt "..." --json` | |
| Generate a video | `openclaw infer video generate --prompt "..." --json` | Supports provider hints such as `--resolution` |
| Describe a video file | `openclaw infer video describe --file ./clip.mp4 --json` | `--model` must be `<provider/model>` |
| Search the web | `openclaw infer web search --query "..." --json` | |
| Fetch a web page | `openclaw infer web fetch --url https://example.com --json` | |
@@ -156,6 +156,9 @@ Use `image` for generation, edit, and description.
```bash
openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image generate --prompt "cinematic product photo of headphones" --json
openclaw infer image generate --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "simple red circle sticker on a transparent background" --json
openclaw infer image generate --prompt "slow image backend" --timeout-ms 180000 --json
openclaw infer image edit --file ./logo.png --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "keep the logo, remove the background" --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --json
@@ -164,6 +167,10 @@ openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --j
Notes:
- Use `image edit` when starting from existing input files.
- Use `--output-format png --background transparent` with
`--model openai/gpt-image-1.5` for transparent-background OpenAI PNG output;
`--openai-background` remains available as an OpenAI-specific alias. Providers
that do not declare background support report the hint as an ignored override.
- Use `image providers --json` to verify which bundled image providers are
discoverable, configured, selected, and which generation/edit capabilities
each provider exposes.
@@ -223,13 +230,14 @@ Use `video` for generation and description.
```bash
openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --resolution 768P --duration 6 --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer video describe --file ./clip.mp4 --model openai/gpt-4.1-mini --json
```
Notes:
- `video generate` accepts `--size`, `--aspect-ratio`, `--resolution`, `--duration`, `--audio`, `--watermark`, and `--timeout-ms` and forwards them to the video-generation runtime.
- `--model` must be `<provider/model>` for `video describe`.
## Web

@@ -66,6 +66,35 @@ Notes:
stale removed-provider default.
- `models status` may show `marker(<value>)` in auth output for non-secret placeholders (for example `OPENAI_API_KEY`, `secretref-managed`, `minimax-oauth`, `oauth:chutes`, `ollama-local`) instead of masking them as secrets.
### `models scan`
`models scan` reads OpenRouter's public `:free` catalog and ranks candidates for
fallback use. The catalog itself is public, so metadata-only scans do not need
an OpenRouter key.
By default OpenClaw tries to probe tool and image support with live model calls.
If no OpenRouter key is configured, the command falls back to metadata-only
output and explains that `:free` models still require `OPENROUTER_API_KEY` for
probes and inference.
Options:
- `--no-probe` (metadata only; no config/secrets lookup)
- `--min-params <b>`
- `--max-age-days <days>`
- `--provider <name>`
- `--max-candidates <n>`
- `--timeout <ms>` (catalog request and per-probe timeout)
- `--concurrency <n>`
- `--yes`
- `--no-input`
- `--set-default`
- `--set-image`
- `--json`
`--set-default` and `--set-image` require live probes; metadata-only scan
results are informational and are not applied to config.
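
The interaction between key availability, `--no-probe`, and the `--set-*` flags can be summarized in a short sketch. This is a hedged Python illustration of the rules stated above, not the command's implementation; the function name and the returned field names are hypothetical.

```python
def plan_models_scan(
    has_openrouter_key: bool,
    no_probe: bool = False,
    set_default: bool = False,
    set_image: bool = False,
) -> dict:
    """Summarize what a scan would do under the documented rules (illustrative only)."""
    # Live probes need a key and must not be disabled with --no-probe.
    probe = has_openrouter_key and not no_probe
    return {
        "mode": "probe" if probe else "metadata-only",
        # Metadata-only results are informational and never applied to config.
        "apply_default": probe and set_default,
        "apply_image": probe and set_image,
    }
```

For example, `--set-default` without a key yields metadata-only output and leaves config unchanged.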
### `models status`
Options:

@@ -31,6 +31,8 @@ openclaw plugins inspect --all
openclaw plugins info <id>
openclaw plugins enable <id>
openclaw plugins disable <id>
openclaw plugins registry
openclaw plugins registry --refresh
openclaw plugins uninstall <id>
openclaw plugins doctor
openclaw plugins update <id-or-npm-spec>
@@ -195,18 +197,20 @@ openclaw plugins list --verbose
openclaw plugins list --json
```
Use `--enabled` to show only loaded plugins. Use `--verbose` to switch from the
Use `--enabled` to show only enabled plugins. Use `--verbose` to switch from the
table view to per-plugin detail lines with source/origin/version/activation
metadata. Use `--json` for machine-readable inventory plus registry
diagnostics.
`plugins list` runs discovery from the current CLI environment and config. It is
useful for checking whether a plugin is enabled/loadable, but it is not a live
runtime probe of an already-running Gateway process. After changing plugin code,
enablement, hook policy, or `plugins.load.paths`, restart the Gateway that
serves the channel before expecting new `register(api)` code or hooks to run.
For remote/container deployments, verify you are restarting the actual
`openclaw gateway run` child, not only a wrapper process.
`plugins list` reads the persisted local plugin registry first, with a
manifest-only derived fallback when the registry is missing or invalid. It is
useful for checking whether a plugin is installed, enabled, and visible to cold
startup planning, but it is not a live runtime probe of an already-running
Gateway process. After changing plugin code, enablement, hook policy, or
`plugins.load.paths`, restart the Gateway that serves the channel before
expecting new `register(api)` code or hooks to run. For remote/container
deployments, verify you are restarting the actual `openclaw gateway run` child,
not only a wrapper process.
For runtime hook debugging:
@@ -227,7 +231,19 @@ openclaw plugins install -l ./my-plugin
source path instead of copying over a managed install target.
Use `--pin` on npm installs to save the resolved exact spec (`name@version`) in
`plugins.installs` while keeping the default behavior unpinned.
the managed install ledger while keeping the default behavior unpinned.
### Install Ledger
Plugin install metadata is machine-managed state, not user config. New installs
and updates write it to `plugins/installs.json` under the active OpenClaw state
directory. The file includes a do-not-edit warning and is used by
`openclaw plugins update`, uninstall, diagnostics, and the cold plugin registry.
Legacy `plugins.installs` entries in `openclaw.json` remain readable as a
deprecated compatibility fallback. When install/update/uninstall paths rewrite
plugin install state, OpenClaw writes the ledger file and removes
`plugins.installs` from the persisted config payload.
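
The read order implied above (ledger file first, legacy config entries as a fallback) can be sketched as follows. This is a minimal Python illustration under assumptions: the function name and the source labels are hypothetical, the ledger contents are treated as opaque JSON because the file format is not described here, and the sketch only reads state. It does not model how OpenClaw rewrites or migrates it.

```python
import json
from pathlib import Path
from typing import Any, Optional, Tuple


def load_install_records(state_dir: Path, config: dict) -> Tuple[Optional[Any], str]:
    """Return (records, source) using the ledger first, then legacy config (illustrative)."""
    ledger_path = state_dir / "plugins" / "installs.json"
    if ledger_path.is_file():
        # Machine-managed ledger under the active state directory.
        return json.loads(ledger_path.read_text(encoding="utf-8")), "ledger"
    # Deprecated compatibility fallback: `plugins.installs` in the config payload.
    legacy = config.get("plugins", {}).get("installs")
    if legacy is not None:
        return legacy, "legacy-config"
    return None, "none"
```

The second tuple element makes it easy to log which source was used while legacy entries are still around.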
### Uninstall
@@ -237,8 +253,9 @@ openclaw plugins uninstall <id> --dry-run
openclaw plugins uninstall <id> --keep-files
```
`uninstall` removes plugin records from `plugins.entries`, `plugins.installs`,
the plugin allowlist, and linked `plugins.load.paths` entries when applicable.
`uninstall` removes plugin records from `plugins.entries`, the managed install
ledger, the plugin allowlist, and linked `plugins.load.paths` entries when
applicable.
For active memory plugins, the memory slot resets to `memory-core`.
By default, uninstall also removes the plugin install directory under the active
@@ -257,8 +274,8 @@ openclaw plugins update @openclaw/voice-call@beta
openclaw plugins update openclaw-codex-app-server --dangerously-force-unsafe-install
```
Updates apply to tracked installs in `plugins.installs` and tracked hook-pack
installs in `hooks.internal.installs`.
Updates apply to tracked plugin installs in the managed install ledger and
tracked hook-pack installs in `hooks.internal.installs`.
When you pass a plugin id, OpenClaw reuses the recorded install spec for that
plugin. That means previously stored dist-tags such as `@beta` and exact pinned
@@ -333,6 +350,29 @@ For module-shape failures such as missing `register`/`activate` exports, rerun
with `OPENCLAW_PLUGIN_LOAD_DEBUG=1` to include a compact export-shape summary in
the diagnostic output.
### Registry
```bash
openclaw plugins registry
openclaw plugins registry --refresh
openclaw plugins registry --json
```
The local plugin registry is OpenClaw's persisted cold read model for installed
plugin identity, enablement, source metadata, and contribution ownership.
Normal startup, provider owner lookup, channel setup classification, and plugin
inventory can read it without importing plugin runtime modules.
Use `plugins registry` to inspect whether the persisted registry is present,
current, or stale. Use `--refresh` to rebuild it from the durable install
ledger, config policy, and manifest/package metadata. This is a repair path, not
a runtime activation path.
`OPENCLAW_DISABLE_PERSISTED_PLUGIN_REGISTRY=1` is a deprecated break-glass
compatibility switch for registry read failures. Prefer `plugins registry
--refresh` or `openclaw doctor --fix`; the env fallback is only for emergency
startup recovery while the migration rolls out.
### Marketplace
```bash

@@ -68,10 +68,6 @@ Inspect current vault mode, health, and Obsidian CLI availability.
Use this first when you are unsure whether the vault is initialized, bridge mode
is healthy, or Obsidian integration is available.
This command calls the Gateway so bridge mode reports the same exported memory
artifacts that runtime wiki tools see. Start the Gateway first, or pass
`--url` and `--token` when checking a remote Gateway.
### `wiki doctor`
Run wiki health checks and surface configuration or vault problems.
@@ -82,9 +78,6 @@ Typical issues include:
- invalid or missing vault layout
- missing external Obsidian CLI when Obsidian mode is expected
Like `wiki status`, this command calls the Gateway in order to inspect the
active runtime memory plugin state.
### `wiki init`
Create the wiki vault layout and starter pages.
@@ -175,10 +168,6 @@ source pages.
Use this in `bridge` mode when you want the latest exported memory artifacts
pulled into the wiki vault.
The import runs through the Gateway process. This keeps the CLI from seeing an
empty standalone plugin registry and accidentally treating bridge-backed pages
as removed.
### `wiki unsafe-local import`
Import from explicitly configured local paths in `unsafe-local` mode.

@@ -77,6 +77,19 @@ gateway-backed session transcript, so they are the source of truth.
Details: [Session management](/concepts/session).
## Tool result metadata
Tool result `content` is the model-visible result. Tool result `details` is
runtime metadata for UI rendering, diagnostics, media delivery, and plugins.
OpenClaw keeps that boundary explicit:
- `toolResult.details` is stripped before provider replay and compaction input.
- Persisted session transcripts keep only bounded `details`; oversized metadata
is replaced with a compact summary marked `persistedDetailsTruncated: true`.
- Plugins and tools should put text the model must read in `content`, not only
in `details`.
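
The `content` / `details` boundary above can be sketched with two small helpers. This is a hedged Python illustration, not OpenClaw's code: the helper names, the byte limit, and the `originalBytes` field are hypothetical. Only the `details` key, the stripping before replay, and the `persistedDetailsTruncated: true` marker come from the text.

```python
import json
from typing import Any, Dict

# Hypothetical limit for this sketch; the real bound is not stated here.
MAX_PERSISTED_DETAILS_BYTES = 8_192


def strip_details_for_replay(tool_result: Dict[str, Any]) -> Dict[str, Any]:
    """Drop runtime metadata so only model-visible fields are replayed."""
    return {key: value for key, value in tool_result.items() if key != "details"}


def bound_details_for_persistence(
    tool_result: Dict[str, Any],
    max_bytes: int = MAX_PERSISTED_DETAILS_BYTES,
) -> Dict[str, Any]:
    """Keep small details as-is; replace oversized details with a compact summary."""
    details = tool_result.get("details")
    if details is None:
        return dict(tool_result)
    size = len(json.dumps(details).encode("utf-8"))
    if size <= max_bytes:
        return dict(tool_result)
    # "originalBytes" is an illustrative field, not a documented one.
    summary = {"persistedDetailsTruncated": True, "originalBytes": size}
    return {**tool_result, "details": summary}
```

Both helpers return new dictionaries, so the in-memory tool result can keep its full `details` for UI rendering and media delivery.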
## Inbound bodies and history context
OpenClaw separates the **prompt body** from the **command body**:
@@ -154,6 +167,8 @@ Details: [Configuration](/gateway/config-agents#messages) and channel docs.
## Silent replies
The exact silent token `NO_REPLY` / `no_reply` means “do not deliver a user-visible reply”.
When a turn also has pending tool media, such as generated TTS audio, OpenClaw
strips the silent text but still delivers the media attachment.
OpenClaw resolves that behavior by conversation type:
- Direct conversations disallow silence by default and rewrite a bare silent

@@ -129,15 +129,18 @@ validation failures) are treated as failover-worthy and use the same cooldowns
OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`,
`stop reason: error`, and `reason: error` are classified as timeout/failover
signals.
Provider-scoped generic server text can also land in that timeout bucket when
the source matches a known transient pattern. For example, Anthropic bare
`An unknown error occurred` and JSON `api_error` payloads with transient server
text such as `internal server error`, `unknown error, 520`, `upstream error`,
or `backend error` are treated as failover-worthy timeouts. OpenRouter-specific
generic upstream text such as bare `Provider returned error` is also treated as
timeout only when the provider context is actually OpenRouter. Generic internal
fallback text such as `LLM request failed with an unknown error.` stays
conservative and does not trigger failover by itself.
Generic server text can also land in that timeout bucket when the source matches
a known transient pattern. For example, the bare pi-ai stream-wrapper message
`An unknown error occurred` is treated as failover-worthy for every provider
because pi-ai emits it when provider streams end with `stopReason: "aborted"` or
`stopReason: "error"` without specific details. JSON `api_error` payloads with
transient server text such as `internal server error`, `unknown error, 520`,
`upstream error`, or `backend error` are also treated as failover-worthy
timeouts.
OpenRouter-specific generic upstream text such as bare `Provider returned error`
is treated as timeout only when the provider context is actually OpenRouter.
Generic internal fallback text such as `LLM request failed with an unknown
error.` stays conservative and does not trigger failover by itself.
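
The updated classification rules can be sketched as a single function. This is a minimal Python illustration under assumptions, not the actual classifier: the function name is hypothetical, matching is done on the exact bare strings quoted above, and the JSON shape (an `error` object with `type` and `message`) follows the Anthropic-style error envelope as an assumed input format.

```python
import json
from typing import Optional

# Transient server text listed in the paragraph above.
TRANSIENT_API_ERROR_SNIPPETS = (
    "internal server error",
    "unknown error, 520",
    "upstream error",
    "backend error",
)


def classify_generic_error(message: str, provider: Optional[str] = None) -> Optional[str]:
    """Return "timeout" when the text should trigger failover, else None."""
    text = message.strip()
    # Bare pi-ai stream-wrapper message: failover-worthy for every provider.
    if text == "An unknown error occurred":
        return "timeout"
    # OpenRouter-specific upstream text only counts in an OpenRouter context.
    if text == "Provider returned error":
        return "timeout" if provider == "openrouter" else None
    try:
        payload = json.loads(text)
    except ValueError:
        # Includes internal fallback text such as
        # "LLM request failed with an unknown error.", which stays conservative.
        return None
    if not isinstance(payload, dict):
        return None
    # Assumption: the error is either the payload itself or nested under "error".
    error = payload.get("error", payload)
    if isinstance(error, dict) and error.get("type") == "api_error":
        detail = str(error.get("message", "")).lower()
        if any(snippet in detail for snippet in TRANSIENT_API_ERROR_SNIPPETS):
            return "timeout"
    return None
```

Anything the function does not recognize returns `None`, mirroring the conservative default for generic internal fallback text.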
Some provider SDKs may otherwise sleep for a long `Retry-After` window before
returning control to OpenClaw. For Stainless-based SDKs such as Anthropic and

@@ -30,9 +30,9 @@ Reference for **LLM/model providers** (not chat channels like WhatsApp/Telegram)
`google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate
back to canonical provider refs with the runtime recorded separately.
- GPT-5.5 is available through `openai-codex/gpt-5.5` in PI, the native
Codex app-server harness, and the public OpenAI API when the bundled PI
catalog exposes `openai/gpt-5.5` for your install.
- GPT-5.5 is available through `openai/gpt-5.5` for direct API-key traffic,
`openai-codex/gpt-5.5` in PI for Codex OAuth, and the native Codex
app-server harness when `embeddedHarness.runtime: "codex"` is set.
## Plugin-owned provider behavior
@@ -71,10 +71,9 @@ OpenClaw ships with the pi-ai catalog. These providers require **no**
- Provider: `openai`
- Auth: `OPENAI_API_KEY`
- Optional rotation: `OPENAI_API_KEYS`, `OPENAI_API_KEY_1`, `OPENAI_API_KEY_2`, plus `OPENCLAW_LIVE_OPENAI_KEY` (single override)
- Example models: `openai/gpt-5.5`, `openai/gpt-5.4`, `openai/gpt-5.4-mini`
- GPT-5.5 direct API support depends on the bundled PI catalog version for
your install; verify with `openclaw models list --provider openai` before
using `openai/gpt-5.5` without the Codex app-server runtime.
- Example models: `openai/gpt-5.5`, `openai/gpt-5.4-mini`
- Verify account/model availability with `openclaw models list --provider openai`
if a specific install or API key behaves differently.
- CLI: `openclaw onboard --auth-choice openai-api-key`
- Default transport is `auto` (WebSocket-first, SSE fallback)
- Override per model via `agents.defaults.models["openai/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
@@ -91,7 +90,7 @@ OpenClaw ships with the pi-ai catalog. These providers require **no**
```json5
{
agents: { defaults: { model: { primary: "openai/gpt-5.4" } } },
agents: { defaults: { model: { primary: "openai/gpt-5.5" } } },
}
```

@@ -242,8 +242,11 @@ Key flags:
- `--set-default`: set `agents.defaults.model.primary` to the first selection
- `--set-image`: set `agents.defaults.imageModel.primary` to the first image selection
Probing requires an OpenRouter API key (from auth profiles or
`OPENROUTER_API_KEY`). Without a key, use `--no-probe` to list candidates only.
The OpenRouter `/models` catalog is public, so metadata-only scans can list
free candidates without a key. Probing and inference still require an
OpenRouter API key (from auth profiles or `OPENROUTER_API_KEY`). If no key is
available, `openclaw models scan` falls back to metadata-only output and leaves
config unchanged. Use `--no-probe` to request metadata-only mode explicitly.
Scan results are ranked by:
@@ -255,12 +258,14 @@ Scan results are ranked by:
Input
- OpenRouter `/models` list (filter `:free`)
- Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/help/environment))
- Live probes require OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/help/environment))
- Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates`
- Probe controls: `--timeout`, `--concurrency`
- Request/probe controls: `--timeout`, `--concurrency`
When run in a TTY, you can select fallbacks interactively. In noninteractive
mode, pass `--yes` to accept defaults.
When live probes run in a TTY, you can select fallbacks interactively. In
noninteractive mode, pass `--yes` to accept defaults. Metadata-only results are
informational; `--set-default` and `--set-image` require live probes so
OpenClaw does not configure an unusable keyless OpenRouter model.
## Models registry (`models.json`)

@@ -238,7 +238,7 @@ refs and write a judged Markdown report:
```bash
pnpm openclaw qa character-eval \
--model openai/gpt-5.4,thinking=medium,fast \
--model openai/gpt-5.5,thinking=medium,fast \
--model openai/gpt-5.2,thinking=xhigh \
--model openai/gpt-5,thinking=xhigh \
--model anthropic/claude-opus-4-6,thinking=high \
@@ -246,7 +246,7 @@ pnpm openclaw qa character-eval \
--model zai/glm-5.1,thinking=high \
--model moonshot/kimi-k2.5,thinking=high \
--model google/gemini-3.1-pro-preview,thinking=high \
--judge-model openai/gpt-5.4,thinking=xhigh,fast \
--judge-model openai/gpt-5.5,thinking=xhigh,fast \
--judge-model anthropic/claude-opus-4-6,thinking=high \
--blind-judge-models \
--concurrency 16 \
@@ -263,7 +263,7 @@ Use `--blind-judge-models` when comparing providers: the judge prompt still gets
every transcript and run status, but candidate refs are replaced with neutral
labels such as `candidate-01`; the report maps rankings back to real refs after
parsing.
Candidate runs default to `high` thinking, with `medium` for GPT-5.4 and `xhigh`
Candidate runs default to `high` thinking, with `medium` for GPT-5.5 and `xhigh`
for older OpenAI eval refs that support it. Override a specific candidate inline with
`--model provider/model,thinking=<level>`. `--thinking <level>` still sets a
global fallback, and the older `--model-thinking <provider/model=level>` form is
@@ -278,12 +278,12 @@ Candidate and judge model runs both default to concurrency 16. Lower
`--concurrency` or `--judge-concurrency` when provider limits or local gateway
pressure make a run too noisy.
When no candidate `--model` is passed, the character eval defaults to
`openai/gpt-5.4`, `openai/gpt-5.2`, `openai/gpt-5`, `anthropic/claude-opus-4-6`,
`openai/gpt-5.5`, `openai/gpt-5.2`, `openai/gpt-5`, `anthropic/claude-opus-4-6`,
`anthropic/claude-sonnet-4-6`, `zai/glm-5.1`,
`moonshot/kimi-k2.5`, and
`google/gemini-3.1-pro-preview`.
When no `--judge-model` is passed, the judges default to
`openai/gpt-5.4,thinking=xhigh,fast` and
`openai/gpt-5.5,thinking=xhigh,fast` and
`anthropic/claude-opus-4-6,thinking=high`.
## Related docs

@@ -143,7 +143,7 @@ Slack-only:
Legacy key migration:
- Telegram: `streamMode` + boolean `streaming` auto-migrate to `streaming` enum.
- Telegram: legacy `streamMode` and scalar/boolean `streaming` values are detected and migrated by doctor/config compatibility paths to `streaming.mode`.
- Discord: `streamMode` + boolean `streaming` auto-migrate to `streaming` enum.
- Slack: `streamMode` auto-migrates to `streaming.mode`; boolean `streaming` auto-migrates to `streaming.mode` plus `streaming.nativeTransport`; legacy `nativeStreaming` auto-migrates to `streaming.nativeTransport`.

@@ -52,6 +52,14 @@
]
},
"redirects": [
{
"source": "/help/gpt54-codex-agentic-parity",
"destination": "/help/gpt55-codex-agentic-parity"
},
{
"source": "/help/gpt54-codex-agentic-parity-maintainers",
"destination": "/help/gpt55-codex-agentic-parity-maintainers"
},
{
"source": "/mcp",
"destination": "/cli/mcp"
@@ -1649,8 +1657,8 @@
"concepts/typing-indicators",
"concepts/usage-tracking",
"concepts/timezone",
"help/gpt54-codex-agentic-parity",
"help/gpt54-codex-agentic-parity-maintainers"
"help/gpt55-codex-agentic-parity",
"help/gpt55-codex-agentic-parity-maintainers"
]
},
{

@@ -122,6 +122,8 @@ The provider id becomes the left side of your model ref:
sessionMode: "existing",
sessionIdFields: ["session_id", "conversation_id"],
systemPromptArg: "--system",
// For CLIs with a dedicated prompt-file flag:
// systemPromptFileArg: "--system-file",
// Codex-style CLIs can point at a prompt file instead:
// systemPromptFileConfigArg: "-c",
// systemPromptFileConfigKey: "model_instructions_file",

@@ -342,12 +342,12 @@ Time format in system prompt. Default: `auto` (OS preference).
- Also used as fallback routing when the selected/default model cannot accept image input.
- `imageGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared image-generation capability and any future tool/plugin surface that generates images.
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-2` for OpenAI Images.
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2`, `FAL_KEY` for `fal/*`).
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, `openai/gpt-image-2` for OpenAI Images, or `openai/gpt-image-1.5` for transparent-background OpenAI PNG/WebP output.
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2` / `openai/gpt-image-1.5`, `FAL_KEY` for `fal/*`).
- If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order.
- `musicGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared music-generation capability and the built-in `music_generate` tool.
- Typical values: `google/lyria-3-clip-preview`, `google/lyria-3-pro-preview`, or `minimax/music-2.5+`.
- Typical values: `google/lyria-3-clip-preview`, `google/lyria-3-pro-preview`, or `minimax/music-2.6`.
- If omitted, `music_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered music-generation providers in provider-id order.
- If you select a provider/model directly, configure the matching provider auth/API key too.
- `videoGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
@@ -363,7 +363,7 @@ Time format in system prompt. Default: `auto` (OS preference).
- `pdfMaxPages`: default maximum pages considered by extraction fallback mode in the `pdf` tool.
- `verboseDefault`: default verbose level for agents. Values: `"off"`, `"on"`, `"full"`. Default: `"off"`.
- `elevatedDefault`: default elevated-output level for agents. Values: `"off"`, `"on"`, `"ask"`, `"full"`. Default: `"on"`.
- `model.primary`: format `provider/model` (e.g. `openai/gpt-5.4` for API-key access or `openai-codex/gpt-5.5` for Codex OAuth). If you omit the provider, OpenClaw tries an alias first, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider (deprecated compatibility behavior, so prefer explicit `provider/model`). If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default.
- `model.primary`: format `provider/model` (e.g. `openai/gpt-5.5` for API-key access or `openai-codex/gpt-5.5` for Codex OAuth). If you omit the provider, OpenClaw tries an alias first, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider (deprecated compatibility behavior, so prefer explicit `provider/model`). If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default.
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific, for example `temperature`, `maxTokens`, `cacheRetention`, `context1m`, `responsesServerCompaction`, `responsesCompactThreshold`, `extra_body`/`extraBody`).
- Safe edits: use `openclaw config set agents.defaults.models '<json>' --strict-json --merge` to add entries. `config set` refuses replacements that would remove existing allowlist entries unless you pass `--replace`.
- Provider-scoped configure/onboarding flows merge selected provider models into this map and preserve unrelated providers already configured.
@@ -406,16 +406,16 @@ Codex app-server harness. For the mental model, see
**Built-in alias shorthands** (only apply when the model is in `agents.defaults.models`):
| Alias | Model |
| ------------------- | -------------------------------------------------- |
| `opus` | `anthropic/claude-opus-4-6` |
| `sonnet` | `anthropic/claude-sonnet-4-6` |
| `gpt` | `openai/gpt-5.4` or configured Codex OAuth GPT-5.5 |
| `gpt-mini` | `openai/gpt-5.4-mini` |
| `gpt-nano` | `openai/gpt-5.4-nano` |
| `gemini` | `google/gemini-3.1-pro-preview` |
| `gemini-flash` | `google/gemini-3-flash-preview` |
| `gemini-flash-lite` | `google/gemini-3.1-flash-lite-preview` |
| Alias | Model |
| ------------------- | ------------------------------------------ |
| `opus` | `anthropic/claude-opus-4-6` |
| `sonnet` | `anthropic/claude-sonnet-4-6` |
| `gpt` | `openai/gpt-5.5` or `openai-codex/gpt-5.5` |
| `gpt-mini` | `openai/gpt-5.4-mini` |
| `gpt-nano` | `openai/gpt-5.4-nano` |
| `gemini` | `google/gemini-3.1-pro-preview` |
| `gemini-flash` | `google/gemini-3-flash-preview` |
| `gemini-flash-lite` | `google/gemini-3.1-flash-lite-preview` |
Your configured aliases always win over defaults.
@@ -443,6 +443,7 @@ Optional CLI backends for text-only fallback runs (no tool calls). Useful as a b
sessionArg: "--session",
sessionMode: "existing",
systemPromptArg: "--system",
// Or use systemPromptFileArg when the CLI accepts a prompt file flag.
systemPromptWhen: "first",
imageArg: "--image",
imageMode: "repeat",
@@ -881,7 +882,7 @@ noVNC observer access uses VNC auth by default and OpenClaw emits a short-lived
- `--renderer-process-limit=2` can be changed with
`OPENCLAW_BROWSER_RENDERER_PROCESS_LIMIT=<N>`; set `0` to use Chromium's
default process limit.
- plus `--no-sandbox` and `--disable-setuid-sandbox` when `noSandbox` is enabled.
- plus `--no-sandbox` when `noSandbox` is enabled.
- Defaults are the container image baseline; use a custom browser image with a custom
entrypoint to change container defaults.


@@ -327,7 +327,8 @@ WhatsApp runs through the gateway's web channel (Baileys Web). It starts automat
- `spawnSubagentSessions`: opt-in switch for `sessions_spawn({ thread: true })` auto thread creation/binding
- Top-level `bindings[]` entries with `type: "acp"` configure persistent ACP bindings for channels and threads (use channel/thread id in `match.peer.id`). Field semantics are shared in [ACP Agents](/tools/acp-agents#channel-specific-settings).
- `channels.discord.ui.components.accentColor` sets the accent color for Discord components v2 containers.
- `channels.discord.voice` enables Discord voice channel conversations and optional auto-join + TTS overrides.
- `channels.discord.voice` enables Discord voice channel conversations and optional auto-join + LLM + TTS overrides.
- `channels.discord.voice.model` optionally overrides the LLM model used for Discord voice channel responses.
- `channels.discord.voice.daveEncryption` and `channels.discord.voice.decryptionFailureTolerance` pass through to `@discordjs/voice` DAVE options (`true` and `24` by default).
- OpenClaw additionally attempts voice receive recovery by leaving/rejoining a voice session after repeated decrypt failures.
- `channels.discord.streaming` is the canonical stream mode key. Legacy `streamMode` and boolean `streaming` values are auto-migrated.


@@ -186,9 +186,14 @@ See [MCP](/cli/mcp#openclaw-as-an-mcp-client-registry) and
- Enabled Claude bundle plugins can also contribute embedded Pi defaults from `settings.json`; OpenClaw applies those as sanitized agent settings, not as raw OpenClaw config patches.
- `plugins.slots.memory`: pick the active memory plugin id, or `"none"` to disable memory plugins.
- `plugins.slots.contextEngine`: pick the active context engine plugin id; defaults to `"legacy"` unless you install and select another engine.
- `plugins.installs`: CLI-managed install metadata used by `openclaw plugins update`.
- Includes `source`, `spec`, `sourcePath`, `installPath`, `version`, `resolvedName`, `resolvedVersion`, `resolvedSpec`, `integrity`, `shasum`, `resolvedAt`, `installedAt`.
- Treat `plugins.installs.*` as managed state; prefer CLI commands over manual edits.
- `plugins.installs`: deprecated compatibility fallback for legacy
CLI-managed install metadata. New plugin installs write the managed
`plugins/installs.json` state ledger instead.
- Legacy records include `source`, `spec`, `sourcePath`, `installPath`,
`version`, `resolvedName`, `resolvedVersion`, `resolvedSpec`, `integrity`,
`shasum`, `resolvedAt`, `installedAt`.
- Treat `plugins.installs.*` as managed state; prefer CLI commands over
manual edits.
See [Plugins](/tools/plugin).
@@ -253,6 +258,12 @@ See [Plugins](/tools/plugin).
- `profiles.*.cdpUrl` accepts `http://`, `https://`, `ws://`, and `wss://`.
Use HTTP(S) when you want OpenClaw to discover `/json/version`; use WS(S)
when your provider gives you a direct DevTools WebSocket URL.
- `remoteCdpTimeoutMs` and `remoteCdpHandshakeTimeoutMs` apply to remote and
`attachOnly` CDP reachability plus tab-opening requests. Managed loopback
profiles keep local CDP defaults.
- If an externally managed CDP service is reachable through loopback, set that
profile's `attachOnly: true`; otherwise OpenClaw treats the loopback port as a
local managed browser profile and may report local port ownership errors.
- `existing-session` profiles use Chrome MCP instead of CDP and can attach on
the selected host or through a connected browser node.
- `existing-session` profiles can set `userDataDir` to target a specific
@@ -266,8 +277,15 @@ See [Plugins](/tools/plugin).
- Local managed profiles can set `executablePath` to override the global
`browser.executablePath` for that profile. Use this to run one profile in
Chrome and another in Brave.
- Local managed profiles use `browser.localLaunchTimeoutMs` for Chrome CDP HTTP
discovery after process start and `browser.localCdpReadyTimeoutMs` for
post-launch CDP websocket readiness. Raise them on slower hosts where Chrome
starts successfully but readiness checks race startup. Both values must be
positive integers up to `120000` ms; invalid config values are rejected.
- Auto-detect order: default browser if Chromium-based → Chrome → Brave → Edge → Chromium → Chrome Canary.
- `browser.executablePath` accepts `~` for your OS home directory.
- `browser.executablePath` and `browser.profiles.<name>.executablePath` both
expand `~` and `~/...` to your OS home directory before Chromium launch.
Per-profile `userDataDir` on `existing-session` profiles is also tilde-expanded.
- Control service: loopback only (port derived from `gateway.port`, default `18791`).
- `extraArgs` appends extra launch flags to local Chromium startup (for example
`--disable-gpu`, window sizing, or debug flags).
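
The bounds stated above for `browser.localLaunchTimeoutMs` and `browser.localCdpReadyTimeoutMs` (positive integers up to `120000` ms) can be sketched as a small validator. This is a minimal illustration only: the constant and function names are hypothetical and are not taken from OpenClaw's config schema.

```typescript
// Hypothetical constant: the documented upper bound for both timeouts.
const MAX_LOCAL_TIMEOUT_MS = 120_000;

// Returns true only for positive integers up to the documented maximum.
// Strings, fractions, zero, and negative values are treated as invalid.
function isValidLocalTimeoutMs(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value > 0 &&
    value <= MAX_LOCAL_TIMEOUT_MS
  );
}

console.log(isValidLocalTimeoutMs(30_000));
console.log(isValidLocalTimeoutMs(0));
```

A config loader following this rule would reject the value instead of clamping it, which matches the "invalid config values are rejected" wording.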
@@ -472,7 +490,7 @@ See [Multiple Gateways](/gateway/multiple-gateways).
reload: {
mode: "hybrid", // off | restart | hot | hybrid
debounceMs: 500,
deferralTimeoutMs: 300000,
deferralTimeoutMs: 0,
},
},
}
@@ -484,7 +502,7 @@ See [Multiple Gateways](/gateway/multiple-gateways).
- `"hot"`: apply changes in-process without restarting.
- `"hybrid"` (default): try hot reload first; fall back to restart if required.
- `debounceMs`: debounce window in ms before config changes are applied (non-negative integer).
- `deferralTimeoutMs`: maximum time in ms to wait for in-flight operations before forcing a restart (default: `300000` = 5 minutes).
- `deferralTimeoutMs`: optional maximum time in ms to wait for in-flight operations before forcing a restart. Omit it or set `0` to wait indefinitely and log periodic still-pending warnings.
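
As a rough sketch of the new `deferralTimeoutMs` semantics, the helper below treats an omitted value and `0` the same way: no forced-restart deadline. The function name and return shape are assumptions for illustration, not OpenClaw's actual reload code.

```typescript
// Hypothetical helper: returns the timestamp (epoch ms) after which a restart
// would be forced, or null when the gateway should keep waiting indefinitely.
function restartDeadlineMs(
  deferralTimeoutMs: number | undefined,
  nowMs: number,
): number | null {
  // Omitted or 0 means "wait for in-flight operations with no deadline".
  if (deferralTimeoutMs === undefined || deferralTimeoutMs === 0) {
    return null;
  }
  return nowMs + deferralTimeoutMs;
}

console.log(restartDeadlineMs(undefined, Date.now()));
console.log(restartDeadlineMs(300_000, Date.now()));
```

Setting `deferralTimeoutMs: 300000` explicitly restores the previous five-minute cap.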
---
@@ -899,6 +917,7 @@ Notes:
- `otel.sampleRate`: trace sampling rate `0`–`1`.
- `otel.flushIntervalMs`: periodic telemetry flush interval in ms.
- `otel.captureContent`: opt-in raw content capture for OTEL span attributes. Defaults to off. Boolean `true` captures non-system message/tool content; the object form lets you enable `inputMessages`, `outputMessages`, `toolInputs`, `toolOutputs`, and `systemPrompt` explicitly.
- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`: environment toggle for latest experimental GenAI span provider attributes. By default spans keep the legacy `gen_ai.system` attribute for compatibility; GenAI metrics use bounded semantic attributes.
- `OPENCLAW_OTEL_PRELOADED=1`: environment toggle for hosts that already registered a global OpenTelemetry SDK. OpenClaw then skips plugin-owned SDK startup/shutdown while keeping diagnostic listeners active.
- `cacheTrace.enabled`: log cache trace snapshots for embedded runs (default: `false`).
- `cacheTrace.filePath`: output path for cache trace JSONL (default: `$OPENCLAW_STATE_DIR/logs/cache-trace.jsonl`).


@@ -457,7 +457,7 @@ Doctor prints a summary of the workspace state for the default agent:
- **Skills status**: counts eligible, missing-requirements, and allowlist-blocked skills.
- **Legacy workspace dirs**: warns when `~/openclaw` or other legacy workspace directories
exist alongside the current workspace.
- **Plugin status**: counts loaded/disabled/errored plugins; lists plugin IDs for any
- **Plugin status**: counts enabled/disabled/errored plugins; lists plugin IDs for any
errors; reports bundle plugin capabilities.
- **Plugin compatibility warnings**: flags plugins that have compatibility issues with
the current runtime.


@@ -393,7 +393,7 @@ for containerized workloads. Current container defaults include:
- `--no-zygote`
- `--metrics-recording-only`
- `--renderer-process-limit=2`
- `--no-sandbox` and `--disable-setuid-sandbox` when `noSandbox` is enabled.
- `--no-sandbox` when `noSandbox` is enabled.
- The three graphics hardening flags (`--disable-3d-apis`,
`--disable-software-rasterizer`, `--disable-gpu`) are optional and are useful
when containers lack GPU support. Set `OPENCLAW_BROWSER_DISABLE_GRAPHICS_FLAGS=0`


@@ -595,10 +595,9 @@ and troubleshooting see the main [FAQ](/help/faq).
<Accordion title="How does Codex auth work?">
OpenClaw supports **OpenAI Code (Codex)** via OAuth (ChatGPT sign-in). Use
`openai-codex/gpt-5.5` for Codex OAuth through the default PI runner. Use
`openai/gpt-5.4` for current direct OpenAI API-key access. GPT-5.5 direct
API-key access is supported once OpenAI enables it on the public API; today
GPT-5.5 uses subscription/OAuth via `openai-codex/gpt-5.5` or native Codex
app-server runs with `openai/gpt-5.5` and `embeddedHarness.runtime: "codex"`.
`openai/gpt-5.5` for direct OpenAI API-key access. GPT-5.5 can also use
subscription/OAuth via `openai-codex/gpt-5.5` or native Codex app-server
runs with `openai/gpt-5.5` and `embeddedHarness.runtime: "codex"`.
See [Model providers](/concepts/model-providers) and [Onboarding (CLI)](/start/wizard).
</Accordion>
@@ -606,8 +605,7 @@ and troubleshooting see the main [FAQ](/help/faq).
`openai-codex` is the provider and auth-profile id for ChatGPT/Codex OAuth.
It is also the explicit PI model prefix for Codex OAuth:
- `openai/gpt-5.4` = current direct OpenAI API-key route in PI
- `openai/gpt-5.5` = future direct API-key route once OpenAI enables GPT-5.5 on the API
- `openai/gpt-5.5` = current direct OpenAI API-key route in PI
- `openai-codex/gpt-5.5` = Codex OAuth route in PI
- `openai/gpt-5.5` + `embeddedHarness.runtime: "codex"` = native Codex app-server route
- `openai-codex:...` = auth profile id, not a model ref


@@ -21,7 +21,7 @@ troubleshooting, see the main [FAQ](/help/faq).
agents.defaults.model.primary
```
Models are referenced as `provider/model` (example: `openai/gpt-5.4` or `openai-codex/gpt-5.5`). If you omit the provider, OpenClaw first tries an alias, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider as a deprecated compatibility path. If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default. You should still **explicitly** set `provider/model`.
Models are referenced as `provider/model` (example: `openai/gpt-5.5` or `openai-codex/gpt-5.5`). If you omit the provider, OpenClaw first tries an alias, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider as a deprecated compatibility path. If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default. You should still **explicitly** set `provider/model`.
</Accordion>
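
The lookup order described above for provider-less model refs (alias, then a unique configured-provider match, then the deprecated default-provider fallback) can be sketched as follows. The `ResolverConfig` shape and `resolveModelRef` name are hypothetical and exist only for this illustration; the sketch also omits the final fallback to the first configured provider/model when the default provider no longer exposes its default model.

```typescript
// Hypothetical config shape for illustration; not OpenClaw's real types.
interface ResolverConfig {
  aliases: Record<string, string>; // alias -> "provider/model"
  providers: Record<string, string[]>; // provider id -> model ids it exposes
  defaultProvider: string;
}

function resolveModelRef(input: string, cfg: ResolverConfig): string {
  // Explicit "provider/model" refs are used as-is.
  if (input.includes("/")) return input;

  // 1. Alias lookup.
  const aliased = cfg.aliases[input];
  if (aliased !== undefined) return aliased;

  // 2. Unique configured-provider match for the exact model id.
  const owners = Object.entries(cfg.providers)
    .filter(([, models]) => models.includes(input))
    .map(([provider]) => provider);
  if (owners.length === 1) return `${owners[0]}/${input}`;

  // 3. Deprecated compatibility fallback: the configured default provider.
  return `${cfg.defaultProvider}/${input}`;
}

const exampleConfig: ResolverConfig = {
  aliases: { gpt: "openai/gpt-5.5" },
  providers: {
    openai: ["gpt-5.5", "gpt-5.4-mini"],
    "openai-codex": ["gpt-5.5"],
  },
  defaultProvider: "openai",
};

console.log(resolveModelRef("gpt", exampleConfig));
console.log(resolveModelRef("gpt-5.4-mini", exampleConfig));
console.log(resolveModelRef("gpt-5.5", exampleConfig));
```

In this example `gpt-5.5` is exposed by two providers, so the bare id is ambiguous and only resolves through the deprecated fallback. That ambiguity is the practical reason to write `provider/model` explicitly.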
@@ -146,13 +146,10 @@ troubleshooting, see the main [FAQ](/help/faq).
<Accordion title="Can I use GPT 5.5 for daily tasks and Codex 5.5 for coding?">
Yes. Set one as default and switch as needed:
- **Quick switch (per session):** `/model openai/gpt-5.4` for current direct OpenAI API-key tasks or `/model openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth tasks.
- **Default:** set `agents.defaults.model.primary` to `openai/gpt-5.4` for API-key usage or `openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth usage.
- **Quick switch (per session):** `/model openai/gpt-5.5` for current direct OpenAI API-key tasks or `/model openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth tasks.
- **Default:** set `agents.defaults.model.primary` to `openai/gpt-5.5` for API-key usage or `openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth usage.
- **Sub-agents:** route coding tasks to sub-agents with a different default model.
Direct API-key access for `openai/gpt-5.5` is supported once OpenAI enables
GPT-5.5 on the public API. Until then GPT-5.5 is subscription/OAuth-only.
See [Models](/concepts/models) and [Slash commands](/tools/slash-commands).
</Accordion>
@@ -160,8 +157,8 @@ troubleshooting, see the main [FAQ](/help/faq).
<Accordion title="How do I configure fast mode for GPT 5.5?">
Use either a session toggle or a config default:
- **Per session:** send `/fast on` while the session is using `openai/gpt-5.4` or `openai-codex/gpt-5.5`.
- **Per model default:** set `agents.defaults.models["openai/gpt-5.4"].params.fastMode` or `agents.defaults.models["openai-codex/gpt-5.5"].params.fastMode` to `true`.
- **Per session:** send `/fast on` while the session is using `openai/gpt-5.5` or `openai-codex/gpt-5.5`.
- **Per model default:** set `agents.defaults.models["openai/gpt-5.5"].params.fastMode` or `agents.defaults.models["openai-codex/gpt-5.5"].params.fastMode` to `true`.
Example:
@@ -170,7 +167,7 @@ troubleshooting, see the main [FAQ](/help/faq).
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: {
fastMode: true,
},
@@ -241,7 +238,7 @@ troubleshooting, see the main [FAQ](/help/faq).
model: { primary: "minimax/MiniMax-M2.7" },
models: {
"minimax/MiniMax-M2.7": { alias: "minimax" },
"openai/gpt-5.4": { alias: "gpt" },
"openai/gpt-5.5": { alias: "gpt" },
},
},
},
@@ -269,7 +266,7 @@ troubleshooting, see the main [FAQ](/help/faq).
- `opus` → `anthropic/claude-opus-4-6`
- `sonnet` → `anthropic/claude-sonnet-4-6`
- `gpt` → `openai/gpt-5.4` for API-key setups, or `openai-codex/gpt-5.5` when configured for Codex OAuth
- `gpt` → `openai/gpt-5.5` for API-key setups, or `openai-codex/gpt-5.5` when configured for Codex OAuth
- `gpt-mini` → `openai/gpt-5.4-mini`
- `gpt-nano` → `openai/gpt-5.4-nano`
- `gemini` → `google/gemini-3.1-pro-preview`


@@ -1,12 +1,12 @@
---
summary: "How to review the GPT-5.4 / Codex parity program as four merge units"
title: "GPT-5.4 / Codex parity maintainer notes"
summary: "How to review the GPT-5.5 / Codex parity program as four merge units"
title: "GPT-5.5 / Codex parity maintainer notes"
read_when:
- Reviewing the GPT-5.4 / Codex parity PR series
- Reviewing the GPT-5.5 / Codex parity PR series
- Maintaining the six-contract agentic architecture behind the parity program
---
This note explains how to review the GPT-5.4 / Codex parity program as four merge units without losing the original six-contract architecture.
This note explains how to review the GPT-5.5 / Codex parity program as four merge units without losing the original six-contract architecture.
## Merge units
@@ -59,7 +59,7 @@ Does not own:
Owns:
- first-wave GPT-5.4 vs Opus 4.6 scenario pack
- first-wave GPT-5.5 vs Opus 4.6 scenario pack
- parity documentation
- parity report and release-gate mechanics
@@ -123,7 +123,7 @@ Expected artifacts from PR D:
## Release gate
Do not claim GPT-5.4 parity or superiority over Opus 4.6 until:
Do not claim GPT-5.5 parity or superiority over Opus 4.6 until:
- PR A, PR B, and PR C are merged
- PR D runs the first-wave parity pack cleanly
@@ -132,7 +132,7 @@ Do not claim GPT-5.4 parity or superiority over Opus 4.6 until:
```mermaid
flowchart LR
A["PR A-C merged"] --> B["Run GPT-5.4 parity pack"]
A["PR A-C merged"] --> B["Run GPT-5.5 parity pack"]
A --> C["Run Opus 4.6 parity pack"]
B --> D["qa-suite-summary.json"]
C --> E["qa-suite-summary.json"]
@@ -146,9 +146,31 @@ flowchart LR
The parity harness is not the only evidence source. Keep this split explicit in review:
- PR D owns the scenario-based GPT-5.4 vs Opus 4.6 comparison
- PR D owns the scenario-based GPT-5.5 vs Opus 4.6 comparison
- PR B deterministic suites still own auth/proxy/DNS and full-access truthfulness evidence
## Quick maintainer merge workflow
Use this when you are ready to land a parity PR and want a repeatable, low-risk sequence.
1. Confirm evidence bar is met before merge:
- reproducible symptom or failing test
- verified root cause in touched code
- fix in the implicated path
- regression test or explicit manual verification note
2. Triage/label before merge:
- apply any `r:*` auto-close labels when the PR should not land
- keep merge candidates free of unresolved blocker threads
3. Validate locally on the touched surface:
- `pnpm check:changed`
- `pnpm test:changed` when tests changed or bug-fix confidence depends on test coverage
4. Land with the standard maintainer flow (`/landpr` process), then verify:
- linked issues auto-close behavior
- CI and post-merge status on `main`
5. After landing, run a duplicate search for related open PRs/issues and close duplicates only with a canonical reference.
If any one of the evidence bar items is missing, request changes instead of merging.
## Goal-to-evidence map
| Completion gate item | Primary owner | Review artifact |
@@ -157,13 +179,13 @@ The parity harness is not the only evidence source. Keep this split explicit in
| No fake progress or fake tool completion | PR A + PR D | parity fake-success count plus scenario-level report details |
| No false `/elevated full` guidance | PR B | deterministic runtime-truthfulness suites |
| Replay/liveness failures remain explicit | PR C + PR D | lifecycle/replay suites plus `compaction-retry-mutating-tool` |
| GPT-5.4 matches or beats Opus 4.6 | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` |
| GPT-5.5 matches or beats Opus 4.6 | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` |
## Reviewer shorthand: before vs after
| User-visible problem before | Review signal after |
| ----------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| GPT-5.4 stopped after planning | PR A shows act-or-block behavior instead of commentary-only completion |
| GPT-5.5 stopped after planning | PR A shows act-or-block behavior instead of commentary-only completion |
| Tool use felt brittle with strict OpenAI/Codex schemas | PR C keeps tool registration and parameter-free invocation predictable |
| `/elevated full` hints were sometimes misleading | PR B ties guidance to actual runtime capability and blocked reasons |
| Long tasks could disappear into replay/compaction ambiguity | PR C emits explicit paused, blocked, abandoned, and replay-invalid state |
@@ -171,4 +193,4 @@ The parity harness is not the only evidence source. Keep this split explicit in
## Related
- [GPT-5.4 / Codex agentic parity](/help/gpt54-codex-agentic-parity)
- [GPT-5.5 / Codex agentic parity](/help/gpt55-codex-agentic-parity)

View File

@@ -1,15 +1,15 @@
---
summary: "How OpenClaw closes agentic execution gaps for GPT-5.4 and Codex-style models"
title: "GPT-5.4 / Codex agentic parity"
summary: "How OpenClaw closes agentic execution gaps for GPT-5.5 and Codex-style models"
title: "GPT-5.5 / Codex agentic parity"
read_when:
- Debugging GPT-5.4 or Codex agent behavior
- Debugging GPT-5.5 or Codex agent behavior
- Comparing OpenClaw agentic behavior across frontier models
- Reviewing the strict-agentic, tool-schema, elevation, and replay fixes
---
# GPT-5.4 / Codex Agentic Parity in OpenClaw
# GPT-5.5 / Codex Agentic Parity in OpenClaw
OpenClaw already worked well with tool-using frontier models, but GPT-5.4 and Codex-style models were still underperforming in a few practical ways:
OpenClaw already worked well with tool-using frontier models, but GPT-5.5 and Codex-style models were still underperforming in a few practical ways:
- they could stop after planning instead of doing the work
- they could use strict OpenAI/Codex tool schemas incorrectly
@@ -27,7 +27,7 @@ This slice adds an opt-in `strict-agentic` execution contract for embedded Pi GP
When enabled, OpenClaw stops accepting plan-only turns as “good enough” completion. If the model only says what it intends to do and does not actually use tools or make progress, OpenClaw retries with an act-now steer and then fails closed with an explicit blocked state instead of silently ending the task.
This improves the GPT-5.4 experience most on:
This improves the GPT-5.5 experience most on:
- short “ok do it” follow-ups
- code tasks where the first step is obvious
@@ -40,7 +40,7 @@ This slice makes OpenClaw tell the truth about two things:
- why the provider/runtime call failed
- whether `/elevated full` is actually available
That means GPT-5.4 gets better runtime signals for missing scope, auth refresh failures, HTML 403 auth failures, proxy issues, DNS or timeout failures, and blocked full-access modes. The model is less likely to hallucinate the wrong remediation or keep asking for a permission mode the runtime cannot provide.
That means GPT-5.5 gets better runtime signals for missing scope, auth refresh failures, HTML 403 auth failures, proxy issues, DNS or timeout failures, and blocked full-access modes. The model is less likely to hallucinate the wrong remediation or keep asking for a permission mode the runtime cannot provide.
### PR C: execution correctness
@@ -53,7 +53,7 @@ The tool-compat work reduces schema friction for strict OpenAI/Codex tool regist
### PR D: parity harness
This slice adds the first-wave QA-lab parity pack so GPT-5.4 and Opus 4.6 can be exercised through the same scenarios and compared using shared evidence.
This slice adds the first-wave QA-lab parity pack so GPT-5.5 and Opus 4.6 can be exercised through the same scenarios and compared using shared evidence.
The parity pack is the proof layer. It does not change runtime behavior by itself.
@@ -62,7 +62,7 @@ After you have two `qa-suite-summary.json` artifacts, generate the release-gate
```bash
pnpm openclaw qa parity-report \
--repo-root . \
--candidate-summary .artifacts/qa-e2e/gpt54/qa-suite-summary.json \
--candidate-summary .artifacts/qa-e2e/gpt55/qa-suite-summary.json \
--baseline-summary .artifacts/qa-e2e/opus46/qa-suite-summary.json \
--output-dir .artifacts/qa-e2e/parity
```
@@ -73,16 +73,16 @@ That command writes:
- a machine-readable JSON verdict
- an explicit `pass` / `fail` gate result
## Why this improves GPT-5.4 in practice
## Why this improves GPT-5.5 in practice
Before this work, GPT-5.4 on OpenClaw could feel less agentic than Opus in real coding sessions because the runtime tolerated behaviors that are especially harmful for GPT-5-style models:
Before this work, GPT-5.5 on OpenClaw could feel less agentic than Opus in real coding sessions because the runtime tolerated behaviors that are especially harmful for GPT-5-style models:
- commentary-only turns
- schema friction around tools
- vague permission feedback
- silent replay or compaction breakage
The goal is not to make GPT-5.4 imitate Opus. The goal is to give GPT-5.4 a runtime contract that rewards real progress, supplies cleaner tool and permission semantics, and turns failure modes into explicit machine- and human-readable states.
The goal is not to make GPT-5.5 imitate Opus. The goal is to give GPT-5.5 a runtime contract that rewards real progress, supplies cleaner tool and permission semantics, and turns failure modes into explicit machine- and human-readable states.
That changes the user experience from:
@@ -92,15 +92,15 @@ to:
- “the model either acted, or OpenClaw surfaced the exact reason it could not”
## Before vs after for GPT-5.4 users
## Before vs after for GPT-5.5 users
| Before this program | After PR A-D |
| ---------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| GPT-5.4 could stop after a reasonable plan without taking the next tool step | PR A turns “plan only” into “act now or surface a blocked state” |
| GPT-5.5 could stop after a reasonable plan without taking the next tool step | PR A turns “plan only” into “act now or surface a blocked state” |
| Strict tool schemas could reject parameter-free or OpenAI/Codex-shaped tools in confusing ways | PR C makes provider-owned tool registration and invocation more predictable |
| `/elevated full` guidance could be vague or wrong in blocked runtimes | PR B gives GPT-5.4 and the user truthful runtime and permission hints |
| `/elevated full` guidance could be vague or wrong in blocked runtimes | PR B gives GPT-5.5 and the user truthful runtime and permission hints |
| Replay or compaction failures could feel like the task silently disappeared | PR C surfaces paused, blocked, abandoned, and replay-invalid outcomes explicitly |
| “GPT-5.4 feels worse than Opus” was mostly anecdotal | PR D turns that into the same scenario pack, the same metrics, and a hard pass/fail gate |
| “GPT-5.5 feels worse than Opus” was mostly anecdotal | PR D turns that into the same scenario pack, the same metrics, and a hard pass/fail gate |
## Architecture
@@ -123,7 +123,7 @@ flowchart TD
```mermaid
flowchart LR
A["Merged runtime slices (PR A-C)"] --> B["Run GPT-5.4 parity pack"]
A["Merged runtime slices (PR A-C)"] --> B["Run GPT-5.5 parity pack"]
A --> C["Run Opus 4.6 parity pack"]
B --> D["qa-suite-summary.json"]
C --> E["qa-suite-summary.json"]
@@ -162,7 +162,7 @@ Checks that a task with a real mutating write keeps replay-unsafety explicit ins
## Scenario matrix
| Scenario | What it tests | Good GPT-5.4 behavior | Failure signal |
| Scenario | What it tests | Good GPT-5.5 behavior | Failure signal |
| ---------------------------------- | --------------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ |
| `approval-turn-tool-followthrough` | Short approval turns after a plan | Starts the first concrete tool action immediately instead of restating intent | plan-only follow-up, no tool activity, or blocked turn without a real blocker |
| `model-switch-tool-continuity` | Runtime/model switching under tool use | Preserves task context and continues acting coherently | resets into commentary, loses tool context, or stops after switch |
@@ -172,7 +172,7 @@ Checks that a task with a real mutating write keeps replay-unsafety explicit ins
## Release gate
GPT-5.4 can only be considered at parity or better when the merged runtime passes the parity pack and the runtime-truthfulness regressions at the same time.
GPT-5.5 can only be considered at parity or better when the merged runtime passes the parity pack and the runtime-truthfulness regressions at the same time.
Required outcomes:
@@ -191,24 +191,24 @@ For the first-wave harness, the gate compares:
Parity evidence is intentionally split across two layers:
- PR D proves same-scenario GPT-5.4 vs Opus 4.6 behavior with QA-lab
- PR D proves same-scenario GPT-5.5 vs Opus 4.6 behavior with QA-lab
- PR B deterministic suites prove auth, proxy, DNS, and `/elevated full` truthfulness outside the harness
## Goal-to-evidence matrix
| Completion gate item | Owning PR | Evidence source | Pass signal |
| -------------------------------------------------------- | ----------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| GPT-5.4 no longer stalls after planning | PR A | `approval-turn-tool-followthrough` plus PR A runtime suites | approval turns trigger real work or an explicit blocked state |
| GPT-5.4 no longer fakes progress or fake tool completion | PR A + PR D | parity report scenario outcomes and fake-success count | no suspicious pass results and no commentary-only completion |
| GPT-5.4 no longer gives false `/elevated full` guidance | PR B | deterministic truthfulness suites | blocked reasons and full-access hints stay runtime-accurate |
| GPT-5.5 no longer stalls after planning | PR A | `approval-turn-tool-followthrough` plus PR A runtime suites | approval turns trigger real work or an explicit blocked state |
| GPT-5.5 no longer fakes progress or fake tool completion | PR A + PR D | parity report scenario outcomes and fake-success count | no suspicious pass results and no commentary-only completion |
| GPT-5.5 no longer gives false `/elevated full` guidance | PR B | deterministic truthfulness suites | blocked reasons and full-access hints stay runtime-accurate |
| Replay/liveness failures stay explicit | PR C + PR D | PR C lifecycle/replay suites plus `compaction-retry-mutating-tool` | mutating work keeps replay-unsafety explicit instead of silently disappearing |
| GPT-5.4 matches or beats Opus 4.6 on the agreed metrics | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` | same scenario coverage and no regression on completion, stop behavior, or valid tool use |
| GPT-5.5 matches or beats Opus 4.6 on the agreed metrics | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` | same scenario coverage and no regression on completion, stop behavior, or valid tool use |
## How to read the parity verdict
Use the verdict in `qa-agentic-parity-summary.json` as the final machine-readable decision for the first-wave parity pack.
- `pass` means GPT-5.4 covered the same scenarios as Opus 4.6 and did not regress on the agreed aggregate metrics.
- `pass` means GPT-5.5 covered the same scenarios as Opus 4.6 and did not regress on the agreed aggregate metrics.
- `fail` means at least one hard gate tripped: weaker completion, worse unintended stops, weaker valid tool use, any fake-success case, or mismatched scenario coverage.
- “shared/base CI issue” is not itself a parity result. If CI noise outside PR D blocks a run, the verdict should wait for a clean merged-runtime execution instead of being inferred from branch-era logs.
- Auth, proxy, DNS, and `/elevated full` truthfulness still come from PR B's deterministic suites, so the final release claim needs both: a passing PR D parity verdict and green PR B truthfulness coverage.
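
As a rough illustration of the hard gates listed above, the sketch below derives a `pass`/`fail` verdict from two sets of aggregate metrics. The `ParityMetrics` shape, its field names, and the choice to count fake-success cases on the candidate side are assumptions made for this example; the real `qa-agentic-parity-summary.json` schema is not shown in this document.

```typescript
// Hypothetical metric shape; the real summary schema may differ.
interface ParityMetrics {
  scenarios: string[];
  completionRate: number; // higher is better
  unintendedStopRate: number; // lower is better
  validToolUseRate: number; // higher is better
  fakeSuccessCount: number; // any non-zero value trips the gate
}

type Verdict = "pass" | "fail";

// Two runs have matching coverage when they ran the same set of scenario ids.
function sameCoverage(a: string[], b: string[]): boolean {
  const left = [...new Set(a)].sort();
  const right = [...new Set(b)].sort();
  return left.length === right.length && left.every((id, i) => id === right[i]);
}

// Returns "fail" as soon as any hard gate trips, otherwise "pass".
function parityVerdict(candidate: ParityMetrics, baseline: ParityMetrics): Verdict {
  const failed =
    !sameCoverage(candidate.scenarios, baseline.scenarios) ||
    candidate.completionRate < baseline.completionRate ||
    candidate.unintendedStopRate > baseline.unintendedStopRate ||
    candidate.validToolUseRate < baseline.validToolUseRate ||
    candidate.fakeSuccessCount > 0;
  return failed ? "fail" : "pass";
}
```

A verdict computed this way covers only the parity layer; the deterministic truthfulness suites remain a separate requirement.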
@@ -218,7 +218,7 @@ Use the verdict in `qa-agentic-parity-summary.json` as the final machine-readabl
Use `strict-agentic` when:
- the agent is expected to act immediately when a next step is obvious
- GPT-5.4 or Codex-family models are the primary runtime
- GPT-5.5 or Codex-family models are the primary runtime
- you prefer explicit blocked states over “helpful” recap-only replies
Keep the default contract when:
@@ -229,4 +229,4 @@ Keep the default contract when:
## Related
- [GPT-5.4 / Codex parity maintainer notes](/help/gpt54-codex-agentic-parity-maintainers)
- [GPT-5.5 / Codex parity maintainer notes](/help/gpt55-codex-agentic-parity-maintainers)


@@ -208,9 +208,12 @@ Notes:
- `OPENCLAW_LIVE_ACP_BIND_AGENT=claude`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=codex`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=opencode`
- `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude,codex,gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND='npx -y @agentclientprotocol/claude-agent-acp@<version>'`
- `OPENCLAW_LIVE_ACP_BIND_CODEX_MODEL=gpt-5.2`
- `OPENCLAW_LIVE_ACP_BIND_OPENCODE_MODEL=opencode/kimi-k2.6`
- `OPENCLAW_LIVE_ACP_BIND_REQUIRE_TRANSCRIPT=1`
- `OPENCLAW_LIVE_ACP_BIND_PARENT_MODEL=openai/gpt-5.2`
- Notes:
- This lane uses the gateway `chat.send` surface with admin-only synthetic originating-route fields so tests can attach message-channel context without pretending to deliver externally.
@@ -236,15 +239,17 @@ Single-agent Docker recipes:
pnpm test:docker:live-acp-bind:claude
pnpm test:docker:live-acp-bind:codex
pnpm test:docker:live-acp-bind:gemini
pnpm test:docker:live-acp-bind:opencode
```
Docker notes:
- The Docker runner lives at `scripts/test-live-acp-bind-docker.sh`.
- By default, it runs the ACP bind smoke against all supported live CLI agents in sequence: `claude`, `codex`, then `gemini`.
- Use `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude`, `OPENCLAW_LIVE_ACP_BIND_AGENTS=codex`, or `OPENCLAW_LIVE_ACP_BIND_AGENTS=gemini` to narrow the matrix.
- It sources `~/.profile`, stages the matching CLI auth material into the container, installs `acpx` into a writable npm prefix, then installs the requested live CLI (`@anthropic-ai/claude-code`, `@openai/codex`, or `@google/gemini-cli`) if missing.
- Inside Docker, the runner sets `OPENCLAW_LIVE_ACP_BIND_ACPX_COMMAND=$HOME/.npm-global/bin/acpx` so acpx keeps provider env vars from the sourced profile available to the child harness CLI.
- By default, it runs the ACP bind smoke against the aggregate live CLI agents in sequence: `claude`, `codex`, then `gemini`.
- Use `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude`, `OPENCLAW_LIVE_ACP_BIND_AGENTS=codex`, `OPENCLAW_LIVE_ACP_BIND_AGENTS=gemini`, or `OPENCLAW_LIVE_ACP_BIND_AGENTS=opencode` to narrow the matrix.
- It sources `~/.profile`, stages the matching CLI auth material into the container, then installs the requested live CLI (`@anthropic-ai/claude-code`, `@openai/codex`, `@google/gemini-cli`, or `opencode-ai`) if missing. The ACP backend itself is the bundled embedded `acpx/runtime` package from the `acpx` plugin.
- The OpenCode Docker variant is a strict single-agent regression lane. It writes a temporary `OPENCODE_CONFIG_CONTENT` default model from `OPENCLAW_LIVE_ACP_BIND_OPENCODE_MODEL` (default `opencode/kimi-k2.6`) after sourcing `~/.profile`, and `pnpm test:docker:live-acp-bind:opencode` requires a bound assistant transcript instead of accepting the generic post-bind skip.
- Direct `acpx` CLI calls are only a manual/workaround path for comparing behavior outside the Gateway. The Docker ACP bind smoke exercises OpenClaw's embedded `acpx` runtime backend.
## Live: Codex app-server harness smoke
@@ -492,7 +497,7 @@ image-generation runtime, and the live provider request.
- `comfy`: separate Comfy live file, not this shared sweep
- Optional narrowing:
- `OPENCLAW_LIVE_MUSIC_GENERATION_PROVIDERS="google,minimax"`
- `OPENCLAW_LIVE_MUSIC_GENERATION_MODELS="google/lyria-3-clip-preview,minimax/music-2.5+"`
- `OPENCLAW_LIVE_MUSIC_GENERATION_MODELS="google/lyria-3-clip-preview,minimax/music-2.6"`
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides


@@ -55,6 +55,20 @@ When debugging real providers/models (requires real creds):
Slack DM with `/codex bind`, exercises `/codex fast` and
`/codex permissions`, then verifies a plain reply and an image attachment
route through the native plugin binding instead of ACP.
- Crestodian rescue command smoke: `pnpm test:live:crestodian-rescue-channel`
- Opt-in belt-and-suspenders check for the message-channel rescue command
surface. It exercises `/crestodian status`, queues a persistent model
change, replies `/crestodian yes`, and verifies the audit/config write path.
- Crestodian planner Docker smoke: `pnpm test:docker:crestodian-planner`
- Runs Crestodian in a configless container with a fake Claude CLI on `PATH`
and verifies the fuzzy planner fallback translates into an audited typed
config write.
- Crestodian first-run Docker smoke: `pnpm test:docker:crestodian-first-run`
- Starts from an empty OpenClaw state dir, routes bare `openclaw` to
Crestodian, applies setup/model/agent/Discord plugin + SecretRef writes,
validates config, and verifies audit entries. The same Ring 0 setup path is
also covered in QA Lab by
`pnpm openclaw qa suite --scenario crestodian-ring-zero-setup`.
- Moonshot/Kimi cost smoke: with `MOONSHOT_API_KEY` set, run
`openclaw models list --provider moonshot --json`, then run an isolated
`openclaw agent --local --session-id live-kimi-cost --message 'Reply exactly: KIMI_LIVE_OK' --thinking off --json`
@@ -574,7 +588,7 @@ These Docker runners split into two buckets:
The live-model Docker runners also bind-mount only the needed CLI auth homes (or all supported ones when the run is not narrowed), then copy them into the container home before the run so external-CLI OAuth can refresh tokens without mutating the host auth store:
- Direct models: `pnpm test:docker:live-models` (script: `scripts/test-live-models-docker.sh`)
- ACP bind smoke: `pnpm test:docker:live-acp-bind` (script: `scripts/test-live-acp-bind-docker.sh`)
- ACP bind smoke: `pnpm test:docker:live-acp-bind` (script: `scripts/test-live-acp-bind-docker.sh`; covers Claude, Codex, and Gemini by default, with strict OpenCode coverage via `pnpm test:docker:live-acp-bind:opencode`)
- CLI backend smoke: `pnpm test:docker:live-cli-backend` (script: `scripts/test-live-cli-backend-docker.sh`)
- Codex app-server harness smoke: `pnpm test:docker:live-codex-harness` (script: `scripts/test-live-codex-harness-docker.sh`)
- Gateway + dev agent: `pnpm test:docker:live-gateway` (script: `scripts/test-live-gateway-models-docker.sh`)


@@ -198,6 +198,9 @@ diagnostics + the exporter plugin are enabled.
Model usage:
- `model.usage`: tokens, cost, duration, context, provider/model/channel, session ids.
`usage` is provider/turn accounting for cost and telemetry; `context.used`
is the current prompt/context snapshot and can be lower than provider
`usage.total` when cached input or tool-loop calls are involved.
Message flow:
@@ -206,6 +209,9 @@ Message flow:
- `webhook.error`: webhook handler errors.
- `message.queued`: message enqueued for processing.
- `message.processed`: outcome + duration + optional error.
- `message.delivery.started`: outbound delivery attempt started.
- `message.delivery.completed`: outbound delivery attempt finished + duration/result count.
- `message.delivery.error`: outbound delivery attempt failed + duration/bounded error category.
Queue + session:
@@ -304,7 +310,8 @@ Notes:
- You can also enable the plugin with `openclaw plugins enable diagnostics-otel`.
- `protocol` currently supports `http/protobuf` only. `grpc` is ignored.
- Metrics include token usage, cost, context size, run duration, and message-flow
counters/histograms (webhooks, queueing, session state, queue depth/wait).
counters/histograms (webhooks, queueing, session state, queue depth/wait),
plus GenAI token usage and model-call duration histograms.
- Traces/metrics can be toggled with `traces` / `metrics` (default: on). Traces
include model usage spans plus webhook/message processing spans when enabled.
- Raw model/tool content is not exported by default. Use
@@ -313,6 +320,10 @@ Notes:
- Set `headers` when your collector requires auth.
- Environment variables supported: `OTEL_EXPORTER_OTLP_ENDPOINT`,
`OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_PROTOCOL`.
- Set `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` to emit the
latest experimental GenAI provider span attribute (`gen_ai.provider.name`)
instead of the legacy span attribute (`gen_ai.system`). GenAI metrics always
use bounded, low-cardinality semantic attributes.
- Set `OPENCLAW_OTEL_PRELOADED=1` when another preload or host process already
registered the global OpenTelemetry SDK. In that mode the plugin does not start
or shut down its own SDK, but it still wires OpenClaw diagnostic listeners and
@@ -330,6 +341,12 @@ Model usage:
`openclaw.provider`, `openclaw.model`)
- `openclaw.context.tokens` (histogram, attrs: `openclaw.context`,
`openclaw.channel`, `openclaw.provider`, `openclaw.model`)
- `gen_ai.client.token.usage` (histogram, GenAI semantic-conventions metric,
attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.provider.name`,
`gen_ai.operation.name`, `gen_ai.request.model`)
- `gen_ai.client.operation.duration` (histogram, seconds, GenAI
semantic-conventions metric, attrs: `gen_ai.provider.name`,
`gen_ai.operation.name`, `gen_ai.request.model`, optional `error.type`)
Message flow:
@@ -345,6 +362,11 @@ Message flow:
`openclaw.outcome`)
- `openclaw.message.duration_ms` (histogram, attrs: `openclaw.channel`,
`openclaw.outcome`)
- `openclaw.message.delivery.started` (counter, attrs: `openclaw.channel`,
`openclaw.delivery.kind`)
- `openclaw.message.delivery.duration_ms` (histogram, attrs:
`openclaw.channel`, `openclaw.delivery.kind`, `openclaw.outcome`,
`openclaw.errorCategory`)
Queues + sessions:
@@ -363,18 +385,35 @@ Exec:
- `openclaw.exec.duration_ms` (histogram, attrs: `openclaw.exec.target`,
`openclaw.exec.mode`, `openclaw.outcome`, `openclaw.failureKind`)
Diagnostics internals (memory + tool loop):
- `openclaw.memory.heap_used_bytes` (histogram, attrs: `openclaw.memory.kind`)
- `openclaw.memory.rss_bytes` (histogram)
- `openclaw.memory.pressure` (counter, attrs: `openclaw.memory.level`)
- `openclaw.tool.loop.iterations` (counter, attrs: `openclaw.toolName`,
`openclaw.outcome`)
- `openclaw.tool.loop.duration_ms` (histogram, attrs: `openclaw.toolName`,
`openclaw.outcome`)
### Exported spans (names + key attributes)
- `openclaw.model.usage`
- `openclaw.channel`, `openclaw.provider`, `openclaw.model`
- `openclaw.tokens.*` (input/output/cache_read/cache_write/total)
- `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
semantic conventions are opted in
- `gen_ai.request.model`, `gen_ai.operation.name`, `gen_ai.usage.*`
- `openclaw.run`
- `openclaw.outcome`, `openclaw.channel`, `openclaw.provider`,
`openclaw.model`, `openclaw.errorCategory`
- `openclaw.model.call`
- `gen_ai.system`, `gen_ai.request.model`, `gen_ai.operation.name`,
- `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
semantic conventions are opted in
- `gen_ai.request.model`, `gen_ai.operation.name`,
`openclaw.provider`, `openclaw.model`, `openclaw.api`,
`openclaw.transport`
`openclaw.transport`, `openclaw.provider.request_id_hash` (bounded
SHA-based hash of the upstream provider request id; raw ids are not
exported)
- `openclaw.tool.execution`
- `gen_ai.tool.name`, `openclaw.toolName`, `openclaw.errorCategory`,
`openclaw.tool.params.*`
@@ -390,8 +429,21 @@ Exec:
- `openclaw.message.processed`
- `openclaw.channel`, `openclaw.outcome`, `openclaw.chatId`,
`openclaw.messageId`, `openclaw.reason`
- `openclaw.message.delivery`
- `openclaw.channel`, `openclaw.delivery.kind`, `openclaw.outcome`,
`openclaw.errorCategory`, `openclaw.delivery.result_count`
- `openclaw.session.stuck`
- `openclaw.state`, `openclaw.ageMs`, `openclaw.queueDepth`
- `openclaw.context.assembled`
- `openclaw.prompt.size`, `openclaw.history.size`,
`openclaw.context.tokens`, `openclaw.errorCategory` (no prompt,
history, response, or session-key content)
- `openclaw.tool.loop`
- `openclaw.toolName`, `openclaw.outcome`, `openclaw.iterations`,
`openclaw.errorCategory` (no loop messages, params, or tool output)
- `openclaw.memory.pressure`
- `openclaw.memory.level`, `openclaw.memory.heap_used_bytes`,
`openclaw.memory.rss_bytes`
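
The `openclaw.provider.request_id_hash` attribute on `openclaw.model.call` is described above as a bounded SHA-based hash of the upstream request id. A minimal sketch of that idea follows, assuming SHA-256 truncated to 16 hex characters; the source does not state the algorithm variant or the length, and `hashRequestId` is a hypothetical helper name.

```typescript
import { createHash } from "node:crypto";

// Assumption: SHA-256 truncated to a fixed number of hex characters.
// The document only says "bounded SHA-based hash"; both choices are illustrative.
function hashRequestId(rawRequestId: string, hexLength = 16): string {
  return createHash("sha256")
    .update(rawRequestId, "utf8")
    .digest("hex")
    .slice(0, hexLength);
}
```

Truncating keeps the attribute size fixed regardless of how long the provider's id is, while the same raw id still maps to the same hash for correlation.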
When content capture is explicitly enabled, model/tool spans can also include
bounded, redacted `openclaw.content.*` attributes for the specific content
@@ -408,6 +460,9 @@ classes you opted into.
`OTEL_EXPORTER_OTLP_ENDPOINT`.
- If the endpoint already contains `/v1/traces` or `/v1/metrics`, it is used as-is.
- If the endpoint already contains `/v1/logs`, it is used as-is for logs.
- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` controls only the
GenAI span provider attribute shape. Existing dashboards that read
`gen_ai.system` can keep the default until they migrate.
- `OPENCLAW_OTEL_PRELOADED=1` reuses an externally registered OpenTelemetry SDK
for traces/metrics instead of starting a plugin-owned NodeSDK.
- `diagnostics.otel.logs` enables OTLP log export for the main logger output.
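
The `OTEL_SEMCONV_STABILITY_OPT_IN` behavior described above can be sketched as a small attribute-key selector. This is an illustration rather than the plugin's actual code: `genAiProviderAttributeKey` is a hypothetical name, and treating the variable as a comma-separated list follows the general OpenTelemetry convention for this variable, which is assumed to apply here.

```typescript
type Env = Record<string, string | undefined>;

// Picks the span attribute key used for the GenAI provider name.
// Assumption: the env var holds a comma-separated list of opt-in values.
function genAiProviderAttributeKey(env: Env): "gen_ai.provider.name" | "gen_ai.system" {
  const optIns = (env.OTEL_SEMCONV_STABILITY_OPT_IN ?? "")
    .split(",")
    .map((value) => value.trim())
    .filter((value) => value.length > 0);
  return optIns.includes("gen_ai_latest_experimental")
    ? "gen_ai.provider.name"
    : "gen_ai.system";
}
```

Passing the environment in as a parameter (for example `genAiProviderAttributeKey(process.env)`) keeps the selector easy to test without mutating global state.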


@@ -31,7 +31,7 @@ OpenClaw auto-detects in this order and stops at the first working option:
3. **Gemini CLI** (`gemini`) using `read_many_files`
4. **Provider auth**
- Configured `models.providers.*` entries that support audio are tried first
- Bundled fallback order: OpenAI → Groq → Deepgram → Google → Mistral
- Bundled fallback order: OpenAI → Groq → xAI → Deepgram → Google → SenseAudio → ElevenLabs → Mistral
To disable auto-detection, set `tools.media.audio.enabled: false`.
To customize, set `tools.media.audio.models`.
@@ -112,6 +112,21 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
}
```
### Provider-only (SenseAudio)
```json5
{
tools: {
media: {
audio: {
enabled: true,
models: [{ provider: "senseaudio", model: "senseaudio-asr-pro-1.5-260319" }],
},
},
},
}
```
### Echo transcript to chat (opt-in)
```json5
@@ -136,6 +151,8 @@ Note: Binary detection is best-effort across macOS/Linux/Windows; ensure the CLI
- Deepgram picks up `DEEPGRAM_API_KEY` when `provider: "deepgram"` is used.
- Deepgram setup details: [Deepgram (audio transcription)](/providers/deepgram).
- Mistral setup details: [Mistral](/providers/mistral).
- SenseAudio picks up `SENSEAUDIO_API_KEY` when `provider: "senseaudio"` is used.
- SenseAudio setup details: [SenseAudio](/providers/senseaudio).
- Audio providers can override `baseUrl`, `headers`, and `providerOptions` via `tools.media.audio`.
- Default size cap is 20MB (`tools.media.audio.maxBytes`). Oversize audio is skipped for that model and the next entry is tried.
- Tiny/empty audio files below 1024 bytes are skipped before provider/CLI transcription.
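
The size rules in the notes above can be sketched as a filter over the configured model list. This is a simplified illustration under stated assumptions: `AudioModelEntry`, the optional per-entry `maxBytes`, and reading "20MB" as `20 * 1024 * 1024` bytes are all choices made for the example, not confirmed details of the implementation.

```typescript
// Hypothetical entry shape; a per-entry maxBytes override is an assumption.
interface AudioModelEntry {
  provider: string;
  model: string;
  maxBytes?: number;
}

const DEFAULT_MAX_BYTES = 20 * 1024 * 1024; // "20MB" interpreted as MiB (assumption)
const MIN_AUDIO_BYTES = 1024;

// Returns the entries that would still be tried, in their original order.
function eligibleAudioModels(
  sizeBytes: number,
  models: AudioModelEntry[],
  maxBytes: number = DEFAULT_MAX_BYTES,
): AudioModelEntry[] {
  // Tiny or empty files are skipped before any provider or CLI is tried.
  if (sizeBytes < MIN_AUDIO_BYTES) return [];
  // Oversize audio is skipped per entry, so later entries can still match.
  return models.filter((entry) => sizeBytes <= (entry.maxBytes ?? maxBytes));
}
```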


@@ -167,7 +167,7 @@ working option**:
example through `agents.defaults.imageModel` or
`openclaw infer image describe --model ollama/<vision-model>`.
- Bundled fallback order:
- Audio: OpenAI → Groq → xAI → Deepgram → Google → Mistral
- Audio: OpenAI → Groq → xAI → Deepgram → Google → SenseAudio → ElevenLabs → Mistral
- Image: OpenAI → Anthropic → Google → MiniMax → MiniMax Portal → Z.AI
- Video: Google → Qwen → Moonshot
@@ -228,7 +228,7 @@ If you omit `capabilities`, the entry is eligible for the list it appears in.
| Capability | Provider integration | Notes |
| ---------- | ---------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Image | OpenAI, OpenAI Codex OAuth, Codex app-server, OpenRouter, Anthropic, Google, MiniMax, Moonshot, Qwen, Z.AI, config providers | Vendor plugins register image support; `openai-codex/*` uses OAuth provider plumbing; `codex/*` uses a bounded Codex app-server turn; MiniMax and MiniMax OAuth both use `MiniMax-VL-01`; image-capable config providers auto-register. |
| Audio | OpenAI, Groq, Deepgram, Google, Mistral | Provider transcription (Whisper/Deepgram/Gemini/Voxtral). |
| Audio | OpenAI, Groq, xAI, Deepgram, Google, SenseAudio, ElevenLabs, Mistral | Provider transcription (Whisper/Groq/xAI/Deepgram/Gemini/SenseAudio/Scribe/Voxtral). |
| Video | Google, Qwen, Moonshot | Provider video understanding via vendor plugins; Qwen video understanding uses the Standard DashScope endpoints. |
MiniMax note:


@@ -91,6 +91,13 @@ Defaults:
- Click cloud: stop speaking
- Click X: exit Talk mode
## Android UI
- Voice tab toggle: **Talk**
- Manual **Mic** and **Talk** are mutually exclusive runtime capture modes.
- Manual Mic stops when the app leaves the foreground or the user leaves the Voice tab.
- Talk Mode keeps running until toggled off or the Android node disconnects, and uses Android's microphone foreground-service type while active.
## Notes
- Requires Speech + Microphone permissions.


@@ -199,8 +199,10 @@ See [Camera node](/nodes/camera) for parameters and CLI helpers.
### 8) Voice + expanded Android command surface
- Voice: Android uses a single mic on/off flow in the Voice tab with transcript capture and `talk.speak` playback. Local system TTS is used only when `talk.speak` is unavailable. Voice stops when the app leaves the foreground.
- Voice wake/talk-mode toggles are currently removed from Android UX/runtime.
- Voice tab: Android has two explicit capture modes. **Mic** is a manual Voice-tab session that sends each pause as a chat turn and stops when the app leaves the foreground or the user leaves the Voice tab. **Talk** is continuous Talk Mode and keeps listening until toggled off or the node disconnects.
- Talk Mode promotes the existing foreground service from `dataSync` to `dataSync|microphone` before capture starts, then demotes it when Talk Mode stops. Android 14+ requires the `FOREGROUND_SERVICE_MICROPHONE` declaration, the `RECORD_AUDIO` runtime grant, and the microphone service type at runtime.
- Spoken replies use `talk.speak` through the configured gateway Talk provider. Local system TTS is used only when `talk.speak` is unavailable.
- Voice wake remains disabled in the Android UX/runtime.
- Additional Android command families (availability depends on device + permissions):
- `device.status`, `device.info`, `device.permissions`, `device.health`
- `notifications.list`, `notifications.actions` (see [Notification forwarding](#notification-forwarding) below)


@@ -911,13 +911,18 @@ Official external npm entries should prefer an exact `npmSpec` plus
`expectedIntegrity`. Bare package names and dist-tags still work for
compatibility, but they surface source-plane warnings so the catalog can move
toward pinned, integrity-checked installs without breaking existing plugins.
When onboarding installs from a local catalog path, it records a
`plugins.installs` entry with `source: "path"` and a workspace-relative
When onboarding installs from a local catalog path, it records a managed plugin
install ledger entry with `source: "path"` and a workspace-relative
`sourcePath` when possible. The absolute operational load path stays in
`plugins.load.paths`; the install record avoids duplicating local workstation
paths into long-lived config. This keeps local development installs visible to
source-plane diagnostics without adding a second raw filesystem-path disclosure
surface.
surface. Legacy `plugins.installs` config entries are still read as a
compatibility fallback while the state-managed `plugins/installs.json` ledger
becomes the install source of truth.
`openclaw doctor --fix` migrates those legacy config entries into the managed
ledger and refreshes the cold registry index without loading plugin runtime
modules.
## Context engine plugins


@@ -86,6 +86,8 @@ Current compatibility records include:
toward `agentRuntime`
- generated bundled channel config metadata fallback while registry-first
`channelConfigs` metadata lands
- the persisted plugin registry disable env while repair flows migrate operators
to `openclaw plugins registry --refresh` and `openclaw doctor --fix`
New plugin code should prefer the replacement listed in the registry and in the
specific migration guide. Existing plugins can keep using a compatibility path


@@ -79,6 +79,8 @@ audio bridge, node pinning, delayed realtime intro, and, when Twilio delegation
is configured, whether the `voice-call` plugin and Twilio credentials are ready.
Treat any `ok: false` check as a blocker before asking an agent to join.
Use `openclaw googlemeet setup --json` for scripts or machine-readable output.
Use `--transport chrome`, `--transport chrome-node`, or `--transport twilio`
to preflight a specific transport before an agent tries it.
Join a meeting:
@@ -303,11 +305,17 @@ display name, or remote IP.
Common failure checks:
- `Configured Google Meet node ... is not usable: offline`: the pinned node is
known to the Gateway but unavailable. Agents should treat that node as
diagnostic state, not as a usable Chrome host, and report the setup blocker
instead of falling back to another transport unless the user asked for that.
- `No connected Google Meet-capable node`: start `openclaw node run` in the VM,
approve pairing, and make sure `openclaw plugins enable google-meet` and
`openclaw plugins enable browser` were run in the VM. Also confirm the
Gateway host allows both node commands with
`gateway.nodes.allowCommands: ["googlemeet.chrome", "browser.proxy"]`.
- `BlackHole 2ch audio device not found`: install `blackhole-2ch` on the host
being checked and reboot before using local Chrome audio.
- `BlackHole 2ch audio device not found on the node`: install `blackhole-2ch`
in the VM and reboot the VM.
- Chrome opens but cannot join: sign in to the browser profile inside the VM, or


@@ -68,6 +68,7 @@ observation-only.
**Conversation observation**
- `model_call_started` / `model_call_ended` — observe sanitized provider/model call metadata, timing, outcome, and bounded request-id hashes without prompt or response content
- `llm_input` — observe provider input (system prompt, prompt, history)
- `llm_output` — observe provider output
@@ -146,6 +147,21 @@ Rules:
- `onResolution` receives the resolved approval decision — `allow-once`,
`allow-always`, `deny`, `timeout`, or `cancelled`.
### Tool result persistence
Tool results can include structured `details` for UI rendering, diagnostics,
media routing, or plugin-owned metadata. Treat `details` as runtime metadata,
not prompt content:
- OpenClaw strips `toolResult.details` before provider replay and compaction
input so metadata does not become model context.
- Persisted session entries keep only bounded `details`. Oversized details are
replaced with a compact summary and `persistedDetailsTruncated: true`.
- `tool_result_persist` and `before_message_write` run before the final
persistence cap. Hooks should still keep returned `details` small and avoid
placing prompt-relevant text only in `details`; put model-visible tool output
in `content`.
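
The rules above can be sketched as two small transforms: one that drops `details` before replay, and one that bounds `details` before persistence. The types, the 2048-character cap, the summary text, and the placement of `persistedDetailsTruncated` inside the replacement `details` object are assumptions for illustration only.

```typescript
// Hypothetical shapes; OpenClaw's real tool result types are not shown here.
interface ToolResult {
  content: string;
  details?: unknown;
}

const MAX_DETAILS_CHARS = 2048; // illustrative cap, not the real limit

// Provider replay and compaction input see content only, never details.
function forProviderReplay(result: ToolResult): { content: string } {
  return { content: result.content };
}

// Persisted entries keep details only while their serialized size is bounded.
function forPersistence(result: ToolResult, maxChars: number = MAX_DETAILS_CHARS): ToolResult {
  if (result.details === undefined) return { content: result.content };
  const serialized: string | undefined = JSON.stringify(result.details);
  const size = serialized === undefined ? 0 : serialized.length;
  if (size <= maxChars) return { content: result.content, details: result.details };
  return {
    content: result.content,
    details: {
      summary: `details omitted (${size} chars)`,
      persistedDetailsTruncated: true,
    },
  };
}
```

Because `details` never reaches the model in this sketch, any text the model needs has to live in `content`, which matches the guidance above.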
## Prompt and model hooks
Use the phase-specific hooks for new plugins:
@@ -162,6 +178,13 @@ so your plugin does not depend on a legacy combined phase.
`before_agent_start` and `agent_end` include `event.runId` when OpenClaw can
identify the active run. The same value is also available on `ctx.runId`.
Use `model_call_started` and `model_call_ended` for provider-call telemetry
that should not receive raw prompts, history, responses, headers, request
bodies, or provider request IDs. These hooks include stable metadata such as
`runId`, `callId`, `provider`, `model`, optional `api`/`transport`, terminal
`durationMs`/`outcome`, and `upstreamRequestIdHash` when OpenClaw can derive a
bounded provider request-id hash.
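
As a sketch of what a consumer might do with this metadata, the function below aggregates `model_call_ended` events per provider and model. The event type is reconstructed from the field names listed above and is an assumption; how a plugin subscribes to the hook is deliberately left out, and the outcome strings used in any call site are hypothetical.

```typescript
// Hypothetical event shape built from the fields named in the paragraph above.
interface ModelCallEndedEvent {
  runId: string;
  callId: string;
  provider: string;
  model: string;
  api?: string;
  transport?: string;
  durationMs: number;
  outcome: string;
  upstreamRequestIdHash?: string;
}

interface ModelCallStats {
  count: number;
  totalDurationMs: number;
  outcomes: Record<string, number>;
}

// Groups events by "provider/model" and tallies durations and outcomes.
function summarizeModelCalls(events: ModelCallEndedEvent[]): Map<string, ModelCallStats> {
  const stats = new Map<string, ModelCallStats>();
  for (const event of events) {
    const key = `${event.provider}/${event.model}`;
    const entry = stats.get(key) ?? { count: 0, totalDurationMs: 0, outcomes: {} };
    entry.count += 1;
    entry.totalDurationMs += event.durationMs;
    entry.outcomes[event.outcome] = (entry.outcomes[event.outcome] ?? 0) + 1;
    stats.set(key, entry);
  }
  return stats;
}
```

Nothing in this summary needs prompt, response, or raw request-id data, which is the point of using these hooks rather than `llm_input` or `llm_output`.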
Non-bundled plugins that need `llm_input`, `llm_output`, or `agent_end` must set:
```json


@@ -63,8 +63,7 @@ Practical rule:
If bridge mode reports zero exported artifacts, the active memory plugin is not
currently exposing public bridge inputs yet. Run `openclaw wiki doctor` first,
then confirm the Gateway is running and the active memory plugin supports public
artifacts.
then confirm the active memory plugin supports public artifacts.
## Vault modes
@@ -92,10 +91,6 @@ Bridge mode can index:
- memory root files
- memory event logs
Bridge status, doctor, and import commands call the Gateway so the CLI reads
the same in-process memory capability as runtime wiki tools. For remote
Gateways, pass the usual `--url` and `--token` options.
### `unsafe-local`
Explicit same-machine escape hatch for local private paths.


@@ -5,7 +5,7 @@ sidebarTitle: "Migrate to SDK"
read_when:
- You see the OPENCLAW_PLUGIN_SDK_COMPAT_DEPRECATED warning
- You see the OPENCLAW_EXTENSION_API_DEPRECATED warning
- You used api.registerEmbeddedExtensionFactory before OpenClaw 2026.4.24
- You used api.registerEmbeddedExtensionFactory before OpenClaw 2026.4.25
- You are updating a plugin to the modern plugin architecture
- You maintain an external OpenClaw plugin
---
@@ -299,7 +299,7 @@ releases.
| `plugin-sdk/retry-runtime` | Retry helpers | `RetryConfig`, `retryAsync`, policy runners |
| `plugin-sdk/allow-from` | Allowlist formatting | `formatAllowFromLowercase` |
| `plugin-sdk/allowlist-resolution` | Allowlist input mapping | `mapAllowlistResolutionInputs` |
| `plugin-sdk/command-auth` | Command gating and command-surface helpers | `resolveControlCommandGate`, sender-authorization helpers, command registry helpers |
| `plugin-sdk/command-auth` | Command gating and command-surface helpers | `resolveControlCommandGate`, sender-authorization helpers, command registry helpers including dynamic argument menu formatting |
| `plugin-sdk/command-status` | Command status/help renderers | `buildCommandsMessage`, `buildCommandsMessagePaginated`, `buildHelpMessage` |
| `plugin-sdk/secret-input` | Secret input parsing | Secret input helpers |
| `plugin-sdk/webhook-ingress` | Webhook request helpers | Webhook target utilities |
@@ -342,7 +342,7 @@ releases.
| `plugin-sdk/provider-web-search` | Provider web-search helpers | Web-search provider registration/cache/runtime helpers |
| `plugin-sdk/provider-tools` | Provider tool/schema compat helpers | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, Gemini schema cleanup + diagnostics, and xAI compat helpers such as `resolveXaiModelCompatPatch` / `applyXaiModelCompat` |
| `plugin-sdk/provider-usage` | Provider usage helpers | `fetchClaudeUsage`, `fetchGeminiUsage`, `fetchGithubCopilotUsage`, and other provider usage helpers |
| `plugin-sdk/provider-stream` | Provider stream wrapper helpers | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-stream` | Provider stream wrapper helpers | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-transport-runtime` | Provider transport helpers | Native provider transport helpers such as guarded fetch, transport message transforms, and writable transport event streams |
| `plugin-sdk/keyed-async-queue` | Ordered async queue | `KeyedAsyncQueue` |
| `plugin-sdk/media-runtime` | Shared media helpers | Media fetch/transform/store helpers plus media payload builders |


@@ -340,7 +340,7 @@ API key auth, and dynamic model resolution.
Each family builder is composed from lower-level public helpers exported from the same package, which you can reach for when a provider needs to go off the common pattern:
- `openclaw/plugin-sdk/provider-model-shared` — `ProviderReplayFamily`, `buildProviderReplayFamilyHooks(...)`, and the raw replay builders (`buildOpenAICompatibleReplayPolicy`, `buildAnthropicReplayPolicyForModel`, `buildGoogleGeminiReplayPolicy`, `buildHybridAnthropicOrOpenAIReplayPolicy`). Also exports Gemini replay helpers (`sanitizeGoogleGeminiReplayHistory`, `resolveTaggedReasoningOutputMode`) and endpoint/model helpers (`resolveProviderEndpoint`, `normalizeProviderId`, `normalizeGooglePreviewModelId`, `normalizeNativeXaiModelId`).
- `openclaw/plugin-sdk/provider-stream` — `ProviderStreamFamily`, `buildProviderStreamFamilyHooks(...)`, `composeProviderStreamWrappers(...)`, plus the shared OpenAI/Codex wrappers (`createOpenAIAttributionHeadersWrapper`, `createOpenAIFastModeWrapper`, `createOpenAIServiceTierWrapper`, `createOpenAIResponsesContextManagementWrapper`, `createCodexNativeWebSearchWrapper`) and shared proxy/provider wrappers (`createOpenRouterWrapper`, `createToolStreamWrapper`, `createMinimaxFastModeWrapper`).
- `openclaw/plugin-sdk/provider-stream` — `ProviderStreamFamily`, `buildProviderStreamFamilyHooks(...)`, `composeProviderStreamWrappers(...)`, plus the shared OpenAI/Codex wrappers (`createOpenAIAttributionHeadersWrapper`, `createOpenAIFastModeWrapper`, `createOpenAIServiceTierWrapper`, `createOpenAIResponsesContextManagementWrapper`, `createCodexNativeWebSearchWrapper`), DeepSeek V4 OpenAI-compatible wrapper (`createDeepSeekV4OpenAICompatibleThinkingWrapper`), and shared proxy/provider wrappers (`createOpenRouterWrapper`, `createToolStreamWrapper`, `createMinimaxFastModeWrapper`).
- `openclaw/plugin-sdk/provider-tools` — `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks("gemini")`, underlying Gemini schema helpers (`normalizeGeminiToolSchemas`, `inspectGeminiToolSchemas`), and xAI compat helpers (`resolveXaiModelCompatPatch()`, `applyXaiModelCompat(model)`). The bundled xAI plugin uses `normalizeResolvedModel` + `contributeResolvedModelCompat` with these to keep xAI rules owned by the provider.
Some stream helpers stay provider-local on purpose. `@openclaw/anthropic-provider` keeps `wrapAnthropicProviderStream`, `resolveAnthropicBetas`, `resolveAnthropicFastMode`, `resolveAnthropicServiceTier`, and the lower-level Anthropic wrapper builders in its own public `api.ts` / `contract-api.ts` seam because they encode Claude OAuth beta handling and `context1m` gating. The xAI plugin similarly keeps native xAI Responses shaping in its own `wrapStreamFn` (`/fast` aliases, default `tool_stream`, unsupported strict-tool cleanup, xAI-specific reasoning-payload removal).

@@ -102,7 +102,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/provider-web-search` | Web-search provider registration/cache/runtime helpers |
| `plugin-sdk/provider-tools` | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, Gemini schema cleanup + diagnostics, and xAI compat helpers such as `resolveXaiModelCompatPatch` / `applyXaiModelCompat` |
| `plugin-sdk/provider-usage` | `fetchClaudeUsage` and similar |
| `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-transport-runtime` | Native provider transport helpers such as guarded fetch, transport message transforms, and writable transport event streams |
| `plugin-sdk/provider-onboard` | Onboarding config patch helpers |
| `plugin-sdk/global-singleton` | Process-local singleton/map/cache helpers |
@@ -112,7 +112,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
<Accordion title="Auth and security subpaths">
| Subpath | Key exports |
| --- | --- |
| `plugin-sdk/command-auth` | `resolveControlCommandGate`, command registry helpers, sender-authorization helpers |
| `plugin-sdk/command-auth` | `resolveControlCommandGate`, command registry helpers including dynamic argument menu formatting, sender-authorization helpers |
| `plugin-sdk/command-status` | Command/help message builders such as `buildCommandsMessagePaginated` and `buildHelpMessage` |
| `plugin-sdk/approval-auth-runtime` | Approver resolution and same-chat action-auth helpers |
| `plugin-sdk/approval-client-runtime` | Native exec approval profile/filter helpers |
@@ -125,7 +125,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/approval-runtime` | Exec/plugin approval payload helpers, native approval routing/runtime helpers, and structured approval display helpers such as `formatApprovalDisplayPath` |
| `plugin-sdk/reply-dedupe` | Narrow inbound reply dedupe reset helpers |
| `plugin-sdk/channel-contract-testing` | Narrow channel contract test helpers without the broad testing barrel |
| `plugin-sdk/command-auth-native` | Native command auth + native session-target helpers |
| `plugin-sdk/command-auth-native` | Native command auth, dynamic argument menu formatting, and native session-target helpers |
| `plugin-sdk/command-detection` | Shared command detection helpers |
| `plugin-sdk/command-primitives-runtime` | Lightweight command text predicates for hot channel paths |
| `plugin-sdk/command-surface` | Command-body normalization and command-surface helpers |
@@ -152,7 +152,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/hook-runtime` | Shared webhook/internal hook pipeline helpers |
| `plugin-sdk/lazy-runtime` | Lazy runtime import/binding helpers such as `createLazyRuntimeModule`, `createLazyRuntimeMethod`, and `createLazyRuntimeSurface` |
| `plugin-sdk/process-runtime` | Process exec helpers |
| `plugin-sdk/cli-runtime` | CLI formatting, wait, and version helpers |
| `plugin-sdk/cli-runtime` | CLI formatting, wait, version, argument-invocation, and lazy command-group helpers |
| `plugin-sdk/gateway-runtime` | Gateway client and channel-status patch helpers |
| `plugin-sdk/config-runtime` | Config load/write helpers and plugin-config lookup helpers |
| `plugin-sdk/telegram-command-config` | Telegram command-name/description normalization and duplicate/conflict checks, even when the bundled Telegram contract surface is unavailable |

@@ -53,6 +53,12 @@ Restart the Gateway afterwards.
Set config under `plugins.entries.voice-call.config`:
If `enabled` is true but the selected provider is missing credentials, Gateway
startup logs a setup-incomplete warning with the missing keys and skips starting
the runtime. Run `openclaw voicecall setup` to see the same readiness details.
Commands, RPC calls, and agent tools still return the exact missing provider
configuration when used.
```json5
{
plugins: {

@@ -43,6 +43,9 @@ export ELEVENLABS_API_KEY="..."
}
```
Set `modelId` to `eleven_v3` to use ElevenLabs v3 TTS. OpenClaw keeps
`eleven_multilingual_v2` as the default for existing installs.
## Speech-to-text
Use Scribe v2 for inbound audio attachments and short recorded voice segments:

@@ -50,11 +50,16 @@ The bundled `fal` image-generation provider defaults to
| Size overrides | Supported |
| Aspect ratio | Supported |
| Resolution | Supported |
| Output format | `png` or `jpeg` |
<Warning>
The fal image edit endpoint does **not** support `aspectRatio` overrides.
</Warning>
Use `outputFormat: "png"` when you want PNG output. fal does not declare an
explicit transparent-background control in OpenClaw, so `background:
"transparent"` is reported as an ignored override for fal models.
To use fal as the default image provider:
```json5

@@ -62,6 +62,7 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/Mattermost (plugi
- [Qianfan](/providers/qianfan)
- [Qwen Cloud](/providers/qwen)
- [Runway](/providers/runway)
- [SenseAudio](/providers/senseaudio)
- [SGLang (local models)](/providers/sglang)
- [StepFun](/providers/stepfun)
- [Synthetic](/providers/synthetic)
@@ -89,6 +90,7 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/Mattermost (plugi
- [ElevenLabs](/providers/elevenlabs#speech-to-text)
- [Mistral](/providers/mistral#audio-transcription-voxtral)
- [OpenAI](/providers/openai#speech-to-text)
- [SenseAudio](/providers/senseaudio)
- [xAI](/providers/xai#speech-to-text)
## Community tools

@@ -108,6 +108,38 @@ export LITELLM_API_KEY="sk-litellm-key"
## Advanced configuration
### Image generation
LiteLLM can also back the `image_generate` tool through OpenAI-compatible
`/images/generations` and `/images/edits` routes. Configure a LiteLLM image
model under `agents.defaults.imageGenerationModel`:
```json5
{
  models: {
    providers: {
      litellm: {
        baseUrl: "http://localhost:4000",
        apiKey: "${LITELLM_API_KEY}",
      },
    },
  },
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "litellm/gpt-image-2",
        timeoutMs: 180_000,
      },
    },
  },
}
```
Loopback LiteLLM URLs such as `http://localhost:4000` work without a global
private-network override. For a LAN-hosted proxy, set
`models.providers.litellm.request.allowPrivateNetwork: true` because the API key
will be sent to the configured proxy host.
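The loopback versus LAN distinction above can be pictured as a small host check. The following is a minimal sketch under stated assumptions: `isLoopbackBaseUrl` is a hypothetical helper written for illustration, not an OpenClaw export, and the real network guard may classify hosts differently.

```typescript
// Hypothetical helper, not part of OpenClaw: reports whether a proxy base URL
// points at the local machine (loopback) rather than a LAN or public host.
function isLoopbackBaseUrl(baseUrl: string): boolean {
  const host = new URL(baseUrl).hostname.toLowerCase();
  // WHATWG URL keeps the brackets around IPv6 literals in `hostname`.
  if (host === "localhost" || host === "[::1]") return true;
  // 127.0.0.0/8 is reserved for IPv4 loopback.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}

// A LAN-hosted proxy is not loopback, so it would need the explicit
// `request.allowPrivateNetwork: true` opt-in described above.
const needsPrivateNetworkOptIn = !isLoopbackBaseUrl("http://192.168.1.20:4000");
```

In this sketch `http://localhost:4000` counts as loopback, while a `192.168.x.x` proxy does not.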
<AccordionGroup>
<Accordion title="Virtual keys">
Create a dedicated key for OpenClaw with spend limits:

@@ -12,7 +12,7 @@ MiniMax also provides:
- Bundled speech synthesis via T2A v2
- Bundled image understanding via `MiniMax-VL-01`
- Bundled music generation via `music-2.5+`
- Bundled music generation via `music-2.6`
- Bundled `web_search` through the MiniMax Coding Plan search API
Provider split:
@@ -20,7 +20,7 @@ Provider split:
| Provider ID | Auth | Capabilities |
| ---------------- | ------- | --------------------------------------------------------------- |
| `minimax` | API key | Text, image generation, image understanding, speech, web search |
| `minimax-portal` | OAuth | Text, image generation, image understanding |
| `minimax-portal` | OAuth | Text, image generation, image understanding, speech |
## Built-in catalog
@@ -30,7 +30,7 @@ Provider split:
| `MiniMax-M2.7-highspeed` | Chat (reasoning) | Faster M2.7 reasoning tier |
| `MiniMax-VL-01` | Vision | Image understanding model |
| `image-01` | Image generation | Text-to-image and image-to-image editing |
| `music-2.5+` | Music generation | Default music model |
| `music-2.6` | Music generation | Default music model |
| `music-2.5` | Music generation | Previous music generation tier |
| `music-2.0` | Music generation | Legacy music generation tier |
| `MiniMax-Hailuo-2.3` | Video generation | Text-to-video and image reference flows |
@@ -251,6 +251,16 @@ The bundled `minimax` plugin registers MiniMax T2A v2 as a speech provider for
- Default TTS model: `speech-2.8-hd`
- Default voice: `English_expressive_narrator`
- Supported bundled model ids include `speech-2.8-hd`, `speech-2.8-turbo`,
`speech-2.6-hd`, `speech-2.6-turbo`, `speech-02-hd`,
`speech-02-turbo`, `speech-01-hd`, and `speech-01-turbo`.
- Auth resolves in this order: `messages.tts.providers.minimax.apiKey`, then
`minimax-portal` OAuth/token auth profiles, then Token Plan environment
keys (`MINIMAX_OAUTH_TOKEN`, `MINIMAX_CODE_PLAN_KEY`,
`MINIMAX_CODING_API_KEY`), then `MINIMAX_API_KEY`.
- If no TTS host is configured, OpenClaw reuses the configured
`minimax-portal` OAuth host and strips Anthropic-compatible path suffixes
such as `/anthropic`.
- Normal audio attachments stay MP3.
- Voice-note targets such as Feishu and Telegram are transcoded from MiniMax
MP3 to 48kHz Opus with `ffmpeg`, because the Feishu/Lark file API only
@@ -272,7 +282,7 @@ The bundled `minimax` plugin registers MiniMax T2A v2 as a speech provider for
The bundled `minimax` plugin also registers music generation through the shared
`music_generate` tool.
- Default music model: `minimax/music-2.5+`
- Default music model: `minimax/music-2.6`
- Also supports `minimax/music-2.5` and `minimax/music-2.0`
- Prompt controls: `lyrics`, `instrumental`, `durationSeconds`
- Output format: `mp3`
@@ -285,7 +295,7 @@ To use MiniMax as the default music provider:
agents: {
defaults: {
musicGenerationModel: {
primary: "minimax/music-2.5+",
primary: "minimax/music-2.6",
},
},
},

@@ -23,17 +23,18 @@ changing config.
| Goal | Use | Notes |
| --------------------------------------------- | -------------------------------------------------------- | ---------------------------------------------------------------------------- |
| Direct API-key billing | `openai/gpt-5.4` | Set `OPENAI_API_KEY` or run OpenAI API-key onboarding. |
| Direct API-key billing | `openai/gpt-5.5` | Set `OPENAI_API_KEY` or run OpenAI API-key onboarding. |
| GPT-5.5 with ChatGPT/Codex subscription auth | `openai-codex/gpt-5.5` | Default PI route for Codex OAuth. Best first choice for subscription setups. |
| GPT-5.5 with native Codex app-server behavior | `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"` | Uses the Codex app-server harness, not the public OpenAI API route. |
| GPT-5.5 with native Codex app-server behavior | `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"` | Forces the Codex app-server harness for that model ref. |
| Image generation or editing | `openai/gpt-image-2` | Works with either `OPENAI_API_KEY` or OpenAI Codex OAuth. |
| Transparent-background images | `openai/gpt-image-1.5` | Use `outputFormat=png` or `webp` and `openai.background=transparent`. |
<Note>
GPT-5.5 is currently available in OpenClaw through subscription/OAuth routes:
`openai-codex/gpt-5.5` with the PI runner, or `openai/gpt-5.5` with the
Codex app-server harness. Direct API-key access for `openai/gpt-5.5` is
supported once OpenAI enables GPT-5.5 on the public API; until then use an
API-enabled model such as `openai/gpt-5.4` for `OPENAI_API_KEY` setups.
GPT-5.5 is available through both direct OpenAI Platform API-key access and
subscription/OAuth routes. Use `openai/gpt-5.5` for direct `OPENAI_API_KEY`
traffic, `openai-codex/gpt-5.5` for Codex OAuth through PI, or
`openai/gpt-5.5` with `embeddedHarness.runtime: "codex"` for the native Codex
app-server harness.
</Note>
<Note>
@@ -93,16 +94,14 @@ Choose your preferred auth method and follow the setup steps.
| Model ref | Route | Auth |
|-----------|-------|------|
| `openai/gpt-5.4` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.5` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.4-mini` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.5` | Future direct API route once OpenAI enables GPT-5.5 on the API | `OPENAI_API_KEY` |
<Note>
`openai/*` is the direct OpenAI API-key route unless you explicitly force
the Codex app-server harness. GPT-5.5 itself is currently subscription/OAuth
only; use `openai-codex/*` for Codex OAuth through the default PI runner, or
use `openai/gpt-5.5` with `embeddedHarness.runtime: "codex"` for native
Codex app-server execution.
the Codex app-server harness. Use `openai-codex/*` for Codex OAuth through
the default PI runner, or use `openai/gpt-5.5` with
`embeddedHarness.runtime: "codex"` for native Codex app-server execution.
</Note>
### Config example
@@ -110,7 +109,7 @@ Choose your preferred auth method and follow the setup steps.
```json5
{
env: { OPENAI_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "openai/gpt-5.4" } } },
agents: { defaults: { model: { primary: "openai/gpt-5.5" } } },
}
```
@@ -256,8 +255,33 @@ See [Image Generation](/tools/image-generation) for shared tool parameters, prov
</Note>
`gpt-image-2` is the default for both OpenAI text-to-image generation and image
editing. `gpt-image-1` remains usable as an explicit model override, but new
OpenAI image workflows should use `openai/gpt-image-2`.
editing. `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini` remain usable as
explicit model overrides. Use `openai/gpt-image-1.5` for transparent-background
PNG/WebP output; the current `gpt-image-2` API rejects
`background: "transparent"`.
For a transparent-background request, agents should call `image_generate` with
`model: "openai/gpt-image-1.5"`, `outputFormat: "png"` or `"webp"`, and
`background: "transparent"`; the older `openai.background` provider option is
still accepted. OpenClaw also protects the public OpenAI and
OpenAI Codex OAuth routes by rewriting default `openai/gpt-image-2` transparent
requests to `gpt-image-1.5`; Azure and custom OpenAI-compatible endpoints keep
their configured deployment/model names.
The same setting is exposed for headless CLI runs:
```bash
openclaw infer image generate \
--model openai/gpt-image-1.5 \
--output-format png \
--background transparent \
--prompt "A simple red circle sticker on a transparent background" \
--json
```
Use the same `--output-format` and `--background` flags with
`openclaw infer image edit` when starting from an input file.
`--openai-background` remains available as an OpenAI-specific alias.
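The transparent-background rewrite described above can be sketched as a pure function. This is an illustration only: `ImageRequest`, `resolveImageModel`, and the `publicOpenAIRoute` flag are hypothetical names, and how OpenClaw detects the public OpenAI or Codex OAuth routes is assumed to be decided by the caller.

```typescript
// Hypothetical request shape for illustration; not an OpenClaw type.
type ImageRequest = {
  model: string;
  background?: string;
  // Assumption: the caller already knows whether this request targets the
  // public OpenAI or OpenAI Codex OAuth route (not Azure or a custom endpoint).
  publicOpenAIRoute: boolean;
};

function resolveImageModel(request: ImageRequest): string {
  const wantsTransparent = request.background === "transparent";
  const isDefaultModel = request.model === "openai/gpt-image-2";
  // Only the default model on public routes is rewritten; Azure and custom
  // endpoints keep their configured deployment/model names.
  if (wantsTransparent && isDefaultModel && request.publicOpenAIRoute) {
    return "openai/gpt-image-1.5";
  }
  return request.model;
}
```

Keeping the rule in one function makes the two exemptions (non-transparent requests and non-public routes) easy to see and to test.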
For Codex OAuth installs, keep the same `openai/gpt-image-2` ref. When an
`openai-codex` OAuth profile is configured, OpenClaw resolves that stored OAuth
@@ -277,6 +301,12 @@ Generate:
/tool image_generate model=openai/gpt-image-2 prompt="A polished launch poster for OpenClaw on macOS" size=3840x2160 count=1
```
Generate a transparent PNG:
```
/tool image_generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
```
Edit:
```
@@ -311,7 +341,7 @@ See [Video Generation](/tools/video-generation) for shared tool parameters, prov
## GPT-5 prompt contribution
OpenClaw adds a shared GPT-5 prompt contribution for GPT-5-family runs across providers. It applies by model id, so `openai-codex/gpt-5.5`, `openai/gpt-5.4`, `openrouter/openai/gpt-5.5`, `opencode/gpt-5.5`, and other compatible GPT-5 refs receive the same overlay. Older GPT-4.x models do not.
OpenClaw adds a shared GPT-5 prompt contribution for GPT-5-family runs across providers. It applies by model id, so `openai-codex/gpt-5.5`, `openai/gpt-5.5`, `openrouter/openai/gpt-5.5`, `opencode/gpt-5.5`, and other compatible GPT-5 refs receive the same overlay. Older GPT-4.x models do not.
The bundled native Codex harness uses the same GPT-5 behavior and heartbeat overlay through Codex app-server developer instructions, so `openai/gpt-5.x` sessions forced through `embeddedHarness.runtime: "codex"` keep the same follow-through and proactive heartbeat guidance even though Codex owns the rest of the harness prompt.
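Matching by model id rather than by provider can be pictured with a short predicate. The sketch below is an assumption about how such a check could look, not OpenClaw's actual matcher: `isGpt5FamilyRef` is a hypothetical name, and the real rule may accept or reject ids this pattern does not.

```typescript
// Hypothetical predicate: looks only at the final path segment of a model ref,
// so the provider prefix (openai, openai-codex, openrouter/openai, ...) is ignored.
function isGpt5FamilyRef(modelRef: string): boolean {
  const modelId = modelRef.split("/").pop() ?? "";
  // Matches "gpt-5", "gpt-5.5", "gpt-5-mini", but not "gpt-50" or "gpt-4.1".
  return /^gpt-5(?:[.-]|$)/.test(modelId);
}
```

Under this sketch, `openai-codex/gpt-5.5` and `openrouter/openai/gpt-5.5` both match, while `openai/gpt-4.1` does not.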
@@ -603,7 +633,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { transport: "auto" },
},
"openai-codex/gpt-5.5": {
@@ -630,7 +660,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { openaiWsWarmup: false },
},
},
@@ -654,7 +684,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": { params: { fastMode: true } },
"openai/gpt-5.5": { params: { fastMode: true } },
},
},
},
@@ -675,7 +705,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": { params: { serviceTier: "priority" } },
"openai/gpt-5.5": { params: { serviceTier: "priority" } },
},
},
},
@@ -723,7 +753,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: {
responsesServerCompaction: true,
responsesCompactThreshold: 120000,
@@ -741,7 +771,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { responsesServerCompaction: false },
},
},

@@ -18,23 +18,26 @@ provider id `opencode-go` so upstream per-model routing stays correct.
## Built-in catalog
OpenClaw sources the Go catalog from the bundled pi model registry. Run
OpenClaw sources most Go catalog rows from the bundled pi model registry and
supplements current upstream rows while the registry catches up. Run
`openclaw models list --provider opencode-go` for the current model list.
As of the bundled pi catalog, the provider includes:
The provider includes:
| Model ref | Name |
| -------------------------- | --------------------- |
| `opencode-go/glm-5` | GLM-5 |
| `opencode-go/glm-5.1` | GLM-5.1 |
| `opencode-go/kimi-k2.5` | Kimi K2.5 |
| `opencode-go/kimi-k2.6` | Kimi K2.6 (3x limits) |
| `opencode-go/mimo-v2-omni` | MiMo V2 Omni |
| `opencode-go/mimo-v2-pro` | MiMo V2 Pro |
| `opencode-go/minimax-m2.5` | MiniMax M2.5 |
| `opencode-go/minimax-m2.7` | MiniMax M2.7 |
| `opencode-go/qwen3.5-plus` | Qwen3.5 Plus |
| `opencode-go/qwen3.6-plus` | Qwen3.6 Plus |
| Model ref | Name |
| ------------------------------- | --------------------- |
| `opencode-go/glm-5` | GLM-5 |
| `opencode-go/glm-5.1` | GLM-5.1 |
| `opencode-go/kimi-k2.5` | Kimi K2.5 |
| `opencode-go/kimi-k2.6` | Kimi K2.6 (3x limits) |
| `opencode-go/deepseek-v4-pro` | DeepSeek V4 Pro |
| `opencode-go/deepseek-v4-flash` | DeepSeek V4 Flash |
| `opencode-go/mimo-v2-omni` | MiMo V2 Omni |
| `opencode-go/mimo-v2-pro` | MiMo V2 Pro |
| `opencode-go/minimax-m2.5` | MiniMax M2.5 |
| `opencode-go/minimax-m2.7` | MiniMax M2.7 |
| `opencode-go/qwen3.5-plus` | Qwen3.5 Plus |
| `opencode-go/qwen3.6-plus` | Qwen3.6 Plus |
## Getting started

@@ -71,13 +71,14 @@ OpenRouter can also back the `image_generate` tool. Use an OpenRouter image mode
defaults: {
imageGenerationModel: {
primary: "openrouter/google/gemini-3.1-flash-image-preview",
timeoutMs: 180_000,
},
},
},
}
```
OpenClaw sends image requests to OpenRouter's chat completions image API with `modalities: ["image", "text"]`. Gemini image models receive supported `aspectRatio` and `resolution` hints through OpenRouter's `image_config`.
OpenClaw sends image requests to OpenRouter's chat completions image API with `modalities: ["image", "text"]`. Gemini image models receive supported `aspectRatio` and `resolution` hints through OpenRouter's `image_config`. Use `agents.defaults.imageGenerationModel.timeoutMs` for slower OpenRouter image models; the `image_generate` tool's per-call `timeoutMs` parameter still wins.
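The timeout precedence described above (per-call value first, then the configured default) can be sketched in one expression. `resolveImageTimeoutMs` and the `fallbackMs` parameter are hypothetical; the built-in default that applies when neither value is set is not stated here, so the sketch takes it as an argument.

```typescript
// Hypothetical precedence helper: the tool call's own timeoutMs wins, then
// agents.defaults.imageGenerationModel.timeoutMs, then a caller-supplied fallback.
function resolveImageTimeoutMs(
  perCallTimeoutMs: number | undefined,
  configuredTimeoutMs: number | undefined,
  fallbackMs: number,
): number {
  return perCallTimeoutMs ?? configuredTimeoutMs ?? fallbackMs;
}
```

Using `??` rather than `||` means only `undefined` (or `null`) falls through to the next source.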
## Text-to-speech

@@ -0,0 +1,65 @@
---
summary: "SenseAudio batch speech-to-text for inbound voice notes"
read_when:
- You want SenseAudio speech-to-text for audio attachments
- You need the SenseAudio API key env var or audio config path
title: "SenseAudio"
---
# SenseAudio
SenseAudio can transcribe inbound audio/voice-note attachments through
OpenClaw's shared `tools.media.audio` pipeline. OpenClaw posts multipart audio
to the OpenAI-compatible transcription endpoint and injects the returned text
as `{{Transcript}}` plus an `[Audio]` block.
| Detail | Value |
| ------------- | ------------------------------------------------ |
| Website | [senseaudio.cn](https://senseaudio.cn) |
| Docs | [senseaudio.cn/docs](https://senseaudio.cn/docs) |
| Auth | `SENSEAUDIO_API_KEY` |
| Default model | `senseaudio-asr-pro-1.5-260319` |
| Default URL | `https://api.senseaudio.cn/v1` |
## Getting started
<Steps>
<Step title="Set your API key">
```bash
export SENSEAUDIO_API_KEY="..."
```
</Step>
<Step title="Enable the audio provider">
```json5
{
  tools: {
    media: {
      audio: {
        enabled: true,
        models: [{ provider: "senseaudio", model: "senseaudio-asr-pro-1.5-260319" }],
      },
    },
  },
}
```
</Step>
<Step title="Send a voice note">
Send an audio message through any connected channel. OpenClaw uploads the
audio to SenseAudio and uses the transcript in the reply pipeline.
</Step>
</Steps>
## Options
| Option     | Path                                  | Description                             |
| ---------- | ------------------------------------- | --------------------------------------- |
| `model`    | `tools.media.audio.models[].model`    | SenseAudio ASR model id                 |
| `language` | `tools.media.audio.models[].language` | Optional language hint                  |
| `prompt`   | `tools.media.audio.prompt`            | Optional transcription prompt           |
| `baseUrl`  | `tools.media.audio.baseUrl` or model  | Override the OpenAI-compatible base URL |
| `headers`  | `tools.media.audio.request.headers`   | Extra request headers                   |
<Note>
SenseAudio is batch STT only in OpenClaw. Voice Call realtime transcription
continues to use providers with streaming STT support.
</Note>
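For orientation, a multipart transcription request against an OpenAI-compatible base URL might be assembled as in the sketch below. Everything beyond the base URL and model id is an assumption borrowed from the OpenAI transcription convention: the `/audio/transcriptions` path, the `file`, `model`, and `language` field names, and Bearer auth. `buildTranscriptionRequest` is a hypothetical helper, not OpenClaw's implementation.

```typescript
// Hypothetical request builder. It only assembles the request; sending it is
// left to the caller (for example with fetch(request.url, request.init)).
function buildTranscriptionRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  audio: Blob,
  language?: string,
) {
  const form = new FormData();
  // Field names follow the OpenAI-compatible convention (assumption).
  form.append("file", audio, "voice-note.ogg");
  form.append("model", model);
  if (language) form.append("language", language);
  return {
    // Trim trailing slashes so the path joins cleanly (path is an assumption).
    url: `${baseUrl.replace(/\/+$/, "")}/audio/transcriptions`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}
```

The `Content-Type` header is deliberately omitted so the runtime can set the multipart boundary itself.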

@@ -123,6 +123,15 @@ Use the table below to pick the right model for your use case.
</Tip>
## DeepSeek V4 replay behavior
If Venice exposes DeepSeek V4 models such as `venice/deepseek-v4-pro` or
`venice/deepseek-v4-flash`, OpenClaw fills the required DeepSeek V4
`reasoning_content` replay placeholder on assistant tool-call turns when the
proxy omits it. Venice rejects DeepSeek's native top-level `thinking` control,
so OpenClaw keeps that provider-specific replay fix separate from the native
DeepSeek provider's thinking controls.
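The replay fix above can be sketched as a pass over the outgoing history. This is a minimal illustration, not the shipped wrapper: `ChatMessage` and `fillReasoningPlaceholder` are hypothetical names, and the empty-string placeholder value is an assumption.

```typescript
// Reduced message shape for illustration; real transport messages carry more fields.
type ChatMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content?: string | null;
  tool_calls?: unknown[];
  reasoning_content?: string;
};

// Assumption: an empty string is an acceptable placeholder value.
const REASONING_PLACEHOLDER = "";

function fillReasoningPlaceholder(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((message) => {
    const isToolCallTurn =
      message.role === "assistant" && (message.tool_calls?.length ?? 0) > 0;
    // Leave other turns alone, and never overwrite reasoning the proxy did return.
    if (!isToolCallTurn || message.reasoning_content !== undefined) return message;
    return { ...message, reasoning_content: REASONING_PLACEHOLDER };
  });
}
```

The function returns new objects only for the turns it patches, so existing reasoning content and non-tool-call turns pass through untouched.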
## Built-in catalog (41 total)
<AccordionGroup>

View File

@@ -132,12 +132,14 @@ Legacy aliases still normalize to the canonical bundled ids:
`video_generate` tool.
- Default video model: `xai/grok-imagine-video`
- Modes: text-to-video, image-to-video, remote video edit, and remote video
extension
- Modes: text-to-video, image-to-video, reference-image generation, remote
video edit, and remote video extension
- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`
- Resolutions: `480P`, `720P`
- Duration: 1-15 seconds for generation/image-to-video, 2-10 seconds for
extension
- Duration: 1-15 seconds for generation/image-to-video, 1-10 seconds when
using `reference_image` roles, 2-10 seconds for extension
- Reference-image generation: set `imageRoles` to `reference_image` for
every supplied image; xAI accepts up to 7 such images
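The duration and reference-image limits listed above can be expressed as a small pre-flight check. The sketch below is illustrative: the mode names, `DURATION_BOUNDS`, and `validateXaiVideoRequest` are hypothetical, and only the numeric limits come from the list.

```typescript
// Hypothetical mode labels for illustration; not OpenClaw or xAI identifiers.
type XaiVideoMode = "generation" | "image-to-video" | "reference-image" | "extension";

// Inclusive duration bounds in seconds, taken from the limits listed above.
const DURATION_BOUNDS: Record<XaiVideoMode, readonly [number, number]> = {
  generation: [1, 15],
  "image-to-video": [1, 15],
  "reference-image": [1, 10],
  extension: [2, 10],
};
const MAX_REFERENCE_IMAGES = 7;

function validateXaiVideoRequest(
  mode: XaiVideoMode,
  durationSeconds: number,
  referenceImageCount = 0,
): string[] {
  const errors: string[] = [];
  const [min, max] = DURATION_BOUNDS[mode];
  if (durationSeconds < min || durationSeconds > max) {
    errors.push(`duration must be ${min}-${max} seconds for ${mode}`);
  }
  if (mode === "reference-image" && referenceImageCount > MAX_REFERENCE_IMAGES) {
    errors.push(`at most ${MAX_REFERENCE_IMAGES} reference images are accepted`);
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets a caller report every violated limit at once.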
<Warning>
Local video buffers are not accepted. Use remote `http(s)` URLs for

@@ -53,6 +53,46 @@ OpenAI-compatible endpoint with API-key authentication.
The default model ref is `xiaomi/mimo-v2-flash`. The provider is injected automatically when `XIAOMI_API_KEY` is set or an auth profile exists.
</Tip>
## Text-to-speech
The bundled `xiaomi` plugin also registers Xiaomi MiMo as a speech provider for
`messages.tts`. It calls Xiaomi's chat-completions TTS contract with the text as
an `assistant` message and optional style guidance as a `user` message.
| Property | Value |
| -------- | ---------------------------------------- |
| TTS id | `xiaomi` (`mimo` alias) |
| Auth | `XIAOMI_API_KEY` |
| API | `POST /v1/chat/completions` with `audio` |
| Default | `mimo-v2.5-tts`, voice `mimo_default` |
| Output | MP3 by default; WAV when configured |
```json5
{
  messages: {
    tts: {
      auto: "always",
      provider: "xiaomi",
      providers: {
        xiaomi: {
          apiKey: "xiaomi_api_key",
          model: "mimo-v2.5-tts",
          voice: "mimo_default",
          format: "mp3",
          style: "Bright, natural, conversational tone.",
        },
      },
    },
  },
}
```
Supported built-in voices include `mimo_default`, `default_zh`, `default_en`,
`Mia`, `Chloe`, `Milo`, and `Dean`. `mimo-v2-tts` is supported for older MiMo
TTS accounts; the default uses the current MiMo-V2.5 TTS model. For voice-note
targets such as Feishu and Telegram, OpenClaw transcodes Xiaomi output to 48kHz
Opus with `ffmpeg` before delivery.
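The message layout described in this section (text as an `assistant` message, optional style guidance as a `user` message) might translate into a request body like the sketch below. `buildXiaomiTtsBody` is a hypothetical helper, and two details are assumptions rather than documented behavior: the message order and the exact shape of the `audio` field.

```typescript
// Hypothetical option shape mirroring the config keys shown in this section.
type XiaomiTtsOptions = {
  model: string;
  voice: string;
  format: "mp3" | "wav";
  style?: string;
};

function buildXiaomiTtsBody(text: string, options: XiaomiTtsOptions) {
  const messages: { role: "user" | "assistant"; content: string }[] = [];
  // Optional style guidance travels as a user message (order is an assumption).
  if (options.style) messages.push({ role: "user", content: options.style });
  // The text to be spoken travels as an assistant message.
  messages.push({ role: "assistant", content: text });
  return {
    model: options.model,
    messages,
    // Assumption: voice and output format are carried in the `audio` field.
    audio: { voice: options.voice, format: options.format },
  };
}
```

When no `style` is configured, the body contains only the single `assistant` message.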
## Config example
```json5

@@ -0,0 +1,210 @@
---
summary: "Comprehensive application modernization plan with frontend delivery skill updates"
title: "Application modernization plan"
read_when:
- Planning a broad OpenClaw application modernization pass
- Updating frontend implementation standards for app or Control UI work
- Turning a broad product quality review into phased engineering work
---
# Application modernization plan
## Goal
Move the application toward a cleaner, faster, more maintainable product without
breaking current workflows or hiding risk in broad refactors. The work should
land as small, reviewable slices with proof for each touched surface.
## Principles
- Preserve current architecture unless a boundary is demonstrably causing churn,
performance cost, or user-visible bugs.
- Prefer the smallest correct patch for each issue, then repeat.
- Separate required fixes from optional polish so maintainers can land high
value work without waiting on subjective decisions.
- Keep plugin-facing behavior documented and backwards compatible.
- Verify shipped behavior, dependency contracts, and tests before claiming a
regression is fixed.
- Make the main user path better first: onboarding, auth, chat, provider setup,
plugin management, and diagnostics.
## Phase 1: Baseline audit
Inventory the current application before changing it.
- Identify the top user workflows and the code surfaces that own them.
- List dead affordances, duplicate settings, unclear error states, and expensive
render paths.
- Capture current validation commands for each surface.
- Mark issues as required, recommended, or optional.
- Document known blockers that need owner review, especially API, security,
release, and plugin contract changes.
Definition of done:
- One issue list with repo-root file references.
- Each issue has severity, owner surface, expected user impact, and a proposed
validation path.
- No speculative cleanup items are mixed into required fixes.
## Phase 2: Product and UX cleanup
Prioritize visible workflows and remove confusion.
- Tighten onboarding copy and empty states around model auth, gateway status,
and plugin setup.
- Remove or disable dead affordances where no action is possible.
- Keep important actions visible across responsive widths instead of hiding them
behind fragile layout assumptions.
- Consolidate repeated status language so errors have one source of truth.
- Add progressive disclosure for advanced settings while keeping core setup fast.
Recommended validation:
- Manual happy path for first-run setup and existing user startup.
- Focused tests for any routing, config persistence, or status derivation logic.
- Browser screenshots for changed responsive surfaces.
## Phase 3: Frontend architecture tightening
Improve maintainability without a broad rewrite.
- Move repeated UI state transformations into narrow typed helpers.
- Keep data fetching, persistence, and presentation responsibilities separate.
- Prefer existing hooks, stores, and component patterns over new abstractions.
- Split oversized components only when it reduces coupling or clarifies tests.
- Avoid introducing broad global state for local panel interactions.
Required guardrails:
- Do not change public behavior as a side effect of file splitting.
- Keep accessibility behavior intact for menus, dialogs, tabs, and keyboard
navigation.
- Verify that loading, empty, error, and optimistic states still render.
## Phase 4: Performance and reliability
Target measured pain rather than broad theoretical optimization.
- Measure startup, route transition, large list, and chat transcript costs.
- Replace repeated expensive derived data with memoized selectors or cached
helpers where profiling proves value.
- Reduce avoidable network or filesystem scans on hot paths.
- Keep deterministic ordering for prompt, registry, file, plugin, and network
inputs before model payload construction.
- Add lightweight regression tests for hot helpers and contract boundaries.
Definition of done:
- Each performance change records baseline, expected impact, actual impact, and
remaining gap.
- No perf patch lands solely on intuition when cheap measurement is available.
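As one illustration of the deterministic-ordering item in this phase, inputs can be sorted by a stable key before they are folded into a payload. The helper below is a hypothetical sketch, assuming each entry has a string `id`; it is not an existing OpenClaw utility.

```typescript
// Hypothetical helper: returns a sorted copy so that payload construction sees
// the same order regardless of filesystem, registry, or network arrival order.
function toDeterministicOrder<T extends { id: string }>(entries: readonly T[]): T[] {
  // Plain code-unit comparison avoids locale-dependent collation.
  return [...entries].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

Copying before sorting keeps the caller's array untouched, which makes the helper safe to use on shared registry data.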
## Phase 5: Type, contract, and test hardening
Raise correctness at the boundary points users and plugin authors depend on.
- Replace loose runtime strings with discriminated unions or closed code lists.
- Validate external inputs with existing schema helpers or zod.
- Add contract tests around plugin manifests, provider catalogs, gateway protocol
messages, and config migration behavior.
- Keep compatibility paths in doctor or repair flows instead of startup-time
hidden migrations.
- Avoid test-only coupling to plugin internals; use SDK facades and documented
barrels.
Recommended validation:
- `pnpm check:changed`
- Targeted tests for every changed boundary.
- `pnpm build` when lazy boundaries, packaging, or published surfaces change.
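To make the discriminated-union item in this phase concrete, here is a minimal sketch. The `ProviderStatus` states and `describeStatus` are invented for illustration and are not OpenClaw's real status codes.

```typescript
// Illustrative closed set of states, replacing a loose `status: string`.
type ProviderStatus =
  | { kind: "ready" }
  | { kind: "missing-credentials"; missingKeys: string[] }
  | { kind: "error"; message: string };

function describeStatus(status: ProviderStatus): string {
  switch (status.kind) {
    case "ready":
      return "Ready";
    case "missing-credentials":
      return `Missing: ${status.missingKeys.join(", ")}`;
    case "error":
      return `Error: ${status.message}`;
    default: {
      // Exhaustiveness check: adding a new kind without a case fails to compile.
      const unreachable: never = status;
      return unreachable;
    }
  }
}
```

Because each variant carries only its own fields, callers cannot read `missingKeys` on a `ready` status, and the compiler flags any unhandled state.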
## Phase 6: Documentation and release readiness
Keep user-facing docs aligned with behavior.
- Update docs with behavior, API, config, onboarding, or plugin changes.
- Add changelog entries only for user-visible changes.
- Keep plugin terminology user-facing; use internal package names only where
needed for contributors.
- Confirm release and install instructions still match the current command
surface.
Definition of done:
- Relevant docs are updated in the same branch as behavior changes.
- Generated docs or API drift checks pass when touched.
- The handoff names any skipped validation and why it was skipped.
## Recommended first slice
Start with a scoped Control UI and onboarding pass:
- Audit first-run setup, provider auth readiness, gateway status, and plugin
setup surfaces.
- Remove dead actions and clarify failure states.
- Add or update focused tests for status derivation and config persistence.
- Run `pnpm check:changed`.
This gives high user value with limited architecture risk.
## Frontend skill update
Use this section to update the frontend-focused `SKILL.md` supplied with the
modernization task. If adopting this guidance as a repo-local OpenClaw skill,
create `.agents/skills/openclaw-frontend/SKILL.md` first, keep the frontmatter
that belongs in that target skill, then add or replace the body guidance with
the following content.
```markdown
# Frontend Delivery Standards
Use this skill when implementing or reviewing user-facing React, Next.js,
desktop webview, or app UI work.
## Operating rules
- Start from the existing product workflow and code conventions.
- Prefer the smallest correct patch that improves the current user path.
- Separate required fixes from optional polish in the handoff.
- Do not build marketing pages when the request is for an application surface.
- Keep actions visible and usable across supported viewport sizes.
- Remove dead affordances instead of leaving controls that cannot act.
- Preserve loading, empty, error, success, and permission states.
- Use existing design-system components, hooks, stores, and icons before adding
new primitives.
## Implementation checklist
1. Identify the primary user task and the component or route that owns it.
2. Read the local component patterns before editing.
3. Patch the narrowest surface that solves the issue.
4. Add responsive constraints for fixed-format controls, toolbars, grids, and
counters so text and hover states cannot resize the layout unexpectedly.
5. Keep data loading, state derivation, and rendering responsibilities clear.
6. Add tests when logic, persistence, routing, permissions, or shared helpers
change.
7. Verify the main happy path and the most relevant edge case.
## Visual quality gates
- Text must fit inside its container on mobile and desktop.
- Toolbars may wrap, but controls must remain reachable.
- Buttons should use familiar icons when the icon is clearer than text.
- Cards should be used for repeated items, modals, and framed tools, not for
every page section.
- Avoid one-note color palettes and decorative backgrounds that compete with
operational content.
- Dense product surfaces should optimize for scanning, comparison, and repeated
use.
## Handoff format
Report:
- What changed.
- What user behavior changed.
- Required validation that passed.
- Any validation skipped and the concrete reason.
- Optional follow-up work, clearly separated from required fixes.
```


@@ -14,6 +14,7 @@ Assistant output can carry a small set of delivery/render directives:
- `[embed ...]` for Control UI rich rendering
These directives are separate. `MEDIA:` and reply/voice tags remain delivery metadata; `[embed ...]` is the web-only rich render path.
Trusted tool-result media uses the same `MEDIA:` / `[[audio_as_voice]]` parser before delivery, so text tool outputs can still mark an audio attachment as a voice note.
When block streaming is enabled, `MEDIA:` remains single-delivery metadata for a
turn. If the same media URL is sent in a streamed block and repeated in the final


@@ -32,7 +32,8 @@ title: "Tests"
- Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
- `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
- `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=6`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=8`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=<n>`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. 
The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via `OPENCLAW_LIVE_CLI_BACKEND_SETUP_TIMEOUT_SECONDS` (default 180). Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
- CLI backend live Docker probes can be run as focused lanes, for example `pnpm test:docker:live-cli-backend:codex`, `pnpm test:docker:live-cli-backend:codex:resume`, or `pnpm test:docker:live-cli-backend:codex:mcp`. Claude and Gemini have matching `:resume` and `:mcp` aliases.
- `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
- `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.


@@ -101,6 +101,13 @@ Assistant transcript entries persist the same normalized usage shape, including
returns usage metadata. This gives `/usage cost` and transcript-backed session
status a stable source even after the live runtime state is gone.
OpenClaw keeps provider usage accounting separate from the current context
snapshot. Provider `usage.total` can include cached input, output, and multiple
tool-loop model calls, so it is useful for cost and telemetry but can overstate
the live context window. Context displays and diagnostics use the latest prompt
snapshot (`promptTokens`, or the last model call when no prompt snapshot is
available) for `context.used`.
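A minimal sketch of that separation, assuming a hypothetical usage shape: only `promptTokens`, `usage.total`, and `context.used` are named in the text, while the `calls` array and the choice of the last call's `input` count as the fallback are assumptions made for illustration.

```typescript
// Hypothetical shapes; `calls` and its fields are assumptions for illustration.
interface ModelCallUsage {
  input: number;
  output: number;
}

interface RunUsage {
  total: number; // provider accounting across the whole run (cost, telemetry)
  promptTokens?: number; // latest prompt snapshot, when available
  calls: ModelCallUsage[]; // per-call usage in call order
}

// Derive the value shown as `context.used` without consulting `total`.
function contextUsed(usage: RunUsage): number | undefined {
  if (usage.promptTokens !== undefined) {
    return usage.promptTokens;
  }
  // Fall back to the last model call when no prompt snapshot exists.
  const last = usage.calls[usage.calls.length - 1];
  return last === undefined ? undefined : last.input;
}
```

In a tool loop with several model calls, `total` keeps growing with each call, while the value returned here tracks only what the most recent prompt actually occupied.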
## Cost estimation (when shown)
Costs are estimated from your model pricing config:


@@ -8,11 +8,12 @@ title: "Transcript hygiene"
---
This document describes **provider-specific fixes** applied to transcripts before a run
(building model context). These are **in-memory** adjustments used to satisfy strict
provider requirements. These hygiene steps do **not** rewrite the stored JSONL transcript
on disk; however, a separate session-file repair pass may rewrite malformed JSONL files
by dropping invalid lines before the session is loaded. When a repair occurs, the original
file is backed up alongside the session file.
(building model context). Most of these are **in-memory** adjustments used to satisfy
strict provider requirements. A separate session-file repair pass may also rewrite
stored JSONL before the session is loaded, either by dropping malformed JSONL lines or
by repairing persisted turns that are syntactically valid but known to be rejected by a
provider during replay. When a repair occurs, the original file is backed up alongside
the session file.
Scope includes:
@@ -22,8 +23,10 @@ Scope includes:
- Tool result pairing repair
- Turn validation / ordering
- Thought signature cleanup
- Thinking signature cleanup
- Image payload sanitization
- User-input provenance tagging (for inter-session routed prompts)
- Empty assistant error-turn repair for Bedrock Converse replay
If you need transcript storage details, see:
@@ -131,6 +134,26 @@ external end-user instructions.
- Tool result pairing repair and synthetic tool results.
- Turn validation (merge consecutive user turns to satisfy strict alternation).
- Thinking blocks with missing, empty, or blank replay signatures are stripped
before provider conversion. If that empties an assistant turn, OpenClaw keeps
the turn shape by substituting non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
non-empty omitted-reasoning text so provider adapters do not drop the replay
turn.
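The stripping rule above can be sketched roughly as follows. The block and turn shapes are hypothetical, and the placeholder wording is an assumption; the docs only require that the replacement text be non-empty.

```typescript
// Hypothetical block and turn shapes, for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string };

interface AssistantTurn {
  role: "assistant";
  content: Block[];
}

// Assumed placeholder wording; any non-empty text would satisfy the rule.
const OMITTED_REASONING_TEXT = "[reasoning omitted]";

// A thinking block is replayable only with a non-blank signature.
function isReplayable(block: Block): boolean {
  if (block.type !== "thinking") {
    return true;
  }
  return typeof block.signature === "string" && block.signature.trim().length > 0;
}

function stripInvalidThinking(turn: AssistantTurn): AssistantTurn {
  const kept = turn.content.filter(isReplayable);
  if (kept.length === turn.content.length) {
    return turn; // nothing was stripped
  }
  // Keep the turn in place so strict alternation still holds.
  const content: Block[] =
    kept.length > 0 ? kept : [{ type: "text", text: OMITTED_REASONING_TEXT }];
  return { ...turn, content };
}
```

Returning a new turn rather than mutating the input matches the in-memory framing of these fixes: the stored transcript entry is left alone unless a separate repair pass rewrites it.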
**Amazon Bedrock (Converse API)**
- Empty assistant stream-error turns are repaired to a non-empty fallback text block
before replay. Bedrock Converse rejects assistant messages with `content: []`, so
persisted assistant turns with `stopReason: "error"` and empty content are also
repaired on disk before load.
- Claude thinking blocks with missing, empty, or blank replay signatures are
stripped before Converse replay. If that empties an assistant turn, OpenClaw
keeps the turn shape by substituting non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
non-empty omitted-reasoning text so the Converse replay keeps strict turn shape.
- Replay filters OpenClaw delivery-mirror and gateway-injected assistant turns.
- Image sanitization applies through the global rule.
**Mistral (including model-id based detection)**


@@ -36,7 +36,7 @@ For a high-level overview, see [Onboarding (CLI)](/start/wizard).
- **OpenAI Code (Codex) subscription (device pairing)**: browser pairing flow with a short-lived device code.
- Sets `agents.defaults.model` to `openai-codex/gpt-5.5` when model is unset or already OpenAI-family.
- **OpenAI API key**: uses `OPENAI_API_KEY` if present or prompts for a key, then stores it in auth profiles.
- Sets `agents.defaults.model` to `openai/gpt-5.4` when model is unset, `openai/*`, or `openai-codex/*`.
- Sets `agents.defaults.model` to `openai/gpt-5.5` when model is unset, `openai/*`, or `openai-codex/*`.
- **xAI (Grok) API key**: prompts for `XAI_API_KEY` and configures xAI as a model provider.
- **OpenCode**: prompts for `OPENCODE_API_KEY` (or `OPENCODE_ZEN_API_KEY`, get it at https://opencode.ai/auth) and lets you pick the Zen or Go catalog.
- **Ollama**: offers **Cloud + Local**, **Cloud only**, or **Local only** first. `Cloud only` prompts for `OLLAMA_API_KEY` and uses `https://ollama.com`; the host-backed modes prompt for the Ollama base URL, discover available models, and auto-pull the selected local model when needed; `Cloud + Local` also checks whether that Ollama host is signed in for cloud access.
@@ -182,7 +182,7 @@ Use this reference page for flag semantics and step ordering.
```bash
openclaw agents add work \
--workspace ~/.openclaw/workspace-work \
--model openai/gpt-5.4 \
--model openai/gpt-5.5 \
--bind whatsapp:biz \
--non-interactive \
--json


@@ -204,7 +204,7 @@ sessions, and auth profiles. Running without `--workspace` launches the wizard.
```bash
openclaw agents add work \
--workspace ~/.openclaw/workspace-work \
--model openai/gpt-5.4 \
--model openai/gpt-5.5 \
--bind whatsapp:biz \
--non-interactive \
--json


@@ -142,7 +142,7 @@ What you set:
<Accordion title="OpenAI API key">
Uses `OPENAI_API_KEY` if present or prompts for a key, then stores the credential in auth profiles.
Sets `agents.defaults.model` to `openai/gpt-5.4` when model is unset, `openai/*`, or `openai-codex/*`.
Sets `agents.defaults.model` to `openai/gpt-5.5` when model is unset, `openai/*`, or `openai-codex/*`.
</Accordion>
<Accordion title="xAI (Grok) API key">


@@ -329,7 +329,8 @@ Interface details:
- `resumeSessionId` (optional): resume an existing ACP session instead of creating a new one. The agent replays its conversation history via `session/load`. Requires `runtime: "acp"`.
- `streamTo` (optional): `"parent"` streams initial ACP run progress summaries back to the requester session as system events.
- When available, accepted responses include `streamLogPath` pointing to a session-scoped JSONL log (`<sessionId>.acp-stream.jsonl`) you can tail for full relay history.
- `model` (optional): explicit model override for the ACP child session. Honored for `runtime: "acp"` so the child uses the requested model instead of silently falling back to the target agent default.
- `model` (optional): explicit model override for the ACP child session. Honored for `runtime: "acp"` so the child uses the requested model instead of silently falling back to the target agent default. Codex ACP spawns normalize OpenClaw Codex refs such as `openai-codex/gpt-5.4` to Codex ACP startup config before `session/new`; slash forms such as `openai-codex/gpt-5.4/high` also set Codex ACP reasoning effort.
- `thinking` (optional): explicit thinking/reasoning effort for the ACP child session. For Codex ACP, `minimal` maps to low effort, `low`/`medium`/`high`/`xhigh` map directly, and `off` omits the reasoning-effort startup override.
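The `model` and `thinking` mappings described above could look roughly like this. It is a sketch under stated assumptions: `parseCodexModelRef`, `toReasoningEffort`, and the `CodexStartupConfig` shape are hypothetical names, and how an explicit `thinking` value interacts with a slash suffix is not specified here, so the two are kept as separate functions.

```typescript
// Hypothetical helper names; the mapping rules come from the text above.
const EFFORTS = ["low", "medium", "high", "xhigh"] as const;
type ReasoningEffort = (typeof EFFORTS)[number];
type ThinkingLevel = "off" | "minimal" | ReasoningEffort;

interface CodexStartupConfig {
  model: string;
  reasoningEffort?: ReasoningEffort;
}

function isEffort(value: string | undefined): value is ReasoningEffort {
  return value !== undefined && EFFORTS.some((effort) => effort === value);
}

// "openai-codex/gpt-5.4/high" -> { model: "gpt-5.4", reasoningEffort: "high" }
function parseCodexModelRef(ref: string): CodexStartupConfig {
  const parts = ref.split("/");
  if (parts[0] === "openai-codex") {
    parts.shift(); // drop the OpenClaw provider prefix
  }
  const model = parts[0] ?? "";
  const suffix = parts[1];
  return isEffort(suffix) ? { model, reasoningEffort: suffix } : { model };
}

// `minimal` maps to low, `off` omits the override, other levels map directly.
function toReasoningEffort(level: ThinkingLevel): ReasoningEffort | undefined {
  if (level === "off") {
    return undefined;
  }
  return level === "minimal" ? "low" : level;
}
```

An unrecognized suffix is simply ignored in this sketch; a real adapter might prefer to reject it.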
## Delivery model
@@ -522,7 +523,8 @@ Notes:
Equivalent operations:
- `/acp model <id>` maps to runtime config key `model`.
- `/acp model <id>` maps to runtime config key `model`. For Codex ACP, OpenClaw normalizes `openai-codex/<model>` to the adapter model id and maps slash reasoning suffixes such as `openai-codex/gpt-5.4/high` to Codex ACP `reasoning_effort`.
- `/acp set thinking <level>` maps to runtime config key `thinking`. For Codex ACP, OpenClaw sends the corresponding `reasoning_effort` where the adapter supports one.
- `/acp permissions <profile>` maps to runtime config key `approval_policy`.
- `/acp timeout <seconds>` maps to runtime config key `timeout`.
- `/acp cwd <path>` updates runtime cwd override directly.


@@ -28,7 +28,10 @@ For local integrations only, the Gateway exposes a small loopback HTTP API:
- State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
- Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`
All endpoints accept `?profile=<name>`.
All endpoints accept `?profile=<name>`. `POST /start?headless=true` requests a
one-shot headless launch for local managed profiles without changing persisted
browser config; attach-only, remote CDP, and existing-session profiles reject
that override because OpenClaw does not launch those browser processes.
If shared-secret gateway auth is configured, browser HTTP routes require auth too:
@@ -122,6 +125,7 @@ All commands accept `--browser-profile <name>` to target a specific profile, and
```bash
openclaw browser status
openclaw browser start
openclaw browser start --headless # one-shot local managed headless launch
openclaw browser stop # also clears emulation on attach-only/remote CDP
openclaw browser tabs
openclaw browser tab # shortcut for current tab
@@ -213,14 +217,14 @@ openclaw browser set device "iPhone 14"
Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
- `click`/`type`/etc require a `ref` from `snapshot` (numeric `12` or role ref `e12`). CSS selectors are intentionally not supported for actions. Use `click-coords` when the visible viewport position is the only reliable target.
- `click`/`type`/etc require a `ref` from `snapshot` (numeric `12`, role ref `e12`, or actionable ARIA ref `ax12`). CSS selectors are intentionally not supported for actions. Use `click-coords` when the visible viewport position is the only reliable target.
- Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
- `upload` can also set file inputs directly via `--input-ref` or `--element`.
Snapshot flags at a glance:
- `--format ai` (default with Playwright): AI snapshot with numeric refs (`aria-ref="<n>"`).
- `--format aria`: accessibility tree, no refs; inspection only.
- `--format aria`: accessibility tree with `axN` refs. When Playwright is available, OpenClaw binds refs with backend DOM ids to the live page so follow-up actions can use them; otherwise treat the output as inspection-only.
- `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see [Gateway configuration](/gateway/configuration-reference#browser)).
- `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
- `--labels` adds a viewport-only screenshot with overlayed ref labels (prints `MEDIA:<path>`).
@@ -243,10 +247,21 @@ OpenClaw supports two “snapshot” styles:
- Add `--urls` when link text is ambiguous and the agent needs concrete
navigation targets.
- **ARIA snapshot (ARIA refs like `ax12`)**: `openclaw browser snapshot --format aria`
- Output: the accessibility tree as structured nodes.
- Actions: `openclaw browser click ax12` works when the snapshot path can bind
the ref through Playwright and Chrome backend DOM ids.
- If Playwright is unavailable, ARIA snapshots can still be useful for
inspection, but refs may not be actionable. Re-snapshot with `--format ai`
or `--interactive` when you need action refs.
Ref behavior:
- Refs are **not stable across navigations**; if something fails, re-run `snapshot` and use a fresh ref.
- If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
- Unknown or stale `axN` refs fail fast instead of falling through to
Playwright's `aria-ref` selector. Run a fresh snapshot on the same tab when
that happens.
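The three ref styles can be told apart by prefix. The classifier below is an illustrative sketch, not the CLI's actual parser; `classifyRef` and the `SnapshotRef` shape are hypothetical.

```typescript
// Hypothetical classifier for the ref formats named above: `12`, `e12`, `ax12`.
type SnapshotRef =
  | { kind: "numeric"; index: number }
  | { kind: "role"; index: number }
  | { kind: "aria"; index: number };

function classifyRef(ref: string): SnapshotRef | undefined {
  const match = /^(ax|e)?(\d+)$/.exec(ref);
  if (match === null) {
    return undefined; // CSS selectors and other strings are not refs
  }
  const index = Number(match[2]);
  switch (match[1]) {
    case "ax":
      return { kind: "aria", index };
    case "e":
      return { kind: "role", index };
    default:
      return { kind: "numeric", index };
  }
}
```

Classifying the ref up front is what lets an unknown or stale `axN` ref fail fast instead of being handed to a different selector engine.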
## Wait power-ups


@@ -31,9 +31,14 @@ Other common Linux launch failures:
found stale `Singleton*` lock files in the managed profile directory. OpenClaw
removes those locks and retries once when the lock points at a dead or
different-host process.
- `Missing X server or $DISPLAY` means OpenClaw is trying to launch a visible
browser on a host without a desktop session. Use `browser.headless: true`,
start `Xvfb`, or run OpenClaw in a real desktop session.
- `Missing X server or $DISPLAY` means a visible browser was explicitly
requested on a host without a desktop session. By default, local managed
profiles now fall back to headless mode on Linux when `DISPLAY` and
`WAYLAND_DISPLAY` are both unset. If you set `OPENCLAW_BROWSER_HEADLESS=0`,
`browser.headless: false`, or `browser.profiles.<name>.headless: false`,
remove that headed override, set `OPENCLAW_BROWSER_HEADLESS=1`, start `Xvfb`,
run `openclaw browser start --headless` for a one-shot managed launch, or run
OpenClaw in a real desktop session.
### Solution 1: Install Google Chrome (Recommended)
@@ -120,14 +125,23 @@ curl -s http://127.0.0.1:18791/tabs
### Config Reference
| Option | Description | Default |
| ------------------------ | -------------------------------------------------------------------- | ----------------------------------------------------------- |
| `browser.enabled` | Enable browser control | `true` |
| `browser.executablePath` | Path to a Chromium-based browser binary (Chrome/Brave/Edge/Chromium) | auto-detected (prefers default browser when Chromium-based) |
| `browser.headless` | Run without GUI | `false` |
| `browser.noSandbox` | Add `--no-sandbox` flag (needed for some Linux setups) | `false` |
| `browser.attachOnly` | Don't launch browser, only attach to existing | `false` |
| `browser.cdpPort` | Chrome DevTools Protocol port | `18800` |
| Option | Description | Default |
| -------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------------- |
| `browser.enabled` | Enable browser control | `true` |
| `browser.executablePath` | Path to a Chromium-based browser binary (Chrome/Brave/Edge/Chromium) | auto-detected (prefers default browser when Chromium-based) |
| `browser.headless` | Run without GUI | `false` |
| `OPENCLAW_BROWSER_HEADLESS` | Per-process override for local managed browser headless mode | unset |
| `browser.noSandbox` | Add `--no-sandbox` flag (needed for some Linux setups) | `false` |
| `browser.attachOnly` | Don't launch browser, only attach to existing | `false` |
| `browser.cdpPort` | Chrome DevTools Protocol port | `18800` |
| `browser.localLaunchTimeoutMs` | Local managed Chrome discovery timeout | `15000` |
| `browser.localCdpReadyTimeoutMs` | Local managed post-launch CDP readiness timeout | `8000` |
On Raspberry Pi, older VPS hosts, or slow storage, raise
`browser.localLaunchTimeoutMs` when Chrome needs more time to expose its CDP HTTP
endpoint. Raise `browser.localCdpReadyTimeoutMs` when launch succeeds but
`openclaw browser start` still reports `not reachable after start`. Values must
be positive integers up to `120000` ms; invalid config values are rejected.
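The stated constraint (positive integers up to `120000` ms) can be expressed as a small validator. This is a sketch of the rule only; `validateTimeoutMs` is a hypothetical name and the real config validation may report errors differently.

```typescript
// Hypothetical validator for the documented timeout range.
const MAX_TIMEOUT_MS = 120000;

function validateTimeoutMs(name: string, value: unknown): number {
  const valid =
    typeof value === "number" &&
    Number.isInteger(value) &&
    value > 0 &&
    value <= MAX_TIMEOUT_MS;
  if (!valid) {
    throw new RangeError(`${name} must be a positive integer up to ${MAX_TIMEOUT_MS} ms`);
  }
  return value as number;
}
```

For example, `validateTimeoutMs("browser.localLaunchTimeoutMs", 30000)` returns the value unchanged, while `0`, `1.5`, `120001`, or the string `"8000"` are rejected.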
### Problem: "No Chrome tabs found for profile=\"user\""


@@ -69,6 +69,24 @@ Browser config changes require a Gateway restart so the plugin can re-register i
## Agent guidance
Tool-profile note: `tools.profile: "coding"` includes `web_search` and
`web_fetch`, but it does not include the full `browser` tool. If the agent or a
spawned sub-agent should use browser automation, add browser at the profile
stage:
```json5
{
tools: {
profile: "coding",
alsoAllow: ["browser"],
},
}
```
For a single agent, use `agents.list[].tools.alsoAllow: ["browser"]`.
`tools.subagents.tools.allow: ["browser"]` alone is not enough because sub-agent
policy is applied after profile filtering.
The browser plugin ships two levels of agent guidance:
- The `browser` tool description carries the compact always-on contract: pick
@@ -129,6 +147,8 @@ Browser settings live in `~/.openclaw/openclaw.json`.
// cdpUrl: "http://127.0.0.1:18792", // legacy single-profile override
remoteCdpTimeoutMs: 1500, // remote CDP HTTP timeout (ms)
remoteCdpHandshakeTimeoutMs: 3000, // remote CDP WebSocket handshake timeout (ms)
localLaunchTimeoutMs: 15000, // local managed Chrome discovery timeout (ms)
localCdpReadyTimeoutMs: 8000, // local managed post-launch CDP readiness timeout (ms)
actionTimeoutMs: 60000, // default browser act timeout (ms)
tabCleanup: {
enabled: true, // default: true
@@ -173,7 +193,15 @@ Browser settings live in `~/.openclaw/openclaw.json`.
- Control service binds to loopback on a port derived from `gateway.port` (default `18791` = gateway + 2). Overriding `gateway.port` or `OPENCLAW_GATEWAY_PORT` shifts the derived ports in the same family.
- Local `openclaw` profiles auto-assign `cdpPort`/`cdpUrl`; set those only for remote CDP. `cdpUrl` defaults to the managed local CDP port when unset.
- `remoteCdpTimeoutMs` applies to remote (non-loopback) CDP HTTP reachability checks; `remoteCdpHandshakeTimeoutMs` applies to remote CDP WebSocket handshakes.
- `remoteCdpTimeoutMs` applies to remote and `attachOnly` CDP HTTP reachability
checks and tab-opening HTTP requests; `remoteCdpHandshakeTimeoutMs` applies to
their CDP WebSocket handshakes.
- `localLaunchTimeoutMs` is the budget for a locally launched managed Chrome
process to expose its CDP HTTP endpoint. `localCdpReadyTimeoutMs` is the
follow-up budget for CDP WebSocket readiness after the process is discovered.
Raise these on Raspberry Pi, low-end VPS, or older hardware where Chromium
starts slowly. Values must be positive integers up to `120000` ms; invalid
config values are rejected.
- `actionTimeoutMs` is the default budget for browser `act` requests when the caller does not pass `timeoutMs`. The client transport adds a small slack window so long waits can finish instead of timing out at the HTTP boundary.
- `tabCleanup` is best-effort cleanup for tabs opened by primary-agent browser sessions. Subagent, cron, and ACP lifecycle cleanup still closes their explicit tracked tabs at session end; primary sessions keep active tabs reusable, then close idle or excess tracked tabs in the background.
@@ -194,12 +222,26 @@ Browser settings live in `~/.openclaw/openclaw.json`.
- `attachOnly: true` means never launch a local browser; only attach if one is already running.
- `headless` can be set globally or per local managed profile. Per-profile values override `browser.headless`, so one locally launched profile can stay headless while another remains visible.
- `executablePath` can be set globally or per local managed profile. Per-profile values override `browser.executablePath`, so different managed profiles can launch different Chromium-based browsers.
- `POST /start?headless=true` and `openclaw browser start --headless` request a
one-shot headless launch for local managed profiles without rewriting
`browser.headless` or profile config. Existing-session, attach-only, and
remote CDP profiles reject the override because OpenClaw does not launch those
browser processes.
- On Linux hosts without `DISPLAY` or `WAYLAND_DISPLAY`, local managed profiles
default to headless automatically when neither the environment nor profile/global
config explicitly chooses headed mode. `openclaw browser status --json`
reports `headlessSource` as `env`, `profile`, `config`,
`request`, `linux-display-fallback`, or `default`.
- `OPENCLAW_BROWSER_HEADLESS=1` forces local managed launches headless for the
current process. `OPENCLAW_BROWSER_HEADLESS=0` forces headed mode for ordinary
starts and returns an actionable error on Linux hosts without a display server;
an explicit `start --headless` request still wins for that one launch.
- `executablePath` can be set globally or per local managed profile. Per-profile values override `browser.executablePath`, so different managed profiles can launch different Chromium-based browsers. Both forms accept `~` for your OS home directory.
- `color` (top-level and per-profile) tints the browser UI so you can see which profile is active.
- Default profile is `openclaw` (managed standalone). Use `defaultProfile: "user"` to opt into the signed-in user browser.
- Auto-detect order: system default browser if Chromium-based; otherwise Chrome → Brave → Edge → Chromium → Chrome Canary.
- `driver: "existing-session"` uses Chrome DevTools MCP instead of raw CDP. Do not set `cdpUrl` for that driver.
- Set `browser.profiles.<name>.userDataDir` when an existing-session profile should attach to a non-default Chromium user profile (Brave, Edge, etc.).
- Set `browser.profiles.<name>.userDataDir` when an existing-session profile should attach to a non-default Chromium user profile (Brave, Edge, etc.). This path also accepts `~` for your OS home directory.
</Accordion>
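The headless rules in the notes above can be read as a precedence chain. The sketch below is one plausible reading, not the actual implementation: the input shape and `resolveHeadless` are hypothetical, and the ordering of `env` ahead of `profile` and `config` is inferred from the word "forces" rather than stated outright.

```typescript
// Source labels match the `headlessSource` values reported by status output.
type HeadlessSource =
  | "request"
  | "env"
  | "profile"
  | "config"
  | "linux-display-fallback"
  | "default";

// Hypothetical input shape for illustration.
interface HeadlessInputs {
  requestHeadless?: boolean; // one-shot `start --headless`
  envValue?: string; // OPENCLAW_BROWSER_HEADLESS
  profileHeadless?: boolean; // browser.profiles.<name>.headless
  configHeadless?: boolean; // browser.headless
  isLinux: boolean;
  hasDisplay: boolean; // DISPLAY or WAYLAND_DISPLAY is set
}

interface HeadlessDecision {
  headless: boolean;
  source: HeadlessSource;
  needsDisplay: boolean; // true when headed mode was chosen on Linux without a display
}

function resolveHeadless(input: HeadlessInputs): HeadlessDecision {
  const decide = (headless: boolean, source: HeadlessSource): HeadlessDecision => ({
    headless,
    source,
    needsDisplay: !headless && input.isLinux && !input.hasDisplay,
  });
  if (input.requestHeadless === true) return decide(true, "request");
  if (input.envValue === "1") return decide(true, "env");
  if (input.envValue === "0") return decide(false, "env");
  if (input.profileHeadless !== undefined) return decide(input.profileHeadless, "profile");
  if (input.configHeadless !== undefined) return decide(input.configHeadless, "config");
  if (input.isLinux && !input.hasDisplay) return decide(true, "linux-display-fallback");
  return decide(false, "default");
}
```

A caller would turn `needsDisplay: true` into the actionable `Missing X server or $DISPLAY` style error rather than attempting the launch.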
@@ -209,7 +251,8 @@ Browser settings live in `~/.openclaw/openclaw.json`.
If your **system default** browser is Chromium-based (Chrome/Brave/Edge/etc),
OpenClaw uses it automatically. Set `browser.executablePath` to override
auto-detection. `~` expands to your OS home directory:
auto-detection. Top-level and per-profile `executablePath` values accept `~`
for your OS home directory:
```bash
openclaw config set browser.executablePath "/usr/bin/google-chrome"
@@ -258,6 +301,9 @@ instead, and remote CDP profiles use the browser behind `cdpUrl`.
- **Remote control (node host):** run a node host on the machine that has the browser; the Gateway proxies browser actions to it.
- **Remote CDP:** set `browser.profiles.<name>.cdpUrl` (or `browser.cdpUrl`) to
attach to a remote Chromium-based browser. In this case, OpenClaw will not launch a local browser.
- For externally managed CDP services on loopback (for example Browserless in
Docker published to `127.0.0.1`), also set `attachOnly: true`. Loopback CDP
without `attachOnly` is treated as a local OpenClaw-managed browser profile.
- `headless` only affects local managed profiles that OpenClaw launches. It does not restart or change existing-session or remote CDP browsers.
- `executablePath` follows the same local managed profile rule. Changing it on a
running local managed profile marks that profile for restart/reconcile so the
@@ -331,6 +377,39 @@ Notes:
`wss://` for a direct CDP connection or keep the HTTPS URL and let OpenClaw
discover `/json/version`.
### Browserless Docker on the same host
When Browserless is self-hosted in Docker and OpenClaw runs on the host, treat
Browserless as an externally managed CDP service:
```json5
{
browser: {
enabled: true,
defaultProfile: "browserless",
profiles: {
browserless: {
cdpUrl: "ws://127.0.0.1:3000",
attachOnly: true,
color: "#00AA00",
},
},
},
}
```
The address in `browser.profiles.browserless.cdpUrl` must be reachable from the
OpenClaw process. Browserless must also advertise a matching reachable endpoint;
set Browserless `EXTERNAL` to that same public-to-OpenClaw WebSocket base, such
as `ws://127.0.0.1:3000`, `ws://browserless:3000`, or a stable private Docker
network address. If `/json/version` returns `webSocketDebuggerUrl` pointing at
an address OpenClaw cannot reach, CDP HTTP can look healthy while the WebSocket
attach still fails.
Do not leave `attachOnly` unset for a loopback Browserless profile. Without
`attachOnly`, OpenClaw treats the loopback port as a local managed browser
profile and may report that the port is in use but not owned by OpenClaw.
## Direct WebSocket CDP providers
Some hosted browser services expose a **direct WebSocket** endpoint rather than
@@ -349,10 +428,13 @@ CDP URL shapes and picks the right connection strategy automatically:
[Browserbase](https://www.browserbase.com)). OpenClaw tries HTTP
`/json/version` discovery first (normalising the scheme to `http`/`https`);
if discovery returns a `webSocketDebuggerUrl` it is used, otherwise OpenClaw
falls back to a direct WebSocket handshake at the bare root. This lets a
bare `ws://` pointed at a local Chrome still connect, since Chrome only
accepts WebSocket upgrades on the specific per-target path from
`/json/version`.
falls back to a direct WebSocket handshake at the bare root. If the advertised
WebSocket endpoint rejects the CDP handshake but the configured bare root
accepts it, OpenClaw falls back to that root as well. This lets a bare `ws://`
pointed at a local Chrome still connect, since Chrome only accepts WebSocket
upgrades on the specific per-target path from `/json/version`, while hosted
providers can still use their root WebSocket endpoint when their discovery
endpoint advertises a short-lived URL that is not suitable for Playwright CDP.
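The discovery-then-fallback order for a bare WebSocket URL can be pictured with a small sketch. It only illustrates the ordering described above and is not OpenClaw's code: `discoveryUrl` and `attachCandidates` are hypothetical names, and URLs that carry a query string (for example an API key) are assumed to be out of scope.

```typescript
// Normalise ws -> http and wss -> https, then append the discovery path.
// Assumption: the configured URL has no query string.
function discoveryUrl(cdpUrl: string): string {
  const httpBase = cdpUrl
    .replace(/^ws(s?):\/\//i, "http$1://")
    .replace(/\/+$/, "");
  return `${httpBase}/json/version`;
}

// Ordered WebSocket endpoints to try: the advertised webSocketDebuggerUrl
// first (when discovery returned one), then the configured bare root.
function attachCandidates(cdpUrl: string, discovered: string | null): string[] {
  const candidates = discovered ? [discovered, cdpUrl] : [cdpUrl];
  // Drop duplicates while keeping the first occurrence.
  return candidates.filter((url, index) => candidates.indexOf(url) === index);
}
```

For a local Chrome the first candidate is the one expected to succeed; for a hosted provider that advertises a short-lived URL, the second candidate is the fallback.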
### Browserbase
@@ -452,7 +534,8 @@ Default behavior:
- The built-in `user` profile uses Chrome MCP auto-connect, which targets the
default local Google Chrome profile.
Use `userDataDir` for Brave, Edge, Chromium, or a non-default Chrome profile:
Use `userDataDir` for Brave, Edge, Chromium, or a non-default Chrome profile.
`~` expands to your OS home directory:
```json5
{
@@ -526,6 +609,28 @@ Notes:
browser node. If Chrome lives elsewhere and no browser node is connected, use
remote CDP or a node host instead.
### Custom Chrome MCP launch
Override the spawned Chrome DevTools MCP server per profile when the default
`npx chrome-devtools-mcp@latest` flow is not what you want (offline hosts,
pinned versions, vendored binaries):
| Field | What it does |
| ------------ | -------------------------------------------------------------------------------------------------------------------------- |
| `mcpCommand` | Executable to spawn instead of `npx`. Resolved as-is; absolute paths are honored. |
| `mcpArgs` | Argument array passed verbatim to `mcpCommand`. Replaces the default `chrome-devtools-mcp@latest --autoConnect` arguments. |
When `cdpUrl` is set on an existing-session profile, OpenClaw skips
`--autoConnect` and forwards the endpoint to Chrome MCP automatically:
- `http(s)://...` → `--browserUrl <url>` (DevTools HTTP discovery endpoint).
- `ws(s)://...` → `--wsEndpoint <url>` (direct CDP WebSocket).
Endpoint flags and `userDataDir` cannot be combined: when `cdpUrl` is set,
`userDataDir` is ignored for Chrome MCP launch, since Chrome MCP attaches to
the running browser behind the endpoint rather than opening a profile
directory.
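The scheme-to-flag mapping above can be expressed as a small function. This is an illustrative sketch only: `chromeMcpEndpointArgs` is a hypothetical name, and only the two flags named in the text are used.

```typescript
type EndpointFlag = "--browserUrl" | "--wsEndpoint";

// Pick the Chrome MCP endpoint flag from the cdpUrl scheme.
// Returns null for schemes that the text above does not cover.
function chromeMcpEndpointArgs(cdpUrl: string): [EndpointFlag, string] | null {
  if (/^https?:\/\//i.test(cdpUrl)) return ["--browserUrl", cdpUrl];
  if (/^wss?:\/\//i.test(cdpUrl)) return ["--wsEndpoint", cdpUrl];
  return null;
}
```

For example, `chromeMcpEndpointArgs("ws://127.0.0.1:9222")` yields the `--wsEndpoint` pair, while an `http://` URL yields `--browserUrl`.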
<Accordion title="Existing-session feature limitations">
Compared to the managed `openclaw` profile, existing-session drivers are more constrained:
@@ -593,6 +698,8 @@ Common examples:
- CDP startup or readiness failure:
- `Chrome CDP websocket for profile "openclaw" is not reachable after start`
- `Remote CDP for profile "<name>" is not reachable at <cdpUrl>`
- `Port <port> is in use for profile "<name>" but not by openclaw` when a
loopback external CDP service is configured without `attachOnly: true`
- Navigation SSRF block:
- `open`, `navigate`, snapshot, or tab-opening flows fail with a browser/network policy error while `start` and `tabs` still work


@@ -1,5 +1,5 @@
---
summary: "Generate and edit images using configured providers (OpenAI, OpenAI Codex OAuth, Google Gemini, OpenRouter, fal, MiniMax, ComfyUI, Vydra, xAI)"
summary: "Generate and edit images using configured providers (OpenAI, OpenAI Codex OAuth, Google Gemini, OpenRouter, LiteLLM, fal, MiniMax, ComfyUI, Vydra, xAI)"
read_when:
- Generating images via the agent
- Configuring image generation providers and models
@@ -24,6 +24,8 @@ The tool only appears when at least one image generation provider is available.
defaults: {
imageGenerationModel: {
primary: "openai/gpt-image-2",
// Optional default provider request timeout for image_generate.
timeoutMs: 180_000,
},
},
},
@@ -46,18 +48,22 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
## Common routes
| Goal | Model ref | Auth |
| ---------------------------------------------------- | -------------------------------------------------- | ------------------------------------ |
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
| Goal | Model ref | Auth |
| ---------------------------------------------------- | -------------------------------------------------- | -------------------------------------- |
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
| OpenAI transparent-background PNG/WebP | `openai/gpt-image-1.5` | `OPENAI_API_KEY` or OpenAI Codex OAuth |
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
| LiteLLM image generation | `litellm/gpt-image-2` | `LITELLM_API_KEY` |
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
The same `image_generate` tool handles text-to-image and reference-image
editing. Use `image` for one reference or `images` for multiple references.
Provider-supported output hints such as `quality`, `outputFormat`, and
OpenAI-specific `background` are forwarded when available and reported as
ignored when a provider does not support them.
`background` are forwarded when available and reported as ignored when a
provider does not support them. Current bundled transparent-background support
is OpenAI-specific; other providers may still preserve PNG alpha if their
backend emits it.
## Supported providers
@@ -65,6 +71,7 @@ ignored when a provider does not support them.
| ---------- | --------------------------------------- | ---------------------------------- | ----------------------------------------------------- |
| OpenAI | `gpt-image-2` | Yes (up to 4 images) | `OPENAI_API_KEY` or OpenAI Codex OAuth |
| OpenRouter | `google/gemini-3.1-flash-image-preview` | Yes (up to 5 input images) | `OPENROUTER_API_KEY` |
| LiteLLM | `gpt-image-2` | Yes (up to 5 input images) | `LITELLM_API_KEY` |
| Google | `gemini-3.1-flash-image-preview` | Yes | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
| fal | `fal-ai/flux/dev` | Yes | `FAL_KEY` |
| MiniMax | `image-01` | Yes (subject reference) | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
@@ -89,7 +96,8 @@ Use `"list"` to inspect available providers and models at runtime.
</ParamField>
<ParamField path="model" type="string">
Provider/model override, e.g. `openai/gpt-image-2`.
Provider/model override, e.g. `openai/gpt-image-2`; use
`openai/gpt-image-1.5` for transparent OpenAI backgrounds.
</ParamField>
<ParamField path="image" type="string">
@@ -120,6 +128,11 @@ Quality hint when the provider supports it.
Output format hint when the provider supports it.
</ParamField>
<ParamField path="background" type="'transparent' | 'opaque' | 'auto'">
Background hint when the provider supports it. Use `transparent` with
`outputFormat: "png"` or `"webp"` for transparency-capable providers.
</ParamField>
<ParamField path="count" type="number">
Number of images to generate (1–4).
</ParamField>
@@ -150,6 +163,7 @@ Tool results report the applied settings. When OpenClaw remaps geometry during p
defaults: {
imageGenerationModel: {
primary: "openai/gpt-image-2",
timeoutMs: 180_000,
fallbacks: [
"openrouter/google/gemini-3.1-flash-image-preview",
"google/gemini-3.1-flash-image-preview",
@@ -185,6 +199,8 @@ Notes:
`agents.defaults.mediaGenerationAutoProviderFallback: false` if you want image
generation to use only the explicit `model`, `primary`, and `fallbacks`
entries.
- Set `agents.defaults.imageGenerationModel.timeoutMs` for slow image backends.
A per-call `timeoutMs` tool parameter overrides the configured default.
- Use `action: "list"` to inspect the currently registered providers, their
default models, and auth env-var hints.
@@ -226,9 +242,10 @@ through the Codex Responses backend. Legacy Codex base URLs such as
`https://chatgpt.com/backend-api/codex` for image requests. It does not
silently fall back to `OPENAI_API_KEY` for that request. To force direct OpenAI
Images API routing, configure `models.providers.openai` explicitly with an API
key, custom base URL, or Azure endpoint. The older
`openai/gpt-image-1` model can still be selected explicitly, but new OpenAI
image-generation and image-editing requests should use `gpt-image-2`.
key, custom base URL, or Azure endpoint. The `openai/gpt-image-1.5`,
`openai/gpt-image-1`, and `openai/gpt-image-1-mini` models can still be
selected explicitly. Use `gpt-image-1.5` for transparent-background PNG/WebP
output; the current `gpt-image-2` API rejects `background: "transparent"`.
`gpt-image-2` supports both text-to-image generation and reference-image
editing through the same `image_generate` tool. OpenClaw forwards `prompt`,
@@ -253,8 +270,51 @@ OpenAI-specific options live under the `openai` object:
```
`openai.background` accepts `transparent`, `opaque`, or `auto`; transparent
outputs require `outputFormat` `png` or `webp`. `openai.outputCompression`
applies to JPEG/WebP outputs.
outputs require `outputFormat` `png` or `webp` and a transparency-capable OpenAI
image model. OpenClaw routes default `gpt-image-2` transparent-background
requests to `gpt-image-1.5`. `openai.outputCompression` applies to JPEG/WebP
outputs.
The top-level `background` hint is provider-neutral and currently maps to the
same OpenAI `background` request field when the OpenAI provider is selected.
Providers that do not declare background support return it in `ignoredOverrides`
instead of receiving the unsupported parameter.
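One way to picture the forwarded-versus-ignored split is the sketch below. It is not the OpenClaw implementation: `splitHints`, the `Hints` shape, and representing `ignoredOverrides` as a list of hint names are all assumptions made for illustration.

```typescript
type HintName = "quality" | "outputFormat" | "background";
type Hints = Partial<Record<HintName, string>>;

interface SplitResult {
  forwarded: Hints; // hints the provider declares support for
  ignoredOverrides: HintName[]; // hints withheld from the provider request
}

// Forward only the hints a provider declares; report the rest as ignored.
function splitHints(hints: Hints, supported: ReadonlySet<HintName>): SplitResult {
  const forwarded: Hints = {};
  const ignoredOverrides: HintName[] = [];
  for (const name of ["quality", "outputFormat", "background"] as const) {
    const value = hints[name];
    if (value === undefined) continue;
    if (supported.has(name)) forwarded[name] = value;
    else ignoredOverrides.push(name);
  }
  return { forwarded, ignoredOverrides };
}
```

With a provider that declares only `outputFormat`, a request carrying `background: "transparent"` would forward the format and list `background` as ignored.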
When asking an agent for a transparent-background OpenAI image, the expected
tool call is:
```json
{
"model": "openai/gpt-image-1.5",
"prompt": "A simple red circle sticker on a transparent background",
"outputFormat": "png",
"background": "transparent"
}
```
The explicit `openai/gpt-image-1.5` model keeps the request portable across
tool summaries and harnesses. If the agent instead uses the default
`openai/gpt-image-2` with `openai.background: "transparent"` on the public
OpenAI or OpenAI Codex OAuth route, OpenClaw rewrites the provider request to
`gpt-image-1.5`. Azure and custom OpenAI-compatible endpoints keep their
configured deployment/model names.
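The rewrite rule in the paragraph above can be summarised as a small decision function. This is a hedged sketch: `resolveOpenAiImageModel` and the `Route` labels are hypothetical names chosen for illustration, not identifiers from OpenClaw.

```typescript
type Background = "transparent" | "opaque" | "auto";
// Hypothetical labels for the routes mentioned in the text.
type Route = "openai" | "codex-oauth" | "azure" | "custom";

// Swap the default gpt-image-2 for gpt-image-1.5 only when a transparent
// background is requested on the public OpenAI or Codex OAuth route.
function resolveOpenAiImageModel(
  model: string,
  background: Background | undefined,
  route: Route,
): string {
  const isPublicRoute = route === "openai" || route === "codex-oauth";
  if (isPublicRoute && model === "gpt-image-2" && background === "transparent") {
    return "gpt-image-1.5";
  }
  // Azure and custom endpoints keep their configured deployment/model name.
  return model;
}
```

Under these assumptions, an Azure deployment named `gpt-image-2` is passed through unchanged even when `background` is `transparent`.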
For headless CLI generation, use the equivalent `openclaw infer` flags:
```bash
openclaw infer image generate \
--model openai/gpt-image-1.5 \
--output-format png \
--background transparent \
--prompt "A simple red circle sticker on a transparent background" \
--json
```
The same `--output-format` and `--background` flags are available on
`openclaw infer image edit`; `--openai-background` remains available as an
OpenAI-specific alias. Current bundled providers other than OpenAI do not
declare explicit background control, so `background: "transparent"` is reported
as ignored for them.
Generate one 4K landscape image:
@@ -262,6 +322,12 @@ Generate one 4K landscape image:
/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
```
Generate a transparent PNG:
```
/tool image_generate action=generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
```
Generate two square images:
```


@@ -143,6 +143,12 @@ Per-agent override: `agents.list[].tools.profile`.
| `messaging` | `group:messaging`, `sessions_list`, `sessions_history`, `sessions_send`, `session_status` |
| `minimal` | `session_status` only |
`coding` includes lightweight web tools (`web_search`, `web_fetch`, `x_search`)
but not the full browser-control tool. Browser automation can drive real
sessions and logged-in profiles, so add it explicitly with
`tools.alsoAllow: ["browser"]` or a per-agent
`agents.list[].tools.alsoAllow: ["browser"]`.
The `coding` and `messaging` profiles also allow configured bundle MCP tools
under the plugin key `bundle-mcp`. Add `tools.deny: ["bundle-mcp"]` when you
want a profile to keep its normal built-ins but hide all configured MCP tools.

Some files were not shown because too many files have changed in this diff.