Compare commits

118 Commits

Author SHA1 Message Date
Vincent Koc
d49a014424 fix(plugins): stabilize registry package paths 2026-04-25 12:33:17 -07:00
Vincent Koc
0c3a0a3682 fix(plugins): satisfy registry repair lint 2026-04-25 12:33:17 -07:00
Vincent Koc
ee19c91591 fix(plugins): add doctor registry repair 2026-04-25 12:33:16 -07:00
Vincent Koc
64582bb3a7 docs(diagnostics-otel): clarify genai semconv exports 2026-04-25 12:30:14 -07:00
Peter Steinberger
d4971aad2c docs: require feasible live verification 2026-04-25 20:27:21 +01:00
Peter Steinberger
30325f567c fix: use prompt snapshots for live context diagnostics 2026-04-25 20:25:44 +01:00
Peter Steinberger
b732f21a86 fix: clarify voice-call setup diagnostics 2026-04-25 20:24:36 +01:00
Vincent Koc
44648440a5 fix(diagnostics-otel): stabilize genai token metric model attr 2026-04-25 12:22:55 -07:00
Peter Steinberger
75d64cd4b8 feat: expose generic image background option 2026-04-25 20:21:46 +01:00
Peter Steinberger
03fd7df929 fix: remove duplicate diagnostic stability case 2026-04-25 20:21:39 +01:00
Peter Steinberger
d757396785 test(ui): consolidate chat jsdom suites 2026-04-25 20:17:23 +01:00
Peter Steinberger
7436e395d5 test(node-host): cache native binary fixture lookup 2026-04-25 20:17:23 +01:00
Peter Steinberger
f34513ac66 perf(memory): avoid duplicate session store reads 2026-04-25 20:17:22 +01:00
Vincent Koc
5815ca93d9 fix(diagnostics-otel): honor genai usage semconv opt-in 2026-04-25 12:13:50 -07:00
Peter Steinberger
86d897cfaa feat(android): expose talk mode
Co-authored-by: alex-latitude <213670856+alex-latitude@users.noreply.github.com>
2026-04-25 20:12:38 +01:00
Peter Steinberger
791ad0864a fix: strip invalid thinking replay signatures
Fixes #45010.
Supersedes #70054.

Co-authored-by: Chris Staples <chris.staples@sophos.com>
Co-authored-by: Fourier <yang.fourier@gmail.com>
2026-04-25 20:12:30 +01:00
Peter Steinberger
47a63f7acf fix(logging): merge duplicate context diagnostic case 2026-04-25 20:11:08 +01:00
Peter Steinberger
e6ab61762a fix(check): pass lock env to changed lint lanes 2026-04-25 20:11:08 +01:00
Peter Steinberger
1e7ae07772 fix(cli): dedupe onboard auth flags for completion cache 2026-04-25 20:11:08 +01:00
Peter Steinberger
d9486c683b fix: stabilize macos npm update smoke 2026-04-25 20:09:32 +01:00
Peter Steinberger
17401e31de fix: avoid changed gate lint self-lock 2026-04-25 20:09:00 +01:00
mushuiyu_xydt
0e1ef93e84 fix(minimax): use dedicated image generation endpoint (#61155)
* fix(minimax): use dedicated image generation endpoint

MiniMax image generation uses a dedicated API endpoint
(api.minimax.io/v1/image_generation) that is separate from the
text/chat API endpoint (api.minimax.io/anthropic).

Previously, the resolveMinimaxImageBaseUrl function would extract
the origin from the provider's configured baseUrl. If a user had
configured their baseUrl to the chat endpoint (e.g.,
api.minimax.chat/anthropic), the image generation would incorrectly
use that endpoint, resulting in "invalid api key" errors.

This fix always uses the dedicated image generation endpoint,
ignoring the provider's baseUrl configuration for image generation.

Fixes #61149

* fix(minimax): support CN endpoint for image generation

Respect MINIMAX_API_HOST environment variable to determine whether
to use the global (api.minimax.io) or CN (api.minimaxi.com) endpoint
for image generation.

This ensures that CN users who configure MINIMAX_API_HOST to use
api.minimaxi.com will continue to use the CN endpoint for image
generation, while global users continue to use api.minimax.io.

The original bug was caused by the code extracting the origin from
the provider's configured baseUrl, which could be set to incorrect
endpoints like api.minimax.chat. This fix uses the dedicated image
generation endpoints instead.

Fixes #61149

* fix(minimax): infer CN endpoint from provider config when env is unset

When MINIMAX_API_HOST is not set, fall back to checking the provider's
configured baseUrl to determine whether to use the CN or global image
endpoint. This ensures CN users who went through onboarding (which sets
models.providers.minimax.baseUrl to https://api.minimaxi.com/anthropic)
are correctly routed to the CN image endpoint.

The isMinimaxCnHost check ensures we only use the baseUrl origin for
CN detection - invalid endpoints like api.minimax.chat would not match
minimaxi.com and would correctly fall through to the global default.

Fixes #61149

* test(minimax): cover dedicated image endpoints

* fix(logging): handle context assembly diagnostics

* Revert "fix(logging): handle context assembly diagnostics"

This reverts commit f51d2f7d67f8193268dd37553ac77e80a0423390.

* test(minimax): isolate image endpoint env

* docs(changelog): credit minimax image fix

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 20:07:52 +01:00
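The endpoint selection described in commit 0e1ef93e84 can be sketched roughly as follows. This is an illustrative reading of the commit message, not the code from the change: only `MINIMAX_API_HOST`, the two hosts, and the global `/v1/image_generation` path come from the message. The function names, the assumption that the CN host uses the same path, and the host-matching rule are hypothetical.

```typescript
// Hypothetical sketch of the resolution order described in the commit message.
// Assumption: the CN host serves the same /v1/image_generation path as the global host.
const GLOBAL_IMAGE_ORIGIN = "https://api.minimax.io";
const CN_IMAGE_ORIGIN = "https://api.minimaxi.com";
const IMAGE_PATH = "/v1/image_generation";

// Extract a lowercase hostname from a bare host or a full URL; undefined if unparsable.
function hostnameOf(value: string | undefined): string | undefined {
  if (!value) return undefined;
  try {
    const withScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(value) ? value : `https://${value}`;
    return new URL(withScheme).hostname.toLowerCase();
  } catch {
    return undefined;
  }
}

// Only minimaxi.com and its subdomains count as CN; api.minimax.chat does not match.
function isCnHost(hostname: string | undefined): boolean {
  if (!hostname) return false;
  return hostname === "minimaxi.com" || hostname.endsWith(".minimaxi.com");
}

function resolveImageEndpoint(
  envHost: string | undefined,
  providerBaseUrl: string | undefined,
): string {
  // 1. An explicit MINIMAX_API_HOST value decides between CN and global.
  const envHostname = hostnameOf(envHost);
  if (envHostname) {
    return (isCnHost(envHostname) ? CN_IMAGE_ORIGIN : GLOBAL_IMAGE_ORIGIN) + IMAGE_PATH;
  }
  // 2. Otherwise the provider baseUrl is used only to detect CN, never as the origin itself.
  const cn = isCnHost(hostnameOf(providerBaseUrl));
  return (cn ? CN_IMAGE_ORIGIN : GLOBAL_IMAGE_ORIGIN) + IMAGE_PATH;
}
```

Under these assumptions, a misconfigured `https://api.minimax.chat/anthropic` base URL falls through to the global image endpoint, while the onboarding value `https://api.minimaxi.com/anthropic` selects the CN one.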
Quratulain-bilal
7d58362f3f docs(browser): note tilde expansion also covers per-profile paths (#71601)
* docs(browser): note tilde expansion also covers per-profile paths

The 95a2c9b fix expanded "~" for both `browser.executablePath` and
per-profile `profiles.<name>.executablePath` (config.ts:382 calls
`normalizeExecutablePath` for profile overrides). Per-profile
`userDataDir` on existing-session profiles is also tilde-expanded
(config.ts:391 via `resolveUserPath`). The configuration reference
only mentioned the top-level `browser.executablePath` case.

* docs(browser): align tilde path config help

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 20:05:03 +01:00
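For readers unfamiliar with the tilde expansion this commit documents, a minimal sketch is shown below. It is not the repository's `normalizeExecutablePath` or `resolveUserPath`; the function name `expandTildeSketch` and the explicit `homeDir` parameter are hypothetical, chosen so the example stays self-contained.

```typescript
// Hypothetical helper: expands a leading "~" or "~/" against a given home directory.
// Paths such as "~other/x" are left untouched, since they name a different user.
function expandTildeSketch(path: string, homeDir: string): string {
  if (path === "~") return homeDir;
  if (path.startsWith("~/")) {
    // Drop trailing slashes on the home directory to avoid "//" in the result.
    return homeDir.replace(/\/+$/, "") + path.slice(1);
  }
  return path;
}
```

The same helper would apply equally to a top-level `browser.executablePath` and to a per-profile `profiles.<name>.executablePath` or `userDataDir`, which is the point the documentation change makes.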
Vincent Koc
5671fdca87 feat(diagnostics-otel): add genai usage span identity 2026-04-25 12:03:10 -07:00
Peter Steinberger
5eab16e086 fix: improve google meet setup diagnostics 2026-04-25 20:01:24 +01:00
Peter Steinberger
e36b77c13e docs(changelog): drop self-thanks 2026-04-25 20:01:00 +01:00
Peter Steinberger
d68574653e docs(changelog): split 2026.4.24 and 2026.4.25 notes 2026-04-25 19:59:54 +01:00
Quratulain-bilal
8170df9127 docs(browser): document local startup timeout bounds (#71672)
* docs(browser): document local startup timeout bounds

The new browser.localLaunchTimeoutMs and browser.localCdpReadyTimeoutMs
options are clamped to MAX_BROWSER_STARTUP_TIMEOUT_MS (120000 ms) by
normalizeStartupTimeoutMs in extensions/browser/src/browser/config.ts,
and zero/negative/non-finite values fall back to the defaults. Without
this in the configuration reference, users setting a higher value see
no error and silently get the 120 s ceiling, or set 0 expecting 'no
timeout' and silently get the default.

* docs(browser): clarify startup timeout validation

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:59:53 +01:00
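The clamping behavior this commit documents can be illustrated with a short sketch. The 120000 ms ceiling and the fallback for zero, negative, and non-finite values come from the commit message; the function name `normalizeStartupTimeoutSketch` and the caller-supplied default are assumptions, and the real `normalizeStartupTimeoutMs` may differ in detail.

```typescript
// Ceiling taken from the commit message (MAX_BROWSER_STARTUP_TIMEOUT_MS = 120000 ms).
const MAX_BROWSER_STARTUP_TIMEOUT_MS = 120_000;

// Hypothetical sketch: invalid values fall back to the default, large values are clamped.
function normalizeStartupTimeoutSketch(value: unknown, defaultMs: number): number {
  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return defaultMs; // 0 does not mean "no timeout"; it means "use the default"
  }
  return Math.min(value, MAX_BROWSER_STARTUP_TIMEOUT_MS);
}
```

With a hypothetical default of 15000 ms, a configured value of 300000 silently becomes 120000 and a configured value of 0 silently becomes 15000, which is the surprise the documentation change is meant to prevent.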
Peter Steinberger
b66f01bdca fix: expose transparent image infer options 2026-04-25 19:58:41 +01:00
Vincent Koc
cd7a8f870b feat(diagnostics-otel): add genai usage span attrs 2026-04-25 11:56:13 -07:00
91wan
bb2b68b34e fix(acp): pass Codex ACP model thinking overrides
Fix ACP Codex model/thinking override propagation.

Thanks @91wan.
2026-04-25 19:56:03 +01:00
Peter Steinberger
e9d9726f2d fix: handle context assembled diagnostics 2026-04-25 19:54:28 +01:00
Peter Steinberger
a018db771d fix: preserve omitted thinking replay turns 2026-04-25 19:54:28 +01:00
Peter Steinberger
690c98ad99 test(plugins): align install ledger mocks 2026-04-25 19:54:12 +01:00
Vincent Koc
c410e48382 fix(plugins): keep onboarding install records out of config 2026-04-25 11:52:19 -07:00
Peter Steinberger
bbc0884e23 docs(changelog): restore 2026.4.24 release notes 2026-04-25 19:51:11 +01:00
Vincent Koc
9bd348fdec fix(plugins): harden install ledger path handling 2026-04-25 11:48:17 -07:00
Vincent Koc
dc19069d71 feat(diagnostics-otel): add genai operation duration metric 2026-04-25 11:48:10 -07:00
Peter Steinberger
81307fc11d test: hoist backup archive mocks 2026-04-25 19:48:03 +01:00
Peter Steinberger
599ae7fed8 docs: clarify tool result details persistence 2026-04-25 19:47:19 +01:00
Peter Steinberger
fecf1e9b8f fix: align plugin install tests with ledger store 2026-04-25 19:44:11 +01:00
Peter Steinberger
4c0e9a4b2e fix(plugins): honor inferred agent model defaults 2026-04-25 19:40:32 +01:00
Peter Steinberger
cd8cb8254a fix(logging): remove duplicate context diagnostic case 2026-04-25 19:39:20 +01:00
Peter Steinberger
2055e6ceba fix(logging): include context assembly diagnostics in stability log 2026-04-25 19:39:20 +01:00
Peter Steinberger
8ea3099cd3 test(codex): accept visible session model reply 2026-04-25 19:39:20 +01:00
Peter Steinberger
e4f544790c test: isolate gateway live model sessions 2026-04-25 19:39:20 +01:00
Peter Steinberger
02639d3ec8 fix(plugins): alias wildcard runtime dependency exports 2026-04-25 19:39:20 +01:00
Peter Steinberger
14c9cfb637 fix(plugins): alias runtime dependency export subpaths 2026-04-25 19:39:20 +01:00
Peter Steinberger
9e9aa4722a fix(plugins): load mirrored runtime deps through ESM-safe aliases 2026-04-25 19:39:20 +01:00
Peter Steinberger
d2ab6b4fd5 fix(plugins): preserve package deps for runtime mirrors 2026-04-25 19:39:19 +01:00
Troy Hitch
63241bf1e0 fix(bonjour): suppress ciao cancellation across plugin runtime copies
Fix the bundled Bonjour gateway discovery crash-loop caused by ciao probe cancellation rejections after the Bonjour plugin migration.

The plugin entry now wires the existing rejection handler into the advertiser, and the unhandled-rejection handler registry is anchored on globalThis so staged plugin SDK module copies register into the same process-level handler set used by the host.

Verification:
- pnpm test:serial extensions/bonjour/src/advertiser.test.ts src/infra/unhandled-rejections.fatal-detection.test.ts
- OPENCLAW_LOCAL_CHECK_MODE=throttled pnpm check:changed partially completed: conflict markers plus core/core-test/extensions/extension-test typecheck passed; local lint lane hit a self-lock and was stopped.
2026-04-25 11:38:30 -07:00
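The `globalThis` anchoring mentioned in this commit is a general technique for sharing state between duplicated module copies. The sketch below shows the idea only: the registry key, the function names, and the handler signature are hypothetical and are not taken from the repository.

```typescript
// A handler returns true when it recognizes a rejection and wants it suppressed.
type RejectionHandler = (reason: unknown) => boolean;

// Symbol.for() returns the same symbol in every module copy within one process.
// The key string is a made-up example.
const REGISTRY_KEY = Symbol.for("example.unhandledRejectionHandlers");

// Returns the single process-wide handler set, creating it on first use.
function getSharedHandlers(): Set<RejectionHandler> {
  const store = globalThis as unknown as Record<symbol, Set<RejectionHandler> | undefined>;
  let handlers = store[REGISTRY_KEY];
  if (!handlers) {
    handlers = new Set<RejectionHandler>();
    store[REGISTRY_KEY] = handlers;
  }
  return handlers;
}

// Registers a handler and returns a function that removes it again.
function registerRejectionHandler(handler: RejectionHandler): () => void {
  const handlers = getSharedHandlers();
  handlers.add(handler);
  return () => {
    handlers.delete(handler);
  };
}

// True if any registered handler claims the rejection.
function isRejectionHandled(reason: unknown): boolean {
  return Array.from(getSharedHandlers()).some((handler) => handler(reason));
}
```

A module-level `const handlers = new Set()` would be created once per module copy, so a plugin loaded from a staged SDK copy would register into a set the host never reads. Keying the set on `globalThis` avoids that split.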
Vincent Koc
888448facc feat(plugins): move install records to managed ledger 2026-04-25 11:37:10 -07:00
Peter Steinberger
e473577eaa test(voice): harden live STT transcript checks 2026-04-25 19:36:01 +01:00
Vincent Koc
f204f0c999 docs(logging): document new OTEL metrics and spans from recent diagnostics-otel feats
Five recent diagnostics-otel feat commits added user-facing OpenTelemetry
surfaces but did not update docs/logging.md, so the listed metrics and
spans drifted out of sync with what the plugin actually exports:

- 7bbd47349e adds gen_ai.client.token.usage histogram (GenAI semconv)
- b8a41739d5 adds memory heap/rss histograms, pressure counter and span
- d6ef1fcf24 adds openclaw.tool.loop counters and span
- ff172f46a5 adds openclaw.context.assembled span
- 44114328b4 adds openclaw.provider.request_id_hash attr on
  openclaw.model.call spans

Append the new metrics under existing model-usage and exec sections,
add a 'Diagnostics internals' subsection for memory + tool-loop
metrics, and add the three new spans (context.assembled, tool.loop,
memory.pressure) plus the request-id-hash attribute to the spans
listing.
2026-04-25 11:35:20 -07:00
Vincent Koc
7bbd47349e feat(diagnostics-otel): add genai token usage metric 2026-04-25 11:31:45 -07:00
Peter Steinberger
73706ca244 test: stabilize QA session memory ranking 2026-04-25 19:30:28 +01:00
Peter Steinberger
de0097a23c fix: support transparent OpenAI image generation 2026-04-25 19:28:56 +01:00
Peter Steinberger
0bf4876add fix: sanitize assembled diagnostic context 2026-04-25 19:23:51 +01:00
Peter Steinberger
a00c225899 test: split pure tool-card coverage 2026-04-25 19:23:51 +01:00
Peter Steinberger
e1495c3372 test: streamline memory and tts suites 2026-04-25 19:23:51 +01:00
Peter Steinberger
75fcb8c56d perf: lazy-load heavy test imports 2026-04-25 19:23:51 +01:00
Peter Steinberger
31456e3326 fix(providers): handle proxied DeepSeek V4 replay 2026-04-25 19:23:15 +01:00
Vincent Koc
b8a41739d5 feat(diagnostics-otel): export memory diagnostics 2026-04-25 11:22:19 -07:00
Peter Steinberger
1380dc170e fix(browser): avoid restart hint for external profiles 2026-04-25 19:18:06 +01:00
Vincent Koc
d6ef1fcf24 feat(diagnostics-otel): export tool loop events 2026-04-25 11:11:56 -07:00
Peter Steinberger
830bd2e236 fix: recover stale runtime deps locks 2026-04-25 19:09:09 +01:00
Poo-Squirry
fd3840cb00 Fix context usage display and active-run reload interruptions
Fixes context usage display regressions and prevents active runs from being interrupted by channel reloads. Adds persisted tool-result detail bounds so large tool metadata stays out of model/session payloads.
2026-04-25 19:07:52 +01:00
Chris Zhang
c3bfd328ad feat(litellm): add image generation provider (#70246)
* feat(litellm): add image generation provider

Registers litellm as an image-generation provider so model refs like
litellm/gpt-image-2 route through the LiteLLM proxy, and
agents.defaults.imageGenerationModel.fallbacks entries of the form
litellm/... resolve without "No image-generation provider registered
for litellm" errors.

Implementation uses the OpenAI-compatible /images/generations and
/images/edits endpoints that LiteLLM proxies for. BaseUrl resolves from
models.providers.litellm.baseUrl (default http://localhost:4000). Private
network is auto-allowed when baseUrl is a loopback/RFC1918 address, which
covers the common self-hosted LiteLLM proxy case without needing
OPENCLAW_PROVIDER_ALLOW_PRIVATE_NETWORK. Public baseUrls keep normal SSRF
defaults.

Default model is gpt-image-2 (matching upstream 4.21+ OpenAI default).
Advertises the same 2K/4K sizes OpenAI now exposes, plus legacy
256/512/1024 for dall-e-3. Supports both generate and edit.

Local patch. LiteLLM has no upstream image-generation support yet; revisit
if upstream adds one.

* ci: rerun after upstream main hot-fix

* fix(litellm): harden image generation provider

---------

Co-authored-by: Chris Zhang <chris@ChrisdeMac-mini.local>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:06:51 +01:00
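The loopback/RFC1918 check that this commit describes for the LiteLLM `baseUrl` might look roughly like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, only literal IPv4 addresses, `localhost`, and `::1` are considered, and the repository's actual SSRF policy may also account for DNS resolution or other address ranges.

```typescript
// Hypothetical sketch: true when the URL's host is loopback or an RFC1918 private IPv4 address.
function isLoopbackOrRfc1918(baseUrl: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // unparsable URLs are never auto-allowed
  }
  // WHATWG URL keeps the brackets around IPv6 literals in `hostname`.
  if (hostname === "localhost" || hostname === "[::1]") return true;

  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname);
  if (!match) return false; // ordinary DNS names keep the normal SSRF defaults

  const octets = match.slice(1).map(Number);
  if (octets.some((octet) => octet > 255)) return false;
  const [a, b] = octets;
  return (
    a === 127 || // 127.0.0.0/8 loopback
    a === 10 || // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}
```

Under this sketch the default `http://localhost:4000` is auto-allowed, while a public base URL is not and keeps the regular protections.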
Chunyue Wang
930d81aa41 fix(agents): prevent Bedrock replay death loop on empty assistant content (#71627)
* fix(agents): prevent Bedrock replay death loop on empty assistant content

  Fixes #71572

* docs: document Bedrock replay repair (#71627) (thanks @openperf)

* fix(diagnostics): share diagnostic event state across sdk graphs

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:04:40 +01:00
Vincent Koc
ff172f46a5 feat(diagnostics-otel): add context assembly spans 2026-04-25 11:03:46 -07:00
Peter Steinberger
afd6b5d6fc fix(opencode-go): route DeepSeek V4 through OpenAI transport 2026-04-25 18:58:08 +01:00
Vincent Koc
275c128e99 feat(plugins): add sanitized model call hooks 2026-04-25 10:56:40 -07:00
Peter Steinberger
9ffe764416 fix(whatsapp): send voice note text separately 2026-04-25 18:55:03 +01:00
Peter Steinberger
617e1dd6bf fix(browser): honor remote CDP open timeouts 2026-04-25 18:52:57 +01:00
Peter Steinberger
d623354a0e fix(infra): share diagnostic event state across loaders 2026-04-25 18:52:38 +01:00
Vincent Koc
44114328b4 feat(diagnostics): surface provider request id hashes 2026-04-25 10:46:10 -07:00
Peter Steinberger
2e0ae56b1a test(plugins): satisfy readonly index lint 2026-04-25 18:44:29 +01:00
Peter Steinberger
cd6c64d2ee test(plugins): avoid readonly index mutation 2026-04-25 18:42:25 +01:00
Peter Steinberger
649a645492 test(core): trim sync test overhead 2026-04-25 18:41:21 +01:00
Peter Steinberger
39488dfd68 test(pairing): reduce fixture io overhead 2026-04-25 18:41:20 +01:00
Peter Steinberger
8c93745f0f test(memory): speed up host fixture setup 2026-04-25 18:41:20 +01:00
Vincent Koc
f56bf63b06 fix(plugins): reject stale registry policy reads 2026-04-25 10:35:36 -07:00
Vincent Koc
61b3c04424 test(plugins): cover registry refresh mutations 2026-04-25 10:35:36 -07:00
Vincent Koc
3ec92dfac0 fix(plugins): deprecate registry disable break glass 2026-04-25 10:35:36 -07:00
Vincent Koc
4324855a9d docs(plugins): document persisted registry repair 2026-04-25 10:35:35 -07:00
Vincent Koc
fd8a8789d0 fix(plugins): satisfy registry lint 2026-04-25 10:35:35 -07:00
Vincent Koc
2f622acec6 fix(plugins): normalize startup config from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
f14aa65bcc fix(plugins): refresh registry after chat toggles 2026-04-25 10:35:35 -07:00
Vincent Koc
29988335fc feat(plugins): resolve provider owners from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
674d188153 feat(plugins): plan gateway startup from registry 2026-04-25 10:35:35 -07:00
Vincent Koc
feb8d3a4bd fix(plugins): label registry list state as enabled 2026-04-25 10:35:34 -07:00
Vincent Koc
5677a26385 docs(changelog): note registry-backed plugin list 2026-04-25 10:35:34 -07:00
Vincent Koc
5859dcd298 feat(plugins): list from registry snapshot 2026-04-25 10:35:34 -07:00
Vincent Koc
caf25fac91 feat(plugins): add registry repair command 2026-04-25 10:35:34 -07:00
Vincent Koc
521e75dea0 feat(plugins): prefer persisted registry reads 2026-04-25 10:35:09 -07:00
Vincent Koc
a7de722f4f fix(diagnostics-otel): align GenAI semconv attrs 2026-04-25 10:33:13 -07:00
Peter Steinberger
5f4bc6ec02 fix: surface external agent errors 2026-04-25 18:30:16 +01:00
Peter Steinberger
f545872cbc test(ui): streamline session controls async tests 2026-04-25 18:27:23 +01:00
Peter Steinberger
847c00d409 test(ui): speed up chat icon mocks 2026-04-25 18:27:23 +01:00
Peter Steinberger
88df8fe09d fix(browser): clarify Browserless CDP attach handling 2026-04-25 18:26:57 +01:00
Peter Steinberger
0bbb0eb735 fix(image): honor generation timeout config 2026-04-25 18:25:26 +01:00
Peter Steinberger
80739731dd docs: clarify pi-ai generic failover (#71647) 2026-04-25 18:22:06 +01:00
willamhou
4b5c2f9aa3 fix(agents/failover): classify bare pi-ai stream wrapper as timeout regardless of provider (#71620) 2026-04-25 18:22:06 +01:00
Vincent Koc
dcdf97685b fix(diagnostics): trust internal trace parents (#71574)
* fix(diagnostics): trust internal trace parents

* fix(diagnostics): harden trusted trace metadata

* fix(tooling): honor explicit oxlint threads

* fix(agents): use stable nonmutating sort helpers

* chore(plugin-sdk): refresh api baseline

* fix(diagnostics): gate internal event subscriptions

* fix(diagnostics): isolate listener event copies

* chore(plugin-sdk): refresh internal diagnostics baseline

* chore(plugin-sdk): refresh diagnostics event baseline

* fix(diagnostics): keep event state module local

* fix(diagnostics): harden internal subscription capability

* fix(diagnostics): freeze listener metadata
2026-04-25 10:18:52 -07:00
Peter Steinberger
8e7d382c37 refactor(tts): clarify text media directives 2026-04-25 18:18:34 +01:00
Peter Steinberger
67506ac2a9 fix(xai): support video reference images 2026-04-25 18:14:51 +01:00
Peter Steinberger
768bbc7cc0 docs: update OpenAI GPT-5.5 API guidance 2026-04-25 18:14:10 +01:00
Peter Steinberger
390be8138f fix: add OpenCode Go DeepSeek V4 models 2026-04-25 18:11:59 +01:00
Vincent Koc
0d274ef6c2 docs(control-ui): note assistant avatar uploads stay browser-local
Val Alexander's c65aa1d2a6 (#71639) changed assistant avatar uploads
from gateway config persistence to localStorage, mirroring the existing
user-avatar pattern. CHANGELOG covered it but docs/web/control-ui.md
'Personal identity (browser-local)' section only documented the user
identity. Add a paragraph noting the assistant avatar override follows
the same browser-local pattern, while keeping the ui.assistant.avatar
config field reachable for non-UI clients writing the field directly.
2026-04-25 10:08:59 -07:00
Peter Steinberger
6b3e4b88d6 test: update QA parity fixtures for GPT-5.5 2026-04-25 18:05:28 +01:00
Peter Steinberger
39343088ed fix(tts): keep media-only no-reply payloads 2026-04-25 18:04:54 +01:00
Peter Steinberger
f3ba962fd0 fix(subagents): explain browser tool profile filtering 2026-04-25 17:59:05 +01:00
Peter Steinberger
e27e29c66e refactor: split Crestodian planner backend selection 2026-04-25 17:56:46 +01:00
Peter Steinberger
60f9358348 fix(tts): preserve legacy tool voice hints 2026-04-25 17:56:37 +01:00
Peter Steinberger
dc7c703425 test: lazy-load global cleanup helpers 2026-04-25 17:49:16 +01:00
Peter Steinberger
8bead989da fix(telegram): frame audio transcripts as untrusted 2026-04-25 17:45:40 +01:00
Peter Steinberger
8659495384 test: make live cron probe agent-generic 2026-04-25 17:42:32 +01:00
Val Alexander
c65aa1d2a6 fix(control-ui): persist assistant avatar override locally (#71639)
* fix(control-ui): rebalance quick settings into stable 3-col bento

Pair Appearance with Automations and let Channels stand alone in the
middle column so all three top-row columns reach similar heights.
Promote Personal to a full-width row with a horizontal body
(identity tiles | emoji + actions) so the avatar block stops fighting
for half-width space. Drops the unused .qs-stack--wide hook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(control-ui): rebalance Personal card with symmetric User↔Assistant identity pair

Restructure Personal card layout to present User and Assistant as 2 balanced identity cards instead of separate User tile + form controls. Mirrors the visual hierarchy and UI pattern across both identities.

Changes:
- Move User avatar text input into User identity card's .__repair section (mirroring Assistant's structure)
- Inline "Choose image" and "Clear avatar" buttons as flex-wrapped action group
- Remove .qs-personal-body and .qs-personal-form wrapper divs
- Update Personal card's .qs-identity-grid to 2-column layout with balanced spacing
- Responsive collapse to 1-column at ≤760px

Tests:
- config-quick.test.ts updated to expect 2 stacks (no longer wrapping Personal in form)
- config-quick.test.ts validates identity card layout now has symmetric User↔Assistant structure
- All 10 quick settings view tests passing
- All 20 schema regression tests passing

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* chore: ignore .vmux worktree paths

* fix(control-ui): persist assistant avatar override locally instead of via gateway config

Mirrors the user-avatar pattern: assistant avatar uploads now go to
localStorage and overlay the gateway-resolved identity at bootstrap and on
agent.identity.get refreshes. Sidesteps the ui.assistant.avatar zod cap
that rejected uploaded data URLs as 'Too big: expected string to have
<=200 characters', removes one config.patch RPC from the avatar path, and
collapses the upload handler from a 44-line async/loadConfig dance into a
plain synchronous setter.

Also lifts the gateway-side ui.assistant.avatar schema cap from 200 to
2,000,000 to match the user-avatar size budget for non-UI clients writing
the field directly, and adds a content-aware text/image normalizer in
ui/src/ui/assistant-identity.ts so short-text avatars stay short while
data URLs survive round-tripping.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 11:17:48 -05:00
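The "content-aware text/image normalizer" mentioned in this commit is not shown in this part of the diff. A minimal sketch of the idea, using the 200 and 2,000,000 character limits quoted in the message, could look like this. The function name, the truncation of long text, and the rejection of oversized images are all assumptions, not the behavior of `ui/src/ui/assistant-identity.ts`.

```typescript
// Limits quoted in the commit message; how they are enforced here is an assumption.
const MAX_TEXT_AVATAR_CHARS = 200;
const MAX_IMAGE_AVATAR_CHARS = 2_000_000;

// Hypothetical sketch: short text (emoji, initials) and image data URLs get different budgets.
function normalizeAvatarSketch(raw: string | null | undefined): string | undefined {
  const value = raw?.trim();
  if (!value) return undefined;

  if (/^data:image\//i.test(value)) {
    // A truncated data URL would be a corrupt image, so oversized uploads are dropped whole.
    return value.length <= MAX_IMAGE_AVATAR_CHARS ? value : undefined;
  }
  // Plain text avatars stay short.
  return value.slice(0, MAX_TEXT_AVATAR_CHARS);
}
```

Separating the two cases is what lets an uploaded data URL survive a round trip without lifting the limit for ordinary text values.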
408 changed files with 17412 additions and 5307 deletions

.github/labeler.yml (5 changed lines)

@@ -315,6 +315,11 @@
   - changed-files:
       - any-glob-to-any-file:
           - "extensions/lmstudio/**"
+"extensions: litellm":
+  - changed-files:
+      - any-glob-to-any-file:
+          - "extensions/litellm/**"
+          - "docs/providers/litellm.md"
 "extensions: openai":
   - changed-files:
       - any-glob-to-any-file:

.gitignore (1 changed line)

@@ -146,6 +146,7 @@ changelog/fragments/
 # Local scratch workspace
 .tmp/
+.vmux*
 .artifacts/
 test/fixtures/openclaw-vitest-unit-report.json
 analysis/

@@ -9,6 +9,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
 - Run docs list first: `pnpm docs:list` if available; read relevant docs only.
 - High-confidence answers only when fixing/triaging: verify source, tests, shipped/current behavior, and dependency contracts before deciding.
 - Dependency-backed behavior: read upstream dependency docs/source/types first. Do not assume APIs, defaults, errors, timing, or runtime behavior.
+- Live-verify when feasible. Check env/`~/.profile` for keys before assuming live tests are blocked; keep secret output redacted.
 - Missing deps: `pnpm install`, retry once, then report first actionable error.
 - CODEOWNERS: maint/refactor/tests ok. Larger behavior/product/security/ownership: owner ask/review.
 - Wording: product/docs/UI/changelog say "plugin/plugins"; `extensions/` is internal.

File diff suppressed because it is too large.

@@ -3,6 +3,7 @@
     <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
     <uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
     <uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC" />
+    <uses-permission android:name="android.permission.FOREGROUND_SERVICE_MICROPHONE" />
     <uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
     <uses-permission
         android:name="android.permission.NEARBY_WIFI_DEVICES"
@@ -52,7 +53,7 @@
         <service
             android:name=".NodeForegroundService"
             android:exported="false"
-            android:foregroundServiceType="dataSync" />
+            android:foregroundServiceType="dataSync|microphone" />
         <service
             android:name=".node.DeviceNotificationListenerService"
             android:label="@string/app_name"

@@ -101,7 +101,8 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
   val onboardingCompleted: StateFlow<Boolean> = prefs.onboardingCompleted
   val canvasDebugStatusEnabled: StateFlow<Boolean> = prefs.canvasDebugStatusEnabled
   val speakerEnabled: StateFlow<Boolean> = prefs.speakerEnabled
-  val micEnabled: StateFlow<Boolean> = prefs.talkEnabled
+  val voiceCaptureMode: StateFlow<VoiceCaptureMode> = runtimeState(initial = VoiceCaptureMode.Off) { it.voiceCaptureMode }
+  val micEnabled: StateFlow<Boolean> = runtimeState(initial = false) { it.micEnabled }
   val micCooldown: StateFlow<Boolean> = runtimeState(initial = false) { it.micCooldown }
   val micStatusText: StateFlow<String> = runtimeState(initial = "Mic off") { it.micStatusText }
@@ -111,6 +112,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
   val micConversation: StateFlow<List<VoiceConversationEntry>> = runtimeState(initial = emptyList()) { it.micConversation }
   val micInputLevel: StateFlow<Float> = runtimeState(initial = 0f) { it.micInputLevel }
   val micIsSending: StateFlow<Boolean> = runtimeState(initial = false) { it.micIsSending }
+  val talkModeEnabled: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeEnabled }
+  val talkModeListening: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeListening }
+  val talkModeSpeaking: StateFlow<Boolean> = runtimeState(initial = false) { it.talkModeSpeaking }
+  val talkModeStatusText: StateFlow<String> = runtimeState(initial = "Off") { it.talkModeStatusText }
   val chatSessionKey: StateFlow<String> = runtimeState(initial = "main") { it.chatSessionKey }
   val chatSessionId: StateFlow<String?> = runtimeState(initial = null) { it.chatSessionId }
@@ -283,6 +288,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
     ensureRuntime().setMicEnabled(enabled)
   }
+  fun setTalkModeEnabled(enabled: Boolean) {
+    ensureRuntime().setTalkModeEnabled(enabled)
+  }
   fun setSpeakerEnabled(enabled: Boolean) {
     ensureRuntime().setSpeakerEnabled(enabled)
   }

@@ -3,12 +3,14 @@ package ai.openclaw.app
 import android.app.Notification
 import android.app.NotificationChannel
 import android.app.NotificationManager
-import android.app.Service
 import android.app.PendingIntent
+import android.app.Service
 import android.content.Context
 import android.content.Intent
 import android.content.pm.ServiceInfo
 import androidx.core.app.NotificationCompat
+import androidx.core.app.ServiceCompat
+import androidx.core.content.ContextCompat
 import kotlinx.coroutines.CoroutineScope
 import kotlinx.coroutines.Dispatchers
 import kotlinx.coroutines.Job
@@ -21,6 +23,7 @@ class NodeForegroundService : Service() {
   private val scope: CoroutineScope = CoroutineScope(SupervisorJob() + Dispatchers.Main)
   private var notificationJob: Job? = null
   private var didStartForeground = false
+  private var voiceCaptureMode = VoiceCaptureMode.Off
   override fun onCreate() {
     super.onCreate()
@@ -36,22 +39,51 @@ class NodeForegroundService : Service() {
     notificationJob =
       scope.launch {
         combine(
-          runtime.statusText,
-          runtime.serverName,
-          runtime.isConnected,
-          runtime.micEnabled,
-          runtime.micIsListening,
-        ) { status, server, connected, micEnabled, micListening ->
-          Quint(status, server, connected, micEnabled, micListening)
-        }.collect { (status, server, connected, micEnabled, micListening) ->
-          val title = if (connected) "OpenClaw Node · Connected" else "OpenClaw Node"
-          val micSuffix =
-            if (micEnabled) {
-              if (micListening) " · Mic: Listening" else " · Mic: Pending"
-            } else {
-              ""
+          combine(
+            runtime.statusText,
+            runtime.serverName,
+            runtime.isConnected,
+            runtime.voiceCaptureMode,
+          ) { status, server, connected, mode ->
+            VoiceNotificationBase(
+              status = status,
+              server = server,
+              connected = connected,
+              mode = mode,
+            )
+          },
+          combine(
+            runtime.micEnabled,
+            runtime.micIsListening,
+            runtime.talkModeListening,
+            runtime.talkModeSpeaking,
+          ) { micEnabled, micListening, talkListening, talkSpeaking ->
+            VoiceNotificationCapture(
+              micEnabled = micEnabled,
+              micListening = micListening,
+              talkListening = talkListening,
+              talkSpeaking = talkSpeaking,
+            )
+          },
+        ) { base, capture ->
+          VoiceNotificationState(base = base, capture = capture)
+        }.collect { state ->
+          voiceCaptureMode = state.mode
+          val title =
+            when {
+              state.connected && state.mode == VoiceCaptureMode.TalkMode -> "OpenClaw Node · Talk"
+              state.connected -> "OpenClaw Node · Connected"
+              else -> "OpenClaw Node"
             }
-          val text = (server?.let { "$status · $it" } ?: status) + micSuffix
+          val text =
+            (state.server?.let { "${state.status} · $it" } ?: state.status) +
+              voiceNotificationSuffix(
+                mode = state.mode,
+                manualMicEnabled = state.capture.micEnabled,
+                manualMicListening = state.capture.micListening,
+                talkListening = state.capture.talkListening,
+                talkSpeaking = state.capture.talkSpeaking,
+              )
           startForegroundWithTypes(
             notification = buildNotification(title = title, text = text),
@@ -60,13 +92,27 @@ class NodeForegroundService : Service() {
       }
   }
-  override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
+  override fun onStartCommand(
+    intent: Intent?,
+    flags: Int,
+    startId: Int,
+  ): Int {
     when (intent?.action) {
       ACTION_STOP -> {
         (application as NodeApp).peekRuntime()?.disconnect()
         stopSelf()
         return START_NOT_STICKY
       }
+      ACTION_SET_VOICE_CAPTURE_MODE -> {
+        voiceCaptureMode = intent.getStringExtra(EXTRA_VOICE_CAPTURE_MODE).toVoiceCaptureMode()
+        startForegroundWithTypes(
+          notification =
+            buildNotification(
+              title = "OpenClaw Node",
+              text = if (voiceCaptureMode == VoiceCaptureMode.TalkMode) "Talk mode active" else "Connected",
+            ),
+        )
+      }
     }
     // Keep running; connection is managed by NodeRuntime (auto-reconnect + manual).
     return START_STICKY
@@ -127,17 +173,13 @@ class NodeForegroundService : Service() {
       .build()
   }
-  private fun updateNotification(notification: Notification) {
-    val mgr = getSystemService(Context.NOTIFICATION_SERVICE) as NotificationManager
-    mgr.notify(NOTIFICATION_ID, notification)
-  }
   private fun startForegroundWithTypes(notification: Notification) {
+    val serviceTypes = foregroundServiceTypesForVoiceMode(voiceCaptureMode)
     if (didStartForeground) {
-      updateNotification(notification)
+      ServiceCompat.startForeground(this, NOTIFICATION_ID, notification, serviceTypes)
       return
     }
-    startForeground(NOTIFICATION_ID, notification, ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC)
+    ServiceCompat.startForeground(this, NOTIFICATION_ID, notification, serviceTypes)
     didStartForeground = true
   }
@@ -146,6 +188,8 @@ class NodeForegroundService : Service() {
     private const val NOTIFICATION_ID = 1
     private const val ACTION_STOP = "ai.openclaw.app.action.STOP"
+    private const val ACTION_SET_VOICE_CAPTURE_MODE = "ai.openclaw.app.action.SET_VOICE_CAPTURE_MODE"
+    private const val EXTRA_VOICE_CAPTURE_MODE = "ai.openclaw.app.extra.VOICE_CAPTURE_MODE"
     fun start(context: Context) {
       val intent = Intent(context, NodeForegroundService::class.java)
@@ -156,7 +200,85 @@ class NodeForegroundService : Service() {
val intent = Intent(context, NodeForegroundService::class.java).setAction(ACTION_STOP)
context.startService(intent)
}
fun setVoiceCaptureMode(
context: Context,
mode: VoiceCaptureMode,
) {
val intent =
Intent(context, NodeForegroundService::class.java)
.setAction(ACTION_SET_VOICE_CAPTURE_MODE)
.putExtra(EXTRA_VOICE_CAPTURE_MODE, mode.name)
if (mode == VoiceCaptureMode.TalkMode) {
ContextCompat.startForegroundService(context, intent)
} else {
context.startService(intent)
}
}
}
}
private data class Quint<A, B, C, D, E>(val first: A, val second: B, val third: C, val fourth: D, val fifth: E)
internal fun foregroundServiceTypesForVoiceMode(mode: VoiceCaptureMode): Int {
val base = ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC
return if (mode == VoiceCaptureMode.TalkMode) {
base or ServiceInfo.FOREGROUND_SERVICE_TYPE_MICROPHONE
} else {
base
}
}
internal fun voiceNotificationSuffix(
mode: VoiceCaptureMode,
manualMicEnabled: Boolean,
manualMicListening: Boolean,
talkListening: Boolean,
talkSpeaking: Boolean,
): String {
return when (mode) {
VoiceCaptureMode.TalkMode ->
when {
talkSpeaking -> " · Talk: Speaking"
talkListening -> " · Talk: Listening"
else -> " · Talk: On"
}
VoiceCaptureMode.ManualMic ->
if (manualMicEnabled) {
if (manualMicListening) " · Mic: Listening" else " · Mic: Pending"
} else {
""
}
VoiceCaptureMode.Off -> ""
}
}
private fun String?.toVoiceCaptureMode(): VoiceCaptureMode {
return VoiceCaptureMode.entries.firstOrNull { it.name == this } ?: VoiceCaptureMode.Off
}
private data class VoiceNotificationBase(
val status: String,
val server: String?,
val connected: Boolean,
val mode: VoiceCaptureMode,
)
private data class VoiceNotificationCapture(
val micEnabled: Boolean,
val micListening: Boolean,
val talkListening: Boolean,
val talkSpeaking: Boolean,
)
private data class VoiceNotificationState(
val base: VoiceNotificationBase,
val capture: VoiceNotificationCapture,
) {
val status: String
get() = base.status
val server: String?
get() = base.server
val connected: Boolean
get() = base.connected
val mode: VoiceCaptureMode
get() = base.mode
}

@@ -64,6 +64,8 @@ class NodeRuntime(
private val json = Json { ignoreUnknownKeys = true }
private val externalAudioCaptureActive = MutableStateFlow(false)
private val _voiceCaptureMode = MutableStateFlow(VoiceCaptureMode.Off)
val voiceCaptureMode: StateFlow<VoiceCaptureMode> = _voiceCaptureMode.asStateFlow()
private val discovery = GatewayDiscovery(appContext, scope = scope)
val gateways: StateFlow<List<GatewayEndpoint>> = discovery.gateways
@@ -428,6 +430,18 @@ class NodeRuntime(
)
}
val talkModeEnabled: StateFlow<Boolean>
get() = talkMode.isEnabled
val talkModeListening: StateFlow<Boolean>
get() = talkMode.isListening
val talkModeSpeaking: StateFlow<Boolean>
get() = talkMode.isSpeaking
val talkModeStatusText: StateFlow<String>
get() = talkMode.statusText
private fun syncMainSessionKey(agentId: String?) {
val resolvedKey = resolveNodeMainSessionKey(agentId)
// Always push the resolved session key into TalkMode, even when the
@@ -599,17 +613,8 @@ class NodeRuntime(
prefs.loadGatewayToken()
}
scope.launch {
prefs.talkEnabled.collect { enabled ->
// MicCaptureManager handles STT + send to gateway, while the dedicated
// reply speaker handles TTS for assistant replies in the voice tab.
micCapture.setMicEnabled(enabled)
if (enabled) {
talkMode.ttsOnAllResponses = false
scope.launch { talkMode.ensureChatSubscribed() }
}
externalAudioCaptureActive.value = enabled
}
if (prefs.voiceMicEnabled.value) {
setVoiceCaptureMode(VoiceCaptureMode.ManualMic, persistManualMic = false)
}
scope.launch(Dispatchers.Default) {
@@ -643,7 +648,7 @@ class NodeRuntime(
if (value) {
reconnectPreferredGatewayOnForeground()
} else {
stopActiveVoiceSession()
stopManualVoiceSession()
}
}
@@ -757,21 +762,17 @@ class NodeRuntime(
fun setVoiceScreenActive(active: Boolean) {
if (!active) {
stopActiveVoiceSession()
stopManualVoiceSession()
}
// Don't re-enable on active=true; mic toggle drives that
}
fun setMicEnabled(value: Boolean) {
prefs.setTalkEnabled(value)
if (value) {
// Tapping mic on interrupts any active TTS (barge-in)
stopVoicePlayback()
talkMode.ttsOnAllResponses = false
scope.launch { talkMode.ensureChatSubscribed() }
}
micCapture.setMicEnabled(value)
externalAudioCaptureActive.value = value
setVoiceCaptureMode(if (value) VoiceCaptureMode.ManualMic else VoiceCaptureMode.Off)
}
fun setTalkModeEnabled(value: Boolean) {
setVoiceCaptureMode(if (value) VoiceCaptureMode.TalkMode else VoiceCaptureMode.Off)
}
val speakerEnabled: StateFlow<Boolean>
@@ -786,11 +787,72 @@ class NodeRuntime(
talkMode.setPlaybackEnabled(value)
}
private fun setVoiceCaptureMode(
mode: VoiceCaptureMode,
persistManualMic: Boolean = true,
) {
if (mode == VoiceCaptureMode.TalkMode && !hasRecordAudioPermission()) {
_voiceCaptureMode.value = VoiceCaptureMode.Off
externalAudioCaptureActive.value = false
return
}
if (_voiceCaptureMode.value == mode) return
_voiceCaptureMode.value = mode
when (mode) {
VoiceCaptureMode.Off -> {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
stopVoicePlayback()
micCapture.setMicEnabled(false)
if (persistManualMic) {
prefs.setVoiceMicEnabled(false)
}
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.Off)
externalAudioCaptureActive.value = false
}
VoiceCaptureMode.ManualMic -> {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.ManualMic)
if (persistManualMic) {
prefs.setVoiceMicEnabled(true)
}
// Tapping mic on interrupts any active TTS (barge-in).
stopVoicePlayback()
scope.launch { talkMode.ensureChatSubscribed() }
micCapture.setMicEnabled(true)
externalAudioCaptureActive.value = true
}
VoiceCaptureMode.TalkMode -> {
if (persistManualMic) {
prefs.setVoiceMicEnabled(false)
}
micCapture.setMicEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.TalkMode)
talkMode.ttsOnAllResponses = true
talkMode.setPlaybackEnabled(speakerEnabled.value)
scope.launch { talkMode.ensureChatSubscribed() }
talkMode.setEnabled(true)
externalAudioCaptureActive.value = true
}
}
}
private fun stopManualVoiceSession() {
if (_voiceCaptureMode.value != VoiceCaptureMode.ManualMic) return
setVoiceCaptureMode(VoiceCaptureMode.Off)
}
private fun stopActiveVoiceSession() {
talkMode.ttsOnAllResponses = false
talkMode.setEnabled(false)
stopVoicePlayback()
micCapture.setMicEnabled(false)
prefs.setTalkEnabled(false)
prefs.setVoiceMicEnabled(false)
NodeForegroundService.setVoiceCaptureMode(appContext, VoiceCaptureMode.Off)
_voiceCaptureMode.value = VoiceCaptureMode.Off
externalAudioCaptureActive.value = false
}
@@ -970,6 +1032,7 @@ class NodeRuntime(
}
fun disconnect() {
stopActiveVoiceSession()
connectedEndpoint = null
activeGatewayAuth = null
_pendingGatewayTrust.value = null

@@ -37,6 +37,7 @@ class SecurePrefs(
private const val notificationsForwardingMaxEventsPerMinuteKey =
"notifications.forwarding.maxEventsPerMinute"
private const val notificationsForwardingSessionKeyKey = "notifications.forwarding.sessionKey"
private const val voiceMicEnabledKey = "voice.micEnabled"
}
private val appContext = context.applicationContext
@@ -162,8 +163,8 @@ class SecurePrefs(
private val _voiceWakeMode = MutableStateFlow(loadVoiceWakeMode())
val voiceWakeMode: StateFlow<VoiceWakeMode> = _voiceWakeMode
private val _talkEnabled = MutableStateFlow(plainPrefs.getBoolean("talk.enabled", false))
val talkEnabled: StateFlow<Boolean> = _talkEnabled
private val _voiceMicEnabled = MutableStateFlow(plainPrefs.getBoolean(voiceMicEnabledKey, false))
val voiceMicEnabled: StateFlow<Boolean> = _voiceMicEnabled
private val _speakerEnabled = MutableStateFlow(plainPrefs.getBoolean("voice.speakerEnabled", true))
val speakerEnabled: StateFlow<Boolean> = _speakerEnabled
@@ -478,9 +479,9 @@ class SecurePrefs(
_voiceWakeMode.value = mode
}
fun setTalkEnabled(value: Boolean) {
plainPrefs.edit { putBoolean("talk.enabled", value) }
_talkEnabled.value = value
fun setVoiceMicEnabled(value: Boolean) {
plainPrefs.edit { putBoolean(voiceMicEnabledKey, value) }
_voiceMicEnabled.value = value
}
fun setSpeakerEnabled(value: Boolean) {

@@ -0,0 +1,7 @@
package ai.openclaw.app
enum class VoiceCaptureMode {
Off,
ManualMic,
TalkMode,
}

@@ -35,10 +35,11 @@ import androidx.compose.foundation.lazy.rememberLazyListState
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.MicOff
import androidx.compose.material.icons.automirrored.filled.VolumeOff
import androidx.compose.material.icons.automirrored.filled.VolumeUp
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.MicOff
import androidx.compose.material.icons.filled.RecordVoiceOver
import androidx.compose.material3.Button
import androidx.compose.material3.ButtonDefaults
import androidx.compose.material3.Icon
@@ -69,6 +70,7 @@ import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleEventObserver
import androidx.lifecycle.compose.LocalLifecycleOwner
import ai.openclaw.app.MainViewModel
import ai.openclaw.app.VoiceCaptureMode
import ai.openclaw.app.voice.VoiceConversationEntry
import ai.openclaw.app.voice.VoiceConversationRole
import kotlin.math.max
@@ -81,6 +83,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val listState = rememberLazyListState()
val gatewayStatus by viewModel.statusText.collectAsState()
val voiceCaptureMode by viewModel.voiceCaptureMode.collectAsState()
val micEnabled by viewModel.micEnabled.collectAsState()
val micCooldown by viewModel.micCooldown.collectAsState()
val speakerEnabled by viewModel.speakerEnabled.collectAsState()
@@ -90,12 +93,15 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val micConversation by viewModel.micConversation.collectAsState()
val micInputLevel by viewModel.micInputLevel.collectAsState()
val micIsSending by viewModel.micIsSending.collectAsState()
val talkModeEnabled by viewModel.talkModeEnabled.collectAsState()
val talkModeListening by viewModel.talkModeListening.collectAsState()
val talkModeSpeaking by viewModel.talkModeSpeaking.collectAsState()
val hasStreamingAssistant = micConversation.any { it.role == VoiceConversationRole.Assistant && it.isStreaming }
val showThinkingBubble = micIsSending && !hasStreamingAssistant
var hasMicPermission by remember { mutableStateOf(context.hasRecordAudioPermission()) }
var pendingMicEnable by remember { mutableStateOf(false) }
var pendingVoicePermissionAction by remember { mutableStateOf<PendingVoicePermissionAction?>(null) }
DisposableEffect(lifecycleOwner, context) {
val observer =
@@ -107,7 +113,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
lifecycleOwner.lifecycle.addObserver(observer)
onDispose {
lifecycleOwner.lifecycle.removeObserver(observer)
// Stop TTS when leaving the voice screen
// Manual mic is tied to the Voice tab; Talk Mode is explicit and can continue.
viewModel.setVoiceScreenActive(false)
}
}
@@ -115,10 +121,14 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val requestMicPermission =
rememberLauncherForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
hasMicPermission = granted
if (granted && pendingMicEnable) {
viewModel.setMicEnabled(true)
if (granted) {
when (pendingVoicePermissionAction) {
PendingVoicePermissionAction.ManualMic -> viewModel.setMicEnabled(true)
PendingVoicePermissionAction.TalkMode -> viewModel.setTalkModeEnabled(true)
null -> Unit
}
}
pendingMicEnable = false
pendingVoicePermissionAction = null
}
LaunchedEffect(micConversation.size, showThinkingBubble) {
@@ -161,12 +171,12 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
tint = mobileTextTertiary,
)
Text(
"Tap the mic to start",
"Tap mic or Talk",
style = mobileHeadline,
color = mobileTextSecondary,
)
Text(
"Each pause sends a turn automatically.",
"Mic sends turns; Talk keeps the conversation open.",
style = mobileCallout,
color = mobileTextTertiary,
)
@@ -263,7 +273,7 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
if (hasMicPermission) {
viewModel.setMicEnabled(true)
} else {
pendingMicEnable = true
pendingVoicePermissionAction = PendingVoicePermissionAction.ManualMic
requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
}
},
@@ -287,11 +297,39 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
}
// Invisible spacer to balance the row (matches speaker column width)
Column(horizontalAlignment = Alignment.CenterHorizontally) {
Box(modifier = Modifier.size(48.dp))
Column(horizontalAlignment = Alignment.CenterHorizontally, verticalArrangement = Arrangement.spacedBy(4.dp)) {
IconButton(
onClick = {
if (talkModeEnabled) {
viewModel.setTalkModeEnabled(false)
return@IconButton
}
if (hasMicPermission) {
viewModel.setTalkModeEnabled(true)
} else {
pendingVoicePermissionAction = PendingVoicePermissionAction.TalkMode
requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
}
},
modifier = Modifier.size(48.dp),
colors =
IconButtonDefaults.iconButtonColors(
containerColor = if (talkModeEnabled) mobileSuccessSoft else mobileSurface,
),
) {
Icon(
imageVector = Icons.Default.RecordVoiceOver,
contentDescription = if (talkModeEnabled) "Turn Talk Mode off" else "Turn Talk Mode on",
modifier = Modifier.size(22.dp),
tint = if (talkModeEnabled) mobileSuccess else mobileTextSecondary,
)
}
Spacer(modifier = Modifier.height(4.dp))
Text("", style = mobileCaption2)
Text(
if (talkModeEnabled) "Talk on" else "Talk",
style = mobileCaption2,
color = if (talkModeEnabled) mobileSuccess else mobileTextTertiary,
)
}
}
@@ -299,6 +337,9 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
val queueCount = micQueuedMessages.size
val stateText =
when {
voiceCaptureMode == VoiceCaptureMode.TalkMode && talkModeSpeaking -> "Talk speaking"
voiceCaptureMode == VoiceCaptureMode.TalkMode && talkModeListening -> "Talk listening"
voiceCaptureMode == VoiceCaptureMode.TalkMode -> "Talk on"
queueCount > 0 -> "$queueCount queued"
micIsSending -> "Sending"
micCooldown -> "Cooldown"
@@ -307,14 +348,15 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
val stateColor =
when {
voiceCaptureMode == VoiceCaptureMode.TalkMode -> mobileSuccess
micEnabled -> mobileSuccess
micIsSending -> mobileAccent
else -> mobileTextSecondary
}
Surface(
shape = RoundedCornerShape(999.dp),
color = if (micEnabled) mobileSuccessSoft else mobileSurface,
border = BorderStroke(1.dp, if (micEnabled) mobileSuccess.copy(alpha = 0.3f) else mobileBorder),
color = if (micEnabled || talkModeEnabled) mobileSuccessSoft else mobileSurface,
border = BorderStroke(1.dp, if (micEnabled || talkModeEnabled) mobileSuccess.copy(alpha = 0.3f) else mobileBorder),
) {
Text(
"$gatewayStatus · $stateText",
@@ -353,6 +395,11 @@ fun VoiceTabScreen(viewModel: MainViewModel) {
}
}
private enum class PendingVoicePermissionAction {
ManualMic,
TalkMode,
}
@Composable
private fun VoiceTurnBubble(entry: VoiceConversationEntry) {
val isUser = entry.role == VoiceConversationRole.User

@@ -2,6 +2,7 @@ package ai.openclaw.app
import android.app.Notification
import android.content.Intent
import android.content.pm.ServiceInfo
import org.junit.Assert.assertEquals
import org.junit.Assert.assertNotNull
import org.junit.Test
@@ -30,6 +31,35 @@ class NodeForegroundServiceTest {
assertEquals(expectedFlags, savedIntent.flags and expectedFlags)
}
@Test
fun foregroundServiceTypesForVoiceMode_addsMicrophoneOnlyForTalkMode() {
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.Off),
)
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.ManualMic),
)
assertEquals(
ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC or ServiceInfo.FOREGROUND_SERVICE_TYPE_MICROPHONE,
foregroundServiceTypesForVoiceMode(VoiceCaptureMode.TalkMode),
)
}
@Test
fun voiceNotificationSuffixReflectsActiveCaptureMode() {
assertEquals("", voiceNotificationSuffix(VoiceCaptureMode.Off, false, false, false, false))
assertEquals(
" · Mic: Listening",
voiceNotificationSuffix(VoiceCaptureMode.ManualMic, true, true, false, false),
)
assertEquals(
" · Talk: Speaking",
voiceNotificationSuffix(VoiceCaptureMode.TalkMode, false, false, true, true),
)
}
private fun buildNotification(service: NodeForegroundService): Notification {
val method =
NodeForegroundService::class.java.getDeclaredMethod(

@@ -2,7 +2,9 @@ package ai.openclaw.app
import android.content.Context
import org.junit.Assert.assertEquals
import org.junit.Assert.assertFalse
import org.junit.Assert.assertNull
import org.junit.Assert.assertTrue
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.RobolectricTestRunner
@@ -22,6 +24,32 @@ class SecurePrefsTest {
assertEquals("whileUsing", plainPrefs.getString("location.enabledMode", null))
}
@Test
fun voiceMicEnabled_ignoresOldTalkEnabledKey() {
val context = RuntimeEnvironment.getApplication()
val plainPrefs = context.getSharedPreferences("openclaw.node", Context.MODE_PRIVATE)
plainPrefs.edit().clear().putBoolean("talk.enabled", true).commit()
val prefs = SecurePrefs(context)
assertFalse(prefs.voiceMicEnabled.value)
assertFalse(plainPrefs.contains("voice.micEnabled"))
}
@Test
fun setVoiceMicEnabled_persistsNewKeyOnly() {
val context = RuntimeEnvironment.getApplication()
val plainPrefs = context.getSharedPreferences("openclaw.node", Context.MODE_PRIVATE)
plainPrefs.edit().clear().putBoolean("talk.enabled", false).commit()
val prefs = SecurePrefs(context)
prefs.setVoiceMicEnabled(true)
assertTrue(prefs.voiceMicEnabled.value)
assertTrue(plainPrefs.getBoolean("voice.micEnabled", false))
assertFalse(plainPrefs.getBoolean("talk.enabled", false))
}
@Test
fun saveGatewayBootstrapToken_persistsSeparatelyFromSharedToken() {
val context = RuntimeEnvironment.getApplication()

@@ -1,4 +1,4 @@
9a012a9c87b9010683289dc7d68ba5446a4b78beedf381e2c5f9d486f25a9213 config-baseline.json
6128d6eff8c28d17194d1ae9ee7f72abae48da1c6476ab16e6378f1898e4373a config-baseline.core.json
6ed33ef102e7c92816243bfabc3626222a679c3270c12ec5ea47b28b66204b3b config-baseline.json
f86cb4d57ec1f5fd75008be0ab86151194945eb013a47ab4bdeaddafd3780da7 config-baseline.core.json
7cd9c908f066c143eab2a201efbc9640f483ab28bba92ddeca1d18cc2b528bc3 config-baseline.channel.json
7825b56a5b3fcdbe2e09ef8fe5d9f12ac3598435afebe20413051e45b0d1968e config-baseline.plugin.json

@@ -1,2 +1,2 @@
d5bad55d588ecafab1298a2a79578ce13becced8bc33d2b8543161ab528feca4 plugin-sdk-api-baseline.json
373ded33d5ecc61229de5179827182f0c6f805a804e1f0666cf2da68301153be plugin-sdk-api-baseline.jsonl
f813474b1623f06e1465daacd56db970e8e92ab1be122faee0fa2a1dc2d4fc43 plugin-sdk-api-baseline.json
b3ea88c0c9b4cf6d9a46f0d34149063303853e78ef9708224608e4da79b23190 plugin-sdk-api-baseline.jsonl

@@ -546,6 +546,9 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- default: audio file behavior
- tag `[[audio_as_voice]]` in agent reply to force voice-note send
- inbound voice-note transcripts are framed as machine-generated,
untrusted text in the agent context; mention detection still uses the raw
transcript so mention-gated voice messages continue to work.
Message action example:

@@ -365,7 +365,7 @@ When the linked self number is also present in `allowFrom`, WhatsApp self-chat s
- non-Ogg audio, including Microsoft Edge TTS MP3/WebM output, is transcoded to Ogg/Opus before PTT delivery
- native Ogg/Opus audio is sent with `audio/ogg; codecs=opus` for voice-note compatibility
- animated GIF playback is supported via `gifPlayback: true` on video sends
- captions are applied to the first media item when sending multi-media reply payloads
- captions are applied to the first media item when sending multi-media reply payloads, except PTT voice notes send the audio first and visible text separately because WhatsApp clients do not render voice-note captions consistently
- media source can be HTTP(S), `file://`, or local paths
</Accordion>

@@ -10,7 +10,7 @@ The CI runs on every push to `main` and every pull request. It uses smart scopin
QA Lab has dedicated CI lanes outside the main smart-scoped workflow. The
`Parity gate` workflow runs on matching PR changes and manual dispatch; it
builds the private QA runtime and compares the mock GPT-5.4 and Opus 4.6
builds the private QA runtime and compares the mock GPT-5.5 and Opus 4.6
agentic packs. The `QA-Lab - All Lanes` workflow runs nightly on `main` and on
manual dispatch; it fans out the mock parity gate, live Matrix lane, and live
Telegram lane as parallel jobs. The live jobs use the `qa-live-shared`

@@ -156,6 +156,9 @@ Use `image` for generation, edit, and description.
```bash
openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image generate --prompt "cinematic product photo of headphones" --json
openclaw infer image generate --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "simple red circle sticker on a transparent background" --json
openclaw infer image generate --prompt "slow image backend" --timeout-ms 180000 --json
openclaw infer image edit --file ./logo.png --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "keep the logo, remove the background" --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --json
@@ -164,6 +167,10 @@ openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --j
Notes:
- Use `image edit` when starting from existing input files.
- Use `--output-format png --background transparent` with
`--model openai/gpt-image-1.5` for transparent-background OpenAI PNG output;
`--openai-background` remains available as an OpenAI-specific alias. Providers
that do not declare background support report the hint as an ignored override.
- Use `image providers --json` to verify which bundled image providers are
discoverable, configured, selected, and which generation/edit capabilities
each provider exposes.

@@ -31,6 +31,8 @@ openclaw plugins inspect --all
openclaw plugins info <id>
openclaw plugins enable <id>
openclaw plugins disable <id>
openclaw plugins registry
openclaw plugins registry --refresh
openclaw plugins uninstall <id>
openclaw plugins doctor
openclaw plugins update <id-or-npm-spec>
@@ -195,18 +197,20 @@ openclaw plugins list --verbose
openclaw plugins list --json
```
Use `--enabled` to show only loaded plugins. Use `--verbose` to switch from the
Use `--enabled` to show only enabled plugins. Use `--verbose` to switch from the
table view to per-plugin detail lines with source/origin/version/activation
metadata. Use `--json` for machine-readable inventory plus registry
diagnostics.
`plugins list` runs discovery from the current CLI environment and config. It is
useful for checking whether a plugin is enabled/loadable, but it is not a live
runtime probe of an already-running Gateway process. After changing plugin code,
enablement, hook policy, or `plugins.load.paths`, restart the Gateway that
serves the channel before expecting new `register(api)` code or hooks to run.
For remote/container deployments, verify you are restarting the actual
`openclaw gateway run` child, not only a wrapper process.
`plugins list` reads the persisted local plugin registry first, with a
manifest-only derived fallback when the registry is missing or invalid. It is
useful for checking whether a plugin is installed, enabled, and visible to cold
startup planning, but it is not a live runtime probe of an already-running
Gateway process. After changing plugin code, enablement, hook policy, or
`plugins.load.paths`, restart the Gateway that serves the channel before
expecting new `register(api)` code or hooks to run. For remote/container
deployments, verify you are restarting the actual `openclaw gateway run` child,
not only a wrapper process.
For runtime hook debugging:
@@ -227,7 +231,19 @@ openclaw plugins install -l ./my-plugin
source path instead of copying over a managed install target.
Use `--pin` on npm installs to save the resolved exact spec (`name@version`) in
`plugins.installs` while keeping the default behavior unpinned.
the managed install ledger while keeping the default behavior unpinned.
### Install Ledger
Plugin install metadata is machine-managed state, not user config. New installs
and updates write it to `plugins/installs.json` under the active OpenClaw state
directory. The file includes a do-not-edit warning and is used by
`openclaw plugins update`, uninstall, diagnostics, and the cold plugin registry.
Legacy `plugins.installs` entries in `openclaw.json` remain readable as a
deprecated compatibility fallback. When install/update/uninstall paths rewrite
plugin install state, OpenClaw writes the ledger file and removes
`plugins.installs` from the persisted config payload.
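
The ledger-first read with a legacy fallback, and the cleanup on write, might look roughly like the sketch below. It is an illustration only, not OpenClaw's implementation: `InstallRecord`, `ConfigPayload`, `readInstalls`, and `writeInstalls` are hypothetical names, and the map keyed by plugin id is an assumed shape.

```typescript
// Hypothetical shapes for illustration only; the real ledger schema may differ.
interface InstallRecord {
  spec: string; // e.g. a recorded npm spec such as "name@version"
}
type InstallMap = Record<string, InstallRecord>;

interface ConfigPayload {
  plugins?: { installs?: InstallMap; [key: string]: unknown };
  [key: string]: unknown;
}

// Read path: prefer the ledger file, fall back to legacy config entries.
function readInstalls(ledger: InstallMap | undefined, config: ConfigPayload): InstallMap {
  return ledger ?? config.plugins?.installs ?? {};
}

// Write path: persist to the ledger and drop the legacy key from the config.
// The input config is copied, not mutated.
function writeInstalls(
  next: InstallMap,
  config: ConfigPayload,
): { ledger: InstallMap; config: ConfigPayload } {
  const plugins: NonNullable<ConfigPayload["plugins"]> = { ...config.plugins };
  delete plugins.installs;
  return { ledger: next, config: { ...config, plugins } };
}
```

Keeping the read path tolerant while the write path always migrates means a legacy config keeps working until the first install, update, or uninstall rewrites it.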
### Uninstall
@@ -237,8 +253,9 @@ openclaw plugins uninstall <id> --dry-run
openclaw plugins uninstall <id> --keep-files
```
`uninstall` removes plugin records from `plugins.entries`, `plugins.installs`,
the plugin allowlist, and linked `plugins.load.paths` entries when applicable.
`uninstall` removes plugin records from `plugins.entries`, the managed install
ledger, the plugin allowlist, and linked `plugins.load.paths` entries when
applicable.
For active memory plugins, the memory slot resets to `memory-core`.
By default, uninstall also removes the plugin install directory under the active
@@ -257,8 +274,8 @@ openclaw plugins update @openclaw/voice-call@beta
openclaw plugins update openclaw-codex-app-server --dangerously-force-unsafe-install
```
Updates apply to tracked installs in `plugins.installs` and tracked hook-pack
installs in `hooks.internal.installs`.
Updates apply to tracked plugin installs in the managed install ledger and
tracked hook-pack installs in `hooks.internal.installs`.
When you pass a plugin id, OpenClaw reuses the recorded install spec for that
plugin. That means previously stored dist-tags such as `@beta` and exact pinned
@@ -333,6 +350,29 @@ For module-shape failures such as missing `register`/`activate` exports, rerun
with `OPENCLAW_PLUGIN_LOAD_DEBUG=1` to include a compact export-shape summary in
the diagnostic output.
### Registry
```bash
openclaw plugins registry
openclaw plugins registry --refresh
openclaw plugins registry --json
```
The local plugin registry is OpenClaw's persisted cold read model for installed
plugin identity, enablement, source metadata, and contribution ownership.
Normal startup, provider owner lookup, channel setup classification, and plugin
inventory can read it without importing plugin runtime modules.
Use `plugins registry` to inspect whether the persisted registry is present,
current, or stale. Use `--refresh` to rebuild it from the durable install
ledger, config policy, and manifest/package metadata. This is a repair path, not
a runtime activation path.
`OPENCLAW_DISABLE_PERSISTED_PLUGIN_REGISTRY=1` is a deprecated break-glass
compatibility switch for registry read failures. Prefer `plugins registry
--refresh` or `openclaw doctor --fix`; the env fallback is only for emergency
startup recovery while the migration rolls out.
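
As a rough sketch of the "persisted registry first, derived fallback when missing or invalid" read order, assuming a simplified entry shape: `RegistryEntry`, `parsePersistedRegistry`, and `resolveRegistry` are hypothetical names, and the JSON-array file format is an assumption rather than the actual on-disk layout.

```typescript
// Assumed, simplified entry shape; the real registry carries more metadata.
interface RegistryEntry {
  id: string;
  enabled: boolean;
}

interface RegistrySnapshot {
  source: "persisted" | "derived";
  plugins: RegistryEntry[];
}

function isRegistryEntry(value: unknown): value is RegistryEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return typeof entry["id"] === "string" && typeof entry["enabled"] === "boolean";
}

// Returns undefined when the persisted payload is missing or invalid.
function parsePersistedRegistry(raw: string | undefined): RegistryEntry[] | undefined {
  if (raw === undefined) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed) || !parsed.every(isRegistryEntry)) return undefined;
    return parsed;
  } catch {
    return undefined; // malformed JSON counts as invalid
  }
}

// Cold read: never imports plugin runtime modules, only reads metadata.
function resolveRegistry(
  raw: string | undefined,
  deriveFromManifests: () => RegistryEntry[],
): RegistrySnapshot {
  const persisted = parsePersistedRegistry(raw);
  return persisted !== undefined
    ? { source: "persisted", plugins: persisted }
    : { source: "derived", plugins: deriveFromManifests() };
}
```

The `source` field is there so a caller can tell a healthy registry from a fallback and suggest a refresh.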
### Marketplace
```bash

@@ -77,6 +77,19 @@ gateway-backed session transcript, so they are the source of truth.
Details: [Session management](/concepts/session).
## Tool result metadata
Tool result `content` is the model-visible result. Tool result `details` is
runtime metadata for UI rendering, diagnostics, media delivery, and plugins.
OpenClaw keeps that boundary explicit:
- `toolResult.details` is stripped before provider replay and compaction input.
- Persisted session transcripts keep only bounded `details`; oversized metadata
is replaced with a compact summary marked `persistedDetailsTruncated: true`.
- Plugins and tools should put text the model must read in `content`, not only
in `details`.
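
The `content`/`details` boundary can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `ToolResult` shape, the helper names, the size limit, and the fields of the summary other than `persistedDetailsTruncated`.

```typescript
// Hypothetical tool result shape; not OpenClaw's actual type.
interface ToolResult {
  content: string; // model-visible
  details?: Record<string, unknown>; // runtime metadata only
}

// Assumed limit, measured in serialized characters.
const MAX_PERSISTED_DETAILS_CHARS = 4096;

// Provider replay and compaction input never see `details`.
function forProviderReplay(result: ToolResult): ToolResult {
  return { content: result.content };
}

// Persisted transcripts keep `details` only while it stays bounded.
function forPersistence(
  result: ToolResult,
  maxChars: number = MAX_PERSISTED_DETAILS_CHARS,
): ToolResult {
  if (result.details === undefined) return { content: result.content };
  const serialized = JSON.stringify(result.details);
  if (serialized.length <= maxChars) return result;
  return {
    content: result.content,
    details: {
      persistedDetailsTruncated: true,
      originalChars: serialized.length, // assumed summary field
      keys: Object.keys(result.details), // assumed summary field
    },
  };
}
```

Because `forProviderReplay` drops `details` entirely, any text the model must read has to live in `content`.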
## Inbound bodies and history context
OpenClaw separates the **prompt body** from the **command body**:
@@ -154,6 +167,8 @@ Details: [Configuration](/gateway/config-agents#messages) and channel docs.
## Silent replies
The exact silent token `NO_REPLY` / `no_reply` means “do not deliver a user-visible reply”.
When a turn also has pending tool media, such as generated TTS audio, OpenClaw
strips the silent text but still delivers the media attachment.
OpenClaw resolves that behavior by conversation type:
- Direct conversations disallow silence by default and rewrite a bare silent

@@ -129,15 +129,18 @@ validation failures) are treated as failoverworthy and use the same cooldowns
OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`,
`stop reason: error`, and `reason: error` are classified as timeout/failover
signals.
Provider-scoped generic server text can also land in that timeout bucket when
the source matches a known transient pattern. For example, Anthropic bare
`An unknown error occurred` and JSON `api_error` payloads with transient server
text such as `internal server error`, `unknown error, 520`, `upstream error`,
or `backend error` are treated as failover-worthy timeouts. OpenRouter-specific
generic upstream text such as bare `Provider returned error` is also treated as
timeout only when the provider context is actually OpenRouter. Generic internal
fallback text such as `LLM request failed with an unknown error.` stays
conservative and does not trigger failover by itself.
Generic server text can also land in that timeout bucket when the source matches
a known transient pattern. For example, the bare pi-ai stream-wrapper message
`An unknown error occurred` is treated as failover-worthy for every provider
because pi-ai emits it when provider streams end with `stopReason: "aborted"` or
`stopReason: "error"` without specific details. JSON `api_error` payloads with
transient server text such as `internal server error`, `unknown error, 520`,
`upstream error`, or `backend error` are also treated as failover-worthy
timeouts.
OpenRouter-specific generic upstream text such as bare `Provider returned error`
is treated as timeout only when the provider context is actually OpenRouter.
Generic internal fallback text such as `LLM request failed with an unknown
error.` stays conservative and does not trigger failover by itself.
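
A minimal sketch of these classification rules, for orientation only. The function names, the `"openrouter"` provider id, and the `{ error: { type, message } }` payload shape are assumptions; the real classifier may match more patterns and use different inputs.

```typescript
type FailoverSignal = "timeout" | "none";

// Transient server phrases quoted in the text above.
const TRANSIENT_SERVER_TEXT = [
  "internal server error",
  "unknown error, 520",
  "upstream error",
  "backend error",
];

// Assumed payload shape: { error: { type: "api_error", message: string } }.
function isTransientApiError(raw: string): boolean {
  try {
    const parsed = JSON.parse(raw);
    const error = parsed?.error;
    if (error?.type !== "api_error" || typeof error.message !== "string") return false;
    const message: string = error.message.toLowerCase();
    return TRANSIENT_SERVER_TEXT.some((phrase) => message.includes(phrase));
  } catch {
    return false; // not JSON, so not an api_error payload
  }
}

function classifyGenericErrorText(provider: string, text: string): FailoverSignal {
  const message = text.trim();
  // Bare stream-wrapper message: failover-worthy for every provider.
  if (message === "An unknown error occurred") return "timeout";
  // Generic upstream text: only trusted when the provider is OpenRouter.
  if (message === "Provider returned error") {
    return provider === "openrouter" ? "timeout" : "none";
  }
  if (isTransientApiError(message)) return "timeout";
  // Everything else, including the internal fallback text, stays conservative.
  return "none";
}
```

Exact-match checks on the bare messages keep the rule narrow, so longer error strings that merely contain those words are not promoted to failover.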
Some provider SDKs may otherwise sleep for a long `Retry-After` window before
returning control to OpenClaw. For Stainless-based SDKs such as Anthropic and

@@ -30,9 +30,9 @@ Reference for **LLM/model providers** (not chat channels like WhatsApp/Telegram)
`google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate
back to canonical provider refs with the runtime recorded separately.
- GPT-5.5 is available through `openai-codex/gpt-5.5` in PI, the native
Codex app-server harness, and the public OpenAI API when the bundled PI
catalog exposes `openai/gpt-5.5` for your install.
- GPT-5.5 is available through `openai/gpt-5.5` for direct API-key traffic,
`openai-codex/gpt-5.5` in PI for Codex OAuth, and the native Codex
app-server harness when `embeddedHarness.runtime: "codex"` is set.
## Plugin-owned provider behavior
@@ -71,10 +71,9 @@ OpenClaw ships with the piai catalog. These providers require **no**
- Provider: `openai`
- Auth: `OPENAI_API_KEY`
- Optional rotation: `OPENAI_API_KEYS`, `OPENAI_API_KEY_1`, `OPENAI_API_KEY_2`, plus `OPENCLAW_LIVE_OPENAI_KEY` (single override)
- Example models: `openai/gpt-5.5`, `openai/gpt-5.4`, `openai/gpt-5.4-mini`
- GPT-5.5 direct API support depends on the bundled PI catalog version for
your install; verify with `openclaw models list --provider openai` before
using `openai/gpt-5.5` without the Codex app-server runtime.
- Example models: `openai/gpt-5.5`, `openai/gpt-5.4-mini`
- Verify account/model availability with `openclaw models list --provider openai`
if a specific install or API key behaves differently.
- CLI: `openclaw onboard --auth-choice openai-api-key`
- Default transport is `auto` (WebSocket-first, SSE fallback)
- Override per model via `agents.defaults.models["openai/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
@@ -91,7 +90,7 @@ OpenClaw ships with the pi-ai catalog. These providers require **no**
```json5
{
agents: { defaults: { model: { primary: "openai/gpt-5.4" } } },
agents: { defaults: { model: { primary: "openai/gpt-5.5" } } },
}
```

@@ -238,7 +238,7 @@ refs and write a judged Markdown report:
```bash
pnpm openclaw qa character-eval \
--model openai/gpt-5.4,thinking=medium,fast \
--model openai/gpt-5.5,thinking=medium,fast \
--model openai/gpt-5.2,thinking=xhigh \
--model openai/gpt-5,thinking=xhigh \
--model anthropic/claude-opus-4-6,thinking=high \
@@ -246,7 +246,7 @@ pnpm openclaw qa character-eval \
--model zai/glm-5.1,thinking=high \
--model moonshot/kimi-k2.5,thinking=high \
--model google/gemini-3.1-pro-preview,thinking=high \
--judge-model openai/gpt-5.4,thinking=xhigh,fast \
--judge-model openai/gpt-5.5,thinking=xhigh,fast \
--judge-model anthropic/claude-opus-4-6,thinking=high \
--blind-judge-models \
--concurrency 16 \
@@ -263,7 +263,7 @@ Use `--blind-judge-models` when comparing providers: the judge prompt still gets
every transcript and run status, but candidate refs are replaced with neutral
labels such as `candidate-01`; the report maps rankings back to real refs after
parsing.
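The blinding step described above can be sketched as a two-way label map. This is an illustrative sketch only: `buildBlindMapping` and `unblindRanking` are hypothetical helpers, and the zero-padded numbering is inferred from the `candidate-01` example.

```typescript
// Hypothetical sketch of blind-judge labeling; not the actual eval code.
interface BlindMapping {
  labelByRef: Record<string, string>;
  refByLabel: Record<string, string>;
}

// Assign neutral labels (candidate-01, candidate-02, ...) in input order.
function buildBlindMapping(candidateRefs: string[]): BlindMapping {
  const mapping: BlindMapping = { labelByRef: {}, refByLabel: {} };
  candidateRefs.forEach((ref, index) => {
    const n = index + 1;
    const label = "candidate-" + (n < 10 ? "0" + n : String(n));
    mapping.labelByRef[ref] = label;
    mapping.refByLabel[label] = ref;
  });
  return mapping;
}

// Map a judge's ranking of neutral labels back to real model refs.
function unblindRanking(rankedLabels: string[], mapping: BlindMapping): string[] {
  return rankedLabels.map((label) => {
    const ref = mapping.refByLabel[label];
    if (ref === undefined) throw new Error("Unknown candidate label: " + label);
    return ref;
  });
}
```

The judge only ever sees the labels, so the reverse lookup happens after the judge output has been parsed.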
Candidate runs default to `high` thinking, with `medium` for GPT-5.4 and `xhigh`
Candidate runs default to `high` thinking, with `medium` for GPT-5.5 and `xhigh`
for older OpenAI eval refs that support it. Override a specific candidate inline with
`--model provider/model,thinking=<level>`. `--thinking <level>` still sets a
global fallback, and the older `--model-thinking <provider/model=level>` form is
@@ -278,12 +278,12 @@ Candidate and judge model runs both default to concurrency 16. Lower
`--concurrency` or `--judge-concurrency` when provider limits or local gateway
pressure make a run too noisy.
When no candidate `--model` is passed, the character eval defaults to
`openai/gpt-5.4`, `openai/gpt-5.2`, `openai/gpt-5`, `anthropic/claude-opus-4-6`,
`openai/gpt-5.5`, `openai/gpt-5.2`, `openai/gpt-5`, `anthropic/claude-opus-4-6`,
`anthropic/claude-sonnet-4-6`, `zai/glm-5.1`,
`moonshot/kimi-k2.5`, and
`google/gemini-3.1-pro-preview`.
When no `--judge-model` is passed, the judges default to
`openai/gpt-5.4,thinking=xhigh,fast` and
`openai/gpt-5.5,thinking=xhigh,fast` and
`anthropic/claude-opus-4-6,thinking=high`.
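As a rough illustration of the inline `provider/model,thinking=<level>,fast` form used above, the following sketch splits a spec into its parts. `parseModelSpec` is a hypothetical helper, and treating a bare `fast` token as a boolean flag is an assumption based on the examples, not a statement about the real parser.

```typescript
// Hypothetical sketch; the real CLI parser may accept more options.
interface ModelSpec {
  ref: string; // "provider/model"
  thinking?: string; // e.g. "medium", "high", "xhigh"
  fast: boolean; // assumed meaning of the bare `fast` token
}

function parseModelSpec(spec: string): ModelSpec {
  const parts = spec.split(",").map((part) => part.trim());
  const ref = parts[0];
  if (!ref || ref.indexOf("/") === -1) {
    throw new Error('Expected "provider/model" in "' + spec + '"');
  }

  const result: ModelSpec = { ref, fast: false };
  for (const option of parts.slice(1)) {
    if (option === "fast") {
      result.fast = true;
    } else if (option.indexOf("thinking=") === 0) {
      result.thinking = option.slice("thinking=".length);
    } else {
      throw new Error('Unknown option "' + option + '" in "' + spec + '"');
    }
  }
  return result;
}
```

With this reading, `openai/gpt-5.5,thinking=xhigh,fast` yields the ref `openai/gpt-5.5`, thinking level `xhigh`, and `fast` set to `true`.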
## Related docs

@@ -52,6 +52,14 @@
]
},
"redirects": [
{
"source": "/help/gpt54-codex-agentic-parity",
"destination": "/help/gpt55-codex-agentic-parity"
},
{
"source": "/help/gpt54-codex-agentic-parity-maintainers",
"destination": "/help/gpt55-codex-agentic-parity-maintainers"
},
{
"source": "/mcp",
"destination": "/cli/mcp"
@@ -1649,8 +1657,8 @@
"concepts/typing-indicators",
"concepts/usage-tracking",
"concepts/timezone",
"help/gpt54-codex-agentic-parity",
"help/gpt54-codex-agentic-parity-maintainers"
"help/gpt55-codex-agentic-parity",
"help/gpt55-codex-agentic-parity-maintainers"
]
},
{

@@ -342,8 +342,8 @@ Time format in system prompt. Default: `auto` (OS preference).
- Also used as fallback routing when the selected/default model cannot accept image input.
- `imageGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared image-generation capability and any future tool/plugin surface that generates images.
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-2` for OpenAI Images.
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2`, `FAL_KEY` for `fal/*`).
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, `openai/gpt-image-2` for OpenAI Images, or `openai/gpt-image-1.5` for transparent-background OpenAI PNG/WebP output.
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2` / `openai/gpt-image-1.5`, `FAL_KEY` for `fal/*`).
- If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order.
- `musicGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
- Used by the shared music-generation capability and the built-in `music_generate` tool.
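The `image_generate` inference order described above for an omitted `imageGenerationModel` (current default provider first, then the remaining registered providers in provider-id order) could look roughly like this. The `ImageProvider` shape, the `hasAuth` flag, and plain string comparison as "provider-id order" are assumptions made for illustration.

```typescript
// Hypothetical sketch of auth-backed image provider inference.
interface ImageProvider {
  id: string;
  hasAuth: boolean; // assumed: whether matching provider auth is configured
}

// Default provider first, then the rest sorted by provider id.
function candidateOrder(
  defaultProviderId: string | undefined,
  registered: ImageProvider[],
): ImageProvider[] {
  const sorted = registered
    .slice()
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const first = sorted.filter((p) => p.id === defaultProviderId);
  const rest = sorted.filter((p) => p.id !== defaultProviderId);
  return first.concat(rest);
}

// Pick the first candidate that has usable auth, if any.
function inferImageProvider(
  defaultProviderId: string | undefined,
  registered: ImageProvider[],
): string | undefined {
  for (const provider of candidateOrder(defaultProviderId, registered)) {
    if (provider.hasAuth) return provider.id;
  }
  return undefined;
}
```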
@@ -363,7 +363,7 @@ Time format in system prompt. Default: `auto` (OS preference).
- `pdfMaxPages`: default maximum pages considered by extraction fallback mode in the `pdf` tool.
- `verboseDefault`: default verbose level for agents. Values: `"off"`, `"on"`, `"full"`. Default: `"off"`.
- `elevatedDefault`: default elevated-output level for agents. Values: `"off"`, `"on"`, `"ask"`, `"full"`. Default: `"on"`.
- `model.primary`: format `provider/model` (e.g. `openai/gpt-5.4` for API-key access or `openai-codex/gpt-5.5` for Codex OAuth). If you omit the provider, OpenClaw tries an alias first, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider (deprecated compatibility behavior, so prefer explicit `provider/model`). If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default.
- `model.primary`: format `provider/model` (e.g. `openai/gpt-5.5` for API-key access or `openai-codex/gpt-5.5` for Codex OAuth). If you omit the provider, OpenClaw tries an alias first, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider (deprecated compatibility behavior, so prefer explicit `provider/model`). If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default.
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific, for example `temperature`, `maxTokens`, `cacheRetention`, `context1m`, `responsesServerCompaction`, `responsesCompactThreshold`, `extra_body`/`extraBody`).
- Safe edits: use `openclaw config set agents.defaults.models '<json>' --strict-json --merge` to add entries. `config set` refuses replacements that would remove existing allowlist entries unless you pass `--replace`.
- Provider-scoped configure/onboarding flows merge selected provider models into this map and preserve unrelated providers already configured.
@@ -406,16 +406,16 @@ Codex app-server harness. For the mental model, see
**Built-in alias shorthands** (only apply when the model is in `agents.defaults.models`):
| Alias | Model |
| ------------------- | -------------------------------------------------- |
| `opus` | `anthropic/claude-opus-4-6` |
| `sonnet` | `anthropic/claude-sonnet-4-6` |
| `gpt` | `openai/gpt-5.4` or configured Codex OAuth GPT-5.5 |
| `gpt-mini` | `openai/gpt-5.4-mini` |
| `gpt-nano` | `openai/gpt-5.4-nano` |
| `gemini` | `google/gemini-3.1-pro-preview` |
| `gemini-flash` | `google/gemini-3-flash-preview` |
| `gemini-flash-lite` | `google/gemini-3.1-flash-lite-preview` |
| Alias | Model |
| ------------------- | ------------------------------------------ |
| `opus` | `anthropic/claude-opus-4-6` |
| `sonnet` | `anthropic/claude-sonnet-4-6` |
| `gpt` | `openai/gpt-5.5` or `openai-codex/gpt-5.5` |
| `gpt-mini` | `openai/gpt-5.4-mini` |
| `gpt-nano` | `openai/gpt-5.4-nano` |
| `gemini` | `google/gemini-3.1-pro-preview` |
| `gemini-flash` | `google/gemini-3-flash-preview` |
| `gemini-flash-lite` | `google/gemini-3.1-flash-lite-preview` |
Your configured aliases always win over defaults.

@@ -186,9 +186,14 @@ See [MCP](/cli/mcp#openclaw-as-an-mcp-client-registry) and
- Enabled Claude bundle plugins can also contribute embedded Pi defaults from `settings.json`; OpenClaw applies those as sanitized agent settings, not as raw OpenClaw config patches.
- `plugins.slots.memory`: pick the active memory plugin id, or `"none"` to disable memory plugins.
- `plugins.slots.contextEngine`: pick the active context engine plugin id; defaults to `"legacy"` unless you install and select another engine.
- `plugins.installs`: CLI-managed install metadata used by `openclaw plugins update`.
- Includes `source`, `spec`, `sourcePath`, `installPath`, `version`, `resolvedName`, `resolvedVersion`, `resolvedSpec`, `integrity`, `shasum`, `resolvedAt`, `installedAt`.
- Treat `plugins.installs.*` as managed state; prefer CLI commands over manual edits.
- `plugins.installs`: deprecated compatibility fallback for legacy
CLI-managed install metadata. New plugin installs write the managed
`plugins/installs.json` state ledger instead.
- Legacy records include `source`, `spec`, `sourcePath`, `installPath`,
`version`, `resolvedName`, `resolvedVersion`, `resolvedSpec`, `integrity`,
`shasum`, `resolvedAt`, `installedAt`.
- Treat `plugins.installs.*` as managed state; prefer CLI commands over
manual edits.
See [Plugins](/tools/plugin).
@@ -253,6 +258,12 @@ See [Plugins](/tools/plugin).
- `profiles.*.cdpUrl` accepts `http://`, `https://`, `ws://`, and `wss://`.
Use HTTP(S) when you want OpenClaw to discover `/json/version`; use WS(S)
when your provider gives you a direct DevTools WebSocket URL.
- `remoteCdpTimeoutMs` and `remoteCdpHandshakeTimeoutMs` apply to remote and
`attachOnly` CDP reachability plus tab-opening requests. Managed loopback
profiles keep local CDP defaults.
- If an externally managed CDP service is reachable through loopback, set that
profile's `attachOnly: true`; otherwise OpenClaw treats the loopback port as a
local managed browser profile and may report local port ownership errors.
- `existing-session` profiles use Chrome MCP instead of CDP and can attach on
the selected host or through a connected browser node.
- `existing-session` profiles can set `userDataDir` to target a specific
@@ -269,9 +280,12 @@ See [Plugins](/tools/plugin).
- Local managed profiles use `browser.localLaunchTimeoutMs` for Chrome CDP HTTP
discovery after process start and `browser.localCdpReadyTimeoutMs` for
post-launch CDP websocket readiness. Raise them on slower hosts where Chrome
starts successfully but readiness checks race startup.
starts successfully but readiness checks race startup. Both values must be
positive integers up to `120000` ms; invalid config values are rejected.
- Auto-detect order: default browser if Chromium-based → Chrome → Brave → Edge → Chromium → Chrome Canary.
- `browser.executablePath` accepts `~` for your OS home directory.
- `browser.executablePath` and `browser.profiles.<name>.executablePath` both
accept `~` and `~/...` for your OS home directory before Chromium launch.
Per-profile `userDataDir` on `existing-session` profiles is also tilde-expanded.
- Control service: loopback only (port derived from `gateway.port`, default `18791`).
- `extraArgs` appends extra launch flags to local Chromium startup (for example
`--disable-gpu`, window sizing, or debug flags).
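Two of the rules above, the `120000` ms ceiling on the local launch timeouts and tilde expansion for executable paths, can be sketched as small helpers. Both functions are hypothetical; the home directory is passed in explicitly so the sketch stays independent of any particular runtime API.

```typescript
// Hypothetical sketch of browser config normalization; not OpenClaw's code.
const MAX_TIMEOUT_MS = 120000;

// Accept only positive integers up to MAX_TIMEOUT_MS; reject everything else.
function validateTimeoutMs(name: string, value: unknown): number {
  const valid =
    typeof value === "number" &&
    isFinite(value) &&
    Math.floor(value) === value &&
    value > 0 &&
    value <= MAX_TIMEOUT_MS;
  if (!valid) {
    throw new Error(name + " must be a positive integer up to " + MAX_TIMEOUT_MS + " ms");
  }
  return value as number;
}

// Expand a leading "~" or "~/..." to the given home directory.
// Other forms (for example "~user/...") are returned unchanged.
function expandTilde(path: string, homeDir: string): string {
  if (path === "~") return homeDir;
  if (path.indexOf("~/") === 0) return homeDir + path.slice(1);
  return path;
}
```

For example, `expandTilde("~/bin/chrome", "/home/me")` gives `/home/me/bin/chrome`, while an absolute path passes through untouched.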
@@ -903,6 +917,7 @@ Notes:
- `otel.sampleRate`: trace sampling rate `0`–`1`.
- `otel.flushIntervalMs`: periodic telemetry flush interval in ms.
- `otel.captureContent`: opt-in raw content capture for OTEL span attributes. Defaults to off. Boolean `true` captures non-system message/tool content; the object form lets you enable `inputMessages`, `outputMessages`, `toolInputs`, `toolOutputs`, and `systemPrompt` explicitly.
- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`: environment toggle for latest experimental GenAI span provider attributes. By default spans keep the legacy `gen_ai.system` attribute for compatibility; GenAI metrics use bounded semantic attributes.
- `OPENCLAW_OTEL_PRELOADED=1`: environment toggle for hosts that already registered a global OpenTelemetry SDK. OpenClaw then skips plugin-owned SDK startup/shutdown while keeping diagnostic listeners active.
- `cacheTrace.enabled`: log cache trace snapshots for embedded runs (default: `false`).
- `cacheTrace.filePath`: output path for cache trace JSONL (default: `$OPENCLAW_STATE_DIR/logs/cache-trace.jsonl`).

@@ -457,7 +457,7 @@ Doctor prints a summary of the workspace state for the default agent:
- **Skills status**: counts eligible, missing-requirements, and allowlist-blocked skills.
- **Legacy workspace dirs**: warns when `~/openclaw` or other legacy workspace directories
exist alongside the current workspace.
- **Plugin status**: counts loaded/disabled/errored plugins; lists plugin IDs for any
- **Plugin status**: counts enabled/disabled/errored plugins; lists plugin IDs for any
errors; reports bundle plugin capabilities.
- **Plugin compatibility warnings**: flags plugins that have compatibility issues with
the current runtime.

@@ -595,10 +595,9 @@ and troubleshooting see the main [FAQ](/help/faq).
<Accordion title="How does Codex auth work?">
OpenClaw supports **OpenAI Code (Codex)** via OAuth (ChatGPT sign-in). Use
`openai-codex/gpt-5.5` for Codex OAuth through the default PI runner. Use
`openai/gpt-5.4` for current direct OpenAI API-key access. GPT-5.5 direct
API-key access is supported once OpenAI enables it on the public API; today
GPT-5.5 uses subscription/OAuth via `openai-codex/gpt-5.5` or native Codex
app-server runs with `openai/gpt-5.5` and `embeddedHarness.runtime: "codex"`.
`openai/gpt-5.5` for direct OpenAI API-key access. GPT-5.5 can also use
subscription/OAuth via `openai-codex/gpt-5.5` or native Codex app-server
runs with `openai/gpt-5.5` and `embeddedHarness.runtime: "codex"`.
See [Model providers](/concepts/model-providers) and [Onboarding (CLI)](/start/wizard).
</Accordion>
@@ -606,8 +605,7 @@ and troubleshooting see the main [FAQ](/help/faq).
`openai-codex` is the provider and auth-profile id for ChatGPT/Codex OAuth.
It is also the explicit PI model prefix for Codex OAuth:
- `openai/gpt-5.4` = current direct OpenAI API-key route in PI
- `openai/gpt-5.5` = future direct API-key route once OpenAI enables GPT-5.5 on the API
- `openai/gpt-5.5` = current direct OpenAI API-key route in PI
- `openai-codex/gpt-5.5` = Codex OAuth route in PI
- `openai/gpt-5.5` + `embeddedHarness.runtime: "codex"` = native Codex app-server route
- `openai-codex:...` = auth profile id, not a model ref

@@ -21,7 +21,7 @@ troubleshooting, see the main [FAQ](/help/faq).
agents.defaults.model.primary
```
Models are referenced as `provider/model` (example: `openai/gpt-5.4` or `openai-codex/gpt-5.5`). If you omit the provider, OpenClaw first tries an alias, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider as a deprecated compatibility path. If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default. You should still **explicitly** set `provider/model`.
Models are referenced as `provider/model` (example: `openai/gpt-5.5` or `openai-codex/gpt-5.5`). If you omit the provider, OpenClaw first tries an alias, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider as a deprecated compatibility path. If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default. You should still **explicitly** set `provider/model`.
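The lookup order for a ref without a provider can be sketched as below. This is a simplified illustration under stated assumptions: `ModelConfig` and `resolveModelRef` are hypothetical, any input containing `/` is treated as already explicit, and the stale-default fallback to the first configured provider/model is not modeled.

```typescript
// Hypothetical sketch of provider-less model ref resolution.
interface ModelConfig {
  aliases: Record<string, string>; // alias -> "provider/model"
  providers: Record<string, string[]>; // provider id -> model ids it exposes
  defaultProvider: string;
}

function resolveModelRef(input: string, config: ModelConfig): string {
  // Assumption: a slash means the caller already gave "provider/model".
  if (input.indexOf("/") !== -1) return input;

  // 1. Aliases are tried first.
  const alias = Object.prototype.hasOwnProperty.call(config.aliases, input)
    ? config.aliases[input]
    : undefined;
  if (alias !== undefined) return alias;

  // 2. Then a unique configured-provider match for that exact model id.
  const matches = Object.keys(config.providers).filter(
    (provider) => (config.providers[provider] ?? []).indexOf(input) !== -1,
  );
  if (matches.length === 1) return matches[0] + "/" + input;

  // 3. Deprecated compatibility path: the configured default provider.
  return config.defaultProvider + "/" + input;
}
```

Note how an ambiguous id such as `gpt-5.5`, exposed by both `openai` and `openai-codex`, falls through to the default provider in this sketch, which is one reason to prefer explicit `provider/model` refs.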
</Accordion>
@@ -146,13 +146,10 @@ troubleshooting, see the main [FAQ](/help/faq).
<Accordion title="Can I use GPT 5.5 for daily tasks and Codex 5.5 for coding?">
Yes. Set one as default and switch as needed:
- **Quick switch (per session):** `/model openai/gpt-5.4` for current direct OpenAI API-key tasks or `/model openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth tasks.
- **Default:** set `agents.defaults.model.primary` to `openai/gpt-5.4` for API-key usage or `openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth usage.
- **Quick switch (per session):** `/model openai/gpt-5.5` for current direct OpenAI API-key tasks or `/model openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth tasks.
- **Default:** set `agents.defaults.model.primary` to `openai/gpt-5.5` for API-key usage or `openai-codex/gpt-5.5` for GPT-5.5 Codex OAuth usage.
- **Sub-agents:** route coding tasks to sub-agents with a different default model.
Direct API-key access for `openai/gpt-5.5` is supported once OpenAI enables
GPT-5.5 on the public API. Until then GPT-5.5 is subscription/OAuth-only.
See [Models](/concepts/models) and [Slash commands](/tools/slash-commands).
</Accordion>
@@ -160,8 +157,8 @@ troubleshooting, see the main [FAQ](/help/faq).
<Accordion title="How do I configure fast mode for GPT 5.5?">
Use either a session toggle or a config default:
- **Per session:** send `/fast on` while the session is using `openai/gpt-5.4` or `openai-codex/gpt-5.5`.
- **Per model default:** set `agents.defaults.models["openai/gpt-5.4"].params.fastMode` or `agents.defaults.models["openai-codex/gpt-5.5"].params.fastMode` to `true`.
- **Per session:** send `/fast on` while the session is using `openai/gpt-5.5` or `openai-codex/gpt-5.5`.
- **Per model default:** set `agents.defaults.models["openai/gpt-5.5"].params.fastMode` or `agents.defaults.models["openai-codex/gpt-5.5"].params.fastMode` to `true`.
Example:
@@ -170,7 +167,7 @@ troubleshooting, see the main [FAQ](/help/faq).
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: {
fastMode: true,
},
@@ -241,7 +238,7 @@ troubleshooting, see the main [FAQ](/help/faq).
model: { primary: "minimax/MiniMax-M2.7" },
models: {
"minimax/MiniMax-M2.7": { alias: "minimax" },
"openai/gpt-5.4": { alias: "gpt" },
"openai/gpt-5.5": { alias: "gpt" },
},
},
},
@@ -269,7 +266,7 @@ troubleshooting, see the main [FAQ](/help/faq).
- `opus` → `anthropic/claude-opus-4-6`
- `sonnet` → `anthropic/claude-sonnet-4-6`
- `gpt` → `openai/gpt-5.4` for API-key setups, or `openai-codex/gpt-5.5` when configured for Codex OAuth
- `gpt` → `openai/gpt-5.5` for API-key setups, or `openai-codex/gpt-5.5` when configured for Codex OAuth
- `gpt-mini` → `openai/gpt-5.4-mini`
- `gpt-nano` → `openai/gpt-5.4-nano`
- `gemini` → `google/gemini-3.1-pro-preview`

@@ -1,12 +1,12 @@
---
summary: "How to review the GPT-5.4 / Codex parity program as four merge units"
title: "GPT-5.4 / Codex parity maintainer notes"
summary: "How to review the GPT-5.5 / Codex parity program as four merge units"
title: "GPT-5.5 / Codex parity maintainer notes"
read_when:
- Reviewing the GPT-5.4 / Codex parity PR series
- Reviewing the GPT-5.5 / Codex parity PR series
- Maintaining the six-contract agentic architecture behind the parity program
---
This note explains how to review the GPT-5.4 / Codex parity program as four merge units without losing the original six-contract architecture.
This note explains how to review the GPT-5.5 / Codex parity program as four merge units without losing the original six-contract architecture.
## Merge units
@@ -59,7 +59,7 @@ Does not own:
Owns:
- first-wave GPT-5.4 vs Opus 4.6 scenario pack
- first-wave GPT-5.5 vs Opus 4.6 scenario pack
- parity documentation
- parity report and release-gate mechanics
@@ -123,7 +123,7 @@ Expected artifacts from PR D:
## Release gate
Do not claim GPT-5.4 parity or superiority over Opus 4.6 until:
Do not claim GPT-5.5 parity or superiority over Opus 4.6 until:
- PR A, PR B, and PR C are merged
- PR D runs the first-wave parity pack cleanly
@@ -132,7 +132,7 @@ Do not claim GPT-5.4 parity or superiority over Opus 4.6 until:
```mermaid
flowchart LR
A["PR A-C merged"] --> B["Run GPT-5.4 parity pack"]
A["PR A-C merged"] --> B["Run GPT-5.5 parity pack"]
A --> C["Run Opus 4.6 parity pack"]
B --> D["qa-suite-summary.json"]
C --> E["qa-suite-summary.json"]
@@ -146,7 +146,7 @@ flowchart LR
The parity harness is not the only evidence source. Keep this split explicit in review:
- PR D owns the scenario-based GPT-5.4 vs Opus 4.6 comparison
- PR D owns the scenario-based GPT-5.5 vs Opus 4.6 comparison
- PR B deterministic suites still own auth/proxy/DNS and full-access truthfulness evidence
## Quick maintainer merge workflow
@@ -179,13 +179,13 @@ If any one of the evidence bar items is missing, request changes instead of merg
| No fake progress or fake tool completion | PR A + PR D | parity fake-success count plus scenario-level report details |
| No false `/elevated full` guidance | PR B | deterministic runtime-truthfulness suites |
| Replay/liveness failures remain explicit | PR C + PR D | lifecycle/replay suites plus `compaction-retry-mutating-tool` |
| GPT-5.4 matches or beats Opus 4.6 | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` |
| GPT-5.5 matches or beats Opus 4.6 | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` |
## Reviewer shorthand: before vs after
| User-visible problem before | Review signal after |
| ----------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| GPT-5.4 stopped after planning | PR A shows act-or-block behavior instead of commentary-only completion |
| GPT-5.5 stopped after planning | PR A shows act-or-block behavior instead of commentary-only completion |
| Tool use felt brittle with strict OpenAI/Codex schemas | PR C keeps tool registration and parameter-free invocation predictable |
| `/elevated full` hints were sometimes misleading | PR B ties guidance to actual runtime capability and blocked reasons |
| Long tasks could disappear into replay/compaction ambiguity | PR C emits explicit paused, blocked, abandoned, and replay-invalid state |
@@ -193,4 +193,4 @@ If any one of the evidence bar items is missing, request changes instead of merg
## Related
- [GPT-5.4 / Codex agentic parity](/help/gpt54-codex-agentic-parity)
- [GPT-5.5 / Codex agentic parity](/help/gpt55-codex-agentic-parity)

@@ -1,15 +1,15 @@
---
summary: "How OpenClaw closes agentic execution gaps for GPT-5.4 and Codex-style models"
title: "GPT-5.4 / Codex agentic parity"
summary: "How OpenClaw closes agentic execution gaps for GPT-5.5 and Codex-style models"
title: "GPT-5.5 / Codex agentic parity"
read_when:
- Debugging GPT-5.4 or Codex agent behavior
- Debugging GPT-5.5 or Codex agent behavior
- Comparing OpenClaw agentic behavior across frontier models
- Reviewing the strict-agentic, tool-schema, elevation, and replay fixes
---
# GPT-5.4 / Codex Agentic Parity in OpenClaw
# GPT-5.5 / Codex Agentic Parity in OpenClaw
OpenClaw already worked well with tool-using frontier models, but GPT-5.4 and Codex-style models were still underperforming in a few practical ways:
OpenClaw already worked well with tool-using frontier models, but GPT-5.5 and Codex-style models were still underperforming in a few practical ways:
- they could stop after planning instead of doing the work
- they could use strict OpenAI/Codex tool schemas incorrectly
@@ -27,7 +27,7 @@ This slice adds an opt-in `strict-agentic` execution contract for embedded Pi GP
When enabled, OpenClaw stops accepting plan-only turns as “good enough” completion. If the model only says what it intends to do and does not actually use tools or make progress, OpenClaw retries with an act-now steer and then fails closed with an explicit blocked state instead of silently ending the task.
This improves the GPT-5.4 experience most on:
This improves the GPT-5.5 experience most on:
- short “ok do it” follow-ups
- code tasks where the first step is obvious
@@ -40,7 +40,7 @@ This slice makes OpenClaw tell the truth about two things:
- why the provider/runtime call failed
- whether `/elevated full` is actually available
That means GPT-5.4 gets better runtime signals for missing scope, auth refresh failures, HTML 403 auth failures, proxy issues, DNS or timeout failures, and blocked full-access modes. The model is less likely to hallucinate the wrong remediation or keep asking for a permission mode the runtime cannot provide.
That means GPT-5.5 gets better runtime signals for missing scope, auth refresh failures, HTML 403 auth failures, proxy issues, DNS or timeout failures, and blocked full-access modes. The model is less likely to hallucinate the wrong remediation or keep asking for a permission mode the runtime cannot provide.
### PR C: execution correctness
@@ -53,7 +53,7 @@ The tool-compat work reduces schema friction for strict OpenAI/Codex tool regist
### PR D: parity harness
This slice adds the first-wave QA-lab parity pack so GPT-5.4 and Opus 4.6 can be exercised through the same scenarios and compared using shared evidence.
This slice adds the first-wave QA-lab parity pack so GPT-5.5 and Opus 4.6 can be exercised through the same scenarios and compared using shared evidence.
The parity pack is the proof layer. It does not change runtime behavior by itself.
@@ -62,7 +62,7 @@ After you have two `qa-suite-summary.json` artifacts, generate the release-gate
```bash
pnpm openclaw qa parity-report \
--repo-root . \
--candidate-summary .artifacts/qa-e2e/gpt54/qa-suite-summary.json \
--candidate-summary .artifacts/qa-e2e/gpt55/qa-suite-summary.json \
--baseline-summary .artifacts/qa-e2e/opus46/qa-suite-summary.json \
--output-dir .artifacts/qa-e2e/parity
```
@@ -73,16 +73,16 @@ That command writes:
- a machine-readable JSON verdict
- an explicit `pass` / `fail` gate result
## Why this improves GPT-5.4 in practice
## Why this improves GPT-5.5 in practice
Before this work, GPT-5.4 on OpenClaw could feel less agentic than Opus in real coding sessions because the runtime tolerated behaviors that are especially harmful for GPT-5-style models:
Before this work, GPT-5.5 on OpenClaw could feel less agentic than Opus in real coding sessions because the runtime tolerated behaviors that are especially harmful for GPT-5-style models:
- commentary-only turns
- schema friction around tools
- vague permission feedback
- silent replay or compaction breakage
The goal is not to make GPT-5.4 imitate Opus. The goal is to give GPT-5.4 a runtime contract that rewards real progress, supplies cleaner tool and permission semantics, and turns failure modes into explicit machine- and human-readable states.
The goal is not to make GPT-5.5 imitate Opus. The goal is to give GPT-5.5 a runtime contract that rewards real progress, supplies cleaner tool and permission semantics, and turns failure modes into explicit machine- and human-readable states.
That changes the user experience from:
@@ -92,15 +92,15 @@ to:
- “the model either acted, or OpenClaw surfaced the exact reason it could not”
## Before vs after for GPT-5.4 users
## Before vs after for GPT-5.5 users
| Before this program | After PR A-D |
| ---------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| GPT-5.4 could stop after a reasonable plan without taking the next tool step | PR A turns “plan only” into “act now or surface a blocked state” |
| GPT-5.5 could stop after a reasonable plan without taking the next tool step | PR A turns “plan only” into “act now or surface a blocked state” |
| Strict tool schemas could reject parameter-free or OpenAI/Codex-shaped tools in confusing ways | PR C makes provider-owned tool registration and invocation more predictable |
| `/elevated full` guidance could be vague or wrong in blocked runtimes | PR B gives GPT-5.4 and the user truthful runtime and permission hints |
| `/elevated full` guidance could be vague or wrong in blocked runtimes | PR B gives GPT-5.5 and the user truthful runtime and permission hints |
| Replay or compaction failures could feel like the task silently disappeared | PR C surfaces paused, blocked, abandoned, and replay-invalid outcomes explicitly |
| “GPT-5.4 feels worse than Opus” was mostly anecdotal | PR D turns that into the same scenario pack, the same metrics, and a hard pass/fail gate |
| “GPT-5.5 feels worse than Opus” was mostly anecdotal | PR D turns that into the same scenario pack, the same metrics, and a hard pass/fail gate |
## Architecture
@@ -123,7 +123,7 @@ flowchart TD
```mermaid
flowchart LR
A["Merged runtime slices (PR A-C)"] --> B["Run GPT-5.4 parity pack"]
A["Merged runtime slices (PR A-C)"] --> B["Run GPT-5.5 parity pack"]
A --> C["Run Opus 4.6 parity pack"]
B --> D["qa-suite-summary.json"]
C --> E["qa-suite-summary.json"]
@@ -162,7 +162,7 @@ Checks that a task with a real mutating write keeps replay-unsafety explicit ins
## Scenario matrix
| Scenario | What it tests | Good GPT-5.4 behavior | Failure signal |
| Scenario | What it tests | Good GPT-5.5 behavior | Failure signal |
| ---------------------------------- | --------------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------ |
| `approval-turn-tool-followthrough` | Short approval turns after a plan | Starts the first concrete tool action immediately instead of restating intent | plan-only follow-up, no tool activity, or blocked turn without a real blocker |
| `model-switch-tool-continuity` | Runtime/model switching under tool use | Preserves task context and continues acting coherently | resets into commentary, loses tool context, or stops after switch |
@@ -172,7 +172,7 @@ Checks that a task with a real mutating write keeps replay-unsafety explicit ins
## Release gate
GPT-5.4 can only be considered at parity or better when the merged runtime passes the parity pack and the runtime-truthfulness regressions at the same time.
GPT-5.5 can only be considered at parity or better when the merged runtime passes the parity pack and the runtime-truthfulness regressions at the same time.
Required outcomes:
@@ -191,24 +191,24 @@ For the first-wave harness, the gate compares:
Parity evidence is intentionally split across two layers:
- PR D proves same-scenario GPT-5.4 vs Opus 4.6 behavior with QA-lab
- PR D proves same-scenario GPT-5.5 vs Opus 4.6 behavior with QA-lab
- PR B deterministic suites prove auth, proxy, DNS, and `/elevated full` truthfulness outside the harness
## Goal-to-evidence matrix
| Completion gate item | Owning PR | Evidence source | Pass signal |
| -------------------------------------------------------- | ----------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
| GPT-5.4 no longer stalls after planning | PR A | `approval-turn-tool-followthrough` plus PR A runtime suites | approval turns trigger real work or an explicit blocked state |
| GPT-5.4 no longer fakes progress or fake tool completion | PR A + PR D | parity report scenario outcomes and fake-success count | no suspicious pass results and no commentary-only completion |
| GPT-5.4 no longer gives false `/elevated full` guidance | PR B | deterministic truthfulness suites | blocked reasons and full-access hints stay runtime-accurate |
| GPT-5.5 no longer stalls after planning | PR A | `approval-turn-tool-followthrough` plus PR A runtime suites | approval turns trigger real work or an explicit blocked state |
| GPT-5.5 no longer fakes progress or fake tool completion | PR A + PR D | parity report scenario outcomes and fake-success count | no suspicious pass results and no commentary-only completion |
| GPT-5.5 no longer gives false `/elevated full` guidance | PR B | deterministic truthfulness suites | blocked reasons and full-access hints stay runtime-accurate |
| Replay/liveness failures stay explicit | PR C + PR D | PR C lifecycle/replay suites plus `compaction-retry-mutating-tool` | mutating work keeps replay-unsafety explicit instead of silently disappearing |
| GPT-5.4 matches or beats Opus 4.6 on the agreed metrics | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` | same scenario coverage and no regression on completion, stop behavior, or valid tool use |
| GPT-5.5 matches or beats Opus 4.6 on the agreed metrics | PR D | `qa-agentic-parity-report.md` and `qa-agentic-parity-summary.json` | same scenario coverage and no regression on completion, stop behavior, or valid tool use |
## How to read the parity verdict
Use the verdict in `qa-agentic-parity-summary.json` as the final machine-readable decision for the first-wave parity pack.
- `pass` means GPT-5.4 covered the same scenarios as Opus 4.6 and did not regress on the agreed aggregate metrics.
- `pass` means GPT-5.5 covered the same scenarios as Opus 4.6 and did not regress on the agreed aggregate metrics.
- `fail` means at least one hard gate tripped: weaker completion, worse unintended stops, weaker valid tool use, any fake-success case, or mismatched scenario coverage.
- “shared/base CI issue” is not itself a parity result. If CI noise outside PR D blocks a run, the verdict should wait for a clean merged-runtime execution instead of being inferred from branch-era logs.
- Auth, proxy, DNS, and `/elevated full` truthfulness still come from PR B's deterministic suites, so the final release claim needs both: a passing PR D parity verdict and green PR B truthfulness coverage.
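The two-layer release claim can be sketched as a single boolean check. This is a minimal illustration only: the `verdict` field name and the `truthfulnessSuitesGreen` flag are assumptions, since the source names the summary file and the `pass`/`fail` values but not the JSON schema.

```typescript
// Hypothetical shape: the `verdict` field name is assumed for illustration.
type ParitySummary = { verdict: "pass" | "fail" };

// The release claim needs both layers: a passing PR D parity verdict
// and green PR B deterministic truthfulness suites.
function canClaimParity(summary: ParitySummary, truthfulnessSuitesGreen: boolean): boolean {
  return summary.verdict === "pass" && truthfulnessSuitesGreen;
}
```

A `pass` verdict on its own is not sufficient in this sketch, which mirrors the split between harness evidence and deterministic suites.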
@@ -218,7 +218,7 @@ Use the verdict in `qa-agentic-parity-summary.json` as the final machine-readabl
Use `strict-agentic` when:
- the agent is expected to act immediately when a next step is obvious
- GPT-5.4 or Codex-family models are the primary runtime
- GPT-5.5 or Codex-family models are the primary runtime
- you prefer explicit blocked states over “helpful” recap-only replies
Keep the default contract when:
@@ -229,4 +229,4 @@ Keep the default contract when:
## Related
- [GPT-5.4 / Codex parity maintainer notes](/help/gpt54-codex-agentic-parity-maintainers)
- [GPT-5.5 / Codex parity maintainer notes](/help/gpt55-codex-agentic-parity-maintainers)


@@ -198,6 +198,9 @@ diagnostics + the exporter plugin are enabled.
Model usage:
- `model.usage`: tokens, cost, duration, context, provider/model/channel, session ids.
`usage` is provider/turn accounting for cost and telemetry; `context.used`
is the current prompt/context snapshot and can be lower than provider
`usage.total` when cached input or tool-loop calls are involved.
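To see why `context.used` can be lower than `usage.total`, consider a turn with two model calls in a tool loop. The sketch below uses a hypothetical `ModelCall` record; the field names are illustrative and not the plugin's actual event schema.

```typescript
// Hypothetical per-call record; field names are assumptions for illustration.
type ModelCall = { inputTokens: number; outputTokens: number; promptSnapshotTokens: number };

// Provider/turn accounting: every call in the turn is summed.
function usageTotal(calls: ModelCall[]): number {
  return calls.reduce((sum, c) => sum + c.inputTokens + c.outputTokens, 0);
}

// Context snapshot: only the most recent prompt size is reported.
function contextUsed(calls: ModelCall[]): number {
  const last = calls[calls.length - 1];
  return last ? last.promptSnapshotTokens : 0;
}

// Two calls in one tool loop: the second call re-sends the grown prompt.
const turn: ModelCall[] = [
  { inputTokens: 1000, outputTokens: 50, promptSnapshotTokens: 1000 },
  { inputTokens: 1100, outputTokens: 80, promptSnapshotTokens: 1100 },
];
// usageTotal(turn) is 2230, while contextUsed(turn) is 1100.
```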
Message flow:
@@ -307,7 +310,8 @@ Notes:
- You can also enable the plugin with `openclaw plugins enable diagnostics-otel`.
- `protocol` currently supports `http/protobuf` only. `grpc` is ignored.
- Metrics include token usage, cost, context size, run duration, and message-flow
counters/histograms (webhooks, queueing, session state, queue depth/wait).
counters/histograms (webhooks, queueing, session state, queue depth/wait),
plus GenAI token usage and model-call duration histograms.
- Traces/metrics can be toggled with `traces` / `metrics` (default: on). Traces
include model usage spans plus webhook/message processing spans when enabled.
- Raw model/tool content is not exported by default. Use
@@ -316,6 +320,10 @@ Notes:
- Set `headers` when your collector requires auth.
- Environment variables supported: `OTEL_EXPORTER_OTLP_ENDPOINT`,
`OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_PROTOCOL`.
- Set `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` to emit the
latest experimental GenAI provider span attribute (`gen_ai.provider.name`)
instead of the legacy span attribute (`gen_ai.system`). GenAI metrics always
use bounded, low-cardinality semantic attributes.
- Set `OPENCLAW_OTEL_PRELOADED=1` when another preload or host process already
registered the global OpenTelemetry SDK. In that mode the plugin does not start
or shut down its own SDK, but it still wires OpenClaw diagnostic listeners and
@@ -333,6 +341,12 @@ Model usage:
`openclaw.provider`, `openclaw.model`)
- `openclaw.context.tokens` (histogram, attrs: `openclaw.context`,
`openclaw.channel`, `openclaw.provider`, `openclaw.model`)
- `gen_ai.client.token.usage` (histogram, GenAI semantic-conventions metric,
attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.provider.name`,
`gen_ai.operation.name`, `gen_ai.request.model`)
- `gen_ai.client.operation.duration` (histogram, seconds, GenAI
semantic-conventions metric, attrs: `gen_ai.provider.name`,
`gen_ai.operation.name`, `gen_ai.request.model`, optional `error.type`)
Message flow:
@@ -371,18 +385,35 @@ Exec:
- `openclaw.exec.duration_ms` (histogram, attrs: `openclaw.exec.target`,
`openclaw.exec.mode`, `openclaw.outcome`, `openclaw.failureKind`)
Diagnostics internals (memory + tool loop):
- `openclaw.memory.heap_used_bytes` (histogram, attrs: `openclaw.memory.kind`)
- `openclaw.memory.rss_bytes` (histogram)
- `openclaw.memory.pressure` (counter, attrs: `openclaw.memory.level`)
- `openclaw.tool.loop.iterations` (counter, attrs: `openclaw.toolName`,
`openclaw.outcome`)
- `openclaw.tool.loop.duration_ms` (histogram, attrs: `openclaw.toolName`,
`openclaw.outcome`)
### Exported spans (names + key attributes)
- `openclaw.model.usage`
- `openclaw.channel`, `openclaw.provider`, `openclaw.model`
- `openclaw.tokens.*` (input/output/cache_read/cache_write/total)
- `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
semantic conventions are opted in
- `gen_ai.request.model`, `gen_ai.operation.name`, `gen_ai.usage.*`
- `openclaw.run`
- `openclaw.outcome`, `openclaw.channel`, `openclaw.provider`,
`openclaw.model`, `openclaw.errorCategory`
- `openclaw.model.call`
- `gen_ai.system`, `gen_ai.request.model`, `gen_ai.operation.name`,
- `gen_ai.system` by default, or `gen_ai.provider.name` when latest GenAI
semantic conventions are opted in
- `gen_ai.request.model`, `gen_ai.operation.name`,
`openclaw.provider`, `openclaw.model`, `openclaw.api`,
`openclaw.transport`
`openclaw.transport`, `openclaw.provider.request_id_hash` (bounded
SHA-based hash of the upstream provider request id; raw ids are not
exported)
- `openclaw.tool.execution`
- `gen_ai.tool.name`, `openclaw.toolName`, `openclaw.errorCategory`,
`openclaw.tool.params.*`
@@ -403,6 +434,16 @@ Exec:
`openclaw.errorCategory`, `openclaw.delivery.result_count`
- `openclaw.session.stuck`
- `openclaw.state`, `openclaw.ageMs`, `openclaw.queueDepth`
- `openclaw.context.assembled`
- `openclaw.prompt.size`, `openclaw.history.size`,
`openclaw.context.tokens`, `openclaw.errorCategory` (no prompt,
history, response, or session-key content)
- `openclaw.tool.loop`
- `openclaw.toolName`, `openclaw.outcome`, `openclaw.iterations`,
`openclaw.errorCategory` (no loop messages, params, or tool output)
- `openclaw.memory.pressure`
- `openclaw.memory.level`, `openclaw.memory.heap_used_bytes`,
`openclaw.memory.rss_bytes`
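The `openclaw.provider.request_id_hash` attribute is described as a bounded SHA-based hash. A minimal sketch of that idea follows; the choice of SHA-256, the 16-character bound, and the `hashRequestId` helper name are all assumptions, not the plugin's confirmed implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 and a 16-hex-character bound are assumed.
// The raw id is hashed and truncated so it never appears in exported telemetry.
function hashRequestId(rawRequestId: string, length = 16): string {
  return createHash("sha256").update(rawRequestId).digest("hex").slice(0, length);
}
```

Truncation keeps attribute size fixed regardless of how long the upstream id is, while the same id still maps to the same value for correlation.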
When content capture is explicitly enabled, model/tool spans can also include
bounded, redacted `openclaw.content.*` attributes for the specific content
@@ -419,6 +460,9 @@ classes you opted into.
`OTEL_EXPORTER_OTLP_ENDPOINT`.
- If the endpoint already contains `/v1/traces` or `/v1/metrics`, it is used as-is.
- If the endpoint already contains `/v1/logs`, it is used as-is for logs.
- `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental` controls only the
GenAI span provider attribute shape. Existing dashboards that read
`gen_ai.system` can keep the default until they migrate.
- `OPENCLAW_OTEL_PRELOADED=1` reuses an externally registered OpenTelemetry SDK
for traces/metrics instead of starting a plugin-owned NodeSDK.
- `diagnostics.otel.logs` enables OTLP log export for the main logger output.
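The `OTEL_SEMCONV_STABILITY_OPT_IN` switch between `gen_ai.system` and `gen_ai.provider.name` can be sketched as a small lookup. Treating the variable as a comma-separated list is an assumption of this sketch, and `providerAttributeKey` is a hypothetical name.

```typescript
// Picks the GenAI provider span attribute key from the opt-in variable.
// Assumption: the value may hold several comma-separated opt-in tokens.
function providerAttributeKey(env: Record<string, string | undefined>): string {
  const optIn = (env["OTEL_SEMCONV_STABILITY_OPT_IN"] ?? "")
    .split(",")
    .map((value) => value.trim());
  return optIn.includes("gen_ai_latest_experimental") ? "gen_ai.provider.name" : "gen_ai.system";
}
```

With the variable unset, the sketch returns the legacy key, which matches the note that existing dashboards can keep the default until they migrate.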


@@ -91,6 +91,13 @@ Defaults:
- Click cloud: stop speaking
- Click X: exit Talk mode
## Android UI
- Voice tab toggle: **Talk**
- Manual **Mic** and **Talk** are mutually exclusive runtime capture modes.
- Manual Mic stops when the app leaves the foreground or the user leaves the Voice tab.
- Talk Mode keeps running until toggled off or the Android node disconnects, and uses Android's microphone foreground-service type while active.
## Notes
- Requires Speech + Microphone permissions.


@@ -199,8 +199,10 @@ See [Camera node](/nodes/camera) for parameters and CLI helpers.
### 8) Voice + expanded Android command surface
- Voice: Android uses a single mic on/off flow in the Voice tab with transcript capture and `talk.speak` playback. Local system TTS is used only when `talk.speak` is unavailable. Voice stops when the app leaves the foreground.
- Voice wake/talk-mode toggles are currently removed from Android UX/runtime.
- Voice tab: Android has two explicit capture modes. **Mic** is a manual Voice-tab session that sends each pause as a chat turn and stops when the app leaves the foreground or the user leaves the Voice tab. **Talk** is continuous Talk Mode and keeps listening until toggled off or the node disconnects.
- Talk Mode promotes the existing foreground service from `dataSync` to `dataSync|microphone` before capture starts, then demotes it when Talk Mode stops. Android 14+ requires the `FOREGROUND_SERVICE_MICROPHONE` declaration, the `RECORD_AUDIO` runtime grant, and the microphone service type at runtime.
- Spoken replies use `talk.speak` through the configured gateway Talk provider. Local system TTS is used only when `talk.speak` is unavailable.
- Voice wake remains disabled in the Android UX/runtime.
- Additional Android command families (availability depends on device + permissions):
- `device.status`, `device.info`, `device.permissions`, `device.health`
- `notifications.list`, `notifications.actions` (see [Notification forwarding](#notification-forwarding) below)


@@ -911,13 +911,18 @@ Official external npm entries should prefer an exact `npmSpec` plus
`expectedIntegrity`. Bare package names and dist-tags still work for
compatibility, but they surface source-plane warnings so the catalog can move
toward pinned, integrity-checked installs without breaking existing plugins.
When onboarding installs from a local catalog path, it records a
`plugins.installs` entry with `source: "path"` and a workspace-relative
When onboarding installs from a local catalog path, it records a managed plugin
install ledger entry with `source: "path"` and a workspace-relative
`sourcePath` when possible. The absolute operational load path stays in
`plugins.load.paths`; the install record avoids duplicating local workstation
paths into long-lived config. This keeps local development installs visible to
source-plane diagnostics without adding a second raw filesystem-path disclosure
surface.
surface. Legacy `plugins.installs` config entries are still read as a
compatibility fallback while the state-managed `plugins/installs.json` ledger
becomes the install source of truth.
`openclaw doctor --fix` migrates those legacy config entries into the managed
ledger and refreshes the cold registry index without loading plugin runtime
modules.
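The "workspace-relative `sourcePath` when possible" rule can be illustrated with a small path helper. This is a sketch under assumptions: `toRecordedSourcePath` is a hypothetical name, and returning `undefined` for plugins outside the workspace is one possible reading of "when possible", not confirmed behavior.

```typescript
import * as path from "node:path";

// Hypothetical helper: records a workspace-relative path when the plugin
// lives inside the workspace, otherwise nothing, so absolute workstation
// paths are not copied into the install record.
function toRecordedSourcePath(workspaceDir: string, pluginDir: string): string | undefined {
  const rel = path.posix.relative(workspaceDir, pluginDir);
  if (rel === "") return ".";
  if (rel === ".." || rel.startsWith("../")) return undefined;
  return rel;
}
```

POSIX path semantics are used here only to keep the example deterministic across platforms.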
## Context engine plugins


@@ -86,6 +86,8 @@ Current compatibility records include:
toward `agentRuntime`
- generated bundled channel config metadata fallback while registry-first
`channelConfigs` metadata lands
- the persisted plugin registry disable env while repair flows migrate operators
to `openclaw plugins registry --refresh` and `openclaw doctor --fix`
New plugin code should prefer the replacement listed in the registry and in the
specific migration guide. Existing plugins can keep using a compatibility path


@@ -79,6 +79,8 @@ audio bridge, node pinning, delayed realtime intro, and, when Twilio delegation
is configured, whether the `voice-call` plugin and Twilio credentials are ready.
Treat any `ok: false` check as a blocker before asking an agent to join.
Use `openclaw googlemeet setup --json` for scripts or machine-readable output.
Use `--transport chrome`, `--transport chrome-node`, or `--transport twilio`
to preflight a specific transport before an agent tries it.
Join a meeting:
@@ -303,11 +305,17 @@ display name, or remote IP.
Common failure checks:
- `Configured Google Meet node ... is not usable: offline`: the pinned node is
known to the Gateway but unavailable. Agents should treat that node as
diagnostic state, not as a usable Chrome host, and report the setup blocker
instead of falling back to another transport unless the user asked for that.
- `No connected Google Meet-capable node`: start `openclaw node run` in the VM,
approve pairing, and make sure `openclaw plugins enable google-meet` and
`openclaw plugins enable browser` were run in the VM. Also confirm the
Gateway host allows both node commands with
`gateway.nodes.allowCommands: ["googlemeet.chrome", "browser.proxy"]`.
- `BlackHole 2ch audio device not found`: install `blackhole-2ch` on the host
being checked and reboot before using local Chrome audio.
- `BlackHole 2ch audio device not found on the node`: install `blackhole-2ch`
in the VM and reboot the VM.
- Chrome opens but cannot join: sign in to the browser profile inside the VM, or


@@ -68,6 +68,7 @@ observation-only.
**Conversation observation**
- `model_call_started` / `model_call_ended` — observe sanitized provider/model call metadata, timing, outcome, and bounded request-id hashes without prompt or response content
- `llm_input` — observe provider input (system prompt, prompt, history)
- `llm_output` — observe provider output
@@ -146,6 +147,21 @@ Rules:
- `onResolution` receives the resolved approval decision — `allow-once`,
`allow-always`, `deny`, `timeout`, or `cancelled`.
### Tool result persistence
Tool results can include structured `details` for UI rendering, diagnostics,
media routing, or plugin-owned metadata. Treat `details` as runtime metadata,
not prompt content:
- OpenClaw strips `toolResult.details` before provider replay and compaction
input so metadata does not become model context.
- Persisted session entries keep only bounded `details`. Oversized details are
replaced with a compact summary and `persistedDetailsTruncated: true`.
- `tool_result_persist` and `before_message_write` run before the final
persistence cap. Hooks should still keep returned `details` small and avoid
placing prompt-relevant text only in `details`; put model-visible tool output
in `content`.
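The persistence cap on `details` can be sketched as follows. The size limit, the summary shape, and the `capDetails` name are assumptions for illustration; only the `persistedDetailsTruncated: true` marker comes from the text.

```typescript
// Hypothetical cap; the real limit is not stated in the source.
const MAX_DETAILS_CHARS = 8 * 1024;

type PersistedToolResult = {
  content: string;
  details?: unknown;
  persistedDetailsTruncated?: boolean;
};

// Keeps small details as-is and replaces oversized ones with a compact summary.
function capDetails(result: PersistedToolResult, maxChars = MAX_DETAILS_CHARS): PersistedToolResult {
  if (result.details === undefined) return result;
  const serialized = JSON.stringify(result.details) ?? "";
  if (serialized.length <= maxChars) return result;
  return {
    ...result,
    details: { summary: `details omitted (${serialized.length} chars)` },
    persistedDetailsTruncated: true,
  };
}
```

Note that `content` is never touched by the cap in this sketch, which is why model-visible output belongs there rather than in `details`.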
## Prompt and model hooks
Use the phase-specific hooks for new plugins:
@@ -162,6 +178,13 @@ so your plugin does not depend on a legacy combined phase.
`before_agent_start` and `agent_end` include `event.runId` when OpenClaw can
identify the active run. The same value is also available on `ctx.runId`.
Use `model_call_started` and `model_call_ended` for provider-call telemetry
that should not receive raw prompts, history, responses, headers, request
bodies, or provider request IDs. These hooks include stable metadata such as
`runId`, `callId`, `provider`, `model`, optional `api`/`transport`, terminal
`durationMs`/`outcome`, and `upstreamRequestIdHash` when OpenClaw can derive a
bounded provider request-id hash.
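As an illustration of content-free telemetry, the sketch below aggregates call durations from `model_call_ended` payloads. The field names follow the prose above, but the exact types and the `ModelCallEnded` shape are assumptions, and hook registration is deliberately left out because its API is not shown here.

```typescript
// Field names follow the documented metadata; exact types are assumed.
type ModelCallEnded = {
  runId: string;
  callId: string;
  provider: string;
  model: string;
  durationMs: number;
  outcome: string;
  upstreamRequestIdHash?: string;
};

// Sums call duration per provider/model without touching prompts or responses.
function totalDurationByModel(events: ModelCallEnded[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    const key = `${e.provider}/${e.model}`;
    totals.set(key, (totals.get(key) ?? 0) + e.durationMs);
  }
  return totals;
}
```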
Non-bundled plugins that need `llm_input`, `llm_output`, or `agent_end` must set:
```json


@@ -342,7 +342,7 @@ releases.
| `plugin-sdk/provider-web-search` | Provider web-search helpers | Web-search provider registration/cache/runtime helpers |
| `plugin-sdk/provider-tools` | Provider tool/schema compat helpers | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, Gemini schema cleanup + diagnostics, and xAI compat helpers such as `resolveXaiModelCompatPatch` / `applyXaiModelCompat` |
| `plugin-sdk/provider-usage` | Provider usage helpers | `fetchClaudeUsage`, `fetchGeminiUsage`, `fetchGithubCopilotUsage`, and other provider usage helpers |
| `plugin-sdk/provider-stream` | Provider stream wrapper helpers | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-stream` | Provider stream wrapper helpers | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-transport-runtime` | Provider transport helpers | Native provider transport helpers such as guarded fetch, transport message transforms, and writable transport event streams |
| `plugin-sdk/keyed-async-queue` | Ordered async queue | `KeyedAsyncQueue` |
| `plugin-sdk/media-runtime` | Shared media helpers | Media fetch/transform/store helpers plus media payload builders |


@@ -340,7 +340,7 @@ API key auth, and dynamic model resolution.
Each family builder is composed from lower-level public helpers exported from the same package, which you can reach for when a provider needs to go off the common pattern:
- `openclaw/plugin-sdk/provider-model-shared` — `ProviderReplayFamily`, `buildProviderReplayFamilyHooks(...)`, and the raw replay builders (`buildOpenAICompatibleReplayPolicy`, `buildAnthropicReplayPolicyForModel`, `buildGoogleGeminiReplayPolicy`, `buildHybridAnthropicOrOpenAIReplayPolicy`). Also exports Gemini replay helpers (`sanitizeGoogleGeminiReplayHistory`, `resolveTaggedReasoningOutputMode`) and endpoint/model helpers (`resolveProviderEndpoint`, `normalizeProviderId`, `normalizeGooglePreviewModelId`, `normalizeNativeXaiModelId`).
- `openclaw/plugin-sdk/provider-stream` — `ProviderStreamFamily`, `buildProviderStreamFamilyHooks(...)`, `composeProviderStreamWrappers(...)`, plus the shared OpenAI/Codex wrappers (`createOpenAIAttributionHeadersWrapper`, `createOpenAIFastModeWrapper`, `createOpenAIServiceTierWrapper`, `createOpenAIResponsesContextManagementWrapper`, `createCodexNativeWebSearchWrapper`) and shared proxy/provider wrappers (`createOpenRouterWrapper`, `createToolStreamWrapper`, `createMinimaxFastModeWrapper`).
- `openclaw/plugin-sdk/provider-stream` — `ProviderStreamFamily`, `buildProviderStreamFamilyHooks(...)`, `composeProviderStreamWrappers(...)`, plus the shared OpenAI/Codex wrappers (`createOpenAIAttributionHeadersWrapper`, `createOpenAIFastModeWrapper`, `createOpenAIServiceTierWrapper`, `createOpenAIResponsesContextManagementWrapper`, `createCodexNativeWebSearchWrapper`), DeepSeek V4 OpenAI-compatible wrapper (`createDeepSeekV4OpenAICompatibleThinkingWrapper`), and shared proxy/provider wrappers (`createOpenRouterWrapper`, `createToolStreamWrapper`, `createMinimaxFastModeWrapper`).
- `openclaw/plugin-sdk/provider-tools` — `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks("gemini")`, underlying Gemini schema helpers (`normalizeGeminiToolSchemas`, `inspectGeminiToolSchemas`), and xAI compat helpers (`resolveXaiModelCompatPatch()`, `applyXaiModelCompat(model)`). The bundled xAI plugin uses `normalizeResolvedModel` + `contributeResolvedModelCompat` with these to keep xAI rules owned by the provider.
Some stream helpers stay provider-local on purpose. `@openclaw/anthropic-provider` keeps `wrapAnthropicProviderStream`, `resolveAnthropicBetas`, `resolveAnthropicFastMode`, `resolveAnthropicServiceTier`, and the lower-level Anthropic wrapper builders in its own public `api.ts` / `contract-api.ts` seam because they encode Claude OAuth beta handling and `context1m` gating. The xAI plugin similarly keeps native xAI Responses shaping in its own `wrapStreamFn` (`/fast` aliases, default `tool_stream`, unsupported strict-tool cleanup, xAI-specific reasoning-payload removal).


@@ -102,7 +102,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/provider-web-search` | Web-search provider registration/cache/runtime helpers |
| `plugin-sdk/provider-tools` | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, Gemini schema cleanup + diagnostics, and xAI compat helpers such as `resolveXaiModelCompatPatch` / `applyXaiModelCompat` |
| `plugin-sdk/provider-usage` | `fetchClaudeUsage` and similar |
| `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-transport-runtime` | Native provider transport helpers such as guarded fetch, transport message transforms, and writable transport event streams |
| `plugin-sdk/provider-onboard` | Onboarding config patch helpers |
| `plugin-sdk/global-singleton` | Process-local singleton/map/cache helpers |


@@ -53,6 +53,12 @@ Restart the Gateway afterwards.
Set config under `plugins.entries.voice-call.config`:
If `enabled` is true but the selected provider is missing credentials, Gateway
startup logs a setup-incomplete warning with the missing keys and skips starting
the runtime. Run `openclaw voicecall setup` to see the same readiness details.
Commands, RPC calls, and agent tools still return the exact missing provider
configuration when used.
```json5
{
plugins: {


@@ -50,11 +50,16 @@ The bundled `fal` image-generation provider defaults to
| Size overrides | Supported |
| Aspect ratio | Supported |
| Resolution | Supported |
| Output format | `png` or `jpeg` |
<Warning>
The fal image edit endpoint does **not** support `aspectRatio` overrides.
</Warning>
Use `outputFormat: "png"` when you want PNG output. fal does not declare an
explicit transparent-background control in OpenClaw, so `background:
"transparent"` is reported as an ignored override for fal models.
To use fal as the default image provider:
```json5


@@ -108,6 +108,38 @@ export LITELLM_API_KEY="sk-litellm-key"
## Advanced configuration
### Image generation
LiteLLM can also back the `image_generate` tool through OpenAI-compatible
`/images/generations` and `/images/edits` routes. Configure a LiteLLM image
model under `agents.defaults.imageGenerationModel`:
```json5
{
models: {
providers: {
litellm: {
baseUrl: "http://localhost:4000",
apiKey: "${LITELLM_API_KEY}",
},
},
},
agents: {
defaults: {
imageGenerationModel: {
primary: "litellm/gpt-image-2",
timeoutMs: 180_000,
},
},
},
}
```
Loopback LiteLLM URLs such as `http://localhost:4000` work without a global
private-network override. For a LAN-hosted proxy, set
`models.providers.litellm.request.allowPrivateNetwork: true` because the API key
will be sent to the configured proxy host.
<AccordionGroup>
<Accordion title="Virtual keys">
Create a dedicated key for OpenClaw with spend limits:


@@ -23,17 +23,18 @@ changing config.
| Goal | Use | Notes |
| --------------------------------------------- | -------------------------------------------------------- | ---------------------------------------------------------------------------- |
| Direct API-key billing | `openai/gpt-5.4` | Set `OPENAI_API_KEY` or run OpenAI API-key onboarding. |
| Direct API-key billing | `openai/gpt-5.5` | Set `OPENAI_API_KEY` or run OpenAI API-key onboarding. |
| GPT-5.5 with ChatGPT/Codex subscription auth | `openai-codex/gpt-5.5` | Default PI route for Codex OAuth. Best first choice for subscription setups. |
| GPT-5.5 with native Codex app-server behavior | `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"` | Uses the Codex app-server harness, not the public OpenAI API route. |
| GPT-5.5 with native Codex app-server behavior | `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"` | Forces the Codex app-server harness for that model ref. |
| Image generation or editing | `openai/gpt-image-2` | Works with either `OPENAI_API_KEY` or OpenAI Codex OAuth. |
| Transparent-background images | `openai/gpt-image-1.5` | Use `outputFormat=png` or `webp` and `openai.background=transparent`. |
<Note>
GPT-5.5 is currently available in OpenClaw through subscription/OAuth routes:
`openai-codex/gpt-5.5` with the PI runner, or `openai/gpt-5.5` with the
Codex app-server harness. Direct API-key access for `openai/gpt-5.5` is
supported once OpenAI enables GPT-5.5 on the public API; until then use an
API-enabled model such as `openai/gpt-5.4` for `OPENAI_API_KEY` setups.
GPT-5.5 is available through both direct OpenAI Platform API-key access and
subscription/OAuth routes. Use `openai/gpt-5.5` for direct `OPENAI_API_KEY`
traffic, `openai-codex/gpt-5.5` for Codex OAuth through PI, or
`openai/gpt-5.5` with `embeddedHarness.runtime: "codex"` for the native Codex
app-server harness.
</Note>
<Note>
@@ -93,16 +94,14 @@ Choose your preferred auth method and follow the setup steps.
| Model ref | Route | Auth |
|-----------|-------|------|
| `openai/gpt-5.4` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.5` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.4-mini` | Direct OpenAI Platform API | `OPENAI_API_KEY` |
| `openai/gpt-5.5` | Future direct API route once OpenAI enables GPT-5.5 on the API | `OPENAI_API_KEY` |
<Note>
`openai/*` is the direct OpenAI API-key route unless you explicitly force
the Codex app-server harness. GPT-5.5 itself is currently subscription/OAuth
only; use `openai-codex/*` for Codex OAuth through the default PI runner, or
use `openai/gpt-5.5` with `embeddedHarness.runtime: "codex"` for native
Codex app-server execution.
the Codex app-server harness. Use `openai-codex/*` for Codex OAuth through
the default PI runner, or use `openai/gpt-5.5` with
`embeddedHarness.runtime: "codex"` for native Codex app-server execution.
</Note>
### Config example
@@ -110,7 +109,7 @@ Choose your preferred auth method and follow the setup steps.
```json5
{
env: { OPENAI_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "openai/gpt-5.4" } } },
agents: { defaults: { model: { primary: "openai/gpt-5.5" } } },
}
```
@@ -256,8 +255,33 @@ See [Image Generation](/tools/image-generation) for shared tool parameters, prov
</Note>
`gpt-image-2` is the default for both OpenAI text-to-image generation and image
editing. `gpt-image-1` remains usable as an explicit model override, but new
OpenAI image workflows should use `openai/gpt-image-2`.
editing. `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini` remain usable as
explicit model overrides. Use `openai/gpt-image-1.5` for transparent-background
PNG/WebP output; the current `gpt-image-2` API rejects
`background: "transparent"`.
For a transparent-background request, agents should call `image_generate` with
`model: "openai/gpt-image-1.5"`, `outputFormat: "png"` or `"webp"`, and
`background: "transparent"`; the older `openai.background` provider option is
still accepted. OpenClaw also protects the public OpenAI and
OpenAI Codex OAuth routes by rewriting default `openai/gpt-image-2` transparent
requests to `gpt-image-1.5`; Azure and custom OpenAI-compatible endpoints keep
their configured deployment/model names.
The same setting is exposed for headless CLI runs:
```bash
openclaw infer image generate \
--model openai/gpt-image-1.5 \
--output-format png \
--background transparent \
--prompt "A simple red circle sticker on a transparent background" \
--json
```
Use the same `--output-format` and `--background` flags with
`openclaw infer image edit` when starting from an input file.
`--openai-background` remains available as an OpenAI-specific alias.
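The rewrite of transparent `openai/gpt-image-2` requests described earlier in this section can be sketched as a model-resolution step. `resolveImageModel` and the `isPublicOpenAIRoute` flag are hypothetical names, and this sketch does not distinguish a defaulted model ref from an explicitly chosen one, which the real implementation may do.

```typescript
type ImageRequest = { model: string; background?: string };

// Only public OpenAI and OpenAI Codex OAuth routes are rewritten; Azure and
// custom OpenAI-compatible endpoints keep their configured model names.
function resolveImageModel(req: ImageRequest, isPublicOpenAIRoute: boolean): string {
  const wantsTransparent = req.background === "transparent";
  if (isPublicOpenAIRoute && wantsTransparent && req.model === "openai/gpt-image-2") {
    return "openai/gpt-image-1.5";
  }
  return req.model;
}
```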
For Codex OAuth installs, keep the same `openai/gpt-image-2` ref. When an
`openai-codex` OAuth profile is configured, OpenClaw resolves that stored OAuth
@@ -277,6 +301,12 @@ Generate:
/tool image_generate model=openai/gpt-image-2 prompt="A polished launch poster for OpenClaw on macOS" size=3840x2160 count=1
```
Generate a transparent PNG:
```
/tool image_generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
```
Edit:
```
@@ -311,7 +341,7 @@ See [Video Generation](/tools/video-generation) for shared tool parameters, prov
## GPT-5 prompt contribution
OpenClaw adds a shared GPT-5 prompt contribution for GPT-5-family runs across providers. It applies by model id, so `openai-codex/gpt-5.5`, `openai/gpt-5.4`, `openrouter/openai/gpt-5.5`, `opencode/gpt-5.5`, and other compatible GPT-5 refs receive the same overlay. Older GPT-4.x models do not.
OpenClaw adds a shared GPT-5 prompt contribution for GPT-5-family runs across providers. It applies by model id, so `openai-codex/gpt-5.5`, `openai/gpt-5.5`, `openrouter/openai/gpt-5.5`, `opencode/gpt-5.5`, and other compatible GPT-5 refs receive the same overlay. Older GPT-4.x models do not.
The bundled native Codex harness uses the same GPT-5 behavior and heartbeat overlay through Codex app-server developer instructions, so `openai/gpt-5.x` sessions forced through `embeddedHarness.runtime: "codex"` keep the same follow-through and proactive heartbeat guidance even though Codex owns the rest of the harness prompt.
@@ -603,7 +633,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { transport: "auto" },
},
"openai-codex/gpt-5.5": {
@@ -630,7 +660,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { openaiWsWarmup: false },
},
},
@@ -654,7 +684,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": { params: { fastMode: true } },
"openai/gpt-5.5": { params: { fastMode: true } },
},
},
},
@@ -675,7 +705,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": { params: { serviceTier: "priority" } },
"openai/gpt-5.5": { params: { serviceTier: "priority" } },
},
},
},
@@ -723,7 +753,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: {
responsesServerCompaction: true,
responsesCompactThreshold: 120000,
@@ -741,7 +771,7 @@ the Server-side compaction accordion below.
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
"openai/gpt-5.5": {
params: { responsesServerCompaction: false },
},
},


@@ -18,23 +18,26 @@ provider id `opencode-go` so upstream per-model routing stays correct.
## Built-in catalog
OpenClaw sources the Go catalog from the bundled pi model registry. Run
OpenClaw sources most Go catalog rows from the bundled pi model registry and
supplements current upstream rows while the registry catches up. Run
`openclaw models list --provider opencode-go` for the current model list.
As of the bundled pi catalog, the provider includes:
The provider includes:
| Model ref | Name |
| -------------------------- | --------------------- |
| `opencode-go/glm-5` | GLM-5 |
| `opencode-go/glm-5.1` | GLM-5.1 |
| `opencode-go/kimi-k2.5` | Kimi K2.5 |
| `opencode-go/kimi-k2.6` | Kimi K2.6 (3x limits) |
| `opencode-go/mimo-v2-omni` | MiMo V2 Omni |
| `opencode-go/mimo-v2-pro` | MiMo V2 Pro |
| `opencode-go/minimax-m2.5` | MiniMax M2.5 |
| `opencode-go/minimax-m2.7` | MiniMax M2.7 |
| `opencode-go/qwen3.5-plus` | Qwen3.5 Plus |
| `opencode-go/qwen3.6-plus` | Qwen3.6 Plus |
| Model ref | Name |
| ------------------------------- | --------------------- |
| `opencode-go/glm-5` | GLM-5 |
| `opencode-go/glm-5.1` | GLM-5.1 |
| `opencode-go/kimi-k2.5` | Kimi K2.5 |
| `opencode-go/kimi-k2.6` | Kimi K2.6 (3x limits) |
| `opencode-go/deepseek-v4-pro` | DeepSeek V4 Pro |
| `opencode-go/deepseek-v4-flash` | DeepSeek V4 Flash |
| `opencode-go/mimo-v2-omni` | MiMo V2 Omni |
| `opencode-go/mimo-v2-pro` | MiMo V2 Pro |
| `opencode-go/minimax-m2.5` | MiniMax M2.5 |
| `opencode-go/minimax-m2.7` | MiniMax M2.7 |
| `opencode-go/qwen3.5-plus` | Qwen3.5 Plus |
| `opencode-go/qwen3.6-plus` | Qwen3.6 Plus |
## Getting started


@@ -71,13 +71,14 @@ OpenRouter can also back the `image_generate` tool. Use an OpenRouter image mode
defaults: {
imageGenerationModel: {
primary: "openrouter/google/gemini-3.1-flash-image-preview",
timeoutMs: 180_000,
},
},
},
}
```
OpenClaw sends image requests to OpenRouter's chat completions image API with `modalities: ["image", "text"]`. Gemini image models receive supported `aspectRatio` and `resolution` hints through OpenRouter's `image_config`.
OpenClaw sends image requests to OpenRouter's chat completions image API with `modalities: ["image", "text"]`. Gemini image models receive supported `aspectRatio` and `resolution` hints through OpenRouter's `image_config`. Use `agents.defaults.imageGenerationModel.timeoutMs` for slower OpenRouter image models; the `image_generate` tool's per-call `timeoutMs` parameter still wins.
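The timeout precedence described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not OpenClaw's implementation: `resolveImageTimeoutMs` and its parameter names are hypothetical.

```typescript
// Hypothetical helper: illustrates the documented precedence only.
function resolveImageTimeoutMs(
  perCallTimeoutMs: number | undefined,
  configuredTimeoutMs: number | undefined,
): number | undefined {
  // The per-call `timeoutMs` tool parameter wins when it is present.
  if (perCallTimeoutMs !== undefined) return perCallTimeoutMs;
  // Otherwise use agents.defaults.imageGenerationModel.timeoutMs, if set.
  return configuredTimeoutMs;
}
```

With a configured default of `180_000`, a call that passes `timeoutMs: 60_000` would resolve to `60_000` in this sketch.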
## Text-to-speech


@@ -123,6 +123,15 @@ Use the table below to pick the right model for your use case.
</Tip>
## DeepSeek V4 replay behavior
If Venice exposes DeepSeek V4 models such as `venice/deepseek-v4-pro` or
`venice/deepseek-v4-flash`, OpenClaw fills the required DeepSeek V4
`reasoning_content` replay placeholder on assistant tool-call turns when the
proxy omits it. Venice rejects DeepSeek's native top-level `thinking` control,
so OpenClaw keeps that provider-specific replay fix separate from the native
DeepSeek provider's thinking controls.
## Built-in catalog (41 total)
<AccordionGroup>


@@ -132,12 +132,14 @@ Legacy aliases still normalize to the canonical bundled ids:
`video_generate` tool.
- Default video model: `xai/grok-imagine-video`
- Modes: text-to-video, image-to-video, remote video edit, and remote video
extension
- Modes: text-to-video, image-to-video, reference-image generation, remote
video edit, and remote video extension
- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`
- Resolutions: `480P`, `720P`
- Duration: 1-15 seconds for generation/image-to-video, 2-10 seconds for
extension
- Duration: 1-15 seconds for generation/image-to-video, 1-10 seconds when
using `reference_image` roles, 2-10 seconds for extension
- Reference-image generation: set `imageRoles` to `reference_image` for
every supplied image; xAI accepts up to 7 such images
<Warning>
Local video buffers are not accepted. Use remote `http(s)` URLs for


@@ -14,6 +14,7 @@ Assistant output can carry a small set of delivery/render directives:
- `[embed ...]` for Control UI rich rendering
These directives are separate. `MEDIA:` and reply/voice tags remain delivery metadata; `[embed ...]` is the web-only rich render path.
Trusted tool-result media uses the same `MEDIA:` / `[[audio_as_voice]]` parser before delivery, so text tool outputs can still mark an audio attachment as a voice note.
When block streaming is enabled, `MEDIA:` remains single-delivery metadata for a
turn. If the same media URL is sent in a streamed block and repeated in the final


@@ -101,6 +101,13 @@ Assistant transcript entries persist the same normalized usage shape, including
returns usage metadata. This gives `/usage cost` and transcript-backed session
status a stable source even after the live runtime state is gone.
OpenClaw keeps provider usage accounting separate from the current context
snapshot. Provider `usage.total` can include cached input, output, and multiple
tool-loop model calls, so it is useful for cost and telemetry but can overstate
the live context window. Context displays and diagnostics use the latest prompt
snapshot (`promptTokens`, or the last model call when no prompt snapshot is
available) for `context.used`.
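A minimal sketch of that separation, assuming a simplified usage shape. Only `promptTokens` and `usage.total` come from the text above; `TurnUsage`, `lastCallTokens`, and `contextUsed` are hypothetical names used for illustration.

```typescript
// Hypothetical usage shape, for illustration only.
interface TurnUsage {
  total: number; // provider accounting: may include cached input, output, tool loops
  promptTokens?: number; // latest prompt snapshot, when available
  lastCallTokens?: number; // assumed name for the last model call's size
}

function contextUsed(usage: TurnUsage): number | undefined {
  // Prefer the prompt snapshot, then the last model call.
  // usage.total is deliberately never used for the context window.
  return usage.promptTokens ?? usage.lastCallTokens;
}
```

In this sketch a turn with `total: 90000` and `promptTokens: 30000` reports `30000` as context used, while cost and telemetry would still read `total`.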
## Cost estimation (when shown)
Costs are estimated from your model pricing config:


@@ -8,11 +8,12 @@ title: "Transcript hygiene"
---
This document describes **provider-specific fixes** applied to transcripts before a run
(building model context). These are **in-memory** adjustments used to satisfy strict
provider requirements. These hygiene steps do **not** rewrite the stored JSONL transcript
on disk; however, a separate session-file repair pass may rewrite malformed JSONL files
by dropping invalid lines before the session is loaded. When a repair occurs, the original
file is backed up alongside the session file.
(building model context). Most of these are **in-memory** adjustments used to satisfy
strict provider requirements. A separate session-file repair pass may also rewrite
stored JSONL before the session is loaded, either by dropping malformed JSONL lines or
by repairing persisted turns that are syntactically valid but known to be rejected by a
provider during replay. When a repair occurs, the original file is backed up alongside
the session file.
Scope includes:
@@ -22,8 +23,10 @@ Scope includes:
- Tool result pairing repair
- Turn validation / ordering
- Thought signature cleanup
- Thinking signature cleanup
- Image payload sanitization
- User-input provenance tagging (for inter-session routed prompts)
- Empty assistant error-turn repair for Bedrock Converse replay
If you need transcript storage details, see:
@@ -131,6 +134,26 @@ external end-user instructions.
- Tool result pairing repair and synthetic tool results.
- Turn validation (merge consecutive user turns to satisfy strict alternation).
- Thinking blocks with missing, empty, or blank replay signatures are stripped
before provider conversion. If that empties an assistant turn, OpenClaw keeps
turn shape with non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
non-empty omitted-reasoning text so provider adapters do not drop the replay
turn.
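The signature cleanup in the bullets above can be sketched as follows. The block shapes, the `stripUnsignedThinking` name, and the placeholder text are assumptions for illustration; the real transcript types and omitted-reasoning wording may differ.

```typescript
// Hypothetical block shapes and placeholder text, for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string };

const OMITTED_REASONING_TEXT = "[reasoning omitted]"; // assumed placeholder

function stripUnsignedThinking(content: Block[]): Block[] {
  // Drop thinking blocks whose replay signature is missing, empty, or blank.
  const kept = content.filter(
    (block) =>
      block.type !== "thinking" ||
      (block.signature !== undefined && block.signature.trim() !== ""),
  );
  // If stripping emptied the turn, keep turn shape with non-empty text.
  if (kept.length === 0 && content.length > 0) {
    return [{ type: "text", text: OMITTED_REASONING_TEXT }];
  }
  return kept;
}
```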
**Amazon Bedrock (Converse API)**
- Empty assistant stream-error turns are repaired to a non-empty fallback text block
before replay. Bedrock Converse rejects assistant messages with `content: []`, so
persisted assistant turns with `stopReason: "error"` and empty content are also
repaired on disk before load.
- Claude thinking blocks with missing, empty, or blank replay signatures are
stripped before Converse replay. If that empties an assistant turn, OpenClaw
keeps turn shape with non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
non-empty omitted-reasoning text so the Converse replay keeps strict turn shape.
- Replay filters OpenClaw delivery-mirror and gateway-injected assistant turns.
- Image sanitization applies through the global rule.
**Mistral (including model-id based detection)**


@@ -36,7 +36,7 @@ For a high-level overview, see [Onboarding (CLI)](/start/wizard).
- **OpenAI Code (Codex) subscription (device pairing)**: browser pairing flow with a short-lived device code.
- Sets `agents.defaults.model` to `openai-codex/gpt-5.5` when model is unset or already OpenAI-family.
- **OpenAI API key**: uses `OPENAI_API_KEY` if present or prompts for a key, then stores it in auth profiles.
- Sets `agents.defaults.model` to `openai/gpt-5.4` when model is unset, `openai/*`, or `openai-codex/*`.
- Sets `agents.defaults.model` to `openai/gpt-5.5` when model is unset, `openai/*`, or `openai-codex/*`.
- **xAI (Grok) API key**: prompts for `XAI_API_KEY` and configures xAI as a model provider.
- **OpenCode**: prompts for `OPENCODE_API_KEY` (or `OPENCODE_ZEN_API_KEY`, get it at https://opencode.ai/auth) and lets you pick the Zen or Go catalog.
- **Ollama**: offers **Cloud + Local**, **Cloud only**, or **Local only** first. `Cloud only` prompts for `OLLAMA_API_KEY` and uses `https://ollama.com`; the host-backed modes prompt for the Ollama base URL, discover available models, and auto-pull the selected local model when needed; `Cloud + Local` also checks whether that Ollama host is signed in for cloud access.
@@ -182,7 +182,7 @@ Use this reference page for flag semantics and step ordering.
```bash
openclaw agents add work \
--workspace ~/.openclaw/workspace-work \
--model openai/gpt-5.4 \
--model openai/gpt-5.5 \
--bind whatsapp:biz \
--non-interactive \
--json


@@ -204,7 +204,7 @@ sessions, and auth profiles. Running without `--workspace` launches the wizard.
```bash
openclaw agents add work \
--workspace ~/.openclaw/workspace-work \
--model openai/gpt-5.4 \
--model openai/gpt-5.5 \
--bind whatsapp:biz \
--non-interactive \
--json


@@ -142,7 +142,7 @@ What you set:
<Accordion title="OpenAI API key">
Uses `OPENAI_API_KEY` if present or prompts for a key, then stores the credential in auth profiles.
Sets `agents.defaults.model` to `openai/gpt-5.4` when model is unset, `openai/*`, or `openai-codex/*`.
Sets `agents.defaults.model` to `openai/gpt-5.5` when model is unset, `openai/*`, or `openai-codex/*`.
</Accordion>
<Accordion title="xAI (Grok) API key">


@@ -329,7 +329,8 @@ Interface details:
- `resumeSessionId` (optional): resume an existing ACP session instead of creating a new one. The agent replays its conversation history via `session/load`. Requires `runtime: "acp"`.
- `streamTo` (optional): `"parent"` streams initial ACP run progress summaries back to the requester session as system events.
- When available, accepted responses include `streamLogPath` pointing to a session-scoped JSONL log (`<sessionId>.acp-stream.jsonl`) you can tail for full relay history.
- `model` (optional): explicit model override for the ACP child session. Honored for `runtime: "acp"` so the child uses the requested model instead of silently falling back to the target agent default.
- `model` (optional): explicit model override for the ACP child session. Honored for `runtime: "acp"` so the child uses the requested model instead of silently falling back to the target agent default. Codex ACP spawns normalize OpenClaw Codex refs such as `openai-codex/gpt-5.4` to Codex ACP startup config before `session/new`; slash forms such as `openai-codex/gpt-5.4/high` also set Codex ACP reasoning effort.
- `thinking` (optional): explicit thinking/reasoning effort for the ACP child session. For Codex ACP, `minimal` maps to low effort, `low`/`medium`/`high`/`xhigh` map directly, and `off` omits the reasoning-effort startup override.
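The Codex ACP mappings described in the `model` and `thinking` entries above can be sketched like this. Both helpers are hypothetical and only mirror the documented behavior; they are not the adapter's code.

```typescript
// Hypothetical helpers that mirror the documented mapping.
type Thinking = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";

function codexReasoningEffort(thinking: Thinking): string | undefined {
  if (thinking === "off") return undefined; // omit the startup override
  if (thinking === "minimal") return "low"; // minimal maps to low effort
  return thinking; // low/medium/high/xhigh map directly
}

function parseCodexModelRef(ref: string): { model: string; effort?: string } {
  // "openai-codex/gpt-5.4/high" -> provider, model id, optional effort suffix
  const [provider, model, effort] = ref.split("/");
  if (provider !== "openai-codex" || !model) {
    throw new Error(`not a Codex model ref: ${ref}`);
  }
  return effort ? { model, effort } : { model };
}
```

For example, `parseCodexModelRef("openai-codex/gpt-5.4/high")` yields the model id `gpt-5.4` with effort `high` in this sketch.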
## Delivery model
@@ -522,7 +523,8 @@ Notes:
Equivalent operations:
- `/acp model <id>` maps to runtime config key `model`.
- `/acp model <id>` maps to runtime config key `model`. For Codex ACP, OpenClaw normalizes `openai-codex/<model>` to the adapter model id and maps slash reasoning suffixes such as `openai-codex/gpt-5.4/high` to Codex ACP `reasoning_effort`.
- `/acp set thinking <level>` maps to runtime config key `thinking`. For Codex ACP, OpenClaw sends the corresponding `reasoning_effort` where the adapter supports one.
- `/acp permissions <profile>` maps to runtime config key `approval_policy`.
- `/acp timeout <seconds>` maps to runtime config key `timeout`.
- `/acp cwd <path>` updates runtime cwd override directly.


@@ -140,8 +140,8 @@ curl -s http://127.0.0.1:18791/tabs
On Raspberry Pi, older VPS hosts, or slow storage, raise
`browser.localLaunchTimeoutMs` when Chrome needs more time to expose its CDP HTTP
endpoint. Raise `browser.localCdpReadyTimeoutMs` when launch succeeds but
`openclaw browser start` still reports `not reachable after start`. Values are
capped at 120000 ms.
`openclaw browser start` still reports `not reachable after start`. Values must
be positive integers up to `120000` ms; invalid config values are rejected.
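The validation rule above (positive integers up to `120000` ms, with invalid values rejected rather than clamped) can be sketched as follows. `validateBrowserTimeoutMs` and the error wording are hypothetical.

```typescript
const MAX_BROWSER_TIMEOUT_MS = 120_000;

// Hypothetical validator: returns the value unchanged or throws.
function validateBrowserTimeoutMs(key: string, value: unknown): number {
  if (
    typeof value !== "number" ||
    !Number.isInteger(value) ||
    value <= 0 ||
    value > MAX_BROWSER_TIMEOUT_MS
  ) {
    throw new Error(`${key} must be a positive integer up to ${MAX_BROWSER_TIMEOUT_MS} ms`);
  }
  return value;
}
```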
### Problem: "No Chrome tabs found for profile=\"user\""


@@ -69,6 +69,24 @@ Browser config changes require a Gateway restart so the plugin can re-register i
## Agent guidance
Tool-profile note: `tools.profile: "coding"` includes `web_search` and
`web_fetch`, but it does not include the full `browser` tool. If the agent or a
spawned sub-agent should use browser automation, add browser at the profile
stage:
```json5
{
tools: {
profile: "coding",
alsoAllow: ["browser"],
},
}
```
For a single agent, use `agents.list[].tools.alsoAllow: ["browser"]`.
`tools.subagents.tools.allow: ["browser"]` alone is not enough because sub-agent
policy is applied after profile filtering.
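One way to picture that ordering is the sketch below. It is a simplified model, not OpenClaw's policy engine: `effectiveSubagentTools` and its parameters are hypothetical, and the tool lists passed in are whatever the profile stage produced.

```typescript
// Hypothetical model: profile filtering first, sub-agent policy second.
function effectiveSubagentTools(
  profileTools: string[],
  alsoAllow: string[],
  subagentAllow: string[] | undefined,
): string[] {
  // Stage 1: the profile plus any alsoAllow additions.
  const afterProfile = Array.from(new Set(profileTools.concat(alsoAllow)));
  if (!subagentAllow) return afterProfile;
  // Stage 2: a sub-agent allow list can only narrow what survived stage 1.
  const allow = new Set(subagentAllow);
  return afterProfile.filter((tool) => allow.has(tool));
}
```

In this model, allowing `browser` only at the sub-agent stage yields nothing, because `browser` never passed the profile stage; adding it through `alsoAllow` is what makes it available.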
The browser plugin ships two levels of agent guidance:
- The `browser` tool description carries the compact always-on contract: pick
@@ -175,12 +193,15 @@ Browser settings live in `~/.openclaw/openclaw.json`.
- Control service binds to loopback on a port derived from `gateway.port` (default `18791` = gateway + 2). Overriding `gateway.port` or `OPENCLAW_GATEWAY_PORT` shifts the derived ports in the same family.
- Local `openclaw` profiles auto-assign `cdpPort`/`cdpUrl`; set those only for remote CDP. `cdpUrl` defaults to the managed local CDP port when unset.
- `remoteCdpTimeoutMs` applies to remote (non-loopback) CDP HTTP reachability checks; `remoteCdpHandshakeTimeoutMs` applies to remote CDP WebSocket handshakes.
- `remoteCdpTimeoutMs` applies to remote and `attachOnly` CDP HTTP reachability
checks and tab-opening HTTP requests; `remoteCdpHandshakeTimeoutMs` applies to
their CDP WebSocket handshakes.
- `localLaunchTimeoutMs` is the budget for a locally launched managed Chrome
process to expose its CDP HTTP endpoint. `localCdpReadyTimeoutMs` is the
follow-up budget for CDP websocket readiness after the process is discovered.
Raise these on Raspberry Pi, low-end VPS, or older hardware where Chromium
starts slowly. Values are capped at 120000 ms.
starts slowly. Values must be positive integers up to `120000` ms; invalid
config values are rejected.
- `actionTimeoutMs` is the default budget for browser `act` requests when the caller does not pass `timeoutMs`. The client transport adds a small slack window so long waits can finish instead of timing out at the HTTP boundary.
- `tabCleanup` is best-effort cleanup for tabs opened by primary-agent browser sessions. Subagent, cron, and ACP lifecycle cleanup still closes their explicit tracked tabs at session end; primary sessions keep active tabs reusable, then close idle or excess tracked tabs in the background.
@@ -215,12 +236,12 @@ Browser settings live in `~/.openclaw/openclaw.json`.
current process. `OPENCLAW_BROWSER_HEADLESS=0` forces headed mode for ordinary
starts and returns an actionable error on Linux hosts without a display server;
an explicit `start --headless` request still wins for that one launch.
- `executablePath` can be set globally or per local managed profile. Per-profile values override `browser.executablePath`, so different managed profiles can launch different Chromium-based browsers.
- `executablePath` can be set globally or per local managed profile. Per-profile values override `browser.executablePath`, so different managed profiles can launch different Chromium-based browsers. Both forms accept `~` for your OS home directory.
- `color` (top-level and per-profile) tints the browser UI so you can see which profile is active.
- Default profile is `openclaw` (managed standalone). Use `defaultProfile: "user"` to opt into the signed-in user browser.
- Auto-detect order: system default browser if Chromium-based; otherwise Chrome → Brave → Edge → Chromium → Chrome Canary.
- `driver: "existing-session"` uses Chrome DevTools MCP instead of raw CDP. Do not set `cdpUrl` for that driver.
- Set `browser.profiles.<name>.userDataDir` when an existing-session profile should attach to a non-default Chromium user profile (Brave, Edge, etc.).
- Set `browser.profiles.<name>.userDataDir` when an existing-session profile should attach to a non-default Chromium user profile (Brave, Edge, etc.). This path also accepts `~` for your OS home directory.
</Accordion>
@@ -230,7 +251,8 @@ Browser settings live in `~/.openclaw/openclaw.json`.
If your **system default** browser is Chromium-based (Chrome/Brave/Edge/etc),
OpenClaw uses it automatically. Set `browser.executablePath` to override
auto-detection. `~` expands to your OS home directory:
auto-detection. Top-level and per-profile `executablePath` values accept `~`
for your OS home directory:
```bash
openclaw config set browser.executablePath "/usr/bin/google-chrome"
@@ -279,6 +301,9 @@ instead, and remote CDP profiles use the browser behind `cdpUrl`.
- **Remote control (node host):** run a node host on the machine that has the browser; the Gateway proxies browser actions to it.
- **Remote CDP:** set `browser.profiles.<name>.cdpUrl` (or `browser.cdpUrl`) to
attach to a remote Chromium-based browser. In this case, OpenClaw will not launch a local browser.
- For externally managed CDP services on loopback (for example Browserless in
Docker published to `127.0.0.1`), also set `attachOnly: true`. Loopback CDP
without `attachOnly` is treated as a local OpenClaw-managed browser profile.
- `headless` only affects local managed profiles that OpenClaw launches. It does not restart or change existing-session or remote CDP browsers.
- `executablePath` follows the same local managed profile rule. Changing it on a
running local managed profile marks that profile for restart/reconcile so the
@@ -352,6 +377,39 @@ Notes:
`wss://` for a direct CDP connection or keep the HTTPS URL and let OpenClaw
discover `/json/version`.
### Browserless Docker on the same host
When Browserless is self-hosted in Docker and OpenClaw runs on the host, treat
Browserless as an externally managed CDP service:
```json5
{
browser: {
enabled: true,
defaultProfile: "browserless",
profiles: {
browserless: {
cdpUrl: "ws://127.0.0.1:3000",
attachOnly: true,
color: "#00AA00",
},
},
},
}
```
The address in `browser.profiles.browserless.cdpUrl` must be reachable from the
OpenClaw process. Browserless must also advertise a matching reachable endpoint;
set Browserless `EXTERNAL` to the same WebSocket base that OpenClaw uses to reach
it, such as `ws://127.0.0.1:3000`, `ws://browserless:3000`, or a stable private
Docker network address. If `/json/version` returns `webSocketDebuggerUrl` pointing at
an address OpenClaw cannot reach, CDP HTTP can look healthy while the WebSocket
attach still fails.
Do not leave `attachOnly` unset for a loopback Browserless profile. Without
`attachOnly`, OpenClaw treats the loopback port as a local managed browser
profile and may report that the port is in use but not owned by OpenClaw.
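When debugging a failed attach, a rough check is to compare the host and port in the configured `cdpUrl` with the `webSocketDebuggerUrl` that `/json/version` returns. The sketch below is a heuristic only; `advertisedEndpointMatches` is a hypothetical helper, and a differing host can still be reachable in some network setups.

```typescript
// Hypothetical check: does the advertised WebSocket URL use the same
// host and port that OpenClaw was configured to reach?
function advertisedEndpointMatches(cdpUrl: string, webSocketDebuggerUrl: string): boolean {
  const configured = new URL(cdpUrl);
  const advertised = new URL(webSocketDebuggerUrl);
  // `host` includes the port, e.g. "127.0.0.1:3000".
  return configured.host === advertised.host;
}
```

A mismatch such as a configured `ws://127.0.0.1:3000` against an advertised `ws://0.0.0.0:3000/...` suggests that Browserless `EXTERNAL` needs adjusting.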
## Direct WebSocket CDP providers
Some hosted browser services expose a **direct WebSocket** endpoint rather than
@@ -370,10 +428,13 @@ CDP URL shapes and picks the right connection strategy automatically:
[Browserbase](https://www.browserbase.com)). OpenClaw tries HTTP
`/json/version` discovery first (normalising the scheme to `http`/`https`);
if discovery returns a `webSocketDebuggerUrl` it is used, otherwise OpenClaw
falls back to a direct WebSocket handshake at the bare root. This lets a
bare `ws://` pointed at a local Chrome still connect, since Chrome only
accepts WebSocket upgrades on the specific per-target path from
`/json/version`.
falls back to a direct WebSocket handshake at the bare root. If the advertised
WebSocket endpoint rejects the CDP handshake but the configured bare root
accepts it, OpenClaw falls back to that root as well. This lets a bare `ws://`
pointed at a local Chrome still connect, since Chrome only accepts WebSocket
upgrades on the specific per-target path from `/json/version`, while hosted
providers can still use their root WebSocket endpoint when their discovery
endpoint advertises a short-lived URL that is not suitable for Playwright CDP.
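The discovery-then-fallback order for a bare `ws://` or `wss://` URL can be sketched as below. Both helpers are hypothetical and cover only the URL handling, not the actual handshake or retry logic.

```typescript
// Hypothetical sketch of the documented order for a bare ws:// or wss:// URL.
function discoveryUrlFor(cdpUrl: string): string {
  // Normalise ws -> http and wss -> https before asking for /json/version.
  const httpBase = cdpUrl.replace(/^ws(s?):\/\//, "http$1://");
  return new URL("/json/version", httpBase).toString();
}

function connectionAttempts(cdpUrl: string, advertisedWsUrl?: string): string[] {
  // Try the advertised endpoint first when discovery returned one, then
  // fall back to the configured bare root.
  return advertisedWsUrl ? [advertisedWsUrl, cdpUrl] : [cdpUrl];
}
```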
### Browserbase
@@ -473,7 +534,8 @@ Default behavior:
- The built-in `user` profile uses Chrome MCP auto-connect, which targets the
default local Google Chrome profile.
Use `userDataDir` for Brave, Edge, Chromium, or a non-default Chrome profile:
Use `userDataDir` for Brave, Edge, Chromium, or a non-default Chrome profile.
`~` expands to your OS home directory:
```json5
{
@@ -636,6 +698,8 @@ Common examples:
- CDP startup or readiness failure:
- `Chrome CDP websocket for profile "openclaw" is not reachable after start`
- `Remote CDP for profile "<name>" is not reachable at <cdpUrl>`
- `Port <port> is in use for profile "<name>" but not by openclaw` when a
loopback external CDP service is configured without `attachOnly: true`
- Navigation SSRF block:
- `open`, `navigate`, snapshot, or tab-opening flows fail with a browser/network policy error while `start` and `tabs` still work


@@ -1,5 +1,5 @@
---
summary: "Generate and edit images using configured providers (OpenAI, OpenAI Codex OAuth, Google Gemini, OpenRouter, fal, MiniMax, ComfyUI, Vydra, xAI)"
summary: "Generate and edit images using configured providers (OpenAI, OpenAI Codex OAuth, Google Gemini, OpenRouter, LiteLLM, fal, MiniMax, ComfyUI, Vydra, xAI)"
read_when:
- Generating images via the agent
- Configuring image generation providers and models
@@ -24,6 +24,8 @@ The tool only appears when at least one image generation provider is available.
defaults: {
imageGenerationModel: {
primary: "openai/gpt-image-2",
// Optional default provider request timeout for image_generate.
timeoutMs: 180_000,
},
},
},
@@ -46,18 +48,22 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
## Common routes
| Goal | Model ref | Auth |
| ---------------------------------------------------- | -------------------------------------------------- | ------------------------------------ |
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
| Goal | Model ref | Auth |
| ---------------------------------------------------- | -------------------------------------------------- | -------------------------------------- |
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
| OpenAI transparent-background PNG/WebP | `openai/gpt-image-1.5` | `OPENAI_API_KEY` or OpenAI Codex OAuth |
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
| LiteLLM image generation | `litellm/gpt-image-2` | `LITELLM_API_KEY` |
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
The same `image_generate` tool handles text-to-image and reference-image
editing. Use `image` for one reference or `images` for multiple references.
Provider-supported output hints such as `quality`, `outputFormat`, and
OpenAI-specific `background` are forwarded when available and reported as
ignored when a provider does not support them.
`background` are forwarded when available and reported as ignored when a
provider does not support them. Current bundled transparent-background support
is OpenAI-specific; other providers may still preserve PNG alpha if their
backend emits it.
## Supported providers
@@ -65,6 +71,7 @@ ignored when a provider does not support them.
| ---------- | --------------------------------------- | ---------------------------------- | ----------------------------------------------------- |
| OpenAI | `gpt-image-2` | Yes (up to 4 images) | `OPENAI_API_KEY` or OpenAI Codex OAuth |
| OpenRouter | `google/gemini-3.1-flash-image-preview` | Yes (up to 5 input images) | `OPENROUTER_API_KEY` |
| LiteLLM | `gpt-image-2` | Yes (up to 5 input images) | `LITELLM_API_KEY` |
| Google | `gemini-3.1-flash-image-preview` | Yes | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
| fal | `fal-ai/flux/dev` | Yes | `FAL_KEY` |
| MiniMax | `image-01` | Yes (subject reference) | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
@@ -89,7 +96,8 @@ Use `"list"` to inspect available providers and models at runtime.
</ParamField>
<ParamField path="model" type="string">
Provider/model override, e.g. `openai/gpt-image-2`.
Provider/model override, e.g. `openai/gpt-image-2`; use
`openai/gpt-image-1.5` for transparent OpenAI backgrounds.
</ParamField>
<ParamField path="image" type="string">
@@ -120,6 +128,11 @@ Quality hint when the provider supports it.
Output format hint when the provider supports it.
</ParamField>
<ParamField path="background" type="'transparent' | 'opaque' | 'auto'">
Background hint when the provider supports it. Use `transparent` with
`outputFormat: "png"` or `"webp"` for transparency-capable providers.
</ParamField>
<ParamField path="count" type="number">
Number of images to generate (1-4).
</ParamField>
@@ -150,6 +163,7 @@ Tool results report the applied settings. When OpenClaw remaps geometry during p
defaults: {
imageGenerationModel: {
primary: "openai/gpt-image-2",
timeoutMs: 180_000,
fallbacks: [
"openrouter/google/gemini-3.1-flash-image-preview",
"google/gemini-3.1-flash-image-preview",
@@ -185,6 +199,8 @@ Notes:
`agents.defaults.mediaGenerationAutoProviderFallback: false` if you want image
generation to use only the explicit `model`, `primary`, and `fallbacks`
entries.
- Set `agents.defaults.imageGenerationModel.timeoutMs` for slow image backends.
A per-call `timeoutMs` tool parameter overrides the configured default.
- Use `action: "list"` to inspect the currently registered providers, their
default models, and auth env-var hints.
@@ -226,9 +242,10 @@ through the Codex Responses backend. Legacy Codex base URLs such as
`https://chatgpt.com/backend-api/codex` for image requests. It does not
silently fall back to `OPENAI_API_KEY` for that request. To force direct OpenAI
Images API routing, configure `models.providers.openai` explicitly with an API
key, custom base URL, or Azure endpoint. The older
`openai/gpt-image-1` model can still be selected explicitly, but new OpenAI
image-generation and image-editing requests should use `gpt-image-2`.
key, custom base URL, or Azure endpoint. The `openai/gpt-image-1.5`,
`openai/gpt-image-1`, and `openai/gpt-image-1-mini` models can still be
selected explicitly. Use `gpt-image-1.5` for transparent-background PNG/WebP
output; the current `gpt-image-2` API rejects `background: "transparent"`.
`gpt-image-2` supports both text-to-image generation and reference-image
editing through the same `image_generate` tool. OpenClaw forwards `prompt`,
@@ -253,8 +270,51 @@ OpenAI-specific options live under the `openai` object:
```
`openai.background` accepts `transparent`, `opaque`, or `auto`; transparent
outputs require `outputFormat` `png` or `webp`. `openai.outputCompression`
applies to JPEG/WebP outputs.
outputs require `outputFormat` `png` or `webp` and a transparency-capable OpenAI
image model. OpenClaw routes default `gpt-image-2` transparent-background
requests to `gpt-image-1.5`. `openai.outputCompression` applies to JPEG/WebP
outputs.
The top-level `background` hint is provider-neutral and currently maps to the
same OpenAI `background` request field when the OpenAI provider is selected.
Providers that do not declare background support return it in `ignoredOverrides`
instead of receiving the unsupported parameter.
When asking an agent for a transparent-background OpenAI image, the expected
tool call is:
```json
{
"model": "openai/gpt-image-1.5",
"prompt": "A simple red circle sticker on a transparent background",
"outputFormat": "png",
"background": "transparent"
}
```
The explicit `openai/gpt-image-1.5` model keeps the request portable across
tool summaries and harnesses. If the agent instead uses the default
`openai/gpt-image-2` with `openai.background: "transparent"` on the public
OpenAI or OpenAI Codex OAuth route, OpenClaw rewrites the provider request to
`gpt-image-1.5`. Azure and custom OpenAI-compatible endpoints keep their
configured deployment/model names.
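The rewrite rule in the paragraph above can be sketched as follows. The route labels and `resolveOpenAiImageModel` are hypothetical names chosen for illustration; only the model ids and the `background` values come from this page.

```typescript
// Hypothetical names; illustrates the documented rewrite only.
type OpenAiRoute = "public" | "codex-oauth" | "azure" | "custom";

function resolveOpenAiImageModel(
  model: string,
  background: "transparent" | "opaque" | "auto" | undefined,
  route: OpenAiRoute,
): string {
  const rewritable = route === "public" || route === "codex-oauth";
  if (rewritable && model === "gpt-image-2" && background === "transparent") {
    return "gpt-image-1.5"; // transparency-capable model
  }
  // Azure and custom endpoints keep their configured deployment/model names.
  return model;
}
```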
For headless CLI generation, use the equivalent `openclaw infer` flags:
```bash
openclaw infer image generate \
--model openai/gpt-image-1.5 \
--output-format png \
--background transparent \
--prompt "A simple red circle sticker on a transparent background" \
--json
```
The same `--output-format` and `--background` flags are available on
`openclaw infer image edit`; `--openai-background` remains available as an
OpenAI-specific alias. Current bundled providers other than OpenAI do not
declare explicit background control, so `background: "transparent"` is reported
as ignored for them.
Generate one 4K landscape image:
@@ -262,6 +322,12 @@ Generate one 4K landscape image:
/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
```
Generate a transparent PNG:
```
/tool image_generate action=generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
```
Generate two square images:
```


@@ -143,6 +143,12 @@ Per-agent override: `agents.list[].tools.profile`.
| `messaging` | `group:messaging`, `sessions_list`, `sessions_history`, `sessions_send`, `session_status` |
| `minimal` | `session_status` only |
`coding` includes lightweight web tools (`web_search`, `web_fetch`, `x_search`)
but not the full browser-control tool. Browser automation can drive real
sessions and logged-in profiles, so add it explicitly with
`tools.alsoAllow: ["browser"]` or a per-agent
`agents.list[].tools.alsoAllow: ["browser"]`.
The `coding` and `messaging` profiles also allow configured bundle MCP tools
under the plugin key `bundle-mcp`. Add `tools.deny: ["bundle-mcp"]` when you
want a profile to keep its normal built-ins but hide all configured MCP tools.


@@ -159,12 +159,12 @@ hooks; restart the Gateway process that is serving the live channel before
expecting updated `register(api)` code, `api.on(...)` hooks, tools, services, or
provider/runtime hooks to run.
`openclaw plugins list` is a local CLI/config snapshot. A `loaded` plugin there
means the plugin is discoverable and loadable from the config/files seen by that
CLI invocation. It does not prove that an already-running remote Gateway child
has restarted into the same plugin code. On VPS/container setups with wrapper
processes, send restarts to the actual `openclaw gateway run` process, or use
`openclaw gateway restart` against the running Gateway.
`openclaw plugins list` is a local plugin registry/config snapshot. An
`enabled` plugin there means the persisted registry and current config allow the
plugin to participate. It does not prove that an already-running remote Gateway
child has restarted into the same plugin code. On VPS/container setups with
wrapper processes, send restarts to the actual `openclaw gateway run` process,
or use `openclaw gateway restart` against the running Gateway.
<Accordion title="Plugin states: disabled vs missing vs invalid">
- **Disabled**: plugin exists but enablement rules turned it off. Config is preserved.
@@ -256,7 +256,7 @@ Some categories are exclusive (only one active at a time):
```bash
openclaw plugins list # compact inventory
openclaw plugins list --enabled # only loaded plugins
openclaw plugins list --enabled # only enabled plugins
openclaw plugins list --verbose # per-plugin detail lines
openclaw plugins list --json # machine-readable inventory
openclaw plugins inspect <id> # deep detail
@@ -264,6 +264,9 @@ openclaw plugins inspect <id> --json # machine-readable
openclaw plugins inspect --all # fleet-wide table
openclaw plugins info <id> # inspect alias
openclaw plugins doctor # diagnostics
openclaw plugins registry # inspect persisted registry state
openclaw plugins registry --refresh # rebuild persisted registry
openclaw doctor --fix # repair registry/ledger migration state
openclaw plugins install <package> # install (ClawHub first, then npm)
openclaw plugins install clawhub:<pkg> # install from ClawHub only
@@ -277,7 +280,7 @@ openclaw plugins install <spec> --dangerously-force-unsafe-install
openclaw plugins update <id-or-npm-spec> # update one plugin
openclaw plugins update <id-or-npm-spec> --dangerously-force-unsafe-install
openclaw plugins update --all # update all
openclaw plugins uninstall <id> # remove config/install records
openclaw plugins uninstall <id> # remove config and install ledger records
openclaw plugins uninstall <id> --keep-files
openclaw plugins marketplace list <source>
openclaw plugins marketplace list <source> --json
@@ -299,6 +302,16 @@ When `plugins.allow` is already set, `openclaw plugins install` adds the
installed plugin id to that allowlist before enabling it, so installs are
immediately loadable after restart.
OpenClaw keeps a persisted local plugin registry as the cold read model for
plugin inventory, contribution ownership, and startup planning. Install, update,
uninstall, enable, and disable flows refresh that registry after changing plugin
state. If the registry is missing, stale, or invalid, `openclaw plugins registry
--refresh` rebuilds it from the durable install ledger, config policy, and
manifest/package metadata without loading plugin runtime modules.
If a machine still has legacy `plugins.installs` records in config, run
`openclaw doctor --fix` to move them into the managed
`plugins/installs.json` ledger and remove the config copy.
`openclaw plugins update <id-or-npm-spec>` applies to tracked installs. Passing
an npm package spec with a dist-tag or exact version resolves the package name
back to the tracked plugin record and records the new spec for future updates.
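The spec-to-record resolution described above can be sketched as follows. This is only an illustration under stated assumptions: `TrackedInstall`, `npmPackageName`, and `applyUpdateSpec` are hypothetical names, the package `@example/voice-plugin` is made up, and the real install ledger shape is not shown in this diff.

```typescript
// Hypothetical shape of a tracked install; the real ledger format may differ.
type TrackedInstall = { id: string; spec: string };

// Strip a trailing "@<dist-tag-or-version>" from an npm spec.
// A leading "@" belongs to the scope and is left alone.
function npmPackageName(spec: string): string {
  const at = spec.lastIndexOf("@");
  return at > 0 ? spec.slice(0, at) : spec;
}

// Resolve an id or npm spec to a tracked record. When the match came from an
// npm spec, remember that spec so later updates follow the same tag/version.
function applyUpdateSpec(
  tracked: TrackedInstall[],
  requested: string,
): TrackedInstall | undefined {
  const byId = tracked.find((record) => record.id === requested);
  if (byId) {
    return byId;
  }
  const name = npmPackageName(requested);
  const byPackage = tracked.find((record) => npmPackageName(record.spec) === name);
  if (byPackage) {
    byPackage.spec = requested;
  }
  return byPackage;
}
```

Matching on the bare package name is what lets a spec such as `@example/voice-plugin@beta` find a record that was originally installed without a tag.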


@@ -305,7 +305,11 @@ Announce payloads include a stats line at the end (even when wrapped):
## Tool Policy (sub-agent tools)
By default, sub-agents get **all tools except session tools** and system tools:
Sub-agents first go through the same profile and tool-policy pipeline as the
parent or target agent. OpenClaw then applies the sub-agent restriction layer.
With no restrictive `tools.profile`, sub-agents get all tools **except session
tools and system tools**:
- `sessions_list`
- `sessions_history`
@@ -341,6 +345,24 @@ Override via config:
}
```
`tools.subagents.tools.allow` is a final allow-only filter. It can narrow the
already-resolved tool set, but it cannot add back a tool removed by
`tools.profile`. For example, `tools.profile: "coding"` includes
`web_search`/`web_fetch`, but not the `browser` tool. To let coding-profile
sub-agents use browser automation, add browser at the profile stage:
```json5
{
tools: {
profile: "coding",
alsoAllow: ["browser"],
},
}
```
Use per-agent `agents.list[].tools.alsoAllow: ["browser"]` when only one agent
should get browser automation.
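The ordering matters because the sub-agent allowlist only filters. A minimal sketch of that ordering, assuming a simplified model: `ToolPolicy` and `resolveSubagentTools` are hypothetical names, the profile tool list is abbreviated to the two tools named above, and the sub-agent restriction layer for session/system tools is left out.

```typescript
// Hypothetical, simplified policy shape; not OpenClaw's internal types.
type ToolPolicy = {
  profileTools: string[]; // tools granted by tools.profile
  alsoAllow?: string[]; // additions made at the profile stage
  subagentAllow?: string[]; // tools.subagents.tools.allow (final filter)
};

function resolveSubagentTools(policy: ToolPolicy): string[] {
  // Stage 1: the profile plus alsoAllow decides what exists at all.
  const resolved = Array.from(
    new Set([...policy.profileTools, ...(policy.alsoAllow ?? [])]),
  );
  // Stage 2: the sub-agent allowlist can only narrow that set.
  if (!policy.subagentAllow) {
    return resolved;
  }
  const allow = new Set(policy.subagentAllow);
  return resolved.filter((tool) => allow.has(tool));
}

// Abbreviated stand-in for the "coding" profile's tool list.
const codingProfileTools = ["web_search", "web_fetch"];

// "browser" in the final filter alone has no effect:
const withoutAlsoAllow = resolveSubagentTools({
  profileTools: codingProfileTools,
  subagentAllow: ["browser", "web_search"],
});

// Adding it at the profile stage makes it available to the filter:
const withAlsoAllow = resolveSubagentTools({
  profileTools: codingProfileTools,
  alsoAllow: ["browser"],
  subagentAllow: ["browser", "web_search"],
});
```

In this sketch `withoutAlsoAllow` contains only `web_search`, while `withAlsoAllow` contains both `web_search` and `browser`.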
## Concurrency
Sub-agents use a dedicated in-process queue lane:


@@ -664,6 +664,8 @@ reply delivery. When the channel is Feishu, Matrix, Telegram, or WhatsApp,
the audio is delivered as a voice message rather than a file attachment.
Feishu can transcode non-Opus TTS output on this path when `ffmpeg` is
available.
WhatsApp sends visible text separately from PTT voice-note audio because clients
do not consistently render captions on voice notes.
It accepts optional `channel` and `timeoutMs` fields; `timeoutMs` is a
per-call provider request timeout in milliseconds.


@@ -97,7 +97,7 @@ Duplicate prevention: if a video task is already `queued` or `running` for the c
| Runway | `gen4.5` | Yes | 1 image | 1 video | `RUNWAYML_API_SECRET` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | Yes | 1 image | No | `TOGETHER_API_KEY` |
| Vydra | `veo3` | Yes | 1 image (`kling`) | No | `VYDRA_API_KEY` |
| xAI | `grok-imagine-video` | Yes | 1 image | 1 video | `XAI_API_KEY` |
| xAI | `grok-imagine-video` | Yes | 1 first-frame image or up to 7 `reference_image`s | 1 video | `XAI_API_KEY` |
Some providers accept additional or alternate API key env vars. See individual [provider pages](#related) for details.
@@ -150,7 +150,9 @@ Role hints are forwarded to the provider as-is. Canonical values come from
the `VideoGenerationAssetRole` union but providers may accept additional
role strings. `*Roles` arrays must not have more entries than the
corresponding reference list; off-by-one mistakes fail with a clear error.
Use an empty string to leave a slot unset.
Use an empty string to leave a slot unset. For xAI, set every image role to
`reference_image` to use its `reference_images` generation mode; omit the role
or use `first_frame` for single-image image-to-video.
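The xAI role rules can be read as a small decision function. The sketch below is an assumption-laden illustration, not the provider's code: `XaiVideoMode` and `pickXaiVideoMode` are hypothetical names, and rejecting mixed or multi-image non-reference inputs is a guess about how unsupported combinations would be handled.

```typescript
type XaiVideoMode = "text-to-video" | "image-to-video" | "reference_images";

const MAX_XAI_REFERENCE_IMAGES = 7;

function pickXaiVideoMode(images: string[], imageRoles: string[] = []): XaiVideoMode {
  // Roles may be shorter than the image list, never longer.
  if (imageRoles.length > images.length) {
    throw new Error("imageRoles has more entries than images");
  }
  if (images.length === 0) {
    return "text-to-video";
  }
  // An empty string (or a missing entry) leaves the slot unset.
  const roleAt = (index: number): string => imageRoles[index] ?? "";
  if (images.every((_, index) => roleAt(index) === "reference_image")) {
    if (images.length > MAX_XAI_REFERENCE_IMAGES) {
      throw new Error(`at most ${MAX_XAI_REFERENCE_IMAGES} reference images`);
    }
    return "reference_images";
  }
  if (images.length === 1 && (roleAt(0) === "" || roleAt(0) === "first_frame")) {
    return "image-to-video";
  }
  // Assumption: anything else is treated as unsupported in this sketch.
  throw new Error("unsupported image/role combination");
}
```

For example, one image with no role selects single-image image-to-video, while two images that both carry `reference_image` select the `reference_images` mode.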
### Style controls
@@ -326,7 +328,7 @@ entries.
</Accordion>
<Accordion title="xAI">
Supports text-to-video, image-to-video, and remote video edit/extend flows.
Supports text-to-video, single first-frame image-to-video, up to 7 `reference_image` inputs through xAI `reference_images`, and remote video edit/extend flows.
</Accordion>
</AccordionGroup>


@@ -83,6 +83,12 @@ synced to other devices or persisted server-side beyond the normal transcript
authorship metadata on messages you actually send. Clearing site data or
switching browsers resets it to empty.
The same browser-local pattern applies to the assistant avatar override.
Uploaded assistant avatars overlay the gateway-resolved identity on the local
browser only and never round-trip through `config.patch`. The shared
`ui.assistant.avatar` config field is still available for non-UI clients
writing the field directly (such as scripted gateways or custom dashboards).
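As a rough illustration of the overlay behavior, the sketch below assumes hypothetical names (`AssistantIdentity`, `resolveDisplayedIdentity`) and a plain string for the locally stored avatar; how the Control UI actually stores the upload is not shown here.

```typescript
// Hypothetical identity shape as resolved by the gateway.
type AssistantIdentity = { name: string; avatar?: string };

// Overlay a browser-local avatar on top of the gateway-resolved identity.
// Nothing is written back to config; the gateway value is left untouched.
function resolveDisplayedIdentity(
  gatewayIdentity: AssistantIdentity,
  localAvatar: string | null,
): AssistantIdentity {
  return localAvatar ? { ...gatewayIdentity, avatar: localAvatar } : gatewayIdentity;
}
```

Clearing the local value (passing `null`) falls back to whatever the gateway resolved, which matches the reset-on-clear behavior described for the browser-local settings.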
## Runtime config endpoint
The Control UI fetches its runtime settings from


@@ -1,6 +1,6 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import type { AcpRuntime } from "../runtime-api.js";
import { AcpxRuntime } from "./runtime.js";
import { AcpxRuntime, __testing } from "./runtime.js";
type TestSessionStore = {
load(sessionId: string): Promise<Record<string, unknown> | undefined>;
@@ -9,6 +9,8 @@ type TestSessionStore = {
const DOCUMENTED_OPENCLAW_BRIDGE_COMMAND =
"env OPENCLAW_HIDE_BANNER=1 OPENCLAW_SUPPRESS_NOTES=1 openclaw acp --url ws://127.0.0.1:18789 --token-file ~/.openclaw/gateway.token --session agent:main:main";
const CODEX_ACP_COMMAND = "npx @zed-industries/codex-acp@^0.11.1";
const CODEX_ACP_WRAPPER_COMMAND = `node "/tmp/openclaw/acpx/codex-acp-wrapper.mjs"`;
function makeRuntime(
baseStore: TestSessionStore,
@@ -20,6 +22,7 @@ function makeRuntime(
close: AcpRuntime["close"];
ensureSession: AcpRuntime["ensureSession"];
getStatus: NonNullable<AcpRuntime["getStatus"]>;
setConfigOption: NonNullable<AcpRuntime["setConfigOption"]>;
isHealthy(): boolean;
probeAvailability(): Promise<void>;
};
@@ -27,6 +30,7 @@ function makeRuntime(
close: AcpRuntime["close"];
ensureSession: AcpRuntime["ensureSession"];
getStatus: NonNullable<AcpRuntime["getStatus"]>;
setConfigOption: NonNullable<AcpRuntime["setConfigOption"]>;
isHealthy(): boolean;
probeAvailability(): Promise<void>;
};
@@ -55,6 +59,7 @@ function makeRuntime(
close: AcpRuntime["close"];
ensureSession: AcpRuntime["ensureSession"];
getStatus: NonNullable<AcpRuntime["getStatus"]>;
setConfigOption: NonNullable<AcpRuntime["setConfigOption"]>;
isHealthy(): boolean;
probeAvailability(): Promise<void>;
};
@@ -66,6 +71,7 @@ function makeRuntime(
close: AcpRuntime["close"];
ensureSession: AcpRuntime["ensureSession"];
getStatus: NonNullable<AcpRuntime["getStatus"]>;
setConfigOption: NonNullable<AcpRuntime["setConfigOption"]>;
isHealthy(): boolean;
probeAvailability(): Promise<void>;
};
@@ -79,6 +85,274 @@ describe("AcpxRuntime fresh reset wrapper", () => {
vi.restoreAllMocks();
});
it("normalizes OpenClaw Codex model ids for ACP startup", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => undefined),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore, {
agentRegistry: {
resolve: (agentName: string) => (agentName === "codex" ? CODEX_ACP_COMMAND : agentName),
list: () => ["codex", "openclaw"],
},
});
const ensure = vi.spyOn(delegate, "ensureSession").mockResolvedValue({
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "codex",
});
await runtime.ensureSession({
sessionKey: "agent:codex:acp:test",
agent: "codex",
mode: "persistent",
model: "openai-codex/gpt-5.4",
});
expect(ensure).toHaveBeenCalledWith(
expect.objectContaining({
model: "gpt-5.4",
}),
);
});
it("leaves Codex ACP startup defaults alone when no model or thinking is provided", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => undefined),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore, {
agentRegistry: {
resolve: (agentName: string) => (agentName === "codex" ? CODEX_ACP_COMMAND : agentName),
list: () => ["codex", "openclaw"],
},
});
const ensure = vi.spyOn(delegate, "ensureSession").mockResolvedValue({
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "codex",
});
await runtime.ensureSession({
sessionKey: "agent:codex:acp:test",
agent: "codex",
mode: "persistent",
});
expect(ensure).toHaveBeenCalledWith(
expect.objectContaining({
agent: "codex",
}),
);
expect(ensure.mock.calls[0]?.[0]).not.toHaveProperty("model");
expect(ensure.mock.calls[0]?.[0]).not.toHaveProperty("thinking");
});
it("does not normalize model startup for non-Codex ACP agents", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => undefined),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore, {
agentRegistry: {
resolve: (agentName: string) => (agentName === "main" ? CODEX_ACP_COMMAND : agentName),
list: () => ["main", "codex", "openclaw"],
},
});
const ensure = vi.spyOn(delegate, "ensureSession").mockResolvedValue({
sessionKey: "agent:main:acp:test",
backend: "acpx",
runtimeSessionName: "main",
});
await runtime.ensureSession({
sessionKey: "agent:main:acp:test",
agent: "main",
mode: "persistent",
model: "openai-codex/gpt-5.5",
});
expect(ensure).toHaveBeenCalledWith(
expect.objectContaining({
agent: "main",
model: "openai-codex/gpt-5.5",
}),
);
});
it("injects Codex ACP startup config into the scoped registry", () => {
expect(__testing.isCodexAcpCommand(CODEX_ACP_COMMAND)).toBe(true);
expect(__testing.isCodexAcpCommand(CODEX_ACP_WRAPPER_COMMAND)).toBe(true);
expect(
__testing.appendCodexAcpConfigOverrides(CODEX_ACP_COMMAND, {
model: "gpt-5.4",
reasoningEffort: "medium",
}),
).toBe(
"npx @zed-industries/codex-acp@^0.11.1 -c model=gpt-5.4 -c model_reasoning_effort=medium",
);
expect(__testing.isCodexAcpCommand("openclaw acp")).toBe(false);
});
it("passes gpt-5.5 Codex ACP startup through instead of blocking it", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => undefined),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore, {
agentRegistry: {
resolve: (agentName: string) => (agentName === "codex" ? CODEX_ACP_COMMAND : agentName),
list: () => ["codex", "openclaw"],
},
});
const ensure = vi.spyOn(delegate, "ensureSession").mockResolvedValue({
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "codex",
});
await runtime.ensureSession({
sessionKey: "agent:codex:acp:test",
agent: "codex",
mode: "persistent",
model: "openai-codex/gpt-5.5",
});
expect(ensure).toHaveBeenCalledWith(
expect.objectContaining({
model: "gpt-5.5",
}),
);
});
it("maps explicit Codex ACP thinking to startup reasoning effort", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => undefined),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore, {
agentRegistry: {
resolve: (agentName: string) => (agentName === "codex" ? CODEX_ACP_COMMAND : agentName),
list: () => ["codex", "openclaw"],
},
});
const ensure = vi.spyOn(delegate, "ensureSession").mockResolvedValue({
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "codex",
});
await runtime.ensureSession({
sessionKey: "agent:codex:acp:test",
agent: "codex",
mode: "persistent",
model: "openai-codex/gpt-5.4",
thinking: "x-high",
});
expect(ensure).toHaveBeenCalledWith(
expect.objectContaining({
model: "gpt-5.4/xhigh",
}),
);
});
it("normalizes Codex ACP model config controls to adapter ids", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => ({
acpxRecordId: "agent:codex:acp:test",
agentCommand: CODEX_ACP_COMMAND,
})),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore);
const setConfigOption = vi.spyOn(delegate, "setConfigOption").mockResolvedValue(undefined);
const handle: Parameters<NonNullable<AcpRuntime["setConfigOption"]>>[0]["handle"] = {
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "agent:codex:acp:test",
acpxRecordId: "agent:codex:acp:test",
};
await runtime.setConfigOption({
handle,
key: "model",
value: "openai-codex/gpt-5.4",
});
expect(setConfigOption).toHaveBeenNthCalledWith(1, {
handle,
key: "model",
value: "gpt-5.4",
});
expect(setConfigOption).toHaveBeenCalledOnce();
});
it("normalizes Codex ACP slash reasoning suffixes to config controls", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => ({
acpxRecordId: "agent:codex:acp:test",
agentCommand: CODEX_ACP_COMMAND,
})),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore);
const setConfigOption = vi.spyOn(delegate, "setConfigOption").mockResolvedValue(undefined);
const handle: Parameters<NonNullable<AcpRuntime["setConfigOption"]>>[0]["handle"] = {
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "agent:codex:acp:test",
acpxRecordId: "agent:codex:acp:test",
};
await runtime.setConfigOption({
handle,
key: "model",
value: "openai-codex/gpt-5.4/high",
});
expect(setConfigOption).toHaveBeenNthCalledWith(1, {
handle,
key: "model",
value: "gpt-5.4",
});
expect(setConfigOption).toHaveBeenNthCalledWith(2, {
handle,
key: "reasoning_effort",
value: "high",
});
});
it("normalizes Codex ACP thinking config controls to reasoning effort", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => ({
acpxRecordId: "agent:codex:acp:test",
agentCommand: CODEX_ACP_COMMAND,
})),
save: vi.fn(async () => {}),
};
const { runtime, delegate } = makeRuntime(baseStore);
const setConfigOption = vi.spyOn(delegate, "setConfigOption").mockResolvedValue(undefined);
const handle: Parameters<NonNullable<AcpRuntime["setConfigOption"]>>[0]["handle"] = {
sessionKey: "agent:codex:acp:test",
backend: "acpx",
runtimeSessionName: "agent:codex:acp:test",
acpxRecordId: "agent:codex:acp:test",
};
await runtime.setConfigOption({
handle,
key: "thinking",
value: "minimal",
});
expect(setConfigOption).toHaveBeenCalledWith({
handle,
key: "reasoning_effort",
value: "low",
});
});
it("keeps stale persistent loads hidden until a fresh record is saved", async () => {
const baseStore: TestSessionStore = {
load: vi.fn(async () => ({ acpxRecordId: "stale" }) as never),


@@ -1,3 +1,4 @@
import { AsyncLocalStorage } from "node:async_hooks";
import {
ACPX_BACKEND_ID,
AcpxRuntime as BaseAcpxRuntime,
@@ -13,7 +14,7 @@ import {
type AcpRuntimeOptions,
type AcpRuntimeStatus,
} from "acpx/runtime";
import type { AcpRuntime } from "../runtime-api.js";
import { AcpRuntimeError, type AcpRuntime } from "../runtime-api.js";
type AcpSessionStore = AcpRuntimeOptions["sessionStore"];
type AcpSessionRecord = Parameters<AcpSessionStore["save"]>[0];
@@ -60,6 +61,27 @@ function createResetAwareSessionStore(baseStore: AcpSessionStore): ResetAwareSes
const OPENCLAW_BRIDGE_EXECUTABLE = "openclaw";
const OPENCLAW_BRIDGE_SUBCOMMAND = "acp";
const CODEX_ACP_AGENT_ID = "codex";
const CODEX_ACP_OPENCLAW_PREFIX = "openai-codex/";
const CODEX_ACP_REASONING_EFFORTS = new Set(["low", "medium", "high", "xhigh"]);
const CODEX_ACP_THINKING_ALIASES = new Map<string, string | undefined>([
["off", undefined],
["minimal", "low"],
["low", "low"],
["medium", "medium"],
["high", "high"],
["x-high", "xhigh"],
["x_high", "xhigh"],
["extra-high", "xhigh"],
["extra_high", "xhigh"],
["extra high", "xhigh"],
["xhigh", "xhigh"],
]);
type CodexAcpModelOverride = {
model?: string;
reasoningEffort?: string;
};
function normalizeAgentName(value: string | undefined): string | undefined {
const normalized = value?.trim().toLowerCase();
@@ -175,6 +197,149 @@ function isOpenClawBridgeCommand(command: string | undefined): boolean {
return /^openclaw(?:\.[cm]?js)?$/i.test(scriptName) && parts[2] === OPENCLAW_BRIDGE_SUBCOMMAND;
}
function isCodexAcpPackageSpec(value: string): boolean {
return /^@zed-industries\/codex-acp(?:@.+)?$/i.test(value.trim());
}
function isCodexAcpCommand(command: string | undefined): boolean {
if (!command) {
return false;
}
const parts = unwrapEnvCommand(splitCommandParts(command.trim()));
if (!parts.length) {
return false;
}
if (parts.some(isCodexAcpPackageSpec)) {
return true;
}
const commandName = basename(parts[0] ?? "");
if (/^codex-acp(?:\.exe)?$/i.test(commandName)) {
return true;
}
if (commandName !== "node") {
return false;
}
const scriptName = basename(parts[1] ?? "");
return /^codex-acp(?:-wrapper)?(?:\.[cm]?js)?$/i.test(scriptName);
}
function failUnsupportedCodexAcpModel(rawModel: string, detail?: string): never {
throw new AcpRuntimeError(
"ACP_INVALID_RUNTIME_OPTION",
detail ??
`Codex ACP model "${rawModel}" is not supported. Use openai-codex/<model> or <model>/<reasoning-effort>.`,
);
}
function failUnsupportedCodexAcpThinking(rawThinking: string): never {
throw new AcpRuntimeError(
"ACP_INVALID_RUNTIME_OPTION",
`Codex ACP thinking level "${rawThinking}" is not supported. Use off, minimal, low, medium, high, or xhigh.`,
);
}
function normalizeCodexAcpReasoningEffort(rawThinking: string | undefined): string | undefined {
const normalized = rawThinking?.trim().toLowerCase();
if (!normalized) {
return undefined;
}
if (!CODEX_ACP_THINKING_ALIASES.has(normalized)) {
failUnsupportedCodexAcpThinking(rawThinking ?? "");
}
return CODEX_ACP_THINKING_ALIASES.get(normalized);
}
function normalizeCodexAcpModelOverride(
rawModel: string | undefined,
rawThinking?: string,
): CodexAcpModelOverride | undefined {
const raw = rawModel?.trim();
const thinkingReasoningEffort = normalizeCodexAcpReasoningEffort(rawThinking);
if (!raw) {
return thinkingReasoningEffort ? { reasoningEffort: thinkingReasoningEffort } : undefined;
}
let value = raw;
if (value.toLowerCase().startsWith(CODEX_ACP_OPENCLAW_PREFIX)) {
value = value.slice(CODEX_ACP_OPENCLAW_PREFIX.length);
}
const parts = value.split("/");
if (parts.length > 2) {
failUnsupportedCodexAcpModel(raw);
}
const model = (parts[0] ?? "").trim();
const modelReasoningEffort = normalizeCodexAcpReasoningEffort(parts[1]);
if (!model) {
failUnsupportedCodexAcpModel(raw);
}
const reasoningEffort = thinkingReasoningEffort ?? modelReasoningEffort;
if (reasoningEffort && !CODEX_ACP_REASONING_EFFORTS.has(reasoningEffort)) {
failUnsupportedCodexAcpThinking(reasoningEffort);
}
return {
model,
...(reasoningEffort ? { reasoningEffort } : {}),
};
}
function codexAcpSessionModelId(override: CodexAcpModelOverride): string {
if (!override.model) {
return "";
}
return override.reasoningEffort
? `${override.model}/${override.reasoningEffort}`
: override.model;
}
function quoteShellArg(value: string): string {
if (/^[A-Za-z0-9_./:=@+-]+$/.test(value)) {
return value;
}
return `'${value.replace(/'/g, "'\\''")}'`;
}
function appendCodexAcpConfigOverrides(command: string, override: CodexAcpModelOverride): string {
const configArgs = override.model ? [`model=${override.model}`] : [];
if (override.reasoningEffort) {
configArgs.push(`model_reasoning_effort=${override.reasoningEffort}`);
}
if (configArgs.length === 0) {
return command;
}
return `${command} ${configArgs.map((arg) => `-c ${quoteShellArg(arg)}`).join(" ")}`;
}
function createModelScopedAgentRegistry(params: {
agentRegistry: AcpAgentRegistry;
scope: AsyncLocalStorage<CodexAcpModelOverride | undefined>;
}): AcpAgentRegistry {
return {
resolve(agentName: string): string | undefined {
const command = params.agentRegistry.resolve(agentName);
const override = params.scope.getStore();
if (
!override ||
normalizeAgentName(agentName) !== CODEX_ACP_AGENT_ID ||
typeof command !== "string" ||
!isCodexAcpCommand(command)
) {
return command;
}
return appendCodexAcpConfigOverrides(command, override);
},
list(): string[] {
return params.agentRegistry.list();
},
};
}
function resolveAgentCommand(params: {
agentName: string | undefined;
agentRegistry: AcpAgentRegistry;
@@ -211,6 +376,10 @@ function shouldUseDistinctBridgeDelegate(options: AcpRuntimeOptions): boolean {
export class AcpxRuntime implements AcpRuntime {
private readonly sessionStore: ResetAwareSessionStore;
private readonly agentRegistry: AcpAgentRegistry;
private readonly scopedAgentRegistry: AcpAgentRegistry;
private readonly codexAcpModelOverrideScope = new AsyncLocalStorage<
CodexAcpModelOverride | undefined
>();
private readonly delegate: BaseAcpxRuntime;
private readonly bridgeSafeDelegate: BaseAcpxRuntime;
private readonly probeDelegate: BaseAcpxRuntime;
@@ -221,9 +390,14 @@ export class AcpxRuntime implements AcpRuntime {
) {
this.sessionStore = createResetAwareSessionStore(options.sessionStore);
this.agentRegistry = options.agentRegistry;
this.scopedAgentRegistry = createModelScopedAgentRegistry({
agentRegistry: this.agentRegistry,
scope: this.codexAcpModelOverrideScope,
});
const sharedOptions = {
...options,
sessionStore: this.sessionStore,
agentRegistry: this.scopedAgentRegistry,
};
this.delegate = new BaseAcpxRuntime(sharedOptions, testOptions);
this.bridgeSafeDelegate = shouldUseDistinctBridgeDelegate(options)
@@ -259,6 +433,18 @@ export class AcpxRuntime implements AcpRuntime {
return this.resolveDelegateForAgent(readAgentFromHandle(handle));
}
private async resolveCommandForHandle(handle: AcpRuntimeHandle): Promise<string | undefined> {
const record = await this.sessionStore.load(handle.acpxRecordId ?? handle.sessionKey);
const recordCommand = readAgentCommandFromRecord(record);
if (recordCommand) {
return recordCommand;
}
return resolveAgentCommandForName({
agentName: readAgentFromHandle(handle),
agentRegistry: this.agentRegistry,
});
}
isHealthy(): boolean {
return this.probeDelegate.isHealthy();
}
@@ -271,8 +457,32 @@ export class AcpxRuntime implements AcpRuntime {
return this.probeDelegate.doctor();
}
ensureSession(input: Parameters<AcpRuntime["ensureSession"]>[0]): Promise<AcpRuntimeHandle> {
return this.resolveDelegateForAgent(input.agent).ensureSession(input);
async ensureSession(
input: Parameters<AcpRuntime["ensureSession"]>[0],
): Promise<AcpRuntimeHandle> {
const command = resolveAgentCommandForName({
agentName: input.agent,
agentRegistry: this.agentRegistry,
});
const delegate = this.resolveDelegateForCommand(command);
const codexModelOverride =
normalizeAgentName(input.agent) === CODEX_ACP_AGENT_ID && isCodexAcpCommand(command)
? normalizeCodexAcpModelOverride(input.model, input.thinking)
: undefined;
if (!codexModelOverride) {
return delegate.ensureSession(input);
}
const normalizedInput = {
...input,
...(codexAcpSessionModelId(codexModelOverride)
? { model: codexAcpSessionModelId(codexModelOverride) }
: {}),
};
return this.codexAcpModelOverrideScope.run(codexModelOverride, () =>
delegate.ensureSession(normalizedInput),
);
}
async *runTurn(input: Parameters<AcpRuntime["runTurn"]>[0]): AsyncIterable<AcpRuntimeEvent> {
@@ -299,6 +509,39 @@ export class AcpxRuntime implements AcpRuntime {
input: Parameters<NonNullable<AcpRuntime["setConfigOption"]>>[0],
): Promise<void> {
const delegate = await this.resolveDelegateForHandle(input.handle);
const command = await this.resolveCommandForHandle(input.handle);
if (
(input.key === "model" ||
input.key === "thinking" ||
input.key === "thought_level" ||
input.key === "reasoning_effort") &&
isCodexAcpCommand(command)
) {
const override =
input.key === "model"
? normalizeCodexAcpModelOverride(input.value)
: normalizeCodexAcpModelOverride(undefined, input.value);
if (!override && input.key !== "model") {
return;
}
if (override) {
if (override.model) {
await delegate.setConfigOption({
...input,
key: "model",
value: override.model,
});
}
if (override.reasoningEffort) {
await delegate.setConfigOption({
...input,
key: "reasoning_effort",
value: override.reasoningEffort,
});
}
return;
}
}
await delegate.setConfigOption(input);
}
@@ -334,4 +577,11 @@ export {
encodeAcpxRuntimeHandleState,
};
export const __testing = {
appendCodexAcpConfigOverrides,
codexAcpSessionModelId,
isCodexAcpCommand,
normalizeCodexAcpModelOverride,
};
export type { AcpAgentRegistry, AcpRuntimeOptions, AcpSessionRecord, AcpSessionStore };


@@ -922,6 +922,53 @@ describe("active-memory plugin", () => {
});
});
it("infers the configured provider for bare active-memory default models", async () => {
api.config = {
agents: {
defaults: {
model: { primary: "gpt-5.5" },
},
},
models: {
providers: {
"openai-codex": {
baseUrl: "https://chatgpt.com/backend-api/codex",
models: [
{
id: "gpt-5.5",
name: "GPT 5.5",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 200_000,
maxTokens: 128_000,
},
],
},
},
},
};
api.pluginConfig = {
agents: ["main"],
};
plugin.register(api as unknown as OpenClawPluginApi);
await hooks.before_prompt_build(
{ prompt: "what wings should i order? bare model default", messages: [] },
{
agentId: "main",
trigger: "user",
sessionKey: "agent:main:main",
messageProvider: "webchat",
},
);
expect(runEmbeddedPiAgent.mock.calls.at(-1)?.[0]).toMatchObject({
provider: "openai-codex",
model: "gpt-5.5",
});
});
it("skips recall when no model or explicit fallback resolves", async () => {
api.config = {};
api.pluginConfig = {


@@ -7,6 +7,7 @@ import {
resolveAgentDir,
resolveAgentEffectiveModelPrimary,
resolveAgentWorkspaceDir,
resolveDefaultModelForAgent,
} from "openclaw/plugin-sdk/agent-runtime";
import {
resolveLivePluginConfigObject,
@@ -1550,13 +1551,11 @@ function extractRecentTurns(messages: unknown[]): ActiveRecallRecentTurn[] {
return turns;
}
function parseModelCandidate(modelRef: string | undefined) {
function parseModelCandidate(modelRef: string | undefined, defaultProvider = DEFAULT_PROVIDER) {
if (!modelRef) {
return undefined;
}
return (
parseModelRef(modelRef, DEFAULT_PROVIDER) ?? { provider: DEFAULT_PROVIDER, model: modelRef }
);
return parseModelRef(modelRef, defaultProvider) ?? { provider: defaultProvider, model: modelRef };
}
function getModelRef(
@@ -1570,14 +1569,20 @@ function getModelRef(
): { provider: string; model: string } | undefined {
const currentRunModel =
ctx?.modelProviderId && ctx?.modelId ? `${ctx.modelProviderId}/${ctx.modelId}` : undefined;
const configuredDefaultModel = resolveAgentEffectiveModelPrimary(api.config, agentId)
? resolveDefaultModelForAgent({ cfg: api.config, agentId })
: undefined;
const defaultProvider = configuredDefaultModel?.provider ?? DEFAULT_PROVIDER;
const candidates = [
config.model,
currentRunModel,
resolveAgentEffectiveModelPrimary(api.config, agentId),
configuredDefaultModel
? `${configuredDefaultModel.provider}/${configuredDefaultModel.model}`
: undefined,
config.modelFallback,
];
for (const candidate of candidates) {
const parsed = parseModelCandidate(candidate);
const parsed = parseModelCandidate(candidate, defaultProvider);
if (parsed) {
return parsed;
}


@@ -1,4 +1,5 @@
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
import { registerUnhandledRejectionHandler } from "openclaw/plugin-sdk/runtime";
import { startGatewayBonjourAdvertiser } from "./src/advertiser.js";
function formatBonjourInstanceName(displayName: string) {
@@ -32,7 +33,7 @@ export default definePluginEntry({
cliPath: ctx.cliPath,
minimal: ctx.minimal,
},
{ logger: api.logger },
{ logger: api.logger, registerUnhandledRejectionHandler },
);
return { stop: advertiser.stop };
},


@@ -484,12 +484,12 @@ describe("gateway bonjour advertiser", () => {
expect(createService).toHaveBeenCalledTimes(2);
expect(advertise).toHaveBeenCalledTimes(2);
expect(destroy).toHaveBeenCalledTimes(1);
expect(shutdown).toHaveBeenCalledTimes(1);
expect(shutdown).not.toHaveBeenCalled();
expect(events).toEqual(["advertise:1", "destroy", "advertise:2"]);
await started.stop();
expect(destroy).toHaveBeenCalledTimes(2);
expect(shutdown).toHaveBeenCalledTimes(2);
expect(shutdown).toHaveBeenCalledTimes(1);
});
it("treats probing-to-announcing churn as one unhealthy window", async () => {
@@ -527,9 +527,10 @@ describe("gateway bonjour advertiser", () => {
expect(createService).toHaveBeenCalledTimes(2);
expect(advertise).toHaveBeenCalledTimes(3);
expect(destroy).toHaveBeenCalledTimes(1);
expect(shutdown).toHaveBeenCalledTimes(1);
expect(shutdown).not.toHaveBeenCalled();
await started.stop();
expect(shutdown).toHaveBeenCalledTimes(1);
});
it("normalizes hostnames with domains for service names", async () => {


@@ -233,8 +233,9 @@ export async function startGatewayBonjourAdvertiser(
gatewayTxt.sshPort = String(opts.sshPort ?? 22);
}
const responder = getResponder();
function createCycle(): BonjourCycle {
const responder = getResponder();
const services: Array<{ label: string; svc: BonjourService }> = [];
const gateway = responder.createService({
@@ -259,7 +260,7 @@ export async function startGatewayBonjourAdvertiser(
return { responder, services, cleanupUnhandledRejection };
}
async function stopCycle(cycle: BonjourCycle | null) {
async function stopCycle(cycle: BonjourCycle | null, opts?: { shutdownResponder?: boolean }) {
if (!cycle) {
return;
}
@@ -271,7 +272,9 @@ export async function startGatewayBonjourAdvertiser(
}
}
try {
await cycle.responder.shutdown();
if (opts?.shutdownResponder) {
await cycle.responder.shutdown();
}
} catch {
/* ignore */
} finally {
@@ -442,7 +445,7 @@ export async function startGatewayBonjourAdvertiser(
} catch {
// ignore
}
await stopCycle(cycle);
await stopCycle(cycle, { shutdownResponder: true });
restoreConsoleLog();
},
};


@@ -1,4 +1,6 @@
import { createServer } from "node:http";
import type { AddressInfo } from "node:net";
import type { Duplex } from "node:stream";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { type WebSocket, WebSocketServer } from "ws";
import { SsrFBlockedError } from "../infra/net/ssrf.js";
@@ -137,6 +139,67 @@ describe("cdp", () => {
}
});
it("honors configured HTTP discovery timeouts when creating a target", async () => {
const wsPort = await startWsServerWithMessages((msg, socket) => {
if (msg.method !== "Target.createTarget") {
return;
}
socket.send(JSON.stringify({ id: msg.id, result: { targetId: "TARGET_SLOW" } }));
});
httpServer = createServer((req, res) => {
if (req.url === "/json/version") {
setTimeout(() => {
res.setHeader("content-type", "application/json");
res.end(
JSON.stringify({
webSocketDebuggerUrl: `ws://127.0.0.1:${wsPort}/devtools/browser/SLOW`,
}),
);
}, 120);
return;
}
res.statusCode = 404;
res.end("not found");
});
await new Promise<void>((resolve) => httpServer?.listen(0, "127.0.0.1", resolve));
const httpPort = (httpServer.address() as AddressInfo).port;
await expect(
createTargetViaCdp({
cdpUrl: `http://127.0.0.1:${httpPort}`,
url: "https://example.com",
timeouts: { httpTimeoutMs: 20 },
}),
).rejects.toThrow();
});
it("honors configured WebSocket handshake timeouts when creating a target", async () => {
wsServer = new WebSocketServer({ noServer: true });
httpServer = createServer();
const heldSockets: Duplex[] = [];
httpServer.on("upgrade", (_req, socket) => {
heldSockets.push(socket);
// Hold the TCP connection open without completing the WebSocket handshake.
});
await new Promise<void>((resolve) => httpServer?.listen(0, "127.0.0.1", resolve));
const port = (httpServer.address() as AddressInfo).port;
try {
await expect(
createTargetViaCdp({
cdpUrl: `ws://127.0.0.1:${port}/devtools/browser/SLOW`,
url: "https://example.com",
timeouts: { handshakeTimeoutMs: 20 },
}),
).rejects.toThrow();
} finally {
for (const socket of heldSockets) {
socket.destroy();
}
}
});
it("preserves query params when connecting via direct WebSocket URL", async () => {
let receivedHeaders: Record<string, string> = {};
const wsPort = await startWsServer();
@@ -351,6 +414,56 @@ describe("cdp", () => {
expect(created.targetId).toBe("WS_FALLBACK");
});
it("falls back to direct WS connection when discovered Browserless endpoint rejects commands", async () => {
const server = createServer((req, res) => {
if (req.url?.startsWith("/json/version")) {
const addr = server.address() as AddressInfo;
res.setHeader("content-type", "application/json");
res.end(
JSON.stringify({
webSocketDebuggerUrl: `ws://127.0.0.1:${addr.port}/e/bad`,
}),
);
return;
}
res.statusCode = 404;
res.end("not found");
});
const wss = new WebSocketServer({ noServer: true });
server.on("upgrade", (req, socket, head) => {
if (req.url?.startsWith("/e/bad")) {
socket.destroy();
return;
}
wss.handleUpgrade(req, socket, head, (ws) => {
wss.emit("connection", ws, req);
});
});
wss.on("connection", (socket) => {
socket.on("message", (data) => {
const msg = JSON.parse(rawDataToString(data)) as {
id?: number;
method?: string;
};
if (msg.method === "Target.createTarget") {
socket.send(JSON.stringify({ id: msg.id, result: { targetId: "ROOT_FALLBACK" } }));
}
});
});
await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
try {
const addr = server.address() as AddressInfo;
const created = await createTargetViaCdp({
cdpUrl: `ws://127.0.0.1:${addr.port}?token=abc`,
url: "https://example.com",
});
expect(created.targetId).toBe("ROOT_FALLBACK");
} finally {
await new Promise<void>((resolve) => wss.close(() => resolve()));
await new Promise<void>((resolve) => server.close(() => resolve()));
}
});
it("captures an aria snapshot via CDP", async () => {
const wsPort = await startWsServerWithMessages((msg, socket) => {
if (msg.method === "Accessibility.enable") {

@@ -180,10 +180,16 @@ export async function captureScreenshot(opts: {
);
}
export type CdpActionTimeouts = {
httpTimeoutMs?: number;
handshakeTimeoutMs?: number;
};
export async function createTargetViaCdp(opts: {
cdpUrl: string;
url: string;
ssrfPolicy?: SsrFPolicy;
timeouts?: CdpActionTimeouts;
}): Promise<{ targetId: string }> {
await assertBrowserNavigationAllowed({
url: opts.url,
@@ -208,7 +214,7 @@ export async function createTargetViaCdp(opts: {
try {
version = await fetchJson<{ webSocketDebuggerUrl?: string }>(
appendCdpPath(discoveryUrl, "/json/version"),
1500,
opts.timeouts?.httpTimeoutMs,
undefined,
opts.ssrfPolicy,
);
@@ -230,19 +236,36 @@ export async function createTargetViaCdp(opts: {
} else {
throw new Error("CDP /json/version missing webSocketDebuggerUrl");
}
await assertCdpEndpointAllowed(wsUrl, opts.ssrfPolicy);
}
return await withCdpSocket(wsUrl, async (send) => {
const created = (await send("Target.createTarget", { url: opts.url })) as {
targetId?: string;
};
const targetId = created?.targetId?.trim() ?? "";
if (!targetId) {
throw new Error("CDP Target.createTarget returned no targetId");
const candidateWsUrls =
isWebSocketUrl(opts.cdpUrl) && wsUrl !== opts.cdpUrl ? [wsUrl, opts.cdpUrl] : [wsUrl];
let lastError: unknown;
for (const candidateWsUrl of candidateWsUrls) {
try {
await assertCdpEndpointAllowed(candidateWsUrl, opts.ssrfPolicy);
return await withCdpSocket(
candidateWsUrl,
async (send) => {
const created = (await send("Target.createTarget", { url: opts.url })) as {
targetId?: string;
};
const targetId = created?.targetId?.trim() ?? "";
if (!targetId) {
throw new Error("CDP Target.createTarget returned no targetId");
}
return { targetId };
},
{ handshakeTimeoutMs: opts.timeouts?.handshakeTimeoutMs },
);
} catch (err) {
lastError = err;
}
return { targetId };
});
}
if (lastError instanceof Error) {
throw lastError;
}
throw new Error("CDP Target.createTarget failed");
}
export type CdpRemoteObject = {
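
Note on the hunk above: `createTargetViaCdp` now walks a list of candidate WebSocket URLs (the discovered endpoint first, then the configured `cdpUrl` when it is a WebSocket URL and differs), returns on the first success, and rethrows the last error if every candidate fails. A minimal sketch of that pattern in isolation follows. The helper names `firstSuccessful` and `candidateUrls` and the `attempt` callback are hypothetical and do not appear in the diff.

```typescript
// Hypothetical helper illustrating the try-each-candidate pattern; not part of the diff.
async function firstSuccessful<T>(
  candidates: readonly string[],
  attempt: (candidate: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      // Return as soon as one candidate succeeds.
      return await attempt(candidate);
    } catch (err) {
      // Remember the failure and move on to the next candidate.
      lastError = err;
    }
  }
  if (lastError instanceof Error) {
    throw lastError;
  }
  throw new Error("all candidates failed");
}

// Mirrors the candidate list in the diff: discovered URL first, then the
// configured URL only when it is a WebSocket URL and differs from it.
function candidateUrls(
  discovered: string,
  configured: string,
  configuredIsWebSocket: boolean,
): string[] {
  return configuredIsWebSocket && discovered !== configured
    ? [discovered, configured]
    : [discovered];
}
```

Keeping only the last error matches the diff: when both endpoints fail, the caller sees the failure from the configured URL rather than from the discovered one.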

@@ -365,6 +365,19 @@ export async function diagnoseChromeCdp(
const health = await diagnoseCdpHealthCommand(wsUrl, handshakeTimeoutMs);
if (!health.ok) {
if (isWebSocketUrl(cdpUrl) && wsUrl !== cdpUrl) {
const directHealth = await diagnoseCdpHealthCommand(cdpUrl, handshakeTimeoutMs);
if (directHealth.ok) {
return {
ok: true,
cdpUrl,
wsUrl: cdpUrl,
browser: version.Browser,
userAgent: version["User-Agent"],
elapsedMs: elapsedSince(startedAt),
};
}
}
return failureDiagnostic({
cdpUrl,
wsUrl,

@@ -662,6 +662,59 @@ describe("browser chrome helpers", () => {
});
});
it("falls back to the bare WebSocket root when discovered Browserless endpoint rejects readiness", async () => {
const server = createServer((req, res) => {
if (req.url?.startsWith("/json/version")) {
const addr = server.address() as AddressInfo;
res.writeHead(200, { "Content-Type": "application/json" });
res.end(
JSON.stringify({
Browser: "Browserless/Mock",
webSocketDebuggerUrl: `ws://127.0.0.1:${addr.port}/e/bad`,
}),
);
return;
}
res.writeHead(404);
res.end();
});
const wss = new WebSocketServer({ noServer: true });
server.on("upgrade", (req, socket, head) => {
if (req.url?.startsWith("/e/bad")) {
socket.destroy();
return;
}
wss.handleUpgrade(req, socket, head, (ws) => {
wss.emit("connection", ws, req);
});
});
wss.on("connection", (ws) => {
ws.on("message", (raw) => {
const message = JSON.parse(rawDataToString(raw)) as { id?: number; method?: string };
if (message.method === "Browser.getVersion" && message.id === 1) {
ws.send(JSON.stringify({ id: 1, result: { product: "Browserless/Mock" } }));
}
});
});
await new Promise<void>((resolve, reject) => {
server.listen(0, "127.0.0.1", () => resolve());
server.once("error", reject);
});
try {
const addr = server.address() as AddressInfo;
const wsOnlyBase = `ws://127.0.0.1:${addr.port}?token=abc`;
await expect(isChromeCdpReady(wsOnlyBase, 300, 400)).resolves.toBe(true);
await expect(diagnoseChromeCdp(wsOnlyBase, 300, 400)).resolves.toMatchObject({
ok: true,
wsUrl: wsOnlyBase,
browser: "Browserless/Mock",
});
} finally {
await new Promise<void>((resolve) => wss.close(() => resolve()));
await new Promise<void>((resolve) => server.close(() => resolve()));
}
});
it("reports unreachable when a bare ws:// CDP URL points at a server with no /json/version and refuses WS", async () => {
// Negative counterpart to the #68027 happy path — a bare ws URL
// pointed at a port that neither serves /json/version nor accepts

@@ -0,0 +1,69 @@
import fs from "node:fs/promises";
import net from "node:net";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
import { clearConfigCache } from "../../../../src/config/config.js";
import { createTempHomeEnv } from "../../test-support.js";
import { fetchBrowserJson } from "./client-fetch.js";
type TempHome = {
home: string;
restore: () => Promise<void>;
};
describe("browser client fetch attachOnly diagnostics", () => {
let tempHome: TempHome | undefined;
afterEach(async () => {
clearConfigCache();
await tempHome?.restore();
tempHome = undefined;
});
it("does not suggest gateway restart when an attachOnly CDP endpoint hangs", async () => {
tempHome = await createTempHomeEnv("openclaw-browser-client-fetch-live-");
const server = net.createServer((socket) => {
socket.on("error", () => {});
});
await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
const port = (server.address() as { port: number }).port;
const configPath = path.join(tempHome.home, ".openclaw", "openclaw.json");
await fs.writeFile(
configPath,
JSON.stringify(
{
browser: {
enabled: true,
defaultProfile: "hung",
attachOnly: true,
profiles: {
hung: {
cdpUrl: `http://127.0.0.1:${port}`,
attachOnly: true,
color: "#00AA00",
},
},
},
},
null,
2,
),
);
process.env.OPENCLAW_CONFIG_PATH = configPath;
clearConfigCache();
try {
const thrown = await fetchBrowserJson("/tabs?profile=hung", { timeoutMs: 200 }).catch(
(err: unknown) => err,
);
expect(thrown).toBeInstanceOf(Error);
const message = thrown instanceof Error ? thrown.message : String(thrown);
expect(message).toContain("browser profile is external to OpenClaw");
expect(message).toContain("Restarting the OpenClaw gateway will not launch it");
expect(message).not.toContain("Restart the OpenClaw gateway");
expect(message).not.toContain("Do NOT retry the browser tool");
} finally {
await new Promise<void>((resolve) => server.close(() => resolve()));
}
});
});

@@ -1,5 +1,6 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import "../../test-support/browser-security-runtime.mock.js";
import type { OpenClawConfig } from "../config/config.js";
import type { BrowserDispatchResponse } from "./routes/dispatcher.js";
vi.mock("openclaw/plugin-sdk/ssrf-runtime", async () => {
@@ -28,7 +29,7 @@ function okDispatchResponse(): BrowserDispatchResponse {
}
const mocks = vi.hoisted(() => ({
loadConfig: vi.fn(() => ({
loadConfig: vi.fn<() => OpenClawConfig>(() => ({
gateway: {
auth: {
token: "loopback-token",
@@ -215,6 +216,202 @@ describe("fetchBrowserJson loopback auth", () => {
});
});
it("avoids restart-gateway guidance for attachOnly dispatcher timeouts", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
attachOnly: true,
defaultProfile: "manual",
profiles: {
manual: {
cdpUrl: "http://127.0.0.1:9222",
attachOnly: true,
color: "#00AA00",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP handshake timeout"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=manual"),
{
contains: [
"Chrome CDP handshake timeout",
"browser profile is external to OpenClaw",
"Restarting the OpenClaw gateway will not launch it",
],
omits: ["Restart the OpenClaw gateway", "Do NOT retry the browser tool"],
},
);
});
it("avoids restart-gateway guidance for existing-session dispatcher timeouts", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
defaultProfile: "user",
profiles: {
user: {
driver: "existing-session",
attachOnly: true,
color: "#00AA00",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new DOMException("operation aborted", "AbortError"));
await expectThrownBrowserFetchError(() => fetchBrowserJson<{ ok: boolean }>("/tabs"), {
contains: [
"operation aborted",
"browser profile is external to OpenClaw",
"Restarting the OpenClaw gateway will not launch it",
],
omits: ["Restart the OpenClaw gateway", "Do NOT retry the browser tool"],
});
});
it("avoids restart-gateway guidance for remote CDP dispatcher timeouts", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
defaultProfile: "remote",
profiles: {
remote: {
cdpUrl: "https://browserless.example/chrome?token=test",
color: "#00AA00",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("timed out"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=remote"),
{
contains: [
"timed out",
"browser profile is external to OpenClaw",
"Restarting the OpenClaw gateway will not launch it",
],
omits: ["Restart the OpenClaw gateway", "Do NOT retry the browser tool"],
},
);
});
it("keeps restart-gateway guidance for managed local dispatcher timeouts", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
defaultProfile: "openclaw",
profiles: {
openclaw: {
cdpPort: 18800,
color: "#FF4500",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP handshake timeout"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=openclaw"),
{
contains: ["Chrome CDP handshake timeout", "Restart the OpenClaw gateway"],
omits: ["browser profile is external to OpenClaw", "Do NOT retry the browser tool"],
},
);
});
it("keeps restart-gateway guidance when dispatcher profile resolution fails", async () => {
mocks.loadConfig.mockImplementation(() => {
throw new Error("config unavailable");
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP handshake timeout"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=manual"),
{
contains: ["Chrome CDP handshake timeout", "Restart the OpenClaw gateway"],
omits: ["browser profile is external to OpenClaw", "Do NOT retry the browser tool"],
},
);
});
it("keeps restart-gateway guidance for unknown dispatcher profiles", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
defaultProfile: "openclaw",
profiles: {
openclaw: {
cdpPort: 18800,
color: "#FF4500",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP handshake timeout"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=missing"),
{
contains: ["Chrome CDP handshake timeout", "Restart the OpenClaw gateway"],
omits: ["browser profile is external to OpenClaw", "Do NOT retry the browser tool"],
},
);
});
it("uses the default external profile when dispatcher request omits profile", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
defaultProfile: "manual",
profiles: {
manual: {
cdpUrl: "http://127.0.0.1:9222",
attachOnly: true,
color: "#00AA00",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP handshake timeout"));
await expectThrownBrowserFetchError(() => fetchBrowserJson<{ ok: boolean }>("/tabs"), {
contains: [
"Chrome CDP handshake timeout",
"browser profile is external to OpenClaw",
"Restarting the OpenClaw gateway will not launch it",
],
omits: ["Restart the OpenClaw gateway", "Do NOT retry the browser tool"],
});
});
it("keeps no-retry hint but not restart guidance for persistent external profile failures", async () => {
mocks.loadConfig.mockReturnValue({
browser: {
attachOnly: true,
defaultProfile: "manual",
profiles: {
manual: {
cdpUrl: "http://127.0.0.1:9222",
attachOnly: true,
color: "#00AA00",
},
},
},
});
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP connection refused"));
await expectThrownBrowserFetchError(
() => fetchBrowserJson<{ ok: boolean }>("/tabs?profile=manual"),
{
contains: [
"Chrome CDP connection refused",
"browser profile is external to OpenClaw",
"Do NOT retry the browser tool",
],
omits: ["Restart the OpenClaw gateway"],
},
);
});
it("keeps no-retry hint for persistent dispatcher failures", async () => {
mocks.dispatch.mockRejectedValueOnce(new Error("Chrome CDP connection refused"));

@@ -5,6 +5,7 @@ import { formatCliCommand } from "../cli/command-format.js";
import { loadConfig } from "../config/config.js";
import { isLoopbackHost } from "../gateway/net.js";
import { getBridgeAuthForPort } from "./bridge-auth-registry.js";
import { resolveBrowserConfig, resolveProfile } from "./config.js";
import { resolveBrowserControlAuth } from "./control-auth.js";
import { resolveBrowserRateLimitMessage } from "./rate-limit-message.js";
@@ -105,7 +106,39 @@ function isRateLimitStatus(status: number): boolean {
return status === 429;
}
function resolveBrowserFetchOperatorHint(url: string): string {
type BrowserControlOwnership = "local-managed" | "external-browser" | "unknown";
function resolveDispatcherBrowserControlOwnership(url: string): BrowserControlOwnership {
if (isAbsoluteHttp(url)) {
return "unknown";
}
try {
const cfg = loadConfig();
const resolved = resolveBrowserConfig(cfg?.browser, cfg);
const parsed = new URL(url, "http://localhost");
const requestedProfile = parsed.searchParams.get("profile")?.trim();
const profile = resolveProfile(resolved, requestedProfile || resolved.defaultProfile);
if (!profile) {
return "unknown";
}
return profile.driver === "openclaw" && profile.cdpIsLoopback && !profile.attachOnly
? "local-managed"
: "external-browser";
} catch {
return "unknown";
}
}
function resolveBrowserFetchOperatorHint(
url: string,
opts?: { ownership?: BrowserControlOwnership },
): string {
if (opts?.ownership === "external-browser") {
return (
"The browser profile is external to OpenClaw; make sure its browser/CDP endpoint " +
"is running and reachable. Restarting the OpenClaw gateway will not launch it."
);
}
const isLocal = !isAbsoluteHttp(url);
return isLocal
? `Restart the OpenClaw gateway (OpenClaw.app menubar, or \`${formatCliCommand("openclaw gateway")}\`).`
@@ -159,10 +192,10 @@ async function discardResponseBody(res: Response): Promise<void> {
function enhanceDispatcherPathError(url: string, err: unknown): Error {
const msg = normalizeErrorMessage(err);
const kind = classifyBrowserFetchFailure(err);
const ownership = resolveDispatcherBrowserControlOwnership(url);
const operatorHint = resolveBrowserFetchOperatorHint(url, { ownership });
const suffix =
kind === "persistent"
? `${resolveBrowserFetchOperatorHint(url)} ${BROWSER_TOOL_MODEL_HINT}`
: resolveBrowserFetchOperatorHint(url);
kind === "persistent" ? `${operatorHint} ${BROWSER_TOOL_MODEL_HINT}` : operatorHint;
const normalized = msg.endsWith(".") ? msg : `${msg}.`;
return new Error(`${normalized} ${suffix}`, err instanceof Error ? { cause: err } : undefined);
}
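
Note on the hunk above: the operator hint now depends on who owns the browser. Only a profile with the `openclaw` driver, a loopback CDP endpoint, and no `attachOnly` flag counts as locally managed. Any other resolved profile is treated as external, and a missing profile or config failure yields `unknown`. The sketch below isolates that decision. `ProfileLike`, `classifyOwnership`, and `operatorHint` are illustrative names, and the hint strings are shortened placeholders rather than the messages in the diff.

```typescript
// Simplified stand-in for the resolved profile. Field names mirror the diff,
// but this type is an assumption made for illustration.
type ProfileLike = {
  driver: string;
  cdpIsLoopback: boolean;
  attachOnly?: boolean;
};

type BrowserControlOwnership = "local-managed" | "external-browser" | "unknown";

function classifyOwnership(profile: ProfileLike | undefined): BrowserControlOwnership {
  if (!profile) {
    // Unknown profiles keep the pre-existing restart guidance.
    return "unknown";
  }
  return profile.driver === "openclaw" && profile.cdpIsLoopback && !profile.attachOnly
    ? "local-managed"
    : "external-browser";
}

// Placeholder hint text; the real messages are longer.
function operatorHint(ownership: BrowserControlOwnership): string {
  return ownership === "external-browser"
    ? "check that the external browser/CDP endpoint is running"
    : "restart the gateway";
}
```

Treating `unknown` like the managed case keeps the old behavior whenever the profile cannot be resolved, which is what the "keeps restart-gateway guidance" tests assert.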

@@ -20,6 +20,7 @@ import {
assertCdpEndpointAllowed,
fetchJson,
getHeadersWithAuth,
isWebSocketUrl,
normalizeCdpHttpBaseForJsonEndpoints,
withCdpSocket,
} from "./cdp.helpers.js";
@@ -500,11 +501,22 @@ async function connectBrowser(cdpUrl: string, ssrfPolicy?: SsrFPolicy): Promise<
() => null,
);
const endpoint = wsUrl ?? normalized;
const headers = getHeadersWithAuth(endpoint);
// Bypass proxy for loopback CDP connections (#31219)
const browser = await withNoProxyForCdpUrl(endpoint, () =>
chromium.connectOverCDP(endpoint, { timeout, headers }),
);
const connectEndpoint = async (target: string) => {
const headers = getHeadersWithAuth(target);
// Bypass proxy for loopback CDP connections (#31219)
return await withNoProxyForCdpUrl(target, () =>
chromium.connectOverCDP(target, { timeout, headers }),
);
};
let browser: Browser;
try {
browser = await connectEndpoint(endpoint);
} catch (err) {
if (!isWebSocketUrl(normalized) || endpoint === normalized) {
throw err;
}
browser = await connectEndpoint(normalized);
}
const onDisconnected = () => {
const current = cachedByCdpUrl.get(normalized);
if (current?.browser === browser) {

@@ -66,6 +66,21 @@ function ensureOptionsKey(options?: BrowserEnsureOptions): string {
return typeof options?.headless === "boolean" ? `headless:${options.headless}` : "default";
}
function formatLocalPortOwnershipHint(profile: ResolvedBrowserProfile): string {
const resetHint =
`If OpenClaw should own this local profile, run action=reset-profile profile=${profile.name} ` +
"to stop the conflicting process.";
if (!profile.cdpIsLoopback) {
return resetHint;
}
return (
`${resetHint} If this port is an externally managed CDP service such as Browserless, ` +
`set browser.profiles.${profile.name}.attachOnly=true so OpenClaw attaches without trying ` +
"to manage the local process. For Browserless Docker, set EXTERNAL to the same WebSocket " +
"endpoint OpenClaw can reach via browser.profiles.<name>.cdpUrl."
);
}
export function createProfileAvailability({
opts,
profile,
@@ -317,7 +332,7 @@ export function createProfileAvailability({
const detail = await describeCdpFailure(PROFILE_ATTACH_RETRY_TIMEOUT_MS);
throw new BrowserProfileUnavailableError(
`Port ${profile.cdpPort} is in use for profile "${profile.name}" but not by openclaw. ` +
`Run action=reset-profile profile=${profile.name} to kill the process. ${detail}`,
`${formatLocalPortOwnershipHint(profile)} ${detail}`,
);
}
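
Note on the hunk above: the new hint points operators at `browser.profiles.<name>.attachOnly=true` for externally managed CDP services. A sketch of what such a profile entry might look like follows, modeled on the config fixture used by the attachOnly diagnostics test in this compare. The profile name `browserless`, the port `3000`, and the two `*Sketch` types are placeholders chosen for illustration, not values from the diff.

```typescript
// Illustrative shapes only; the real config types live in the OpenClaw codebase.
type BrowserProfileSketch = {
  cdpUrl: string;
  attachOnly?: boolean;
  color: string;
};

type BrowserConfigSketch = {
  enabled: boolean;
  defaultProfile: string;
  profiles: Record<string, BrowserProfileSketch>;
};

const browser: BrowserConfigSketch = {
  enabled: true,
  defaultProfile: "browserless", // placeholder profile name
  profiles: {
    browserless: {
      // Placeholder endpoint: use the WebSocket URL OpenClaw can actually reach.
      cdpUrl: "ws://127.0.0.1:3000",
      // Attach to the running service instead of managing a local process.
      attachOnly: true,
      color: "#00AA00",
    },
  },
};
```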

@@ -230,6 +230,29 @@ describe("browser server-context ensureBrowserAvailable", () => {
expect(stopOpenClawChrome).not.toHaveBeenCalled();
});
it("explains attachOnly for externally managed loopback CDP services", async () => {
const { launchOpenClawChrome, stopOpenClawChrome, isChromeCdpReady, profile } =
setupEnsureBrowserAvailableHarness();
const isChromeReachable = vi.mocked(chromeModule.isChromeReachable);
isChromeReachable.mockResolvedValue(true);
isChromeCdpReady.mockResolvedValue(false);
const promise = profile.ensureBrowserAvailable();
await expect(promise).rejects.toThrow(
'Port 18800 is in use for profile "openclaw" but not by openclaw.',
);
await expect(promise).rejects.toThrow(
"set browser.profiles.openclaw.attachOnly=true so OpenClaw attaches without trying to manage the local process",
);
await expect(promise).rejects.toThrow(
"For Browserless Docker, set EXTERNAL to the same WebSocket endpoint OpenClaw can reach via browser.profiles.<name>.cdpUrl.",
);
expect(launchOpenClawChrome).not.toHaveBeenCalled();
expect(stopOpenClawChrome).not.toHaveBeenCalled();
});
it("retries remote CDP websocket reachability once before failing", async () => {
const { launchOpenClawChrome, stopOpenClawChrome, isChromeCdpReady } =
setupEnsureBrowserAvailableHarness();

@@ -1,4 +1,5 @@
import { describe, expect, it, vi } from "vitest";
import { withBrowserFetchPreconnect } from "../../test-fetch.js";
import {
installRemoteProfileTestLifecycle,
loadRemoteProfileTestDeps,
@@ -127,4 +128,149 @@ describe("browser remote profile fallback and attachOnly behavior", () => {
expect(opened.targetId).toBe("T1");
expect(fetchMock).not.toHaveBeenCalled();
});
it("passes configured remote CDP timeouts when opening tabs through raw CDP", async () => {
vi.spyOn(deps.pwAiModule, "getPwAiModule").mockResolvedValue(null);
const createTargetViaCdp = vi
.spyOn(deps.cdpModule, "createTargetViaCdp")
.mockResolvedValue({ targetId: "T_REMOTE" });
const { state, remote } = deps.createRemoteRouteHarness(
vi.fn(
deps.createJsonListFetchMock([
{
id: "T_REMOTE",
title: "Remote Tab",
url: "https://example.com",
webSocketDebuggerUrl: "wss://browserless.example/devtools/page/T_REMOTE",
type: "page",
},
]),
),
);
state.resolved.remoteCdpTimeoutMs = 4321;
state.resolved.remoteCdpHandshakeTimeoutMs = 8765;
const opened = await remote.openTab("https://example.com");
expect(opened.targetId).toBe("T_REMOTE");
expect(createTargetViaCdp).toHaveBeenCalledWith(
expect.objectContaining({
cdpUrl: "https://1.1.1.1:9222/chrome?token=abc",
url: "https://example.com",
timeouts: {
httpTimeoutMs: 4321,
handshakeTimeoutMs: 8765,
},
}),
);
});
it("uses remote-class tab-open timeouts for attachOnly loopback CDP profiles", async () => {
vi.spyOn(deps.pwAiModule, "getPwAiModule").mockResolvedValue(null);
const createTargetViaCdp = vi
.spyOn(deps.cdpModule, "createTargetViaCdp")
.mockResolvedValue({ targetId: "T_ATTACH" });
const state = deps.makeState("openclaw");
state.resolved.remoteCdpTimeoutMs = 2345;
state.resolved.remoteCdpHandshakeTimeoutMs = 6789;
state.resolved.profiles.openclaw = {
cdpPort: 18800,
attachOnly: true,
color: "#FF4500",
};
const fetchMock = vi.fn(
deps.createJsonListFetchMock([
{
id: "T_ATTACH",
title: "Attach Tab",
url: "https://example.com",
webSocketDebuggerUrl: "ws://127.0.0.1:18800/devtools/page/T_ATTACH",
type: "page",
},
]),
);
global.fetch = withBrowserFetchPreconnect(fetchMock);
const ctx = deps.createBrowserRouteContext({ getState: () => state });
const opened = await ctx.forProfile("openclaw").openTab("https://example.com");
expect(opened.targetId).toBe("T_ATTACH");
expect(createTargetViaCdp).toHaveBeenCalledWith(
expect.objectContaining({
cdpUrl: "http://127.0.0.1:18800",
timeouts: {
httpTimeoutMs: 2345,
handshakeTimeoutMs: 6789,
},
}),
);
});
it("keeps managed loopback tab opens on local CDP defaults", async () => {
vi.spyOn(deps.pwAiModule, "getPwAiModule").mockResolvedValue(null);
const createTargetViaCdp = vi
.spyOn(deps.cdpModule, "createTargetViaCdp")
.mockResolvedValue({ targetId: "T_LOCAL" });
const state = deps.makeState("openclaw");
const fetchMock = vi.fn(
deps.createJsonListFetchMock([
{
id: "T_LOCAL",
title: "Local Tab",
url: "http://127.0.0.1:3000",
webSocketDebuggerUrl: "ws://127.0.0.1:18800/devtools/page/T_LOCAL",
type: "page",
},
]),
);
global.fetch = withBrowserFetchPreconnect(fetchMock);
const ctx = deps.createBrowserRouteContext({ getState: () => state });
await ctx.forProfile("openclaw").openTab("http://127.0.0.1:3000");
expect(createTargetViaCdp).toHaveBeenCalledWith({
cdpUrl: "http://127.0.0.1:18800",
url: "http://127.0.0.1:3000",
ssrfPolicy: undefined,
});
});
it("uses the remote HTTP timeout for /json/new fallback tab opens", async () => {
vi.spyOn(deps.pwAiModule, "getPwAiModule").mockResolvedValue(null);
vi.spyOn(deps.cdpModule, "createTargetViaCdp").mockRejectedValue(
new Error("Target.createTarget unavailable"),
);
const fetchMock = vi.fn(async (...args: unknown[]) => {
const url = String(args[0]);
if (url.includes("/json/new")) {
const init = args[1] as RequestInit | undefined;
expect(init?.method).toBe("PUT");
expect(init?.signal).toBeInstanceOf(AbortSignal);
return await new Promise<Response>((_resolve, reject) => {
init?.signal?.addEventListener(
"abort",
() => reject(new Error("aborted after remote timeout")),
{ once: true },
);
});
}
throw new Error(`unexpected fetch: ${url}`);
});
const { state, remote } = deps.createRemoteRouteHarness(fetchMock);
state.resolved.remoteCdpTimeoutMs = 25;
const startedAt = Date.now();
await expect(remote.openTab("https://example.com")).rejects.toThrow(
/aborted after remote timeout/,
);
expect(Date.now() - startedAt).toBeLessThan(700);
expect(fetchMock).toHaveBeenCalledWith(
expect.stringContaining("/json/new"),
expect.objectContaining({
method: "PUT",
signal: expect.any(AbortSignal),
}),
);
});
});

@@ -1,6 +1,7 @@
import { afterEach, beforeEach, vi } from "vitest";
export type RemoteProfileTestDeps = {
cdpModule: typeof import("./cdp.js");
chromeModule: typeof import("./chrome.js");
InvalidBrowserNavigationUrlError: typeof import("./navigation-guard.js").InvalidBrowserNavigationUrlError;
pwAiModule: typeof import("./pw-ai-module.js");
@@ -18,6 +19,7 @@ let remoteProfileTestDepsPromise: Promise<RemoteProfileTestDeps> | undefined;
export async function loadRemoteProfileTestDeps(): Promise<RemoteProfileTestDeps> {
remoteProfileTestDepsPromise ??= (async () => {
await import("./server-context.chrome-test-harness.js");
const cdpModule = await import("./cdp.js");
const chromeModule = await import("./chrome.js");
const { InvalidBrowserNavigationUrlError } = await import("./navigation-guard.js");
const pwAiModule = await import("./pw-ai-module.js");
@@ -31,6 +33,7 @@ export async function loadRemoteProfileTestDeps(): Promise<RemoteProfileTestDeps
originalFetch,
} = await import("./server-context.remote-tab-ops.harness.js");
return {
cdpModule,
chromeModule,
InvalidBrowserNavigationUrlError,
pwAiModule,

@@ -8,6 +8,7 @@ import {
normalizeCdpHttpBaseForJsonEndpoints,
} from "./cdp.helpers.js";
import { appendCdpPath, createTargetViaCdp, normalizeCdpWsUrl } from "./cdp.js";
import type { CdpActionTimeouts } from "./cdp.js";
import { getChromeMcpModule } from "./chrome-mcp.runtime.js";
import type { ResolvedBrowserProfile } from "./config.js";
import { BrowserTabNotFoundError, BrowserTargetAmbiguousError } from "./errors.js";
@@ -140,6 +141,16 @@ export function createProfileTabOps({
profile,
}),
});
const getRemoteCdpActionTimeouts = (): CdpActionTimeouts | undefined => {
if (profile.cdpIsLoopback && !profile.attachOnly) {
return undefined;
}
const resolved = state().resolved;
return {
httpTimeoutMs: resolved.remoteCdpTimeoutMs,
handshakeTimeoutMs: resolved.remoteCdpHandshakeTimeoutMs,
};
};
const readTabs = async (): Promise<BrowserTab[]> => {
if (capabilities.usesChromeMcp) {
@@ -270,11 +281,16 @@ export function createProfileTabOps({
}
await assertBrowserNavigationAllowed({ url, ...ssrfPolicyOpts });
const createdViaCdp = await createTargetViaCdp({
const cdpActionTimeouts = getRemoteCdpActionTimeouts();
const createTargetOpts: Parameters<typeof createTargetViaCdp>[0] = {
cdpUrl: profile.cdpUrl,
url,
ssrfPolicy: getCdpControlPolicy(),
})
};
if (cdpActionTimeouts) {
createTargetOpts.timeouts = cdpActionTimeouts;
}
const createdViaCdp = await createTargetViaCdp(createTargetOpts)
.then((r) => r.targetId)
.catch(() => null);
@@ -310,7 +326,7 @@ export function createProfileTabOps({
: `${endpointUrl.toString()}?${encoded}`;
const created = await fetchJson<CdpTarget>(
endpoint,
CDP_JSON_NEW_TIMEOUT_MS,
cdpActionTimeouts?.httpTimeoutMs ?? CDP_JSON_NEW_TIMEOUT_MS,
{
method: "PUT",
},
@@ -319,7 +335,7 @@ export function createProfileTabOps({
if (String(err).includes("HTTP 405")) {
return await fetchJson<CdpTarget>(
endpoint,
CDP_JSON_NEW_TIMEOUT_MS,
cdpActionTimeouts?.httpTimeoutMs ?? CDP_JSON_NEW_TIMEOUT_MS,
undefined,
getCdpControlPolicy(),
);
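
Note on the hunks above: tab opens pick their timeouts by profile class. Managed loopback profiles pass no `timeouts` and keep the local defaults. Remote and `attachOnly` profiles pass the configured remote CDP timeouts, and the `/json/new` fallback uses the HTTP one when it is set. The sketch below restates that selection as pure functions. `TimeoutInputs`, `selectCdpActionTimeouts`, and `jsonNewTimeoutMs` are hypothetical names, and `JSON_NEW_DEFAULT_TIMEOUT_MS` is a placeholder because the value of `CDP_JSON_NEW_TIMEOUT_MS` is not shown in the diff.

```typescript
type CdpActionTimeouts = {
  httpTimeoutMs?: number;
  handshakeTimeoutMs?: number;
};

// Hypothetical flattened inputs; the diff reads these from `profile` and `state().resolved`.
type TimeoutInputs = {
  cdpIsLoopback: boolean;
  attachOnly: boolean;
  remoteCdpTimeoutMs: number;
  remoteCdpHandshakeTimeoutMs: number;
};

function selectCdpActionTimeouts(input: TimeoutInputs): CdpActionTimeouts | undefined {
  // Managed local browsers keep the built-in local defaults.
  if (input.cdpIsLoopback && !input.attachOnly) {
    return undefined;
  }
  return {
    httpTimeoutMs: input.remoteCdpTimeoutMs,
    handshakeTimeoutMs: input.remoteCdpHandshakeTimeoutMs,
  };
}

// Placeholder default; the real constant's value is not visible in the diff.
const JSON_NEW_DEFAULT_TIMEOUT_MS = 1500;

function jsonNewTimeoutMs(timeouts: CdpActionTimeouts | undefined): number {
  // Prefer the configured remote HTTP timeout, else fall back to the default.
  return timeouts?.httpTimeoutMs ?? JSON_NEW_DEFAULT_TIMEOUT_MS;
}
```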

@@ -1,76 +1,17 @@
import type { ProviderWrapStreamFnContext } from "openclaw/plugin-sdk/plugin-entry";
import { streamWithPayloadPatch } from "openclaw/plugin-sdk/provider-stream-shared";
type DeepSeekThinkingLevel = ProviderWrapStreamFnContext["thinkingLevel"];
import { createDeepSeekV4OpenAICompatibleThinkingWrapper } from "openclaw/plugin-sdk/provider-stream-shared";
function isDeepSeekV4ModelId(modelId: unknown): boolean {
return modelId === "deepseek-v4-flash" || modelId === "deepseek-v4-pro";
}
function isDisabledThinkingLevel(thinkingLevel: DeepSeekThinkingLevel): boolean {
const normalized = typeof thinkingLevel === "string" ? thinkingLevel.toLowerCase() : "";
return normalized === "off" || normalized === "none";
}
function resolveDeepSeekReasoningEffort(thinkingLevel: DeepSeekThinkingLevel): "high" | "max" {
return thinkingLevel === "xhigh" || thinkingLevel === "max" ? "max" : "high";
}
function stripDeepSeekReasoningContent(payload: Record<string, unknown>): void {
if (!Array.isArray(payload.messages)) {
return;
}
for (const message of payload.messages) {
if (!message || typeof message !== "object") {
continue;
}
delete (message as Record<string, unknown>).reasoning_content;
}
}
function ensureDeepSeekToolCallReasoningContent(payload: Record<string, unknown>): void {
if (!Array.isArray(payload.messages)) {
return;
}
for (const message of payload.messages) {
if (!message || typeof message !== "object") {
continue;
}
const record = message as Record<string, unknown>;
if (record.role !== "assistant" || !Array.isArray(record.tool_calls)) {
continue;
}
if (!("reasoning_content" in record)) {
record.reasoning_content = "";
}
}
}
export function createDeepSeekV4ThinkingWrapper(
baseStreamFn: ProviderWrapStreamFnContext["streamFn"],
thinkingLevel: DeepSeekThinkingLevel,
thinkingLevel: ProviderWrapStreamFnContext["thinkingLevel"],
): ProviderWrapStreamFnContext["streamFn"] {
if (!baseStreamFn) {
return undefined;
}
const underlying = baseStreamFn;
return (model, context, options) => {
if (model.provider !== "deepseek" || !isDeepSeekV4ModelId(model.id)) {
return underlying(model, context, options);
}
return streamWithPayloadPatch(underlying, model, context, options, (payload) => {
if (isDisabledThinkingLevel(thinkingLevel)) {
payload.thinking = { type: "disabled" };
delete payload.reasoning_effort;
delete payload.reasoning;
stripDeepSeekReasoningContent(payload);
return;
}
payload.thinking = { type: "enabled" };
payload.reasoning_effort = resolveDeepSeekReasoningEffort(thinkingLevel);
ensureDeepSeekToolCallReasoningContent(payload);
});
};
return createDeepSeekV4OpenAICompatibleThinkingWrapper({
baseStreamFn,
thinkingLevel,
shouldPatchModel: (model) => model.provider === "deepseek" && isDeepSeekV4ModelId(model.id),
});
}
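
Note on the hunk above: the inline payload patching was removed in favor of the shared `createDeepSeekV4OpenAICompatibleThinkingWrapper`. For reference, the sketch below restates the top-level mapping that the removed lines performed, as a standalone function. It is based only on the removed code. Whether the shared helper behaves identically is not visible in this diff, and the per-message `reasoning_content` handling is left out. `patchThinkingPayload` is a hypothetical name.

```typescript
type Payload = Record<string, unknown>;

// Restates the removed inline logic; not the shared helper's implementation.
function patchThinkingPayload(payload: Payload, thinkingLevel: string | undefined): Payload {
  const normalized = typeof thinkingLevel === "string" ? thinkingLevel.toLowerCase() : "";
  if (normalized === "off" || normalized === "none") {
    // Disabled thinking: drop every reasoning knob from the request.
    payload.thinking = { type: "disabled" };
    delete payload.reasoning_effort;
    delete payload.reasoning;
    return payload;
  }
  // Enabled thinking: only "xhigh" and "max" map to "max"; everything else is "high".
  payload.thinking = { type: "enabled" };
  payload.reasoning_effort = thinkingLevel === "xhigh" || thinkingLevel === "max" ? "max" : "high";
  return payload;
}
```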

@@ -5,12 +5,14 @@ const telemetryState = vi.hoisted(() => {
const histograms = new Map<string, { record: ReturnType<typeof vi.fn> }>();
const spans: Array<{
name: string;
addEvent: ReturnType<typeof vi.fn>;
end: ReturnType<typeof vi.fn>;
setStatus: ReturnType<typeof vi.fn>;
}> = [];
const tracer = {
startSpan: vi.fn((name: string, _opts?: unknown, _ctx?: unknown) => {
const span = {
addEvent: vi.fn(),
end: vi.fn(),
setStatus: vi.fn(),
};
@@ -111,6 +113,10 @@ vi.mock("@opentelemetry/semantic-conventions", () => ({
ATTR_SERVICE_NAME: "service.name",
}));
import {
emitTrustedDiagnosticEvent,
onInternalDiagnosticEvent,
} from "../../../src/infra/diagnostic-events.js";
import type { OpenClawPluginServiceContext } from "../api.js";
import { emitDiagnosticEvent } from "../api.js";
import { createDiagnosticsOtelService } from "./service.js";
@@ -122,10 +128,12 @@ const TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736";
const SPAN_ID = "00f067aa0ba902b7";
const CHILD_SPAN_ID = "1111111111111111";
const GRANDCHILD_SPAN_ID = "2222222222222222";
const TOOL_SPAN_ID = "3333333333333333";
const PROTO_KEY = "__proto__";
const MAX_TEST_OTEL_CONTENT_ATTRIBUTE_CHARS = 4096;
const OTEL_TRUNCATED_SUFFIX_MAX_CHARS = 20;
const ORIGINAL_OPENCLAW_OTEL_PRELOADED = process.env.OPENCLAW_OTEL_PRELOADED;
const ORIGINAL_OTEL_SEMCONV_STABILITY_OPT_IN = process.env.OTEL_SEMCONV_STABILITY_OPT_IN;
function createLogger() {
return {
@@ -165,6 +173,7 @@ function createOtelContext(
},
logger: createLogger(),
stateDir: OTEL_TEST_STATE_DIR,
internalDiagnostics: { onEvent: onInternalDiagnosticEvent },
};
}
@@ -174,11 +183,13 @@ function createTraceOnlyContext(endpoint: string): OpenClawPluginServiceContext
async function emitAndCaptureLog(
event: Omit<Extract<Parameters<typeof emitDiagnosticEvent>[0], { type: "log.record" }>, "type">,
options: { trusted?: boolean } = {},
) {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { logs: true });
await service.start(ctx);
emitDiagnosticEvent({
const emit = options.trusted ? emitTrustedDiagnosticEvent : emitDiagnosticEvent;
emit({
type: "log.record",
...event,
});
@@ -196,6 +207,7 @@ function flushDiagnosticEvents() {
describe("diagnostics-otel service", () => {
beforeEach(() => {
delete process.env.OPENCLAW_OTEL_PRELOADED;
delete process.env.OTEL_SEMCONV_STABILITY_OPT_IN;
telemetryState.counters.clear();
telemetryState.histograms.clear();
telemetryState.spans.length = 0;
@@ -216,6 +228,11 @@ describe("diagnostics-otel service", () => {
} else {
process.env.OPENCLAW_OTEL_PRELOADED = ORIGINAL_OPENCLAW_OTEL_PRELOADED;
}
if (ORIGINAL_OTEL_SEMCONV_STABILITY_OPT_IN === undefined) {
delete process.env.OTEL_SEMCONV_STABILITY_OPT_IN;
} else {
process.env.OTEL_SEMCONV_STABILITY_OPT_IN = ORIGINAL_OTEL_SEMCONV_STABILITY_OPT_IN;
}
});
test("records message-flow metrics and spans", async () => {
@@ -499,7 +516,7 @@ describe("diagnostics-otel service", () => {
}
});
test("attaches diagnostic trace context to exported logs", async () => {
test("does not attach untrusted diagnostic trace context to exported logs", async () => {
const emitCall = await emitAndCaptureLog({
level: "INFO",
message: "traceable log",
@@ -513,15 +530,31 @@ describe("diagnostics-otel service", () => {
},
});
expect(emitCall?.attributes).toMatchObject({
"openclaw.traceFlags": "01",
});
expect(emitCall?.attributes).toEqual(
expect.not.objectContaining({
"openclaw.traceId": expect.anything(),
"openclaw.spanId": expect.anything(),
"openclaw.traceFlags": expect.anything(),
}),
);
expect(telemetryState.tracer.setSpanContext).not.toHaveBeenCalled();
expect(emitCall?.context).toBeUndefined();
});
test("attaches trusted diagnostic trace context to exported logs", async () => {
const emitCall = await emitAndCaptureLog(
{
level: "INFO",
message: "traceable log",
trace: {
traceId: TRACE_ID,
spanId: SPAN_ID,
traceFlags: "01",
},
},
{ trusted: true },
);
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledWith(
expect.anything(),
expect.objectContaining({
@@ -658,6 +691,187 @@ describe("diagnostics-otel service", () => {
await service.stop?.(ctx);
});
test("exports GenAI client token usage histogram for input and output only", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.usage",
sessionKey: "session-key",
channel: "webchat",
provider: "openai",
model: "gpt-5.4",
usage: {
input: 12,
output: 7,
cacheRead: 3,
cacheWrite: 2,
promptTokens: 17,
total: 24,
},
});
await flushDiagnosticEvents();
expect(telemetryState.meter.createHistogram).toHaveBeenCalledWith(
"gen_ai.client.token.usage",
expect.objectContaining({
unit: "{token}",
advice: {
explicitBucketBoundaries: expect.arrayContaining([1, 4, 16, 1024, 67108864]),
},
}),
);
const genAiTokenUsage = telemetryState.histograms.get("gen_ai.client.token.usage");
expect(genAiTokenUsage?.record).toHaveBeenCalledTimes(2);
expect(genAiTokenUsage?.record).toHaveBeenCalledWith(12, {
"gen_ai.operation.name": "chat",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.token.type": "input",
});
expect(genAiTokenUsage?.record).toHaveBeenCalledWith(7, {
"gen_ai.operation.name": "chat",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.token.type": "output",
});
expect(JSON.stringify(genAiTokenUsage?.record.mock.calls)).not.toContain("session-key");
await service.stop?.(ctx);
});
test("keeps GenAI token usage metric model attribute present when model is unavailable", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.usage",
provider: "openai",
usage: { input: 2 },
});
await flushDiagnosticEvents();
expect(telemetryState.histograms.get("gen_ai.client.token.usage")?.record).toHaveBeenCalledWith(
2,
{
"gen_ai.operation.name": "chat",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "unknown",
"gen_ai.token.type": "input",
},
);
await service.stop?.(ctx);
});
test("exports GenAI usage attributes on model usage spans without diagnostic identifiers", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.usage",
sessionKey: "session-key",
sessionId: "session-id",
provider: "anthropic",
model: "claude-sonnet-4.6",
usage: {
input: 100,
output: 40,
cacheRead: 30,
cacheWrite: 20,
promptTokens: 150,
total: 190,
},
durationMs: 25,
});
await flushDiagnosticEvents();
const modelUsageCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.model.usage",
);
expect(modelUsageCall?.[1]).toMatchObject({
attributes: {
"gen_ai.operation.name": "chat",
"gen_ai.system": "anthropic",
"gen_ai.request.model": "claude-sonnet-4.6",
"gen_ai.usage.input_tokens": 150,
"gen_ai.usage.output_tokens": 40,
"gen_ai.usage.cache_read.input_tokens": 30,
"gen_ai.usage.cache_creation.input_tokens": 20,
},
});
expect(modelUsageCall?.[1]).toEqual({
attributes: expect.not.objectContaining({
"openclaw.sessionKey": expect.anything(),
"openclaw.sessionId": expect.anything(),
"gen_ai.provider.name": expect.anything(),
"gen_ai.input.messages": expect.anything(),
"gen_ai.output.messages": expect.anything(),
}),
startTime: expect.any(Number),
});
expect(JSON.stringify(modelUsageCall)).not.toContain("session-key");
await service.stop?.(ctx);
});
test("exports GenAI client operation duration histogram without diagnostic identifiers", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.call.completed",
runId: "run-1",
callId: "call-1",
sessionKey: "session-key",
provider: "openai",
model: "gpt-5.4",
api: "openai-completions",
durationMs: 250,
});
emitDiagnosticEvent({
type: "model.call.error",
runId: "run-1",
callId: "call-2",
sessionKey: "session-key",
provider: "google",
model: "gemini-2.5-flash",
api: "google-generative-ai",
durationMs: 1250,
errorCategory: "TimeoutError",
});
await flushDiagnosticEvents();
expect(telemetryState.meter.createHistogram).toHaveBeenCalledWith(
"gen_ai.client.operation.duration",
expect.objectContaining({
unit: "s",
advice: {
explicitBucketBoundaries: expect.arrayContaining([0.01, 0.32, 2.56, 81.92]),
},
}),
);
const genAiOperationDuration = telemetryState.histograms.get(
"gen_ai.client.operation.duration",
);
expect(genAiOperationDuration?.record).toHaveBeenCalledTimes(2);
expect(genAiOperationDuration?.record).toHaveBeenCalledWith(0.25, {
"gen_ai.operation.name": "text_completion",
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-5.4",
});
expect(genAiOperationDuration?.record).toHaveBeenCalledWith(1.25, {
"gen_ai.operation.name": "generate_content",
"gen_ai.provider.name": "google",
"gen_ai.request.model": "gemini-2.5-flash",
"error.type": "TimeoutError",
});
expect(JSON.stringify(genAiOperationDuration?.record.mock.calls)).not.toContain("session-key");
expect(JSON.stringify(genAiOperationDuration?.record.mock.calls)).not.toContain("run-1");
await service.stop?.(ctx);
});
test("exports run, model call, and tool execution lifecycle spans", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
@@ -752,6 +966,7 @@ describe("diagnostics-otel service", () => {
});
expect(modelCall?.[1]).toEqual({
attributes: expect.not.objectContaining({
"gen_ai.provider.name": expect.anything(),
"openclaw.callId": expect.anything(),
"openclaw.runId": expect.anything(),
"openclaw.sessionKey": expect.anything(),
@@ -817,6 +1032,423 @@ describe("diagnostics-otel service", () => {
await service.stop?.(ctx);
});
test("maps model call APIs to GenAI operation names and error type", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.call.completed",
runId: "run-1",
callId: "call-1",
provider: "openai",
model: "gpt-5.4",
api: "openai-completions",
durationMs: 80,
});
emitDiagnosticEvent({
type: "model.call.completed",
runId: "run-1",
callId: "call-2",
provider: "google",
model: "gemini-2.5-flash",
api: "google-generative-ai",
durationMs: 90,
});
emitDiagnosticEvent({
type: "model.call.error",
runId: "run-1",
callId: "call-3",
provider: "openai",
model: "gpt-5.4",
api: "openai-responses",
durationMs: 40,
errorCategory: "TimeoutError",
});
await flushDiagnosticEvents();
const modelCallAttrs = telemetryState.tracer.startSpan.mock.calls
.filter((call) => call[0] === "openclaw.model.call")
.map((call) => (call[1] as { attributes?: Record<string, unknown> }).attributes);
expect(modelCallAttrs).toEqual([
expect.objectContaining({
"gen_ai.system": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.operation.name": "text_completion",
}),
expect.objectContaining({
"gen_ai.system": "google",
"gen_ai.request.model": "gemini-2.5-flash",
"gen_ai.operation.name": "generate_content",
}),
expect.objectContaining({
"gen_ai.system": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.operation.name": "chat",
"error.type": "TimeoutError",
}),
]);
await service.stop?.(ctx);
});
test("uses latest GenAI provider attribute only when semconv opt-in is set", async () => {
process.env.OTEL_SEMCONV_STABILITY_OPT_IN = "http,gen_ai_latest_experimental";
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.call.completed",
runId: "run-1",
callId: "call-1",
provider: "openai",
model: "gpt-5.4",
api: "openai-completions",
durationMs: 80,
});
emitDiagnosticEvent({
type: "model.usage",
provider: "openai",
model: "gpt-5.4",
usage: { input: 3, output: 2 },
durationMs: 10,
});
await flushDiagnosticEvents();
const modelCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.model.call",
);
expect(modelCall?.[1]).toMatchObject({
attributes: {
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.operation.name": "text_completion",
},
});
expect(modelCall?.[1]).toEqual({
attributes: expect.not.objectContaining({
"gen_ai.system": expect.anything(),
}),
startTime: expect.any(Number),
});
const modelUsage = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.model.usage",
);
expect(modelUsage?.[1]).toMatchObject({
attributes: {
"gen_ai.provider.name": "openai",
"gen_ai.request.model": "gpt-5.4",
"gen_ai.operation.name": "chat",
},
});
expect(modelUsage?.[1]).toEqual({
attributes: expect.not.objectContaining({
"gen_ai.system": expect.anything(),
}),
startTime: expect.any(Number),
});
await service.stop?.(ctx);
});
test("records upstream request id hashes as model call span events only", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "model.call.error",
runId: "run-1",
callId: "call-1",
provider: "openai",
model: "gpt-5.4",
api: "openai-responses",
durationMs: 40,
errorCategory: "ProviderError",
upstreamRequestIdHash: "sha256:123456abcdef",
});
await flushDiagnosticEvents();
const modelCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.model.call",
);
expect(modelCall?.[1]).toEqual({
attributes: expect.not.objectContaining({
"openclaw.upstreamRequestIdHash": expect.anything(),
}),
startTime: expect.any(Number),
});
const span = telemetryState.spans.find((candidate) => candidate.name === "openclaw.model.call");
expect(span?.addEvent).toHaveBeenCalledWith("openclaw.provider.request", {
"openclaw.upstreamRequestIdHash": "sha256:123456abcdef",
});
expect(
telemetryState.histograms.get("openclaw.model_call.duration_ms")?.record,
).toHaveBeenCalledWith(
40,
expect.not.objectContaining({
"openclaw.upstreamRequestIdHash": expect.anything(),
}),
);
await service.stop?.(ctx);
});
test("exports trusted context assembly spans without prompt content", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitTrustedDiagnosticEvent({
type: "context.assembled",
runId: "run-1",
sessionKey: "session-key",
sessionId: "session-id",
provider: "openai",
model: "gpt-5.4",
channel: "webchat",
trigger: "message",
messageCount: 12,
historyTextChars: 1234,
historyImageBlocks: 2,
maxMessageTextChars: 456,
systemPromptChars: 789,
promptChars: 42,
promptImages: 1,
contextTokenBudget: 128_000,
reserveTokens: 4096,
trace: {
traceId: TRACE_ID,
spanId: GRANDCHILD_SPAN_ID,
parentSpanId: SPAN_ID,
traceFlags: "01",
},
});
await flushDiagnosticEvents();
const contextCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.context.assembled",
);
expect(contextCall?.[1]).toMatchObject({
attributes: {
"openclaw.provider": "openai",
"openclaw.model": "gpt-5.4",
"openclaw.channel": "webchat",
"openclaw.trigger": "message",
"openclaw.context.message_count": 12,
"openclaw.context.history_text_chars": 1234,
"openclaw.context.history_image_blocks": 2,
"openclaw.context.max_message_text_chars": 456,
"openclaw.context.system_prompt_chars": 789,
"openclaw.context.prompt_chars": 42,
"openclaw.context.prompt_images": 1,
"openclaw.context.token_budget": 128_000,
"openclaw.context.reserve_tokens": 4096,
},
});
expect(JSON.stringify(contextCall)).not.toContain("session-key");
expect(JSON.stringify(contextCall)).not.toContain("prompt text");
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledWith(
expect.anything(),
expect.objectContaining({ traceId: TRACE_ID, spanId: SPAN_ID }),
);
await service.stop?.(ctx);
});
test("exports tool loop diagnostics without loop messages or session identifiers", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "tool.loop",
sessionKey: "session-key",
sessionId: "session-id",
toolName: "process",
level: "critical",
action: "block",
detector: "known_poll_no_progress",
count: 20,
message: "CRITICAL: repeated secret-bearing tool output",
pairedToolName: "read",
});
await flushDiagnosticEvents();
expect(telemetryState.counters.get("openclaw.tool.loop")?.add).toHaveBeenCalledWith(1, {
"openclaw.toolName": "process",
"openclaw.loop.level": "critical",
"openclaw.loop.action": "block",
"openclaw.loop.detector": "known_poll_no_progress",
"openclaw.loop.count": 20,
"openclaw.loop.paired_tool": "read",
});
const loopSpanCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.tool.loop",
);
expect(loopSpanCall?.[1]).toMatchObject({
attributes: {
"openclaw.toolName": "process",
"openclaw.loop.level": "critical",
"openclaw.loop.action": "block",
"openclaw.loop.detector": "known_poll_no_progress",
"openclaw.loop.count": 20,
"openclaw.loop.paired_tool": "read",
},
});
const loopSpan = telemetryState.spans.find((span) => span.name === "openclaw.tool.loop");
expect(loopSpan?.setStatus).toHaveBeenCalledWith({
code: 2,
message: "known_poll_no_progress:block",
});
expect(JSON.stringify(loopSpanCall)).not.toContain("session-key");
expect(JSON.stringify(loopSpanCall)).not.toContain("secret-bearing");
await service.stop?.(ctx);
});
test("exports diagnostic memory samples and pressure without session identifiers", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitDiagnosticEvent({
type: "diagnostic.memory.sample",
uptimeMs: 1234,
memory: {
rssBytes: 100,
heapUsedBytes: 40,
heapTotalBytes: 80,
externalBytes: 10,
arrayBuffersBytes: 5,
},
});
emitDiagnosticEvent({
type: "diagnostic.memory.pressure",
level: "critical",
reason: "rss_growth",
thresholdBytes: 512,
rssGrowthBytes: 256,
windowMs: 60_000,
memory: {
rssBytes: 200,
heapUsedBytes: 50,
heapTotalBytes: 90,
externalBytes: 20,
arrayBuffersBytes: 6,
},
});
await flushDiagnosticEvents();
expect(telemetryState.histograms.get("openclaw.memory.rss_bytes")?.record).toHaveBeenCalledWith(
100,
{},
);
expect(telemetryState.histograms.get("openclaw.memory.rss_bytes")?.record).toHaveBeenCalledWith(
200,
{
"openclaw.memory.level": "critical",
"openclaw.memory.reason": "rss_growth",
},
);
expect(telemetryState.counters.get("openclaw.memory.pressure")?.add).toHaveBeenCalledWith(1, {
"openclaw.memory.level": "critical",
"openclaw.memory.reason": "rss_growth",
});
const pressureCall = telemetryState.tracer.startSpan.mock.calls.find(
(call) => call[0] === "openclaw.memory.pressure",
);
expect(pressureCall?.[1]).toMatchObject({
attributes: {
"openclaw.memory.level": "critical",
"openclaw.memory.reason": "rss_growth",
"openclaw.memory.rss_bytes": 200,
"openclaw.memory.heap_used_bytes": 50,
"openclaw.memory.heap_total_bytes": 90,
"openclaw.memory.external_bytes": 20,
"openclaw.memory.array_buffers_bytes": 6,
"openclaw.memory.threshold_bytes": 512,
"openclaw.memory.rss_growth_bytes": 256,
"openclaw.memory.window_ms": 60_000,
},
});
const pressureSpan = telemetryState.spans.find(
(span) => span.name === "openclaw.memory.pressure",
);
expect(pressureSpan?.setStatus).toHaveBeenCalledWith({
code: 2,
message: "rss_growth",
});
expect(JSON.stringify(pressureCall)).not.toContain("session");
await service.stop?.(ctx);
});
test("parents trusted diagnostic lifecycle spans from explicit parent ids", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
await service.start(ctx);
emitTrustedDiagnosticEvent({
type: "run.completed",
runId: "run-1",
provider: "openai",
model: "gpt-5.4",
outcome: "completed",
durationMs: 100,
trace: {
traceId: TRACE_ID,
spanId: CHILD_SPAN_ID,
parentSpanId: SPAN_ID,
traceFlags: "01",
},
});
emitTrustedDiagnosticEvent({
type: "model.call.completed",
runId: "run-1",
callId: "call-1",
provider: "openai",
model: "gpt-5.4",
durationMs: 80,
trace: {
traceId: TRACE_ID,
spanId: GRANDCHILD_SPAN_ID,
parentSpanId: CHILD_SPAN_ID,
traceFlags: "01",
},
});
emitTrustedDiagnosticEvent({
type: "tool.execution.error",
runId: "run-1",
toolName: "read",
durationMs: 20,
errorCategory: "TypeError",
trace: {
traceId: TRACE_ID,
spanId: TOOL_SPAN_ID,
parentSpanId: GRANDCHILD_SPAN_ID,
traceFlags: "01",
},
});
await flushDiagnosticEvents();
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledTimes(3);
expect(telemetryState.tracer.setSpanContext.mock.calls.map((call) => call[1])).toEqual([
expect.objectContaining({ traceId: TRACE_ID, spanId: SPAN_ID }),
expect.objectContaining({ traceId: TRACE_ID, spanId: CHILD_SPAN_ID }),
expect.objectContaining({ traceId: TRACE_ID, spanId: GRANDCHILD_SPAN_ID }),
]);
const parentBySpanName = Object.fromEntries(
telemetryState.tracer.startSpan.mock.calls.map((call) => [
call[0],
(call[2] as { spanContext?: { spanId?: string } } | undefined)?.spanContext?.spanId,
]),
);
expect(parentBySpanName).toMatchObject({
"openclaw.run": SPAN_ID,
"openclaw.model.call": CHILD_SPAN_ID,
"openclaw.tool.execution": GRANDCHILD_SPAN_ID,
});
await service.stop?.(ctx);
});
test("exports exec process spans without command text", async () => {
const service = createDiagnosticsOtelService();
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });


@@ -16,6 +16,7 @@ import { NodeSDK } from "@opentelemetry/sdk-node";
import { ParentBasedSampler, TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base";
import { ATTR_SERVICE_NAME } from "@opentelemetry/semantic-conventions";
import type {
DiagnosticEventMetadata,
DiagnosticEventPayload,
DiagnosticTraceContext,
OpenClawPluginService,
@@ -24,7 +25,6 @@ import {
isValidDiagnosticSpanId,
isValidDiagnosticTraceFlags,
isValidDiagnosticTraceId,
onInternalDiagnosticEvent,
redactSensitiveText,
} from "../api.js";
@@ -50,6 +50,14 @@ const OTEL_LOG_RAW_ATTRIBUTE_KEY_RE = /^[A-Za-z0-9_.:-]{1,64}$/u;
const OTEL_LOG_ATTRIBUTE_KEY_RE = /^[A-Za-z0-9_.:-]{1,96}$/u;
const BLOCKED_OTEL_LOG_ATTRIBUTE_KEYS = new Set(["__proto__", "prototype", "constructor"]);
const PRELOADED_OTEL_SDK_ENV = "OPENCLAW_OTEL_PRELOADED";
const OTEL_SEMCONV_STABILITY_OPT_IN_ENV = "OTEL_SEMCONV_STABILITY_OPT_IN";
const GEN_AI_LATEST_EXPERIMENTAL_OPT_IN = "gen_ai_latest_experimental";
const GEN_AI_TOKEN_USAGE_BUCKETS = [
1, 4, 16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304, 16777216, 67108864,
];
const GEN_AI_OPERATION_DURATION_BUCKETS = [
0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92,
];
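
Review note: both bucket arrays are geometric progressions. The token buckets are powers of 4 from 1 through 67108864, and the duration buckets double from 0.01 s through 81.92 s. A minimal sketch that regenerates them, assuming a hypothetical helper `geometricBuckets` that is not part of this diff:

```typescript
// Hypothetical helper (not in the diff): start, start*factor, ... for `count` values.
function geometricBuckets(start: number, factor: number, count: number): number[] {
  return Array.from({ length: count }, (_, i) => start * Math.pow(factor, i));
}

// Powers of 4: 1, 4, 16, ..., 67108864 (14 boundaries).
const tokenBuckets = geometricBuckets(1, 4, 14);

// Doubling from 0.01 s; rounded to two decimals so values match the literals.
const durationBuckets = geometricBuckets(0.01, 2, 14).map(
  (value) => Math.round(value * 100) / 100,
);
```

Keeping the literals in the source, as the diff does, is arguably easier to audit against the semantic conventions; the generator is only a way to check the arrays for typos.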
type OtelContentCapturePolicy = {
inputMessages: boolean;
@@ -65,6 +73,10 @@ type MessageDeliveryDiagnosticEvent = Extract<
type: "message.delivery.started" | "message.delivery.completed" | "message.delivery.error";
}
>;
type ModelCallLifecycleDiagnosticEvent = Extract<
DiagnosticEventPayload,
{ type: "model.call.completed" | "model.call.error" }
>;
const NO_CONTENT_CAPTURE: OtelContentCapturePolicy = {
inputMessages: false,
@@ -133,8 +145,89 @@ function lowCardinalityAttr(value: string | undefined, fallback = "unknown"): st
return LOW_CARDINALITY_VALUE_RE.test(redacted) ? redacted : fallback;
}
function genAiOperationName(api: string | undefined): "chat" | "text_completion" {
return api === "completions" ? "text_completion" : "chat";
function hasOtelSemconvOptIn(value: string | undefined, optIn: string): boolean {
return (
value
?.split(",")
.map((part) => part.trim())
.includes(optIn) ?? false
);
}
function emitLatestGenAiSemconv(): boolean {
return hasOtelSemconvOptIn(
process.env[OTEL_SEMCONV_STABILITY_OPT_IN_ENV],
GEN_AI_LATEST_EXPERIMENTAL_OPT_IN,
);
}
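
Review note: `OTEL_SEMCONV_STABILITY_OPT_IN` is parsed as a comma-separated list with exact, trimmed matching. The sketch below reproduces the parsing from `hasOtelSemconvOptIn` under the hypothetical name `hasOptIn` so it runs standalone; the sample inputs are illustrative.

```typescript
// Same logic as hasOtelSemconvOptIn in the hunk above, renamed for a standalone sketch.
function hasOptIn(value: string | undefined, optIn: string): boolean {
  return (
    value
      ?.split(",")
      .map((part) => part.trim())
      .includes(optIn) ?? false
  );
}

const OPT_IN = "gen_ai_latest_experimental";

// Illustrative env values and whether they enable the latest GenAI attributes.
const optInSamples: Array<[string | undefined, boolean]> = [
  [undefined, hasOptIn(undefined, OPT_IN)],
  ["", hasOptIn("", OPT_IN)],
  ["http", hasOptIn("http", OPT_IN)],
  ["http,gen_ai_latest_experimental", hasOptIn("http,gen_ai_latest_experimental", OPT_IN)],
  ["http, gen_ai_latest_experimental ", hasOptIn("http, gen_ai_latest_experimental ", OPT_IN)],
  ["gen_ai", hasOptIn("gen_ai", OPT_IN)],
];
```

Only the two entries that contain the full token evaluate to `true`; a partial token such as `gen_ai` does not match.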
function genAiOperationName(
api: string | undefined,
): "chat" | "generate_content" | "text_completion" {
const normalized = api?.trim().toLowerCase();
if (!normalized) {
return "chat";
}
if (normalized === "completions" || normalized.endsWith("-completions")) {
return "text_completion";
}
if (normalized === "generate_content" || normalized.includes("generative-ai")) {
return "generate_content";
}
return "chat";
}
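
Review note: the new mapping normalizes the `api` string before matching. A standalone copy under the hypothetical name `operationNameFor`, exercised with the `api` values that the tests in this diff use:

```typescript
type GenAiOperation = "chat" | "generate_content" | "text_completion";

// Standalone copy of the new genAiOperationName logic from the hunk above.
function operationNameFor(api: string | undefined): GenAiOperation {
  const normalized = api?.trim().toLowerCase();
  if (!normalized) {
    return "chat";
  }
  if (normalized === "completions" || normalized.endsWith("-completions")) {
    return "text_completion";
  }
  if (normalized === "generate_content" || normalized.includes("generative-ai")) {
    return "generate_content";
  }
  return "chat";
}

// api values taken from the tests in this diff, plus two edge cases.
const operationSamples = {
  completions: operationNameFor("openai-completions"),
  generative: operationNameFor("google-generative-ai"),
  responses: operationNameFor("openai-responses"),
  missing: operationNameFor(undefined),
  padded: operationNameFor("  COMPLETIONS "),
};
```

Anything unrecognized, including a missing `api`, falls back to `chat`, which matches the third expectation in the "maps model call APIs" test.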
function positiveFiniteNumber(value: number | undefined): number | undefined {
return typeof value === "number" && Number.isFinite(value) && value > 0 ? value : undefined;
}
function assignPositiveNumberAttr(
attrs: Record<string, string | number>,
key: string,
value: number | undefined,
): void {
const normalized = positiveFiniteNumber(value);
if (normalized !== undefined) {
attrs[key] = normalized;
}
}
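
Review note: later in this diff, `recordModelUsage` derives `gen_ai.usage.input_tokens` from `promptTokens`, falling back to `input + cacheRead + cacheWrite`, and only sets attributes whose value is a positive finite number. A minimal sketch of that combination follows; the `Usage` type and `genAiUsageAttrs` are hypothetical names for illustration, not part of the diff.

```typescript
// Simplified stand-in for the usage payload; field names follow the tests in this diff.
type Usage = {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  promptTokens?: number;
};

function positiveFiniteNumber(value: number | undefined): number | undefined {
  return typeof value === "number" && Number.isFinite(value) && value > 0 ? value : undefined;
}

// Hypothetical helper: collects the GenAI usage span attributes for one event.
function genAiUsageAttrs(usage: Usage): Record<string, number> {
  const attrs: Record<string, number> = {};
  // promptTokens wins when present; otherwise sum the input-side counters.
  const inputTokens =
    usage.promptTokens ?? (usage.input ?? 0) + (usage.cacheRead ?? 0) + (usage.cacheWrite ?? 0);
  const candidates: Array<[string, number | undefined]> = [
    ["gen_ai.usage.input_tokens", inputTokens],
    ["gen_ai.usage.output_tokens", usage.output],
    ["gen_ai.usage.cache_read.input_tokens", usage.cacheRead],
    ["gen_ai.usage.cache_creation.input_tokens", usage.cacheWrite],
  ];
  candidates.forEach(([key, value]) => {
    const normalized = positiveFiniteNumber(value);
    if (normalized !== undefined) {
      attrs[key] = normalized; // zero, negative, NaN, and missing values are omitted
    }
  });
  return attrs;
}

const withPromptTokens = genAiUsageAttrs({
  input: 100,
  output: 40,
  cacheRead: 30,
  cacheWrite: 20,
  promptTokens: 150,
});
const withoutPromptTokens = genAiUsageAttrs({ input: 100, output: 40, cacheRead: 30, cacheWrite: 20 });
const inputOnly = genAiUsageAttrs({ input: 2 });
```

With the numbers from the model usage span test, both the explicit `promptTokens` and the fallback sum yield 150 input tokens.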
function assignGenAiSpanIdentityAttrs(
attrs: Record<string, string | number | boolean>,
input: { api?: string; model?: string; provider?: string },
): void {
if (emitLatestGenAiSemconv()) {
attrs["gen_ai.provider.name"] = lowCardinalityAttr(input.provider);
} else {
attrs["gen_ai.system"] = lowCardinalityAttr(input.provider);
}
if (input.model) {
attrs["gen_ai.request.model"] = lowCardinalityAttr(input.model);
}
attrs["gen_ai.operation.name"] = genAiOperationName(input.api);
}
function assignGenAiModelCallAttrs(
attrs: Record<string, string | number | boolean>,
evt: ModelCallLifecycleDiagnosticEvent,
): void {
assignGenAiSpanIdentityAttrs(attrs, evt);
}
function addUpstreamRequestIdSpanEvent(
span: { addEvent?: (name: string, attributes?: Record<string, string>) => void },
upstreamRequestIdHash: string | undefined,
): void {
if (!upstreamRequestIdHash) {
return;
}
const boundedHash = lowCardinalityAttr(upstreamRequestIdHash);
if (boundedHash === "unknown") {
return;
}
span.addEvent?.("openclaw.provider.request", {
"openclaw.upstreamRequestIdHash": boundedHash,
});
}
function clampOtelLogText(value: string, maxChars: number): string {
@@ -339,6 +432,33 @@ function contextForTraceContext(traceContext: DiagnosticTraceContext | undefined
});
}
function contextForDiagnosticSpanParent(traceContext: DiagnosticTraceContext | undefined) {
const normalized = normalizeTraceContext(traceContext);
if (!normalized?.parentSpanId) {
return undefined;
}
return trace.setSpanContext(otelContextApi.active(), {
traceId: normalized.traceId,
spanId: normalized.parentSpanId,
traceFlags: traceFlagsToOtel(normalized.traceFlags),
isRemote: true,
});
}
function contextForTrustedTraceContext(
evt: DiagnosticEventPayload,
metadata: DiagnosticEventMetadata,
) {
return metadata.trusted ? contextForTraceContext(evt.trace) : undefined;
}
function contextForTrustedDiagnosticSpanParent(
evt: DiagnosticEventPayload,
metadata: DiagnosticEventMetadata,
) {
return metadata.trusted ? contextForDiagnosticSpanParent(evt.trace) : undefined;
}
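
Review note: the two `contextForTrusted…` helpers gate trace propagation on `metadata.trusted`, so untrusted events never pick a parent span. The sketch below shows that gating without the OpenTelemetry API. `TraceContext`, `EventMetadata`, and `parentSpanIdFor` are simplified, hypothetical stand-ins for `DiagnosticTraceContext`, `DiagnosticEventMetadata`, and `contextForTrustedDiagnosticSpanParent`.

```typescript
// Simplified stand-ins; the real types come from ../api.js in the diff.
type TraceContext = {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  traceFlags?: string;
};
type EventMetadata = { trusted: boolean };

// Returns the span id a new span should be parented under, or undefined
// when the event is untrusted or carries no explicit parent.
function parentSpanIdFor(
  trace: TraceContext | undefined,
  metadata: EventMetadata,
): string | undefined {
  if (!metadata.trusted) {
    return undefined;
  }
  return trace?.parentSpanId;
}

const sampleTrace: TraceContext = {
  traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
  spanId: "1111111111111111",
  parentSpanId: "00f067aa0ba902b7",
  traceFlags: "01",
};

const trustedParent = parentSpanIdFor(sampleTrace, { trusted: true });
const untrustedParent = parentSpanIdFor(sampleTrace, { trusted: false });
```

This mirrors the expectations in the tests above: trusted events are parented from the explicit parent id, while untrusted events export with no trace context attached.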
function addTraceAttributes(
attributes: Record<string, string | number | boolean>,
traceContext: DiagnosticTraceContext | undefined,
@@ -485,6 +605,23 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
unit: "1",
description: "Token usage by type",
});
const genAiTokenUsageHistogram = meter.createHistogram("gen_ai.client.token.usage", {
unit: "{token}",
description: "Number of input and output tokens used by GenAI client operations",
advice: {
explicitBucketBoundaries: GEN_AI_TOKEN_USAGE_BUCKETS,
},
});
const genAiOperationDurationHistogram = meter.createHistogram(
"gen_ai.client.operation.duration",
{
unit: "s",
description: "GenAI client operation duration",
advice: {
explicitBucketBoundaries: GEN_AI_OPERATION_DURATION_BUCKETS,
},
},
);
const costCounter = meter.createCounter("openclaw.cost.usd", {
unit: "1",
description: "Estimated model cost (USD)",
@@ -567,6 +704,10 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
unit: "1",
description: "Run attempts",
});
const toolLoopCounter = meter.createCounter("openclaw.tool.loop", {
unit: "1",
description: "Detected repetitive tool-call loop events",
});
const modelCallDurationHistogram = meter.createHistogram("openclaw.model_call.duration_ms", {
unit: "ms",
description: "Model call duration",
@@ -582,9 +723,39 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
unit: "ms",
description: "Exec process duration",
});
const memoryRssHistogram = meter.createHistogram("openclaw.memory.rss_bytes", {
unit: "By",
description: "Resident set size reported by diagnostic memory samples",
});
const memoryHeapUsedHistogram = meter.createHistogram("openclaw.memory.heap_used_bytes", {
unit: "By",
description: "Heap used bytes reported by diagnostic memory samples",
});
const memoryHeapTotalHistogram = meter.createHistogram("openclaw.memory.heap_total_bytes", {
unit: "By",
description: "Heap total bytes reported by diagnostic memory samples",
});
const memoryExternalHistogram = meter.createHistogram("openclaw.memory.external_bytes", {
unit: "By",
description: "External memory bytes reported by diagnostic memory samples",
});
const memoryArrayBuffersHistogram = meter.createHistogram(
"openclaw.memory.array_buffers_bytes",
{
unit: "By",
description: "ArrayBuffer bytes reported by diagnostic memory samples",
},
);
const memoryPressureCounter = meter.createCounter("openclaw.memory.pressure", {
unit: "1",
description: "Diagnostic memory pressure events",
});
let recordLogRecord:
| ((evt: Extract<DiagnosticEventPayload, { type: "log.record" }>) => void)
| ((
evt: Extract<DiagnosticEventPayload, { type: "log.record" }>,
metadata: DiagnosticEventMetadata,
) => void)
| undefined;
if (logsEnabled) {
let logRecordExportFailureLastReportedAt = Number.NEGATIVE_INFINITY;
@@ -603,7 +774,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
processors: [logProcessor],
});
const otelLogger = logProvider.getLogger("openclaw");
recordLogRecord = (evt) => {
recordLogRecord = (evt, metadata) => {
try {
const logLevelName = evt.level || "INFO";
const severityNumber = logSeverityMap[logLevelName] ?? (9 as SeverityNumber);
@@ -626,7 +797,9 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
if (evt.code?.functionName) {
assignOtelLogAttribute(attributes, "code.function", evt.code.functionName);
}
addTraceAttributes(attributes, evt.trace);
if (metadata.trusted) {
addTraceAttributes(attributes, evt.trace);
}
const logRecord: LogRecord = {
body: normalizeOtelLogString(evt.message || "log", MAX_OTEL_LOG_BODY_CHARS),
@@ -635,7 +808,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
attributes: redactOtelAttributes(attributes),
timestamp: evt.ts,
};
const logContext = contextForTraceContext(evt.trace);
const logContext = contextForTrustedTraceContext(evt, metadata);
if (logContext) {
logRecord.context = logContext;
}
@@ -719,19 +892,35 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
};
};
const recordModelUsage = (evt: Extract<DiagnosticEventPayload, { type: "model.usage" }>) => {
const recordModelUsage = (
evt: Extract<DiagnosticEventPayload, { type: "model.usage" }>,
metadata: DiagnosticEventMetadata,
) => {
const attrs = {
"openclaw.channel": evt.channel ?? "unknown",
"openclaw.provider": evt.provider ?? "unknown",
"openclaw.model": evt.model ?? "unknown",
};
const genAiAttrs: Record<string, string> = {
"gen_ai.operation.name": "chat",
"gen_ai.provider.name": lowCardinalityAttr(evt.provider),
"gen_ai.request.model": lowCardinalityAttr(evt.model),
};
const usage = evt.usage;
if (usage.input) {
tokensCounter.add(usage.input, { ...attrs, "openclaw.token": "input" });
genAiTokenUsageHistogram.record(usage.input, {
...genAiAttrs,
"gen_ai.token.type": "input",
});
}
if (usage.output) {
tokensCounter.add(usage.output, { ...attrs, "openclaw.token": "output" });
genAiTokenUsageHistogram.record(usage.output, {
...genAiAttrs,
"gen_ai.token.type": "output",
});
}
if (usage.cacheRead) {
tokensCounter.add(usage.cacheRead, { ...attrs, "openclaw.token": "cache_read" });
@@ -768,6 +957,9 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
if (!tracesEnabled) {
return;
}
const genAiInputTokens =
usage.promptTokens ??
(usage.input ?? 0) + (usage.cacheRead ?? 0) + (usage.cacheWrite ?? 0);
const spanAttrs: Record<string, string | number> = {
...attrs,
"openclaw.tokens.input": usage.input ?? 0,
@@ -776,9 +968,25 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
"openclaw.tokens.cache_write": usage.cacheWrite ?? 0,
"openclaw.tokens.total": usage.total ?? 0,
};
assignGenAiSpanIdentityAttrs(spanAttrs, evt);
assignPositiveNumberAttr(spanAttrs, "gen_ai.usage.input_tokens", genAiInputTokens);
assignPositiveNumberAttr(spanAttrs, "gen_ai.usage.output_tokens", usage.output);
assignPositiveNumberAttr(
spanAttrs,
"gen_ai.usage.cache_read.input_tokens",
usage.cacheRead,
);
assignPositiveNumberAttr(
spanAttrs,
"gen_ai.usage.cache_creation.input_tokens",
usage.cacheWrite,
);
const span = spanWithDuration("openclaw.model.usage", spanAttrs, evt.durationMs);
span.end();
const span = spanWithDuration("openclaw.model.usage", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
span.end(evt.ts);
};
const recordWebhookReceived = (
@@ -992,8 +1200,97 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
runAttemptCounter.add(1, { "openclaw.attempt": evt.attempt });
};
const toolLoopAttrs = (
evt: Extract<DiagnosticEventPayload, { type: "tool.loop" }>,
): Record<string, string | number> => ({
"openclaw.toolName": lowCardinalityAttr(evt.toolName, "tool"),
"openclaw.loop.level": evt.level,
"openclaw.loop.action": evt.action,
"openclaw.loop.detector": evt.detector,
"openclaw.loop.count": evt.count,
...(evt.pairedToolName
? { "openclaw.loop.paired_tool": lowCardinalityAttr(evt.pairedToolName, "tool") }
: {}),
});
const recordToolLoop = (evt: Extract<DiagnosticEventPayload, { type: "tool.loop" }>) => {
const attrs = toolLoopAttrs(evt);
toolLoopCounter.add(1, attrs);
if (!tracesEnabled) {
return;
}
const span = spanWithDuration("openclaw.tool.loop", attrs, 0, { endTimeMs: evt.ts });
if (evt.level === "critical" || evt.action === "block") {
span.setStatus({
code: SpanStatusCode.ERROR,
message: `${evt.detector}:${evt.action}`,
});
}
span.end(evt.ts);
};
const recordMemoryUsageMetrics = (
evt: Extract<
DiagnosticEventPayload,
{ type: "diagnostic.memory.sample" | "diagnostic.memory.pressure" }
>,
attrs: Record<string, string> = {},
) => {
memoryRssHistogram.record(evt.memory.rssBytes, attrs);
memoryHeapUsedHistogram.record(evt.memory.heapUsedBytes, attrs);
memoryHeapTotalHistogram.record(evt.memory.heapTotalBytes, attrs);
memoryExternalHistogram.record(evt.memory.externalBytes, attrs);
memoryArrayBuffersHistogram.record(evt.memory.arrayBuffersBytes, attrs);
};
const recordMemorySample = (
evt: Extract<DiagnosticEventPayload, { type: "diagnostic.memory.sample" }>,
) => {
recordMemoryUsageMetrics(evt);
};
const recordMemoryPressure = (
evt: Extract<DiagnosticEventPayload, { type: "diagnostic.memory.pressure" }>,
) => {
const attrs = {
"openclaw.memory.level": evt.level,
"openclaw.memory.reason": evt.reason,
};
memoryPressureCounter.add(1, attrs);
recordMemoryUsageMetrics(evt, attrs);
if (!tracesEnabled) {
return;
}
const spanAttrs: Record<string, string | number | boolean> = {
...attrs,
"openclaw.memory.rss_bytes": evt.memory.rssBytes,
"openclaw.memory.heap_used_bytes": evt.memory.heapUsedBytes,
"openclaw.memory.heap_total_bytes": evt.memory.heapTotalBytes,
"openclaw.memory.external_bytes": evt.memory.externalBytes,
"openclaw.memory.array_buffers_bytes": evt.memory.arrayBuffersBytes,
...(evt.thresholdBytes !== undefined
? { "openclaw.memory.threshold_bytes": evt.thresholdBytes }
: {}),
...(evt.rssGrowthBytes !== undefined
? { "openclaw.memory.rss_growth_bytes": evt.rssGrowthBytes }
: {}),
...(evt.windowMs !== undefined ? { "openclaw.memory.window_ms": evt.windowMs } : {}),
};
const span = spanWithDuration("openclaw.memory.pressure", spanAttrs, 0, {
endTimeMs: evt.ts,
});
if (evt.level === "critical") {
span.setStatus({
code: SpanStatusCode.ERROR,
message: evt.reason,
});
}
span.end(evt.ts);
};
const recordRunCompleted = (
evt: Extract<DiagnosticEventPayload, { type: "run.completed" }>,
metadata: DiagnosticEventMetadata,
) => {
const attrs: Record<string, string | number> = {
"openclaw.outcome": evt.outcome,
@@ -1015,6 +1312,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
spanAttrs["openclaw.errorCategory"] = lowCardinalityAttr(evt.errorCategory, "other");
}
const span = spanWithDuration("openclaw.run", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
if (evt.outcome === "error") {
@@ -1026,64 +1324,69 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
span.end(evt.ts);
};
const modelCallMetricAttrs = (
evt: Extract<DiagnosticEventPayload, { type: "model.call.completed" | "model.call.error" }>,
) => ({
"openclaw.provider": evt.provider,
"openclaw.model": evt.model,
"openclaw.api": lowCardinalityAttr(evt.api),
"openclaw.transport": lowCardinalityAttr(evt.transport),
});
const recordModelCallCompleted = (
evt: Extract<DiagnosticEventPayload, { type: "model.call.completed" }>,
const recordContextAssembled = (
evt: Extract<DiagnosticEventPayload, { type: "context.assembled" }>,
metadata: DiagnosticEventMetadata,
) => {
modelCallDurationHistogram.record(evt.durationMs, modelCallMetricAttrs(evt));
if (!tracesEnabled) {
return;
}
const spanAttrs: Record<string, string | number | boolean> = {
"openclaw.provider": evt.provider,
"openclaw.model": evt.model,
"gen_ai.system": evt.provider,
"gen_ai.request.model": evt.model,
"gen_ai.operation.name": genAiOperationName(evt.api),
"openclaw.context.message_count": evt.messageCount,
"openclaw.context.history_text_chars": evt.historyTextChars,
"openclaw.context.history_image_blocks": evt.historyImageBlocks,
"openclaw.context.max_message_text_chars": evt.maxMessageTextChars,
"openclaw.context.system_prompt_chars": evt.systemPromptChars,
"openclaw.context.prompt_chars": evt.promptChars,
"openclaw.context.prompt_images": evt.promptImages,
};
if (evt.api) {
spanAttrs["openclaw.api"] = evt.api;
addRunAttrs(spanAttrs, evt);
if (evt.contextTokenBudget !== undefined) {
spanAttrs["openclaw.context.token_budget"] = evt.contextTokenBudget;
}
if (evt.transport) {
spanAttrs["openclaw.transport"] = evt.transport;
if (evt.reserveTokens !== undefined) {
spanAttrs["openclaw.context.reserve_tokens"] = evt.reserveTokens;
}
assignOtelModelContentAttributes(
spanAttrs,
evt as unknown as Record<string, unknown>,
contentCapturePolicy,
);
const span = spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
const span = spanWithDuration("openclaw.context.assembled", spanAttrs, 0, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
span.end(evt.ts);
};
const recordModelCallError = (
evt: Extract<DiagnosticEventPayload, { type: "model.call.error" }>,
const modelCallMetricAttrs = (evt: ModelCallLifecycleDiagnosticEvent) => ({
"openclaw.provider": evt.provider,
"openclaw.model": evt.model,
"openclaw.api": lowCardinalityAttr(evt.api),
"openclaw.transport": lowCardinalityAttr(evt.transport),
});
const genAiModelCallMetricAttrs = (
evt: ModelCallLifecycleDiagnosticEvent,
errorType?: string,
) => ({
"gen_ai.operation.name": genAiOperationName(evt.api),
"gen_ai.provider.name": lowCardinalityAttr(evt.provider),
"gen_ai.request.model": lowCardinalityAttr(evt.model),
...(errorType ? { "error.type": errorType } : {}),
});
const recordModelCallCompleted = (
evt: Extract<DiagnosticEventPayload, { type: "model.call.completed" }>,
metadata: DiagnosticEventMetadata,
) => {
modelCallDurationHistogram.record(evt.durationMs, {
...modelCallMetricAttrs(evt),
"openclaw.errorCategory": lowCardinalityAttr(evt.errorCategory, "other"),
});
modelCallDurationHistogram.record(evt.durationMs, modelCallMetricAttrs(evt));
genAiOperationDurationHistogram.record(
evt.durationMs / 1000,
genAiModelCallMetricAttrs(evt),
);
if (!tracesEnabled) {
return;
}
const spanAttrs: Record<string, string | number | boolean> = {
"openclaw.provider": evt.provider,
"openclaw.model": evt.model,
"openclaw.errorCategory": lowCardinalityAttr(evt.errorCategory, "other"),
"gen_ai.system": evt.provider,
"gen_ai.request.model": evt.model,
"gen_ai.operation.name": genAiOperationName(evt.api),
};
assignGenAiModelCallAttrs(spanAttrs, evt);
if (evt.api) {
spanAttrs["openclaw.api"] = evt.api;
}
@@ -1096,8 +1399,52 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
contentCapturePolicy,
);
const span = spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
addUpstreamRequestIdSpanEvent(span, evt.upstreamRequestIdHash);
span.end(evt.ts);
};
const recordModelCallError = (
evt: Extract<DiagnosticEventPayload, { type: "model.call.error" }>,
metadata: DiagnosticEventMetadata,
) => {
const errorType = lowCardinalityAttr(evt.errorCategory, "other");
modelCallDurationHistogram.record(evt.durationMs, {
...modelCallMetricAttrs(evt),
"openclaw.errorCategory": errorType,
});
genAiOperationDurationHistogram.record(
evt.durationMs / 1000,
genAiModelCallMetricAttrs(evt, errorType),
);
if (!tracesEnabled) {
return;
}
const spanAttrs: Record<string, string | number | boolean> = {
"openclaw.provider": evt.provider,
"openclaw.model": evt.model,
"openclaw.errorCategory": errorType,
"error.type": errorType,
};
assignGenAiModelCallAttrs(spanAttrs, evt);
if (evt.api) {
spanAttrs["openclaw.api"] = evt.api;
}
if (evt.transport) {
spanAttrs["openclaw.transport"] = evt.transport;
}
assignOtelModelContentAttributes(
spanAttrs,
evt as unknown as Record<string, unknown>,
contentCapturePolicy,
);
const span = spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
addUpstreamRequestIdSpanEvent(span, evt.upstreamRequestIdHash);
span.setStatus({
code: SpanStatusCode.ERROR,
message: redactSensitiveText(evt.errorCategory),
@@ -1107,6 +1454,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
const recordToolExecutionCompleted = (
evt: Extract<DiagnosticEventPayload, { type: "tool.execution.completed" }>,
metadata: DiagnosticEventMetadata,
) => {
const attrs = {
"openclaw.toolName": evt.toolName,
@@ -1128,6 +1476,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
contentCapturePolicy,
);
const span = spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
span.end(evt.ts);
@@ -1135,6 +1484,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
const recordToolExecutionError = (
evt: Extract<DiagnosticEventPayload, { type: "tool.execution.error" }>,
metadata: DiagnosticEventMetadata,
) => {
const attrs = {
"openclaw.toolName": evt.toolName,
@@ -1161,6 +1511,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
contentCapturePolicy,
);
const span = spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
endTimeMs: evt.ts,
});
span.setStatus({
@@ -1218,11 +1569,17 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
queueDepthHistogram.record(evt.queued, { "openclaw.channel": "heartbeat" });
};
unsubscribe = onInternalDiagnosticEvent((evt: DiagnosticEventPayload) => {
const subscribe = ctx.internalDiagnostics?.onEvent;
if (!subscribe) {
ctx.logger.error("diagnostics-otel: internal diagnostics capability unavailable");
return;
}
unsubscribe = subscribe((evt: DiagnosticEventPayload, metadata: DiagnosticEventMetadata) => {
try {
switch (evt.type) {
case "model.usage":
recordModelUsage(evt);
recordModelUsage(evt, metadata);
return;
case "webhook.received":
recordWebhookReceived(evt);
@@ -1267,32 +1624,41 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
recordHeartbeat(evt);
return;
case "run.completed":
recordRunCompleted(evt);
recordRunCompleted(evt, metadata);
return;
case "context.assembled":
recordContextAssembled(evt, metadata);
return;
case "model.call.completed":
recordModelCallCompleted(evt);
recordModelCallCompleted(evt, metadata);
return;
case "model.call.error":
recordModelCallError(evt);
recordModelCallError(evt, metadata);
return;
case "tool.execution.completed":
recordToolExecutionCompleted(evt);
recordToolExecutionCompleted(evt, metadata);
return;
case "tool.execution.error":
recordToolExecutionError(evt);
recordToolExecutionError(evt, metadata);
return;
case "exec.process.completed":
recordExecProcessCompleted(evt);
return;
case "log.record":
recordLogRecord?.(evt);
recordLogRecord?.(evt, metadata);
return;
case "tool.loop":
recordToolLoop(evt);
return;
case "diagnostic.memory.sample":
recordMemorySample(evt);
return;
case "diagnostic.memory.pressure":
recordMemoryPressure(evt);
return;
case "tool.execution.started":
case "run.started":
case "model.call.started":
case "diagnostic.memory.sample":
case "diagnostic.memory.pressure":
case "payload.large":
return;
}
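
Note on the `gen_ai.usage.input_tokens` fallback in the hunks above: the span attribute prefers `usage.promptTokens` and otherwise sums the uncached, cache-read, and cache-write input counts. The following is a minimal sketch of that rule, not the plugin's actual code. The `Usage` type is a hypothetical stand-in for the event payload, and the body of `assignPositiveNumberAttr` is an assumption inferred from its name, since the diff does not show it.

```typescript
// Hypothetical stand-in for the usage payload; field names mirror the diff.
type Usage = {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
  promptTokens?: number;
};

// Prefer the provider-reported prompt total; otherwise add up the input parts.
// `??` binds more loosely than `+`, so the whole sum is the fallback value.
function genAiInputTokens(usage: Usage): number {
  return (
    usage.promptTokens ??
    (usage.input ?? 0) + (usage.cacheRead ?? 0) + (usage.cacheWrite ?? 0)
  );
}

// ASSUMPTION: the real helper is not shown in the diff. This version sets
// the attribute only for finite, strictly positive numbers.
function assignPositiveNumberAttr(
  target: Record<string, string | number>,
  key: string,
  value: number | undefined,
): void {
  if (typeof value === "number" && Number.isFinite(value) && value > 0) {
    target[key] = value;
  }
}
```

Under that assumption, a usage record with no token counts leaves the `gen_ai.usage.*` attributes unset instead of exporting zeros, while the `openclaw.tokens.*` attributes still default to `0`.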

@@ -61,7 +61,7 @@ describeLive("elevenlabs plugin live", () => {
const normalized = normalizeTranscriptForMatch(transcript?.text ?? "");
expect(normalized).toContain("openclaw");
expect(normalized).toContain("elevenlabs");
expect(normalized).toMatch(/(?:elevenlabs|11labs)/);
}, 90_000);
it("streams realtime STT through the registered transcription provider", async () => {

@@ -76,6 +76,7 @@ describe("fal image-generation provider", () => {
cfg: {},
count: 2,
size: "1536x1024",
outputFormat: "jpeg",
});
expectFalJsonPost({
@@ -85,7 +86,7 @@ describe("fal image-generation provider", () => {
prompt: "draw a cat",
image_size: { width: 1536, height: 1024 },
num_images: 2,
output_format: "png",
output_format: "jpeg",
},
});
expect(fetchWithSsrFGuardMock).toHaveBeenNthCalledWith(

@@ -25,6 +25,7 @@ const DEFAULT_FAL_BASE_URL = "https://fal.run";
const DEFAULT_FAL_IMAGE_MODEL = "fal-ai/flux/dev";
const DEFAULT_FAL_EDIT_SUBPATH = "image-to-image";
const DEFAULT_OUTPUT_FORMAT = "png";
const FAL_OUTPUT_FORMATS = ["png", "jpeg"] as const;
const FAL_SUPPORTED_SIZES = [
"1024x1024",
"1024x1536",
@@ -292,6 +293,9 @@ export function buildFalImageGenerationProvider(): ImageGenerationProvider {
aspectRatios: [...FAL_SUPPORTED_ASPECT_RATIOS],
resolutions: ["1K", "2K", "4K"],
},
output: {
formats: [...FAL_OUTPUT_FORMATS],
},
},
async generateImage(req) {
const auth = await resolveApiKeyForProvider({
@@ -333,7 +337,7 @@ export function buildFalImageGenerationProvider(): ImageGenerationProvider {
const requestBody: Record<string, unknown> = {
prompt: req.prompt,
num_images: req.count ?? 1,
output_format: DEFAULT_OUTPUT_FORMAT,
output_format: req.outputFormat ?? DEFAULT_OUTPUT_FORMAT,
};
if (imageSize !== undefined) {
requestBody.image_size = imageSize;

@@ -858,19 +858,25 @@ describe("google-meet plugin", () => {
});
it("reports setup status through the tool", async () => {
const { tools } = setup({
chrome: {
audioInputCommand: ["openclaw-audio-bridge", "capture"],
audioOutputCommand: ["openclaw-audio-bridge", "play"],
},
});
const tool = tools[0] as {
execute: (id: string, params: unknown) => Promise<{ details: { ok?: boolean } }>;
};
const originalPlatform = process.platform;
Object.defineProperty(process, "platform", { value: "darwin" });
try {
const { tools } = setup({
chrome: {
audioInputCommand: ["openclaw-audio-bridge", "capture"],
audioOutputCommand: ["openclaw-audio-bridge", "play"],
},
});
const tool = tools[0] as {
execute: (id: string, params: unknown) => Promise<{ details: { ok?: boolean } }>;
};
const result = await tool.execute("id", { action: "setup_status" });
const result = await tool.execute("id", { action: "setup_status" });
expect(result.details.ok).toBe(true);
expect(result.details.ok).toBe(true);
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform });
}
});
it("reports attendance through the tool", async () => {
@@ -1045,7 +1051,20 @@ describe("google-meet plugin", () => {
defaultTransport: "chrome-node",
chromeNode: { node: "parallels-macos" },
},
{ nodesListResult: { nodes: [] } },
{
nodesListResult: {
nodes: [
{
nodeId: "node-1",
displayName: "parallels-macos",
connected: false,
caps: [],
commands: [],
remoteIp: "192.168.0.25",
},
],
},
},
);
const tool = tools[0] as {
execute: (
@@ -1062,10 +1081,97 @@ describe("google-meet plugin", () => {
expect.objectContaining({
id: "chrome-node-connected",
ok: false,
message: expect.stringContaining("No connected Google Meet-capable node"),
message: expect.stringContaining("parallels-macos"),
}),
]),
);
const check = result.details.checks?.find(
(item) => (item as { id?: unknown }).id === "chrome-node-connected",
) as { message?: string } | undefined;
expect(check?.message).toContain("offline");
expect(check?.message).toContain("missing googlemeet.chrome");
expect(check?.message).toContain("missing browser.proxy/browser capability");
});
it("reports missing local Chrome audio prerequisites in setup status", async () => {
const originalPlatform = process.platform;
Object.defineProperty(process, "platform", { value: "darwin" });
try {
const { tools } = setup(
{ defaultTransport: "chrome" },
{
runCommandWithTimeoutHandler: async (argv) => {
if (argv[0] === "/usr/sbin/system_profiler") {
return { code: 0, stdout: "Built-in Output", stderr: "" };
}
return { code: 0, stdout: "", stderr: "" };
},
},
);
const tool = tools[0] as {
execute: (
id: string,
params: unknown,
) => Promise<{ details: { ok?: boolean; checks?: unknown[] } }>;
};
const result = await tool.execute("id", { action: "setup_status", transport: "chrome" });
expect(result.details.ok).toBe(false);
expect(result.details.checks).toEqual(
expect.arrayContaining([
expect.objectContaining({
id: "chrome-local-audio-device",
ok: false,
message: expect.stringContaining("BlackHole 2ch audio device not found"),
}),
]),
);
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform });
}
});
it("reports missing local Chrome audio commands in setup status", async () => {
const originalPlatform = process.platform;
Object.defineProperty(process, "platform", { value: "darwin" });
try {
const { tools } = setup(
{ defaultTransport: "chrome" },
{
runCommandWithTimeoutHandler: async (argv) => {
if (argv[0] === "/usr/sbin/system_profiler") {
return { code: 0, stdout: "BlackHole 2ch", stderr: "" };
}
if (argv[0] === "/bin/sh" && argv.at(-1) === "play") {
return { code: 1, stdout: "", stderr: "" };
}
return { code: 0, stdout: "", stderr: "" };
},
},
);
const tool = tools[0] as {
execute: (
id: string,
params: unknown,
) => Promise<{ details: { ok?: boolean; checks?: unknown[] } }>;
};
const result = await tool.execute("id", { action: "setup_status", transport: "chrome" });
expect(result.details.ok).toBe(false);
expect(result.details.checks).toEqual(
expect.arrayContaining([
expect.objectContaining({
id: "chrome-local-audio-commands",
ok: false,
message: "Chrome audio command missing: play",
}),
]),
);
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform });
}
});
it("reports Twilio delegation readiness when voice-call is enabled", async () => {
@@ -1217,7 +1323,7 @@ describe("google-meet plugin", () => {
});
expect(respond.mock.calls[0]?.[0]).toBe(true);
expect(nodesList).toHaveBeenCalledWith({ connected: true });
expect(nodesList.mock.calls[0]).toEqual([]);
expect(nodesInvoke).toHaveBeenCalledWith(
expect.objectContaining({
nodeId: "node-1",

@@ -566,10 +566,10 @@ export default definePluginEntry({
api.registerGatewayMethod(
"googlemeet.setup",
async ({ respond }: GatewayRequestHandlerOptions) => {
async ({ params, respond }: GatewayRequestHandlerOptions) => {
try {
const rt = await ensureRuntime();
respond(true, await rt.setupStatus());
respond(true, await rt.setupStatus({ transport: normalizeTransport(params?.transport) }));
} catch (err) {
sendError(respond, err);
}
@@ -741,7 +741,7 @@ export default definePluginEntry({
name: "google_meet",
label: "Google Meet",
description:
"Join and track Google Meet sessions through Chrome or Twilio. If a Meet tab is already open after a timeout, call recover_current_tab before retrying join to report login, permission, or admission blockers without opening another tab.",
"Join and track Google Meet sessions through Chrome or Twilio. Call setup_status before join/create/test_speech; if it reports a Chrome node offline or local audio missing, surface that blocker instead of retrying or switching transports. Offline nodes are diagnostics only, not usable candidates. If a Meet tab is already open after a timeout, call recover_current_tab before retrying join to report login, permission, or admission blockers without opening another tab.",
parameters: GoogleMeetToolSchema,
async execute(_toolCallId, params) {
const raw = asParamRecord(params);
@@ -797,7 +797,7 @@ export default definePluginEntry({
}
case "setup_status": {
const rt = await ensureRuntime();
return json(await rt.setupStatus());
return json(await rt.setupStatus({ transport: normalizeTransport(raw.transport) }));
}
case "resolve_space": {
const { token: _token, ...result } = await resolveSpaceFromParams(config, raw);

@@ -129,6 +129,7 @@ export type GoogleMeetExportManifest = {
type SetupOptions = {
json?: boolean;
transport?: GoogleMeetTransport;
};
type DoctorOptions = {
@@ -1975,10 +1976,11 @@ export function registerGoogleMeetCli(params: {
root
.command("setup")
.description("Show Google Meet transport setup status")
.option("--transport <transport>", "Transport to check: chrome, chrome-node, or twilio")
.option("--json", "Print JSON output", false)
.action(async (options: SetupOptions) => {
const rt = await params.ensureRuntime();
const status = await rt.setupStatus();
const status = await rt.setupStatus({ transport: options.transport });
if (options.json) {
writeStdoutJson(status);
return;

@@ -8,6 +8,7 @@ import { addGoogleMeetSetupCheck, getGoogleMeetSetupStatus } from "./setup.js";
import { isSameMeetUrlForReuse, resolveChromeNodeInfo } from "./transports/chrome-browser-proxy.js";
import { createMeetWithBrowserProxyOnNode } from "./transports/chrome-create.js";
import {
assertBlackHole2chAvailable,
launchChromeMeet,
launchChromeMeetOnNode,
recoverCurrentMeetTabOnNode,
@@ -53,6 +54,21 @@ function resolveMode(input: GoogleMeetMode | undefined, config: GoogleMeetConfig
return input ?? config.defaultMode;
}
function collectChromeAudioCommands(config: GoogleMeetConfig): string[] {
const commands = config.chrome.audioBridgeCommand
? [config.chrome.audioBridgeCommand[0]]
: [config.chrome.audioInputCommand?.[0], config.chrome.audioOutputCommand?.[0]];
return [...new Set(commands.filter((value): value is string => Boolean(value?.trim())))];
}
async function commandExists(runtime: PluginRuntime, command: string): Promise<boolean> {
const result = await runtime.system.runCommandWithTimeout(
["/bin/sh", "-lc", 'command -v "$1" >/dev/null 2>&1', "sh", command],
{ timeoutMs: 5_000 },
);
return result.code === 0;
}
export class GoogleMeetRuntime {
readonly #sessions = new Map<string, GoogleMeetSession>();
readonly #sessionStops = new Map<string, () => Promise<void>>();
@@ -86,14 +102,15 @@ export class GoogleMeetRuntime {
return session ? { found: true, session } : { found: false };
}
async setupStatus() {
async setupStatus(options: { transport?: GoogleMeetTransport } = {}) {
const transport = resolveTransport(options.transport, this.params.config);
const shouldCheckChromeNode =
transport === "chrome-node" ||
(!options.transport && Boolean(this.params.config.chromeNode.node));
let status = getGoogleMeetSetupStatus(this.params.config, {
fullConfig: this.params.fullConfig,
});
if (
this.params.config.defaultTransport === "chrome-node" ||
Boolean(this.params.config.chromeNode.node)
) {
if (shouldCheckChromeNode) {
try {
const node = await resolveChromeNodeInfo({
runtime: this.params.runtime,
@@ -113,6 +130,47 @@ export class GoogleMeetRuntime {
});
}
}
if (transport === "chrome") {
try {
await assertBlackHole2chAvailable({
runtime: this.params.runtime,
timeoutMs: Math.min(this.params.config.chrome.joinTimeoutMs, 10_000),
});
status = addGoogleMeetSetupCheck(status, {
id: "chrome-local-audio-device",
ok: true,
message: "BlackHole 2ch audio device found",
});
} catch (error) {
status = addGoogleMeetSetupCheck(status, {
id: "chrome-local-audio-device",
ok: false,
message: formatErrorMessage(error),
});
}
const commands = collectChromeAudioCommands(this.params.config);
const missingCommands: string[] = [];
for (const command of commands) {
try {
if (!(await commandExists(this.params.runtime, command))) {
missingCommands.push(command);
}
} catch {
missingCommands.push(command);
}
}
status = addGoogleMeetSetupCheck(status, {
id: "chrome-local-audio-commands",
ok: commands.length > 0 && missingCommands.length === 0,
message:
commands.length === 0
? "Chrome realtime audio commands are not configured"
: missingCommands.length === 0
? `Chrome audio command${commands.length === 1 ? "" : "s"} available: ${commands.join(", ")}`
: `Chrome audio command${missingCommands.length === 1 ? "" : "s"} missing: ${missingCommands.join(", ")}`,
});
}
return status;
}
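
Note on the `chrome-local-audio-commands` check above: it first reduces the configured audio commands to a de-duplicated list of executables, then reports one of three states (not configured, all available, some missing). A minimal sketch of that logic as pure functions follows. `ChromeAudioConfig` and `describeAudioCommandCheck` are hypothetical names introduced for illustration, and the PATH probe done by `commandExists` is replaced by a `missing` argument.

```typescript
// Hypothetical slice of the plugin config; only the fields the check reads.
type ChromeAudioConfig = {
  audioBridgeCommand?: string[];
  audioInputCommand?: string[];
  audioOutputCommand?: string[];
};

// A bridge command, when set, replaces the input/output pair. Only argv[0]
// (the executable) is kept, and repeated executables collapse to one entry.
function collectChromeAudioCommands(chrome: ChromeAudioConfig): string[] {
  const commands: Array<string | undefined> = chrome.audioBridgeCommand
    ? [chrome.audioBridgeCommand[0]]
    : [chrome.audioInputCommand?.[0], chrome.audioOutputCommand?.[0]];
  const named = commands.filter((value): value is string => Boolean(value?.trim()));
  return Array.from(new Set(named));
}

// Hypothetical helper that mirrors the three-way message in the diff.
function describeAudioCommandCheck(
  commands: string[],
  missing: string[],
): { ok: boolean; message: string } {
  const plural = (count: number): string => (count === 1 ? "" : "s");
  if (commands.length === 0) {
    return { ok: false, message: "Chrome realtime audio commands are not configured" };
  }
  if (missing.length === 0) {
    return {
      ok: true,
      message: `Chrome audio command${plural(commands.length)} available: ${commands.join(", ")}`,
    };
  }
  return {
    ok: false,
    message: `Chrome audio command${plural(missing.length)} missing: ${missing.join(", ")}`,
  };
}
```

With the configuration used in the "reports setup status through the tool" test, both commands share the `openclaw-audio-bridge` executable, so only one probe would be needed.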

@@ -24,6 +24,12 @@ export type GoogleMeetTestNodeListResult = {
}>;
};
type CommandResult = {
code: number;
stdout?: string;
stderr?: string;
};
export function captureStdout() {
let output = "";
const writeSpy = vi.spyOn(process.stdout, "write").mockImplementation(((chunk: unknown) => {
@@ -50,6 +56,10 @@ export function setupGoogleMeetPlugin(
params?: unknown;
timeoutMs?: number;
}) => Promise<unknown>;
runCommandWithTimeoutHandler?: (
argv: string[],
options?: { timeoutMs?: number },
) => Promise<CommandResult>;
} = {},
) {
const methods = new Map<string, unknown>();
@@ -112,12 +122,17 @@ export function setupGoogleMeetPlugin(
}
return options.nodesInvokeResult ?? { launched: true };
});
const runCommandWithTimeout = vi.fn(async (argv: string[]) => {
if (argv[0] === "/usr/sbin/system_profiler") {
return { code: 0, stdout: "BlackHole 2ch", stderr: "" };
}
return { code: 0, stdout: "", stderr: "" };
});
const runCommandWithTimeout = vi.fn(
async (argv: string[], runOptions?: { timeoutMs?: number }) => {
if (options.runCommandWithTimeoutHandler) {
return options.runCommandWithTimeoutHandler(argv, runOptions);
}
if (argv[0] === "/usr/sbin/system_profiler") {
return { code: 0, stdout: "BlackHole 2ch", stderr: "" };
}
return { code: 0, stdout: "", stderr: "" };
},
);
const api = createTestPluginApi({
id: "google-meet",
name: "Google Meet",

@@ -54,27 +54,78 @@ function isGoogleMeetNode(node: GoogleMeetNodeInfo) {
);
}
function matchesRequestedNode(node: GoogleMeetNodeInfo, requested: string): boolean {
return [node.nodeId, node.displayName, node.remoteIp].some((value) => value === requested);
}
function formatNodeLabel(node: GoogleMeetNodeInfo): string {
const parts = [node.displayName, node.nodeId, node.remoteIp].filter(Boolean);
return parts.length > 0 ? parts.join(" / ") : "unknown node";
}
function describeNodeUsabilityIssues(node: GoogleMeetNodeInfo): string[] {
const commands = Array.isArray(node.commands) ? node.commands : [];
const caps = Array.isArray(node.caps) ? node.caps : [];
const issues: string[] = [];
if (node.connected !== true) {
issues.push("offline");
}
if (!commands.includes("googlemeet.chrome")) {
issues.push("missing googlemeet.chrome");
}
if (!commands.includes("browser.proxy") && !caps.includes("browser")) {
issues.push("missing browser.proxy/browser capability");
}
return issues;
}
async function listGoogleMeetNodes(
runtime: PluginRuntime,
params?: { connected?: boolean },
): Promise<{ nodes: GoogleMeetNodeInfo[] }> {
try {
return params ? await runtime.nodes.list(params) : await runtime.nodes.list();
} catch (error) {
throw new Error("Google Meet node inventory unavailable", {
cause: error,
});
}
}
export async function resolveChromeNodeInfo(params: {
runtime: PluginRuntime;
requestedNode?: string;
}): Promise<GoogleMeetNodeInfo> {
const list = await params.runtime.nodes.list({ connected: true });
const requested = params.requestedNode?.trim();
if (requested) {
const list = await listGoogleMeetNodes(params.runtime);
const matches = list.nodes.filter((node) => matchesRequestedNode(node, requested));
if (matches.length === 1) {
const [node] = matches;
if (isGoogleMeetNode(node)) {
return node;
}
throw new Error(
`Configured Google Meet node ${requested} is not usable (${formatNodeLabel(node)}): ${describeNodeUsabilityIssues(node).join("; ")}. Start or reinstall \`openclaw node run\` on that Chrome host, approve pairing, and allow googlemeet.chrome plus browser.proxy.`,
);
}
if (matches.length > 1) {
throw new Error(
`Configured Google Meet node ${requested} is ambiguous (${matches.length} matches). Pin chromeNode.node to a unique node id, display name, or remote IP.`,
);
}
throw new Error(
`Configured Google Meet node ${requested} was not found. Run \`openclaw nodes status\` and start or approve the Chrome node.`,
);
}
const list = await listGoogleMeetNodes(params.runtime, { connected: true });
const nodes = list.nodes.filter(isGoogleMeetNode);
if (nodes.length === 0) {
throw new Error(
"No connected Google Meet-capable node with browser proxy. Run `openclaw node run` on the Chrome host with browser proxy enabled, approve pairing, and allow googlemeet.chrome plus browser.proxy.",
);
}
const requested = params.requestedNode?.trim();
if (requested) {
const matches = nodes.filter((node) =>
[node.nodeId, node.displayName, node.remoteIp].some((value) => value === requested),
);
if (matches.length === 1) {
return matches[0];
}
throw new Error(`Google Meet node not found or ambiguous: ${requested}`);
}
if (nodes.length === 1) {
return nodes[0];
}
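
Note on the "not usable" error above: when a configured node matches exactly one inventory entry but fails the capability check, the message lists every problem joined with `; `. The sketch below reproduces that issue list against the offline node fixture from the google-meet test. `NodeInfo` is a hypothetical stand-in for `GoogleMeetNodeInfo`, whose full definition the diff does not show.

```typescript
// Hypothetical stand-in for GoogleMeetNodeInfo; only fields used in the diff.
type NodeInfo = {
  nodeId?: string;
  displayName?: string;
  remoteIp?: string;
  connected?: boolean;
  caps?: string[];
  commands?: string[];
};

// Same checks as the diff: connectivity, the googlemeet.chrome command, and
// either the browser.proxy command or the browser capability.
function describeNodeUsabilityIssues(node: NodeInfo): string[] {
  const commands = Array.isArray(node.commands) ? node.commands : [];
  const caps = Array.isArray(node.caps) ? node.caps : [];
  const issues: string[] = [];
  if (node.connected !== true) {
    issues.push("offline");
  }
  if (!commands.includes("googlemeet.chrome")) {
    issues.push("missing googlemeet.chrome");
  }
  if (!commands.includes("browser.proxy") && !caps.includes("browser")) {
    issues.push("missing browser.proxy/browser capability");
  }
  return issues;
}

// Fixture copied from the "chrome-node-connected" test case.
const offlineNode: NodeInfo = {
  nodeId: "node-1",
  displayName: "parallels-macos",
  connected: false,
  caps: [],
  commands: [],
  remoteIp: "192.168.0.25",
};

console.log(describeNodeUsabilityIssues(offlineNode).join("; "));
```

Listing all issues at once, instead of stopping at the first, is what lets the test assert on `offline`, `missing googlemeet.chrome`, and the browser capability text in a single message.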

@@ -0,0 +1,331 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import { buildLitellmImageGenerationProvider } from "./image-generation-provider.js";
const {
resolveApiKeyForProviderMock,
postJsonRequestMock,
assertOkOrThrowHttpErrorMock,
resolveProviderHttpRequestConfigMock,
sanitizeConfiguredModelProviderRequestMock,
} = vi.hoisted(() => ({
resolveApiKeyForProviderMock: vi.fn(async () => ({ apiKey: "litellm-key" })),
postJsonRequestMock: vi.fn(),
assertOkOrThrowHttpErrorMock: vi.fn(async () => {}),
resolveProviderHttpRequestConfigMock: vi.fn((params) => ({
baseUrl: params.baseUrl ?? params.defaultBaseUrl,
allowPrivateNetwork: Boolean(params.allowPrivateNetwork ?? params.request?.allowPrivateNetwork),
headers: new Headers(params.defaultHeaders),
dispatcherPolicy: undefined as unknown,
})),
sanitizeConfiguredModelProviderRequestMock: vi.fn((request) => request),
}));
vi.mock("openclaw/plugin-sdk/provider-auth-runtime", () => ({
resolveApiKeyForProvider: resolveApiKeyForProviderMock,
}));
vi.mock("openclaw/plugin-sdk/provider-http", () => ({
assertOkOrThrowHttpError: assertOkOrThrowHttpErrorMock,
postJsonRequest: postJsonRequestMock,
resolveProviderHttpRequestConfig: resolveProviderHttpRequestConfigMock,
sanitizeConfiguredModelProviderRequest: sanitizeConfiguredModelProviderRequestMock,
}));
function mockGeneratedPngResponse() {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({
data: [{ b64_json: Buffer.from("png-bytes").toString("base64") }],
}),
},
release: vi.fn(async () => {}),
});
}
describe("litellm image generation provider", () => {
afterEach(() => {
resolveApiKeyForProviderMock.mockClear();
postJsonRequestMock.mockReset();
assertOkOrThrowHttpErrorMock.mockClear();
resolveProviderHttpRequestConfigMock.mockClear();
sanitizeConfiguredModelProviderRequestMock.mockClear();
});
it("declares litellm id and OpenAI-compatible size hints", () => {
const provider = buildLitellmImageGenerationProvider();
expect(provider.id).toBe("litellm");
expect(provider.label).toBe("LiteLLM");
expect(provider.defaultModel).toBe("gpt-image-2");
expect(provider.capabilities.geometry?.sizes).toEqual(
expect.arrayContaining(["1024x1024", "2048x2048", "3840x2160"]),
);
expect(provider.capabilities.edit?.enabled).toBe(true);
});
it("defaults to the loopback proxy and allows private network for localhost", async () => {
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "Draw a QA lighthouse",
cfg: {},
});
expect(resolveProviderHttpRequestConfigMock).toHaveBeenCalledWith(
expect.objectContaining({
baseUrl: "http://localhost:4000",
allowPrivateNetwork: true,
}),
);
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({
url: "http://localhost:4000/images/generations",
allowPrivateNetwork: true,
}),
);
});
it("honors configured baseUrl and keeps private-network off for public endpoints", async () => {
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "campaign hero",
cfg: {
models: {
providers: {
litellm: {
baseUrl: "https://proxy.example.com/v1",
models: [],
},
},
},
},
});
expect(resolveProviderHttpRequestConfigMock).toHaveBeenCalledWith(
expect.objectContaining({
baseUrl: "https://proxy.example.com/v1",
allowPrivateNetwork: undefined,
}),
);
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({
url: "https://proxy.example.com/v1/images/generations",
allowPrivateNetwork: false,
}),
);
});
it("forwards count and size overrides on generation requests", async () => {
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "dall-e-3",
prompt: "two landscape variants",
cfg: {},
count: 2,
size: "3840x2160",
});
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({
url: "http://localhost:4000/images/generations",
body: {
model: "dall-e-3",
prompt: "two landscape variants",
n: 2,
size: "3840x2160",
},
}),
);
});
it("routes to the edit endpoint when input images are provided", async () => {
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "refine the hero",
cfg: {},
inputImages: [
{
buffer: Buffer.from("fake-input"),
mimeType: "image/png",
},
],
});
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({
url: "http://localhost:4000/images/edits",
}),
);
const call = postJsonRequestMock.mock.calls[0][0] as { body: { images: unknown[] } };
expect(call.body.images).toHaveLength(1);
});
it("throws a clear error when the API key is missing", async () => {
resolveApiKeyForProviderMock.mockResolvedValueOnce({ apiKey: "" });
const provider = buildLitellmImageGenerationProvider();
await expect(
provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "x",
cfg: {},
}),
).rejects.toThrow("LiteLLM API key missing");
});
it("forwards dispatcherPolicy from resolveProviderHttpRequestConfig to postJsonRequest", async () => {
const dispatcherPolicy = { proxyUrl: "http://corp-proxy:3128" } as unknown;
resolveProviderHttpRequestConfigMock.mockReturnValueOnce({
baseUrl: "https://proxy.example.com/v1",
allowPrivateNetwork: false,
headers: new Headers({ Authorization: "Bearer litellm-key" }),
dispatcherPolicy,
});
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "hi",
cfg: {
models: {
providers: {
litellm: { baseUrl: "https://proxy.example.com/v1", models: [] },
},
},
},
});
expect(postJsonRequestMock).toHaveBeenCalledWith(expect.objectContaining({ dispatcherPolicy }));
});
it("auto-allows private network for loopback-style baseUrls", async () => {
const cases = [
"http://localhost:4000",
"http://127.0.0.1:4000",
"http://[::1]:4000",
"http://host.docker.internal:4000",
"https://localhost:4000",
] as const;
for (const baseUrl of cases) {
resolveProviderHttpRequestConfigMock.mockClear();
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "x",
cfg: { models: { providers: { litellm: { baseUrl, models: [] } } } },
});
expect(
resolveProviderHttpRequestConfigMock,
`expected allowPrivateNetwork=true for ${baseUrl}`,
).toHaveBeenCalledWith(expect.objectContaining({ allowPrivateNetwork: true }));
}
});
it("requires explicit private-network opt-in for LAN and internal baseUrls", async () => {
const cases = [
"http://10.0.0.42:4000",
"http://192.168.5.10:4000",
"http://172.16.0.5:4000",
"https://192.168.5.10:4000",
"http://printer.local:4000",
"http://proxy.internal:4000",
"https://metadata.google.internal",
] as const;
for (const baseUrl of cases) {
resolveProviderHttpRequestConfigMock.mockClear();
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "x",
cfg: { models: { providers: { litellm: { baseUrl, models: [] } } } },
});
expect(
resolveProviderHttpRequestConfigMock,
`expected no automatic allowPrivateNetwork for ${baseUrl}`,
).toHaveBeenCalledWith(expect.objectContaining({ allowPrivateNetwork: undefined }));
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({ allowPrivateNetwork: false }),
);
}
});
it("honors explicit private-network opt-in for a LAN LiteLLM proxy", async () => {
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "x",
cfg: {
models: {
providers: {
litellm: {
baseUrl: "http://192.168.5.10:4000",
request: { allowPrivateNetwork: true },
models: [],
},
},
},
},
});
expect(resolveProviderHttpRequestConfigMock).toHaveBeenCalledWith(
expect.objectContaining({
allowPrivateNetwork: undefined,
request: { allowPrivateNetwork: true },
}),
);
expect(postJsonRequestMock).toHaveBeenCalledWith(
expect.objectContaining({ allowPrivateNetwork: true }),
);
});
it("does not allow private network for public hosts that embed private strings in the URL", async () => {
// Must not be fooled by an attacker-controlled URL that mentions
// "host.docker.internal" (or any private-looking literal) in the path,
// query string, or fragment. Only the parsed hostname should count.
const cases = [
"https://evil.example.com/?target=host.docker.internal",
"https://evil.example.com/host.docker.internal/foo",
"https://evil.example.com/redirect?to=127.0.0.1",
"https://public-api.openai.com/v1",
] as const;
for (const baseUrl of cases) {
resolveProviderHttpRequestConfigMock.mockClear();
mockGeneratedPngResponse();
const provider = buildLitellmImageGenerationProvider();
await provider.generateImage({
provider: "litellm",
model: "gpt-image-2",
prompt: "x",
cfg: { models: { providers: { litellm: { baseUrl, models: [] } } } },
});
expect(
resolveProviderHttpRequestConfigMock,
`expected no automatic allowPrivateNetwork for ${baseUrl}`,
).toHaveBeenCalledWith(expect.objectContaining({ allowPrivateNetwork: undefined }));
}
});
});
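
The private-network cases exercised by the tests above can be sketched as one standalone classifier. This is a minimal sketch, not the plugin's code: `parseHttpUrl` and `isLoopbackBaseUrl` are hypothetical names, and the sketch assumes a runtime with the WHATWG `URL` global (Node.js or a browser). It decides from the parsed hostname only, so private-looking strings in the path or query do not count.

```typescript
// Hypothetical helper: parse a base URL and keep it only if it is http(s).
function parseHttpUrl(value: string): URL | null {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:" ? parsed : null;
  } catch {
    // Unparseable input never qualifies for the automatic opt-in.
    return null;
  }
}

// Hypothetical helper: true only for loopback-style hosts.
function isLoopbackBaseUrl(baseUrl: string): boolean {
  const parsed = parseHttpUrl(baseUrl);
  if (!parsed) {
    return false;
  }
  const raw = parsed.hostname.toLowerCase();
  // The URL parser keeps brackets on IPv6 literals: "[::1]" -> "::1".
  const host = raw.startsWith("[") && raw.endsWith("]") ? raw.slice(1, -1) : raw;
  if (host === "localhost" || host.endsWith(".localhost") || host === "host.docker.internal") {
    return true;
  }
  // Whole 127.0.0.0/8 range, matched as a dotted-quad literal only.
  if (/^127(?:\.\d{1,3}){3}$/.test(host)) {
    return true;
  }
  // The parser serializes IPv6 loopback in its compressed form.
  return host === "::1";
}

const samples = [
  "http://localhost:4000",
  "http://[::1]:4000",
  "http://192.168.5.10:4000",
  "https://evil.example.com/?target=host.docker.internal",
];
for (const sample of samples) {
  console.log(sample, isLoopbackBaseUrl(sample));
}
```

LAN and `.internal` hosts fall through to `false` here, which mirrors the explicit opt-in the tests require for them.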


@@ -0,0 +1,220 @@
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import type { ImageGenerationProvider } from "openclaw/plugin-sdk/image-generation";
import { isProviderApiKeyConfigured } from "openclaw/plugin-sdk/provider-auth";
import { resolveApiKeyForProvider } from "openclaw/plugin-sdk/provider-auth-runtime";
import {
assertOkOrThrowHttpError,
postJsonRequest,
resolveProviderHttpRequestConfig,
sanitizeConfiguredModelProviderRequest,
} from "openclaw/plugin-sdk/provider-http";
import { normalizeOptionalString } from "openclaw/plugin-sdk/text-runtime";
import { LITELLM_BASE_URL } from "./onboard.js";
const DEFAULT_OUTPUT_MIME = "image/png";
const DEFAULT_SIZE = "1024x1024";
const DEFAULT_LITELLM_IMAGE_MODEL = "gpt-image-2";
const LITELLM_SUPPORTED_SIZES = [
"256x256",
"512x512",
"1024x1024",
"1024x1536",
"1024x1792",
"1536x1024",
"1792x1024",
"2048x2048",
"2048x1152",
"3840x2160",
"2160x3840",
] as const;
const LITELLM_MAX_INPUT_IMAGES = 5;
type LitellmProviderConfig = NonNullable<
NonNullable<OpenClawConfig["models"]>["providers"]
>[string];
function resolveLitellmProviderConfig(
cfg: OpenClawConfig | undefined,
): LitellmProviderConfig | undefined {
return cfg?.models?.providers?.litellm;
}
function resolveConfiguredLitellmBaseUrl(cfg: OpenClawConfig | undefined): string {
return normalizeOptionalString(resolveLitellmProviderConfig(cfg)?.baseUrl) ?? LITELLM_BASE_URL;
}
// LiteLLM's default proxy is loopback. Auto-enable private-network access only
// for loopback-style hosts; LAN/custom private endpoints should use the
// explicit models.providers.litellm.request.allowPrivateNetwork opt-in.
function isAutoAllowedLitellmHostname(hostname: string): boolean {
if (!hostname) {
return false;
}
// Strip IPv6 brackets if any: "[::1]" -> "::1".
const host =
hostname.startsWith("[") && hostname.endsWith("]") ? hostname.slice(1, -1) : hostname;
const lowered = host.toLowerCase();
if (
lowered === "localhost" ||
lowered === "host.docker.internal" ||
lowered.endsWith(".localhost")
) {
return true;
}
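// Match the 127.0.0.0/8 loopback range as a dotted-quad literal only, so a
// public name such as "127.example.com" is not auto-allowed.
if (/^127(?:\.\d{1,3}){3}$/.test(lowered)) {
return true;
}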
if (lowered === "::1" || lowered === "0:0:0:0:0:0:0:1") {
return true;
}
return false;
}
function shouldAutoAllowPrivateLitellmEndpoint(baseUrl: string): boolean {
try {
const parsed = new URL(baseUrl);
if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
return false;
}
return isAutoAllowedLitellmHostname(parsed.hostname);
} catch {
return false;
}
}
function toDataUrl(buffer: Buffer, mimeType: string): string {
return `data:${mimeType};base64,${buffer.toString("base64")}`;
}
type LitellmImageApiResponse = {
data?: Array<{
b64_json?: string;
revised_prompt?: string;
}>;
};
export function buildLitellmImageGenerationProvider(): ImageGenerationProvider {
return {
id: "litellm",
label: "LiteLLM",
defaultModel: DEFAULT_LITELLM_IMAGE_MODEL,
models: [DEFAULT_LITELLM_IMAGE_MODEL],
isConfigured: ({ agentDir }) =>
isProviderApiKeyConfigured({
provider: "litellm",
agentDir,
}),
capabilities: {
generate: {
maxCount: 4,
supportsSize: true,
supportsAspectRatio: false,
supportsResolution: false,
},
edit: {
enabled: true,
maxCount: 4,
maxInputImages: LITELLM_MAX_INPUT_IMAGES,
supportsSize: true,
supportsAspectRatio: false,
supportsResolution: false,
},
geometry: {
sizes: [...LITELLM_SUPPORTED_SIZES],
},
},
async generateImage(req) {
const inputImages = req.inputImages ?? [];
const isEdit = inputImages.length > 0;
const auth = await resolveApiKeyForProvider({
provider: "litellm",
cfg: req.cfg,
agentDir: req.agentDir,
store: req.authStore,
});
if (!auth.apiKey) {
throw new Error("LiteLLM API key missing");
}
const providerConfig = resolveLitellmProviderConfig(req.cfg);
const resolvedBaseUrl = resolveConfiguredLitellmBaseUrl(req.cfg);
const { baseUrl, allowPrivateNetwork, headers, dispatcherPolicy } =
resolveProviderHttpRequestConfig({
baseUrl: resolvedBaseUrl,
defaultBaseUrl: LITELLM_BASE_URL,
allowPrivateNetwork: shouldAutoAllowPrivateLitellmEndpoint(resolvedBaseUrl)
? true
: undefined,
request: sanitizeConfiguredModelProviderRequest(providerConfig?.request),
defaultHeaders: {
Authorization: `Bearer ${auth.apiKey}`,
},
provider: "litellm",
capability: "image",
transport: "http",
});
const model = req.model || DEFAULT_LITELLM_IMAGE_MODEL;
const count = req.count ?? 1;
const size = req.size ?? DEFAULT_SIZE;
const jsonHeaders = new Headers(headers);
jsonHeaders.set("Content-Type", "application/json");
const endpoint = isEdit ? "images/edits" : "images/generations";
const body = isEdit
? {
model,
prompt: req.prompt,
n: count,
size,
images: inputImages.map((image) => ({
image_url: toDataUrl(image.buffer, image.mimeType?.trim() || DEFAULT_OUTPUT_MIME),
})),
}
: {
model,
prompt: req.prompt,
n: count,
size,
};
const { response, release } = await postJsonRequest({
url: `${baseUrl}/${endpoint}`,
headers: jsonHeaders,
body,
timeoutMs: req.timeoutMs,
fetchFn: fetch,
allowPrivateNetwork,
dispatcherPolicy,
});
try {
await assertOkOrThrowHttpError(
response,
isEdit ? "LiteLLM image edit failed" : "LiteLLM image generation failed",
);
const data = (await response.json()) as LitellmImageApiResponse;
const images = (data.data ?? [])
.map((entry, index) => {
if (!entry.b64_json) {
return null;
}
return Object.assign(
{
buffer: Buffer.from(entry.b64_json, "base64"),
mimeType: DEFAULT_OUTPUT_MIME,
fileName: `image-${index + 1}.png`,
},
entry.revised_prompt ? { revisedPrompt: entry.revised_prompt } : {},
);
})
.filter((entry): entry is NonNullable<typeof entry> => entry !== null);
return {
images,
model,
};
} finally {
await release();
}
},
};
}
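
The response handling at the end of `generateImage` can be illustrated in isolation. The following is a minimal sketch assuming a Node.js runtime (for `Buffer`); `ImageApiEntry`, `DecodedImage`, and `decodeImageEntries` are hypothetical names introduced here, not SDK exports. It shows entries without `b64_json` being skipped while file names keep the entry's original position.

```typescript
// Hypothetical types modelled on the OpenAI-style response shape used above.
type ImageApiEntry = { b64_json?: string; revised_prompt?: string };
type DecodedImage = {
  buffer: Buffer;
  mimeType: string;
  fileName: string;
  revisedPrompt?: string;
};

// Hypothetical helper: decode base64 payloads and drop entries without one.
function decodeImageEntries(entries: ImageApiEntry[]): DecodedImage[] {
  const images: DecodedImage[] = [];
  entries.forEach((entry, index) => {
    if (!entry.b64_json) {
      return; // nothing to decode for this entry
    }
    images.push({
      buffer: Buffer.from(entry.b64_json, "base64"),
      mimeType: "image/png",
      // Numbering follows the position in the response, not in the output.
      fileName: `image-${index + 1}.png`,
      ...(entry.revised_prompt ? { revisedPrompt: entry.revised_prompt } : {}),
    });
  });
  return images;
}

const decoded = decodeImageEntries([
  { b64_json: Buffer.from("png-bytes").toString("base64") },
  {},
  { b64_json: "aGk=", revised_prompt: "hi there" },
]);
console.log(decoded.map((image) => image.fileName));
```

Collecting into an array with `push` avoids the `null` placeholder and type-guard filter, at the cost of a mutable local; either form yields the same images.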

Some files were not shown because too many files have changed in this diff.