Compare commits


59 Commits

Author SHA1 Message Date
Alex Knight
7848b5296e feat(plugins): add bundled plugin policy 2026-04-24 18:55:02 +10:00
Alex Knight
6b62abff43 fix(plugin-sdk): avoid stealing fresh file locks 2026-04-24 12:54:47 +10:00
Alex Knight
366c180915 fix(plugin-sdk): align persistence queue hardening 2026-04-24 12:44:37 +10:00
Alex Knight
0825b75477 fix(plugin-sdk): harden persistent keyed store edge cases 2026-04-24 12:44:08 +10:00
Alex Knight
4de31692ac fix(plugin-sdk): harden persistent keyed store concurrency 2026-04-24 12:43:39 +10:00
Alex Knight
100b41afd6 feat(plugin-sdk): export persistent keyed store 2026-04-24 12:43:09 +10:00
Alex Knight
9b09445510 feat(plugin-sdk): add persistent keyed store 2026-04-24 12:42:28 +10:00
Peter Steinberger
913f97c956 perf: lazy codex app server test imports 2026-04-24 03:42:00 +01:00
Vincent Koc
d3f6783b16 docs(tools): convert perplexity-search params to ParamField 2026-04-23 19:41:09 -07:00
Vincent Koc
5f19e288b1 docs(tools): convert search and web-fetch param tables to ParamField 2026-04-23 19:40:05 -07:00
Vincent Koc
f4b61e7277 docs(help): split testing by extracting live (network-touching) test suites 2026-04-23 19:38:59 -07:00
Vincent Koc
f4d73e1dcd docs(tools): split ACP agents by extracting acpx harness, plugin setup, and permissions 2026-04-23 19:36:19 -07:00
Vincent Koc
743b69d307 docs(tools): split browser docs by extracting control API and CLI reference 2026-04-23 19:34:50 -07:00
Vincent Koc
60d892d700 fix(reply): parse markdown image replies as media
* fix(reply): parse markdown image replies as media

* fix(reply): preserve inline markdown image captions

* fix(reply): harden markdown image parsing
2026-04-23 19:34:30 -07:00
Peter Steinberger
04066d246a feat: add browser realtime talk 2026-04-24 03:33:36 +01:00
Vincent Koc
d42069b11e fix(replay): preserve synthetic tool repair aliases
* fix(replay): preserve synthetic tool repair aliases

* test(replay): cover Bedrock repair ownership
2026-04-23 19:33:05 -07:00
Truffle
a958b6e723 fix(runner): surface provider errors to webchat (#70848)
Surface non-retryable assistant provider failures from the embedded runner instead of letting surface_error fall through to continue_normal.

- Preserve external abort and plain timeout fall-through paths.
- Preserve raw provider error diagnostics on surfaced FailoverError.
- Add regression coverage for billing/auth/rate-limit/null-reason/error fall-through cases.
- Update changelog.

Fixes #70124.
Thanks @truffle-dev.
2026-04-24 03:28:38 +01:00
Peter Steinberger
7c18b765e8 perf: reduce discord provider test imports 2026-04-24 03:27:31 +01:00
Peter Steinberger
dd1576204a docs(changelog): credit bundled runtime deps fix 2026-04-24 03:27:04 +01:00
Peter Steinberger
0daf51d645 fix(plugins): mirror sdk alias for external bundled deps 2026-04-24 03:27:04 +01:00
Peter Steinberger
49c95b31c0 test(e2e): run root-owned gateway logging as appuser 2026-04-24 03:27:04 +01:00
Peter Steinberger
b0244f613e fix(plugins): clean bundled runtime install stage 2026-04-24 03:27:04 +01:00
simonemacario
02a9dd0ddc fix(plugins): stage bundled-plugin runtime-dep install outside the plugin root
When a packaged bundled plugin's `pluginRoot` is used directly as the npm
execution cwd, `npm install <specs>` resolves the plugin's own
`package.json` as the project manifest and fails with
`EUNSUPPORTEDPROTOCOL: Unsupported URL Type "workspace:": workspace:*`
whenever that manifest declares a `workspace:` runtime dep (e.g.
`"@openclaw/plugin-sdk": "workspace:*"`). This takes out every plugin
with any runtime deps at gateway startup.

`ensureBundledPluginRuntimeDeps` already filters `workspace:` specs from
the CLI arguments, but npm's own resolver reads the cwd manifest
regardless, so the filter alone is not enough. The existing isolated
execution-root + `replaceNodeModulesDir` machinery handles this exact
problem for source-checkout + cache-hit installs. This change activates
the same staging path for the packaged case: when `installRoot ===
pluginRoot` and we are not in the source-checkout cache path, stage the
install inside `<pluginRoot>/.openclaw-install-stage` (which has a
minimal generated `package.json`) and move the produced `node_modules/`
back to the plugin root as before.
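
A minimal sketch of the staging decision described above, not the actual `ensureBundledPluginRuntimeDeps` code. The type names, the `usesSourceCheckoutCache` flag, and the `planRuntimeDepInstall` helper are hypothetical; only the `workspace:` filter and the `.openclaw-install-stage` directory name come from this commit message. The non-staged branch is simplified to return `installRoot` unchanged.

```typescript
// Hypothetical shapes for illustration only.
interface RuntimeDepSpec {
  name: string;
  version: string;
}

interface InstallPlanInput {
  installRoot: string;
  pluginRoot: string;
  // Assumption: true when the source-checkout cache path already isolates the install.
  usesSourceCheckoutCache: boolean;
  deps: RuntimeDepSpec[];
}

interface InstallPlan {
  // Directory npm runs in; npm resolves the manifest found here.
  installExecutionRoot: string;
  // Specs passed on the npm command line.
  specs: string[];
  // True when the install is staged below the plugin root.
  stagedInsidePluginRoot: boolean;
}

function planRuntimeDepInstall(input: InstallPlanInput): InstallPlan {
  // workspace: specs cannot be resolved by npm, so they never reach the CLI.
  const specs = input.deps
    .filter((dep) => !/^workspace:/.test(dep.version))
    .map((dep) => `${dep.name}@${dep.version}`);

  // Packaged case: the install root is the plugin root itself, so running npm
  // there would make it read the plugin's own workspace:* manifest.
  const stagedInsidePluginRoot =
    input.installRoot === input.pluginRoot && !input.usesSourceCheckoutCache;

  return {
    installExecutionRoot: stagedInsidePluginRoot
      ? `${input.pluginRoot}/.openclaw-install-stage`
      : input.installRoot,
    specs,
    stagedInsidePluginRoot,
  };
}
```

Filtering the CLI arguments and choosing the execution directory are separate decisions in this sketch, which mirrors the point of the message: the filter alone does not stop npm from reading the manifest in its working directory.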

- Add regression test `stages plugin-root install when the plugin's own
  package.json declares workspace:* deps` covering the Docker scenario
  (mixed `workspace:*` + concrete runtime dep, e.g. anthropic-style
  `@openclaw/plugin-sdk` + `@anthropic-ai/sdk`).
- Update existing plugin-root-install expectations (`installs
  plugin-local runtime deps when one is missing`, `skips workspace-only
  runtime deps before npm install`, `installs deps that are only present
  in the package root`, `does not trust runtime deps that only resolve
  from the package root`, `does not treat sibling extension runtime deps
  as satisfying a plugin`) to assert the new `installExecutionRoot`.

Reported in #70844; same root cause as #70701, #70756, #70773, #70818,
#70839 which see the downstream "Cannot find package 'openclaw' from
plugin-runtime-deps" symptom because their
`resolveBundledRuntimeDependencyInstallRoot` resolves to an external
stage dir (clean manifest) so the install succeeds but the resulting
node_modules tree cannot satisfy the filtered-out workspace packages at
ESM import time.

## AI assistance

This PR was AI-assisted with Claude Code.

Testing degree: fully tested for the touched `bundled-runtime-deps`
install staging surface.

- `pnpm exec vitest run --config test/vitest/vitest.plugins.config.ts src/plugins/bundled-runtime-deps.test.ts` (31/31)
- `pnpm exec vitest run --config test/vitest/vitest.plugins.config.ts src/plugins/` (43/43 across 8 files)
- `pnpm exec tsgo --noEmit -p tsconfig.core.json`, `pnpm exec tsgo --noEmit -p tsconfig.core.test.json` (clean)
- `pnpm exec oxlint src/plugins/bundled-runtime-deps.ts src/plugins/bundled-runtime-deps.test.ts` (0 warnings, 0 errors)
- `node scripts/check-src-extension-import-boundary.mjs --json` and `node scripts/check-sdk-package-extension-import-boundary.mjs --json` (both `[]`)

I understand the code path changed here: packaged bundled plugins now
stage their runtime-dep install one directory below `pluginRoot` so npm
never reads the plugin's `workspace:*`-containing manifest during
install; after install completes, the produced `node_modules/` is moved
back to `pluginRoot` via the existing `replaceNodeModulesDir` helper.

Signed-off-by: Simone Macario <simone@sharly.ai>
2026-04-24 03:27:04 +01:00
Shakker
64ed439ad0 perf: avoid broad models list enumeration (#70883) (thanks @shakkernerd) 2026-04-24 03:25:26 +01:00
Shakker
c93f053f80 test: cover default models list registry narrowing 2026-04-24 03:25:26 +01:00
Shakker
a6a2516cd8 perf: narrow default models list registry loading 2026-04-24 03:25:26 +01:00
Peter Steinberger
f9b33b7d96 fix: disable bundled plugins during Parallels update 2026-04-24 03:23:14 +01:00
Peter Steinberger
a59d1bd46d perf: narrow slack test imports 2026-04-24 03:17:32 +01:00
Peter Steinberger
3aa3551491 test: cover OpenAI server compaction docs 2026-04-24 03:15:47 +01:00
Peter Steinberger
467f839198 docs(plugins): explain google meet audio install 2026-04-24 03:14:25 +01:00
Peter Steinberger
2af88fab6c docs: document local memory embedding provider 2026-04-24 03:11:22 +01:00
Matt Znoj Assist
e069d03945 fix(memory-core): declare local memoryEmbeddingProviders contract (#70873)
Fix standalone memory CLI resolution for the built-in local embedding provider by declaring the memory-core capability contract.

Fixes #70836.
Thanks @mattznojassist.
2026-04-24 03:09:49 +01:00
Peter Steinberger
272bd59e7a docs(plugins): clarify google meet quick start 2026-04-24 03:05:42 +01:00
Peter Steinberger
0c9659b70c feat(plugins): simplify google meet realtime defaults 2026-04-24 03:03:21 +01:00
Peter Steinberger
28299a94ba fix: escape Parallels config scrub script 2026-04-24 03:02:12 +01:00
Peter Steinberger
e41298f501 docs(plugins): expand google meet realtime consult docs 2026-04-24 02:56:25 +01:00
Peter Steinberger
e314190403 feat(plugins): give google meet realtime agent consult 2026-04-24 02:55:43 +01:00
Peter Steinberger
3361593442 perf: reduce feishu monitor import drag 2026-04-24 02:55:09 +01:00
Peter Steinberger
68e2d6f088 fix: use node for Parallels config scrub 2026-04-24 02:50:42 +01:00
Peter Steinberger
903308dbf2 fix: stabilize qa lab mock suite 2026-04-24 02:46:33 +01:00
Peter Steinberger
2779020cbe perf: lazy load browser test server 2026-04-24 02:45:25 +01:00
Peter Steinberger
86f69ba5a0 fix: preserve gateway image refs for text-only models 2026-04-24 02:40:10 +01:00
Peter Steinberger
92a42413df perf: lazy load discord inbound runtimes 2026-04-24 02:36:36 +01:00
Peter Steinberger
1a8a6f8fba feat(ui): steer queued chat messages 2026-04-24 02:35:40 +01:00
Shakker
7dc1aeebbf refactor: split models list row sources (#70867) (thanks @shakkernerd) 2026-04-24 02:34:36 +01:00
Shakker
a606838b4b refactor: plan models list registry loading 2026-04-24 02:34:36 +01:00
Shakker
0af56c8ba6 refactor: split models list row sources 2026-04-24 02:34:36 +01:00
Peter Steinberger
07cb18ca04 fix: scrub future plugin entries in Parallels update smoke 2026-04-24 02:33:21 +01:00
Peter Steinberger
794437a730 ci: keep full install smoke off merge pushes 2026-04-24 02:31:36 +01:00
Peter Steinberger
754acc4478 perf: reduce telegram test import drag 2026-04-24 02:28:38 +01:00
Peter Steinberger
d268c850e6 fix: honor explicit media image model routing 2026-04-24 02:21:30 +01:00
Vincent Koc
c0a7b6a510 fix(plugins): align provider auth metadata 2026-04-23 18:16:20 -07:00
Peter Steinberger
8129ac0f26 docs: add Google Meet changelog entry 2026-04-24 02:15:53 +01:00
Peter Steinberger
e63b16cf46 refactor: centralize realtime voice resolution 2026-04-24 02:15:53 +01:00
Peter Steinberger
09a79bf499 refactor: share realtime voice bridge sessions 2026-04-24 02:15:53 +01:00
Peter Steinberger
15a82d4536 refactor: share provider selection runtime helper 2026-04-24 02:15:53 +01:00
Peter Steinberger
051c543bcb fix: guard Google Meet API fetches 2026-04-24 02:15:53 +01:00
Peter Steinberger
59a8afe6fa feat: add Google Meet participant plugin 2026-04-24 02:15:53 +01:00
Peter Steinberger
e0072ef91a chore: bump version to 2026.4.24 2026-04-24 02:13:50 +01:00
201 changed files with 10192 additions and 1751 deletions

.github/labeler.yml (5 lines changed)

@@ -24,6 +24,11 @@
       - any-glob-to-any-file:
           - "extensions/googlechat/**"
           - "docs/channels/googlechat.md"
+"plugin: google-meet":
+  - changed-files:
+      - any-glob-to-any-file:
+          - "extensions/google-meet/**"
+          - "docs/plugins/google-meet.md"
 "channel: imessage":
   - changed-files:
       - any-glob-to-any-file:

@@ -87,7 +87,7 @@ jobs:
env:
OPENCLAW_CI_DOCS_ONLY: ${{ steps.docs_scope.outputs.docs_only }}
OPENCLAW_CI_EVENT_NAME: ${{ github.event_name }}
-OPENCLAW_CI_FORCE_FULL_INSTALL_SMOKE: ${{ (github.event_name == 'workflow_dispatch' || github.event_name == 'schedule' || github.event_name == 'workflow_call' || github.event_name == 'push') && 'true' || 'false' }}
+OPENCLAW_CI_FORCE_FULL_INSTALL_SMOKE: ${{ (github.event_name == 'workflow_dispatch' || github.event_name == 'schedule' || github.event_name == 'workflow_call') && 'true' || 'false' }}
OPENCLAW_CI_WORKFLOW_BUN_GLOBAL_INSTALL_SMOKE: ${{ inputs.run_bun_global_install_smoke || 'false' }}
OPENCLAW_CI_RUN_FAST_INSTALL_SMOKE: ${{ steps.changed_scope.outputs.run_fast_install_smoke || steps.changed_scope.outputs.run_changed_smoke || 'false' }}
OPENCLAW_CI_RUN_FULL_INSTALL_SMOKE: ${{ steps.changed_scope.outputs.run_full_install_smoke || 'false' }}
@@ -106,10 +106,13 @@ jobs:
run_fast_install_smoke=true
run_full_install_smoke=true
run_install_smoke=true
-elif [ "$docs_only" != "true" ] && [ "$run_changed_full_install_smoke" = "true" ]; then
+elif [ "$docs_only" != "true" ] && [ "$event_name" != "push" ] && [ "$run_changed_full_install_smoke" = "true" ]; then
run_fast_install_smoke=true
run_full_install_smoke=true
run_install_smoke=true
+elif [ "$docs_only" != "true" ] && [ "$run_changed_full_install_smoke" = "true" ]; then
+run_fast_install_smoke=true
+run_install_smoke=true
elif [ "$docs_only" != "true" ] && [ "$run_changed_fast_install_smoke" = "true" ]; then
run_fast_install_smoke=true
run_install_smoke=true
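
The branch order in the hunk above can be restated as a small pure function. This is a sketch only: the real logic is the shell script in the workflow, the `forceFull` flag stands in for the first branch (whose condition the hunk does not show), and the final all-false fallback is an assumption. The type and function names are hypothetical.

```typescript
interface SmokeScopeInput {
  docsOnly: boolean;
  eventName: string;
  // Assumption: represents the first branch, which is outside the visible hunk.
  forceFull: boolean;
  changedFull: boolean;
  changedFast: boolean;
}

interface SmokeScope {
  runFastInstallSmoke: boolean;
  runFullInstallSmoke: boolean;
  runInstallSmoke: boolean;
}

function resolveInstallSmokeScope(input: SmokeScopeInput): SmokeScope {
  const fastAndFull: SmokeScope = {
    runFastInstallSmoke: true,
    runFullInstallSmoke: true,
    runInstallSmoke: true,
  };
  const fastOnly: SmokeScope = {
    runFastInstallSmoke: true,
    runFullInstallSmoke: false,
    runInstallSmoke: true,
  };

  if (input.forceFull) {
    return fastAndFull;
  }
  // Full coverage from changed scope is honored for every event except push.
  if (!input.docsOnly && input.eventName !== "push" && input.changedFull) {
    return fastAndFull;
  }
  // On push, a full-scope change is downgraded to the fast smoke.
  if (!input.docsOnly && input.changedFull) {
    return fastOnly;
  }
  if (!input.docsOnly && input.changedFast) {
    return fastOnly;
  }
  // Assumption: nothing runs when no branch matches.
  return {
    runFastInstallSmoke: false,
    runFullInstallSmoke: false,
    runInstallSmoke: false,
  };
}
```

Because the push check sits in the second branch only, the third branch catches exactly the pushes that would previously have triggered the full path.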

@@ -6,22 +6,37 @@ Docs: https://docs.openclaw.ai
### Changes
- Control UI/chat: add a Steer action on queued messages so a browser follow-up can be injected into the active run without retyping it.
- Control UI/Talk: add browser WebRTC realtime voice sessions backed by OpenAI Realtime, with Gateway-minted ephemeral client secrets and `openclaw_agent_consult` handoff to the full OpenClaw agent.
- Agents/tools: add optional per-call `timeoutMs` support for image, video, music, and TTS generation tools so agents can extend provider request timeouts only when a specific generation needs it.
- Agents/subagents: add optional forked context for native `sessions_spawn` runs so agents can let a child inherit the requester transcript when needed, while keeping clean isolated sessions as the default; includes prompt guidance, context-engine hook metadata, docs, and QA coverage.
- Codex harness: add structured debug logging for embedded harness selection decisions so `/status` stays simple while gateway logs explain auto-selection and Pi fallback reasons. (#70760) Thanks @100yenadmin.
- Dependencies/Pi: update bundled Pi packages to `0.70.0`, use Pi's upstream `gpt-5.5` catalog metadata for OpenAI and OpenAI Codex, and keep only local `gpt-5.5-pro` forward-compat handling.
- Models/CLI: avoid broad registry enumeration for default `openclaw models list`, reducing default listing latency while preserving configured-row output. (#70883) Thanks @shakkernerd.
- Models/CLI: split `openclaw models list` row-source orchestration and registry loading into narrower helpers without changing list output behavior. (#70867) Thanks @shakkernerd.
- Plugins/Google Meet: add a bundled participant plugin with personal Google auth, explicit meeting URL joins, Chrome and Twilio transports, and realtime voice support. (#70765) Thanks @steipete.
- Plugins/Google Meet: default Chrome realtime sessions to OpenAI plus SoX `rec`/`play` audio bridge commands, so the usual setup only needs the plugin enabled and `OPENAI_API_KEY`.
- Providers/OpenAI: add image generation and reference-image editing through Codex OAuth, so `openai/gpt-image-2` works without an `OPENAI_API_KEY`. Fixes #70703.
- Providers/OpenRouter: add image generation and reference-image editing through `image_generate`, so OpenRouter image models work with `OPENROUTER_API_KEY`. Fixes #55066 via #67668. Thanks @notamicrodose.
- Image generation: let agents request provider-supported quality and output format hints, and pass OpenAI-specific background, moderation, compression, and user hints through the `image_generate` tool. (#70503) Thanks @ottodeng.
- Plugins/Google Meet: let realtime Meet sessions consult the full OpenClaw agent for deeper answers while staying in the live voice loop.
### Fixes
- Agents/replay: stop OpenAI/Codex transcript replay from synthesizing missing tool results while still preserving synthetic repair on Anthropic, Gemini, and Bedrock transport-owned sessions. (#61556) Thanks @VictorJeon and @vincentkoc.
- Telegram/media replies: parse remote markdown image syntax into outbound media payloads on the final reply path, so Telegram group chats stop falling back to plain-text image URLs when the model or a tool emits `![...](...)` instead of a `MEDIA:` token. (#66191) Thanks @apezam and @vincentkoc.
- Agents/WebChat: surface non-retryable provider failures such as billing, auth, and rate-limit errors from the embedded runner instead of logging `surface_error` and leaving webchat with no rendered error. Fixes #70124. (#70848) Thanks @truffle-dev.
- Memory/CLI: declare the built-in `local` embedding provider in the memory-core manifest, so standalone `openclaw memory status`, `index`, and `search` can resolve local embeddings just like the gateway runtime. Fixes #70836. (#70873) Thanks @mattznojassist.
- Gateway/WebChat: preserve image attachments for text-only primary models by offloading them as media refs instead of dropping them, so configured image tools can still inspect the original file. Fixes #68513, #44276, #51656, #70212.
- Plugins/Google Meet: hang up delegated Twilio calls on leave, clean up Chrome realtime audio bridges when launch fails, and use a flat provider-safe tool schema.
- Media understanding: honor explicit image-model configuration before native-vision skips, including `agents.defaults.imageModel`, `tools.media.image.models`, and provider image defaults such as MiniMax VL when the active chat model is text-only. Fixes #47614, #63722, #69171.
- Codex/media understanding: support `codex/*` image models through bounded Codex app-server image turns, while keeping `openai-codex/*` on the OpenAI Codex OAuth route and validating app-server responses against generated protocol contracts. Fixes #70201.
- Providers/OpenAI Codex: synthesize the `openai-codex/gpt-5.5` OAuth model row when Codex catalog discovery omits it, so cron and subagent runs do not fail with `Unknown model` while the account is authenticated.
- Models/CLI: keep `openclaw models list` read-only while still showing eligible configured-provider rows, so listing models no longer rewrites per-agent `models.json`. (#70847) Thanks @shakkernerd.
- Providers/Google: honor the private-network SSRF opt-in for Gemini image generation requests, so trusted proxy setups that resolve Google API hosts to private addresses can use `image_generate`. Fixes #67216.
- Agents/transport: stop embedded runs from lowering the process-wide undici stream timeouts, so slow Gemini image generation and other long-running provider requests no longer inherit short run-attempt headers timeouts. Fixes #70423. Thanks @giangthb.
- Providers/OpenRouter: send image-understanding prompts as user text before image parts, restoring non-empty vision responses for OpenRouter multimodal models. Fixes #70410.
- Plugins/providers: mirror runtime auth choices in bundled provider manifests and detect `KIMI_API_KEY` for Moonshot/Kimi web search before plugin runtime loads. Thanks @vincentkoc.
- Memory/QMD: recreate stale managed QMD collections when startup repair finds the collection name already exists, so root memory narrows back to `MEMORY.md` instead of staying on broad workspace markdown indexing.
- Agents/OpenAI: surface selected-model capacity failures from PI, Codex, and auto-reply harness paths with a model-switch hint instead of the generic empty-response error. Thanks @vincentkoc.
- Providers/OpenAI: route `openai/gpt-image-2` through configured Codex OAuth directly when an `openai-codex` profile is active, instead of probing `OPENAI_API_KEY` first.
@@ -55,7 +70,7 @@ Docs: https://docs.openclaw.ai
- WhatsApp/security: keep contact/vCard/location structured-object free text out of the inline message body and render it through fenced untrusted metadata JSON, limiting hidden prompt-injection payloads in names, phone fields, and location labels/comments.
- Group-chat/security: keep channel-sourced group names and participant labels out of inline group system prompts and render them through fenced untrusted metadata JSON.
- Agents/replay: preserve Kimi-style `functions.<name>:<index>` tool-call IDs during strict replay sanitization so custom OpenAI-compatible Kimi routes keep multi-turn tool use intact. (#70693) Thanks @geri4.
-- Plugins/startup: restore bundled plugin `openclaw/plugin-sdk/*` resolution from packaged installs and external runtime-deps stage roots, so Telegram/Discord no longer crash-loop with `Cannot find package 'openclaw'` after missing dependency repair.
+- Plugins/startup: restore bundled plugin `openclaw/plugin-sdk/*` resolution from packaged installs and external runtime-deps stage roots, so Telegram/Discord no longer crash-loop with `Cannot find package 'openclaw'` after missing dependency repair. (#70852) Thanks @simonemacario.
- CLI/Claude: run the same prompt-build hooks and trigger/channel context on `claude-cli` turns as on direct embedded runs, keeping Claude Code sessions aligned with OpenClaw workspace identity, routing, and hook-driven prompt mutations. (#70625) Thanks @mbelinky.
- Discord/plugin startup: keep subagent hooks lazy behind Discord's channel entry so packaged entry imports stay narrow and report import failures with the channel id and entry path.
- Memory/doctor: keep root durable memory canonicalized on `MEMORY.md`, stop treating lowercase `memory.md` as a runtime fallback, and let `openclaw doctor --fix` merge true split-brain root files into `MEMORY.md` with a backup. (#70621) Thanks @mbelinky.

@@ -65,8 +65,8 @@ android {
applicationId = "ai.openclaw.app"
minSdk = 31
targetSdk = 36
-versionCode = 2026042300
-versionName = "2026.4.23"
+versionCode = 2026042400
+versionName = "2026.4.24"
ndk {
// Support all major ABIs — native libs are tiny (~47 KB per ABI)
abiFilters += listOf("armeabi-v7a", "arm64-v8a", "x86", "x86_64")
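
The two values bumped in this hunk appear to follow a date-based pattern: `2026.4.24` pairs with `2026042400`, and `2026.4.23` with `2026042300`. A minimal sketch of that relationship follows, assuming the trailing two digits are a per-day build index. The helper name and the build-index interpretation are assumptions, not something the diff states.

```typescript
// Hypothetical helper: derive a numeric version code from a YYYY.M.D version name.
function toVersionCode(versionName: string, buildIndex: number = 0): number {
  const match = /^(\d{4})\.(\d{1,2})\.(\d{1,2})$/.exec(versionName);
  if (match === null) {
    throw new Error(`Unexpected version name: ${versionName}`);
  }
  if (!Number.isInteger(buildIndex) || buildIndex < 0 || buildIndex > 99) {
    throw new Error(`Build index out of range: ${buildIndex}`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Layout: YYYYMMDDNN, where NN is the assumed build index.
  return year * 1000000 + month * 10000 + day * 100 + buildIndex;
}

console.log(toVersionCode("2026.4.24")); // 2026042400
console.log(toVersionCode("2026.4.23")); // 2026042300
```

Under this layout the code stays below the signed 32-bit limit of 2147483647 that Android imposes on `versionCode` until the year 2147.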

@@ -1,5 +1,9 @@
# OpenClaw iOS Changelog
## 2026.4.24 - 2026-04-24
Maintenance update for the current OpenClaw development release.
## 2026.4.23 - 2026-04-23
Maintenance update for the current OpenClaw development release.

@@ -2,8 +2,8 @@
// Source of truth: apps/ios/version.json
// Generated by scripts/ios-sync-versioning.ts.
-OPENCLAW_IOS_VERSION = 2026.4.23
-OPENCLAW_MARKETING_VERSION = 2026.4.23
+OPENCLAW_IOS_VERSION = 2026.4.24
+OPENCLAW_MARKETING_VERSION = 2026.4.24
OPENCLAW_BUILD_VERSION = 1
#include? "../build/Version.xcconfig"

@@ -1,3 +1,3 @@
{
-"version": "2026.4.23"
+"version": "2026.4.24"
}

@@ -15,9 +15,9 @@
<key>CFBundlePackageType</key>
<string>APPL</string>
<key>CFBundleShortVersionString</key>
-<string>2026.4.23</string>
+<string>2026.4.24</string>
<key>CFBundleVersion</key>
-<string>2026042300</string>
+<string>2026042400</string>
<key>CFBundleIconFile</key>
<string>OpenClaw</string>
<key>CFBundleURLTypes</key>

@@ -1,4 +1,4 @@
-6b142e6a8aa513ccd8f9cfbf7e95fa4919fb6fca7aeaa841f57ad9e39e8901a9 config-baseline.json
-a4e167f169db58d71c385a31fa2b980772f9fee963e70dd9553f63536cae5aed config-baseline.core.json
+0f509ffa75fca5e65afa13f7da09f5f8bd6d084e7f17d67e624c843dde842ba7 config-baseline.json
+1bd5eee830a832d036e01b1ccc8a637e12703940e70b8ac320cc86e85fc4d25f config-baseline.core.json
22d7cd6d8279146b2d79c9531a55b80b52a2c99c81338c508104729154fdd02d config-baseline.channel.json
-a91304e3566ecc8906f199b88a2e38eaee86130aad799bf4d62921e2f0ddc1b5 config-baseline.plugin.json
+de2abdf469e828f30e37a5c60f147d6e3f5c9f116e2b1bb3289c21992d5138ba config-baseline.plugin.json

@@ -1,2 +1,2 @@
-1d2767b688414ac41305e88c830858c00947e2d7c713f1a25d86f38cd577620e plugin-sdk-api-baseline.json
-e5167477ab6aa2e67bd4361048cf5f6f8fd1cb7ee570544c634d14417f890674 plugin-sdk-api-baseline.jsonl
+1b545ffc4f0704382a16765a6cc26eb8da6df3d7ffdd9f489f0ca31399fe9833 plugin-sdk-api-baseline.json
+c192f0a2a827c4f756c22ba15eec90728e1edc4bd4a7d96cd097438242c3885a plugin-sdk-api-baseline.jsonl

@@ -16,7 +16,7 @@ Feishu/Lark is an all-in-one collaboration platform where teams chat, share docu
## Quick start
-> **Requires OpenClaw 2026.4.23 or above.** Run `openclaw --version` to check. Upgrade with `openclaw update`.
+> **Requires OpenClaw 2026.4.24 or above.** Run `openclaw --version` to check. Upgrade with `openclaw update`.
<Steps>
<Step title="Run the channel setup wizard">

@@ -91,7 +91,7 @@ Jobs are ordered so cheap checks fail before expensive ones run:
Scope logic lives in `scripts/ci-changed-scope.mjs` and is covered by unit tests in `src/scripts/ci-changed-scope.test.ts`.
CI workflow edits validate the Node CI graph plus workflow linting, but do not force Windows, Android, or macOS native builds by themselves; those platform lanes stay scoped to platform source changes.
Windows Node checks are scoped to Windows-specific process/path wrappers, npm/pnpm/UI runner helpers, package manager config, and the CI workflow surfaces that execute that lane; unrelated source, plugin, install-smoke, and test-only changes stay on the Linux Node lanes so they do not reserve a 16-vCPU Windows worker for coverage that is already exercised by the normal test shards.
-The separate `install-smoke` workflow reuses the same scope script through its own `preflight` job. It splits smoke coverage into `run_fast_install_smoke` and `run_full_install_smoke`. Pull requests run the fast path for Docker/package surfaces, bundled plugin package/manifest changes, and core plugin/channel/gateway/Plugin SDK surfaces that the Docker smoke jobs exercise. Source-only bundled plugin changes, test-only edits, and docs-only edits do not reserve Docker workers. The fast path builds the root Dockerfile image once, checks the CLI, runs the container gateway-network e2e, verifies a bundled extension build arg, and runs the bounded bundled-plugin Docker profile under a 120-second command timeout. The full path keeps QR package install and installer Docker/update coverage for `main` pushes, nightly scheduled runs, manual dispatches, workflow-call release checks, and true installer/package/Docker changes. The slow Bun global install image-provider smoke is separately gated by `run_bun_global_install_smoke`; it runs on the nightly schedule and from the release checks workflow, and manual `install-smoke` dispatches can opt into it, but pull requests do not run it. QR and installer Docker tests keep their own install-focused Dockerfiles. Local `test:docker:all` prebuilds one shared live-test image and one shared `scripts/e2e/Dockerfile` built-app image, then runs the live/E2E smoke lanes in parallel with `OPENCLAW_SKIP_DOCKER_BUILD=1`; tune the default concurrency of 4 with `OPENCLAW_DOCKER_ALL_PARALLELISM`. The local aggregate stops scheduling new pooled lanes after the first failure by default, and each lane has a 120-minute timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`. Startup- or provider-sensitive lanes run exclusively after the parallel pool.
-The reusable live/E2E workflow mirrors the shared-image pattern by building and pushing one SHA-tagged GHCR Docker E2E image before the Docker matrix, then running the matrix with `OPENCLAW_SKIP_DOCKER_BUILD=1`. The scheduled live/E2E workflow runs the full release-path Docker suite daily. The full bundled update/channel matrix remains manual/full-suite because it performs repeated real npm update and doctor repair passes.
+The separate `install-smoke` workflow reuses the same scope script through its own `preflight` job. It splits smoke coverage into `run_fast_install_smoke` and `run_full_install_smoke`. Pull requests run the fast path for Docker/package surfaces, bundled plugin package/manifest changes, and core plugin/channel/gateway/Plugin SDK surfaces that the Docker smoke jobs exercise. Source-only bundled plugin changes, test-only edits, and docs-only edits do not reserve Docker workers. The fast path builds the root Dockerfile image once, checks the CLI, runs the container gateway-network e2e, verifies a bundled extension build arg, and runs the bounded bundled-plugin Docker profile under a 120-second command timeout. The full path keeps QR package install and installer Docker/update coverage for nightly scheduled runs, manual dispatches, workflow-call release checks, and pull requests that truly touch installer/package/Docker surfaces. `main` pushes, including merge commits, do not force the full path; when changed-scope logic would request full coverage on a push, the workflow keeps the fast Docker smoke and leaves the full install smoke to nightly or release validation. The slow Bun global install image-provider smoke is separately gated by `run_bun_global_install_smoke`; it runs on the nightly schedule and from the release checks workflow, and manual `install-smoke` dispatches can opt into it, but pull requests and `main` pushes do not run it. QR and installer Docker tests keep their own install-focused Dockerfiles. Local `test:docker:all` prebuilds one shared live-test image and one shared `scripts/e2e/Dockerfile` built-app image, then runs the live/E2E smoke lanes in parallel with `OPENCLAW_SKIP_DOCKER_BUILD=1`; tune the default concurrency of 4 with `OPENCLAW_DOCKER_ALL_PARALLELISM`. The local aggregate stops scheduling new pooled lanes after the first failure by default, and each lane has a 120-minute timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`.
+Startup- or provider-sensitive lanes run exclusively after the parallel pool. The reusable live/E2E workflow mirrors the shared-image pattern by building and pushing one SHA-tagged GHCR Docker E2E image before the Docker matrix, then running the matrix with `OPENCLAW_SKIP_DOCKER_BUILD=1`. The scheduled live/E2E workflow runs the full release-path Docker suite daily. The full bundled update/channel matrix remains manual/full-suite because it performs repeated real npm update and doctor repair passes.
Local changed-lane logic lives in `scripts/changed-lanes.mjs` and is executed by `scripts/check-changed.mjs`. That local gate is stricter about architecture boundaries than the broad CI platform scope: core production changes run core prod typecheck plus core tests, core test-only changes run only core test typecheck/tests, extension production changes run extension prod typecheck plus extension tests, and extension test-only changes run only extension test typecheck/tests. Public Plugin SDK or plugin-contract changes expand to extension validation because extensions depend on those core contracts. Release metadata-only version bumps run targeted version/config/root-dependency checks. Unknown root/config changes fail safe to all lanes.
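
The mapping in the paragraph above can be pictured as a lookup from change kind to lanes. The sketch below is illustrative and is not the contents of `scripts/changed-lanes.mjs`: the kind and lane identifiers are hypothetical, and treating Plugin SDK changes as core production lanes plus extension lanes is an assumption drawn from the phrase "expand to extension validation".

```typescript
// Hypothetical identifiers; the real script may name and group these differently.
type ChangeKind =
  | "core-prod"
  | "core-test"
  | "extension-prod"
  | "extension-test"
  | "plugin-sdk-contract"
  | "release-metadata"
  | "unknown";

const CORE_PROD = ["core-prod-typecheck", "core-tests"];
const CORE_TEST = ["core-test-typecheck", "core-tests"];
const EXTENSION_PROD = ["extension-prod-typecheck", "extension-tests"];
const EXTENSION_TEST = ["extension-test-typecheck", "extension-tests"];
const RELEASE_METADATA = ["version-checks", "config-checks", "root-dependency-checks"];

function lanesForChange(kind: ChangeKind): string[] {
  switch (kind) {
    case "core-prod":
      return CORE_PROD.slice();
    case "core-test":
      return CORE_TEST.slice();
    case "extension-prod":
      return EXTENSION_PROD.slice();
    case "extension-test":
      return EXTENSION_TEST.slice();
    case "plugin-sdk-contract":
      // Assumption: contract changes validate core and the extensions that depend on it.
      return CORE_PROD.concat(EXTENSION_PROD);
    case "release-metadata":
      return RELEASE_METADATA.slice();
    default: {
      // Unknown root/config changes fail safe to every lane, without duplicates.
      const all = CORE_PROD.concat(CORE_TEST, EXTENSION_PROD, EXTENSION_TEST, RELEASE_METADATA);
      return all.filter((lane, index) => all.indexOf(lane) === index);
    }
  }
}
```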

@@ -167,8 +167,8 @@ error instead of silently ignoring them.
If you want ACPX-backed sessions to see OpenClaw plugin tools or selected
built-in tools such as `cron`, enable the gateway-side ACPX MCP bridges instead
of trying to pass per-session `mcpServers`. See
-[ACP Agents](/tools/acp-agents#plugin-tools-mcp-bridge) and
-[OpenClaw tools MCP bridge](/tools/acp-agents#openclaw-tools-mcp-bridge).
+[ACP Agents](/tools/acp-agents-setup#plugin-tools-mcp-bridge) and
+[OpenClaw tools MCP bridge](/tools/acp-agents-setup#openclaw-tools-mcp-bridge).
## Use from `acpx` (Codex, Claude, other ACP clients)

@@ -38,6 +38,25 @@ To set a provider explicitly:
Without an embedding provider, only keyword search is available.
To force the built-in local embedding provider, point `local.modelPath` at a
GGUF file:
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",
        fallback: "none",
        local: {
          modelPath: "~/.node-llama-cpp/models/embeddinggemma-300m-qat-Q8_0.gguf",
        },
      },
    },
  },
}
```
## Supported embedding providers
| Provider | ID | Auto-detected | Notes |
@@ -89,6 +108,17 @@ automatic user modeling.
**Memory search disabled?** Check `openclaw memory status`. If no provider is
detected, set one explicitly or add an API key.
**Local provider not detected?** Confirm the local path exists and run:
```bash
openclaw memory status --deep --agent main
openclaw memory index --force --agent main
```
Both standalone CLI commands and the Gateway use the same `local` provider id.
If the provider is set to `auto`, local embeddings are considered first only
when `memorySearch.local.modelPath` points to an existing local file.
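
The `auto` rule above can be sketched as a small predicate. This is an illustration under assumptions, not the memory-core implementation: the function and type names are hypothetical, the file check is injected so the sketch stays self-contained, and any `~` expansion is left to that injected check.

```typescript
// Hypothetical shapes mirroring the `memorySearch` config keys shown in the docs.
interface MemorySearchConfig {
  provider: string;
  local?: { modelPath?: string };
}

type FileExists = (path: string) => boolean;

// Returns true when local embeddings should be tried before other providers.
function shouldConsiderLocalFirst(config: MemorySearchConfig, fileExists: FileExists): boolean {
  if (config.provider === "local") {
    // An explicit choice does not depend on auto-detection.
    return true;
  }
  if (config.provider !== "auto") {
    return false;
  }
  const modelPath = config.local?.modelPath;
  // In auto mode, local is considered first only when the model file is present.
  return modelPath !== undefined && modelPath.length > 0 && fileExists(modelPath);
}
```

In a real caller the predicate would be backed by a filesystem check; passing it in keeps the rule testable without touching disk.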
**Stale results?** Run `openclaw memory index --force` to rebuild. The watcher
may miss changes in rare edge cases.

@@ -207,7 +207,7 @@ refs and write a judged Markdown report:
```bash
pnpm openclaw qa character-eval \
---model openai-codex/gpt-5.5,thinking=xhigh \
+--model openai/gpt-5.4,thinking=medium,fast \
--model openai/gpt-5.2,thinking=xhigh \
--model openai/gpt-5,thinking=xhigh \
--model anthropic/claude-opus-4-6,thinking=high \
@@ -215,7 +215,7 @@ pnpm openclaw qa character-eval \
--model zai/glm-5.1,thinking=high \
--model moonshot/kimi-k2.5,thinking=high \
--model google/gemini-3.1-pro-preview,thinking=high \
--judge-model openai-codex/gpt-5.5,thinking=xhigh,fast \
--judge-model openai/gpt-5.4,thinking=xhigh,fast \
--judge-model anthropic/claude-opus-4-6,thinking=high \
--blind-judge-models \
--concurrency 16 \
@@ -227,13 +227,13 @@ scenarios should set the persona through `SOUL.md`, then run ordinary user turns
such as chat, workspace help, and small file tasks. The candidate model should
not be told that it is being evaluated. The command preserves each full
transcript, records basic run stats, then asks the judge models in fast mode with
`xhigh` reasoning to rank the runs by naturalness, vibe, and humor.
`xhigh` reasoning where supported to rank the runs by naturalness, vibe, and humor.
Use `--blind-judge-models` when comparing providers: the judge prompt still gets
every transcript and run status, but candidate refs are replaced with neutral
labels such as `candidate-01`; the report maps rankings back to real refs after
parsing.
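A minimal sketch of the blinding step, assuming labels are assigned in input order (the real command may order or shuffle candidates differently). `blindLabels` and `unblindRanking` are hypothetical helper names; only the `candidate-01` label pattern comes from the text above.

```typescript
// Hypothetical sketch: hide candidate refs from the judges, then restore them.
function blindLabels(refs: string[]): Map<string, string> {
  const labelToRef = new Map<string, string>();
  refs.forEach((ref, index) => {
    // Labels follow the `candidate-01` pattern: 1-based, zero-padded to two digits.
    const label = `candidate-${String(index + 1).padStart(2, "0")}`;
    labelToRef.set(label, ref);
  });
  return labelToRef;
}

// Map a judge's ranking of neutral labels back to the real model refs.
function unblindRanking(ranking: string[], labelToRef: Map<string, string>): string[] {
  return ranking.map((label) => {
    const ref = labelToRef.get(label);
    if (ref === undefined) throw new Error(`unknown label: ${label}`);
    return ref;
  });
}
```

Throwing on an unknown label is a design choice for the sketch: a judge reply that invents a label should fail loudly rather than silently drop a candidate.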
Candidate runs default to `high` thinking, with `xhigh` for OpenAI models that
support it. Override a specific candidate inline with
Candidate runs default to `high` thinking, with `medium` for GPT-5.4 and `xhigh`
for older OpenAI eval refs that support it. Override a specific candidate inline with
`--model provider/model,thinking=<level>`. `--thinking <level>` still sets a
global fallback, and the older `--model-thinking <provider/model=level>` form is
kept for compatibility.
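The inline `--model provider/model,thinking=<level>` form, including the bare `fast` flag used in the command above, could be parsed along these lines. `parseModelSpec` and the `ModelSpec` type are hypothetical, and rejecting unknown options is an assumption made for the sketch.

```typescript
// Hypothetical sketch of parsing an inline model spec such as
// "openai/gpt-5.4,thinking=medium,fast".
type ModelSpec = { ref: string; thinking?: string; fast: boolean };

function parseModelSpec(spec: string): ModelSpec {
  const [ref = "", ...options] = spec.split(",").map((part) => part.trim());
  if (!ref.includes("/")) throw new Error(`expected provider/model, got "${ref}"`);

  const parsed: ModelSpec = { ref, fast: false };
  for (const option of options) {
    if (option === "fast") {
      parsed.fast = true;
    } else if (option.startsWith("thinking=")) {
      parsed.thinking = option.slice("thinking=".length);
    } else {
      // Assumption: unknown options are rejected rather than ignored.
      throw new Error(`unknown option: ${option}`);
    }
  }
  return parsed;
}
```

A spec without `thinking=` leaves `thinking` unset, so a caller can fall back to the global `--thinking` level.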
@@ -247,12 +247,12 @@ Candidate and judge model runs both default to concurrency 16. Lower
`--concurrency` or `--judge-concurrency` when provider limits or local gateway
pressure make a run too noisy.
When no candidate `--model` is passed, the character eval defaults to
`openai-codex/gpt-5.5`, `openai/gpt-5.4`, `openai/gpt-5.2`, `anthropic/claude-opus-4-6`,
`openai/gpt-5.4`, `openai/gpt-5.2`, `openai/gpt-5`, `anthropic/claude-opus-4-6`,
`anthropic/claude-sonnet-4-6`, `zai/glm-5.1`,
`moonshot/kimi-k2.5`, and
`google/gemini-3.1-pro-preview`.
When no `--judge-model` is passed, the judges default to
`openai-codex/gpt-5.5,thinking=xhigh,fast` and
`openai/gpt-5.4,thinking=xhigh,fast` and
`anthropic/claude-opus-4-6,thinking=high`.
## Related docs


@@ -1130,6 +1130,7 @@
"plugins/community",
"plugins/bundles",
"plugins/codex-harness",
"plugins/google-meet",
"plugins/webhooks",
"plugins/voice-call",
"plugins/memory-wiki",
@@ -1208,6 +1209,7 @@
"group": "Web Browser",
"pages": [
"tools/browser",
"tools/browser-control",
"tools/browser-login",
"tools/browser-linux-troubleshooting",
"tools/browser-wsl2-windows-remote-cdp-troubleshooting"
@@ -1240,6 +1242,7 @@
"tools/agent-send",
"tools/subagents",
"tools/acp-agents",
"tools/acp-agents-setup",
"tools/multi-agent-sandbox-tools"
]
}
@@ -1613,6 +1616,7 @@
"help/environment",
"help/debugging",
"help/testing",
"help/testing-live",
"help/scripts",
"debug/node-issue",
"diagnostics/flags"


@@ -1235,9 +1235,10 @@ Time format in system prompt. Default: `auto` (OS preference).
- `verboseDefault`: default verbose level for agents. Values: `"off"`, `"on"`, `"full"`. Default: `"off"`.
- `elevatedDefault`: default elevated-output level for agents. Values: `"off"`, `"on"`, `"ask"`, `"full"`. Default: `"on"`.
- `model.primary`: format `provider/model` (e.g. `openai/gpt-5.4` for API-key access or `openai-codex/gpt-5.5` for Codex OAuth). If you omit the provider, OpenClaw tries an alias first, then a unique configured-provider match for that exact model id, and only then falls back to the configured default provider (deprecated compatibility behavior, so prefer explicit `provider/model`). If that provider no longer exposes the configured default model, OpenClaw falls back to the first configured provider/model instead of surfacing a stale removed-provider default.
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific, for example `temperature`, `maxTokens`, `cacheRetention`, `context1m`).
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific, for example `temperature`, `maxTokens`, `cacheRetention`, `context1m`, `responsesServerCompaction`, `responsesCompactThreshold`).
- Safe edits: use `openclaw config set agents.defaults.models '<json>' --strict-json --merge` to add entries. `config set` refuses replacements that would remove existing allowlist entries unless you pass `--replace`.
- Provider-scoped configure/onboarding flows merge selected provider models into this map and preserve unrelated providers already configured.
- For direct OpenAI Responses models, server-side compaction is enabled automatically. Use `params.responsesServerCompaction: false` to stop injecting `context_management`, or `params.responsesCompactThreshold` to override the threshold. See [OpenAI server-side compaction](/providers/openai#server-side-compaction-responses-api).
- `params`: global default provider parameters applied to all models. Set at `agents.defaults.params` (e.g. `{ cacheRetention: "long" }`).
- `params` merge precedence (config): `agents.defaults.params` (global base) is overridden by `agents.defaults.models["provider/model"].params` (per-model), then `agents.list[].params` (matching agent id) overrides by key. See [Prompt Caching](/reference/prompt-caching) for details.
- `embeddedHarness`: default low-level embedded agent runtime policy. Use `runtime: "auto"` to let registered plugin harnesses claim supported models, `runtime: "pi"` to force the built-in PI harness, or a registered harness id such as `runtime: "codex"`. Set `fallback: "none"` to disable automatic PI fallback.
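The `params` merge precedence listed above can be illustrated with a small sketch. It assumes a shallow, key-by-key merge (the wording "overrides by key" suggests this, but nested behavior is not specified here), and `resolveParams` is a hypothetical name.

```typescript
// Hypothetical sketch of the params merge precedence, assuming a shallow merge.
type Params = Record<string, unknown>;

function resolveParams(
  globalParams: Params, // agents.defaults.params (global base)
  perModelParams: Params, // agents.defaults.models["provider/model"].params
  agentParams: Params, // agents.list[].params for the matching agent id
): Params {
  // Later spreads win key by key: global < per-model < per-agent.
  return { ...globalParams, ...perModelParams, ...agentParams };
}

const merged = resolveParams(
  { cacheRetention: "long", temperature: 0.2 },
  { temperature: 0.7 },
  { maxTokens: 1024 },
);
// merged is { cacheRetention: "long", temperature: 0.7, maxTokens: 1024 }
```

Keys that only appear at a lower layer, such as `cacheRetention` here, survive untouched.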

495
docs/help/testing-live.md Normal file

@@ -0,0 +1,495 @@
---
summary: "Live (network-touching) tests: model matrix, CLI backends, ACP, media providers, credentials"
read_when:
- Running live model matrix / CLI backend / ACP / media-provider smokes
- Debugging live-test credential resolution
- Adding a new provider-specific live test
title: "Testing — live suites"
---
For quick start, QA runners, unit/integration suites, and Docker flows, see
[Testing](/help/testing). This page covers the **live** (network-touching) test
suites: model matrix, CLI backends, ACP, and media-provider live tests, plus
credential handling.
## Live: Android node capability sweep
- Test: `src/gateway/android-node.capabilities.live.test.ts`
- Script: `pnpm android:test:integration`
- Goal: invoke **every command currently advertised** by a connected Android node and assert command contract behavior.
- Scope:
- Preconditioned/manual setup (the suite does not install/run/pair the app).
- Command-by-command gateway `node.invoke` validation for the selected Android node.
- Required pre-setup:
- Android app already connected + paired to the gateway.
- App kept in foreground.
- Permissions/capture consent granted for capabilities you expect to pass.
- Optional target overrides:
- `OPENCLAW_ANDROID_NODE_ID` or `OPENCLAW_ANDROID_NODE_NAME`.
- `OPENCLAW_ANDROID_GATEWAY_URL` / `OPENCLAW_ANDROID_GATEWAY_TOKEN` / `OPENCLAW_ANDROID_GATEWAY_PASSWORD`.
- Full Android setup details: [Android App](/platforms/android)
## Live: model smoke (profile keys)
Live tests are split into two layers so we can isolate failures:
- “Direct model” tells us the provider/model can answer at all with the given key.
- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
### Layer 1: Direct model completion (no gateway)
- Test: `src/agents/models.profiles.live.test.ts`
- Goal:
- Enumerate discovered models
- Use `getApiKeyForModel` to select models you have creds for
- Run a small completion per model (and targeted regressions where needed)
- How to enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- Set `OPENCLAW_LIVE_MODELS=modern` (or `all`, alias for modern) to actually run this suite; otherwise it skips to keep `pnpm test:live` focused on gateway smoke
- How to select models:
- `OPENCLAW_LIVE_MODELS=modern` to run the modern allowlist (Opus/Sonnet 4.6+, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.7, Grok 4)
- `OPENCLAW_LIVE_MODELS=all` is an alias for the modern allowlist
- or `OPENCLAW_LIVE_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,..."` (comma allowlist)
- Modern/all sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
- How to select providers:
- `OPENCLAW_LIVE_PROVIDERS="google,google-antigravity,google-gemini-cli"` (comma allowlist)
- Where keys come from:
- By default: profile store and env fallbacks
- Set `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to enforce **profile store** only
- Why this exists:
- Separates “provider API is broken / key is invalid” from “gateway agent pipeline is broken”
- Contains small, isolated regressions (example: OpenAI Responses/Codex Responses reasoning replay + tool-call flows)
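The selection rules above (skip unless set, `modern`/`all` as aliases, a comma allowlist, and the `OPENCLAW_LIVE_MAX_MODELS` cap) can be sketched as a pure function. This is an assumption-laden illustration: `selectLiveModels`, `modernAllowlist`, and `defaultCap` are stand-ins, and the real suite's curated cap and ordering are not shown here.

```typescript
// Hypothetical sketch of OPENCLAW_LIVE_MODELS / OPENCLAW_LIVE_MAX_MODELS handling.
type LiveModelEnv = {
  OPENCLAW_LIVE_MODELS?: string;
  OPENCLAW_LIVE_MAX_MODELS?: string;
};

function selectLiveModels(
  discovered: string[], // models you have credentials for
  modernAllowlist: string[], // stand-in for the curated modern set
  env: LiveModelEnv,
  defaultCap: number, // stand-in for the curated high-signal cap
): string[] {
  const raw = (env.OPENCLAW_LIVE_MODELS ?? "").trim();
  if (raw === "") return []; // the suite skips unless a selection is set

  if (raw === "modern" || raw === "all") {
    const pool = discovered.filter((id) => modernAllowlist.includes(id));
    const capRaw = env.OPENCLAW_LIVE_MAX_MODELS;
    const parsed = capRaw === undefined || capRaw.trim() === "" ? defaultCap : Number(capRaw);
    // Assumption: invalid or negative caps fall back to the default cap.
    const cap = Number.isInteger(parsed) && parsed >= 0 ? parsed : defaultCap;
    return cap === 0 ? pool : pool.slice(0, cap); // 0 means exhaustive
  }

  // Anything else is treated as a comma-separated allowlist of provider/model ids.
  const wanted = raw.split(",").map((s) => s.trim()).filter((s) => s !== "");
  return discovered.filter((id) => wanted.includes(id));
}
```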
### Layer 2: Gateway + dev agent smoke (what "@openclaw" actually does)
- Test: `src/gateway/gateway-models.profiles.live.test.ts`
- Goal:
- Spin up an in-process gateway
- Create/patch an `agent:dev:*` session (model override per run)
- Iterate models-with-keys and assert:
- “meaningful” response (no tools)
- a real tool invocation works (read probe)
- optional extra tool probes (exec+read probe)
- OpenAI regression paths (tool-call-only → follow-up) keep working
- Probe details (so you can explain failures quickly):
- `read` probe: the test writes a nonce file in the workspace and asks the agent to `read` it and echo the nonce back.
- `exec+read` probe: the test asks the agent to `exec`-write a nonce into a temp file, then `read` it back.
- image probe: the test attaches a generated PNG (cat + randomized code) and expects the model to return `cat <CODE>`.
- Implementation reference: `src/gateway/gateway-models.profiles.live.test.ts` and `src/gateway/live-image-probe.ts`.
- How to enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- How to select models:
- Default: modern allowlist (Opus/Sonnet 4.6+, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.7, Grok 4)
- `OPENCLAW_LIVE_GATEWAY_MODELS=all` is an alias for the modern allowlist
- Or set `OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"` (or comma list) to narrow
- Modern/all gateway sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_GATEWAY_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
- How to select providers (avoid “OpenRouter everything”):
- `OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"` (comma allowlist)
- Tool + image probes are always on in this live test:
- `read` probe + `exec+read` probe (tool stress)
- image probe runs when the model advertises image input support
- Flow (high level):
- Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`)
- Sends it via `agent` `attachments: [{ mimeType: "image/png", content: "<base64>" }]`
- Gateway parses attachments into `images[]` (`src/gateway/server-methods/agent.ts` + `src/gateway/chat-attachments.ts`)
- Embedded agent forwards a multimodal user message to the model
- Assertion: reply contains `cat` + the code (OCR tolerance: minor mistakes allowed)
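One way to express an assertion with "minor mistakes allowed" is an edit-distance check on the code token. The sketch below is an assumption, not the logic in `src/gateway/live-image-probe.ts`: `imageProbePasses`, `editDistance`, and the `maxMistakes` default of 1 are all hypothetical.

```typescript
// Hypothetical sketch of a tolerant image-probe assertion.

// Single-row Levenshtein distance between two strings.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prevDiag = row[0]!; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]!; // distance for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1]! + 1, prevDiag + cost);
      prevDiag = above;
    }
  }
  return row[b.length]!;
}

// Pass when the reply mentions "cat" and some token is within
// `maxMistakes` edits of the expected code (case-insensitive).
function imageProbePasses(reply: string, code: string, maxMistakes = 1): boolean {
  const tokens = reply.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t !== "");
  if (!tokens.includes("cat")) return false;
  const target = code.toLowerCase();
  return tokens.some((token) => editDistance(token, target) <= maxMistakes);
}
```

The sketch assumes codes are long enough that the word `cat` itself cannot fall within the tolerance of the code.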
Tip: to see what you can test on your machine (and the exact `provider/model` ids), run:
```bash
openclaw models list
openclaw models list --json
```
## Live: CLI backend smoke (Claude, Codex, Gemini, or other local CLIs)
- Test: `src/gateway/gateway-cli-backend.live.test.ts`
- Goal: validate the Gateway + agent pipeline using a local CLI backend, without touching your default config.
- Backend-specific smoke defaults live with the owning extension's `cli-backend.ts` definition.
- Enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- `OPENCLAW_LIVE_CLI_BACKEND=1`
- Defaults:
- Default provider/model: `claude-cli/claude-sonnet-4-6`
- Command/args/image behavior come from the owning CLI backend plugin metadata.
- Overrides (optional):
- `OPENCLAW_LIVE_CLI_BACKEND_MODEL="codex-cli/gpt-5.5"`
- `OPENCLAW_LIVE_CLI_BACKEND_COMMAND="/full/path/to/codex"`
- `OPENCLAW_LIVE_CLI_BACKEND_ARGS='["exec","--json","--color","never","--sandbox","read-only","--skip-git-repo-check"]'`
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_PROBE=1` to send a real image attachment (paths are injected into the prompt).
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_ARG="--image"` to pass image file paths as CLI args instead of prompt injection.
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_MODE="repeat"` (or `"list"`) to control how image args are passed when `IMAGE_ARG` is set.
- `OPENCLAW_LIVE_CLI_BACKEND_RESUME_PROBE=1` to send a second turn and validate resume flow.
- `OPENCLAW_LIVE_CLI_BACKEND_MODEL_SWITCH_PROBE=0` to disable the default Claude Sonnet -> Opus same-session continuity probe (set to `1` to force it on when the selected model supports a switch target).
Example:
```bash
OPENCLAW_LIVE_CLI_BACKEND=1 \
OPENCLAW_LIVE_CLI_BACKEND_MODEL="codex-cli/gpt-5.5" \
pnpm test:live src/gateway/gateway-cli-backend.live.test.ts
```
Docker recipe:
```bash
pnpm test:docker:live-cli-backend
```
Single-provider Docker recipes:
```bash
pnpm test:docker:live-cli-backend:claude
pnpm test:docker:live-cli-backend:claude-subscription
pnpm test:docker:live-cli-backend:codex
pnpm test:docker:live-cli-backend:gemini
```
Notes:
- The Docker runner lives at `scripts/test-live-cli-backend-docker.sh`.
- It runs the live CLI-backend smoke inside the repo Docker image as the non-root `node` user.
- It resolves CLI smoke metadata from the owning extension, then installs the matching Linux CLI package (`@anthropic-ai/claude-code`, `@openai/codex`, or `@google/gemini-cli`) into a cached writable prefix at `OPENCLAW_DOCKER_CLI_TOOLS_DIR` (default: `~/.cache/openclaw/docker-cli-tools`).
- `pnpm test:docker:live-cli-backend:claude-subscription` requires portable Claude Code subscription OAuth through either `~/.claude/.credentials.json` with `claudeAiOauth.subscriptionType` or `CLAUDE_CODE_OAUTH_TOKEN` from `claude setup-token`. It first proves direct `claude -p` in Docker, then runs two Gateway CLI-backend turns without preserving Anthropic API-key env vars. This subscription lane disables the Claude MCP/tool and image probes by default because Claude currently routes third-party app usage through extra-usage billing instead of normal subscription plan limits.
- The live CLI-backend smoke now exercises the same end-to-end flow for Claude, Codex, and Gemini: text turn, image classification turn, then MCP `cron` tool call verified through the gateway CLI.
- Claude's default smoke also patches the session from Sonnet to Opus and verifies the resumed session still remembers an earlier note.
## Live: ACP bind smoke (`/acp spawn ... --bind here`)
- Test: `src/gateway/gateway-acp-bind.live.test.ts`
- Goal: validate the real ACP conversation-bind flow with a live ACP agent:
- send `/acp spawn <agent> --bind here`
- bind a synthetic message-channel conversation in place
- send a normal follow-up on that same conversation
- verify the follow-up lands in the bound ACP session transcript
- Enable:
- `pnpm test:live src/gateway/gateway-acp-bind.live.test.ts`
- `OPENCLAW_LIVE_ACP_BIND=1`
- Defaults:
- ACP agents in Docker: `claude,codex,gemini`
- ACP agent for direct `pnpm test:live ...`: `claude`
- Synthetic channel: Slack DM-style conversation context
- ACP backend: `acpx`
- Overrides:
- `OPENCLAW_LIVE_ACP_BIND_AGENT=claude`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=codex`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude,codex,gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND='npx -y @agentclientprotocol/claude-agent-acp@<version>'`
- `OPENCLAW_LIVE_ACP_BIND_CODEX_MODEL=gpt-5.5`
- `OPENCLAW_LIVE_ACP_BIND_PARENT_MODEL=openai/gpt-5.4`
- Notes:
- This lane uses the gateway `chat.send` surface with admin-only synthetic originating-route fields so tests can attach message-channel context without pretending to deliver externally.
- When `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND` is unset, the test uses the embedded `acpx` plugin's built-in agent registry for the selected ACP harness agent.
Example:
```bash
OPENCLAW_LIVE_ACP_BIND=1 \
OPENCLAW_LIVE_ACP_BIND_AGENT=claude \
pnpm test:live src/gateway/gateway-acp-bind.live.test.ts
```
Docker recipe:
```bash
pnpm test:docker:live-acp-bind
```
Single-agent Docker recipes:
```bash
pnpm test:docker:live-acp-bind:claude
pnpm test:docker:live-acp-bind:codex
pnpm test:docker:live-acp-bind:gemini
```
Docker notes:
- The Docker runner lives at `scripts/test-live-acp-bind-docker.sh`.
- By default, it runs the ACP bind smoke against all supported live CLI agents in sequence: `claude`, `codex`, then `gemini`.
- Use `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude`, `OPENCLAW_LIVE_ACP_BIND_AGENTS=codex`, or `OPENCLAW_LIVE_ACP_BIND_AGENTS=gemini` to narrow the matrix.
- It sources `~/.profile`, stages the matching CLI auth material into the container, installs `acpx` into a writable npm prefix, then installs the requested live CLI (`@anthropic-ai/claude-code`, `@openai/codex`, or `@google/gemini-cli`) if missing.
- Inside Docker, the runner sets `OPENCLAW_LIVE_ACP_BIND_ACPX_COMMAND=$HOME/.npm-global/bin/acpx` so acpx keeps provider env vars from the sourced profile available to the child harness CLI.
## Live: Codex app-server harness smoke
- Goal: validate the plugin-owned Codex harness through the normal gateway
`agent` method:
- load the bundled `codex` plugin
- select `OPENCLAW_AGENT_RUNTIME=codex`
- send a first gateway agent turn to `openai/gpt-5.4` with the Codex harness forced
- send a second turn to the same OpenClaw session and verify the app-server
thread can resume
- run `/codex status` and `/codex models` through the same gateway command
path
- optionally run two Guardian-reviewed escalated shell probes: one benign
command that should be approved and one fake-secret upload that should be
denied so the agent asks back
- Test: `src/gateway/gateway-codex-harness.live.test.ts`
- Enable: `OPENCLAW_LIVE_CODEX_HARNESS=1`
- Default model: `openai/gpt-5.4`
- Optional image probe: `OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=1`
- Optional MCP/tool probe: `OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=1`
- Optional Guardian probe: `OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=1`
- The smoke sets `OPENCLAW_AGENT_HARNESS_FALLBACK=none` so a broken Codex
harness cannot pass by silently falling back to PI.
- Auth: Codex app-server auth from the local Codex subscription login. Docker
smokes can also provide `OPENAI_API_KEY` for non-Codex probes when applicable,
plus optional copied `~/.codex/auth.json` and `~/.codex/config.toml`.
Local recipe:
```bash
source ~/.profile
OPENCLAW_LIVE_CODEX_HARNESS=1 \
OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_MODEL=openai/gpt-5.4 \
pnpm test:live -- src/gateway/gateway-codex-harness.live.test.ts
```
Docker recipe:
```bash
source ~/.profile
pnpm test:docker:live-codex-harness
```
Docker notes:
- The Docker runner lives at `scripts/test-live-codex-harness-docker.sh`.
- It sources the mounted `~/.profile`, passes `OPENAI_API_KEY`, copies Codex CLI
auth files when present, installs `@openai/codex` into a writable mounted npm
prefix, stages the source tree, then runs only the Codex-harness live test.
- Docker enables the image, MCP/tool, and Guardian probes by default. Set
`OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=0` or
`OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=0` or
`OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=0` when you need a narrower debug
run.
- Docker also exports `OPENCLAW_AGENT_HARNESS_FALLBACK=none`, matching the live
test config so legacy aliases or PI fallback cannot hide a Codex harness
regression.
### Recommended live recipes
Narrow, explicit allowlists are fastest and least flaky:
- Single model, direct (no gateway):
- `OPENCLAW_LIVE_MODELS="openai/gpt-5.4" pnpm test:live src/agents/models.profiles.live.test.ts`
- Single model, gateway smoke:
- `OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Tool calling across several providers:
- `OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,google/gemini-3-flash-preview,zai/glm-4.7,minimax/MiniMax-M2.7" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Google focus (Gemini API key + Antigravity):
- Gemini (API key): `OPENCLAW_LIVE_GATEWAY_MODELS="google/gemini-3-flash-preview" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Antigravity (OAuth): `OPENCLAW_LIVE_GATEWAY_MODELS="google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-pro-high" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
Notes:
- `google/...` uses the Gemini API (API key).
- `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).
- `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks).
- Gemini API vs Gemini CLI:
- API: OpenClaw calls Google's hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by “Gemini”.
- CLI: OpenClaw shells out to a local `gemini` binary; it has its own auth and can behave differently (streaming/tool support/version skew).
## Live: model matrix (what we cover)
There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
### Modern smoke set (tool calling + image)
This is the “common models” run we expect to keep working:
- OpenAI (non-Codex): `openai/gpt-5.4` (optional: `openai/gpt-5.4-mini`)
- OpenAI Codex OAuth: `openai-codex/gpt-5.5`
- Anthropic: `anthropic/claude-opus-4-6` (or `anthropic/claude-sonnet-4-6`)
- Google (Gemini API): `google/gemini-3.1-pro-preview` and `google/gemini-3-flash-preview` (avoid older Gemini 2.x models)
- Google (Antigravity): `google-antigravity/claude-opus-4-6-thinking` and `google-antigravity/gemini-3-flash`
- Z.AI (GLM): `zai/glm-4.7`
- MiniMax: `minimax/MiniMax-M2.7`
Run gateway smoke with tools + image:
`OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,google/gemini-3.1-pro-preview,google/gemini-3-flash-preview,google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-flash,zai/glm-4.7,minimax/MiniMax-M2.7" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
### Baseline: tool calling (Read + optional Exec)
Pick at least one per provider family:
- OpenAI: `openai/gpt-5.4` (or `openai/gpt-5.4-mini`)
- Anthropic: `anthropic/claude-opus-4-6` (or `anthropic/claude-sonnet-4-6`)
- Google: `google/gemini-3-flash-preview` (or `google/gemini-3.1-pro-preview`)
- Z.AI (GLM): `zai/glm-4.7`
- MiniMax: `minimax/MiniMax-M2.7`
Optional additional coverage (nice to have):
- xAI: `xai/grok-4` (or latest available)
- Mistral: `mistral/`… (pick one “tools” capable model you have enabled)
- Cerebras: `cerebras/`… (if you have access)
- LM Studio: `lmstudio/`… (local; tool calling depends on API mode)
### Vision: image send (attachment → multimodal message)
Include at least one image-capable model in `OPENCLAW_LIVE_GATEWAY_MODELS` (Claude/Gemini/OpenAI vision-capable variants, etc.) to exercise the image probe.
### Aggregators / alternate gateways
If you have keys enabled, we also support testing via:
- OpenRouter: `openrouter/...` (hundreds of models; use `openclaw models scan` to find tool+image capable candidates)
- OpenCode: `opencode/...` for Zen and `opencode-go/...` for Go (auth via `OPENCODE_API_KEY` / `OPENCODE_ZEN_API_KEY`)
More providers you can include in the live matrix (if you have creds/config):
- Built-in: `openai`, `openai-codex`, `anthropic`, `google`, `google-vertex`, `google-antigravity`, `google-gemini-cli`, `zai`, `openrouter`, `opencode`, `opencode-go`, `xai`, `groq`, `cerebras`, `mistral`, `github-copilot`
- Via `models.providers` (custom endpoints): `minimax` (cloud/API), plus any OpenAI/Anthropic-compatible proxy (LM Studio, vLLM, LiteLLM, etc.)
Tip: don't try to hardcode “all models” in docs. The authoritative list is whatever `discoverModels(...)` returns on your machine + whatever keys are available.
## Credentials (never commit)
Live tests discover credentials the same way the CLI does. Practical implications:
- If the CLI works, live tests should find the same keys.
- If a live test says “no creds”, debug the same way you'd debug `openclaw models list` / model selection.
- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what “profile keys” means in the live tests)
- Config: `~/.openclaw/openclaw.json` (or `OPENCLAW_CONFIG_PATH`)
- Legacy state dir: `~/.openclaw/credentials/` (copied into the staged live home when present, but not the main profile-key store)
- Live local runs copy the active config, per-agent `auth-profiles.json` files, legacy `credentials/`, and supported external CLI auth dirs into a temp test home by default; staged live homes skip `workspace/` and `sandboxes/`, and `agents.*.workspace` / `agentDir` path overrides are stripped so probes stay off your real host workspace.
If you want to rely on env keys (e.g. exported in your `~/.profile`), run local tests after `source ~/.profile`, or use the Docker runners below (they can mount `~/.profile` into the container).
## Deepgram live (audio transcription)
- Test: `extensions/deepgram/audio.live.test.ts`
- Enable: `DEEPGRAM_API_KEY=... DEEPGRAM_LIVE_TEST=1 pnpm test:live extensions/deepgram/audio.live.test.ts`
## BytePlus coding plan live
- Test: `extensions/byteplus/live.test.ts`
- Enable: `BYTEPLUS_API_KEY=... BYTEPLUS_LIVE_TEST=1 pnpm test:live extensions/byteplus/live.test.ts`
- Optional model override: `BYTEPLUS_CODING_MODEL=ark-code-latest`
## ComfyUI workflow media live
- Test: `extensions/comfy/comfy.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts`
- Scope:
- Exercises the bundled comfy image, video, and `music_generate` paths
- Skips each capability unless `models.providers.comfy.<capability>` is configured
- Useful after changing comfy workflow submission, polling, downloads, or plugin registration
## Image generation live
- Test: `test/image-generation.runtime.live.test.ts`
- Command: `pnpm test:live test/image-generation.runtime.live.test.ts`
- Harness: `pnpm test:live:media image`
- Scope:
- Enumerates every registered image-generation provider plugin
- Loads missing provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs the stock image-generation variants through the shared runtime capability:
- `google:flash-generate`
- `google:pro-generate`
- `google:pro-edit`
- `openai:default-generate`
- Current bundled providers covered:
- `fal`
- `google`
- `minimax`
- `openai`
- `openrouter`
- `vydra`
- `xai`
- Optional narrowing:
- `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="openai,google,openrouter,xai"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-2,google/gemini-3.1-flash-image-preview,openrouter/google/gemini-3.1-flash-image-preview,xai/grok-imagine-image"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_CASES="google:flash-generate,google:pro-edit,openrouter:generate,xai:default-generate,xai:default-edit"`
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
## Music generation live
- Test: `extensions/music-generation-providers.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/music-generation-providers.live.test.ts`
- Harness: `pnpm test:live:media music`
- Scope:
- Exercises the shared bundled music-generation provider path
- Currently covers Google and MiniMax
- Loads provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs both declared runtime modes when available:
- `generate` with prompt-only input
- `edit` when the provider declares `capabilities.edit.enabled`
- Current shared-lane coverage:
- `google`: `generate`, `edit`
- `minimax`: `generate`
- `comfy`: separate Comfy live file, not this shared sweep
- Optional narrowing:
- `OPENCLAW_LIVE_MUSIC_GENERATION_PROVIDERS="google,minimax"`
- `OPENCLAW_LIVE_MUSIC_GENERATION_MODELS="google/lyria-3-clip-preview,minimax/music-2.5+"`
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
## Video generation live
- Test: `extensions/video-generation-providers.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/video-generation-providers.live.test.ts`
- Harness: `pnpm test:live:media video`
- Scope:
- Exercises the shared bundled video-generation provider path
- Defaults to the release-safe smoke path: non-FAL providers, one text-to-video request per provider, one-second lobster prompt, and a per-provider operation cap from `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS` (`180000` by default)
- Skips FAL by default because provider-side queue latency can dominate release time; pass `--video-providers fal` or `OPENCLAW_LIVE_VIDEO_GENERATION_PROVIDERS="fal"` to run it explicitly
- Loads provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs only `generate` by default
- Set `OPENCLAW_LIVE_VIDEO_GENERATION_FULL_MODES=1` to also run declared transform modes when available:
- `imageToVideo` when the provider declares `capabilities.imageToVideo.enabled` and the selected provider/model accepts buffer-backed local image input in the shared sweep
- `videoToVideo` when the provider declares `capabilities.videoToVideo.enabled` and the selected provider/model accepts buffer-backed local video input in the shared sweep
- Current declared-but-skipped `imageToVideo` providers in the shared sweep:
- `vydra` because bundled `veo3` is text-only and bundled `kling` requires a remote image URL
- Provider-specific Vydra coverage:
- `OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_VYDRA_VIDEO=1 pnpm test:live -- extensions/vydra/vydra.live.test.ts`
- that file runs `veo3` text-to-video plus a `kling` lane that uses a remote image URL fixture by default
- Current `videoToVideo` live coverage:
- `runway` only when the selected model is `runway/gen4_aleph`
- Current declared-but-skipped `videoToVideo` providers in the shared sweep:
- `alibaba`, `qwen`, `xai` because those paths currently require remote `http(s)` / MP4 reference URLs
- `google` because the current shared Gemini/Veo lane uses local buffer-backed input and that path is not accepted in the shared sweep
- `openai` because the current shared lane lacks org-specific video inpaint/remix access guarantees
- Optional narrowing:
- `OPENCLAW_LIVE_VIDEO_GENERATION_PROVIDERS="google,openai,runway"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_MODELS="google/veo-3.1-fast-generate-preview,openai/sora-2,runway/gen4_aleph"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_SKIP_PROVIDERS=""` to include every provider in the default sweep, including FAL
- `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS=60000` to reduce each provider operation cap for an aggressive smoke run
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
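The auth precedence described in this section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the suite's actual code: `resolveLiveKey`, `KeySource`, and the argument shapes are hypothetical names introduced here. Only `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS` comes from the text above.

```typescript
// Hypothetical sketch of the documented precedence: env keys win over stored
// profile keys unless OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1 is set.
type KeySource = { key: string; source: "env" | "profile" };

function resolveLiveKey(
  env: Record<string, string | undefined>,
  envKey: string | undefined,
  profileKey: string | undefined,
): KeySource | undefined {
  const requireProfile = env["OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS"] === "1";
  // Default: a key exported in the shell masks a possibly stale profile key.
  if (!requireProfile && envKey) return { key: envKey, source: "env" };
  if (profileKey) return { key: profileKey, source: "profile" };
  // No usable auth: the provider would be skipped.
  return undefined;
}
```

Returning `undefined` instead of throwing mirrors the "skip providers with no usable auth" behavior, so one missing key does not fail the whole sweep.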
## Media live harness
- Command: `pnpm test:live:media`
- Purpose:
- Runs the shared image, music, and video live suites through one repo-native entrypoint
- Auto-loads missing provider env vars from `~/.profile`
- Auto-narrows each suite to providers that currently have usable auth by default
- Reuses `scripts/test-live.mjs`, so heartbeat and quiet-mode behavior stay consistent
- Examples:
- `pnpm test:live:media`
- `pnpm test:live:media image video --providers openai,google,minimax`
- `pnpm test:live:media video --video-providers openai,runway --all-providers`
- `pnpm test:live:media music --quiet`
## Related
- [Testing](/help/testing) — unit, integration, QA, and Docker suites


@@ -473,483 +473,12 @@ Use this decision table:
- Touching gateway networking / WS protocol / pairing: add `pnpm test:e2e`
- Debugging “my bot is down” / provider-specific failures / tool calling: run a narrowed `pnpm test:live`
## Live: Android node capability sweep
## Live (network-touching) tests
- Test: `src/gateway/android-node.capabilities.live.test.ts`
- Script: `pnpm android:test:integration`
- Goal: invoke **every command currently advertised** by a connected Android node and assert command contract behavior.
- Scope:
- Preconditioned/manual setup (the suite does not install/run/pair the app).
- Command-by-command gateway `node.invoke` validation for the selected Android node.
- Required pre-setup:
- Android app already connected + paired to the gateway.
- App kept in foreground.
- Permissions/capture consent granted for capabilities you expect to pass.
- Optional target overrides:
- `OPENCLAW_ANDROID_NODE_ID` or `OPENCLAW_ANDROID_NODE_NAME`.
- `OPENCLAW_ANDROID_GATEWAY_URL` / `OPENCLAW_ANDROID_GATEWAY_TOKEN` / `OPENCLAW_ANDROID_GATEWAY_PASSWORD`.
- Full Android setup details: [Android App](/platforms/android)
## Live: model smoke (profile keys)
Live tests are split into two layers so we can isolate failures:
- “Direct model” tells us the provider/model can answer at all with the given key.
- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
### Layer 1: Direct model completion (no gateway)
- Test: `src/agents/models.profiles.live.test.ts`
- Goal:
- Enumerate discovered models
- Use `getApiKeyForModel` to select models you have creds for
- Run a small completion per model (and targeted regressions where needed)
- How to enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- Set `OPENCLAW_LIVE_MODELS=modern` (or `all`, alias for modern) to actually run this suite; otherwise it skips to keep `pnpm test:live` focused on gateway smoke
- How to select models:
- `OPENCLAW_LIVE_MODELS=modern` to run the modern allowlist (Opus/Sonnet 4.6+, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.7, Grok 4)
- `OPENCLAW_LIVE_MODELS=all` is an alias for the modern allowlist
- or `OPENCLAW_LIVE_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,..."` (comma allowlist)
- Modern/all sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
- How to select providers:
- `OPENCLAW_LIVE_PROVIDERS="google,google-antigravity,google-gemini-cli"` (comma allowlist)
- Where keys come from:
- By default: profile store and env fallbacks
- Set `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to enforce **profile store** only
- Why this exists:
- Separates “provider API is broken / key is invalid” from “gateway agent pipeline is broken”
- Contains small, isolated regressions (example: OpenAI Responses/Codex Responses reasoning replay + tool-call flows)
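The model-selection rules above can be sketched as a small parser. This is a hedged illustration, not the test's real implementation: `selectDirectModels`, `ModelSelection`, and `parseCommaList` are hypothetical names, and treating an invalid `OPENCLAW_LIVE_MAX_MODELS` value as "use the default cap" is an assumption.

```typescript
// Hypothetical sketch: interpret OPENCLAW_LIVE_MODELS and OPENCLAW_LIVE_MAX_MODELS.
type ModelSelection =
  | { kind: "skip" }
  // cap: undefined = curated default cap, Infinity = exhaustive sweep
  | { kind: "modern"; cap: number | undefined }
  | { kind: "allowlist"; models: string[] };

function parseCommaList(raw: string): string[] {
  return raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

function selectDirectModels(env: Record<string, string | undefined>): ModelSelection {
  const raw = (env["OPENCLAW_LIVE_MODELS"] ?? "").trim();
  // Unset: the direct-model suite skips so the run stays focused on gateway smoke.
  if (raw === "") return { kind: "skip" };
  if (raw === "modern" || raw === "all") {
    const max = (env["OPENCLAW_LIVE_MAX_MODELS"] ?? "").trim();
    if (max === "") return { kind: "modern", cap: undefined };
    const n = Number(max);
    // Assumption: non-integer or negative values fall back to the default cap.
    if (!Number.isInteger(n) || n < 0) return { kind: "modern", cap: undefined };
    return { kind: "modern", cap: n === 0 ? Infinity : n };
  }
  return { kind: "allowlist", models: parseCommaList(raw) };
}
```

Modeling the result as a tagged union keeps the three documented behaviors (skip, modern sweep, explicit allowlist) distinct.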
### Layer 2: Gateway + dev agent smoke (what "@openclaw" actually does)
- Test: `src/gateway/gateway-models.profiles.live.test.ts`
- Goal:
- Spin up an in-process gateway
- Create/patch an `agent:dev:*` session (model override per run)
- Iterate models-with-keys and assert:
- “meaningful” response (no tools)
- a real tool invocation works (read probe)
- optional extra tool probes (exec+read probe)
- OpenAI regression paths (tool-call-only → follow-up) keep working
- Probe details (so you can explain failures quickly):
- `read` probe: the test writes a nonce file in the workspace and asks the agent to `read` it and echo the nonce back.
- `exec+read` probe: the test asks the agent to `exec`-write a nonce into a temp file, then `read` it back.
- image probe: the test attaches a generated PNG (cat + randomized code) and expects the model to return `cat <CODE>`.
- Implementation reference: `src/gateway/gateway-models.profiles.live.test.ts` and `src/gateway/live-image-probe.ts`.
- How to enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- How to select models:
- Default: modern allowlist (Opus/Sonnet 4.6+, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.7, Grok 4)
- `OPENCLAW_LIVE_GATEWAY_MODELS=all` is an alias for the modern allowlist
- Or set `OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"` (or comma list) to narrow
- Modern/all gateway sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_GATEWAY_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
- How to select providers (avoid “OpenRouter everything”):
- `OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"` (comma allowlist)
- Tool + image probes are always on in this live test:
- `read` probe + `exec+read` probe (tool stress)
- image probe runs when the model advertises image input support
- Flow (high level):
- Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`)
- Sends it via `agent` `attachments: [{ mimeType: "image/png", content: "<base64>" }]`
- Gateway parses attachments into `images[]` (`src/gateway/server-methods/agent.ts` + `src/gateway/chat-attachments.ts`)
- Embedded agent forwards a multimodal user message to the model
- Assertion: reply contains `cat` + the code (OCR tolerance: minor mistakes allowed)
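The tolerant image-probe assertion could look something like the sketch below. The real check lives in the files referenced above. `editDistance`, `replyMatchesImageProbe`, and the one-edit tolerance are assumptions introduced purely for illustration.

```typescript
// Hypothetical sketch of a tolerant image-probe assertion. The one-edit
// tolerance is an assumption, not the suite's real threshold.
function editDistance(a: string, b: string): number {
  // Classic Levenshtein distance using a rolling row.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function replyMatchesImageProbe(reply: string, code: string, maxEdits = 1): boolean {
  const text = reply.toLowerCase();
  // The reply must mention the animal drawn in the PNG.
  if (!text.includes("cat")) return false;
  const target = code.toLowerCase();
  // Compare every alphanumeric token in the reply against the expected code.
  const tokens = text.split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  return tokens.some((t) => editDistance(t, target) <= maxEdits);
}
```

Matching per token rather than by substring lets a single misread character (for example `O` for `Q`) pass while an unrelated code still fails.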
Tip: to see what you can test on your machine (and the exact `provider/model` ids), run:
```bash
openclaw models list
openclaw models list --json
```
## Live: CLI backend smoke (Claude, Codex, Gemini, or other local CLIs)
- Test: `src/gateway/gateway-cli-backend.live.test.ts`
- Goal: validate the Gateway + agent pipeline using a local CLI backend, without touching your default config.
- Backend-specific smoke defaults live with the owning extension's `cli-backend.ts` definition.
- Enable:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- `OPENCLAW_LIVE_CLI_BACKEND=1`
- Defaults:
- Default provider/model: `claude-cli/claude-sonnet-4-6`
- Command/args/image behavior come from the owning CLI backend plugin metadata.
- Overrides (optional):
- `OPENCLAW_LIVE_CLI_BACKEND_MODEL="codex-cli/gpt-5.5"`
- `OPENCLAW_LIVE_CLI_BACKEND_COMMAND="/full/path/to/codex"`
- `OPENCLAW_LIVE_CLI_BACKEND_ARGS='["exec","--json","--color","never","--sandbox","read-only","--skip-git-repo-check"]'`
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_PROBE=1` to send a real image attachment (paths are injected into the prompt).
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_ARG="--image"` to pass image file paths as CLI args instead of prompt injection.
- `OPENCLAW_LIVE_CLI_BACKEND_IMAGE_MODE="repeat"` (or `"list"`) to control how image args are passed when `IMAGE_ARG` is set.
- `OPENCLAW_LIVE_CLI_BACKEND_RESUME_PROBE=1` to send a second turn and validate resume flow.
- `OPENCLAW_LIVE_CLI_BACKEND_MODEL_SWITCH_PROBE=0` to disable the default Claude Sonnet -> Opus same-session continuity probe (set to `1` to force it on when the selected model supports a switch target).
Example:
```bash
OPENCLAW_LIVE_CLI_BACKEND=1 \
OPENCLAW_LIVE_CLI_BACKEND_MODEL="codex-cli/gpt-5.5" \
pnpm test:live src/gateway/gateway-cli-backend.live.test.ts
```
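`OPENCLAW_LIVE_CLI_BACKEND_ARGS` is written as a JSON array rather than a shell string, so a consumer would need to parse and validate it before spawning the CLI. The sketch below shows one way to do that. `parseArgsOverride` is a hypothetical helper, and the fallback-to-defaults behavior for an empty value is an assumption.

```typescript
// Hypothetical sketch: validate a JSON string-array override such as
// OPENCLAW_LIVE_CLI_BACKEND_ARGS before handing it to a process spawner.
function parseArgsOverride(raw: string | undefined): string[] | undefined {
  // Assumption: unset or empty means "use the backend plugin's default args".
  if (raw === undefined || raw.trim() === "") return undefined;
  const parsed: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (!Array.isArray(parsed) || !parsed.every((v) => typeof v === "string")) {
    throw new Error("expected a JSON array of strings");
  }
  return parsed as string[];
}
```

Keeping the override as an array avoids shell word-splitting surprises, which is why the example value above is wrapped in single quotes.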
Docker recipe:
```bash
pnpm test:docker:live-cli-backend
```
Single-provider Docker recipes:
```bash
pnpm test:docker:live-cli-backend:claude
pnpm test:docker:live-cli-backend:claude-subscription
pnpm test:docker:live-cli-backend:codex
pnpm test:docker:live-cli-backend:gemini
```
Notes:
- The Docker runner lives at `scripts/test-live-cli-backend-docker.sh`.
- It runs the live CLI-backend smoke inside the repo Docker image as the non-root `node` user.
- It resolves CLI smoke metadata from the owning extension, then installs the matching Linux CLI package (`@anthropic-ai/claude-code`, `@openai/codex`, or `@google/gemini-cli`) into a cached writable prefix at `OPENCLAW_DOCKER_CLI_TOOLS_DIR` (default: `~/.cache/openclaw/docker-cli-tools`).
- `pnpm test:docker:live-cli-backend:claude-subscription` requires portable Claude Code subscription OAuth through either `~/.claude/.credentials.json` with `claudeAiOauth.subscriptionType` or `CLAUDE_CODE_OAUTH_TOKEN` from `claude setup-token`. It first proves direct `claude -p` in Docker, then runs two Gateway CLI-backend turns without preserving Anthropic API-key env vars. This subscription lane disables the Claude MCP/tool and image probes by default because Claude currently routes third-party app usage through extra-usage billing instead of normal subscription plan limits.
- The live CLI-backend smoke now exercises the same end-to-end flow for Claude, Codex, and Gemini: text turn, image classification turn, then MCP `cron` tool call verified through the gateway CLI.
- Claude's default smoke also patches the session from Sonnet to Opus and verifies the resumed session still remembers an earlier note.
## Live: ACP bind smoke (`/acp spawn ... --bind here`)
- Test: `src/gateway/gateway-acp-bind.live.test.ts`
- Goal: validate the real ACP conversation-bind flow with a live ACP agent:
- send `/acp spawn <agent> --bind here`
- bind a synthetic message-channel conversation in place
- send a normal follow-up on that same conversation
- verify the follow-up lands in the bound ACP session transcript
- Enable:
- `pnpm test:live src/gateway/gateway-acp-bind.live.test.ts`
- `OPENCLAW_LIVE_ACP_BIND=1`
- Defaults:
- ACP agents in Docker: `claude,codex,gemini`
- ACP agent for direct `pnpm test:live ...`: `claude`
- Synthetic channel: Slack DM-style conversation context
- ACP backend: `acpx`
- Overrides:
- `OPENCLAW_LIVE_ACP_BIND_AGENT=claude`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=codex`
- `OPENCLAW_LIVE_ACP_BIND_AGENT=gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude,codex,gemini`
- `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND='npx -y @agentclientprotocol/claude-agent-acp@<version>'`
- `OPENCLAW_LIVE_ACP_BIND_CODEX_MODEL=gpt-5.5`
- `OPENCLAW_LIVE_ACP_BIND_PARENT_MODEL=openai/gpt-5.4`
- Notes:
- This lane uses the gateway `chat.send` surface with admin-only synthetic originating-route fields so tests can attach message-channel context without pretending to deliver externally.
- When `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND` is unset, the test uses the embedded `acpx` plugin's built-in agent registry for the selected ACP harness agent.
Example:
```bash
OPENCLAW_LIVE_ACP_BIND=1 \
OPENCLAW_LIVE_ACP_BIND_AGENT=claude \
pnpm test:live src/gateway/gateway-acp-bind.live.test.ts
```
Docker recipe:
```bash
pnpm test:docker:live-acp-bind
```
Single-agent Docker recipes:
```bash
pnpm test:docker:live-acp-bind:claude
pnpm test:docker:live-acp-bind:codex
pnpm test:docker:live-acp-bind:gemini
```
Docker notes:
- The Docker runner lives at `scripts/test-live-acp-bind-docker.sh`.
- By default, it runs the ACP bind smoke against all supported live CLI agents in sequence: `claude`, `codex`, then `gemini`.
- Use `OPENCLAW_LIVE_ACP_BIND_AGENTS=claude`, `OPENCLAW_LIVE_ACP_BIND_AGENTS=codex`, or `OPENCLAW_LIVE_ACP_BIND_AGENTS=gemini` to narrow the matrix.
- It sources `~/.profile`, stages the matching CLI auth material into the container, installs `acpx` into a writable npm prefix, then installs the requested live CLI (`@anthropic-ai/claude-code`, `@openai/codex`, or `@google/gemini-cli`) if missing.
- Inside Docker, the runner sets `OPENCLAW_LIVE_ACP_BIND_ACPX_COMMAND=$HOME/.npm-global/bin/acpx` so acpx keeps provider env vars from the sourced profile available to the child harness CLI.
## Live: Codex app-server harness smoke
- Goal: validate the plugin-owned Codex harness through the normal gateway
`agent` method:
- load the bundled `codex` plugin
- select `OPENCLAW_AGENT_RUNTIME=codex`
- send a first gateway agent turn to `openai/gpt-5.5` with the Codex harness forced
- send a second turn to the same OpenClaw session and verify the app-server
thread can resume
- run `/codex status` and `/codex models` through the same gateway command
path
- optionally run two Guardian-reviewed escalated shell probes: one benign
command that should be approved and one fake-secret upload that should be
denied so the agent asks back
- Test: `src/gateway/gateway-codex-harness.live.test.ts`
- Enable: `OPENCLAW_LIVE_CODEX_HARNESS=1`
- Default model: `openai/gpt-5.5`
- Optional image probe: `OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=1`
- Optional MCP/tool probe: `OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=1`
- Optional Guardian probe: `OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=1`
- The smoke sets `OPENCLAW_AGENT_HARNESS_FALLBACK=none` so a broken Codex
harness cannot pass by silently falling back to PI.
- Auth: Codex app-server auth from the local Codex subscription login. Docker
smokes can also provide `OPENAI_API_KEY` for non-Codex probes when applicable,
plus optional copied `~/.codex/auth.json` and `~/.codex/config.toml`.
Local recipe:
```bash
source ~/.profile
OPENCLAW_LIVE_CODEX_HARNESS=1 \
OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=1 \
OPENCLAW_LIVE_CODEX_HARNESS_MODEL=openai/gpt-5.5 \
pnpm test:live -- src/gateway/gateway-codex-harness.live.test.ts
```
Docker recipe:
```bash
source ~/.profile
pnpm test:docker:live-codex-harness
```
Docker notes:
- The Docker runner lives at `scripts/test-live-codex-harness-docker.sh`.
- It sources the mounted `~/.profile`, passes `OPENAI_API_KEY`, copies Codex CLI
auth files when present, installs `@openai/codex` into a writable mounted npm
prefix, stages the source tree, then runs only the Codex-harness live test.
- Docker enables the image, MCP/tool, and Guardian probes by default. Set
`OPENCLAW_LIVE_CODEX_HARNESS_IMAGE_PROBE=0`,
`OPENCLAW_LIVE_CODEX_HARNESS_MCP_PROBE=0`, or
`OPENCLAW_LIVE_CODEX_HARNESS_GUARDIAN_PROBE=0` when you need a narrower debug
run.
- Docker also exports `OPENCLAW_AGENT_HARNESS_FALLBACK=none`, matching the live
test config so legacy aliases or PI fallback cannot hide a Codex harness
regression.
### Recommended live recipes
Narrow, explicit allowlists are fastest and least flaky:
- Single model, direct (no gateway):
- `OPENCLAW_LIVE_MODELS="openai/gpt-5.4" pnpm test:live src/agents/models.profiles.live.test.ts`
- Single model, gateway smoke:
- `OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Tool calling across several providers:
- `OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,google/gemini-3-flash-preview,zai/glm-4.7,minimax/MiniMax-M2.7" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Google focus (Gemini API key + Antigravity):
- Gemini (API key): `OPENCLAW_LIVE_GATEWAY_MODELS="google/gemini-3-flash-preview" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Antigravity (OAuth): `OPENCLAW_LIVE_GATEWAY_MODELS="google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-pro-high" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
Notes:
- `google/...` uses the Gemini API (API key).
- `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).
- `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks).
- Gemini API vs Gemini CLI:
- API: OpenClaw calls Google's hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by “Gemini”.
- CLI: OpenClaw shells out to a local `gemini` binary; it has its own auth and can behave differently (streaming/tool support/version skew).
## Live: model matrix (what we cover)
There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
### Modern smoke set (tool calling + image)
This is the “common models” run we expect to keep working:
- OpenAI (non-Codex): `openai/gpt-5.4` (optional: `openai/gpt-5.4-mini`)
- OpenAI Codex OAuth: `openai-codex/gpt-5.5`
- Anthropic: `anthropic/claude-opus-4-6` (or `anthropic/claude-sonnet-4-6`)
- Google (Gemini API): `google/gemini-3.1-pro-preview` and `google/gemini-3-flash-preview` (avoid older Gemini 2.x models)
- Google (Antigravity): `google-antigravity/claude-opus-4-6-thinking` and `google-antigravity/gemini-3-flash`
- Z.AI (GLM): `zai/glm-4.7`
- MiniMax: `minimax/MiniMax-M2.7`
Run gateway smoke with tools + image:
`OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.4,openai-codex/gpt-5.5,anthropic/claude-opus-4-6,google/gemini-3.1-pro-preview,google/gemini-3-flash-preview,google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-flash,zai/glm-4.7,minimax/MiniMax-M2.7" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
### Baseline: tool calling (Read + optional Exec)
Pick at least one per provider family:
- OpenAI: `openai/gpt-5.4` (or `openai/gpt-5.4-mini`)
- Anthropic: `anthropic/claude-opus-4-6` (or `anthropic/claude-sonnet-4-6`)
- Google: `google/gemini-3-flash-preview` (or `google/gemini-3.1-pro-preview`)
- Z.AI (GLM): `zai/glm-4.7`
- MiniMax: `minimax/MiniMax-M2.7`
Optional additional coverage (nice to have):
- xAI: `xai/grok-4` (or latest available)
- Mistral: `mistral/`… (pick one “tools” capable model you have enabled)
- Cerebras: `cerebras/`… (if you have access)
- LM Studio: `lmstudio/`… (local; tool calling depends on API mode)
### Vision: image send (attachment → multimodal message)
Include at least one image-capable model in `OPENCLAW_LIVE_GATEWAY_MODELS` (Claude/Gemini/OpenAI vision-capable variants, etc.) to exercise the image probe.
### Aggregators / alternate gateways
If you have keys enabled, we also support testing via:
- OpenRouter: `openrouter/...` (hundreds of models; use `openclaw models scan` to find tool+image capable candidates)
- OpenCode: `opencode/...` for Zen and `opencode-go/...` for Go (auth via `OPENCODE_API_KEY` / `OPENCODE_ZEN_API_KEY`)
More providers you can include in the live matrix (if you have creds/config):
- Built-in: `openai`, `openai-codex`, `anthropic`, `google`, `google-vertex`, `google-antigravity`, `google-gemini-cli`, `zai`, `openrouter`, `opencode`, `opencode-go`, `xai`, `groq`, `cerebras`, `mistral`, `github-copilot`
- Via `models.providers` (custom endpoints): `minimax` (cloud/API), plus any OpenAI/Anthropic-compatible proxy (LM Studio, vLLM, LiteLLM, etc.)
Tip: don't try to hardcode “all models” in docs. The authoritative list is whatever `discoverModels(...)` returns on your machine + whatever keys are available.
## Credentials (never commit)
Live tests discover credentials the same way the CLI does. Practical implications:
- If the CLI works, live tests should find the same keys.
- If a live test says “no creds”, debug the same way you'd debug `openclaw models list` / model selection.
- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what “profile keys” means in the live tests)
- Config: `~/.openclaw/openclaw.json` (or `OPENCLAW_CONFIG_PATH`)
- Legacy state dir: `~/.openclaw/credentials/` (copied into the staged live home when present, but not the main profile-key store)
- Live local runs copy the active config, per-agent `auth-profiles.json` files, legacy `credentials/`, and supported external CLI auth dirs into a temp test home by default; staged live homes skip `workspace/` and `sandboxes/`, and `agents.*.workspace` / `agentDir` path overrides are stripped so probes stay off your real host workspace.
If you want to rely on env keys (e.g. exported in your `~/.profile`), run local tests after `source ~/.profile`, or use the Docker runners below (they can mount `~/.profile` into the container).
## Deepgram live (audio transcription)
- Test: `extensions/deepgram/audio.live.test.ts`
- Enable: `DEEPGRAM_API_KEY=... DEEPGRAM_LIVE_TEST=1 pnpm test:live extensions/deepgram/audio.live.test.ts`
## BytePlus coding plan live
- Test: `extensions/byteplus/live.test.ts`
- Enable: `BYTEPLUS_API_KEY=... BYTEPLUS_LIVE_TEST=1 pnpm test:live extensions/byteplus/live.test.ts`
- Optional model override: `BYTEPLUS_CODING_MODEL=ark-code-latest`
## ComfyUI workflow media live
- Test: `extensions/comfy/comfy.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts`
- Scope:
- Exercises the bundled comfy image, video, and `music_generate` paths
- Skips each capability unless `models.providers.comfy.<capability>` is configured
- Useful after changing comfy workflow submission, polling, downloads, or plugin registration
## Image generation live
- Test: `test/image-generation.runtime.live.test.ts`
- Command: `pnpm test:live test/image-generation.runtime.live.test.ts`
- Harness: `pnpm test:live:media image`
- Scope:
- Enumerates every registered image-generation provider plugin
- Loads missing provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs the stock image-generation variants through the shared runtime capability:
- `google:flash-generate`
- `google:pro-generate`
- `google:pro-edit`
- `openai:default-generate`
- Current bundled providers covered:
- `fal`
- `google`
- `minimax`
- `openai`
- `openrouter`
- `vydra`
- `xai`
- Optional narrowing:
- `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="openai,google,openrouter,xai"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-2,google/gemini-3.1-flash-image-preview,openrouter/google/gemini-3.1-flash-image-preview,xai/grok-imagine-image"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_CASES="google:flash-generate,google:pro-edit,openrouter:generate,xai:default-generate,xai:default-edit"`
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
## Music generation live
- Test: `extensions/music-generation-providers.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/music-generation-providers.live.test.ts`
- Harness: `pnpm test:live:media music`
- Scope:
- Exercises the shared bundled music-generation provider path
- Currently covers Google and MiniMax
- Loads provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs both declared runtime modes when available:
- `generate` with prompt-only input
- `edit` when the provider declares `capabilities.edit.enabled`
- Current shared-lane coverage:
- `google`: `generate`, `edit`
- `minimax`: `generate`
- `comfy`: separate Comfy live file, not this shared sweep
- Optional narrowing:
- `OPENCLAW_LIVE_MUSIC_GENERATION_PROVIDERS="google,minimax"`
- `OPENCLAW_LIVE_MUSIC_GENERATION_MODELS="google/lyria-3-clip-preview,minimax/music-2.5+"`
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
## Video generation live
- Test: `extensions/video-generation-providers.live.test.ts`
- Enable: `OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/video-generation-providers.live.test.ts`
- Harness: `pnpm test:live:media video`
- Scope:
- Exercises the shared bundled video-generation provider path
- Defaults to the release-safe smoke path: non-FAL providers, one text-to-video request per provider, one-second lobster prompt, and a per-provider operation cap from `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS` (`180000` by default)
- Skips FAL by default because provider-side queue latency can dominate release time; pass `--video-providers fal` or `OPENCLAW_LIVE_VIDEO_GENERATION_PROVIDERS="fal"` to run it explicitly
- Loads provider env vars from your login shell (`~/.profile`) before probing
- Uses live/env API keys ahead of stored auth profiles by default, so stale test keys in `auth-profiles.json` do not mask real shell credentials
- Skips providers with no usable auth/profile/model
- Runs only `generate` by default
- Set `OPENCLAW_LIVE_VIDEO_GENERATION_FULL_MODES=1` to also run declared transform modes when available:
- `imageToVideo` when the provider declares `capabilities.imageToVideo.enabled` and the selected provider/model accepts buffer-backed local image input in the shared sweep
- `videoToVideo` when the provider declares `capabilities.videoToVideo.enabled` and the selected provider/model accepts buffer-backed local video input in the shared sweep
- Current declared-but-skipped `imageToVideo` providers in the shared sweep:
- `vydra` because bundled `veo3` is text-only and bundled `kling` requires a remote image URL
- Provider-specific Vydra coverage:
- `OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_VYDRA_VIDEO=1 pnpm test:live -- extensions/vydra/vydra.live.test.ts`
- that file runs `veo3` text-to-video plus a `kling` lane that uses a remote image URL fixture by default
- Current `videoToVideo` live coverage:
- `runway` only when the selected model is `runway/gen4_aleph`
- Current declared-but-skipped `videoToVideo` providers in the shared sweep:
- `alibaba`, `qwen`, `xai` because those paths currently require remote `http(s)` / MP4 reference URLs
- `google` because the current shared Gemini/Veo lane uses local buffer-backed input and that path is not accepted in the shared sweep
- `openai` because the current shared lane lacks org-specific video inpaint/remix access guarantees
- Optional narrowing:
- `OPENCLAW_LIVE_VIDEO_GENERATION_PROVIDERS="google,openai,runway"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_MODELS="google/veo-3.1-fast-generate-preview,openai/sora-2,runway/gen4_aleph"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_SKIP_PROVIDERS=""` to include every provider in the default sweep, including FAL
- `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS=60000` to reduce each provider operation cap for an aggressive smoke run
- Optional auth behavior:
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
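The mode-gating rules for the shared video sweep can be summarized in a short sketch. The `capabilities.*.enabled` fields and the `OPENCLAW_LIVE_VIDEO_GENERATION_FULL_MODES` flag come from the text above. `modesToRun`, `SweepInput`, and the `acceptsLocalImage` / `acceptsLocalVideo` flags are hypothetical stand-ins for however the suite actually tracks buffer-backed input support.

```typescript
// Hypothetical sketch of which video modes the shared sweep would run.
type VideoMode = "generate" | "imageToVideo" | "videoToVideo";

interface ProviderCaps {
  imageToVideo?: { enabled: boolean };
  videoToVideo?: { enabled: boolean };
}

interface SweepInput {
  capabilities: ProviderCaps;
  // Assumed flags: does the selected provider/model accept buffer-backed
  // local input in the shared sweep?
  acceptsLocalImage: boolean;
  acceptsLocalVideo: boolean;
}

function modesToRun(env: Record<string, string | undefined>, p: SweepInput): VideoMode[] {
  const modes: VideoMode[] = ["generate"];
  // Only `generate` runs unless full modes are requested explicitly.
  if (env["OPENCLAW_LIVE_VIDEO_GENERATION_FULL_MODES"] !== "1") return modes;
  if (p.capabilities.imageToVideo?.enabled && p.acceptsLocalImage) modes.push("imageToVideo");
  if (p.capabilities.videoToVideo?.enabled && p.acceptsLocalVideo) modes.push("videoToVideo");
  return modes;
}
```

A provider that declares a transform capability but only accepts remote URLs (the declared-but-skipped cases listed above) falls through to `generate` only.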
## Media live harness
- Command: `pnpm test:live:media`
- Purpose:
- Runs the shared image, music, and video live suites through one repo-native entrypoint
- Auto-loads missing provider env vars from `~/.profile`
- Auto-narrows each suite to providers that currently have usable auth by default
- Reuses `scripts/test-live.mjs`, so heartbeat and quiet-mode behavior stay consistent
- Examples:
- `pnpm test:live:media`
- `pnpm test:live:media image video --providers openai,google,minimax`
- `pnpm test:live:media video --video-providers openai,runway --all-providers`
- `pnpm test:live:media music --quiet`
For the live model matrix, CLI backend smokes, ACP smokes, Codex app-server
harness, and all media-provider live tests (Deepgram, BytePlus, ComfyUI, image,
music, video, media harness) — plus credential handling for live runs — see
[Testing — live suites](/help/testing-live).
## Docker runners (optional "works in Linux" checks)


@@ -136,6 +136,9 @@ Rules:
- If the active primary image model already supports vision natively, OpenClaw
skips the `[Image]` summary block and passes the original image into the
model instead.
- If a Gateway/WebChat primary model is text-only, image attachments are
preserved as offloaded `media://inbound/*` refs so the image tool or configured
image model can still inspect them instead of losing the attachment.
- Explicit `openclaw infer image describe --model <provider/model>` requests
are different: they run that image-capable provider/model directly, including
Ollama refs such as `ollama/qwen2.5vl:7b`.


@@ -563,4 +563,4 @@ and that the remote app-server speaks the same Codex app-server protocol version
- [Agent Harness Plugins](/plugins/sdk-agent-harness)
- [Model Providers](/concepts/model-providers)
- [Configuration Reference](/gateway/configuration-reference)
- [Testing](/help/testing#live-codex-app-server-harness-smoke)
- [Testing](/help/testing-live#live-codex-app-server-harness-smoke)

docs/plugins/google-meet.md

@@ -0,0 +1,314 @@
---
summary: "Google Meet plugin: join explicit Meet URLs through Chrome or Twilio with realtime voice defaults"
read_when:
- You want an OpenClaw agent to join a Google Meet call
- You are configuring Chrome or Twilio as a Google Meet transport
title: "Google Meet Plugin"
---
# Google Meet (plugin)
Google Meet participant support for OpenClaw.
The plugin is explicit by design:
- It only joins an explicit `https://meet.google.com/...` URL.
- `realtime` voice is the default mode.
- Realtime voice can call back into the full OpenClaw agent when deeper
reasoning or tools are needed.
- Auth starts as personal Google OAuth or an already signed-in Chrome profile.
- There is no automatic consent announcement.
- The default Chrome audio backend is `BlackHole 2ch`.
- Twilio accepts a dial-in number plus optional PIN or DTMF sequence.
- The CLI command is `googlemeet`; `meet` is reserved for broader agent
teleconference workflows.
## Quick start
Install the local audio dependencies and make sure the realtime provider can use
OpenAI:
```bash
brew install blackhole-2ch sox
export OPENAI_API_KEY=sk-...
```
`blackhole-2ch` installs the `BlackHole 2ch` virtual audio device. Homebrew's
installer requires a reboot before macOS exposes the device:
```bash
sudo reboot
```
After reboot, verify both pieces:
```bash
system_profiler SPAudioDataType | grep -i BlackHole
command -v rec play
```
Enable the plugin:
```json5
{
plugins: {
entries: {
"google-meet": {
enabled: true,
config: {},
},
},
},
}
```
Check setup:
```bash
openclaw googlemeet setup
```
Join a meeting:
```bash
openclaw googlemeet join https://meet.google.com/abc-defg-hij
```
Or let an agent join through the `google_meet` tool:
```json
{
"action": "join",
"url": "https://meet.google.com/abc-defg-hij"
}
```
Chrome joins as the signed-in Chrome profile. In Meet, pick `BlackHole 2ch` for
the microphone/speaker path used by OpenClaw. For clean duplex audio, use
separate virtual devices or a Loopback-style graph; a single BlackHole device is
enough for a first smoke test but can echo.
## Install notes
The Chrome realtime default uses two external tools:
- `sox`: command-line audio utility. The plugin uses its `rec` and `play`
commands for the default 8 kHz G.711 mu-law audio bridge.
- `blackhole-2ch`: macOS virtual audio driver. It creates the `BlackHole 2ch`
audio device that Chrome/Meet can route through.
OpenClaw does not bundle or redistribute either package. Install them as host
dependencies through Homebrew. SoX is licensed as
`LGPL-2.0-only AND GPL-2.0-only`; BlackHole is GPL-3.0. If you build an
installer or appliance that bundles BlackHole with OpenClaw, review BlackHole's
upstream licensing terms or get a separate license from Existential Audio.
## Transports
### Chrome
Chrome transport opens the Meet URL in Google Chrome and joins as the signed-in
Chrome profile. On macOS, the plugin checks for `BlackHole 2ch` before launch.
If configured, it also runs an audio bridge health command and startup command
before opening Chrome.
```bash
openclaw googlemeet join https://meet.google.com/abc-defg-hij --transport chrome
```
Route Chrome microphone and speaker audio through the local OpenClaw audio
bridge. If `BlackHole 2ch` is not installed, the join fails with a setup error
instead of silently joining without an audio path.
### Twilio
Twilio transport is a strict dial plan delegated to the Voice Call plugin. It
does not parse Meet pages for phone numbers.
```bash
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport twilio \
--dial-in-number +15551234567 \
--pin 123456
```
Use `--dtmf-sequence` when the meeting needs a custom sequence:
```bash
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport twilio \
--dial-in-number +15551234567 \
--dtmf-sequence ww123456#
```
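The `--dtmf-sequence` value above looks like pause characters, the PIN, and a closing `#`. The following is a minimal sketch of building such a string, assuming `w` denotes a short pause as in Twilio-style digit strings. `build_dtmf_sequence` is a hypothetical helper, not part of the OpenClaw CLI.

```bash
# Hypothetical helper: build a DTMF string such as "ww123456#".
# Assumption: each "w" is a short pause sent before the PIN digits.
build_dtmf_sequence() (
  pin="$1"
  pauses="${2:-2}"
  # Reject an empty PIN or one containing non-digit characters.
  case "$pin" in
    ''|*[!0-9]*) echo "PIN must be digits only" >&2; exit 1 ;;
  esac
  prefix=""
  i=0
  while [ "$i" -lt "$pauses" ]; do
    prefix="${prefix}w"
    i=$((i + 1))
  done
  printf '%s%s#\n' "$prefix" "$pin"
)
```

For example, `build_dtmf_sequence 123456` prints `ww123456#`, which could then be passed to `--dtmf-sequence`.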
## OAuth and preflight
Google Meet Media API access uses a personal OAuth client first. Configure
`oauth.clientId` and optionally `oauth.clientSecret`, then run:
```bash
openclaw googlemeet auth login --json
```
The command prints an `oauth` config block with a refresh token. It uses PKCE
with a localhost callback on `http://localhost:8085/oauth2callback`, and
supports a manual copy/paste flow with `--manual`.
These environment variables are accepted as fallbacks:
- `OPENCLAW_GOOGLE_MEET_CLIENT_ID` or `GOOGLE_MEET_CLIENT_ID`
- `OPENCLAW_GOOGLE_MEET_CLIENT_SECRET` or `GOOGLE_MEET_CLIENT_SECRET`
- `OPENCLAW_GOOGLE_MEET_REFRESH_TOKEN` or `GOOGLE_MEET_REFRESH_TOKEN`
- `OPENCLAW_GOOGLE_MEET_ACCESS_TOKEN` or `GOOGLE_MEET_ACCESS_TOKEN`
- `OPENCLAW_GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT` or
`GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT`
- `OPENCLAW_GOOGLE_MEET_DEFAULT_MEETING` or `GOOGLE_MEET_DEFAULT_MEETING`
- `OPENCLAW_GOOGLE_MEET_PREVIEW_ACK` or `GOOGLE_MEET_PREVIEW_ACK`
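The list above does not say which name wins when both variants are set. The sketch below shows one way to resolve a pair, assuming the `OPENCLAW_`-prefixed name takes precedence. `meet_env` is a hypothetical helper for illustration only.

```bash
# Hypothetical helper: resolve a Google Meet setting from the environment.
# Assumption: the OPENCLAW_-prefixed variable wins when both are set.
meet_env() (
  suffix="$1"   # e.g. CLIENT_ID, REFRESH_TOKEN
  for name in "OPENCLAW_GOOGLE_MEET_${suffix}" "GOOGLE_MEET_${suffix}"; do
    # printenv exits non-zero when the variable is not exported.
    if value="$(printenv "$name")" && [ -n "$value" ]; then
      printf '%s\n' "$value"
      exit 0
    fi
  done
  # Neither variant is set.
  exit 1
)
```

Usage: `meet_env CLIENT_ID` prints the resolved client id, or returns a non-zero status when neither variable is exported.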
Resolve a Meet URL, code, or `spaces/{id}` through `spaces.get`:
```bash
openclaw googlemeet resolve-space --meeting https://meet.google.com/abc-defg-hij
```
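The three accepted input shapes can be told apart with simple pattern matching. This is only a sketch of the idea, not the plugin's parser. It assumes meeting codes follow the `abc-defg-hij` shape used in the examples, and `classify_meeting_input` is a hypothetical name.

```bash
# Hypothetical helper: classify a --meeting argument by its shape.
# Assumption: meeting codes are lowercase letters grouped 3-4-3.
classify_meeting_input() (
  input="$1"
  case "$input" in
    spaces/?*) echo "space-name" ;;
    https://meet.google.com/?*) echo "url" ;;
    [a-z][a-z][a-z]-[a-z][a-z][a-z][a-z]-[a-z][a-z][a-z]) echo "code" ;;
    *) echo "unknown"; exit 1 ;;
  esac
)
```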
Run preflight before media work:
```bash
openclaw googlemeet preflight --meeting https://meet.google.com/abc-defg-hij
```
Set `preview.enrollmentAcknowledged: true` only after confirming your Cloud
project, OAuth principal, and meeting participants are enrolled in the Google
Workspace Developer Preview Program for Meet media APIs.
## Config
The common Chrome realtime path only needs the plugin enabled, BlackHole, SoX,
and an OpenAI key:
```bash
brew install blackhole-2ch sox
export OPENAI_API_KEY=sk-...
```
Set the plugin config under `plugins.entries.google-meet.config`:
```json5
{
plugins: {
entries: {
"google-meet": {
enabled: true,
config: {},
},
},
},
}
```
Defaults:
- `defaultTransport: "chrome"`
- `defaultMode: "realtime"`
- `chrome.audioBackend: "blackhole-2ch"`
- `chrome.audioInputCommand`: SoX `rec` command writing 8 kHz G.711 mu-law
audio to stdout
- `chrome.audioOutputCommand`: SoX `play` command reading 8 kHz G.711 mu-law
audio from stdin
- `realtime.provider: "openai"`
- `realtime.toolPolicy: "safe-read-only"`
- `realtime.instructions`: brief spoken replies, with
`openclaw_agent_consult` for deeper answers
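For orientation, 8 kHz G.711 mu-law is headerless, mono, 8-bit audio at 8000 samples per second (64 kbit/s). The sketch below shows SoX invocations that read and write that format. The exact default commands shipped by the plugin are not shown here, so treat these flags as an assumption to adapt before using them in `chrome.audioInputCommand` or `chrome.audioOutputCommand`.

```bash
# Assumed SoX format flags for raw 8 kHz, mono, 8-bit mu-law audio.
SOX_MULAW_FLAGS="-q -t raw -r 8000 -c 1 -e mu-law -b 8"

# Capture from the default input device and write raw mu-law to stdout.
audio_input_command() { echo "rec $SOX_MULAW_FLAGS -"; }

# Read raw mu-law from stdin and play it on the default output device.
audio_output_command() { echo "play $SOX_MULAW_FLAGS -"; }
```

Both helpers only print the command line, so it can be inspected before it is pasted into config.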
Optional overrides:
```json5
{
defaults: {
meeting: "https://meet.google.com/abc-defg-hij",
},
chrome: {
browserProfile: "Default",
},
realtime: {
toolPolicy: "owner",
},
}
```
Twilio-only config:
```json5
{
defaultTransport: "twilio",
twilio: {
defaultDialInNumber: "+15551234567",
defaultPin: "123456",
},
voiceCall: {
gatewayUrl: "ws://127.0.0.1:18789",
},
}
```
## Tool
Agents can use the `google_meet` tool:
```json
{
"action": "join",
"url": "https://meet.google.com/abc-defg-hij",
"transport": "chrome",
"mode": "realtime"
}
```
Use `action: "status"` to list active sessions or inspect a session ID. Use
`action: "leave"` to mark a session ended.
## Realtime agent consult
Chrome realtime mode is optimized for a live voice loop. The realtime voice
provider hears the meeting audio and speaks through the configured audio bridge.
When the realtime model needs deeper reasoning, current information, or normal
OpenClaw tools, it can call `openclaw_agent_consult`.
The consult tool runs the regular OpenClaw agent behind the scenes with recent
meeting transcript context and returns a concise spoken answer to the realtime
voice session. The voice model can then speak that answer back into the meeting.
`realtime.toolPolicy` controls the consult run:
- `safe-read-only`: expose the consult tool and limit the regular agent to
`read`, `web_search`, `web_fetch`, `x_search`, `memory_search`, and
`memory_get`.
- `owner`: expose the consult tool and let the regular agent use the normal
agent tool policy.
- `none`: do not expose the consult tool to the realtime voice model.
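The three policies can be summarized as a lookup. The sketch below restates the list above in shell form. `describe_tool_policy` and the `agent-default` label are hypothetical names used only for illustration.

```bash
# Hypothetical helper: summarize what a realtime.toolPolicy value allows.
# Prints "exposed=<yes|no> tools=<comma-separated list or label>".
describe_tool_policy() (
  case "$1" in
    safe-read-only)
      echo "exposed=yes tools=read,web_search,web_fetch,x_search,memory_search,memory_get" ;;
    owner)
      # The regular agent keeps its normal tool policy.
      echo "exposed=yes tools=agent-default" ;;
    none)
      # The consult tool is hidden from the realtime voice model.
      echo "exposed=no tools=" ;;
    *)
      echo "unknown toolPolicy: $1" >&2; exit 1 ;;
  esac
)
```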
The consult session key is scoped per Meet session, so follow-up consult calls
can reuse prior consult context during the same meeting.
## Notes
Google Meet's official media API is receive-oriented, so speaking into a Meet
call still needs a participant path. This plugin keeps that boundary visible:
Chrome handles browser participation and local audio routing; Twilio handles
phone dial-in participation.
Chrome realtime mode needs either:
- `chrome.audioInputCommand` plus `chrome.audioOutputCommand`: OpenClaw owns the
realtime model bridge and pipes 8 kHz G.711 mu-law audio between those
commands and the selected realtime voice provider.
- `chrome.audioBridgeCommand`: an external bridge command owns the whole local
audio path and must exit after starting or validating its daemon.
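The choice between the two bridge styles can be sketched as a small check. The text above does not say which option wins when both are configured. This sketch assumes `chrome.audioBridgeCommand` takes precedence, and `chrome_audio_mode` is a hypothetical helper.

```bash
# Hypothetical helper: decide which Chrome realtime audio path applies.
# Assumption: an external bridge command overrides the command pair.
chrome_audio_mode() (
  input_cmd="$1"    # chrome.audioInputCommand
  output_cmd="$2"   # chrome.audioOutputCommand
  bridge_cmd="$3"   # chrome.audioBridgeCommand
  if [ -n "$bridge_cmd" ]; then
    echo "external-bridge"
  elif [ -n "$input_cmd" ] && [ -n "$output_cmd" ]; then
    echo "command-pair"
  else
    echo "no usable audio path configured" >&2
    exit 1
  fi
)
```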
For clean duplex audio, route Meet output and Meet microphone through separate
virtual devices or a Loopback-style virtual device graph. A single shared
BlackHole device can echo other participants back into the call.
`googlemeet leave` stops the command-pair realtime audio bridge for Chrome
sessions. For Twilio sessions delegated through the Voice Call plugin, it also
hangs up the underlying voice call.


@@ -370,6 +370,7 @@ read without importing the plugin runtime.
"speechProviders": ["openai"],
"realtimeTranscriptionProviders": ["openai"],
"realtimeVoiceProviders": ["openai"],
"memoryEmbeddingProviders": ["local"],
"mediaUnderstandingProviders": ["openai", "openai-codex"],
"imageGenerationProviders": ["openai"],
"videoGenerationProviders": ["qwen"],
@@ -389,6 +390,7 @@ Each list is optional:
| `speechProviders` | `string[]` | Speech provider ids this plugin owns. |
| `realtimeTranscriptionProviders` | `string[]` | Realtime-transcription provider ids this plugin owns. |
| `realtimeVoiceProviders` | `string[]` | Realtime-voice provider ids this plugin owns. |
| `memoryEmbeddingProviders` | `string[]` | Memory embedding provider ids this plugin owns. |
| `mediaUnderstandingProviders` | `string[]` | Media-understanding provider ids this plugin owns. |
| `imageGenerationProviders` | `string[]` | Image-generation provider ids this plugin owns. |
| `videoGenerationProviders` | `string[]` | Video-generation provider ids this plugin owns. |
@@ -401,6 +403,12 @@ Provider plugins that implement `resolveExternalAuthProfiles` should declare
through a deprecated compatibility fallback, but that fallback is slower and
will be removed after the migration window.
Bundled memory embedding providers should declare
`contracts.memoryEmbeddingProviders` for every adapter id they expose, including
built-in adapters such as `local`. Standalone CLI paths use this manifest
contract to load only the owning plugin before the full Gateway runtime has
registered providers.
## mediaUnderstandingProviderMetadata reference
Use `mediaUnderstandingProviderMetadata` when a media-understanding provider has


@@ -119,8 +119,10 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/approval-handler-runtime` | Broader approval handler runtime helpers; prefer the narrower adapter/gateway seams when they are enough |
| `plugin-sdk/approval-native-runtime` | Native approval target + account-binding helpers |
| `plugin-sdk/approval-reply-runtime` | Exec/plugin approval reply payload helpers |
| `plugin-sdk/channel-contract-testing` | Narrow channel contract test helpers without the broad testing barrel |
| `plugin-sdk/command-auth-native` | Native command auth + native session-target helpers |
| `plugin-sdk/command-detection` | Shared command detection helpers |
| `plugin-sdk/command-primitives-runtime` | Lightweight command text predicates for hot channel paths |
| `plugin-sdk/command-surface` | Command-body normalization and command-surface helpers |
| `plugin-sdk/allow-from` | `formatAllowFromLowercase` |
| `plugin-sdk/channel-secret-runtime` | Narrow secret-contract collection helpers for channel/plugin secret surfaces |
@@ -173,6 +175,7 @@ For the plugin authoring guide, see [Plugin SDK overview](/plugins/sdk-overview)
| `plugin-sdk/json-store` | Small JSON state read/write helpers |
| `plugin-sdk/file-lock` | Re-entrant file-lock helpers |
| `plugin-sdk/persistent-dedupe` | Disk-backed dedupe cache helpers |
| `plugin-sdk/persistent-keyed-store` | Persistent keyed registry helper with TTL, deterministic enumeration, and quarantine-on-corruption behavior for restart-safe lifecycle state |
| `plugin-sdk/acp-runtime` | ACP runtime/session and reply-dispatch helpers |
| `plugin-sdk/acp-binding-resolve-runtime` | Read-only ACP binding resolution without lifecycle startup imports |
| `plugin-sdk/agent-config-primitives` | Narrow agent runtime config-schema primitives |


@@ -25,19 +25,19 @@ API-enabled model such as `openai/gpt-5.4` for `OPENAI_API_KEY` setups.
## OpenClaw feature coverage
| OpenAI capability | OpenClaw surface | Status |
| ------------------------- | ------------------------------------------------------ | ------------------------------------------------------ |
| Chat / Responses | `openai/<model>` model provider | Yes |
| Codex subscription models | `openai-codex/<model>` with `openai-codex` OAuth | Yes |
| Codex app-server harness | `openai/<model>` with `embeddedHarness.runtime: codex` | Yes |
| Server-side web search | Native OpenAI Responses tool | Yes, when web search is enabled and no provider pinned |
| Images | `image_generate` | Yes |
| Videos | `video_generate` | Yes |
| Text-to-speech | `messages.tts.provider: "openai"` / `tts` | Yes |
| Batch speech-to-text | `tools.media.audio` / media understanding | Yes |
| Streaming speech-to-text | Voice Call `streaming.provider: "openai"` | Yes |
| Realtime voice | Voice Call `realtime.provider: "openai"` | Yes |
| Embeddings | memory embedding provider | Yes |
| OpenAI capability | OpenClaw surface | Status |
| ------------------------- | ---------------------------------------------------------- | ------------------------------------------------------ |
| Chat / Responses | `openai/<model>` model provider | Yes |
| Codex subscription models | `openai-codex/<model>` with `openai-codex` OAuth | Yes |
| Codex app-server harness | `openai/<model>` with `embeddedHarness.runtime: codex` | Yes |
| Server-side web search | Native OpenAI Responses tool | Yes, when web search is enabled and no provider pinned |
| Images | `image_generate` | Yes |
| Videos | `video_generate` | Yes |
| Text-to-speech | `messages.tts.provider: "openai"` / `tts` | Yes |
| Batch speech-to-text | `tools.media.audio` / media understanding | Yes |
| Streaming speech-to-text | Voice Call `streaming.provider: "openai"` | Yes |
| Realtime voice | Voice Call `realtime.provider: "openai"` / Control UI Talk | Yes |
| Embeddings | memory embedding provider | Yes |
## Getting started
@@ -661,12 +661,14 @@ the Server-side compaction accordion below.
</Accordion>
<Accordion title="Server-side compaction (Responses API)">
For direct OpenAI Responses models (`openai/*` on `api.openai.com`), OpenClaw auto-enables server-side compaction:
For direct OpenAI Responses models (`openai/*` on `api.openai.com`), the OpenAI plugin's Pi-harness stream wrapper auto-enables server-side compaction:
- Forces `store: true` (unless model compat sets `supportsStore: false`)
- Injects `context_management: [{ type: "compaction", compact_threshold: ... }]`
- Default `compact_threshold`: 70% of `contextWindow` (or `80000` when unavailable)
This applies to the built-in Pi harness path and to OpenAI provider hooks used by embedded runs. The native Codex app-server harness manages its own context through Codex and is configured separately with `agents.defaults.embeddedHarness.runtime`.
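As a worked example of the default threshold rule above, the sketch below computes 70% of `contextWindow` and falls back to `80000` when no usable value is available. Rounding down to an integer is an assumption, and `compact_threshold` is a hypothetical helper name.

```bash
# Hypothetical helper: default compact_threshold for a given contextWindow.
# Assumption: the 70% value is rounded down to a whole number of tokens.
compact_threshold() (
  context_window="${1:-}"
  case "$context_window" in
    # Missing or non-numeric contextWindow: use the documented fallback.
    ''|*[!0-9]*) echo 80000 ;;
    *) echo $(( context_window * 70 / 100 )) ;;
  esac
)
```

With a 200000-token window this yields 140000, and with no window it yields 80000.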
<Tabs>
<Tab title="Enable explicitly">
Useful for compatible endpoints like Azure OpenAI Responses:


@@ -207,6 +207,18 @@ arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v2:0
Default model: `embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB, auto-downloaded).
Requires native build: `pnpm approve-builds` then `pnpm rebuild node-llama-cpp`.
Use the standalone CLI to verify the same provider path the Gateway uses:
```bash
openclaw memory status --deep --agent main
openclaw memory index --force --agent main
```
If `provider` is `auto`, `local` is selected only when `local.modelPath` points
to an existing local file. `hf:` and HTTP(S) model references can still be used
explicitly with `provider: "local"`, but they do not make `auto` select local
before the model is available on disk.
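The `auto` rule above can be sketched as a file check. This is an illustration of the described behavior rather than the actual selection code. `auto_selects_local` is a hypothetical helper, and it does not expand `~` or resolve relative paths.

```bash
# Hypothetical helper: would provider "auto" pick the local provider?
# Succeeds only when local.modelPath names an existing file on disk.
auto_selects_local() (
  model_path="$1"
  case "$model_path" in
    # Remote references never make "auto" select local.
    ''|hf:*|http://*|https://*) exit 1 ;;
  esac
  [ -f "$model_path" ]
)
```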
---
## Hybrid search config


@@ -0,0 +1,291 @@
---
summary: "Setting up ACP agents: acpx harness config, plugin setup, permissions"
read_when:
- Installing or configuring the acpx harness for Claude Code / Codex / Gemini CLI
- Enabling the plugin-tools or OpenClaw-tools MCP bridge
- Configuring ACP permission modes
title: "ACP agents — setup"
---
For the overview, operator runbook, and concepts, see [ACP agents](/tools/acp-agents).
This page covers acpx harness config, plugin setup for the MCP bridges, and
permission configuration.
## acpx harness support (current)
Current acpx built-in harness aliases:
- `claude`
- `codex`
- `copilot`
- `cursor` (Cursor CLI: `cursor-agent acp`)
- `droid`
- `gemini`
- `iflow`
- `kilocode`
- `kimi`
- `kiro`
- `openclaw`
- `opencode`
- `pi`
- `qwen`
When OpenClaw uses the acpx backend, prefer these values for `agentId` unless your acpx config defines custom agent aliases.
If your local Cursor install still exposes ACP as `agent acp`, override the `cursor` agent command in your acpx config instead of changing the built-in default.
Direct acpx CLI usage can also target arbitrary adapters via `--agent <command>`, but that raw escape hatch is an acpx CLI feature (not the normal OpenClaw `agentId` path).
## Required config
Core ACP baseline:
```json5
{
acp: {
enabled: true,
// Optional. Default is true; set false to pause ACP dispatch while keeping /acp controls.
dispatch: { enabled: true },
backend: "acpx",
defaultAgent: "codex",
allowedAgents: [
"claude",
"codex",
"copilot",
"cursor",
"droid",
"gemini",
"iflow",
"kilocode",
"kimi",
"kiro",
"openclaw",
"opencode",
"pi",
"qwen",
],
maxConcurrentSessions: 8,
stream: {
coalesceIdleMs: 300,
maxChunkChars: 1200,
},
runtime: {
ttlMinutes: 120,
},
},
}
```
Thread binding config is channel-adapter specific. Example for Discord:
```json5
{
session: {
threadBindings: {
enabled: true,
idleHours: 24,
maxAgeHours: 0,
},
},
channels: {
discord: {
threadBindings: {
enabled: true,
spawnAcpSessions: true,
},
},
},
}
```
If thread-bound ACP spawn does not work, verify the adapter feature flag first:
- Discord: `channels.discord.threadBindings.spawnAcpSessions=true`
Current-conversation binds do not require child-thread creation. They require an active conversation context and a channel adapter that exposes ACP conversation bindings.
See [Configuration Reference](/gateway/configuration-reference).
## Plugin setup for acpx backend
Fresh installs ship the bundled `acpx` runtime plugin enabled by default, so ACP
usually works without a manual plugin install step.
Start with:
```text
/acp doctor
```
If you disabled `acpx`, denied it via `plugins.allow` / `plugins.deny`, or want
to switch to a local development checkout, use the explicit plugin path:
```bash
openclaw plugins install acpx
openclaw config set plugins.entries.acpx.enabled true
```
Local workspace install during development:
```bash
openclaw plugins install ./path/to/local/acpx-plugin
```
Then verify backend health:
```text
/acp doctor
```
### acpx command and version configuration
By default, the bundled `acpx` plugin uses its plugin-local pinned binary (`node_modules/.bin/acpx` inside the plugin package). Startup registers the backend as not-ready and a background job verifies `acpx --version`; if the binary is missing or mismatched, it runs `npm install --omit=dev --no-save acpx@<pinned>` and re-verifies. The gateway stays non-blocking throughout.
Override the command or version in plugin config:
```json
{
"plugins": {
"entries": {
"acpx": {
"enabled": true,
"config": {
"command": "../acpx/dist/cli.js",
"expectedVersion": "any"
}
}
}
}
}
```
- `command` accepts an absolute path, relative path (resolved from the OpenClaw workspace), or command name.
- `expectedVersion: "any"` disables strict version matching.
- Custom `command` paths disable plugin-local auto-install.
See [Plugins](/tools/plugin).
### Automatic dependency install
When you install OpenClaw globally with `npm install -g openclaw`, the acpx
runtime dependencies (platform-specific binaries) are installed automatically
via a postinstall hook. If the automatic install fails, the gateway still starts
normally and reports the missing dependency through `openclaw acp doctor`.
### Plugin tools MCP bridge
By default, ACPX sessions do **not** expose OpenClaw plugin-registered tools to
the ACP harness.
If you want ACP agents such as Codex or Claude Code to call installed
OpenClaw plugin tools such as memory recall/store, enable the dedicated bridge:
```bash
openclaw config set plugins.entries.acpx.config.pluginToolsMcpBridge true
```
What this does:
- Injects a built-in MCP server named `openclaw-plugin-tools` into ACPX session
bootstrap.
- Exposes plugin tools already registered by installed and enabled OpenClaw
plugins.
- Keeps the feature explicit and default-off.
Security and trust notes:
- This expands the ACP harness tool surface.
- ACP agents get access only to plugin tools already active in the gateway.
- Treat this as the same trust boundary as letting those plugins execute in
OpenClaw itself.
- Review installed plugins before enabling it.
Custom `mcpServers` still work as before. The built-in plugin-tools bridge is an
additional opt-in convenience, not a replacement for generic MCP server config.
### OpenClaw tools MCP bridge
By default, ACPX sessions also do **not** expose built-in OpenClaw tools through
MCP. Enable the separate core-tools bridge when an ACP agent needs selected
built-in tools such as `cron`:
```bash
openclaw config set plugins.entries.acpx.config.openClawToolsMcpBridge true
```
What this does:
- Injects a built-in MCP server named `openclaw-tools` into ACPX session
bootstrap.
- Exposes selected built-in OpenClaw tools. The initial server exposes `cron`.
- Keeps core-tool exposure explicit and default-off.
### Runtime timeout configuration
The bundled `acpx` plugin defaults embedded runtime turns to a 120-second
timeout. This gives slower harnesses such as Gemini CLI enough time to complete
ACP startup and initialization. Override it if your host needs a different
runtime limit:
```bash
openclaw config set plugins.entries.acpx.config.timeoutSeconds 180
```
Restart the gateway after changing this value.
### Health probe agent configuration
The bundled `acpx` plugin probes one harness agent while deciding whether the
embedded runtime backend is ready. It defaults to `codex`. If your deployment
uses a different default ACP agent, set the probe agent to the same id:
```bash
openclaw config set plugins.entries.acpx.config.probeAgent claude
```
Restart the gateway after changing this value.
## Permission configuration
ACP sessions run non-interactively: there is no TTY to approve or deny file-write and shell-exec permission prompts. The acpx plugin provides two config keys that control how permissions are handled, `permissionMode` and `nonInteractivePermissions`.
These ACPX harness permissions are separate from OpenClaw exec approvals and separate from CLI-backend vendor bypass flags such as Claude CLI `--permission-mode bypassPermissions`. ACPX `approve-all` is the harness-level break-glass switch for ACP sessions.
### `permissionMode`
Controls which operations the harness agent can perform without prompting.
| Value | Behavior |
| --------------- | --------------------------------------------------------- |
| `approve-all` | Auto-approve all file writes and shell commands. |
| `approve-reads` | Auto-approve reads only; writes and exec require prompts. |
| `deny-all` | Deny all permission prompts. |
### `nonInteractivePermissions`
Controls what happens when a permission prompt would be shown but no interactive TTY is available (which is always the case for ACP sessions).
| Value | Behavior |
| ------ | ----------------------------------------------------------------- |
| `fail` | Abort the session with `AcpRuntimeError`. **(default)** |
| `deny` | Silently deny the permission and continue (graceful degradation). |
### Configuration
Set via plugin config:
```bash
openclaw config set plugins.entries.acpx.config.permissionMode approve-all
openclaw config set plugins.entries.acpx.config.nonInteractivePermissions fail
```
Restart the gateway after changing these values.
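The interaction between the two keys can be sketched as a decision table. This is an illustration of the documented behavior, not the acpx implementation. It assumes `deny-all` denies every operation kind, and `resolve_permission` is a hypothetical helper.

```bash
# Hypothetical helper: outcome for one operation in an ACP session.
# Prints approve, deny, or fail (fail means the session aborts).
resolve_permission() (
  mode="$1"             # approve-all | approve-reads | deny-all
  non_interactive="$2"  # fail | deny
  operation="$3"        # read | write | exec
  case "$mode" in
    approve-all) echo approve ;;
    deny-all) echo deny ;;
    approve-reads)
      if [ "$operation" = read ]; then
        echo approve
      else
        # A prompt would be needed, but ACP sessions have no TTY,
        # so nonInteractivePermissions decides the outcome.
        echo "$non_interactive"
      fi ;;
    *) echo "unknown permissionMode: $mode" >&2; exit 1 ;;
  esac
)
```

With `approve-reads` and `fail`, a write resolves to `fail`, which corresponds to the `AcpRuntimeError` case; switching to `deny` makes the same write resolve to `deny`.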
> **Important:** OpenClaw currently defaults to `permissionMode=approve-reads` and `nonInteractivePermissions=fail`. In non-interactive ACP sessions, any write or exec that triggers a permission prompt can fail with `AcpRuntimeError: Permission prompt unavailable in non-interactive mode`.
>
> If you need to restrict permissions, set `nonInteractivePermissions` to `deny` so sessions degrade gracefully instead of crashing.
## Related
- [ACP agents](/tools/acp-agents) — overview, operator runbook, concepts
- [Sub-agents](/tools/subagents)
- [Multi-agent routing](/concepts/multi-agent)


@@ -507,278 +507,11 @@ Equivalent operations:
- Special case: `key=cwd` uses the cwd override path.
- `/acp reset-options` clears all runtime overrides for target session.
## acpx harness support (current)
## acpx harness, plugin setup, and permissions
Current acpx built-in harness aliases:
- `claude`
- `codex`
- `copilot`
- `cursor` (Cursor CLI: `cursor-agent acp`)
- `droid`
- `gemini`
- `iflow`
- `kilocode`
- `kimi`
- `kiro`
- `openclaw`
- `opencode`
- `pi`
- `qwen`
When OpenClaw uses the acpx backend, prefer these values for `agentId` unless your acpx config defines custom agent aliases.
If your local Cursor install still exposes ACP as `agent acp`, override the `cursor` agent command in your acpx config instead of changing the built-in default.
Direct acpx CLI usage can also target arbitrary adapters via `--agent <command>`, but that raw escape hatch is an acpx CLI feature (not the normal OpenClaw `agentId` path).
## Required config
Core ACP baseline:
```json5
{
acp: {
enabled: true,
// Optional. Default is true; set false to pause ACP dispatch while keeping /acp controls.
dispatch: { enabled: true },
backend: "acpx",
defaultAgent: "codex",
allowedAgents: [
"claude",
"codex",
"copilot",
"cursor",
"droid",
"gemini",
"iflow",
"kilocode",
"kimi",
"kiro",
"openclaw",
"opencode",
"pi",
"qwen",
],
maxConcurrentSessions: 8,
stream: {
coalesceIdleMs: 300,
maxChunkChars: 1200,
},
runtime: {
ttlMinutes: 120,
},
},
}
```
Thread binding config is channel-adapter specific. Example for Discord:
```json5
{
session: {
threadBindings: {
enabled: true,
idleHours: 24,
maxAgeHours: 0,
},
},
channels: {
discord: {
threadBindings: {
enabled: true,
spawnAcpSessions: true,
},
},
},
}
```
If thread-bound ACP spawn does not work, verify the adapter feature flag first:
- Discord: `channels.discord.threadBindings.spawnAcpSessions=true`
Current-conversation binds do not require child-thread creation. They require an active conversation context and a channel adapter that exposes ACP conversation bindings.
See [Configuration Reference](/gateway/configuration-reference).
## Plugin setup for acpx backend
Fresh installs ship the bundled `acpx` runtime plugin enabled by default, so ACP
usually works without a manual plugin install step.
Start with:
```text
/acp doctor
```
If you disabled `acpx`, denied it via `plugins.allow` / `plugins.deny`, or want
to switch to a local development checkout, use the explicit plugin path:
```bash
openclaw plugins install acpx
openclaw config set plugins.entries.acpx.enabled true
```
Local workspace install during development:
```bash
openclaw plugins install ./path/to/local/acpx-plugin
```
Then verify backend health:
```text
/acp doctor
```
### acpx command and version configuration
By default, the bundled `acpx` plugin uses its plugin-local pinned binary (`node_modules/.bin/acpx` inside the plugin package). Startup registers the backend as not-ready and a background job verifies `acpx --version`; if the binary is missing or mismatched, it runs `npm install --omit=dev --no-save acpx@<pinned>` and re-verifies. The gateway stays non-blocking throughout.
Override the command or version in plugin config:
```json
{
"plugins": {
"entries": {
"acpx": {
"enabled": true,
"config": {
"command": "../acpx/dist/cli.js",
"expectedVersion": "any"
}
}
}
}
}
```
- `command` accepts an absolute path, relative path (resolved from the OpenClaw workspace), or command name.
- `expectedVersion: "any"` disables strict version matching.
- Custom `command` paths disable plugin-local auto-install.
See [Plugins](/tools/plugin).
### Automatic dependency install
When you install OpenClaw globally with `npm install -g openclaw`, the acpx
runtime dependencies (platform-specific binaries) are installed automatically
via a postinstall hook. If the automatic install fails, the gateway still starts
normally and reports the missing dependency through `openclaw acp doctor`.
### Plugin tools MCP bridge
By default, ACPX sessions do **not** expose OpenClaw plugin-registered tools to
the ACP harness.
If you want ACP agents such as Codex or Claude Code to call installed
OpenClaw plugin tools such as memory recall/store, enable the dedicated bridge:
```bash
openclaw config set plugins.entries.acpx.config.pluginToolsMcpBridge true
```
What this does:
- Injects a built-in MCP server named `openclaw-plugin-tools` into ACPX session
bootstrap.
- Exposes plugin tools already registered by installed and enabled OpenClaw
plugins.
- Keeps the feature explicit and default-off.
Security and trust notes:
- This expands the ACP harness tool surface.
- ACP agents get access only to plugin tools already active in the gateway.
- Treat this as the same trust boundary as letting those plugins execute in
OpenClaw itself.
- Review installed plugins before enabling it.
Custom `mcpServers` still work as before. The built-in plugin-tools bridge is an
additional opt-in convenience, not a replacement for generic MCP server config.
### OpenClaw tools MCP bridge
By default, ACPX sessions also do **not** expose built-in OpenClaw tools through
MCP. Enable the separate core-tools bridge when an ACP agent needs selected
built-in tools such as `cron`:
```bash
openclaw config set plugins.entries.acpx.config.openClawToolsMcpBridge true
```
What this does:
- Injects a built-in MCP server named `openclaw-tools` into ACPX session
bootstrap.
- Exposes selected built-in OpenClaw tools. The initial server exposes `cron`.
- Keeps core-tool exposure explicit and default-off.
### Runtime timeout configuration
The bundled `acpx` plugin defaults embedded runtime turns to a 120-second
timeout. This gives slower harnesses such as Gemini CLI enough time to complete
ACP startup and initialization. Override it if your host needs a different
runtime limit:
```bash
openclaw config set plugins.entries.acpx.config.timeoutSeconds 180
```
Restart the gateway after changing this value.
### Health probe agent configuration
The bundled `acpx` plugin probes one harness agent while deciding whether the
embedded runtime backend is ready. It defaults to `codex`. If your deployment
uses a different default ACP agent, set the probe agent to the same id:
```bash
openclaw config set plugins.entries.acpx.config.probeAgent claude
```
Restart the gateway after changing this value.
## Permission configuration
ACP sessions run non-interactively — there is no TTY to approve or deny file-write and shell-exec permission prompts. The acpx plugin provides two config keys that control how permissions are handled:
These ACPX harness permissions are separate from OpenClaw exec approvals and separate from CLI-backend vendor bypass flags such as Claude CLI `--permission-mode bypassPermissions`. ACPX `approve-all` is the harness-level break-glass switch for ACP sessions.
### `permissionMode`
Controls which operations the harness agent can perform without prompting.
| Value | Behavior |
| --------------- | --------------------------------------------------------- |
| `approve-all` | Auto-approve all file writes and shell commands. |
| `approve-reads` | Auto-approve reads only; writes and exec require prompts. |
| `deny-all` | Deny all permission prompts. |
### `nonInteractivePermissions`
Controls what happens when a permission prompt would be shown but no interactive TTY is available (which is always the case for ACP sessions).
| Value | Behavior |
| ------ | ----------------------------------------------------------------- |
| `fail` | Abort the session with `AcpRuntimeError`. **(default)** |
| `deny` | Silently deny the permission and continue (graceful degradation). |
### Configuration
Set via plugin config:
```bash
openclaw config set plugins.entries.acpx.config.permissionMode approve-all
openclaw config set plugins.entries.acpx.config.nonInteractivePermissions fail
```
Restart the gateway after changing these values.
> **Important:** OpenClaw currently defaults to `permissionMode=approve-reads` and `nonInteractivePermissions=fail`. In non-interactive ACP sessions, any write or exec that triggers a permission prompt can fail with `AcpRuntimeError: Permission prompt unavailable in non-interactive mode`.
>
> If you need to restrict permissions, set `nonInteractivePermissions` to `deny` so sessions degrade gracefully instead of crashing.
For acpx harness configuration (Claude Code / Codex / Gemini CLI aliases), the
plugin-tools and OpenClaw-tools MCP bridges, and ACP permission modes, see
[ACP agents — setup](/tools/acp-agents-setup).
## Troubleshooting


@@ -54,17 +54,41 @@ Legacy `tools.web.search.apiKey` still loads through the compatibility shim, but
## Tool parameters
| Parameter | Description |
| ------------- | ------------------------------------------------------------------- |
| `query` | Search query (required) |
| `count` | Number of results to return (1-10, default: 5) |
| `country` | 2-letter ISO country code (e.g., "US", "DE") |
| `language` | ISO 639-1 language code for search results (e.g., "en", "de", "fr") |
| `search_lang` | Brave search-language code (e.g., `en`, `en-gb`, `zh-hans`) |
| `ui_lang` | ISO language code for UI elements |
| `freshness` | Time filter: `day` (24h), `week`, `month`, or `year` |
| `date_after` | Only results published after this date (YYYY-MM-DD) |
| `date_before` | Only results published before this date (YYYY-MM-DD) |
<ParamField path="query" type="string" required>
Search query.
</ParamField>
<ParamField path="count" type="number" default="5">
Number of results to return (1-10).
</ParamField>
<ParamField path="country" type="string">
2-letter ISO country code (e.g. `US`, `DE`).
</ParamField>
<ParamField path="language" type="string">
ISO 639-1 language code for search results (e.g. `en`, `de`, `fr`).
</ParamField>
<ParamField path="search_lang" type="string">
Brave search-language code (e.g. `en`, `en-gb`, `zh-hans`).
</ParamField>
<ParamField path="ui_lang" type="string">
ISO language code for UI elements.
</ParamField>
<ParamField path="freshness" type="'day' | 'week' | 'month' | 'year'">
Time filter — `day` is 24 hours.
</ParamField>
<ParamField path="date_after" type="string">
Only results published after this date (`YYYY-MM-DD`).
</ParamField>
<ParamField path="date_before" type="string">
Only results published before this date (`YYYY-MM-DD`).
</ParamField>
**Examples:**

@@ -0,0 +1,343 @@
---
summary: "OpenClaw browser control API, CLI reference, and scripting actions"
read_when:
- Scripting or debugging the agent browser via the local control API
- Looking for the `openclaw browser` CLI reference
- Adding custom browser automation with snapshots and refs
title: "Browser control API"
---
For setup, configuration, and troubleshooting, see [Browser](/tools/browser).
This page is the reference for the local control HTTP API, the `openclaw browser`
CLI, and scripting patterns (snapshots, refs, waits, debug flows).
## Control API (optional)
For local integrations only, the Gateway exposes a small loopback HTTP API:
- Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
- Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
- Actions: `POST /navigate`, `POST /act`
- Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
- Downloads: `POST /download`, `POST /wait/download`
- Debugging: `GET /console`, `POST /pdf`
- Debugging: `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight`
- Network: `POST /response/body`
- State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear`
- State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
- Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`
All endpoints accept `?profile=<name>`.
If shared-secret gateway auth is configured, browser HTTP routes require auth too:
- `Authorization: Bearer <gateway token>`
- `x-openclaw-password: <gateway password>` or HTTP Basic auth with that password
Notes:
- This standalone loopback browser API does **not** consume trusted-proxy or
Tailscale Serve identity headers.
- If `gateway.auth.mode` is `none` or `trusted-proxy`, these loopback browser
routes do not inherit those identity-bearing modes; keep them loopback-only.
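As a rough illustration of the `?profile=<name>` query parameter and the Bearer header described above, the following sketch assembles the URL and headers for one call. `buildControlRequest` and its option names are hypothetical helpers, and the base URL is a placeholder rather than a documented default.

```typescript
// Hypothetical helper: builds the URL and headers for one control API call.
interface ControlRequestOptions {
  profile?: string; // sent as ?profile=<name>
  token?: string; // gateway token, sent as a Bearer credential
}

interface ControlRequest {
  url: string;
  headers: Record<string, string>;
}

function buildControlRequest(
  baseUrl: string,
  path: string,
  options: ControlRequestOptions = {},
): ControlRequest {
  const url = new URL(path, baseUrl);
  if (options.profile !== undefined) {
    url.searchParams.set("profile", options.profile);
  }
  const headers: Record<string, string> = {};
  if (options.token !== undefined) {
    headers["Authorization"] = `Bearer ${options.token}`;
  }
  return { url: url.toString(), headers };
}

// The base URL is a placeholder; use the address your Gateway actually listens on.
const tabsRequest = buildControlRequest("http://127.0.0.1:9999", "/tabs", {
  profile: "work",
  token: "example-token",
});
console.log(tabsRequest.url);
```

The result can be passed to any HTTP client; when no shared-secret auth is configured, omit `token` and no `Authorization` header is produced.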
### `/act` error contract
`POST /act` uses a structured error response for route-level validation and
policy failures:
```json
{ "error": "<message>", "code": "ACT_*" }
```
Current `code` values:
- `ACT_KIND_REQUIRED` (HTTP 400): `kind` is missing or unrecognized.
- `ACT_INVALID_REQUEST` (HTTP 400): action payload failed normalization or validation.
- `ACT_SELECTOR_UNSUPPORTED` (HTTP 400): `selector` was used with an unsupported action kind.
- `ACT_EVALUATE_DISABLED` (HTTP 403): `evaluate` (or `wait --fn`) is disabled by config.
- `ACT_TARGET_ID_MISMATCH` (HTTP 403): top-level or batched `targetId` conflicts with request target.
- `ACT_EXISTING_SESSION_UNSUPPORTED` (HTTP 501): action is not supported for existing-session profiles.
Other runtime failures may still return `{ "error": "<message>" }` without a
`code` field.
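A client might separate contract errors from other runtime failures along these lines. This is a minimal sketch: `parseActFailure` and the `ActFailure` type are hypothetical names, and only the `error` and `code` fields described above are assumed to exist in the response body.

```typescript
// Codes listed in the /act error contract.
const ACT_CODES = [
  "ACT_KIND_REQUIRED",
  "ACT_INVALID_REQUEST",
  "ACT_SELECTOR_UNSUPPORTED",
  "ACT_EVALUATE_DISABLED",
  "ACT_TARGET_ID_MISMATCH",
  "ACT_EXISTING_SESSION_UNSUPPORTED",
] as const;

type ActCode = (typeof ACT_CODES)[number];
const KNOWN_CODES = new Set<string>(ACT_CODES);

type ActFailure =
  | { kind: "contract"; code: ActCode; message: string }
  | { kind: "runtime"; message: string };

// Returns undefined when the body does not look like an error payload at all.
function parseActFailure(body: unknown): ActFailure | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const record = body as Record<string, unknown>;
  const message = record["error"];
  if (typeof message !== "string") return undefined;
  const code = record["code"];
  if (typeof code === "string" && KNOWN_CODES.has(code)) {
    return { kind: "contract", code: code as ActCode, message };
  }
  // No code, or a code this client does not know: treat as a plain runtime failure.
  return { kind: "runtime", message };
}
```

Treating unknown codes as runtime failures keeps the client working if new `ACT_*` values are added later.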
### Playwright requirement
Some features (navigate/act/AI snapshot/role snapshot, element screenshots,
PDF) require Playwright. If Playwright isn't installed, those endpoints return
a clear 501 error.
What still works without Playwright:
- ARIA snapshots
- Page screenshots for the managed `openclaw` browser when a per-tab CDP
WebSocket is available
- Page screenshots for `existing-session` / Chrome MCP profiles
- `existing-session` ref-based screenshots (`--ref`) from snapshot output
What still needs Playwright:
- `navigate`
- `act`
- AI snapshots / role snapshots
- CSS-selector element screenshots (`--element`)
- full browser PDF export
Element screenshots also reject `--full-page`; the route returns `fullPage is
not supported for element screenshots`.
If you see `Playwright is not available in this gateway build`, repair the
bundled browser plugin runtime dependencies so `playwright-core` is installed,
then restart the gateway. For packaged installs, run `openclaw doctor --fix`.
For Docker, also install the Chromium browser binaries as shown below.
#### Docker Playwright install
If your Gateway runs in Docker, avoid `npx playwright` (npm override conflicts).
Use the bundled CLI instead:
```bash
docker compose run --rm openclaw-cli \
node /app/node_modules/playwright-core/cli.js install chromium
```
To persist browser downloads, set `PLAYWRIGHT_BROWSERS_PATH` (for example,
`/home/node/.cache/ms-playwright`) and make sure `/home/node` is persisted via
`OPENCLAW_HOME_VOLUME` or a bind mount. See [Docker](/install/docker).
## How it works (internal)
A small loopback control server accepts HTTP requests and connects to Chromium-based browsers via CDP. Advanced actions (click/type/snapshot/PDF) go through Playwright on top of CDP; when Playwright is missing, only non-Playwright operations are available. The agent sees one stable interface while local/remote browsers and profiles swap freely underneath.
## CLI quick reference
All commands accept `--browser-profile <name>` to target a specific profile, and `--json` for machine-readable output.
<AccordionGroup>
<Accordion title="Basics: status, tabs, open/focus/close">
```bash
openclaw browser status
openclaw browser start
openclaw browser stop # also clears emulation on attach-only/remote CDP
openclaw browser tabs
openclaw browser tab # shortcut for current tab
openclaw browser tab new
openclaw browser tab select 2
openclaw browser tab close 2
openclaw browser open https://example.com
openclaw browser focus abcd1234
openclaw browser close abcd1234
```
</Accordion>
<Accordion title="Inspection: screenshot, snapshot, console, errors, requests">
```bash
openclaw browser screenshot
openclaw browser screenshot --full-page
openclaw browser screenshot --ref 12 # or --ref e12
openclaw browser snapshot
openclaw browser snapshot --format aria --limit 200
openclaw browser snapshot --interactive --compact --depth 6
openclaw browser snapshot --efficient
openclaw browser snapshot --labels
openclaw browser snapshot --selector "#main" --interactive
openclaw browser snapshot --frame "iframe#main" --interactive
openclaw browser console --level error
openclaw browser errors --clear
openclaw browser requests --filter api --clear
openclaw browser pdf
openclaw browser responsebody "**/api" --max-chars 5000
```
</Accordion>
<Accordion title="Actions: navigate, click, type, drag, wait, evaluate">
```bash
openclaw browser navigate https://example.com
openclaw browser resize 1280 720
openclaw browser click 12 --double # or e12 for role refs
openclaw browser type 23 "hello" --submit
openclaw browser press Enter
openclaw browser hover 44
openclaw browser scrollintoview e12
openclaw browser drag 10 11
openclaw browser select 9 OptionA OptionB
openclaw browser download e12 report.pdf
openclaw browser waitfordownload report.pdf
openclaw browser upload /tmp/openclaw/uploads/file.pdf
openclaw browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'
openclaw browser dialog --accept
openclaw browser wait --text "Done"
openclaw browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"
openclaw browser evaluate --fn '(el) => el.textContent' --ref 7
openclaw browser highlight e12
openclaw browser trace start
openclaw browser trace stop
```
</Accordion>
<Accordion title="State: cookies, storage, offline, headers, geo, device">
```bash
openclaw browser cookies
openclaw browser cookies set session abc123 --url "https://example.com"
openclaw browser cookies clear
openclaw browser storage local get
openclaw browser storage local set theme dark
openclaw browser storage session clear
openclaw browser set offline on
openclaw browser set headers --headers-json '{"X-Debug":"1"}'
openclaw browser set credentials user pass # --clear to remove
openclaw browser set geo 37.7749 -122.4194 --origin "https://example.com"
openclaw browser set media dark
openclaw browser set timezone America/New_York
openclaw browser set locale en-US
openclaw browser set device "iPhone 14"
```
</Accordion>
</AccordionGroup>
Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
- `click`/`type`/etc require a `ref` from `snapshot` (numeric `12` or role ref `e12`). CSS selectors are intentionally not supported for actions.
- Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
- `upload` can also set file inputs directly via `--input-ref` or `--element`.
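A script can pre-check a path against the temp-root constraint before calling the CLI. The sketch below is a hypothetical client-side check, not OpenClaw's own validation: it assumes the `/tmp/openclaw` root, ignores the `os.tmpdir()` fallback, and does not resolve symlinks.

```typescript
// Assumed root from the note above; the os.tmpdir() fallback is not handled here.
const TEMP_ROOT = "/tmp/openclaw";

// Resolve "." and ".." segments of an absolute POSIX path without touching the filesystem.
function normalizePosix(absPath: string): string {
  const out: string[] = [];
  for (const segment of absPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// True when the path is the root itself or lies below it after normalization.
function isUnderTempRoot(candidate: string, root: string = TEMP_ROOT): boolean {
  if (!candidate.startsWith("/")) return false; // relative paths are rejected outright
  const normalized = normalizePosix(candidate);
  return normalized === root || normalized.startsWith(root + "/");
}
```

Comparing against `root + "/"` rather than the bare prefix keeps sibling directories such as `/tmp/openclaw-other` from passing the check.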
Snapshot flags at a glance:
- `--format ai` (default with Playwright): AI snapshot with numeric refs (`aria-ref="<n>"`).
- `--format aria`: accessibility tree, no refs; inspection only.
- `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see [Gateway configuration](/gateway/configuration-reference#browser)).
- `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
- `--labels` adds a viewport-only screenshot with overlaid ref labels (prints `MEDIA:<path>`).
## Snapshots and refs
OpenClaw supports two “snapshot” styles:
- **AI snapshot (numeric refs)**: `openclaw browser snapshot` (default; `--format ai`)
- Output: a text snapshot that includes numeric refs.
- Actions: `openclaw browser click 12`, `openclaw browser type 23 "hello"`.
- Internally, the ref is resolved via Playwright's `aria-ref`.
- **Role snapshot (role refs like `e12`)**: `openclaw browser snapshot --interactive` (or `--compact`, `--depth`, `--selector`, `--frame`)
- Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
- Actions: `openclaw browser click e12`, `openclaw browser highlight e12`.
- Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
- Add `--labels` to include a viewport screenshot with overlaid `e12` labels.
Ref behavior:
- Refs are **not stable across navigations**; if something fails, re-run `snapshot` and use a fresh ref.
- If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
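Tooling that passes refs around may want to tell the two styles apart before issuing an action. The following sketch assumes numeric refs are plain digits and role refs always look like `e` followed by digits, as in the examples above; `classifyRef` is a hypothetical helper.

```typescript
type RefKind = "ai" | "role";

// Classify a ref string by its shape; returns undefined for anything else.
function classifyRef(ref: string): RefKind | undefined {
  if (/^\d+$/.test(ref)) return "ai"; // numeric ref from an AI snapshot, e.g. "12"
  if (/^e\d+$/.test(ref)) return "role"; // role ref from a role snapshot, e.g. "e12"
  return undefined; // e.g. a CSS selector, which actions do not accept
}
```

A caller could use an `undefined` result to re-run `snapshot` and pick a fresh ref instead of sending an action that is bound to fail.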
## Wait power-ups
You can wait on more than just time/text:
- Wait for URL (globs supported by Playwright):
- `openclaw browser wait --url "**/dash"`
- Wait for load state:
- `openclaw browser wait --load networkidle`
- Wait for a JS predicate:
- `openclaw browser wait --fn "window.ready===true"`
- Wait for a selector to become visible:
- `openclaw browser wait "#main"`
These can be combined:
```bash
openclaw browser wait "#main" \
--url "**/dash" \
--load networkidle \
--fn "window.ready===true" \
--timeout-ms 15000
```
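When driving the CLI from a script, building the argument list as an array avoids shell-quoting problems with globs and JS predicates. This is a sketch under that assumption: `buildWaitArgs` and `WaitOptions` are hypothetical names, while the flags are the ones shown above.

```typescript
interface WaitOptions {
  selector?: string; // positional selector, e.g. "#main"
  text?: string;
  url?: string; // glob, e.g. "**/dash"
  load?: string; // load state, e.g. "networkidle"
  fn?: string; // JS predicate source
  timeoutMs?: number;
}

// Returns the arguments that follow the `openclaw` executable name.
function buildWaitArgs(options: WaitOptions): string[] {
  const args: string[] = ["browser", "wait"];
  if (options.selector !== undefined) args.push(options.selector);
  if (options.text !== undefined) args.push("--text", options.text);
  if (options.url !== undefined) args.push("--url", options.url);
  if (options.load !== undefined) args.push("--load", options.load);
  if (options.fn !== undefined) args.push("--fn", options.fn);
  if (options.timeoutMs !== undefined) {
    args.push("--timeout-ms", String(options.timeoutMs));
  }
  return args;
}

const waitArgs = buildWaitArgs({
  selector: "#main",
  url: "**/dash",
  load: "networkidle",
  fn: "window.ready===true",
  timeoutMs: 15000,
});
console.log(waitArgs.join(" "));
```

The array can be handed to a process-spawning API that takes arguments separately (for example Node's `execFile`), so no value passes through a shell.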
## Debug workflows
When an action fails (e.g. “not visible”, “strict mode violation”, “covered”):
1. `openclaw browser snapshot --interactive`
2. Use `click <ref>` / `type <ref>` (prefer role refs in interactive mode)
3. If it still fails: `openclaw browser highlight <ref>` to see what Playwright is targeting
4. If the page behaves oddly:
- `openclaw browser errors --clear`
- `openclaw browser requests --filter api --clear`
5. For deep debugging: record a trace:
- `openclaw browser trace start`
- reproduce the issue
- `openclaw browser trace stop` (prints `TRACE:<path>`)
## JSON output
`--json` is for scripting and structured tooling.
Examples:
```bash
openclaw browser status --json
openclaw browser snapshot --interactive --json
openclaw browser requests --filter api --json
openclaw browser cookies --json
```
Role snapshots in JSON include `refs` plus a small `stats` block (lines/chars/refs/interactive) so tools can reason about payload size and density.
## State and environment knobs
These are useful for “make the site behave like X” workflows:
- Cookies: `cookies`, `cookies set`, `cookies clear`
- Storage: `storage local|session get|set|clear`
- Offline: `set offline on|off`
- Headers: `set headers --headers-json '{"X-Debug":"1"}'` (legacy `set headers --json '{"X-Debug":"1"}'` remains supported)
- HTTP basic auth: `set credentials user pass` (or `--clear`)
- Geolocation: `set geo <lat> <lon> --origin "https://example.com"` (or `--clear`)
- Media: `set media dark|light|no-preference|none`
- Timezone / locale: `set timezone ...`, `set locale ...`
- Device / viewport:
- `set device "iPhone 14"` (Playwright device presets)
- `set viewport 1280 720`
## Security and privacy
- The openclaw browser profile may contain logged-in sessions; treat it as sensitive.
- `browser act kind=evaluate` / `openclaw browser evaluate` and `wait --fn`
execute arbitrary JavaScript in the page context. Prompt injection can steer
this. Disable it with `browser.evaluateEnabled=false` if you do not need it.
- For logins and anti-bot notes (X/Twitter, etc.), see [Browser login + X/Twitter posting](/tools/browser-login).
- Keep the Gateway/node host private (loopback or tailnet-only).
- Remote CDP endpoints are powerful; tunnel and protect them.
Strict-mode example (block private/internal destinations by default):
```json5
{
browser: {
ssrfPolicy: {
dangerouslyAllowPrivateNetwork: false,
hostnameAllowlist: ["*.example.com", "example.com"],
allowedHostnames: ["localhost"], // optional exact allow
},
},
}
```
## Related
- [Browser](/tools/browser) — overview, configuration, profiles, security
- [Browser login](/tools/browser-login) — signing in to sites
- [Browser Linux troubleshooting](/tools/browser-linux-troubleshooting)
- [Browser WSL2 troubleshooting](/tools/browser-wsl2-windows-remote-cdp-troubleshooting)

@@ -516,327 +516,10 @@ Platforms:
## Control API (optional)
For local integrations only, the Gateway exposes a small loopback HTTP API:
- Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
- Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
- Actions: `POST /navigate`, `POST /act`
- Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
- Downloads: `POST /download`, `POST /wait/download`
- Debugging: `GET /console`, `POST /pdf`
- Debugging: `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight`
- Network: `POST /response/body`
- State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear`
- State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
- Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`
All endpoints accept `?profile=<name>`.
If shared-secret gateway auth is configured, browser HTTP routes require auth too:
- `Authorization: Bearer <gateway token>`
- `x-openclaw-password: <gateway password>` or HTTP Basic auth with that password
Notes:
- This standalone loopback browser API does **not** consume trusted-proxy or
Tailscale Serve identity headers.
- If `gateway.auth.mode` is `none` or `trusted-proxy`, these loopback browser
routes do not inherit those identity-bearing modes; keep them loopback-only.
### `/act` error contract
`POST /act` uses a structured error response for route-level validation and
policy failures:
```json
{ "error": "<message>", "code": "ACT_*" }
```
Current `code` values:
- `ACT_KIND_REQUIRED` (HTTP 400): `kind` is missing or unrecognized.
- `ACT_INVALID_REQUEST` (HTTP 400): action payload failed normalization or validation.
- `ACT_SELECTOR_UNSUPPORTED` (HTTP 400): `selector` was used with an unsupported action kind.
- `ACT_EVALUATE_DISABLED` (HTTP 403): `evaluate` (or `wait --fn`) is disabled by config.
- `ACT_TARGET_ID_MISMATCH` (HTTP 403): top-level or batched `targetId` conflicts with request target.
- `ACT_EXISTING_SESSION_UNSUPPORTED` (HTTP 501): action is not supported for existing-session profiles.
Other runtime failures may still return `{ "error": "<message>" }` without a
`code` field.
### Playwright requirement
Some features (navigate/act/AI snapshot/role snapshot, element screenshots,
PDF) require Playwright. If Playwright isn't installed, those endpoints return
a clear 501 error.
What still works without Playwright:
- ARIA snapshots
- Page screenshots for the managed `openclaw` browser when a per-tab CDP
WebSocket is available
- Page screenshots for `existing-session` / Chrome MCP profiles
- `existing-session` ref-based screenshots (`--ref`) from snapshot output
What still needs Playwright:
- `navigate`
- `act`
- AI snapshots / role snapshots
- CSS-selector element screenshots (`--element`)
- full browser PDF export
Element screenshots also reject `--full-page`; the route returns `fullPage is
not supported for element screenshots`.
If you see `Playwright is not available in this gateway build`, repair the
bundled browser plugin runtime dependencies so `playwright-core` is installed,
then restart the gateway. For packaged installs, run `openclaw doctor --fix`.
For Docker, also install the Chromium browser binaries as shown below.
#### Docker Playwright install
If your Gateway runs in Docker, avoid `npx playwright` (npm override conflicts).
Use the bundled CLI instead:
```bash
docker compose run --rm openclaw-cli \
node /app/node_modules/playwright-core/cli.js install chromium
```
To persist browser downloads, set `PLAYWRIGHT_BROWSERS_PATH` (for example,
`/home/node/.cache/ms-playwright`) and make sure `/home/node` is persisted via
`OPENCLAW_HOME_VOLUME` or a bind mount. See [Docker](/install/docker).
## How it works (internal)
A small loopback control server accepts HTTP requests and connects to Chromium-based browsers via CDP. Advanced actions (click/type/snapshot/PDF) go through Playwright on top of CDP; when Playwright is missing, only non-Playwright operations are available. The agent sees one stable interface while local/remote browsers and profiles swap freely underneath.
## CLI quick reference
All commands accept `--browser-profile <name>` to target a specific profile, and `--json` for machine-readable output.
<AccordionGroup>
<Accordion title="Basics: status, tabs, open/focus/close">
```bash
openclaw browser status
openclaw browser start
openclaw browser stop # also clears emulation on attach-only/remote CDP
openclaw browser tabs
openclaw browser tab # shortcut for current tab
openclaw browser tab new
openclaw browser tab select 2
openclaw browser tab close 2
openclaw browser open https://example.com
openclaw browser focus abcd1234
openclaw browser close abcd1234
```
</Accordion>
<Accordion title="Inspection: screenshot, snapshot, console, errors, requests">
```bash
openclaw browser screenshot
openclaw browser screenshot --full-page
openclaw browser screenshot --ref 12 # or --ref e12
openclaw browser snapshot
openclaw browser snapshot --format aria --limit 200
openclaw browser snapshot --interactive --compact --depth 6
openclaw browser snapshot --efficient
openclaw browser snapshot --labels
openclaw browser snapshot --selector "#main" --interactive
openclaw browser snapshot --frame "iframe#main" --interactive
openclaw browser console --level error
openclaw browser errors --clear
openclaw browser requests --filter api --clear
openclaw browser pdf
openclaw browser responsebody "**/api" --max-chars 5000
```
</Accordion>
<Accordion title="Actions: navigate, click, type, drag, wait, evaluate">
```bash
openclaw browser navigate https://example.com
openclaw browser resize 1280 720
openclaw browser click 12 --double # or e12 for role refs
openclaw browser type 23 "hello" --submit
openclaw browser press Enter
openclaw browser hover 44
openclaw browser scrollintoview e12
openclaw browser drag 10 11
openclaw browser select 9 OptionA OptionB
openclaw browser download e12 report.pdf
openclaw browser waitfordownload report.pdf
openclaw browser upload /tmp/openclaw/uploads/file.pdf
openclaw browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'
openclaw browser dialog --accept
openclaw browser wait --text "Done"
openclaw browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"
openclaw browser evaluate --fn '(el) => el.textContent' --ref 7
openclaw browser highlight e12
openclaw browser trace start
openclaw browser trace stop
```
</Accordion>
<Accordion title="State: cookies, storage, offline, headers, geo, device">
```bash
openclaw browser cookies
openclaw browser cookies set session abc123 --url "https://example.com"
openclaw browser cookies clear
openclaw browser storage local get
openclaw browser storage local set theme dark
openclaw browser storage session clear
openclaw browser set offline on
openclaw browser set headers --headers-json '{"X-Debug":"1"}'
openclaw browser set credentials user pass # --clear to remove
openclaw browser set geo 37.7749 -122.4194 --origin "https://example.com"
openclaw browser set media dark
openclaw browser set timezone America/New_York
openclaw browser set locale en-US
openclaw browser set device "iPhone 14"
```
</Accordion>
</AccordionGroup>
Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
- `click`/`type`/etc require a `ref` from `snapshot` (numeric `12` or role ref `e12`). CSS selectors are intentionally not supported for actions.
- Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
- `upload` can also set file inputs directly via `--input-ref` or `--element`.
Snapshot flags at a glance:
- `--format ai` (default with Playwright): AI snapshot with numeric refs (`aria-ref="<n>"`).
- `--format aria`: accessibility tree, no refs; inspection only.
- `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see [Gateway configuration](/gateway/configuration-reference#browser)).
- `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
- `--labels` adds a viewport-only screenshot with overlaid ref labels (prints `MEDIA:<path>`).
## Snapshots and refs
OpenClaw supports two “snapshot” styles:
- **AI snapshot (numeric refs)**: `openclaw browser snapshot` (default; `--format ai`)
- Output: a text snapshot that includes numeric refs.
- Actions: `openclaw browser click 12`, `openclaw browser type 23 "hello"`.
- Internally, the ref is resolved via Playwright's `aria-ref`.
- **Role snapshot (role refs like `e12`)**: `openclaw browser snapshot --interactive` (or `--compact`, `--depth`, `--selector`, `--frame`)
- Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
- Actions: `openclaw browser click e12`, `openclaw browser highlight e12`.
- Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
- Add `--labels` to include a viewport screenshot with overlaid `e12` labels.
Ref behavior:
- Refs are **not stable across navigations**; if something fails, re-run `snapshot` and use a fresh ref.
- If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
## Wait power-ups
You can wait on more than just time/text:
- Wait for URL (globs supported by Playwright):
- `openclaw browser wait --url "**/dash"`
- Wait for load state:
- `openclaw browser wait --load networkidle`
- Wait for a JS predicate:
- `openclaw browser wait --fn "window.ready===true"`
- Wait for a selector to become visible:
- `openclaw browser wait "#main"`
These can be combined:
```bash
openclaw browser wait "#main" \
--url "**/dash" \
--load networkidle \
--fn "window.ready===true" \
--timeout-ms 15000
```
## Debug workflows
When an action fails (e.g. “not visible”, “strict mode violation”, “covered”):
1. `openclaw browser snapshot --interactive`
2. Use `click <ref>` / `type <ref>` (prefer role refs in interactive mode)
3. If it still fails: `openclaw browser highlight <ref>` to see what Playwright is targeting
4. If the page behaves oddly:
- `openclaw browser errors --clear`
- `openclaw browser requests --filter api --clear`
5. For deep debugging: record a trace:
- `openclaw browser trace start`
- reproduce the issue
- `openclaw browser trace stop` (prints `TRACE:<path>`)
## JSON output
`--json` is for scripting and structured tooling.
Examples:
```bash
openclaw browser status --json
openclaw browser snapshot --interactive --json
openclaw browser requests --filter api --json
openclaw browser cookies --json
```
Role snapshots in JSON include `refs` plus a small `stats` block (lines/chars/refs/interactive) so tools can reason about payload size and density.
## State and environment knobs
These are useful for “make the site behave like X” workflows:
- Cookies: `cookies`, `cookies set`, `cookies clear`
- Storage: `storage local|session get|set|clear`
- Offline: `set offline on|off`
- Headers: `set headers --headers-json '{"X-Debug":"1"}'` (legacy `set headers --json '{"X-Debug":"1"}'` remains supported)
- HTTP basic auth: `set credentials user pass` (or `--clear`)
- Geolocation: `set geo <lat> <lon> --origin "https://example.com"` (or `--clear`)
- Media: `set media dark|light|no-preference|none`
- Timezone / locale: `set timezone ...`, `set locale ...`
- Device / viewport:
- `set device "iPhone 14"` (Playwright device presets)
- `set viewport 1280 720`
## Security and privacy
- The openclaw browser profile may contain logged-in sessions; treat it as sensitive.
- `browser act kind=evaluate` / `openclaw browser evaluate` and `wait --fn`
execute arbitrary JavaScript in the page context. Prompt injection can steer
this. Disable it with `browser.evaluateEnabled=false` if you do not need it.
- For logins and anti-bot notes (X/Twitter, etc.), see [Browser login + X/Twitter posting](/tools/browser-login).
- Keep the Gateway/node host private (loopback or tailnet-only).
- Remote CDP endpoints are powerful; tunnel and protect them.
Strict-mode example (block private/internal destinations by default):
```json5
{
browser: {
ssrfPolicy: {
dangerouslyAllowPrivateNetwork: false,
hostnameAllowlist: ["*.example.com", "example.com"],
allowedHostnames: ["localhost"], // optional exact allow
},
},
}
```
For scripting and debugging, the Gateway exposes a small **loopback-only HTTP
control API** plus a matching `openclaw browser` CLI (snapshots, refs, wait
power-ups, JSON output, debug workflows). See
[Browser control API](/tools/browser-control) for the full reference.
## Troubleshooting

@@ -64,12 +64,21 @@ Optional plugin-level settings for region and SafeSearch:
## Tool parameters
| Parameter | Description |
| ------------ | ---------------------------------------------------------- |
| `query` | Search query (required) |
| `count` | Results to return (1-10, default: 5) |
| `region` | DuckDuckGo region code (e.g. `us-en`, `uk-en`, `de-de`) |
| `safeSearch` | SafeSearch level: `strict`, `moderate` (default), or `off` |
<ParamField path="query" type="string" required>
Search query.
</ParamField>
<ParamField path="count" type="number" default="5">
Results to return (1-10).
</ParamField>
<ParamField path="region" type="string">
DuckDuckGo region code (e.g. `us-en`, `uk-en`, `de-de`).
</ParamField>
<ParamField path="safeSearch" type="'strict' | 'moderate' | 'off'" default="moderate">
SafeSearch level.
</ParamField>
Region and SafeSearch can also be set in plugin config (see above) — tool
parameters override config values per-query.

@@ -58,15 +58,33 @@ For a gateway install, put it in `~/.openclaw/.env`.
## Tool parameters
| Parameter | Description |
| ------------- | ----------------------------------------------------------------------------- |
| `query` | Search query (required) |
| `count` | Results to return (1-100) |
| `type` | Search mode: `auto`, `neural`, `fast`, `deep`, `deep-reasoning`, or `instant` |
| `freshness` | Time filter: `day`, `week`, `month`, or `year` |
| `date_after` | Results after this date (YYYY-MM-DD) |
| `date_before` | Results before this date (YYYY-MM-DD) |
| `contents` | Content extraction options (see below) |
<ParamField path="query" type="string" required>
Search query.
</ParamField>
<ParamField path="count" type="number">
Results to return (1-100).
</ParamField>
<ParamField path="type" type="'auto' | 'neural' | 'fast' | 'deep' | 'deep-reasoning' | 'instant'">
Search mode.
</ParamField>
<ParamField path="freshness" type="'day' | 'week' | 'month' | 'year'">
Time filter.
</ParamField>
<ParamField path="date_after" type="string">
Results after this date (`YYYY-MM-DD`).
</ParamField>
<ParamField path="date_before" type="string">
Results before this date (`YYYY-MM-DD`).
</ParamField>
<ParamField path="contents" type="object">
Content extraction options (see below).
</ParamField>
### Content extraction

@@ -99,18 +99,45 @@ If `provider: "perplexity"` is configured and the Perplexity key SecretRef is un
These parameters apply to the native Perplexity Search API path.
| Parameter | Description |
| --------------------- | ---------------------------------------------------- |
| `query` | Search query (required) |
| `count` | Number of results to return (1-10, default: 5) |
| `country` | 2-letter ISO country code (e.g., "US", "DE") |
| `language` | ISO 639-1 language code (e.g., "en", "de", "fr") |
| `freshness` | Time filter: `day` (24h), `week`, `month`, or `year` |
| `date_after` | Only results published after this date (YYYY-MM-DD) |
| `date_before` | Only results published before this date (YYYY-MM-DD) |
| `domain_filter` | Domain allowlist/denylist array (max 20) |
| `max_tokens` | Total content budget (default: 25000, max: 1000000) |
| `max_tokens_per_page` | Per-page token limit (default: 2048) |
<ParamField path="query" type="string" required>
Search query.
</ParamField>
<ParamField path="count" type="number" default="5">
Number of results to return (1-10).
</ParamField>
<ParamField path="country" type="string">
2-letter ISO country code (e.g. `US`, `DE`).
</ParamField>
<ParamField path="language" type="string">
ISO 639-1 language code (e.g. `en`, `de`, `fr`).
</ParamField>
<ParamField path="freshness" type="'day' | 'week' | 'month' | 'year'">
Time filter — `day` is 24 hours.
</ParamField>
<ParamField path="date_after" type="string">
Only results published after this date (`YYYY-MM-DD`).
</ParamField>
<ParamField path="date_before" type="string">
Only results published before this date (`YYYY-MM-DD`).
</ParamField>
<ParamField path="domain_filter" type="string[]">
Domain allowlist/denylist array (max 20).
</ParamField>
<ParamField path="max_tokens" type="number" default="25000">
Total content budget (max 1000000).
</ParamField>
<ParamField path="max_tokens_per_page" type="number" default="2048">
Per-page token limit.
</ParamField>
For the legacy Sonar/OpenRouter compatibility path:

@@ -129,6 +129,7 @@ Looking for third-party plugins? See [Community Plugins](/plugins/community).
{
plugins: {
enabled: true,
bundled: { mode: "default" },
allow: ["voice-call"],
deny: ["untrusted-plugin"],
load: { paths: ["~/Projects/oss/voice-call-plugin"] },
@@ -142,6 +143,7 @@ Looking for third-party plugins? See [Community Plugins](/plugins/community).
| Field | Description |
| ---------------- | --------------------------------------------------------- |
| `enabled` | Master toggle (default: `true`) |
| `bundled.mode` | Bundled plugin policy (`default`, `explicit`, `disabled`) |
| `allow` | Plugin allowlist (optional) |
| `deny` | Plugin denylist (optional; deny wins) |
| `load.paths` | Extra plugin files/directories |
@@ -186,9 +188,25 @@ OpenClaw scans for plugins in this order (first match wins):
- `plugins.enabled: false` disables all plugins
- `plugins.deny` always wins over allow
- `plugins.entries.\<id\>.enabled: false` disables that plugin
- `plugins.bundled.mode: "disabled"` disables all bundled plugins
- `plugins.bundled.mode: "explicit"` disables bundled defaults unless selected explicitly
- Workspace-origin plugins are **disabled by default** (must be explicitly enabled)
- Bundled plugins follow the built-in default-on set unless overridden
- Exclusive slots can force-enable the selected plugin for that slot
- Exclusive slots can force-enable the selected plugin unless `bundled.mode: "disabled"` blocks it
To load only a specific external plugin and no bundled plugins:
```json5
{
plugins: {
bundled: { mode: "disabled" },
allow: ["foo"],
entries: {
foo: { enabled: true },
},
},
}
```
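The enablement rules above can be read as a precedence chain. The sketch below is one possible reading, not OpenClaw's resolver: `isPluginEnabled`, `PluginCandidate`, `defaultOn`, and `slotSelected` are hypothetical names, an omitted `bundled.mode` is assumed to mean `"default"`, `allow` is left out because its exact semantics are not spelled out here, and plugins that are neither bundled nor workspace-origin are assumed to load unless denied or disabled.

```typescript
type BundledMode = "default" | "explicit" | "disabled";
type PluginOrigin = "bundled" | "workspace" | "other";

interface PluginsConfig {
  enabled?: boolean;
  bundled?: { mode?: BundledMode };
  deny?: string[];
  entries?: Record<string, { enabled?: boolean }>;
}

// Hypothetical description of a discovered plugin.
interface PluginCandidate {
  id: string;
  origin: PluginOrigin;
  defaultOn?: boolean; // member of the built-in default-on set (bundled only)
  slotSelected?: boolean; // selected for an exclusive slot
}

function isPluginEnabled(cfg: PluginsConfig, plugin: PluginCandidate): boolean {
  // Master toggle, denylist, and per-entry disable win over everything else.
  if (cfg.enabled === false) return false;
  if (cfg.deny?.includes(plugin.id)) return false;
  const entry = cfg.entries?.[plugin.id];
  if (entry?.enabled === false) return false;

  const explicitlyEnabled = entry?.enabled === true;
  if (plugin.origin === "bundled") {
    const mode = cfg.bundled?.mode ?? "default"; // assumption: omitted means "default"
    if (mode === "disabled") return false; // blocks slot selection too
    if (explicitlyEnabled || plugin.slotSelected === true) return true;
    return mode === "default" && plugin.defaultOn === true;
  }
  if (plugin.origin === "workspace") {
    return explicitlyEnabled; // disabled by default
  }
  return true; // assumption: other origins load unless denied or disabled
}
```

With the config from the example above, a bundled default-on plugin resolves to disabled while `foo` stays enabled.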
## Plugin slots (exclusive categories)

@@ -25,11 +25,17 @@ await web_fetch({ url: "https://example.com/article" });
## Tool parameters
| Parameter | Type | Description |
| ------------- | -------- | ---------------------------------------- |
| `url` | `string` | URL to fetch (required, http/https only) |
| `extractMode` | `string` | `"markdown"` (default) or `"text"` |
| `maxChars` | `number` | Truncate output to this many chars |
<ParamField path="url" type="string" required>
URL to fetch. `http(s)` only.
</ParamField>
<ParamField path="extractMode" type="'markdown' | 'text'" default="markdown">
Output format after main-content extraction.
</ParamField>
<ParamField path="maxChars" type="number">
Truncate output to this many characters.
</ParamField>
## How it works

@@ -105,6 +105,11 @@ locale picker lives in the Gateway Access card, not under Appearance.
## What it can do (today)
- Chat with the model via Gateway WS (`chat.history`, `chat.send`, `chat.abort`, `chat.inject`)
- Talk to OpenAI Realtime directly from the browser via WebRTC. The Gateway
mints a short-lived Realtime client secret with `talk.realtime.session`; the
browser sends microphone audio directly to OpenAI and relays
`openclaw_agent_consult` tool calls back through `chat.send` for the larger
configured OpenClaw model.
- Stream tool calls + live tool output cards in Chat (agent events)
- Channels: built-in plus bundled/external plugin channels status, QR login, and per-channel config (`channels.status`, `web.login.*`, `config.patch`)
- Instances: presence list + refresh (`system-presence`)
@@ -151,8 +156,13 @@ Cron jobs panel notes:
- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`), plain-text tool-call XML payloads (including `<tool_call>...</tool_call>`, `<function_call>...</function_call>`, `<tool_calls>...</tool_calls>`, `<function_calls>...</function_calls>`, and truncated tool-call blocks), and leaked ASCII/full-width model control tokens, and omits assistant entries whose whole visible text is only the exact silent token `NO_REPLY` / `no_reply`.
- `chat.inject` appends an assistant note to the session transcript and broadcasts a `chat` event for UI-only updates (no agent run, no channel delivery).
- The chat header model and thinking pickers patch the active session immediately through `sessions.patch`; they are persistent session overrides, not one-turn-only send options.
- Talk mode uses the registered realtime voice provider. Configure OpenAI with
`talk.provider: "openai"` plus `talk.providers.openai.apiKey`, or reuse the
Voice Call realtime provider config. The browser never receives the standard
OpenAI API key; it receives only the ephemeral Realtime client secret.
- Stop:
- Click **Stop** (calls `chat.abort`)
- While a run is active, normal follow-ups queue. Click **Steer** on a queued message to inject that follow-up into the running turn.
- Type `/stop` (or standalone abort phrases like `stop`, `stop action`, `stop run`, `stop openclaw`, `please stop`) to abort out-of-band
- `chat.abort` supports `{ sessionKey }` (no `runId`) to abort all active runs for that session
- Abort partial retention:

@@ -23,6 +23,17 @@
"groupLabel": "Anthropic",
"groupHint": "Claude CLI + API key"
},
{
"provider": "anthropic",
"method": "setup-token",
"choiceId": "setup-token",
"choiceLabel": "Anthropic setup-token",
"choiceHint": "Manual token path",
"assistantPriority": 40,
"groupId": "anthropic",
"groupLabel": "Anthropic",
"groupHint": "Claude CLI + API key + token"
},
{
"provider": "anthropic",
"method": "api-key",

@@ -1,6 +1,5 @@
import { afterEach, beforeEach, vi } from "vitest";
import { deriveDefaultBrowserCdpPortRange } from "../config/port-defaults.js";
import * as browserServerModule from "../server.js";
import type { MockFn } from "../test-utils/vitest-mock-fn.js";
import { installChromeUserDataDirHooks } from "./chrome-user-data-dir.test-harness.js";
import { getFreePort } from "./test-port.js";
@@ -466,12 +465,19 @@ vi.mock("./screenshot.js", () => ({
})),
}));
let browserServerModulePromise: Promise<typeof import("../server.js")> | undefined;
async function loadBrowserServerModule() {
browserServerModulePromise ??= import("../server.js");
return await browserServerModulePromise;
}
export async function startBrowserControlServerFromConfig() {
return await browserServerModule.startBrowserControlServerFromConfig();
return await (await loadBrowserServerModule()).startBrowserControlServerFromConfig();
}
export async function stopBrowserControlServer(): Promise<void> {
await browserServerModule.stopBrowserControlServer();
await (await loadBrowserServerModule()).stopBrowserControlServer();
}
export function makeResponse(

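The harness change above swaps a static import for a dynamic `import()` whose promise is cached with `??=`, so the module loads on first use and only once. A standalone sketch of that pattern follows; `createLazyLoader` and the stand-in module object are hypothetical and replace the real `import("../server.js")` so the block runs on its own.

```typescript
// Hypothetical helper: wraps an async loader so it runs at most once.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    // The first call starts the load; later calls reuse the same promise.
    cached ??= load();
    return cached;
  };
}

// Stand-in for a dynamic import; counts how many times the load actually runs.
let loadCount = 0;
const loadServerModule = createLazyLoader(async () => {
  loadCount += 1;
  return {
    startBrowserControlServerFromConfig: async () => "started",
  };
});

// Same call shape as the wrapper in the diff: await the module, then call through.
async function startBrowserControlServerFromConfig(): Promise<string> {
  return await (await loadServerModule()).startBrowserControlServerFromConfig();
}
```

Caching the promise rather than the resolved module means concurrent first callers share one load. One trade-off of this sketch is that a rejected load stays cached, so every later call sees the same failure.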
@@ -8,6 +8,12 @@ import { type CodexAppServerClientFactory } from "./src/app-server/client-factor
import type { CodexAppServerClient } from "./src/app-server/client.js";
import { resolveCodexAppServerRuntimeOptions } from "./src/app-server/config.js";
import { readModelListResult } from "./src/app-server/models.js";
import {
assertCodexThreadStartResponse,
assertCodexTurnStartResponse,
readCodexErrorNotification,
readCodexTurnCompletedNotification,
} from "./src/app-server/protocol-validators.js";
import {
isJsonObject,
type CodexServerNotification,
@@ -17,13 +23,6 @@ import {
type CodexTurnStartParams,
type JsonObject,
} from "./src/app-server/protocol.js";
import {
assertCodexThreadStartResponse,
assertCodexTurnStartResponse,
readCodexErrorNotification,
readCodexTurnCompletedNotification,
} from "./src/app-server/protocol-validators.js";
import { createIsolatedCodexAppServerClient } from "./src/app-server/shared-client.js";
const DEFAULT_CODEX_IMAGE_MODEL =
FALLBACK_CODEX_MODELS.find((model) => model.inputModalities.includes("image"))?.id ??
@@ -83,11 +82,14 @@ async function describeCodexImages(
const ownsClient = !options.clientFactory;
const client = options.clientFactory
? await options.clientFactory(appServer.start, req.profile)
: await createIsolatedCodexAppServerClient({
startOptions: appServer.start,
timeoutMs,
authProfileId: req.profile,
});
: await import("./src/app-server/shared-client.js").then(
({ createIsolatedCodexAppServerClient }) =>
createIsolatedCodexAppServerClient({
startOptions: appServer.start,
timeoutMs,
authProfileId: req.profile,
}),
);
const abortController = new AbortController();
const timeout = setTimeout(() => abortController.abort("timeout"), timeoutMs);
timeout.unref?.();
@@ -229,7 +231,8 @@ function createCodexImageTurnCollector(threadId: string) {
return;
}
if (notification.method === "turn/completed") {
completedTurn = readCodexTurnCompletedNotification(notification.params)?.turn ?? completedTurn;
completedTurn =
readCodexTurnCompletedNotification(notification.params)?.turn ?? completedTurn;
resolveCompletion?.();
return;
}

@@ -1,6 +1,5 @@
import type { CodexAppServerClient } from "./client.js";
import type { CodexAppServerStartOptions } from "./config.js";
import { getSharedCodexAppServerClient } from "./shared-client.js";
export type CodexAppServerClientFactory = (
startOptions?: CodexAppServerStartOptions,
@@ -10,7 +9,10 @@ export type CodexAppServerClientFactory = (
export const defaultCodexAppServerClientFactory: CodexAppServerClientFactory = (
startOptions,
authProfileId,
) => getSharedCodexAppServerClient({ startOptions, authProfileId });
) =>
import("./shared-client.js").then(({ getSharedCodexAppServerClient }) =>
getSharedCodexAppServerClient({ startOptions, authProfileId }),
);
export function createCodexAppServerClientFactoryTestHooks(
setFactory: (factory: CodexAppServerClientFactory) => void,

@@ -1,10 +1,6 @@
import type { CodexAppServerStartOptions } from "./config.js";
import type { v2 } from "./protocol-generated/typescript/index.js";
import { readCodexModelListResponse } from "./protocol-validators.js";
import {
createIsolatedCodexAppServerClient,
getSharedCodexAppServerClient,
} from "./shared-client.js";
export type CodexAppServerModel = {
id: string;
@@ -38,6 +34,8 @@ export async function listCodexAppServerModels(
): Promise<CodexAppServerModelListResult> {
const timeoutMs = options.timeoutMs ?? 2500;
const useSharedClient = options.sharedClient !== false;
const { createIsolatedCodexAppServerClient, getSharedCodexAppServerClient } =
await import("./shared-client.js");
const client = useSharedClient
? await getSharedCodexAppServerClient({
startOptions: options.startOptions,

@@ -32,6 +32,10 @@ import { resolveCodexAppServerRuntimeOptions } from "./config.js";
import { createCodexDynamicToolBridge } from "./dynamic-tools.js";
import { handleCodexAppServerElicitationRequest } from "./elicitation-bridge.js";
import { CodexAppServerEventProjector } from "./event-projector.js";
import {
assertCodexTurnStartResponse,
readCodexDynamicToolCallParams,
} from "./protocol-validators.js";
import {
isJsonObject,
type CodexServerNotification,
@@ -40,10 +44,6 @@ import {
type JsonObject,
type JsonValue,
} from "./protocol.js";
import {
assertCodexTurnStartResponse,
readCodexDynamicToolCallParams,
} from "./protocol-validators.js";
import { readCodexAppServerBinding, type CodexAppServerThreadBinding } from "./session-binding.js";
import { clearSharedCodexAppServerClient } from "./shared-client.js";
import {
@@ -58,6 +58,7 @@ import {
recordCodexTrajectoryContext,
} from "./trajectory.js";
import { mirrorCodexAppServerTranscript } from "./transcript-mirror.js";
import { filterToolsForVisionInputs } from "./vision-tools.js";
let clientFactory = defaultCodexAppServerClientFactory;
@@ -601,19 +602,6 @@ async function buildDynamicTools(input: DynamicToolBuildParams) {
});
}
function filterToolsForVisionInputs<T extends { name?: string }>(
tools: T[],
params: {
modelHasVision: boolean;
hasInboundImages: boolean;
},
): T[] {
if (!params.modelHasVision || !params.hasInboundImages) {
return tools;
}
return tools.filter((tool) => tool.name !== "image");
}
async function withCodexStartupTimeout<T>(params: {
timeoutMs: number;
timeoutFloorMs?: number;

@@ -1,14 +1,15 @@
import { describe, expect, it } from "vitest";
import { __testing } from "./run-attempt.js";
import { filterToolsForVisionInputs } from "./vision-tools.js";
describe("Codex dynamic tool filtering", () => {
it("drops the image tool when the model already has inbound vision input", () => {
const toolNames = __testing
.filterToolsForVisionInputs([{ name: "image" }, { name: "read" }, { name: "write" }], {
const toolNames = filterToolsForVisionInputs(
[{ name: "image" }, { name: "read" }, { name: "write" }],
{
modelHasVision: true,
hasInboundImages: true,
})
.map((tool) => tool.name);
},
).map((tool) => tool.name);
expect(toolNames).toContain("read");
expect(toolNames).toContain("write");
@@ -19,13 +20,13 @@ describe("Codex dynamic tool filtering", () => {
const tools = [{ name: "image" }, { name: "read" }];
expect(
__testing.filterToolsForVisionInputs(tools, {
filterToolsForVisionInputs(tools, {
modelHasVision: false,
hasInboundImages: true,
}),
).toBe(tools);
expect(
__testing.filterToolsForVisionInputs(tools, {
filterToolsForVisionInputs(tools, {
modelHasVision: true,
hasInboundImages: false,
}),

@@ -0,0 +1,12 @@
export function filterToolsForVisionInputs<T extends { name?: string }>(
tools: T[],
params: {
modelHasVision: boolean;
hasInboundImages: boolean;
},
): T[] {
if (!params.modelHasVision || !params.hasInboundImages) {
return tools;
}
return tools.filter((tool) => tool.name !== "image");
}

@@ -0,0 +1,41 @@
import fs from "node:fs";
import { describe, expect, it } from "vitest";
import { resolveProviderPluginChoice } from "../../src/plugins/provider-auth-choice.runtime.js";
import { registerSingleProviderPlugin } from "../../test/helpers/plugins/plugin-registration.js";
import plugin from "./index.js";
type ComfyManifest = {
providerAuthChoices?: Array<{ choiceId?: string; method?: string; provider?: string }>;
};
function readManifest(): ComfyManifest {
return JSON.parse(
fs.readFileSync(new URL("./openclaw.plugin.json", import.meta.url), "utf8"),
) as ComfyManifest;
}
describe("comfy provider plugin", () => {
it("registers cloud API-key auth metadata", async () => {
const provider = await registerSingleProviderPlugin(plugin);
expect(provider.id).toBe("comfy");
expect(provider.envVars).toEqual(["COMFY_API_KEY", "COMFY_CLOUD_API_KEY"]);
expect(provider.auth?.map((method) => method.id)).toEqual(["cloud-api-key"]);
const choice = resolveProviderPluginChoice({
providers: [provider],
choice: "comfy-cloud-api-key",
});
expect(choice?.provider.id).toBe("comfy");
expect(choice?.method.id).toBe("cloud-api-key");
expect(readManifest().providerAuthChoices).toEqual(
expect.arrayContaining([
expect.objectContaining({
provider: "comfy",
method: "cloud-api-key",
choiceId: "comfy-cloud-api-key",
}),
]),
);
});
});

@@ -1,4 +1,5 @@
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
import { createProviderApiKeyAuthMethod } from "openclaw/plugin-sdk/provider-auth-api-key";
import { buildComfyImageGenerationProvider } from "./image-generation-provider.js";
import { buildComfyMusicGenerationProvider } from "./music-generation-provider.js";
import { buildComfyVideoGenerationProvider } from "./video-generation-provider.js";
@@ -15,7 +16,27 @@ export default definePluginEntry({
label: "ComfyUI",
docsPath: "/providers/comfy",
envVars: ["COMFY_API_KEY", "COMFY_CLOUD_API_KEY"],
auth: [],
auth: [
createProviderApiKeyAuthMethod({
providerId: PROVIDER_ID,
methodId: "cloud-api-key",
label: "Comfy Cloud API key",
hint: "API key for Comfy Cloud workflow runs",
optionKey: "comfyApiKey",
flagName: "--comfy-api-key",
envVar: "COMFY_API_KEY",
promptMessage: "Enter Comfy Cloud API key",
wizard: {
choiceId: "comfy-cloud-api-key",
choiceLabel: "Comfy Cloud API key",
choiceHint: "Required for cloud workflows",
groupId: "comfy",
groupLabel: "ComfyUI",
groupHint: "Local or cloud workflows",
onboardingScopes: ["image-generation"],
},
}),
],
});
api.registerImageGenerationProvider(buildComfyImageGenerationProvider());
api.registerMusicGenerationProvider(buildComfyMusicGenerationProvider());

@@ -5,6 +5,23 @@
"providerAuthEnvVars": {
"comfy": ["COMFY_API_KEY", "COMFY_CLOUD_API_KEY"]
},
"providerAuthChoices": [
{
"provider": "comfy",
"method": "cloud-api-key",
"choiceId": "comfy-cloud-api-key",
"choiceLabel": "Comfy Cloud API key",
"choiceHint": "Required for cloud workflows",
"groupId": "comfy",
"groupLabel": "ComfyUI",
"groupHint": "Local or cloud workflows",
"optionKey": "comfyApiKey",
"cliFlag": "--comfy-api-key",
"cliOption": "--comfy-api-key <key>",
"cliDescription": "Comfy Cloud API key",
"onboardingScopes": ["image-generation"]
}
],
"contracts": {
"imageGenerationProviders": ["comfy"],
"musicGenerationProviders": ["comfy"],

@@ -11,12 +11,13 @@ import {
} from "./inbound-dedupe.js";
import { materializeDiscordInboundJob, type DiscordInboundJob } from "./inbound-job.js";
import type { RuntimeEnv } from "./message-handler.preflight.types.js";
import { processDiscordMessage } from "./message-handler.process.js";
import { deliverDiscordReply } from "./reply-delivery.js";
import type { DiscordMonitorStatusSink } from "./status.js";
import { resolveDiscordReplyDeliveryPlan } from "./threading.js";
import { normalizeDiscordInboundWorkerTimeoutMs, runDiscordTaskWithTimeout } from "./timeouts.js";
type ProcessDiscordMessage = typeof import("./message-handler.process.js").processDiscordMessage;
type DeliverDiscordReply = typeof import("./reply-delivery.js").deliverDiscordReply;
type DiscordInboundWorkerParams = {
runtime: RuntimeEnv;
setStatus?: DiscordMonitorStatusSink;
@@ -32,10 +33,25 @@ export type DiscordInboundWorker = {
};
export type DiscordInboundWorkerTestingHooks = {
processDiscordMessage?: typeof processDiscordMessage;
deliverDiscordReply?: typeof deliverDiscordReply;
processDiscordMessage?: ProcessDiscordMessage;
deliverDiscordReply?: DeliverDiscordReply;
};
let messageProcessRuntimePromise:
| Promise<typeof import("./message-handler.process.js")>
| undefined;
let replyDeliveryRuntimePromise: Promise<typeof import("./reply-delivery.js")> | undefined;
async function loadMessageProcessRuntime() {
messageProcessRuntimePromise ??= import("./message-handler.process.js");
return await messageProcessRuntimePromise;
}
async function loadReplyDeliveryRuntime() {
replyDeliveryRuntimePromise ??= import("./reply-delivery.js");
return await replyDeliveryRuntimePromise;
}
function formatDiscordRunContextSuffix(job: DiscordInboundJob): string {
const channelId = job.payload.messageChannelId?.trim();
const messageId = job.payload.data?.message?.id?.trim();
@@ -62,7 +78,9 @@ async function processDiscordInboundJob(params: {
let finalReplyStarted = false;
let createdThreadId: string | undefined;
let sessionKey: string | undefined;
const processDiscordMessageImpl = params.testing?.processDiscordMessage ?? processDiscordMessage;
const processDiscordMessageImpl =
params.testing?.processDiscordMessage ??
(await loadMessageProcessRuntime()).processDiscordMessage;
try {
await runDiscordTaskWithTimeout({
run: async (abortSignal) => {
@@ -135,7 +153,7 @@ async function sendDiscordInboundWorkerTimeoutReply(params: {
contextSuffix: string;
createdThreadId?: string;
sessionKey?: string;
deliverDiscordReplyImpl?: typeof deliverDiscordReply;
deliverDiscordReplyImpl?: DeliverDiscordReply;
}) {
const messageChannelId = params.job.payload.messageChannelId?.trim();
const messageId = params.job.payload.message?.id?.trim();
@@ -158,7 +176,9 @@ async function sendDiscordInboundWorkerTimeoutReply(params: {
});
try {
await (params.deliverDiscordReplyImpl ?? deliverDiscordReply)({
const deliverDiscordReplyImpl =
params.deliverDiscordReplyImpl ?? (await loadReplyDeliveryRuntime()).deliverDiscordReply;
await deliverDiscordReplyImpl({
cfg: params.job.payload.cfg,
replies: [{ text: "Discord inbound worker timed out.", isError: true }],
target: deliveryPlan.deliverTarget,

@@ -20,7 +20,6 @@ import {
} from "./inbound-worker.js";
import type { DiscordMessageEvent, DiscordMessageHandler } from "./listeners.js";
import { applyImplicitReplyBatchGate } from "./message-handler.batch-gate.js";
import { preflightDiscordMessage } from "./message-handler.preflight.js";
import type { DiscordMessagePreflightParams } from "./message-handler.preflight.types.js";
import {
hasDiscordMessageStickers,
@@ -29,6 +28,9 @@ import {
} from "./message-utils.js";
import type { DiscordMonitorStatusSink } from "./status.js";
type PreflightDiscordMessage =
typeof import("./message-handler.preflight.js").preflightDiscordMessage;
type DiscordMessageHandlerParams = Omit<
DiscordMessagePreflightParams,
"ackReactionScope" | "groupPolicy" | "data" | "client"
@@ -40,9 +42,18 @@ type DiscordMessageHandlerParams = Omit<
};
type DiscordMessageHandlerTestingHooks = DiscordInboundWorkerTestingHooks & {
preflightDiscordMessage?: typeof preflightDiscordMessage;
preflightDiscordMessage?: PreflightDiscordMessage;
};
let messagePreflightRuntimePromise:
| Promise<typeof import("./message-handler.preflight.js")>
| undefined;
async function loadMessagePreflightRuntime() {
messagePreflightRuntimePromise ??= import("./message-handler.preflight.js");
return await messagePreflightRuntimePromise;
}
export type DiscordMessageHandlerWithLifecycle = DiscordMessageHandler & {
deactivate: () => void;
};
@@ -63,8 +74,7 @@ export function createDiscordMessageHandler(
params.discordConfig?.ackReactionScope ??
params.cfg.messages?.ackReactionScope ??
"group-mentions";
const preflightDiscordMessageImpl =
params.__testing?.preflightDiscordMessage ?? preflightDiscordMessage;
const preflightDiscordMessageImpl = params.__testing?.preflightDiscordMessage;
const replayGuard = createDiscordInboundReplayGuard();
const inboundWorker = createDiscordInboundWorker({
runtime: params.runtime,
@@ -130,7 +140,10 @@ export function createDiscordMessageHandler(
}
try {
if (entries.length === 1) {
const ctx = await preflightDiscordMessageImpl({
const preflight =
preflightDiscordMessageImpl ??
(await loadMessagePreflightRuntime()).preflightDiscordMessage;
const ctx = await preflight({
...params,
ackReactionScope,
groupPolicy,
@@ -167,7 +180,10 @@ export function createDiscordMessageHandler(
...last.data,
message: syntheticMessage,
};
const ctx = await preflightDiscordMessageImpl({
const preflight =
preflightDiscordMessageImpl ??
(await loadMessagePreflightRuntime()).preflightDiscordMessage;
const ctx = await preflight({
...params,
ackReactionScope,
groupPolicy,

@@ -1,9 +1,8 @@
import { EventEmitter } from "node:events";
import { RateLimitError } from "@buape/carbon";
import { AcpRuntimeError } from "openclaw/plugin-sdk/acp-runtime";
import type { ChannelRuntimeSurface } from "openclaw/plugin-sdk/channel-contract";
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import { beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { createRuntimeChannel } from "../../../../src/plugins/runtime/runtime-channel.js";
import {
baseConfig,
baseRuntime,
@@ -41,6 +40,34 @@ let monitorDiscordProvider: typeof import("./provider.js").monitorDiscordProvide
let providerTesting: typeof import("./provider.js").__testing;
let runtimeEnvModule: typeof import("openclaw/plugin-sdk/runtime-env");
function createAcpRuntimeError(code: string, message: string): Error & { code: string } {
return Object.assign(new Error(message), { code });
}
function createTestChannelRuntime(): ChannelRuntimeSurface {
const contexts = new Map<string, unknown>();
const keyFor = (params: { channelId: string; accountId?: string | null; capability: string }) =>
`${params.channelId}:${params.accountId ?? ""}:${params.capability}`;
const runtimeContexts: ChannelRuntimeSurface["runtimeContexts"] = {
register(params) {
contexts.set(keyFor(params), params.context);
return {
dispose: () => {
contexts.delete(keyFor(params));
},
};
},
get: ((params: { channelId: string; accountId?: string | null; capability: string }) =>
contexts.get(keyFor(params))) as ChannelRuntimeSurface["runtimeContexts"]["get"],
watch() {
return () => {};
},
};
return {
runtimeContexts,
};
}
function createCompatRateLimitError(
response: Response,
body: { message: string; retry_after: number; global: boolean },
@@ -348,7 +375,7 @@ describe("monitorDiscordProvider", () => {
});
it("registers the native approval runtime context when exec approvals are enabled", async () => {
const channelRuntime = createRuntimeChannel();
const channelRuntime = createTestChannelRuntime();
const execApprovalsConfig = { enabled: true, approvers: ["123"] };
resolveDiscordAccountMock.mockReturnValue({
accountId: "default",
@@ -408,7 +435,7 @@ describe("monitorDiscordProvider", () => {
it("classifies typed ACP session init failures as stale", async () => {
getAcpSessionStatusMock.mockRejectedValue(
new AcpRuntimeError("ACP_SESSION_INIT_FAILED", "missing ACP metadata"),
createAcpRuntimeError("ACP_SESSION_INIT_FAILED", "missing ACP metadata"),
);
await monitorDiscordProvider({
@@ -437,7 +464,7 @@ describe("monitorDiscordProvider", () => {
it("classifies typed non-init ACP errors as uncertain when not stale-running", async () => {
getAcpSessionStatusMock.mockRejectedValue(
new AcpRuntimeError("ACP_BACKEND_UNAVAILABLE", "runtime unavailable"),
createAcpRuntimeError("ACP_BACKEND_UNAVAILABLE", "runtime unavailable"),
);
await monitorDiscordProvider({

@@ -24,7 +24,6 @@ import {
import type { OpenClawConfig, ReplyToMode } from "openclaw/plugin-sdk/config-runtime";
import { loadConfig } from "openclaw/plugin-sdk/config-runtime";
import { createConnectedChannelStatusPatch } from "openclaw/plugin-sdk/gateway-runtime";
import { getPluginCommandSpecs } from "openclaw/plugin-sdk/plugin-runtime";
import { resolveTextChunkLimit } from "openclaw/plugin-sdk/reply-chunking";
import {
danger,
@@ -110,9 +109,12 @@ type DiscordVoiceManager = import("../voice/manager.js").DiscordVoiceManager;
type DiscordVoiceRuntimeModule = typeof import("../voice/manager.runtime.js");
type DiscordProviderSessionRuntimeModule = typeof import("./provider-session.runtime.js");
type GetPluginCommandSpecs =
typeof import("openclaw/plugin-sdk/plugin-runtime").getPluginCommandSpecs;
let discordVoiceRuntimePromise: Promise<DiscordVoiceRuntimeModule> | undefined;
let discordProviderSessionRuntimePromise: Promise<DiscordProviderSessionRuntimeModule> | undefined;
let pluginRuntimePromise: Promise<typeof import("openclaw/plugin-sdk/plugin-runtime")> | undefined;
let fetchDiscordApplicationIdForTesting: typeof fetchDiscordApplicationId | undefined;
let createDiscordNativeCommandForTesting: typeof createDiscordNativeCommand | undefined;
@@ -130,7 +132,7 @@ let createClientForTesting:
plugins: ConstructorParameters<typeof Client>[2],
) => Client)
| undefined;
let getPluginCommandSpecsForTesting: typeof getPluginCommandSpecs | undefined;
let getPluginCommandSpecsForTesting: GetPluginCommandSpecs | undefined;
let resolveDiscordAccountForTesting: typeof resolveDiscordAccount | undefined;
let resolveNativeCommandsEnabledForTesting: typeof resolveNativeCommandsEnabled | undefined;
let resolveNativeSkillsEnabledForTesting: typeof resolveNativeSkillsEnabled | undefined;
@@ -155,6 +157,11 @@ async function loadDiscordProviderSessionRuntime(): Promise<DiscordProviderSessi
return await discordProviderSessionRuntimePromise;
}
async function loadPluginRuntime() {
pluginRuntimePromise ??= import("openclaw/plugin-sdk/plugin-runtime");
return await pluginRuntimePromise;
}
function normalizeBooleanForTesting(value: unknown): boolean | undefined {
if (typeof value === "boolean") {
return value;
@@ -178,17 +185,17 @@ function formatThreadBindingDurationForConfigLabel(durationMs: number): string {
return label === "disabled" ? "off" : label;
}
function appendPluginCommandSpecs(params: {
async function appendPluginCommandSpecs(params: {
commandSpecs: NativeCommandSpec[];
runtime: RuntimeEnv;
}): NativeCommandSpec[] {
}): Promise<NativeCommandSpec[]> {
const merged = [...params.commandSpecs];
const existingNames = new Set(
merged.map((spec) => normalizeLowercaseStringOrEmpty(spec.name)).filter(Boolean),
);
for (const pluginCommand of (getPluginCommandSpecsForTesting ?? getPluginCommandSpecs)(
"discord",
)) {
const getPluginCommandSpecs =
getPluginCommandSpecsForTesting ?? (await loadPluginRuntime()).getPluginCommandSpecs;
for (const pluginCommand of getPluginCommandSpecs("discord")) {
const normalizedName = normalizeLowercaseStringOrEmpty(pluginCommand.name);
if (!normalizedName) {
continue;
@@ -740,7 +747,7 @@ export async function monitorDiscordProvider(opts: MonitorDiscordOpts = {}) {
})
: [];
if (nativeEnabled) {
commandSpecs = appendPluginCommandSpecs({ commandSpecs, runtime });
commandSpecs = await appendPluginCommandSpecs({ commandSpecs, runtime });
}
const initialCommandCount = commandSpecs.length;
if (nativeEnabled && nativeSkillsEnabled && commandSpecs.length > maxDiscordCommands) {
@@ -749,7 +756,7 @@ export async function monitorDiscordProvider(opts: MonitorDiscordOpts = {}) {
cfg,
{ skillCommands: [], provider: "discord" },
);
commandSpecs = appendPluginCommandSpecs({ commandSpecs, runtime });
commandSpecs = await appendPluginCommandSpecs({ commandSpecs, runtime });
runtime.log?.(
warn(
`discord: ${initialCommandCount} commands exceeds limit; removing per-skill commands and keeping /skill.`,
@@ -1201,7 +1208,7 @@ export const __testing = {
) {
createClientForTesting = mock;
},
setGetPluginCommandSpecs(mock?: typeof getPluginCommandSpecs) {
setGetPluginCommandSpecs(mock?: GetPluginCommandSpecs) {
getPluginCommandSpecsForTesting = mock;
},
setResolveDiscordAccount(mock?: typeof resolveDiscordAccount) {

@@ -12,7 +12,7 @@
"openclaw": "workspace:*"
},
"peerDependencies": {
"openclaw": ">=2026.4.23"
"openclaw": ">=2026.4.24"
},
"peerDependenciesMeta": {
"openclaw": {
@@ -40,10 +40,10 @@
"install": {
"npmSpec": "@openclaw/feishu",
"defaultChoice": "npm",
"minHostVersion": ">=2026.4.23"
"minHostVersion": ">=2026.4.24"
},
"compat": {
"pluginApi": ">=2026.4.23"
"pluginApi": ">=2026.4.24"
},
"build": {
"openclawVersion": "2026.4.20"

@@ -8,8 +8,11 @@ import { createNonExitingTypedRuntimeEnv } from "../../../test/helpers/plugins/r
import type { ClawdbotConfig, PluginRuntime, RuntimeEnv } from "../runtime-api.js";
import { parseFeishuMessageEvent, type FeishuMessageEvent } from "./bot.js";
import * as dedup from "./dedup.js";
import { monitorSingleAccount } from "./monitor.account.js";
import { resolveReactionSyntheticEvent, type FeishuReactionCreatedEvent } from "./monitor.js";
import {
monitorSingleAccount,
resolveReactionSyntheticEvent,
type FeishuReactionCreatedEvent,
} from "./monitor.account.js";
import { setFeishuRuntime } from "./runtime.js";
import type { ResolvedFeishuAccount } from "./types.js";

@@ -40,9 +40,12 @@ function buildMultiAccountWebsocketConfig(accountIds: string[]): ClawdbotConfig
}
async function waitForStartedAccount(started: string[], accountId: string) {
for (let i = 0; i < 10 && !started.includes(accountId); i += 1) {
await Promise.resolve();
}
await vi.waitFor(
() => {
expect(started).toContain(accountId);
},
{ timeout: 10_000 },
);
}
afterEach(() => {
@@ -74,9 +77,7 @@ describe("Feishu monitor startup preflight", () => {
});
try {
await Promise.resolve();
await Promise.resolve();
await waitForStartedAccount(started, "alpha");
expect(started).toEqual(["alpha"]);
expect(maxInFlight).toBe(1);
} finally {
@@ -177,7 +178,7 @@ describe("Feishu monitor startup preflight", () => {
});
try {
await Promise.resolve();
await waitForStartedAccount(started, "alpha");
expect(started).toEqual(["alpha"]);
abortController.abort();

@@ -1,10 +1,5 @@
import type { ClawdbotConfig, RuntimeEnv } from "../runtime-api.js";
import { listEnabledFeishuAccounts, resolveFeishuRuntimeAccount } from "./accounts.js";
import {
monitorSingleAccount,
resolveReactionSyntheticEvent,
type FeishuReactionCreatedEvent,
} from "./monitor.account.js";
import { fetchBotIdentityForMonitor } from "./monitor.startup.js";
import {
clearFeishuWebhookRateLimitStateForTest,
@@ -20,13 +15,18 @@ export type MonitorFeishuOpts = {
accountId?: string;
};
let monitorAccountRuntimePromise: Promise<typeof import("./monitor.account.js")> | undefined;
async function loadMonitorAccountRuntime() {
monitorAccountRuntimePromise ??= import("./monitor.account.js");
return await monitorAccountRuntimePromise;
}
export {
clearFeishuWebhookRateLimitStateForTest,
getFeishuWebhookRateLimitStateSizeForTest,
isWebhookRateLimitedForTest,
resolveReactionSyntheticEvent,
};
export type { FeishuReactionCreatedEvent };
export async function monitorFeishuProvider(opts: MonitorFeishuOpts = {}): Promise<void> {
const cfg = opts.config;
@@ -44,6 +44,7 @@ export async function monitorFeishuProvider(opts: MonitorFeishuOpts = {}): Promi
if (!account.enabled || !account.configured) {
throw new Error(`Feishu account "${opts.accountId}" not configured or disabled`);
}
const { monitorSingleAccount } = await loadMonitorAccountRuntime();
return monitorSingleAccount({
cfg,
account,
@@ -61,6 +62,7 @@ export async function monitorFeishuProvider(opts: MonitorFeishuOpts = {}): Promi
`feishu: starting ${accounts.length} account(s): ${accounts.map((a) => a.accountId).join(", ")}`,
);
const { monitorSingleAccount } = await loadMonitorAccountRuntime();
const monitorPromises: Promise<void>[] = [];
for (const account of accounts) {
if (opts.abortSignal?.aborted) {
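
Review note on the hunks above: `loadMonitorAccountRuntime` replaces the static import of `./monitor.account.js` with a memoized dynamic import. The sketch below illustrates that pattern in isolation. It is a minimal illustration, not the project's code: `HeavyRuntime`, `loadHeavyRuntime`, `loadRuntimeOnce`, and `loadCount` are hypothetical names, and a plain async function stands in for the real `import()` call.

```typescript
// Hypothetical stand-in for an expensive module. In the diff the real target
// is "./monitor.account.js", loaded through a dynamic import().
type HeavyRuntime = { monitorSingleAccount: (accountId: string) => string };

// Counts how often the expensive load actually runs (illustration only).
let loadCount = 0;

async function loadHeavyRuntime(): Promise<HeavyRuntime> {
  loadCount += 1;
  return { monitorSingleAccount: (accountId) => `monitoring ${accountId}` };
}

let runtimePromise: Promise<HeavyRuntime> | undefined;

// Cache the promise rather than the resolved value, so concurrent callers
// share one in-flight load instead of each starting their own.
function loadRuntimeOnce(): Promise<HeavyRuntime> {
  runtimePromise ??= loadHeavyRuntime();
  return runtimePromise;
}

async function demo(): Promise<string[]> {
  const [first, second] = await Promise.all([loadRuntimeOnce(), loadRuntimeOnce()]);
  return [first.monitorSingleAccount("alpha"), second.monitorSingleAccount("beta")];
}
```

Caching the promise is the part that matters here: both call sites in `monitorFeishuProvider` can await the helper without triggering a second load.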

@@ -1,4 +1,7 @@
import { isAbortRequestText, isBtwRequestText } from "openclaw/plugin-sdk/reply-runtime";
import {
isAbortRequestText,
isBtwRequestText,
} from "openclaw/plugin-sdk/command-primitives-runtime";
import { parseFeishuMessageEvent, type FeishuMessageEvent } from "./bot.js";
export function getFeishuSequentialKey(params: {

View File

@@ -0,0 +1,629 @@
import { EventEmitter } from "node:events";
import { PassThrough, Writable } from "node:stream";
import type { OpenClawPluginApi } from "openclaw/plugin-sdk/plugin-entry";
import type { RealtimeVoiceProviderPlugin } from "openclaw/plugin-sdk/realtime-voice";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { createTestPluginApi } from "../../test/helpers/plugins/plugin-api.ts";
import plugin from "./index.js";
import { resolveGoogleMeetConfig, resolveGoogleMeetConfigWithEnv } from "./src/config.js";
import {
buildGoogleMeetPreflightReport,
fetchGoogleMeetSpace,
normalizeGoogleMeetSpaceName,
} from "./src/meet.js";
import {
buildGoogleMeetAuthUrl,
refreshGoogleMeetAccessToken,
resolveGoogleMeetAccessToken,
} from "./src/oauth.js";
import { startCommandRealtimeAudioBridge } from "./src/realtime.js";
import { normalizeMeetUrl } from "./src/runtime.js";
import { buildMeetDtmfSequence, normalizeDialInNumber } from "./src/transports/twilio.js";
const voiceCallMocks = vi.hoisted(() => ({
joinMeetViaVoiceCallGateway: vi.fn(async () => ({ callId: "call-1", dtmfSent: true })),
endMeetVoiceCallGatewayCall: vi.fn(async () => {}),
}));
const fetchGuardMocks = vi.hoisted(() => ({
fetchWithSsrFGuard: vi.fn(
async (params: {
url: string;
init?: RequestInit;
}): Promise<{
response: Response;
release: () => Promise<void>;
}> => ({
response: await fetch(params.url, params.init),
release: vi.fn(async () => {}),
}),
),
}));
vi.mock("openclaw/plugin-sdk/ssrf-runtime", () => ({
fetchWithSsrFGuard: fetchGuardMocks.fetchWithSsrFGuard,
}));
vi.mock("./src/voice-call-gateway.js", () => ({
joinMeetViaVoiceCallGateway: voiceCallMocks.joinMeetViaVoiceCallGateway,
endMeetVoiceCallGatewayCall: voiceCallMocks.endMeetVoiceCallGatewayCall,
}));
const noopLogger = {
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
debug: vi.fn(),
};
type TestBridgeProcess = {
stdin?: { write(chunk: unknown): unknown } | null;
stdout?: { on(event: "data", listener: (chunk: unknown) => void): unknown } | null;
stderr: PassThrough;
killed: boolean;
kill: ReturnType<typeof vi.fn>;
on: EventEmitter["on"];
};
function setup(config: Record<string, unknown> = {}) {
const methods = new Map<string, unknown>();
const tools: unknown[] = [];
const cliRegistrations: unknown[] = [];
const runCommandWithTimeout = vi.fn(async (argv: string[]) => {
if (argv[0] === "system_profiler") {
return { code: 0, stdout: "BlackHole 2ch", stderr: "" };
}
return { code: 0, stdout: "", stderr: "" };
});
const api = createTestPluginApi({
id: "google-meet",
name: "Google Meet",
description: "test",
version: "0",
source: "test",
pluginConfig: config,
runtime: {
system: {
runCommandWithTimeout,
formatNativeDependencyHint: vi.fn(() => "Install with brew install blackhole-2ch."),
},
} as unknown as OpenClawPluginApi["runtime"],
logger: noopLogger,
registerGatewayMethod: (method: string, handler: unknown) => methods.set(method, handler),
registerTool: (tool: unknown) => tools.push(tool),
registerCli: (_registrar: unknown, opts: unknown) => cliRegistrations.push(opts),
});
plugin.register(api);
return { cliRegistrations, methods, tools, runCommandWithTimeout };
}
describe("google-meet plugin", () => {
beforeEach(() => {
vi.clearAllMocks();
});
afterEach(() => {
vi.unstubAllGlobals();
});
it("defaults to chrome realtime with safe read-only tools", () => {
expect(resolveGoogleMeetConfig({})).toMatchObject({
enabled: true,
defaults: {},
preview: { enrollmentAcknowledged: false },
defaultTransport: "chrome",
defaultMode: "realtime",
chrome: {
audioBackend: "blackhole-2ch",
launch: true,
audioInputCommand: [
"rec",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-",
],
audioOutputCommand: [
"play",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-",
],
},
voiceCall: { enabled: true, requestTimeoutMs: 30000, dtmfDelayMs: 2500 },
realtime: {
provider: "openai",
toolPolicy: "safe-read-only",
},
oauth: {},
auth: { provider: "google-oauth" },
});
expect(resolveGoogleMeetConfig({}).realtime.instructions).toContain("openclaw_agent_consult");
});
it("uses env fallbacks for OAuth, preview, and default meeting values", () => {
expect(
resolveGoogleMeetConfigWithEnv(
{},
{
OPENCLAW_GOOGLE_MEET_CLIENT_ID: "client-id",
GOOGLE_MEET_CLIENT_SECRET: "client-secret",
OPENCLAW_GOOGLE_MEET_REFRESH_TOKEN: "refresh-token",
GOOGLE_MEET_ACCESS_TOKEN: "access-token",
OPENCLAW_GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT: "123456",
GOOGLE_MEET_DEFAULT_MEETING: "https://meet.google.com/abc-defg-hij",
OPENCLAW_GOOGLE_MEET_PREVIEW_ACK: "true",
},
),
).toMatchObject({
defaults: { meeting: "https://meet.google.com/abc-defg-hij" },
preview: { enrollmentAcknowledged: true },
oauth: {
clientId: "client-id",
clientSecret: "client-secret",
refreshToken: "refresh-token",
accessToken: "access-token",
expiresAt: 123456,
},
});
});
it("requires explicit Meet URLs", () => {
expect(normalizeMeetUrl("https://meet.google.com/abc-defg-hij")).toBe(
"https://meet.google.com/abc-defg-hij",
);
expect(() => normalizeMeetUrl("https://example.com/abc-defg-hij")).toThrow("meet.google.com");
});
it("advertises only the googlemeet CLI descriptor", () => {
const { cliRegistrations } = setup();
expect(cliRegistrations).toContainEqual({
commands: ["googlemeet"],
descriptors: [
{
name: "googlemeet",
description: "Join and manage Google Meet calls",
hasSubcommands: true,
},
],
});
});
it("uses a provider-safe flat tool parameter schema", () => {
const { tools } = setup();
const tool = tools[0] as { parameters: unknown };
expect(JSON.stringify(tool.parameters)).not.toContain("anyOf");
expect(tool.parameters).toMatchObject({
type: "object",
properties: {
action: {
type: "string",
enum: ["join", "status", "setup_status", "resolve_space", "preflight", "leave"],
},
transport: { type: "string", enum: ["chrome", "twilio"] },
mode: { type: "string", enum: ["realtime", "transcribe"] },
},
});
});
it("normalizes Meet URLs, codes, and space names for the Meet API", () => {
expect(normalizeGoogleMeetSpaceName("spaces/abc-defg-hij")).toBe("spaces/abc-defg-hij");
expect(normalizeGoogleMeetSpaceName("abc-defg-hij")).toBe("spaces/abc-defg-hij");
expect(normalizeGoogleMeetSpaceName("https://meet.google.com/abc-defg-hij")).toBe(
"spaces/abc-defg-hij",
);
expect(() => normalizeGoogleMeetSpaceName("https://example.com/abc-defg-hij")).toThrow(
"meet.google.com",
);
});
it("fetches Meet spaces without percent-encoding the spaces path separator", async () => {
const fetchMock = vi.fn(async () => {
return new Response(
JSON.stringify({
name: "spaces/abc-defg-hij",
meetingCode: "abc-defg-hij",
meetingUri: "https://meet.google.com/abc-defg-hij",
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
});
vi.stubGlobal("fetch", fetchMock);
await expect(
fetchGoogleMeetSpace({
accessToken: "token",
meeting: "spaces/abc-defg-hij",
}),
).resolves.toMatchObject({ name: "spaces/abc-defg-hij" });
expect(fetchGuardMocks.fetchWithSsrFGuard).toHaveBeenCalledWith(
expect.objectContaining({
url: "https://meet.googleapis.com/v2/spaces/abc-defg-hij",
init: expect.objectContaining({
headers: expect.objectContaining({ Authorization: "Bearer token" }),
}),
policy: { allowedHostnames: ["meet.googleapis.com"] },
auditContext: "google-meet.spaces.get",
}),
);
expect(fetchMock).toHaveBeenCalledWith(
"https://meet.googleapis.com/v2/spaces/abc-defg-hij",
expect.objectContaining({
headers: expect.objectContaining({ Authorization: "Bearer token" }),
}),
);
});
it("surfaces Developer Preview acknowledgment blockers in preflight reports", () => {
expect(
buildGoogleMeetPreflightReport({
input: "abc-defg-hij",
space: { name: "spaces/abc-defg-hij" },
previewAcknowledged: false,
tokenSource: "cached-access-token",
}),
).toMatchObject({
resolvedSpaceName: "spaces/abc-defg-hij",
previewAcknowledged: false,
blockers: [expect.stringContaining("Developer Preview Program")],
});
});
it("builds Meet OAuth URLs and prefers fresh cached access tokens", async () => {
const url = new URL(
buildGoogleMeetAuthUrl({
clientId: "client-id",
challenge: "challenge",
state: "state",
}),
);
expect(url.hostname).toBe("accounts.google.com");
expect(url.searchParams.get("client_id")).toBe("client-id");
expect(url.searchParams.get("code_challenge")).toBe("challenge");
expect(url.searchParams.get("access_type")).toBe("offline");
expect(url.searchParams.get("scope")).toContain("meetings.conference.media.readonly");
await expect(
resolveGoogleMeetAccessToken({
accessToken: "cached-token",
expiresAt: Date.now() + 120_000,
}),
).resolves.toEqual({
accessToken: "cached-token",
expiresAt: expect.any(Number),
refreshed: false,
});
});
it("refreshes Google Meet access tokens with a refresh-token grant", async () => {
const fetchMock = vi.fn(async (_input: RequestInfo | URL, _init?: RequestInit) => {
return new Response(
JSON.stringify({
access_token: "new-access-token",
expires_in: 3600,
token_type: "Bearer",
}),
{ status: 200, headers: { "Content-Type": "application/json" } },
);
});
vi.stubGlobal("fetch", fetchMock);
await expect(
refreshGoogleMeetAccessToken({
clientId: "client-id",
clientSecret: "client-secret",
refreshToken: "refresh-token",
}),
).resolves.toMatchObject({
accessToken: "new-access-token",
tokenType: "Bearer",
});
const body = fetchMock.mock.calls[0]?.[1]?.body;
expect(body).toBeInstanceOf(URLSearchParams);
const params = body as URLSearchParams;
expect(params.get("grant_type")).toBe("refresh_token");
expect(params.get("refresh_token")).toBe("refresh-token");
});
it("builds Twilio dial plans from a PIN", () => {
expect(normalizeDialInNumber("+1 (555) 123-4567")).toBe("+15551234567");
expect(buildMeetDtmfSequence({ pin: "123 456" })).toBe("123456#");
expect(buildMeetDtmfSequence({ dtmfSequence: "ww123#" })).toBe("ww123#");
});
it("joins a Twilio session through the tool without page parsing", async () => {
const { tools } = setup({ defaultTransport: "twilio" });
const tool = tools[0] as {
execute: (id: string, params: unknown) => Promise<{ details: { session: unknown } }>;
};
const result = await tool.execute("id", {
action: "join",
url: "https://meet.google.com/abc-defg-hij",
dialInNumber: "+15551234567",
pin: "123456",
});
expect(result.details.session).toMatchObject({
transport: "twilio",
mode: "realtime",
twilio: {
dialInNumber: "+15551234567",
pinProvided: true,
dtmfSequence: "123456#",
voiceCallId: "call-1",
dtmfSent: true,
},
});
expect(voiceCallMocks.joinMeetViaVoiceCallGateway).toHaveBeenCalledWith({
config: expect.objectContaining({ defaultTransport: "twilio" }),
dialInNumber: "+15551234567",
dtmfSequence: "123456#",
});
});
it("hangs up delegated Twilio calls on leave", async () => {
const { tools } = setup({ defaultTransport: "twilio" });
const tool = tools[0] as {
execute: (id: string, params: unknown) => Promise<{ details: { session: { id: string } } }>;
};
const joined = await tool.execute("id", {
action: "join",
url: "https://meet.google.com/abc-defg-hij",
dialInNumber: "+15551234567",
pin: "123456",
});
await tool.execute("id", { action: "leave", sessionId: joined.details.session.id });
expect(voiceCallMocks.endMeetVoiceCallGatewayCall).toHaveBeenCalledWith({
config: expect.objectContaining({ defaultTransport: "twilio" }),
callId: "call-1",
});
});
it("reports setup status through the tool", async () => {
const { tools } = setup({
chrome: {
audioInputCommand: ["openclaw-audio-bridge", "capture"],
audioOutputCommand: ["openclaw-audio-bridge", "play"],
},
});
const tool = tools[0] as {
execute: (id: string, params: unknown) => Promise<{ details: { ok?: boolean } }>;
};
const result = await tool.execute("id", { action: "setup_status" });
expect(result.details.ok).toBe(true);
});
it("launches Chrome after the BlackHole check", async () => {
const originalPlatform = process.platform;
Object.defineProperty(process, "platform", { value: "darwin" });
try {
const { methods, runCommandWithTimeout } = setup({
defaultMode: "transcribe",
});
const handler = methods.get("googlemeet.join") as
| ((ctx: {
params: Record<string, unknown>;
respond: ReturnType<typeof vi.fn>;
}) => Promise<void>)
| undefined;
const respond = vi.fn();
await handler?.({
params: { url: "https://meet.google.com/abc-defg-hij" },
respond,
});
expect(respond.mock.calls[0]?.[0]).toBe(true);
expect(runCommandWithTimeout).toHaveBeenNthCalledWith(
1,
["system_profiler", "SPAudioDataType"],
{ timeoutMs: 10000 },
);
expect(runCommandWithTimeout).toHaveBeenNthCalledWith(
2,
["open", "-a", "Google Chrome", "https://meet.google.com/abc-defg-hij"],
{ timeoutMs: 30000 },
);
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform });
}
});
it("runs configured Chrome audio bridge commands before launch", async () => {
const originalPlatform = process.platform;
Object.defineProperty(process, "platform", { value: "darwin" });
try {
const { methods, runCommandWithTimeout } = setup({
chrome: {
audioBridgeHealthCommand: ["bridge", "status"],
audioBridgeCommand: ["bridge", "start"],
},
});
const handler = methods.get("googlemeet.join") as
| ((ctx: {
params: Record<string, unknown>;
respond: ReturnType<typeof vi.fn>;
}) => Promise<void>)
| undefined;
const respond = vi.fn();
await handler?.({
params: { url: "https://meet.google.com/abc-defg-hij" },
respond,
});
expect(respond.mock.calls[0]?.[0]).toBe(true);
expect(runCommandWithTimeout).toHaveBeenNthCalledWith(2, ["bridge", "status"], {
timeoutMs: 30000,
});
expect(runCommandWithTimeout).toHaveBeenNthCalledWith(3, ["bridge", "start"], {
timeoutMs: 30000,
});
} finally {
Object.defineProperty(process, "platform", { value: originalPlatform });
}
});
it("pipes Chrome command-pair audio through the realtime provider", async () => {
let callbacks:
| {
onAudio: (audio: Buffer) => void;
onMark?: (markName: string) => void;
onToolCall?: (event: {
itemId: string;
callId: string;
name: string;
args: unknown;
}) => void;
tools?: unknown[];
}
| undefined;
const sendAudio = vi.fn();
const bridge = {
connect: vi.fn(async () => {}),
sendAudio,
setMediaTimestamp: vi.fn(),
submitToolResult: vi.fn(),
acknowledgeMark: vi.fn(),
close: vi.fn(),
isConnected: vi.fn(() => true),
};
const provider: RealtimeVoiceProviderPlugin = {
id: "openai",
label: "OpenAI",
autoSelectOrder: 1,
resolveConfig: ({ rawConfig }) => rawConfig,
isConfigured: () => true,
createBridge: (req) => {
callbacks = req;
return bridge;
},
};
const inputStdout = new PassThrough();
const outputStdinWrites: Buffer[] = [];
const makeProcess = (stdio: {
stdin?: { write(chunk: unknown): unknown } | null;
stdout?: { on(event: "data", listener: (chunk: unknown) => void): unknown } | null;
}): TestBridgeProcess => {
const proc = new EventEmitter() as unknown as TestBridgeProcess;
proc.stdin = stdio.stdin;
proc.stdout = stdio.stdout;
proc.stderr = new PassThrough();
proc.killed = false;
proc.kill = vi.fn(() => {
proc.killed = true;
return true;
});
return proc;
};
const outputStdin = new Writable({
write(chunk, _encoding, done) {
outputStdinWrites.push(Buffer.from(chunk));
done();
},
});
const inputProcess = makeProcess({ stdout: inputStdout, stdin: null });
const outputProcess = makeProcess({ stdin: outputStdin, stdout: null });
const spawnMock = vi.fn().mockReturnValueOnce(outputProcess).mockReturnValueOnce(inputProcess);
const sessionStore: Record<string, unknown> = {};
const runtime = {
agent: {
resolveAgentDir: vi.fn(() => "/tmp/agent"),
resolveAgentWorkspaceDir: vi.fn(() => "/tmp/workspace"),
ensureAgentWorkspace: vi.fn(async () => {}),
session: {
resolveStorePath: vi.fn(() => "/tmp/sessions.json"),
loadSessionStore: vi.fn(() => sessionStore),
saveSessionStore: vi.fn(async () => {}),
resolveSessionFilePath: vi.fn(() => "/tmp/session.json"),
},
runEmbeddedPiAgent: vi.fn(async () => ({
payloads: [{ text: "Use the Portugal launch data." }],
meta: {},
})),
resolveAgentTimeoutMs: vi.fn(() => 1000),
},
};
const handle = await startCommandRealtimeAudioBridge({
config: resolveGoogleMeetConfig({
realtime: { provider: "openai", model: "gpt-realtime" },
}),
fullConfig: {} as never,
runtime: runtime as never,
meetingSessionId: "meet-1",
inputCommand: ["capture-meet"],
outputCommand: ["play-meet"],
logger: noopLogger,
providers: [provider],
spawn: spawnMock,
});
inputStdout.write(Buffer.from([1, 2, 3]));
callbacks?.onAudio(Buffer.from([4, 5]));
callbacks?.onMark?.("mark-1");
callbacks?.onToolCall?.({
itemId: "item-1",
callId: "tool-call-1",
name: "openclaw_agent_consult",
args: { question: "What should I say about launch timing?" },
});
expect(spawnMock).toHaveBeenNthCalledWith(1, "play-meet", [], {
stdio: ["pipe", "ignore", "pipe"],
});
expect(spawnMock).toHaveBeenNthCalledWith(2, "capture-meet", [], {
stdio: ["ignore", "pipe", "pipe"],
});
expect(sendAudio).toHaveBeenCalledWith(Buffer.from([1, 2, 3]));
expect(outputStdinWrites).toEqual([Buffer.from([4, 5])]);
expect(bridge.acknowledgeMark).toHaveBeenCalled();
expect(callbacks).toMatchObject({
tools: [
expect.objectContaining({
name: "openclaw_agent_consult",
}),
],
});
await vi.waitFor(() => {
expect(bridge.submitToolResult).toHaveBeenCalledWith("tool-call-1", {
text: "Use the Portugal launch data.",
});
});
expect(runtime.agent.runEmbeddedPiAgent).toHaveBeenCalledWith(
expect.objectContaining({
messageProvider: "google-meet",
thinkLevel: "high",
toolsAllow: ["read", "web_search", "web_fetch", "x_search", "memory_search", "memory_get"],
}),
);
await handle.stop();
expect(bridge.close).toHaveBeenCalled();
expect(inputProcess.kill).toHaveBeenCalledWith("SIGTERM");
expect(outputProcess.kill).toHaveBeenCalledWith("SIGTERM");
});
});
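
Review note on the test file above: the implementations of `normalizeGoogleMeetSpaceName`, `normalizeDialInNumber`, and `buildMeetDtmfSequence` live in `./src/meet.js` and `./src/transports/twilio.js`, which are not part of this excerpt. The sketch below shows one way to satisfy the input/output pairs those tests assert. Every function name is hypothetical, and the real helpers may handle more cases or validate more strictly.

```typescript
// Hypothetical sketch: accept a spaces/{id} name, a bare meeting code, or a
// meet.google.com URL, and return the spaces/{id} form.
function sketchNormalizeSpaceName(input: string): string {
  const trimmed = input.trim();
  if (trimmed.startsWith("spaces/")) {
    return trimmed;
  }
  if (/^https?:\/\//i.test(trimmed)) {
    const url = new URL(trimmed);
    if (url.hostname !== "meet.google.com") {
      throw new Error("Expected a meet.google.com URL");
    }
    const code = url.pathname.split("/").filter(Boolean)[0];
    if (!code) {
      throw new Error("meet.google.com URL is missing a meeting code");
    }
    return `spaces/${code}`;
  }
  // Anything else is treated as a bare meeting code (assumption).
  return `spaces/${trimmed}`;
}

// Hypothetical sketch: strip formatting characters, keeping a leading "+".
function sketchNormalizeDialInNumber(raw: string): string {
  const digits = raw.replace(/\D/g, "");
  return raw.trim().startsWith("+") ? `+${digits}` : digits;
}

// Hypothetical sketch: an explicit DTMF sequence wins; otherwise the PIN's
// digits are sent followed by "#".
function sketchBuildDtmfSequence(params: { pin?: string; dtmfSequence?: string }): string {
  if (params.dtmfSequence) {
    return params.dtmfSequence;
  }
  const digits = (params.pin ?? "").replace(/\D/g, "");
  if (!digits) {
    throw new Error("A PIN or an explicit DTMF sequence is required");
  }
  return `${digits}#`;
}
```

Passing an explicit sequence through untouched keeps pause characters such as the `ww` prefix in the test's `"ww123#"` input available to callers.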

@@ -0,0 +1,343 @@
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import type { GatewayRequestHandlerOptions } from "openclaw/plugin-sdk/gateway-runtime";
import { definePluginEntry, type OpenClawPluginApi } from "openclaw/plugin-sdk/plugin-entry";
import { normalizeOptionalString } from "openclaw/plugin-sdk/text-runtime";
import { Type } from "typebox";
import { registerGoogleMeetCli } from "./src/cli.js";
import {
resolveGoogleMeetConfig,
type GoogleMeetConfig,
type GoogleMeetMode,
type GoogleMeetTransport,
} from "./src/config.js";
import { buildGoogleMeetPreflightReport, fetchGoogleMeetSpace } from "./src/meet.js";
import { resolveGoogleMeetAccessToken } from "./src/oauth.js";
import { GoogleMeetRuntime } from "./src/runtime.js";
const googleMeetConfigSchema = {
parse(value: unknown) {
return resolveGoogleMeetConfig(value);
},
uiHints: {
"defaults.meeting": {
label: "Default Meeting",
help: "Meet URL, meeting code, or spaces/{id} used when CLI commands omit a meeting.",
},
"preview.enrollmentAcknowledged": {
label: "Preview Acknowledged",
help: "Confirms you understand the Google Meet Media API is still Developer Preview.",
advanced: true,
},
defaultTransport: {
label: "Default Transport",
help: "Chrome uses a signed-in browser profile. Twilio uses Meet dial-in numbers.",
},
defaultMode: {
label: "Default Mode",
help: "Realtime voice is the default.",
},
"chrome.audioBackend": {
label: "Chrome Audio Backend",
help: "BlackHole 2ch is required for local duplex audio routing.",
},
"chrome.launch": { label: "Launch Chrome" },
"chrome.browserProfile": { label: "Chrome Profile", advanced: true },
"chrome.audioInputCommand": {
label: "Audio Input Command",
help: "Command that writes 8 kHz G.711 mu-law meeting audio to stdout.",
advanced: true,
},
"chrome.audioOutputCommand": {
label: "Audio Output Command",
help: "Command that reads 8 kHz G.711 mu-law assistant audio from stdin.",
advanced: true,
},
"chrome.audioBridgeCommand": { label: "Audio Bridge Command", advanced: true },
"chrome.audioBridgeHealthCommand": {
label: "Audio Bridge Health Command",
advanced: true,
},
"twilio.defaultDialInNumber": {
label: "Default Dial-In Number",
placeholder: "+15551234567",
},
"twilio.defaultPin": { label: "Default PIN", advanced: true },
"twilio.defaultDtmfSequence": { label: "Default DTMF Sequence", advanced: true },
"voiceCall.enabled": { label: "Delegate To Voice Call" },
"voiceCall.gatewayUrl": { label: "Voice Call Gateway URL", advanced: true },
"voiceCall.token": {
label: "Voice Call Gateway Token",
sensitive: true,
advanced: true,
},
"voiceCall.requestTimeoutMs": {
label: "Voice Call Request Timeout (ms)",
advanced: true,
},
"voiceCall.dtmfDelayMs": { label: "DTMF Delay (ms)", advanced: true },
"voiceCall.introMessage": { label: "Voice Call Intro Message", advanced: true },
"realtime.provider": {
label: "Realtime Provider",
help: "Defaults to OpenAI; uses OPENAI_API_KEY when no provider config is set.",
},
"realtime.model": { label: "Realtime Model", advanced: true },
"realtime.instructions": { label: "Realtime Instructions", advanced: true },
"realtime.toolPolicy": {
label: "Realtime Tool Policy",
help: "Safe read-only tools are available by default; owner requests can unlock broader tools.",
advanced: true,
},
"oauth.clientId": { label: "OAuth Client ID" },
"oauth.clientSecret": { label: "OAuth Client Secret", sensitive: true },
"oauth.refreshToken": { label: "OAuth Refresh Token", sensitive: true },
"oauth.accessToken": {
label: "Cached Access Token",
sensitive: true,
advanced: true,
},
"oauth.expiresAt": {
label: "Cached Access Token Expiry",
help: "Unix epoch milliseconds used only for the cached access-token fast path.",
advanced: true,
},
},
};
const GoogleMeetToolSchema = Type.Object({
action: Type.String({
enum: ["join", "status", "setup_status", "resolve_space", "preflight", "leave"],
description: "Google Meet action to run",
}),
url: Type.Optional(Type.String({ description: "Explicit https://meet.google.com/... URL" })),
transport: Type.Optional(
Type.String({ enum: ["chrome", "twilio"], description: "Join transport" }),
),
mode: Type.Optional(Type.String({ enum: ["realtime", "transcribe"], description: "Join mode" })),
dialInNumber: Type.Optional(Type.String({ description: "Meet dial-in number for Twilio" })),
pin: Type.Optional(Type.String({ description: "Meet phone PIN for Twilio" })),
dtmfSequence: Type.Optional(Type.String({ description: "Explicit DTMF sequence for Twilio" })),
sessionId: Type.Optional(Type.String({ description: "Meet session ID" })),
meeting: Type.Optional(Type.String({ description: "Meet URL, meeting code, or spaces/{id}" })),
accessToken: Type.Optional(Type.String({ description: "Access token override" })),
refreshToken: Type.Optional(Type.String({ description: "Refresh token override" })),
clientId: Type.Optional(Type.String({ description: "OAuth client id override" })),
clientSecret: Type.Optional(Type.String({ description: "OAuth client secret override" })),
expiresAt: Type.Optional(Type.Number({ description: "Cached access token expiry ms" })),
});
function asParamRecord(params: unknown): Record<string, unknown> {
return params && typeof params === "object" && !Array.isArray(params)
? (params as Record<string, unknown>)
: {};
}
function json(payload: unknown) {
return {
content: [{ type: "text" as const, text: JSON.stringify(payload, null, 2) }],
details: payload,
};
}
function normalizeTransport(value: unknown): GoogleMeetTransport | undefined {
return value === "chrome" || value === "twilio" ? value : undefined;
}
function normalizeMode(value: unknown): GoogleMeetMode | undefined {
return value === "realtime" || value === "transcribe" ? value : undefined;
}
function resolveMeetingInput(config: GoogleMeetConfig, value: unknown): string {
const meeting = normalizeOptionalString(value) ?? config.defaults.meeting;
if (!meeting) {
throw new Error("Meeting input is required");
}
return meeting;
}
async function resolveSpaceFromParams(config: GoogleMeetConfig, raw: Record<string, unknown>) {
const meeting = resolveMeetingInput(config, raw.meeting);
const token = await resolveGoogleMeetAccessToken({
clientId: normalizeOptionalString(raw.clientId) ?? config.oauth.clientId,
clientSecret: normalizeOptionalString(raw.clientSecret) ?? config.oauth.clientSecret,
refreshToken: normalizeOptionalString(raw.refreshToken) ?? config.oauth.refreshToken,
accessToken: normalizeOptionalString(raw.accessToken) ?? config.oauth.accessToken,
expiresAt: typeof raw.expiresAt === "number" ? raw.expiresAt : config.oauth.expiresAt,
});
const space = await fetchGoogleMeetSpace({
accessToken: token.accessToken,
meeting,
});
return { meeting, token, space };
}
export default definePluginEntry({
id: "google-meet",
name: "Google Meet",
description: "Join Google Meet calls through Chrome or Twilio transports",
configSchema: googleMeetConfigSchema,
register(api: OpenClawPluginApi) {
const config = googleMeetConfigSchema.parse(api.pluginConfig);
let runtime: GoogleMeetRuntime | null = null;
const ensureRuntime = async () => {
if (!config.enabled) {
throw new Error("Google Meet plugin disabled in plugin config");
}
if (!runtime) {
runtime = new GoogleMeetRuntime({
config,
fullConfig: api.config,
runtime: api.runtime,
logger: api.logger,
});
}
return runtime;
};
const sendError = (respond: (ok: boolean, payload?: unknown) => void, err: unknown) => {
respond(false, { error: formatErrorMessage(err) });
};
api.registerGatewayMethod(
"googlemeet.join",
async ({ params, respond }: GatewayRequestHandlerOptions) => {
try {
const rt = await ensureRuntime();
const result = await rt.join({
url: resolveMeetingInput(config, params?.url),
transport: normalizeTransport(params?.transport),
mode: normalizeMode(params?.mode),
dialInNumber: normalizeOptionalString(params?.dialInNumber),
pin: normalizeOptionalString(params?.pin),
dtmfSequence: normalizeOptionalString(params?.dtmfSequence),
});
respond(true, result);
} catch (err) {
sendError(respond, err);
}
},
);
api.registerGatewayMethod(
"googlemeet.status",
async ({ params, respond }: GatewayRequestHandlerOptions) => {
try {
const rt = await ensureRuntime();
respond(true, rt.status(normalizeOptionalString(params?.sessionId)));
} catch (err) {
sendError(respond, err);
}
},
);
api.registerGatewayMethod(
"googlemeet.setup",
async ({ respond }: GatewayRequestHandlerOptions) => {
try {
const rt = await ensureRuntime();
respond(true, rt.setupStatus());
} catch (err) {
sendError(respond, err);
}
},
);
api.registerGatewayMethod(
"googlemeet.leave",
async ({ params, respond }: GatewayRequestHandlerOptions) => {
try {
const sessionId = normalizeOptionalString(params?.sessionId);
if (!sessionId) {
respond(false, { error: "sessionId required" });
return;
}
const rt = await ensureRuntime();
respond(true, await rt.leave(sessionId));
} catch (err) {
sendError(respond, err);
}
},
);
api.registerTool({
name: "google_meet",
label: "Google Meet",
description: "Join and track Google Meet sessions through Chrome or Twilio.",
parameters: GoogleMeetToolSchema,
async execute(_toolCallId, params) {
const raw = asParamRecord(params);
try {
switch (raw.action) {
case "join": {
const rt = await ensureRuntime();
return json(
await rt.join({
url: resolveMeetingInput(config, raw.url),
transport: normalizeTransport(raw.transport),
mode: normalizeMode(raw.mode),
dialInNumber: normalizeOptionalString(raw.dialInNumber),
pin: normalizeOptionalString(raw.pin),
dtmfSequence: normalizeOptionalString(raw.dtmfSequence),
}),
);
}
case "status": {
const rt = await ensureRuntime();
return json(rt.status(normalizeOptionalString(raw.sessionId)));
}
case "setup_status": {
const rt = await ensureRuntime();
return json(rt.setupStatus());
}
case "resolve_space": {
const { token: _token, ...result } = await resolveSpaceFromParams(config, raw);
return json(result);
}
case "preflight": {
const { meeting, token, space } = await resolveSpaceFromParams(config, raw);
return json(
buildGoogleMeetPreflightReport({
input: meeting,
space,
previewAcknowledged: config.preview.enrollmentAcknowledged,
tokenSource: token.refreshed ? "refresh-token" : "cached-access-token",
}),
);
}
case "leave": {
const rt = await ensureRuntime();
const sessionId = normalizeOptionalString(raw.sessionId);
if (!sessionId) {
throw new Error("sessionId required");
}
return json(await rt.leave(sessionId));
}
default:
throw new Error("unknown google_meet action");
}
} catch (err) {
return json({ error: formatErrorMessage(err) });
}
},
});
api.registerCli(
({ program }) =>
registerGoogleMeetCli({
program,
config,
ensureRuntime,
}),
{
commands: ["googlemeet"],
descriptors: [
{
name: "googlemeet",
description: "Join and manage Google Meet calls",
hasSubcommands: true,
},
],
},
);
},
});
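
Review note on `GoogleMeetToolSchema` above: the schema declares `action`, `transport`, and `mode` as plain strings carrying an `enum` keyword, and the accompanying test asserts that the serialized parameters contain no `anyOf`. The sketch below contrasts the two encodings using plain JSON Schema objects. It is an illustration only: the shortened `actions` list and all helper names are hypothetical, and it makes no claim about which encoding any particular schema builder emits.

```typescript
type JsonSchema = Record<string, unknown>;

// Shortened, hypothetical action list for illustration.
const actions = ["join", "status", "leave"] as const;

// Flat encoding: a single string property with an `enum` keyword.
const flatAction: JsonSchema = { type: "string", enum: [...actions] };

// Union encoding: one `const` subschema per literal, wrapped in `anyOf`.
const unionAction: JsonSchema = {
  anyOf: actions.map((value) => ({ type: "string", const: value })),
};

// Mirrors the style of check used in the test: look for the keyword in the
// serialized schema.
function usesAnyOf(schema: JsonSchema): boolean {
  return JSON.stringify(schema).includes('"anyOf"');
}

// With the flat form, validating an action is a simple membership check.
function isAllowedAction(value: unknown): value is (typeof actions)[number] {
  return typeof value === "string" && (actions as readonly string[]).includes(value);
}
```

Both encodings accept the same values; the flat form simply avoids a composite keyword in the schema that is sent to the model provider.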

@@ -0,0 +1,357 @@
{
"id": "google-meet",
"name": "Google Meet",
"description": "Join Google Meet calls through Chrome or Twilio transports.",
"enabledByDefault": false,
"commandAliases": [{ "name": "googlemeet" }],
"activation": {
"onCommands": ["googlemeet"],
"onCapabilities": ["tool"]
},
"uiHints": {
"defaults.meeting": {
"label": "Default Meeting",
"help": "Meet URL, meeting code, or spaces/{id} used when commands omit a meeting."
},
"preview.enrollmentAcknowledged": {
"label": "Preview Acknowledged",
"help": "Confirms you understand the Google Meet Media API is still Developer Preview.",
"advanced": true
},
"defaultTransport": {
"label": "Default Transport",
"help": "Chrome uses a signed-in browser profile. Twilio uses Meet dial-in numbers."
},
"defaultMode": {
"label": "Default Mode",
"help": "Realtime voice is the default."
},
"chrome.audioBackend": {
"label": "Chrome Audio Backend",
"help": "BlackHole 2ch is required for local duplex audio routing."
},
"chrome.launch": {
"label": "Launch Chrome"
},
"chrome.browserProfile": {
"label": "Chrome Profile",
"advanced": true
},
"chrome.audioInputCommand": {
"label": "Audio Input Command",
"help": "Command that writes 8 kHz G.711 mu-law meeting audio to stdout.",
"advanced": true
},
"chrome.audioOutputCommand": {
"label": "Audio Output Command",
"help": "Command that reads 8 kHz G.711 mu-law assistant audio from stdin.",
"advanced": true
},
"chrome.audioBridgeCommand": {
"label": "Audio Bridge Command",
"advanced": true
},
"chrome.audioBridgeHealthCommand": {
"label": "Audio Bridge Health Command",
"advanced": true
},
"twilio.defaultDialInNumber": {
"label": "Default Dial-In Number",
"placeholder": "+15551234567"
},
"twilio.defaultPin": {
"label": "Default PIN",
"advanced": true
},
"twilio.defaultDtmfSequence": {
"label": "Default DTMF Sequence",
"advanced": true
},
"voiceCall.enabled": {
"label": "Delegate To Voice Call"
},
"voiceCall.gatewayUrl": {
"label": "Voice Call Gateway URL",
"advanced": true
},
"voiceCall.token": {
"label": "Voice Call Gateway Token",
"sensitive": true,
"advanced": true
},
"voiceCall.requestTimeoutMs": {
"label": "Voice Call Request Timeout (ms)",
"advanced": true
},
"voiceCall.dtmfDelayMs": {
"label": "DTMF Delay (ms)",
"advanced": true
},
"voiceCall.introMessage": {
"label": "Voice Call Intro Message",
"advanced": true
},
"realtime.provider": {
"label": "Realtime Provider",
"help": "Defaults to OpenAI; uses OPENAI_API_KEY when no provider config is set."
},
"realtime.model": {
"label": "Realtime Model",
"advanced": true
},
"realtime.instructions": {
"label": "Realtime Instructions",
"advanced": true
},
"realtime.toolPolicy": {
"label": "Realtime Tool Policy",
"help": "Safe read-only tools are available by default; owner requests can unlock broader tools.",
"advanced": true
},
"oauth.clientId": {
"label": "OAuth Client ID"
},
"oauth.clientSecret": {
"label": "OAuth Client Secret",
"sensitive": true
},
"oauth.refreshToken": {
"label": "OAuth Refresh Token",
"sensitive": true
},
"oauth.accessToken": {
"label": "Cached Access Token",
"sensitive": true,
"advanced": true
},
"oauth.expiresAt": {
"label": "Cached Access Token Expiry",
"help": "Unix epoch milliseconds used only for the cached access-token fast path.",
"advanced": true
}
},
"configSchema": {
"type": "object",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean"
},
"defaults": {
"type": "object",
"additionalProperties": false,
"properties": {
"meeting": {
"type": "string"
}
}
},
"preview": {
"type": "object",
"additionalProperties": false,
"properties": {
"enrollmentAcknowledged": {
"type": "boolean"
}
}
},
"defaultTransport": {
"type": "string",
"enum": ["chrome", "twilio"],
"default": "chrome"
},
"defaultMode": {
"type": "string",
"enum": ["realtime", "transcribe"],
"default": "realtime"
},
"chrome": {
"type": "object",
"additionalProperties": false,
"properties": {
"audioBackend": {
"type": "string",
"enum": ["blackhole-2ch"],
"default": "blackhole-2ch"
},
"launch": {
"type": "boolean",
"default": true
},
"browserProfile": {
"type": "string"
},
"joinTimeoutMs": {
"type": "number",
"default": 30000
},
"audioInputCommand": {
"type": "array",
"default": [
"rec",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-"
],
"items": {
"type": "string"
}
},
"audioOutputCommand": {
"type": "array",
"default": [
"play",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-"
],
"items": {
"type": "string"
}
},
"audioBridgeCommand": {
"type": "array",
"items": {
"type": "string"
}
},
"audioBridgeHealthCommand": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"twilio": {
"type": "object",
"additionalProperties": false,
"properties": {
"defaultDialInNumber": {
"type": "string"
},
"defaultPin": {
"type": "string"
},
"defaultDtmfSequence": {
"type": "string"
}
}
},
"voiceCall": {
"type": "object",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"default": true
},
"gatewayUrl": {
"type": "string"
},
"token": {
"type": "string"
},
"requestTimeoutMs": {
"type": "number",
"default": 30000
},
"dtmfDelayMs": {
"type": "number",
"default": 2500
},
"introMessage": {
"type": "string"
}
}
},
"realtime": {
"type": "object",
"additionalProperties": false,
"properties": {
"provider": {
"type": "string",
"default": "openai"
},
"model": {
"type": "string"
},
"instructions": {
"type": "string",
"default": "You are joining a private Google Meet as an OpenClaw agent. Keep spoken replies brief and natural. When a question needs deeper reasoning, current information, or tools, call openclaw_agent_consult before answering."
},
"toolPolicy": {
"type": "string",
"enum": ["safe-read-only", "owner", "none"],
"default": "safe-read-only"
},
"providers": {
"type": "object",
"additionalProperties": {
"type": "object",
"additionalProperties": true
}
}
}
},
"oauth": {
"type": "object",
"additionalProperties": false,
"properties": {
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
},
"refreshToken": {
"type": "string"
},
"accessToken": {
"type": "string"
},
"expiresAt": {
"type": "number"
}
}
},
"auth": {
"type": "object",
"additionalProperties": false,
"properties": {
"provider": {
"type": "string",
"enum": ["google-oauth"]
},
"clientId": {
"type": "string"
},
"clientSecret": {
"type": "string"
},
"tokenPath": {
"type": "string"
}
}
}
}
}
}
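
A minimal sketch of a config object that stays within the schema above, plus a small check for its enum fields. The meeting URL is a placeholder, and `findEnumViolations` is a hypothetical helper written for this illustration; it is not part of the plugin.

```typescript
// Hypothetical plugin config; every value here is illustrative.
const exampleGoogleMeetConfig = {
  enabled: true,
  defaults: { meeting: "https://meet.google.com/abc-defg-hij" }, // placeholder code
  preview: { enrollmentAcknowledged: true },
  defaultTransport: "chrome",
  defaultMode: "realtime",
  chrome: { launch: true, joinTimeoutMs: 30000 },
  realtime: { provider: "openai", toolPolicy: "safe-read-only" },
};

// Enum values copied from the schema above.
const TRANSPORTS: readonly string[] = ["chrome", "twilio"];
const MODES: readonly string[] = ["realtime", "transcribe"];
const TOOL_POLICIES: readonly string[] = ["safe-read-only", "owner", "none"];

type EnumFields = {
  defaultTransport?: string;
  defaultMode?: string;
  realtime?: { toolPolicy?: string };
};

// Returns one message per enum field whose value falls outside the schema.
function findEnumViolations(config: EnumFields): string[] {
  const problems: string[] = [];
  if (config.defaultTransport !== undefined && !TRANSPORTS.includes(config.defaultTransport)) {
    problems.push(`defaultTransport: ${config.defaultTransport}`);
  }
  if (config.defaultMode !== undefined && !MODES.includes(config.defaultMode)) {
    problems.push(`defaultMode: ${config.defaultMode}`);
  }
  const policy = config.realtime?.toolPolicy;
  if (policy !== undefined && !TOOL_POLICIES.includes(policy)) {
    problems.push(`realtime.toolPolicy: ${policy}`);
  }
  return problems;
}

const okProblems = findEnumViolations(exampleGoogleMeetConfig);
const badProblems = findEnumViolations({ defaultTransport: "zoom" });
```

Because every nested object in the schema sets `additionalProperties: false` (except `realtime.providers`), a real validator would also reject unknown keys; this sketch only covers the enums.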


@@ -0,0 +1,40 @@
{
"name": "@openclaw/google-meet",
"version": "2026.4.20",
"description": "OpenClaw Google Meet participant plugin",
"type": "module",
"dependencies": {
"commander": "^14.0.3",
"typebox": "1.1.28"
},
"devDependencies": {
"@openclaw/plugin-sdk": "workspace:*",
"openclaw": "workspace:*"
},
"peerDependencies": {
"openclaw": ">=2026.4.20"
},
"peerDependenciesMeta": {
"openclaw": {
"optional": true
}
},
"openclaw": {
"extensions": [
"./index.ts"
],
"install": {
"minHostVersion": ">=2026.4.20"
},
"compat": {
"pluginApi": ">=2026.4.20"
},
"build": {
"openclawVersion": "2026.4.20"
},
"release": {
"publishToClawHub": true,
"publishToNpm": true
}
}
}


@@ -0,0 +1,143 @@
import { randomUUID } from "node:crypto";
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import type { PluginRuntime, RuntimeLogger } from "openclaw/plugin-sdk/plugin-runtime";
import {
REALTIME_VOICE_AGENT_CONSULT_TOOL,
REALTIME_VOICE_AGENT_CONSULT_TOOL_NAME,
type RealtimeVoiceTool,
} from "openclaw/plugin-sdk/realtime-voice";
import { normalizeOptionalString } from "openclaw/plugin-sdk/text-runtime";
import type { GoogleMeetConfig, GoogleMeetToolPolicy } from "./config.js";
type AgentPayload = {
text?: string;
isError?: boolean;
isReasoning?: boolean;
};
export const GOOGLE_MEET_AGENT_CONSULT_TOOL_NAME = REALTIME_VOICE_AGENT_CONSULT_TOOL_NAME;
export const GOOGLE_MEET_AGENT_CONSULT_TOOL = REALTIME_VOICE_AGENT_CONSULT_TOOL;
export function resolveGoogleMeetRealtimeTools(policy: GoogleMeetToolPolicy): RealtimeVoiceTool[] {
return policy === "none" ? [] : [GOOGLE_MEET_AGENT_CONSULT_TOOL];
}
function normalizeToolArgString(args: unknown, key: string): string | undefined {
if (!args || typeof args !== "object" || Array.isArray(args)) {
return undefined;
}
return normalizeOptionalString((args as Record<string, unknown>)[key]);
}
function collectVisibleText(payloads: AgentPayload[]): string | null {
const chunks: string[] = [];
for (const payload of payloads) {
if (payload.isError || payload.isReasoning) {
continue;
}
const text = normalizeOptionalString(payload.text);
if (text) {
chunks.push(text);
}
}
return chunks.length > 0 ? chunks.join("\n\n").trim() : null;
}
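// Maps the tool policy to a tool allow-list: "owner" passes no allow-list (undefined),
// "safe-read-only" passes a fixed read-only set, and "none" passes an empty list.
function resolveToolsAllow(policy: GoogleMeetToolPolicy): string[] | undefined {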
if (policy === "owner") {
return undefined;
}
if (policy === "safe-read-only") {
return ["read", "web_search", "web_fetch", "x_search", "memory_search", "memory_get"];
}
return [];
}
function buildPrompt(params: {
args: unknown;
transcript: Array<{ role: "user" | "assistant"; text: string }>;
}): string {
const question = normalizeToolArgString(params.args, "question");
if (!question) {
throw new Error("question required");
}
const context = normalizeToolArgString(params.args, "context");
const responseStyle = normalizeToolArgString(params.args, "responseStyle");
const transcript = params.transcript
.slice(-12)
.map((entry) => `${entry.role === "assistant" ? "Agent" : "Participant"}: ${entry.text}`)
.join("\n");
return [
"You are helping an OpenClaw realtime voice agent during a private Google Meet.",
"Answer the participant's question with the strongest useful reasoning and available tools.",
"Return only the concise answer the realtime voice agent should speak next.",
"Do not include markdown, citations unless needed, tool logs, or private reasoning.",
responseStyle ? `Spoken style: ${responseStyle}` : undefined,
transcript ? `Recent meeting transcript:\n${transcript}` : undefined,
context ? `Additional context:\n${context}` : undefined,
`Question:\n${question}`,
]
.filter(Boolean)
.join("\n\n");
}
export async function consultOpenClawAgentForGoogleMeet(params: {
config: GoogleMeetConfig;
fullConfig: OpenClawConfig;
runtime: PluginRuntime;
logger: RuntimeLogger;
meetingSessionId: string;
args: unknown;
transcript: Array<{ role: "user" | "assistant"; text: string }>;
}): Promise<{ text: string }> {
const agentId = "main";
const sessionKey = `google-meet:${params.meetingSessionId}`;
const cfg = params.fullConfig;
const agentDir = params.runtime.agent.resolveAgentDir(cfg, agentId);
const workspaceDir = params.runtime.agent.resolveAgentWorkspaceDir(cfg, agentId);
await params.runtime.agent.ensureAgentWorkspace({ dir: workspaceDir });
const storePath = params.runtime.agent.session.resolveStorePath(cfg.session?.store, { agentId });
const sessionStore = params.runtime.agent.session.loadSessionStore(storePath);
const now = Date.now();
const existing = sessionStore[sessionKey] as
| { sessionId?: string; updatedAt?: number }
| undefined;
const sessionId = normalizeOptionalString(existing?.sessionId) ?? randomUUID();
sessionStore[sessionKey] = { ...existing, sessionId, updatedAt: now };
await params.runtime.agent.session.saveSessionStore(storePath, sessionStore);
const sessionFile = params.runtime.agent.session.resolveSessionFilePath(
sessionId,
sessionStore[sessionKey],
{ agentId },
);
const result = await params.runtime.agent.runEmbeddedPiAgent({
sessionId,
sessionKey,
messageProvider: "google-meet",
sessionFile,
workspaceDir,
config: cfg,
prompt: buildPrompt({ args: params.args, transcript: params.transcript }),
thinkLevel: "high",
verboseLevel: "off",
reasoningLevel: "off",
toolResultFormat: "plain",
toolsAllow: resolveToolsAllow(params.config.realtime.toolPolicy),
timeoutMs: params.runtime.agent.resolveAgentTimeoutMs({ cfg }),
runId: `google-meet:${params.meetingSessionId}:${Date.now()}`,
lane: "google-meet",
extraSystemPrompt:
"You are a behind-the-scenes consultant for a live meeting voice agent. Be accurate, brief, and speakable.",
agentDir,
});
const text = collectVisibleText((result.payloads ?? []) as AgentPayload[]);
if (!text) {
const reason = result.meta?.aborted ? "agent run aborted" : "agent returned no speakable text";
params.logger.warn(`[google-meet] agent consult produced no answer: ${reason}`);
return { text: "I need a moment to verify that before answering." };
}
return { text };
}
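
The policy mapping and payload filtering above can be tried in isolation. The sketch below is a standalone copy of that logic, not the plugin module itself: `normalize` is a stand-in for the SDK's `normalizeOptionalString` (assumed here to trim strings and return `undefined` for empty or non-string input), and the sample payloads are invented.

```typescript
type ToolPolicy = "safe-read-only" | "owner" | "none";
type Payload = { text?: string; isError?: boolean; isReasoning?: boolean };

// Stand-in for normalizeOptionalString (assumed behavior, see lead-in).
function normalize(value: unknown): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed ? trimmed : undefined;
}

// Same mapping as resolveToolsAllow above.
function toolsAllowFor(policy: ToolPolicy): string[] | undefined {
  if (policy === "owner") return undefined; // no allow-list is passed
  if (policy === "safe-read-only") {
    return ["read", "web_search", "web_fetch", "x_search", "memory_search", "memory_get"];
  }
  return []; // "none": nothing allowed
}

// Same filtering as collectVisibleText above: drop error and reasoning
// payloads, trim the rest, and join with blank lines.
function visibleText(payloads: Payload[]): string | null {
  const chunks: string[] = [];
  for (const payload of payloads) {
    if (payload.isError || payload.isReasoning) continue;
    const text = normalize(payload.text);
    if (text) chunks.push(text);
  }
  return chunks.length > 0 ? chunks.join("\n\n").trim() : null;
}

// Invented payloads: only the two plain entries survive.
const spoken = visibleText([
  { text: "thinking...", isReasoning: true },
  { text: "  The launch is on Friday. " },
  { text: "tool failed", isError: true },
  { text: "Two blockers remain." },
]);
```

When every payload is filtered out, `visibleText` returns `null`, which is the case where the consult function above logs a warning and falls back to a holding phrase.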


@@ -0,0 +1,307 @@
import { createInterface } from "node:readline/promises";
import { format } from "node:util";
import type { Command } from "commander";
import type { GoogleMeetConfig, GoogleMeetMode, GoogleMeetTransport } from "./config.js";
import { buildGoogleMeetPreflightReport, fetchGoogleMeetSpace } from "./meet.js";
import {
buildGoogleMeetAuthUrl,
createGoogleMeetOAuthState,
createGoogleMeetPkce,
exchangeGoogleMeetAuthCode,
resolveGoogleMeetAccessToken,
waitForGoogleMeetAuthCode,
} from "./oauth.js";
import type { GoogleMeetRuntime } from "./runtime.js";
type JoinOptions = {
transport?: GoogleMeetTransport;
mode?: GoogleMeetMode;
dialInNumber?: string;
pin?: string;
dtmfSequence?: string;
};
type OAuthLoginOptions = {
clientId?: string;
clientSecret?: string;
manual?: boolean;
json?: boolean;
timeoutSec?: string;
};
type ResolveSpaceOptions = {
meeting?: string;
accessToken?: string;
refreshToken?: string;
clientId?: string;
clientSecret?: string;
expiresAt?: string;
json?: boolean;
};
function writeStdoutJson(value: unknown): void {
process.stdout.write(`${JSON.stringify(value, null, 2)}\n`);
}
function writeStdoutLine(...values: unknown[]): void {
process.stdout.write(`${format(...values)}\n`);
}
async function promptInput(message: string): Promise<string> {
const rl = createInterface({
input: process.stdin,
output: process.stderr,
});
try {
return await rl.question(message);
} finally {
rl.close();
}
}
function parseOptionalNumber(value: string | undefined): number | undefined {
if (!value?.trim()) {
return undefined;
}
const parsed = Number(value);
if (!Number.isFinite(parsed)) {
throw new Error(`Expected a numeric value, received ${value}`);
}
return parsed;
}
function resolveMeetingInput(config: GoogleMeetConfig, value?: string): string {
const meeting = value?.trim() || config.defaults.meeting;
if (!meeting) {
throw new Error(
"Meeting input is required. Pass a URL/meeting code or configure defaults.meeting.",
);
}
return meeting;
}
function resolveTokenOptions(
config: GoogleMeetConfig,
options: ResolveSpaceOptions,
): {
meeting: string;
clientId?: string;
clientSecret?: string;
refreshToken?: string;
accessToken?: string;
expiresAt?: number;
} {
return {
meeting: resolveMeetingInput(config, options.meeting),
clientId: options.clientId?.trim() || config.oauth.clientId,
clientSecret: options.clientSecret?.trim() || config.oauth.clientSecret,
refreshToken: options.refreshToken?.trim() || config.oauth.refreshToken,
accessToken: options.accessToken?.trim() || config.oauth.accessToken,
expiresAt: parseOptionalNumber(options.expiresAt) ?? config.oauth.expiresAt,
};
}
export function registerGoogleMeetCli(params: {
program: Command;
config: GoogleMeetConfig;
ensureRuntime: () => Promise<GoogleMeetRuntime>;
}) {
const root = params.program
.command("googlemeet")
.description("Google Meet participant utilities")
.addHelpText("after", () => `\nDocs: https://docs.openclaw.ai/plugins/google-meet\n`);
const auth = root.command("auth").description("Google Meet OAuth helpers");
auth
.command("login")
.description("Run a PKCE OAuth flow and print refresh-token JSON to store in plugin config")
.option("--client-id <id>", "OAuth client id override")
.option("--client-secret <secret>", "OAuth client secret override")
.option("--manual", "Use copy/paste callback flow instead of localhost callback")
.option("--json", "Print the token payload as JSON", false)
.option("--timeout-sec <n>", "Local callback timeout in seconds", "300")
.action(async (options: OAuthLoginOptions) => {
const clientId = options.clientId?.trim() || params.config.oauth.clientId;
const clientSecret = options.clientSecret?.trim() || params.config.oauth.clientSecret;
if (!clientId) {
throw new Error(
"Missing Google Meet OAuth client id. Configure oauth.clientId or pass --client-id.",
);
}
const { verifier, challenge } = createGoogleMeetPkce();
const state = createGoogleMeetOAuthState();
const authUrl = buildGoogleMeetAuthUrl({
clientId,
challenge,
state,
});
const code = await waitForGoogleMeetAuthCode({
state,
manual: Boolean(options.manual),
timeoutMs: (parseOptionalNumber(options.timeoutSec) ?? 300) * 1000,
authUrl,
promptInput,
writeLine: (message) => writeStdoutLine("%s", message),
});
const tokens = await exchangeGoogleMeetAuthCode({
clientId,
clientSecret,
code,
verifier,
});
if (!tokens.refreshToken) {
throw new Error(
"Google OAuth did not return a refresh token. Re-run the flow with consent and offline access.",
);
}
const payload = {
oauth: {
clientId,
...(clientSecret ? { clientSecret } : {}),
refreshToken: tokens.refreshToken,
accessToken: tokens.accessToken,
expiresAt: tokens.expiresAt,
},
scope: tokens.scope,
tokenType: tokens.tokenType,
};
if (!options.json) {
writeStdoutLine("Paste this into plugins.entries.google-meet.config:");
}
writeStdoutJson(payload);
});
root
.command("join")
.argument("[url]", "Explicit https://meet.google.com/... URL")
.option("--transport <transport>", "Transport: chrome or twilio")
.option("--mode <mode>", "Mode: realtime or transcribe")
.option("--dial-in-number <phone>", "Meet dial-in number for Twilio transport")
.option("--pin <pin>", "Meet phone PIN; # is appended if omitted")
.option("--dtmf-sequence <sequence>", "Explicit Twilio DTMF sequence")
.action(async (url: string | undefined, options: JoinOptions) => {
const rt = await params.ensureRuntime();
const result = await rt.join({
url: resolveMeetingInput(params.config, url),
transport: options.transport,
mode: options.mode,
dialInNumber: options.dialInNumber,
pin: options.pin,
dtmfSequence: options.dtmfSequence,
});
writeStdoutJson(result.session);
});
root
.command("resolve-space")
.description("Resolve a Meet URL, meeting code, or spaces/{id} to its canonical space")
.option("--meeting <value>", "Meet URL, meeting code, or spaces/{id}")
.option("--access-token <token>", "Access token override")
.option("--refresh-token <token>", "Refresh token override")
.option("--client-id <id>", "OAuth client id override")
.option("--client-secret <secret>", "OAuth client secret override")
.option("--expires-at <ms>", "Cached access token expiry as unix epoch milliseconds")
.option("--json", "Print JSON output", false)
.action(async (options: ResolveSpaceOptions) => {
const resolved = resolveTokenOptions(params.config, options);
const token = await resolveGoogleMeetAccessToken(resolved);
const space = await fetchGoogleMeetSpace({
accessToken: token.accessToken,
meeting: resolved.meeting,
});
if (options.json) {
writeStdoutJson(space);
return;
}
writeStdoutLine("input: %s", resolved.meeting);
writeStdoutLine("space: %s", space.name);
if (space.meetingCode) {
writeStdoutLine("meeting code: %s", space.meetingCode);
}
if (space.meetingUri) {
writeStdoutLine("meeting uri: %s", space.meetingUri);
}
writeStdoutLine("active conference: %s", space.activeConference ? "yes" : "no");
writeStdoutLine(
"token source: %s",
token.refreshed ? "refresh-token" : "cached-access-token",
);
});
root
.command("preflight")
.description("Validate OAuth + meeting resolution prerequisites for Meet media work")
.option("--meeting <value>", "Meet URL, meeting code, or spaces/{id}")
.option("--access-token <token>", "Access token override")
.option("--refresh-token <token>", "Refresh token override")
.option("--client-id <id>", "OAuth client id override")
.option("--client-secret <secret>", "OAuth client secret override")
.option("--expires-at <ms>", "Cached access token expiry as unix epoch milliseconds")
.option("--json", "Print JSON output", false)
.action(async (options: ResolveSpaceOptions) => {
const resolved = resolveTokenOptions(params.config, options);
const token = await resolveGoogleMeetAccessToken(resolved);
const space = await fetchGoogleMeetSpace({
accessToken: token.accessToken,
meeting: resolved.meeting,
});
const report = buildGoogleMeetPreflightReport({
input: resolved.meeting,
space,
previewAcknowledged: params.config.preview.enrollmentAcknowledged,
tokenSource: token.refreshed ? "refresh-token" : "cached-access-token",
});
if (options.json) {
writeStdoutJson(report);
return;
}
writeStdoutLine("input: %s", report.input);
writeStdoutLine("resolved space: %s", report.resolvedSpaceName);
if (report.meetingCode) {
writeStdoutLine("meeting code: %s", report.meetingCode);
}
if (report.meetingUri) {
writeStdoutLine("meeting uri: %s", report.meetingUri);
}
writeStdoutLine("active conference: %s", report.hasActiveConference ? "yes" : "no");
writeStdoutLine("preview acknowledged: %s", report.previewAcknowledged ? "yes" : "no");
writeStdoutLine("token source: %s", report.tokenSource);
if (report.blockers.length === 0) {
writeStdoutLine("blockers: none");
return;
}
writeStdoutLine("blockers:");
for (const blocker of report.blockers) {
writeStdoutLine("- %s", blocker);
}
});
root
.command("status")
.argument("[session-id]", "Meet session ID")
.action(async (sessionId?: string) => {
const rt = await params.ensureRuntime();
writeStdoutJson(rt.status(sessionId));
});
root
.command("setup")
.description("Show Google Meet transport setup status")
.action(async () => {
const rt = await params.ensureRuntime();
writeStdoutJson(rt.setupStatus());
});
root
.command("leave")
.argument("<session-id>", "Meet session ID")
.action(async (sessionId: string) => {
const rt = await params.ensureRuntime();
const result = await rt.leave(sessionId);
if (!result.found) {
throw new Error("session not found");
}
writeStdoutLine("left %s", sessionId);
});
}
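
Two small rules recur in the CLI above: a flag wins over the configured value unless it is blank, and `--timeout-sec` is parsed as a number with a 300-second default. The sketch below restates them as standalone functions; `pickOverride` and `resolveTimeoutMs` are hypothetical names introduced here, and the ids are placeholders.

```typescript
// Same parsing as parseOptionalNumber above.
function parseOptionalNumber(value: string | undefined): number | undefined {
  if (!value?.trim()) {
    return undefined;
  }
  const parsed = Number(value);
  if (!Number.isFinite(parsed)) {
    throw new Error(`Expected a numeric value, received ${value}`);
  }
  return parsed;
}

// Mirrors `(parseOptionalNumber(options.timeoutSec) ?? 300) * 1000`.
function resolveTimeoutMs(timeoutSec: string | undefined): number {
  return (parseOptionalNumber(timeoutSec) ?? 300) * 1000;
}

// Mirrors `options.clientId?.trim() || config.oauth.clientId`:
// a blank or missing flag falls through to the configured value.
function pickOverride(flag: string | undefined, configured: string | undefined): string | undefined {
  return flag?.trim() || configured;
}

const explicitTimeout = resolveTimeoutMs("60");
const defaultTimeout = resolveTimeoutMs(undefined);
const blankTimeout = resolveTimeoutMs("   ");
const fromFlag = pickOverride(" flag-id ", "config-id");
const fromConfig = pickOverride("   ", "config-id");
```

A non-numeric value such as `"soon"` makes `resolveTimeoutMs` throw rather than silently fall back to the default.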


@@ -0,0 +1,364 @@
import { REALTIME_VOICE_AGENT_CONSULT_TOOL_NAME } from "openclaw/plugin-sdk/realtime-voice";
import {
normalizeOptionalLowercaseString,
normalizeOptionalString,
} from "openclaw/plugin-sdk/text-runtime";
export type GoogleMeetTransport = "chrome" | "twilio";
export type GoogleMeetMode = "realtime" | "transcribe";
export type GoogleMeetToolPolicy = "safe-read-only" | "owner" | "none";
export type GoogleMeetConfig = {
enabled: boolean;
defaults: {
meeting?: string;
};
preview: {
enrollmentAcknowledged: boolean;
};
defaultTransport: GoogleMeetTransport;
defaultMode: GoogleMeetMode;
chrome: {
audioBackend: "blackhole-2ch";
launch: boolean;
browserProfile?: string;
joinTimeoutMs: number;
audioInputCommand?: string[];
audioOutputCommand?: string[];
audioBridgeCommand?: string[];
audioBridgeHealthCommand?: string[];
};
twilio: {
defaultDialInNumber?: string;
defaultPin?: string;
defaultDtmfSequence?: string;
};
voiceCall: {
enabled: boolean;
gatewayUrl?: string;
token?: string;
requestTimeoutMs: number;
dtmfDelayMs: number;
introMessage?: string;
};
realtime: {
provider?: string;
model?: string;
instructions?: string;
toolPolicy: GoogleMeetToolPolicy;
providers: Record<string, Record<string, unknown>>;
};
oauth: {
clientId?: string;
clientSecret?: string;
refreshToken?: string;
accessToken?: string;
expiresAt?: number;
};
auth: {
provider: "google-oauth";
clientId?: string;
clientSecret?: string;
tokenPath?: string;
};
};
export const DEFAULT_GOOGLE_MEET_AUDIO_INPUT_COMMAND = [
"rec",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-",
] as const;
export const DEFAULT_GOOGLE_MEET_AUDIO_OUTPUT_COMMAND = [
"play",
"-q",
"-t",
"raw",
"-r",
"8000",
"-c",
"1",
"-e",
"mu-law",
"-b",
"8",
"-",
] as const;
export const DEFAULT_GOOGLE_MEET_REALTIME_INSTRUCTIONS = `You are joining a private Google Meet as an OpenClaw agent. Keep spoken replies brief and natural. When a question needs deeper reasoning, current information, or tools, call ${REALTIME_VOICE_AGENT_CONSULT_TOOL_NAME} before answering.`;
export const DEFAULT_GOOGLE_MEET_CONFIG: GoogleMeetConfig = {
enabled: true,
defaults: {},
preview: {
enrollmentAcknowledged: false,
},
defaultTransport: "chrome",
defaultMode: "realtime",
chrome: {
audioBackend: "blackhole-2ch",
launch: true,
joinTimeoutMs: 30_000,
audioInputCommand: [...DEFAULT_GOOGLE_MEET_AUDIO_INPUT_COMMAND],
audioOutputCommand: [...DEFAULT_GOOGLE_MEET_AUDIO_OUTPUT_COMMAND],
},
twilio: {},
voiceCall: {
enabled: true,
requestTimeoutMs: 30_000,
dtmfDelayMs: 2_500,
},
realtime: {
provider: "openai",
instructions: DEFAULT_GOOGLE_MEET_REALTIME_INSTRUCTIONS,
toolPolicy: "safe-read-only",
providers: {},
},
oauth: {},
auth: {
provider: "google-oauth",
},
};
const GOOGLE_MEET_CLIENT_ID_KEYS = [
"OPENCLAW_GOOGLE_MEET_CLIENT_ID",
"GOOGLE_MEET_CLIENT_ID",
] as const;
const GOOGLE_MEET_CLIENT_SECRET_KEYS = [
"OPENCLAW_GOOGLE_MEET_CLIENT_SECRET",
"GOOGLE_MEET_CLIENT_SECRET",
] as const;
const GOOGLE_MEET_REFRESH_TOKEN_KEYS = [
"OPENCLAW_GOOGLE_MEET_REFRESH_TOKEN",
"GOOGLE_MEET_REFRESH_TOKEN",
] as const;
const GOOGLE_MEET_ACCESS_TOKEN_KEYS = [
"OPENCLAW_GOOGLE_MEET_ACCESS_TOKEN",
"GOOGLE_MEET_ACCESS_TOKEN",
] as const;
const GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT_KEYS = [
"OPENCLAW_GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT",
"GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT",
] as const;
const GOOGLE_MEET_DEFAULT_MEETING_KEYS = [
"OPENCLAW_GOOGLE_MEET_DEFAULT_MEETING",
"GOOGLE_MEET_DEFAULT_MEETING",
] as const;
const GOOGLE_MEET_PREVIEW_ACK_KEYS = [
"OPENCLAW_GOOGLE_MEET_PREVIEW_ACK",
"GOOGLE_MEET_PREVIEW_ACK",
] as const;
function asRecord(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function resolveBoolean(value: unknown, fallback: boolean): boolean {
return typeof value === "boolean" ? value : fallback;
}
// Accepts only finite, positive numbers; anything else (including 0) uses the fallback.
function resolveNumber(value: unknown, fallback: number): number {
return typeof value === "number" && Number.isFinite(value) && value > 0 ? value : fallback;
}
function resolveOptionalNumber(value: unknown): number | undefined {
if (typeof value === "number" && Number.isFinite(value)) {
return value;
}
if (typeof value === "string" && value.trim()) {
const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : undefined;
}
return undefined;
}
function readEnvString(env: NodeJS.ProcessEnv, keys: readonly string[]): string | undefined {
for (const key of keys) {
const value = normalizeOptionalString(env[key]);
if (value) {
return value;
}
}
return undefined;
}
function readEnvBoolean(env: NodeJS.ProcessEnv, keys: readonly string[]): boolean | undefined {
const normalized = normalizeOptionalLowercaseString(readEnvString(env, keys));
if (!normalized) {
return undefined;
}
if (["1", "true", "yes", "on"].includes(normalized)) {
return true;
}
if (["0", "false", "no", "off"].includes(normalized)) {
return false;
}
return undefined;
}
function readEnvNumber(env: NodeJS.ProcessEnv, keys: readonly string[]): number | undefined {
return resolveOptionalNumber(readEnvString(env, keys));
}
function resolveStringArray(value: unknown): string[] | undefined {
if (!Array.isArray(value)) {
return undefined;
}
const normalized = value
.map((entry) => normalizeOptionalString(entry))
.filter((entry): entry is string => Boolean(entry));
return normalized.length > 0 ? normalized : undefined;
}
function resolveProvidersConfig(value: unknown): Record<string, Record<string, unknown>> {
const raw = asRecord(value);
const providers: Record<string, Record<string, unknown>> = {};
for (const [key, entry] of Object.entries(raw)) {
const providerId = normalizeOptionalLowercaseString(key);
if (!providerId) {
continue;
}
providers[providerId] = asRecord(entry);
}
return providers;
}
function resolveTransport(value: unknown, fallback: GoogleMeetTransport): GoogleMeetTransport {
const normalized = normalizeOptionalLowercaseString(value);
return normalized === "chrome" || normalized === "twilio" ? normalized : fallback;
}
function resolveMode(value: unknown, fallback: GoogleMeetMode): GoogleMeetMode {
const normalized = normalizeOptionalLowercaseString(value);
return normalized === "realtime" || normalized === "transcribe" ? normalized : fallback;
}
function resolveToolPolicy(value: unknown, fallback: GoogleMeetToolPolicy): GoogleMeetToolPolicy {
const normalized = normalizeOptionalLowercaseString(value);
return normalized === "safe-read-only" || normalized === "owner" || normalized === "none"
? normalized
: fallback;
}
export function resolveGoogleMeetConfig(input: unknown): GoogleMeetConfig {
return resolveGoogleMeetConfigWithEnv(input);
}
export function resolveGoogleMeetConfigWithEnv(
input: unknown,
env: NodeJS.ProcessEnv = process.env,
): GoogleMeetConfig {
const raw = asRecord(input);
const defaults = asRecord(raw.defaults);
const preview = asRecord(raw.preview);
const chrome = asRecord(raw.chrome);
const twilio = asRecord(raw.twilio);
const voiceCall = asRecord(raw.voiceCall);
const realtime = asRecord(raw.realtime);
const oauth = asRecord(raw.oauth);
const auth = asRecord(raw.auth);
return {
enabled: resolveBoolean(raw.enabled, DEFAULT_GOOGLE_MEET_CONFIG.enabled),
defaults: {
meeting:
normalizeOptionalString(defaults.meeting) ??
readEnvString(env, GOOGLE_MEET_DEFAULT_MEETING_KEYS),
},
preview: {
enrollmentAcknowledged: resolveBoolean(
preview.enrollmentAcknowledged,
readEnvBoolean(env, GOOGLE_MEET_PREVIEW_ACK_KEYS) ??
DEFAULT_GOOGLE_MEET_CONFIG.preview.enrollmentAcknowledged,
),
},
defaultTransport: resolveTransport(
raw.defaultTransport,
DEFAULT_GOOGLE_MEET_CONFIG.defaultTransport,
),
defaultMode: resolveMode(raw.defaultMode, DEFAULT_GOOGLE_MEET_CONFIG.defaultMode),
chrome: {
audioBackend: "blackhole-2ch",
launch: resolveBoolean(chrome.launch, DEFAULT_GOOGLE_MEET_CONFIG.chrome.launch),
browserProfile: normalizeOptionalString(chrome.browserProfile),
joinTimeoutMs: resolveNumber(
chrome.joinTimeoutMs,
DEFAULT_GOOGLE_MEET_CONFIG.chrome.joinTimeoutMs,
),
audioInputCommand: resolveStringArray(chrome.audioInputCommand) ?? [
...DEFAULT_GOOGLE_MEET_AUDIO_INPUT_COMMAND,
],
audioOutputCommand: resolveStringArray(chrome.audioOutputCommand) ?? [
...DEFAULT_GOOGLE_MEET_AUDIO_OUTPUT_COMMAND,
],
audioBridgeCommand: resolveStringArray(chrome.audioBridgeCommand),
audioBridgeHealthCommand: resolveStringArray(chrome.audioBridgeHealthCommand),
},
twilio: {
defaultDialInNumber: normalizeOptionalString(twilio.defaultDialInNumber),
defaultPin: normalizeOptionalString(twilio.defaultPin),
defaultDtmfSequence: normalizeOptionalString(twilio.defaultDtmfSequence),
},
voiceCall: {
enabled: resolveBoolean(voiceCall.enabled, DEFAULT_GOOGLE_MEET_CONFIG.voiceCall.enabled),
gatewayUrl: normalizeOptionalString(voiceCall.gatewayUrl),
token: normalizeOptionalString(voiceCall.token),
requestTimeoutMs: resolveNumber(
voiceCall.requestTimeoutMs,
DEFAULT_GOOGLE_MEET_CONFIG.voiceCall.requestTimeoutMs,
),
dtmfDelayMs: resolveNumber(
voiceCall.dtmfDelayMs,
DEFAULT_GOOGLE_MEET_CONFIG.voiceCall.dtmfDelayMs,
),
introMessage: normalizeOptionalString(voiceCall.introMessage),
},
realtime: {
provider:
normalizeOptionalString(realtime.provider) ?? DEFAULT_GOOGLE_MEET_CONFIG.realtime.provider,
model: normalizeOptionalString(realtime.model) ?? DEFAULT_GOOGLE_MEET_CONFIG.realtime.model,
instructions:
normalizeOptionalString(realtime.instructions) ??
DEFAULT_GOOGLE_MEET_CONFIG.realtime.instructions,
toolPolicy: resolveToolPolicy(
realtime.toolPolicy,
DEFAULT_GOOGLE_MEET_CONFIG.realtime.toolPolicy,
),
providers: resolveProvidersConfig(realtime.providers),
},
oauth: {
clientId:
normalizeOptionalString(oauth.clientId) ??
normalizeOptionalString(auth.clientId) ??
readEnvString(env, GOOGLE_MEET_CLIENT_ID_KEYS),
clientSecret:
normalizeOptionalString(oauth.clientSecret) ??
normalizeOptionalString(auth.clientSecret) ??
readEnvString(env, GOOGLE_MEET_CLIENT_SECRET_KEYS),
refreshToken:
normalizeOptionalString(oauth.refreshToken) ??
readEnvString(env, GOOGLE_MEET_REFRESH_TOKEN_KEYS),
accessToken:
normalizeOptionalString(oauth.accessToken) ??
readEnvString(env, GOOGLE_MEET_ACCESS_TOKEN_KEYS),
expiresAt:
resolveOptionalNumber(oauth.expiresAt) ??
readEnvNumber(env, GOOGLE_MEET_ACCESS_TOKEN_EXPIRES_AT_KEYS),
},
auth: {
provider: "google-oauth",
clientId: normalizeOptionalString(auth.clientId),
clientSecret: normalizeOptionalString(auth.clientSecret),
tokenPath: normalizeOptionalString(auth.tokenPath),
},
};
}
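
The resolver above applies a fixed precedence for `preview.enrollmentAcknowledged`: an explicit boolean in config wins, then the environment, then the default of `false`. A standalone sketch of that rule follows. It takes a plain object instead of `process.env`, and `resolvePreviewAck` is a hypothetical name used only here.

```typescript
type Env = Record<string, string | undefined>;

const PREVIEW_ACK_KEYS = ["OPENCLAW_GOOGLE_MEET_PREVIEW_ACK", "GOOGLE_MEET_PREVIEW_ACK"] as const;

// Reads the first non-empty value among the keys and parses it as a boolean.
// An unrecognized value yields undefined; later keys are not consulted.
function readEnvBoolean(env: Env, keys: readonly string[]): boolean | undefined {
  for (const key of keys) {
    const value = env[key]?.trim().toLowerCase();
    if (!value) continue;
    if (["1", "true", "yes", "on"].includes(value)) return true;
    if (["0", "false", "no", "off"].includes(value)) return false;
    return undefined;
  }
  return undefined;
}

// Config boolean > environment > default (false).
function resolvePreviewAck(configValue: unknown, env: Env): boolean {
  const fallback = readEnvBoolean(env, PREVIEW_ACK_KEYS) ?? false;
  return typeof configValue === "boolean" ? configValue : fallback;
}

const ackFromEnv = resolvePreviewAck(undefined, { OPENCLAW_GOOGLE_MEET_PREVIEW_ACK: "yes" });
const ackConfigWins = resolvePreviewAck(false, { OPENCLAW_GOOGLE_MEET_PREVIEW_ACK: "yes" });
const ackDefault = resolvePreviewAck(undefined, {});
const ackStringIgnored = resolvePreviewAck("true", {});
```

Note the last case: a string `"true"` in config is not a boolean, so it is ignored and the environment or default applies.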


@@ -0,0 +1,112 @@
import { fetchWithSsrFGuard } from "openclaw/plugin-sdk/ssrf-runtime";
const GOOGLE_MEET_API_BASE_URL = "https://meet.googleapis.com/v2";
const GOOGLE_MEET_URL_HOST = "meet.google.com";
const GOOGLE_MEET_API_HOST = "meet.googleapis.com";
export type GoogleMeetSpace = {
name: string;
meetingCode?: string;
meetingUri?: string;
activeConference?: Record<string, unknown>;
config?: Record<string, unknown>;
};
export type GoogleMeetPreflightReport = {
input: string;
resolvedSpaceName: string;
meetingCode?: string;
meetingUri?: string;
hasActiveConference: boolean;
previewAcknowledged: boolean;
tokenSource: "cached-access-token" | "refresh-token";
blockers: string[];
};
export function normalizeGoogleMeetSpaceName(input: string): string {
const trimmed = input.trim();
if (!trimmed) {
throw new Error("Meeting input is required");
}
if (trimmed.startsWith("spaces/")) {
const suffix = trimmed.slice("spaces/".length).trim();
if (!suffix) {
throw new Error("spaces/ input must include a meeting code or space id");
}
return `spaces/${suffix}`;
}
if (/^https?:\/\//i.test(trimmed)) {
const url = new URL(trimmed);
if (url.hostname !== GOOGLE_MEET_URL_HOST) {
throw new Error(`Expected a ${GOOGLE_MEET_URL_HOST} URL, received ${url.hostname}`);
}
const firstSegment = url.pathname
.split("/")
.map((segment) => segment.trim())
.find(Boolean);
if (!firstSegment) {
throw new Error("Google Meet URL did not include a meeting code");
}
return `spaces/${firstSegment}`;
}
return `spaces/${trimmed}`;
}
function encodeSpaceNameForPath(name: string): string {
return name.split("/").map(encodeURIComponent).join("/");
}
export async function fetchGoogleMeetSpace(params: {
accessToken: string;
meeting: string;
}): Promise<GoogleMeetSpace> {
const name = normalizeGoogleMeetSpaceName(params.meeting);
const { response, release } = await fetchWithSsrFGuard({
url: `${GOOGLE_MEET_API_BASE_URL}/${encodeSpaceNameForPath(name)}`,
init: {
headers: {
Authorization: `Bearer ${params.accessToken}`,
Accept: "application/json",
},
},
policy: { allowedHostnames: [GOOGLE_MEET_API_HOST] },
auditContext: "google-meet.spaces.get",
});
try {
if (!response.ok) {
const detail = await response.text();
throw new Error(`Google Meet spaces.get failed (${response.status}): ${detail}`);
}
const payload = (await response.json()) as GoogleMeetSpace;
if (!payload.name?.trim()) {
throw new Error("Google Meet spaces.get response was missing name");
}
return payload;
} finally {
await release();
}
}
export function buildGoogleMeetPreflightReport(params: {
input: string;
space: GoogleMeetSpace;
previewAcknowledged: boolean;
tokenSource: "cached-access-token" | "refresh-token";
}): GoogleMeetPreflightReport {
const blockers: string[] = [];
if (!params.previewAcknowledged) {
blockers.push(
"Set preview.enrollmentAcknowledged=true after confirming your Cloud project, OAuth principal, and meeting participants are enrolled in the Google Workspace Developer Preview Program.",
);
}
return {
input: params.input,
resolvedSpaceName: params.space.name,
meetingCode: params.space.meetingCode,
meetingUri: params.space.meetingUri,
hasActiveConference: Boolean(params.space.activeConference),
previewAcknowledged: params.previewAcknowledged,
tokenSource: params.tokenSource,
blockers,
};
}
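
Review note: the input handling at the end of `normalizeGoogleMeetSpaceName` can be illustrated with a small standalone sketch. It mirrors only the branches visible in this hunk. The host value `meet.google.com` and the exact `spaces/` prefix test are assumptions, because the constant and the top of the function fall outside the excerpt; `toSpaceName` is a hypothetical name.

```typescript
// Assumption: the real GOOGLE_MEET_URL_HOST constant is not shown in this hunk.
const ASSUMED_MEET_HOST = "meet.google.com";

// Hypothetical stand-in for normalizeGoogleMeetSpaceName (visible branches only).
function toSpaceName(input: string): string {
  const trimmed = input.trim();
  if (trimmed.startsWith("spaces/")) {
    const suffix = trimmed.slice("spaces/".length).trim();
    if (!suffix) {
      throw new Error("spaces/ input must include a meeting code or space id");
    }
    return `spaces/${suffix}`;
  }
  if (/^https?:\/\//i.test(trimmed)) {
    const url = new URL(trimmed);
    if (url.hostname !== ASSUMED_MEET_HOST) {
      throw new Error(`Expected a ${ASSUMED_MEET_HOST} URL, received ${url.hostname}`);
    }
    // The first non-empty path segment is treated as the meeting code.
    const firstSegment = url.pathname
      .split("/")
      .map((segment) => segment.trim())
      .filter(Boolean)[0];
    if (!firstSegment) {
      throw new Error("Google Meet URL did not include a meeting code");
    }
    return `spaces/${firstSegment}`;
  }
  // Anything else is assumed to be a bare meeting code or space id.
  return `spaces/${trimmed}`;
}

// Encode each path segment separately so the "/" separator survives.
function encodeSpaceNameForPath(name: string): string {
  return name.split("/").map(encodeURIComponent).join("/");
}
```

Under these assumptions a full Meet link, a bare code, and an already-prefixed name all resolve to the same `spaces/...` resource name, and query strings such as `?authuser=0` are dropped because only `url.pathname` is inspected.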


@@ -0,0 +1,225 @@
import { generateHexPkceVerifierChallenge } from "openclaw/plugin-sdk/provider-auth";
import {
generateOAuthState,
parseOAuthCallbackInput,
waitForLocalOAuthCallback,
} from "openclaw/plugin-sdk/provider-auth-runtime";
import { fetchWithSsrFGuard } from "openclaw/plugin-sdk/ssrf-runtime";
export const GOOGLE_MEET_REDIRECT_URI = "http://localhost:8085/oauth2callback";
export const GOOGLE_MEET_AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth";
export const GOOGLE_MEET_TOKEN_URL = "https://oauth2.googleapis.com/token";
const GOOGLE_MEET_TOKEN_HOST = "oauth2.googleapis.com";
export const GOOGLE_MEET_SCOPES = [
"https://www.googleapis.com/auth/meetings.space.readonly",
"https://www.googleapis.com/auth/meetings.conference.media.readonly",
] as const;
export type GoogleMeetOAuthTokens = {
accessToken: string;
expiresAt: number;
refreshToken?: string;
scope?: string;
tokenType?: string;
};
export function buildGoogleMeetAuthUrl(params: {
clientId: string;
challenge: string;
state: string;
redirectUri?: string;
scopes?: readonly string[];
}): string {
const search = new URLSearchParams({
client_id: params.clientId,
response_type: "code",
redirect_uri: params.redirectUri ?? GOOGLE_MEET_REDIRECT_URI,
scope: (params.scopes ?? GOOGLE_MEET_SCOPES).join(" "),
code_challenge: params.challenge,
code_challenge_method: "S256",
access_type: "offline",
prompt: "consent",
state: params.state,
});
return `${GOOGLE_MEET_AUTH_URL}?${search.toString()}`;
}
async function executeGoogleTokenRequest(body: URLSearchParams): Promise<GoogleMeetOAuthTokens> {
const { response, release } = await fetchWithSsrFGuard({
url: GOOGLE_MEET_TOKEN_URL,
init: {
method: "POST",
headers: {
"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
Accept: "application/json",
},
body,
},
policy: { allowedHostnames: [GOOGLE_MEET_TOKEN_HOST] },
auditContext: "google-meet.oauth.token",
});
try {
if (!response.ok) {
const detail = await response.text();
throw new Error(`Google OAuth token request failed (${response.status}): ${detail}`);
}
const payload = (await response.json()) as {
access_token?: string;
expires_in?: number;
refresh_token?: string;
scope?: string;
token_type?: string;
};
const accessToken = payload.access_token?.trim();
if (!accessToken) {
throw new Error("Google OAuth token response was missing access_token");
}
const expiresInSeconds =
typeof payload.expires_in === "number" && Number.isFinite(payload.expires_in)
? payload.expires_in
: 3600;
return {
accessToken,
expiresAt: Date.now() + expiresInSeconds * 1000,
refreshToken: payload.refresh_token?.trim() || undefined,
scope: payload.scope?.trim() || undefined,
tokenType: payload.token_type?.trim() || undefined,
};
} finally {
await release();
}
}
function tokenRequestBody(values: Record<string, string | undefined>): URLSearchParams {
const body = new URLSearchParams();
for (const [key, value] of Object.entries(values)) {
if (value?.trim()) {
body.set(key, value);
}
}
return body;
}
export async function exchangeGoogleMeetAuthCode(params: {
clientId: string;
clientSecret?: string;
code: string;
verifier: string;
redirectUri?: string;
}): Promise<GoogleMeetOAuthTokens> {
return await executeGoogleTokenRequest(
tokenRequestBody({
client_id: params.clientId,
client_secret: params.clientSecret,
code: params.code,
grant_type: "authorization_code",
redirect_uri: params.redirectUri ?? GOOGLE_MEET_REDIRECT_URI,
code_verifier: params.verifier,
}),
);
}
export async function refreshGoogleMeetAccessToken(params: {
clientId: string;
clientSecret?: string;
refreshToken: string;
}): Promise<GoogleMeetOAuthTokens> {
return await executeGoogleTokenRequest(
tokenRequestBody({
client_id: params.clientId,
client_secret: params.clientSecret,
grant_type: "refresh_token",
refresh_token: params.refreshToken,
}),
);
}
export function shouldUseCachedGoogleMeetAccessToken(params: {
accessToken?: string;
expiresAt?: number;
now?: number;
safetyWindowMs?: number;
}): boolean {
const now = params.now ?? Date.now();
const safetyWindowMs = params.safetyWindowMs ?? 60_000;
return Boolean(
params.accessToken?.trim() &&
typeof params.expiresAt === "number" &&
Number.isFinite(params.expiresAt) &&
params.expiresAt > now + safetyWindowMs,
);
}
export async function resolveGoogleMeetAccessToken(params: {
clientId?: string;
clientSecret?: string;
refreshToken?: string;
accessToken?: string;
expiresAt?: number;
}): Promise<{ accessToken: string; expiresAt?: number; refreshed: boolean }> {
if (shouldUseCachedGoogleMeetAccessToken(params)) {
return {
accessToken: params.accessToken!.trim(),
expiresAt: params.expiresAt,
refreshed: false,
};
}
if (!params.clientId?.trim() || !params.refreshToken?.trim()) {
throw new Error(
"Missing Google Meet OAuth credentials. Configure oauth.clientId and oauth.refreshToken, or pass --client-id and --refresh-token.",
);
}
const refreshed = await refreshGoogleMeetAccessToken({
clientId: params.clientId,
clientSecret: params.clientSecret,
refreshToken: params.refreshToken,
});
return {
accessToken: refreshed.accessToken,
expiresAt: refreshed.expiresAt,
refreshed: true,
};
}
export function createGoogleMeetPkce() {
const { verifier, challenge } = generateHexPkceVerifierChallenge();
return { verifier, challenge };
}
export function createGoogleMeetOAuthState(): string {
return generateOAuthState();
}
export async function waitForGoogleMeetAuthCode(params: {
state: string;
manual: boolean;
timeoutMs: number;
authUrl: string;
promptInput: (message: string) => Promise<string>;
writeLine: (message: string) => void;
}): Promise<string> {
params.writeLine(`Open this URL in your browser:\n\n${params.authUrl}\n`);
if (params.manual) {
const input = await params.promptInput("Paste the full redirect URL here: ");
const parsed = parseOAuthCallbackInput(input, {
missingState: "Missing 'state' parameter. Paste the full redirect URL.",
invalidInput: "Paste the full redirect URL, not just the code.",
});
if ("error" in parsed) {
throw new Error(parsed.error);
}
if (parsed.state !== params.state) {
throw new Error("OAuth state mismatch - please try again");
}
return parsed.code;
}
const callback = await waitForLocalOAuthCallback({
expectedState: params.state,
timeoutMs: params.timeoutMs,
port: 8085,
callbackPath: "/oauth2callback",
redirectUri: GOOGLE_MEET_REDIRECT_URI,
successTitle: "Google Meet OAuth complete",
});
return callback.code;
}
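
Review note: the cache decision in `shouldUseCachedGoogleMeetAccessToken` is easiest to reason about with fixed numbers. The sketch below restates the same rule with `now` made mandatory so the outcome is deterministic; `isTokenFresh` and the sample token string are hypothetical.

```typescript
// Standalone restatement of the freshness rule, with `now` injected by the caller.
function isTokenFresh(params: {
  accessToken?: string;
  expiresAt?: number;
  now: number;
  safetyWindowMs?: number;
}): boolean {
  const safetyWindowMs = params.safetyWindowMs ?? 60_000;
  return Boolean(
    params.accessToken?.trim() &&
      typeof params.expiresAt === "number" &&
      Number.isFinite(params.expiresAt) &&
      params.expiresAt > params.now + safetyWindowMs,
  );
}

const sampleNow = 1_000_000;
const sampleToken = "example-access-token"; // placeholder, not a real credential

// Two minutes of lifetime left: outside the 60 s safety window, so reuse it.
const freshResult = isTokenFresh({ accessToken: sampleToken, expiresAt: sampleNow + 120_000, now: sampleNow });
// Thirty seconds left: inside the window, so the caller should refresh.
const nearExpiryResult = isTokenFresh({ accessToken: sampleToken, expiresAt: sampleNow + 30_000, now: sampleNow });
// Exactly on the boundary: the comparison is strict, so this also refreshes.
const boundaryResult = isTokenFresh({ accessToken: sampleToken, expiresAt: sampleNow + 60_000, now: sampleNow });
// No expiry recorded: treated as stale.
const noExpiryResult = isTokenFresh({ accessToken: sampleToken, now: sampleNow });
```

The strict `>` means a token that expires exactly at the edge of the safety window is refreshed rather than reused, which errs on the side of an extra token request.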


@@ -0,0 +1,215 @@
import { spawn } from "node:child_process";
import type { Writable } from "node:stream";
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import type { PluginRuntime, RuntimeLogger } from "openclaw/plugin-sdk/plugin-runtime";
import {
createRealtimeVoiceBridgeSession,
resolveConfiguredRealtimeVoiceProvider,
type RealtimeVoiceBridgeSession,
type RealtimeVoiceProviderConfig,
type RealtimeVoiceProviderPlugin,
} from "openclaw/plugin-sdk/realtime-voice";
import {
consultOpenClawAgentForGoogleMeet,
GOOGLE_MEET_AGENT_CONSULT_TOOL_NAME,
resolveGoogleMeetRealtimeTools,
} from "./agent-consult.js";
import type { GoogleMeetConfig } from "./config.js";
type BridgeProcess = {
pid?: number;
killed?: boolean;
stdin?: Writable | null;
stdout?: { on(event: "data", listener: (chunk: Buffer | string) => void): unknown } | null;
stderr?: { on(event: "data", listener: (chunk: Buffer | string) => void): unknown } | null;
kill(signal?: NodeJS.Signals): boolean;
on(
event: "exit",
listener: (code: number | null, signal: NodeJS.Signals | null) => void,
): unknown;
on(event: "error", listener: (error: Error) => void): unknown;
};
type SpawnFn = (
command: string,
args: string[],
options: { stdio: ["pipe" | "ignore", "pipe" | "ignore", "pipe" | "ignore"] },
) => BridgeProcess;
export type ChromeRealtimeAudioBridgeHandle = {
providerId: string;
inputCommand: string[];
outputCommand: string[];
stop: () => Promise<void>;
};
type ResolvedRealtimeProvider = {
provider: RealtimeVoiceProviderPlugin;
providerConfig: RealtimeVoiceProviderConfig;
};
function splitCommand(argv: string[]): { command: string; args: string[] } {
const [command, ...args] = argv;
if (!command) {
throw new Error("audio bridge command must not be empty");
}
return { command, args };
}
export function resolveGoogleMeetRealtimeProvider(params: {
config: GoogleMeetConfig;
fullConfig: OpenClawConfig;
providers?: RealtimeVoiceProviderPlugin[];
}): ResolvedRealtimeProvider {
return resolveConfiguredRealtimeVoiceProvider({
configuredProviderId: params.config.realtime.provider,
providerConfigs: params.config.realtime.providers,
cfg: params.fullConfig,
providers: params.providers,
defaultModel: params.config.realtime.model,
noRegisteredProviderMessage: "No configured realtime voice provider registered",
});
}
export async function startCommandRealtimeAudioBridge(params: {
config: GoogleMeetConfig;
fullConfig: OpenClawConfig;
runtime: PluginRuntime;
meetingSessionId: string;
inputCommand: string[];
outputCommand: string[];
logger: RuntimeLogger;
providers?: RealtimeVoiceProviderPlugin[];
spawn?: SpawnFn;
}): Promise<ChromeRealtimeAudioBridgeHandle> {
const input = splitCommand(params.inputCommand);
const output = splitCommand(params.outputCommand);
const spawnFn: SpawnFn =
params.spawn ??
((command, args, options) => spawn(command, args, options) as unknown as BridgeProcess);
const outputProcess = spawnFn(output.command, output.args, {
stdio: ["pipe", "ignore", "pipe"],
});
const inputProcess = spawnFn(input.command, input.args, {
stdio: ["ignore", "pipe", "pipe"],
});
let stopped = false;
let bridge: RealtimeVoiceBridgeSession | null = null;
const stop = async () => {
if (stopped) {
return;
}
stopped = true;
try {
bridge?.close();
} catch (error) {
params.logger.debug?.(
`[google-meet] realtime voice bridge close ignored: ${formatErrorMessage(error)}`,
);
}
inputProcess.kill("SIGTERM");
outputProcess.kill("SIGTERM");
};
const fail = (label: string) => (error: Error) => {
params.logger.warn(`[google-meet] ${label} failed: ${formatErrorMessage(error)}`);
void stop();
};
inputProcess.on("error", fail("audio input command"));
outputProcess.on("error", fail("audio output command"));
inputProcess.on("exit", (code, signal) => {
if (!stopped) {
params.logger.warn(`[google-meet] audio input command exited (${code ?? signal ?? "done"})`);
void stop();
}
});
outputProcess.on("exit", (code, signal) => {
if (!stopped) {
params.logger.warn(`[google-meet] audio output command exited (${code ?? signal ?? "done"})`);
void stop();
}
});
inputProcess.stderr?.on("data", (chunk) => {
params.logger.debug?.(`[google-meet] audio input: ${String(chunk).trim()}`);
});
outputProcess.stderr?.on("data", (chunk) => {
params.logger.debug?.(`[google-meet] audio output: ${String(chunk).trim()}`);
});
const resolved = resolveGoogleMeetRealtimeProvider({
config: params.config,
fullConfig: params.fullConfig,
providers: params.providers,
});
const transcript: Array<{ role: "user" | "assistant"; text: string }> = [];
bridge = createRealtimeVoiceBridgeSession({
provider: resolved.provider,
providerConfig: resolved.providerConfig,
instructions: params.config.realtime.instructions,
markStrategy: "ack-immediately",
tools: resolveGoogleMeetRealtimeTools(params.config.realtime.toolPolicy),
audioSink: {
isOpen: () => !stopped,
sendAudio: (muLaw) => {
outputProcess.stdin?.write(muLaw);
},
},
onTranscript: (role, text, isFinal) => {
if (isFinal) {
transcript.push({ role, text });
if (transcript.length > 40) {
transcript.splice(0, transcript.length - 40);
}
params.logger.debug?.(`[google-meet] ${role}: ${text}`);
}
},
onToolCall: (event, session) => {
if (event.name !== GOOGLE_MEET_AGENT_CONSULT_TOOL_NAME) {
session.submitToolResult(event.callId || event.itemId, {
error: `Tool "${event.name}" not available`,
});
return;
}
void consultOpenClawAgentForGoogleMeet({
config: params.config,
fullConfig: params.fullConfig,
runtime: params.runtime,
logger: params.logger,
meetingSessionId: params.meetingSessionId,
args: event.args,
transcript,
})
.then((result) => {
session.submitToolResult(event.callId || event.itemId, result);
})
.catch((error: Error) => {
session.submitToolResult(event.callId || event.itemId, {
error: formatErrorMessage(error),
});
});
},
onError: fail("realtime voice bridge"),
onClose: (reason) => {
if (reason === "error") {
void stop();
}
},
});
inputProcess.stdout?.on("data", (chunk) => {
const audio = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
if (!stopped && audio.byteLength > 0) {
bridge?.sendAudio(Buffer.from(audio));
}
});
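try {
await bridge.connect();
} catch (error) {
// Tear down the spawned audio commands if the provider connection fails.
await stop();
throw error;
}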
return {
providerId: resolved.provider.id,
inputCommand: params.inputCommand,
outputCommand: params.outputCommand,
stop,
};
}
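
Review note: the `onTranscript` handler keeps only the most recent 40 final turns. A minimal sketch of that bounded buffer, with the cap pulled out into a hypothetical `maxEntries` parameter and a hypothetical helper name, looks like this:

```typescript
type TranscriptEntry = { role: "user" | "assistant"; text: string };

// Append an entry, then drop the oldest entries once the cap is exceeded.
// The array is mutated in place so other holders of the reference see the update.
function pushBounded(
  transcript: TranscriptEntry[],
  entry: TranscriptEntry,
  maxEntries = 40,
): void {
  transcript.push(entry);
  if (transcript.length > maxEntries) {
    transcript.splice(0, transcript.length - maxEntries);
  }
}

const recentTurns: TranscriptEntry[] = [];
for (let i = 0; i < 45; i += 1) {
  pushBounded(recentTurns, {
    role: i % 2 === 0 ? "user" : "assistant",
    text: `turn ${i}`,
  });
}
// recentTurns now holds "turn 5" through "turn 44".
```

Mutating in place matters here because the same `transcript` array is handed to `consultOpenClawAgentForGoogleMeet`, so a consult started later sees the trimmed, current history.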


@@ -0,0 +1,202 @@
import { randomUUID } from "node:crypto";
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import type { PluginRuntime, RuntimeLogger } from "openclaw/plugin-sdk/plugin-runtime";
import { normalizeOptionalString } from "openclaw/plugin-sdk/text-runtime";
import type { GoogleMeetConfig, GoogleMeetMode, GoogleMeetTransport } from "./config.js";
import { getGoogleMeetSetupStatus } from "./setup.js";
import { launchChromeMeet } from "./transports/chrome.js";
import { buildMeetDtmfSequence, normalizeDialInNumber } from "./transports/twilio.js";
import type {
GoogleMeetJoinRequest,
GoogleMeetJoinResult,
GoogleMeetSession,
} from "./transports/types.js";
import { endMeetVoiceCallGatewayCall, joinMeetViaVoiceCallGateway } from "./voice-call-gateway.js";
function nowIso(): string {
return new Date().toISOString();
}
export function normalizeMeetUrl(input: unknown): string {
const raw = normalizeOptionalString(input);
if (!raw) {
throw new Error("url required");
}
let url: URL;
try {
url = new URL(raw);
} catch {
throw new Error("url must be a valid Google Meet URL");
}
if (url.protocol !== "https:" || url.hostname.toLowerCase() !== "meet.google.com") {
throw new Error("url must be an explicit https://meet.google.com/... URL");
}
if (!/^\/[a-z]{3}-[a-z]{4}-[a-z]{3}(?:$|[/?#])/i.test(url.pathname)) {
throw new Error("url must include a Google Meet meeting code");
}
return url.toString();
}
function resolveTransport(input: GoogleMeetTransport | undefined, config: GoogleMeetConfig) {
return input ?? config.defaultTransport;
}
function resolveMode(input: GoogleMeetMode | undefined, config: GoogleMeetConfig) {
return input ?? config.defaultMode;
}
export class GoogleMeetRuntime {
readonly #sessions = new Map<string, GoogleMeetSession>();
readonly #sessionStops = new Map<string, () => Promise<void>>();
constructor(
private readonly params: {
config: GoogleMeetConfig;
fullConfig: OpenClawConfig;
runtime: PluginRuntime;
logger: RuntimeLogger;
},
) {}
list(): GoogleMeetSession[] {
return [...this.#sessions.values()].toSorted((a, b) => a.createdAt.localeCompare(b.createdAt));
}
status(sessionId?: string): {
found: boolean;
session?: GoogleMeetSession;
sessions?: GoogleMeetSession[];
} {
if (!sessionId) {
return { found: true, sessions: this.list() };
}
const session = this.#sessions.get(sessionId);
return session ? { found: true, session } : { found: false };
}
setupStatus() {
return getGoogleMeetSetupStatus(this.params.config);
}
async join(request: GoogleMeetJoinRequest): Promise<GoogleMeetJoinResult> {
const url = normalizeMeetUrl(request.url);
const transport = resolveTransport(request.transport, this.params.config);
const mode = resolveMode(request.mode, this.params.config);
const createdAt = nowIso();
const session: GoogleMeetSession = {
id: `meet_${randomUUID()}`,
url,
transport,
mode,
state: "active",
createdAt,
updatedAt: createdAt,
participantIdentity:
transport === "chrome" ? "signed-in Google Chrome profile" : "Twilio phone participant",
realtime: {
enabled: mode === "realtime",
provider: this.params.config.realtime.provider,
model: this.params.config.realtime.model,
toolPolicy: this.params.config.realtime.toolPolicy,
},
notes: [],
};
try {
if (transport === "chrome") {
const result = await launchChromeMeet({
runtime: this.params.runtime,
config: this.params.config,
fullConfig: this.params.fullConfig,
meetingSessionId: session.id,
mode,
url,
logger: this.params.logger,
});
session.chrome = {
audioBackend: this.params.config.chrome.audioBackend,
launched: result.launched,
browserProfile: this.params.config.chrome.browserProfile,
audioBridge: result.audioBridge
? {
type: result.audioBridge.type,
provider:
result.audioBridge.type === "command-pair"
? result.audioBridge.providerId
: undefined,
}
: undefined,
};
if (result.audioBridge?.type === "command-pair") {
this.#sessionStops.set(session.id, result.audioBridge.stop);
}
session.notes.push(
result.audioBridge
? "Chrome transport joins as the signed-in Google profile and routes realtime audio through the configured bridge."
: "Chrome transport joins as the signed-in Google profile and expects BlackHole 2ch audio routing.",
);
} else {
const dialInNumber = normalizeDialInNumber(
request.dialInNumber ?? this.params.config.twilio.defaultDialInNumber,
);
if (!dialInNumber) {
throw new Error("dialInNumber required for twilio transport");
}
const dtmfSequence = buildMeetDtmfSequence({
pin: request.pin ?? this.params.config.twilio.defaultPin,
dtmfSequence: request.dtmfSequence ?? this.params.config.twilio.defaultDtmfSequence,
});
const voiceCallResult = this.params.config.voiceCall.enabled
? await joinMeetViaVoiceCallGateway({
config: this.params.config,
dialInNumber,
dtmfSequence,
})
: undefined;
session.twilio = {
dialInNumber,
pinProvided: Boolean(request.pin ?? this.params.config.twilio.defaultPin),
dtmfSequence,
voiceCallId: voiceCallResult?.callId,
dtmfSent: voiceCallResult?.dtmfSent,
};
if (voiceCallResult?.callId) {
this.#sessionStops.set(session.id, async () => {
await endMeetVoiceCallGatewayCall({
config: this.params.config,
callId: voiceCallResult.callId,
});
});
}
session.notes.push(
this.params.config.voiceCall.enabled
? "Twilio transport delegated the call to the voice-call plugin and sent configured DTMF."
: "Twilio transport is an explicit dial plan; voice-call delegation is disabled.",
);
}
} catch (err) {
this.params.logger.warn(`[google-meet] join failed: ${formatErrorMessage(err)}`);
throw err;
}
this.#sessions.set(session.id, session);
return { session };
}
async leave(sessionId: string): Promise<{ found: boolean; session?: GoogleMeetSession }> {
const session = this.#sessions.get(sessionId);
if (!session) {
return { found: false };
}
const stop = this.#sessionStops.get(sessionId);
if (stop) {
this.#sessionStops.delete(sessionId);
await stop();
}
session.state = "ended";
session.updatedAt = nowIso();
return { found: true, session };
}
}
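
Review note: `normalizeMeetUrl` is stricter than the space-name normalizer used for the preflight call, since it only accepts an explicit `https://meet.google.com/...` link with a code-shaped first path segment. The sketch below wraps the same checks in a hypothetical boolean helper to show which inputs pass.

```typescript
// Hypothetical boolean wrapper around the checks normalizeMeetUrl performs.
function isJoinableMeetUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch (e) {
    return false; // not an absolute URL at all
  }
  if (url.protocol !== "https:" || url.hostname.toLowerCase() !== "meet.google.com") {
    return false;
  }
  // Three letters, four letters, three letters, e.g. /abc-defg-hij
  return /^\/[a-z]{3}-[a-z]{4}-[a-z]{3}(?:$|[/?#])/i.test(url.pathname);
}

const sampleInputs = [
  "https://meet.google.com/abc-defg-hij", // accepted
  "https://meet.google.com/abc-defg-hij?authuser=1", // accepted, query is ignored by the check
  "http://meet.google.com/abc-defg-hij", // rejected: not https
  "https://meet.google.com/landing", // rejected: no meeting code
  "abc-defg-hij", // rejected: bare codes are not URLs
];
const verdicts = sampleInputs.map((input) => isJoinableMeetUrl(input));
```

Note that a bare meeting code is rejected on this path, so callers of `join` need to pass the full link.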


@@ -0,0 +1,86 @@
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import type { GoogleMeetConfig } from "./config.js";
type SetupCheck = {
id: string;
ok: boolean;
message: string;
};
function resolveUserPath(input: string): string {
if (input === "~") {
return os.homedir();
}
if (input.startsWith("~/")) {
return path.join(os.homedir(), input.slice(2));
}
return input;
}
export function getGoogleMeetSetupStatus(config: GoogleMeetConfig): {
ok: boolean;
checks: SetupCheck[];
} {
const checks: SetupCheck[] = [];
if (config.auth.tokenPath) {
const tokenPath = resolveUserPath(config.auth.tokenPath);
checks.push({
id: "google-oauth-token",
ok: fs.existsSync(tokenPath),
message: fs.existsSync(tokenPath)
? "Google OAuth token file found"
: `Google OAuth token file missing at ${config.auth.tokenPath}`,
});
} else {
checks.push({
id: "google-oauth-token",
ok: true,
message: "Google OAuth token path not configured; Chrome profile auth will be used",
});
}
if (config.chrome.browserProfile) {
const profilePath = path.join(
os.homedir(),
"Library",
"Application Support",
"Google",
"Chrome",
config.chrome.browserProfile,
);
checks.push({
id: "chrome-profile",
ok: fs.existsSync(profilePath),
message: fs.existsSync(profilePath)
? "Chrome profile found"
: `Chrome profile missing: ${config.chrome.browserProfile}`,
});
} else {
checks.push({
id: "chrome-profile",
ok: true,
message: "Chrome profile not pinned; default signed-in profile will be used",
});
}
checks.push({
id: "audio-bridge",
ok: Boolean(
config.chrome.audioBridgeCommand ||
(config.chrome.audioInputCommand && config.chrome.audioOutputCommand),
),
message: config.chrome.audioBridgeCommand
? "Chrome audio bridge command configured"
: config.chrome.audioInputCommand && config.chrome.audioOutputCommand
? "Chrome command-pair realtime audio bridge configured"
: "Chrome realtime audio bridge not configured",
});
return {
ok: checks.every((check) => check.ok),
checks,
};
}
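
Review note: the overall `ok` flag is the conjunction of every check, so an unconfigured audio bridge turns the whole status red even when the token and profile checks pass. A minimal sketch of that aggregation follows. The `ChromeAudioConfig` shape is an assumption inferred from how these fields are passed around elsewhere in this diff, and the command names are placeholders.

```typescript
type SetupCheck = { id: string; ok: boolean; message: string };

// Assumption: the command fields are argv arrays.
type ChromeAudioConfig = {
  audioBridgeCommand?: string[];
  audioInputCommand?: string[];
  audioOutputCommand?: string[];
};

// Same decision order as the diff: external bridge first, then the command pair.
function audioBridgeCheck(chrome: ChromeAudioConfig): SetupCheck {
  const hasPair = Boolean(chrome.audioInputCommand && chrome.audioOutputCommand);
  const hasExternal = Boolean(chrome.audioBridgeCommand);
  const message = hasExternal
    ? "Chrome audio bridge command configured"
    : hasPair
      ? "Chrome command-pair realtime audio bridge configured"
      : "Chrome realtime audio bridge not configured";
  return { id: "audio-bridge", ok: hasExternal || hasPair, message };
}

function summarize(checks: SetupCheck[]): { ok: boolean; checks: SetupCheck[] } {
  return { ok: checks.every((check) => check.ok), checks };
}

const profileCheck: SetupCheck = { id: "chrome-profile", ok: true, message: "Chrome profile found" };
const withoutBridge = summarize([profileCheck, audioBridgeCheck({})]);
const withCommandPair = summarize([
  profileCheck,
  audioBridgeCheck({
    audioInputCommand: ["example-capture-cmd"], // placeholder command
    audioOutputCommand: ["example-playback-cmd"], // placeholder command
  }),
]);
```

Because the status function does not look at the configured transport, a Twilio-only setup would presumably also report `ok: false` until an audio bridge is configured; whether that is intended is worth confirming.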


@@ -0,0 +1,148 @@
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import type { PluginRuntime, RuntimeLogger } from "openclaw/plugin-sdk/plugin-runtime";
import type { GoogleMeetConfig } from "../config.js";
import {
startCommandRealtimeAudioBridge,
type ChromeRealtimeAudioBridgeHandle,
} from "../realtime.js";
export function outputMentionsBlackHole2ch(output: string): boolean {
return /\bBlackHole\s+2ch\b/i.test(output);
}
export async function assertBlackHole2chAvailable(params: {
runtime: PluginRuntime;
timeoutMs: number;
}): Promise<void> {
if (process.platform !== "darwin") {
throw new Error("Chrome Meet transport with blackhole-2ch audio is currently macOS-only");
}
const result = await params.runtime.system.runCommandWithTimeout(
["system_profiler", "SPAudioDataType"],
{ timeoutMs: params.timeoutMs },
);
const output = `${result.stdout ?? ""}\n${result.stderr ?? ""}`;
if (result.code !== 0 || !outputMentionsBlackHole2ch(output)) {
const hint =
params.runtime.system.formatNativeDependencyHint?.({
packageName: "BlackHole 2ch",
downloadCommand: "brew install blackhole-2ch",
}) ?? "";
throw new Error(
[
"BlackHole 2ch audio device not found.",
"Install BlackHole 2ch and route Chrome input/output through the OpenClaw audio bridge.",
hint,
]
.filter(Boolean)
.join(" "),
);
}
}
export async function launchChromeMeet(params: {
runtime: PluginRuntime;
config: GoogleMeetConfig;
fullConfig: OpenClawConfig;
meetingSessionId: string;
mode: "realtime" | "transcribe";
url: string;
logger: RuntimeLogger;
}): Promise<{
launched: boolean;
audioBridge?:
| { type: "external-command" }
| ({ type: "command-pair" } & ChromeRealtimeAudioBridgeHandle);
}> {
await assertBlackHole2chAvailable({
runtime: params.runtime,
timeoutMs: Math.min(params.config.chrome.joinTimeoutMs, 10_000),
});
if (params.config.chrome.audioBridgeHealthCommand) {
const health = await params.runtime.system.runCommandWithTimeout(
params.config.chrome.audioBridgeHealthCommand,
{ timeoutMs: params.config.chrome.joinTimeoutMs },
);
if (health.code !== 0) {
throw new Error(
`Chrome audio bridge health check failed: ${health.stderr || health.stdout || health.code}`,
);
}
}
let audioBridge:
| { type: "external-command" }
| ({ type: "command-pair" } & ChromeRealtimeAudioBridgeHandle)
| undefined;
if (params.config.chrome.audioBridgeCommand) {
const bridge = await params.runtime.system.runCommandWithTimeout(
params.config.chrome.audioBridgeCommand,
{ timeoutMs: params.config.chrome.joinTimeoutMs },
);
if (bridge.code !== 0) {
throw new Error(
`failed to start Chrome audio bridge: ${bridge.stderr || bridge.stdout || bridge.code}`,
);
}
audioBridge = { type: "external-command" };
} else if (params.mode === "realtime") {
if (!params.config.chrome.audioInputCommand || !params.config.chrome.audioOutputCommand) {
throw new Error(
"Chrome realtime mode requires chrome.audioInputCommand and chrome.audioOutputCommand, or chrome.audioBridgeCommand for an external bridge.",
);
}
audioBridge = {
type: "command-pair",
...(await startCommandRealtimeAudioBridge({
config: params.config,
fullConfig: params.fullConfig,
runtime: params.runtime,
meetingSessionId: params.meetingSessionId,
inputCommand: params.config.chrome.audioInputCommand,
outputCommand: params.config.chrome.audioOutputCommand,
logger: params.logger,
})),
};
}
if (!params.config.chrome.launch) {
return { launched: false, audioBridge };
}
const argv = ["open", "-a", "Google Chrome"];
if (params.config.chrome.browserProfile) {
argv.push("--args", `--profile-directory=${params.config.chrome.browserProfile}`);
}
argv.push(params.url);
let commandPairBridgeStopped = false;
const stopCommandPairBridge = async () => {
if (commandPairBridgeStopped) {
return;
}
commandPairBridgeStopped = true;
if (audioBridge?.type === "command-pair") {
await audioBridge.stop();
}
};
try {
const result = await params.runtime.system.runCommandWithTimeout(argv, {
timeoutMs: params.config.chrome.joinTimeoutMs,
});
if (result.code === 0) {
return { launched: true, audioBridge };
}
await stopCommandPairBridge();
throw new Error(
`failed to launch Chrome for Meet: ${result.stderr || result.stdout || result.code}`,
);
} catch (error) {
await stopCommandPairBridge();
throw error;
}
}
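
Review note: the launch command assembled at the end of `launchChromeMeet` can be isolated as a pure function, which makes the two argv shapes easy to compare. `buildChromeOpenArgv` is a hypothetical helper and `Profile 1` is only a sample profile directory name.

```typescript
// Hypothetical pure helper that reproduces the argv construction in the diff.
function buildChromeOpenArgv(url: string, browserProfile?: string): string[] {
  const argv = ["open", "-a", "Google Chrome"];
  if (browserProfile) {
    argv.push("--args", `--profile-directory=${browserProfile}`);
  }
  argv.push(url);
  return argv;
}

const meetUrl = "https://meet.google.com/abc-defg-hij";
// Without a pinned profile the URL is a plain argument to `open`.
const defaultArgv = buildChromeOpenArgv(meetUrl);
// With a pinned profile the URL ends up after `--args`.
const pinnedArgv = buildChromeOpenArgv(meetUrl, "Profile 1");
```

In the pinned case the URL follows `--args`. As far as the macOS `open` man page describes it, everything after `--args` is handed to the application rather than interpreted by `open`, so it may be worth verifying that the meeting still opens when Chrome is already running.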


@@ -0,0 +1,46 @@
import { normalizeOptionalString } from "openclaw/plugin-sdk/text-runtime";
const DTMF_PATTERN = /^[0-9*#wWpP,]+$/;
export function normalizeDialInNumber(value: unknown): string | undefined {
const normalized = normalizeOptionalString(value);
if (!normalized) {
return undefined;
}
const compact = normalized.replace(/[()\s.-]/g, "");
if (!/^\+?[0-9]{5,20}$/.test(compact)) {
throw new Error("dialInNumber must be a phone number");
}
return compact;
}
export function normalizeDtmfSequence(value: unknown): string | undefined {
const normalized = normalizeOptionalString(value);
if (!normalized) {
return undefined;
}
const compact = normalized.replace(/\s+/g, "");
if (!DTMF_PATTERN.test(compact)) {
throw new Error("dtmfSequence may only contain digits, *, #, comma, w, p");
}
return compact;
}
export function buildMeetDtmfSequence(params: {
pin?: string;
dtmfSequence?: string;
}): string | undefined {
const explicit = normalizeDtmfSequence(params.dtmfSequence);
if (explicit) {
return explicit;
}
const pin = normalizeOptionalString(params.pin);
if (!pin) {
return undefined;
}
const compactPin = pin.replace(/\s+/g, "");
if (!/^[0-9]+#?$/.test(compactPin)) {
throw new Error("pin may only contain digits and an optional trailing #");
}
return compactPin.endsWith("#") ? compactPin : `${compactPin}#`;
}
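
Review note: a worked example of the dial-plan normalization may help when reading the Twilio path. The sketch below restates the two core rules without the SDK's `normalizeOptionalString` helper, so the function names are hypothetical and inputs are assumed to be non-empty strings. The phone number and PIN are made-up sample values.

```typescript
// Strip common phone formatting, then require an optional "+" and 5 to 20 digits.
function compactDialInNumber(value: string): string {
  const compact = value.trim().replace(/[()\s.-]/g, "");
  if (!/^\+?[0-9]{5,20}$/.test(compact)) {
    throw new Error("dialInNumber must be a phone number");
  }
  return compact;
}

// Turn a PIN into a DTMF sequence, appending "#" when it is missing.
function pinToDtmf(pin: string): string {
  const compactPin = pin.replace(/\s+/g, "");
  if (!/^[0-9]+#?$/.test(compactPin)) {
    throw new Error("pin may only contain digits and an optional trailing #");
  }
  return compactPin.endsWith("#") ? compactPin : `${compactPin}#`;
}

const sampleDialIn = compactDialInNumber("+1 (555) 010-0000");
const sampleDtmf = pinToDtmf("123 456 789");
const alreadyTerminated = pinToDtmf("123456789#");
```

In the diff, an explicit `dtmfSequence` takes precedence over the PIN, so the PIN rule above only applies when no explicit sequence is supplied.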


@@ -0,0 +1,50 @@
import type { GoogleMeetMode, GoogleMeetTransport } from "../config.js";
export type GoogleMeetSessionState = "active" | "ended";
export type GoogleMeetJoinRequest = {
url: string;
transport?: GoogleMeetTransport;
mode?: GoogleMeetMode;
dialInNumber?: string;
pin?: string;
dtmfSequence?: string;
};
export type GoogleMeetSession = {
id: string;
url: string;
transport: GoogleMeetTransport;
mode: GoogleMeetMode;
state: GoogleMeetSessionState;
createdAt: string;
updatedAt: string;
participantIdentity: string;
realtime: {
enabled: boolean;
provider?: string;
model?: string;
toolPolicy: string;
};
chrome?: {
audioBackend: "blackhole-2ch";
launched: boolean;
browserProfile?: string;
audioBridge?: {
type: "command-pair" | "external-command";
provider?: string;
};
};
twilio?: {
dialInNumber: string;
pinProvided: boolean;
dtmfSequence?: string;
voiceCallId?: string;
dtmfSent?: boolean;
};
notes: string[];
};
export type GoogleMeetJoinResult = {
session: GoogleMeetSession;
};


@@ -0,0 +1,104 @@
import { setTimeout as sleep } from "node:timers/promises";
import { GatewayClient } from "openclaw/plugin-sdk/gateway-runtime";
import type { GoogleMeetConfig } from "./config.js";
type VoiceCallGatewayClient = InstanceType<typeof GatewayClient>;
type VoiceCallStartResult = {
callId?: string;
initiated?: boolean;
error?: string;
};
export type VoiceCallMeetJoinResult = {
callId: string;
dtmfSent: boolean;
};
async function createConnectedGatewayClient(
config: GoogleMeetConfig,
): Promise<VoiceCallGatewayClient> {
let client: VoiceCallGatewayClient;
await new Promise<void>((resolve, reject) => {
const timer = setTimeout(
() => reject(new Error("gateway connect timeout")),
config.voiceCall.requestTimeoutMs,
);
client = new GatewayClient({
url: config.voiceCall.gatewayUrl,
token: config.voiceCall.token,
requestTimeoutMs: config.voiceCall.requestTimeoutMs,
clientName: "cli",
clientDisplayName: "Google Meet plugin",
scopes: ["operator.write"],
onHelloOk: () => {
clearTimeout(timer);
resolve();
},
onConnectError: (err) => {
clearTimeout(timer);
reject(err);
},
});
client.start();
});
return client!;
}
export async function joinMeetViaVoiceCallGateway(params: {
config: GoogleMeetConfig;
dialInNumber: string;
dtmfSequence?: string;
}): Promise<VoiceCallMeetJoinResult> {
let client: VoiceCallGatewayClient | undefined;
try {
client = await createConnectedGatewayClient(params.config);
const start = (await client.request(
"voicecall.start",
{
to: params.dialInNumber,
message: params.config.voiceCall.introMessage,
mode: "conversation",
},
{ timeoutMs: params.config.voiceCall.requestTimeoutMs },
)) as VoiceCallStartResult;
if (!start.callId) {
throw new Error(start.error || "voicecall.start did not return callId");
}
if (params.dtmfSequence) {
await sleep(params.config.voiceCall.dtmfDelayMs);
await client.request(
"voicecall.dtmf",
{
callId: start.callId,
digits: params.dtmfSequence,
},
{ timeoutMs: params.config.voiceCall.requestTimeoutMs },
);
}
return { callId: start.callId, dtmfSent: Boolean(params.dtmfSequence) };
} finally {
await client?.stopAndWait({ timeoutMs: 1_000 });
}
}
export async function endMeetVoiceCallGatewayCall(params: {
config: GoogleMeetConfig;
callId: string;
}): Promise<void> {
let client: VoiceCallGatewayClient | undefined;
try {
client = await createConnectedGatewayClient(params.config);
await client.request(
"voicecall.end",
{
callId: params.callId,
},
{ timeoutMs: params.config.voiceCall.requestTimeoutMs },
);
} finally {
await client?.stopAndWait({ timeoutMs: 1_000 });
}
}
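
Review note: `createConnectedGatewayClient` races the connection against a timer. In the version above, the timeout branch rejects without stopping the client that was already started. The sketch below shows one possible variation that also stops the client on timeout. It is written against a hypothetical `Connectable` interface, not the real `GatewayClient` API, whose stop semantics are not visible in this diff.

```typescript
// Hypothetical minimal client surface; the real GatewayClient may differ.
type Connectable = { start(): void; stop(): void };

type ConnectHandlers = {
  onReady: () => void;
  onError: (err: Error) => void;
};

// Resolve with the client once it reports ready, or reject (and stop it) on timeout.
// Assumes onReady fires after create() has returned, as with a network handshake.
function connectWithTimeout<T extends Connectable>(
  create: (handlers: ConnectHandlers) => T,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    let client: T | undefined;
    const timer = setTimeout(() => {
      client?.stop(); // avoid leaving a half-open connection behind
      reject(new Error("connect timeout"));
    }, timeoutMs);
    client = create({
      onReady: () => {
        clearTimeout(timer);
        resolve(client as T);
      },
      onError: (err) => {
        clearTimeout(timer);
        reject(err);
      },
    });
    client.start();
  });
}
```

A caller would pass a factory that wires `onReady` and `onError` to the client's own callbacks (`onHelloOk` and `onConnectError` in the diff) and return the constructed client.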


@@ -0,0 +1,16 @@
{
"extends": "../tsconfig.package-boundary.base.json",
"compilerOptions": {
"rootDir": "."
},
"include": ["./*.ts", "./src/**/*.ts"],
"exclude": [
"./**/*.test.ts",
"./dist/**",
"./node_modules/**",
"./src/test-support/**",
"./src/**/*test-helpers.ts",
"./src/**/*test-harness.ts",
"./src/**/*test-support.ts"
]
}


@@ -1,6 +1,9 @@
{
"id": "memory-core",
"kind": "memory",
"contracts": {
"memoryEmbeddingProviders": ["local"]
},
"commandAliases": [
{
"name": "dreaming",


@@ -1,10 +1,28 @@
import fs from "node:fs";
import type { Context, Model } from "@mariozechner/pi-ai";
import { describe, expect, it } from "vitest";
import { registerSingleProviderPlugin } from "../../test/helpers/plugins/plugin-registration.js";
import { createCapturedThinkingConfigStream } from "../../test/helpers/plugins/stream-hooks.js";
import plugin from "./index.js";
import { createKimiWebSearchProvider } from "./src/kimi-web-search-provider.js";
type MoonshotManifest = {
providerAuthEnvVars?: Record<string, string[]>;
};
function readManifest(): MoonshotManifest {
return JSON.parse(
fs.readFileSync(new URL("./openclaw.plugin.json", import.meta.url), "utf8"),
) as MoonshotManifest;
}
describe("moonshot provider plugin", () => {
it("mirrors Kimi web-search env credentials in manifest metadata", () => {
const manifestEnvVars = readManifest().providerAuthEnvVars?.moonshot ?? [];
expect(manifestEnvVars).toEqual(expect.arrayContaining(createKimiWebSearchProvider().envVars));
});
it("owns replay policy for OpenAI-compatible Moonshot transports without mangling native Kimi tool_call IDs", async () => {
const provider = await registerSingleProviderPlugin(plugin);


@@ -3,7 +3,7 @@
"enabledByDefault": true,
"providers": ["moonshot"],
"providerAuthEnvVars": {
"moonshot": ["MOONSHOT_API_KEY"]
"moonshot": ["MOONSHOT_API_KEY", "KIMI_API_KEY"]
},
"providerAuthChoices": [
{

View File

@@ -0,0 +1,41 @@
import fs from "node:fs";
import { describe, expect, it } from "vitest";
import { resolveProviderPluginChoice } from "../../src/plugins/provider-auth-choice.runtime.js";
import { registerSingleProviderPlugin } from "../../test/helpers/plugins/plugin-registration.js";
import plugin from "./index.js";
type NvidiaManifest = {
providerAuthChoices?: Array<{ choiceId?: string; method?: string; provider?: string }>;
};
function readManifest(): NvidiaManifest {
return JSON.parse(
fs.readFileSync(new URL("./openclaw.plugin.json", import.meta.url), "utf8"),
) as NvidiaManifest;
}
describe("nvidia provider plugin", () => {
it("registers API-key auth metadata", async () => {
const provider = await registerSingleProviderPlugin(plugin);
expect(provider.id).toBe("nvidia");
expect(provider.envVars).toEqual(["NVIDIA_API_KEY"]);
expect(provider.auth?.map((method) => method.id)).toEqual(["api-key"]);
const choice = resolveProviderPluginChoice({
providers: [provider],
choice: "nvidia-api-key",
});
expect(choice?.provider.id).toBe("nvidia");
expect(choice?.method.id).toBe("api-key");
expect(readManifest().providerAuthChoices).toEqual(
expect.arrayContaining([
expect.objectContaining({
provider: "nvidia",
method: "api-key",
choiceId: "nvidia-api-key",
}),
]),
);
});
});

View File

@@ -11,7 +11,24 @@ export default defineSingleProviderPluginEntry({
label: "NVIDIA",
docsPath: "/providers/nvidia",
envVars: ["NVIDIA_API_KEY"],
auth: [],
auth: [
{
methodId: "api-key",
label: "NVIDIA API key",
hint: "API key",
optionKey: "nvidiaApiKey",
flagName: "--nvidia-api-key",
envVar: "NVIDIA_API_KEY",
promptMessage: "Enter NVIDIA API key",
wizard: {
choiceId: "nvidia-api-key",
choiceLabel: "NVIDIA API key",
groupId: "nvidia",
groupLabel: "NVIDIA",
groupHint: "API key",
},
},
],
catalog: {
buildProvider: buildNvidiaProvider,
},

View File

@@ -5,6 +5,21 @@
"providerAuthEnvVars": {
"nvidia": ["NVIDIA_API_KEY"]
},
"providerAuthChoices": [
{
"provider": "nvidia",
"method": "api-key",
"choiceId": "nvidia-api-key",
"choiceLabel": "NVIDIA API key",
"groupId": "nvidia",
"groupLabel": "NVIDIA",
"groupHint": "API key",
"optionKey": "nvidiaApiKey",
"cliFlag": "--nvidia-api-key",
"cliOption": "--nvidia-api-key <key>",
"cliDescription": "NVIDIA API key"
}
],
"configSchema": {
"type": "object",
"additionalProperties": false,

View File

@@ -447,6 +447,7 @@ describe("buildOpenAIProvider", () => {
provider: "openai",
id: "gpt-5.4",
baseUrl: "https://api.openai.com/v1",
contextWindow: 200_000,
} as Model<"openai-responses">,
payload: {
reasoning: { effort: "none" },
@@ -457,6 +458,10 @@ describe("buildOpenAIProvider", () => {
transport: "auto",
openaiWsWarmup: true,
});
expect(result.payload.store).toBe(true);
expect(result.payload.context_management).toEqual([
{ type: "compaction", compact_threshold: 140_000 },
]);
expect(result.payload.service_tier).toBe("priority");
expect(result.payload.text).toEqual({ verbosity: "low" });
expect(result.payload.reasoning).toEqual({ effort: "none" });

View File

@@ -6,6 +6,8 @@ import {
} from "openclaw/plugin-sdk/proxy-capture";
import type {
RealtimeVoiceBridge,
RealtimeVoiceBrowserSession,
RealtimeVoiceBrowserSessionCreateRequest,
RealtimeVoiceBridgeCreateRequest,
RealtimeVoiceProviderConfig,
RealtimeVoiceProviderPlugin,
@@ -59,6 +61,8 @@ type OpenAIRealtimeVoiceBridgeConfig = RealtimeVoiceBridgeCreateRequest & {
azureApiVersion?: string;
};
const OPENAI_REALTIME_DEFAULT_MODEL = "gpt-realtime-1.5";
type RealtimeEvent = {
type: string;
delta?: string;
@@ -117,7 +121,7 @@ function base64ToBuffer(b64: string): Buffer {
}
class OpenAIRealtimeVoiceBridge implements RealtimeVoiceBridge {
private static readonly DEFAULT_MODEL = "gpt-realtime-1.5";
private static readonly DEFAULT_MODEL = OPENAI_REALTIME_DEFAULT_MODEL;
private static readonly MAX_RECONNECT_ATTEMPTS = 5;
private static readonly BASE_RECONNECT_DELAY_MS = 1000;
private static readonly CONNECT_TIMEOUT_MS = 10_000;
@@ -579,6 +583,77 @@ class OpenAIRealtimeVoiceBridge implements RealtimeVoiceBridge {
}
}
function readStringField(value: unknown, key: string): string | undefined {
if (!value || typeof value !== "object") {
return undefined;
}
const raw = (value as Record<string, unknown>)[key];
return typeof raw === "string" && raw.trim() ? raw.trim() : undefined;
}
async function createOpenAIRealtimeBrowserSession(
req: RealtimeVoiceBrowserSessionCreateRequest,
): Promise<RealtimeVoiceBrowserSession> {
const config = normalizeProviderConfig(req.providerConfig);
const apiKey = config.apiKey || process.env.OPENAI_API_KEY;
if (!apiKey) {
throw new Error("OpenAI API key missing");
}
if (config.azureEndpoint || config.azureDeployment) {
throw new Error("OpenAI Realtime browser sessions do not support Azure endpoints yet");
}
const model = req.model ?? config.model ?? OPENAI_REALTIME_DEFAULT_MODEL;
const voice = (req.voice ?? config.voice ?? "alloy") as OpenAIRealtimeVoice;
const session: Record<string, unknown> = {
type: "realtime",
model,
instructions: req.instructions,
audio: {
output: { voice },
},
};
if (req.tools && req.tools.length > 0) {
session.tools = req.tools;
session.tool_choice = "auto";
}
const response = await fetch("https://api.openai.com/v1/realtime/client_secrets", {
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify({ session }),
});
if (!response.ok) {
const detail = await response.text().catch(() => "");
throw new Error(
`OpenAI Realtime browser session failed (${response.status}): ${detail || response.statusText}`,
);
}
const payload = (await response.json()) as unknown;
const nestedSecret =
payload && typeof payload === "object"
? (payload as Record<string, unknown>).client_secret
: undefined;
const clientSecret = readStringField(payload, "value") ?? readStringField(nestedSecret, "value");
if (!clientSecret) {
throw new Error("OpenAI Realtime browser session did not return a client secret");
}
const expiresAt =
payload && typeof payload === "object"
? (payload as Record<string, unknown>).expires_at
: undefined;
return {
provider: "openai",
clientSecret,
model,
voice,
...(typeof expiresAt === "number" ? { expiresAt } : {}),
};
}
export function buildOpenAIRealtimeVoiceProvider(): RealtimeVoiceProviderPlugin {
return {
id: "openai",
@@ -607,6 +682,7 @@ export function buildOpenAIRealtimeVoiceProvider(): RealtimeVoiceProviderPlugin
azureApiVersion: config.azureApiVersion,
});
},
createBrowserSession: createOpenAIRealtimeBrowserSession,
};
}
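
Reviewer sketch (not part of the diff): the new `createOpenAIRealtimeBrowserSession` accepts a client secret either as a top-level `value` field or nested under `client_secret.value`. The snippet below isolates that lookup so both shapes can be checked without a network call. `extractClientSecret` is a hypothetical name introduced here for illustration; `readStringField` mirrors the helper added in the hunk above.

```typescript
// Mirrors the readStringField helper from the diff: returns a trimmed,
// non-empty string field, or undefined for anything else.
function readStringField(value: unknown, key: string): string | undefined {
  if (!value || typeof value !== "object") {
    return undefined;
  }
  const raw = (value as Record<string, unknown>)[key];
  return typeof raw === "string" && raw.trim() ? raw.trim() : undefined;
}

// Hypothetical wrapper (not in the diff): prefer a top-level `value`,
// then fall back to `client_secret.value`.
function extractClientSecret(payload: unknown): string | undefined {
  const nested =
    payload && typeof payload === "object"
      ? (payload as Record<string, unknown>).client_secret
      : undefined;
  return readStringField(payload, "value") ?? readStringField(nested, "value");
}
```

A whitespace-only `value` is treated as missing, so in the diff's function such a payload would reach the "did not return a client secret" error rather than hand an empty secret to the browser.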

View File

@@ -125,7 +125,7 @@ describe("runQaCharacterEval", () => {
expect.objectContaining({
judgeModel: "openai/gpt-5.4",
judgeThinkingDefault: "xhigh",
judgeFastMode: false,
judgeFastMode: true,
timeoutMs: 300_000,
}),
);
@@ -223,7 +223,7 @@ describe("runQaCharacterEval", () => {
expect(runSuite).toHaveBeenCalledTimes(8);
expect(runSuite.mock.calls.map(([params]) => params.primaryModel)).toEqual([
"openai/gpt-5.5",
"openai/gpt-5.4",
"openai/gpt-5.2",
"openai/gpt-5",
"anthropic/claude-opus-4-6",
@@ -233,7 +233,7 @@ describe("runQaCharacterEval", () => {
"google/gemini-3.1-pro-preview",
]);
expect(runSuite.mock.calls.map(([params]) => params.thinkingDefault)).toEqual([
"xhigh",
"medium",
"xhigh",
"xhigh",
"high",
@@ -254,14 +254,14 @@ describe("runQaCharacterEval", () => {
]);
expect(runJudge).toHaveBeenCalledTimes(2);
expect(runJudge.mock.calls.map(([params]) => params.judgeModel)).toEqual([
"openai/gpt-5.5",
"openai/gpt-5.4",
"anthropic/claude-opus-4-6",
]);
expect(runJudge.mock.calls.map(([params]) => params.judgeThinkingDefault)).toEqual([
"xhigh",
"high",
]);
expect(runJudge.mock.calls.map(([params]) => params.judgeFastMode)).toEqual([false, false]);
expect(runJudge.mock.calls.map(([params]) => params.judgeFastMode)).toEqual([true, false]);
});
it("runs candidate models with bounded concurrency while preserving result order", async () => {

View File

@@ -189,6 +189,7 @@ describe("qa cli runtime", () => {
primaryModel: "openai/gpt-5.4",
alternateModel: "anthropic/claude-sonnet-4-6",
fastMode: true,
thinking: "medium",
scenarioIds: ["approval-turn-tool-followthrough"],
});
@@ -200,6 +201,7 @@ describe("qa cli runtime", () => {
primaryModel: "openai/gpt-5.4",
alternateModel: "anthropic/claude-sonnet-4-6",
fastMode: true,
thinkingDefault: "medium",
scenarioIds: ["approval-turn-tool-followthrough"],
});
});
@@ -1135,8 +1137,8 @@ describe("qa cli runtime", () => {
repoRoot: path.resolve("/tmp/openclaw-repo"),
transportId: "qa-channel",
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
fastMode: undefined,
message: "read qa kickoff and reply short",
timeoutMs: undefined,
@@ -1166,7 +1168,7 @@ describe("qa cli runtime", () => {
it("defaults manual frontier runs onto Codex OAuth when the runtime resolver prefers it", async () => {
defaultQaRuntimeModelForMode.mockImplementation((mode, options) =>
mode === "live-frontier"
? "openai/gpt-5.5"
? "openai/gpt-5.4"
: defaultQaProviderModelForMode(mode as QaProviderModeInput, options),
);
@@ -1179,8 +1181,8 @@ describe("qa cli runtime", () => {
repoRoot: path.resolve("/tmp/openclaw-repo"),
transportId: "qa-channel",
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
fastMode: undefined,
message: "read qa kickoff and reply short",
timeoutMs: undefined,

View File

@@ -450,6 +450,7 @@ export async function runQaSuiteCommand(opts: {
primaryModel?: string;
alternateModel?: string;
fastMode?: boolean;
thinking?: string;
cliAuthMode?: string;
parityPack?: string;
scenarioIds?: string[];
@@ -490,6 +491,7 @@ export async function runQaSuiteCommand(opts: {
throw new Error("--cli-auth-mode requires --runner host.");
}
if (runner === "multipass") {
const thinkingDefault = parseQaThinkingLevel("--thinking", opts.thinking);
const result = await runQaMultipass({
repoRoot,
outputDir: resolveRepoRelativeOutputDir(repoRoot, opts.outputDir),
@@ -498,6 +500,7 @@ export async function runQaSuiteCommand(opts: {
primaryModel: opts.primaryModel,
alternateModel: opts.alternateModel,
fastMode: opts.fastMode,
...(thinkingDefault ? { thinkingDefault } : {}),
allowFailures: true,
scenarioIds,
...(opts.concurrency !== undefined
@@ -532,6 +535,7 @@ export async function runQaSuiteCommand(opts: {
});
return;
}
const thinkingDefault = parseQaThinkingLevel("--thinking", opts.thinking);
const result = await runQaSuiteFromRuntimeWithInfraRetry({
repoRoot,
outputDir: resolveRepoRelativeOutputDir(repoRoot, opts.outputDir),
@@ -540,6 +544,7 @@ export async function runQaSuiteCommand(opts: {
primaryModel: opts.primaryModel,
alternateModel: opts.alternateModel,
fastMode: opts.fastMode,
...(thinkingDefault ? { thinkingDefault } : {}),
...(claudeCliAuthMode ? { claudeCliAuthMode } : {}),
scenarioIds,
...(opts.concurrency !== undefined

View File

@@ -35,6 +35,7 @@ async function runQaSuite(opts: {
primaryModel?: string;
alternateModel?: string;
fastMode?: boolean;
thinking?: string;
allowFailures?: boolean;
cliAuthMode?: string;
parityPack?: string;
@@ -247,6 +248,10 @@ export function registerQaLabCli(program: Command) {
false,
)
.option("--fast", "Enable provider fast mode where supported", false)
.option(
"--thinking <level>",
"Suite thinking default: off|minimal|low|medium|high|xhigh|adaptive|max",
)
.option("--image <alias>", "Multipass image alias")
.option("--cpus <count>", "Multipass vCPU count", (value: string) => Number(value))
.option("--memory <size>", "Multipass memory size")
@@ -266,6 +271,7 @@ export function registerQaLabCli(program: Command) {
concurrency?: number;
allowFailures?: boolean;
fast?: boolean;
thinking?: string;
image?: string;
cpus?: number;
memory?: string;
@@ -281,6 +287,7 @@ export function registerQaLabCli(program: Command) {
primaryModel: opts.model,
alternateModel: opts.altModel,
fastMode: opts.fast,
thinking: opts.thinking,
cliAuthMode: opts.cliAuthMode,
parityPack: opts.parityPack,
scenarioIds: opts.scenario,
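
Reviewer sketch (not part of the diff): the new `--thinking <level>` option is passed through as a raw string and validated later by `parseQaThinkingLevel`, whose body is not shown in this diff. The snippet below is one plausible shape for that validation, assuming the accepted values are exactly those listed in the option help text. The name `parseThinkingLevelSketch`, the trimming and lowercasing, and the error wording are all assumptions.

```typescript
// Levels copied from the --thinking help text in the diff.
const QA_THINKING_LEVELS = [
  "off",
  "minimal",
  "low",
  "medium",
  "high",
  "xhigh",
  "adaptive",
  "max",
] as const;

type QaThinkingLevelSketch = (typeof QA_THINKING_LEVELS)[number];

// Hypothetical stand-in for parseQaThinkingLevel: returns undefined when the
// flag is absent or blank, throws on an unknown level.
function parseThinkingLevelSketch(
  flag: string,
  raw: string | undefined,
): QaThinkingLevelSketch | undefined {
  const value = raw?.trim().toLowerCase();
  if (!value) {
    return undefined;
  }
  if ((QA_THINKING_LEVELS as readonly string[]).includes(value)) {
    return value as QaThinkingLevelSketch;
  }
  throw new Error(`${flag} must be one of ${QA_THINKING_LEVELS.join("|")}; got "${raw}"`);
}
```

Returning `undefined` for a missing flag fits the call sites in `runQaSuiteCommand`, which only spread `thinkingDefault` into the run parameters when it is truthy.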

View File

@@ -2,7 +2,7 @@ import { describe, expect, it } from "vitest";
import { selectQaRunnerModelOptions } from "./model-catalog.runtime.js";
describe("qa runner model catalog", () => {
it("filters to available rows and prefers gpt-5.5 first", () => {
it("filters to available rows and prefers gpt-5.4 first", () => {
expect(
selectQaRunnerModelOptions([
{
@@ -13,8 +13,8 @@ describe("qa runner model catalog", () => {
missing: false,
},
{
key: "openai/gpt-5.5",
name: "gpt-5.5",
key: "openai/gpt-5.4",
name: "gpt-5.4",
input: "text,image",
available: true,
missing: false,
@@ -27,6 +27,6 @@ describe("qa runner model catalog", () => {
missing: false,
},
]).map((entry) => entry.key),
).toEqual(["openai/gpt-5.5", "anthropic/claude-sonnet-4-6"]);
).toEqual(["openai/gpt-5.4", "anthropic/claude-sonnet-4-6"]);
});
});

View File

@@ -34,7 +34,7 @@ describe("qa model selection runtime", () => {
resolveEnvApiKey.mockReturnValue({ apiKey: "sk-test" });
expect(resolveQaPreferredLiveModel()).toBeUndefined();
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.5");
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.4");
expect(loadAuthProfileStoreForRuntime).not.toHaveBeenCalled();
});
@@ -43,8 +43,8 @@ describe("qa model selection runtime", () => {
provider === "openai-codex" ? ["openai-codex:user@example.com"] : [],
);
expect(resolveQaPreferredLiveModel()).toBe("openai/gpt-5.5");
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.5");
expect(resolveQaPreferredLiveModel()).toBe("openai/gpt-5.4");
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.4");
});
it("keeps the OpenAI live default when stored OpenAI profiles are available", () => {
@@ -53,7 +53,7 @@ describe("qa model selection runtime", () => {
);
expect(resolveQaPreferredLiveModel()).toBeUndefined();
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.5");
expect(defaultQaRuntimeModelForMode("live-frontier")).toBe("openai/gpt-5.4");
});
it("leaves mock defaults unchanged", () => {

View File

@@ -71,6 +71,7 @@ export type QaMultipassPlan = {
primaryModel?: string;
alternateModel?: string;
fastMode?: boolean;
thinkingDefault?: string;
scenarioIds: string[];
forwardedEnv: Record<string, string>;
hostCodexHomePath?: string;
@@ -237,6 +238,7 @@ export function createQaMultipassPlan(params: {
primaryModel?: string;
alternateModel?: string;
fastMode?: boolean;
thinkingDefault?: string;
allowFailures?: boolean;
scenarioIds?: string[];
concurrency?: number;
@@ -276,6 +278,7 @@ export function createQaMultipassPlan(params: {
...(params.primaryModel ? ["--model", params.primaryModel] : []),
...(params.alternateModel ? ["--alt-model", params.alternateModel] : []),
...(params.fastMode ? ["--fast"] : []),
...(params.thinkingDefault ? ["--thinking", params.thinkingDefault] : []),
...(params.allowFailures ? ["--allow-failures"] : []),
...(params.concurrency ? ["--concurrency", String(params.concurrency)] : []),
],
@@ -301,6 +304,7 @@ export function createQaMultipassPlan(params: {
primaryModel: params.primaryModel,
alternateModel: params.alternateModel,
fastMode: params.fastMode,
thinkingDefault: params.thinkingDefault,
scenarioIds,
forwardedEnv,
hostCodexHomePath,
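
Reviewer sketch (not part of the diff): `createQaMultipassPlan` forwards optional settings to the guest command by spreading either a flag pair or an empty array. The snippet below reproduces that pattern in isolation, including the new `--thinking` forwarding. `SuiteFlagsSketch` and `buildSuiteArgsSketch` are hypothetical names; the flag strings are the ones that appear in the hunk above.

```typescript
// Hypothetical parameter bag covering only the flags visible in the diff.
type SuiteFlagsSketch = {
  primaryModel?: string;
  alternateModel?: string;
  fastMode?: boolean;
  thinkingDefault?: string;
  allowFailures?: boolean;
  concurrency?: number;
};

// Each optional setting contributes its flag(s) only when truthy, so the
// resulting argv never contains dangling flags or "undefined" strings.
function buildSuiteArgsSketch(params: SuiteFlagsSketch): string[] {
  return [
    ...(params.primaryModel ? ["--model", params.primaryModel] : []),
    ...(params.alternateModel ? ["--alt-model", params.alternateModel] : []),
    ...(params.fastMode ? ["--fast"] : []),
    ...(params.thinkingDefault ? ["--thinking", params.thinkingDefault] : []),
    ...(params.allowFailures ? ["--allow-failures"] : []),
    ...(params.concurrency ? ["--concurrency", String(params.concurrency)] : []),
  ];
}
```

Because the checks are truthiness-based, a `concurrency` of `0` or an empty `thinkingDefault` string is silently dropped rather than forwarded.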

View File

@@ -1,5 +1,5 @@
export const QA_FRONTIER_PROVIDER_IDS = ["anthropic", "google", "openai"] as const;
export const QA_FRONTIER_CATALOG_PRIMARY_MODEL = "openai/gpt-5.5";
export const QA_FRONTIER_CATALOG_PRIMARY_MODEL = "openai/gpt-5.4";
export const QA_FRONTIER_CATALOG_ALTERNATE_MODEL = "anthropic/claude-sonnet-4-6";
export function isPreferredQaLiveFrontierCatalogModel(modelRef: string) {

View File

@@ -6,7 +6,7 @@ type QaFrontierCharacterModelOptions = {
};
export const QA_FRONTIER_CHARACTER_EVAL_MODELS = Object.freeze([
"openai/gpt-5.5",
"openai/gpt-5.4",
"openai/gpt-5.2",
"openai/gpt-5",
"anthropic/claude-opus-4-6",
@@ -18,19 +18,19 @@ export const QA_FRONTIER_CHARACTER_EVAL_MODELS = Object.freeze([
export const QA_FRONTIER_CHARACTER_THINKING_BY_MODEL: Readonly<Record<string, QaThinkingLevel>> =
Object.freeze({
"openai/gpt-5.5": "xhigh",
"openai/gpt-5.4": "medium",
"openai/gpt-5.2": "xhigh",
"openai/gpt-5": "xhigh",
});
export const QA_FRONTIER_CHARACTER_JUDGE_MODELS = Object.freeze([
"openai/gpt-5.5",
"openai/gpt-5.4",
"anthropic/claude-opus-4-6",
]);
export const QA_FRONTIER_CHARACTER_JUDGE_MODEL_OPTIONS: Readonly<
Record<string, QaFrontierCharacterModelOptions>
> = Object.freeze({
"openai/gpt-5.5": { thinkingDefault: "xhigh" },
"openai/gpt-5.4": { thinkingDefault: "xhigh", fastMode: true },
"anthropic/claude-opus-4-6": { thinkingDefault: "high" },
});

View File

@@ -23,7 +23,7 @@ function isClaudeOpusModel(modelRef: string) {
export const liveFrontierProviderDefinition: QaProviderDefinition = {
mode: "live-frontier",
kind: "live",
defaultModel: (options) => options?.preferredLiveModel ?? "openai/gpt-5.5",
defaultModel: (options) => options?.preferredLiveModel ?? "openai/gpt-5.4",
defaultImageGenerationProviderIds: ["openai"],
defaultImageGenerationModel: ({ modelProviderIds }) =>
modelProviderIds.includes("openai") ? "openai/gpt-image-1" : null,

View File

@@ -4,7 +4,7 @@ import {
} from "openclaw/plugin-sdk/agent-runtime";
import { resolveEnvApiKey } from "openclaw/plugin-sdk/provider-auth";
const QA_CODEX_OAUTH_LIVE_MODEL = "openai/gpt-5.5";
const QA_CODEX_OAUTH_LIVE_MODEL = "openai/gpt-5.4";
export function resolveQaLiveFrontierPreferredModel() {
if (resolveEnvApiKey("openai")?.apiKey) {

View File

@@ -1,2 +1,2 @@
export const QA_FRONTIER_PARITY_CANDIDATE_LABEL = "openai/gpt-5.5";
export const QA_FRONTIER_PARITY_CANDIDATE_LABEL = "openai/gpt-5.4";
export const QA_FRONTIER_PARITY_BASELINE_LABEL = "anthropic/claude-opus-4-6";

View File

@@ -151,6 +151,10 @@ const QA_REASONING_ONLY_RETRY_NEEDLE =
"recorded reasoning but did not produce a user-visible answer";
const QA_EMPTY_RESPONSE_RETRY_NEEDLE =
"The previous attempt did not produce a user-visible answer.";
const QA_SKILL_WORKSHOP_GIF_PROMPT_RE =
/externally sourced animated GIF asset|animated GIF asset in a product UI/i;
const QA_SKILL_WORKSHOP_REVIEW_PROMPT_RE = /Review transcript for durable skill updates/i;
const QA_RELEASE_AUDIT_PROMPT_RE = /release readiness audit for the small project/i;
type MockScenarioState = {
subagentFanoutPhase: number;
@@ -727,6 +731,16 @@ function buildAssistantText(
if (/(image generation check|capability flip image check)/i.test(prompt) && mediaPath) {
return `Protocol note: generated the QA lighthouse image successfully.\nMEDIA:${mediaPath}`;
}
if (QA_SKILL_WORKSHOP_GIF_PROMPT_RE.test(prompt) && toolOutput) {
return [
"Animated GIF QA checklist ready.",
"- Confirm true animation, not a static preview.",
"- Verify dimensions and product UI fit.",
"- Record attribution and license.",
"- Keep a local copy before using the asset.",
"- Re-open the copied file for final verification.",
].join("\n");
}
if (/roundtrip image inspection check/i.test(prompt) && imageInputCount > 0) {
return "Protocol note: the generated attachment shows the same QA lighthouse scene from the previous step.";
}
@@ -808,6 +822,79 @@ function buildToolCallEvents(prompt: string): StreamEvent[] {
return buildToolCallEventsWithArgs("read", { path: targetPath });
}
function buildReleaseAuditJson() {
return `${JSON.stringify(
{
verified: true,
findings: [
{
id: "REL-GATEWAY-417",
source: "src/gateway/reconnect.ts",
status: "retry jitter verified, resume token fallback still needs manual spot check",
},
{
id: "REL-CHANNEL-238",
source: "src/channels/delivery.ts",
status: "thread replies preserve ordering, root-channel fallback needs handoff note",
},
{
id: "REL-CRON-904",
source: "src/scheduling/cron.ts",
status: "single-run lock verified for restart wakeups",
},
{
id: "REL-MEMORY-552",
source: "src/memory/recall.ts",
status:
"fallback summary survives empty memory search; ranking sample needs second reviewer",
},
{
id: "REL-PLUGIN-319",
source: "src/plugins/runtime.ts",
status: "bundled runtime manifest loads cleanly after restart",
},
{
id: "REL-INSTALL-846",
source: "install/update.ts",
status: "update smoke passed from previous stable tag",
},
{
id: "REL-DOCS-611",
source: "docs/operator-notes.md",
status:
"docs mention reconnect, cron, memory, plugin, and installer checks; channel ordering and UI notes need maintainer handoff",
},
{
id: "REL-UI-BLOCKED",
source: "ui/control-panel.ts",
status: "blocked: source file was referenced by checklist but missing from the fixture",
},
],
},
null,
2,
)}\n`;
}
function buildReleaseHandoffMarkdown() {
return [
"# Release Handoff",
"",
"Ready:",
"- REL-GATEWAY-417: gateway reconnect handling checked in `src/gateway/reconnect.ts`.",
"- REL-CRON-904: cron duplicate prevention checked in `src/scheduling/cron.ts`.",
"- REL-PLUGIN-319: plugin runtime loading checked in `src/plugins/runtime.ts`.",
"- REL-INSTALL-846: installer update path checked in `install/update.ts`.",
"",
"Follow-up:",
"- REL-CHANNEL-238: channel delivery ordering needs maintainer handoff.",
"- REL-MEMORY-552: memory recall fallback ranking sample needs a second reviewer.",
"- REL-DOCS-611: docs update status needs channel ordering and UI notes.",
"- `ui/control-panel.ts` is blocked/not found in the fixture.",
"",
].join("\n");
}
function extractPlannedToolName(events: StreamEvent[]) {
for (const event of events) {
if (event.type !== "response.output_item.done") {
@@ -1128,6 +1215,63 @@ async function buildResponsesPayload(
},
]);
}
if (QA_SKILL_WORKSHOP_REVIEW_PROMPT_RE.test(allInputText)) {
return buildAssistantEvents(
JSON.stringify({
action: "create",
skillName: "animated-gif-workflow",
title: "Animated GIF Workflow",
reason: "Transcript captured a reusable animated media QA checklist.",
description: "Reusable workflow notes for animated GIF QA tasks.",
body: [
"- Confirm the asset has true animation, not a static preview.",
"- Check dimensions against the target product UI slot.",
"- Record attribution and license before using the file.",
"- Keep a local copy under the workspace before integration.",
"- Re-open the local copy for final verification.",
].join("\n"),
}),
);
}
if (QA_SKILL_WORKSHOP_GIF_PROMPT_RE.test(prompt) && !toolOutput) {
return buildToolCallEventsWithArgs("write", {
path: "animated-gif-qa-checklist.md",
content: [
"# Animated GIF QA Checklist",
"",
"- Confirm true animation.",
"- Verify dimensions.",
"- Record attribution.",
"- Keep a local copy.",
"- Perform final verification.",
].join("\n"),
});
}
if (QA_RELEASE_AUDIT_PROMPT_RE.test(prompt)) {
if (!toolOutput) {
return buildToolCallEventsWithArgs("read", { path: "audit-fixture/README.md" });
}
if (/Release readiness task|current checklist/i.test(toolOutput)) {
return buildToolCallEventsWithArgs("read", {
path: "audit-fixture/docs/current-readiness-checklist.md",
});
}
if (/Current release readiness requires checking eight areas/i.test(toolOutput)) {
return buildToolCallEventsWithArgs("write", {
path: "audit-fixture/release-audit.json",
content: buildReleaseAuditJson(),
});
}
if (/release-audit\.json/i.test(toolOutput)) {
return buildToolCallEventsWithArgs("write", {
path: "audit-fixture/release-handoff.md",
content: buildReleaseHandoffMarkdown(),
});
}
if (/release-handoff\.md/i.test(toolOutput)) {
return buildAssistantEvents("RELEASE-AUDIT-COMPLETE");
}
}
if (/lobster invaders/i.test(prompt)) {
if (!toolOutput) {
return buildToolCallEventsWithArgs("read", { path: "QA_KICKOFF_TASK.md" });

View File

@@ -45,8 +45,8 @@ describe("qa run config", () => {
it("creates a live-by-default selection that arms every scenario", () => {
expect(createDefaultQaRunSelection(scenarios)).toEqual({
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
fastMode: true,
scenarioIds: ["dm-chat-baseline", "thread-lifecycle"],
});
@@ -57,7 +57,7 @@ describe("qa run config", () => {
normalizeQaRunSelection(
{
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "",
fastMode: false,
scenarioIds: ["thread-lifecycle", "missing", "thread-lifecycle"],
@@ -66,8 +66,8 @@ describe("qa run config", () => {
),
).toEqual({
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
fastMode: true,
scenarioIds: ["thread-lifecycle"],
});
@@ -99,13 +99,13 @@ describe("qa run config", () => {
});
it("keeps idle snapshots on static defaults so startup does not inspect auth profiles", () => {
defaultQaRuntimeModelForMode.mockReturnValue("openai/gpt-5.5");
defaultQaRuntimeModelForMode.mockReturnValue("openai/gpt-5.4");
defaultQaRuntimeModelForMode.mockClear();
expect(createIdleQaRunnerSnapshot(scenarios).selection).toMatchObject({
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
});
expect(defaultQaRuntimeModelForMode).not.toHaveBeenCalled();
});
@@ -138,14 +138,14 @@ describe("qa run config", () => {
it("prefers the Codex OAuth default when the runtime resolver says it is available", () => {
defaultQaRuntimeModelForMode.mockImplementation((mode, options) =>
mode === "live-frontier"
? "openai/gpt-5.5"
? "openai/gpt-5.4"
: defaultQaProviderModelForMode(mode as QaProviderModeInput, options),
);
expect(createDefaultQaRunSelection(scenarios)).toEqual({
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.5",
alternateModel: "openai/gpt-5.5",
primaryModel: "openai/gpt-5.4",
alternateModel: "openai/gpt-5.4",
fastMode: true,
scenarioIds: ["dm-chat-baseline", "thread-lifecycle"],
});

View File

@@ -137,15 +137,15 @@ describe("qa scenario catalog", () => {
expect(scenario.sourcePath).toBe("qa/scenarios/models/gpt54-thinking-visibility-switch.md");
expect(config?.requiredLiveProvider).toBe("openai");
expect(config?.requiredLiveModel).toBe("gpt-5.5");
expect(config?.requiredLiveModel).toBe("gpt-5.4");
expect(config?.offDirective).toBe("/think off");
expect(config?.maxDirective).toBe("/think max");
expect(config?.maxDirective).toBe("/think medium");
expect(config?.reasoningDirective).toBe("/reasoning on");
expect(scenario.execution.flow?.steps.map((step) => step.name)).toEqual([
"enables reasoning display and disables thinking",
"switches to max thinking",
"verifies max thinking emits visible reasoning",
"verifies max thinking completes the answer",
"switches to medium thinking",
"verifies medium thinking emits visible reasoning",
"verifies medium thinking completes the answer",
]);
});
@@ -169,10 +169,10 @@ describe("qa scenario catalog", () => {
},
});
expect(config?.requiredProvider).toBe("openai");
expect(config?.requiredModel).toBe("gpt-5.5");
expect(config?.requiredModel).toBe("gpt-5.4");
expect(config?.expectedMarker).toBe("WEB-SEARCH-OK");
expect(scenario.execution.flow?.steps.map((step) => step.name)).toEqual([
"confirms live OpenAI GPT-5.5 web search auto mode",
"confirms live OpenAI GPT-5.4 web search auto mode",
"searches official OpenAI News through the live model",
]);
});
@@ -191,7 +191,7 @@ describe("qa scenario catalog", () => {
expect(scenario.sourcePath).toBe("qa/scenarios/models/thinking-slash-model-remap.md");
expect(config?.requiredProviderMode).toBe("live-frontier");
expect(config?.anthropicModelRef).toBe("anthropic/claude-sonnet-4-6");
expect(config?.openAiXhighModelRef).toBe("openai/gpt-5.5");
expect(config?.openAiXhighModelRef).toBe("openai/gpt-5.4");
expect(config?.noXhighModelRef).toBe("anthropic/claude-sonnet-4-6");
expect(scenario.execution.flow?.steps.map((step) => step.name)).toEqual([
"selects Anthropic and verifies adaptive options",

View File

@@ -250,4 +250,32 @@ describe("qa suite planning helpers", () => {
}).map((scenario) => scenario.id),
).toEqual(["generic", "claude-subscription"]);
});
it("filters provider-mode-specific scenarios from implicit suite selections", () => {
const scenarios = [
makeQaSuiteTestScenario("generic"),
makeQaSuiteTestScenario("live-only", {
config: { requiredProviderMode: "live-frontier" },
}),
makeQaSuiteTestScenario("mock-only", {
config: { requiredProviderMode: "mock-openai" },
}),
];
expect(
selectQaSuiteScenarios({
scenarios,
providerMode: "mock-openai",
primaryModel: "mock-openai/gpt-5.4",
}).map((scenario) => scenario.id),
).toEqual(["generic", "mock-only"]);
expect(
selectQaSuiteScenarios({
scenarios,
providerMode: "live-frontier",
primaryModel: "openai/gpt-5.4",
}).map((scenario) => scenario.id),
).toEqual(["generic", "live-only"]);
});
});

View File

@@ -33,11 +33,15 @@ function scenarioMatchesLiveLane(params: {
providerMode: QaProviderMode;
claudeCliAuthMode?: QaCliBackendAuthMode;
}) {
const config = params.scenario.execution.config ?? {};
const requiredProviderMode = normalizeQaConfigString(config.requiredProviderMode);
if (requiredProviderMode && params.providerMode !== requiredProviderMode) {
return false;
}
if (getQaProvider(params.providerMode).kind !== "live") {
return true;
}
const selected = splitModelRef(params.primaryModel);
const config = params.scenario.execution.config ?? {};
const requiredProvider = normalizeQaConfigString(config.requiredProvider);
if (requiredProvider && selected?.provider !== requiredProvider) {
return false;

View File

@@ -1,5 +1,6 @@
import fs from "node:fs";
import type { App } from "@slack/bolt";
import { expectChannelInboundContextContract as expectInboundContextContract } from "openclaw/plugin-sdk/channel-contract-testing";
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-runtime";
import {
registerSessionBindingAdapter,
@@ -9,7 +10,6 @@ import {
} from "openclaw/plugin-sdk/conversation-runtime";
import { resolveAgentRoute } from "openclaw/plugin-sdk/routing";
import { resolveThreadSessionKeys } from "openclaw/plugin-sdk/routing";
import { expectChannelInboundContextContract as expectInboundContextContract } from "openclaw/plugin-sdk/testing";
import { afterAll, beforeAll, describe, expect, it, vi } from "vitest";
import type { ResolvedSlackAccount } from "../../accounts.js";
import type { SlackMessageEvent } from "../../types.js";

Some files were not shown because too many files have changed in this diff.