Compare commits

1 Commits

Author | SHA1 | Message | Date
Mason Huang | 93fc591af9 | ci: add process exec codeql security shard | 2026-06-13 20:38:21 +08:00
1297 changed files with 27497 additions and 89837 deletions

@@ -54,13 +54,6 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
- For broad OpenClaw maintainer `pnpm` gates, prefer the repo wrapper with
`--provider blacksmith-testbox` or the repo Testbox helpers when the standing
Testbox policy applies.
- Cold Testbox acquisition and hydration often take tens of seconds. When broad
remote proof is likely, immediately start
`node scripts/crabbox-wrapper.mjs warmup --provider blacksmith-testbox --keep --timing-json`
in a background command session while inspecting, editing, and running
focused local tests. Poll later, reuse the returned `tbx_...` with
`--provider blacksmith-testbox --id <tbx_id>`, and stop it before handoff.
Do not warm speculatively when remote proof is unlikely.
- Always report the actual provider and id. `cbx_...` means AWS Crabbox;
`tbx_...` means Blacksmith Testbox through Crabbox. If the output only says
`blacksmith testbox list`, use `blacksmith testbox list --all` before

@@ -13,7 +13,7 @@ Use this skill for `qa-lab` / `qa-channel` work. Repo-local QA only.
- `docs/help/testing.md`
- `docs/channels/qa-channel.md`
- `qa/README.md`
- `qa/scenarios/index.yaml`
- `qa/scenarios/index.md`
- `extensions/qa-lab/src/suite.ts`
- `extensions/qa-lab/src/character-eval.ts`
@@ -198,9 +198,7 @@ pnpm openclaw qa character-eval \
- Judges default to `openai/gpt-5.4,thinking=xhigh,fast` and `anthropic/claude-opus-4-6,thinking=high`.
- Report includes judge ranking, run stats, durations, and full transcripts; do not include raw judge replies. Duration is benchmark context, not a grading signal.
- Candidate and judge concurrency default to 16. Use `--concurrency <n>` and `--judge-concurrency <n>` to override when local gateways or provider limits need a gentler lane.
- Scenario source is YAML-only under `qa/scenarios/`: use `index.yaml` and
per-scenario `*.yaml` files with top-level `title`, `scenario`, and optional
`flow`. Never add fenced `qa-scenario` / `qa-flow` Markdown files.
- Scenario source should stay markdown-driven under `qa/scenarios/`.
- For isolated character/persona evals, write the persona into `SOUL.md` and blank `IDENTITY.md` in the scenario flow. Use `SOUL.md + IDENTITY.md` only when intentionally testing how the normal OpenClaw identity combines with the character.
- Keep prompts natural and task-shaped. The candidate model should receive character setup through `SOUL.md`, then normal user turns such as chat, workspace help, and small file tasks; do not ask "how would you react?" or tell the model it is in an eval.
- Prefer at least one real task, such as creating or editing a tiny workspace artifact, so the transcript captures character under normal tool use instead of pure roleplay.
@@ -236,8 +234,7 @@ pnpm openclaw qa manual \
## Repo facts
- Seed scenarios live in `qa/scenarios/index.yaml` and
`qa/scenarios/<theme>/*.yaml`.
- Seed scenarios live in `qa/`.
- Main live runner: `extensions/qa-lab/src/suite.ts`
- QA lab server: `extensions/qa-lab/src/lab-server.ts`
- Child gateway harness: `extensions/qa-lab/src/gateway-child.ts`
@@ -265,9 +262,8 @@ pnpm openclaw qa manual \
## When adding scenarios
- Add or update scenario YAML under `qa/scenarios/`; do not add `.md` scenario
files or fenced YAML blocks.
- Keep kickoff expectations in `qa/scenarios/index.yaml` aligned
- Add or update scenario markdown under `qa/scenarios/`
- Keep kickoff expectations in `qa/scenarios/index.md` aligned
- Add executable coverage in `extensions/qa-lab/src/suite.ts`
- Prefer end-to-end assertions over mock-only checks
- Save outputs under `.artifacts/qa-e2e/`

@@ -150,21 +150,9 @@ Use this skill for release and publish-time workflow. Load `$release-private` if
- Stable Windows Hub release closeout requires the signed
`OpenClawCompanion-Setup-x64.exe`, `OpenClawCompanion-Setup-arm64.exe`, and
`OpenClawCompanion-SHA256SUMS.txt` assets on the canonical
`openclaw/openclaw` GitHub Release. Pass the exact signed
`openclaw/openclaw-windows-node` release tag as `windows_node_tag` to
`OpenClaw Release Publish`, together with the candidate-approved
`windows_node_installer_digests` map; it prevalidates the published source
release and required installers against that map before any publish child,
dispatches the public `Windows Node Release` workflow while the OpenClaw
release is still a draft, carries those pinned source asset digests
unchanged, verifies the expected OpenClaw Foundation Authenticode signer on
Windows, re-downloads and checksum-verifies the promoted asset contract, and
blocks publication until the canonical asset contract is present. Use direct
`Windows Node Release` dispatch only for recovery, always with an exact tag,
never `latest`, and the explicit `expected_installer_digests` JSON map from
the approved source release. Recovery rejects unexpected
`OpenClawCompanion-*` target asset names, then replaces the expected contract
assets with the pinned source bytes.
`openclaw/openclaw` GitHub Release. Use the public `Windows Node Release`
workflow after the matching `openclaw/openclaw-windows-node` release exists;
it verifies Authenticode signatures on Windows before uploading assets.
- Website Windows Hub download links should target exact canonical
`openclaw/openclaw/releases/download/vYYYY.M.PATCH/...` assets for the current
stable release, or `releases/latest/download/...` only after verifying the
@@ -687,23 +675,19 @@ node --import tsx scripts/openclaw-npm-postpublish-verify.ts <published-version>
where npm did not publish the beta version, delete/recreate the same beta
tag and any accidental draft/incomplete prerelease at the fixed commit
instead of skipping a prerelease number.
22. Start `.github/workflows/openclaw-release-publish.yml` from the same branch with
22. Start `.github/workflows/openclaw-npm-release.yml` from the same branch with
the same tag for the real publish, choose `npm_dist_tag` (`beta` default,
`latest` only when you intentionally want direct stable publish), keep it
the same as the preflight run, and pass the successful npm
`preflight_run_id` plus the successful `full_release_validation_run_id`.
For stable publish, also pass the exact non-prerelease
`openclaw/openclaw-windows-node` tag as `windows_node_tag` and its
candidate-approved installer digest map as `windows_node_installer_digests`.
`preflight_run_id`.
23. Wait for `npm-release` approval from `@openclaw/openclaw-release-managers`.
24. Wait for the real publish workflow to run postpublish verification,
create or update the GitHub release as a draft, upload dependency evidence,
promote and verify the required Windows Hub assets for stable releases,
append release verification proof, and only then undraft/publish it. If a
waited plugin publish or Windows Hub promotion fails after OpenClaw npm
succeeds, the workflow keeps the release draft with OpenClaw npm evidence
and exits red; do not undraft until the gap is repaired. The standalone
verifier command remains the recovery probe:
waited plugin publish fails after OpenClaw npm succeeds, the workflow keeps
the release draft with OpenClaw npm evidence and exits red; do not undraft
until the plugin publish gap is repaired. The standalone verifier command
remains the recovery probe:
`node --import tsx scripts/openclaw-npm-postpublish-verify.ts <published-version>`.
25. Run the post-published beta verification roster. First scan current `main`
for critical fixes that landed after the release branch cut; backport only

@@ -0,0 +1,61 @@
name: openclaw-codeql-process-exec-boundary-critical-security
disable-default-queries: true
queries:
- uses: security-extended
query-filters:
- include:
precision:
- high
- very-high
tags contain: security
security-severity: /([7-9]|10)\.(\d)+/
paths:
- src/process
- src/tui/tui-local-shell.ts
- src/tui/tui.ts
- src/plugin-sdk/windows-spawn.ts
- packages/agent-core/src/harness/env
- packages/memory-host-sdk/src/host
- extensions/acpx/src
- extensions/bonjour/src/advertiser.ts
- extensions/browser/src/browser/chrome-mcp.ts
- extensions/browser/src/browser/chrome.executables.ts
- extensions/browser/src/browser/chrome.ts
- extensions/codex/src/app-server/sandbox-exec-server
- extensions/codex/src/app-server/transport-stdio.ts
- extensions/codex/src/node-cli-sessions.ts
- extensions/codex-supervisor/src/json-rpc-client.ts
- extensions/file-transfer/src
- extensions/google-meet/src
- extensions/imessage/src
- extensions/memory-core/src/memory/qmd-manager.ts
- extensions/memory-wiki/src/obsidian.ts
- extensions/microsoft-foundry/cli.ts
- extensions/ollama/src/wsl2-crash-loop-check.ts
- extensions/qa-lab/src
- extensions/signal/src/daemon.ts
- extensions/tts-local-cli/speech-provider.ts
- extensions/voice-call/src
- scripts
paths-ignore:
- "**/node_modules"
- "**/coverage"
- "**/*.generated.ts"
- "**/*.bundle.js"
- "**/*-runtime.js"
- "**/*.test.ts"
- "**/*.test.tsx"
- "**/*.spec.ts"
- "**/*.spec.tsx"
- "**/*.e2e.test.ts"
- "**/*.e2e.test.tsx"
- "**/*test-support*"
- "**/*test-helper*"
- "**/*mock*"
- "**/*fixture*"
- "**/*bench*"
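
Reviewer note on the `security-severity` filter in the config above: the sketch below shows which score strings the pattern accepts, assuming it is evaluated like an ordinary unanchored JavaScript regular expression. CodeQL's own matching semantics may differ (for example, it may anchor the pattern), so treat this as an illustration only. `matchesSeverity` and `anchoredPattern` are hypothetical names, not part of the commit.

```javascript
// Pattern copied from the config: one of 7, 8, 9, or 10, a dot, then digits.
const severityPattern = /([7-9]|10)\.(\d)+/;

// Hypothetical stricter variant that must match the whole score string.
const anchoredPattern = /^(?:[7-9]|10)\.\d+$/;

// Returns true when the score string satisfies the given pattern.
function matchesSeverity(score, pattern = severityPattern) {
  return pattern.test(score);
}

// Sample scores chosen for illustration only.
for (const score of ["6.9", "7.0", "8.8", "10.0", "17.5"]) {
  console.log(score, matchesSeverity(score), matchesSeverity(score, anchoredPattern));
}
```

Under these assumptions the two variants only disagree on strings such as `17.5`, where the unanchored pattern finds `7.5` inside the longer value.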

@@ -1358,8 +1358,6 @@ jobs:
- check_name: check-additional-boundaries-bcd
group: boundaries
boundary_shard: 2/4,3/4,4/4
- check_name: check-session-accessor-boundary
group: session-accessor-boundary
- check_name: check-additional-extension-channels
group: extension-channels
- check_name: check-additional-extension-bundled
@@ -1506,15 +1504,6 @@ jobs:
boundaries)
node scripts/run-additional-boundary-checks.mjs
;;
session-accessor-boundary)
if [ ! -f scripts/check-session-accessor-boundary.mjs ]; then
echo "[skip] session accessor boundary check is not present in this checkout"
elif ! node -e 'const pkg = require("./package.json"); process.exit(pkg.scripts?.["lint:tmp:session-accessor-boundary"] ? 0 : 1);'; then
echo "[skip] session accessor boundary script is not present in package.json"
else
run_check "lint:tmp:session-accessor-boundary" pnpm run lint:tmp:session-accessor-boundary
fi
;;
extension-channels)
run_check "lint:extensions:channels" pnpm run lint:extensions:channels
;;
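
Reviewer note on the `session-accessor-boundary` case in the hunk above: its inline `node -e` guard only runs the lint when `package.json` defines the script. A minimal sketch of that presence check follows; `hasScript` and `examplePkg` are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper mirroring the truthiness test in the inline `node -e` guard.
function hasScript(pkg, scriptName) {
  // Optional chaining tolerates a package.json without a "scripts" section.
  return Boolean(pkg.scripts?.[scriptName]);
}

// Made-up package data; the real package.json is not shown in this diff.
const examplePkg = { scripts: { "lint:extensions:channels": "echo lint" } };

console.log(hasScript(examplePkg, "lint:extensions:channels")); // true
console.log(hasScript(examplePkg, "lint:tmp:session-accessor-boundary")); // false
console.log(hasScript({}, "lint:tmp:session-accessor-boundary")); // false
```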

@@ -17,7 +17,28 @@ on:
- ".github/actions/**"
- ".github/codeql/**"
- ".github/workflows/**"
- "extensions/acpx/src/**"
- "extensions/bonjour/src/advertiser.ts"
- "extensions/browser/src/browser/chrome-mcp.ts"
- "extensions/browser/src/browser/chrome.executables.ts"
- "extensions/browser/src/browser/chrome.ts"
- "extensions/codex/src/app-server/sandbox-exec-server/**"
- "extensions/codex/src/app-server/transport-stdio.ts"
- "extensions/codex/src/node-cli-sessions.ts"
- "extensions/codex-supervisor/src/json-rpc-client.ts"
- "extensions/file-transfer/src/**"
- "extensions/google-meet/src/**"
- "extensions/imessage/src/**"
- "extensions/memory-core/src/memory/qmd-manager.ts"
- "extensions/memory-wiki/src/obsidian.ts"
- "extensions/microsoft-foundry/cli.ts"
- "extensions/ollama/src/wsl2-crash-loop-check.ts"
- "extensions/qa-lab/src/**"
- "extensions/signal/src/daemon.ts"
- "extensions/tts-local-cli/speech-provider.ts"
- "extensions/voice-call/src/**"
- "packages/**"
- "scripts/**"
- "src/**"
push:
branches:
@@ -26,7 +47,28 @@ on:
- ".github/actions/**"
- ".github/codeql/**"
- ".github/workflows/**"
- "extensions/acpx/src/**"
- "extensions/bonjour/src/advertiser.ts"
- "extensions/browser/src/browser/chrome-mcp.ts"
- "extensions/browser/src/browser/chrome.executables.ts"
- "extensions/browser/src/browser/chrome.ts"
- "extensions/codex/src/app-server/sandbox-exec-server/**"
- "extensions/codex/src/app-server/transport-stdio.ts"
- "extensions/codex/src/node-cli-sessions.ts"
- "extensions/codex-supervisor/src/json-rpc-client.ts"
- "extensions/file-transfer/src/**"
- "extensions/google-meet/src/**"
- "extensions/imessage/src/**"
- "extensions/memory-core/src/memory/qmd-manager.ts"
- "extensions/memory-wiki/src/obsidian.ts"
- "extensions/microsoft-foundry/cli.ts"
- "extensions/ollama/src/wsl2-crash-loop-check.ts"
- "extensions/qa-lab/src/**"
- "extensions/signal/src/daemon.ts"
- "extensions/tts-local-cli/speech-provider.ts"
- "extensions/voice-call/src/**"
- "packages/**"
- "scripts/**"
- "src/**"
schedule:
- cron: "0 6 * * *"
@@ -73,6 +115,11 @@ jobs:
runs_on: blacksmith-4vcpu-ubuntu-2404
timeout_minutes: 25
config_file: ./.github/codeql/codeql-mcp-process-tool-boundary-critical-security.yml
- language: javascript-typescript
category: process-exec-boundary
runs_on: blacksmith-4vcpu-ubuntu-2404
timeout_minutes: 25
config_file: ./.github/codeql/codeql-process-exec-boundary-critical-security.yml
- language: javascript-typescript
category: plugin-trust-boundary
runs_on: blacksmith-4vcpu-ubuntu-2404

@@ -783,7 +783,7 @@ jobs:
fi
args=(
-f ref="$TARGET_REF"
-f ref="$TARGET_SHA"
-f expected_sha="$TARGET_SHA"
-f provider="$PROVIDER"
-f mode="$MODE"

@@ -379,6 +379,7 @@ jobs:
OPENCLAW_QA_CONVEX_SECRET_CI: ${{ secrets.OPENCLAW_QA_CONVEX_SECRET_CI }}
OPENCLAW_QA_CREDENTIAL_ACQUIRE_TIMEOUT_MS: "1800000"
OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT: "1"
CRABBOX_COORDINATOR: ${{ secrets.CRABBOX_COORDINATOR }}
CRABBOX_COORDINATOR_TOKEN: ${{ secrets.CRABBOX_COORDINATOR_TOKEN }}
OPENCLAW_QA_MANTIS_CRABBOX_COORDINATOR: ${{ secrets.OPENCLAW_QA_MANTIS_CRABBOX_COORDINATOR }}

@@ -220,6 +220,7 @@ jobs:
OPENCLAW_QA_CONVEX_SECRET_CI: ${{ secrets.OPENCLAW_QA_CONVEX_SECRET_CI }}
OPENCLAW_QA_CREDENTIAL_ACQUIRE_TIMEOUT_MS: "1800000"
OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT: "1"
INPUT_SCENARIO: ${{ inputs.scenario }}
PACKAGE_ARTIFACT_NAME: ${{ inputs.package_artifact_name || '' }}
run: |

@@ -420,7 +420,6 @@ jobs:
add_suite live-cache
add_profile_suite native-live-src-agents "stable full"
add_profile_suite native-live-src-agents-zai-coding "stable full"
add_profile_suite native-live-src-gateway-core "beta minimum stable full"
add_profile_suite native-live-src-gateway-profiles-anthropic "stable full"
add_profile_suite native-live-src-gateway-profiles-anthropic-smoke "stable"
@@ -1957,12 +1956,6 @@ jobs:
timeout_minutes: 60
profile_env_only: false
profiles: stable full
- suite_id: native-live-src-agents-zai-coding
label: Native live Z.AI Coding Plan
command: ZAI_CODING_LIVE_TEST=1 node .release-harness/scripts/test-live-shard.mjs native-live-src-agents-zai-coding
timeout_minutes: 15
profile_env_only: false
profiles: stable full
- suite_id: native-live-src-gateway-core
label: Native live gateway core
command: OPENCLAW_LIVE_CODEX_HARNESS=1 OPENCLAW_LIVE_CODEX_HARNESS_AUTH=api-key node .release-harness/scripts/test-live-shard.mjs native-live-src-gateway-core

@@ -1181,7 +1181,7 @@ jobs:
runtime_tool_coverage_release_checks:
name: Enforce QA Lab runtime tool coverage
needs: [resolve_target, qa_lab_runtime_parity_release_checks]
if: contains(fromJSON('["all","qa","qa-parity"]'), needs.resolve_target.outputs.rerun_group)
if: always() && contains(fromJSON('["all","qa","qa-parity"]'), needs.resolve_target.outputs.rerun_group)
runs-on: ubuntu-24.04
timeout-minutes: 15
permissions:
@@ -1204,35 +1204,13 @@ jobs:
node-version: ${{ env.NODE_VERSION }}
install-bun: "true"
- name: Download runtime parity status
uses: actions/download-artifact@v8
with:
name: release-check-status-qa-runtime-parity-${{ needs.resolve_target.outputs.revision }}
path: .artifacts/release-check-status/
- name: Verify runtime parity producer status
id: verify_runtime_parity_status
shell: bash
run: |
set -euo pipefail
status_path=".artifacts/release-check-status/qa_lab_runtime_parity_release_checks.env"
status="$(sed -n 's/^status=//p' "$status_path" | tail -n 1)"
if [[ "$status" != "success" ]]; then
echo "Runtime parity producer status is ${status:-missing}; skipping coverage artifact consumer."
echo "ready=false" >> "$GITHUB_OUTPUT"
exit 0
fi
echo "ready=true" >> "$GITHUB_OUTPUT"
- name: Download runtime parity artifacts
if: steps.verify_runtime_parity_status.outputs.ready == 'true'
uses: actions/download-artifact@v8
with:
name: release-qa-runtime-parity-${{ needs.resolve_target.outputs.revision }}
path: .artifacts/qa-e2e/
- name: Enforce standard runtime tool coverage
if: steps.verify_runtime_parity_status.outputs.ready == 'true'
run: |
set -euo pipefail
pnpm openclaw qa coverage \
@@ -1434,6 +1412,7 @@ jobs:
OPENCLAW_QA_CONVEX_SECRET_CI: ${{ secrets.OPENCLAW_QA_CONVEX_SECRET_CI }}
OPENCLAW_QA_CREDENTIAL_ACQUIRE_TIMEOUT_MS: "1800000"
OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT: "1"
run: |
set -euo pipefail

@@ -15,14 +15,6 @@ on:
description: Successful Full Release Validation run id for this tag/SHA, required when publish_openclaw_npm=true
required: false
type: string
windows_node_tag:
description: Exact openclaw-windows-node release tag, required for stable OpenClaw publish
required: false
type: string
windows_node_installer_digests:
description: Candidate-approved compact JSON map of Windows installer names to pinned sha256 digests
required: false
type: string
npm_telegram_run_id:
description: Optional successful NPM Telegram Beta E2E run id to include in final release evidence
required: false
@@ -89,15 +81,12 @@ jobs:
outputs:
sha: ${{ steps.manifest.outputs.sha || steps.ref.outputs.sha }}
preflight_artifact_name: ${{ steps.preflight_artifact.outputs.name }}
windows_node_installer_digests: ${{ steps.windows_source.outputs.installer_digests }}
steps:
- name: Validate inputs
env:
RELEASE_TAG: ${{ inputs.tag }}
PREFLIGHT_RUN_ID: ${{ inputs.preflight_run_id }}
FULL_RELEASE_VALIDATION_RUN_ID: ${{ inputs.full_release_validation_run_id }}
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
WINDOWS_NODE_INSTALLER_DIGESTS: ${{ inputs.windows_node_installer_digests }}
PUBLISH_OPENCLAW_NPM: ${{ inputs.publish_openclaw_npm && 'true' || 'false' }}
PLUGIN_PUBLISH_SCOPE: ${{ inputs.plugin_publish_scope }}
PLUGINS: ${{ inputs.plugins }}
@@ -126,22 +115,6 @@ jobs:
echo "publish_openclaw_npm=true requires full_release_validation_run_id." >&2
exit 1
fi
stable_release=true
if [[ "${RELEASE_TAG}" == *"-alpha."* || "${RELEASE_TAG}" == *"-beta."* ]]; then
stable_release=false
fi
if [[ -n "${WINDOWS_NODE_TAG}" && ! "${WINDOWS_NODE_TAG}" =~ ^v[0-9]+\.[0-9]+\.[0-9]+([-.][0-9A-Za-z]+([.-][0-9A-Za-z]+)*)?$ ]]; then
echo "windows_node_tag must be an explicit openclaw-windows-node release tag, not latest: ${WINDOWS_NODE_TAG}" >&2
exit 1
fi
if [[ "${PUBLISH_OPENCLAW_NPM}" == "true" && "${stable_release}" == "true" && -z "${WINDOWS_NODE_TAG}" ]]; then
echo "Stable OpenClaw publish requires an explicit windows_node_tag." >&2
exit 1
fi
if [[ "${PUBLISH_OPENCLAW_NPM}" == "true" && "${stable_release}" == "true" && -z "${WINDOWS_NODE_INSTALLER_DIGESTS}" ]]; then
echo "Stable OpenClaw publish requires candidate-approved windows_node_installer_digests." >&2
exit 1
fi
tideclaw_alpha_publish=false
if [[ "${RELEASE_TAG}" == *"-alpha."* && "${RELEASE_NPM_DIST_TAG}" == "alpha" && "${WORKFLOW_REF}" =~ ^refs/heads/tideclaw/alpha/[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}Z$ ]]; then
tideclaw_alpha_publish=true
@@ -170,73 +143,6 @@ jobs:
;;
esac
- name: Validate stable Windows source release
id: windows_source
if: ${{ inputs.publish_openclaw_npm }}
env:
GH_TOKEN: ${{ github.token }}
RELEASE_TAG: ${{ inputs.tag }}
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
APPROVED_INSTALLER_DIGESTS: ${{ inputs.windows_node_installer_digests }}
run: |
set -euo pipefail
if [[ "${RELEASE_TAG}" == *"-alpha."* || "${RELEASE_TAG}" == *"-beta."* ]]; then
exit 0
fi
source_json="$(gh release view "${WINDOWS_NODE_TAG}" \
--repo openclaw/openclaw-windows-node \
--json tagName,isDraft,isPrerelease,assets,url)"
if [[ "$(printf '%s' "${source_json}" | jq -r '.tagName')" != "${WINDOWS_NODE_TAG}" ]]; then
echo "Windows source release tag does not match ${WINDOWS_NODE_TAG}." >&2
exit 1
fi
if [[ "$(printf '%s' "${source_json}" | jq -r '.isDraft')" == "true" ]]; then
echo "Stable OpenClaw publish requires a published Windows source release." >&2
exit 1
fi
if [[ "$(printf '%s' "${source_json}" | jq -r '.isPrerelease')" == "true" ]]; then
echo "Stable OpenClaw publish requires a non-prerelease Windows source release." >&2
exit 1
fi
required_assets=(
"OpenClawCompanion-Setup-x64.exe"
"OpenClawCompanion-Setup-arm64.exe"
)
required_assets_json="$(printf '%s\n' "${required_assets[@]}" | jq -R . | jq -sc .)"
if ! approved_installer_digests="$(printf '%s' "${APPROVED_INSTALLER_DIGESTS}" | jq -ce --argjson names "${required_assets_json}" '
if type == "object" and
(keys | sort) == ($names | sort) and
all(.[]; type == "string" and test("^sha256:[a-f0-9]{64}$"))
then .
else error("invalid candidate-approved Windows installer digest map")
end
')"; then
echo "windows_node_installer_digests must contain exactly the candidate-approved current installer asset contract." >&2
exit 1
fi
for asset_name in "${required_assets[@]}"; do
asset_matches="$(printf '%s' "${source_json}" | jq -c --arg name "${asset_name}" '[.assets[]? | select(.name == $name)]')"
asset_match_count="$(printf '%s' "${asset_matches}" | jq 'length')"
if [[ "${asset_match_count}" != "1" ]]; then
echo "Windows source release ${WINDOWS_NODE_TAG} must contain exactly one required asset ${asset_name}; found ${asset_match_count}." >&2
exit 1
fi
asset_digest="$(printf '%s' "${asset_matches}" | jq -r '.[0].digest // empty')"
if [[ ! "${asset_digest}" =~ ^sha256:[a-f0-9]{64}$ ]]; then
echo "Windows source release ${WINDOWS_NODE_TAG} asset ${asset_name} is missing its immutable SHA-256 digest." >&2
exit 1
fi
approved_digest="$(printf '%s' "${approved_installer_digests}" | jq -r --arg name "${asset_name}" '.[$name]')"
if [[ "${asset_digest}" != "${approved_digest}" ]]; then
echo "Windows source release ${WINDOWS_NODE_TAG} asset ${asset_name} no longer matches its candidate-approved digest." >&2
exit 1
fi
done
echo "installer_digests=${approved_installer_digests}" >> "$GITHUB_OUTPUT"
echo "- Windows Node source release: prevalidated \`${WINDOWS_NODE_TAG}\`" >> "$GITHUB_STEP_SUMMARY"
- name: Download OpenClaw npm preflight manifest
id: preflight_artifact
if: ${{ inputs.publish_openclaw_npm }}
@@ -431,7 +337,6 @@ jobs:
TARGET_SHA: ${{ steps.manifest.outputs.sha || steps.ref.outputs.sha }}
RELEASE_PROFILE: ${{ steps.full_manifest.outputs.release_profile || inputs.release_profile }}
FULL_RELEASE_VALIDATION_RUN_ID: ${{ inputs.full_release_validation_run_id }}
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
run: |
{
echo "### Release target"
@@ -442,16 +347,13 @@ jobs:
if [[ -n "${FULL_RELEASE_VALIDATION_RUN_ID// }" ]]; then
echo "- Full release validation: \`${FULL_RELEASE_VALIDATION_RUN_ID}\`"
fi
if [[ -n "${WINDOWS_NODE_TAG// }" ]]; then
echo "- Windows Node source release: \`${WINDOWS_NODE_TAG}\`"
fi
} >> "$GITHUB_STEP_SUMMARY"
publish:
name: Publish plugins, then OpenClaw
needs: [resolve_release_target]
runs-on: ubuntu-latest
timeout-minutes: 120
timeout-minutes: 60
environment: npm-release
steps:
- name: Checkout release SHA
@@ -481,16 +383,10 @@ jobs:
WAIT_FOR_CLAWHUB: ${{ inputs.wait_for_clawhub && 'true' || 'false' }}
PREFLIGHT_ARTIFACT_NAME: ${{ needs.resolve_release_target.outputs.preflight_artifact_name }}
NPM_TELEGRAM_RUN_ID: ${{ inputs.npm_telegram_run_id }}
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
WINDOWS_NODE_INSTALLER_DIGESTS: ${{ needs.resolve_release_target.outputs.windows_node_installer_digests }}
POSTPUBLISH_EVIDENCE_DIR: ${{ runner.temp }}/openclaw-release-postpublish-evidence
run: |
set -euo pipefail
is_stable_release() {
[[ "${RELEASE_TAG}" != *"-alpha."* && "${RELEASE_TAG}" != *"-beta."* ]]
}
dispatch_workflow_at_ref() {
local workflow_ref="$1"
shift
@@ -940,105 +836,10 @@ jobs:
}
publish_github_release() {
if is_stable_release; then
verify_windows_release_asset_contract
fi
gh release edit "${RELEASE_TAG}" --repo "$GITHUB_REPOSITORY" --draft=false
echo "- GitHub release: https://github.com/${GITHUB_REPOSITORY}/releases/tag/${RELEASE_TAG}" >> "$GITHUB_STEP_SUMMARY"
}
verify_windows_release_asset_contract() {
local actual_companion_assets actual_digest asset_name expected_companion_assets expected_digest expected_hash expected_installer_names manifest_dir manifest_json manifest_path release_json
# Add future promoted installer names, such as MSIX x64/ARM64, here.
local -a installer_assets=(
"OpenClawCompanion-Setup-x64.exe"
"OpenClawCompanion-Setup-arm64.exe"
)
local -a required_assets=(
"${installer_assets[@]}"
"OpenClawCompanion-SHA256SUMS.txt"
)
release_json="$(gh release view "${RELEASE_TAG}" --repo "$GITHUB_REPOSITORY" --json assets,url)"
expected_companion_assets="$(printf '%s\n' "${required_assets[@]}" | jq -R . | jq -sc 'sort')"
actual_companion_assets="$(printf '%s' "${release_json}" | jq -c '
[.assets[]? | select(.name | startswith("OpenClawCompanion-")) | .name] | sort
')"
if [[ "${actual_companion_assets}" != "${expected_companion_assets}" ]]; then
echo "Stable release OpenClawCompanion asset names do not exactly match the current contract." >&2
return 1
fi
for asset_name in "${required_assets[@]}"; do
if ! printf '%s' "${release_json}" | jq -e --arg name "${asset_name}" 'any(.assets[]?; .name == $name)' >/dev/null; then
echo "Stable release is missing required Windows asset ${asset_name}." >&2
return 1
fi
done
manifest_dir="${RUNNER_TEMP}/openclaw-windows-release-contract"
manifest_path="${manifest_dir}/OpenClawCompanion-SHA256SUMS.txt"
rm -rf "${manifest_dir}"
mkdir -p "${manifest_dir}"
gh release download "${RELEASE_TAG}" \
--repo "$GITHUB_REPOSITORY" \
--pattern "OpenClawCompanion-SHA256SUMS.txt" \
--dir "${manifest_dir}"
if ! manifest_json="$(jq -Rsc '
split("\n") as $lines |
(if $lines[-1] == "" then $lines[0:-1] else $lines end) |
map(sub("\r$"; "")) |
if all(.[]; test("^(?<hash>[a-f0-9]{64}) (?<name>[^/\\\\]+)$"))
then map(capture("^(?<hash>[a-f0-9]{64}) (?<name>[^/\\\\]+)$"))
else error("malformed Windows checksum manifest entry")
end
' "${manifest_path}")"; then
echo "Stable release Windows checksum manifest contains malformed entries." >&2
return 1
fi
expected_installer_names="$(printf '%s\n' "${installer_assets[@]}" | jq -R . | jq -sc 'sort')"
if ! printf '%s' "${manifest_json}" | jq -e --argjson expected "${expected_installer_names}" '
length == ($expected | length) and
([.[].name] | sort) == $expected and
([.[].name] | unique | length) == length
' >/dev/null; then
echo "Stable release Windows checksum manifest does not exactly match the installer asset contract." >&2
return 1
fi
for asset_name in "${installer_assets[@]}"; do
expected_digest="$(printf '%s' "${WINDOWS_NODE_INSTALLER_DIGESTS}" | jq -r --arg name "${asset_name}" '.[$name] // empty')"
actual_digest="$(printf '%s' "${release_json}" | jq -r --arg name "${asset_name}" '.assets[]? | select(.name == $name) | .digest // empty')"
if [[ -z "${expected_digest}" || "${actual_digest}" != "${expected_digest}" ]]; then
echo "Stable release Windows asset ${asset_name} does not match its pinned digest." >&2
return 1
fi
expected_hash="${expected_digest#sha256:}"
if ! printf '%s' "${manifest_json}" | jq -e --arg name "${asset_name}" --arg hash "${expected_hash}" '
any(.[]; .name == $name and .hash == $hash)
' >/dev/null; then
echo "Stable release Windows checksum manifest does not match pinned digest for ${asset_name}." >&2
return 1
fi
done
echo "- Windows Hub asset contract: verified" >> "$GITHUB_STEP_SUMMARY"
}
promote_windows_release_assets() {
if ! is_stable_release; then
return 0
fi
if [[ -z "${WINDOWS_NODE_INSTALLER_DIGESTS// }" ]]; then
echo "Stable release is missing prevalidated Windows installer digests." >&2
return 1
fi
windows_node_run_id="$(dispatch_workflow windows-node-release.yml \
-f tag="${RELEASE_TAG}" \
-f windows_node_tag="${WINDOWS_NODE_TAG}" \
-f expected_installer_digests="${WINDOWS_NODE_INSTALLER_DIGESTS}")"
echo "- Windows Node release run ID: \`${windows_node_run_id}\`" >> "$GITHUB_STEP_SUMMARY"
wait_for_run windows-node-release.yml "${windows_node_run_id}"
}
upload_dependency_evidence_release_asset() {
local release_version download_dir asset_path asset_name artifact_name
release_version="${RELEASE_TAG#v}"
@@ -1112,7 +913,7 @@ jobs:
}
append_release_proof_to_github_release() {
local release_version body_file notes_file tarball integrity telegram_line clawhub_line clawhub_bootstrap_line clawhub_runtime_state_path windows_line
local release_version body_file notes_file tarball integrity telegram_line clawhub_line clawhub_bootstrap_line clawhub_runtime_state_path
release_version="${RELEASE_TAG#v}"
body_file="${RUNNER_TEMP}/release-body.md"
@@ -1130,10 +931,6 @@ jobs:
write_clawhub_runtime_state false "${clawhub_runtime_state_path}"
clawhub_line="$(jq -r '.proofLines.normal' "${clawhub_runtime_state_path}")"
clawhub_bootstrap_line="$(jq -r '.proofLines.bootstrap' "${clawhub_runtime_state_path}")"
windows_line=""
if [[ -n "${windows_node_run_id// }" ]]; then
windows_line="- Windows Hub promotion: https://github.com/${GITHUB_REPOSITORY}/actions/runs/${windows_node_run_id} from openclaw/openclaw-windows-node@${WINDOWS_NODE_TAG}"
fi
RELEASE_BODY_FILE="${body_file}" \
RELEASE_NOTES_FILE="${notes_file}" \
@@ -1151,7 +948,6 @@ jobs:
CLAWHUB_LINE="${clawhub_line}" \
CLAWHUB_BOOTSTRAP_LINE="${clawhub_bootstrap_line}" \
TELEGRAM_LINE="${telegram_line}" \
WINDOWS_LINE="${windows_line}" \
node --input-type=module <<'NODE'
import { readFileSync, writeFileSync } from "node:fs";
@@ -1178,7 +974,6 @@ jobs:
process.env.CLAWHUB_BOOTSTRAP_LINE,
`- OpenClaw npm publish: https://github.com/${process.env.RELEASE_REPO}/actions/runs/${process.env.OPENCLAW_NPM_RUN_ID}`,
process.env.TELEGRAM_LINE,
...(process.env.WINDOWS_LINE ? [process.env.WINDOWS_LINE] : []),
].join("\n");
const withoutOldProof = body.replace(/\n?### Release verification\n[\s\S]*?(?=\n### |\n## |$)/, "");
@@ -1203,9 +998,6 @@ jobs:
else
echo "- OpenClaw npm publish: skipped by input"
fi
if is_stable_release && [[ "${PUBLISH_OPENCLAW_NPM}" == "true" ]]; then
echo "- Windows Hub promotion: required before the GitHub release can be published"
fi
if [[ "${WAIT_FOR_CLAWHUB}" == "true" ]]; then
echo "- Workflow completion waits for ClawHub"
else
@@ -1350,7 +1142,6 @@ jobs:
failed=0
openclaw_failed=0
windows_node_run_id=""
if [[ -n "${openclaw_pid}" ]] && ! wait "${openclaw_pid}"; then
failed=1
openclaw_failed=1
@@ -1381,9 +1172,6 @@ jobs:
fi
create_or_update_github_release
upload_dependency_evidence_release_asset
if ! promote_windows_release_assets; then
failed=1
fi
append_release_proof_to_github_release
if [[ "${failed}" == "0" ]]; then
publish_github_release
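
Reviewer note on the Windows gating that appears in this workflow's hunks: the bash checks treat a tag as stable unless it contains `-alpha.` or `-beta.`, and they require `windows_node_tag` to look like an explicit version rather than `latest`. The sketch below restates both checks in JavaScript. The function names and sample tags are hypothetical; only the regular expression is copied from the diff.

```javascript
// Mirrors the bash test: stable means neither "-alpha." nor "-beta." appears in the tag.
function isStableRelease(releaseTag) {
  return !releaseTag.includes("-alpha.") && !releaseTag.includes("-beta.");
}

// Same pattern as the bash `=~` check on windows_node_tag.
const windowsNodeTagPattern =
  /^v[0-9]+\.[0-9]+\.[0-9]+([-.][0-9A-Za-z]+([.-][0-9A-Za-z]+)*)?$/;

// Returns true only for explicit version-style tags, so "latest" is rejected.
function isExplicitWindowsNodeTag(tag) {
  return windowsNodeTagPattern.test(tag);
}

console.log(isStableRelease("v2026.6.1")); // true
console.log(isStableRelease("v2026.6.1-beta.2")); // false
console.log(isExplicitWindowsNodeTag("v0.6.3")); // true
console.log(isExplicitWindowsNodeTag("latest")); // false
```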

@@ -532,6 +532,7 @@ jobs:
OPENCLAW_QA_CONVEX_SECRET_CI: ${{ secrets.OPENCLAW_QA_CONVEX_SECRET_CI }}
OPENCLAW_QA_CREDENTIAL_ACQUIRE_TIMEOUT_MS: "1800000"
OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT: "1"
INPUT_SCENARIO: ${{ github.event_name == 'workflow_dispatch' && inputs.scenario || '' }}
run: |
set -euo pipefail

@@ -68,7 +68,7 @@ jobs:
days-before-pr-close: 7
stale-issue-label: stale
stale-pr-label: stale
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle,clawsweeper:queueable-fix,clawsweeper:source-repro,clawsweeper:fix-shape-clear
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle
exempt-pr-labels: maintainer,no-stale,bad-barnacle
operations-per-run: 2000
ascending: true
@@ -100,7 +100,7 @@ jobs:
days-before-pr-stale: -1
days-before-pr-close: -1
stale-issue-label: stale
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle,clawsweeper:queueable-fix,clawsweeper:source-repro,clawsweeper:fix-shape-clear
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle
operations-per-run: 2000
ascending: true
include-only-assigned: true
@@ -172,7 +172,7 @@ jobs:
days-before-pr-close: 7
stale-issue-label: stale
stale-pr-label: stale
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle,clawsweeper:queueable-fix,clawsweeper:source-repro,clawsweeper:fix-shape-clear
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle
exempt-pr-labels: maintainer,no-stale,bad-barnacle
operations-per-run: 2000
ascending: true
@@ -203,7 +203,7 @@ jobs:
days-before-pr-stale: -1
days-before-pr-close: -1
stale-issue-label: stale
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle,clawsweeper:queueable-fix,clawsweeper:source-repro,clawsweeper:fix-shape-clear
exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale,bad-barnacle
operations-per-run: 2000
ascending: true
include-only-assigned: true
@@ -277,9 +277,6 @@ jobs:
"security",
"no-stale",
"bad-barnacle",
"clawsweeper:queueable-fix",
"clawsweeper:source-repro",
"clawsweeper:fix-shape-clear",
]);
const prExemptLabels = new Set(["maintainer", "no-stale", "bad-barnacle"]);
const maintainerAssociations = new Set(["OWNER", "MEMBER", "COLLABORATOR"]);
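The label lists in these hunks amount to a set-membership test. A minimal sketch of that test follows, assuming the shorter label list (without the `clawsweeper:*` entries) and assuming that a single matching label is enough to exempt an item. The helper name `is_exempt` is hypothetical and does not appear in the workflow.

```python
# Label sets copied from the shorter variant of the lists shown in the hunks above.
ISSUE_EXEMPT_LABELS = frozenset(
    "enhancement,maintainer,pinned,security,no-stale,bad-barnacle".split(",")
)
PR_EXEMPT_LABELS = frozenset({"maintainer", "no-stale", "bad-barnacle"})


def is_exempt(labels, *, is_pull_request):
    """Hypothetical helper: True when any label shields the item from stale handling."""
    exempt = PR_EXEMPT_LABELS if is_pull_request else ISSUE_EXEMPT_LABELS
    return any(label in exempt for label in labels)
```

Under this list, an issue that carries only a `clawsweeper:*` label is no longer shielded, and `security` exempts issues but not pull requests.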

View File

@@ -8,12 +8,9 @@ on:
required: true
type: string
windows_node_tag:
description: Exact openclaw-windows-node release tag to promote, for example v0.6.3
required: true
type: string
expected_installer_digests:
description: Compact JSON map of installer asset names to pinned source sha256 digests
description: openclaw-windows-node release tag to promote, or latest
required: true
default: latest
type: string
permissions:
@@ -34,129 +31,46 @@ jobs:
env:
RELEASE_TAG: ${{ inputs.tag }}
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
EXPECTED_INSTALLER_DIGESTS: ${{ inputs.expected_installer_digests }}
GH_TOKEN: ${{ github.token }}
run: |
if ($env:RELEASE_TAG -notmatch '^v[0-9]{4}\.[1-9][0-9]*\.[1-9][0-9]*((-(alpha|beta)\.[1-9][0-9]*)|(-[1-9][0-9]*))?$') {
throw "Invalid OpenClaw release tag: $env:RELEASE_TAG"
}
$stableRelease = -not (
$env:RELEASE_TAG.Contains("-alpha.") -or
$env:RELEASE_TAG.Contains("-beta.")
)
if ($env:WINDOWS_NODE_TAG -notmatch '^v[0-9]+\.[0-9]+\.[0-9]+([-.][0-9A-Za-z]+([.-][0-9A-Za-z]+)*)?$') {
throw "windows_node_tag must be an explicit openclaw-windows-node release tag, not latest: $env:WINDOWS_NODE_TAG"
}
try {
$expectedDigests = $env:EXPECTED_INSTALLER_DIGESTS | ConvertFrom-Json -AsHashtable
} catch {
throw "expected_installer_digests must be a JSON object: $_"
}
# Add future signed installer names, such as MSIX x64/ARM64, here.
$requiredInstallerNames = @(
"OpenClawCompanion-Setup-x64.exe",
"OpenClawCompanion-Setup-arm64.exe"
)
$allowedTargetCompanionAssetNames = @(
$requiredInstallerNames
"OpenClawCompanion-SHA256SUMS.txt"
)
if ($expectedDigests.Count -ne $requiredInstallerNames.Count) {
throw "expected_installer_digests must contain exactly the current installer asset contract."
}
foreach ($name in $requiredInstallerNames) {
$digest = [string]$expectedDigests[$name]
if ($digest -notmatch '^sha256:[A-Fa-f0-9]{64}$') {
throw "expected_installer_digests is missing a valid pinned digest for $name."
}
}
$targetRelease = gh release view $env:RELEASE_TAG --repo $env:GITHUB_REPOSITORY --json tagName,isDraft,isPrerelease,assets,url | ConvertFrom-Json
if ($targetRelease.tagName -ne $env:RELEASE_TAG) {
throw "OpenClaw release tag mismatch: expected $env:RELEASE_TAG, got $($targetRelease.tagName)"
}
$unexpectedTargetCompanionAssets = @(
$targetRelease.assets |
Where-Object {
$_.name.StartsWith("OpenClawCompanion-") -and
$_.name -notin $allowedTargetCompanionAssetNames
} |
ForEach-Object name |
Sort-Object
)
if ($unexpectedTargetCompanionAssets.Count -ne 0) {
throw "Target OpenClaw release contains unexpected OpenClawCompanion assets before upload: $($unexpectedTargetCompanionAssets -join ', ')"
}
$sourceRelease = gh release view $env:WINDOWS_NODE_TAG --repo openclaw/openclaw-windows-node --json tagName,isDraft,isPrerelease,assets,url | ConvertFrom-Json
if ($sourceRelease.tagName -ne $env:WINDOWS_NODE_TAG) {
throw "Windows source release tag mismatch: expected $env:WINDOWS_NODE_TAG, got $($sourceRelease.tagName)"
}
if ($sourceRelease.isDraft) {
throw "Windows source release must be published: $($sourceRelease.url)"
}
if ($stableRelease -and $sourceRelease.isPrerelease) {
throw "Stable OpenClaw releases require a non-prerelease Windows source release: $($sourceRelease.url)"
}
foreach ($name in $requiredInstallerNames) {
$sourceAssets = @($sourceRelease.assets | Where-Object name -eq $name)
if ($sourceAssets.Count -ne 1) {
throw "Windows source release must contain exactly one required asset $name; found $($sourceAssets.Count)."
}
if ([string]$sourceAssets[0].digest -ne [string]$expectedDigests[$name]) {
throw "Windows source release asset digest does not match the pinned digest: $name"
}
if ($env:WINDOWS_NODE_TAG -ne "latest" -and $env:WINDOWS_NODE_TAG -notmatch '^v[0-9]+\.[0-9]+\.[0-9]+([-.][0-9A-Za-z.-]+)?$') {
throw "Invalid openclaw-windows-node release tag: $env:WINDOWS_NODE_TAG"
}
gh release view $env:RELEASE_TAG --repo $env:GITHUB_REPOSITORY | Out-Null
- name: Download Windows Hub release installers
shell: pwsh
env:
WINDOWS_NODE_TAG: ${{ inputs.windows_node_tag }}
EXPECTED_INSTALLER_DIGESTS: ${{ inputs.expected_installer_digests }}
GH_TOKEN: ${{ github.token }}
run: |
New-Item -ItemType Directory -Force -Path dist | Out-Null
# Add future signed installer patterns, such as MSIX x64/ARM64, here.
# Every matched installer is signature-checked, checksummed, and promoted.
$installerPatterns = @(
"OpenClawCompanion-Setup-x64.exe",
"OpenClawCompanion-Setup-arm64.exe"
)
$downloadArgs = @(
$env:WINDOWS_NODE_TAG,
"--repo", "openclaw/openclaw-windows-node",
"--dir", "dist"
)
foreach ($pattern in $installerPatterns) {
$downloadArgs += @("--pattern", $pattern)
}
gh release download @downloadArgs
if ($LASTEXITCODE -ne 0) {
throw "Failed to download Windows release assets from $env:WINDOWS_NODE_TAG."
$tagArgs = @()
if ($env:WINDOWS_NODE_TAG -ne "latest") {
$tagArgs += $env:WINDOWS_NODE_TAG
}
gh release download @tagArgs `
--repo openclaw/openclaw-windows-node `
--pattern "OpenClawCompanion-Setup-*.exe" `
--dir dist
foreach ($pattern in $installerPatterns) {
$patternMatches = @(Get-ChildItem -LiteralPath dist -File | Where-Object Name -Like $pattern)
if ($patternMatches.Count -ne 1) {
throw "Expected exactly one Windows installer matching '$pattern', found $($patternMatches.Count)."
}
}
$expectedDigests = $env:EXPECTED_INSTALLER_DIGESTS | ConvertFrom-Json -AsHashtable
foreach ($file in Get-ChildItem -LiteralPath dist -File) {
$expectedHash = ([string]$expectedDigests[$file.Name]) -replace '^sha256:', ''
$actualHash = (Get-FileHash -Algorithm SHA256 -LiteralPath $file.FullName).Hash
if ($actualHash -ne $expectedHash) {
throw "Downloaded Windows source asset does not match pinned digest: $($file.Name)"
$expected = @(
"dist/OpenClawCompanion-Setup-x64.exe",
"dist/OpenClawCompanion-Setup-arm64.exe"
)
foreach ($file in $expected) {
if (-not (Test-Path -LiteralPath $file)) {
throw "Missing expected Windows installer: $file"
}
}
- name: Verify Authenticode signatures
shell: pwsh
run: |
$expectedSignerSubject = "CN=OpenClaw Foundation, O=OpenClaw Foundation, L=Mill Valley, S=California, C=US"
Get-ChildItem -LiteralPath dist -File | ForEach-Object {
Get-ChildItem -LiteralPath dist -Filter "OpenClawCompanion-Setup-*.exe" | ForEach-Object {
$signature = Get-AuthenticodeSignature -LiteralPath $_.FullName
if ($signature.Status -ne "Valid") {
throw "$($_.Name) Authenticode signature was $($signature.Status)."
@@ -164,9 +78,6 @@ jobs:
if (-not $signature.SignerCertificate) {
throw "$($_.Name) has no signer certificate."
}
if ($signature.SignerCertificate.Subject -ne $expectedSignerSubject) {
throw "$($_.Name) has unexpected signer subject $($signature.SignerCertificate.Subject)."
}
[pscustomobject]@{
File = $_.Name
Signer = $signature.SignerCertificate.Subject
@@ -177,7 +88,7 @@ jobs:
- name: Write SHA-256 manifest
shell: pwsh
run: |
Get-ChildItem -LiteralPath dist -File |
Get-ChildItem -LiteralPath dist -Filter "OpenClawCompanion-Setup-*.exe" |
Sort-Object Name |
ForEach-Object {
$hash = Get-FileHash -Algorithm SHA256 -LiteralPath $_.FullName
@@ -190,81 +101,12 @@ jobs:
RELEASE_TAG: ${{ inputs.tag }}
GH_TOKEN: ${{ github.token }}
run: |
$releaseAssets = @(Get-ChildItem -LiteralPath dist -File | Sort-Object Name | ForEach-Object FullName)
gh release upload $env:RELEASE_TAG @releaseAssets --repo $env:GITHUB_REPOSITORY --clobber
if ($LASTEXITCODE -ne 0) {
throw "Failed to upload Windows release assets to $env:RELEASE_TAG."
}
- name: Verify promoted release asset contract
shell: pwsh
env:
RELEASE_TAG: ${{ inputs.tag }}
GH_TOKEN: ${{ github.token }}
run: |
New-Item -ItemType Directory -Force -Path verified | Out-Null
$expectedAssets = @(Get-ChildItem -LiteralPath dist -File | Sort-Object Name)
$expectedCompanionAssetNames = @($expectedAssets | ForEach-Object Name | Sort-Object)
$targetRelease = gh release view $env:RELEASE_TAG --repo $env:GITHUB_REPOSITORY --json assets | ConvertFrom-Json
$actualCompanionAssetNames = @(
$targetRelease.assets |
Where-Object { $_.name.StartsWith("OpenClawCompanion-") } |
ForEach-Object name |
Sort-Object
)
$assetContractDiff = @(
Compare-Object `
-ReferenceObject $expectedCompanionAssetNames `
-DifferenceObject $actualCompanionAssetNames
)
if (
$actualCompanionAssetNames.Count -ne $expectedCompanionAssetNames.Count -or
$assetContractDiff.Count -ne 0
) {
throw "Promoted OpenClawCompanion asset names do not exactly match the current contract."
}
foreach ($asset in $expectedAssets) {
gh release download $env:RELEASE_TAG `
--repo $env:GITHUB_REPOSITORY `
--pattern $asset.Name `
--dir verified
if ($LASTEXITCODE -ne 0) {
throw "Failed to download promoted Windows release asset $($asset.Name)."
}
}
$manifestPath = "verified/OpenClawCompanion-SHA256SUMS.txt"
$manifestEntries = @(Get-Content -LiteralPath $manifestPath | ForEach-Object {
if ($_ -notmatch '^([A-Fa-f0-9]{64}) ([^\\/]+)$') {
throw "Invalid Windows SHA-256 manifest entry: $_"
}
[PSCustomObject]@{
Hash = $Matches[1]
Name = $Matches[2]
}
})
$expectedInstallerNames = @(
$expectedAssets |
Where-Object Name -ne "OpenClawCompanion-SHA256SUMS.txt" |
ForEach-Object Name
)
$manifestInstallerNames = @($manifestEntries | ForEach-Object Name | Sort-Object)
$contractDiff = @(
Compare-Object `
-ReferenceObject $expectedInstallerNames `
-DifferenceObject $manifestInstallerNames
)
if ($contractDiff.Count -ne 0) {
throw "Promoted Windows SHA-256 manifest does not match the installer asset contract."
}
foreach ($entry in $manifestEntries) {
$hash = (Get-FileHash -Algorithm SHA256 -LiteralPath "verified/$($entry.Name)").Hash
if ($hash -ne $entry.Hash) {
throw "Promoted Windows release asset checksum mismatch: $($entry.Name)"
}
}
gh release upload $env:RELEASE_TAG `
dist/OpenClawCompanion-Setup-x64.exe `
dist/OpenClawCompanion-Setup-arm64.exe `
dist/OpenClawCompanion-SHA256SUMS.txt `
--repo $env:GITHUB_REPOSITORY `
--clobber
- name: Summary
shell: pwsh
@@ -277,9 +119,8 @@ jobs:
OpenClaw release: $env:RELEASE_TAG
Source release: openclaw/openclaw-windows-node@$env:WINDOWS_NODE_TAG
- https://github.com/openclaw/openclaw/releases/download/$env:RELEASE_TAG/OpenClawCompanion-Setup-x64.exe
- https://github.com/openclaw/openclaw/releases/download/$env:RELEASE_TAG/OpenClawCompanion-Setup-arm64.exe
- https://github.com/openclaw/openclaw/releases/download/$env:RELEASE_TAG/OpenClawCompanion-SHA256SUMS.txt
"@ >> $env:GITHUB_STEP_SUMMARY
Get-ChildItem -LiteralPath dist -File |
Sort-Object Name |
ForEach-Object {
"- https://github.com/openclaw/openclaw/releases/download/$env:RELEASE_TAG/$($_.Name)"
} >> $env:GITHUB_STEP_SUMMARY
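The tag checks in the validation step can be tried outside the workflow. The sketch below copies the two regular expressions from the variant of the step that accepts `latest`. The function name `validate_inputs` and the use of `ValueError` are assumptions made for illustration, not part of the workflow.

```python
import re

# Patterns copied from the workflow's validation step (the variant that allows "latest").
RELEASE_TAG = re.compile(
    r"^v[0-9]{4}\.[1-9][0-9]*\.[1-9][0-9]*"
    r"((-(alpha|beta)\.[1-9][0-9]*)|(-[1-9][0-9]*))?$"
)
WINDOWS_NODE_TAG = re.compile(r"^v[0-9]+\.[0-9]+\.[0-9]+([-.][0-9A-Za-z.-]+)?$")


def validate_inputs(release_tag, windows_node_tag):
    """Hypothetical helper: raise ValueError when either workflow input is malformed."""
    if not RELEASE_TAG.fullmatch(release_tag):
        raise ValueError(f"Invalid OpenClaw release tag: {release_tag}")
    if windows_node_tag != "latest" and not WINDOWS_NODE_TAG.fullmatch(windows_node_tag):
        raise ValueError(f"Invalid openclaw-windows-node release tag: {windows_node_tag}")
```

Accepting `latest` means the source release is resolved at download time. The other variant of the step rejects `latest` and pins each installer to a `sha256:` digest instead.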

View File

@@ -214,7 +214,6 @@ Skills own workflows; root owns hard policy and routing.
- Vitest. Colocated `*.test.ts`; e2e `*.e2e.test.ts`; example models `sonnet-4.6`, `gpt-5.5`; test GPT with 5.5 preferred, 5.4 ok; no GPT-4.x agent-smoke defaults.
- Prefer behavior tests over workflow/docs string greps. Put operator policy reminders in AGENTS/docs.
- QA scenario sources are YAML only: `qa/scenarios/index.yaml` and `qa/scenarios/<theme>/*.yaml`. Do not add fenced `qa-scenario`/`qa-flow` Markdown files under `qa/scenarios/`.
- Clean timers/env/globals/mocks/sockets/temp dirs/module state; `--isolate=false` safe.
- Prefer injection and narrow `*.runtime.ts` mocks over broad barrels or `openclaw/plugin-sdk/*`.
- Do not edit baseline/inventory/ignore/snapshot/expected-failure files to silence checks without explicit approval.

View File

@@ -2,35 +2,6 @@
Docs: https://docs.openclaw.ai
## 2026.6.8
### Highlights
- Telegram and WhatsApp channel delivery are richer and less brittle: Telegram can send structured rich text with tables, lists, expandable blockquotes, prompt-preserving CLI backend delivery, retired native draft migration, and safer rich-media boundaries, while WhatsApp now honors configured ACP bindings. (#92679, #84082, #89421, #92513) Thanks @obviyus, @jzakirov, @spacegeologist, and @TurboTheTurtle.
- Agent and Gateway recovery is sharper across account-scoped DM sends, generated media completions, restart shutdown aborts, yielded subagent pauses, yielded cron media, heartbeat dedupe, session identity prompts, and unknown OpenAI agent selector rejection. (#92788, #91246, #91357, #92631, #92146, #91287, #92468, #92510) Thanks @yetval, @TurboTheTurtle, @ooiuuii, @openperf, @IWhatsskill, @ZengWen-DT, and @zhangguiping-xydt.
- Provider/model handling expands and tightens with GLM-5.2, Claude Haiku 4.5 catalog rows, OpenRouter and Google Vertex provider-prefix normalization, managed SecretRef auth, bounded model browse discovery, storeless OpenAI Responses replay gating, and Claude 4.5 Copilot tool-streaming safety. (#92796, #90116, #92627, #91218, #90686, #92247, #90706, #75393) Thanks @arkyu2077, @liuhao1024, @bymle, @rohitjavvadi, @samson910022, @snowzlm, and @Kailigithub.
- `/usage` and reply payload hooks now have a native full footer renderer, default template, fixed-decimal formatting, credential-aware limits, better partial-count handling, and warnings for broken templates instead of silent bad output. (#92657, #89835, #89629) Thanks @Marvinthebored.
- UI and mobile flows are steadier: workspace files can collapse and start collapsed, WebChat backscroll survives streaming, the sidebar session picker remains interactive above the desktop workbench, reset soft args survive UI dispatch, stale dashboard session parent lineage is preserved, and iOS reconnects stale foreground gateways. (#92779, #92622, #92705, #91353, #90658, #92552) Thanks @shakkernerd, @TurboTheTurtle, @NianJiuZst, @zhouhe-xydt, @luoyanglang, and @Solvely-Colin.
- Memory, state, and diagnostics recover cleaner: oversized OpenAI embedding batches split before 431s, QMD memory search stays available in transient mode, SQLite avoids WAL on NFS state volumes, stuck-session recovery scheduling no longer resets warning backoff, and Infinity chunk limits stay genuinely unbounded. (#92650, #92618, #92639, #91247, #92752, #92735) Thanks @mushuiyu886, @TurboTheTurtle, @849261680, @gnanam1990, and @yhterrance.
### Changes
- Providers/models: add GLM-5.2 support and Claude Haiku 4.5 catalog entries while keeping provider-qualified model IDs normalized across OpenRouter and Google Vertex paths. (#92796, #90116, #92627, #91218) Thanks @arkyu2077, @liuhao1024, and @bymle.
- Channel plugins: ship Telegram rich-message delivery and WhatsApp ACP binding support, including rich prompt handoff to CLI backends and transport fixtures for richer drafts. (#92679, #92513) Thanks @obviyus and @TurboTheTurtle.
- Agent commands: support `/btw` in CLI-backed sessions and keep CLI usage-error exits classified as usage failures instead of successful runs. (#92669, #92162) Thanks @joshavant and @Pandah97.
- Usage hooks: add built-in full footer rendering, default footer templates, per-turn usage state, credential-aware limits, and fixed-decimal formatting for usage-bar templates. (#92657, #89835, #89629) Thanks @Marvinthebored.
- Docs and operator guidance: document node config examples, clarify before-install hook scope, correct agent default concurrency comments, refresh ZAI provider docs, and update channel/group docs for current Telegram and WhatsApp behavior. (#92677, #92766, #92695) Thanks @liuhao1024, @sallyom, and @ArielSmoliar.
### Fixes
- Channels and delivery: preserve account-scoped DM channel send policy, rich Telegram final replies, rich Telegram tables and lists, Telegram thread-create CLI remapping, Slack outbound `message_sent` hooks, contributed message-tool schema optionality, same-channel generated media completions, and channel chunking around surrogate pairs and Infinity limits. (#92788, #92679, #89421, #89943, #91137, #91246, #92735) Thanks @yetval, @obviyus, @spacegeologist, @rishitamrakar, @lundog, @TurboTheTurtle, and @yhterrance.
- Discord: give generated auto-thread titles a 60-second timeout and 4,096-token reasoning-model output budget, clamped to the selected model output cap. (#64734) Thanks @hanamizuki.
- Agent, cron, and Gateway runtime: mark active main sessions before restart shutdown aborts, pause yielded subagent runs whose terminal also signals abort, preserve yielded media completions, de-duplicate main-session heartbeat events, expose session identity in runtime prompts, reject unknown OpenAI agent selectors, keep generated media completions and slash-command block replies in WebChat, preserve fresh post-compaction usage while clearing stale usage snapshots, and require admin privileges for HTTP session/model override surfaces. (#91357, #92631, #92146, #91287, #92468, #92510, #91246, #50795, #50845, #82874, #92651, #92646) Thanks @ooiuuii, @openperf, @IWhatsskill, @ZengWen-DT, @zhangguiping-xydt, @Hollychou924, @leno23, and @TurboTheTurtle.
- Providers and model replay: preserve storeless OpenAI Responses replay compatibility, avoid eager tool streaming for Claude 4.5 in Copilot, honor profile auth for SecretRef model entries, bound model browsing, strip provider prefixes where runtimes need bare IDs, and surface nested embedding fetch failures. (#90706, #75393, #90686, #92247, #92627, #91218, #92628) Thanks @snowzlm, @Kailigithub, @rohitjavvadi, @samson910022, @liuhao1024, @bymle, and @mushuiyu886.
- Memory, state, diagnostics, and config: split header-too-large embedding batches, keep QMD memory search enabled in transient mode, avoid SQLite WAL on NFS volumes, preserve recovery scheduling outside stuck-session warning backoff, and keep shell environment fallbacks contained in config write tests. (#92650, #92618, #92639, #91247, #92752) Thanks @mushuiyu886, @TurboTheTurtle, @849261680, and @gnanam1990.
- UI/mobile/TUI: preserve dashboard session parent lineage, WebChat backscroll, reset soft command args, sidebar session picker interactivity, collapsed workspace files, resolved `/model` confirmation refs, and stale foreground iOS Gateway reconnects. (#90658, #92622, #91353, #92705, #92779, #92773, #92552) Thanks @luoyanglang, @TurboTheTurtle, @zhouhe-xydt, @NianJiuZst, @shakkernerd, @NarahariRaghava, and @Solvely-Colin.
- Release and test reliability: extend slow Gateway/full-suite watchdogs, split local full-suite shards when throttled, stabilize plugin auth marker fixtures, avoid brittle provider-ref error text, and keep QA Lab bootstrap selection assertions aligned with flow-only scenarios. (#92652)
## 2026.6.6
### Highlights

View File

@@ -147,10 +147,6 @@ RUN --mount=type=cache,id=openclaw-pnpm-store,target=/root/.local/share/pnpm/sto
OPENCLAW_EXTENSIONS="$OPENCLAW_EXTENSIONS" OPENCLAW_BUNDLED_PLUGIN_DIR="$OPENCLAW_BUNDLED_PLUGIN_DIR" node scripts/prune-docker-plugin-dist.mjs && \
node scripts/postinstall-bundled-plugins.mjs && \
find dist -type f \( -name '*.d.ts' -o -name '*.d.mts' -o -name '*.d.cts' -o -name '*.map' \) -delete && \
rm -rf \
/app/node_modules/openclaw \
/app/node_modules/.bin/openclaw \
/app/node_modules/.pnpm/openclaw@*/node_modules/openclaw && \
node scripts/check-package-dist-imports.mjs /app
# ── Runtime base image ──────────────────────────────────────────

View File

@@ -188,7 +188,6 @@ final class NodeAppModel {
@ObservationIgnored private var backgroundGraceTaskTimer: Task<Void, Never>?
private var backgroundReconnectSuppressed = false
private var backgroundReconnectLeaseUntil: Date?
@ObservationIgnored private var foregroundGatewayResumeCheckInFlight = false
private var lastSignificantLocationWakeAt: Date?
@ObservationIgnored private let watchReplyCoordinator = WatchReplyCoordinator()
private var watchExecApprovalPromptsByID: [String: ExecApprovalPrompt] = [:]
@@ -215,7 +214,6 @@ final class NodeAppModel {
private static let watchExecApprovalBridgeStateKey = "watch.execApproval.bridge.state.v1"
private static let backgroundAliveLastSuccessAtMsKey = "gateway.backgroundAlive.lastSuccessAtMs"
private static let backgroundAliveLastTriggerKey = "gateway.backgroundAlive.lastTrigger"
private static let foregroundResumeHealthTimeoutSeconds = 1
var cameraHUDText: String?
var cameraHUDKind: CameraHUDKind?
@@ -419,7 +417,9 @@ final class NodeAppModel {
self.isBackgrounded = false
self.endBackgroundConnectionGracePeriod(reason: "scene_foreground")
self.clearBackgroundReconnectSuppression(reason: "scene_foreground")
var shouldStartGatewayHealthMonitor = self.operatorConnected
if self.operatorConnected {
self.startGatewayHealthMonitor()
}
if phase == .active {
self.voiceWake.resumeAfterExternalAudioCapture(wasSuspended: self.backgroundVoiceWakeSuspended)
self.backgroundVoiceWakeSuspended = false
@@ -444,8 +444,6 @@ final class NodeAppModel {
// iOS may suspend network sockets in background without a clean close.
// On foreground, force a fresh handshake to avoid "connected but dead" states.
if backgroundedFor >= 3.0 {
shouldStartGatewayHealthMonitor = false
self.foregroundGatewayResumeCheckInFlight = true
Task { [weak self] in
guard let self else { return }
let operatorWasConnected = await MainActor.run { self.operatorConnected }
@@ -454,26 +452,31 @@ final class NodeAppModel {
let healthy = await (try? self.operatorGateway.request(
method: "health",
paramsJSON: nil,
timeoutSeconds: Self.foregroundResumeHealthTimeoutSeconds)) != nil
timeoutSeconds: 2)) != nil
if healthy {
await MainActor.run {
self.foregroundGatewayResumeCheckInFlight = false
self.startGatewayHealthMonitor()
}
await MainActor.run { self.startGatewayHealthMonitor() }
return
}
}
await self.operatorGateway.disconnect()
await self.nodeGateway.disconnect()
await MainActor.run {
self.foregroundGatewayResumeCheckInFlight = false
guard !self.isAppleReviewDemoModeEnabled else { return }
self.setOperatorConnected(false)
self.gatewayConnected = false
// Foreground recovery must actively restart the saved gateway config.
// Disconnecting stale sockets alone can leave us idle if the old
// reconnect tasks were suppressed or otherwise got stuck in background.
self.gatewayStatusText = "Reconnecting…"
self.talkMode.updateGatewayConnected(false)
if let cfg = self.activeGatewayConnectConfig {
self.applyGatewayConnectConfig(cfg)
}
}
await self.restartGatewaySessionsAfterForegroundStaleConnection()
}
}
}
if shouldStartGatewayHealthMonitor {
self.startGatewayHealthMonitor()
}
@unknown default:
self.isBackgrounded = false
self.endBackgroundConnectionGracePeriod(reason: "scene_unknown")
@@ -783,12 +786,6 @@ final class NodeAppModel {
func refreshGatewayOverviewIfConnected() async {
guard await self.isOperatorConnected() else { return }
if self.foregroundGatewayResumeCheckInFlight {
GatewayDiagnostics.log("gateway overview refresh deferred reason=foreground_resume_check")
try? await Task.sleep(
nanoseconds: UInt64(Self.foregroundResumeHealthTimeoutSeconds) * 1_000_000_000)
guard await self.isOperatorConnected(), !self.foregroundGatewayResumeCheckInFlight else { return }
}
await self.refreshBrandingFromGateway()
await self.refreshAgentsFromGateway()
}
@@ -1989,33 +1986,12 @@ extension NodeAppModel {
}
func resetGatewaySessionsForForcedReconnect() async {
let nodeGatewayTask = self.nodeGatewayTask
let operatorGatewayTask = self.operatorGatewayTask
nodeGatewayTask?.cancel()
self.nodeGatewayTask?.cancel()
self.nodeGatewayTask = nil
operatorGatewayTask?.cancel()
self.operatorGatewayTask?.cancel()
self.operatorGatewayTask = nil
await self.operatorGateway.disconnect()
await self.nodeGateway.disconnect()
// Foreground recovery reuses the same config immediately after reset.
// Wait for canceled loops so their shutdown cleanup cannot clobber the new reconnect state.
if let operatorGatewayTask {
await operatorGatewayTask.value
}
if let nodeGatewayTask {
await nodeGatewayTask.value
}
}
private func restartGatewaySessionsAfterForegroundStaleConnection() async {
await self.resetGatewaySessionsForForcedReconnect()
guard !self.isAppleReviewDemoModeEnabled else { return }
self.setOperatorConnected(false)
self.gatewayConnected = false
self.gatewayStatusText = "Reconnecting…"
self.talkMode.updateGatewayConnected(false)
guard let cfg = self.activeGatewayConnectConfig else { return }
self.applyGatewayConnectConfig(cfg, forceReconnect: true)
}
func disconnectGateway() {
@@ -4850,10 +4826,6 @@ extension NodeAppModel {
(self.nodeGatewayTask != nil, self.operatorGatewayTask != nil)
}
func _test_restartGatewaySessionsAfterForegroundStaleConnection() async {
await self.restartGatewaySessionsAfterForegroundStaleConnection()
}
func _test_handleSuccessfulBootstrapGatewayOnboarding() async {
await self.handleSuccessfulBootstrapGatewayOnboarding(
url: URL(string: "wss://gateway.example")!,
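The comment in `resetGatewaySessionsForForcedReconnect` about waiting for canceled loops describes a general ordering rule: cancel the old loop, wait for its cleanup to finish, and only then write the new reconnect state. Below is a minimal sketch of that ordering using Python's `asyncio` rather than Swift concurrency. Every name in it is hypothetical and the "gateway loop" is a stand-in, not the app's implementation.

```python
import asyncio


async def gateway_loop(state, name):
    """Stand-in for a long-running gateway loop with shutdown cleanup."""
    state["status"] = f"{name}: connected"
    try:
        await asyncio.Event().wait()  # never set; the loop ends only by cancellation
    finally:
        # Shutdown cleanup. If this ran after a restart, it would overwrite newer state.
        state["status"] = f"{name}: stopped"


async def reset_for_forced_reconnect(task):
    """Cancel the loop and wait until its cleanup has finished."""
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)


async def demo():
    state = {}
    old = asyncio.create_task(gateway_loop(state, "old"))
    await asyncio.sleep(0)  # let the old loop start
    await reset_for_forced_reconnect(old)  # old cleanup is complete after this line
    state["status"] = "reconnecting"
    new = asyncio.create_task(gateway_loop(state, "new"))
    await asyncio.sleep(0)  # let the new loop start
    status = state["status"]
    await reset_for_forced_reconnect(new)  # tidy up before returning
    return status
```

Without the wait inside `reset_for_forced_reconnect`, the old loop's cleanup could run after `state["status"] = "reconnecting"` and leave the model showing a stale status.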

View File

@@ -356,20 +356,6 @@ import UIKit
#expect(!appModel._test_hasGatewayLoopTasks().operator)
}
@Test @MainActor func foregroundStaleConnectionRestartReappliesActiveGatewayConfig() async {
let appModel = NodeAppModel()
defer { appModel.disconnectGateway() }
let config = Self.makeGatewayConnectConfig()
appModel.applyGatewayConnectConfig(config)
await appModel._test_restartGatewaySessionsAfterForegroundStaleConnection()
#expect(appModel.gatewayStatusText == "Reconnecting…")
#expect(appModel.activeGatewayConnectConfig?.hasSameConnectionInputs(as: config) == true)
#expect(appModel._test_hasGatewayLoopTasks().node)
#expect(appModel._test_hasGatewayLoopTasks().operator)
}
@Test @MainActor func loadLastConnectionReadsSavedValues() {
let prior = KeychainStore.loadString(service: "ai.openclaw.gateway", account: "lastConnection")
defer {

View File

@@ -1,5 +1,5 @@
{
"originHash" : "ae9f37f50cff0d32d189e60948f61e2fa1704e997a6ef4ad5e37f6a11c165ea4",
"originHash" : "035a4fe955164c62c1628de75f6437a14443a947eea2a1b0176ba484d6fde6f8",
"pins" : [
{
"identity" : "axorcist",
@@ -42,8 +42,8 @@
"kind" : "remoteSourceControl",
"location" : "https://github.com/steipete/Peekaboo.git",
"state" : {
"revision" : "ee0e3185431788dad533ffca77cd75315aa3d26f",
"version" : "3.4.1"
"revision" : "3a56ed2aa769bfefb5a78722dfce3c34088cfba1",
"version" : "3.4.0"
}
},
{
@@ -51,8 +51,8 @@
"kind" : "remoteSourceControl",
"location" : "https://github.com/sparkle-project/Sparkle",
"state" : {
"revision" : "d46d456107feacc80711b21847b82b07bd9fb46e",
"version" : "2.9.3"
"revision" : "6276ba2b404829d139c45ff98427cf90e2efc59b",
"version" : "2.9.2"
}
},
{
@@ -78,8 +78,8 @@
"kind" : "remoteSourceControl",
"location" : "https://github.com/apple/swift-log.git",
"state" : {
"revision" : "92448c359f00ebe36ae97d3bd9086f13c7692b5a",
"version" : "1.13.2"
"revision" : "2aed77ae5ec9a86d8fe42c12275e4c2653a286ee",
"version" : "1.13.1"
}
},
{

View File

@@ -19,7 +19,7 @@ let package = Package(
.package(url: "https://github.com/swiftlang/swift-subprocess.git", from: "0.4.0"),
.package(url: "https://github.com/apple/swift-log.git", from: "1.10.1"),
.package(url: "https://github.com/sparkle-project/Sparkle", from: "2.9.0"),
.package(url: "https://github.com/steipete/Peekaboo.git", exact: "3.4.1"),
.package(url: "https://github.com/steipete/Peekaboo.git", exact: "3.4.0"),
.package(path: "../shared/OpenClawKit"),
.package(path: "../swabble"),
],

View File

@@ -92,13 +92,7 @@ extension VoiceWakeOverlayController {
let contentHeight = ceil(used.height + (textInset.height * 2))
let total = contentHeight + self.verticalPadding * 2
// Defer the overflow state mutation to break the SwiftUI onChange → measuredHeight →
// isOverflowing → re-render → onChange synchronous render loop (fixes #43480).
let overflowing = total > self.maxHeight
DispatchQueue.main.async { [weak self] in
guard let self, self.model.isOverflowing != overflowing else { return }
self.model.isOverflowing = overflowing
}
self.model.isOverflowing = total > self.maxHeight
return max(self.minHeight, min(total, self.maxHeight))
}

View File

@@ -4,64 +4,14 @@ import Testing
@Suite(.serialized)
struct ExecApprovalsStoreRefactorTests {
private var realTemporaryDirectory: URL {
let path = FileManager().temporaryDirectory.path
if path.hasPrefix("/var/") {
return URL(fileURLWithPath: "/private\(path)", isDirectory: true)
}
return FileManager().temporaryDirectory.resolvingSymlinksInPath()
}
private func withLockedEnv(
_ values: [String: String?],
_ body: () async throws -> Void) async throws
{
func restoreEnv(_ values: [String: String?]) {
for (key, value) in values {
if let value {
setenv(key, value, 1)
} else {
unsetenv(key)
}
}
}
await TestIsolationLock.shared.acquire()
var previousEnv: [String: String?] = [:]
for (key, value) in values {
previousEnv[key] = getenv(key).map { String(cString: $0) }
if let value {
setenv(key, value, 1)
} else {
unsetenv(key)
}
}
do {
try await body()
restoreEnv(previousEnv)
await TestIsolationLock.shared.release()
} catch {
restoreEnv(previousEnv)
await TestIsolationLock.shared.release()
throw error
}
}
private func withTempStateDir(
_ body: @escaping @Sendable (URL) async throws -> Void) async throws
{
let root = self.realTemporaryDirectory
let stateDir = FileManager().temporaryDirectory
.appendingPathComponent("openclaw-state-\(UUID().uuidString)", isDirectory: true)
let home = root.appendingPathComponent("home", isDirectory: true)
let stateDir = root.appendingPathComponent("state", isDirectory: true)
defer { try? FileManager().removeItem(at: root) }
try Self.seedCurrentApprovalsFile(in: stateDir)
defer { try? FileManager().removeItem(at: stateDir) }
try await self.withLockedEnv([
"OPENCLAW_HOME": home.path,
"OPENCLAW_STATE_DIR": stateDir.path,
]) {
try await TestIsolation.withEnvValues(["OPENCLAW_STATE_DIR": stateDir.path]) {
try await body(stateDir)
}
}
@@ -69,13 +19,13 @@ struct ExecApprovalsStoreRefactorTests {
private func withTempHomeAndStateDir(
_ body: @escaping @Sendable (URL, URL) async throws -> Void) async throws
{
let root = self.realTemporaryDirectory
let root = FileManager().temporaryDirectory
.appendingPathComponent("openclaw-home-state-\(UUID().uuidString)", isDirectory: true)
let home = root.appendingPathComponent("home", isDirectory: true)
let stateDir = root.appendingPathComponent("state", isDirectory: true)
defer { try? FileManager().removeItem(at: root) }
try await self.withLockedEnv([
try await TestIsolation.withEnvValues([
"OPENCLAW_HOME": home.path,
"OPENCLAW_STATE_DIR": stateDir.path,
]) {
@@ -197,19 +147,4 @@ struct ExecApprovalsStoreRefactorTests {
}
return identifier
}
private static func seedCurrentApprovalsFile(in stateDir: URL) throws {
try FileManager().createDirectory(at: stateDir, withIntermediateDirectories: true)
let file = ExecApprovalsFile(
version: 1,
socket: ExecApprovalsSocketConfig(
path: stateDir.appendingPathComponent("exec-approvals.sock").path,
token: "test-token"),
defaults: nil,
agents: [:])
let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
try encoder.encode(file)
.write(to: stateDir.appendingPathComponent("exec-approvals.json"))
}
}
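The `withLockedEnv` helper in this test file follows a common pattern: take a lock, record the previous values, apply the overrides, run the body, then restore on both the success and failure paths. A minimal sketch of the same pattern in Python follows. The name `env_values` and the module-level lock are assumptions, and this does not reproduce `TestIsolation.withEnvValues`, whose implementation is not shown in this diff.

```python
import os
import threading
from contextlib import contextmanager

# Serializes environment edits; not re-entrant, so nested use would deadlock.
_ENV_LOCK = threading.Lock()


@contextmanager
def env_values(values):
    """Temporarily set (or, for None, unset) environment variables."""
    with _ENV_LOCK:
        previous = {key: os.environ.get(key) for key in values}
        try:
            for key, value in values.items():
                if value is None:
                    os.environ.pop(key, None)
                else:
                    os.environ[key] = value
            yield
        finally:
            # Restore the previous state whether or not the body raised.
            for key, value in previous.items():
                if value is None:
                    os.environ.pop(key, None)
                else:
                    os.environ[key] = value
```

A test would wrap its body in `with env_values({"OPENCLAW_STATE_DIR": path}):` so the override cannot leak into later tests.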

View File

@@ -2074,204 +2074,6 @@ public struct SessionsCompactionRestoreResult: Codable, Sendable {
}
}
public struct SessionFileBrowserEntry: Codable, Sendable {
public let path: String
public let name: String
public let kind: AnyCodable
public let sessionkind: SessionFileRelevance?
public let size: Int?
public let updatedatms: Int?
public init(
path: String,
name: String,
kind: AnyCodable,
sessionkind: SessionFileRelevance?,
size: Int?,
updatedatms: Int?)
{
self.path = path
self.name = name
self.kind = kind
self.sessionkind = sessionkind
self.size = size
self.updatedatms = updatedatms
}
private enum CodingKeys: String, CodingKey {
case path
case name
case kind
case sessionkind = "sessionKind"
case size
case updatedatms = "updatedAtMs"
}
}
public struct SessionFileBrowserResult: Codable, Sendable {
public let path: String
public let parentpath: String?
public let search: String?
public let entries: [SessionFileBrowserEntry]
public let truncated: Bool?
public init(
path: String,
parentpath: String?,
search: String?,
entries: [SessionFileBrowserEntry],
truncated: Bool?)
{
self.path = path
self.parentpath = parentpath
self.search = search
self.entries = entries
self.truncated = truncated
}
private enum CodingKeys: String, CodingKey {
case path
case parentpath = "parentPath"
case search
case entries
case truncated
}
}
public struct SessionFileEntry: Codable, Sendable {
public let path: String
public let name: String
public let kind: SessionFileKind
public let missing: Bool
public let size: Int?
public let updatedatms: Int?
public let content: String?
public init(
path: String,
name: String,
kind: SessionFileKind,
missing: Bool,
size: Int?,
updatedatms: Int?,
content: String?)
{
self.path = path
self.name = name
self.kind = kind
self.missing = missing
self.size = size
self.updatedatms = updatedatms
self.content = content
}
private enum CodingKeys: String, CodingKey {
case path
case name
case kind
case missing
case size
case updatedatms = "updatedAtMs"
case content
}
}
public struct SessionsFilesListParams: Codable, Sendable {
public let sessionkey: String
public let agentid: String?
public let path: String?
public let search: String?
public init(
sessionkey: String,
agentid: String? = nil,
path: String?,
search: String?)
{
self.sessionkey = sessionkey
self.agentid = agentid
self.path = path
self.search = search
}
private enum CodingKeys: String, CodingKey {
case sessionkey = "sessionKey"
case agentid = "agentId"
case path
case search
}
}
public struct SessionsFilesListResult: Codable, Sendable {
public let sessionkey: String
public let root: String?
public let files: [SessionFileEntry]
public let browser: SessionFileBrowserResult?
public init(
sessionkey: String,
root: String?,
files: [SessionFileEntry],
browser: SessionFileBrowserResult?)
{
self.sessionkey = sessionkey
self.root = root
self.files = files
self.browser = browser
}
private enum CodingKeys: String, CodingKey {
case sessionkey = "sessionKey"
case root
case files
case browser
}
}
public struct SessionsFilesGetParams: Codable, Sendable {
public let sessionkey: String
public let path: String
public let agentid: String?
public init(
sessionkey: String,
path: String,
agentid: String? = nil)
{
self.sessionkey = sessionkey
self.path = path
self.agentid = agentid
}
private enum CodingKeys: String, CodingKey {
case sessionkey = "sessionKey"
case path
case agentid = "agentId"
}
}
public struct SessionsFilesGetResult: Codable, Sendable {
public let sessionkey: String
public let root: String?
public let file: SessionFileEntry
public init(
sessionkey: String,
root: String?,
file: SessionFileEntry)
{
self.sessionkey = sessionkey
self.root = root
self.file = file
}
private enum CodingKeys: String, CodingKey {
case sessionkey = "sessionKey"
case root
case file
}
}
public struct SessionsCreateParams: Codable, Sendable {
public let key: String?
public let agentid: String?

View File

@@ -1,4 +1,4 @@
0485ba902d2afd89d2c41cde7180d0cec2900b2db6804b9f97d42b7d85cd3af5 config-baseline.json
72bb80be618406f3337eaa2560d2559a35e49bd29576de8dd4a3aec1a6a94d92 config-baseline.core.json
1218f5555541b61bd5ddcac6441f15061b44789e2471d4ffecbe3059777c55c1 config-baseline.channel.json
a14ac4261e98403d1a7e047070e6f151938444e27382b860315bd0c74fda4861 config-baseline.plugin.json
37b56008790612b8293930b6a29d74490e98daa90f954fca9d133fcc28645c4c config-baseline.json
75b64c2ea081369ba4306493313a8a4cd48b784145f92fed995e6b77a5df350d config-baseline.core.json
17d64c9799dfa239a49493413f1100bdd9237e9b67aaeae331a4604dbc227023 config-baseline.channel.json
f9d1f50bfa8403891e76cd99dc1357cdece4a71e8ae18a39b190c2a14e6f97b0 config-baseline.plugin.json

View File

@@ -1,2 +1,2 @@
b121079a0912b3051a9fc319a675ef920da9db23364ca0c0ccd3c9f0a05a3a49 plugin-sdk-api-baseline.json
61a0108da670e0f44ba4b861c002eb6eaa5cf63e392d4e7e7de42044cbe7d115 plugin-sdk-api-baseline.jsonl
2c783beea6b3cda3d79060739a923f9f39e7e8b5942123dd6b08a09143a587ca plugin-sdk-api-baseline.json
0b33af2cffb42abb46682fb71c8f214da220793f13d10a34d332e75ff99e8ce9 plugin-sdk-api-baseline.jsonl

View File

@@ -311,9 +311,7 @@ $OPENCLAW_STATE_DIR/tasks/runs.sqlite
The registry loads into memory at gateway start and syncs writes to SQLite for durability across restarts.
The Gateway keeps the SQLite write-ahead log bounded by using SQLite's default
autocheckpoint threshold plus periodic `PASSIVE` checkpoints. Shutdown and
explicit maintenance checkpoints still use `TRUNCATE` so normal closes can
reclaim WAL space without making the background sweeper wait on active readers.
autocheckpoint threshold plus periodic and shutdown `TRUNCATE` checkpoints.
### Automatic maintenance

View File

@@ -161,20 +161,17 @@ Control how agents process messages:
<Step title="Incoming message arrives">
A WhatsApp group or DM message arrives.
</Step>
<Step title="Route and admission">
OpenClaw applies channel allowlists, group activation rules, and configured ACP binding ownership.
</Step>
<Step title="Broadcast check">
If no configured ACP binding owns the route, OpenClaw checks whether the peer ID is in `broadcast`.
System checks if peer ID is in `broadcast`.
</Step>
<Step title="If broadcast applies">
<Step title="If in broadcast list">
- All listed agents process the message.
- Each agent has its own session key and isolated context.
- Agents process in parallel (default) or sequentially.
</Step>
<Step title="If broadcast does not apply">
OpenClaw dispatches the ordinary route or the configured ACP session route selected during routing.
<Step title="If not in broadcast list">
Normal routing applies (first matching binding).
</Step>
</Steps>
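
The decision order in the steps above can be sketched roughly as follows. This is a minimal illustration, not OpenClaw's routing code: `RoutingConfig`, `resolveDispatch`, and the field names are hypothetical, and the ACP ownership check only applies to the variant of this page that describes configured ACP bindings.

```typescript
// Hypothetical shapes for illustration only; not the real OpenClaw config schema.
type Dispatch =
  | { kind: "acp"; agentId: string }
  | { kind: "broadcast"; agentIds: string[] }
  | { kind: "route"; agentId: string | undefined };

interface RoutingConfig {
  acpBindings: Record<string, string>; // peer ID -> agent owning a configured ACP binding
  broadcast: Record<string, string[]>; // peer ID -> agents that all receive the message
  bindings: Array<{ peerId: string; agentId: string }>; // ordinary route bindings, in order
}

function hasKey(record: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(record, key);
}

function resolveDispatch(peerId: string, config: RoutingConfig): Dispatch {
  // 1. An exclusive ACP binding (assumed) owns the route and skips fan-out.
  if (hasKey(config.acpBindings, peerId)) {
    return { kind: "acp", agentId: config.acpBindings[peerId] };
  }
  // 2. Broadcast check: every listed agent processes the message.
  if (hasKey(config.broadcast, peerId) && config.broadcast[peerId].length > 0) {
    return { kind: "broadcast", agentIds: config.broadcast[peerId].slice() };
  }
  // 3. Otherwise normal routing applies: first matching binding wins.
  for (const binding of config.bindings) {
    if (binding.peerId === peerId) {
      return { kind: "route", agentId: binding.agentId };
    }
  }
  return { kind: "route", agentId: undefined };
}

const exampleConfig: RoutingConfig = {
  acpBindings: {},
  broadcast: { GROUP_B: ["agent1", "agent2"] },
  bindings: [{ peerId: "GROUP_A", agentId: "agent1" }],
};
console.log(resolveDispatch("GROUP_B", exampleConfig));
```

With this example config, `GROUP_B` fans out to both agents while `GROUP_A` falls through to its ordinary binding. Each broadcast agent would still need its own session key, which the sketch does not model.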
@@ -325,7 +322,7 @@ Broadcast groups work alongside existing routing:
- `GROUP_B`: agent1 AND agent2 respond (broadcast).
<Note>
**Precedence:** `broadcast` takes priority over ordinary route bindings. Configured ACP bindings (`bindings[].type="acp"`) are exclusive: when one matches, OpenClaw dispatches to the configured ACP session instead of fan-out broadcast.
**Precedence:** `broadcast` takes priority over `bindings`.
</Note>
## Troubleshooting
@@ -346,9 +343,9 @@ Broadcast groups work alongside existing routing:
</Accordion>
<Accordion title="Only one agent responding">
**Cause:** Peer ID might be in ordinary route bindings but not `broadcast`, or it might match an exclusive configured ACP binding.
**Cause:** Peer ID might be in `bindings` but not `broadcast`.
**Fix:** Add ordinary route-bound peers to broadcast config, or remove/change the configured ACP binding if fan-out broadcast is desired.
**Fix:** Add to broadcast config or remove from bindings.
</Accordion>
<Accordion title="Performance issues">

View File

@@ -416,9 +416,7 @@ Enable `dynamicAgentCreation` to automatically create **isolated agent instances
This is essential for public bots where you want each user to have their own private AI assistant experience.
<Note>
Dynamic bindings include the normalized Feishu `accountId`, so default and named accounts route each sender to the correct dynamic agent.
If a named account created an unscoped dynamic agent on an older release, that legacy agent still counts toward `maxAgents`. Confirm that it is not used by the default account before removing it, or temporarily increase `maxAgents`; OpenClaw cannot safely infer which account owns ambiguous legacy state.
**Account limitation**: `dynamicAgentCreation` currently works with the **default Feishu account only**. Named/multi-account setups are not yet fully supported — dynamic bindings are created without `accountId`, so messages to named accounts may still route to `agent:main`. Track progress in [Issue #42837](https://github.com/openclaw/openclaw/issues/42837).
</Note>
### Quick setup
@@ -449,7 +447,7 @@ If a named account created an unscoped dynamic agent on an older release, that l
When a new user sends their first DM:
1. The channel generates a unique `agentId`: `feishu-{user_open_id}` for the default account, or a bounded account-prefixed identity digest for a named account
1. The channel generates a unique `agentId` = `feishu-{user_open_id}`
2. Creates a new workspace at `workspaceTemplate` path
3. Registers the agent and creates a binding for this user
4. The workspace helper ensures bootstrap files (`AGENTS.md`, `SOUL.md`, `USER.md`, etc.) on first access
@@ -466,23 +464,22 @@ When a new user sends their first DM:
Template variables:
- `{agentId}` - the generated agent ID (e.g., `feishu-ou_xxxxxx` or `feishu-support-<identity_digest>`)
- `{agentId}` - the generated agent ID (e.g., `feishu-ou_xxxxxx`)
- `{userId}` - the sender's Feishu open_id (e.g., `ou_xxxxxx`)
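
As a rough sketch of how these variables could be substituted, assuming the default-account naming `feishu-{user_open_id}` described above: the helper names `dynamicAgentId` and `expandWorkspaceTemplate` and the example template path are hypothetical, and the account-prefixed digest form for named accounts is not modeled.

```typescript
// Hypothetical helpers for illustration; not the Feishu channel's actual code.
interface TemplateVars {
  agentId: string;
  userId: string;
}

// Default-account naming as described in the text: feishu-{user_open_id}.
function dynamicAgentId(userOpenId: string): string {
  return "feishu-" + userOpenId;
}

// Replace every {agentId} and {userId} placeholder in a workspace template.
// Function replacers keep "$" characters in the values from being interpreted.
function expandWorkspaceTemplate(template: string, vars: TemplateVars): string {
  return template
    .replace(/\{agentId\}/g, () => vars.agentId)
    .replace(/\{userId\}/g, () => vars.userId);
}

const userId = "ou_xxxxxx";
const agentId = dynamicAgentId(userId);
// The template path below is an assumed example, not a documented default.
console.log(expandWorkspaceTemplate("/srv/workspaces/{agentId}", { agentId, userId }));
```

Keeping the expansion a pure string function makes it easy to check that two senders can never resolve to the same workspace path.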
### Session scope
`session.dmScope` controls how direct messages are mapped to agent sessions. This is a **global setting** that affects all channels.
| Value | Behavior | Best for |
| ---------------------------- | ------------------------------------------------------------------- | ------------------------------------------------------------------ |
| `"main"` | Each user's DM maps to their agent's main session | Single-user bots where you want `USER.md` / `SOUL.md` to auto-load |
| `"per-channel-peer"` | Each (channel + user) combination gets a separate session | Public multi-user bots needing stronger isolation |
| `"per-account-channel-peer"` | Each (account + channel + user) combination gets a separate session | Multi-account bots needing account-level session isolation |
| Value | Behavior | Best for |
| -------------------- | --------------------------------------------------------- | ------------------------------------------------------------------ |
| `"main"` | Each user's DM maps to their agent's main session | Single-user bots where you want `USER.md` / `SOUL.md` to auto-load |
| `"per-channel-peer"` | Each (channel + user) combination gets a separate session | Public multi-user bots needing stronger isolation |
**Tradeoff**: Using `"main"` enables automatic bootstrap file loading (`USER.md`, `SOUL.md`, `MEMORY.md`), but means all DMs across all channels share the same session key pattern. For public multi-user bots where isolation matters more than bootstrap auto-loading, consider `"per-channel-peer"` and manage bootstrap files manually.
<Note>
Use `"per-account-channel-peer"` when named Feishu accounts should keep separate sessions for the same sender. Dynamic bindings preserve the account scope.
`"per-account-channel-peer"` is not recommended with `dynamicAgentCreation` because dynamic bindings are created without `accountId`. Use it only with manual bindings.
</Note>
```json5

View File

@@ -586,7 +586,7 @@ Group inbound payloads set:
- `WasMentioned` (mention gating result)
- Telegram forum topics also include `MessageThreadId` and `IsForum`.
The agent system prompt includes a group intro on the first turn of a new group session. It reminds the model to respond like a human, minimize empty lines and follow normal chat spacing, and avoid typing literal `\n` sequences. Non-Telegram groups also discourage Markdown tables; Telegram rich-text guidance comes from the Telegram channel prompt. Channel-sourced group names and participant labels are rendered as fenced untrusted metadata, not inline system instructions.
The agent system prompt includes a group intro on the first turn of a new group session. It reminds the model to respond like a human, avoid Markdown tables, minimize empty lines and follow normal chat spacing, and avoid typing literal `\n` sequences. Channel-sourced group names and participant labels are rendered as fenced untrusted metadata, not inline system instructions.
## iMessage specifics

View File

@@ -311,6 +311,7 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- direct chats: preview message + `editMessageText`
- groups/topics: preview message + `editMessageText`
- direct-chat tool progress: optional native `sendMessageDraft` status preview when enabled and supported
Requirement:
@@ -319,10 +320,29 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
- `streaming.preview.toolProgress` controls whether tool/progress updates reuse the same edited preview message (default: `true` when preview streaming is active)
- `streaming.preview.commandText` controls command/exec detail inside those tool-progress lines: `raw` (default, preserves released behavior) or `status` (tool label only)
- `streaming.progress.commentary` (default: `false`) opts into assistant commentary/preamble text in the temporary progress draft
- legacy `channels.telegram.streamMode`, boolean `streaming` values, and retired native draft preview keys are detected; run `openclaw doctor --fix` to migrate them to current streaming config
- legacy `channels.telegram.streamMode` and boolean `streaming` values are detected; run `openclaw doctor --fix` to migrate them to `channels.telegram.streaming.mode`
Tool-progress preview updates are the short status lines shown while tools run, for example command execution, file reads, planning updates, patch summaries, or Codex preamble/commentary text in Codex app-server mode. Telegram keeps these enabled by default to match released OpenClaw behavior from `v2026.4.22` and later.
Direct chats can use native Telegram drafts for these tool-progress lines without persisting tool chatter into chat history. Native drafts stop before answer text starts; final answers stay on the normal persistent delivery path. This lane is off by default and should be gated to trusted DM IDs first:
```json
{
"channels": {
"telegram": {
"streaming": {
"mode": "partial",
"preview": {
"toolProgress": true,
"nativeToolProgress": true,
"nativeToolProgressAllowFrom": ["123456789"]
}
}
}
}
}
```
To keep the edited preview for answer text but hide tool-progress lines, set:
```json
@@ -400,16 +420,14 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
</Accordion>
<Accordion title="Rich message formatting">
Outbound text uses Telegram rich messages.
<Accordion title="Formatting and HTML fallback">
Outbound text uses Telegram `parse_mode: "HTML"`.
- Markdown text is sent as rich Markdown without converting it to HTML.
- Explicit HTML payloads are sent as rich HTML.
- Media captions still use Telegram HTML captions because rich messages do not replace captions.
- Markdown-ish text is rendered to Telegram-safe HTML.
- Supported Telegram HTML tags are preserved; unsupported HTML is escaped.
- If Telegram rejects parsed HTML, OpenClaw retries as plain text.
Long rich text is split automatically across Telegram's rich text and rich block limits. Tables over Telegram's column limit are sent as code blocks.
Link previews are enabled by default. `channels.telegram.linkPreview: false` skips automatic entity detection for rich text.
Link previews are enabled by default and can be disabled with `channels.telegram.linkPreview: false`.
</Accordion>

View File

@@ -319,40 +319,6 @@ content and identifiers.
</Tab>
</Tabs>
## Configured ACP bindings
WhatsApp supports persistent ACP bindings with top-level `bindings[]` entries:
```json5
{
bindings: [
{
type: "acp",
agentId: "codex",
match: {
channel: "whatsapp",
accountId: "work",
peer: { kind: "direct", id: "+15555550123" },
},
},
{
type: "acp",
agentId: "codex",
match: {
channel: "whatsapp",
accountId: "work",
peer: { kind: "group", id: "120363424282127706@g.us" },
},
},
],
}
```
- Direct chats match E.164 numbers such as `+15555550123`.
- Groups match WhatsApp group JIDs such as `120363424282127706@g.us`.
- Group allowlists, sender policy, and mention or activation gating run before OpenClaw ensures the configured ACP session exists.
- A matched configured ACP binding owns the route. WhatsApp broadcast groups do not fan out that turn to ordinary WhatsApp sessions.
## Personal-number and self-chat behavior
When the linked self number is also present in `allowFrom`, WhatsApp self-chat safeguards activate:

View File

@@ -200,19 +200,13 @@ from `release/YYYY.M.PATCH` or `main` after the release tag exists and after the
OpenClaw npm preflight has succeeded. It verifies `pnpm plugins:sync:check`,
dispatches `Plugin NPM Release` for all publishable plugin packages, dispatches
`Plugin ClawHub Release` for the same release SHA, and only then dispatches
`OpenClaw NPM Release` with the saved `preflight_run_id`. Stable publish also
requires an exact `windows_node_tag`; the workflow verifies the Windows source
release and compares its x64/ARM64 installers with the candidate-approved
`windows_node_installer_digests` input before any publish child, then promotes
and verifies those same pinned installer digests plus the exact companion asset
and checksum contract before publishing the GitHub release draft.
`OpenClaw NPM Release` with the saved `preflight_run_id`.
```bash
gh workflow run openclaw-release-publish.yml \
--ref release/YYYY.M.PATCH \
-f tag=vYYYY.M.PATCH-beta.N \
-f preflight_run_id=<successful-openclaw-npm-preflight-run-id> \
-f full_release_validation_run_id=<successful-full-release-validation-run-id> \
-f npm_dist_tag=beta
```
@@ -458,7 +452,7 @@ For normal PRs, follow scoped CI/check evidence instead of treating parity as a
The `CodeQL` workflow is intentionally a narrow first-pass security scanner, not the full repository sweep. Daily, manual, and non-draft pull request guard runs scan Actions workflow code plus the highest-risk JavaScript/TypeScript surfaces with high-confidence security queries filtered to high/critical `security-severity`.
The pull request guard stays light: it only starts for changes under `.github/actions`, `.github/codeql`, `.github/workflows`, `packages`, or `src`, and it runs the same high-confidence security matrix as the scheduled workflow. Android and macOS CodeQL stay out of PR defaults.
The pull request guard stays light: it only starts for changes under `.github/actions`, `.github/codeql`, `.github/workflows`, `packages`, `scripts`, `src`, or process-owning bundled plugin runtime paths, and it runs the same high-confidence security matrix as the scheduled workflow. Android and macOS CodeQL stay out of PR defaults.
### Security categories
@@ -468,6 +462,7 @@ The pull request guard stays light: it only starts for changes under `.github/ac
| `/codeql-security-high/channel-runtime-boundary` | Core channel implementation contracts plus the channel plugin runtime, gateway, Plugin SDK, secrets, audit touchpoints |
| `/codeql-security-high/network-ssrf-boundary` | Core SSRF, IP parsing, network guard, web-fetch, and Plugin SDK SSRF policy surfaces |
| `/codeql-security-high/mcp-process-tool-boundary` | MCP servers, process execution helpers, outbound delivery, and agent tool-execution gates |
| `/codeql-security-high/process-exec-boundary` | Local shell, process spawn helpers, subprocess-owning bundled plugin runtimes, and workflow script glue |
| `/codeql-security-high/plugin-trust-boundary` | Plugin install, loader, manifest, registry, package-manager install, source-loading, and Plugin SDK package contract trust surfaces |
### Platform-specific security shards

View File

@@ -174,22 +174,7 @@ Notes:
or `--element`.
- `existing-session` / `user` profiles support page screenshots and `--ref`
screenshots from snapshot output, but not CSS `--element` screenshots.
- `--labels` overlays current snapshot refs on the screenshot. On
Playwright-backed profiles, it works with `--full-page` (full-page label
overlay), `--ref` (element-clip label overlay by ARIA ref), and `--element`
(element-clip label overlay by CSS selector); in element-clip modes, labels
are projected relative to the element. The response also includes an
`annotations` array with each ref's bounding box. Each item has `ref`,
`number`, `role`, optional `name`, and `box: {x, y, width, height}`;
coordinates are in the captured image's space (viewport / fullpage /
element-relative). The field is omitted when empty.
`existing-session` profiles render a chrome-mcp overlay on page screenshots
but do not use the Playwright projection helper and do not include
`annotations`; CSS `--element` screenshots are unsupported there. Without
Playwright or chrome-mcp, labeled screenshots are not available. Prior
releases ignored `--full-page`, `--ref`, and `--element` on labeled
Playwright screenshots and always returned a viewport capture; labeled
screenshots now honor those scopes.
- `--labels` overlays current snapshot refs on the screenshot.
- `snapshot --urls` appends discovered link destinations to AI snapshots so
agents can choose direct navigation targets instead of guessing from link
text alone.

View File

@@ -182,10 +182,7 @@ Interactive onboarding behavior with reference mode:
### Non-interactive Z.AI endpoint choices
<Note>
`--auth-choice zai-api-key` auto-detects the best Z.AI endpoint and model for
your key. Coding Plan endpoints prefer `zai/glm-5.2`; general API endpoints use
`zai/glm-5.1`. To force a Coding Plan endpoint, pick `zai-coding-global` or
`zai-coding-cn`.
`--auth-choice zai-api-key` auto-detects the best Z.AI endpoint for your key (prefers the general API with `zai/glm-5.1`). If you specifically want the GLM Coding Plan endpoints, pick `zai-coding-global` or `zai-coding-cn`.
</Note>
```bash

View File

@@ -159,7 +159,7 @@ is available, then fall back to `latest`.
<Accordion title="--dangerously-force-unsafe-install">
`--dangerously-force-unsafe-install` is deprecated and is now a no-op. OpenClaw no longer runs built-in install-time dangerous-code blocking for plugin installs.
Use the shared operator-owned `security.installPolicy` surface when host-specific install policy is required. Plugin `before_install` hooks are plugin-runtime lifecycle hooks and are not the primary policy boundary for CLI installs.
Use the shared operator-owned `security.installPolicy` surface when host-specific install policy is required. Plugin `before_install` hooks and `security.installPolicy` can still block installs.
If a plugin you published on ClawHub is hidden or blocked by a registry scan, use the publisher steps in [ClawHub publishing](/clawhub/publishing). `--dangerously-force-unsafe-install` does not ask ClawHub to rescan the plugin or make a blocked release public.
@@ -405,7 +405,7 @@ Updates apply to tracked plugin installs in the managed plugin index and tracked
</Accordion>
<Accordion title="--dangerously-force-unsafe-install on update">
`--dangerously-force-unsafe-install` is also accepted on `plugins update` for compatibility, but it is deprecated and no longer changes plugin update behavior. Operator `security.installPolicy` can still block updates; plugin `before_install` hooks only apply in processes where plugin hooks are loaded.
`--dangerously-force-unsafe-install` is also accepted on `plugins update` for compatibility, but it is deprecated and no longer changes plugin update behavior. Operator `security.installPolicy` and plugin `before_install` hooks can still block updates.
</Accordion>
</AccordionGroup>

View File

@@ -479,9 +479,6 @@ names that plugin registers. Active Memory lists those tools in the recall
prompt and passes the same list to the embedded sub-agent. If none of the
configured tools are available, or the memory sub-agent fails, Active Memory
skips recall for that turn and the main reply continues without memory context.
For custom recall tools, non-empty model-visible tool output counts as recall
evidence unless structured result fields explicitly report an empty result or
failure.
`toolsAllow` only accepts concrete memory tool names. Wildcards, `group:*`
entries, and core agent tools such as `read`, `exec`, `message`, and
`web_search` are ignored before the hidden memory sub-agent starts.
@@ -746,11 +743,7 @@ Before v2026.5.2 the plugin silently extended your configured `timeoutMs` by an
extra 30000 ms during cold-start so model warm-up, embedding-index load, and
the first recall could share one larger budget. v2026.5.2 moved that grace
behind an explicit `setupGraceTimeoutMs` config — your configured `timeoutMs`
is now the recall-work budget by default, unless you opt in. The blocking hook
uses two bounded phases around that budget: up to 1500 ms for session/config
preflight before recall starts, then a separate fixed 1500 ms for abort
settlement and transcript recovery after recall work stops. Neither allowance
extends model or tool execution.
is now the budget by default, unless you opt in.
If you upgraded from v2026.4.x and you set `timeoutMs` to a value tuned for the
old implicit-grace world (the recommended starter `timeoutMs: 15000` is one
@@ -772,16 +765,14 @@ outer watchdog budgets back to the pre-v5.2 effective values:
}
```
The v2026.5.2 change removed the old implicit 30000 ms cold-start extension.
Beyond the configured recall-work budget, the hook can use up to 1500 ms for
preflight and another 1500 ms for post-recall completion. Its worst-case
blocking time is therefore `timeoutMs + setupGraceTimeoutMs + 3000` ms.
Per the v2026.5.2 changelog: _"use the configured recall timeout as the
blocking prompt-build hook budget by default and move cold-start setup grace
behind explicit `setupGraceTimeoutMs` config, so the plugin no longer silently
extends 15000 ms configs to 45000 ms on the main lane."_
The embedded recall runner uses the same effective timeout budget, so
`setupGraceTimeoutMs` covers both the outer prompt-build watchdog and the inner
blocking recall run. The preflight cap covers session/config checks before that
budget begins. The post-recall allowance lets the outer hook settle abort
cleanup and read any final transcript state.
blocking recall run.
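
For the variant of this section that states a worst-case blocking time of `timeoutMs + setupGraceTimeoutMs + 3000` ms, the arithmetic can be sketched as below. The function and constant names are hypothetical, and treating an unset `setupGraceTimeoutMs` as 0 is an assumption drawn from the "unless you opt in" wording, not a confirmed default.

```typescript
// Illustrative arithmetic only; names are hypothetical, not the plugin's API.
interface RecallTimeoutConfig {
  timeoutMs: number;
  setupGraceTimeoutMs?: number; // assumed to contribute 0 when not configured
}

const PREFLIGHT_CAP_MS = 1500; // session/config preflight before recall starts
const POST_RECALL_CAP_MS = 1500; // abort settlement and transcript recovery

function worstCaseBlockingMs(config: RecallTimeoutConfig): number {
  const grace = config.setupGraceTimeoutMs ?? 0;
  return config.timeoutMs + grace + PREFLIGHT_CAP_MS + POST_RECALL_CAP_MS;
}

// Starter config without grace: 15000 + 0 + 3000
console.log(worstCaseBlockingMs({ timeoutMs: 15000 }));
// Restoring the old 30000 ms cold-start grace: 15000 + 30000 + 3000
console.log(worstCaseBlockingMs({ timeoutMs: 15000, setupGraceTimeoutMs: 30000 }));
```

The two fixed 1500 ms allowances bound only the hook's own bookkeeping; per the text, neither extends model or tool execution.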
For resource-tight gateways where cold-start latency is a known trade-off,
lower values (5000–15000 ms) work too; the trade-off is a higher chance of

View File

@@ -97,7 +97,7 @@ These run inside the agent loop or gateway pipeline:
- **`agent_end`**: inspect the final message list and run metadata after completion.
- **`before_compaction` / `after_compaction`**: observe or annotate compaction cycles.
- **`before_tool_call` / `after_tool_call`**: intercept tool params/results.
- **`before_install`**: inspect staged skill or plugin install material after operator install policy runs, when plugin hooks are loaded in the current OpenClaw process.
- **`before_install`**: inspect install context and optionally block skill or plugin installs after operator install policy runs.
- **`tool_result_persist`**: synchronously transform tool results before they are written to an OpenClaw-owned session transcript.
- **`message_received` / `message_sending` / `message_sent`**: inbound + outbound message hooks.
- **`session_start` / `session_end`**: session lifecycle boundaries.
@@ -109,7 +109,6 @@ Hook decision rules for outbound/tool guards:
- `before_tool_call`: `{ block: false }` is a no-op and does not clear a prior block.
- `before_install`: `{ block: true }` is terminal and stops lower-priority handlers.
- `before_install`: `{ block: false }` is a no-op and does not clear a prior block.
- Use `security.installPolicy`, not `before_install`, for operator-owned install allow/block decisions that must cover CLI install and update paths.
- `message_sending`: `{ cancel: true }` is terminal and stops lower-priority handlers.
- `message_sending`: `{ cancel: false }` is a no-op and does not clear a prior cancel.
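
The terminal-block rule above can be illustrated with a small sketch. It assumes handlers run synchronously and that a larger `priority` number runs first. Both are simplifications for illustration, and `InstallHandler` and `runBeforeInstall` are hypothetical names rather than the Plugin SDK's API.

```typescript
// Hypothetical model of the decision rules; not the real hook runner.
interface InstallDecision {
  block?: boolean;
  reason?: string;
}

interface InstallHandler {
  name: string;
  priority: number; // assumption: higher numbers run earlier
  decide: () => InstallDecision | undefined;
}

interface InstallOutcome {
  blocked: boolean;
  blockedBy?: string;
  handlersRun: string[];
}

function runBeforeInstall(handlers: InstallHandler[]): InstallOutcome {
  const ordered = handlers.slice().sort((a, b) => b.priority - a.priority);
  const handlersRun: string[] = [];
  for (const handler of ordered) {
    handlersRun.push(handler.name);
    const decision = handler.decide();
    // { block: true } is terminal: lower-priority handlers never run.
    if (decision !== undefined && decision.block === true) {
      return { blocked: true, blockedBy: handler.name, handlersRun };
    }
    // { block: false } or no decision is a no-op; keep going.
  }
  return { blocked: false, handlersRun };
}

const outcome = runBeforeInstall([
  { name: "audit", priority: 1, decide: () => ({ block: false }) },
  { name: "policy", priority: 10, decide: () => ({ block: true, reason: "denied" }) },
]);
console.log(outcome);
```

Here `policy` runs first and blocks, so `audit` is never consulted. Because a block ends the chain, a later `{ block: false }` has no opportunity to clear it. The `cancel` rules for `message_sending` follow the same shape.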

View File

@@ -247,13 +247,12 @@ of only a bot-to-bot Slack transcript.
evidence pipeline. It checks out the trusted candidate ref in a separate
worktree, runs `pnpm openclaw qa telegram --credential-source convex
--credential-role ci`, writes a `mantis-evidence.json` manifest from the
Telegram QA summary, `qa-evidence.json`, and report artifacts, renders the
redacted evidence HTML through a Crabbox desktop browser, generates a
motion-trimmed GIF with `crabbox media preview`, and posts the inline PR
evidence comment when a PR number is available. This lane is QA-evidence visual
rather than logged-in Telegram Web proof: the Telegram Bot API gives stable live
message evidence, but Telegram Web login state is not required for normal Mantis
automation.
Telegram QA summary and observed-message artifact, renders the redacted
transcript HTML through a Crabbox desktop browser, generates a motion-trimmed GIF
with `crabbox media preview`, and posts the inline PR evidence comment when a PR
number is available. This lane is transcript-visual rather than logged-in
Telegram Web proof: the Telegram Bot API gives stable live message evidence, but
Telegram Web login state is not required for normal Mantis automation.
`Mantis Telegram Desktop Proof` is the agentic native Telegram Desktop
before/after wrapper. A maintainer can trigger it from a PR comment with
@@ -495,8 +494,8 @@ zero:
- `pnpm openclaw qa discord` already runs a live Discord lane with driver and
SUT bots.
- The live transport runner already writes reports, QA evidence, and
transport-specific artifacts under `.artifacts/qa-e2e/`.
- The live transport runner already writes reports and observed-message
artifacts under `.artifacts/qa-e2e/`.
- Convex credential leases already provide exclusive access to shared live
transport credentials.
- The browser control service already supports screenshots, snapshots,

View File

@@ -264,7 +264,7 @@ Gemini CLI JSON replies are parsed from `response`; usage falls back to `stats`,
- Provider: `zai`
- Auth: `ZAI_API_KEY`
- Example model: `zai/glm-5.2`
- Example model: `zai/glm-5.1`
- CLI: `openclaw onboard --auth-choice zai-api-key`
- Model refs use the canonical `zai/*` provider ID.
- `zai-api-key` auto-detects the matching Z.AI endpoint; `zai-coding-global`, `zai-coding-cn`, `zai-global`, and `zai-cn` force a specific surface

View File

@@ -11,7 +11,7 @@ The Personal Agent Benchmark Pack is a small repo-backed QA scenario pack for
local personal assistant workflows. It is not a generic model benchmark and it
does not require a new runner. The pack reuses the private QA stack described in
[QA overview](/concepts/qa-e2e-automation), the synthetic
[QA channel](/channels/qa-channel), and the existing `qa/scenarios` YAML
[QA channel](/channels/qa-channel), and the existing `qa/scenarios` markdown
catalog.
The first pack is intentionally narrow:
@@ -61,9 +61,9 @@ to inspect and file in issues.
## Extending The Pack
Add new `.yaml` cases under `qa/scenarios/personal/`, then add the scenario id
to `QA_PERSONAL_AGENT_SCENARIO_IDS`. Keep each case small, local, deterministic
in `mock-openai`, and focused on one personal assistant behavior.
Add new cases under `qa/scenarios/personal/`, then add the scenario id to
`QA_PERSONAL_AGENT_SCENARIO_IDS`. Keep each case small, local, deterministic in
`mock-openai`, and focused on one personal assistant behavior.
Good follow-up candidates:

View File

@@ -33,7 +33,7 @@ script aliases; both forms are supported.
| --------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `qa run` | Bundled QA self-check; writes a Markdown report. |
| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
| `qa coverage` | Print the YAML scenario-coverage inventory (`--json` for machine output). |
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report, or use `--runtime-axis --token-efficiency` to write Codex-vs-OpenClaw runtime parity and token-efficiency reports from one runtime-pair summary. |
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
@@ -318,17 +318,17 @@ Matrix has a [dedicated page](/concepts/qa-matrix) because of its scenario count
These lanes register through `extensions/qa-lab/src/live-transports/shared/live-transport-cli.ts` and accept the same flags:
| Flag | Default | Description |
| ------------------------------------- | -------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `--scenario <id>` | - | Run only this scenario. Repeatable. |
| `--output-dir <path>` | `<repo>/.artifacts/qa-e2e/<transport>-<timestamp>` | Where reports, summaries, evidence, transport-specific artifacts, and the output log are written. Relative paths resolve against `--repo-root`. |
| `--repo-root <path>` | `process.cwd()` | Repository root when invoking from a neutral cwd. |
| `--sut-account <id>` | `sut` | Temporary account id inside the QA gateway config. |
| `--provider-mode <mode>` | `live-frontier` | `mock-openai` or `live-frontier` (legacy `live-openai` still works). |
| `--model <ref>` / `--alt-model <ref>` | provider default | Primary/alternate model refs. |
| `--fast` | off | Provider fast mode where supported. |
| `--credential-source <env\|convex>` | `env` | See [Convex credential pool](#convex-credential-pool). |
| `--credential-role <maintainer\|ci>` | `ci` in CI, `maintainer` otherwise | Role used when `--credential-source convex`. |
| Flag | Default | Description |
| ------------------------------------- | -------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
| `--scenario <id>` | - | Run only this scenario. Repeatable. |
| `--output-dir <path>` | `<repo>/.artifacts/qa-e2e/<transport>-<timestamp>` | Where reports/summary/observed messages and the output log are written. Relative paths resolve against `--repo-root`. |
| `--repo-root <path>` | `process.cwd()` | Repository root when invoking from a neutral cwd. |
| `--sut-account <id>` | `sut` | Temporary account id inside the QA gateway config. |
| `--provider-mode <mode>` | `live-frontier` | `mock-openai` or `live-frontier` (legacy `live-openai` still works). |
| `--model <ref>` / `--alt-model <ref>` | provider default | Primary/alternate model refs. |
| `--fast` | off | Provider fast mode where supported. |
| `--credential-source <env\|convex>` | `env` | See [Convex credential pool](#convex-credential-pool). |
| `--credential-role <maintainer\|ci>` | `ci` in CI, `maintainer` otherwise | Role used when `--credential-source convex`. |
Each lane exits non-zero on any failed scenario. `--allow-failures` writes artifacts without setting a failing exit code.
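The exit-code rule above can be sketched as follows. This is a minimal illustration, not the `qa-lab` implementation: `ScenarioResult`, `LaneOptions`, and `laneExitCode` are hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration; not taken from the qa-lab source.
interface ScenarioResult {
  id: string;
  passed: boolean;
}

interface LaneOptions {
  // Mirrors the `--allow-failures` flag.
  allowFailures: boolean;
}

// Returns the process exit code for a finished lane:
// 0 when every scenario passed or failures are allowed, 1 otherwise.
function laneExitCode(results: ScenarioResult[], options: LaneOptions): number {
  const hasFailure = results.some((result) => !result.passed);
  if (!hasFailure) return 0;
  return options.allowFailures ? 0 : 1;
}
```

Artifacts are assumed to be written before this function runs, so `--allow-failures` only changes the exit code, not what lands in `--output-dir`.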
@@ -346,6 +346,10 @@ Required env when `--credential-source env`:
- `OPENCLAW_QA_TELEGRAM_DRIVER_BOT_TOKEN`
- `OPENCLAW_QA_TELEGRAM_SUT_BOT_TOKEN`
Optional:
- `OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT=1` keeps message bodies in observed-message artifacts (default redacts).
Scenarios (`extensions/qa-lab/src/live-transports/telegram/telegram-live.runtime.ts`):
- `telegram-canary`
@@ -371,26 +375,26 @@ Output artifacts:
- `telegram-qa-report.md`
- `qa-evidence.json` - evidence entries for the live transport checks, including profile, coverage, provider, channel, artifacts, result, and RTT fields.
- `telegram-qa-observed-messages.json` - bodies redacted unless `OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT=1`.
Package Telegram runs use the same Telegram credential contract. Repeated RTT
measurement is part of the normal package Telegram live lane; the RTT
distribution is folded into `qa-evidence.json` under `result.timing` for the
selected RTT check.
Package RTT comparison uses the same Telegram credential contract while keeping
its RTT sample controls on the RTT harness path:
```bash
OPENCLAW_QA_CREDENTIAL_SOURCE=convex \
pnpm test:docker:npm-telegram-live
pnpm rtt openclaw@beta \
--credential-source convex \
--credential-role maintainer \
--samples 20 \
--sample-timeout-ms 30000
```
When `OPENCLAW_QA_CREDENTIAL_SOURCE=convex` is set, the package live wrapper
leases a `kind: "telegram"` credential, exports the leased group/driver/SUT bot
env into the installed-package run, heartbeats the lease, and releases it on
shutdown. The package wrapper defaults to 20 RTT checks of
`telegram-mentioned-message-reply`, a 30s RTT timeout, and Convex role
`maintainer` outside CI when Convex is selected. Override
`OPENCLAW_NPM_TELEGRAM_RTT_SAMPLES`, `OPENCLAW_NPM_TELEGRAM_RTT_TIMEOUT_MS`,
or `OPENCLAW_NPM_TELEGRAM_RTT_MAX_FAILURES` to tune RTT measurement without
creating a separate RTT command or Telegram-specific summary format.
When `--credential-source convex` is set, the RTT Docker wrapper leases a
`kind: "telegram"` credential, exports the leased group/driver/SUT bot env into
the installed-package run, heartbeats the lease, and releases it on shutdown.
`--samples` and `--sample-timeout-ms` still feed
`OPENCLAW_NPM_TELEGRAM_WARM_SAMPLES` and
`OPENCLAW_NPM_TELEGRAM_SAMPLE_TIMEOUT_MS`, so `result.json` remains comparable
across env-backed and Convex-backed RTT runs.
### Discord QA
@@ -769,26 +773,25 @@ Operational env vars and the Convex broker endpoint contract live in [Testing
Seed assets live in `qa/`:
- `qa/scenarios/index.yaml`
- `qa/scenarios/<theme>/*.yaml`
- `qa/scenarios/index.md`
- `qa/scenarios/<theme>/*.md`
These are intentionally in git so the QA plan is visible to both humans and the
agent.
`qa-lab` should stay a generic YAML scenario runner. Each scenario YAML file is
`qa-lab` should stay a generic markdown runner. Each scenario markdown file is
the source of truth for one test run and should define:
- top-level `title`
- `scenario` metadata
- optional category, capability, lane, and risk metadata in `scenario`
- docs and code refs in `scenario`
- optional plugin requirements in `scenario`
- optional gateway config patch in `scenario`
- executable top-level `flow` for flow scenarios, or `scenario.execution.kind` /
`scenario.execution.path` for Vitest and Playwright scenarios
- scenario metadata
- optional category, capability, lane, and risk metadata
- docs and code refs
- optional plugin requirements
- optional gateway config patch
- an executable `qa-flow` block for flow scenarios, or `execution.kind`/`execution.path`
for Vitest and Playwright scenarios
The reusable runtime surface that backs `flow` is allowed to stay generic
and cross-cutting. For example, YAML scenarios can combine transport-side
The reusable runtime surface that backs `qa-flow` blocks is allowed to stay generic
and cross-cutting. For example, markdown scenarios can combine transport-side
helpers with browser-side helpers that drive the embedded Control UI through the
Gateway `browser.request` seam without adding a special-case runner.
@@ -826,17 +829,17 @@ provider names.
## Transport adapters
`qa-lab` owns a generic transport seam for YAML QA scenarios. `qa-channel` is the first adapter on that seam, but the design target is wider: future real or synthetic channels should plug into the same suite runner instead of adding a transport-specific QA runner.
`qa-lab` owns a generic transport seam for markdown QA scenarios. `qa-channel` is the first adapter on that seam, but the design target is wider: future real or synthetic channels should plug into the same suite runner instead of adding a transport-specific QA runner.
At the architecture level, the split is:
- `qa-lab` owns generic scenario execution, worker concurrency, artifact writing, and reporting.
- The transport adapter owns gateway config, readiness, inbound and outbound observation, transport actions, and normalized transport state.
- YAML scenario files under `qa/scenarios/` define the test run; `qa-lab` provides the reusable runtime surface that executes them.
- Markdown scenario files under `qa/scenarios/` define the test run; `qa-lab` provides the reusable runtime surface that executes them.
### Adding a channel
Adding a channel to the YAML QA system requires exactly two things:
Adding a channel to the markdown QA system requires exactly two things:
1. A transport adapter for the channel.
2. A scenario pack that exercises the channel contract.
@@ -870,7 +873,7 @@ The minimum adoption bar for a new channel:
2. Implement the transport runner on the shared `qa-lab` host seam.
3. Keep transport-specific mechanics inside the runner plugin or channel harness.
4. Mount the runner as `openclaw qa <runner>` instead of registering a competing root command. Runner plugins should declare `qaRunners` in `openclaw.plugin.json` and export a matching `qaRunnerCliRegistrations` array from `runtime-api.ts`. Keep `runtime-api.ts` light; lazy CLI and runner execution should stay behind separate entrypoints.
5. Author or adapt YAML scenarios under the themed `qa/scenarios/` directories.
5. Author or adapt markdown scenarios under the themed `qa/scenarios/` directories.
6. Use the generic scenario helpers for new scenarios.
7. Keep existing compatibility aliases working unless the repo is doing an intentional migration.

View File

@@ -32,13 +32,8 @@ title: "Usage tracking"
## Custom `/usage full` footer
`/usage full` shows a built-in compact footer with model, reasoning, fast/slow,
context window, turn tokens, cache, and cost when those fields are available. No
template file is required.
`messages.usageTemplate` is only for advanced custom layouts. The value is a
JSON file path (supports `~`) or an inline object, and it replaces the built-in
footer when valid:
Set `messages.usageTemplate` to customize the per-response `/usage full`
footer. The value can be an inline template object or a JSON file path:
```json
{
@@ -48,182 +43,9 @@ footer when valid:
}
```
Missing or empty templates fall back to the built-in footer quietly. Unreadable
or invalid configured templates also fall back to the built-in footer and emit an
operator warning.
Start custom templates from the built-in shape, then edit the parts you want to
change:
```jsonc
{
"schema": "openclaw.usageBar.v1",
"scales": {
"braille": "⠐⡀⡄⡆⡇⣇⣧⣷⣿",
"block": "░▏▎▍▌▋▊▉█",
"shade": "░▒▓█",
"moon": "🌑🌘🌗🌖🌕",
"level": "▁▂▃▄▅▆▇█",
"weather": ["🥶", "☁️", "🌥", "⛅️", "🌤", "☀️"],
"plants": ["🪾", "🍂", "🌱", "☘️", "🍀", "🌿"],
"moons6": ["🌑", "🌚", "🌘", "🌗", "🌖", "🌝"],
},
"aliases": {
"models": {
"claude-opus-4-6": "opus46",
"claude-opus-4-8": "opus48",
"claude-sonnet-4-6": "sonnet46",
"claude-haiku-4-5": "haiku45",
"gpt-5.5": "gpt5.5",
},
"reasoning": {
"off": "🌑",
"minimal": "🌚",
"low": "🌘",
"medium": "🌗",
"high": "🌕",
"xhigh": "🌝",
},
},
"output": {
"sep": "",
"default": [
{ "text": "{model.provider}{identity.emoji|🤖} {model.display_name|alias:models}" },
{ "map": "model.is_fallback", "cases": { "true": " 🔄" } },
{ "map": "model.is_override", "cases": { "true": " 📌" } },
{ "when": "model.reasoning", "text": " {model.reasoning|alias:reasoning}" },
{ "map": "state.fast_mode", "cases": { "true": " ⚡", "false": " 🐌" } },
{
"when": "context.max_tokens",
"text": " | 📚 [{context.pct_used|meter:5:braille}]{context.max_tokens|num}",
},
{
"when": "usage.has_split_tokens",
"text": " ↕️ {usage.input_tokens|num|?}/{usage.output_tokens|num|?}",
},
{ "when": "usage.has_total_only_tokens", "text": " ↕️ {usage.total_tokens|num}" },
{ "when": "usage.cache_hit_pct", "text": " 🗄 {usage.cache_hit_pct|pct}" },
{ "when": "cost.turn_usd", "text": " 💰{cost.turn_usd|fixed:4}" },
],
"surfaces": {
"discord": [
{ "text": "-# -\n" },
{ "text": "-# {model.provider}{identity.emoji|🤖} {model.display_name|alias:models}" },
{ "map": "model.is_fallback", "cases": { "true": "🔄" } },
{ "map": "model.is_override", "cases": { "true": "📌" } },
{ "when": "model.reasoning", "text": " {model.reasoning|alias:reasoning}" },
{ "map": "state.fast_mode", "cases": { "true": " ⚡️", "false": " 🐌" } },
{
"when": "context.max_tokens",
"text": " | 📚 [{context.pct_used|meter:5:braille}]{context.max_tokens|num}",
},
{
"when": "usage.has_split_tokens",
"text": " ↕️ {usage.input_tokens|num|?}/{usage.output_tokens|num|?}",
},
{ "when": "usage.has_total_only_tokens", "text": " ↕️ {usage.total_tokens|num}" },
{ "when": "usage.cache_hit_pct", "text": " 🗄 {usage.cache_hit_pct|pct}" },
{ "when": "cost.turn_usd", "text": " 💰{cost.turn_usd|fixed:4}" },
],
},
},
}
```
### Shape
```jsonc
{
"schema": "openclaw.usageBar.v1",
"scales": { "<name>": "low-to-high glyphs" }, // string (1 glyph/char) or array
"aliases": { "<table>": { "<value>": "<label>" } },
"output": {
"sep": "", // joins surviving pieces
"default": [
/* pieces */
], // fallback for any surface
"surfaces": {
"discord": [
/* pieces */
],
"telegram": [
/* pieces */
],
},
},
}
```
Each surface is an ordered list of **pieces**; the engine renders each, drops
empties, and joins survivors with `sep`. A surface with no entry uses
`output.default`.
### Contract Paths
A piece reads values from the per-turn contract by dot-path. Absent values are
empty (so a `when` guard or a `|fallback` keeps the piece clean).
| Path | Meaning |
| ----------------------------------------------------------------------------------- | -------------------------------------- |
| `surface` | channel id (`discord`/`telegram`/etc.) |
| `model.provider` / `model.display_name` | provider id / model id |
| `model.reasoning` | effort (`off` through `xhigh`) |
| `model.is_fallback` / `model.is_override` | bool: fallback used / model pinned |
| `state.fast_mode` | bool: fast vs slow |
| `context.max_tokens` / `context.pct_used` | window budget / 0-100 used |
| `usage.input_tokens` / `usage.output_tokens` / `usage.total_tokens` | turn aggregate |
| `usage.has_split_tokens` / `usage.has_total_only_tokens` / `usage.cache_hit_pct` | token display guards and cache percent |
| `usage.last.input_tokens` / `usage.last.output_tokens` / `usage.last.cache_hit_pct` | final model call only |
| `cost.turn_usd` | estimated turn cost |
| `identity.name` / `identity.emoji` | agent name / chosen emoji |
(Provider rate-limit windows are **not** in this contract.)
### Verbs
Pipe a value through verbs left to right; a segment that is not a verb is used as the fallback (for example, `{identity.emoji|🤖}` renders `🤖` when no emoji is set).
| Verb | Effect | Example |
| --------------- | ------------------------------------- | --------------------------------- |
| `num` | compact count | `272000 -> 272k` |
| `fixed:N` | N decimals (default 2) | `0.0377` |
| `dur` | seconds to duration | `14820 -> 4h07m` |
| `pct` | append `%` | `96 -> 96%` |
| `inv` | `100 - x` | `96 -> 4` (used to remaining) |
| `alias:TABLE` | lookup in `aliases`, echo if unlisted | `medium -> 🌗` |
| `meter:W:SCALE` | W-cell glyph bar over a 0-100 value | `[⣿⣿⠐⠐⠐]` (`meter:1` = one glyph) |
### Piece forms
- `{ "text": "📚 {context.max_tokens|num}" }`: literal + interpolation.
- `{ "when": "<path>", "text": "..." }`: render only if the path is truthy.
- `{ "map": "<path>", "cases": { "true": "⚡", "false": "🐌" } }`: value to glyph.
- `{ "each": "limits.windows", "item": "{label}" }`: iterate an array.
### Example
```jsonc
{
"schema": "openclaw.usageBar.v1",
"scales": { "braille": "⠐⡀⡄⡆⡇⣇⣧⣷⣿" },
"aliases": { "reasoning": { "medium": "🌗", "high": "🌕" } },
"output": {
"surfaces": {
"discord": [
{ "text": "{model.display_name}" },
{ "when": "model.reasoning", "text": " {model.reasoning|alias:reasoning}" },
{ "map": "state.fast_mode", "cases": { "true": " ⚡", "false": " 🐌" } },
{
"when": "context.max_tokens",
"text": " | 📚 [{context.pct_used|meter:5:braille}]{context.max_tokens|num}",
},
],
},
},
}
```
This renders, for example, `claude-sonnet-4-6 🌗 🐌 | 📚 [⣿⣿⣿⣿⣧]272k`.
Templates read the `openclaw.usageLine.v1` contract and can use `scales`,
`aliases`, and `output.surfaces` to render channel-specific footers. Missing,
unreadable, invalid, or empty templates fall back to the built-in usage line.
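To make the template verbs concrete, here is a rough sketch of how `num`, `fixed`, `pct`, `inv`, and `meter` could behave. These are illustrative re-implementations, not the OpenClaw engine. In particular, the rounding in `num` and the way `meter` spreads a 0-100 value across cells are assumptions.

```typescript
// Illustrative versions of a few verbs; not the OpenClaw implementation.

// `num`: compact count. Only the thousands case is handled in this sketch.
function num(value: number): string {
  if (value < 1000) return String(value);
  return `${Math.round(value / 100) / 10}k`;
}

// `fixed:N`: N decimals, defaulting to 2.
function fixed(value: number, decimals = 2): string {
  return value.toFixed(decimals);
}

// `pct`: append a percent sign.
function pct(value: number): string {
  return `${value}%`;
}

// `inv`: turn "used" into "remaining".
function inv(value: number): number {
  return 100 - value;
}

// `meter:W:SCALE`: W-cell bar over a 0-100 value.
// Assumption: each cell covers 100/W percent and picks the glyph whose
// position in the scale matches how full that cell is.
function meter(value: number, width: number, scale: string | string[]): string {
  const glyphs = typeof scale === "string" ? Array.from(scale) : scale;
  const clamped = Math.min(100, Math.max(0, value));
  const filledCells = (clamped * width) / 100;
  let bar = "";
  for (let cell = 0; cell < width; cell++) {
    const fill = Math.min(1, Math.max(0, filledCells - cell));
    bar += glyphs[Math.round(fill * (glyphs.length - 1))];
  }
  return bar;
}

const braille = "⠐⡀⡄⡆⡇⣇⣧⣷⣿";
const footer = `[${meter(40, 5, braille)}]${num(272000)}`;
```

Under these assumptions, `footer` is `[⣿⣿⠐⠐⠐]272k`: two full cells for 40% of a five-cell bar, followed by the compact window size.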
## Providers + credentials

View File

@@ -130,8 +130,6 @@ WhatsApp runs through the gateway's web channel (Baileys Web). It starts automat
}
```
- Top-level `bindings[]` entries with `type: "acp"` configure persistent ACP bindings for WhatsApp DMs and groups. Use an E.164 direct number or WhatsApp group JID in `match.peer.id`. Field semantics are shared in [ACP Agents](/tools/acp-agents#persistent-channel-bindings).
<Accordion title="Multi-account WhatsApp">
```json5

View File

@@ -339,7 +339,7 @@ Configures inbound media understanding (image/audio/video):
- `capabilities`: optional list (`image`, `audio`, `video`). Defaults: `openai`/`anthropic`/`minimax` → image, `google` → image+audio+video, `groq` → audio.
- `prompt`, `maxChars`, `maxBytes`, `timeoutSeconds`, `language`: per-entry overrides.
- `tools.media.image.timeoutSeconds` and matching image model `timeoutSeconds` entries also apply when the agent calls the explicit `image` tool. For image understanding, this timeout applies to the request itself and is not reduced by earlier preparation work.
- `tools.media.image.timeoutSeconds` and matching image model `timeoutSeconds` entries also apply when the agent calls the explicit `image` tool.
- Failures fall back to the next entry.
Provider auth follows standard order: `auth-profiles.json` → env vars → `models.providers.*.apiKey`.

View File

@@ -73,7 +73,7 @@ Live tests are split into two layers so we can isolate failures:
- `pnpm test:live` (or `OPENCLAW_LIVE_TEST=1` if invoking Vitest directly)
- Set `OPENCLAW_LIVE_MODELS=modern`, `small`, or `all` (alias for modern) to actually run this suite; otherwise it skips to keep `pnpm test:live` focused on gateway smoke
- How to select models:
- `OPENCLAW_LIVE_MODELS=modern` to run the modern allowlist (Opus/Sonnet 4.6+, GPT-5.2 + Codex, Gemini 3, DeepSeek V4, GLM 5.1, MiniMax M3, Grok 4.3)
- `OPENCLAW_LIVE_MODELS=modern` to run the modern allowlist (Opus/Sonnet 4.6+, GPT-5.2 + Codex, Gemini 3, DeepSeek V4, GLM 4.7, MiniMax M3, Grok 4.3)
- `OPENCLAW_LIVE_MODELS=small` to run the constrained small-model allowlist (Qwen 8B/9B local-compatible routes, Ollama Gemma, OpenRouter Qwen/GLM, and Z.AI GLM)
- `OPENCLAW_LIVE_MODELS=all` is an alias for the modern allowlist
- or `OPENCLAW_LIVE_MODELS="openai/gpt-5.5,anthropic/claude-opus-4-6,..."` (comma allowlist)
@@ -357,9 +357,6 @@ Narrow, explicit allowlists are fastest and least flaky:
- Tool calling across several providers:
- `OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.5,anthropic/claude-opus-4-6,google/gemini-3-flash-preview,deepseek/deepseek-v4-flash,zai/glm-5.1,minimax/MiniMax-M3" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Z.AI Coding Plan GLM-5.2 direct smoke:
- `ZAI_CODING_LIVE_TEST=1 pnpm test:live src/agents/zai.live.test.ts`
- Google focus (Gemini API key + Antigravity):
- Gemini (API key): `OPENCLAW_LIVE_GATEWAY_MODELS="google/gemini-3-flash-preview" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
- Antigravity (OAuth): `OPENCLAW_LIVE_GATEWAY_MODELS="google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-pro-high" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts`
@@ -391,7 +388,7 @@ This is the "common models" run we expect to keep working:
- Google (Gemini API): `google/gemini-3.1-pro-preview` and `google/gemini-3-flash-preview` (avoid older Gemini 2.x models)
- Google (Antigravity): `google-antigravity/claude-opus-4-6-thinking` and `google-antigravity/gemini-3-flash`
- DeepSeek: `deepseek/deepseek-v4-flash` and `deepseek/deepseek-v4-pro`
- Z.AI (GLM): `zai/glm-5.1` (general API) or `zai/glm-5.2` (Coding Plan)
- Z.AI (GLM): `zai/glm-5.1`
- MiniMax: `minimax/MiniMax-M3`
Run gateway smoke with tools + image:
@@ -405,7 +402,7 @@ Pick at least one per provider family:
- Anthropic: `anthropic/claude-opus-4-6` (or `anthropic/claude-sonnet-4-6`)
- Google: `google/gemini-3-flash-preview` (or `google/gemini-3.1-pro-preview`)
- DeepSeek: `deepseek/deepseek-v4-flash`
- Z.AI (GLM): `zai/glm-5.1` (general API) or `zai/glm-5.2` (Coding Plan)
- Z.AI (GLM): `zai/glm-5.1`
- MiniMax: `minimax/MiniMax-M3`
Optional additional coverage (nice to have):

View File

@@ -218,27 +218,17 @@ inside every shard.
`OPENCLAW_NPM_TELEGRAM_PACKAGE_TGZ=/path/to/openclaw-current.tgz` or
`OPENCLAW_CURRENT_PACKAGE_TGZ` to test a resolved local tarball instead of
installing from the registry.
- Emits repeated RTT timing in `qa-evidence.json` by default with
`OPENCLAW_NPM_TELEGRAM_RTT_SAMPLES=20`. Override
`OPENCLAW_NPM_TELEGRAM_RTT_SAMPLES`,
`OPENCLAW_NPM_TELEGRAM_RTT_TIMEOUT_MS`, or
`OPENCLAW_NPM_TELEGRAM_RTT_MAX_FAILURES` to tune the RTT run.
`OPENCLAW_NPM_TELEGRAM_RTT_CHECKS` accepts a comma-separated list of
Telegram QA check IDs to sample; when unset, the default RTT-capable check
is `telegram-mentioned-message-reply`.
- Uses the same Telegram env credentials or Convex credential source as
`pnpm openclaw qa telegram`. For CI/release automation, set
`OPENCLAW_NPM_TELEGRAM_CREDENTIAL_SOURCE=convex` plus
`OPENCLAW_QA_CONVEX_SITE_URL` and a role secret. If
`OPENCLAW_QA_CONVEX_SITE_URL` and the role secret. If
`OPENCLAW_QA_CONVEX_SITE_URL` and a Convex role secret are present in CI,
the Docker wrapper selects Convex automatically.
- The wrapper validates Telegram or Convex credential env on the host before
Docker build/install work. Set `OPENCLAW_NPM_TELEGRAM_SKIP_CREDENTIAL_PREFLIGHT=1`
only when deliberately debugging pre-credential setup.
- `OPENCLAW_NPM_TELEGRAM_CREDENTIAL_ROLE=ci|maintainer` overrides the shared
`OPENCLAW_QA_CREDENTIAL_ROLE` for this lane only. When Convex credentials
are selected and no role is set, the wrapper uses `ci` in CI and
`maintainer` outside CI.
`OPENCLAW_QA_CREDENTIAL_ROLE` for this lane only.
- GitHub Actions exposes this lane as the manual maintainer workflow
`NPM Telegram Beta E2E`. It does not run on merge. The workflow uses the
`qa-live-shared` environment and Convex CI credential leases.
@@ -354,11 +344,11 @@ gh workflow run package-acceptance.yml --ref main \
want artifacts without a failing exit code.
- Requires two distinct bots in the same private group, with the SUT bot exposing a Telegram username.
- For stable bot-to-bot observation, enable Bot-to-Bot Communication Mode in `@BotFather` for both bots and ensure the driver bot can observe group bot traffic.
- Writes a Telegram QA report, summary, and `qa-evidence.json` under `.artifacts/qa-e2e/...`. Replying scenarios include RTT from driver send request to observed SUT reply.
- Writes a Telegram QA report, summary, and observed-messages artifact under `.artifacts/qa-e2e/...`. Replying scenarios include RTT from driver send request to observed SUT reply.
`Mantis Telegram Live` is the PR-evidence wrapper around this lane. It runs the
candidate ref with Convex-leased Telegram credentials, renders the redacted QA
report/evidence bundle in a Crabbox desktop browser, records MP4 evidence,
candidate ref with Convex-leased Telegram credentials, renders the redacted
observed-message transcript in a Crabbox desktop browser, records MP4 evidence,
generates a motion-trimmed GIF, uploads the artifact bundle, and posts inline PR
evidence through the Mantis GitHub App when `pr_number` is set. Maintainers can
start it from the Actions UI through `Mantis Scenario` (`scenario_id:

View File

@@ -214,59 +214,6 @@ permission boundary. Dangerous plugin node commands still require explicit
After a node changes its declared command list, reject the old device pairing
and approve the new request so the gateway stores the updated command snapshot.
## Config (`openclaw.json`)
Node-related settings live under `gateway.nodes` and `tools.exec`:
```json5
{
gateway: {
nodes: {
// Auto-approve first-time node pairing from trusted networks (CIDR list).
// Disabled when unset. Only applies to first-time role:node requests
// with no requested scopes; does not auto-approve upgrades.
pairing: {
autoApproveCidrs: ["192.168.1.0/24"],
},
// Opt into dangerous/privacy-heavy node commands (camera.snap, etc.).
allowCommands: ["camera.snap", "screen.record"],
// Block exact command names even if defaults or allowCommands include them.
denyCommands: ["camera.clip"],
},
},
tools: {
exec: {
// Default exec host: "node" routes all exec calls to a paired node.
host: "node",
// Security mode for node exec: allow only approved/allowlisted commands.
security: "allowlist",
// Pin exec to a specific node (id or name). Omit to allow any node.
node: "build-node",
},
},
}
```
Use exact node command names. `denyCommands` removes a command even when a
platform default or `allowCommands` entry would otherwise allow it. See
[Gateway configuration reference](/gateway/configuration-reference#gateway-field-details)
for gateway node pairing and command-policy field details.
Per-agent exec node override:
```json5
{
agents: {
list: [
{
id: "main",
tools: { exec: { node: "build-node" } },
},
],
},
}
```
## Screenshots (canvas snapshots)
If the node is showing the Canvas (WebView), `canvas.snapshot` returns `{ format, base64 }`.

View File

@@ -197,30 +197,22 @@ only for behavior that really belongs to the backend.
`CliBackendPlugin` can also define:
| Hook | Use |
| ---------------------------------- | --------------------------------------------------------------------------- |
| `normalizeConfig(config, context)` | Rewrite legacy user config after merge |
| `resolveExecutionArgs(ctx)` | Add request-scoped flags such as thinking effort or side-question isolation |
| `prepareExecution(ctx)` | Create temporary auth or config bridges before launch |
| `transformSystemPrompt(ctx)` | Apply a final CLI-specific system prompt transform |
| `textTransforms` | Bidirectional prompt/output replacements |
| `defaultAuthProfileId` | Prefer a specific OpenClaw auth profile |
| `authEpochMode` | Decide how auth changes invalidate stored CLI sessions |
| `nativeToolMode` | Declare whether the CLI has always-on native tools |
| `sideQuestionToolMode` | Declare disabled native tools for `/btw` side questions |
| `bundleMcp` / `bundleMcpMode` | Opt into OpenClaw's loopback MCP tool bridge |
| `ownsNativeCompaction` | Backend owns its own compaction - OpenClaw defers |
| Hook | Use |
| ---------------------------------- | ------------------------------------------------------ |
| `normalizeConfig(config, context)` | Rewrite legacy user config after merge |
| `resolveExecutionArgs(ctx)` | Add request-scoped flags such as thinking effort |
| `prepareExecution(ctx)` | Create temporary auth or config bridges before launch |
| `transformSystemPrompt(ctx)` | Apply a final CLI-specific system prompt transform |
| `textTransforms` | Bidirectional prompt/output replacements |
| `defaultAuthProfileId` | Prefer a specific OpenClaw auth profile |
| `authEpochMode` | Decide how auth changes invalidate stored CLI sessions |
| `nativeToolMode` | Declare whether the CLI has always-on native tools |
| `bundleMcp` / `bundleMcpMode` | Opt into OpenClaw's loopback MCP tool bridge |
| `ownsNativeCompaction` | Backend owns its own compaction - OpenClaw defers |
Keep these hooks provider-owned. Do not add CLI-specific branches to core when a
backend hook can express the behavior.
`ctx.executionMode` is `"agent"` for normal turns and `"side-question"` for
ephemeral `/btw` calls. Use it when the CLI needs different one-shot flags, such
as disabling native tools, session persistence, or resume behavior for BTW. If a
backend normally has `nativeToolMode: "always-on"` but its side-question argv
reliably disables those tools, also set `sideQuestionToolMode: "disabled"`;
otherwise OpenClaw fails closed when BTW requires a no-tools CLI run.
### `ownsNativeCompaction`: opting out of OpenClaw compaction
If your backend runs an agent that compacts its **own** transcript, set

View File

@@ -313,13 +313,9 @@ available timeout in this order:
- For `image_generate` without a configured timeout, the 120 second
image-generation default.
- For the media-understanding `image` tool, `tools.media.image.timeoutSeconds`
converted to milliseconds, or the 60 second media default. For image
understanding, this applies to the request itself and is not reduced by
earlier preparation work.
converted to milliseconds, or the 60 second media default.
- The 90 second dynamic-tool default.
This watchdog is the outer dynamic `item/tool/call` budget. Provider-specific
request timeouts run inside that call and keep their own timeout semantics.
Dynamic tool budgets are capped at 600000 ms. On timeout, OpenClaw aborts the
tool signal where supported and returns a failed dynamic-tool response to Codex
so the turn can continue instead of leaving the session in `processing`.
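The tail of the resolution order and the cap can be sketched like this. It is a minimal sketch under stated assumptions: `BudgetInput` and `resolveDynamicToolBudgetMs` are hypothetical names, and `configuredTimeoutMs` stands in for whatever earlier entries in the order (not shown in this hunk) produce.

```typescript
// Values taken from the text above.
const MAX_DYNAMIC_TOOL_BUDGET_MS = 600_000;
const IMAGE_GENERATION_DEFAULT_MS = 120_000;
const MEDIA_DEFAULT_MS = 60_000;
const DYNAMIC_TOOL_DEFAULT_MS = 90_000;

// Hypothetical input shape for illustration.
interface BudgetInput {
  toolName: string;
  // Stand-in for any higher-priority timeout resolved earlier in the order.
  configuredTimeoutMs?: number;
  // Mirrors `tools.media.image.timeoutSeconds`.
  mediaImageTimeoutSeconds?: number;
}

function pickBudgetMs(input: BudgetInput): number {
  if (input.configuredTimeoutMs !== undefined) return input.configuredTimeoutMs;
  if (input.toolName === "image_generate") return IMAGE_GENERATION_DEFAULT_MS;
  if (input.toolName === "image") {
    return input.mediaImageTimeoutSeconds !== undefined
      ? input.mediaImageTimeoutSeconds * 1000
      : MEDIA_DEFAULT_MS;
  }
  return DYNAMIC_TOOL_DEFAULT_MS;
}

// Applies the 600000 ms cap on top of whichever budget was picked.
function resolveDynamicToolBudgetMs(input: BudgetInput): number {
  return Math.min(pickBudgetMs(input), MAX_DYNAMIC_TOOL_BUDGET_MS);
}
```

For example, an `image` call with `mediaImageTimeoutSeconds: 900` would resolve to the 600000 ms cap rather than 900000 ms.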

View File

@@ -557,14 +557,10 @@ or shortens that specific tool budget. The `image_generate` tool uses
`agents.defaults.imageGenerationModel.timeoutMs` when the tool call does not
provide its own timeout, or a 120 second image-generation default otherwise.
The media-understanding `image` tool uses
`tools.media.image.timeoutSeconds` or its 60 second media default. For image
understanding, that timeout applies to the request itself and is not
reduced by earlier preparation work. Dynamic tool budgets are
capped at 600000 ms. On timeout, OpenClaw aborts the tool signal
`tools.media.image.timeoutSeconds` or its 60 second media default. Dynamic tool
budgets are capped at 600000 ms. On timeout, OpenClaw aborts the tool signal
where supported and returns a failed dynamic-tool response to Codex so the turn
can continue instead of leaving the session in `processing`.
This watchdog is the outer dynamic `item/tool/call` budget; provider-specific
request timeouts run inside that call and keep their own timeout semantics.
After Codex accepts a turn, and after OpenClaw responds to a turn-scoped
app-server request, the harness expects Codex to make current-turn progress and

View File

@@ -152,8 +152,7 @@ observation-only.
- `gateway_start` / `gateway_stop` - start or stop plugin-owned services with the Gateway
- `deactivate` - deprecated compatibility alias for `gateway_stop`; use `gateway_stop` in new plugins
- `cron_changed` - observe gateway-owned cron lifecycle changes (added, updated, removed, started, finished, scheduled)
- **`before_install`** - inspect staged skill or plugin install material from a loaded
plugin runtime
- **`before_install`** - inspect skill or plugin install context and optionally block
## Debug runtime hooks
@@ -463,19 +462,11 @@ Decision rules:
## Install hooks
Use `security.installPolicy` for operator-owned allow/block decisions. That
policy runs from OpenClaw config, covers CLI install and update paths, and fails
closed when enabled but unavailable.
`before_install` is a plugin-runtime lifecycle hook. It runs after
`security.installPolicy` only in the OpenClaw process where plugin hooks have
already been loaded, such as Gateway-backed install flows. It is useful for
plugin-owned observations, warnings, and compatibility checks, but it is not the
primary enterprise or host security boundary for installs. The `builtinScan`
field remains in the event payload for compatibility, but OpenClaw no longer
runs built-in install-time dangerous-code blocking, so it is an empty `ok`
result. Return additional findings or `{ block: true, blockReason }` to stop the
install in that process.
`before_install` runs after the operator-owned `security.installPolicy` check
when one is configured. The `builtinScan` field remains in the event payload for
compatibility, but OpenClaw no longer runs built-in install-time dangerous-code
blocking, so it is an empty `ok` result. Return additional findings or
`{ block: true, blockReason }` to stop the install.
`block: true` is terminal. `block: false` is treated as no decision.
Handler failures block the install fail-closed.
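The decision rules above can be sketched as a small reducer. This is an illustration only: `InstallHookResult`, `InstallHandler`, and `runBeforeInstall` are hypothetical names, handlers are assumed to be already sorted by priority, and they are modeled as synchronous for brevity even though real hooks may be async.

```typescript
// Hypothetical types for illustration; not the plugin SDK's definitions.
interface InstallHookResult {
  block?: boolean;
  blockReason?: string;
}

type InstallHandler = () => InstallHookResult | undefined;

interface InstallDecision {
  blocked: boolean;
  reason?: string;
}

// Runs handlers in priority order and applies the rules:
// - `block: true` is terminal, so later handlers are skipped.
// - `block: false` (or no result) is treated as no decision.
// - A throwing handler blocks the install (fail-closed).
function runBeforeInstall(handlers: InstallHandler[]): InstallDecision {
  for (const handler of handlers) {
    let result: InstallHookResult | undefined;
    try {
      result = handler();
    } catch (error) {
      return { blocked: true, reason: `handler failed: ${String(error)}` };
    }
    if (result?.block === true) {
      return { blocked: true, reason: result.blockReason };
    }
  }
  return { blocked: false };
}
```

Because `block: false` is not an override, a later handler can still block an install that an earlier handler explicitly declined to block.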

View File

@@ -378,10 +378,7 @@ AI CLI backend such as `claude-cli` or `my-cli`.
(for example normalizing old flag shapes).
- Use `resolveExecutionArgs` for request-scoped argv rewrites that belong to
the CLI dialect, such as mapping OpenClaw thinking levels to a native effort
flag. The hook receives `ctx.executionMode`; use `"side-question"` to add
backend-native isolation flags for ephemeral `/btw` calls. If those flags
reliably disable native tools for an otherwise always-on CLI, declare
`sideQuestionToolMode: "disabled"` too.
flag.
For an end-to-end authoring guide, see
[CLI backend plugins](/plugins/cli-backend-plugins).
@@ -431,10 +428,6 @@ semantics.
### Hook decision semantics
`before_install` is a plugin-runtime lifecycle hook, not the operator install
policy surface. Use `security.installPolicy` when an allow/block decision must
cover CLI and Gateway-backed install or update paths.
- `before_tool_call`: returning `{ block: true }` is terminal. Once any handler sets it, lower-priority handlers are skipped.
- `before_tool_call`: returning `{ block: false }` is treated as no decision (same as omitting `block`), not as an override.
- `before_install`: returning `{ block: true }` is terminal. Once any handler sets it, lower-priority handlers are skipped.

View File

@@ -515,7 +515,6 @@ API key auth, and dynamic model resolution.
- `openclaw/plugin-sdk/provider-model-shared` - `ProviderReplayFamily`, `buildProviderReplayFamilyHooks(...)`, and the raw replay builders (`buildOpenAICompatibleReplayPolicy`, `buildAnthropicReplayPolicyForModel`, `buildGoogleGeminiReplayPolicy`, `buildHybridAnthropicOrOpenAIReplayPolicy`). Also exports Gemini replay helpers (`sanitizeGoogleGeminiReplayHistory`, `resolveTaggedReasoningOutputMode`) and endpoint/model helpers (`resolveProviderEndpoint`, `normalizeProviderId`, `normalizeGooglePreviewModelId`).
- `openclaw/plugin-sdk/provider-stream` - `ProviderStreamFamily`, `buildProviderStreamFamilyHooks(...)`, `composeProviderStreamWrappers(...)`, plus the shared OpenAI/Codex wrappers (`createOpenAIAttributionHeadersWrapper`, `createOpenAIFastModeWrapper`, `createOpenAIServiceTierWrapper`, `createOpenAIResponsesContextManagementWrapper`, `createCodexNativeWebSearchWrapper`), DeepSeek V4 OpenAI-compatible wrapper (`createDeepSeekV4OpenAICompatibleThinkingWrapper`), Anthropic Messages thinking prefill cleanup (`createAnthropicThinkingPrefillPayloadWrapper`), plain-text tool-call compat (`createPlainTextToolCallCompatWrapper`), and shared proxy/provider wrappers (`createOpenRouterWrapper`, `createToolStreamWrapper`, `createMinimaxFastModeWrapper`).
- `openclaw/plugin-sdk/provider-stream-shared` - lightweight payload and event wrappers for hot provider paths, including `createOpenAICompatibleCompletionsThinkingOffWrapper`, `createPayloadPatchStreamWrapper`, and `createPlainTextToolCallCompatWrapper`.
- `openclaw/plugin-sdk/provider-tools` - `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks("deepseek" | "gemini" | "openai")`, and underlying provider schema helpers.
For Gemini-family providers, keep the reasoning-output mode aligned with

View File

@@ -164,7 +164,7 @@ and pairing-path families.
| `plugin-sdk/provider-tools` | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, and DeepSeek/Gemini/OpenAI schema cleanup + diagnostics |
| `plugin-sdk/provider-usage` | Provider usage snapshot types, shared usage fetch helpers, and provider fetchers such as `fetchClaudeUsage` |
| `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, plain-text tool-call compat, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.AI/MiniMax/Copilot wrapper helpers |
| `plugin-sdk/provider-stream-shared` | Public shared provider stream wrapper helpers including `composeProviderStreamWrappers`, `createOpenAICompatibleCompletionsThinkingOffWrapper`, `createPlainTextToolCallCompatWrapper`, `createPayloadPatchStreamWrapper`, `createToolStreamWrapper`, and Anthropic/DeepSeek/OpenAI-compatible stream utilities |
| `plugin-sdk/provider-stream-shared` | Public shared provider stream wrapper helpers including `composeProviderStreamWrappers`, `createPlainTextToolCallCompatWrapper`, `createPayloadPatchStreamWrapper`, `createToolStreamWrapper`, and Anthropic/DeepSeek/OpenAI-compatible stream utilities |
| `plugin-sdk/provider-transport-runtime` | Native provider transport helpers such as guarded fetch, transport message transforms, and writable transport event streams |
| `plugin-sdk/provider-onboard` | Onboarding config patch helpers |
| `plugin-sdk/global-singleton` | Process-local singleton/map/cache helpers |
@@ -236,7 +236,6 @@ usage endpoint failed or returned no usable usage data.
| `plugin-sdk/config-contracts` | Focused type-only config surface for plugin config shapes such as `OpenClawConfig` and channel/provider config types |
| `plugin-sdk/plugin-config-runtime` | Runtime plugin-config lookup helpers such as `requireRuntimeConfig`, `resolvePluginConfigObject`, and `resolveLivePluginConfigObject` |
| `plugin-sdk/config-mutation` | Transactional config mutation helpers such as `mutateConfigFile`, `replaceConfigFile`, and `logConfigUpdated` |
| `plugin-sdk/message-tool-delivery-hints` | Shared message-tool delivery metadata hint strings |
| `plugin-sdk/runtime-config-snapshot` | Current process config snapshot helpers such as `getRuntimeConfig`, `getRuntimeConfigSnapshot`, and test snapshot setters |
| `plugin-sdk/telegram-command-config` | Telegram command-name/description normalization and duplicate/conflict checks, even when the bundled Telegram contract surface is unavailable |
| `plugin-sdk/text-autolink-runtime` | File-reference autolink detection without the broad text barrel |

View File

@@ -86,7 +86,6 @@ Bundled fallback examples:
| Model ref | Notes |
| --------------------------------- | ---------------------------- |
| `openrouter/auto` | OpenRouter automatic routing |
| `openrouter/openrouter/fusion` | OpenRouter Fusion router |
| `openrouter/moonshotai/kimi-k2.6` | Kimi K2.6 via MoonshotAI |
| `openrouter/moonshotai/kimi-k2.5` | Kimi K2.5 via MoonshotAI |
@@ -214,79 +213,6 @@ media understanding preflight.
OpenClaw sends OpenRouter STT requests as JSON with base64 audio under
`input_audio` (OpenRouter STT contract), not as multipart OpenAI form uploads.
## Fusion router
Use OpenRouter Fusion when you want one OpenClaw model ref to ask several
OpenRouter models in parallel, have OpenRouter judge their answers, and return a
single final response through the normal OpenRouter provider endpoint. Because
the upstream model slug is `openrouter/fusion`, the OpenClaw model ref includes
both the OpenClaw provider prefix and the upstream OpenRouter namespace:
```bash
openclaw models set openrouter/openrouter/fusion
```
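
To make the doubled prefix concrete, the sketch below splits a model ref at the first `/` only, so everything after the provider prefix is kept as the upstream model id. `splitModelRef` is a hypothetical helper written for this illustration, not an OpenClaw API.

```js
// Hypothetical helper: split "<provider>/<model id>" at the first slash only.
function splitModelRef(ref) {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`expected "<provider>/<model>", got "${ref}"`);
  }
  return { provider: ref.slice(0, slash), modelId: ref.slice(slash + 1) };
}

// provider is "openrouter"; modelId keeps the upstream slug "openrouter/fusion".
console.log(splitModelRef("openrouter/openrouter/fusion"));
```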
Configure Fusion's panel and judge through the model's `params.extraBody`. Those
fields are forwarded into the OpenRouter chat-completions request body. Fusion
works with either OpenRouter OAuth onboarding or API-key onboarding; if you use
OAuth, omit the `env.OPENROUTER_API_KEY` line from the example below.
```json5
{
env: { OPENROUTER_API_KEY: "sk-or-..." },
agents: {
defaults: {
model: { primary: "openrouter/openrouter/fusion" },
models: {
"openrouter/openrouter/fusion": {
params: {
extraBody: {
plugins: [
{
id: "fusion",
analysis_models: [
"google/gemini-3.5-flash",
"moonshotai/kimi-k2.6",
"deepseek/deepseek-v4-pro",
],
model: "google/gemini-3.5-flash",
},
],
},
},
},
},
},
},
}
```
The `analysis_models` list is the parallel panel, and `model` inside the Fusion
plugin config is the judge model. Do not set top-level `tool_choice` to
`"required"` in normal OpenClaw agent/chat turns to try to force Fusion;
OpenClaw turns may include OpenClaw tool definitions, and a top-level required
tool choice can require one of those tools instead of the Fusion router. When
this Fusion plugin config is present, OpenClaw also adds a sanitized
system-prompt note with the configured analysis models and judge model so the
agent can answer questions about its current Fusion panel. Other `extraBody`
fields are not copied into the prompt.
Fusion is slower by design. OpenRouter may send the same OpenClaw prompt to
multiple analysis models and then run a final judge/synthesis step, so latency is
usually higher than a direct single-model request. Use Fusion for deliberate,
high-quality answers or escalation paths, not as the default for
latency-sensitive chat. For faster responses, keep the panel small and choose
faster analysis and judge models.
Test the configured ref with a one-shot local model call:
```bash
openclaw infer model run --local \
--model openrouter/openrouter/fusion \
--prompt "Reply with exactly: FUSION_OK" \
--json
```
## Authentication and headers
OpenRouter uses a Bearer token with your API key under the hood. OpenRouter

View File

@@ -19,7 +19,7 @@ OpenClaw uses the `zai` provider with a Z.AI API key.
## GLM models
GLM is a model family, not a separate provider. In OpenClaw, GLM models use
refs such as `zai/glm-5.2`: provider `zai`, model id `glm-5.2`.
refs such as `zai/glm-5.1`: provider `zai`, model id `glm-5.1`.
## Getting started
@@ -85,12 +85,12 @@ you want to force a specific Coding Plan or general API surface.
models: {
providers: {
zai: {
// GLM-5.2 uses the Coding Plan endpoint.
baseUrl: "https://api.z.ai/api/coding/paas/v4",
// Example value. Onboarding writes the matching baseUrl for your endpoint.
baseUrl: "https://api.z.ai/api/paas/v4",
},
},
},
agents: { defaults: { model: { primary: "zai/glm-5.2" } } },
agents: { defaults: { model: { primary: "zai/glm-5.1" } } },
}
```
@@ -105,31 +105,28 @@ openclaw models list --all --provider zai
The manifest-backed catalog currently includes:
| Model ref | Notes |
| -------------------- | ------------------------------- |
| `zai/glm-5.2` | Coding Plan default; 1M context |
| `zai/glm-5.1` | General API default |
| `zai/glm-5` | |
| `zai/glm-5-turbo` | |
| `zai/glm-5v-turbo` | |
| `zai/glm-4.7` | |
| `zai/glm-4.7-flash` | |
| `zai/glm-4.7-flashx` | |
| `zai/glm-4.6` | |
| `zai/glm-4.6v` | |
| `zai/glm-4.5` | |
| `zai/glm-4.5-air` | |
| `zai/glm-4.5-flash` | |
| `zai/glm-4.5v` | |
| Model ref | Notes |
| -------------------- | ------------- |
| `zai/glm-5.1` | Default model |
| `zai/glm-5` | |
| `zai/glm-5-turbo` | |
| `zai/glm-5v-turbo` | |
| `zai/glm-4.7` | |
| `zai/glm-4.7-flash` | |
| `zai/glm-4.7-flashx` | |
| `zai/glm-4.6` | |
| `zai/glm-4.6v` | |
| `zai/glm-4.5` | |
| `zai/glm-4.5-air` | |
| `zai/glm-4.5-flash` | |
| `zai/glm-4.5v` | |
<Tip>
GLM models are available as `zai/<model>` (example: `zai/glm-5`).
</Tip>
<Note>
Coding Plan setup defaults to `zai/glm-5.2`; general API setup keeps
`zai/glm-5.1`. Endpoint auto-detection falls back to `glm-5.1` or `glm-4.7`
when the selected plan does not expose GLM-5.2. GLM versions and availability
The default bundled model ref is `zai/glm-5.1`. GLM versions and availability
can change; run `openclaw models list --all --provider zai` to see the catalog
known to your installed version.
</Note>
@@ -176,7 +173,7 @@ known to your installed version.
agents: {
defaults: {
models: {
"zai/glm-5.2": {
"zai/glm-5.1": {
params: { preserveThinking: true },
},
},

View File

@@ -99,14 +99,10 @@ the maintainer-only release runbook.
file, lane, workflow job, package profile, provider, or model allowlist that
proves the fix. Rerun the full umbrella only when the changed surface makes
prior evidence stale.
9. For a tagged beta candidate, run
`pnpm release:candidate -- --tag vYYYY.M.PATCH-beta.N` from the matching
`release/YYYY.M.PATCH` branch. For stable, pass the required Windows source
release too:
`pnpm release:candidate -- --tag vYYYY.M.PATCH --windows-node-tag vX.Y.Z`.
The helper runs the local generated-release checks, dispatches or verifies
the full release validation and npm preflight evidence, runs Parallels
fresh/update proof against the exact prepared tarball plus Telegram package
9. For beta, tag `vYYYY.M.PATCH-beta.N`, then run
`pnpm release:candidate -- --tag vYYYY.M.PATCH-beta.N` from the matching
`release/YYYY.M.PATCH` branch. The helper runs
the local generated-release checks, dispatches or verifies the full release
validation and npm preflight evidence, runs Parallels and Telegram package
proof, records plugin npm and ClawHub plans, and prints the exact
`OpenClaw Release Publish` command only after the evidence bundle is green.
`OpenClaw Release Publish` dispatches the selected or all-publishable plugin
@@ -146,12 +142,9 @@ the maintainer-only release runbook.
direct push, it opens or updates an appcast PR. Stable Windows Hub
readiness requires the signed `OpenClawCompanion-Setup-x64.exe`,
`OpenClawCompanion-Setup-arm64.exe`, and
`OpenClawCompanion-SHA256SUMS.txt` assets on the OpenClaw GitHub release.
Pass the exact signed `openclaw/openclaw-windows-node` release tag as
`windows_node_tag` and its candidate-approved installer digest map as
`windows_node_installer_digests`; `OpenClaw Release Publish` keeps the
release draft, dispatches `Windows Node Release`, and verifies all three
assets before publication.
`OpenClawCompanion-SHA256SUMS.txt` assets on the OpenClaw GitHub release;
promote them with the `Windows Node Release` workflow after the matching
`openclaw/openclaw-windows-node` release has passed its signing workflow.
11. After publish, run the npm post-publish verifier, optional standalone
published-npm Telegram E2E when you need post-publish channel proof,
dist-tag promotion when needed, verify the generated GitHub release page,
@@ -260,36 +253,21 @@ the maintainer-only release runbook.
to the GitHub release as `openclaw-<version>-dependency-evidence.zip`.
- Run `OpenClaw Release Publish` for the mutating publish sequence after the
tag exists. Dispatch it from `release/YYYY.M.PATCH` (or `main` when publishing a
main-reachable tag), pass the release tag, successful OpenClaw npm
`preflight_run_id`, and successful `full_release_validation_run_id`, and keep
the default plugin publish scope `all-publishable` unless you are deliberately
running a focused repair. The workflow serializes plugin npm publish, plugin
ClawHub publish, and OpenClaw npm publish so the core package is not published
before its externalized plugins.
- Stable `OpenClaw Release Publish` requires an exact `windows_node_tag` after
the matching non-prerelease `openclaw/openclaw-windows-node` release exists.
It also requires the candidate-approved `windows_node_installer_digests` map.
Before dispatching any publish child, it verifies that source release is
published, non-prerelease, contains the required x64/ARM64 installers, and
still matches that approved map. It then dispatches `Windows Node Release`
while the OpenClaw release is still a draft, carrying the pinned installer
digest map unchanged. The child
workflow downloads the signed Windows Hub installers from that exact tag,
matches them against the pinned digests, verifies their Authenticode
signatures use the expected OpenClaw Foundation signer on a Windows runner,
writes a SHA-256 manifest, and uploads the installers plus manifest onto the
canonical OpenClaw GitHub release, then re-downloads the promoted assets and
verifies the manifest membership and hashes. The parent verifies the current
x64, ARM64, and checksum asset contract before publication. Direct recovery
rejects unexpected `OpenClawCompanion-*` asset names before replacing the
expected contract assets with the pinned source bytes. Manually dispatch
`Windows Node Release` only for recovery, and always pass an exact tag, never
`latest`, plus the explicit `expected_installer_digests` JSON map from the
approved source release. Website download links should target exact OpenClaw
release asset URLs for the current stable release, or
`releases/latest/download/...` only after verifying GitHub's latest redirect
points at that same release; do not link only to the companion repo release
page.
main-reachable tag), pass the release tag and successful OpenClaw npm
`preflight_run_id`, and keep the default plugin publish scope
`all-publishable` unless you are deliberately running a focused repair. The
workflow serializes plugin npm publish, plugin ClawHub publish, and OpenClaw
npm publish so the core package is not published before its externalized
plugins.
- Run the manual `Windows Node Release` workflow for stable releases after the
matching `openclaw/openclaw-windows-node` release exists. It downloads the
signed Windows Hub installers from the companion repo, verifies their
Authenticode signatures on a Windows runner, writes a SHA-256 manifest, and
uploads the installers plus manifest onto the canonical OpenClaw GitHub
release. Website download links should target exact OpenClaw release asset
URLs for the current stable release, or `releases/latest/download/...` only
after verifying GitHub's latest redirect points at that same release; do not
link only to the companion repo release page.
- Release checks now run in a separate manual workflow:
`OpenClaw Release Checks`
- `OpenClaw Release Checks` also runs the QA Lab mock parity lane plus the fast
@@ -719,12 +697,7 @@ orchestrates the trusted-publisher workflows in the order the release needs:
`ref=<release-sha>`.
5. Dispatch `Plugin ClawHub Release` with the same scope and SHA.
6. Dispatch `OpenClaw NPM Release` with the release tag, npm dist-tag, and
saved `preflight_run_id` after verifying the saved
`full_release_validation_run_id`.
7. For stable releases, create or update the GitHub release as a draft, dispatch
`Windows Node Release` with the explicit `windows_node_tag` and
candidate-approved `windows_node_installer_digests`, and verify the canonical
installer/checksum assets before publishing the draft.
saved `preflight_run_id`.
Beta publish example:
@@ -733,7 +706,6 @@ gh workflow run openclaw-release-publish.yml \
--ref release/YYYY.M.PATCH \
-f tag=vYYYY.M.PATCH-beta.N \
-f preflight_run_id=<successful-openclaw-npm-preflight-run-id> \
-f full_release_validation_run_id=<successful-full-release-validation-run-id> \
-f npm_dist_tag=beta
```
@@ -743,10 +715,7 @@ Stable publish to the default beta dist-tag:
gh workflow run openclaw-release-publish.yml \
--ref release/YYYY.M.PATCH \
-f tag=vYYYY.M.PATCH \
-f windows_node_tag=vX.Y.Z \
-f windows_node_installer_digests='{"OpenClawCompanion-Setup-x64.exe":"sha256:<approved-x64-sha256>","OpenClawCompanion-Setup-arm64.exe":"sha256:<approved-arm64-sha256>"}' \
-f preflight_run_id=<successful-openclaw-npm-preflight-run-id> \
-f full_release_validation_run_id=<successful-full-release-validation-run-id> \
-f npm_dist_tag=beta
```
@@ -756,10 +725,7 @@ Stable promotion directly to `latest` is explicit:
gh workflow run openclaw-release-publish.yml \
--ref release/YYYY.M.PATCH \
-f tag=vYYYY.M.PATCH \
-f windows_node_tag=vX.Y.Z \
-f windows_node_installer_digests='{"OpenClawCompanion-Setup-x64.exe":"sha256:<approved-x64-sha256>","OpenClawCompanion-Setup-arm64.exe":"sha256:<approved-arm64-sha256>"}' \
-f preflight_run_id=<successful-openclaw-npm-preflight-run-id> \
-f full_release_validation_run_id=<successful-full-release-validation-run-id> \
-f npm_dist_tag=latest
```
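
For readers who want to sanity-check a `windows_node_installer_digests` map locally, the sketch below shows one way to compare file bytes against pinned `sha256:` digests using Node's built-in `crypto` module. It is an assumption-laden illustration, not the workflow's verification code: `sha256Digest` and `findDigestMismatches` are hypothetical names, and the files are passed in as in-memory buffers.

```js
const { createHash } = require("node:crypto");

// Format a buffer's SHA-256 the way the digest map writes it: "sha256:<hex>".
function sha256Digest(bytes) {
  return "sha256:" + createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical helper: return the installer names whose bytes are missing
// or do not match the pinned digest. An empty array means every entry matched.
function findDigestMismatches(pinned, files) {
  return Object.keys(pinned).filter(
    (name) => !(name in files) || sha256Digest(files[name]) !== pinned[name],
  );
}
```

Passing the parsed JSON map as `pinned` and a `{ name: Buffer }` object as `files` yields the list of names that should stop a publish.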
@@ -789,13 +755,6 @@ package cannot ship without every publishable official plugin, including
- `tag`: required release tag; must already exist
- `preflight_run_id`: successful `OpenClaw NPM Release` preflight run id;
required when `publish_openclaw_npm=true`
- `full_release_validation_run_id`: successful `Full Release Validation` run
id; required when `publish_openclaw_npm=true`
- `windows_node_tag`: exact non-prerelease `openclaw/openclaw-windows-node`
release tag; required for stable OpenClaw publish
- `windows_node_installer_digests`: candidate-approved compact JSON map of the
current Windows installer names to their pinned `sha256:` digests; required
for stable OpenClaw publish
- `npm_dist_tag`: npm target tag for the OpenClaw package
- `plugin_publish_scope`: defaults to `all-publishable`; use `selected` only
for focused plugin-only repair work with `publish_openclaw_npm=false`
@@ -841,21 +800,14 @@ When cutting a stable npm release:
Matrix, and Telegram coverage from one manual workflow
4. If you intentionally only need the deterministic normal test graph, run the
manual `CI` workflow on the release ref instead
5. Select the exact non-prerelease `openclaw/openclaw-windows-node` release tag
whose signed x64 and ARM64 installers should ship. Save it as
`windows_node_tag`, and save their validated digest map as
`windows_node_installer_digests`. The release-candidate helper records both
and includes them in its generated publish command.
6. Save the successful `preflight_run_id` and `full_release_validation_run_id`
7. Run `OpenClaw Release Publish` with the same `tag`, the same `npm_dist_tag`,
the selected `windows_node_tag`, its saved `windows_node_installer_digests`,
the saved `preflight_run_id`, and the saved `full_release_validation_run_id`;
it publishes externalized plugins to npm and ClawHub before promoting the
OpenClaw npm package
8. If the release landed on `beta`, use the
5. Save the successful `preflight_run_id`
6. Run `OpenClaw Release Publish` with the same `tag`, the same `npm_dist_tag`,
and the saved `preflight_run_id`; it publishes externalized plugins to npm
and ClawHub before promoting the OpenClaw npm package
7. If the release landed on `beta`, use the
`openclaw/releases/.github/workflows/openclaw-npm-dist-tags.yml`
workflow to promote that stable version from `beta` to `latest`
9. If the release intentionally published directly to `latest` and `beta`
8. If the release intentionally published directly to `latest` and `beta`
should follow the same stable build immediately, use that same release
workflow to point both dist-tags at the stable version, or let its scheduled
self-healing sync move `beta` later

View File

@@ -20,7 +20,6 @@ Scope includes:
- Thinking signature cleanup
- Image payload sanitization
- Blank text-block cleanup before provider replay
- Incomplete reasoning-only length-turn cleanup before provider replay
- User-input provenance tagging (for inter-session routed prompts)
- Empty assistant error-turn repair for Bedrock Converse replay
@@ -92,21 +91,6 @@ Implementation:
---
## Global rule: incomplete reasoning-only turns
Assistant turns that hit the provider output limit with only thinking or
redacted-thinking content are omitted from the in-memory replay copy. Such turns
contain incomplete provider state and may carry a partial thinking signature.
Empty length turns remain unchanged, as do length turns with visible text, tool
calls, or unknown content blocks. Stored transcripts are not rewritten.
Implementation:
- `normalizeAssistantReplayContent` in `src/agents/embedded-agent-runner/replay-history.ts`
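
As a rough illustration of the rule above (not the code in `normalizeAssistantReplayContent`), the sketch below filters an in-memory copy of the history. The message shape (`role`, `stopReason`, `content[].type`) and the block type names are assumptions made for the example.

```js
// Assumed block type names for thinking and redacted-thinking content.
const THINKING_TYPES = new Set(["thinking", "redacted_thinking"]);

// True only for assistant turns that stopped on the output limit and
// contain nothing but thinking blocks. Empty turns do not qualify.
function isIncompleteReasoningOnlyTurn(msg) {
  return (
    msg.role === "assistant" &&
    msg.stopReason === "length" &&
    Array.isArray(msg.content) &&
    msg.content.length > 0 &&
    msg.content.every((block) => THINKING_TYPES.has(block.type))
  );
}

// filter() returns a new array, so the stored transcript is left untouched.
function buildReplayCopy(history) {
  return history.filter((msg) => !isIncompleteReasoningOnlyTurn(msg));
}
```

Length turns with visible text, tool calls, or unknown block types fail the `every` check and stay in the replay copy.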
---
## Global rule: inter-session input provenance
When an agent sends a prompt into another session via `sessions_send` (including

View File

@@ -336,7 +336,6 @@ top-level `bindings[]` entries.
- **Discord channel/thread:** `match.channel="discord"` + `match.peer.id="<channelOrThreadId>"`
- **Slack channel/DM:** `match.channel="slack"` + `match.peer.id="<channelId|channel:<channelId>|#<channelId>|userId|user:<userId>|slack:<userId>|<@userId>>"`. Prefer stable Slack ids; channel bindings also match replies inside that channel's threads.
- **Telegram forum topic:** `match.channel="telegram"` + `match.peer.id="<chatId>:topic:<topicId>"`
- **WhatsApp DM/group:** `match.channel="whatsapp"` + `match.peer.id="<E.164|group JID>"`. Use E.164 numbers such as `+15555550123` for direct chats and WhatsApp group JIDs such as `120363424282127706@g.us` for groups.
- **iMessage DM/group:** `match.channel="imessage"` + `match.peer.id="<handle|chat_id:*|chat_guid:*|chat_identifier:*>"`. Prefer `chat_id:*` for stable group bindings.
</ParamField>
@@ -454,9 +453,8 @@ Use `agents.list[].runtime` to define ACP defaults once per agent:
### Behavior
- OpenClaw ensures the configured ACP session exists after channel-specific admission and before use.
- Messages in that channel, topic, or chat route to the configured ACP session.
- Configured ACP bindings own their session route. Channel broadcast fan-out does not replace the configured ACP session for a matched binding.
- OpenClaw ensures the configured ACP session exists before use.
- Messages in that channel or topic route to the configured ACP session.
- In bound conversations, `/new` and `/reset` reset the same ACP session key in place.
- Temporary runtime bindings (for example created by thread-focus flows) still apply where present.
- For cross-agent ACP spawns without an explicit `cwd`, OpenClaw inherits the target agent workspace from agent config.

View File

@@ -13,12 +13,7 @@ CLI, and scripting patterns (snapshots, refs, waits, debug flows).
## Control API (optional)
For local integrations only, the Gateway exposes a small loopback HTTP API.
This standalone server is opt-in — set the environment variable
`OPENCLAW_EAGER_BROWSER_CONTROL_SERVER=1` in the gateway service environment
and restart the gateway before the HTTP endpoints become available. Without
this variable the browser control runtime still works through the CLI and
agent tools, but nothing listens on the loopback control port.
For local integrations only, the Gateway exposes a small loopback HTTP API:
- Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
@@ -263,14 +258,7 @@ Snapshot flags at a glance:
- `--format aria`: accessibility tree with `axN` refs. When Playwright is available, OpenClaw binds refs with backend DOM ids to the live page so follow-up actions can use them; otherwise treat the output as inspection-only.
- `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see [Gateway configuration](/gateway/configuration-reference#browser)).
- `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
- With Playwright, `--labels` adds a screenshot with overlaid ref labels
(prints `MEDIA:<path>`) plus an `annotations` array with each ref's bounding
box. On `screenshot`, Playwright-backed labels work with `--full-page`,
`--ref`, and `--element`; on `snapshot`, the accompanying screenshot remains
viewport-only. Existing-session/chrome-mcp profiles render overlay labels on
page screenshots but do not return `annotations` or use the Playwright
full-page/ref/element projection helper. Without Playwright or chrome-mcp,
labeled screenshots are not available.
- `--labels` adds a viewport-only screenshot with overlaid ref labels and prints the saved path.
- `--urls` appends discovered link destinations to AI snapshots.
## Snapshots and refs
@@ -286,9 +274,7 @@ OpenClaw supports two "snapshot" styles:
- Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
- Actions: `openclaw browser click e12`, `openclaw browser highlight e12`.
- Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
- Add `--labels` to include a screenshot with overlaid `e12` labels. On
Playwright-backed profiles this also returns per-ref bounding-box metadata
(`annotations[]`).
- Add `--labels` to include a viewport screenshot with overlaid `e12` labels.
- Add `--urls` when link text is ambiguous and the agent needs concrete
navigation targets.

View File

@@ -42,14 +42,8 @@ app-server thread as an ephemeral side thread. That keeps Codex OAuth and native
thread behavior intact while still isolating the side answer from the parent
transcript. Like Codex `/side`, the side thread keeps the current Codex
permissions and native tool surface, with guardrails that tell the model not to
treat inherited parent-thread work as active instructions.
For CLI runtime aliases, BTW uses the owning CLI backend in side-question mode
instead of falling back to a direct provider call. OpenClaw seeds sanitized
conversation context into a fresh one-shot CLI invocation, disables OpenClaw MCP
tool bundling and reusable CLI session state for that invocation, and lets the
backend add any CLI-native no-resume or no-tools flags it supports. Direct
non-CLI runtimes keep the direct one-shot path.
treat inherited parent-thread work as active instructions. Non-Codex runtimes
keep the older direct one-shot path.
## What it does not do

View File

@@ -147,12 +147,10 @@ such as `@beta` stay pinned to the selected package and fail when incompatible.
Configure `security.installPolicy` to run a trusted local policy command before
plugin install or update proceeds. The policy receives metadata plus the staged
source path and can allow or block the install. It covers CLI and Gateway-backed
plugin install/update paths. Plugin `before_install` hooks run later only in
OpenClaw processes where plugin hooks are loaded, so use `security.installPolicy`
for operator-owned install decisions. The deprecated
`--dangerously-force-unsafe-install` flag is accepted for compatibility but does
not bypass install policy or OpenClaw's built-in plugin dependency denylist.
source path and can allow or block the install. It runs before plugin
`before_install` hooks. The deprecated `--dangerously-force-unsafe-install`
flag is accepted for compatibility but does not bypass install policy, hooks, or
OpenClaw's built-in plugin dependency denylist.
See [Skills config](/tools/skills-config#operator-install-policy-securityinstallpolicy)
for the shared `security.installPolicy` exec schema used by both skills and

View File

@@ -16,9 +16,9 @@ search or dynamic-tools surface. Codex-native code mode, tool search, deferred
dynamic tools, and nested tool calls are stable Codex harness surfaces and do
not depend on `tools.toolSearch`.
When enabled for OpenClaw runs, the model receives one `tool_search_code` tool
by default. That tool runs a short JavaScript body in an isolated Node
subprocess with an `openclaw.tools` bridge:
When enabled for OpenClaw runs, the model receives one `tool_search_code` tool by default.
That tool runs a short JavaScript body in an isolated Node subprocess with an
`openclaw.tools` bridge:
```js
const hits = await openclaw.tools.search("create a GitHub issue");
@@ -49,8 +49,8 @@ run:
3. List eligible MCP tools through the session MCP runtime.
4. Add eligible client tools supplied for the current run.
5. Index compact descriptors for search.
6. Expose the OpenClaw code bridge, the structured fallback tools, or the
compact directory surface to the model.
6. Expose either the OpenClaw code bridge or the structured fallback tools to the
model.
At execution time every real tool call returns to OpenClaw. The isolated Node
runtime does not hold plugin implementations, MCP client objects, or secrets.
@@ -59,26 +59,18 @@ normal policy, approval, hook, logging, and result handling still apply.
## Modes
`tools.toolSearch` has three model-facing modes:
`tools.toolSearch` has two model-facing modes:
- `code`: exposes `tool_search_code`, the default compact JavaScript bridge.
- `tools`: exposes `tool_search`, `tool_describe`, and `tool_call` as plain
structured tools for providers that should not receive code.
- `directory`: exposes `tool_search`, `tool_describe`, and `tool_call` plus a
bounded prompt directory of available tool names and descriptions for
providers that should see tool names without every full schema. OpenClaw can
also expose a small bounded set of likely or required tool schemas directly
for the current turn.
All modes use the same policy-filtered catalog and normal OpenClaw execution
path. If the current runtime cannot launch the isolated Node code-mode child
process, the default `code` mode falls back to `tools` before catalog
compaction. In `directory` mode, client-provided tools stay directly visible
for the current run while OpenClaw tools, plugin tools, and MCP tools can be
compacted behind the directory catalog. A direct call to an exact hidden
directory name is hydrated from that same authorized catalog before execution.
Both modes use the same catalog and execution path. The only difference is the
shape the model sees. If the current runtime cannot launch the isolated Node
code-mode child process, the default `code` mode falls back to `tools` before
catalog compaction.
All modes are experimental. Prefer direct tool exposure for small OpenClaw tool
Both modes are experimental. Prefer direct tool exposure for small OpenClaw tool
catalogs, and prefer the Codex-native stable surfaces for Codex harness runs.
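The fallback rule above can be sketched in TypeScript. This is only an illustration: `resolveToolSearchMode`, `exposedToolNames`, and the `canLaunchCodeChild` flag are hypothetical names, not taken from the OpenClaw source, and the sketch assumes an explicitly configured `code` mode falls back the same way as the default. Only the mode names and tool names come from the text.

```typescript
type ToolSearchMode = "code" | "tools";

// Hypothetical helper: picks the mode the model actually sees.
function resolveToolSearchMode(
  configured: ToolSearchMode | undefined,
  canLaunchCodeChild: boolean,
): ToolSearchMode {
  // `code` is the documented default when nothing is configured.
  const requested = configured ?? "code";
  // If the isolated Node child cannot be launched, fall back to `tools`.
  if (requested === "code" && !canLaunchCodeChild) {
    return "tools";
  }
  return requested;
}

// Hypothetical helper: the tool names each mode exposes, per the list above.
function exposedToolNames(mode: ToolSearchMode): string[] {
  return mode === "code"
    ? ["tool_search_code"]
    : ["tool_search", "tool_describe", "tool_call"];
}
```

Both modes would still read the same catalog and use the same execution path. Only the returned tool names differ.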
There is no separate source-selection config. When Tool Search is enabled, the
@@ -98,10 +90,7 @@ Tool Search changes the shape:
contract
- Tool Search tools mode: the model sees three compact structured fallback
tools
- Tool Search directory mode: the model sees a bounded directory plus
search/describe/call controls and a small bounded set of likely or required
schemas
- during the turn: the model can load remaining schemas as needed
- during the turn: the model loads only the tool schemas it actually needs
Direct tool exposure is still the right default for small catalogs. Tool Search
is best when one run can see many tools, especially from MCP servers or
@@ -143,20 +132,6 @@ The structured fallback mode exposes the same operations as tools:
- `tool_describe`
- `tool_call`
Directory mode exposes:
- `tool_search`
- `tool_describe`
- `tool_call`
It also keeps client-provided tools directly visible and may expose a small
bounded set of likely or required catalog tool schemas directly for the current
turn. If the bounded directory omits entries, use `tool_search` to find them. If
the model requests an exact hidden directory tool name directly, OpenClaw
hydrates it from the authorized catalog before normal execution.
Directory-mode client tool names must not collide with OpenClaw, plugin, or MCP
tool names because exact deferred dispatch uses those names.
## Runtime boundary
The code bridge runs in a short-lived Node subprocess. The subprocess starts
@@ -211,18 +186,6 @@ Use the structured fallback tools instead for OpenClaw runs:
}
```
Use the compact directory surface instead for OpenClaw runs:
```json5
{
tools: {
toolSearch: {
mode: "directory",
},
},
}
```
Tune code-mode timeout and search result limits:
```json5

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -123,12 +123,11 @@
"help": "Optional explicit denylist of chat/user IDs. Sessions whose resolved conversation id matches the list are skipped even when the chat type is allowed. Applied after allowedChatIds."
},
"timeoutMs": {
"label": "Timeout (ms)",
"help": "Recall work budget on the main lane. Before recall, the hook allows up to 1500 ms for session/config preflight. After recall starts, it reserves another fixed 1500 ms only for abort settlement and transcript recovery."
"label": "Timeout (ms)"
},
"setupGraceTimeoutMs": {
"label": "Setup Grace Timeout (ms)",
"help": "Advanced: extra recall-work budget for cold embedded-run setup. Defaults to 0. The separate 1500 ms preflight cap and 1500 ms post-recall completion allowance still apply."
"help": "Advanced: extra blocking budget for cold embedded-run setup before the recall timeout is considered exhausted. Defaults to 0 so timeoutMs remains the main-lane hook budget unless you opt in."
},
"queryMode": {
"label": "Query Mode",


@@ -34,7 +34,6 @@ export function buildAnthropicCliBackend(): CliBackendPlugin {
bundleMcp: true,
bundleMcpMode: "claude-config-file",
nativeToolMode: "always-on",
sideQuestionToolMode: "disabled",
ownsNativeCompaction: true,
config: {
command: "claude",


@@ -150,61 +150,6 @@ describe("resolveClaudeCliExecutionArgs", () => {
}),
).toEqual(["-p", "--effort", "max"]);
});
it("forces isolated no-tool one-shot args for side-question execution", () => {
expect(
resolveClaudeCliExecutionArgs({
workspaceDir: "/tmp",
provider: "claude-cli",
modelId: "claude-opus-4-7",
thinkingLevel: "max",
useResume: true,
executionMode: "side-question",
baseArgs: [
"-p",
"--output-format",
"stream-json",
"--allowedTools=mcp__openclaw__*",
"--allowedTools",
"Read",
"Grep",
"--permission-mode",
"bypassPermissions",
"--session-id=abc",
"--resume",
"old-session",
"--resume-session-at",
"old-message",
"--resume-session-at=old-message-equals",
"--mcp-config",
"/tmp/side-question-mcp.json",
"--bare",
"--safe-mode",
"--strict-mcp-config",
"--no-session-persistence",
"--max-turns",
"4",
"--effort",
"high",
],
}),
).toEqual([
"-p",
"--output-format",
"stream-json",
"--safe-mode",
"--tools",
"",
"--disallowedTools",
"mcp__*",
"--strict-mcp-config",
"--no-session-persistence",
"--max-turns",
"1",
"--permission-mode",
"default",
]);
});
});
describe("normalizeClaudeBackendConfig", () => {


@@ -67,26 +67,8 @@ const CLAUDE_LEGACY_SKIP_PERMISSIONS_ARG = "--dangerously-skip-permissions";
const CLAUDE_PERMISSION_MODE_ARG = "--permission-mode";
const CLAUDE_SETTING_SOURCES_ARG = "--setting-sources";
const CLAUDE_EFFORT_ARG = "--effort";
const CLAUDE_BARE_ARG = "--bare";
const CLAUDE_SAFE_MODE_ARG = "--safe-mode";
const CLAUDE_TOOLS_ARG = "--tools";
const CLAUDE_DISALLOWED_TOOLS_ARG = "--disallowedTools";
const CLAUDE_MCP_CONFIG_ARG = "--mcp-config";
const CLAUDE_STRICT_MCP_CONFIG_ARG = "--strict-mcp-config";
const CLAUDE_NO_SESSION_PERSISTENCE_ARG = "--no-session-persistence";
const CLAUDE_MAX_TURNS_ARG = "--max-turns";
const CLAUDE_SESSION_ID_ARG = "--session-id";
const CLAUDE_RESUME_ARG = "--resume";
const CLAUDE_RESUME_SESSION_AT_ARG = "--resume-session-at";
const CLAUDE_RESUME_SHORT_ARG = "-r";
const CLAUDE_CONTINUE_ARG = "--continue";
const CLAUDE_CONTINUE_SHORT_ARG = "-c";
const CLAUDE_FORK_SESSION_ARG = "--fork-session";
const CLAUDE_SAFE_SETTING_SOURCES = "user";
const CLAUDE_BYPASS_PERMISSION_MODE = "bypassPermissions";
const CLAUDE_DEFAULT_PERMISSION_MODE = "default";
const CLAUDE_NO_TOOLS_VALUE = "";
const CLAUDE_DENY_MCP_TOOLS_VALUE = "mcp__*";
type ClaudeCliEffort = "low" | "medium" | "high" | "xhigh" | "max";
@@ -250,89 +232,10 @@ function stripClaudeEffortArgs(args: readonly string[]): string[] {
return normalized;
}
const CLAUDE_SIDE_QUESTION_VARIADIC_VALUE_ARGS = new Set([
"--allowedTools",
"--allowed-tools",
CLAUDE_DISALLOWED_TOOLS_ARG,
"--disallowed-tools",
CLAUDE_TOOLS_ARG,
CLAUDE_MCP_CONFIG_ARG,
]);
const CLAUDE_SIDE_QUESTION_VALUE_ARGS = new Set([
CLAUDE_PERMISSION_MODE_ARG,
CLAUDE_SESSION_ID_ARG,
CLAUDE_RESUME_ARG,
CLAUDE_RESUME_SESSION_AT_ARG,
CLAUDE_RESUME_SHORT_ARG,
CLAUDE_MAX_TURNS_ARG,
]);
const CLAUDE_SIDE_QUESTION_BARE_ARGS = new Set([
CLAUDE_CONTINUE_ARG,
CLAUDE_CONTINUE_SHORT_ARG,
CLAUDE_FORK_SESSION_ARG,
CLAUDE_BARE_ARG,
CLAUDE_SAFE_MODE_ARG,
CLAUDE_STRICT_MCP_CONFIG_ARG,
CLAUDE_NO_SESSION_PERSISTENCE_ARG,
]);
function stripClaudeSideQuestionConflictingArgs(args: readonly string[]): string[] {
const normalized: string[] = [];
for (let i = 0; i < args.length; i += 1) {
const arg = args[i] ?? "";
const equalsIndex = arg.indexOf("=");
const argName = equalsIndex > 0 ? arg.slice(0, equalsIndex) : arg;
if (CLAUDE_SIDE_QUESTION_BARE_ARGS.has(argName)) {
continue;
}
if (CLAUDE_SIDE_QUESTION_VARIADIC_VALUE_ARGS.has(argName)) {
if (equalsIndex < 0) {
while (typeof args[i + 1] === "string" && !args[i + 1]?.startsWith("-")) {
i += 1;
}
}
continue;
}
if (CLAUDE_SIDE_QUESTION_VALUE_ARGS.has(argName)) {
if (equalsIndex < 0) {
const maybeValue = args[i + 1];
if (typeof maybeValue === "string" && !maybeValue.startsWith("-")) {
i += 1;
}
}
continue;
}
normalized.push(arg);
}
return normalized;
}
function resolveClaudeCliSideQuestionExecutionArgs(baseArgs: readonly string[]): string[] {
return [
...stripClaudeSideQuestionConflictingArgs(stripClaudeEffortArgs(baseArgs)),
CLAUDE_SAFE_MODE_ARG,
CLAUDE_TOOLS_ARG,
CLAUDE_NO_TOOLS_VALUE,
CLAUDE_DISALLOWED_TOOLS_ARG,
CLAUDE_DENY_MCP_TOOLS_VALUE,
CLAUDE_STRICT_MCP_CONFIG_ARG,
CLAUDE_NO_SESSION_PERSISTENCE_ARG,
CLAUDE_MAX_TURNS_ARG,
"1",
CLAUDE_PERMISSION_MODE_ARG,
CLAUDE_DEFAULT_PERMISSION_MODE,
];
}
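A condensed, self-contained sketch of the stripping rules in the removed helpers above may make the token-consumption behavior easier to follow. The flag names are copied from this diff. The helper names `isValue`, `stripConflictingArgs`, and `sideQuestionArgs` are hypothetical, and the sketch omits the `--effort` stripping that the original performs first.

```typescript
// Flags that swallow every following token until the next one starting with "-".
const VARIADIC = new Set([
  "--allowedTools", "--allowed-tools", "--disallowedTools",
  "--disallowed-tools", "--tools", "--mcp-config",
]);
// Flags that swallow at most one following value.
const SINGLE = new Set([
  "--permission-mode", "--session-id", "--resume",
  "--resume-session-at", "-r", "--max-turns",
]);
// Flags that take no value at all.
const BARE = new Set([
  "--continue", "-c", "--fork-session", "--bare", "--safe-mode",
  "--strict-mcp-config", "--no-session-persistence",
]);

const isValue = (token: string | undefined): boolean =>
  typeof token === "string" && !token.startsWith("-");

function stripConflictingArgs(args: readonly string[]): string[] {
  const kept: string[] = [];
  for (let i = 0; i < args.length; i += 1) {
    const arg = args[i] ?? "";
    const eq = arg.indexOf("=");
    // `--flag=value` carries its value inline, so nothing after it is consumed.
    const name = eq > 0 ? arg.slice(0, eq) : arg;
    if (BARE.has(name)) continue;
    if (VARIADIC.has(name)) {
      if (eq < 0) while (isValue(args[i + 1])) i += 1;
      continue;
    }
    if (SINGLE.has(name)) {
      if (eq < 0 && isValue(args[i + 1])) i += 1;
      continue;
    }
    kept.push(arg);
  }
  return kept;
}

// Append the fixed no-tool, one-turn flags after the surviving base args.
function sideQuestionArgs(baseArgs: readonly string[]): string[] {
  return [
    ...stripConflictingArgs(baseArgs),
    "--safe-mode",
    "--tools", "",
    "--disallowedTools", "mcp__*",
    "--strict-mcp-config",
    "--no-session-persistence",
    "--max-turns", "1",
    "--permission-mode", "default",
  ];
}
```

Appending the forced flags after stripping means the result never contains a caller-supplied tool allowlist, session resume, or permission override alongside the side-question values.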
/** Resolve final Claude CLI execution args for one backend invocation. */
export function resolveClaudeCliExecutionArgs(
context: CliBackendResolveExecutionArgsContext,
): string[] {
if (context.executionMode === "side-question") {
return resolveClaudeCliSideQuestionExecutionArgs(context.baseArgs);
}
const effort = mapClaudeCliThinkingLevelToEffort(context.thinkingLevel);
if (!effort) {
return [...context.baseArgs];


@@ -25,7 +25,7 @@ Use this skill when you need the `browser` tool for anything beyond a single pag
- Use the same `targetId` for follow-up actions so refs stay on the same tab.
- For durable Playwright refs, request `refs="aria"` when supported. If you receive `axN` refs from `snapshotFormat="aria"`, use them only after that same snapshot call; stale or unbound `axN` refs fail fast and need a fresh snapshot.
- Use `urls=true` when link text is ambiguous or a direct navigation target would avoid brittle clicks.
- Use `labels=true` on snapshot or screenshot when visual position matters. On Playwright-backed profiles, the response includes an `annotations` array (`{ref, number, role, name?, box}`) with each ref's bounding box in the captured image's coordinate space, so you can reason about position without re-snapshotting; screenshot labels can also combine with `fullPage=true` (CLI: `--full-page`) to label the whole document, or `ref` / `element` to clip to one element. `profile="user"` and other existing-session (chrome-mcp) profiles render an overlay into page screenshots but do not attach `annotations` or use the Playwright full-page/ref/element projection helper, so read positions from the labeled image itself on those profiles. The raw-CDP fallback (no Playwright) does not support labeled screenshots at all and returns a 501, so only request `labels` when Playwright is available.
- Use `labels=true` on snapshot or screenshot when visual position matters.
4. Act narrowly:
- Prefer `action="act"` with a ref from the latest snapshot.
- After navigation, modal changes, or form submission, snapshot again before the next action.


@@ -486,7 +486,6 @@ export async function executeSnapshotAction(params: {
labels: snapshot.labels,
labelsCount: snapshot.labelsCount,
labelsSkipped: snapshot.labelsSkipped,
annotations: snapshot.annotations,
imagePath: snapshot.imagePath,
imageType: snapshot.imageType,
refsFallback,


@@ -1,8 +1,6 @@
/**
* Shared result types for browser client action helpers.
*/
import type { AnnotationItem } from "./screenshot-annotate.js";
/** Generic success result for action endpoints. */
export type BrowserActionOk = { ok: true };
@@ -22,10 +20,4 @@ export type BrowserActionPathResult = {
labels?: boolean;
labelsCount?: number;
labelsSkipped?: number;
/**
* Per-ref bounding boxes when labels=true. Coordinates are in the
* captured image's space (viewport / fullpage / element-relative).
* Omitted when empty.
*/
annotations?: AnnotationItem[];
};


@@ -18,7 +18,6 @@ import type {
} from "./client.types.js";
import { DEFAULT_BROWSER_SNAPSHOT_TIMEOUT_MS } from "./constants.js";
import type { BrowserDoctorReport } from "./doctor.js";
import type { AnnotationItem } from "./screenshot-annotate.js";
export type { BrowserStatus, BrowserTab, BrowserTransport } from "./client.types.js";
export type { BrowserDoctorCheck, BrowserDoctorReport } from "./doctor.js";
@@ -125,11 +124,6 @@ export type SnapshotResult =
labels?: boolean;
labelsCount?: number;
labelsSkipped?: number;
/**
* Per-ref bounding boxes when labels=true. Coordinates are in the
* captured image's space. Omitted when empty.
*/
annotations?: AnnotationItem[];
imagePath?: string;
imageType?: "png" | "jpeg";
blockedByDialog?: boolean;


@@ -1,205 +0,0 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import {
installPwToolsCoreTestHooks,
setPwToolsCoreCurrentPage,
setPwToolsCoreCurrentRefLocator,
} from "./pw-tools-core.test-harness.js";
installPwToolsCoreTestHooks();
const mod = await import("./pw-tools-core.js");
type EvaluateArg = unknown;
function evaluateMockReturning(view: { x: number; y: number; width?: number; height?: number }) {
// Caller reads { x, y, width, height } in one evaluate; default to a normal
// desktop viewport so refs near the top stay in-viewport unless a test puts
// them out of range explicitly.
const result = { width: 1280, height: 720, ...view };
return vi.fn(async (arg: EvaluateArg) => {
if (typeof arg === "function") {
return result;
}
return true;
});
}
describe("screenshotWithLabelsViaPlaywright (viewport)", () => {
beforeEach(() => {
vi.clearAllMocks();
});
it("calls page.screenshot without fullPage and returns annotations", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 100 });
const screenshot = vi.fn(async () => Buffer.from("PNG"));
setPwToolsCoreCurrentPage({ evaluate, screenshot, url: () => "https://example.com" });
setPwToolsCoreCurrentRefLocator({
boundingBox: async () => ({ x: 10, y: 200, width: 50, height: 20 }),
});
const result = await mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button", name: "Submit" } },
type: "png",
});
expect(screenshot).toHaveBeenCalledWith(expect.objectContaining({ type: "png" }));
expect(screenshot).not.toHaveBeenCalledWith(expect.objectContaining({ fullPage: true }));
expect(result.annotations).toHaveLength(1);
expect(result.annotations[0]).toMatchObject({
ref: "e1",
number: 1,
role: "button",
name: "Submit",
});
// viewport-mode box = doc(box.x + scroll.x, box.y + scroll.y) - scroll = bbox
expect(result.annotations[0]?.box).toEqual({ x: 10, y: 200, width: 50, height: 20 });
expect(result.skipped).toBe(0);
});
it("runs the clear script even when screenshot throws", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 0 });
const screenshot = vi.fn(async () => {
throw new Error("boom");
});
setPwToolsCoreCurrentPage({ evaluate, screenshot });
setPwToolsCoreCurrentRefLocator({
boundingBox: async () => ({ x: 0, y: 0, width: 1, height: 1 }),
});
await expect(
mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" } },
}),
).rejects.toThrow(/boom/);
// The clear script must have run (string evaluate calls include the overlay attr)
const clearCalls = evaluate.mock.calls.filter(
([arg]) => typeof arg === "string" && arg.includes("data-openclaw-labels"),
);
// inject + clear = at least 2 string evaluations
expect(clearCalls.length).toBeGreaterThanOrEqual(2);
});
it("counts off-viewport refs as skipped but still surfaces them in annotations", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 0, width: 1280, height: 720 });
const screenshot = vi.fn(async () => Buffer.from("PNG"));
setPwToolsCoreCurrentPage({ evaluate, screenshot });
// bbox is far below the viewport (y: 5000): not drawn, but still reported
// so callers keep the position and a non-zero skipped count.
setPwToolsCoreCurrentRefLocator({
boundingBox: async () => ({ x: 0, y: 5000, width: 50, height: 20 }),
});
const result = await mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" } },
});
expect(result.skipped).toBe(1);
expect(result.labels).toBe(0);
expect(result.annotations).toHaveLength(1);
expect(result.annotations[0]?.ref).toBe("e1");
});
});
describe("screenshotWithLabelsViaPlaywright (fullpage)", () => {
beforeEach(() => vi.clearAllMocks());
it("forwards fullPage:true to page.screenshot and uses doc-space annotations", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 1000 });
const screenshot = vi.fn(async () => Buffer.from("FULL"));
setPwToolsCoreCurrentPage({ evaluate, screenshot });
setPwToolsCoreCurrentRefLocator({
boundingBox: async () => ({ x: 10, y: 200, width: 50, height: 20 }),
});
const result = await mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" } },
fullPage: true,
});
expect(screenshot).toHaveBeenCalledWith(expect.objectContaining({ fullPage: true }));
// doc-space: scroll y=1000 + bbox y=200 = 1200
expect(result.annotations[0]?.box.y).toBe(1200);
expect(result.annotations[0]?.box.x).toBe(10);
});
});
describe("screenshotWithLabelsViaPlaywright (element/ref)", () => {
beforeEach(() => vi.clearAllMocks());
it("uses refLocator.screenshot for ref mode and projects relative to element", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 0 });
// First call resolves the element rect (container), second resolves e1 annotation bbox.
const boundingBox = vi
.fn<() => Promise<{ x: number; y: number; width: number; height: number } | null>>()
.mockResolvedValueOnce({ x: 50, y: 100, width: 200, height: 300 })
.mockResolvedValueOnce({ x: 60, y: 110, width: 30, height: 20 });
const elementScreenshot = vi.fn(async () => Buffer.from("ELEM"));
setPwToolsCoreCurrentPage({ evaluate, screenshot: vi.fn() });
setPwToolsCoreCurrentRefLocator({ boundingBox, screenshot: elementScreenshot });
const result = await mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" } },
ref: "container",
});
expect(elementScreenshot).toHaveBeenCalledTimes(1);
// Element-relative: doc(60,110) - elementRect(50,100) = (10,10)
expect(result.annotations).toHaveLength(1);
expect(result.annotations[0]?.box).toEqual({ x: 10, y: 10, width: 30, height: 20 });
});
it("throws when ref/element cannot be resolved", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 0 });
setPwToolsCoreCurrentPage({ evaluate, screenshot: vi.fn() });
setPwToolsCoreCurrentRefLocator({
boundingBox: async () => null,
screenshot: vi.fn(),
});
await expect(
mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" } },
ref: "missing",
}),
).rejects.toThrow(/element not found/i);
});
});
describe("screenshotWithLabelsViaPlaywright (skipped accounting)", () => {
beforeEach(() => vi.clearAllMocks());
it("counts refs whose boundingBox is null toward skipped", async () => {
const evaluate = evaluateMockReturning({ x: 0, y: 0 });
const screenshot = vi.fn(async () => Buffer.from("PNG"));
setPwToolsCoreCurrentPage({ evaluate, screenshot });
// Two refs: first returns a box, second returns null (e.g. element detached).
const boundingBox = vi
.fn<() => Promise<{ x: number; y: number; width: number; height: number } | null>>()
.mockResolvedValueOnce({ x: 10, y: 20, width: 30, height: 40 })
.mockResolvedValueOnce(null);
setPwToolsCoreCurrentRefLocator({ boundingBox });
const result = await mod.screenshotWithLabelsViaPlaywright({
cdpUrl: "http://127.0.0.1:18792",
targetId: "T1",
refs: { e1: { role: "button" }, e2: { role: "link" } },
});
expect(result.annotations).toHaveLength(1);
expect(result.annotations[0]?.ref).toBe("e1");
expect(result.skipped).toBe(1);
});
});
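The comments in the removed tests above describe three coordinate spaces for annotation boxes. A minimal sketch of that arithmetic follows. `toDocumentSpace` and `projectBox` are hypothetical helper names, not the removed `planAnnotations` implementation, which this diff does not show in full.

```typescript
type Box = { x: number; y: number; width: number; height: number };
type Point = { x: number; y: number };
type CoordinateSpace = "viewport" | "fullpage" | "element";

// Playwright bounding boxes are viewport-relative; adding the scroll offset
// gives document-space coordinates.
function toDocumentSpace(bbox: Box, scroll: Point): Box {
  return { ...bbox, x: bbox.x + scroll.x, y: bbox.y + scroll.y };
}

// Project a document-space box into the space of the captured image.
function projectBox(
  doc: Box,
  space: CoordinateSpace,
  scroll: Point,
  elementRect?: Box, // document-space rect of the clipped element
): Box {
  const origin =
    space === "viewport" ? scroll : space === "element" ? elementRect : { x: 0, y: 0 };
  if (!origin) {
    throw new Error("element space requires elementRect");
  }
  return { ...doc, x: doc.x - origin.x, y: doc.y - origin.y };
}
```

With scroll `(0, 1000)` and a bounding box at `(10, 200)`, the full-page projection lands at `y = 1200`, and the viewport projection returns the original bounding box. This matches the arithmetic in the test comments.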


@@ -41,15 +41,6 @@ import {
toAIFriendlyError,
} from "./pw-tools-core.shared.js";
import { closePageViaPlaywright, resizeViewportViaPlaywright } from "./pw-tools-core.snapshot.js";
import {
ANNOTATION_MAX_LABELS_DEFAULT,
type AnnotationItem,
buildOverlayClearScript,
buildOverlayInjectionScript,
type CoordinateSpace,
planAnnotations,
type RawAnnotationInput,
} from "./screenshot-annotate.js";
type TargetOpts = {
cdpUrl: string;
@@ -1296,15 +1287,7 @@ export async function screenshotWithLabelsViaPlaywright(opts: {
maxLabels?: number;
type?: "png" | "jpeg";
timeoutMs?: number;
fullPage?: boolean;
ref?: string;
element?: string;
}): Promise<{
buffer: Buffer;
labels: number;
skipped: number;
annotations: AnnotationItem[];
}> {
}): Promise<{ buffer: Buffer; labels: number; skipped: number }> {
const page = await getPageForTargetId(opts);
ensurePageState(page);
restoreRoleRefsForTarget({ cdpUrl: opts.cdpUrl, targetId: opts.targetId, page });
@@ -1312,151 +1295,119 @@ export async function screenshotWithLabelsViaPlaywright(opts: {
const maxLabels =
typeof opts.maxLabels === "number" && Number.isFinite(opts.maxLabels)
? Math.max(1, Math.floor(opts.maxLabels))
: ANNOTATION_MAX_LABELS_DEFAULT;
: 150;
const refKey = normalizeOptionalString(opts.ref) ?? undefined;
const elementSelector = normalizeOptionalString(opts.element) ?? undefined;
const space: CoordinateSpace = opts.fullPage
? "fullpage"
: refKey || elementSelector
? "element"
: "viewport";
// Read scroll + viewport size. Scroll converts Playwright's viewport-space
// boundingBoxes into document-space inputs; the viewport size lets the helper
// restore the shipped `labelsSkipped` semantics by counting off-viewport refs
// as skipped (in viewport capture mode).
const view = await page.evaluate(() => ({
x: window.scrollX || 0,
y: window.scrollY || 0,
const viewport = await page.evaluate(() => ({
scrollX: window.scrollX || 0,
scrollY: window.scrollY || 0,
width: window.innerWidth || 0,
height: window.innerHeight || 0,
}));
const scroll = { x: view.x, y: view.y };
let elementRect: { x: number; y: number; width: number; height: number } | undefined;
if (space === "element") {
const box = await resolveElementBoundingBoxForLabels(page, refKey, elementSelector);
if (!box) {
throw new Error(
`screenshotWithLabelsViaPlaywright: element not found for ${
refKey ? `ref="${refKey}"` : `selector="${elementSelector ?? ""}"`
}`,
);
}
// Convert viewport-space bbox to document space.
elementRect = {
x: box.x + scroll.x,
y: box.y + scroll.y,
width: box.width,
height: box.height,
};
}
const refs = Object.keys(opts.refs ?? {});
const boxes: Array<{ ref: string; x: number; y: number; w: number; h: number }> = [];
let skipped = 0;
const refKeys = Object.keys(opts.refs ?? {});
const inputs: RawAnnotationInput[] = [];
let bboxFailures = 0;
for (const ref of refKeys) {
const box = await refLocator(page, ref)
.boundingBox()
.catch(() => null);
if (!box) {
bboxFailures += 1;
for (const ref of refs) {
if (boxes.length >= maxLabels) {
skipped += 1;
continue;
}
inputs.push({
ref,
role: opts.refs[ref].role,
name: opts.refs[ref].name,
doc: {
x: box.x + scroll.x,
y: box.y + scroll.y,
width: box.width,
height: box.height,
},
});
try {
const box = await refLocator(page, ref).boundingBox();
if (!box) {
skipped += 1;
continue;
}
const x0 = box.x;
const y0 = box.y;
const x1 = box.x + box.width;
const y1 = box.y + box.height;
const vx0 = viewport.scrollX;
const vy0 = viewport.scrollY;
const vx1 = viewport.scrollX + viewport.width;
const vy1 = viewport.scrollY + viewport.height;
if (x1 < vx0 || x0 > vx1 || y1 < vy0 || y0 > vy1) {
skipped += 1;
continue;
}
boxes.push({
ref,
x: x0 - viewport.scrollX,
y: y0 - viewport.scrollY,
w: Math.max(1, box.width),
h: Math.max(1, box.height),
});
} catch {
skipped += 1;
}
}
const plan = planAnnotations({
inputs,
space,
scroll,
viewport: { width: view.width, height: view.height },
elementRect,
maxLabels,
});
try {
if (plan.overlayItems.length > 0) {
const captureY = space === "element" ? elementRect?.y : space === "viewport" ? scroll.y : 0;
await page.evaluate(buildOverlayInjectionScript({ items: plan.overlayItems, captureY }));
if (boxes.length > 0) {
await page.evaluate((labels) => {
const existing = document.querySelectorAll("[data-openclaw-labels]");
existing.forEach((el) => el.remove());
const root = document.createElement("div");
root.setAttribute("data-openclaw-labels", "1");
root.style.position = "fixed";
root.style.left = "0";
root.style.top = "0";
root.style.zIndex = "2147483647";
root.style.pointerEvents = "none";
root.style.fontFamily =
'"SF Mono","SFMono-Regular",Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace';
const clamp = (value: number, min: number, max: number) =>
Math.min(max, Math.max(min, value));
for (const label of labels) {
const box = document.createElement("div");
box.setAttribute("data-openclaw-labels", "1");
box.style.position = "absolute";
box.style.left = `${label.x}px`;
box.style.top = `${label.y}px`;
box.style.width = `${label.w}px`;
box.style.height = `${label.h}px`;
box.style.border = "2px solid #ffb020";
box.style.boxSizing = "border-box";
const tag = document.createElement("div");
tag.setAttribute("data-openclaw-labels", "1");
tag.textContent = label.ref;
tag.style.position = "absolute";
tag.style.left = `${label.x}px`;
tag.style.top = `${clamp(label.y - 18, 0, 20000)}px`;
tag.style.background = "#ffb020";
tag.style.color = "#1a1a1a";
tag.style.fontSize = "12px";
tag.style.lineHeight = "14px";
tag.style.padding = "1px 4px";
tag.style.borderRadius = "3px";
tag.style.boxShadow = "0 1px 2px rgba(0,0,0,0.35)";
tag.style.whiteSpace = "nowrap";
root.appendChild(box);
root.appendChild(tag);
}
document.documentElement.appendChild(root);
}, boxes);
}
const buffer =
space === "element"
? await captureElementScreenshotForLabels(
page,
refKey,
elementSelector,
type,
opts.timeoutMs,
)
: await page.screenshot({
type,
fullPage: Boolean(opts.fullPage),
timeout: opts.timeoutMs,
});
return {
// `labels` reports overlay boxes actually drawn on the captured image
// (in-viewport, within budget); off-viewport refs are surfaced via
// `annotations` but not drawn, and are reflected in `skipped`.
buffer,
labels: plan.overlayItems.length,
skipped: plan.skipped + bboxFailures,
annotations: plan.annotations,
};
const buffer = await page.screenshot({ type, timeout: opts.timeoutMs });
return { buffer, labels: boxes.length, skipped };
} finally {
await page.evaluate(buildOverlayClearScript()).catch(() => {});
await page
.evaluate(() => {
const existing = document.querySelectorAll("[data-openclaw-labels]");
existing.forEach((el) => el.remove());
})
.catch(() => {});
}
}
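The skip accounting in the restored inline loop above can be isolated into a small sketch. `isOutside` and `planLabels` are hypothetical names. The comparisons mirror the ones in the diff, including that boxes which only touch the view edge count as visible, and the caller is assumed to build `view` from `scrollX`, `scrollY`, `innerWidth`, and `innerHeight`.

```typescript
type Rect = { x: number; y: number; width: number; height: number };

// True when the box lies entirely outside the view rectangle.
function isOutside(box: Rect, view: Rect): boolean {
  return (
    box.x + box.width < view.x ||
    box.x > view.x + view.width ||
    box.y + box.height < view.y ||
    box.y > view.y + view.height
  );
}

// Decide which boxes get drawn; everything else is counted as skipped.
function planLabels(
  boxes: Array<Rect | null>,
  view: Rect,
  maxLabels = 150,
): { drawn: Rect[]; skipped: number } {
  const drawn: Rect[] = [];
  let skipped = 0;
  for (const box of boxes) {
    // Over budget, unresolved (null), or off-view boxes are all skipped.
    if (drawn.length >= maxLabels || !box || isOutside(box, view)) {
      skipped += 1;
      continue;
    }
    drawn.push({
      x: box.x - view.x,
      y: box.y - view.y,
      width: Math.max(1, box.width),
      height: Math.max(1, box.height),
    });
  }
  return { drawn, skipped };
}
```

The returned `drawn.length` and `skipped` correspond to the `labels` and `skipped` fields of the restored return type.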
async function resolveElementBoundingBoxForLabels(
page: Page,
refKey: string | undefined,
cssSelector: string | undefined,
): Promise<{ x: number; y: number; width: number; height: number } | null> {
if (refKey) {
try {
return await refLocator(page, refKey).boundingBox();
} catch {
return null;
}
}
if (cssSelector) {
try {
return await page.locator(cssSelector).first().boundingBox();
} catch {
return null;
}
}
return null;
}
async function captureElementScreenshotForLabels(
page: Page,
refKey: string | undefined,
cssSelector: string | undefined,
type: "png" | "jpeg",
timeoutMs: number | undefined,
): Promise<Buffer> {
if (refKey) {
return await refLocator(page, refKey).screenshot({ type, timeout: timeoutMs });
}
if (cssSelector) {
return await page.locator(cssSelector).first().screenshot({ type, timeout: timeoutMs });
}
throw new Error("captureElementScreenshotForLabels: requires refKey or cssSelector");
}
/** Sets file inputs for a role ref or selector with strict existing-path checks. */
export async function setInputFilesViaPlaywright(opts: {
cdpUrl: string;


@@ -5,7 +5,6 @@
* navigation policy checks, media storage, and screenshot normalization.
*/
import path from "node:path";
import { getImageMetadata } from "../../media/media-services.js";
import { ensureMediaDir, saveMediaBuffer } from "../../media/store.js";
import { captureScreenshot, snapshotAria, snapshotRoleViaCdp } from "../cdp.js";
import {
@@ -25,8 +24,6 @@ import {
assertBrowserNavigationResultAllowed,
} from "../navigation-guard.js";
import { getBrowserProfileCapabilities } from "../profile-capabilities.js";
import type { AnnotationItem } from "../screenshot-annotate.js";
import { scaleAnnotations } from "../screenshot-annotate.js";
import {
DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE,
@@ -195,24 +192,11 @@ async function saveNormalizedScreenshotResponse(params: {
labels?: boolean;
labelsCount?: number;
labelsSkipped?: number;
annotations?: AnnotationItem[];
}) {
// Measure original dimensions BEFORE normalization so we can rescale
// annotation coordinates if the response pipeline shrinks the image
// (longest-side or byte-budget cap). Annotation boxes are in the captured
// image's pixel space, so they would otherwise drift from the saved media.
const originalMeta = params.annotations?.length
? ((await getImageMetadata(params.buffer)) ?? undefined)
: undefined;
const normalized = await normalizeBrowserScreenshot(params.buffer, {
maxSide: DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE,
maxBytes: DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
});
const annotations = await rescaleAnnotationsForNormalization({
annotations: params.annotations,
originalMeta,
normalizedBuffer: normalized.buffer,
});
await saveBrowserMediaResponse({
res: params.res,
buffer: normalized.buffer,
@@ -223,39 +207,9 @@ async function saveNormalizedScreenshotResponse(params: {
labels: params.labels,
labelsCount: params.labelsCount,
labelsSkipped: params.labelsSkipped,
annotations,
});
}
/**
* Keep annotation coordinates aligned with the saved media after
* normalizeBrowserScreenshot. Returns the original annotations unchanged
* when normalization did not change the image dimensions, or when image
* metadata is unavailable (best-effort: better to ship pre-resize coords
* than to drop the field entirely).
*/
async function rescaleAnnotationsForNormalization(params: {
annotations?: AnnotationItem[];
originalMeta?: { width?: number; height?: number };
normalizedBuffer: Buffer;
}): Promise<AnnotationItem[] | undefined> {
if (!params.annotations || params.annotations.length === 0) {
return params.annotations;
}
const orig = params.originalMeta;
if (!orig?.width || !orig?.height) {
return params.annotations;
}
const next = await getImageMetadata(params.normalizedBuffer);
if (!next?.width || !next?.height) {
return params.annotations;
}
if (next.width === orig.width && next.height === orig.height) {
return params.annotations;
}
return scaleAnnotations(params.annotations, next.width / orig.width, next.height / orig.height);
}
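The rescaling step in the removed helper above amounts to multiplying each box by the width and height ratios of the normalized image. A minimal sketch follows, assuming no rounding. The real `scaleAnnotations` body is not shown in this diff, so `scaleBoxes` and `rescaleForResize` are hypothetical stand-ins.

```typescript
type Box = { x: number; y: number; width: number; height: number };
type Annotation = { ref: string; box: Box };
type Size = { width: number; height: number };

// Scale every box by independent horizontal and vertical factors.
function scaleBoxes(items: Annotation[], sx: number, sy: number): Annotation[] {
  return items.map((item) => ({
    ...item,
    box: {
      x: item.box.x * sx,
      y: item.box.y * sy,
      width: item.box.width * sx,
      height: item.box.height * sy,
    },
  }));
}

// Return the input unchanged when the image size did not change.
function rescaleForResize(items: Annotation[], orig: Size, next: Size): Annotation[] {
  if (orig.width === next.width && orig.height === next.height) {
    return items;
  }
  return scaleBoxes(items, next.width / orig.width, next.height / orig.height);
}
```

For example, shrinking a 2000x1000 capture to 1000x500 halves every coordinate, so a box at `(100, 200)` with size `50x20` ends up at `(50, 100)` with size `25x10`.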
async function saveBrowserMediaResponse(params: {
res: BrowserResponse;
buffer: Buffer;
@@ -266,7 +220,6 @@ async function saveBrowserMediaResponse(params: {
labels?: boolean;
labelsCount?: number;
labelsSkipped?: number;
annotations?: AnnotationItem[];
}) {
await ensureMediaDir();
const saved = await saveMediaBuffer(
@@ -283,9 +236,6 @@ async function saveBrowserMediaResponse(params: {
...(params.labels ? { labels: true } : {}),
...(typeof params.labelsCount === "number" ? { labelsCount: params.labelsCount } : {}),
...(typeof params.labelsSkipped === "number" ? { labelsSkipped: params.labelsSkipped } : {}),
...(params.annotations && params.annotations.length > 0
? { annotations: params.annotations }
: {}),
});
}
@@ -528,9 +478,6 @@ export function registerBrowserAgentSnapshotRoutes(
refs: snap.refs,
type,
timeoutMs,
fullPage,
ref,
element,
});
await saveNormalizedScreenshotResponse({
res,
@@ -541,7 +488,6 @@ export function registerBrowserAgentSnapshotRoutes(
labels: true,
labelsCount: labeled.labels,
labelsSkipped: labeled.skipped,
annotations: labeled.annotations,
});
return;
}
@@ -797,18 +743,10 @@ export function registerBrowserAgentSnapshotRoutes(
type: "png",
timeoutMs: plan.timeoutMs,
});
const originalMeta = labeled.annotations.length
? ((await getImageMetadata(labeled.buffer)) ?? undefined)
: undefined;
const normalized = await normalizeBrowserScreenshot(labeled.buffer, {
maxSide: DEFAULT_BROWSER_SCREENSHOT_MAX_SIDE,
maxBytes: DEFAULT_BROWSER_SCREENSHOT_MAX_BYTES,
});
const scaledAnnotations = await rescaleAnnotationsForNormalization({
annotations: labeled.annotations,
originalMeta,
normalizedBuffer: normalized.buffer,
});
await ensureMediaDir();
const saved = await saveMediaBuffer(
normalized.buffer,
@@ -826,9 +764,6 @@ export function registerBrowserAgentSnapshotRoutes(
labels: true,
labelsCount: labeled.labels,
labelsSkipped: labeled.skipped,
...(scaledAnnotations && scaledAnnotations.length > 0
? { annotations: scaledAnnotations }
: {}),
imagePath: path.resolve(saved.path),
imageType,
...snap,


@@ -1,345 +0,0 @@
import { describe, expect, it } from "vitest";
import {
ANNOTATION_OVERLAY_ATTR,
type AnnotationItem,
buildOverlayClearScript,
buildOverlayInjectionScript,
planAnnotations,
type RawAnnotationInput,
refToNumber,
scaleAnnotations,
} from "./screenshot-annotate.js";
const sampleInputs: RawAnnotationInput[] = [
{
ref: "e1",
role: "button",
name: "Submit",
doc: { x: 100, y: 200, width: 50, height: 20 },
},
{
ref: "e2",
role: "link",
doc: { x: 300, y: 1500, width: 80, height: 18 },
},
];
describe("refToNumber", () => {
it("extracts number from `e<N>` form", () => {
expect(refToNumber("e12")).toBe(12);
expect(refToNumber("e0")).toBe(0);
});
it("extracts number from `ax<N>` form", () => {
expect(refToNumber("ax12")).toBe(12);
});
it("extracts number from bare numeric form", () => {
expect(refToNumber("12")).toBe(12);
});
it("returns 0 for non-numeric refs", () => {
expect(refToNumber("foo")).toBe(0);
expect(refToNumber("")).toBe(0);
});
});
describe("planAnnotations - viewport mode", () => {
it("subtracts scroll from doc coords", () => {
const plan = planAnnotations({
inputs: sampleInputs,
space: "viewport",
scroll: { x: 0, y: 1000 },
});
expect(plan.annotations).toHaveLength(2);
expect(plan.annotations[0]).toEqual({
ref: "e1",
number: 1,
role: "button",
name: "Submit",
box: { x: 100, y: -800, width: 50, height: 20 },
});
expect(plan.annotations[1]).toEqual({
ref: "e2",
number: 2,
role: "link",
box: { x: 300, y: 500, width: 80, height: 18 },
});
expect(plan.skipped).toBe(0);
});
it("keeps overlay items in document space regardless of mode", () => {
const plan = planAnnotations({
inputs: sampleInputs,
space: "viewport",
scroll: { x: 0, y: 1000 },
});
expect(plan.overlayItems).toEqual([
{ ref: "e1", x: 100, y: 200, w: 50, h: 20 },
{ ref: "e2", x: 300, y: 1500, w: 80, h: 18 },
]);
});
it("omits empty name field", () => {
const plan = planAnnotations({
inputs: [{ ref: "e1", role: "button", name: "", doc: { x: 0, y: 0, width: 1, height: 1 } }],
space: "viewport",
scroll: { x: 0, y: 0 },
});
expect(plan.annotations[0]).not.toHaveProperty("name");
});
it("throws when scroll missing in viewport mode", () => {
expect(() => planAnnotations({ inputs: sampleInputs, space: "viewport" })).toThrow(/scroll/);
});
});
describe("planAnnotations - viewport off-screen accounting", () => {
it("counts off-viewport refs as skipped but keeps them in annotations when viewport size is given", () => {
const plan = planAnnotations({
inputs: [
{ ref: "e1", role: "button", doc: { x: 10, y: 50, width: 40, height: 20 } }, // in viewport
{ ref: "e2", role: "link", doc: { x: 10, y: 5000, width: 40, height: 20 } }, // below viewport
],
space: "viewport",
scroll: { x: 0, y: 0 },
viewport: { width: 1280, height: 720 },
});
// Only the in-viewport ref is drawn.
expect(plan.overlayItems.map((o) => o.ref)).toEqual(["e1"]);
// Both refs are surfaced for callers (off-viewport box can be out of image).
expect(plan.annotations.map((a) => a.ref)).toEqual(["e1", "e2"]);
// The off-viewport ref raises skipped, preserving the shipped contract.
expect(plan.skipped).toBe(1);
});
it("does not count off-viewport refs when viewport size is omitted", () => {
const plan = planAnnotations({
inputs: [{ ref: "e2", role: "link", doc: { x: 10, y: 5000, width: 40, height: 20 } }],
space: "viewport",
scroll: { x: 0, y: 0 },
});
expect(plan.skipped).toBe(0);
expect(plan.overlayItems).toHaveLength(1);
expect(plan.annotations).toHaveLength(1);
});
});
describe("planAnnotations - fullpage mode", () => {
it("returns box equal to doc (document coordinates)", () => {
const plan = planAnnotations({ inputs: sampleInputs, space: "fullpage" });
expect(plan.annotations[0].box).toEqual({ x: 100, y: 200, width: 50, height: 20 });
expect(plan.annotations[1].box).toEqual({ x: 300, y: 1500, width: 80, height: 18 });
});
it("does not require scroll", () => {
expect(() => planAnnotations({ inputs: sampleInputs, space: "fullpage" })).not.toThrow();
});
});
describe("planAnnotations - element mode", () => {
const elementRect = { x: 50, y: 100, width: 200, height: 300 };
it("projects box relative to element top-left", () => {
const plan = planAnnotations({
inputs: [{ ref: "e1", role: "button", doc: { x: 60, y: 110, width: 40, height: 20 } }],
space: "element",
elementRect,
});
expect(plan.annotations[0].box).toEqual({ x: 10, y: 10, width: 40, height: 20 });
});
it("filters out inputs that do not overlap element rect", () => {
const plan = planAnnotations({
inputs: [
{ ref: "e1", role: "button", doc: { x: 60, y: 110, width: 40, height: 20 } }, // inside
{ ref: "e2", role: "link", doc: { x: 500, y: 500, width: 40, height: 20 } }, // outside
],
space: "element",
elementRect,
});
expect(plan.annotations).toHaveLength(1);
expect(plan.annotations[0].ref).toBe("e1");
expect(plan.overlayItems).toHaveLength(1);
});
it("throws when elementRect missing", () => {
expect(() => planAnnotations({ inputs: [], space: "element" })).toThrow(/elementRect/);
});
});
describe("planAnnotations - maxLabels", () => {
it("truncates to maxLabels and reports skipped", () => {
const inputs = Array.from({ length: 5 }, (_, i) => ({
ref: `e${i + 1}`,
role: "button",
doc: { x: 0, y: i * 10, width: 5, height: 5 },
}));
const plan = planAnnotations({ inputs, space: "fullpage", maxLabels: 2 });
expect(plan.annotations).toHaveLength(2);
expect(plan.overlayItems).toHaveLength(2);
expect(plan.skipped).toBe(3);
});
it("uses ANNOTATION_MAX_LABELS_DEFAULT when not specified", () => {
const inputs = Array.from({ length: 200 }, (_, i) => ({
ref: `e${i + 1}`,
role: "button",
doc: { x: 0, y: i, width: 5, height: 5 },
}));
const plan = planAnnotations({ inputs, space: "fullpage" });
expect(plan.annotations).toHaveLength(150);
expect(plan.skipped).toBe(50);
});
});
describe("buildOverlayInjectionScript", () => {
it("returns a self-contained IIFE", () => {
const script = buildOverlayInjectionScript({
items: [{ ref: "e1", x: 100, y: 200, w: 50, h: 20 }],
});
expect(script).toMatch(/^\(\s*\(\s*\)\s*=>\s*\{/);
expect(script).toMatch(/\}\s*\)\s*\(\s*\)\s*;?\s*$/);
});
it("embeds the overlay attr", () => {
const script = buildOverlayInjectionScript({ items: [] });
expect(script).toContain(ANNOTATION_OVERLAY_ATTR);
});
it("embeds each item's ref text and coordinates", () => {
const script = buildOverlayInjectionScript({
items: [
{ ref: "e1", x: 100, y: 200, w: 50, h: 20 },
{ ref: "ax42", x: 999, y: 1500, w: 80, h: 18 },
],
});
expect(script).toMatch(/"ref":\s*"e1"/);
expect(script).toMatch(/"ref":\s*"ax42"/);
expect(script).toMatch(/"x":\s*100/);
expect(script).toMatch(/"x":\s*999/);
});
it("handles empty items without throwing", () => {
expect(() => buildOverlayInjectionScript({ items: [] })).not.toThrow();
});
it("rounds coordinates to integers", () => {
const script = buildOverlayInjectionScript({
items: [{ ref: "e1", x: 100.7, y: 200.4, w: 50.6, h: 20.1 }],
});
expect(script).toMatch(/"x":\s*101/); // 100.7 -> 101
expect(script).toMatch(/"y":\s*200/); // 200.4 -> 200
});
it("clamps zero/negative-size boxes to 1px so they remain visible", () => {
const script = buildOverlayInjectionScript({
items: [{ ref: "e1", x: 10, y: 10, w: 0, h: 0 }],
});
expect(script).toMatch(/"w":\s*1/);
expect(script).toMatch(/"h":\s*1/);
});
it("escapes hostile ref characters via JSON.stringify (no breakout)", () => {
const hostile = 'e1");alert(1);//';
const script = buildOverlayInjectionScript({
items: [{ ref: hostile, x: 0, y: 0, w: 1, h: 1 }],
});
// The hostile `"` MUST be escaped as `\"` inside the JSON literal.
expect(script).toContain('"e1\\");alert(1);//"');
// The unescaped sequence MUST NOT appear anywhere in the script, because it
// would terminate the JSON string literal early and run as a statement.
expect(script).not.toContain('e1");alert(1);');
});
it("flips label below the box when y < 14 (no headroom)", () => {
const script = buildOverlayInjectionScript({
items: [{ ref: "e1", x: 0, y: 5, w: 10, h: 10 }],
});
// The generated script embeds the label-placement expression verbatim:
// labelTop = relativeY < 14 ? (it.y + 2) : (it.y - 14)
expect(script).toContain("relativeY < 14 ? (it.y + 2) : (it.y - 14)");
});
it("uses capture-relative y when deciding whether to flip labels below boxes", () => {
const script = buildOverlayInjectionScript({
items: [{ ref: "e1", x: 0, y: 1005, w: 10, h: 10 }],
captureY: 1000,
});
expect(script).toContain("var captureY = 1000;");
expect(script).toContain("var relativeY = it.y - captureY;");
expect(script).toContain("relativeY < 14 ? (it.y + 2) : (it.y - 14)");
});
});
describe("buildOverlayClearScript", () => {
it("returns an IIFE selecting overlay attr", () => {
const script = buildOverlayClearScript();
expect(script).toContain(`[${ANNOTATION_OVERLAY_ATTR}]`);
expect(script).toMatch(/^\(\s*\(\s*\)\s*=>\s*\{/);
});
});
describe("scaleAnnotations", () => {
const sample: AnnotationItem[] = [
{
ref: "e1",
number: 1,
role: "button",
name: "Submit",
box: { x: 100, y: 200, width: 50, height: 20 },
},
];
it("returns identity (structural copy) when both factors are 1", () => {
const out = scaleAnnotations(sample, 1, 1);
expect(out[0]).toEqual(sample[0]);
expect(out[0]).not.toBe(sample[0]);
expect(out[0]?.box).not.toBe(sample[0]?.box);
});
it("scales box dimensions by independent x/y factors", () => {
const out = scaleAnnotations(sample, 0.5, 0.485);
expect(out[0]?.box).toEqual({
x: 50,
y: 97,
width: 25,
height: 10,
});
});
it("clamps width/height to a minimum of 1 to avoid disappearing labels", () => {
const tiny: AnnotationItem[] = [
{
ref: "e1",
number: 1,
role: "button",
box: { x: 0, y: 0, width: 1, height: 1 },
},
];
const out = scaleAnnotations(tiny, 0.1, 0.1);
expect(out[0]?.box.width).toBeGreaterThanOrEqual(1);
expect(out[0]?.box.height).toBeGreaterThanOrEqual(1);
});
it("returns identity (structural copy) for invalid factors", () => {
const out = scaleAnnotations(sample, Number.NaN, 0.5);
expect(out[0]?.box).toEqual(sample[0]?.box);
const out2 = scaleAnnotations(sample, 0, 0.5);
expect(out2[0]?.box).toEqual(sample[0]?.box);
const out3 = scaleAnnotations(sample, -1, 1);
expect(out3[0]?.box).toEqual(sample[0]?.box);
});
it("preserves ref/number/role/name fields verbatim", () => {
const out = scaleAnnotations(sample, 0.5, 0.5);
expect(out[0]?.ref).toBe("e1");
expect(out[0]?.number).toBe(1);
expect(out[0]?.role).toBe("button");
expect(out[0]?.name).toBe("Submit");
});
});
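
The scaling tests above pass independent x/y factors to `scaleAnnotations`. As a rough sketch of where such factors could come from, the block below derives them from the image size before and after resizing and applies them to one box. `ImageSize`, `Box`, `scaleFactors`, and `scaleBox` are hypothetical names used only for illustration; they do not appear in this diff, and the real `rescaleAnnotationsForNormalization` helper may work differently.

```typescript
// Hypothetical helper types for illustration; not part of the diff above.
interface ImageSize {
  width: number;
  height: number;
}
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Derive per-axis scale factors from the image size before and after resizing.
function scaleFactors(
  original: ImageSize,
  normalized: ImageSize,
): { scaleX: number; scaleY: number } {
  return {
    scaleX: normalized.width / original.width,
    scaleY: normalized.height / original.height,
  };
}

// Scale one box: round to integers and keep width/height at a minimum of 1px,
// mirroring the rounding and clamping the tests above expect.
function scaleBox(box: Box, scaleX: number, scaleY: number): Box {
  return {
    x: Math.round(box.x * scaleX),
    y: Math.round(box.y * scaleY),
    width: Math.max(1, Math.round(box.width * scaleX)),
    height: Math.max(1, Math.round(box.height * scaleY)),
  };
}

// Assumed sizes: a 1000x2000 capture normalized to 500x970 (factors 0.5 and 0.485).
const factors = scaleFactors({ width: 1000, height: 2000 }, { width: 500, height: 970 });
const scaledBox = scaleBox(
  { x: 100, y: 200, width: 50, height: 20 },
  factors.scaleX,
  factors.scaleY,
);
console.log(scaledBox);
```

Keeping the factors independent per axis matters when normalization does not preserve the aspect ratio exactly, which is the case the `0.5` / `0.485` test exercises.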

View File

@@ -1,282 +0,0 @@
// extensions/browser/src/browser/screenshot-annotate.ts
//
// Pure helper module for screenshot label annotations.
// Has no Playwright / CDP / page dependency: takes document-space inputs,
// returns coordinate-projected annotations + IIFE strings the caller can
// hand to page.evaluate / Runtime.evaluate.
//
// Used by:
// - pw-tools-core.interactions.ts (Playwright path, M1.2-a)
// - planned: raw-CDP path in M1.2-b
//
// chrome-mcp path keeps its own inline overlay (renderChromeMcpLabels) for now.
export const ANNOTATION_OVERLAY_ATTR = "data-openclaw-labels";
export const ANNOTATION_OVERLAY_ROOT_ID = "__openclaw-annotations__";
export const ANNOTATION_MAX_LABELS_DEFAULT = 150;
export type CoordinateSpace = "viewport" | "fullpage" | "element";
export interface RawAnnotationInput {
ref: string;
role: string;
name?: string;
/** Bounding box in document coordinates (viewport top-left + scroll). */
doc: { x: number; y: number; width: number; height: number };
}
export interface AnnotationBox {
x: number;
y: number;
width: number;
height: number;
}
export interface AnnotationItem {
ref: string;
number: number;
role: string;
name?: string;
box: AnnotationBox;
}
export interface OverlayItem {
ref: string;
x: number;
y: number;
w: number;
h: number;
}
export interface AnnotationPlan {
/** Always document-space items, fed to buildOverlayInjectionScript. */
overlayItems: OverlayItem[];
/** Items projected into the capture mode's image-space coordinates. */
annotations: AnnotationItem[];
/** Refs not drawn: dropped by maxLabels truncation, or outside the viewport when a viewport size is given. */
skipped: number;
}
export interface PlanAnnotationsParams {
inputs: RawAnnotationInput[];
space: CoordinateSpace;
/** Required when space === "viewport". */
scroll?: { x: number; y: number };
/**
* Viewport size (CSS px). Only meaningful when space === "viewport". When
* provided, refs whose document box falls outside the current viewport rect
* (`scroll` + this size) are counted as skipped instead of drawn, preserving
* the shipped `labelsSkipped` contract. Omit it to disable that accounting.
*/
viewport?: { width: number; height: number };
/** Required when space === "element". */
elementRect?: { x: number; y: number; width: number; height: number };
maxLabels?: number;
}
export function refToNumber(ref: string): number {
const match = ref.match(/(\d+)/);
if (!match) {
return 0;
}
const n = Number(match[1]);
return Number.isFinite(n) ? n : 0;
}
export function planAnnotations(params: PlanAnnotationsParams): AnnotationPlan {
const maxLabels = params.maxLabels ?? ANNOTATION_MAX_LABELS_DEFAULT;
if (params.space === "viewport" && !params.scroll) {
throw new Error("planAnnotations: scroll is required when space is 'viewport'");
}
if (params.space === "element" && !params.elementRect) {
throw new Error("planAnnotations: elementRect is required when space is 'element'");
}
// Element-mode filter: discard inputs that do not overlap the element rect.
let kept = params.inputs;
if (params.space === "element" && params.elementRect) {
const er = params.elementRect;
kept = params.inputs.filter((input) => rectsOverlap(input.doc, er));
}
// Viewport capture only shows refs inside the current viewport rect. An
// off-viewport ref is still surfaced in `annotations` (with its real,
// possibly out-of-image box) so callers can locate it, but it is not drawn
// and is counted as skipped. This keeps the shipped `labelsSkipped` meaning
// ("refs not present in the captured viewport image") instead of silently
// narrowing it. Only applied when the caller supplies the viewport size;
// without it we cannot decide off-screen state and skip nothing.
const viewportRect =
params.space === "viewport" && params.scroll && params.viewport
? {
x: params.scroll.x,
y: params.scroll.y,
width: params.viewport.width,
height: params.viewport.height,
}
: undefined;
const overlayItems: OverlayItem[] = [];
const annotations: AnnotationItem[] = [];
let skipped = 0;
for (const input of kept) {
if (viewportRect && !rectsOverlap(input.doc, viewportRect)) {
// Outside the captured viewport: count as skipped (compat) but still
// report the annotation; do not draw it or consume the label budget.
skipped += 1;
annotations.push(toAnnotation(input, params));
continue;
}
if (overlayItems.length >= maxLabels) {
skipped += 1;
continue;
}
overlayItems.push({
ref: input.ref,
x: input.doc.x,
y: input.doc.y,
w: input.doc.width,
h: input.doc.height,
});
annotations.push(toAnnotation(input, params));
}
return { overlayItems, annotations, skipped };
}
function toAnnotation(input: RawAnnotationInput, params: PlanAnnotationsParams): AnnotationItem {
return {
ref: input.ref,
number: refToNumber(input.ref),
role: input.role,
...(input.name ? { name: input.name } : {}),
box: projectBox(input.doc, params),
};
}
function projectBox(
doc: { x: number; y: number; width: number; height: number },
params: PlanAnnotationsParams,
): AnnotationBox {
if (params.space === "viewport") {
const scroll = params.scroll!;
return {
x: doc.x - scroll.x,
y: doc.y - scroll.y,
width: doc.width,
height: doc.height,
};
}
if (params.space === "element") {
const er = params.elementRect!;
// NOTE: width/height pass through unchanged even when the input rect
// partially extends past the element. The capture backend (e.g.
// locator.screenshot) is responsible for clipping; the box may have
// negative x/y or extend past elementRect width/height for partial overlaps.
return {
x: doc.x - er.x,
y: doc.y - er.y,
width: doc.width,
height: doc.height,
};
}
// fullpage: document coordinates as-is
return { x: doc.x, y: doc.y, width: doc.width, height: doc.height };
}
function rectsOverlap(
a: { x: number; y: number; width: number; height: number },
b: { x: number; y: number; width: number; height: number },
): boolean {
return a.x < b.x + b.width && a.x + a.width > b.x && a.y < b.y + b.height && a.y + a.height > b.y;
}
export function buildOverlayInjectionScript(params: {
items: OverlayItem[];
captureY?: number;
}): string {
const itemsJson = JSON.stringify(
params.items.map((it) => ({
ref: it.ref,
x: round(it.x),
y: round(it.y),
w: Math.max(1, round(it.w)),
h: Math.max(1, round(it.h)),
})),
);
const attr = ANNOTATION_OVERLAY_ATTR;
const rootId = ANNOTATION_OVERLAY_ROOT_ID;
const captureY = Number.isFinite(params.captureY) ? round(params.captureY ?? 0) : 0;
return `(() => {
var items = ${itemsJson};
var captureY = ${captureY};
var existing = document.querySelectorAll("[${attr}]");
for (var k = 0; k < existing.length; k++) existing[k].remove();
var root = document.createElement("div");
root.id = ${JSON.stringify(rootId)};
root.setAttribute("${attr}", "1");
root.style.cssText = "position:absolute;top:0;left:0;width:0;height:0;pointer-events:none;z-index:2147483647;font-family:'SF Mono','SFMono-Regular',Menlo,Monaco,Consolas,'Liberation Mono','Courier New',monospace;";
for (var i = 0; i < items.length; i++) {
var it = items[i];
var box = document.createElement("div");
box.setAttribute("${attr}", "1");
box.style.cssText = "position:absolute;left:" + it.x + "px;top:" + it.y + "px;width:" + it.w + "px;height:" + it.h + "px;border:2px solid #ffb020;box-sizing:border-box;pointer-events:none;";
var tag = document.createElement("div");
tag.setAttribute("${attr}", "1");
tag.textContent = String(it.ref);
var relativeY = it.y - captureY;
var labelTop = relativeY < 14 ? (it.y + 2) : (it.y - 14);
tag.style.cssText = "position:absolute;left:" + it.x + "px;top:" + labelTop + "px;background:#ffb020;color:#1a1a1a;font:bold 11px/14px monospace;padding:0 4px;border-radius:2px;white-space:nowrap;pointer-events:none;";
root.appendChild(box);
root.appendChild(tag);
}
document.documentElement.appendChild(root);
return true;
})();`;
}
export function buildOverlayClearScript(): string {
const attr = ANNOTATION_OVERLAY_ATTR;
return `(() => {
var existing = document.querySelectorAll("[${attr}]");
for (var k = 0; k < existing.length; k++) existing[k].remove();
return true;
})();`;
}
/**
* Scale annotation boxes by independent x/y factors. Used to keep annotation
* coordinates aligned with the saved image after the response pipeline
* resizes the screenshot (e.g. via normalizeBrowserScreenshot capping the
* longest side or the byte budget). Returns a new array; inputs are not
* mutated. When both factors are 1, or when either factor is non-finite or
* not positive, the boxes are returned unchanged (as a structural copy) so
* callers can share the same code path for resized and non-resized captures.
*/
export function scaleAnnotations(
items: AnnotationItem[],
scaleX: number,
scaleY: number,
): AnnotationItem[] {
const invalid =
!Number.isFinite(scaleX) || !Number.isFinite(scaleY) || scaleX <= 0 || scaleY <= 0;
// Invalid or identity factors: return a structural copy with boxes unchanged.
if (invalid || (scaleX === 1 && scaleY === 1)) {
return items.map((it) => ({ ...it, box: { ...it.box } }));
}
return items.map((it) => ({
...it,
box: {
x: round(it.box.x * scaleX),
y: round(it.box.y * scaleY),
width: Math.max(1, round(it.box.width * scaleX)),
height: Math.max(1, round(it.box.height * scaleY)),
},
}));
}
function round(v: number): number {
return Math.round(v);
}
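
The viewport branch of `planAnnotations` combines two small pieces of geometry: a strict rectangle-overlap test against the viewport rect (`scroll` plus viewport size) and a projection that subtracts the scroll offset. The sketch below restates both in isolation, using the same sample coordinates as the tests. `Rect`, `overlaps`, and `toViewport` are illustrative names, not exports of the module above.

```typescript
// Illustrative restatement of the geometry used by the viewport branch.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Strict overlap: rectangles that only touch along an edge do not count.
function overlaps(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width && a.x + a.width > b.x && a.y < b.y + b.height && a.y + a.height > b.y
  );
}

// Project a document-space box into viewport image space by subtracting scroll.
function toViewport(doc: Rect, offset: { x: number; y: number }): Rect {
  return { x: doc.x - offset.x, y: doc.y - offset.y, width: doc.width, height: doc.height };
}

// Assumed state: the page is scrolled down 1000px in a 1280x720 viewport.
const scrollOffset = { x: 0, y: 1000 };
const viewportRect: Rect = { x: scrollOffset.x, y: scrollOffset.y, width: 1280, height: 720 };

const buttonDoc: Rect = { x: 100, y: 200, width: 50, height: 20 };
const linkDoc: Rect = { x: 300, y: 1500, width: 80, height: 18 };

// The button sits above the viewport, so it would be counted as skipped;
// its projected box still exists but has a negative y.
const buttonVisible = overlaps(buttonDoc, viewportRect);
const buttonBox = toViewport(buttonDoc, scrollOffset);

// The link is inside the viewport and would be drawn.
const linkVisible = overlaps(linkDoc, viewportRect);
const linkBox = toViewport(linkDoc, scrollOffset);

console.log({ buttonVisible, buttonBox, linkVisible, linkBox });
```

This is why an off-viewport ref can appear in `annotations` with a box outside the image bounds while still raising `skipped`.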

View File

@@ -61,18 +61,4 @@ describe("browser navigation commands", () => {
expect(capture.runtimeErrors.join("\n")).toContain("Invalid width: maximum is 8192");
expect(mocks.runBrowserResizeWithOutput).not.toHaveBeenCalled();
});
it("navigate and resize commands are registered after removing dead import (#83878)", async () => {
const program = createNavigationProgram();
const browserCmd = program.commands.find((c) => c.name() === "browser");
expect(browserCmd).toBeDefined();
const cmds = browserCmd!.commands.map((c) => c.name());
expect(cmds).toContain("resize");
expect(cmds).toContain("navigate");
// Verify the shared module still exports requireRef (used by other modules)
const shared = await import("./shared.js");
expect(typeof shared.requireRef).toBe("function");
});
});

View File

@@ -12,7 +12,7 @@ import {
type BrowserParentOpts,
} from "../browser-cli-shared.js";
import { danger, defaultRuntime } from "../core-api.js";
import { resolveBrowserActionContext } from "./shared.js";
import { requireRef, resolveBrowserActionContext } from "./shared.js";
/** Registers Browser navigate and resize commands. */
export function registerBrowserNavigationCommands(
@@ -94,4 +94,7 @@ export function registerBrowserNavigationCommands(
defaultRuntime.exit(1);
}
});
// Keep `requireRef` reachable; shared utilities are intended for other modules too.
void requireRef;
}

View File

@@ -51,11 +51,7 @@ export function registerBrowserInspectCommands(
.option("--full-page", "Capture full scrollable page", false)
.option("--ref <ref>", "ARIA ref from ai snapshot")
.option("--element <selector>", "CSS selector for element screenshot")
.option(
"--labels",
"Overlay role refs on the screenshot (works with --full-page, --ref, and --element)",
false,
)
.option("--labels", "Overlay role refs on the screenshot", false)
.option("--type <png|jpeg>", "Output type (default: png)", "png")
.action(async (targetId: string | undefined, opts, cmd) => {
const parent = parentOpts(cmd);
@@ -102,7 +98,7 @@ export function registerBrowserInspectCommands(
.option("--depth <n>", "Role snapshot: max depth")
.option("--selector <sel>", "Role snapshot: scope to CSS selector")
.option("--frame <sel>", "Role snapshot: scope to an iframe selector")
.option("--labels", "Include label overlay screenshot with annotations", false)
.option("--labels", "Include viewport label overlay screenshot", false)
.option("--urls", "Append discovered link URLs to AI snapshots", false)
.option("--out <path>", "Write snapshot to a file")
.action(async (opts, cmd) => {

View File

@@ -1,11 +1,7 @@
// Canvas tests cover CLI plugin behavior.
import { Command } from "commander";
import { describe, expect, it, vi } from "vitest";
import {
createDefaultCanvasCliDependencies,
registerNodesCanvasCommands,
type CanvasCliDependencies,
} from "./cli.js";
import { registerNodesCanvasCommands, type CanvasCliDependencies } from "./cli.js";
function createCanvasCliDeps() {
const writtenFiles: Array<{ filePath: string; base64: string }> = [];
@@ -51,26 +47,6 @@ function createCanvasCliDeps() {
return { deps, runtime, writtenFiles };
}
function createCanvasCliDepsWithDefaultParsers() {
const baseDeps = createDefaultCanvasCliDependencies();
const harness = createCanvasCliDeps();
return {
...harness,
deps: {
...baseDeps,
defaultRuntime: harness.runtime,
nodesCallOpts: harness.deps.nodesCallOpts,
runNodesCommand: harness.deps.runNodesCommand,
getNodesTheme: harness.deps.getNodesTheme,
resolveNodeId: harness.deps.resolveNodeId,
buildNodeInvokeParams: harness.deps.buildNodeInvokeParams,
callGatewayCli: harness.deps.callGatewayCli,
writeBase64ToFile: harness.deps.writeBase64ToFile,
shortenHomePath: harness.deps.shortenHomePath,
},
};
}
describe("canvas CLI", () => {
it("registers under nodes and captures a snapshot media path", async () => {
const program = new Command();
@@ -159,8 +135,6 @@ describe("canvas CLI", () => {
it.each([
["--max-width", "640px", "--max-width must be a positive integer."],
["--quality", "0.8x", "--quality must be a number."],
["--quality", "-0.1", "--quality must be between 0 and 1."],
["--quality", "5", "--quality must be between 0 and 1."],
])("rejects partial numeric snapshot %s values", async (flag, value, message) => {
const program = new Command();
program.exitOverride();
@@ -177,62 +151,6 @@ describe("canvas CLI", () => {
expect(deps.callGatewayCli).not.toHaveBeenCalled();
});
it.each(["0", "1"])("accepts snapshot --quality boundary value %s", async (quality) => {
const program = new Command();
program.exitOverride();
const nodes = program.command("nodes");
const { deps } = createCanvasCliDeps();
registerNodesCanvasCommands(nodes, deps);
await program.parseAsync(
["nodes", "canvas", "snapshot", "--node", "ios-node", "--quality", quality],
{
from: "user",
},
);
expect(deps.callGatewayCli).toHaveBeenCalledWith(
"node.invoke",
expect.any(Object),
expect.objectContaining({
params: expect.objectContaining({
quality: Number(quality),
}),
}),
);
});
it.each([
["snapshot"],
["present"],
["hide"],
["navigate", "https://example.com"],
["eval", "1 + 1"],
["a2ui", "push", "--text", "hello"],
["a2ui", "reset"],
])("rejects invalid %s invoke timeouts before invoking the node", async (...args) => {
const program = new Command();
program.exitOverride();
const nodes = program.command("nodes");
const { deps } = createCanvasCliDepsWithDefaultParsers();
deps.resolveNodeId = vi.fn(async () => {
throw new Error("resolveNodeId should not be called");
});
registerNodesCanvasCommands(nodes, deps);
await expect(
program.parseAsync(
["nodes", "canvas", ...args, "--node", "ios-node", "--invoke-timeout", "20ms"],
{
from: "user",
},
),
).rejects.toThrow("--invoke-timeout must be a positive integer.");
expect(deps.resolveNodeId).not.toHaveBeenCalled();
expect(deps.callGatewayCli).not.toHaveBeenCalled();
});
it.each([
["--x", "1x"],
["--y", "2px"],

View File

@@ -97,11 +97,7 @@ function parseTimeoutMs(raw: unknown): number | undefined {
if (raw === undefined || raw === null) {
return undefined;
}
const parsed = parseStrictPositiveInteger(raw);
if (parsed === undefined) {
throw new Error("--invoke-timeout must be a positive integer.");
}
return parsed;
return parseStrictPositiveInteger(raw);
}
function parseCanvasPositiveIntOption(raw: string | undefined, flag: string): number | undefined {
@@ -126,14 +122,6 @@ function parseCanvasFiniteNumberOption(raw: string | undefined, flag: string): n
return parsed;
}
function parseCanvasSnapshotQualityOption(raw: string | undefined): number | undefined {
const parsed = parseCanvasFiniteNumberOption(raw, "--quality");
if (parsed !== undefined && (parsed < 0 || parsed > 1)) {
throw new Error("--quality must be between 0 and 1.");
}
return parsed;
}
function parseNodeCandidates(raw: unknown): CanvasNodeCandidate[] {
const payload =
raw && typeof raw === "object" ? (raw as { nodes?: unknown; paired?: unknown }) : {};
@@ -257,8 +245,8 @@ async function invokeCanvas(
command: string,
params?: Record<string, unknown>,
) {
const timeoutMs = deps.parseTimeoutMs(opts.invokeTimeout);
const nodeId = await deps.resolveNodeId(opts, normalizeOptionalString(opts.node) ?? "");
const timeoutMs = deps.parseTimeoutMs(opts.invokeTimeout);
return await deps.callGatewayCli(
"node.invoke",
opts,
@@ -290,7 +278,7 @@ export function registerNodesCanvasCommands(nodes: Command, deps: CanvasCliDepen
await deps.runNodesCommand("canvas snapshot", async () => {
const format = parseCanvasSnapshotRequestFormat(opts.format);
const maxWidth = parseCanvasPositiveIntOption(opts.maxWidth, "--max-width");
const quality = parseCanvasSnapshotQualityOption(opts.quality);
const quality = parseCanvasFiniteNumberOption(opts.quality, "--quality");
const raw = await invokeCanvas(deps, opts, "canvas.snapshot", {
format,
maxWidth: Number.isFinite(maxWidth) ? maxWidth : undefined,
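
One side of the hunk above wraps the finite-number parser in a 0..1 range check for `--quality`; the other side calls the finite-number parser directly. A minimal sketch of the range-checked variant follows. The body of `parseCanvasFiniteNumberOption` is not shown in this diff, so `parseFiniteNumber` below is a hypothetical stand-in, and `parseQuality` is an illustrative name.

```typescript
// Hypothetical stand-in for the CLI's finite-number parser; the real
// implementation is not visible in this diff and may differ.
function parseFiniteNumber(raw: string | undefined, flag: string): number | undefined {
  if (raw === undefined) {
    return undefined;
  }
  const trimmed = raw.trim();
  // Number("") is 0, so treat an empty value as invalid explicitly.
  const parsed = trimmed === "" ? Number.NaN : Number(trimmed);
  if (!Number.isFinite(parsed)) {
    throw new Error(`${flag} must be a number.`);
  }
  return parsed;
}

// Range-checked quality parser: accepts the boundaries 0 and 1, rejects the rest.
function parseQuality(raw: string | undefined): number | undefined {
  const parsed = parseFiniteNumber(raw, "--quality");
  if (parsed !== undefined && (parsed < 0 || parsed > 1)) {
    throw new Error("--quality must be between 0 and 1.");
  }
  return parsed;
}

// Collect the error message for an input, or undefined when parsing succeeds.
function qualityError(raw: string): string | undefined {
  try {
    parseQuality(raw);
    return undefined;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

console.log(parseQuality("0.8"), qualityError("0.8x"), qualityError("5"));
```

Without the range check, values such as `5` or `-0.1` pass CLI validation and are forwarded to the node as-is, which is the behavioral difference the removed test rows describe.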

View File

@@ -5,14 +5,6 @@ import { buildCodexMediaUnderstandingProvider } from "./media-understanding-prov
import type { CodexAppServerClient } from "./src/app-server/client.js";
import type { CodexServerNotification, JsonValue } from "./src/app-server/protocol.js";
const sharedClientMocks = vi.hoisted(() => ({
createIsolatedCodexAppServerClient: vi.fn(),
}));
vi.mock("./src/app-server/shared-client.js", () => ({
createIsolatedCodexAppServerClient: sharedClientMocks.createIsolatedCodexAppServerClient,
}));
function codexModel(inputModalities: string[] = ["text", "image"]) {
return {
id: "gpt-5.4",
@@ -177,7 +169,6 @@ function createFakeClient(options?: {
requestHandlers.add(handler);
return () => requestHandlers.delete(handler);
},
close: vi.fn(),
} as unknown as CodexAppServerClient;
return { client, requests, approvalResponses };
@@ -187,24 +178,13 @@ describe("codex media understanding provider", () => {
afterEach(() => {
vi.useRealTimers();
vi.restoreAllMocks();
sharedClientMocks.createIsolatedCodexAppServerClient.mockReset();
});
it("runs image understanding through a bounded Codex app-server turn", async () => {
const { client, requests } = createFakeClient();
const clientFactory = vi.fn(
async (_startOptions, _authProfileId, _agentDir, _config) => client,
);
const provider = buildCodexMediaUnderstandingProvider({
clientFactory,
clientFactory: async () => client,
});
const cfg = {
auth: {
order: {
openai: ["openai:work"],
},
},
};
const result = await provider.describeImage?.({
buffer: Buffer.from("image-bytes"),
@@ -214,7 +194,7 @@ describe("codex media understanding provider", () => {
model: "gpt-5.4",
prompt: "Describe briefly.",
timeoutMs: 30_000,
cfg,
cfg: {},
agentDir: "/tmp/openclaw-agent",
});
@@ -224,12 +204,6 @@ describe("codex media understanding provider", () => {
"thread/start",
"turn/start",
]);
expect(clientFactory).toHaveBeenCalledWith(
expect.any(Object),
undefined,
"/tmp/openclaw-agent",
cfg,
);
expect(requests[1]?.params).toEqual({
model: "gpt-5.4",
modelProvider: "openai",
@@ -262,62 +236,6 @@ describe("codex media understanding provider", () => {
});
});
it("treats a blank agent directory as absent when starting the app-server", async () => {
const { client, requests } = createFakeClient();
const clientFactory = vi.fn(async () => client);
const provider = buildCodexMediaUnderstandingProvider({ clientFactory });
const cfg = {};
await provider.describeImage?.({
buffer: Buffer.from("image-bytes"),
fileName: "image.png",
mime: "image/png",
provider: "codex",
model: "gpt-5.4",
timeoutMs: 30_000,
cfg,
agentDir: " ",
});
expect(clientFactory).toHaveBeenCalledWith(expect.any(Object), undefined, undefined, cfg);
expect(requests[1]?.params).toEqual(expect.objectContaining({ cwd: process.cwd() }));
expect(requests[2]?.params).toEqual(expect.objectContaining({ cwd: process.cwd() }));
});
it("passes the scoped auth store into isolated app-server startup", async () => {
const { client } = createFakeClient();
sharedClientMocks.createIsolatedCodexAppServerClient.mockResolvedValue(client);
const provider = buildCodexMediaUnderstandingProvider();
const authStore = {
version: 1,
profiles: {
"openai:scoped": {
type: "oauth" as const,
provider: "openai",
access: "scoped-access",
refresh: "scoped-refresh",
expires: Date.now() + 60_000,
},
},
};
await provider.describeImage?.({
buffer: Buffer.from("image-bytes"),
fileName: "image.png",
mime: "image/png",
provider: "codex",
model: "gpt-5.4",
timeoutMs: 30_000,
cfg: {},
authStore,
agentDir: "/tmp/openclaw-agent",
});
expect(sharedClientMocks.createIsolatedCodexAppServerClient).toHaveBeenCalledWith(
expect.objectContaining({ authProfileStore: authStore }),
);
});
it("clamps oversized image understanding turn timeouts", async () => {
const setTimeoutSpy = vi.spyOn(globalThis, "setTimeout");
try {

View File

@@ -102,8 +102,6 @@ async function describeCodexImages(
profile: req.profile,
timeoutMs: req.timeoutMs,
agentDir: req.agentDir,
authStore: req.authStore,
cfg: req.cfg,
options,
taskLabel: "image understanding",
developerInstructions:
@@ -125,8 +123,6 @@ type BoundedCodexVisionTurnParams = {
profile?: string;
timeoutMs: number;
agentDir?: string;
authStore?: ImagesDescriptionRequest["authStore"];
cfg: ImagesDescriptionRequest["cfg"];
options: CodexMediaUnderstandingProviderOptions;
taskLabel: string;
developerInstructions: string;
@@ -139,22 +135,17 @@ async function runBoundedCodexVisionTurn(params: BoundedCodexVisionTurnParams):
pluginConfig: params.options.pluginConfig,
});
const timeoutMs = resolveTimerTimeoutMs(params.timeoutMs, 100, 100);
const agentDir = params.agentDir?.trim() || undefined;
const cwd = agentDir ?? process.cwd();
const ownsClient = !params.options.clientFactory;
// Tests inject a client factory; production creates an isolated app-server
// client so media tasks cannot reuse the interactive attempt session.
const client = params.options.clientFactory
? await params.options.clientFactory(appServer.start, params.profile, agentDir, params.cfg)
? await params.options.clientFactory(appServer.start, params.profile)
: await import("./src/app-server/shared-client.js").then(
({ createIsolatedCodexAppServerClient }) =>
createIsolatedCodexAppServerClient({
startOptions: appServer.start,
timeoutMs,
authProfileId: params.profile,
agentDir,
authProfileStore: params.authStore,
config: params.cfg,
}),
);
const abortController = new AbortController();
@@ -175,7 +166,7 @@ async function runBoundedCodexVisionTurn(params: BoundedCodexVisionTurnParams):
{
model: params.model,
modelProvider: "openai",
cwd,
cwd: params.agentDir || process.cwd(),
approvalPolicy: "on-request",
sandbox: "read-only",
serviceName: "OpenClaw",
@@ -202,7 +193,7 @@ async function runBoundedCodexVisionTurn(params: BoundedCodexVisionTurnParams):
{
threadId: thread.thread.id,
input: params.input,
cwd,
cwd: params.agentDir || process.cwd(),
approvalPolicy: "on-request",
model: params.model,
effort: "low",
@@ -251,8 +242,6 @@ async function extractCodexStructured(
profile: req.profile,
timeoutMs: req.timeoutMs,
agentDir: req.agentDir,
authStore: req.authStore,
cfg: req.cfg,
options,
taskLabel: "structured extraction",
developerInstructions:

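Note on the timeout handling above: the hunk passes `params.timeoutMs` through `resolveTimerTimeoutMs(params.timeoutMs, 100, 100)`, and the neighbouring test is titled "clamps oversized image understanding turn timeouts". The real helper is not shown in this diff, so the following is only a minimal sketch of one way such a clamp could work. The name `clampTimerDelayMs` and the meaning of its second and third arguments (fallback and minimum) are assumptions, not the project's API. The upper bound reflects the documented Node.js behaviour that `setTimeout` delays above 2147483647 ms are reset to 1 ms.

```typescript
// Hypothetical stand-in for a timer-delay clamp; not the project's resolveTimerTimeoutMs.
// Largest delay Node.js timers accept before falling back to a 1 ms delay.
const MAX_TIMER_DELAY_MS = 2_147_483_647;

function clampTimerDelayMs(value: unknown, fallbackMs: number, minMs: number): number {
  // Non-numeric, NaN, or infinite input falls back to the supplied default.
  if (typeof value !== "number" || !Number.isFinite(value)) {
    return fallbackMs;
  }
  // Keep the delay inside [minMs, MAX_TIMER_DELAY_MS].
  return Math.min(Math.max(Math.floor(value), minMs), MAX_TIMER_DELAY_MS);
}
```

Clamping before the value reaches `setTimeout` keeps an oversized request from silently turning into an almost immediate timeout.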

@@ -16,7 +16,6 @@ import {
} from "openclaw/plugin-sdk/agent-harness-runtime";
import { resolveAgentWorkspaceDir } from "openclaw/plugin-sdk/agent-runtime";
import { buildMemorySystemPromptAddition } from "openclaw/plugin-sdk/core";
import { MESSAGE_TOOL_DELIVERY_HINTS } from "openclaw/plugin-sdk/message-tool-delivery-hints";
import type { CodexDynamicToolSpec, JsonValue } from "./protocol.js";
import { isJsonObject } from "./protocol.js";
import type { CodexAppServerThreadBinding } from "./session-binding.js";
@@ -585,12 +584,17 @@ export function prependCodexOpenClawPromptContext(
return [context?.trim(), deliverySection, promptSection].filter(Boolean).join("\n\n");
}
const CODEX_DELIVERY_HINT_LINES = [
"Delivery: to send a message, use the `message` tool.",
"Delivery: Final assistant text is not automatically delivered in this run. Use the `message` tool to send user-visible output.",
] as const;
function splitLeadingCodexDeliveryHint(prompt: string): {
deliveryHint?: string;
prompt: string;
} {
const trimmedStart = prompt.trimStart();
const matchedHint = MESSAGE_TOOL_DELIVERY_HINTS.find((hint) => trimmedStart.startsWith(hint));
const matchedHint = CODEX_DELIVERY_HINT_LINES.find((hint) => trimmedStart.startsWith(hint));
if (!matchedHint) {
return { prompt };
}


@@ -5,7 +5,6 @@ import path from "node:path";
import {
clearRuntimeAuthProfileStoreSnapshots,
loadAuthProfileStoreForSecretsRuntime,
replaceRuntimeAuthProfileStoreSnapshots,
} from "openclaw/plugin-sdk/agent-runtime";
import { upsertAuthProfile } from "openclaw/plugin-sdk/provider-auth";
import { afterEach, describe, expect, it, vi } from "vitest";
@@ -15,7 +14,6 @@ import {
refreshCodexAppServerAuthTokens,
resolveCodexAppServerAuthAccountCacheKey,
resolveCodexAppServerAuthProfileId,
resolveCodexAppServerAuthProfileStore,
resolveCodexAppServerFallbackApiKeyCacheKey,
resolveCodexAppServerHomeDir,
resolveCodexAppServerNativeHomeDir,
@@ -181,39 +179,6 @@ async function writeCodexCliApiKeyAuthFile(codexHome: string): Promise<void> {
}
describe("bridgeCodexAppServerStartOptions", () => {
it("preserves persisted provenance when preparing a supplied base store", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const authProfileStore = { version: 1, profiles: {} };
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-access",
refresh: "persisted-refresh",
expires: Date.now() + 60_000,
},
});
const prepared = resolveCodexAppServerAuthProfileStore({
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(prepared).not.toBe(authProfileStore);
expect(prepared.runtimePersistedProfileIds).toContain("openai:work");
expect(prepared.profiles["openai:work"]).toMatchObject({
access: "persisted-access",
refresh: "persisted-refresh",
});
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("sets agent-owned CODEX_HOME without overriding HOME for local app-server launches", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const startOptions = createStartOptions();
@@ -611,603 +576,6 @@ describe("bridgeCodexAppServerStartOptions", () => {
}
});
it("applies a supplied scoped OAuth profile instead of persisted credentials", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-access",
refresh: "persisted-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "persisted-account",
},
});
const authProfileStore: AuthProfileStore = {
version: 1,
profiles: {
"openai:work": {
type: "oauth",
provider: "openai",
access: "scoped-access",
refresh: "scoped-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "scoped-account",
},
},
};
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-access",
chatgptAccountId: "scoped-account",
chatgptPlanType: null,
});
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it.each([
{ name: "without persisted same-id credentials", persistSameId: false },
{ name: "with persisted same-id credentials", persistSameId: true },
])("refreshes an expired scoped OAuth profile $name", async ({ persistSameId }) => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
oauthMocks.refreshOpenAICodexToken.mockResolvedValueOnce({
access: "scoped-refreshed-access",
refresh: "scoped-refreshed-refresh",
expires: Date.now() + 60_000,
accountId: "scoped-refreshed-account",
});
try {
if (persistSameId) {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-access",
refresh: "persisted-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "persisted-account",
},
});
}
const authProfileStore: AuthProfileStore = {
version: 1,
profiles: {
"openai:work": {
type: "oauth",
provider: "openai",
access: "scoped-expired-access",
refresh: "scoped-refresh",
expires: Date.now() - 60_000,
accountId: "scoped-account",
},
},
};
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledWith("scoped-refresh");
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-refreshed-access",
chatgptAccountId: "scoped-refreshed-account",
chatgptPlanType: null,
});
expect(authProfileStore.profiles["openai:work"]).toMatchObject({
access: "scoped-refreshed-access",
accountId: "scoped-refreshed-account",
});
if (persistSameId) {
expect(
loadAuthProfileStoreForSecretsRuntime(agentDir).profiles["openai:work"],
).toMatchObject({
access: "persisted-access",
accountId: "persisted-account",
});
}
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("routes a supplied persisted OAuth clone through canonical refresh", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
oauthMocks.refreshOpenAICodexToken.mockResolvedValueOnce({
access: "persisted-refreshed-access",
refresh: "persisted-refreshed-refresh",
expires: Date.now() + 60_000,
accountId: "persisted-account",
});
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-expired-access",
refresh: "persisted-refresh",
expires: Date.now() - 60_000,
accountId: "persisted-account",
},
});
const authProfileStore = loadAuthProfileStoreForSecretsRuntime(agentDir);
expect(authProfileStore.runtimePersistedProfileIds).toContain("openai:work");
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledWith("persisted-refresh");
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "persisted-refreshed-access",
chatgptAccountId: "persisted-account",
chatgptPlanType: null,
});
expect(loadAuthProfileStoreForSecretsRuntime(agentDir).profiles["openai:work"]).toMatchObject(
{
access: "persisted-refreshed-access",
refresh: "persisted-refreshed-refresh",
accountId: "persisted-account",
},
);
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("keeps a prepared persisted store aligned across rotating refresh tokens", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
oauthMocks.refreshOpenAICodexToken
.mockResolvedValueOnce({
access: "first-rotated-access",
refresh: "first-rotated-refresh",
expires: Date.now() + 60_000,
})
.mockResolvedValueOnce({
access: "second-rotated-access",
refresh: "second-rotated-refresh",
expires: Date.now() + 60_000,
});
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "initial-access",
refresh: "initial-refresh",
expires: Date.now() + 60_000,
},
});
const authProfileStore = resolveCodexAppServerAuthProfileStore({
agentDir,
authProfileId: "openai:work",
authProfileStore: { version: 1, profiles: {} },
});
await refreshCodexAppServerAuthTokens({
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
await refreshCodexAppServerAuthTokens({
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken.mock.calls).toEqual([
["initial-refresh"],
["first-rotated-refresh"],
]);
expect(authProfileStore.profiles["openai:work"]).toMatchObject({
access: "second-rotated-access",
refresh: "second-rotated-refresh",
});
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("does not replace a prepared persisted store changed during refresh", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
let resolveRefresh:
| ((value: { access: string; refresh: string; expires: number }) => void)
| undefined;
oauthMocks.refreshOpenAICodexToken.mockImplementationOnce(
() =>
new Promise((resolve) => {
resolveRefresh = resolve;
}),
);
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "initial-access",
refresh: "initial-refresh",
expires: Date.now() + 60_000,
},
});
const authProfileStore = resolveCodexAppServerAuthProfileStore({
agentDir,
authProfileId: "openai:work",
authProfileStore: { version: 1, profiles: {} },
});
const refresh = refreshCodexAppServerAuthTokens({
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
await vi.waitFor(() => expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledTimes(1));
authProfileStore.profiles["openai:work"] = {
type: "oauth",
provider: "openai",
access: "replacement-access",
refresh: "replacement-refresh",
expires: Date.now() + 60_000,
accountId: "replacement-account",
};
resolveRefresh?.({
access: "rotated-access",
refresh: "rotated-refresh",
expires: Date.now() + 60_000,
});
await refresh;
expect(authProfileStore.profiles["openai:work"]).toMatchObject({
access: "replacement-access",
refresh: "replacement-refresh",
accountId: "replacement-account",
});
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("keeps a runtime-external same-account OAuth profile scoped", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
oauthMocks.refreshOpenAICodexToken.mockResolvedValueOnce({
access: "scoped-refreshed-access",
refresh: "scoped-refreshed-refresh",
expires: Date.now() + 60_000,
accountId: "shared-account",
});
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-access",
refresh: "persisted-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "shared-account",
},
});
const authProfileStore: AuthProfileStore = {
version: 1,
runtimeExternalProfileIds: ["openai:work"],
runtimeExternalProfileIdsAuthoritative: true,
profiles: {
"openai:work": {
type: "oauth",
provider: "openai",
access: "scoped-expired-access",
refresh: "scoped-refresh",
expires: Date.now() - 60_000,
accountId: "shared-account",
},
},
};
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledWith("scoped-refresh");
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-refreshed-access",
chatgptAccountId: "shared-account",
chatgptPlanType: null,
});
expect(loadAuthProfileStoreForSecretsRuntime(agentDir).profiles["openai:work"]).toMatchObject(
{
access: "persisted-access",
refresh: "persisted-refresh",
accountId: "shared-account",
},
);
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("keeps an ambiguous supplied OAuth identity scoped", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
oauthMocks.refreshOpenAICodexToken.mockResolvedValueOnce({
access: "scoped-refreshed-access",
refresh: "scoped-refreshed-refresh",
expires: Date.now() + 60_000,
});
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "persisted-access",
refresh: "persisted-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "persisted-account",
},
});
const authProfileStore: AuthProfileStore = {
version: 1,
profiles: {
"openai:work": {
type: "oauth",
provider: "openai",
access: "scoped-expired-access",
refresh: "scoped-refresh",
expires: Date.now() - 60_000,
},
},
};
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledWith("scoped-refresh");
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-refreshed-access",
chatgptAccountId: "openai:work",
chatgptPlanType: null,
});
expect(loadAuthProfileStoreForSecretsRuntime(agentDir).profiles["openai:work"]).toMatchObject(
{
access: "persisted-access",
refresh: "persisted-refresh",
accountId: "persisted-account",
},
);
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("routes a same-identity stale persisted clone through canonical persisted auth", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "stale-access",
refresh: "stale-refresh",
expires: Date.now() - 60_000,
accountId: "persisted-account",
},
});
const authProfileStore = loadAuthProfileStoreForSecretsRuntime(agentDir);
expect(authProfileStore.runtimePersistedProfileIds).toContain("openai:work");
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "current-access",
refresh: "current-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "persisted-account",
},
});
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).not.toHaveBeenCalled();
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "current-access",
chatgptAccountId: "persisted-account",
chatgptPlanType: null,
});
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("keeps a changed-identity persisted clone scoped", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
oauthMocks.refreshOpenAICodexToken.mockResolvedValueOnce({
access: "account-a-refreshed-access",
refresh: "account-a-refreshed-refresh",
expires: Date.now() + 60_000,
accountId: "account-a",
});
try {
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "account-a-expired-access",
refresh: "account-a-refresh",
expires: Date.now() - 60_000,
accountId: "account-a",
},
});
const authProfileStore = loadAuthProfileStoreForSecretsRuntime(agentDir);
expect(authProfileStore.runtimePersistedProfileIds).toContain("openai:work");
upsertAuthProfile({
agentDir,
profileId: "openai:work",
credential: {
type: "oauth",
provider: "openai",
access: "account-b-access",
refresh: "account-b-refresh",
expires: Date.now() + 24 * 60 * 60_000,
accountId: "account-b",
},
});
replaceRuntimeAuthProfileStoreSnapshots([{ agentDir, store: authProfileStore }]);
await applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledWith("account-a-refresh");
expect(request).toHaveBeenCalledWith("account/login/start", {
type: "chatgptAuthTokens",
accessToken: "account-a-refreshed-access",
chatgptAccountId: "account-a",
chatgptPlanType: null,
});
expect(loadAuthProfileStoreForSecretsRuntime(agentDir).profiles["openai:work"]).toMatchObject(
{
access: "account-b-access",
refresh: "account-b-refresh",
accountId: "account-b",
},
);
} finally {
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("serializes concurrent refreshes of the same scoped OAuth profile", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ type: "chatgptAuthTokens" }));
let resolveRefresh:
| ((value: { access: string; refresh: string; expires: number; accountId: string }) => void)
| undefined;
oauthMocks.refreshOpenAICodexToken.mockImplementationOnce(
() =>
new Promise((resolve) => {
resolveRefresh = resolve;
}),
);
const authProfileStore: AuthProfileStore = {
version: 1,
profiles: {
"openai:work": {
type: "oauth",
provider: "openai",
access: "scoped-expired-access",
refresh: "scoped-refresh",
expires: Date.now() - 60_000,
accountId: "scoped-account",
},
},
};
try {
const first = applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
const second = applyCodexAppServerAuthProfile({
client: { request } as never,
agentDir,
authProfileId: "openai:work",
authProfileStore,
});
await vi.waitFor(() => expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledTimes(1));
resolveRefresh?.({
access: "scoped-refreshed-access",
refresh: "scoped-refreshed-refresh",
expires: Date.now() + 60_000,
accountId: "scoped-refreshed-account",
});
await Promise.all([first, second]);
expect(oauthMocks.refreshOpenAICodexToken).toHaveBeenCalledTimes(1);
expect(request).toHaveBeenCalledTimes(2);
expect(request).toHaveBeenNthCalledWith(1, "account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-refreshed-access",
chatgptAccountId: "scoped-refreshed-account",
chatgptPlanType: null,
});
expect(request).toHaveBeenNthCalledWith(2, "account/login/start", {
type: "chatgptAuthTokens",
accessToken: "scoped-refreshed-access",
chatgptAccountId: "scoped-refreshed-account",
chatgptPlanType: null,
});
} finally {
resolveRefresh?.({
access: "cleanup-access",
refresh: "cleanup-refresh",
expires: Date.now() + 60_000,
accountId: "cleanup-account",
});
await fs.rm(agentDir, { recursive: true, force: true });
}
});
it("leaves native app-server auth untouched when auth bridging is disabled", async () => {
const agentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-codex-app-server-"));
const request = vi.fn(async () => ({ requiresOpenaiAuth: true }));

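Note on the test "serializes concurrent refreshes of the same scoped OAuth profile": it expects two overlapping callers to trigger a single token refresh and both receive the refreshed credential. The source side of this diff keys in-flight refreshes with a `WeakMap<AuthProfileStore, Map<string, Promise<OAuthCredential>>>`. The sketch below shows that general pattern under simplified assumptions. `Credential`, `Store`, `inFlight`, and `refreshOnce` are hypothetical names, and the real `resolveScopedOAuthCredential` may differ in detail.

```typescript
// Simplified, hypothetical shapes; the real types live in openclaw/plugin-sdk.
type Credential = { access: string; refresh: string; expires: number };
type Store = { profiles: Record<string, Credential | undefined> };

// In-flight refreshes, keyed first by store object and then by profile id.
// A WeakMap lets the queue be garbage-collected together with its store.
const inFlight = new WeakMap<Store, Map<string, Promise<Credential>>>();

async function refreshOnce(
  store: Store,
  profileId: string,
  refresh: (refreshToken: string) => Promise<Credential>,
): Promise<Credential> {
  // Join a refresh that is already running for this store and profile.
  const existing = inFlight.get(store)?.get(profileId);
  if (existing) {
    return existing;
  }
  const current = store.profiles[profileId];
  if (!current) {
    throw new Error(`unknown profile: ${profileId}`);
  }
  const queue = inFlight.get(store) ?? new Map<string, Promise<Credential>>();
  inFlight.set(store, queue);
  const pending = refresh(current.refresh)
    .then((next) => {
      // Write the rotated credential back into the caller-owned store.
      store.profiles[profileId] = next;
      return next;
    })
    .finally(() => {
      // Clear the slot so a later expiry can start a fresh refresh.
      queue.delete(profileId);
    });
  queue.set(profileId, pending);
  return pending;
}
```

Sharing one promise matters with rotating refresh tokens: a second refresh issued with the already-consumed token would otherwise be likely to fail.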

@@ -4,10 +4,9 @@ import fsSync from "node:fs";
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { isDeepStrictEqual } from "node:util";
import {
ensureAuthProfileStore,
findPersistedAuthProfileCredential,
ensureAuthProfileStoreWithoutExternalProfiles,
loadAuthProfileStoreForSecretsRuntime,
refreshOAuthCredentialForRuntime,
resolveAuthProfileOrder,
@@ -19,7 +18,6 @@ import {
type AuthProfileStore,
type OAuthCredential,
} from "openclaw/plugin-sdk/agent-runtime";
import { hasUsableOAuthCredential } from "openclaw/plugin-sdk/provider-auth";
import type { CodexAppServerClient } from "./client.js";
import type { CodexAppServerStartOptions } from "./config.js";
import type {
@@ -50,16 +48,11 @@ const CODEX_AUTH_JSON_FILENAME = "auth.json";
const CODEX_HOME_DIRNAME = ".codex";
type AuthProfileOrderConfig = Parameters<typeof resolveAuthProfileOrder>[0]["cfg"];
const scopedOAuthRefreshQueues = new WeakMap<
AuthProfileStore,
Map<string, Promise<OAuthCredential>>
>();
export async function bridgeCodexAppServerStartOptions(params: {
startOptions: CodexAppServerStartOptions;
agentDir: string;
authProfileId?: string | null;
authProfileStore?: AuthProfileStore;
config?: AuthProfileOrderConfig;
}): Promise<CodexAppServerStartOptions> {
if (params.startOptions.transport !== "stdio") {
@@ -72,10 +65,9 @@ export async function bridgeCodexAppServerStartOptions(params: {
if (params.authProfileId === null) {
return isolatedStartOptions;
}
const store = resolveCodexAppServerAuthProfileStore({
const store = ensureCodexAppServerAuthProfileStore({
agentDir: params.agentDir,
authProfileId: params.authProfileId,
authProfileStore: params.authProfileStore,
config: params.config,
});
const authProfileId = resolveCodexAppServerAuthProfileId({
@@ -111,15 +103,13 @@ export function resolveCodexAppServerAuthProfileId(params: {
export function resolveCodexAppServerAuthProfileIdForAgent(params: {
authProfileId?: string;
authProfileStore?: AuthProfileStore;
agentDir?: string;
config?: AuthProfileOrderConfig;
}): string | undefined {
const agentDir = params.agentDir?.trim() || resolveDefaultAgentDir(params.config ?? {});
const store = resolveCodexAppServerAuthProfileStore({
const store = ensureCodexAppServerAuthProfileStore({
agentDir,
authProfileId: params.authProfileId,
authProfileStore: params.authProfileStore,
config: params.config,
});
return resolveCodexAppServerAuthProfileId({
@@ -142,7 +132,7 @@ function ensureCodexAppServerAuthProfileStore(params: {
});
}
export function resolveCodexAppServerAuthProfileStore(params: {
function resolveCodexAppServerAuthProfileStore(params: {
agentDir?: string;
authProfileId?: string;
authProfileStore?: AuthProfileStore;
@@ -173,41 +163,13 @@ export function resolveCodexAppServerAuthProfileStore(params: {
...params.authProfileStore.order,
}
: undefined;
const profiles = {
...overlaidStore.profiles,
...params.authProfileStore.profiles,
};
const suppliedProfileIds = new Set(Object.keys(params.authProfileStore.profiles));
const mergeRuntimeProfileIds = (overlaidIds?: string[], suppliedIds?: string[]) => [
...(overlaidIds ?? []).filter((profileId) => !suppliedProfileIds.has(profileId)),
...(suppliedIds ?? []),
];
const runtimePersistedProfileIds = mergeRuntimeProfileIds(
overlaidStore.runtimePersistedProfileIds,
params.authProfileStore.runtimePersistedProfileIds,
).filter((profileId) => profiles[profileId]);
const runtimeExternalProfileIds = mergeRuntimeProfileIds(
overlaidStore.runtimeExternalProfileIds,
params.authProfileStore.runtimeExternalProfileIds,
).filter((profileId) => profiles[profileId]);
const runtimeExternalProfileIdsAuthoritative =
overlaidStore.runtimeExternalProfileIdsAuthoritative === true ||
params.authProfileStore.runtimeExternalProfileIdsAuthoritative === true;
return {
...params.authProfileStore,
...(order ? { order } : {}),
profiles,
...(runtimePersistedProfileIds.length > 0
? { runtimePersistedProfileIds: [...new Set(runtimePersistedProfileIds)] }
: {}),
...(runtimeExternalProfileIds.length > 0 || runtimeExternalProfileIdsAuthoritative
? {
runtimeExternalProfileIds: [...new Set(runtimeExternalProfileIds)],
...(runtimeExternalProfileIdsAuthoritative
? { runtimeExternalProfileIdsAuthoritative: true }
: {}),
}
: {}),
profiles: {
...overlaidStore.profiles,
...params.authProfileStore.profiles,
},
};
}
@@ -377,7 +339,6 @@ export async function applyCodexAppServerAuthProfile(params: {
client: CodexAppServerClient;
agentDir: string;
authProfileId?: string | null;
authProfileStore?: AuthProfileStore;
startOptions?: CodexAppServerStartOptions;
config?: AuthProfileOrderConfig;
}): Promise<void> {
@@ -387,7 +348,6 @@ export async function applyCodexAppServerAuthProfile(params: {
const loginParams = await resolveCodexAppServerAuthProfileLoginParams({
agentDir: params.agentDir,
authProfileId: params.authProfileId,
authProfileStore: params.authProfileStore,
config: params.config,
});
if (!loginParams) {
@@ -411,7 +371,6 @@ export async function applyCodexAppServerAuthProfile(params: {
function resolveCodexAppServerAuthProfileLoginParams(params: {
agentDir: string;
authProfileId?: string;
authProfileStore?: AuthProfileStore;
config?: AuthProfileOrderConfig;
}): Promise<CodexLoginAccountParams | undefined> {
return resolveCodexAppServerAuthProfileLoginParamsInternal(params);
@@ -420,7 +379,6 @@ function resolveCodexAppServerAuthProfileLoginParams(params: {
export async function refreshCodexAppServerAuthTokens(params: {
agentDir: string;
authProfileId?: string;
authProfileStore?: AuthProfileStore;
config?: AuthProfileOrderConfig;
}): Promise<CodexChatgptAuthTokensRefreshResponse> {
const loginParams = await resolveCodexAppServerAuthProfileLoginParamsInternal({
@@ -440,14 +398,12 @@ export async function refreshCodexAppServerAuthTokens(params: {
async function resolveCodexAppServerAuthProfileLoginParamsInternal(params: {
agentDir: string;
authProfileId?: string;
authProfileStore?: AuthProfileStore;
forceOAuthRefresh?: boolean;
config?: AuthProfileOrderConfig;
}): Promise<CodexLoginAccountParams | undefined> {
const store = resolveCodexAppServerAuthProfileStore({
const store = ensureCodexAppServerAuthProfileStore({
agentDir: params.agentDir,
authProfileId: params.authProfileId,
authProfileStore: params.authProfileStore,
config: params.config,
});
const profileId = resolveCodexAppServerAuthProfileId({
@@ -469,8 +425,6 @@ async function resolveCodexAppServerAuthProfileLoginParamsInternal(params: {
}
const loginParams = await resolveLoginParamsForCredential(profileId, credential, {
agentDir: params.agentDir,
store,
preferStoreCredential: Boolean(params.authProfileStore?.profiles[profileId]),
forceOAuthRefresh: params.forceOAuthRefresh === true,
config: params.config,
});
@@ -555,22 +509,14 @@ function resolveCodexCliAuthFileApiKeyCacheKey(env: NodeJS.ProcessEnv): string |
async function resolveLoginParamsForCredential(
profileId: string,
credential: AuthProfileCredential,
params: {
agentDir: string;
store: AuthProfileStore;
preferStoreCredential: boolean;
forceOAuthRefresh: boolean;
config?: AuthProfileOrderConfig;
},
params: { agentDir: string; forceOAuthRefresh: boolean; config?: AuthProfileOrderConfig },
): Promise<CodexLoginAccountParams | undefined> {
// Runtime honors the persisted auth profile type. Shape-based remediation
// belongs at credential entry time so request handling does not preemptively
// reject opaque provider credentials.
if (credential.type === "api_key") {
const resolved = await resolveApiKeyForProfile({
store: params.preferStoreCredential
? params.store
: ensureAuthProfileStore(params.agentDir, { allowKeychainPrompt: false }),
store: ensureAuthProfileStore(params.agentDir, { allowKeychainPrompt: false }),
profileId,
agentDir: params.agentDir,
});
@@ -579,9 +525,7 @@ async function resolveLoginParamsForCredential(
}
if (credential.type === "token") {
const resolved = await resolveApiKeyForProfile({
store: params.preferStoreCredential
? params.store
: ensureAuthProfileStore(params.agentDir, { allowKeychainPrompt: false }),
store: ensureAuthProfileStore(params.agentDir, { allowKeychainPrompt: false }),
profileId,
agentDir: params.agentDir,
});
@@ -595,8 +539,6 @@ async function resolveLoginParamsForCredential(
}
const resolvedCredential = await resolveOAuthCredentialForCodexAppServer(profileId, credential, {
agentDir: params.agentDir,
store: params.store,
preferStoreCredential: params.preferStoreCredential,
forceRefresh: params.forceOAuthRefresh,
config: params.config,
});
@@ -609,40 +551,22 @@ async function resolveLoginParamsForCredential(
async function resolveOAuthCredentialForCodexAppServer(
profileId: string,
credential: OAuthCredential,
params: {
agentDir: string;
store: AuthProfileStore;
preferStoreCredential: boolean;
forceRefresh: boolean;
config?: AuthProfileOrderConfig;
},
params: { agentDir: string; forceRefresh: boolean; config?: AuthProfileOrderConfig },
): Promise<OAuthCredential> {
const ownerAgentDir = resolvePersistedAuthProfileOwnerAgentDir({
agentDir: params.agentDir,
profileId,
});
const persistedCredential = findPersistedAuthProfileCredential({
const store = ensureCodexAppServerAuthProfileStore({
agentDir: ownerAgentDir,
profileId,
authProfileId: profileId,
config: params.config,
});
const useScopedCredential =
params.preferStoreCredential &&
shouldUseScopedOAuthCredential({
store: params.store,
profileId,
persistedCredential,
suppliedCredential: credential,
config: params.config,
});
const store = useScopedCredential
? params.store
: ensureCodexAppServerAuthProfileStore({
agentDir: ownerAgentDir,
authProfileId: profileId,
config: params.config,
});
const persistedStore = ensureAuthProfileStoreWithoutExternalProfiles(ownerAgentDir, {
allowKeychainPrompt: false,
});
const persistedCredential = persistedStore.profiles[profileId];
const persistedOAuthCredential =
!useScopedCredential &&
persistedCredential?.type === "oauth" &&
isCodexAppServerAuthProvider(persistedCredential.provider, params.config)
? persistedCredential
@@ -653,14 +577,6 @@ async function resolveOAuthCredentialForCodexAppServer(
isCodexAppServerAuthProvider(ownerCredential.provider, params.config)
? ownerCredential
: undefined;
if (useScopedCredential && overlaidOAuthCredential) {
return await resolveScopedOAuthCredential({
store,
profileId,
credential: overlaidOAuthCredential,
forceRefresh: params.forceRefresh,
});
}
if (params.forceRefresh && !persistedOAuthCredential && overlaidOAuthCredential) {
const refreshedRuntimeCredential = await refreshOAuthCredentialForRuntime({
credential: overlaidOAuthCredential,
@@ -677,111 +593,18 @@ async function resolveOAuthCredentialForCodexAppServer(
agentDir: ownerAgentDir,
forceRefresh: params.forceRefresh && Boolean(persistedOAuthCredential),
});
const refreshed = useScopedCredential
? undefined
: loadAuthProfileStoreForSecretsRuntime(ownerAgentDir).profiles[profileId];
const refreshedOAuthCredential =
const refreshed = loadAuthProfileStoreForSecretsRuntime(ownerAgentDir).profiles[profileId];
const storedCredential = store.profiles[profileId];
const candidate =
refreshed?.type === "oauth" && isCodexAppServerAuthProvider(refreshed.provider, params.config)
? refreshed
: undefined;
if (refreshedOAuthCredential && isDeepStrictEqual(params.store.profiles[profileId], credential)) {
// Persisted refreshes rotate refresh tokens. Keep an isolated prepared
// store aligned without reverting a concurrent caller-owned replacement.
params.store.profiles[profileId] = refreshedOAuthCredential;
}
const storedCredential = store.profiles[profileId];
const candidate = refreshedOAuthCredential
? refreshedOAuthCredential
: storedCredential?.type === "oauth" &&
isCodexAppServerAuthProvider(storedCredential.provider, params.config)
? storedCredential
: credential;
: storedCredential?.type === "oauth" &&
isCodexAppServerAuthProvider(storedCredential.provider, params.config)
? storedCredential
: credential;
return resolved?.apiKey ? { ...candidate, access: resolved.apiKey } : candidate;
}
function shouldUseScopedOAuthCredential(params: {
store: AuthProfileStore;
profileId: string;
persistedCredential: AuthProfileCredential | undefined;
suppliedCredential: OAuthCredential;
config?: AuthProfileOrderConfig;
}): boolean {
if (!params.store.runtimePersistedProfileIds?.includes(params.profileId)) {
return true;
}
const persisted = params.persistedCredential;
if (persisted?.type !== "oauth") {
return true;
}
if (
resolveProviderIdForAuth(persisted.provider, { config: params.config }) !==
resolveProviderIdForAuth(params.suppliedCredential.provider, { config: params.config })
) {
return true;
}
return (
!isDeepStrictEqual(persisted, params.suppliedCredential) &&
!hasMatchingOAuthIdentity(persisted, params.suppliedCredential)
);
}
function hasMatchingOAuthIdentity(persisted: OAuthCredential, supplied: OAuthCredential): boolean {
const persistedAccountId = persisted.accountId?.trim();
const suppliedAccountId = supplied.accountId?.trim();
if (persistedAccountId && suppliedAccountId) {
return persistedAccountId === suppliedAccountId;
}
const persistedEmail = persisted.email?.trim().toLowerCase();
const suppliedEmail = supplied.email?.trim().toLowerCase();
return Boolean(persistedEmail && suppliedEmail && persistedEmail === suppliedEmail);
}
async function resolveScopedOAuthCredential(params: {
store: AuthProfileStore;
profileId: string;
credential: OAuthCredential;
forceRefresh: boolean;
}): Promise<OAuthCredential> {
const existingRefresh = scopedOAuthRefreshQueues.get(params.store)?.get(params.profileId);
if (existingRefresh) {
return await existingRefresh;
}
if (!params.forceRefresh && hasUsableOAuthCredential(params.credential)) {
return params.credential;
}
const storeRefreshes = scopedOAuthRefreshQueues.get(params.store) ?? new Map();
scopedOAuthRefreshQueues.set(params.store, storeRefreshes);
const refresh = (async () => {
const current = params.store.profiles[params.profileId];
const credential = current?.type === "oauth" ? current : params.credential;
if (!params.forceRefresh && hasUsableOAuthCredential(credential)) {
return credential;
}
const refreshed = await refreshOAuthCredentialForRuntime({ credential });
if (!refreshed?.access?.trim()) {
throw new Error(`Codex app-server auth profile "${params.profileId}" could not refresh.`);
}
if (!isDeepStrictEqual(params.store.profiles[params.profileId], credential)) {
throw new Error(
`Codex app-server auth profile "${params.profileId}" changed while refreshing.`,
);
}
params.store.profiles[params.profileId] = refreshed;
return refreshed;
})();
storeRefreshes.set(params.profileId, refresh);
try {
return await refresh;
} finally {
// Scoped stores are process-local; serialize their rotating refresh token
// and release the queue entry with the refresh that owns it.
if (storeRefreshes.get(params.profileId) === refresh) {
storeRefreshes.delete(params.profileId);
}
}
}
function isCodexAppServerAuthProvider(provider: string, config?: AuthProfileOrderConfig): boolean {
const resolvedProvider = resolveProviderIdForAuth(provider, { config });
return (
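
Review note: the `hasMatchingOAuthIdentity` helper shown in this hunk decides whether a supplied OAuth credential belongs to the same account as the persisted one. The sketch below restates that rule on a reduced, hypothetical `OAuthIdentity` shape (the real `OAuthCredential` type carries more fields) so the precedence is easy to check: account ids decide when both sides have one, otherwise emails are compared case-insensitively, and missing data never counts as a match.

```typescript
// Hypothetical reduced shape for illustration; not the repository's OAuthCredential type.
type OAuthIdentity = { accountId?: string; email?: string };

// Account ids win when both sides provide one. Otherwise fall back to a
// trimmed, case-insensitive email comparison. Absent values never match.
function sameOAuthIdentity(persisted: OAuthIdentity, supplied: OAuthIdentity): boolean {
  const persistedAccountId = persisted.accountId?.trim();
  const suppliedAccountId = supplied.accountId?.trim();
  if (persistedAccountId && suppliedAccountId) {
    return persistedAccountId === suppliedAccountId;
  }
  const persistedEmail = persisted.email?.trim().toLowerCase();
  const suppliedEmail = supplied.email?.trim().toLowerCase();
  return Boolean(persistedEmail && suppliedEmail && persistedEmail === suppliedEmail);
}
```

Note that two credentials with differing account ids are treated as different identities even when their emails agree, because the email fallback only applies when at least one side has no account id.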


@@ -15,7 +15,6 @@ import {
includeForcedCodexDynamicToolAllow,
resetOpenClawCodingToolsFactoryForTests,
resolveOpenClawCodingToolsSessionKeys,
resolveCodexMessageToolProvider,
setOpenClawCodingToolsFactoryForTests,
shouldEnableCodexAppServerNativeToolSurface,
shouldForceMessageTool,
@@ -133,15 +132,6 @@ describe("Codex app-server dynamic tool build", () => {
await fs.rm(tempDir, { recursive: true, force: true });
});
it("uses the message tool channel before a differing ingress provider", () => {
expect(
resolveCodexMessageToolProvider({
messageChannel: "discord",
messageProvider: "discord-voice",
}),
).toBe("discord");
});
it("filters Codex-native dynamic tools from app-server tool exposure", () => {
const tools = [
"read",
@@ -559,28 +549,6 @@ describe("Codex app-server dynamic tool build", () => {
);
});
it("passes native and routable channel targets into Codex dynamic tools", async () => {
const sessionFile = path.join(tempDir, "session.jsonl");
const workspaceDir = path.join(tempDir, "workspace");
const params = createParams(sessionFile, workspaceDir);
params.disableTools = false;
params.currentChannelId = "D123";
params.currentMessagingTarget = "user:U123";
params.runtimePlan = createCodexRuntimePlanFixture();
const factoryOptions: unknown[] = [];
setOpenClawCodingToolsFactoryForTests((options) => {
factoryOptions.push(options);
return [];
});
await buildDynamicToolsForTest(params, workspaceDir, { sandbox: null as never });
expect(factoryOptions[0]).toMatchObject({
currentChannelId: "D123",
currentMessagingTarget: "user:U123",
});
});
it("passes runtime config into Codex exec dynamic tool construction", async () => {
const sessionFile = path.join(tempDir, "session.jsonl");
const workspaceDir = path.join(tempDir, "workspace");


@@ -96,13 +96,6 @@ export function resolveOpenClawCodingToolsSessionKeys(
};
}
/** Returns the canonical channel used for Codex message routing and receipts. */
export function resolveCodexMessageToolProvider(
params: Pick<EmbeddedRunAttemptParams, "messageChannel" | "messageProvider">,
): string | undefined {
return params.messageChannel ?? params.messageProvider;
}
/** Resolves the channel id that hook events should target for this Codex app-server turn. */
export function resolveCodexAppServerHookChannelId(
params: EmbeddedRunAttemptParams,
@@ -216,8 +209,7 @@ export async function buildDynamicTools(input: DynamicToolBuildParams) {
elevated: params.bashElevated,
},
sandbox: input.sandbox,
messageProvider: resolveCodexMessageToolProvider(params),
toolPolicyMessageProvider: params.messageProvider ?? params.messageChannel,
messageProvider: params.messageChannel ?? params.messageProvider,
agentAccountId: params.agentAccountId,
messageTo: params.messageTo,
messageThreadId: params.messageThreadId,
@@ -266,7 +258,6 @@ export async function buildDynamicTools(input: DynamicToolBuildParams) {
),
suppressManagedWebSearch: false,
currentChannelId: params.currentChannelId,
currentMessagingTarget: params.currentMessagingTarget,
hookChannelId: resolveCodexAppServerHookChannelId(params, input.sandboxSessionKey),
currentThreadTs: params.currentThreadTs,
currentMessageId: params.currentMessageId,
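
Review note: this hunk shows two fallback orders over the same pair of fields, one preferring `messageChannel` and one preferring `messageProvider`. The sketch below uses a hypothetical `RoutingParams` type and hypothetical function names to make the difference concrete. It is only an illustration of the `??` precedence, not the repository's code.

```typescript
// Hypothetical reduced parameter shape for illustration.
type RoutingParams = { messageChannel?: string; messageProvider?: string };

// Routing and receipts: prefer the canonical channel, fall back to the ingress provider.
function resolveRoutingProvider(params: RoutingParams): string | undefined {
  return params.messageChannel ?? params.messageProvider;
}

// Tool policy: prefer the ingress provider, fall back to the channel.
function resolvePolicyProvider(params: RoutingParams): string | undefined {
  return params.messageProvider ?? params.messageChannel;
}
```

Because `??` only falls through on `null` or `undefined`, an empty string in the preferred field is returned as-is rather than triggering the fallback.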


@@ -183,7 +183,6 @@ describe("dynamic tool execution helpers", () => {
vi.useFakeTimers();
let capturedSignal: AbortSignal | undefined;
const onTimeout = vi.fn();
const onAgentToolResult = vi.fn();
const response = handleDynamicToolCallWithTimeout({
call: {
threadId: "thread-1",
@@ -201,7 +200,6 @@ describe("dynamic tool execution helpers", () => {
},
signal: new AbortController().signal,
timeoutMs: 1,
onAgentToolResult,
onTimeout,
});
@@ -218,64 +216,6 @@ describe("dynamic tool execution helpers", () => {
});
expect(capturedSignal?.aborted).toBe(true);
expect(onTimeout).toHaveBeenCalledTimes(1);
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "message",
result: {
content: [
{
type: "text",
text: "OpenClaw dynamic tool call timed out after 1ms while running tool message.",
},
],
details: {
status: "failed",
error: "OpenClaw dynamic tool call timed out after 1ms while running tool message.",
},
},
isError: true,
});
});
it("reports pre-execution aborts to the private result observer", async () => {
const controller = new AbortController();
controller.abort(new Error("run cancelled"));
const onAgentToolResult = vi.fn();
const handleToolCall = vi.fn();
const result = await handleDynamicToolCallWithTimeout({
call: {
threadId: "thread-1",
turnId: "turn-1",
callId: "call-aborted",
namespace: null,
tool: "memory_search",
arguments: {},
},
toolBridge: { handleToolCall },
signal: controller.signal,
timeoutMs: 1_000,
onAgentToolResult,
});
expect(result).toEqual({
success: false,
contentItems: [
{ type: "inputText", text: "OpenClaw dynamic tool call aborted before execution." },
],
});
expect(handleToolCall).not.toHaveBeenCalled();
expect(onAgentToolResult).toHaveBeenCalledOnce();
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "memory_search",
result: {
content: [{ type: "text", text: "OpenClaw dynamic tool call aborted before execution." }],
details: {
status: "failed",
error: "OpenClaw dynamic tool call aborted before execution.",
},
},
isError: true,
});
});
it("logs process poll timeout context separately from session idle", async () => {


@@ -126,41 +126,10 @@ export async function handleDynamicToolCallWithTimeout(params: {
toolBridge: Pick<CodexDynamicToolBridge, "handleToolCall">;
signal: AbortSignal;
timeoutMs: number;
onAgentToolResult?: EmbeddedRunAttemptParams["onAgentToolResult"];
onTimeout?: () => void;
}): Promise<CodexDynamicToolCallResponse> {
// Timeout or run abort can win while a tool ignores cancellation. Keep the
// private observer terminal result exactly once across those competing paths.
let didNotifyAgentToolResult = false;
const notifyAgentToolResult = (
event: Parameters<NonNullable<EmbeddedRunAttemptParams["onAgentToolResult"]>>[0],
) => {
if (didNotifyAgentToolResult) {
return;
}
didNotifyAgentToolResult = true;
try {
params.onAgentToolResult?.(event);
} catch (error) {
embeddedAgentLog.warn(
`onAgentToolResult handler failed: tool=${params.call.tool} error=${String(error)}`,
);
}
};
const notifyFailedToolResult = (message: string) => {
notifyAgentToolResult({
toolName: params.call.tool,
result: {
content: [{ type: "text", text: message }],
details: { status: "failed", error: message },
},
isError: true,
});
};
if (params.signal.aborted) {
const message = "OpenClaw dynamic tool call aborted before execution.";
notifyFailedToolResult(message);
return failedDynamicToolResponse(message);
return failedDynamicToolResponse("OpenClaw dynamic tool call aborted before execution.");
}
const controller = new AbortController();
@@ -170,7 +139,6 @@ export async function handleDynamicToolCallWithTimeout(params: {
const abortFromRun = () => {
const message = "OpenClaw dynamic tool call aborted.";
controller.abort(params.signal.reason ?? new Error(message));
notifyFailedToolResult(message);
resolveAbort?.(failedDynamicToolResponse(message, { sideEffectEvidence: true }));
};
const abortPromise = new Promise<CodexDynamicToolCallResponse>((resolve) => {
@@ -187,7 +155,6 @@ export async function handleDynamicToolCallWithTimeout(params: {
...timeoutDetails.meta,
consoleMessage: timeoutDetails.consoleMessage,
});
notifyFailedToolResult(timeoutDetails.responseMessage);
resolve(
failedDynamicToolResponse(timeoutDetails.responseMessage, { sideEffectEvidence: true }),
);
@@ -200,22 +167,13 @@ export async function handleDynamicToolCallWithTimeout(params: {
if (params.signal.aborted) {
abortFromRun();
}
const response = await Promise.race([
params.toolBridge.handleToolCall(params.call, {
signal: controller.signal,
onAgentToolResult: notifyAgentToolResult,
}),
return await Promise.race([
params.toolBridge.handleToolCall(params.call, { signal: controller.signal }),
abortPromise,
timeoutPromise,
]);
if (!response.success && !didNotifyAgentToolResult) {
notifyFailedToolResult(readDynamicToolResponseText(response));
}
return response;
} catch (error) {
const message = error instanceof Error ? error.message : String(error);
notifyFailedToolResult(message);
return failedDynamicToolResponse(message, {
return failedDynamicToolResponse(error instanceof Error ? error.message : String(error), {
sideEffectEvidence: true,
});
} finally {
@@ -230,16 +188,6 @@ export async function handleDynamicToolCallWithTimeout(params: {
}
}
function readDynamicToolResponseText(response: CodexDynamicToolCallResponse): string {
const text = response.contentItems
.flatMap((item) =>
item.type === "inputText" && typeof item.text === "string" ? [item.text] : [],
)
.join("\n")
.trim();
return text || "OpenClaw dynamic tool call failed.";
}
function failedDynamicToolResponse(
message: string,
options?: { sideEffectEvidence?: boolean },
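
Review note: the `notifyAgentToolResult` closure in this file guards the observer so that competing timeout, abort, and completion paths report a terminal result at most once, and so that a throwing observer cannot break the tool call. A minimal standalone sketch of that pattern follows. `ToolResultEvent`, `createOnceNotifier`, and the `warn` callback are hypothetical names chosen for illustration and stand in for the real event type and `embeddedAgentLog.warn`.

```typescript
// Hypothetical reduced event shape for illustration.
type ToolResultEvent = { toolName: string; message: string; isError: boolean };

// Wraps an observer so only the first terminal event is delivered and
// observer exceptions are routed to `warn` instead of propagating.
function createOnceNotifier(
  observer: ((event: ToolResultEvent) => void) | undefined,
  warn: (message: string) => void,
): (event: ToolResultEvent) => void {
  let didNotify = false;
  return (event) => {
    if (didNotify) {
      return;
    }
    // Set the flag before calling out so a throwing observer still counts as notified.
    didNotify = true;
    try {
      observer?.(event);
    } catch (error) {
      warn(`observer failed: tool=${event.toolName} error=${String(error)}`);
    }
  };
}
```

Setting the flag before invoking the observer is the design choice that matters here: if the flag were set afterwards, an observer that throws would be called again by the next competing path.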


@@ -18,7 +18,6 @@ import {
import {
createEmptyPluginRegistry,
createMockPluginRegistry,
createTestRegistry,
setActivePluginRegistry,
} from "openclaw/plugin-sdk/plugin-test-runtime";
import { afterEach, describe, expect, it, vi } from "vitest";
@@ -223,7 +222,6 @@ describe("createCodexDynamicToolBridge", () => {
it("can register a durable tool schema while denying execution for the current turn", async () => {
const heartbeatExecute = vi.fn(async () => textToolResult("heartbeat recorded"));
const onAgentToolResult = vi.fn();
const bridge = createCodexDynamicToolBridge({
tools: [createTool({ name: "message" })],
registeredTools: [
@@ -239,17 +237,14 @@ describe("createCodexDynamicToolBridge", () => {
HEARTBEAT_RESPONSE_TOOL_NAME,
]);
const result = await bridge.handleToolCall(
{
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: HEARTBEAT_RESPONSE_TOOL_NAME,
arguments: {},
},
{ onAgentToolResult },
);
const result = await bridge.handleToolCall({
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: HEARTBEAT_RESPONSE_TOOL_NAME,
arguments: {},
});
expect(result).toEqual({
success: false,
@@ -261,22 +256,6 @@ describe("createCodexDynamicToolBridge", () => {
],
});
expect(heartbeatExecute).not.toHaveBeenCalled();
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: HEARTBEAT_RESPONSE_TOOL_NAME,
result: {
content: [
{
type: "text",
text: `OpenClaw tool is not available for this turn: ${HEARTBEAT_RESPONSE_TOOL_NAME}`,
},
],
details: {
status: "failed",
error: `OpenClaw tool is not available for this turn: ${HEARTBEAT_RESPONSE_TOOL_NAME}`,
},
},
isError: true,
});
});
it("keeps available and registered schemas paired with their tools", () => {
@@ -799,163 +778,6 @@ describe("createCodexDynamicToolBridge", () => {
]);
});
it("records the current provider and transport thread for implicit message sends", async () => {
const hasRepliedRef = { value: false };
setActivePluginRegistry(
createTestRegistry([
{
pluginId: "slack",
plugin: {
id: "slack",
messaging: { normalizeTarget: (raw: string) => raw.trim().toLowerCase() },
threading: {
resolveAutoThreadId: ({
to,
toolContext,
}: {
to: string;
toolContext?: {
currentChannelId?: string;
currentMessagingTarget?: string;
currentThreadTs?: string;
replyToMode?: "off" | "first" | "all" | "batched";
hasRepliedRef?: { value: boolean };
};
}) => {
if (
to !== toolContext?.currentMessagingTarget &&
to !== toolContext?.currentChannelId
) {
return undefined;
}
if (
(toolContext?.replyToMode === "first" ||
toolContext?.replyToMode === "batched") &&
!toolContext.hasRepliedRef?.value
) {
return toolContext.currentThreadTs;
}
return undefined;
},
},
},
source: "test",
},
]),
);
const bridge = createCodexDynamicToolBridge({
tools: [
createTool({
name: "message",
execute: vi.fn(async () => {
hasRepliedRef.value = true;
return textToolResult("Sent.");
}),
}),
],
signal: new AbortController().signal,
hookContext: {
currentChannelProvider: "slack",
currentChannelId: "D1",
currentMessagingTarget: "user:u1",
currentThreadId: "171.222",
replyToMode: "first",
hasRepliedRef,
},
});
await handleMessageToolCall(bridge, {
action: "send",
to: "user:U1",
text: "hello from Codex",
});
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
{
tool: "message",
provider: "slack",
to: "user:u1",
threadId: "171.222",
threadImplicit: true,
text: "hello from Codex",
},
]);
});
it("records the provider-confirmed route for successful message sends", async () => {
const registry = createTestRegistry([
{
pluginId: "mattermost",
plugin: {
id: "mattermost",
messaging: { normalizeTarget: (raw: string) => raw.trim().toLowerCase() },
actions: {
extractToolSend: ({ args }: { args: Record<string, unknown> }) =>
args.action === "send" && typeof args.to === "string"
? { to: args.to, threadImplicit: true }
: null,
extractToolSendResult: ({ result }: { result: unknown }) => {
const details = requireRecord(
requireRecord(result, "message result").details,
"message details",
);
const toolSend = requireRecord(details.toolSend, "tool send details");
return {
to: String(toolSend.to),
threadId: String(toolSend.threadId),
};
},
},
},
source: "test",
},
]);
const middleware = vi.fn(async (event: { result: AgentToolResult<unknown> }) => {
const details = requireRecord(event.result.details, "middleware details");
const toolSend = requireRecord(details.toolSend, "middleware tool send");
toolSend.to = "channel:corrupted";
toolSend.threadId = "corrupted-root";
return undefined;
});
registry.agentToolResultMiddlewares.push({
pluginId: "route-details-stripper",
pluginName: "Route details stripper",
rawHandler: middleware,
handler: middleware,
runtimes: ["codex"],
source: "test",
});
setActivePluginRegistry(registry);
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", {
toolSend: {
to: "channel:resolved-id",
threadId: "root-post-id",
},
}),
);
await handleMessageToolCall(bridge, {
action: "send",
provider: "mattermost",
to: "town-square",
text: "hello from Codex",
});
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
{
tool: "message",
provider: "mattermost",
to: "channel:resolved-id",
threadId: "root-post-id",
threadImplicit: undefined,
threadSuppressed: undefined,
text: "hello from Codex",
},
]);
});
it("records message tool media attachment aliases as delivery evidence", async () => {
const toolResult = {
content: [{ type: "text", text: "Sent." }],
@@ -1205,152 +1027,6 @@ describe("createCodexDynamicToolBridge", () => {
expectContextFields(callArg(handler, 0, 1, "middleware context"), { runtime: "codex" });
});
it("keeps unrecognized non-success statuses fail-closed", async () => {
const onAgentToolResult = vi.fn();
const bridge = createCodexDynamicToolBridge({
tools: [
createTool({
name: "exec",
execute: vi.fn(async () =>
textToolResult("Approval is unavailable.", { status: "approval-unavailable" }),
),
}),
],
signal: new AbortController().signal,
});
const result = await bridge.handleToolCall(
{
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: "exec",
arguments: { command: "pwd" },
},
{ onAgentToolResult },
);
expect(result).toMatchObject({ success: false });
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "exec",
result: textToolResult("Approval is unavailable.", { status: "approval-unavailable" }),
isError: true,
});
});
it("preserves explicitly successful cancellation outcomes", async () => {
const onAgentToolResult = vi.fn();
const cancelledResult = textToolResult("Approval rejected.", {
ok: true,
status: "cancelled",
});
const bridge = createCodexDynamicToolBridge({
tools: [
createTool({
name: "lobster",
execute: vi.fn(async () => cancelledResult),
}),
],
signal: new AbortController().signal,
});
const result = await bridge.handleToolCall(
{
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: "lobster",
arguments: {},
},
{ onAgentToolResult },
);
expect(result).toMatchObject({ success: true });
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "lobster",
result: cancelledResult,
isError: false,
});
});
it("reports sanitized dynamic tool results to the private result observer", async () => {
const onAgentToolResult = vi.fn();
const bridge = createCodexDynamicToolBridge({
tools: [
createTool({
name: "memory_lookup_custom",
execute: vi.fn(async () =>
textToolResult("OPENROUTER_API_KEY=sk-or-v1-abcdef0123456789", {
status: "failed",
error: "backend unavailable",
}),
),
}),
],
signal: new AbortController().signal,
});
await bridge.handleToolCall(
{
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: "memory_lookup_custom",
arguments: {},
},
{ onAgentToolResult },
);
expect(onAgentToolResult).toHaveBeenCalledOnce();
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "memory_lookup_custom",
result: {
content: [{ type: "text", text: "OPENROUTER_API_KEY=sk-or-…6789" }],
details: { status: "failed", error: "backend unavailable" },
},
isError: true,
});
});
it("reports thrown dynamic tool failures to the private result observer", async () => {
const onAgentToolResult = vi.fn();
const bridge = createCodexDynamicToolBridge({
tools: [
createTool({
name: "memory_lookup_custom",
execute: vi.fn(async () => {
throw new Error("backend unavailable");
}),
}),
],
signal: new AbortController().signal,
});
await bridge.handleToolCall(
{
threadId: "thread-1",
turnId: "turn-1",
callId: "call-1",
namespace: null,
tool: "memory_lookup_custom",
arguments: {},
},
{ onAgentToolResult },
);
expect(onAgentToolResult).toHaveBeenCalledWith({
toolName: "memory_lookup_custom",
result: {
content: [{ type: "text", text: "backend unavailable" }],
details: { status: "failed", error: "backend unavailable" },
},
isError: true,
});
});
it("preserves terminal async tool results without marking them as errors", async () => {
const bridge = createBridgeWithToolResult("image_generate", {
content: [{ type: "text", text: "Background task started." }],


@@ -6,21 +6,17 @@ import type { AgentToolResult } from "openclaw/plugin-sdk/agent-core";
import {
createAgentToolResultMiddlewareRunner,
createCodexAppServerToolResultExtensionRunner,
extractMessagingToolSend,
extractMessagingToolSendResult,
extractToolResultMediaArtifact,
filterToolResultMediaUrls,
HEARTBEAT_RESPONSE_TOOL_NAME,
embeddedAgentLog,
type EmbeddedRunAttemptParams,
isToolWrappedWithBeforeToolCallHook,
isToolResultError,
isMessagingTool,
isMessagingToolSendAction,
normalizeHeartbeatToolResponse,
projectRuntimeToolInputSchema,
runAgentHarnessAfterToolCallHook,
sanitizeToolResult,
setBeforeToolCallDiagnosticsEnabled,
type AnyAgentTool,
type HeartbeatToolResponse,
@@ -53,12 +49,6 @@ type CodexDynamicToolHookContext = {
sessionKey?: string;
runId?: string;
channelId?: string;
currentChannelProvider?: string;
currentChannelId?: string;
currentMessagingTarget?: string;
currentThreadId?: string;
replyToMode?: "off" | "first" | "all" | "batched";
hasRepliedRef?: { value: boolean };
};
type CodexToolResultHookContext = Omit<CodexDynamicToolHookContext, "config">;
@@ -75,32 +65,13 @@ type CodexDynamicToolSchemaQuarantine = {
violations: readonly string[];
};
function applyCurrentMessageProvider(
toolName: string,
args: Record<string, unknown>,
currentProvider: string | undefined,
): Record<string, unknown> {
const hasProvider =
typeof args.provider === "string" && args.provider.trim().length > 0
? true
: typeof args.channel === "string" && args.channel.trim().length > 0;
const provider = currentProvider?.trim();
if (toolName !== "message" || hasProvider || !provider) {
return args;
}
return { ...args, provider };
}
/** Runtime bridge returned to Codex app-server attempt code. */
export type CodexDynamicToolBridge = {
availableSpecs: CodexDynamicToolSpec[];
specs: CodexDynamicToolSpec[];
handleToolCall: (
params: CodexDynamicToolCallParams,
options?: {
signal?: AbortSignal;
onAgentToolResult?: EmbeddedRunAttemptParams["onAgentToolResult"];
},
options?: { signal?: AbortSignal },
) => Promise<CodexDynamicToolCallResponse>;
telemetry: {
didSendViaMessagingTool: boolean;
@@ -184,6 +155,7 @@ export function createCodexDynamicToolBridge(params: {
...ALWAYS_DIRECT_DYNAMIC_TOOL_NAMES,
...(params.directToolNames ?? []),
]);
return {
availableSpecs: availableTools.map((entry) =>
createCodexDynamicToolSpec({
@@ -203,28 +175,19 @@ export function createCodexDynamicToolBridge(params: {
handleToolCall: async (call, options) => {
const toolEntry = toolMap.get(call.tool);
if (!toolEntry) {
const message = registeredToolNames.has(call.tool)
? `OpenClaw tool is not available for this turn: ${call.tool}`
: `Unknown OpenClaw tool: ${call.tool}`;
notifyAgentToolResult(
options?.onAgentToolResult,
call.tool,
failedToolResult(message),
true,
);
if (registeredToolNames.has(call.tool)) {
return {
contentItems: [
{
type: "inputText",
text: message,
text: `OpenClaw tool is not available for this turn: ${call.tool}`,
},
],
success: false,
};
}
return {
contentItems: [{ type: "inputText", text: message }],
contentItems: [{ type: "inputText", text: `Unknown OpenClaw tool: ${call.tool}` }],
success: false,
};
}
@@ -237,30 +200,9 @@ export function createCodexDynamicToolBridge(params: {
// Prepare before marking side-effect evidence; argument preparation can
// fail without the target tool actually starting.
const preparedArgs = tool.prepareArguments ? tool.prepareArguments(args) : args;
const telemetryArgs = isRecord(preparedArgs) ? preparedArgs : args;
const messagingTelemetryArgs = applyCurrentMessageProvider(
toolName,
telemetryArgs,
params.hookContext?.currentChannelProvider,
);
const messagingTarget =
isMessagingTool(toolName) && isMessagingToolSendAction(toolName, telemetryArgs)
? extractMessagingToolSend(toolName, messagingTelemetryArgs, {
config: params.hookContext?.config,
currentChannelId: params.hookContext?.currentChannelId,
currentMessagingTarget: params.hookContext?.currentMessagingTarget,
currentThreadId: params.hookContext?.currentThreadId,
replyToMode: params.hookContext?.replyToMode,
hasRepliedRef: params.hookContext?.hasRepliedRef,
})
: undefined;
didStartExecution = true;
const rawResult = await tool.execute(call.callId, preparedArgs, signal);
const rawIsError = isCodexToolResultError(rawResult);
const confirmedMessagingTarget =
!rawIsError && messagingTarget
? extractMessagingToolSendResult(messagingTarget, rawResult)
: messagingTarget;
const rawIsError = isToolResultError(rawResult);
const middlewareResult = await middlewareRunner.applyToolResultMiddleware({
threadId: call.threadId,
turnId: call.turnId,
@@ -278,16 +220,14 @@ export function createCodexDynamicToolBridge(params: {
args,
result: middlewareResult,
});
const resultIsError = rawIsError || isCodexToolResultError(result);
notifyAgentToolResult(options?.onAgentToolResult, toolName, result, resultIsError);
const resultIsError = rawIsError || isToolResultError(result);
collectToolTelemetry({
toolName,
args: telemetryArgs,
args,
result,
mediaTrustResult: rawResult,
telemetry,
isError: resultIsError,
messagingTarget: confirmedMessagingTarget,
});
void runAgentHarnessAfterToolCallHook({
toolName,
@@ -322,13 +262,6 @@ export function createCodexDynamicToolBridge(params: {
);
return withSideEffectEvidence(response, terminalType !== "blocked");
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error);
notifyAgentToolResult(
options?.onAgentToolResult,
toolName,
failedToolResult(errorMessage),
true,
);
collectToolTelemetry({
toolName,
args,
@@ -345,7 +278,7 @@ export function createCodexDynamicToolBridge(params: {
sessionKey: toolResultHookContext.sessionKey,
channelId: toolResultHookContext.channelId,
startArgs: args,
error: errorMessage,
error: error instanceof Error ? error.message : String(error),
startedAt,
});
return withSideEffectEvidence(
@@ -354,7 +287,7 @@ export function createCodexDynamicToolBridge(params: {
contentItems: [
{
type: "inputText",
text: errorMessage,
text: error instanceof Error ? error.message : String(error),
},
],
success: false,
@@ -368,32 +301,6 @@ export function createCodexDynamicToolBridge(params: {
};
}
function notifyAgentToolResult(
observer: EmbeddedRunAttemptParams["onAgentToolResult"] | undefined,
toolName: string,
result: unknown,
isError: boolean,
) {
try {
observer?.({
toolName,
result: sanitizeToolResult(result),
isError,
});
} catch (error) {
embeddedAgentLog.warn(
`onAgentToolResult handler failed: tool=${toolName} error=${String(error)}`,
);
}
}
function failedToolResult(message: string): AgentToolResult<unknown> {
return {
content: [{ type: "text", text: message }],
details: { status: "failed", error: message },
};
}
function wrapProjectedCodexDynamicTools(
tools: readonly ProjectedCodexDynamicTool[],
hookContext: CodexDynamicToolHookContext | undefined,
@@ -677,7 +584,6 @@ function collectToolTelemetry(params: {
mediaTrustResult?: AgentToolResult<unknown>;
telemetry: CodexDynamicToolBridge["telemetry"];
isError: boolean;
messagingTarget?: MessagingToolSend;
}): void {
if (params.isError) {
return;
@@ -730,13 +636,11 @@ function collectToolTelemetry(params: {
const mediaUrls = collectMediaUrls(params.args);
params.telemetry.messagingToolSentMediaUrls.push(...mediaUrls);
params.telemetry.messagingToolSentTargets.push({
...(params.messagingTarget ?? {
tool: params.toolName,
provider: readFirstString(params.args, ["provider", "channel"]) ?? params.toolName,
accountId: readFirstString(params.args, ["accountId", "account_id"]),
to: readFirstString(params.args, ["to", "target", "recipient"]),
threadId: readFirstString(params.args, ["threadId", "thread_id", "messageThreadId"]),
}),
tool: params.toolName,
provider: readFirstString(params.args, ["provider", "channel"]) ?? params.toolName,
accountId: readFirstString(params.args, ["accountId", "account_id"]),
to: readFirstString(params.args, ["to", "target", "recipient"]),
threadId: readFirstString(params.args, ["threadId", "thread_id", "messageThreadId"]),
...(text ? { text } : {}),
...(mediaUrls.length > 0 ? { mediaUrls } : {}),
});
@@ -784,17 +688,11 @@ function readPositiveInteger(value: unknown): number | undefined {
return Math.floor(value);
}
function isCodexToolResultError(result: AgentToolResult<unknown>): boolean {
if (isToolResultError(result)) {
return true;
}
function isToolResultError(result: AgentToolResult<unknown>): boolean {
const details = result.details;
if (!isRecord(details)) {
return false;
}
if (details.ok === true || details.success === true) {
return false;
}
if (details.timedOut === true) {
return true;
}
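
Review note: `applyCurrentMessageProvider` in this file fills in a provider for `message` tool arguments only when the call did not name one itself. The sketch below restates that rule under a hypothetical name, `withDefaultProvider`, to make the three no-op cases explicit: a different tool, an explicit `provider` or `channel` argument, or no current provider to apply.

```typescript
// Returns true only for strings that contain something other than whitespace.
function isNonBlankString(value: unknown): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// Hypothetical restatement for illustration: add the current provider to
// `message` tool args unless the caller already named a provider or channel.
function withDefaultProvider(
  toolName: string,
  args: Record<string, unknown>,
  currentProvider: string | undefined,
): Record<string, unknown> {
  const hasProvider = isNonBlankString(args.provider) || isNonBlankString(args.channel);
  const provider = currentProvider?.trim();
  if (toolName !== "message" || hasProvider || !provider) {
    // Return the same object so callers can detect that nothing changed.
    return args;
  }
  return { ...args, provider };
}
```

The input object is never mutated: when a provider is applied, the function returns a shallow copy, which keeps the arguments passed to the tool separate from the arguments used for telemetry.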


@@ -203,191 +203,10 @@ describe("CodexNativeSubagentMonitor", () => {
task: "inspect the repo",
}),
);
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
progressSummary: "Codex native subagent is idle.",
}),
);
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it.each([
{ label: "remote V1", codexHome: undefined, finalizes: true },
{ label: "local transcript-backed V1", codexHome: "/tmp/codex-home", finalizes: false },
])(
"uses collab completion as a terminal fallback only for $label",
async ({ codexHome, finalizes }) => {
const client = createClient();
const runtime = createRuntime();
const monitor = new CodexNativeSubagentMonitor(client, runtime, { codexHome });
monitor.registerParent({
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:main",
taskRuntimeScope: createTaskScope("agent:main:main"),
agentId: "main",
});
await notifyChildStarted(client, "parent-thread", "child-thread", "");
await client.notify({
method: "item/completed",
params: {
threadId: "parent-thread",
item: {
type: "collabAgentToolCall",
tool: "wait",
senderThreadId: "parent-thread",
agentsStates: {
"child-thread": {
status: "completed",
message: "child final result",
},
},
},
},
});
if (finalizes) {
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
status: "succeeded",
terminalSummary: "child final result",
}),
);
} else {
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
progressSummary: "child final result",
}),
);
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
}
monitor.dispose();
},
);
it("does not complete mirrored task rows from idle status before native completion", async () => {
const client = createClient();
const runtime = createRuntime();
const monitor = new CodexNativeSubagentMonitor(client, runtime);
monitor.registerParent({
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:discord:channel:C123",
taskRuntimeScope: createTaskScope(),
agentId: "main",
});
await notifyChildStarted(client);
await client.notify({
method: "thread/status/changed",
params: {
threadId: "child-thread",
status: { type: "idle" },
},
});
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
expect(runtime.deliverAgentHarnessTaskCompletion).not.toHaveBeenCalled();
await client.notify(
nativeCompletionNotification({
agentPath: "child-thread",
statusLabel: "completed",
result: "child final result",
}),
);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
status: "succeeded",
terminalSummary: "child final result",
}),
);
expect(runtime.deliverAgentHarnessTaskCompletion).toHaveBeenCalledWith(
expect.objectContaining({
childSessionId: "child-thread",
result: "child final result",
}),
);
});
it("keeps late idle lifecycle updates from overwriting native completion results", async () => {
const client = createClient();
const runtime = createRuntime();
const monitor = new CodexNativeSubagentMonitor(client, runtime);
monitor.registerParent({
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:discord:channel:C123",
taskRuntimeScope: createTaskScope(),
agentId: "main",
});
await notifyChildStarted(client);
await client.notify(
nativeCompletionNotification({
agentPath: "child-thread",
statusLabel: "completed",
result: "child final result",
}),
);
runtime.recordTaskRunProgressByRunId.mockClear();
await client.notify({
method: "thread/status/changed",
params: {
threadId: "child-thread",
status: { type: "idle" },
},
});
expect(runtime.recordTaskRunProgressByRunId).not.toHaveBeenCalled();
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledTimes(1);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
status: "succeeded",
terminalSummary: "child final result",
}),
);
});
it("keeps later lifecycle errors from rewriting native completion results", async () => {
const client = createClient();
const runtime = createRuntime();
const monitor = new CodexNativeSubagentMonitor(client, runtime);
monitor.registerParent({
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:discord:channel:C123",
taskRuntimeScope: createTaskScope(),
agentId: "main",
});
await notifyChildStarted(client);
await client.notify(
nativeCompletionNotification({
agentPath: "child-thread",
statusLabel: "completed",
result: "child final result",
}),
);
await client.notify({
method: "thread/status/changed",
params: {
threadId: "child-thread",
status: { type: "systemError" },
},
});
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledTimes(1);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
status: "succeeded",
terminalSummary: "child final result",
}),
);
});


@@ -493,9 +493,6 @@ export class CodexNativeSubagentMonitor {
if (!taskRuntime) {
return;
}
this.getMirrorForChild(completion.childThreadId)?.markAuthoritativeCompletion(
completion.childThreadId,
);
taskRuntime.finalizeTaskRunByRunId({
runId: codexNativeSubagentRunId(completion.childThreadId),
status: completion.status,
@@ -513,12 +510,6 @@ export class CodexNativeSubagentMonitor {
return state?.taskRuntime;
}
private getMirrorForChild(childThreadId: string): CodexNativeSubagentTaskMirror | undefined {
const childState = this.childStates.get(childThreadId.trim());
const state = childState ? this.parentStates.get(childState.parentThreadId) : undefined;
return state?.mirror;
}
private registerChildThread(
parentThreadId: string,
childThreadId: string,
@@ -535,10 +526,6 @@ export class CodexNativeSubagentMonitor {
normalizedChildThreadId,
);
const agentPath = normalizeOptionalString(options.agentPath);
const state = this.parentStates.get(normalizedParentThreadId);
if (state?.mirror && (this.codexHome || agentPath)) {
state.mirror.markAuthoritativeCompletionExpected(normalizedChildThreadId);
}
if (agentPath) {
this.childThreadIdsByAgentPath.set(
buildParentAgentPathKey(normalizedParentThreadId, agentPath),


@@ -26,6 +26,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.handleNotification({
method: "thread/started",
params: {
@@ -81,6 +82,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.handleNotification({
method: "thread/started",
params: {
@@ -103,45 +105,6 @@ describe("CodexNativeSubagentTaskMirror", () => {
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("finalizes collab completion when no authoritative result path is available", () => {
const runtime = createRuntime();
const mirror = new CodexNativeSubagentTaskMirror(
{
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:main",
now: () => 44_000,
},
runtime,
);
mirror.handleNotification({
method: "item/completed",
params: {
threadId: "parent-thread",
item: {
type: "collabAgentToolCall",
tool: "spawn_agent",
prompt: "inspect one thing",
agentsStates: {
"child-thread": {
status: "completed",
message: "done",
},
},
},
},
});
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
runId: "codex-thread:child-thread",
status: "succeeded",
endedAt: 44_000,
lastEventAt: 44_000,
progressSummary: "done",
terminalSummary: "done",
});
});
it("deduplicates repeated thread-started notifications for the same child thread", () => {
const runtime = createRuntime();
const mirror = new CodexNativeSubagentTaskMirror(
@@ -184,6 +147,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.handleNotification({
method: "thread/status/changed",
params: {
@@ -199,13 +163,15 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
});
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenNthCalledWith(1, {
runId: codexNativeSubagentRunId("child-thread"),
status: "succeeded",
endedAt: 30_000,
lastEventAt: 30_000,
progressSummary: "Codex native subagent is idle.",
terminalSummary: "Codex native subagent finished.",
});
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledTimes(1);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenNthCalledWith(2, {
runId: codexNativeSubagentRunId("failed-child"),
status: "failed",
endedAt: 30_000,
@@ -226,7 +192,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.markAuthoritativeCompletionExpected("child-thread");
mirror.handleNotification({
method: "item/completed",
params: {
@@ -283,12 +249,14 @@ describe("CodexNativeSubagentTaskMirror", () => {
lastEventAt: 40_000,
progressSummary: "Codex native subagent is initializing.",
});
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
runId: "codex-thread:child-thread",
status: "succeeded",
endedAt: 40_000,
lastEventAt: 40_000,
progressSummary: "done",
terminalSummary: "done",
});
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("uses the notification thread id when collab agent items omit sender thread id", () => {
@@ -301,6 +269,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.handleNotification({
method: "item/started",
params: {
@@ -332,7 +301,6 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.markAuthoritativeCompletionExpected("child-thread");
mirror.handleNotification({
method: "item/completed",
@@ -358,13 +326,13 @@ describe("CodexNativeSubagentTaskMirror", () => {
task: "inspect one thing",
}),
);
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith(
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith(
expect.objectContaining({
runId: "codex-thread:child-thread",
progressSummary: "done",
status: "succeeded",
terminalSummary: "done",
}),
);
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("finalizes stale collab agent state from the blocked tool call status", () => {
@@ -491,7 +459,7 @@ describe("CodexNativeSubagentTaskMirror", () => {
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("records completed collab agent and idle thread states as progress only", () => {
it("preserves a completed collab agent message when the thread later goes idle", () => {
const runtime = createRuntime();
const mirror = new CodexNativeSubagentTaskMirror(
{
@@ -501,7 +469,6 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.markAuthoritativeCompletionExpected("child-thread");
mirror.handleNotification({
method: "item/completed",
@@ -529,60 +496,18 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
});
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledTimes(1);
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledTimes(1);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
runId: "codex-thread:child-thread",
status: "succeeded",
endedAt: 50_000,
lastEventAt: 50_000,
progressSummary: "No user task is specified.",
terminalSummary: "No user task is specified.",
});
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("keeps terminal collab failures from rewriting authoritative completion", () => {
const runtime = createRuntime();
const mirror = new CodexNativeSubagentTaskMirror(
{
parentThreadId: "parent-thread",
requesterSessionKey: "agent:main:main",
now: () => 52_000,
},
runtime,
);
mirror.handleNotification({
method: "item/completed",
params: {
item: {
type: "collabAgentToolCall",
tool: "spawnAgent",
senderThreadId: "parent-thread",
receiverThreadIds: ["child-thread"],
prompt: "write the proof file",
},
},
});
mirror.markAuthoritativeCompletion("child-thread");
mirror.handleNotification({
method: "item/completed",
params: {
item: {
type: "collabAgentToolCall",
tool: "wait",
senderThreadId: "parent-thread",
agentsStates: {
"child-thread": {
status: "errored",
message: "later turn failed",
},
},
},
},
});
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
it("lets terminal collab agent state finalize after an earlier idle thread status", () => {
it("lets terminal collab agent state correct an earlier idle thread status", () => {
const runtime = createRuntime();
const mirror = new CodexNativeSubagentTaskMirror(
{
@@ -620,13 +545,15 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
});
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenNthCalledWith(1, {
runId: "codex-thread:child-thread",
status: "succeeded",
endedAt: 55_000,
lastEventAt: 55_000,
progressSummary: "Codex native subagent is idle.",
terminalSummary: "Codex native subagent finished.",
});
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledTimes(1);
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenNthCalledWith(2, {
runId: "codex-thread:child-thread",
status: "failed",
endedAt: 55_000,
@@ -647,7 +574,6 @@ describe("CodexNativeSubagentTaskMirror", () => {
},
runtime,
);
mirror.markAuthoritativeCompletionExpected("child-thread");
mirror.handleNotification({
method: "item/completed",
@@ -688,11 +614,13 @@ describe("CodexNativeSubagentTaskMirror", () => {
lastEventAt: 60_000,
progressSummary: "Codex native subagent is initializing.",
});
expect(runtime.recordTaskRunProgressByRunId).toHaveBeenCalledWith({
expect(runtime.finalizeTaskRunByRunId).toHaveBeenCalledWith({
runId: "codex-thread:child-thread",
status: "succeeded",
endedAt: 60_000,
lastEventAt: 60_000,
progressSummary: "done",
terminalSummary: "done",
});
expect(runtime.finalizeTaskRunByRunId).not.toHaveBeenCalled();
});
});


@@ -36,8 +36,6 @@ export class CodexNativeSubagentTaskMirror {
private readonly mirroredThreadIds = new Set<string>();
private readonly failedMirrorThreadIds = new Set<string>();
private readonly terminalRunIds = new Set<string>();
private readonly authoritativeRunIds = new Set<string>();
private readonly expectedAuthoritativeRunIds = new Set<string>();
private readonly now: () => number;
constructor(
@@ -47,20 +45,6 @@ export class CodexNativeSubagentTaskMirror {
this.now = params.now ?? Date.now;
}
markAuthoritativeCompletion(childThreadId: string): void {
const runId = codexNativeSubagentRunId(childThreadId);
// Run identity is per child thread, not per resumed turn. Once the monitor
// finalizes and delivers this task, later mirror events must not rewrite it.
this.authoritativeRunIds.add(runId);
this.terminalRunIds.add(runId);
}
markAuthoritativeCompletionExpected(childThreadId: string): void {
// Local transcripts and V2 agent paths can supply the real result later.
// Remote V1 lacks both and must keep collab-completed as its fallback.
this.expectedAuthoritativeRunIds.add(codexNativeSubagentRunId(childThreadId));
}
handleNotification(notification: CodexServerNotification): void {
const params = isJsonObject(notification.params) ? notification.params : undefined;
if (!params) {
@@ -125,7 +109,6 @@ export class CodexNativeSubagentTaskMirror {
}
this.failedMirrorThreadIds.delete(threadId);
this.terminalRunIds.delete(runId);
this.authoritativeRunIds.delete(runId);
this.applyStatus(threadId, thread.status);
}
@@ -146,9 +129,6 @@ export class CodexNativeSubagentTaskMirror {
return;
}
const runId = codexNativeSubagentRunId(threadId);
if (this.authoritativeRunIds.has(runId)) {
return;
}
if (this.terminalRunIds.has(runId) && statusType !== "systemError") {
return;
}
@@ -163,10 +143,13 @@ export class CodexNativeSubagentTaskMirror {
}
if (statusType === "idle") {
this.terminalRunIds.add(runId);
this.runtime.recordTaskRunProgressByRunId({
this.runtime.finalizeTaskRunByRunId({
runId,
status: "succeeded",
endedAt: eventAt,
lastEventAt: eventAt,
progressSummary: "Codex native subagent is idle.",
terminalSummary: "Codex native subagent finished.",
});
return;
}
@@ -274,7 +257,6 @@ export class CodexNativeSubagentTaskMirror {
}
this.failedMirrorThreadIds.delete(normalizedThreadId);
this.terminalRunIds.delete(runId);
this.authoritativeRunIds.delete(runId);
}
private applyCollabAgentStatus(
@@ -290,9 +272,6 @@ export class CodexNativeSubagentTaskMirror {
return;
}
const runId = codexNativeSubagentRunId(threadId);
if (this.authoritativeRunIds.has(runId)) {
return;
}
if (this.terminalRunIds.has(runId) && isNonTerminalAgentStateStatus(normalizedStatus)) {
return;
}
@@ -311,25 +290,14 @@ export class CodexNativeSubagentTaskMirror {
}
if (normalizedStatus === "completed") {
this.terminalRunIds.add(runId);
const summary = trimOptional(message) ?? "Codex native subagent completed.";
if (this.expectedAuthoritativeRunIds.has(runId)) {
this.runtime.recordTaskRunProgressByRunId({
runId,
lastEventAt: eventAt,
progressSummary: summary,
});
} else {
// Remote V1 has no trusted completion envelope or local transcript.
// Its collab-completed state is therefore the terminal fallback.
this.runtime.finalizeTaskRunByRunId({
runId,
status: "succeeded",
endedAt: eventAt,
lastEventAt: eventAt,
progressSummary: summary,
terminalSummary: summary,
});
}
this.runtime.finalizeTaskRunByRunId({
runId,
status: "succeeded",
endedAt: eventAt,
lastEventAt: eventAt,
progressSummary: trimOptional(message) ?? "Codex native subagent completed.",
terminalSummary: trimOptional(message) ?? "Codex native subagent finished.",
});
return;
}
if (normalizedStatus === "blocked") {

Some files were not shown because too many files have changed in this diff.
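
Review note: the `CodexNativeSubagentTaskMirror` hunks above appear to leave a single `terminalRunIds` guard in place, with an `idle` thread status finalizing the run as `succeeded` and only `systemError` allowed past the guard afterwards. The following is a minimal standalone sketch of that behavior, not the repository's implementation. `TerminalGuardSketch`, `FinalizeCall`, the `finalized` array, and the failure summary string are hypothetical names introduced for illustration; the `codex-thread:` run id prefix and the success summary are taken from the test expectations in the diff.

```typescript
// Hypothetical, simplified stand-in for the mirror's terminal-state guard.
// It records finalize calls in an array instead of calling a task runtime.
type StatusType = "active" | "idle" | "systemError";

type FinalizeCall = {
  runId: string;
  status: "succeeded" | "failed";
  terminalSummary: string;
};

class TerminalGuardSketch {
  private readonly terminalRunIds = new Set<string>();
  readonly finalized: FinalizeCall[] = [];

  applyStatus(threadId: string, statusType: StatusType): void {
    // Run id shape as seen in the tests: "codex-thread:<child thread id>".
    const runId = `codex-thread:${threadId}`;

    // Once a run is terminal, only a systemError may still rewrite it.
    if (this.terminalRunIds.has(runId) && statusType !== "systemError") {
      return;
    }

    if (statusType === "idle") {
      this.terminalRunIds.add(runId);
      this.finalized.push({
        runId,
        status: "succeeded",
        terminalSummary: "Codex native subagent finished.",
      });
      return;
    }

    if (statusType === "systemError") {
      this.terminalRunIds.add(runId);
      this.finalized.push({
        runId,
        status: "failed",
        // Placeholder text: the real failure summary is not visible in this diff.
        terminalSummary: "placeholder failure summary",
      });
    }
    // "active" and other non-terminal statuses are ignored in this sketch.
  }
}

const guard = new TerminalGuardSketch();
guard.applyStatus("child-thread", "idle"); // finalizes as succeeded
guard.applyStatus("child-thread", "idle"); // blocked by the terminal guard
guard.applyStatus("child-thread", "systemError"); // still allowed through
console.log(guard.finalized.map((call) => call.status));
```

Under these assumptions, a repeated `idle` is dropped while a later `systemError` produces a second finalize call, which matches the `toHaveBeenNthCalledWith(1, ...)` / `toHaveBeenNthCalledWith(2, ...)` pattern in the updated mirror tests.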