mirror of https://github.com/openclaw/openclaw.git
synced 2026-06-06 14:01:24 +08:00
Compare commits
123 Commits
v2026.4.25...codex/sess
| Author | SHA1 | Date | |
|---|---|---|---|
| | b23ff97ddc | | |
| | e705246619 | | |
| | f936f16cc5 | | |
| | d2786fb969 | | |
| | fa0729e145 | | |
| | fd48faa4ed | | |
| | 21c51bc140 | | |
| | 265bc6b6ea | | |
| | 42db865673 | | |
| | 5d7c6e6bda | | |
| | 29f1cae867 | | |
| | 560ddd2f9b | | |
| | f58dd36a1d | | |
| | 998e37fcb3 | | |
| | 33e3dccbea | | |
| | 3cc52d9050 | | |
| | 7902c769da | | |
| | 9be8d43c31 | | |
| | eccb79db99 | | |
| | 6fc954539f | | |
| | fc13a0135e | | |
| | 0ced62f512 | | |
| | 09a635a28b | | |
| | 5b257cb352 | | |
| | efe940e9cb | | |
| | 8d909ed0da | | |
| | 1bb46ce68a | | |
| | 54e77a9ec4 | | |
| | 43e651db9a | | |
| | e7d069edcf | | |
| | 17094640f8 | | |
| | 16c6a92c53 | | |
| | ef3309a986 | | |
| | 95ae3c00bd | | |
| | 97e64196a0 | | |
| | 41ad03dda4 | | |
| | 4a578740a2 | | |
| | 20d6daaeaa | | |
| | 6018f29dbf | | |
| | 989cfd1e33 | | |
| | 89ab39ca64 | | |
| | 199d5f765f | | |
| | 2fe11020d2 | | |
| | 1ddf6b4e39 | | |
| | 1a02d00eb4 | | |
| | cfe58387a7 | | |
| | 6077941d0b | | |
| | b5714b90ed | | |
| | 7a86448a6e | | |
| | 6cba12caae | | |
| | a08b65a90a | | |
| | 084dde89fd | | |
| | 2efc4a8233 | | |
| | cd417f3b68 | | |
| | a2adb05f74 | | |
| | c9c0ab3a44 | | |
| | 0472b6197a | | |
| | 8a60e57846 | | |
| | c6cf37068c | | |
| | ff6044f441 | | |
| | 5aa3779d8c | | |
| | ff9fefb79b | | |
| | 3746e5b969 | | |
| | 9f5bc5465c | | |
| | d108110a89 | | |
| | 1b1eea238c | | |
| | d9e9e61e77 | | |
| | fc0e6e4650 | | |
| | e8df081a1f | | |
| | 5c4c33c7de | | |
| | 070b55f336 | | |
| | 364d49889e | | |
| | baaad52389 | | |
| | 3a8961af0f | | |
| | ff570f3a61 | | |
| | 2cd23957c0 | | |
| | 43a003b8a0 | | |
| | fa85e6c26e | | |
| | d46de6cff7 | | |
| | 018f2e78ba | | |
| | b61954919c | | |
| | 5abb717112 | | |
| | 8226238765 | | |
| | b68b4b9151 | | |
| | a3c51f91c5 | | |
| | 2edbdc42ae | | |
| | b28de9a7d9 | | |
| | 824c3e2b71 | | |
| | 2194a8c64c | | |
| | 410783c126 | | |
| | 3ae6f01d61 | | |
| | e3cbad4fb6 | | |
| | c082cf892a | | |
| | b4a9ac3516 | | |
| | f0566e410a | | |
| | c6e9849351 | | |
| | 8e1755928c | | |
| | 9eb071c3f1 | | |
| | 522eedc754 | | |
| | 71e361af8a | | |
| | 487f8c5d3a | | |
| | 7a4574376a | | |
| | 8ba82534e6 | | |
| | ffa84cdc02 | | |
| | 67ffa3df8b | | |
| | df542f75a9 | | |
| | edf40ab6c9 | | |
| | 406ae72fd2 | | |
| | f99fb2af86 | | |
| | 244628f467 | | |
| | 637bd33e69 | | |
| | e53c068d78 | | |
| | 4e181d30fa | | |
| | e60cc50dff | | |
| | f2dab9b334 | | |
| | fc6cfbd418 | | |
| | 480a3f66c9 | | |
| | 19e41a1e69 | | |
| | b4cdd55f62 | | |
| | 6b6dcafcee | | |
| | 303cde8f60 | | |
| | e672b61417 | | |
| | 4a3030df9e | | |
@@ -325,9 +325,11 @@ node --import tsx scripts/openclaw-npm-postpublish-verify.ts <published-version>
 - Docker install/update coverage that exercises the published beta package
 - published npm Telegram proof: dispatch Actions > `NPM Telegram Beta E2E`
 from `main` with `package_spec=openclaw@<beta-version>` and
-`provider_mode=mock-openai`, approve `npm-release`, and require success.
-This is the default button path for installed-package onboarding,
-Telegram setup, and real Telegram E2E against the published npm package.
+`provider_mode=mock-openai`, and require success. This workflow is
+maintainer-dispatched and intentionally has no `npm-release` approval gate;
+`qa-live-shared` only supplies the shared QA secrets. This is the default
+button path for installed-package onboarding, Telegram setup, and real
+Telegram E2E against the published npm package.
 Use the local `pnpm test:docker:npm-telegram-live` lane with the matching
 `OPENCLAW_NPM_TELEGRAM_PACKAGE_SPEC` and Convex CI env only as a fallback
 or debugging path.
244
.agents/skills/openclaw-testing/SKILL.md
Normal file
@@ -0,0 +1,244 @@
|
||||
---
|
||||
name: openclaw-testing
|
||||
description: Choose, run, rerun, or debug OpenClaw tests, CI checks, Docker E2E lanes, release validation, and the cheapest safe verification path.
|
||||
---
|
||||
|
||||
# OpenClaw Testing
|
||||
|
||||
Use this skill when deciding what to test, debugging failures, rerunning CI,
|
||||
or validating a change without wasting hours.
|
||||
|
||||
## Read First
|
||||
|
||||
- `docs/reference/test.md` for local test commands.
|
||||
- `docs/ci.md` for CI scope, release checks, Docker chunks, and runner behavior.
|
||||
- Scoped `AGENTS.md` files before editing code under a subtree.
|
||||
|
||||
## Default Rule
|
||||
|
||||
Prove the touched surface first. Do not reflexively run the whole suite.
|
||||
|
||||
1. Inspect the diff and classify the touched surface:
|
||||
- source: `pnpm changed:lanes --json`, then `pnpm check:changed`
|
||||
- tests only: `pnpm test:changed`
|
||||
- one failing file: `pnpm test <path-or-filter> -- --reporter=verbose`
|
||||
- workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
|
||||
- docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
|
||||
2. Reproduce narrowly before fixing.
|
||||
3. Fix root cause.
|
||||
4. Rerun the same narrow proof.
|
||||
5. Broaden only when the touched contract demands it.
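
The classification in step 1 can be sketched as a small shell helper. This is only an illustration: the function name `suggest_proof` and the surface labels (`source`, `tests-only`, `one-file`, `workflow`, `docs-only`) are hypothetical and not part of the repo. The commands it prints are the ones listed above, and it prints them rather than running them.

```bash
# Hypothetical helper (not part of the repo): print the narrow proof
# command(s) listed above for a given touched surface.
suggest_proof() {
  case "$1" in
    source)
      printf '%s\n' 'pnpm changed:lanes --json' 'pnpm check:changed' ;;
    tests-only)
      printf '%s\n' 'pnpm test:changed' ;;
    one-file)
      # A path or filter is needed to build the targeted command.
      [ -n "${2:-}" ] || { echo "path or filter required" >&2; return 1; }
      printf '%s\n' "pnpm test $2 -- --reporter=verbose" ;;
    workflow)
      printf '%s\n' 'git diff --check' ;;
    docs-only)
      printf '%s\n' 'pnpm docs:list' ;;
    *)
      echo "unknown surface: $1" >&2
      return 1 ;;
  esac
}
```

For example, `suggest_proof one-file src/foo.test.ts` (an invented path) prints the verbose single-file command, while an unrecognized surface fails instead of falling back to the whole suite.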
|
||||
|
||||
## Guardrails
|
||||
|
||||
- Do not kill unrelated processes or tests. If something is running elsewhere, treat it as owned by the user or another agent.
|
||||
- Do not run expensive local Docker, full release checks, full `pnpm test`, or full `pnpm check` unless the user asks or the change genuinely requires it.
|
||||
- Prefer GitHub Actions for release/Docker proof when the workflow already has the prepared image and secrets.
|
||||
- Use `scripts/committer "<msg>" <paths...>` when committing; stage only your files.
|
||||
- If deps are missing, run `pnpm install`, retry once, then report the first actionable error.
|
||||
|
||||
## Local Test Shortcuts
|
||||
|
||||
```bash
pnpm changed:lanes --json
pnpm check:changed # changed typecheck/lint/guards; no Vitest
pnpm test:changed # cheap smart changed Vitest targets
OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed
pnpm test <path-or-filter> -- --reporter=verbose
OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test <path-or-filter>
```
|
||||
|
||||
Use targeted file paths whenever possible. Avoid raw `vitest`; use the repo
|
||||
`pnpm test` wrapper so project routing, workers, and setup stay correct.
|
||||
|
||||
## Command Semantics
|
||||
|
||||
- `pnpm check` and `pnpm check:changed` do not run Vitest tests. They are for
|
||||
typecheck, lint, and guard proof.
|
||||
- `pnpm test` and `pnpm test:changed` run Vitest tests.
|
||||
- `pnpm test:changed` is intentionally cheap by default: direct test edits,
|
||||
sibling tests, explicit source mappings, and import-graph dependents.
|
||||
- `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` is the explicit broad
|
||||
fallback for harness/config/package edits that genuinely need it.
|
||||
- Do not run extension sweeps just because core changed. If a core edit is for a
|
||||
specific plugin bug, run that plugin's tests explicitly. If a public SDK or
|
||||
contract change needs consumer proof, choose the smallest representative
|
||||
plugin/contract tests first, then broaden only when the risk justifies it.
|
||||
- The test wrapper prints a short `[test] passed|failed|skipped ... in ...`
|
||||
line. Vitest's own duration is still the per-shard detail.
|
||||
|
||||
## Routing Model
|
||||
|
||||
- `pnpm changed:lanes --json` answers "which check lanes does this diff touch?"
|
||||
It is used by `pnpm check:changed` for typecheck/lint/guard selection.
|
||||
- `pnpm test:changed` answers "which Vitest targets are worth running now?" It
|
||||
uses the same changed path list, but applies a cheaper test-target resolver.
|
||||
- Direct test edits run themselves. Source edits prefer explicit mappings,
|
||||
sibling `*.test.ts`, then import-graph dependents. Shared harness/config/root
|
||||
edits are skipped by default unless they have precise mapped tests.
|
||||
- Public SDK or contract edits do not automatically run every plugin test.
|
||||
`check:changed` proves extension type contracts; the agent chooses the
|
||||
smallest plugin/contract Vitest proof that matches the actual risk.
|
||||
- Use `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` only when a harness,
|
||||
config, package, or unknown-root edit really needs the broad Vitest fallback.
|
||||
|
||||
## CI Debugging
|
||||
|
||||
Start with current run state, not logs for everything:
|
||||
|
||||
```bash
gh run list --branch main --limit 10
gh run view <run-id> --json status,conclusion,headSha,url,jobs
gh run view <run-id> --job <job-id> --log
```
|
||||
|
||||
- Check exact SHA. Ignore newer unrelated `main` unless asked.
|
||||
- For cancelled same-branch runs, confirm whether a newer run superseded it.
|
||||
- Fetch full logs only for failed or relevant jobs.
|
||||
|
||||
## Docker
|
||||
|
||||
Docker is expensive. First inspect the scheduler without running Docker:
|
||||
|
||||
```bash
OPENCLAW_DOCKER_ALL_DRY_RUN=1 pnpm test:docker:all
OPENCLAW_DOCKER_ALL_DRY_RUN=1 OPENCLAW_DOCKER_ALL_LANES=install-e2e pnpm test:docker:all
OPENCLAW_DOCKER_ALL_LANES=install-e2e node scripts/test-docker-all.mjs --plan-json
```
|
||||
|
||||
Run one failed lane locally only when explicitly asked or when GitHub is not
|
||||
usable:
|
||||
|
||||
```bash
OPENCLAW_DOCKER_ALL_LANES=<lane> \
OPENCLAW_DOCKER_ALL_BUILD=0 \
OPENCLAW_DOCKER_ALL_PREFLIGHT=0 \
OPENCLAW_SKIP_DOCKER_BUILD=1 \
OPENCLAW_DOCKER_E2E_BARE_IMAGE='<prepared-bare-image>' \
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE='<prepared-functional-image>' \
pnpm test:docker:all
```
|
||||
|
||||
For release validation, prefer the reusable GitHub workflow input:
|
||||
|
||||
```yaml
docker_lanes: install-e2e
```
|
||||
|
||||
Multiple lanes are allowed:
|
||||
|
||||
```yaml
docker_lanes: install-e2e bundled-channel-update-acpx
```
|
||||
|
||||
That skips the three chunk matrix and runs one targeted Docker job against the
|
||||
prepared GHCR images and a fresh OpenClaw npm tarball for the selected ref.
|
||||
Reruns usually need that new tarball because the fix being tested changed the
|
||||
package contents even if the SHA-tagged GHCR Docker image can be reused.
|
||||
Live-only targeted reruns skip the E2E images and build only the live-test
|
||||
image. Release-path normal mode remains max three Docker chunk jobs:
|
||||
|
||||
- `core`
|
||||
- `package-update`
|
||||
- `plugins-integrations`
|
||||
|
||||
Docker E2E images never copy repo sources as the app under test: the bare image
|
||||
is a Node/Git runner, and the functional image installs the same prebuilt npm
|
||||
tarball that bare lanes mount. `scripts/package-openclaw-for-docker.mjs` is the
|
||||
single packer for local scripts and CI and validates the tarball inventory
|
||||
before Docker consumes it. `scripts/test-docker-all.mjs --plan-json` is the
|
||||
scheduler-owned CI plan for image kind, package, live image, lane, and
|
||||
credential needs. Docker lane definitions live in the single scenario catalog
|
||||
`scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in
|
||||
`scripts/lib/docker-e2e-plan.mjs`. `scripts/docker-e2e.mjs` converts plan and
|
||||
summary JSON into GitHub outputs and step summaries. Every scheduler run writes
|
||||
`.artifacts/docker-tests/**/summary.json` plus `failures.json`. Read those
|
||||
before rerunning. Lane entries include `command`, `rerunCommand`, status,
|
||||
timing, timeout state, image kind, and log file path. The summary also includes
|
||||
top-level phase timings for preflight, image build, package prep, lane pools,
|
||||
and cleanup. Use `pnpm test:docker:timings <summary.json>` to rank slow lanes
|
||||
and phases before deciding whether a broader rerun is justified.
|
||||
|
||||
## Cheap Docker Reruns
|
||||
|
||||
First derive the smallest rerun command from artifacts:
|
||||
|
||||
```bash
pnpm test:docker:rerun <github-run-id>
pnpm test:docker:rerun .artifacts/docker-tests/<run>/failures.json
```
|
||||
|
||||
The script downloads Docker E2E artifacts for a GitHub run, reads
|
||||
`summary.json`/`failures.json`, and prints a combined targeted workflow command
|
||||
plus per-lane commands. Prefer the combined targeted command when several lanes
|
||||
failed for the same patch:
|
||||
|
||||
```bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
  -f ref=<sha> \
  -f include_repo_e2e=false \
  -f include_release_path_suites=false \
  -f include_openwebui=false \
  -f docker_lanes='install-e2e bundled-channel-update-acpx' \
  -f include_live_suites=false \
  -f live_models_only=false
```
|
||||
|
||||
That path still runs the prepare job, so it creates a new tarball for `<sha>`.
|
||||
If the SHA-tagged GHCR bare/functional image already exists, CI skips rebuilding
|
||||
that image and only uploads the fresh package artifact before the targeted lane
|
||||
job. Do not rerun the full three-chunk release path unless the failed lane list
|
||||
or touched surface really requires it.
|
||||
|
||||
## Docker Expected Timings
|
||||
|
||||
Treat these as ballpark. Blacksmith queue time, GHCR pull speed, provider
|
||||
latency, npm cache state, and Docker daemon health can dominate.
|
||||
|
||||
Current local timing artifact (`.artifacts/docker-tests/lane-timings.json`) has
|
||||
these rough bands:
|
||||
|
||||
- Tiny lanes, seconds to under 1 minute:
|
||||
`agents-delete-shared-workspace` ~3s, `plugin-update` ~7s,
|
||||
`config-reload` ~14s, `pi-bundle-mcp-tools` ~15s, `onboard` ~18s,
|
||||
`session-runtime-context` ~20s, `gateway-network` ~34s, `qr` ~44s.
|
||||
- Medium deterministic lanes, ~1-5 minutes:
|
||||
`npm-onboard-channel-agent` ~96s, `openai-image-auth` ~99s,
|
||||
bundled channel/update lanes usually ~90-300s, `openwebui` ~225s,
|
||||
`mcp-channels` ~274s.
|
||||
- Heavy deterministic lanes, ~6-10 minutes:
|
||||
`bundled-channel-root-owned` ~429s,
|
||||
`bundled-channel-setup-entry` ~420s,
|
||||
`bundled-channel-load-failure` ~383s,
|
||||
`cron-mcp-cleanup` ~567s.
|
||||
- Live provider lanes, often ~15-20 minutes:
|
||||
`live-gateway` ~958s, `live-models` ~1054s.
|
||||
- Installer/release lanes:
|
||||
`install-e2e` and package-update paths can vary widely with npm, provider,
|
||||
and package registry behavior. Budget tens of minutes; prefer GitHub targeted
|
||||
reruns over local repeats.
|
||||
|
||||
Default fallback lane timeout is 120 minutes. A timeout usually means debug the
|
||||
lane log/artifacts first, not “run the whole thing again.”
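
To relate a lane's seconds to the rough bands above, a small sketch may help. The function names `lane_band` and `fmt_duration` are hypothetical, and the cutoffs (60s, 300s, 600s) are only one reading of the approximate bands, not values taken from the scheduler.

```bash
# Hypothetical helpers (not part of the repo). The cutoffs are an assumed
# reading of the rough bands above, not scheduler configuration.
lane_band() {
  secs="$1"
  if [ "$secs" -lt 60 ]; then
    echo tiny      # seconds to under 1 minute
  elif [ "$secs" -le 300 ]; then
    echo medium    # roughly 1-5 minutes
  elif [ "$secs" -le 600 ]; then
    echo heavy     # roughly up to 10 minutes
  else
    echo long      # live provider or installer territory
  fi
}

# Render whole seconds as minutes and seconds.
fmt_duration() {
  printf '%dm%02ds\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}
```

With the figures quoted above, `lane_band 567` lands in `heavy` and `fmt_duration 958` prints `15m58s`, which matches the "often ~15-20 minutes" description for live lanes.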
|
||||
|
||||
## Failure Workflow
|
||||
|
||||
1. Identify exact failing job, SHA, lane, and artifact path.
|
||||
2. Read `failures.json`, `summary.json`, and the failed lane log tail.
|
||||
3. Use `pnpm test:docker:rerun <run-id|failures.json>` to generate targeted
|
||||
GitHub rerun commands.
|
||||
4. If the lane has `rerunCommand`, use that only as a local starting point.
|
||||
5. For Docker release failures, dispatch targeted `docker_lanes=<failed-lane>`
|
||||
on GitHub before considering local Docker.
|
||||
6. Patch narrowly, then rerun the failed file/lane only.
|
||||
7. Broaden to `pnpm check:changed` or CI only after the isolated proof passes.
|
||||
|
||||
## When To Escalate
|
||||
|
||||
- Public SDK/plugin contract changes: run changed gate plus relevant extension
|
||||
validation.
|
||||
- Build output, lazy imports, package boundaries, or published surfaces:
|
||||
include `pnpm build`.
|
||||
- Workflow edits: run `pnpm check:workflows`.
|
||||
- Release branch or tag validation: use release docs and GitHub workflows; avoid
|
||||
local Docker unless Peter explicitly asks.
|
||||
4
.agents/skills/openclaw-testing/agents/openai.yaml
Normal file
@@ -0,0 +1,4 @@
|
||||
interface:
|
||||
display_name: "OpenClaw Testing"
|
||||
short_description: "Choose cheap, targeted OpenClaw validation"
|
||||
default_prompt: "Use $openclaw-testing to choose the cheapest safe test or CI verification path, inspect failures, and rerun only the relevant OpenClaw lane."
|
||||
145
.github/actions/docker-e2e-plan/action.yml
vendored
Normal file
@@ -0,0 +1,145 @@
|
||||
name: Docker E2E plan and hydrate
|
||||
description: >
|
||||
Create a Docker E2E lane plan, expose GitHub outputs, and optionally hydrate
|
||||
the prebuilt package artifact plus shared Docker images needed by the plan.
|
||||
inputs:
|
||||
mode:
|
||||
description: prepare, chunk, or targeted.
|
||||
required: true
|
||||
chunk:
|
||||
description: Release-path chunk for mode=chunk.
|
||||
required: false
|
||||
default: ""
|
||||
lanes:
|
||||
description: Comma/space separated lane names for targeted or prepare mode.
|
||||
required: false
|
||||
default: ""
|
||||
include-openwebui:
|
||||
description: Whether Open WebUI is included when planning release/prepare coverage.
|
||||
required: false
|
||||
default: "true"
|
||||
include-release-path-suites:
|
||||
description: Whether prepare mode should plan all release-path suites.
|
||||
required: false
|
||||
default: "false"
|
||||
hydrate-artifacts:
|
||||
description: Whether to download/pull artifacts required by the plan.
|
||||
required: false
|
||||
default: "true"
|
||||
outputs:
|
||||
credentials:
|
||||
description: Comma-separated credential groups required by selected lanes.
|
||||
value: ${{ steps.plan.outputs.credentials }}
|
||||
needs_bare_image:
|
||||
description: "1 when selected lanes require the bare Docker E2E image."
|
||||
value: ${{ steps.plan.outputs.needs_bare_image }}
|
||||
needs_e2e_image:
|
||||
description: "1 when selected lanes require any Docker E2E image."
|
||||
value: ${{ steps.plan.outputs.needs_e2e_image }}
|
||||
needs_functional_image:
|
||||
description: "1 when selected lanes require the functional Docker E2E image."
|
||||
value: ${{ steps.plan.outputs.needs_functional_image }}
|
||||
needs_live_image:
|
||||
description: "1 when selected lanes require building the live Docker image."
|
||||
value: ${{ steps.plan.outputs.needs_live_image }}
|
||||
needs_package:
|
||||
description: "1 when selected lanes require the OpenClaw package tarball."
|
||||
value: ${{ steps.plan.outputs.needs_package }}
|
||||
plan_json:
|
||||
description: Path to the generated plan JSON.
|
||||
value: ${{ steps.plan.outputs.plan_json }}
|
||||
runs:
|
||||
using: composite
|
||||
steps:
|
||||
- name: Plan Docker E2E lanes
|
||||
id: plan
|
||||
shell: bash
|
||||
env:
|
||||
MODE: ${{ inputs.mode }}
|
||||
CHUNK: ${{ inputs.chunk }}
|
||||
LANES: ${{ inputs.lanes }}
|
||||
INCLUDE_OPENWEBUI: ${{ inputs.include-openwebui }}
|
||||
INCLUDE_RELEASE_PATH_SUITES: ${{ inputs.include-release-path-suites }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
mkdir -p .artifacts/docker-tests
|
||||
|
||||
case "$MODE" in
|
||||
prepare)
|
||||
plan_path=".artifacts/docker-tests/plan.json"
|
||||
if [[ "$INCLUDE_RELEASE_PATH_SUITES" == "true" ]]; then
|
||||
export OPENCLAW_DOCKER_ALL_PROFILE=release-path
|
||||
export OPENCLAW_DOCKER_ALL_PLAN_RELEASE_ALL=1
|
||||
elif [[ -n "$LANES" ]]; then
|
||||
export OPENCLAW_DOCKER_ALL_LANES="$LANES"
|
||||
elif [[ "$INCLUDE_OPENWEBUI" == "true" ]]; then
|
||||
export OPENCLAW_DOCKER_ALL_LANES=openwebui
|
||||
fi
|
||||
;;
|
||||
chunk)
|
||||
if [[ -z "$CHUNK" ]]; then
|
||||
echo "chunk input is required for Docker E2E chunk planning." >&2
|
||||
exit 1
|
||||
fi
|
||||
export OPENCLAW_DOCKER_ALL_PROFILE=release-path
|
||||
export OPENCLAW_DOCKER_ALL_CHUNK="$CHUNK"
|
||||
plan_path=".artifacts/docker-tests/release-${CHUNK}-plan.json"
|
||||
;;
|
||||
targeted)
|
||||
if [[ -z "$LANES" ]]; then
|
||||
echo "lanes input is required for Docker E2E targeted planning." >&2
|
||||
exit 1
|
||||
fi
|
||||
export OPENCLAW_DOCKER_ALL_LANES="$LANES"
|
||||
plan_path=".artifacts/docker-tests/targeted-plan.json"
|
||||
;;
|
||||
*)
|
||||
echo "mode must be prepare, chunk, or targeted. Got: $MODE" >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
|
||||
export OPENCLAW_DOCKER_ALL_INCLUDE_OPENWEBUI="$INCLUDE_OPENWEBUI"
|
||||
node scripts/test-docker-all.mjs --plan-json > "$plan_path"
|
||||
node scripts/docker-e2e.mjs github-outputs "$plan_path" >> "$GITHUB_OUTPUT"
|
||||
echo "plan_json=$plan_path" >> "$GITHUB_OUTPUT"
|
||||
|
||||
- name: Download OpenClaw Docker E2E package
|
||||
if: inputs.hydrate-artifacts == 'true' && steps.plan.outputs.needs_package == '1'
|
||||
uses: actions/download-artifact@v8
|
||||
with:
|
||||
name: docker-e2e-package
|
||||
path: .artifacts/docker-e2e-package
|
||||
|
||||
- name: Pull shared bare Docker E2E image
|
||||
if: inputs.hydrate-artifacts == 'true' && steps.plan.outputs.needs_bare_image == '1'
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
docker pull "${OPENCLAW_DOCKER_E2E_BARE_IMAGE}"
|
||||
|
||||
- name: Pull shared functional Docker E2E image
|
||||
if: inputs.hydrate-artifacts == 'true' && steps.plan.outputs.needs_functional_image == '1'
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
docker pull "${OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE}"
|
||||
|
||||
- name: Validate Docker E2E credentials
|
||||
if: inputs.hydrate-artifacts == 'true'
|
||||
shell: bash
|
||||
env:
|
||||
CREDENTIALS: ${{ steps.plan.outputs.credentials }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
credentials=",$CREDENTIALS,"
|
||||
if [[ "$credentials" == *",openai,"* ]]; then
|
||||
[[ -n "${OPENAI_API_KEY:-}" ]] || {
|
||||
echo "OPENAI_API_KEY is required for selected Docker E2E lanes." >&2
|
||||
exit 1
|
||||
}
|
||||
fi
|
||||
if [[ "$credentials" == *",anthropic,"* && -z "${ANTHROPIC_API_TOKEN:-}" && -z "${ANTHROPIC_API_KEY:-}" ]]; then
|
||||
echo "ANTHROPIC_API_TOKEN or ANTHROPIC_API_KEY is required for selected Docker E2E lanes." >&2
|
||||
exit 1
|
||||
fi
|
||||
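
Note on the credential check in the action above: it wraps the comma-separated list in leading and trailing commas so that each group name is matched as a whole element rather than as a substring. A minimal standalone sketch of that idiom follows. The function name `has_credential` is hypothetical, and it assumes the list contains no whitespace around the commas.

```bash
# Hypothetical helper illustrating the comma-wrapping idiom.
# $1: comma-separated credential groups, $2: group name to look for.
has_credential() {
  case ",$1," in
    *",$2,"*) return 0 ;;   # whole-element match
    *)        return 1 ;;   # absent, or only a substring of another name
  esac
}
```

With this shape, `has_credential "openai,anthropic" anthropic` succeeds, while a list containing only a longer name such as `openai-compatible` (an invented group name) does not match `openai`.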
37
.github/workflows/ci.yml
vendored
@@ -1,6 +1,7 @@
 name: CI

 on:
+workflow_dispatch:
 push:
 branches: [main]
 paths-ignore:
@@ -13,8 +14,8 @@ permissions:
 contents: read

 concurrency:
-group: ${{ github.event_name == 'pull_request' && format('{0}-v7-{1}', github.workflow, github.event.pull_request.number) || (github.repository == 'openclaw/openclaw' && format('{0}-v7-{1}', github.workflow, github.ref) || format('{0}-v7-{1}-{2}', github.workflow, github.ref, github.sha)) }}
-cancel-in-progress: true
+group: ${{ github.event_name == 'workflow_dispatch' && format('{0}-manual-v1-{1}', github.workflow, github.run_id) || (github.event_name == 'pull_request' && format('{0}-v7-{1}', github.workflow, github.event.pull_request.number) || (github.repository == 'openclaw/openclaw' && format('{0}-v7-{1}', github.workflow, github.ref) || format('{0}-v7-{1}-{2}', github.workflow, github.ref, github.sha))) }}
+cancel-in-progress: ${{ github.event_name != 'workflow_dispatch' }}

 env:
 FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
@@ -75,6 +76,7 @@ jobs:
 submodules: false

 - name: Ensure preflight base commit
+if: github.event_name != 'workflow_dispatch'
 uses: ./.github/actions/ensure-base-commit
 with:
 base-sha: ${{ github.event_name == 'push' && github.event.before || github.event.pull_request.base.sha }}
@@ -82,11 +84,12 @@ jobs:

 - name: Detect docs-only changes
 id: docs_scope
+if: github.event_name != 'workflow_dispatch'
 uses: ./.github/actions/detect-docs-changes

 - name: Detect changed scopes
 id: changed_scope
-if: steps.docs_scope.outputs.docs_only != 'true'
+if: github.event_name != 'workflow_dispatch' && steps.docs_scope.outputs.docs_only != 'true'
 shell: bash
 run: |
 set -euo pipefail
@@ -101,7 +104,7 @@ jobs:

 - name: Detect changed extensions
 id: changed_extensions
-if: steps.docs_scope.outputs.docs_only != 'true' && steps.changed_scope.outputs.run_node == 'true'
+if: github.event_name != 'workflow_dispatch' && steps.docs_scope.outputs.docs_only != 'true' && steps.changed_scope.outputs.run_node == 'true'
 env:
 BASE_SHA: ${{ github.event_name == 'push' && github.event.before || github.event.pull_request.base.sha }}
 BASE_REF: ${{ github.event_name == 'push' && github.ref_name || github.event.pull_request.base.ref }}
@@ -125,19 +128,19 @@ jobs:
 - name: Build CI manifest
 id: manifest
 env:
-OPENCLAW_CI_DOCS_ONLY: ${{ steps.docs_scope.outputs.docs_only }}
-OPENCLAW_CI_DOCS_CHANGED: ${{ steps.docs_scope.outputs.docs_changed }}
-OPENCLAW_CI_RUN_NODE: ${{ steps.changed_scope.outputs.run_node || 'false' }}
-OPENCLAW_CI_RUN_MACOS: ${{ steps.changed_scope.outputs.run_macos || 'false' }}
-OPENCLAW_CI_RUN_ANDROID: ${{ steps.changed_scope.outputs.run_android || 'false' }}
-OPENCLAW_CI_RUN_WINDOWS: ${{ steps.changed_scope.outputs.run_windows || 'false' }}
-OPENCLAW_CI_RUN_NODE_FAST_ONLY: ${{ steps.changed_scope.outputs.run_node_fast_only || 'false' }}
-OPENCLAW_CI_RUN_NODE_FAST_PLUGIN_CONTRACTS: ${{ steps.changed_scope.outputs.run_node_fast_plugin_contracts || 'false' }}
-OPENCLAW_CI_RUN_NODE_FAST_CI_ROUTING: ${{ steps.changed_scope.outputs.run_node_fast_ci_routing || 'false' }}
-OPENCLAW_CI_RUN_SKILLS_PYTHON: ${{ steps.changed_scope.outputs.run_skills_python || 'false' }}
-OPENCLAW_CI_RUN_CONTROL_UI_I18N: ${{ steps.changed_scope.outputs.run_control_ui_i18n || 'false' }}
-OPENCLAW_CI_HAS_CHANGED_EXTENSIONS: ${{ steps.changed_extensions.outputs.has_changed_extensions || 'false' }}
-OPENCLAW_CI_CHANGED_EXTENSIONS_MATRIX: ${{ steps.changed_extensions.outputs.changed_extensions_matrix || '{"include":[]}' }}
+OPENCLAW_CI_DOCS_ONLY: ${{ github.event_name == 'workflow_dispatch' && 'false' || steps.docs_scope.outputs.docs_only }}
+OPENCLAW_CI_DOCS_CHANGED: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.docs_scope.outputs.docs_changed }}
+OPENCLAW_CI_RUN_NODE: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_node || 'false' }}
+OPENCLAW_CI_RUN_MACOS: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_macos || 'false' }}
+OPENCLAW_CI_RUN_ANDROID: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_android || 'false' }}
+OPENCLAW_CI_RUN_WINDOWS: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_windows || 'false' }}
+OPENCLAW_CI_RUN_NODE_FAST_ONLY: ${{ github.event_name == 'workflow_dispatch' && 'false' || steps.changed_scope.outputs.run_node_fast_only || 'false' }}
+OPENCLAW_CI_RUN_NODE_FAST_PLUGIN_CONTRACTS: ${{ github.event_name == 'workflow_dispatch' && 'false' || steps.changed_scope.outputs.run_node_fast_plugin_contracts || 'false' }}
+OPENCLAW_CI_RUN_NODE_FAST_CI_ROUTING: ${{ github.event_name == 'workflow_dispatch' && 'false' || steps.changed_scope.outputs.run_node_fast_ci_routing || 'false' }}
+OPENCLAW_CI_RUN_SKILLS_PYTHON: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_skills_python || 'false' }}
+OPENCLAW_CI_RUN_CONTROL_UI_I18N: ${{ github.event_name == 'workflow_dispatch' && 'true' || steps.changed_scope.outputs.run_control_ui_i18n || 'false' }}
+OPENCLAW_CI_HAS_CHANGED_EXTENSIONS: ${{ github.event_name == 'workflow_dispatch' && 'false' || steps.changed_extensions.outputs.has_changed_extensions || 'false' }}
+OPENCLAW_CI_CHANGED_EXTENSIONS_MATRIX: ${{ github.event_name == 'workflow_dispatch' && '{"include":[]}' || steps.changed_extensions.outputs.changed_extensions_matrix || '{"include":[]}' }}
 OPENCLAW_CI_REPOSITORY: ${{ github.repository }}
 run: |
 node --input-type=module <<'EOF'
169
.github/workflows/docker-release.yml
vendored
@@ -63,7 +63,7 @@ jobs:
|
||||
|
||||
# KEEP THIS WORKFLOW ON GITHUB-HOSTED RUNNERS.
|
||||
# DO NOT MOVE IT BACK TO BLACKSMITH WITHOUT RE-VALIDATING TAG BUILDS AND BACKFILLS.
|
||||
# Build amd64 images (default + slim share the build stage cache)
|
||||
# Build amd64 image. Default and slim tags point to the same slim runtime.
|
||||
build-amd64:
|
||||
needs: [approve_manual_backfill]
|
||||
if: ${{ always() && (github.event_name != 'workflow_dispatch' || needs.approve_manual_backfill.result == 'success') }}
|
||||
@@ -74,7 +74,6 @@ jobs:
|
||||
contents: read
|
||||
outputs:
|
||||
digest: ${{ steps.build.outputs.digest }}
|
||||
slim-digest: ${{ steps.build-slim.outputs.digest }}
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v6
|
||||
@@ -117,12 +116,7 @@ jobs:
|
||||
fi
|
||||
{
|
||||
echo "value<<EOF"
|
||||
printf "%s\n" "${tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
{
|
||||
echo "slim<<EOF"
|
||||
printf "%s\n" "${slim_tags[@]}"
|
||||
printf "%s\n" "${tags[@]}" "${slim_tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
|
||||
@@ -163,27 +157,11 @@ jobs:
|
||||
OPENCLAW_EXTENSIONS=diagnostics-otel
|
||||
tags: ${{ steps.tags.outputs.value }}
|
||||
labels: ${{ steps.labels.outputs.value }}
|
||||
provenance: false
|
||||
sbom: true
|
||||
provenance: mode=max
|
||||
push: true
|
||||
|
||||
- name: Build and push amd64 slim image
|
||||
id: build-slim
|
||||
# WARNING: KEEP THE OFFICIAL DOCKER ACTION HERE; DO NOT SWITCH THIS BACK TO BLACKSMITH BLINDLY.
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
platforms: linux/amd64
|
||||
cache-from: type=gha,scope=docker-release-amd64
|
||||
cache-to: type=gha,mode=max,scope=docker-release-amd64
|
||||
build-args: |
|
||||
OPENCLAW_EXTENSIONS=diagnostics-otel
|
||||
OPENCLAW_VARIANT=slim
|
||||
tags: ${{ steps.tags.outputs.slim }}
|
||||
labels: ${{ steps.labels.outputs.value }}
|
||||
provenance: false
|
||||
push: true
|
||||
|
||||
# Build arm64 images (default + slim share the build stage cache)
|
||||
# Build arm64 image. Default and slim tags point to the same slim runtime.
|
||||
build-arm64:
|
||||
needs: [approve_manual_backfill]
|
||||
if: ${{ always() && (github.event_name != 'workflow_dispatch' || needs.approve_manual_backfill.result == 'success') }}
|
||||
@@ -194,7 +172,6 @@ jobs:
|
||||
contents: read
|
||||
outputs:
|
||||
digest: ${{ steps.build.outputs.digest }}
|
||||
slim-digest: ${{ steps.build-slim.outputs.digest }}
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v6
|
||||
@@ -237,12 +214,7 @@ jobs:
|
||||
fi
|
||||
{
|
||||
echo "value<<EOF"
|
||||
printf "%s\n" "${tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
{
|
||||
echo "slim<<EOF"
|
||||
printf "%s\n" "${slim_tags[@]}"
|
||||
printf "%s\n" "${tags[@]}" "${slim_tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
|
||||
@@ -283,24 +255,8 @@ jobs:
|
||||
OPENCLAW_EXTENSIONS=diagnostics-otel
|
||||
tags: ${{ steps.tags.outputs.value }}
|
||||
labels: ${{ steps.labels.outputs.value }}
|
||||
provenance: false
|
||||
push: true
|
||||
|
||||
- name: Build and push arm64 slim image
|
||||
id: build-slim
|
||||
# WARNING: KEEP THE OFFICIAL DOCKER ACTION HERE; DO NOT SWITCH THIS BACK TO BLACKSMITH BLINDLY.
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
platforms: linux/arm64
|
||||
cache-from: type=gha,scope=docker-release-arm64
|
||||
cache-to: type=gha,mode=max,scope=docker-release-arm64
|
||||
build-args: |
|
||||
OPENCLAW_EXTENSIONS=diagnostics-otel
|
||||
OPENCLAW_VARIANT=slim
|
||||
tags: ${{ steps.tags.outputs.slim }}
|
||||
labels: ${{ steps.labels.outputs.value }}
|
||||
provenance: false
|
||||
sbom: true
|
||||
provenance: mode=max
|
||||
push: true
|
||||
|
||||
# Create multi-platform manifests
|
||||
@@ -357,16 +313,11 @@ jobs:
|
||||
fi
|
||||
{
|
||||
echo "value<<EOF"
|
||||
printf "%s\n" "${tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
{
|
||||
echo "slim<<EOF"
|
||||
printf "%s\n" "${slim_tags[@]}"
|
||||
printf "%s\n" "${tags[@]}" "${slim_tags[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
|
||||
- name: Create and push default manifest
|
||||
- name: Create and push manifest
|
||||
shell: bash
|
||||
env:
|
||||
TAGS: ${{ steps.tags.outputs.value }}
|
||||
@@ -384,20 +335,94 @@ jobs:
|
||||
"${AMD64_DIGEST}" \
|
||||
"${ARM64_DIGEST}"
|
||||
|
||||
- name: Create and push slim manifest
|
||||
verify-attestations:
|
||||
needs: [create-manifest]
|
||||
if: ${{ always() && needs.create-manifest.result == 'success' }}
|
||||
runs-on: ubuntu-24.04
|
||||
permissions:
|
||||
contents: read
|
||||
packages: read
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v6
|
||||
with:
|
||||
fetch-depth: 1
|
||||
|
||||
- name: Set up Docker Builder
|
||||
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4
|
||||
|
||||
- name: Login to GitHub Container Registry
|
||||
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
|
||||
with:
|
||||
registry: ${{ env.REGISTRY }}
|
||||
username: ${{ github.repository_owner }}
|
||||
password: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Resolve image refs
|
||||
id: refs
|
||||
shell: bash
|
||||
env:
|
||||
SLIM_TAGS: ${{ steps.tags.outputs.slim }}
|
||||
AMD64_SLIM_DIGEST: ${{ needs.build-amd64.outputs.slim-digest }}
|
||||
ARM64_SLIM_DIGEST: ${{ needs.build-arm64.outputs.slim-digest }}
|
||||
IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
|
||||
SOURCE_REF: ${{ github.event_name == 'workflow_dispatch' && format('refs/tags/{0}', inputs.tag) || github.ref }}
|
||||
IS_MANUAL_BACKFILL: ${{ github.event_name == 'workflow_dispatch' && '1' || '0' }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
mapfile -t tags <<< "${SLIM_TAGS}"
|
||||
args=()
|
||||
for tag in "${tags[@]}"; do
|
||||
[ -z "$tag" ] && continue
|
||||
args+=("-t" "$tag")
|
||||
done
|
||||
docker buildx imagetools create "${args[@]}" \
|
||||
"${AMD64_SLIM_DIGEST}" \
|
||||
"${ARM64_SLIM_DIGEST}"
|
||||
multi_refs=()
|
||||
slim_multi_refs=()
|
||||
amd64_refs=()
|
||||
arm64_refs=()
|
||||
if [[ "${SOURCE_REF}" == "refs/heads/main" ]]; then
|
||||
multi_refs+=("${IMAGE}:main")
|
||||
slim_multi_refs+=("${IMAGE}:main-slim")
|
||||
amd64_refs+=("${IMAGE}:main-amd64" "${IMAGE}:main-slim-amd64")
|
||||
arm64_refs+=("${IMAGE}:main-arm64" "${IMAGE}:main-slim-arm64")
|
||||
fi
|
||||
if [[ "${SOURCE_REF}" == refs/tags/v* ]]; then
|
||||
version="${SOURCE_REF#refs/tags/v}"
|
||||
multi_refs+=("${IMAGE}:${version}")
|
||||
slim_multi_refs+=("${IMAGE}:${version}-slim")
|
||||
amd64_refs+=("${IMAGE}:${version}-amd64" "${IMAGE}:${version}-slim-amd64")
|
||||
arm64_refs+=("${IMAGE}:${version}-arm64" "${IMAGE}:${version}-slim-arm64")
|
||||
if [[ "${IS_MANUAL_BACKFILL}" != "1" && "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+(-[0-9]+)?$ ]]; then
|
||||
multi_refs+=("${IMAGE}:latest")
|
||||
slim_multi_refs+=("${IMAGE}:slim")
|
||||
fi
|
||||
fi
|
||||
if [[ ${#multi_refs[@]} -eq 0 || ${#amd64_refs[@]} -eq 0 || ${#arm64_refs[@]} -eq 0 ]]; then
|
||||
echo "::error::No Docker image refs resolved for ref ${SOURCE_REF}"
|
||||
exit 1
|
||||
fi
|
||||
{
|
||||
echo "multi<<EOF"
|
||||
printf "%s\n" "${multi_refs[@]}" "${slim_multi_refs[@]}"
|
||||
echo "EOF"
|
||||
echo "amd64<<EOF"
|
||||
printf "%s\n" "${amd64_refs[@]}"
|
||||
echo "EOF"
|
||||
echo "arm64<<EOF"
|
||||
printf "%s\n" "${arm64_refs[@]}"
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
|
||||
- name: Verify Docker attestations
|
||||
shell: bash
|
||||
env:
|
||||
MULTI_REFS: ${{ steps.refs.outputs.multi }}
|
||||
AMD64_REFS: ${{ steps.refs.outputs.amd64 }}
|
||||
ARM64_REFS: ${{ steps.refs.outputs.arm64 }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
mapfile -t multi_refs <<< "${MULTI_REFS}"
|
||||
mapfile -t amd64_refs <<< "${AMD64_REFS}"
|
||||
mapfile -t arm64_refs <<< "${ARM64_REFS}"
|
||||
|
||||
node scripts/verify-docker-attestations.mjs \
|
||||
--platform linux/amd64 \
|
||||
--platform linux/arm64 \
|
||||
"${multi_refs[@]}"
|
||||
node scripts/verify-docker-attestations.mjs \
|
||||
--platform linux/amd64 \
|
||||
"${amd64_refs[@]}"
|
||||
node scripts/verify-docker-attestations.mjs \
|
||||
--platform linux/arm64 \
|
||||
"${arm64_refs[@]}"
|
||||
|
||||
14
.github/workflows/install-smoke.yml
vendored
@@ -10,6 +10,11 @@ on:
|
||||
required: false
|
||||
default: false
|
||||
type: boolean
|
||||
update_baseline_version:
|
||||
description: Baseline openclaw version or dist-tag for installer update smoke
|
||||
required: false
|
||||
default: latest
|
||||
type: string
|
||||
workflow_call:
|
||||
inputs:
|
||||
ref:
|
||||
@@ -21,6 +26,11 @@ on:
|
||||
required: false
|
||||
default: true
|
||||
type: boolean
|
||||
update_baseline_version:
|
||||
description: Baseline openclaw version or dist-tag for installer update smoke
|
||||
required: false
|
||||
default: latest
|
||||
type: string
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
@@ -103,7 +113,6 @@ jobs:
|
||||
context: .
|
||||
file: ./Dockerfile
|
||||
build-args: |
|
||||
OPENCLAW_DOCKER_APT_UPGRADE=0
|
||||
OPENCLAW_EXTENSIONS=matrix
|
||||
tags: |
|
||||
openclaw-dockerfile-smoke:local
|
||||
@@ -218,7 +227,6 @@ jobs:
|
||||
context: .
|
||||
file: ./Dockerfile
|
||||
build-args: |
|
||||
OPENCLAW_DOCKER_APT_UPGRADE=0
|
||||
OPENCLAW_EXTENSIONS=matrix
|
||||
tags: |
|
||||
openclaw-dockerfile-smoke:local
|
||||
@@ -332,7 +340,7 @@ jobs:
|
||||
OPENCLAW_INSTALL_SMOKE_SKIP_NONROOT: "0"
|
||||
OPENCLAW_INSTALL_SMOKE_SKIP_NPM_GLOBAL: "1"
|
||||
OPENCLAW_INSTALL_SMOKE_SKIP_PREVIOUS: "1"
|
||||
OPENCLAW_INSTALL_SMOKE_UPDATE_BASELINE: latest
|
||||
OPENCLAW_INSTALL_SMOKE_UPDATE_BASELINE: ${{ inputs.update_baseline_version || 'latest' }}
|
||||
OPENCLAW_INSTALL_SMOKE_UPDATE_DIST_IMAGE: openclaw-dockerfile-smoke:local
|
||||
OPENCLAW_INSTALL_SMOKE_UPDATE_SKIP_LOCAL_BUILD: "1"
|
||||
run: bash scripts/test-install-sh-docker.sh
|
||||
|
||||
31
.github/workflows/npm-telegram-beta-e2e.yml
vendored
@@ -34,34 +34,8 @@ env:
|
||||
PNPM_VERSION: "10.33.0"
|
||||
|
||||
jobs:
|
||||
validate_dispatch_ref:
|
||||
name: Validate dispatch ref
|
||||
runs-on: blacksmith-8vcpu-ubuntu-2404
|
||||
steps:
|
||||
- name: Require main workflow ref
|
||||
env:
|
||||
WORKFLOW_REF: ${{ github.ref }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
if [[ "${WORKFLOW_REF}" != "refs/heads/main" ]]; then
|
||||
echo "NPM Telegram beta E2E must be dispatched from main so workflow logic stays controlled." >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
approve_release_manager:
|
||||
name: Approve npm Telegram beta E2E
|
||||
needs: validate_dispatch_ref
|
||||
runs-on: ubuntu-latest
|
||||
environment: npm-release
|
||||
steps:
|
||||
- name: Record approval
|
||||
env:
|
||||
PACKAGE_SPEC: ${{ inputs.package_spec }}
|
||||
run: echo "Approved npm Telegram beta E2E for ${PACKAGE_SPEC}"
|
||||
|
||||
run_npm_telegram_beta_e2e:
|
||||
name: Run published npm Telegram E2E
|
||||
needs: approve_release_manager
|
||||
runs-on: blacksmith-32vcpu-ubuntu-2404
|
||||
timeout-minutes: 60
|
||||
environment: qa-live-shared
|
||||
@@ -71,7 +45,7 @@ jobs:
|
||||
DOCKER_BUILD_SUMMARY: "false"
|
||||
DOCKER_BUILD_RECORD_UPLOAD: "false"
|
||||
steps:
|
||||
- name: Checkout main
|
||||
- name: Checkout dispatch ref
|
||||
uses: actions/checkout@v6
|
||||
with:
|
||||
ref: ${{ github.sha }}
|
||||
@@ -79,6 +53,8 @@ jobs:
|
||||
|
||||
- name: Set up Blacksmith Docker Builder
|
||||
uses: useblacksmith/setup-docker-builder@ac083cc84672d01c60d5e8561d0a939b697de542 # v1
|
||||
with:
|
||||
max-cache-size-mb: 800000
|
||||
|
||||
- name: Build Docker E2E image
|
||||
uses: useblacksmith/build-push-action@cbd1f60d194a98cb3be5523b15134501eaf0fbf3 # v2
|
||||
@@ -143,6 +119,7 @@ jobs:
|
||||
OPENCLAW_QA_CONVEX_SITE_URL: ${{ secrets.OPENCLAW_QA_CONVEX_SITE_URL }}
|
||||
OPENCLAW_QA_CONVEX_SECRET_CI: ${{ secrets.OPENCLAW_QA_CONVEX_SECRET_CI }}
|
||||
OPENCLAW_QA_REDACT_PUBLIC_METADATA: "1"
|
||||
OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT: "1"
|
||||
INPUT_SCENARIO: ${{ inputs.scenario }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
|
||||
@@ -23,6 +23,11 @@ on:
|
||||
required: false
|
||||
default: true
|
||||
type: boolean
|
||||
docker_lanes:
|
||||
description: Comma/space separated Docker scheduler lane names to run against the prepared image
|
||||
required: false
|
||||
default: ""
|
||||
type: string
|
||||
include_live_suites:
|
||||
description: Whether to run live-provider coverage
|
||||
required: false
|
||||
@@ -54,6 +59,11 @@ on:
|
||||
required: false
|
||||
default: true
|
||||
type: boolean
|
||||
docker_lanes:
|
||||
description: Comma/space separated Docker scheduler lane names to run against the prepared image
|
||||
required: false
|
||||
default: ""
|
||||
type: string
|
||||
include_live_suites:
|
||||
description: Whether to run live-provider coverage
|
||||
required: false
|
||||
@@ -182,6 +192,7 @@ jobs:
|
||||
env:
|
||||
GH_TOKEN: ${{ github.token }}
|
||||
INPUT_REF: ${{ inputs.ref }}
|
||||
WORKFLOW_REF_NAME: ${{ github.ref_name }}
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
@@ -189,9 +200,15 @@ jobs:
|
||||
trusted_reason=""
|
||||
|
||||
git fetch --no-tags origin +refs/heads/main:refs/remotes/origin/main
|
||||
if [[ "${WORKFLOW_REF_NAME}" =~ ^release/[0-9]{4}\.[1-9][0-9]*\.[1-9][0-9]*$ ]]; then
|
||||
git fetch --no-tags origin "+refs/heads/${WORKFLOW_REF_NAME}:refs/remotes/origin/${WORKFLOW_REF_NAME}"
|
||||
fi
|
||||
|
||||
if git merge-base --is-ancestor "$selected_sha" refs/remotes/origin/main; then
|
||||
trusted_reason="main-ancestor"
|
||||
elif [[ "${WORKFLOW_REF_NAME}" =~ ^release/[0-9]{4}\.[1-9][0-9]*\.[1-9][0-9]*$ ]] &&
|
||||
[[ "$selected_sha" == "$(git rev-parse "refs/remotes/origin/${WORKFLOW_REF_NAME}")" ]]; then
|
||||
trusted_reason="release-branch-head"
|
||||
elif git tag --points-at "$selected_sha" | grep -Eq '^v'; then
|
||||
trusted_reason="release-tag"
|
||||
else
|
||||
@@ -208,7 +225,7 @@ jobs:
|
||||
|
||||
if [[ -z "$trusted_reason" ]]; then
|
||||
echo "Ref '${INPUT_REF}' resolved to $selected_sha, which is not trusted for secret-bearing live/E2E checks." >&2
|
||||
echo "Allowed refs must be on main, point to a release tag, or match an open PR head in ${GITHUB_REPOSITORY}." >&2
|
||||
echo "Allowed refs must be on main, match the current release branch head, point to a release tag, or match an open PR head in ${GITHUB_REPOSITORY}." >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
@@ -303,7 +320,7 @@ jobs:
|
||||
requires_live_suites: false
|
||||
- suite_id: openai-ws-stream-live-e2e
|
||||
label: OpenAI WebSocket live E2E
|
||||
command: pnpm test:e2e -- src/agents/openai-ws-stream.e2e.test.ts
|
||||
command: pnpm test:e2e src/agents/openai-ws-stream.e2e.test.ts
|
||||
timeout_minutes: 90
|
||||
requires_repo_e2e: false
|
||||
requires_live_suites: true
|
||||
@@ -363,93 +380,23 @@ jobs:
|
||||
|
||||
validate_docker_e2e:
|
||||
needs: [validate_selected_ref, prepare_docker_e2e_image]
|
||||
if: inputs.include_release_path_suites
|
||||
if: inputs.include_release_path_suites && inputs.docker_lanes == ''
|
||||
name: Docker E2E (${{ matrix.label }})
|
||||
runs-on: blacksmith-32vcpu-ubuntu-2404
|
||||
timeout-minutes: ${{ matrix.timeout_minutes }}
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
include:
|
||||
- suite_id: docker-onboard
|
||||
label: Onboarding Docker E2E
|
||||
command: pnpm test:docker:onboard
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-npm-onboard-channel-agent
|
||||
label: Npm Onboard Channel Agent Docker E2E
|
||||
command: pnpm test:docker:npm-onboard-channel-agent
|
||||
timeout_minutes: 90
|
||||
release_path: true
|
||||
- suite_id: docker-gateway-network
|
||||
label: Gateway Network Docker E2E
|
||||
command: pnpm test:docker:gateway-network
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-openai-web-search-minimal
|
||||
label: OpenAI Web Search Minimal Docker E2E
|
||||
command: pnpm test:docker:openai-web-search-minimal
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-mcp-channels
|
||||
label: MCP Channels Docker E2E
|
||||
command: pnpm test:docker:mcp-channels
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-pi-bundle-mcp-tools
|
||||
label: Pi Bundle MCP Tools Docker E2E
|
||||
command: pnpm test:docker:pi-bundle-mcp-tools
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-cron-mcp-cleanup
|
||||
label: Cron MCP Cleanup Docker E2E
|
||||
command: pnpm test:docker:cron-mcp-cleanup
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-plugins
|
||||
label: Plugins Docker E2E
|
||||
command: pnpm test:docker:plugins
|
||||
timeout_minutes: 75
|
||||
release_path: true
|
||||
- suite_id: docker-plugin-update
|
||||
label: Plugin Update Docker E2E
|
||||
command: pnpm test:docker:plugin-update
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-config-reload
|
||||
label: Config Reload Docker E2E
|
||||
command: pnpm test:docker:config-reload
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-bundled-channel-deps
|
||||
label: Bundled Channel Runtime Deps Docker E2E
|
||||
command: pnpm test:docker:bundled-channel-deps
|
||||
timeout_minutes: 75
|
||||
release_path: true
|
||||
- suite_id: docker-doctor-switch
|
||||
label: Doctor Install Switch Docker E2E
|
||||
command: pnpm test:docker:doctor-switch
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-update-channel-switch
|
||||
label: Update Channel Switch Docker E2E
|
||||
command: pnpm test:docker:update-channel-switch
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-session-runtime-context
|
||||
label: Session Runtime Context Docker E2E
|
||||
command: pnpm test:docker:session-runtime-context
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-qr
|
||||
label: QR Import Docker E2E
|
||||
command: pnpm test:docker:qr
|
||||
timeout_minutes: 60
|
||||
release_path: true
|
||||
- suite_id: docker-install-e2e
|
||||
label: Installer Docker E2E
|
||||
command: pnpm test:install:e2e
|
||||
- chunk_id: core
|
||||
label: core
|
||||
timeout_minutes: 120
|
||||
release_path: true
|
||||
- chunk_id: package-update
|
||||
label: package/update
|
||||
timeout_minutes: 180
|
||||
- chunk_id: plugins-integrations
|
||||
label: plugins/integrations
|
||||
timeout_minutes: 180
|
||||
env:
|
||||
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
|
||||
OPENAI_BASE_URL: ${{ secrets.OPENAI_BASE_URL }}
|
||||
@@ -496,7 +443,12 @@ jobs:
|
||||
OPENCLAW_GEMINI_SETTINGS_JSON: ${{ secrets.OPENCLAW_GEMINI_SETTINGS_JSON }}
|
||||
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
|
||||
OPENCLAW_DOCKER_E2E_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.image }}
|
||||
OPENCLAW_DOCKER_E2E_BARE_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.bare_image }}
|
||||
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.functional_image }}
|
||||
OPENCLAW_CURRENT_PACKAGE_TGZ: .artifacts/docker-e2e-package/openclaw-current.tgz
|
||||
OPENCLAW_SKIP_DOCKER_BUILD: "1"
|
||||
INCLUDE_OPENWEBUI: ${{ inputs.include_openwebui }}
|
||||
DOCKER_E2E_CHUNK: ${{ matrix.chunk_id }}
|
||||
steps:
|
||||
- name: Checkout selected ref
|
||||
uses: actions/checkout@v6
|
||||
@@ -521,45 +473,188 @@ jobs:
|
||||
- name: Hydrate live auth/profile inputs
|
||||
run: bash scripts/ci-hydrate-live-auth.sh
|
||||
|
||||
- name: Configure suite-specific env
|
||||
- name: Plan and hydrate Docker E2E chunk
|
||||
id: plan
|
||||
uses: ./.github/actions/docker-e2e-plan
|
||||
with:
|
||||
mode: chunk
|
||||
chunk: ${{ matrix.chunk_id }}
|
||||
include-openwebui: ${{ inputs.include_openwebui }}
|
||||
|
||||
- name: Run Docker E2E chunk
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
case "${{ matrix.suite_id }}" in
|
||||
docker-install-e2e)
|
||||
echo "OPENCLAW_E2E_MODELS=both" >> "$GITHUB_ENV"
|
||||
;;
|
||||
esac
|
||||
export OPENCLAW_DOCKER_ALL_PROFILE=release-path
|
||||
export OPENCLAW_DOCKER_ALL_CHUNK="${DOCKER_E2E_CHUNK}"
|
||||
export OPENCLAW_DOCKER_ALL_BUILD=0
|
||||
export OPENCLAW_DOCKER_ALL_PREFLIGHT=0
|
||||
export OPENCLAW_DOCKER_ALL_FAIL_FAST=0
|
||||
export OPENCLAW_DOCKER_ALL_INCLUDE_OPENWEBUI="${INCLUDE_OPENWEBUI}"
|
||||
export OPENCLAW_DOCKER_ALL_LOG_DIR=".artifacts/docker-tests/release-${DOCKER_E2E_CHUNK}"
|
||||
export OPENCLAW_DOCKER_ALL_TIMINGS_FILE=".artifacts/docker-tests/release-${DOCKER_E2E_CHUNK}-timings.json"
|
||||
export OPENCLAW_DOCKER_ALL_PNPM_COMMAND="$(command -v pnpm)"
|
||||
|
||||
- name: Validate suite credentials
|
||||
pnpm test:docker:all
|
||||
|
||||
- name: Summarize Docker E2E chunk
|
||||
if: always()
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
case "${{ matrix.suite_id }}" in
|
||||
docker-install-e2e)
|
||||
[[ -n "${OPENAI_API_KEY:-}" ]] || {
|
||||
echo "OPENAI_API_KEY is required for installer Docker E2E." >&2
|
||||
exit 1
|
||||
}
|
||||
if [[ -z "${ANTHROPIC_API_TOKEN:-}" && -z "${ANTHROPIC_API_KEY:-}" ]]; then
|
||||
echo "ANTHROPIC_API_TOKEN or ANTHROPIC_API_KEY is required for installer Docker E2E." >&2
|
||||
exit 1
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
summary=".artifacts/docker-tests/release-${DOCKER_E2E_CHUNK}/summary.json"
|
||||
if [[ ! -f "$summary" ]]; then
|
||||
echo "Docker chunk summary missing: \`$summary\`" >> "$GITHUB_STEP_SUMMARY"
|
||||
exit 0
|
||||
fi
|
||||
node scripts/docker-e2e.mjs summary "$summary" "Docker E2E chunk: ${DOCKER_E2E_CHUNK:-unknown}" >> "$GITHUB_STEP_SUMMARY"
|
||||
|
||||
- name: Run ${{ matrix.label }}
|
||||
run: ${{ matrix.command }}
|
||||
- name: Upload Docker E2E chunk artifacts
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v7
|
||||
with:
|
||||
name: docker-e2e-${{ matrix.chunk_id }}
|
||||
path: .artifacts/docker-tests/
|
||||
if-no-files-found: ignore
|
||||
|
||||
validate_docker_lanes:
|
||||
needs: [validate_selected_ref, prepare_docker_e2e_image]
|
||||
if: inputs.docker_lanes != ''
|
||||
name: Docker E2E targeted lanes
|
||||
runs-on: blacksmith-32vcpu-ubuntu-2404
|
||||
timeout-minutes: 180
|
||||
env:
|
||||
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
|
||||
OPENAI_BASE_URL: ${{ secrets.OPENAI_BASE_URL }}
|
||||
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
|
||||
ANTHROPIC_API_TOKEN: ${{ secrets.ANTHROPIC_API_TOKEN }}
|
||||
ANTHROPIC_API_KEY_OLD: ${{ secrets.ANTHROPIC_API_KEY_OLD }}
|
||||
BYTEPLUS_API_KEY: ${{ secrets.BYTEPLUS_API_KEY }}
|
||||
CEREBRAS_API_KEY: ${{ secrets.CEREBRAS_API_KEY }}
|
||||
DASHSCOPE_API_KEY: ${{ secrets.DASHSCOPE_API_KEY }}
|
||||
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
|
||||
KIMI_API_KEY: ${{ secrets.KIMI_API_KEY }}
|
||||
MODELSTUDIO_API_KEY: ${{ secrets.MODELSTUDIO_API_KEY }}
|
||||
MOONSHOT_API_KEY: ${{ secrets.MOONSHOT_API_KEY }}
|
||||
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
|
||||
MINIMAX_API_KEY: ${{ secrets.MINIMAX_API_KEY }}
|
||||
OPENCODE_API_KEY: ${{ secrets.OPENCODE_API_KEY }}
|
||||
OPENCODE_ZEN_API_KEY: ${{ secrets.OPENCODE_ZEN_API_KEY }}
|
||||
OPENCLAW_LIVE_BROWSER_CDP_URL: ${{ secrets.OPENCLAW_LIVE_BROWSER_CDP_URL }}
|
||||
OPENCLAW_LIVE_SETUP_TOKEN: ${{ secrets.OPENCLAW_LIVE_SETUP_TOKEN }}
|
||||
OPENCLAW_LIVE_SETUP_TOKEN_MODEL: ${{ secrets.OPENCLAW_LIVE_SETUP_TOKEN_MODEL }}
|
||||
OPENCLAW_LIVE_SETUP_TOKEN_PROFILE: ${{ secrets.OPENCLAW_LIVE_SETUP_TOKEN_PROFILE }}
|
||||
OPENCLAW_LIVE_SETUP_TOKEN_VALUE: ${{ secrets.OPENCLAW_LIVE_SETUP_TOKEN_VALUE }}
|
||||
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
|
||||
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
|
||||
OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
|
||||
QWEN_API_KEY: ${{ secrets.QWEN_API_KEY }}
|
||||
FAL_KEY: ${{ secrets.FAL_KEY }}
|
||||
RUNWAY_API_KEY: ${{ secrets.RUNWAY_API_KEY }}
|
||||
DEEPGRAM_API_KEY: ${{ secrets.DEEPGRAM_API_KEY }}
|
||||
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
|
||||
VYDRA_API_KEY: ${{ secrets.VYDRA_API_KEY }}
|
||||
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
|
||||
ZAI_API_KEY: ${{ secrets.ZAI_API_KEY }}
|
||||
Z_AI_API_KEY: ${{ secrets.Z_AI_API_KEY }}
|
||||
BYTEPLUS_ACCESS_KEY_ID: ${{ secrets.BYTEPLUS_ACCESS_KEY_ID }}
|
||||
BYTEPLUS_SECRET_ACCESS_KEY: ${{ secrets.BYTEPLUS_SECRET_ACCESS_KEY }}
|
||||
CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
|
||||
OPENCLAW_CODEX_AUTH_JSON: ${{ secrets.OPENCLAW_CODEX_AUTH_JSON }}
|
||||
OPENCLAW_CODEX_CONFIG_TOML: ${{ secrets.OPENCLAW_CODEX_CONFIG_TOML }}
|
||||
OPENCLAW_CLAUDE_JSON: ${{ secrets.OPENCLAW_CLAUDE_JSON }}
|
||||
OPENCLAW_CLAUDE_CREDENTIALS_JSON: ${{ secrets.OPENCLAW_CLAUDE_CREDENTIALS_JSON }}
|
||||
OPENCLAW_CLAUDE_SETTINGS_JSON: ${{ secrets.OPENCLAW_CLAUDE_SETTINGS_JSON }}
|
||||
OPENCLAW_CLAUDE_SETTINGS_LOCAL_JSON: ${{ secrets.OPENCLAW_CLAUDE_SETTINGS_LOCAL_JSON }}
|
||||
OPENCLAW_GEMINI_SETTINGS_JSON: ${{ secrets.OPENCLAW_GEMINI_SETTINGS_JSON }}
|
||||
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
|
||||
OPENCLAW_DOCKER_E2E_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.image }}
|
||||
OPENCLAW_DOCKER_E2E_BARE_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.bare_image }}
|
||||
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.functional_image }}
|
||||
OPENCLAW_CURRENT_PACKAGE_TGZ: .artifacts/docker-e2e-package/openclaw-current.tgz
|
||||
OPENCLAW_SKIP_DOCKER_BUILD: "1"
|
||||
INCLUDE_OPENWEBUI: ${{ inputs.include_openwebui }}
|
||||
DOCKER_E2E_LANES: ${{ inputs.docker_lanes }}
|
||||
steps:
|
||||
- name: Checkout selected ref
|
||||
uses: actions/checkout@v6
|
||||
with:
|
||||
ref: ${{ needs.validate_selected_ref.outputs.selected_sha }}
|
||||
fetch-depth: 1
|
||||
|
||||
- name: Log in to GHCR for shared Docker E2E image
|
||||
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
|
||||
with:
|
||||
registry: ghcr.io
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ github.token }}
|
||||
|
||||
- name: Setup Node environment
|
||||
uses: ./.github/actions/setup-node-env
|
||||
with:
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
pnpm-version: ${{ env.PNPM_VERSION }}
|
||||
install-bun: "true"
|
||||
|
||||
- name: Hydrate live auth/profile inputs
|
||||
run: bash scripts/ci-hydrate-live-auth.sh
|
||||
|
||||
- name: Plan and hydrate targeted Docker E2E lanes
|
||||
id: plan
|
||||
uses: ./.github/actions/docker-e2e-plan
|
||||
with:
|
||||
mode: targeted
|
||||
lanes: ${{ inputs.docker_lanes }}
|
||||
include-openwebui: ${{ inputs.include_openwebui }}
|
||||
|
||||
- name: Run targeted Docker E2E lanes
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
export OPENCLAW_DOCKER_ALL_LANES="${DOCKER_E2E_LANES}"
|
||||
export OPENCLAW_DOCKER_ALL_PREFLIGHT=0
|
||||
export OPENCLAW_DOCKER_ALL_FAIL_FAST=0
|
||||
export OPENCLAW_DOCKER_ALL_INCLUDE_OPENWEBUI="${INCLUDE_OPENWEBUI}"
|
||||
export OPENCLAW_DOCKER_ALL_LOG_DIR=".artifacts/docker-tests/targeted"
|
||||
export OPENCLAW_DOCKER_ALL_TIMINGS_FILE=".artifacts/docker-tests/targeted-timings.json"
|
||||
export OPENCLAW_DOCKER_ALL_PNPM_COMMAND="$(command -v pnpm)"
|
||||
if [[ "${{ steps.plan.outputs.needs_live_image }}" == "1" ]]; then
|
||||
pnpm test:docker:live-build
|
||||
fi
|
||||
export OPENCLAW_DOCKER_ALL_BUILD=0
|
||||
|
||||
pnpm test:docker:all
|
||||
|
||||
- name: Summarize targeted Docker E2E lanes
|
||||
if: always()
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
summary=".artifacts/docker-tests/targeted/summary.json"
|
||||
if [[ ! -f "$summary" ]]; then
|
||||
echo "Docker targeted summary missing: \`$summary\`" >> "$GITHUB_STEP_SUMMARY"
|
||||
exit 0
|
||||
fi
|
||||
node scripts/docker-e2e.mjs summary "$summary" "Docker E2E targeted lanes" >> "$GITHUB_STEP_SUMMARY"
|
||||
|
||||
- name: Upload targeted Docker E2E artifacts
|
||||
if: always()
|
||||
uses: actions/upload-artifact@v7
|
||||
with:
|
||||
name: docker-e2e-targeted
|
||||
path: .artifacts/docker-tests/
|
||||
if-no-files-found: ignore
|
||||
|
||||
validate_docker_openwebui:
|
||||
needs: [validate_selected_ref, prepare_docker_e2e_image]
|
||||
if: inputs.include_openwebui
|
||||
if: inputs.include_openwebui && !inputs.include_release_path_suites && inputs.docker_lanes == ''
|
||||
runs-on: blacksmith-32vcpu-ubuntu-2404
|
||||
timeout-minutes: 75
|
||||
env:
|
||||
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
|
||||
OPENAI_BASE_URL: ${{ secrets.OPENAI_BASE_URL }}
|
||||
OPENCLAW_DOCKER_E2E_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.image }}
|
||||
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE: ${{ needs.prepare_docker_e2e_image.outputs.functional_image }}
|
||||
OPENCLAW_SKIP_DOCKER_BUILD: "1"
|
||||
steps:
|
||||
- name: Checkout selected ref
|
||||
@@ -596,7 +691,7 @@ jobs:
|
||||
|
||||
prepare_docker_e2e_image:
|
||||
needs: validate_selected_ref
|
||||
if: inputs.include_release_path_suites || inputs.include_openwebui
|
||||
if: inputs.include_release_path_suites || inputs.include_openwebui || inputs.docker_lanes != ''
|
||||
runs-on: blacksmith-32vcpu-ubuntu-2404
|
||||
timeout-minutes: 90
|
||||
permissions:
|
||||
@@ -604,6 +699,13 @@ jobs:
|
||||
packages: write
|
||||
outputs:
|
||||
image: ${{ steps.image.outputs.image }}
|
||||
bare_image: ${{ steps.image.outputs.bare_image }}
|
||||
functional_image: ${{ steps.image.outputs.functional_image }}
|
||||
needs_bare_image: ${{ steps.plan.outputs.needs_bare_image }}
|
||||
needs_e2e_image: ${{ steps.plan.outputs.needs_e2e_image }}
|
||||
needs_functional_image: ${{ steps.plan.outputs.needs_functional_image }}
|
||||
needs_live_image: ${{ steps.plan.outputs.needs_live_image }}
|
||||
needs_package: ${{ steps.plan.outputs.needs_package }}
|
||||
env:
|
||||
DOCKER_BUILD_SUMMARY: "false"
|
||||
DOCKER_BUILD_RECORD_UPLOAD: "false"
|
||||
@@ -614,7 +716,7 @@ jobs:
|
||||
ref: ${{ needs.validate_selected_ref.outputs.selected_sha }}
|
||||
fetch-depth: 1
|
||||
|
||||
- name: Resolve shared Docker E2E image tag
|
||||
- name: Resolve shared Docker E2E image tags
|
||||
id: image
|
||||
shell: bash
|
||||
env:
|
||||
@@ -622,31 +724,127 @@ jobs:
|
||||
run: |
|
||||
set -euo pipefail
|
||||
repository="${GITHUB_REPOSITORY,,}"
|
||||
image="ghcr.io/${repository}-docker-e2e:${SELECTED_SHA}"
|
||||
bare_image="ghcr.io/${repository}-docker-e2e-bare:${SELECTED_SHA}"
|
||||
functional_image="ghcr.io/${repository}-docker-e2e-functional:${SELECTED_SHA}"
|
||||
image="$functional_image"
|
||||
echo "image=$image" >> "$GITHUB_OUTPUT"
|
||||
echo "Shared Docker E2E image: \`$image\`" >> "$GITHUB_STEP_SUMMARY"
|
||||
echo "bare_image=$bare_image" >> "$GITHUB_OUTPUT"
|
||||
echo "functional_image=$functional_image" >> "$GITHUB_OUTPUT"
|
||||
echo "Shared Docker E2E bare image: \`$bare_image\`" >> "$GITHUB_STEP_SUMMARY"
|
||||
echo "Shared Docker E2E functional image: \`$functional_image\`" >> "$GITHUB_STEP_SUMMARY"
|
||||
|
||||
- name: Plan Docker E2E images
|
||||
id: plan
|
||||
uses: ./.github/actions/docker-e2e-plan
|
||||
with:
|
||||
mode: prepare
|
||||
lanes: ${{ inputs.docker_lanes }}
|
||||
include-release-path-suites: ${{ inputs.include_release_path_suites }}
|
||||
include-openwebui: ${{ inputs.include_openwebui }}
|
||||
hydrate-artifacts: "false"
|
||||
|
||||
- name: Setup Node environment
|
||||
if: steps.plan.outputs.needs_package == '1'
|
||||
uses: ./.github/actions/setup-node-env
|
||||
with:
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
pnpm-version: ${{ env.PNPM_VERSION }}
|
||||
install-bun: "true"
|
||||
|
||||
- name: Pack OpenClaw package for Docker E2E
|
||||
if: steps.plan.outputs.needs_package == '1'
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
mkdir -p .artifacts/docker-e2e-package
|
||||
node scripts/package-openclaw-for-docker.mjs \
|
||||
--output-dir .artifacts/docker-e2e-package \
|
||||
--output-name openclaw-current.tgz
|
||||
|
||||
- name: Upload OpenClaw Docker E2E package
|
||||
if: steps.plan.outputs.needs_package == '1'
|
||||
uses: actions/upload-artifact@v7
|
||||
with:
|
||||
name: docker-e2e-package
|
||||
path: .artifacts/docker-e2e-package/openclaw-current.tgz
|
||||
if-no-files-found: error
|
||||
|
||||
- name: Log in to GHCR
|
||||
if: steps.plan.outputs.needs_e2e_image == '1'
|
||||
uses: docker/login-action@4907a6ddec9925e35a0a9e82d7399ccc52663121 # v4
|
||||
with:
|
||||
registry: ghcr.io
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ github.token }}
|
||||
|
||||
- name: Check existing shared Docker E2E images
|
||||
id: image_exists
|
||||
if: steps.plan.outputs.needs_e2e_image == '1'
|
||||
shell: bash
|
||||
run: |
|
||||
set -euo pipefail
|
||||
bare_exists=0
|
||||
functional_exists=0
|
||||
needs_build=0
|
||||
|
||||
if [[ "${{ steps.plan.outputs.needs_bare_image }}" == "1" ]]; then
|
||||
if docker manifest inspect "${{ steps.image.outputs.bare_image }}" >/dev/null 2>&1; then
|
||||
bare_exists=1
|
||||
echo "Shared Docker E2E bare image already exists: ${{ steps.image.outputs.bare_image }}"
|
||||
else
|
||||
needs_build=1
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ "${{ steps.plan.outputs.needs_functional_image }}" == "1" ]]; then
|
||||
if docker manifest inspect "${{ steps.image.outputs.functional_image }}" >/dev/null 2>&1; then
|
||||
functional_exists=1
|
||||
echo "Shared Docker E2E functional image already exists: ${{ steps.image.outputs.functional_image }}"
|
||||
else
|
||||
needs_build=1
|
||||
fi
|
||||
fi
|
||||
|
||||
echo "bare_exists=$bare_exists" >> "$GITHUB_OUTPUT"
|
||||
echo "functional_exists=$functional_exists" >> "$GITHUB_OUTPUT"
|
||||
echo "needs_build=$needs_build" >> "$GITHUB_OUTPUT"
|
||||
|
||||
- name: Setup Docker builder
|
||||
if: steps.image_exists.outputs.needs_build == '1'
|
||||
uses: useblacksmith/setup-docker-builder@ac083cc84672d01c60d5e8561d0a939b697de542 # v1
|
||||
|
||||
- name: Build and push shared Docker E2E image
|
||||
- name: Build and push bare Docker E2E image
|
||||
if: steps.plan.outputs.needs_bare_image == '1' && steps.image_exists.outputs.bare_exists != '1'
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
file: ./scripts/e2e/Dockerfile
|
||||
target: build
|
||||
target: bare
|
||||
platforms: linux/amd64
|
||||
cache-from: type=gha,scope=docker-e2e
|
||||
cache-to: type=gha,mode=max,scope=docker-e2e
|
||||
tags: ${{ steps.image.outputs.image }}
|
||||
provenance: false
|
||||
cache-from: type=gha,scope=docker-e2e-bare
|
||||
cache-to: type=gha,mode=max,scope=docker-e2e-bare
|
||||
tags: ${{ steps.image.outputs.bare_image }}
|
||||
sbom: true
|
||||
provenance: mode=max
|
||||
push: true
|
||||
|
||||
- name: Build and push functional Docker E2E image
|
||||
if: steps.plan.outputs.needs_functional_image == '1' && steps.image_exists.outputs.functional_exists != '1'
|
||||
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
|
||||
with:
|
||||
context: .
|
||||
file: ./scripts/e2e/Dockerfile
|
||||
target: functional
|
||||
build-contexts: |
|
||||
openclaw_package=.artifacts/docker-e2e-package
|
||||
platforms: linux/amd64
|
||||
cache-from: |
|
||||
type=gha,scope=docker-e2e-bare
|
||||
type=gha,scope=docker-e2e-functional
|
||||
cache-to: type=gha,mode=max,scope=docker-e2e-functional
|
||||
tags: ${{ steps.image.outputs.functional_image }}
|
||||
sbom: true
|
||||
provenance: mode=max
|
||||
push: true
|
||||
|
||||
validate_live_models_docker:
|
||||
|
||||
2
.gitignore
vendored
@@ -118,6 +118,8 @@ USER.md
|
||||
!.agents/skills/openclaw-test-heap-leaks/**
|
||||
!.agents/skills/openclaw-test-performance/
|
||||
!.agents/skills/openclaw-test-performance/**
|
||||
!.agents/skills/openclaw-testing/
|
||||
!.agents/skills/openclaw-testing/**
|
||||
!.agents/skills/optimizetests/
|
||||
!.agents/skills/optimizetests/**
|
||||
!.agents/skills/parallels-discord-roundtrip/
|
||||
|
||||
@@ -29,6 +29,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
|
||||
- Extension prod code: no core `src/**`, `src/plugin-sdk-internal/**`, other extension `src/**`, or relative outside package.
|
||||
- Core/tests: no deep plugin internals (`extensions/*/src/**`, `onboard.js`). Use `api.ts`, SDK facade, generic contracts.
|
||||
- Extension-owned behavior stays extension-owned: repair, detection, onboarding, auth/provider defaults, provider tools/settings.
|
||||
- Owner boundary: fix owner-specific behavior in the owner module. Shared/core gets generic seams only; no owner ids, dependency strings, defaults, migrations, or recovery policy. If a bug names an extension or its dependency, start in that extension and add a generic core seam only when multiple owners need it.
|
||||
- Legacy config repair: doctor/fix paths, not startup/load-time core migrations.
|
||||
- Core test asserting extension-specific behavior: move to owner extension or generic contract test.
|
||||
- New seams: backwards-compatible, documented, versioned. Third-party plugins exist.
|
||||
@@ -50,7 +51,8 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
|
||||
- Extension tests: `pnpm test:extensions`, `pnpm test extensions`, `pnpm test extensions/<id>`.
|
||||
- Targeted tests: `pnpm test <path-or-filter> [vitest args...]`; never raw `vitest`.
|
||||
- Typecheck: `tsgo` lanes only (`pnpm tsgo*`, `pnpm check:test-types`); do not add `tsc --noEmit`, `typecheck`, `check:types`.
|
||||
- Format/lint: `pnpm format:check`/`pnpm format`; `pnpm lint*` lanes.
|
||||
- Formatting: use `oxfmt`, not Prettier. Prefer `pnpm format:check` / `pnpm format`; for targeted files use `pnpm exec oxfmt --check --threads=1 <files...>` or `pnpm exec oxfmt --write --threads=1 <files...>`.
|
||||
- Linting: use repo wrappers (`pnpm lint:*`, `scripts/run-oxlint.mjs`); do not invoke generic JS formatters/lints unless a repo script uses them.
|
||||
- Heavy checks: `OPENCLAW_LOCAL_CHECK=1`, mode `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`; CI/shared use `OPENCLAW_LOCAL_CHECK=0`.
|
||||
- Local first. Use repo `pnpm` lanes before Blacksmith/Testbox. Remote only for parity-only failures, secrets/services, or explicit ask.
|
||||
|
||||
@@ -58,6 +60,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
|
||||
|
||||
- Triage: list first, hydrate few. Use bounded `gh --json --jq`; avoid repeated full comment scans.
|
||||
- Automatic PR/issue discovery: skip maintainer-owned items unless directly relevant. Do not comment, close, label, retitle, rebase, fix up, or land them without Peter asking.
|
||||
- PR scan/triage: no unsolicited PR comments/reviews. Report in chat only unless explicitly asked, or a close/duplicate action needs a reason comment.
|
||||
- Search/dedupe: prefer `gh search issues 'repo:openclaw/openclaw is:open <terms>' --json number,title,state,updatedAt --limit 20`.
|
||||
- GitHub search boolean text is fussy. If `OR` queries return empty, split exact terms and search title/body/comments separately before concluding no hits.
|
||||
- PR shortlist: `gh pr list ...`; then `gh pr view <n> --json number,title,body,closingIssuesReferences,files,statusCheckRollup,reviewDecision`.
|
||||
@@ -117,6 +120,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
|
||||
## Tests
|
||||
|
||||
- Vitest. Colocated `*.test.ts`; e2e `*.e2e.test.ts`; example models `sonnet-4.6`, `gpt-5.4`.
|
||||
- Avoid brittle tests that grep workflow/docs strings for operator policy. Prefer executable behavior, parsed config/schema checks, or live run proof; put release/CI policy reminders in AGENTS/docs instead.
|
||||
- Clean timers/env/globals/mocks/sockets/temp dirs/module state; `--isolate=false` safe.
|
||||
- Hot tests: avoid per-test `vi.resetModules()` + heavy imports. Measure with `pnpm test:perf:imports <file>` / `pnpm test:perf:hotspots --limit N`.
|
||||
- Seam depth: pure helper/contract unit tests; one integration smoke per boundary.
|
||||
@@ -132,7 +136,7 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
|
||||
|
||||
- Docs change with behavior/API. Use docs list/read_when hints; docs links per `docs/AGENTS.md`.
|
||||
- Changelog user-facing only; pure test/internal usually no entry.
|
||||
- Changelog placement: active version `### Changes`/`### Fixes`; every added entry must include at least one `Thanks @author` attribution, using credited GitHub username(s). Never add `Thanks @steipete`.
|
||||
- Changelog placement: active version `### Changes`/`### Fixes`; every added entry must include at least one `Thanks @author` attribution, using credited GitHub username(s). Never add `Thanks @steipete` or `Thanks @codex`.
|
||||
- Changelog bullets are always single-line. No wrapping/continuation across multiple lines. Long entries stay on one long line so dedupe, PR-ref, and credit-audit tooling work and so the visual style stays uniform.
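
A minimal sketch of the changelog entry rules above (single-line bullet, at least one `Thanks @author`, no `Thanks @steipete` or `Thanks @codex`). The function name, the username pattern, and the problem strings are hypothetical illustrations, not the repo's actual credit-audit tooling.

```typescript
// Hypothetical checker; the real dedupe/credit-audit tooling is not shown in this diff.
const BANNED_AUTHORS = ["steipete", "codex"];

function checkChangelogEntry(entry: string): string[] {
  const problems: string[] = [];

  // Bullets must stay on one line, however long they get.
  if (entry.indexOf("\n") !== -1) {
    problems.push("entry spans multiple lines");
  }

  // At least one "Thanks @author" attribution is required.
  // The username character class is an assumption for this sketch.
  if (!/Thanks @[A-Za-z0-9-]+/.test(entry)) {
    problems.push("missing Thanks @author attribution");
  }

  // Reject the disallowed credits, matching whole usernames only.
  for (let i = 0; i < BANNED_AUTHORS.length; i++) {
    const banned = new RegExp("Thanks @" + BANNED_AUTHORS[i] + "(?![A-Za-z0-9-])");
    if (banned.test(entry)) {
      problems.push("disallowed credit: Thanks @" + BANNED_AUTHORS[i]);
    }
  }

  return problems;
}

console.log(checkChangelogEntry("- Logging: add hostname field. Fixes #51075. Thanks @stevengonsalvez."));
console.log(checkChangelogEntry("- Plugins: share entrypoint resolution. Thanks @codex."));
```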
|
||||
|
||||
## Git
|
||||
|
||||
23
CHANGELOG.md
@@ -6,6 +6,22 @@ Docs: https://docs.openclaw.ai
|
||||
|
||||
### Fixes
|
||||
|
||||
- Codex harness: normalize cached input tokens before session/context accounting so prompt cache reads are not double-counted in `/status`, `session_status`, or persisted `sessionEntry.totalTokens`. Fixes #69298. Thanks @richardmqq.
|
||||
- Hooks/session-memory: use the host local timezone for memory filenames, fallback timestamp slugs, and markdown headers instead of UTC dates. Fixes #46703. (#46721) Thanks @Astro-Han.
|
||||
- Feishu: extract quoted/replied interactive-card text across schema 1.0, schema 2.0, i18n, template-variable, and post-format fallback shapes without carrying broad generated/config churn from related parser experiments. (#38776, #60383, #42218, #45936) Thanks @lishuaigit, @lskun, @just2gooo, and @Br1an67.
|
||||
- Exec approvals: accept a symlinked `OPENCLAW_HOME` as the trusted approvals root while still rejecting symlinked `.openclaw` path components below it. (#64663) Thanks @FunJim.
|
||||
- Logging: add top-level `hostname`, flattened `message`, and available `agent_id`, `session_id`, and `channel` fields to file-log JSONL records for multi-agent filtering without removing existing structured log arguments. Fixes #51075. Thanks @stevengonsalvez.
|
||||
- ACP: route server logs to stderr before Gateway config/bootstrap work so ACP stdout remains JSON-RPC only for IDE integrations. Fixes #49060. Thanks @Hollychou924.
|
||||
- Logging: propagate internal request trace scopes through Gateway HTTP requests and WebSocket frames so file logs, diagnostic events, agent run traces, model-call traces, OTEL spans, and trusted provider `traceparent` headers share a correlatable `traceId` without logging raw request or model content. Fixes #40353. Thanks @liangruochong44-ui.
|
||||
- Diagnostics/OTEL: capture privacy-safe model-call request payload bytes, streamed response bytes, first-response latency, and total duration in diagnostic events, plugin hooks, stability snapshots, and OTEL model-call spans/metrics without logging raw model content. Fixes #33832. Thanks @wwh830.
|
||||
- Logging: write validated diagnostic trace context as top-level `traceId`, `spanId`, `parentSpanId`, and `traceFlags` fields in file-log JSONL records so traced requests and model calls are easier to correlate in log processors. Refs #40353. Thanks @liangruochong44-ui.
|
||||
- Logging/sessions: apply configured redaction patterns to persisted session transcript text and accept escaped character classes in safe custom redaction regexes, so transcript JSONL no longer keeps matching sensitive text in the clear. Fixes #42982. Thanks @panpan0000.
|
||||
- Providers/Ollama: honor `/api/show` capabilities when registering local models so non-tool Ollama models no longer receive the agent tool surface, and keep native Ollama thinking opt-in instead of enabling it by default. Fixes #64710 and duplicate #65343. Thanks @yuan-b, @netherby, @xilopaint, and @Diyforfun2026.
|
||||
- Providers/Ollama: expose native Ollama thinking effort levels so `/think max` is accepted for reasoning-capable Ollama models and maps to Ollama's highest supported `think` effort. Fixes #71584. Thanks @g0st1n.
|
||||
- Agents/Ollama: validate explicit `--thinking max` against catalog-discovered Ollama reasoning metadata so local agent runs accept the same native thinking levels shown in the model catalog. Fixes #71584. Thanks @g0st1n.
|
||||
- Docker/QA: add observability coverage to the normal Docker aggregate so QA-lab OTEL and Prometheus diagnostics run inside Docker. Thanks @vincentkoc.
|
||||
- Auto-reply: poison inbound message dedupe after replay-unsafe provider/runtime failures so retries stay safe before visible progress but cannot duplicate messages after block output, tool side effects, or session progress. Fixes #69303; keeps #58549 and #64606 as duplicate validation. Thanks @martingarramon, @NikolaFC, and @zeroth-blip.
|
||||
- Agents/model fallback: jump directly to a known later live-session model redirect instead of walking unrelated fallback candidates, while preserving the already-landed live-session/fallback loop guard. Fixes #57471; related loop family already closed via #58496. Thanks @yuxiaoyang2007-prog.
|
||||
- Gateway/Bonjour: keep @homebridge/ciao cancellation handlers registered across advertiser restarts so late probing cancellations cannot crash Linux and other mDNS-churned gateways. Thanks @codex.
|
||||
- Plugins/startup: load the default `memory-core` slot during Gateway startup when permitted so active-memory recall can call `memory_search` and `memory_get` without requiring an explicit `plugins.slots.memory` entry, while preserving `plugins.slots.memory: "none"`. Thanks @codex.
|
||||
- Plugins/CLI: prefer native require for compiled bundled plugin JavaScript before jiti so read-only config, status, device, and node commands avoid unnecessary transform overhead on slow hosts. Fixes #62842. Thanks @Effet.
|
||||
@@ -14,8 +30,14 @@ Docs: https://docs.openclaw.ai
|
||||
- Plugins/CLI: refresh the persisted registry after managed plugin files are removed so ClawHub uninstall cannot leave stale `plugins list` entries. Thanks @codex.
|
||||
- Plugins/CLI: make plugin install and uninstall config writes conflict-aware, clear stale denylist entries on explicit reinstall/removal, and delete managed plugin files only after config/index commit succeeds. Thanks @codex.
|
||||
- Plugins: fail `plugins update` when tracked plugin or hook updates error, keep bundled runtime-dependency repair behind restrictive allowlists, and reject package installs with unloadable extension entries. Thanks @codex.
|
||||
- WebChat/Control UI: support non-video file attachments in chat uploads while preserving the existing image attachment path and MIME-sniff fallback for generic image uploads. (#70947) Thanks @IAMSamuelRodda.
|
||||
- Skills/memory: restore Chokidar v5 hot reloads by watching concrete skill and memory roots with filters, including SKILL.md removals and deleted skill folders without broad workspace recursion. Fixes #27404, #33585, and #41606. Thanks @shelvenzhou, @08820048, and @rocke2020.
|
||||
- Gateway/chat: keep duplicate attachment-backed `chat.send` retries with the same idempotency key on the documented in-flight path so aborts still target the real active run. Fixes #70139. Thanks @Feelw00.
|
||||
- Plugins: share package entrypoint resolution between install and discovery, reject mismatched `runtimeExtensions`, and cache bundled runtime-dependency manifest reads during scans. Thanks @codex.
|
||||
- WhatsApp/Web: keep quiet but healthy linked-device sessions connected by basing the watchdog on WhatsApp Web transport activity, while retaining a longer app-silence cap so frame activity cannot mask a stuck session forever. Fixes #70678; carries forward the focused #71466 approach and keeps #63939 as related configurable-timeout follow-up. Thanks @vincentkoc and @oromeis.
|
||||
- Discord/gateway: count failed health-monitor restart attempts toward cooldown and hourly caps, and evict stale account lifecycle state during channel reloads so repeated Discord gateway recovery cannot loop on old status. Fixes #38596. (#40413) Thanks @jellyAI-dev and @vashquez.
|
||||
- Cron/context engine: run isolated cron jobs under run-scoped context-engine session keys so prior runs of the same job are not inherited unless the job is explicitly session-bound. (#72292) Thanks @jalehman.
|
||||
- Control UI: localize command palette labels, categories, skill shortcuts, footer hints, and connect-command copy labels while preserving localized command palette search matching. (#61130, #61119) Thanks @rubensfox20.
|
||||
|
||||
## 2026.4.26
|
||||
|
||||
@@ -28,6 +50,7 @@ Docs: https://docs.openclaw.ai
|
||||
- Onboarding/models: keep skip-auth and provider-scoped model picker prompts off the full global model catalog path, and cache provider catalog hook resolution so setup no longer stalls after auth on large plugin registries. Thanks @shakkernerd.
|
||||
- Gateway/Bonjour: suppress known @homebridge/ciao cancellation and network assertion failures through scoped process handlers so malformed mDNS packets or restricted VPS networking disable/restart Bonjour instead of crashing the gateway. Fixes #67578. Thanks @zenassist26-create.
|
||||
- Discord: keep late clicks on already-resolved exec approval buttons quiet when elevated mode auto-resolved the request, while still surfacing real approval submission failures. Fixes #66906. Thanks @rlerikse.
|
||||
- Telegram: send a fresh final message for long-lived preview-streamed replies so the visible Telegram timestamp reflects completion time instead of the preview creation time. Thanks @rubencu.
|
||||
|
||||
## 2026.4.25
|
||||
|
||||
|
||||
36
Dockerfile
@@ -9,22 +9,19 @@
|
||||
# bundled plugin workspace tree, so the main build layer is not invalidated by
|
||||
# unrelated plugin source changes.
|
||||
#
|
||||
# Two runtime variants:
|
||||
# Default (bookworm): docker build .
|
||||
# Slim (bookworm-slim): docker build --build-arg OPENCLAW_VARIANT=slim .
|
||||
# Build stages use full bookworm; the runtime image is always bookworm-slim.
|
||||
ARG OPENCLAW_EXTENSIONS=""
|
||||
ARG OPENCLAW_VARIANT=default
|
||||
ARG OPENCLAW_BUNDLED_PLUGIN_DIR=extensions
|
||||
ARG OPENCLAW_DOCKER_APT_UPGRADE=1
|
||||
ARG OPENCLAW_NODE_BOOKWORM_IMAGE="node:24-bookworm@sha256:3a09aa6354567619221ef6c45a5051b671f953f0a1924d1f819ffb236e520e6b"
|
||||
ARG OPENCLAW_NODE_BOOKWORM_DIGEST="sha256:3a09aa6354567619221ef6c45a5051b671f953f0a1924d1f819ffb236e520e6b"
|
||||
ARG OPENCLAW_NODE_BOOKWORM_SLIM_IMAGE="node:24-bookworm-slim@sha256:e8e2e91b1378f83c5b2dd15f0247f34110e2fe895f6ca7719dbb780f929368eb"
|
||||
ARG OPENCLAW_NODE_BOOKWORM_SLIM_DIGEST="sha256:e8e2e91b1378f83c5b2dd15f0247f34110e2fe895f6ca7719dbb780f929368eb"
|
||||
|
||||
# Base images are pinned to SHA256 digests for reproducible builds.
|
||||
# Trade-off: digests must be updated manually when upstream tags move.
|
||||
# To update, run: docker buildx imagetools inspect node:24-bookworm (or podman)
|
||||
# and replace the digest below with the current multi-arch manifest list entry.
|
||||
# Dependabot refreshes these blessed digests; release builds consume the
|
||||
# reviewed base snapshot instead of mutating distro state on every build.
|
||||
# To update, run: docker buildx imagetools inspect node:24-bookworm and
|
||||
# node:24-bookworm-slim (or podman) and replace the digests below with the
|
||||
# current multi-arch manifest list entries.
|
||||
|
||||
FROM ${OPENCLAW_NODE_BOOKWORM_IMAGE} AS ext-deps
|
||||
ARG OPENCLAW_EXTENSIONS
|
||||
@@ -125,22 +122,15 @@ RUN printf 'packages:\n - .\n - ui\n' > /tmp/pnpm-workspace.runtime.yaml && \
|
||||
node scripts/postinstall-bundled-plugins.mjs && \
|
||||
find dist -type f \( -name '*.d.ts' -o -name '*.d.mts' -o -name '*.d.cts' -o -name '*.map' \) -delete
|
||||
|
||||
# ── Runtime base images ─────────────────────────────────────────
|
||||
FROM ${OPENCLAW_NODE_BOOKWORM_IMAGE} AS base-default
|
||||
ARG OPENCLAW_NODE_BOOKWORM_DIGEST
|
||||
LABEL org.opencontainers.image.base.name="docker.io/library/node:24-bookworm" \
|
||||
org.opencontainers.image.base.digest="${OPENCLAW_NODE_BOOKWORM_DIGEST}"
|
||||
|
||||
FROM ${OPENCLAW_NODE_BOOKWORM_SLIM_IMAGE} AS base-slim
|
||||
# ── Runtime base image ──────────────────────────────────────────
|
||||
FROM ${OPENCLAW_NODE_BOOKWORM_SLIM_IMAGE} AS base-runtime
|
||||
ARG OPENCLAW_NODE_BOOKWORM_SLIM_DIGEST
|
||||
LABEL org.opencontainers.image.base.name="docker.io/library/node:24-bookworm-slim" \
|
||||
org.opencontainers.image.base.digest="${OPENCLAW_NODE_BOOKWORM_SLIM_DIGEST}"
|
||||
|
||||
# ── Stage 3: Runtime ────────────────────────────────────────────
|
||||
FROM base-${OPENCLAW_VARIANT}
|
||||
ARG OPENCLAW_VARIANT
|
||||
FROM base-runtime
|
||||
ARG OPENCLAW_BUNDLED_PLUGIN_DIR
|
||||
ARG OPENCLAW_DOCKER_APT_UPGRADE
|
||||
|
||||
# OCI base-image metadata for downstream image consumers.
|
||||
# If you change these annotations, also update:
|
||||
@@ -155,16 +145,10 @@ LABEL org.opencontainers.image.source="https://github.com/openclaw/openclaw" \
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install system utilities present in bookworm but missing in bookworm-slim.
|
||||
# On the full bookworm image these are already installed (apt-get is a no-op).
|
||||
# Smoke workflows can opt out of distro upgrades to cut repeated CI time while
|
||||
# keeping the default runtime image behavior unchanged.
|
||||
# Install runtime system utilities missing from bookworm-slim.
|
||||
RUN --mount=type=cache,id=openclaw-bookworm-apt-cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,id=openclaw-bookworm-apt-lists,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update && \
|
||||
if [ "${OPENCLAW_DOCKER_APT_UPGRADE}" != "0" ]; then \
|
||||
DEBIAN_FRONTEND=noninteractive apt-get upgrade -y --no-install-recommends; \
|
||||
fi && \
|
||||
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
|
||||
procps hostname curl git lsof openssl
|
||||
|
||||
|
||||
@@ -7,7 +7,6 @@ ENV DEBIAN_FRONTEND=noninteractive
|
||||
RUN --mount=type=cache,id=openclaw-sandbox-bookworm-apt-cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,id=openclaw-sandbox-bookworm-apt-lists,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update \
|
||||
&& apt-get upgrade -y --no-install-recommends \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
bash \
|
||||
ca-certificates \
|
||||
|
||||
@@ -7,7 +7,6 @@ ENV DEBIAN_FRONTEND=noninteractive
|
||||
RUN --mount=type=cache,id=openclaw-sandbox-bookworm-apt-cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,id=openclaw-sandbox-bookworm-apt-lists,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update \
|
||||
&& apt-get upgrade -y --no-install-recommends \
|
||||
&& apt-get install -y --no-install-recommends \
|
||||
bash \
|
||||
ca-certificates \
|
||||
|
||||
@@ -24,7 +24,6 @@ ENV PATH=${BUN_INSTALL_DIR}/bin:${BREW_INSTALL_DIR}/bin:${BREW_INSTALL_DIR}/sbin
|
||||
RUN --mount=type=cache,id=openclaw-sandbox-common-apt-cache,target=/var/cache/apt,sharing=locked \
|
||||
--mount=type=cache,id=openclaw-sandbox-common-apt-lists,target=/var/lib/apt,sharing=locked \
|
||||
apt-get update \
|
||||
&& apt-get upgrade -y --no-install-recommends \
|
||||
&& apt-get install -y --no-install-recommends ${PACKAGES}
|
||||
|
||||
RUN if [ "${INSTALL_PNPM}" = "1" ]; then npm install -g pnpm; fi
|
||||
|
||||
@@ -6,9 +6,9 @@ services:
|
||||
TERM: xterm-256color
|
||||
OPENCLAW_GATEWAY_TOKEN: ${OPENCLAW_GATEWAY_TOKEN:-}
|
||||
OPENCLAW_ALLOW_INSECURE_PRIVATE_WS: ${OPENCLAW_ALLOW_INSECURE_PRIVATE_WS:-}
|
||||
# Docker bridge networks usually do not carry mDNS multicast reliably.
|
||||
# Set OPENCLAW_DISABLE_BONJOUR=0 only on host/macvlan/mDNS-capable networks.
|
||||
OPENCLAW_DISABLE_BONJOUR: ${OPENCLAW_DISABLE_BONJOUR:-1}
|
||||
# Empty means auto: Bonjour disables itself in detected containers.
|
||||
# Set 0 only on host/macvlan/mDNS-capable networks; set 1 to force off.
|
||||
OPENCLAW_DISABLE_BONJOUR: ${OPENCLAW_DISABLE_BONJOUR:-}
|
||||
# OpenTelemetry export is outbound OTLP/HTTP from the Gateway. Prometheus
|
||||
# uses the existing authenticated Gateway route; it does not need a port.
|
||||
OTEL_EXPORTER_OTLP_ENDPOINT: ${OTEL_EXPORTER_OTLP_ENDPOINT:-}
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
7fa6e35bb9f9d3096d6281f141488be0dcfe15de40dc4f5c0305eb1ff2bc60b6 config-baseline.json
|
||||
5f5fb87fd46f9cbb84d8af17e00ae3c4b74062e8ad517bc2260ba83da2e9014f config-baseline.core.json
|
||||
3e6dd8292d9350b0ccc243f81f7b6e95494fc769c01c084d8d6d6e9e1f668a14 config-baseline.json
|
||||
e040e5818afe66d71fc8a7ae1653f1e8c252cc5b51480ef3b4ae1269682b9ade config-baseline.core.json
|
||||
7cd9c908f066c143eab2a201efbc9640f483ab28bba92ddeca1d18cc2b528bc3 config-baseline.channel.json
|
||||
f9e0174988718959fe1923a54496ec5b9262721fe1e7306f32ccb1316d9d9c3f config-baseline.plugin.json
|
||||
74b74cb18ac37c0acaa765f398f1f9edbcee4c43567f02d45c89598a1e13afb4 config-baseline.plugin.json
|
||||
|
||||
@@ -1,2 +1,2 @@
|
||||
fd941e0485a92ebb8256cf2256330b58c2d5bd94189f4a05d7394353ef7bed88 plugin-sdk-api-baseline.json
|
||||
11ef8362518a0d9f221dc1958b25db46956d1916f278b53e52199bf6c2cbc65b plugin-sdk-api-baseline.jsonl
|
||||
21914ef8c5840e0defc36d571834dc28a92d6d5ca2d42a088c33b4de681e836a plugin-sdk-api-baseline.json
|
||||
3f22e6af0dad3433d25d996802d7436a3cc0e68bc86ecaf813a22e2b4e5333eb plugin-sdk-api-baseline.jsonl
|
||||
|
||||
@@ -173,7 +173,7 @@ openclaw hooks enable <hook-name>
|
||||
|
||||
### session-memory details
|
||||
|
||||
Extracts the last 15 user/assistant messages, generates a descriptive filename slug via LLM, and saves to `<workspace>/memory/YYYY-MM-DD-slug.md`. Requires `workspace.dir` to be configured.
|
||||
Extracts the last 15 user/assistant messages, generates a descriptive filename slug via LLM, and saves to `<workspace>/memory/YYYY-MM-DD-slug.md` using the host local date. Requires `workspace.dir` to be configured.
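
As a rough illustration of the local-date naming, the sketch below builds a `YYYY-MM-DD-slug.md` name from the host's local calendar date rather than the UTC date. `memoryFileName` and the example slug are hypothetical names for this sketch, not the hook's actual code.

```typescript
// Hypothetical helper; not taken from the session-memory hook source.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Local-time getters (getFullYear/getMonth/getDate) follow the host timezone,
// so a late-evening session is filed under the local day, not the UTC day.
function memoryFileName(now: Date, slug: string): string {
  const datePart = now.getFullYear() + "-" + pad2(now.getMonth() + 1) + "-" + pad2(now.getDate());
  return datePart + "-" + slug + ".md";
}

// Built from local-time components: 7 April 2026, 23:30 local time.
console.log(memoryFileName(new Date(2026, 3, 7, 23, 30), "api-design"));
```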
|
||||
|
||||
<a id="bootstrap-extra-files"></a>
|
||||
|
||||
|
||||
@@ -298,8 +298,8 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
|
||||
|
||||
For text-only replies:
|
||||
|
||||
- DM: OpenClaw keeps the same preview message and performs a final edit in place (no second message)
|
||||
- group/topic: OpenClaw keeps the same preview message and performs a final edit in place (no second message)
|
||||
- short DM/group/topic previews: OpenClaw keeps the same preview message and performs a final edit in place
|
||||
- previews older than about one minute: OpenClaw sends the completed reply as a fresh final message and then cleans up the preview, so Telegram's visible timestamp reflects completion time instead of the preview creation time
|
||||
|
||||
For complex replies (for example media payloads), OpenClaw falls back to normal final delivery and then cleans up the preview message.
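
The final-delivery rules above can be sketched as a small decision function. This is only an illustration: the type names, the function name, and the exact 60-second threshold are assumptions (the docs say "about one minute").

```typescript
// Hypothetical threshold; the documentation only says "about one minute".
const FRESH_FINAL_AFTER_MS = 60 * 1000;

type FinalDelivery =
  | "edit-in-place"              // final edit on the existing preview message
  | "fresh-final-then-cleanup"   // send a new final message, then remove the preview
  | "normal-final-then-cleanup"; // complex replies (for example media payloads)

interface PreviewState {
  createdAtMs: number; // when the preview message was first sent
  textOnly: boolean;   // false for complex replies such as media payloads
}

function chooseFinalDelivery(preview: PreviewState, nowMs: number): FinalDelivery {
  if (!preview.textOnly) {
    return "normal-final-then-cleanup";
  }
  const ageMs = nowMs - preview.createdAtMs;
  // Old previews get a fresh message so the visible timestamp reflects completion time.
  return ageMs >= FRESH_FINAL_AFTER_MS ? "fresh-final-then-cleanup" : "edit-in-place";
}

console.log(chooseFinalDelivery({ createdAtMs: 0, textOnly: true }, 5 * 1000));
console.log(chooseFinalDelivery({ createdAtMs: 0, textOnly: true }, 90 * 1000));
```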
|
||||
|
||||
|
||||
@@ -146,6 +146,7 @@ OpenClaw recommends running WhatsApp on a separate number when possible. (The ch
|
||||
## Runtime model
|
||||
|
||||
- Gateway owns the WhatsApp socket and reconnect loop.
|
||||
- The reconnect watchdog uses WhatsApp Web transport activity, not only inbound app-message volume, so a quiet linked-device session is not restarted solely because nobody has sent a message recently. A longer application-silence cap still forces a reconnect if transport frames keep arriving but no application messages are handled for the watchdog window.
|
||||
- Outbound sends require an active WhatsApp listener for the target account.
|
||||
- Status and broadcast chats are ignored (`@status`, `@broadcast`).
|
||||
- Direct chats use DM session rules (`session.dmScope`; default `main` collapses DMs to the agent main session).
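
The reconnect watchdog rule in the list above can be sketched as follows. The field names, reason strings, and timeout values are hypothetical; the sketch only shows the shape of the decision: a quiet session stays up while transport frames arrive, and a longer application-silence cap still forces a reconnect.

```typescript
// Hypothetical shapes; not the Gateway's actual watchdog types.
interface WatchdogInput {
  nowMs: number;
  socketOpen: boolean;
  lastTransportActivityMs: number; // last WhatsApp Web transport frame seen
  lastAppActivityMs: number;       // last application message handled
}

interface WatchdogLimits {
  transportTimeoutMs: number; // assumed shorter window for transport silence
  appSilenceCapMs: number;    // assumed longer safety window for app silence
}

// Returns a reason to reconnect, or null when the session looks healthy.
function reconnectReason(input: WatchdogInput, limits: WatchdogLimits): string | null {
  if (!input.socketOpen) {
    return "socket-closed";
  }
  if (input.nowMs - input.lastTransportActivityMs > limits.transportTimeoutMs) {
    return "transport-stalled";
  }
  // Frames keep arriving, but nothing has been handled for too long.
  if (input.nowMs - input.lastAppActivityMs > limits.appSilenceCapMs) {
    return "app-silence-cap";
  }
  return null;
}

const limits: WatchdogLimits = { transportTimeoutMs: 30 * 1000, appSilenceCapMs: 300 * 1000 };
console.log(reconnectReason(
  { nowMs: 100000, socketOpen: true, lastTransportActivityMs: 95000, lastAppActivityMs: 0 },
  limits
));
```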
|
||||
@@ -510,6 +511,10 @@ Behavior notes:
|
||||
<Accordion title="Linked but disconnected / reconnect loop">
|
||||
Symptom: linked account with repeated disconnects or reconnect attempts.
|
||||
|
||||
Quiet accounts can stay connected past the normal message timeout; the watchdog
|
||||
restarts when WhatsApp Web transport activity stops, the socket closes, or
|
||||
application-level activity stays silent beyond the longer safety window.
|
||||
|
||||
Fix:
|
||||
|
||||
```bash
|
||||
|
||||
26
docs/ci.md
File diff suppressed because one or more lines are too long
@@ -21,8 +21,12 @@ calls paired with their matching `toolResult` entries. If a split point lands
|
||||
inside a tool block, OpenClaw moves the boundary so the pair stays together and
|
||||
the current unsummarized tail is preserved.
|
||||
|
||||
The full conversation history stays on disk. Compaction only changes what the
|
||||
model sees on the next turn.
|
||||
By default, OpenClaw also rewrites the session transcript after compaction and
|
||||
removes the message entries that were summarized. The persisted summary and
|
||||
recent unsummarized tail remain on disk. Set
|
||||
`agents.defaults.compaction.truncateAfterCompaction` to `false` if you need the
|
||||
older behavior where compaction only changed what the model saw on the next
|
||||
turn and left the full transcript intact.
|
||||
|
||||
## Auto-compaction
|
||||
|
||||
|
||||
@@ -265,6 +265,7 @@ That means fallback retries have to coordinate with live model switching:
|
||||
- System-driven model changes such as fallback rotation, heartbeat overrides, or compaction never mark a pending live switch on their own.
|
||||
- Before a fallback retry starts, the reply runner persists the selected fallback override fields to the session entry.
|
||||
- Live-session reconciliation prefers persisted session overrides over stale runtime model fields.
|
||||
- If a live-switch error points at a later candidate in the active fallback chain, OpenClaw jumps directly to that selected model instead of walking unrelated candidates first.
|
||||
- If the fallback attempt fails, the runner rolls back only the override fields it wrote, and only if they still match that failed candidate.
|
||||
|
||||
This prevents the classic race:
|
||||
|
||||
@@ -65,6 +65,15 @@ model calls must not export `StreamAbandoned` on successful turns; raw diagnosti
|
||||
`openclaw.content.*` attributes must stay out of the trace. It writes
|
||||
`otel-smoke-summary.json` next to the QA suite artifacts.
|
||||
|
||||
The normal Docker aggregate and release-path core chunk also run an
|
||||
observability lane. It reuses the shared package-installed functional Docker
|
||||
image, mounts the QA harness files read-only, runs the OTEL trace smoke inside
|
||||
the container, then runs the `docker-prometheus-smoke` QA scenario with the
|
||||
`diagnostics-prometheus` plugin enabled. Set
|
||||
`OPENCLAW_DOCKER_OBSERVABILITY_LOOPS=<count>` to repeat both checks inside one
|
||||
Docker run while preserving per-loop artifacts under
|
||||
`.artifacts/docker-observability/...`.
|
||||
|
||||
For a transport-real Matrix smoke lane, run:
|
||||
|
||||
```bash
|
||||
|
||||
@@ -152,6 +152,7 @@ Legacy key migration:
|
||||
Telegram:
|
||||
|
||||
- Uses `sendMessage` + `editMessageText` preview updates across DMs and group/topics.
|
||||
- Sends a fresh final message instead of editing in place when a preview has been visible for about one minute, then cleans up the preview so Telegram's timestamp reflects reply completion.
|
||||
- Preview streaming is skipped when Telegram block streaming is explicitly enabled (to avoid double-streaming).
|
||||
- `/reasoning stream` can write reasoning to preview.
|
||||
|
||||
|
||||
@@ -179,11 +179,10 @@ openclaw plugins disable bonjour
|
||||
|
||||
## Docker gotchas
|
||||
|
||||
Bundled Docker Compose sets `OPENCLAW_DISABLE_BONJOUR=1` for the Gateway service
|
||||
by default. Docker bridge networks usually do not forward mDNS multicast
|
||||
(`224.0.0.251:5353`) between the container and the LAN, so leaving Bonjour on can
|
||||
produce repeated ciao `probing` or `announcing` failures without making discovery
|
||||
work.
|
||||
The bundled Bonjour plugin auto-disables LAN multicast advertising in detected
|
||||
containers when `OPENCLAW_DISABLE_BONJOUR` is unset. Docker bridge networks
|
||||
usually do not forward mDNS multicast (`224.0.0.251:5353`) between the container
|
||||
and the LAN, so advertising from the container rarely makes discovery work.
|
||||
|
||||
Important gotchas:
|
||||
|
||||
@@ -193,16 +192,16 @@ Important gotchas:
|
||||
`OPENCLAW_GATEWAY_BIND=lan` so the published host port can work.
|
||||
- Disabling Bonjour does not disable wide-area DNS-SD. Use wide-area discovery
|
||||
or Tailnet when the Gateway and node are not on the same LAN.
|
||||
- Reusing the same `OPENCLAW_CONFIG_DIR` outside Docker does not inherit the
|
||||
Compose default unless the environment still sets `OPENCLAW_DISABLE_BONJOUR`.
|
||||
- Reusing the same `OPENCLAW_CONFIG_DIR` outside Docker does not persist the
|
||||
container auto-disable policy.
|
||||
- Set `OPENCLAW_DISABLE_BONJOUR=0` only for host networking, macvlan, or another
|
||||
network where mDNS multicast is known to pass.
|
||||
network where mDNS multicast is known to pass; set it to `1` to force-disable.
|
||||
|
||||
## Troubleshooting disabled Bonjour
|
||||
|
||||
If a node no longer auto-discovers the Gateway after Docker setup:
|
||||
|
||||
1. Confirm whether the Gateway is intentionally suppressing LAN advertising:
|
||||
1. Confirm whether the Gateway is running in auto, forced-on, or forced-off mode:
|
||||
|
||||
```bash
|
||||
docker compose config | grep OPENCLAW_DISABLE_BONJOUR
|
||||
@@ -239,9 +238,9 @@ If a node no longer auto-discovers the Gateway after Docker setup:
|
||||
container bridges, WSL, or interface churn can leave the ciao advertiser in a
|
||||
non-announced state. OpenClaw retries a few times and then disables Bonjour
|
||||
for the current Gateway process instead of restarting the advertiser forever.
|
||||
- **Docker bridge networking**: bundled Docker Compose disables Bonjour by
|
||||
default with `OPENCLAW_DISABLE_BONJOUR=1`. Set it to `0` only for host,
|
||||
macvlan, or another mDNS-capable network.
|
||||
- **Docker bridge networking**: Bonjour auto-disables in detected containers.
|
||||
Set `OPENCLAW_DISABLE_BONJOUR=0` only for host, macvlan, or another
|
||||
mDNS-capable network.
|
||||
- **Sleep / interface churn**: macOS may temporarily drop mDNS results; retry.
|
||||
- **Browse works but resolve fails**: keep machine names simple (avoid emojis or
|
||||
punctuation), then restart the Gateway. The service instance name derives from
|
||||
@@ -260,7 +259,8 @@ sequences (e.g. spaces become `\032`).
|
||||
- `openclaw plugins disable bonjour` disables LAN multicast advertising by disabling the bundled plugin.
|
||||
- `openclaw plugins enable bonjour` restores the default LAN discovery plugin.
|
||||
- `OPENCLAW_DISABLE_BONJOUR=1` disables LAN multicast advertising without changing plugin config; accepted truthy values are `1`, `true`, `yes`, and `on` (legacy: `OPENCLAW_DISABLE_BONJOUR`).
|
||||
- Docker Compose sets `OPENCLAW_DISABLE_BONJOUR=1` by default for bridge networking; override with `OPENCLAW_DISABLE_BONJOUR=0` only when mDNS multicast is available.
|
||||
- `OPENCLAW_DISABLE_BONJOUR=0` forces LAN multicast advertising on, including inside detected containers; accepted falsy values are `0`, `false`, `no`, and `off`.
|
||||
- When `OPENCLAW_DISABLE_BONJOUR` is unset, Bonjour advertises on normal hosts and auto-disables inside detected containers.
|
||||
- `gateway.bind` in `~/.openclaw/openclaw.json` controls the Gateway bind mode.
|
||||
- `OPENCLAW_SSH_PORT` overrides the SSH port when `sshPort` is advertised (legacy: `OPENCLAW_SSH_PORT`).
|
||||
- `OPENCLAW_TAILNET_DNS` publishes a MagicDNS hint in TXT when mDNS full mode is enabled (legacy: `OPENCLAW_TAILNET_DNS`).
|
||||
|
||||
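The three states described above (unset, truthy, falsy) can be sketched as a small resolver. This is a minimal sketch, not OpenClaw's implementation: `resolveBonjourMode`, `shouldAdvertise`, and the `inContainer` flag are hypothetical names, and both the case-insensitive matching and the fallback to auto for unrecognized values are assumptions.

```typescript
type BonjourMode = "auto" | "forced-on" | "forced-off";

// Accepted values as listed in the notes above.
const TRUTHY = new Set(["1", "true", "yes", "on"]);
const FALSY = new Set(["0", "false", "no", "off"]);

// Hypothetical helper: maps the raw OPENCLAW_DISABLE_BONJOUR value to a mode.
function resolveBonjourMode(raw: string | undefined): BonjourMode {
  const value = raw?.trim().toLowerCase(); // assumption: case-insensitive
  if (value === undefined || value === "") return "auto";
  if (TRUTHY.has(value)) return "forced-off"; // DISABLE=1 turns advertising off
  if (FALSY.has(value)) return "forced-on"; // DISABLE=0 forces advertising on
  return "auto"; // assumption: unknown values behave like unset
}

// Hypothetical helper: auto mode advertises only outside detected containers.
function shouldAdvertise(mode: BonjourMode, inContainer: boolean): boolean {
  if (mode === "forced-on") return true;
  if (mode === "forced-off") return false;
  return !inContainer;
}
```

Note the inverted sense of the variable: a truthy value disables advertising, so `0` is the value that forces it on inside a container.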
@@ -859,6 +859,7 @@ Notes:
|
||||
- Set `logging.file` for a stable path.
|
||||
- `consoleLevel` bumps to `debug` when `--verbose`.
|
||||
- `maxFileBytes`: maximum active log file size in bytes before rotation (positive integer; default: `104857600` = 100 MB). OpenClaw keeps up to five numbered archives beside the active file.
|
||||
- `redactSensitive` / `redactPatterns`: best-effort masking for console output, file logs, OTLP log records, and persisted session transcript text.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -86,9 +86,9 @@ Security notes:
|
||||
Disable/override:
|
||||
|
||||
- `OPENCLAW_DISABLE_BONJOUR=1` disables advertising.
|
||||
- Docker Compose defaults `OPENCLAW_DISABLE_BONJOUR=1` because bridge networks
|
||||
usually do not carry mDNS multicast reliably; use `0` only on host, macvlan,
|
||||
or another mDNS-capable network.
|
||||
- When `OPENCLAW_DISABLE_BONJOUR` is unset, Bonjour advertises on normal hosts
|
||||
and auto-disables inside detected containers. Use `0` only on host, macvlan,
|
||||
or another mDNS-capable network; use `1` to force-disable.
|
||||
- `gateway.bind` in `~/.openclaw/openclaw.json` controls the Gateway bind mode.
|
||||
- `OPENCLAW_SSH_PORT` overrides the SSH port advertised when `sshPort` is emitted.
|
||||
- `OPENCLAW_TAILNET_DNS` publishes a `tailnetDns` hint (MagicDNS).
|
||||
|
||||
@@ -52,10 +52,12 @@ You can tune console verbosity independently via:
|
||||
- `logging.consoleLevel` (default `info`)
|
||||
- `logging.consoleStyle` (`pretty` | `compact` | `json`)
|
||||
|
||||
## Tool summary redaction
|
||||
## Redaction
|
||||
|
||||
Verbose tool summaries (e.g. `🛠️ Exec: ...`) can mask sensitive tokens before they hit the
|
||||
console stream. This is **tools-only** and does not alter file logs.
|
||||
OpenClaw can mask sensitive tokens before log or transcript output leaves the
|
||||
process. The same redaction policy is applied at console, file-log, OTLP
|
||||
log-record, and session transcript text sinks, so matching secret values are
|
||||
masked before JSONL lines or messages are written to disk.
|
||||
|
||||
- `logging.redactSensitive`: `off` | `tools` (default: `tools`)
|
||||
- `logging.redactPatterns`: array of regex strings (overrides defaults)
|
||||
|
||||
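To illustrate how a list of regex strings such as `logging.redactPatterns` could be applied before a line reaches a sink, here is a minimal sketch. It is not OpenClaw's redaction code: `redactText`, the `***` mask, and the example pattern are all hypothetical, and the real default patterns and mask format are not shown in this document.

```typescript
// Hypothetical helper: compiles each pattern string and masks every match.
function redactText(text: string, patterns: string[], mask = "***"): string {
  return patterns.reduce(
    (acc, source) => acc.replace(new RegExp(source, "g"), mask),
    text,
  );
}

// Example: mask a token inside a JSONL line before it is written to disk.
const rawLine = JSON.stringify({ level: "info", message: "auth token=abc123 ok" });
const maskedLine = redactText(rawLine, ["token=[A-Za-z0-9]+"]);
console.log(maskedLine);
```

Because masking runs on the serialized text, a pattern that can match JSON punctuation (quotes, braces) could corrupt the line; patterns that match only the secret value keep the output valid JSONL.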
@@ -147,9 +147,17 @@ When any subkey is enabled, model and tool spans get bounded, redacted
|
||||
- **Traces:** `diagnostics.otel.sampleRate` (root-span only, `0.0` drops all,
|
||||
`1.0` keeps all).
|
||||
- **Metrics:** `diagnostics.otel.flushIntervalMs` (minimum `1000`).
|
||||
- **Logs:** OTLP logs respect `logging.level` (file log level). Console
|
||||
redaction does **not** apply to OTLP logs. High-volume installs should
|
||||
prefer OTLP collector sampling/filtering over local sampling.
|
||||
- **Logs:** OTLP logs respect `logging.level` (file log level). They use the
|
||||
diagnostic log-record redaction path, not console formatting. High-volume
|
||||
installs should prefer OTLP collector sampling/filtering over local sampling.
|
||||
- **File-log correlation:** JSONL file logs include top-level `traceId`,
|
||||
`spanId`, `parentSpanId`, and `traceFlags` when the log call carries a valid
|
||||
diagnostic trace context, which lets log processors join local log lines with
|
||||
exported spans.
|
||||
- **Request correlation:** Gateway HTTP requests and WebSocket frames create an
|
||||
internal request trace scope. Logs and diagnostic events inside that scope
|
||||
inherit the request trace by default, while agent run and model-call spans are
|
||||
created as children so provider `traceparent` headers stay on the same trace.
|
||||
|
||||
## Exported metrics
|
||||
|
||||
@@ -161,6 +169,10 @@ When any subkey is enabled, model and tool spans get bounded, redacted
|
||||
- `openclaw.context.tokens` (histogram, attrs: `openclaw.context`, `openclaw.channel`, `openclaw.provider`, `openclaw.model`)
|
||||
- `gen_ai.client.token.usage` (histogram, GenAI semantic-conventions metric, attrs: `gen_ai.token.type` = `input`/`output`, `gen_ai.provider.name`, `gen_ai.operation.name`, `gen_ai.request.model`)
|
||||
- `gen_ai.client.operation.duration` (histogram, seconds, GenAI semantic-conventions metric, attrs: `gen_ai.provider.name`, `gen_ai.operation.name`, `gen_ai.request.model`, optional `error.type`)
|
||||
- `openclaw.model_call.duration_ms` (histogram, attrs: `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport`)
|
||||
- `openclaw.model_call.request_bytes` (histogram, UTF-8 byte size of the final model request payload; no raw payload content)
|
||||
- `openclaw.model_call.response_bytes` (histogram, UTF-8 byte size of streamed model response events; no raw response content)
|
||||
- `openclaw.model_call.time_to_first_byte_ms` (histogram, elapsed time before the first streamed response event)
|
||||
|
||||
### Message flow
|
||||
|
||||
@@ -212,6 +224,7 @@ When any subkey is enabled, model and tool spans get bounded, redacted
|
||||
- `openclaw.model.call`
|
||||
- `gen_ai.system` by default, or `gen_ai.provider.name` when the latest GenAI semantic conventions are opted in
|
||||
- `gen_ai.request.model`, `gen_ai.operation.name`, `openclaw.provider`, `openclaw.model`, `openclaw.api`, `openclaw.transport`
|
||||
- `openclaw.model_call.request_bytes`, `openclaw.model_call.response_bytes`, `openclaw.model_call.time_to_first_byte_ms`
|
||||
- `openclaw.provider.request_id_hash` (bounded SHA-based hash of the upstream provider request id; raw ids are not exported)
|
||||
- `openclaw.harness.run`
|
||||
- `openclaw.harness.id`, `openclaw.harness.plugin`, `openclaw.outcome`, `openclaw.provider`, `openclaw.model`, `openclaw.channel`
|
||||
|
||||
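The `openclaw.provider.request_id_hash` attribute is described as a bounded SHA-based hash. A minimal sketch of that idea follows, assuming SHA-256 and a 16-character hex prefix; the actual algorithm variant, encoding, and length used by OpenClaw are not stated here, and `hashRequestId` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 and the 16-character bound are assumptions.
function hashRequestId(requestId: string, length = 16): string {
  return createHash("sha256")
    .update(requestId, "utf8")
    .digest("hex")
    .slice(0, length);
}

// The same upstream id always maps to the same bounded value,
// so spans can be correlated without exporting the raw id.
console.log(hashRequestId("req_example_123"));
```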
@@ -999,7 +999,7 @@ Logs and transcripts can leak sensitive info even when access controls are corre
|
||||
|
||||
Recommendations:
|
||||
|
||||
- Keep tool summary redaction on (`logging.redactSensitive: "tools"`; default).
|
||||
- Keep log and transcript redaction on (`logging.redactSensitive: "tools"`; default).
|
||||
- Add custom patterns for your environment via `logging.redactPatterns` (tokens, hostnames, internal URLs).
|
||||
- When sharing diagnostics, prefer `openclaw status --all` (pasteable, secrets redacted) over raw logs.
|
||||
- Prune old session transcripts and log files if you don’t need long retention.
|
||||
|
||||
@@ -227,10 +227,12 @@ Notes:
|
||||
- `OPENCLAW_LIVE_ACP_BIND_CODEX_MODEL=gpt-5.2`
|
||||
- `OPENCLAW_LIVE_ACP_BIND_OPENCODE_MODEL=opencode/kimi-k2.6`
|
||||
- `OPENCLAW_LIVE_ACP_BIND_REQUIRE_TRANSCRIPT=1`
|
||||
- `OPENCLAW_LIVE_ACP_BIND_REQUIRE_CRON=1`
|
||||
- `OPENCLAW_LIVE_ACP_BIND_PARENT_MODEL=openai/gpt-5.2`
|
||||
- Notes:
|
||||
- This lane uses the gateway `chat.send` surface with admin-only synthetic originating-route fields so tests can attach message-channel context without pretending to deliver externally.
|
||||
- When `OPENCLAW_LIVE_ACP_BIND_AGENT_COMMAND` is unset, the test uses the embedded `acpx` plugin's built-in agent registry for the selected ACP harness agent.
|
||||
- Bound-session cron MCP creation is best-effort by default because external ACP harnesses can cancel MCP calls after the bind/image proof has passed; set `OPENCLAW_LIVE_ACP_BIND_REQUIRE_CRON=1` to make that post-bind cron probe strict.
|
||||
|
||||
Example:
|
||||
|
||||
|
||||
@@ -411,9 +411,9 @@ Think of the suites as “increasing realism” (and increasing flakiness/cost):
|
||||
- Untargeted `pnpm test` runs twelve smaller shard configs (`core-unit-fast`, `core-unit-src`, `core-unit-security`, `core-unit-ui`, `core-unit-support`, `core-support-boundary`, `core-contracts`, `core-bundled`, `core-runtime`, `agentic`, `auto-reply`, `extensions`) instead of one giant native root-project process. This cuts peak RSS on loaded machines and avoids auto-reply/extension work starving unrelated suites.
|
||||
- `pnpm test --watch` still uses the native root `vitest.config.ts` project graph, because a multi-shard watch loop is not practical.
|
||||
- `pnpm test`, `pnpm test:watch`, and `pnpm test:perf:imports` route explicit file/directory targets through scoped lanes first, so `pnpm test extensions/discord/src/monitor/message-handler.preflight.test.ts` avoids paying the full root project startup tax.
|
||||
- `pnpm test:changed` expands changed git paths into the same scoped lanes when the diff only touches routable source/test files; config/setup edits still fall back to the broad root-project rerun.
|
||||
- `pnpm check:changed` is the normal smart local gate for narrow work. It classifies the diff into core, core tests, extensions, extension tests, apps, docs, release metadata, live Docker tooling, and tooling, then runs the matching typecheck/lint/test lanes. Public Plugin SDK and plugin-contract changes include one extension validation pass because extensions depend on those core contracts. Release metadata-only version bumps run targeted version/config/root-dependency checks instead of the full suite, with a guard that rejects package changes outside the top-level version field.
|
||||
- Live Docker ACP harness edits run a focused local gate: shell syntax for the live Docker auth scripts, live Docker scheduler dry-run, ACP bind unit tests, and the ACPX extension tests. `package.json` changes are included only when the diff is limited to `scripts["test:docker:live-*"]`; dependency, export, version, and other package-surface edits still use the broader guards.
|
||||
- `pnpm test:changed` expands changed git paths into cheap scoped lanes by default: direct test edits, sibling `*.test.ts` files, explicit source mappings, and local import-graph dependents. Config/setup/package edits do not broad-run tests unless you explicitly use `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed`.
|
||||
- `pnpm check:changed` is the normal smart local check gate for narrow work. It classifies the diff into core, core tests, extensions, extension tests, apps, docs, release metadata, live Docker tooling, and tooling, then runs the matching typecheck, lint, and guard commands. It does not run Vitest tests; call `pnpm test:changed` or explicit `pnpm test <target>` for test proof. Release metadata-only version bumps run targeted version/config/root-dependency checks, with a guard that rejects package changes outside the top-level version field.
|
||||
- Live Docker ACP harness edits run focused checks: shell syntax for the live Docker auth scripts and a live Docker scheduler dry-run. `package.json` changes are included only when the diff is limited to `scripts["test:docker:live-*"]`; dependency, export, version, and other package-surface edits still use the broader guards.
|
||||
- Import-light unit tests from agents, commands, plugins, auto-reply helpers, `plugin-sdk`, and similar pure utility areas route through the `unit-fast` lane, which skips `test/setup-openclaw-runtime.ts`; stateful/runtime-heavy files stay on the existing lanes.
|
||||
- Selected `plugin-sdk` and `commands` helper source files also map changed-mode runs to explicit sibling tests in those light lanes, so helper edits avoid rerunning the full heavy suite for that directory.
|
||||
- `auto-reply` has dedicated buckets for top-level core helpers, top-level `reply.*` integration tests, and the `src/auto-reply/reply/**` subtree. CI further splits the reply subtree into agent-runner, dispatch, and commands/state-routing shards so one import-heavy bucket does not own the full Node tail.
|
||||
@@ -458,10 +458,11 @@ Think of the suites as “increasing realism” (and increasing flakiness/cost):
|
||||
- The pre-commit hook is formatting-only. It restages formatted files and
|
||||
does not run lint, typecheck, or tests.
|
||||
- Run `pnpm check:changed` explicitly before handoff or push when you
|
||||
need the smart local gate. Public Plugin SDK and plugin-contract
|
||||
changes include one extension validation pass.
|
||||
- `pnpm test:changed` routes through scoped lanes when the changed paths
|
||||
map cleanly to a smaller suite.
|
||||
need the smart local check gate.
|
||||
- `pnpm test:changed` routes through cheap scoped lanes by default. Use
|
||||
`OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` only when the agent
|
||||
decides a harness, config, package, or contract edit really needs broader
|
||||
Vitest coverage.
|
||||
- `pnpm test:max` and `pnpm test:changed:max` keep the same routing
|
||||
behavior, just with a higher worker cap.
|
||||
- Local worker auto-scaling is intentionally conservative and backs off
|
||||
@@ -606,7 +607,7 @@ These Docker runners split into two buckets:
|
||||
`OPENCLAW_LIVE_GATEWAY_STEP_TIMEOUT_MS=45000`, and
|
||||
`OPENCLAW_LIVE_GATEWAY_MODEL_TIMEOUT_MS=90000`. Override those env vars when you
|
||||
explicitly want the larger exhaustive scan.
|
||||
- `test:docker:all` builds the live Docker image once via `test:docker:live-build`, then reuses it for the live Docker lanes. It also builds one shared `scripts/e2e/Dockerfile` image via `test:docker:e2e-build` and reuses it for the E2E container smoke runners that exercise the built app. The aggregate uses a weighted local scheduler: `OPENCLAW_DOCKER_ALL_PARALLELISM` controls process slots, while resource caps keep heavy live, npm-install, and multi-service lanes from all starting at once. Defaults are 10 slots, `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=6`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=8`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; tune `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` only when the Docker host has more headroom. The runner performs a Docker preflight by default, removes stale OpenClaw E2E containers, prints status every 30 seconds, stores successful lane timings in `.artifacts/docker-tests/lane-timings.json`, and uses those timings to start longer lanes first on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the weighted lane manifest without building or running Docker.
|
||||
- `test:docker:all` builds the live Docker image once via `test:docker:live-build`, packs OpenClaw once as an npm tarball through `scripts/package-openclaw-for-docker.mjs`, then builds/reuses two `scripts/e2e/Dockerfile` images. The bare image is only the Node/Git runner for install/update/plugin-dependency lanes; those lanes mount the prebuilt tarball. The functional image installs the same tarball into `/app` for built-app functionality lanes. Docker lane definitions live in `scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in `scripts/lib/docker-e2e-plan.mjs`; `scripts/test-docker-all.mjs` executes the selected plan. The aggregate uses a weighted local scheduler: `OPENCLAW_DOCKER_ALL_PARALLELISM` controls process slots, while resource caps keep heavy live, npm-install, and multi-service lanes from all starting at once. Defaults are 10 slots, `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; tune `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` only when the Docker host has more headroom. The runner performs a Docker preflight by default, removes stale OpenClaw E2E containers, prints status every 30 seconds, stores successful lane timings in `.artifacts/docker-tests/lane-timings.json`, and uses those timings to start longer lanes first on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the weighted lane manifest without building or running Docker, or `node scripts/test-docker-all.mjs --plan-json` to print the CI plan for selected lanes, package/image needs, and credentials.
|
||||
- Container smoke runners: `test:docker:openwebui`, `test:docker:onboard`, `test:docker:npm-onboard-channel-agent`, `test:docker:update-channel-switch`, `test:docker:session-runtime-context`, `test:docker:agents-delete-shared-workspace`, `test:docker:gateway-network`, `test:docker:browser-cdp-snapshot`, `test:docker:mcp-channels`, `test:docker:pi-bundle-mcp-tools`, `test:docker:cron-mcp-cleanup`, `test:docker:plugins`, `test:docker:plugin-update`, and `test:docker:config-reload` boot one or more real containers and verify higher-level integration paths.
|
||||
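The "start longer lanes first" behavior of the aggregate scheduler can be sketched as a sort over stored timings. This is only an illustration: the shape of `lane-timings.json` is not documented here, so `LaneTimings` (lane name to milliseconds) and `orderLanesLongestFirst` are hypothetical, as is the choice to treat lanes without a recorded timing as shortest.

```typescript
// Hypothetical shape: lane name -> last successful duration in milliseconds.
type LaneTimings = Record<string, number>;

// Returns a new array with the longest known lanes first.
// Assumption: lanes with no recorded timing count as 0 and keep their relative order.
function orderLanesLongestFirst(lanes: string[], timings: LaneTimings): string[] {
  return [...lanes].sort((a, b) => (timings[b] ?? 0) - (timings[a] ?? 0));
}

const ordered = orderLanesLongestFirst(
  ["gateway-network", "plugins", "onboard"],
  { "gateway-network": 10_000, onboard: 30_000 },
);
console.log(ordered);
```

Starting the longest lanes first tends to shorten total wall-clock time when slots are limited, because short lanes can fill the gaps left at the end of the run.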
|
||||
The live-model Docker runners also bind-mount only the needed CLI auth homes (or all supported ones when the run is not narrowed), then copy them into the container home before the run so external-CLI OAuth can refresh tokens without mutating the host auth store:
|
||||
@@ -616,13 +617,14 @@ The live-model Docker runners also bind-mount only the needed CLI auth homes (or
|
||||
- CLI backend smoke: `pnpm test:docker:live-cli-backend` (script: `scripts/test-live-cli-backend-docker.sh`)
|
||||
- Codex app-server harness smoke: `pnpm test:docker:live-codex-harness` (script: `scripts/test-live-codex-harness-docker.sh`)
|
||||
- Gateway + dev agent: `pnpm test:docker:live-gateway` (script: `scripts/test-live-gateway-models-docker.sh`)
|
||||
- Docker observability smoke: included in `pnpm test:docker:all`, `pnpm test:docker:local:all`, and the release-path `core` chunk (script: `scripts/e2e/docker-observability-smoke.sh`). It runs QA-lab OTEL and Prometheus diagnostics checks inside the shared package-installed functional Docker image, with only QA harness files mounted read-only. Set `OPENCLAW_DOCKER_OBSERVABILITY_LOOPS=<count>` to repeat both checks in one container run.
|
||||
- Open WebUI live smoke: `pnpm test:docker:openwebui` (script: `scripts/e2e/openwebui-docker.sh`)
|
||||
- Onboarding wizard (TTY, full scaffolding): `pnpm test:docker:onboard` (script: `scripts/e2e/onboard-docker.sh`)
|
||||
- Npm tarball onboarding/channel/agent smoke: `pnpm test:docker:npm-onboard-channel-agent` installs the packed OpenClaw tarball globally in Docker, configures OpenAI via env-ref onboarding plus Telegram by default, verifies doctor repairs activated plugin runtime deps, and runs one mocked OpenAI agent turn. Reuse a prebuilt tarball with `OPENCLAW_NPM_ONBOARD_PACKAGE_TGZ=/path/to/openclaw-*.tgz`, skip the host rebuild with `OPENCLAW_NPM_ONBOARD_HOST_BUILD=0`, or switch channel with `OPENCLAW_NPM_ONBOARD_CHANNEL=discord`.
|
||||
- Npm tarball onboarding/channel/agent smoke: `pnpm test:docker:npm-onboard-channel-agent` installs the packed OpenClaw tarball globally in Docker, configures OpenAI via env-ref onboarding plus Telegram by default, verifies doctor repairs activated plugin runtime deps, and runs one mocked OpenAI agent turn. Reuse a prebuilt tarball with `OPENCLAW_CURRENT_PACKAGE_TGZ=/path/to/openclaw-*.tgz`, skip the host rebuild with `OPENCLAW_NPM_ONBOARD_HOST_BUILD=0`, or switch channel with `OPENCLAW_NPM_ONBOARD_CHANNEL=discord`.
|
||||
- Update channel switch smoke: `pnpm test:docker:update-channel-switch` installs the packed OpenClaw tarball globally in Docker, switches from package `stable` to git `dev`, verifies the persisted channel and plugin post-update work, then switches back to package `stable` and checks update status.
|
||||
- Session runtime context smoke: `pnpm test:docker:session-runtime-context` verifies hidden runtime context transcript persistence plus doctor repair of affected duplicated prompt-rewrite branches.
|
||||
- Bun global install smoke: `bash scripts/e2e/bun-global-install-smoke.sh` packs the current tree, installs it with `bun install -g` in an isolated home, and verifies `openclaw infer image providers --json` returns bundled image providers instead of hanging. Reuse a prebuilt tarball with `OPENCLAW_BUN_GLOBAL_SMOKE_PACKAGE_TGZ=/path/to/openclaw-*.tgz`, skip the host build with `OPENCLAW_BUN_GLOBAL_SMOKE_HOST_BUILD=0`, or copy `dist/` from a built Docker image with `OPENCLAW_BUN_GLOBAL_SMOKE_DIST_IMAGE=openclaw-dockerfile-smoke:local`.
|
||||
- Installer Docker smoke: `bash scripts/test-install-sh-docker.sh` shares one npm cache across its root, update, and direct-npm containers. Update smoke defaults to npm `latest` as the stable baseline before upgrading to the candidate tarball. Non-root installer checks keep an isolated npm cache so root-owned cache entries do not mask user-local install behavior. Set `OPENCLAW_INSTALL_SMOKE_NPM_CACHE_DIR=/path/to/cache` to reuse the root/update/direct-npm cache across local reruns.
|
||||
- Installer Docker smoke: `bash scripts/test-install-sh-docker.sh` shares one npm cache across its root, update, and direct-npm containers. Update smoke defaults to npm `latest` as the stable baseline before upgrading to the candidate tarball. Override with `OPENCLAW_INSTALL_SMOKE_UPDATE_BASELINE=2026.4.22` locally, or with the Install Smoke workflow's `update_baseline_version` input on GitHub. Non-root installer checks keep an isolated npm cache so root-owned cache entries do not mask user-local install behavior. Set `OPENCLAW_INSTALL_SMOKE_NPM_CACHE_DIR=/path/to/cache` to reuse the root/update/direct-npm cache across local reruns.
|
||||
- Install Smoke CI skips the duplicate direct-npm global update with `OPENCLAW_INSTALL_SMOKE_SKIP_NPM_GLOBAL=1`; run the script locally without that env when direct `npm install -g` coverage is needed.
|
||||
- Agents delete shared workspace CLI smoke: `pnpm test:docker:agents-delete-shared-workspace` (script: `scripts/e2e/agents-delete-shared-workspace-docker.sh`) builds the root Dockerfile image by default, seeds two agents with one workspace in an isolated container home, runs `agents delete --json`, and verifies valid JSON plus retained workspace behavior. Reuse the install-smoke image with `OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_IMAGE=openclaw-dockerfile-smoke:local OPENCLAW_AGENTS_DELETE_SHARED_WORKSPACE_E2E_SKIP_BUILD=1`.
|
||||
- Gateway networking (two containers, WS auth + health): `pnpm test:docker:gateway-network` (script: `scripts/e2e/gateway-network-docker.sh`)
|
||||
@@ -635,15 +637,15 @@ The live-model Docker runners also bind-mount only the needed CLI auth homes (or
|
||||
Set `OPENCLAW_PLUGINS_E2E_CLAWHUB=0` to skip the live ClawHub block, or override the default package with `OPENCLAW_PLUGINS_E2E_CLAWHUB_SPEC` and `OPENCLAW_PLUGINS_E2E_CLAWHUB_ID`.
|
||||
- Plugin update unchanged smoke: `pnpm test:docker:plugin-update` (script: `scripts/e2e/plugin-update-unchanged-docker.sh`)
|
||||
- Config reload metadata smoke: `pnpm test:docker:config-reload` (script: `scripts/e2e/config-reload-source-docker.sh`)
|
||||
- Bundled plugin runtime deps: `pnpm test:docker:bundled-channel-deps` builds a small Docker runner image by default, builds and packs OpenClaw once on the host, then mounts that tarball into each Linux install scenario. Reuse the image with `OPENCLAW_SKIP_DOCKER_BUILD=1`, skip the host rebuild after a fresh local build with `OPENCLAW_BUNDLED_CHANNEL_HOST_BUILD=0`, or point at an existing tarball with `OPENCLAW_BUNDLED_CHANNEL_PACKAGE_TGZ=/path/to/openclaw-*.tgz`. The full Docker aggregate pre-packs this tarball once, then shards bundled channel checks into independent lanes, including separate update lanes for Telegram, Discord, Slack, Feishu, memory-lancedb, and ACPX. Use `OPENCLAW_BUNDLED_CHANNELS=telegram,slack` to narrow the channel matrix when running the bundled lane directly, or `OPENCLAW_BUNDLED_CHANNEL_UPDATE_TARGETS=telegram,acpx` to narrow the update scenario. The lane also verifies that `channels.<id>.enabled=false` and `plugins.entries.<id>.enabled=false` suppress doctor/runtime-dependency repair.
|
||||
- Bundled plugin runtime deps: `pnpm test:docker:bundled-channel-deps` builds a small Docker runner image by default, builds and packs OpenClaw once on the host, then mounts that tarball into each Linux install scenario. Reuse the image with `OPENCLAW_SKIP_DOCKER_BUILD=1`, skip the host rebuild after a fresh local build with `OPENCLAW_BUNDLED_CHANNEL_HOST_BUILD=0`, or point at an existing tarball with `OPENCLAW_CURRENT_PACKAGE_TGZ=/path/to/openclaw-*.tgz`. The full Docker aggregate pre-packs this tarball once, then shards bundled channel checks into independent lanes, including separate update lanes for Telegram, Discord, Slack, Feishu, memory-lancedb, and ACPX. Use `OPENCLAW_BUNDLED_CHANNELS=telegram,slack` to narrow the channel matrix when running the bundled lane directly, or `OPENCLAW_BUNDLED_CHANNEL_UPDATE_TARGETS=telegram,acpx` to narrow the update scenario. The lane also verifies that `channels.<id>.enabled=false` and `plugins.entries.<id>.enabled=false` suppress doctor/runtime-dependency repair.
|
||||
- Narrow bundled plugin runtime deps while iterating by disabling unrelated scenarios, for example:
|
||||
`OPENCLAW_BUNDLED_CHANNEL_SCENARIOS=0 OPENCLAW_BUNDLED_CHANNEL_UPDATE_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_ROOT_OWNED_SCENARIO=0 OPENCLAW_BUNDLED_CHANNEL_SETUP_ENTRY_SCENARIO=0 pnpm test:docker:bundled-channel-deps`.
|
||||
|
||||
To prebuild and reuse the shared built-app image manually:
|
||||
To prebuild and reuse the shared functional image manually:
|
||||
|
||||
```bash
|
||||
OPENCLAW_DOCKER_E2E_IMAGE=openclaw-docker-e2e:local pnpm test:docker:e2e-build
|
||||
OPENCLAW_DOCKER_E2E_IMAGE=openclaw-docker-e2e:local OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:mcp-channels
|
||||
OPENCLAW_DOCKER_E2E_IMAGE=openclaw-docker-e2e-functional:local pnpm test:docker:e2e-build
|
||||
OPENCLAW_DOCKER_E2E_IMAGE=openclaw-docker-e2e-functional:local OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:mcp-channels
|
||||
```
|
||||
|
||||
Suite-specific image overrides such as `OPENCLAW_GATEWAY_NETWORK_E2E_IMAGE` still win when set. When `OPENCLAW_SKIP_DOCKER_BUILD=1` points at a remote shared image, the scripts pull it if it is not already local. The QR and installer Docker tests keep their own Dockerfiles because they validate package/install behavior rather than the shared built-app runtime.
|
||||
|
||||
@@ -357,9 +357,11 @@ See [ClawDock](/install/clawdock) for the full helper guide.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Base image metadata">
|
||||
The main Docker image uses `node:24-bookworm` and publishes OCI base-image
|
||||
annotations including `org.opencontainers.image.base.name`,
|
||||
`org.opencontainers.image.source`, and others. See
|
||||
The main Docker runtime image uses `node:24-bookworm-slim` and publishes OCI
|
||||
base-image annotations including `org.opencontainers.image.base.name`,
|
||||
`org.opencontainers.image.source`, and others. The Node base digest is
|
||||
refreshed through Dependabot Docker base-image PRs; release builds do not run
|
||||
a distro upgrade layer. See
|
||||
[OCI image annotations](https://github.com/opencontainers/image-spec/blob/main/annotations.md).
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
@@ -67,6 +67,20 @@ Add `--no-onboard` to skip onboarding. To force a specific install type through
|
||||
the installer, pass `--install-method git --no-onboard` or
|
||||
`--install-method npm --no-onboard`.
|
||||
|
||||
If `openclaw update` fails after the npm package install phase, re-run the
|
||||
installer. The installer does not call the old updater; it runs the global
|
||||
package install directly and can recover a partially updated npm install.
|
||||
|
||||
```bash
curl -fsSL https://openclaw.ai/install.sh | bash -s -- --install-method npm
```
|
||||
|
||||
To pin the recovery to a specific version or dist-tag, add `--version`:
|
||||
|
||||
```bash
curl -fsSL https://openclaw.ai/install.sh | bash -s -- --install-method npm --version <version-or-dist-tag>
```
|
||||
|
||||
## Alternative: manual npm, pnpm, or bun
|
||||
|
||||
```bash
|
||||
|
||||
@@ -103,6 +103,18 @@ openclaw channels logs --channel whatsapp
|
||||
Each line in the log file is a JSON object. The CLI and Control UI parse these
|
||||
entries to render structured output (time, level, subsystem, message).
|
||||
|
||||
File-log JSONL records also include machine-filterable top-level fields when
|
||||
available:
|
||||
|
||||
- `hostname`: gateway host name.
|
||||
- `message`: flattened log message text for full-text search.
|
||||
- `agent_id`: active agent id when the log call carries agent context.
|
||||
- `session_id`: active session id/key when the log call carries session context.
|
||||
- `channel`: active channel when the log call carries channel context.
|
||||
|
||||
OpenClaw preserves the original structured log arguments alongside these fields
|
||||
so existing parsers that read numbered tslog argument keys keep working.
|
||||
|
||||
### Console output
|
||||
|
||||
Console logs are **TTY-aware** and formatted for readability:
|
||||
@@ -157,6 +169,33 @@ You can override both via the **`OPENCLAW_LOG_LEVEL`** environment variable (e.g
|
||||
`--verbose` only affects console output and WS log verbosity; it does not change
|
||||
file log levels.
|
||||
|
||||
### Trace correlation
|
||||
|
||||
File logs are JSONL. When a log call carries a valid diagnostic trace context,
|
||||
OpenClaw writes the trace fields as top-level JSON keys (`traceId`, `spanId`,
|
||||
`parentSpanId`, `traceFlags`) so external log processors can correlate the line
|
||||
with OTEL spans and provider `traceparent` propagation.
|
||||
|
||||
Gateway HTTP requests and Gateway WebSocket frames establish an internal request
|
||||
trace scope. Logs and diagnostic events emitted inside that async scope inherit
|
||||
the request trace when they do not pass an explicit trace context. Agent run and
|
||||
model-call traces become children of the active request trace, so local logs,
|
||||
diagnostic snapshots, OTEL spans, and trusted provider `traceparent` headers can
|
||||
be joined by `traceId` without logging raw request or model content.
|
||||
|
||||
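Since the trace fields are written as top-level JSON keys, an external processor can join log lines by `traceId` with plain JSON parsing. The sketch below assumes only the field names listed above; `TracedLogLine` and `groupByTraceId` are hypothetical names, and real log lines carry more fields than this interface declares.

```typescript
interface TracedLogLine {
  traceId?: string;
  spanId?: string;
  parentSpanId?: string;
  message?: string;
}

// Groups JSONL log lines by their top-level traceId.
// Blank lines and lines without a traceId are skipped.
function groupByTraceId(jsonl: string): Map<string, TracedLogLine[]> {
  const groups = new Map<string, TracedLogLine[]>();
  for (const raw of jsonl.split("\n")) {
    if (raw.trim() === "") continue;
    const entry = JSON.parse(raw) as TracedLogLine;
    if (typeof entry.traceId !== "string") continue;
    const bucket = groups.get(entry.traceId) ?? [];
    bucket.push(entry);
    groups.set(entry.traceId, bucket);
  }
  return groups;
}
```

Each resulting group can then be matched against exported OTEL spans that share the same trace id.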
### Model call size and timing
|
||||
|
||||
Model-call diagnostics record bounded request/response measurements without
|
||||
capturing raw prompt or response content:
|
||||
|
||||
- `requestPayloadBytes`: UTF-8 byte size of the final model request payload
|
||||
- `responseStreamBytes`: UTF-8 byte size of streamed model response events
|
||||
- `timeToFirstByteMs`: elapsed time before the first streamed response event
|
||||
- `durationMs`: total model-call duration
|
||||
|
||||
These fields are available to diagnostic snapshots, model-call plugin hooks, and
|
||||
OTEL model-call spans/metrics when diagnostics export is enabled.
|
||||
|
||||
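The byte-size fields above measure UTF-8 bytes rather than string length, which differ for non-ASCII text. A minimal sketch of that measurement, assuming the payload and stream events are available as strings (`utf8ByteLength` and `totalStreamBytes` are hypothetical helpers, not OpenClaw APIs):

```typescript
// UTF-8 byte size of a string; differs from .length for non-ASCII characters.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: sums the UTF-8 size of streamed response events.
function totalStreamBytes(events: string[]): number {
  return events.reduce((sum, event) => sum + utf8ByteLength(event), 0);
}

const payload = "h\u00e9llo"; // 5 UTF-16 code units, 6 UTF-8 bytes
console.log(payload.length, utf8ByteLength(payload));
```

Only the resulting numbers are recorded, so the measurement itself never needs to retain prompt or response text.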
### Console styles
|
||||
|
||||
`logging.consoleStyle`:
|
||||
@@ -167,14 +206,16 @@ file log levels.
|
||||
|
||||
### Redaction
|
||||
|
||||
Tool summaries can redact sensitive tokens before they hit the console:
|
||||
OpenClaw can redact sensitive tokens before they hit console output, file logs,
|
||||
OTLP log records, or persisted session transcript text:
|
||||
|
||||
- `logging.redactSensitive`: `off` | `tools` (default: `tools`)
|
||||
- `logging.redactPatterns`: list of regex strings to override the default set
|
||||
|
||||
Redaction applies at the logging sinks for **console output**, **stderr-routed
|
||||
console diagnostics**, and **file logs**. File logs stay JSONL, but matching
|
||||
secret values are masked before the line is written to disk.
|
||||
File logs and session transcripts stay JSONL, but matching secret values are
|
||||
masked before the line or message is written to disk. Redaction is best-effort:
|
||||
it applies to text-bearing message content and log strings, not every
|
||||
identifier or binary payload field.
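
To make the pattern-based masking concrete, here is a small sketch under stated assumptions: `redactText`, the `[REDACTED]` mask, and the sample pattern are hypothetical, and the sketch treats any mode other than `off` as enabling redaction because the text does not spell out how `tools` is scoped.

```typescript
type RedactMode = "off" | "tools";

// Hypothetical helper: masks every match of each configured regex string.
function redactText(text: string, mode: RedactMode, patterns: readonly string[]): string {
  if (mode === "off") return text;
  let out = text;
  for (const source of patterns) {
    // The global flag masks every occurrence, not just the first.
    out = out.replace(new RegExp(source, "g"), "[REDACTED]");
  }
  return out;
}

const examplePatterns = ["sk-[A-Za-z0-9]+"]; // illustrative pattern, not a default
const maskedLine = redactText("token=sk-abc123 ok", "tools", examplePatterns);
```

Because this works on strings only, it matches the best-effort caveat above: identifiers and binary payload fields that never pass through as text are not covered.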
|
||||
|
||||
## Diagnostics and OpenTelemetry
|
||||
|
||||
|
||||
@@ -542,6 +542,72 @@ Environment overrides remain available for local testing:
|
||||
preferred for repeatable deployments because it keeps the plugin behavior in the
|
||||
same reviewed file as the rest of the Codex harness setup.
|
||||
|
||||
## Computer Use
|
||||
|
||||
Computer Use is a Codex-native MCP plugin. OpenClaw does not vendor the desktop
|
||||
control app or execute desktop actions itself; it enables Codex app-server
|
||||
plugins, installs the configured Codex marketplace plugin when requested, checks
|
||||
that the `computer-use` MCP server is available, and then lets Codex handle the
|
||||
native MCP tool calls during Codex-mode turns.
|
||||
|
||||
Set `plugins.entries.codex.config.computerUse` when you want Codex-mode turns to
|
||||
require Computer Use:
|
||||
|
||||
```json5
{
  plugins: {
    entries: {
      codex: {
        enabled: true,
        config: {
          computerUse: {
            autoInstall: true,
          },
        },
      },
    },
  },
  agents: {
    defaults: {
      model: "openai/gpt-5.5",
      embeddedHarness: {
        runtime: "codex",
      },
    },
  },
}
```
|
||||
|
||||
With no marketplace fields, OpenClaw asks Codex app-server to use its discovered
|
||||
marketplaces. On a fresh Codex home, app-server seeds the official curated
|
||||
marketplace and OpenClaw follows the same loading shape as Codex: it polls
|
||||
`plugin/list` during install before treating Computer Use as unavailable. The
|
||||
default discovery wait is 60 seconds and can be tuned with
|
||||
`marketplaceDiscoveryTimeoutMs`. If multiple known Codex marketplaces contain
|
||||
Computer Use, OpenClaw uses the Codex marketplace preference order before
|
||||
failing closed for unknown ambiguous matches.
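
The selection rule described above can be sketched as a pure function. The `pickMarketplace` name, the result shape, and the marketplace names used below are assumptions for illustration; the actual preference order comes from Codex and is not listed in this document.

```typescript
type Resolution =
  | { kind: "selected"; marketplace: string }
  | { kind: "unavailable" }
  | { kind: "ambiguous"; candidates: string[] };

// Hypothetical resolver: prefer known marketplaces in order, otherwise fail closed.
function pickMarketplace(
  candidates: readonly string[],
  preferenceOrder: readonly string[],
): Resolution {
  const [first] = candidates;
  if (first === undefined) return { kind: "unavailable" };
  if (candidates.length === 1) return { kind: "selected", marketplace: first };
  for (const name of preferenceOrder) {
    if (candidates.includes(name)) return { kind: "selected", marketplace: name };
  }
  // Several unknown marketplaces offer the plugin: do not guess.
  return { kind: "ambiguous", candidates: [...candidates] };
}

const preferred = pickMarketplace(["community-b", "curated-example"], ["curated-example"]);
const unresolved = pickMarketplace(["community-a", "community-b"], ["curated-example"]);
```

Returning an explicit `ambiguous` result, rather than picking the first candidate, is what "failing closed" means in this sketch.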
|
||||
|
||||
Use `marketplaceSource` for a non-default Codex marketplace source that
|
||||
app-server can add, or `marketplacePath` for a local marketplace file that
|
||||
already exists on the machine. If the marketplace is already registered with
|
||||
Codex app-server, use `marketplaceName` instead. The defaults are
|
||||
`pluginName: "computer-use"` and `mcpServerName: "computer-use"`.
|
||||
For safety, turn-start auto-install only uses marketplaces app-server has
|
||||
already discovered. Use `/codex computer-use install` for explicit installs from
|
||||
a configured `marketplaceSource` or `marketplacePath`.
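
For orientation, the fields named in this section can be collected into one type, with the documented defaults applied. This is a sketch: the `ComputerUseConfig` interface and `withDocumentedDefaults` helper are hypothetical, and only the three defaults stated above (`pluginName`, `mcpServerName`, and the 60 second discovery wait) are filled in.

```typescript
// Hypothetical type listing only the fields this section mentions.
interface ComputerUseConfig {
  enabled?: boolean;
  autoInstall?: boolean;
  marketplaceDiscoveryTimeoutMs?: number;
  marketplaceSource?: string;
  marketplacePath?: string;
  marketplaceName?: string;
  pluginName?: string;
  mcpServerName?: string;
}

// Applies only the defaults that the text states explicitly.
function withDocumentedDefaults(config: ComputerUseConfig) {
  return {
    ...config,
    pluginName: config.pluginName ?? "computer-use",
    mcpServerName: config.mcpServerName ?? "computer-use",
    marketplaceDiscoveryTimeoutMs: config.marketplaceDiscoveryTimeoutMs ?? 60_000,
  };
}

const resolvedComputerUse = withDocumentedDefaults({ autoInstall: true });
```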
|
||||
|
||||
The same setup can be checked or installed from the command surface:
|
||||
|
||||
- `/codex computer-use status`
|
||||
- `/codex computer-use install`
|
||||
- `/codex computer-use install --source <marketplace-source>`
|
||||
- `/codex computer-use install --marketplace-path <path>`
|
||||
|
||||
Computer Use is macOS-specific and may require local OS permissions before the
|
||||
Codex MCP server can control apps. If `computerUse.enabled` is true and the MCP
|
||||
server is unavailable, Codex-mode turns fail before the thread starts instead of
|
||||
silently running without the native Computer Use tools.
|
||||
|
||||
## Common recipes
|
||||
|
||||
Local Codex with default stdio transport:
|
||||
@@ -644,6 +710,8 @@ Common forms:
|
||||
- `/codex resume <thread-id>` attaches the current OpenClaw session to an existing Codex thread.
|
||||
- `/codex compact` asks Codex app-server to compact the attached thread.
|
||||
- `/codex review` starts Codex native review for the attached thread.
|
||||
- `/codex computer-use status` checks the configured Computer Use plugin and MCP server.
|
||||
- `/codex computer-use install` installs the configured Computer Use plugin and reloads MCP servers.
|
||||
- `/codex account` shows account and rate-limit status.
|
||||
- `/codex mcp` lists Codex app-server MCP server status.
|
||||
- `/codex skills` lists Codex app-server skills.
|
||||
|
||||
@@ -461,7 +461,7 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s
|
||||
<Accordion title="Streaming configuration">
|
||||
OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.
|
||||
|
||||
For native `/api/chat` requests, OpenClaw also forwards thinking control directly to Ollama: `/think off` and `openclaw agent --thinking off` send top-level `think: false`, while non-`off` thinking levels send `think: true`.
|
||||
For native `/api/chat` requests, OpenClaw also forwards thinking control directly to Ollama: `/think off` and `openclaw agent --thinking off` send top-level `think: false`, while `/think low|medium|high` send the matching top-level `think` effort string. `/think max` maps to Ollama's highest native effort, `think: "high"`.
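
The mapping in the updated sentence can be written out as a small function. This is a sketch covering only the levels named above; `toOllamaThink` is a hypothetical helper, not an OpenClaw export.

```typescript
type ThinkLevel = "off" | "low" | "medium" | "high" | "max";
type OllamaThink = false | "low" | "medium" | "high";

// Maps an OpenClaw thinking level to the top-level `think` request field.
function toOllamaThink(level: ThinkLevel): OllamaThink {
  if (level === "off") return false;
  if (level === "max") return "high"; // highest native effort described above
  return level;
}

// Example request body fragment for a native /api/chat call.
const chatBody = { stream: true, think: toOllamaThink("max") };
```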
|
||||
|
||||
<Tip>
|
||||
If you need to use the OpenAI-compatible endpoint, see the "Legacy OpenAI-compatible mode" section above. Streaming and tool calling may not work simultaneously in that mode.
|
||||
|
||||
@@ -1,133 +0,0 @@
|
||||
---
|
||||
summary: "Investigation notes for duplicate async exec completion injection"
|
||||
read_when:
|
||||
- Debugging repeated node exec completion events
|
||||
- Working on heartbeat/system-event dedupe
|
||||
title: "Async exec duplicate completion investigation"
|
||||
---
|
||||
|
||||
## Scope
|
||||
|
||||
- Session: `agent:main:telegram:group:-1003774691294:topic:1`
|
||||
- Symptom: the same async exec completion for session/run `keen-nexus` was recorded twice in LCM as user turns.
|
||||
- Goal: identify whether this is most likely duplicate session injection or plain outbound delivery retry.
|
||||
|
||||
## Conclusion
|
||||
|
||||
Most likely this is **duplicate session injection**, not a pure outbound delivery retry.
|
||||
|
||||
The strongest gateway-side gap is in the **node exec completion path**:
|
||||
|
||||
1. A node-side exec finish emits `exec.finished` with the full `runId`.
|
||||
2. Gateway `server-node-events` converts that into a system event and requests a heartbeat.
|
||||
3. The heartbeat run injects the drained system event block into the agent prompt.
|
||||
4. The embedded runner persists that prompt as a new user turn in the session transcript.
|
||||
|
||||
If the same `exec.finished` reaches the gateway twice for the same `runId` for any reason (replay, reconnect duplicate, upstream resend, duplicated producer), OpenClaw currently has **no idempotency check keyed by `runId`/`contextKey`** on this path. The second copy will become a second user message with the same content.
|
||||
|
||||
## Exact Code Path
|
||||
|
||||
### 1. Producer: node exec completion event
|
||||
|
||||
- `src/node-host/invoke.ts:340-360`
  - `sendExecFinishedEvent(...)` emits `node.event` with event `exec.finished`.
  - Payload includes `sessionKey` and full `runId`.
|
||||
|
||||
### 2. Gateway event ingestion
|
||||
|
||||
- `src/gateway/server-node-events.ts:574-640`
  - Handles `exec.finished`.
  - Builds text:
    - `Exec finished (node=..., id=<runId>, code ...)`
  - Enqueues it via:
    - `enqueueSystemEvent(text, { sessionKey, contextKey: runId ? \`exec:${runId}\` : "exec", trusted: false })`
  - Immediately requests a wake:
    - `requestHeartbeatNow(scopedHeartbeatWakeOptions(sessionKey, { reason: "exec-event" }))`
|
||||
|
||||
### 3. System event dedupe weakness
|
||||
|
||||
- `src/infra/system-events.ts:90-115`
  - `enqueueSystemEvent(...)` only suppresses **consecutive duplicate text**:
    - `if (entry.lastText === cleaned) return false`
  - It stores `contextKey`, but does **not** use `contextKey` for idempotency.
  - After drain, duplicate suppression resets.
|
||||
|
||||
This means a replayed `exec.finished` with the same `runId` can be accepted again later, even though the code already had a stable idempotency candidate (`exec:<runId>`).
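
The weakness can be reproduced with a simplified model of the queue. The class below is an assumption-laden sketch of the behavior described in the bullets above (consecutive-text check, stored but unused `contextKey`, reset on drain); it is not the real `enqueueSystemEvent` implementation.

```typescript
// Simplified model for illustration only.
class SystemEventQueueModel {
  private lastText: string | undefined;
  private pending: { text: string; contextKey: string }[] = [];

  enqueue(text: string, contextKey: string): boolean {
    const cleaned = text.trim();
    if (this.lastText === cleaned) return false; // consecutive-duplicate check only
    this.lastText = cleaned;
    this.pending.push({ text: cleaned, contextKey }); // contextKey is stored, never compared
    return true;
  }

  drain(): string[] {
    const texts = this.pending.map((entry) => entry.text);
    this.pending = [];
    this.lastText = undefined; // suppression resets after drain
    return texts;
  }
}

const queueModel = new SystemEventQueueModel();
const eventText = "Exec finished (id=keen-nexus)";
const firstAccepted = queueModel.enqueue(eventText, "exec:keen-nexus");
const immediateRepeat = queueModel.enqueue(eventText, "exec:keen-nexus");
const drainedTexts = queueModel.drain();
const replayAfterDrain = queueModel.enqueue(eventText, "exec:keen-nexus");
```

In this model the immediate repeat is rejected, but the replay after a drain is accepted even though the `contextKey` is identical.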
|
||||
|
||||
### 4. Wake handling is not the primary duplicator
|
||||
|
||||
- `src/infra/heartbeat-wake.ts:79-117`
  - Wakes are coalesced by `(agentId, sessionKey)`.
  - Duplicate wake requests for the same target collapse to one pending wake entry.
|
||||
|
||||
This makes **duplicate wake handling alone** a weaker explanation than duplicate event ingestion.
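
A minimal sketch of wake coalescing, assuming a pending-wake table keyed by `(agentId, sessionKey)`. The names are hypothetical, and whether the first or the latest request wins is an assumption here (the sketch keeps the first).

```typescript
// Hypothetical pending-wake table for illustration.
const pendingWakes = new Map<string, { agentId: string; sessionKey: string; reason: string }>();

function requestWake(agentId: string, sessionKey: string, reason: string): void {
  const key = JSON.stringify([agentId, sessionKey]);
  // A second request for the same target collapses into the existing entry.
  if (!pendingWakes.has(key)) pendingWakes.set(key, { agentId, sessionKey, reason });
}

requestWake("main", "agent:main:telegram:group:-1003774691294:topic:1", "exec-event");
requestWake("main", "agent:main:telegram:group:-1003774691294:topic:1", "exec-event");
```

Two wake requests leave one pending entry, so a doubled wake on its own would still produce a single heartbeat run.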
|
||||
|
||||
### 5. Heartbeat consumes the event and turns it into prompt input
|
||||
|
||||
- `src/infra/heartbeat-runner.ts:535-574`
  - Preflight peeks pending system events and classifies exec-event runs.
- `src/auto-reply/reply/session-system-events.ts:86-90`
  - `drainFormattedSystemEvents(...)` drains the queue for the session.
- `src/auto-reply/reply/get-reply-run.ts:400-427`
  - The drained system event block is prepended into the agent prompt body.
|
||||
|
||||
### 6. Transcript injection point
|
||||
|
||||
- `src/agents/pi-embedded-runner/run/attempt.ts:2000-2017`
  - `activeSession.prompt(effectivePrompt)` submits the full prompt to the embedded PI session.
  - That is the point where the completion-derived prompt becomes a persisted user turn.
|
||||
|
||||
So once the same system event is rebuilt into the prompt twice, duplicate LCM user messages are expected.
|
||||
|
||||
## Why plain outbound delivery retry is less likely
|
||||
|
||||
There is a real outbound failure path in the heartbeat runner:
|
||||
|
||||
- `src/infra/heartbeat-runner.ts:1194-1242`
  - The reply is generated first.
  - Outbound delivery happens later via `deliverOutboundPayloads(...)`.
  - Failure there returns `{ status: "failed" }`.
|
||||
|
||||
However, for the same system event queue entry, this alone is **not sufficient** to explain the duplicate user turns:
|
||||
|
||||
- `src/auto-reply/reply/session-system-events.ts:86-90`
  - The system event queue is already drained before outbound delivery.
|
||||
|
||||
So a channel send retry by itself would not recreate the exact same queued event. It could explain missing or failed external delivery, but it would not by itself produce a second identical session user message.
|
||||
|
||||
## Secondary, lower-confidence possibility
|
||||
|
||||
There is a full-run retry loop in the agent runner:
|
||||
|
||||
- `src/auto-reply/reply/agent-runner-execution.ts:741-1473`
  - Certain transient failures can retry the whole run and resubmit the same `commandBody`.
|
||||
|
||||
That can duplicate a persisted user prompt **within the same reply execution** if the prompt was already appended before the retry condition triggered.
|
||||
|
||||
I rank this lower than duplicate `exec.finished` ingestion because:
|
||||
|
||||
- the observed gap was around 51 seconds, which looks more like a second wake/turn than an in-process retry;
|
||||
- the report already mentions repeated message send failures, which points more toward a separate later turn than an immediate model/runtime retry.
|
||||
|
||||
## Root Cause Hypothesis
|
||||
|
||||
Highest-confidence hypothesis:
|
||||
|
||||
- The `keen-nexus` completion came through the **node exec event path**.
|
||||
- The same `exec.finished` was delivered to `server-node-events` twice.
|
||||
- Gateway accepted both because `enqueueSystemEvent(...)` does not dedupe by `contextKey` / `runId`.
|
||||
- Each accepted event triggered a heartbeat and was injected as a user turn into the PI transcript.
|
||||
|
||||
## Proposed Tiny Surgical Fix
|
||||
|
||||
If a fix is wanted, the smallest high-value change is:
|
||||
|
||||
- make exec/system-event idempotency honor `contextKey` for a short horizon, at least for exact `(sessionKey, contextKey, text)` repeats;
|
||||
- or add a dedicated dedupe in `server-node-events` for `exec.finished` keyed by `(sessionKey, runId, event kind)`.
|
||||
|
||||
That would directly block replayed `exec.finished` duplicates before they become session turns.
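
The second option could look roughly like the sketch below: a short-horizon dedupe keyed by `(sessionKey, runId, event kind)`. The `ExecFinishedDeduper` class, the five-minute horizon, and the injected clock are assumptions for illustration, not a proposed patch.

```typescript
// Hypothetical dedupe table for exec.finished events.
class ExecFinishedDeduper {
  private readonly seen = new Map<string, number>(); // key -> expiry timestamp (ms)
  private readonly horizonMs: number;

  constructor(horizonMs: number) {
    this.horizonMs = horizonMs;
  }

  shouldAccept(sessionKey: string, runId: string, nowMs: number): boolean {
    // Drop expired keys so the table stays bounded.
    for (const [key, expiresAt] of this.seen) {
      if (expiresAt <= nowMs) this.seen.delete(key);
    }
    const key = JSON.stringify([sessionKey, runId, "exec.finished"]);
    if (this.seen.has(key)) return false;
    this.seen.set(key, nowMs + this.horizonMs);
    return true;
  }
}

const deduper = new ExecFinishedDeduper(5 * 60_000);
const acceptedFirst = deduper.shouldAccept("agent:main:example", "keen-nexus", 0);
const acceptedReplay = deduper.shouldAccept("agent:main:example", "keen-nexus", 51_000);
```

With this horizon, a replay arriving 51 seconds later (the gap observed in the report) is rejected before it can become a second user turn.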
|
||||
|
||||
## Related
|
||||
|
||||
- [Exec tool](/tools/exec)
|
||||
- [Session management](/concepts/session)
|
||||
@@ -1,540 +0,0 @@
|
||||
---
|
||||
summary: "QA refactor plan for scenario catalog and harness consolidation"
|
||||
read_when:
|
||||
- Refactoring QA scenario definitions or qa-lab harness code
|
||||
- Moving QA behavior between markdown scenarios and TypeScript harness logic
|
||||
title: "QA refactor"
|
||||
---
|
||||
|
||||
Status: foundational migration landed.
|
||||
|
||||
## Goal
|
||||
|
||||
Move OpenClaw QA from a split-definition model to a single source of truth:
|
||||
|
||||
- scenario metadata
|
||||
- prompts sent to the model
|
||||
- setup and teardown
|
||||
- harness logic
|
||||
- assertions and success criteria
|
||||
- artifacts and report hints
|
||||
|
||||
The desired end state is a generic QA harness that loads powerful scenario definition files instead of hardcoding most behavior in TypeScript.
|
||||
|
||||
## Current State
|
||||
|
||||
Primary source of truth now lives in `qa/scenarios/index.md` plus one file per
|
||||
scenario under `qa/scenarios/<theme>/*.md`.
|
||||
|
||||
Implemented:
|
||||
|
||||
- `qa/scenarios/index.md`
  - canonical QA pack metadata
  - operator identity
  - kickoff mission
- `qa/scenarios/<theme>/*.md`
  - one markdown file per scenario
  - scenario metadata
  - handler bindings
  - scenario-specific execution config
- `extensions/qa-lab/src/scenario-catalog.ts`
  - markdown pack parser + zod validation
- `extensions/qa-lab/src/qa-agent-bootstrap.ts`
  - plan rendering from the markdown pack
- `extensions/qa-lab/src/qa-agent-workspace.ts`
  - seeds generated compatibility files plus `QA_SCENARIOS.md`
- `extensions/qa-lab/src/suite.ts`
  - selects executable scenarios through markdown-defined handler bindings
- QA bus protocol + UI
  - generic inline attachments for image/video/audio/file rendering
|
||||
|
||||
Remaining split surfaces:
|
||||
|
||||
- `extensions/qa-lab/src/suite.ts`
  - still owns most executable custom handler logic
- `extensions/qa-lab/src/report.ts`
  - still derives report structure from runtime outputs
|
||||
|
||||
So the source-of-truth split is fixed, but execution is still mostly handler-backed rather than fully declarative.
|
||||
|
||||
## What The Real Scenario Surface Looks Like
|
||||
|
||||
Reading the current suite shows a few distinct scenario classes.
|
||||
|
||||
### Simple interaction
|
||||
|
||||
- channel baseline
|
||||
- DM baseline
|
||||
- threaded follow-up
|
||||
- model switch
|
||||
- approval followthrough
|
||||
- reaction/edit/delete
|
||||
|
||||
### Config and runtime mutation
|
||||
|
||||
- config patch skill disable
|
||||
- config apply restart wake-up
|
||||
- config restart capability flip
|
||||
- runtime inventory drift check
|
||||
|
||||
### Filesystem and repo assertions
|
||||
|
||||
- source/docs discovery report
|
||||
- build Lobster Invaders
|
||||
- generated image artifact lookup
|
||||
|
||||
### Memory orchestration
|
||||
|
||||
- memory recall
|
||||
- memory tools in channel context
|
||||
- memory failure fallback
|
||||
- session memory ranking
|
||||
- thread memory isolation
|
||||
- memory dreaming sweep
|
||||
|
||||
### Tool and plugin integration
|
||||
|
||||
- MCP plugin-tools call
|
||||
- skill visibility
|
||||
- skill hot install
|
||||
- native image generation
|
||||
- image roundtrip
|
||||
- image understanding from attachment
|
||||
|
||||
### Multi-turn and multi-actor
|
||||
|
||||
- subagent handoff
|
||||
- subagent fanout synthesis
|
||||
- restart recovery style flows
|
||||
|
||||
These categories matter because they drive DSL requirements. A flat list of prompt + expected text is not enough.
|
||||
|
||||
## Direction
|
||||
|
||||
### Single source of truth
|
||||
|
||||
Use `qa/scenarios/index.md` plus `qa/scenarios/<theme>/*.md` as the authored
|
||||
source of truth.
|
||||
|
||||
The pack should stay:
|
||||
|
||||
- human-readable in review
|
||||
- machine-parseable
|
||||
- rich enough to drive:
  - suite execution
  - QA workspace bootstrap
  - QA Lab UI metadata
  - docs/discovery prompts
  - report generation
|
||||
|
||||
### Preferred authoring format
|
||||
|
||||
Use markdown as the top-level format, with structured YAML inside it.
|
||||
|
||||
Recommended shape:
|
||||
|
||||
- YAML frontmatter
  - id
  - title
  - surface
  - tags
  - docs refs
  - code refs
  - model/provider overrides
  - prerequisites
- prose sections
  - objective
  - notes
  - debugging hints
- fenced YAML blocks
  - setup
  - steps
  - assertions
  - cleanup
|
||||
|
||||
This gives:
|
||||
|
||||
- better PR readability than giant JSON
|
||||
- richer context than pure YAML
|
||||
- strict parsing and zod validation
|
||||
|
||||
Raw JSON is acceptable only as an intermediate generated form.
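
As a sketch of the loading side, the function below pulls named fenced YAML blocks (info strings such as `yaml scenario.setup`) out of a markdown scenario file. `extractNamedYamlBlocks` is a hypothetical helper; the real parser in `scenario-catalog.ts` is not shown here, and YAML parsing plus zod validation would follow this step.

```typescript
const FENCE = "`".repeat(3); // three backticks, built indirectly to keep this example fence-safe

// Collects fenced blocks whose info string is `yaml <name>`, keyed by name.
function extractNamedYamlBlocks(markdown: string): Map<string, string> {
  const blocks = new Map<string, string>();
  const pattern = new RegExp(
    `^${FENCE}yaml[ \\t]+(\\S+)[ \\t]*\\r?\\n([\\s\\S]*?)^${FENCE}[ \\t]*$`,
    "gm",
  );
  for (const match of markdown.matchAll(pattern)) {
    const [, name, body] = match;
    if (name !== undefined && body !== undefined) blocks.set(name, body);
  }
  return blocks;
}

const sampleScenario = [
  "# Setup",
  "",
  FENCE + "yaml scenario.setup",
  "- action: session.create",
  "  key: agent:qa:image-roundtrip",
  FENCE,
  "",
  "# Expect",
  "",
  FENCE + "yaml scenario.expect",
  "- assert: artifact.exists",
  FENCE,
  "",
].join("\n");

const namedBlocks = extractNamedYamlBlocks(sampleScenario);
```

Keeping block names in the fence info string lets prose sections stay free-form while the machine-read parts remain addressable.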
|
||||
|
||||
## Proposed Scenario File Shape
|
||||
|
||||
Example:
|
||||
|
||||
````md
---
id: image-generation-roundtrip
title: Image generation roundtrip
surface: image
tags: [media, image, roundtrip]
models:
  primary: openai/gpt-5.4
requires:
  tools: [image_generate]
  plugins: [openai, qa-channel]
docsRefs:
  - docs/help/testing.md
  - docs/concepts/model-providers.md
codeRefs:
  - extensions/qa-lab/src/suite.ts
  - src/gateway/chat-attachments.ts
---

# Objective

Verify generated media is reattached on the follow-up turn.

# Setup

```yaml scenario.setup
- action: config.patch
  patch:
    agents:
      defaults:
        imageGenerationModel:
          primary: openai/gpt-image-1
- action: session.create
  key: agent:qa:image-roundtrip
```

# Steps

```yaml scenario.steps
- action: agent.send
  session: agent:qa:image-roundtrip
  message: |
    Image generation check: generate a QA lighthouse image and summarize it in one short sentence.
- action: artifact.capture
  kind: generated-image
  promptSnippet: Image generation check
  saveAs: lighthouseImage
- action: agent.send
  session: agent:qa:image-roundtrip
  message: |
    Roundtrip image inspection check: describe the generated lighthouse attachment in one short sentence.
  attachments:
    - fromArtifact: lighthouseImage
```

# Expect

```yaml scenario.expect
- assert: outbound.textIncludes
  value: lighthouse
- assert: requestLog.matches
  where:
    promptIncludes: Roundtrip image inspection check
    imageInputCountGte: 1
- assert: artifact.exists
  ref: lighthouseImage
```
````
|
||||
|
||||
## Runner Capabilities The DSL Must Cover
|
||||
|
||||
Based on the current suite, the generic runner needs more than prompt execution.
|
||||
|
||||
### Environment and setup actions
|
||||
|
||||
- `bus.reset`
|
||||
- `gateway.waitHealthy`
|
||||
- `channel.waitReady`
|
||||
- `session.create`
|
||||
- `thread.create`
|
||||
- `workspace.writeSkill`
|
||||
|
||||
### Agent turn actions
|
||||
|
||||
- `agent.send`
|
||||
- `agent.wait`
|
||||
- `bus.injectInbound`
|
||||
- `bus.injectOutbound`
|
||||
|
||||
### Config and runtime actions
|
||||
|
||||
- `config.get`
|
||||
- `config.patch`
|
||||
- `config.apply`
|
||||
- `gateway.restart`
|
||||
- `tools.effective`
|
||||
- `skills.status`
|
||||
|
||||
### File and artifact actions
|
||||
|
||||
- `file.write`
|
||||
- `file.read`
|
||||
- `file.delete`
|
||||
- `file.touchTime`
|
||||
- `artifact.captureGeneratedImage`
|
||||
- `artifact.capturePath`
|
||||
|
||||
### Memory and cron actions
|
||||
|
||||
- `memory.indexForce`
|
||||
- `memory.searchCli`
|
||||
- `doctor.memory.status`
|
||||
- `cron.list`
|
||||
- `cron.run`
|
||||
- `cron.waitCompletion`
|
||||
- `sessionTranscript.write`
|
||||
|
||||
### MCP actions
|
||||
|
||||
- `mcp.callTool`
|
||||
|
||||
### Assertions
|
||||
|
||||
- `outbound.textIncludes`
|
||||
- `outbound.inThread`
|
||||
- `outbound.notInRoot`
|
||||
- `tool.called`
|
||||
- `tool.notPresent`
|
||||
- `skill.visible`
|
||||
- `skill.disabled`
|
||||
- `file.contains`
|
||||
- `memory.contains`
|
||||
- `requestLog.matches`
|
||||
- `sessionStore.matches`
|
||||
- `cron.managedPresent`
|
||||
- `artifact.exists`
|
||||
|
||||
## Variables and Artifact References
|
||||
|
||||
The DSL must support saved outputs and later references.
|
||||
|
||||
Examples from the current suite:
|
||||
|
||||
- create a thread, then reuse `threadId`
|
||||
- create a session, then reuse `sessionKey`
|
||||
- generate an image, then attach the file on the next turn
|
||||
- generate a wake marker string, then assert that it appears later
|
||||
|
||||
Needed capabilities:
|
||||
|
||||
- `saveAs`
|
||||
- `${vars.name}`
|
||||
- `${artifacts.name}`
|
||||
- typed references for paths, session keys, thread ids, markers, tool outputs
|
||||
|
||||
Without variable support, the harness will keep leaking scenario logic back into TypeScript.
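
A minimal sketch of `${vars.name}` and `${artifacts.name}` substitution, assuming string-valued references. The `interpolate` helper and `ScenarioScope` shape are hypothetical; typed references would need a richer value model than this.

```typescript
interface ScenarioScope {
  vars: Record<string, string>;
  artifacts: Record<string, string>;
}

// Replaces ${vars.x} and ${artifacts.x}; unknown references fail loudly.
function interpolate(template: string, scope: ScenarioScope): string {
  return template.replace(
    /\$\{(vars|artifacts)\.([A-Za-z0-9_]+)\}/g,
    (_match: string, namespace: "vars" | "artifacts", name: string) => {
      const value = scope[namespace][name];
      if (value === undefined) throw new Error("Unknown reference: " + namespace + "." + name);
      return value;
    },
  );
}

const scope: ScenarioScope = {
  vars: { threadId: "t-1" },
  artifacts: { lighthouseImage: "/tmp/lighthouse.png" },
};
const rendered = interpolate("reply in ${vars.threadId} with ${artifacts.lighthouseImage}", scope);
```

Throwing on an unknown reference, rather than leaving the placeholder in place, surfaces a misspelled `saveAs` name at the step that uses it.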
|
||||
|
||||
## What Should Stay As Escape Hatches
|
||||
|
||||
A purely declarative runner is not realistic in phase 1.
|
||||
|
||||
Some scenarios are inherently orchestration-heavy:
|
||||
|
||||
- memory dreaming sweep
|
||||
- config apply restart wake-up
|
||||
- config restart capability flip
|
||||
- generated image artifact resolution by timestamp/path
|
||||
- discovery-report evaluation
|
||||
|
||||
These should use explicit custom handlers for now.
|
||||
|
||||
Recommended rule:
|
||||
|
||||
- 85-90% declarative
|
||||
- explicit `customHandler` steps for the hard remainder
|
||||
- named and documented custom handlers only
|
||||
- no anonymous inline code in the scenario file
|
||||
|
||||
That keeps the generic engine clean while still allowing progress.
|
||||
|
||||
## Architecture Change
|
||||
|
||||
### Current
|
||||
|
||||
Scenario markdown is already the source of truth for:
|
||||
|
||||
- suite execution
|
||||
- workspace bootstrap files
|
||||
- QA Lab UI scenario catalog
|
||||
- report metadata
|
||||
- discovery prompts
|
||||
|
||||
Generated compatibility:
|
||||
|
||||
- seeded workspace still includes `QA_KICKOFF_TASK.md`
|
||||
- seeded workspace still includes `QA_SCENARIO_PLAN.md`
|
||||
- seeded workspace now also includes `QA_SCENARIOS.md`
|
||||
|
||||
## Refactor Plan
|
||||
|
||||
### Phase 1: loader and schema
|
||||
|
||||
Done.
|
||||
|
||||
- added `qa/scenarios/index.md`
|
||||
- split scenarios into `qa/scenarios/<theme>/*.md`
|
||||
- added parser for named markdown YAML pack content
|
||||
- validated with zod
|
||||
- switched consumers to the parsed pack
|
||||
- removed repo-level `qa/seed-scenarios.json` and `qa/QA_KICKOFF_TASK.md`
|
||||
|
||||
### Phase 2: generic engine
|
||||
|
||||
- split `extensions/qa-lab/src/suite.ts` into:
  - loader
  - engine
  - action registry
  - assertion registry
  - custom handlers
- keep existing helper functions as engine operations
|
||||
|
||||
Deliverable:
|
||||
|
||||
- engine executes simple declarative scenarios
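
One possible shape for the engine and action registry split, sketched under assumptions: the `Step`, `ActionRegistry`, and `runSteps` names are hypothetical, and handlers are synchronous here for brevity where a real engine would most likely be asynchronous.

```typescript
type Step = { action: string; saveAs?: string; [key: string]: unknown };
type ActionHandler = (step: Step, vars: Map<string, unknown>) => unknown;

// Maps action names from scenario files to handler functions.
class ActionRegistry {
  private readonly handlers = new Map<string, ActionHandler>();

  register(name: string, handler: ActionHandler): void {
    this.handlers.set(name, handler);
  }

  get(name: string): ActionHandler {
    const handler = this.handlers.get(name);
    if (!handler) throw new Error("Unknown action: " + name);
    return handler;
  }
}

// Runs steps in order and stores results named by `saveAs`.
function runSteps(steps: readonly Step[], registry: ActionRegistry): Map<string, unknown> {
  const vars = new Map<string, unknown>();
  for (const step of steps) {
    const result = registry.get(step.action)(step, vars);
    if (step.saveAs !== undefined) vars.set(step.saveAs, result);
  }
  return vars;
}

const registry = new ActionRegistry();
registry.register("session.create", (step) => String(step.key));
const savedVars = runSteps(
  [{ action: "session.create", key: "agent:qa:image-roundtrip", saveAs: "sessionKey" }],
  registry,
);
```

Unknown action names fail at lookup, so a typo in a scenario file is reported instead of being silently skipped.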
|
||||
|
||||
### Phase 3: migrate simple scenarios

Start with scenarios that are mostly prompt + wait + assert:
|
||||
|
||||
- threaded follow-up
|
||||
- image understanding from attachment
|
||||
- skill visibility and invocation
|
||||
- channel baseline
|
||||
|
||||
Deliverable:
|
||||
|
||||
- first real markdown-defined scenarios shipping through the generic engine
|
||||
|
||||
### Phase 4: migrate medium scenarios
|
||||
|
||||
- image generation roundtrip
|
||||
- memory tools in channel context
|
||||
- session memory ranking
|
||||
- subagent handoff
|
||||
- subagent fanout synthesis
|
||||
|
||||
Deliverable:
|
||||
|
||||
- variables, artifacts, tool assertions, request-log assertions proven out
|
||||
|
||||
### Phase 5: keep hard scenarios on custom handlers
|
||||
|
||||
- memory dreaming sweep
|
||||
- config apply restart wake-up
|
||||
- config restart capability flip
|
||||
- runtime inventory drift
|
||||
|
||||
Deliverable:
|
||||
|
||||
- same authoring format, but with explicit custom-step blocks where needed
|
||||
|
||||
### Phase 6: delete hardcoded scenario map
|
||||
|
||||
Once the pack coverage is good enough:
|
||||
|
||||
- remove most scenario-specific TypeScript branching from `extensions/qa-lab/src/suite.ts`
|
||||
|
||||
## Fake Slack / Rich Media Support
|
||||
|
||||
The current QA bus is text-first.
|
||||
|
||||
Relevant files:
|
||||
|
||||
- `extensions/qa-channel/src/protocol.ts`
|
||||
- `extensions/qa-lab/src/bus-state.ts`
|
||||
- `extensions/qa-lab/src/bus-queries.ts`
|
||||
- `extensions/qa-lab/src/bus-server.ts`
|
||||
- `extensions/qa-lab/web/src/ui-render.ts`
|
||||
|
||||
Today the QA bus supports:
|
||||
|
||||
- text
|
||||
- reactions
|
||||
- threads
|
||||
|
||||
It does not yet model inline media attachments.
|
||||
|
||||
### Needed transport contract
|
||||
|
||||
Add a generic QA bus attachment model:
|
||||
|
||||
```ts
type QaBusAttachment = {
  id: string;
  kind: "image" | "video" | "audio" | "file";
  mimeType: string;
  fileName?: string;
  inline?: boolean;
  url?: string;
  contentBase64?: string;
  width?: number;
  height?: number;
  durationMs?: number;
  altText?: string;
  transcript?: string;
};
```
|
||||
|
||||
Then add `attachments?: QaBusAttachment[]` to:
|
||||
|
||||
- `QaBusMessage`
|
||||
- `QaBusInboundMessageInput`
|
||||
- `QaBusOutboundMessageInput`
|
||||
|
||||
### Why generic first
|
||||
|
||||
Do not build a Slack-only media model.
|
||||
|
||||
Instead:
|
||||
|
||||
- one generic QA transport model
- multiple renderers on top of it
  - current QA Lab chat
  - future fake Slack web
  - any other fake transport views
|
||||
|
||||
This prevents duplicate logic and lets media scenarios stay transport-agnostic.
|
||||
|
||||
### UI work needed
|
||||
|
||||
Update the QA UI to render:
|
||||
|
||||
- inline image preview
|
||||
- inline audio player
|
||||
- inline video player
|
||||
- file attachment chip
|
||||
|
||||
The current UI can already render threads and reactions, so attachment rendering should layer onto the same message card model.
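
A sketch of how a renderer could be chosen per attachment kind. The renderer names and the `pickRenderer` and `planAttachmentViews` helpers are assumptions for illustration; they mirror the four UI elements listed above, not existing code in `ui-render.ts`.

```typescript
type QaBusAttachmentKind = "image" | "video" | "audio" | "file";
type AttachmentRenderer = "image-preview" | "audio-player" | "video-player" | "file-chip";

// Exhaustive mapping from attachment kind to a UI renderer.
function pickRenderer(kind: QaBusAttachmentKind): AttachmentRenderer {
  switch (kind) {
    case "image":
      return "image-preview";
    case "audio":
      return "audio-player";
    case "video":
      return "video-player";
    case "file":
      return "file-chip";
  }
}

// Keeps attachment order so mixed-media messages render predictably.
function planAttachmentViews(
  attachments: ReadonlyArray<{ id: string; kind: QaBusAttachmentKind }>,
): { id: string; renderer: AttachmentRenderer }[] {
  return attachments.map((attachment) => ({
    id: attachment.id,
    renderer: pickRenderer(attachment.kind),
  }));
}

const views = planAttachmentViews([
  { id: "a1", kind: "image" },
  { id: "a2", kind: "file" },
]);
```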
|
||||
|
||||
### Scenario work enabled by media transport
|
||||
|
||||
Once attachments flow through QA bus, we can add richer fake-chat scenarios:
|
||||
|
||||
- inline image reply in fake Slack
|
||||
- audio attachment understanding
|
||||
- video attachment understanding
|
||||
- mixed attachment ordering
|
||||
- thread reply with media retained
|
||||
|
||||
## Recommendation
|
||||
|
||||
The next implementation chunk should be:
|
||||
|
||||
1. add markdown scenario loader + zod schema
|
||||
2. generate the current catalog from markdown
|
||||
3. migrate a few simple scenarios first
|
||||
4. add generic QA bus attachment support
|
||||
5. render inline image in the QA UI
|
||||
6. then expand to audio and video
|
||||
|
||||
This is the smallest path that proves both goals:
|
||||
|
||||
- generic markdown-defined QA
|
||||
- richer fake messaging surfaces
|
||||
|
||||
## Open Questions
|
||||
|
||||
- whether scenario files should allow embedded markdown prompt templates with variable interpolation
|
||||
- whether setup/cleanup should be named sections or just ordered action lists
|
||||
- whether artifact references should be strongly typed in schema or string-based
|
||||
- whether custom handlers should live in one registry or per-surface registries
|
||||
- whether the generated JSON compatibility file should remain checked in during migration
|
||||
|
||||
## Related
|
||||
|
||||
- [QA E2E automation](/concepts/qa-e2e-automation)
|
||||
@@ -49,6 +49,12 @@ OpenClaw has three public release lanes:
|
||||
- Run `pnpm build && pnpm ui:build` before `pnpm release:check` so the expected
|
||||
`dist/*` release artifacts and Control UI bundle exist for the pack
|
||||
validation step
|
||||
- Run the manual `CI` workflow before release approval when you need full normal
|
||||
CI coverage for the release candidate. Manual CI dispatches bypass changed
|
||||
scoping and force the Linux Node shards, bundled-plugin shards, channel
|
||||
contracts, `check`, `check-additional`, build smoke, docs checks, Python
|
||||
skills, Windows, macOS, Android, and Control UI i18n lanes.
|
||||
Example: `gh workflow run ci.yml --ref release/YYYY.M.D`
|
||||
- Run `pnpm qa:otel:smoke` when validating release telemetry. It exercises
|
||||
QA-lab through a local OTLP/HTTP receiver and verifies the exported trace
|
||||
span names, bounded attributes, and content/identifier redaction without
|
||||
@@ -182,18 +188,20 @@ When cutting a stable npm release:
|
||||
SHA for a validation-only dry run of the preflight workflow
|
||||
2. Choose `npm_dist_tag=beta` for the normal beta-first flow, or `latest` only
|
||||
when you intentionally want a direct stable publish
|
||||
3. Run `OpenClaw Release Checks` separately with the same tag or the
|
||||
3. Run the manual `CI` workflow on the release ref when you want full normal CI
|
||||
coverage instead of smart-scoped merge coverage
|
||||
4. Run `OpenClaw Release Checks` separately with the same tag or the
|
||||
full current workflow-branch commit SHA when you want live prompt cache,
|
||||
QA Lab parity, Matrix, and Telegram coverage
|
||||
- This is separate on purpose so live coverage stays available without
|
||||
recoupling long-running or flaky checks to the publish workflow
|
||||
4. Save the successful `preflight_run_id`
|
||||
5. Run `OpenClaw NPM Release` again with `preflight_only=false`, the same
|
||||
5. Save the successful `preflight_run_id`
|
||||
6. Run `OpenClaw NPM Release` again with `preflight_only=false`, the same
|
||||
`tag`, the same `npm_dist_tag`, and the saved `preflight_run_id`
|
||||
6. If the release landed on `beta`, use the private
|
||||
7. If the release landed on `beta`, use the private
|
||||
`openclaw/releases-private/.github/workflows/openclaw-npm-dist-tags.yml`
|
||||
workflow to promote that stable version from `beta` to `latest`
|
||||
7. If the release intentionally published directly to `latest` and `beta`
|
||||
8. If the release intentionally published directly to `latest` and `beta`
|
||||
should follow the same stable build immediately, use that same private
|
||||
workflow to point both dist-tags at the stable version, or let its scheduled
|
||||
self-healing sync move `beta` later
|
||||
|
||||
@@ -193,7 +193,12 @@ Notable entry types:
|
||||
- `compaction`: persisted compaction summary with `firstKeptEntryId` and `tokensBefore`
|
||||
- `branch_summary`: persisted summary when navigating a tree branch
|
||||
|
||||
OpenClaw intentionally does **not** “fix up” transcripts; the Gateway uses `SessionManager` to read/write them.
|
||||
OpenClaw uses `SessionManager` for normal transcript reads/writes. After
|
||||
compaction, the Gateway now defaults to a bounded transcript rewrite that drops
|
||||
message entries already covered by the persisted compaction summary while
|
||||
keeping non-message session state and the recent unsummarized tail. Set
|
||||
`agents.defaults.compaction.truncateAfterCompaction` to `false` to preserve the
|
||||
legacy append-only behavior.
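
The rewrite rule can be illustrated with a simplified model. The entry shape, the `model_change` type name, and the `truncateAfterCompaction` function are assumptions for this sketch; only `firstKeptEntryId` and the keep/drop rule come from the text above, and the sketch does not model how the rewrite is bounded.

```typescript
// Hypothetical minimal entry shape.
interface TranscriptEntry {
  id: string;
  type: string;
}

// Drops message entries before firstKeptEntryId; keeps everything else.
function truncateAfterCompaction(
  entries: readonly TranscriptEntry[],
  firstKeptEntryId: string,
): TranscriptEntry[] {
  const keepFrom = entries.findIndex((entry) => entry.id === firstKeptEntryId);
  if (keepFrom < 0) return [...entries]; // unknown id: leave the transcript untouched
  return entries.filter((entry, index) => entry.type !== "message" || index >= keepFrom);
}

const rewritten = truncateAfterCompaction(
  [
    { id: "a", type: "message" },
    { id: "b", type: "model_change" },
    { id: "c", type: "message" },
    { id: "d", type: "message" },
    { id: "e", type: "compaction" },
  ],
  "d",
);
const keptIds = rewritten.map((entry) => entry.id).join(",");
```

Summarized messages `a` and `c` are dropped, while the non-message entry `b`, the unsummarized tail starting at `d`, and the compaction entry itself remain.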
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -10,11 +10,12 @@ title: "Tests"
|
||||
- `pnpm test:force`: Kills any lingering gateway process holding the default control port, then runs the full Vitest suite with an isolated gateway port so server tests don’t collide with a running instance. Use this when a prior gateway run left port 18789 occupied.
|
||||
- `pnpm test:coverage`: Runs the unit suite with V8 coverage (via `vitest.unit.config.ts`). This is a loaded-file unit coverage gate, not whole-repo all-file coverage. Thresholds are 70% lines/functions/statements and 55% branches. Because `coverage.all` is false, the gate measures files loaded by the unit coverage suite instead of treating every split-lane source file as uncovered.
|
||||
- `pnpm test:coverage:changed`: Runs unit coverage only for files changed since `origin/main`.
|
||||
- `pnpm test:changed`: expands changed git paths into scoped Vitest lanes when the diff only touches routable source/test files. Config/setup changes still fall back to the native root projects run so wiring edits rerun broadly when needed.
|
||||
- `pnpm test:changed:focused`: inner-loop changed test run. It only runs precise targets from direct test edits, sibling `*.test.ts` files, explicit source mappings, and the local import graph. Broad/config/package changes are skipped instead of expanding to the full changed-test fallback.
|
||||
- `pnpm test:changed`: cheap smart changed test run. It runs precise targets from direct test edits, sibling `*.test.ts` files, explicit source mappings, and the local import graph. Broad/config/package changes are skipped unless they map to precise tests.
|
||||
- `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed`: explicit broad changed test run. Use it when a test harness/config/package edit should fall back to Vitest's broader changed-test behavior.
|
||||
- `pnpm changed:lanes`: shows the architectural lanes triggered by the diff against `origin/main`.
|
||||
- `pnpm check:changed`: runs the smart changed gate for the diff against `origin/main`. It runs core work with core test lanes, extension work with extension test lanes, test-only work with test typecheck/tests only, expands public Plugin SDK or plugin-contract changes to one extension validation pass, and keeps release metadata-only version bumps on targeted version/config/root-dependency checks.
|
||||
- `pnpm check:changed`: runs the smart changed check gate for the diff against `origin/main`. It runs typecheck, lint, and guard commands for the affected architectural lanes, but does not run Vitest tests. Use `pnpm test:changed` or explicit `pnpm test <target>` for test proof.
|
||||
- `pnpm test`: routes explicit file/directory targets through scoped Vitest lanes. Untargeted runs use fixed shard groups and expand to leaf configs for local parallel execution; the extension group always expands to the per-extension shard configs instead of one giant root-project process.
|
||||
- Test wrapper runs end with a short `[test] passed|failed|skipped ... in ...` summary. Vitest's own duration line stays the per-shard detail.
|
||||
- Full, extension, and include-pattern shard runs update local timing data in `.artifacts/vitest-shard-timings.json`; later whole-config runs use those timings to balance slow and fast shards. Include-pattern CI shards append the shard name to the timing key, which keeps filtered shard timings visible without replacing whole-config timing data. Set `OPENCLAW_TEST_PROJECTS_TIMINGS=0` to ignore the local timing artifact.
|
||||
- Selected `plugin-sdk` and `commands` test files now route through dedicated light lanes that keep only `test/setup.ts`, leaving runtime-heavy cases on their existing lanes.
|
||||
- Source files with sibling tests map to that sibling before falling back to wider directory globs. Helper edits under `test/helpers/channels` and `test/helpers/plugins` use a local import graph to run importing tests instead of broad-running every shard when the dependency path is precise.
|
||||
@@ -33,7 +34,7 @@ title: "Tests"
|
||||
- Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
|
||||
- `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
|
||||
- `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
|
||||
- `pnpm test:docker:all`: Builds the shared live-test image and Docker E2E image once, then runs the Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=<n>`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. 
The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via `OPENCLAW_LIVE_CLI_BACKEND_SETUP_TIMEOUT_SECONDS` (default 180). Per-lane logs are written under `.artifacts/docker-tests/<run-id>/`.
|
||||
- `pnpm test:docker:all`: Builds the shared live-test image, packs OpenClaw once as an npm tarball, builds/reuses a bare Node/Git runner image plus a functional image that installs that tarball into `/app`, then runs Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. The bare image (`OPENCLAW_DOCKER_E2E_BARE_IMAGE`) is used for installer/update/plugin-dependency lanes; those lanes mount the prebuilt tarball instead of using copied repo sources. The functional image (`OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE`) is used for normal built-app functionality lanes. `scripts/package-openclaw-for-docker.mjs` is the single local/CI package packer and validates the tarball plus `dist/postinstall-inventory.json` before Docker consumes it. Docker lane definitions live in `scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in `scripts/lib/docker-e2e-plan.mjs`; `scripts/test-docker-all.mjs` executes the selected plan. `node scripts/test-docker-all.mjs --plan-json` emits the scheduler-owned CI plan for selected lanes, image kinds, package/live-image needs, and credential checks without building or running Docker. `OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`. 
The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=<n>`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse. Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via `OPENCLAW_LIVE_CLI_BACKEND_SETUP_TIMEOUT_SECONDS` (default 180). Per-lane logs, `summary.json`, `failures.json`, and phase timings are written under `.artifacts/docker-tests/<run-id>/`; use `pnpm test:docker:timings <summary.json>` to inspect slow lanes and `pnpm test:docker:rerun <run-id|summary.json|failures.json>` to print cheap targeted rerun commands.
|
||||
- `pnpm test:docker:browser-cdp-snapshot`: Builds a Chromium-backed source E2E container, starts raw CDP plus an isolated Gateway, runs `browser doctor --deep`, and verifies CDP role snapshots include link URLs, cursor-promoted clickables, iframe refs, and frame metadata.
|
||||
- CLI backend live Docker probes can be run as focused lanes, for example `pnpm test:docker:live-cli-backend:codex`, `pnpm test:docker:live-cli-backend:codex:resume`, or `pnpm test:docker:live-cli-backend:codex:mcp`. Claude and Gemini have matching `:resume` and `:mcp` aliases.
|
||||
- `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
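
The longest-first ordering and weight budget mentioned for `pnpm test:docker:all` can be sketched as below. This is an illustrative sketch only: the `Lane` shape, `orderLongestFirst`, and `pickStartable` are hypothetical names, and treating lanes without recorded timings as slowest is an assumption of the sketch rather than documented runner behavior.

```typescript
// Hypothetical lane shape; the real lane definitions live in the scenario scripts.
interface Lane {
  name: string;
  weight: number; // relative resource cost of running the lane
}

// Sort lanes so the slowest known lanes start first. Lanes with no recorded
// timing are treated as slowest (an assumption of this sketch).
function orderLongestFirst(lanes: Lane[], timingsMs: Record<string, number>): Lane[] {
  const duration = (lane: Lane): number => timingsMs[lane.name] ?? Number.MAX_SAFE_INTEGER;
  return [...lanes].sort((a, b) => duration(b) - duration(a));
}

// Walk the queue in order and pick every lane that still fits under the weight limit.
function pickStartable(queue: Lane[], runningWeight: number, weightLimit: number): Lane[] {
  const picked: Lane[] = [];
  let used = runningWeight;
  for (const lane of queue) {
    if (used + lane.weight <= weightLimit) {
      picked.push(lane);
      used += lane.weight;
    }
  }
  return picked;
}
```

Starting the slowest lanes first tends to shorten the overall wall-clock time because short lanes can fill the remaining slots near the end of the run.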
|
||||
|
||||
@@ -15,7 +15,7 @@ title: "Thinking levels"
|
||||
- high → “ultrathink” (max budget)
|
||||
- xhigh → “ultrathink+” (GPT-5.2+ and Codex models, plus Anthropic Claude Opus 4.7 effort)
|
||||
- adaptive → provider-managed adaptive thinking (supported for Claude 4.6 on Anthropic/Bedrock, Anthropic Claude Opus 4.7, and Google Gemini dynamic thinking)
|
||||
- max → provider max reasoning (currently Anthropic Claude Opus 4.7)
|
||||
- max → provider max reasoning (Anthropic Claude Opus 4.7; Ollama maps this to its highest native `think` effort)
|
||||
- `x-high`, `x_high`, `extra-high`, `extra high`, and `extra_high` map to `xhigh`.
|
||||
- `highest` maps to `high`.
|
||||
- Provider notes:
|
||||
@@ -26,6 +26,7 @@ title: "Thinking levels"
|
||||
- Anthropic Claude Opus 4.7 does not default to adaptive thinking. Its API effort default remains provider-owned unless you explicitly set a thinking level.
|
||||
- Anthropic Claude Opus 4.7 maps `/think xhigh` to adaptive thinking plus `output_config.effort: "xhigh"`, because `/think` is a thinking directive and `xhigh` is the Opus 4.7 effort setting.
|
||||
- Anthropic Claude Opus 4.7 also exposes `/think max`; it maps to the same provider-owned max effort path.
|
||||
- Ollama thinking-capable models expose `/think low|medium|high|max`; `max` maps to native `think: "high"` because Ollama's native API accepts `low`, `medium`, and `high` effort strings.
|
||||
- OpenAI GPT models map `/think` through model-specific Responses API effort support. `/think off` sends `reasoning.effort: "none"` only when the target model supports it; otherwise OpenClaw omits the disabled reasoning payload instead of sending an unsupported value.
|
||||
- Google Gemini maps `/think adaptive` to Gemini's provider-owned dynamic thinking. Gemini 3 requests omit a fixed `thinkingLevel`, while Gemini 2.5 requests send `thinkingBudget: -1`; fixed levels still map to the closest Gemini `thinkingLevel` or budget for that model family.
|
||||
- MiniMax (`minimax/*`) on the Anthropic-compatible streaming path defaults to `thinking: { type: "disabled" }` unless you explicitly set thinking in model params or request params. This avoids leaked `reasoning_content` deltas from MiniMax's non-native Anthropic stream format.
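
The alias rules and the Ollama mapping above can be sketched as a small normalizer. This is a hedged illustration, not OpenClaw's parser: `normalizeThinkAlias` and `toOllamaThinkEffort` are hypothetical names, and the trimming and case-insensitive matching are assumptions of the sketch.

```typescript
// Aliases that the docs say map to `xhigh`.
const XHIGH_ALIASES = new Set(["x-high", "x_high", "extra-high", "extra high", "extra_high"]);

// Normalize a user-supplied level; unknown values pass through lowercased.
function normalizeThinkAlias(raw: string): string {
  const value = raw.trim().toLowerCase();
  if (XHIGH_ALIASES.has(value)) {
    return "xhigh";
  }
  if (value === "highest") {
    return "high";
  }
  return value;
}

type OllamaThinkEffort = "low" | "medium" | "high";

// Ollama's native API accepts low/medium/high, so `max` collapses to "high".
// Levels outside low|medium|high|max return undefined in this sketch.
function toOllamaThinkEffort(level: string): OllamaThinkEffort | undefined {
  switch (normalizeThinkAlias(level)) {
    case "low":
      return "low";
    case "medium":
      return "medium";
    case "high":
    case "max":
      return "high";
    default:
      return undefined;
  }
}
```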
|
||||
|
||||
@@ -134,6 +134,7 @@ The Control UI can localize itself on first load based on your browser locale. T
|
||||
<AccordionGroup>
|
||||
<Accordion title="Send and history semantics">
|
||||
- `chat.send` is **non-blocking**: it acks immediately with `{ runId, status: "started" }` and the response streams via `chat` events.
|
||||
- Chat uploads accept images plus non-video files. Images keep the native image path; other files are stored as managed media and shown in history as attachment links.
|
||||
- Re-sending with the same `idempotencyKey` returns `{ status: "in_flight" }` while running, and `{ status: "ok" }` after completion.
|
||||
- `chat.history` responses are size-bounded for UI safety. When transcript entries are too large, Gateway may truncate long text fields, omit heavy metadata blocks, and replace oversized messages with a placeholder (`[chat.history omitted: message too large]`).
|
||||
- Assistant/generated images are persisted as managed media references and served back through authenticated Gateway media URLs, so reloads do not depend on raw base64 image payloads staying in the chat history response.
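
The `idempotencyKey` semantics above can be modeled with a small in-memory sketch. It is illustrative only: `IdempotentSender`, `complete`, and the generated `runId` format are hypothetical, and the real Gateway streams the response through `chat` events rather than returning it.

```typescript
type SendResult =
  | { runId: string; status: "started" }
  | { status: "in_flight" }
  | { status: "ok" };

// Hypothetical in-memory model of the ack behavior for repeated sends.
class IdempotentSender {
  private runs = new Map<string, "running" | "done">();
  private nextId = 1;

  // The first send acks immediately; a repeat with the same key reports progress instead.
  send(idempotencyKey: string): SendResult {
    const state = this.runs.get(idempotencyKey);
    if (state === "running") {
      return { status: "in_flight" };
    }
    if (state === "done") {
      return { status: "ok" };
    }
    this.runs.set(idempotencyKey, "running");
    return { runId: `run-${this.nextId++}`, status: "started" };
  }

  // Mark a known run as finished so later re-sends report "ok".
  complete(idempotencyKey: string): void {
    if (this.runs.has(idempotencyKey)) {
      this.runs.set(idempotencyKey, "done");
    }
  }
}
```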
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
import fs from "node:fs";
|
||||
import os from "node:os";
|
||||
import { afterEach, describe, expect, it, vi } from "vitest";
|
||||
|
||||
@@ -207,6 +208,38 @@ describe("gateway bonjour advertiser", () => {
|
||||
await expect(started.stop()).resolves.toBeUndefined();
|
||||
});
|
||||
|
||||
it("auto-disables Bonjour in detected containers", async () => {
|
||||
enableAdvertiserUnitMode();
|
||||
vi.spyOn(fs, "existsSync").mockImplementation((filePath) => String(filePath) === "/.dockerenv");
|
||||
|
||||
const started = await startAdvertiser({
|
||||
gatewayPort: 18789,
|
||||
sshPort: 2222,
|
||||
});
|
||||
|
||||
expect(createService).not.toHaveBeenCalled();
|
||||
await expect(started.stop()).resolves.toBeUndefined();
|
||||
});
|
||||
|
||||
it("honors explicit Bonjour opt-in inside detected containers", async () => {
|
||||
enableAdvertiserUnitMode();
|
||||
process.env.OPENCLAW_DISABLE_BONJOUR = "0";
|
||||
vi.spyOn(fs, "existsSync").mockImplementation((filePath) => String(filePath) === "/.dockerenv");
|
||||
|
||||
const destroy = vi.fn().mockResolvedValue(undefined);
|
||||
const advertise = vi.fn().mockResolvedValue(undefined);
|
||||
mockCiaoService({ advertise, destroy });
|
||||
|
||||
const started = await startAdvertiser({
|
||||
gatewayPort: 18789,
|
||||
sshPort: 2222,
|
||||
});
|
||||
|
||||
expect(createService).toHaveBeenCalledTimes(1);
|
||||
|
||||
await started.stop();
|
||||
});
|
||||
|
||||
it("attaches conflict listeners for services", async () => {
|
||||
enableAdvertiserUnitMode();
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
import fs from "node:fs";
|
||||
import type { PluginLogger } from "openclaw/plugin-sdk/plugin-entry";
|
||||
import { isTruthyEnvValue } from "openclaw/plugin-sdk/runtime-env";
|
||||
import { classifyCiaoProcessError, type CiaoProcessErrorClassification } from "./ciao.js";
|
||||
@@ -89,16 +90,61 @@ async function loadCiaoModule(): Promise<CiaoModule> {
|
||||
return ciaoModulePromise;
|
||||
}
|
||||
|
||||
function isDisabledByEnv() {
|
||||
if (isTruthyEnvValue(process.env.OPENCLAW_DISABLE_BONJOUR)) {
|
||||
function readBonjourDisableOverride(): boolean | null {
  const raw = process.env.OPENCLAW_DISABLE_BONJOUR;
  const normalized = raw?.trim().toLowerCase();
  if (!normalized) {
    return null;
  }
  if (isTruthyEnvValue(raw)) {
    return true;
  }
  switch (normalized) {
    case "0":
    case "false":
    case "no":
    case "off":
      return false;
    default:
      return null;
  }
}
|
||||
|
||||
function isContainerEnvironment() {
  for (const sentinelPath of ["/.dockerenv", "/run/.containerenv", "/var/run/.containerenv"]) {
    try {
      if (fs.existsSync(sentinelPath)) {
        return true;
      }
    } catch {
      // ignore
    }
  }

  try {
    const cgroup = fs.readFileSync("/proc/1/cgroup", "utf8");
    return /\/docker\/|cri-containerd-[0-9a-f]|containerd\/[0-9a-f]{64}|\/kubepods[/.]|\blxc\b/u.test(
      cgroup,
    );
  } catch {
    return false;
  }
}
|
||||
|
||||
function isDisabledByEnv() {
  if (process.env.NODE_ENV === "test") {
    return true;
  }
  if (process.env.VITEST) {
    return true;
  }
  const envOverride = readBonjourDisableOverride();
  if (envOverride !== null) {
    return envOverride;
  }
  if (isContainerEnvironment()) {
    return true;
  }
  return false;
}
|
||||
|
||||
|
||||
@@ -48,6 +48,34 @@ describe("bonjour-ciao", () => {
|
||||
expect(ignoreCiaoUnhandledRejection(new Error("CIAO PROBING CANCELLED"))).toBe(true);
|
||||
});
|
||||
|
||||
it("suppresses wrapped ciao cancellation rejections", () => {
|
||||
expect(
|
||||
classifyCiaoUnhandledRejection({
|
||||
reason: new Error("CIAO ANNOUNCEMENT CANCELLED"),
|
||||
}),
|
||||
).toEqual({
|
||||
kind: "cancellation",
|
||||
formatted: "CIAO ANNOUNCEMENT CANCELLED",
|
||||
});
|
||||
});
|
||||
|
||||
it("suppresses aggregate ciao assertion rejections", () => {
|
||||
expect(
|
||||
classifyCiaoUnhandledRejection(
|
||||
new AggregateError([
|
||||
Object.assign(
|
||||
new Error("Reached illegal state! IPV4 address change from defined to undefined!"),
|
||||
{ name: "AssertionError" },
|
||||
),
|
||||
]),
|
||||
),
|
||||
).toEqual({
|
||||
kind: "interface-assertion",
|
||||
formatted:
|
||||
"AssertionError: Reached illegal state! IPV4 address change from defined to undefined!",
|
||||
});
|
||||
});
|
||||
|
||||
it("suppresses lower-case string cancellation reasons too", () => {
|
||||
expect(ignoreCiaoUnhandledRejection("ciao announcement cancelled during cleanup")).toBe(true);
|
||||
});
|
||||
|
||||
@@ -11,17 +11,59 @@ export type CiaoProcessErrorClassification =
|
||||
| { kind: "interface-assertion"; formatted: string }
|
||||
| { kind: "netmask-assertion"; formatted: string };
|
||||
|
||||
function collectCiaoProcessErrorCandidates(reason: unknown): unknown[] {
  const queue: unknown[] = [reason];
  const seen = new Set<unknown>();
  const candidates: unknown[] = [];

  while (queue.length > 0) {
    const current = queue.shift();
    if (current == null || seen.has(current)) {
      continue;
    }
    seen.add(current);
    candidates.push(current);

    if (!current || typeof current !== "object") {
      continue;
    }
    const record = current as Record<string, unknown>;
    for (const nested of [
      record.cause,
      record.reason,
      record.original,
      record.error,
      record.data,
    ]) {
      if (nested != null && !seen.has(nested)) {
        queue.push(nested);
      }
    }
    if (Array.isArray(record.errors)) {
      for (const nested of record.errors) {
        if (nested != null && !seen.has(nested)) {
          queue.push(nested);
        }
      }
    }
  }

  return candidates;
}
|
||||
|
||||
export function classifyCiaoProcessError(reason: unknown): CiaoProcessErrorClassification | null {
|
||||
const formatted = formatBonjourError(reason);
|
||||
const message = formatted.toUpperCase();
|
||||
if (CIAO_CANCELLATION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "cancellation", formatted };
|
||||
}
|
||||
if (CIAO_INTERFACE_ASSERTION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "interface-assertion", formatted };
|
||||
}
|
||||
if (CIAO_NETMASK_ASSERTION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "netmask-assertion", formatted };
|
||||
for (const candidate of collectCiaoProcessErrorCandidates(reason)) {
|
||||
const formatted = formatBonjourError(candidate);
|
||||
const message = formatted.toUpperCase();
|
||||
if (CIAO_CANCELLATION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "cancellation", formatted };
|
||||
}
|
||||
if (CIAO_INTERFACE_ASSERTION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "interface-assertion", formatted };
|
||||
}
|
||||
if (CIAO_NETMASK_ASSERTION_MESSAGE_RE.test(message)) {
|
||||
return { kind: "netmask-assertion", formatted };
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
@@ -43,6 +43,42 @@
|
||||
}
|
||||
}
|
||||
},
|
||||
"computerUse": {
|
||||
"type": "object",
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"enabled": {
|
||||
"type": "boolean",
|
||||
"default": false
|
||||
},
|
||||
"autoInstall": {
|
||||
"type": "boolean",
|
||||
"default": false
|
||||
},
|
||||
"marketplaceDiscoveryTimeoutMs": {
|
||||
"type": "number",
|
||||
"minimum": 1,
|
||||
"default": 60000
|
||||
},
|
||||
"marketplaceSource": {
|
||||
"type": "string"
|
||||
},
|
||||
"marketplacePath": {
|
||||
"type": "string"
|
||||
},
|
||||
"marketplaceName": {
|
||||
"type": "string"
|
||||
},
|
||||
"pluginName": {
|
||||
"type": "string",
|
||||
"default": "computer-use"
|
||||
},
|
||||
"mcpServerName": {
|
||||
"type": "string",
|
||||
"default": "computer-use"
|
||||
}
|
||||
}
|
||||
},
|
||||
"appServer": {
|
||||
"type": "object",
|
||||
"additionalProperties": false,
|
||||
@@ -112,6 +148,51 @@
|
||||
"help": "Maximum time to wait for Codex app-server model discovery before falling back to the bundled model list.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse": {
|
||||
"label": "Computer Use",
|
||||
"help": "Controls Codex app-server setup for the Computer Use plugin.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.enabled": {
|
||||
"label": "Enable Computer Use",
|
||||
"help": "When true, Codex-mode turns require the configured Computer Use MCP server to be available.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.autoInstall": {
|
||||
"label": "Auto Install",
|
||||
"help": "Install the configured Computer Use plugin when Codex-mode turns start.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.marketplaceDiscoveryTimeoutMs": {
|
||||
"label": "Marketplace Discovery Timeout",
|
||||
"help": "Maximum time to wait for Codex app-server to finish loading marketplaces during Computer Use install.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.marketplaceSource": {
|
||||
"label": "Marketplace Source",
|
||||
"help": "Optional Codex marketplace source to add before installing Computer Use.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.marketplacePath": {
|
||||
"label": "Marketplace Path",
|
||||
"help": "Optional local Codex marketplace file path containing the Computer Use plugin.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.marketplaceName": {
|
||||
"label": "Marketplace Name",
|
||||
"help": "Optional registered Codex marketplace name containing the Computer Use plugin.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.pluginName": {
|
||||
"label": "Plugin Name",
|
||||
"help": "Codex marketplace plugin name for Computer Use.",
|
||||
"advanced": true
|
||||
},
|
||||
"computerUse.mcpServerName": {
|
||||
"label": "MCP Server Name",
|
||||
"help": "MCP server name exposed by the Computer Use plugin.",
|
||||
"advanced": true
|
||||
},
|
||||
"appServer": {
|
||||
"label": "App Server",
|
||||
"help": "Runtime controls for connecting to Codex app-server.",
|
||||
|
||||
502
extensions/codex/src/app-server/computer-use.test.ts
Normal file
@@ -0,0 +1,502 @@
|
||||
import { afterEach, describe, expect, it, vi } from "vitest";
|
||||
import {
|
||||
CodexComputerUseSetupError,
|
||||
ensureCodexComputerUse,
|
||||
installCodexComputerUse,
|
||||
readCodexComputerUseStatus,
|
||||
type CodexComputerUseRequest,
|
||||
} from "./computer-use.js";
|
||||
|
||||
describe("Codex Computer Use setup", () => {
|
||||
afterEach(() => {
|
||||
vi.useRealTimers();
|
||||
});
|
||||
|
||||
it("stays disabled until configured", async () => {
|
||||
await expect(
|
||||
readCodexComputerUseStatus({ pluginConfig: {}, request: vi.fn() }),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
enabled: false,
|
||||
ready: false,
|
||||
message: "Computer Use is disabled.",
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
it("reports an installed Computer Use MCP server from a registered marketplace", async () => {
|
||||
const request = createComputerUseRequest({ installed: true });
|
||||
|
||||
await expect(
|
||||
readCodexComputerUseStatus({
|
||||
pluginConfig: { computerUse: { enabled: true, marketplaceName: "desktop-tools" } },
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
enabled: true,
|
||||
ready: true,
|
||||
installed: true,
|
||||
pluginEnabled: true,
|
||||
mcpServerAvailable: true,
|
||||
marketplaceName: "desktop-tools",
|
||||
tools: ["list_apps"],
|
||||
message: "Computer Use is ready.",
|
||||
}),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("marketplace/add", expect.anything());
|
||||
expect(request).not.toHaveBeenCalledWith(
|
||||
"experimentalFeature/enablement/set",
|
||||
expect.anything(),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("plugin/install", expect.anything());
|
||||
});
|
||||
|
||||
it("does not register marketplace sources during status checks", async () => {
|
||||
const request = createComputerUseRequest({ installed: true });
|
||||
|
||||
await expect(
|
||||
readCodexComputerUseStatus({
|
||||
pluginConfig: {
|
||||
computerUse: {
|
||||
enabled: true,
|
||||
marketplaceSource: "github:example/desktop-tools",
|
||||
},
|
||||
},
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
ready: true,
|
||||
message: "Computer Use is ready.",
|
||||
}),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("marketplace/add", expect.anything());
|
||||
expect(request).not.toHaveBeenCalledWith(
|
||||
"experimentalFeature/enablement/set",
|
||||
expect.anything(),
|
||||
);
|
||||
});
|
||||
|
||||
it("fails closed when multiple marketplaces contain Computer Use", async () => {
|
||||
const request = createAmbiguousComputerUseRequest();
|
||||
|
||||
await expect(
|
||||
readCodexComputerUseStatus({
|
||||
pluginConfig: { computerUse: { enabled: true } },
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
ready: false,
|
||||
message:
|
||||
"Multiple Codex marketplaces contain computer-use. Configure computerUse.marketplaceName or computerUse.marketplacePath to choose one.",
|
||||
}),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("plugin/read", expect.anything());
|
||||
});
|
||||
|
||||
it("installs Computer Use from a configured marketplace source", async () => {
|
||||
const request = createComputerUseRequest({ installed: false });
|
||||
|
||||
await expect(
|
||||
installCodexComputerUse({
|
||||
pluginConfig: {
|
||||
computerUse: {
|
||||
marketplaceSource: "github:example/desktop-tools",
|
||||
},
|
||||
},
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
ready: true,
|
||||
installed: true,
|
||||
pluginEnabled: true,
|
||||
tools: ["list_apps"],
|
||||
}),
|
||||
);
|
||||
expect(request).toHaveBeenCalledWith("experimentalFeature/enablement/set", {
|
||||
enablement: { plugins: true },
|
||||
});
|
||||
expect(request).toHaveBeenCalledWith("marketplace/add", {
|
||||
source: "github:example/desktop-tools",
|
||||
});
|
||||
expect(request).toHaveBeenCalledWith("plugin/install", {
|
||||
marketplacePath: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
|
||||
pluginName: "computer-use",
|
||||
});
|
||||
expect(request).toHaveBeenCalledWith("config/mcpServer/reload", undefined);
|
||||
});
|
||||
|
||||
it("fails closed when Computer Use is required but not installed", async () => {
|
||||
const request = createComputerUseRequest({ installed: false });
|
||||
|
||||
await expect(
|
||||
ensureCodexComputerUse({
|
||||
pluginConfig: { computerUse: { enabled: true, marketplaceName: "desktop-tools" } },
|
||||
request,
|
||||
}),
|
||||
).rejects.toThrow(CodexComputerUseSetupError);
|
||||
expect(request).not.toHaveBeenCalledWith("plugin/install", expect.anything());
|
||||
});
|
||||
|
||||
it("skips setup writes when auto-install is already ready", async () => {
|
||||
const request = createComputerUseRequest({ installed: true });
|
||||
|
||||
await expect(
|
||||
ensureCodexComputerUse({
|
||||
pluginConfig: {
|
||||
computerUse: {
|
||||
enabled: true,
|
||||
autoInstall: true,
|
||||
marketplaceName: "desktop-tools",
|
||||
},
|
||||
},
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
ready: true,
|
||||
message: "Computer Use is ready.",
|
||||
}),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("marketplace/add", expect.anything());
|
||||
expect(request).not.toHaveBeenCalledWith(
|
||||
"experimentalFeature/enablement/set",
|
||||
expect.anything(),
|
||||
);
|
||||
expect(request).not.toHaveBeenCalledWith("plugin/install", expect.anything());
|
||||
});
|
||||
|
||||
it("uses setup writes when auto-install needs to install", async () => {
|
||||
const request = createComputerUseRequest({ installed: false });
|
||||
|
||||
await expect(
|
||||
ensureCodexComputerUse({
|
||||
pluginConfig: {
|
||||
computerUse: {
|
||||
enabled: true,
|
||||
autoInstall: true,
|
||||
},
|
||||
},
|
||||
request,
|
||||
}),
|
||||
).resolves.toEqual(
|
||||
expect.objectContaining({
|
||||
ready: true,
|
||||
message: "Computer Use is ready.",
|
||||
}),
|
||||
);
|
||||
expect(request).toHaveBeenCalledWith("experimentalFeature/enablement/set", {
|
||||
enablement: { plugins: true },
|
||||
});
|
||||
expect(request).not.toHaveBeenCalledWith("marketplace/add", expect.anything());
|
||||
expect(request).toHaveBeenCalledWith("plugin/install", {
|
||||
marketplacePath: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
|
||||
pluginName: "computer-use",
|
||||
});
|
||||
});
|
||||
|
||||
  it("requires an explicit install command for configured marketplace sources", async () => {
    const request = createComputerUseRequest({ installed: false });

    await expect(
      ensureCodexComputerUse({
        pluginConfig: {
          computerUse: {
            enabled: true,
            autoInstall: true,
            marketplaceSource: "github:example/desktop-tools",
          },
        },
        request,
      }),
    ).rejects.toThrow(CodexComputerUseSetupError);
    expect(request).not.toHaveBeenCalledWith("marketplace/add", expect.anything());
    expect(request).not.toHaveBeenCalledWith("plugin/install", expect.anything());
  });

  it("fails closed when a configured marketplace name is not discovered", async () => {
    const request = createEmptyMarketplaceComputerUseRequest();

    await expect(
      readCodexComputerUseStatus({
        pluginConfig: {
          computerUse: {
            enabled: true,
            marketplaceName: "missing-marketplace",
          },
        },
        request,
      }),
    ).resolves.toEqual(
      expect.objectContaining({
        ready: false,
        message:
          "Configured Codex marketplace missing-marketplace was not found or does not contain computer-use. Run /codex computer-use install with a source or path to install from a new marketplace.",
      }),
    );
    expect(request).not.toHaveBeenCalledWith("plugin/read", expect.anything());
  });

  it("waits for the default Codex marketplace during install", async () => {
    vi.useFakeTimers();
    const request = createComputerUseRequest({
      installed: false,
      marketplaceAvailableAfterListCalls: 3,
    });
    const installed = installCodexComputerUse({
      pluginConfig: { computerUse: {} },
      request,
    });

    await vi.advanceTimersByTimeAsync(4_000);

    await expect(installed).resolves.toEqual(
      expect.objectContaining({
        ready: true,
        message: "Computer Use is ready.",
      }),
    );
    expect(request).toHaveBeenCalledWith("plugin/install", {
      marketplacePath: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
      pluginName: "computer-use",
    });
    expect(
      vi.mocked(request).mock.calls.filter(([method]) => method === "plugin/list"),
    ).toHaveLength(3);
  });

  it("prefers the official Computer Use marketplace when multiple matches are present", async () => {
    const request = createMultiMarketplaceComputerUseRequest();

    await expect(
      installCodexComputerUse({
        pluginConfig: { computerUse: {} },
        request,
      }),
    ).resolves.toEqual(
      expect.objectContaining({
        ready: true,
        marketplaceName: "openai-curated",
      }),
    );
    expect(request).toHaveBeenCalledWith("plugin/install", {
      marketplacePath: "/marketplaces/openai-curated/.agents/plugins/marketplace.json",
      pluginName: "computer-use",
    });
  });
});

function createComputerUseRequest(params: {
  installed: boolean;
  marketplaceAvailableAfterListCalls?: number;
}): CodexComputerUseRequest {
  let installed = params.installed;
  let pluginListCalls = 0;
  return vi.fn(async (method: string, requestParams?: unknown) => {
    if (method === "experimentalFeature/enablement/set") {
      return { enablement: { plugins: true } };
    }
    if (method === "marketplace/add") {
      return {
        marketplaceName: "desktop-tools",
        installedRoot: "/marketplaces/desktop-tools",
        alreadyAdded: false,
      };
    }
    if (method === "plugin/list") {
      pluginListCalls += 1;
      const marketplaceAvailable =
        pluginListCalls >= (params.marketplaceAvailableAfterListCalls ?? 1);
      return {
        marketplaces: marketplaceAvailable
          ? [
              {
                name: "desktop-tools",
                path: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
                interface: null,
                plugins: [pluginSummary(installed)],
              },
            ]
          : [],
        marketplaceLoadErrors: [],
        featuredPluginIds: [],
      };
    }
    if (method === "plugin/read") {
      expect(requestParams).toEqual(
        expect.objectContaining({
          pluginName: "computer-use",
        }),
      );
      return {
        plugin: {
          marketplaceName: "desktop-tools",
          marketplacePath: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
          summary: pluginSummary(installed),
          description: "Control desktop apps.",
          skills: [],
          apps: [],
          mcpServers: ["computer-use"],
        },
      };
    }
    if (method === "plugin/install") {
      installed = true;
      return { authPolicy: "ON_INSTALL", appsNeedingAuth: [] };
    }
    if (method === "config/mcpServer/reload") {
      return undefined;
    }
    if (method === "mcpServerStatus/list") {
      return {
        data: installed
          ? [
              {
                name: "computer-use",
                tools: {
                  list_apps: {
                    name: "list_apps",
                    inputSchema: { type: "object" },
                  },
                },
                resources: [],
                resourceTemplates: [],
                authStatus: "unsupported",
              },
            ]
          : [],
        nextCursor: null,
      };
    }
    throw new Error(`unexpected request ${method}`);
  }) as CodexComputerUseRequest;
}

function createAmbiguousComputerUseRequest(): CodexComputerUseRequest {
  return vi.fn(async (method: string) => {
    if (method === "plugin/list") {
      return {
        marketplaces: [
          {
            name: "desktop-tools",
            path: "/marketplaces/desktop-tools/.agents/plugins/marketplace.json",
            interface: null,
            plugins: [pluginSummary(true, "desktop-tools")],
          },
          {
            name: "other-tools",
            path: "/marketplaces/other-tools/.agents/plugins/marketplace.json",
            interface: null,
            plugins: [pluginSummary(true, "other-tools")],
          },
        ],
        marketplaceLoadErrors: [],
        featuredPluginIds: [],
      };
    }
    throw new Error(`unexpected request ${method}`);
  }) as CodexComputerUseRequest;
}

function createEmptyMarketplaceComputerUseRequest(): CodexComputerUseRequest {
  return vi.fn(async (method: string) => {
    if (method === "plugin/list") {
      return {
        marketplaces: [],
        marketplaceLoadErrors: [],
        featuredPluginIds: [],
      };
    }
    throw new Error(`unexpected request ${method}`);
  }) as CodexComputerUseRequest;
}

function createMultiMarketplaceComputerUseRequest(): CodexComputerUseRequest {
  let installed = false;
  return vi.fn(async (method: string, requestParams?: unknown) => {
    if (method === "experimentalFeature/enablement/set") {
      return { enablement: { plugins: true } };
    }
    if (method === "plugin/list") {
      return {
        marketplaces: [
          marketplaceEntry("workspace-tools", false),
          marketplaceEntry("openai-curated", installed),
        ],
        marketplaceLoadErrors: [],
        featuredPluginIds: [],
      };
    }
    if (method === "plugin/read") {
      return {
        plugin: {
          marketplaceName: "openai-curated",
          marketplacePath: "/marketplaces/openai-curated/.agents/plugins/marketplace.json",
          summary: pluginSummary(installed, "openai-curated"),
          description: "Control desktop apps.",
          skills: [],
          apps: [],
          mcpServers: ["computer-use"],
        },
      };
    }
    if (method === "plugin/install") {
      expect(requestParams).toEqual({
        marketplacePath: "/marketplaces/openai-curated/.agents/plugins/marketplace.json",
        pluginName: "computer-use",
      });
      installed = true;
      return { authPolicy: "ON_INSTALL", appsNeedingAuth: [] };
    }
    if (method === "config/mcpServer/reload") {
      return undefined;
    }
    if (method === "mcpServerStatus/list") {
      return {
        data: installed
          ? [
              {
                name: "computer-use",
                tools: {
                  list_apps: {
                    name: "list_apps",
                    inputSchema: { type: "object" },
                  },
                },
                resources: [],
                resourceTemplates: [],
                authStatus: "unsupported",
              },
            ]
          : [],
        nextCursor: null,
      };
    }
    throw new Error(`unexpected request ${method}`);
  }) as CodexComputerUseRequest;
}

function marketplaceEntry(marketplaceName: string, installed: boolean) {
  return {
    name: marketplaceName,
    path: `/marketplaces/${marketplaceName}/.agents/plugins/marketplace.json`,
    interface: null,
    plugins: [pluginSummary(installed, marketplaceName)],
  };
}

function pluginSummary(installed: boolean, marketplaceName = "desktop-tools") {
  return {
    id: `computer-use@${marketplaceName}`,
    name: "computer-use",
    source: { type: "local", path: `/marketplaces/${marketplaceName}/plugins/computer-use` },
    installed,
    enabled: installed,
    installPolicy: "AVAILABLE",
    authPolicy: "ON_INSTALL",
    interface: null,
  };
}
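The tests above pin down when `ensureCodexComputerUse` is allowed to write: it installs only when `autoInstall` is set and no marketplace source or path is configured, and otherwise fails with a setup error. The rule can be sketched on its own as follows. This is a minimal illustration, not code from the change: `AutoInstallConfig`, `AutoInstallDecision`, and `decideAutoInstall` are hypothetical names, and the config shape is reduced to the three fields the rule reads.

```typescript
// Hypothetical, simplified config shape; the resolved config in the diff has more fields.
type AutoInstallConfig = {
  autoInstall: boolean;
  marketplaceSource?: string;
  marketplacePath?: string;
};

// Hypothetical result type for this sketch only.
type AutoInstallDecision = "skip" | "blocked" | "install";

// Decide what to do when Computer Use is enabled but not yet ready.
function decideAutoInstall(config: AutoInstallConfig): AutoInstallDecision {
  if (!config.autoInstall) {
    // No auto-install requested: the caller reports the not-ready status.
    return "skip";
  }
  if (config.marketplaceSource || config.marketplacePath) {
    // A configured source or path requires the explicit install command.
    return "blocked";
  }
  // Only marketplaces the app-server already discovered are used.
  return "install";
}
```

In the change itself, both the "skip" and "blocked" outcomes surface as a `CodexComputerUseSetupError`; the sketch separates them only to make the two reasons visible.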
511
extensions/codex/src/app-server/computer-use.ts
Normal file
@@ -0,0 +1,511 @@
import { describeControlFailure } from "./capabilities.js";
import type { CodexAppServerClient } from "./client.js";
import {
  resolveCodexAppServerRuntimeOptions,
  resolveCodexComputerUseConfig,
  type CodexComputerUseConfig,
  type ResolvedCodexComputerUseConfig,
} from "./config.js";
import type { v2 } from "./protocol-generated/typescript/index.js";
import type { JsonValue } from "./protocol.js";
import { requestCodexAppServerJson } from "./request.js";

export type CodexComputerUseRequest = <T = JsonValue | undefined>(
  method: string,
  params?: unknown,
) => Promise<T>;

export type CodexComputerUseStatus = {
  enabled: boolean;
  ready: boolean;
  installed: boolean;
  pluginEnabled: boolean;
  mcpServerAvailable: boolean;
  pluginName: string;
  mcpServerName: string;
  marketplaceName?: string;
  marketplacePath?: string;
  tools: string[];
  message: string;
};

export class CodexComputerUseSetupError extends Error {
  readonly status: CodexComputerUseStatus;

  constructor(status: CodexComputerUseStatus) {
    super(status.message);
    this.name = "CodexComputerUseSetupError";
    this.status = status;
  }
}

export type CodexComputerUseSetupParams = {
  pluginConfig?: unknown;
  overrides?: Partial<CodexComputerUseConfig>;
  request?: CodexComputerUseRequest;
  client?: CodexAppServerClient;
  timeoutMs?: number;
  signal?: AbortSignal;
  forceEnable?: boolean;
};

type MarketplaceRef = {
  name?: string;
  path?: string;
  remoteMarketplaceName?: string;
};

type MarketplaceResolution = {
  marketplace?: MarketplaceRef;
  message?: string;
};

const CURATED_MARKETPLACE_POLL_INTERVAL_MS = 2_000;
const COMPUTER_USE_MARKETPLACE_NAME_PRIORITY = ["openai-bundled", "openai-curated", "local"];

export async function readCodexComputerUseStatus(
  params: CodexComputerUseSetupParams = {},
): Promise<CodexComputerUseStatus> {
  const config = resolveComputerUseConfig(params);
  if (!config.enabled) {
    return disabledStatus(config);
  }
  try {
    return await inspectCodexComputerUse({
      ...params,
      config,
      installPlugin: false,
    });
  } catch (error) {
    return unavailableStatus(config, `Computer Use check failed: ${describeControlFailure(error)}`);
  }
}

export async function ensureCodexComputerUse(
  params: CodexComputerUseSetupParams = {},
): Promise<CodexComputerUseStatus> {
  const config = resolveComputerUseConfig(params);
  if (!config.enabled) {
    return disabledStatus(config);
  }
  const status = await inspectCodexComputerUse({
    ...params,
    config,
    installPlugin: false,
  });
  if (status.ready) {
    return status;
  }
  if (config.autoInstall) {
    const blockedAutoInstallStatus = blockUnsafeAutoInstallStatus(config);
    if (blockedAutoInstallStatus) {
      throw new CodexComputerUseSetupError(blockedAutoInstallStatus);
    }
    const installedStatus = await inspectCodexComputerUse({
      ...params,
      config,
      installPlugin: true,
    });
    if (!installedStatus.ready) {
      throw new CodexComputerUseSetupError(installedStatus);
    }
    return installedStatus;
  }
  if (!status.ready) {
    throw new CodexComputerUseSetupError(status);
  }
  return status;
}

export async function installCodexComputerUse(
  params: CodexComputerUseSetupParams = {},
): Promise<CodexComputerUseStatus> {
  const config = resolveComputerUseConfig({
    ...params,
    forceEnable: true,
    overrides: { ...params.overrides, enabled: true, autoInstall: true },
  });
  const status = await inspectCodexComputerUse({
    ...params,
    config,
    installPlugin: true,
  });
  if (!status.ready) {
    throw new CodexComputerUseSetupError(status);
  }
  return status;
}

async function inspectCodexComputerUse(params: {
  pluginConfig?: unknown;
  request?: CodexComputerUseRequest;
  client?: CodexAppServerClient;
  timeoutMs?: number;
  signal?: AbortSignal;
  config: ResolvedCodexComputerUseConfig;
  installPlugin: boolean;
}): Promise<CodexComputerUseStatus> {
  const request = createComputerUseRequest(params);
  if (params.installPlugin) {
    await request<v2.ExperimentalFeatureEnablementSetResponse>(
      "experimentalFeature/enablement/set",
      {
        enablement: { plugins: true },
      } satisfies v2.ExperimentalFeatureEnablementSetParams,
    );
  }

  const marketplace = await resolveMarketplaceRef({
    request,
    config: params.config,
    allowAdd: params.installPlugin,
    signal: params.signal,
  });
  if (!marketplace.marketplace) {
    return unavailableStatus(
      params.config,
      marketplace.message ??
        `No Codex marketplace containing ${params.config.pluginName} is registered. Configure computerUse.marketplaceSource or computerUse.marketplacePath, then run /codex computer-use install.`,
    );
  }

  let plugin = await readComputerUsePlugin(
    request,
    marketplace.marketplace,
    params.config.pluginName,
  );
  if (!plugin.summary.installed || !plugin.summary.enabled) {
    if (!params.installPlugin) {
      return statusFromPlugin({
        config: params.config,
        plugin,
        tools: [],
        message: `Computer Use is available but not installed. Run /codex computer-use install or enable computerUse.autoInstall.`,
      });
    }
    await request<v2.PluginInstallResponse>(
      "plugin/install",
      pluginRequestParams(
        marketplace.marketplace,
        params.config.pluginName,
      ) satisfies v2.PluginInstallParams,
    );
    await reloadMcpServers(request);
    plugin = await readComputerUsePlugin(
      request,
      marketplace.marketplace,
      params.config.pluginName,
    );
  }

  let server = await readMcpServerStatus(request, params.config.mcpServerName);
  if (!server && params.installPlugin) {
    await reloadMcpServers(request);
    server = await readMcpServerStatus(request, params.config.mcpServerName);
  }
  if (!server) {
    return statusFromPlugin({
      config: params.config,
      plugin,
      tools: [],
      message: `Computer Use is installed, but the ${params.config.mcpServerName} MCP server is not available.`,
    });
  }

  return statusFromPlugin({
    config: params.config,
    plugin,
    tools: Object.keys(server.tools).toSorted(),
    message: "Computer Use is ready.",
  });
}

async function resolveMarketplaceRef(params: {
  request: CodexComputerUseRequest;
  config: ResolvedCodexComputerUseConfig;
  allowAdd: boolean;
  signal?: AbortSignal;
}): Promise<MarketplaceResolution> {
  let preferredMarketplaceName = params.config.marketplaceName;
  if (params.config.marketplaceSource && params.allowAdd) {
    const added = await params.request<v2.MarketplaceAddResponse>("marketplace/add", {
      source: params.config.marketplaceSource,
    } satisfies v2.MarketplaceAddParams);
    preferredMarketplaceName ??= added.marketplaceName;
  }

  if (params.config.marketplacePath) {
    const marketplace: MarketplaceRef = preferredMarketplaceName
      ? { name: preferredMarketplaceName, path: params.config.marketplacePath }
      : { path: params.config.marketplacePath };
    return { marketplace };
  }

  let candidates: MarketplaceRef[] = [];
  const waitUntil = marketplaceDiscoveryWaitUntil(params);
  while (candidates.length === 0) {
    const listed = await params.request<v2.PluginListResponse>("plugin/list", {
      cwds: [],
    } satisfies v2.PluginListParams);
    candidates = findComputerUseMarketplaces(listed, params.config.pluginName);
    if (candidates.length > 0) {
      break;
    }
    if (Date.now() >= waitUntil) {
      break;
    }
    await delay(
      Math.min(CURATED_MARKETPLACE_POLL_INTERVAL_MS, waitUntil - Date.now()),
      params.signal,
    );
  }

  if (preferredMarketplaceName) {
    const preferred = candidates.find((candidate) => candidate.name === preferredMarketplaceName);
    if (preferred) {
      return { marketplace: preferred };
    }
    return {
      message: `Configured Codex marketplace ${preferredMarketplaceName} was not found or does not contain ${params.config.pluginName}. Run /codex computer-use install with a source or path to install from a new marketplace.`,
    };
  }
  if (candidates.length > 1) {
    const preferred = chooseKnownComputerUseMarketplace(candidates);
    if (preferred) {
      return { marketplace: preferred };
    }
    return {
      message: `Multiple Codex marketplaces contain ${params.config.pluginName}. Configure computerUse.marketplaceName or computerUse.marketplacePath to choose one.`,
    };
  }
  if (params.config.marketplaceSource && !params.allowAdd && candidates.length === 0) {
    return {
      message:
        "Computer Use marketplace source is configured but has not been registered. Run /codex computer-use install to register it.",
    };
  }
  const marketplace = candidates[0];
  return marketplace ? { marketplace } : {};
}

function blockUnsafeAutoInstallStatus(
  config: ResolvedCodexComputerUseConfig,
): CodexComputerUseStatus | undefined {
  if (!config.marketplaceSource && !config.marketplacePath) {
    return undefined;
  }
  return unavailableStatus(
    config,
    "Computer Use auto-install only uses marketplaces Codex app-server has already discovered. Run /codex computer-use install to install from a configured marketplace source or path.",
  );
}

function findComputerUseMarketplaces(
  listed: v2.PluginListResponse,
  pluginName: string,
): MarketplaceRef[] {
  return listed.marketplaces
    .filter((marketplace) =>
      marketplace.plugins.some(
        (plugin) =>
          plugin.name === pluginName ||
          plugin.id === pluginName ||
          plugin.id === `${pluginName}@${marketplace.name}`,
      ),
    )
    .map((marketplace) => {
      if (marketplace.path) {
        return { name: marketplace.name, path: marketplace.path };
      }
      return { name: marketplace.name, remoteMarketplaceName: marketplace.name };
    });
}

function chooseKnownComputerUseMarketplace(
  candidates: MarketplaceRef[],
): MarketplaceRef | undefined {
  for (const marketplaceName of COMPUTER_USE_MARKETPLACE_NAME_PRIORITY) {
    const candidate = candidates.find((marketplace) => marketplace.name === marketplaceName);
    if (candidate) {
      return candidate;
    }
  }
  return undefined;
}

function marketplaceDiscoveryWaitUntil(params: {
  config: ResolvedCodexComputerUseConfig;
  allowAdd: boolean;
}): number {
  if (
    params.allowAdd &&
    !params.config.marketplaceSource &&
    !params.config.marketplacePath &&
    !params.config.marketplaceName
  ) {
    return Date.now() + params.config.marketplaceDiscoveryTimeoutMs;
  }
  return 0;
}

async function delay(ms: number, signal?: AbortSignal): Promise<void> {
  if (signal?.aborted) {
    throw abortError(signal);
  }
  await new Promise<void>((resolve, reject) => {
    let timer: ReturnType<typeof setTimeout>;
    const onAbort = () => {
      clearTimeout(timer);
      signal?.removeEventListener("abort", onAbort);
      reject(abortError(signal));
    };
    timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

function abortError(signal?: AbortSignal): Error {
  const reason = signal?.reason;
  return reason instanceof Error ? reason : new Error("Computer Use setup was aborted.");
}

async function readComputerUsePlugin(
  request: CodexComputerUseRequest,
  marketplace: MarketplaceRef,
  pluginName: string,
): Promise<v2.PluginDetail> {
  const response = await request<v2.PluginReadResponse>(
    "plugin/read",
    pluginRequestParams(marketplace, pluginName) satisfies v2.PluginReadParams,
  );
  return response.plugin;
}

async function readMcpServerStatus(
  request: CodexComputerUseRequest,
  serverName: string,
): Promise<v2.McpServerStatus | undefined> {
  let cursor: string | null | undefined;
  do {
    const response = await request<v2.ListMcpServerStatusResponse>("mcpServerStatus/list", {
      cursor,
      limit: 100,
      detail: "toolsAndAuthOnly",
    } satisfies v2.ListMcpServerStatusParams);
    const found = response.data.find((server) => server.name === serverName);
    if (found) {
      return found;
    }
    cursor = response.nextCursor;
  } while (cursor);
  return undefined;
}

async function reloadMcpServers(request: CodexComputerUseRequest): Promise<void> {
  await request("config/mcpServer/reload", undefined);
}

function pluginRequestParams(marketplace: MarketplaceRef, pluginName: string) {
  return {
    ...(marketplace.path ? { marketplacePath: marketplace.path } : {}),
    ...(!marketplace.path && marketplace.remoteMarketplaceName
      ? { remoteMarketplaceName: marketplace.remoteMarketplaceName }
      : {}),
    pluginName,
  };
}

function statusFromPlugin(params: {
  config: ResolvedCodexComputerUseConfig;
  plugin: v2.PluginDetail;
  tools: string[];
  message: string;
}): CodexComputerUseStatus {
  return {
    enabled: true,
    ready:
      params.plugin.summary.installed && params.plugin.summary.enabled && params.tools.length > 0,
    installed: params.plugin.summary.installed,
    pluginEnabled: params.plugin.summary.enabled,
    mcpServerAvailable: params.tools.length > 0,
    pluginName: params.config.pluginName,
    mcpServerName: params.config.mcpServerName,
    marketplaceName: params.plugin.marketplaceName,
    ...(params.plugin.marketplacePath ? { marketplacePath: params.plugin.marketplacePath } : {}),
    tools: params.tools,
    message: params.message,
  };
}

function disabledStatus(config: ResolvedCodexComputerUseConfig): CodexComputerUseStatus {
  return {
    enabled: false,
    ready: false,
    installed: false,
    pluginEnabled: false,
    mcpServerAvailable: false,
    pluginName: config.pluginName,
    mcpServerName: config.mcpServerName,
    tools: [],
    message: "Computer Use is disabled.",
  };
}

function unavailableStatus(
  config: ResolvedCodexComputerUseConfig,
  message: string,
): CodexComputerUseStatus {
  return {
    enabled: true,
    ready: false,
    installed: false,
    pluginEnabled: false,
    mcpServerAvailable: false,
    pluginName: config.pluginName,
    mcpServerName: config.mcpServerName,
    ...(config.marketplaceName ? { marketplaceName: config.marketplaceName } : {}),
    ...(config.marketplacePath ? { marketplacePath: config.marketplacePath } : {}),
    tools: [],
    message,
  };
}

function createComputerUseRequest(params: {
  pluginConfig?: unknown;
  request?: CodexComputerUseRequest;
  client?: CodexAppServerClient;
  timeoutMs?: number;
  signal?: AbortSignal;
}): CodexComputerUseRequest {
  if (params.request) {
    return params.request;
  }
  if (params.client) {
    return async <T = JsonValue | undefined>(method: string, requestParams?: unknown) =>
      await params.client!.request<T>(method, requestParams, {
        timeoutMs: params.timeoutMs,
        signal: params.signal,
      });
  }
  const runtime = resolveCodexAppServerRuntimeOptions({ pluginConfig: params.pluginConfig });
  return async <T = JsonValue | undefined>(method: string, requestParams?: unknown) =>
    await requestCodexAppServerJson<T>({
      method,
      requestParams,
      timeoutMs: params.timeoutMs ?? runtime.requestTimeoutMs,
      startOptions: runtime.start,
    });
}

function resolveComputerUseConfig(
  params: Pick<CodexComputerUseSetupParams, "pluginConfig" | "overrides" | "forceEnable">,
): ResolvedCodexComputerUseConfig {
  const overrides = params.forceEnable ? { ...params.overrides, enabled: true } : params.overrides;
  return resolveCodexComputerUseConfig({
    pluginConfig: params.pluginConfig,
    overrides,
  });
}
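The selection order that `resolveMarketplaceRef` applies once candidates are listed (configured name first, then the known-name priority list, then a lone candidate) can be read in isolation in the sketch below. It is a simplified illustration under stated assumptions: `pickMarketplace` and `KNOWN_PRIORITY` are hypothetical names, the types are cut down to the fields used here, the messages are shortened paraphrases, and the `marketplace/add`, `marketplacePath`, and polling steps are left out.

```typescript
// Hypothetical, reduced versions of the types declared in computer-use.ts.
type MarketplaceRef = { name?: string; path?: string };
type MarketplaceResolution = { marketplace?: MarketplaceRef; message?: string };

// Same order as COMPUTER_USE_MARKETPLACE_NAME_PRIORITY in the diff.
const KNOWN_PRIORITY = ["openai-bundled", "openai-curated", "local"];

function pickMarketplace(
  candidates: MarketplaceRef[],
  preferredName?: string,
): MarketplaceResolution {
  // 1. A configured name must match exactly; otherwise fail closed with a message.
  if (preferredName) {
    const preferred = candidates.find((candidate) => candidate.name === preferredName);
    return preferred
      ? { marketplace: preferred }
      : { message: `Configured marketplace ${preferredName} was not found.` };
  }
  // 2. With several matches, only a well-known marketplace name breaks the tie.
  if (candidates.length > 1) {
    for (const name of KNOWN_PRIORITY) {
      const known = candidates.find((candidate) => candidate.name === name);
      if (known) {
        return { marketplace: known };
      }
    }
    return { message: "Multiple marketplaces match; configure one explicitly." };
  }
  // 3. Zero or one candidate: use it when present, otherwise return nothing.
  const only = candidates[0];
  return only ? { marketplace: only } : {};
}
```

For example, `pickMarketplace([{ name: "workspace-tools" }, { name: "openai-curated" }])` resolves to the `openai-curated` entry, which mirrors the "prefers the official Computer Use marketplace" test, while two unknown names produce only a message.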
@@ -2,9 +2,11 @@ import fs from "node:fs/promises";
import { describe, expect, it } from "vitest";
import {
  CODEX_APP_SERVER_CONFIG_KEYS,
  CODEX_COMPUTER_USE_CONFIG_KEYS,
  codexAppServerStartOptionsKey,
  readCodexPluginConfig,
  resolveCodexAppServerRuntimeOptions,
  resolveCodexComputerUseConfig,
} from "./config.js";

describe("Codex app-server config", () => {
@@ -130,6 +132,48 @@ describe("Codex app-server config", () => {
    );
  });

  it("resolves Computer Use setup from plugin config and environment fallbacks", () => {
    expect(
      resolveCodexComputerUseConfig({
        pluginConfig: {
          computerUse: {
            autoInstall: true,
            marketplaceName: "desktop-tools",
          },
        },
        env: {
          OPENCLAW_CODEX_COMPUTER_USE_PLUGIN_NAME: "env-fallback-plugin",
        },
      }),
    ).toEqual({
      enabled: true,
      autoInstall: true,
      marketplaceDiscoveryTimeoutMs: 60_000,
      pluginName: "env-fallback-plugin",
      mcpServerName: "computer-use",
      marketplaceName: "desktop-tools",
    });

    expect(
      resolveCodexComputerUseConfig({
        pluginConfig: {},
        env: {
          OPENCLAW_CODEX_COMPUTER_USE: "1",
          OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_SOURCE: "github:example/plugins",
          OPENCLAW_CODEX_COMPUTER_USE_AUTO_INSTALL: "true",
          OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_DISCOVERY_TIMEOUT_MS: "30000",
        },
      }),
    ).toEqual(
      expect.objectContaining({
        enabled: true,
        autoInstall: true,
        marketplaceDiscoveryTimeoutMs: 30_000,
        marketplaceSource: "github:example/plugins",
      }),
    );
  });

  it("allows plugin config to opt in to guardian-reviewed local execution", () => {
    const runtime = resolveCodexAppServerRuntimeOptions({
      pluginConfig: {
@@ -246,6 +290,7 @@ describe("Codex app-server config", () => {
      configSchema: {
        properties: {
          appServer: { properties: Record<string, unknown> };
          computerUse: { properties: Record<string, unknown> };
        };
      };
      uiHints: Record<string, unknown>;
@@ -258,6 +303,13 @@ describe("Codex app-server config", () => {
    for (const key of CODEX_APP_SERVER_CONFIG_KEYS) {
      expect(manifest.uiHints[`appServer.${key}`]).toBeTruthy();
    }
    const computerUseManifestKeys = Object.keys(
      manifest.configSchema.properties.computerUse.properties,
    ).toSorted();
    expect(computerUseManifestKeys).toEqual([...CODEX_COMPUTER_USE_CONFIG_KEYS].toSorted());
    for (const key of CODEX_COMPUTER_USE_CONFIG_KEYS) {
      expect(manifest.uiHints[`computerUse.${key}`]).toBeTruthy();
    }
  });

  it("does not schema-default mode-derived policy fields", async () => {

@@ -9,6 +9,28 @@ export type CodexAppServerSandboxMode = "read-only" | "workspace-write" | "dange
export type CodexAppServerApprovalsReviewer = "user" | "auto_review" | "guardian_subagent";
export type CodexAppServerCommandSource = "managed" | "resolved-managed" | "config" | "env";

export type CodexComputerUseConfig = {
  enabled?: boolean;
  autoInstall?: boolean;
  marketplaceDiscoveryTimeoutMs?: number;
  marketplaceSource?: string;
  marketplacePath?: string;
  marketplaceName?: string;
  pluginName?: string;
  mcpServerName?: string;
};

export type ResolvedCodexComputerUseConfig = {
  enabled: boolean;
  autoInstall: boolean;
  marketplaceDiscoveryTimeoutMs: number;
  pluginName: string;
  mcpServerName: string;
  marketplaceSource?: string;
  marketplacePath?: string;
  marketplaceName?: string;
};

export type CodexAppServerStartOptions = {
  transport: CodexAppServerTransportMode;
  command: string;
@@ -35,6 +57,7 @@ export type CodexPluginConfig = {
    enabled?: boolean;
    timeoutMs?: number;
  };
  computerUse?: CodexComputerUseConfig;
  appServer?: {
    mode?: CodexAppServerPolicyMode;
    transport?: CodexAppServerTransportMode;
@@ -68,6 +91,21 @@ export const CODEX_APP_SERVER_CONFIG_KEYS = [
  "defaultWorkspaceDir",
] as const;

export const CODEX_COMPUTER_USE_CONFIG_KEYS = [
  "enabled",
  "autoInstall",
  "marketplaceDiscoveryTimeoutMs",
  "marketplaceSource",
  "marketplacePath",
  "marketplaceName",
  "pluginName",
  "mcpServerName",
] as const;

export const DEFAULT_CODEX_COMPUTER_USE_PLUGIN_NAME = "computer-use";
export const DEFAULT_CODEX_COMPUTER_USE_MCP_SERVER_NAME = "computer-use";
export const DEFAULT_CODEX_COMPUTER_USE_MARKETPLACE_DISCOVERY_TIMEOUT_MS = 60_000;

const codexAppServerTransportSchema = z.enum(["stdio", "websocket"]);
const codexAppServerPolicyModeSchema = z.enum(["yolo", "guardian"]);
const codexAppServerApprovalPolicySchema = z.enum([
@@ -92,6 +130,19 @@ const codexPluginConfigSchema = z
|
||||
})
|
||||
.strict()
|
||||
.optional(),
|
||||
computerUse: z
|
||||
.object({
|
||||
enabled: z.boolean().optional(),
|
||||
autoInstall: z.boolean().optional(),
|
||||
marketplaceDiscoveryTimeoutMs: z.number().positive().optional(),
|
||||
marketplaceSource: z.string().optional(),
|
||||
marketplacePath: z.string().optional(),
|
||||
marketplaceName: z.string().optional(),
|
||||
pluginName: z.string().optional(),
|
||||
mcpServerName: z.string().optional(),
|
||||
})
|
||||
.strict()
|
||||
.optional(),
|
||||
appServer: z
|
||||
.object({
|
||||
mode: codexAppServerPolicyModeSchema.optional(),
|
||||
@@ -176,6 +227,64 @@ export function resolveCodexAppServerRuntimeOptions(
|
||||
};
|
||||
}
|
||||
|
||||
export function resolveCodexComputerUseConfig(
  params: {
    pluginConfig?: unknown;
    env?: NodeJS.ProcessEnv;
    overrides?: Partial<CodexComputerUseConfig>;
  } = {},
): ResolvedCodexComputerUseConfig {
  const env = params.env ?? process.env;
  const config = readCodexPluginConfig(params.pluginConfig).computerUse ?? {};
  const marketplaceSource =
    readNonEmptyString(params.overrides?.marketplaceSource) ??
    readNonEmptyString(config.marketplaceSource) ??
    readNonEmptyString(env.OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_SOURCE);
  const marketplacePath =
    readNonEmptyString(params.overrides?.marketplacePath) ??
    readNonEmptyString(config.marketplacePath) ??
    readNonEmptyString(env.OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_PATH);
  const marketplaceName =
    readNonEmptyString(params.overrides?.marketplaceName) ??
    readNonEmptyString(config.marketplaceName) ??
    readNonEmptyString(env.OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_NAME);
  const autoInstall =
    params.overrides?.autoInstall ??
    config.autoInstall ??
    readBooleanEnv(env.OPENCLAW_CODEX_COMPUTER_USE_AUTO_INSTALL) ??
    false;
  const marketplaceDiscoveryTimeoutMs = normalizePositiveNumber(
    params.overrides?.marketplaceDiscoveryTimeoutMs ??
      config.marketplaceDiscoveryTimeoutMs ??
      readNumberEnv(env.OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_DISCOVERY_TIMEOUT_MS),
    DEFAULT_CODEX_COMPUTER_USE_MARKETPLACE_DISCOVERY_TIMEOUT_MS,
  );
  const enabled =
    params.overrides?.enabled ??
    config.enabled ??
    readBooleanEnv(env.OPENCLAW_CODEX_COMPUTER_USE) ??
    Boolean(autoInstall || marketplaceSource || marketplacePath || marketplaceName);

  return {
    enabled,
    autoInstall,
    marketplaceDiscoveryTimeoutMs,
    pluginName:
      readNonEmptyString(params.overrides?.pluginName) ??
      readNonEmptyString(config.pluginName) ??
      readNonEmptyString(env.OPENCLAW_CODEX_COMPUTER_USE_PLUGIN_NAME) ??
      DEFAULT_CODEX_COMPUTER_USE_PLUGIN_NAME,
    mcpServerName:
      readNonEmptyString(params.overrides?.mcpServerName) ??
      readNonEmptyString(config.mcpServerName) ??
      readNonEmptyString(env.OPENCLAW_CODEX_COMPUTER_USE_MCP_SERVER_NAME) ??
      DEFAULT_CODEX_COMPUTER_USE_MCP_SERVER_NAME,
    ...(marketplaceSource ? { marketplaceSource } : {}),
    ...(marketplacePath ? { marketplacePath } : {}),
    ...(marketplaceName ? { marketplaceName } : {}),
  };
}
|
||||
|
||||
export function codexAppServerStartOptionsKey(
|
||||
options: CodexAppServerStartOptions,
|
||||
params: { authProfileId?: string } = {},
|
||||
@@ -264,6 +373,28 @@ function normalizeHeaders(value: unknown): Record<string, string> {
|
||||
);
|
||||
}
|
||||
|
||||
function readBooleanEnv(value: string | undefined): boolean | undefined {
  if (value === undefined) {
    return undefined;
  }
  const normalized = value.trim().toLowerCase();
  if (["1", "true", "yes", "on"].includes(normalized)) {
    return true;
  }
  if (["0", "false", "no", "off"].includes(normalized)) {
    return false;
  }
  return undefined;
}

function readNumberEnv(value: string | undefined): number | undefined {
  if (value === undefined) {
    return undefined;
  }
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}
|
||||
|
||||
function resolveArgs(configArgs: unknown, envArgs: string | undefined): string[] {
|
||||
if (Array.isArray(configArgs)) {
|
||||
return configArgs
|
||||
|
||||
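Note on the `resolveCodexComputerUseConfig` hunk above: each setting appears to be resolved in the same order, with explicit overrides first, then plugin config, then an `OPENCLAW_CODEX_COMPUTER_USE_*` environment variable, then a default. A minimal sketch of that precedence follows. The helpers `firstNonEmpty`, `parseBooleanEnv`, and `resolveSketch` are hypothetical stand-ins rather than repository code, and trimming inside `firstNonEmpty` is an assumption, since `readNonEmptyString` is not shown in this diff.

```typescript
// Illustrative sketch only: firstNonEmpty, parseBooleanEnv, and resolveSketch
// are hypothetical helpers, not the repository's implementation.
type Env = Record<string, string | undefined>;

type SketchConfig = {
  enabled?: boolean;
  marketplaceName?: string;
  pluginName?: string;
};

// Return the first candidate that is a non-blank string (trimming is assumed).
function firstNonEmpty(...candidates: Array<string | undefined>): string | undefined {
  for (const candidate of candidates) {
    const trimmed = candidate?.trim();
    if (trimmed) {
      return trimmed;
    }
  }
  return undefined;
}

// Accepts the same spellings as readBooleanEnv in the diff above.
function parseBooleanEnv(value: string | undefined): boolean | undefined {
  if (value === undefined) {
    return undefined;
  }
  const normalized = value.trim().toLowerCase();
  if (["1", "true", "yes", "on"].includes(normalized)) {
    return true;
  }
  if (["0", "false", "no", "off"].includes(normalized)) {
    return false;
  }
  return undefined;
}

function resolveSketch(overrides: SketchConfig, config: SketchConfig, env: Env) {
  const marketplaceName = firstNonEmpty(
    overrides.marketplaceName,
    config.marketplaceName,
    env["OPENCLAW_CODEX_COMPUTER_USE_MARKETPLACE_NAME"],
  );
  const pluginName =
    firstNonEmpty(
      overrides.pluginName,
      config.pluginName,
      env["OPENCLAW_CODEX_COMPUTER_USE_PLUGIN_NAME"],
    ) ?? "computer-use";
  // With no explicit flag at any level, a configured marketplace implies "enabled".
  // (The real resolver also considers autoInstall, marketplaceSource, and marketplacePath.)
  const enabled =
    overrides.enabled ??
    config.enabled ??
    parseBooleanEnv(env["OPENCLAW_CODEX_COMPUTER_USE"]) ??
    Boolean(marketplaceName);
  return { enabled, pluginName, marketplaceName };
}
```

Because `??` only falls through on `undefined` or `null`, an explicit `enabled: false` at a higher-precedence level wins over any marketplace setting at a lower one.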
@@ -167,7 +167,7 @@ describe("CodexAppServerEventProjector", () => {
|
||||
outputTokens: 100_000,
|
||||
},
|
||||
last: {
|
||||
- totalTokens: 14,
+ totalTokens: 12,
|
||||
inputTokens: 5,
|
||||
cachedInputTokens: 2,
|
||||
outputTokens: 7,
|
||||
@@ -186,12 +186,12 @@ describe("CodexAppServerEventProjector", () => {
|
||||
expect(result.assistantTexts).toEqual(["hello"]);
|
||||
expect(result.messagesSnapshot.map((message) => message.role)).toEqual(["user", "assistant"]);
|
||||
expect(result.lastAssistant?.content).toEqual([{ type: "text", text: "hello" }]);
|
||||
- expect(result.attemptUsage).toMatchObject({ input: 5, output: 7, cacheRead: 2, total: 14 });
+ expect(result.attemptUsage).toMatchObject({ input: 3, output: 7, cacheRead: 2, total: 12 });
|
||||
expect(result.lastAssistant?.usage).toMatchObject({
|
||||
- input: 5,
+ input: 3,
|
||||
output: 7,
|
||||
cacheRead: 2,
|
||||
- totalTokens: 14,
+ totalTokens: 12,
|
||||
});
|
||||
expect(result.replayMetadata.replaySafe).toBe(true);
|
||||
});
|
||||
@@ -289,7 +289,7 @@ describe("CodexAppServerEventProjector", () => {
|
||||
tokenUsage: {
|
||||
total: { total_tokens: 1_000_000 },
|
||||
last_token_usage: {
|
||||
- total_tokens: 20,
+ total_tokens: 17,
|
||||
input_tokens: 8,
|
||||
cached_input_tokens: 3,
|
||||
output_tokens: 9,
|
||||
@@ -300,12 +300,12 @@ describe("CodexAppServerEventProjector", () => {
|
||||
|
||||
const result = projector.buildResult(buildEmptyToolTelemetry());
|
||||
|
||||
- expect(result.attemptUsage).toMatchObject({ input: 8, output: 9, cacheRead: 3, total: 20 });
+ expect(result.attemptUsage).toMatchObject({ input: 5, output: 9, cacheRead: 3, total: 17 });
|
||||
expect(result.lastAssistant?.usage).toMatchObject({
|
||||
- input: 8,
+ input: 5,
|
||||
output: 9,
|
||||
cacheRead: 3,
|
||||
- totalTokens: 20,
+ totalTokens: 17,
|
||||
});
|
||||
});
|
||||
|
||||
|
||||
@@ -61,6 +61,13 @@ const CURRENT_TOKEN_USAGE_KEYS = [
|
||||
"last_token_usage",
|
||||
] as const;
|
||||
|
||||
const CODEX_PROMPT_TOTAL_INPUT_KEYS = [
|
||||
"inputTokens",
|
||||
"input_tokens",
|
||||
"promptTokens",
|
||||
"prompt_tokens",
|
||||
] as const;
|
||||
|
||||
const MAX_TOOL_OUTPUT_DELTA_MESSAGES_PER_ITEM = 20;
|
||||
|
||||
export class CodexAppServerEventProjector {
|
||||
@@ -910,17 +917,24 @@ function readNumberAlias(record: JsonObject, keys: readonly string[]): number |
|
||||
}
|
||||
|
||||
function normalizeCodexTokenUsage(record: JsonObject): ReturnType<typeof normalizeUsage> {
|
||||
const promptTotalInput = readNumberAlias(record, CODEX_PROMPT_TOTAL_INPUT_KEYS);
|
||||
const cacheRead = readNumberAlias(record, [
|
||||
"cachedInputTokens",
|
||||
"cached_input_tokens",
|
||||
"cacheRead",
|
||||
"cache_read",
|
||||
"cache_read_input_tokens",
|
||||
"cached_tokens",
|
||||
]);
|
||||
const input =
|
||||
promptTotalInput !== undefined && cacheRead !== undefined
|
||||
? Math.max(0, promptTotalInput - cacheRead)
|
||||
: (promptTotalInput ?? readNumber(record, "input"));
|
||||
|
||||
return normalizeUsage({
|
||||
- input: readNumberAlias(record, ["inputTokens", "input_tokens", "input", "promptTokens"]),
+ input,
|
||||
output: readNumberAlias(record, ["outputTokens", "output_tokens", "output"]),
|
||||
- cacheRead: readNumberAlias(record, [
- "cachedInputTokens",
- "cached_input_tokens",
- "cacheRead",
- "cache_read",
- "cache_read_input_tokens",
- "cached_tokens",
- ]),
+ cacheRead,
|
||||
cacheWrite: readNumberAlias(record, [
|
||||
"cacheWrite",
|
||||
"cache_write",
|
||||
|
||||
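Note on the `normalizeCodexTokenUsage` hunk above: the change treats the reported input count as a prompt total that already includes cached tokens, and subtracts the cached part so it is not counted twice. A minimal sketch of that arithmetic follows. `splitCachedInput` and `UsageSketch` are hypothetical names, only the camelCase keys are handled, and computing `total` as the sum of the three parts is an assumption that happens to match the updated test fixtures (3 + 7 + 2 = 12 and 5 + 9 + 3 = 17).

```typescript
// Illustrative sketch only: the real code reads many key aliases through
// readNumberAlias and delegates to normalizeUsage; this handles one spelling.
type RawUsageSketch = {
  inputTokens?: number;
  cachedInputTokens?: number;
  outputTokens?: number;
};

type UsageSketch = {
  input: number | undefined;
  output: number | undefined;
  cacheRead: number | undefined;
  total: number | undefined;
};

function splitCachedInput(raw: RawUsageSketch): UsageSketch {
  const { inputTokens, cachedInputTokens, outputTokens } = raw;
  // Fresh input = prompt total minus cached tokens, clamped at zero.
  const input =
    inputTokens !== undefined && cachedInputTokens !== undefined
      ? Math.max(0, inputTokens - cachedInputTokens)
      : inputTokens;
  // Assumption: total is the sum of whichever parts are present.
  const parts = [input, outputTokens, cachedInputTokens].filter(
    (value): value is number => value !== undefined,
  );
  return {
    input,
    output: outputTokens,
    cacheRead: cachedInputTokens,
    total: parts.length > 0 ? parts.reduce((sum, value) => sum + value, 0) : undefined,
  };
}
```

The clamp guards against a provider reporting more cached tokens than prompt tokens, which would otherwise yield a negative input count.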
@@ -41,6 +41,7 @@ import {
|
||||
defaultCodexAppServerClientFactory,
|
||||
} from "./client-factory.js";
|
||||
import { isCodexAppServerApprovalRequest, type CodexAppServerClient } from "./client.js";
|
||||
import { ensureCodexComputerUse } from "./computer-use.js";
|
||||
import { resolveCodexAppServerRuntimeOptions } from "./config.js";
|
||||
import { projectContextEngineAssemblyForCodex } from "./context-engine-projection.js";
|
||||
import { createCodexDynamicToolBridge } from "./dynamic-tools.js";
|
||||
@@ -311,6 +312,12 @@ export async function runCodexAppServerAttempt(
|
||||
signal: runAbortController.signal,
|
||||
operation: async () => {
|
||||
const startupClient = await clientFactory(appServer.start, startupAuthProfileId);
|
||||
await ensureCodexComputerUse({
|
||||
client: startupClient,
|
||||
pluginConfig: options.pluginConfig,
|
||||
timeoutMs: appServer.requestTimeoutMs,
|
||||
signal: runAbortController.signal,
|
||||
});
|
||||
const startupThread = await startOrResumeThread({
|
||||
client: startupClient,
|
||||
params,
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
import type { CodexComputerUseStatus } from "./app-server/computer-use.js";
|
||||
import type { CodexAppServerModelListResult } from "./app-server/models.js";
|
||||
import { isJsonObject, type JsonObject, type JsonValue } from "./app-server/protocol.js";
|
||||
import type { SafeValue } from "./command-rpc.js";
|
||||
@@ -89,6 +90,28 @@ export function formatAccount(
|
||||
].join("\n");
|
||||
}
|
||||
|
||||
export function formatComputerUseStatus(status: CodexComputerUseStatus): string {
  const lines = [
    `Computer Use: ${status.ready ? "ready" : status.enabled ? "not ready" : "disabled"}`,
  ];
  lines.push(
    `Plugin: ${status.pluginName}${status.installed ? " (installed)" : " (not installed)"}`,
  );
  lines.push(
    `MCP server: ${status.mcpServerName}${
      status.mcpServerAvailable ? ` (${status.tools.length} tools)` : " (unavailable)"
    }`,
  );
  if (status.marketplaceName) {
    lines.push(`Marketplace: ${status.marketplaceName}`);
  }
  if (status.tools.length > 0) {
    lines.push(`Tools: ${status.tools.slice(0, 8).join(", ")}`);
  }
  lines.push(status.message);
  return lines.join("\n");
}
|
||||
|
||||
export function formatList(response: JsonValue | undefined, label: string): string {
|
||||
const entries = extractArray(response);
|
||||
if (entries.length === 0) {
|
||||
@@ -120,6 +143,7 @@ export function buildHelp(): string {
|
||||
"- /codex detach",
|
||||
"- /codex compact",
|
||||
"- /codex review",
|
||||
"- /codex computer-use [status|install]",
|
||||
"- /codex account",
|
||||
"- /codex mcp",
|
||||
"- /codex skills",
|
||||
|
||||
@@ -1,5 +1,11 @@
|
||||
import type { PluginCommandContext, PluginCommandResult } from "openclaw/plugin-sdk/plugin-entry";
|
||||
import { CODEX_CONTROL_METHODS, type CodexControlMethod } from "./app-server/capabilities.js";
|
||||
import {
|
||||
installCodexComputerUse,
|
||||
readCodexComputerUseStatus,
|
||||
type CodexComputerUseSetupParams,
|
||||
} from "./app-server/computer-use.js";
|
||||
import type { CodexComputerUseConfig } from "./app-server/config.js";
|
||||
import { listAllCodexAppServerModels } from "./app-server/models.js";
|
||||
import { isJsonObject, type JsonValue } from "./app-server/protocol.js";
|
||||
import {
|
||||
@@ -10,6 +16,7 @@ import {
|
||||
import {
|
||||
buildHelp,
|
||||
formatAccount,
|
||||
formatComputerUseStatus,
|
||||
formatCodexStatus,
|
||||
formatList,
|
||||
formatModels,
|
||||
@@ -49,6 +56,8 @@ export type CodexCommandDeps = {
|
||||
safeCodexControlRequest: SafeCodexControlRequestFn;
|
||||
writeCodexAppServerBinding: typeof writeCodexAppServerBinding;
|
||||
clearCodexAppServerBinding: typeof clearCodexAppServerBinding;
|
||||
readCodexComputerUseStatus: typeof readCodexComputerUseStatus;
|
||||
installCodexComputerUse: typeof installCodexComputerUse;
|
||||
resolveCodexDefaultWorkspaceDir: typeof resolveCodexDefaultWorkspaceDir;
|
||||
startCodexConversationThread: typeof startCodexConversationThread;
|
||||
readCodexConversationActiveTurn: typeof readCodexConversationActiveTurn;
|
||||
@@ -80,6 +89,8 @@ const defaultCodexCommandDeps: CodexCommandDeps = {
|
||||
safeCodexControlRequest,
|
||||
writeCodexAppServerBinding,
|
||||
clearCodexAppServerBinding,
|
||||
readCodexComputerUseStatus,
|
||||
installCodexComputerUse,
|
||||
resolveCodexDefaultWorkspaceDir,
|
||||
startCodexConversationThread,
|
||||
readCodexConversationActiveTurn,
|
||||
@@ -98,6 +109,13 @@ type ParsedBindArgs = {
|
||||
help?: boolean;
|
||||
};
|
||||
|
||||
type ParsedComputerUseArgs = {
|
||||
action: "status" | "install";
|
||||
overrides: Partial<CodexComputerUseConfig>;
|
||||
hasOverrides: boolean;
|
||||
help?: boolean;
|
||||
};
|
||||
|
||||
export async function handleCodexSubcommand(
|
||||
ctx: PluginCommandContext,
|
||||
options: { pluginConfig?: unknown; deps?: Partial<CodexCommandDeps> },
|
||||
@@ -170,6 +188,11 @@ export async function handleCodexSubcommand(
|
||||
),
|
||||
};
|
||||
}
|
||||
if (normalized === "computer-use" || normalized === "computeruse") {
|
||||
return {
|
||||
text: await handleComputerUseCommand(deps, options.pluginConfig, rest),
|
||||
};
|
||||
}
|
||||
if (normalized === "mcp") {
|
||||
return {
|
||||
text: formatList(
|
||||
@@ -204,6 +227,29 @@ export async function handleCodexSubcommand(
|
||||
return { text: `Unknown Codex command: ${subcommand}\n\n${buildHelp()}` };
|
||||
}
|
||||
|
||||
async function handleComputerUseCommand(
  deps: CodexCommandDeps,
  pluginConfig: unknown,
  args: string[],
): Promise<string> {
  const parsed = parseComputerUseArgs(args);
  if (parsed.help) {
    return [
      "Usage: /codex computer-use [status|install] [--source <marketplace-source>] [--marketplace-path <path>] [--marketplace <name>]",
      "Checks or installs the configured Codex Computer Use plugin through app-server.",
    ].join("\n");
  }
  const params: CodexComputerUseSetupParams = {
    pluginConfig,
    forceEnable: parsed.action === "install" || parsed.hasOverrides,
    ...(Object.keys(parsed.overrides).length > 0 ? { overrides: parsed.overrides } : {}),
  };
  if (parsed.action === "install") {
    return formatComputerUseStatus(await deps.installCodexComputerUse(params));
  }
  return formatComputerUseStatus(await deps.readCodexComputerUseStatus(params));
}
|
||||
|
||||
async function bindConversation(
|
||||
deps: CodexCommandDeps,
|
||||
ctx: PluginCommandContext,
|
||||
@@ -504,6 +550,114 @@ function parseBindArgs(args: string[]): ParsedBindArgs {
|
||||
return parsed;
|
||||
}
|
||||
|
||||
function parseComputerUseArgs(args: string[]): ParsedComputerUseArgs {
  const parsed: ParsedComputerUseArgs = {
    action: "status",
    overrides: {},
    hasOverrides: false,
  };
  for (let index = 0; index < args.length; index += 1) {
    const arg = args[index];
    if (arg === "--help" || arg === "-h") {
      parsed.help = true;
      continue;
    }
    if (arg === "status" || arg === "install") {
      parsed.action = arg;
      continue;
    }
    if (arg === "--source" || arg === "--marketplace-source") {
      const value = readRequiredOptionValue(args, index);
      if (!value) {
        parsed.help = true;
        continue;
      }
      parsed.overrides.marketplaceSource = value;
      index += 1;
      continue;
    }
    if (arg === "--marketplace-path" || arg === "--path") {
      const value = readRequiredOptionValue(args, index);
      if (!value) {
        parsed.help = true;
        continue;
      }
      parsed.overrides.marketplacePath = value;
      index += 1;
      continue;
    }
    if (arg === "--marketplace") {
      const value = readRequiredOptionValue(args, index);
      if (!value) {
        parsed.help = true;
        continue;
      }
      parsed.overrides.marketplaceName = value;
      index += 1;
      continue;
    }
    if (arg === "--plugin") {
      const value = readRequiredOptionValue(args, index);
      if (!value) {
        parsed.help = true;
        continue;
      }
      parsed.overrides.pluginName = value;
      index += 1;
      continue;
    }
    if (arg === "--server" || arg === "--mcp-server") {
      const value = readRequiredOptionValue(args, index);
      if (!value) {
        parsed.help = true;
        continue;
      }
      parsed.overrides.mcpServerName = value;
      index += 1;
      continue;
    }
    parsed.help = true;
  }
  parsed.overrides = normalizeComputerUseStringOverrides(parsed.overrides);
  parsed.hasOverrides = Object.values(parsed.overrides).some(Boolean);
  return parsed;
}

function readRequiredOptionValue(args: string[], index: number): string | undefined {
  const value = args[index + 1];
  if (!value || value.startsWith("-")) {
    return undefined;
  }
  return value;
}
|
||||
|
||||
function normalizeComputerUseStringOverrides(
|
||||
overrides: Partial<CodexComputerUseConfig>,
|
||||
): Partial<CodexComputerUseConfig> {
|
||||
const normalized: Partial<CodexComputerUseConfig> = {};
|
||||
const marketplaceSource = normalizeOptionalString(overrides.marketplaceSource);
|
||||
if (marketplaceSource) {
|
||||
normalized.marketplaceSource = marketplaceSource;
|
||||
}
|
||||
const marketplacePath = normalizeOptionalString(overrides.marketplacePath);
|
||||
if (marketplacePath) {
|
||||
normalized.marketplacePath = marketplacePath;
|
||||
}
|
||||
const marketplaceName = normalizeOptionalString(overrides.marketplaceName);
|
||||
if (marketplaceName) {
|
||||
normalized.marketplaceName = marketplaceName;
|
||||
}
|
||||
const pluginName = normalizeOptionalString(overrides.pluginName);
|
||||
if (pluginName) {
|
||||
normalized.pluginName = pluginName;
|
||||
}
|
||||
const mcpServerName = normalizeOptionalString(overrides.mcpServerName);
|
||||
if (mcpServerName) {
|
||||
normalized.mcpServerName = mcpServerName;
|
||||
}
|
||||
return normalized;
|
||||
}
|
||||
|
||||
function normalizeOptionalString(value: string | undefined): string | undefined {
|
||||
const trimmed = value?.trim();
|
||||
return trimmed || undefined;
|
||||
|
||||
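Note on the `parseComputerUseArgs` hunk above: every value-taking flag follows the same steps (look ahead one argument, fall back to help if the value is missing or looks like another flag, otherwise record it and skip the consumed value). A table-driven sketch of that loop follows, offered only as an illustration of the pattern. `parseFlags`, `FLAG_TO_KEY`, and `OverridesSketch` are hypothetical names, only three flags are covered, and the `status`/`install` actions are left out.

```typescript
// Illustrative sketch only: a condensed, table-driven variant of the
// flag-parsing loop; not the repository's parser.
type OverridesSketch = {
  marketplaceSource?: string;
  marketplaceName?: string;
};

const FLAG_TO_KEY: Record<string, keyof OverridesSketch> = {
  "--source": "marketplaceSource",
  "--marketplace-source": "marketplaceSource",
  "--marketplace": "marketplaceName",
};

function parseFlags(args: string[]): { overrides: OverridesSketch; help: boolean } {
  const overrides: OverridesSketch = {};
  let help = false;
  for (let index = 0; index < args.length; index += 1) {
    const key = FLAG_TO_KEY[args[index] ?? ""];
    const value = args[index + 1];
    // Unknown flag, missing value, or a value that looks like a flag: show help.
    if (!key || !value || value.startsWith("-")) {
      help = true;
      continue;
    }
    overrides[key] = value;
    index += 1; // Skip the value that was just consumed.
  }
  return { overrides, help };
}
```

For example, `parseFlags(["--source"])` reports `help: true` with no overrides, which mirrors the "shows help when Computer Use option values are missing" test in this change.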
@@ -4,6 +4,7 @@ import path from "node:path";
|
||||
import type { PluginCommandContext } from "openclaw/plugin-sdk/plugin-entry";
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
import { CODEX_CONTROL_METHODS } from "./app-server/capabilities.js";
|
||||
import type { CodexComputerUseStatus } from "./app-server/computer-use.js";
|
||||
import type { CodexAppServerStartOptions } from "./app-server/config.js";
|
||||
import { resetSharedCodexAppServerClientForTests } from "./app-server/shared-client.js";
|
||||
import type { CodexCommandDeps } from "./command-handlers.js";
|
||||
@@ -241,6 +242,67 @@ describe("codex command", () => {
|
||||
});
|
||||
});
|
||||
|
||||
it("checks Codex Computer Use setup", async () => {
|
||||
const readCodexComputerUseStatus = vi.fn(async () => computerUseReadyStatus());
|
||||
|
||||
await expect(
|
||||
handleCodexCommand(createContext("computer-use status"), {
|
||||
deps: createDeps({ readCodexComputerUseStatus }),
|
||||
}),
|
||||
).resolves.toEqual({
|
||||
text: [
|
||||
"Computer Use: ready",
|
||||
"Plugin: computer-use (installed)",
|
||||
"MCP server: computer-use (1 tools)",
|
||||
"Marketplace: desktop-tools",
|
||||
"Tools: list_apps",
|
||||
"Computer Use is ready.",
|
||||
].join("\n"),
|
||||
});
|
||||
expect(readCodexComputerUseStatus).toHaveBeenCalledWith({
|
||||
pluginConfig: undefined,
|
||||
forceEnable: false,
|
||||
});
|
||||
});
|
||||
|
||||
it("installs Codex Computer Use from command overrides", async () => {
|
||||
const installCodexComputerUse = vi.fn(async () => computerUseReadyStatus());
|
||||
|
||||
await expect(
|
||||
handleCodexCommand(
|
||||
createContext(
|
||||
"computer-use install --source github:example/desktop-tools --marketplace desktop-tools",
|
||||
),
|
||||
{
|
||||
deps: createDeps({ installCodexComputerUse }),
|
||||
},
|
||||
),
|
||||
).resolves.toEqual({
|
||||
text: expect.stringContaining("Computer Use: ready"),
|
||||
});
|
||||
expect(installCodexComputerUse).toHaveBeenCalledWith({
|
||||
pluginConfig: undefined,
|
||||
forceEnable: true,
|
||||
overrides: {
|
||||
marketplaceSource: "github:example/desktop-tools",
|
||||
marketplaceName: "desktop-tools",
|
||||
},
|
||||
});
|
||||
});
|
||||
|
||||
it("shows help when Computer Use option values are missing", async () => {
|
||||
const installCodexComputerUse = vi.fn(async () => computerUseReadyStatus());
|
||||
|
||||
await expect(
|
||||
handleCodexCommand(createContext("computer-use install --source"), {
|
||||
deps: createDeps({ installCodexComputerUse }),
|
||||
}),
|
||||
).resolves.toEqual({
|
||||
text: expect.stringContaining("Usage: /codex computer-use"),
|
||||
});
|
||||
expect(installCodexComputerUse).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("explains compaction when no Codex thread is attached", async () => {
|
||||
const sessionFile = path.join(tempDir, "session.jsonl");
|
||||
|
||||
@@ -600,3 +662,18 @@ describe("codex command", () => {
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
function computerUseReadyStatus(): CodexComputerUseStatus {
  return {
    enabled: true,
    ready: true,
    installed: true,
    pluginEnabled: true,
    mcpServerAvailable: true,
    pluginName: "computer-use",
    mcpServerName: "computer-use",
    marketplaceName: "desktop-tools",
    tools: ["list_apps"],
    message: "Computer Use is ready.",
  };
}
|
||||
|
||||
@@ -7,14 +7,24 @@ const telemetryState = vi.hoisted(() => {
|
||||
name: string;
|
||||
addEvent: ReturnType<typeof vi.fn>;
|
||||
end: ReturnType<typeof vi.fn>;
|
||||
setAttributes: ReturnType<typeof vi.fn>;
|
||||
setStatus: ReturnType<typeof vi.fn>;
|
||||
spanContext: ReturnType<typeof vi.fn>;
|
||||
}> = [];
|
||||
const tracer = {
|
||||
startSpan: vi.fn((name: string, _opts?: unknown, _ctx?: unknown) => {
|
||||
const spanNumber = spans.length + 1;
|
||||
const spanId = spanNumber.toString(16).padStart(16, "0");
|
||||
const span = {
|
||||
addEvent: vi.fn(),
|
||||
end: vi.fn(),
|
||||
setAttributes: vi.fn(),
|
||||
setStatus: vi.fn(),
|
||||
spanContext: vi.fn(() => ({
|
||||
traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
|
||||
spanId,
|
||||
traceFlags: 1,
|
||||
})),
|
||||
};
|
||||
spans.push({ name, ...span });
|
||||
return span;
|
||||
@@ -122,6 +132,7 @@ vi.mock("@opentelemetry/semantic-conventions", () => ({
|
||||
import {
|
||||
emitTrustedDiagnosticEvent,
|
||||
onInternalDiagnosticEvent,
|
||||
resetDiagnosticEventsForTest,
|
||||
} from "../../../src/infra/diagnostic-events.js";
|
||||
import type { OpenClawPluginServiceContext } from "../api.js";
|
||||
import { emitDiagnosticEvent } from "../api.js";
|
||||
@@ -219,6 +230,7 @@ function flushDiagnosticEvents() {
|
||||
|
||||
describe("diagnostics-otel service", () => {
|
||||
beforeEach(() => {
|
||||
resetDiagnosticEventsForTest();
|
||||
delete process.env.OPENCLAW_OTEL_PRELOADED;
|
||||
delete process.env.OTEL_SEMCONV_STABILITY_OPT_IN;
|
||||
telemetryState.counters.clear();
|
||||
@@ -241,6 +253,7 @@ describe("diagnostics-otel service", () => {
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
resetDiagnosticEventsForTest();
|
||||
if (ORIGINAL_OPENCLAW_OTEL_PRELOADED === undefined) {
|
||||
delete process.env.OPENCLAW_OTEL_PRELOADED;
|
||||
} else {
|
||||
@@ -561,6 +574,7 @@ describe("diagnostics-otel service", () => {
|
||||
outcome: "completed",
|
||||
durationMs: 100,
|
||||
});
|
||||
await flushDiagnosticEvents();
|
||||
|
||||
expect(sdkStart).not.toHaveBeenCalled();
|
||||
expect(telemetryState.histograms.get("openclaw.run.duration_ms")?.record).toHaveBeenCalledWith(
|
||||
@@ -1133,6 +1147,9 @@ describe("diagnostics-otel service", () => {
|
||||
api: "completions",
|
||||
transport: "http",
|
||||
durationMs: 80,
|
||||
requestPayloadBytes: 1234,
|
||||
responseStreamBytes: 567,
|
||||
timeToFirstByteMs: 45,
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: CHILD_SPAN_ID,
|
||||
@@ -1295,6 +1312,41 @@ describe("diagnostics-otel service", () => {
|
||||
"openclaw.model": "gpt-5.4",
|
||||
}),
|
||||
);
|
||||
expect(
|
||||
telemetryState.histograms.get("openclaw.model_call.request_bytes")?.record,
|
||||
).toHaveBeenCalledWith(
|
||||
1234,
|
||||
expect.objectContaining({
|
||||
"openclaw.provider": "openai",
|
||||
"openclaw.model": "gpt-5.4",
|
||||
}),
|
||||
);
|
||||
expect(
|
||||
telemetryState.histograms.get("openclaw.model_call.response_bytes")?.record,
|
||||
).toHaveBeenCalledWith(
|
||||
567,
|
||||
expect.objectContaining({
|
||||
"openclaw.provider": "openai",
|
||||
"openclaw.model": "gpt-5.4",
|
||||
}),
|
||||
);
|
||||
expect(
|
||||
telemetryState.histograms.get("openclaw.model_call.time_to_first_byte_ms")?.record,
|
||||
).toHaveBeenCalledWith(
|
||||
45,
|
||||
expect.objectContaining({
|
||||
"openclaw.provider": "openai",
|
||||
"openclaw.model": "gpt-5.4",
|
||||
}),
|
||||
);
|
||||
const modelCallSpan = telemetryState.spans.find((span) => span.name === "openclaw.model.call");
|
||||
expect(modelCallSpan?.setAttributes).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
"openclaw.model_call.request_bytes": 1234,
|
||||
"openclaw.model_call.response_bytes": 567,
|
||||
"openclaw.model_call.time_to_first_byte_ms": 45,
|
||||
}),
|
||||
);
|
||||
expect(telemetryState.histograms.get("openclaw.run.duration_ms")?.record).toHaveBeenCalledWith(
|
||||
100,
|
||||
expect.not.objectContaining({
|
||||
@@ -1506,6 +1558,17 @@ describe("diagnostics-otel service", () => {
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
await service.start(ctx);
|
||||
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "run.started",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "context.assembled",
|
||||
runId: "run-1",
|
||||
@@ -1536,6 +1599,8 @@ describe("diagnostics-otel service", () => {
|
||||
const contextCall = telemetryState.tracer.startSpan.mock.calls.find(
|
||||
(call) => call[0] === "openclaw.context.assembled",
|
||||
);
|
||||
const runSpan = telemetryState.spans.find((span) => span.name === "openclaw.run");
|
||||
const runSpanId = runSpan?.spanContext.mock.results[0]?.value?.spanId;
|
||||
expect(contextCall?.[1]).toMatchObject({
|
||||
attributes: {
|
||||
"openclaw.provider": "openai",
|
||||
@@ -1553,12 +1618,19 @@ describe("diagnostics-otel service", () => {
|
||||
"openclaw.context.reserve_tokens": 4096,
|
||||
},
|
||||
});
|
||||
expect(contextCall?.[1]).toEqual({
|
||||
attributes: expect.any(Object),
|
||||
startTime: expect.any(Number),
|
||||
});
|
||||
expect(JSON.stringify(contextCall)).not.toContain("session-key");
|
||||
expect(JSON.stringify(contextCall)).not.toContain("prompt text");
|
||||
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledWith(
|
||||
expect.anything(),
|
||||
- expect.objectContaining({ traceId: TRACE_ID, spanId: SPAN_ID }),
+ expect.objectContaining({ traceId: TRACE_ID, spanId: runSpanId }),
|
||||
);
|
||||
expect(
|
||||
(contextCall?.[2] as { spanContext?: { spanId?: string } } | undefined)?.spanContext?.spanId,
|
||||
).toBe(runSpanId);
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
@@ -1688,7 +1760,185 @@ describe("diagnostics-otel service", () => {
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
- test("parents trusted diagnostic lifecycle spans from explicit parent ids", async () => {
+ test("parents trusted diagnostic lifecycle spans from active started spans", async () => {
|
||||
const service = createDiagnosticsOtelService();
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
await service.start(ctx);
|
||||
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "run.started",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: CHILD_SPAN_ID,
|
||||
parentSpanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "model.call.started",
|
||||
runId: "run-1",
|
||||
callId: "call-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: GRANDCHILD_SPAN_ID,
|
||||
parentSpanId: CHILD_SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "tool.execution.started",
|
||||
runId: "run-1",
|
||||
toolName: "read",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: TOOL_SPAN_ID,
|
||||
parentSpanId: GRANDCHILD_SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "tool.execution.error",
|
||||
runId: "run-1",
|
||||
toolName: "read",
|
||||
durationMs: 20,
|
||||
errorCategory: "TypeError",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: TOOL_SPAN_ID,
|
||||
parentSpanId: GRANDCHILD_SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "model.call.completed",
|
||||
runId: "run-1",
|
||||
callId: "call-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
durationMs: 80,
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: GRANDCHILD_SPAN_ID,
|
||||
parentSpanId: CHILD_SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "run.completed",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
outcome: "completed",
|
||||
durationMs: 100,
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: CHILD_SPAN_ID,
|
||||
parentSpanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
await flushDiagnosticEvents();
|
||||
|
||||
const runSpan = telemetryState.spans.find((span) => span.name === "openclaw.run");
|
||||
const modelSpan = telemetryState.spans.find((span) => span.name === "openclaw.model.call");
|
||||
const toolSpan = telemetryState.spans.find((span) => span.name === "openclaw.tool.execution");
|
||||
const runSpanId = runSpan?.spanContext.mock.results[0]?.value?.spanId;
|
||||
const modelSpanId = modelSpan?.spanContext.mock.results[0]?.value?.spanId;
|
||||
|
||||
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledTimes(2);
|
||||
expect(telemetryState.tracer.setSpanContext.mock.calls.map((call) => call[1])).toEqual([
|
||||
expect.objectContaining({ traceId: TRACE_ID, spanId: runSpanId }),
|
||||
expect.objectContaining({ traceId: TRACE_ID, spanId: modelSpanId }),
|
||||
]);
|
||||
|
||||
const parentBySpanName = Object.fromEntries(
|
||||
telemetryState.tracer.startSpan.mock.calls.map((call) => [
|
||||
call[0],
|
||||
(call[2] as { spanContext?: { spanId?: string } } | undefined)?.spanContext?.spanId,
|
||||
]),
|
||||
);
|
||||
expect(parentBySpanName).toMatchObject({
|
||||
"openclaw.run": undefined,
|
||||
"openclaw.model.call": runSpanId,
|
||||
"openclaw.tool.execution": modelSpanId,
|
||||
});
|
||||
expect(toolSpan?.setStatus).toHaveBeenCalledWith({
|
||||
code: 2,
|
||||
message: "TypeError",
|
||||
});
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
test("keeps trusted run spans alive long enough for post-completion usage parenting", async () => {
|
||||
const service = createDiagnosticsOtelService();
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
await service.start(ctx);
|
||||
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "run.started",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: CHILD_SPAN_ID,
|
||||
parentSpanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "run.completed",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
outcome: "completed",
|
||||
durationMs: 100,
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: CHILD_SPAN_ID,
|
||||
parentSpanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
emitTrustedDiagnosticEvent({
|
||||
type: "model.usage",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
usage: { input: 3, output: 2, total: 5 },
|
||||
durationMs: 10,
|
||||
trace: {
|
||||
traceId: TRACE_ID,
|
||||
spanId: GRANDCHILD_SPAN_ID,
|
||||
parentSpanId: SPAN_ID,
|
||||
traceFlags: "01",
|
||||
},
|
||||
});
|
||||
await flushDiagnosticEvents();
|
||||
|
||||
const runSpan = telemetryState.spans.find((span) => span.name === "openclaw.run");
|
||||
const runSpanId = runSpan?.spanContext.mock.results[0]?.value?.spanId;
|
||||
const modelUsageCall = telemetryState.tracer.startSpan.mock.calls.find(
|
||||
(call) => call[0] === "openclaw.model.usage",
|
||||
);
|
||||
|
||||
expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledWith(
|
||||
expect.anything(),
|
||||
expect.objectContaining({ traceId: TRACE_ID, spanId: runSpanId }),
|
||||
);
|
||||
expect(
|
||||
(modelUsageCall?.[2] as { spanContext?: { spanId?: string } } | undefined)?.spanContext
|
||||
?.spanId,
|
||||
).toBe(runSpanId);
|
||||
expect(runSpan?.end).toHaveBeenCalledWith(expect.any(Number));
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
test("does not force remote parents for completed-only trusted lifecycle spans", async () => {
|
||||
const service = createDiagnosticsOtelService();
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
await service.start(ctx);
|
@@ -1721,38 +1971,15 @@ describe("diagnostics-otel service", () => {
         traceFlags: "01",
       },
     });
-    emitTrustedDiagnosticEvent({
-      type: "tool.execution.error",
-      runId: "run-1",
-      toolName: "read",
-      durationMs: 20,
-      errorCategory: "TypeError",
-      trace: {
-        traceId: TRACE_ID,
-        spanId: TOOL_SPAN_ID,
-        parentSpanId: GRANDCHILD_SPAN_ID,
-        traceFlags: "01",
-      },
-    });
     await flushDiagnosticEvents();
 
-    expect(telemetryState.tracer.setSpanContext).toHaveBeenCalledTimes(3);
-    expect(telemetryState.tracer.setSpanContext.mock.calls.map((call) => call[1])).toEqual([
-      expect.objectContaining({ traceId: TRACE_ID, spanId: SPAN_ID }),
-      expect.objectContaining({ traceId: TRACE_ID, spanId: CHILD_SPAN_ID }),
-      expect.objectContaining({ traceId: TRACE_ID, spanId: GRANDCHILD_SPAN_ID }),
-    ]);
-
+    expect(telemetryState.tracer.setSpanContext).not.toHaveBeenCalled();
     const parentBySpanName = Object.fromEntries(
-      telemetryState.tracer.startSpan.mock.calls.map((call) => [
-        call[0],
-        (call[2] as { spanContext?: { spanId?: string } } | undefined)?.spanContext?.spanId,
-      ]),
+      telemetryState.tracer.startSpan.mock.calls.map((call) => [call[0], call[2]]),
     );
     expect(parentBySpanName).toMatchObject({
-      "openclaw.run": SPAN_ID,
-      "openclaw.model.call": CHILD_SPAN_ID,
-      "openclaw.tool.execution": GRANDCHILD_SPAN_ID,
+      "openclaw.run": undefined,
+      "openclaw.model.call": undefined,
     });
     await service.stop?.(ctx);
   });
||||
@@ -1860,6 +2087,93 @@ describe("diagnostics-otel service", () => {
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
test("does not create live started spans for untrusted lifecycle diagnostics", async () => {
|
||||
const service = createDiagnosticsOtelService();
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
await service.start(ctx);
|
||||
|
||||
emitDiagnosticEvent({
|
||||
type: "run.started",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "run.completed",
|
||||
runId: "run-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
outcome: "completed",
|
||||
durationMs: 100,
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "model.call.started",
|
||||
runId: "run-1",
|
||||
callId: "call-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "model.call.completed",
|
||||
runId: "run-1",
|
||||
callId: "call-1",
|
||||
provider: "openai",
|
||||
model: "gpt-5.4",
|
||||
durationMs: 80,
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "tool.execution.started",
|
||||
runId: "run-1",
|
||||
toolName: "read",
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "tool.execution.error",
|
||||
runId: "run-1",
|
||||
toolName: "read",
|
||||
durationMs: 20,
|
||||
errorCategory: "TypeError",
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "harness.run.started",
|
||||
runId: "run-1",
|
||||
provider: "codex",
|
||||
model: "gpt-5.4",
|
||||
harnessId: "codex",
|
||||
pluginId: "codex-plugin",
|
||||
});
|
||||
emitDiagnosticEvent({
|
||||
type: "harness.run.completed",
|
||||
runId: "run-1",
|
||||
provider: "codex",
|
||||
model: "gpt-5.4",
|
||||
harnessId: "codex",
|
||||
pluginId: "codex-plugin",
|
||||
outcome: "completed",
|
||||
durationMs: 90,
|
||||
});
|
||||
await flushDiagnosticEvents();
|
||||
|
||||
expect(
|
||||
telemetryState.tracer.startSpan.mock.calls.filter((call) => call[0] === "openclaw.run"),
|
||||
).toHaveLength(1);
|
||||
expect(
|
||||
telemetryState.tracer.startSpan.mock.calls.filter(
|
||||
(call) => call[0] === "openclaw.model.call",
|
||||
),
|
||||
).toHaveLength(1);
|
||||
expect(
|
||||
telemetryState.tracer.startSpan.mock.calls.filter(
|
||||
(call) => call[0] === "openclaw.tool.execution",
|
||||
),
|
||||
).toHaveLength(1);
|
||||
expect(
|
||||
telemetryState.tracer.startSpan.mock.calls.filter(
|
||||
(call) => call[0] === "openclaw.harness.run",
|
||||
),
|
||||
).toHaveLength(1);
|
||||
await service.stop?.(ctx);
|
||||
});
|
||||
|
||||
test("exports exec process spans without command text", async () => {
|
||||
const service = createDiagnosticsOtelService();
|
||||
const ctx = createOtelContext(OTEL_TEST_ENDPOINT, { traces: true, metrics: true });
|
||||
|
@@ -81,9 +81,9 @@ type ModelCallLifecycleDiagnosticEvent = Extract<
   DiagnosticEventPayload,
   { type: "model.call.completed" | "model.call.error" }
 >;
-type HarnessRunLifecycleDiagnosticEvent = Extract<
+type HarnessRunDiagnosticEvent = Extract<
   DiagnosticEventPayload,
-  { type: "harness.run.completed" | "harness.run.error" }
+  { type: "harness.run.started" | "harness.run.completed" | "harness.run.error" }
 >;
 type TelemetryExporterDiagnosticEvent = Extract<
   DiagnosticEventPayload,
@@ -217,7 +217,7 @@ function positiveFiniteNumber(value: number | undefined): number | undefined {
 }
 
 function assignPositiveNumberAttr(
-  attrs: Record<string, string | number>,
+  attrs: Record<string, string | number | boolean>,
   key: string,
   value: number | undefined,
 ): void {
||||
@@ -227,6 +227,23 @@ function assignPositiveNumberAttr(
|
||||
}
|
||||
}
|
||||
|
||||
function assignModelCallSizeTimingAttrs(
|
||||
attrs: Record<string, string | number | boolean>,
|
||||
evt: {
|
||||
requestPayloadBytes?: number;
|
||||
responseStreamBytes?: number;
|
||||
timeToFirstByteMs?: number;
|
||||
},
|
||||
): void {
|
||||
assignPositiveNumberAttr(attrs, "openclaw.model_call.request_bytes", evt.requestPayloadBytes);
|
||||
assignPositiveNumberAttr(attrs, "openclaw.model_call.response_bytes", evt.responseStreamBytes);
|
||||
assignPositiveNumberAttr(
|
||||
attrs,
|
||||
"openclaw.model_call.time_to_first_byte_ms",
|
||||
evt.timeToFirstByteMs,
|
||||
);
|
||||
}
|
||||
|
||||
function assignGenAiSpanIdentityAttrs(
|
||||
attrs: Record<string, string | number | boolean>,
|
||||
input: { api?: string; model?: string; provider?: string },
|
||||
@@ -244,7 +261,7 @@ function assignGenAiSpanIdentityAttrs(
|
||||
|
||||
function assignGenAiModelCallAttrs(
|
||||
attrs: Record<string, string | number | boolean>,
|
||||
evt: ModelCallLifecycleDiagnosticEvent,
|
||||
evt: { api?: string; model?: string; provider?: string },
|
||||
): void {
|
||||
assignGenAiSpanIdentityAttrs(attrs, evt);
|
||||
}
|
@@ -467,19 +484,6 @@ function contextForTraceContext(traceContext: DiagnosticTraceContext | undefined
   });
 }
 
-function contextForDiagnosticSpanParent(traceContext: DiagnosticTraceContext | undefined) {
-  const normalized = normalizeTraceContext(traceContext);
-  if (!normalized?.parentSpanId) {
-    return undefined;
-  }
-  return trace.setSpanContext(otelContextApi.active(), {
-    traceId: normalized.traceId,
-    spanId: normalized.parentSpanId,
-    traceFlags: traceFlagsToOtel(normalized.traceFlags),
-    isRemote: true,
-  });
-}
-
 function contextForTrustedTraceContext(
   evt: DiagnosticEventPayload,
   metadata: DiagnosticEventMetadata,
@@ -487,13 +491,6 @@ function contextForTrustedTraceContext(
   return metadata.trusted ? contextForTraceContext(evt.trace) : undefined;
 }
 
-function contextForTrustedDiagnosticSpanParent(
-  evt: DiagnosticEventPayload,
-  metadata: DiagnosticEventMetadata,
-) {
-  return metadata.trusted ? contextForDiagnosticSpanParent(evt.trace) : undefined;
-}
-
 function addTraceAttributes(
   attributes: Record<string, string | number | boolean>,
   traceContext: DiagnosticTraceContext | undefined,
||||
@@ -518,17 +515,21 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
let sdk: NodeSDK | null = null;
|
||||
let logProvider: LoggerProvider | null = null;
|
||||
let unsubscribe: (() => void) | null = null;
|
||||
let stopActiveTrustedSpans: (() => void) | null = null;
|
||||
|
||||
const stopStarted = async () => {
|
||||
const currentUnsubscribe = unsubscribe;
|
||||
const currentLogProvider = logProvider;
|
||||
const currentSdk = sdk;
|
||||
const currentStopActiveTrustedSpans = stopActiveTrustedSpans;
|
||||
|
||||
unsubscribe = null;
|
||||
logProvider = null;
|
||||
sdk = null;
|
||||
stopActiveTrustedSpans = null;
|
||||
|
||||
currentUnsubscribe?.();
|
||||
currentStopActiveTrustedSpans?.();
|
||||
if (currentLogProvider) {
|
||||
await currentLogProvider.shutdown().catch(() => undefined);
|
||||
}
|
||||
@@ -694,6 +695,24 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
|
||||
const meter = metrics.getMeter("openclaw");
|
||||
const tracer = trace.getTracer("openclaw");
|
||||
const activeTrustedSpans = new Map<string, ReturnType<typeof tracer.startSpan>>();
|
||||
const activeTrustedSpanAliases = new Map<string, ReturnType<typeof tracer.startSpan>>();
|
||||
const pendingTrustedRunFinalizers = new Map<string, ReturnType<typeof setImmediate>>();
|
||||
stopActiveTrustedSpans = () => {
|
||||
const stopAt = Date.now();
|
||||
for (const handle of pendingTrustedRunFinalizers.values()) {
|
||||
clearImmediate(handle);
|
||||
}
|
||||
pendingTrustedRunFinalizers.clear();
|
||||
for (const span of new Set([
|
||||
...activeTrustedSpans.values(),
|
||||
...activeTrustedSpanAliases.values(),
|
||||
])) {
|
||||
span.end(stopAt);
|
||||
}
|
||||
activeTrustedSpans.clear();
|
||||
activeTrustedSpanAliases.clear();
|
||||
};
|
||||
|
||||
const tokensCounter = meter.createCounter("openclaw.tokens", {
|
||||
unit: "1",
|
||||
@@ -810,6 +829,27 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
unit: "ms",
|
||||
description: "Model call duration",
|
||||
});
|
||||
const modelCallRequestBytesHistogram = meter.createHistogram(
|
||||
"openclaw.model_call.request_bytes",
|
||||
{
|
||||
unit: "By",
|
||||
description: "UTF-8 byte size of sanitized model request payloads",
|
||||
},
|
||||
);
|
||||
const modelCallResponseBytesHistogram = meter.createHistogram(
|
||||
"openclaw.model_call.response_bytes",
|
||||
{
|
||||
unit: "By",
|
||||
description: "UTF-8 byte size of streamed model response events",
|
||||
},
|
||||
);
|
||||
const modelCallTimeToFirstByteHistogram = meter.createHistogram(
|
||||
"openclaw.model_call.time_to_first_byte_ms",
|
||||
{
|
||||
unit: "ms",
|
||||
description: "Elapsed time before the first streamed model response event",
|
||||
},
|
||||
);
|
||||
const toolExecutionDurationHistogram = meter.createHistogram(
|
||||
"openclaw.tool.execution.duration_ms",
|
||||
{
|
@@ -942,11 +982,16 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   options: {
     parentContext?: ReturnType<typeof contextForTraceContext> | null;
     endTimeMs?: number;
+    startTimeMs?: number;
   } = {},
 ) => {
   const endTimeMs = options.endTimeMs ?? Date.now();
   const startTime =
-    typeof durationMs === "number" ? endTimeMs - Math.max(0, durationMs) : undefined;
+    typeof options.startTimeMs === "number"
+      ? options.startTimeMs
+      : typeof durationMs === "number" && durationMs >= 0
+        ? endTimeMs - durationMs
+        : undefined;
   const parentContext =
     "parentContext" in options ? (options.parentContext ?? undefined) : undefined;
   const span = tracer.startSpan(
||||
@@ -959,6 +1004,78 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
);
|
||||
return span;
|
||||
};
|
||||
const trustedTraceContext = (
|
||||
evt: DiagnosticEventPayload,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => (metadata.trusted ? normalizeTraceContext(evt.trace) : undefined);
|
||||
const activeTrustedParentContext = (
|
||||
evt: DiagnosticEventPayload,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
const parentSpanId = trustedTraceContext(evt, metadata)?.parentSpanId;
|
||||
if (!parentSpanId) {
|
||||
return undefined;
|
||||
}
|
||||
const activeParentSpan =
|
||||
activeTrustedSpans.get(parentSpanId) ?? activeTrustedSpanAliases.get(parentSpanId);
|
||||
if (!activeParentSpan) {
|
||||
return undefined;
|
||||
}
|
||||
return trace.setSpanContext(otelContextApi.active(), activeParentSpan.spanContext());
|
||||
};
|
||||
const trackTrustedSpan = (
|
||||
evt: DiagnosticEventPayload,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
span: ReturnType<typeof tracer.startSpan>,
|
||||
) => {
|
||||
const spanId = trustedTraceContext(evt, metadata)?.spanId;
|
||||
if (spanId) {
|
||||
activeTrustedSpans.set(spanId, span);
|
||||
}
|
||||
return span;
|
||||
};
|
||||
const takeTrackedTrustedSpan = (
|
||||
evt: DiagnosticEventPayload,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
const spanId = trustedTraceContext(evt, metadata)?.spanId;
|
||||
if (!spanId) {
|
||||
return undefined;
|
||||
}
|
||||
const span = activeTrustedSpans.get(spanId);
|
||||
if (span) {
|
||||
activeTrustedSpans.delete(spanId);
|
||||
}
|
||||
return span;
|
||||
};
|
||||
const setSpanAttrs = (
|
||||
span: ReturnType<typeof tracer.startSpan>,
|
||||
attributes: Record<string, string | number | boolean>,
|
||||
) => {
|
||||
span.setAttributes?.(redactOtelAttributes(attributes));
|
||||
};
|
||||
const scheduleTrackedRunSpanFinalize = (
|
||||
spanId: string,
|
||||
parentSpanId: string | undefined,
|
||||
span: ReturnType<typeof tracer.startSpan>,
|
||||
endTimeMs: number,
|
||||
) => {
|
||||
const existingHandle = pendingTrustedRunFinalizers.get(spanId);
|
||||
if (existingHandle) {
|
||||
clearImmediate(existingHandle);
|
||||
}
|
||||
const handle = setImmediate(() => {
|
||||
pendingTrustedRunFinalizers.delete(spanId);
|
||||
if (activeTrustedSpans.get(spanId) === span) {
|
||||
activeTrustedSpans.delete(spanId);
|
||||
}
|
||||
if (parentSpanId && activeTrustedSpanAliases.get(parentSpanId) === span) {
|
||||
activeTrustedSpanAliases.delete(parentSpanId);
|
||||
}
|
||||
span.end(endTimeMs);
|
||||
});
|
||||
pendingTrustedRunFinalizers.set(spanId, handle);
|
||||
};
|
||||
|
||||
const addRunAttrs = (
|
||||
spanAttrs: Record<string, string | number | boolean>,
|
@@ -1093,7 +1210,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
 );
 
 const span = spanWithDuration("openclaw.model.usage", spanAttrs, evt.durationMs, {
-  parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
+  parentContext: activeTrustedParentContext(evt, metadata),
   endTimeMs: evt.ts,
 });
 span.end(evt.ts);
||||
@@ -1258,6 +1375,29 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
span.end(evt.ts);
|
||||
};
|
||||
|
||||
const recordRunStarted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "run.started" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
if (!tracesEnabled || !metadata.trusted) {
|
||||
return;
|
||||
}
|
||||
const spanAttrs: Record<string, string | number | boolean> = {};
|
||||
addRunAttrs(spanAttrs, evt);
|
||||
const span = trackTrustedSpan(
|
||||
evt,
|
||||
metadata,
|
||||
spanWithDuration("openclaw.run", spanAttrs, undefined, {
|
||||
parentContext: activeTrustedParentContext(evt, metadata),
|
||||
startTimeMs: evt.ts,
|
||||
}),
|
||||
);
|
||||
const parentSpanId = trustedTraceContext(evt, metadata)?.parentSpanId;
|
||||
if (parentSpanId && !activeTrustedSpans.has(parentSpanId)) {
|
||||
activeTrustedSpanAliases.set(parentSpanId, span);
|
||||
}
|
||||
};
|
||||
|
||||
const recordLaneEnqueue = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "queue.lane.enqueue" }>,
|
||||
) => {
|
||||
@@ -1421,28 +1561,65 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
if (evt.errorCategory) {
|
||||
spanAttrs["openclaw.errorCategory"] = lowCardinalityAttr(evt.errorCategory, "other");
|
||||
}
|
||||
const span = spanWithDuration("openclaw.run", spanAttrs, evt.durationMs, {
|
||||
parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
|
||||
endTimeMs: evt.ts,
|
||||
});
|
||||
const trustedTrace = trustedTraceContext(evt, metadata);
|
||||
const trackedSpan = trustedTrace?.spanId
|
||||
? activeTrustedSpans.get(trustedTrace.spanId)
|
||||
: undefined;
|
||||
const span =
|
||||
trackedSpan ??
|
||||
spanWithDuration("openclaw.run", spanAttrs, evt.durationMs, {
|
||||
parentContext: activeTrustedParentContext(evt, metadata),
|
||||
endTimeMs: evt.ts,
|
||||
});
|
||||
setSpanAttrs(span, spanAttrs);
|
||||
if (evt.outcome === "error") {
|
||||
span.setStatus({
|
||||
code: SpanStatusCode.ERROR,
|
||||
...(evt.errorCategory ? { message: redactSensitiveText(evt.errorCategory) } : {}),
|
||||
});
|
||||
}
|
||||
if (trackedSpan && trustedTrace?.spanId) {
|
||||
scheduleTrackedRunSpanFinalize(
|
||||
trustedTrace.spanId,
|
||||
trustedTrace.parentSpanId,
|
||||
trackedSpan,
|
||||
evt.ts,
|
||||
);
|
||||
return;
|
||||
}
|
||||
span.end(evt.ts);
|
||||
};
|
||||
|
||||
const harnessRunMetricAttrs = (evt: HarnessRunLifecycleDiagnosticEvent) => ({
|
||||
const harnessRunMetricAttrs = (evt: HarnessRunDiagnosticEvent) => ({
|
||||
"openclaw.harness.id": lowCardinalityAttr(evt.harnessId, "unknown"),
|
||||
"openclaw.harness.plugin": lowCardinalityAttr(evt.pluginId),
|
||||
"openclaw.outcome": evt.type === "harness.run.error" ? "error" : evt.outcome,
|
||||
...(evt.type === "harness.run.started"
|
||||
? {}
|
||||
: {
|
||||
"openclaw.outcome": evt.type === "harness.run.error" ? "error" : evt.outcome,
|
||||
}),
|
||||
"openclaw.provider": lowCardinalityAttr(evt.provider, "unknown"),
|
||||
"openclaw.model": lowCardinalityAttr(evt.model, "unknown"),
|
||||
...(evt.channel ? { "openclaw.channel": lowCardinalityAttr(evt.channel) } : {}),
|
||||
});
|
||||
|
||||
const recordHarnessRunStarted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "harness.run.started" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
if (!tracesEnabled || !metadata.trusted) {
|
||||
return;
|
||||
}
|
||||
trackTrustedSpan(
|
||||
evt,
|
||||
metadata,
|
||||
spanWithDuration("openclaw.harness.run", harnessRunMetricAttrs(evt), undefined, {
|
||||
parentContext: activeTrustedParentContext(evt, metadata),
|
||||
startTimeMs: evt.ts,
|
||||
}),
|
||||
);
|
||||
};
|
||||
|
||||
const recordHarnessRunCompleted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "harness.run.completed" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
@@ -1467,10 +1644,13 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   spanAttrs["openclaw.harness.items.completed"] = evt.itemLifecycle.completedCount;
   spanAttrs["openclaw.harness.items.active"] = evt.itemLifecycle.activeCount;
 }
-const span = spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
-  parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-  endTimeMs: evt.ts,
-});
+const span =
+  takeTrackedTrustedSpan(evt, metadata) ??
+  spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
+    parentContext: activeTrustedParentContext(evt, metadata),
+    endTimeMs: evt.ts,
+  });
+setSpanAttrs(span, spanAttrs);
 if (evt.outcome === "error") {
   span.setStatus({
     code: SpanStatusCode.ERROR,
@@ -1499,10 +1679,13 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   "error.type": errorType,
   ...(evt.cleanupFailed ? { "openclaw.harness.cleanup_failed": true } : {}),
 };
-const span = spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
-  parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-  endTimeMs: evt.ts,
-});
+const span =
+  takeTrackedTrustedSpan(evt, metadata) ??
+  spanWithDuration("openclaw.harness.run", spanAttrs, evt.durationMs, {
+    parentContext: activeTrustedParentContext(evt, metadata),
+    endTimeMs: evt.ts,
+  });
+setSpanAttrs(span, spanAttrs);
 span.setStatus({
   code: SpanStatusCode.ERROR,
   message: errorType,
@@ -1534,7 +1717,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   spanAttrs["openclaw.context.reserve_tokens"] = evt.reserveTokens;
 }
 const span = spanWithDuration("openclaw.context.assembled", spanAttrs, 0, {
-  parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
+  parentContext: activeTrustedParentContext(evt, metadata),
   endTimeMs: evt.ts,
 });
 span.end(evt.ts);
||||
@@ -1555,12 +1738,59 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
"gen_ai.request.model": lowCardinalityAttr(evt.model),
|
||||
...(errorType ? { "error.type": errorType } : {}),
|
||||
});
|
||||
const recordModelCallSizeTimingMetrics = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "model.call.completed" | "model.call.error" }>,
|
||||
attrs: ReturnType<typeof modelCallMetricAttrs>,
|
||||
) => {
|
||||
const requestPayloadBytes = positiveFiniteNumber(evt.requestPayloadBytes);
|
||||
if (requestPayloadBytes !== undefined) {
|
||||
modelCallRequestBytesHistogram.record(requestPayloadBytes, attrs);
|
||||
}
|
||||
const responseStreamBytes = positiveFiniteNumber(evt.responseStreamBytes);
|
||||
if (responseStreamBytes !== undefined) {
|
||||
modelCallResponseBytesHistogram.record(responseStreamBytes, attrs);
|
||||
}
|
||||
const timeToFirstByteMs = positiveFiniteNumber(evt.timeToFirstByteMs);
|
||||
if (timeToFirstByteMs !== undefined) {
|
||||
modelCallTimeToFirstByteHistogram.record(timeToFirstByteMs, attrs);
|
||||
}
|
||||
};
|
||||
|
||||
const recordModelCallStarted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "model.call.started" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
if (!tracesEnabled || !metadata.trusted) {
|
||||
return;
|
||||
}
|
||||
const spanAttrs: Record<string, string | number | boolean> = {
|
||||
"openclaw.provider": evt.provider,
|
||||
"openclaw.model": evt.model,
|
||||
};
|
||||
assignGenAiModelCallAttrs(spanAttrs, evt);
|
||||
if (evt.api) {
|
||||
spanAttrs["openclaw.api"] = evt.api;
|
||||
}
|
||||
if (evt.transport) {
|
||||
spanAttrs["openclaw.transport"] = evt.transport;
|
||||
}
|
||||
trackTrustedSpan(
|
||||
evt,
|
||||
metadata,
|
||||
spanWithDuration("openclaw.model.call", spanAttrs, undefined, {
|
||||
parentContext: activeTrustedParentContext(evt, metadata),
|
||||
startTimeMs: evt.ts,
|
||||
}),
|
||||
);
|
||||
};
|
||||
|
||||
const recordModelCallCompleted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "model.call.completed" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
modelCallDurationHistogram.record(evt.durationMs, modelCallMetricAttrs(evt));
|
||||
const metricAttrs = modelCallMetricAttrs(evt);
|
||||
modelCallDurationHistogram.record(evt.durationMs, metricAttrs);
|
||||
recordModelCallSizeTimingMetrics(evt, metricAttrs);
|
||||
genAiOperationDurationHistogram.record(
|
||||
evt.durationMs / 1000,
|
||||
genAiModelCallMetricAttrs(evt),
|
@@ -1579,15 +1809,19 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   if (evt.transport) {
     spanAttrs["openclaw.transport"] = evt.transport;
   }
+  assignModelCallSizeTimingAttrs(spanAttrs, evt);
   assignOtelModelContentAttributes(
     spanAttrs,
     evt as unknown as Record<string, unknown>,
     contentCapturePolicy,
   );
-  const span = spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
-    parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-    endTimeMs: evt.ts,
-  });
+  const span =
+    takeTrackedTrustedSpan(evt, metadata) ??
+    spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
+      parentContext: activeTrustedParentContext(evt, metadata),
+      endTimeMs: evt.ts,
+    });
+  setSpanAttrs(span, spanAttrs);
   addUpstreamRequestIdSpanEvent(span, evt.upstreamRequestIdHash);
   span.end(evt.ts);
 };
@@ -1597,10 +1831,12 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   metadata: DiagnosticEventMetadata,
 ) => {
   const errorType = lowCardinalityAttr(evt.errorCategory, "other");
-  modelCallDurationHistogram.record(evt.durationMs, {
+  const metricAttrs = {
     ...modelCallMetricAttrs(evt),
     "openclaw.errorCategory": errorType,
-  });
+  };
+  modelCallDurationHistogram.record(evt.durationMs, metricAttrs);
+  recordModelCallSizeTimingMetrics(evt, metricAttrs);
   genAiOperationDurationHistogram.record(
     evt.durationMs / 1000,
     genAiModelCallMetricAttrs(evt, errorType),
@@ -1621,15 +1857,19 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
   if (evt.transport) {
     spanAttrs["openclaw.transport"] = evt.transport;
   }
+  assignModelCallSizeTimingAttrs(spanAttrs, evt);
   assignOtelModelContentAttributes(
     spanAttrs,
     evt as unknown as Record<string, unknown>,
     contentCapturePolicy,
   );
-  const span = spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
-    parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-    endTimeMs: evt.ts,
-  });
+  const span =
+    takeTrackedTrustedSpan(evt, metadata) ??
+    spanWithDuration("openclaw.model.call", spanAttrs, evt.durationMs, {
+      parentContext: activeTrustedParentContext(evt, metadata),
+      endTimeMs: evt.ts,
+    });
+  setSpanAttrs(span, spanAttrs);
   addUpstreamRequestIdSpanEvent(span, evt.upstreamRequestIdHash);
   span.setStatus({
     code: SpanStatusCode.ERROR,
||||
@@ -1638,6 +1878,36 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
span.end(evt.ts);
|
||||
};
|
||||
|
||||
const toolExecutionBaseAttrs = (
|
||||
evt: Extract<
|
||||
DiagnosticEventPayload,
|
||||
{
|
||||
type: "tool.execution.started" | "tool.execution.completed" | "tool.execution.error";
|
||||
}
|
||||
>,
|
||||
): Record<string, string | number | boolean> => ({
|
||||
"openclaw.toolName": evt.toolName,
|
||||
"gen_ai.tool.name": evt.toolName,
|
||||
...paramsSummaryAttrs(evt.paramsSummary),
|
||||
});
|
||||
|
||||
const recordToolExecutionStarted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "tool.execution.started" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
||||
) => {
|
||||
if (!tracesEnabled || !metadata.trusted) {
|
||||
return;
|
||||
}
|
||||
trackTrustedSpan(
|
||||
evt,
|
||||
metadata,
|
||||
spanWithDuration("openclaw.tool.execution", toolExecutionBaseAttrs(evt), undefined, {
|
||||
parentContext: activeTrustedParentContext(evt, metadata),
|
||||
startTimeMs: evt.ts,
|
||||
}),
|
||||
);
|
||||
};
|
||||
|
||||
const recordToolExecutionCompleted = (
|
||||
evt: Extract<DiagnosticEventPayload, { type: "tool.execution.completed" }>,
|
||||
metadata: DiagnosticEventMetadata,
|
@@ -1651,9 +1921,7 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
     return;
   }
   const spanAttrs: Record<string, string | number | boolean> = {
-    "openclaw.toolName": evt.toolName,
-    "gen_ai.tool.name": evt.toolName,
-    ...paramsSummaryAttrs(evt.paramsSummary),
+    ...toolExecutionBaseAttrs(evt),
   };
   addRunAttrs(spanAttrs, evt);
   assignOtelToolContentAttributes(
@@ -1661,10 +1929,13 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
     evt as unknown as Record<string, unknown>,
     contentCapturePolicy,
   );
-  const span = spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
-    parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-    endTimeMs: evt.ts,
-  });
+  const span =
+    takeTrackedTrustedSpan(evt, metadata) ??
+    spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
+      parentContext: activeTrustedParentContext(evt, metadata),
+      endTimeMs: evt.ts,
+    });
+  setSpanAttrs(span, spanAttrs);
   span.end(evt.ts);
 };
 
@@ -1682,10 +1953,8 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
     return;
   }
   const spanAttrs: Record<string, string | number | boolean> = {
-    "openclaw.toolName": evt.toolName,
+    ...toolExecutionBaseAttrs(evt),
     "openclaw.errorCategory": lowCardinalityAttr(evt.errorCategory, "other"),
-    "gen_ai.tool.name": evt.toolName,
-    ...paramsSummaryAttrs(evt.paramsSummary),
   };
   addRunAttrs(spanAttrs, evt);
   if (evt.errorCode) {
@@ -1696,10 +1965,13 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
     evt as unknown as Record<string, unknown>,
     contentCapturePolicy,
   );
-  const span = spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
-    parentContext: contextForTrustedDiagnosticSpanParent(evt, metadata),
-    endTimeMs: evt.ts,
-  });
+  const span =
+    takeTrackedTrustedSpan(evt, metadata) ??
+    spanWithDuration("openclaw.tool.execution", spanAttrs, evt.durationMs, {
+      parentContext: activeTrustedParentContext(evt, metadata),
+      endTimeMs: evt.ts,
+    });
+  setSpanAttrs(span, spanAttrs);
   span.setStatus({
     code: SpanStatusCode.ERROR,
     message: redactSensitiveText(evt.errorCategory),
||||
@@ -1827,9 +2099,15 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
       case "diagnostic.heartbeat":
         recordHeartbeat(evt);
         return;
+      case "run.started":
+        recordRunStarted(evt, metadata);
+        return;
       case "run.completed":
         recordRunCompleted(evt, metadata);
         return;
+      case "harness.run.started":
+        recordHarnessRunStarted(evt, metadata);
+        return;
       case "harness.run.completed":
         recordHarnessRunCompleted(evt, metadata);
         return;
@@ -1839,12 +2117,18 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
       case "context.assembled":
         recordContextAssembled(evt, metadata);
         return;
+      case "model.call.started":
+        recordModelCallStarted(evt, metadata);
+        return;
       case "model.call.completed":
         recordModelCallCompleted(evt, metadata);
         return;
       case "model.call.error":
         recordModelCallError(evt, metadata);
         return;
+      case "tool.execution.started":
+        recordToolExecutionStarted(evt, metadata);
+        return;
       case "tool.execution.completed":
         recordToolExecutionCompleted(evt, metadata);
         return;
@@ -1869,10 +2153,6 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
       case "telemetry.exporter":
         recordTelemetryExporter(evt, metadata);
         return;
-      case "tool.execution.started":
-      case "run.started":
-      case "harness.run.started":
-      case "model.call.started":
       case "payload.large":
         return;
     }
@@ -166,6 +166,9 @@ function renderElement(
     }
     case "emotion":
       return renderEmotionElement(element);
+    case "md":
+    case "lark_md":
+      return toStringOrEmpty(element.text) || toStringOrEmpty(element.content);
     case "br":
       return "\n";
     case "hr":
@@ -168,6 +168,95 @@ describe("getMessageFeishu", () => {
     );
   });
 
+  it("falls through empty interactive card element arrays and locale variants", async () => {
+    mockClientGet.mockResolvedValueOnce({
+      code: 0,
+      data: {
+        items: [
+          {
+            message_id: "om_i18n_card",
+            chat_id: "oc_i18n_card",
+            msg_type: "interactive",
+            body: {
+              content: JSON.stringify({
+                elements: [],
+                body: { elements: [] },
+                i18n_elements: {
+                  zh_cn: [],
+                  en_us: [
+                    {
+                      tag: "markdown",
+                      content: "hello ${count} {{label}} {{metadata}}",
+                    },
+                  ],
+                },
+                template_variable: {
+                  count: 2,
+                  label: "tasks",
+                  metadata: { ignored: true },
+                },
+              }),
+            },
+          },
+        ],
+      },
+    });
+
+    const result = await getMessageFeishu({
+      cfg: {} as ClawdbotConfig,
+      messageId: "om_i18n_card",
+    });
+
+    expect(result).toEqual(
+      expect.objectContaining({
+        messageId: "om_i18n_card",
+        chatId: "oc_i18n_card",
+        contentType: "interactive",
+        content: "hello 2 tasks {{metadata}}",
+      }),
+    );
+  });
+
+  it("falls back to post-format content when interactive card elements are empty", async () => {
+    mockClientGet.mockResolvedValueOnce({
+      code: 0,
+      data: {
+        items: [
+          {
+            message_id: "om_post_card",
+            chat_id: "oc_post_card",
+            msg_type: "interactive",
+            body: {
+              content: JSON.stringify({
+                elements: [],
+                post: {
+                  zh_cn: {
+                    title: "Card summary",
+                    content: [[{ tag: "md", text: "**fallback** body" }]],
+                  },
+                },
+              }),
+            },
+          },
+        ],
+      },
+    });
+
+    const result = await getMessageFeishu({
+      cfg: {} as ClawdbotConfig,
+      messageId: "om_post_card",
+    });
+
+    expect(result).toEqual(
+      expect.objectContaining({
+        messageId: "om_post_card",
+        chatId: "oc_post_card",
+        contentType: "interactive",
+        content: "Card summary\n\n**fallback** body",
+      }),
+    );
+  });
+
   it("extracts text content from post messages", async () => {
     mockClientGet.mockResolvedValueOnce({
       code: 0,
@@ -15,6 +15,8 @@ import { resolveFeishuSendTarget } from "./send-target.js";
 import type { FeishuChatType, FeishuMessageInfo, FeishuSendResult } from "./types.js";
 
 const WITHDRAWN_REPLY_ERROR_CODES = new Set([230011, 231003]);
+const INTERACTIVE_CARD_FALLBACK_TEXT = "[Interactive Card]";
+const POST_FALLBACK_TEXT = "[Rich text message]";
 const FEISHU_CARD_TEMPLATES = new Set([
   "blue",
   "green",
@@ -60,6 +62,10 @@ function isWithdrawnReplyError(err: unknown): boolean {
   return false;
 }
 
+function isRecord(value: unknown): value is Record<string, unknown> {
+  return Boolean(value && typeof value === "object" && !Array.isArray(value));
+}
+
 type FeishuCreateMessageClient = {
   im: {
     message: {
@@ -179,41 +185,121 @@ async function sendReplyOrFallbackDirect(
   return toFeishuSendResult(response, params.directParams.receiveId);
 }
 
-function parseInteractiveCardContent(parsed: unknown): string {
-  if (!parsed || typeof parsed !== "object") {
-    return "[Interactive Card]";
+function normalizeCardTemplateVariable(value: unknown): string | undefined {
+  if (typeof value === "string") {
+    return value;
   }
-
-  // Support both schema 1.0 (top-level `elements`) and 2.0 (`body.elements`).
-  const candidate = parsed as { elements?: unknown; body?: { elements?: unknown } };
-  const elements = Array.isArray(candidate.elements)
-    ? candidate.elements
-    : Array.isArray(candidate.body?.elements)
-      ? candidate.body.elements
-      : null;
-  if (!elements) {
-    return "[Interactive Card]";
+  if (typeof value === "number" || typeof value === "boolean" || typeof value === "bigint") {
+    return String(value);
   }
+  return undefined;
+}
 
+function readCardTemplateVariables(parsed: Record<string, unknown>): Map<string, string> {
+  const variables = new Map<string, string>();
+  for (const source of [parsed.template_variable, parsed.template_variables]) {
+    if (!isRecord(source)) {
+      continue;
+    }
+    for (const [key, value] of Object.entries(source)) {
+      const normalized = normalizeCardTemplateVariable(value);
+      if (normalized !== undefined) {
+        variables.set(key, normalized);
+      }
+    }
+  }
+  return variables;
+}
+
+function applyCardTemplateVariables(text: string, variables: Map<string, string>): string {
+  if (variables.size === 0) {
+    return text;
+  }
+  return text.replace(/\$\{([A-Za-z0-9_.-]+)\}|\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}/g, (match, a, b) => {
+    const variableName = typeof a === "string" ? a : b;
+    return variables.get(variableName) ?? match;
+  });
+}
+
+function extractInteractiveElementText(
+  element: unknown,
+  variables: Map<string, string>,
+): string | undefined {
+  if (!isRecord(element)) {
+    return undefined;
+  }
+  const tag = typeof element.tag === "string" ? element.tag : "";
+  const text = isRecord(element.text) ? element.text : undefined;
+
+  if (tag === "div" && typeof text?.content === "string") {
+    return applyCardTemplateVariables(text.content, variables);
+  }
+  if ((tag === "markdown" || tag === "lark_md") && typeof element.content === "string") {
+    return applyCardTemplateVariables(element.content, variables);
+  }
+  if (tag === "plain_text" && typeof element.content === "string") {
+    return applyCardTemplateVariables(element.content, variables);
+  }
+  return undefined;
+}
+
+function extractInteractiveElementsText(
+  elements: unknown[],
+  variables: Map<string, string>,
+): string {
   const texts: string[] = [];
   for (const element of elements) {
-    if (!element || typeof element !== "object") {
-      continue;
-    }
-    const item = element as {
-      tag?: string;
-      content?: string;
-      text?: { content?: string };
-    };
-    if (item.tag === "div" && typeof item.text?.content === "string") {
-      texts.push(item.text.content);
-      continue;
-    }
-    if (item.tag === "markdown" && typeof item.content === "string") {
-      texts.push(item.content);
+    const text = extractInteractiveElementText(element, variables);
+    if (text !== undefined) {
+      texts.push(text);
     }
   }
-  return texts.join("\n").trim() || "[Interactive Card]";
+  return texts.join("\n").trim();
+}
+
+function readInteractiveElementArrays(parsed: Record<string, unknown>): unknown[][] {
+  const body = isRecord(parsed.body) ? parsed.body : undefined;
+  const elementArrays: unknown[][] = [];
+
+  for (const candidate of [parsed.elements, body?.elements]) {
+    if (Array.isArray(candidate)) {
+      elementArrays.push(candidate);
+    }
+  }
+
+  for (const candidate of [parsed.i18n_elements, body?.i18n_elements]) {
+    if (!isRecord(candidate)) {
+      continue;
+    }
+    for (const localeElements of Object.values(candidate)) {
+      if (Array.isArray(localeElements)) {
+        elementArrays.push(localeElements);
+      }
+    }
+  }
+
+  return elementArrays;
+}
+
+function parseInteractivePostFallback(parsed: unknown): string | undefined {
+  const textContent = parsePostContent(JSON.stringify(parsed)).textContent.trim();
+  return textContent && textContent !== POST_FALLBACK_TEXT ? textContent : undefined;
+}
+
+function parseInteractiveCardContent(parsed: unknown): string {
+  if (!isRecord(parsed)) {
+    return INTERACTIVE_CARD_FALLBACK_TEXT;
+  }
+
+  const variables = readCardTemplateVariables(parsed);
+  for (const elements of readInteractiveElementArrays(parsed)) {
+    const text = extractInteractiveElementsText(elements, variables);
+    if (text) {
+      return text;
+    }
+  }
+
+  return parseInteractivePostFallback(parsed) ?? INTERACTIVE_CARD_FALLBACK_TEXT;
 }
 
 function parseFeishuMessageContent(rawContent: string, msgType: string): string {
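The placeholder substitution added in the hunk above can be exercised in isolation. The following is a minimal sketch that copies the regular expression from `applyCardTemplateVariables` and feeds it the same card text as the `om_i18n_card` test; the helper name `substitute` and the inline `Map` are illustrative assumptions, not part of the change.

```typescript
// Standalone sketch of the placeholder substitution (hypothetical helper name).
function substitute(text: string, variables: Map<string, string>): string {
  if (variables.size === 0) {
    return text;
  }
  // Accepts both `${name}` and `{{ name }}` placeholders.
  // Placeholders with no matching variable are left untouched.
  return text.replace(
    /\$\{([A-Za-z0-9_.-]+)\}|\{\{\s*([A-Za-z0-9_.-]+)\s*\}\}/g,
    (match: string, dollarName?: string, braceName?: string) => {
      const name = dollarName ?? braceName ?? "";
      return variables.get(name) ?? match;
    },
  );
}

// Only scalar values survive normalization in the diff, so the object-valued
// `metadata` variable from the test is simply absent from this map.
const cardVariables = new Map<string, string>([
  ["count", "2"],
  ["label", "tasks"],
]);
const rendered = substitute("hello ${count} {{label}} {{metadata}}", cardVariables);
console.log(rendered); // hello 2 tasks {{metadata}}
```

Leaving unknown placeholders in place matches the expectation in the test above, where `{{metadata}}` stays literal in the extracted content.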
@@ -5,11 +5,7 @@ import path from "node:path";
 import type { DatabaseSync } from "node:sqlite";
 import chokidar, { FSWatcher } from "chokidar";
 import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
-import {
-  buildCaseInsensitiveExtensionGlob,
-  classifyMemoryMultimodalPath,
-  getMemoryMultimodalExtensions,
-} from "openclaw/plugin-sdk/memory-core-host-engine-embeddings";
+import { classifyMemoryMultimodalPath } from "openclaw/plugin-sdk/memory-core-host-engine-embeddings";
 import {
   createSubsystemLogger,
   onSessionTranscriptUpdate,
@@ -105,6 +101,9 @@ function shouldIgnoreMemoryWatchPath(
   if (stats?.isDirectory?.()) {
     return false;
   }
+  if (!stats) {
+    return false;
+  }
   const extension = normalizeLowercaseStringOrEmpty(path.extname(normalized));
   if (extension.length === 0 || extension === ".md") {
     return false;
@@ -383,16 +382,7 @@ export abstract class MemoryManagerSyncOps {
           continue;
         }
         if (stat.isDirectory()) {
-          watchPaths.add(path.join(entry, "**", "*.md"));
-          if (this.settings.multimodal.enabled) {
-            for (const modality of this.settings.multimodal.modalities) {
-              for (const extension of getMemoryMultimodalExtensions(modality)) {
-                watchPaths.add(
-                  path.join(entry, "**", buildCaseInsensitiveExtensionGlob(extension)),
-                );
-              }
-            }
-          }
+          watchPaths.add(entry);
           continue;
         }
         if (
@@ -422,6 +412,7 @@ export abstract class MemoryManagerSyncOps {
     this.watcher.on("add", markDirty);
     this.watcher.on("change", markDirty);
     this.watcher.on("unlink", markDirty);
+    this.watcher.on("unlinkDir", markDirty);
   }
 
   protected ensureSessionListener() {
@@ -11,12 +11,35 @@ import { registerBuiltInMemoryEmbeddingProviders } from "./provider-adapters.js"
 
 type WatchIgnoredFn = (watchPath: string, stats?: { isDirectory?: () => boolean }) => boolean;
 
-const { watchMock } = vi.hoisted(() => ({
-  watchMock: vi.fn(() => ({
-    on: vi.fn(),
-    close: vi.fn(async () => undefined),
-  })),
-}));
+const { createdWatchers, watchMock } = vi.hoisted(() => {
+  type WatchEvent = "add" | "change" | "unlink" | "unlinkDir";
+  type WatchCallback = () => void;
+  function createMockWatcher() {
+    const handlers = new Map<WatchEvent, WatchCallback[]>();
+    const watcher = {
+      on: vi.fn((event: WatchEvent, callback: WatchCallback) => {
+        handlers.set(event, [...(handlers.get(event) ?? []), callback]);
+        return watcher;
+      }),
+      close: vi.fn(async () => undefined),
+      emit: (event: WatchEvent) => {
+        for (const callback of handlers.get(event) ?? []) {
+          callback();
+        }
+      },
+    };
+    return watcher;
+  }
+  const watchers: Array<ReturnType<typeof createMockWatcher>> = [];
+  return {
+    createdWatchers: watchers,
+    watchMock: vi.fn(() => {
+      const watcher = createMockWatcher();
+      watchers.push(watcher);
+      return watcher;
+    }),
+  };
+});
 
 vi.mock("chokidar", () => ({
   default: { watch: watchMock },
@@ -69,7 +92,9 @@ describe("memory watcher config", () => {
   });
 
   afterEach(async () => {
+    vi.useRealTimers();
     watchMock.mockClear();
+    createdWatchers.length = 0;
     if (manager) {
       await manager.close();
       manager = null;
@@ -140,9 +165,10 @@ describe("memory watcher config", () => {
       expect.arrayContaining([
         path.join(workspaceDir, "MEMORY.md"),
         path.join(workspaceDir, "memory"),
-        path.join(extraDir, "**", "*.md"),
+        extraDir,
       ]),
     );
+    expect(watchedPaths.every((watchPath) => !watchPath.includes("*"))).toBe(true);
     expect(options.ignoreInitial).toBe(true);
     expect(options.awaitWriteFinish).toEqual({ stabilityThreshold: 25, pollInterval: 100 });
 
@@ -152,15 +178,19 @@ describe("memory watcher config", () => {
       true,
     );
     expect(ignored?.(path.join(workspaceDir, "memory", ".venv", "lib", "python.md"))).toBe(true);
-    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.tmp"))).toBe(true);
-    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.json"))).toBe(true);
+    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.tmp"), {})).toBe(true);
+    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.json"), {})).toBe(true);
+    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.json"), undefined)).toBe(
+      false,
+    );
     expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.md"))).toBe(false);
+    expect(ignored?.(path.join(workspaceDir, "memory", "project", "notes.md"), {})).toBe(false);
     expect(
       ignored?.(path.join(workspaceDir, "memory", "project"), { isDirectory: () => true }),
     ).toBe(false);
   });
 
-  it("watches multimodal extensions with case-insensitive globs", async () => {
+  it("watches multimodal extra directories with filtered extensions", async () => {
     await setupWatcherWorkspace({ name: "PHOTO.PNG", contents: "png" });
     const cfg = createWatcherConfig({
       provider: "gemini",
@@ -177,16 +207,40 @@ describe("memory watcher config", () => {
       Record<string, unknown>,
     ];
     expect(watchedPaths).toEqual(
-      expect.arrayContaining([
-        path.join(extraDir, "**", "*.[pP][nN][gG]"),
-        path.join(extraDir, "**", "*.[wW][aA][vV]"),
-      ]),
+      expect.arrayContaining([path.join(workspaceDir, "MEMORY.md"), path.join(extraDir)]),
     );
+    expect(watchedPaths.every((watchPath) => !watchPath.includes("*"))).toBe(true);
 
     const ignored = options.ignored as WatchIgnoredFn | undefined;
     expect(ignored).toBeTypeOf("function");
     expect(ignored?.(path.join(extraDir, "nested", "PHOTO.PNG"))).toBe(false);
+    expect(ignored?.(path.join(extraDir, "nested", "PHOTO.PNG"), {})).toBe(false);
     expect(ignored?.(path.join(extraDir, "nested", "voice.WAV"))).toBe(false);
-    expect(ignored?.(path.join(extraDir, "nested", "metadata.json"))).toBe(true);
+    expect(ignored?.(path.join(extraDir, "nested", "voice.WAV"), {})).toBe(false);
+    expect(ignored?.(path.join(extraDir, "nested", "metadata.json"), {})).toBe(true);
   });
+
+  it.each(["add", "change", "unlink", "unlinkDir"] as const)(
+    "schedules watch sync on %s",
+    async (event) => {
+      await setupWatcherWorkspace({ name: "notes.md", contents: "hello" });
+      const cfg = createWatcherConfig();
+
+      await expectWatcherManager(cfg);
+      vi.useFakeTimers();
+      const syncSpy = vi
+        .spyOn(
+          manager as unknown as {
+            sync: (params?: { reason?: string }) => Promise<void>;
+          },
+          "sync",
+        )
+        .mockResolvedValue(undefined);
+
+      createdWatchers[0]?.emit(event);
+      await vi.advanceTimersByTimeAsync(25);
+
+      expect(syncSpy).toHaveBeenCalledWith({ reason: "watch" });
+    },
+  );
 });
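The hoisted mock in this test file records handlers per event and exposes an `emit` helper so a test can trigger `add`, `change`, `unlink`, or `unlinkDir` by hand. A minimal sketch of the same idea without Vitest follows; `FakeWatcher`, `createFakeWatcher`, and `dirtyCount` are hypothetical names used only for illustration.

```typescript
type WatchEvent = "add" | "change" | "unlink" | "unlinkDir";

// Hypothetical shape for illustration; the real mock wraps these in vi.fn().
interface FakeWatcher {
  on(event: WatchEvent, callback: () => void): FakeWatcher;
  emit(event: WatchEvent): void;
}

function createFakeWatcher(): FakeWatcher {
  const handlers = new Map<WatchEvent, Array<() => void>>();
  const watcher: FakeWatcher = {
    on(event, callback) {
      // Append rather than overwrite so several listeners per event are kept.
      handlers.set(event, [...(handlers.get(event) ?? []), callback]);
      return watcher; // chainable, like the mocked `on`
    },
    emit(event) {
      for (const callback of handlers.get(event) ?? []) {
        callback();
      }
    },
  };
  return watcher;
}

// Register one "mark dirty" style callback on every event, then fire two of them.
let dirtyCount = 0;
const fakeWatcher = createFakeWatcher();
for (const event of ["add", "change", "unlink", "unlinkDir"] as const) {
  fakeWatcher.on(event, () => {
    dirtyCount += 1;
  });
}
fakeWatcher.emit("unlinkDir");
fakeWatcher.emit("add");
console.log(dirtyCount); // 2
```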
@@ -69,7 +69,9 @@ function registerProviderWithPluginConfig(pluginConfig: Record<string, unknown>)
   return registerProviderMock.mock.calls[0]?.[0];
 }
 
-function captureWrappedOllamaPayload(thinkingLevel: "off" | "low" | undefined) {
+function captureWrappedOllamaPayload(
+  thinkingLevel: "off" | "minimal" | "low" | "medium" | "high" | "max" | undefined,
+) {
   const provider = registerProvider();
   let payloadSeen: Record<string, unknown> | undefined;
   const baseStreamFn = vi.fn((_model, _context, options) => {
@@ -528,10 +530,43 @@ describe("ollama plugin", () => {
     expect((payloadSeen?.options as Record<string, unknown> | undefined)?.think).toBeUndefined();
   });
 
-  it("wraps native Ollama payloads with top-level think=true when thinking is enabled", () => {
+  it("keeps native Ollama thinking off by default while exposing opt-in effort levels", () => {
+    const provider = registerProvider();
+
+    expect(
+      provider.resolveThinkingProfile?.({
+        provider: "ollama",
+        modelId: "llama3.2:latest",
+        reasoning: false,
+      }),
+    ).toEqual({
+      levels: [{ id: "off" }],
+      defaultLevel: "off",
+    });
+
+    expect(
+      provider.resolveThinkingProfile?.({
+        provider: "ollama",
+        modelId: "gemma4:31b",
+        reasoning: true,
+      }),
+    ).toEqual({
+      levels: [{ id: "off" }, { id: "low" }, { id: "medium" }, { id: "high" }, { id: "max" }],
+      defaultLevel: "off",
+    });
+  });
+
+  it("wraps native Ollama payloads with top-level think effort when thinking is enabled", () => {
     const { baseStreamFn, payloadSeen } = captureWrappedOllamaPayload("low");
     expect(baseStreamFn).toHaveBeenCalledTimes(1);
-    expect(payloadSeen?.think).toBe(true);
+    expect(payloadSeen?.think).toBe("low");
     expect((payloadSeen?.options as Record<string, unknown> | undefined)?.think).toBeUndefined();
   });
 
+  it("maps native Ollama max thinking to the highest supported wire effort", () => {
+    const { baseStreamFn, payloadSeen } = captureWrappedOllamaPayload("max");
+    expect(baseStreamFn).toHaveBeenCalledTimes(1);
+    expect(payloadSeen?.think).toBe("high");
+    expect((payloadSeen?.options as Record<string, unknown> | undefined)?.think).toBeUndefined();
+  });
+
@@ -166,6 +166,13 @@ export default definePluginEntry({
       contributeResolvedModelCompat: ({ model }) =>
         usesOllamaOpenAICompatTransport(model) ? { supportsUsageInStreaming: true } : undefined,
       resolveReasoningOutputMode: () => "native",
+      resolveThinkingProfile: ({ reasoning }) => ({
+        levels:
+          reasoning === true
+            ? [{ id: "off" }, { id: "low" }, { id: "medium" }, { id: "high" }, { id: "max" }]
+            : [{ id: "off" }],
+        defaultLevel: "off",
+      }),
       wrapStreamFn: createConfiguredOllamaCompatStreamWrapper,
       createEmbeddingProvider: async ({ config, model, remote }) => {
         const { provider, client } = await createOllamaEmbeddingProvider({
@@ -203,13 +203,26 @@ describe("ollama provider models", () => {
       "vision",
       "completion",
       "tools",
+      "thinking",
     ]);
     expect(visionModel.input).toEqual(["text", "image"]);
+    expect(visionModel.reasoning).toBe(true);
+    expect(visionModel.compat?.supportsTools).toBe(true);
 
     const textModel = buildOllamaModelDefinition("glm-5.1:cloud", 202752, ["completion", "tools"]);
     expect(textModel.input).toEqual(["text"]);
+    expect(textModel.reasoning).toBe(false);
+    expect(textModel.compat?.supportsTools).toBe(true);
 
     const noCapabilities = buildOllamaModelDefinition("unknown-model", 65536);
     expect(noCapabilities.input).toEqual(["text"]);
+    expect(noCapabilities.compat).toBeUndefined();
   });
+
+  it("disables tool support when Ollama capabilities omit tools", () => {
+    const model = buildOllamaModelDefinition("embeddinggemma:latest", 2048, ["embedding"]);
+
+    expect(model.reasoning).toBe(false);
+    expect(model.compat?.supportsTools).toBe(false);
+  });
 });
@@ -218,14 +218,25 @@ export function buildOllamaModelDefinition(
 ): ModelDefinitionConfig {
   const hasVision = capabilities?.includes("vision") ?? false;
   const input: ("text" | "image")[] = hasVision ? ["text", "image"] : ["text"];
+  const reasoning =
+    capabilities === undefined
+      ? isReasoningModelHeuristic(modelId)
+      : capabilities.includes("thinking");
+  const compat =
+    capabilities === undefined
+      ? undefined
+      : {
+          supportsTools: capabilities.includes("tools"),
+        };
   return {
     id: modelId,
     name: modelId,
-    reasoning: isReasoningModelHeuristic(modelId),
+    reasoning,
     input,
     cost: OLLAMA_DEFAULT_COST,
     contextWindow: contextWindow ?? OLLAMA_DEFAULT_CONTEXT_WINDOW,
     maxTokens: OLLAMA_DEFAULT_MAX_TOKENS,
+    ...(compat ? { compat } : {}),
   };
 }
 
@@ -150,7 +150,7 @@ describe("createConfiguredOllamaCompatStreamWrapper", () => {
     );
   });
 
-  it("forwards think=true on native Ollama chat requests when thinking is enabled", async () => {
+  it("forwards the native think effort on native Ollama chat requests when thinking is enabled", async () => {
     await withMockNdjsonFetch(
       [
         '{"model":"m","created_at":"t","message":{"role":"assistant","content":"ok"},"done":false}',
@@ -193,10 +193,63 @@ describe("createConfiguredOllamaCompatStreamWrapper", () => {
           throw new Error("Expected string request body");
         }
         const requestBody = JSON.parse(requestInit.body) as {
-          think?: boolean;
-          options?: { think?: boolean; num_ctx?: number };
+          think?: boolean | string;
+          options?: { think?: boolean | string; num_ctx?: number };
         };
-        expect(requestBody.think).toBe(true);
+        expect(requestBody.think).toBe("low");
+        expect(requestBody.options?.think).toBeUndefined();
+        expect(requestBody.options?.num_ctx).toBe(131072);
+      },
+    );
+  });
+
+  it("maps native Ollama max thinking to think=high on the wire", async () => {
+    await withMockNdjsonFetch(
+      [
+        '{"model":"m","created_at":"t","message":{"role":"assistant","content":"ok"},"done":false}',
+        '{"model":"m","created_at":"t","message":{"role":"assistant","content":""},"done":true,"prompt_eval_count":1,"eval_count":1}',
+      ],
+      async (fetchMock) => {
+        const baseStreamFn = createOllamaStreamFn("http://ollama-host:11434");
+        const model = {
+          api: "ollama",
+          provider: "ollama",
+          id: "gpt-oss:20b",
+          contextWindow: 131072,
+        };
+
+        const wrapped = createConfiguredOllamaCompatStreamWrapper({
+          provider: "ollama",
+          modelId: "gpt-oss:20b",
+          model,
+          streamFn: baseStreamFn,
+          thinkingLevel: "max",
+        } as never);
+        if (!wrapped) {
+          throw new Error("Expected wrapped Ollama stream function");
+        }
+
+        const stream = await Promise.resolve(
+          wrapped(
+            model as never,
+            {
+              messages: [{ role: "user", content: "hello" }],
+            } as never,
+            {} as never,
+          ),
+        );
+
+        await collectStreamEvents(stream);
+
+        const requestInit = getGuardedFetchCall(fetchMock).init ?? {};
+        if (typeof requestInit.body !== "string") {
+          throw new Error("Expected string request body");
+        }
+        const requestBody = JSON.parse(requestInit.body) as {
+          think?: boolean | string;
+          options?: { think?: boolean | string; num_ctx?: number };
+        };
+        expect(requestBody.think).toBe("high");
         expect(requestBody.options?.think).toBeUndefined();
         expect(requestBody.options?.num_ctx).toBe(131072);
       },
@@ -151,7 +151,12 @@ export function wrapOllamaCompatNumCtx(baseFn: StreamFn | undefined, numCtx: num
     });
 }
 
-function createOllamaThinkingWrapper(baseFn: StreamFn | undefined, think: boolean): StreamFn {
+type OllamaThinkValue = boolean | "low" | "medium" | "high";
+
+function createOllamaThinkingWrapper(
+  baseFn: StreamFn | undefined,
+  think: OllamaThinkValue,
+): StreamFn {
   const streamFn = baseFn ?? streamSimple;
   return (model, context, options) =>
     streamWithPayloadPatch(streamFn, model, context, options, (payloadRecord) => {
@@ -159,6 +164,22 @@ function createOllamaThinkingWrapper(baseFn: StreamFn | undefined, think: boolea
     });
 }
 
+function resolveOllamaThinkValue(thinkingLevel: unknown): OllamaThinkValue | undefined {
+  if (thinkingLevel === "off") {
+    return false;
+  }
+  if (thinkingLevel === "low" || thinkingLevel === "medium" || thinkingLevel === "high") {
+    return thinkingLevel;
+  }
+  if (thinkingLevel === "minimal") {
+    return "low";
+  }
+  if (thinkingLevel === "xhigh" || thinkingLevel === "adaptive" || thinkingLevel === "max") {
+    return "high";
+  }
+  return undefined;
+}
+
 function resolveOllamaCompatNumCtx(model: ProviderRuntimeModel): number {
   return Math.max(1, Math.floor(model.contextWindow ?? model.maxTokens ?? DEFAULT_CONTEXT_TOKENS));
 }
@@ -196,12 +217,11 @@ export function createConfiguredOllamaCompatStreamWrapper(
     streamFn = wrapOllamaCompatNumCtx(streamFn, resolveOllamaCompatNumCtx(model));
   }
 
-  if (isNativeOllamaTransport && ctx.thinkingLevel === "off") {
-    streamFn = createOllamaThinkingWrapper(streamFn, false);
-  } else if (isNativeOllamaTransport && ctx.thinkingLevel) {
-    // Any non-off ThinkLevel (minimal, low, medium, high, xhigh, adaptive, max)
-    // should enable Ollama's native thinking mode.
-    streamFn = createOllamaThinkingWrapper(streamFn, true);
+  const ollamaThinkValue = isNativeOllamaTransport
+    ? resolveOllamaThinkValue(ctx.thinkingLevel)
+    : undefined;
+  if (ollamaThinkValue !== undefined) {
+    streamFn = createOllamaThinkingWrapper(streamFn, ollamaThinkValue);
   }
 
   if (normalizeProviderId(ctx.provider) === "ollama" && isOllamaCloudKimiModelRef(ctx.modelId)) {
@@ -310,7 +330,7 @@ interface OllamaChatRequest {
   stream: boolean;
   tools?: OllamaTool[];
   options?: Record<string, unknown>;
-  think?: boolean;
+  think?: OllamaThinkValue;
 }
 
 interface OllamaChatMessage {
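The level mapping introduced by `resolveOllamaThinkValue` is easiest to review as a table of inputs and outputs. The sketch below restates the mapping from the hunks above as a standalone function; `toWireThink` and `ThinkValue` are hypothetical names, and the list of input levels is taken from the comment removed in this diff.

```typescript
type ThinkValue = boolean | "low" | "medium" | "high";

// Restates the mapping from the diff: levels that the wire format does not
// carry are clamped to the nearest supported effort.
function toWireThink(level: unknown): ThinkValue | undefined {
  if (level === "off") {
    return false;
  }
  if (level === "low" || level === "medium" || level === "high") {
    return level;
  }
  if (level === "minimal") {
    return "low";
  }
  if (level === "xhigh" || level === "adaptive" || level === "max") {
    return "high";
  }
  // Unknown or unset level: the caller leaves the payload untouched.
  return undefined;
}

const inputLevels: unknown[] = [
  "off", "minimal", "low", "medium", "high", "xhigh", "adaptive", "max", undefined,
];
const wireValues = inputLevels.map((level) => toWireThink(level));
for (let i = 0; i < inputLevels.length; i += 1) {
  console.log(String(inputLevels[i]), "->", String(wireValues[i]));
}
```

Returning `undefined` for an unset level is what lets the rewritten call site skip `createOllamaThinkingWrapper` entirely, whereas the removed branch forced `think: true` for any non-off level.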
@@ -162,6 +162,7 @@ describe("telegram live qa runtime", () => {
       sutAccountId: "sut",
     });
 
+    expect(next.agents?.defaults?.skipBootstrap).toBe(true);
     expect(next.plugins?.allow).toContain("telegram");
     expect(next.plugins?.entries?.telegram).toEqual({ enabled: true });
     expect(next.channels?.telegram).toEqual({
@@ -375,6 +376,27 @@ describe("telegram live qa runtime", () => {
         matchText: "TELEGRAM_QA_NOMENTION_TOKEN",
       }),
     ).toBe(false);
+    expect(
+      __testing.matchesTelegramScenarioReply({
+        allowAnySutReply: true,
+        groupId: "-100123",
+        sentMessageId: 55,
+        sutBotId: 88,
+        message: {
+          updateId: 3,
+          messageId: 12,
+          chatId: -100123,
+          senderId: 88,
+          senderIsBot: true,
+          senderUsername: "sut_bot",
+          text: "Protocol note: acknowledged.",
+          replyToMessageId: undefined,
+          timestamp: 1_700_000_003_000,
+          inlineButtons: [],
+          mediaKinds: [],
+        },
+      }),
+    ).toBe(true);
   });
 
   it("validates expected Telegram reply markers", () => {
@@ -51,6 +51,7 @@ type TelegramQaScenarioId =
   | "telegram-mention-gating";
 
 type TelegramQaScenarioRun = {
+  allowAnySutReply?: boolean;
   expectReply: boolean;
   input: string;
   expectedTextIncludes?: string[];
@@ -268,15 +269,11 @@ const TELEGRAM_QA_SCENARIOS: TelegramQaScenarioDefinition[] = [
     id: "telegram-mentioned-message-reply",
     title: "Telegram mentioned message gets a reply",
     timeoutMs: 45_000,
-    buildRun: (sutUsername) => {
-      const token = `TELEGRAM_QA_REPLY_${randomUUID().slice(0, 8).toUpperCase()}`;
-      return {
-        expectReply: true,
-        input: `@${sutUsername} reply with only this exact marker: ${token}`,
-        expectedTextIncludes: [token],
-        matchText: token,
-      };
-    },
+    buildRun: (sutUsername) => ({
+      allowAnySutReply: true,
+      expectReply: true,
+      input: `@${sutUsername} Telegram QA mention routing check. Reply with a short acknowledgement.`,
+    }),
   },
   {
     id: "telegram-mention-gating",
@@ -476,6 +473,13 @@ function buildTelegramQaConfig(
   };
   return {
     ...baseCfg,
+    agents: {
+      ...baseCfg.agents,
+      defaults: {
+        ...baseCfg.agents?.defaults,
+        skipBootstrap: true,
+      },
+    },
     plugins: {
       ...baseCfg.plugins,
       allow: pluginAllow,
@@ -751,6 +755,7 @@ function findScenario(ids?: string[]) {
 
 function matchesTelegramScenarioReply(params: {
   groupId: string;
+  allowAnySutReply?: boolean;
   matchText?: string;
   message: TelegramObservedMessage;
   sentMessageId: number;
@@ -765,6 +770,9 @@ function matchesTelegramScenarioReply(params: {
   if (params.message.replyToMessageId === params.sentMessageId) {
     return true;
   }
+  if (params.allowAnySutReply === true) {
+    return true;
+  }
   return Boolean(params.matchText && params.message.text.includes(params.matchText));
 }
 
@@ -1216,6 +1224,7 @@ export async function runTelegramQaLive(params: {
|
||||
observationScenarioTitle: scenario.title,
|
||||
predicate: (message) =>
|
||||
matchesTelegramScenarioReply({
|
||||
allowAnySutReply: scenarioRun.allowAnySutReply,
|
||||
groupId: runtimeEnv.groupId,
|
||||
matchText: scenarioRun.matchText,
|
||||
message,
|
||||
|
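The reply-matching change in the hunks above can be sketched in isolation. This is a minimal illustration, not the repository's function: the type names `ObservedMessage` and `ReplyMatchParams` and the function name `matchesScenarioReply` are hypothetical, and any checks that run before the visible lines of the real `matchesTelegramScenarioReply` (they fall outside the hunk) are omitted.

```typescript
// Hypothetical, trimmed-down shapes; the real types carry more fields.
type ObservedMessage = {
  text: string;
  replyToMessageId?: number;
};

type ReplyMatchParams = {
  allowAnySutReply?: boolean;
  matchText?: string;
  message: ObservedMessage;
  sentMessageId: number;
};

// Follows the order of checks visible in the diff:
// 1. a direct reply to the message we sent always matches;
// 2. otherwise any reply matches when the scenario opts in via allowAnySutReply;
// 3. otherwise fall back to a substring match on the marker text.
function matchesScenarioReply(params: ReplyMatchParams): boolean {
  if (params.message.replyToMessageId === params.sentMessageId) {
    return true;
  }
  if (params.allowAnySutReply === true) {
    return true;
  }
  return Boolean(params.matchText && params.message.text.includes(params.matchText));
}
```

With `allowAnySutReply` set, the scenario no longer depends on the model echoing an exact marker token, which appears to be the motivation for replacing the token-based `buildRun`.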
||||
@@ -433,6 +433,7 @@ export const dispatchTelegramMessage = async ({
|
||||
archivedAnswerPreviews.push({
|
||||
messageId: preview.messageId,
|
||||
textSnapshot: preview.textSnapshot,
|
||||
visibleSinceMs: preview.visibleSinceMs,
|
||||
deleteIfUnused: true,
|
||||
});
|
||||
}
|
||||
@@ -539,6 +540,7 @@ export const dispatchTelegramMessage = async ({
|
||||
archivedAnswerPreviews.push({
|
||||
messageId: previewMessageId,
|
||||
textSnapshot: answerLane.lastPartialText,
|
||||
visibleSinceMs: answerLane.stream?.visibleSinceMs?.(),
|
||||
deleteIfUnused: false,
|
||||
});
|
||||
}
|
||||
|
||||
@@ -6,6 +6,7 @@ export type TestDraftStream = {
|
||||
update: ReturnType<typeof vi.fn<(text: string) => void>>;
|
||||
flush: ReturnType<typeof vi.fn<() => Promise<void>>>;
|
||||
messageId: ReturnType<typeof vi.fn<() => number | undefined>>;
|
||||
visibleSinceMs: ReturnType<typeof vi.fn<() => number | undefined>>;
|
||||
previewMode: ReturnType<typeof vi.fn<() => DraftPreviewMode>>;
|
||||
previewRevision: ReturnType<typeof vi.fn<() => number>>;
|
||||
lastDeliveredText: ReturnType<typeof vi.fn<() => string>>;
|
||||
@@ -25,8 +26,10 @@ export function createTestDraftStream(params?: {
|
||||
onStop?: () => void | Promise<void>;
|
||||
onDiscard?: () => void | Promise<void>;
|
||||
clearMessageIdOnForceNew?: boolean;
|
||||
visibleSinceMs?: number;
|
||||
}): TestDraftStream {
|
||||
let messageId = params?.messageId;
|
||||
let visibleSinceMs = params?.visibleSinceMs;
|
||||
let previewRevision = 0;
|
||||
let lastDeliveredText = "";
|
||||
return {
|
||||
@@ -37,6 +40,7 @@ export function createTestDraftStream(params?: {
|
||||
}),
|
||||
flush: vi.fn().mockResolvedValue(undefined),
|
||||
messageId: vi.fn().mockImplementation(() => messageId),
|
||||
visibleSinceMs: vi.fn().mockImplementation(() => visibleSinceMs),
|
||||
previewMode: vi.fn().mockReturnValue(params?.previewMode ?? "message"),
|
||||
previewRevision: vi.fn().mockImplementation(() => previewRevision),
|
||||
lastDeliveredText: vi.fn().mockImplementation(() => lastDeliveredText),
|
||||
@@ -52,16 +56,19 @@ export function createTestDraftStream(params?: {
|
||||
if (params?.clearMessageIdOnForceNew) {
|
||||
messageId = undefined;
|
||||
}
|
||||
visibleSinceMs = undefined;
|
||||
}),
|
||||
sendMayHaveLanded: vi.fn().mockReturnValue(false),
|
||||
setMessageId: (value: number | undefined) => {
|
||||
messageId = value;
|
||||
visibleSinceMs = value == null ? undefined : Date.now();
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
export function createSequencedTestDraftStream(startMessageId = 1001): TestDraftStream {
|
||||
let activeMessageId: number | undefined;
|
||||
let visibleSinceMs: number | undefined;
|
||||
let nextMessageId = startMessageId;
|
||||
let previewRevision = 0;
|
||||
let lastDeliveredText = "";
|
||||
@@ -69,12 +76,14 @@ export function createSequencedTestDraftStream(startMessageId = 1001): TestDraft
|
||||
update: vi.fn().mockImplementation((text: string) => {
|
||||
if (activeMessageId == null) {
|
||||
activeMessageId = nextMessageId++;
|
||||
visibleSinceMs = Date.now();
|
||||
}
|
||||
previewRevision += 1;
|
||||
lastDeliveredText = text.trimEnd();
|
||||
}),
|
||||
flush: vi.fn().mockResolvedValue(undefined),
|
||||
messageId: vi.fn().mockImplementation(() => activeMessageId),
|
||||
visibleSinceMs: vi.fn().mockImplementation(() => visibleSinceMs),
|
||||
previewMode: vi.fn().mockReturnValue("message"),
|
||||
previewRevision: vi.fn().mockImplementation(() => previewRevision),
|
||||
lastDeliveredText: vi.fn().mockImplementation(() => lastDeliveredText),
|
||||
@@ -84,10 +93,12 @@ export function createSequencedTestDraftStream(startMessageId = 1001): TestDraft
|
||||
materialize: vi.fn().mockImplementation(async () => activeMessageId),
|
||||
forceNewMessage: vi.fn().mockImplementation(() => {
|
||||
activeMessageId = undefined;
|
||||
visibleSinceMs = undefined;
|
||||
}),
|
||||
sendMayHaveLanded: vi.fn().mockReturnValue(false),
|
||||
setMessageId: (value: number | undefined) => {
|
||||
activeMessageId = value;
|
||||
visibleSinceMs = value == null ? undefined : Date.now();
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
@@ -161,6 +161,28 @@ describe("createTelegramDraftStream", () => {
|
||||
expect(api.sendMessageDraft).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("tracks when a message preview first became visible", async () => {
|
||||
vi.useFakeTimers();
|
||||
try {
|
||||
vi.setSystemTime(new Date("2026-04-26T01:00:00.000Z"));
|
||||
const api = createMockDraftApi();
|
||||
const stream = createDraftStream(api, { previewTransport: "message" });
|
||||
|
||||
stream.update("Hello");
|
||||
await stream.flush();
|
||||
|
||||
expect(stream.visibleSinceMs?.()).toBe(Date.parse("2026-04-26T01:00:00.000Z"));
|
||||
|
||||
vi.setSystemTime(new Date("2026-04-26T01:01:00.000Z"));
|
||||
stream.update("Hello again");
|
||||
await stream.flush();
|
||||
|
||||
expect(stream.visibleSinceMs?.()).toBe(Date.parse("2026-04-26T01:00:00.000Z"));
|
||||
} finally {
|
||||
vi.useRealTimers();
|
||||
}
|
||||
});
|
||||
|
||||
it("falls back to message transport when sendMessageDraft is unavailable", async () => {
|
||||
const api = createMockDraftApi();
|
||||
delete (api as { sendMessageDraft?: unknown }).sendMessageDraft;
|
||||
@@ -436,6 +458,23 @@ describe("createTelegramDraftStream", () => {
|
||||
expect(api.sendMessage).toHaveBeenLastCalledWith(123, "After thinking", undefined);
|
||||
});
|
||||
|
||||
it("creates new message after cleanup and forceNewMessage", async () => {
|
||||
const { api, stream } = createForceNewMessageHarness();
|
||||
|
||||
stream.update("Stale preview");
|
||||
await stream.flush();
|
||||
|
||||
await stream.clear();
|
||||
expect(api.deleteMessage).toHaveBeenCalledWith(123, 17);
|
||||
|
||||
stream.forceNewMessage();
|
||||
stream.update("Next preview");
|
||||
await stream.flush();
|
||||
|
||||
expect(api.sendMessage).toHaveBeenCalledTimes(2);
|
||||
expect(api.sendMessage).toHaveBeenLastCalledWith(123, "Next preview", undefined);
|
||||
});
|
||||
|
||||
it("sends first update immediately after forceNewMessage within throttle window", async () => {
|
||||
vi.useFakeTimers();
|
||||
try {
|
||||
@@ -487,6 +526,7 @@ describe("createTelegramDraftStream", () => {
|
||||
messageId: 17,
|
||||
textSnapshot: "Message A partial",
|
||||
parseMode: undefined,
|
||||
visibleSinceMs: expect.any(Number),
|
||||
});
|
||||
expect(api.sendMessage).toHaveBeenCalledTimes(2);
|
||||
expect(api.sendMessage).toHaveBeenNthCalledWith(2, 123, "Message B partial", undefined);
|
||||
|
||||
@@ -94,6 +94,7 @@ export type TelegramDraftStream = {
|
||||
update: (text: string) => void;
|
||||
flush: () => Promise<void>;
|
||||
messageId: () => number | undefined;
|
||||
visibleSinceMs?: () => number | undefined;
|
||||
previewMode?: () => "message" | "draft";
|
||||
previewRevision?: () => number;
|
||||
lastDeliveredText?: () => string;
|
||||
@@ -118,6 +119,7 @@ type SupersededTelegramPreview = {
|
||||
messageId: number;
|
||||
textSnapshot: string;
|
||||
parseMode?: "HTML";
|
||||
visibleSinceMs?: number;
|
||||
};
|
||||
|
||||
export function createTelegramDraftStream(params: {
|
||||
@@ -174,6 +176,7 @@ export function createTelegramDraftStream(params: {
|
||||
const streamState = { stopped: false, final: false };
|
||||
let messageSendAttempted = false;
|
||||
let streamMessageId: number | undefined;
|
||||
let streamVisibleSinceMs: number | undefined;
|
||||
let streamDraftId = usesDraftTransport ? allocateTelegramDraftId() : undefined;
|
||||
let previewTransport: "message" | "draft" = usesDraftTransport ? "draft" : "message";
|
||||
let lastSentText = "";
|
||||
@@ -226,6 +229,7 @@ export function createTelegramDraftStream(params: {
|
||||
sendGeneration,
|
||||
}: PreviewSendParams): Promise<boolean> => {
|
||||
if (typeof streamMessageId === "number") {
|
||||
streamVisibleSinceMs ??= Date.now();
|
||||
if (renderedParseMode) {
|
||||
await params.api.editMessageText(chatId, streamMessageId, renderedText, {
|
||||
parse_mode: renderedParseMode,
|
||||
@@ -257,15 +261,18 @@ export function createTelegramDraftStream(params: {
|
||||
return false;
|
||||
}
|
||||
const normalizedMessageId = Math.trunc(sentMessageId);
|
||||
const visibleSinceMs = Date.now();
|
||||
if (sendGeneration !== generation) {
|
||||
params.onSupersededPreview?.({
|
||||
messageId: normalizedMessageId,
|
||||
textSnapshot: renderedText,
|
||||
parseMode: renderedParseMode,
|
||||
visibleSinceMs,
|
||||
});
|
||||
return true;
|
||||
}
|
||||
streamMessageId = normalizedMessageId;
|
||||
streamVisibleSinceMs = visibleSinceMs;
|
||||
return true;
|
||||
};
|
||||
const sendDraftTransportPreview = async ({
|
||||
@@ -397,10 +404,12 @@ export function createTelegramDraftStream(params: {
|
||||
};
|
||||
|
||||
const forceNewMessage = () => {
|
||||
streamState.stopped = false;
|
||||
streamState.final = false;
|
||||
generation += 1;
|
||||
messageSendAttempted = false;
|
||||
streamMessageId = undefined;
|
||||
streamVisibleSinceMs = undefined;
|
||||
if (previewTransport === "draft") {
|
||||
streamDraftId = allocateTelegramDraftId();
|
||||
}
|
||||
@@ -430,6 +439,7 @@ export function createTelegramDraftStream(params: {
|
||||
const sentId = sent?.message_id;
|
||||
if (typeof sentId === "number" && Number.isFinite(sentId)) {
|
||||
streamMessageId = Math.trunc(sentId);
|
||||
streamVisibleSinceMs = Date.now();
|
||||
if (resolvedDraftApi != null && streamDraftId != null) {
|
||||
const clearDraftId = streamDraftId;
|
||||
const clearThreadParams =
|
||||
@@ -454,6 +464,7 @@ export function createTelegramDraftStream(params: {
|
||||
update,
|
||||
flush: loop.flush,
|
||||
messageId: () => streamMessageId,
|
||||
visibleSinceMs: () => streamVisibleSinceMs,
|
||||
previewMode: () => previewTransport,
|
||||
previewRevision: () => previewRevision,
|
||||
lastDeliveredText: () => lastDeliveredText,
|
||||
|
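The `visibleSinceMs` bookkeeping added to the draft stream follows a simple rule: stamp the time when a preview message first lands, keep that stamp across later edits, and drop it when a new message is forced. A minimal sketch of that rule, assuming a hypothetical helper `createPreviewVisibility` with an injectable clock (the real stream keeps this state inline in `streamMessageId` and `streamVisibleSinceMs`):

```typescript
// Hypothetical helper for illustration; not part of the diff.
function createPreviewVisibility(now: () => number = Date.now) {
  let messageId: number | undefined;
  let visibleSinceMs: number | undefined;
  return {
    // First successful send of a preview message: record id and timestamp.
    noteSent(id: number): void {
      messageId = Math.trunc(id);
      visibleSinceMs = now();
    },
    // Edits keep the original timestamp; `??=` only fills a missing value.
    noteEdited(): void {
      if (messageId !== undefined) {
        visibleSinceMs ??= now();
      }
    },
    // Starting a new message forgets both the id and the timestamp.
    forceNewMessage(): void {
      messageId = undefined;
      visibleSinceMs = undefined;
    },
    messageId: () => messageId,
    visibleSinceMs: () => visibleSinceMs,
  };
}
```

Passing the clock as a parameter keeps the sketch testable without fake timers; the test in the diff uses `vi.useFakeTimers()` for the same purpose.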
||||
@@ -12,6 +12,7 @@ const MESSAGE_NOT_MODIFIED_RE =
|
||||
/400:\s*Bad Request:\s*message is not modified|MESSAGE_NOT_MODIFIED/i;
|
||||
const MESSAGE_NOT_FOUND_RE =
|
||||
/400:\s*Bad Request:\s*message to edit not found|MESSAGE_ID_INVALID|message can't be edited/i;
|
||||
const LONG_LIVED_PREVIEW_FRESH_FINAL_AFTER_MS = 60_000;
|
||||
|
||||
function extractErrorText(err: unknown): string {
|
||||
return typeof err === "string"
|
||||
@@ -55,6 +56,7 @@ export type DraftLaneState = {
|
||||
export type ArchivedPreview = {
|
||||
messageId: number;
|
||||
textSnapshot: string;
|
||||
visibleSinceMs?: number;
|
||||
// Boundary-finalized previews should remain visible even if no matching
|
||||
// final edit arrives; superseded previews can be safely deleted.
|
||||
deleteIfUnused?: boolean;
|
||||
@@ -92,6 +94,7 @@ type CreateLaneTextDelivererParams = {
|
||||
deletePreviewMessage: (messageId: number) => Promise<void>;
|
||||
log: (message: string) => void;
|
||||
markDelivered: () => void;
|
||||
now?: () => number;
|
||||
};
|
||||
|
||||
type DeliverLaneTextParams = {
|
||||
@@ -169,6 +172,14 @@ function shouldSkipRegressivePreviewUpdate(args: {
|
||||
);
|
||||
}
|
||||
|
||||
function isLongLivedPreview(visibleSinceMs: number | undefined, nowMs: number): boolean {
|
||||
return (
|
||||
typeof visibleSinceMs === "number" &&
|
||||
Number.isFinite(visibleSinceMs) &&
|
||||
nowMs - visibleSinceMs >= LONG_LIVED_PREVIEW_FRESH_FINAL_AFTER_MS
|
||||
);
|
||||
}
|
||||
|
||||
function resolvePreviewTarget(params: ResolvePreviewTargetParams): PreviewTargetResolution {
|
||||
const lanePreviewMessageId = params.lane.stream?.messageId();
|
||||
const previewMessageId =
|
||||
@@ -187,11 +198,27 @@ function resolvePreviewTarget(params: ResolvePreviewTargetParams): PreviewTarget
|
||||
|
||||
export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
const getLanePreviewText = (lane: DraftLaneState) => lane.lastPartialText;
|
||||
const readNow = () => params.now?.() ?? Date.now();
|
||||
const markActivePreviewComplete = (laneName: LaneName) => {
|
||||
params.activePreviewLifecycleByLane[laneName] = "complete";
|
||||
params.retainPreviewOnCleanupByLane[laneName] = true;
|
||||
};
|
||||
const isDraftPreviewLane = (lane: DraftLaneState) => lane.stream?.previewMode?.() === "draft";
|
||||
const isMessagePreviewLane = (lane: DraftLaneState) => !isDraftPreviewLane(lane);
|
||||
const shouldUseFreshFinalForLane = (lane: DraftLaneState) =>
|
||||
isMessagePreviewLane(lane) && isLongLivedPreview(lane.stream?.visibleSinceMs?.(), readNow());
|
||||
const shouldUseFreshFinalForPreview = (lane: DraftLaneState, visibleSinceMs?: number) =>
|
||||
isMessagePreviewLane(lane) && isLongLivedPreview(visibleSinceMs, readNow());
|
||||
const clearActivePreviewAfterFreshFinal = async (lane: DraftLaneState, laneName: LaneName) => {
|
||||
try {
|
||||
await lane.stream?.clear();
|
||||
} catch (err) {
|
||||
params.log(`telegram: ${laneName} fresh final preview cleanup failed: ${String(err)}`);
|
||||
}
|
||||
lane.lastPartialText = "";
|
||||
lane.hasStreamedMessage = false;
|
||||
lane.stream?.forceNewMessage();
|
||||
};
|
||||
const canMaterializeDraftFinal = (
|
||||
lane: DraftLaneState,
|
||||
previewButtons?: TelegramInlineButtons,
|
||||
@@ -444,6 +471,19 @@ export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
if (!archivedPreview) {
|
||||
return undefined;
|
||||
}
|
||||
if (canEditViaPreview && shouldUseFreshFinalForPreview(lane, archivedPreview.visibleSinceMs)) {
|
||||
const delivered = await params.sendPayload(params.applyTextToPayload(payload, text));
|
||||
if (delivered) {
|
||||
try {
|
||||
await params.deletePreviewMessage(archivedPreview.messageId);
|
||||
} catch (err) {
|
||||
params.log(
|
||||
`telegram: archived answer preview cleanup failed (${archivedPreview.messageId}): ${String(err)}`,
|
||||
);
|
||||
}
|
||||
return result("sent");
|
||||
}
|
||||
}
|
||||
if (canEditViaPreview) {
|
||||
const finalized = await tryUpdatePreviewForLane({
|
||||
lane,
|
||||
@@ -551,6 +591,14 @@ export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
});
|
||||
}
|
||||
}
|
||||
if (shouldUseFreshFinalForLane(lane)) {
|
||||
await params.stopDraftLane(lane);
|
||||
const delivered = await params.sendPayload(params.applyTextToPayload(payload, text));
|
||||
if (delivered) {
|
||||
await clearActivePreviewAfterFreshFinal(lane, laneName);
|
||||
return result("sent");
|
||||
}
|
||||
}
|
||||
const previewMessageId = lane.stream?.messageId();
|
||||
const finalized = await tryUpdatePreviewForLane({
|
||||
lane,
|
||||
|
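The delivery decision added above can be summarized as: when a message preview has been visible for at least 60 seconds, try to send the final as a fresh message and clean up the preview; if that send reports failure, fall back to editing the preview in place. The sketch below is a simplified illustration under stated assumptions. `deliverFinal`, its callback names, and the `"preview-finalized"` label are hypothetical; the draft-versus-message lane check and the archived-preview path from the diff are left out.

```typescript
const LONG_LIVED_PREVIEW_FRESH_FINAL_AFTER_MS = 60_000;

// Same predicate as in the diff: a finite timestamp at least 60s in the past.
function isLongLivedPreview(visibleSinceMs: number | undefined, nowMs: number): boolean {
  return (
    typeof visibleSinceMs === "number" &&
    Number.isFinite(visibleSinceMs) &&
    nowMs - visibleSinceMs >= LONG_LIVED_PREVIEW_FRESH_FINAL_AFTER_MS
  );
}

// "sent" appears in the diff; "preview-finalized" is a placeholder label.
type FinalDelivery = "sent" | "preview-finalized";

// Hypothetical callbacks standing in for sendPayload, the preview edit, and cleanup.
async function deliverFinal(opts: {
  visibleSinceMs?: number;
  nowMs: number;
  sendFresh: () => Promise<boolean>;
  editPreview: () => Promise<void>;
  clearPreview: () => Promise<void>;
}): Promise<FinalDelivery> {
  if (isLongLivedPreview(opts.visibleSinceMs, opts.nowMs)) {
    // Prefer a fresh message so the final answer is not buried in an old preview.
    if (await opts.sendFresh()) {
      await opts.clearPreview();
      return "sent";
    }
  }
  // Short-lived preview, or the fresh send failed: edit the preview in place.
  await opts.editPreview();
  return "preview-finalized";
}
```

The comparison is inclusive (`>=`), which matches the tests in the diff that set `nowMs` to exactly `visibleSinceMs + 60_000` and expect the fresh-final path.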
||||
@@ -2,6 +2,7 @@ import type { ReplyPayload } from "openclaw/plugin-sdk/reply-runtime";
|
||||
import { describe, expect, it, vi } from "vitest";
|
||||
import { createTestDraftStream } from "./draft-stream.test-helpers.js";
|
||||
import {
|
||||
type ArchivedPreview,
|
||||
createLaneTextDeliverer,
|
||||
type DraftLaneState,
|
||||
type LaneDeliveryResult,
|
||||
@@ -17,9 +18,15 @@ function createHarness(params?: {
|
   answerStream?: DraftLaneState["stream"];
   answerHasStreamedMessage?: boolean;
   answerLastPartialText?: string;
+  answerPreviewVisibleSinceMs?: number;
+  nowMs?: number;
 }) {
   const answer =
-    params?.answerStream ?? createTestDraftStream({ messageId: params?.answerMessageId });
+    params?.answerStream ??
+    createTestDraftStream({
+      messageId: params?.answerMessageId,
+      visibleSinceMs: params?.answerPreviewVisibleSinceMs,
+    });
   const reasoning = createTestDraftStream();
   const lanes: Record<LaneName, DraftLaneState> = {
     answer: {
||||
@@ -51,11 +58,7 @@ function createHarness(params?: {
|
   const markDelivered = vi.fn();
   const activePreviewLifecycleByLane = { answer: "transient", reasoning: "transient" } as const;
   const retainPreviewOnCleanupByLane = { answer: false, reasoning: false } as const;
-  const archivedAnswerPreviews: Array<{
-    messageId: number;
-    textSnapshot: string;
-    deleteIfUnused?: boolean;
-  }> = [];
+  const archivedAnswerPreviews: ArchivedPreview[] = [];

   const deliverLaneText = createLaneTextDeliverer({
     lanes,
||||
@@ -71,6 +74,7 @@ function createHarness(params?: {
|
||||
deletePreviewMessage,
|
||||
log,
|
||||
markDelivered,
|
||||
now: params?.nowMs != null ? () => params.nowMs! : undefined,
|
||||
});
|
||||
|
||||
return {
|
||||
@@ -347,6 +351,116 @@ describe("createLaneTextDeliverer", () => {
|
||||
expect(harness.log).toHaveBeenCalledWith(expect.stringContaining("preview final too long"));
|
||||
});
|
||||
|
||||
it("sends a fresh final when a message preview is long lived", async () => {
|
||||
const visibleSinceMs = 10_000;
|
||||
const harness = createHarness({
|
||||
answerMessageId: 999,
|
||||
answerHasStreamedMessage: true,
|
||||
answerLastPartialText: "Working...",
|
||||
answerPreviewVisibleSinceMs: visibleSinceMs,
|
||||
nowMs: visibleSinceMs + 60_000,
|
||||
});
|
||||
|
||||
const result = await deliverFinalAnswer(harness, HELLO_FINAL);
|
||||
|
||||
expect(result.kind).toBe("sent");
|
||||
expect(harness.stopDraftLane).toHaveBeenCalledTimes(1);
|
||||
expect(harness.sendPayload).toHaveBeenCalledWith(
|
||||
expect.objectContaining({ text: HELLO_FINAL }),
|
||||
);
|
||||
expect(harness.editPreview).not.toHaveBeenCalled();
|
||||
expect(harness.answer.stream?.clear).toHaveBeenCalledTimes(1);
|
||||
expect(harness.answer.stream?.forceNewMessage).toHaveBeenCalledTimes(1);
|
||||
expect(harness.lanes.answer.hasStreamedMessage).toBe(false);
|
||||
expect(harness.lanes.answer.lastPartialText).toBe("");
|
||||
expect(harness.markDelivered).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("falls back to editing a long-lived preview when fresh final send returns false", async () => {
|
||||
const visibleSinceMs = 10_000;
|
||||
const harness = createHarness({
|
||||
answerMessageId: 999,
|
||||
answerHasStreamedMessage: true,
|
||||
answerLastPartialText: "Working...",
|
||||
answerPreviewVisibleSinceMs: visibleSinceMs,
|
||||
nowMs: visibleSinceMs + 60_000,
|
||||
});
|
||||
harness.sendPayload.mockResolvedValueOnce(false);
|
||||
|
||||
const result = await deliverFinalAnswer(harness, HELLO_FINAL);
|
||||
|
||||
expect(expectPreviewFinalized(result)).toEqual({
|
||||
content: HELLO_FINAL,
|
||||
messageId: 999,
|
||||
});
|
||||
expect(harness.stopDraftLane).toHaveBeenCalledTimes(2);
|
||||
expect(harness.sendPayload).toHaveBeenCalledTimes(1);
|
||||
expect(harness.editPreview).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
messageId: 999,
|
||||
text: HELLO_FINAL,
|
||||
}),
|
||||
);
|
||||
expect(harness.answer.stream?.clear).not.toHaveBeenCalled();
|
||||
expect(harness.markDelivered).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it("sends a fresh final for stale archived previews", async () => {
|
||||
const visibleSinceMs = 10_000;
|
||||
const harness = createHarness({
|
||||
answerMessageId: 1001,
|
||||
answerPreviewVisibleSinceMs: visibleSinceMs,
|
||||
nowMs: visibleSinceMs + 60_000,
|
||||
});
|
||||
harness.archivedAnswerPreviews.push({
|
||||
messageId: 222,
|
||||
textSnapshot: "Working...",
|
||||
visibleSinceMs,
|
||||
deleteIfUnused: true,
|
||||
});
|
||||
|
||||
const result = await deliverFinalAnswer(harness, HELLO_FINAL);
|
||||
|
||||
expect(result.kind).toBe("sent");
|
||||
expect(harness.sendPayload).toHaveBeenCalledWith(
|
||||
expect.objectContaining({ text: HELLO_FINAL }),
|
||||
);
|
||||
expect(harness.editPreview).not.toHaveBeenCalled();
|
||||
expect(harness.deletePreviewMessage).toHaveBeenCalledWith(222);
|
||||
});
|
||||
|
||||
it("falls back to editing a stale archived preview when fresh final send returns false", async () => {
|
||||
const visibleSinceMs = 10_000;
|
||||
const harness = createHarness({
|
||||
answerMessageId: 1001,
|
||||
answerPreviewVisibleSinceMs: visibleSinceMs,
|
||||
nowMs: visibleSinceMs + 60_000,
|
||||
});
|
||||
harness.archivedAnswerPreviews.push({
|
||||
messageId: 222,
|
||||
textSnapshot: "Working...",
|
||||
visibleSinceMs,
|
||||
deleteIfUnused: true,
|
||||
});
|
||||
harness.sendPayload.mockResolvedValueOnce(false);
|
||||
|
||||
const result = await deliverFinalAnswer(harness, HELLO_FINAL);
|
||||
|
||||
expect(expectPreviewFinalized(result)).toEqual({
|
||||
content: HELLO_FINAL,
|
||||
messageId: 222,
|
||||
});
|
||||
expect(harness.sendPayload).toHaveBeenCalledTimes(1);
|
||||
expect(harness.editPreview).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
messageId: 222,
|
||||
text: HELLO_FINAL,
|
||||
}),
|
||||
);
|
||||
expect(harness.deletePreviewMessage).not.toHaveBeenCalled();
|
||||
expect(harness.markDelivered).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it("materializes DM draft streaming final even when text is unchanged", async () => {
|
||||
const answerStream = createTestDraftStream({ previewMode: "draft", messageId: 321 });
|
||||
answerStream.materialize.mockResolvedValue(321);
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import "./test-helpers.js";
|
||||
import { EventEmitter } from "node:events";
|
||||
import fs from "node:fs/promises";
|
||||
import os from "node:os";
|
||||
import path from "node:path";
|
||||
@@ -42,25 +43,57 @@ type WebAutoReplyMonitorHarness = {
|
   controller: AbortController;
   run: Promise<unknown>;
 };
+type MockSessionSocket = {
+  ev: { on: ReturnType<typeof vi.fn>; off: ReturnType<typeof vi.fn> };
+  ws: EventEmitter & { close: ReturnType<typeof vi.fn> };
+  user: { id: string };
+};

 export const TEST_NET_IP = "93.184.216.34";
+const WEB_AUTO_REPLY_SOCKETS_KEY = Symbol.for("openclaw:webAutoReplySessionSockets");
+
+function getSessionSockets(): MockSessionSocket[] {
+  const store = globalThis as Record<PropertyKey, unknown>;
+  if (!Array.isArray(store[WEB_AUTO_REPLY_SOCKETS_KEY])) {
+    store[WEB_AUTO_REPLY_SOCKETS_KEY] = [];
+  }
+  return store[WEB_AUTO_REPLY_SOCKETS_KEY] as MockSessionSocket[];
+}

 vi.mock("./session.js", async () => {
   const actual = await vi.importActual<typeof import("./session.js")>("./session.js");
   return {
     ...actual,
-    createWaSocket: vi.fn(async () => ({
-      ev: {
-        on: vi.fn(),
-        off: vi.fn(),
-      },
-      ws: { close: vi.fn() },
-      user: { id: "123@s.whatsapp.net" },
-    })),
+    createWaSocket: vi.fn(async () => {
+      const ws = new EventEmitter() as MockSessionSocket["ws"];
+      ws.close = vi.fn();
+      const sock: MockSessionSocket = {
+        ev: {
+          on: vi.fn(),
+          off: vi.fn(),
+        },
+        ws,
+        user: { id: "123@s.whatsapp.net" },
+      };
+      getSessionSockets().push(sock);
+      return sock;
+    }),
     waitForWaConnection: vi.fn().mockResolvedValue(undefined),
   };
 });

+export function getLastWebAutoReplySessionSocket(): MockSessionSocket {
+  const last = getSessionSockets().at(-1);
+  if (!last) {
+    throw new Error("No WhatsApp Web auto-reply test socket created");
+  }
+  return last;
+}
+
+export function resetWebAutoReplySessionSockets() {
+  getSessionSockets().length = 0;
+}
+
 vi.mock("openclaw/plugin-sdk/agent-runtime", () => ({
   abortEmbeddedPiRun: vi.fn().mockReturnValue(false),
   appendCronStyleCurrentTimeLine: (text: string) => text,
||||
@@ -166,6 +199,7 @@ export function installWebAutoReplyUnitTestHooks(opts?: { pinDns?: boolean }) {
|
||||
|
||||
beforeEach(async () => {
|
||||
vi.clearAllMocks();
|
||||
resetWebAutoReplySessionSockets();
|
||||
_resetBaileysMocks();
|
||||
_resetLoadConfigMock();
|
||||
if (opts?.pinDns) {
|
||||
|
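The test harness above stores mock sockets in an array hung off `globalThis` under a `Symbol.for(...)` key rather than in a module-level variable. A plausible reason, stated here as an assumption, is that a registry keyed by a global symbol stays shared even if the helper module is evaluated more than once by the test runner. A minimal standalone sketch of the pattern, with a hypothetical key and hypothetical function names:

```typescript
// Hypothetical key, for illustration only. Symbol.for returns the same
// symbol for the same string anywhere in the process.
const REGISTRY_KEY = Symbol.for("example:testSockets");

// Lazily create the shared array on first access and return it afterwards.
function getRegistry<T>(): T[] {
  const store = globalThis as Record<PropertyKey, unknown>;
  if (!Array.isArray(store[REGISTRY_KEY])) {
    store[REGISTRY_KEY] = [];
  }
  return store[REGISTRY_KEY] as T[];
}

// Return the most recently registered item, or fail loudly if none exists.
function getLastRegistered<T>(): T {
  const items = getRegistry<T>();
  const last = items[items.length - 1];
  if (last === undefined) {
    throw new Error("No item registered");
  }
  return last;
}

// Truncate in place so every holder of the array reference sees the reset.
function resetRegistry(): void {
  getRegistry<unknown>().length = 0;
}
```

Resetting by setting `length = 0` rather than assigning a new array mirrors `resetWebAutoReplySessionSockets` in the diff and keeps existing references valid.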
||||
@@ -12,6 +12,7 @@ import {
|
||||
createMockWebListener,
|
||||
createScriptedWebListenerFactory,
|
||||
createWebListenerFactoryCapture,
|
||||
getLastWebAutoReplySessionSocket,
|
||||
installWebAutoReplyTestHomeHooks,
|
||||
installWebAutoReplyUnitTestHooks,
|
||||
makeSessionStore,
|
||||
@@ -255,6 +256,92 @@ describe("web auto-reply connection", () => {
|
||||
}
|
||||
});
|
||||
|
||||
it("keeps quiet linked-device sessions open when transport frames keep arriving", async () => {
|
||||
vi.useFakeTimers();
|
||||
try {
|
||||
const sleep = vi.fn(async () => {});
|
||||
const scripted = createScriptedWebListenerFactory();
|
||||
const { controller, run } = startWebAutoReplyMonitor({
|
||||
monitorWebChannelFn: monitorWebChannel as never,
|
||||
listenerFactory: scripted.listenerFactory,
|
||||
sleep,
|
||||
heartbeatSeconds: 60,
|
||||
messageTimeoutMs: 30,
|
||||
watchdogCheckMs: 5,
|
||||
});
|
||||
|
||||
await vi.waitFor(
|
||||
() => {
|
||||
expect(scripted.getListenerCount()).toBe(1);
|
||||
},
|
||||
{ timeout: 250, interval: 2 },
|
||||
);
|
||||
|
||||
const socket = getLastWebAutoReplySessionSocket();
|
||||
await vi.advanceTimersByTimeAsync(20);
|
||||
socket.ws.emit("frame");
|
||||
await vi.advanceTimersByTimeAsync(20);
|
||||
socket.ws.emit("frame");
|
||||
await vi.advanceTimersByTimeAsync(20);
|
||||
|
||||
expect(scripted.getListenerCount()).toBe(1);
|
||||
|
||||
controller.abort();
|
||||
scripted.resolveClose(0, { status: 499, isLoggedOut: false });
|
||||
await Promise.resolve();
|
||||
await run;
|
||||
} finally {
|
||||
vi.useRealTimers();
|
||||
}
|
||||
});
|
||||
|
||||
it("does not let transport frames mask application silence forever", async () => {
|
||||
vi.useFakeTimers();
|
||||
try {
|
||||
const sleep = vi.fn(async () => {});
|
||||
const scripted = createScriptedWebListenerFactory();
|
||||
const { controller, run } = startWebAutoReplyMonitor({
|
||||
monitorWebChannelFn: monitorWebChannel as never,
|
||||
listenerFactory: scripted.listenerFactory,
|
||||
sleep,
|
||||
heartbeatSeconds: 60,
|
||||
messageTimeoutMs: 30,
|
||||
watchdogCheckMs: 5,
|
||||
});
|
||||
|
||||
await vi.waitFor(
|
||||
() => {
|
||||
expect(scripted.getListenerCount()).toBe(1);
|
||||
},
|
||||
{ timeout: 250, interval: 2 },
|
||||
);
|
||||
|
||||
const socket = getLastWebAutoReplySessionSocket();
|
||||
for (let elapsedMs = 0; elapsedMs < 140; elapsedMs += 20) {
|
||||
socket.ws.emit("frame");
|
||||
await vi.advanceTimersByTimeAsync(20);
|
||||
}
|
||||
|
||||
await vi.waitFor(
|
||||
() => {
|
||||
expect(scripted.getListenerCount()).toBeGreaterThanOrEqual(2);
|
||||
},
|
||||
{ timeout: 250, interval: 2 },
|
||||
);
|
||||
|
||||
controller.abort();
|
||||
scripted.resolveClose(scripted.getListenerCount() - 1, {
|
||||
status: 499,
|
||||
isLoggedOut: false,
|
||||
error: "aborted",
|
||||
});
|
||||
await Promise.resolve();
|
||||
await run;
|
||||
} finally {
|
||||
vi.useRealTimers();
|
||||
}
|
||||
});
|
||||
|
||||
it("gives a reconnected listener a fresh watchdog window", async () => {
|
||||
vi.useFakeTimers();
|
||||
try {
|
||||
|
||||
@@ -280,6 +280,7 @@ export async function monitorWebChannel(
|
||||
reconnectAttempts: snapshot.reconnectAttempts,
|
||||
messagesHandled: snapshot.handledMessages,
|
||||
lastInboundAt: snapshot.lastInboundAt,
|
||||
lastTransportActivityAt: snapshot.lastTransportActivityAt,
|
||||
authAgeMs,
|
||||
uptimeMs: snapshot.uptimeMs,
|
||||
...(minutesSinceLastMessage !== null && minutesSinceLastMessage > 30
|
||||
@@ -297,20 +298,28 @@ export async function monitorWebChannel(
|
       }
     },
     onWatchdogTimeout: (snapshot) => {
-      const watchdogBaselineAt = snapshot.lastInboundAt ?? snapshot.startedAt;
-      const minutesSinceLastMessage = Math.floor((Date.now() - watchdogBaselineAt) / 60000);
+      const now = Date.now();
+      const transportSilentMs = now - snapshot.lastTransportActivityAt;
+      const appBaselineAt = snapshot.lastInboundAt ?? snapshot.startedAt;
+      const minutesSinceTransportActivity = Math.floor(transportSilentMs / 60000);
+      const minutesSinceAppActivity = Math.floor((now - appBaselineAt) / 60000);
+      const watchdogReason =
+        transportSilentMs > messageTimeoutMs ? "transport-inactive" : "app-silent";
       statusController.noteWatchdogStale();
       heartbeatLogger.warn(
         {
           connectionId: snapshot.connectionId,
-          minutesSinceLastMessage,
+          watchdogReason,
+          minutesSinceTransportActivity,
+          minutesSinceAppActivity,
           lastInboundAt: snapshot.lastInboundAt ? new Date(snapshot.lastInboundAt) : null,
+          lastTransportActivityAt: new Date(snapshot.lastTransportActivityAt),
           messagesHandled: snapshot.handledMessages,
         },
-        "Message timeout detected - forcing reconnect",
+        "WhatsApp watchdog timeout detected - forcing reconnect",
       );
       whatsappHeartbeatLog.warn(
-        `No messages received in ${minutesSinceLastMessage}m - restarting connection`,
+        `WhatsApp watchdog timeout (${watchdogReason}) - restarting connection`,
       );
     },
   });
||||
|
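The new `onWatchdogTimeout` handler distinguishes two causes for a forced reconnect: the transport itself has been silent for longer than the message timeout, or transport frames are still arriving while no application-level message has been handled. A minimal sketch of that classification as a pure function, assuming a hypothetical name `describeWatchdogTimeout` and a trimmed snapshot type (the real handler also logs and updates status):

```typescript
// Trimmed snapshot shape; the real WhatsAppConnectionSnapshot has more fields.
type WatchdogSnapshot = {
  startedAt: number;
  lastInboundAt: number | null;
  lastTransportActivityAt: number;
};

type WatchdogReason = "transport-inactive" | "app-silent";

// Hypothetical helper: same arithmetic as the handler in the diff, with the
// clock passed in so the result is deterministic.
function describeWatchdogTimeout(
  snapshot: WatchdogSnapshot,
  messageTimeoutMs: number,
  now: number,
): {
  reason: WatchdogReason;
  minutesSinceTransportActivity: number;
  minutesSinceAppActivity: number;
} {
  const transportSilentMs = now - snapshot.lastTransportActivityAt;
  // With no inbound message yet, measure app silence from connection start.
  const appBaselineAt = snapshot.lastInboundAt ?? snapshot.startedAt;
  const reason: WatchdogReason =
    transportSilentMs > messageTimeoutMs ? "transport-inactive" : "app-silent";
  return {
    reason,
    minutesSinceTransportActivity: Math.floor(transportSilentMs / 60000),
    minutesSinceAppActivity: Math.floor((now - appBaselineAt) / 60000),
  };
}
```

For example, with a 30-minute timeout, a snapshot whose last transport frame arrived one second ago but whose connection started two hours ago with no inbound message would be classified as `app-silent`.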
||||
@@ -40,8 +40,10 @@ export type WhatsAppLiveConnection = {
|
||||
heartbeat: TimerHandle | null;
|
||||
watchdogTimer: TimerHandle | null;
|
||||
lastInboundAt: number | null;
|
||||
lastTransportActivityAt: number;
|
||||
handledMessages: number;
|
||||
unregisterUnhandled: (() => void) | null;
|
||||
unregisterTransportActivity: (() => void) | null;
|
||||
backgroundTasks: Set<Promise<unknown>>;
|
||||
closePromise: Promise<WebListenerCloseReason>;
|
||||
resolveClose: (reason: WebListenerCloseReason) => void;
|
||||
@@ -51,6 +53,7 @@ export type WhatsAppConnectionSnapshot = {
|
||||
connectionId: string;
|
||||
startedAt: number;
|
||||
lastInboundAt: number | null;
|
||||
lastTransportActivityAt: number;
|
||||
handledMessages: number;
|
||||
reconnectAttempts: number;
|
||||
uptimeMs: number;
|
||||
@@ -83,6 +86,12 @@ function createNeverResolvePromise<T>(): Promise<T> {
|
||||
return new Promise<T>(() => {});
|
||||
}
|
||||
|
||||
type SocketActivityEmitter = {
|
||||
on?: (event: string, listener: (...args: unknown[]) => void) => void;
|
||||
off?: (event: string, listener: (...args: unknown[]) => void) => void;
|
||||
removeListener?: (event: string, listener: (...args: unknown[]) => void) => void;
|
||||
};
|
||||
|
||||
function createLiveConnection(params: {
|
||||
connectionId: string;
|
||||
sock: WASocket;
|
||||
@@ -108,8 +117,10 @@ function createLiveConnection(params: {
|
||||
heartbeat: null,
|
||||
watchdogTimer: null,
|
||||
lastInboundAt: null,
|
||||
lastTransportActivityAt: Date.now(),
|
||||
handledMessages: 0,
|
||||
unregisterUnhandled: null,
|
||||
unregisterTransportActivity: null,
|
||||
backgroundTasks: new Set<Promise<unknown>>(),
|
||||
closePromise,
|
||||
resolveClose: resolveClosePromise,
|
||||
@@ -232,6 +243,7 @@ export class WhatsAppConnectionController {
|
||||
private readonly heartbeatSeconds: number;
|
||||
private readonly keepAlive: boolean;
|
||||
private readonly messageTimeoutMs: number;
|
||||
private readonly appSilenceTimeoutMs: number;
|
||||
private readonly watchdogCheckMs: number;
|
||||
private readonly verbose: boolean;
|
||||
private readonly abortSignal?: AbortSignal;
|
||||
@@ -262,6 +274,7 @@ export class WhatsAppConnectionController {
|
||||
this.keepAlive = params.keepAlive;
|
||||
this.heartbeatSeconds = params.heartbeatSeconds;
|
||||
this.messageTimeoutMs = params.messageTimeoutMs;
|
||||
this.appSilenceTimeoutMs = Math.max(params.messageTimeoutMs, params.messageTimeoutMs * 4);
|
||||
this.watchdogCheckMs = params.watchdogCheckMs;
|
||||
this.reconnectPolicy = params.reconnectPolicy;
|
||||
this.abortSignal = params.abortSignal;
|
||||
@@ -311,6 +324,14 @@ export class WhatsAppConnectionController {
|
||||
}
|
||||
this.current.handledMessages += 1;
|
||||
this.current.lastInboundAt = timestamp;
|
||||
this.current.lastTransportActivityAt = timestamp;
|
||||
}
|
||||
|
||||
noteTransportActivity(timestamp = Date.now()): void {
|
||||
if (!this.current) {
|
||||
return;
|
||||
}
|
||||
this.current.lastTransportActivityAt = timestamp;
|
||||
}
|
||||
|
||||
getCurrentSnapshot(
|
||||
@@ -323,6 +344,7 @@ export class WhatsAppConnectionController {
|
||||
connectionId: connection.connectionId,
|
||||
startedAt: connection.startedAt,
|
||||
lastInboundAt: connection.lastInboundAt,
|
||||
lastTransportActivityAt: connection.lastTransportActivityAt,
|
||||
handledMessages: connection.handledMessages,
|
||||
reconnectAttempts: this.reconnectAttempts,
|
||||
uptimeMs: Date.now() - connection.startedAt,
|
||||
@@ -369,6 +391,7 @@ export class WhatsAppConnectionController {
|
||||
const listener = await params.createListener({ sock, connection });
|
||||
connection.listener = listener;
|
||||
this.current = connection;
|
||||
connection.unregisterTransportActivity = this.attachTransportActivityListener(sock);
|
||||
registerWhatsAppConnectionController(this.accountId, this);
|
||||
this.startTimers(connection, {
|
||||
onHeartbeat: params.onHeartbeat,
|
||||
@@ -383,6 +406,7 @@ export class WhatsAppConnectionController {
|
||||
if (connection?.unregisterUnhandled) {
|
||||
connection.unregisterUnhandled();
|
||||
}
|
||||
connection?.unregisterTransportActivity?.();
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
@@ -515,6 +539,7 @@ export class WhatsAppConnectionController {
|
||||
this.socketRef.current = null;
|
||||
}
|
||||
connection.unregisterUnhandled?.();
|
||||
connection.unregisterTransportActivity?.();
|
||||
if (connection.heartbeat) {
|
||||
clearInterval(connection.heartbeat);
|
||||
}
|
||||
@@ -563,9 +588,14 @@ export class WhatsAppConnectionController {
|
||||
}, this.heartbeatSeconds * 1000);
|
||||
|
||||
connection.watchdogTimer = setInterval(() => {
|
||||
-const baselineAt = connection.lastInboundAt ?? connection.startedAt;
-const staleForMs = Date.now() - baselineAt;
-if (staleForMs <= this.messageTimeoutMs) {
+const now = Date.now();
+const transportStaleForMs = now - connection.lastTransportActivityAt;
+const appBaselineAt = connection.lastInboundAt ?? connection.startedAt;
+const appSilentForMs = now - appBaselineAt;
+if (
+transportStaleForMs <= this.messageTimeoutMs &&
+appSilentForMs <= this.appSilenceTimeoutMs
+) {
|
||||
return;
|
||||
}
|
||||
const snapshot = this.getCurrentSnapshot(connection);
|
||||
@@ -581,6 +611,24 @@ export class WhatsAppConnectionController {
|
||||
}, this.watchdogCheckMs);
|
||||
}
|
||||
|
||||
private attachTransportActivityListener(sock: WASocket): (() => void) | null {
|
||||
const ws = sock.ws as SocketActivityEmitter | undefined;
|
||||
if (!ws || typeof ws.on !== "function") {
|
||||
return null;
|
||||
}
|
||||
|
||||
const noteActivity = () => this.noteTransportActivity();
|
||||
ws.on("frame", noteActivity);
|
||||
|
||||
return () => {
|
||||
if (typeof ws.off === "function") {
|
||||
ws.off("frame", noteActivity);
|
||||
return;
|
||||
}
|
||||
ws.removeListener?.("frame", noteActivity);
|
||||
};
|
||||
}
|
||||
|
||||
private stopDisconnectRetries(): void {
|
||||
if (!this.disconnectRetryController.signal.aborted) {
|
||||
this.disconnectRetryController.abort();
|
||||
|
||||
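The watchdog change above separates transport liveness (any WebSocket frame) from application-level silence (no handled inbound message). The following is a minimal standalone sketch of that decision, not the controller's actual code. `ConnectionClock`, `WatchdogVerdict`, and `evaluateWatchdog` are hypothetical names introduced for illustration, and the verdict labels are an assumption: the diff only shows the early `return` taken when both thresholds hold.

```typescript
// Hypothetical, simplified stand-in for the timestamps the controller tracks.
type ConnectionClock = {
  startedAt: number;
  lastInboundAt: number | null;
  lastTransportActivityAt: number;
};

// Assumed labels; the diff does not name the failure modes.
type WatchdogVerdict = "healthy" | "transport-stale" | "app-silent";

function evaluateWatchdog(
  clock: ConnectionClock,
  messageTimeoutMs: number,
  now: number,
): WatchdogVerdict {
  // Same derivation as the constructor in the diff: the app-silence window is
  // four times the message timeout (never smaller than the timeout itself).
  const appSilenceTimeoutMs = Math.max(messageTimeoutMs, messageTimeoutMs * 4);
  const transportStaleForMs = now - clock.lastTransportActivityAt;
  // Before the first inbound message, silence is measured from connection start.
  const appBaselineAt = clock.lastInboundAt ?? clock.startedAt;
  const appSilentForMs = now - appBaselineAt;

  if (transportStaleForMs > messageTimeoutMs) {
    return "transport-stale";
  }
  if (appSilentForMs > appSilenceTimeoutMs) {
    return "app-silent";
  }
  return "healthy";
}
```

Under this reading, a connection that keeps receiving frames but no messages is tolerated for the longer app-silence window, while a socket with no frames at all is flagged after the plain message timeout.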
21
package.json
@@ -37,14 +37,20 @@
|
||||
"!dist/extensions/qa-channel/**",
|
||||
"!dist/extensions/qa-lab/**",
|
||||
"!dist/extensions/qa-matrix/**",
|
||||
"!dist/plugin-sdk/extensions/qa-channel/**",
|
||||
"!dist/plugin-sdk/extensions/qa-lab/**",
|
||||
"!dist/plugin-sdk/qa-channel.*",
|
||||
"!dist/plugin-sdk/qa-channel-protocol.*",
|
||||
"!dist/plugin-sdk/qa-lab.*",
|
||||
"!dist/plugin-sdk/qa-runtime.*",
|
||||
"!dist/plugin-sdk/src/plugin-sdk/qa-channel.d.ts",
|
||||
"!dist/plugin-sdk/src/plugin-sdk/qa-channel-protocol.d.ts",
|
||||
"!dist/plugin-sdk/src/plugin-sdk/qa-lab.d.ts",
|
||||
"!dist/plugin-sdk/src/plugin-sdk/qa-runtime.d.ts",
|
||||
"!dist/qa-runtime-*.js",
|
||||
"docs/",
|
||||
"!docs/.generated/**",
|
||||
"!docs/channels/qa-channel.md",
|
||||
"patches/",
|
||||
"skills/",
|
||||
"scripts/npm-runner.mjs",
|
||||
@@ -1044,14 +1050,6 @@
|
||||
"types": "./dist/plugin-sdk/nostr.d.ts",
|
||||
"default": "./dist/plugin-sdk/nostr.js"
|
||||
},
|
||||
"./plugin-sdk/qa-channel": {
|
||||
"types": "./dist/plugin-sdk/qa-channel.d.ts",
|
||||
"default": "./dist/plugin-sdk/qa-channel.js"
|
||||
},
|
||||
"./plugin-sdk/qa-channel-protocol": {
|
||||
"types": "./dist/plugin-sdk/qa-channel-protocol.d.ts",
|
||||
"default": "./dist/plugin-sdk/qa-channel-protocol.js"
|
||||
},
|
||||
"./plugin-sdk/provider-auth": {
|
||||
"types": "./dist/plugin-sdk/provider-auth.d.ts",
|
||||
"default": "./dist/plugin-sdk/provider-auth.js"
|
||||
@@ -1335,6 +1333,7 @@
|
||||
"check:timed": "node scripts/check-timed.mjs",
|
||||
"check:timed:all-types": "node scripts/check-timed.mjs --include-test-types",
|
||||
"check:timed:architecture": "node scripts/check-timed.mjs --include-architecture",
|
||||
"check:workflows": "node scripts/check-workflows.mjs",
|
||||
"ci:timings": "node scripts/ci-run-timings.mjs --latest-main",
|
||||
"ci:timings:recent": "node scripts/ci-run-timings.mjs --recent 10",
|
||||
"codex-app-server:protocol:check": "node --import tsx scripts/check-codex-app-server-protocol.ts",
|
||||
@@ -1400,6 +1399,7 @@
|
||||
"lint:auth:no-pairing-store-group": "node scripts/check-no-pairing-store-group-auth.mjs",
|
||||
"lint:auth:pairing-account-scope": "node scripts/check-pairing-account-scope.mjs",
|
||||
"lint:core": "node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.core.json src ui packages",
|
||||
"lint:docker-e2e": "node scripts/check-docker-e2e-boundaries.mjs",
|
||||
"lint:docs": "pnpm dlx markdownlint-cli2",
|
||||
"lint:docs:fix": "pnpm dlx markdownlint-cli2 --fix",
|
||||
"lint:extensions": "node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.extensions.json extensions",
|
||||
@@ -1415,7 +1415,7 @@
|
||||
"lint:plugins:no-monolithic-plugin-sdk-entry-imports": "node --import tsx scripts/check-no-monolithic-plugin-sdk-entry-imports.ts",
|
||||
"lint:plugins:no-register-http-handler": "node scripts/check-no-register-http-handler.mjs",
|
||||
"lint:plugins:plugin-sdk-subpaths-exported": "node scripts/check-plugin-sdk-subpath-exports.mjs",
-"lint:scripts": "node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.scripts.json scripts",
+"lint:scripts": "pnpm lint:docker-e2e && node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.scripts.json scripts",
|
||||
"lint:swift": "swiftlint lint --config .swiftlint.yml && (cd apps/ios && swiftlint lint --config .swiftlint.yml)",
|
||||
"lint:tmp:channel-agnostic-boundaries": "node scripts/check-channel-agnostic-boundaries.mjs",
|
||||
"lint:tmp:dynamic-import-warts": "node scripts/check-dynamic-import-warts.mjs",
|
||||
@@ -1478,7 +1478,6 @@
|
||||
"test:build:singleton": "node scripts/test-built-plugin-singleton.mjs",
|
||||
"test:bundled": "node scripts/run-vitest.mjs run --config test/vitest/vitest.bundled.config.ts",
|
||||
"test:changed": "node scripts/test-projects.mjs --changed origin/main",
|
||||
"test:changed:focused": "OPENCLAW_TEST_CHANGED_FOCUSED=1 node scripts/test-projects.mjs --changed origin/main",
|
||||
"test:changed:max": "OPENCLAW_VITEST_MAX_WORKERS=8 node scripts/test-projects.mjs --changed origin/main",
|
||||
"test:channels": "node scripts/run-vitest.mjs run --config test/vitest/vitest.channels.config.ts",
|
||||
"test:contracts": "pnpm test:contracts:channels && pnpm test:contracts:plugins",
|
||||
@@ -1541,7 +1540,9 @@
|
||||
"test:docker:plugin-update": "bash scripts/e2e/plugin-update-unchanged-docker.sh",
|
||||
"test:docker:plugins": "bash scripts/e2e/plugins-docker.sh",
|
||||
"test:docker:qr": "bash scripts/e2e/qr-import-docker.sh",
|
||||
"test:docker:rerun": "node scripts/docker-e2e-rerun.mjs",
|
||||
"test:docker:session-runtime-context": "bash scripts/e2e/session-runtime-context-docker.sh",
|
||||
"test:docker:timings": "node scripts/docker-e2e-timings.mjs",
|
||||
"test:docker:update-channel-switch": "bash scripts/e2e/update-channel-switch-docker.sh",
|
||||
"test:e2e": "node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts",
|
||||
"test:e2e:openshell": "OPENCLAW_E2E_OPENSHELL=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts extensions/openshell/src/backend.e2e.test.ts",
|
||||
|
||||
156
qa/scenarios/runtime/docker-prometheus-smoke.md
Normal file
@@ -0,0 +1,156 @@
|
||||
# Docker Prometheus smoke
|
||||
|
||||
```yaml qa-scenario
|
||||
id: docker-prometheus-smoke
|
||||
title: Docker Prometheus smoke
|
||||
surface: telemetry
|
||||
coverage:
|
||||
primary:
|
||||
- telemetry.prometheus
|
||||
secondary:
|
||||
- harness.qa-lab
|
||||
- docker.e2e
|
||||
objective: Verify a QA-lab gateway run emits protected, bounded Prometheus diagnostics metrics through the diagnostics-prometheus plugin.
|
||||
successCriteria:
|
||||
- The diagnostics-prometheus plugin exposes the protected scrape route.
|
||||
- An unauthenticated scrape is rejected.
|
||||
- A minimal QA-channel agent turn completes.
|
||||
- The authenticated scrape includes release-critical diagnostics metric families.
|
||||
- Prometheus output omits prompt content, session keys, auth tokens, raw ids, and file paths.
|
||||
plugins:
|
||||
- diagnostics-prometheus
|
||||
gatewayConfigPatch:
|
||||
diagnostics:
|
||||
enabled: true
|
||||
docsRefs:
|
||||
- docs/gateway/prometheus.md
|
||||
- docs/concepts/qa-e2e-automation.md
|
||||
codeRefs:
|
||||
- extensions/diagnostics-prometheus/src/service.ts
|
||||
- src/diagnostics/internal-diagnostics.ts
|
||||
- extensions/qa-lab/src/suite.ts
|
||||
execution:
|
||||
kind: flow
|
||||
summary: Complete a minimal QA-lab turn and scrape the protected Prometheus route.
|
||||
config:
|
||||
prompt: Reply exactly DOCKER-PROMETHEUS-OK. Do not repeat DOCKER-PROMETHEUS-SECRET.
|
||||
secretNeedle: DOCKER-PROMETHEUS-SECRET
|
||||
```
|
||||
|
||||
```yaml qa-flow
|
||||
steps:
|
||||
- name: emits protected low-cardinality prometheus metrics
|
||||
actions:
|
||||
- call: waitForGatewayHealthy
|
||||
args:
|
||||
- ref: env
|
||||
- 60000
|
||||
- call: waitForQaChannelReady
|
||||
args:
|
||||
- ref: env
|
||||
- 60000
|
||||
- call: reset
|
||||
- set: startCursor
|
||||
value:
|
||||
expr: state.getSnapshot().messages.length
|
||||
- call: runAgentPrompt
|
||||
args:
|
||||
- ref: env
|
||||
- sessionKey: agent:qa:docker-prometheus-smoke
|
||||
message:
|
||||
expr: config.prompt
|
||||
timeoutMs:
|
||||
expr: liveTurnTimeoutMs(env, 30000)
|
||||
- call: waitForCondition
|
||||
saveAs: outbound
|
||||
args:
|
||||
- lambda:
|
||||
expr: "state.getSnapshot().messages.slice(startCursor).filter((candidate) => candidate.direction === 'outbound' && candidate.conversation.id === 'qa-operator' && String(candidate.text ?? '').trim().length > 0).at(-1)"
|
||||
- expr: liveTurnTimeoutMs(env, 30000)
|
||||
- expr: "env.providerMode === 'mock-openai' ? 100 : 250"
|
||||
- assert:
|
||||
expr: "String(outbound.text ?? '').trim().length > 0"
|
||||
message: "expected non-empty qa output before scraping metrics"
|
||||
- set: prometheusUrl
|
||||
value:
|
||||
expr: "`${env.gateway.baseUrl}/api/diagnostics/prometheus`"
|
||||
- set: gatewayToken
|
||||
value:
|
||||
expr: "String(env.gateway.token ?? env.gateway.runtimeEnv.OPENCLAW_GATEWAY_TOKEN ?? '')"
|
||||
- assert:
|
||||
expr: "gatewayToken.length > 0"
|
||||
message: "expected QA gateway token to be available for protected scrape"
|
||||
- set: unauthenticatedScrape
|
||||
value:
|
||||
expr: |-
|
||||
(async () => {
|
||||
const response = await fetch(prometheusUrl);
|
||||
await response.text().catch(() => "");
|
||||
return { status: response.status };
|
||||
})()
|
||||
- assert:
|
||||
expr: "unauthenticatedScrape.status === 401 || unauthenticatedScrape.status === 403"
|
||||
message:
|
||||
expr: "`expected unauthenticated prometheus scrape to be rejected, got ${unauthenticatedScrape.status}`"
|
||||
- set: authenticatedScrape
|
||||
value:
|
||||
expr: |-
|
||||
(async () => {
|
||||
const response = await fetch(prometheusUrl, {
|
||||
headers: { authorization: `Bearer ${gatewayToken}` },
|
||||
});
|
||||
const text = await response.text();
|
||||
return {
|
||||
status: response.status,
|
||||
contentType: response.headers.get("content-type") ?? "",
|
||||
text,
|
||||
};
|
||||
})()
|
||||
- assert:
|
||||
expr: "authenticatedScrape.status === 200"
|
||||
message:
|
||||
expr: "`expected authenticated prometheus scrape to return 200, got ${authenticatedScrape.status}`"
|
||||
- assert:
|
||||
expr: "authenticatedScrape.contentType.includes('text/plain')"
|
||||
message:
|
||||
expr: "`expected prometheus text content type, got ${authenticatedScrape.contentType}`"
|
||||
- set: prometheusText
|
||||
value:
|
||||
expr: "String(authenticatedScrape.text ?? '')"
|
||||
- assert:
|
||||
expr: "prometheusText.includes('# TYPE openclaw_run_completed_total counter')"
|
||||
message: "missing run completion counter"
|
||||
- assert:
|
||||
expr: "prometheusText.includes('# TYPE openclaw_run_duration_seconds histogram')"
|
||||
message: "missing run duration histogram"
|
||||
- assert:
|
||||
expr: "prometheusText.includes('# TYPE openclaw_model_call_total counter')"
|
||||
message: "missing model call counter"
|
||||
- assert:
|
||||
expr: "prometheusText.includes('# TYPE openclaw_harness_run_total counter')"
|
||||
message: "missing harness run counter"
|
||||
- assert:
|
||||
expr: "!prometheusText.includes(config.secretNeedle)"
|
||||
message: "prometheus output leaked prompt sentinel"
|
||||
- assert:
|
||||
expr: "!prometheusText.includes('DOCKER-PROMETHEUS-OK')"
|
||||
message: "prometheus output leaked response content"
|
||||
- assert:
|
||||
expr: "!prometheusText.includes('agent:qa:docker-prometheus-smoke')"
|
||||
message: "prometheus output leaked the session key"
|
||||
- assert:
|
||||
expr: "!prometheusText.includes(gatewayToken)"
|
||||
message: "prometheus output leaked the gateway token"
|
||||
- assert:
|
||||
expr: "!/runId|sessionId|sessionKey|callId|toolCallId|messageId|providerRequestId/.test(prometheusText)"
|
||||
message: "prometheus output leaked raw diagnostic identifiers"
|
||||
- assert:
|
||||
expr: "!/\\/tmp\\/|\\/private\\/tmp\\/|\\/app\\//.test(prometheusText)"
|
||||
message: "prometheus output leaked a local file path"
|
||||
- assert:
|
||||
expr: "!prometheusText.includes('openclaw.content.')"
|
||||
message: "prometheus output leaked content attributes"
|
||||
- assert:
|
||||
expr: "!/openclaw_prometheus_series_dropped_total(?:\\{[^}]*\\})?\\s+(?!0(?:\\.0+)?(?:\\s|$))/.test(prometheusText)"
|
||||
message: "prometheus dropped series during the smoke"
|
||||
```
|
||||
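The assertions in the flow above amount to three checks on the scrape body: required metric families are present, sentinel values are absent, and no series were dropped. Below is a rough TypeScript sketch of those checks outside the QA-flow DSL. `ForbiddenNeedle` and `findScrapeProblems` are hypothetical names, not part of the scenario or the plugin. One deliberate difference: the scenario's dropped-series regex is not anchored to sample lines, so it would also match a `# TYPE` or `# HELP` line for that family if the exporter emits one (this is not visible in the diff). The sketch therefore skips comment lines, which is an assumption about the desired behavior.

```typescript
// Hypothetical shape for a value that must never appear in the scrape body.
type ForbiddenNeedle = { label: string; needle: string };

// Matches a sample line for the dropped-series counter and captures its value.
const DROPPED_SERIES_SAMPLE =
  /^openclaw_prometheus_series_dropped_total(?:\{[^}]*\})?\s+(\S+)/;

function findScrapeProblems(
  body: string,
  requiredTypeLines: string[],
  forbidden: ForbiddenNeedle[],
): string[] {
  const problems: string[] = [];

  // 1. Every required "# TYPE ..." line must be present.
  for (const typeLine of requiredTypeLines) {
    if (!body.includes(typeLine)) {
      problems.push(`missing metric family: ${typeLine}`);
    }
  }

  // 2. No sentinel may leak; report the label only, never the value itself.
  for (const entry of forbidden) {
    if (entry.needle.length > 0 && body.includes(entry.needle)) {
      problems.push(`leaked ${entry.label}`);
    }
  }

  // 3. Dropped-series samples must be zero; comment lines are ignored.
  for (const rawLine of body.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("#")) {
      continue;
    }
    const match = DROPPED_SERIES_SAMPLE.exec(line);
    if (match && Number(match[1]) !== 0) {
      problems.push(`dropped series reported: ${line}`);
    }
  }

  return problems;
}
```

Returning a list of problems instead of failing on the first one makes a single scrape report every violation at once, which can shorten debugging of a failed smoke run.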
@@ -67,7 +67,7 @@ export function createEmptyChangedLanes() {
|
||||
|
||||
/**
|
||||
* @param {string[]} changedPaths
|
||||
-* @param {{ packageJsonChangeKind?: "liveDockerTooling" | null }} [options]
+* @param {{ packageJsonChangeKind?: "liveDockerTooling" | "tooling" | null }} [options]
|
||||
* @returns {ChangedLaneResult}
|
||||
*/
|
||||
export function detectChangedLanes(changedPaths, options = {}) {
|
||||
@@ -80,6 +80,8 @@ export function detectChangedLanes(changedPaths, options = {}) {
|
||||
let hasNonDocs = false;
|
||||
const packageJsonIsLiveDockerTooling =
|
||||
paths.includes("package.json") && options.packageJsonChangeKind === "liveDockerTooling";
|
||||
const packageJsonIsTooling =
|
||||
paths.includes("package.json") && options.packageJsonChangeKind === "tooling";
|
||||
|
||||
if (paths.length === 0) {
|
||||
reasons.push("no changed paths");
|
||||
@@ -88,6 +90,7 @@ export function detectChangedLanes(changedPaths, options = {}) {
|
||||
|
||||
if (
|
||||
!packageJsonIsLiveDockerTooling &&
|
||||
!packageJsonIsTooling &&
|
||||
paths.some((changedPath) => RELEASE_METADATA_PATHS.has(changedPath)) &&
|
||||
paths.every(
|
||||
(changedPath) => RELEASE_METADATA_PATHS.has(changedPath) || DOCS_PATH_RE.test(changedPath),
|
||||
@@ -115,6 +118,12 @@ export function detectChangedLanes(changedPaths, options = {}) {
|
||||
continue;
|
||||
}
|
||||
|
||||
if (changedPath === "package.json" && packageJsonIsTooling) {
|
||||
lanes.tooling = true;
|
||||
reasons.push(`${changedPath}: package scripts`);
|
||||
continue;
|
||||
}
|
||||
|
||||
if (LIVE_DOCKER_TOOLING_PATH_RE.test(changedPath)) {
|
||||
lanes.liveDockerTooling = true;
|
||||
reasons.push(`${changedPath}: live Docker tooling surface`);
|
||||
@@ -195,39 +204,57 @@ export function detectChangedLanes(changedPaths, options = {}) {
|
||||
}
|
||||
|
||||
/**
|
||||
-* @param {{ base: string; head?: string; includeWorktree?: boolean }} params
+* @param {{ paths: string[]; base: string; head?: string; staged?: boolean }} params
+* @returns {ChangedLaneResult}
+*/
+export function detectChangedLanesForPaths(params) {
+const packageJsonChangeKind = params.paths.includes("package.json")
+? classifyPackageJsonChangeFromGit({
+base: params.base,
+head: params.head,
+staged: params.staged,
+})
+: null;
+return detectChangedLanes(params.paths, { packageJsonChangeKind });
+}
+
+/**
+* @param {{ base: string; head?: string; includeWorktree?: boolean; cwd?: string }} params
|
||||
* @returns {string[]}
|
||||
*/
|
||||
export function listChangedPathsFromGit(params) {
|
||||
const base = params.base;
|
||||
const head = params.head ?? "HEAD";
|
||||
const cwd = params.cwd ?? process.cwd();
|
||||
if (!base) {
|
||||
return [];
|
||||
}
|
||||
-const rangePaths = runGitNameOnlyDiff([`${base}...${head}`]);
+const rangePaths = runGitNameOnlyDiff([`${base}...${head}`], cwd);
|
||||
if (params.includeWorktree === false) {
|
||||
return rangePaths;
|
||||
}
|
||||
return [
|
||||
...new Set([
|
||||
...rangePaths,
|
||||
-...runGitNameOnlyDiff(["--cached", "--diff-filter=ACMR"]),
-...runGitNameOnlyDiff(["--diff-filter=ACMR"]),
-...runGitLsFiles(["--others", "--exclude-standard"]),
+...runGitNameOnlyDiff(["--cached", "--diff-filter=ACMR"], cwd),
+...runGitNameOnlyDiff(["--diff-filter=ACMR"], cwd),
+...runGitLsFiles(["--others", "--exclude-standard"], cwd),
|
||||
]),
|
||||
].toSorted((left, right) => left.localeCompare(right));
|
||||
}
|
||||
|
||||
-function runGitNameOnlyDiff(extraArgs) {
+function runGitNameOnlyDiff(extraArgs, cwd = process.cwd()) {
|
||||
const output = execFileSync("git", ["diff", "--name-only", ...extraArgs], {
|
||||
cwd,
|
||||
stdio: ["ignore", "pipe", "pipe"],
|
||||
encoding: "utf8",
|
||||
});
|
||||
return output.split("\n").map(normalizeChangedPath).filter(Boolean);
|
||||
}
|
||||
|
||||
-function runGitLsFiles(extraArgs) {
+function runGitLsFiles(extraArgs, cwd = process.cwd()) {
|
||||
const output = execFileSync("git", ["ls-files", ...extraArgs], {
|
||||
cwd,
|
||||
stdio: ["ignore", "pipe", "pipe"],
|
||||
encoding: "utf8",
|
||||
});
|
||||
@@ -245,7 +272,10 @@ export function listStagedChangedPaths() {
|
||||
export function classifyPackageJsonChangeFromGit(params) {
|
||||
try {
|
||||
const { before, after } = readPackageJsonBeforeAfter(params);
|
||||
-return isLiveDockerPackageScriptOnlyChange(before, after) ? "liveDockerTooling" : null;
+if (isLiveDockerPackageScriptOnlyChange(before, after)) {
+return "liveDockerTooling";
+}
+return isPackageScriptOnlyChange(before, after) ? "tooling" : null;
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
@@ -265,6 +295,20 @@ export function isLiveDockerPackageScriptOnlyChange(before, after) {
|
||||
);
|
||||
}
|
||||
|
||||
export function isPackageScriptOnlyChange(before, after) {
|
||||
const beforePackage = JSON.parse(before);
|
||||
const afterPackage = JSON.parse(after);
|
||||
const beforeScripts = extractPackageScripts(beforePackage);
|
||||
const afterScripts = extractPackageScripts(afterPackage);
|
||||
const beforeStripped = stripPackageScripts(beforePackage);
|
||||
const afterStripped = stripPackageScripts(afterPackage);
|
||||
|
||||
return (
|
||||
stableJson(beforeStripped) === stableJson(afterStripped) &&
|
||||
stableJson(beforeScripts) !== stableJson(afterScripts)
|
||||
);
|
||||
}
|
||||
|
||||
function readPackageJsonBeforeAfter(params) {
|
||||
const before = readGitText(params.staged ? "HEAD" : params.base, "package.json");
|
||||
if (params.staged) {
|
||||
@@ -317,6 +361,17 @@ function stripLiveDockerPackageScripts(packageJson) {
|
||||
return clone;
|
||||
}
|
||||
|
||||
function extractPackageScripts(packageJson) {
|
||||
const scripts = packageJson?.scripts;
|
||||
return scripts && typeof scripts === "object" && !Array.isArray(scripts) ? scripts : {};
|
||||
}
|
||||
|
||||
function stripPackageScripts(packageJson) {
|
||||
const clone = JSON.parse(JSON.stringify(packageJson));
|
||||
delete clone.scripts;
|
||||
return clone;
|
||||
}
|
||||
|
||||
function stableJson(value) {
|
||||
if (Array.isArray(value)) {
|
||||
return `[${value.map(stableJson).join(",")}]`;
|
||||
@@ -418,14 +473,12 @@ if (isDirectRun()) {
|
||||
: args.staged
|
||||
? listStagedChangedPaths()
|
||||
: listChangedPathsFromGit({ base: args.base, head: args.head });
|
||||
-const packageJsonChangeKind = paths.includes("package.json")
-? classifyPackageJsonChangeFromGit({
-base: args.base,
-head: args.head,
-staged: args.staged,
-})
-: null;
-const result = detectChangedLanes(paths, { packageJsonChangeKind });
+const result = detectChangedLanesForPaths({
+paths,
+base: args.base,
+head: args.head,
+staged: args.staged,
+});
|
||||
if (args.githubOutput) {
|
||||
writeChangedLaneGitHubOutput(result);
|
||||
}
|
||||
|
||||
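The new `isPackageScriptOnlyChange` check above classifies a `package.json` edit as tooling-only when everything except `scripts` is unchanged and `scripts` itself differs. Below is a minimal TypeScript sketch of that idea under stated assumptions. `Json`, `scriptsOf`, and `isScriptsOnlyChange` are hypothetical names, and the object branch of `stableJson` is an assumption (sorted keys), since the diff only shows its array branch.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Key-order-insensitive serialization. Sorting object keys is an assumption
// about what a "stable" JSON form means here.
function stableJson(value: Json): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableJson(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const fields = keys.map(
      (key) => `${JSON.stringify(key)}:${stableJson(value[key] ?? null)}`,
    );
    return `{${fields.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Treat a missing or non-object "scripts" field as an empty script map.
function scriptsOf(pkg: { [key: string]: Json }): Json {
  const scripts = pkg["scripts"];
  const isPlainObject =
    scripts !== null &&
    scripts !== undefined &&
    typeof scripts === "object" &&
    !Array.isArray(scripts);
  return isPlainObject ? scripts : {};
}

function isScriptsOnlyChange(before: string, after: string): boolean {
  const beforePkg = JSON.parse(before) as { [key: string]: Json };
  const afterPkg = JSON.parse(after) as { [key: string]: Json };
  const beforeScripts = scriptsOf(beforePkg);
  const afterScripts = scriptsOf(afterPkg);
  // Shallow copies so deleting "scripts" does not mutate the parsed inputs.
  const beforeRest = { ...beforePkg };
  const afterRest = { ...afterPkg };
  delete beforeRest["scripts"];
  delete afterRest["scripts"];
  return (
    stableJson(beforeRest) === stableJson(afterRest) &&
    stableJson(beforeScripts) !== stableJson(afterScripts)
  );
}
```

Comparing stable serializations instead of raw text means reordered keys outside `scripts` do not count as a change, so a purely cosmetic reshuffle would not force the broader lanes.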
@@ -1,7 +1,6 @@
|
||||
import { performance } from "node:perf_hooks";
|
||||
import {
|
||||
-classifyPackageJsonChangeFromGit,
-detectChangedLanes,
+detectChangedLanesForPaths,
|
||||
listChangedPathsFromGit,
|
||||
listStagedChangedPaths,
|
||||
normalizeChangedPath,
|
||||
@@ -14,12 +13,7 @@ import {
|
||||
} from "./lib/local-heavy-check-runtime.mjs";
|
||||
import { runManagedCommand } from "./lib/managed-child-process.mjs";
|
||||
import { createSparseTsgoSkipEnv } from "./lib/tsgo-sparse-guard.mjs";
|
||||
import { isCiLikeEnv } from "./lib/vitest-local-scheduling.mjs";
|
||||
import { resolveChangedTestTargetPlan } from "./test-projects.test-support.mjs";
|
||||
|
||||
export const CHANGED_CHECK_VITEST_NO_OUTPUT_TIMEOUT_MS = "600000";
|
||||
const VITEST_NO_OUTPUT_TIMEOUT_ENV_KEY = "OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS";
|
||||
const VITEST_NO_OUTPUT_RETRY_ENV_KEY = "OPENCLAW_VITEST_NO_OUTPUT_RETRY";
|
||||
const LIVE_DOCKER_AUTH_SHELL_TARGETS = [
|
||||
"scripts/lib/live-docker-auth.sh",
|
||||
"scripts/test-live-acp-bind-docker.sh",
|
||||
@@ -39,35 +33,6 @@ export function createChangedCheckChildEnv(baseEnv = process.env) {
|
||||
};
|
||||
}
|
||||
|
||||
export function createChangedCheckVitestEnv(baseEnv = process.env) {
|
||||
const resolvedBaseEnv = createChangedCheckChildEnv(baseEnv);
|
||||
const env = {
|
||||
...resolvedBaseEnv,
|
||||
[VITEST_NO_OUTPUT_TIMEOUT_ENV_KEY]:
|
||||
resolvedBaseEnv[VITEST_NO_OUTPUT_TIMEOUT_ENV_KEY]?.trim() ||
|
||||
CHANGED_CHECK_VITEST_NO_OUTPUT_TIMEOUT_MS,
|
||||
[VITEST_NO_OUTPUT_RETRY_ENV_KEY]:
|
||||
resolvedBaseEnv[VITEST_NO_OUTPUT_RETRY_ENV_KEY]?.trim() || "0",
|
||||
};
|
||||
|
||||
const hasWorkerOverride = Boolean(
|
||||
(resolvedBaseEnv.OPENCLAW_VITEST_MAX_WORKERS ?? resolvedBaseEnv.OPENCLAW_TEST_WORKERS)?.trim(),
|
||||
);
|
||||
const hasParallelOverride = Boolean(resolvedBaseEnv.OPENCLAW_TEST_PROJECTS_PARALLEL?.trim());
|
||||
const serialOverride = resolvedBaseEnv.OPENCLAW_TEST_PROJECTS_SERIAL?.trim();
|
||||
if (
|
||||
!isCiLikeEnv(resolvedBaseEnv) &&
|
||||
!hasWorkerOverride &&
|
||||
!hasParallelOverride &&
|
||||
serialOverride !== "0"
|
||||
) {
|
||||
env.OPENCLAW_TEST_PROJECTS_SERIAL = serialOverride || "1";
|
||||
env.OPENCLAW_VITEST_MAX_WORKERS = "1";
|
||||
}
|
||||
|
||||
return env;
|
||||
}
|
||||
|
||||
export function createChangedCheckPlan(result, options = {}) {
|
||||
const commands = [];
|
||||
const baseEnv = createChangedCheckChildEnv(options.env ?? process.env);
|
||||
@@ -93,10 +58,6 @@ export function createChangedCheckPlan(result, options = {}) {
|
||||
if (result.docsOnly) {
|
||||
return {
|
||||
commands,
|
||||
testTargets: [],
|
||||
runChangedTestsBroad: false,
|
||||
runFullTests: false,
|
||||
runExtensionTests: false,
|
||||
summary: "docs-only",
|
||||
};
|
||||
}
|
||||
@@ -118,10 +79,6 @@ export function createChangedCheckPlan(result, options = {}) {
|
||||
add("root dependency ownership", ["deps:root-ownership:check"]);
|
||||
return {
|
||||
commands,
|
||||
testTargets: [],
|
||||
runChangedTestsBroad: false,
|
||||
runFullTests: false,
|
||||
runExtensionTests: false,
|
||||
summary: "release metadata",
|
||||
};
|
||||
}
|
||||
@@ -132,10 +89,6 @@ export function createChangedCheckPlan(result, options = {}) {
|
||||
add("runtime import cycles", ["check:import-cycles"]);
|
||||
return {
|
||||
commands,
|
||||
testTargets: [],
|
||||
runChangedTestsBroad: false,
|
||||
runFullTests: true,
|
||||
runExtensionTests: false,
|
||||
summary: "all",
|
||||
};
|
||||
}
|
||||
@@ -189,26 +142,10 @@ export function createChangedCheckPlan(result, options = {}) {
|
||||
OPENCLAW_DOCKER_ALL_DRY_RUN: "1",
|
||||
OPENCLAW_DOCKER_ALL_LIVE_MODE: "only",
|
||||
});
|
||||
add(
|
||||
"ACP bind unit tests",
|
||||
["test", "src/gateway/live-agent-probes.test.ts", "src/agents/acp-spawn.test.ts"],
|
||||
createChangedCheckVitestEnv(baseEnv),
|
||||
);
|
||||
add("ACPX extension tests", ["test:extension", "acpx"], createChangedCheckVitestEnv(baseEnv));
|
||||
}
|
||||
|
||||
const testPlan = resolveChangedTestTargetPlan(result.paths);
|
||||
const runExtensionTests = result.extensionImpactFromCore;
|
||||
const testTargets = runExtensionTests
|
||||
? testPlan.targets.filter((target) => target !== "extensions")
|
||||
: testPlan.targets;
|
||||
const runChangedTestsBroad = testPlan.mode === "broad";
|
||||
return {
|
||||
commands,
|
||||
testTargets,
|
||||
runChangedTestsBroad,
|
||||
runFullTests: false,
|
||||
runExtensionTests,
|
||||
summary: Object.entries(lanes)
|
||||
.filter(([, enabled]) => enabled)
|
||||
.map(([lane]) => lane)
|
||||
@@ -244,61 +181,6 @@ export async function runChangedCheck(result, options = {}) {
|
||||
}
|
||||
}
|
||||
|
||||
if (plan.runFullTests) {
|
||||
const status = await runPnpm(
|
||||
{ name: "tests all", args: ["test"], env: createChangedCheckVitestEnv(childEnv) },
|
||||
timings,
|
||||
);
|
||||
if (status !== 0) {
|
||||
printSummary(timings, options);
|
||||
return status;
|
||||
}
|
||||
} else if (plan.runChangedTestsBroad) {
|
||||
const testArgs = options.explicitPaths
|
||||
? ["test"]
|
||||
: ["test", "--changed", options.base ?? "origin/main"];
|
||||
const status = await runPnpm(
|
||||
{
|
||||
name: options.explicitPaths ? "tests all" : "tests changed broad",
|
||||
args: testArgs,
|
||||
env: createChangedCheckVitestEnv(childEnv),
|
||||
},
|
||||
timings,
|
||||
);
|
||||
if (status !== 0) {
|
||||
printSummary(timings, options);
|
||||
return status;
|
||||
}
|
||||
} else if (plan.testTargets.length > 0) {
|
||||
const status = await runPnpm(
|
||||
{
|
||||
name: "tests changed",
|
||||
args: ["test", ...plan.testTargets],
|
||||
env: createChangedCheckVitestEnv(childEnv),
|
||||
},
|
||||
timings,
|
||||
);
|
||||
if (status !== 0) {
|
||||
printSummary(timings, options);
|
||||
return status;
|
||||
}
|
||||
}
|
||||
|
||||
if (plan.runExtensionTests) {
|
||||
const status = await runPnpm(
|
||||
{
|
||||
name: "tests extensions",
|
||||
args: ["test:extensions"],
|
||||
env: createChangedCheckVitestEnv(childEnv),
|
||||
},
|
||||
timings,
|
||||
);
|
||||
if (status !== 0) {
|
||||
printSummary(timings, options);
|
||||
return status;
|
||||
}
|
||||
}
|
||||
|
||||
printSummary(timings, options);
|
||||
return 0;
|
||||
} finally {
|
||||
@@ -314,17 +196,11 @@ function printPlan(result, plan, options) {
|
||||
const prefix = options.dryRun ? "[check:changed:dry-run]" : "[check:changed]";
|
||||
console.error(`${prefix} lanes=${plan.summary || "none"}`);
|
||||
if (result.extensionImpactFromCore) {
|
||||
-console.error(`${prefix} core contract changed; extension tests included`);
-}
-if (plan.runChangedTestsBroad) {
-console.error(`${prefix} broad changed tests included`);
+console.error(`${prefix} extension-impacting surface; extension typecheck included`);
|
||||
}
|
||||
for (const reason of result.reasons) {
|
||||
console.error(`${prefix} ${reason}`);
|
||||
}
|
||||
-if (plan.testTargets.length > 0) {
-console.error(`${prefix} test targets=${plan.testTargets.length}`);
-}
|
||||
}
|
||||
|
||||
async function runPnpm(command, timings) {
|
||||
@@ -408,14 +284,12 @@ if (isDirectRun()) {
|
||||
: args.staged
|
||||
? listStagedChangedPaths()
|
||||
: listChangedPathsFromGit({ base: args.base, head: args.head });
|
||||
-const packageJsonChangeKind = paths.includes("package.json")
-? classifyPackageJsonChangeFromGit({
-base: args.base,
-head: args.head,
-staged: args.staged,
-})
-: null;
-const result = detectChangedLanes(paths, { packageJsonChangeKind });
+const result = detectChangedLanesForPaths({
+paths,
+base: args.base,
+head: args.head,
+staged: args.staged,
+});
|
||||
process.exitCode = await runChangedCheck(result, {
|
||||
...args,
|
||||
explicitPaths: args.paths.length > 0,
|
||||
|
||||
113
scripts/check-docker-e2e-boundaries.mjs
Normal file
@@ -0,0 +1,113 @@
#!/usr/bin/env node
// Cheap guard for Docker E2E test boundaries.
// Docker E2E must test packaged npm tarballs and package-installed images, not
// the source checkout copied or mounted as the app under test.
import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { laneResources, laneWeight } from "./lib/docker-e2e-plan.mjs";
import { allReleasePathLanes, mainLanes, tailLanes } from "./lib/docker-e2e-scenarios.mjs";

const ROOT_DIR = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "..");
const errors = [];
const packageJson = JSON.parse(readText("package.json"));
const packageScripts = new Set(Object.keys(packageJson.scripts ?? {}));

function readText(relativePath) {
  return fs.readFileSync(path.join(ROOT_DIR, relativePath), "utf8");
}

function walk(dir, out = []) {
  for (const entry of fs.readdirSync(path.join(ROOT_DIR, dir), { withFileTypes: true })) {
    const relativePath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      walk(relativePath, out);
    } else {
      out.push(relativePath);
    }
  }
  return out;
}

for (const relativePath of walk("scripts/e2e")) {
  if (!/\.(?:sh|ts|mjs|js)$/u.test(relativePath)) {
    continue;
  }
  const text = readText(relativePath);
  if (/from\s+["']\.\.\/\.\.\/src\//u.test(text) || /import\(["']\.\.\/\.\.\/src\//u.test(text)) {
    errors.push(`${relativePath}: Docker E2E harness must import built dist, not ../../src`);
  }
  if (/-v\s+["']?\$ROOT_DIR:\/app(?::|["'\s]|$)/u.test(text)) {
    errors.push(`${relativePath}: do not mount the repo root as /app in Docker E2E`);
  }
}

const dockerfile = readText("scripts/e2e/Dockerfile");
if (/^\s*(?:COPY|ADD)\s+\.\s+\/app(?:\s|$)/imu.test(dockerfile)) {
  errors.push("scripts/e2e/Dockerfile: do not copy the source checkout into /app");
}

function validateUniqueLanes(label, lanes) {
  const seen = new Set();
  for (const lane of lanes) {
    if (seen.has(lane.name)) {
      errors.push(`${label}: duplicate Docker E2E lane '${lane.name}'`);
    }
    seen.add(lane.name);
  }
}

function validateLane(label, lane) {
  if (!lane.name || typeof lane.name !== "string") {
    errors.push(`${label}: Docker E2E lane is missing a string name`);
  }
  if (!lane.command || typeof lane.command !== "string") {
    errors.push(`${label}: Docker E2E lane '${lane.name}' is missing a string command`);
    return;
  }
  if (lane.e2eImageKind && lane.e2eImageKind !== "bare" && lane.e2eImageKind !== "functional") {
    errors.push(
      `${label}: Docker E2E lane '${lane.name}' has invalid image kind '${lane.e2eImageKind}'`,
    );
  }
  if (lane.live && lane.e2eImageKind) {
    errors.push(`${label}: live Docker E2E lane '${lane.name}' must not require a package image`);
  }
  if (!lane.live && !lane.e2eImageKind) {
    errors.push(`${label}: package Docker E2E lane '${lane.name}' must declare an e2e image kind`);
  }
  if (laneWeight(lane) < 1) {
    errors.push(`${label}: Docker E2E lane '${lane.name}' must have positive weight`);
  }
  if (!laneResources(lane).includes("docker")) {
    errors.push(`${label}: Docker E2E lane '${lane.name}' must include the docker resource`);
  }

  for (const match of lane.command.matchAll(/\bpnpm\s+([^\s]+)/gu)) {
    const script = match[1];
    if (!packageScripts.has(script)) {
      errors.push(
        `${label}: Docker E2E lane '${lane.name}' references missing package script '${script}'`,
      );
    }
  }
}

const releasePathLanes = allReleasePathLanes({ includeOpenWebUI: true });
for (const [label, lanes] of [
  ["release-path", releasePathLanes],
  ["main", mainLanes],
  ["tail", tailLanes],
]) {
  validateUniqueLanes(label, lanes);
  for (const lane of lanes) {
    validateLane(label, lane);
  }
}

if (errors.length > 0) {
  console.error(errors.join("\n"));
  process.exit(1);
}

console.log("Docker E2E package boundary/catalog guard passed.");
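The text guards in `scripts/check-docker-e2e-boundaries.mjs` are plain regular expressions, so they can be exercised against sample strings without Docker or a checkout. The sketch below copies the four patterns from the script; the `findBoundaryViolations` helper, its return labels, and the sample inputs are assumptions made for illustration and are not part of the change. Unlike the script, which applies the `COPY`/`ADD` pattern only to `scripts/e2e/Dockerfile`, this sketch runs every pattern against every input.

```javascript
// Patterns copied from scripts/check-docker-e2e-boundaries.mjs.
const SRC_IMPORT = /from\s+["']\.\.\/\.\.\/src\//u;
const SRC_DYNAMIC_IMPORT = /import\(["']\.\.\/\.\.\/src\//u;
const ROOT_MOUNT = /-v\s+["']?\$ROOT_DIR:\/app(?::|["'\s]|$)/u;
const COPY_CHECKOUT = /^\s*(?:COPY|ADD)\s+\.\s+\/app(?:\s|$)/imu;

// Hypothetical helper: returns a label for each guard that the text trips.
function findBoundaryViolations(text) {
  const violations = [];
  if (SRC_IMPORT.test(text) || SRC_DYNAMIC_IMPORT.test(text)) {
    violations.push("src-import");
  }
  if (ROOT_MOUNT.test(text)) {
    violations.push("root-mount");
  }
  if (COPY_CHECKOUT.test(text)) {
    violations.push("copy-checkout");
  }
  return violations;
}

// Sample inputs are made up for the example.
console.log(findBoundaryViolations('import { run } from "../../src/cli.ts";'));
console.log(findBoundaryViolations('docker run -v "$ROOT_DIR:/app:ro" image'));
console.log(findBoundaryViolations("FROM node:24\nCOPY . /app\n"));
console.log(findBoundaryViolations('docker run -v "$ROOT_DIR/dist:/app/dist" image'));
```

The last sample passes because the mount pattern only rejects the repo root itself mapped to `/app`; mounting a subdirectory is not matched by this particular expression.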
96
scripts/check-openclaw-package-tarball.mjs
Normal file
@@ -0,0 +1,96 @@
#!/usr/bin/env node
// Validates the npm tarball Docker E2E lanes install.
// This is intentionally tarball-only: the check proves Docker lanes consume the
// prebuilt package artifact with dist inventory, not a source checkout.
import { spawnSync } from "node:child_process";
import fs from "node:fs";

function usage() {
  return "Usage: node scripts/check-openclaw-package-tarball.mjs <openclaw.tgz>";
}

function fail(message) {
  console.error(message);
  process.exit(1);
}

const tarball = process.argv[2];
if (!tarball || process.argv.length > 3) {
  fail(usage());
}
if (!fs.existsSync(tarball)) {
  fail(`OpenClaw package tarball does not exist: ${tarball}`);
}

const list = spawnSync("tar", ["-tf", tarball], {
  encoding: "utf8",
  stdio: ["ignore", "pipe", "pipe"],
});
if (list.status !== 0) {
  fail(`tar -tf failed for ${tarball}: ${list.stderr || list.status}`);
}

const entries = list.stdout
  .split(/\r?\n/u)
  .map((entry) => entry.trim())
  .filter(Boolean);
const normalized = entries.map((entry) => entry.replace(/^package\//u, ""));
const entrySet = new Set(normalized);
const errors = [];

function readTarEntry(entryPath) {
  const candidates = [entryPath, `package/${entryPath}`];
  for (const candidate of candidates) {
    const result = spawnSync("tar", ["-xOf", tarball, candidate], {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "pipe"],
    });
    if (result.status === 0) {
      return result.stdout;
    }
  }
  return "";
}

for (const entry of normalized) {
  if (entry.startsWith("/") || entry.split("/").includes("..")) {
    errors.push(`unsafe tar entry: ${entry}`);
  }
}

if (!entrySet.has("package.json")) {
  errors.push("missing package.json");
}
if (!normalized.some((entry) => entry.startsWith("dist/"))) {
  errors.push("missing dist/ entries");
}
if (!entrySet.has("dist/postinstall-inventory.json")) {
  errors.push("missing dist/postinstall-inventory.json");
}
if (entrySet.has("dist/postinstall-inventory.json")) {
  try {
    const inventory = JSON.parse(readTarEntry("dist/postinstall-inventory.json"));
    if (!Array.isArray(inventory) || inventory.some((entry) => typeof entry !== "string")) {
      errors.push("invalid dist/postinstall-inventory.json");
    } else {
      for (const inventoryEntry of inventory) {
        const normalizedEntry = inventoryEntry.replace(/\\/gu, "/");
        if (!entrySet.has(normalizedEntry)) {
          errors.push(`inventory references missing tar entry ${normalizedEntry}`);
        }
      }
    }
  } catch (error) {
    errors.push(
      `unreadable dist/postinstall-inventory.json: ${
        error instanceof Error ? error.message : String(error)
      }`,
    );
  }
}

if (errors.length > 0) {
  fail(`OpenClaw package tarball integrity failed:\n${errors.join("\n")}`);
}

console.log("OpenClaw package tarball integrity passed.");
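The entry checks in `scripts/check-openclaw-package-tarball.mjs` depend only on the list of tar entries and the parsed inventory, so the logic can be illustrated without invoking `tar`. The following is a minimal sketch under that assumption: `checkTarballEntries` is a hypothetical helper that takes both inputs as arguments, and the sample entries are invented. It omits the script's checks for a missing or malformed `dist/postinstall-inventory.json`.

```javascript
// Hypothetical helper mirroring the entry checks in
// scripts/check-openclaw-package-tarball.mjs, with the tar listing and the
// parsed inventory passed in instead of being read from a tarball.
function checkTarballEntries(rawEntries, inventory) {
  // npm tarballs prefix every entry with "package/"; strip it as the script does.
  const normalized = rawEntries.map((entry) => entry.replace(/^package\//u, ""));
  const entrySet = new Set(normalized);
  const errors = [];
  for (const entry of normalized) {
    // Reject absolute paths and any path segment that climbs out of the package.
    if (entry.startsWith("/") || entry.split("/").includes("..")) {
      errors.push(`unsafe tar entry: ${entry}`);
    }
  }
  if (!entrySet.has("package.json")) {
    errors.push("missing package.json");
  }
  if (!normalized.some((entry) => entry.startsWith("dist/"))) {
    errors.push("missing dist/ entries");
  }
  for (const inventoryEntry of inventory) {
    // Inventory paths may use backslashes; compare with forward slashes.
    const normalizedEntry = inventoryEntry.replace(/\\/gu, "/");
    if (!entrySet.has(normalizedEntry)) {
      errors.push(`inventory references missing tar entry ${normalizedEntry}`);
    }
  }
  return errors;
}

// Invented sample data for the example.
const sampleEntries = [
  "package/package.json",
  "package/dist/index.js",
  "package/dist/postinstall-inventory.json",
  "package/../escape.txt",
];
const sampleInventory = ["dist/index.js", "dist\\missing.js"];
console.log(checkTarballEntries(sampleEntries, sampleInventory));
```

With this sample data the helper reports two problems: the `../escape.txt` traversal entry and the inventory reference to `dist/missing.js`, which is not in the listing.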
@@ -30,6 +30,16 @@ function readEntrypoints() {
   return new Set(entrypoints.filter((entry) => entry !== "index"));
 }
 
+function readPrivateLocalOnlySubpaths() {
+  const subpaths = JSON.parse(
+    readFileSync(
+      path.join(repoRoot, "scripts/lib/plugin-sdk-private-local-only-subpaths.json"),
+      "utf8",
+    ),
+  );
+  return new Set(subpaths.filter((entry) => typeof entry === "string" && !entry.includes("/")));
+}
+
 function parsePluginSdkSubpath(specifier) {
   if (!specifier.startsWith("openclaw/plugin-sdk/")) {
     return null;
@@ -51,6 +61,7 @@ function compareEntries(left, right) {
 async function collectViolations() {
   const entrypoints = readEntrypoints();
   const exports = readPackageExports();
+  const privateLocalOnlySubpaths = readPrivateLocalOnlySubpaths();
   const files = (await collectTypeScriptFilesFromRoots(scanRoots, { includeTests: true })).toSorted(
     (left, right) =>
       normalizeRepoPath(repoRoot, left).localeCompare(normalizeRepoPath(repoRoot, right)),
@@ -72,6 +83,9 @@ async function collectViolations() {
     if (!subpath) {
       return;
     }
+    if (privateLocalOnlySubpaths.has(subpath)) {
+      return;
+    }
 
     const missingFrom = [];
     if (!entrypoints.has(subpath)) {
27
scripts/check-workflows.mjs
Normal file
@@ -0,0 +1,27 @@
#!/usr/bin/env node
// Runs local workflow sanity checks.
// Uses an installed actionlint when present, otherwise falls back to `go run`
// for the pinned version used by CI, then runs repo-specific composite guards.
import { spawnSync } from "node:child_process";

const ACTIONLINT_VERSION = "1.7.11";

function commandExists(command) {
  return spawnSync("bash", ["-lc", `command -v ${command}`], { stdio: "ignore" }).status === 0;
}

function run(command, args) {
  const result = spawnSync(command, args, { stdio: "inherit" });
  if (result.status !== 0) {
    process.exit(result.status ?? 1);
  }
}

if (commandExists("actionlint")) {
  run("actionlint", []);
} else {
  run("go", ["run", `github.com/rhysd/actionlint/cmd/actionlint@v${ACTIONLINT_VERSION}`]);
}

run("python3", ["scripts/check-composite-action-input-interpolation.py"]);
run("node", ["scripts/check-no-conflict-markers.mjs"]);
259
scripts/docker-e2e-rerun.mjs
Normal file
@@ -0,0 +1,259 @@
#!/usr/bin/env node
// Builds cheap rerun commands from a Docker E2E GitHub run or local summary.
// For GitHub runs, the script downloads Docker E2E artifacts, reads
// summary/failures JSON, and prints targeted workflow commands that prepare a
// fresh OpenClaw tarball for the same ref before running only failed lanes.
import { spawnSync } from "node:child_process";
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

const DEFAULT_WORKFLOW = "openclaw-live-and-e2e-checks-reusable.yml";

function usage() {
  return [
    "Usage:",
    " node scripts/docker-e2e-rerun.mjs <run-id|summary.json|failures.json> [--repo owner/repo] [--dir output-dir] [--workflow workflow.yml] [--ref ref]",
  ].join("\n");
}

function parseArgs(argv) {
  const options = {
    dir: "",
    input: "",
    ref: "",
    repo: "",
    workflow: DEFAULT_WORKFLOW,
  };
  for (let index = 0; index < argv.length; index += 1) {
    const arg = argv[index];
    if (arg === "--repo") {
      options.repo = argv[(index += 1)] ?? "";
    } else if (arg?.startsWith("--repo=")) {
      options.repo = arg.slice("--repo=".length);
    } else if (arg === "--dir") {
      options.dir = argv[(index += 1)] ?? "";
    } else if (arg?.startsWith("--dir=")) {
      options.dir = arg.slice("--dir=".length);
    } else if (arg === "--workflow") {
      options.workflow = argv[(index += 1)] ?? "";
    } else if (arg?.startsWith("--workflow=")) {
      options.workflow = arg.slice("--workflow=".length);
    } else if (arg === "--ref") {
      options.ref = argv[(index += 1)] ?? "";
    } else if (arg?.startsWith("--ref=")) {
      options.ref = arg.slice("--ref=".length);
    } else if (!options.input) {
      options.input = arg;
    } else {
      throw new Error(`unknown argument: ${arg}\n${usage()}`);
    }
  }
  if (!options.input || !options.workflow) {
    throw new Error(usage());
  }
  return options;
}

function run(command, args, options = {}) {
  const result = spawnSync(command, args, {
    encoding: "utf8",
    stdio: options.stdio ?? ["ignore", "pipe", "pipe"],
  });
  if (result.status !== 0) {
    throw new Error(
      `${command} ${args.join(" ")} failed with ${result.status ?? result.signal}\n${result.stderr}`,
    );
  }
  return result.stdout;
}

function readJson(file) {
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

function shellQuote(value) {
  return `'${String(value).replaceAll("'", "'\\''")}'`;
}

function ghWorkflowCommand(lanes, ref, workflow) {
  return [
    "gh workflow run",
    shellQuote(workflow),
    "-f",
    `ref=${shellQuote(ref)}`,
    "-f",
    "include_repo_e2e=false",
    "-f",
    "include_release_path_suites=false",
    "-f",
    "include_openwebui=false",
    "-f",
    `docker_lanes=${shellQuote(lanes.join(" "))}`,
    "-f",
    "include_live_suites=false",
    "-f",
    "live_models_only=false",
  ].join(" ");
}

function detectRepo() {
  return JSON.parse(run("gh", ["repo", "view", "--json", "nameWithOwner"])).nameWithOwner;
}

function findFiles(rootDir, basenames, out = []) {
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const file = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      findFiles(file, basenames, out);
    } else if (basenames.has(entry.name)) {
      out.push(file);
    }
  }
  return out;
}

function failedLaneEntriesFromJson(file, ref, workflow) {
  const parsed = readJson(file);
  const source = path.basename(file);
  if (source === "failures.json" && Array.isArray(parsed.lanes)) {
    return parsed.lanes
      .filter((lane) => lane.name)
      .map((lane) => ({
        ghWorkflowCommand: lane.ghWorkflowCommand,
        lane: lane.name,
        localRerunCommand: lane.rerunCommand,
        logFile: lane.logFile,
        source: file,
        status: lane.status,
      }));
  }

  const lanes = Array.isArray(parsed.lanes) ? parsed.lanes : [];
  return lanes
    .filter((lane) => lane.status !== 0 && lane.name)
    .map((lane) => ({
      ghWorkflowCommand: ghWorkflowCommand([lane.name], ref, workflow),
      lane: lane.name,
      localRerunCommand: lane.rerunCommand,
      logFile: lane.logFile,
      source: file,
      status: lane.status,
    }));
}

function mergeByLane(entries) {
  const byLane = new Map();
  for (const entry of entries) {
    if (!byLane.has(entry.lane)) {
      byLane.set(entry.lane, entry);
    }
  }
  return [...byLane.values()].toSorted((left, right) => left.lane.localeCompare(right.lane));
}

function downloadDockerArtifacts(runId, repo, outputDir) {
  fs.mkdirSync(outputDir, { recursive: true });
  const artifacts = JSON.parse(
    run("gh", [
      "api",
      `repos/${repo}/actions/runs/${runId}/artifacts?per_page=100`,
      "--jq",
      ".artifacts",
    ]),
  );
  const names = artifacts
    .filter((artifact) => !artifact.expired && artifact.name.startsWith("docker-e2e-"))
    .map((artifact) => artifact.name);
  if (names.length === 0) {
    throw new Error(`No docker-e2e-* artifacts found for run ${runId}`);
  }
  for (const name of names) {
    run(
      "gh",
      ["run", "download", String(runId), "--repo", repo, "--name", name, "--dir", outputDir],
      {
        stdio: "inherit",
      },
    );
  }
  return names;
}

function runInfo(runId, repo) {
  return JSON.parse(
    run("gh", [
      "run",
      "view",
      String(runId),
      "--repo",
      repo,
      "--json",
      "databaseId,headSha,headBranch,status,conclusion,url,workflowName",
    ]),
  );
}

function printEntries(entries, ref, workflow, run) {
  if (run) {
    console.log(`Run: ${run.url}`);
    console.log(`Workflow: ${run.workflowName}`);
  }
  console.log(`Ref: ${ref}`);
  console.log(
    "Targeted GitHub reruns prepare a fresh OpenClaw npm tarball for that ref before lane execution.",
  );
  if (entries.length === 0) {
    console.log("No failed Docker E2E lanes found.");
    return;
  }
  console.log(`Failed lanes: ${entries.map((entry) => entry.lane).join(", ")}`);
  console.log("");
  console.log("Combined GitHub rerun:");
  console.log(
    ghWorkflowCommand(
      entries.map((entry) => entry.lane),
      ref,
      workflow,
    ),
  );
  console.log("");
  console.log("Per-lane GitHub reruns:");
  for (const entry of entries) {
    console.log(
      `- ${entry.lane}: ${entry.ghWorkflowCommand || ghWorkflowCommand([entry.lane], ref, workflow)}`,
    );
  }
  console.log("");
  console.log("Local rerun starting points:");
  for (const entry of entries) {
    if (entry.localRerunCommand) {
      console.log(`- ${entry.lane}: ${entry.localRerunCommand}`);
    }
  }
}

const options = parseArgs(process.argv.slice(2));
const isLocalJson = fs.existsSync(options.input) && fs.statSync(options.input).isFile();
if (isLocalJson) {
  const ref = options.ref || process.env.GITHUB_SHA || "HEAD";
  printEntries(
    mergeByLane(failedLaneEntriesFromJson(options.input, ref, options.workflow)),
    ref,
    options.workflow,
  );
} else {
  const repo = options.repo || detectRepo();
  const run = runInfo(options.input, repo);
  const ref = options.ref || run.headSha || run.headBranch;
  const outputDir =
    options.dir || path.join(os.tmpdir(), `openclaw-docker-e2e-rerun-${options.input}`);
  const artifactNames = downloadDockerArtifacts(options.input, repo, outputDir);
  const files = findFiles(outputDir, new Set(["failures.json", "summary.json"]));
  const entries = mergeByLane(
    files.flatMap((file) => failedLaneEntriesFromJson(file, ref, options.workflow)),
  );
  console.log(`Artifacts: ${artifactNames.join(", ")}`);
  console.log(`Downloaded: ${outputDir}`);
  printEntries(entries, ref, options.workflow, run);
}
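Two pieces of `scripts/docker-e2e-rerun.mjs` are easy to try in isolation: the POSIX single-quote escaping in `shellQuote`, and the rule that a lane counts as failed when its `status` is non-zero. The sketch below copies `shellQuote` verbatim and adds `failedLaneNames`, a hypothetical reduction of `failedLaneEntriesFromJson` plus `mergeByLane` that works on an already-parsed summary object. The lane names and statuses are invented, and the sketch uses `Array.prototype.sort` on a copy rather than `toSorted` only to run on older Node versions.

```javascript
// Copied from scripts/docker-e2e-rerun.mjs: wrap in single quotes and escape
// embedded single quotes as '\'' so the value survives a POSIX shell.
function shellQuote(value) {
  return `'${String(value).replaceAll("'", "'\\''")}'`;
}

// Hypothetical helper: failed lane names from a parsed summary, deduplicated
// and sorted the way mergeByLane orders its entries.
function failedLaneNames(summary) {
  const lanes = Array.isArray(summary.lanes) ? summary.lanes : [];
  const names = lanes
    .filter((lane) => lane.status !== 0 && lane.name)
    .map((lane) => lane.name);
  return [...new Set(names)].sort((left, right) => left.localeCompare(right));
}

// Invented sample summary; real lane names come from the scheduler output.
const sampleSummary = {
  lanes: [
    { name: "lane-b", status: 1 },
    { name: "lane-a", status: 0 },
    { name: "lane-c", status: 124 },
    { name: "lane-b", status: 1 },
  ],
};

const failed = failedLaneNames(sampleSummary);
console.log(failed);
console.log(`docker_lanes=${shellQuote(failed.join(" "))}`);
console.log(shellQuote("it's"));
```

Quoting the space-joined lane list as a single argument is what lets the combined rerun command pass several lanes through one `-f docker_lanes=...` field.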
130
scripts/docker-e2e-timings.mjs
Normal file
@@ -0,0 +1,130 @@
#!/usr/bin/env node
// Summarizes Docker E2E timing artifacts.
// Accepts scheduler summary.json or lane-timings.json so agents can see the
// slowest lanes and phase critical path before deciding what to rerun.
import fs from "node:fs";

function usage() {
  return "Usage: node scripts/docker-e2e-timings.mjs <summary.json|lane-timings.json> [--limit N]";
}

function parseArgs(argv) {
  const options = { file: "", limit: 12 };
  for (let index = 0; index < argv.length; index += 1) {
    const arg = argv[index];
    if (arg === "--limit") {
      options.limit = Number(argv[(index += 1)] ?? "");
    } else if (arg?.startsWith("--limit=")) {
      options.limit = Number(arg.slice("--limit=".length));
    } else if (!options.file) {
      options.file = arg;
    } else {
      throw new Error(`unknown argument: ${arg}\n${usage()}`);
    }
  }
  if (!options.file || !Number.isInteger(options.limit) || options.limit < 1) {
    throw new Error(usage());
  }
  return options;
}

function readJson(file) {
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

function seconds(value) {
  return typeof value === "number" && Number.isFinite(value) ? value : 0;
}

function durationBetween(startedAt, finishedAt) {
  if (!startedAt || !finishedAt) {
    return 0;
  }
  const started = Date.parse(startedAt);
  const finished = Date.parse(finishedAt);
  if (!Number.isFinite(started) || !Number.isFinite(finished) || finished < started) {
    return 0;
  }
  return Math.round((finished - started) / 1000);
}

function summarizeSummary(summary, limit) {
  const lanes = (Array.isArray(summary.lanes) ? summary.lanes : [])
    .map((lane) => ({
      imageKind: lane.imageKind ?? "",
      name: lane.name,
      seconds: seconds(lane.elapsedSeconds),
      status: lane.status === 0 ? "pass" : `fail ${lane.status}`,
      timedOut: lane.timedOut === true,
    }))
    .filter((lane) => lane.name)
    .toSorted((left, right) => right.seconds - left.seconds || left.name.localeCompare(right.name));
  const phases = (Array.isArray(summary.phases) ? summary.phases : [])
    .map((phase) => ({
      name: phase.name,
      seconds: seconds(phase.elapsedSeconds),
      status: phase.status ?? "",
    }))
    .filter((phase) => phase.name);
  const wallSeconds = durationBetween(summary.startedAt, summary.finishedAt);
  const totalLaneSeconds = lanes.reduce((total, lane) => total + lane.seconds, 0);
  const criticalPathSeconds =
    phases.reduce((total, phase) => total + phase.seconds, 0) ||
    wallSeconds ||
    lanes[0]?.seconds ||
    0;

  console.log(`Status: ${summary.status ?? "unknown"}`);
  if (wallSeconds > 0) {
    console.log(`Wall seconds: ${wallSeconds}`);
  }
  console.log(`Lane seconds total: ${totalLaneSeconds}`);
  console.log(`Approx critical path seconds: ${criticalPathSeconds}`);
  if (wallSeconds > 0 && totalLaneSeconds > 0) {
    console.log(`Approx parallelism: ${(totalLaneSeconds / wallSeconds).toFixed(1)}x`);
  }
  if (phases.length > 0) {
    console.log("");
    console.log("Phases:");
    for (const phase of phases.toSorted((left, right) => right.seconds - left.seconds)) {
      console.log(`- ${phase.name}: ${phase.seconds}s ${phase.status}`);
    }
  }
  console.log("");
  console.log(`Slowest lanes (top ${Math.min(limit, lanes.length)}):`);
  for (const lane of lanes.slice(0, limit)) {
    console.log(
      `- ${lane.name}: ${lane.seconds}s ${lane.status}${lane.timedOut ? " timeout" : ""}${
        lane.imageKind ? ` image=${lane.imageKind}` : ""
      }`,
    );
  }
}

function summarizeTimingStore(store, limit) {
  const lanes = Object.entries(store.lanes ?? {})
    .map(([name, lane]) => ({
      name,
      seconds: seconds(lane.durationSeconds),
      status: lane.status === 0 ? "pass" : `fail ${lane.status}`,
      updatedAt: lane.updatedAt ?? "",
    }))
    .toSorted((left, right) => right.seconds - left.seconds || left.name.localeCompare(right.name));
  console.log(`Updated: ${store.updatedAt ?? "unknown"}`);
  console.log(`Known lanes: ${lanes.length}`);
  console.log("");
  console.log(`Slowest lanes (top ${Math.min(limit, lanes.length)}):`);
  for (const lane of lanes.slice(0, limit)) {
    console.log(`- ${lane.name}: ${lane.seconds}s ${lane.status} ${lane.updatedAt}`.trim());
  }
}

const options = parseArgs(process.argv.slice(2));
const payload = readJson(options.file);
if (Array.isArray(payload.lanes)) {
  summarizeSummary(payload, options.limit);
} else if (payload.lanes && typeof payload.lanes === "object") {
  summarizeTimingStore(payload, options.limit);
} else {
  throw new Error(`Unsupported Docker E2E timing artifact: ${options.file}`);
}
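The "Approx parallelism" figure printed by `scripts/docker-e2e-timings.mjs` is the sum of per-lane seconds divided by the wall-clock seconds between `startedAt` and `finishedAt`. A small worked example may make that concrete. In the sketch below, `durationBetween` is copied from the script, while `approxParallelism` is a hypothetical helper that isolates the ratio; the timestamps and lane durations are invented.

```javascript
// Copied from scripts/docker-e2e-timings.mjs: whole seconds between two
// timestamps, or 0 when either is missing, unparsable, or out of order.
function durationBetween(startedAt, finishedAt) {
  if (!startedAt || !finishedAt) {
    return 0;
  }
  const started = Date.parse(startedAt);
  const finished = Date.parse(finishedAt);
  if (!Number.isFinite(started) || !Number.isFinite(finished) || finished < started) {
    return 0;
  }
  return Math.round((finished - started) / 1000);
}

// Hypothetical helper: total lane seconds divided by wall seconds, formatted
// like the script's output, or "" when the ratio cannot be computed.
function approxParallelism(summary) {
  const lanes = Array.isArray(summary.lanes) ? summary.lanes : [];
  const totalLaneSeconds = lanes.reduce(
    (total, lane) => total + (Number.isFinite(lane.elapsedSeconds) ? lane.elapsedSeconds : 0),
    0,
  );
  const wallSeconds = durationBetween(summary.startedAt, summary.finishedAt);
  if (wallSeconds <= 0 || totalLaneSeconds <= 0) {
    return "";
  }
  return `${(totalLaneSeconds / wallSeconds).toFixed(1)}x`;
}

// Invented sample: three lanes totalling 900 s inside a 600 s wall-clock window.
const timingSample = {
  startedAt: "2026-01-01T00:00:00Z",
  finishedAt: "2026-01-01T00:10:00Z",
  lanes: [{ elapsedSeconds: 300 }, { elapsedSeconds: 450 }, { elapsedSeconds: 150 }],
};

console.log(durationBetween(timingSample.startedAt, timingSample.finishedAt));
console.log(approxParallelism(timingSample));
```

A ratio near 1.0x would suggest the lanes ran mostly one after another, while higher values indicate that the scheduler overlapped them.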
103
scripts/docker-e2e.mjs
Normal file
@@ -0,0 +1,103 @@
// Docker E2E CI helper.
// Converts scheduler JSON into GitHub Actions outputs and compact markdown
// summaries so the workflow does not duplicate Docker E2E planning logic.
import fs from "node:fs";

function usage() {
  return [
    "Usage:",
    " node scripts/docker-e2e.mjs github-outputs <plan.json>",
    " node scripts/docker-e2e.mjs summary <summary.json> <title>",
    " node scripts/docker-e2e.mjs failed-reruns <summary.json>",
  ].join("\n");
}

function readJson(file) {
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

function boolOutput(value) {
  return value ? "1" : "0";
}

function githubOutputs(plan) {
  const needs = plan.needs ?? {};
  return [
    `credentials=${(plan.credentials ?? []).join(",")}`,
    `needs_bare_image=${boolOutput(needs.bareImage)}`,
    `needs_e2e_image=${boolOutput(needs.e2eImage)}`,
    `needs_functional_image=${boolOutput(needs.functionalImage)}`,
    `needs_live_image=${boolOutput(needs.liveImage)}`,
    `needs_package=${boolOutput(needs.package)}`,
  ];
}

function markdownCell(value) {
  return String(value ?? "").replaceAll("|", "\\|");
}

function inlineCode(value) {
  return `\`${String(value ?? "").replaceAll("`", "\\`")}\``;
}

function summaryMarkdown(summary, title) {
  const lanes = Array.isArray(summary.lanes) ? summary.lanes : [];
  const lines = [
    `### ${title}`,
    "",
    `Status: ${inlineCode(summary.status)}`,
    "",
    "| Lane | Status | Seconds | Timed out | Rerun |",
    "| --- | ---: | ---: | --- | --- |",
  ];
  for (const lane of lanes) {
    const status = lane.status === 0 ? "pass" : `fail ${lane.status}`;
    lines.push(
      `| ${inlineCode(lane.name)} | ${markdownCell(status)} | ${markdownCell(lane.elapsedSeconds)} | ${lane.timedOut ? "yes" : "no"} | ${inlineCode(lane.rerunCommand)} |`,
    );
  }

  const phases = Array.isArray(summary.phases) ? summary.phases : [];
  if (phases.length > 0) {
    lines.push("", "| Phase | Seconds | Status | Image kind |", "| --- | ---: | --- | --- |");
    for (const phase of phases) {
      lines.push(
        `| ${inlineCode(phase.name)} | ${markdownCell(phase.elapsedSeconds)} | ${markdownCell(phase.status)} | ${markdownCell(phase.imageKind)} |`,
      );
    }
  }
  const failedReruns = failedRerunCommands(summary);
  if (failedReruns.length > 0) {
    lines.push("", "Failed lane reruns:", "");
    for (const command of failedReruns) {
      lines.push(`- ${inlineCode(command)}`);
    }
  }
  return lines.join("\n");
}

function failedRerunCommands(summary) {
  const lanes = Array.isArray(summary.lanes) ? summary.lanes : [];
  return lanes
    .filter((lane) => lane.status !== 0 && lane.rerunCommand)
    .map((lane) => lane.rerunCommand);
}

const [command, file, ...args] = process.argv.slice(2);
if (!command || !file) {
  throw new Error(usage());
}

if (command === "github-outputs") {
  process.stdout.write(`${githubOutputs(readJson(file)).join("\n")}\n`);
} else if (command === "summary") {
  const title = args.join(" ").trim();
  if (!title) {
    throw new Error(usage());
  }
  process.stdout.write(`${summaryMarkdown(readJson(file), title)}\n`);
} else if (command === "failed-reruns") {
  process.stdout.write(`${failedRerunCommands(readJson(file)).join("\n")}\n`);
} else {
  throw new Error(`unknown command: ${command}\n${usage()}`);
}
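The `github-outputs` subcommand of `scripts/docker-e2e.mjs` flattens a plan object into `key=value` lines, with booleans rendered as `1` or `0`. The sketch below copies `boolOutput`, `githubOutputs`, and `markdownCell` from the script and runs them on an invented plan; the credential names and the shape of `samplePlan` beyond the fields the script reads are assumptions. How the workflow consumes the lines (presumably by appending them to the file named by `$GITHUB_OUTPUT`) is not shown in this diff.

```javascript
// Copied from scripts/docker-e2e.mjs.
function boolOutput(value) {
  return value ? "1" : "0";
}

function githubOutputs(plan) {
  const needs = plan.needs ?? {};
  return [
    `credentials=${(plan.credentials ?? []).join(",")}`,
    `needs_bare_image=${boolOutput(needs.bareImage)}`,
    `needs_e2e_image=${boolOutput(needs.e2eImage)}`,
    `needs_functional_image=${boolOutput(needs.functionalImage)}`,
    `needs_live_image=${boolOutput(needs.liveImage)}`,
    `needs_package=${boolOutput(needs.package)}`,
  ];
}

// Escapes pipes so a value cannot break out of a markdown table cell.
function markdownCell(value) {
  return String(value ?? "").replaceAll("|", "\\|");
}

// Invented plan object; the credential names are made up for the example.
const samplePlan = {
  credentials: ["openai", "anthropic"],
  needs: { e2eImage: true, functionalImage: true, package: true },
};

console.log(githubOutputs(samplePlan).join("\n"));
console.log(markdownCell("fail 1 | timeout"));
```

Fields missing from `needs` come out as `0`, so a plan only has to list the images and artifacts it actually requires.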
@@ -1,13 +1,12 @@
 # syntax=docker/dockerfile:1.7
 
-FROM node:24-bookworm-slim@sha256:b4687aef2571c632a1953695ce4d61d6462a7eda471fe6e272eebf0418f276ba
+FROM node:24-bookworm-slim@sha256:e8e2e91b1378f83c5b2dd15f0247f34110e2fe895f6ca7719dbb780f929368eb
 
 ENV COREPACK_ENABLE_DOWNLOAD_PROMPT=0
 
 RUN --mount=type=cache,id=openclaw-cleanup-smoke-apt-cache,target=/var/cache/apt,sharing=locked \
     --mount=type=cache,id=openclaw-cleanup-smoke-apt-lists,target=/var/lib/apt,sharing=locked \
     apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get upgrade -y --no-install-recommends \
     && apt-get install -y --no-install-recommends \
     bash \
     ca-certificates \
@@ -1,11 +1,10 @@
 # syntax=docker/dockerfile:1.7
 
-FROM node:24-bookworm-slim@sha256:b4687aef2571c632a1953695ce4d61d6462a7eda471fe6e272eebf0418f276ba
+FROM node:24-bookworm-slim@sha256:e8e2e91b1378f83c5b2dd15f0247f34110e2fe895f6ca7719dbb780f929368eb
 
 RUN --mount=type=cache,id=openclaw-install-sh-e2e-apt-cache,target=/var/cache/apt,sharing=locked \
     --mount=type=cache,id=openclaw-install-sh-e2e-apt-lists,target=/var/lib/apt,sharing=locked \
     apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get upgrade -y --no-install-recommends \
     && apt-get install -y --no-install-recommends \
     bash \
     ca-certificates \
Some files were not shown because too many files have changed in this diff.