mirror of https://github.com/openclaw/openclaw.git
synced 2026-06-25 08:42:35 +08:00
Compare commits: appui...fix/webcha (1 commit)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 4f062d6c89 | | |
@@ -266,52 +266,6 @@ It should include `broker.url`, `broker.token`, and usually `provider: aws`
 for owned-cloud lanes. Do not let that config override the OpenClaw default
 when Blacksmith proof is requested; pass `--provider blacksmith-testbox`.

-### OpenClaw Control UI WebVNC
-
-When Peter asks to show the OpenClaw app UI in a Crabbox desktop/WebVNC session,
-keep the OpenClaw setup as agent-local ceremony and delegate the generic desktop
-bridge to Crabbox:
-
-```sh
-lease=<lease-slug-or-id>
-
-# If no lease exists yet:
-../crabbox/bin/crabbox warmup --provider aws --target linux --desktop --browser \
-  --class beast --market on-demand --idle-timeout 90m --ttl 240m --timing-json
-
-../crabbox/bin/crabbox run --provider aws --target linux --id "$lease" \
-  --desktop --browser --keep --idle-timeout 90m --ttl 240m --timing-json \
-  --shell -- 'set -euxo pipefail
-if ! command -v node >/dev/null || ! node -e "process.exit(Number(process.versions.node.split(\".\")[0]) >= 22 ? 0 : 1)"; then
-  curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
-  sudo apt-get install -y nodejs
-fi
-sudo apt-get update
-sudo apt-get install -y build-essential python3
-sudo corepack enable
-corepack prepare pnpm@10.33.2 --activate
-pnpm install --frozen-lockfile
-pnpm --dir ui build
-if [ -f /tmp/openclaw-ui.pid ] && kill -0 "$(cat /tmp/openclaw-ui.pid)" 2>/dev/null; then
-  kill "$(cat /tmp/openclaw-ui.pid)" || true
-fi
-nohup pnpm --dir ui dev --host 0.0.0.0 --port 3001 > /tmp/openclaw-ui.log 2>&1 &
-echo $! > /tmp/openclaw-ui.pid
-for _ in $(seq 1 90); do
-  curl -fsS http://127.0.0.1:3001/ >/tmp/openclaw-ui.html && exit 0
-  sleep 1
-done
-tail -80 /tmp/openclaw-ui.log >&2 || true
-exit 1'
-
-../crabbox/bin/crabbox desktop launch --provider aws --target linux --id "$lease" \
-  --browser --url http://127.0.0.1:3001/ --webvnc --open
-```
-
-Do not add an OpenClaw-specific helper under repo `scripts/` for this. If the
-demo needs a connected app, start a throwaway gateway inside the Crabbox lease;
-do not touch Peter's Mac Studio gateway unless he explicitly asks.
-
 ## Diagnostics

 ```sh
@@ -24,60 +24,6 @@ gitcrawl search openclaw/openclaw --query "<scope or title keywords>" --mode hyb
 gitcrawl cluster-detail openclaw/openclaw --id <cluster-id> --member-limit 20 --body-chars 280 --json
 ```

-## Surface opener identity
-
-- For every reviewed, triaged, closed, or landed issue/PR, show the opener's human name when available, GitHub login, and account age.
-- Get the login from `gh issue view` / `gh pr view` (`author.login`), then fetch profile metadata once with `gh api users/<login> --jq '{login,name,created_at,type}'`.
-- Report account age as created date plus rough age, for example `Opened by Jane Doe (@jane, account created 2021-04-03, ~5y old)`.
-- Also show recent GitHub activity when it informs maintainer risk: OpenClaw PRs, issues, and commits in the last 12 months; for linked issue-fixing PRs, include both the PR author and issue opener when they differ.
-- Prefer the bundled helper for activity lookups:
-
-```bash
-.agents/skills/openclaw-pr-maintainer/scripts/github-activity.sh <login> [other-login...]
-.agents/skills/openclaw-pr-maintainer/scripts/github-activity.sh --global <login>
-```
-
-- The helper reports repo-local activity first and can fetch public GitHub contribution totals for the same window with `--global`.
-- The helper is intentionally cache-friendly for gitcrawl-backed `gh`: it rounds repo-local windows to the UTC day, rounds global contribution windows to the UTC hour, and counts PRs/issues from one paginated issues response before fetching commits separately. Prefer reusing the helper instead of hand-rolling several `gh api` loops.
-- Report activity compactly, for example `OpenClaw last 12mo: 4 PRs, 2 issues, 11 commits; GitHub public last 12mo: 86 commits, 9 PRs, 3 issues, 12 reviews`.
-- If `name` is empty, use the login only. If profile lookup is rate-limited or unavailable, say `account age unknown` rather than omitting the opener.
-- Use identity and activity as triage signal, not proof by itself: new, low-activity, or bot-like accounts can raise review caution, but code, repro, and CI evidence still decide.
-
-## Suppress top-maintainer items in issue triage
-
-When Peter asks for issue triage, hot issues, pressing bugs, Discord-correlated issues, or "what is still open", do not surface issues or PRs authored by top maintainers by default. He wants external/user-reported hot issues and external PRs, not maintainer-owned work queues.
-
-Suppress by default when the opener/author is one of:
-
-- `@vincentkoc`
-- `@Takhoffman`
-- `@gumadeiras`
-- `@obviyus`
-- `@shakkernerd`
-- `@mbelinky`
-- `@joshavant`
-- `@ngutman`
-- `@vignesh07`
-- `@huntharo`
-
-Also suppress lower-priority maintainer-owned noise from the broader keep/top-maintainer group unless it is directly relevant:
-
-- `@thewilloftheshadow`
-- `@onutc` / `@osolmaz`
-- `@jacobtomlinson`
-- `@tyler6204`
-- `@velvet-shark`
-- `@jalehman`
-- `@frankekn`
-- `@ImLukeF`
-- `@mcaxtr`
-
-Exceptions:
-
-- Show maintainer-authored items when Peter explicitly asks for maintainer PRs/issues, PR landing candidates, release-blocking maintainer work, or a specific PR/issue number.
-- Show a maintainer-authored item when it is the canonical fix for an external hot issue, but frame it as the fix path rather than as a user-facing issue candidate.
-- Do not close, label, or deprioritize solely because an item is maintainer-authored; this section only controls what appears in triage shortlists.
-
 ## Apply close and triage labels correctly

 - If an issue or PR matches an auto-close reason, apply the label and let `.github/workflows/auto-response.yml` handle the comment/close/lock flow.
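The default suppression rule in the removed "Suppress top-maintainer items" section can be sketched as a small shell filter. This is a minimal sketch under stated assumptions, not part of the skill: `is_suppressed` and `shortlist_filter` are hypothetical names, the one-item-per-line `login title` input format is invented for illustration, and only the first login list is encoded. The listed exceptions still need a case-by-case decision.

```sh
# Default-suppress logins, copied from the first list in the section above
# (lowercased, because the comparison below is case-insensitive).
suppressed_logins="vincentkoc takhoffman gumadeiras obviyus shakkernerd mbelinky joshavant ngutman vignesh07 huntharo"

# is_suppressed <login>: succeed when the login is on the default-suppress list.
# A leading "@" is ignored. (Hypothetical helper name.)
is_suppressed() {
  login=$(printf '%s' "${1#@}" | tr '[:upper:]' '[:lower:]')
  case " $suppressed_logins " in
    *" $login "*) return 0 ;;
    *) return 1 ;;
  esac
}

# shortlist_filter: read "login title..." lines on stdin and print only the
# lines whose author is not suppressed. (Assumed input format.)
shortlist_filter() {
  while read -r author title; do
    is_suppressed "$author" || printf '%s %s\n' "$author" "$title"
  done
}
```

For example, piping `@obviyus Refactor gateway` and `@jane Crash on startup` through `shortlist_filter` keeps only the `@jane` line.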
@@ -1,178 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-repo="openclaw/openclaw"
-months="12"
-include_global="0"
-
-usage() {
-  printf 'Usage: %s [--repo owner/repo] [--months N] [--global] <github-login> [login...]\n' "$0"
-}
-
-die() {
-  printf 'error: %s\n' "$*" >&2
-  exit 1
-}
-
-need() {
-  command -v "$1" >/dev/null 2>&1 || die "missing required command: $1"
-}
-
-date_utc_relative_months() {
-  local count="$1"
-  if date -u -v-"${count}"m +%Y-%m-%dT00:00:00Z >/dev/null 2>&1; then
-    date -u -v-"${count}"m +%Y-%m-%dT00:00:00Z
-    return
-  fi
-  date -u -d "${count} months ago" +%Y-%m-%dT00:00:00Z
-}
-
-date_to_epoch() {
-  local value="$1"
-  if date -u -j -f '%Y-%m-%dT%H:%M:%SZ' "$value" +%s >/dev/null 2>&1; then
-    date -u -j -f '%Y-%m-%dT%H:%M:%SZ' "$value" +%s
-    return
-  fi
-  date -u -d "$value" +%s
-}
-
-rough_age() {
-  local created_at="$1"
-  local now_s created_s days
-  now_s=$(date -u +%s)
-  created_s=$(date_to_epoch "$created_at")
-  days=$(( (now_s - created_s) / 86400 ))
-  if (( days < 120 )); then
-    printf '~%dd old' "$days"
-    return
-  fi
-  awk -v days="$days" 'BEGIN { printf "~%.1fy old", days / 365.2425 }'
-}
-
-thread_kinds() {
-  local login="$1"
-  local since_ts="$2"
-  gh api --paginate "repos/${repo}/issues?state=all&creator=${login}&since=${since_ts}&per_page=100" \
-    --jq ".[] | select(.created_at >= \"${since_ts}\") | if has(\"pull_request\") then \"pr\" else \"issue\" end"
-}
-
-count_kind_lines() {
-  local kind="$1"
-  local lines="$2"
-  grep -cx "$kind" <<<"$lines" 2>/dev/null || true
-}
-
-count_commits() {
-  local login="$1"
-  local since_ts="$2"
-  gh api --paginate "repos/${repo}/commits?author=${login}&since=${since_ts}&per_page=100" \
-    --jq '.[].sha' | wc -l | tr -d '[:space:]'
-}
-
-global_activity() {
-  local login="$1"
-  local since_ts="$2"
-  local now_ts="$3"
-  # shellcheck disable=SC2016
-  gh api graphql \
-    -f login="$login" \
-    -f from="$since_ts" \
-    -f to="$now_ts" \
-    -f query='
-      query($login: String!, $from: DateTime!, $to: DateTime!) {
-        user(login: $login) {
-          contributionsCollection(from: $from, to: $to) {
-            totalCommitContributions
-            totalIssueContributions
-            totalPullRequestContributions
-            totalPullRequestReviewContributions
-          }
-        }
-      }' \
-    --jq '.data.user.contributionsCollection // empty'
-}
-
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --repo)
-      [[ $# -ge 2 ]] || die "--repo requires owner/repo"
-      repo="$2"
-      shift 2
-      ;;
-    --months)
-      [[ $# -ge 2 ]] || die "--months requires a positive integer"
-      months="$2"
-      [[ "$months" =~ ^[0-9]+$ && "$months" != "0" ]] || die "--months must be a positive integer"
-      shift 2
-      ;;
-    --global)
-      include_global="1"
-      shift
-      ;;
-    -h|--help)
-      usage
-      exit 0
-      ;;
-    --)
-      shift
-      break
-      ;;
-    -*)
-      die "unknown option: $1"
-      ;;
-    *)
-      break
-      ;;
-  esac
-done
-
-[[ $# -gt 0 ]] || {
-  usage >&2
-  exit 2
-}
-
-need gh
-need jq
-
-since_ts=$(date_utc_relative_months "$months")
-now_ts=$(date -u +%Y-%m-%dT%H:00:00Z)
-
-for login in "$@"; do
-  profile=$(gh api "users/${login}" --jq '{login,name,created_at,type}')
-  display_login=$(jq -r '.login' <<<"$profile")
-  name=$(jq -r '.name // empty' <<<"$profile")
-  created_at=$(jq -r '.created_at' <<<"$profile")
-  type=$(jq -r '.type' <<<"$profile")
-  created_day=${created_at%%T*}
-
-  kinds=$(thread_kinds "$display_login" "$since_ts")
-  prs=$(count_kind_lines pr "$kinds")
-  issues=$(count_kind_lines issue "$kinds")
-  commits=$(count_commits "$display_login" "$since_ts")
-
-  if [[ -n "$name" ]]; then
-    printf '%s (@%s, %s, account created %s, %s)\n' \
-      "$name" "$display_login" "$type" "$created_day" "$(rough_age "$created_at")"
-  else
-    printf '@%s (%s, account created %s, %s)\n' \
-      "$display_login" "$type" "$created_day" "$(rough_age "$created_at")"
-  fi
-  printf '%s last %smo: %s PRs, %s issues, %s commits\n' "$repo" "$months" "$prs" "$issues" "$commits"
-
-  if [[ "$include_global" == "1" ]]; then
-    if global_json=$(global_activity "$display_login" "$since_ts" "$now_ts" 2>/dev/null); then
-      if [[ -n "$global_json" ]]; then
-        global_commits=$(jq -r '.totalCommitContributions' <<<"$global_json")
-        global_issues=$(jq -r '.totalIssueContributions' <<<"$global_json")
-        global_prs=$(jq -r '.totalPullRequestContributions' <<<"$global_json")
-        global_reviews=$(jq -r '.totalPullRequestReviewContributions' <<<"$global_json")
-        printf 'GitHub public last %smo: %s commits, %s PRs, %s issues, %s reviews\n' \
-          "$months" "$global_commits" "$global_prs" "$global_issues" "$global_reviews"
-      else
-        printf 'GitHub public last %smo: unavailable\n' "$months"
-      fi
-    else
-      printf 'GitHub public last %smo: unavailable\n' "$months"
-    fi
-  fi
-done
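The age formatting in the removed helper above switches from days to fractional years at 120 days. The sketch below isolates that rule so it can be checked without calling GitHub or reading the clock. It is illustrative only: `rough_age_from_days` and `opener_line` are hypothetical names, the age is passed in as a day count rather than derived from `created_at`, and the `Opened by ...` wording follows the example in the skill text rather than the helper's own output.

```sh
# rough_age_from_days <days>: mirror the day/year switch used by rough_age.
# Under 120 days the age is reported in days, otherwise in years with one decimal.
rough_age_from_days() {
  if [ "$1" -lt 120 ]; then
    printf '~%dd old' "$1"
  else
    awk -v days="$1" 'BEGIN { printf "~%.1fy old", days / 365.2425 }'
  fi
}

# opener_line <name> <login> <created-day> <age-days>
# Falls back to the login alone when the profile has no display name.
opener_line() {
  if [ -n "$1" ]; then
    printf 'Opened by %s (@%s, account created %s, %s)\n' \
      "$1" "$2" "$3" "$(rough_age_from_days "$4")"
  else
    printf 'Opened by @%s (account created %s, %s)\n' \
      "$2" "$3" "$(rough_age_from_days "$4")"
  fi
}
```

Calling `opener_line '' jane 2021-04-03 45` prints `Opened by @jane (account created 2021-04-03, ~45d old)`, which matches the "use the login only" rule for an empty `name`.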
12
.github/pull_request_template.md
vendored
@@ -35,18 +35,6 @@ If this PR fixes a plugin beta-release blocker, title it `fix(<plugin-id>): beta
 - Related #
 - [ ] This PR fixes a bug or regression

-## Real behavior proof (required for external PRs)
-
-External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only. Screenshots are encouraged even for CLI, console, text, or log changes; terminal screenshots and copied live output count.
-
-- Behavior or issue addressed:
-- Real environment tested:
-- Exact steps or command run after this patch:
-- Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
-- Observed result after fix:
-- What was not tested:
-- Before evidence (optional but encouraged):
-
 ## Root Cause (if applicable)

 For bug fixes or regressions, explain why this happened, not just what changed. Otherwise write `N/A`. If the cause is unclear, write `Unknown`.
2
.github/workflows/auto-response.yml
vendored
@@ -6,7 +6,7 @@ on:
   issue_comment:
     types: [created]
   pull_request_target: # zizmor: ignore[dangerous-triggers] maintainer-owned label automation; trusted base checkout only, no untrusted PR code execution
-    types: [opened, edited, synchronize, reopened, labeled, unlabeled]
+    types: [opened, edited, synchronize, reopened, labeled]

 env:
   FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
@@ -401,38 +401,11 @@ jobs:
|
||||
)
|
||||
pnpm "${args[@]}"
|
||||
cp "$desktop_dir/desktop-browser-smoke.png" "$root/$lane/discord-status-reactions-tool-only-desktop.png"
|
||||
cp "$desktop_dir/desktop-browser-smoke.mp4" "$root/$lane/discord-status-reactions-tool-only-desktop.mp4"
|
||||
}
|
||||
|
||||
capture_desktop_lane baseline
|
||||
capture_desktop_lane candidate
|
||||
|
||||
make_desktop_preview() {
|
||||
local lane="$1"
|
||||
local input="$root/$lane/discord-status-reactions-tool-only-desktop.mp4"
|
||||
local output="$root/$lane/discord-status-reactions-tool-only-desktop-preview.gif"
|
||||
local clip="$root/$lane/discord-status-reactions-tool-only-desktop-change.mp4"
|
||||
local metadata="$root/$lane/discord-status-reactions-tool-only-desktop-preview.json"
|
||||
crabbox media preview \
|
||||
--input "$input" \
|
||||
--output "$output" \
|
||||
--trimmed-video-output "$clip" \
|
||||
--json > "$metadata"
|
||||
}
|
||||
|
||||
if ! command -v ffmpeg >/dev/null 2>&1 || ! command -v ffprobe >/dev/null 2>&1; then
|
||||
sudo apt-get update && sudo apt-get install -y ffmpeg || true
|
||||
fi
|
||||
if ! make_desktop_preview baseline || ! make_desktop_preview candidate; then
|
||||
rm -f "$root/baseline/discord-status-reactions-tool-only-desktop-preview.gif"
|
||||
rm -f "$root/candidate/discord-status-reactions-tool-only-desktop-preview.gif"
|
||||
rm -f "$root/baseline/discord-status-reactions-tool-only-desktop-change.mp4"
|
||||
rm -f "$root/candidate/discord-status-reactions-tool-only-desktop-change.mp4"
|
||||
rm -f "$root/baseline/discord-status-reactions-tool-only-desktop-preview.json"
|
||||
rm -f "$root/candidate/discord-status-reactions-tool-only-desktop-preview.json"
|
||||
echo "::warning::Could not generate motion-trimmed desktop previews; continuing with screenshots and full MP4 links."
|
||||
fi
|
||||
|
||||
baseline_status="$(jq -r '.scenarios[0].status' "$root/baseline/discord-qa-summary.json")"
|
||||
candidate_status="$(jq -r '.scenarios[0].status' "$root/candidate/discord-qa-summary.json")"
|
||||
|
||||
@@ -458,20 +431,6 @@ jobs:
|
||||
echo "- Candidate screenshot: \`candidate/discord-status-reactions-tool-only-timeline.png\`"
|
||||
echo "- Baseline desktop screenshot: \`baseline/discord-status-reactions-tool-only-desktop.png\`"
|
||||
echo "- Candidate desktop screenshot: \`candidate/discord-status-reactions-tool-only-desktop.png\`"
|
||||
if [[ -f "$root/baseline/discord-status-reactions-tool-only-desktop-preview.gif" ]]; then
|
||||
echo "- Baseline desktop preview: \`baseline/discord-status-reactions-tool-only-desktop-preview.gif\`"
|
||||
fi
|
||||
if [[ -f "$root/candidate/discord-status-reactions-tool-only-desktop-preview.gif" ]]; then
|
||||
echo "- Candidate desktop preview: \`candidate/discord-status-reactions-tool-only-desktop-preview.gif\`"
|
||||
fi
|
||||
if [[ -f "$root/baseline/discord-status-reactions-tool-only-desktop-change.mp4" ]]; then
|
||||
echo "- Baseline desktop change clip: \`baseline/discord-status-reactions-tool-only-desktop-change.mp4\`"
|
||||
fi
|
||||
if [[ -f "$root/candidate/discord-status-reactions-tool-only-desktop-change.mp4" ]]; then
|
||||
echo "- Candidate desktop change clip: \`candidate/discord-status-reactions-tool-only-desktop-change.mp4\`"
|
||||
fi
|
||||
echo "- Baseline desktop video: \`baseline/discord-status-reactions-tool-only-desktop.mp4\`"
|
||||
echo "- Candidate desktop video: \`candidate/discord-status-reactions-tool-only-desktop.mp4\`"
|
||||
} > "$root/mantis-report.md"
|
||||
|
||||
cat "$root/mantis-report.md" >> "$GITHUB_STEP_SUMMARY"
|
||||
@@ -508,7 +467,7 @@ jobs:
|
||||
permission-issues: write
|
||||
permission-pull-requests: write
|
||||
|
||||
- name: Comment PR with inline QA evidence
|
||||
- name: Comment PR with inline QA screenshots
|
||||
if: ${{ always() && needs.resolve_request.outputs.pr_number != '' && steps.run_mantis.outputs.output_dir != '' }}
|
||||
env:
|
||||
GH_TOKEN: ${{ steps.mantis_app_token.outputs.token }}
|
||||
@@ -532,9 +491,7 @@ jobs:
|
||||
"$root/baseline/discord-status-reactions-tool-only-timeline.png" \
|
||||
"$root/candidate/discord-status-reactions-tool-only-timeline.png" \
|
||||
"$root/baseline/discord-status-reactions-tool-only-desktop.png" \
|
||||
"$root/candidate/discord-status-reactions-tool-only-desktop.png" \
|
||||
"$root/baseline/discord-status-reactions-tool-only-desktop.mp4" \
|
||||
"$root/candidate/discord-status-reactions-tool-only-desktop.mp4"
|
||||
"$root/candidate/discord-status-reactions-tool-only-desktop.png"
|
||||
do
|
||||
if [[ ! -f "$required" ]]; then
|
||||
echo "Missing required QA evidence file: $required" >&2
|
||||
@@ -562,30 +519,14 @@ jobs:
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-timeline.png" "$artifacts_worktree/$artifact_root/candidate.png"
|
||||
cp "$root/baseline/discord-status-reactions-tool-only-desktop.png" "$artifacts_worktree/$artifact_root/baseline-desktop.png"
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-desktop.png" "$artifacts_worktree/$artifact_root/candidate-desktop.png"
|
||||
has_desktop_previews="false"
|
||||
if [[ -f "$root/baseline/discord-status-reactions-tool-only-desktop-preview.gif" && -f "$root/candidate/discord-status-reactions-tool-only-desktop-preview.gif" ]]; then
|
||||
cp "$root/baseline/discord-status-reactions-tool-only-desktop-preview.gif" "$artifacts_worktree/$artifact_root/baseline-desktop-preview.gif"
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-desktop-preview.gif" "$artifacts_worktree/$artifact_root/candidate-desktop-preview.gif"
|
||||
cp "$root/baseline/discord-status-reactions-tool-only-desktop-preview.json" "$artifacts_worktree/$artifact_root/baseline-desktop-preview.json"
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-desktop-preview.json" "$artifacts_worktree/$artifact_root/candidate-desktop-preview.json"
|
||||
has_desktop_previews="true"
|
||||
fi
|
||||
has_change_clips="false"
|
||||
if [[ -f "$root/baseline/discord-status-reactions-tool-only-desktop-change.mp4" && -f "$root/candidate/discord-status-reactions-tool-only-desktop-change.mp4" ]]; then
|
||||
cp "$root/baseline/discord-status-reactions-tool-only-desktop-change.mp4" "$artifacts_worktree/$artifact_root/baseline-desktop-change.mp4"
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-desktop-change.mp4" "$artifacts_worktree/$artifact_root/candidate-desktop-change.mp4"
|
||||
has_change_clips="true"
|
||||
fi
|
||||
cp "$root/baseline/discord-status-reactions-tool-only-desktop.mp4" "$artifacts_worktree/$artifact_root/baseline-desktop.mp4"
|
||||
cp "$root/candidate/discord-status-reactions-tool-only-desktop.mp4" "$artifacts_worktree/$artifact_root/candidate-desktop.mp4"
|
||||
cp "$root/comparison.json" "$artifacts_worktree/$artifact_root/comparison.json"
|
||||
cp "$root/mantis-report.md" "$artifacts_worktree/$artifact_root/mantis-report.md"
|
||||
|
||||
git -C "$artifacts_worktree" add "$artifact_root"
|
||||
if git -C "$artifacts_worktree" diff --cached --quiet; then
|
||||
echo "No QA screenshot/video artifact changes to publish."
|
||||
echo "No QA screenshot artifact changes to publish."
|
||||
else
|
||||
git -C "$artifacts_worktree" commit --quiet -m "qa: publish Mantis Discord evidence for PR ${TARGET_PR}"
|
||||
git -C "$artifacts_worktree" commit --quiet -m "qa: publish Mantis Discord screenshots for PR ${TARGET_PR}"
|
||||
git -C "$artifacts_worktree" push --quiet origin HEAD:qa-artifacts
|
||||
fi
|
||||
|
||||
@@ -594,26 +535,6 @@ jobs:
|
||||
baseline_status="$(jq -r '.baseline.status' "$root/comparison.json")"
|
||||
candidate_status="$(jq -r '.candidate.status' "$root/comparison.json")"
|
||||
pass="$(jq -r '.pass' "$root/comparison.json")"
|
||||
preview_section=""
|
||||
if [[ "$has_desktop_previews" == "true" ]]; then
|
||||
preview_section="$(cat <<EOF
|
||||
|
||||
| Baseline motion preview | Candidate motion preview |
|
||||
| --- | --- |
|
||||
| <img src="${raw_base}/baseline-desktop-preview.gif" width="420" alt="Animated baseline desktop preview"> | <img src="${raw_base}/candidate-desktop-preview.gif" width="420" alt="Animated candidate desktop preview"> |
|
||||
EOF
|
||||
)"
|
||||
fi
|
||||
change_clip_section=""
|
||||
if [[ "$has_change_clips" == "true" ]]; then
|
||||
change_clip_section="$(cat <<EOF
|
||||
|
||||
Motion-trimmed clips:
|
||||
- [Baseline change MP4](${raw_base}/baseline-desktop-change.mp4)
|
||||
- [Candidate change MP4](${raw_base}/candidate-desktop-change.mp4)
|
||||
EOF
|
||||
)"
|
||||
fi
|
||||
comment_file="$(mktemp)"
|
||||
cat > "$comment_file" <<EOF
|
||||
<!-- mantis-discord-status-reactions -->
|
||||
@@ -636,12 +557,6 @@ jobs:
|
||||
| Baseline desktop/VNC browser | Candidate desktop/VNC browser |
|
||||
| --- | --- |
|
||||
| <img src="${raw_base}/baseline-desktop.png" width="420" alt="Baseline Mantis desktop browser screenshot"> | <img src="${raw_base}/candidate-desktop.png" width="420" alt="Candidate Mantis desktop browser screenshot"> |
|
||||
${preview_section}
|
||||
${change_clip_section}
|
||||
|
||||
Full videos:
|
||||
- [Baseline desktop MP4](${raw_base}/baseline-desktop.mp4)
|
||||
- [Candidate desktop MP4](${raw_base}/candidate-desktop.mp4)
|
||||
|
||||
Raw QA files: https://github.com/${GITHUB_REPOSITORY}/tree/qa-artifacts/${artifact_root}
|
||||
EOF
|
||||
@@ -656,13 +571,13 @@ jobs:
|
||||
comment_payload="$(mktemp)"
|
||||
jq -n --rawfile body "$comment_file" '{ body: $body }' > "$comment_payload"
|
||||
if gh api --method PATCH "repos/${GITHUB_REPOSITORY}/issues/comments/${comment_id}" --input "$comment_payload" >/dev/null; then
|
||||
echo "Updated Mantis QA evidence comment on PR #${TARGET_PR}."
|
||||
echo "Updated Mantis QA screenshot comment on PR #${TARGET_PR}."
|
||||
else
|
||||
echo "::warning::Could not update existing Mantis QA evidence comment ${comment_id}; creating a new one."
|
||||
echo "::warning::Could not update existing Mantis QA screenshot comment ${comment_id}; creating a new one."
|
||||
gh pr comment "$TARGET_PR" --body-file "$comment_file"
|
||||
echo "Created Mantis QA evidence comment on PR #${TARGET_PR}."
|
||||
echo "Created Mantis QA screenshot comment on PR #${TARGET_PR}."
|
||||
fi
|
||||
else
|
||||
gh pr comment "$TARGET_PR" --body-file "$comment_file"
|
||||
echo "Created Mantis QA evidence comment on PR #${TARGET_PR}."
|
||||
echo "Created Mantis QA screenshot comment on PR #${TARGET_PR}."
|
||||
fi
|
||||
|
||||
@@ -34,7 +34,7 @@ on:
         default: 1
         type: number
       published_upgrade_survivor_baseline:
-        description: Published OpenClaw package baseline for the published-upgrade-survivor/update-migration Docker lanes
+        description: Published OpenClaw package baseline for the published-upgrade-survivor/update-migration Docker lane
         required: false
         default: openclaw@latest
         type: string
@@ -129,7 +129,7 @@ on:
         default: 1
         type: number
       published_upgrade_survivor_baseline:
-        description: Published OpenClaw package baseline for the published-upgrade-survivor/update-restart-auth/update-migration Docker lanes
+        description: Published OpenClaw package baseline for the published-upgrade-survivor/update-migration Docker lane
         required: false
         default: openclaw@latest
         type: string
@@ -861,24 +861,36 @@ jobs:
|
||||
runs-on: blacksmith-4vcpu-ubuntu-2404
|
||||
timeout-minutes: 5
|
||||
outputs:
|
||||
groups_json: ${{ steps.groups.outputs.groups_json }}
|
||||
groups_json: ${{ steps.plan.outputs.groups_json }}
|
||||
steps:
|
||||
- name: Checkout trusted release harness
|
||||
uses: actions/checkout@v6
|
||||
with:
|
||||
ref: ${{ github.sha }}
|
||||
fetch-depth: 1
|
||||
|
||||
- name: Build targeted Docker lane groups
|
||||
id: groups
|
||||
- name: Plan targeted Docker lane groups
|
||||
id: plan
|
||||
shell: bash
|
||||
env:
|
||||
LANES: ${{ inputs.docker_lanes }}
|
||||
GROUP_SIZE: ${{ inputs.targeted_docker_lane_group_size }}
|
||||
OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS: ${{ inputs.published_upgrade_survivor_baselines }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
groups_json="$(node scripts/plan-targeted-docker-lane-groups.mjs)"
|
||||
groups_json="$(
|
||||
LANES="$LANES" GROUP_SIZE="$GROUP_SIZE" node <<'NODE'
|
||||
const lanes = [...new Set(String(process.env.LANES || "").split(/[,\s]+/u).map((lane) => lane.trim()).filter(Boolean))];
|
||||
if (lanes.length === 0) {
|
||||
throw new Error("docker_lanes is required when planning targeted Docker lane groups.");
|
||||
}
|
||||
const rawGroupSize = Number.parseInt(process.env.GROUP_SIZE || "1", 10);
|
||||
const groupSize = Number.isFinite(rawGroupSize) && rawGroupSize > 0 ? rawGroupSize : 1;
|
||||
const sanitize = (lane) => lane.replace(/[^A-Za-z0-9._-]+/g, "-").replace(/^-+|-+$/g, "") || "targeted";
|
||||
const groups = [];
|
||||
for (let index = 0; index < lanes.length; index += groupSize) {
|
||||
const groupLanes = lanes.slice(index, index + groupSize);
|
||||
const first = sanitize(groupLanes[0]);
|
||||
const last = sanitize(groupLanes[groupLanes.length - 1]);
|
||||
const label = groupLanes.length === 1 ? first : `${first}--${last}`;
|
||||
groups.push({ label, docker_lanes: groupLanes.join(" ") });
|
||||
}
|
||||
process.stdout.write(JSON.stringify(groups));
|
||||
NODE
|
||||
)"
|
||||
echo "groups_json=${groups_json}" >> "$GITHUB_OUTPUT"
|
||||
|
||||
validate_docker_lanes:
|
||||
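The inline Node planner in the hunk above splits the lane list into fixed-size groups and labels each group by its first and last lane. The same chunking rule can be sketched in shell to see what a given `GROUP_SIZE` produces. This is an approximation under stated assumptions: `plan_groups` and `emit_group` are hypothetical names, the output is tab-separated `label<TAB>lanes` lines rather than the JSON the workflow emits, and the de-duplication and label sanitization done by the Node code are left out.

```sh
# emit_group: print the current group as "label<TAB>lanes" and reset the state.
# The label is the single lane name, or "first--last" for larger groups.
emit_group() {
  if [ "$count" -eq 1 ]; then label=$first; else label="$first--$last"; fi
  printf '%s\t%s\n' "$label" "$group"
  group=""; first=""; count=0
}

# plan_groups <group-size> <lane>...: chunk the lanes into groups of at most
# <group-size>, keeping their original order.
plan_groups() {
  size=$1; shift
  group=""; first=""; last=""; count=0
  for lane in "$@"; do
    [ -n "$first" ] || first=$lane
    last=$lane
    group="${group:+$group }$lane"
    count=$((count + 1))
    if [ "$count" -eq "$size" ]; then
      emit_group
    fi
  done
  # Flush a trailing partial group, if any.
  [ "$count" -eq 0 ] || emit_group
}
```

With a group size of 2 and lanes `a b c`, the sketch prints one line labelled `a--b` holding `a b` and a second line labelled `c` holding `c`.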
@@ -945,7 +957,7 @@ jobs:
|
||||
OPENCLAW_DOCKER_E2E_SELECTED_SHA: ${{ needs.validate_selected_ref.outputs.selected_sha }}
|
||||
OPENCLAW_CURRENT_PACKAGE_TGZ: .artifacts/docker-e2e-package/openclaw-current.tgz
|
||||
OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC: ${{ inputs.published_upgrade_survivor_baseline }}
|
||||
OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS: ${{ matrix.group.published_upgrade_survivor_baselines || inputs.published_upgrade_survivor_baselines }}
|
||||
OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS: ${{ inputs.published_upgrade_survivor_baselines }}
|
||||
OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS: ${{ inputs.published_upgrade_survivor_scenarios }}
|
||||
OPENCLAW_SKIP_DOCKER_BUILD: "1"
|
||||
INCLUDE_OPENWEBUI: ${{ inputs.include_openwebui }}
|
||||
@@ -986,7 +998,6 @@ jobs:
|
||||
shell: bash
|
||||
env:
|
||||
LANES: ${{ matrix.group.docker_lanes }}
|
||||
GROUP_LABEL: ${{ matrix.group.label }}
|
||||
INCLUDE_OPENWEBUI: ${{ inputs.include_openwebui }}
|
||||
INCLUDE_RELEASE_PATH_SUITES: ${{ inputs.include_release_path_suites }}
|
||||
run: |
|
||||
@@ -1006,7 +1017,7 @@ jobs:
|
||||
plan_path=".artifacts/docker-tests/targeted-plan.json"
|
||||
node .release-harness/scripts/test-docker-all.mjs --plan-json > "$plan_path"
|
||||
node .release-harness/scripts/docker-e2e.mjs github-outputs "$plan_path" >> "$GITHUB_OUTPUT"
|
||||
suffix="$(printf '%s' "${GROUP_LABEL:-$LANES}" | tr ',[:space:]' '-' | tr -cd 'A-Za-z0-9._-' | sed -E 's/-+/-/g; s/^-//; s/-$//')"
|
||||
suffix="$(printf '%s' "$LANES" | tr ',[:space:]' '-' | tr -cd 'A-Za-z0-9._-' | sed -E 's/-+/-/g; s/^-//; s/-$//')"
|
||||
echo "artifact_suffix=${suffix:-targeted}" >> "$GITHUB_OUTPUT"
|
||||
echo "plan_json=$plan_path" >> "$GITHUB_OUTPUT"
|
||||
|
||||
|
||||
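The `suffix=` pipeline in the hunk above turns a lane list into a safe artifact name: commas and whitespace become hyphens, anything outside `A-Za-z0-9._-` is dropped, runs of hyphens collapse, and one leading and one trailing hyphen are trimmed. The sketch below wraps that same pipeline in a function so its behaviour can be tried on sample input. The function name `artifact_suffix` is an assumption for illustration; the `targeted` fallback matches the `${suffix:-targeted}` default in the workflow step.

```sh
# artifact_suffix <text>: apply the workflow's tr/sed pipeline to one string
# and fall back to "targeted" when nothing usable is left.
artifact_suffix() {
  suffix=$(printf '%s' "$1" \
    | tr ',[:space:]' '-' \
    | tr -cd 'A-Za-z0-9._-' \
    | sed -E 's/-+/-/g; s/^-//; s/-$//')
  printf '%s\n' "${suffix:-targeted}"
}
```

For example, `artifact_suffix 'doctor-switch, plugin-update'` prints `doctor-switch-plugin-update`, and an empty argument prints `targeted`.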
@@ -558,8 +558,8 @@ jobs:
       artifact_name: ${{ needs.prepare_release_package.outputs.artifact_name }}
       package_sha256: ${{ needs.prepare_release_package.outputs.package_sha256 }}
       suite_profile: custom
-      docker_lanes: doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
-      published_upgrade_survivor_baselines: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'last-stable-4 2026.4.23 2026.5.2 2026.4.15' || '' }}
+      docker_lanes: doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update
+      published_upgrade_survivor_baselines: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'all-since-2026.4.23' || '' }}
       published_upgrade_survivor_scenarios: ${{ needs.resolve_target.outputs.run_release_soak == 'true' && 'reported-issues' || '' }}
       telegram_mode: mock-openai
       telegram_scenarios: telegram-help-command,telegram-commands-command,telegram-tools-compact-command,telegram-whoami-command,telegram-context-command,telegram-current-session-status-tool,telegram-mention-gating
10
.github/workflows/package-acceptance.yml
vendored
@@ -70,7 +70,7 @@ on:
         default: openclaw@latest
         type: string
       published_upgrade_survivor_baselines:
-        description: Optional baseline list for published-upgrade-survivor/update-migration; use last-stable-4, all-since-2026.4.23, release-history, or exact versions
+        description: Optional baseline list for published-upgrade-survivor/update-migration; use all-since-2026.4.23, release-history, or exact versions
         required: false
         default: ""
         type: string
@@ -150,7 +150,7 @@ on:
         default: openclaw@latest
         type: string
       published_upgrade_survivor_baselines:
-        description: Optional baseline list for published-upgrade-survivor/update-migration; use last-stable-4, all-since-2026.4.23, release-history, or exact versions
+        description: Optional baseline list for published-upgrade-survivor/update-migration; use all-since-2026.4.23, release-history, or exact versions
         required: false
         default: ""
         type: string
@@ -386,10 +386,10 @@ jobs:
|
||||
docker_lanes="npm-onboard-channel-agent gateway-network config-reload"
|
||||
;;
|
||||
package)
|
||||
docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update"
|
||||
docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update"
|
||||
;;
|
||||
product)
|
||||
docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins plugin-update mcp-channels cron-mcp-cleanup openai-web-search-minimal openwebui"
|
||||
docker_lanes="npm-onboard-channel-agent doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins plugin-update mcp-channels cron-mcp-cleanup openai-web-search-minimal openwebui"
|
||||
include_openwebui=true
|
||||
;;
|
||||
full)
|
||||
@@ -442,7 +442,7 @@ jobs:
|
||||
fi
|
||||
releases_json=""
|
||||
npm_versions_json=""
|
||||
if [[ "$REQUESTED_BASELINES" == *"release-history"* || "$REQUESTED_BASELINES" == *"all-since-"* || "$REQUESTED_BASELINES" == *"last-stable-"* ]]; then
|
||||
if [[ "$REQUESTED_BASELINES" == *"release-history"* || "$REQUESTED_BASELINES" == *"all-since-"* ]]; then
|
||||
releases_json=".artifacts/package-candidate-input/openclaw-releases.json"
|
||||
npm_versions_json=".artifacts/package-candidate-input/openclaw-npm-versions.json"
|
||||
mkdir -p "$(dirname "$releases_json")"
|
||||
|
||||
29
.github/workflows/real-behavior-proof.yml
vendored
@@ -1,29 +0,0 @@
|
||||
name: Real behavior proof
|
||||
|
||||
on:
|
||||
pull_request_target: # zizmor: ignore[dangerous-triggers] trusted base checkout only; no untrusted PR code execution
|
||||
types: [opened, edited, synchronize, reopened, ready_for_review, labeled, unlabeled]
|
||||
|
||||
env:
|
||||
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
||||
|
||||
concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref || github.run_id }}
|
||||
cancel-in-progress: true
|
||||
|
||||
permissions: {}
|
||||
|
||||
jobs:
|
||||
real-behavior-proof:
|
||||
name: Real behavior proof
|
||||
permissions:
|
||||
contents: read
|
||||
pull-requests: read
|
||||
runs-on: ubuntu-24.04
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
with:
|
||||
ref: ${{ github.event.pull_request.base.sha }}
|
||||
persist-credentials: false
|
||||
- name: Check real behavior proof
|
||||
run: node scripts/github/real-behavior-proof-check.mjs
|
||||
470
CHANGELOG.md
@@ -10,71 +10,211 @@ Docs: https://docs.openclaw.ai
|
||||
|
||||
### Changes
|
||||
|
||||
- Control UI: refresh the app shell into a denser cockpit layout with session navigation, live runtime cards, and a right-side skills/jobs/hooks inspector.
|
||||
- Telegram: accept plugin-owned numeric forum-topic targets in the agent message tool and keep reply-dispatch provider chunks behind a real stable runtime alias during in-place package updates. Fixes #77137. Thanks @richardmqq.
|
||||
- Channels/WhatsApp: support explicit WhatsApp Channel/Newsletter `@newsletter` outbound message targets with channel session metadata instead of DM routing. Fixes #13417; carries forward the narrow outbound target idea from #13424. Thanks @vincentkoc and @agentz-manfred.
|
||||
- TTS/telephony: honor provider voice/model overrides in telephony synthesis providers so Google Meet agent speech logs match the backend that actually produced the audio. Thanks @vincentkoc.
|
||||
- Voice Call/realtime: bound the paced Twilio audio queue and close overloaded realtime streams before provider audio can pile up behind the websocket backpressure guard. Thanks @vincentkoc.
|
||||
- Google Meet: preserve `realtime.introMessage: ""` so realtime Chrome joins can stay silent instead of restoring the default spoken intro. Thanks @vincentkoc.
|
||||
- OpenAI/Codex media: advertise Codex audio transcription in runtime and manifest metadata and route active Codex chat models to the OpenAI transcription default instead of sending chat model ids to audio transcription. Thanks @vincentkoc.
|
||||
- Models/auth: add `openclaw models auth list [--provider <id>] [--json]` so users can inspect saved per-agent auth profiles without dumping secrets or hitting the old “too many arguments” path. Thanks @vincentkoc.
|
||||
- Cron CLI: add `openclaw cron list --agent <id>`, normalize the requested agent id, and include jobs without a stored agent id under the configured default agent while keeping `cron list` unfiltered when no agent is supplied. Fixes #77118. Thanks @zhanggttry.
|
||||
- Status: show compact Gateway process uptime and host system uptime in `/status`, making restart and host-lifetime checks visible from chat. Thanks @vincentkoc.
|
||||
- Discord/status: add degraded Discord transport and gateway event-loop starvation signals to `openclaw channels status`, `openclaw status --deep`, and fetch-timeout logs so intermittent socket resets do not look like a healthy running channel. (#76327) Thanks @joshavant.
|
||||
- Gateway/Windows: bind the default loopback gateway listener only to `127.0.0.1` on Windows so libuv's dual-stack `::1` behavior cannot wedge localhost HTTP requests. (#69701, fixes #69674) Thanks @SARAMALI15792.
|
||||
- Plugins/migration: emit catalog-backed install hints when `plugins.entries` or `plugins.allow` references an official external plugin that is not installed, so upgraded configs point operators to `openclaw plugins install <spec>` instead of telling them to remove valid plugin config. (#77483) Thanks @hclsys.
|
||||
- OpenAI/Codex media: advertise Codex audio transcription in runtime and manifest metadata and route active Codex chat models to the OpenAI transcription default instead of sending chat model ids to audio transcription. Thanks @vincentkoc.
|
||||
- Dependencies: refresh runtime and provider packages including Pi 0.73.0, ACPX adapters, OpenAI, Anthropic, Slack, and TypeScript native preview, while keeping the Bedrock runtime installer override pinned below the Windows ARM Node 24 npm resolver failure.
|
||||
- Agents/performance: pass the resolved workspace through BTW, compaction, embedded-run model generation, and PDF model setup so explicit agent-dir model refreshes can reuse the current workspace-scoped plugin metadata snapshot instead of falling back to cold plugin metadata scans. (#77519, #77532)
|
||||
- Plugins/performance: let unscoped model catalog and manifest-contract readers reuse the current workspace-compatible plugin metadata snapshot, avoiding repeated cold plugin metadata scans on hot control-plane paths while preserving env/config/workspace compatibility checks. (#77519, #77532)
|
||||
- Config/plugin auto-enable: prefer the claiming plugin manifest id over a built-in channel alias when auto-allowlisting a configured channel, so WeCom/Yuanbao-style aliases resolve to the installed plugin id. Thanks @Beandon13.
|
||||
- Secrets/apply: preserve auth-profile `keyRef` and `tokenRef` fields when scrubbing provider-target secrets, so the canonical SecretRef metadata survives `secrets apply` without keeping plaintext values. Thanks @Beandon13.
|
||||
- Plugins/active-memory: skip session-store channel entries that contain `:` when resolving the recall subagent's channel, so QQ c2c agent IDs (e.g. `c2c:10D4F7C2…`) and other scoped conversation IDs do not reach bundled-plugin `dirName` validation and crash the recall run. The same guard already applied to explicit `channelId` params (#76704); this extends it to store-derived channels. (#77396) Thanks @hclsys.
|
||||
- Secrets/external channel contracts: also look in `<rootDir>/dist/` when resolving the `secret-contract-api` sidecar, so npm-published externalized channel plugins (e.g. `@openclaw/discord` since 2026.5.2) whose compiled artifacts live under `dist/` actually contribute their channel SecretRef contracts to the runtime snapshot. Without this, env-backed `channels.discord.token` SecretRefs silently failed to resolve at gateway start on 2026.5.3, leaving the channel `not configured` even though #76449 had landed the generic external-contract loader. Thanks @mogglemoss.
|
||||
- Models/auth: add `openclaw models auth list [--provider <id>] [--json]` so users can inspect saved per-agent auth profiles without dumping secrets or hitting the old “too many arguments” path. Thanks @vincentkoc.
|
||||
- Control UI/header: show the active agent name in dashboard breadcrumbs without adding the current session key, keeping non-chat views oriented without crowding the topbar.
|
||||
- Control UI/cron: make the New Job sidebar collapsible so the jobs list can reclaim space while keeping the form one click away. Thanks @BunsDev.
|
||||
- Gateway/startup: keep model-catalog test helpers, run-session lookup code, QR pairing helpers, and TypeBox memory-tool schema construction out of hot startup import paths, reducing default gateway benchmark plugin-load and memory pressure.
|
||||
- Control UI/performance: record browser long animation frame or long task entries in the debug event log when supported, making slow dashboard renders easier to attribute from the UI.
|
||||
- Slack/streaming: add `streaming.progress.render: "rich"` for Block Kit progress drafts backed by structured progress line data.
|
||||
- Slack/streaming: keep the newest rich progress lines when Block Kit limits trim long progress drafts. Thanks @vincentkoc.
|
||||
- Channels/streaming: cap progress-draft tool lines by default so edited progress boxes avoid jumpy reflow from long wrapped lines.
|
||||
- Control UI/chat: add an agent-first filter to the chat session picker, keep chat controls/composer responsive across phone/tablet/desktop widths, keep desktop chat controls on one row, avoid duplicate avatar refreshes during initial chat load, and hide that row while scrolling down the transcript. Thanks @BunsDev.
|
||||
- Control UI/chat: collapse consecutive duplicate text messages into one bubble with a count so repeated text-only messages stay compact without hiding nearby context.
|
||||
- Control UI/cron: make the New Job sidebar collapsible so the jobs list can reclaim space while keeping the form one click away. Thanks @BunsDev.
|
||||
- Control UI/header: show the active agent name in dashboard breadcrumbs without adding the current session key, keeping non-chat views oriented without crowding the topbar.
|
||||
- Plugins/migration: emit catalog-backed install hints when `plugins.entries` or `plugins.allow` references an official external plugin that is not installed, so upgraded configs point operators to `openclaw plugins install <spec>` instead of telling them to remove valid plugin config. (#77483) Thanks @hclsys.
|
||||
- Plugins/ClawHub: annotate 429 errors from ClawHub with the reset window from `RateLimit-Reset`/`Retry-After` and append a `Sign in for higher rate limits.` hint when the request was unauthenticated, so users can see when downloads will recover and how to lift the cap. Thanks @romneyda.
|
||||
- Secrets/external channel contracts: also look in `<rootDir>/dist/` when resolving the `secret-contract-api` sidecar, so npm-published externalized channel plugins (e.g. `@openclaw/discord` since 2026.5.2) whose compiled artifacts live under `dist/` actually contribute their channel SecretRef contracts to the runtime snapshot. Without this, env-backed `channels.discord.token` SecretRefs silently failed to resolve at gateway start on 2026.5.3, leaving the channel `not configured` even though #76449 had landed the generic external-contract loader. Thanks @mogglemoss.
|
||||
- Secrets/apply: preserve auth-profile `keyRef` and `tokenRef` fields when scrubbing provider-target secrets, so the canonical SecretRef metadata survives `secrets apply` without keeping plaintext values. Thanks @Beandon13.
|
||||
- Config/plugin auto-enable: prefer the claiming plugin manifest id over a built-in channel alias when auto-allowlisting a configured channel, so WeCom/Yuanbao-style aliases resolve to the installed plugin id. Thanks @Beandon13.
|
||||
- Plugins/update: treat official externalized bundled npm migrations and ClawHub-to-npm fallbacks as trusted source-linked installs, so prerelease-only official plugin packages can migrate from bundled builds without being rejected as unsafe prerelease resolutions. Thanks @vincentkoc.
|
||||
- Plugins/update: move ClawHub-preferred externalized plugin installs back to ClawHub after an earlier npm fallback once the ClawHub package becomes available. Thanks @vincentkoc.
|
||||
- Plugins/update: clean stale bundled load paths for already-externalized pinned npm and ClawHub plugin installs, so release-channel sync does not leave removed bundled paths ahead of the installed external package. Thanks @vincentkoc.
|
||||
- Plugins/update: make package upgrades swap pnpm/npm-prefix installs cleanly, keep legacy plugin install runtime chunks working, and on the beta channel fall back default-line npm plugins to default/latest when plugin beta releases are missing or fail install validation. Thanks @vincentkoc and @joshavant.
|
||||
- Plugins/active-memory: skip session-store channel entries that contain `:` when resolving the recall subagent's channel, so QQ c2c agent IDs (e.g. `c2c:10D4F7C2…`) and other scoped conversation IDs do not reach bundled-plugin `dirName` validation and crash the recall run. The same guard already applied to explicit `channelId` params (#76704); this extends it to store-derived channels. (#77396) Thanks @hclsys.
|
||||
- Sandbox/Windows: accept drive-absolute Docker bind sources while keeping sandbox blocked-path and allowed-root policy comparisons Windows-case-insensitive. (#42174) Thanks @6607changchun.
|
||||
- Agents/subagents: preserve every grouped child result when direct completion fallback has to bypass the requester-agent announce turn. Thanks @vincentkoc.
|
||||
- Agents/verbose: use compact explain-mode tool summaries for `/verbose` and progress drafts by default, with `agents.defaults.toolProgressDetail: "raw"` and per-agent overrides for debugging raw command/detail output.
|
||||
- Gateway/startup: keep model-catalog test helpers, run-session lookup code, QR pairing helpers, and TypeBox memory-tool schema construction out of hot startup import paths, reducing default gateway benchmark plugin-load and memory pressure.
|
||||
- Control UI/chat: add an agent-first filter to the chat session picker, keep chat controls/composer responsive across phone/tablet/desktop widths, keep desktop chat controls on one row, avoid duplicate avatar refreshes during initial chat load, and hide that row while scrolling down the transcript. Thanks @BunsDev.
|
||||
- Control UI/chat: collapse consecutive duplicate text messages into one bubble with a count so no-op heartbeat acknowledgements stay compact without hiding nearby context.
|
||||
- Agents/subagents: preserve every grouped child result when direct completion fallback has to bypass the requester-agent announce turn. Thanks @vincentkoc.
|
||||
- TTS/telephony: honor provider voice/model overrides in telephony synthesis providers so Google Meet agent speech logs match the backend that actually produced the audio. Thanks @vincentkoc.
|
||||
- Voice Call/realtime: bound the paced Twilio audio queue and close overloaded realtime streams before provider audio can pile up behind the websocket backpressure guard. Thanks @vincentkoc.
|
||||
- Docs: clarify that IRC uses raw TCP/TLS sockets outside operator-managed forward proxy routing, so direct IRC egress should be explicitly approved before enabling IRC. Thanks @jesse-merhi.
|
||||
- Gateway/performance: defer non-readiness sidecars until after the ready signal, avoid hot-path channel plugin barrel imports, and fast-path trusted bundled plugin metadata during Gateway startup.
|
||||
- Gateway/performance: avoid importing `jiti` on native-loadable plugin startup paths, so compiled bundled plugin surfaces do not pay source-transform loader cost unless fallback loading is actually needed.
|
||||
- Plugins/loader: preserve real compiled plugin module evaluation errors on the native fast path instead of treating every thrown `.js` module as a source-transform fallback miss. Thanks @vincentkoc.
|
||||
- Providers/OpenRouter: add opt-in response caching params that send OpenRouter's `X-OpenRouter-Cache`, `X-OpenRouter-Cache-TTL`, and cache-clear headers only on verified OpenRouter routes. Thanks @vincentkoc.
|
||||
- Providers/OpenRouter: expand app-attribution categories so OpenClaw advertises coding, programming, writing, chat, and personal-agent usage on verified OpenRouter routes. Thanks @vincentkoc.
|
||||
- Agents/performance: pass the resolved workspace through BTW, compaction, embedded-run model generation, and PDF model setup so explicit agent-dir model refreshes can reuse the current workspace-scoped plugin metadata snapshot instead of falling back to cold plugin metadata scans. (#77519, #77532)
|
||||
- Plugins/performance: let unscoped model catalog and manifest-contract readers reuse the current workspace-compatible plugin metadata snapshot, avoiding repeated cold plugin metadata scans on hot control-plane paths while preserving env/config/workspace compatibility checks. (#77519, #77532)
|
||||
- Agents/sandbox: store sandbox container and browser registry entries as per-runtime shard files, reducing unrelated session lock contention while `openclaw doctor --fix` migrates legacy monolithic registry files. (#74831) Thanks @luckylhb90.
|
||||
- Plugins/runtime state: add `registerIfAbsent` for atomic keyed-store dedupe claims that return whether a plugin successfully claimed a key without overwriting an existing live value. Thanks @amknight.
|
||||
- Exec approvals: add a tree-sitter-backed shell command explainer for future approval and command-review surfaces. (#75004) Thanks @jesse-merhi.
|
||||
- Control UI/performance: record browser long animation frame or long task entries in the debug event log when supported, making slow dashboard renders easier to attribute from the UI.
|
||||
- Gateway/diagnostics: add startup phase spans, active work labels, stale terminal bridge markers, and default sync-I/O tracing in `pnpm gateway:watch` so slow Gateway turns are easier to attribute from logs and stability diagnostics.
|
||||
- QA/Codex harness: add targeted live Docker/Testbox diagnostics, auth preflight checks, cache mount fixes, and app-server protocol checkout discovery so maintainer harness failures are easier to reproduce. Thanks @vincentkoc.
|
||||
- Plugins/loader: preserve real compiled plugin module evaluation errors on the native fast path instead of treating every thrown `.js` module as a source-transform fallback miss. Thanks @vincentkoc.
|
||||
- QA/Mantis: add `pnpm openclaw qa mantis slack-desktop-smoke` to run Slack live QA inside a Crabbox VNC desktop, open Slack Web, and capture desktop screenshots beside the Slack QA artifacts.
|
||||
- QA/Mantis: add visual desktop tasks with Crabbox MP4 recording, screenshot capture, and optional image-understanding assertions, and preserve video artifacts in Mantis before/after reports.
|
||||
- QA/Mantis: pass the runtime env through desktop-browser Crabbox and artifact-copy child commands, so embedded Mantis callers can provide Crabbox credentials without mutating the parent process. Thanks @vincentkoc.
|
||||
- QA/Mantis: return the copied Slack desktop screenshot path even when remote Slack QA fails, so the CLI still prints the failure screenshot artifact. Thanks @vincentkoc.
|
||||
- QA/Mantis: accept Blacksmith Testbox `tbx_...` lease ids from desktop smoke warmup, so provider overrides do not fail before inspect/run. Thanks @vincentkoc.
|
||||
- QA/Codex harness: add targeted live Docker/Testbox diagnostics, auth preflight checks, cache mount fixes, and app-server protocol checkout discovery so maintainer harness failures are easier to reproduce. Thanks @vincentkoc.
|
||||
- Plugins/update: treat official externalized bundled npm migrations and ClawHub-to-npm fallbacks as trusted source-linked installs, so prerelease-only official plugin packages can migrate from bundled builds without being rejected as unsafe prerelease resolutions. Thanks @vincentkoc.
|
||||
- Plugins/update: move ClawHub-preferred externalized plugin installs back to ClawHub after an earlier npm fallback once the ClawHub package becomes available. Thanks @vincentkoc.
|
||||
- Plugins/update: clean stale bundled load paths for already-externalized pinned npm and ClawHub plugin installs, so release-channel sync does not leave removed bundled paths ahead of the installed external package. Thanks @vincentkoc.
|
||||
- Telegram: accept plugin-owned numeric forum-topic targets in the agent message tool and keep reply-dispatch provider chunks behind a real stable runtime alias during in-place package updates. Fixes #77137. Thanks @richardmqq.
|
||||
- Google Meet: preserve `realtime.introMessage: ""` so realtime Chrome joins can stay silent instead of restoring the default spoken intro. Thanks @vincentkoc.
|
||||
- Plugins/SDK: add bounded `before_agent_finalize` retry instructions so workflow plugins can request one more model pass. Thanks @100yenadmin.
|
||||
- Discord/status: add degraded Discord transport and gateway event-loop starvation signals to `openclaw channels status`, `openclaw status --deep`, and fetch-timeout logs so intermittent socket resets do not look like a healthy running channel. (#76327) Thanks @joshavant.
|
||||
- Providers/OpenRouter: add opt-in response caching params that send OpenRouter's `X-OpenRouter-Cache`, `X-OpenRouter-Cache-TTL`, and cache-clear headers only on verified OpenRouter routes. Thanks @vincentkoc.
|
||||
- Providers/OpenRouter: expand app-attribution categories so OpenClaw advertises coding, programming, writing, chat, and personal-agent usage on verified OpenRouter routes. Thanks @vincentkoc.
|
||||
- Plugins/update: make package upgrades swap pnpm/npm-prefix installs cleanly, keep legacy plugin install runtime chunks working, and on the beta channel fall back default-line npm plugins to default/latest when plugin beta releases are missing or fail install validation. Thanks @vincentkoc and @joshavant.
|
||||
- Channels/WhatsApp: support explicit WhatsApp Channel/Newsletter `@newsletter` outbound message targets with channel session metadata instead of DM routing. Fixes #13417; carries forward the narrow outbound target idea from #13424. Thanks @vincentkoc and @agentz-manfred.
|
||||
- Exec approvals: add a tree-sitter-backed shell command explainer for future approval and command-review surfaces. (#75004) Thanks @jesse-merhi.
|
||||
- Agents/sandbox: store sandbox container and browser registry entries as per-runtime shard files, reducing unrelated session lock contention while `openclaw doctor --fix` migrates legacy monolithic registry files. (#74831) Thanks @luckylhb90.
|
||||
- Plugins/ClawHub: annotate 429 errors from ClawHub with the reset window from `RateLimit-Reset`/`Retry-After` and append a `Sign in for higher rate limits.` hint when the request was unauthenticated, so users can see when downloads will recover and how to lift the cap. Thanks @romneyda.
|
||||
- Plugins/runtime state: add `registerIfAbsent` for atomic keyed-store dedupe claims that return whether a plugin successfully claimed a key without overwriting an existing live value. Thanks @amknight.
|
||||
- Plugin SDK: add plugin-owned `SessionEntry` slot projection and scoped trusted-policy session extension reads. (#75609; replaces part of #73384/#74483) Thanks @100yenadmin.
|
||||
- Docs: clarify that IRC uses raw TCP/TLS sockets outside operator-managed forward proxy routing, so direct IRC egress should be explicitly approved before enabling IRC. Thanks @jesse-merhi.
|
||||
- Dependencies: refresh runtime and provider packages including Pi 0.73.0, ACPX adapters, OpenAI, Anthropic, Slack, and TypeScript native preview, while keeping the Bedrock runtime installer override pinned below the Windows ARM Node 24 npm resolver failure.
|
||||
- Contributor PRs: require external pull requests to include after-fix real behavior proof from a real OpenClaw setup, with terminal screenshots, console output, redacted runtime logs, linked artifacts, and copied live output treated as valid evidence while unit tests, mocks, lint, typechecks, snapshots, and CI remain supplemental only.
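The Plugins/ClawHub entry above describes annotating 429 errors with a reset window and a sign-in hint. A minimal sketch of that idea follows; it is an illustration only, not OpenClaw's actual code. The names `HeaderBag`, `resetSeconds`, and `annotateRateLimitError`, the leading message text, and the choice to prefer `RateLimit-Reset` over `Retry-After` are all assumptions. Only the header names and the `Sign in for higher rate limits.` hint come from the changelog.

```typescript
// Hypothetical helper types and functions; not taken from the OpenClaw codebase.
type HeaderBag = Record<string, string | undefined>;

/** Return the reset window in whole seconds from `nowMs`, or undefined if no usable header exists. */
function resetSeconds(headers: HeaderBag, nowMs: number): number | undefined {
  // HTTP header names are case-insensitive, so normalize the keys first.
  const lower: HeaderBag = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }

  // Assumption: prefer RateLimit-Reset, then fall back to Retry-After.
  const raw = lower["ratelimit-reset"] ?? lower["retry-after"];
  if (raw === undefined) return undefined;
  const text = raw.trim();

  // Delta-seconds form, e.g. "120".
  if (/^\d+$/.test(text)) return Number(text);

  // Retry-After may also carry an HTTP-date; convert it to a delta from now.
  const at = Date.parse(text);
  if (Number.isNaN(at)) return undefined;
  return Math.max(0, Math.ceil((at - nowMs) / 1000));
}

/** Build a user-facing 429 message with the reset window and an optional sign-in hint. */
function annotateRateLimitError(
  headers: HeaderBag,
  authenticated: boolean,
  nowMs: number = Date.now(),
): string {
  // The leading sentence is placeholder wording, not the real error text.
  const parts = ["ClawHub rate limit reached (HTTP 429)."];
  const seconds = resetSeconds(headers, nowMs);
  if (seconds !== undefined) parts.push(`Resets in ${seconds}s.`);
  // The hint is appended only for unauthenticated requests, as the entry describes.
  if (!authenticated) parts.push("Sign in for higher rate limits.");
  return parts.join(" ");
}
```

Passing `nowMs` explicitly keeps the helper deterministic in tests; a caller would normally omit it and let it default to the current time.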
|
||||
|
||||
### Fixes
|
||||
|
||||
- Video generation: wait up to 20 minutes for slow fal/MiniMax queue-backed jobs, stop forwarding unsupported Google Veo generated-audio options, and normalize MiniMax `720P` requests to its supported `768P` resolution with the usual override warning/details instead of failing fallback.
|
||||
- Video generation: accept provider-specific aspect-ratio and resolution hints at the tool boundary, normalize `720P` to MiniMax's supported `768P`, and stop sending Google `generateAudio` on Gemini video requests so provider fallback can recover from model-specific parameter differences. Thanks @vincentkoc.
|
||||
- OpenAI/Google Meet: fail realtime voice connection attempts when the socket closes before `session.updated`, avoiding stuck Meet joins waiting on a bridge that never became ready. Thanks @vincentkoc.
|
||||
- WhatsApp/onboarding: canonicalize setup and pairing allowlist entries to WhatsApp's digit-only phone ids while still accepting E.164, JID, and `whatsapp:` inputs, so personal-phone allowlists match WhatsApp Web sender ids after setup. Thanks @vincentkoc.
|
||||
- Gateway/startup: load provider plugins that own explicitly configured image, video, or music generation defaults so generation tools become live after gateway restart instead of remaining catalog-only. Fixes #77244. Thanks @buyuangtampan, @Nikoxx99, and @vincentkoc.
|
||||
- Slack/subagents: keep resumed parent `message.send` calls in the originating Slack thread when ambient session thread context is present, and suppress successful silent child completion rows from follow-up findings. Thanks @bek91.
|
||||
- WebChat/exec approvals: send `/approve ...` through the existing backend command path immediately while a run is blocked on approval, hydrate pending approval cards after reconnect, and add `openclaw approvals list --gateway` plus `openclaw sessions list --json` so operators can inspect stuck sessions without guessing. Thanks @vincentkoc.
|
||||
- Infra/Windows: skip the POSIX `/tmp/openclaw` preferred path on Windows in `resolvePreferredOpenClawTmpDir` so log files, TTS temp files, and other writes land in `%TEMP%\openclaw-<uid>` instead of `C:\tmp\openclaw`. Fixes #60713. Thanks @juan-flores077.
|
||||
- Gateway/diagnostics: make stuck-session recovery outcome-driven and generation-guarded, add `diagnostics.stuckSessionAbortMs`, and emit structured recovery requested/completed events so stale or skipped recovery no longer looks like a successful abort.
|
||||
- Media/Windows: open saved attachment temp files read/write before fsync so Windows WebChat and `chat.send` media offloads no longer fail with EPERM during durability flush. (#76593) Thanks @qq230849622-a11y.
|
||||
- Agents/tools: honor narrow runtime tool allowlists when constructing embedded-runner tool families and bundled MCP/LSP runtimes, so cron/subagent runs that request tools such as `update_plan`, `browser`, `x_search`, channel login tools, or `group:plugins` no longer start with missing tools or unrelated bootstrap work. (#77519, #77532)
|
||||
- Codex plugin: mirror the experimental upstream app-server protocol and format generated TypeScript before drift checks, keeping OpenClaw's `experimentalApi` bridge compatible with latest Codex while preserving formatter gates.
|
||||
- Telegram/media: derive no-caption inbound media placeholders from saved MIME metadata instead of the Telegram `photo` shape, so non-image and mixed attachments no longer reach the model as `<media:image>`. Fixes #69793. Thanks @aspalagin.
|
||||
- Agents/cache: keep per-turn runtime context out of ordinary chat system prompts while still delivering hidden current-turn context, restoring prompt-cache reuse on chat continuations. Fixes #77431. Thanks @Udjin79.
|
||||
- Gateway/startup: include resolved thinking and fast-mode defaults in the `agent model` startup log line, defaulting unset startup thinking to `medium` without mixing in reasoning visibility.
|
||||
- Gateway/update: resolve local gateway probe auth from the installed config during post-update restart verification, so token/device-authenticated VPS gateways are not misreported as unhealthy port conflicts after a package swap. Thanks @vincentkoc.
|
||||
- Agents/Tools: add post-compaction loop guard in `pi-embedded-runner` that arms after auto-compaction-retry and aborts the run with `compaction_loop_persisted` when the agent emits the same `(tool, args, result)` triple `windowSize` times (default 3) within that window. Disable via existing `tools.loopDetection.enabled`; tune via `tools.loopDetection.postCompactionGuard.windowSize`. Targets the failure mode where context-overflow + compaction does not break a tool-call loop. Refs #77474; carries forward #21597. Thanks @efpiva.
|
||||
- Gateway/watch: suppress sync-I/O trace output during `pnpm gateway:watch --benchmark` unless explicitly requested, so CPU profiling no longer floods the terminal with stack traces.
|
||||
- Gateway/watch: when benchmark sync-I/O tracing is explicitly enabled, tee trace blocks to the benchmark output log and filter them from the terminal pane while keeping normal Gateway logs visible.
|
||||
- Plugins/runtime-deps: include `json5` in the memory-core plugin runtime dependency set so packaged `memory_search` sandboxes can resolve generated OpenClaw runtime chunks that parse JSON5 config. Fixes #77461.
|
||||
- Codex harness: preserve app-server usage-limit reset details and deliver OpenClaw-owned runtime failure notices through tool-only source-reply mode, so Telegram and other chat channels tell users when Codex subscription limits or API failures block a turn instead of going silent. (#77557) Thanks @pashpashpash.
|
||||
- Agents/OpenAI: default direct OpenAI Responses models to the SSE transport instead of WebSocket auto-selection, preventing pi runtime chat turns from hanging on servers where the WebSocket path stalls while the OpenAI HTTP stream works. Thanks @vincentkoc.
|
||||
- Plugins/update: repair missing plugin-local `openclaw` peer links before skipping unchanged npm plugin updates, so current external Codex installs can recover `openclaw/plugin-sdk/*` resolution during OTA repair. (#77544) Thanks @ProspectOre.
|
||||
- Discord/replies: treat failed final reply delivery as a failed turn instead of counting it as a delivered automatic visible reply, so guild/channel turns no longer show done when the final message was dropped. Fixes #77520. Thanks @Patrick-Erichsen.
|
||||
- Discord: prefer IPv4 for Discord REST and gateway WebSocket startup paths so IPv4-only networks no longer stall before Gateway READY and inbound message dispatch. Fixes #77398; refs #77526. Thanks @Beandon13.
|
||||
- Channels/plugins: key bundled package-state probes, env/config presence, and read-only command defaults by channel id instead of manifest plugin id, preserving setup and native-command detection for channel plugins whose package id differs from the channel alias. Thanks @vincentkoc.
|
||||
- Docker: prune package-excluded plugin dist directories from runtime images unless the build explicitly opts that plugin in, so official external plugins such as Feishu stay install-on-demand instead of shipping partial metadata without compiled runtime output. Fixes #77424. Thanks @vincentkoc.
|
||||
- Model switching: include the exact additive allowlist repair command when `/model ... --runtime ...` targets a blocked model, and make Telegram's model picker say that it changes only the session model while leaving the runtime unchanged. Thanks @vincentkoc.
|
||||
- Mattermost: clarify that the model picker only changes the session model and that runtime switches require `/oc_model <provider/model> --runtime <runtime>`. Thanks @vincentkoc.
|
||||
- Doctor/config: keep active `auth.profiles` metadata intact when `doctor --fix` strips stale secret fields from configs, repairing legacy `<provider>:default` API-key profile metadata when model fallbacks or explicit `model@profile` refs still depend on it. Fixes #77400.
- Doctor/plugins: include `plugins.allow`-only official plugin ids in the release configured-plugin repair set, so `doctor --fix` installs official external plugins that are configured but not yet loaded instead of removing them as stale allow entries. Fixes #77155. Thanks @hclsys.
- Doctor/sessions: clear auto-created stale session routing state from the sessions store when `doctor --fix` sees plugin-owned model/runtime/auth/session bindings outside the current configured route, while leaving explicit user model choices for manual review. Refs #68615.
- CLI/update: disable and skip plugins that fail package-update plugin sync, so a broken npm/ClawHub/git/marketplace plugin cannot turn a successful OpenClaw package update into a failed update result. Thanks @vincentkoc.
- CLI/update: use an absolute POSIX npm script shell during package-manager updates, so restricted PATH environments can still run dependency lifecycle scripts while updating from `--tag main`. Fixes #77530. Thanks @PeterTremonti.
- Diagnostics: grant the internal diagnostics event bus to official installed diagnostics exporter plugins, so npm-installed `@openclaw/diagnostics-prometheus` can emit metrics without broadening the capability to arbitrary global plugins. Fixes #76628. Thanks @RayWoo.
- Browser: enforce strict SSRF current-URL checks before existing-session screenshots, matching existing-session snapshot handling. Thanks @vincentkoc.
- Active Memory: give timeout partial transcript recovery enough abort-settle headroom so temporary recall summaries are returned before cleanup. Thanks @vincentkoc.
- Gateway/chat: clear the active reply-run guard before draining queued same-session follow-up turns, so sequential `chat.send` calls no longer trip `ReplyRunAlreadyActiveError` every other request. Fixes #77485. Thanks @bws14email.
- Agents/media: avoid sending generated image, video, and music attachments twice when streamed reply text arrives before the final `MEDIA:` directive.
- CLI/sessions: cap `openclaw sessions` output to the newest 100 rows by default and add `--limit <n|all>` plus JSON pagination metadata, so repeated machine polling of large session stores cannot fan out into unbounded per-row enrichment/output work. Fixes #77500. Thanks @Kaotic3.
- Doctor/config: restore legacy group chat config migrations for `routing.allowFrom`, `routing.groupChat.*`, and `channels.telegram.requireMention` so upgrades keep WhatsApp, Telegram, and iMessage group mention gates and history settings instead of leaving configs invalid or silently blocked. Thanks @scoootscooob.
- CLI/update: make package-update follow-up processes write completion results and exit explicitly, so Windows packaged upgrades do not hang after the new package finishes post-core plugin work. Thanks @vincentkoc.
- Release validation: skip Slack live QA unless Slack credentials are explicitly configured, so release gates can keep proving non-Slack surfaces while Slack is still local and credential-gated. Thanks @vincentkoc.
- Plugins/update: treat OpenClaw CalVer correction versions like `2026.5.3-1` as satisfying base plugin API ranges, so correction builds can install plugins that require the base runtime API. Fixes #77293. (#77450) Thanks @p3nchan.
- Discord/Gateway startup: retry Discord READY waits with backoff, defer startup `sessions.list` and native approval readiness failures until sidecars recover, and preserve component-only Discord payloads when final reply scrubbing removes all text. (#77478) Thanks @NikolaFC.
- CLI/launcher: forward termination signals to compile-cache respawn children, so killing a wrapper process no longer leaves the security audit worker orphaned. Fixes #77458. Thanks @jaikharbanda.
- Plugins/registry: recover managed-npm external plugins from the owned npm root when a stale persisted registry would otherwise hide them after package-manager upgrades. Fixes #77266. Thanks @p3nchan.
- Gateway: clamp unbound WebSocket auth scopes [AI]. (#77413) Thanks @pgondhi987.
- Gate zalouser startup name matching [AI]. (#77411) Thanks @pgondhi987.
- Active Memory: send a bounded latest-message search query to the recall worker so channel/runtime metadata does not become the memory search string. Fixes #65309. Thanks @joeykrug, @westley3601, @pimenov, and @tasi333.
- Device pairing: require pairing scope for the pair command [AI]. (#76377) Thanks @pgondhi987.
- Providers/OpenRouter: keep DeepSeek V4 `reasoning_effort` on OpenRouter-supported values, mapping stale `max` thinking overrides to `xhigh` so `openrouter/deepseek/deepseek-v4-pro` no longer fails with OpenRouter's invalid-effort 400. Fixes #77350. (#77423) Thanks @krllagent, @mushuiyu886, and @sallyom.
- QQBot: keep private commands off the framework surface [AI]. (#77212) Thanks @pgondhi987.
- Claude CLI: honor non-off `/think` levels by passing Claude Code's session-scoped `--effort` flag through the CLI backend seam, so chat bridges no longer show an inert thinking control. Fixes #77303. Thanks @Petr1t.
- Agents/subagents: refresh deferred final-delivery payloads when same-session completion output changes, so retried parent notifications use the final child summary instead of stale progress text. Thanks @vincentkoc.
- Agents/media: route async music and video completion results back through the requester agent, preserving automatic replies while requiring the message tool only for message-tool-only group/channel delivery.
- Active Memory: skip the memory sub-agent gracefully instead of logging a confusing allowlist error when no memory plugin (`memory-core` or `memory-lancedb`) is loaded, so active-memory with no memory backend no longer produces misleading "No callable tools remain" warnings in the gateway log. Fixes #77506. Thanks @hclsys.
- Memory/wiki: preserve representation from both corpora in `corpus=all` searches while backfilling unused result capacity, so memory hits are not starved by numerically higher wiki integer scores. Fixes #77337. Thanks @hclsys.
- Docker/compose: pin container-side `OPENCLAW_CONFIG_DIR` and `OPENCLAW_WORKSPACE_DIR` on both gateway and CLI services so the host paths written into `.env` by `scripts/docker/setup.sh` (used as Compose bind-mount sources) cannot leak into runtime code via the `env_file` import. Fixes regressions on macOS Docker setups where the first agent reply died with `EACCES: permission denied, mkdir '/Users'` because the host-style workspace path got persisted into `agents.defaults.workspace`. Fixes #77436. Thanks @lonexreb.
- Telegram: clean up tool-only draft previews after assistant message boundaries so transient `Surfacing...` tool-status bubbles do not linger when no matching final preview arrives. Thanks @BunsDev.
- Slack: report `unknown error` instead of `undefined` in socket-mode startup retry logs and label the retry reason explicitly.
- Telegram: let explicit forum-topic `requireMention` settings override persisted `/activate` and `/deactivate` state, so per-topic mention gates work consistently. Fixes #49864. Thanks @Panniantong.
- Cron: surface failed isolated-run diagnostics in `cron show`, status, and run history when requested tools are unavailable, so blocked cron runs report the actual tool-policy failure instead of a misleading green result. Fixes #75763. Thanks @RyanSandoval.
- TUI/escape abort: track the in-flight runId after `chat.send` resolves so pressing Esc during the gap before the first gateway event aborts the run instead of repeatedly printing `no active run`. Fixes #1296. Thanks @Lukavyi and @romneyda.
- TUI/render: stop the long-token sanitizer from injecting literal spaces inside inline code spans, fenced code blocks, table borders, and bare hyphenated/dotted identifiers, so copied package names, entity IDs, and shell line-continuations stay byte-for-byte intact while narrow-terminal protection still chunks unidentifiable long prose tokens. Fixes #48432, #39505. Thanks @DocOellerson, @xeusoc, @CCcassiusdjs, @akramcodez, @brokemac79, @romneyda.
- Plugin skills: publish plugin-declared skills through the generated plugin skills directory (`~/.openclaw/plugin-skills/`) while keeping direct prompt loading intact, so agent file-based discovery paths find plugin skill `SKILL.md` files and inactive plugin links are cleaned up. Fixes #77296. (#77328) Thanks @zhangguiping-xydt.
- Gateway/status: label Linux managed gateway services as `systemd user`, making status output explicit about the user-service scope instead of implying a system-level unit. Thanks @vincentkoc.
- Plugins/install: remove the previous managed plugin directory when a reinstall switches sources, so stale ClawHub and npm copies no longer keep duplicate plugin ids in discovery after the new install wins. Thanks @vincentkoc.
- Plugins/install: let official plugin reinstall recovery repair source-only installed runtime shadows, so `openclaw plugins install npm:@openclaw/discord --force` can replace the bad package instead of stopping at stale config validation. Thanks @vincentkoc.
- CLI/update: stage pnpm-detected npm-layout global package updates through a clean npm prefix swap, keep plugin install runtime imports behind a stable alias, and ship legacy install-runtime aliases back to `2026.3.22`, preventing stale overlay chunks from breaking plugin post-update sync. Thanks @vincentkoc.
- Plugins/commands: allow the official ClawHub Codex plugin package to keep reserved `/codex` command ownership, matching the existing npm-managed Codex package behavior. Thanks @vincentkoc.
- Auth/OpenAI Codex: rewrite invalidated per-agent Codex auth-order and session profile overrides toward a healthy relogin profile, so revoked OAuth accounts do not stay pinned after signing in again. Thanks @BunsDev.
- Plugins/commands: scope QQBot framework slash commands to the QQBot channel so `/bot-*` command handlers and native specs do not leak onto unrelated chat surfaces. Thanks @vincentkoc.
- fix: harden backend message action gateway routing [AI]. (#76374) Thanks @pgondhi987.
- QQBot: gate streaming command auth [AI]. (#76375) Thanks @pgondhi987.
- Plugins/discovery: ignore managed npm plugin packages that only expose TypeScript source entries without compiled runtime output, so stale/broken installs cannot hide a working bundled or reinstallable channel plugin during setup. Thanks @vincentkoc.
- CLI/update: treat OpenClaw stable correction versions like `2026.5.3-1` as newer than their base stable release, so package updates no longer ask for downgrade confirmation. Thanks @vincentkoc.
- Plugins/install: suppress dangerous-pattern scanner warnings for trusted official OpenClaw npm installs, so installing `@openclaw/discord` no longer prints credential-harvesting warnings for the official package. Thanks @vincentkoc.
- Plugins/commands: suppress dangerous-pattern scanner warnings for trusted catalog npm installs from owner-gated `/plugins install` commands, so chat-driven installs match the CLI install trust path. Thanks @vincentkoc.
- Plugins/release: make the published npm runtime verifier reject blank `openclaw.runtimeExtensions` entries instead of treating them as absent and passing via inferred outputs. Thanks @vincentkoc.
- Plugins/security: ignore inline and block comments when matching source-rule context in plugin install scans, so comment-only `fetch`/`post` references near environment defaults do not block clean plugins. Thanks @vincentkoc.
- Doctor/plugins: remove stale managed install records for bundled plugins even when the bundled plugin is not explicitly configured, so doctor cleanup cannot leave orphaned install metadata behind. Thanks @vincentkoc.
- Web fetch: scope provider fallback cache entries by the selected fetch provider so config reloads cannot reuse another provider's cached fallback payload. Thanks @vincentkoc.
- Web search: honor late-bound `tools.web.search.enabled: false` during tool execution so config reloads cannot leave an already-created `web_search` tool runnable. Thanks @vincentkoc.
- Plugins/packages: reject inferred built runtime entries that exist but fail package-boundary checks instead of falling back to TypeScript source for installed packages. Thanks @vincentkoc.
- Plugins/loader: do not retry native-loaded JavaScript plugin modules through the source transformer after native evaluation has already reached a missing dependency, avoiding duplicate top-level side effects. Thanks @vincentkoc.
- Plugins/packages: reject blank `openclaw.runtimeExtensions` entries instead of silently ignoring them and falling back to inferred TypeScript runtime entries. Thanks @vincentkoc.
- Doctor/plugins: remove stale managed npm plugin shadow entries from the managed package lock as well as `package.json` and `node_modules`, so future npm operations do not keep referencing repaired bundled-plugin shadows. Thanks @vincentkoc.
- Plugins/runtime state: keep the key being registered when namespace eviction runs in the same millisecond as existing entries, so `register` and `registerIfAbsent` do not report success while evicting their own fresh value. Thanks @vincentkoc.
- Plugins/providers: make bundled provider discovery honor restrictive `plugins.allow` by default for new configs, while doctor migrates legacy restrictive allowlist configs to `plugins.bundledDiscovery: "compat"` to preserve upgrade behavior. Thanks @dougbtv.
- Control UI/Talk: make failed Talk startup errors dismissable and clear the stale Talk error state when dismissed, so missing realtime voice provider configuration does not leave a permanent chat banner. Fixes #77071. Thanks @ijoshdavis.
- Control UI/Talk: stop and clear failed realtime Talk sessions when dismissing runtime error banners, so the next Talk click starts a fresh session instead of only stopping the stale one. Thanks @vincentkoc.
- Control UI/Talk: retry from a failed realtime Talk session on the next Talk click instead of requiring a separate stale-session stop click first. Thanks @vincentkoc.
- Canvas host: preserve the Gateway TLS scheme in browser canvas host URLs and startup mount logs, so direct HTTPS gateways do not advertise insecure canvas links. Thanks @vincentkoc.
- WhatsApp/login: route login success and failure messages through the injected runtime, so setup/onboarding surfaces capture all login output instead of only the QR. Thanks @vincentkoc.
- Google Chat: create an isolated Google auth transport per auth client, so google-auth-library interceptor mutations do not accumulate across webhook verification and access-token clients. Thanks @vincentkoc.
- Doctor/plugins: remove orphaned or recovered managed npm copies of bundled `@openclaw/*` plugins during `doctor --fix`, so stale package manifests cannot shadow the current bundled plugin config schema.
- Control UI/performance: cap long-task and long-animation-frame diagnostics in the shared event log, so slow-render telemetry does not evict gateway/plugin events from the Debug and Overview views. Thanks @vincentkoc.
- Gateway/startup: log the canvas host mount only after the HTTP server has bound, so startup logs no longer report the canvas host as mounted before it can serve requests.
- Control UI/i18n: render the Sessions active filter tooltip with the configured minute count in every locale and make the i18n check reject placeholder drift. Thanks @BunsDev.
- Web fetch: late-bind `web_fetch` config and provider fallback metadata from the active runtime snapshot, matching `web_search` so long-lived tools do not use stale fetch provider settings. Thanks @vincentkoc.
- Discord: clear stale startup probe bot/application status when the async bot probe throws, not just when it returns a degraded probe result. Thanks @vincentkoc.
- Web search: scope explicit bundled `web_search` provider runtime loading through manifest ownership, so selecting DuckDuckGo/Gemini/etc. does not import unrelated bundled providers or log their optional dependency failures. Thanks @vincentkoc.
- Plugins/discovery: demote the source-only TypeScript runtime check on already-installed `origin: "global"` plugin packages from a config-blocking error to a warning and let the runtime fall through to the TypeScript source via jiti, so a single broken installed package no longer blocks `plugins install` for unrelated plugins; install-time rejection of newly-installed source-only packages is unchanged. Thanks @romneyda.
- Providers/OpenAI Codex: stop the OAuth progress spinner before showing the manual redirect paste prompt, so callback timeouts do not spam `Browser callback did not finish` across terminals.
- Providers/OpenAI Codex: fail closed on malformed `/codex` control commands and diagnostics confirmations before changing bindings, permissions, model overrides, active turns, or feedback uploads. Thanks @vincentkoc.
- Providers/OpenAI Codex: sanitize Codex app-server command readouts, failure replies, approval prompts, elicitation prompts, and `request_user_input` text before posting them back into chat. Thanks @vincentkoc.
- Providers/OpenAI Codex: preserve local bound-turn image paths, reject stale same-thread turn notifications, enforce option-only user input prompts, and return failed dynamic tool results to Codex as unsuccessful tool calls. Thanks @vincentkoc.
- Providers/DeepSeek: expose DeepSeek V4 `xhigh` and `max` thinking levels through the lightweight provider-policy surface, so Control UI `/think` pickers keep showing the max reasoning options when the runtime plugin registry is not active. Fixes #77139. Thanks @bittoby.
- Release/beta smoke: resolve the dispatched Telegram beta E2E run from `gh run list` when `gh workflow run` returns no run URL, so the maintainer helper does not fail immediately after dispatch. Thanks @vincentkoc.
- Media/images: keep HEIC/HEIF attachments fail-closed when optional Sharp conversion is unavailable instead of sending originals that still need conversion. Thanks @vincentkoc.
- Google Meet: fork the caller's current agent transcript into agent-mode meeting consultant sessions, so Meet replies inherit the context from the tool call that joined the meeting.
- iOS/mobile pairing: reject non-loopback `ws://` setup URLs before QR/setup-code issuance and let the iOS Gateway settings screen scan QR codes or paste full setup-code messages. Thanks @BunsDev.
- Control UI: keep Gateway Access inputs and locale picker contained inside the card at narrow and tablet widths.
- Agents/trajectory: bound runtime trajectory capture and yield queued sidecar writes so oversized traces stop recording instead of monopolizing Gateway cleanup. Fixes #77124. Thanks @loyur.
- Telegram/streaming: sanitize tool-progress draft preview backticks before shared compaction, so long backtick-heavy progress text still renders inside the safe code-formatted preview instead of collapsing to an ellipsis.
- UI/chat: remove the unsupported `line-clamp` declaration from the chat queue text rule to eliminate Firefox console noise without changing visible truncation behavior. Thanks @ZanderH-code.
- Control UI: add explicit feedback for repeated actions by announcing session switches, flashing the active session selector, showing inline Save/Apply/Update progress, and distinguishing filtered-empty session lists from genuinely empty session stores. Thanks @BunsDev.
- Agents/Pi: suppress persistence for synthetic mid-turn overflow continuation prompts, so transcript-retry recovery does not write the "continue from transcript" prompt as a new user turn. Thanks @vincentkoc.
- Agents/tools: strip reasoning text from visible rich presentation titles, blocks, buttons, and select labels before message-tool sends, so structured channel payloads cannot leak hidden planning. Thanks @vincentkoc.
- Telegram: keep reply-dispatch lazy provider runtime chunks behind stable dist names and delete `/reasoning stream` previews after final delivery so package updates and live reasoning drafts do not leave Telegram turns broken or noisy. Thanks @BunsDev.
- Discord: start the gateway monitor without waiting for the startup bot/application probe, so WSL2 hosts with a slow `/users/@me` REST path still bring the channel online while status enrichment finishes asynchronously. Fixes #77103. Thanks @Suited78.
- Exec approvals: detect `env -S` split-string command-carrier risks when `-S`/`-s` is combined with other env short options, so approval explanations do not miss split payloads hidden behind `env -iS...`. Thanks @vincentkoc.
- Google Meet: log the concrete agent-mode TTS provider, model, voice, output format, and sample rate after speech synthesis, so Meet logs show which voice backend spoke each reply.
- Voice Call: mark realtime calls completed when the realtime provider closes normally, so Twilio/OpenAI/Google realtime stop events do not leave active call records behind. Thanks @vincentkoc.
- Gateway/update: keep the shutdown close path behind a stable runtime chunk and ship compatibility aliases for recent `server-close-*` hashes, so manual npm package replacement cannot leave an already-running Gateway unable to shut down cleanly. Fixes #77087. Thanks @westlife219.
- Control UI/media: mint short-lived scoped tickets for assistant media fetches and render ticketed URLs instead of exposing long-lived auth tokens in chat image URLs. Fixes #70830 and #77097. Thanks @hclsys.
- Exec approvals: treat POSIX `exec` as a command carrier for inline eval, shell-wrapper, and eval/source detection, so approval explanations and command-risk checks do not miss payloads hidden behind `exec`. Thanks @vincentkoc.
- Google Meet: log the resolved audio provider model when starting Chrome and paired-node Meet talk-back bridges, so agent-mode joins show the STT model and bidi joins show the realtime voice model.
- Diagnostics: handle missing session-tail files in cron recovery context without tripping extension test typecheck. Thanks @vincentkoc.
- QA/Slack: update the Slack dispatch preview fallback test SDK mock for structured progress draft helpers, so the rich progress draft regression suite covers the new imports instead of failing before assertions run. Thanks @vincentkoc.
- Release validation: allow focused QA live reruns to select Matrix and Telegram without running Slack, so known Slack credential-pool outages do not block non-Slack live proof. Thanks @vincentkoc.
- Plugins/loader: keep bundled plugin package `test-api.js` aliases behind private QA mode, so source transforms do not expose test-only public surfaces during normal plugin loading. Thanks @vincentkoc.
- Gateway/startup: start cron and record the post-ready memory trace even when deferred maintenance timers fail after readiness, so a non-fatal timer setup issue does not silently leave scheduled jobs idle. Thanks @vincentkoc.
- Exec approvals: unwrap BSD/macOS `env -P <path>` carrier commands before approval-command and strict inline-eval checks, so `/approve` shell execution and inline interpreter payloads are still blocked behind that env form.
- Agents/session status: keep semantic `session_status({ sessionKey: "current" })` on the live run session even before that run has a persisted session-store entry, instead of falling back to the sandbox policy key. Thanks @vincentkoc.
- QA/Slack: resolve bundled official plugin public-surface package aliases during source-mode QA runs, so release Slack live validation can load `@openclaw/slack/api.js` without workspace symlinks. Thanks @vincentkoc.
- Codex: pass the live run session key into app-server dynamic tools when sandbox policy uses a separate session key, so `session_status({ sessionKey: "current" })` reports the active run instead of the sandbox policy key. Thanks @vincentkoc.
- Web search: keep first-class assistant `web_search` auto-detect and configured runtime providers visible when active runtime metadata or the active plugin registry is incomplete. Fixes #77073. Thanks @joeykrug.
- Plugins/tools: mark manifest-optional sibling tools as optional even when they come from a shared non-optional factory, so cached/status/MCP metadata keeps opt-in tool policy accurate. Thanks @vincentkoc.
- Matrix: keep `streaming.progress.toolProgress` scoped to progress draft mode, so partial and quiet Matrix previews do not lose tool progress unless `streaming.preview.toolProgress` is disabled. Thanks @vincentkoc.
- Gateway/validation: isolate gateway server validation files, ignore unrelated startup logs in request-trace coverage, and fail fast on stuck shared-auth sockets, reducing false main-branch CI failures for contributors. Thanks @amknight.
- Channels/streaming: keep `streaming.progress.toolProgress` scoped to progress draft mode, so disabling compact progress lines does not silence partial/block preview tool updates. Thanks @vincentkoc.
- Plugins/update: treat OpenClaw stable correction versions like `2026.5.3-1` as stable releases for npm installs, plugin updates, and bundled-version comparisons, so `latest` can advance official plugins without prerelease opt-in. Thanks @vincentkoc.
- Control UI: point the Appearance tweakcn browse action and docs at the live tweakcn editor route instead of the removed `/themes` page. Fixes #77048.
- Control UI: render Dream Diary prose through the sanitized markdown pipeline, so diary bold/italic/header markdown no longer appears as literal source text. Fixes #62413.
- Control UI: render tool results whose output arrives as text-block arrays and give expanded tool output a scrollable block, so read/exec output remains visible in WebChat. Fixes #77054.
- MCP: include serialized conversation/message payloads in the primary text content for `conversations_list` and `messages_read`, while preserving `structuredContent` for capable clients. Fixes #77024.
- Media: treat `EPERM` from the post-write media fsync step as best-effort, allowing WebChat and channel uploads to finish on Windows filesystems that reject `fsync` after a successful write. Fixes #76844.
- Media/Telegram: send in-limit original images when optional image optimization is unavailable, so Telegram MEDIA replies and message-tool image sends do not fail just because `sharp` is missing. Fixes #77081. (#77117) Thanks @pfrederiksen.
- Diagnostics: include last progress, cron job/run ids, stopped cron job name, and the last assistant transcript snippet in stalled-session and stuck-session recovery logs so cron stalls show what was stopped.
- Streaming channels: add `streaming.preview.commandText: "status"` / `streaming.progress.commandText: "status"` to hide command/exec text in preview progress lines while keeping the released raw command text default. Fixes #77072.
- Agents/cron: let explicit cron `timeoutSeconds` drive both CLI no-output and embedded LLM idle watchdogs instead of being capped by resume defaults. Fixes #76289.
- Plugins/catalog: suppress missing `channelConfigs` compatibility diagnostics for external channel plugins that are disabled, denied, or outside a restrictive allowlist. Fixes #76095.
- Diagnostics: keep webhook/message OTEL attributes and Prometheus delivery labels low-cardinality and omit raw chat/message IDs from spans, so progress-draft and message-tool modes do not leak high-cardinality messaging identifiers.
- Google Meet: stop advertising legacy `mode: "realtime"` to agents and config UIs, while keeping it as a hidden compatibility alias for `mode: "agent"`, so new joins use the STT -> OpenClaw agent -> TTS path instead of selecting the direct realtime voice fallback.
- Google Meet: add `chrome.audioBufferBytes` for generated command-pair SoX audio commands and lower the default buffer from SoX's 8192 bytes to 4096 bytes to reduce Chrome talk-back latency.
- Google Meet: split realtime provider config into agent-mode transcription and bidi-mode voice providers, and migrate legacy Gemini Live bidi configs with `doctor --fix`, so Gemini Live can back direct bidi fallback without breaking the default OpenClaw agent talk-back path.
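
Several entries above describe treating stable correction versions such as `2026.5.3-1` as newer than their base release rather than as prereleases. The following is a minimal sketch of that ordering rule, assuming a `YYYY.M.N` layout with an optional numeric `-N` correction suffix. The names `parseCalVer` and `compareCalVer` are hypothetical and are not taken from the OpenClaw source.

```typescript
// Illustrative only: hypothetical helpers, not the OpenClaw implementation.
interface CalVer {
  year: number;
  month: number;
  revision: number;
  correction: number; // 0 when the version has no "-N" suffix
}

// Parse "2026.5.3" or "2026.5.3-1"; return null for anything else
// (for example prerelease tags such as "2026.5.3-beta.1").
function parseCalVer(version: string): CalVer | null {
  const match = /^(\d{4})\.(\d{1,2})\.(\d+)(?:-(\d+))?$/.exec(version);
  if (!match) return null;
  return {
    year: Number(match[1]),
    month: Number(match[2]),
    revision: Number(match[3]),
    correction: match[4] === undefined ? 0 : Number(match[4]),
  };
}

// Return -1, 0, or 1. A correction build sorts after its base release,
// which is the opposite of how semver orders a "-1" prerelease suffix.
function compareCalVer(a: string, b: string): number {
  const left = parseCalVer(a);
  const right = parseCalVer(b);
  if (!left || !right) {
    throw new Error(`not a stable CalVer version: ${!left ? a : b}`);
  }
  const diff =
    left.year - right.year ||
    left.month - right.month ||
    left.revision - right.revision ||
    left.correction - right.correction;
  return Math.sign(diff);
}
```

Under this sketch, `compareCalVer("2026.5.3-1", "2026.5.3")` is positive, so an update from the base release to the correction build would not look like a downgrade.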
@@ -82,238 +222,72 @@ Docs: https://docs.openclaw.ai
- Google Meet: expose `voiceCall.postDtmfSpeechDelayMs` in the plugin manifest schema and setup hints, so manifest-based config editing accepts the runtime-supported Twilio delay key. Thanks @vincentkoc.
- Google Meet: keep explicit non-Google `realtime.provider` values as the transcription provider compatibility fallback when `realtime.transcriptionProvider` is unset. Thanks @vincentkoc.
- Google Meet: make Twilio setup status require an enabled `voice-call` plugin entry instead of treating a missing entry as ready. Thanks @vincentkoc.
- Google Meet: avoid treating repeated participant words as multiple assistant-overlap matches when suppressing realtime echo transcripts. Thanks @vincentkoc.
- Google Meet: make `mode: "agent"` the default Chrome talk-back path, using realtime transcription for input and regular OpenClaw TTS for speech output, while keeping direct realtime voice answers available as `mode: "bidi"` and accepting `mode: "realtime"` as an agent-mode compatibility alias.
- Google Meet: make realtime talk-back agent-driven by default with `realtime.strategy: "agent"`, keep the previous direct bidirectional model behavior available as `realtime.strategy: "bidi"`, route the Meet tab speaker output to `BlackHole 2ch` automatically for local Chrome realtime joins, coalesce nearby speech transcript fragments before consulting the agent, and avoid cutting off agent speech from server VAD or stale playback pipe errors.
- Google Meet: suppress queued assistant playback and assistant-like transcript echoes from the realtime input path, so the meeting does not hear the agent's own speech as a new user turn and loop or cut itself off.
- Google Meet: keep Chrome realtime transport tests hermetic on Linux prerelease shards while preserving the macOS-only runtime guard. Thanks @vincentkoc.
- Voice Call: mark realtime calls completed when the realtime provider closes normally, so Twilio/OpenAI/Google realtime stop events do not leave active call records behind. Thanks @vincentkoc.
- Slack: keep health-monitor recovery stops from poisoning manual-stop state after channel stop timeouts, allowing Socket Mode accounts to reconnect after event-loop stalls instead of staying dead until Gateway restart. Fixes #77651. Thanks @Gusty3055.
- Slack: report `unknown error` instead of `undefined` in socket-mode startup retry logs and label the retry reason explicitly.
- Slack/mentions: record thread participation for successful visible threaded Slack sends, including message-tool and media delivery paths, so unmentioned replies in bot-participated threads can bypass mention gating as documented. Fixes #77648. Thanks @bek91.
- Slack/subagents: keep resumed parent `message.send` calls in the originating Slack thread when ambient session thread context is present, and suppress successful silent child completion rows from follow-up findings. Thanks @bek91.
- WhatsApp/onboarding: canonicalize setup and pairing allowlist entries to WhatsApp's digit-only phone ids while still accepting E.164, JID, and `whatsapp:` inputs, so personal-phone allowlists match WhatsApp Web sender ids after setup. Thanks @vincentkoc.
|
||||
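The WhatsApp allowlist canonicalization above can be illustrated with a minimal sketch. The helper name `canonicalizeWhatsAppId` and its exact parsing rules are assumptions for illustration, not OpenClaw's actual implementation; it only shows how E.164, JID, and `whatsapp:` inputs could collapse to one digit-only id.

```typescript
// Hypothetical helper (not OpenClaw's API): reduce the accepted input shapes
// to a digit-only phone id, or return null when no digits-only id remains.
function canonicalizeWhatsAppId(input: string): string | null {
  let value = input.trim();
  // Drop an optional "whatsapp:" prefix (case-insensitive).
  value = value.replace(/^whatsapp:/i, "");
  // Drop a JID domain such as "@s.whatsapp.net".
  const at = value.indexOf("@");
  if (at !== -1) value = value.slice(0, at);
  // Drop a ":device" suffix that can precede the JID domain.
  const colon = value.indexOf(":");
  if (colon !== -1) value = value.slice(0, colon);
  // Remove "+", whitespace, dashes, dots, and parentheses seen in E.164-style input.
  const digits = value.replace(/[\s+\-().]/g, "");
  return /^\d+$/.test(digits) ? digits : null;
}
```

With this shape, an allowlist entry and an inbound sender id can be compared after both pass through the same function.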
- WhatsApp/login: route login success and failure messages through the injected runtime, so setup/onboarding surfaces capture all login output instead of only the QR. Thanks @vincentkoc.
|
||||
- Channels/WhatsApp: apply the shared group/channel visible-reply mode during inbound dispatch so group replies stay message-tool-only by default without overriding direct-chat harness defaults. Refs #75178 and #67394. Thanks @scoootscooob.
|
||||
- Telegram/media: derive no-caption inbound media placeholders from saved MIME metadata instead of the Telegram `photo` shape, so non-image and mixed attachments no longer reach the model as `<media:image>`. Fixes #69793. Thanks @aspalagin.
|
||||
- Telegram/streaming: reuse the active preview as the first chunk for long text finals, so multi-chunk replies no longer create a transient extra bubble that appears and then disappears. Thanks @vincentkoc.
|
||||
- Telegram/streaming: sanitize tool-progress draft preview backticks before shared compaction, so long backtick-heavy progress text still renders inside the safe code-formatted preview instead of collapsing to an ellipsis.
|
||||
- Telegram: clean up tool-only draft previews after assistant message boundaries so transient `Surfacing...` tool-status bubbles do not linger when no matching final preview arrives. Thanks @BunsDev.
|
||||
- Telegram: let explicit forum-topic `requireMention` settings override persisted `/activate` and `/deactivate` state, so per-topic mention gates work consistently. Fixes #49864. Thanks @Panniantong.
|
||||
- Telegram: keep reply-dispatch lazy provider runtime chunks behind stable dist names and delete `/reasoning stream` previews after final delivery so package updates and live reasoning drafts do not leave Telegram turns broken or noisy. Thanks @BunsDev.
|
||||
- Telegram: render shared interactive reply buttons in reply delivery so plugin approval messages show inline keyboards. (#76238) Thanks @keshavbotagent.
|
||||
- Telegram: deliver button-only interactive replies by sending the shared fallback button-label text with the inline keyboard instead of dropping the reply as empty. Thanks @vincentkoc.
|
||||
- Telegram: keep status checks pointed at the active chat so asking for the current session no longer reports an old direct-message conversation. (#76708) Thanks @amknight.
|
||||
- Media/Telegram: send in-limit original images when optional image optimization is unavailable, so Telegram MEDIA replies and message-tool image sends do not fail just because `sharp` is missing. Fixes #77081. (#77117) Thanks @pfrederiksen.
|
||||
- Discord/replies: treat failed final reply delivery as a failed turn instead of counting it as a delivered automatic visible reply, so guild/channel turns no longer show done when the final message was dropped. Fixes #77520. Thanks @Patrick-Erichsen.
|
||||
- Discord: prefer IPv4 for Discord REST and gateway WebSocket startup paths so IPv4-only networks no longer stall before Gateway READY and inbound message dispatch. Fixes #77398; refs #77526. Thanks @Beandon13.
|
||||
- Discord: clear stale startup probe bot/application status when the async bot probe throws, not just when it returns a degraded probe result. Thanks @vincentkoc.
|
||||
- Discord: start the gateway monitor without waiting for the startup bot/application probe, so WSL2 hosts with a slow `/users/@me` REST path still bring the channel online while status enrichment finishes asynchronously. Fixes #77103. Thanks @Suited78.
|
||||
- Discord/Gateway startup: retry Discord READY waits with backoff, defer startup `sessions.list` and native approval readiness failures until sidecars recover, and preserve component-only Discord payloads when final reply scrubbing removes all text. (#77478) Thanks @NikolaFC.
|
||||
- Webhooks/Gmail/Windows: resolve `gcloud`, `gog`, and `tailscale` PATH/PATHEXT shims before setup and watcher spawns, using the Windows-safe `.cmd` wrapper for long-lived `gog serve` processes. (#74881, fixes #54470) Thanks @Angfr95.
|
||||
- Infra/Windows: skip the POSIX `/tmp/openclaw` preferred path on Windows in `resolvePreferredOpenClawTmpDir` so log files, TTS temp files, and other writes land in `%TEMP%\openclaw-<uid>` instead of `C:\tmp\openclaw`. Fixes #60713. Thanks @juan-flores077.
|
||||
- Media/Windows: open saved attachment temp files read/write before fsync so Windows WebChat and `chat.send` media offloads no longer fail with EPERM during durability flush. (#76593) Thanks @qq230849622-a11y.
|
||||
- Plugins/Windows: show a Git install hint when npm plugin installation fails with `spawn git ENOENT`, and document the WhatsApp plugin's Git-on-PATH requirement for Baileys/libsignal installs.
|
||||
- Media/images: keep HEIC/HEIF attachments fail-closed when optional Sharp conversion is unavailable instead of sending originals that still need conversion. Thanks @vincentkoc.
|
||||
- Control UI/chat: suppress `HEARTBEAT_OK` acknowledgement history, streams, deltas, and final events before they enter the transcript view, so repeated heartbeat no-op turns do not stack noisy bubbles. Thanks @BunsDev.
|
||||
- Control UI/Talk: make failed Talk startup errors dismissable and clear the stale Talk error state when dismissed, so missing realtime voice provider configuration does not leave a permanent chat banner. Fixes #77071. Thanks @ijoshdavis.
|
||||
- Control UI/Talk: stop and clear failed realtime Talk sessions when dismissing runtime error banners, so the next Talk click starts a fresh session instead of only stopping the stale one. Thanks @vincentkoc.
|
||||
- Control UI/Talk: retry from a failed realtime Talk session on the next Talk click instead of requiring a separate stale-session stop click first. Thanks @vincentkoc.
|
||||
- Control UI/media: mint short-lived scoped tickets for assistant media fetches and render ticketed URLs instead of exposing long-lived auth tokens in chat image URLs. Fixes #70830 and #77097. Thanks @hclsys.
|
||||
- Control UI: keep Gateway Access inputs and locale picker contained inside the card at narrow and tablet widths.
|
||||
- Control UI: add explicit feedback for repeated actions by announcing session switches, flashing the active session selector, showing inline Save/Apply/Update progress, and distinguishing filtered-empty session lists from genuinely empty session stores. Thanks @BunsDev.
|
||||
- Control UI: point the Appearance tweakcn browse action and docs at the live tweakcn editor route instead of the removed `/themes` page. Fixes #77048.
|
||||
- Control UI: render Dream Diary prose through the sanitized markdown pipeline, so diary bold/italic/header markdown no longer appears as literal source text. Fixes #62413.
|
||||
- Control UI: render tool results whose output arrives as text-block arrays and give expanded tool output a scrollable block, so read/exec output remains visible in WebChat. Fixes #77054.
|
||||
- UI/chat: remove the unsupported `line-clamp` declaration from the chat queue text rule to eliminate Firefox console noise without changing visible truncation behavior. Thanks @ZanderH-code.
|
||||
- TUI/escape abort: track the in-flight runId after `chat.send` resolves so pressing Esc during the gap before the first gateway event aborts the run instead of repeatedly printing `no active run`. Fixes #1296. Thanks @Lukavyi and @romneyda.
|
||||
- TUI/render: stop the long-token sanitizer from injecting literal spaces inside inline code spans, fenced code blocks, table borders, and bare hyphenated/dotted identifiers, so copied package names, entity IDs, and shell line-continuations stay byte-for-byte intact while narrow-terminal protection still chunks unidentifiable long prose tokens. Fixes #48432, #39505. Thanks @DocOellerson, @xeusoc, @CCcassiusdjs, @akramcodez, @brokemac79, @romneyda.
|
||||
- iOS/mobile pairing: reject non-loopback `ws://` setup URLs before QR/setup-code issuance and let the iOS Gateway settings screen scan QR codes or paste full setup-code messages. Thanks @BunsDev.
|
||||
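As a rough illustration of the iOS pairing check above, the sketch below flags plain `ws://` URLs whose host is not loopback. The function name and the choice to reject unparseable input are assumptions; the real check may treat other schemes or hosts differently.

```typescript
// Hypothetical check (not OpenClaw's API): returns true when a setup URL
// should be rejected because it is plain ws:// to a non-loopback host.
function isRejectedSetupUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    // Assumption: unparseable input is rejected in this sketch.
    return true;
  }
  // Only plain ws:// is examined here; other schemes are out of scope.
  if (url.protocol !== "ws:") return false;
  const host = url.hostname.toLowerCase();
  const loopback =
    host === "localhost" ||
    host === "[::1]" || // WHATWG URL keeps brackets around IPv6 hostnames
    /^127(\.\d{1,3}){3}$/.test(host);
  return !loopback;
}
```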
- Canvas host: preserve the Gateway TLS scheme in browser canvas host URLs and startup mount logs, so direct HTTPS gateways do not advertise insecure canvas links. Thanks @vincentkoc.
|
||||
- Model switching: include the exact additive allowlist repair command when `/model ... --runtime ...` targets a blocked model, and make Telegram's model picker say that it changes only the session model while leaving the runtime unchanged. Thanks @vincentkoc.
|
||||
- Mattermost: clarify that the model picker only changes the session model and that runtime switches require `/oc_model <provider/model> --runtime <runtime>`. Thanks @vincentkoc.
|
||||
- Mattermost: use the shared progress draft formatter for tool status previews, including raw command/detail output when `agents.defaults.toolProgressDetail: "raw"` is enabled. Thanks @vincentkoc.
|
||||
- Mattermost: suppress standalone default tool-progress messages while draft previews are active, including when draft tool lines are disabled. Thanks @vincentkoc.
|
||||
- Discord/Slack/Mattermost: align draft preview tool-progress config help with the runtime behavior that hides interim tool updates when `streaming.preview.toolProgress` is false. Thanks @vincentkoc.
|
||||
- Google Chat: create an isolated Google auth transport per auth client, so google-auth-library interceptor mutations do not accumulate across webhook verification and access-token clients. Thanks @vincentkoc.
|
||||
- Google Chat: normalize Google auth certificate response headers before google-auth-library reads cache-control, so inbound webhook auth no longer rejects with `res?.headers.get is not a function`. Fixes #76880. Thanks @donbowman.
|
||||
- Providers/DeepSeek: expose DeepSeek V4 `xhigh` and `max` thinking levels through the lightweight provider-policy surface, so Control UI `/think` pickers keep showing the max reasoning options when the runtime plugin registry is not active. Fixes #77139. Thanks @bittoby.
|
||||
- Providers/OpenRouter: keep DeepSeek V4 `reasoning_effort` on OpenRouter-supported values, mapping stale `max` thinking overrides to `xhigh` so `openrouter/deepseek/deepseek-v4-pro` no longer fails with OpenRouter's invalid-effort 400. Fixes #77350. (#77423) Thanks @krllagent, @mushuiyu886, and @sallyom.
|
||||
- Providers/OpenAI Codex: stop the OAuth progress spinner before showing the manual redirect paste prompt, so callback timeouts do not spam `Browser callback did not finish` across terminals.
|
||||
- Providers/OpenAI Codex: fail closed on malformed `/codex` control commands and diagnostics confirmations before changing bindings, permissions, model overrides, active turns, or feedback uploads. Thanks @vincentkoc.
|
||||
- Providers/OpenAI Codex: sanitize Codex app-server command readouts, failure replies, approval prompts, elicitation prompts, and `request_user_input` text before posting them back into chat. Thanks @vincentkoc.
|
||||
- Providers/OpenAI Codex: preserve local bound-turn image paths, reject stale same-thread turn notifications, enforce option-only user input prompts, and return failed dynamic tool results to Codex as unsuccessful tool calls. Thanks @vincentkoc.
|
||||
- OpenAI Codex: recreate missing bound app-server threads once when a stale `/codex bind` sidecar survives a restart, preserving the selected auth profile and turn overrides before retrying the inbound turn. (#76936) Thanks @keshavbotagent.
|
||||
- OpenAI Codex: honor `auth.order.openai-codex` when starting app-server clients without an explicit auth profile, so status/model probes and implicit startup use the configured Codex account instead of falling back to the default profile. Thanks @vincentkoc.
|
||||
- OpenAI Codex: let SSRF-guarded provider requests inherit OpenClaw's undici IPv4/IPv6 fallback policy, so ChatGPT-backed Codex runs recover on IPv4-working hosts when DNS still returns unreachable IPv6 addresses. Fixes #76857. Thanks @jplavoiemtl and @SymbolStar.
|
||||
- Auth/OpenAI Codex: rewrite invalidated per-agent Codex auth-order and session profile overrides toward a healthy relogin profile, so revoked OAuth accounts do not stay pinned after signing in again. Thanks @BunsDev.
|
||||
- Plugins/Codex: preserve Codex-native OAuth routing for `/codex bind` app-server turns so bound sessions keep the selected Codex auth profile instead of falling back to public OpenAI credentials. (#76714) Thanks @keshavbotagent.
|
||||
- Codex harness: preserve app-server usage-limit reset details and deliver OpenClaw-owned runtime failure notices through tool-only source-reply mode, so Telegram and other chat channels tell users when Codex subscription limits or API failures block a turn instead of going silent. (#77557) Thanks @pashpashpash.
|
||||
- Codex harness: keep `codex_app_server.*` telemetry publication owned by the harness instead of republishing the same callback event from core runners. Thanks @vincentkoc.
|
||||
- Codex plugin: mirror the experimental upstream app-server protocol and format generated TypeScript before drift checks, keeping OpenClaw's `experimentalApi` bridge compatible with latest Codex while preserving formatter gates.
|
||||
- Agents/OpenAI: default direct OpenAI Responses models to the SSE transport instead of WebSocket auto-selection, preventing pi runtime chat turns from hanging on servers where the WebSocket path stalls while the OpenAI HTTP stream works. Thanks @vincentkoc.
|
||||
- Claude CLI: honor non-off `/think` levels by passing Claude Code's session-scoped `--effort` flag through the CLI backend seam, so chat bridges no longer show an inert thinking control. Fixes #77303. Thanks @Petr1t.
|
||||
- Browser/SSRF: enforce the existing current-tab URL navigation policy before tab-scoped debug, export, and read routes (console, page errors, network requests, trace start/stop, response body, screenshot, snapshot, storage, etc.) collect from an already-selected tab, so blocked tabs return a policy error instead of being read first and redacted only at response time. (#75731) Thanks @eleqtrizit.
|
||||
- Browser: enforce strict SSRF current-URL checks before existing-session screenshots, matching existing-session snapshot handling. Thanks @vincentkoc.
|
||||
- Gateway: clamp unbound WebSocket auth scopes [AI]. (#77413) Thanks @pgondhi987.
|
||||
- Device pairing: require the pairing scope for the pair command [AI]. (#76377) Thanks @pgondhi987.
|
||||
- Gateway: harden backend message-action gateway routing [AI]. (#76374) Thanks @pgondhi987.
|
||||
- QQBot: gate streaming command auth [AI]. (#76375) Thanks @pgondhi987.
|
||||
- QQBot: keep private commands off the framework surface [AI]. (#77212) Thanks @pgondhi987.
|
||||
- Zalouser: gate startup name matching [AI]. (#77411) Thanks @pgondhi987.
|
||||
- QQBot: preserve the framework command authorization decision when converting framework command contexts into engine slash command contexts, so downstream slash handlers see `commandAuthorized` matching the channel's resolved `isAuthorizedSender` instead of a hardcoded `true`. (#77453) Thanks @drobison00.
|
||||
- Agents/cache: keep per-turn runtime context out of ordinary chat system prompts while still delivering hidden current-turn context, restoring prompt-cache reuse on chat continuations. Fixes #77431. Thanks @Udjin79.
|
||||
- Agents/tools: honor narrow runtime tool allowlists when constructing embedded-runner tool families and bundled MCP/LSP runtimes, so cron/subagent runs that request tools such as `update_plan`, `browser`, `x_search`, channel login tools, or `group:plugins` no longer start with missing tools or unrelated bootstrap work. (#77519, #77532)
|
||||
- Agents/tools: add a post-compaction loop guard in `pi-embedded-runner` that arms after an auto-compaction retry and aborts the run with `compaction_loop_persisted` when the agent emits the same `(tool, args, result)` triple `windowSize` times (default 3) within that window. Disable it via the existing `tools.loopDetection.enabled` setting; tune it via `tools.loopDetection.postCompactionGuard.windowSize`. This targets the failure mode where context overflow plus compaction does not break a tool-call loop. Refs #77474; carries forward #21597. Thanks @efpiva.
|
||||
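The post-compaction loop guard entry above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, and "the same triple `windowSize` times within that window" is read here as "the last `windowSize` observed calls are identical". The actual guard in `pi-embedded-runner` may count differently.

```typescript
// Hypothetical sketch (names are assumptions, not OpenClaw's API).
interface ToolCallRecord {
  tool: string;
  args: unknown;
  result: unknown;
}

class PostCompactionLoopGuard {
  private readonly windowSize: number;
  private armed = false;
  private recent: string[] = [];

  constructor(windowSize = 3) {
    this.windowSize = windowSize;
  }

  // Called after an auto-compaction retry; starts a fresh observation window.
  arm(): void {
    this.armed = true;
    this.recent = [];
  }

  // Returns true when the run should abort (e.g. with "compaction_loop_persisted").
  observe(record: ToolCallRecord): boolean {
    if (!this.armed) return false;
    // JSON.stringify is key-order sensitive; a real guard may need stable hashing.
    const key = JSON.stringify([record.tool, record.args, record.result]);
    this.recent.push(key);
    if (this.recent.length > this.windowSize) this.recent.shift();
    return (
      this.recent.length === this.windowSize &&
      this.recent.every((k) => k === key)
    );
  }
}
```

Keeping the guard disarmed until compaction has already been tried means ordinary repeated tool calls are not affected; only loops that survive compaction trip it.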
- Agents/tools: strip reasoning text from visible rich presentation titles, blocks, buttons, and select labels before message-tool sends, so structured channel payloads cannot leak hidden planning. Thanks @vincentkoc.
|
||||
- Agents/tools: use config-only runtime snapshots for plugin tool registration and live runtime config getters, avoiding expensive full secrets snapshot clones on the core-plugin-tools prep path. Fixes #76295.
|
||||
- Agents/tools: honor the effective tool denylist before constructing optional PDF/media tool factories, so `tools.deny: ["pdf"]` skips PDF setup before later policy filtering. Fixes #76997.
|
||||
- Agents/skills: require exact `<location>` skill paths for both single-skill and multi-skill prompt selection, so agents do not guess or hard-code skill file paths. (#74161) Thanks @lanzhi-lee.
|
||||
- Agents/skills: rebuild sandboxed non-rw run skill prompts from the sandbox workspace copy, so `<available_skills>` no longer points at host-only `~/.openclaw/skills` paths. Fixes #50590. Thanks @kidroca and @sallyom.
|
||||
- Agents/media: avoid sending generated image, video, and music attachments twice when streamed reply text arrives before the final `MEDIA:` directive.
|
||||
- Agents/media: tell async music and video completion agents when normal final replies are private, and send completion fallbacks directly to message-tool-only group/channel routes when the completion agent still only writes a private final reply, so generated media does not disappear behind the delivery contract.
|
||||
- Agents/media: route async music and video completion results back through the requester agent, preserving automatic replies while requiring the message tool only for message-tool-only group/channel delivery.
|
||||
- Agents/subagents: refresh deferred final-delivery payloads when same-session completion output changes, so retried parent notifications use the final child summary instead of stale progress text. Thanks @vincentkoc.
|
||||
- Agents/subagents: detect prefix-only completion announce replies and fall back to the captured child result so requester chats no longer lose most of long sub-agent reports silently. Fixes #76412. Thanks @inxaos and @davemorin.
|
||||
- Active Memory: give timeout partial transcript recovery enough abort-settle headroom so temporary recall summaries are returned before cleanup. Thanks @vincentkoc.
|
||||
- Active Memory: send a bounded latest-message search query to the recall worker so channel/runtime metadata does not become the memory search string. Fixes #65309. Thanks @joeykrug, @westley3601, @pimenov, and @tasi333.
|
||||
- Active Memory: skip the memory sub-agent gracefully instead of logging a confusing allowlist error when no memory plugin (`memory-core` or `memory-lancedb`) is loaded, so Active Memory with no memory backend no longer produces misleading "No callable tools remain" warnings in the gateway log. Fixes #77506. Thanks @hclsys.
|
||||
- Memory/wiki: preserve representation from both corpora in `corpus=all` searches while backfilling unused result capacity, so memory hits are not starved by numerically higher wiki integer scores. Fixes #77337. Thanks @hclsys.
|
||||
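One way to picture the `corpus=all` behavior described above is a per-corpus quota with backfill, so that scores on different scales are never compared directly. The sketch below is an assumption about the general technique, not the actual memory/wiki search code; the `Hit` shape, the even split, and the function name are all hypothetical.

```typescript
// Hypothetical sketch: reserve roughly half the slots per corpus, then
// backfill unused capacity from whichever corpus still has results.
interface Hit {
  corpus: "memory" | "wiki";
  id: string;
  score: number;
}

function mergeCorpora(memory: Hit[], wiki: Hit[], limit: number): Hit[] {
  const cap = Math.max(0, Math.floor(limit));
  const byScore = (a: Hit, b: Hit) => b.score - a.score;
  const m = [...memory].sort(byScore);
  const w = [...wiki].sort(byScore);
  const memoryQuota = Math.ceil(cap / 2);
  const wikiQuota = cap - memoryQuota;
  const memoryTake = m.slice(0, memoryQuota);
  const wikiTake = w.slice(0, wikiQuota);
  // Capacity is spare only when at least one corpus ran short, so the
  // leftovers below come from at most one corpus and need no cross-scale sort.
  const spare = cap - memoryTake.length - wikiTake.length;
  const leftovers = [...m.slice(memoryQuota), ...w.slice(wikiQuota)];
  // Results are grouped by corpus here; a caller may interleave them instead.
  return [...memoryTake, ...wikiTake, ...leftovers.slice(0, spare)];
}
```

A plain sort by raw score would let integer wiki scores such as 12 crowd out memory scores such as 0.9, which is the starvation the entry describes.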
- Plugin skills: publish plugin-declared skills through the generated plugin skills directory (`~/.openclaw/plugin-skills/`) while keeping direct prompt loading intact, so agent file-based discovery paths find plugin skill `SKILL.md` files and inactive plugin links are cleaned up. Fixes #77296. (#77328) Thanks @zhangguiping-xydt.
|
||||
- Plugins/install: honor the beta update channel for onboarding and doctor-managed plugin installs by requesting floating npm and ClawHub specs with `@beta` while keeping persistent install records on the catalog default. Thanks @vincentkoc.
|
||||
- Plugins/install: remove the previous managed plugin directory when a reinstall switches sources, so stale ClawHub and npm copies no longer keep duplicate plugin ids in discovery after the new install wins. Thanks @vincentkoc.
|
||||
- Plugins/install: let official plugin reinstall recovery repair source-only installed runtime shadows, so `openclaw plugins install npm:@openclaw/discord --force` can replace the bad package instead of stopping at stale config validation. Thanks @vincentkoc.
|
||||
- Plugins/install: suppress dangerous-pattern scanner warnings for trusted official OpenClaw npm installs, so installing `@openclaw/discord` no longer prints credential-harvesting warnings for the official package. Thanks @vincentkoc.
|
||||
- Plugins/update: repair missing plugin-local `openclaw` peer links before skipping unchanged npm plugin updates, so current external Codex installs can recover `openclaw/plugin-sdk/*` resolution during OTA repair. (#77544) Thanks @ProspectOre.
|
||||
- Plugins/update: treat OpenClaw CalVer correction versions like `2026.5.3-1` as satisfying base plugin API ranges, so correction builds can install plugins that require the base runtime API. Fixes #77293. (#77450) Thanks @p3nchan.
|
||||
- Plugins/update: treat OpenClaw stable correction versions like `2026.5.3-1` as stable releases for npm installs, plugin updates, and bundled-version comparisons, so `latest` can advance official plugins without prerelease opt-in. Thanks @vincentkoc.
|
||||
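The correction-version entries above can be illustrated with a small comparison sketch. It assumes that a purely numeric suffix such as `-1` marks a stable correction and that anything else after the dash (for example `-beta.1`) is a prerelease handled elsewhere; the type and function names are hypothetical and do not reflect OpenClaw's actual version parser.

```typescript
// Hypothetical sketch: parse "A.B.C" or "A.B.C-N" (numeric N = stable correction).
interface StableVersion {
  parts: [number, number, number];
  correction: number;
}

function parseStableVersion(version: string): StableVersion | null {
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-(\d+))?$/.exec(version);
  if (!m) return null; // non-numeric suffixes are not stable corrections here
  return {
    parts: [Number(m[1]), Number(m[2]), Number(m[3])],
    correction: m[4] === undefined ? 0 : Number(m[4]),
  };
}

// Negative when a < b, zero when equal, positive when a > b.
function compareStableVersions(a: StableVersion, b: StableVersion): number {
  const [a1, a2, a3] = a.parts;
  const [b1, b2, b3] = b.parts;
  return a1 - b1 || a2 - b2 || a3 - b3 || a.correction - b.correction;
}

// A correction build satisfies a requirement on its base release.
function meetsMinimum(installed: StableVersion, required: StableVersion): boolean {
  return compareStableVersions(installed, required) >= 0;
}
```

Under these assumptions `2026.5.3-1` sorts after `2026.5.3`, which is the opposite of standard SemVer prerelease ordering and is why the special case is needed.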
- Plugins/commands: allow the official ClawHub Codex plugin package to keep reserved `/codex` command ownership, matching the existing npm-managed Codex package behavior. Thanks @vincentkoc.
|
||||
- Plugins/commands: scope QQBot framework slash commands to the QQBot channel so `/bot-*` command handlers and native specs do not leak onto unrelated chat surfaces. Thanks @vincentkoc.
|
||||
- Plugins/commands: suppress dangerous-pattern scanner warnings for trusted catalog npm installs from owner-gated `/plugins install` commands, so chat-driven installs match the CLI install trust path. Thanks @vincentkoc.
|
||||
- Plugins/discovery: ignore managed npm plugin packages that only expose TypeScript source entries without compiled runtime output, so stale/broken installs cannot hide a working bundled or reinstallable channel plugin during setup. Thanks @vincentkoc.
|
||||
- Plugins/discovery: demote the source-only TypeScript runtime check on already-installed `origin: "global"` plugin packages from a config-blocking error to a warning and let the runtime fall through to the TypeScript source via jiti, so a single broken installed package no longer blocks `plugins install` for unrelated plugins; install-time rejection of newly-installed source-only packages is unchanged. Thanks @romneyda.
|
||||
- Plugins/registry: recover managed-npm external plugins from the owned npm root when a stale persisted registry would otherwise hide them after package-manager upgrades. Fixes #77266. Thanks @p3nchan.
|
||||
- Plugins/providers: make bundled provider discovery honor restrictive `plugins.allow` by default for new configs, while doctor migrates legacy restrictive allowlist configs to `plugins.bundledDiscovery: "compat"` to preserve upgrade behavior. Thanks @dougbtv.
|
||||
- Plugins/security: ignore inline and block comments when matching source-rule context in plugin install scans, so comment-only `fetch`/`post` references near environment defaults do not block clean plugins. Thanks @vincentkoc.
|
||||
- Plugins/packages: reject inferred built runtime entries that exist but fail package-boundary checks instead of falling back to TypeScript source for installed packages. Thanks @vincentkoc.
|
||||
- Plugins/packages: reject blank `openclaw.runtimeExtensions` entries instead of silently ignoring them and falling back to inferred TypeScript runtime entries. Thanks @vincentkoc.
|
||||
- Plugins/loader: do not retry native-loaded JavaScript plugin modules through the source transformer after native evaluation has already reached a missing dependency, avoiding duplicate top-level side effects. Thanks @vincentkoc.
|
||||
- Plugins/loader: keep bundled plugin package `test-api.js` aliases behind private QA mode, so source transforms do not expose test-only public surfaces during normal plugin loading. Thanks @vincentkoc.
|
||||
- Plugins/runtime-deps: include `json5` in the memory-core plugin runtime dependency set so packaged `memory_search` sandboxes can resolve generated OpenClaw runtime chunks that parse JSON5 config. Fixes #77461.
|
||||
- Plugins/runtime state: keep the key being registered when namespace eviction runs in the same millisecond as existing entries, so `register` and `registerIfAbsent` do not report success while evicting their own fresh value. Thanks @vincentkoc.
|
||||
- Plugins/release: make the published npm runtime verifier reject blank `openclaw.runtimeExtensions` entries instead of treating them as absent and passing via inferred outputs. Thanks @vincentkoc.
|
||||
- Doctor/config: keep active `auth.profiles` metadata intact when `doctor --fix` strips stale secret fields from configs, repairing legacy `<provider>:default` API-key profile metadata when model fallbacks or explicit `model@profile` refs still depend on it. Fixes #77400.
|
||||
- Doctor/config: restore legacy group chat config migrations for `routing.allowFrom`, `routing.groupChat.*`, and `channels.telegram.requireMention` so upgrades keep WhatsApp, Telegram, and iMessage group mention gates and history settings instead of leaving configs invalid or silently blocked. Thanks @scoootscooob.
|
||||
- Doctor/plugins: include `plugins.allow`-only official plugin ids in the release configured-plugin repair set, so `doctor --fix` installs official external plugins that are configured but not yet loaded instead of removing them as stale allow entries. Fixes #77155. Thanks @hclsys.
|
||||
- Doctor/plugins: remove stale managed install records for bundled plugins even when the bundled plugin is not explicitly configured, so doctor cleanup cannot leave orphaned install metadata behind. Thanks @vincentkoc.
|
||||
- Doctor/plugins: remove stale managed npm plugin shadow entries from the managed package lock as well as `package.json` and `node_modules`, so future npm operations do not keep referencing repaired bundled-plugin shadows. Thanks @vincentkoc.
|
||||
- Doctor/plugins: remove orphaned or recovered managed npm copies of bundled `@openclaw/*` plugins during `doctor --fix`, so stale package manifests cannot shadow the current bundled plugin config schema.
|
||||
- Doctor/plugins: skip channel-derived official plugin installs when another configured plugin is the effective owner for the same channel, so `doctor --repair` does not reinstall `feishu` while `openclaw-lark` handles `channels.feishu`. Fixes #76623. Thanks @fuyizheng3120.
|
||||
- Doctor/plugins: do not treat `plugins.allow` entries as configured plugins during missing-plugin repair, so restrictive allowlists no longer install allowed-but-unused plugins. Thanks @vincentkoc.
|
||||
- Doctor/sessions: clear auto-created stale session routing state from the sessions store when `doctor --fix` sees plugin-owned model/runtime/auth/session bindings outside the current configured route, while leaving explicit user model choices for manual review. Refs #68615.
|
||||
- CLI/sessions: prune old unreferenced transcript, compaction checkpoint, and trajectory artifacts during normal `sessions cleanup`, so gateway restart or crash orphans do not accumulate indefinitely outside `sessions.json`. Fixes #77608. Thanks @slideshow-dingo.
|
||||
- CLI/sessions: cap `openclaw sessions` output to the newest 100 rows by default and add `--limit <n|all>` plus JSON pagination metadata, so repeated machine polling of large session stores cannot fan out into unbounded per-row enrichment/output work. Fixes #77500. Thanks @Kaotic3.
|
||||
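The default row cap and `--limit <n|all>` behavior above could look roughly like the sketch below. The row shape, the metadata field names (`total`, `limit`, `truncated`), and both function names are assumptions for illustration; only the default of 100 and the `n|all` values come from the entry.

```typescript
// Hypothetical sketch of limit parsing and newest-first paging.
interface SessionRow {
  key: string;
  updatedAt: number; // assumed epoch milliseconds
}

interface SessionPage {
  rows: SessionRow[];
  total: number;
  limit: number | "all";
  truncated: boolean;
}

const DEFAULT_SESSION_LIMIT = 100;

function parseLimit(raw?: string): number | "all" {
  if (raw === undefined) return DEFAULT_SESSION_LIMIT;
  if (raw === "all") return "all";
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`invalid --limit value: ${raw}`);
  }
  return n;
}

function pageSessions(rows: SessionRow[], limit: number | "all"): SessionPage {
  // Sort a copy so the caller's array is left untouched.
  const newestFirst = [...rows].sort((a, b) => b.updatedAt - a.updatedAt);
  const shown = limit === "all" ? newestFirst : newestFirst.slice(0, limit);
  return { rows: shown, total: rows.length, limit, truncated: shown.length < rows.length };
}
```

Slicing before any per-row enrichment is what keeps repeated polling cheap; the metadata lets a machine consumer detect that more rows exist.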
- CLI/update: disable and skip plugins that fail package-update plugin sync, so a broken npm/ClawHub/git/marketplace plugin cannot turn a successful OpenClaw package update into a failed update result. Thanks @vincentkoc.
|
||||
- CLI/update: use an absolute POSIX npm script shell during package-manager updates, so restricted PATH environments can still run dependency lifecycle scripts while updating from `--tag main`. Fixes #77530. Thanks @PeterTremonti.
|
||||
- CLI/update: make package-update follow-up processes write completion results and exit explicitly, so Windows packaged upgrades do not hang after the new package finishes post-core plugin work. Thanks @vincentkoc.
|
||||
- CLI/update: stage pnpm-detected npm-layout global package updates through a clean npm prefix swap, keep plugin install runtime imports behind a stable alias, and ship legacy install-runtime aliases back to `2026.3.22`, preventing stale overlay chunks from breaking plugin post-update sync. Thanks @vincentkoc.
|
||||
- CLI/update: treat OpenClaw stable correction versions like `2026.5.3-1` as newer than their base stable release, so package updates no longer ask for downgrade confirmation. Thanks @vincentkoc.
|
||||
- CLI/launcher: forward termination signals to compile-cache respawn children, so killing a wrapper process no longer leaves the security audit worker orphaned. Fixes #77458. Thanks @jaikharbanda.
|
||||
- Update/restart: probe managed Gateway restarts with the service environment and add a Docker product lane that exercises candidate-owned `openclaw update --yes --json` restarts, so SecretRef-backed local gateway auth cannot regress behind mocked restart checks. Thanks @vincentkoc.
|
||||
- Gateway/startup: load provider plugins that own explicitly configured image, video, or music generation defaults so generation tools become live after gateway restart instead of remaining catalog-only. Fixes #77244. Thanks @buyuangtampan, @Nikoxx99, and @vincentkoc.
|
||||
- Gateway/startup: include resolved thinking and fast-mode defaults in the `agent model` startup log line, defaulting unset startup thinking to `medium` without mixing in reasoning visibility.
|
||||
- Gateway/startup: log the canvas host mount only after the HTTP server has bound, so startup logs no longer report the canvas host as mounted before it can serve requests.
|
||||
- Gateway/startup: start cron and record the post-ready memory trace even when deferred maintenance timers fail after readiness, so a non-fatal timer setup issue does not silently leave scheduled jobs idle. Thanks @vincentkoc.
|
||||
- Gateway/update: resolve local gateway probe auth from the installed config during post-update restart verification, so token/device-authenticated VPS gateways are not misreported as unhealthy port conflicts after a package swap. Thanks @vincentkoc.
|
||||
- Gateway/update: keep the shutdown close path behind a stable runtime chunk and ship compatibility aliases for recent `server-close-*` hashes, so manual npm package replacement cannot leave an already-running Gateway unable to shut down cleanly. Fixes #77087. Thanks @westlife219.
|
||||
- Gateway/chat: clear the active reply-run guard before draining queued same-session follow-up turns, so sequential `chat.send` calls no longer trip `ReplyRunAlreadyActiveError` every other request. Fixes #77485. Thanks @bws14email.
|
||||
- Gateway/status: label Linux managed gateway services as `systemd user`, making status output explicit about the user-service scope instead of implying a system-level unit. Thanks @vincentkoc.
|
||||
- Gateway/sessions: memoize repeated thinking-option enrichment and skip unused cost fallback checks while listing sessions, reducing per-row work on large multi-agent stores. Fixes #76931.
|
||||
- Gateway/sessions: bound default `sessions.list` RPC responses and report truncation metadata, preventing Slack-heavy long-lived stores from forcing unbounded Gateway row construction. Fixes #77062.
|
||||
- Gateway/sessions: cache selected model override resolution while building session-list rows so `openclaw sessions` and Control UI session lists stay responsive on model-heavy stores. (#77650) Thanks @ragesaq.
|
||||
- Gateway/watch: suppress sync-I/O trace output during `pnpm gateway:watch --benchmark` unless explicitly requested, so CPU profiling no longer floods the terminal with stack traces.
|
||||
- Gateway/watch: when benchmark sync-I/O tracing is explicitly enabled, tee trace blocks to the benchmark output log and filter them from the terminal pane while keeping normal Gateway logs visible.
|
||||
- Gateway/diagnostics: make stuck-session recovery outcome-driven and generation-guarded, add `diagnostics.stuckSessionAbortMs`, and emit structured recovery requested/completed events so stale or skipped recovery no longer looks like a successful abort.
|
||||
- Gateway/validation: isolate gateway server validation files, ignore unrelated startup logs in request-trace coverage, and fail fast on stuck shared-auth sockets, reducing false main-branch CI failures for contributors. Thanks @amknight.
|
||||
- Gateway/install: keep `.env`-managed values in the macOS LaunchAgent env file while still tracking `OPENCLAW_SERVICE_MANAGED_ENV_KEYS`, so regenerated services do not boot without managed auth/provider keys. Fixes #75374.
|
||||
- Gateway/restart: verify listener PIDs by argv when `lsof` reports only the Node process name, so stale gateway cleanup can find macOS `cnode` listeners. Fixes #70664.
|
||||
- Gateway/logging: expand leading `~` in `logging.file` before creating the file logger, preventing startup crash loops for home-relative log paths. Fixes #73587.
|
||||
- Gateway/install: prefer supported system Node over nvm/fnm/volta/asdf/mise when regenerating managed gateway services, so `gateway install --force` no longer recreates service definitions that doctor immediately flags as version-manager-backed. Fixes #76339. Thanks @brokemac79 and @BunsDev.
|
||||
- Cron: surface failed isolated-run diagnostics in `cron show`, status, and run history when requested tools are unavailable, so blocked cron runs report the actual tool-policy failure instead of a misleading green result. Fixes #75763. Thanks @RyanSandoval.
|
||||
- Cron/sessions: keep cron metadata rows without an on-disk transcript non-resumable until a transcript exists, so doctor and `sessions cleanup --fix-missing` no longer report or prune pre-transcript cron rows as broken sessions. Refs #77011.
|
||||
- Docker/compose: pin container-side `OPENCLAW_CONFIG_DIR` and `OPENCLAW_WORKSPACE_DIR` on both gateway and CLI services so the host paths written into `.env` by `scripts/docker/setup.sh` (used as Compose bind-mount sources) cannot leak into runtime code via the `env_file` import. Fixes regressions on macOS Docker setups where the first agent reply died with `EACCES: permission denied, mkdir '/Users'` because the host-style workspace path got persisted into `agents.defaults.workspace`. Fixes #77436. Thanks @lonexreb.
|
||||
- Docker: prune package-excluded plugin dist directories from runtime images unless the build explicitly opts that plugin in, so official external plugins such as Feishu stay install-on-demand instead of shipping partial metadata without compiled runtime output. Fixes #77424. Thanks @vincentkoc.
|
||||
- Web search: honor late-bound `tools.web.search.enabled: false` during tool execution so config reloads cannot leave an already-created `web_search` tool runnable. Thanks @vincentkoc.
|
||||
- Web search: scope explicit bundled `web_search` provider runtime loading through manifest ownership, so selecting DuckDuckGo/Gemini/etc. does not import unrelated bundled providers or log their optional dependency failures. Thanks @vincentkoc.
|
||||
- Web search: keep first-class assistant `web_search` auto-detect and configured runtime providers visible when active runtime metadata or the active plugin registry is incomplete. Fixes #77073. Thanks @joeykrug.
|
||||
- Web fetch: scope provider fallback cache entries by the selected fetch provider so config reloads cannot reuse another provider's cached fallback payload. Thanks @vincentkoc.
|
||||
- Web fetch: late-bind `web_fetch` config and provider fallback metadata from the active runtime snapshot, matching `web_search` so long-lived tools do not use stale fetch provider settings. Thanks @vincentkoc.
|
||||
- Diagnostics: grant the internal diagnostics event bus to official installed diagnostics exporter plugins, so npm-installed `@openclaw/diagnostics-prometheus` can emit metrics without broadening the capability to arbitrary global plugins. Fixes #76628. Thanks @RayWoo.
|
||||
- Diagnostics: handle missing session-tail files in cron recovery context without tripping extension test typecheck. Thanks @vincentkoc.
|
||||
- Diagnostics: include last progress, cron job/run ids, stopped cron job name, and the last assistant transcript snippet in stalled-session and stuck-session recovery logs so cron stalls show what was stopped.
|
||||
- Diagnostics: keep webhook/message OTEL attributes and Prometheus delivery labels low-cardinality and omit raw chat/message IDs from spans, so progress-draft and message-tool modes do not leak high-cardinality messaging identifiers.
|
||||
- Exec approvals: detect `env -S` split-string command-carrier risks when `-S`/`-s` is combined with other env short options, so approval explanations do not miss split payloads hidden behind `env -iS...`. Thanks @vincentkoc.
|
||||
- Exec approvals: treat POSIX `exec` as a command carrier for inline eval, shell-wrapper, and eval/source detection, so approval explanations and command-risk checks do not miss payloads hidden behind `exec`. Thanks @vincentkoc.
|
||||
- Exec approvals: unwrap BSD/macOS `env -P <path>` carrier commands before approval-command and strict inline-eval checks, so `/approve` shell execution and inline interpreter payloads are still blocked behind that env form.
|
||||
- Agents/session status: keep semantic `session_status({ sessionKey: "current" })` on the live run session even before that run has a persisted session-store entry, instead of falling back to the sandbox policy key. Thanks @vincentkoc.
|
||||
- Agents/trajectory: bound runtime trajectory capture and yield queued sidecar writes so oversized traces stop recording instead of monopolizing Gateway cleanup. Fixes #77124. Thanks @loyur.
|
||||
- Agents/Pi: suppress persistence for synthetic mid-turn overflow continuation prompts, so transcript-retry recovery does not write the "continue from transcript" prompt as a new user turn. Thanks @vincentkoc.
|
||||
- Release validation: skip Slack live QA unless Slack credentials are explicitly configured, so release gates can keep proving non-Slack surfaces while Slack is still local and credential-gated. Thanks @vincentkoc.
|
||||
- Release validation: allow focused QA live reruns to select Matrix and Telegram without running Slack, so known Slack credential-pool outages do not block non-Slack live proof. Thanks @vincentkoc.
|
||||
- OpenAI Codex: recreate missing bound app-server threads once when a stale `/codex bind` sidecar survives a restart, preserving the selected auth profile and turn overrides before retrying the inbound turn. (#76936) Thanks @keshavbotagent.
|
||||
- Agents/cli-runner: drop a saved `claude-cli` resume sessionId at preparation time when its on-disk transcript no longer exists in `~/.claude/projects/`, so a stale binding from a half-installed `update.run` cannot trap follow-up runs (auto-reply / Telegram direct) in a `claude --resume` timeout loop; the run starts fresh and the new sessionId is written back through the existing post-run flow. (#77030; refs #77011) Thanks @openperf.
|
||||
- Release validation: install the cross-OS TypeScript harness through Windows-safe Node/npm shims so native Windows package checks reach the OpenClaw smoke suites instead of exiting before artifact capture. Thanks @vincentkoc.
|
||||
- Release validation: let Windows packaged-upgrade checks continue after the shipped 2026.5.2 updater hits its native-module swap cleanup fallback, verifying the fallback-installed candidate through package metadata and downstream smoke instead of crashing on the immediate update-status probe. Thanks @vincentkoc.
|
||||
- Release/beta smoke: resolve the dispatched Telegram beta E2E run from `gh run list` when `gh workflow run` returns no run URL, so the maintainer helper does not fail immediately after dispatch. Thanks @vincentkoc.
|
||||
- QA/Slack: update the Slack dispatch preview fallback test SDK mock for structured progress draft helpers, so the rich progress draft regression suite covers the new imports instead of failing before assertions run. Thanks @vincentkoc.
|
||||
- QA/Slack: resolve bundled official plugin public-surface package aliases during source-mode QA runs, so release Slack live validation can load `@openclaw/slack/api.js` without workspace symlinks. Thanks @vincentkoc.
|
||||
- QA/Matrix: let the live tool-progress preview and error checks verify progress replacement events without depending on the preview saying `Working`, `tool: read`, an unlabelled/pathless `read from`, or the original draft root being observed. Thanks @vincentkoc.
|
||||
- QA/Matrix: keep the target=both approval scenario focused on channel and DM metadata delivery by resolving the accepted approval through the gateway after both Matrix events are observed. Thanks @vincentkoc.
|
||||
- QA/Matrix: wait for live approval reactions to echo before starting the threaded approval decision timeout. Thanks @vincentkoc.
|
||||
- QA/Matrix: reuse the primed driver sync stream when confirming approval reaction echoes, avoiding missed self-reactions in live release runs. Thanks @vincentkoc.
|
||||
- Channels/plugins: key bundled package-state probes, env/config presence, and read-only command defaults by channel id instead of manifest plugin id, preserving setup and native-command detection for channel plugins whose package id differs from the channel alias. Thanks @vincentkoc.
|
||||
- Control UI/performance: cap long-task and long-animation-frame diagnostics in the shared event log, so slow-render telemetry does not evict gateway/plugin events from the Debug and Overview views. Thanks @vincentkoc.
|
||||
- Control UI/i18n: render the Sessions active filter tooltip with the configured minute count in every locale and make the i18n check reject placeholder drift. Thanks @BunsDev.
|
||||
- Codex: pass the live run session key into app-server dynamic tools when sandbox policy uses a separate session key, so `session_status({ sessionKey: "current" })` reports the active run instead of the sandbox policy key. Thanks @vincentkoc.
|
||||
- Plugins/tools: mark manifest-optional sibling tools as optional even when they come from a shared non-optional factory, so cached/status/MCP metadata keeps opt-in tool policy accurate. Thanks @vincentkoc.
|
||||
- Matrix: keep `streaming.progress.toolProgress` scoped to progress draft mode, so partial and quiet Matrix previews do not lose tool progress unless `streaming.preview.toolProgress` is disabled. Thanks @vincentkoc.
|
||||
- Channels/streaming: keep `streaming.progress.toolProgress` scoped to progress draft mode, so disabling compact progress lines does not silence partial/block preview tool updates. Thanks @vincentkoc.
|
||||
- MCP: include serialized conversation/message payloads in the primary text content for `conversations_list` and `messages_read`, while preserving `structuredContent` for capable clients. Fixes #77024.
|
||||
- Media: treat `EPERM` from the post-write media fsync step as best-effort, allowing WebChat and channel uploads to finish on Windows filesystems that reject `fsync` after a successful write. Fixes #76844.
|
||||
- Streaming channels: add `streaming.preview.commandText: "status"` / `streaming.progress.commandText: "status"` to hide command/exec text in preview progress lines, while keeping raw command text as the default to match released behavior. Fixes #77072.
|
||||
- Agents/cron: let explicit cron `timeoutSeconds` drive both CLI no-output and embedded LLM idle watchdogs instead of being capped by resume defaults. Fixes #76289.
|
||||
- Plugins/catalog: suppress missing `channelConfigs` compatibility diagnostics for external channel plugins that are disabled, denied, or outside a restrictive allowlist. Fixes #76095.
|
||||
- Agents/cli-runner: drop a saved `claude-cli` resume sessionId at preparation time when its on-disk transcript no longer exists in `~/.claude/projects/`, so a stale binding from a half-installed `update.run` cannot trap follow-up runs (auto-reply / Telegram direct) in a `claude --resume` timeout loop; the run starts fresh and the new sessionId is written back through the existing post-run flow. (#77030; refs #77011) Thanks @openperf.
|
||||
- Doctor/plugins: skip channel-derived official plugin installs when another configured plugin is the effective owner for the same channel, so `doctor --repair` does not reinstall `feishu` while `openclaw-lark` handles `channels.feishu`. Fixes #76623. Thanks @fuyizheng3120.
|
||||
- Gateway/sessions: memoize repeated thinking-option enrichment and skip unused cost fallback checks while listing sessions, reducing per-row work on large multi-agent stores. Fixes #76931.
|
||||
- Gateway/sessions: bound default `sessions.list` RPC responses and report truncation metadata, preventing Slack-heavy long-lived stores from forcing unbounded Gateway row construction. Fixes #77062.
|
||||
- Agents/tools: use config-only runtime snapshots for plugin tool registration and live runtime config getters, avoiding expensive full secrets snapshot clones on the core-plugin-tools prep path. Fixes #76295.
|
||||
- Agents/tools: honor the effective tool denylist before constructing optional PDF/media tool factories, so `tools.deny: ["pdf"]` skips PDF setup before later policy filtering. Fixes #76997.
|
||||
- MCP/plugin tools: apply global `tools.profile`, `tools.alsoAllow`, and `tools.deny` policy while exposing plugin tools over the standalone MCP bridge, so ACP clients do not see policy-hidden plugin tools or miss opt-in optional tools. Thanks @vincentkoc.
|
||||
- Plugin tools: honor explicit tool denylists while selecting plugin tool runtimes, so denied plugin tools are not materialized for direct command or gateway surfaces before later policy filtering. Thanks @vincentkoc.
|
||||
- Plugin tools: filter factory-returned tools by manifest per-tool optional policy, so optional sibling tools from a shared runtime factory stay hidden unless explicitly allowed. Thanks @vincentkoc.
|
||||
- Agents/transcripts: retry context-overflow compaction from the current transcript only after the inbound user turn was actually persisted, and keep WebChat agent-run live delivery from writing duplicate Pi-managed assistant turns. Fixes #76424. (#77033)
|
||||
- Agents/bootstrap: keep pending `BOOTSTRAP.md` and bootstrap truncation notices in system-prompt Project Context instead of copying setup text or raw warning diagnostics into WebChat user/runtime context. Fixes #76946.
|
||||
- Gateway/install: keep `.env`-managed values in the macOS LaunchAgent env file while still tracking `OPENCLAW_SERVICE_MANAGED_ENV_KEYS`, so regenerated services do not boot without managed auth/provider keys. Fixes #75374.
|
||||
- Gateway/restart: verify listener PIDs by argv when `lsof` reports only the Node process name, so stale gateway cleanup can find macOS `cnode` listeners. Fixes #70664.
|
||||
- Gateway/logging: expand leading `~` in `logging.file` before creating the file logger, preventing startup crash loops for home-relative log paths. Fixes #73587.
|
||||
- Channels/CLI: keep `openclaw channels list --json` usable when provider usage fetching fails, and report per-provider usage errors without aborting the channel list. Refs #67595.
|
||||
- Doctor/plugins: do not treat `plugins.allow` entries as configured plugins during missing-plugin repair, so restrictive allowlists no longer install allowed-but-unused plugins. Thanks @vincentkoc.
|
||||
- Agents/messaging: deliver distinct final commentary after same-target `message` tool sends while still deduping text/media already sent by the tool, so short closing remarks are no longer silently dropped. Fixes #76915. Thanks @hclsys.
|
||||
- Agents/messaging: preserve string thread IDs when matching message-tool reply dedupe routes, avoiding precision loss on numeric-looking topic IDs before channel plugin comparison. Thanks @vincentkoc.
|
||||
- Channels/streaming: honor `agents.defaults.toolProgressDetail: "raw"` in Slack, Discord, Telegram, Matrix, and Microsoft Teams progress drafts, so tool-start lines include raw command/detail output when debugging. Thanks @vincentkoc.
|
||||
- Channels/streaming: strip unmatched inline-code backticks from compacted raw progress draft lines, avoiding stray markdown markers after long command details are shortened. Thanks @vincentkoc.
|
||||
- Discord/Slack/Mattermost: align draft preview tool-progress config help with the runtime behavior that hides interim tool updates when `streaming.preview.toolProgress` is false. Thanks @vincentkoc.
|
||||
- Feishu: use the shared channel progress formatter for streaming-card tool status lines, including raw command/detail output and message-tool filtering. Thanks @vincentkoc.
|
||||
- Mattermost: use the shared progress draft formatter for tool status previews, including raw command/detail output when `agents.defaults.toolProgressDetail: "raw"` is enabled. Thanks @vincentkoc.
|
||||
- Mattermost: suppress standalone default tool-progress messages while draft previews are active, including when draft tool lines are disabled. Thanks @vincentkoc.
|
||||
- Telegram: deliver button-only interactive replies by sending the shared fallback button-label text with the inline keyboard instead of dropping the reply as empty. Thanks @vincentkoc.
|
||||
- OpenAI Codex: honor `auth.order.openai-codex` when starting app-server clients without an explicit auth profile, so status/model probes and implicit startup use the configured Codex account instead of falling back to the default profile. Thanks @vincentkoc.
|
||||
- OpenAI Codex: let SSRF-guarded provider requests inherit OpenClaw's undici IPv4/IPv6 fallback policy, so ChatGPT-backed Codex runs recover on IPv4-working hosts when DNS still returns unreachable IPv6 addresses. Fixes #76857. Thanks @jplavoiemtl and @SymbolStar.
|
||||
- Plugin updates: do not short-circuit trusted official npm updates as unchanged when the default/latest spec still resolves to an already-installed prerelease that the installer should replace with a stable fallback. Thanks @vincentkoc.
|
||||
- Plugin updates: clean stale bundled load paths for already-externalized npm installs whose legacy install record only preserved the resolved package name. Thanks @vincentkoc.
|
||||
- Plugin tools: keep auth-unavailable optional tools hidden even when another default tool from the same plugin is available and `tools.alsoAllow` names the optional tool. Thanks @vincentkoc.
|
||||
- Realtime transcription: report socket closes before provider readiness as closed-before-ready failures instead of mislabeling them as connection timeouts for OpenAI, xAI, and Deepgram streaming transcription. Thanks @vincentkoc.
|
||||
- OpenAI/Google Meet: fail realtime voice connection attempts when the socket closes before `session.updated`, avoiding stuck Meet joins waiting on a bridge that never became ready. Thanks @vincentkoc.
|
||||
- Google Meet: avoid treating repeated participant words as multiple assistant-overlap matches when suppressing realtime echo transcripts. Thanks @vincentkoc.
|
||||
- Google Meet: make `mode: "agent"` the default Chrome talk-back path, using realtime transcription for input and regular OpenClaw TTS for speech output, while keeping direct realtime voice answers available as `mode: "bidi"` and accepting `mode: "realtime"` as an agent-mode compatibility alias.
|
||||
- Codex harness: keep `codex_app_server.*` telemetry publication owned by the harness instead of republishing the same callback event from core runners. Thanks @vincentkoc.
|
||||
- Slack/Discord: suppress standalone tool-progress chatter when partial preview streaming has `streaming.preview.toolProgress: false`, matching the documented quiet-preview behavior. Thanks @vincentkoc.
|
||||
- Matrix: bind native approval reaction targets before publishing option reactions, so fast approver reactions on threaded prompts are not dropped while the approval handler finishes setup. Thanks @vincentkoc.
|
||||
- Google Meet: make realtime talk-back agent-driven by default with `realtime.strategy: "agent"`, keep the previous direct bidirectional model behavior available as `realtime.strategy: "bidi"`, route the Meet tab speaker output to `BlackHole 2ch` automatically for local Chrome realtime joins, coalesce nearby speech transcript fragments before consulting the agent, and avoid cutting off agent speech from server VAD or stale playback pipe errors.
|
||||
- Google Meet: suppress queued assistant playback and assistant-like transcript echoes from the realtime input path, so the meeting does not hear the agent's own speech as a new user turn and loop or cut itself off.
|
||||
- Google Meet: keep Chrome realtime transport tests hermetic on Linux prerelease shards while preserving the macOS-only runtime guard. Thanks @vincentkoc.
|
||||
- QA/Matrix: let the live tool-progress preview and error checks verify progress replacement events without depending on the preview saying `Working`, `tool: read`, an unlabelled/pathless `read from`, or the original draft root being observed. Thanks @vincentkoc.
|
||||
- QA/Matrix: keep the target=both approval scenario focused on channel and DM metadata delivery by resolving the accepted approval through the gateway after both Matrix events are observed. Thanks @vincentkoc.
|
||||
- QA/Matrix: wait for live approval reactions to echo before starting the threaded approval decision timeout. Thanks @vincentkoc.
|
||||
- QA/Matrix: reuse the primed driver sync stream when confirming approval reaction echoes, avoiding missed self-reactions in live release runs. Thanks @vincentkoc.
|
||||
- Channels/WhatsApp: apply the shared group/channel visible-reply mode during inbound dispatch so group replies stay message-tool-only by default without overriding direct-chat harness defaults. Refs #75178 and #67394. Thanks @scoootscooob.
|
||||
- Plugins/Codex: preserve Codex-native OAuth routing for `/codex bind` app-server turns so bound sessions keep the selected Codex auth profile instead of falling back to public OpenAI credentials. (#76714) Thanks @keshavbotagent.
|
||||
- Telegram: keep status checks pointed at the active chat so asking for the current session no longer reports an old direct-message conversation. (#76708) Thanks @amknight.
|
||||
- Gateway/install: prefer supported system Node over nvm/fnm/volta/asdf/mise when regenerating managed gateway services, so `gateway install --force` no longer recreates service definitions that doctor immediately flags as version-manager-backed. Fixes #76339. Thanks @brokemac79 and @BunsDev.
|
||||
- Google Chat: normalize Google auth certificate response headers before google-auth-library reads cache-control, so inbound webhook auth no longer rejects with `res?.headers.get is not a function`. Fixes #76880. Thanks @donbowman.
|
||||
- WhatsApp: route terminal login QR output through the active runtime for initial and restart sockets, so `openclaw channels login --channel whatsapp` does not lose the QR behind direct stdout writes. Fixes #76213. Thanks @dougvk.
|
||||
- Proxy/debugging: disable debug proxy direct upstream forwarding for proxy requests and CONNECT tunnels while managed proxy mode is active unless `OPENCLAW_DEBUG_PROXY_ALLOW_DIRECT_CONNECT_WITH_MANAGED_PROXY=1` is explicitly set for approved local diagnostics. Thanks @jesse-merhi and @mjamiv.
|
||||
- Direct APNs: route direct HTTP/2 delivery through the active managed proxy with redacted proxy diagnostics, so push requests honor configured egress controls and `openclaw proxy validate --apns-reachable` can prove APNs is reachable through the proxy before deployment. (#74905) Thanks @jesse-merhi.
|
||||
- Agents/subagents: detect prefix-only completion announce replies and fall back to the captured child result, so requester chats no longer silently lose most of a long sub-agent report. Fixes #76412. Thanks @inxaos and @davemorin.
|
||||
- TUI: replace the stale-response watchdog notice with plain user-facing copy so stalled replies no longer surface backend or streaming internals. (#77120) Thanks @davemorin.
|
||||
- Security/Windows: validate `SystemRoot`/`WINDIR` env values through the Windows install-root validator and add them to the dangerous-host-env policy when resolving `icacls.exe`/`whoami.exe` for `openclaw security audit`, so workspace `.env` overrides and bare command names cannot redirect Windows ACL helpers to attacker-controlled binaries. (#74458) Thanks @mmaps.
|
||||
- Security/Windows: pin Windows registry-probe `reg.exe` resolution to the canonical Windows install root in install-root probing, so `SystemRoot`/`WINDIR` env overrides cannot redirect registry queries during Windows host detection. (#74454) Thanks @mmaps.
|
||||
- QQBot: preserve the framework command authorization decision when converting framework command contexts into engine slash command contexts, so downstream slash handlers see `commandAuthorized` matching the channel's resolved `isAuthorizedSender` instead of a hardcoded `true`. (#77453) Thanks @drobison00.
|
||||
- Security/Windows: block `LOCALAPPDATA` from workspace `.env` and resolve Windows update-flow portable Git path prepends from the trusted process-local `LOCALAPPDATA` only, so workspace-supplied values cannot redirect `git` discovery during `openclaw update`. (#77470) Thanks @drobison00.
|
||||
- Browser/SSRF: enforce the existing current-tab URL navigation policy before tab-scoped debug, export, and read routes (console, page errors, network requests, trace start/stop, response body, screenshot, snapshot, storage, etc.) collect from an already-selected tab, so blocked tabs return a policy error instead of being read first and redacted only at response time. (#75731) Thanks @eleqtrizit.
|
||||
- Security/Windows: route the `.cmd`/`.bat` process wrapper through the shared Windows install-root resolver instead of `process.env.ComSpec`, so workspace dotenv-blocked `SystemRoot`/`WINDIR` overrides and unsafe values like UNC paths or path-lists cannot redirect `cmd.exe` selection on Windows. (#77472) Thanks @drobison00.
|
||||
- Agents/bootstrap: honor `BOOTSTRAP.md` content injected by `agent:bootstrap` hooks when deciding whether bootstrap is pending, so hook-provided required setup instructions are included in the system prompt. (#77501) Thanks @ificator.
|
||||
- Agents/replay-history: drop trailing assistant turns whose content is empty or carries only the stream-error sentinel before sending the transcript to the provider, so prefill-strict providers (such as github-copilot/claude-opus-4.6) no longer reject the request with `400 The conversation must end with a user message` after a session whose last turn errored before producing content. Refs #77228. (#77287) Thanks @openperf.
|
||||
- Agents/session-file-repair: drop `type: "message"` entries with a missing, `null`, or blank role during the on-disk repair pass so sessions that accumulated null-role JSONL corruption (such as the 935+ corrupt entries in #77228) get fully cleaned up rather than carried forward into the repaired file. Refs #77228. (#77288) Thanks @openperf.
|
||||
- Doctor/device pairing: stop suggesting `openclaw devices rotate --role <role>` for stale local cached device auth when that role is no longer approved by the gateway pairing record, so doctor no longer points users at a command that must be denied. (#77688) Thanks @Conan-Scott.
|
||||
- Ollama/thinking: expose the lightweight Ollama provider thinking profile through the public provider-policy artifact too, so reasoning-capable Ollama models such as `ollama/deepseek-v4-pro:cloud` keep `/think max` available even before the full plugin runtime activates. (#77617, fixes #77612) Thanks @rriggs and @yfge.
|
||||
- Codex/app-server: stabilize transcript mirror dedupe across re-mirrored turns so reordered snapshots no longer drop reasoning entries or duplicate the assistant reply. Refs #77012. (#77046) Thanks @openperf.
|
||||
- Agents/auth-profiles: do not record request-shape (`format`) rejections as auth-profile health failures, so a single per-session transcript-shape error (such as a prefill-strict 400 "conversation must end with a user message") no longer triggers a profile-wide cooldown that blocks every other healthy session sharing the same auth profile. Refs #77228. (#77280) Thanks @openperf.
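The Agents/messaging entry above about preserving string thread IDs rests on a general JavaScript pitfall: numeric-looking IDs larger than `Number.MAX_SAFE_INTEGER` can collide once they are coerced to numbers. The following is a minimal sketch of that pitfall only. The helper names `sameThreadId` and `sameThreadIdLossy` are hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical helpers for illustration; not OpenClaw's actual implementation.
type ThreadId = string | number;

// Compare thread IDs as strings so large numeric-looking IDs keep every digit.
function sameThreadId(a: ThreadId, b: ThreadId): boolean {
  return String(a) === String(b);
}

// Lossy variant shown only for contrast: coercing to Number rounds values
// above Number.MAX_SAFE_INTEGER (2^53 - 1) to the nearest representable double.
function sameThreadIdLossy(a: ThreadId, b: ThreadId): boolean {
  return Number(a) === Number(b);
}

const topicA = "9007199254740993"; // 2^53 + 1
const topicB = "9007199254740992"; // 2^53

console.log(sameThreadId(topicA, topicB)); // false: distinct topics stay distinct
console.log(sameThreadIdLossy(topicA, topicB)); // true: both round to 2^53
```

Keeping the ID as a string until the final comparison means two different topics cannot be treated as the same dedupe route because of floating-point rounding.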
|
||||
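The Gateway/sessions entries above describe bounding default `sessions.list` responses and reporting truncation metadata. One way to picture that pattern is the sketch below. It assumes a hypothetical response shape (`rows`, `truncated`, `totalCount`, `limit`) and an arbitrary default cap, so the real RPC fields and limits may differ.

```typescript
// Hypothetical response shape; the real `sessions.list` fields may differ.
interface BoundedList<R> {
  rows: R[];
  truncated: boolean;
  totalCount: number;
  limit: number;
}

// Assumed default cap, chosen only for this example.
const DEFAULT_LIMIT = 200;

function listBounded<S, R>(
  source: readonly S[],
  buildRow: (entry: S) => R,
  limit: number = DEFAULT_LIMIT,
): BoundedList<R> {
  // Normalize finite limits to a non-negative whole number.
  const cap = Math.max(0, Math.floor(limit));
  // Slice before mapping so row construction only runs for entries that are returned.
  const rows = source.slice(0, cap).map(buildRow);
  return {
    rows,
    truncated: source.length > rows.length,
    totalCount: source.length,
    limit: cap,
  };
}

const result = listBounded(["s1", "s2", "s3"], (id) => ({ id }), 2);
console.log(result.rows.length, result.truncated, result.totalCount);
```

Returning `truncated` and `totalCount` alongside the rows lets a client tell that more entries exist and ask for a larger limit explicitly, instead of the server building every row by default.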
|
||||
## 2026.5.3-1
|
||||
|
||||
@@ -617,6 +591,7 @@ Docs: https://docs.openclaw.ai
|
||||
- Plugins/update: keep externalized bundled npm bridge updates on the normal plugin security scanner path instead of granting source-linked official trust without artifact provenance. (#76765) Thanks @Lucenx9.
|
||||
- Agents/reply context: label replied-to messages as the current user message target in model-visible metadata, so short replies are grounded to their explicit reply target instead of nearby chat history. (#76817) Thanks @obviyus.
|
||||
- Doctor/plugins: install configured missing official plugins such as Discord and Brave during doctor/update repair, auto-enable repaired provider plugins, preserve config when a download fails, and stop auto-enable from inventing plugin entries when no manifest declares a configured channel. Fixes #76872. Thanks @jack-stormentswe.
|
||||
- Codex/app-server: stabilize transcript mirror dedupe across re-mirrored turns so reordered snapshots no longer drop reasoning entries or duplicate the assistant reply. Refs #77012. (#77046) Thanks @openperf.
|
||||
|
||||
## 2026.5.2
|
||||
|
||||
@@ -1430,6 +1405,7 @@ Docs: https://docs.openclaw.ai
|
||||
- Gateway/plugins: enable the native `require()` fast path on Windows for bundled plugin modules so plugin loading uses `require()` instead of Jiti's transform pipeline, reducing startup from ~39s to ~2s on typical 6-plugin setups. Fixes #68656. (#74173) Thanks @galiniliev.
|
||||
- macOS app: detect stale Gateway TLS certificate pins, automatically repair trusted Tailscale Serve rotations, and surface paired-but-disconnected Mac companion nodes so partial Gateway connections no longer look healthy. Thanks @guti.
|
||||
- Feishu: recreate WebSocket clients with monitor-owned backoff only after SDK reconnect exhaustion, preserving heartbeat defaults and shutdown cleanup without treating recoverable SDK callback errors as terminal, so persistent connections recover without manual gateway restart. Fixes #52618; duplicate evidence #59753; related #55532, #68766, #72411, and #73739. Thanks @vincentkoc, @schumilin, @alex-xuweilong, @120106835, @sirfengyu, and @tianhaocui.
|
||||
- Agents/skills: require exact `<location>` skill paths for both single-skill and multi-skill prompt selection, so agents do not guess or hard-code skill file paths. (#74161) Thanks @lanzhi-lee.
|
||||
|
||||
## 2026.4.27
|
||||
|
||||
|
||||
@@ -100,7 +100,6 @@ For coordinated change sets that genuinely need more than 20 PRs, join the **#cl
|
||||
## Before You PR
|
||||
|
||||
- Test locally with your OpenClaw instance
|
||||
- External PRs must include a filled **Real behavior proof** section in the PR body. Show the real setup you tested, the exact command or steps you ran after the patch, after-fix evidence, the observed result, and anything you did not test. Screenshots, recordings, terminal screenshots, console output, copied live output, linked artifacts, and redacted runtime logs all count. Unit tests, mocks, snapshots, lint, typechecks, and CI are useful but do not satisfy this requirement by themselves. Maintainers may apply `proof: override` only when the proof gate should not apply.
|
||||
- Run tests: `pnpm build && pnpm check && pnpm test`
|
||||
- For iterative local commits, `scripts/committer --fast "message" <files...>` passes `FAST_COMMIT=1` through to the pre-commit hook so it skips the repo-wide `pnpm check`. Only use it when you've already run equivalent targeted validation for the touched surface.
|
||||
- For extension/plugin changes, run the fast local lane first:
|
||||
@@ -161,7 +160,7 @@ Built with Codex, Claude, or other AI tools? **Awesome - just mark it!**
|
||||
Please include in your PR:
|
||||
|
||||
- [ ] Mark as AI-assisted in the PR title or description
|
||||
- [ ] Include human-run real behavior proof from your own setup. AI-generated tests, mocks, lint, typechecks, and CI output are supplemental only; they do not prove the fix works for users.
|
||||
- [ ] Note the degree of testing (untested / lightly tested / fully tested)
|
||||
- [ ] Include prompts or session logs if possible (super helpful!)
|
||||
- [ ] Confirm you understand what the code does
|
||||
- [ ] If you have access to Codex, run `codex review --base origin/main` locally and address the findings before asking for review
|
||||
|
||||
@@ -4172,7 +4172,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
public let enabled: AnyCodable?
|
||||
public let sortby: AnyCodable?
|
||||
public let sortdir: AnyCodable?
|
||||
public let agentid: String?
|
||||
|
||||
public init(
|
||||
includedisabled: Bool?,
|
||||
@@ -4181,8 +4180,7 @@ public struct CronListParams: Codable, Sendable {
|
||||
query: String?,
|
||||
enabled: AnyCodable?,
|
||||
sortby: AnyCodable?,
|
||||
sortdir: AnyCodable?,
|
||||
agentid: String?)
|
||||
sortdir: AnyCodable?)
|
||||
{
|
||||
self.includedisabled = includedisabled
|
||||
self.limit = limit
|
||||
@@ -4191,7 +4189,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
self.enabled = enabled
|
||||
self.sortby = sortby
|
||||
self.sortdir = sortdir
|
||||
self.agentid = agentid
|
||||
}
|
||||
|
||||
private enum CodingKeys: String, CodingKey {
|
||||
@@ -4202,7 +4199,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
case enabled
|
||||
case sortby = "sortBy"
|
||||
case sortdir = "sortDir"
|
||||
case agentid = "agentId"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -4172,7 +4172,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
public let enabled: AnyCodable?
|
||||
public let sortby: AnyCodable?
|
||||
public let sortdir: AnyCodable?
|
||||
public let agentid: String?
|
||||
|
||||
public init(
|
||||
includedisabled: Bool?,
|
||||
@@ -4181,8 +4180,7 @@ public struct CronListParams: Codable, Sendable {
|
||||
query: String?,
|
||||
enabled: AnyCodable?,
|
||||
sortby: AnyCodable?,
|
||||
sortdir: AnyCodable?,
|
||||
agentid: String?)
|
||||
sortdir: AnyCodable?)
|
||||
{
|
||||
self.includedisabled = includedisabled
|
||||
self.limit = limit
|
||||
@@ -4191,7 +4189,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
self.enabled = enabled
|
||||
self.sortby = sortby
|
||||
self.sortdir = sortdir
|
||||
self.agentid = agentid
|
||||
}
|
||||
|
||||
private enum CodingKeys: String, CodingKey {
|
||||
@@ -4202,7 +4199,6 @@ public struct CronListParams: Codable, Sendable {
|
||||
case enabled
|
||||
case sortby = "sortBy"
|
||||
case sortdir = "sortDir"
|
||||
case agentid = "agentId"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -102,7 +102,7 @@ Not every agent run creates a task. Heartbeat turns and normal interactive chat
|
||||
<Accordion title="Notify defaults for cron and media">
|
||||
Main-session cron tasks use `silent` notify policy by default — they create records for tracking but do not generate notifications. Isolated cron tasks also default to `silent` but are more visible because they run in their own session.
|
||||
|
||||
Session-backed `music_generate` and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Group/channel completions follow the normal visible-reply policy, so the agent uses the message tool when source delivery requires it. If the completion agent fails to produce message-tool delivery evidence in a tool-only route, OpenClaw sends the completion fallback directly to the original channel instead of leaving the media private.
|
||||
Session-backed `music_generate` and `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished media itself. Group/channel completions follow the normal visible-reply policy, so the agent uses the message tool when source delivery requires it.
|
||||
|
||||
</Accordion>
|
||||
<Accordion title="Concurrent video_generate guardrail">
|
||||
|
||||
@@ -344,7 +344,6 @@ curl "https://api.telegram.org/bot<bot_token>/getUpdates"
|
||||
For text-only replies:
|
||||
|
||||
- short DM/group/topic previews: OpenClaw keeps the same preview message and performs a final edit in place, unless a visible non-preview message was sent after the preview appeared
|
||||
- long text finals that split into multiple Telegram messages reuse the existing preview as the first final chunk when possible, then send only the remaining chunks
|
||||
- previews followed by visible non-preview output: OpenClaw sends the completed reply as a fresh final message and cleans up the older preview, so the final answer appears after intermediate output
|
||||
- previews older than about one minute: OpenClaw sends the completed reply as a fresh final message and then cleans up the preview, so Telegram's visible timestamp reflects completion time instead of the preview creation time
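
The text-only finalization rules above can be read as a small decision function. The following is a minimal sketch, not OpenClaw's actual implementation: the type names, the `chooseFinalizeAction` helper, and the exact 60-second threshold (the text only says "about one minute") are all assumptions made for illustration.

```typescript
// Hypothetical types and names; they only illustrate the rules listed above.
type PreviewState = {
  createdAtMs: number; // when the preview message first appeared
  visibleOutputAfterPreview: boolean; // a visible non-preview message followed it
};

type FinalizeAction = "edit-in-place" | "send-fresh-and-cleanup";

// Assumed threshold: the docs say "about one minute", not an exact value.
const STALE_PREVIEW_MS = 60_000;

function chooseFinalizeAction(preview: PreviewState, nowMs: number): FinalizeAction {
  // Intermediate visible output means the final answer must appear after it.
  if (preview.visibleOutputAfterPreview) return "send-fresh-and-cleanup";
  // An old preview would carry a misleading timestamp if edited in place.
  if (nowMs - preview.createdAtMs >= STALE_PREVIEW_MS) return "send-fresh-and-cleanup";
  // Otherwise keep the same message and perform a final edit.
  return "edit-in-place";
}
```

Keeping the decision pure (state in, action out) would make each bullet above checkable in isolation, independent of the Telegram API calls that carry it out.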
|
||||
|
||||
|
||||
@@ -26,16 +26,6 @@ openclaw plugins install @openclaw/whatsapp
|
||||
Use the bare package to follow the current official release tag. Pin an exact
|
||||
version only when you need a reproducible install.
|
||||
|
||||
On Windows, the WhatsApp plugin needs Git on `PATH` during npm install because
|
||||
one of its Baileys/libsignal dependencies is fetched from a git URL. Install
|
||||
Git for Windows, then restart the shell and rerun the install:
|
||||
|
||||
```powershell
|
||||
winget install --id Git.Git -e
|
||||
```
|
||||
|
||||
Portable Git also works if its `bin` directory is on `PATH`.
|
||||
|
||||
<CardGroup cols={3}>
|
||||
<Card title="Pairing" icon="link" href="/channels/pairing">
|
||||
Default DM policy is pairing for unknown senders.
|
||||
|
||||
@@ -265,7 +265,7 @@ For the dedicated update and plugin testing policy, including local commands,
|
||||
Docker lanes, Package Acceptance inputs, release defaults, and failure triage,
|
||||
see [Testing updates and plugins](/help/testing-updates-plugins).
|
||||
|
||||
Release checks call Package Acceptance with `source=artifact`, the prepared release package artifact, `suite_profile=custom`, `docker_lanes='doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update'`, and `telegram_mode=mock-openai`. This keeps package migration, update, stale-plugin-dependency cleanup, configured-plugin install repair, offline plugin, plugin-update, and Telegram proof on the same resolved package tarball. Set `package_acceptance_package_spec` on Full Release Validation or OpenClaw Release Checks to run that same matrix against a shipped npm package instead of the SHA-built artifact. Cross-OS release checks still cover OS-specific onboarding, installer, and platform behavior; package/update product validation should start with Package Acceptance. The `published-upgrade-survivor` Docker lane validates one published package baseline per run in the blocking release path. In Package Acceptance, the resolved `package-under-test` tarball is always the candidate and `published_upgrade_survivor_baseline` selects the fallback published baseline, defaulting to `openclaw@latest`; failed-lane rerun commands preserve that baseline. Full Release Validation with `run_release_soak=true` or `release_profile=full` sets `published_upgrade_survivor_baselines='last-stable-4 2026.4.23 2026.5.2 2026.4.15'` and `published_upgrade_survivor_scenarios=reported-issues` to expand across the four latest stable npm releases plus pinned plugin-compatibility boundary releases and issue-shaped fixtures for Feishu config, preserved bootstrap/persona files, configured OpenClaw plugin installs, tilde log paths, and stale legacy plugin dependency roots. Multi-baseline published-upgrade survivor selections are sharded by baseline into separate targeted Docker runner jobs. 
The separate `Update Migration` workflow uses the `update-migration` Docker lane with `all-since-2026.4.23` and `plugin-deps-cleanup` when the question is exhaustive published update cleanup, not normal Full Release CI breadth. Local aggregate runs can pass exact package specs with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS`, keep a single lane with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC` such as `openclaw@2026.4.15`, or set `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` for the scenario matrix. The published lane configures the baseline with a baked `openclaw config set` command recipe, records recipe steps in `summary.json`, and probes `/healthz`, `/readyz`, plus RPC status after Gateway start. The Windows packaged and installer fresh lanes also verify that an installed package can import a browser-control override from a raw absolute Windows path. The OpenAI cross-OS agent-turn smoke defaults to `OPENCLAW_CROSS_OS_OPENAI_MODEL` when set, otherwise `openai/gpt-5.4`, so the install and gateway proof stays on a GPT-5 test model while avoiding GPT-4.x defaults.
|
||||
Release checks call Package Acceptance with `source=artifact`, the prepared release package artifact, `suite_profile=custom`, `docker_lanes='doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update'`, and `telegram_mode=mock-openai`. This keeps package migration, update, stale-plugin-dependency cleanup, configured-plugin install repair, offline plugin, plugin-update, and Telegram proof on the same resolved package tarball. Set `package_acceptance_package_spec` on Full Release Validation or OpenClaw Release Checks to run that same matrix against a shipped npm package instead of the SHA-built artifact. Cross-OS release checks still cover OS-specific onboarding, installer, and platform behavior; package/update product validation should start with Package Acceptance. The `published-upgrade-survivor` Docker lane validates one published package baseline per run in the blocking release path. In Package Acceptance, the resolved `package-under-test` tarball is always the candidate and `published_upgrade_survivor_baseline` selects the fallback published baseline, defaulting to `openclaw@latest`; failed-lane rerun commands preserve that baseline. Full Release Validation with `run_release_soak=true` or `release_profile=full` sets `published_upgrade_survivor_baselines=all-since-2026.4.23` and `published_upgrade_survivor_scenarios=reported-issues` to expand across every stable npm release from `2026.4.23` through `latest` and issue-shaped fixtures for Feishu config, preserved bootstrap/persona files, configured OpenClaw plugin installs, tilde log paths, and stale legacy plugin dependency roots. The separate `Update Migration` workflow uses the `update-migration` Docker lane with `all-since-2026.4.23` and `plugin-deps-cleanup` when the question is exhaustive published update cleanup, not normal Full Release CI breadth. 
Local aggregate runs can pass exact package specs with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS`, keep a single lane with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC` such as `openclaw@2026.4.15`, or set `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` for the scenario matrix. The published lane configures the baseline with a baked `openclaw config set` command recipe, records recipe steps in `summary.json`, and probes `/healthz`, `/readyz`, plus RPC status after Gateway start. The Windows packaged and installer fresh lanes also verify that an installed package can import a browser-control override from a raw absolute Windows path. The OpenAI cross-OS agent-turn smoke defaults to `OPENCLAW_CROSS_OS_OPENAI_MODEL` when set, otherwise `openai/gpt-5.4`, so the install and gateway proof stays on a GPT-5 test model while avoiding GPT-4.x defaults.
|
||||
|
||||
### Legacy compatibility windows
|
||||
|
||||
|
||||
@@ -63,6 +63,7 @@ or `openclaw approvals set --node <id|name|ip>`.
|
||||
openclaw approvals get
|
||||
openclaw approvals get --node <id|name|ip>
|
||||
openclaw approvals get --gateway
|
||||
openclaw approvals list --gateway
|
||||
```
|
||||
|
||||
`openclaw approvals get` now shows the effective exec policy for local, gateway, and node targets:
|
||||
@@ -78,6 +79,8 @@ Precedence is intentional:
|
||||
- `--node` combines the node host approvals file with gateway `tools.exec` policy, because both still apply at runtime
|
||||
- if gateway config is unavailable, the CLI falls back to the node approvals snapshot and notes that the final runtime policy could not be computed
|
||||
|
||||
`openclaw approvals list --gateway` lists pending runtime exec approval requests on the gateway. Use `get` for policy snapshots and allowlists; use `list` when an agent is waiting for an approval id.
|
||||
|
||||
## Replace approvals from a file
|
||||
|
||||
```bash
|
||||
|
||||
@@ -211,15 +211,12 @@ Manual run and inspection:
|
||||
|
||||
```bash
|
||||
openclaw cron list
|
||||
openclaw cron list --agent ops
|
||||
openclaw cron show <job-id>
|
||||
openclaw cron run <job-id>
|
||||
openclaw cron run <job-id> --due
|
||||
openclaw cron runs --id <job-id> --limit 50
|
||||
```
|
||||
|
||||
`openclaw cron list` shows all matching jobs by default. Pass `--agent <id>` to show only jobs whose effective normalized agent id matches; jobs without a stored agent id count as the configured default agent.
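
As an illustration of the `--agent` matching rule, here is a minimal sketch. The `CronJob` shape, the `filterJobsByAgent` helper, and the normalization step (trim plus lowercase) are assumptions; the text only states that matching uses the effective normalized agent id and that jobs without a stored id count as the default agent.

```typescript
// Hypothetical job shape; only the optional agent id matters for this sketch.
interface CronJob {
  id: string;
  agentId?: string;
}

// Assumed normalization: the real rules may differ.
const normalizeAgentId = (id: string): string => id.trim().toLowerCase();

function filterJobsByAgent(
  jobs: CronJob[],
  agent: string | undefined,
  defaultAgentId: string,
): CronJob[] {
  // Without --agent, every matching job is listed.
  if (agent === undefined) return jobs;
  const wanted = normalizeAgentId(agent);
  // Jobs without a stored agent id fall back to the configured default agent.
  return jobs.filter((job) => normalizeAgentId(job.agentId ?? defaultAgentId) === wanted);
}
```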
|
||||
|
||||
`cron runs` entries include delivery diagnostics with the intended cron target, the resolved target, message-tool sends, fallback use, and delivered state.
|
||||
|
||||
Agent and session retargeting:
|
||||
|
||||
@@ -31,6 +31,7 @@ openclaw sessions --active 120
|
||||
openclaw sessions --limit 25
|
||||
openclaw sessions --verbose
|
||||
openclaw sessions --json
|
||||
openclaw sessions list --json
|
||||
```
|
||||
|
||||
Scope selection:
|
||||
@@ -99,7 +100,6 @@ openclaw sessions cleanup --json
|
||||
`openclaw sessions cleanup` uses `session.maintenance` settings from config:
|
||||
|
||||
- Scope note: `openclaw sessions cleanup` maintains session stores, transcripts, and trajectory sidecars. It does not prune cron run logs (`cron/runs/<jobId>.jsonl`), which are managed by `cron.runLog.maxBytes` and `cron.runLog.keepLines` in [Cron configuration](/automation/cron-jobs#configuration) and explained in [Cron maintenance](/automation/cron-jobs#maintenance).
|
||||
- Cleanup also prunes unreferenced primary transcripts, compaction checkpoints, and trajectory sidecars older than `session.maintenance.pruneAfter`; files still referenced by `sessions.json` are preserved.
|
||||
|
||||
- `--dry-run`: preview how many entries would be pruned/capped without writing.
|
||||
- In text mode, dry-run prints a per-session action table (`Action`, `Key`, `Age`, `Model`, `Flags`) so you can see what would be kept vs removed.
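
The pruning rule for unreferenced files can be sketched as a pure selection step. This is a hedged illustration only: `StoredFile`, `selectPrunable`, and the use of a modification time to measure age are assumptions, and `pruneAfterMs` stands in for whatever duration `session.maintenance.pruneAfter` resolves to.

```typescript
// Hypothetical file record; age is measured from mtime here as an assumption.
interface StoredFile {
  path: string;
  mtimeMs: number;
}

function selectPrunable(
  files: StoredFile[],
  referenced: Set<string>, // paths still referenced by sessions.json
  pruneAfterMs: number,
  nowMs: number,
): string[] {
  return files
    // Referenced files are always preserved, regardless of age.
    .filter((file) => !referenced.has(file.path))
    // Only unreferenced files older than the prune window are candidates.
    .filter((file) => nowMs - file.mtimeMs > pruneAfterMs)
    .map((file) => file.path);
}
```

A `--dry-run` mode could report the returned paths without deleting anything, which matches the preview behavior described above.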
|
||||
|
||||
@@ -26,7 +26,6 @@ Notes:
|
||||
- Session status output separates `Execution:` from `Runtime:`. `Execution` is the sandbox path (`direct`, `docker/*`), while `Runtime` tells you whether the session is using `OpenClaw Pi Default`, `OpenAI Codex`, a CLI backend, or an ACP backend such as `codex (acp/acpx)`. See [Agent runtimes](/concepts/agent-runtimes) for the provider/model/runtime distinction.
|
||||
- MiniMax's raw `usage_percent` / `usagePercent` fields are remaining quota, so OpenClaw inverts them before display; count-based fields win when present. `model_remains` responses prefer the chat-model entry, derive the window label from timestamps when needed, and include the model name in the plan label.
|
||||
- When the current session snapshot is sparse, `/status` can backfill token and cache counters from the most recent transcript usage log. Existing nonzero live values still win over transcript fallback values.
|
||||
- `/status` includes compact Gateway process uptime and host system uptime.
|
||||
- Transcript fallback can also recover the active runtime model label when the live session entry is missing it. If that transcript model differs from the selected model, status resolves the context window against the recovered runtime model instead of the selected one.
|
||||
- For prompt-size accounting, transcript fallback prefers the larger prompt-oriented total when session metadata is missing or smaller, so custom-provider sessions do not collapse to `0` token displays.
|
||||
- Output includes per-agent session stores when multiple agents are configured.
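
The MiniMax note above (raw percent fields report remaining quota, and count-based fields win when present) can be sketched as follows. Only `usage_percent` and `usagePercent` come from the text; the count field names, the 0 to 100 scale, and the `usedPercent` helper are assumptions for illustration.

```typescript
// Hypothetical response shape. usedCount/totalCount are assumed names.
interface MiniMaxUsageRaw {
  usage_percent?: number; // remaining quota, assumed to be on a 0-100 scale
  usagePercent?: number; // camelCase variant of the same field
  usedCount?: number;
  totalCount?: number;
}

const clampPercent = (value: number): number => Math.min(100, Math.max(0, value));

function usedPercent(raw: MiniMaxUsageRaw): number | undefined {
  // Count-based fields take priority when both are available.
  if (raw.usedCount !== undefined && raw.totalCount !== undefined && raw.totalCount > 0) {
    return clampPercent((raw.usedCount / raw.totalCount) * 100);
  }
  // The raw percent is *remaining* quota, so invert it for a "used" display.
  const remaining = raw.usage_percent ?? raw.usagePercent;
  return remaining === undefined ? undefined : clampPercent(100 - remaining);
}
```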
|
||||
|
||||
@@ -168,11 +168,7 @@ worktrees, runs `discord-status-reactions-tool-only` against each worktree, and
|
||||
uploads `baseline/`, `candidate/`, `comparison.json`, and `mantis-report.md` as
|
||||
Actions artifacts. It also renders each lane's timeline HTML in a Crabbox
|
||||
desktop browser and publishes those VNC screenshots beside the deterministic
|
||||
timeline PNGs in the PR comment. The same PR comment embeds lightweight
|
||||
motion-trimmed GIF previews generated by `crabbox media preview`, links to the
|
||||
matching motion-trimmed MP4 clips, and keeps the full desktop MP4 files for deep
|
||||
inspection. Screenshots stay inline for quick review. The workflow builds the
|
||||
Crabbox CLI from
|
||||
timeline PNGs in the PR comment. The workflow builds the Crabbox CLI from
|
||||
`openclaw/crabbox` main so it can use the current desktop/browser lease flags
|
||||
before the next Crabbox binary release is cut.
|
||||
|
||||
|
||||
@@ -132,37 +132,12 @@ pnpm openclaw qa mantis slack-desktop-smoke \
|
||||
|
||||
That command leases a Crabbox desktop/browser machine, runs the Slack live lane
|
||||
inside the VM, opens Slack Web in the VNC browser, captures the desktop, and
|
||||
copies `slack-qa/`, `slack-desktop-smoke.png`, and, when video capture is
|
||||
available, `slack-desktop-smoke.mp4` back to the Mantis artifact directory. Reuse `--lease-id <cbx_...>` after logging in to Slack Web manually
|
||||
copies `slack-qa/` plus `slack-desktop-smoke.png` back to the Mantis artifact
|
||||
directory. Reuse `--lease-id <cbx_...>` after logging in to Slack Web manually
|
||||
through VNC. With `--gateway-setup`, Mantis leaves a persistent OpenClaw Slack
|
||||
gateway running inside the VM on port `38973`; without it, the command runs the
|
||||
normal bot-to-bot Slack QA lane and exits after artifact capture.
|
||||
|
||||
For an agent/CV style desktop task, run:
|
||||
|
||||
```bash
|
||||
pnpm openclaw qa mantis visual-task \
|
||||
--browser-url https://example.net \
|
||||
--expect-text "Example Domain" \
|
||||
--vision-model openai/gpt-5.4
|
||||
```
|
||||
|
||||
`visual-task` leases or reuses a Crabbox desktop/browser machine, starts
|
||||
`crabbox record --while`, drives the visible browser through a nested
|
||||
`visual-driver`, captures `visual-task.png`, runs `openclaw infer image describe`
|
||||
against the screenshot when `--vision-mode image-describe` is selected, and
|
||||
writes `visual-task.mp4`, `mantis-visual-task-summary.json`,
|
||||
`mantis-visual-task-driver-result.json`, and `mantis-visual-task-report.md`.
|
||||
When `--expect-text` is set, the vision prompt asks for a structured JSON
|
||||
verdict and only passes when the model reports positive visible evidence; a
|
||||
negative response that merely quotes the target text fails the assertion.
|
||||
Use `--vision-mode metadata` for a no-model smoke that proves the desktop,
|
||||
browser, screenshot, and video plumbing without calling an image-understanding
|
||||
provider. Recording is a required artifact for `visual-task`; if Crabbox records
|
||||
no non-empty `visual-task.mp4`, the task fails even when the visual driver
|
||||
passed. On failure, Mantis keeps the lease for VNC unless the task had already
|
||||
passed and `--keep-lease` was not set.
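
The `--expect-text` rule (pass only on positive visible evidence, never on a reply that merely quotes the target text) can be illustrated with a small verdict check. This is a sketch under stated assumptions: the `visible` field and the `passesExpectText` helper are hypothetical, since the text does not specify the structured JSON schema.

```typescript
// Hypothetical verdict check; the real JSON schema is not given in the docs.
function passesExpectText(modelReply: string): boolean {
  try {
    const parsed: unknown = JSON.parse(modelReply);
    if (typeof parsed !== "object" || parsed === null) return false;
    // Only an explicit positive verdict counts. Quoting the target text inside
    // a negative verdict, or in free-form prose, must not pass.
    return (parsed as { visible?: unknown }).visible === true;
  } catch {
    // Non-JSON replies cannot carry a structured verdict, so they fail.
    return false;
  }
}
```

Checking a boolean field instead of searching the reply for the expected string is what keeps a response such as "the page does not show Example Domain" from passing.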
|
||||
|
||||
Before using pooled live credentials, run:
|
||||
|
||||
```bash
|
||||
@@ -257,8 +232,6 @@ Scenarios (`extensions/qa-lab/src/live-transports/telegram/telegram-live.runtime
|
||||
- `telegram-tools-compact-command`
|
||||
- `telegram-whoami-command`
|
||||
- `telegram-context-command`
|
||||
- `telegram-long-final-reuses-preview`
|
||||
- `telegram-long-final-three-chunks`
|
||||
|
||||
Output artifacts:
|
||||
|
||||
@@ -291,7 +264,7 @@ Scenarios (`extensions/qa-lab/src/live-transports/discord/discord-live.runtime.t
|
||||
- `discord-canary`
|
||||
- `discord-mention-gating`
|
||||
- `discord-native-help-command-registration`
|
||||
- `discord-status-reactions-tool-only` — opt-in Mantis scenario. Runs by itself because it switches the SUT to always-on, tool-only guild replies with `messages.statusReactions.enabled=true`, then captures a REST reaction timeline plus HTML/PNG visual artifacts. Mantis before/after reports also preserve scenario-provided MP4 artifacts as `baseline.mp4` and `candidate.mp4`.
|
||||
- `discord-status-reactions-tool-only` — opt-in Mantis scenario. Runs by itself because it switches the SUT to always-on, tool-only guild replies with `messages.statusReactions.enabled=true`, then captures a REST reaction timeline plus an HTML/PNG visual artifact.
|
||||
|
||||
Run the Mantis status-reaction scenario explicitly:
|
||||
|
||||
|
||||
@@ -78,7 +78,6 @@ pnpm test:docker:plugin-lifecycle-matrix
|
||||
pnpm test:docker:plugin-update
|
||||
pnpm test:docker:upgrade-survivor
|
||||
pnpm test:docker:published-upgrade-survivor
|
||||
pnpm test:docker:update-restart-auth
|
||||
pnpm test:docker:update-migration
|
||||
```
|
||||
|
||||
@@ -104,10 +103,6 @@ Important lanes:
|
||||
configures it through a baked `openclaw config set` recipe, updates it to the
|
||||
candidate tarball, runs doctor, checks legacy cleanup, starts the Gateway, and
|
||||
probes `/healthz`, `/readyz`, and RPC status.
|
||||
- `test:docker:update-restart-auth` installs the candidate package, starts a
|
||||
managed token-auth Gateway, unsets caller gateway auth env for
|
||||
`openclaw update --yes --json`, and requires the candidate update command to
|
||||
restart the Gateway before the normal probes.
|
||||
- `test:docker:update-migration` is the cleanup-heavy published-update lane. It
|
||||
starts from a configured Discord/Telegram-style user state, runs baseline
|
||||
doctor so configured plugin dependencies have a chance to materialize, seeds
|
||||
@@ -169,41 +164,30 @@ resolved release SHA. For post-publish proof, pass
|
||||
`package_acceptance_package_spec=openclaw@YYYY.M.D` so the same upgrade matrix
|
||||
targets the shipped npm package instead.
|
||||
|
||||
Release checks call Package Acceptance with the package/update/restart/plugin set:
|
||||
Release checks call Package Acceptance with the package/update/plugin set:
|
||||
|
||||
```text
|
||||
doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update
|
||||
doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update
|
||||
```
|
||||
|
||||
When release soak is enabled, they also pass:
|
||||
They also pass:
|
||||
|
||||
```text
|
||||
published_upgrade_survivor_baselines=last-stable-4 2026.4.23 2026.5.2 2026.4.15
|
||||
published_upgrade_survivor_baselines=all-since-2026.4.23
|
||||
published_upgrade_survivor_scenarios=reported-issues
|
||||
telegram_mode=mock-openai
|
||||
```
|
||||
|
||||
This keeps package migration, update channel switching, stale plugin dependency
|
||||
cleanup, offline plugin coverage, plugin update behavior, and Telegram package
|
||||
QA on the same resolved artifact without making the default release package gate
|
||||
walk every published release.
|
||||
QA on the same resolved artifact.
|
||||
|
||||
`last-stable-4` resolves to the four latest stable npm-published OpenClaw
|
||||
releases. Release package acceptance pins `2026.4.23` as the first plugin-update
|
||||
compatibility boundary, `2026.5.2` as a plugin-architecture churn boundary, and
|
||||
`2026.4.15` as an older 2026.4.1x published-update baseline; the resolver
|
||||
dedupes pins that are already in the latest four. For exhaustive published
|
||||
`all-since-2026.4.23` is the Full Release CI upgrade sample: every stable npm-published release from `2026.4.23` through `latest`. For exhaustive published
|
||||
update migration coverage, use `all-since-2026.4.23` in the separate Update
|
||||
Migration workflow instead of Full Release CI. `release-history` remains
|
||||
available for manual wider sampling when you also want the legacy pre-date
|
||||
anchor.
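
To make the baseline tokens concrete, here is a minimal sketch of how a resolver might expand `last-stable-N` and `all-since-<version>` and dedupe pinned versions. It is not the workflow's real resolver: the function names, the output ordering, and the input list of stable versions are assumptions for illustration.

```typescript
// Compare dotted numeric versions such as 2026.4.23 component by component.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical resolver: `stable` is the list of stable published versions.
function resolveBaselines(tokens: string[], stable: string[]): string[] {
  const sorted = [...stable].sort(compareVersions); // oldest first
  const resolved: string[] = [];
  const add = (version: string): void => {
    if (!resolved.includes(version)) resolved.push(version); // dedupe pins
  };
  for (const token of tokens) {
    const lastN = /^last-stable-(\d+)$/.exec(token);
    const allSince = /^all-since-(.+)$/.exec(token);
    if (lastN) {
      const count = Number(lastN[1]);
      if (count > 0) sorted.slice(-count).forEach(add);
    } else if (allSince) {
      const since = allSince[1] ?? "";
      sorted.filter((version) => compareVersions(version, since) >= 0).forEach(add);
    } else {
      add(token); // exact pinned version
    }
  }
  return resolved;
}
```

With this shape, a selection such as `last-stable-4 2026.4.23 2026.5.2 2026.4.15` yields each baseline once even when a pin already falls inside the latest four.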
|
||||
|
||||
When multiple published-upgrade survivor baselines are selected, the reusable
|
||||
Docker workflow shards each baseline into its own targeted runner job. Each
|
||||
baseline shard still runs the selected scenario set, but logs and artifacts stay
|
||||
per-baseline and wall time is bounded by the slowest shard instead of one large
|
||||
serial job.
|
||||
|
||||
Run a package profile manually when validating a candidate before release:
|
||||
|
||||
```bash
|
||||
@@ -213,7 +197,7 @@ gh workflow run package-acceptance.yml \
|
||||
-f source=npm \
|
||||
-f package_spec=openclaw@beta \
|
||||
-f suite_profile=package \
|
||||
-f published_upgrade_survivor_baselines="last-stable-4 2026.4.23 2026.5.2 2026.4.15" \
|
||||
-f published_upgrade_survivor_baselines=all-since-2026.4.23 \
|
||||
-f published_upgrade_survivor_scenarios=reported-issues \
|
||||
-f telegram_mode=mock-openai
|
||||
```
|
||||
@@ -229,7 +213,7 @@ For release candidates, the default proof stack is:
|
||||
1. `pnpm check:changed` and `pnpm test:changed` for source-level regressions.
|
||||
2. `pnpm release:check` for package artifact integrity.
|
||||
3. Package Acceptance `package` profile or the release-check custom package
|
||||
lanes for install/update/restart/plugin contracts.
|
||||
lanes for install/update/plugin contracts.
|
||||
4. Cross-OS release checks for OS-specific installer, onboarding, and platform
|
||||
behavior.
|
||||
5. Live suites only when the changed surface touches provider or hosted-service
|
||||
@@ -250,8 +234,7 @@ Compatibility leniency is narrow and time boxed:
|
||||
warning or skipping.
|
||||
|
||||
Do not add new startup migrations for these old shapes. Add or extend a doctor
|
||||
repair, then prove it with `upgrade-survivor`, `published-upgrade-survivor`, or
|
||||
`update-restart-auth` when the update command owns the restart.
|
||||
repair, then prove it with `upgrade-survivor` or `published-upgrade-survivor`.
|
||||
|
||||
## Adding coverage
|
||||
|
||||
@@ -263,7 +246,6 @@ can fail for the right reason:
|
||||
checker test.
|
||||
- CLI install/update behavior: Docker lane assertion or fixture.
|
||||
- Published-release migration behavior: `published-upgrade-survivor` scenario.
|
||||
- Update-owned restart behavior: `update-restart-auth`.
|
||||
- Registry/package source behavior: `test:docker:plugins` fixture or ClawHub
|
||||
fixture server.
|
||||
- Dependency layout or cleanup behavior: assert both runtime execution and the
|
||||
|
||||
@@ -643,7 +643,7 @@ The live-model Docker runners also bind-mount only the needed CLI auth homes (or
|
||||
- Npm tarball onboarding/channel/agent smoke: `pnpm test:docker:npm-onboard-channel-agent` installs the packed OpenClaw tarball globally in Docker, configures OpenAI via env-ref onboarding plus Telegram by default, runs doctor, and runs one mocked OpenAI agent turn. Reuse a prebuilt tarball with `OPENCLAW_CURRENT_PACKAGE_TGZ=/path/to/openclaw-*.tgz`, skip the host rebuild with `OPENCLAW_NPM_ONBOARD_HOST_BUILD=0`, or switch channel with `OPENCLAW_NPM_ONBOARD_CHANNEL=discord` or `OPENCLAW_NPM_ONBOARD_CHANNEL=slack`.
|
||||
- Update channel switch smoke: `pnpm test:docker:update-channel-switch` installs the packed OpenClaw tarball globally in Docker, switches from package `stable` to git `dev`, verifies the persisted channel and plugin post-update work, then switches back to package `stable` and checks update status.
|
||||
- Upgrade survivor smoke: `pnpm test:docker:upgrade-survivor` installs the packed OpenClaw tarball over a dirty old-user fixture with agents, channel config, plugin allowlists, stale plugin dependency state, and existing workspace/session files. It runs package update plus non-interactive doctor without live provider or channel keys, then starts a loopback Gateway and checks config/state preservation plus startup/status budgets.
|
||||
- Published upgrade survivor smoke: `pnpm test:docker:published-upgrade-survivor` installs `openclaw@latest` by default, seeds realistic existing-user files, configures that baseline with a baked command recipe, validates the resulting config, updates that published install to the candidate tarball, runs non-interactive doctor, writes `.artifacts/upgrade-survivor/summary.json`, then starts a loopback Gateway and checks configured intents, state preservation, startup, `/healthz`, `/readyz`, and RPC status budgets. Override one baseline with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC`, ask the aggregate scheduler to expand exact local baselines with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS` such as `openclaw@2026.5.2 openclaw@2026.4.23 openclaw@2026.4.15`, and expand issue-shaped fixtures with `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` such as `reported-issues`; the reported-issues set includes `configured-plugin-installs` for automatic external OpenClaw plugin install repair. Package Acceptance exposes those as `published_upgrade_survivor_baseline`, `published_upgrade_survivor_baselines`, and `published_upgrade_survivor_scenarios`, resolves meta baseline tokens such as `last-stable-4` or `all-since-2026.4.23`, and Full Release Validation expands the release-soak package gate to `last-stable-4 2026.4.23 2026.5.2 2026.4.15` plus `reported-issues`.
|
||||
- Published upgrade survivor smoke: `pnpm test:docker:published-upgrade-survivor` installs `openclaw@latest` by default, seeds realistic existing-user files, configures that baseline with a baked command recipe, validates the resulting config, updates that published install to the candidate tarball, runs non-interactive doctor, writes `.artifacts/upgrade-survivor/summary.json`, then starts a loopback Gateway and checks configured intents, state preservation, startup, `/healthz`, `/readyz`, and RPC status budgets. Override one baseline with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC`, ask the aggregate scheduler to expand exact baselines with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS` such as `all-since-2026.4.23`, and expand issue-shaped fixtures with `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS` such as `reported-issues`; the reported-issues set includes `configured-plugin-installs` for automatic external OpenClaw plugin install repair. Package Acceptance exposes those as `published_upgrade_survivor_baseline`, `published_upgrade_survivor_baselines`, and `published_upgrade_survivor_scenarios`; Full Release Validation uses the default latest baseline in the blocking path and expands to all-since/reported-issues only for `run_release_soak=true` or `release_profile=full`.
|
||||
- Session runtime context smoke: `pnpm test:docker:session-runtime-context` verifies hidden runtime context transcript persistence plus doctor repair of affected duplicated prompt-rewrite branches.
|
||||
- Bun global install smoke: `bash scripts/e2e/bun-global-install-smoke.sh` packs the current tree, installs it with `bun install -g` in an isolated home, and verifies `openclaw infer image providers --json` returns bundled image providers instead of hanging. Reuse a prebuilt tarball with `OPENCLAW_BUN_GLOBAL_SMOKE_PACKAGE_TGZ=/path/to/openclaw-*.tgz`, skip the host build with `OPENCLAW_BUN_GLOBAL_SMOKE_HOST_BUILD=0`, or copy `dist/` from a built Docker image with `OPENCLAW_BUN_GLOBAL_SMOKE_DIST_IMAGE=openclaw-dockerfile-smoke:local`.
|
||||
- Installer Docker smoke: `bash scripts/test-install-sh-docker.sh` shares one npm cache across its root, update, and direct-npm containers. Update smoke defaults to npm `latest` as the stable baseline before upgrading to the candidate tarball. Override with `OPENCLAW_INSTALL_SMOKE_UPDATE_BASELINE=2026.4.22` locally, or with the Install Smoke workflow's `update_baseline_version` input on GitHub. Non-root installer checks keep an isolated npm cache so root-owned cache entries do not mask user-local install behavior. Set `OPENCLAW_INSTALL_SMOKE_NPM_CACHE_DIR=/path/to/cache` to reuse the root/update/direct-npm cache across local reruns.
|
||||
|
||||
@@ -245,40 +245,8 @@ Full guide: [Getting Started](/start/getting-started)
|
||||
|
||||
## Windows companion app
|
||||
|
||||
We do not have a Windows companion app yet. Contributions are welcome if you want to
|
||||
help make it happen.
|
||||
|
||||
## Git and GitHub connectivity (contributors)

Some networks block or throttle HTTPS to GitHub. If `git clone` fails with timeouts
or connection resets, try another network, a VPN, or an HTTP/HTTPS proxy your
organization provides.

If `gh auth login` fails during the browser device flow (for example a timeout
reaching `github.com:443`), authenticate with a personal access token instead:

1. Create a token with at least the `repo` scope (classic PAT) or equivalent
   fine-grained access.
2. In PowerShell for the current session:

   ```powershell
   $env:GH_TOKEN="<your-token>"
   gh auth status
   gh auth setup-git
   ```

3. If `gh auth status` warns about missing `read:org`, mint a token that includes
   that scope and re-assign the variable:

   ```powershell
   $env:GH_TOKEN="<your-token-with-repo-and-read:org>"
   gh auth status
   ```

`gh auth refresh -s read:org` only applies when you authenticated via `gh auth login`
and have stored credentials to refresh (not when using `GH_TOKEN`).

Never commit tokens or paste them into issues or pull requests.
|
||||
We do not have a Windows companion app yet. Contributions are welcome if you want
|
||||
contributions to make it happen.
|
||||
|
||||
## Related
|
||||
|
||||
|
||||
@@ -18,16 +18,6 @@ Adds the WhatsApp channel surface for sending and receiving OpenClaw messages.
|
||||
|
||||
channels: whatsapp
|
||||
|
||||
## Windows install note
|
||||
|
||||
On Windows, the WhatsApp plugin needs Git on `PATH` during npm install because one of its Baileys/libsignal dependencies is fetched from a git URL. Install Git for Windows, then restart the shell and rerun the install:
|
||||
|
||||
```powershell
winget install --id Git.Git -e
```
|
||||
|
||||
Portable Git also works if its `bin` directory is on `PATH`.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [whatsapp](/channels/whatsapp)
|
||||
|
||||
@@ -141,13 +141,11 @@ the maintainer-only release runbook.
|
||||
`telegram_mode=mock-openai` or `telegram_mode=live-frontier`. When the
|
||||
selected Docker lanes include `published-upgrade-survivor`, the package
|
||||
artifact is the candidate and `published_upgrade_survivor_baseline` selects
|
||||
the published baseline. `update-restart-auth` uses the candidate package as
|
||||
both the installed CLI and the package-under-test so it exercises the
|
||||
candidate update command's managed restart path.
|
||||
the published baseline.
|
||||
Example: `gh workflow run package-acceptance.yml --ref main -f workflow_ref=main -f source=npm -f package_spec=openclaw@beta -f suite_profile=product -f published_upgrade_survivor_baseline=openclaw@2026.4.26 -f telegram_mode=mock-openai`
|
||||
Common profiles:
|
||||
- `smoke`: install/channel/agent, gateway network, and config reload lanes
|
||||
- `package`: artifact-native package/update/restart/plugin lanes without OpenWebUI or live ClawHub
|
||||
- `package`: artifact-native package/update/plugin lanes without OpenWebUI or live ClawHub
|
||||
- `product`: package profile plus MCP channels, cron/subagent cleanup,
|
||||
OpenAI web search, and OpenWebUI
|
||||
- `full`: Docker release-path chunks with OpenWebUI
|
||||
@@ -324,10 +322,7 @@ Use `release_profile` to select live/provider breadth:
|
||||
|
||||
Use `run_release_soak=true` with `stable` when the release-blocking lanes are
|
||||
green and you want the exhaustive live/E2E, Docker release-path, and
|
||||
bounded published upgrade-survivor sweep before promotion. That sweep covers
|
||||
the latest four stable packages plus pinned `2026.4.23` and `2026.5.2`
|
||||
baselines plus older `2026.4.15` coverage, with duplicate baselines removed and
|
||||
each baseline sharded into its own Docker runner job. `full` implies
|
||||
all-since-2026.4.23 upgrade-survivor sweep before promotion. `full` implies
|
||||
`run_release_soak=true`.
|
||||
|
||||
`OpenClaw Release Checks` uses the trusted workflow ref to resolve the target
|
||||
@@ -488,12 +483,11 @@ Supported candidate sources:
|
||||
|
||||
`OpenClaw Release Checks` runs Package Acceptance with `source=artifact`, the
|
||||
prepared release package artifact, `suite_profile=custom`,
|
||||
`docker_lanes=doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor update-restart-auth plugins-offline plugin-update`,
|
||||
`telegram_mode=mock-openai`. Package Acceptance keeps migration, update,
|
||||
configured-auth update restart, stale plugin dependency cleanup, offline plugin
|
||||
fixtures, plugin update, and Telegram package QA against the same resolved
|
||||
tarball. Blocking release checks use the default latest published package
|
||||
baseline; `run_release_soak=true` or
|
||||
`docker_lanes=doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update`,
|
||||
`telegram_mode=mock-openai`. Package Acceptance keeps migration, update, stale
|
||||
plugin dependency cleanup, offline plugin fixtures, plugin update, and Telegram
|
||||
package QA against the same resolved tarball. Blocking release checks use the
|
||||
default latest published package baseline; `run_release_soak=true` or
|
||||
`release_profile=full` expands to every stable npm-published baseline from
|
||||
`2026.4.23` through `latest` plus reported-issue fixtures. Use
|
||||
Package Acceptance with `source=npm` for an already shipped candidate, or
|
||||
@@ -539,8 +533,8 @@ Common package profiles:
|
||||
|
||||
- `smoke`: quick package install/channel/agent, gateway network, and config
|
||||
reload lanes
|
||||
- `package`: install/update/restart/plugin package contracts without live
|
||||
ClawHub; this is the release-check default
|
||||
- `package`: install/update/plugin package contracts without live ClawHub; this is the release-check
|
||||
default
|
||||
- `product`: `package` plus MCP channels, cron/subagent cleanup, OpenAI web
|
||||
search, and OpenWebUI
|
||||
- `full`: Docker release-path chunks with OpenWebUI
|
||||
|
||||
@@ -85,7 +85,7 @@ Session persistence has automatic maintenance controls (`session.maintenance`) f
|
||||
- `maxDiskBytes`: optional sessions-directory budget
|
||||
- `highWaterBytes`: optional target after cleanup (default `80%` of `maxDiskBytes`)
|
||||
|
||||
Normal Gateway writes flow through a per-store session writer that serializes in-process mutations without taking a runtime file lock. Hot-path patch helpers borrow the validated mutable cache while they hold that writer slot, so large `sessions.json` files are not cloned or reread for every metadata update. Runtime code should prefer `updateSessionStore(...)` or `updateSessionStoreEntry(...)`; direct whole-store saves are compatibility and offline-maintenance tools. When a Gateway is reachable, non-dry-run `openclaw sessions cleanup` and `openclaw agents delete` delegate store mutations to the Gateway so cleanup joins the same writer queue; `--store <path>` is the explicit offline repair path for direct file maintenance. `maxEntries` cleanup is still batched for production-sized caps, so a store may briefly exceed the configured cap before the next high-water cleanup rewrites it back down. Session store reads do not prune or cap entries during Gateway startup; use writes or `openclaw sessions cleanup --enforce` for cleanup. `openclaw sessions cleanup --enforce` still applies the configured cap immediately and prunes old unreferenced transcript, checkpoint, and trajectory artifacts even when no disk budget is configured.
|
||||
Normal Gateway writes flow through a per-store session writer that serializes in-process mutations without taking a runtime file lock. Hot-path patch helpers borrow the validated mutable cache while they hold that writer slot, so large `sessions.json` files are not cloned or reread for every metadata update. Runtime code should prefer `updateSessionStore(...)` or `updateSessionStoreEntry(...)`; direct whole-store saves are compatibility and offline-maintenance tools. When a Gateway is reachable, non-dry-run `openclaw sessions cleanup` and `openclaw agents delete` delegate store mutations to the Gateway so cleanup joins the same writer queue; `--store <path>` is the explicit offline repair path for direct file maintenance. `maxEntries` cleanup is still batched for production-sized caps, so a store may briefly exceed the configured cap before the next high-water cleanup rewrites it back down. Session store reads do not prune or cap entries during Gateway startup; use writes or `openclaw sessions cleanup --enforce` for cleanup. `openclaw sessions cleanup --enforce` still applies the configured cap immediately.
|
||||
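The per-store writer described above can be pictured as a promise chain: each mutation waits for the previous one, so in-process updates never interleave and no file lock is needed. The sketch below is a minimal illustration under that assumption. `SerializedStoreWriter`, `enqueue`, and the entry shape are hypothetical names, not the actual `updateSessionStore(...)` implementation.

```typescript
// Hypothetical sketch: serialize in-process mutations of one store through a promise chain.
type SessionStore = Record<string, { updatedAt: number }>;

class SerializedStoreWriter {
  private tail: Promise<void> = Promise.resolve();
  private readonly store: SessionStore;

  constructor(store: SessionStore) {
    this.store = store;
  }

  // Queue a mutation; it runs only after every earlier mutation has settled.
  enqueue<T>(mutate: (store: SessionStore) => T | Promise<T>): Promise<T> {
    const run = this.tail.then(() => mutate(this.store));
    // Keep the chain usable even if this mutation throws.
    this.tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run;
  }
}

async function demoSerializedWrites(): Promise<string[]> {
  const order: string[] = [];
  const writer = new SerializedStoreWriter({});
  const slow = writer.enqueue(async (store) => {
    // Yield a few times to simulate slower work.
    await Promise.resolve();
    await Promise.resolve();
    store["a"] = { updatedAt: 1 };
    order.push("slow");
  });
  const fast = writer.enqueue((store) => {
    store["b"] = { updatedAt: 2 };
    order.push("fast");
  });
  await Promise.all([slow, fast]);
  return order; // the mutation queued first finishes first
}
```

Because every caller goes through the same queue, a cleanup job and a metadata patch cannot overwrite each other's view of the store, which is the property the paragraph attributes to routing cleanup through the Gateway.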
|
||||
Maintenance keeps durable external conversation pointers such as group sessions
|
||||
and thread-scoped chat sessions, but synthetic runtime entries for cron, hooks,
|
||||
|
||||
@@ -44,7 +44,7 @@ title: "Tests"
|
||||
- `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
|
||||
- `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.
|
||||
- `pnpm test:docker:upgrade-survivor`: Installs the packed OpenClaw tarball over a dirty old-user fixture, runs package update plus non-interactive doctor without live provider or channel keys, then starts a loopback Gateway and checks that agents, channel config, plugin allowlists, workspace/session files, stale legacy plugin dependency state, startup, and RPC status survive.
|
||||
- `pnpm test:docker:published-upgrade-survivor`: Installs `openclaw@latest` by default, seeds realistic existing-user files without live provider or channel keys, configures that baseline with a baked `openclaw config set` command recipe, updates that published install to the packed OpenClaw tarball, runs non-interactive doctor, writes `.artifacts/upgrade-survivor/summary.json`, then starts a loopback Gateway and checks that configured intents, workspace/session files, stale plugin config and legacy dependency state, startup, `/healthz`, `/readyz`, and RPC status survive or repair cleanly. Override one baseline with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC`, expand an exact local matrix with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS` such as `openclaw@2026.5.2 openclaw@2026.4.23 openclaw@2026.4.15`, or add scenario fixtures with `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS=reported-issues`; the reported-issues set includes `configured-plugin-installs` to verify configured external OpenClaw plugins install automatically during upgrade and `stale-source-plugin-shadow` to keep source-only plugin shadows from breaking startup. Package Acceptance exposes those as `published_upgrade_survivor_baseline`, `published_upgrade_survivor_baselines`, and `published_upgrade_survivor_scenarios`, and resolves meta baseline tokens such as `last-stable-4` or `all-since-2026.4.23` before handing exact package specs to Docker lanes.
|
||||
- `pnpm test:docker:published-upgrade-survivor`: Installs `openclaw@latest` by default, seeds realistic existing-user files without live provider or channel keys, configures that baseline with a baked `openclaw config set` command recipe, updates that published install to the packed OpenClaw tarball, runs non-interactive doctor, writes `.artifacts/upgrade-survivor/summary.json`, then starts a loopback Gateway and checks that configured intents, workspace/session files, stale plugin config and legacy dependency state, startup, `/healthz`, `/readyz`, and RPC status survive or repair cleanly. Override one baseline with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC`, expand an exact matrix with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS` such as `all-since-2026.4.23`, or add scenario fixtures with `OPENCLAW_UPGRADE_SURVIVOR_SCENARIOS=reported-issues`; the reported-issues set includes `configured-plugin-installs` to verify configured external OpenClaw plugins install automatically during upgrade and `stale-source-plugin-shadow` to keep source-only plugin shadows from breaking startup. Package Acceptance exposes those as `published_upgrade_survivor_baseline`, `published_upgrade_survivor_baselines`, and `published_upgrade_survivor_scenarios`.
|
||||
- `pnpm test:docker:update-migration`: Runs the published-upgrade survivor harness in the cleanup-heavy `plugin-deps-cleanup` scenario, starting at `openclaw@2026.4.23` by default. The separate `Update Migration` workflow expands this lane with `baselines=all-since-2026.4.23` so every stable published package from `.23` onward updates to the candidate and proves configured-plugin dependency cleanup outside Full Release CI.
|
||||
- `pnpm test:docker:plugins`: Runs install/update smoke for local path, `file:`, npm registry packages with hoisted dependencies, git moving refs, ClawHub fixtures, marketplace updates, and Claude-bundle enable/inspect.
|
||||
|
||||
|
||||
@@ -28,6 +28,7 @@ prompting even if session or config defaults request `ask: "on-miss"`.
|
||||
| Command | What it shows |
| ---------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
| `openclaw approvals get` / `--gateway` / `--node <id\|name\|ip>` | Requested policy, host policy sources, and the effective result. |
| `openclaw approvals list --gateway` | Pending gateway runtime exec approval requests. |
| `openclaw exec-policy show` | Local-machine merged view. |
| `openclaw exec-policy set` / `preset` | Synchronize the local requested policy with the local host approvals file in one step. |
|
||||
|
||||
|
||||
@@ -80,22 +80,20 @@ reply model.
|
||||
|
||||
## Async vs synchronous
|
||||
|
||||
| Capability | Mode | Why |
| --------------- | ------------ | ---------------------------------------------------------------------------------------------------- |
| Image | Synchronous | Provider responses return in seconds; completes inline with reply. |
| Text-to-speech | Synchronous | Provider responses return in seconds; attached to the reply audio. |
| Video | Asynchronous | Provider processing takes 30 s to several minutes; slow queues can run up to the configured timeout. |
| Music (shared) | Asynchronous | Same provider-processing characteristic as video. |
| Music (ComfyUI) | Synchronous | Local workflow runs inline against the configured ComfyUI server. |
|
||||
| Capability | Mode | Why |
| --------------- | ------------ | ------------------------------------------------------------------ |
| Image | Synchronous | Provider responses return in seconds; completes inline with reply. |
| Text-to-speech | Synchronous | Provider responses return in seconds; attached to the reply audio. |
| Video | Asynchronous | Provider processing takes 30 s to several minutes. |
| Music (shared) | Asynchronous | Same provider-processing characteristic as video. |
| Music (ComfyUI) | Synchronous | Local workflow runs inline against the configured ComfyUI server. |
|
||||
|
||||
For async tools, OpenClaw submits the request to the provider, returns a task
|
||||
id immediately, and tracks the job in the task ledger. The agent continues
|
||||
responding to other messages while the job runs. When the provider finishes,
|
||||
OpenClaw wakes the agent with the generated media paths so it can tell the
|
||||
user and, when required by source-delivery policy, relay the result through
|
||||
the message tool. For message-tool-only group/channel routes, OpenClaw treats
|
||||
missing message-tool delivery evidence as a failed completion attempt and sends
|
||||
the generated media fallback directly to the original channel.
|
||||
the message tool.
|
||||
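The async flow above can be sketched as a small in-memory ledger: submission returns a task id at once, the provider job settles later, and a completion callback stands in for waking the agent. All names here (`TaskLedger`, `submit`, `onWake`) are hypothetical and only illustrate the pattern; the state names are borrowed from the video task lifecycle table, and none of this is OpenClaw's real task-ledger API.

```typescript
type TaskState = "queued" | "running" | "succeeded" | "failed";

interface TaskRecord {
  id: string;
  state: TaskState;
  mediaPaths: string[];
  error?: string;
}

class TaskLedger {
  private readonly tasks = new Map<string, TaskRecord>();
  private nextId = 1;
  private readonly onWake: (task: TaskRecord) => void;

  constructor(onWake: (task: TaskRecord) => void) {
    this.onWake = onWake;
  }

  // Returns the task id synchronously; the provider job settles in the background.
  submit(job: () => Promise<string[]>): string {
    const task: TaskRecord = { id: `task-${this.nextId++}`, state: "queued", mediaPaths: [] };
    this.tasks.set(task.id, task);
    // This sketch treats the job as accepted immediately.
    task.state = "running";
    job().then(
      (paths) => {
        task.state = "succeeded";
        task.mediaPaths = paths;
        this.onWake(task); // stand-in for waking the agent with media paths
      },
      (err: unknown) => {
        task.state = "failed";
        task.error = err instanceof Error ? err.message : String(err);
        this.onWake(task); // the agent is also woken on failure
      },
    );
    return task.id;
  }

  get(id: string): TaskRecord | undefined {
    return this.tasks.get(id);
  }
}
```

The caller never awaits the provider: it gets the id, keeps handling other messages, and reacts only when the wake callback fires.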
|
||||
## Speech-to-text and Voice Call
|
||||
|
||||
|
||||
@@ -16,10 +16,7 @@ For session-backed agent runs, OpenClaw starts music generation as a
|
||||
background task, tracks it in the task ledger, then wakes the agent again
|
||||
when the track is ready so the agent can tell the user and attach the
|
||||
finished audio. In group/channel chats that use message-tool-only visible
|
||||
delivery, the agent relays the result through the message tool. If the
|
||||
completion agent writes only a private final reply, OpenClaw falls back to a
|
||||
direct channel send with the generated media. The completion wake explicitly
|
||||
warns the agent that normal final replies are private in those routes.
|
||||
delivery, the agent relays the result through the message tool.
|
||||
|
||||
<Note>
|
||||
The built-in shared tool only appears when at least one music-generation
|
||||
|
||||
@@ -152,7 +152,7 @@ Current source-of-truth:
|
||||
- `/help` shows the short help summary.
|
||||
- `/commands` shows the generated command catalog.
|
||||
- `/tools [compact|verbose]` shows what the current agent can use right now.
|
||||
- `/status` shows execution/runtime status, Gateway and system uptime, plus provider usage/quota when available.
|
||||
- `/status` shows execution/runtime status, including `Execution`/`Runtime` labels and provider usage/quota when available.
|
||||
- `/diagnostics [note]` is the owner-only support-report flow for Gateway bugs and Codex harness runs. It asks for explicit exec approval every time before running `openclaw gateway diagnostics export --json`; do not approve diagnostics with an allow-all rule. After approval, it sends a pasteable report with the local bundle path, manifest summary, privacy notes, and relevant session ids. In group chats, the approval prompt and report go to the owner privately. When the active session uses the OpenAI Codex harness, the same approval also sends relevant Codex feedback to OpenAI servers and the completed reply lists the OpenClaw session ids, Codex thread ids, and `codex resume <thread-id>` commands. See [Diagnostics Export](/gateway/diagnostics).
|
||||
- `/crestodian <request>` runs the Crestodian setup and repair helper from an owner DM.
|
||||
- `/tasks` lists active/recent background tasks for the current session.
|
||||
|
||||
@@ -60,7 +60,7 @@ Video generation is asynchronous. When the agent calls `video_generate` in a
|
||||
session:
|
||||
|
||||
1. OpenClaw submits the request to the provider and immediately returns a task id.
|
||||
2. The provider processes the job in the background (typically 30 seconds to several minutes depending on the provider and resolution; slow queue-backed providers can run up to the configured timeout).
|
||||
2. The provider processes the job in the background (typically 30 seconds to 5 minutes depending on the provider and resolution).
|
||||
3. When the video is ready, OpenClaw wakes the same session with an internal completion event.
|
||||
4. The agent tells the user and attaches the finished video. In group/channel
|
||||
chats that use message-tool-only visible delivery, the agent relays the
|
||||
@@ -84,12 +84,12 @@ rejects an oversized file.
|
||||
|
||||
### Task lifecycle
|
||||
|
||||
| State | Meaning |
| ----------- | ------------------------------------------------------------------------------------------------------ |
| `queued` | Task created, waiting for the provider to accept it. |
| `running` | Provider is processing (typically 30 seconds to several minutes depending on provider and resolution). |
| `succeeded` | Video ready; the agent wakes and posts it to the conversation. |
| `failed` | Provider error or timeout; the agent wakes with error details. |
|
||||
| State | Meaning |
| ----------- | ------------------------------------------------------------------------------------------------ |
| `queued` | Task created, waiting for the provider to accept it. |
| `running` | Provider is processing (typically 30 seconds to 5 minutes depending on provider and resolution). |
| `succeeded` | Video ready; the agent wakes and posts it to the conversation. |
| `failed` | Provider error or timeout; the agent wakes with error details. |
|
||||
|
||||
Check status from the CLI:
|
||||
|
||||
@@ -198,9 +198,9 @@ role or use `first_frame` for single-image image-to-video.
|
||||
### Style controls
|
||||
|
||||
<ParamField path="aspectRatio" type="string">
|
||||
Aspect-ratio hint such as `1:1`, `16:9`, `9:16`, `adaptive`, or a provider-specific value. OpenClaw normalizes or ignores unsupported values per provider.
|
||||
`1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`, or `adaptive`.
|
||||
</ParamField>
|
||||
<ParamField path="resolution" type="string">Resolution hint such as `480P`, `720P`, `768P`, `1080P`, `4K`, or a provider-specific value. OpenClaw normalizes or ignores unsupported values per provider.</ParamField>
|
||||
<ParamField path="resolution" type="string">`480P`, `720P`, `768P`, or `1080P`.</ParamField>
|
||||
<ParamField path="durationSeconds" type="number">
|
||||
Target duration in seconds (rounded to nearest provider-supported value).
|
||||
</ParamField>
|
||||
@@ -223,7 +223,7 @@ dimensions). Providers that do not declare it surface the value via
|
||||
</ParamField>
|
||||
<ParamField path="model" type="string">Provider/model override (e.g. `runway/gen4.5`).</ParamField>
|
||||
<ParamField path="filename" type="string">Output filename hint.</ParamField>
|
||||
<ParamField path="timeoutMs" type="number">Optional provider operation timeout in milliseconds.</ParamField>
|
||||
<ParamField path="timeoutMs" type="number">Optional provider request timeout in milliseconds.</ParamField>
|
||||
<ParamField path="providerOptions" type="object">
|
||||
Provider-specific options as a JSON object (e.g. `{"seed": 42, "draft": true}`).
|
||||
Providers that declare a typed schema validate the keys and types; unknown
|
||||
@@ -377,22 +377,16 @@ only the explicit `model`, `primary`, and `fallbacks` entries.
|
||||
image-to-video through the configured graph.
|
||||
</Accordion>
|
||||
<Accordion title="fal">
|
||||
Uses a queue-backed flow for long-running jobs. OpenClaw waits up to 20
|
||||
minutes by default before treating an in-progress fal queue job as timed
|
||||
out. Most fal video models
|
||||
Uses a queue-backed flow for long-running jobs. Most fal video models
|
||||
accept a single image reference. Seedance 2.0 reference-to-video
|
||||
models accept up to 9 images, 3 videos, and 3 audio references, with
|
||||
at most 12 total reference files.
|
||||
</Accordion>
|
||||
<Accordion title="Google (Gemini / Veo)">
|
||||
Supports one image or one video reference. Generated-audio requests are
|
||||
ignored with a warning on the Gemini API path because that API rejects
|
||||
the `generateAudio` parameter for current Veo video generation.
|
||||
Supports one image or one video reference.
|
||||
</Accordion>
|
||||
<Accordion title="MiniMax">
|
||||
Single image reference only. MiniMax accepts `768P` and `1080P`
|
||||
resolutions; requests such as `720P` are normalized to the closest
|
||||
supported value before submission.
|
||||
Single image reference only.
|
||||
</Accordion>
|
||||
<Accordion title="OpenAI">
|
||||
Only `size` override is forwarded. Other style overrides
|
||||
|
||||
@@ -154,7 +154,7 @@ Imported themes are stored only in the current browser profile. They are not wri
|
||||
- Re-sending with the same `idempotencyKey` returns `{ status: "in_flight" }` while running, and `{ status: "ok" }` after completion.
|
||||
- `chat.history` responses are size-bounded for UI safety. When transcript entries are too large, Gateway may truncate long text fields, omit heavy metadata blocks, and replace oversized messages with a placeholder (`[chat.history omitted: message too large]`).
|
||||
- Assistant/generated images are persisted as managed media references and served back through authenticated Gateway media URLs, so reloads do not depend on raw base64 image payloads staying in the chat history response.
|
||||
- When rendering `chat.history`, the Control UI strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`), plain-text tool-call XML payloads (including `<tool_call>...</tool_call>`, `<function_call>...</function_call>`, `<tool_calls>...</tool_calls>`, `<function_calls>...</function_calls>`, and truncated tool-call blocks), and leaked ASCII/full-width model control tokens, and omits assistant entries whose whole visible text is only the exact silent token `NO_REPLY` / `no_reply` or the heartbeat acknowledgement token `HEARTBEAT_OK`.
|
||||
- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`), plain-text tool-call XML payloads (including `<tool_call>...</tool_call>`, `<function_call>...</function_call>`, `<tool_calls>...</tool_calls>`, `<function_calls>...</function_calls>`, and truncated tool-call blocks), and leaked ASCII/full-width model control tokens, and omits assistant entries whose whole visible text is only the exact silent token `NO_REPLY` / `no_reply`.
|
||||
- During an active send and the final history refresh, the chat view keeps local optimistic user/assistant messages visible if `chat.history` briefly returns an older snapshot; the canonical transcript replaces those local messages once the Gateway history catches up.
|
||||
- Live `chat` events are delivery state, while `chat.history` is rebuilt from the durable session transcript. After tool-final events the Control UI reloads history and merges only a small optimistic tail; the transcript boundary is documented in [WebChat](/web/webchat).
|
||||
- `chat.inject` appends an assistant note to the session transcript and broadcasts a `chat` event for UI-only updates (no agent run, no channel delivery).
|
||||
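The `idempotencyKey` behavior in the list above can be sketched as a map from key to run state: a repeated send never starts a second run, reports `in_flight` until the first run completes, and reports `ok` afterwards. This is an illustrative sketch only. `IdempotentSender`, `send`, and `complete` are hypothetical, and the `started` status for the first send is an assumption that the source does not specify.

```typescript
type SendStatus = "started" | "in_flight" | "ok";

class IdempotentSender {
  private readonly runs = new Map<string, "running" | "done">();

  send(idempotencyKey: string): { status: SendStatus } {
    const state = this.runs.get(idempotencyKey);
    if (state === "running") return { status: "in_flight" };
    if (state === "done") return { status: "ok" };
    // First time this key is seen: record the run (assumed response shape).
    this.runs.set(idempotencyKey, "running");
    return { status: "started" };
  }

  // Mark the run for this key as finished.
  complete(idempotencyKey: string): void {
    if (this.runs.has(idempotencyKey)) this.runs.set(idempotencyKey, "done");
  }
}
```

A client that times out can therefore retry with the same key without risking a duplicate agent run.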
|
||||
@@ -2236,8 +2236,6 @@ export function createDiagnosticsOtelService(): OpenClawPluginService {
|
||||
return;
|
||||
case "session.long_running":
|
||||
case "session.stalled":
|
||||
case "session.recovery.completed":
|
||||
case "session.recovery.requested":
|
||||
return;
|
||||
case "session.stuck":
|
||||
recordSessionStuck(evt);
|
||||
|
||||
@@ -52,7 +52,7 @@ const SEEDANCE_REFERENCE_MAX_AUDIOS_BY_MODEL = Object.fromEntries(
|
||||
SEEDANCE_2_REFERENCE_VIDEO_MODELS.map((model) => [model, SEEDANCE_REFERENCE_MAX_AUDIOS]),
|
||||
);
|
||||
const DEFAULT_HTTP_TIMEOUT_MS = 30_000;
|
||||
const DEFAULT_OPERATION_TIMEOUT_MS = 1_200_000;
|
||||
const DEFAULT_OPERATION_TIMEOUT_MS = 600_000;
|
||||
const POLL_INTERVAL_MS = 5_000;
|
||||
|
||||
type FalVideoResponse = {
|
||||
|
||||
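The constants in this hunk bound a queue poll loop: a short timeout per HTTP call, a longer budget for the whole operation, and a fixed poll interval. Below is a minimal sketch of such a loop, assuming an injectable clock and sleep so it can run without real delays. `pollUntilDone`, `PollDeps`, and the status shape are hypothetical and do not reproduce the fal provider's actual code.

```typescript
type PollStatus = { done: boolean; result?: string };

interface PollDeps {
  check: () => Promise<PollStatus>; // one status request against the queue
  now: () => number; // current time in milliseconds
  sleep: (ms: number) => Promise<void>;
}

async function pollUntilDone(
  deps: PollDeps,
  operationTimeoutMs: number,
  pollIntervalMs: number,
): Promise<string> {
  const deadline = deps.now() + operationTimeoutMs;
  for (;;) {
    const status = await deps.check();
    if (status.done) {
      return status.result ?? "";
    }
    // Stop if the next poll would land past the overall operation budget.
    if (deps.now() + pollIntervalMs > deadline) {
      throw new Error(`operation timed out after ${operationTimeoutMs} ms`);
    }
    await deps.sleep(pollIntervalMs);
  }
}
```

With a 5 000 ms interval, a 600 000 ms budget allows about 120 polls and a 1 200 000 ms budget about 240, so the two values shown differ by a factor of two in how long a queued job may stay in progress.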
@@ -88,7 +88,7 @@ export function createGoogleVideoGenerationProviderMetadata(): Omit<
|
||||
supportsAspectRatio: true,
|
||||
supportsResolution: true,
|
||||
supportsSize: true,
|
||||
supportsAudio: false,
|
||||
supportsAudio: true,
|
||||
},
|
||||
imageToVideo: {
|
||||
enabled: true,
|
||||
@@ -101,7 +101,7 @@ export function createGoogleVideoGenerationProviderMetadata(): Omit<
|
||||
supportsAspectRatio: true,
|
||||
supportsResolution: true,
|
||||
supportsSize: true,
|
||||
supportsAudio: false,
|
||||
supportsAudio: true,
|
||||
},
|
||||
videoToVideo: {
|
||||
enabled: true,
|
||||
@@ -114,7 +114,7 @@ export function createGoogleVideoGenerationProviderMetadata(): Omit<
|
||||
supportsAspectRatio: true,
|
||||
supportsResolution: true,
|
||||
supportsSize: true,
|
||||
supportsAudio: false,
|
||||
supportsAudio: true,
|
||||
},
|
||||
},
|
||||
};
|
||||
|
||||
@@ -40,11 +40,7 @@ describe("google video generation provider", () => {
|
||||
});
|
||||
|
||||
it("declares explicit mode capabilities", () => {
|
||||
const provider = buildGoogleVideoGenerationProvider();
|
||||
expectExplicitVideoGenerationCapabilities(provider);
|
||||
expect(provider.capabilities.generate?.supportsAudio).toBe(false);
|
||||
expect(provider.capabilities.imageToVideo?.supportsAudio).toBe(false);
|
||||
expect(provider.capabilities.videoToVideo?.supportsAudio).toBe(false);
|
||||
expectExplicitVideoGenerationCapabilities(buildGoogleVideoGenerationProvider());
|
||||
});
|
||||
|
||||
it("submits generation and returns inline video bytes", async () => {
|
||||
@@ -90,12 +86,11 @@ describe("google video generation provider", () => {
|
||||
durationSeconds: 4,
|
||||
aspectRatio: "16:9",
|
||||
resolution: "720p",
|
||||
generateAudio: true,
|
||||
}),
|
||||
}),
|
||||
);
|
||||
expect(request?.config).not.toHaveProperty("generateAudio");
|
||||
expect(request?.config).not.toHaveProperty("numberOfVideos");
|
||||
expect(request?.config).not.toHaveProperty("generateAudio");
|
||||
expect(result.videos).toHaveLength(1);
|
||||
expect(result.videos[0]?.mimeType).toBe("video/mp4");
|
||||
expect(createGoogleGenAIMock).toHaveBeenCalledWith(
|
||||
|
||||
@@ -26,7 +26,7 @@ import { createGoogleGenAI, type GoogleGenAIClient } from "./google-genai-runtim
|
||||
|
||||
const DEFAULT_TIMEOUT_MS = 180_000;
|
||||
const POLL_INTERVAL_MS = 10_000;
|
||||
const MAX_POLL_ATTEMPTS = 120;
|
||||
const MAX_POLL_ATTEMPTS = 90;
|
||||
const GOOGLE_VIDEO_EMPTY_RESULT_MESSAGE =
|
||||
"Google video generation response missing generated videos";
|
||||
|
||||
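With a fixed poll interval, the attempt cap sets an upper bound on how long polling can last. A quick calculation using the values shown in this hunk; the helper `maxPollWindowMs` is hypothetical and ignores the latency of each status request:

```typescript
const POLL_INTERVAL_MS = 10_000;

// Upper bound on the time spent waiting between polls.
function maxPollWindowMs(maxAttempts: number, intervalMs: number): number {
  return maxAttempts * intervalMs;
}

const windowFor120Attempts = maxPollWindowMs(120, POLL_INTERVAL_MS); // 1_200_000 ms = 20 minutes
const windowFor90Attempts = maxPollWindowMs(90, POLL_INTERVAL_MS); // 900_000 ms = 15 minutes
```

Both windows are well above the 180 000 ms `DEFAULT_TIMEOUT_MS`. How that timeout interacts with the attempt cap is not visible in this hunk.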
@@ -322,6 +322,7 @@ async function generateGoogleVideoViaRest(params: {
|
||||
durationSeconds?: number;
|
||||
aspectRatio?: "16:9" | "9:16";
|
||||
resolution?: "720p" | "1080p";
|
||||
audio?: boolean;
|
||||
}): Promise<unknown> {
|
||||
let operation = await requestGoogleVideoJson({
|
||||
url: `${params.baseUrl}/${resolveGoogleVideoRestModelPath(params.model)}:predictLongRunning`,
|
||||
@@ -336,6 +337,7 @@ async function generateGoogleVideoViaRest(params: {
|
||||
: {}),
|
||||
...(params.aspectRatio ? { aspectRatio: params.aspectRatio } : {}),
|
||||
...(params.resolution ? { resolution: params.resolution } : {}),
|
||||
...(params.audio === true ? { generateAudio: true } : {}),
|
||||
},
|
||||
},
|
||||
});
|
||||
@@ -427,6 +429,7 @@ export function buildGoogleVideoGenerationProvider(): VideoGenerationProvider {
|
||||
...(typeof durationSeconds === "number" ? { durationSeconds } : {}),
|
||||
...(aspectRatio ? { aspectRatio } : {}),
|
||||
...(resolution ? { resolution } : {}),
|
||||
...(req.audio === true ? { generateAudio: true } : {}),
|
||||
},
|
||||
});
|
||||
} catch (error) {
|
||||
@@ -443,6 +446,7 @@ export function buildGoogleVideoGenerationProvider(): VideoGenerationProvider {
|
||||
durationSeconds,
|
||||
aspectRatio,
|
||||
resolution,
|
||||
audio: req.audio,
|
||||
});
|
||||
}
|
||||
|
||||
@@ -476,6 +480,7 @@ export function buildGoogleVideoGenerationProvider(): VideoGenerationProvider {
|
||||
durationSeconds,
|
||||
aspectRatio,
|
||||
resolution,
|
||||
audio: req.audio,
|
||||
});
|
||||
generatedVideos = extractGeneratedVideos(operation);
|
||||
}
|
||||
|
||||
@@ -29,10 +29,7 @@ installMinimaxProviderHttpMockCleanup();

describe("minimax video generation provider", () => {
it("declares explicit mode capabilities", () => {
const provider = buildMinimaxVideoGenerationProvider();
expectExplicitVideoGenerationCapabilities(provider);
expect(provider.capabilities.generate?.resolutions).toEqual(["768P", "1080P"]);
expect(provider.capabilities.imageToVideo?.resolutions).toEqual(["768P", "1080P"]);
expectExplicitVideoGenerationCapabilities(buildMinimaxVideoGenerationProvider());
});

it("creates a task, polls status, and downloads the generated video", async () => {
@@ -67,7 +64,6 @@ describe("minimax video generation provider", () => {
prompt: "A fox sprints across snowy hills",
cfg: {},
durationSeconds: 5,
resolution: "720P",
});

expect(postJsonRequestMock).toHaveBeenCalledWith(
@@ -75,7 +71,6 @@ describe("minimax video generation provider", () => {
url: "https://api.minimax.io/v1/video_generation",
body: expect.objectContaining({
duration: 6,
resolution: "768P",
}),
}),
);
@@ -19,19 +19,12 @@ import type {
const DEFAULT_MINIMAX_VIDEO_BASE_URL = "https://api.minimax.io";
const DEFAULT_MINIMAX_VIDEO_MODEL = "MiniMax-Hailuo-2.3";
const DEFAULT_TIMEOUT_MS = 120_000;
const DEFAULT_OPERATION_TIMEOUT_MS = 1_200_000;
const POLL_INTERVAL_MS = 10_000;
const MAX_POLL_ATTEMPTS = 120;
const MAX_POLL_ATTEMPTS = 90;
const MINIMAX_MODEL_ALLOWED_DURATIONS: Readonly<Record<string, readonly number[]>> = {
"MiniMax-Hailuo-2.3": [6, 10],
"MiniMax-Hailuo-02": [6, 10],
};
const MINIMAX_MODEL_ALLOWED_RESOLUTIONS: Readonly<Record<string, readonly string[]>> = {
"MiniMax-Hailuo-2.3": ["768P", "1080P"],
"MiniMax-Hailuo-2.3-Fast": ["768P", "1080P"],
"MiniMax-Hailuo-02": ["768P", "1080P"],
};
const MINIMAX_RESOLUTION_ORDER = ["480P", "720P", "768P", "1080P"] as const;

type MinimaxBaseResp = {
status_code?: number;
@@ -119,43 +112,6 @@ function resolveDurationSeconds(params: {
);
}

function resolveResolution(params: {
model: string;
resolution: string | undefined;
}): string | undefined {
const requested = normalizeOptionalString(params.resolution)?.toUpperCase();
if (!requested) {
return undefined;
}
const allowed = MINIMAX_MODEL_ALLOWED_RESOLUTIONS[params.model];
if (!allowed || allowed.length === 0 || allowed.includes(requested)) {
return requested;
}
const requestedIndex = MINIMAX_RESOLUTION_ORDER.indexOf(
requested as (typeof MINIMAX_RESOLUTION_ORDER)[number],
);
if (requestedIndex < 0) {
return undefined;
}
return allowed.reduce((best, current) => {
const currentIndex = MINIMAX_RESOLUTION_ORDER.indexOf(
current as (typeof MINIMAX_RESOLUTION_ORDER)[number],
);
const bestIndex = MINIMAX_RESOLUTION_ORDER.indexOf(
best as (typeof MINIMAX_RESOLUTION_ORDER)[number],
);
if (currentIndex < 0) {
return best;
}
if (bestIndex < 0) {
return current;
}
return Math.abs(currentIndex - requestedIndex) < Math.abs(bestIndex - requestedIndex)
? current
: best;
});
}

async function pollMinimaxVideo(params: {
taskId: string;
headers: Headers;
@@ -290,7 +246,6 @@ function buildMinimaxVideoProvider(providerId: string): VideoGenerationProvider
maxVideos: 1,
maxDurationSeconds: 10,
supportedDurationSecondsByModel: MINIMAX_MODEL_ALLOWED_DURATIONS,
resolutions: ["768P", "1080P"],
supportsResolution: true,
supportsWatermark: false,
},
@@ -300,7 +255,6 @@ function buildMinimaxVideoProvider(providerId: string): VideoGenerationProvider
maxInputImages: 1,
maxDurationSeconds: 10,
supportedDurationSecondsByModel: MINIMAX_MODEL_ALLOWED_DURATIONS,
resolutions: ["768P", "1080P"],
supportsResolution: true,
supportsWatermark: false,
},
@@ -324,7 +278,7 @@ function buildMinimaxVideoProvider(providerId: string): VideoGenerationProvider

const fetchFn = fetch;
const deadline = createProviderOperationDeadline({
timeoutMs: req.timeoutMs ?? DEFAULT_OPERATION_TIMEOUT_MS,
timeoutMs: req.timeoutMs,
label: "MiniMax video generation",
});
const { baseUrl, allowPrivateNetwork, headers, dispatcherPolicy } =
@@ -349,12 +303,8 @@ function buildMinimaxVideoProvider(providerId: string): VideoGenerationProvider
if (firstFrameImage) {
body.first_frame_image = firstFrameImage;
}
const resolution = resolveResolution({
model,
resolution: req.resolution,
});
if (resolution) {
body.resolution = resolution;
if (req.resolution) {
body.resolution = req.resolution;
}
const durationSeconds = resolveDurationSeconds({
model,
@@ -388,7 +338,7 @@ function buildMinimaxVideoProvider(providerId: string): VideoGenerationProvider
headers,
timeoutMs: resolveProviderOperationTimeoutMs({
deadline,
defaultTimeoutMs: DEFAULT_OPERATION_TIMEOUT_MS,
defaultTimeoutMs: DEFAULT_TIMEOUT_MS,
}),
baseUrl,
fetchFn,
@@ -27,7 +27,6 @@ import {
promptAndConfigureOllama,
queryOllamaModelShowInfo,
} from "./api.js";
import { resolveThinkingProfile as resolveOllamaThinkingProfile } from "./provider-policy-api.js";
import {
OLLAMA_DEFAULT_API_KEY,
OLLAMA_PROVIDER_ID,
@@ -250,7 +249,13 @@ export default definePluginEntry({
contributeResolvedModelCompat: ({ model }) =>
usesOllamaOpenAICompatTransport(model) ? { supportsUsageInStreaming: true } : undefined,
resolveReasoningOutputMode: () => "native",
resolveThinkingProfile: resolveOllamaThinkingProfile,
resolveThinkingProfile: ({ reasoning }) => ({
levels:
reasoning === true
? [{ id: "off" }, { id: "low" }, { id: "medium" }, { id: "high" }, { id: "max" }]
: [{ id: "off" }],
defaultLevel: "off",
}),
wrapStreamFn: createConfiguredOllamaCompatStreamWrapper,
createEmbeddingProvider: async ({ config, model, provider: embeddingProvider, remote }) => {
const { provider, client } = await createOllamaEmbeddingProvider({
@@ -1,6 +1,6 @@
import type { ModelDefinitionConfig } from "openclaw/plugin-sdk/provider-model-types";
import { describe, expect, it } from "vitest";
import { normalizeConfig, resolveThinkingProfile } from "./provider-policy-api.js";
import { normalizeConfig } from "./provider-policy-api.js";
import { OLLAMA_DEFAULT_BASE_URL } from "./src/defaults.js";

function createModel(id: string, name: string): ModelDefinitionConfig {
@@ -58,15 +58,4 @@ describe("ollama provider policy public artifact", () => {
}),
).toEqual({});
});

it("exposes max thinking for reasoning-capable models without full plugin activation", () => {
expect(resolveThinkingProfile({ reasoning: true })).toEqual({
levels: [{ id: "off" }, { id: "low" }, { id: "medium" }, { id: "high" }, { id: "max" }],
defaultLevel: "off",
});
expect(resolveThinkingProfile({ reasoning: false })).toEqual({
levels: [{ id: "off" }],
defaultLevel: "off",
});
});
});
@@ -1,19 +1,8 @@
import type { ProviderThinkingProfile } from "openclaw/plugin-sdk/plugin-entry";
import type { ModelProviderConfig } from "openclaw/plugin-sdk/provider-model-types";
import { OLLAMA_DEFAULT_BASE_URL } from "./src/defaults.js";

type OllamaProviderConfigDraft = Partial<ModelProviderConfig>;

const OLLAMA_REASONING_THINKING_PROFILE = {
levels: [{ id: "off" }, { id: "low" }, { id: "medium" }, { id: "high" }, { id: "max" }],
defaultLevel: "off",
} satisfies ProviderThinkingProfile;

const OLLAMA_NON_REASONING_THINKING_PROFILE = {
levels: [{ id: "off" }],
defaultLevel: "off",
} satisfies ProviderThinkingProfile;

/**
* Provider policy surface for Ollama: normalize provider configs used by
* core defaults/normalizers. This runs during config defaults application and
@@ -49,11 +38,3 @@ export function normalizeConfig({

return next;
}

export function resolveThinkingProfile({
reasoning,
}: {
reasoning?: boolean;
}): ProviderThinkingProfile {
return reasoning ? OLLAMA_REASONING_THINKING_PROFILE : OLLAMA_NON_REASONING_THINKING_PROFILE;
}
@@ -333,8 +333,6 @@ describe("telegram live qa runtime", () => {
"telegram-context-command",
"telegram-current-session-status-tool",
"telegram-mentioned-message-reply",
"telegram-long-final-reuses-preview",
"telegram-long-final-three-chunks",
"telegram-mention-gating",
]);
expect(scenarios.map((scenario) => scenario.id)).toEqual([
@@ -345,8 +343,6 @@ describe("telegram live qa runtime", () => {
"telegram-context-command",
"telegram-current-session-status-tool",
"telegram-mentioned-message-reply",
"telegram-long-final-reuses-preview",
"telegram-long-final-three-chunks",
"telegram-mention-gating",
]);
expect(
@@ -359,25 +355,6 @@ describe("telegram live qa runtime", () => {
.find((scenario) => scenario.id === "telegram-mentioned-message-reply")
?.buildRun("sut_bot").replyToLatestSutMessage,
).toBe(true);
expect(
scenarios
.find((scenario) => scenario.id === "telegram-long-final-reuses-preview")
?.buildRun("sut_bot"),
).toMatchObject({
expectedJoinedSutTextIncludes: ["TELEGRAM-LONG-FINAL-BEGIN", "TELEGRAM-LONG-FINAL-END"],
expectedSutMessageCount: 2,
});
expect(
scenarios
.find((scenario) => scenario.id === "telegram-long-final-three-chunks")
?.buildRun("sut_bot"),
).toMatchObject({
expectedJoinedSutTextIncludes: [
"TELEGRAM-LONG-FINAL-3CHUNK-BEGIN",
"TELEGRAM-LONG-FINAL-3CHUNK-END",
],
expectedSutMessageCount: 3,
});
});

it("keeps bot-to-bot plain mentions out of the default Telegram live set", () => {
@@ -405,160 +382,6 @@ describe("telegram live qa runtime", () => {
).toEqual(["allowlist-block", "top-level-reply-shape", "restart-resume"]);
});

it("asserts long Telegram final replies reuse the streamed preview message", () => {
expect(() =>
__testing.assertTelegramScenarioMessageSet({
expectedJoinedSutTextIncludes: ["TELEGRAM-LONG-FINAL-BEGIN", "TELEGRAM-LONG-FINAL-END"],
expectedSutMessageCount: 2,
groupId: "-100123",
scenarioId: "telegram-long-final-reuses-preview",
sutBotId: 99,
observedMessages: [
{
updateId: 1,
messageId: 10,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-reuses-preview",
scenarioTitle: "Telegram long final reuses the preview message",
matchedScenario: true,
text: "TELEGRAM-LONG-FINAL-BEGIN part one ",
timestamp: 1_700_000_000_000,
inlineButtons: [],
mediaKinds: [],
},
{
updateId: 2,
messageId: 11,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-reuses-preview",
scenarioTitle: "Telegram long final reuses the preview message",
matchedScenario: true,
text: "part two TELEGRAM-LONG-FINAL-END",
timestamp: 1_700_000_001_000,
inlineButtons: [],
mediaKinds: [],
},
],
}),
).not.toThrow();

expect(() =>
__testing.assertTelegramScenarioMessageSet({
expectedSutMessageCount: 2,
groupId: "-100123",
scenarioId: "telegram-long-final-reuses-preview",
sutBotId: 99,
observedMessages: [
{
updateId: 1,
messageId: 10,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-reuses-preview",
scenarioTitle: "Telegram long final reuses the preview message",
matchedScenario: true,
text: "preview",
timestamp: 1_700_000_000_000,
inlineButtons: [],
mediaKinds: [],
},
{
updateId: 2,
messageId: 11,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-reuses-preview",
scenarioTitle: "Telegram long final reuses the preview message",
matchedScenario: true,
text: "final chunk one",
timestamp: 1_700_000_001_000,
inlineButtons: [],
mediaKinds: [],
},
{
updateId: 3,
messageId: 12,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-reuses-preview",
scenarioTitle: "Telegram long final reuses the preview message",
matchedScenario: true,
text: "final chunk two",
timestamp: 1_700_000_002_000,
inlineButtons: [],
mediaKinds: [],
},
],
}),
).toThrow("expected 2 SUT message(s), observed 3");
});

it("accepts legitimate three-chunk Telegram final replies", () => {
expect(() =>
__testing.assertTelegramScenarioMessageSet({
expectedJoinedSutTextIncludes: [
"TELEGRAM-LONG-FINAL-3CHUNK-BEGIN",
"TELEGRAM-LONG-FINAL-3CHUNK-END",
],
expectedSutMessageCount: 3,
groupId: "-100123",
scenarioId: "telegram-long-final-three-chunks",
sutBotId: 99,
observedMessages: [
{
updateId: 1,
messageId: 10,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-three-chunks",
scenarioTitle: "Telegram three-chunk final keeps only final chunks",
matchedScenario: true,
text: "TELEGRAM-LONG-FINAL-3CHUNK-BEGIN part one ",
timestamp: 1_700_000_000_000,
inlineButtons: [],
mediaKinds: [],
},
{
updateId: 2,
messageId: 11,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-three-chunks",
scenarioTitle: "Telegram three-chunk final keeps only final chunks",
matchedScenario: true,
text: "part two ",
timestamp: 1_700_000_001_000,
inlineButtons: [],
mediaKinds: [],
},
{
updateId: 3,
messageId: 12,
chatId: -100123,
senderId: 99,
senderIsBot: true,
scenarioId: "telegram-long-final-three-chunks",
scenarioTitle: "Telegram three-chunk final keeps only final chunks",
matchedScenario: true,
text: "part three TELEGRAM-LONG-FINAL-3CHUNK-END",
timestamp: 1_700_000_002_000,
inlineButtons: [],
mediaKinds: [],
},
],
}),
).not.toThrow();
});

it("matches scenario replies by thread or exact marker", () => {
expect(
__testing.matchesTelegramScenarioReply({
@@ -48,8 +48,6 @@ type TelegramQaScenarioId =
| "telegram-whoami-command"
| "telegram-context-command"
| "telegram-current-session-status-tool"
| "telegram-long-final-three-chunks"
| "telegram-long-final-reuses-preview"
| "telegram-mentioned-message-reply"
| "telegram-mention-gating";

@@ -58,11 +56,8 @@ type TelegramQaScenarioRun = {
expectReply: boolean;
input: string;
expectedTextIncludes?: string[];
expectedJoinedSutTextIncludes?: string[];
expectedSutMessageCount?: number;
matchText?: string;
replyToLatestSutMessage?: boolean;
settleMs?: number;
};

type TelegramQaScenarioDefinition = LiveTransportScenarioDefinition<TelegramQaScenarioId> & {
@@ -300,39 +295,6 @@ const TELEGRAM_QA_SCENARIOS: TelegramQaScenarioDefinition[] = [
replyToLatestSutMessage: true,
}),
},
{
id: "telegram-long-final-reuses-preview",
title: "Telegram long final reuses the preview message",
defaultEnabled: false,
timeoutMs: 60_000,
buildRun: (sutUsername) => ({
allowAnySutReply: true,
expectReply: true,
input: `@${sutUsername} Telegram long final QA check. Use the scripted long final response.`,
expectedTextIncludes: ["TELEGRAM-LONG-FINAL-BEGIN"],
expectedJoinedSutTextIncludes: ["TELEGRAM-LONG-FINAL-BEGIN", "TELEGRAM-LONG-FINAL-END"],
expectedSutMessageCount: 2,
settleMs: 4_000,
}),
},
{
id: "telegram-long-final-three-chunks",
title: "Telegram three-chunk final keeps only final chunks",
defaultEnabled: false,
timeoutMs: 60_000,
buildRun: (sutUsername) => ({
allowAnySutReply: true,
expectReply: true,
input: `@${sutUsername} Telegram long final three chunk QA check. Use the scripted three chunk final response.`,
expectedTextIncludes: ["TELEGRAM-LONG-FINAL-3CHUNK-BEGIN"],
expectedJoinedSutTextIncludes: [
"TELEGRAM-LONG-FINAL-3CHUNK-BEGIN",
"TELEGRAM-LONG-FINAL-3CHUNK-END",
],
expectedSutMessageCount: 3,
settleMs: 4_000,
}),
},
{
id: "telegram-mention-gating",
standardId: "mention-gating",
@@ -782,102 +744,6 @@ async function waitForObservedMessage(params: {
throw new Error(timeoutMessage);
}

async function collectObservedMessages(params: {
token: string;
initialOffset: number;
settleMs: number;
predicate: (message: TelegramObservedMessage) => boolean;
observedMessages: TelegramObservedMessage[];
observationScenarioId: string;
observationScenarioTitle: string;
}) {
const startedAt = Date.now();
let offset = params.initialOffset;
while (Date.now() - startedAt < params.settleMs) {
const remainingMs = Math.max(1, params.settleMs - (Date.now() - startedAt));
const timeoutSeconds = Math.max(1, Math.min(2, Math.ceil(remainingMs / 1000)));
let updates: TelegramUpdate[];
try {
updates = await callTelegramApi<TelegramUpdate[]>(
params.token,
"getUpdates",
{
offset,
timeout: timeoutSeconds,
allowed_updates: ["message", "edited_message"],
},
timeoutSeconds * 1000 + 5_000,
);
} catch (error) {
if (!isRecoverableTelegramQaPollError(error)) {
throw error;
}
await waitForTelegramPollRetryDelay(params.settleMs - (Date.now() - startedAt));
continue;
}
if (updates.length === 0) {
continue;
}
offset = (updates.at(-1)?.update_id ?? offset) + 1;
for (const update of updates) {
const normalized = normalizeTelegramObservedMessage(update);
if (!normalized) {
continue;
}
params.observedMessages.push({
...normalized,
scenarioId: params.observationScenarioId,
scenarioTitle: params.observationScenarioTitle,
matchedScenario: params.predicate(normalized),
});
}
}
return offset;
}

function assertTelegramScenarioMessageSet(params: {
expectedJoinedSutTextIncludes?: string[];
expectedSutMessageCount?: number;
groupId: string;
observedMessages: TelegramObservedMessage[];
scenarioId: string;
sutBotId: number;
}) {
if (
params.expectedSutMessageCount === undefined &&
(params.expectedJoinedSutTextIncludes ?? []).length === 0
) {
return;
}
const byMessageId = new Map<number, TelegramObservedMessage>();
for (const message of params.observedMessages) {
if (
message.scenarioId === params.scenarioId &&
message.chatId === Number(params.groupId) &&
message.senderId === params.sutBotId
) {
byMessageId.set(message.messageId, message);
}
}
const messages = [...byMessageId.values()].toSorted((a, b) => a.messageId - b.messageId);
if (
params.expectedSutMessageCount !== undefined &&
messages.length !== params.expectedSutMessageCount
) {
throw new Error(
`expected ${params.expectedSutMessageCount} SUT message(s), observed ${messages.length}: ${messages
.map((message) => message.messageId)
.join(", ")}`,
);
}
const joinedText = messages.map((message) => message.text).join("");
for (const expected of params.expectedJoinedSutTextIncludes ?? []) {
if (!joinedText.includes(expected)) {
throw new Error(`joined SUT reply text missing expected text: ${expected}`);
}
}
}

async function waitForTelegramChannelRunning(
gateway: Awaited<ReturnType<typeof startQaGatewayChild>>,
accountId: string,
@@ -1508,25 +1374,6 @@ export async function runTelegramQaLive(params: {
}),
});
driverOffset = matched.nextOffset;
if (scenarioRun.settleMs !== undefined) {
driverOffset = await collectObservedMessages({
token: runtimeEnv.driverToken,
initialOffset: driverOffset,
settleMs: scenarioRun.settleMs,
observedMessages,
observationScenarioId: scenario.id,
observationScenarioTitle: scenario.title,
predicate: (message) =>
matchesTelegramScenarioReply({
allowAnySutReply: scenarioRun.allowAnySutReply,
groupId: runtimeEnv.groupId,
matchText: scenarioRun.matchText,
message,
sentMessageId: sent.message_id,
sutBotId: sutIdentity.id,
}),
});
}
if (!scenarioRun.expectReply) {
throw new Error(`unexpected reply message ${matched.message.messageId} matched`);
}
@@ -1534,26 +1381,14 @@ export async function runTelegramQaLive(params: {
expectedTextIncludes: scenarioRun.expectedTextIncludes,
message: matched.message,
});
assertTelegramScenarioMessageSet({
expectedJoinedSutTextIncludes: scenarioRun.expectedJoinedSutTextIncludes,
expectedSutMessageCount: scenarioRun.expectedSutMessageCount,
groupId: runtimeEnv.groupId,
observedMessages,
scenarioId: scenario.id,
sutBotId: sutIdentity.id,
});
const rttMs = matched.observedAtMs - requestStartedAtMs;
const suffix =
scenarioRun.expectedSutMessageCount === undefined
? ""
: `; observed ${scenarioRun.expectedSutMessageCount} SUT message(s)`;
const result = {
id: scenario.id,
title: scenario.title,
status: "pass",
details: redactPublicMetadata
? `reply matched in ${rttMs}ms${suffix}`
: `reply message ${matched.message.messageId} matched in ${rttMs}ms${suffix}`,
? `reply matched in ${rttMs}ms`
: `reply message ${matched.message.messageId} matched in ${rttMs}ms`,
rttMs,
requestStartedAt,
responseObservedAt: new Date(matched.observedAtMs).toISOString(),
@@ -1730,7 +1565,6 @@ export const __testing = {
buildObservedMessagesArtifact,
canaryFailureMessage,
callTelegramApi,
assertTelegramScenarioMessageSet,
isRecoverableTelegramQaPollError,
assertTelegramScenarioReply,
classifyCanaryReply,
@@ -8,12 +8,6 @@ import {
runMantisSlackDesktopSmoke,
type MantisSlackDesktopSmokeOptions,
} from "./slack-desktop-smoke.runtime.js";
import {
runMantisVisualDriver,
runMantisVisualTask,
type MantisVisualDriverOptions,
type MantisVisualTaskOptions,
} from "./visual-task.runtime.js";

export async function runMantisDiscordSmokeCommand(opts: MantisDiscordSmokeOptions) {
const result = await runMantisDiscordSmoke(opts);
@@ -40,9 +34,6 @@ export async function runMantisDesktopBrowserSmokeCommand(opts: MantisDesktopBro
if (result.screenshotPath) {
process.stdout.write(`Mantis desktop browser screenshot: ${result.screenshotPath}\n`);
}
if (result.videoPath) {
process.stdout.write(`Mantis desktop browser video: ${result.videoPath}\n`);
}
if (result.status === "fail") {
process.exitCode = 1;
}
@@ -55,33 +46,6 @@ export async function runMantisSlackDesktopSmokeCommand(opts: MantisSlackDesktop
if (result.screenshotPath) {
process.stdout.write(`Mantis Slack desktop screenshot: ${result.screenshotPath}\n`);
}
if (result.videoPath) {
process.stdout.write(`Mantis Slack desktop video: ${result.videoPath}\n`);
}
if (result.status === "fail") {
process.exitCode = 1;
}
}

export async function runMantisVisualDriverCommand(opts: MantisVisualDriverOptions) {
const result = await runMantisVisualDriver(opts);
process.stdout.write(`Mantis visual driver result: ${result.status}\n`);
process.stdout.write(`Mantis visual driver screenshot: ${result.screenshotPath}\n`);
if (result.status === "fail") {
process.exitCode = 1;
}
}

export async function runMantisVisualTaskCommand(opts: MantisVisualTaskOptions) {
const result = await runMantisVisualTask(opts);
process.stdout.write(`Mantis visual task report: ${result.reportPath}\n`);
process.stdout.write(`Mantis visual task summary: ${result.summaryPath}\n`);
if (result.screenshotPath) {
process.stdout.write(`Mantis visual task screenshot: ${result.screenshotPath}\n`);
}
if (result.videoPath) {
process.stdout.write(`Mantis visual task video: ${result.videoPath}\n`);
}
if (result.status === "fail") {
process.exitCode = 1;
}
@@ -4,11 +4,6 @@ import type { MantisDesktopBrowserSmokeOptions } from "./desktop-browser-smoke.r
import type { MantisDiscordSmokeOptions } from "./discord-smoke.runtime.js";
import type { MantisBeforeAfterOptions } from "./run.runtime.js";
import type { MantisSlackDesktopSmokeOptions } from "./slack-desktop-smoke.runtime.js";
import type {
MantisVisualDriverOptions,
MantisVisualTaskOptions,
MantisVisualTaskVisionMode,
} from "./visual-task.runtime.js";

type MantisCliRuntime = typeof import("./cli.runtime.js");

@@ -36,16 +31,6 @@ async function runSlackDesktopSmoke(opts: MantisSlackDesktopSmokeOptions) {
await runtime.runMantisSlackDesktopSmokeCommand(opts);
}

async function runVisualDriver(opts: MantisVisualDriverOptions) {
const runtime = await loadMantisCliRuntime();
await runtime.runMantisVisualDriverCommand(opts);
}

async function runVisualTask(opts: MantisVisualTaskOptions) {
const runtime = await loadMantisCliRuntime();
await runtime.runMantisVisualTaskCommand(opts);
}

type MantisDiscordSmokeCommanderOptions = {
channelId?: string;
guildId?: string;
@@ -111,57 +96,10 @@ type MantisSlackDesktopSmokeCommanderOptions = {
ttl?: string;
};

type MantisVisualTaskCommanderOptions = {
browserUrl?: string;
class?: string;
crabboxBin?: string;
duration?: string;
expectText?: string;
idleTimeout?: string;
keepLease?: boolean;
leaseId?: string;
machineClass?: string;
outputDir?: string;
provider?: string;
repoRoot?: string;
settleMs?: string;
ttl?: string;
visionMode?: MantisVisualTaskVisionMode;
visionModel?: string;
visionPrompt?: string;
visionTimeoutMs?: string;
};

type MantisVisualDriverCommanderOptions = {
browserUrl?: string;
crabboxBin?: string;
expectText?: string;
leaseId?: string;
outputDir?: string;
provider?: string;
repoRoot?: string;
settleMs?: string;
visionMode?: MantisVisualTaskVisionMode;
visionModel?: string;
visionPrompt?: string;
visionTimeoutMs?: string;
};

function collectString(value: string, previous: string[] = []) {
return [...previous, value];
}

function parseOptionalInteger(value: string | undefined, label: string) {
if (value === undefined) {
return undefined;
}
const parsed = Number.parseInt(value, 10);
if (!Number.isFinite(parsed) || String(parsed) !== value || parsed < 0) {
throw new Error(`${label} must be a non-negative integer`);
}
return parsed;
}

export function registerMantisCli(qa: Command) {
const mantis = qa
.command("mantis")
@@ -228,7 +166,7 @@ export function registerMantisCli(qa: Command) {
mantis
.command("desktop-browser-smoke")
.description(
"Lease or reuse a Crabbox desktop, open a visible browser, and capture VNC desktop screenshot/video artifacts",
"Lease or reuse a Crabbox desktop, open a visible browser, and capture a VNC desktop screenshot",
)
.option("--repo-root <path>", "Repository root to target when running from a neutral cwd")
.option("--output-dir <path>", "Mantis desktop browser artifact directory")
@@ -261,7 +199,7 @@ export function registerMantisCli(qa: Command) {
mantis
.command("slack-desktop-smoke")
.description(
"Lease or reuse a Crabbox VNC desktop, run Slack QA inside it, open Slack in the browser, and capture screenshot/video artifacts",
"Lease or reuse a Crabbox VNC desktop, run Slack QA inside it, open Slack in the browser, and capture a screenshot",
)
.option("--repo-root <path>", "Repository root to target when running from a neutral cwd")
.option("--output-dir <path>", "Mantis Slack desktop artifact directory")
@@ -311,83 +249,4 @@ export function registerMantisCli(qa: Command) {
ttl: opts.ttl,
});
});

mantis
.command("visual-task")
.description(
"Lease or reuse a Crabbox desktop, drive visible browser UI, record MP4, screenshot it, and optionally run image-understanding assertions",
)
.option("--repo-root <path>", "Repository root to target when running from a neutral cwd")
.option("--output-dir <path>", "Mantis visual-task artifact directory")
.option("--crabbox-bin <path>", "Crabbox binary path")
.option("--provider <provider>", "Crabbox provider")
.option("--machine-class <class>", "Crabbox machine class")
.option("--class <class>", "Alias for --machine-class")
.option("--lease-id <id>", "Reuse an existing Crabbox lease")
.option("--idle-timeout <duration>", "Crabbox idle timeout")
.option("--ttl <duration>", "Crabbox maximum lease lifetime")
.option("--keep-lease", "Keep a lease created by this run after a passing task")
.option("--browser-url <url>", "URL to open in the visible browser")
.option("--duration <duration>", "Desktop recording duration")
.option("--settle-ms <ms>", "Milliseconds to wait after launch before screenshot")
.option("--vision-mode <mode>", "Vision mode: image-describe or metadata")
.option("--vision-prompt <text>", "Prompt for image understanding")
.option("--vision-model <provider/model>", "Image-capable provider/model ref")
.option("--vision-timeout-ms <ms>", "Image understanding timeout in milliseconds")
.option("--expect-text <text>", "Case-insensitive text expected in the vision output")
.action(async (opts: MantisVisualTaskCommanderOptions) => {
await runVisualTask({
browserUrl: opts.browserUrl,
crabboxBin: opts.crabboxBin,
duration: opts.duration,
expectText: opts.expectText,
idleTimeout: opts.idleTimeout,
keepLease: opts.keepLease,
leaseId: opts.leaseId,
machineClass: opts.machineClass ?? opts.class,
|
||||
outputDir: opts.outputDir,
|
||||
provider: opts.provider,
|
||||
repoRoot: opts.repoRoot,
|
||||
settleMs: parseOptionalInteger(opts.settleMs, "--settle-ms"),
|
||||
ttl: opts.ttl,
|
||||
visionMode: opts.visionMode,
|
||||
visionModel: opts.visionModel,
|
||||
visionPrompt: opts.visionPrompt,
|
||||
visionTimeoutMs: parseOptionalInteger(opts.visionTimeoutMs, "--vision-timeout-ms"),
|
||||
});
|
||||
});
|
||||
|
||||
mantis
|
||||
.command("visual-driver")
|
||||
.description(
|
||||
"Driver half for Mantis visual-task; launched by Crabbox record --while, then opens browser, screenshots, and runs vision",
|
||||
)
|
||||
.option("--repo-root <path>", "Repository root to target when running from a neutral cwd")
|
||||
.option("--output-dir <path>", "Mantis visual-task artifact directory")
|
||||
.option("--crabbox-bin <path>", "Crabbox binary path")
|
||||
.option("--provider <provider>", "Crabbox provider")
|
||||
.option("--lease-id <id>", "Crabbox lease id")
|
||||
.option("--browser-url <url>", "URL to open in the visible browser")
|
||||
.option("--settle-ms <ms>", "Milliseconds to wait after launch before screenshot")
|
||||
.option("--vision-mode <mode>", "Vision mode: image-describe or metadata")
|
||||
.option("--vision-prompt <text>", "Prompt for image understanding")
|
||||
.option("--vision-model <provider/model>", "Image-capable provider/model ref")
|
||||
.option("--vision-timeout-ms <ms>", "Image understanding timeout in milliseconds")
|
||||
.option("--expect-text <text>", "Case-insensitive text expected in the vision output")
|
||||
.action(async (opts: MantisVisualDriverCommanderOptions) => {
|
||||
await runVisualDriver({
|
||||
browserUrl: opts.browserUrl,
|
||||
crabboxBin: opts.crabboxBin,
|
||||
expectText: opts.expectText,
|
||||
leaseId: opts.leaseId,
|
||||
outputDir: opts.outputDir,
|
||||
provider: opts.provider,
|
||||
repoRoot: opts.repoRoot,
|
||||
settleMs: parseOptionalInteger(opts.settleMs, "--settle-ms"),
|
||||
visionMode: opts.visionMode,
|
||||
visionModel: opts.visionModel,
|
||||
visionPrompt: opts.visionPrompt,
|
||||
visionTimeoutMs: parseOptionalInteger(opts.visionTimeoutMs, "--vision-timeout-ms"),
|
||||
});
|
||||
});
|
||||
}
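
The `--expect-text` option above is described as a case-insensitive match against the vision output. The runtime that performs the match is not shown in this part of the diff, so the following is only a minimal sketch of that rule. The function name `matchesExpectedText` and the choice to treat a missing expectation as a match are assumptions.

```typescript
// Hypothetical helper: case-insensitive substring check for --expect-text.
// Assumption: when no expectation is given, the check passes trivially.
function matchesExpectedText(visionOutput: string, expectText?: string): boolean {
  if (expectText === undefined || expectText === "") {
    return true;
  }
  return visionOutput.toLowerCase().includes(expectText.toLowerCase());
}

console.log(matchesExpectedText('The page heading reads "Example Domain".', "example domain"));
console.log(matchesExpectedText("A blank page.", "Example Domain"));
console.log(matchesExpectedText("Anything at all."));
```

Lowercasing both sides is the simplest reading of "case-insensitive"; a locale-aware comparison would be a different design choice.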
|
||||
|
||||
@@ -50,10 +50,8 @@ describe("mantis desktop browser smoke runtime", () => {
|
||||
expect(outputDir).toBeTypeOf("string");
|
||||
await fs.mkdir(outputDir as string, { recursive: true });
|
||||
await fs.writeFile(path.join(outputDir as string, "desktop-browser-smoke.png"), "png");
|
||||
await fs.writeFile(path.join(outputDir as string, "desktop-browser-smoke.mp4"), "mp4");
|
||||
await fs.writeFile(path.join(outputDir as string, "remote-metadata.json"), "{}\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "chrome.log"), "chrome\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "ffmpeg.log"), "ffmpeg\n");
|
||||
return { stdout: "", stderr: "" };
|
||||
}
|
||||
return { stdout: "", stderr: "" };
|
||||
@@ -82,10 +80,11 @@ describe("mantis desktop browser smoke runtime", () => {
|
||||
expect(commands.every((entry) => entry.env === runtimeEnv)).toBe(true);
|
||||
const rsyncArgs = commands.find((entry) => entry.command === "rsync")?.args ?? [];
|
||||
expect(rsyncArgs).not.toContain("--delete");
|
||||
expect(rsyncArgs).toEqual(expect.arrayContaining(["--exclude", "chrome-profile/**"]));
|
||||
expect(rsyncArgs).toEqual(
|
||||
expect.arrayContaining([
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-desktop-2026-05-04T12-00-00-000Z/",
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-desktop-2026-05-04T12-00-00-000Z/desktop-browser-smoke.png",
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-desktop-2026-05-04T12-00-00-000Z/remote-metadata.json",
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-desktop-2026-05-04T12-00-00-000Z/chrome.log",
|
||||
]),
|
||||
);
|
||||
const remoteScript = commands
|
||||
@@ -95,14 +94,9 @@ describe("mantis desktop browser smoke runtime", () => {
|
||||
expect(remoteScript).toContain("${CHROME_BIN:-}");
|
||||
expect(remoteScript).toContain("chromium-browser");
|
||||
expect(remoteScript).toContain("base64 -d");
|
||||
expect(remoteScript).toContain("ffmpeg");
|
||||
expect(remoteScript).toContain('sudo apt-get update -y >>"$out/apt.log" 2>&1 || true');
|
||||
expect(remoteScript).toContain("desktop-browser-smoke.mp4");
|
||||
expect(remoteScript).not.toContain("-video_size");
|
||||
expect(remoteScript).toContain('url="file://$out/input.html"');
|
||||
expect(remoteScript).toContain('"browserBinary": "$browser_bin"');
|
||||
await expect(fs.readFile(result.screenshotPath ?? "", "utf8")).resolves.toBe("png");
|
||||
await expect(fs.readFile(result.videoPath ?? "", "utf8")).resolves.toBe("mp4");
|
||||
const summary = JSON.parse(await fs.readFile(result.summaryPath, "utf8")) as {
|
||||
browserUrl: string;
|
||||
crabbox: { id: string; vncCommand: string };
|
||||
|
||||
@@ -28,7 +28,6 @@ export type MantisDesktopBrowserSmokeResult = {
   screenshotPath?: string;
   status: "pass" | "fail";
   summaryPath: string;
-  videoPath?: string;
 };
 
 type CommandResult = {
@@ -59,7 +58,6 @@ type MantisDesktopBrowserSmokeSummary = {
     reportPath: string;
     screenshotPath?: string;
     summaryPath: string;
-    videoPath?: string;
   };
   browserUrl: string;
   htmlFile?: string;
@@ -234,24 +232,6 @@ if [ -z "$browser_bin" ]; then
|
||||
echo "No browser binary found. Checked BROWSER, CHROME_BIN, google-chrome, chromium, chromium-browser." >&2
|
||||
exit 127
|
||||
fi
|
||||
video_pid=""
|
||||
if command -v ffmpeg >/dev/null 2>&1; then
|
||||
:
|
||||
else
|
||||
sudo apt-get update -y >>"$out/apt.log" 2>&1 || true
|
||||
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y ffmpeg >>"$out/apt.log" 2>&1 || true
|
||||
fi
|
||||
if command -v ffmpeg >/dev/null 2>&1; then
|
||||
display_input="$DISPLAY"
|
||||
case "$display_input" in
|
||||
*.*) ;;
|
||||
*) display_input="$display_input.0" ;;
|
||||
esac
|
||||
ffmpeg -hide_banner -loglevel error -y -f x11grab -framerate 15 -i "$display_input" -t 10 -pix_fmt yuv420p "$out/desktop-browser-smoke.mp4" >"$out/ffmpeg.log" 2>&1 &
|
||||
video_pid=$!
|
||||
else
|
||||
echo "ffmpeg missing; video artifact skipped" >"$out/ffmpeg.log"
|
||||
fi
|
||||
"$browser_bin" \
|
||||
--user-data-dir="$profile" \
|
||||
--no-first-run \
|
||||
@@ -268,9 +248,6 @@ cleanup() {
|
||||
trap cleanup EXIT
|
||||
sleep 8
|
||||
scrot "$out/desktop-browser-smoke.png"
|
||||
if [ -n "$video_pid" ]; then
|
||||
wait "$video_pid" || true
|
||||
fi
|
||||
cleanup
|
||||
trap - EXIT
|
||||
sleep 1
|
||||
@@ -314,11 +291,7 @@ function renderReport(summary: MantisDesktopBrowserSmokeSummary) {
|
||||
summary.artifacts.screenshotPath
|
||||
? `- Screenshot: \`${path.basename(summary.artifacts.screenshotPath)}\``
|
||||
: "- Screenshot: missing",
|
||||
summary.artifacts.videoPath
|
||||
? `- Video: \`${path.basename(summary.artifacts.videoPath)}\``
|
||||
: "- Video: missing",
|
||||
"- Remote metadata: `remote-metadata.json`",
|
||||
"- FFmpeg log: `ffmpeg.log`",
|
||||
"- Chrome log: `chrome.log`",
|
||||
summary.error ? `- Error: ${summary.error}` : undefined,
|
||||
"",
|
||||
@@ -428,9 +401,9 @@ async function copyRemoteArtifacts(params: {
|
||||
"-o",
|
||||
"UserKnownHostsFile=/dev/null",
|
||||
].join(" "),
|
||||
"--exclude",
|
||||
"chrome-profile/**",
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/desktop-browser-smoke.png`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/remote-metadata.json`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/chrome.log`,
|
||||
`${params.outputDir}/`,
|
||||
],
|
||||
cwd: params.cwd,
|
||||
@@ -551,17 +524,14 @@ export async function runMantisDesktopBrowserSmoke(
|
||||
runner,
|
||||
});
|
||||
const screenshotPath = path.join(outputDir, "desktop-browser-smoke.png");
|
||||
const videoPath = path.join(outputDir, "desktop-browser-smoke.mp4");
|
||||
if (!(await pathExists(screenshotPath))) {
|
||||
throw new Error("Desktop browser screenshot was not copied back from Crabbox.");
|
||||
}
|
||||
const copiedVideoPath = (await pathExists(videoPath)) ? videoPath : undefined;
|
||||
summary = {
|
||||
artifacts: {
|
||||
reportPath,
|
||||
screenshotPath,
|
||||
summaryPath,
|
||||
videoPath: copiedVideoPath,
|
||||
},
|
||||
browserUrl,
|
||||
htmlFile,
|
||||
@@ -586,7 +556,6 @@ export async function runMantisDesktopBrowserSmoke(
|
||||
screenshotPath,
|
||||
status: "pass",
|
||||
summaryPath,
|
||||
videoPath: copiedVideoPath,
|
||||
};
|
||||
} catch (error) {
|
||||
summary = {
|
||||
|
||||
@@ -28,16 +28,14 @@ describe("mantis before/after runtime", () => {
|
||||
const outputDir = path.join(repoRootArg, outputDirArg);
|
||||
await fs.mkdir(outputDir, { recursive: true });
|
||||
const screenshotPath = path.join(outputDir, `${lane}-timeline.png`);
|
||||
const videoPath = path.join(outputDir, `${lane}-timeline.mp4`);
|
||||
await fs.writeFile(screenshotPath, `${lane} screenshot`);
|
||||
await fs.writeFile(videoPath, `${lane} video`);
|
||||
await fs.writeFile(
|
||||
path.join(outputDir, "discord-qa-summary.json"),
|
||||
`${JSON.stringify(
|
||||
{
|
||||
scenarios: [
|
||||
{
|
||||
artifactPaths: { screenshot: screenshotPath, video: videoPath },
|
||||
artifactPaths: { screenshot: screenshotPath },
|
||||
details:
|
||||
lane === "baseline"
|
||||
? "reaction timeline missing thinking/done"
|
||||
@@ -96,11 +94,5 @@ describe("mantis before/after runtime", () => {
|
||||
await expect(
|
||||
fs.readFile(path.join(result.outputDir, "candidate", "candidate.png"), "utf8"),
|
||||
).resolves.toBe("candidate screenshot");
|
||||
await expect(
|
||||
fs.readFile(path.join(result.outputDir, "baseline", "baseline.mp4"), "utf8"),
|
||||
).resolves.toBe("baseline video");
|
||||
await expect(
|
||||
fs.readFile(path.join(result.outputDir, "candidate", "candidate.mp4"), "utf8"),
|
||||
).resolves.toBe("candidate video");
|
||||
});
|
||||
});
|
||||
|
||||
@@ -51,7 +51,6 @@ type LaneResult = {
|
||||
screenshotPath?: string;
|
||||
status: string;
|
||||
summaryPath: string;
|
||||
videoPath?: string;
|
||||
};
|
||||
|
||||
type Comparison = {
|
||||
@@ -61,7 +60,6 @@ type Comparison = {
|
||||
reproduced: boolean;
|
||||
screenshotPath?: string;
|
||||
status: string;
|
||||
videoPath?: string;
|
||||
};
|
||||
candidate: {
|
||||
expected: "queued -> thinking -> done";
|
||||
@@ -69,7 +67,6 @@ type Comparison = {
|
||||
ref: string;
|
||||
screenshotPath?: string;
|
||||
status: string;
|
||||
videoPath?: string;
|
||||
};
|
||||
pass: boolean;
|
||||
scenario: string;
|
||||
@@ -160,14 +157,12 @@ async function readLaneResult(params: {
|
||||
summary.scenarios?.find((entry) => entry.id === params.scenario) ?? summary.scenarios?.[0];
|
||||
const status = scenarioSummary?.status ?? "fail";
|
||||
const screenshotPath = scenarioSummary?.artifactPaths?.screenshot;
|
||||
const videoPath = scenarioSummary?.artifactPaths?.video;
|
||||
return {
|
||||
outputDir: params.publishedLaneDir,
|
||||
scenarioDetails: scenarioSummary?.details,
|
||||
screenshotPath,
|
||||
status,
|
||||
summaryPath,
|
||||
videoPath,
|
||||
} satisfies LaneResult;
|
||||
}
|
||||
|
||||
@@ -194,9 +189,6 @@ function renderReport(params: {
|
||||
params.baseline.screenshotPath
|
||||
? `- Screenshot: \`${path.join("baseline", path.basename(params.baseline.screenshotPath))}\``
|
||||
: "- Screenshot: missing",
|
||||
params.baseline.videoPath
|
||||
? `- Video: \`${path.join("baseline", path.basename(params.baseline.videoPath))}\``
|
||||
: "- Video: missing",
|
||||
params.baseline.scenarioDetails ? `- Details: ${params.baseline.scenarioDetails}` : undefined,
|
||||
"",
|
||||
"## Candidate",
|
||||
@@ -208,9 +200,6 @@ function renderReport(params: {
|
||||
params.candidate.screenshotPath
|
||||
? `- Screenshot: \`${path.join("candidate", path.basename(params.candidate.screenshotPath))}\``
|
||||
: "- Screenshot: missing",
|
||||
params.candidate.videoPath
|
||||
? `- Video: \`${path.join("candidate", path.basename(params.candidate.videoPath))}\``
|
||||
: "- Video: missing",
|
||||
params.candidate.scenarioDetails ? `- Details: ${params.candidate.scenarioDetails}` : undefined,
|
||||
"",
|
||||
].filter((line) => line !== undefined);
|
||||
@@ -229,18 +218,6 @@ async function copyScreenshot(params: { lane: "baseline" | "candidate"; result:
|
||||
return target;
|
||||
}
|
||||
|
||||
async function copyVideo(params: { lane: "baseline" | "candidate"; result: LaneResult }) {
|
||||
if (!params.result.videoPath) {
|
||||
return undefined;
|
||||
}
|
||||
const source = path.isAbsolute(params.result.videoPath)
|
||||
? params.result.videoPath
|
||||
: path.join(params.result.outputDir, params.result.videoPath);
|
||||
const target = path.join(params.result.outputDir, `${params.lane}.mp4`);
|
||||
await fs.copyFile(source, target);
|
||||
return target;
|
||||
}
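
The `copyVideo` helper removed here resolves its source path in two ways: an absolute `videoPath` is used as-is, and a relative one is joined onto the lane's output directory. The sketch below isolates that rule. The name `resolveArtifactSource` is hypothetical, and `path.posix` is used only so the result is the same on every platform, whereas the original calls `path` directly.

```typescript
import * as path from "node:path";

// Hypothetical helper mirroring the source-path rule in copyVideo.
// Absolute paths win; relative paths are anchored at the lane output directory.
function resolveArtifactSource(outputDir: string, artifactPath: string): string {
  return path.posix.isAbsolute(artifactPath)
    ? artifactPath
    : path.posix.join(outputDir, artifactPath);
}

console.log(resolveArtifactSource("/runs/baseline", "baseline-timeline.mp4"));
console.log(resolveArtifactSource("/runs/baseline", "/tmp/elsewhere/baseline-timeline.mp4"));
```

Anchoring relative paths at the output directory lets a lane summary record either form without the caller needing to know which one was written.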
|
||||
|
||||
async function runLane(params: {
|
||||
lane: "baseline" | "candidate";
|
||||
outputDir: string;
|
||||
@@ -323,11 +300,9 @@ async function runLane(params: {
|
||||
scenario: params.scenario,
|
||||
});
|
||||
const copiedScreenshot = await copyScreenshot({ lane: params.lane, result });
|
||||
const copiedVideo = await copyVideo({ lane: params.lane, result });
|
||||
return {
|
||||
...result,
|
||||
screenshotPath: copiedScreenshot ?? result.screenshotPath,
|
||||
videoPath: copiedVideo ?? result.videoPath,
|
||||
} satisfies LaneResult;
|
||||
}
|
||||
|
||||
@@ -398,7 +373,6 @@ export async function runMantisBeforeAfter(
|
||||
reproduced: baselineResult.status === "fail",
|
||||
screenshotPath: baselineResult.screenshotPath,
|
||||
status: baselineResult.status,
|
||||
videoPath: baselineResult.videoPath,
|
||||
},
|
||||
candidate: {
|
||||
expected: "queued -> thinking -> done",
|
||||
@@ -406,7 +380,6 @@ export async function runMantisBeforeAfter(
|
||||
ref: candidate,
|
||||
screenshotPath: candidateResult.screenshotPath,
|
||||
status: candidateResult.status,
|
||||
videoPath: candidateResult.videoPath,
|
||||
},
|
||||
pass: baselineResult.status === "fail" && candidateResult.status === "pass",
|
||||
scenario,
|
||||
|
||||
@@ -54,10 +54,8 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-qa-report.md"), "# Slack\n");
|
||||
} else {
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.png"), "png");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.mp4"), "mp4");
|
||||
await fs.writeFile(path.join(outputDir as string, "remote-metadata.json"), "{}\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "chrome.log"), "chrome\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "ffmpeg.log"), "ffmpeg\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-command.log"), "qa\n");
|
||||
}
|
||||
return { stdout: "", stderr: "" };
|
||||
@@ -99,10 +97,6 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
expect(remoteScript).toContain("${CHROME_BIN:-}");
|
||||
expect(remoteScript).toContain("pnpm install --frozen-lockfile");
|
||||
expect(remoteScript).toContain("pnpm build");
|
||||
expect(remoteScript).toContain("ffmpeg");
|
||||
expect(remoteScript).toContain('sudo apt-get update -y >>"$out/apt.log" 2>&1 || true');
|
||||
expect(remoteScript).toContain("slack-desktop-smoke.mp4");
|
||||
expect(remoteScript).not.toContain("-video_size");
|
||||
expect(remoteScript).toContain("openclaw qa slack");
|
||||
expect(remoteScript).toContain("--scenario 'slack-canary'");
|
||||
expect(remoteScript).toContain("OPENCLAW_MANTIS_SLACK_BROWSER_PROFILE_DIR");
|
||||
@@ -112,12 +106,11 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
expect(rsyncArgs).not.toContain("--delete");
|
||||
expect(rsyncArgs).toEqual(
|
||||
expect.arrayContaining([
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-slack-desktop-2026-05-04T13-00-00-000Z/",
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-slack-desktop-2026-05-04T13-00-00-000Z/slack-desktop-smoke.png",
|
||||
"crabbox@203.0.113.10:/tmp/openclaw-mantis-slack-desktop-2026-05-04T13-00-00-000Z/slack-qa/",
|
||||
]),
|
||||
);
|
||||
await expect(fs.readFile(result.screenshotPath ?? "", "utf8")).resolves.toBe("png");
|
||||
await expect(fs.readFile(result.videoPath ?? "", "utf8")).resolves.toBe("mp4");
|
||||
const summary = JSON.parse(await fs.readFile(result.summaryPath, "utf8")) as {
|
||||
crabbox: { id: string; vncCommand: string };
|
||||
status: string;
|
||||
@@ -153,10 +146,8 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
const outputDir = args.at(-1);
|
||||
await fs.mkdir(outputDir as string, { recursive: true });
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.png"), "png");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.mp4"), "mp4");
|
||||
await fs.writeFile(path.join(outputDir as string, "remote-metadata.json"), "{}\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "chrome.log"), "chrome\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "ffmpeg.log"), "ffmpeg\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-command.log"), "qa\n");
|
||||
}
|
||||
return { stdout: "", stderr: "" };
|
||||
@@ -172,19 +163,17 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
|
||||
expect(result.status).toBe("fail");
|
||||
expect(result.screenshotPath).toBe(path.join(result.outputDir, "slack-desktop-smoke.png"));
|
||||
expect(result.videoPath).toBe(path.join(result.outputDir, "slack-desktop-smoke.mp4"));
|
||||
await expect(
|
||||
fs.readFile(path.join(result.outputDir, "slack-desktop-smoke.png"), "utf8"),
|
||||
).resolves.toBe("png");
|
||||
const summary = JSON.parse(await fs.readFile(result.summaryPath, "utf8")) as {
|
||||
artifacts: { screenshotPath?: string; videoPath?: string };
|
||||
artifacts: { screenshotPath?: string };
|
||||
error?: string;
|
||||
status: string;
|
||||
};
|
||||
expect(summary.status).toBe("fail");
|
||||
expect(summary.error).toContain("remote Slack QA failed");
|
||||
expect(summary.artifacts.screenshotPath).toContain("slack-desktop-smoke.png");
|
||||
expect(summary.artifacts.videoPath).toContain("slack-desktop-smoke.mp4");
|
||||
});
|
||||
|
||||
it("accepts Blacksmith Testbox lease ids from Crabbox warmup", async () => {
|
||||
@@ -215,10 +204,8 @@ describe("mantis Slack desktop smoke runtime", () => {
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-qa-report.md"), "# Slack\n");
|
||||
} else {
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.png"), "png");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-smoke.mp4"), "mp4");
|
||||
await fs.writeFile(path.join(outputDir as string, "remote-metadata.json"), "{}\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "chrome.log"), "chrome\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "ffmpeg.log"), "ffmpeg\n");
|
||||
await fs.writeFile(path.join(outputDir as string, "slack-desktop-command.log"), "qa\n");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -35,7 +35,6 @@ export type MantisSlackDesktopSmokeResult = {
|
||||
screenshotPath?: string;
|
||||
status: "pass" | "fail";
|
||||
summaryPath: string;
|
||||
videoPath?: string;
|
||||
};
|
||||
|
||||
type CommandResult = {
|
||||
@@ -67,7 +66,6 @@ type MantisSlackDesktopSmokeSummary = {
|
||||
screenshotPath?: string;
|
||||
slackQaDir?: string;
|
||||
summaryPath: string;
|
||||
videoPath?: string;
|
||||
};
|
||||
crabbox: {
|
||||
bin: string;
|
||||
@@ -304,24 +302,6 @@ fi
|
||||
if [ -z "$slack_url" ]; then
|
||||
slack_url="https://app.slack.com/client"
|
||||
fi
|
||||
video_pid=""
|
||||
if command -v ffmpeg >/dev/null 2>&1; then
|
||||
:
|
||||
else
|
||||
sudo apt-get update -y >>"$out/apt.log" 2>&1 || true
|
||||
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y ffmpeg >>"$out/apt.log" 2>&1 || true
|
||||
fi
|
||||
if command -v ffmpeg >/dev/null 2>&1; then
|
||||
display_input="$DISPLAY"
|
||||
case "$display_input" in
|
||||
*.*) ;;
|
||||
*) display_input="$display_input.0" ;;
|
||||
esac
|
||||
ffmpeg -hide_banner -loglevel error -y -f x11grab -framerate 15 -i "$display_input" -t 45 -pix_fmt yuv420p "$out/slack-desktop-smoke.mp4" >"$out/ffmpeg.log" 2>&1 &
|
||||
video_pid=$!
|
||||
else
|
||||
echo "ffmpeg missing; video artifact skipped" >"$out/ffmpeg.log"
|
||||
fi
|
||||
if [ "$setup_gateway" = "1" ]; then
|
||||
nohup "$browser_bin" \
|
||||
--user-data-dir="$profile" \
|
||||
@@ -396,9 +376,6 @@ MANTIS_SLACK_PATCH
|
||||
} >"$out/slack-desktop-command.log" 2>&1 || qa_status=$?
|
||||
sleep 5
|
||||
scrot "$out/slack-desktop-smoke.png" || true
|
||||
if [ -n "$video_pid" ]; then
|
||||
wait "$video_pid" || true
|
||||
fi
|
||||
if [ "$setup_gateway" != "1" ]; then
|
||||
kill "$chrome_pid" >/dev/null 2>&1 || true
|
||||
fi
|
||||
@@ -445,13 +422,9 @@ function renderReport(summary: MantisSlackDesktopSmokeSummary) {
|
||||
summary.artifacts.screenshotPath
|
||||
? `- Screenshot: \`${path.basename(summary.artifacts.screenshotPath)}\``
|
||||
: "- Screenshot: missing",
|
||||
summary.artifacts.videoPath
|
||||
? `- Video: \`${path.basename(summary.artifacts.videoPath)}\``
|
||||
: "- Video: missing",
|
||||
summary.artifacts.slackQaDir ? "- Slack QA artifacts: `slack-qa/`" : undefined,
|
||||
"- Remote metadata: `remote-metadata.json`",
|
||||
"- Remote command log: `slack-desktop-command.log`",
|
||||
"- FFmpeg log: `ffmpeg.log`",
|
||||
"- Chrome log: `chrome.log`",
|
||||
summary.error ? `- Error: ${summary.error}` : undefined,
|
||||
"",
|
||||
@@ -571,7 +544,10 @@ async function copyRemoteArtifacts(params: {
|
||||
"-az",
|
||||
"-e",
|
||||
sshArgs,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/slack-desktop-smoke.png`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/remote-metadata.json`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/chrome.log`,
|
||||
`${sshUser}@${host}:${params.remoteOutputDir}/slack-desktop-command.log`,
|
||||
`${params.outputDir}/`,
|
||||
],
|
||||
cwd: params.cwd,
|
||||
@@ -660,7 +636,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
let summary: MantisSlackDesktopSmokeSummary | undefined;
|
||||
let screenshotPath: string | undefined;
|
||||
let slackQaDir: string | undefined;
|
||||
let videoPath: string | undefined;
|
||||
|
||||
try {
|
||||
leaseId =
|
||||
@@ -727,10 +702,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
runner,
|
||||
});
|
||||
screenshotPath = path.join(outputDir, "slack-desktop-smoke.png");
|
||||
videoPath = path.join(outputDir, "slack-desktop-smoke.mp4");
|
||||
if (!(await pathExists(videoPath))) {
|
||||
videoPath = undefined;
|
||||
}
|
||||
slackQaDir = path.join(outputDir, "slack-qa");
|
||||
if (!(await pathExists(screenshotPath))) {
|
||||
throw new Error("Slack desktop screenshot was not copied back from Crabbox.");
|
||||
@@ -744,7 +715,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
screenshotPath,
|
||||
slackQaDir,
|
||||
summaryPath,
|
||||
videoPath,
|
||||
},
|
||||
crabbox: {
|
||||
bin: crabboxBin,
|
||||
@@ -768,7 +738,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
screenshotPath,
|
||||
status: "pass",
|
||||
summaryPath,
|
||||
videoPath,
|
||||
};
|
||||
} catch (error) {
|
||||
summary = {
|
||||
@@ -777,7 +746,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
screenshotPath,
|
||||
slackQaDir,
|
||||
summaryPath,
|
||||
videoPath,
|
||||
},
|
||||
crabbox: {
|
||||
bin: crabboxBin,
|
||||
@@ -803,7 +771,6 @@ export async function runMantisSlackDesktopSmoke(
|
||||
screenshotPath,
|
||||
status: "fail",
|
||||
summaryPath,
|
||||
videoPath,
|
||||
};
|
||||
} finally {
|
||||
if (summary) {
|
||||
|
||||
@@ -1,349 +0,0 @@
|
||||
import fs from "node:fs/promises";
|
||||
import os from "node:os";
|
||||
import path from "node:path";
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
import { runMantisVisualDriver, runMantisVisualTask } from "./visual-task.runtime.js";
|
||||
|
||||
describe("mantis visual task runtime", () => {
|
||||
let repoRoot: string;
|
||||
|
||||
beforeEach(async () => {
|
||||
repoRoot = await fs.mkdtemp(path.join(os.tmpdir(), "mantis-visual-task-"));
|
||||
});
|
||||
|
||||
afterEach(async () => {
|
||||
await fs.rm(repoRoot, { force: true, recursive: true });
|
||||
});
|
||||
|
||||
it("records a visible browser task and keeps screenshot/video artifacts", async () => {
|
||||
const commands: { args: readonly string[]; command: string }[] = [];
|
||||
const runner = vi.fn(async (command: string, args: readonly string[]) => {
|
||||
commands.push({ command, args });
|
||||
if (command === "/tmp/crabbox" && args[0] === "warmup") {
|
||||
return { stdout: "ready lease cbx_abc123\n", stderr: "" };
|
||||
}
|
||||
if (command === "/tmp/crabbox" && args[0] === "inspect") {
|
||||
return {
|
||||
stdout: `${JSON.stringify({
|
||||
id: "cbx_abc123",
|
||||
provider: "hetzner",
|
||||
slug: "brisk-mantis",
|
||||
state: "active",
|
||||
})}\n`,
|
||||
stderr: "",
|
||||
};
|
||||
}
|
||||
if (command === "/tmp/crabbox" && args[0] === "record") {
|
||||
const outputPath = args[args.indexOf("--output") + 1];
|
||||
const outputDir = args[args.indexOf("--output-dir") + 1];
|
||||
await fs.mkdir(path.dirname(outputPath), { recursive: true });
|
||||
await fs.writeFile(outputPath, "mp4");
|
||||
await fs.writeFile(path.join(outputDir, "visual-task.png"), "png");
|
||||
await fs.writeFile(
|
||||
path.join(outputDir, "mantis-visual-task-driver-result.json"),
|
||||
`${JSON.stringify({
|
||||
browserUrl: "https://example.net",
|
||||
finishedAt: "2026-05-04T12:00:05.000Z",
|
||||
matched: true,
|
||||
outputDir,
|
||||
screenshotPath: path.join(outputDir, "visual-task.png"),
|
||||
startedAt: "2026-05-04T12:00:01.000Z",
|
||||
status: "pass",
|
||||
vision: {
|
||||
mode: "metadata",
|
||||
timeoutMs: 120000,
|
||||
},
|
||||
})}\n`,
|
||||
);
|
||||
}
|
||||
return { stdout: "", stderr: "" };
|
||||
});
|
||||
|
||||
const result = await runMantisVisualTask({
|
||||
commandRunner: runner,
|
||||
crabboxBin: "/tmp/crabbox",
|
||||
duration: "12s",
|
||||
env: { PATH: process.env.PATH },
|
||||
now: () => new Date("2026-05-04T12:00:00.000Z"),
|
||||
outputDir: ".artifacts/qa-e2e/mantis/visual-task-test",
|
||||
repoRoot,
|
||||
settleMs: 0,
|
||||
visionMode: "metadata",
|
||||
});
|
||||
|
||||
expect(result.status).toBe("pass");
|
||||
expect(commands.map((entry) => [entry.command, entry.args[0]])).toEqual([
|
||||
["/tmp/crabbox", "warmup"],
|
||||
["/tmp/crabbox", "inspect"],
|
||||
["/tmp/crabbox", "record"],
|
||||
["/tmp/crabbox", "stop"],
|
||||
]);
|
||||
const recordArgs = commands.find((entry) => entry.args[0] === "record")?.args ?? [];
|
||||
expect(recordArgs).toEqual(
|
||||
expect.arrayContaining([
|
||||
"--duration",
|
||||
"12s",
|
||||
"--output",
|
||||
path.join(repoRoot, ".artifacts/qa-e2e/mantis/visual-task-test/visual-task.mp4"),
|
||||
"--while",
|
||||
"--",
|
||||
"pnpm",
|
||||
"--dir",
|
||||
repoRoot,
|
||||
"openclaw",
|
||||
"qa",
|
||||
"mantis",
|
||||
"visual-driver",
|
||||
]),
|
||||
);
|
||||
await expect(fs.readFile(result.screenshotPath ?? "", "utf8")).resolves.toBe("png");
|
||||
await expect(fs.readFile(result.videoPath ?? "", "utf8")).resolves.toBe("mp4");
|
||||
const summary = JSON.parse(await fs.readFile(result.summaryPath, "utf8")) as {
|
||||
crabbox: { id: string; vncCommand: string };
|
||||
status: string;
|
||||
visionMode: string;
|
||||
};
|
||||
expect(summary).toMatchObject({
|
||||
crabbox: {
|
||||
id: "cbx_abc123",
|
||||
vncCommand: "/tmp/crabbox vnc --provider hetzner --id cbx_abc123 --open",
|
||||
},
|
||||
status: "pass",
|
||||
visionMode: "metadata",
|
||||
});
|
||||
});
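
The stubbed runner in this test reads flag values with `args[args.indexOf("--output") + 1]`. That is fine while the flag is always present, but if it were missing, `indexOf` would return `-1` and the expression would silently yield `args[0]`. A stricter lookup could be sketched as follows. The `flagValue` helper is an assumption for illustration and does not appear in the test file.

```typescript
// Hypothetical helper: returns the value following a flag, or undefined
// when the flag is absent or is the last argument.
function flagValue(args: readonly string[], flag: string): string | undefined {
  const index = args.indexOf(flag);
  if (index === -1 || index + 1 >= args.length) {
    return undefined;
  }
  return args[index + 1];
}

const recordArgs = ["record", "--duration", "12s", "--output", "/tmp/visual-task.mp4"];
console.log(flagValue(recordArgs, "--output"));
console.log(flagValue(recordArgs, "--output-dir"));
// The unguarded form falls back to the first argument when the flag is missing.
console.log(recordArgs[recordArgs.indexOf("--output-dir") + 1]);
```

In a test stub the unguarded form keeps the fixture short; the guarded form trades a few lines for a clearer failure when an expected flag is not passed.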
|
||||
|
||||
  it("fails when recording breaks after the visual driver passes", async () => {
    const commands: { args: readonly string[]; command: string }[] = [];
    const runner = vi.fn(async (command: string, args: readonly string[]) => {
      commands.push({ command, args });
      if (command === "/tmp/crabbox" && args[0] === "warmup") {
        return { stdout: "ready lease cbx_abc123\n", stderr: "" };
      }
      if (command === "/tmp/crabbox" && args[0] === "inspect") {
        return {
          stdout: `${JSON.stringify({
            id: "cbx_abc123",
            provider: "hetzner",
            slug: "brisk-mantis",
            state: "active",
          })}\n`,
          stderr: "",
        };
      }
      if (command === "/tmp/crabbox" && args[0] === "record") {
        const outputDir = args[args.indexOf("--output-dir") + 1];
        await fs.mkdir(outputDir, { recursive: true });
        await fs.writeFile(path.join(outputDir, "visual-task.png"), "png");
        await fs.writeFile(
          path.join(outputDir, "mantis-visual-task-driver-result.json"),
          `${JSON.stringify({
            browserUrl: "https://example.net",
            finishedAt: "2026-05-04T12:00:05.000Z",
            matched: true,
            outputDir,
            screenshotPath: path.join(outputDir, "visual-task.png"),
            startedAt: "2026-05-04T12:00:01.000Z",
            status: "pass",
            vision: {
              mode: "metadata",
              timeoutMs: 120000,
            },
          })}\n`,
        );
        throw new Error("crabbox record failed after driver exit");
      }
      return { stdout: "", stderr: "" };
    });

    const result = await runMantisVisualTask({
      commandRunner: runner,
      crabboxBin: "/tmp/crabbox",
      env: { PATH: process.env.PATH },
      now: () => new Date("2026-05-04T12:00:00.000Z"),
      outputDir: ".artifacts/qa-e2e/mantis/visual-task-recording-fail",
      repoRoot,
      settleMs: 0,
      visionMode: "metadata",
    });

    expect(result).toMatchObject({
      status: "fail",
      videoPath: undefined,
    });
    expect(commands.map((entry) => [entry.command, entry.args[0]])).toEqual([
      ["/tmp/crabbox", "warmup"],
      ["/tmp/crabbox", "inspect"],
      ["/tmp/crabbox", "record"],
    ]);
    const summary = JSON.parse(await fs.readFile(result.summaryPath, "utf8")) as {
      error?: string;
      recording?: { error?: string; required: boolean };
      status: string;
    };
    expect(summary).toMatchObject({
      error: "crabbox record failed after driver exit",
      recording: {
        error: "crabbox record failed after driver exit",
        required: true,
      },
      status: "fail",
    });
  });

  it("drives a lease, screenshots it, and verifies image-describe text", async () => {
    const commands: { args: readonly string[]; command: string }[] = [];
    const runner = vi.fn(async (command: string, args: readonly string[]) => {
      commands.push({ command, args });
      if (command === "/tmp/crabbox" && args[0] === "screenshot") {
        const outputPath = args[args.indexOf("--output") + 1];
        await fs.mkdir(path.dirname(outputPath), { recursive: true });
        await fs.writeFile(outputPath, "png");
      }
      if (command === "pnpm") {
        return {
          stdout: `\n> openclaw qa mantis visual-driver --vision-prompt '{"visible": boolean}'\n${JSON.stringify(
            {
              ok: true,
              outputs: [
                {
                  kind: "image.description",
                  text: JSON.stringify({
                    evidence: 'The page heading reads "Example Domain".',
                    reason: "The expected text is visible as the main heading.",
                    visible: true,
                  }),
                },
              ],
            },
          )}\n`,
          stderr: "",
        };
      }
      return { stdout: "", stderr: "" };
    });

    const result = await runMantisVisualDriver({
      browserUrl: "https://example.net",
      commandRunner: runner,
      crabboxBin: "/tmp/crabbox",
      env: { PATH: process.env.PATH },
      expectText: "Example Domain",
      leaseId: "cbx_abc123",
      outputDir: ".artifacts/qa-e2e/mantis/visual-driver-test",
      repoRoot,
      settleMs: 0,
      visionMode: "image-describe",
      visionModel: "openai/gpt-5.4",
      visionPrompt: "Read the page title",
    });

    expect(result.status).toBe("pass");
    expect(commands.map((entry) => [entry.command, entry.args[0], entry.args[1]])).toEqual([
      ["/tmp/crabbox", "desktop", "launch"],
      ["/tmp/crabbox", "screenshot", "--provider"],
      ["pnpm", "--dir", repoRoot],
    ]);
    const launchArgs = commands.find((entry) => entry.args[0] === "desktop")?.args ?? [];
    expect(launchArgs).toEqual(
      expect.arrayContaining(["--", "sh", "-lc", expect.stringContaining("--no-first-run")]),
    );
    const visionArgs = commands.find((entry) => entry.command === "pnpm")?.args ?? [];
    expect(visionArgs).toEqual(
      expect.arrayContaining([
        "infer",
        "image",
        "describe",
        "--file",
        path.join(repoRoot, ".artifacts/qa-e2e/mantis/visual-driver-test/visual-task.png"),
        "--model",
        "openai/gpt-5.4",
      ]),
    );
    expect(visionArgs).toEqual(
      expect.arrayContaining(["--prompt", expect.stringContaining("return only valid JSON")]),
    );
    expect(result.vision.assertion).toMatchObject({
      evidence: 'The page heading reads "Example Domain".',
      matched: true,
      visible: true,
    });
  });

  it("fails image-describe text checks when the model gives negative evidence that quotes the target", async () => {
    const runner = vi.fn(async (command: string, args: readonly string[]) => {
      if (command === "/tmp/crabbox" && args[0] === "screenshot") {
        const outputPath = args[args.indexOf("--output") + 1];
        await fs.mkdir(path.dirname(outputPath), { recursive: true });
        await fs.writeFile(outputPath, "png");
      }
      if (command === "pnpm") {
        return {
          stdout: `${JSON.stringify({
            ok: true,
            outputs: [
              {
                kind: "image.description",
                text: 'The screenshot does not contain "Example Domain".',
              },
            ],
          })}\n`,
          stderr: "",
        };
      }
      return { stdout: "", stderr: "" };
    });

    const result = await runMantisVisualDriver({
      commandRunner: runner,
      crabboxBin: "/tmp/crabbox",
      expectText: "Example Domain",
      leaseId: "cbx_abc123",
      outputDir: ".artifacts/qa-e2e/mantis/visual-driver-negative",
      repoRoot,
      settleMs: 0,
      visionMode: "image-describe",
    });

    expect(result).toMatchObject({
      matched: false,
      status: "fail",
      vision: {
        assertion: {
          matched: false,
          reason: "Image describe did not return a structured visual assertion.",
        },
      },
    });
  });

  it("fails metadata mode when text evidence is requested", async () => {
    const runner = vi.fn(async (command: string, args: readonly string[]) => {
      if (command === "/tmp/crabbox" && args[0] === "screenshot") {
        const outputPath = args[args.indexOf("--output") + 1];
        await fs.mkdir(path.dirname(outputPath), { recursive: true });
        await fs.writeFile(outputPath, "png");
      }
      return { stdout: "", stderr: "" };
    });

    const result = await runMantisVisualDriver({
      commandRunner: runner,
      crabboxBin: "/tmp/crabbox",
      expectText: "Example Domain",
      leaseId: "cbx_abc123",
      outputDir: ".artifacts/qa-e2e/mantis/visual-driver-metadata",
      repoRoot,
      settleMs: 0,
      visionMode: "metadata",
    });

    expect(result).toMatchObject({
      matched: false,
      status: "fail",
      vision: {
        mode: "metadata",
      },
    });
  });
});
@@ -1,926 +0,0 @@
import { spawn, type SpawnOptions } from "node:child_process";
import fs from "node:fs/promises";
import path from "node:path";
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import { ensureRepoBoundDirectory, resolveRepoRelativeOutputDir } from "../cli-paths.js";

export type MantisVisualTaskVisionMode = "image-describe" | "metadata";

export type MantisVisualTaskOptions = {
  browserUrl?: string;
  commandRunner?: CommandRunner;
  crabboxBin?: string;
  duration?: string;
  env?: NodeJS.ProcessEnv;
  expectText?: string;
  idleTimeout?: string;
  keepLease?: boolean;
  leaseId?: string;
  machineClass?: string;
  now?: () => Date;
  outputDir?: string;
  provider?: string;
  repoRoot?: string;
  settleMs?: number;
  ttl?: string;
  visionMode?: MantisVisualTaskVisionMode;
  visionModel?: string;
  visionPrompt?: string;
  visionTimeoutMs?: number;
};

export type MantisVisualDriverOptions = {
  browserUrl?: string;
  commandRunner?: CommandRunner;
  crabboxBin?: string;
  env?: NodeJS.ProcessEnv;
  expectText?: string;
  leaseId?: string;
  outputDir?: string;
  provider?: string;
  repoRoot?: string;
  settleMs?: number;
  visionMode?: MantisVisualTaskVisionMode;
  visionModel?: string;
  visionPrompt?: string;
  visionTimeoutMs?: number;
};

export type MantisVisualTaskResult = {
  outputDir: string;
  reportPath: string;
  screenshotPath?: string;
  status: "pass" | "fail";
  summaryPath: string;
  videoPath?: string;
};

type CommandResult = {
  stderr: string;
  stdout: string;
};

type CommandRunner = (
  command: string,
  args: readonly string[],
  options: SpawnOptions,
) => Promise<CommandResult>;

type CrabboxInspect = {
  id?: string;
  provider?: string;
  slug?: string;
  state?: string;
};

type MantisVisualDriverResult = {
  browserUrl: string;
  error?: string;
  expectText?: string;
  finishedAt: string;
  matched?: boolean;
  outputDir: string;
  screenshotPath: string;
  startedAt: string;
  status: "pass" | "fail";
  vision: {
    assertion?: VisionAssertion;
    mode: MantisVisualTaskVisionMode;
    model?: string;
    prompt?: string;
    text?: string;
    timeoutMs: number;
  };
};

type VisionAssertion = {
  evidence?: string;
  expectedText: string;
  matched: boolean;
  reason?: string;
  visible?: boolean;
};

type MantisVisualTaskSummary = {
  artifacts: {
    driverResultPath: string;
    reportPath: string;
    screenshotPath?: string;
    summaryPath: string;
    videoPath?: string;
  };
  browserUrl: string;
  crabbox: {
    bin: string;
    createdLease: boolean;
    id: string;
    provider: string;
    slug?: string;
    state?: string;
    vncCommand: string;
  };
  driver?: MantisVisualDriverResult;
  error?: string;
  finishedAt: string;
  outputDir: string;
  recording: {
    error?: string;
    required: boolean;
  };
  startedAt: string;
  status: "pass" | "fail";
  visionMode: MantisVisualTaskVisionMode;
};

const DEFAULT_BROWSER_URL = "https://example.net";
const DEFAULT_PROVIDER = "hetzner";
const DEFAULT_CLASS = "beast";
const DEFAULT_DURATION = "180s";
const DEFAULT_IDLE_TIMEOUT = "60m";
const DEFAULT_TTL = "120m";
const DEFAULT_SETTLE_MS = 8000;
const DEFAULT_VISION_TIMEOUT_MS = 120000;
const CRABBOX_BIN_ENV = "OPENCLAW_MANTIS_CRABBOX_BIN";
const CRABBOX_PROVIDER_ENV = "OPENCLAW_MANTIS_CRABBOX_PROVIDER";
const CRABBOX_CLASS_ENV = "OPENCLAW_MANTIS_CRABBOX_CLASS";
const CRABBOX_LEASE_ID_ENV = "OPENCLAW_MANTIS_CRABBOX_LEASE_ID";
const CRABBOX_KEEP_ENV = "OPENCLAW_MANTIS_KEEP_VM";
const CRABBOX_IDLE_TIMEOUT_ENV = "OPENCLAW_MANTIS_CRABBOX_IDLE_TIMEOUT";
const CRABBOX_TTL_ENV = "OPENCLAW_MANTIS_CRABBOX_TTL";

function trimToValue(value: string | undefined) {
  const trimmed = value?.trim();
  return trimmed && trimmed.length > 0 ? trimmed : undefined;
}

function isTruthyOptIn(value: string | undefined) {
  const normalized = value?.trim().toLowerCase();
  return normalized === "1" || normalized === "true" || normalized === "yes";
}

function defaultOutputDir(repoRoot: string, startedAt: Date) {
  const stamp = startedAt.toISOString().replace(/[:.]/gu, "-");
  return path.join(repoRoot, ".artifacts", "qa-e2e", "mantis", `visual-task-${stamp}`);
}

function resolveMantisOutputDir(repoRoot: string, outputDir: string | undefined, startedAt: Date) {
  const configured = trimToValue(outputDir);
  if (!configured) {
    return defaultOutputDir(repoRoot, startedAt);
  }
  return path.isAbsolute(configured)
    ? configured
    : (resolveRepoRelativeOutputDir(repoRoot, configured) ?? defaultOutputDir(repoRoot, startedAt));
}

async function defaultCommandRunner(
  command: string,
  args: readonly string[],
  options: SpawnOptions,
): Promise<CommandResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, {
      ...options,
      stdio: ["ignore", "pipe", "pipe"],
    });
    let stdout = "";
    let stderr = "";
    child.stdout?.on("data", (chunk: Buffer) => {
      const text = chunk.toString();
      stdout += text;
      if (options.stdio === "inherit") {
        process.stdout.write(text);
      }
    });
    child.stderr?.on("data", (chunk: Buffer) => {
      const text = chunk.toString();
      stderr += text;
      if (options.stdio === "inherit") {
        process.stderr.write(text);
      }
    });
    child.on("error", reject);
    child.on("close", (code, signal) => {
      if (code === 0) {
        resolve({ stdout, stderr });
        return;
      }
      const detail = signal ? `signal ${signal}` : `exit code ${code ?? "unknown"}`;
      reject(new Error(`${command} ${args.join(" ")} failed with ${detail}`));
    });
  });
}

async function pathExists(filePath: string) {
  try {
    await fs.access(filePath);
    return true;
  } catch {
    return false;
  }
}

async function nonEmptyFileExists(filePath: string) {
  try {
    const stat = await fs.stat(filePath);
    return stat.isFile() && stat.size > 0;
  } catch {
    return false;
  }
}

async function resolveCrabboxBin(params: {
  env: NodeJS.ProcessEnv;
  explicit?: string;
  repoRoot: string;
}) {
  const configured = trimToValue(params.explicit) ?? trimToValue(params.env[CRABBOX_BIN_ENV]);
  if (configured) {
    return configured;
  }
  const sibling = path.resolve(params.repoRoot, "../crabbox/bin/crabbox");
  if (await pathExists(sibling)) {
    return sibling;
  }
  return "crabbox";
}

function extractLeaseId(output: string) {
  return output.match(/\b(?:cbx_[a-f0-9]+|tbx_[A-Za-z0-9_-]+)\b/u)?.[0];
}

function normalizeVisionMode(value: string | undefined): MantisVisualTaskVisionMode {
  const normalized = trimToValue(value);
  if (normalized === undefined || normalized === "image-describe") {
    return "image-describe";
  }
  if (normalized === "metadata") {
    return "metadata";
  }
  throw new Error(`Unsupported Mantis visual-task vision mode: ${normalized}`);
}

function defaultVisionPrompt(expectText: string | undefined) {
  if (expectText) {
    return `Inspect this UI screenshot and determine whether the exact text "${expectText}" is visibly present.`;
  }
  return "Inspect this UI screenshot and describe the visible page state in one concise sentence.";
}

function buildVisionPrompt(prompt: string | undefined, expectText: string | undefined) {
  const base = trimToValue(prompt) ?? defaultVisionPrompt(expectText);
  if (!expectText) {
    return base;
  }
  if (base.includes("Visual assertion contract:")) {
    return base;
  }
  return `${base}\n\nVisual assertion contract: return only valid JSON: {"visible": boolean, "evidence": string, "reason": string}. Set visible=true only when the exact text "${expectText}" is actually visible in the screenshot; text quoted in the prompt or a negative statement is not evidence.`;
}

async function runCommand(params: {
  args: readonly string[];
  command: string;
  cwd: string;
  env: NodeJS.ProcessEnv;
  runner: CommandRunner;
  stdio?: "inherit" | "pipe";
}) {
  return params.runner(params.command, params.args, {
    cwd: params.cwd,
    env: params.env,
    stdio: params.stdio ?? "pipe",
  });
}

async function warmupCrabbox(params: {
  crabboxBin: string;
  cwd: string;
  env: NodeJS.ProcessEnv;
  idleTimeout: string;
  machineClass: string;
  provider: string;
  runner: CommandRunner;
  ttl: string;
}) {
  const result = await runCommand({
    command: params.crabboxBin,
    args: [
      "warmup",
      "--provider",
      params.provider,
      "--desktop",
      "--browser",
      "--class",
      params.machineClass,
      "--idle-timeout",
      params.idleTimeout,
      "--ttl",
      params.ttl,
    ],
    cwd: params.cwd,
    env: params.env,
    runner: params.runner,
    stdio: "inherit",
  });
  const leaseId = extractLeaseId(`${result.stdout}\n${result.stderr}`);
  if (!leaseId) {
    throw new Error("Crabbox warmup did not print a lease id.");
  }
  return leaseId;
}

async function inspectCrabbox(params: {
  crabboxBin: string;
  cwd: string;
  env: NodeJS.ProcessEnv;
  leaseId: string;
  provider: string;
  runner: CommandRunner;
}) {
  const result = await runCommand({
    command: params.crabboxBin,
    args: ["inspect", "--provider", params.provider, "--id", params.leaseId, "--json"],
    cwd: params.cwd,
    env: params.env,
    runner: params.runner,
  });
  return JSON.parse(result.stdout) as CrabboxInspect;
}

async function stopCrabbox(params: {
  crabboxBin: string;
  cwd: string;
  env: NodeJS.ProcessEnv;
  leaseId: string;
  provider: string;
  runner: CommandRunner;
}) {
  await runCommand({
    command: params.crabboxBin,
    args: ["stop", "--provider", params.provider, params.leaseId],
    cwd: params.cwd,
    env: params.env,
    runner: params.runner,
    stdio: "inherit",
  });
}

function buildVisualDriverArgs(params: {
  browserUrl: string;
  crabboxBin: string;
  expectText?: string;
  leaseId: string;
  outputDir: string;
  provider: string;
  repoRoot: string;
  settleMs: number;
  visionMode: MantisVisualTaskVisionMode;
  visionModel?: string;
  visionPrompt: string;
  visionTimeoutMs: number;
}) {
  const args = [
    "--dir",
    params.repoRoot,
    "openclaw",
    "qa",
    "mantis",
    "visual-driver",
    "--repo-root",
    params.repoRoot,
    "--output-dir",
    params.outputDir,
    "--crabbox-bin",
    params.crabboxBin,
    "--provider",
    params.provider,
    "--lease-id",
    params.leaseId,
    "--browser-url",
    params.browserUrl,
    "--settle-ms",
    String(params.settleMs),
    "--vision-mode",
    params.visionMode,
    "--vision-prompt",
    params.visionPrompt,
    "--vision-timeout-ms",
    String(params.visionTimeoutMs),
  ];
  if (params.expectText) {
    args.push("--expect-text", params.expectText);
  }
  if (params.visionModel) {
    args.push("--vision-model", params.visionModel);
  }
  return args;
}

function parseImageDescribeText(stdout: string) {
  const parsed = parseJsonObjectFromText(
    stdout,
    (value): value is { outputs?: Array<{ text?: unknown }> } =>
      Boolean(
        value &&
          typeof value === "object" &&
          Array.isArray((value as { outputs?: unknown }).outputs),
      ),
  );
  if (!parsed) {
    throw new Error("Image describe did not return a JSON envelope with outputs.");
  }
  const text = parsed.outputs?.find((output) => typeof output.text === "string")?.text;
  if (typeof text !== "string" || text.trim().length === 0) {
    throw new Error("Image describe did not return output text.");
  }
  return text;
}

function parseJsonObjectFromText<T>(text: string, accepts: (value: unknown) => value is T) {
  const starts = [...text.matchAll(/\{/gu)]
    .map((match) => match.index)
    .filter((index) => index !== undefined);
  const ends = [...text.matchAll(/\}/gu)]
    .map((match) => match.index)
    .filter((index) => index !== undefined);
  for (const start of starts) {
    for (const end of ends.toReversed()) {
      if (end < start) {
        continue;
      }
      try {
        const parsed = JSON.parse(text.slice(start, end + 1)) as unknown;
        if (accepts(parsed)) {
          return parsed;
        }
      } catch {
        // Keep scanning: command wrappers can echo prompt schemas before the real JSON.
      }
    }
  }
  return undefined;
}

function parseVisionAssertion(text: string, expectText: string): VisionAssertion {
  const parsed = parseJsonObjectFromText(text, (value): value is Record<string, unknown> =>
    Boolean(value && typeof value === "object" && "visible" in value),
  );
  if (!parsed) {
    return {
      expectedText: expectText,
      matched: false,
      reason: "Image describe did not return a structured visual assertion.",
    };
  }
  const record = parsed;
  const visible = record.visible;
  const evidence = typeof record.evidence === "string" ? record.evidence.trim() : undefined;
  const reason = typeof record.reason === "string" ? record.reason.trim() : undefined;
  if (typeof visible !== "boolean") {
    return {
      evidence,
      expectedText: expectText,
      matched: false,
      reason: reason ?? "Image describe visual assertion is missing boolean visible.",
    };
  }
  const normalizedExpected = expectText.toLowerCase();
  const positiveEvidence = [evidence, reason]
    .filter((value): value is string => Boolean(value))
    .some((value) => value.toLowerCase().includes(normalizedExpected));
  return {
    evidence,
    expectedText: expectText,
    matched: visible && Boolean(evidence) && positiveEvidence,
    reason: positiveEvidence
      ? reason
      : (reason ?? `Visual assertion did not cite the expected text "${expectText}".`),
    visible,
  };
}

function evaluateVisualExpectation(text: string | undefined, expectText: string | undefined) {
  if (!expectText) {
    return { matched: true };
  }
  if (!text) {
    return {
      assertion: {
        expectedText: expectText,
        matched: false,
        reason: "Image describe did not return text.",
      },
      matched: false,
    };
  }
  const assertion = parseVisionAssertion(text, expectText);
  return { assertion, matched: assertion.matched };
}

function browserLaunchScript() {
  return [
    'browser="${BROWSER:-${CHROME_BIN:-google-chrome}}"',
    'profile="${TMPDIR:-/tmp}/openclaw-mantis-visual-chrome-profile"',
    'mkdir -p "$profile"',
    'exec "$browser" --user-data-dir="$profile" --no-first-run --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --window-size=1280,900 --window-position=0,0 "$0"',
  ].join("; ");
}

function renderReport(summary: MantisVisualTaskSummary) {
  const lines = [
    "# Mantis Visual Task",
    "",
    `Status: ${summary.status}`,
    `Browser URL: ${summary.browserUrl}`,
    `Vision mode: ${summary.visionMode}`,
    `Output: ${summary.outputDir}`,
    `Started: ${summary.startedAt}`,
    `Finished: ${summary.finishedAt}`,
    "",
    "## Crabbox",
    "",
    `- Provider: ${summary.crabbox.provider}`,
    `- Lease: ${summary.crabbox.id}${summary.crabbox.slug ? ` (${summary.crabbox.slug})` : ""}`,
    `- Created by run: ${summary.crabbox.createdLease}`,
    `- State: ${summary.crabbox.state ?? "unknown"}`,
    `- VNC: \`${summary.crabbox.vncCommand}\``,
    "",
    "## Artifacts",
    "",
    summary.artifacts.screenshotPath
      ? `- Screenshot: \`${path.basename(summary.artifacts.screenshotPath)}\``
      : "- Screenshot: missing",
    summary.artifacts.videoPath
      ? `- Video: \`${path.basename(summary.artifacts.videoPath)}\``
      : "- Video: missing",
    `- Driver result: \`${path.basename(summary.artifacts.driverResultPath)}\``,
    "",
    "## Vision",
    "",
    summary.driver?.vision.text ? summary.driver.vision.text : "No vision text recorded.",
    summary.driver?.expectText ? `Expected text: ${summary.driver.expectText}` : undefined,
    summary.driver?.vision.assertion?.visible !== undefined
      ? `Visible: ${summary.driver.vision.assertion.visible}`
      : undefined,
    summary.driver?.vision.assertion?.evidence
      ? `Evidence: ${summary.driver.vision.assertion.evidence}`
      : undefined,
    summary.driver?.vision.assertion?.reason
      ? `Reason: ${summary.driver.vision.assertion.reason}`
      : undefined,
    summary.driver?.matched !== undefined ? `Matched: ${summary.driver.matched}` : undefined,
    summary.recording.error ? `Recording error: ${summary.recording.error}` : undefined,
    summary.error ? `Error: ${summary.error}` : undefined,
    "",
  ].filter((line) => line !== undefined);
  return `${lines.join("\n")}\n`;
}

export async function runMantisVisualDriver(
  opts: MantisVisualDriverOptions = {},
): Promise<MantisVisualDriverResult> {
  const env = opts.env ?? process.env;
  const startedAt = new Date();
  const repoRoot = path.resolve(opts.repoRoot ?? process.cwd());
  const outputDir = await ensureRepoBoundDirectory(
    repoRoot,
    resolveMantisOutputDir(repoRoot, opts.outputDir, startedAt),
    "Mantis visual driver output directory",
    { mode: 0o755 },
  );
  const resultPath = path.join(outputDir, "mantis-visual-task-driver-result.json");
  const screenshotPath = path.join(outputDir, "visual-task.png");
  const crabboxBin = await resolveCrabboxBin({ env, explicit: opts.crabboxBin, repoRoot });
  const provider =
    trimToValue(opts.provider) ??
    trimToValue(env.CRABBOX_RECORD_PROVIDER) ??
    trimToValue(env[CRABBOX_PROVIDER_ENV]) ??
    DEFAULT_PROVIDER;
  const leaseId =
    trimToValue(opts.leaseId) ??
    trimToValue(env.CRABBOX_RECORD_LEASE_ID) ??
    trimToValue(env[CRABBOX_LEASE_ID_ENV]);
  if (!leaseId) {
    throw new Error("Mantis visual-driver needs --lease-id or CRABBOX_RECORD_LEASE_ID.");
  }
  const browserUrl = trimToValue(opts.browserUrl) ?? DEFAULT_BROWSER_URL;
  const visionMode = normalizeVisionMode(opts.visionMode);
  const expectText = trimToValue(opts.expectText);
  const visionPrompt = buildVisionPrompt(opts.visionPrompt, expectText);
  const visionTimeoutMs = opts.visionTimeoutMs ?? DEFAULT_VISION_TIMEOUT_MS;
  const runner = opts.commandRunner ?? defaultCommandRunner;
  let result: MantisVisualDriverResult;

  try {
    await runCommand({
      command: crabboxBin,
      args: [
        "desktop",
        "launch",
        "--provider",
        provider,
        "--id",
        leaseId,
        "--browser",
        "--url",
        browserUrl,
        "--reclaim",
        "--",
        "sh",
        "-lc",
        browserLaunchScript(),
      ],
      cwd: repoRoot,
      env,
      runner,
      stdio: "inherit",
    });
    await new Promise((resolve) => setTimeout(resolve, opts.settleMs ?? DEFAULT_SETTLE_MS));
    await runCommand({
      command: crabboxBin,
      args: [
        "screenshot",
        "--provider",
        provider,
        "--id",
        leaseId,
        "--output",
        screenshotPath,
        "--reclaim",
      ],
      cwd: repoRoot,
      env,
      runner,
      stdio: "inherit",
    });
    let visionText: string | undefined;
    if (visionMode === "image-describe") {
      const imageArgs = [
        "openclaw",
        "infer",
        "image",
        "describe",
        "--file",
        screenshotPath,
        "--prompt",
        visionPrompt,
        "--timeout-ms",
        String(visionTimeoutMs),
        "--json",
      ];
      const visionModel = trimToValue(opts.visionModel);
      if (visionModel) {
        imageArgs.push("--model", visionModel);
      }
      const described = await runCommand({
        command: "pnpm",
        args: ["--dir", repoRoot, ...imageArgs],
        cwd: repoRoot,
        env,
        runner,
      });
      visionText = parseImageDescribeText(described.stdout);
    }
    const { assertion, matched } = evaluateVisualExpectation(visionText, expectText);
    result = {
      browserUrl,
      expectText,
      finishedAt: new Date().toISOString(),
      matched,
      outputDir,
      screenshotPath,
      startedAt: startedAt.toISOString(),
      status: matched ? "pass" : "fail",
      vision: {
        assertion,
        mode: visionMode,
        model: trimToValue(opts.visionModel),
        prompt: visionPrompt,
        text: visionText,
        timeoutMs: visionTimeoutMs,
      },
    };
  } catch (error) {
    result = {
      browserUrl,
      error: formatErrorMessage(error),
      expectText,
      finishedAt: new Date().toISOString(),
      matched: false,
      outputDir,
      screenshotPath,
      startedAt: startedAt.toISOString(),
      status: "fail",
      vision: {
        mode: visionMode,
        model: trimToValue(opts.visionModel),
        prompt: visionPrompt,
        timeoutMs: visionTimeoutMs,
      },
    };
  }
  await fs.writeFile(resultPath, `${JSON.stringify(result, null, 2)}\n`, "utf8");
  return result;
}

export async function runMantisVisualTask(
  opts: MantisVisualTaskOptions = {},
): Promise<MantisVisualTaskResult> {
  const env = opts.env ?? process.env;
  const startedAt = (opts.now ?? (() => new Date()))();
  const repoRoot = path.resolve(opts.repoRoot ?? process.cwd());
  const outputDir = await ensureRepoBoundDirectory(
    repoRoot,
    resolveMantisOutputDir(repoRoot, opts.outputDir, startedAt),
    "Mantis visual task output directory",
    { mode: 0o755 },
  );
  const summaryPath = path.join(outputDir, "mantis-visual-task-summary.json");
  const reportPath = path.join(outputDir, "mantis-visual-task-report.md");
  const driverResultPath = path.join(outputDir, "mantis-visual-task-driver-result.json");
  const screenshotPath = path.join(outputDir, "visual-task.png");
  const videoPath = path.join(outputDir, "visual-task.mp4");
  const crabboxBin = await resolveCrabboxBin({ env, explicit: opts.crabboxBin, repoRoot });
  const provider =
    trimToValue(opts.provider) ?? trimToValue(env[CRABBOX_PROVIDER_ENV]) ?? DEFAULT_PROVIDER;
  const machineClass =
    trimToValue(opts.machineClass) ?? trimToValue(env[CRABBOX_CLASS_ENV]) ?? DEFAULT_CLASS;
  const idleTimeout =
    trimToValue(opts.idleTimeout) ??
    trimToValue(env[CRABBOX_IDLE_TIMEOUT_ENV]) ??
    DEFAULT_IDLE_TIMEOUT;
  const ttl = trimToValue(opts.ttl) ?? trimToValue(env[CRABBOX_TTL_ENV]) ?? DEFAULT_TTL;
  const explicitLeaseId = trimToValue(opts.leaseId) ?? trimToValue(env[CRABBOX_LEASE_ID_ENV]);
  const keepLease = opts.keepLease ?? isTruthyOptIn(env[CRABBOX_KEEP_ENV]);
  const createdLease = explicitLeaseId === undefined;
  const browserUrl = trimToValue(opts.browserUrl) ?? DEFAULT_BROWSER_URL;
  const expectText = trimToValue(opts.expectText);
  const visionMode = normalizeVisionMode(opts.visionMode);
  const visionPrompt = buildVisionPrompt(opts.visionPrompt, expectText);
  const runner = opts.commandRunner ?? defaultCommandRunner;
  let leaseId = explicitLeaseId;
  let inspected: CrabboxInspect = {};
  let summary: MantisVisualTaskSummary | undefined;

try {
|
||||
leaseId =
|
||||
leaseId ??
|
||||
(await warmupCrabbox({
|
||||
crabboxBin,
|
||||
cwd: repoRoot,
|
||||
env,
|
||||
idleTimeout,
|
||||
machineClass,
|
||||
provider,
|
||||
runner,
|
||||
ttl,
|
||||
}));
|
||||
inspected = await inspectCrabbox({
|
||||
crabboxBin,
|
||||
cwd: repoRoot,
|
||||
env,
|
||||
leaseId,
|
||||
provider,
|
||||
runner,
|
||||
});
|
||||
let recordingError: string | undefined;
|
||||
try {
|
||||
await runCommand({
|
||||
command: crabboxBin,
|
||||
args: [
|
||||
"record",
|
||||
"--provider",
|
||||
provider,
|
||||
"--id",
|
||||
leaseId,
|
||||
"--duration",
|
||||
trimToValue(opts.duration) ?? DEFAULT_DURATION,
|
||||
"--output",
|
||||
videoPath,
|
||||
"--while",
|
||||
"--",
|
||||
"pnpm",
|
||||
...buildVisualDriverArgs({
|
||||
browserUrl,
|
||||
crabboxBin,
|
||||
expectText,
|
||||
leaseId,
|
||||
outputDir,
|
||||
provider,
|
||||
repoRoot,
|
||||
settleMs: opts.settleMs ?? DEFAULT_SETTLE_MS,
|
||||
visionMode,
|
||||
visionModel: trimToValue(opts.visionModel),
|
||||
visionPrompt,
|
||||
visionTimeoutMs: opts.visionTimeoutMs ?? DEFAULT_VISION_TIMEOUT_MS,
|
||||
}),
|
||||
],
|
||||
cwd: repoRoot,
|
||||
env,
|
||||
runner,
|
||||
stdio: "inherit",
|
||||
});
|
||||
} catch (error) {
|
||||
if (!(await pathExists(driverResultPath))) {
|
||||
throw error;
|
||||
}
|
||||
recordingError = formatErrorMessage(error);
|
||||
}
|
||||
const driver = JSON.parse(
  await fs.readFile(driverResultPath, "utf8"),
) as MantisVisualDriverResult;
const copiedScreenshot = (await pathExists(screenshotPath)) ? screenshotPath : undefined;
const copiedVideo = (await nonEmptyFileExists(videoPath)) ? videoPath : undefined;
const recordingFailure =
  recordingError ??
  (copiedVideo ? undefined : "Mantis visual task recording did not produce visual-task.mp4.");
const status = driver.status === "pass" && !recordingFailure ? "pass" : "fail";
|
||||
summary = {
|
||||
artifacts: {
|
||||
driverResultPath,
|
||||
reportPath,
|
||||
screenshotPath: copiedScreenshot,
|
||||
summaryPath,
|
||||
videoPath: copiedVideo,
|
||||
},
|
||||
browserUrl,
|
||||
crabbox: {
|
||||
bin: crabboxBin,
|
||||
createdLease,
|
||||
id: leaseId,
|
||||
provider,
|
||||
slug: inspected.slug,
|
||||
state: inspected.state,
|
||||
vncCommand: `${crabboxBin} vnc --provider ${provider} --id ${leaseId} --open`,
|
||||
},
|
||||
driver,
|
||||
error: recordingFailure,
|
||||
finishedAt: new Date().toISOString(),
|
||||
outputDir,
|
||||
recording: {
|
||||
error: recordingFailure,
|
||||
required: true,
|
||||
},
|
||||
startedAt: startedAt.toISOString(),
|
||||
status,
|
||||
visionMode,
|
||||
};
|
||||
return {
|
||||
outputDir,
|
||||
reportPath,
|
||||
screenshotPath: copiedScreenshot,
|
||||
status,
|
||||
summaryPath,
|
||||
videoPath: copiedVideo,
|
||||
};
|
||||
} catch (error) {
|
||||
summary = {
|
||||
artifacts: {
|
||||
driverResultPath,
|
||||
reportPath,
|
||||
summaryPath,
|
||||
videoPath: (await pathExists(videoPath)) ? videoPath : undefined,
|
||||
},
|
||||
browserUrl,
|
||||
crabbox: {
|
||||
bin: crabboxBin,
|
||||
createdLease,
|
||||
id: leaseId ?? "unallocated",
|
||||
provider,
|
||||
slug: inspected.slug,
|
||||
state: inspected.state,
|
||||
vncCommand: leaseId
|
||||
? `${crabboxBin} vnc --provider ${provider} --id ${leaseId} --open`
|
||||
: "unallocated",
|
||||
},
|
||||
error: formatErrorMessage(error),
|
||||
finishedAt: new Date().toISOString(),
|
||||
outputDir,
|
||||
recording: {
|
||||
error: (await nonEmptyFileExists(videoPath)) ? undefined : "visual-task.mp4 missing",
|
||||
required: true,
|
||||
},
|
||||
startedAt: startedAt.toISOString(),
|
||||
status: "fail",
|
||||
visionMode,
|
||||
};
|
||||
await fs.writeFile(path.join(outputDir, "error.txt"), `${summary.error}\n`, "utf8");
|
||||
return {
|
||||
outputDir,
|
||||
reportPath,
|
||||
status: "fail",
|
||||
summaryPath,
|
||||
videoPath: summary.artifacts.videoPath,
|
||||
};
|
||||
} finally {
|
||||
if (summary) {
|
||||
summary.finishedAt = new Date().toISOString();
|
||||
await fs.writeFile(summaryPath, `${JSON.stringify(summary, null, 2)}\n`, "utf8");
|
||||
await fs.writeFile(reportPath, renderReport(summary), "utf8");
|
||||
}
|
||||
if (summary?.status === "pass" && createdLease && leaseId && !keepLease) {
|
||||
await stopCrabbox({ crabboxBin, cwd: repoRoot, env, leaseId, provider, runner });
|
||||
}
|
||||
}
|
||||
}
|
||||
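The Mantis runner above resolves each setting with the same precedence: explicit option first, then environment variable, then a built-in default. A minimal sketch of that pattern follows. The `trimToValue` body, the environment variable name, and the default value are assumptions for illustration, since the diff does not show them.

```typescript
// Hypothetical helper: the real trimToValue is not shown in this diff.
// Assumed behavior: blank or whitespace-only strings count as "not set".
function trimToValue(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Placeholder names and values, not the repository's constants.
const EXAMPLE_PROVIDER_ENV = "EXAMPLE_CRABBOX_PROVIDER";
const EXAMPLE_DEFAULT_PROVIDER = "default-provider";

// Option beats environment, environment beats default.
function resolveProvider(
  opts: { provider?: string },
  env: Record<string, string | undefined>,
): string {
  return (
    trimToValue(opts.provider) ??
    trimToValue(env[EXAMPLE_PROVIDER_ENV]) ??
    EXAMPLE_DEFAULT_PROVIDER
  );
}
```

Because `??` only falls through on `null` or `undefined`, normalizing blank strings to `undefined` first is what lets an empty flag or empty environment variable defer to the next source.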
@@ -221,48 +221,6 @@ describe("qa mock openai server", () => {
|
||||
expect(partialBody).toContain('"type":"response.output_text.delta"');
|
||||
expect(partialBody).toContain("QA_PARTIAL_OK");
|
||||
|
||||
const telegramLongResponse = await fetch(`${server.baseUrl}/v1/responses`, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
"content-type": "application/json",
|
||||
},
|
||||
body: JSON.stringify({
|
||||
stream: true,
|
||||
input: [
|
||||
makeUserInput("Telegram long final QA check. Use the scripted long final response."),
|
||||
],
|
||||
}),
|
||||
});
|
||||
expect(telegramLongResponse.status).toBe(200);
|
||||
const telegramLongBody = await telegramLongResponse.text();
|
||||
expect(telegramLongBody).toContain('"type":"response.output_text.delta"');
|
||||
expect(telegramLongBody).toContain('"phase":"final_answer"');
|
||||
expect(telegramLongBody).toContain("TELEGRAM-LONG-FINAL-BEGIN");
|
||||
expect(telegramLongBody).toContain("TELEGRAM-LONG-FINAL-END");
|
||||
expect(telegramLongBody.length).toBeGreaterThan(4_500);
|
||||
|
||||
const telegramThreeChunkLongResponse = await fetch(`${server.baseUrl}/v1/responses`, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
"content-type": "application/json",
|
||||
},
|
||||
body: JSON.stringify({
|
||||
stream: true,
|
||||
input: [
|
||||
makeUserInput(
|
||||
"Telegram long final three chunk QA check. Use the scripted three chunk final response.",
|
||||
),
|
||||
],
|
||||
}),
|
||||
});
|
||||
expect(telegramThreeChunkLongResponse.status).toBe(200);
|
||||
const telegramThreeChunkLongBody = await telegramThreeChunkLongResponse.text();
|
||||
expect(telegramThreeChunkLongBody).toContain('"type":"response.output_text.delta"');
|
||||
expect(telegramThreeChunkLongBody).toContain('"phase":"final_answer"');
|
||||
expect(telegramThreeChunkLongBody).toContain("TELEGRAM-LONG-FINAL-3CHUNK-BEGIN");
|
||||
expect(telegramThreeChunkLongBody).toContain("TELEGRAM-LONG-FINAL-3CHUNK-END");
|
||||
expect(telegramThreeChunkLongBody.length).toBeGreaterThan(8_000);
|
||||
|
||||
const blockResponse = await fetch(`${server.baseUrl}/v1/responses`, {
|
||||
method: "POST",
|
||||
headers: {
|
||||
|
||||
@@ -153,8 +153,6 @@ const QA_GROUP_VISIBLE_REPLY_TOOL_PROMPT_RE = /qa group visible reply tool check
|
||||
const QA_GROUP_MESSAGE_UNAVAILABLE_FALLBACK_PROMPT_RE =
|
||||
/qa group message unavailable fallback check/i;
|
||||
const QA_TELEGRAM_CURRENT_SESSION_STATUS_PROMPT_RE = /telegram current session_status qa check/i;
|
||||
const QA_TELEGRAM_LONG_FINAL_THREE_CHUNK_PROMPT_RE = /telegram long final three chunk qa check/i;
|
||||
const QA_TELEGRAM_LONG_FINAL_PROMPT_RE = /telegram long final qa check/i;
|
||||
const QA_SUBAGENT_DIRECT_FALLBACK_PROMPT_RE = /subagent direct fallback qa check/i;
|
||||
const QA_SUBAGENT_DIRECT_FALLBACK_WORKER_RE = /subagent direct fallback worker/i;
|
||||
const QA_SUBAGENT_DIRECT_FALLBACK_MARKER = "QA-SUBAGENT-DIRECT-FALLBACK-OK";
|
||||
@@ -1036,23 +1034,6 @@ function splitMockStreamingText(text: string, parts = 3) {
|
||||
return chunks.length > 1 ? chunks : [text.slice(0, 1), text.slice(1)];
|
||||
}
|
||||
|
||||
function buildTelegramLongFinalText({
  endMarker = "TELEGRAM-LONG-FINAL-END",
  segmentCount = 54,
  startMarker = "TELEGRAM-LONG-FINAL-BEGIN",
}: {
  endMarker?: string;
  segmentCount?: number;
  startMarker?: string;
} = {}) {
  const body = Array.from(
    { length: segmentCount },
    (_, index) =>
      `telegram-long-final-segment-${String(index + 1).padStart(3, "0")} ${"x".repeat(54)}`,
  ).join("\n");
  return `${startMarker}\n${body}\n${endMarker}`;
}
|
||||
|
||||
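The segment counts 54 and 96 appear chosen so the generated text needs two and three messages respectively. The sketch below reproduces the helper from this hunk and checks the arithmetic. The 4096-character limit is an assumption borrowed from the `textLimit: 4096` value used elsewhere in this diff.

```typescript
// Reproduced from the hunk above so the lengths can be checked in isolation.
function buildTelegramLongFinalText({
  endMarker = "TELEGRAM-LONG-FINAL-END",
  segmentCount = 54,
  startMarker = "TELEGRAM-LONG-FINAL-BEGIN",
}: {
  endMarker?: string;
  segmentCount?: number;
  startMarker?: string;
} = {}): string {
  const body = Array.from(
    { length: segmentCount },
    (_, index) =>
      `telegram-long-final-segment-${String(index + 1).padStart(3, "0")} ${"x".repeat(54)}`,
  ).join("\n");
  return `${startMarker}\n${body}\n${endMarker}`;
}

// Assumed per-message limit, taken from textLimit: 4096 elsewhere in the diff.
const ASSUMED_TEXT_LIMIT = 4096;

const defaultText = buildTelegramLongFinalText();
const threeChunkText = buildTelegramLongFinalText({
  endMarker: "TELEGRAM-LONG-FINAL-3CHUNK-END",
  segmentCount: 96,
  startMarker: "TELEGRAM-LONG-FINAL-3CHUNK-BEGIN",
});

// Each segment line is 28 (prefix) + 3 (index) + 1 (space) + 54 ("x") = 86 characters.
// Default:     25 + 1 + (54 * 86 + 53) + 1 + 23 = 4747 characters.
// Three-chunk: 32 + 1 + (96 * 86 + 95) + 1 + 30 = 8415 characters.
const defaultChunks = Math.ceil(defaultText.length / ASSUMED_TEXT_LIMIT);
const threeChunkChunks = Math.ceil(threeChunkText.length / ASSUMED_TEXT_LIMIT);
console.log(defaultText.length, defaultChunks, threeChunkText.length, threeChunkChunks);
```

The chunk counts here are a lower bound from plain division. A real splitter that breaks on newlines or Markdown boundaries may produce more chunks.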
function buildAssistantOutputItem(spec: MockAssistantMessageSpec) {
|
||||
return {
|
||||
type: "message",
|
||||
@@ -1329,32 +1310,6 @@ async function buildResponsesPayload(
|
||||
}
|
||||
return buildAssistantEvents("");
|
||||
}
|
||||
if (QA_TELEGRAM_LONG_FINAL_THREE_CHUNK_PROMPT_RE.test(allInputText)) {
  const text = buildTelegramLongFinalText({
    endMarker: "TELEGRAM-LONG-FINAL-3CHUNK-END",
    segmentCount: 96,
    startMarker: "TELEGRAM-LONG-FINAL-3CHUNK-BEGIN",
  });
  return buildAssistantEvents([
    {
      id: "msg_mock_telegram_long_final_three_chunk",
      phase: "final_answer",
      streamDeltas: splitMockStreamingText(text),
      text,
    },
  ]);
}
if (QA_TELEGRAM_LONG_FINAL_PROMPT_RE.test(allInputText)) {
  const text = buildTelegramLongFinalText();
  return buildAssistantEvents([
    {
      id: "msg_mock_telegram_long_final",
      phase: "final_answer",
      streamDeltas: splitMockStreamingText(text),
      text,
    },
  ]);
}
|
||||
if (QA_STREAMING_PROMPT_RE.test(allInputText) && exactReplyDirective) {
|
||||
return buildAssistantEvents([
|
||||
{
|
||||
|
||||
@@ -16,6 +16,7 @@ const reactSlackMessage = vi.fn(async (..._args: unknown[]) => ({}));
|
||||
const readSlackMessages = vi.fn(async (..._args: unknown[]) => ({}));
|
||||
const removeOwnSlackReactions = vi.fn(async (..._args: unknown[]) => ["thumbsup"]);
|
||||
const removeSlackReaction = vi.fn(async (..._args: unknown[]) => ({}));
|
||||
const recordSlackThreadParticipation = vi.fn();
|
||||
const sendSlackMessage = vi.fn(async (..._args: unknown[]) => ({ channelId: "C123" }));
|
||||
const unpinSlackMessage = vi.fn(async (..._args: unknown[]) => ({}));
|
||||
|
||||
@@ -102,6 +103,7 @@ describe("handleSlackAction", () => {
|
||||
pinSlackMessage,
|
||||
reactSlackMessage,
|
||||
readSlackMessages,
|
||||
recordSlackThreadParticipation,
|
||||
removeOwnSlackReactions,
|
||||
removeSlackReaction,
|
||||
sendSlackMessage,
|
||||
|
||||
@@ -12,6 +12,7 @@ import {
|
||||
type OpenClawConfig,
|
||||
withNormalizedTimestamp,
|
||||
} from "./runtime-api.js";
|
||||
import { recordSlackThreadParticipation } from "./sent-thread-cache.js";
|
||||
import { parseSlackTarget, resolveSlackChannelId } from "./targets.js";
|
||||
|
||||
const messagingActions = new Set([
|
||||
@@ -77,6 +78,7 @@ export const slackActionRuntime = {
|
||||
pinSlackMessage: createLazySlackAction("pinSlackMessage"),
|
||||
reactSlackMessage: createLazySlackAction("reactSlackMessage"),
|
||||
readSlackMessages: createLazySlackAction("readSlackMessages"),
|
||||
recordSlackThreadParticipation,
|
||||
removeOwnSlackReactions: createLazySlackAction("removeOwnSlackReactions"),
|
||||
removeSlackReaction: createLazySlackAction("removeSlackReaction"),
|
||||
sendSlackMessage: createLazySlackAction("sendSlackMessage"),
|
||||
@@ -271,6 +273,14 @@ export async function handleSlackAction(
|
||||
blocks,
|
||||
});
|
||||
|
||||
if (threadTs && result.channelId && account.accountId) {
  slackActionRuntime.recordSlackThreadParticipation(
    account.accountId,
    result.channelId,
    threadTs,
  );
}
|
||||
|
||||
// Keep "first" mode consistent even when the agent explicitly provided
|
||||
// threadTs: once we send a message to the current channel, consider the
|
||||
// first reply "used" so later tool calls don't auto-thread again.
|
||||
@@ -308,6 +318,14 @@ export async function handleSlackAction(
|
||||
...(title ? { uploadTitle: title } : {}),
|
||||
});
|
||||
|
||||
if (threadTs && result.channelId && account.accountId) {
  slackActionRuntime.recordSlackThreadParticipation(
    account.accountId,
    result.channelId,
    threadTs,
  );
}
|
||||
|
||||
if (context?.hasRepliedRef && context.currentChannelId) {
|
||||
if (sameSlackChannelTarget(to, context.currentChannelId)) {
|
||||
context.hasRepliedRef.value = true;
|
||||
|
||||
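Both hunks above record thread participation only when a thread timestamp, a resolved channel ID, and an account ID are all present. The sketch below shows that guard against a simple in-memory store. The store and its function names are hypothetical, because the contents of `sent-thread-cache.js` are not shown in this diff.

```typescript
// Hypothetical in-memory store; the real cache may add TTLs or size limits.
const exampleParticipation = new Set<string>();

function exampleKey(accountId: string, channelId: string, threadTs: string): string {
  return `${accountId}:${channelId}:${threadTs}`;
}

function exampleRecord(accountId: string, channelId: string, threadTs: string): void {
  exampleParticipation.add(exampleKey(accountId, channelId, threadTs));
}

function exampleHas(accountId: string, channelId: string, threadTs: string): boolean {
  return exampleParticipation.has(exampleKey(accountId, channelId, threadTs));
}

// Mirrors the guard in the hunks: skip recording unless all three values exist.
function recordAfterSend(
  account: { accountId?: string },
  result: { channelId?: string },
  threadTs: string | undefined,
): void {
  if (threadTs && result.channelId && account.accountId) {
    exampleRecord(account.accountId, result.channelId, threadTs);
  }
}

recordAfterSend({ accountId: "default" }, { channelId: "C123" }, "1712345678.123456");
recordAfterSend({ accountId: "default" }, { channelId: "C123" }, undefined);
```

Keying on all three values keeps participation scoped per account and per channel, so the same timestamp in a different channel is not treated as a known thread.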
@@ -92,25 +92,6 @@ describe("slack outbound shared hook wiring", () => {
|
||||
expect(sendMessageSlackMock).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it("passes replyToId as Slack threadTs for threaded outbound delivery", async () => {
  await deliverOutboundPayloads({
    cfg,
    channel: "slack",
    to: "C123",
    payloads: [{ text: "hello" }],
    accountId: "default",
    replyToId: "1712000000.000001",
  });

  expect(sendMessageSlackMock).toHaveBeenCalledWith(
    "C123",
    "hello",
    expect.objectContaining({
      threadTs: "1712000000.000001",
    }),
  );
});
|
||||
|
||||
it("respects cancel from the shared hook without a second adapter pass", async () => {
|
||||
const hookRegistry = createEmptyPluginRegistry();
|
||||
const handler = vi.fn().mockResolvedValue({ cancel: true });
|
||||
|
||||
@@ -1,9 +1,5 @@
|
||||
import { describe, expect, it } from "vitest";
|
||||
import { createSlackSendTestClient, installSlackBlockTestMocks } from "./blocks.test-helpers.js";
|
||||
import {
|
||||
clearSlackThreadParticipationCache,
|
||||
hasSlackThreadParticipation,
|
||||
} from "./sent-thread-cache.js";
|
||||
|
||||
installSlackBlockTestMocks();
|
||||
const { sendMessageSlack } = await import("./send.js");
|
||||
@@ -71,49 +67,6 @@ describe("sendMessageSlack NO_REPLY guard", () => {
|
||||
});
|
||||
});
|
||||
|
||||
describe("sendMessageSlack thread participation", () => {
  it("records participation after a successful threaded send", async () => {
    clearSlackThreadParticipationCache();
    const client = createSlackSendTestClient();

    await sendMessageSlack("channel:C123", "hello thread", {
      token: "xoxb-test",
      cfg: SLACK_TEST_CFG,
      client,
      threadTs: "1712345678.123456",
    });

    expect(hasSlackThreadParticipation("default", "C123", "1712345678.123456")).toBe(true);
  });

  it("does not record participation for unthreaded sends", async () => {
    clearSlackThreadParticipationCache();
    const client = createSlackSendTestClient();

    await sendMessageSlack("channel:C123", "hello channel", {
      token: "xoxb-test",
      cfg: SLACK_TEST_CFG,
      client,
    });

    expect(hasSlackThreadParticipation("default", "C123", "1712345678.123456")).toBe(false);
  });

  it("does not record participation for invalid thread ids", async () => {
    clearSlackThreadParticipationCache();
    const client = createSlackSendTestClient();

    await sendMessageSlack("channel:C123", "hello invalid thread", {
      token: "xoxb-test",
      cfg: SLACK_TEST_CFG,
      client,
      threadTs: "not-a-slack-thread",
    });

    expect(hasSlackThreadParticipation("default", "C123", "not-a-slack-thread")).toBe(false);
  });
});
|
||||
|
||||
describe("sendMessageSlack chunking", () => {
|
||||
it("keeps 4205-character text in a single Slack post by default", async () => {
|
||||
const client = createSlackSendTestClient();
|
||||
|
||||
@@ -24,9 +24,7 @@ import { createSlackTokenCacheKey, getSlackWriteClient } from "./client.js";
|
||||
import { markdownToSlackMrkdwnChunks } from "./format.js";
|
||||
import { SLACK_TEXT_LIMIT } from "./limits.js";
|
||||
import { loadOutboundMediaFromUrl } from "./runtime-api.js";
|
||||
import { recordSlackThreadParticipation } from "./sent-thread-cache.js";
|
||||
import { parseSlackTarget } from "./targets.js";
|
||||
import { normalizeSlackThreadTsCandidate } from "./thread-ts.js";
|
||||
import { resolveSlackBotToken } from "./token.js";
|
||||
import { truncateSlackText } from "./truncate.js";
|
||||
const SLACK_UPLOAD_SSRF_POLICY = {
|
||||
@@ -537,7 +535,7 @@ export async function sendMessageSlack(
|
||||
recipient,
|
||||
threadTs: opts.threadTs,
|
||||
});
|
||||
-const result = await runQueuedSlackSend(queueKey, () =>
+return await runQueuedSlackSend(queueKey, () =>
|
||||
sendMessageSlackQueued({
|
||||
trimmedMessage,
|
||||
opts,
|
||||
@@ -548,11 +546,6 @@ export async function sendMessageSlack(
|
||||
blocks,
|
||||
}),
|
||||
);
|
||||
  const threadTs = normalizeSlackThreadTsCandidate(opts.threadTs);
  if (threadTs && result.channelId && account.accountId) {
    recordSlackThreadParticipation(account.accountId, result.channelId, threadTs);
  }
  return result;
}
|
||||
|
||||
async function sendMessageSlackQueued(params: {
|
||||
|
||||
@@ -1,10 +1,6 @@
|
||||
import type { WebClient } from "@slack/web-api";
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
import { installSlackBlockTestMocks } from "./blocks.test-helpers.js";
|
||||
import {
|
||||
clearSlackThreadParticipationCache,
|
||||
hasSlackThreadParticipation,
|
||||
} from "./sent-thread-cache.js";
|
||||
|
||||
// --- Module mocks (must precede dynamic import) ---
|
||||
installSlackBlockTestMocks();
|
||||
@@ -100,7 +96,6 @@ describe("sendMessageSlack file upload with user IDs", () => {
|
||||
loadOutboundMediaFromUrlMock.mockClear();
|
||||
clearSlackDmChannelCache();
|
||||
clearSlackSendQueuesForTest();
|
||||
clearSlackThreadParticipationCache();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
@@ -302,7 +297,6 @@ describe("sendMessageSlack file upload with user IDs", () => {
|
||||
thread_ts: "171.222",
|
||||
}),
|
||||
);
|
||||
expect(hasSlackThreadParticipation("default", "C123CHAN", "171.222")).toBe(true);
|
||||
});
|
||||
|
||||
it("uses explicit upload filename and title overrides when provided", async () => {
|
||||
|
||||
@@ -373,7 +373,6 @@ describe("dispatchTelegramMessage draft streaming", () => {
|
||||
telegramDeps?: TelegramBotDeps;
|
||||
bot?: Bot;
|
||||
replyToMode?: Parameters<typeof dispatchTelegramMessage>[0]["replyToMode"];
|
||||
textLimit?: number;
|
||||
}) {
|
||||
const bot = params.bot ?? createBot();
|
||||
await dispatchTelegramMessage({
|
||||
@@ -383,7 +382,7 @@ describe("dispatchTelegramMessage draft streaming", () => {
|
||||
runtime: createRuntime(),
|
||||
replyToMode: params.replyToMode ?? "first",
|
||||
streamMode: params.streamMode ?? "partial",
|
||||
-textLimit: params.textLimit ?? 4096,
+textLimit: 4096,
|
||||
telegramCfg: params.telegramCfg ?? {},
|
||||
telegramDeps: params.telegramDeps ?? telegramDepsForTest,
|
||||
opts: { token: "token" },
|
||||
@@ -1577,89 +1576,6 @@ describe("dispatchTelegramMessage draft streaming", () => {
|
||||
);
|
||||
});
|
||||
|
||||
it("uses the active preview as the first chunk for long text finals", async () => {
  const answerDraftStream = createSequencedDraftStream(1001);
  const reasoningDraftStream = createDraftStream();
  createTelegramDraftStream
    .mockImplementationOnce(() => answerDraftStream)
    .mockImplementationOnce(() => reasoningDraftStream);
  const finalText = `${"A".repeat(70)}${"B".repeat(70)}`;
  dispatchReplyWithBufferedBlockDispatcher.mockImplementation(
    async ({ dispatcherOptions, replyOptions }) => {
      await replyOptions?.onPartialReply?.({ text: "Working preview" });
      await dispatcherOptions.deliver({ text: finalText, replyToId: "456" }, { kind: "final" });
      return { queuedFinal: true };
    },
  );
  deliverReplies.mockResolvedValue({ delivered: true });
  editMessageTelegram.mockResolvedValue({ ok: true, chatId: "123", messageId: "1001" });

  await dispatchWithContext({
    context: createContext(),
    streamMode: "partial",
    textLimit: 80,
  });

  const editedText = editMessageTelegram.mock.calls[0]?.[2] as string;
  const followUpText =
    (deliverReplies.mock.calls[0]?.[0] as { replies?: Array<{ text?: string }> })?.replies?.[0]
      ?.text ?? "";

  expect(editMessageTelegram).toHaveBeenCalledTimes(1);
  expect(editedText.length).toBeLessThanOrEqual(80);
  expect(followUpText.length).toBeGreaterThan(0);
  expect(`${editedText}${followUpText}`).toBe(finalText);
  expect(deliverReplies).toHaveBeenCalledTimes(1);
  expect(deliverReplies).toHaveBeenCalledWith(
    expect.objectContaining({
      replies: [expect.not.objectContaining({ replyToId: expect.any(String) })],
    }),
  );
  expect(answerDraftStream.clear).not.toHaveBeenCalled();
});
|
||||
|
||||
it("uses the active preview as the first chunk for three-chunk long text finals", async () => {
  const answerDraftStream = createSequencedDraftStream(1001);
  const reasoningDraftStream = createDraftStream();
  createTelegramDraftStream
    .mockImplementationOnce(() => answerDraftStream)
    .mockImplementationOnce(() => reasoningDraftStream);
  const finalText = `${"A".repeat(70)}${"B".repeat(70)}${"C".repeat(70)}`;
  dispatchReplyWithBufferedBlockDispatcher.mockImplementation(
    async ({ dispatcherOptions, replyOptions }) => {
      await replyOptions?.onPartialReply?.({ text: "Working preview" });
      await dispatcherOptions.deliver({ text: finalText, replyToId: "456" }, { kind: "final" });
      return { queuedFinal: true };
    },
  );
  deliverReplies.mockResolvedValue({ delivered: true });
  editMessageTelegram.mockResolvedValue({ ok: true, chatId: "123", messageId: "1001" });

  await dispatchWithContext({
    context: createContext(),
    streamMode: "partial",
    textLimit: 80,
  });

  const editedText = editMessageTelegram.mock.calls[0]?.[2] as string;
  const followUpReplies =
    (deliverReplies.mock.calls[0]?.[0] as { replies?: Array<{ text?: string }> })?.replies ?? [];
  const followUpText = followUpReplies.map((reply) => reply.text ?? "").join("");

  expect(editMessageTelegram).toHaveBeenCalledTimes(1);
  expect(editedText.length).toBeLessThanOrEqual(80);
  expect(followUpReplies).toHaveLength(1);
  expect(followUpText.length).toBeGreaterThan(80);
  expect(`${editedText}${followUpText}`).toBe(finalText);
  expect(deliverReplies).toHaveBeenCalledTimes(1);
  expect(deliverReplies).toHaveBeenCalledWith(
    expect.objectContaining({
      replies: [expect.not.objectContaining({ replyToId: expect.any(String) })],
    }),
  );
  expect(answerDraftStream.clear).not.toHaveBeenCalled();
});
|
||||
|
||||
it("does not force new message on first assistant message start", async () => {
|
||||
const draftStream = createDraftStream(999);
|
||||
createTelegramDraftStream.mockReturnValue(draftStream);
|
||||
|
||||
@@ -28,7 +28,6 @@ import {
|
||||
createOutboundPayloadPlan,
|
||||
projectOutboundPayloadPlanForDelivery,
|
||||
} from "openclaw/plugin-sdk/outbound-runtime";
|
||||
import { chunkMarkdownTextWithMode } from "openclaw/plugin-sdk/reply-chunking";
|
||||
import { clearHistoryEntriesIfEnabled } from "openclaw/plugin-sdk/reply-history";
|
||||
import { resolveSendableOutboundReplyParts } from "openclaw/plugin-sdk/reply-payload";
|
||||
import type { ReplyPayload } from "openclaw/plugin-sdk/reply-payload";
|
||||
@@ -76,7 +75,7 @@ import {
|
||||
shouldSuppressTelegramError,
|
||||
} from "./error-policy.js";
|
||||
import { shouldSuppressLocalTelegramExecApprovalPrompt } from "./exec-approvals.js";
|
||||
-import { markdownToTelegramChunks, renderTelegramHtmlText } from "./format.js";
+import { renderTelegramHtmlText } from "./format.js";
|
||||
import {
|
||||
type ArchivedPreview,
|
||||
createLaneDeliveryStateTracker,
|
||||
@@ -785,27 +784,6 @@ export const dispatchTelegramMessage = async ({
|
||||
}
|
||||
return { ...payload, text };
|
||||
};
|
||||
const applyTextToFollowUpPayload = (payload: ReplyPayload, text: string): ReplyPayload => {
  const next = applyTextToPayload(payload, text);
  const {
    replyToId: _replyToId,
    replyToCurrent: _replyToCurrent,
    replyToTag: _replyToTag,
    ...followUp
  } = next;
  return followUp;
};
|
||||
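`applyTextToFollowUpPayload` uses object rest destructuring to drop the reply-target fields, so a follow-up chunk is sent as a plain message rather than as another reply. A minimal sketch of the idiom follows, with a simplified stand-in type, since the full `ReplyPayload` definition is not part of this diff.

```typescript
// Simplified stand-in for ReplyPayload; the field types are assumptions.
type ExamplePayload = {
  text?: string;
  replyToId?: string;
  replyToCurrent?: boolean;
  replyToTag?: boolean;
};

// Replace the text, then strip every reply-target field via rest destructuring.
function toFollowUpPayload(payload: ExamplePayload, text: string): ExamplePayload {
  const {
    replyToId: _replyToId,
    replyToCurrent: _replyToCurrent,
    replyToTag: _replyToTag,
    ...followUp
  } = { ...payload, text };
  return followUp;
}

const followUpExample = toFollowUpPayload(
  { text: "first chunk", replyToId: "456", replyToCurrent: true },
  "remaining text",
);
```

Destructuring removes the keys entirely instead of setting them to `undefined`, which matters for assertions such as `expect.not.objectContaining({ replyToId: expect.any(String) })` in the tests of this diff.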
const splitFinalTextForPreview = (text: string): string[] => {
  const markdownChunks =
    chunkMode === "newline"
      ? chunkMarkdownTextWithMode(text, draftMaxChars, chunkMode)
      : [text];
  return markdownChunks.flatMap((chunk) =>
    markdownToTelegramChunks(chunk, draftMaxChars, { tableMode }).map(
      (telegramChunk) => telegramChunk.text,
    ),
  );
};
|
||||
const applyQuoteReplyTarget = (payload: ReplyPayload): ReplyPayload => {
|
||||
if (
|
||||
!implicitQuoteReplyTargetId ||
|
||||
@@ -858,8 +836,6 @@ export const dispatchTelegramMessage = async ({
|
||||
retainPreviewOnCleanupByLane,
|
||||
draftMaxChars,
|
||||
applyTextToPayload,
|
||||
applyTextToFollowUpPayload,
|
||||
splitFinalTextForPreview,
|
||||
sendPayload,
|
||||
flushDraftLane,
|
||||
stopDraftLane: async (lane) => {
|
||||
|
||||
@@ -81,8 +81,6 @@ type CreateLaneTextDelivererParams = {
|
||||
retainPreviewOnCleanupByLane: Record<LaneName, boolean>;
|
||||
draftMaxChars: number;
|
||||
applyTextToPayload: (payload: ReplyPayload, text: string) => ReplyPayload;
|
||||
applyTextToFollowUpPayload?: (payload: ReplyPayload, text: string) => ReplyPayload;
|
||||
splitFinalTextForPreview?: (text: string) => readonly string[];
|
||||
sendPayload: (payload: ReplyPayload) => Promise<boolean>;
|
||||
flushDraftLane: (lane: DraftLaneState) => Promise<void>;
|
||||
stopDraftLane: (lane: DraftLaneState) => Promise<void>;
|
||||
@@ -119,7 +117,7 @@ type TryUpdatePreviewParams = {
|
||||
previewButtons?: TelegramInlineButtons;
|
||||
stopBeforeEdit?: boolean;
|
||||
updateLaneSnapshot?: boolean;
|
||||
-skipRegressive: RegressiveSkipMode;
+skipRegressive: "always" | "existingOnly";
|
||||
context: "final" | "update";
|
||||
previewMessageId?: number;
|
||||
previewTextSnapshot?: string;
|
||||
@@ -136,7 +134,7 @@ type ConsumeArchivedAnswerPreviewParams = {
|
||||
};
|
||||
|
||||
type PreviewUpdateContext = "final" | "update";
|
||||
-type RegressiveSkipMode = "always" | "existingOnly" | "never";
+type RegressiveSkipMode = "always" | "existingOnly";
|
||||
|
||||
type ResolvePreviewTargetParams = {
|
||||
lane: DraftLaneState;
|
||||
@@ -171,9 +169,6 @@ function shouldSkipRegressivePreviewUpdate(args: {
|
||||
if (currentPreviewText === undefined) {
|
||||
return false;
|
||||
}
|
||||
if (args.skipRegressive === "never") {
|
||||
return false;
|
||||
}
|
||||
return (
|
||||
currentPreviewText.startsWith(args.text) &&
|
||||
args.text.length < currentPreviewText.length &&
|
||||
@@ -189,26 +184,6 @@ function isLongLivedPreview(visibleSinceMs: number | undefined, nowMs: number):
|
||||
);
|
||||
}
|
||||
|
||||
function compactPreviewFinalChunks(chunks: readonly string[]): string[] {
  const result: string[] = [];
  let pendingWhitespace = "";
  for (const chunk of chunks) {
    if (!chunk) {
      continue;
    }
    if (chunk.trim().length === 0) {
      pendingWhitespace += chunk;
      continue;
    }
    result.push(`${pendingWhitespace}${chunk}`);
    pendingWhitespace = "";
  }
  if (pendingWhitespace && result.length > 0) {
    result[result.length - 1] = `${result[result.length - 1]}${pendingWhitespace}`;
  }
  return result;
}
|
||||
|
||||
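`compactPreviewFinalChunks` folds empty and whitespace-only chunks into their neighbors, so no chunk is blank while the concatenation of all chunks stays unchanged. The worked example below reproduces the function under a shorter, illustrative name to show that behavior on a small input.

```typescript
// Same logic as compactPreviewFinalChunks above, renamed for the example.
function compactChunks(chunks: readonly string[]): string[] {
  const result: string[] = [];
  let pendingWhitespace = "";
  for (const chunk of chunks) {
    if (!chunk) {
      continue; // drop empty strings
    }
    if (chunk.trim().length === 0) {
      pendingWhitespace += chunk; // hold whitespace for the next real chunk
      continue;
    }
    result.push(`${pendingWhitespace}${chunk}`);
    pendingWhitespace = "";
  }
  if (pendingWhitespace && result.length > 0) {
    // Trailing whitespace attaches to the last real chunk.
    result[result.length - 1] = `${result[result.length - 1]}${pendingWhitespace}`;
  }
  return result;
}

const compactInput = ["a", " ", "b", "", "\n"];
const compactOutput = compactChunks(compactInput);
console.log(JSON.stringify(compactOutput)); // ["a"," b\n"]
```

Whitespace before the first real chunk is prepended to it, and an input with only whitespace yields an empty array, which the caller treats as "no first chunk".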
function resolvePreviewTarget(params: ResolvePreviewTargetParams): PreviewTargetResolution {
|
||||
const lanePreviewMessageId = params.lane.stream?.messageId();
|
||||
const previewMessageId =
|
||||
@@ -252,10 +227,6 @@ export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
const shouldUseFreshFinalForPreview = (lane: DraftLaneState, visibleSinceMs?: number) =>
|
||||
isMessagePreviewLane(lane) &&
|
||||
(isLongLivedPreview(visibleSinceMs, readNow()) || wasVisiblyOverwrittenSince(visibleSinceMs));
|
||||
const buildFollowUpPayload = (payload: ReplyPayload, text: string) =>
  params.applyTextToFollowUpPayload
    ? params.applyTextToFollowUpPayload(payload, text)
    : params.applyTextToPayload(payload, text);
|
||||
const clearActivePreviewAfterFreshFinal = async (lane: DraftLaneState, laneName: LaneName) => {
|
||||
try {
|
||||
await lane.stream?.clear();
|
||||
@@ -359,56 +330,6 @@ export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
return "fallback";
|
||||
}
|
||||
};
|
||||
const tryDeliverLongFinalThroughPreview = async (args: {
  lane: DraftLaneState;
  laneName: LaneName;
  text: string;
  payload: ReplyPayload;
  previewButtons?: TelegramInlineButtons;
}): Promise<LaneDeliveryResult | undefined> => {
  if (
    !args.lane.stream ||
    args.previewButtons !== undefined ||
    params.activePreviewLifecycleByLane[args.laneName] !== "transient"
  ) {
    return undefined;
  }
  const chunks = compactPreviewFinalChunks(params.splitFinalTextForPreview?.(args.text) ?? []);
  const [firstChunk, ...remainingChunks] = chunks;
  if (!firstChunk || remainingChunks.length === 0 || firstChunk.length > params.draftMaxChars) {
    return undefined;
  }
  await params.flushDraftLane(args.lane);
  const previewMessageId = args.lane.stream.messageId();
  if (typeof previewMessageId !== "number") {
    return undefined;
  }
  const finalized = await tryUpdatePreviewForLane({
    lane: args.lane,
    laneName: args.laneName,
    text: firstChunk,
    stopBeforeEdit: true,
    updateLaneSnapshot: true,
    skipRegressive: "never",
    context: "final",
  });
  if (finalized === "fallback") {
    return undefined;
  }
  if (finalized === "retained") {
    markActivePreviewComplete(args.laneName);
    return result("preview-retained");
  }
  markActivePreviewComplete(args.laneName);
  const remainingText = remainingChunks.join("");
  if (remainingText.trim().length > 0) {
    await params.sendPayload(buildFollowUpPayload(args.payload, remainingText));
  }
  return result("preview-finalized", {
    content: args.text,
    messageId: previewMessageId,
  });
};
|
||||
|
||||
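The core decision in `tryDeliverLongFinalThroughPreview` is whether the chunked final can reuse the preview message: the first chunk is edited into the preview and the rest is joined into one follow-up send. The sketch below isolates that planning step as a pure function. `planLongFinal` and `LongFinalPlan` are illustrative names that do not appear in the diff, and the stream, flush, and edit steps are left out.

```typescript
// Illustrative result type; not part of the original code.
type LongFinalPlan = { previewText: string; followUpText: string };

// Returns undefined when the caller should fall back to a standard send,
// mirroring the early-return guard on firstChunk and remainingChunks.
function planLongFinal(
  chunks: readonly string[],
  draftMaxChars: number,
): LongFinalPlan | undefined {
  const [firstChunk, ...remainingChunks] = chunks;
  if (!firstChunk || remainingChunks.length === 0 || firstChunk.length > draftMaxChars) {
    return undefined;
  }
  return { previewText: firstChunk, followUpText: remainingChunks.join("") };
}

const twoChunkPlan = planLongFinal(["A".repeat(70), "B".repeat(70)], 80);
const singleChunkPlan = planLongFinal(["short"], 80);
const oversizedPlan = planLongFinal(["x".repeat(81), "y"], 80);
```

Joining the remaining chunks into one follow-up leaves any further splitting to the normal send path, which matches the three-chunk test in this diff expecting a single follow-up reply longer than the limit.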
const tryUpdatePreviewForLane = async ({
|
||||
lane,
|
||||
@@ -675,16 +596,6 @@ export function createLaneTextDeliverer(params: CreateLaneTextDelivererParams) {
|
||||
return result("preview-retained");
|
||||
}
|
||||
} else if (!hasMedia && !payload.isError && text.length > params.draftMaxChars) {
|
||||
const longFinalResult = await tryDeliverLongFinalThroughPreview({
|
||||
lane,
|
||||
laneName,
|
||||
text,
|
||||
payload,
|
||||
previewButtons,
|
||||
});
|
||||
if (longFinalResult) {
|
||||
return longFinalResult;
|
||||
}
|
||||
params.log(
|
||||
`telegram: preview final too long for edit (${text.length} > ${params.draftMaxChars}); falling back to standard send`,
|
||||
);
|
||||
|
||||
@@ -22,7 +22,6 @@ function createHarness(params?: {
|
||||
answerHasStreamedMessage?: boolean;
|
||||
answerLastPartialText?: string;
|
||||
answerPreviewVisibleSinceMs?: number;
|
||||
splitFinalTextForPreview?: (text: string) => readonly string[];
|
||||
nowMs?: number;
|
||||
}) {
|
||||
const answer =
|
||||
@@ -71,7 +70,6 @@ function createHarness(params?: {
|
||||
retainPreviewOnCleanupByLane: { ...retainPreviewOnCleanupByLane },
|
||||
draftMaxChars: params?.draftMaxChars ?? 4_096,
|
||||
applyTextToPayload: (payload: ReplyPayload, text: string) => ({ ...payload, text }),
|
||||
splitFinalTextForPreview: params?.splitFinalTextForPreview,
|
||||
sendPayload,
|
||||
flushDraftLane,
|
||||
stopDraftLane,
|
||||
@@ -385,36 +383,6 @@ describe("createLaneTextDeliverer", () => {
|
||||
expect(harness.log).toHaveBeenCalledWith(expect.stringContaining("preview final too long"));
|
||||
});
|
||||
|
||||
it("forces a long final preview back to the first chunk before sending the rest", async () => {
  const firstChunk = "First chunk boundary.";
  const remainingText = " Follow-up body after the boundary.";
  const finalText = `${firstChunk}${remainingText}`;
  const harness = createHarness({
    answerMessageId: 999,
    answerHasStreamedMessage: true,
    answerLastPartialText: `${firstChunk} overlap already visible`,
    draftMaxChars: 24,
    splitFinalTextForPreview: () => [firstChunk, remainingText],
  });

  const result = await deliverFinalAnswer(harness, finalText);

  expect(expectPreviewFinalized(result)).toEqual({
    content: finalText,
    messageId: 999,
  });
  expect(harness.editPreview).toHaveBeenCalledWith(
    expect.objectContaining({
      messageId: 999,
      text: firstChunk,
    }),
  );
  expect(harness.sendPayload).toHaveBeenCalledWith(
    expect.objectContaining({ text: remainingText }),
  );
  expect(harness.lanes.answer.lastPartialText).toBe(firstChunk);
});
|
||||
|
||||
it("sends a fresh final when a message preview is long lived", async () => {
|
||||
const visibleSinceMs = 10_000;
|
||||
const harness = createHarness({
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
import path from "node:path";
|
||||
import {
|
||||
DEFAULT_ACCOUNT_ID,
|
||||
normalizeE164,
|
||||
pathExists,
|
||||
splitSetupEntries,
|
||||
type DmPolicy,
|
||||
|
||||
@@ -1570,7 +1570,6 @@
|
||||
"test:docker:timings": "node scripts/docker-e2e-timings.mjs",
|
||||
"test:docker:update-channel-switch": "bash scripts/e2e/update-channel-switch-docker.sh",
|
||||
"test:docker:update-migration": "env OPENCLAW_UPGRADE_SURVIVOR_PUBLISHED_BASELINE=1 OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC=${OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC:-openclaw@2026.4.23} OPENCLAW_UPGRADE_SURVIVOR_SCENARIO=${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-plugin-deps-cleanup} bash scripts/e2e/upgrade-survivor-docker.sh",
|
||||
"test:docker:update-restart-auth": "env OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE=auto-auth OPENCLAW_UPGRADE_SURVIVOR_DOCKER_RUN_TIMEOUT=${OPENCLAW_UPGRADE_SURVIVOR_DOCKER_RUN_TIMEOUT:-1500s} bash scripts/e2e/upgrade-survivor-docker.sh",
|
||||
"test:docker:upgrade-survivor": "bash scripts/e2e/upgrade-survivor-docker.sh",
|
||||
"test:e2e": "node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts",
|
||||
"test:e2e:openshell": "OPENCLAW_E2E_OPENSHELL=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts extensions/openshell/src/backend.e2e.test.ts",
|
||||
|
||||
@@ -37,7 +37,6 @@ BASELINE_RAW="${OPENCLAW_UPGRADE_SURVIVOR_BASELINE:?missing OPENCLAW_UPGRADE_SUR
|
||||
CANDIDATE_KIND="${OPENCLAW_UPGRADE_SURVIVOR_CANDIDATE_KIND:-tarball}"
|
||||
CANDIDATE_SPEC="${OPENCLAW_UPGRADE_SURVIVOR_CANDIDATE_SPEC:-${OPENCLAW_CURRENT_PACKAGE_TGZ:-}}"
|
||||
SCENARIO="${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-base}"
|
||||
UPDATE_RESTART_MODE="${OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE:-manual}"
|
||||
CURRENT_PHASE="setup"
|
||||
FAILURE_PHASE=""
|
||||
FAILURE_MESSAGE=""
|
||||
@@ -52,7 +51,6 @@ start_seconds=""
|
||||
status_seconds=""
|
||||
healthz_seconds=""
|
||||
readyz_seconds=""
|
||||
update_restart_seconds=""
|
||||
|
||||
BASELINE_INSTALL_LOG="$ARTIFACT_ROOT/baseline-install.log"
|
||||
UPDATE_JSON="$ARTIFACT_ROOT/update.json"
|
||||
@@ -65,11 +63,6 @@ READYZ_JSON="$ARTIFACT_ROOT/readyz.json"
|
||||
STATUS_JSON="$ARTIFACT_ROOT/status.json"
|
||||
STATUS_ERR="$ARTIFACT_ROOT/status.err"
|
||||
BASELINE_CONFIG_VALIDATE_LOG="$ARTIFACT_ROOT/baseline-config-validate.log"
|
||||
BASELINE_SERVICE_INSTALL_JSON="$ARTIFACT_ROOT/baseline-service-install.json"
|
||||
BASELINE_SERVICE_INSTALL_ERR="$ARTIFACT_ROOT/baseline-service-install.err"
|
||||
SYSTEMCTL_SHIM_LOG="$ARTIFACT_ROOT/systemctl-shim.log"
|
||||
SYSTEMCTL_SHIM_PID_FILE="$ARTIFACT_ROOT/systemctl-shim.pid"
|
||||
SYSTEMCTL_SHIM_DAEMON_LOG="$ARTIFACT_ROOT/systemctl-shim-gateway.log"
|
||||
CONFIG_COVERAGE_JSON="$ARTIFACT_ROOT/config-recipe.json"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_CONFIG_COVERAGE_JSON="$CONFIG_COVERAGE_JSON"
|
||||
rm -f "$SUMMARY_JSON" "$CONFIG_COVERAGE_JSON"
|
||||
@@ -120,17 +113,6 @@ normalize_baseline() {
|
||||
validate_baseline_package_spec "$baseline_spec"
|
||||
}
|
||||
|
||||
validate_update_restart_mode() {
|
||||
case "$UPDATE_RESTART_MODE" in
|
||||
manual | auto-auth)
|
||||
;;
|
||||
*)
|
||||
echo "OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE must be manual or auto-auth; got: $UPDATE_RESTART_MODE" >&2
|
||||
return 1
|
||||
;;
|
||||
esac
|
||||
}
|
||||
|
||||
json_event() {
|
||||
local phase="$1"
|
||||
local status="$2"
|
||||
@@ -157,9 +139,7 @@ write_summary() {
|
||||
SUMMARY_CANDIDATE_VERSION="$candidate_version" \
|
||||
SUMMARY_INSTALLED_VERSION="$installed_version" \
|
||||
SUMMARY_SCENARIO="$SCENARIO" \
|
||||
SUMMARY_UPDATE_RESTART_MODE="$UPDATE_RESTART_MODE" \
|
||||
SUMMARY_START_SECONDS="$start_seconds" \
|
||||
SUMMARY_UPDATE_RESTART_SECONDS="$update_restart_seconds" \
|
||||
SUMMARY_HEALTHZ_SECONDS="$healthz_seconds" \
|
||||
SUMMARY_READYZ_SECONDS="$readyz_seconds" \
|
||||
SUMMARY_STATUS_SECONDS="$status_seconds" \
|
||||
@@ -193,10 +173,8 @@ const summary = {
|
||||
version: process.env.SUMMARY_CANDIDATE_VERSION || null,
|
||||
},
|
||||
installedVersion: process.env.SUMMARY_INSTALLED_VERSION || null,
|
||||
updateRestartMode: process.env.SUMMARY_UPDATE_RESTART_MODE || "manual",
|
||||
timings: {
|
||||
startupSeconds: numberOrNull(process.env.SUMMARY_START_SECONDS),
|
||||
updateRestartSeconds: numberOrNull(process.env.SUMMARY_UPDATE_RESTART_SECONDS),
|
||||
healthzSeconds: numberOrNull(process.env.SUMMARY_HEALTHZ_SECONDS),
|
||||
readyzSeconds: numberOrNull(process.env.SUMMARY_READYZ_SECONDS),
|
||||
statusSeconds: numberOrNull(process.env.SUMMARY_STATUS_SECONDS),
|
||||
@@ -219,13 +197,6 @@ cleanup() {
|
||||
kill "$plugin_registry_pid" >/dev/null 2>&1 || true
|
||||
fi
|
||||
openclaw_e2e_terminate_gateways "${gateway_pid:-}"
|
||||
if [ -s "$SYSTEMCTL_SHIM_PID_FILE" ]; then
|
||||
local shim_pid
|
||||
shim_pid="$(cat "$SYSTEMCTL_SHIM_PID_FILE" 2>/dev/null || true)"
|
||||
if [[ "$shim_pid" =~ ^[0-9]+$ ]] && [ "$shim_pid" -gt 1 ]; then
|
||||
openclaw_e2e_terminate_gateways "$shim_pid"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
on_error() {
|
||||
@@ -641,7 +612,6 @@ rm_rf_retry() {
|
||||
|
||||
reset_run_state() {
|
||||
rm_rf_retry "$npm_config_prefix" "$TMPDIR" "$ARTIFACT_ROOT/state-home"
|
||||
rm -f "$SYSTEMCTL_SHIM_PID_FILE" "$SYSTEMCTL_SHIM_DAEMON_LOG"
|
||||
mkdir -p "$npm_config_prefix" "$npm_config_cache" "$TMPDIR"
|
||||
}
|
||||
|
||||
@@ -700,296 +670,6 @@ validate_baseline_config() {
|
||||
fi
|
||||
}
|
||||
|
||||
install_update_restart_systemctl_shim() {
|
||||
local shim_dir="$npm_config_prefix/bin"
|
||||
mkdir -p "$shim_dir"
|
||||
cat >"$shim_dir/systemctl" <<'SHIM'
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
log_file="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_LOG:-/tmp/openclaw-systemctl-shim.log}"
|
||||
pid_file="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_PID_FILE:-/tmp/openclaw-systemctl-shim.pid}"
|
||||
daemon_log="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_DAEMON_LOG:-/tmp/openclaw-systemctl-shim-gateway.log}"
|
||||
printf '%s\n' "$*" >>"$log_file"
|
||||
|
||||
filtered=()
|
||||
for ((i = 1; i <= $#; i++)); do
|
||||
arg="${!i}"
|
||||
case "$arg" in
|
||||
--user | --quiet | --no-page | --now)
|
||||
;;
|
||||
--property)
|
||||
i=$((i + 1))
|
||||
;;
|
||||
*)
|
||||
filtered+=("$arg")
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
command="${filtered[0]:-status}"
|
||||
|
||||
is_running() {
|
||||
[ -s "$pid_file" ] || return 1
|
||||
local pid
|
||||
pid="$(cat "$pid_file" 2>/dev/null || true)"
|
||||
[ -n "$pid" ] || return 1
|
||||
kill -0 "$pid" >/dev/null 2>&1
|
||||
}
|
||||
|
||||
stop_gateway() {
|
||||
[ -s "$pid_file" ] || return 0
|
||||
local pid
|
||||
pid="$(cat "$pid_file" 2>/dev/null || true)"
|
||||
if [[ "$pid" =~ ^[0-9]+$ ]] && [ "$pid" -gt 1 ] && kill -0 "$pid" >/dev/null 2>&1; then
|
||||
kill "$pid" >/dev/null 2>&1 || true
|
||||
for _ in $(seq 1 100); do
|
||||
kill -0 "$pid" >/dev/null 2>&1 || break
|
||||
sleep 0.1
|
||||
done
|
||||
kill -9 "$pid" >/dev/null 2>&1 || true
|
||||
fi
|
||||
rm -f "$pid_file"
|
||||
}
|
||||
|
||||
unit_path() {
|
||||
printf '%s/.config/systemd/user/openclaw-gateway.service\n' "${HOME:?missing HOME}"
|
||||
}
|
||||
|
||||
load_unit_environment() {
|
||||
local unit="$1"
|
||||
while IFS= read -r line; do
|
||||
case "$line" in
|
||||
EnvironmentFile=*)
|
||||
local spec="${line#EnvironmentFile=}"
|
||||
for token in $spec; do
|
||||
local file="${token#-}"
|
||||
[ -f "$file" ] || continue
|
||||
set -a
|
||||
# shellcheck disable=SC1090
|
||||
. "$file"
|
||||
set +a
|
||||
done
|
||||
;;
|
||||
Environment=*)
|
||||
local assignment="${line#Environment=}"
|
||||
assignment="${assignment#\"}"
|
||||
assignment="${assignment%\"}"
|
||||
export "$assignment"
|
||||
;;
|
||||
esac
|
||||
done <"$unit"
|
||||
}
|
||||
|
||||
start_gateway() {
|
||||
local unit
|
||||
local exec_start
|
||||
unit="$(unit_path)"
|
||||
exec_start="$(sed -n 's/^ExecStart=//p' "$unit" | tail -n 1)"
|
||||
[ -n "$exec_start" ] || {
|
||||
echo "systemctl shim could not find ExecStart in $unit" >&2
|
||||
return 1
|
||||
}
|
||||
(
|
||||
load_unit_environment "$unit"
|
||||
nohup bash -lc "exec $exec_start" >>"$daemon_log" 2>&1 &
|
||||
printf '%s\n' "$!" >"$pid_file"
|
||||
)
|
||||
}
|
||||
|
||||
case "$command" in
|
||||
daemon-reload | enable | disable)
|
||||
exit 0
|
||||
;;
|
||||
status)
|
||||
is_running && exit 0
|
||||
exit 0
|
||||
;;
|
||||
stop)
|
||||
stop_gateway
|
||||
exit 0
|
||||
;;
|
||||
restart | start)
|
||||
stop_gateway
|
||||
start_gateway
|
||||
exit 0
|
||||
;;
|
||||
is-enabled)
|
||||
exit 0
|
||||
;;
|
||||
is-active)
|
||||
is_running && exit 0
|
||||
exit 3
|
||||
;;
|
||||
show)
|
||||
if is_running; then
|
||||
printf 'ActiveState=active\nSubState=running\nMainPID=%s\nExecMainStatus=0\nExecMainCode=0\n' "$(cat "$pid_file")"
|
||||
else
|
||||
printf 'ActiveState=inactive\nSubState=dead\nMainPID=0\nExecMainStatus=0\nExecMainCode=0\n'
|
||||
fi
|
||||
exit 0
|
||||
;;
|
||||
*)
|
||||
echo "systemctl shim unsupported command: $*" >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
SHIM
|
||||
chmod +x "$shim_dir/systemctl"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_LOG="$SYSTEMCTL_SHIM_LOG"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_PID_FILE="$SYSTEMCTL_SHIM_PID_FILE"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_DAEMON_LOG="$SYSTEMCTL_SHIM_DAEMON_LOG"
|
||||
export PATH="$shim_dir:$PATH"
|
||||
}
|
||||
|
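The shim installed above works by placing a fake `systemctl` first on `PATH` and discarding flags it does not care about before dispatching on the first remaining word. A minimal sketch of that idea follows. It uses a hypothetical scratch directory instead of `$npm_config_prefix/bin`, omits the `--property` value skipping, and only prints the resolved subcommand rather than managing a gateway process.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: a throwaway directory stands in for "$npm_config_prefix/bin".
shim_dir="$(mktemp -d)"

cat >"$shim_dir/systemctl" <<'SHIM'
#!/usr/bin/env bash
# Drop flags the shim ignores, then print the first remaining word.
filtered=()
for arg in "$@"; do
  case "$arg" in
    --user | --quiet | --no-page | --now) ;;
    *) filtered+=("$arg") ;;
  esac
done
# With no subcommand left, fall back to "status" as the shim above does.
printf '%s\n' "${filtered[0]:-status}"
SHIM
chmod +x "$shim_dir/systemctl"

# Prepending the directory makes the shim win the PATH lookup.
PATH="$shim_dir:$PATH"

shim_command="$(systemctl --user --now restart openclaw-gateway.service)"
shim_default="$(systemctl --user)"
echo "$shim_command $shim_default"

rm -rf "$shim_dir"
```

Because the lookup is purely `PATH`-based, anything that calls `systemctl` by absolute path would bypass a shim like this.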
||||
install_update_restart_service_unit() {
|
||||
if ! env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw gateway install --force --json >"$BASELINE_SERVICE_INSTALL_JSON" 2>"$BASELINE_SERVICE_INSTALL_ERR"; then
|
||||
echo "baseline gateway service install failed" >&2
|
||||
cat "$BASELINE_SERVICE_INSTALL_ERR" >&2 || true
|
||||
cat "$BASELINE_SERVICE_INSTALL_JSON" >&2 || true
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
seed_update_restart_probe_device_auth() {
|
||||
node --input-type=module <<'NODE'
|
||||
import crypto from "node:crypto";
|
||||
import fs from "node:fs";
|
||||
import path from "node:path";
|
||||
|
||||
const stateDir = process.env.OPENCLAW_STATE_DIR;
|
||||
if (!stateDir) {
|
||||
throw new Error("missing OPENCLAW_STATE_DIR");
|
||||
}
|
||||
|
||||
const base64UrlEncode = (buf) =>
|
||||
buf.toString("base64").replaceAll("+", "-").replaceAll("/", "_").replace(/=+$/g, "");
|
||||
const ed25519SpkiPrefix = Buffer.from("302a300506032b6570032100", "hex");
|
||||
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");
|
||||
const publicKeyPem = publicKey.export({ type: "spki", format: "pem" });
|
||||
const privateKeyPem = privateKey.export({ type: "pkcs8", format: "pem" });
|
||||
const spki = crypto.createPublicKey(publicKeyPem).export({ type: "spki", format: "der" });
|
||||
const rawPublicKey =
|
||||
spki.length === ed25519SpkiPrefix.length + 32 &&
|
||||
spki.subarray(0, ed25519SpkiPrefix.length).equals(ed25519SpkiPrefix)
|
||||
? spki.subarray(ed25519SpkiPrefix.length)
|
||||
: spki;
|
||||
const publicKeyRaw = base64UrlEncode(rawPublicKey);
|
||||
const deviceId = crypto.createHash("sha256").update(rawPublicKey).digest("hex");
|
||||
const token = base64UrlEncode(crypto.randomBytes(32));
|
||||
const now = Date.now();
|
||||
const scopes = ["operator.read"];
|
||||
|
||||
function writeJson(filePath, value) {
|
||||
fs.mkdirSync(path.dirname(filePath), { recursive: true });
|
||||
fs.writeFileSync(filePath, `${JSON.stringify(value, null, 2)}\n`, { mode: 0o600 });
|
||||
try {
|
||||
fs.chmodSync(filePath, 0o600);
|
||||
} catch {
|
||||
// best-effort inside Docker
|
||||
}
|
||||
}
|
||||
|
||||
writeJson(path.join(stateDir, "identity", "device.json"), {
|
||||
version: 1,
|
||||
deviceId,
|
||||
publicKeyPem,
|
||||
privateKeyPem,
|
||||
createdAtMs: now,
|
||||
});
|
||||
writeJson(path.join(stateDir, "identity", "device-auth.json"), {
|
||||
version: 1,
|
||||
deviceId,
|
||||
tokens: {
|
||||
operator: {
|
||||
token,
|
||||
role: "operator",
|
||||
scopes,
|
||||
updatedAtMs: now,
|
||||
},
|
||||
},
|
||||
});
|
||||
writeJson(path.join(stateDir, "devices", "paired.json"), {
|
||||
[deviceId]: {
|
||||
deviceId,
|
||||
publicKey: publicKeyRaw,
|
||||
displayName: "upgrade survivor restart probe",
|
||||
platform: process.platform,
|
||||
clientId: "upgrade-survivor",
|
||||
clientMode: "probe",
|
||||
role: "operator",
|
||||
roles: ["operator"],
|
||||
scopes,
|
||||
approvedScopes: scopes,
|
||||
tokens: {
|
||||
operator: {
|
||||
token,
|
||||
role: "operator",
|
||||
scopes,
|
||||
createdAtMs: now,
|
||||
},
|
||||
},
|
||||
createdAtMs: now,
|
||||
approvedAtMs: now,
|
||||
},
|
||||
});
|
||||
writeJson(path.join(stateDir, "devices", "pending.json"), {});
|
||||
NODE
|
||||
}
|
||||
|
||||
write_update_restart_service_secretref_env() {
|
||||
mkdir -p "$OPENCLAW_STATE_DIR"
|
||||
local dotenv_path="$OPENCLAW_STATE_DIR/.env"
|
||||
local tmp_path="$dotenv_path.tmp.$$"
|
||||
if [ -f "$dotenv_path" ]; then
|
||||
grep -v '^GATEWAY_AUTH_TOKEN_REF=' "$dotenv_path" >"$tmp_path" || true
|
||||
else
|
||||
: >"$tmp_path"
|
||||
fi
|
||||
# Managed restarts resolve SecretRefs from service-owned durable env, not the update caller.
|
||||
printf 'GATEWAY_AUTH_TOKEN_REF=%s\n' "$GATEWAY_AUTH_TOKEN_REF" >>"$tmp_path"
|
||||
mv "$tmp_path" "$dotenv_path"
|
||||
}
|
||||
|
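The function above replaces a single key in the `.env` file by filtering the old line into a temporary file, appending the new value, and renaming the result over the original. The sketch below isolates that pattern. The helper name `upsert_env_line` and the demo file are hypothetical and are not part of the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: set KEY=VALUE in a dotenv file, replacing any existing line.
upsert_env_line() {
  local file="$1" key="$2" value="$3"
  local tmp="$file.tmp.$$"
  if [ -f "$file" ]; then
    # grep -v exits non-zero when nothing is left to print, hence "|| true".
    grep -v "^${key}=" "$file" >"$tmp" || true
  else
    : >"$tmp"
  fi
  printf '%s=%s\n' "$key" "$value" >>"$tmp"
  # Rename within the same directory so readers never see a half-written file.
  mv "$tmp" "$file"
}

demo_dir="$(mktemp -d)"
printf 'OTHER=1\nGATEWAY_AUTH_TOKEN_REF=old\n' >"$demo_dir/.env"
upsert_env_line "$demo_dir/.env" GATEWAY_AUTH_TOKEN_REF new
upsert_env_line "$demo_dir/.env" GATEWAY_AUTH_TOKEN_REF newer
env_contents="$(cat "$demo_dir/.env")"
echo "$env_contents"
rm -rf "$demo_dir"
```

Running the helper twice leaves exactly one line for the key, which is the property the script relies on when it rewrites `GATEWAY_AUTH_TOKEN_REF`.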
||||
write_update_restart_service_auth_env() {
|
||||
mkdir -p "$OPENCLAW_STATE_DIR"
|
||||
local dotenv_path="$OPENCLAW_STATE_DIR/.env"
|
||||
local tmp_path="$dotenv_path.tmp.$$"
|
||||
if [ -f "$dotenv_path" ]; then
|
||||
grep -v '^GATEWAY_AUTH_TOKEN_REF=' "$dotenv_path" >"$tmp_path" || true
|
||||
else
|
||||
: >"$tmp_path"
|
||||
fi
|
||||
printf 'GATEWAY_AUTH_TOKEN_REF=%s\n' "$GATEWAY_AUTH_TOKEN_REF" >>"$tmp_path"
|
||||
mv "$tmp_path" "$dotenv_path"
|
||||
local systemd_env_path="$OPENCLAW_STATE_DIR/gateway.systemd.env"
|
||||
printf 'GATEWAY_AUTH_TOKEN_REF=%s\n' "$GATEWAY_AUTH_TOKEN_REF" >"$systemd_env_path"
|
||||
}
|
||||
|
||||
prepare_update_restart_probe() {
|
||||
if [ "$UPDATE_RESTART_MODE" != "auto-auth" ]; then
|
||||
return 0
|
||||
fi
|
||||
echo "Preparing configured-auth gateway for automatic update restart."
|
||||
install_update_restart_systemctl_shim
|
||||
seed_update_restart_probe_device_auth
|
||||
start_gateway
|
||||
write_update_restart_service_secretref_env
|
||||
install_update_restart_service_unit
|
||||
}
|
||||
|
||||
prepare_update_restart_probe_current_install() {
|
||||
if [ "$UPDATE_RESTART_MODE" != "auto-auth" ]; then
|
||||
return 0
|
||||
fi
|
||||
echo "Preparing candidate-auth gateway for automatic update restart."
|
||||
install_update_restart_systemctl_shim
|
||||
seed_update_restart_probe_device_auth
|
||||
start_gateway
|
||||
write_update_restart_service_auth_env
|
||||
install_update_restart_service_unit
|
||||
}
|
||||
|
||||
assert_baseline_state() {
|
||||
OPENCLAW_UPGRADE_SURVIVOR_ASSERT_STAGE=baseline \
|
||||
node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-config
|
||||
@@ -1034,32 +714,12 @@ resolve_candidate_version() {
|
||||
|
||||
update_candidate() {
|
||||
echo "Updating baseline $baseline_spec to candidate $CANDIDATE_KIND:$CANDIDATE_SPEC ($candidate_version)"
|
||||
- local update_start=""
- local update_end=""
- local update_args=(update --tag "$CANDIDATE_SPEC" --yes --json)
- if [ "$UPDATE_RESTART_MODE" = "manual" ]; then
- update_args+=(--no-restart)
- else
- update_start="$(node -e "process.stdout.write(String(Date.now()))")"
- fi
- if ! env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw "${update_args[@]}" >"$UPDATE_JSON" 2>"$UPDATE_ERR"; then
+ if ! openclaw update --tag "$CANDIDATE_SPEC" --yes --json --no-restart >"$UPDATE_JSON" 2>"$UPDATE_ERR"; then
  echo "openclaw update failed" >&2
  cat "$UPDATE_ERR" >&2 || true
  cat "$UPDATE_JSON" >&2 || true
  return 1
  fi
- if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
- update_end="$(node -e "process.stdout.write(String(Date.now()))")"
- update_restart_seconds=$(((update_end - update_start + 999) / 1000))
- node -e '
- const fs = require("node:fs");
- const file = process.argv[1];
- const result = JSON.parse(fs.readFileSync(file, "utf8"));
- if (!result || result.status !== "ok") {
- throw new Error(`update JSON did not report ok status: ${JSON.stringify(result)}`);
- }
- ' "$UPDATE_JSON"
- fi
|
||||
installed_version="$(read_installed_version)"
|
||||
}
|
||||
|
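The timing arithmetic in these hunks adds 999 before dividing by 1000, which rounds a millisecond interval up to whole seconds using integer arithmetic only. A small sketch of that rounding follows. The helper name `elapsed_seconds_ceil` is hypothetical; the script performs the arithmetic inline.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: round an elapsed millisecond interval up to whole seconds.
elapsed_seconds_ceil() {
  local start_ms="$1" end_ms="$2"
  # Integer division truncates, so adding 999 first turns it into a ceiling.
  echo $(((end_ms - start_ms + 999) / 1000))
}

one_ms="$(elapsed_seconds_ceil 1000 1001)"
two_s="$(elapsed_seconds_ceil 1000 3000)"
just_over="$(elapsed_seconds_ceil 1000 3001)"
echo "$one_ms $two_s $just_over"
```

Rounding up means any nonzero interval is reported as at least one second, a conservative choice when the result is compared against a time budget.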
||||
@@ -1116,11 +776,8 @@ start_gateway() {
|
||||
local start_epoch
|
||||
local ready_epoch
|
||||
start_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
- env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw gateway --port "$port" --bind loopback --allow-unconfigured >"$GATEWAY_LOG" 2>&1 &
+ openclaw gateway --port "$port" --bind loopback --allow-unconfigured >"$GATEWAY_LOG" 2>&1 &
  gateway_pid="$!"
- if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
- printf '%s\n' "$gateway_pid" >"$SYSTEMCTL_SHIM_PID_FILE"
- fi
|
||||
openclaw_e2e_wait_gateway_ready "$gateway_pid" "$GATEWAY_LOG" 360
|
||||
ready_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
start_seconds=$(((ready_epoch - start_epoch + 999) / 1000))
|
||||
@@ -1131,13 +788,6 @@ start_gateway() {
|
||||
fi
|
||||
}
|
||||
|
||||
ensure_gateway_started() {
|
||||
if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
|
||||
return 0
|
||||
fi
|
||||
start_gateway
|
||||
}
|
||||
|
||||
check_gateway_probes() {
|
||||
healthz_seconds="$(probe_gateway_endpoint /healthz live "$HEALTHZ_JSON")"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_READYZ_ALLOW_FAILING="discord,telegram,whatsapp,feishu,matrix"
|
||||
@@ -1168,7 +818,6 @@ check_gateway_status() {
|
||||
}
|
||||
|
||||
phase storage-preflight storage_preflight
|
||||
phase validate-update-restart-mode validate_update_restart_mode
|
||||
phase reset-run-state reset_run_state
|
||||
phase install-baseline install_baseline
|
||||
phase seed-state seed_state
|
||||
@@ -1181,7 +830,6 @@ phase seed-source-only-plugin-shadow seed_source_only_plugin_shadow
|
||||
phase assert-baseline assert_baseline_state
|
||||
phase seed-legacy-runtime-deps-symlink seed_legacy_runtime_deps_symlink
|
||||
phase resolve-candidate resolve_candidate_version
|
||||
phase prepare-update-restart-probe prepare_update_restart_probe
|
||||
phase update-candidate update_candidate
|
||||
phase assert-legacy-plugin-dependency-debris-before-doctor assert_legacy_plugin_dependency_debris_before_doctor
|
||||
phase configure-configured-plugin-install-fixture-registry configure_configured_plugin_install_fixture_registry
|
||||
@@ -1190,8 +838,8 @@ phase assert-legacy-plugin-dependency-debris-cleaned assert_legacy_plugin_depend
|
||||
phase assert-legacy-runtime-deps-symlink-repaired assert_legacy_runtime_deps_symlink_repaired
|
||||
phase validate-post-doctor-config validate_post_doctor_config
|
||||
phase assert-survival assert_survival
|
||||
- phase gateway-start ensure_gateway_started
+ phase gateway-start start_gateway
  phase gateway-probes check_gateway_probes
  phase gateway-status check_gateway_status

- echo "Upgrade survivor Docker E2E passed baseline=${baseline_spec} scenario=${SCENARIO} candidate=${candidate_version} updateRestartMode=${UPDATE_RESTART_MODE} startup=${start_seconds}s updateRestart=${update_restart_seconds:-manual}s healthz=${healthz_seconds}s readyz=${readyz_seconds}s status=${status_seconds}s."
+ echo "Upgrade survivor Docker E2E passed baseline=${baseline_spec} scenario=${SCENARIO} candidate=${candidate_version} startup=${start_seconds}s healthz=${healthz_seconds}s readyz=${readyz_seconds}s status=${status_seconds}s."
|
||||
|
||||
@@ -1,264 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
install_update_restart_systemctl_shim() {
|
||||
local shim_dir="$npm_config_prefix/bin"
|
||||
mkdir -p "$shim_dir"
|
||||
cat >"$shim_dir/systemctl" <<'SHIM'
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
log_file="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_LOG:-/tmp/openclaw-systemctl-shim.log}"
|
||||
pid_file="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_PID_FILE:-/tmp/openclaw-systemctl-shim.pid}"
|
||||
daemon_log="${OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_DAEMON_LOG:-/tmp/openclaw-systemctl-shim-gateway.log}"
|
||||
printf '%s\n' "$*" >>"$log_file"
|
||||
|
||||
filtered=()
|
||||
for ((i = 1; i <= $#; i++)); do
|
||||
arg="${!i}"
|
||||
case "$arg" in
|
||||
--user | --quiet | --no-page | --now)
|
||||
;;
|
||||
--property)
|
||||
i=$((i + 1))
|
||||
;;
|
||||
*)
|
||||
filtered+=("$arg")
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
command="${filtered[0]:-status}"
|
||||
|
||||
is_running() {
|
||||
[ -s "$pid_file" ] || return 1
|
||||
local pid
|
||||
pid="$(cat "$pid_file" 2>/dev/null || true)"
|
||||
[ -n "$pid" ] || return 1
|
||||
kill -0 "$pid" >/dev/null 2>&1
|
||||
}
|
||||
|
||||
stop_gateway() {
|
||||
[ -s "$pid_file" ] || return 0
|
||||
local pid
|
||||
pid="$(cat "$pid_file" 2>/dev/null || true)"
|
||||
if [[ "$pid" =~ ^[0-9]+$ ]] && [ "$pid" -gt 1 ] && kill -0 "$pid" >/dev/null 2>&1; then
|
||||
kill "$pid" >/dev/null 2>&1 || true
|
||||
for _ in $(seq 1 100); do
|
||||
kill -0 "$pid" >/dev/null 2>&1 || break
|
||||
sleep 0.1
|
||||
done
|
||||
kill -9 "$pid" >/dev/null 2>&1 || true
|
||||
fi
|
||||
rm -f "$pid_file"
|
||||
}
|
||||
|
||||
unit_path() {
|
||||
printf '%s/.config/systemd/user/openclaw-gateway.service\n' "${HOME:?missing HOME}"
|
||||
}
|
||||
|
||||
load_unit_environment() {
|
||||
local unit="$1"
|
||||
while IFS= read -r line; do
|
||||
case "$line" in
|
||||
EnvironmentFile=*)
|
||||
local spec="${line#EnvironmentFile=}"
|
||||
for token in $spec; do
|
||||
local file="${token#-}"
|
||||
[ -f "$file" ] || continue
|
||||
set -a
|
||||
# shellcheck disable=SC1090
|
||||
. "$file"
|
||||
set +a
|
||||
done
|
||||
;;
|
||||
Environment=*)
|
||||
local assignment="${line#Environment=}"
|
||||
assignment="${assignment#\"}"
|
||||
assignment="${assignment%\"}"
|
||||
export "$assignment"
|
||||
;;
|
||||
esac
|
||||
done <"$unit"
|
||||
}
|
||||
|
||||
start_gateway() {
|
||||
local unit
|
||||
local exec_start
|
||||
unit="$(unit_path)"
|
||||
exec_start="$(sed -n 's/^ExecStart=//p' "$unit" | tail -n 1)"
|
||||
[ -n "$exec_start" ] || {
|
||||
echo "systemctl shim could not find ExecStart in $unit" >&2
|
||||
return 1
|
||||
}
|
||||
(
|
||||
load_unit_environment "$unit"
|
||||
nohup bash -lc "exec $exec_start" >>"$daemon_log" 2>&1 &
|
||||
printf '%s\n' "$!" >"$pid_file"
|
||||
)
|
||||
}
|
||||
|
||||
case "$command" in
|
||||
daemon-reload | enable | disable)
|
||||
exit 0
|
||||
;;
|
||||
status)
|
||||
is_running && exit 0
|
||||
exit 0
|
||||
;;
|
||||
stop)
|
||||
stop_gateway
|
||||
exit 0
|
||||
;;
|
||||
restart | start)
|
||||
stop_gateway
|
||||
start_gateway
|
||||
exit 0
|
||||
;;
|
||||
is-enabled)
|
||||
exit 0
|
||||
;;
|
||||
is-active)
|
||||
is_running && exit 0
|
||||
exit 3
|
||||
;;
|
||||
show)
|
||||
if is_running; then
|
||||
printf 'ActiveState=active\nSubState=running\nMainPID=%s\nExecMainStatus=0\nExecMainCode=0\n' "$(cat "$pid_file")"
|
||||
else
|
||||
printf 'ActiveState=inactive\nSubState=dead\nMainPID=0\nExecMainStatus=0\nExecMainCode=0\n'
|
||||
fi
|
||||
exit 0
|
||||
;;
|
||||
*)
|
||||
echo "systemctl shim unsupported command: $*" >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
SHIM
|
||||
chmod +x "$shim_dir/systemctl"
|
||||
export PATH="$shim_dir:$PATH"
|
||||
}
|
||||
|
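Inside the shim, `stop_gateway` reads a pid file, validates that it holds a plausible pid, sends a normal termination signal, polls for up to roughly ten seconds, and only then escalates to `kill -9`. The sketch below reproduces that sequence against a throwaway `sleep` process. The helper name `stop_from_pid_file` and the demo process are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: stop the process recorded in a pid file, escalating to SIGKILL.
stop_from_pid_file() {
  local pid_file="$1" pid
  [ -s "$pid_file" ] || return 0
  pid="$(cat "$pid_file" 2>/dev/null || true)"
  # Only signal numeric pids greater than 1 that are currently alive.
  if [[ "$pid" =~ ^[0-9]+$ ]] && [ "$pid" -gt 1 ] && kill -0 "$pid" 2>/dev/null; then
    kill "$pid" 2>/dev/null || true
    # Poll in 0.1 s steps (fractional sleep assumes GNU coreutils or similar).
    for _ in $(seq 1 100); do
      kill -0 "$pid" 2>/dev/null || break
      sleep 0.1
    done
    kill -9 "$pid" 2>/dev/null || true
  fi
  rm -f "$pid_file"
}

demo_pid_file="$(mktemp)"
sleep 30 &
demo_pid="$!"
printf '%s\n' "$demo_pid" >"$demo_pid_file"

stop_from_pid_file "$demo_pid_file"
# Reap the child so it cannot linger as a zombie in this demo.
wait "$demo_pid" 2>/dev/null || true
```

The `-gt 1` guard matters because a corrupted pid file containing `0`, `1`, or a negative number would otherwise signal a process group or init.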
||||
seed_update_restart_probe_device_auth() {
|
||||
node --input-type=module <<'NODE'
|
||||
import crypto from "node:crypto";
|
||||
import fs from "node:fs";
|
||||
import path from "node:path";
|
||||
|
||||
const stateDir = process.env.OPENCLAW_STATE_DIR;
|
||||
if (!stateDir) {
|
||||
throw new Error("missing OPENCLAW_STATE_DIR");
|
||||
}
|
||||
|
||||
const base64UrlEncode = (buf) =>
|
||||
buf.toString("base64").replaceAll("+", "-").replaceAll("/", "_").replace(/=+$/g, "");
|
||||
const ed25519SpkiPrefix = Buffer.from("302a300506032b6570032100", "hex");
|
||||
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");
|
||||
const publicKeyPem = publicKey.export({ type: "spki", format: "pem" });
|
||||
const privateKeyPem = privateKey.export({ type: "pkcs8", format: "pem" });
|
||||
const spki = crypto.createPublicKey(publicKeyPem).export({ type: "spki", format: "der" });
|
||||
const rawPublicKey =
|
||||
spki.length === ed25519SpkiPrefix.length + 32 &&
|
||||
spki.subarray(0, ed25519SpkiPrefix.length).equals(ed25519SpkiPrefix)
|
||||
? spki.subarray(ed25519SpkiPrefix.length)
|
||||
: spki;
|
||||
const publicKeyRaw = base64UrlEncode(rawPublicKey);
|
||||
const deviceId = crypto.createHash("sha256").update(rawPublicKey).digest("hex");
|
||||
const token = base64UrlEncode(crypto.randomBytes(32));
|
||||
const now = Date.now();
|
||||
const scopes = ["operator.read"];
|
||||
|
||||
function writeJson(filePath, value) {
|
||||
fs.mkdirSync(path.dirname(filePath), { recursive: true });
|
||||
fs.writeFileSync(filePath, `${JSON.stringify(value, null, 2)}\n`, { mode: 0o600 });
|
||||
try {
|
||||
fs.chmodSync(filePath, 0o600);
|
||||
} catch {
|
||||
}
|
||||
}
|
||||
|
||||
writeJson(path.join(stateDir, "identity", "device.json"), {
|
||||
version: 1,
|
||||
deviceId,
|
||||
publicKeyPem,
|
||||
privateKeyPem,
|
||||
createdAtMs: now,
|
||||
});
|
||||
writeJson(path.join(stateDir, "identity", "device-auth.json"), {
|
||||
version: 1,
|
||||
deviceId,
|
||||
tokens: {
|
||||
operator: {
|
||||
token,
|
||||
role: "operator",
|
||||
scopes,
|
||||
updatedAtMs: now,
|
||||
},
|
||||
},
|
||||
});
|
||||
writeJson(path.join(stateDir, "devices", "paired.json"), {
|
||||
[deviceId]: {
|
||||
deviceId,
|
||||
publicKey: publicKeyRaw,
|
||||
displayName: "upgrade survivor restart probe",
|
||||
platform: process.platform,
|
||||
clientId: "openclaw-cli",
|
||||
clientMode: "probe",
|
||||
role: "operator",
|
||||
roles: ["operator"],
|
||||
scopes,
|
||||
approvedScopes: scopes,
|
||||
tokens: {
|
||||
operator: {
|
||||
token,
|
||||
role: "operator",
|
||||
scopes,
|
||||
createdAtMs: now,
|
||||
},
|
||||
},
|
||||
createdAtMs: now,
|
||||
approvedAtMs: now,
|
||||
},
|
||||
});
|
||||
writeJson(path.join(stateDir, "devices", "pending.json"), {});
|
||||
NODE
|
||||
}
|
||||
|
||||
write_update_restart_service_auth_env() {
|
||||
mkdir -p "$OPENCLAW_STATE_DIR"
|
||||
local dotenv_path="$OPENCLAW_STATE_DIR/.env"
|
||||
local tmp_path="$dotenv_path.tmp.$$"
|
||||
if [ -f "$dotenv_path" ]; then
|
||||
grep -v '^GATEWAY_AUTH_TOKEN_REF=' "$dotenv_path" >"$tmp_path" || true
|
||||
else
|
||||
: >"$tmp_path"
|
||||
fi
|
||||
printf 'GATEWAY_AUTH_TOKEN_REF=%s\n' "$GATEWAY_AUTH_TOKEN_REF" >>"$tmp_path"
|
||||
mv "$tmp_path" "$dotenv_path"
|
||||
printf 'GATEWAY_AUTH_TOKEN_REF=%s\n' "$GATEWAY_AUTH_TOKEN_REF" >"$OPENCLAW_STATE_DIR/gateway.systemd.env"
|
||||
}
|
||||
|
||||
prepare_update_restart_probe_current_install() {
|
||||
local port="$1"
|
||||
local log_file="$2"
|
||||
local start_epoch
|
||||
local ready_epoch
|
||||
|
||||
echo "Preparing candidate-auth gateway for automatic update restart."
|
||||
install_update_restart_systemctl_shim
|
||||
seed_update_restart_probe_device_auth
|
||||
start_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw gateway --port "$port" --bind loopback --allow-unconfigured >"$log_file" 2>&1 &
|
||||
gateway_pid="$!"
|
||||
printf '%s\n' "$gateway_pid" >"$OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_PID_FILE"
|
||||
openclaw_e2e_wait_gateway_ready "$gateway_pid" "$log_file" 360
|
||||
ready_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
start_seconds=$(((ready_epoch - start_epoch + 999) / 1000))
|
||||
write_update_restart_service_auth_env
|
||||
if ! env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw gateway install --force --json >"$OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_JSON" 2>"$OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_ERR"; then
|
||||
echo "gateway service install failed" >&2
|
||||
cat "$OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_ERR" >&2 || true
|
||||
cat "$OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_JSON" >&2 || true
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
@@ -13,7 +13,6 @@ SKIP_BUILD="${OPENCLAW_UPGRADE_SURVIVOR_E2E_SKIP_BUILD:-0}"
|
||||
DOCKER_RUN_TIMEOUT="${OPENCLAW_UPGRADE_SURVIVOR_DOCKER_RUN_TIMEOUT:-900s}"
|
||||
BASELINE_SPEC="${OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC:-}"
|
||||
SCENARIO="${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-base}"
|
||||
UPDATE_RESTART_MODE="${OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE:-manual}"
|
||||
LANE_ARTIFACT_SUFFIX="${OPENCLAW_DOCKER_ALL_LANE_NAME:-default}"
|
||||
LANE_ARTIFACT_SUFFIX="${LANE_ARTIFACT_SUFFIX//[^A-Za-z0-9_.-]/_}"
|
||||
ARTIFACT_DIR="${OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_DIR:-$ROOT_DIR/.artifacts/upgrade-survivor/$LANE_ARTIFACT_SUFFIX}"
|
||||
@@ -87,7 +86,6 @@ if [ "${OPENCLAW_UPGRADE_SURVIVOR_PUBLISHED_BASELINE:-0}" = "1" ]; then
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_CANDIDATE_KIND="$CANDIDATE_KIND" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_CANDIDATE_SPEC="$CANDIDATE_SPEC" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_SCENARIO="$SCENARIO" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE="$UPDATE_RESTART_MODE" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_LEGACY_RUNTIME_DEPS_SYMLINK="${OPENCLAW_UPGRADE_SURVIVOR_LEGACY_RUNTIME_DEPS_SYMLINK:-}" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_SUMMARY_JSON=/tmp/openclaw-upgrade-survivor-artifacts/summary.json \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS="${OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS:-90}" \
|
||||
@@ -113,7 +111,6 @@ docker_e2e_run_with_harness \
|
||||
-e OPENCLAW_TEST_STATE_SCRIPT_B64="$OPENCLAW_TEST_STATE_SCRIPT_B64" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT=/tmp/openclaw-upgrade-survivor-artifacts \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_SCENARIO="$SCENARIO" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE="$UPDATE_RESTART_MODE" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS="${OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS:-90}" \
|
||||
-e OPENCLAW_UPGRADE_SURVIVOR_STATUS_BUDGET_SECONDS="${OPENCLAW_UPGRADE_SURVIVOR_STATUS_BUDGET_SECONDS:-30}" \
|
||||
-v "$ARTIFACT_DIR:/tmp/openclaw-upgrade-survivor-artifacts" \
|
||||
@@ -148,22 +145,6 @@ export TELEGRAM_BOT_TOKEN="123456:upgrade-survivor-telegram-token"
|
||||
export FEISHU_APP_SECRET="upgrade-survivor-feishu-secret"
|
||||
export BRAVE_API_KEY="BSA_upgrade_survivor_brave_key"
|
||||
|
||||
UPDATE_RESTART_MODE="${OPENCLAW_UPGRADE_SURVIVOR_UPDATE_RESTART_MODE:-manual}"
|
||||
PORT=18789
|
||||
START_BUDGET="${OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS:-90}"
|
||||
STATUS_BUDGET="${OPENCLAW_UPGRADE_SURVIVOR_STATUS_BUDGET_SECONDS:-30}"
|
||||
GATEWAY_LOG="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/gateway.log"
|
||||
SYSTEMCTL_SHIM_LOG="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/systemctl-shim.log"
|
||||
SYSTEMCTL_SHIM_PID_FILE="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/systemctl-shim.pid"
|
||||
SYSTEMCTL_SHIM_DAEMON_LOG="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/systemctl-shim-gateway.log"
|
||||
BASELINE_SERVICE_INSTALL_JSON="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/baseline-service-install.json"
|
||||
BASELINE_SERVICE_INSTALL_ERR="$OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_ROOT/baseline-service-install.err"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_LOG="$SYSTEMCTL_SHIM_LOG"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_PID_FILE="$SYSTEMCTL_SHIM_PID_FILE"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_SYSTEMCTL_SHIM_DAEMON_LOG="$SYSTEMCTL_SHIM_DAEMON_LOG"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_JSON="$BASELINE_SERVICE_INSTALL_JSON"
|
||||
export OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SERVICE_INSTALL_ERR="$BASELINE_SERVICE_INSTALL_ERR"
|
||||
|
||||
gateway_pid=""
|
||||
plugin_registry_pid=""
|
||||
cleanup() {
|
||||
@@ -171,9 +152,6 @@ cleanup() {
|
||||
kill "$plugin_registry_pid" >/dev/null 2>&1 || true
|
||||
fi
|
||||
openclaw_e2e_terminate_gateways "${gateway_pid:-}"
|
||||
if [ -s "$SYSTEMCTL_SHIM_PID_FILE" ]; then
|
||||
openclaw_e2e_terminate_gateways "$(cat "$SYSTEMCTL_SHIM_PID_FILE" 2>/dev/null || true)"
|
||||
fi
|
||||
}
|
||||
trap cleanup EXIT
|
||||
|
||||
@@ -277,19 +255,10 @@ export OPENCLAW_PACKAGE_ACCEPTANCE_LEGACY_COMPAT
|
||||
echo "Checking dirty-state config before update..."
|
||||
OPENCLAW_UPGRADE_SURVIVOR_ASSERT_STAGE=baseline node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-config
|
||||
OPENCLAW_UPGRADE_SURVIVOR_ASSERT_STAGE=baseline node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-state
|
||||
if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
|
||||
# shellcheck disable=SC1091
|
||||
source scripts/e2e/lib/upgrade-survivor/update-restart-auth.sh
|
||||
prepare_update_restart_probe_current_install "$PORT" "$GATEWAY_LOG"
|
||||
fi
|
||||
|
||||
echo "Running package update against the mounted tarball..."
|
||||
update_args=(update --tag "${OPENCLAW_CURRENT_PACKAGE_TGZ:?missing OPENCLAW_CURRENT_PACKAGE_TGZ}" --yes --json)
|
||||
if [ "$UPDATE_RESTART_MODE" != "auto-auth" ]; then
|
||||
update_args+=(--no-restart)
|
||||
fi
|
||||
set +e
|
||||
env -u OPENCLAW_GATEWAY_TOKEN -u OPENCLAW_GATEWAY_PASSWORD openclaw "${update_args[@]}" >/tmp/openclaw-upgrade-survivor-update.json 2>/tmp/openclaw-upgrade-survivor-update.err
|
||||
openclaw update --tag "${OPENCLAW_CURRENT_PACKAGE_TGZ:?missing OPENCLAW_CURRENT_PACKAGE_TGZ}" --yes --json --no-restart >/tmp/openclaw-upgrade-survivor-update.json 2>/tmp/openclaw-upgrade-survivor-update.err
|
||||
update_status=$?
|
||||
set -e
|
||||
if [ "$update_status" -ne 0 ]; then
|
||||
@@ -299,42 +268,38 @@ if [ "$update_status" -ne 0 ]; then
|
||||
exit "$update_status"
|
||||
fi
|
||||
|
||||
if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
|
||||
echo "Skipping doctor repair until after restart proof."
|
||||
else
|
||||
echo "Running non-interactive doctor repair..."
|
||||
configure_configured_plugin_install_fixture_registry
|
||||
if ! openclaw doctor --fix --non-interactive >/tmp/openclaw-upgrade-survivor-doctor.log 2>&1; then
|
||||
echo "openclaw doctor failed" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-doctor.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
if ! openclaw config validate >>/tmp/openclaw-upgrade-survivor-doctor.log 2>&1; then
|
||||
echo "post-doctor config validation failed" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-doctor.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
echo "Running non-interactive doctor repair..."
|
||||
configure_configured_plugin_install_fixture_registry
|
||||
if ! openclaw doctor --fix --non-interactive >/tmp/openclaw-upgrade-survivor-doctor.log 2>&1; then
|
||||
echo "openclaw doctor failed" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-doctor.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
if ! openclaw config validate >>/tmp/openclaw-upgrade-survivor-doctor.log 2>&1; then
|
||||
echo "post-doctor config validation failed" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-doctor.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Verifying config and state survived update..."
|
||||
echo "Verifying config and state survived update/doctor..."
|
||||
node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-config
|
||||
node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-state
|
||||
|
||||
if [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
|
||||
echo "Gateway restart was handled by openclaw update."
|
||||
else
|
||||
echo "Starting gateway from upgraded state..."
|
||||
start_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
openclaw gateway --port "$PORT" --bind loopback --allow-unconfigured >"$GATEWAY_LOG" 2>&1 &
|
||||
gateway_pid="$!"
|
||||
openclaw_e2e_wait_gateway_ready "$gateway_pid" "$GATEWAY_LOG" 360
|
||||
ready_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
start_seconds=$(((ready_epoch - start_epoch + 999) / 1000))
|
||||
if [ "$start_seconds" -gt "$START_BUDGET" ]; then
|
||||
echo "gateway startup exceeded survivor budget: ${start_seconds}s > ${START_BUDGET}s" >&2
|
||||
cat "$GATEWAY_LOG" >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
PORT=18789
|
||||
START_BUDGET="${OPENCLAW_UPGRADE_SURVIVOR_START_BUDGET_SECONDS:-90}"
|
||||
STATUS_BUDGET="${OPENCLAW_UPGRADE_SURVIVOR_STATUS_BUDGET_SECONDS:-30}"
|
||||
|
||||
echo "Starting gateway from upgraded state..."
|
||||
start_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
openclaw gateway --port "$PORT" --bind loopback --allow-unconfigured >/tmp/openclaw-upgrade-survivor-gateway.log 2>&1 &
|
||||
gateway_pid="$!"
|
||||
openclaw_e2e_wait_gateway_ready "$gateway_pid" /tmp/openclaw-upgrade-survivor-gateway.log 360
|
||||
ready_epoch="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
start_seconds=$(((ready_epoch - start_epoch + 999) / 1000))
|
||||
if [ "$start_seconds" -gt "$START_BUDGET" ]; then
|
||||
echo "gateway startup exceeded survivor budget: ${start_seconds}s > ${START_BUDGET}s" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-gateway.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Checking gateway HTTP probes..."
|
||||
@@ -355,8 +320,7 @@ status_start="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
if ! openclaw gateway status --url "ws://127.0.0.1:$PORT" --token "$GATEWAY_AUTH_TOKEN_REF" --require-rpc --timeout 30000 --json >/tmp/openclaw-upgrade-survivor-status.json 2>/tmp/openclaw-upgrade-survivor-status.err; then
|
||||
echo "gateway status failed" >&2
|
||||
cat /tmp/openclaw-upgrade-survivor-status.err >&2 || true
|
||||
cat "$GATEWAY_LOG" >&2 || true
|
||||
cat "$SYSTEMCTL_SHIM_DAEMON_LOG" >&2 || true
|
||||
cat /tmp/openclaw-upgrade-survivor-gateway.log >&2 || true
|
||||
exit 1
|
||||
fi
|
||||
status_end="$(node -e "process.stdout.write(String(Date.now()))")"
|
||||
@@ -368,5 +332,5 @@ if [ "$status_seconds" -gt "$STATUS_BUDGET" ]; then
|
||||
fi
|
||||
node scripts/e2e/lib/upgrade-survivor/assertions.mjs assert-status-json /tmp/openclaw-upgrade-survivor-status.json
|
||||
|
||||
echo "Upgrade survivor Docker E2E passed scenario=${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-base} updateRestartMode=${UPDATE_RESTART_MODE} startup=${start_seconds}s status=${status_seconds}s."
|
||||
echo "Upgrade survivor Docker E2E passed scenario=${OPENCLAW_UPGRADE_SURVIVOR_SCENARIO:-base} startup=${start_seconds}s status=${status_seconds}s."
|
||||
'
|
||||
|
||||
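The startup budget check in the script above rounds elapsed milliseconds up to whole seconds with `(ready_epoch - start_epoch + 999) / 1000` before comparing against the budget. A minimal JavaScript sketch of the same arithmetic follows; the function names `elapsedSecondsCeil` and `exceedsBudget` are hypothetical and do not appear in the repository.

```javascript
// Hypothetical helper mirroring the shell arithmetic
// $(((ready_epoch - start_epoch + 999) / 1000)): integer ceiling division.
function elapsedSecondsCeil(startMs, readyMs) {
  return Math.floor((readyMs - startMs + 999) / 1000);
}

// Hypothetical helper mirroring `[ "$start_seconds" -gt "$START_BUDGET" ]`.
function exceedsBudget(startMs, readyMs, budgetSeconds) {
  return elapsedSecondsCeil(startMs, readyMs) > budgetSeconds;
}

// 90 is the default budget used by the script when the env var is unset.
console.log(elapsedSecondsCeil(0, 90000), exceedsBudget(0, 90000, 90));
console.log(elapsedSecondsCeil(0, 90001), exceedsBudget(0, 90001, 90));
```

Rounding up means a start that takes even one millisecond over the budget is counted as a full extra second and fails the check.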
@@ -28,20 +28,6 @@ const PLUGIN_DOC_ALIASES = new Map([
|
||||
["tavily", "/tools/tavily"],
|
||||
["tokenjuice", "/tools/tokenjuice"],
|
||||
]);
|
||||
const PLUGIN_REFERENCE_EXTRA_SECTIONS = new Map([
|
||||
[
|
||||
"whatsapp",
|
||||
`## Windows install note
|
||||
|
||||
On Windows, the WhatsApp plugin needs Git on \`PATH\` during npm install because one of its Baileys/libsignal dependencies is fetched from a git URL. Install Git for Windows, then restart the shell and rerun the install:
|
||||
|
||||
\`\`\`powershell
|
||||
winget install --id Git.Git -e
|
||||
\`\`\`
|
||||
|
||||
Portable Git also works if its \`bin\` directory is on \`PATH\`.`,
|
||||
],
|
||||
]);
|
||||
|
||||
function readJson(relativePath) {
|
||||
return JSON.parse(fs.readFileSync(path.join(ROOT, relativePath), "utf8"));
|
||||
@@ -390,7 +376,6 @@ ${record.docs.map((link) => `- ${docLink(link)}`).join("\n")}`;
|
||||
|
||||
function renderReferencePage(record) {
|
||||
const relatedDocs = renderRelatedDocs(record);
|
||||
const extraSections = PLUGIN_REFERENCE_EXTRA_SECTIONS.get(record.id);
|
||||
return `---
|
||||
summary: "${record.description.replaceAll('"', '\\"')}"
|
||||
read_when:
|
||||
@@ -409,7 +394,7 @@ ${record.description}
|
||||
|
||||
## Surface
|
||||
|
||||
${record.surface}${extraSections ? `\n\n${extraSections}` : ""}${relatedDocs ? `\n\n${relatedDocs}` : ""}
|
||||
${record.surface}${relatedDocs ? `\n\n${relatedDocs}` : ""}
|
||||
`;
|
||||
}
|
||||
|
||||
|
||||
@@ -1,13 +1,5 @@
|
||||
// Barnacle owns deterministic GitHub triage and auto-response behavior.
|
||||
|
||||
import {
|
||||
MOCK_ONLY_PROOF_LABEL,
|
||||
NEEDS_REAL_BEHAVIOR_PROOF_LABEL,
|
||||
PROOF_OVERRIDE_LABEL,
|
||||
evaluateRealBehaviorProof,
|
||||
labelsForRealBehaviorProof,
|
||||
} from "./real-behavior-proof-policy.mjs";
|
||||
|
||||
const activePrLimit = 20;
|
||||
|
||||
const thirdPartyExtensionMessage =
|
||||
@@ -142,18 +134,6 @@ export const managedLabelSpecs = {
|
||||
color: "C5DEF5",
|
||||
description: "Candidate: PR template appears mostly untouched.",
|
||||
},
|
||||
[NEEDS_REAL_BEHAVIOR_PROOF_LABEL]: {
|
||||
color: "C5DEF5",
|
||||
description: "Candidate: external PR needs after-fix proof from a real setup.",
|
||||
},
|
||||
[MOCK_ONLY_PROOF_LABEL]: {
|
||||
color: "C5DEF5",
|
||||
description: "Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI.",
|
||||
},
|
||||
[PROOF_OVERRIDE_LABEL]: {
|
||||
color: "C2E0C6",
|
||||
description: "Maintainer override for the external PR real behavior proof gate.",
|
||||
},
|
||||
"triage: dirty-candidate": {
|
||||
color: "C5DEF5",
|
||||
description: "Candidate: broad unrelated surfaces; may need splitting or cleanup.",
|
||||
@@ -174,8 +154,6 @@ export const candidateLabels = {
|
||||
docsDiscoverability: "triage: docs-discoverability",
|
||||
testOnlyNoBug: "triage: test-only-no-bug",
|
||||
refactorOnly: "triage: refactor-only",
|
||||
needsRealBehaviorProof: NEEDS_REAL_BEHAVIOR_PROOF_LABEL,
|
||||
mockOnlyProof: MOCK_ONLY_PROOF_LABEL,
|
||||
dirtyCandidate: "triage: dirty-candidate",
|
||||
riskyInfra: "triage: risky-infra",
|
||||
externalPluginCandidate: "triage: external-plugin-candidate",
|
||||
@@ -218,23 +196,10 @@ const maintainerAuthorLabel = "maintainer";
|
||||
const privilegedAuthorAssociations = new Set(["OWNER", "MEMBER", "COLLABORATOR"]);
|
||||
const privilegedRepositoryRoles = new Set(["admin", "maintain", "write"]);
|
||||
const candidateLabelValues = Object.values(candidateLabels);
|
||||
const proofCandidateLabelValues = [NEEDS_REAL_BEHAVIOR_PROOF_LABEL, MOCK_ONLY_PROOF_LABEL];
|
||||
const noisyPrMessage =
|
||||
"Closing this PR because it looks dirty (too many unrelated or unexpected changes). This usually happens when a branch picks up unrelated commits or a merge went sideways. Please recreate the PR from a clean branch.";
|
||||
|
||||
const candidateActionRules = [
|
||||
{
|
||||
label: candidateLabels.needsRealBehaviorProof,
|
||||
close: true,
|
||||
message:
|
||||
"Closing this PR because it does not include real behavior proof. Please reopen or resubmit with after-fix evidence from a real OpenClaw setup; terminal screenshots, console output, redacted logs, recordings, linked artifacts, and copied live output count. Unit tests, mocks, snapshots, lint, typechecks, and CI are supplemental only.",
|
||||
},
|
||||
{
|
||||
label: candidateLabels.mockOnlyProof,
|
||||
close: true,
|
||||
message:
|
||||
"Closing this PR because the proof only shows tests, mocks, snapshots, lint, typechecks, or CI. Please reopen or resubmit with after-fix evidence from a real OpenClaw setup; terminal screenshots, console output, redacted logs, recordings, linked artifacts, and copied live output count.",
|
||||
},
|
||||
{
|
||||
label: candidateLabels.dirtyCandidate,
|
||||
close: true,
|
||||
@@ -473,14 +438,6 @@ export function classifyPullRequestCandidateLabels(pullRequest, files) {
|
||||
labelsToAdd.push(candidateLabels.blankTemplate);
|
||||
}
|
||||
|
||||
labelsToAdd.push(
|
||||
...labelsForRealBehaviorProof(
|
||||
evaluateRealBehaviorProof({
|
||||
pullRequest,
|
||||
}),
|
||||
),
|
||||
);
|
||||
|
||||
const docsOnly = filenames.every(isMarkdownOrDocsFile);
|
||||
const docsSignal =
|
||||
/\b(add|adds|update|updates|fix|fixes|improve|cleanup|clean up|typo|readme|docs?|documentation|translation|translate)\b/i.test(
|
||||
@@ -761,18 +718,14 @@ async function addMissingLabels(github, context, core, issueNumber, labels, labe
|
||||
|
||||
async function applyPullRequestCandidateLabels(github, context, core, pullRequest, labelSet) {
|
||||
const files = await listPullRequestFiles(github, context, pullRequest);
|
||||
const classifiedLabels = classifyPullRequestCandidateLabels(
|
||||
{
|
||||
...pullRequest,
|
||||
labels: [...labelSet].map((name) => ({ name })),
|
||||
},
|
||||
files,
|
||||
await addMissingLabels(
|
||||
github,
|
||||
context,
|
||||
core,
|
||||
pullRequest.number,
|
||||
classifyPullRequestCandidateLabels(pullRequest, files),
|
||||
labelSet,
|
||||
);
|
||||
const staleProofLabels = proofCandidateLabelValues.filter(
|
||||
(label) => labelSet.has(label) && !classifiedLabels.includes(label),
|
||||
);
|
||||
await removeLabels(github, context, pullRequest.number, staleProofLabels, labelSet);
|
||||
await addMissingLabels(github, context, core, pullRequest.number, classifiedLabels, labelSet);
|
||||
}
|
||||
|
||||
function isAutomationUser(user, fallbackLogin = "") {
|
||||
@@ -978,9 +931,7 @@ export async function runBarnacleAutoResponse({ github, context, core = console
|
||||
const isLabelEvent = context.payload.action === "labeled";
|
||||
const isPrCandidateEvent =
|
||||
pullRequest &&
|
||||
["opened", "edited", "synchronize", "reopened", "labeled", "unlabeled"].includes(
|
||||
context.payload.action,
|
||||
);
|
||||
["opened", "edited", "synchronize", "reopened", "labeled"].includes(context.payload.action);
|
||||
if (!hasTriggerLabel && !isLabelEvent && !isPrCandidateEvent) {
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -1,34 +0,0 @@
|
||||
#!/usr/bin/env node
|
||||
import { readFileSync } from "node:fs";
|
||||
import { evaluateRealBehaviorProof } from "./real-behavior-proof-policy.mjs";
|
||||
|
||||
function escapeCommandValue(value) {
|
||||
return String(value)
|
||||
.replace(/%/g, "%25")
|
||||
.replace(/\r/g, "%0D")
|
||||
.replace(/\n/g, "%0A")
|
||||
.replace(/:/g, "%3A");
|
||||
}
|
||||
|
||||
const eventPath = process.env.GITHUB_EVENT_PATH;
|
||||
if (!eventPath) {
|
||||
console.error("::error title=Real behavior proof failed::GITHUB_EVENT_PATH is not set.");
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const event = JSON.parse(readFileSync(eventPath, "utf8"));
|
||||
const pullRequest = event.pull_request;
|
||||
if (!pullRequest) {
|
||||
console.log("No pull_request payload found; skipping real behavior proof gate.");
|
||||
process.exit(0);
|
||||
}
|
||||
|
||||
const evaluation = evaluateRealBehaviorProof({ pullRequest });
|
||||
if (evaluation.passed) {
|
||||
console.log(evaluation.reason);
|
||||
process.exit(0);
|
||||
}
|
||||
|
||||
const message = `${evaluation.reason} Add after-fix evidence from a real OpenClaw setup in the PR body. Screenshots, recordings, terminal screenshots, console output, redacted runtime logs, linked artifacts, or copied live output count. Unit tests, mocks, snapshots, lint, typechecks, and CI are supplemental only. A maintainer can apply proof: override when appropriate.`;
|
||||
console.error(`::error title=Real behavior proof required::${escapeCommandValue(message)}`);
|
||||
process.exit(1);
|
||||
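To illustrate what `escapeCommandValue` in the script above does to a message before it is written as a `::error` workflow command, here is a small standalone sketch. The function body is copied from the script; the sample message is made up for illustration.

```javascript
// Copied from the check script above so the sketch runs on its own.
function escapeCommandValue(value) {
  return String(value)
    .replace(/%/g, "%25")
    .replace(/\r/g, "%0D")
    .replace(/\n/g, "%0A")
    .replace(/:/g, "%3A");
}

// Hypothetical message containing each character class the function escapes.
const sampleMessage = "50% done\nsee: log";
const escaped = escapeCommandValue(sampleMessage);
console.log(escaped);
```

The percent sign is escaped first, so the `%0A` and `%3A` sequences produced by the later replacements are not escaped a second time.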
@@ -1,284 +0,0 @@
|
||||
export const PROOF_OVERRIDE_LABEL = "proof: override";
|
||||
export const NEEDS_REAL_BEHAVIOR_PROOF_LABEL = "triage: needs-real-behavior-proof";
|
||||
export const MOCK_ONLY_PROOF_LABEL = "triage: mock-only-proof";
|
||||
|
||||
const privilegedAuthorAssociations = new Set(["OWNER", "MEMBER", "COLLABORATOR"]);
|
||||
|
||||
const requiredProofFields = [
|
||||
{
|
||||
key: "behavior",
|
||||
names: ["Behavior or issue addressed", "Issue addressed", "Behavior addressed"],
|
||||
},
|
||||
{
|
||||
key: "environment",
|
||||
names: ["Real environment tested", "Environment tested", "Real setup tested"],
|
||||
},
|
||||
{
|
||||
key: "steps",
|
||||
names: [
|
||||
"Exact steps or command run after this patch",
|
||||
"Exact steps or command run after the patch",
|
||||
"Exact steps or command run after fix",
|
||||
"Steps run after the patch",
|
||||
"Command run after the patch",
|
||||
],
|
||||
},
|
||||
{
|
||||
key: "evidence",
|
||||
names: [
|
||||
"Evidence after fix",
|
||||
"After-fix evidence",
|
||||
"Evidence link or embedded proof",
|
||||
"Evidence",
|
||||
],
|
||||
},
|
||||
{
|
||||
key: "observedResult",
|
||||
names: ["Observed result after fix", "Observed result after the fix", "Observed result"],
|
||||
},
|
||||
{
|
||||
key: "notTested",
|
||||
names: ["What was not tested", "Not tested"],
|
||||
allowNone: true,
|
||||
},
|
||||
];
|
||||
|
||||
const allProofFieldNames = requiredProofFields
|
||||
.flatMap((field) => field.names)
|
||||
.concat(["Before evidence", "Before evidence optional"]);
|
||||
|
||||
const missingValueRegex =
|
||||
/^(?:n\/?a|not applicable|tbd|todo|unknown|unsure|none provided|no evidence|not tested|untested|-|\[[^\]]*\])$/i;
|
||||
|
||||
const standaloneMissingProofRegex =
|
||||
/^\s*(?:[-*]\s*)?(?:n\/?a|not applicable|not tested|untested|no evidence|did not test|didn't test|could not test|couldn't test)\s*\.?\s*$/im;
|
||||
|
||||
const mockOnlyEvidenceRegex =
|
||||
/\b(?:pnpm|npm|yarn|bun)\s+(?:run\s+)?(?:test|vitest|lint|typecheck|tsgo|build|check)\b|\b(?:vitest|unit tests?|mock(?:ed|s)?|snapshots?|lint|typechecks?|tsgo|ci(?:\s+passes?)?)\b/i;
|
||||
|
||||
const artifactEvidenceRegex =
|
||||
/!\[[^\]]*\]\([^)]+\)|github\.com\/user-attachments\/assets\/|github\.com\/[^/\s]+\/[^/\s]+\/actions\/runs\/\d+\/artifacts\/\d+|https?:\/\/\S+\.(?:png|jpe?g|gif|webp|mp4|mov|webm)\b/i;
|
||||
|
||||
const evidenceDescriptorRegex =
|
||||
/\b(?:screenshot|screen\s*recording|recording|terminal\s+(?:capture|screenshot|transcript|output)|console\s+(?:output|log)|runtime\s+logs?|redacted\s+logs?|live\s+output|actual\s+output|observed\s+output|stdout|stderr|stack trace|trace excerpt|log excerpt|linked\s+artifacts?|artifact\s+links?)\b|```[\s\S]*\n[\s\S]*\n```/i;
|
||||
|
||||
const liveCommandRegex =
|
||||
/\b(?:openclaw|node|docker|curl|gh|ssh|adb|xcrun|xcodebuild|open|npm\s+run|pnpm\s+openclaw)\b/i;
|
||||
|
||||
const mockOnlyEvidenceStripRegex =
|
||||
/\b(?:pnpm|npm|yarn|bun)\s+(?:run\s+)?(?:test|vitest|lint|typecheck|tsgo|build|check)\b|\b(?:vitest|unit tests?|mock(?:ed|s)?|snapshots?|lint|typechecks?|tsgo|ci(?:\s+passes?)?|tests?|passed|passes|green|success|succeeded|with|and|the|branch|only|output|transcript|capture|fenced)\b/gi;
|
||||
|
||||
const evidenceDescriptorStripRegex =
|
||||
/\b(?:screenshot|screen\s*recording|recording|terminal\s+(?:capture|screenshot|transcript|output)|console\s+(?:output|log)|runtime\s+logs?|redacted\s+logs?|live\s+output|actual\s+output|observed\s+output|stdout|stderr|stack trace|trace excerpt|log excerpt|linked\s+artifacts?|artifact\s+links?)\b/gi;
|
||||
|
||||
function escapeRegex(text) {
|
||||
return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
|
||||
}
|
||||
|
||||
function labelNames(labels) {
|
||||
return new Set(
|
||||
(labels ?? [])
|
||||
.map((label) => (typeof label === "string" ? label : label?.name))
|
||||
.filter((label) => typeof label === "string"),
|
||||
);
|
||||
}
|
||||
|
||||
function isAutomationUser(user = {}, fallbackLogin = "") {
|
||||
const login = user?.login ?? fallbackLogin;
|
||||
return user?.type === "Bot" || /\[bot\]$/i.test(login) || login.startsWith("app/");
|
||||
}
|
||||
|
||||
export function isExternalPullRequest(pullRequest) {
|
||||
if (!pullRequest) {
|
||||
return false;
|
||||
}
|
||||
if (isAutomationUser(pullRequest.user)) {
|
||||
return false;
|
||||
}
|
||||
const authorAssociation = String(
|
||||
pullRequest.author_association ?? pullRequest.authorAssociation ?? "",
|
||||
).toUpperCase();
|
||||
return !privilegedAuthorAssociations.has(authorAssociation);
|
||||
}
|
||||
|
||||
export function hasProofOverride(labels) {
|
||||
return labelNames(labels).has(PROOF_OVERRIDE_LABEL);
|
||||
}
|
||||
|
||||
export function extractRealBehaviorProofSection(body = "") {
|
||||
const headingRegex = /^#{2,6}\s+real behavior proof\b[^\n]*$/gim;
|
||||
const match = headingRegex.exec(body);
|
||||
if (!match) {
|
||||
return "";
|
||||
}
|
||||
const sectionStart = match.index + match[0].length;
|
||||
const rest = body.slice(sectionStart);
|
||||
const nextHeading = rest.match(/\n#{1,6}\s+\S/);
|
||||
return (nextHeading ? rest.slice(0, nextHeading.index) : rest).trim();
|
||||
}
|
||||
|
||||
function fieldLineRegex(name) {
|
||||
return new RegExp(
|
||||
`^\\s*(?:[-*]\\s*)?(?:\\*\\*)?${escapeRegex(name)}(?:\\s*\\([^)]*\\))?(?:\\*\\*)?\\s*:\\s*(.*)$`,
|
||||
"i",
|
||||
);
|
||||
}
|
||||
|
||||
function isAnyProofFieldLine(line) {
|
||||
return allProofFieldNames.some((name) => fieldLineRegex(name).test(line));
|
||||
}
|
||||
|
||||
function extractFieldValue(section, field) {
|
||||
const lines = section.split("\n");
|
||||
for (let index = 0; index < lines.length; index += 1) {
|
||||
const matchingName = field.names.find((name) => fieldLineRegex(name).test(lines[index]));
|
||||
if (!matchingName) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const match = lines[index].match(fieldLineRegex(matchingName));
|
||||
const valueLines = [match?.[1] ?? ""];
|
||||
for (let next = index + 1; next < lines.length; next += 1) {
|
||||
const line = lines[next];
|
||||
if (/^#{1,6}\s+\S/.test(line) || isAnyProofFieldLine(line)) {
|
||||
break;
|
||||
}
|
||||
valueLines.push(line);
|
||||
}
|
||||
return valueLines.join("\n").trim();
|
||||
}
|
||||
return "";
|
||||
}
|
||||
|
||||
function stripProofFieldLabels(section) {
|
||||
return section
|
||||
.split("\n")
|
||||
.map((line) => {
|
||||
if (!isAnyProofFieldLine(line)) {
|
||||
return line;
|
||||
}
|
||||
const matchingName = allProofFieldNames.find((name) => fieldLineRegex(name).test(line));
|
||||
const match = matchingName ? line.match(fieldLineRegex(matchingName)) : null;
|
||||
return match?.[1] ?? "";
|
||||
})
|
||||
.join("\n");
|
||||
}
|
||||
|
||||
function isMissingValue(value, field) {
|
||||
const trimmed = value.trim();
|
||||
if (!trimmed) {
|
||||
return true;
|
||||
}
|
||||
if (
|
||||
field.allowNone &&
|
||||
/^(?:none|nothing else|no known gaps|no additional gaps)$/i.test(trimmed)
|
||||
) {
|
||||
return false;
|
||||
}
|
||||
return missingValueRegex.test(trimmed);
|
||||
}
|
||||
|
||||
function hasNonMockEvidencePayload(value) {
|
||||
const payload = value
|
||||
.replace(evidenceDescriptorStripRegex, "")
|
||||
.replace(mockOnlyEvidenceStripRegex, "")
|
||||
.replace(/```(?:\w+)?|```/g, "")
|
||||
.replace(/[`$>:\-_.()[\]\s]+/g, "");
|
||||
return Boolean(payload);
|
||||
}
|
||||
|
||||
function result(status, reason, details = {}) {
|
||||
return {
|
||||
status,
|
||||
reason,
|
||||
applies: ["passed", "missing", "mock_only", "insufficient", "override"].includes(status),
|
||||
passed: ["passed", "skipped", "override"].includes(status),
|
||||
...details,
|
||||
};
|
||||
}
|
||||
|
||||
export function evaluateRealBehaviorProof({ pullRequest, labels } = {}) {
|
||||
const currentLabels = labels ?? pullRequest?.labels ?? [];
|
||||
if (hasProofOverride(currentLabels)) {
|
||||
return result("override", `Maintainer override label ${PROOF_OVERRIDE_LABEL} is present.`);
|
||||
}
|
||||
if (!isExternalPullRequest(pullRequest)) {
|
||||
return result("skipped", "Maintainer, collaborator, or bot PRs do not require this gate.");
|
||||
}
|
||||
|
||||
const section = extractRealBehaviorProofSection(pullRequest?.body ?? "");
|
||||
if (!section) {
|
||||
return result(
|
||||
"missing",
|
||||
"External PRs must include a Real behavior proof section with after-fix evidence from a real setup.",
|
||||
);
|
||||
}
|
||||
|
||||
const fields = Object.fromEntries(
|
||||
requiredProofFields.map((field) => [field.key, extractFieldValue(section, field)]),
|
||||
);
|
||||
const missingFields = requiredProofFields
|
||||
.filter((field) => isMissingValue(fields[field.key] ?? "", field))
|
||||
.map((field) => field.key);
|
||||
if (missingFields.length > 0) {
|
||||
return result(
|
||||
"missing",
|
||||
`Real behavior proof is missing required field content: ${missingFields.join(", ")}.`,
|
||||
{ fields, missingFields },
|
||||
);
|
||||
}
|
||||
|
||||
const proofContent = stripProofFieldLabels(section);
|
||||
if (standaloneMissingProofRegex.test(proofContent)) {
|
||||
return result("insufficient", "Real behavior proof says the changed behavior was not tested.", {
|
||||
fields,
|
||||
});
|
||||
}
|
||||
|
||||
const evidenceContent = [fields.evidence, fields.observedResult].join("\n");
|
||||
const proofContentForMockDetection = [fields.evidence, fields.observedResult, fields.steps].join(
|
||||
"\n",
|
||||
);
|
||||
const hasArtifactEvidence = artifactEvidenceRegex.test(evidenceContent);
|
||||
const hasNonMockPayload = hasNonMockEvidencePayload(evidenceContent);
|
||||
const hasMockEvidenceSignal = mockOnlyEvidenceRegex.test(proofContentForMockDetection);
|
||||
if (hasMockEvidenceSignal && !hasArtifactEvidence && !hasNonMockPayload) {
|
||||
return result(
|
||||
"mock_only",
|
||||
"Unit tests, mocks, snapshots, lint, typechecks, and CI are supplemental and do not count as real behavior proof.",
|
||||
{ fields },
|
||||
);
|
||||
}
|
||||
|
||||
const hasRealEvidence =
|
||||
hasArtifactEvidence ||
|
||||
(evidenceDescriptorRegex.test(evidenceContent) && hasNonMockPayload) ||
|
||||
liveCommandRegex.test(evidenceContent);
|
||||
if (hasMockEvidenceSignal && !hasRealEvidence) {
|
||||
return result(
|
||||
"mock_only",
|
||||
"Unit tests, mocks, snapshots, lint, typechecks, and CI are supplemental and do not count as real behavior proof.",
|
||||
{ fields },
|
||||
);
|
||||
}
|
||||
|
||||
if (!hasRealEvidence) {
|
||||
return result(
|
||||
"insufficient",
|
||||
"Real behavior proof must include an after-fix screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output.",
|
||||
{ fields },
|
||||
);
|
||||
}
|
||||
|
||||
return result("passed", "External PR includes after-fix real behavior proof.", { fields });
|
||||
}
|
||||
|
||||
export function labelsForRealBehaviorProof(evaluation) {
|
||||
if (evaluation.status === "mock_only") {
|
||||
return [MOCK_ONLY_PROOF_LABEL];
|
||||
}
|
||||
if (evaluation.status === "missing" || evaluation.status === "insufficient") {
|
||||
return [NEEDS_REAL_BEHAVIOR_PROOF_LABEL];
|
||||
}
|
||||
return [];
|
||||
}
|
||||
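As a rough illustration of how the policy module above locates the proof section in a PR body, the sketch below runs `extractRealBehaviorProofSection` against a sample body. The function is copied from the module; the PR body is hypothetical and only loosely follows the field names listed in `requiredProofFields`.

```javascript
// Copied from the policy module above so the sketch runs on its own.
function extractRealBehaviorProofSection(body = "") {
  const headingRegex = /^#{2,6}\s+real behavior proof\b[^\n]*$/gim;
  const match = headingRegex.exec(body);
  if (!match) {
    return "";
  }
  const sectionStart = match.index + match[0].length;
  const rest = body.slice(sectionStart);
  // The section ends at the next Markdown heading of any level.
  const nextHeading = rest.match(/\n#{1,6}\s+\S/);
  return (nextHeading ? rest.slice(0, nextHeading.index) : rest).trim();
}

// Hypothetical PR body, for illustration only.
const sampleBody = [
  "## Summary",
  "Fix gateway restart.",
  "",
  "## Real behavior proof",
  "- Behavior or issue addressed: gateway fails to restart after update",
  "- Evidence after fix: console output from a live gateway",
  "",
  "## Checklist",
  "- [x] done",
].join("\n");

const section = extractRealBehaviorProofSection(sampleBody);
console.log(section);
```

Only the text between the `Real behavior proof` heading and the next heading is returned, so fields written under a different heading would not be seen by the field checks that follow.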
@@ -9,8 +9,6 @@ const LIVE_PROFILE_TIMEOUT_MS = 20 * 60 * 1000;
|
||||
const OPENWEBUI_TIMEOUT_MS = 20 * 60 * 1000;
|
||||
export const BUNDLED_PLUGIN_INSTALL_UNINSTALL_SHARDS = 24;
|
||||
const upgradeSurvivorCommand = "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:upgrade-survivor";
|
||||
const updateRestartAuthCommand =
|
||||
"OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:update-restart-auth";
|
||||
|
||||
const LIVE_RETRY_PATTERNS = [
|
||||
/529\b/i,
|
||||
@@ -240,11 +238,6 @@ export const mainLanes = [
|
||||
weight: 3,
|
||||
},
|
||||
),
|
||||
npmLane("update-restart-auth", updateRestartAuthCommand, {
|
||||
stateScenario: "upgrade-survivor",
|
||||
timeoutMs: 25 * 60 * 1000,
|
||||
weight: 3,
|
||||
}),
|
||||
npmLane("update-migration", "OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:update-migration", {
|
||||
stateScenario: "upgrade-survivor",
|
||||
timeoutMs: 30 * 60 * 1000,
|
||||
@@ -543,11 +536,6 @@ const releasePathPackageUpdateCoreLanes = [
|
||||
weight: 3,
|
||||
},
|
||||
),
|
||||
npmLane("update-restart-auth", updateRestartAuthCommand, {
|
||||
stateScenario: "upgrade-survivor",
|
||||
timeoutMs: 25 * 60 * 1000,
|
||||
weight: 3,
|
||||
}),
|
||||
];
|
||||
|
||||
const primaryReleasePathChunks = {
|
||||
|
||||
@@ -1,97 +0,0 @@
|
||||
import { fileURLToPath } from "node:url";
|
||||
|
||||
const BASELINE_SHARDED_LANES = new Set(["published-upgrade-survivor", "update-migration"]);
|
||||
|
||||
function splitTokens(raw) {
|
||||
return [
|
||||
...new Set(
|
||||
String(raw ?? "")
|
||||
.split(/[,\s]+/u)
|
||||
.map((token) => token.trim())
|
||||
.filter(Boolean),
|
||||
),
|
||||
];
|
||||
}
|
||||
|
||||
function parsePositiveInt(raw, fallback, label) {
|
||||
const parsed = Number.parseInt(String(raw ?? ""), 10);
|
||||
if (!Number.isFinite(parsed)) {
|
||||
return fallback;
|
||||
}
|
||||
if (parsed < 1) {
|
||||
throw new Error(`${label} must be a positive integer. Got: ${JSON.stringify(raw)}`);
|
||||
}
|
||||
return parsed;
|
||||
}
|
||||
|
||||
function sanitizeLabel(value) {
|
||||
return (
|
||||
String(value)
|
||||
.replace(/^openclaw@/u, "")
|
||||
.replace(/[^A-Za-z0-9._-]+/g, "-")
|
||||
.replace(/^-+|-+$/g, "") || "targeted"
|
||||
);
|
||||
}
|
||||
|
||||
export function planTargetedDockerLaneGroups({
|
||||
groupSize = 1,
|
||||
lanes,
|
||||
upgradeSurvivorBaselines = "",
|
||||
} = {}) {
|
||||
const selectedLanes = splitTokens(lanes);
|
||||
if (selectedLanes.length === 0) {
|
||||
throw new Error("docker_lanes is required when planning targeted Docker lane groups.");
|
||||
}
|
||||
|
||||
const parsedGroupSize = parsePositiveInt(groupSize, 1, "groupSize");
|
||||
const baselineSpecs = splitTokens(upgradeSurvivorBaselines);
|
||||
const groups = [];
|
||||
let pendingLanes = [];
|
||||
|
||||
const flushPending = () => {
|
||||
if (pendingLanes.length === 0) {
|
||||
return;
|
||||
}
|
||||
const first = sanitizeLabel(pendingLanes[0]);
|
||||
const last = sanitizeLabel(pendingLanes[pendingLanes.length - 1]);
|
||||
const label = pendingLanes.length === 1 ? first : `${first}--${last}`;
|
||||
groups.push({ docker_lanes: pendingLanes.join(" "), label });
|
||||
pendingLanes = [];
|
||||
};
|
||||
|
||||
for (const lane of selectedLanes) {
|
||||
if (BASELINE_SHARDED_LANES.has(lane) && baselineSpecs.length > 1) {
|
||||
flushPending();
|
||||
for (const baselineSpec of baselineSpecs) {
|
||||
groups.push({
|
||||
docker_lanes: lane,
|
||||
label: `${sanitizeLabel(lane)}-${sanitizeLabel(baselineSpec)}`,
|
||||
published_upgrade_survivor_baselines: baselineSpec,
|
||||
});
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
pendingLanes.push(lane);
|
||||
if (pendingLanes.length >= parsedGroupSize) {
|
||||
flushPending();
|
||||
}
|
||||
}
|
||||
|
||||
flushPending();
|
||||
return groups;
|
||||
}
|
||||
|
||||
const isMain = process.argv[1] ? fileURLToPath(import.meta.url) === process.argv[1] : false;
|
||||
|
||||
if (isMain) {
|
||||
process.stdout.write(
|
||||
JSON.stringify(
|
||||
planTargetedDockerLaneGroups({
|
||||
groupSize: process.env.GROUP_SIZE,
|
||||
lanes: process.env.LANES,
|
||||
upgradeSurvivorBaselines: process.env.OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPECS,
|
||||
}),
|
||||
),
|
||||
);
|
||||
}
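The grouping step in `planTargetedDockerLaneGroups` can be sketched in isolation. The following is a minimal illustration, not the script itself: it leaves out the baseline-sharding branch, and `splitTokens`, `sanitizeLabel`, and `chunkLanes` are hypothetical stand-ins whose exact behavior is an assumption, since the real helpers are not shown in this diff. Lane names are made up.

```typescript
type LaneGroup = { docker_lanes: string; label: string };

// Assumed behavior: split on whitespace or commas and drop empty tokens.
function splitTokens(value: string): string[] {
  return value.split(/[\s,]+/).filter((token) => token.length > 0);
}

// Assumed behavior: replace anything outside [A-Za-z0-9._-] with "-".
function sanitizeLabel(value: string): string {
  return value.replace(/[^A-Za-z0-9._-]+/g, "-");
}

// Chunk the selected lanes into groups of at most `groupSize`, labelling each
// group by its first and last lane (or just the lane name for single-lane groups).
function chunkLanes(lanes: string, groupSize: number): LaneGroup[] {
  if (groupSize < 1 || Math.floor(groupSize) !== groupSize) {
    throw new Error(`groupSize must be a positive integer: ${groupSize}`);
  }
  const selected = splitTokens(lanes);
  const groups: LaneGroup[] = [];
  for (let start = 0; start < selected.length; start += groupSize) {
    const chunk = selected.slice(start, start + groupSize);
    const first = sanitizeLabel(chunk[0]);
    const last = sanitizeLabel(chunk[chunk.length - 1]);
    groups.push({
      docker_lanes: chunk.join(" "),
      label: chunk.length === 1 ? first : `${first}--${last}`,
    });
  }
  return groups;
}

// Three hypothetical lanes with a group size of 2 yield two groups.
const exampleLaneGroups = chunkLanes("lane-a lane-b lane-c", 2);
```

The real function additionally flushes the pending group whenever it meets a baseline-sharded lane, so those lanes always get one group per baseline spec.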
|
||||
@@ -128,19 +128,6 @@ export function resolveReleaseHistory(args) {
|
||||
return dedupeSpecs(versions);
|
||||
}
|
||||
|
||||
export function resolveLastStable(args, count) {
|
||||
const releasesJson = args.get("releases-json");
|
||||
if (!releasesJson) {
|
||||
throw new Error("--releases-json is required when requested baselines include last-stable-*");
|
||||
}
|
||||
if (!Number.isInteger(count) || count < 1) {
|
||||
throw new Error(`invalid last-stable baseline count: ${count}`);
|
||||
}
|
||||
const publishedVersions = readPublishedVersions(args.get("npm-versions-json"));
|
||||
const releases = readStableReleases(releasesJson, publishedVersions);
|
||||
return dedupeSpecs(releases.slice(0, count).map((release) => release.version));
|
||||
}
|
||||
|
||||
export function resolveAllSince(args, minimumVersion) {
|
||||
const releasesJson = args.get("releases-json");
|
||||
if (!releasesJson) {
|
||||
@@ -162,13 +149,11 @@ export function resolveBaselines(args) {
|
||||
if (requestedTokens.length === 0) {
|
||||
return dedupeSpecs([fallback]);
|
||||
}
|
||||
const exactTokens = [];
|
||||
const resolved = [];
|
||||
for (const token of requestedTokens) {
|
||||
if (token === "release-history") {
|
||||
resolved.push(...resolveReleaseHistory(args));
|
||||
} else if (token.startsWith("last-stable-")) {
|
||||
const count = Number.parseInt(token.slice("last-stable-".length), 10);
|
||||
resolved.push(...resolveLastStable(args, count));
|
||||
} else if (token.startsWith("all-since-")) {
|
||||
const minimumVersion = token.slice("all-since-".length);
|
||||
if (!parseStableVersion(minimumVersion)) {
|
||||
@@ -176,10 +161,10 @@ export function resolveBaselines(args) {
|
||||
}
|
||||
resolved.push(...resolveAllSince(args, minimumVersion));
|
||||
} else {
|
||||
resolved.push(token);
|
||||
exactTokens.push(token);
|
||||
}
|
||||
}
|
||||
return dedupeSpecs(resolved);
|
||||
return dedupeSpecs([...exactTokens, ...resolved]);
|
||||
}
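To make the token handling in `resolveBaselines` concrete, here is a hedged sketch that covers only `last-stable-N` and exact version tokens. `resolveBaselineTokens` and the newest-first `stableNewestFirst` list are hypothetical: the real script reads releases from JSON files, and the version strings in the usage line are invented. The sketch keeps tokens in request order; the hunk above also shows a variant that lists exact tokens first.

```typescript
// Expand baseline tokens into concrete version specs.
// `stableNewestFirst` is an assumed, already-sorted list of stable versions.
function resolveBaselineTokens(tokens: string[], stableNewestFirst: string[]): string[] {
  const resolved: string[] = [];
  for (const token of tokens) {
    const lastStable = /^last-stable-(\d+)$/.exec(token);
    if (lastStable) {
      const count = parseInt(lastStable[1], 10);
      if (count < 1) {
        throw new Error(`invalid last-stable baseline count: ${count}`);
      }
      // Take the `count` most recent stable releases.
      resolved.push(...stableNewestFirst.slice(0, count));
    } else {
      // Anything else is treated as an exact version spec.
      resolved.push(token);
    }
  }
  // Order-preserving de-duplication: the first occurrence wins.
  return resolved.filter((spec, index) => resolved.indexOf(spec) === index);
}

// Hypothetical versions: one exact spec plus the two newest stable releases.
const exampleBaselines = resolveBaselineTokens(
  ["2026.1.1", "last-stable-2"],
  ["2026.3.1", "2026.2.1", "2026.1.1"],
);
```

Whether the real `dedupeSpecs` preserves first-occurrence order is an assumption here; the sketch only shows why the position of exact tokens in the list can matter once duplicates are removed.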
|
||||
|
||||
const isMain = process.argv[1] ? fileURLToPath(import.meta.url) === process.argv[1] : false;
|
||||
|
||||
@@ -64,21 +64,18 @@ export const LIVE_CACHE_REGRESSION_BASELINE = {
|
||||
observedHitRate: 0.891,
|
||||
minCacheRead: 4_096,
|
||||
minHitRate: 0.85,
|
||||
warnOnly: true,
|
||||
},
|
||||
stable: {
|
||||
observedCacheRead: 4_864,
|
||||
observedHitRate: 0.966,
|
||||
minCacheRead: 4_608,
|
||||
minHitRate: 0.9,
|
||||
warnOnly: true,
|
||||
},
|
||||
tool: {
|
||||
observedCacheRead: 4_608,
|
||||
observedHitRate: 0.896,
|
||||
minCacheRead: 4_096,
|
||||
minHitRate: 0.85,
|
||||
warnOnly: true,
|
||||
},
|
||||
},
|
||||
} as const satisfies Record<string, Record<string, LiveCacheFloor>>;
|
||||
|
||||
@@ -28,7 +28,7 @@ describe("live cache regression runner", () => {
|
||||
]);
|
||||
});
|
||||
|
||||
it("keeps OpenAI text cache floor misses advisory", () => {
|
||||
it("keeps hard cache floors blocking for required OpenAI lanes", () => {
|
||||
const regressions: string[] = [];
|
||||
const warnings: string[] = [];
|
||||
|
||||
@@ -47,11 +47,11 @@ describe("live cache regression runner", () => {
|
||||
warnings,
|
||||
});
|
||||
|
||||
expect(regressions).toEqual([]);
|
||||
expect(warnings).toEqual([
|
||||
expect(regressions).toEqual([
|
||||
"openai:stable cacheRead=0 < min=4608",
|
||||
"openai:stable hitRate=0.000 < min=0.900",
|
||||
]);
|
||||
expect(warnings).toEqual([]);
|
||||
});
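The two test variants above differ only in where a floor miss lands: in `regressions` (blocking) or in `warnings` (advisory). A minimal sketch of that routing follows, assuming `warnOnly` is the switch. `checkFloor`, `CacheFloor`, and `Findings` are hypothetical names, and the real runner's evaluation logic is not shown in this diff.

```typescript
type CacheFloor = { minCacheRead: number; minHitRate: number; warnOnly?: boolean };
type Findings = { regressions: string[]; warnings: string[] };

// Compare one observed lane result against its floor and route any misses.
function checkFloor(
  label: string,
  observed: { cacheRead: number; hitRate: number },
  floor: CacheFloor,
): Findings {
  const misses: string[] = [];
  if (observed.cacheRead < floor.minCacheRead) {
    misses.push(`${label} cacheRead=${observed.cacheRead} < min=${floor.minCacheRead}`);
  }
  if (observed.hitRate < floor.minHitRate) {
    misses.push(
      `${label} hitRate=${observed.hitRate.toFixed(3)} < min=${floor.minHitRate.toFixed(3)}`,
    );
  }
  // Assumption: warnOnly floors report misses as warnings instead of regressions.
  return floor.warnOnly
    ? { regressions: [], warnings: misses }
    : { regressions: misses, warnings: [] };
}

// A hard floor: a lane that read nothing from cache produces two regressions.
const exampleHardFloor = checkFloor(
  "openai:stable",
  { cacheRead: 0, hitRate: 0 },
  { minCacheRead: 4608, minHitRate: 0.9 },
);
```

With `warnOnly: true` on the same floor, the same two messages would move to `warnings` and `regressions` would stay empty.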
|
||||
|
||||
it("retries hard cache baseline misses once", () => {
|
||||
@@ -122,65 +122,6 @@ describe("live cache regression runner", () => {
|
||||
).toBe(false);
|
||||
});
|
||||
|
||||
it("keeps OpenAI cache probes above the reasoning output floor", () => {
|
||||
expect(
|
||||
__testing.resolveCacheProbeMaxTokens({
|
||||
maxTokens: 32,
|
||||
providerTag: "openai",
|
||||
}),
|
||||
).toBe(256);
|
||||
expect(
|
||||
__testing.resolveCacheProbeMaxTokens({
|
||||
maxTokens: 512,
|
||||
providerTag: "openai",
|
||||
}),
|
||||
).toBe(512);
|
||||
expect(
|
||||
__testing.resolveCacheProbeMaxTokens({
|
||||
maxTokens: 32,
|
||||
providerTag: "anthropic",
|
||||
}),
|
||||
).toBe(32);
|
||||
});
|
||||
|
||||
it("accepts empty OpenAI cache probe text only when usage is observable", () => {
|
||||
expect(
|
||||
__testing.shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: "openai",
|
||||
text: "",
|
||||
usage: { input: 5_000 },
|
||||
}),
|
||||
).toBe(true);
|
||||
expect(
|
||||
__testing.shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: "openai",
|
||||
text: "",
|
||||
usage: { cacheRead: 4_608 },
|
||||
}),
|
||||
).toBe(true);
|
||||
expect(
|
||||
__testing.shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: "openai",
|
||||
text: "wrong",
|
||||
usage: { input: 5_000 },
|
||||
}),
|
||||
).toBe(false);
|
||||
expect(
|
||||
__testing.shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: "anthropic",
|
||||
text: "",
|
||||
usage: { input: 5_000 },
|
||||
}),
|
||||
).toBe(false);
|
||||
expect(
|
||||
__testing.shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: "openai",
|
||||
text: "",
|
||||
usage: {},
|
||||
}),
|
||||
).toBe(false);
|
||||
});
|
||||
|
||||
it("accepts a warmup that already hits the provider cache", () => {
|
||||
const findings = __testing.evaluateAgainstBaseline({
|
||||
lane: "image",
|
||||
|
||||
@@ -22,7 +22,6 @@ const ANTHROPIC_TIMEOUT_MS = 120_000;
|
||||
const LIVE_CACHE_LANE_RETRIES = 1;
|
||||
const LIVE_CACHE_RESPONSE_RETRIES = 2;
|
||||
const OPENAI_CACHE_REASONING = "low" as unknown as never;
|
||||
const OPENAI_CACHE_MIN_MAX_TOKENS = 256;
|
||||
const OPENAI_PREFIX = buildStableCachePrefix("openai");
|
||||
const OPENAI_MCP_PREFIX = buildStableCachePrefix("openai-mcp-style");
|
||||
const ANTHROPIC_PREFIX = buildStableCachePrefix("anthropic");
|
||||
@@ -154,32 +153,6 @@ function shouldRetryCacheProbeText(params: {
|
||||
);
|
||||
}
|
||||
|
||||
function resolveCacheProbeMaxTokens(params: {
|
||||
maxTokens: number | undefined;
|
||||
providerTag: "anthropic" | "openai";
|
||||
}): number {
|
||||
const requested = params.maxTokens ?? 64;
|
||||
if (params.providerTag !== "openai") {
|
||||
return requested;
|
||||
}
|
||||
return Math.max(requested, OPENAI_CACHE_MIN_MAX_TOKENS);
|
||||
}
|
||||
|
||||
function shouldAcceptEmptyOpenAICacheProbe(params: {
|
||||
providerTag: "anthropic" | "openai";
|
||||
text: string;
|
||||
usage: CacheUsage;
|
||||
}): boolean {
|
||||
if (params.providerTag !== "openai" || params.text.trim().length > 0) {
|
||||
return false;
|
||||
}
|
||||
return (
|
||||
(params.usage.input ?? 0) > 0 ||
|
||||
(params.usage.cacheRead ?? 0) > 0 ||
|
||||
(params.usage.cacheWrite ?? 0) > 0
|
||||
);
|
||||
}
|
||||
|
||||
async function runToolOnlyTurn(params: {
|
||||
apiKey: string;
|
||||
cacheRetention: "none" | "short" | "long";
|
||||
@@ -269,10 +242,7 @@ async function completeCacheProbe(params: {
|
||||
apiKey: params.apiKey,
|
||||
cacheRetention: params.cacheRetention,
|
||||
sessionId: params.sessionId,
|
||||
maxTokens: resolveCacheProbeMaxTokens({
|
||||
maxTokens: params.maxTokens,
|
||||
providerTag: params.providerTag,
|
||||
}),
|
||||
maxTokens: params.maxTokens ?? 64,
|
||||
temperature: 0,
|
||||
...(params.providerTag === "openai" ? { reasoning: OPENAI_CACHE_REASONING } : {}),
|
||||
},
|
||||
@@ -280,24 +250,6 @@ async function completeCacheProbe(params: {
|
||||
timeoutMs,
|
||||
);
|
||||
const text = extractAssistantText(response);
|
||||
const usage = normalizeCacheUsage(response.usage);
|
||||
if (
|
||||
shouldAcceptEmptyOpenAICacheProbe({
|
||||
providerTag: params.providerTag,
|
||||
text,
|
||||
usage,
|
||||
})
|
||||
) {
|
||||
logLiveCache(
|
||||
`${params.providerTag} cache lane ${params.suffix} accepted empty text with usage ${formatUsage(usage)}`,
|
||||
);
|
||||
return {
|
||||
suffix: params.suffix,
|
||||
text,
|
||||
usage,
|
||||
hitRate: computeCacheHitRate(usage),
|
||||
};
|
||||
}
|
||||
if (shouldRetryCacheProbeText({ attempt, suffix: params.suffix, text })) {
|
||||
logLiveCache(
|
||||
`${params.providerTag} cache lane ${params.suffix} response mismatch; retrying: ${JSON.stringify(text)}`,
|
||||
@@ -310,6 +262,7 @@ async function completeCacheProbe(params: {
|
||||
if (!responseTextLower.includes(markerLower)) {
|
||||
throw new CacheProbeTextMismatchError(params.suffix, text);
|
||||
}
|
||||
const usage = normalizeCacheUsage(response.usage);
|
||||
return {
|
||||
suffix: params.suffix,
|
||||
text,
|
||||
@@ -598,8 +551,6 @@ function appendBaselineFindings(target: BaselineFindings, source: BaselineFindin
|
||||
export const __testing = {
|
||||
assertAgainstBaseline,
|
||||
evaluateAgainstBaseline,
|
||||
resolveCacheProbeMaxTokens,
|
||||
shouldAcceptEmptyOpenAICacheProbe,
|
||||
shouldRetryCacheProbeText,
|
||||
shouldRetryBaselineFindings,
|
||||
};
|
||||
@@ -611,7 +562,7 @@ export async function runLiveCacheRegression(): Promise<LiveCacheRegressionResul
|
||||
provider: "openai",
|
||||
api: "openai-responses",
|
||||
envVar: "OPENCLAW_LIVE_OPENAI_CACHE_MODEL",
|
||||
preferredModelIds: ["gpt-4.1", "gpt-5.2", "gpt-5.4-mini", "gpt-5.4", "gpt-5.5"],
|
||||
preferredModelIds: ["gpt-5.2", "gpt-5.4-mini", "gpt-5.4", "gpt-5.5"],
|
||||
});
|
||||
const anthropic = await resolveLiveDirectModel({
|
||||
provider: "anthropic",
|
||||
|
||||
@@ -576,17 +576,15 @@ async function compactEmbeddedPiSessionDirectOnce(
|
||||
let checkpointSnapshot: CapturedCompactionCheckpointSnapshot | null = null;
|
||||
let checkpointSnapshotRetained = false;
|
||||
try {
|
||||
const skillsSnapshotForRun =
|
||||
sandbox?.enabled && sandbox.workspaceAccess !== "rw" ? undefined : params.skillsSnapshot;
|
||||
const { shouldLoadSkillEntries, skillEntries } = resolveEmbeddedRunSkillEntries({
|
||||
workspaceDir: effectiveWorkspace,
|
||||
config: params.config,
|
||||
agentId: effectiveSkillAgentId,
|
||||
skillsSnapshot: skillsSnapshotForRun,
|
||||
skillsSnapshot: params.skillsSnapshot,
|
||||
});
|
||||
restoreSkillEnv = skillsSnapshotForRun
|
||||
restoreSkillEnv = params.skillsSnapshot
|
||||
? applySkillEnvOverridesFromSnapshot({
|
||||
snapshot: skillsSnapshotForRun,
|
||||
snapshot: params.skillsSnapshot,
|
||||
config: params.config,
|
||||
})
|
||||
: applySkillEnvOverrides({
|
||||
@@ -594,7 +592,7 @@ async function compactEmbeddedPiSessionDirectOnce(
|
||||
config: params.config,
|
||||
});
|
||||
const skillsPrompt = resolveSkillsPromptForRun({
|
||||
skillsSnapshot: skillsSnapshotForRun,
|
||||
skillsSnapshot: params.skillsSnapshot,
|
||||
entries: shouldLoadSkillEntries ? skillEntries : undefined,
|
||||
config: params.config,
|
||||
workspaceDir: effectiveWorkspace,
|
||||
|
||||
@@ -58,15 +58,12 @@ function openclawTranscriptAssistant(model: "delivery-mirror" | "gateway-injecte
|
||||
}
|
||||
|
||||
describe("normalizeAssistantReplayContent", () => {
|
||||
it("converts mid-turn assistant content: [] to a non-empty sentinel text block when stopReason is error", () => {
|
||||
const messages = [userMessage("hello"), bedrockAssistant([], "error"), userMessage("retry")];
|
||||
it("converts assistant content: [] to a non-empty sentinel text block when stopReason is error", () => {
|
||||
const messages = [userMessage("hello"), bedrockAssistant([], "error")];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).not.toBe(messages);
|
||||
const repaired = out[1] as AgentMessage & { content: { type: string; text: string }[] };
|
||||
expect(repaired.content).toEqual([{ type: "text", text: FALLBACK_TEXT }]);
|
||||
// Trailing user is preserved so request still ends with user.
|
||||
expect(out).toHaveLength(3);
|
||||
expect((out[2] as { role: string }).role).toBe("user");
|
||||
});
|
||||
|
||||
it("drops blank user text messages from replay", () => {
|
||||
@@ -111,9 +108,9 @@ describe("normalizeAssistantReplayContent", () => {
|
||||
expect(out[1]).toBe(silentStop);
|
||||
});
|
||||
|
||||
it("converts mid-turn zero-usage empty stop turns to a replay sentinel", () => {
|
||||
it("converts zero-usage empty stop turns to a replay sentinel", () => {
|
||||
const falseSuccessStop = bedrockAssistant([], "stop");
|
||||
const messages = [userMessage("hello"), falseSuccessStop, userMessage("retry")];
|
||||
const messages = [userMessage("hello"), falseSuccessStop];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).not.toBe(messages);
|
||||
const repaired = out[1] as AgentMessage & { content: { type: string; text: string }[] };
|
||||
@@ -186,117 +183,4 @@ describe("normalizeAssistantReplayContent", () => {
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toBe(messages);
|
||||
});
|
||||
|
||||
it("drops a trailing assistant turn whose content: [] would have been rewritten to the sentinel (#77228)", () => {
|
||||
// The sentinel was synthesized to satisfy Bedrock's non-empty-content
|
||||
// rule for *non-trailing* error turns. As the trailing message it would
|
||||
// make prefill-strict providers (e.g. github-copilot/claude-opus-4.6)
|
||||
// 400 with "conversation must end with a user message". The original
|
||||
// turn carried content:[] and zero usage — drop is lossless.
|
||||
const messages = [userMessage("hello"), bedrockAssistant([], "error")];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).not.toBe(messages);
|
||||
expect(out).toHaveLength(1);
|
||||
expect(out[0]).toBe(messages[0]);
|
||||
});
|
||||
|
||||
it("drops a trailing zero-usage empty stop assistant turn (#77228)", () => {
|
||||
const falseSuccessStop = bedrockAssistant([], "stop");
|
||||
const messages = [userMessage("hello"), falseSuccessStop];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toHaveLength(1);
|
||||
expect(out[0]).toBe(messages[0]);
|
||||
});
|
||||
|
||||
it("drops a trailing assistant turn that already carries the persisted sentinel content (#77228)", () => {
|
||||
// Covers the case where session-file-repair persisted the sentinel to
|
||||
// disk; on the next turn the loaded transcript ends with a non-empty
|
||||
// assistant turn whose only content is the sentinel text. Provider
|
||||
// request must still end with user.
|
||||
const persistedSentinel = bedrockAssistant([{ type: "text", text: FALLBACK_TEXT }], "error");
|
||||
const messages = [userMessage("hello"), persistedSentinel];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toHaveLength(1);
|
||||
expect(out[0]).toBe(messages[0]);
|
||||
});
|
||||
|
||||
it("drops several consecutive trailing sentinel/empty-error turns at the tail", () => {
|
||||
const messages = [
|
||||
userMessage("hi"),
|
||||
bedrockAssistant([{ type: "text", text: "real" }]),
|
||||
userMessage("again"),
|
||||
bedrockAssistant([], "error"),
|
||||
bedrockAssistant([{ type: "text", text: FALLBACK_TEXT }], "error"),
|
||||
];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toHaveLength(3);
|
||||
expect((out.at(-1) as { role: string }).role).toBe("user");
|
||||
});
|
||||
|
||||
it("does not drop a trailing assistant turn that has real content", () => {
|
||||
const realReply = bedrockAssistant([{ type: "text", text: "hello back" }], "stop", {
|
||||
input: 1,
|
||||
output: 1,
|
||||
totalTokens: 2,
|
||||
});
|
||||
const messages = [userMessage("hi"), realReply];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toBe(messages);
|
||||
expect(out).toHaveLength(2);
|
||||
});
|
||||
|
||||
it("does not drop a trailing assistant turn with non-error empty content (toolUse / length)", () => {
|
||||
// Boundary lock: only error/zero-usage-empty-stop and the sentinel
|
||||
// shape are droppable. toolUse/length empty turns are real provider
|
||||
// states and must be preserved on the wire.
|
||||
const toolUse = bedrockAssistant([], "toolUse");
|
||||
const messages = [userMessage("hi"), toolUse];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toBe(messages);
|
||||
expect(out).toHaveLength(2);
|
||||
});
|
||||
|
||||
it("preserves a trailing real model reply whose only content happens to be the sentinel text (clawsweeper review on #77287)", () => {
|
||||
// Defensive boundary: even if a model legitimately replies with the
|
||||
// exact sentinel string, the trim must require synthetic provenance
|
||||
// (stopReason: "error" or zero-usage stop) before dropping. Without
|
||||
// this guard the trim would silently delete a real reply on next
|
||||
// replay.
|
||||
const realReplyAsStop = bedrockAssistant([{ type: "text", text: FALLBACK_TEXT }], "stop", {
|
||||
input: 1,
|
||||
output: 1,
|
||||
totalTokens: 2,
|
||||
});
|
||||
const messages = [userMessage("hi"), realReplyAsStop];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toBe(messages);
|
||||
expect(out).toHaveLength(2);
|
||||
expect((out[1] as { content: unknown[] }).content).toEqual([
|
||||
{ type: "text", text: FALLBACK_TEXT },
|
||||
]);
|
||||
});
|
||||
|
||||
it("preserves a trailing turn whose sentinel content is paired with stopReason: toolUse (real provider state, not synthetic)", () => {
|
||||
const toolUseSentinel = bedrockAssistant([{ type: "text", text: FALLBACK_TEXT }], "toolUse");
|
||||
const messages = [userMessage("hi"), toolUseSentinel];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toBe(messages);
|
||||
expect(out).toHaveLength(2);
|
||||
});
|
||||
|
||||
it("still drops a trailing zero-usage stop turn whose content was already lifted to the sentinel block (post-rewrite shape)", () => {
|
||||
// Confirms the sentinel-content branch still recognizes the post-rewrite
|
||||
// shape produced by the in-memory rewrite earlier in the same loop:
|
||||
// stopReason: "stop" + zero usage + sentinel content. Only the synthetic
|
||||
// provenance (zero usage + stop) makes this droppable; a non-zero-usage
|
||||
// version is preserved by the regression test above.
|
||||
const persistedZeroUsageSentinel = bedrockAssistant(
|
||||
[{ type: "text", text: FALLBACK_TEXT }],
|
||||
"stop",
|
||||
);
|
||||
const messages = [userMessage("hi"), persistedZeroUsageSentinel];
|
||||
const out = normalizeAssistantReplayContent(messages);
|
||||
expect(out).toHaveLength(1);
|
||||
expect(out[0]).toBe(messages[0]);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -396,76 +396,9 @@ export function normalizeAssistantReplayContent(messages: AgentMessage[]): Agent
|
||||
}
|
||||
out.push(message);
|
||||
}
|
||||
|
||||
// Drop trailing stream-error / zero-usage-empty-stop placeholder turns. The
|
||||
// sentinel was synthesized to satisfy Bedrock Converse's "ContentBlock must
|
||||
// not be empty" rule for *non-trailing* error turns; when it is the trailing
|
||||
// entry, prefill-strict providers (e.g. github-copilot/claude-opus-4.6 — the
|
||||
// exact path reported in #77228) reject the request with
|
||||
// `400 This model does not support assistant message prefill. The
|
||||
// conversation must end with a user message.`. The original turn carried
|
||||
// `content: []` and zero usage — there is no information to lose by
|
||||
// dropping it. This trim runs after the main loop so it also catches a
|
||||
// sentinel that was *persisted* to disk by an earlier session-file repair
|
||||
// pass (matching the same content shape the loop above produces).
|
||||
while (out.length > 0) {
|
||||
const last = out[out.length - 1];
|
||||
if (!isReplayDroppableTrailingAssistant(last)) {
|
||||
break;
|
||||
}
|
||||
out.pop();
|
||||
touched = true;
|
||||
}
|
||||
return touched ? out : messages;
|
||||
}
|
||||
|
||||
function isReplayDroppableTrailingAssistant(message: AgentMessage | undefined): boolean {
|
||||
if (!message || message.role !== "assistant") {
|
||||
return false;
|
||||
}
|
||||
const content = (message as { content?: unknown }).content;
|
||||
if (!Array.isArray(content)) {
|
||||
return false;
|
||||
}
|
||||
if (content.length === 0) {
|
||||
const stopReason = (message as { stopReason?: unknown }).stopReason;
|
||||
return stopReason === "error" || isZeroUsageEmptyStopAssistantTurn(message);
|
||||
}
|
||||
// Sentinel-text content is the post-rewrite shape produced by either
|
||||
// session-file-repair.rewriteAssistantEntryWithEmptyContent (always
|
||||
// stopReason="error") or the in-memory rewrite earlier in this same
|
||||
// normalizeAssistantReplayContent loop (preserves the original
|
||||
// stopReason — "error" or zero-usage "stop"). Drop only when the trailing
|
||||
// turn carries that synthetic provenance: without this guard, a real
|
||||
// model reply that happens to consist of exactly the sentinel string
|
||||
// would be silently removed on next replay
|
||||
// (clawsweeper review on #77287, P2).
|
||||
if (!isStreamErrorSentinelContent(content)) {
|
||||
return false;
|
||||
}
|
||||
const stopReason = (message as { stopReason?: unknown }).stopReason;
|
||||
if (stopReason === "error") {
|
||||
return true;
|
||||
}
|
||||
return isZeroUsageEmptyStopAssistantTurn({
|
||||
stopReason,
|
||||
usage: (message as { usage?: unknown }).usage,
|
||||
content: [],
|
||||
});
|
||||
}
|
||||
|
||||
function isStreamErrorSentinelContent(content: readonly unknown[]): boolean {
|
||||
if (content.length !== 1) {
|
||||
return false;
|
||||
}
|
||||
const block = content[0];
|
||||
if (!block || typeof block !== "object") {
|
||||
return false;
|
||||
}
|
||||
const blockRecord = block as { type?: unknown; text?: unknown };
|
||||
return blockRecord.type === "text" && blockRecord.text === STREAM_ERROR_FALLBACK_TEXT;
|
||||
}
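The trailing-placeholder trim described in the comments above can be illustrated with a simplified, self-contained sketch. The `Message` shape, `SENTINEL_TEXT`, `hasZeroUsage`, and `trimTrailingPlaceholders` are all hypothetical: the real `STREAM_ERROR_FALLBACK_TEXT` value and the exact definition of `isZeroUsageEmptyStopAssistantTurn` are not shown in this diff, so the zero-usage check below is an assumption.

```typescript
// Hypothetical placeholder text standing in for the real sentinel constant.
const SENTINEL_TEXT = "[placeholder: stream error]";

type Usage = { input?: number; output?: number; totalTokens?: number };
type Message = {
  role: "user" | "assistant";
  content: { type: string; text?: string }[];
  stopReason?: string;
  usage?: Usage;
};

// Assumption: "zero usage" means no usage object or all counters at zero.
function hasZeroUsage(usage: Usage | undefined): boolean {
  return (
    !usage ||
    ((usage.input ?? 0) === 0 && (usage.output ?? 0) === 0 && (usage.totalTokens ?? 0) === 0)
  );
}

function isDroppableTrailing(message: Message): boolean {
  if (message.role !== "assistant") {
    return false;
  }
  const isEmpty = message.content.length === 0;
  const isSentinel =
    message.content.length === 1 &&
    message.content[0].type === "text" &&
    message.content[0].text === SENTINEL_TEXT;
  if (!isEmpty && !isSentinel) {
    return false;
  }
  // Require synthetic provenance: an error stop, or a "stop" that reported no usage.
  return (
    message.stopReason === "error" ||
    (message.stopReason === "stop" && hasZeroUsage(message.usage))
  );
}

// Remove droppable assistant turns from the tail; return the input array
// unchanged (same identity) when there is nothing to trim.
function trimTrailingPlaceholders(messages: Message[]): Message[] {
  let end = messages.length;
  while (end > 0 && isDroppableTrailing(messages[end - 1])) {
    end -= 1;
  }
  return end === messages.length ? messages : messages.slice(0, end);
}
```

Returning the original array when nothing changes mirrors the `expect(out).toBe(messages)` assertions in the tests, and the provenance check is what keeps a real reply that happens to equal the sentinel text from being deleted.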
|
||||
|
||||
function normalizeAssistantUsageSnapshot(usage: unknown) {
|
||||
const normalized = normalizeUsage((usage ?? undefined) as UsageLike | undefined);
|
||||
if (!normalized) {
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
import fs from "node:fs/promises";
|
||||
import os from "node:os";
|
||||
import path from "node:path";
|
||||
import type { AgentMessage } from "@mariozechner/pi-agent-core";
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
|
||||
@@ -211,59 +210,6 @@ describe("runEmbeddedAttempt context engine sessionKey forwarding", () => {
|
||||
}
|
||||
});
|
||||
|
||||
it("rebuilds skill prompt inputs from the sandbox workspace for non-rw sandbox runs", async () => {
|
||||
const sandboxWorkspace = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-sandbox-skills-"));
|
||||
tempPaths.push(sandboxWorkspace);
|
||||
hoisted.resolveSandboxContextMock.mockResolvedValue({
|
||||
enabled: true,
|
||||
workspaceAccess: "ro",
|
||||
workspaceDir: sandboxWorkspace,
|
||||
});
|
||||
|
||||
await createContextEngineAttemptRunner({
|
||||
contextEngine: createContextEngineBootstrapAndAssemble(),
|
||||
sessionKey,
|
||||
tempPaths,
|
||||
attemptOverrides: {
|
||||
skillsSnapshot: {
|
||||
prompt:
|
||||
"<available_skills><skill><location>~/.openclaw/skills/smaug/SKILL.md</location></skill></available_skills>",
|
||||
skills: [{ name: "smaug" }],
|
||||
resolvedSkills: [
|
||||
{
|
||||
name: "smaug",
|
||||
description: "Host copy",
|
||||
disableModelInvocation: false,
|
||||
filePath: "/Users/alice/.openclaw/skills/smaug/SKILL.md",
|
||||
baseDir: "/Users/alice/.openclaw/skills/smaug",
|
||||
source: "openclaw-workspace",
|
||||
sourceInfo: {
|
||||
path: "/Users/alice/.openclaw/skills/smaug/SKILL.md",
|
||||
source: "openclaw-workspace",
|
||||
scope: "project",
|
||||
origin: "top-level",
|
||||
baseDir: "/Users/alice/.openclaw/skills/smaug",
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(hoisted.resolveEmbeddedRunSkillEntriesMock).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
workspaceDir: sandboxWorkspace,
|
||||
skillsSnapshot: undefined,
|
||||
}),
|
||||
);
|
||||
expect(hoisted.resolveSkillsPromptForRunMock).toHaveBeenCalledWith(
|
||||
expect.objectContaining({
|
||||
workspaceDir: sandboxWorkspace,
|
||||
skillsSnapshot: undefined,
|
||||
}),
|
||||
);
|
||||
});
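The condition this test exercises is a single expression, shown here as a standalone sketch. `selectSkillsSnapshot` and `SandboxContext` are hypothetical names; only the `enabled` and `workspaceAccess` fields and the `"ro"` and `"rw"` values appear in the diff. The stated rationale (host-side paths in the snapshot presumably do not exist inside a non-rw sandbox) is an inference from the test fixture, not something the source states.

```typescript
type SandboxContext = { enabled: boolean; workspaceAccess: string };

// Ignore the caller-provided snapshot for sandboxed runs without read-write
// workspace access, so skills are rebuilt from the sandbox workspace instead.
function selectSkillsSnapshot<T>(
  sandbox: SandboxContext | undefined,
  snapshot: T | undefined,
): T | undefined {
  return sandbox?.enabled && sandbox.workspaceAccess !== "rw" ? undefined : snapshot;
}

// A read-only sandbox discards the snapshot; an unsandboxed run keeps it.
const hostSnapshot = { prompt: "host prompt" };
const forReadOnlySandbox = selectSkillsSnapshot(
  { enabled: true, workspaceAccess: "ro" },
  hostSnapshot,
);
const forUnsandboxedRun = selectSkillsSnapshot(undefined, hostSnapshot);
```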
|
||||
|
||||
it("keeps before_prompt_build prependContext out of system prompt on transcriptPrompt runs", async () => {
|
||||
const runBeforePromptBuild = vi.fn(async () => ({ prependContext: "dynamic hook context" }));
|
||||
hoisted.getGlobalHookRunnerMock.mockReturnValue({
|
||||
|
||||
@@ -69,13 +69,13 @@ type AttemptSpawnWorkspaceHoisted = {
|
||||
installContextEngineLoopHookMock: UnknownMock;
|
||||
flushPendingToolResultsAfterIdleMock: AsyncUnknownMock;
|
||||
releaseWsSessionMock: UnknownMock;
|
||||
resolveBootstrapFilesForRunMock: Mock<(...args: unknown[]) => Promise<WorkspaceBootstrapFile[]>>;
|
||||
resolveBootstrapFilesForRunMock: Mock<
|
||||
(...args: unknown[]) => Promise<WorkspaceBootstrapFile[]>
|
||||
>;
|
||||
resolveBootstrapContextForRunMock: Mock<() => Promise<BootstrapContext>>;
|
||||
isWorkspaceBootstrapPendingMock: Mock<(workspaceDir: string) => Promise<boolean>>;
|
||||
resolveContextInjectionModeMock: Mock<() => "always" | "continuation-skip">;
|
||||
hasCompletedBootstrapTurnMock: Mock<() => Promise<boolean>>;
|
||||
resolveEmbeddedRunSkillEntriesMock: UnknownMock;
|
||||
resolveSkillsPromptForRunMock: UnknownMock;
|
||||
supportsModelToolsMock: Mock<(model?: unknown) => boolean>;
|
||||
getGlobalHookRunnerMock: Mock<() => unknown>;
|
||||
initializeGlobalHookRunnerMock: UnknownMock;
|
||||
@@ -155,11 +155,6 @@ const hoisted = vi.hoisted((): AttemptSpawnWorkspaceHoisted => {
|
||||
() => "always",
|
||||
);
|
||||
const hasCompletedBootstrapTurnMock = vi.fn<() => Promise<boolean>>(async () => false);
|
||||
const resolveEmbeddedRunSkillEntriesMock = vi.fn(() => ({
|
||||
shouldLoadSkillEntries: false,
|
||||
skillEntries: undefined,
|
||||
}));
|
||||
const resolveSkillsPromptForRunMock = vi.fn(() => "");
|
||||
const supportsModelToolsMock = vi.fn<(model?: unknown) => boolean>(() => true);
|
||||
const getGlobalHookRunnerMock = vi.fn<() => unknown>(() => undefined);
|
||||
const initializeGlobalHookRunnerMock = vi.fn();
|
||||
@@ -207,8 +202,6 @@ const hoisted = vi.hoisted((): AttemptSpawnWorkspaceHoisted => {
|
||||
isWorkspaceBootstrapPendingMock,
|
||||
resolveContextInjectionModeMock,
|
||||
hasCompletedBootstrapTurnMock,
|
||||
resolveEmbeddedRunSkillEntriesMock,
|
||||
resolveSkillsPromptForRunMock,
|
||||
supportsModelToolsMock,
|
||||
getGlobalHookRunnerMock,
|
||||
initializeGlobalHookRunnerMock,
|
||||
@@ -313,12 +306,14 @@ vi.mock("../../bootstrap-files.js", async () => {
|
||||
vi.mock("../../skills.js", () => ({
|
||||
applySkillEnvOverrides: () => () => {},
|
||||
applySkillEnvOverridesFromSnapshot: () => () => {},
|
||||
resolveSkillsPromptForRun: (...args: unknown[]) => hoisted.resolveSkillsPromptForRunMock(...args),
|
||||
resolveSkillsPromptForRun: () => "",
|
||||
}));
|
||||
|
||||
vi.mock("../skills-runtime.js", () => ({
|
||||
resolveEmbeddedRunSkillEntries: (...args: unknown[]) =>
|
||||
hoisted.resolveEmbeddedRunSkillEntriesMock(...args),
|
||||
resolveEmbeddedRunSkillEntries: () => ({
|
||||
shouldLoadSkillEntries: false,
|
||||
skillEntries: undefined,
|
||||
}),
|
||||
}));
|
||||
|
||||
vi.mock("../context-engine-maintenance.js", () => ({
|
||||
@@ -844,11 +839,6 @@ export function resetEmbeddedAttemptHarness(
|
||||
hoisted.isWorkspaceBootstrapPendingMock.mockReset().mockResolvedValue(false);
|
||||
hoisted.resolveContextInjectionModeMock.mockReset().mockReturnValue("always");
|
||||
hoisted.hasCompletedBootstrapTurnMock.mockReset().mockResolvedValue(false);
|
||||
hoisted.resolveEmbeddedRunSkillEntriesMock.mockReset().mockReturnValue({
|
||||
shouldLoadSkillEntries: false,
|
||||
skillEntries: undefined,
|
||||
});
|
||||
hoisted.resolveSkillsPromptForRunMock.mockReset().mockReturnValue("");
|
||||
hoisted.supportsModelToolsMock.mockReset().mockReturnValue(true);
|
||||
hoisted.getGlobalHookRunnerMock.mockReset().mockReturnValue(undefined);
|
||||
hoisted.runContextEngineMaintenanceMock.mockReset().mockResolvedValue(undefined);
|
||||
|
||||
@@ -713,17 +713,15 @@ export async function runEmbeddedAttempt(
|
||||
| ((outcome: "completed" | "aborted" | "error", err?: unknown) => void)
|
||||
| undefined;
|
||||
try {
|
||||
const skillsSnapshotForRun =
|
||||
sandbox?.enabled && sandbox.workspaceAccess !== "rw" ? undefined : params.skillsSnapshot;
|
||||
const { shouldLoadSkillEntries, skillEntries } = resolveEmbeddedRunSkillEntries({
|
||||
workspaceDir: effectiveWorkspace,
|
||||
config: params.config,
|
||||
agentId: sessionAgentId,
|
||||
skillsSnapshot: skillsSnapshotForRun,
|
||||
skillsSnapshot: params.skillsSnapshot,
|
||||
});
|
||||
restoreSkillEnv = skillsSnapshotForRun
|
||||
restoreSkillEnv = params.skillsSnapshot
|
||||
? applySkillEnvOverridesFromSnapshot({
|
||||
snapshot: skillsSnapshotForRun,
|
||||
snapshot: params.skillsSnapshot,
|
||||
config: params.config,
|
||||
})
|
||||
: applySkillEnvOverrides({
|
||||
@@ -732,7 +730,7 @@ export async function runEmbeddedAttempt(
|
||||
});
|
||||
|
||||
const skillsPrompt = resolveSkillsPromptForRun({
|
||||
skillsSnapshot: skillsSnapshotForRun,
|
||||
skillsSnapshot: params.skillsSnapshot,
|
||||
entries: shouldLoadSkillEntries ? skillEntries : undefined,
|
||||
config: params.config,
|
||||
workspaceDir: effectiveWorkspace,
|
||||
|
||||
@@ -39,23 +39,4 @@ describe("resolveAuthProfileFailureReason", () => {
|
||||
}),
|
||||
).toBeNull();
|
||||
});
|
||||
|
||||
it("does not persist request-shape (format) rejections as auth-profile health (#77228)", () => {
|
||||
// A format rejection (e.g. the github-copilot prefill-strict 400
|
||||
// "conversation must end with a user message" reported in #77228) is
|
||||
// a per-session transcript-shape problem; cascading it to a profile
|
||||
// cooldown blocks every other healthy session sharing the same auth
|
||||
// profile and can take down the whole provider for the backoff window.
|
||||
expect(
|
||||
resolveAuthProfileFailureReason({
|
||||
failoverReason: "format",
|
||||
}),
|
||||
).toBeNull();
|
||||
expect(
|
||||
resolveAuthProfileFailureReason({
|
||||
failoverReason: "format",
|
||||
policy: "shared",
|
||||
}),
|
||||
).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
@@ -6,21 +6,8 @@ export function resolveAuthProfileFailureReason(params: {
|
||||
failoverReason: FailoverReason | null;
|
||||
policy?: AuthProfileFailurePolicy;
|
||||
}): AuthProfileFailureReason | null {
|
||||
// Helper-local runs, transport timeouts, and request-shape ("format") rejections
|
||||
// should not poison shared provider auth health. A `format` failure means the
|
||||
// provider rejected the request payload (e.g. an assistant-prefill 400 from a
|
||||
// strict provider when a session transcript ends with a stream-error placeholder
|
||||
// turn) — that is a per-session transcript-shape problem, not a profile-wide
|
||||
// reliability signal. Cascading it to a profile cooldown blocks every other
|
||||
// healthy session sharing the same auth profile and, when all profiles share
|
||||
// the same fault, takes down the entire provider for the configured backoff
|
||||
// window (#77228).
|
||||
if (
|
||||
params.policy === "local" ||
|
||||
!params.failoverReason ||
|
||||
params.failoverReason === "timeout" ||
|
||||
params.failoverReason === "format"
|
||||
) {
|
||||
// Helper-local runs and transport timeouts should not poison shared provider auth health.
|
||||
if (params.policy === "local" || !params.failoverReason || params.failoverReason === "timeout") {
|
||||
return null;
|
||||
}
|
||||
return params.failoverReason;
|
||||
|
||||
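A compact sketch of the filtering rule in `resolveAuthProfileFailureReason`, written for the variant that treats `format` rejections as per-request problems. The `FailoverReason` union below is an assumption: only `"timeout"` and `"format"` appear in this diff, and `"auth"` and `"rate_limit"` are hypothetical members added for illustration. `profileFailureReason` is likewise a hypothetical name.

```typescript
// Assumed union; the real FailoverReason type is not shown in this diff.
type FailoverReason = "auth" | "rate_limit" | "timeout" | "format";
type FailurePolicy = "local" | "shared";

// Decide whether a failover reason should be recorded against the shared
// auth profile. Returning null means "do not touch profile health".
function profileFailureReason(
  reason: FailoverReason | null,
  policy?: FailurePolicy,
): FailoverReason | null {
  // Timeouts and request-shape rejections describe one request, not the profile.
  const perRequest = reason === "timeout" || reason === "format";
  if (policy === "local" || !reason || perRequest) {
    return null;
  }
  return reason;
}

// A format rejection never cools down the profile, even under the shared policy.
const exampleFormatOutcome = profileFailureReason("format", "shared");
```

The other variant in the hunk omits the `format` check, in which case a single malformed transcript could put a shared profile into cooldown.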
@@ -3,8 +3,6 @@ import { tmpdir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
import { describe, expect, it } from "vitest";
|
||||
import {
|
||||
getSandboxHostPathPolicyKey,
|
||||
isSandboxHostPathAbsolute,
|
||||
normalizeSandboxHostPath,
|
||||
resolveSandboxHostPathViaExistingAncestor,
|
||||
} from "./host-paths.js";
|
||||
@@ -13,33 +11,6 @@ describe("normalizeSandboxHostPath", () => {
|
||||
it("normalizes dot segments and strips trailing slash", () => {
|
||||
expect(normalizeSandboxHostPath("/tmp/a/../b//")).toBe("/tmp/b");
|
||||
});
|
||||
|
||||
it("normalizes Windows drive-letter paths without losing the drive root", () => {
|
||||
expect(normalizeSandboxHostPath("c:\\Users\\Kai\\..\\Project\\")).toBe("C:/Users/Project");
|
||||
expect(normalizeSandboxHostPath("d:/")).toBe("D:/");
|
||||
});
|
||||
});
|
||||
|
||||
describe("isSandboxHostPathAbsolute", () => {
|
||||
it("accepts POSIX and drive-absolute Windows paths", () => {
|
||||
expect(isSandboxHostPathAbsolute("/tmp/project")).toBe(true);
|
||||
expect(isSandboxHostPathAbsolute("C:/Users/kai/project")).toBe(true);
|
||||
expect(isSandboxHostPathAbsolute("C:\\Users\\kai\\project")).toBe(true);
|
||||
});
|
||||
|
||||
it("rejects relative paths, named volumes, and drive-relative Windows paths", () => {
|
||||
expect(isSandboxHostPathAbsolute("relative/path")).toBe(false);
|
||||
expect(isSandboxHostPathAbsolute("my-volume")).toBe(false);
|
||||
expect(isSandboxHostPathAbsolute("C:relative\\path")).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
describe("getSandboxHostPathPolicyKey", () => {
|
||||
it("compares Windows drive-letter paths case-insensitively", () => {
|
||||
expect(getSandboxHostPathPolicyKey("c:\\Users\\Kai\\.SSH\\config")).toBe(
|
||||
"c:/users/kai/.ssh/config",
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
describe("resolveSandboxHostPathViaExistingAncestor", () => {
|
||||
@@ -47,16 +18,6 @@ describe("resolveSandboxHostPathViaExistingAncestor", () => {
|
||||
expect(resolveSandboxHostPathViaExistingAncestor("relative/path")).toBe("relative/path");
|
||||
});
|
||||
|
||||
it("normalizes Windows paths without resolving them through POSIX cwd on non-Windows hosts", () => {
|
||||
if (process.platform === "win32") {
|
||||
return;
|
||||
}
|
||||
|
||||
expect(resolveSandboxHostPathViaExistingAncestor("C:/Users/kai/project")).toBe(
|
||||
"C:/Users/kai/project",
|
||||
);
|
||||
});
|
||||
|
||||
it("resolves symlink parents when the final leaf does not exist", () => {
|
||||
if (process.platform === "win32") {
|
||||
return;
|
||||
|
||||
@@ -19,42 +19,16 @@ function stripWindowsNamespacePrefix(input: string): string {
|
||||
return input;
|
||||
}
|
||||
|
||||
export function isWindowsDriveAbsolutePath(raw: string): boolean {
|
||||
return /^[A-Za-z]:[\\/]/.test(stripWindowsNamespacePrefix(raw.trim()));
|
||||
}
|
||||
|
||||
export function isSandboxHostPathAbsolute(raw: string): boolean {
|
||||
const trimmed = stripWindowsNamespacePrefix(raw.trim());
|
||||
return trimmed.startsWith("/") || isWindowsDriveAbsolutePath(trimmed);
|
||||
}
|
||||
|
||||
/**
|
||||
* Normalize a host path: resolve `.`, `..`, collapse `//`, strip trailing `/`.
|
||||
* Windows drive-letter paths preserve the drive root and uppercase the drive letter.
|
||||
* Normalize a POSIX host path: resolve `.`, `..`, collapse `//`, strip trailing `/`.
|
||||
*/
|
||||
export function normalizeSandboxHostPath(raw: string): string {
|
||||
const trimmed = stripWindowsNamespacePrefix(raw.trim());
|
||||
if (!trimmed) {
|
||||
return "/";
|
||||
}
|
||||
let normalTrimmed = trimmed.replaceAll("\\", "/");
|
||||
if (isWindowsDriveAbsolutePath(normalTrimmed)) {
|
||||
normalTrimmed = normalTrimmed.charAt(0).toUpperCase() + normalTrimmed.slice(1);
|
||||
}
|
||||
const normalized = posix.normalize(normalTrimmed);
|
||||
const withoutTrailingSlash = normalized.replace(/\/+$/, "") || "/";
|
||||
if (/^[A-Z]:$/.test(withoutTrailingSlash)) {
|
||||
return `${withoutTrailingSlash}/`;
|
||||
}
|
||||
return withoutTrailingSlash;
|
||||
}
|
||||
|
||||
export function getSandboxHostPathPolicyKey(raw: string): string {
|
||||
const normalized = normalizeSandboxHostPath(raw);
|
||||
if (isWindowsDriveAbsolutePath(normalized)) {
|
||||
return normalized.toLowerCase();
|
||||
}
|
||||
return normalized;
|
||||
const normalized = posix.normalize(trimmed.replaceAll("\\", "/"));
|
||||
return normalized.replace(/\/+$/, "") || "/";
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -62,11 +36,8 @@ export function getSandboxHostPathPolicyKey(raw: string): string {
|
||||
* even when the final source leaf does not exist yet.
|
||||
*/
|
||||
export function resolveSandboxHostPathViaExistingAncestor(sourcePath: string): string {
|
||||
if (!isSandboxHostPathAbsolute(sourcePath)) {
|
||||
if (!sourcePath.startsWith("/")) {
|
||||
return sourcePath;
|
||||
}
|
||||
if (isWindowsDriveAbsolutePath(sourcePath) && process.platform !== "win32") {
|
||||
return normalizeSandboxHostPath(sourcePath);
|
||||
}
|
||||
return normalizeSandboxHostPath(resolvePathViaExistingAncestorSync(sourcePath));
|
||||
}
|
||||
|
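Note on the hunk above: the Windows-aware side of `normalizeSandboxHostPath` uppercases the drive letter, keeps the drive root (`D:/`), and derives a lowercase policy key for comparisons. The sketch below reproduces that behavior in isolation so it can be checked against the expectations in the test hunk. It is an approximation: `stripWindowsNamespacePrefix` is not visible in this diff and is left out, and the `Sketch` function names are hypothetical.

```typescript
import { posix } from "node:path";

// True for drive-absolute paths such as "C:/x" or "C:\\x", false for "C:relative".
function isWindowsDriveAbsolutePathSketch(raw: string): boolean {
  return /^[A-Za-z]:[\\/]/.test(raw.trim());
}

function normalizeSandboxHostPathSketch(raw: string): string {
  const trimmed = raw.trim();
  if (!trimmed) {
    return "/";
  }
  // Use forward slashes throughout so posix.normalize can resolve "." and "..".
  let slashed = trimmed.replace(/\\/g, "/");
  if (isWindowsDriveAbsolutePathSketch(slashed)) {
    slashed = slashed.charAt(0).toUpperCase() + slashed.slice(1);
  }
  const withoutTrailingSlash = posix.normalize(slashed).replace(/\/+$/, "") || "/";
  // Stripping the trailing slash from "D:/" leaves "D:", so restore the drive root.
  return /^[A-Z]:$/.test(withoutTrailingSlash) ? `${withoutTrailingSlash}/` : withoutTrailingSlash;
}

// Comparison key: Windows drive paths compare case-insensitively, POSIX paths do not.
function getSandboxHostPathPolicyKeySketch(raw: string): string {
  const normalized = normalizeSandboxHostPathSketch(raw);
  return isWindowsDriveAbsolutePathSketch(normalized) ? normalized.toLowerCase() : normalized;
}
```

Keeping the display form (`C:/Users/Project`) separate from the comparison key (`c:/users/project`) lets error messages show the path as configured while policy checks stay case-insensitive on Windows.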
||||
@@ -174,25 +174,6 @@ describe("validateBindMounts", () => {
|
||||
expect(() => validateBindMounts(["/home/tester/.netrc:/mnt/netrc:ro"])).toThrow(/blocked path/);
|
||||
});
|
||||
|
||||
it("allows drive-absolute Windows bind sources", () => {
|
||||
expect(() => validateBindMounts(["D:/data/openclaw/src:/src:ro"])).not.toThrow();
|
||||
expect(() => validateBindMounts(["D:\\data\\openclaw\\output:/output:rw"])).not.toThrow();
|
||||
});
|
||||
|
||||
it("compares Windows allowed roots case-insensitively", () => {
|
||||
expect(() =>
|
||||
validateBindMounts(["d:/DATA/OpenClaw/src:/src:ro"], {
|
||||
allowedSourceRoots: ["D:/data/openclaw"],
|
||||
}),
|
||||
).not.toThrow();
|
||||
|
||||
expect(() =>
|
||||
validateBindMounts(["D:/other/project:/src:ro"], {
|
||||
allowedSourceRoots: ["d:/data/openclaw"],
|
||||
}),
|
||||
).toThrow(/outside allowed roots/);
|
||||
});
|
||||
|
||||
it("blocks credential binds through canonical home aliases", () => {
|
||||
if (process.platform === "win32") {
|
||||
return;
|
||||
@@ -212,7 +193,14 @@ describe("validateBindMounts", () => {
|
||||
|
||||
it("blocks symlink escapes into blocked directories", () => {
|
||||
if (process.platform === "win32") {
|
||||
// Symlink setup for blocked POSIX targets like /etc is POSIX-only.
|
||||
// Symlinks to non-existent targets like /etc require
|
||||
// SeCreateSymbolicLinkPrivilege on Windows. The Windows branch of this
|
||||
// test does not need a real symlink — it only asserts that Windows source
|
||||
// paths are rejected as non-POSIX.
|
||||
const dir = mkdtempSync(join(tmpdir(), "openclaw-sbx-"));
|
||||
const fakePath = join(dir, "etc-link", "passwd");
|
||||
const run = () => validateBindMounts([`${fakePath}:/mnt/passwd:ro`]);
|
||||
expect(run).toThrow(/non-absolute source path/);
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -225,7 +213,7 @@ describe("validateBindMounts", () => {
|
||||
|
||||
it("blocks symlink-parent escapes with non-existent leaf outside allowed roots", () => {
|
||||
if (process.platform === "win32") {
|
||||
// Windows symlink semantics differ; POSIX symlink escape coverage runs on POSIX hosts.
|
||||
// Windows source paths (e.g. C:\\...) are intentionally rejected as non-POSIX.
|
||||
return;
|
||||
}
|
||||
const dir = mkdtempSync(join(tmpdir(), "openclaw-sbx-"));
|
||||
@@ -245,7 +233,7 @@ describe("validateBindMounts", () => {
|
||||
|
||||
it("blocks symlink-parent escapes into blocked paths when leaf does not exist", () => {
|
||||
if (process.platform === "win32") {
|
||||
// Symlink setup for blocked POSIX targets like /var/run is POSIX-only.
|
||||
// Windows source paths (e.g. C:\\...) are intentionally rejected as non-POSIX.
|
||||
return;
|
||||
}
|
||||
const dir = mkdtempSync(join(tmpdir(), "openclaw-sbx-"));
|
||||
|
||||
@@ -12,8 +12,6 @@ import { normalizeOptionalLowercaseString } from "../../shared/string-coerce.js"
|
||||
import { splitSandboxBindSpec } from "./bind-spec.js";
|
||||
import { SANDBOX_AGENT_WORKSPACE_MOUNT } from "./constants.js";
|
||||
import {
|
||||
getSandboxHostPathPolicyKey,
|
||||
isSandboxHostPathAbsolute,
|
||||
normalizeSandboxHostPath,
|
||||
resolveSandboxHostPathViaExistingAncestor,
|
||||
} from "./host-paths.js";
|
||||
@@ -103,7 +101,6 @@ function parseBindTargetPath(bind: string): string {
|
||||
|
||||
/**
|
||||
* Normalize a POSIX path: resolve `.`, `..`, collapse `//`, strip trailing `/`.
|
||||
* If it starts with the drive letter, convert it to the upper case.
|
||||
*/
|
||||
function normalizeHostPath(raw: string): string {
|
||||
return normalizeSandboxHostPath(raw);
|
||||
@@ -118,9 +115,10 @@ function normalizeHostPath(raw: string): string {
|
||||
*/
|
||||
export function getBlockedBindReason(bind: string): BlockedBindReason | null {
|
||||
const sourceRaw = parseBindSourcePath(bind);
|
||||
if (!isSandboxHostPathAbsolute(sourceRaw)) {
|
||||
if (!sourceRaw.startsWith("/")) {
|
||||
return { kind: "non_absolute", sourcePath: sourceRaw };
|
||||
}
|
||||
|
||||
const normalized = normalizeHostPath(sourceRaw);
|
||||
const blockedHostPaths = getBlockedHostPaths();
|
||||
const directReason = getBlockedReasonForSourcePath(normalized, blockedHostPaths);
|
||||
@@ -143,10 +141,8 @@ function getBlockedReasonForSourcePath(
|
||||
if (sourceNormalized === "/") {
|
||||
return { kind: "covers", blockedPath: "/" };
|
||||
}
|
||||
const sourceKey = getSandboxHostPathPolicyKey(sourceNormalized);
|
||||
for (const blocked of blockedHostPaths) {
|
||||
const blockedKey = getSandboxHostPathPolicyKey(blocked);
|
||||
if (sourceKey === blockedKey || sourceKey.startsWith(`${blockedKey}/`)) {
|
||||
if (sourceNormalized === blocked || sourceNormalized.startsWith(blocked + "/")) {
|
||||
return { kind: "targets", blockedPath: blocked };
|
||||
}
|
||||
}
|
||||
@@ -197,7 +193,7 @@ function normalizeAllowedRoots(roots: string[] | undefined): string[] {
|
||||
}
|
||||
const normalized = roots
|
||||
.map((entry) => entry.trim())
|
||||
.filter(isSandboxHostPathAbsolute)
|
||||
.filter((entry) => entry.startsWith("/"))
|
||||
.map(normalizeHostPath);
|
||||
const expanded = new Set<string>();
|
||||
for (const root of normalized) {
|
||||
@@ -214,9 +210,7 @@ function isPathInsidePosix(root: string, target: string): boolean {
|
||||
if (root === "/") {
|
||||
return true;
|
||||
}
|
||||
const rootKey = getSandboxHostPathPolicyKey(root);
|
||||
const targetKey = getSandboxHostPathPolicyKey(target);
|
||||
return targetKey === rootKey || targetKey.startsWith(`${rootKey}/`);
|
||||
return target === root || target.startsWith(`${root}/`);
|
||||
}
|
||||
|
||||
function getOutsideAllowedRootsReason(
|
||||
@@ -280,7 +274,7 @@ function formatBindBlockedError(params: { bind: string; reason: BlockedBindReaso
|
||||
if (params.reason.kind === "non_absolute") {
|
||||
return new Error(
|
||||
`Sandbox security: bind mount "${params.bind}" uses a non-absolute source path ` +
|
||||
`"${params.reason.sourcePath}". Only absolute POSIX or Windows drive-letter paths are supported for sandbox binds.`,
|
||||
`"${params.reason.sourcePath}". Only absolute POSIX paths are supported for sandbox binds.`,
|
||||
);
|
||||
}
|
||||
if (params.reason.kind === "outside_allowed_roots") {
|
||||
|
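Note on the hunk above: the two sides differ in how a bind source is matched against allowed roots and blocked paths. One compares the normalized strings directly, the other compares policy keys so that `d:/DATA/OpenClaw` and `D:/data/openclaw` are treated as the same root. A minimal sketch of the key-based containment check follows, assuming both inputs are already normalized to forward slashes. The helper names are hypothetical.

```typescript
// Lowercase Windows drive paths for comparison; leave POSIX paths case-sensitive.
function policyKeySketch(normalized: string): string {
  return /^[A-Za-z]:\//.test(normalized) ? normalized.toLowerCase() : normalized;
}

// True when `target` is `root` itself or lies beneath it.
function isPathInsideSketch(root: string, target: string): boolean {
  if (root === "/") {
    return true;
  }
  const rootKey = policyKeySketch(root);
  const targetKey = policyKeySketch(target);
  // Requiring the "/" separator keeps "/tmp/ab" from matching the root "/tmp/a".
  return targetKey === rootKey || targetKey.startsWith(`${rootKey}/`);
}
```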
||||
@@ -580,123 +580,4 @@ describe("repairSessionFileIfNeeded", () => {
|
||||
const after = await fs.readFile(file, "utf-8");
|
||||
expect(after).toBe(original);
|
||||
});
|
||||
|
||||
it("drops type:message entries with null role instead of preserving them through repair (#77228)", async () => {
|
||||
const { file } = await createTempSessionPath();
|
||||
const { header, message } = buildSessionHeaderAndMessage();
|
||||
|
||||
const nullRoleEntry = {
|
||||
type: "message",
|
||||
id: "corrupt-1",
|
||||
parentId: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
message: { role: null, content: "ignored" },
|
||||
};
|
||||
const missingRoleEntry = {
|
||||
type: "message",
|
||||
id: "corrupt-2",
|
||||
parentId: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
message: { content: "no role at all" },
|
||||
};
|
||||
const emptyRoleEntry = {
|
||||
type: "message",
|
||||
id: "corrupt-3",
|
||||
parentId: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
message: { role: " ", content: "blank role" },
|
||||
};
|
||||
|
||||
const content = [
|
||||
JSON.stringify(header),
|
||||
JSON.stringify(message),
|
||||
JSON.stringify(nullRoleEntry),
|
||||
JSON.stringify(missingRoleEntry),
|
||||
JSON.stringify(emptyRoleEntry),
|
||||
].join("\n");
|
||||
await fs.writeFile(file, `${content}\n`, "utf-8");
|
||||
|
||||
const result = await repairSessionFileIfNeeded({ sessionFile: file });
|
||||
|
||||
expect(result.repaired).toBe(true);
|
||||
expect(result.droppedLines).toBe(3);
|
||||
expect(result.backupPath).toBeTruthy();
|
||||
|
||||
const after = await fs.readFile(file, "utf-8");
|
||||
const lines = after.trimEnd().split("\n");
|
||||
expect(lines).toHaveLength(2);
|
||||
expect(JSON.parse(lines[0])).toEqual(header);
|
||||
expect(JSON.parse(lines[1])).toEqual(message);
|
||||
expect(after).not.toContain('"role":null');
|
||||
});
|
||||
|
||||
it("drops a type:message entry whose message field is missing or non-object", async () => {
|
||||
const { file } = await createTempSessionPath();
|
||||
const { header, message } = buildSessionHeaderAndMessage();
|
||||
|
||||
const missingMessage = {
|
||||
type: "message",
|
||||
id: "corrupt-4",
|
||||
parentId: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
const stringMessage = {
|
||||
type: "message",
|
||||
id: "corrupt-5",
|
||||
parentId: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
message: "not an object",
|
||||
};
|
||||
|
||||
const content = [
|
||||
JSON.stringify(header),
|
||||
JSON.stringify(message),
|
||||
JSON.stringify(missingMessage),
|
||||
JSON.stringify(stringMessage),
|
||||
].join("\n");
|
||||
await fs.writeFile(file, `${content}\n`, "utf-8");
|
||||
|
||||
const result = await repairSessionFileIfNeeded({ sessionFile: file });
|
||||
|
||||
expect(result.repaired).toBe(true);
|
||||
expect(result.droppedLines).toBe(2);
|
||||
|
||||
const after = await fs.readFile(file, "utf-8");
|
||||
const lines = after.trimEnd().split("\n");
|
||||
expect(lines).toHaveLength(2);
|
||||
});
|
||||
|
||||
it("preserves non-`message` envelope types (e.g. compactionSummary, custom) without role inspection", async () => {
|
||||
const { file } = await createTempSessionPath();
|
||||
const { header, message } = buildSessionHeaderAndMessage();
|
||||
|
||||
const summary = {
|
||||
type: "summary",
|
||||
id: "summary-1",
|
||||
timestamp: new Date().toISOString(),
|
||||
summary: "opaque summary blob",
|
||||
};
|
||||
const custom = {
|
||||
type: "custom",
|
||||
id: "custom-1",
|
||||
customType: "model-snapshot",
|
||||
timestamp: new Date().toISOString(),
|
||||
data: { provider: "openai", modelApi: "openai-responses", modelId: "gpt-5" },
|
||||
};
|
||||
|
||||
const content = [
|
||||
JSON.stringify(header),
|
||||
JSON.stringify(message),
|
||||
JSON.stringify(summary),
|
||||
JSON.stringify(custom),
|
||||
].join("\n");
|
||||
await fs.writeFile(file, `${content}\n`, "utf-8");
|
||||
|
||||
const result = await repairSessionFileIfNeeded({ sessionFile: file });
|
||||
|
||||
expect(result.repaired).toBe(false);
|
||||
expect(result.droppedLines).toBe(0);
|
||||
const after = await fs.readFile(file, "utf-8");
|
||||
expect(after).toBe(`${content}\n`);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -33,31 +33,6 @@ function isSessionHeader(entry: unknown): entry is { type: string; id: string }
|
||||
return record.type === "session" && typeof record.id === "string" && record.id.length > 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Detect a `type: "message"` entry whose `message.role` is missing, `null`, or
|
||||
* not a non-empty string. Such entries surface in the wild as "null role"
|
||||
* JSONL corruption (e.g. #77228 reported transcripts that contained 935+
|
||||
* entries with null roles after an earlier failure). They cannot be replayed
|
||||
* to any provider — every provider router branches on `message.role` — and
|
||||
* preserving them through repair just relocates the corruption from the
|
||||
* original file into the post-repair file. Treat them as malformed lines:
|
||||
* drop during repair so the cleaned transcript no longer carries them.
|
||||
*/
|
||||
function isStructurallyInvalidMessageEntry(entry: unknown): boolean {
|
||||
if (!entry || typeof entry !== "object") {
|
||||
return false;
|
||||
}
|
||||
const record = entry as { type?: unknown; message?: unknown };
|
||||
if (record.type !== "message") {
|
||||
return false;
|
||||
}
|
||||
if (!record.message || typeof record.message !== "object") {
|
||||
return true;
|
||||
}
|
||||
const role = (record.message as { role?: unknown }).role;
|
||||
return typeof role !== "string" || role.trim().length === 0;
|
||||
}
|
||||
|
||||
function isAssistantEntryWithEmptyContent(entry: unknown): entry is SessionMessageEntry {
|
||||
if (!entry || typeof entry !== "object") {
|
||||
return false;
|
||||
@@ -218,15 +193,6 @@ export async function repairSessionFileIfNeeded(params: {
|
||||
}
|
||||
try {
|
||||
const entry: unknown = JSON.parse(line);
|
||||
if (isStructurallyInvalidMessageEntry(entry)) {
|
||||
// Drop "null role" / missing-role message entries the same way we
|
||||
// drop unparseable JSONL: they cannot be replayed to any provider
|
||||
// and preserving them through repair just relocates the corruption
|
||||
// into the post-repair file (#77228: 935+ null-role entries
|
||||
// surviving the auto-repair pass).
|
||||
droppedLines += 1;
|
||||
continue;
|
||||
}
|
||||
if (isAssistantEntryWithEmptyContent(entry)) {
|
||||
entries.push(rewriteAssistantEntryWithEmptyContent(entry));
|
||||
rewrittenAssistantMessages += 1;
|
||||
|
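Note on the hunk above: on the side that includes `isStructurallyInvalidMessageEntry`, a `type: "message"` line whose `message.role` is missing, `null`, or blank is dropped during repair in the same way as unparseable JSONL. The sketch below isolates that filtering step. It is not the real `repairSessionFileIfNeeded`, which also handles backups and rewrites empty assistant content; `dropInvalidMessageLinesSketch` is a hypothetical helper, and skipping empty lines is an assumption.

```typescript
// Mirrors the predicate shown in the diff: only "message" envelopes are inspected.
function isStructurallyInvalidMessageEntrySketch(entry: unknown): boolean {
  if (!entry || typeof entry !== "object") {
    return false;
  }
  const record = entry as { type?: unknown; message?: unknown };
  if (record.type !== "message") {
    return false;
  }
  if (!record.message || typeof record.message !== "object") {
    return true;
  }
  const role = (record.message as { role?: unknown }).role;
  return typeof role !== "string" || role.trim().length === 0;
}

// Hypothetical helper: keep replayable lines, count everything that is dropped.
function dropInvalidMessageLinesSketch(jsonl: string): { kept: string[]; droppedLines: number } {
  const kept: string[] = [];
  let droppedLines = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) {
      continue; // Assumption: empty lines are skipped, not counted as dropped.
    }
    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      droppedLines += 1; // Unparseable JSONL.
      continue;
    }
    if (isStructurallyInvalidMessageEntrySketch(entry)) {
      droppedLines += 1; // Missing, null, or blank role.
      continue;
    }
    kept.push(line);
  }
  return { kept, droppedLines };
}
```

Other envelope types such as `summary` or `custom` pass through untouched because the predicate returns `false` for anything that is not `type: "message"`, which matches the third test case in the removed test block.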
||||
Some files were not shown because too many files have changed in this diff.