Compare commits

..

88 Commits

Author SHA1 Message Date
Peter Steinberger
b48d2438d3 revert: restore default sessions background 2026-06-25 20:24:07 -04:00
Peter Steinberger
369288718e style(ui): use frog green sessions background 2026-06-25 20:19:00 -04:00
Peter Steinberger
bcd926cc4f feat(ui): use red sessions background 2026-06-25 20:02:36 -04:00
brokemac79
0da26499da fix: fallback on safe prompt timeouts (#96142) 2026-06-25 16:33:44 -07:00
brokemac79
1f941a026e fix: reload control ui on service worker update (#96141) 2026-06-25 16:33:27 -07:00
brokemac79
941e8f1ef2 fix: reset diff viewer controllers on rehydrate (#96138) 2026-06-25 16:19:30 -07:00
Renaud Cerrato
95b97e5b0b fix(exec): fail invalid explicit workdir before running (#94441)
* fix(exec): fail invalid explicit workdir before running

* test(exec): tighten invalid workdir regression

* fix(exec): clarify invalid workdir recovery

* refactor(exec): centralize workdir resolution

* test(exec): update invalid workdir assertion

* fix(exec): harden backend workdir contract

* fix(exec): map missing backend host workdirs

* fix(exec): reject control commands before workdir prep

* fix(exec): defer env hook until backend cwd validation

* chore(sdk): refresh plugin api baseline

* test(agents): drop redundant definition assertions

* test(exec): use real config workdirs

* test(exec): use tracked temp dirs

* test(openshell): keep temp setup local

* test: update temp-dir route fixture

---------

Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com>
2026-06-26 08:02:00 +10:00
VACInc
13ecca5408 fix(telegram): back off session init spool retries 2026-06-25 13:41:57 -07:00
Wynne668
c68484acc4 fix(gateway): report omitted chat-history messages in truncation log (#96788)
Summary:
- The PR moves Gateway `chat.history` omission accounting to a whole-pipeline reporter and adds focused helper plus real WebSocket request regression tests.
- PR surface: Source +35, Tests +219. Total +254 across 4 files.
- Reproducibility: yes. Current-main source shows the zero-count keep-last helper branch and the positive-coun ... he PR body includes a negative-control real WebSocket run where the same request test fails before the fix.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(gateway): count unique omitted chat-history messages + prove diag…
- PR branch already contained follow-up commit before automerge: test(gateway): prove chat.history request emits omission diagnostic

Validation:
- ClawSweeper review passed for head 414f885880.
- Required merge gates passed before the squash merge.

Prepared head SHA: 414f885880
Review: https://github.com/openclaw/openclaw/pull/96788#issuecomment-4799553366

Co-authored-by: ZengWen-DT <ceng.wen@xydigit.com>
Approved-by: takhoffman
2026-06-25 20:17:27 +00:00
Ayaan Zaidi
d2da8c79d9 fix(auto-reply): serialize reply session initialization 2026-06-25 13:10:35 -07:00
NIO
1aa7cafc35 fix(github-copilot): bound model discovery and embeddings JSON response (#96499)
* fix(github-copilot): bound model discovery and embeddings JSON response reads

The GitHub Copilot embeddings plugin already bounds its error response
bodies via readResponseTextLimited, but the success JSON reads for both
model discovery and the embeddings call used unbounded response.json().
Route both through readProviderJsonResponse (16 MiB cap).

Update isCopilotSetupError to recognise the new error label prefix so
auto-selection still falls through on malformed discovery responses.
Update tests to use proper Response objects and the new error messages.

AI-assisted.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(github-copilot): use memory embedding response cap

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: sallyom <somalley@redhat.com>
2026-06-25 14:46:09 -04:00
NIO
66e2fcc6f8 fix(speech): bound TTS/STT voice-list and transcription JSON response reads (#96496)
Route success JSON reads through readProviderJsonResponse (16 MiB cap) in
azure-speech, elevenlabs, microsoft, minimax/tts, xai/stt, and
openrouter/media-understanding to prevent OOM from oversized or hostile
endpoint responses. Mirrors the response-limit campaign already applied to
other provider paths.

AI-assisted.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-25 14:32:53 -04:00
Ben Badejo
b3ac552c82 fix(codex): prefer desktop app-server for Computer Use on macOS (#96730)
* fix(codex): prefer desktop app-server for Computer Use on macOS

* fix(codex): fall back from stale desktop app-server

---------

Co-authored-by: Benjamin Badejo <ben@benbadejo.com>
2026-06-25 14:28:20 -04:00
mushuiyu886
5715b55000 fix(openrouter): bound video catalog JSON reads (#96505) 2026-06-25 14:17:01 -04:00
Radek Sienkiewicz
0247eab773 fix(cli): sync official plugins during update all (#96831)
Co-authored-by: ooiuuii <169449607+ooiuuii@users.noreply.github.com>
2026-06-25 20:13:37 +02:00
Alix-007
646e54ae35 fix(github-copilot): bound usage response (#96607)
The Copilot usage read in extensions/github-copilot/usage.ts parsed its
HTTP response with an unbounded await res.json(). A hostile or buggy
api.github.com proxy (the proxy endpoint is derived from a user-supplied
token) could stream an unbounded JSON body and drive the usage snapshot
into OOM.

Route the read through the shared readProviderJsonResponse (from
openclaw/plugin-sdk/provider-http), which enforces the 16 MiB byte cap,
cancels the stream on overflow, and wraps malformed JSON with the caller
label. Same no-helper-import-to-bounded-reader shape as the #96027 /
#96038 response-limit work.

Add a focused regression test: when the usage stream exceeds the JSON
byte cap, fetchCopilotUsage rejects with a bounded-overflow error and the
reader cancels the body mid-flight instead of buffering the full
advertised stream. Existing parse/HTTP-error cases keep passing.
2026-06-25 13:53:43 -04:00
Alix-007
d3620da3e0 fix(voyage): bound embedding-batch status, error, and non-OK responses (#96608)
The batch status read (fetchVoyageBatchStatus) parsed its response with an
unbounded await res.json(), and the batch error-file read (readVoyageBatchError)
buffered the whole body via await res.text(). On top of that, the non-OK
(4xx/5xx) diagnostic body was still read unbounded: assertVoyageResponseOk did
await res.text() before throwing, and the non-OK output-file branch in
runVoyageEmbeddingBatches did the same. Voyage base URLs are user-supplied and
reachable via SSRF, so a misbehaving or hostile endpoint could stream an
unbounded body into memory on any of these paths before parsing.

Route the status JSON through the shared readProviderJsonResponse, the error
file through readResponseWithLimit, and now the non-OK diagnostic body through
readResponseWithLimit as well, all under a single 16 MiB cap, cancelling the
stream on overflow before decode/parse. assertVoyageResponseOk preserves its
original "${context}: ${status} ${text}" diagnostic shape for under-cap bodies
and throws a bounded "(error body exceeds <N> bytes)" on overflow; the non-OK
output-file branch now reuses it instead of a duplicate unbounded read. The
existing error-file fail-soft handling (formatUnavailableBatchError) is
preserved, so a capped endpoint degrades gracefully. The submit path already
bounds its body via postJsonWithRetry/maxResponseBytes and is left untouched.

Symmetric counterpart to the #96027/#96038 response-limit campaign.
2026-06-25 13:52:36 -04:00
Alix-007
7b5ee739eb fix(byteplus): bound video-generation success response (#96606) 2026-06-25 13:47:07 -04:00
Alix-007
bfc33ac114 fix(google): bound video success response (#96605) 2026-06-25 13:41:35 -04:00
Alix-007
cc124d2921 fix(qwen): bound video success response (#96604) 2026-06-25 13:40:07 -04:00
Vincent Koc
7cce191b05 test(infra): isolate matrix outbound queue integration 2026-06-26 01:23:16 +08:00
Yzx
7fefc5ff58 fix: cron stream stalls fail over before job timeout (#96096)
* fix(agents): cap cron stream idle stalls

* fix(agents): preserve cron hostname timeout

* fix: bound cron idle timeout local exceptions

* fix: bound cron idle timeout local exceptions

---------

Co-authored-by: Radek Sienkiewicz <mail@velvetshark.com>
2026-06-25 19:04:15 +02:00
Yzx
19707cce1d fix(cron): avoid gateway restart on setup timeout (#96396)
* fix(cron): avoid gateway restart on setup timeout

* fix(cron): avoid gateway restart on setup timeout

---------

Co-authored-by: Radek Sienkiewicz <mail@velvetshark.com>
2026-06-25 18:11:33 +02:00
Ayaan Zaidi
a3b4e8102f test(telegram): fold draft preview surrogate clamp coverage 2026-06-25 09:10:07 -07:00
杨浩宇0668001029
4bd68aef65 fix(telegram): keep draft preview chunks surrogate-safe 2026-06-25 09:10:07 -07:00
Ayaan Zaidi
8bc069f76f fix(outbound): preserve narrowed delivery target type 2026-06-25 08:37:00 -07:00
Ayaan Zaidi
1adb119ba0 refactor(outbound): distill reserved target delivery cleanup 2026-06-25 08:37:00 -07:00
zhang-guiping
57c07d7f3b refactor: centralize reserved target error checks
Keep reserved-target detection behavior unchanged while routing callers through a shared helper so future changes stay localized.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-25 08:37:00 -07:00
张贵萍0668001030
3c8ff0d1c3 fix(outbound): require exact reserved directory matches 2026-06-25 08:37:00 -07:00
张贵萍0668001030
3a03d1e70b fix(cron): preserve reserved directory targets 2026-06-25 08:37:00 -07:00
张贵萍0668001030
9047b1cfa1 fix(outbound): preserve reserved directory target on route miss 2026-06-25 08:37:00 -07:00
张贵萍0668001030
ba004b3547 test(outbound): align Telegram resolver fixtures with chat capabilities 2026-06-25 08:37:00 -07:00
张贵萍0668001030
3092b4fd0d fix(outbound): fail closed heartbeat reserved Telegram misses 2026-06-25 08:37:00 -07:00
张贵萍0668001030
116758e69a fix(outbound): satisfy target resolver lint 2026-06-25 08:37:00 -07:00
张贵萍0668001030
cd3793185b fix(outbound): preserve reserved Telegram directory targets 2026-06-25 08:37:00 -07:00
zhang-guiping
5fccf06b5f fix(outbound): defer reserved-literal errors to async session-route resolver
In resolveAgentDeliveryPlanWithSessionRoute, reserved-literal errors from the
sync outbound target check are no longer treated as fatal. Instead, the path
proceeds to resolveOutboundSessionRoute which calls resolveMessagingTarget,
already fixed to do directory-first lookup before rejecting reserved literals.

This preserves configured Telegram directory entries named like reserved words
(current, self, this, me) through the explicit agent/gateway delivery path.

Update docs to reflect directory-first ordering.
2026-06-25 08:37:00 -07:00
zhang-guiping
bbf494955d fix(outbound): preserve configured directory entries before reserved-literal rejection in resolveMessagingTarget
Move the reserved-literal check from before directory lookup to after directory
miss, so configured Telegram groups/channels whose directory key is a reserved
word (current, self, this, me) still resolve through the directory before
failing closed. The reserved check now runs only after the directory returns no
match and before plugin fallback resolution.

Update the regression test to verify directory-first ordering: a configured
directory entry named current resolves successfully, and a directory miss with
a reserved literal fails with the descriptive error.
2026-06-25 08:37:00 -07:00
zhang-guiping
f12ade0082 fix(outbound): skip sync reserved-literal rejection for heartbeat mode
The sync reserved-literal check in resolveOutboundTargetWithPlugin was
suppressing heartbeat routes to directory entries whose names match
reserved literals (e.g., a Telegram group named "current"). Skip the
check for heartbeat mode so the async resolveChannelTarget →
resolveMessagingTarget path can do directory-first lookup before
deciding.
2026-06-25 08:37:00 -07:00
张贵萍
56baf9d079 fix(outbound): reject reserved Telegram targets 2026-06-25 08:37:00 -07:00
Ayaan Zaidi
dc12b998da fix(media): scope UUID filename restore to media store 2026-06-25 08:35:41 -07:00
Narahari Raghava
cf512f639b fix(media): strip internal UUID suffix from outbound media filenames
Closes #96538
2026-06-25 08:35:41 -07:00
Ayaan Zaidi
29670c13f6 refactor(status): reuse runtime authority decision 2026-06-25 08:31:54 -07:00
zhang-guiping
bead84f0ee fix(status): route usage to session-selected model 2026-06-25 08:31:54 -07:00
Vincent Koc
497d53d821 fix(sdk): tighten wildcard surface budget 2026-06-25 23:30:17 +08:00
yetval
446d98d601 fix(trajectory): export legacy v1 sessions without entry timestamps
readSessionBranch filtered out every entry lacking a string or number
timestamp. Sessions written before entry timestamps existed (version 1)
have ids and parentIds synthesized by the legacy migration but no entry
timestamp, so all entries were dropped and the exported bundle reported
transcriptEventCount 0. The transcript event builder already defaults a
missing timestamp via normalizeTimestamp, so the filter clause was both
wrong and redundant. Drop it; entry identity plus the canonical-entry
check is what the branch walk needs.
2026-06-25 08:29:25 -07:00
Gio Della-Libera
82a6a57330 Doctor: expose session artifact findings (#95976)
* feat(doctor): expose session artifact findings

* fix(doctor): make session artifact findings advisory
2026-06-25 08:27:25 -07:00
Ayaan Zaidi
01ce03c5b1 fix(gateway): preserve webchat send guard 2026-06-25 08:26:12 -07:00
黑承亮0668000844
5881dc8ac3 fix(gateway): use normalizeMessageChannel for send validation to support plugin channels
Fixes #92094
2026-06-25 08:26:12 -07:00
openclaw-clownfish[bot]
31a0f97dd9 fix(clownfish): repair validation for repair-94016-live-pr-inventory-20260617t082059-003-20260617a (2) 2026-06-25 08:25:09 -07:00
openclaw-clownfish[bot]
ace22feb3f fix(clownfish): address review for repair-94016-live-pr-inventory-20260617t082059-003-20260617a (1) 2026-06-25 08:25:09 -07:00
openclaw-clownfish[bot]
ecd29fe572 fix(gateway): resume channel after pending task recovery 2026-06-25 08:25:09 -07:00
openclaw-clownfish[bot]
6039da3ed6 fix(gateway): resume channel after pending task recovery 2026-06-25 08:25:09 -07:00
sheyanmin
8b4be2fdd4 fix: recover channel after stop timeout in health monitor
When a channel stop times out (e.g. during a Telegram API outage),
the channel enters recoveryStopTimedOut state. The health monitor's
subsequent start call would set restartPending and return without
actually starting the channel.

If the stuck stop never completes, the channel stays in limbo forever
with the health monitor retrying every cycle but never recovering.

Fix: when the health monitor retries recovery (recoveryStartRequested
already set), clean up the stuck task state and allow the channel to
start normally.

Closes #94008
2026-06-25 08:25:09 -07:00
Ayaan Zaidi
210ea659f7 fix(outbound): prevent partial-send recovery replay 2026-06-25 08:16:56 -07:00
rosenlo
c0a61f5351 test(outbound): add drain no-replay guard for unknown_after_send; clarify fallback intent
- Add integration test: after mid-batch failure with send evidence the
  resulting unknown_after_send entry is NOT replayed by reconnect drain
  when no adapter reconciliation is available ('refusing blind replay').
  Pins the drain contract so any regression that re-enables blind replay
  is caught end-to-end against a real SQLite queue.
- Add comment in deliver.ts fallback branch: failDelivery inside the
  markQueuedPlatformOutcomeUnknown catch is a last-resort DB-write-error
  path, not an indication that failDelivery is correct with send evidence.
2026-06-25 08:16:56 -07:00
rosenlo
7f2c04ce11 fix(test): remove unused import and unnecessary type assertions in queue integration test 2026-06-25 08:16:56 -07:00
rosenlo
f9e0dce731 test(outbound): add real-queue integration test for unknown_after_send on mid-batch failure
Complement the unit test (which mocks delivery-queue) with an integration
test that uses the real SQLite delivery queue (no mock of ./delivery-queue.js)
and the real deliverOutboundPayloads code path.

Verifies at the queue layer:
- mid-batch failure with send evidence (first payload succeeds, second
  throws, queuePolicy=required) -> queue entry recovery_state advances to
  unknown_after_send, retryCount stays 0, no lastError. This is the patch
  path: drain will route the entry through reconcileUnknownQueuedDelivery
  instead of leaving it in send_attempt_started for blind replay.
- no send evidence (sole payload fails immediately) -> failDelivery path:
  retryCount bumped, recovery_state stays send_attempt_started. Patch does
  not affect this path.

Negative control confirmed: with deliver.ts reverted to v2026.6.8 original
(no patch), the mid-batch test fails with recovery_state=send_attempt_started
(the root-cause state), while the no-evidence test still passes. This
reproduces the patch's code path and proves the fix at the real-queue layer.
2026-06-25 08:16:56 -07:00
rosenlo
71422a9a5a fix(outbound): advance queue entry to unknown_after_send on mid-batch failure with send evidence
When a required-mode batch send fails mid-batch after an earlier payload
already succeeded, the wrapper catch in deliverOutboundPayloadsWithQueueCleanup
called failDelivery. failDelivery only bumps retryCount/lastError; it does
not advance recoveryState, so the entry stayed in send_attempt_started (set
earlier by markDeliveryPlatformSendAttemptStarted via onPlatformSendStart).

On the next Telegram reconnect, drainQueuedEntry sees send_attempt_started
and calls reconcileUnknownQueuedDelivery. When adapter reconciliation
misreports not_sent (the message was actually sent, per the outbound send
ok / messageId evidence), the entry is replayed and the user receives a
duplicate.

Fix: when the error carries send evidence (OutboundDeliveryError with
sentBeforeError === true and platformSendStarted === true), call
markQueuedPlatformOutcomeUnknown instead of failDelivery. This advances the
entry to unknown_after_send, which drain already routes through
reconcileUnknownQueuedDelivery, preserving the entry for adapter
reconciliation rather than leaving it in send_attempt_started for replay.

When there is no send evidence (sentBeforeError === false), failDelivery
remains correct: nothing reached the channel, so retrying is safe.

This is a third duplicate path distinct from #89812 (mirror best-effort)
and #92274 (subagent-announce-delivery retry); it is the outbound/deliver
wrapper catch, which neither prior fix covers.

Tests:
- regression: two payloads, first succeeds, second throws; asserts
  markDeliveryPlatformOutcomeUnknown called, failDelivery/ackDelivery not.
- guard: no send evidence; failDelivery still called.
2026-06-25 08:16:56 -07:00
linhongkuan
2e6e17f7c5 fix(media-generation): preserve trimmed default model flag (#96430) 2026-06-25 20:46:47 +08:00
linhongkuan
1ba1fecaa6 fix(acp-core): clear stale active run lookups (#96427) 2026-06-25 20:44:07 +08:00
Shakker
4ecb45bf77 fix: narrow test config path 2026-06-25 10:40:38 +01:00
Shakker
0757cad597 fix: narrow cron path env cleanup 2026-06-25 10:07:09 +01:00
Shakker
21b21583cc test: isolate reply media state env 2026-06-25 10:02:41 +01:00
Shakker
c8c4490b17 fix: scope embedded image state env 2026-06-25 09:59:32 +01:00
Shakker
d693b70bfc test: preserve daemon coverage env scope 2026-06-25 09:56:16 +01:00
Shakker
2b8c089b76 fix: guard current turn state env 2026-06-25 09:53:17 +01:00
Shakker
1d1c2f4f72 test: stabilize allowlist config env 2026-06-25 09:50:50 +01:00
Shakker
3ce398712a fix: preserve exec env test cleanup 2026-06-25 09:48:15 +01:00
Shakker
3c2a3d9d2b test: centralize tool manager agent env 2026-06-25 09:45:37 +01:00
Shakker
33d7a2a3f7 fix: route session history config env 2026-06-25 09:42:46 +01:00
Vincent Koc
94ae918d8f perf(plugins): reuse installed manifest realpaths (#96710)
Co-authored-by: sheyanmin <she.yanmin@xydigit.com>
2026-06-25 16:41:26 +08:00
David
af906225fa fix(git-hooks): skip sequencer pre-commit formatting (#95842)
* fix(git-hooks): skip sequencer pre-commit formatting

* chore: rerun CI

* fix(git-hooks): skip revert sequencer formatting

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-25 16:40:12 +08:00
Shakker
08b7fddf80 test: centralize shell snapshot env 2026-06-25 09:37:55 +01:00
Wynne668
d7dff3cbf4 fix(document-extract): render PDF image fallback per page so multi-page scans don't starve later pages (#96390)
* fix(document-extract): render PDF image fallback per page so multi-page scans don't starve later pages

clawpdf's mode:"images" extract applies a single maxPixels budget across
every page, so the first page consumes it and later pages collapse to ~1x1
PNGs that vision OCR models reject. Render each selected page in its own
extract() call so the pixel budget resets per page and every page yields a
usable image.

* fix(document-extract): preserve aggregate PDF render budget

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-25 16:37:47 +08:00
Vincent Koc
42d0a1267e perf(gateway): cache transcript field regexes (#96707)
Co-authored-by: Yongan Zhang <374456248@qq.com>
2026-06-25 16:36:44 +08:00
David
99f56cd548 fix(discord): keep audio voice replies threaded (#95978)
* fix(discord): keep audio voice replies threaded

* chore: retrigger CI after base recovery
2026-06-25 16:36:23 +08:00
Shakker
e6a2f61e94 fix: route persisted result config env 2026-06-25 09:29:46 +01:00
linhongkuan
c030b305a4 fix(agent-core): preserve empty prompt arguments (#96405) 2026-06-25 16:25:52 +08:00
ly-wang19
770b19f496 fix(imessage): only strip standalone role-turn markers, not prose ending in a role word (#96392)
ROLE_TURN_MARKER_RE anchored only the end of the line (\b...:\s*$), so any
outbound line that merely ended with 'user:'/'system:'/'assistant:' was
truncated — e.g. 'Please send this reply to the user:' lost its last word.
Anchor the marker to the whole line so only a standalone leaked turn marker
(its own line) is stripped; standalone-marker behavior is unchanged.

Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 16:05:36 +08:00
linhongkuan
793b604b23 fix(media-understanding): parse nested Gemini output JSON (#96432) 2026-06-25 15:50:39 +08:00
linhongkuan
31e941c3fc fix(context): count fullwidth chars in token estimates (#96442) 2026-06-25 14:36:04 +08:00
Vincent Koc
56d95b18f4 fix(sdk): refresh plugin sdk api baseline 2026-06-25 06:51:02 +02:00
Vincent Koc
e7f2b125f6 fix(test): isolate upgrade survivor artifacts 2026-06-25 05:49:46 +02:00
Vincent Koc
643410c1f3 test(qa): scope fanout marker proof to channel runtime 2026-06-25 10:20:51 +08:00
Vincent Koc
8d4e40d293 test(qa): extend fanout marker wait 2026-06-25 10:20:51 +08:00
Vincent Koc
068ae4eb4b test(qa): allow Codex fanout completion window 2026-06-25 10:20:51 +08:00
Vincent Koc
dad7168c2f fix(qa): align runtime parity evidence with Codex 2026-06-25 10:20:51 +08:00
Vincent Koc
31a65e0647 fix(agents): preserve absent embedded session keys 2026-06-25 10:20:51 +08:00
191 changed files with 8624 additions and 2049 deletions

View File

@@ -1,2 +1,2 @@
35b314075ff47453c5d57788861ca0c0e65d6a988b549ab2a2e1757b7590d140 plugin-sdk-api-baseline.json
0dc8abcefccfe7d19280bde5fb2c0c69cf73b782d47e3759e2984baf904fe07c plugin-sdk-api-baseline.jsonl
abdff20b710c6b0fecb5af25603d7cfad7ade80600ca374ebe38f69d78933b50 plugin-sdk-api-baseline.json
630367961e4d14463020f588564c23308159ae2de6e4301418b2b0c471797e70 plugin-sdk-api-baseline.jsonl

View File

@@ -399,13 +399,17 @@ Updates apply to tracked plugin installs in the managed plugin index and tracked
<Accordion title="Resolving plugin id vs npm spec">
When you pass a plugin id, OpenClaw reuses the recorded install spec for that plugin. That means previously stored dist-tags such as `@beta` and exact pinned versions continue to be used on later `update <id>` runs.
That targeted-update rule is different from the bulk `openclaw plugins update --all` maintenance path. Bulk updates still respect ordinary tracked install specs, but trusted official OpenClaw plugin records can sync to the current official catalog target instead of staying on a stale exact official package. Use targeted `update <id>` when you intentionally want to keep an exact or tagged official spec untouched.
For npm installs, you can also pass an explicit npm package spec with a dist-tag or exact version. OpenClaw resolves that package name back to the tracked plugin record, updates that installed plugin, and records the new npm spec for future id-based updates.
Passing the npm package name without a version or tag also resolves back to the tracked plugin record. Use this when a plugin was pinned to an exact version and you want to move it back to the registry's default release line.
</Accordion>
<Accordion title="Beta channel updates">
`openclaw plugins update` reuses the tracked plugin spec unless you pass a new spec. `openclaw update` additionally knows the active OpenClaw update channel: on the beta channel, default-line npm and ClawHub plugin records try `@beta` first. They fall back to the recorded default/latest spec if no plugin beta release exists; npm plugins also fall back when the beta package exists but fails install validation. That fallback is reported as a warning and does not fail the core update. Exact versions and explicit tags stay pinned to that selector.
Targeted `openclaw plugins update <id-or-npm-spec>` reuses the tracked plugin spec unless you pass a new spec. Bulk `openclaw plugins update --all` uses the configured `update.channel` when it syncs trusted official plugin records to the official catalog target, so beta-channel installs can stay on the beta release line instead of being silently normalized to stable/latest.
`openclaw update` also knows the active OpenClaw update channel: on the beta channel, default-line npm and ClawHub plugin records try `@beta` first. They fall back to the recorded default/latest spec if no plugin beta release exists; npm plugins also fall back when the beta package exists but fails install validation. That fallback is reported as a warning and does not fail the core update. Exact versions and explicit tags stay pinned to that selector for targeted updates.
</Accordion>
<Accordion title="Version checks and integrity drift">

View File

@@ -167,7 +167,7 @@ surfaces, while Codex native hooks remain a separate lower-level Codex mechanism
- Agent runtime: `agents.defaults.timeoutSeconds` default 172800s (48 hours); enforced in `runEmbeddedAgent` abort timer.
- Cron runtime: isolated agent-turn `timeoutSeconds` is owned by cron. The scheduler starts that timer when execution begins, aborts the underlying run at the configured deadline, then runs bounded cleanup before recording the timeout so a stale child session cannot keep the lane stuck.
- Session liveness diagnostics: with diagnostics enabled, `diagnostics.stuckSessionWarnMs` classifies long `processing` sessions that have no observed reply, tool, status, block, or ACP progress. Active embedded runs, model calls, and tool calls report as `session.long_running`; owned silent model calls also stay `session.long_running` until `diagnostics.stuckSessionAbortMs` so slow or non-streaming providers are not reported as stalled too early. Active work with no recent progress reports as `session.stalled`; owned model calls switch to `session.stalled` at or after the abort threshold, and ownerless stale model/tool activity is not hidden as long-running. `session.stuck` is reserved for recoverable stale session bookkeeping, including idle queued sessions with stale ownerless model/tool activity. Stale session bookkeeping releases the affected session lane immediately after recovery gates pass; stalled embedded runs are abort-drained only after `diagnostics.stuckSessionAbortMs` (default: at least 5 minutes and 3x the warning threshold) so queued work can resume without cutting off merely slow runs. Recovery emits structured requested/completed outcomes, and diagnostic state is marked idle only if the same processing generation is still current. Repeated `session.stuck` diagnostics back off while the session remains unchanged.
- Model idle timeout: OpenClaw aborts a model request when no response chunks arrive before the idle window. `models.providers.<id>.timeoutSeconds` extends this idle watchdog for slow local/self-hosted providers, but it is still bounded by any lower `agents.defaults.timeoutSeconds` or run-specific timeout because those control the whole agent run. Otherwise OpenClaw uses `agents.defaults.timeoutSeconds` when configured, capped at 120s by default. Cron-triggered cloud model runs with no explicit model or agent timeout use the same default idle watchdog; cron-triggered local or self-hosted model runs disable the implicit watchdog unless an explicit timeout is configured, so slow local providers should set `models.providers.<id>.timeoutSeconds`.
- Model idle timeout: OpenClaw aborts a model request when no response chunks arrive before the idle window. `models.providers.<id>.timeoutSeconds` extends this idle watchdog for slow local/self-hosted providers, but it is still bounded by any lower `agents.defaults.timeoutSeconds` or run-specific timeout because those control the whole agent run. Otherwise OpenClaw uses `agents.defaults.timeoutSeconds` when configured, capped at 120s by default. Cron-triggered cloud model runs with no explicit model or agent timeout use the same default idle watchdog; with an explicit cron run timeout, cloud model stream stalls are capped at 60s so configured model fallbacks can run before the outer cron deadline. Cron-triggered local or self-hosted model runs disable the implicit watchdog unless an explicit timeout is configured, and explicit cron run timeouts remain the idle window for local/self-hosted providers, so slow local providers should set `models.providers.<id>.timeoutSeconds`.
- Provider HTTP request timeout: `models.providers.<id>.timeoutSeconds` applies to that provider's model HTTP fetches, including connect, headers, body, SDK request timeout, total guarded-fetch abort handling, and model stream idle watchdog. Use this for slow local/self-hosted providers such as Ollama before raising the whole agent runtime timeout, and keep the agent/runtime timeout at least as high when the model request needs to run longer.
## Where things can end early

View File

@@ -737,6 +737,10 @@ outbound host generic and use the messaging adapter surface for provider rules:
should be treated as `direct`, `group`, or `channel` before directory lookup.
- `messaging.targetResolver.looksLikeId(raw, normalized)` tells core whether an
input should skip straight to id-like resolution instead of directory search.
- `messaging.targetResolver.reservedLiterals` lists bare words that are
channel/session references for that provider. Resolution preserves configured
directory entries before rejecting reserved literals, then fails closed on a
directory miss.
- `messaging.targetResolver.resolveTarget(...)` is the plugin fallback when
core needs a final provider-owned resolution after normalization or after a
directory miss.

View File

@@ -115,6 +115,17 @@ before the thread starts.
After changing Computer Use config, use `/new` or `/reset` in the affected chat
before testing if an existing Codex thread has already started.
On macOS managed stdio startup, OpenClaw prefers the signed desktop Codex app
bundle at `/Applications/Codex.app/Contents/Resources/codex` when it exists.
That keeps Computer Use under the app bundle that owns the local desktop-control
permissions. If the desktop app is not installed, OpenClaw falls back to the
managed Codex binary installed beside the plugin. If an installed desktop app
initializes with an unsupported app-server version, OpenClaw closes that child
and retries the next managed binary candidate instead of letting a stale
desktop app shadow the plugin-local fallback. Explicit `appServer.command`
config or `OPENCLAW_CODEX_APP_SERVER_BIN` still overrides this managed
selection.
## Commands
Use the `/codex computer-use` commands from any chat surface where the `codex`
@@ -276,7 +287,13 @@ Codex app-server MCP status, or macOS permissions.
**Status or a probe times out on `computer-use.list_apps`.** The plugin and MCP
server are present, but the local Computer Use bridge did not answer. Quit or
restart Codex Computer Use, relaunch Codex Desktop if needed, then retry in a
fresh OpenClaw session.
fresh OpenClaw session. If the host previously ran Computer Use through an older
managed Codex app-server, refresh the installed plugin from the desktop bundled
marketplace:
```text
/codex computer-use install --source /Applications/Codex.app/Contents/Resources/plugins/openai-bundled
```
**A Computer Use tool says `Native hook relay unavailable`.** The Codex-native
tool hook could not reach an active OpenClaw relay through the local bridge or

View File

@@ -110,6 +110,13 @@ When you pass a plugin id, OpenClaw reuses the tracked install spec. Stored
dist-tags such as `@beta` and exact pinned versions continue to be used on
later `update <plugin-id>` runs.
`openclaw plugins update --all` is the bulk maintenance path. It still respects
ordinary tracked install specs, but trusted official OpenClaw plugin records can
sync to the current official catalog target instead of staying on a stale exact
official package. If `update.channel` is set to `beta`, that bulk official sync
uses the beta-channel context. Use a targeted `update <plugin-id>` when you
intentionally want to keep an exact or tagged official spec untouched.
For npm installs, you can pass an explicit package spec to switch the tracked
record:

View File

@@ -739,7 +739,7 @@ Write colocated tests in `src/channel.test.ts`:
describeMessageTool and action discovery
</Card>
<Card title="Target resolution" icon="crosshair" href="/plugins/architecture-internals#channel-target-resolution">
inferTargetChatType, looksLikeId, resolveTarget
inferTargetChatType, looksLikeId, reservedLiterals, resolveTarget
</Card>
<Card title="Runtime helpers" icon="settings" href="/plugins/sdk-runtime">
TTS, STT, media, subagent via api.runtime

View File

@@ -2,7 +2,10 @@
* Azure Speech REST helpers. They normalize endpoints, build SSML, list voices,
* and synthesize speech with response-size and SSRF guards.
*/
import { assertOkOrThrowProviderError } from "openclaw/plugin-sdk/provider-http";
import {
assertOkOrThrowProviderError,
readProviderJsonResponse,
} from "openclaw/plugin-sdk/provider-http";
import { readResponseWithLimit } from "openclaw/plugin-sdk/response-limit-runtime";
import type { SpeechVoiceOption } from "openclaw/plugin-sdk/speech-core";
import { trimToUndefined } from "openclaw/plugin-sdk/speech-core";
@@ -160,7 +163,10 @@ export async function listAzureSpeechVoices(params: {
try {
await assertOkOrThrowProviderError(response, "Azure Speech voices API error");
const voices = (await response.json()) as AzureSpeechVoiceEntry[];
const voices = await readProviderJsonResponse<AzureSpeechVoiceEntry[]>(
response,
"azure-speech.voices",
);
return Array.isArray(voices)
? voices
.filter((voice) => !isDeprecatedVoice(voice))

View File

@@ -1,12 +1,70 @@
// Byteplus tests cover video generation provider plugin behavior.
import {
getProviderHttpMocks,
installProviderHttpMockCleanup,
} from "openclaw/plugin-sdk/provider-http-test-mocks";
import { expectExplicitVideoGenerationCapabilities } from "openclaw/plugin-sdk/provider-test-contracts";
import { beforeAll, describe, expect, it, vi } from "vitest";
import { afterEach, beforeAll, describe, expect, it, vi } from "vitest";
const { postJsonRequestMock, fetchWithTimeoutMock } = getProviderHttpMocks();
// Submit/poll transport is mocked locally so each test can inject the BytePlus task JSON
// bodies, while readProviderJsonResponse is kept REAL (via importActual) so the byte-bounded
// reader actually streams and cancels oversized bodies under test instead of a stub.
const { postJsonRequestMock, fetchWithTimeoutMock, resolveApiKeyForProviderMock } = vi.hoisted(
() => ({
postJsonRequestMock: vi.fn(),
fetchWithTimeoutMock: vi.fn(),
resolveApiKeyForProviderMock: vi.fn(async () => ({ apiKey: "provider-key" })),
}),
);
vi.mock("openclaw/plugin-sdk/provider-auth-runtime", () => ({
resolveApiKeyForProvider: resolveApiKeyForProviderMock,
}));
vi.mock("openclaw/plugin-sdk/provider-http", async (importActual) => {
const actual = await importActual<typeof import("openclaw/plugin-sdk/provider-http")>();
const resolveTimeoutMs = (timeoutMs: unknown): number =>
typeof timeoutMs === "function" ? (timeoutMs() as number) : ((timeoutMs as number) ?? 60_000);
return {
// REAL byte-bounded JSON reader under test — not stubbed.
readProviderJsonResponse: actual.readProviderJsonResponse,
postJsonRequest: postJsonRequestMock,
fetchProviderOperationResponse: async (params: {
url: string;
init?: RequestInit;
timeoutMs?: unknown;
fetchFn: typeof fetch;
}) => fetchWithTimeoutMock(params.url, params.init ?? {}, resolveTimeoutMs(params.timeoutMs)),
fetchProviderDownloadResponse: async (params: {
url: string;
init?: RequestInit;
timeoutMs?: unknown;
fetchFn: typeof fetch;
}) => fetchWithTimeoutMock(params.url, params.init ?? {}, resolveTimeoutMs(params.timeoutMs)),
assertOkOrThrowHttpError: async () => {},
createProviderOperationDeadline: ({
label,
timeoutMs,
}: {
label: string;
timeoutMs?: number;
}) => ({ label, timeoutMs }),
createProviderOperationTimeoutResolver:
({ defaultTimeoutMs }: { defaultTimeoutMs: number }) =>
() =>
defaultTimeoutMs,
resolveProviderOperationTimeoutMs: ({ defaultTimeoutMs }: { defaultTimeoutMs: number }) =>
defaultTimeoutMs,
resolveProviderHttpRequestConfig: (params: {
baseUrl?: string;
defaultBaseUrl: string;
allowPrivateNetwork?: boolean;
defaultHeaders?: Record<string, string>;
}) => ({
baseUrl: params.baseUrl ?? params.defaultBaseUrl,
allowPrivateNetwork: params.allowPrivateNetwork === true,
headers: new Headers(params.defaultHeaders),
dispatcherPolicy: undefined,
}),
waitProviderOperationPollInterval: async () => {},
};
});
let buildBytePlusVideoGenerationProvider: typeof import("./video-generation-provider.js").buildBytePlusVideoGenerationProvider;
@@ -14,20 +72,22 @@ beforeAll(async () => {
({ buildBytePlusVideoGenerationProvider } = await import("./video-generation-provider.js"));
});
installProviderHttpMockCleanup();
afterEach(() => {
postJsonRequestMock.mockReset();
fetchWithTimeoutMock.mockReset();
resolveApiKeyForProviderMock.mockClear();
});
function mockSuccessfulBytePlusTask(params?: { model?: string }) {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({
id: "task_123",
}),
},
response: streamedJsonResponse({
id: "task_123",
}),
release: vi.fn(async () => {}),
});
fetchWithTimeoutMock
.mockResolvedValueOnce({
json: async () => ({
.mockResolvedValueOnce(
streamedJsonResponse({
id: "task_123",
status: "succeeded",
content: {
@@ -35,7 +95,7 @@ function mockSuccessfulBytePlusTask(params?: { model?: string }) {
},
model: params?.model ?? "seedance-1-0-lite-t2v-250428",
}),
})
)
.mockResolvedValueOnce({
headers: new Headers({ "content-type": "video/webm" }),
arrayBuffer: async () => Buffer.from("webm-bytes"),
@@ -77,6 +137,53 @@ function streamedVideoResponse(bytes: string): Response {
);
}
// BytePlus submit/poll task JSON is now read through the byte-bounded reader, so the
// mocked responses must expose a real readable body (not just a json() shortcut).
function streamedJsonResponse(payload: unknown): Response {
return new Response(
new ReadableStream({
start(controller) {
controller.enqueue(new TextEncoder().encode(JSON.stringify(payload)));
controller.close();
},
}),
{ status: 200, headers: { "content-type": "application/json" } },
);
}
// Builds a JSON body larger than the shared 16 MiB readProviderJsonResponse cap so the
// bounded reader cancels the stream mid-flight; if the cap were removed the reader would
// buffer the whole advertised payload before parsing. Tracks how many bytes were pulled
// and whether the stream was canceled so callers can assert the body was not fully read.
function makeOversizedJsonStream(): {
body: ReadableStream<Uint8Array>;
maxBytes: number;
totalBytes: number;
state: { bytesPulled: number; canceled: boolean };
} {
const maxBytes = 16 * 1024 * 1024; // matches PROVIDER_JSON_RESPONSE_MAX_BYTES.
const ONE_MIB = 1024 * 1024;
const TOTAL_CHUNKS = 32; // 32 MiB advertised body, double the cap.
const chunk = new Uint8Array(ONE_MIB);
const state = { bytesPulled: 0, canceled: false };
let pulled = 0;
const body = new ReadableStream<Uint8Array>({
pull(controller) {
if (pulled >= TOTAL_CHUNKS) {
controller.close();
return;
}
pulled += 1;
state.bytesPulled += chunk.length;
controller.enqueue(chunk);
},
cancel() {
state.canceled = true;
},
});
return { body, maxBytes, totalBytes: TOTAL_CHUNKS * ONE_MIB, state };
}
describe("byteplus video generation provider", () => {
it("declares explicit mode capabilities", () => {
expectExplicitVideoGenerationCapabilities(buildBytePlusVideoGenerationProvider());
@@ -110,21 +217,19 @@ describe("byteplus video generation provider", () => {
it("rejects generated video downloads that exceed the configured media cap", async () => {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({ id: "task_too_large" }),
},
response: streamedJsonResponse({ id: "task_too_large" }),
release: vi.fn(async () => {}),
});
fetchWithTimeoutMock
.mockResolvedValueOnce({
json: async () => ({
.mockResolvedValueOnce(
streamedJsonResponse({
id: "task_too_large",
status: "succeeded",
content: {
video_url: "https://example.com/too-large.mp4",
},
}),
})
)
.mockResolvedValueOnce(streamedVideoResponse("too-large"));
const provider = buildBytePlusVideoGenerationProvider();
@@ -222,16 +327,14 @@ describe("byteplus video generation provider", () => {
it("drops malformed response duration metadata", async () => {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({
id: "task_123",
}),
},
response: streamedJsonResponse({
id: "task_123",
}),
release: vi.fn(async () => {}),
});
fetchWithTimeoutMock
.mockResolvedValueOnce({
json: async () => ({
.mockResolvedValueOnce(
streamedJsonResponse({
id: "task_123",
status: "succeeded",
content: {
@@ -239,7 +342,7 @@ describe("byteplus video generation provider", () => {
},
duration: 1.5,
}),
})
)
.mockResolvedValueOnce({
headers: new Headers({ "content-type": "video/mp4" }),
arrayBuffer: async () => Buffer.from("mp4-bytes"),
@@ -259,11 +362,15 @@ describe("byteplus video generation provider", () => {
it("reports malformed create JSON with a provider-owned error", async () => {
const release = vi.fn(async () => {});
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => {
throw new SyntaxError("bad json");
},
},
response: new Response(
new ReadableStream({
start(controller) {
controller.enqueue(new TextEncoder().encode("{ not valid json"));
controller.close();
},
}),
{ status: 200, headers: { "content-type": "application/json" } },
),
release,
});
@@ -281,19 +388,17 @@ describe("byteplus video generation provider", () => {
it("rejects status responses missing a task status", async () => {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({ id: "task_missing_status" }),
},
response: streamedJsonResponse({ id: "task_missing_status" }),
release: vi.fn(async () => {}),
});
fetchWithTimeoutMock.mockResolvedValueOnce({
json: async () => ({
fetchWithTimeoutMock.mockResolvedValueOnce(
streamedJsonResponse({
id: "task_missing_status",
content: {
video_url: "https://example.com/byteplus.mp4",
},
}),
});
);
const provider = buildBytePlusVideoGenerationProvider();
await expect(
@@ -308,18 +413,16 @@ describe("byteplus video generation provider", () => {
it("rejects malformed completed content", async () => {
postJsonRequestMock.mockResolvedValue({
response: {
json: async () => ({ id: "task_malformed_content" }),
},
response: streamedJsonResponse({ id: "task_malformed_content" }),
release: vi.fn(async () => {}),
});
fetchWithTimeoutMock.mockResolvedValueOnce({
json: async () => ({
fetchWithTimeoutMock.mockResolvedValueOnce(
streamedJsonResponse({
id: "task_malformed_content",
status: "succeeded",
content: ["https://example.com/byteplus.mp4"],
}),
});
);
const provider = buildBytePlusVideoGenerationProvider();
await expect(
@@ -331,4 +434,61 @@ describe("byteplus video generation provider", () => {
}),
).rejects.toThrow("BytePlus video generation completed with malformed content");
});
it("bounds the submit task JSON body and cancels an oversized stream", async () => {
const stream = makeOversizedJsonStream();
const release = vi.fn(async () => {});
postJsonRequestMock.mockResolvedValue({
response: new Response(stream.body, {
status: 200,
headers: { "content-type": "application/json" },
}),
release,
});
const provider = buildBytePlusVideoGenerationProvider();
await expect(
provider.generateVideo({
provider: "byteplus",
model: "seedance-1-0-lite-t2v-250428",
prompt: "oversized submit response",
cfg: {},
}),
).rejects.toThrow(
`BytePlus video generation failed: JSON response exceeds ${stream.maxBytes} bytes`,
);
expect(stream.state.canceled).toBe(true);
// Only the bounded prefix is pulled, never the full advertised stream.
expect(stream.state.bytesPulled).toBeLessThan(stream.totalBytes);
// The submit request must still be released even though the body overflowed.
expect(release).toHaveBeenCalledOnce();
});
it("bounds the poll status JSON body and cancels an oversized stream", async () => {
postJsonRequestMock.mockResolvedValue({
response: streamedJsonResponse({ id: "task_oversized_poll" }),
release: vi.fn(async () => {}),
});
const stream = makeOversizedJsonStream();
fetchWithTimeoutMock.mockResolvedValueOnce(
new Response(stream.body, {
status: 200,
headers: { "content-type": "application/json" },
}),
);
const provider = buildBytePlusVideoGenerationProvider();
await expect(
provider.generateVideo({
provider: "byteplus",
model: "seedance-1-0-lite-t2v-250428",
prompt: "oversized poll response",
cfg: {},
}),
).rejects.toThrow(
`BytePlus video status request failed: JSON response exceeds ${stream.maxBytes} bytes`,
);
expect(stream.state.canceled).toBe(true);
expect(stream.state.bytesPulled).toBeLessThan(stream.totalBytes);
});
});

View File

@@ -11,6 +11,7 @@ import {
fetchProviderDownloadResponse,
fetchProviderOperationResponse,
postJsonRequest,
readProviderJsonResponse,
resolveProviderOperationTimeoutMs,
resolveProviderHttpRequestConfig,
waitProviderOperationPollInterval,
@@ -55,16 +56,13 @@ type BytePlusTaskResponse = {
type BytePlusTaskStatus = "running" | "failed" | "queued" | "succeeded" | "cancelled";
async function readBytePlusJsonResponse<T>(
response: Pick<Response, "json">,
label: string,
): Promise<T> {
let payload: unknown;
try {
payload = await response.json();
} catch (cause) {
throw new Error(`${label}: malformed JSON response`, { cause });
}
async function readBytePlusJsonResponse<T>(response: Response, label: string): Promise<T> {
// BytePlus submit/poll task bodies are read through the shared byte-bounded reader
// (readResponseWithLimit, via readProviderJsonResponse) so a hostile or buggy endpoint
// that streams an unbounded JSON body cannot force the runtime to buffer the whole
// payload before parsing. Overflow cancels the stream and throws a bounded error;
// malformed JSON keeps the existing `${label}: malformed JSON response` wrapping.
const payload = await readProviderJsonResponse<unknown>(response, label);
if (!isRecord(payload)) {
throw new Error(`${label}: malformed JSON response`);
}

View File

@@ -639,6 +639,15 @@ function assertSupportedCodexAppServerVersion(response: CodexInitializeResponse)
return detectedVersion;
}
export function isUnsupportedCodexAppServerVersionError(error: unknown): boolean {
return (
error instanceof Error &&
error.message.startsWith(
`Codex app-server ${MIN_CODEX_APP_SERVER_VERSION} or newer is required`,
)
);
}
function buildCodexAppServerRuntimeIdentity(
response: CodexInitializeResponse,
serverVersion: string,

View File

@@ -167,6 +167,7 @@ export type CodexAppServerStartOptions = {
transport: CodexAppServerTransportMode;
command: string;
commandSource?: CodexAppServerCommandSource;
managedFallbackCommandPaths?: string[];
args: string[];
url?: string;
authToken?: string;
@@ -332,7 +333,9 @@ const codexAppServerNetworkProxySchema = z
baseProfile: z.enum(["read-only", "workspace"]).optional(),
mode: z.enum(["limited", "full"]).optional(),
domains: z.record(z.string(), codexAppServerNetworkProxyDomainPermissionSchema).optional(),
unixSockets: z.record(z.string(), codexAppServerNetworkProxyUnixSocketPermissionSchema).optional(),
unixSockets: z
.record(z.string(), codexAppServerNetworkProxyUnixSocketPermissionSchema)
.optional(),
proxyUrl: z.string().trim().min(1).optional(),
socksUrl: z.string().trim().min(1).optional(),
enableSocks5: z.boolean().optional(),
@@ -874,6 +877,7 @@ export function codexAppServerStartOptionsKey(
transport: options.transport,
command: options.command,
commandSource: options.commandSource ?? null,
managedFallbackCommandPaths: [...(options.managedFallbackCommandPaths ?? [])],
args: options.args,
url: options.url ?? null,
authToken: hashSecretForKey(options.authToken, "authToken"),

View File

@@ -1102,585 +1102,6 @@ describe("createCodexDynamicToolBridge", () => {
]);
});
it("marks delivered message-tool-only source replies as terminal", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", { messageId: "imessage-6264" }),
{ sourceReplyDeliveryMode: "message_tool_only" },
);
const result = await handleMessageToolCall(bridge, {
action: "send",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("keeps message-tool-only source replies terminal when middleware redacts receipt details", async () => {
const registry = createEmptyPluginRegistry();
registry.agentToolResultMiddlewares.push({
pluginId: "receipt-redactor",
pluginName: "Receipt redactor",
rawHandler: () => undefined,
handler: (event: { result: AgentToolResult<unknown> }) => ({
result: {
content: event.result.content,
details: { redacted: true },
},
}),
runtimes: ["codex"],
source: "test",
});
setActivePluginRegistry(registry);
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", {
receipt: {
primaryPlatformMessageId: "imessage-6264",
platformMessageIds: ["imessage-6264"],
},
}),
{ sourceReplyDeliveryMode: "message_tool_only" },
);
const result = await handleMessageToolCall(bridge, {
action: "send",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("does not treat target telemetry alone as delivered message-tool-only source reply evidence", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent."), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "chat-1",
});
const result = await handleMessageToolCall(bridge, {
action: "send",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
expect.objectContaining({
tool: "message",
provider: "imessage",
to: "chat-1",
text: "visible reply",
}),
]);
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("keeps message-tool-only source replies terminal for explicit current source routes", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", { ok: true, messageId: "imessage-853" }),
{
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:+12069106512",
currentMessagingTarget: "+12069106512",
},
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "853",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("keeps normalized explicit source routes terminal", async () => {
setActivePluginRegistry(
createTestRegistry([
{
pluginId: "sms",
plugin: {
id: "sms",
messaging: {
normalizeTarget: (raw: string) => {
const digits = raw.replace(/\D/gu, "");
return digits.length === 11 && digits.startsWith("1") ? `+${digits}` : raw.trim();
},
},
},
source: "test",
},
]),
);
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", { ok: true, messageId: "sms-853" }),
{
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "sms",
currentChannelId: "sms:+12069106512",
currentMessagingTarget: "+12069106512",
},
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "sms",
target: "+1 (206) 910-6512",
messageId: "853",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
expect.objectContaining({
tool: "message",
provider: "sms",
to: "+12069106512",
text: "visible reply",
}),
]);
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("keeps message-tool-only source replies terminal when the reply receipt matches the current message id", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", {
ok: true,
messageId: "provider-message-1",
repliedTo: "provider-guid-857",
}),
{
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:any;-;+12069106512",
currentMessageId: "provider-guid-857",
},
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "857",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText("Sent."));
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
expect.objectContaining({
tool: "message",
provider: "imessage",
to: "+12069106512",
text: "visible reply",
}),
]);
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("keeps message-tool-only source replies terminal when a text receipt matches the current message id", async () => {
const receiptText = JSON.stringify({
ok: true,
messageId: "provider-message-1",
repliedTo: "provider-guid-861",
});
const bridge = createBridgeWithToolResult("message", textToolResult(receiptText), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:any;-;+12069106512",
currentMessageId: "provider-guid-861",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "861",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText(receiptText));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("does not let dry-run reply receipts terminate message-tool-only source replies", async () => {
const receiptText = JSON.stringify({
deliveryStatus: "dry_run",
dryRun: true,
replyToId: "provider-guid-862",
});
const bridge = createBridgeWithToolResult("message", textToolResult(receiptText), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:any;-;+12069106512",
currentMessageId: "provider-guid-862",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "862",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText(receiptText));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not record dry-run reply actions as committed sends", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Dry run.", {
deliveryStatus: "dry_run",
dryRun: true,
}),
{
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:+12069106512",
currentMessagingTarget: "+12069106512",
currentMessageId: "provider-guid-862",
},
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "862",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Dry run."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didSendViaMessagingTool).toBe(false);
expect(bridge.telemetry.messagingToolSentTargets).toEqual([]);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("keeps message-tool-only source replies terminal for explicit native target segments", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:any;-;+12069106512",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "863",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("keeps message-tool-only source replies terminal when the provider is only in the current channel id", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelId: "imessage:any;-;+12069106512",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "865",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("records message-tool-owned terminal replies as delivered source replies", async () => {
const bridge = createBridgeWithToolResult(
"message",
{
...textToolResult("Sent.", { ok: true }),
terminate: true,
} as AgentToolResult<unknown>,
{ sourceReplyDeliveryMode: "message_tool_only" },
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "+12069106512",
messageId: "867",
message: "visible reply",
buttons: [],
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBe(true);
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(true);
expect(Object.keys(result)).not.toContain("terminate");
});
it("does not treat bare send telemetry as delivered message-tool-only source reply evidence", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent."), {
sourceReplyDeliveryMode: "message_tool_only",
});
const result = await handleMessageToolCall(bridge, {
action: "send",
message: "visible reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(bridge.telemetry.didSendViaMessagingTool).toBe(true);
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not let prior message-send telemetry terminate a later non-delivery tool result", async () => {
const execute = vi
.fn()
.mockResolvedValueOnce(textToolResult("Sent.", { messageId: "source-reply-1" }))
.mockResolvedValueOnce(textToolResult("No message sent.", { ok: true }));
const bridge = createCodexDynamicToolBridge({
tools: [createTool({ name: "message", execute })],
signal: new AbortController().signal,
hookContext: { sourceReplyDeliveryMode: "message_tool_only" },
});
const firstResult = await handleMessageToolCall(bridge, {
action: "send",
message: "visible reply",
});
const secondResult = await bridge.handleToolCall({
threadId: "thread-1",
turnId: "turn-1",
callId: "call-2",
namespace: null,
tool: "message",
arguments: { action: "inspect" },
});
expect(firstResult.terminate).toBe(true);
expect(bridge.telemetry.didSendViaMessagingTool).toBe(true);
expect(secondResult).toEqual(expectInputText("No message sent."));
expect(secondResult.terminate).toBeUndefined();
});
it("does not mark explicit message-tool sends as terminal source replies", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", { messageId: "other-chat-message" }),
{ sourceReplyDeliveryMode: "message_tool_only" },
);
const result = await handleMessageToolCall(bridge, {
action: "send",
target: "channel:other",
message: "cross-channel reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not mark mismatched explicit message-tool sends as terminal source replies", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent."), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:+12069106512",
currentMessagingTarget: "+12069106512",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "slack",
target: "+12069106512",
messageId: "853",
message: "cross-provider reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not mark same-target sibling-thread replies as terminal source replies", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "slack",
currentChannelId: "slack:C123",
currentMessagingTarget: "C123",
currentThreadId: "171.222",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "slack",
target: "C123",
threadId: "171.333",
message: "sibling thread reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not mark implicit-target sibling-thread replies as terminal source replies", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "slack",
currentChannelId: "slack:C123",
currentMessagingTarget: "C123",
currentThreadId: "171.222",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "slack",
threadId: "171.333",
message: "sibling thread reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not mark top-level source replies with explicit thread routes as terminal", async () => {
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "slack",
currentChannelId: "slack:C123",
currentMessagingTarget: "C123",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "slack",
target: "C123",
threadId: "171.333",
message: "thread reply from top-level source",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not let matching reply receipts override explicit non-source routes", async () => {
const bridge = createBridgeWithToolResult(
"message",
textToolResult("Sent.", {
ok: true,
messageId: "other-chat-message",
repliedTo: "provider-guid-853",
}),
{
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "imessage",
currentChannelId: "imessage:+12069106512",
currentMessagingTarget: "+12069106512",
currentMessageId: "provider-guid-853",
},
);
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "imessage",
target: "other-chat",
message: "cross-channel reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not let provider target aliases override source routes", async () => {
setActivePluginRegistry(
createTestRegistry([
{
pluginId: "slack",
plugin: {
id: "slack",
messaging: { normalizeTarget: (raw: string) => raw.trim().toLowerCase() },
actions: {
messageActionTargetAliases: {
reply: {
aliases: ["chatGuid"],
deliveryTargetAliases: ["chatGuid"],
},
},
},
},
source: "test",
},
]),
);
const bridge = createBridgeWithToolResult("message", textToolResult("Sent.", { ok: true }), {
sourceReplyDeliveryMode: "message_tool_only",
currentChannelProvider: "slack",
currentChannelId: "channel:c1",
currentMessagingTarget: "channel:c1",
currentMessageId: "provider-guid-854",
});
const result = await handleMessageToolCall(bridge, {
action: "reply",
channel: "slack",
chatGuid: "Channel:C2",
messageId: "854",
message: "cross-chat reply",
});
expect(result).toEqual(expectInputText("Sent."));
expect(bridge.telemetry.messagingToolSentTargets).toEqual([
expect.objectContaining({
tool: "message",
provider: "slack",
to: "channel:c2",
text: "cross-chat reply",
}),
]);
expect(result.terminate).toBeUndefined();
expect(bridge.telemetry.didDeliverSourceReplyViaMessageTool).toBe(false);
});
it("does not record messaging side effects when the send fails", async () => {
const tool = createTool({
name: "message",

View File

@@ -18,8 +18,6 @@ import {
getChannelAgentToolMeta,
getPluginToolMeta,
type EmbeddedRunAttemptParams,
isDeliveredMessageToolOnlySourceReplyResult,
isDeliveredMessagingToolResult,
isReplaySafeToolCall,
isToolWrappedWithBeforeToolCallHook,
isToolResultError,
@@ -65,11 +63,9 @@ type CodexDynamicToolHookContext = {
currentChannelProvider?: string;
currentChannelId?: string;
currentMessagingTarget?: string;
currentMessageId?: string | number;
currentThreadId?: string;
replyToMode?: "off" | "first" | "all" | "batched";
hasRepliedRef?: { value: boolean };
sourceReplyDeliveryMode?: EmbeddedRunAttemptParams["sourceReplyDeliveryMode"];
onToolOutcome?: EmbeddedRunAttemptParams["onToolOutcome"];
allocateToolOutcomeOrdinal?: EmbeddedRunAttemptParams["allocateToolOutcomeOrdinal"];
};
@@ -104,225 +100,6 @@ function applyCurrentMessageProvider(
return { ...args, provider };
}
function normalizeRouteToken(value: string | number | undefined): string | undefined {
if (typeof value === "number") {
return Number.isFinite(value) ? String(value) : undefined;
}
const normalized = value?.trim().toLowerCase();
return normalized ? normalized : undefined;
}
function sourceRouteTokens(hookContext: CodexDynamicToolHookContext | undefined): Set<string> {
const tokens = new Set<string>();
const currentTarget = normalizeRouteToken(hookContext?.currentMessagingTarget);
const currentChannel = normalizeRouteToken(hookContext?.currentChannelId);
const currentProvider = normalizeRouteToken(hookContext?.currentChannelProvider);
if (currentTarget) {
tokens.add(currentTarget);
}
if (currentChannel) {
tokens.add(currentChannel);
}
const channelPrefixIndex = currentChannel?.indexOf(":") ?? -1;
if (channelPrefixIndex >= 0 && currentChannel) {
const unprefixedChannel = currentChannel.slice(channelPrefixIndex + 1);
if (unprefixedChannel) {
tokens.add(unprefixedChannel);
for (const segment of unprefixedChannel.split(/[;,]/u)) {
const token = normalizeRouteToken(segment);
if (token) {
tokens.add(token);
}
}
}
}
if (currentProvider && currentChannel?.startsWith(`${currentProvider}:`)) {
const unprefixedChannel = currentChannel.slice(currentProvider.length + 1);
if (unprefixedChannel) {
tokens.add(unprefixedChannel);
}
}
return tokens;
}
function routeTokenMatchesSource(
token: string | undefined,
hookContext: CodexDynamicToolHookContext | undefined,
): boolean {
const normalized = normalizeRouteToken(token);
return normalized !== undefined && sourceRouteTokens(hookContext).has(normalized);
}
function routeProviderMatchesSource(
provider: string | undefined,
hookContext: CodexDynamicToolHookContext | undefined,
): boolean {
const normalized = normalizeRouteToken(provider);
if (!normalized) {
return false;
}
const currentProvider = normalizeRouteToken(hookContext?.currentChannelProvider);
const currentChannel = normalizeRouteToken(hookContext?.currentChannelId);
return currentProvider === normalized || currentChannel?.startsWith(`${normalized}:`) === true;
}
function routeTokenMatchesCurrentMessage(
token: string | number | undefined,
hookContext: CodexDynamicToolHookContext | undefined,
): boolean {
const normalized = normalizeRouteToken(token);
return (
normalized !== undefined && normalized === normalizeRouteToken(hookContext?.currentMessageId)
);
}
function readRouteToken(record: Record<string, unknown>, key: string): string | number | undefined {
const value = record[key];
return typeof value === "string" || typeof value === "number" ? value : undefined;
}
function explicitRouteTokensMismatchCurrent(
args: Record<string, unknown>,
keys: readonly string[],
currentToken: string | number | undefined,
): boolean {
const normalizedCurrent = normalizeRouteToken(currentToken);
if (!normalizedCurrent) {
return false;
}
return keys.some((key) => {
const normalized = normalizeRouteToken(readRouteToken(args, key));
return normalized !== undefined && normalized !== normalizedCurrent;
});
}
function explicitThreadRouteTargetsNonSource(
args: Record<string, unknown>,
hookContext: CodexDynamicToolHookContext | undefined,
messagingTarget: MessagingToolSend | undefined,
): boolean {
const normalizedCurrentThread = normalizeRouteToken(hookContext?.currentThreadId);
const explicitThreadTokens = [
...EXPLICIT_MESSAGE_THREAD_KEYS.map((key) => normalizeRouteToken(readRouteToken(args, key))),
normalizeRouteToken(messagingTarget?.threadId),
].filter((value): value is string => value !== undefined);
if (explicitThreadTokens.length === 0) {
return false;
}
return (
normalizedCurrentThread === undefined ||
explicitThreadTokens.some((value) => value !== normalizedCurrentThread)
);
}
function replyReceiptMatchesCurrentMessage(
value: unknown,
hookContext: CodexDynamicToolHookContext | undefined,
depth = 0,
): boolean {
if (depth > 4 || value === null) {
return false;
}
if (typeof value === "string") {
const trimmed = value.trim();
if (!trimmed || !["{", "["].includes(trimmed[0] ?? "")) {
return false;
}
try {
return replyReceiptMatchesCurrentMessage(JSON.parse(trimmed), hookContext, depth + 1);
} catch {
return false;
}
}
if (typeof value !== "object") {
return false;
}
if (Array.isArray(value)) {
return value.some((item) => replyReceiptMatchesCurrentMessage(item, hookContext, depth + 1));
}
const record = value as Record<string, unknown>;
for (const key of ["repliedTo", "replyTo", "replyToId", "replyToIdFull"]) {
if (
routeTokenMatchesCurrentMessage(
typeof record[key] === "string" ? record[key] : undefined,
hookContext,
)
) {
return true;
}
}
for (const key of [
"content",
"details",
"payload",
"receipt",
"result",
"results",
"sendResult",
"text",
]) {
if (replyReceiptMatchesCurrentMessage(record[key], hookContext, depth + 1)) {
return true;
}
}
return false;
}
function hasExplicitNonSourceMessageRoute(
args: Record<string, unknown>,
hookContext: CodexDynamicToolHookContext | undefined,
messagingTarget: MessagingToolSend | undefined,
): boolean {
const currentProvider = normalizeRouteToken(hookContext?.currentChannelProvider);
for (const key of EXPLICIT_MESSAGE_PROVIDER_KEYS) {
const provider = normalizeRouteToken(typeof args[key] === "string" ? args[key] : undefined);
if (
provider &&
currentProvider !== provider &&
!routeProviderMatchesSource(provider, hookContext)
) {
return true;
}
}
const targetValues = [
...EXPLICIT_MESSAGE_TARGET_KEYS.map((key) =>
typeof args[key] === "string" ? args[key] : undefined,
),
...(Array.isArray(args.targets)
? args.targets.map((value) => (typeof value === "string" ? value : undefined))
: []),
].filter((value): value is string => normalizeRouteToken(value) !== undefined);
if (explicitThreadRouteTargetsNonSource(args, hookContext, messagingTarget)) {
return true;
}
if (
explicitRouteTokensMismatchCurrent(
args,
EXPLICIT_MESSAGE_REPLY_KEYS,
hookContext?.currentMessageId,
)
) {
return true;
}
if (
messagingTarget?.to !== undefined &&
!routeTokenMatchesSource(messagingTarget.to, hookContext)
) {
return true;
}
if (messagingTarget?.to !== undefined) {
return false;
}
if (targetValues.length === 0) {
return false;
}
if (targetValues.some((value) => !routeTokenMatchesSource(value, hookContext))) {
return true;
}
return false;
}
/** Runtime bridge returned to Codex app-server attempt code. */
export type CodexDynamicToolBridge = {
availableSpecs: CodexDynamicToolSpec[];
@@ -337,7 +114,6 @@ export type CodexDynamicToolBridge = {
) => Promise<CodexDynamicToolCallResponse>;
telemetry: {
didSendViaMessagingTool: boolean;
didDeliverSourceReplyViaMessageTool: boolean;
messagingToolSentTexts: string[];
messagingToolSentMediaUrls: string[];
messagingToolSentTargets: MessagingToolSend[];
@@ -356,10 +132,6 @@ export const CODEX_OPENCLAW_DYNAMIC_TOOL_NAMESPACE = "openclaw";
// Keep OpenClaw session spawning searchable in Codex mode so Codex's native
// spawn_agent remains the primary Codex subagent surface.
const ALWAYS_DIRECT_DYNAMIC_TOOL_NAMES = new Set(["sessions_yield"]);
const EXPLICIT_MESSAGE_PROVIDER_KEYS = ["channel", "provider"];
const EXPLICIT_MESSAGE_TARGET_KEYS = ["target", "to", "channelId"];
const EXPLICIT_MESSAGE_THREAD_KEYS = ["threadId", "thread_id", "messageThreadId", "topicId"];
const EXPLICIT_MESSAGE_REPLY_KEYS = ["replyTo", "replyToId", "replyToIdFull"];
const DEFAULT_CODEX_DYNAMIC_TOOL_RESULT_MAX_CHARS = 16_000;
/**
@@ -404,7 +176,6 @@ export function createCodexDynamicToolBridge(params: {
emitQuarantinedDynamicToolDiagnostics(quarantinedTools, params.hookContext);
const telemetry: CodexDynamicToolBridge["telemetry"] = {
didSendViaMessagingTool: false,
didDeliverSourceReplyViaMessageTool: false,
messagingToolSentTexts: [],
messagingToolSentMediaUrls: [],
messagingToolSentTargets: [],
@@ -562,9 +333,10 @@ export function createCodexDynamicToolBridge(params: {
executedArgs,
params.hookContext?.currentChannelProvider,
);
const messagingTarget = isMessagingTool(toolName)
? extractMessagingToolSend(toolName, messagingTelemetryArgs, messagingContext)
: undefined;
const messagingTarget =
isMessagingTool(toolName) && isMessagingToolSendAction(toolName, executedArgs)
? extractMessagingToolSend(toolName, messagingTelemetryArgs, messagingContext)
: undefined;
const confirmedMessagingTarget =
!rawIsError && messagingTarget
? extractMessagingToolSendResult(messagingTarget, telemetryRawResult)
@@ -586,53 +358,12 @@ export function createCodexDynamicToolBridge(params: {
},
terminalType,
);
const blocksSourceReplyTermination = hasExplicitNonSourceMessageRoute(
executedArgs,
params.hookContext,
confirmedMessagingTarget,
);
const deliveredSourceReply = isDeliveredMessageToolOnlySourceReplyResult({
sourceReplyDeliveryMode: params.hookContext?.sourceReplyDeliveryMode,
toolName,
args: executedArgs,
result,
hookResult: rawResult,
isError: resultIsError,
allowExplicitSourceRoute: !blocksSourceReplyTermination,
});
const receiptConfirmedSourceReply =
params.hookContext?.sourceReplyDeliveryMode === "message_tool_only" &&
toolName === "message" &&
normalizeRouteToken(
typeof executedArgs.action === "string" ? executedArgs.action : undefined,
) === "reply" &&
!resultIsError &&
!blocksSourceReplyTermination &&
isDeliveredMessagingToolResult({
toolName,
args: executedArgs,
result,
hookResult: rawResult,
isError: resultIsError,
}) &&
(replyReceiptMatchesCurrentMessage(rawResult, params.hookContext) ||
replyReceiptMatchesCurrentMessage(result, params.hookContext));
const toolConfirmedSourceReply =
params.hookContext?.sourceReplyDeliveryMode === "message_tool_only" &&
toolName === "message" &&
!resultIsError &&
(rawResult.terminate === true || result.terminate === true);
if (deliveredSourceReply || receiptConfirmedSourceReply || toolConfirmedSourceReply) {
telemetry.didDeliverSourceReplyViaMessageTool = true;
}
withDynamicToolTermination(
response,
rawResult.terminate === true ||
result.terminate === true ||
isToolResultYield(rawResult) ||
isToolResultYield(result) ||
deliveredSourceReply ||
receiptConfirmedSourceReply,
isToolResultYield(result),
);
const asyncStarted =
isAsyncStartedToolResult(rawResult) || isAsyncStartedToolResult(result);
@@ -1070,22 +801,9 @@ function collectToolTelemetry(params: {
}
}
}
if (!isMessagingTool(params.toolName)) {
return;
}
const isMessagingSendAction = isMessagingToolSendAction(params.toolName, params.args);
if (!isMessagingSendAction && !params.messagingTarget) {
return;
}
if (
!isMessagingSendAction &&
!isDeliveredMessagingToolResult({
toolName: params.toolName,
args: params.args,
result: params.result,
hookResult: params.mediaTrustResult,
isError: params.isError,
})
!isMessagingTool(params.toolName) ||
!isMessagingToolSendAction(params.toolName, params.args)
) {
return;
}

View File

@@ -836,19 +836,6 @@ describe("CodexAppServerEventProjector", () => {
expect(result.toolMediaUrls).toStrictEqual([]);
});
it("propagates message-tool-only source reply delivery telemetry", async () => {
const projector = await createProjector();
const result = projector.buildResult({
...buildEmptyToolTelemetry(),
didSendViaMessagingTool: true,
didDeliverSourceReplyViaMessageTool: true,
});
expect(result.didSendViaMessagingTool).toBe(true);
expect(result.didDeliverSourceReplyViaMessageTool).toBe(true);
});
it("does not promote repeated tool progress text to the final assistant reply", async () => {
const onToolResult = vi.fn();
const projector = await createProjector({

View File

@@ -53,7 +53,6 @@ import { attachCodexMirrorIdentity, buildCodexUserPromptMessage } from "./transc
export type CodexAppServerToolTelemetry = {
didSendViaMessagingTool: boolean;
didDeliverSourceReplyViaMessageTool?: boolean;
messagingToolSentTexts: string[];
messagingToolSentMediaUrls: string[];
messagingToolSentTargets: MessagingToolSend[];
@@ -412,8 +411,6 @@ export class CodexAppServerEventProjector {
currentAttemptAssistant,
...(this.lastNativeToolError ? { lastToolError: this.lastNativeToolError } : {}),
didSendViaMessagingTool: toolTelemetry.didSendViaMessagingTool,
didDeliverSourceReplyViaMessageTool:
toolTelemetry.didDeliverSourceReplyViaMessageTool === true,
messagingToolSentTexts: toolTelemetry.messagingToolSentTexts,
messagingToolSentMediaUrls: toolTelemetry.messagingToolSentMediaUrls,
messagingToolSentTargets: toolTelemetry.messagingToolSentTargets,

View File

@@ -27,6 +27,8 @@ function managedCommandPath(root: string, platform: NodeJS.Platform): string {
return pathApi.join(root, "node_modules", ".bin", platform === "win32" ? "codex.cmd" : "codex");
}
const MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND = "/Applications/Codex.app/Contents/Resources/codex";
describe("managed Codex app-server binary", () => {
it("leaves explicit command overrides unchanged", async () => {
const explicitOptions = startOptions("config");
@@ -41,10 +43,14 @@ describe("managed Codex app-server binary", () => {
expect(pathExists).not.toHaveBeenCalled();
});
it("resolves the plugin-local bundled Codex binary", async () => {
it("prefers the macOS desktop app bundle when it exists", async () => {
const pluginRoot = path.join("/tmp", "openclaw", "extensions", "codex");
const paths = resolveManagedCodexAppServerPaths({ platform: "darwin", pluginRoot });
const pathExists = vi.fn(async (filePath: string) => filePath === paths.commandPath);
const pluginLocalCommand = managedCommandPath(pluginRoot, "darwin");
const pathExists = vi.fn(
async (filePath: string) =>
filePath === MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND || filePath === pluginLocalCommand,
);
await expect(
resolveManagedCodexAppServerStartOptions(startOptions("managed"), {
@@ -54,10 +60,31 @@ describe("managed Codex app-server binary", () => {
}),
).resolves.toEqual({
...startOptions("managed"),
command: paths.commandPath,
command: MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND,
commandSource: "resolved-managed",
managedFallbackCommandPaths: [pluginLocalCommand],
});
expect(paths.commandPath).toBe(MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND);
expect(paths.candidateCommandPaths).toContain(pluginLocalCommand);
});
it("falls back to the plugin-local bundled Codex binary on macOS", async () => {
const pluginRoot = path.join("/tmp", "openclaw", "extensions", "codex");
const pluginLocalCommand = managedCommandPath(pluginRoot, "darwin");
const pathExists = vi.fn(async (filePath: string) => filePath === pluginLocalCommand);
await expect(
resolveManagedCodexAppServerStartOptions(startOptions("managed"), {
platform: "darwin",
pluginRoot,
pathExists,
}),
).resolves.toEqual({
...startOptions("managed"),
command: pluginLocalCommand,
commandSource: "resolved-managed",
});
expect(paths.commandPath).toBe(managedCommandPath(pluginRoot, "darwin"));
expect(pathExists).toHaveBeenCalledWith(MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND, "darwin");
});
it("resolves Windows Codex command shims", () => {

View File

@@ -12,6 +12,7 @@ import { MANAGED_CODEX_APP_SERVER_PACKAGE } from "./version.js";
const CODEX_APP_SERVER_MODULE_DIR = path.dirname(fileURLToPath(import.meta.url));
const CODEX_PLUGIN_ROOT = resolveDefaultCodexPluginRoot(CODEX_APP_SERVER_MODULE_DIR);
const MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND = "/Applications/Codex.app/Contents/Resources/codex";
type ManagedCodexAppServerPaths = {
commandPath: string;
@@ -39,16 +40,19 @@ export async function resolveManagedCodexAppServerStartOptions(
pluginRoot: options.pluginRoot,
});
const pathExists = options.pathExists ?? commandPathExists;
const commandPath = await findManagedCodexAppServerCommandPath({
const commandPaths = await findManagedCodexAppServerCommandPaths({
candidateCommandPaths: paths.candidateCommandPaths,
pathExists,
platform,
});
const commandPath = commandPaths[0];
const managedFallbackCommandPaths = commandPaths.slice(1);
return {
...startOptions,
command: commandPath,
commandSource: "resolved-managed",
...(managedFallbackCommandPaths.length > 0 ? { managedFallbackCommandPaths } : {}),
};
}
@@ -77,12 +81,17 @@ function resolveManagedCodexAppServerCommandCandidates(
const roots = resolveManagedCodexAppServerCandidateRoots(pluginRoot, platform);
return [
...new Set([
...resolveDesktopCodexAppServerCommandCandidates(platform),
...roots.map((root) => pathApi.join(root, "node_modules", ".bin", commandName)),
...resolveManagedCodexPackageBinCandidates(roots, platform),
]),
];
}
function resolveDesktopCodexAppServerCommandCandidates(platform: NodeJS.Platform): string[] {
return platform === "darwin" ? [MACOS_DESKTOP_CODEX_APP_SERVER_COMMAND] : [];
}
function resolveDefaultCodexPluginRoot(moduleDir: string): string {
const moduleBaseName = path.basename(moduleDir);
if (moduleBaseName === "dist" || moduleBaseName === "dist-runtime") {
@@ -195,16 +204,20 @@ function pathForPlatform(platform: NodeJS.Platform): typeof path {
return platform === "win32" ? path.win32 : path.posix;
}
async function findManagedCodexAppServerCommandPath(params: {
async function findManagedCodexAppServerCommandPaths(params: {
candidateCommandPaths: readonly string[];
pathExists: (filePath: string, platform: NodeJS.Platform) => Promise<boolean>;
platform: NodeJS.Platform;
}): Promise<string> {
}): Promise<string[]> {
const commandPaths: string[] = [];
for (const commandPath of params.candidateCommandPaths) {
if (await params.pathExists(commandPath, params.platform)) {
return commandPath;
commandPaths.push(commandPath);
}
}
if (commandPaths.length > 0) {
return commandPaths;
}
throw new Error(
[

View File

@@ -841,11 +841,9 @@ export async function runCodexAppServerAttempt(
currentChannelProvider: resolveCodexMessageToolProvider(params),
currentChannelId: params.currentChannelId,
currentMessagingTarget: params.currentMessagingTarget,
currentMessageId: params.currentMessageId,
currentThreadId: params.currentThreadTs,
replyToMode: params.replyToMode,
hasRepliedRef: params.hasRepliedRef,
sourceReplyDeliveryMode: params.sourceReplyDeliveryMode,
onToolOutcome: onCodexToolOutcome,
allocateToolOutcomeOrdinal: allocateCodexToolOutcomeOrdinal,
},

View File

@@ -187,6 +187,41 @@ describe("shared Codex app-server client", () => {
startSpy.mockRestore();
});
it("falls back to the next managed app-server when desktop initialize is unsupported", async () => {
const desktop = createClientHarness();
const pluginLocal = createClientHarness();
const startSpy = vi
.spyOn(CodexAppServerClient, "start")
.mockReturnValueOnce(desktop.client)
.mockReturnValueOnce(pluginLocal.client);
mocks.resolveManagedCodexAppServerStartOptions.mockImplementationOnce(async (startOptions) => ({
...startOptions,
command: "/Applications/Codex.app/Contents/Resources/codex",
commandSource: "resolved-managed",
managedFallbackCommandPaths: ["/cache/openclaw/codex"],
}));
const listPromise = listCodexAppServerModels({ timeoutMs: 1000 });
await sendInitializeResult(desktop, "openclaw/0.124.9 (macOS; test)");
await sendInitializeResult(pluginLocal, "openclaw/0.125.0 (macOS; test)");
await sendEmptyModelList(pluginLocal);
await expect(listPromise).resolves.toEqual({ models: [] });
expect(desktop.process.stdin.destroyed).toBe(true);
expect(pluginLocal.process.stdin.destroyed).toBe(false);
expect(startSpy).toHaveBeenCalledTimes(2);
expect(startSpy.mock.calls[0]?.[0]).toMatchObject({
command: "/Applications/Codex.app/Contents/Resources/codex",
commandSource: "resolved-managed",
managedFallbackCommandPaths: ["/cache/openclaw/codex"],
});
expect(startSpy.mock.calls[1]?.[0]).toMatchObject({
command: "/cache/openclaw/codex",
commandSource: "resolved-managed",
});
expect(startSpy.mock.calls[1]?.[0]).not.toHaveProperty("managedFallbackCommandPaths");
});
it("closes and clears a shared app-server when initialize times out", async () => {
const first = createClientHarness();
const second = createClientHarness();

View File

@@ -11,7 +11,7 @@ import {
resolveCodexAppServerAuthProfileStore,
resolveCodexAppServerFallbackApiKeyCacheKey,
} from "./auth-bridge.js";
import { CodexAppServerClient } from "./client.js";
import { CodexAppServerClient, isUnsupportedCodexAppServerVersionError } from "./client.js";
import {
codexAppServerStartOptionsKey,
resolveCodexAppServerRuntimeOptions,
@@ -242,27 +242,23 @@ async function acquireSharedCodexAppServerClient(
const sharedPromise =
entry.promise ??
(entry.promise = (async () => {
const client = CodexAppServerClient.start(startOptions);
const client = await startInitializedCodexAppServerClient({
startOptions,
agentDir,
authProfileId: usesNativeAuth ? null : authProfileId,
config: options?.config,
onStartedClient: (startedClient) => {
entry.client = startedClient;
startedClient.setActiveSharedLeaseCountProviderForUnscopedNotifications(
() => entry.activeLeases,
);
options?.onStartedClient?.(startedClient);
},
});
entry.client = client;
options?.onStartedClient?.(client);
client.setActiveSharedLeaseCountProviderForUnscopedNotifications(() => entry.activeLeases);
client.addCloseHandler((closedClient) => clearSharedClientEntryIfCurrent(key, closedClient));
try {
await client.initialize();
await applyCodexAppServerAuthProfile({
client,
agentDir,
authProfileId: usesNativeAuth ? null : authProfileId,
startOptions,
config: options?.config,
});
return client;
} catch (error) {
// Startup failures happen before callers own the shared client, so close
// the child here instead of leaving a rejected daemon attached to stdio.
client.close();
throw error;
}
return client;
})());
try {
const client = await withTimeout(
@@ -291,39 +287,110 @@ export async function createIsolatedCodexAppServerClient(
): Promise<CodexAppServerClient> {
const { agentDir, usesNativeAuth, authProfileId, authProfileStore, startOptions } =
await resolveCodexAppServerClientStartContext(options);
const client = CodexAppServerClient.start(startOptions);
if (authProfileId) {
// Profile-backed Codex auth is ephemeral. Keep the host refresh callback
// available whether the profile came from a scoped store or persisted state.
client.addRequestHandler(async (request) => {
if (request.method !== "account/chatgptAuthTokens/refresh") {
return undefined;
return await startInitializedCodexAppServerClient({
startOptions,
agentDir,
authProfileId: usesNativeAuth ? null : authProfileId,
authProfileStore,
config: options?.config,
timeoutMs: options?.timeoutMs,
onStartedClient: options?.onStartedClient,
});
}
async function startInitializedCodexAppServerClient(params: {
startOptions: CodexAppServerStartOptions;
agentDir: string;
authProfileId: string | null | undefined;
authProfileStore?: AuthProfileStore;
config?: CodexAppServerClientOptions["config"];
timeoutMs?: number;
onStartedClient?: (client: CodexAppServerClient) => void;
}): Promise<CodexAppServerClient> {
const startOptionsCandidates = resolveManagedFallbackStartOptions(params.startOptions);
for (let index = 0; index < startOptionsCandidates.length; index += 1) {
const startOptions = startOptionsCandidates[index];
const client = CodexAppServerClient.start(startOptions);
params.onStartedClient?.(client);
const initialize = client.initialize();
try {
await withTimeout(initialize, params.timeoutMs ?? 0, "codex app-server initialize timed out");
} catch (error) {
client.close();
void initialize.catch(() => undefined);
if (shouldTryManagedFallbackStartOption(error, startOptions, index, startOptionsCandidates)) {
continue;
}
return await refreshCodexAppServerAuthTokens({
agentDir,
authProfileId,
...(authProfileStore ? { authProfileStore } : {}),
config: options?.config,
throw error;
}
if (params.authProfileId) {
// Profile-backed Codex auth is ephemeral. Keep the host refresh callback
// available whether the profile came from a scoped store or persisted state.
client.addRequestHandler(async (request) => {
if (request.method !== "account/chatgptAuthTokens/refresh") {
return undefined;
}
return await refreshCodexAppServerAuthTokens({
agentDir: params.agentDir,
authProfileId: params.authProfileId!,
...(params.authProfileStore ? { authProfileStore: params.authProfileStore } : {}),
config: params.config,
});
});
});
}
try {
await applyCodexAppServerAuthProfile({
client,
agentDir: params.agentDir,
authProfileId: params.authProfileId,
startOptions,
config: params.config,
...(params.authProfileStore ? { authProfileStore: params.authProfileStore } : {}),
});
return client;
} catch (error) {
client.close();
throw error;
}
}
const initialize = client.initialize();
try {
await withTimeout(initialize, options?.timeoutMs ?? 0, "codex app-server initialize timed out");
await applyCodexAppServerAuthProfile({
client,
agentDir,
authProfileId: usesNativeAuth ? null : authProfileId,
startOptions,
config: options?.config,
...(authProfileStore ? { authProfileStore } : {}),
});
return client;
} catch (error) {
client.close();
void initialize.catch(() => undefined);
throw error;
throw new Error("Managed Codex app-server fallback candidates were exhausted.");
}
function resolveManagedFallbackStartOptions(
startOptions: CodexAppServerStartOptions,
): CodexAppServerStartOptions[] {
const commands = [startOptions.command, ...(startOptions.managedFallbackCommandPaths ?? [])];
const candidates: CodexAppServerStartOptions[] = [];
for (let index = 0; index < commands.length; index += 1) {
const command = commands[index];
const managedFallbackCommandPaths = commands.slice(index + 1);
const candidate = {
...startOptions,
command,
};
if (managedFallbackCommandPaths.length === 0) {
delete candidate.managedFallbackCommandPaths;
} else {
candidate.managedFallbackCommandPaths = managedFallbackCommandPaths;
}
candidates.push(candidate);
}
return candidates;
}
function shouldTryManagedFallbackStartOption(
error: unknown,
startOptions: CodexAppServerStartOptions,
index: number,
startOptionsCandidates: readonly CodexAppServerStartOptions[],
): boolean {
return (
startOptions.commandSource === "resolved-managed" &&
index < startOptionsCandidates.length - 1 &&
isUnsupportedCodexAppServerVersionError(error)
);
}
/** Clears and closes all shared clients for deterministic tests. */

View File

@@ -172,6 +172,24 @@ describe("hydrateViewer", () => {
expect(document.documentElement.dataset.openclawDiffsError).toBeUndefined();
warn.mockRestore();
});
it("replaces stale controllers when hydrating the current cards again", async () => {
renderCard();
const { controllers, hydrateViewer } = await import("./viewer-client.js");
controllers.splice(0);
await hydrateViewer();
expect(controllers).toHaveLength(1);
const firstController = controllers[0];
document.body.innerHTML = "";
renderCard();
await hydrateViewer();
expect(controllers).toHaveLength(1);
expect(controllers[0]).not.toBe(firstController);
expect(fileDiffHydrateMock).toHaveBeenCalledTimes(2);
});
});
describe("viewerState initialization", () => {

View File

@@ -287,6 +287,9 @@ function syncAllControllers(): void {
}
export async function hydrateViewer(): Promise<void> {
// Rehydration replaces the current DOM card set; do not retain controllers
// from a previous render because they can keep stale DOM references alive.
controllers.length = 0;
const cards = await Promise.all(
getCards().map(async ({ host, payload }) => ({
host,

View File

@@ -345,7 +345,7 @@ describe("discordOutbound", () => {
2,
);
expect(messageOptions.accountId).toBe("default");
expect(messageOptions.replyTo).toBeUndefined();
expect(messageOptions.replyTo).toBe("reply-1");
const mediaCall = mockCall(hoisted.sendMessageDiscordMock, "sendMessageDiscord", 1);
expect(mediaCall[0]).toBe("channel:123456");
@@ -353,7 +353,7 @@ describe("discordOutbound", () => {
const mediaOptions = mockObjectArg(hoisted.sendMessageDiscordMock, "sendMessageDiscord", 1, 2);
expect(mediaOptions.accountId).toBe("default");
expect(mediaOptions.mediaUrl).toBe("https://example.com/extra.png");
expect(mediaOptions.replyTo).toBeUndefined();
expect(mediaOptions.replyTo).toBe("reply-1");
expect(result).toEqual({
channel: "discord",
messageId: "msg-1",
@@ -361,6 +361,31 @@ describe("discordOutbound", () => {
});
});
it("keeps captured replyTo on audioAsVoice sends when replyToMode is batched", async () => {
await discordOutbound.sendPayload?.({
cfg: {},
to: "channel:123456",
text: "",
payload: {
text: "voice note",
mediaUrls: ["https://example.com/voice.ogg", "https://example.com/extra.png"],
audioAsVoice: true,
},
accountId: "default",
replyToId: "reply-1",
replyToMode: "batched",
});
expect(
mockObjectArg(hoisted.sendVoiceMessageDiscordMock, "sendVoiceMessageDiscord", 0, 2).replyTo,
).toBe("reply-1");
expect(
hoisted.sendMessageDiscordMock.mock.calls.map(
(call) => (call[2] as { replyTo?: unknown } | undefined)?.replyTo,
),
).toEqual(["reply-1", "reply-1"]);
});
it("keeps replyToId on every internal audioAsVoice send when replyToMode is all", async () => {
await discordOutbound.sendPayload?.({
cfg: {},

View File

@@ -84,13 +84,15 @@ export async function sendDiscordOutboundPayload(params: {
const sendContext = await createDiscordPayloadSendContext(ctx);
if (payload.audioAsVoice && mediaUrls.length > 0) {
// audioAsVoice emits one logical Discord reply across voice/text/media sends.
// Capture before helper calls consume implicit single-use reply targets.
const voiceReplyTo = sendContext.resolveReplyTo();
let lastResult = await sendContext.withRetry(
async () =>
await sendContext.sendVoice(
sendContext.target,
mediaUrls[0],
resolveDiscordDeliveryOptions(ctx, sendContext),
),
await sendContext.sendVoice(sendContext.target, mediaUrls[0], {
...resolveDiscordDeliveryOptions(ctx, sendContext),
replyTo: voiceReplyTo,
}),
);
if (payload.text?.trim()) {
lastResult = await sendContext.withRetry(
@@ -98,6 +100,7 @@ export async function sendDiscordOutboundPayload(params: {
await sendContext.send(sendContext.target, payload.text, {
verbose: false,
...resolveDiscordFormattedDeliveryOptions(ctx, sendContext),
replyTo: voiceReplyTo,
}),
);
}
@@ -107,6 +110,7 @@ export async function sendDiscordOutboundPayload(params: {
await sendContext.send(sendContext.target, "", {
verbose: false,
...resolveDiscordMediaDeliveryOptions(ctx, sendContext, mediaUrl),
replyTo: voiceReplyTo,
}),
);
}

View File

@@ -55,20 +55,35 @@ describe("PDF document extractor", () => {
});
});
it("extracts text first and renders fallback images through clawpdf", async () => {
pdfDocument.extract.mockResolvedValueOnce({ text: "", images: [] }).mockResolvedValueOnce({
text: "",
images: [
{
type: "image",
bytes: Uint8Array.from(Buffer.from("png")),
mimeType: "image/png",
page: 1,
width: 10,
height: 10,
},
],
});
it("extracts text first and renders each fallback page with its own pixel budget", async () => {
pdfDocument.extract
.mockResolvedValueOnce({ text: "", images: [] })
.mockResolvedValueOnce({
text: "",
images: [
{
type: "image",
bytes: Uint8Array.from(Buffer.from("png1")),
mimeType: "image/png",
page: 1,
width: 5,
height: 10,
},
],
})
.mockResolvedValueOnce({
text: "",
images: [
{
type: "image",
bytes: Uint8Array.from(Buffer.from("png2")),
mimeType: "image/png",
page: 2,
width: 5,
height: 10,
},
],
});
const extractor = createPdfDocumentExtractor();
const result = await extractor.extract(request());
@@ -82,18 +97,24 @@ describe("PDF document extractor", () => {
maxPages: 2,
maxTextChars: 200_000,
});
// Each page renders in its own extract() call, with the aggregate pixel cap
// allocated across selected pages so later pages are not starved.
expect(pdfDocument.extract).toHaveBeenNthCalledWith(2, {
mode: "images",
maxPages: 2,
image: {
maxDimension: 10_000,
maxPixels: 100,
forms: true,
},
pages: [1],
image: { maxDimension: 10_000, maxPixels: 50, forms: true },
});
expect(pdfDocument.extract).toHaveBeenNthCalledWith(3, {
mode: "images",
pages: [2],
image: { maxDimension: 10_000, maxPixels: 50, forms: true },
});
expect(result).toEqual({
text: "",
images: [{ type: "image", data: "cG5n", mimeType: "image/png" }],
images: [
{ type: "image", data: "cG5nMQ==", mimeType: "image/png" },
{ type: "image", data: "cG5nMg==", mimeType: "image/png" },
],
});
expect(pdfDocument.destroy).toHaveBeenCalledTimes(1);
});
@@ -131,8 +152,9 @@ describe("PDF document extractor", () => {
expect(pdfDocument.destroy).not.toHaveBeenCalled();
});
it("filters selected pages before passing them to clawpdf", async () => {
it("filters selected pages and renders them one page per image call", async () => {
pdfDocument.extract
.mockResolvedValueOnce({ text: "", images: [] })
.mockResolvedValueOnce({ text: "", images: [] })
.mockResolvedValueOnce({ text: "", images: [] });
const extractor = createPdfDocumentExtractor();
@@ -141,11 +163,15 @@ describe("PDF document extractor", () => {
expect(pdfDocument.extract).toHaveBeenNthCalledWith(
1,
expect.objectContaining({ pages: [2, 1] }),
expect.objectContaining({ mode: "text", pages: [2, 1] }),
);
expect(pdfDocument.extract).toHaveBeenNthCalledWith(
2,
expect.objectContaining({ pages: [2, 1] }),
expect.objectContaining({ mode: "images", pages: [2] }),
);
expect(pdfDocument.extract).toHaveBeenNthCalledWith(
3,
expect.objectContaining({ mode: "images", pages: [1] }),
);
});

View File

@@ -83,17 +83,38 @@ async function extractPdfContent(
return { text, images: [] };
}
// clawpdf's image render budget (maxPixels) is shared across every page in one
// extract() call: the first page consumes it and later pages collapse to 1x1
// PNGs that vision models reject. Render each page separately, allocating the
// remaining aggregate budget across pages that still need rendering.
const imagePages =
pages ?? Array.from({ length: Math.min(pdf.pageCount, request.maxPages) }, (_, i) => i + 1);
try {
const imageResult = await pdf.extract({
mode: "images",
...pageSelection,
image: {
maxDimension: MAX_RENDER_DIMENSION,
maxPixels: request.maxPixels,
forms: true,
},
});
return { text, images: imageResult.images.map(toDocumentImage) };
const images: DocumentExtractedImage[] = [];
let remainingPixels = request.maxPixels;
for (let index = 0; index < imagePages.length; index += 1) {
if (remainingPixels <= 0) {
break;
}
const pagesRemaining = imagePages.length - index;
const maxPixelsPerPage = Math.max(1, Math.ceil(remainingPixels / pagesRemaining));
const pageNumber = imagePages[index];
const imageResult = await pdf.extract({
mode: "images",
pages: [pageNumber],
image: {
maxDimension: MAX_RENDER_DIMENSION,
maxPixels: maxPixelsPerPage,
forms: true,
},
});
for (const image of imageResult.images) {
images.push(toDocumentImage(image));
remainingPixels -= image.width * image.height;
}
}
return { text, images };
} catch (err) {
request.onImageExtractionError?.(err);
return { text, images: [] };

View File

@@ -1,7 +1,10 @@
// Elevenlabs provider module implements model/runtime integration.
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import { parseStrictFiniteNumber, parseStrictInteger } from "openclaw/plugin-sdk/number-runtime";
import { assertOkOrThrowProviderError } from "openclaw/plugin-sdk/provider-http";
import {
assertOkOrThrowProviderError,
readProviderJsonResponse,
} from "openclaw/plugin-sdk/provider-http";
import { normalizeResolvedSecretInputString } from "openclaw/plugin-sdk/secret-input";
import type {
SpeechDirectiveTokenParseContext,
@@ -367,14 +370,14 @@ async function listElevenLabsVoices(params: {
});
try {
await assertOkOrThrowProviderError(response, "ElevenLabs voices API error");
const json = (await response.json()) as {
const json = await readProviderJsonResponse<{
voices?: Array<{
voice_id?: string;
name?: string;
category?: string;
description?: string;
}>;
};
}>(response, "elevenlabs.voices");
return Array.isArray(json.voices)
? json.voices
.map((voice) => ({

View File

@@ -75,13 +75,16 @@ function mockDiscoveryResponse(spec: {
json?: unknown;
text?: string;
}) {
const status = spec.status ?? (spec.ok ? 200 : 500);
const response =
spec.json !== undefined
? new Response(JSON.stringify(spec.json), {
status,
headers: { "Content-Type": "application/json" },
})
: new Response(spec.text ?? "", { status });
fetchWithSsrFGuardMock.mockImplementationOnce(async () => ({
response: {
ok: spec.ok,
status: spec.status ?? (spec.ok ? 200 : 500),
json: async () => spec.json,
text: async () => spec.text ?? "",
},
response,
release: vi.fn(async () => {}),
}));
}
@@ -228,20 +231,16 @@ describe("githubCopilotMemoryEmbeddingProviderAdapter", () => {
it("wraps invalid discovery JSON as a setup error", async () => {
fetchWithSsrFGuardMock.mockImplementationOnce(async () => ({
response: {
ok: true,
response: new Response("not-valid-json{{{", {
status: 200,
json: async () => {
throw new SyntaxError("bad json");
},
text: async () => "",
},
headers: { "Content-Type": "application/json" },
}),
release: vi.fn(async () => {}),
}));
await expect(
githubCopilotMemoryEmbeddingProviderAdapter.create(defaultCreateOptions()),
).rejects.toThrow("GitHub Copilot model discovery returned invalid JSON");
).rejects.toThrow("github-copilot.model-discovery: malformed JSON response");
});
it("bounds model discovery error bodies", async () => {
@@ -360,7 +359,7 @@ describe("githubCopilotMemoryEmbeddingProviderAdapter", () => {
).toBe(true);
expect(
shouldContinueAutoSelection(
new Error("GitHub Copilot model discovery returned invalid JSON"),
new Error("github-copilot.model-discovery: malformed JSON response"),
),
).toBe(true);
expect(shouldContinueAutoSelection(new Error("Network timeout"))).toBe(false);

View File

@@ -7,7 +7,10 @@ import {
type MemoryEmbeddingProviderAdapter,
} from "openclaw/plugin-sdk/memory-core-host-engine-embeddings";
import { buildCopilotIdeHeaders } from "openclaw/plugin-sdk/provider-auth";
import { readResponseTextLimited } from "openclaw/plugin-sdk/provider-http";
import {
readProviderJsonResponse,
readResponseTextLimited,
} from "openclaw/plugin-sdk/provider-http";
import { resolveConfiguredSecretInputString } from "openclaw/plugin-sdk/secret-input-runtime";
import { fetchWithSsrFGuard, type SsrFPolicy } from "openclaw/plugin-sdk/ssrf-runtime";
import { resolveFirstGithubToken } from "./auth.js";
@@ -29,6 +32,7 @@ const COPILOT_HEADERS_STATIC: Record<string, string> = {
...buildCopilotIdeHeaders(),
};
const COPILOT_ERROR_BODY_LIMIT_BYTES = 8 * 1024;
const COPILOT_EMBEDDINGS_RESPONSE_MAX_BYTES = 64 * 1024 * 1024;
function buildSsrfPolicy(baseUrl: string): SsrFPolicy | undefined {
try {
@@ -70,6 +74,7 @@ function isCopilotSetupError(err: unknown): boolean {
err.message.includes("Copilot token response") ||
err.message.includes("No embedding models available") ||
err.message.includes("GitHub Copilot model discovery") ||
err.message.includes("github-copilot.model-discovery") ||
err.message.includes("GitHub Copilot embedding model") ||
err.message.includes("Unexpected response from GitHub Copilot token endpoint")
);
@@ -100,12 +105,7 @@ async function discoverEmbeddingModels(params: {
const detail = await readResponseTextLimited(response, COPILOT_ERROR_BODY_LIMIT_BYTES);
throw new Error(`GitHub Copilot model discovery HTTP ${response.status}: ${detail}`);
}
let payload: unknown;
try {
payload = await response.json();
} catch {
throw new Error("GitHub Copilot model discovery returned invalid JSON");
}
const payload = await readProviderJsonResponse(response, "github-copilot.model-discovery");
const allModels = Array.isArray((payload as { data?: unknown })?.data)
? ((payload as { data: CopilotModelEntry[] }).data ?? [])
: [];
@@ -246,12 +246,9 @@ async function createGitHubCopilotEmbeddingProvider(
throw new Error(`GitHub Copilot embeddings HTTP ${response.status}: ${detail}`);
}
let payload: unknown;
try {
payload = await response.json();
} catch {
throw new Error("GitHub Copilot embeddings returned invalid JSON");
}
const payload = await readProviderJsonResponse(response, "github-copilot.embeddings", {
maxBytes: COPILOT_EMBEDDINGS_RESPONSE_MAX_BYTES,
});
return parseGitHubCopilotEmbeddingPayload(payload, input.length);
},
});

View File

@@ -267,6 +267,47 @@ describe("fetchCopilotUsage", () => {
plan: "free",
});
});
it("bounds the usage read and cancels the stream when the body exceeds the JSON byte cap", async () => {
// Larger than the shared 16 MiB readProviderJsonResponse cap so the bounded reader cancels the
// stream mid-flight; if the cap were removed the unbounded res.json() would buffer the whole body.
const ONE_MIB = 1024 * 1024;
const TOTAL_CHUNKS = 32; // 32 MiB advertised body, double the cap.
const chunk = new Uint8Array(ONE_MIB);
let bytesPulled = 0;
let canceled = false;
const makeOversizedJsonResponse = (): Response => {
let pulled = 0;
const body = new ReadableStream<Uint8Array>({
pull(controller) {
if (pulled >= TOTAL_CHUNKS) {
controller.close();
return;
}
pulled += 1;
bytesPulled += chunk.length;
controller.enqueue(chunk);
},
cancel() {
canceled = true;
},
});
return new Response(body, {
status: 200,
headers: { "Content-Type": "application/json" },
});
};
const mockFetch = createProviderUsageFetch(async () => makeOversizedJsonResponse());
await expect(fetchCopilotUsage("token", 5000, mockFetch)).rejects.toThrow(
/github-copilot-usage: JSON response exceeds/,
);
// The bounded reader cancels the body and never pulls the full advertised 32 MiB stream.
expect(canceled).toBe(true);
expect(bytesPulled).toBeLessThan(TOTAL_CHUNKS * ONE_MIB);
});
});
describe("github-copilot token", () => {

View File

@@ -1,5 +1,6 @@
// Github Copilot plugin module implements usage behavior.
import { buildCopilotIdeHeaders } from "openclaw/plugin-sdk/provider-auth";
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
import {
buildUsageHttpErrorSnapshot,
fetchJson,
@@ -41,7 +42,10 @@ export async function fetchCopilotUsage(
});
}
const data = (await res.json()) as CopilotUsageResponse;
const data = await readProviderJsonResponse<CopilotUsageResponse>(
res,
"github-copilot-usage",
);
const windows: UsageWindow[] = [];
if (data.quota_snapshots?.premium_interactions) {

View File

@@ -94,6 +94,39 @@ function fetchInputUrl(fetchMock: ReturnType<typeof vi.fn>, index: number): stri
return input.url;
}
function oversizedJsonResponse(params: { chunkCount: number; chunkSize: number }): {
response: Response;
getReadCount: () => number;
wasCanceled: () => boolean;
} {
const chunk = new Uint8Array(params.chunkSize);
let readCount = 0;
let canceled = false;
return {
response: new Response(
new ReadableStream<Uint8Array>({
pull(controller) {
if (readCount >= params.chunkCount) {
controller.close();
return;
}
readCount += 1;
controller.enqueue(chunk);
},
cancel() {
canceled = true;
},
}),
{
status: 200,
headers: { "Content-Type": "application/json" },
},
),
getReadCount: () => readCount,
wasCanceled: () => canceled,
};
}
let ssrfMock: { mockRestore: () => void } | undefined;
describe("google video generation provider", () => {
@@ -486,6 +519,33 @@ describe("google video generation provider", () => {
expect(result.videos[0]?.buffer).toEqual(Buffer.from("rest-video"));
});
it("bounds successful Google REST operation JSON bodies instead of buffering the whole response", async () => {
vi.spyOn(providerAuthRuntime, "resolveApiKeyForProvider").mockResolvedValue({
apiKey: "google-key",
source: "env",
mode: "api-key",
});
generateVideosMock.mockRejectedValue(Object.assign(new Error("sdk 404"), { status: 404 }));
const streamed = oversizedJsonResponse({ chunkCount: 64, chunkSize: 1024 * 1024 });
const fetchMock = vi.fn(async () => streamed.response);
vi.stubGlobal("fetch", fetchMock);
const provider = buildGoogleVideoGenerationProvider();
await expect(
provider.generateVideo({
provider: "google",
model: "veo-3.1-fast-generate-preview",
prompt: "A tiny robot watering a windowsill garden",
cfg: {},
durationSeconds: 3,
}),
).rejects.toThrow("Google video operation response exceeds 16777216 bytes");
expect(fetchMock).toHaveBeenCalledTimes(1);
expect(streamed.getReadCount()).toBeLessThan(64);
expect(streamed.wasCanceled()).toBe(true);
});
it("retries transient Google REST poll failures with empty bodies", async () => {
vi.useFakeTimers();
vi.spyOn(providerAuthRuntime, "resolveApiKeyForProvider").mockResolvedValue({

View File

@@ -28,6 +28,7 @@ const DEFAULT_TIMEOUT_MS = 180_000;
const POLL_INTERVAL_MS = 10_000;
const MAX_POLL_ATTEMPTS = 120;
const DEFAULT_GENERATED_VIDEO_MAX_BYTES = 16 * 1024 * 1024;
const GOOGLE_VIDEO_OPERATION_RESPONSE_MAX_BYTES = 16 * 1024 * 1024;
const GOOGLE_VIDEO_EMPTY_RESULT_MESSAGE =
"Google video generation response missing generated videos";
@@ -349,7 +350,15 @@ async function requestGoogleVideoJson(params: {
signal: controller.signal,
});
try {
const text = await response.text();
const buffer = await readResponseWithLimit(
response,
GOOGLE_VIDEO_OPERATION_RESPONSE_MAX_BYTES,
{
onOverflow: ({ maxBytes }) =>
new Error(`Google video operation response exceeds ${maxBytes} bytes`),
},
);
const text = new TextDecoder().decode(buffer);
if (!response.ok) {
let detail: unknown = text;
if (text) {

View File

@@ -49,6 +49,15 @@ describe("sanitizeOutboundText", () => {
expect(result).not.toMatch(/^assistant:$/m);
});
it("preserves prose lines that merely end with 'user:'/'system:'", () => {
expect(sanitizeOutboundText("Please send this reply to the user:")).toBe(
"Please send this reply to the user:",
);
expect(sanitizeOutboundText("Here is a note for the system:")).toBe(
"Here is a note for the system:",
);
});
it("collapses excessive blank lines after stripping", () => {
const text = "Hello\n\n\n\n\nWorld";
expect(sanitizeOutboundText(text)).toBe("Hello\n\nWorld");

View File

@@ -7,7 +7,9 @@ import { stripAssistantInternalScaffolding } from "openclaw/plugin-sdk/text-chun
*/
const INTERNAL_SEPARATOR_RE = /(?:#\+){2,}#?/g;
const ASSISTANT_ROLE_MARKER_RE = /\bassistant\s+to\s*=\s*\w+/gi;
const ROLE_TURN_MARKER_RE = /\b(?:user|system|assistant)\s*:\s*$/gm;
// Only a standalone role marker on its own line (a leaked turn boundary) — not
// any line that merely ends with the word "user/system/assistant:" in prose.
const ROLE_TURN_MARKER_RE = /^[ \t]*(?:user|system|assistant)\s*:\s*$/gm;
/**
* Strip all assistant-internal scaffolding from outbound text before delivery.

View File

@@ -7,7 +7,10 @@ import {
generateSecMsGecToken,
} from "node-edge-tts/dist/drm.js";
import { isVoiceCompatibleAudio } from "openclaw/plugin-sdk/media-runtime";
import { assertOkOrThrowProviderError } from "openclaw/plugin-sdk/provider-http";
import {
assertOkOrThrowProviderError,
readProviderJsonResponse,
} from "openclaw/plugin-sdk/provider-http";
import {
captureHttpExchange,
isDebugProxyGlobalFetchPatchInstalled,
@@ -166,7 +169,10 @@ export async function listMicrosoftVoices(): Promise<SpeechVoiceOption[]> {
});
}
await assertOkOrThrowProviderError(response, "Microsoft voices API error");
const voices = (await response.json()) as MicrosoftVoiceListEntry[];
const voices = await readProviderJsonResponse<MicrosoftVoiceListEntry[]>(
response,
"microsoft.speech-voices",
);
return Array.isArray(voices)
? voices
.map((voice) => ({

View File

@@ -1,6 +1,9 @@
// Minimax plugin module implements tts behavior.
import { resolveTimerTimeoutMs } from "openclaw/plugin-sdk/number-runtime";
import { assertOkOrThrowProviderError } from "openclaw/plugin-sdk/provider-http";
import {
assertOkOrThrowProviderError,
readProviderJsonResponse,
} from "openclaw/plugin-sdk/provider-http";
import {
fetchWithSsrFGuard,
ssrfPolicyFromHttpBaseUrlAllowedHostname,
@@ -105,10 +108,10 @@ export async function minimaxTTS(params: {
try {
await assertOkOrThrowProviderError(response, "MiniMax TTS API error");
const body = (await response.json()) as {
const body = await readProviderJsonResponse<{
data?: { audio?: string };
base_resp?: { status_code?: number; status_msg?: string };
};
}>(response, "minimax.tts");
// Check base_resp for envelope errors (HTTP 200 with non-zero status_code).
// Other MiniMax providers (image, video, music, web-search) already check this.
@@ -119,9 +122,7 @@ export async function minimaxTTS(params: {
body.base_resp.status_code !== 0
) {
const msg = body.base_resp.status_msg ?? "unknown error";
throw new Error(
`MiniMax TTS API error (${body.base_resp.status_code}): ${msg}`,
);
throw new Error(`MiniMax TTS API error (${body.base_resp.status_code}): ${msg}`);
}
const hexAudio = body?.data?.audio;

View File

@@ -24,6 +24,8 @@ const { assertOkOrThrowHttpErrorMock, postJsonRequestMock, resolveProviderHttpRe
vi.mock("openclaw/plugin-sdk/provider-http", () => ({
assertOkOrThrowHttpError: assertOkOrThrowHttpErrorMock,
postJsonRequest: postJsonRequestMock,
// Pass-through: bounded-reader enforcement is tested via bounded-reader unit tests.
readProviderJsonResponse: async (response: { json(): Promise<unknown> }) => response.json(),
requireTranscriptionText: (value: string | undefined, message: string) => {
const text = value?.trim();
if (!text) {

View File

@@ -10,6 +10,7 @@ import {
import {
assertOkOrThrowHttpError,
postJsonRequest,
readProviderJsonResponse,
requireTranscriptionText,
resolveProviderHttpRequestConfig,
} from "openclaw/plugin-sdk/provider-http";
@@ -148,7 +149,10 @@ export async function transcribeOpenRouterAudio(
try {
await assertOkOrThrowHttpError(response, "OpenRouter audio transcription failed");
const payload = (await response.json()) as OpenRouterSttResponse;
const payload = await readProviderJsonResponse<OpenRouterSttResponse>(
response,
"openrouter.stt",
);
return {
text: requireTranscriptionText(
payload.text,

View File

@@ -54,13 +54,34 @@ vi.mock("openclaw/plugin-sdk/provider-http", async () => {
function releasedJson(value: unknown) {
return {
response: {
json: async () => value,
},
response: new Response(JSON.stringify(value), {
status: 200,
headers: { "content-type": "application/json" },
}),
release: vi.fn(async () => {}),
};
}
function releasedOversizedJsonStream() {
let canceled = false;
const stream = new ReadableStream<Uint8Array>({
start(controller) {
controller.enqueue(new Uint8Array(16 * 1024 * 1024 + 1));
},
cancel() {
canceled = true;
},
});
return {
response: new Response(stream, {
status: 200,
headers: { "content-type": "application/json" },
}),
release: vi.fn(async () => {}),
wasCanceled: () => canceled,
};
}
function releasedVideo(params: { contentType: string; bytes: string }) {
return {
response: new Response(Buffer.from(params.bytes), {
@@ -292,6 +313,40 @@ describe("openrouter video generation provider", () => {
});
});
it("cancels oversized OpenRouter video catalog success bodies", async () => {
const oversized = releasedOversizedJsonStream();
fetchWithTimeoutGuardedMock.mockResolvedValueOnce(oversized);
await expect(
listOpenRouterVideoModelCatalog({
config: {
models: {
providers: {
openrouter: {
baseUrl: "https://custom.openrouter.test/openrouter/api/v1",
},
},
},
} as never,
env: {},
resolveProviderApiKey: () => ({
apiKey: "OPENROUTER_API_KEY",
discoveryApiKey: "resolved-openrouter-key",
}),
resolveProviderAuth: () => ({
apiKey: "OPENROUTER_API_KEY",
discoveryApiKey: "resolved-openrouter-key",
mode: "api_key",
source: "env",
}),
}),
).rejects.toThrow(
"OpenRouter video models request failed: JSON response exceeds 16777216 bytes",
);
expect(oversized.wasCanceled()).toBe(true);
expect(oversized.release).toHaveBeenCalledOnce();
});
it("skips live OpenRouter video catalog discovery without an API key", async () => {
await expect(
listOpenRouterVideoModelCatalog({

View File

@@ -7,6 +7,7 @@ import { resolveApiKeyForProvider } from "openclaw/plugin-sdk/provider-auth-runt
import { getCachedLiveCatalogValue } from "openclaw/plugin-sdk/provider-catalog-shared";
import {
assertOkOrThrowHttpError,
readProviderJsonResponse,
resolveProviderHttpRequestConfig,
} from "openclaw/plugin-sdk/provider-http";
import {
@@ -234,7 +235,10 @@ async function fetchOpenRouterVideoModels(params: {
});
try {
await assertOkOrThrowHttpError(response, "OpenRouter video models request failed");
return (await response.json()) as OpenRouterVideoModelsResponse;
return await readProviderJsonResponse<OpenRouterVideoModelsResponse>(
response,
"OpenRouter video models request failed",
);
} finally {
await release();
}

View File

@@ -0,0 +1,167 @@
// Openshell tests cover backend-owned exec workdir validation behavior.
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import type { CreateSandboxBackendParams } from "openclaw/plugin-sdk/sandbox";
import {
createSandboxBrowserConfig,
createSandboxPruneConfig,
createSandboxSshConfig,
} from "openclaw/plugin-sdk/test-fixtures";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { createOpenShellSandboxBackendFactory } from "./backend.js";
import { resolveOpenShellPluginConfig } from "./config.js";
const sdkMocks = vi.hoisted(() => ({
runSshSandboxCommand: vi.fn(),
disposeSshSandboxSession: vi.fn(),
}));
const cliMocks = vi.hoisted(() => ({
runOpenShellCli: vi.fn(),
createOpenShellSshSession: vi.fn(),
}));
vi.mock("openclaw/plugin-sdk/sandbox", async (importOriginal) => {
const actual = await importOriginal<typeof import("openclaw/plugin-sdk/sandbox")>();
return {
...actual,
runSshSandboxCommand: sdkMocks.runSshSandboxCommand,
disposeSshSandboxSession: sdkMocks.disposeSshSandboxSession,
};
});
vi.mock("./cli.js", async (importOriginal) => {
const actual = await importOriginal<typeof import("./cli.js")>();
return {
...actual,
runOpenShellCli: cliMocks.runOpenShellCli,
createOpenShellSshSession: cliMocks.createOpenShellSshSession,
};
});
const tempDirs: string[] = [];
function createOpenShellBackendSandboxConfig(): CreateSandboxBackendParams["cfg"] {
return {
mode: "all",
backend: "openshell",
scope: "session",
workspaceAccess: "rw",
workspaceRoot: "/tmp/openclaw-sandboxes",
docker: {
image: "openclaw-sandbox:bookworm-slim",
containerPrefix: "openclaw-sbx-",
workdir: "/workspace",
readOnlyRoot: false,
tmpfs: [],
network: "none",
capDrop: [],
binds: [],
env: {},
},
ssh: createSandboxSshConfig("/tmp/openclaw-sandboxes"),
browser: createSandboxBrowserConfig(),
tools: { allow: ["*"], deny: [] },
prune: createSandboxPruneConfig(),
};
}
async function makeTempDir(prefix: string) {
const dir = await fs.mkdtemp(path.join(os.tmpdir(), prefix));
tempDirs.push(dir);
return dir;
}
describe("openshell backend exec workdir validation", () => {
beforeEach(() => {
vi.clearAllMocks();
cliMocks.createOpenShellSshSession.mockResolvedValue({
command: "ssh",
configPath: "/tmp/openclaw-openshell-test-ssh-config",
host: "openshell-test",
});
cliMocks.runOpenShellCli.mockResolvedValue({
code: 0,
stdout: "",
stderr: "",
});
sdkMocks.runSshSandboxCommand.mockImplementation(async ({ remoteCommand }) => ({
stdout: String(remoteCommand).includes("openclaw-validate-workdir")
? Buffer.from("/workspace\n")
: Buffer.alloc(0),
stderr: Buffer.alloc(0),
code: 0,
}));
});
afterEach(async () => {
await Promise.all(
tempDirs.splice(0).map((dir) => fs.rm(dir, { recursive: true, force: true })),
);
});
it("reuses validation-time workspace preparation for the following exec", async () => {
const workspaceDir = await makeTempDir("openclaw-openshell-workspace-");
await fs.writeFile(path.join(workspaceDir, "seed.txt"), "seed", "utf8");
const backendFactory = createOpenShellSandboxBackendFactory({
pluginConfig: resolveOpenShellPluginConfig({
command: "openshell",
mode: "mirror",
}),
});
const backend = await backendFactory({
sessionKey: "agent:main:turn",
scopeKey: "agent:main",
workspaceDir,
agentWorkspaceDir: workspaceDir,
cfg: createOpenShellBackendSandboxConfig(),
});
await expect(backend.validateWorkdir?.("/workspace")).resolves.toBe("/workspace");
const execSpec = await backend.buildExecSpec({
command: "pwd",
workdir: "/workspace",
env: {},
usePty: false,
});
const uploadCalls = cliMocks.runOpenShellCli.mock.calls.filter(
([params]) => params.args[0] === "sandbox" && params.args[1] === "upload",
);
expect(uploadCalls).toHaveLength(1);
expect(execSpec.argv).toContain("openshell-test");
});
it("does not reuse validation-time workspace preparation after discard", async () => {
const workspaceDir = await makeTempDir("openclaw-openshell-workspace-");
await fs.writeFile(path.join(workspaceDir, "seed.txt"), "seed", "utf8");
const backendFactory = createOpenShellSandboxBackendFactory({
pluginConfig: resolveOpenShellPluginConfig({
command: "openshell",
mode: "mirror",
}),
});
const backend = await backendFactory({
sessionKey: "agent:main:turn",
scopeKey: "agent:main",
workspaceDir,
agentWorkspaceDir: workspaceDir,
cfg: createOpenShellBackendSandboxConfig(),
});
await expect(backend.validateWorkdir?.("/workspace")).resolves.toBe("/workspace");
backend.discardPreparedWorkdir?.("/workspace");
await backend.buildExecSpec({
command: "pwd",
workdir: "/workspace",
env: {},
usePty: false,
});
const uploadCalls = cliMocks.runOpenShellCli.mock.calls.filter(
([params]) => params.args[0] === "sandbox" && params.args[1] === "upload",
);
expect(uploadCalls).toHaveLength(2);
});
});

View File

@@ -22,6 +22,7 @@ import { normalizeLowercaseStringOrEmpty } from "openclaw/plugin-sdk/string-coer
import type { OpenShellSandboxBackend } from "./backend.types.js";
import {
buildValidatedExecRemoteCommand,
buildRemoteWorkdirValidationCommand,
buildRemoteCommand,
createOpenShellSshSession,
runOpenShellCli,
@@ -280,6 +281,13 @@ async function createOpenShellSandboxBackend(params: {
mode: params.pluginConfig.mode,
configLabel: params.pluginConfig.from,
configLabelKind: "Source",
workdirValidation: "backend",
validateWorkdir: async (workdir) => await impl.validateWorkdir(workdir),
discardPreparedWorkdir: (workdir) => impl.discardPreparedWorkdir(workdir),
workdirRoots: [
params.pluginConfig.remoteWorkspaceDir,
params.pluginConfig.remoteAgentWorkspaceDir,
],
buildExecSpec: async ({ command, workdir, env, usePty }) => {
const pending = await impl.prepareExec({ command, workdir, env, usePty });
return {
@@ -318,6 +326,10 @@ async function createOpenShellSandboxBackend(params: {
class OpenShellSandboxBackendImpl {
private ensurePromise: Promise<void> | null = null;
private preparedRemoteWorkspaceForNextExec: {
workdir: string;
promise: Promise<void>;
} | null = null;
private remoteSeedPending = false;
constructor(
@@ -339,6 +351,10 @@ class OpenShellSandboxBackendImpl {
mode: this.params.execContext.config.mode,
configLabel: this.params.execContext.config.from,
configLabelKind: "Source",
workdirValidation: "backend",
validateWorkdir: async (workdir) => await this.validateWorkdir(workdir),
discardPreparedWorkdir: (workdir) => this.discardPreparedWorkdir(workdir),
workdirRoots: [this.params.remoteWorkspaceDir, this.params.remoteAgentWorkspaceDir],
remoteWorkspaceDir: this.params.remoteWorkspaceDir,
remoteAgentWorkspaceDir: this.params.remoteAgentWorkspaceDir,
buildExecSpec: async ({ command, workdir, env, usePty }) => {
@@ -382,20 +398,14 @@ class OpenShellSandboxBackendImpl {
env: Record<string, string>;
usePty: boolean;
}): Promise<{ argv: string[]; token: PendingExec }> {
const remoteWorkdir = params.workdir ?? this.params.remoteWorkspaceDir;
const preparedWorkspace = this.consumePreparedRemoteWorkspaceForNextExec(remoteWorkdir);
const remoteCommand = buildValidatedExecRemoteCommand({
command: params.command,
workdir: params.workdir ?? this.params.remoteWorkspaceDir,
workdir: remoteWorkdir,
env: params.env,
});
await this.ensureSandboxExists();
if (this.params.execContext.config.mode === "mirror") {
await this.syncWorkspaceToRemote();
} else {
const seeded = await this.maybeSeedRemoteWorkspace();
if (!seeded) {
await this.syncSkillsWorkspaceToRemote();
}
}
await (preparedWorkspace ?? this.prepareRemoteWorkspaceForExec());
const sshSession = await createOpenShellSshSession({
context: this.params.execContext,
});
@@ -414,6 +424,85 @@ class OpenShellSandboxBackendImpl {
};
}
async validateWorkdir(workdir: string): Promise<string | null> {
const preparedWorkspace = this.prepareRemoteWorkspaceForExec();
const reusablePreparation = { workdir, promise: preparedWorkspace };
this.preparedRemoteWorkspaceForNextExec = reusablePreparation;
try {
await preparedWorkspace;
const sshSession = await createOpenShellSshSession({
context: this.params.execContext,
});
try {
const result = await runSshSandboxCommand({
session: sshSession,
remoteCommand: buildRemoteWorkdirValidationCommand({
workdir,
root: this.resolveWorkdirValidationRoot(workdir),
}),
allowFailure: true,
});
const resolvedWorkdir = result.code === 0 ? result.stdout.toString("utf8").trim() : "";
if (this.preparedRemoteWorkspaceForNextExec === reusablePreparation) {
this.preparedRemoteWorkspaceForNextExec = resolvedWorkdir
? { workdir: resolvedWorkdir, promise: preparedWorkspace }
: null;
}
return resolvedWorkdir || null;
} finally {
await disposeSshSandboxSession(sshSession);
}
} catch (error) {
if (this.preparedRemoteWorkspaceForNextExec === reusablePreparation) {
this.preparedRemoteWorkspaceForNextExec = null;
}
throw error;
}
}
private resolveWorkdirValidationRoot(workdir: string): string {
try {
const normalized = normalizeRemotePath(workdir);
const roots = [
normalizeRemotePath(this.params.remoteAgentWorkspaceDir),
normalizeRemotePath(this.params.remoteWorkspaceDir),
].toSorted((a, b) => b.length - a.length);
return (
roots.find((root) => isRemotePathInside(root, normalized)) ?? this.params.remoteWorkspaceDir
);
} catch {
return this.params.remoteWorkspaceDir;
}
}
private consumePreparedRemoteWorkspaceForNextExec(workdir: string): Promise<void> | null {
const preparedWorkspace = this.preparedRemoteWorkspaceForNextExec;
if (!preparedWorkspace || preparedWorkspace.workdir !== workdir) {
this.preparedRemoteWorkspaceForNextExec = null;
return null;
}
this.preparedRemoteWorkspaceForNextExec = null;
return preparedWorkspace.promise;
}
discardPreparedWorkdir(workdir: string): void {
if (this.preparedRemoteWorkspaceForNextExec?.workdir === workdir) {
this.preparedRemoteWorkspaceForNextExec = null;
}
}
private async prepareRemoteWorkspaceForExec(): Promise<void> {
await this.ensureSandboxExists();
if (this.params.execContext.config.mode === "mirror") {
await this.syncWorkspaceToRemote();
return;
}
const seeded = await this.maybeSeedRemoteWorkspace();
if (!seeded) {
await this.syncSkillsWorkspaceToRemote();
}
}
async finalizeExec(token?: PendingExec): Promise<void> {
try {
if (this.params.execContext.config.mode === "mirror") {

View File

@@ -9,6 +9,7 @@ import type { ResolvedOpenShellPluginConfig } from "./config.js";
export {
buildExecRemoteCommand,
buildRemoteWorkdirValidationCommand,
buildValidatedExecRemoteCommand,
shellEscape,
} from "openclaw/plugin-sdk/sandbox";

View File

@@ -168,23 +168,42 @@ describe("runtime parity", () => {
const scoped = __testing.filterMockRequestsForParentPrompt(
[
{
prompt: "Fanout worker alpha: inspect the QA workspace and finish with exactly ALPHA-OK.",
allInputText:
"Delegate one bounded QA task to a subagent. Fanout worker alpha: inspect the QA workspace and finish with exactly ALPHA-OK.",
plannedToolName: "read",
},
{
prompt: "Delegate one bounded QA task to a subagent.",
allInputText: "Delegate one bounded QA task to a subagent.",
plannedToolName: "sessions_spawn",
},
{
prompt: "Continue the bounded QA task with the retained child result.",
allInputText:
"Delegate one bounded QA task to a subagent. Continue the bounded QA task with the retained child result.",
plannedToolName: "sessions_spawn",
},
{
allInputText: "Inspect the QA workspace and return one concise protocol note.",
plannedToolName: "read",
},
{
prompt: "Delegate one bounded QA task to a subagent.",
allInputText: "Delegate one bounded QA task to a subagent. Tool result: child accepted.",
toolOutput: "child accepted",
},
],
"Delegate one bounded QA task to a subagent.",
[
"Delegate one bounded QA task to a subagent.",
"Continue the bounded QA task with the retained child result.",
],
);
expect(scoped).toHaveLength(2);
expect(scoped).toHaveLength(3);
expect(scoped.map((request) => request.plannedToolName ?? "result")).toEqual([
"sessions_spawn",
"sessions_spawn",
"result",
]);

View File

@@ -120,6 +120,7 @@ type RuntimeParityTranscriptRecord = {
};
type RuntimeParityMockRequestSnapshot = {
prompt?: string;
allInputText?: string;
plannedToolName?: string;
plannedToolArgs?: unknown;
@@ -759,14 +760,22 @@ function resolveRuntimeParityToolCalls(params: {
function filterMockRequestsForParentPrompt(
requests: RuntimeParityMockRequestSnapshot[],
parentPrompt: string,
parentPrompts: readonly string[] = [parentPrompt],
) {
const normalizedParentPrompt = normalizeTextForParity(parentPrompt);
if (!normalizedParentPrompt) {
const normalizedParentPrompts = parentPrompts
.map(normalizeTextForParity)
.filter((prompt) => prompt.length > 0);
if (normalizedParentPrompts.length === 0) {
return requests;
}
const matching = requests.filter((request) =>
normalizeTextForParity(request.allInputText ?? "").includes(normalizedParentPrompt),
);
const matching = requests.filter((request) => {
const normalizedPrompt = normalizeTextForParity(request.prompt ?? "");
if (normalizedPrompt) {
return normalizedParentPrompts.some((prompt) => normalizedPrompt.includes(prompt));
}
const normalizedHistory = normalizeTextForParity(request.allInputText ?? "");
return normalizedParentPrompts.some((prompt) => normalizedHistory.includes(prompt));
});
return matching.length > 0 ? matching : requests;
}
@@ -966,6 +975,7 @@ async function loadRuntimeParityTranscripts(params: {
async function loadRuntimeParityMockToolCalls(
mockBaseUrl: string | undefined,
parentPrompt: string,
parentPrompts: readonly string[] = [parentPrompt],
): Promise<RuntimeParityToolCall[] | null> {
const normalizedBaseUrl = mockBaseUrl?.trim().replace(/\/+$/u, "");
if (!normalizedBaseUrl) {
@@ -991,6 +1001,7 @@ async function loadRuntimeParityMockToolCalls(
}
const requests = payload.filter(isMessageRecord).map(
(entry): RuntimeParityMockRequestSnapshot => ({
prompt: readNonEmptyString(entry.prompt),
allInputText: readNonEmptyString(entry.allInputText),
plannedToolName: readNonEmptyString(entry.plannedToolName),
plannedToolArgs: entry.plannedToolArgs ?? null,
@@ -998,7 +1009,7 @@ async function loadRuntimeParityMockToolCalls(
}),
);
return resolveToolCallOrderFromMockRequests(
filterMockRequestsForParentPrompt(requests, parentPrompt),
filterMockRequestsForParentPrompt(requests, parentPrompt, parentPrompts),
);
} catch {
return null;
@@ -1015,12 +1026,16 @@ export async function captureRuntimeParityCell(
});
const transcriptRecords = buildTranscriptRecords(transcriptBytes);
const transcriptToolCalls = resolveToolCallOrder(transcriptRecords);
const parentPrompt =
transcriptRecords
.filter((record) => record.role === "user" && !isToolResultLikeMessage(record.message))
.map((record) => extractAssistantText(record.message))
.find(Boolean) ?? "";
const mockToolCalls = await loadRuntimeParityMockToolCalls(params.mockBaseUrl, parentPrompt);
const parentPrompts = transcriptRecords
.filter((record) => record.role === "user")
.map((record) => extractAssistantText(record.message))
.filter((prompt) => prompt.length > 0);
const parentPrompt = parentPrompts[0] ?? "";
const mockToolCalls = await loadRuntimeParityMockToolCalls(
params.mockBaseUrl,
parentPrompt,
parentPrompts,
);
const gatewayLogs = params.gateway.logs?.();
const sentinelFindings = [
...scanGatewayLogSentinels(gatewayLogs),

View File

@@ -8,6 +8,39 @@ import { describeQwenVideo } from "./media-understanding-provider.js";
installPinnedHostnameTestHooks();
function oversizedJsonResponse(params: { chunkCount: number; chunkSize: number }): {
response: Response;
getReadCount: () => number;
wasCanceled: () => boolean;
} {
const chunk = new Uint8Array(params.chunkSize);
let readCount = 0;
let canceled = false;
return {
response: new Response(
new ReadableStream<Uint8Array>({
pull(controller) {
if (readCount >= params.chunkCount) {
controller.close();
return;
}
readCount += 1;
controller.enqueue(chunk);
},
cancel() {
canceled = true;
},
}),
{
status: 200,
headers: { "Content-Type": "application/json" },
},
),
getReadCount: () => readCount,
wasCanceled: () => canceled,
};
}
describe("describeQwenVideo", () => {
it("builds the expected OpenAI-compatible video payload", async () => {
const { fetchFn, getRequest } = createRequestCaptureJsonFetch({
@@ -74,4 +107,42 @@ describe("describeQwenVideo", () => {
`data:video/mp4;base64,${Buffer.from("video-bytes").toString("base64")}`,
);
});
it("bounds successful Qwen video JSON bodies instead of buffering the whole response", async () => {
const streamed = oversizedJsonResponse({ chunkCount: 64, chunkSize: 1024 * 1024 });
await expect(
describeQwenVideo({
buffer: Buffer.from("video-bytes"),
fileName: "clip.mp4",
mime: "video/mp4",
apiKey: "test-key",
timeoutMs: 1500,
baseUrl: "https://example.com/v1",
fetchFn: async () => streamed.response,
}),
).rejects.toThrow("Qwen video description failed: JSON response exceeds 16777216 bytes");
expect(streamed.getReadCount()).toBeLessThan(64);
expect(streamed.wasCanceled()).toBe(true);
});
it("reports malformed Qwen video JSON with a provider-owned error", async () => {
const response = new Response("not-json{", {
status: 200,
headers: { "Content-Type": "application/json" },
});
await expect(
describeQwenVideo({
buffer: Buffer.from("video-bytes"),
fileName: "clip.mp4",
mime: "video/mp4",
apiKey: "test-key",
timeoutMs: 1500,
baseUrl: "https://example.com/v1",
fetchFn: async () => response,
}),
).rejects.toThrow("Qwen video description failed: malformed JSON response");
});
});

View File

@@ -13,6 +13,7 @@ import {
import {
assertOkOrThrowHttpError,
postJsonRequest,
readProviderJsonResponse,
resolveProviderHttpRequestConfig,
} from "openclaw/plugin-sdk/provider-http";
import { QWEN_STANDARD_GLOBAL_BASE_URL } from "./models.js";
@@ -60,7 +61,14 @@ export async function describeQwenVideo(
try {
await assertOkOrThrowHttpError(res, "Qwen video description failed");
const payload = (await res.json()) as OpenAiCompatibleVideoPayload;
// Read the success body through the shared byte-bounded JSON reader (16 MiB cap +
// stream cancel on overflow) so a hostile or buggy endpoint cannot force the runtime
// to buffer an unbounded body. Malformed JSON keeps the
// `Qwen video description failed: malformed JSON response` wrapping.
const payload = await readProviderJsonResponse<OpenAiCompatibleVideoPayload>(
res,
"Qwen video description failed",
);
const text = coerceOpenAiCompatibleVideoText(payload);
if (!text) {
throw new Error("Qwen video description response missing content");

View File

@@ -833,6 +833,7 @@ export const telegramPlugin = createChatChannelPlugin({
targetResolver: {
looksLikeId: looksLikeTelegramTargetId,
hint: "<chatId>",
reservedLiterals: ["current", "self", "this", "me"],
},
},
resolver: {

View File

@@ -807,16 +807,16 @@ describe("createTelegramDraftStream", () => {
expectNthPreviewSend(api, 2, "foo bar baz qux");
});
it("clamps a first oversized non-final preview", async () => {
it("clamps a first oversized non-final preview on a UTF-16 boundary", async () => {
const api = createMockDraftApi();
const stream = createDraftStream(api, { maxChars: 10 });
stream.update("1234567890ABCDEFGHIJ");
stream.update("123456789😀tail");
await stream.flush();
expect(api.sendMessage).toHaveBeenCalledTimes(1);
expectNthPreviewSend(api, 1, "1234567890");
expect(stream.lastDeliveredText?.()).toBe("1234567890");
expectNthPreviewSend(api, 1, "123456789");
expect(stream.lastDeliveredText?.()).toBe("123456789");
});
it("finalizes overflow that was hidden by a clamped non-final preview", async () => {

View File

@@ -5,6 +5,7 @@ import {
takeMessageIdAfterStop,
} from "openclaw/plugin-sdk/channel-outbound";
import { formatErrorMessage } from "openclaw/plugin-sdk/error-runtime";
import { sliceUtf16Safe } from "openclaw/plugin-sdk/text-utility-runtime";
import { buildTelegramThreadParams, type TelegramThreadSpec } from "./bot/helpers.js";
import { renderTelegramHtmlText, telegramHtmlToPlainTextFallback } from "./format.js";
import {
@@ -169,7 +170,7 @@ function findTelegramDraftChunkLength(
high = mid - 1;
}
}
return best;
return sliceUtf16Safe(text, 0, best).length;
}
export function createTelegramDraftStream(params: {

View File

@@ -418,7 +418,7 @@ type TestTelegramUpdate = {
update_id: number;
message: {
text: string;
chat: { id: number; type: "supergroup" };
chat: { id: number; type: "private" | "supergroup" };
message_thread_id?: number;
is_topic_message?: boolean;
};
@@ -436,6 +436,16 @@ function topicUpdate(updateId: number, threadId: number, text: string): TestTele
};
}
function directUpdate(updateId: number, chatId: number, text: string): TestTelegramUpdate {
return {
update_id: updateId,
message: {
text,
chat: { id: chatId, type: "private" },
},
};
}
async function waitForAbortSignal(signal: AbortSignal): Promise<void> {
if (signal.aborted) {
return;
@@ -1795,6 +1805,93 @@ describe("TelegramPollingSession", () => {
});
});
for (const scenario of [
{
name: "topic",
conflict: topicUpdate(42, 10, "retryable session init conflict"),
blocked: topicUpdate(43, 10, "same topic must wait behind retry backoff"),
other: topicUpdate(44, 11, "other topic can continue"),
conflictEvent: "topic10:conflict",
blockedEvent: "topic10:overtook",
otherEvent: "topic11",
error: "reply session initialization conflicted for agent:main:telegram:group:-100:topic:10",
},
{
name: "direct message",
conflict: directUpdate(42, 100, "retryable session init conflict"),
blocked: directUpdate(43, 100, "same DM must wait behind retry backoff"),
other: directUpdate(44, 101, "other DM can continue"),
conflictEvent: "dm100:conflict",
blockedEvent: "dm100:overtook",
otherEvent: "dm101",
error: "reply session initialization conflicted for agent:main:telegram:direct:100",
},
]) {
it(`backs off retryable reply session init conflicts for ${scenario.name} lanes`, async () => {
vi.useFakeTimers({ shouldAdvanceTime: true });
try {
await withTempSpool(async (tempDir) => {
const abort = new AbortController();
const log = vi.fn();
let attempts = 0;
const events: string[] = [];
await writeSpooledTestUpdates(tempDir, [
scenario.conflict,
scenario.blocked,
scenario.other,
]);
const { runPromise, stopWorker } = startIsolatedIngressSession({
abort,
spoolDir: tempDir,
log,
drainIntervalMs: 100,
handleUpdate: async (update) => {
if (update.update_id === scenario.conflict.update_id) {
attempts += 1;
events.push(`${scenario.conflictEvent}:${attempts}`);
throw new Error(scenario.error);
}
if (update.update_id === scenario.blocked.update_id) {
events.push(scenario.blockedEvent);
return;
}
if (update.update_id === scenario.other.update_id) {
events.push(scenario.otherEvent);
}
},
});
await vi.waitFor(() => expect(attempts).toBe(1));
await vi.advanceTimersByTimeAsync(1_000);
expect(attempts).toBe(1);
await vi.waitFor(() =>
expect(events).toEqual([`${scenario.conflictEvent}:1`, scenario.otherEvent]),
);
expect(await pendingUpdateIds(tempDir, "all")).toEqual([
scenario.conflict.update_id,
scenario.blocked.update_id,
]);
expect(await failedUpdateIds(tempDir)).toEqual([]);
await vi.advanceTimersByTimeAsync(4_500);
await vi.waitFor(() => expect(attempts).toBe(2));
expect(events).not.toContain(scenario.blockedEvent);
expectLogIncludes(
log,
`spooled update ${scenario.conflict.update_id} failed; keeping for retry`,
);
abort.abort();
stopWorker();
await runPromise;
});
} finally {
vi.useRealTimers();
}
});
}
it("dead-letters wrapped missing harness failures", async () => {
await withTempSpool(async (tempDir) => {
const abort = new AbortController();

View File

@@ -131,11 +131,14 @@ const TELEGRAM_SPOOLED_HANDLER_ABORT_GRACE_MS = 5_000;
const TELEGRAM_SPOOLED_HANDLER_TIMEOUT_ENV = "OPENCLAW_TELEGRAM_SPOOLED_HANDLER_TIMEOUT_MS";
const TELEGRAM_SPOOLED_DRAIN_START_LIMIT = 100;
const TELEGRAM_SPOOLED_DRAIN_SCAN_LIMIT = TELEGRAM_SPOOLED_DRAIN_START_LIMIT * 10;
const TELEGRAM_SPOOLED_SESSION_INIT_CONFLICT_RETRY_BASE_MS = 5_000;
const TELEGRAM_SPOOLED_SESSION_INIT_CONFLICT_RETRY_MAX_MS = 60_000;
const TELEGRAM_POLLING_CLIENT_TIMEOUT_FLOOR_SECONDS = Math.ceil(
TELEGRAM_GET_UPDATES_REQUEST_TIMEOUT_MS / 1000,
);
const MISSING_AGENT_HARNESS_ERROR_NAME = "MissingAgentHarnessError";
const MISSING_AGENT_HARNESS_MESSAGE_RE = /Requested agent harness "[^"]+" is not registered\./u;
const REPLY_SESSION_INIT_CONFLICT_MESSAGE_RE = /reply session initialization conflicted for \S+/u;
function normalizeTelegramAccountId(accountId?: string | null): string {
return accountId?.trim() || "default";
@@ -169,6 +172,24 @@ function resolveNonRetryableSpooledUpdateFailure(
return null;
}
function resolveSpooledUpdateRetryDelayMs(update: TelegramSpooledUpdate, now = Date.now()): number {
const attempts = update.attempts ?? 0;
if (
!update.lastError ||
!REPLY_SESSION_INIT_CONFLICT_MESSAGE_RE.test(update.lastError) ||
update.lastAttemptAt === undefined ||
attempts <= 0
) {
return 0;
}
const exponent = Math.min(attempts - 1, 8);
const delayMs = Math.min(
TELEGRAM_SPOOLED_SESSION_INIT_CONFLICT_RETRY_MAX_MS,
TELEGRAM_SPOOLED_SESSION_INIT_CONFLICT_RETRY_BASE_MS * 2 ** exponent,
);
return Math.max(0, update.lastAttemptAt + delayMs - now);
}
type TelegramBot = ReturnType<typeof createTelegramBot>;
const waitForGracefulStop = async (stop: () => Promise<void>) => {
@@ -777,7 +798,9 @@ export class TelegramPollingSession {
}
}
try {
await releaseTelegramSpooledUpdateClaim(params.update);
await releaseTelegramSpooledUpdateClaim(params.update, {
lastError: formatErrorMessage(params.err),
});
} catch (releaseErr) {
this.opts.log(
`[telegram][diag] spooled update ${params.update.updateId} failed and could not be requeued: ${formatErrorMessage(releaseErr)}`,
@@ -865,6 +888,10 @@ export class TelegramPollingSession {
if (this.opts.abortSignal?.aborted) {
break;
}
if (resolveSpooledUpdateRetryDelayMs(update) > 0) {
claimedLaneKeys.add(laneKey);
continue;
}
const handlerKey = buildSpooledUpdateHandlerKey({ spoolDir: params.spoolDir, laneKey });
if (activeSpooledUpdateHandlersByLane.has(handlerKey)) {
blockedByLane.add(handlerKey);
@@ -1533,6 +1560,7 @@ export const testing = {
createTelegramRestartBackoffState,
resetTelegramRestartBackoffState,
resolveTelegramRestartDelayMs,
resolveSpooledUpdateRetryDelayMs,
resolveSpooledUpdateHandlerAbortGraceMs: (valueMs: unknown): number =>
resolvePositiveTimerTimeoutMs(valueMs, TELEGRAM_SPOOLED_HANDLER_ABORT_GRACE_MS),
};

View File

@@ -38,6 +38,9 @@ export type TelegramSpooledUpdate = {
path: string;
update: unknown;
receivedAt: number;
attempts?: number;
lastAttemptAt?: number;
lastError?: string;
claim?: TelegramSpooledUpdateClaimOwner;
};
@@ -166,6 +169,9 @@ function parseQueueRecord(
path: pendingPath(spoolDir, payload.updateId),
update: payload.update,
receivedAt: payload.receivedAt,
attempts: record.attempts,
...(record.lastAttemptAt === undefined ? {} : { lastAttemptAt: record.lastAttemptAt }),
...(record.lastError === undefined ? {} : { lastError: record.lastError }),
};
}
@@ -267,9 +273,11 @@ export async function claimTelegramSpooledUpdate(
export async function releaseTelegramSpooledUpdateClaim(
update: ClaimedTelegramSpooledUpdate,
options?: { lastError?: string; releasedAt?: number },
): Promise<void> {
await createTelegramIngressQueue(path.dirname(update.pendingPath)).release(
queueMutationTarget(update),
options,
);
}

View File

@@ -0,0 +1,217 @@
// Voyage batch tests cover bounded status/error response reads.
import { describe, expect, it } from "vitest";
import type { VoyageEmbeddingClient } from "./embedding-provider.js";
import { testing } from "./embedding-batch.js";
const { fetchVoyageBatchStatus, readVoyageBatchError, VOYAGE_BATCH_RESPONSE_MAX_BYTES } = testing;
function buildClient(): VoyageEmbeddingClient {
return {
baseUrl: "https://api.voyageai.test/v1",
headers: { authorization: "Bearer test" },
model: "voyage-3",
};
}
/**
* Build deps whose withRemoteHttpResponse drives the real onResponse against a
* caller-provided Response, so the bounded readers run exactly as in production.
*/
function buildDeps(response: Response): Parameters<typeof fetchVoyageBatchStatus>[0]["deps"] {
return {
now: () => 0,
sleep: async () => {},
postJsonWithRetry: (async () => {
throw new Error("postJsonWithRetry should not be called in these tests");
}) as never,
uploadBatchJsonlFile: (async () => {
throw new Error("uploadBatchJsonlFile should not be called in these tests");
}) as never,
withRemoteHttpResponse: (async (params: { onResponse: (res: Response) => Promise<unknown> }) =>
await params.onResponse(response)) as never,
};
}
/**
* A streaming JSON-ish body that proves an oversized response stops being read
* before the whole advertised payload is buffered into memory. getReadCount
* reports how many chunks were pulled; cancel() flips wasCanceled.
*/
function streamingResponse(params: { chunkCount: number; chunkSize: number; status?: number }): {
response: Response;
getReadCount: () => number;
wasCanceled: () => boolean;
} {
let reads = 0;
let canceled = false;
const encoder = new TextEncoder();
const stream = new ReadableStream<Uint8Array>({
pull(controller) {
if (reads >= params.chunkCount) {
controller.close();
return;
}
reads += 1;
controller.enqueue(encoder.encode("a".repeat(params.chunkSize)));
},
cancel() {
canceled = true;
},
});
return {
response: new Response(stream, {
status: params.status ?? 200,
headers: { "content-type": "application/json" },
}),
getReadCount: () => reads,
wasCanceled: () => canceled,
};
}
describe("voyage batch bounded reads", () => {
it("uses a 16 MiB cap for batch status/error responses", () => {
expect(VOYAGE_BATCH_RESPONSE_MAX_BYTES).toBe(16 * 1024 * 1024);
});
it("parses a well-formed batch status response under the byte cap", async () => {
const response = new Response(JSON.stringify({ id: "batch_1", status: "completed" }), {
status: 200,
headers: { "content-type": "application/json" },
});
const status = await fetchVoyageBatchStatus({
client: buildClient(),
batchId: "batch_1",
deps: buildDeps(response),
});
expect(status).toEqual({ id: "batch_1", status: "completed" });
});
it("caps an oversized batch status stream instead of buffering the whole body", async () => {
const streamed = streamingResponse({ chunkCount: 64, chunkSize: 1024 });
await expect(
fetchVoyageBatchStatus({
client: buildClient(),
batchId: "batch_1",
deps: buildDeps(streamed.response),
maxResponseBytes: 4096,
}),
).rejects.toThrow(/voyage-batch-status: JSON response exceeds 4096 bytes/);
// Stream was cancelled mid-flight: fewer chunks read than the full payload.
expect(streamed.getReadCount()).toBeLessThan(64);
expect(streamed.wasCanceled()).toBe(true);
});
it("preserves the full NDJSON parse chain for an under-cap error file", async () => {
// Multi-line NDJSON with a blank line proves the bounded read does not
// disturb the original trim/split("\n")/JSON.parse/extractBatchErrorMessage
// pipeline: the first useful error message is still extracted byte-for-byte
// identically to the pre-change `await res.text()` path.
const body = [
JSON.stringify({ custom_id: "req-0", response: { status_code: 200 } }),
"",
JSON.stringify({ custom_id: "req-1", error: { message: "voyage upstream rejected" } }),
JSON.stringify({ custom_id: "req-2", error: { message: "second error ignored" } }),
"",
].join("\n");
const response = new Response(body, {
status: 200,
headers: { "content-type": "application/x-ndjson" },
});
const message = await readVoyageBatchError({
client: buildClient(),
errorFileId: "file_1",
deps: buildDeps(response),
});
// extractBatchErrorMessage returns the first line carrying a message, so the
// success line is skipped and the second error is not surfaced.
expect(message).toBe("voyage upstream rejected");
});
it("returns undefined for an empty error file via the original empty-body branch", async () => {
// Whitespace-only body must still hit the `!text.trim()` short-circuit after
// decoding the bounded buffer, returning undefined exactly as before.
const response = new Response(" \n", {
status: 200,
headers: { "content-type": "application/x-ndjson" },
});
const message = await readVoyageBatchError({
client: buildClient(),
errorFileId: "file_1",
deps: buildDeps(response),
});
expect(message).toBeUndefined();
});
it("fail-softs an oversized error file into formatUnavailableBatchError by design", async () => {
const streamed = streamingResponse({ chunkCount: 64, chunkSize: 1024 });
// Intended behavior: an over-cap error file must NOT throw out of
// readVoyageBatchError. An unbounded error body would otherwise OOM the
// worker, so the bounded overflow error is caught and degraded into a
// diagnostic string via formatUnavailableBatchError. We accept the lost
// detail; the overflow message names the cap so the truncation is visible.
const readError = async () =>
await readVoyageBatchError({
client: buildClient(),
errorFileId: "file_1",
deps: buildDeps(streamed.response),
maxResponseBytes: 4096,
});
await expect(readError()).resolves.toMatch(
/error file unavailable: voyage batch error file content exceeds 4096 bytes/,
);
// The bounded reader still cancels the stream mid-flight rather than
// buffering the whole advertised payload before failing soft.
expect(streamed.getReadCount()).toBeLessThan(64);
expect(streamed.wasCanceled()).toBe(true);
});
it("caps an oversized non-OK (error) diagnostic body instead of buffering it whole", async () => {
// Regression for the non-OK gap: `assertVoyageResponseOk` previously read the
// 4xx/5xx diagnostic body with an unbounded `await res.text()`. A hostile
// endpoint can return a 500 with a never-ending body, so that read must be
// bounded too. Drive a streaming 500 through the real status path and assert
// the bounded overflow error fires and the stream is cancelled mid-flight.
const streamed = streamingResponse({ chunkCount: 64, chunkSize: 1024, status: 500 });
await expect(
fetchVoyageBatchStatus({
client: buildClient(),
batchId: "batch_1",
deps: buildDeps(streamed.response),
maxResponseBytes: 4096,
}),
).rejects.toThrow(/voyage batch status failed: 500 \(error body exceeds 4096 bytes\)/);
// Stream was cancelled mid-flight rather than draining the whole body.
expect(streamed.getReadCount()).toBeLessThan(64);
expect(streamed.wasCanceled()).toBe(true);
});
it("preserves the diagnostic shape for a small non-OK (error) body", async () => {
// Under-cap non-OK body must still surface the original
// `${context}: ${status} ${text}` diagnostic byte-for-byte.
const response = new Response("voyage upstream is down", {
status: 503,
headers: { "content-type": "text/plain" },
});
await expect(
fetchVoyageBatchStatus({
client: buildClient(),
batchId: "batch_1",
deps: buildDeps(response),
}),
).rejects.toThrow(/voyage batch status failed: 503 voyage upstream is down/);
});
});

View File

@@ -21,6 +21,8 @@ import {
uploadBatchJsonlFile,
withRemoteHttpResponse,
} from "openclaw/plugin-sdk/memory-core-host-engine-embeddings";
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
import { readResponseWithLimit } from "openclaw/plugin-sdk/response-limit-runtime";
import { normalizeStringEntries } from "openclaw/plugin-sdk/string-coerce-runtime";
import type { VoyageEmbeddingClient } from "./embedding-provider.js";
@@ -41,6 +43,10 @@ type VoyageBatchOutputLine = ProviderBatchOutputLine;
const VOYAGE_BATCH_ENDPOINT = EMBEDDING_BATCH_ENDPOINT;
const VOYAGE_BATCH_COMPLETION_WINDOW = "12h";
const VOYAGE_BATCH_MAX_REQUESTS = 50000;
// Voyage batch status/error responses are untrusted external bodies. Cap them
// the same way other bundled providers do (16 MiB) so a misbehaving or hostile
// endpoint cannot stream an unbounded body into memory before we parse it.
const VOYAGE_BATCH_RESPONSE_MAX_BYTES = 16 * 1024 * 1024;
type VoyageBatchDeps = {
now: () => number;
@@ -65,9 +71,23 @@ function resolveVoyageBatchDeps(overrides: Partial<VoyageBatchDeps> | undefined)
};
}
async function assertVoyageResponseOk(res: Response, context: string): Promise<void> {
async function assertVoyageResponseOk(
res: Response,
context: string,
maxBytes: number = VOYAGE_BATCH_RESPONSE_MAX_BYTES,
): Promise<void> {
if (!res.ok) {
const text = await res.text();
// The non-OK diagnostic body is just as untrusted as the success body: a
// misbehaving or hostile endpoint can return a 4xx/5xx with an unbounded
// body, and the old `await res.text()` buffered it whole before we threw.
// Read it through the same bounded reader (16 MiB cap, stream cancelled on
// overflow) while preserving the original `${context}: ${status} ${text}`
// diagnostic shape for backward compatibility.
const bytes = await readResponseWithLimit(res, maxBytes, {
onOverflow: ({ maxBytes: maxBytesLocal }) =>
new Error(`${context}: ${res.status} (error body exceeds ${maxBytesLocal} bytes)`),
});
const text = new TextDecoder().decode(bytes);
throw new Error(`${context}: ${res.status} ${text}`);
}
}
@@ -127,14 +147,18 @@ async function fetchVoyageBatchStatus(params: {
client: VoyageEmbeddingClient;
batchId: string;
deps: VoyageBatchDeps;
maxResponseBytes?: number;
}): Promise<VoyageBatchStatus> {
const maxBytes = params.maxResponseBytes ?? VOYAGE_BATCH_RESPONSE_MAX_BYTES;
return await params.deps.withRemoteHttpResponse(
buildVoyageBatchRequest({
client: params.client,
path: `batches/${params.batchId}`,
onResponse: async (res) => {
await assertVoyageResponseOk(res, "voyage batch status failed");
return (await res.json()) as VoyageBatchStatus;
await assertVoyageResponseOk(res, "voyage batch status failed", maxBytes);
return await readProviderJsonResponse<VoyageBatchStatus>(res, "voyage-batch-status", {
maxBytes,
});
},
}),
);
@@ -144,15 +168,21 @@ async function readVoyageBatchError(params: {
client: VoyageEmbeddingClient;
errorFileId: string;
deps: VoyageBatchDeps;
maxResponseBytes?: number;
}): Promise<string | undefined> {
const maxBytes = params.maxResponseBytes ?? VOYAGE_BATCH_RESPONSE_MAX_BYTES;
try {
return await params.deps.withRemoteHttpResponse(
buildVoyageBatchRequest({
client: params.client,
path: `files/${params.errorFileId}/content`,
onResponse: async (res) => {
await assertVoyageResponseOk(res, "voyage batch error file content failed");
const text = await res.text();
await assertVoyageResponseOk(res, "voyage batch error file content failed", maxBytes);
const bytes = await readResponseWithLimit(res, maxBytes, {
onOverflow: ({ maxBytes: maxBytesLocal }) =>
new Error(`voyage batch error file content exceeds ${maxBytesLocal} bytes`),
});
const text = new TextDecoder().decode(bytes);
if (!text.trim()) {
return undefined;
}
@@ -280,10 +310,9 @@ export async function runVoyageEmbeddingBatches(
headers: buildBatchHeaders(params.client, { json: true }),
},
onResponse: async (contentRes) => {
if (!contentRes.ok) {
const text = await contentRes.text();
throw new Error(`voyage batch file content failed: ${contentRes.status} ${text}`);
}
// Same bounded non-OK diagnostic read as the status/error-file paths:
// the failure body is untrusted, so cap it instead of `await text()`.
await assertVoyageResponseOk(contentRes, "voyage batch file content failed");
if (!contentRes.body) {
return;
@@ -316,3 +345,9 @@ export async function runVoyageEmbeddingBatches(
},
});
}
export const testing = {
fetchVoyageBatchStatus,
readVoyageBatchError,
VOYAGE_BATCH_RESPONSE_MAX_BYTES,
} as const;

View File

@@ -8,8 +8,9 @@ import {
assertOkOrThrowHttpError,
buildAudioTranscriptionFormData,
postTranscriptionRequest,
resolveProviderHttpRequestConfig,
readProviderJsonResponse,
requireTranscriptionText,
resolveProviderHttpRequestConfig,
} from "openclaw/plugin-sdk/provider-http";
import { normalizeOptionalString } from "openclaw/plugin-sdk/string-coerce-runtime";
import { XAI_BASE_URL } from "./model-definitions.js";
@@ -68,7 +69,7 @@ export async function transcribeXaiAudio(
try {
await assertOkOrThrowHttpError(response, "xAI audio transcription failed");
const payload = (await response.json()) as XaiSttResponse;
const payload = await readProviderJsonResponse<XaiSttResponse>(response, "xai.stt");
return {
text: requireTranscriptionText(payload.text, "xAI transcription response missing text"),
...(model ? { model } : {}),

View File

@@ -16,6 +16,18 @@ if [[ ! -f "$FILTER_FILES" ]]; then
exit 1
fi
GIT_DIR="$(git rev-parse --git-dir 2>/dev/null || true)"
if [[ -n "$GIT_DIR" ]] && \
{ [[ -f "$GIT_DIR/MERGE_HEAD" ]] || \
[[ -f "$GIT_DIR/CHERRY_PICK_HEAD" ]] || \
[[ -f "$GIT_DIR/REVERT_HEAD" ]] || \
[[ -f "$GIT_DIR/REBASE_HEAD" ]] || \
[[ -d "$GIT_DIR/rebase-merge" ]] || \
[[ -d "$GIT_DIR/rebase-apply" ]]; }; then
# Sequencer commits stage the operation result, not just the user's local edits.
exit 0
fi
# Security: avoid option-injection from malicious file names (e.g. "--all", "--force").
# Robustness: NUL-delimited file list handles spaces/newlines safely.
# Compatibility: use read loops instead of `mapfile` so this runs on macOS Bash 3.x.

View File

@@ -34,6 +34,19 @@ describe("acp session manager", () => {
expect(store.getSessionByRunId("run-1")).toBeUndefined();
});
it("removes stale run lookup entries when rebinding an active run", () => {
const session = store.createSession({
sessionKey: "acp:rebind",
cwd: "/tmp",
});
store.setActiveRun(session.sessionId, "run-old", new AbortController());
store.setActiveRun(session.sessionId, "run-new", new AbortController());
expect(store.getSessionByRunId("run-old")).toBeUndefined();
expect(store.getSessionByRunId("run-new")?.sessionId).toBe(session.sessionId);
});
it("deletes sessions and aborts active runs on close", () => {
const session = store.createSession({
sessionId: "close-me",

View File

@@ -150,6 +150,9 @@ export function createInMemorySessionStore(options: AcpSessionStoreOptions = {})
if (!session) {
return;
}
if (session.activeRunId && session.activeRunId !== runId) {
runIdToSessionId.delete(session.activeRunId);
}
session.activeRunId = runId;
session.abortController = abortController;
runIdToSessionId.set(runId, sessionId);

View File

@@ -0,0 +1,11 @@
// Agent Core tests cover prompt template argument parsing behavior.
import { describe, expect, it } from "vitest";
import { parseCommandArgs, substituteArgs } from "./prompt-template-arguments.js";
describe("prompt template arguments", () => {
it("preserves quoted empty arguments so positional placeholders stay aligned", () => {
expect(parseCommandArgs('first "" third')).toEqual(["first", "", "third"]);
expect(parseCommandArgs("first '' third")).toEqual(["first", "", "third"]);
expect(substituteArgs("$1|$2|$3", parseCommandArgs('first "" third'))).toBe("first||third");
});
});

View File

@@ -5,26 +5,31 @@ export function parseCommandArgs(argsString: string): string[] {
const args: string[] = [];
let current = "";
let inQuote: string | null = null;
let hasToken = false;
for (const char of argsString) {
if (inQuote) {
if (char === inQuote) {
inQuote = null;
} else {
hasToken = true;
current += char;
}
} else if (char === '"' || char === "'") {
hasToken = true;
inQuote = char;
} else if (/\s/.test(char)) {
if (current) {
if (hasToken) {
args.push(current);
current = "";
hasToken = false;
}
} else {
hasToken = true;
current += char;
}
}
if (current) {
if (hasToken) {
args.push(current);
}
return args;

View File

@@ -55,4 +55,27 @@ describe("media-generation catalog", () => {
}),
).toEqual(["video-default", "video-pro"]);
});
it("marks a trimmed default model as the catalog default", () => {
expect(
synthesizeMediaGenerationCatalogEntries({
kind: "video_generation",
provider: {
id: "example",
defaultModel: " video-default ",
models: ["video-default"],
capabilities: {},
},
}),
).toEqual([
{
kind: "video_generation",
provider: "example",
model: "video-default",
source: "static",
default: true,
capabilities: {},
},
]);
});
});

View File

@@ -51,6 +51,7 @@ export function synthesizeMediaGenerationCatalogEntries<TCapabilities>(params: {
provider: MediaGenerationCatalogProvider<TCapabilities>;
modes?: readonly string[];
}): Array<MediaGenerationCatalogEntry<TCapabilities>> {
const defaultModel = uniqueTrimmedStrings([params.provider.defaultModel])[0];
return uniqueModels(params.provider).map((model) => {
const entry: MediaGenerationCatalogEntry<TCapabilities> = {
kind: params.kind,
@@ -62,7 +63,7 @@ export function synthesizeMediaGenerationCatalogEntries<TCapabilities>(params: {
if (params.provider.label) {
entry.label = params.provider.label;
}
if (model === params.provider.defaultModel) {
if (model === defaultModel) {
entry.default = true;
}
if (params.modes) {

View File

@@ -0,0 +1,100 @@
// Media Understanding Common tests cover provider output extraction behavior.
import { describe, expect, it } from "vitest";
import { extractGeminiResponse } from "./output-extract.js";
describe("extractGeminiResponse", () => {
it("extracts the response from noisy output with nested JSON objects", () => {
expect(
extractGeminiResponse(
[
"debug: invoking gemini",
JSON.stringify({
response: "a useful description",
usage: {
inputTokens: 12,
outputTokens: 4,
},
}),
].join("\n"),
),
).toBe("a useful description");
});
it("returns null for an incomplete JSON object", () => {
expect(extractGeminiResponse("{")).toBeNull();
});
it("ignores unmatched quotes in noisy output before the JSON object", () => {
expect(extractGeminiResponse('debug: model said "hello\n{"response":"ok"}')).toBe("ok");
});
it("ignores braces inside quoted noisy output", () => {
expect(extractGeminiResponse('debug: "hello { world" {"response":"ok"}')).toBe("ok");
});
it("ignores shell-quoted JSON-like noisy output", () => {
expect(extractGeminiResponse('debug: \'{"response":"fake"}\'')).toBeNull();
});
it("does not treat apostrophes inside noisy words as quote delimiters", () => {
expect(extractGeminiResponse('debug: it\'s done {"response":"ok"}')).toBe("ok");
});
it("resynchronizes after an unmatched brace in noisy output", () => {
expect(extractGeminiResponse('debug: generated {\n{"response":"ok"}')).toBe("ok");
});
it("preserves brace-heavy response text", () => {
const response = "{".repeat(33);
expect(extractGeminiResponse(JSON.stringify({ response }))).toBe(response);
});
it("extracts pretty-printed JSON output", () => {
expect(
extractGeminiResponse(
JSON.stringify(
{
response: "pretty response",
usage: { inputTokens: 12 },
},
null,
2,
),
),
).toBe("pretty response");
});
it("preserves pretty-printed object elements inside arrays", () => {
expect(
extractGeminiResponse(
JSON.stringify(
{
response: "array response",
items: [{ id: 1 }, { id: 2 }],
},
null,
2,
),
),
).toBe("array response");
});
it("does not accept an inner response from a malformed trailing object", () => {
expect(extractGeminiResponse('{"response":"good"} {"meta":{"response":"bad"} broken}')).toBe(
"good",
);
expect(extractGeminiResponse('{"response":"good"} {"meta":{"response":"bad"}')).toBe("good");
});
it("ignores a nested response inside an unfinished outer object", () => {
expect(extractGeminiResponse('noise {"meta":{"response":"bad"}')).toBeNull();
});
it("does not promote a child from a malformed outer object", () => {
expect(extractGeminiResponse('{"response":"good"} {"meta" {"response":"bad"}}')).toBe("good");
expect(extractGeminiResponse('noise {broken {"response":"bad"}}')).toBeNull();
expect(extractGeminiResponse('{"response":"good"}\nnoise {broken\n{"response":"bad"}}')).toBe(
"good",
);
});
});

View File

@@ -3,16 +3,119 @@
/** Parse the last JSON object in a noisy provider output string. */
function extractLastJsonObject(raw: string): unknown {
const trimmed = raw.trim();
const start = trimmed.lastIndexOf("{");
if (start === -1) {
return null;
const ranges: Array<{ end: number; start: number }> = [];
const starts: number[] = [];
let inString = false;
let escaped = false;
let preambleQuote: string | undefined;
let preambleEscaped = false;
let previousSignificant: string | undefined;
let lineHasNonWhitespace = false;
let arrayDepth = 0;
let candidateHasContent = false;
for (let index = 0; index < trimmed.length; index += 1) {
const character = trimmed[index];
if (inString) {
if (character === "\n" || character === "\r") {
starts.length = 0;
inString = false;
escaped = false;
} else if (escaped) {
escaped = false;
} else if (character === "\\") {
escaped = true;
} else if (character === '"') {
inString = false;
}
continue;
}
if (starts.length === 0) {
if (preambleQuote !== undefined) {
if (character === "\n" || character === "\r") {
preambleQuote = undefined;
preambleEscaped = false;
} else if (preambleEscaped) {
preambleEscaped = false;
} else if (character === "\\") {
preambleEscaped = true;
} else if (character === preambleQuote) {
preambleQuote = undefined;
}
continue;
}
if (character === '"' || character === "'" || character === "`") {
const previous = trimmed[index - 1];
if (previous === undefined || /[\s:([{]/.test(previous)) {
preambleQuote = character;
preambleEscaped = false;
continue;
}
}
if (character === "{") {
arrayDepth = 0;
candidateHasContent = false;
starts.push(index);
}
if (!/\s/.test(character)) {
previousSignificant = character;
lineHasNonWhitespace = true;
} else if (character === "\n" || character === "\r") {
lineHasNonWhitespace = false;
}
continue;
}
const hadCandidateContent = candidateHasContent;
if (character === '"') {
inString = true;
} else if (character === "{") {
if (
previousSignificant === ":" ||
previousSignificant === "[" ||
previousSignificant === '"' ||
(previousSignificant === "," && (lineHasNonWhitespace || arrayDepth > 0))
) {
starts.push(index);
} else if (!lineHasNonWhitespace && !hadCandidateContent) {
// Only resync at a clean record boundary; otherwise keep malformed
// outer objects from promoting diagnostic payloads as valid results.
starts.length = 1;
starts[0] = index;
arrayDepth = 0;
candidateHasContent = false;
}
} else if (character === "}" && starts.length > 0) {
const start = starts.pop();
if (start !== undefined && starts.length === 0) {
ranges.push({ start, end: index });
}
} else if (character === "[") {
arrayDepth += 1;
} else if (character === "]" && arrayDepth > 0) {
arrayDepth -= 1;
}
if (!/\s/.test(character)) {
candidateHasContent = true;
previousSignificant = character;
lineHasNonWhitespace = true;
} else if (character === "\n" || character === "\r") {
lineHasNonWhitespace = false;
}
}
const slice = trimmed.slice(start);
try {
return JSON.parse(slice);
} catch {
return null;
for (let index = ranges.length - 1; index >= 0; index -= 1) {
const range = ranges[index];
try {
return JSON.parse(trimmed.slice(range.start, range.end + 1));
} catch {
// Ignore malformed objects and try the previous completed range.
}
}
return null;
}
/** Extract Gemini CLI-style response text from the last JSON object in output. */

View File

@@ -108,7 +108,7 @@ flow:
- lambda:
params: [text]
expr: "config.expectedReplyGroups.every((group) => group.some((needle) => normalizeLowercaseStringOrEmpty(text).includes(needle)))"
- expr: "env.providerMode === 'mock-openai' ? 10000 : 30000"
- expr: "30000"
- expr: "env.providerMode === 'mock-openai' ? 100 : 250"
- if:
expr: "Boolean(env.mock)"
@@ -240,7 +240,11 @@ flow:
message:
expr: "lastError instanceof Error ? formatErrorMessage(lastError) : String(lastError ?? 'fanout retry exhausted')"
- if:
expr: "Boolean(env.mock)"
# Codex completes child sessions through its app-server path but
# does not relay the child marker back onto the parent QA channel.
# The shared assertions above already prove both child tool calls
# and child session rows; keep this transport-only proof OpenClaw-specific.
expr: "Boolean(env.mock) && env.gateway.runtimeEnv.OPENCLAW_QA_FORCE_RUNTIME !== 'codex'"
then:
- forEach:
items:
@@ -253,5 +257,5 @@ flow:
- lambda:
params: [candidate]
expr: "String(candidate.text ?? '').trim() === childCompletionMarker"
- 10000
- 30000
detailsExpr: "details"

View File

@@ -26,7 +26,9 @@ scenario:
config:
sessionKey: agent:qa:long-context-cache-stability
fixtureFile: large-cache-fixture.txt
cacheEvidenceNeedle: CACHE-FIXTURE-0550
cacheEvidenceNeedle: CACHE-FIXTURE-0050
cacheEvidenceLine: "CACHE-FIXTURE-0050: stable tool-result evidence for prompt-cache reuse across long sessions."
followupPromptNeedle: Using the already-read
warmupMarker: QA-LARGE-CACHE-WARMUP-OK
hitMarker: QA-LARGE-CACHE-HIT-OK
@@ -84,8 +86,17 @@ flow:
- set: debugRequests
value:
expr: "env.mock ? [...(await fetchJson(`${env.mock.baseUrl}/debug/requests`))] : []"
- set: cappedReadOutputIndex
value:
expr: "debugRequests.reduce((found, planned, index) => { if (found >= 0 || !planned.plannedToolCallId || planned.plannedToolName !== 'read' || planned.plannedToolArgs?.path !== config.fixtureFile) return found; const outputOffset = debugRequests.slice(index + 1).findIndex((candidate) => Boolean(candidate.toolOutputCallId) && candidate.toolOutputCallId === planned.plannedToolCallId); if (outputOffset < 0) return found; const output = debugRequests[index + 1 + outputOffset]; const evidence = [planned.allInputText, output.allInputText, output.toolOutput].filter((value) => typeof value === 'string').join('\\n'); const hasCodexFormattedTruncation = evidence.includes('Warning: truncated output') && (evidence.includes('chars truncated') || evidence.includes('tokens truncated')); return evidence.includes(config.cacheEvidenceLine) && (evidence.includes('[Read output capped at 50KB') || evidence.includes('...(OpenClaw truncated dynamic tool result') || evidence.includes('...(truncated)...') || hasCodexFormattedTruncation) ? index + 1 + outputOffset : found; }, -1)"
- set: hasCappedReadEvidence
value:
expr: "cappedReadOutputIndex >= 0"
- set: hasFollowupCacheEvidence
value:
expr: "cappedReadOutputIndex >= 0 && debugRequests.some((request, index) => index > cappedReadOutputIndex && String(request.prompt ?? '').includes(config.followupPromptNeedle) && String(request.allInputText ?? '').includes(config.cacheEvidenceLine))"
- assert:
expr: "!env.mock || debugRequests.some((request, index) => request.plannedToolName === 'read' && request.plannedToolArgs?.path === config.fixtureFile && typeof request.plannedToolCallId === 'string' && debugRequests.slice(index + 1).some((result, resultOffset) => result.toolOutputCallId === request.plannedToolCallId && String(result.toolOutput ?? '').includes(config.cacheEvidenceNeedle) && (String(result.toolOutput ?? '').includes('[Read output capped at 50KB') || (String(result.toolOutput ?? '').includes('...(truncated)...') && String(result.toolOutput ?? '').length <= 13000)) && debugRequests.slice(index + resultOffset + 2).some((followup) => followup.plannedToolName === 'read' && followup.plannedToolArgs?.path === config.fixtureFile && String(followup.allInputText ?? '').includes(config.cacheEvidenceNeedle) && (String(followup.allInputText ?? '').includes('[Read output capped at 50KB') || String(followup.allInputText ?? '').includes('...(truncated)...')))))"
expr: "!env.mock || (hasCappedReadEvidence && hasFollowupCacheEvidence)"
message:
expr: "`large capped read tool result was not observed: ${JSON.stringify(debugRequests.slice(-8).map((request) => ({ plannedToolName: request.plannedToolName ?? null, plannedToolArgs: request.plannedToolArgs ?? null, plannedToolCallId: request.plannedToolCallId ?? null, toolOutputCallId: request.toolOutputCallId ?? null, toolOutputLength: String(request.toolOutput ?? '').length, toolOutputHasNeedle: String(request.toolOutput ?? '').includes(config.cacheEvidenceNeedle), toolOutputHasReadCap: String(request.toolOutput ?? '').includes('[Read output capped at 50KB'), toolOutputHasCodexTruncation: String(request.toolOutput ?? '').includes('...(truncated)...'), inputHasNeedle: String(request.allInputText ?? '').includes(config.cacheEvidenceNeedle), inputHasReadCap: String(request.allInputText ?? '').includes('[Read output capped at 50KB'), inputHasCodexTruncation: String(request.allInputText ?? '').includes('...(truncated)...') })))}`"
expr: "`large capped read cache evidence was not observed: ${JSON.stringify({ hasCappedReadEvidence, hasFollowupCacheEvidence, requests: debugRequests.slice(-8).map((request) => ({ prompt: request.prompt ?? null, plannedToolName: request.plannedToolName ?? null, plannedToolArgs: request.plannedToolArgs ?? null, plannedToolCallId: request.plannedToolCallId ?? null, toolOutputCallId: request.toolOutputCallId ?? null, toolOutputLength: String(request.toolOutput ?? '').length, outputHasReadCap: String(request.toolOutput ?? '').includes('[Read output capped at 50KB'), outputHasCodexTruncation: String(request.toolOutput ?? '').includes('...(truncated)...'), inputHasEvidenceLine: String(request.allInputText ?? '').includes(config.cacheEvidenceLine) })) })}`"
detailsExpr: "outbound?.text ?? config.hitMarker"

View File

@@ -25,10 +25,35 @@ PROBE_ATTEMPT_TIMEOUT_MS="$(
PROBE_MAX_BODY_BYTES="$(
openclaw_e2e_read_positive_int_env OPENCLAW_UPGRADE_SURVIVOR_PROBE_MAX_BODY_BYTES 1048576
)"
LANE_ARTIFACT_SUFFIX="${OPENCLAW_DOCKER_ALL_LANE_NAME:-default}"
ROOT_MANAGED_VPS="${OPENCLAW_UPGRADE_SURVIVOR_ROOT_MANAGED_VPS:-0}"
resolve_lane_artifact_suffix() {
if [ -n "${OPENCLAW_DOCKER_ALL_LANE_NAME:-}" ]; then
printf "%s" "$OPENCLAW_DOCKER_ALL_LANE_NAME"
return
fi
if [ "$ROOT_MANAGED_VPS" = "1" ]; then
printf "root-managed-vps-upgrade"
elif [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
printf "update-restart-auth"
elif [ "${OPENCLAW_UPGRADE_SURVIVOR_PUBLISHED_BASELINE:-0}" = "1" ]; then
printf "published-upgrade-survivor"
else
printf "upgrade-survivor"
fi
if [ -n "${BASELINE_SPEC// }" ]; then
printf -- "-%s" "$BASELINE_SPEC"
fi
if [ "$SCENARIO" != "base" ]; then
printf -- "-%s" "$SCENARIO"
fi
}
LANE_ARTIFACT_SUFFIX="$(resolve_lane_artifact_suffix)"
LANE_ARTIFACT_SUFFIX="${LANE_ARTIFACT_SUFFIX//[^A-Za-z0-9_.-]/_}"
ARTIFACT_DIR="${OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_DIR:-$ROOT_DIR/.artifacts/upgrade-survivor/$LANE_ARTIFACT_SUFFIX}"
ROOT_MANAGED_VPS="${OPENCLAW_UPGRADE_SURVIVOR_ROOT_MANAGED_VPS:-0}"
DOCKER_RUN_USER_ARGS=()
PROBE_ENV_ARGS=(
-e OPENCLAW_UPGRADE_SURVIVOR_PROBE_TIMEOUT_MS="$PROBE_TIMEOUT_MS"

View File

@@ -21,6 +21,9 @@ export const pluginSdkDocMetadata = {
health: {
category: "core",
},
sandbox: {
category: "runtime",
},
"approval-runtime": {
category: "runtime",
},

View File

@@ -203,14 +203,14 @@ try {
budgets = {
publicEntrypoints: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_ENTRYPOINTS", 322),
publicExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_EXPORTS", 10386),
publicFunctionExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_FUNCTION_EXPORTS", 5215),
publicFunctionExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_FUNCTION_EXPORTS", 5212),
publicDeprecatedExports: readBudgetEnv(
"OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_DEPRECATED_EXPORTS",
3247,
),
publicWildcardReexports: readBudgetEnv(
"OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_WILDCARD_REEXPORTS",
215,
214,
),
};
publicDeprecatedExportsByEntrypointBudget = readEntrypointBudgetEnv(

View File

@@ -3,6 +3,8 @@
* Exercises result coercion, error wrapping, client delegation, and conflict
* detection at the ToolDefinition boundary.
*/
import os from "node:os";
import path from "node:path";
import type { AgentTool } from "openclaw/plugin-sdk/agent-core";
import { Type } from "typebox";
import { describe, expect, it, vi } from "vitest";
@@ -14,6 +16,7 @@ import {
toToolDefinitions,
} from "./agent-tool-definition-adapter.js";
import { wrapToolWithBeforeToolCallHook } from "./agent-tools.before-tool-call.js";
import { createExecTool } from "./bash-tools.exec.js";
import type { ClientToolDefinition } from "./embedded-agent-runner/run/params.js";
type ToolExecute = ReturnType<typeof toToolDefinitions>[number]["execute"];
@@ -93,6 +96,154 @@ describe("agent tool definition adapter", () => {
expect(details?.error).toBe("nope");
});
it("preserves exec deny before prepared workdir failures", async () => {
const tool = createExecTool({
security: "deny",
ask: "off",
});
const [definition] = toToolDefinitions([tool]);
const missingWorkdir = path.join(os.tmpdir(), `openclaw-missing-denied-cwd-${Date.now()}`);
const existing = await definition.execute(
"call-denied-existing-cwd",
{
command: "echo denied",
workdir: process.cwd(),
},
undefined,
undefined,
extensionContext,
);
const missing = await definition.execute(
"call-denied-missing-cwd",
{
command: "echo denied",
workdir: missingWorkdir,
},
undefined,
undefined,
extensionContext,
);
const expected = {
status: "error",
error: "exec denied: host=gateway security=deny",
};
expect(existing.details).toMatchObject(expected);
expect(missing.details).toMatchObject(expected);
expect(JSON.stringify(missing)).not.toContain("unavailable or not a directory");
});
it("does not validate backend sandbox workdirs before exec deny", async () => {
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
const tool = createExecTool({
host: "sandbox",
security: "deny",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const [definition] = toToolDefinitions([tool]);
const result = await definition.execute(
"call-denied-backend-cwd",
{
command: "echo denied",
workdir: "/remote/workspace/generated",
},
undefined,
undefined,
extensionContext,
);
expect(result.details).toMatchObject({
status: "error",
error: "exec denied: host=sandbox security=deny",
});
expect(validateWorkdir).not.toHaveBeenCalled();
});
it("does not throw WeakMap errors when preparing malformed exec params", async () => {
const tool = createExecTool({
security: "full",
ask: "off",
});
const [definition] = toToolDefinitions([tool]);
const result = await definition.execute(
"call-malformed-exec-params",
"not-an-object",
undefined,
undefined,
extensionContext,
);
expect(result.details).toMatchObject({
status: "error",
error: "Provide a command to start.",
});
});
it("reports malformed exec params when elevated logging is enabled", async () => {
const tool = createExecTool({
security: "full",
ask: "off",
elevated: { enabled: true, allowed: true, defaultLevel: "on" },
});
const [definition] = toToolDefinitions([tool]);
const result = await definition.execute(
"call-malformed-elevated-exec-params",
{},
undefined,
undefined,
extensionContext,
);
expect(result.details).toMatchObject({
status: "error",
error: "Provide a command to start.",
});
});
it("does not validate backend sandbox workdirs before malformed exec params fail", async () => {
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const [definition] = toToolDefinitions([tool]);
const result = await definition.execute(
"call-malformed-backend-sandbox-exec-params",
{
workdir: "/remote/workspace/generated",
},
undefined,
undefined,
extensionContext,
);
expect(result.details).toMatchObject({
status: "error",
error: "Provide a command to start.",
});
expect(validateWorkdir).not.toHaveBeenCalled();
});
it("coerces details-only tool results to include content", async () => {
const tool = {
name: "memory_query",

View File

@@ -2,9 +2,12 @@
* Tests agent-specific exec defaults in assembled coding tools.
* Verifies per-agent exec host policy affects lazy exec/process behavior.
*/
import { beforeEach, describe, expect, it } from "vitest";
import fs from "node:fs";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import "./test-helpers/fast-coding-tools.js";
import "./test-helpers/fast-openclaw-tools.js";
import { createTempDirTracker } from "../../test/helpers/temp-dir.js";
import type { OpenClawConfig } from "../config/config.js";
import { setActivePluginRegistry } from "../plugins/runtime.js";
import { createSessionConversationTestRegistry } from "../test-utils/session-conversation-registry.js";
@@ -46,11 +49,26 @@ function requireExecTool(tools: ReturnType<typeof createOpenClawCodingTools>) {
return execTool;
}
const tempDirs = createTempDirTracker();
function createTempAgentDirs(prefix: string) {
const root = tempDirs.make(`${prefix}-`);
const workspaceDir = path.join(root, "workspace");
const agentDir = path.join(root, "agent");
fs.mkdirSync(workspaceDir, { recursive: true });
fs.mkdirSync(agentDir, { recursive: true });
return { workspaceDir, agentDir };
}
describe("Agent-specific exec tool defaults", () => {
beforeEach(() => {
setActivePluginRegistry(createSessionConversationTestRegistry());
});
afterEach(() => {
tempDirs.cleanup();
});
it("should run exec synchronously when process is denied", async () => {
const cfg: OpenClawConfig = {
tools: {
@@ -66,8 +84,7 @@ describe("Agent-specific exec tool defaults", () => {
const tools = createOpenClawCodingTools({
config: cfg,
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main",
agentDir: "/tmp/agent-main",
...createTempAgentDirs("test-main"),
});
const execTool = requireExecTool(tools);
@@ -91,8 +108,7 @@ describe("Agent-specific exec tool defaults", () => {
},
},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-implicit-gateway",
agentDir: "/tmp/agent-main-implicit-gateway",
...createTempAgentDirs("test-main-implicit-gateway"),
});
const execTool = requireExecTool(tools);
@@ -113,8 +129,7 @@ describe("Agent-specific exec tool defaults", () => {
},
},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-mode-deny",
agentDir: "/tmp/agent-main-mode-deny",
...createTempAgentDirs("test-main-mode-deny"),
});
const execTool = requireExecTool(tools);
@@ -135,8 +150,7 @@ describe("Agent-specific exec tool defaults", () => {
},
},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-mode-call-security",
agentDir: "/tmp/agent-main-mode-call-security",
...createTempAgentDirs("test-main-mode-call-security"),
});
const execTool = requireExecTool(tools);
@@ -171,8 +185,7 @@ describe("Agent-specific exec tool defaults", () => {
},
},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-mode-partial-agent",
agentDir: "/tmp/agent-main-mode-partial-agent",
...createTempAgentDirs("test-main-mode-partial-agent"),
});
const execTool = requireExecTool(tools);
@@ -197,8 +210,7 @@ describe("Agent-specific exec tool defaults", () => {
security: "deny",
},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-session-legacy-override",
agentDir: "/tmp/agent-main-session-legacy-override",
...createTempAgentDirs("test-main-session-legacy-override"),
});
const execTool = requireExecTool(tools);
@@ -213,8 +225,7 @@ describe("Agent-specific exec tool defaults", () => {
const tools = createOpenClawCodingTools({
config: {},
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-fail-closed",
agentDir: "/tmp/agent-main-fail-closed",
...createTempAgentDirs("test-main-fail-closed"),
});
const execTool = requireExecTool(tools);
await expect(
@@ -234,8 +245,7 @@ describe("Agent-specific exec tool defaults", () => {
const mainTools = createOpenClawCodingTools({
config: cfg,
sessionKey: "agent:main:main",
workspaceDir: "/tmp/test-main-exec-defaults",
agentDir: "/tmp/agent-main-exec-defaults",
...createTempAgentDirs("test-main-exec-defaults"),
});
const mainExecTool = requireExecTool(mainTools);
const mainResult = await mainExecTool.execute("call-main-default", {
@@ -254,8 +264,7 @@ describe("Agent-specific exec tool defaults", () => {
const helperTools = createOpenClawCodingTools({
config: cfg,
sessionKey: "agent:helper:main",
workspaceDir: "/tmp/test-helper-exec-defaults",
agentDir: "/tmp/agent-helper-exec-defaults",
...createTempAgentDirs("test-helper-exec-defaults"),
});
const helperExecTool = requireExecTool(helperTools);
const helperResult = await helperExecTool.execute("call-helper-default", {
@@ -280,8 +289,7 @@ describe("Agent-specific exec tool defaults", () => {
config: cfg,
agentId: "main",
sessionKey: "run-opaque-123",
workspaceDir: "/tmp/test-main-opaque-session",
agentDir: "/tmp/agent-main-opaque-session",
...createTempAgentDirs("test-main-opaque-session"),
});
const execTool = requireExecTool(tools);
const result = await execTool.execute("call-main-opaque-session", {

View File

@@ -854,6 +854,12 @@ export function createOpenClawCodingTools(options?: {
containerName: sandbox.containerName,
workspaceDir: sandbox.workspaceDir,
containerWorkdir: sandbox.containerWorkdir,
workdirValidation: sandbox.backend?.workdirValidation,
validateWorkdir: sandbox.backend?.validateWorkdir?.bind(sandbox.backend),
discardPreparedWorkdir: sandbox.backend?.discardPreparedWorkdir?.bind(
sandbox.backend,
),
workdirRoots: sandbox.backend?.workdirRoots,
env: sandbox.backend?.env ?? sandbox.docker.env,
buildExecSpec: sandbox.backend?.buildExecSpec.bind(sandbox.backend),
finalizeExec: sandbox.backend?.finalizeExec?.bind(sandbox.backend),

View File

@@ -3,18 +3,23 @@
* Verifies failed process outcomes surface useful text/details for shell
* errors, timeouts, signals, and runtime failures.
*/
import fs from "node:fs";
import os from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import type { SpawnInput } from "../process/supervisor/index.js";
import { createTempDirTracker } from "../../test/helpers/temp-dir.js";
import type { ProcessSupervisor, SpawnInput } from "../process/supervisor/index.js";
import { captureEnv } from "../test-utils/env.js";
import { resetProcessRegistryForTests } from "./bash-process-registry.js";
import { createExecTool } from "./bash-tools.exec.js";
import type { BashSandboxConfig } from "./bash-tools.shared.js";
import { resolveShellFromPath } from "./shell-utils.js";
const supervisorMock = vi.hoisted(() => ({
spawn: vi.fn(),
cancel: vi.fn(),
cancelScope: vi.fn(),
getRecord: vi.fn(),
spawn: vi.fn<ProcessSupervisor["spawn"]>(),
cancel: vi.fn<ProcessSupervisor["cancel"]>(),
cancelScope: vi.fn<ProcessSupervisor["cancelScope"]>(),
getRecord: vi.fn<ProcessSupervisor["getRecord"]>(),
}));
vi.mock("../process/supervisor/index.js", () => ({
@@ -25,6 +30,7 @@ const isWin = process.platform === "win32";
const defaultShell = isWin
? undefined
: process.env.OPENCLAW_TEST_SHELL || resolveShellFromPath("bash") || process.env.SHELL || "sh";
const tempDirs = createTempDirTracker();
function requireTextContent(
result: Awaited<ReturnType<ReturnType<typeof createExecTool>["execute"]>>,
@@ -47,6 +53,66 @@ function requireFailedDetails(
return details;
}
function mockSuccessfulSpawn(stdout = "ok\n") {
supervisorMock.spawn.mockImplementationOnce(async (input: SpawnInput) => ({
runId: input.runId ?? "call-success",
pid: 1234,
startedAtMs: Date.now(),
stdin: {
write: vi.fn(),
end: vi.fn(),
destroy: vi.fn(),
},
wait: vi.fn(async () => ({
reason: "exit" as const,
exitCode: 0,
exitSignal: null,
durationMs: 1,
stdout,
stderr: "",
timedOut: false,
noOutputTimedOut: false,
})),
cancel: vi.fn(),
}));
}
async function expectUnavailableWorkdir(params: {
workdir: string;
toolDefaults?: Parameters<typeof createExecTool>[0];
executeArgs?: Partial<Parameters<ReturnType<typeof createExecTool>["execute"]>[1]>;
cleanup?: () => void;
}) {
const tool = createExecTool({
security: "full",
ask: "off",
allowBackground: false,
...params.toolDefaults,
});
try {
const executeArgs = params.executeArgs ?? { workdir: params.workdir };
const result = await tool.execute("call-unavailable-workdir", {
command: "echo should-not-run",
...executeArgs,
});
const text = requireTextContent(result);
expect(text).toContain(`workdir "${params.workdir}" is unavailable or not a directory`);
expect(text).toContain("command was not executed");
expect(text).toContain("workdir is treated as a literal path");
expect(text).toContain('shell expansions such as "~" are not applied');
const details = requireFailedDetails(result.details);
expect(details.exitCode).toBeNull();
expect(details.timedOut).toBe(false);
expect(details.aggregated).toBe("");
expect(details.cwd).toBe(params.workdir);
expect(supervisorMock.spawn).not.toHaveBeenCalled();
} finally {
params.cleanup?.();
}
}
describe("exec foreground failures", () => {
let envSnapshot: ReturnType<typeof captureEnv> | undefined;
@@ -67,6 +133,7 @@ describe("exec foreground failures", () => {
vi.useRealTimers();
envSnapshot?.restore();
envSnapshot = undefined;
tempDirs.cleanup();
});
it("returns a failed text result when the default timeout is exceeded", async () => {
@@ -144,4 +211,374 @@ describe("exec foreground failures", () => {
);
}
});
it("returns a failed result for unavailable explicit host workdirs before launching", async () => {
const missingWorkdir = path.join(
os.tmpdir(),
`openclaw-missing-workdir-${process.pid}-${Date.now()}`,
);
fs.rmSync(missingWorkdir, { recursive: true, force: true });
const fileWorkdir = path.join(
os.tmpdir(),
`openclaw-file-workdir-${process.pid}-${Date.now()}`,
);
fs.writeFileSync(fileWorkdir, "not a directory");
try {
for (const workdir of [missingWorkdir, " ", fileWorkdir]) {
await expectUnavailableWorkdir({ workdir });
supervisorMock.spawn.mockClear();
}
} finally {
fs.rmSync(fileWorkdir, { force: true });
}
});
it("returns a failed result for unavailable configured host workdirs before launching", async () => {
const missingDefaultWorkdir = path.join(
os.tmpdir(),
`openclaw-missing-default-workdir-${process.pid}-${Date.now()}`,
);
fs.rmSync(missingDefaultWorkdir, { recursive: true, force: true });
await expectUnavailableWorkdir({
workdir: missingDefaultWorkdir,
toolDefaults: { cwd: missingDefaultWorkdir },
executeArgs: {},
});
});
it("returns a failed result when the current gateway cwd is unavailable", async () => {
const cwdSpy = vi.spyOn(process, "cwd").mockImplementation(() => {
throw new Error("current cwd unavailable");
});
try {
await expectUnavailableWorkdir({
workdir: "current working directory",
executeArgs: {},
});
} finally {
cwdSpy.mockRestore();
}
});
it("returns a failed result for unavailable configured sandbox workdirs before launching", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
try {
await expectUnavailableWorkdir({
workdir: "/workspace/missing",
toolDefaults: {
cwd: "/workspace/missing",
host: "sandbox",
sandbox: {
containerName: "sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/workspace",
},
},
executeArgs: {},
});
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("defaults omitted sandbox workdirs to the sandbox workspace", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
mockSuccessfulSpawn();
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/workspace",
},
});
try {
const result = await tool.execute("call-sandbox-default-workdir", {
command: "echo ok",
});
expect(result.details.status).toBe("completed");
expect(result.details.cwd).toBe(workspaceDir);
expect(supervisorMock.spawn).toHaveBeenCalledOnce();
const input = supervisorMock.spawn.mock.calls[0]?.[0];
expect(input?.cwd).toBe(workspaceDir);
expect(input?.mode).toBe("child");
if (input?.mode === "child") {
expect(input.argv).toContain("-w");
expect(input.argv).toContain("/workspace");
}
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("lets backend-validated sandbox workdirs reach the backend without host stat fallback", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
const buildExecSpec = vi.fn<NonNullable<BashSandboxConfig["buildExecSpec"]>>(
async (params) => ({
argv: ["remote-shell", params.command],
env: {},
stdinMode: "pipe-open" as const,
}),
);
const validateWorkdir = vi.fn<NonNullable<BashSandboxConfig["validateWorkdir"]>>(
async (workdir) => workdir,
);
mockSuccessfulSpawn();
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
buildExecSpec,
},
});
try {
const result = await tool.execute("call-remote-sandbox-workdir", {
command: "echo ok",
workdir: "/remote/workspace/generated",
});
expect(result.details.status).toBe("completed");
expect(result.details.cwd).toBe(workspaceDir);
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/generated");
expect(buildExecSpec).toHaveBeenCalledOnce();
expect(buildExecSpec.mock.calls[0]?.[0]?.workdir).toBe("/remote/workspace/generated");
expect(supervisorMock.spawn).toHaveBeenCalledOnce();
expect(supervisorMock.spawn.mock.calls[0]?.[0]?.cwd).toBe(workspaceDir);
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("rejects unsafe commands before backend workdir validation", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
const buildExecSpec = vi.fn<NonNullable<BashSandboxConfig["buildExecSpec"]>>(
async (params) => ({
argv: ["remote-shell", params.command],
env: {},
stdinMode: "pipe-open" as const,
}),
);
const validateWorkdir = vi.fn<NonNullable<BashSandboxConfig["validateWorkdir"]>>(
async (workdir) => workdir,
);
const discardPreparedWorkdir =
vi.fn<NonNullable<BashSandboxConfig["discardPreparedWorkdir"]>>();
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
discardPreparedWorkdir,
buildExecSpec,
},
});
try {
await expect(
tool.execute("call-remote-sandbox-rejected-command", {
command: "/approve approval-1 deny",
workdir: "/remote/workspace/generated",
}),
).rejects.toThrow("exec cannot run /approve commands");
expect(validateWorkdir).not.toHaveBeenCalled();
expect(discardPreparedWorkdir).not.toHaveBeenCalled();
expect(buildExecSpec).not.toHaveBeenCalled();
expect(supervisorMock.spawn).not.toHaveBeenCalled();
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("does not preflight remote-only backend workdirs from the local workspace root", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
fs.writeFileSync(path.join(workspaceDir, "script.py"), "print($TOKEN)\n");
const buildExecSpec = vi.fn<NonNullable<BashSandboxConfig["buildExecSpec"]>>(
async (params) => ({
argv: ["remote-shell", params.command],
env: {},
stdinMode: "pipe-open" as const,
}),
);
const validateWorkdir = vi.fn<NonNullable<BashSandboxConfig["validateWorkdir"]>>(
async (workdir) => workdir,
);
mockSuccessfulSpawn();
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
buildExecSpec,
},
});
try {
const result = await tool.execute("call-remote-only-script", {
command: "python script.py",
workdir: "/remote/workspace/generated",
});
expect(result.details.status).toBe("completed");
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/generated");
expect(buildExecSpec).toHaveBeenCalledOnce();
expect(buildExecSpec.mock.calls[0]?.[0]?.workdir).toBe("/remote/workspace/generated");
expect(supervisorMock.spawn).toHaveBeenCalledOnce();
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("uses the mapped host cwd for existing relative backend-validated sandbox workdirs", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
const srcDir = path.join(workspaceDir, "src");
fs.mkdirSync(srcDir);
const buildExecSpec = vi.fn<NonNullable<BashSandboxConfig["buildExecSpec"]>>(
async (params) => ({
argv: ["remote-shell", params.command],
env: {},
stdinMode: "pipe-open" as const,
}),
);
const validateWorkdir = vi.fn<NonNullable<BashSandboxConfig["validateWorkdir"]>>(
async (workdir) => workdir,
);
mockSuccessfulSpawn();
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
buildExecSpec,
},
});
try {
const result = await tool.execute("call-relative-remote-sandbox-workdir", {
command: "echo ok",
workdir: "src",
});
expect(result.details.status).toBe("completed");
expect(result.details.cwd).toBe(srcDir);
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/src");
expect(buildExecSpec).toHaveBeenCalledOnce();
expect(buildExecSpec.mock.calls[0]?.[0]?.workdir).toBe("/remote/workspace/src");
expect(supervisorMock.spawn).toHaveBeenCalledOnce();
expect(supervisorMock.spawn.mock.calls[0]?.[0]?.cwd).toBe(srcDir);
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("fails backend-validated sandbox workdirs before launch when backend validation rejects", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
const validateWorkdir = vi.fn<NonNullable<BashSandboxConfig["validateWorkdir"]>>(
async () => null,
);
const buildExecSpec = vi.fn<NonNullable<BashSandboxConfig["buildExecSpec"]>>(
async (params) => ({
argv: ["remote-shell", params.command],
env: {},
stdinMode: "pipe-open" as const,
}),
);
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
allowBackground: false,
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
buildExecSpec,
},
});
try {
const result = await tool.execute("call-remote-sandbox-workdir", {
command: "echo ok",
workdir: "/remote/workspace/generated",
});
expect(result.details).toMatchObject({
status: "failed",
cwd: "/remote/workspace/generated",
});
expect(JSON.stringify(result)).toContain("unavailable or not a directory");
expect(validateWorkdir).toHaveBeenCalledOnce();
expect(buildExecSpec).not.toHaveBeenCalled();
expect(supervisorMock.spawn).not.toHaveBeenCalled();
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
}
});
it("returns a failed result for unavailable explicit sandbox workdirs before launching a command", async () => {
const workspaceDir = tempDirs.make("openclaw-sandbox-workdir-");
const outsideDir = tempDirs.make("openclaw-outside-workdir-");
fs.writeFileSync(path.join(workspaceDir, "not-dir"), "not a directory");
try {
for (const workdir of ["/workspace/missing", " ", "/workspace/not-dir", outsideDir]) {
await expectUnavailableWorkdir({
workdir,
toolDefaults: {
host: "sandbox",
sandbox: {
containerName: "sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/workspace",
},
},
});
supervisorMock.spawn.mockClear();
}
} finally {
fs.rmSync(workspaceDir, { recursive: true, force: true });
fs.rmSync(outsideDir, { recursive: true, force: true });
}
});
});

View File

@@ -2716,7 +2716,10 @@ describe("executeNodeHostCommand", () => {
expect(requireGatewayCommand("system.run.prepare").params?.params?.env).toEqual({
FOO: "bar",
});
expect(requireRunParams(requireGatewayCommand("system.run")).env).toEqual({ FOO: "bar" });
expect(requireGatewayCommand("system.run.prepare").params?.params?.cwd).toBe("/tmp/work");
const runParams = requireRunParams(requireGatewayCommand("system.run"));
expect(runParams.env).toEqual({ FOO: "bar" });
expect(runParams.cwd).toBe("/tmp/work");
const evalEnvs = evaluateShellAllowlistMock.mock.calls.map(
([raw]) => (raw as ShellAllowlistMockParams).env,
);
@@ -2745,12 +2748,31 @@ describe("executeNodeHostCommand", () => {
const runParams = requireRunParams(call);
expect(runParams.command).toEqual(["/bin/sh", "-lc", "bun ./script.ts"]);
expect(runParams.rawCommand).toBe("bun ./script.ts");
expect(runParams.cwd).toBe("/tmp/work");
expect(typeof runParams.runId).toBe("string");
expect(runParams.suppressNotifyOnExit).toBe(true);
expect(runParams.timeoutMs).toBe(30_000);
expect(Object.hasOwn(runParams, "systemRunPlan")).toBe(false);
});
it("omits cwd from direct node system.run when workdir is undefined", async () => {
await executeNodeHostCommand({
command: "bun ./script.ts",
workdir: undefined,
env: {},
security: "full",
ask: "off",
defaultTimeoutSec: 30,
approvalRunningNoticeMs: 0,
warnings: [],
agentId: "requested-agent",
sessionKey: "requested-session",
});
const runParams = requireRunParams(requireGatewayCall(0));
expect(Object.hasOwn(runParams, "cwd")).toBe(false);
});
it("rejects disconnected node targets before invoking system.run", async () => {
listNodesMock.mockResolvedValueOnce([
{

View File

@@ -0,0 +1,616 @@
/**
* Exec workdir resolver tests.
* Verifies cwd selection and validation before exec launches or remote node
* forwarding.
*/
import { mkdir, mkdtemp, rm, symlink, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import { resolveExecWorkdir } from "./bash-tools.exec-workdir.js";
import type { BashSandboxConfig } from "./bash-tools.shared.js";
async function withTempDir(run: (dir: string) => Promise<void>) {
const dir = await mkdtemp(path.join(os.tmpdir(), "openclaw-exec-workdir-"));
try {
await run(dir);
} finally {
await rm(dir, { recursive: true, force: true });
}
}
function sandboxConfig(workspaceDir: string): BashSandboxConfig {
return {
containerName: "sandbox-workdir-test",
workspaceDir,
containerWorkdir: "/workspace",
};
}
function backendSandboxConfig(
workspaceDir: string,
params?: {
containerWorkdir?: string;
workdirRoots?: readonly string[];
validateWorkdir?: BashSandboxConfig["validateWorkdir"];
},
): BashSandboxConfig {
return {
...sandboxConfig(workspaceDir),
containerWorkdir: params?.containerWorkdir ?? "/remote/workspace",
workdirValidation: "backend",
workdirRoots: params?.workdirRoots,
validateWorkdir: params?.validateWorkdir ?? (async (workdir) => workdir),
};
}
describe("resolveExecWorkdir", () => {
afterEach(() => {
vi.restoreAllMocks();
});
it("rejects blank explicit local workdirs", async () => {
await expect(
resolveExecWorkdir({
host: "gateway",
workdir: " ",
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: " " });
});
it("rejects missing explicit local workdirs without fallback", async () => {
await withTempDir(async (workspaceDir) => {
const missing = path.join(workspaceDir, "missing");
await expect(
resolveExecWorkdir({
host: "gateway",
workdir: missing,
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: missing });
});
});
it("rejects file explicit local workdirs", async () => {
await withTempDir(async (workspaceDir) => {
const fileWorkdir = path.join(workspaceDir, "not-dir");
await writeFile(fileWorkdir, "not a directory");
await expect(
resolveExecWorkdir({
host: "gateway",
workdir: fileWorkdir,
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: fileWorkdir });
});
});
it("resolves valid explicit local workdirs", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "gateway",
workdir: ` ${workspaceDir} `,
}),
).resolves.toEqual({ kind: "local", hostCwd: workspaceDir });
});
});
it("uses configured local cwd when workdir is omitted", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "gateway",
defaultCwd: workspaceDir,
}),
).resolves.toEqual({ kind: "local", hostCwd: workspaceDir });
});
});
it("uses current cwd for omitted local workdir only when no default exists", async () => {
await withTempDir(async (workspaceDir) => {
vi.spyOn(process, "cwd").mockReturnValue(workspaceDir);
await expect(
resolveExecWorkdir({
host: "gateway",
}),
).resolves.toEqual({ kind: "local", hostCwd: workspaceDir });
});
});
it("fails omitted local workdir when current cwd is unavailable", async () => {
vi.spyOn(process, "cwd").mockImplementation(() => {
throw new Error("cwd unavailable");
});
await expect(
resolveExecWorkdir({
host: "gateway",
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "current working directory" });
});
it("rejects missing configured local cwd without falling back to current cwd", async () => {
await withTempDir(async (workspaceDir) => {
const missingDefault = path.join(workspaceDir, "missing-default");
vi.spyOn(process, "cwd").mockReturnValue(workspaceDir);
await expect(
resolveExecWorkdir({
host: "gateway",
defaultCwd: missingDefault,
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: missingDefault });
});
});
it("uses the sandbox workspace when sandbox workdir is omitted", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/workspace",
scriptPreflightCwd: workspaceDir,
});
});
});
it("rejects missing explicit sandbox workdirs", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/workspace/missing",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "/workspace/missing" });
});
});
it("rejects missing configured sandbox workdirs", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
defaultCwd: "/workspace/missing",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "/workspace/missing" });
});
});
it("rejects file sandbox workdirs", async () => {
await withTempDir(async (workspaceDir) => {
await writeFile(path.join(workspaceDir, "not-dir"), "not a directory");
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/workspace/not-dir",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "/workspace/not-dir" });
});
});
it("rejects sandbox workdirs that escape the workspace", async () => {
await withTempDir(async (workspaceDir) => {
await withTempDir(async (outsideDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: outsideDir,
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: outsideDir });
});
});
});
it("rejects sandbox workdirs with parent-directory segments", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "missing/..",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "missing/.." });
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/workspace/missing/..",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "/workspace/missing/.." });
});
});
it("rejects sandbox workdir symlinks that escape the workspace", async () => {
await withTempDir(async (workspaceDir) => {
await withTempDir(async (outsideDir) => {
await symlink(outsideDir, path.join(workspaceDir, "escape"), "dir");
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/workspace/escape",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: "/workspace/escape" });
});
});
});
it("resolves relative sandbox workdirs under the workspace", async () => {
await withTempDir(async (workspaceDir) => {
const srcDir = path.join(workspaceDir, "src");
await mkdir(srcDir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "src",
sandbox: sandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: srcDir,
containerCwd: "/workspace/src",
scriptPreflightCwd: srcDir,
});
});
});
it("supports custom sandbox container workdir prefixes", async () => {
await withTempDir(async (workspaceDir) => {
const projectDir = path.join(workspaceDir, "project");
await mkdir(projectDir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/sandbox-root/project",
sandbox: {
...sandboxConfig(workspaceDir),
containerWorkdir: "/sandbox-root",
},
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: projectDir,
containerCwd: "/sandbox-root/project",
scriptPreflightCwd: projectDir,
});
});
});
it("lets backend-validated sandboxes use remote-only container workdirs", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/remote/workspace/generated",
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/remote/workspace/generated",
scriptPreflightCwd: null,
});
});
});
it("normalizes backend-validated sandbox workdir roots with trailing slashes", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/remote/workspace/generated",
sandbox: backendSandboxConfig(workspaceDir, {
containerWorkdir: "/remote/workspace/",
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/remote/workspace/generated",
scriptPreflightCwd: null,
});
});
});
it("lets backend-validated sandboxes use declared alternate remote roots", async () => {
await withTempDir(async (workspaceDir) => {
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/agent/project",
sandbox: backendSandboxConfig(workspaceDir, {
workdirRoots: ["/agent"],
validateWorkdir,
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/agent/project",
scriptPreflightCwd: null,
});
expect(validateWorkdir).toHaveBeenCalledWith("/agent/project");
});
});
it("resolves relative backend-validated sandbox workdirs under the remote workspace", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "remote-only",
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/remote/workspace/remote-only",
scriptPreflightCwd: null,
});
});
});
it("keeps existing relative backend-validated sandbox workdirs aligned with the local mirror", async () => {
await withTempDir(async (workspaceDir) => {
const localDir = path.join(workspaceDir, "src");
await mkdir(localDir);
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "src",
sandbox: backendSandboxConfig(workspaceDir, { validateWorkdir }),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: localDir,
containerCwd: "/remote/workspace/src",
scriptPreflightCwd: localDir,
});
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/src");
});
});
it("defers stale relative backend-validated sandbox workdirs to the backend", async () => {
await withTempDir(async (workspaceDir) => {
const localFile = path.join(workspaceDir, "build");
await writeFile(localFile, "stale local mirror file");
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "build",
sandbox: backendSandboxConfig(workspaceDir, { validateWorkdir }),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/remote/workspace/build",
scriptPreflightCwd: null,
});
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/build");
});
});
it("accepts backend-validated absolute workdirs when the remote workspace root is slash", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/generated",
sandbox: backendSandboxConfig(workspaceDir, {
containerWorkdir: "/",
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/generated",
scriptPreflightCwd: null,
});
});
});
it("maps host workspace paths for backend-validated sandboxes when they exist locally", async () => {
await withTempDir(async (workspaceDir) => {
const localDir = path.join(workspaceDir, "src");
await mkdir(localDir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: localDir,
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: localDir,
containerCwd: "/remote/workspace/src",
scriptPreflightCwd: localDir,
});
});
});
it("defers missing absolute backend workdirs to remote validation when roots overlap", async () => {
await withTempDir(async (workspaceDir) => {
const missingRemoteDir = path.join(workspaceDir, "generated");
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: missingRemoteDir,
sandbox: backendSandboxConfig(workspaceDir, {
containerWorkdir: workspaceDir,
validateWorkdir,
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: missingRemoteDir,
scriptPreflightCwd: null,
});
expect(validateWorkdir).toHaveBeenCalledWith(missingRemoteDir);
});
});
it("maps missing absolute host workspace paths before backend validation", async () => {
await withTempDir(async (workspaceDir) => {
const missingRemoteDir = path.join(workspaceDir, "generated");
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: missingRemoteDir,
sandbox: backendSandboxConfig(workspaceDir, {
validateWorkdir,
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: workspaceDir,
containerCwd: "/remote/workspace/generated",
scriptPreflightCwd: null,
});
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/generated");
});
});
it("rejects backend-validated sandbox host paths that symlink outside the workspace", async () => {
await withTempDir(async (workspaceDir) => {
await withTempDir(async (outsideDir) => {
const escape = path.join(workspaceDir, "escape");
await symlink(outsideDir, escape, "dir");
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: escape,
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "unavailable",
requestedCwd: escape,
});
});
});
});
it("prefers existing host workspace paths over matching backend container prefixes", async () => {
await withTempDir(async (workspaceDir) => {
const localDir = path.join(workspaceDir, "src");
await mkdir(localDir);
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: localDir,
sandbox: backendSandboxConfig(workspaceDir, {
containerWorkdir: workspaceDir,
}),
}),
).resolves.toEqual({
kind: "sandbox",
hostCwd: localDir,
containerCwd: `${workspaceDir}/src`,
scriptPreflightCwd: localDir,
});
});
});
it("rejects backend-validated sandbox workdirs outside local and remote workspace roots", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/other/remote/workspace",
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "unavailable",
requestedCwd: "/other/remote/workspace",
});
});
});
it("rejects backend-validated sandbox workdirs with parent-directory segments", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/remote/workspace/missing/..",
sandbox: backendSandboxConfig(workspaceDir),
}),
).resolves.toEqual({
kind: "unavailable",
requestedCwd: "/remote/workspace/missing/..",
});
});
});
it("rejects backend-validated sandbox workdirs when the backend validator fails", async () => {
await withTempDir(async (workspaceDir) => {
await expect(
resolveExecWorkdir({
host: "sandbox",
workdir: "/remote/workspace/missing",
sandbox: backendSandboxConfig(workspaceDir, {
validateWorkdir: async () => null,
}),
}),
).resolves.toEqual({
kind: "unavailable",
requestedCwd: "/remote/workspace/missing",
});
});
});
it("omits node cwd when node workdir is omitted", async () => {
await expect(
resolveExecWorkdir({
host: "node",
defaultCwd: "/gateway/default",
}),
).resolves.toEqual({ kind: "node" });
});
it("forwards explicit node cwd without local validation", async () => {
await expect(
resolveExecWorkdir({
host: "node",
workdir: "/remote/node/workspace",
defaultCwd: "/gateway/default",
}),
).resolves.toEqual({ kind: "node", remoteCwd: "/remote/node/workspace" });
});
it("rejects blank explicit node workdirs", async () => {
await expect(
resolveExecWorkdir({
host: "node",
workdir: " ",
}),
).resolves.toEqual({ kind: "unavailable", requestedCwd: " " });
});
});

View File

@@ -0,0 +1,381 @@
/**
* Internal exec workdir resolver.
* Owns cwd selection and validation before exec approval, hooks, preflight, or
* process launch can observe an invalid selected working directory.
*/
import fs from "node:fs/promises";
import path from "node:path";
import { normalizeOptionalString } from "@openclaw/normalization-core/string-coerce";
import type { ExecHost } from "../infra/exec-approvals.js";
import { safeStatSync } from "../infra/path-guards.js";
import type { BashSandboxConfig } from "./bash-tools.shared.js";
import { assertSandboxPath } from "./sandbox-paths.js";
export type ExecWorkdirResolution =
| { kind: "local"; hostCwd: string }
| { kind: "sandbox"; hostCwd: string; containerCwd: string; scriptPreflightCwd: string | null }
| { kind: "node"; remoteCwd?: string }
| { kind: "unavailable"; requestedCwd: string };
type NormalizedWorkdirInput =
| { kind: "omitted" }
| { kind: "blank"; raw: string }
| { kind: "specified"; value: string };
type SandboxWorkdir = {
hostCwd: string;
containerCwd: string;
scriptPreflightCwd: string | null;
};
type BackendHostWorkdirCandidate = {
hostPath: string;
failIfInvalid: boolean;
};
type ExistingHostWorkspacePathResult =
| { kind: "available"; workdir: SandboxWorkdir }
| { kind: "missing"; relative: string }
| { kind: "invalid" };
function normalizeExplicitWorkdirInput(workdir: string | undefined): NormalizedWorkdirInput {
if (workdir === undefined) {
return { kind: "omitted" };
}
const value = normalizeOptionalString(workdir);
return value ? { kind: "specified", value } : { kind: "blank", raw: workdir };
}
function unavailable(requestedCwd: string): ExecWorkdirResolution {
return { kind: "unavailable", requestedCwd };
}
function resolveExistingHostWorkdir(workdir: string): string | null {
const stats = safeStatSync(workdir);
return stats?.isDirectory() ? workdir : null;
}
function isHostPathInsideRoot(params: { root: string; candidate: string }): boolean {
const root = path.resolve(params.root);
const candidate = path.resolve(params.candidate);
const relative = path.relative(root, candidate);
return relative === "" || (!relative.startsWith("..") && !path.isAbsolute(relative));
}
function safeCurrentCwd(): string | null {
try {
return process.cwd();
} catch {
return null;
}
}
function mapContainerWorkdirToHost(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): string | undefined {
const workdir = normalizeContainerPath(params.workdir);
const containerRoot = normalizeContainerPath(params.sandbox.containerWorkdir);
if (containerRoot === ".") {
return undefined;
}
if (workdir === containerRoot) {
return path.resolve(params.sandbox.workspaceDir);
}
if (!workdir.startsWith(`${containerRoot}/`)) {
return undefined;
}
const rel = workdir
.slice(containerRoot.length + 1)
.split("/")
.filter(Boolean);
return path.resolve(params.sandbox.workspaceDir, ...rel);
}
function normalizeContainerPath(input: string): string {
const normalized = input.trim().replace(/\\/g, "/");
if (!normalized) {
return ".";
}
const posixPath = path.posix.normalize(normalized);
return posixPath === "/" ? posixPath : posixPath.replace(/\/+$/g, "");
}
function joinContainerWorkdir(containerWorkdir: string, relative: string): string {
return relative ? path.posix.join(containerWorkdir, relative) : containerWorkdir;
}
function hasParentPathSegment(input: string): boolean {
return input
.replace(/\\/g, "/")
.split("/")
.some((segment) => segment === "..");
}
function isContainerWorkdirInsideRoot(params: { root: string; workdir: string }): boolean {
const root = normalizeContainerPath(params.root);
const workdir = normalizeContainerPath(params.workdir);
if (root === "/") {
return path.posix.isAbsolute(workdir);
}
return workdir === root || workdir.startsWith(`${root}/`);
}
function resolveBackendWorkdirRoots(sandbox: BashSandboxConfig): string[] {
const roots: string[] = [];
const addRoot = (root: string | undefined) => {
const normalized = normalizeContainerPath(root ?? "");
if (normalized === "." || !path.posix.isAbsolute(normalized) || roots.includes(normalized)) {
return;
}
roots.push(normalized);
};
addRoot(sandbox.containerWorkdir);
for (const root of sandbox.workdirRoots ?? []) {
addRoot(root);
}
return roots;
}
function resolveBackendContainerWorkdir(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): string | null {
const containerRoot = normalizeContainerPath(params.sandbox.containerWorkdir);
const backendRoots = resolveBackendWorkdirRoots(params.sandbox);
const requested = normalizeContainerPath(params.workdir);
if (path.posix.isAbsolute(requested)) {
return backendRoots.some((root) => isContainerWorkdirInsideRoot({ root, workdir: requested }))
? requested
: null;
}
if (requested === ".." || requested.startsWith("../")) {
return null;
}
return joinContainerWorkdir(containerRoot, requested === "." ? "" : requested);
}
async function mapExistingHostWorkspacePath(params: {
hostPath: string;
sandbox: BashSandboxConfig;
}): Promise<ExistingHostWorkspacePathResult> {
let resolved: Awaited<ReturnType<typeof assertSandboxPath>>;
try {
resolved = await assertSandboxPath({
filePath: params.hostPath,
cwd: params.sandbox.workspaceDir,
root: params.sandbox.workspaceDir,
});
} catch {
return { kind: "invalid" };
}
const stats = safeStatSync(resolved.resolved);
if (!stats) {
return {
kind: "missing",
relative: resolved.relative ? resolved.relative.split(path.sep).join(path.posix.sep) : "",
};
}
if (!stats.isDirectory()) {
return { kind: "invalid" };
}
const relative = resolved.relative ? resolved.relative.split(path.sep).join(path.posix.sep) : "";
return {
kind: "available",
workdir: {
hostCwd: resolved.resolved,
containerCwd: joinContainerWorkdir(params.sandbox.containerWorkdir, relative),
scriptPreflightCwd: resolved.resolved,
},
};
}
async function validateBackendWorkdir(params: {
workdir: SandboxWorkdir;
sandbox: BashSandboxConfig;
}): Promise<SandboxWorkdir | null> {
const containerCwd = await params.sandbox.validateWorkdir?.(params.workdir.containerCwd);
return containerCwd
? {
hostCwd: params.workdir.hostCwd,
containerCwd,
scriptPreflightCwd: params.workdir.scriptPreflightCwd,
}
: null;
}
function resolveBackendHostWorkdirCandidate(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): BackendHostWorkdirCandidate | null {
if (!path.isAbsolute(params.workdir)) {
return {
hostPath: path.resolve(params.sandbox.workspaceDir, params.workdir),
failIfInvalid: false,
};
}
const hostPath = path.resolve(params.workdir);
if (
isHostPathInsideRoot({
root: params.sandbox.workspaceDir,
candidate: hostPath,
})
) {
return { hostPath, failIfInvalid: true };
}
const containerMappedHostPath = mapContainerWorkdirToHost({
workdir: params.workdir,
sandbox: params.sandbox,
});
return containerMappedHostPath
? { hostPath: containerMappedHostPath, failIfInvalid: false }
: null;
}
async function resolveBackendValidatedSandboxWorkdir(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): Promise<SandboxWorkdir | null> {
const workspaceHostCwd = resolveExistingHostWorkdir(params.sandbox.workspaceDir);
if (!workspaceHostCwd) {
return null;
}
const hostCandidate = resolveBackendHostWorkdirCandidate(params);
if (hostCandidate) {
const mappedWorkdir = await mapExistingHostWorkspacePath({
hostPath: hostCandidate.hostPath,
sandbox: params.sandbox,
});
if (mappedWorkdir.kind === "available") {
return await validateBackendWorkdir({
workdir: mappedWorkdir.workdir,
sandbox: params.sandbox,
});
}
if (mappedWorkdir.kind === "missing") {
return await validateBackendWorkdir({
workdir: {
hostCwd: workspaceHostCwd,
containerCwd: joinContainerWorkdir(
params.sandbox.containerWorkdir,
mappedWorkdir.relative,
),
scriptPreflightCwd: null,
},
sandbox: params.sandbox,
});
}
if (hostCandidate.failIfInvalid && mappedWorkdir.kind === "invalid") {
return null;
}
}
const containerCwd = resolveBackendContainerWorkdir(params);
if (containerCwd) {
return await validateBackendWorkdir({
workdir: {
hostCwd: workspaceHostCwd,
containerCwd,
scriptPreflightCwd: null,
},
sandbox: params.sandbox,
});
}
return null;
}
async function resolveHostValidatedSandboxWorkdir(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): Promise<SandboxWorkdir | null> {
const mappedHostWorkdir = mapContainerWorkdirToHost({
workdir: params.workdir,
sandbox: params.sandbox,
});
const candidateWorkdir = mappedHostWorkdir ?? params.workdir;
try {
const resolved = await assertSandboxPath({
filePath: candidateWorkdir,
cwd: params.sandbox.workspaceDir,
root: params.sandbox.workspaceDir,
});
const stats = await fs.stat(resolved.resolved);
if (!stats.isDirectory()) {
return null;
}
const relative = resolved.relative
? resolved.relative.split(path.sep).join(path.posix.sep)
: "";
const containerCwd = joinContainerWorkdir(params.sandbox.containerWorkdir, relative);
return { hostCwd: resolved.resolved, containerCwd, scriptPreflightCwd: resolved.resolved };
} catch {
return null;
}
}
async function resolveSandboxWorkdir(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): Promise<SandboxWorkdir | null> {
if (hasParentPathSegment(params.workdir)) {
return null;
}
if (params.sandbox.workdirValidation === "backend") {
return await resolveBackendValidatedSandboxWorkdir(params);
}
return await resolveHostValidatedSandboxWorkdir(params);
}
export function formatUnavailableWorkdirFailure(workdir: string): string {
return [
`workdir "${workdir}" is unavailable or not a directory: command was not executed.`,
'workdir is treated as a literal path; shell expansions such as "~" are not applied.',
"Use an existing directory, omit an explicit workdir to use the default cwd, or update the configured default cwd.",
].join(" ");
}
export async function resolveExecWorkdir(params: {
host: ExecHost;
workdir?: string;
defaultCwd?: string;
sandbox?: BashSandboxConfig;
}): Promise<ExecWorkdirResolution> {
const explicitWorkdir = normalizeExplicitWorkdirInput(params.workdir);
if (explicitWorkdir.kind === "blank") {
return unavailable(explicitWorkdir.raw);
}
if (params.host === "node") {
return explicitWorkdir.kind === "specified"
? { kind: "node", remoteCwd: explicitWorkdir.value }
: { kind: "node" };
}
const defaultCwd = normalizeOptionalString(params.defaultCwd);
if (params.host === "sandbox") {
const sandbox = params.sandbox;
if (!sandbox) {
throw new Error("exec internal error: sandbox workdir resolution requires sandbox config");
}
const requestedCwd =
explicitWorkdir.kind === "specified"
? explicitWorkdir.value
: (defaultCwd ?? sandbox.containerWorkdir);
const resolved = await resolveSandboxWorkdir({ workdir: requestedCwd, sandbox });
return resolved
? {
kind: "sandbox",
hostCwd: resolved.hostCwd,
containerCwd: resolved.containerCwd,
scriptPreflightCwd: resolved.scriptPreflightCwd,
}
: unavailable(requestedCwd);
}
const requestedCwd =
explicitWorkdir.kind === "specified" ? explicitWorkdir.value : (defaultCwd ?? safeCurrentCwd());
if (!requestedCwd) {
return unavailable("current working directory");
}
const resolved = resolveExistingHostWorkdir(requestedCwd);
return resolved ? { kind: "local", hostCwd: resolved } : unavailable(requestedCwd);
}

View File

@@ -216,6 +216,29 @@ describe("exec PATH login shell merge", () => {
}
});
it("fails without running when an explicit workdir is unavailable", async () => {
const missingWorkdir = path.join(
os.tmpdir(),
`openclaw-missing-workdir-${process.pid}-${Date.now()}`,
);
fs.rmSync(missingWorkdir, { recursive: true, force: true });
const tool = createExecTool({ host: "gateway", security: "full", ask: "off" });
const result = await tool.execute("call-missing-workdir", {
command: "echo ok",
workdir: missingWorkdir,
yieldMs: FOREGROUND_TEST_YIELD_MS,
});
const value = normalizeText(result.content.find((c) => c.type === "text")?.text);
expect(result.details?.status).toBe("failed");
expect(value).toContain(`workdir "${missingWorkdir}" is unavailable or not a directory`);
expect(value).toContain("command was not executed");
expect(value).toContain("workdir is treated as a literal path");
expect(value).toContain('shell expansions such as "~" are not applied');
expect(value).not.toMatch(/^ok/);
});
it("merges login-shell PATH for host=gateway", async () => {
if (isWin) {
return;

View File

@@ -5,6 +5,7 @@
*/
import { beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { OPENCLAW_CLI_ENV_VALUE } from "../infra/openclaw-exec-env.js";
import type { ExecuteNodeHostCommandParams } from "./bash-tools.exec-host-node.types.js";
import type { ExtensionContext } from "./sessions/index.js";
declare module "../plugins/hook-types.js" {
@@ -14,6 +15,10 @@ declare module "../plugins/hook-types.js" {
}
const CHANNEL_CONTEXT_ENV_KEY = "OPENCLAW_CHANNEL_CONTEXT";
type CapturedNodeHostParams = Pick<
ExecuteNodeHostCommandParams,
"env" | "requestedEnv" | "workdir"
>;
const mocks = vi.hoisted(() => ({
hookRunner: undefined as
@@ -28,10 +33,7 @@ const mocks = vi.hoisted(() => ({
env: Record<string, string>;
requestedEnv?: Record<string, string>;
}>,
nodeHostParams: [] as Array<{
env: Record<string, string>;
requestedEnv?: Record<string, string>;
}>,
nodeHostParams: [] as CapturedNodeHostParams[],
spawnInputs: [] as Array<{
env?: Record<string, string>;
}>,
@@ -64,10 +66,11 @@ vi.mock("./bash-tools.exec-host-gateway.js", () => ({
vi.mock("./bash-tools.exec-host-node.js", () => ({
executeNodeHostCommand: vi.fn(
async (params: { env: Record<string, string>; requestedEnv?: Record<string, string> }) => {
async (params: Pick<ExecuteNodeHostCommandParams, "env" | "requestedEnv" | "workdir">) => {
mocks.nodeHostParams.push({
env: { ...params.env },
requestedEnv: params.requestedEnv ? { ...params.requestedEnv } : undefined,
workdir: params.workdir,
});
return {
content: [{ type: "text", text: "node ok" }],
@@ -272,6 +275,305 @@ describe("exec resolve_exec_env hook wiring", () => {
expect(mocks.nodeHostParams[0]?.env).not.toHaveProperty("LD_PRELOAD");
});
it("does not forward configured gateway cwd defaults to node host requests", async () => {
const tool = createExecTool({
cwd: "/gateway/default/that/node/cannot/use",
host: "node",
security: "full",
ask: "off",
});
await tool.execute("call-node-default-cwd", {
command: "echo ok",
});
expect(mocks.nodeHostParams[0]?.workdir).toBeUndefined();
});
it("fails blank explicit node host workdirs before node invocation", async () => {
const tool = createExecTool({
host: "node",
security: "full",
ask: "off",
});
const result = await tool.execute("call-node-blank-cwd", {
command: "echo ok",
workdir: " ",
});
const text = result.content.find((entry) => entry.type === "text")?.text ?? "";
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(text).toContain('workdir " " is unavailable or not a directory');
expect(text).toContain("command was not executed");
expect(mocks.nodeHostParams).toHaveLength(0);
});
it("prevalidates node workdirs before resolving exec env when a backend sandbox exists", async () => {
installResolveExecEnvHook({ PLUGIN_SAFE: "yes" });
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
const tool = createExecTool({
host: "node",
security: "full",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const result = await tool.execute("call-node-invalid-cwd-with-backend-sandbox", {
command: "echo ok",
workdir: " ",
});
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(mocks.hookRunner?.runResolveExecEnv).not.toHaveBeenCalled();
expect(validateWorkdir).not.toHaveBeenCalled();
expect(mocks.nodeHostParams).toHaveLength(0);
});
it("fails invalid workdirs before resolving exec env", async () => {
installResolveExecEnvHook({ PLUGIN_SAFE: "yes" });
const tool = createExecTool({
host: "gateway",
security: "full",
ask: "off",
});
const result = await tool.execute("call-invalid-cwd-before-env", {
command: "echo ok",
workdir: " ",
});
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(mocks.hookRunner?.runResolveExecEnv).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("prevalidates gateway workdirs before resolving exec env when a backend sandbox exists", async () => {
installResolveExecEnvHook({ PLUGIN_SAFE: "yes" });
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
const tool = createExecTool({
host: "gateway",
security: "full",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const result = await tool.execute("call-gateway-invalid-cwd-with-backend-sandbox", {
command: "echo ok",
workdir: " ",
});
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(mocks.hookRunner?.runResolveExecEnv).not.toHaveBeenCalled();
expect(validateWorkdir).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("lets before_tool_call see invalid wrapped workdirs before failing unchanged params", async () => {
mocks.hookRunner = {
hasHooks: vi.fn(
(hookName: string) => hookName === "resolve_exec_env" || hookName === "before_tool_call",
),
runResolveExecEnv: vi.fn(async () => ({ PLUGIN_SAFE: "yes" })),
runBeforeToolCall: vi.fn(async () => undefined),
};
const tool = createExecTool({
host: "gateway",
security: "full",
ask: "off",
sessionKey: "agent:main:telegram:chat-1",
});
const [definition] = toToolDefinitions([tool], {
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
});
const result = await definition.execute(
"call-invalid-wrapped-cwd-before-hooks",
{
command: "echo ok",
workdir: " ",
},
undefined,
undefined,
testExtensionContext,
);
const text = result.content.find((entry) => entry.type === "text")?.text ?? "";
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(text).toContain('workdir " " is unavailable or not a directory');
expect(mocks.hookRunner.runBeforeToolCall!).toHaveBeenCalledTimes(1);
expect(mocks.hookRunner.runResolveExecEnv!).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("does not validate backend sandbox workdirs before before_tool_call veto", async () => {
const validateWorkdir = vi.fn(async (workdir: string) => workdir);
mocks.hookRunner = {
hasHooks: vi.fn((hookName: string) => hookName === "before_tool_call"),
runBeforeToolCall: vi.fn(async () => ({
block: true,
blockReason: "blocked by test hook",
})),
};
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const [definition] = toToolDefinitions([tool], {
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
});
const result = await definition.execute(
"call-backend-cwd-vetoed-before-validation",
{
command: "echo ok",
workdir: "/remote/workspace/generated",
},
undefined,
undefined,
testExtensionContext,
);
expect(
result.details as { status?: unknown; deniedReason?: unknown } | undefined,
).toMatchObject({
status: "blocked",
deniedReason: "plugin-before-tool-call",
});
expect(mocks.hookRunner.runBeforeToolCall!).toHaveBeenCalledOnce();
expect(validateWorkdir).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("defers resolve_exec_env for backend sandboxes until workdir validation succeeds", async () => {
const validateWorkdir = vi.fn(async () => null);
mocks.hookRunner = {
hasHooks: vi.fn(
(hookName: string) => hookName === "resolve_exec_env" || hookName === "before_tool_call",
),
runResolveExecEnv: vi.fn(async () => ({ PLUGIN_SAFE: "yes" })),
runBeforeToolCall: vi.fn(async () => undefined),
};
const tool = createExecTool({
host: "sandbox",
security: "full",
ask: "off",
sandbox: {
containerName: "remote-sandbox-workdir-test",
workspaceDir: process.cwd(),
containerWorkdir: "/remote/workspace",
workdirValidation: "backend",
validateWorkdir,
},
});
const [definition] = toToolDefinitions([tool], {
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
});
const result = await definition.execute(
"call-backend-invalid-cwd-before-env",
{
command: "echo ok",
workdir: "/remote/workspace/missing",
},
undefined,
undefined,
testExtensionContext,
);
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(mocks.hookRunner.runBeforeToolCall!).toHaveBeenCalledOnce();
expect(validateWorkdir).toHaveBeenCalledWith("/remote/workspace/missing");
expect(mocks.hookRunner.runResolveExecEnv!).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("lets lazy before_tool_call see invalid workdirs before failing unchanged params", async () => {
mocks.hookRunner = {
hasHooks: vi.fn(
(hookName: string) => hookName === "resolve_exec_env" || hookName === "before_tool_call",
),
runResolveExecEnv: vi.fn(async () => ({ LAZY_PLUGIN_SAFE: "yes" })),
runBeforeToolCall: vi.fn(async () => undefined),
};
const exec = createOpenClawCodingTools({
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
cwd: process.cwd(),
exec: { host: "gateway", security: "full", ask: "off" },
}).find((tool) => tool.name === "exec");
expect(exec).toBeDefined();
const [definition] = toToolDefinitions([exec!], {
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
channelId: "chat-1",
});
const result = await definition.execute(
"call-invalid-lazy-cwd-before-hooks",
{
command: "echo ok",
workdir: " ",
},
undefined,
undefined,
testExtensionContext,
);
const text = result.content.find((entry) => entry.type === "text")?.text ?? "";
expect((result.details as { status?: unknown } | undefined)?.status).toBe("failed");
expect(text).toContain('workdir " " is unavailable or not a directory');
expect(mocks.hookRunner.runBeforeToolCall!).toHaveBeenCalledTimes(1);
expect(mocks.hookRunner.runResolveExecEnv!).not.toHaveBeenCalled();
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("forwards explicit node host workdirs without local gateway validation", async () => {
const remoteWorkdir = "/remote/node/workspace";
const tool = createExecTool({
host: "node",
security: "full",
ask: "off",
});
await tool.execute("call-node-explicit-cwd", {
command: "echo ok",
workdir: remoteWorkdir,
});
expect(mocks.nodeHostParams[0]?.workdir).toBe(remoteWorkdir);
});
it("keeps plugin env out of before_tool_call params before execution", async () => {
mocks.hookRunner = {
hasHooks: vi.fn(
@@ -422,6 +724,57 @@ describe("exec resolve_exec_env hook wiring", () => {
expect(mocks.nodeHostParams[0]?.requestedEnv).not.toHaveProperty("GATEWAY_PLUGIN_SAFE");
});
it("lets before_tool_call reroute gateway-invalid workdirs to node host execution", async () => {
mocks.hookRunner = {
hasHooks: vi.fn(
(hookName: string) => hookName === "resolve_exec_env" || hookName === "before_tool_call",
),
runResolveExecEnv: vi.fn(async (event: { host: "gateway" | "sandbox" | "node" }) =>
event.host === "node" ? { NODE_PLUGIN_SAFE: "node" } : { GATEWAY_PLUGIN_SAFE: "gateway" },
),
runBeforeToolCall: vi.fn(async (event: { params: Record<string, unknown> }) => ({
params: { ...event.params, host: "node" },
})),
};
const tool = createExecTool({
host: "auto",
security: "full",
ask: "off",
sessionKey: "agent:main:telegram:chat-1",
});
const [definition] = toToolDefinitions([tool], {
agentId: "main",
sessionKey: "agent:main:telegram:chat-1",
});
await definition.execute(
"call-host-rewrite-with-remote-cwd",
{
command: "echo ok",
env: { REQUEST_SAFE: "request" },
workdir: "/remote/node/workspace",
},
undefined,
undefined,
testExtensionContext,
);
expect(mocks.hookRunner.runBeforeToolCall!).toHaveBeenCalledOnce();
expect(mocks.hookRunner.runResolveExecEnv!).toHaveBeenCalledOnce();
expect(mocks.hookRunner.runResolveExecEnv!).toHaveBeenCalledWith(
expect.objectContaining({ host: "node" }),
expect.anything(),
);
expect(mocks.nodeHostParams[0]?.requestedEnv).toEqual({
NODE_PLUGIN_SAFE: "node",
REQUEST_SAFE: "request",
});
expect(mocks.nodeHostParams[0]?.workdir).toBe("/remote/node/workspace");
expect(mocks.gatewayParams).toHaveLength(0);
expect(mocks.spawnInputs).toHaveLength(0);
});
it("lets before_tool_call rewrite host when no resolve_exec_env hook is registered", async () => {
mocks.hookRunner = {
hasHooks: vi.fn((hookName: string) => hookName === "before_tool_call"),

View File

@@ -63,23 +63,28 @@ import {
DEFAULT_MAX_OUTPUT,
DEFAULT_PATH,
DEFAULT_PENDING_MAX_OUTPUT,
type ExecProcessHandle,
type ExecProcessOutcome,
applyPathPrepend,
applyShellPath,
normalizePathPrepend,
resolveExecTarget,
resolveApprovalRunningNoticeMs,
buildExecRuntimeErrorOutcome,
runExecProcess,
execSchema,
} from "./bash-tools.exec-runtime.js";
import type { ExecToolDefaults, ExecToolDetails } from "./bash-tools.exec-types.js";
import {
type ExecWorkdirResolution,
formatUnavailableWorkdirFailure,
resolveExecWorkdir,
} from "./bash-tools.exec-workdir.js";
import {
buildSandboxEnv,
clampWithDefault,
coerceEnv,
readEnvInt,
resolveSandboxWorkdir,
resolveWorkdir,
truncateMiddle,
} from "./bash-tools.shared.js";
import { createModelExecAutoReviewer } from "./exec-auto-reviewer.js";
@@ -138,6 +143,15 @@ type ResolvedExecEnvPreparedState = {
pluginEnv?: Record<string, string>;
};
const resolvedExecEnvPreparedStates = new WeakMap<ExecToolArgs, ResolvedExecEnvPreparedState>();
type ResolvedExecWorkdirPreparedState = {
host: ExecHost;
inputWorkdir?: string;
resolution: ExecWorkdirResolution;
};
const resolvedExecWorkdirPreparedStates = new WeakMap<
ExecToolArgs,
ResolvedExecWorkdirPreparedState
>();
const XML_ARG_VALUE_EXEC_PARAM_KEYS = [
"command",
@@ -187,6 +201,20 @@ function isResolveExecEnvPrepared(params: ExecToolArgs): boolean {
return Boolean(getResolvedExecEnvPreparedState(params));
}
function markResolvedExecWorkdirPrepared<T extends ExecToolArgs>(
params: T,
state: ResolvedExecWorkdirPreparedState,
): T {
resolvedExecWorkdirPreparedStates.set(params, state);
return params;
}
function getResolvedExecWorkdirPreparedState(
params: ExecToolArgs,
): ResolvedExecWorkdirPreparedState | undefined {
return resolvedExecWorkdirPreparedStates.get(params);
}
function buildExecForegroundResult(params: {
outcome: ExecProcessOutcome;
cwd?: string;
@@ -1325,6 +1353,62 @@ export function createExecTool(
sandboxAvailable: Boolean(defaults?.sandbox),
}).effectiveHost;
};
const buildUnavailableWorkdirResult = (params: {
cwd: string;
startedAt?: number;
warningText?: string;
}) =>
buildExecForegroundResult({
outcome: buildExecRuntimeErrorOutcome({
error: formatUnavailableWorkdirFailure(params.cwd),
aggregated: "",
durationMs: params.startedAt ? Date.now() - params.startedAt : 0,
}),
cwd: params.cwd,
warningText: params.warningText,
});
const prepareParamsWithResolvedExecWorkdir = async (rawArgs: unknown): Promise<ExecToolArgs> => {
if (typeof rawArgs !== "object" || rawArgs === null || Array.isArray(rawArgs)) {
return rawArgs as ExecToolArgs;
}
const params = stripMalformedXmlArgValueSuffixFromKeys(
rawArgs as ExecToolArgs,
XML_ARG_VALUE_EXEC_PARAM_KEYS,
);
let host: ExecHost;
try {
host = resolveHostForParams(params);
} catch {
return params;
}
if (host === "sandbox" && !defaults?.sandbox) {
return params;
}
if (host === "sandbox" && defaults?.sandbox?.workdirValidation === "backend") {
return params;
}
const resolution = await resolveExecWorkdir({
host,
workdir: params.workdir,
defaultCwd: defaults?.cwd,
sandbox: defaults?.sandbox,
});
return markResolvedExecWorkdirPrepared(params, {
host,
inputWorkdir: params.workdir,
resolution,
});
};
const shouldDeferResolveExecEnvUntilWorkdirValidated = (params: ExecToolArgs): boolean => {
try {
return (
resolveHostForParams(params) === "sandbox" &&
defaults?.sandbox?.workdirValidation === "backend"
);
} catch {
return false;
}
};
const prepareParamsWithResolvedExecEnv = async (
rawArgs: unknown,
context?: { hookContext?: HookContext },
@@ -1385,41 +1469,70 @@ export function createExecTool(
return describeExecTool({ agentId, hasCronTool: defaults?.hasCronTool === true });
},
parameters: execSchema,
prepareBeforeToolCallParams: async (args, context) =>
prepareParamsWithResolvedExecEnv(args, {
prepareBeforeToolCallParams: async (args, context) => {
const params = await prepareParamsWithResolvedExecWorkdir(args);
const workdirState = getResolvedExecWorkdirPreparedState(params);
if (workdirState?.resolution.kind === "unavailable") {
return params;
}
if (shouldDeferResolveExecEnvUntilWorkdirValidated(params)) {
return params;
}
return prepareParamsWithResolvedExecEnv(params, {
hookContext: context.hookContext as HookContext | undefined,
}),
finalizeBeforeToolCallParams: (params, preparedParams) =>
(() => {
const state = getResolvedExecEnvPreparedState(preparedParams as ExecToolArgs);
if (!state) {
return params;
}
const execParams = params as ExecToolArgs;
if (state.host && execParams.command && resolveHostForParams(execParams) !== state.host) {
});
},
finalizeBeforeToolCallParams: (params, preparedParams) => {
const execParams = params as ExecToolArgs;
const envState = getResolvedExecEnvPreparedState(preparedParams as ExecToolArgs);
const workdirState = getResolvedExecWorkdirPreparedState(preparedParams as ExecToolArgs);
if (!envState && !workdirState) {
return params;
}
let host: ExecHost | undefined;
const resolveFinalHost = () => {
host ??= resolveHostForParams(execParams);
return host;
};
try {
if (envState?.host && execParams.command && resolveFinalHost() !== envState.host) {
return { ...execParams };
}
return markResolveExecEnvPrepared(execParams, state);
})(),
execute: async (_toolCallId, args, signal, onUpdate) => {
const params = isResolveExecEnvPrepared(args as ExecToolArgs)
? stripMalformedXmlArgValueSuffixFromKeys(
args as ExecToolArgs,
XML_ARG_VALUE_EXEC_PARAM_KEYS,
)
: await prepareParamsWithResolvedExecEnv(args);
if (!params.command) {
throw new Error("Provide a command to start.");
if (
workdirState &&
(resolveFinalHost() !== workdirState.host ||
execParams.workdir !== workdirState.inputWorkdir)
) {
return { ...execParams };
}
} catch {
return { ...execParams };
}
if (envState) {
markResolveExecEnvPrepared(execParams, envState);
}
if (workdirState) {
markResolvedExecWorkdirPrepared(execParams, workdirState);
}
return execParams;
},
execute: async (_toolCallId, args, signal, onUpdate) => {
let params = stripMalformedXmlArgValueSuffixFromKeys(
args as ExecToolArgs,
XML_ARG_VALUE_EXEC_PARAM_KEYS,
);
const resolveExecEnvPrepared = isResolveExecEnvPrepared(args as ExecToolArgs);
const preparedWorkdirState = getResolvedExecWorkdirPreparedState(params);
const maxOutput = DEFAULT_MAX_OUTPUT;
const pendingMaxOutput = DEFAULT_PENDING_MAX_OUTPUT;
const warnings: string[] = [];
const getWarningText = () => (warnings.length ? `${warnings.join("\n")}\n\n` : "");
const approvalWarningText = normalizeOptionalString(defaults?.approvalWarningText);
if (approvalWarningText) {
warnings.push(approvalWarningText);
}
const startedAt = Date.now();
let execCommandOverride: string | undefined;
const backgroundRequested = params.background === true;
const yieldRequested = typeof params.yieldMs === "number";
@@ -1492,9 +1605,6 @@ export function createExecTool(
);
}
}
if (elevatedRequested) {
logInfo(`exec: elevated command ${truncateMiddle(params.command, 120)}`);
}
const requestedTarget = requireValidExecTarget(params.host);
const target = resolveExecTarget({
configuredTarget: defaults?.host,
@@ -1567,242 +1677,269 @@ export function createExecTool(
].join("\n"),
);
}
const explicitWorkdir = normalizeOptionalString(params.workdir);
const defaultWorkdir = normalizeOptionalString(defaults?.cwd);
let workdir: string | undefined;
let containerWorkdir = sandbox?.containerWorkdir;
if (sandbox) {
const sandboxWorkdir = explicitWorkdir ?? defaultWorkdir ?? process.cwd();
const resolved = await resolveSandboxWorkdir({
workdir: sandboxWorkdir,
sandbox,
warnings,
});
workdir = resolved.hostWorkdir;
containerWorkdir = resolved.containerWorkdir;
} else if (host === "node") {
// For remote node execution, only forward a cwd that was explicitly
// requested on the tool call. The gateway's workspace root is wired in as a
// local default, but it is not meaningful on the remote node and would
// recreate the cross-platform approval failure this path is fixing.
// When no explicit cwd was given, the gateway's own
// process.cwd() is meaningless on the remote node (especially cross-platform,
// e.g. Linux gateway + Windows node) and would cause
// "SYSTEM_RUN_DENIED: approval requires an existing canonical cwd".
// Passing undefined lets the node use its own default working directory.
workdir = explicitWorkdir;
} else {
const rawWorkdir = explicitWorkdir ?? defaultWorkdir ?? process.cwd();
workdir = resolveWorkdir(rawWorkdir, warnings);
if (!params.command) {
throw new Error("Provide a command to start.");
}
await rejectUnsafeExecControlShellCommand(params.command);
const inheritedBaseEnv = coerceEnv(process.env);
const resolvedExecEnvState = getResolvedExecEnvPreparedState(params);
const channelContextEnv = buildChannelContextEnv(defaults?.channelContext);
const requestedEnv: Record<string, string> | undefined =
params.env !== undefined ||
resolvedExecEnvState?.pluginEnv !== undefined ||
channelContextEnv !== undefined
? { ...params.env, ...resolvedExecEnvState?.pluginEnv, ...channelContextEnv }
: undefined;
const hostEnvResult =
host === "sandbox"
? null
: sanitizeHostExecEnvWithDiagnostics({
baseEnv: inheritedBaseEnv,
overrides: requestedEnv,
blockPathOverrides: true,
let workdir: string | undefined;
let scriptPreflightCwd: string | null = null;
let containerWorkdir = sandbox?.containerWorkdir;
let discardPreparedSandboxWorkdir: (() => void) | null = null;
const workdirResolution =
preparedWorkdirState?.host === host
? preparedWorkdirState.resolution
: await resolveExecWorkdir({
host,
workdir: params.workdir,
defaultCwd: defaults?.cwd,
sandbox,
});
if (
hostEnvResult &&
requestedEnv &&
(hostEnvResult.rejectedOverrideBlockedKeys.length > 0 ||
hostEnvResult.rejectedOverrideInvalidKeys.length > 0)
) {
const blockedKeys = hostEnvResult.rejectedOverrideBlockedKeys;
const invalidKeys = hostEnvResult.rejectedOverrideInvalidKeys;
const pathBlocked = blockedKeys.includes("PATH");
if (pathBlocked && blockedKeys.length === 1 && invalidKeys.length === 0) {
throw new Error(
"Security Violation: Custom 'PATH' variable is forbidden during host execution.",
);
}
if (blockedKeys.length === 1 && invalidKeys.length === 0) {
throw new Error(
`Security Violation: Environment variable '${blockedKeys[0]}' is forbidden during host execution.`,
);
}
const details: string[] = [];
if (blockedKeys.length > 0) {
details.push(`blocked override keys: ${blockedKeys.join(", ")}`);
}
if (invalidKeys.length > 0) {
details.push(`invalid non-portable override keys: ${invalidKeys.join(", ")}`);
}
const suffix = details.join("; ");
if (pathBlocked) {
throw new Error(
`Security Violation: Custom 'PATH' variable is forbidden during host execution (${suffix}).`,
);
}
throw new Error(`Security Violation: ${suffix}.`);
}
const env =
sandbox && host === "sandbox"
? buildSandboxEnv({
defaultPath: DEFAULT_PATH,
paramsEnv: requestedEnv,
sandboxEnv: sandbox.env,
containerWorkdir: containerWorkdir ?? sandbox.containerWorkdir,
})
: (hostEnvResult?.env ?? inheritedBaseEnv);
if (!sandbox && host === "gateway" && !requestedEnv?.PATH) {
const shellPath = getShellPathFromLoginShell({
env: process.env,
timeoutMs: resolveShellEnvFallbackTimeoutMs(process.env),
if (workdirResolution.kind === "unavailable") {
return buildUnavailableWorkdirResult({
cwd: workdirResolution.requestedCwd,
startedAt,
warningText: warnings.join("\n"),
});
applyShellPath(env, shellPath);
}
// `tools.exec.pathPrepend` is only meaningful when exec runs locally (gateway) or in the sandbox.
// Node hosts intentionally ignore request-scoped PATH overrides, so don't pretend this applies.
if (host === "node" && defaultPathPrepend.length > 0) {
warnings.push(
"Warning: tools.exec.pathPrepend is ignored for host=node. Configure PATH on the node host/service instead.",
);
if (workdirResolution.kind === "sandbox") {
workdir = workdirResolution.hostCwd;
containerWorkdir = workdirResolution.containerCwd;
scriptPreflightCwd = workdirResolution.scriptPreflightCwd;
if (sandbox?.discardPreparedWorkdir && sandbox.workdirValidation === "backend") {
const preparedContainerWorkdir = containerWorkdir;
discardPreparedSandboxWorkdir = () => {
sandbox.discardPreparedWorkdir?.(preparedContainerWorkdir);
};
}
} else if (workdirResolution.kind === "local") {
workdir = workdirResolution.hostCwd;
scriptPreflightCwd = workdirResolution.hostCwd;
} else {
applyPathPrepend(env, defaultPathPrepend);
workdir = workdirResolution.remoteCwd;
}
let run: ExecProcessHandle;
let effectiveTimeout: number;
try {
if (elevatedRequested) {
logInfo(`exec: elevated command ${truncateMiddle(params.command, 120)}`);
}
if (!resolveExecEnvPrepared) {
params = await prepareParamsWithResolvedExecEnv(params);
}
if (host === "node") {
return executeNodeHostCommand({
command: params.command,
workdir,
env,
requestedEnv,
requestedNode: params.node?.trim(),
boundNode: defaults?.node?.trim(),
sessionKey: defaults?.sessionKey,
sessionId: defaults?.sessionId,
sessionStore: defaults?.sessionStore,
bashElevated: elevatedDefaults,
approvalReviewerDeviceId: defaults?.approvalReviewerDeviceId,
turnSourceChannel: defaults?.messageProvider,
turnSourceTo: defaults?.currentChannelId,
turnSourceAccountId: defaults?.accountId,
turnSourceThreadId: defaults?.currentThreadTs,
agentId,
security,
ask,
autoReview,
autoReviewer,
strictInlineEval: defaults?.strictInlineEval,
commandHighlighting: defaults?.commandHighlighting,
trigger: defaults?.trigger,
timeoutSec: params.timeout,
defaultTimeoutSec,
approvalRunningNoticeMs,
warnings,
notifySessionKey,
notifyOnExit,
trustedSafeBinDirs,
});
}
if (!workdir) {
throw new Error("exec internal error: local execution requires a resolved workdir");
}
if (host === "gateway" && !bypassApprovals) {
const gatewayResult = await processGatewayAllowlist({
const inheritedBaseEnv = coerceEnv(process.env);
const resolvedExecEnvState = getResolvedExecEnvPreparedState(params);
const channelContextEnv = buildChannelContextEnv(defaults?.channelContext);
const requestedEnv: Record<string, string> | undefined =
params.env !== undefined ||
resolvedExecEnvState?.pluginEnv !== undefined ||
channelContextEnv !== undefined
? { ...params.env, ...resolvedExecEnvState?.pluginEnv, ...channelContextEnv }
: undefined;
const hostEnvResult =
host === "sandbox"
? null
: sanitizeHostExecEnvWithDiagnostics({
baseEnv: inheritedBaseEnv,
overrides: requestedEnv,
blockPathOverrides: true,
});
if (
hostEnvResult &&
requestedEnv &&
(hostEnvResult.rejectedOverrideBlockedKeys.length > 0 ||
hostEnvResult.rejectedOverrideInvalidKeys.length > 0)
) {
const blockedKeys = hostEnvResult.rejectedOverrideBlockedKeys;
const invalidKeys = hostEnvResult.rejectedOverrideInvalidKeys;
const pathBlocked = blockedKeys.includes("PATH");
if (pathBlocked && blockedKeys.length === 1 && invalidKeys.length === 0) {
throw new Error(
"Security Violation: Custom 'PATH' variable is forbidden during host execution.",
);
}
if (blockedKeys.length === 1 && invalidKeys.length === 0) {
throw new Error(
`Security Violation: Environment variable '${blockedKeys[0]}' is forbidden during host execution.`,
);
}
const details: string[] = [];
if (blockedKeys.length > 0) {
details.push(`blocked override keys: ${blockedKeys.join(", ")}`);
}
if (invalidKeys.length > 0) {
details.push(`invalid non-portable override keys: ${invalidKeys.join(", ")}`);
}
const suffix = details.join("; ");
if (pathBlocked) {
throw new Error(
`Security Violation: Custom 'PATH' variable is forbidden during host execution (${suffix}).`,
);
}
throw new Error(`Security Violation: ${suffix}.`);
}
const env =
sandbox && host === "sandbox"
? buildSandboxEnv({
defaultPath: DEFAULT_PATH,
paramsEnv: requestedEnv,
sandboxEnv: sandbox.env,
containerWorkdir: containerWorkdir ?? sandbox.containerWorkdir,
})
: (hostEnvResult?.env ?? inheritedBaseEnv);
if (!sandbox && host === "gateway" && !requestedEnv?.PATH) {
const shellPath = getShellPathFromLoginShell({
env: process.env,
timeoutMs: resolveShellEnvFallbackTimeoutMs(process.env),
});
applyShellPath(env, shellPath);
}
// `tools.exec.pathPrepend` is only meaningful when exec runs locally (gateway) or in the sandbox.
// Node hosts intentionally ignore request-scoped PATH overrides, so don't pretend this applies.
if (host === "node" && defaultPathPrepend.length > 0) {
warnings.push(
"Warning: tools.exec.pathPrepend is ignored for host=node. Configure PATH on the node host/service instead.",
);
} else {
applyPathPrepend(env, defaultPathPrepend);
}
if (host === "node") {
return executeNodeHostCommand({
command: params.command,
workdir,
env,
requestedEnv,
requestedNode: params.node?.trim(),
boundNode: defaults?.node?.trim(),
sessionKey: defaults?.sessionKey,
sessionId: defaults?.sessionId,
sessionStore: defaults?.sessionStore,
bashElevated: elevatedDefaults,
approvalReviewerDeviceId: defaults?.approvalReviewerDeviceId,
turnSourceChannel: defaults?.messageProvider,
turnSourceTo: defaults?.currentChannelId,
turnSourceAccountId: defaults?.accountId,
turnSourceThreadId: defaults?.currentThreadTs,
agentId,
security,
ask,
autoReview,
autoReviewer,
strictInlineEval: defaults?.strictInlineEval,
commandHighlighting: defaults?.commandHighlighting,
trigger: defaults?.trigger,
timeoutSec: params.timeout,
defaultTimeoutSec,
approvalRunningNoticeMs,
warnings,
notifySessionKey,
notifyOnExit,
trustedSafeBinDirs,
});
}
if (!workdir) {
throw new Error("exec internal error: local execution requires a resolved workdir");
}
if (host === "gateway" && !bypassApprovals) {
const gatewayResult = await processGatewayAllowlist({
command: params.command,
workdir,
env,
pathPrepend: defaultPathPrepend,
requestedEnv,
pty: params.pty === true && !sandbox,
timeoutSec: params.timeout,
defaultTimeoutSec,
security,
ask,
autoReview,
autoReviewer,
safeBins,
safeBinProfiles,
strictInlineEval: defaults?.strictInlineEval,
commandHighlighting: defaults?.commandHighlighting,
trigger: defaults?.trigger,
agentId,
sessionKey: defaults?.sessionKey,
sessionId: defaults?.sessionId,
sessionStore: defaults?.sessionStore,
bashElevated: elevatedDefaults,
approvalReviewerDeviceId: defaults?.approvalReviewerDeviceId,
turnSourceChannel: defaults?.messageProvider,
turnSourceTo: defaults?.currentChannelId,
turnSourceAccountId: defaults?.accountId,
turnSourceThreadId: defaults?.currentThreadTs,
scopeKey: defaults?.scopeKey,
approvalFollowupText: defaults?.approvalFollowupText,
approvalFollowup: defaults?.approvalFollowup,
approvalFollowupMode: defaults?.approvalFollowupMode,
warnings,
notifySessionKey,
approvalRunningNoticeMs,
maxOutput,
pendingMaxOutput,
trustedSafeBinDirs,
});
if (gatewayResult.pendingResult) {
return gatewayResult.pendingResult;
}
if (gatewayResult.deniedResult) {
return gatewayResult.deniedResult;
}
execCommandOverride = gatewayResult.execCommandOverride;
if (gatewayResult.allowWithoutEnforcedCommand) {
execCommandOverride = undefined;
}
}
const explicitTimeoutSec = typeof params.timeout === "number" ? params.timeout : null;
effectiveTimeout = explicitTimeoutSec ?? defaultTimeoutSec;
const usePty = params.pty === true && !sandbox;
// Preflight: catch a common model failure mode (shell syntax leaking into Python/JS sources)
// before we execute and burn tokens in cron loops.
if (scriptPreflightCwd && !shouldSkipExecScriptPreflight({ host, security, ask })) {
await validateScriptFileForShellBleed({
command: params.command,
workdir: scriptPreflightCwd,
});
}
run = await runExecProcess({
command: params.command,
execCommand: execCommandOverride,
workdir,
env,
pathPrepend: defaultPathPrepend,
requestedEnv,
pty: params.pty === true && !sandbox,
timeoutSec: params.timeout,
defaultTimeoutSec,
security,
ask,
autoReview,
autoReviewer,
safeBins,
safeBinProfiles,
strictInlineEval: defaults?.strictInlineEval,
commandHighlighting: defaults?.commandHighlighting,
trigger: defaults?.trigger,
agentId,
sessionKey: defaults?.sessionKey,
sessionId: defaults?.sessionId,
sessionStore: defaults?.sessionStore,
bashElevated: elevatedDefaults,
approvalReviewerDeviceId: defaults?.approvalReviewerDeviceId,
turnSourceChannel: defaults?.messageProvider,
turnSourceTo: defaults?.currentChannelId,
turnSourceAccountId: defaults?.accountId,
turnSourceThreadId: defaults?.currentThreadTs,
scopeKey: defaults?.scopeKey,
approvalFollowupText: defaults?.approvalFollowupText,
approvalFollowup: defaults?.approvalFollowup,
approvalFollowupMode: defaults?.approvalFollowupMode,
sandbox,
containerWorkdir,
usePty,
warnings,
notifySessionKey,
approvalRunningNoticeMs,
maxOutput,
pendingMaxOutput,
trustedSafeBinDirs,
notifyOnExit,
notifyOnExitEmptySuccess,
scopeKey: defaults?.scopeKey,
sessionKey: notifySessionKey,
mainKey: defaults?.mainKey,
sessionScope: defaults?.sessionScope,
eventRouting: defaults?.eventRouting,
notifyDeliveryContext,
timeoutSec: effectiveTimeout,
onUpdate,
});
if (gatewayResult.pendingResult) {
return gatewayResult.pendingResult;
}
if (gatewayResult.deniedResult) {
return gatewayResult.deniedResult;
}
execCommandOverride = gatewayResult.execCommandOverride;
if (gatewayResult.allowWithoutEnforcedCommand) {
execCommandOverride = undefined;
}
discardPreparedSandboxWorkdir = null;
} catch (error) {
discardPreparedSandboxWorkdir?.();
throw error;
}
const explicitTimeoutSec = typeof params.timeout === "number" ? params.timeout : null;
const effectiveTimeout = explicitTimeoutSec ?? defaultTimeoutSec;
const getWarningText = () => (warnings.length ? `${warnings.join("\n")}\n\n` : "");
const usePty = params.pty === true && !sandbox;
// Preflight: catch a common model failure mode (shell syntax leaking into Python/JS sources)
// before we execute and burn tokens in cron loops.
if (!shouldSkipExecScriptPreflight({ host, security, ask })) {
await validateScriptFileForShellBleed({ command: params.command, workdir });
}
const run = await runExecProcess({
command: params.command,
execCommand: execCommandOverride,
workdir,
env,
pathPrepend: defaultPathPrepend,
sandbox,
containerWorkdir,
usePty,
warnings,
maxOutput,
pendingMaxOutput,
notifyOnExit,
notifyOnExitEmptySuccess,
scopeKey: defaults?.scopeKey,
sessionKey: notifySessionKey,
mainKey: defaults?.mainKey,
sessionScope: defaults?.sessionScope,
eventRouting: defaults?.eventRouting,
notifyDeliveryContext,
timeoutSec: effectiveTimeout,
onUpdate,
});
let yielded = false;
let yieldTimer: NodeJS.Timeout | null = null;
let registeredAbortSignal: AbortSignal | null = null;

View File

@@ -12,7 +12,12 @@ const EXEC_TOOL_HOST_VALUES = ["auto", "sandbox", "gateway", "node"] as const;
/** Parameters accepted by the exec tool. */
export const execSchema = Type.Object({
command: Type.String({ description: "Shell command to execute" }),
workdir: Type.Optional(Type.String({ description: "Working directory (defaults to cwd)" })),
workdir: Type.Optional(
Type.String({
description:
"Working directory. Blank/whitespace values are invalid; omit to use the default cwd.",
}),
),
env: Type.Optional(Type.Record(Type.String(), Type.String())),
yieldMs: Type.Optional(
Type.Number({

View File

@@ -1,24 +1,11 @@
/**
* Shared bash-tool helper tests.
* Covers strict env parsing and sandbox workdir mapping between container and
* host workspace paths.
* Covers strict env parsing and compact session labels.
*/
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import { deriveSessionName, readEnvInt, resolveSandboxWorkdir } from "./bash-tools.shared.js";
import { deriveSessionName, readEnvInt } from "./bash-tools.shared.js";
async function withTempDir(run: (dir: string) => Promise<void>) {
const dir = await mkdtemp(path.join(os.tmpdir(), "openclaw-bash-workdir-"));
try {
await run(dir);
} finally {
await rm(dir, { recursive: true, force: true });
}
}
describe("resolveSandboxWorkdir", () => {
describe("readEnvInt", () => {
afterEach(() => {
vi.unstubAllEnvs();
});
@@ -56,67 +43,6 @@ describe("resolveSandboxWorkdir", () => {
expect(readEnvInt("OPENCLAW_BASH_YIELD_MS", "PI_BASH_YIELD_MS")).toBeUndefined();
});
it("maps container root workdir to host workspace", async () => {
await withTempDir(async (workspaceDir) => {
const warnings: string[] = [];
const resolved = await resolveSandboxWorkdir({
workdir: "/workspace",
sandbox: {
containerName: "sandbox-1",
workspaceDir,
containerWorkdir: "/workspace",
},
warnings,
});
expect(resolved.hostWorkdir).toBe(workspaceDir);
expect(resolved.containerWorkdir).toBe("/workspace");
expect(warnings).toStrictEqual([]);
});
});
it("maps nested container workdir under the container workspace", async () => {
await withTempDir(async (workspaceDir) => {
const nested = path.join(workspaceDir, "scripts", "runner");
await mkdir(nested, { recursive: true });
const warnings: string[] = [];
const resolved = await resolveSandboxWorkdir({
workdir: "/workspace/scripts/runner",
sandbox: {
containerName: "sandbox-2",
workspaceDir,
containerWorkdir: "/workspace",
},
warnings,
});
expect(resolved.hostWorkdir).toBe(nested);
expect(resolved.containerWorkdir).toBe("/workspace/scripts/runner");
expect(warnings).toStrictEqual([]);
});
});
it("supports custom container workdir prefixes", async () => {
await withTempDir(async (workspaceDir) => {
const nested = path.join(workspaceDir, "project");
await mkdir(nested, { recursive: true });
const warnings: string[] = [];
const resolved = await resolveSandboxWorkdir({
workdir: "/sandbox-root/project",
sandbox: {
containerName: "sandbox-3",
workspaceDir,
containerWorkdir: "/sandbox-root",
},
warnings,
});
expect(resolved.hostWorkdir).toBe(nested);
expect(resolved.containerWorkdir).toBe("/sandbox-root/project");
expect(warnings).toStrictEqual([]);
});
});
});
describe("deriveSessionName", () => {

View File

@@ -1,16 +1,15 @@
/**
* Shared helpers for bash exec/process tools.
* Owns sandbox workdir mapping, Docker exec argument construction, output
* slicing, environment coercion, and compact session labels.
* Owns Docker exec argument construction, output slicing, environment
* coercion, and compact session labels.
*/
import { existsSync, statSync } from "node:fs";
import fs from "node:fs/promises";
import { homedir } from "node:os";
import path from "node:path";
import { parseStrictInteger } from "@openclaw/normalization-core/number-coercion";
import { sliceUtf16Safe } from "../utils.js";
import { assertSandboxPath } from "./sandbox-paths.js";
import type { SandboxBackendExecSpec } from "./sandbox/backend-handle.types.js";
import type {
SandboxBackendExecSpec,
SandboxBackendWorkdirValidation,
SandboxBackendWorkdirValidator,
} from "./sandbox/backend-handle.types.js";
const CHUNK_LIMIT = 8 * 1024;
@@ -19,6 +18,10 @@ export type BashSandboxConfig = {
containerName: string;
workspaceDir: string;
containerWorkdir: string;
workdirValidation?: SandboxBackendWorkdirValidation;
validateWorkdir?: SandboxBackendWorkdirValidator;
discardPreparedWorkdir?: (workdir: string) => void;
workdirRoots?: readonly string[];
env?: Record<string, string>;
buildExecSpec?: (params: {
command: string;
@@ -109,101 +112,6 @@ export function buildDockerExecArgs(params: {
return args;
}
/** Resolves a requested workdir to both host and container paths for a sandbox. */
export async function resolveSandboxWorkdir(params: {
workdir: string;
sandbox: BashSandboxConfig;
warnings: string[];
}) {
const fallback = params.sandbox.workspaceDir;
const mappedHostWorkdir = mapContainerWorkdirToHost({
workdir: params.workdir,
sandbox: params.sandbox,
});
const candidateWorkdir = mappedHostWorkdir ?? params.workdir;
try {
const resolved = await assertSandboxPath({
filePath: candidateWorkdir,
cwd: process.cwd(),
root: params.sandbox.workspaceDir,
});
const stats = await fs.stat(resolved.resolved);
if (!stats.isDirectory()) {
throw new Error("workdir is not a directory");
}
const relative = resolved.relative
? resolved.relative.split(path.sep).join(path.posix.sep)
: "";
const containerWorkdir = relative
? path.posix.join(params.sandbox.containerWorkdir, relative)
: params.sandbox.containerWorkdir;
return { hostWorkdir: resolved.resolved, containerWorkdir };
} catch {
params.warnings.push(
`Warning: workdir "${params.workdir}" is unavailable; using "${fallback}".`,
);
return {
hostWorkdir: fallback,
containerWorkdir: params.sandbox.containerWorkdir,
};
}
}
function mapContainerWorkdirToHost(params: {
workdir: string;
sandbox: BashSandboxConfig;
}): string | undefined {
const workdir = normalizeContainerPath(params.workdir);
const containerRoot = normalizeContainerPath(params.sandbox.containerWorkdir);
if (containerRoot === ".") {
return undefined;
}
if (workdir === containerRoot) {
return path.resolve(params.sandbox.workspaceDir);
}
if (!workdir.startsWith(`${containerRoot}/`)) {
return undefined;
}
const rel = workdir
.slice(containerRoot.length + 1)
.split("/")
.filter(Boolean);
return path.resolve(params.sandbox.workspaceDir, ...rel);
}
function normalizeContainerPath(input: string): string {
const normalized = input.trim().replace(/\\/g, "/");
if (!normalized) {
return ".";
}
return path.posix.normalize(normalized);
}
/** Resolves a host workdir, falling back to a safe cwd/home path with a warning. */
export function resolveWorkdir(workdir: string, warnings: string[]) {
const current = safeCwd();
const fallback = current ?? homedir();
try {
const stats = statSync(workdir);
if (stats.isDirectory()) {
return workdir;
}
} catch {
// ignore, fallback below
}
warnings.push(`Warning: workdir "${workdir}" is unavailable; using "${fallback}".`);
return fallback;
}
function safeCwd() {
try {
const cwd = process.cwd();
return existsSync(cwd) ? cwd : null;
} catch {
return null;
}
}
/**
* Clamp a number within min/max bounds, using defaultValue if undefined or NaN.
*/

View File

@@ -73,7 +73,9 @@ describe("isDeliveredMessagingToolResult", () => {
result: [{ type: "text", text: JSON.stringify({ result: { messageId: "msg-1" } }) }],
}),
).toBe(true);
expect(isDeliveredMessagingToolResult({ result: { content: [{ text: "sent" }] } })).toBe(true);
expect(isDeliveredMessagingToolResult({ result: { content: [{ text: "sent" }] } })).toBe(
true,
);
expect(isDeliveredMessagingToolResult({ result: { status: "sent" } })).toBe(true);
});
@@ -332,47 +334,4 @@ describe("isDeliveredMessageToolOnlySourceReplyResult", () => {
}),
).toBe(false);
});
it("accepts confirmed explicit routes when the caller verified the source route", () => {
expect(
isDeliveredMessageToolOnlySourceReplyResult({
sourceReplyDeliveryMode: "message_tool_only",
toolName: "message",
args: {
action: "reply",
channel: "imessage",
target: "+12069106512",
message: "reply",
},
result: { ok: true, messageId: "imessage-853" },
allowExplicitSourceRoute: true,
}),
).toBe(true);
expect(
isDeliveredMessageToolOnlySourceReplyResult({
sourceReplyDeliveryMode: "message_tool_only",
toolName: "message",
args: {
action: "reply",
channel: "imessage",
target: "+12069106512",
message: "reply",
},
result: { ok: true, messageId: "imessage-853" },
}),
).toBe(false);
expect(
isDeliveredMessageToolOnlySourceReplyResult({
sourceReplyDeliveryMode: "message_tool_only",
toolName: "message",
args: {
action: "react",
channel: "imessage",
target: "+12069106512",
},
result: { ok: true },
allowExplicitSourceRoute: true,
}),
).toBe(false);
});
});

View File

@@ -50,13 +50,6 @@ function hasExplicitMessageRoute(args: Record<string, unknown>): boolean {
return Array.isArray(args.targets) && args.targets.some((value) => hasStringValue(value));
}
function isMessageToolSourceReplyActionName(action: unknown): boolean {
if (isMessageToolSendActionName(action)) {
return true;
}
return typeof action === "string" && action.trim().toLowerCase() === "reply";
}
function normalizeStatus(value: unknown): string | undefined {
return typeof value === "string" ? value.trim().toLowerCase() : undefined;
}
@@ -554,7 +547,6 @@ export function isDeliveredMessageToolOnlySourceReplyResult(params: {
result?: unknown;
hookResult?: unknown;
isError?: boolean;
allowExplicitSourceRoute?: boolean;
}): boolean {
if (params.sourceReplyDeliveryMode !== "message_tool_only") {
return false;
@@ -563,12 +555,7 @@ export function isDeliveredMessageToolOnlySourceReplyResult(params: {
return false;
}
const args = asRecord(params.args);
const sourceRouteReplyAction =
params.allowExplicitSourceRoute === true && isMessageToolSourceReplyActionName(args.action);
if (!isMessageToolSendActionName(args.action) && !sourceRouteReplyAction) {
return false;
}
if (hasExplicitMessageRoute(args) && params.allowExplicitSourceRoute !== true) {
if (!isMessageToolSendActionName(args.action) || hasExplicitMessageRoute(args)) {
return false;
}
return isDeliveredMessagingToolResult(params);

View File

@@ -0,0 +1,99 @@
// Coverage for handing replay-safe plugin-harness prompt timeouts to model fallback.
import { beforeAll, beforeEach, describe, expect, it } from "vitest";
import { makeModelFallbackCfg } from "../test-helpers/model-fallback-config-fixture.js";
import { makeAttemptResult } from "./run.overflow-compaction.fixture.js";
import {
loadRunOverflowCompactionHarness,
MockedFailoverError,
mockedClassifyFailoverReason,
mockedRunEmbeddedAttempt,
overflowBaseRunParams,
resetRunOverflowCompactionHarnessMocks,
} from "./run.overflow-compaction.harness.js";
let runEmbeddedAgent: typeof import("./run.js").runEmbeddedAgent;
describe("runEmbeddedAgent prompt timeout fallback handoff", () => {
beforeAll(async () => {
({ runEmbeddedAgent } = await loadRunOverflowCompactionHarness());
});
beforeEach(() => {
resetRunOverflowCompactionHarnessMocks();
});
it("throws FailoverError for replay-safe harness-owned prompt timeouts when model fallbacks are configured", async () => {
mockedClassifyFailoverReason.mockReturnValue("timeout");
mockedRunEmbeddedAttempt.mockResolvedValueOnce(
makeAttemptResult({
assistantTexts: [],
promptError: new Error("LLM request timed out."),
promptErrorSource: "prompt",
}),
);
const promise = runEmbeddedAgent({
...overflowBaseRunParams,
provider: "openai",
model: "gpt-5.4",
runId: "run-prompt-timeout-fallback",
config: makeModelFallbackCfg({
agents: {
defaults: {
model: {
primary: "openai/gpt-5.4",
fallbacks: ["anthropic/claude-opus-4-6"],
},
},
},
}),
});
await expect(promise).rejects.toBeInstanceOf(MockedFailoverError);
await expect(promise).rejects.toThrow("LLM request timed out.");
expect(mockedRunEmbeddedAttempt).toHaveBeenCalledTimes(1);
});
it("surfaces replay-invalid prompt timeouts instead of handing them to model fallback", async () => {
mockedClassifyFailoverReason.mockReturnValue("timeout");
mockedRunEmbeddedAttempt.mockResolvedValueOnce(
makeAttemptResult({
assistantTexts: [],
promptError: new Error("LLM request timed out."),
promptErrorSource: "prompt",
promptTimeoutOutcome: {
message: "Harness abandoned the timed-out turn after provider activity.",
replayInvalid: true,
livenessState: "abandoned",
},
}),
);
let thrown: unknown;
try {
await runEmbeddedAgent({
...overflowBaseRunParams,
provider: "openai",
model: "gpt-5.4",
runId: "run-prompt-timeout-replay-invalid",
config: makeModelFallbackCfg({
agents: {
defaults: {
model: {
primary: "openai/gpt-5.4",
fallbacks: ["anthropic/claude-opus-4-6"],
},
},
},
}),
});
} catch (err) {
thrown = err;
}
expect(thrown).toBeInstanceOf(Error);
expect(thrown).not.toBeInstanceOf(MockedFailoverError);
expect(String((thrown as Error | undefined)?.message)).toContain("LLM request timed out.");
expect(mockedRunEmbeddedAttempt).toHaveBeenCalledTimes(1);
});
});

View File

@@ -641,7 +641,7 @@ async function runEmbeddedAgentInternal(
...paramsBase,
agentId: paramsBase.agentId ?? runSessionTarget.agentId,
sessionId: runSessionTarget.sessionId,
sessionKey: effectiveSessionKey ?? runSessionTarget.sessionKey,
sessionKey: normalizeOptionalString(effectiveSessionKey ?? runSessionTarget.sessionKey),
sessionFile: runSessionTarget.sessionFile,
};
const sessionLane = resolveSessionLane(params.sessionKey?.trim() || params.sessionId);
@@ -3116,6 +3116,12 @@ async function runEmbeddedAgentInternal(
);
const promptFailoverFailure =
promptFailoverReason !== null || isFailoverErrorMessage(errorText, { provider });
const promptTimeoutFallbackSafe =
promptErrorSource === "prompt" &&
promptFailoverReason === "timeout" &&
!attempt.codexAppServerFailure &&
attempt.promptTimeoutOutcome?.replayInvalid !== true &&
attempt.replayMetadata.replaySafe;
// Capture the failing profile before auth-profile rotation mutates `lastProfileId`.
const failedPromptProfileId = lastProfileId;
const logPromptFailoverDecision = createFailoverDecisionLogger({
@@ -3147,6 +3153,7 @@ async function runEmbeddedAgentInternal(
failoverFailure: promptFailoverFailure,
failoverReason: promptFailoverReason,
harnessOwnsTransport: pluginHarnessOwnsTransport,
promptTimeoutFallbackSafe,
profileRotated: false,
});
if (
@@ -3186,6 +3193,7 @@ async function runEmbeddedAgentInternal(
failoverFailure: promptFailoverFailure,
failoverReason: promptFailoverReason,
harnessOwnsTransport: pluginHarnessOwnsTransport,
promptTimeoutFallbackSafe,
profileRotated: true,
});
}

View File

@@ -3129,7 +3129,11 @@ export async function runEmbeddedAttempt(
trigger: params.trigger,
runTimeoutMs: resolvedRunTimeoutMs,
modelRequestTimeoutMs: (params.model as { requestTimeoutMs?: number }).requestTimeoutMs,
model: params.model as { baseUrl?: string },
model: {
baseUrl: params.model.baseUrl,
id: params.modelId,
provider: params.provider,
},
});
if (idleTimeoutMs > 0) {
activeSession.agent.streamFn = streamWithIdleTimeout(

View File

@@ -581,6 +581,44 @@ describe("resolveRunFailoverDecision", () => {
});
});
it("falls back on fallback-safe harness-owned prompt timeouts", () => {
expect(
resolveRunFailoverDecision({
stage: "prompt",
aborted: false,
externalAbort: false,
fallbackConfigured: true,
failoverFailure: true,
failoverReason: "timeout",
harnessOwnsTransport: true,
promptTimeoutFallbackSafe: true,
profileRotated: true,
}),
).toEqual({
action: "fallback_model",
reason: "timeout",
});
});
it("surfaces fallback-safe harness-owned prompt timeouts when no fallback is configured", () => {
expect(
resolveRunFailoverDecision({
stage: "prompt",
aborted: false,
externalAbort: false,
fallbackConfigured: false,
failoverFailure: true,
failoverReason: "timeout",
harnessOwnsTransport: true,
promptTimeoutFallbackSafe: true,
profileRotated: true,
}),
).toEqual({
action: "surface_error",
reason: "timeout",
});
});
it("surfaces error on LLM idle timeout when no fallback is configured and rotation is exhausted", () => {
expect(
resolveRunFailoverDecision({

View File

@@ -50,6 +50,7 @@ type PromptDecisionParams = {
failoverFailure: boolean;
failoverReason: FailoverReason | null;
harnessOwnsTransport?: boolean;
promptTimeoutFallbackSafe?: boolean;
profileRotated: boolean;
};
@@ -179,6 +180,14 @@ export function resolveRunFailoverDecision(params: RunFailoverDecisionParams): R
};
}
if (params.harnessOwnsTransport && params.failoverReason === "timeout") {
// Plugin harness lifecycle timeouts must stay inside the harness boundary;
// only prompt request timeouts proven replay-safe may enter model fallback.
if (params.promptTimeoutFallbackSafe === true && params.fallbackConfigured) {
return {
action: "fallback_model",
reason: "timeout",
};
}
return {
action: "surface_error",
reason: params.failoverReason,

View File

@@ -6,6 +6,7 @@ import path from "node:path";
import { pathToFileURL } from "node:url";
import { describe, expect, it, vi } from "vitest";
import { resolvePreferredOpenClawTmpDir } from "../../../infra/tmp-openclaw-dir.js";
import { captureEnv, setTestEnvValue } from "../../../test-utils/env.js";
import { createHostSandboxFsBridge } from "../../test-helpers/host-sandbox-fs-bridge.js";
import { createUnsafeMountedSandbox } from "../../test-helpers/unsafe-mounted-sandbox.js";
import {
@@ -420,7 +421,8 @@ describe("loadImageFromRef", () => {
await fs.mkdir(workspaceDir, { recursive: true });
await fs.mkdir(inboundDir, { recursive: true });
await fs.writeFile(path.join(inboundDir, mediaId), Buffer.from(TINY_PNG_BASE64, "base64"));
vi.stubEnv("OPENCLAW_STATE_DIR", stateDir);
const envSnapshot = captureEnv(["OPENCLAW_STATE_DIR"]);
setTestEnvValue("OPENCLAW_STATE_DIR", stateDir);
try {
const image = await loadImageFromRef(
@@ -437,7 +439,7 @@ describe("loadImageFromRef", () => {
expect(image?.mimeType).toBe("image/png");
expect(image?.data).toBe(TINY_PNG_BASE64);
} finally {
vi.unstubAllEnvs();
envSnapshot.restore();
await fs.rm(stateDir, { recursive: true, force: true });
}
});
@@ -670,7 +672,8 @@ describe("detectAndLoadPromptImages", () => {
const imagePath = path.join(inboundDir, "signal-replay.png");
const pngB64 = TINY_PNG_BASE64;
await fs.writeFile(imagePath, Buffer.from(pngB64, "base64"));
vi.stubEnv("OPENCLAW_STATE_DIR", stateDir);
const envSnapshot = captureEnv(["OPENCLAW_STATE_DIR"]);
setTestEnvValue("OPENCLAW_STATE_DIR", stateDir);
try {
const result = await detectAndLoadPromptImages({
@@ -685,7 +688,7 @@ describe("detectAndLoadPromptImages", () => {
expect(result.skippedCount).toBe(0);
expect(result.images).toHaveLength(1);
} finally {
vi.unstubAllEnvs();
envSnapshot.restore();
await fs.rm(stateDir, { recursive: true, force: true });
}
});

View File

@@ -12,6 +12,7 @@ import type { StreamFn } from "../../runtime/index.js";
import { resolveLlmIdleTimeoutMs, streamWithIdleTimeout } from "./llm-idle-timeout.js";
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000;
const CRON_LLM_IDLE_TIMEOUT_MS = 60_000;
describe("resolveLlmIdleTimeoutMs", () => {
it("returns default when config is undefined", () => {
@@ -41,8 +42,153 @@ describe("resolveLlmIdleTimeoutMs", () => {
expect(resolveLlmIdleTimeoutMs({ runTimeoutMs: 30_000 })).toBe(30_000);
});
it("honors explicit cron run timeouts as the idle watchdog ceiling", () => {
expect(resolveLlmIdleTimeoutMs({ trigger: "cron", runTimeoutMs: 600_000 })).toBe(600_000);
it("caps explicit cron run timeouts so stream stalls can reach model fallbacks", () => {
expect(resolveLlmIdleTimeoutMs({ trigger: "cron", runTimeoutMs: 600_000 })).toBe(
CRON_LLM_IDLE_TIMEOUT_MS,
);
});
it("uses shorter explicit cron run timeouts as the idle watchdog ceiling", () => {
expect(resolveLlmIdleTimeoutMs({ trigger: "cron", runTimeoutMs: 30_000 })).toBe(30_000);
});
it("honors explicit cron run timeouts for local provider model calls", () => {
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { baseUrl: "http://127.0.0.1:11434" },
}),
).toBe(600_000);
});
it.each([
["ollama", "http://ollama-host:11434"],
["ollama-beelink", "http://ollama-host:11434"],
["lmstudio", "http://lmstudio-box:1234/v1"],
["lmstudio-mac", "http://lmstudio-box:1234/v1"],
["vllm", "http://vllm-rig:8000/v1"],
["sglang", "http://sglang-rig:30000/v1"],
])(
"honors explicit cron run timeouts for self-hosted provider %s hostname %s",
(provider, baseUrl) => {
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider, baseUrl },
}),
).toBe(600_000);
},
);
it("honors explicit cron run timeouts for explicit local host aliases", () => {
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { baseUrl: "http://host.docker.internal:11434" },
}),
).toBe(600_000);
});
it("honors explicit cron run timeouts for custom local provider markers on bare hostnames", () => {
const cfg = {
models: {
providers: {
gpu: {
baseUrl: "http://gpu-box:8000/v1",
api: "openai-completions",
apiKey: "custom-local",
models: [],
},
"local-ollama": {
baseUrl: "http://ollama-box:11434",
api: "ollama",
apiKey: "ollama-local",
models: [],
},
},
},
} as unknown as OpenClawConfig;
expect(
resolveLlmIdleTimeoutMs({
cfg,
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider: "gpu", baseUrl: "http://gpu-box:8000/v1" },
}),
).toBe(600_000);
expect(
resolveLlmIdleTimeoutMs({
cfg,
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider: "local-ollama", baseUrl: "http://ollama-box:11434" },
}),
).toBe(600_000);
});
it("honors explicit cron run timeouts for provider-owned local services on bare hostnames", () => {
const cfg = {
models: {
providers: {
ds4: {
baseUrl: "http://ds4-box:8000/v1",
api: "openai-completions",
localService: {
command: "/opt/ds4/ds4-server",
healthUrl: "http://ds4-box:8000/v1/models",
},
models: [],
},
},
},
} as unknown as OpenClawConfig;
expect(
resolveLlmIdleTimeoutMs({
cfg,
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider: "ds4", baseUrl: "http://ds4-box:8000/v1" },
}),
).toBe(600_000);
});
it.each([
["openai", "openai/gpt-5.5", "http://api:8080/v1"],
["custom-proxy", "custom-proxy/gpt-5.5", "http://gateway:4000/v1"],
["ollama-cloud", "ollama-cloud/kimi-k2.6", "http://ollama-host:11434"],
])(
"keeps the cron stall cap for cloud provider %s routed through single-label host %s",
(provider, id, baseUrl) => {
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider, id, baseUrl },
}),
).toBe(CRON_LLM_IDLE_TIMEOUT_MS);
},
);
it("keeps the cron stall cap for remote or cloud hostnames", () => {
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider: "openai", id: "openai/gpt-5.5", baseUrl: "https://api.openai.com/v1" },
}),
).toBe(CRON_LLM_IDLE_TIMEOUT_MS);
expect(
resolveLlmIdleTimeoutMs({
trigger: "cron",
runTimeoutMs: 600_000,
model: { provider: "ollama", id: "ollama/gpt-oss:cloud", baseUrl: "http://ollama-host" },
}),
).toBe(CRON_LLM_IDLE_TIMEOUT_MS);
});
it("disables the idle watchdog when an explicit run timeout disables timeouts", () => {

View File

@@ -18,6 +18,16 @@ import type { EmbeddedRunTrigger } from "./params.js";
* Default idle timeout for LLM streaming responses in milliseconds.
*/
const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000;
// Cron has its own outer watchdog; stream stalls must fail early enough for
// the existing model fallback chain to try the next configured candidate.
const CRON_LLM_IDLE_TIMEOUT_MS = 60_000;
const LOCAL_PROVIDER_AUTH_MARKERS = new Set(["custom-local", "ollama-local"]);
const SELF_HOSTED_PROVIDER_ID_PREFIXES = ["ollama", "lmstudio", "vllm", "sglang", "llama-cpp"];
type IdleTimeoutProviderConfig = {
apiKey?: unknown;
localService?: unknown;
};
/**
* Detects loopback / private-network / `.local` base URLs. Local providers
@@ -37,11 +47,9 @@ const DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000;
* matched, mirroring the SSRF-policy helper in
* `src/cron/isolated-agent/model-preflight.runtime.ts`.
* - DNS-resolved local aliases (e.g. an `/etc/hosts` entry mapping a custom
* hostname to a private IP) are not detected: classification keys on
* `URL.hostname` so resolution would have to happen here, and adding
* sync/async DNS to the watchdog hot path is disproportionate. Affected
* users can use the IP directly or set
* `models.providers.<id>.timeoutSeconds` explicitly.
* hostname to a private IP) are not detected for the implicit watchdog opt-out:
* classification keys on `URL.hostname` so resolution would have to happen
* here, and adding sync/async DNS to the watchdog hot path is disproportionate.
*/
function isLocalProviderBaseUrl(baseUrl: string): boolean {
let host: string;
@@ -95,6 +103,82 @@ function isLocalProviderBaseUrl(baseUrl: string): boolean {
);
}
function isExplicitLocalHostnameBaseUrl(baseUrl: string): boolean {
let host: string;
try {
host = new URL(baseUrl).hostname.toLowerCase();
} catch {
return false;
}
if (
host === "docker.orb.internal" ||
host === "host.docker.internal" ||
host === "host.orb.internal"
) {
return true;
}
return false;
}
function isBareProviderHostnameBaseUrl(baseUrl: string): boolean {
let host: string;
try {
host = new URL(baseUrl).hostname.toLowerCase();
} catch {
return false;
}
if (host.includes(".") || host.includes(":")) {
return false;
}
return /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(host);
}
function isSelfHostedProviderId(provider: string | undefined): boolean {
const normalized = provider?.trim().toLowerCase();
if (!normalized || normalized === "ollama-cloud") {
return false;
}
return SELF_HOSTED_PROVIDER_ID_PREFIXES.some(
(prefix) => normalized === prefix || normalized.startsWith(`${prefix}-`),
);
}
function findConfiguredProviderConfig(
cfg: OpenClawConfig | undefined,
provider: string | undefined,
): IdleTimeoutProviderConfig | undefined {
const normalizedProvider = provider?.trim().toLowerCase();
if (!normalizedProvider) {
return undefined;
}
const providers = cfg?.models?.providers as
| Record<string, IdleTimeoutProviderConfig | undefined>
| undefined;
const exact = providers?.[normalizedProvider];
if (exact) {
return exact;
}
return Object.entries(providers ?? {}).find(
([key]) => key.trim().toLowerCase() === normalizedProvider,
)?.[1];
}
function hasLocalProviderAuthMarker(apiKey: unknown): boolean {
return typeof apiKey === "string" && LOCAL_PROVIDER_AUTH_MARKERS.has(apiKey.trim().toLowerCase());
}
function hasConfiguredLocalProviderSignal(params: {
cfg: OpenClawConfig | undefined;
provider: string | undefined;
}): boolean {
const providerConfig = findConfiguredProviderConfig(params.cfg, params.provider);
return Boolean(
providerConfig?.localService || hasLocalProviderAuthMarker(providerConfig?.apiKey),
);
}
function isOllamaCloudModel(model: { id?: string; provider?: string } | undefined): boolean {
const rawModelId = model?.id;
if (typeof rawModelId !== "string") {
@@ -134,6 +218,22 @@ export function resolveLlmIdleTimeoutMs(params?: {
const hasExplicitRunTimeout =
typeof runTimeoutMs === "number" && Number.isFinite(runTimeoutMs) && runTimeoutMs > 0;
const runTimeoutIsNoTimeout = hasExplicitRunTimeout && runTimeoutMs >= MAX_TIMER_TIMEOUT_MS;
const baseUrl = params?.model?.baseUrl;
const isLocalProvider =
typeof baseUrl === "string" && baseUrl.length > 0 && isLocalProviderBaseUrl(baseUrl);
const isLocalRuntimeModel = isLocalProvider && !isOllamaCloudModel(params?.model);
const isExplicitLocalHostnameRuntimeModel =
typeof baseUrl === "string" &&
baseUrl.length > 0 &&
isExplicitLocalHostnameBaseUrl(baseUrl) &&
!isOllamaCloudModel(params?.model);
const isSelfHostedHostnameRuntimeModel =
typeof baseUrl === "string" &&
baseUrl.length > 0 &&
isBareProviderHostnameBaseUrl(baseUrl) &&
(isSelfHostedProviderId(params?.model?.provider) ||
hasConfiguredLocalProviderSignal({ cfg: params?.cfg, provider: params?.model?.provider })) &&
!isOllamaCloudModel(params?.model);
const timeoutBounds = [
runTimeoutIsNoTimeout ? undefined : runTimeoutMs,
hasExplicitRunTimeout ? undefined : agentTimeoutMs,
@@ -174,7 +274,14 @@ export function resolveLlmIdleTimeoutMs(params?: {
return 0;
}
if (params?.trigger === "cron") {
return clampTimeoutMs(runTimeoutMs);
if (
isLocalRuntimeModel ||
isExplicitLocalHostnameRuntimeModel ||
isSelfHostedHostnameRuntimeModel
) {
return clampTimeoutMs(runTimeoutMs);
}
return clampTimeoutMs(Math.min(runTimeoutMs, CRON_LLM_IDLE_TIMEOUT_MS));
}
return clampImplicitTimeoutMs(runTimeoutMs);
}
@@ -190,10 +297,7 @@ export function resolveLlmIdleTimeoutMs(params?: {
// baseUrl pointing at loopback / private-network / `.local`. Ollama cloud
// models are still hosted remotely even when proxied through local Ollama, so
// keep the cloud watchdog for `*:cloud` model ids.
const baseUrl = params?.model?.baseUrl;
const isLocalProvider =
typeof baseUrl === "string" && baseUrl.length > 0 && isLocalProviderBaseUrl(baseUrl);
if (isLocalProvider && !isOllamaCloudModel(params?.model)) {
if (isLocalRuntimeModel) {
return 0;
}

View File

@@ -43,6 +43,7 @@ export { isToolAllowed, resolveSandboxToolPolicyForAgent } from "./sandbox/tool-
export type { SandboxFsBridge, SandboxFsStat, SandboxResolvedPath } from "./sandbox/fs-bridge.js";
export {
buildExecRemoteCommand,
buildRemoteWorkdirValidationCommand,
buildRemoteCommand,
buildSshSandboxArgv,
buildValidatedExecRemoteCommand,
@@ -52,6 +53,7 @@ export {
runSshSandboxCommand,
shellEscape,
uploadDirectoryToSshTarget,
VALIDATE_REMOTE_WORKDIR_SCRIPT,
} from "./sandbox/ssh.js";
export { sanitizeEnvVars } from "./sandbox/sanitize-env-vars.js";
export { createRemoteShellSandboxFsBridge } from "./sandbox/remote-fs-bridge.js";
@@ -68,9 +70,12 @@ export type {
SandboxBackendHandle,
SandboxBackendId,
SandboxBackendManager,
SandboxBackendPreparedWorkdirDiscarder,
SandboxBackendRegistration,
SandboxBackendRuntimeInfo,
SandboxBackendWorkdirValidation,
SandboxBackendWorkdirResolver,
SandboxBackendWorkdirValidator,
} from "./sandbox/backend.js";
export type { RemoteShellSandboxHandle } from "./sandbox/remote-fs-bridge.js";
export type {

View File

@@ -18,6 +18,11 @@ export type SandboxBackendExecSpec = {
finalizeToken?: unknown;
};
export type SandboxBackendWorkdirValidation = "host" | "backend";
export type SandboxBackendWorkdirValidator = (workdir: string) => Promise<string | null>;
export type SandboxBackendPreparedWorkdirDiscarder = (workdir: string) => void;
/** Parameters for backend-managed shell commands used by fs bridges and probes. */
export type SandboxBackendCommandParams = {
script: string;
@@ -59,6 +64,18 @@ export type SandboxBackendHandle = {
env?: Record<string, string>;
configLabel?: string;
configLabelKind?: string;
/**
* Remote backends own cwd existence checks because valid runtime paths may
* not exist in the local workspace mirror. Backend validation must be paired
* with validateWorkdir so cwd is proved after before_tool_call adjustments
* and before env resolution, approval, preflight, and launch.
*/
workdirValidation?: SandboxBackendWorkdirValidation;
validateWorkdir?: SandboxBackendWorkdirValidator;
/** Discard one-shot state created while validating a backend-owned cwd. */
discardPreparedWorkdir?: SandboxBackendPreparedWorkdirDiscarder;
/** Remote cwd roots managed by backend validation. Defaults to workdir. */
workdirRoots?: readonly string[];
capabilities?: {
browser?: boolean;
};

View File

@@ -20,6 +20,7 @@ export type {
SandboxBackendManager,
SandboxBackendRegistration,
SandboxBackendRuntimeInfo,
SandboxBackendWorkdirValidation,
SandboxBackendWorkdirResolver,
} from "./backend.types.js";
export type {
@@ -27,6 +28,8 @@ export type {
SandboxBackendCommandResult,
SandboxBackendExecSpec,
SandboxBackendHandle,
SandboxBackendPreparedWorkdirDiscarder,
SandboxBackendWorkdirValidator,
} from "./backend-handle.types.js";
const SANDBOX_BACKEND_FACTORIES_STATE_KEY = Symbol.for("openclaw.sandboxBackendFactories");

View File

@@ -68,5 +68,8 @@ export type {
SandboxBackendCommandParams,
SandboxBackendCommandResult,
SandboxBackendExecSpec,
SandboxBackendPreparedWorkdirDiscarder,
SandboxBackendWorkdirValidation,
SandboxBackendWorkdirValidator,
SandboxFsBridgeContext,
} from "./backend-handle.types.js";

View File

@@ -33,7 +33,8 @@ vi.mock("./ssh.js", async () => {
};
});
const { createSshSandboxBackend, sshSandboxBackendManager } = await import("./ssh-backend.js");
const { createSshSandboxBackend, resolveSshRuntimePaths, sshSandboxBackendManager } =
await import("./ssh-backend.js");
const tempDirs: string[] = [];
async function createTempDir(prefix: string): Promise<string> {
@@ -341,6 +342,173 @@ describe("ssh sandbox backend", () => {
expect(sshMocks.disposeSshSandboxSession).toHaveBeenCalledTimes(2);
});
it("validates remote workdirs before exec accepts backend-owned cwd", async () => {
sshMocks.runSshSandboxCommand
.mockResolvedValueOnce({
stdout: Buffer.from("1\n"),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.from("/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/workspace/src\n"),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.alloc(0),
stderr: Buffer.from("remote directory not found\n"),
code: 1,
})
.mockResolvedValueOnce({
stdout: Buffer.from("/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/agent/src\n"),
stderr: Buffer.alloc(0),
code: 0,
});
const backend = await createSshSandboxBackend({
sessionKey: "agent:worker:task",
scopeKey: "agent:worker",
workspaceDir: "/tmp/workspace",
agentWorkspaceDir: "/tmp/workspace",
cfg: createBackendSandboxConfig({
target: "peter@example.com:2222",
}),
});
await expect(
backend.validateWorkdir?.(
"/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/workspace/src",
),
).resolves.toBe("/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/workspace/src");
await expect(
backend.validateWorkdir?.(
"/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/workspace/missing",
),
).resolves.toBeNull();
await expect(
backend.validateWorkdir?.("/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/agent/src"),
).resolves.toBe("/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/agent/src");
const validationCommand = String(requireSshRunCommandParams(1).remoteCommand);
expect(validationCommand).toContain("openclaw-validate-workdir");
expect(validationCommand).toContain("remote directory must stay under root");
const agentValidationCommand = String(requireSshRunCommandParams(3).remoteCommand);
expect(agentValidationCommand).toContain(
"/remote/openclaw/openclaw-ssh-agent-worker-abcd1234/agent",
);
});
it("refreshes materialized skills before validating a skills workdir", async () => {
const skillsWorkspaceDir = await createTempDir("openclaw-ssh-skills-");
await fs.mkdir(path.join(skillsWorkspaceDir, "skills", "demo"), { recursive: true });
const runtimePaths = resolveSshRuntimePaths("/remote/openclaw", "agent:worker");
const skillsWorkdir = path.posix.join(runtimePaths.remoteSkillsWorkspaceDir, "skills", "demo");
sshMocks.runSshSandboxCommand
.mockResolvedValueOnce({
stdout: Buffer.from("1\n"),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.alloc(0),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.from(`${skillsWorkdir}\n`),
stderr: Buffer.alloc(0),
code: 0,
});
const backend = await createSshSandboxBackend({
sessionKey: "agent:worker:task",
scopeKey: "agent:worker",
workspaceDir: "/tmp/workspace",
agentWorkspaceDir: "/tmp/workspace",
skillsWorkspaceDir,
cfg: createBackendSandboxConfig({
target: "peter@example.com:2222",
}),
});
await expect(backend.validateWorkdir?.(skillsWorkdir)).resolves.toBe(skillsWorkdir);
const execSpec = await backend.buildExecSpec({
command: "pwd",
workdir: skillsWorkdir,
env: {},
usePty: false,
});
expect(sshMocks.uploadDirectoryToSshTarget).toHaveBeenCalledOnce();
const skillsUploadParams = requireSshUploadParams(0, "skills upload params");
expect(skillsUploadParams.localDir).toBe(skillsWorkspaceDir);
expect(skillsUploadParams.remoteDir).toBe(runtimePaths.remoteSkillsWorkspaceDir);
expect(execSpec.argv.at(-1)).toContain(skillsWorkdir);
await backend.finalizeExec?.({
status: "completed",
exitCode: 0,
timedOut: false,
token: execSpec.finalizeToken,
});
});
it("discards validated materialized skills refreshes that do not launch", async () => {
const skillsWorkspaceDir = await createTempDir("openclaw-ssh-skills-");
await fs.mkdir(path.join(skillsWorkspaceDir, "skills", "demo"), { recursive: true });
const runtimePaths = resolveSshRuntimePaths("/remote/openclaw", "agent:worker");
const skillsWorkdir = path.posix.join(runtimePaths.remoteSkillsWorkspaceDir, "skills", "demo");
sshMocks.runSshSandboxCommand
.mockResolvedValueOnce({
stdout: Buffer.from("1\n"),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.alloc(0),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.from(`${skillsWorkdir}\n`),
stderr: Buffer.alloc(0),
code: 0,
})
.mockResolvedValueOnce({
stdout: Buffer.alloc(0),
stderr: Buffer.alloc(0),
code: 0,
});
const backend = await createSshSandboxBackend({
sessionKey: "agent:worker:task",
scopeKey: "agent:worker",
workspaceDir: "/tmp/workspace",
agentWorkspaceDir: "/tmp/workspace",
skillsWorkspaceDir,
cfg: createBackendSandboxConfig({
target: "peter@example.com:2222",
}),
});
await expect(backend.validateWorkdir?.(skillsWorkdir)).resolves.toBe(skillsWorkdir);
backend.discardPreparedWorkdir?.(skillsWorkdir);
const execSpec = await backend.buildExecSpec({
command: "pwd",
workdir: skillsWorkdir,
env: {},
usePty: false,
});
expect(sshMocks.uploadDirectoryToSshTarget).toHaveBeenCalledTimes(2);
await backend.finalizeExec?.({
status: "completed",
exitCode: 0,
timedOut: false,
token: execSpec.finalizeToken,
});
});
it("refreshes materialized skills before each exec and remote fs command", async () => {
const skillsWorkspaceDir = await createTempDir("openclaw-ssh-skills-");
await fs.mkdir(path.join(skillsWorkspaceDir, "skills"), { recursive: true });

Some files were not shown because too many files have changed in this diff Show More