Compare commits

..

20 Commits

Author SHA1 Message Date
Vincent Koc
f63d668458 fix(markdown): honor fence suffix whitespace rules
Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
2026-06-26 14:00:34 -07:00
ly-wang19
cdf35f708c fix(markdown): a fenced-code line with trailing text is content, not a closing fence
scanFenceSpans accepted any line starting with >=3 matching fence markers as a
closing fence, ignoring trailing text after the marker. Per CommonMark a closing
fence may be followed only by whitespace, so a code-content line such as
"``` not a close" was wrongly treated as a close: the block ended early, the
following lines were reported as outside any fence, and the trailing marker line
became a new unclosed opener.

That made isSafeFenceBreak() return true for offsets inside the real code block
and findFenceSpanAt() return undefined, so chunkers (chunkMarkdownText, the
embedded-agent block chunker) could split inside a fenced code block — the exact
thing this module exists to prevent.

Require the closing fence's trailing text to be whitespace-only. Opening info
strings, bare closes, and longer same-marker closes are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 14:59:04 +08:00
Vincent Koc
56d95b18f4 fix(sdk): refresh plugin sdk api baseline 2026-06-25 06:51:02 +02:00
Vincent Koc
e7f2b125f6 fix(test): isolate upgrade survivor artifacts 2026-06-25 05:49:46 +02:00
Vincent Koc
643410c1f3 test(qa): scope fanout marker proof to channel runtime 2026-06-25 10:20:51 +08:00
Vincent Koc
8d4e40d293 test(qa): extend fanout marker wait 2026-06-25 10:20:51 +08:00
Vincent Koc
068ae4eb4b test(qa): allow Codex fanout completion window 2026-06-25 10:20:51 +08:00
Vincent Koc
dad7168c2f fix(qa): align runtime parity evidence with Codex 2026-06-25 10:20:51 +08:00
Vincent Koc
31a65e0647 fix(agents): preserve absent embedded session keys 2026-06-25 10:20:51 +08:00
Vincent Koc
1a04b8eb98 test(plugins): review channel daemon spawn findings 2026-06-25 03:42:44 +02:00
clawsweeper[bot]
a21144d8a6 fix(cron): preserve enabled-with-defaults failure alert through store roundtrip (fixes #96589) (AI-assisted) (#96615)
Summary:
- The PR preserves `failure_alert_disabled === 0` as the enabled-with-defaults failure-alert state and adds focused codec roundtrip tests.
- PR surface: Source +2, Tests +54. Total +56 across 2 files.
- Reproducibility: yes. At source level, current main encodes `failureAlert: {}` with `failure_alert_disabled = 0`, then decodes it as `undefined` when all explicit alert option columns are null.

Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.

Validation:
- ClawSweeper review passed for head bd9b2a1798.
- Required merge gates passed before the squash merge.

Prepared head SHA: bd9b2a1798
Review: https://github.com/openclaw/openclaw/pull/96615#issuecomment-4794949533

Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: liuhao1024 <11816344+liuhao1024@users.noreply.github.com>
Approved-by: takhoffman
2026-06-25 01:17:31 +00:00
Dallin Romney
8a5cb85c31 ci: default maturity evidence to all profile (#96595) 2026-06-24 17:32:25 -07:00
Dallin Romney
61d4ff782e docs: clarify maturity scorecard scoring (#96594)
* docs: clarify maturity scorecard scoring

* chore: split qa profile workflow change

* docs: keep maturity coverage values stable

* test: keep maturity renderer fixture in core boundary
2026-06-24 17:32:08 -07:00
Vincent Koc
3ab7a72764 fix(plugin-sdk): update surface budget 2026-06-25 02:22:44 +02:00
Peter Lee
b4bdea0d02 fix(agent): emit model.usage diagnostic for HTTP ingress traffic (#96152)
* fix(agents): emit model.usage diagnostic for HTTP ingress traffic

* fix(agents): emit model.usage diagnostic for HTTP ingress traffic

* fix(agent): add regression tests and refactor ingress model.usage diagnostic emission

* fix(agent): resolve oxlint curly and no-useless-fallback-in-spread violations
2026-06-24 20:17:28 -04:00
Sarah Fortune
113d6f3c64 fix: surface provider authentication failures in channels (#96599)
* fix: surface provider authentication failures in channels

* fix: handle typed provider auth failures

---------

Co-authored-by: Sarah Fortune <sarah.fortune@gmail.com>
2026-06-24 17:16:30 -07:00
joshavant
0a14444924 Bound successful provider response reads 2026-06-24 19:08:22 -05:00
Peter Lee
0a042f68df fix(agent): replace self-wait with deferred release in retained-lock abort cleanup (#96100)
* fix(agent): wait for retained session write before releasing held lock on abort

* fix(agent): replace self-wait with deferred release in retained-lock abort cleanup

* fix(test): reject fallback acquire with SessionWriteLockTimeoutError in active-scope cleanup test

* fix(agent): trim retained-lock comments

Signed-off-by: sallyom <somalley@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Co-authored-by: sallyom <somalley@redhat.com>
2026-06-24 19:50:58 -04:00
Vincent Koc
3ab8d6aa60 fix(e2e): extend Codex on-demand install timeout 2026-06-25 00:32:29 +02:00
Omar Shahine
f2af052cee perf(imessage): show typing sooner for slow replies (#95621)
Merged via squash.

Prepared head SHA: 65e9ad10fd
Co-authored-by: omarshahine <10343873+omarshahine@users.noreply.github.com>
Co-authored-by: omarshahine <10343873+omarshahine@users.noreply.github.com>
Reviewed-by: @omarshahine
2026-06-24 15:31:48 -07:00
64 changed files with 1964 additions and 383 deletions

View File

@@ -134,7 +134,7 @@ jobs:
with:
ref: ${{ inputs.ref }}
expected_sha: ${{ needs.validate_selected_ref.outputs.selected_revision }}
qa_profile: release
qa_profile: all
secrets:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
@@ -238,8 +238,8 @@ jobs:
}
const evidence = JSON.parse(fs.readFileSync(evidencePath, "utf8"));
if (evidence.profile !== "release") {
throw new Error(`qa-evidence.json profile must be release, got ${JSON.stringify(evidence.profile)}`);
if (evidence.profile !== "all") {
throw new Error(`qa-evidence.json profile must be all, got ${JSON.stringify(evidence.profile)}`);
}
const artifactDir = path.dirname(evidencePath);
@@ -256,8 +256,8 @@ jobs:
const manifestPath = path.join(artifactDir, manifestNames[0]);
const manifest = JSON.parse(fs.readFileSync(manifestPath, "utf8"));
const manifestProfile = manifest.qaProfile ?? evidence.profile;
if (manifestProfile !== "release") {
throw new Error(`QA evidence manifest profile must be release, got ${JSON.stringify(manifestProfile)}`);
if (manifestProfile !== "all") {
throw new Error(`QA evidence manifest profile must be all, got ${JSON.stringify(manifestProfile)}`);
}
if (manifest.targetSha !== targetSha) {
throw new Error(`QA evidence manifest targetSha ${manifest.targetSha} does not match selected ref ${targetSha}`);
@@ -428,14 +428,14 @@ jobs:
cat > "$body_file" <<BODY
## Summary
- render maturity scorecard docs from \`qa/maturity-scores.yaml\` and release QA evidence
- render maturity scorecard docs from \`qa/maturity-scores.yaml\` and full taxonomy QA evidence
- maturity source ref: ${REF_INPUT}
- QA evidence run: ${evidence_run_id}
## Verification
- QA Lab maturity score validation passed
- Maturity scorecard workflow rendered docs from release profile qa-evidence.json artifacts with strict inputs
- Maturity scorecard workflow rendered docs from all profile qa-evidence.json artifacts with strict inputs
BODY
pr_url="$(gh pr list --head "$branch" --state open --json url --jq '.[0].url // ""')"

View File

@@ -18,7 +18,7 @@ on:
qa_profile:
description: Taxonomy QA profile id to run (for example release or all)
required: true
default: release
default: all
type: string
workflow_call:
inputs:

View File

@@ -1,2 +1,2 @@
6620d5a6100d60f98cf13b8a13e3c46e9631400d1a1d7c0c6a22c490da810813 plugin-sdk-api-baseline.json
961377a56fd0fb3307fb4be95dcb480610f14c717e1b82e4bf262dd5faaddcbc plugin-sdk-api-baseline.jsonl
9d5b34975270bb2d16748002c1441ab48fde81af8eb12cc8eb3e341c862232ff plugin-sdk-api-baseline.json
f1a6ff189498d955cad6d6fb912eb4cad7aeb628f89c51d0745e146fe0d163d6 plugin-sdk-api-baseline.jsonl

View File

@@ -579,7 +579,7 @@ When `imsg launch` is running and `openclaw channels status --probe` reports `pr
</Accordion>
<Accordion title="Read receipts and typing">
When the private API bridge is up, accepted inbound chats are marked read before dispatch and a typing bubble is shown to the sender while the agent generates. Disable read-marking with:
When the private API bridge is up, accepted inbound chats are marked read and direct chats show a typing bubble as soon as the turn is accepted, while the agent prepares context and generates. Disable read-marking with:
```json5
{

View File

@@ -19,42 +19,23 @@ Use this page to answer one question: which OpenClaw surfaces are credible choic
## At a glance
<div className="maturity-summary-grid">
<div className="maturity-summary-item maturity-score-experimental">
<div className="maturity-summary-heading">
<span className="maturity-summary-value">4%</span>
<span>Coverage</span>
</div>
<div className="maturity-summary-bar" style={{ "--score": "4" }}><span /></div>
<div className="maturity-summary-meta">
<span className="maturity-level-pill maturity-level-experimental">Experimental</span>
<span>QA profile evidence</span>
</div>
</div>
<div className="maturity-summary-item maturity-score-alpha">
<div className="maturity-summary-heading">
<span className="maturity-summary-value">63%</span>
<span>Quality</span>
<span className="maturity-summary-value">67%</span>
<span>Maturity score</span>
</div>
<div className="maturity-summary-bar" style={{ "--score": "63" }}><span /></div>
<div className="maturity-summary-bar" style={{ "--score": "67" }}><span /></div>
<div className="maturity-summary-meta">
<span className="maturity-level-pill maturity-level-alpha">Alpha</span>
<span>Reliability and operator confidence</span>
</div>
</div>
<div className="maturity-summary-item maturity-score-beta">
<div className="maturity-summary-heading">
<span className="maturity-summary-value">70%</span>
<span>Completeness</span>
</div>
<div className="maturity-summary-bar" style={{ "--score": "70" }}><span /></div>
<div className="maturity-summary-meta">
<span className="maturity-level-pill maturity-level-beta">Beta</span>
<span>Expected workflow coverage</span>
<span>Quality + completeness</span>
<span>Coverage Experimental - 4%</span>
<span>Quality Alpha - 63%</span>
<span>Completeness Beta - 70%</span>
</div>
</div>
</div>
Coverage is deliberately evidence-led: an area does not become "ready" just because the implementation exists.
Coverage is deliberately evidence-led: an area does not become "ready" just because the implementation exists. It is not an input to the maturity score, but OpenClaw aims to keep end-to-end coverage above 90% for mature Stable-or-better features over time.
## Score bands

View File

@@ -57,34 +57,6 @@ Logging:
The macOS app checks the gateway version against its own version. If they're
incompatible, update the global CLI to match the app version.
## State directory on macOS
Keep OpenClaw state on a local, non-synced disk. Avoid iCloud Drive and other
cloud-synced folders because sync latency and file locks can affect sessions,
credentials, and Gateway state.
Set `OPENCLAW_STATE_DIR` to a local path only when you need an override.
`openclaw doctor` warns about common cloud-synced state paths and recommends
moving back to local storage. See
[environment variables](/help/environment#path-related-env-vars) and
[Doctor](/gateway/doctor).
## Debug app connectivity
Use the macOS debug CLI from a source checkout to exercise the same Gateway
WebSocket handshake and discovery logic the app uses:
```bash
cd apps/macos
swift run openclaw-mac connect --json
swift run openclaw-mac discover --timeout 3000 --json
```
`connect` accepts `--url`, `--token`, `--timeout`, and `--json`. `discover`
accepts `--host`, `--port`, `--timeout`, and `--json`. Compare discovery output
with `openclaw gateway discover --json` when you need to separate CLI discovery
from app-side connection issues.
## Smoke check
```bash

View File

@@ -114,18 +114,7 @@ Example (in JS):
window.location.href = "openclaw://agent?message=Review%20this%20design";
```
Supported query parameters:
- `message`: prefilled agent prompt.
- `sessionKey`: stable session identifier.
- `thinking`: optional thinking profile.
- `deliver`, `to`, or `channel`: delivery target.
- `timeoutSeconds`: optional run timeout.
- `key`: app-generated safety token for trusted local callers.
The app prompts for confirmation unless a valid key is provided. The prompt
shows the decoded message and destination, and execution uses the normal
Gateway run path after approval.
The app prompts for confirmation unless a valid key is provided.
## Security notes

View File

@@ -24,9 +24,6 @@ In SSH tunnel mode, discovered LAN/tailnet hostnames are saved as
`gateway.remote.sshTarget`. The app keeps `gateway.remote.url` on the local
tunnel endpoint, for example `ws://127.0.0.1:18789`, so CLI, Web Chat, and
the local node-host service all use the same safe loopback transport.
When discovery returns both raw Tailnet IPs and stable hostnames, the app
prefers Tailscale MagicDNS or LAN names so remote connections survive address
changes better.
If the local tunnel port differs from the remote gateway port, set
`gateway.remote.remotePort` to the port on the remote host.

View File

@@ -21,10 +21,6 @@ title: "macOS IPC"
- The app runs the Gateway (local mode) and connects to it as a node.
- Agent actions are performed via `node.invoke` (e.g. `system.run`, `system.notify`, `canvas.*`).
- Common Mac node commands include `canvas.*`, `camera.snap`, `camera.clip`,
`screen.snapshot`, `screen.record`, `system.run`, and `system.notify`.
- The node reports a `permissions` map so agents can see whether screen,
camera, microphone, speech, automation, or accessibility access is available.
### Node service + app IPC

View File

@@ -1,87 +1,228 @@
---
summary: "Install and use the OpenClaw macOS menu bar app"
summary: "OpenClaw macOS companion app (menu bar + gateway broker)"
read_when:
- Installing the macOS app
- Deciding between local and remote Gateway mode on macOS
- Looking for macOS app release downloads
- Implementing macOS app features
- Changing gateway lifecycle or node bridging on macOS
title: "macOS app"
---
The macOS app is the OpenClaw **menu bar companion**. Use it when you want a
native tray UI, macOS permission prompts, notifications, WebChat, voice input,
Canvas, or Mac-hosted node tools such as `system.run`.
The macOS app is the **menu-bar companion** for OpenClaw. It owns permissions,
manages/attaches to the Gateway locally (launchd or manual), and exposes macOS
capabilities to the agent as a node.
If you only need the CLI and Gateway, start with [Getting started](/start/getting-started).
## What it does
## Download
- Shows native notifications and status in the menu bar.
- Owns TCC prompts (Notifications, Accessibility, Screen Recording, Microphone,
Speech Recognition, Automation/AppleScript).
- Runs or connects to the Gateway (local or remote).
- Exposes macOS-only tools (Canvas, Camera, Screen Recording, `system.run`).
- Starts the local node host service in **remote** mode (launchd), and stops it in **local** mode.
- Optionally hosts **PeekabooBridge** for UI automation.
- Installs the global CLI (`openclaw`) on request via npm, pnpm, or bun (the app prefers npm, then pnpm, then bun; Node remains the recommended Gateway runtime).
Download macOS app builds from the
[OpenClaw GitHub releases](https://github.com/openclaw/openclaw/releases).
When a release includes macOS app assets, look for:
## Local vs remote mode
- `OpenClaw-<version>.dmg` (preferred)
- `OpenClaw-<version>.zip`
- **Local** (default): the app attaches to a running local Gateway if present;
otherwise it enables the launchd service via `openclaw gateway install`.
- **Remote**: the app connects to a Gateway over SSH/Tailscale and never starts
a local process.
The app starts the local **node host service** so the remote Gateway can reach this Mac.
The app does not spawn the Gateway as a child process.
Gateway discovery now prefers Tailscale MagicDNS names over raw tailnet IPs,
so the Mac app recovers more reliably when tailnet IPs change.
Some releases only include CLI, evidence, or Windows assets. If the newest
release has no macOS app asset, use the newest release that does, or build the
app from source with [macOS dev setup](/platforms/mac/dev-setup).
## Launchd control
## First run
The app manages a per-user LaunchAgent labeled `ai.openclaw.gateway`
(or `ai.openclaw.<profile>` when using `--profile`/`OPENCLAW_PROFILE`; legacy `com.openclaw.*` still unloads).
```bash
launchctl kickstart -k gui/$UID/ai.openclaw.gateway
launchctl bootout gui/$UID/ai.openclaw.gateway
```
Replace the label with `ai.openclaw.<profile>` when running a named profile.
If the LaunchAgent isn't installed, enable it from the app or run
`openclaw gateway install`.
If the gateway repeatedly disappears for minutes to hours and only resumes when you touch the Control UI or SSH into the host, see the troubleshooting note for macOS Maintenance Sleep / `ENETDOWN` crashes and launchd's respawn-protection gate in [Gateway troubleshooting](/gateway/troubleshooting#macos-gateway-silently-stops-responding-then-resumes-when-you-touch-the-dashboard).
## Node capabilities (mac)
The macOS app presents itself as a node. Common commands:
- Canvas: `canvas.present`, `canvas.navigate`, `canvas.eval`, `canvas.snapshot`, `canvas.a2ui.*`
- Camera: `camera.snap`, `camera.clip`
- Screen: `screen.snapshot`, `screen.record`
- System: `system.run`, `system.notify`
The node reports a `permissions` map so agents can decide what's allowed.
Node service + app IPC:
- When the headless node host service is running (remote mode), it connects to the Gateway WS as a node.
- `system.run` executes in the macOS app (UI/TCC context) over a local Unix socket; prompts + output stay in-app.
Diagram (SCI):
```
Gateway -> Node Service (WS)
| IPC (UDS + token + HMAC + TTL)
v
Mac App (UI + TCC + system.run)
```
## Exec approvals (system.run)
`system.run` is controlled by **Exec approvals** in the macOS app (Settings → Exec approvals).
Security + ask + allowlist are stored locally on the Mac in:
```
~/.openclaw/exec-approvals.json
```
Example:
```json
{
"version": 1,
"defaults": {
"security": "deny",
"ask": "on-miss"
},
"agents": {
"main": {
"security": "allowlist",
"ask": "on-miss",
"allowlist": [{ "pattern": "/opt/homebrew/bin/rg" }]
}
}
}
```
Notes:
- `allowlist` entries are glob patterns for resolved binary paths, or bare command names for PATH-invoked commands.
- Raw shell command text that contains shell control or expansion syntax (`&&`, `||`, `;`, `|`, `` ` ``, `$`, `<`, `>`, `(`, `)`) is treated as an allowlist miss and requires explicit approval (or allowlisting the shell binary).
- Choosing "Always Allow" in the prompt adds that command to the allowlist.
- `system.run` environment overrides are filtered (drops `PATH`, `DYLD_*`, `LD_*`, `BASHOPTS`, `FPATH`, `KSH_ENV`, `NODE_OPTIONS`, `NODE_REDIRECT_WARNINGS`, `NODE_REPL_EXTERNAL_MODULE`, `NODE_REPL_HISTORY`, `NODE_V8_COVERAGE`, `PYTHON*`, `PERL*`, `RUBYOPT`, `SHELLOPTS`, `PS4`, `TCLLIBPATH`) and then merged with the app's environment.
- For shell wrappers (`bash|sh|zsh ... -c/-lc`), request-scoped environment overrides are reduced to a small explicit allowlist (`TERM`, `LANG`, `LC_*`, `COLORTERM`, `NO_COLOR`, `FORCE_COLOR`).
- For allow-always decisions in allowlist mode, known dispatch wrappers (`env`, `flock`, `nice`, `nohup`, `stdbuf`, `timeout`) persist inner executable paths instead of wrapper paths. If unwrapping is not safe, no allowlist entry is persisted automatically.
## Deep links
The app registers the `openclaw://` URL scheme for local actions.
### `openclaw://agent`
Triggers a Gateway `agent` request.
```bash
open 'openclaw://agent?message=Hello%20from%20deep%20link'
```
Query parameters:
- `message` (required)
- `sessionKey` (optional)
- `thinking` (optional)
- `deliver` / `to` / `channel` (optional)
- `timeoutSeconds` (optional)
- `key` (optional unattended mode key)
Safety:
- Without `key`, the app prompts for confirmation.
- Without `key`, the app enforces a short message limit for the confirmation prompt and ignores `deliver` / `to` / `channel`.
- With a valid `key`, the run is unattended (intended for personal automations).
## Onboarding flow (typical)
1. Install and launch **OpenClaw.app**.
2. Complete the macOS permission checklist.
3. Pick **Local** or **Remote** mode.
4. Install the `openclaw` CLI if the app asks for it.
5. Open WebChat from the menu bar and send a test message.
2. Complete the permissions checklist (TCC prompts).
3. Ensure **Local** mode is active and the Gateway is running.
4. Install the CLI if you want terminal access.
For the CLI/Gateway setup path, use [Getting started](/start/getting-started).
For permission recovery, use [macOS permissions](/platforms/mac/permissions).
## State dir placement (macOS)
## Choose a Gateway mode
Avoid putting your OpenClaw state dir in iCloud or other cloud-synced folders.
Sync-backed paths can add latency and occasionally cause file-lock/sync races for
sessions and credentials.
| Mode | Use it when | Detail page |
| ------ | --------------------------------------------------------------------------------------- | -------------------------------------------------- |
| Local | This Mac should run the Gateway and keep it alive with launchd. | [Gateway on macOS](/platforms/mac/bundled-gateway) |
| Remote | Another host runs the Gateway and this Mac should control it over SSH, LAN, or Tailnet. | [Remote control](/platforms/mac/remote) |
Prefer a local non-synced state path such as:
Local mode requires an installed `openclaw` CLI. The app can install it, or you
can follow [Gateway on macOS](/platforms/mac/bundled-gateway).
```bash
OPENCLAW_STATE_DIR=~/.openclaw
```
## What the app owns
If `openclaw doctor` detects state under:
- Menu bar status, notifications, health, and WebChat.
- macOS permission prompts for screen, microphone, speech, automation, and accessibility.
- Local node tools such as Canvas, camera/screen capture, notifications, and `system.run`.
- Exec approval prompts for Mac-hosted commands.
- Remote-mode SSH tunnels or direct Gateway connections.
- `~/Library/Mobile Documents/com~apple~CloudDocs/...`
- `~/Library/CloudStorage/...`
The app does **not** replace the OpenClaw Gateway or general CLI docs. Core
Gateway configuration, providers, plugins, channels, tools, and security live in
their own docs.
it will warn and recommend moving back to a local path.
## macOS detail pages
## Build and dev workflow (native)
| Task | Read |
| ---------------------------------------- | ------------------------------------------------------------------------------------------- |
| Install or debug the CLI/Gateway service | [Gateway on macOS](/platforms/mac/bundled-gateway) |
| Keep state out of cloud-synced folders | [Gateway on macOS](/platforms/mac/bundled-gateway#state-directory-on-macos) |
| Debug app discovery and connectivity | [Gateway on macOS](/platforms/mac/bundled-gateway#debug-app-connectivity) |
| Understand launchd behavior | [Gateway lifecycle](/platforms/mac/child-process) |
| Fix permissions or signing/TCC issues | [macOS permissions](/platforms/mac/permissions) |
| Connect to a remote Gateway | [Remote control](/platforms/mac/remote) |
| Read menu bar status and health checks | [Menu bar](/platforms/mac/menu-bar), [Health checks](/platforms/mac/health) |
| Use the embedded chat UI | [WebChat](/platforms/mac/webchat) |
| Use voice wake or push-to-talk | [Voice wake](/platforms/mac/voicewake) |
| Use Canvas and Canvas deep links | [Canvas](/platforms/mac/canvas) |
| Host PeekabooBridge for UI automation | [Peekaboo bridge](/platforms/mac/peekaboo) |
| Configure command approvals | [Exec approvals](/tools/exec-approvals), [advanced details](/tools/exec-approvals-advanced) |
| Inspect Mac node commands and app IPC | [macOS IPC](/platforms/mac/xpc) |
| Capture logs | [macOS logging](/platforms/mac/logging) |
| Build from source | [macOS dev setup](/platforms/mac/dev-setup) |
- `cd apps/macos && swift build`
- `swift run OpenClaw` (or Xcode)
- Package app: `scripts/package-mac-app.sh`
## Related
## Debug gateway connectivity (macOS CLI)
- [Platforms](/platforms)
- [Getting started](/start/getting-started)
- [Gateway](/gateway)
- [Exec approvals](/tools/exec-approvals)
Use the debug CLI to exercise the same Gateway WebSocket handshake and discovery
logic that the macOS app uses, without launching the app.
```bash
cd apps/macos
swift run openclaw-mac connect --json
swift run openclaw-mac discover --timeout 3000 --json
```
Connect options:
- `--url <ws://host:port>`: override config
- `--mode <local|remote>`: resolve from config (default: config or local)
- `--probe`: force a fresh health probe
- `--timeout <ms>`: request timeout (default: `15000`)
- `--json`: structured output for diffing
Discovery options:
- `--include-local`: include gateways that would be filtered as "local"
- `--timeout <ms>`: overall discovery window (default: `2000`)
- `--json`: structured output for diffing
<Tip>
Compare against `openclaw gateway discover --json` to see whether the macOS app's discovery pipeline (`local.` plus the configured wide-area domain, with wide-area and Tailscale Serve fallbacks) differs from the Node CLI's `dns-sd` based discovery.
</Tip>
## Remote connection plumbing (SSH tunnels)
When the macOS app runs in **Remote** mode, it opens an SSH tunnel so local UI
components can talk to a remote Gateway as if it were on localhost.
### Control tunnel (Gateway WebSocket port)
- **Purpose:** health checks, status, Web Chat, config, and other control-plane calls.
- **Local port:** the Gateway port (default `18789`), always stable.
- **Remote port:** the same Gateway port on the remote host.
- **Behavior:** no random local port; the app reuses an existing healthy tunnel
or restarts it if needed.
- **SSH shape:** `ssh -N -L <local>:127.0.0.1:<remote>` with BatchMode +
ExitOnForwardFailure + keepalive options.
- **IP reporting:** the SSH tunnel uses loopback, so the gateway will see the node
IP as `127.0.0.1`. Use **Direct (ws/wss)** transport if you want the real client
IP to appear (see [macOS remote access](/platforms/mac/remote)).
For setup steps, see [macOS remote access](/platforms/mac/remote). For protocol
details, see [Gateway protocol](/gateway/protocol).
## Related docs
- [Gateway runbook](/gateway)
- [Gateway (macOS)](/platforms/mac/bundled-gateway)
- [macOS permissions](/platforms/mac/permissions)
- [Canvas](/platforms/mac/canvas)

View File

@@ -269,7 +269,7 @@ html.dark .nav-tabs-underline {
.maturity-summary-grid {
display: grid;
grid-template-columns: repeat(3, minmax(0, 1fr));
grid-template-columns: repeat(auto-fit, minmax(min(220px, 100%), 1fr));
margin: 14px 0 20px;
border-top: 1px solid color-mix(in oklab, rgb(var(--primary)) 18%, transparent);
border-bottom: 1px solid color-mix(in oklab, rgb(var(--primary)) 18%, transparent);

View File

@@ -1,5 +1,6 @@
// Duckduckgo plugin module implements ddg client behavior.
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-contracts";
import { readProviderTextResponse } from "openclaw/plugin-sdk/provider-http";
import {
DEFAULT_CACHE_TTL_MINUTES,
DEFAULT_SEARCH_COUNT,
@@ -113,6 +114,10 @@ function isBotChallenge(html: string): boolean {
return /g-recaptcha|are you a human|id="challenge-form"|name="challenge"/i.test(html);
}
async function readDuckDuckGoHtmlResponse(response: Response): Promise<string> {
return await readProviderTextResponse(response, "DuckDuckGo search");
}
function parseDuckDuckGoHtml(html: string): DuckDuckGoResult[] {
const results: DuckDuckGoResult[] = [];
const resultRegex = /<a\b(?=[^>]*\bclass="[^"]*\bresult__a\b[^"]*")([^>]*)>([\s\S]*?)<\/a>/gi;
@@ -202,7 +207,7 @@ export async function runDuckDuckGoSearch(params: {
);
}
const html = await response.text();
const html = await readDuckDuckGoHtmlResponse(response);
if (isBotChallenge(html)) {
throw new Error("DuckDuckGo returned a bot-detection challenge.");
}
@@ -238,5 +243,6 @@ export const testing = {
decodeHtmlEntities,
isBotChallenge,
parseDuckDuckGoHtml,
readDuckDuckGoHtmlResponse,
};
export { testing as __testing };

View File

@@ -1,5 +1,6 @@
// Duckduckgo tests cover ddg search provider plugin behavior.
import { afterAll, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
import { createDuckDuckGoWebSearchProvider as createDuckDuckGoWebSearchContractProvider } from "../web-search-contract-api.js";
import { DEFAULT_DDG_SAFE_SEARCH, resolveDdgRegion, resolveDdgSafeSearch } from "./config.js";
@@ -104,6 +105,24 @@ describe("duckduckgo web search provider", () => {
expect(runDuckDuckGoSearch).not.toHaveBeenCalled();
});
it("bounds successful DuckDuckGo HTML bodies without using response.text()", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "Content-Type": "text/html" },
});
const textSpy = vi.spyOn(streamed.response, "text").mockRejectedValue(new Error("unbounded"));
await expect(ddgClientTesting.readDuckDuckGoHtmlResponse(streamed.response)).rejects.toThrow(
"DuckDuckGo search: text response exceeds 16777216 bytes",
);
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(textSpy).not.toHaveBeenCalled();
});
it("reads region from plugin config and normalizes empty values away", () => {
expect(
resolveDdgRegion({

View File

@@ -1,5 +1,6 @@
// Firecrawl plugin module implements firecrawl client behavior.
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-contracts";
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
import {
DEFAULT_CACHE_TTL_MINUTES,
markdownToText,
@@ -41,6 +42,7 @@ const SCRAPE_CACHE = new Map<
>();
const DEFAULT_SEARCH_COUNT = 5;
const DEFAULT_SCRAPE_MAX_CHARS = 50_000;
const FIRECRAWL_SCRAPE_RESPONSE_MAX_BYTES = 64 * 1024 * 1024;
const ALLOWED_FIRECRAWL_HOSTS = new Set(["api.firecrawl.dev"]);
const FIRECRAWL_SELF_HOSTED_PRIVATE_ERROR =
"Firecrawl custom baseUrl must target a private or internal self-hosted endpoint.";
@@ -65,12 +67,9 @@ type FirecrawlSearchItem = {
async function readFirecrawlJsonResponse(
response: Response,
label: string,
opts?: { maxBytes?: number },
): Promise<Record<string, unknown>> {
try {
return (await response.json()) as Record<string, unknown>;
} catch (cause) {
throw new Error(`${label}: malformed JSON response`, { cause });
}
return await readProviderJsonResponse<Record<string, unknown>>(response, label, opts);
}
export type FirecrawlSearchParams = {
@@ -220,11 +219,9 @@ async function postFirecrawlJson<T>(
const readJsonPayload = async (): Promise<Record<string, unknown> | null> => {
const candidate = response as Response & { clone?: () => Response };
const jsonResponse = typeof candidate.clone === "function" ? candidate.clone() : response;
if (typeof jsonResponse.json !== "function") {
return null;
}
try {
const payload = await jsonResponse.json();
const body = await readResponseText(jsonResponse, { maxBytes: 64_000 });
const payload = JSON.parse(body.text) as unknown;
return payload && typeof payload === "object" && !Array.isArray(payload)
? (payload as Record<string, unknown>)
: null;
@@ -579,7 +576,10 @@ export async function runFirecrawlScrape(
},
},
async (response) => {
const payloadLocal = await readFirecrawlJsonResponse(response, "Firecrawl fetch failed");
const payloadLocal = await readFirecrawlJsonResponse(response, "Firecrawl fetch failed", {
// Scrape can legitimately return page bodies before maxChars truncates parsed output.
maxBytes: FIRECRAWL_SCRAPE_RESPONSE_MAX_BYTES,
});
if (payloadLocal.success === false) {
const detail =
typeof payloadLocal.error === "string"
@@ -613,6 +613,7 @@ export const testing = {
assertFirecrawlScrapeTargetAllowed,
parseFirecrawlScrapePayload,
postFirecrawlJson,
readFirecrawlJsonResponse,
resolveEndpoint,
validateFirecrawlBaseUrl,
resolveSearchItems,

View File

@@ -2,6 +2,7 @@
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-contracts";
import { mockPinnedHostnameResolution } from "openclaw/plugin-sdk/test-env";
import { afterAll, afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
import {
DEFAULT_FIRECRAWL_BASE_URL,
DEFAULT_FIRECRAWL_MAX_AGE_MS,
@@ -966,6 +967,27 @@ describe("firecrawl tools", () => {
).rejects.toThrow("Firecrawl Search API error: malformed JSON response");
});
it("bounds successful Firecrawl JSON bodies before parsing", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const jsonSpy = vi.spyOn(streamed.response, "json").mockRejectedValue(new Error("unbounded"));
await expect(
firecrawlClientTesting.readFirecrawlJsonResponse(
streamed.response,
"Firecrawl Search API error",
),
).rejects.toThrow("Firecrawl Search API error: JSON response exceeds 16777216 bytes");
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(jsonSpy).not.toHaveBeenCalled();
});
it("reports malformed Firecrawl scrape JSON with a stable provider error", async () => {
global.fetch = vi.fn(
async () =>

View File

@@ -256,6 +256,183 @@ describe("iMessage monitor last-route updates", () => {
});
});
it("keeps direct progress options when imsg lacks native typing support", async () => {
setCachedIMessagePrivateApiStatus("imsg", {
available: true,
v2Ready: true,
selectors: {},
rpcMethods: ["watch.subscribe", "send", "read"],
});
dispatchInboundMessageMock.mockImplementationOnce(async (params) => {
expect(params.replyOptions?.suppressDefaultToolProgressMessages).toBe(true);
expect(params.replyOptions?.allowProgressCallbacksWhenSourceDeliverySuppressed).toBe(true);
expect(params.replyOptions?.onToolStart).toBeUndefined();
return { queuedFinal: false, counts: { tool: 0, block: 0, final: 0 } } as const;
});
let onNotification: ((message: { method: string; params: unknown }) => void) | undefined;
const client = {
request: vi.fn(async (method: string) => {
if (method === "watch.subscribe") {
return { subscription: 1 };
}
if (method === "typing") {
throw new Error("typing should not start without native typing support");
}
throw new Error(`unexpected imsg method ${method}`);
}),
waitForClose: vi.fn(async () => {
onNotification?.({
method: "message",
params: {
message: {
id: 13,
chat_id: 123,
sender: "+15550001111",
is_from_me: false,
text: "run a long script without native typing",
is_group: false,
created_at: new Date().toISOString(),
},
},
});
await Promise.resolve();
await Promise.resolve();
}),
stop: vi.fn(async () => {}),
};
createIMessageRpcClientMock.mockImplementation(async (params) => {
if (!params?.onNotification) {
throw new Error("expected iMessage notification handler");
}
onNotification = params.onNotification;
return client as never;
});
await monitorIMessageProvider({
config: {
channels: {
imessage: {
dmPolicy: "allowlist",
allowFrom: ["+15550001111"],
sendReadReceipts: false,
},
},
messages: { inbound: { debounceMs: 0 } },
session: { mainKey: "main" },
} as never,
runtime: { error: vi.fn(), exit: vi.fn(), log: vi.fn() },
});
await vi.waitFor(() => {
expect(dispatchInboundMessageMock).toHaveBeenCalledTimes(1);
});
expect(client.request).not.toHaveBeenCalledWith(
"typing",
expect.objectContaining({ typing: true }),
expect.anything(),
);
});
it("starts direct typing before dispatching the inbound turn", async () => {
setCachedIMessagePrivateApiStatus("imsg", {
available: true,
v2Ready: true,
selectors: {},
rpcMethods: ["watch.subscribe", "send", "typing"],
});
let onNotification: ((message: { method: string; params: unknown }) => void) | undefined;
const earlyTypingClient = {
request: vi.fn(async (method: string) => {
if (method === "typing") {
return { ok: true };
}
throw new Error(`unexpected imsg typing-client method ${method}`);
}),
stop: vi.fn(async () => {}),
};
const watchClient = {
request: vi.fn(async (method: string) => {
if (method === "watch.subscribe") {
return { subscription: 1 };
}
if (method === "typing") {
return { ok: true };
}
throw new Error(`unexpected imsg watch-client method ${method}`);
}),
waitForClose: vi.fn(async () => {
onNotification?.({
method: "message",
params: {
message: {
id: 12,
chat_id: 123,
sender: "+15550001111",
is_from_me: false,
text: "respond after a slow context build",
is_group: false,
created_at: new Date().toISOString(),
},
},
});
await vi.waitFor(() => {
expect(earlyTypingClient.request).toHaveBeenCalledWith(
"typing",
expect.objectContaining({ typing: true, to: "+15550001111" }),
expect.any(Object),
);
expect(dispatchInboundMessageMock).toHaveBeenCalledTimes(1);
});
}),
stop: vi.fn(async () => {}),
};
createIMessageRpcClientMock.mockImplementation(async (params) => {
if (params?.onNotification) {
onNotification = params.onNotification;
return watchClient as never;
}
return earlyTypingClient as never;
});
dispatchInboundMessageMock.mockImplementationOnce(async () => {
expect(earlyTypingClient.request).toHaveBeenCalledWith(
"typing",
expect.objectContaining({ typing: true, to: "+15550001111" }),
expect.any(Object),
);
return { queuedFinal: false, counts: { tool: 0, block: 0, final: 0 } } as const;
});
await monitorIMessageProvider({
config: {
channels: {
imessage: {
dmPolicy: "allowlist",
allowFrom: ["+15550001111"],
sendReadReceipts: false,
},
},
messages: { inbound: { debounceMs: 0 } },
session: { mainKey: "main" },
} as never,
runtime: { error: vi.fn(), exit: vi.fn(), log: vi.fn() },
});
expect(watchClient.request).not.toHaveBeenCalledWith(
"typing",
expect.objectContaining({ typing: true }),
expect.anything(),
);
await vi.waitFor(() => {
expect(earlyTypingClient.request).toHaveBeenCalledWith(
"typing",
expect.objectContaining({ typing: false, to: "+15550001111" }),
expect.any(Object),
);
});
});
it.each(["never", "message", "thinking"] as const)(
"does not start direct tool typing when typingMode is %s",
async (typingMode) => {
@@ -420,6 +597,87 @@ describe("iMessage monitor last-route updates", () => {
);
});
it("does not wait for read receipts before dispatching the inbound turn", async () => {
setCachedIMessagePrivateApiStatus("imsg", {
available: true,
v2Ready: true,
selectors: {},
rpcMethods: ["watch.subscribe", "read"],
});
let onNotification: ((message: { method: string; params: unknown }) => void) | undefined;
const readClient = {
request: vi.fn((method: string) => {
if (method === "read") {
return new Promise(() => {});
}
return Promise.reject(new Error(`unexpected imsg read-client method ${method}`));
}),
stop: vi.fn(async () => {}),
};
const watchClient = {
request: vi.fn((method: string) => {
if (method === "watch.subscribe") {
return Promise.resolve({ subscription: 1 });
}
return Promise.reject(new Error(`unexpected imsg watch-client method ${method}`));
}),
waitForClose: vi.fn(async () => {
onNotification?.({
method: "message",
params: {
message: {
id: 11,
chat_id: 123,
sender: "+15550001111",
is_from_me: false,
text: "respond without waiting for read receipt",
is_group: false,
created_at: new Date().toISOString(),
},
},
});
await vi.waitFor(() => {
expect(dispatchInboundMessageMock).toHaveBeenCalledTimes(1);
});
}),
stop: vi.fn(async () => {}),
};
createIMessageRpcClientMock.mockImplementation(async (params) => {
if (params?.onNotification) {
onNotification = params.onNotification;
return watchClient as never;
}
return readClient as never;
});
await monitorIMessageProvider({
config: {
channels: {
imessage: {
dmPolicy: "allowlist",
allowFrom: ["+15550001111"],
},
},
messages: { inbound: { debounceMs: 0 } },
session: { mainKey: "main" },
} as never,
runtime: { error: vi.fn(), exit: vi.fn(), log: vi.fn() },
});
expect(readClient.request).toHaveBeenCalledWith(
"read",
expect.objectContaining({ to: "+15550001111" }),
expect.any(Object),
);
expect(watchClient.request).not.toHaveBeenCalledWith(
"read",
expect.anything(),
expect.anything(),
);
expect(dispatchInboundMessageMock).toHaveBeenCalledTimes(1);
});
it.each([
{
label: "flat true",

View File

@@ -1087,7 +1087,7 @@ function buildIMessageEchoScope(params: {
return scopes;
}
function buildDirectIMessageReplyTarget(params: {
export function buildDirectIMessageReplyTarget(params: {
cfg: OpenClawConfig;
accountId?: string | null;
sender: string;

View File

@@ -94,6 +94,7 @@ import {
releaseIMessageInboundReplay,
} from "./inbound-dedupe.js";
import {
buildDirectIMessageReplyTarget,
buildIMessageInboundContext,
rememberIMessageSkippedFromMeForSelfChatDedupe,
resolveIMessageReactionContext,
@@ -1039,6 +1040,87 @@ export async function monitorIMessageProvider(opts: MonitorIMessageOpts = {}): P
const storePath = resolveStorePath(cfg.session?.store, {
agentId: decision.route.agentId,
});
const privateApiStatus = getCachedIMessagePrivateApiStatus(cliPath);
const supportsTyping = imessageRpcSupportsMethod(privateApiStatus, "typing");
const supportsRead = imessageRpcSupportsMethod(privateApiStatus, "read");
if (privateApiStatus?.available === true) {
// Surface a single warning per restart when the bridge is up but we
// had to gate off typing/read because the imsg build pre-dates the
// capability list. Otherwise the user sees no typing bubble / no
// "Read" receipt with no visible reason.
if (!supportsTyping || !supportsRead) {
warnIfImsgUpgradeNeeded.fireOnce(privateApiStatus.rpcMethods, runtime);
}
}
const configuredTypingMode = resolveConfiguredIMessageTypingMode(cfg);
const sendPolicy = resolveSendPolicy({
cfg,
entry: getSessionEntry({ storePath, sessionKey: decision.route.sessionKey }),
sessionKey: decision.route.sessionKey,
channel: "imessage",
chatType: decision.isGroup ? "group" : "direct",
});
const shouldUseDirectToolTypingOptions =
!decision.isGroup &&
sendPolicy !== "deny" &&
(configuredTypingMode === undefined || configuredTypingMode === "instant");
const shouldStartDirectTyping = supportsTyping && shouldUseDirectToolTypingOptions;
const earlyDirectTypingTarget = shouldStartDirectTyping
? buildDirectIMessageReplyTarget({
cfg,
accountId: decision.route.accountId,
sender: decision.sender,
})
: undefined;
let stopEarlyDirectTyping: (() => void) | undefined;
if (earlyDirectTypingTarget) {
// Start channel-native feedback before the expensive history/context/model
// path. Use a short-lived client so a slow typing RPC cannot block the
// monitor client's watch stream. Stop is sequenced after start so fast
// command replies cannot leave a late true after typing:false.
const earlyDirectTypingStarted = sendIMessageTyping(earlyDirectTypingTarget, true, {
cfg,
accountId: accountInfo.accountId,
}).then(
() => true,
(err: unknown) => {
logTypingFailure({
log: (msg) => logVerbose(msg),
channel: "imessage",
action: "start",
target: earlyDirectTypingTarget,
error: err,
});
return false;
},
);
let earlyTypingStopQueued = false;
stopEarlyDirectTyping = () => {
if (earlyTypingStopQueued) {
return;
}
earlyTypingStopQueued = true;
void earlyDirectTypingStarted
.then(async (started) => {
if (!started) {
return;
}
await sendIMessageTyping(earlyDirectTypingTarget, false, {
cfg,
accountId: accountInfo.accountId,
});
})
.catch((err: unknown) => {
logTypingFailure({
log: (msg) => logVerbose(msg),
channel: "imessage",
action: "stop",
target: earlyDirectTypingTarget,
error: err,
});
});
};
}
const stagedAttachments = remoteHost
? []
: await stageIMessageAttachments(validAttachments, {
@@ -1107,31 +1189,20 @@ export async function monitorIMessageProvider(opts: MonitorIMessageOpts = {}): P
);
}
const privateApiStatus = getCachedIMessagePrivateApiStatus(cliPath);
const supportsTyping = imessageRpcSupportsMethod(privateApiStatus, "typing");
const supportsRead = imessageRpcSupportsMethod(privateApiStatus, "read");
if (privateApiStatus?.available === true) {
// Surface a single warning per restart when the bridge is up but we
// had to gate off typing/read because the imsg build pre-dates the
// capability list. Otherwise the user sees no typing bubble / no
// "Read" receipt with no visible reason.
if (!supportsTyping || !supportsRead) {
warnIfImsgUpgradeNeeded.fireOnce(privateApiStatus.rpcMethods, runtime);
}
}
const sendReadReceipts = imessageCfg.sendReadReceipts !== false;
const typingTarget = ctxPayload.To;
if (supportsRead && sendReadReceipts && typingTarget) {
try {
await markIMessageChatRead(typingTarget, {
cfg,
accountId: accountInfo.accountId,
client: getActiveClient(),
});
} catch (err) {
// Read receipts are best-effort channel UI. Do not put them on the
// critical path before model dispatch; slow private-API reads otherwise
// make accepted iMessage turns feel stuck before the agent starts. Use
// a short-lived client so a stuck read cannot block monitor-client typing.
void markIMessageChatRead(typingTarget, {
cfg,
accountId: accountInfo.accountId,
}).catch((err: unknown) => {
runtime.error?.(`imessage: mark read failed: ${String(err)}`);
}
});
}
const { onModelSelected, ...replyPipeline } = createChannelMessageReplyPipeline({
@@ -1234,35 +1305,27 @@ export async function monitorIMessageProvider(opts: MonitorIMessageOpts = {}): P
},
});
let directTypingController: IMessageTypingController | undefined;
const configuredTypingMode = resolveConfiguredIMessageTypingMode(cfg);
const sendPolicy = resolveSendPolicy({
cfg,
entry: getSessionEntry({ storePath, sessionKey: decision.route.sessionKey }),
sessionKey: decision.route.sessionKey,
channel: "imessage",
chatType: decision.isGroup ? "group" : "direct",
});
const shouldStartToolTyping =
!decision.isGroup &&
sendPolicy !== "deny" &&
(configuredTypingMode === undefined || configuredTypingMode === "instant");
const directToolTypingOptions = shouldStartToolTyping
const directToolTypingOptions = shouldUseDirectToolTypingOptions
? ({
// iMessage's native typing bubble is channel-owned UI, not a
// visible tool-progress message. The suppress flag is what lets
// dispatch forward this callback even when verbose progress is off;
// allowProgress covers message_tool_only source delivery. Keep this on
// the direct instant/default path so configured typingMode values still
// decide when typing can begin.
// the direct instant/default path even when older imsg builds do not
// report native typing support.
suppressDefaultToolProgressMessages: true,
allowProgressCallbacksWhenSourceDeliverySuppressed: true,
onTypingController: (typing: IMessageTypingController) => {
directTypingController = typing;
typingReplyOptions.onTypingController?.(typing);
},
onToolStart: async () => {
await directTypingController?.startTypingLoop();
},
...(supportsTyping
? {
onToolStart: async () => {
await directTypingController?.startTypingLoop();
},
}
: {}),
} as const)
: {};
const configuredBlockStreaming = resolveChannelStreamingBlockEnabled(accountInfo.config);
@@ -1325,11 +1388,13 @@ export async function monitorIMessageProvider(opts: MonitorIMessageOpts = {}): P
historyMap: groupHistories,
limit: historyLimit,
},
onPreDispatchFailure: () =>
settleReplyDispatcher({
onPreDispatchFailure: () => {
stopEarlyDirectTyping?.();
void settleReplyDispatcher({
dispatcher,
onSettled: () => markDispatchIdle(),
}),
});
},
runDispatch: async () => {
try {
return await dispatchInboundMessage({
@@ -1348,6 +1413,7 @@ export async function monitorIMessageProvider(opts: MonitorIMessageOpts = {}): P
});
} finally {
markDispatchIdle();
stopEarlyDirectTyping?.();
}
},
}),

View File

@@ -1,6 +1,7 @@
// Ollama tests cover embedding provider plugin behavior.
import type { OpenClawConfig } from "openclaw/plugin-sdk/provider-auth";
import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
const { fetchConfiguredLocalOriginWithSsrFGuardMock } = vi.hoisted(() => ({
fetchConfiguredLocalOriginWithSsrFGuardMock: vi.fn(
@@ -412,10 +413,40 @@ describe("ollama embedding provider", () => {
});
await expect(provider.embedQuery("hello")).rejects.toThrow(
"Ollama embed response returned malformed JSON",
"Ollama embed response: malformed JSON response",
);
});
it("bounds successful embed JSON bodies before parsing", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const jsonSpy = vi.spyOn(streamed.response, "json").mockRejectedValue(new Error("unbounded"));
vi.stubGlobal(
"fetch",
vi.fn(async () => streamed.response),
);
const { provider } = await createOllamaEmbeddingProvider({
config: {} as OpenClawConfig,
provider: "ollama",
model: "nomic-embed-text",
fallback: "none",
remote: { baseUrl: "http://127.0.0.1:11434" },
});
await expect(provider.embedQuery("hello")).rejects.toThrow(
"Ollama embed response: JSON response exceeds 16777216 bytes",
);
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(jsonSpy).not.toHaveBeenCalled();
});
it("rejects non-number embedding values instead of zeroing them", async () => {
vi.stubGlobal(
"fetch",

View File

@@ -6,7 +6,10 @@ import {
normalizeOptionalSecretInput,
} from "openclaw/plugin-sdk/provider-auth";
import { resolveEnvApiKey } from "openclaw/plugin-sdk/provider-auth-runtime";
import { readResponseTextLimited } from "openclaw/plugin-sdk/provider-http";
import {
readProviderJsonResponse,
readResponseTextLimited,
} from "openclaw/plugin-sdk/provider-http";
import { normalizeProviderId } from "openclaw/plugin-sdk/provider-model-shared";
import {
hasConfiguredSecretInput,
@@ -117,14 +120,9 @@ async function withRemoteHttpResponse<T>(params: {
}
async function readOllamaEmbeddingJsonResponse(
response: Pick<Response, "json">,
response: Response,
): Promise<{ embeddings?: unknown }> {
let payload: unknown;
try {
payload = await response.json();
} catch (cause) {
throw new Error("Ollama embed response returned malformed JSON", { cause });
}
const payload = await readProviderJsonResponse<unknown>(response, "Ollama embed response");
if (typeof payload !== "object" || payload === null || Array.isArray(payload)) {
throw new Error("Ollama embed response returned a non-object JSON payload");
}

View File

@@ -1,6 +1,7 @@
// Ollama tests cover web search provider plugin behavior.
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-contracts";
import { beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
import { createOllamaWebSearchProvider as createContractOllamaWebSearchProvider } from "../web-search-contract-api.js";
import {
testing,
@@ -403,7 +404,32 @@ describe("ollama web search provider", () => {
config: createOllamaConfig(),
query: "openclaw",
}),
).rejects.toThrow("Ollama web search returned malformed JSON");
).rejects.toThrow("Ollama web search: malformed JSON response");
});
it("bounds successful Ollama web search JSON bodies before parsing", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const jsonSpy = vi.spyOn(streamed.response, "json").mockRejectedValue(new Error("unbounded"));
fetchWithSsrFGuardMock.mockResolvedValueOnce({
response: streamed.response,
release: vi.fn(async () => {}),
});
await expect(
runOllamaWebSearch({
config: createOllamaConfig(),
query: "openclaw",
}),
).rejects.toThrow("Ollama web search: JSON response exceeds 16777216 bytes");
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(jsonSpy).not.toHaveBeenCalled();
});
it("warns when Ollama is not reachable during setup without cancelling", async () => {

View File

@@ -5,6 +5,7 @@ import {
normalizeOptionalSecretInput,
} from "openclaw/plugin-sdk/provider-auth";
import { resolveEnvApiKey } from "openclaw/plugin-sdk/provider-auth-runtime";
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
import {
enablePluginInConfig,
readPositiveIntegerParam,
@@ -67,11 +68,7 @@ type OllamaWebSearchAttempt = {
};
async function readOllamaWebSearchResponse(response: Response): Promise<OllamaWebSearchResponse> {
try {
return (await response.json()) as OllamaWebSearchResponse;
} catch (cause) {
throw new Error("Ollama web search returned malformed JSON", { cause });
}
return await readProviderJsonResponse<OllamaWebSearchResponse>(response, "Ollama web search");
}
function isOllamaCloudBaseUrl(baseUrl: string): boolean {

View File

@@ -1,4 +1,5 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
type EndpointCall = {
url: string;
@@ -311,4 +312,27 @@ describe("runParallelMcpSearch", () => {
expect(tracked.wasCanceled()).toBe(true);
expect(textSpy).not.toHaveBeenCalled();
});
it("bounds successful MCP bodies without using response.text()", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "Content-Type": "application/json" },
});
const textSpy = vi.spyOn(streamed.response, "text").mockRejectedValue(new Error("unbounded"));
endpointMockState.responses.push(streamed.response);
const error = await runParallelMcpSearch({ searchQueries: ["x"], maxResults: 5 }).catch(
(cause: unknown) => cause,
);
expect(error).toBeInstanceOf(Error);
expect((error as Error).message).toContain(
"Parallel MCP: text response exceeds 16777216 bytes",
);
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(textSpy).not.toHaveBeenCalled();
});
});

View File

@@ -1,7 +1,10 @@
import { randomUUID } from "node:crypto";
import { createRequire } from "node:module";
import { readPluginPackageVersion } from "openclaw/plugin-sdk/extension-shared";
import { readResponseTextLimited } from "openclaw/plugin-sdk/provider-http";
import {
readProviderTextResponse,
readResponseTextLimited,
} from "openclaw/plugin-sdk/provider-http";
import { withTrustedWebSearchEndpoint } from "openclaw/plugin-sdk/provider-web-search";
// Free hosted Search MCP. This keyless transport is used only after the user
@@ -218,7 +221,7 @@ async function postMcp(params: {
status: response.status,
statusText: response.statusText,
text: response.ok
? await response.text()
? await readProviderTextResponse(response, "Parallel MCP")
: await readResponseTextLimited(response, PARALLEL_MCP_ERROR_BODY_LIMIT_BYTES),
sessionIdHeader: response.headers.get("mcp-session-id"),
}),

View File

@@ -1,4 +1,5 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
type EndpointCall = {
url: string;
@@ -59,40 +60,6 @@ function cancelTrackedResponse(
};
}
function streamedJsonResponse(params: { chunkCount: number; chunkSize: number }): {
response: Response;
getReadCount: () => number;
wasCanceled: () => boolean;
} {
// Multi-chunk fixture: proves the bounded read stops pulling chunks before
// the whole (here syntactically broken / unbounded) body is buffered, and
// that the stream is cancelled on overflow.
let reads = 0;
let canceled = false;
const encoder = new TextEncoder();
const stream = new ReadableStream<Uint8Array>({
pull(controller) {
if (reads >= params.chunkCount) {
controller.close();
return;
}
reads += 1;
controller.enqueue(encoder.encode("a".repeat(params.chunkSize)));
},
cancel() {
canceled = true;
},
});
return {
response: new Response(stream, {
status: 200,
headers: { "Content-Type": "application/json" },
}),
getReadCount: () => reads,
wasCanceled: () => canceled,
};
}
import { testing } from "../test-api.js";
import { createParallelWebSearchProvider as createContractParallelWebSearchProvider } from "../web-search-contract-api.js";
import { createParallelWebSearchProvider } from "./parallel-web-search-provider.js";
@@ -621,7 +588,12 @@ describe("parallel web search provider", () => {
// 200-chunk x 1 MiB body (~200 MiB) caps at 16 MiB: the bounded reader must
// stop pulling chunks and cancel the stream well before draining it, then
// surface a bounded error rather than buffering the whole payload.
const streamed = streamedJsonResponse({ chunkCount: 200, chunkSize: 1024 * 1024 });
const streamed = createStreamingResponse({
chunkCount: 200,
chunkSize: 1024 * 1024,
text: "a",
headers: { "Content-Type": "application/json" },
});
endpointMockState.responses.push(streamed.response);
const provider = createParallelWebSearchProvider();
const tool = provider.createTool({

View File

@@ -1,3 +1,4 @@
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
// Perplexity provider module implements model/runtime integration.
import {
readPositiveIntegerParam,
@@ -142,11 +143,7 @@ function buildPerplexityRequestHeaders(apiKey: string, acceptJson = false): Reco
}
async function readPerplexityJsonResponse<T>(response: Response, label: string): Promise<T> {
try {
return (await response.json()) as T;
} catch (cause) {
throw new Error(`${label}: malformed JSON response`, { cause });
}
return await readProviderJsonResponse<T>(response, label);
}
function resolvePerplexityTransport(perplexity?: PerplexityConfig): {

View File

@@ -1,6 +1,7 @@
// Perplexity tests cover perplexity web search provider plugin behavior.
import { withEnv, withEnvAsync } from "openclaw/plugin-sdk/test-env";
import { describe, expect, it } from "vitest";
import { describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
import { createPerplexityWebSearchProvider } from "./perplexity-web-search-provider.js";
import { testing } from "./perplexity-web-search-provider.runtime.js";
@@ -171,4 +172,22 @@ describe("perplexity web search provider", () => {
testing.readPerplexityJsonResponse(new Response("{ nope"), "Perplexity"),
).rejects.toThrow("Perplexity: malformed JSON response");
});
it("bounds successful Perplexity JSON bodies before parsing", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const jsonSpy = vi.spyOn(streamed.response, "json").mockRejectedValue(new Error("unbounded"));
await expect(
testing.readPerplexityJsonResponse(streamed.response, "Perplexity Search"),
).rejects.toThrow("Perplexity Search: JSON response exceeds 16777216 bytes");
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(jsonSpy).not.toHaveBeenCalled();
});
});

View File

@@ -168,23 +168,42 @@ describe("runtime parity", () => {
const scoped = __testing.filterMockRequestsForParentPrompt(
[
{
prompt: "Fanout worker alpha: inspect the QA workspace and finish with exactly ALPHA-OK.",
allInputText:
"Delegate one bounded QA task to a subagent. Fanout worker alpha: inspect the QA workspace and finish with exactly ALPHA-OK.",
plannedToolName: "read",
},
{
prompt: "Delegate one bounded QA task to a subagent.",
allInputText: "Delegate one bounded QA task to a subagent.",
plannedToolName: "sessions_spawn",
},
{
prompt: "Continue the bounded QA task with the retained child result.",
allInputText:
"Delegate one bounded QA task to a subagent. Continue the bounded QA task with the retained child result.",
plannedToolName: "sessions_spawn",
},
{
allInputText: "Inspect the QA workspace and return one concise protocol note.",
plannedToolName: "read",
},
{
prompt: "Delegate one bounded QA task to a subagent.",
allInputText: "Delegate one bounded QA task to a subagent. Tool result: child accepted.",
toolOutput: "child accepted",
},
],
"Delegate one bounded QA task to a subagent.",
[
"Delegate one bounded QA task to a subagent.",
"Continue the bounded QA task with the retained child result.",
],
);
expect(scoped).toHaveLength(2);
expect(scoped).toHaveLength(3);
expect(scoped.map((request) => request.plannedToolName ?? "result")).toEqual([
"sessions_spawn",
"sessions_spawn",
"result",
]);

View File

@@ -120,6 +120,7 @@ type RuntimeParityTranscriptRecord = {
};
type RuntimeParityMockRequestSnapshot = {
prompt?: string;
allInputText?: string;
plannedToolName?: string;
plannedToolArgs?: unknown;
@@ -759,14 +760,22 @@ function resolveRuntimeParityToolCalls(params: {
function filterMockRequestsForParentPrompt(
requests: RuntimeParityMockRequestSnapshot[],
parentPrompt: string,
parentPrompts: readonly string[] = [parentPrompt],
) {
const normalizedParentPrompt = normalizeTextForParity(parentPrompt);
if (!normalizedParentPrompt) {
const normalizedParentPrompts = parentPrompts
.map(normalizeTextForParity)
.filter((prompt) => prompt.length > 0);
if (normalizedParentPrompts.length === 0) {
return requests;
}
const matching = requests.filter((request) =>
normalizeTextForParity(request.allInputText ?? "").includes(normalizedParentPrompt),
);
const matching = requests.filter((request) => {
const normalizedPrompt = normalizeTextForParity(request.prompt ?? "");
if (normalizedPrompt) {
return normalizedParentPrompts.some((prompt) => normalizedPrompt.includes(prompt));
}
const normalizedHistory = normalizeTextForParity(request.allInputText ?? "");
return normalizedParentPrompts.some((prompt) => normalizedHistory.includes(prompt));
});
return matching.length > 0 ? matching : requests;
}
@@ -966,6 +975,7 @@ async function loadRuntimeParityTranscripts(params: {
async function loadRuntimeParityMockToolCalls(
mockBaseUrl: string | undefined,
parentPrompt: string,
parentPrompts: readonly string[] = [parentPrompt],
): Promise<RuntimeParityToolCall[] | null> {
const normalizedBaseUrl = mockBaseUrl?.trim().replace(/\/+$/u, "");
if (!normalizedBaseUrl) {
@@ -991,6 +1001,7 @@ async function loadRuntimeParityMockToolCalls(
}
const requests = payload.filter(isMessageRecord).map(
(entry): RuntimeParityMockRequestSnapshot => ({
prompt: readNonEmptyString(entry.prompt),
allInputText: readNonEmptyString(entry.allInputText),
plannedToolName: readNonEmptyString(entry.plannedToolName),
plannedToolArgs: entry.plannedToolArgs ?? null,
@@ -998,7 +1009,7 @@ async function loadRuntimeParityMockToolCalls(
}),
);
return resolveToolCallOrderFromMockRequests(
filterMockRequestsForParentPrompt(requests, parentPrompt),
filterMockRequestsForParentPrompt(requests, parentPrompt, parentPrompts),
);
} catch {
return null;
@@ -1015,12 +1026,16 @@ export async function captureRuntimeParityCell(
});
const transcriptRecords = buildTranscriptRecords(transcriptBytes);
const transcriptToolCalls = resolveToolCallOrder(transcriptRecords);
const parentPrompt =
transcriptRecords
.filter((record) => record.role === "user" && !isToolResultLikeMessage(record.message))
.map((record) => extractAssistantText(record.message))
.find(Boolean) ?? "";
const mockToolCalls = await loadRuntimeParityMockToolCalls(params.mockBaseUrl, parentPrompt);
const parentPrompts = transcriptRecords
.filter((record) => record.role === "user")
.map((record) => extractAssistantText(record.message))
.filter((prompt) => prompt.length > 0);
const parentPrompt = parentPrompts[0] ?? "";
const mockToolCalls = await loadRuntimeParityMockToolCalls(
params.mockBaseUrl,
parentPrompt,
parentPrompts,
);
const gatewayLogs = params.gateway.logs?.();
const sentinelFindings = [
...scanGatewayLogSentinels(gatewayLogs),

View File

@@ -1,5 +1,6 @@
// Qqbot tests cover api-client plugin behavior.
import { afterEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../../../test-support/streaming-error-response.js";
const fetchWithSsrFGuardMock = vi.hoisted(() => vi.fn());
@@ -88,4 +89,35 @@ describe("ApiClient", () => {
},
});
});
it("bounds successful response bodies without using response.text()", async () => {
const release = vi.fn(async () => {});
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const textSpy = vi.spyOn(streamed.response, "text").mockRejectedValue(new Error("unbounded"));
fetchWithSsrFGuardMock.mockResolvedValueOnce({
response: streamed.response,
release,
});
const client = new ApiClient({ baseUrl: "https://qqbot.test" });
let error: unknown;
try {
await client.request("token-1", "GET", "/v2/users/@me");
} catch (caught) {
error = caught;
}
expect(error).toBeInstanceOf(ApiError);
expect(String(error)).toContain("QQBot API response: text response exceeds 16777216 bytes");
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(textSpy).not.toHaveBeenCalled();
expect(release).toHaveBeenCalledTimes(1);
});
});

View File

@@ -9,7 +9,10 @@
* - `redactBodyKeys` replaces the hardcoded `file_data` redaction.
*/
import { readResponseTextLimited } from "openclaw/plugin-sdk/provider-http";
import {
readProviderTextResponse,
readResponseTextLimited,
} from "openclaw/plugin-sdk/provider-http";
import { fetchWithSsrFGuard, type SsrFPolicy } from "openclaw/plugin-sdk/ssrf-runtime";
import { ApiError, type ApiClientConfig, type EngineLogger } from "../types.js";
import { formatErrorMessage } from "../utils/format.js";
@@ -162,7 +165,7 @@ export class ApiClient {
const readBody = async (limitBytes?: number): Promise<string> => {
try {
return limitBytes === undefined
? await res.text()
? await readProviderTextResponse(res, "QQBot API response")
: await readResponseTextLimited(res, limitBytes);
} catch (err) {
throw new ApiError(

View File

@@ -1,5 +1,6 @@
// Qqbot tests cover channel-api tool behavior.
import { afterEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../../../test-support/streaming-error-response.js";
const fetchWithSsrFGuardMock = vi.hoisted(() => vi.fn());
@@ -109,4 +110,33 @@ describe("executeChannelApi", () => {
expect(textSpy).not.toHaveBeenCalled();
expect(release).toHaveBeenCalledTimes(1);
});
it("bounds successful response bodies without using response.text()", async () => {
const release = vi.fn(async () => {});
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const textSpy = vi.spyOn(streamed.response, "text").mockRejectedValue(new Error("unbounded"));
fetchWithSsrFGuardMock.mockResolvedValueOnce({
response: streamed.response,
release,
});
const result = await executeChannelApi(
{ method: "GET", path: "/guilds/123/channels" },
{ accessToken: "token-1" },
);
expect(result.details).toMatchObject({
error: "QQ channel API response: text response exceeds 16777216 bytes",
path: "/guilds/123/channels",
});
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(textSpy).not.toHaveBeenCalled();
expect(release).toHaveBeenCalledTimes(1);
});
});

View File

@@ -8,7 +8,10 @@
* validation, fetch, and structured response formatting.
*/
import { readResponseTextLimited } from "openclaw/plugin-sdk/provider-http";
import {
readProviderTextResponse,
readResponseTextLimited,
} from "openclaw/plugin-sdk/provider-http";
import { fetchWithSsrFGuard, type SsrFPolicy } from "openclaw/plugin-sdk/ssrf-runtime";
import { formatErrorMessage } from "../utils/format.js";
import { debugLog, debugError } from "../utils/log.js";
@@ -216,7 +219,7 @@ export async function executeChannelApi(
debugLog(`[qqbot-channel-api] <<< Status: ${res.status} ${res.statusText}`);
const rawBody = res.ok
? await res.text()
? await readProviderTextResponse(res, "QQ channel API response")
: await readResponseTextLimited(res, CHANNEL_API_ERROR_BODY_LIMIT_BYTES);
if (!rawBody || rawBody.trim() === "") {
if (res.ok) {

View File

@@ -2,23 +2,12 @@
import { spawn, type ChildProcess } from "node:child_process";
import { createHash, randomBytes, randomUUID, timingSafeEqual } from "node:crypto";
import type { EventEmitter } from "node:events";
import {
createServer,
type IncomingMessage,
type Server,
type ServerResponse,
} from "node:http";
import { createServer, type IncomingMessage, type Server, type ServerResponse } from "node:http";
import type { Socket } from "node:net";
import {
keepHttpServerTaskAlive,
waitUntilAbort,
} from "openclaw/plugin-sdk/channel-outbound";
import type { ChannelGatewayContext } from "openclaw/plugin-sdk/channel-contract";
import { keepHttpServerTaskAlive, waitUntilAbort } from "openclaw/plugin-sdk/channel-outbound";
import { KeyedAsyncQueue } from "openclaw/plugin-sdk/keyed-async-queue";
import {
createClaimableDedupe,
type ClaimableDedupe,
} from "openclaw/plugin-sdk/persistent-dedupe";
import { createClaimableDedupe, type ClaimableDedupe } from "openclaw/plugin-sdk/persistent-dedupe";
import { RAFT_CHANNEL_ID, type ResolvedRaftAccount } from "./accounts.js";
import { dispatchRaftWake } from "./inbound.js";
@@ -54,11 +43,7 @@ type RaftBridgeProcess = Pick<ChildProcess, "kill"> & Pick<EventEmitter, "once">
type RaftGatewayDeps = {
createToken?: () => string;
spawnBridge?: (params: {
profile: string;
endpoint: string;
token: string;
}) => RaftBridgeProcess;
spawnBridge?: (params: { profile: string; endpoint: string; token: string }) => RaftBridgeProcess;
wakeDedupe?: ClaimableDedupe;
};
@@ -80,6 +65,8 @@ function spawnRaftBridge(params: {
endpoint: string;
token: string;
}): RaftBridgeProcess {
// Raft owns the fixed bridge command. OpenClaw passes profile/loopback
// endpoint/token as separate argv/env fields; wake payloads never reach argv.
return spawn(
"raft",
[
@@ -247,7 +234,7 @@ export async function startRaftGatewayAccount(
onDiskError: (error) => {
ctx.log?.warn?.(`Raft wake dedupe storage failed: ${String(error)}`);
},
});
});
const token = (deps.createToken ?? createToken)();
const runtimeSession = randomUUID();
const sockets = new Set<Socket>();

View File

@@ -116,6 +116,8 @@ function buildDaemonArgs(opts: SignalDaemonOpts): string[] {
export function spawnSignalDaemon(opts: SignalDaemonOpts): SignalDaemonHandle {
const args = buildDaemonArgs(opts);
// The executable is operator-selected or setup-discovered signal-cli.
// Runtime message content only flows through the daemon HTTP API, not argv.
const child = spawn(opts.cliPath, args, {
stdio: ["ignore", "pipe", "pipe"],
});

View File

@@ -1,5 +1,6 @@
// Tavily tests cover tavily client plugin behavior.
import { beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import { createStreamingResponse } from "../../test-support/streaming-error-response.js";
// Capture every call to postTrustedWebToolsJson so we can assert on extraHeaders.
const postTrustedWebToolsJson = vi.fn();
@@ -61,6 +62,29 @@ describe("tavily client X-Client-Source header", () => {
);
});
it("bounds successful Tavily JSON bodies before parsing", async () => {
const streamed = createStreamingResponse({
chunkCount: 32,
chunkSize: 1024 * 1024,
text: "x",
headers: { "content-type": "application/json" },
});
const jsonSpy = vi.spyOn(streamed.response, "json").mockRejectedValue(new Error("unbounded"));
postTrustedWebToolsJson.mockImplementationOnce(
async (_params: unknown, parse: (r: Response) => Promise<unknown>) =>
parse(streamed.response),
);
await expect(runTavilySearch({ query: "test query" })).rejects.toThrow(
"Tavily Search: JSON response exceeds 16777216 bytes",
);
expect(streamed.getReadCount()).toBeLessThan(32);
expect(streamed.wasCanceled()).toBe(true);
expect(jsonSpy).not.toHaveBeenCalled();
});
it("runTavilyExtract sends X-Client-Source: openclaw", async () => {
await runTavilyExtract({ urls: ["https://example.com"] });

View File

@@ -1,5 +1,6 @@
// Tavily plugin module implements tavily client behavior.
import type { OpenClawConfig } from "openclaw/plugin-sdk/config-contracts";
import { readProviderJsonResponse } from "openclaw/plugin-sdk/provider-http";
import {
DEFAULT_CACHE_TTL_MINUTES,
normalizeCacheKey,
@@ -26,6 +27,7 @@ const EXTRACT_CACHE = new Map<
{ value: Record<string, unknown>; expiresAt: number; insertedAt: number }
>();
const DEFAULT_SEARCH_COUNT = 5;
const TAVILY_EXTRACT_RESPONSE_MAX_BYTES = 64 * 1024 * 1024;
export type TavilySearchParams = {
cfg?: OpenClawConfig;
@@ -73,6 +75,7 @@ async function postTavilyJson(params: {
apiKey: string;
body: Record<string, unknown>;
errorLabel: string;
responseMaxBytes?: number;
}): Promise<Record<string, unknown>> {
return postTrustedWebToolsJson(
{
@@ -83,19 +86,19 @@ async function postTavilyJson(params: {
errorLabel: params.errorLabel,
extraHeaders: { "X-Client-Source": "openclaw" },
},
async (response) => readTavilyJsonResponse(response, params.errorLabel),
async (response) =>
readTavilyJsonResponse(response, params.errorLabel, {
maxBytes: params.responseMaxBytes,
}),
);
}
async function readTavilyJsonResponse(
response: Response,
label: string,
opts?: { maxBytes?: number },
): Promise<Record<string, unknown>> {
try {
return (await response.json()) as Record<string, unknown>;
} catch (cause) {
throw new Error(`${label}: malformed JSON response`, { cause });
}
return await readProviderJsonResponse<Record<string, unknown>>(response, label, opts);
}
export async function runTavilySearch(
@@ -255,6 +258,8 @@ export async function runTavilyExtract(
apiKey,
body,
errorLabel: "Tavily Extract",
// Extract can include raw page content and image lists, unlike search metadata.
responseMaxBytes: TAVILY_EXTRACT_RESPONSE_MAX_BYTES,
});
const rawResults = Array.isArray(payload.results) ? payload.results : [];

View File

@@ -1,11 +1,21 @@
// Test Support plugin module implements streaming error response behavior.
export function createStreamingErrorResponse(params: {
status: number;
// Test Support plugin module implements streaming response fixtures.
export type StreamingResponseFixture = {
response: Response;
getReadCount: () => number;
wasCanceled: () => boolean;
};
export function createStreamingResponse(params: {
status?: number;
chunkCount: number;
chunkSize: number;
byte: number;
}): { response: Response; getReadCount: () => number } {
byte?: number;
text?: string;
headers?: HeadersInit;
}): StreamingResponseFixture {
let reads = 0;
let canceled = false;
const encoder = new TextEncoder();
const stream = new ReadableStream<Uint8Array>({
pull(controller) {
if (reads >= params.chunkCount) {
@@ -13,11 +23,28 @@ export function createStreamingErrorResponse(params: {
return;
}
reads += 1;
controller.enqueue(new Uint8Array(params.chunkSize).fill(params.byte));
const chunk =
params.text !== undefined
? encoder.encode(params.text.repeat(params.chunkSize))
: new Uint8Array(params.chunkSize).fill(params.byte ?? 120);
controller.enqueue(chunk);
},
cancel() {
canceled = true;
},
});
return {
response: new Response(stream, { status: params.status }),
response: new Response(stream, { status: params.status ?? 200, headers: params.headers }),
getReadCount: () => reads,
wasCanceled: () => canceled,
};
}
export function createStreamingErrorResponse(params: {
status: number;
chunkCount: number;
chunkSize: number;
byte: number;
}): StreamingResponseFixture {
return createStreamingResponse(params);
}

View File

@@ -0,0 +1,45 @@
// Tests fenced-code-block span scanning used to keep chunk breaks out of code blocks.
import { describe, expect, it } from "vitest";
import { isSafeFenceBreak, parseFenceSpans } from "./fences.js";
describe("parseFenceSpans closing-fence rules", () => {
it("treats a marker line with trailing text as code content, not a closing fence", () => {
// CommonMark: a closing fence may be followed only by whitespace, so "``` not a close" is code
// content and the block stays open until the real closing fence. Reporting an interior offset
// as a safe break would let a chunker split inside the code block.
const text = "```\ncode\n``` not a close\nmore code\n```\n";
const spans = parseFenceSpans(text);
expect(spans).toHaveLength(1);
expect(isSafeFenceBreak(spans, text.indexOf("more code") + 1)).toBe(false);
});
it("does not close on non-space/tab whitespace", () => {
for (const suffix of ["\u00a0", "\v", "\f"]) {
const text = `\`\`\`\ncode\n\`\`\`${suffix}\nmore code\n`;
const spans = parseFenceSpans(text);
expect(spans).toHaveLength(1);
expect(spans[0]?.end).toBe(text.length);
expect(isSafeFenceBreak(spans, text.indexOf("more code") + 1)).toBe(false);
}
});
it("closes fences with CRLF line endings", () => {
const text = "```\r\ncode\r\n```\r\nafter\r\n";
const spans = parseFenceSpans(text);
expect(spans).toHaveLength(1);
expect(isSafeFenceBreak(spans, text.indexOf("after") + 1)).toBe(true);
});
it("still closes on a bare fence, a longer same-marker fence, and keeps an opener info string", () => {
expect(parseFenceSpans("```\ncode\n```\nafter\n")).toHaveLength(1);
expect(parseFenceSpans("```\ncode\n````` \nafter\n")).toHaveLength(1);
expect(parseFenceSpans("```python\nx = 1\n```\n")).toHaveLength(1);
const closed = "```\ncode\n```\nafter\n";
const spans = parseFenceSpans(closed);
expect(isSafeFenceBreak(spans, closed.indexOf("after") + 1)).toBe(true);
});
});

View File

@@ -41,7 +41,7 @@ export function scanFenceSpans(
while (offset <= buffer.length) {
const nextNewline = buffer.indexOf("\n", offset);
const lineEnd = nextNewline === -1 ? buffer.length : nextNewline;
const line = buffer.slice(offset, lineEnd);
const line = buffer.slice(offset, lineEnd).replace(/\r$/, "");
const match = line.match(/^( {0,3})(`{3,}|~{3,})(.*)$/);
if (match && (offset > 0 || startsAtLineStart)) {
@@ -58,9 +58,13 @@ export function scanFenceSpans(
marker,
indent,
};
} else if (open.markerChar === markerChar && markerLen >= open.markerLen) {
// CommonMark allows a closing fence to be longer than the opener, but
// it must use the same marker character to avoid crossing fence kinds.
} else if (
open.markerChar === markerChar &&
markerLen >= open.markerLen &&
/^[ \t]*$/.test(match[3])
) {
// CommonMark permits only spaces or tabs after a closing fence. A marker line carrying
// other trailing text is code content, not a close, so it must not end the block.
const end = lineEnd;
spans.push({
start: open.start,

View File

@@ -108,7 +108,7 @@ flow:
- lambda:
params: [text]
expr: "config.expectedReplyGroups.every((group) => group.some((needle) => normalizeLowercaseStringOrEmpty(text).includes(needle)))"
- expr: "env.providerMode === 'mock-openai' ? 10000 : 30000"
- expr: "30000"
- expr: "env.providerMode === 'mock-openai' ? 100 : 250"
- if:
expr: "Boolean(env.mock)"
@@ -240,7 +240,11 @@ flow:
message:
expr: "lastError instanceof Error ? formatErrorMessage(lastError) : String(lastError ?? 'fanout retry exhausted')"
- if:
expr: "Boolean(env.mock)"
# Codex completes child sessions through its app-server path but
# does not relay the child marker back onto the parent QA channel.
# The shared assertions above already prove both child tool calls
# and child session rows; keep this transport-only proof OpenClaw-specific.
expr: "Boolean(env.mock) && env.gateway.runtimeEnv.OPENCLAW_QA_FORCE_RUNTIME !== 'codex'"
then:
- forEach:
items:
@@ -253,5 +257,5 @@ flow:
- lambda:
params: [candidate]
expr: "String(candidate.text ?? '').trim() === childCompletionMarker"
- 10000
- 30000
detailsExpr: "details"

View File

@@ -26,7 +26,9 @@ scenario:
config:
sessionKey: agent:qa:long-context-cache-stability
fixtureFile: large-cache-fixture.txt
cacheEvidenceNeedle: CACHE-FIXTURE-0550
cacheEvidenceNeedle: CACHE-FIXTURE-0050
cacheEvidenceLine: "CACHE-FIXTURE-0050: stable tool-result evidence for prompt-cache reuse across long sessions."
followupPromptNeedle: Using the already-read
warmupMarker: QA-LARGE-CACHE-WARMUP-OK
hitMarker: QA-LARGE-CACHE-HIT-OK
@@ -84,8 +86,17 @@ flow:
- set: debugRequests
value:
expr: "env.mock ? [...(await fetchJson(`${env.mock.baseUrl}/debug/requests`))] : []"
- set: cappedReadOutputIndex
value:
expr: "debugRequests.reduce((found, planned, index) => { if (found >= 0 || !planned.plannedToolCallId || planned.plannedToolName !== 'read' || planned.plannedToolArgs?.path !== config.fixtureFile) return found; const outputOffset = debugRequests.slice(index + 1).findIndex((candidate) => Boolean(candidate.toolOutputCallId) && candidate.toolOutputCallId === planned.plannedToolCallId); if (outputOffset < 0) return found; const output = debugRequests[index + 1 + outputOffset]; const evidence = [planned.allInputText, output.allInputText, output.toolOutput].filter((value) => typeof value === 'string').join('\\n'); const hasCodexFormattedTruncation = evidence.includes('Warning: truncated output') && (evidence.includes('chars truncated') || evidence.includes('tokens truncated')); return evidence.includes(config.cacheEvidenceLine) && (evidence.includes('[Read output capped at 50KB') || evidence.includes('...(OpenClaw truncated dynamic tool result') || evidence.includes('...(truncated)...') || hasCodexFormattedTruncation) ? index + 1 + outputOffset : found; }, -1)"
- set: hasCappedReadEvidence
value:
expr: "cappedReadOutputIndex >= 0"
- set: hasFollowupCacheEvidence
value:
expr: "cappedReadOutputIndex >= 0 && debugRequests.some((request, index) => index > cappedReadOutputIndex && String(request.prompt ?? '').includes(config.followupPromptNeedle) && String(request.allInputText ?? '').includes(config.cacheEvidenceLine))"
- assert:
expr: "!env.mock || debugRequests.some((request, index) => request.plannedToolName === 'read' && request.plannedToolArgs?.path === config.fixtureFile && typeof request.plannedToolCallId === 'string' && debugRequests.slice(index + 1).some((result, resultOffset) => result.toolOutputCallId === request.plannedToolCallId && String(result.toolOutput ?? '').includes(config.cacheEvidenceNeedle) && (String(result.toolOutput ?? '').includes('[Read output capped at 50KB') || (String(result.toolOutput ?? '').includes('...(truncated)...') && String(result.toolOutput ?? '').length <= 13000)) && debugRequests.slice(index + resultOffset + 2).some((followup) => followup.plannedToolName === 'read' && followup.plannedToolArgs?.path === config.fixtureFile && String(followup.allInputText ?? '').includes(config.cacheEvidenceNeedle) && (String(followup.allInputText ?? '').includes('[Read output capped at 50KB') || String(followup.allInputText ?? '').includes('...(truncated)...')))))"
expr: "!env.mock || (hasCappedReadEvidence && hasFollowupCacheEvidence)"
message:
expr: "`large capped read tool result was not observed: ${JSON.stringify(debugRequests.slice(-8).map((request) => ({ plannedToolName: request.plannedToolName ?? null, plannedToolArgs: request.plannedToolArgs ?? null, plannedToolCallId: request.plannedToolCallId ?? null, toolOutputCallId: request.toolOutputCallId ?? null, toolOutputLength: String(request.toolOutput ?? '').length, toolOutputHasNeedle: String(request.toolOutput ?? '').includes(config.cacheEvidenceNeedle), toolOutputHasReadCap: String(request.toolOutput ?? '').includes('[Read output capped at 50KB'), toolOutputHasCodexTruncation: String(request.toolOutput ?? '').includes('...(truncated)...'), inputHasNeedle: String(request.allInputText ?? '').includes(config.cacheEvidenceNeedle), inputHasReadCap: String(request.allInputText ?? '').includes('[Read output capped at 50KB'), inputHasCodexTruncation: String(request.allInputText ?? '').includes('...(truncated)...') })))}`"
expr: "`large capped read cache evidence was not observed: ${JSON.stringify({ hasCappedReadEvidence, hasFollowupCacheEvidence, requests: debugRequests.slice(-8).map((request) => ({ prompt: request.prompt ?? null, plannedToolName: request.plannedToolName ?? null, plannedToolArgs: request.plannedToolArgs ?? null, plannedToolCallId: request.plannedToolCallId ?? null, toolOutputCallId: request.toolOutputCallId ?? null, toolOutputLength: String(request.toolOutput ?? '').length, outputHasReadCap: String(request.toolOutput ?? '').includes('[Read output capped at 50KB'), outputHasCodexTruncation: String(request.toolOutput ?? '').includes('...(truncated)...'), inputHasEvidenceLine: String(request.allInputText ?? '').includes(config.cacheEvidenceLine) })) })}`"
detailsExpr: "outbound?.text ?? config.hitMarker"

View File

@@ -13,6 +13,11 @@ HOST_BUILD="${OPENCLAW_CODEX_ON_DEMAND_HOST_BUILD:-1}"
PACKAGE_TGZ="${OPENCLAW_CURRENT_PACKAGE_TGZ:-}"
run_log=""
# This lane installs the package and then exercises a managed npm install of Codex.
# Keep the package install budget above the shared default so slow npm hosts reach
# the Codex assertions instead of failing as a silent package-install timeout.
export OPENCLAW_E2E_NPM_INSTALL_TIMEOUT="${OPENCLAW_E2E_NPM_INSTALL_TIMEOUT:-1200s}"
cleanup() {
if [ -n "${PACKAGE_TGZ:-}" ]; then
docker_e2e_cleanup_package_tgz "$PACKAGE_TGZ"

View File

@@ -25,10 +25,35 @@ PROBE_ATTEMPT_TIMEOUT_MS="$(
PROBE_MAX_BODY_BYTES="$(
openclaw_e2e_read_positive_int_env OPENCLAW_UPGRADE_SURVIVOR_PROBE_MAX_BODY_BYTES 1048576
)"
LANE_ARTIFACT_SUFFIX="${OPENCLAW_DOCKER_ALL_LANE_NAME:-default}"
ROOT_MANAGED_VPS="${OPENCLAW_UPGRADE_SURVIVOR_ROOT_MANAGED_VPS:-0}"
resolve_lane_artifact_suffix() {
if [ -n "${OPENCLAW_DOCKER_ALL_LANE_NAME:-}" ]; then
printf "%s" "$OPENCLAW_DOCKER_ALL_LANE_NAME"
return
fi
if [ "$ROOT_MANAGED_VPS" = "1" ]; then
printf "root-managed-vps-upgrade"
elif [ "$UPDATE_RESTART_MODE" = "auto-auth" ]; then
printf "update-restart-auth"
elif [ "${OPENCLAW_UPGRADE_SURVIVOR_PUBLISHED_BASELINE:-0}" = "1" ]; then
printf "published-upgrade-survivor"
else
printf "upgrade-survivor"
fi
if [ -n "${BASELINE_SPEC// }" ]; then
printf -- "-%s" "$BASELINE_SPEC"
fi
if [ "$SCENARIO" != "base" ]; then
printf -- "-%s" "$SCENARIO"
fi
}
LANE_ARTIFACT_SUFFIX="$(resolve_lane_artifact_suffix)"
LANE_ARTIFACT_SUFFIX="${LANE_ARTIFACT_SUFFIX//[^A-Za-z0-9_.-]/_}"
ARTIFACT_DIR="${OPENCLAW_UPGRADE_SURVIVOR_ARTIFACT_DIR:-$ROOT_DIR/.artifacts/upgrade-survivor/$LANE_ARTIFACT_SUFFIX}"
ROOT_MANAGED_VPS="${OPENCLAW_UPGRADE_SURVIVOR_ROOT_MANAGED_VPS:-0}"
DOCKER_RUN_USER_ARGS=()
PROBE_ENV_ARGS=(
-e OPENCLAW_UPGRADE_SURVIVOR_PROBE_TIMEOUT_MS="$PROBE_TIMEOUT_MS"

View File

@@ -202,8 +202,8 @@ let publicDeprecatedExportsByEntrypointBudget;
try {
budgets = {
publicEntrypoints: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_ENTRYPOINTS", 322),
publicExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_EXPORTS", 10381),
publicFunctionExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_FUNCTION_EXPORTS", 5210),
publicExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_EXPORTS", 10382),
publicFunctionExports: readBudgetEnv("OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_FUNCTION_EXPORTS", 5211),
publicDeprecatedExports: readBudgetEnv(
"OPENCLAW_PLUGIN_SDK_MAX_PUBLIC_DEPRECATED_EXPORTS",
3247,

View File

@@ -299,6 +299,7 @@ function scoreSummary(
title: string,
value: QaMaturityScoreObject | undefined,
description: string,
details: readonly string[] = [],
): string[] {
const score = scorePercent(value);
const displayScore = score === undefined ? "-" : `${score}%`;
@@ -313,6 +314,7 @@ function scoreSummary(
' <div className="maturity-summary-meta">',
` ${maturityLabelPill(value?.label ?? "Unscored")}`,
` <span>${markdownEscape(description)}</span>`,
...details.map((detail) => ` <span>${markdownEscape(detail)}</span>`),
" </div>",
"</div>",
];
@@ -964,6 +966,7 @@ function renderMaturityScorecard({
const surfaceAverage = coverage.rollups.surface_average;
const qualityAverage = scores.rollups.surface_average.quality;
const completenessAverage = scores.rollups.surface_average.completeness;
const maturityAverage = averageScores([qualityAverage, completenessAverage]);
const lines = [
...frontmatter(
"Maturity scorecard",
@@ -985,18 +988,17 @@ function renderMaturityScorecard({
"## At a glance",
"",
'<div className="maturity-summary-grid">',
...indentMarkdown(scoreSummary("Coverage", surfaceAverage, "QA profile evidence"), 2),
...indentMarkdown(
scoreSummary("Quality", qualityAverage, "Reliability and operator confidence"),
2,
),
...indentMarkdown(
scoreSummary("Completeness", completenessAverage, "Expected workflow coverage"),
scoreSummary("Maturity score", maturityAverage, "Quality + completeness", [
`Coverage ${scoreLabel(surfaceAverage)}`,
`Quality ${scoreLabel(qualityAverage)}`,
`Completeness ${scoreLabel(completenessAverage)}`,
]),
2,
),
"</div>",
"",
'Coverage is deliberately evidence-led: an area does not become "ready" just because the implementation exists.',
'Coverage is deliberately evidence-led: an area does not become "ready" just because the implementation exists. It is not an input to the maturity score, but OpenClaw aims to keep end-to-end coverage above 90% for mature Stable-or-better features over time.',
"",
...renderScoreBands(),
];
@@ -1212,9 +1214,7 @@ function main(): void {
}
const evidenceSummaries = readEvidenceSummaries(args.evidenceDir);
if (args.strictInputs) {
rejectBlockingEvidence(evidenceSummaries);
}
rejectBlockingEvidence(evidenceSummaries);
const coverage = deriveCoverageScores(taxonomy, evidenceSummaries);
const { scores, warnings: scoreWarnings } = readValidatedQaMaturityScoreSources({
coverageScores: coverage,

View File

@@ -0,0 +1,301 @@
/**
* Tests for ingress model.usage diagnostic emission in agentCommandFromIngress.
*
* Covers:
* - ingressDiagnosticChannel channel label resolution
* - emitIngressModelUsageDiagnostic with diagnostics enabled + valid usage
* - emitIngressModelUsageDiagnostic with diagnostics disabled
* - emitIngressModelUsageDiagnostic with null/missing usage
*/
import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
const mocks = vi.hoisted(() => ({
emitTrustedDiagnosticEvent: vi.fn(),
isDiagnosticsEnabled: vi.fn(),
getRuntimeConfig: vi.fn(),
hasNonzeroUsage: vi.fn(),
resolveModelCostConfig: vi.fn(),
estimateUsageCost: vi.fn(),
}));
vi.mock("../infra/diagnostic-events.js", async () => {
const actual = await vi.importActual<typeof import("../infra/diagnostic-events.js")>(
"../infra/diagnostic-events.js",
);
return {
...actual,
emitTrustedDiagnosticEvent: mocks.emitTrustedDiagnosticEvent,
isDiagnosticsEnabled: mocks.isDiagnosticsEnabled,
};
});
vi.mock("../utils/usage-format.js", () => ({
resolveModelCostConfig: (...args: Array<unknown>) => mocks.resolveModelCostConfig(...args),
estimateUsageCost: (...args: Array<unknown>) => mocks.estimateUsageCost(...args),
}));
vi.mock("./usage.js", () => ({
hasNonzeroUsage: (usage: unknown) => mocks.hasNonzeroUsage(usage),
}));
vi.mock("../config/io.js", () => ({
getRuntimeConfig: () => mocks.getRuntimeConfig(),
}));
let testing: typeof import("./agent-command.js").testing;
beforeAll(async () => {
const mod = await import("./agent-command.js");
testing = mod.testing;
});
beforeEach(() => {
vi.clearAllMocks();
mocks.isDiagnosticsEnabled.mockReturnValue(true);
mocks.hasNonzeroUsage.mockReturnValue(true);
mocks.getRuntimeConfig.mockReturnValue({});
mocks.resolveModelCostConfig.mockReturnValue({});
mocks.estimateUsageCost.mockReturnValue(0.001);
});
afterEach(() => {
vi.clearAllMocks();
});
function makeResult(overrides?: Record<string, unknown>) {
return {
payloads: [{ text: "hello", mediaUrl: "" }],
meta: {
durationMs: 1234,
aborted: false,
stopReason: "end_turn",
agentMeta: {
provider: "openai",
model: "gpt-5.5",
sessionId: "sess-abc",
usage: {
input: 500,
output: 200,
cacheRead: 50,
cacheWrite: 25,
total: 775,
},
contextTokens: 128000,
promptTokens: 1200,
lastCallUsage: { input: 500, output: 200 },
...(overrides?.agentMeta as Record<string, unknown> | undefined),
},
...(overrides?.meta as Record<string, unknown> | undefined),
},
...overrides,
};
}
function makeOpts(overrides?: Record<string, unknown>) {
return {
message: "hello",
sessionKey: "agent:main:main",
agentId: "main",
allowModelOverride: false,
messageChannel: "api",
...overrides,
};
}
// ---------------------------------------------------------------------------
// ingressDiagnosticChannel
// ---------------------------------------------------------------------------
describe("ingressDiagnosticChannel", () => {
it("returns runContext.messageChannel when set", () => {
const channel = testing.ingressDiagnosticChannel({
message: "hi",
allowModelOverride: false,
runContext: { messageChannel: "discord" },
messageChannel: "api",
channel: "http",
});
expect(channel).toBe("discord");
});
it("falls back to opts.messageChannel", () => {
const channel = testing.ingressDiagnosticChannel({
message: "hi",
allowModelOverride: false,
messageChannel: "api",
channel: "http",
});
expect(channel).toBe("api");
});
it("falls back to opts.channel", () => {
const channel = testing.ingressDiagnosticChannel({
message: "hi",
allowModelOverride: false,
channel: "webchat",
});
expect(channel).toBe("webchat");
});
it('defaults to "http" when no channel info is present', () => {
const channel = testing.ingressDiagnosticChannel({
message: "hi",
allowModelOverride: false,
});
expect(channel).toBe("http");
});
});
// ---------------------------------------------------------------------------
// emitIngressModelUsageDiagnostic
// ---------------------------------------------------------------------------
describe("emitIngressModelUsageDiagnostic", () => {
it("emits model.usage when diagnostics are enabled and result has usage", () => {
const result = makeResult();
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event).toMatchObject({
type: "model.usage",
sessionKey: "agent:main:main",
sessionId: "sess-abc",
channel: "api",
agentId: "main",
provider: "openai",
model: "gpt-5.5",
usage: {
input: 500,
output: 200,
cacheRead: 50,
cacheWrite: 25,
promptTokens: 575,
total: 775,
},
durationMs: 1234,
});
});
it("does not emit when diagnostics are disabled", () => {
mocks.isDiagnosticsEnabled.mockReturnValue(false);
const result = makeResult();
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).not.toHaveBeenCalled();
});
it("does not emit when agentMeta is missing", () => {
const result = makeResult({
meta: { durationMs: 100, aborted: false, stopReason: "end_turn" },
});
// result.meta.agentMeta is undefined
(result as Record<string, unknown>).meta = { durationMs: 100 };
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).not.toHaveBeenCalled();
});
it("does not emit when usage is zero", () => {
mocks.hasNonzeroUsage.mockReturnValue(false);
const result = makeResult();
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).not.toHaveBeenCalled();
});
it("resolves channel from runContext when available", () => {
const result = makeResult();
const opts = makeOpts({
runContext: { messageChannel: "discord" },
messageChannel: "api",
});
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event.channel).toBe("discord");
});
it('defaults channel to "http" when no channel info is present', () => {
const result = makeResult();
const opts = { message: "hi", allowModelOverride: false };
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event.channel).toBe("http");
});
it("computes cost when billable usage buckets are present", () => {
const result = makeResult();
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.resolveModelCostConfig).toHaveBeenCalledWith({
provider: "openai",
model: "gpt-5.5",
config: expect.any(Object) as unknown,
});
expect(mocks.estimateUsageCost).toHaveBeenCalled();
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event.costUsd).toBe(0.001);
});
it("handles missing optional usage fields gracefully", () => {
const result = makeResult({
agentMeta: {
provider: "openai",
model: "gpt-5.5",
sessionId: "sess-min",
usage: { input: 100, output: 50 },
},
});
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event.usage).toMatchObject({
input: 100,
output: 50,
cacheRead: 0,
cacheWrite: 0,
promptTokens: 100,
total: 150,
});
});
it("omits context.used when promptTokens is undefined", () => {
const result = makeResult({
agentMeta: {
promptTokens: undefined,
provider: "openai",
model: "gpt-5.5",
sessionId: "sess-no-prompt",
usage: { input: 10, output: 5 },
contextTokens: 128000,
},
});
const opts = makeOpts();
testing.emitIngressModelUsageDiagnostic(result, opts);
expect(mocks.emitTrustedDiagnosticEvent).toHaveBeenCalledTimes(1);
const event = mocks.emitTrustedDiagnosticEvent.mock.calls[0]?.[0];
expect(event.context).toEqual({ limit: 128000 });
});
});

View File

@@ -26,6 +26,7 @@ import {
registerAgentRunContext,
withAgentRunLifecycleGeneration,
} from "../infra/agent-events.js";
import { isDiagnosticsEnabled, emitTrustedDiagnosticEvent } from "../infra/diagnostic-events.js";
import { formatErrorMessage } from "../infra/errors.js";
import {
resolveAgentDeliveryPlan,
@@ -67,6 +68,7 @@ import {
isDeliverableMessageChannel,
resolveMessageChannel,
} from "../utils/message-channel.js";
import { estimateUsageCost, resolveModelCostConfig } from "../utils/usage-format.js";
import { resolveAgentRuntimeConfig } from "./agent-runtime-config.js";
import {
clearAutoFallbackPrimaryProbeSelection,
@@ -137,6 +139,7 @@ import {
} from "./run-termination.js";
import { normalizeSpawnedRunMetadata } from "./spawned-context.js";
import { resolveAgentTimeoutMs } from "./timeout.js";
import { hasNonzeroUsage } from "./usage.js";
import { ensureAgentWorkspace } from "./workspace.js";
const log = createSubsystemLogger("agents/agent-command");
@@ -2442,6 +2445,83 @@ export async function agentCommand(
);
}
/** Resolve the channel label for model.usage diagnostics from ingress run options. */
function ingressDiagnosticChannel(opts: AgentCommandIngressOpts): string {
return opts.runContext?.messageChannel ?? opts.messageChannel ?? opts.channel ?? "http";
}
/**
* Emit a model.usage diagnostic event after an ingress agent run completes.
*
* Unlike channel/cron paths which emit model.usage in runReplyAgent /
* finalizeCronRun, the ingress path has no such existing emission — without
* this every diagnostics consumer (Langfuse bridge, @openclaw/diagnostics-otel,
* diagnostics-prometheus) sees usage/cost only for webchat/cli/cron turns
* and is blind to HTTP API traffic (POST /v1/responses, POST /v1/chat/completions,
* and node-event dispatch).
*/
function emitIngressModelUsageDiagnostic(
result: NonNullable<Awaited<ReturnType<typeof agentCommandInternal>>>,
opts: AgentCommandIngressOpts,
): void {
const cfg = getRuntimeConfig();
if (!isDiagnosticsEnabled(cfg)) {
return;
}
const agentMeta = result.meta?.agentMeta;
const usage = agentMeta?.usage;
if (!agentMeta || !hasNonzeroUsage(usage)) {
return;
}
const providerUsed = agentMeta.provider ?? "";
const modelUsed = agentMeta.model ?? "";
const input = usage.input ?? 0;
const output = usage.output ?? 0;
const cacheRead = usage.cacheRead ?? 0;
const cacheWrite = usage.cacheWrite ?? 0;
const usagePromptTokens = input + cacheRead + cacheWrite;
const totalTokens = usage.total ?? usagePromptTokens + output;
const hasBillableUsageBuckets =
usage.input !== undefined ||
usage.output !== undefined ||
usage.cacheRead !== undefined ||
usage.cacheWrite !== undefined;
const costConfig = resolveModelCostConfig({
provider: providerUsed,
model: modelUsed,
config: cfg,
});
const costUsd = hasBillableUsageBuckets
? estimateUsageCost({ usage, cost: costConfig })
: undefined;
emitTrustedDiagnosticEvent({
type: "model.usage",
sessionKey: opts.sessionKey,
sessionId: agentMeta.sessionId,
channel: ingressDiagnosticChannel(opts),
agentId: opts.agentId,
provider: providerUsed,
model: modelUsed,
usage: {
input,
output,
cacheRead,
cacheWrite,
promptTokens: usagePromptTokens,
total: totalTokens,
},
lastCallUsage: agentMeta.lastCallUsage,
context: {
limit: agentMeta.contextTokens,
...(agentMeta.promptTokens !== undefined ? { used: agentMeta.promptTokens } : {}),
},
costUsd,
durationMs: result.meta?.durationMs,
});
}
/** Runs an agent turn from an inbound channel/gateway ingress context. */
export async function agentCommandFromIngress(
opts: AgentCommandIngressOpts,
@@ -2453,8 +2533,8 @@ export async function agentCommandFromIngress(
}
const lifecycleGeneration =
opts.lifecycleGeneration ?? captureAgentRunLifecycleGeneration(opts.runId ?? "");
return await withAgentRunLifecycleGeneration(lifecycleGeneration, () =>
agentCommandInternal(
return await withAgentRunLifecycleGeneration(lifecycleGeneration, async () => {
const result = await agentCommandInternal(
{
...opts,
lifecycleGeneration,
@@ -2462,14 +2542,22 @@ export async function agentCommandFromIngress(
},
runtime,
deps,
),
);
);
if (result) {
emitIngressModelUsageDiagnostic(result, opts);
}
return result;
});
}
export const testing = {
resolveAgentRuntimeConfig,
prepareAgentCommandExecution,
resolveExplicitAgentCommandSessionKey,
ingressDiagnosticChannel,
emitIngressModelUsageDiagnostic,
};
/** @deprecated Use `testing`. */

View File

@@ -75,6 +75,10 @@ export {
const log = createSubsystemLogger("errors");
const sandboxToolPolicyAuditMessages = new WeakSet<AssistantMessage>();
export const GENERIC_ASSISTANT_ERROR_TEXT = "LLM request failed.";
export const AUTH_INVALID_TOKEN_USER_TEXT =
"Authentication failed (provider returned HTTP 401). " +
"Your provider token may have expired — try the request again in a moment. " +
"If the failure persists, re-authenticate this provider.";
const PROVIDER_SCHEMA_REJECTION_USER_TEXT =
"LLM request failed: provider rejected the request schema or tool payload.";
const MODEL_NOT_FOUND_USER_TEXT =
@@ -1419,11 +1423,7 @@ export function formatAssistantErrorText(
}
if (providerRuntimeFailureKind === "auth_invalid_token") {
return (
"Authentication failed (provider returned HTTP 401). " +
"Your provider token may have expired — try the request again in a moment. " +
"If the failure persists, re-authenticate this provider."
);
return AUTH_INVALID_TOKEN_USER_TEXT;
}
if (providerRuntimeFailureKind === "upstream_html") {

View File

@@ -641,7 +641,7 @@ async function runEmbeddedAgentInternal(
...paramsBase,
agentId: paramsBase.agentId ?? runSessionTarget.agentId,
sessionId: runSessionTarget.sessionId,
sessionKey: effectiveSessionKey ?? runSessionTarget.sessionKey,
sessionKey: normalizeOptionalString(effectiveSessionKey ?? runSessionTarget.sessionKey),
sessionFile: runSessionTarget.sessionFile,
};
const sessionLane = resolveSessionLane(params.sessionKey?.trim() || params.sessionId);

View File

@@ -3992,6 +3992,103 @@ describe("embedded attempt session lock lifecycle", () => {
expect(acquireSessionWriteLockLocal).toHaveBeenCalledTimes(2);
});
it("releaseHeldLockWithFence sets deferred flag when bailed out during active scope; re-attempted after scope deactivation (#95915)", async () => {
const events: string[] = [];
const releasePrep = vi.fn(async () => events.push("prep-release"));
const releaseRetained = vi.fn(async () => events.push("retained-release"));
const acquireSessionWriteLockLocal = vi
.fn()
.mockResolvedValueOnce({ release: releasePrep })
.mockResolvedValueOnce({ release: releaseRetained });
const controller = await createEmbeddedAttemptSessionLockController({
acquireSessionWriteLock: acquireSessionWriteLockLocal,
lockOptions,
});
await controller.releaseForPrompt();
await controller.reacquireAfterPrompt();
await controller.withSessionWriteLock(async () => {
events.push("write-start");
await controller.releaseHeldLockForAbort();
events.push("write-end");
});
expect(events).toEqual(["prep-release", "write-start", "write-end", "retained-release"]);
expect(acquireSessionWriteLockLocal).toHaveBeenCalledTimes(2);
});
it("controls the held lock lifecycle across deferred abort release, reacquisition, and prompt release", async () => {
const events: string[] = [];
const acquireSessionWriteLockLocal = vi
.fn()
.mockResolvedValueOnce({ release: vi.fn(async () => events.push("init-release")) })
.mockResolvedValueOnce({ release: vi.fn(async () => events.push("held-release")) })
.mockResolvedValueOnce({ release: vi.fn(async () => events.push("reacquire-release")) });
const controller = await createEmbeddedAttemptSessionLockController({
acquireSessionWriteLock: acquireSessionWriteLockLocal,
lockOptions,
});
await controller.releaseForPrompt();
await controller.reacquireAfterPrompt();
await controller.withSessionWriteLock(async () => {
events.push("write");
await controller.releaseHeldLockForAbort();
});
expect(events).toEqual(["init-release", "write", "held-release"]);
await controller.reacquireAfterPrompt();
await controller.releaseForPrompt();
expect(events).toEqual(["init-release", "write", "held-release", "reacquire-release"]);
expect(acquireSessionWriteLockLocal).toHaveBeenCalledTimes(3);
});
it("takeHeldLockAfterRetainedIdle does not self-deadlock when called from inside active write scope (#95915)", async () => {
const events: string[] = [];
const acquireSessionWriteLockLocal = vi
.fn()
.mockResolvedValueOnce({ release: vi.fn(async () => events.push("init-release")) })
.mockResolvedValueOnce({ release: vi.fn(async () => events.push("held-release")) })
.mockRejectedValueOnce(
new SessionWriteLockTimeoutError({
timeoutMs: lockOptions.timeoutMs,
owner: "pid=test",
lockPath: `${lockOptions.sessionFile}.lock`,
}),
);
const controller = await createEmbeddedAttemptSessionLockController({
acquireSessionWriteLock: acquireSessionWriteLockLocal,
lockOptions,
});
await controller.releaseForPrompt();
await controller.reacquireAfterPrompt();
const takeoverError = await controller
.withSessionWriteLock(async () => {
events.push("write-start");
const cleanupLock = await controller.acquireForCleanup();
await cleanupLock.release();
events.push("cleanup-inside-done");
})
.catch((error: unknown) => error);
expect(takeoverError).toBeInstanceOf(EmbeddedAttemptSessionTakeoverError);
const cleanupLock = await controller.acquireForCleanup();
await cleanupLock.release();
expect(events).toEqual(["init-release", "write-start", "cleanup-inside-done", "held-release"]);
expect(acquireSessionWriteLockLocal).toHaveBeenCalledTimes(3);
});
it("returns a no-op cleanup lock after prompt lock reacquisition times out", async () => {
const releases: string[] = [];
const acquireSessionWriteLockResult = vi

View File

@@ -1171,6 +1171,9 @@ export async function createEmbeddedAttemptSessionLockController(params: {
let fenceGeneration = 0;
let fenceActive = false;
let takeoverDetected = false;
// Set when an active retained write prevents immediate held-lock release.
// The scope completion path retries release after the retained use unwinds.
let releaseHeldLockDeferred = false;
let retainedLockUseCount = 0;
const retainedLockIdleWaiters = new Set<() => void>();
let heldLockDraining = false;
@@ -1603,6 +1606,7 @@ export async function createEmbeddedAttemptSessionLockController(params: {
const drainOwner = await beginHeldLockDrain();
try {
if (!(await waitForRetainedLockIdle())) {
releaseHeldLockDeferred = true;
return;
}
if (!heldLock) {
@@ -1639,6 +1643,8 @@ export async function createEmbeddedAttemptSessionLockController(params: {
const drainOwner = await beginHeldLockDrain();
try {
if (!(await waitForRetainedLockIdle())) {
// Do not wait for retained idle from inside the active scope; that
// scope must unwind before the retained-use waiter can resolve.
return undefined;
}
if (!heldLock) {
@@ -1660,6 +1666,7 @@ export async function createEmbeddedAttemptSessionLockController(params: {
const drainOwner = await beginHeldLockDrain();
try {
if (!(await waitForRetainedLockIdle())) {
// Same active-scope self-deadlock guard as takeHeldLockAfterRetainedIdle.
return;
}
if (!heldLock) {
@@ -1721,6 +1728,12 @@ export async function createEmbeddedAttemptSessionLockController(params: {
}
}
await releaseHeldLockAfterTakeover();
// Retained use has been released and the active scope is no longer live,
// so a prior active-scope release bailout can drain the held file lock now.
if (releaseHeldLockDeferred) {
releaseHeldLockDeferred = false;
await releaseHeldLockWithFence();
}
if (!outcome.ok) {
throw outcome.error;
}

View File

@@ -9,6 +9,7 @@ import {
ProviderHttpError,
readProviderBinaryResponse,
readProviderJsonResponse,
readProviderTextResponse,
readResponseTextLimited,
} from "./provider-http-errors.js";
@@ -64,6 +65,31 @@ function createStreamingJsonResponse(params: { chunkCount: number; chunkSize: nu
};
}
function createStreamingTextResponse(params: { chunkCount: number; chunkSize: number }): {
response: Response;
getReadCount: () => number;
} {
let reads = 0;
const encoder = new TextEncoder();
const stream = new ReadableStream<Uint8Array>({
pull(controller) {
if (reads >= params.chunkCount) {
controller.close();
return;
}
reads += 1;
controller.enqueue(encoder.encode("x".repeat(params.chunkSize)));
},
});
return {
response: new Response(stream, {
status: 200,
headers: { "Content-Type": "text/plain" },
}),
getReadCount: () => reads,
};
}
describe("provider error utils", () => {
it("formats nested provider error details with request ids", async () => {
const response = new Response(
@@ -263,6 +289,21 @@ describe("provider error utils", () => {
expect(streamed.getReadCount()).toBeLessThan(20);
});
it("caps successful text responses instead of buffering oversized bodies", async () => {
const streamed = createStreamingTextResponse({
chunkCount: 20,
chunkSize: 1024,
});
await expect(
readProviderTextResponse(streamed.response, "Provider text failed", {
maxBytes: 2048,
}),
).rejects.toThrow("Provider text failed: text response exceeds 2048 bytes");
expect(streamed.getReadCount()).toBeLessThan(20);
});
it("caps successful binary responses instead of buffering oversized bodies", async () => {
const streamed = createStreamingBinaryResponse({
chunkCount: 20,

View File

@@ -14,6 +14,7 @@ export { normalizeOptionalString as trimToUndefined } from "../../packages/norma
const ERROR_BODY_METADATA_LIMIT = 500;
const PROVIDER_BINARY_RESPONSE_MAX_BYTES = 16 * 1024 * 1024;
const PROVIDER_JSON_RESPONSE_MAX_BYTES = 16 * 1024 * 1024;
const PROVIDER_TEXT_RESPONSE_MAX_BYTES = 16 * 1024 * 1024;
/** Returns a plain object view for provider JSON payloads when one exists. */
export function asObject(value: unknown): Record<string, unknown> | undefined {
@@ -86,6 +87,20 @@ export async function readResponseTextLimited(
return text;
}
/** Reads a successful provider text response under a byte cap. */
export async function readProviderTextResponse(
response: Response,
label: string,
opts?: { maxBytes?: number },
): Promise<string> {
const maxBytes = opts?.maxBytes ?? PROVIDER_TEXT_RESPONSE_MAX_BYTES;
const bytes = await readResponseWithLimit(response, maxBytes, {
onOverflow: ({ maxBytes: maxBytesLocal }) =>
new Error(`${label}: text response exceeds ${maxBytesLocal} bytes`),
});
return new TextDecoder().decode(bytes);
}
/** Formats common provider JSON error payload shapes into one readable detail string. */
export function formatProviderErrorPayload(payload: unknown): string | undefined {
const root = asObject(payload);

View File

@@ -28,6 +28,7 @@ import {
} from "./agent-runner-execution.js";
import { HEARTBEAT_EXTERNAL_RUN_FAILURE_TEXT } from "./agent-runner-failure-copy.js";
import {
PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE,
PROVIDER_CONVERSATION_STATE_ERROR_USER_MESSAGE,
PROVIDER_INTERNAL_ERROR_USER_MESSAGE,
PROVIDER_RATE_LIMIT_OR_QUOTA_ERROR_USER_MESSAGE,
@@ -6534,6 +6535,38 @@ describe("runAgentTurnWithFallback", () => {
},
);
it.each(NON_DIRECT_FAILURE_SURFACE_CASES)(
"surfaces provider authentication failures in $label chats",
async (testCase) => {
const rawError =
"unexpected status 401 Unauthorized: Missing bearer or basic authentication in header, url: https://api.openai.com/v1/responses";
state.runEmbeddedAgentMock.mockRejectedValueOnce(
new FailoverError("LLM request unauthorized.", {
reason: "auth",
provider: "openai",
model: "gpt-5.5",
status: 401,
rawError,
}),
);
const runAgentTurnWithFallback = await getRunAgentTurnWithFallback();
const result = await runAgentTurnWithFallback(
createMinimalRunAgentTurnParams({
sessionCtx: createNonDirectFailureSessionCtx(testCase),
}),
);
expect(result.kind).toBe("final");
if (result.kind === "final") {
expect(result.payload.isError).toBe(true);
expect(result.payload.text).toBe(PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE);
expect(result.payload.text).not.toBe(SILENT_REPLY_TOKEN);
expect(result.payload.text).not.toContain(rawError);
}
},
);
it.each(NON_DIRECT_FAILURE_SURFACE_CASES)(
"surfaces rate-limit fallback copy in $label chats",
async (testCase) => {

View File

@@ -1,13 +1,60 @@
/** Tests provider request error classification for retry/fallback decisions. */
import { describe, expect, it } from "vitest";
import { FailoverError } from "../../agents/failover-error.js";
import {
classifyProviderRequestError,
PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE,
PROVIDER_CONVERSATION_STATE_ERROR_USER_MESSAGE,
PROVIDER_INTERNAL_ERROR_USER_MESSAGE,
PROVIDER_RATE_LIMIT_OR_QUOTA_ERROR_USER_MESSAGE,
} from "./provider-request-error-classifier.js";
describe("provider request error classifier", () => {
it("classifies provider HTTP 401 authentication failures", () => {
const message =
"unexpected status 401 Unauthorized: Missing bearer or basic authentication in header, url: https://api.openai.com/v1/responses";
expect(classifyProviderRequestError(new Error(message))).toEqual({
code: "provider_authentication_error",
userMessage: PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE,
technicalMessage: message,
});
});
it("classifies typed authentication failures without relying on raw provider text", () => {
const error = new FailoverError("LLM request unauthorized.", {
reason: "auth",
provider: "openai",
model: "gpt-5.5",
status: 401,
});
expect(classifyProviderRequestError(error)).toEqual({
code: "provider_authentication_error",
userMessage: PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE,
technicalMessage: "LLM request unauthorized.",
});
});
it("does not label typed HTTP 403 authorization failures as HTTP 401", () => {
const error = new FailoverError("Provider access denied.", {
reason: "auth_permanent",
provider: "openai",
model: "gpt-5.5",
status: 403,
});
expect(classifyProviderRequestError(error)).toBeUndefined();
});
it("leaves unrelated HTTP 401 failures unclassified", () => {
expect(
classifyProviderRequestError(
new Error("401 input item id does not belong to this conversation"),
),
).toBeUndefined();
});
it.each([
[
"OpenAI missing custom tool output",

View File

@@ -1,9 +1,15 @@
// Classifies provider request failures into retry and user-facing categories.
import { normalizeLowercaseStringOrEmpty } from "@openclaw/normalization-core/string-coerce";
import {
AUTH_INVALID_TOKEN_USER_TEXT,
classifyProviderRuntimeFailureKind,
} from "../../agents/embedded-agent-helpers/errors.js";
import { isFailoverError } from "../../agents/failover-error.js";
import { formatErrorMessage } from "../../infra/errors.js";
/** Provider request error classes that get a specialized user-facing reply. */
export type ProviderRequestErrorCode =
| "provider_authentication_error"
| "provider_conversation_state_error"
| "provider_internal_error"
| "provider_rate_limit_or_quota_error";
@@ -25,11 +31,24 @@ export const PROVIDER_RATE_LIMIT_OR_QUOTA_ERROR_USER_MESSAGE =
export const PROVIDER_INTERNAL_ERROR_USER_MESSAGE =
"⚠️ The model provider returned a temporary internal error before replying. Try again in a moment, or switch to another model if it keeps happening.";
export const PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE = `⚠️ ${AUTH_INVALID_TOKEN_USER_TEXT}`;
/** Classifies provider request failures that are actionable for users. */
export function classifyProviderRequestError(
err: unknown,
): ProviderRequestErrorClassification | undefined {
const technicalMessage = formatErrorMessage(err);
const isTypedAuthFailure = isFailoverError(err) && err.reason === "auth" && err.status === 401;
if (
isTypedAuthFailure ||
classifyProviderRuntimeFailureKind(technicalMessage) === "auth_invalid_token"
) {
return {
code: "provider_authentication_error",
userMessage: PROVIDER_AUTHENTICATION_ERROR_USER_MESSAGE,
technicalMessage,
};
}
if (
hasHttp429Evidence(err, technicalMessage) &&
isGenericProviderRuntimeErrorMessage(technicalMessage)

View File

@@ -0,0 +1,54 @@
// Unit tests for failure-alert SQLite column codec roundtrip.
import { describe, expect, it } from "vitest";
import { bindFailureAlertColumns, failureAlertFromRow } from "./failure-alert-codec.js";
import type { CronJobRow } from "./schema.js";
function roundtrip(
input: Parameters<typeof bindFailureAlertColumns>[0],
): ReturnType<typeof failureAlertFromRow> {
const columns = bindFailureAlertColumns(input);
return failureAlertFromRow(columns as CronJobRow);
}
describe("failureAlertFromRow", () => {
it("round-trips disabled config (false)", () => {
expect(roundtrip(false)).toBe(false);
});
it("round-trips undefined (no alert config) as undefined", () => {
expect(roundtrip(undefined)).toBeUndefined();
});
it("round-trips enabled-with-defaults ({}) as {}", () => {
const result = roundtrip({});
expect(result).toEqual({});
});
it("round-trips populated config with all fields", () => {
const config = {
after: 3,
cooldownMs: 120_000,
channel: "telegram" as const,
to: "@user",
mode: "announce" as const,
accountId: "acc-1",
includeSkipped: true,
};
expect(roundtrip(config)).toEqual(config);
});
it("round-trips partial config (only after)", () => {
expect(roundtrip({ after: 5 })).toEqual({ after: 5 });
});
it("enabled-with-defaults does not collapse to undefined on read", () => {
const columns = bindFailureAlertColumns({});
const row = columns as CronJobRow;
expect(row.failure_alert_disabled).toBe(0);
expect(row.failure_alert_after).toBeNull();
const decoded = failureAlertFromRow(row);
expect(decoded).toEqual({});
expect(decoded).not.toBeUndefined();
expect(decoded).toBeTruthy();
});
});

View File

@@ -46,6 +46,7 @@ export function failureAlertFromRow(row: CronJobRow): CronFailureAlert | false |
if (row.failure_alert_disabled === 1) {
return false;
}
const failureAlertExplicitlyEnabled = row.failure_alert_disabled === 0;
if (
row.failure_alert_after == null &&
!row.failure_alert_channel &&
@@ -53,7 +54,8 @@ export function failureAlertFromRow(row: CronJobRow): CronFailureAlert | false |
row.failure_alert_cooldown_ms == null &&
row.failure_alert_include_skipped == null &&
!row.failure_alert_mode &&
!row.failure_alert_account_id
!row.failure_alert_account_id &&
!failureAlertExplicitlyEnabled
) {
return undefined;
}

View File

@@ -14,6 +14,7 @@ export {
readProviderJsonArrayFieldResponse,
readProviderJsonObjectResponse,
readProviderJsonResponse,
readProviderTextResponse,
readResponseTextLimited,
truncateErrorDetail,
} from "../agents/provider-http-errors.js";

View File

@@ -34,6 +34,8 @@ const REQUIRED_REVIEWED_PUBLISHABLE_CRITICAL_FINDINGS = new Set([
"@openclaw/google-meet:dangerous-exec:src/node-host.ts",
"@openclaw/google-meet:dangerous-exec:src/realtime.ts",
"@openclaw/matrix:dangerous-exec:src/matrix/deps.ts",
"@openclaw/raft:dangerous-exec:src/gateway.ts",
"@openclaw/signal:dangerous-exec:src/daemon.ts",
"@openclaw/voice-call:dangerous-exec:src/tunnel.ts",
"@openclaw/voice-call:dangerous-exec:src/webhook/tailscale.ts",
]);

View File

@@ -664,6 +664,7 @@ describe("ci workflow guards", () => {
expect(qaEvidenceWorkflow.on.workflow_dispatch.inputs).not.toHaveProperty("fail_on_qa_failure");
expect(qaEvidenceWorkflow.on.workflow_call.inputs).not.toHaveProperty("fail_on_qa_failure");
expect(qaEvidenceWorkflow.on.workflow_dispatch.inputs.qa_profile).not.toHaveProperty("options");
expect(qaEvidenceWorkflow.on.workflow_dispatch.inputs.qa_profile.default).toBe("all");
expect(qaEvidenceWorkflow.on.workflow_call.inputs.qa_profile.type).toBe("string");
const validateProfileStep = qaRunJob.steps.find(
(step) => step.name === "Validate QA profile input",
@@ -682,7 +683,7 @@ describe("ci workflow guards", () => {
// Keep the caller's ref while the callee verifies it against expected_sha.
ref: "${{ inputs.ref }}",
expected_sha: "${{ needs.validate_selected_ref.outputs.selected_revision }}",
qa_profile: "release",
qa_profile: "all",
});
expect(generateJob.with).not.toHaveProperty("fail_on_qa_failure");
@@ -713,6 +714,8 @@ describe("ci workflow guards", () => {
(step) => step.name === "Validate QA evidence manifest",
);
expect(validateManifestStep.run).toContain("qa-profile-evidence-manifest.json");
expect(validateManifestStep.run).toContain("qa-evidence.json profile must be all");
expect(validateManifestStep.run).toContain("QA evidence manifest profile must be all");
expect(validateManifestStep.run).toContain("manifest.targetSha !== targetSha");
expect(qaRunJob.outputs.artifact_name).toBe("${{ steps.evidence.outputs.artifact_name }}");

View File

@@ -56,14 +56,12 @@ const QR_IMPORT_DOCKER_E2E_PATH = "scripts/e2e/qr-import-docker.sh";
const MULTI_NODE_UPDATE_DOCKER_E2E_PATH = "scripts/e2e/multi-node-update-docker.sh";
const BUNDLED_PLUGIN_INSTALL_UNINSTALL_E2E_PATH =
"scripts/e2e/bundled-plugin-install-uninstall-docker.sh";
const AGENT_BUNDLE_MCP_TOOLS_DOCKER_E2E_PATH =
"scripts/e2e/agent-bundle-mcp-tools-docker.sh";
const AGENT_BUNDLE_MCP_TOOLS_DOCKER_E2E_PATH = "scripts/e2e/agent-bundle-mcp-tools-docker.sh";
const COMMITMENTS_SAFETY_DOCKER_E2E_PATH = "scripts/e2e/commitments-safety-docker.sh";
const CRESTODIAN_FIRST_RUN_DOCKER_E2E_PATH = "scripts/e2e/crestodian-first-run-docker.sh";
const CRESTODIAN_PLANNER_DOCKER_E2E_PATH = "scripts/e2e/crestodian-planner-docker.sh";
const CRESTODIAN_RESCUE_DOCKER_E2E_PATH = "scripts/e2e/crestodian-rescue-docker.sh";
const SESSION_RUNTIME_CONTEXT_DOCKER_E2E_PATH =
"scripts/e2e/session-runtime-context-docker.sh";
const SESSION_RUNTIME_CONTEXT_DOCKER_E2E_PATH = "scripts/e2e/session-runtime-context-docker.sh";
const BUNDLED_PLUGIN_INSTALL_UNINSTALL_SWEEP_PATH =
"scripts/e2e/lib/bundled-plugin-install-uninstall/sweep.sh";
const BUNDLED_PLUGIN_INSTALL_UNINSTALL_PROBE_PATH =
@@ -2804,6 +2802,14 @@ grep -Fxq preserved "$TMPDIR/caller-fd"
}
});
it("gives Codex on-demand package installs enough time to reach Codex assertions", () => {
const runner = readFileSync(CODEX_ON_DEMAND_DOCKER_E2E_PATH, "utf8");
expect(runner).toContain(
'export OPENCLAW_E2E_NPM_INSTALL_TIMEOUT="${OPENCLAW_E2E_NPM_INSTALL_TIMEOUT:-1200s}"',
);
});
it("cleans package-backed onboarding and plugin Docker artifacts on every exit path", () => {
for (const path of [
CODEX_ON_DEMAND_DOCKER_E2E_PATH,
@@ -4188,7 +4194,7 @@ output="$(cat "$sampler_log")"
const client = readFileSync(OPENAI_WEB_SEARCH_MINIMAL_CLIENT_PATH, "utf8");
expect(runner).toContain(
"PORT=\"$(docker_e2e_read_tcp_port_env OPENCLAW_OPENAI_WEB_SEARCH_MINIMAL_PORT 18789)\"",
'PORT="$(docker_e2e_read_tcp_port_env OPENCLAW_OPENAI_WEB_SEARCH_MINIMAL_PORT 18789)"',
);
expect(runner).toContain('MOCK_PORT="80"');
expect(runner).not.toContain("OPENCLAW_OPENAI_WEB_SEARCH_MINIMAL_MOCK_PORT");

View File

@@ -3,11 +3,32 @@ import { spawnSync } from "node:child_process";
import fs from "node:fs";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
import { parse as parseYaml } from "yaml";
import { createTempDirTracker } from "../helpers/temp-dir.js";
const repoRoot = path.resolve(__dirname, "../..");
const tempDirs = createTempDirTracker();
type TaxonomyFixture = {
surfaces?: TaxonomySurfaceFixture[];
};
type TaxonomySurfaceFixture = {
id?: string;
status?: string;
categories?: TaxonomyCategoryFixture[];
};
type TaxonomyCategoryFixture = {
id?: string;
name?: string;
features?: TaxonomyFeatureFixture[];
};
type TaxonomyFeatureFixture = {
coverageIds?: string[];
};
afterEach(() => {
tempDirs.cleanup();
});
@@ -26,7 +47,33 @@ function runCli(...args: string[]) {
function writeQaEvidence(params: {
dir: string;
entries: Array<{ id: string; status: "pass" | "fail" | "blocked" | "skipped" }>;
scorecard?: unknown;
}) {
const scorecard = params.scorecard ?? {
filters: { surface: null, category: null },
run: { evidenceEntryCount: params.entries.length },
categories: {
total: 0,
fulfilled: 0,
partial: 0,
missing: 0,
fulfillmentPercent: 0,
},
features: {
total: 0,
fulfilled: 0,
partial: 0,
missing: 0,
fulfillmentPercent: 0,
},
coverageIds: {
total: 0,
fulfilled: 0,
missing: 0,
fulfillmentPercent: 0,
},
categoryReports: [],
};
fs.mkdirSync(params.dir, { recursive: true });
fs.writeFileSync(
path.join(params.dir, "qa-evidence.json"),
@@ -47,31 +94,7 @@ function writeQaEvidence(params: {
coverage: [{ id: "tools.evidence", role: "primary" }],
result: { status: entry.status },
})),
scorecard: {
filters: { surface: null, category: null },
run: { evidenceEntryCount: params.entries.length },
categories: {
total: 0,
fulfilled: 0,
partial: 0,
missing: 0,
fulfillmentPercent: 0,
},
features: {
total: 0,
fulfilled: 0,
partial: 0,
missing: 0,
fulfillmentPercent: 0,
},
coverageIds: {
total: 0,
fulfilled: 0,
missing: 0,
fulfillmentPercent: 0,
},
categoryReports: [],
},
scorecard,
},
null,
2,
@@ -80,6 +103,73 @@ function writeQaEvidence(params: {
);
}
function allProfileScorecardFixture() {
const taxonomy = parseYaml(
fs.readFileSync(path.join(repoRoot, "taxonomy.yaml"), "utf8"),
) as TaxonomyFixture;
const activeSurfaces = (taxonomy.surfaces ?? []).filter(
(surface) => surface.status !== "retired",
);
const categoryReports = activeSurfaces.flatMap((surface) =>
(surface.categories ?? []).map((category) => {
const coverageIds = [
...new Set((category.features ?? []).flatMap((feature) => feature.coverageIds ?? [])),
].sort();
return {
id: `${surface.id}.${category.id}`,
surfaceId: surface.id,
name: category.name,
status: "missing",
features: {
total: category.features.length,
fulfilled: 0,
partial: 0,
missing: category.features.length,
fulfillmentPercent: 0,
},
coverageIds: {
total: coverageIds.length,
fulfilled: 0,
missing: coverageIds.length,
fulfillmentPercent: 0,
secondaryOnly: 0,
},
missingCoverageIds: coverageIds,
};
}),
);
const featureCount = categoryReports.reduce((count, report) => count + report.features.total, 0);
const coverageIdCount = categoryReports.reduce(
(count, report) => count + report.coverageIds.total,
0,
);
return {
filters: { surface: null, category: null },
run: { evidenceEntryCount: 1 },
categories: {
total: categoryReports.length,
fulfilled: 0,
partial: 0,
missing: categoryReports.length,
fulfillmentPercent: 0,
},
features: {
total: featureCount,
fulfilled: 0,
partial: 0,
missing: featureCount,
fulfillmentPercent: 0,
},
coverageIds: {
total: coverageIdCount,
fulfilled: 0,
missing: coverageIdCount,
fulfillmentPercent: 0,
},
categoryReports,
};
}
describe("maturity docs renderer CLI", () => {
it("checks maturity inputs without requiring QA evidence artifacts", () => {
const result = runCli("--check");
@@ -113,13 +203,7 @@ describe("maturity docs renderer CLI", () => {
],
});
const result = runCli(
"--output-dir",
outputDir,
"--evidence-dir",
evidenceDir,
"--strict-inputs",
);
const result = runCli("--output-dir", outputDir, "--evidence-dir", evidenceDir);
expect(result.status).toBe(1);
expect(result.stdout).toBe("");
@@ -147,4 +231,23 @@ describe("maturity docs renderer CLI", () => {
expect(scorecard).not.toContain("0 failed");
expect(scorecard).not.toContain("0 blocked");
});
it("renders the maturity score from quality and completeness without coverage", () => {
const outputDir = tempDirs.make("openclaw-maturity-docs-output-");
const evidenceDir = tempDirs.make("openclaw-maturity-docs-evidence-");
writeQaEvidence({
dir: evidenceDir,
entries: [{ id: "passing-scenario", status: "pass" }],
scorecard: allProfileScorecardFixture(),
});
const result = runCli("--output-dir", outputDir, "--evidence-dir", evidenceDir);
expect(result.status).toBe(0);
const scorecard = fs.readFileSync(path.join(outputDir, "maturity", "scorecard.md"), "utf8");
expect(scorecard).toContain("<span>Maturity score</span>");
expect(scorecard).toContain('<span className="maturity-summary-value">67%</span>');
expect(scorecard).toContain("Coverage Experimental - 0%");
expect(scorecard).toContain("end-to-end coverage above 90%");
});
});