diff --git a/.agents/skills/crabbox/SKILL.md b/.agents/skills/crabbox/SKILL.md index a0c68f296722..66443818b1fe 100644 --- a/.agents/skills/crabbox/SKILL.md +++ b/.agents/skills/crabbox/SKILL.md @@ -127,6 +127,7 @@ Read the JSON summary. Useful fields: - `provider`: should be `blacksmith-testbox` - `leaseId`: `tbx_...` - `syncDelegated`: should be `true` +- `commandPhases`: populated when the command prints `CRABBOX_PHASE:` - `commandMs` / `totalMs` - `exitCode` @@ -138,6 +139,47 @@ unclear: blacksmith testbox list ``` +## Observability Flags + +Use these on debugging runs before inventing ad hoc logging: + +- `--preflight`: prints run context, workspace mode, SSH target, remote user/cwd, + sudo/apt, Node, pnpm, Docker, and bubblewrap. On `blacksmith-testbox`, this + prints a delegated-unsupported note because the workflow owns setup. +- `CRABBOX_ENV_ALLOW=NAME,...`: forwards only listed local env vars for direct + providers and prints `set len=N secret=true` style summaries. On + `blacksmith-testbox`, env forwarding is unsupported; put secrets in the + Testbox workflow instead. +- `--capture-stdout ` / `--capture-stderr `: write remote streams to + local files and keep binary/noisy output out of retained logs. Parent + directories must already exist. These are direct-provider only. +- `--capture-on-fail`: on non-zero direct-provider exits, downloads + `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`, + `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed. +- `--timing-json`: final machine-readable timing. Add + `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell + commands; direct providers and Blacksmith Testbox both report them as + `commandPhases`. + +Live-provider debug template for direct AWS/Hetzner leases: + +```sh +mkdir -p .crabbox/logs +CRABBOX_ENV_ALLOW=OPENAI_API_KEY,OPENAI_BASE_URL \ + pnpm crabbox:run -- --provider aws \ + --preflight \ + --timing-json \ + --capture-stdout .crabbox/logs/live-provider.stdout.log \ + --capture-stderr .crabbox/logs/live-provider.stderr.log \ + --capture-on-fail \ + --shell -- \ + "echo CRABBOX_PHASE:install; pnpm install --frozen-lockfile; echo CRABBOX_PHASE:test; pnpm test:live" +``` + +Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or +`--sync-only` to delegated providers. Crabbox rejects them because the provider +owns sync or command transport. + ## Efficient Bug E2E Verification Use the smallest Crabbox lane that proves the reported user path, not just the diff --git a/.agents/skills/openclaw-debugging/SKILL.md b/.agents/skills/openclaw-debugging/SKILL.md new file mode 100644 index 000000000000..ba240dc315c1 --- /dev/null +++ b/.agents/skills/openclaw-debugging/SKILL.md @@ -0,0 +1,113 @@ +--- +name: openclaw-debugging +description: Debug OpenClaw model, provider, tool-surface, code-mode, streaming, and live/Crabbox behavior by choosing the right logs, probes, and proof path before changing code. +--- + +# OpenClaw Debugging + +Use this skill when OpenClaw behavior differs between local tests, live models, +providers, code mode, Tool Search, Crabbox, or CI, and the next move should be a +debug signal rather than a guess. + +## Read First + +- `docs/logging.md` for log files, `openclaw logs`, and targeted debug flags. +- `docs/reference/test.md` for local test commands. +- `docs/reference/code-mode.md` for code-mode exec/wait and tool catalog rules. +- Use `$openclaw-testing` for choosing test lanes. +- Use `$crabbox` for broad, Docker, package, Linux, live-key, or CI-parity proof. + +## Default Loop + +1. State the suspected boundary: config, tool construction, provider payload, + fetch, stream/SSE, transcript replay, worker/runtime, package/dist, or CI. +2. Add or enable the narrowest signal that proves that boundary. +3. Reproduce with the same provider/model/config. Do not randomly switch models + unless the model itself is the variable being tested. +4. Compare configured state with actual run activation. +5. Patch the root cause. +6. Rerun the exact failing probe, then broaden only if the contract requires it. + +## Model Transport Logs + +Use targeted env flags instead of global debug when the model request shape or +stream timing matters: + +```bash +OPENCLAW_DEBUG_MODEL_TRANSPORT=1 openclaw gateway +OPENCLAW_DEBUG_MODEL_PAYLOAD=tools OPENCLAW_DEBUG_SSE=events openclaw gateway +OPENCLAW_DEBUG_MODEL_PAYLOAD=full-redacted OPENCLAW_DEBUG_SSE=peek openclaw gateway +``` + +Useful flags: + +- `OPENCLAW_DEBUG_MODEL_TRANSPORT=1`: request start, fetch response, SDK + headers, first SSE event, stream done, and transport errors at `info`. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=summary`: bounded payload summary. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=tools`: all model-facing tool names. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=full-redacted`: capped, redacted JSON payload. + Use only while debugging; prompts/message text may still appear. +- `OPENCLAW_DEBUG_SSE=events`: first-event and stream-completion timing. +- `OPENCLAW_DEBUG_SSE=peek`: first five redacted SSE events. +- `OPENCLAW_DEBUG_CODE_MODE=1`: code-mode tool-surface diagnostics. + +Watch logs with: + +```bash +openclaw logs --follow +``` + +## Common Boundaries + +- **Config vs activation:** config can be enabled while the run disables tools, + is raw, has an empty allowlist, or lacks model tool support. Check the actual + visible tools before enforcing provider payload invariants. +- **Tool surface:** inspect final model-visible tool names, not only the tool + registry or config. Code mode means exactly `exec` and `wait` only after it + actually activates. +- **Provider payload:** log fields, model id, service tier, reasoning, input + size, metadata keys, prompt-cache key presence, and tool names before SDK + call. +- **Fetch vs SSE:** fetch response proves HTTP headers arrived; first SSE event + proves provider body progress. A gap here is a stream/body/provider issue, not + tool execution. +- **Worker/dist:** run `pnpm build` when touching workers, dynamic imports, + package exports, lazy runtime boundaries, or published paths. +- **Live keys:** check local `~/.profile` for key presence/length before saying + live proof is blocked. Never print secrets. + +## Code Pointers + +- Model payload + Responses stream: + `src/agents/openai-transport-stream.ts` +- Guarded fetch/timing: + `src/agents/provider-transport-fetch.ts` +- OpenAI/Codex provider wrappers: + `src/agents/pi-embedded-runner/openai-stream-wrappers.ts` +- Tool construction, Tool Search, code-mode activation: + `src/agents/pi-embedded-runner/run/attempt.ts` +- Code-mode runtime and worker: + `src/agents/code-mode.ts` + `src/agents/code-mode.worker.ts` +- Tool Search catalog: + `src/agents/tool-search.ts` + +## Proof Choice + +- Single helper/payload bug: local targeted Vitest. +- Docs/logging-only: `pnpm check:docs` and `git diff --check`. +- Worker/dist/lazy import/package surface: targeted tests plus `pnpm build`. +- Live provider/model behavior: same provider/model with debug flags and a real + key if available. +- Docker/package/Linux/CI-parity: `$crabbox`. +- CI failure: exact SHA, relevant job only, logs only after failure/completion. + +## Output Habit + +Report: + +- boundary tested +- exact command/env shape, redacted +- observed signal, such as tool names or first SSE event timing +- fix location +- narrow proof and any remaining risk diff --git a/.agents/skills/openclaw-debugging/agents/openai.yaml b/.agents/skills/openclaw-debugging/agents/openai.yaml new file mode 100644 index 000000000000..3d6e559b6f64 --- /dev/null +++ b/.agents/skills/openclaw-debugging/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "OpenClaw Debugging" + short_description: "Debug model, tool, stream, and live behavior" + default_prompt: "Use $openclaw-debugging to identify the right OpenClaw debug boundary, turn on targeted logs, and choose the narrowest local or Crabbox proof." diff --git a/.gitignore b/.gitignore index 1c3d3cf5888c..b68d487502a3 100644 --- a/.gitignore +++ b/.gitignore @@ -118,6 +118,8 @@ USER.md !.agents/skills/gitcrawl/ !.agents/skills/gitcrawl/** !.agents/skills/openclaw-docs/** +!.agents/skills/openclaw-debugging/ +!.agents/skills/openclaw-debugging/** !.agents/skills/openclaw-ghsa-maintainer/ !.agents/skills/openclaw-ghsa-maintainer/** !.agents/skills/openclaw-parallels-smoke/ diff --git a/CHANGELOG.md b/CHANGELOG.md index e4aa91230d85..ceedb9965f5c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,6 +12,7 @@ Docs: https://docs.openclaw.ai - Build: enable stricter Vitest lint rules for focused, disabled, conditional, hook, matcher, and expectation hazards. - Build: pin explicit oxfmt defaults in the shared formatter config to keep formatting behavior stable across upgrades. - TypeScript: enable stricter compiler checks for implicit returns, side-effect imports, overrides, and unused production code. +- Logging: add targeted model transport, payload, SSE, and code-mode diagnostics with redacted URL handling. - Agents: allow `session.agentToAgent.maxPingPongTurns` up to 20 while keeping the default at 5 for longer agent-to-agent reply chains. Fixes #52382. (#52400) Thanks @thirumaleshp. - Agents: add per-agent `tools.message.crossContext` overrides so sandboxed/public agents can restrict message sends to the current conversation without changing the global bot policy. - Agents: add per-agent `tools.message.actions.allow` overrides so sandboxed/public agents can expose and enforce send-only message tools. @@ -828,6 +829,7 @@ Docs: https://docs.openclaw.ai - Sandbox/Windows: accept drive-absolute Docker bind sources while keeping sandbox blocked-path and allowed-root policy comparisons Windows-case-insensitive. (#42174) Thanks @6607changchun. ### Fixes + - Browser/chrome-mcp: read Chrome DevTools MCP screenshot output from the extension-suffixed path, fixing ENOENT on screenshot capture. Fixes #77222. (#74685) Thanks @barbarhan. - macOS/launchd: set generated Gateway LaunchAgent plists to `ProcessType=Interactive` so the gateway keeps timely execution during idle periods. Fixes #58061; refs #62294 and closed duplicate #66992. (#62308) Thanks @bryanpearson and @zssggle-rgb. diff --git a/docs/help/environment.md b/docs/help/environment.md index 7a91acbbb35d..4d7e548d04ff 100644 --- a/docs/help/environment.md +++ b/docs/help/environment.md @@ -112,9 +112,13 @@ Both resolve from process env at activation time. SecretRef details are document ## Logging -| Variable | Purpose | -| -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `OPENCLAW_LOG_LEVEL` | Override log level for both file and console (e.g. `debug`, `trace`). Takes precedence over `logging.level` and `logging.consoleLevel` in config. Invalid values are ignored with a warning. | +| Variable | Purpose | +| -------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `OPENCLAW_LOG_LEVEL` | Override log level for both file and console (e.g. `debug`, `trace`). Takes precedence over `logging.level` and `logging.consoleLevel` in config. Invalid values are ignored with a warning. | +| `OPENCLAW_DEBUG_MODEL_TRANSPORT` | Emit targeted model request/response timing diagnostics at `info` level without enabling global debug logs. | +| `OPENCLAW_DEBUG_MODEL_PAYLOAD` | Model payload diagnostics: `summary`, `tools`, or `full-redacted`. `full-redacted` is capped and redacted but may include prompt/message text. | +| `OPENCLAW_DEBUG_SSE` | Streaming diagnostics: `events` for first/done timing, `peek` to include the first five redacted SSE events. | +| `OPENCLAW_DEBUG_CODE_MODE` | Code-mode model-surface diagnostics, including provider-tool hiding and exec/wait-only enforcement. | ### `OPENCLAW_HOME` diff --git a/docs/logging.md b/docs/logging.md index cf82487dedfb..dfd4726d0fb2 100644 --- a/docs/logging.md +++ b/docs/logging.md @@ -175,6 +175,39 @@ You can override both via the **`OPENCLAW_LOG_LEVEL`** environment variable (e.g `--verbose` only affects console output and WS log verbosity; it does not change file log levels. +### Targeted model transport diagnostics + +When debugging provider calls, use targeted environment flags instead of raising +all logs to `debug`: + +```bash +OPENCLAW_DEBUG_MODEL_TRANSPORT=1 openclaw gateway +OPENCLAW_DEBUG_MODEL_PAYLOAD=tools OPENCLAW_DEBUG_SSE=events openclaw gateway +``` + +Available flags: + +- `OPENCLAW_DEBUG_MODEL_TRANSPORT=1`: emit request start, fetch response, SDK + headers, first streaming event, stream completion, and transport errors at + `info` level. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=summary`: include a bounded request payload + summary in model request logs. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=tools`: include all model-facing tool names in + the payload summary. +- `OPENCLAW_DEBUG_MODEL_PAYLOAD=full-redacted`: include a redacted, capped JSON + payload snapshot. Use only while debugging; secrets are redacted but prompts + and message text may still be present. +- `OPENCLAW_DEBUG_SSE=events`: emit first-event and stream-completion timing. +- `OPENCLAW_DEBUG_SSE=peek`: also emit the first five redacted SSE event + payloads, capped per event. +- `OPENCLAW_DEBUG_CODE_MODE=1`: emit code-mode model-surface diagnostics, + including when native provider tools are hidden because code mode owns the + tool surface. + +These flags log through normal OpenClaw logging, so `openclaw logs --follow` +and the Control UI Logs tab show them. Without the flags, the same diagnostics +remain available at `debug` level. + ### Trace correlation File logs are JSONL. When a log call carries a valid diagnostic trace context, diff --git a/docs/tools/tool-search.md b/docs/tools/tool-search.md index 30e891a819a1..e54216486cf3 100644 --- a/docs/tools/tool-search.md +++ b/docs/tools/tool-search.md @@ -7,9 +7,14 @@ read_when: - You are implementing or debugging tool discovery for PI runs --- -Tool Search gives PI agents one compact way to discover and call large tool -catalogs. It is useful when the run has many available tools but the model is -likely to need only a few of them. +Tool Search is an experimental OpenClaw PI-agent feature. It gives PI agents one +compact way to discover and call large tool catalogs. It is useful when the run +has many available tools but the model is likely to need only a few of them. + +This page documents OpenClaw PI Tool Search. It is not the Codex-native tool +search or dynamic-tools surface. Codex-native code mode, tool search, deferred +dynamic tools, and nested tool calls are stable Codex harness surfaces and do +not depend on `tools.toolSearch`. When enabled for PI, the model receives one `tool_search_code` tool by default. That tool runs a short JavaScript body in an isolated Node subprocess with an @@ -29,9 +34,10 @@ client-provided tools. The model does not see every full schema up front. Instead, it searches compact descriptors, describes one selected tool when it needs the exact schema, and calls that tool through OpenClaw. -Codex harness runs do not receive these OpenClaw Tool Search controls. OpenClaw -passes product capabilities to Codex as dynamic tools, and Codex owns native -code mode, native tool search, deferred dynamic tools, and nested tool calls. +Codex harness runs do not receive these experimental OpenClaw Tool Search +controls. OpenClaw passes product capabilities to Codex as dynamic tools, and +Codex owns the stable native code mode, native tool search, deferred dynamic +tools, and nested tool calls. ## How a turn runs @@ -64,6 +70,9 @@ shape the model sees. If the current runtime cannot launch the isolated Node code-mode child process, the default `code` mode falls back to `tools` before catalog compaction. +Both modes are experimental. Prefer direct tool exposure for small PI tool +catalogs, and prefer the Codex-native stable surfaces for Codex harness runs. + There is no separate source-selection config. When Tool Search is enabled, the catalog includes eligible OpenClaw, MCP, and client tools after normal policy filtering. diff --git a/src/agents/model-transport-debug.ts b/src/agents/model-transport-debug.ts new file mode 100644 index 000000000000..2c9aa6eaaac4 --- /dev/null +++ b/src/agents/model-transport-debug.ts @@ -0,0 +1,70 @@ +import type { createSubsystemLogger } from "../logging/subsystem.js"; + +type SubsystemLogger = ReturnType; + +type ModelTransportDebugEnv = NodeJS.ProcessEnv; + +export type ModelPayloadDebugMode = "off" | "summary" | "tools" | "full-redacted"; +export type ModelSseDebugMode = "off" | "events" | "peek"; + +function normalizeEnv(value: unknown): string { + return typeof value === "string" ? value.trim().toLowerCase() : ""; +} + +function isTruthyEnv(value: unknown): boolean { + const normalized = normalizeEnv(value); + return ( + normalized.length > 0 && + normalized !== "0" && + normalized !== "false" && + normalized !== "off" && + normalized !== "no" + ); +} + +export function resolveModelPayloadDebugMode( + env: ModelTransportDebugEnv = process.env, +): ModelPayloadDebugMode { + const normalized = normalizeEnv(env.OPENCLAW_DEBUG_MODEL_PAYLOAD); + if (normalized === "tools" || normalized === "full-redacted") { + return normalized; + } + if (normalized === "summary") { + return "summary"; + } + return "off"; +} + +export function resolveModelSseDebugMode( + env: ModelTransportDebugEnv = process.env, +): ModelSseDebugMode { + const normalized = normalizeEnv(env.OPENCLAW_DEBUG_SSE); + if (normalized === "peek") { + return "peek"; + } + if (normalized === "events" || isTruthyEnv(normalized)) { + return "events"; + } + return "off"; +} + +export function isModelTransportDebugEnabled(env: ModelTransportDebugEnv = process.env): boolean { + return ( + isTruthyEnv(env.OPENCLAW_DEBUG_MODEL_TRANSPORT) || + resolveModelPayloadDebugMode(env) !== "off" || + resolveModelSseDebugMode(env) !== "off" || + isTruthyEnv(env.OPENCLAW_DEBUG_CODE_MODE) + ); +} + +export function isCodeModeDebugEnabled(env: ModelTransportDebugEnv = process.env): boolean { + return isTruthyEnv(env.OPENCLAW_DEBUG_CODE_MODE) || isModelTransportDebugEnabled(env); +} + +export function emitModelTransportDebug(log: SubsystemLogger, message: string): void { + if (isModelTransportDebugEnabled()) { + log.info(message); + return; + } + log.debug(message); +} diff --git a/src/agents/model-transport-url.test.ts b/src/agents/model-transport-url.test.ts new file mode 100644 index 000000000000..e0d536baf6ba --- /dev/null +++ b/src/agents/model-transport-url.test.ts @@ -0,0 +1,31 @@ +import { describe, expect, it } from "vitest"; +import { + formatModelTransportDebugBaseUrl, + formatModelTransportDebugUrl, +} from "./model-transport-url.js"; +import { __testing as openAITesting } from "./openai-transport-stream.js"; + +describe("model transport diagnostic URLs", () => { + it("redacts credentials and request secrets from fetch URLs", () => { + expect( + formatModelTransportDebugUrl( + "https://user:token@example.com/v1/responses?api-key=secret#fragment", + ), + ).toBe("https://example.com/v1/responses"); + }); + + it("redacts credentials and query params from model base URLs", () => { + const baseUrl = "https://tenant:password@example.openai.azure.com/openai/v1?api-version=secret"; + expect(formatModelTransportDebugBaseUrl(baseUrl)).toBe( + "https://example.openai.azure.com/openai/v1", + ); + expect(openAITesting.formatModelTransportDebugBaseUrl(baseUrl)).toBe( + "https://example.openai.azure.com/openai/v1", + ); + }); + + it("does not echo unparsable URL strings", () => { + expect(formatModelTransportDebugUrl("https://user:token@")).toBe(""); + expect(formatModelTransportDebugBaseUrl(undefined)).toBe("default"); + }); +}); diff --git a/src/agents/model-transport-url.ts b/src/agents/model-transport-url.ts new file mode 100644 index 000000000000..a39ad5416a9b --- /dev/null +++ b/src/agents/model-transport-url.ts @@ -0,0 +1,16 @@ +export function formatModelTransportDebugUrl(rawUrl: string): string { + try { + const parsed = new URL(rawUrl); + parsed.username = ""; + parsed.password = ""; + parsed.search = ""; + parsed.hash = ""; + return parsed.toString(); + } catch { + return ""; + } +} + +export function formatModelTransportDebugBaseUrl(rawUrl: string | undefined): string { + return rawUrl ? formatModelTransportDebugUrl(rawUrl) : "default"; +} diff --git a/src/agents/openai-transport-stream.test.ts b/src/agents/openai-transport-stream.test.ts index 280049187cb6..a779064c9dc8 100644 --- a/src/agents/openai-transport-stream.test.ts +++ b/src/agents/openai-transport-stream.test.ts @@ -76,6 +76,70 @@ function expectRecordFields(record: unknown, expected: Record) } describe("openai transport stream", () => { + it("summarizes model payload tools with full names when requested", () => { + const previous = process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD; + process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD = "tools"; + try { + expect( + __testing.summarizeResponsesTools([ + { type: "function", name: "exec" }, + { type: "function", function: { name: "wait" } }, + ]), + ).toBe("count=2 names=exec,wait"); + } finally { + if (previous === undefined) { + delete process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD; + } else { + process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD = previous; + } + } + }); + + it("redacts full model payload debug summaries", () => { + const previous = process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD; + process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD = "full-redacted"; + try { + const summary = __testing.summarizeResponsesPayload({ + model: "gpt-5.5", + stream: true, + input: [], + tools: [{ type: "function", name: "exec" }], + apiKey: "sk-abcdefghijklmnopqrstuvwxyz", + }); + expect(summary).toContain("payload="); + expect(summary).toContain("sk-abc"); + expect(summary).not.toContain("sk-abcdefghijklmnopqrstuvwxyz"); + } finally { + if (previous === undefined) { + delete process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD; + } else { + process.env.OPENCLAW_DEBUG_MODEL_PAYLOAD = previous; + } + } + }); + + it("enforces the code mode responses tool surface before requests leave OpenClaw", () => { + const payload = { + tools: [ + { type: "function", name: "exec" }, + { type: "web_search_preview" }, + { type: "function", function: { name: "wait" } }, + ], + }; + + __testing.enforceCodeModeResponsesToolSurface(payload); + __testing.assertCodeModeResponsesToolSurface(payload); + expect(payload.tools).toHaveLength(2); + }); + + it("fails closed when the code mode final payload tool surface is not exec/wait", () => { + expect(() => + __testing.assertCodeModeResponsesToolSurface({ + tools: [{ type: "function", name: "exec" }, { type: "web_search_preview" }], + }), + ).toThrow(/Code mode payload tool surface violation/); + }); + it("adds OpenClaw attribution to native OpenAI transport headers and protects it from pi", () => { vi.stubEnv("OPENCLAW_VERSION", "2026.3.22"); const headers = __testing.buildOpenAIClientHeaders( diff --git a/src/agents/openai-transport-stream.ts b/src/agents/openai-transport-stream.ts index 408c51de1680..b905e14f67c3 100644 --- a/src/agents/openai-transport-stream.ts +++ b/src/agents/openai-transport-stream.ts @@ -23,11 +23,18 @@ import type { ResponseReasoningItem, } from "openai/resources/responses/responses.js"; import type { ModelCompatConfig } from "../config/types.models.js"; +import { redactSensitiveText } from "../logging/redact.js"; import { createSubsystemLogger } from "../logging/subsystem.js"; import type { ProviderRuntimeModel } from "../plugins/provider-runtime-model.types.js"; import { resolveProviderTransportTurnStateWithPlugin } from "../plugins/provider-runtime.js"; import { buildCopilotDynamicHeaders, hasCopilotVisionInput } from "./copilot-dynamic-headers.js"; import { createDeepSeekTextFilter } from "./deepseek-text-filter.js"; +import { + emitModelTransportDebug, + resolveModelPayloadDebugMode, + resolveModelSseDebugMode, +} from "./model-transport-debug.js"; +import { formatModelTransportDebugBaseUrl } from "./model-transport-url.js"; import { detectOpenAICompletionsCompat } from "./openai-completions-compat.js"; import { flattenCompletionMessagesToStringContent, @@ -77,6 +84,7 @@ type BaseStreamOptions = { sessionId?: string; onPayload?: (payload: unknown, model: Model) => unknown; headers?: Record; + openclawCodeModeToolSurface?: boolean; }; type OpenAIResponsesOptions = BaseStreamOptions & { @@ -181,6 +189,202 @@ function applyServiceTierPricing( usage.cost.input + usage.cost.output + usage.cost.cacheRead + usage.cost.cacheWrite; } +function safeDebugValue(value: unknown): string { + if (typeof value === "string") { + return value; + } + if (typeof value === "number" || typeof value === "boolean") { + return String(value); + } + if (value === null) { + return "null"; + } + if (value === undefined) { + return "undefined"; + } + return Array.isArray(value) ? "array" : typeof value; +} + +function responseInputTextChars(input: unknown): number { + if (typeof input === "string") { + return input.length; + } + if (Array.isArray(input)) { + return input.reduce((total, item) => total + responseInputTextChars(item), 0); + } + if (!input || typeof input !== "object") { + return 0; + } + const record = input as Record; + let total = 0; + if (typeof record.text === "string") { + total += record.text.length; + } + if (typeof record.content === "string") { + total += record.content.length; + } else if (Array.isArray(record.content)) { + total += responseInputTextChars(record.content); + } + return total; +} + +function responseInputRoles(input: unknown): string { + if (!Array.isArray(input)) { + return ""; + } + const roles = new Set(); + for (const item of input) { + if (item && typeof item === "object") { + const role = (item as Record).role; + if (typeof role === "string" && role.trim()) { + roles.add(role.trim()); + } + } + } + return [...roles].sort().join(","); +} + +function readResponsesToolDisplayName(tool: unknown): string { + if (!tool || typeof tool !== "object") { + return ""; + } + const record = tool as Record; + if (typeof record.name === "string") { + return record.name; + } + const fn = record.function; + if (fn && typeof fn === "object" && typeof (fn as Record).name === "string") { + return (fn as Record).name as string; + } + return typeof record.type === "string" ? record.type : ""; +} + +function summarizeResponsesTools(tools: unknown): string { + if (!Array.isArray(tools)) { + return "count=0"; + } + const names = tools.map(readResponsesToolDisplayName).filter(Boolean); + const mode = resolveModelPayloadDebugMode(); + const maxNames = mode === "tools" || mode === "full-redacted" ? names.length : 12; + const label = maxNames >= names.length ? "names" : "sample"; + const shown = names.slice(0, maxNames).join(","); + return `count=${tools.length}${shown ? ` ${label}=${shown}` : ""}`; +} + +function responsesPayloadToolName(tool: unknown): string | undefined { + if (!isRecord(tool)) { + return undefined; + } + if (typeof tool.name === "string") { + return tool.name; + } + const fn = tool.function; + return isRecord(fn) && typeof fn.name === "string" ? fn.name : undefined; +} + +function enforceCodeModeResponsesToolSurface(payload: unknown): void { + if (!isRecord(payload) || !Array.isArray(payload.tools)) { + return; + } + payload.tools = payload.tools.filter((tool) => { + const name = responsesPayloadToolName(tool); + return name === "exec" || name === "wait"; + }); +} + +function assertCodeModeResponsesToolSurface(payload: unknown): void { + if (!isRecord(payload) || !Array.isArray(payload.tools)) { + throw new Error("Code mode payload tool surface violation: expected exec,wait; got no tools"); + } + const names = payload.tools.map(responsesPayloadToolName).filter(Boolean).sort(); + if (names.length === 2 && names[0] === "exec" && names[1] === "wait") { + return; + } + throw new Error( + `Code mode payload tool surface violation: expected exec,wait; got ${ + names.length > 0 ? names.join(",") : "none" + }`, + ); +} + +function stringifyRedactedPayload(value: unknown): string { + try { + const encoded = JSON.stringify(value); + if (!encoded) { + return ""; + } + const redacted = redactSensitiveText(encoded, { mode: "tools" }); + return redacted.length > 8000 ? `${redacted.slice(0, 8000)}…` : redacted; + } catch { + return ""; + } +} + +function stringifyRedactedEvent(value: unknown): string { + const redacted = stringifyRedactedPayload(value); + return redacted.length > 2000 ? `${redacted.slice(0, 2000)}…` : redacted; +} + +function summarizeResponsesPayload(params: unknown): string { + if (!params || typeof params !== "object") { + return "payload=non-object"; + } + const record = params as Record; + const input = record.input; + const reasoning = + record.reasoning && typeof record.reasoning === "object" + ? (record.reasoning as Record) + : undefined; + const text = + record.text && typeof record.text === "object" + ? (record.text as Record) + : undefined; + const parts = [ + `fields=${Object.keys(record).sort().join(",")}`, + `model=${safeDebugValue(record.model)}`, + `stream=${safeDebugValue(record.stream)}`, + `inputItems=${Array.isArray(input) ? input.length : typeof input}`, + `inputRoles=${responseInputRoles(input) || "none"}`, + `inputTextChars=${responseInputTextChars(input)}`, + `tools=${summarizeResponsesTools(record.tools)}`, + `reasoningEffort=${safeDebugValue(reasoning?.effort)}`, + `reasoningSummary=${safeDebugValue(reasoning?.summary)}`, + `textVerbosity=${safeDebugValue(text?.verbosity)}`, + `serviceTier=${safeDebugValue(record.service_tier)}`, + `store=${safeDebugValue(record.store)}`, + `promptCacheKey=${record.prompt_cache_key === undefined ? "absent" : "present"}`, + `metadataKeys=${ + record.metadata && typeof record.metadata === "object" + ? Object.keys(record.metadata).sort().join(",") + : "none" + }`, + ]; + if (resolveModelPayloadDebugMode() === "full-redacted") { + parts.push(`payload=${stringifyRedactedPayload(record)}`); + } + return parts.join(" "); +} + +function summarizeOpenAITransportError(error: unknown): string { + if (!error || typeof error !== "object") { + return `type=${typeof error} message=${safeDebugValue(error)}`; + } + const record = error as Record; + const cause = + record.cause && typeof record.cause === "object" + ? (record.cause as Record) + : undefined; + return [ + `name=${safeDebugValue(record.name)}`, + `status=${safeDebugValue(record.status)}`, + `code=${safeDebugValue(record.code)}`, + `type=${safeDebugValue(record.type)}`, + `causeName=${safeDebugValue(cause?.name)}`, + `causeCode=${safeDebugValue(cause?.code)}`, + `message=${error instanceof Error ? error.message : safeDebugValue(error)}`, + ].join(" "); +} + export function resolveAzureOpenAIApiVersion(env = process.env): string { return env.AZURE_OPENAI_API_VERSION?.trim() || DEFAULT_AZURE_OPENAI_API_VERSION; } @@ -457,10 +661,30 @@ async function processResponsesStream( ) { let currentItem: Record | null = null; let currentBlock: Record | null = null; + const streamStartedAt = Date.now(); + let eventCount = 0; + const eventTypes = new Map(); + const sseDebugMode = resolveModelSseDebugMode(); const blockIndex = () => output.content.length - 1; for await (const rawEvent of openaiStream) { const event = rawEvent as Record; const type = stringifyUnknown(event.type); + eventCount += 1; + eventTypes.set(type, (eventTypes.get(type) ?? 0) + 1); + if (eventCount === 1) { + emitModelTransportDebug( + log, + `[responses] first_event provider=${model.provider} api=${model.api} model=${model.id} ` + + `elapsedMs=${Date.now() - streamStartedAt} type=${type}`, + ); + } + if (sseDebugMode === "peek" && eventCount <= 5) { + emitModelTransportDebug( + log, + `[responses] event_peek provider=${model.provider} api=${model.api} model=${model.id} ` + + `index=${eventCount} type=${type} event=${stringifyRedactedEvent(event)}`, + ); + } if (type === "response.created") { output.responseId = stringifyUnknown((event.response as { id?: string } | undefined)?.id); } else if (type === "response.output_item.added") { @@ -637,6 +861,16 @@ async function processResponsesStream( throw new Error(msg); } } + const eventTypeSummary = [...eventTypes.entries()] + .slice(0, 12) + .map(([eventType, count]) => `${eventType}:${count}`) + .join(","); + emitModelTransportDebug( + log, + `[responses] stream_done provider=${model.provider} api=${model.api} model=${model.id} ` + + `elapsedMs=${Date.now() - streamStartedAt} events=${eventCount} types=${eventTypeSummary} ` + + `stopReason=${output.stopReason ?? "unset"} contentBlocks=${output.content.length}`, + ); } function mapResponsesStopReason(status: string | undefined): string { @@ -806,10 +1040,30 @@ export function createOpenAIResponsesTransportStreamFn(): StreamFn { model, params as Record, ) as typeof params; + if ( + (options as { openclawCodeModeToolSurface?: unknown } | undefined) + ?.openclawCodeModeToolSurface === true + ) { + enforceCodeModeResponsesToolSurface(params); + assertCodeModeResponsesToolSurface(params); + } + const requestStartedAt = Date.now(); + const requestOptions = buildOpenAISdkRequestOptions(model, options?.signal); + emitModelTransportDebug( + log, + `[responses] start provider=${model.provider} api=${model.api} model=${model.id} ` + + `baseUrl=${formatModelTransportDebugBaseUrl(model.baseUrl)} timeoutMs=${safeDebugValue(requestOptions?.timeout)} ` + + `apiKey=${apiKey ? "present" : "missing"} ${summarizeResponsesPayload(params)}`, + ); const responseStream = (await client.responses.create( params as never, - buildOpenAISdkRequestOptions(model, options?.signal), + requestOptions, )) as unknown as AsyncIterable; + emitModelTransportDebug( + log, + `[responses] headers provider=${model.provider} api=${model.api} model=${model.id} ` + + `elapsedMs=${Date.now() - requestStartedAt}`, + ); stream.push({ type: "start", partial: output as never }); await processResponsesStream(responseStream, output, stream, model, { serviceTier: (options as OpenAIResponsesOptions | undefined)?.serviceTier, @@ -824,6 +1078,10 @@ export function createOpenAIResponsesTransportStreamFn(): StreamFn { stream.push({ type: "done", reason: output.stopReason as never, message: output as never }); stream.end(); } catch (error) { + log.warn( + `[responses] error provider=${model.provider} api=${model.api} model=${model.id} ` + + summarizeOpenAITransportError(error), + ); output.stopReason = options?.signal?.aborted ? "aborted" : "error"; output.errorMessage = error instanceof Error ? error.message : JSON.stringify(error); stream.push({ type: "error", reason: output.stopReason as never, error: output as never }); @@ -1135,10 +1393,30 @@ export function createAzureOpenAIResponsesTransportStreamFn(): StreamFn { model, params as Record, ) as typeof params; + if ( + (options as { openclawCodeModeToolSurface?: unknown } | undefined) + ?.openclawCodeModeToolSurface === true + ) { + enforceCodeModeResponsesToolSurface(params); + assertCodeModeResponsesToolSurface(params); + } + const requestStartedAt = Date.now(); + const requestOptions = buildOpenAISdkRequestOptions(model, options?.signal); + emitModelTransportDebug( + log, + `[responses] start provider=${model.provider} api=${model.api} model=${model.id} ` + + `baseUrl=${formatModelTransportDebugBaseUrl(model.baseUrl)} timeoutMs=${safeDebugValue(requestOptions?.timeout)} ` + + `apiKey=${apiKey ? "present" : "missing"} ${summarizeResponsesPayload(params)}`, + ); const responseStream = (await client.responses.create( params as never, - buildOpenAISdkRequestOptions(model, options?.signal), + requestOptions, )) as unknown as AsyncIterable; + emitModelTransportDebug( + log, + `[responses] headers provider=${model.provider} api=${model.api} model=${model.id} ` + + `elapsedMs=${Date.now() - requestStartedAt}`, + ); stream.push({ type: "start", partial: output as never }); await processResponsesStream(responseStream, output, stream, model); if (options?.signal?.aborted) { @@ -1150,6 +1428,10 @@ export function createAzureOpenAIResponsesTransportStreamFn(): StreamFn { stream.push({ type: "done", reason: output.stopReason as never, message: output as never }); stream.end(); } catch (error) { + log.warn( + `[responses] error provider=${model.provider} api=${model.api} model=${model.id} ` + + summarizeOpenAITransportError(error), + ); output.stopReason = options?.signal?.aborted ? "aborted" : "error"; output.errorMessage = error instanceof Error ? error.message : JSON.stringify(error); stream.push({ type: "error", reason: output.stopReason as never, error: output as never }); @@ -1324,6 +1606,13 @@ export function createOpenAICompletionsTransportStreamFn(): StreamFn { if (nextParams !== undefined) { params = nextParams as typeof params; } + if ( + (options as { openclawCodeModeToolSurface?: unknown } | undefined) + ?.openclawCodeModeToolSurface === true + ) { + enforceCodeModeResponsesToolSurface(params); + assertCodeModeResponsesToolSurface(params); + } const responseStream = (await client.chat.completions.create( params as never, buildOpenAISdkRequestOptions(model, options?.signal), @@ -2152,13 +2441,18 @@ function mapStopReason(reason: string | null) { } export const __testing = { + assertCodeModeResponsesToolSurface, buildOpenAIClientHeaders, buildOpenAISdkClientOptions, buildOpenAISdkRequestOptions, createAzureOpenAIClient, createOpenAICompletionsClient, createOpenAIResponsesClient, + enforceCodeModeResponsesToolSurface, sanitizeOpenAICodexResponsesParams, buildOpenAICompletionsClientConfig, processOpenAICompletionsStream, + formatModelTransportDebugBaseUrl, + summarizeResponsesPayload, + summarizeResponsesTools, }; diff --git a/src/agents/provider-transport-fetch.ts b/src/agents/provider-transport-fetch.ts index 0a13f8c55703..72b031e68c3b 100644 --- a/src/agents/provider-transport-fetch.ts +++ b/src/agents/provider-transport-fetch.ts @@ -8,7 +8,10 @@ import { ssrfPolicyFromHttpBaseUrlFakeIpHostnameAllowlist, type SsrFPolicy, } from "../infra/net/ssrf.js"; +import { createSubsystemLogger } from "../logging/subsystem.js"; import { resolveDebugProxySettings } from "../proxy-capture/env.js"; +import { emitModelTransportDebug } from "./model-transport-debug.js"; +import { formatModelTransportDebugUrl } from "./model-transport-url.js"; import { ensureModelProviderLocalService, type ProviderLocalServiceLease, @@ -21,6 +24,7 @@ import { } from "./provider-request-config.js"; const DEFAULT_MAX_SDK_RETRY_WAIT_SECONDS = 60; +const log = createSubsystemLogger("provider-transport-fetch"); function hasReadableSseData(block: string): boolean { const dataLines = block @@ -416,6 +420,24 @@ export function buildGuardedModelFetch( const requestConfig = resolveModelRequestPolicy(model); const dispatcherPolicy = buildProviderRequestDispatcherPolicy(requestConfig); const requestTimeoutMs = resolveModelRequestTimeoutMs(model, timeoutMs); + const summarizeError = (error: unknown): string => { + if (!error || typeof error !== "object") { + return `type=${typeof error}`; + } + const record = error as Record; + const cause = + record.cause && typeof record.cause === "object" + ? (record.cause as Record) + : undefined; + const read = (value: unknown) => (typeof value === "string" ? value : typeof value); + return [ + `name=${read(record.name)}`, + `code=${read(record.code)}`, + `causeName=${read(cause?.name)}`, + `causeCode=${read(cause?.code)}`, + `message=${error instanceof Error ? error.message : String(error)}`, + ].join(" "); + }; return async (input, init) => { let localServiceLease: ProviderLocalServiceLease | undefined; const request = input instanceof Request ? new Request(input, init) : undefined; @@ -462,6 +484,15 @@ export function buildGuardedModelFetch( ...(policy ? { policy } : {}), }; let result: Awaited>; + const fetchStartedAt = Date.now(); + const useEnvProxy = !dispatcherPolicy && shouldUseEnvHttpProxyForUrl(url); + emitModelTransportDebug( + log, + `[model-fetch] start provider=${model.provider} api=${model.api} model=${model.id} ` + + `method=${(requestInit ?? init)?.method ?? "GET"} url=${formatModelTransportDebugUrl(url)} timeoutMs=${requestTimeoutMs} ` + + `proxy=${dispatcherPolicy ? "configured" : useEnvProxy ? "env" : "none"} ` + + `policy=${policy ? "custom" : "default"}`, + ); try { localServiceLease = await ensureModelProviderLocalService( model, @@ -469,15 +500,25 @@ export function buildGuardedModelFetch( (requestInit ?? init)?.signal, ); result = await fetchWithSsrFGuard( - !dispatcherPolicy && shouldUseEnvHttpProxyForUrl(url) + useEnvProxy ? withTrustedEnvProxyGuardedFetchMode(guardedFetchOptions) : guardedFetchOptions, ); } catch (error) { + log.warn( + `[model-fetch] error provider=${model.provider} api=${model.api} model=${model.id} ` + + `elapsedMs=${Date.now() - fetchStartedAt} ${summarizeError(error)}`, + ); localServiceLease?.release(); throw error; } let response = result.response; + emitModelTransportDebug( + log, + `[model-fetch] response provider=${model.provider} api=${model.api} model=${model.id} ` + + `status=${response.status} elapsedMs=${Date.now() - fetchStartedAt} ` + + `contentType=${response.headers.get("content-type") ?? ""}`, + ); if (shouldBypassLongSdkRetry(response)) { const headers = new Headers(response.headers); headers.set("x-should-retry", "false");