Compare commits

6 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Tak Hoffman | ed4dbe63a5 | fix web search fallback explicitness | 2026-04-07 07:18:10 -05:00 |
| Tak Hoffman | e8b7694f05 | fix infer auth logout persistence | 2026-04-07 07:00:30 -05:00 |
| Tak Hoffman | db65818d0b | validate explicit web search providers | 2026-04-07 06:25:24 -05:00 |
| Tak Hoffman | 00c0e9fb36 | fix tts runtime facade export | 2026-04-07 06:21:47 -05:00 |
| Tak Hoffman | 4ba0ce3ed1 | flatten infer media commands | 2026-04-07 06:10:09 -05:00 |
| Tak Hoffman | d18d56f040 | refresh infer branch onto latest main | 2026-04-07 05:57:11 -05:00 |
28 changed files with 3785 additions and 28 deletions

@@ -6,6 +6,7 @@ Docs: https://docs.openclaw.ai
### Changes
- CLI/infer: add a first-class `openclaw infer ...` hub for provider-backed inference workflows across model, media, web, and embedding tasks. Thanks @Takhoffman.
- Plugins/webhooks: add a bundled webhook ingress plugin so external automation can create and drive bound TaskFlows through per-route shared-secret endpoints. (#61892) Thanks @mbelinky.
- Tools/media generation: preserve intent across auth-backed image, music, and video provider fallback, remap size, aspect ratio, resolution, and duration hints to the closest supported option, and surface explicit provider capabilities plus mode-aware video-to-video support.
- Memory/wiki: restore the bundled `memory-wiki` stack with plugin, CLI, sync/query/apply tooling, and memory-host integration for wiki-backed memory workflows.
@@ -28,6 +29,7 @@ Docs: https://docs.openclaw.ai
### Fixes
- Plugins/media: when `plugins.allow` is set, capability fallback now merges bundled capability plugin ids into the allowlist (not only `plugins.entries`), so media understanding providers such as OpenAI-compatible STT load for voice transcription without requiring `openai` in `plugins.allow`. (#62205) Thanks @neeravmakwana.
- CLI/infer: keep provider-backed infer behavior aligned with actual runtime execution by fixing explicit TTS override handling, profile-aware gateway TTS prefs resolution, per-request transcription `prompt`/`language` overrides, image output MIME/extension mismatches, configured web-search fallback behavior, and agent-vs-CLI web-search execution drift.
- Auth/OpenAI Codex OAuth: reload fresh on-disk credentials inside the locked refresh path and retry once after `refresh_token_reused` rotates only the stored refresh token, so relogin/restart recovery stops getting stuck on stale cached auth state. Thanks @owen-ever.
- Agents/history and replies: buffer phaseless OpenAI WS text until a real assistant phase arrives, keep replay and SSE history sequence tracking aligned, hide commentary and leaked tool XML from user-visible history, and keep history-based follow-up replies on `final_answer` text only. (#61729, #61747, #61829, #61855, #61954) Thanks @100yenadmin, @afurm, and @openperf.
- Plugins/channels: keep bundled channel artifact and secret-contract loading stable under lazy loading, preserve plugin-schema defaults during install, and fix Windows `file://` plus native-Jiti plugin loader paths so onboarding, doctor, `openclaw secret`, and bundled plugin installs work again. (#61832, #61836, #61853, #61856) Thanks @Zeesejo and @SuperMarioYL.

docs/cli/capability.md (new file, 119 lines)

@@ -0,0 +1,119 @@
---
summary: "Infer-first CLI for provider-backed model, image, audio, TTS, video, web, and embedding workflows"
read_when:
- Adding or modifying `openclaw infer` commands
- Designing stable headless capability automation
title: "Inference CLI"
---
# Inference CLI
`openclaw infer` is the canonical headless surface for provider-backed inference workflows.
`openclaw capability` remains supported as a fallback alias for compatibility.
It intentionally exposes capability families, not raw gateway RPC names and not raw agent tool ids.
## Command tree
```text
openclaw infer
  list
  inspect
  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status
  image
    generate
    edit
    describe
    describe-many
    providers
  audio
    transcribe
    providers
  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider
  video
    generate
    describe
    providers
  web
    search
    fetch
    providers
  embedding
    create
    providers
```
## Transport
Supported transport flags:
- `--local`
- `--gateway`
Default transport is implicit auto at the command-family level:
- Stateless execution commands default to local.
- Gateway-managed state commands default to gateway.
Examples:
```bash
openclaw infer model run --prompt "hello" --json
openclaw infer image generate --prompt "friendly lobster" --json
openclaw infer tts status --json
openclaw infer embedding create --text "hello world" --json
```
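The default-transport rule above can be sketched in TypeScript roughly as follows. This is an illustration only, not the CLI's implementation: `resolveTransport`, `TransportFlags`, and `GATEWAY_STATE_CAPABILITIES` are hypothetical names, the only gateway-default capability confirmed by this page is `tts.status`, and rejecting both flags at once is an assumption.

```typescript
type Transport = "local" | "gateway";

interface TransportFlags {
  local?: boolean;
  gateway?: boolean;
}

// Hypothetical set of capabilities that reflect gateway-managed state.
// Only "tts.status" is confirmed by this page; other entries would be guesses.
const GATEWAY_STATE_CAPABILITIES = new Set<string>(["tts.status"]);

function resolveTransport(capability: string, flags: TransportFlags = {}): Transport {
  // Assumption: the two flags are mutually exclusive.
  if (flags.local && flags.gateway) {
    throw new Error("Pass only one of --local or --gateway.");
  }
  // An explicit flag always wins over the implicit default.
  if (flags.local) return "local";
  if (flags.gateway) return "gateway";
  // Implicit auto: gateway-managed state goes to the gateway, everything else runs locally.
  return GATEWAY_STATE_CAPABILITIES.has(capability) ? "gateway" : "local";
}

console.log(resolveTransport("model.run"));
console.log(resolveTransport("tts.status"));
console.log(resolveTransport("tts.status", { local: true }));
```

Keeping the default in one lookup, rather than per command, is what makes the behavior predictable at the command-family level.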
## JSON output
Capability commands normalize JSON output under a shared envelope:
```json
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-1",
  "attempts": [],
  "outputs": []
}
```
Top-level fields are stable:
- `ok`
- `capability`
- `transport`
- `provider`
- `model`
- `attempts`
- `outputs`
- `error`
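For scripts that consume this output, a small typed parser can help. The sketch below is a guess at a reasonable shape, not a published type: the field names come from the list above, while `InferEnvelope`, `isInferEnvelope`, `parseInferEnvelope`, the optionality of `provider`, `model`, and `error`, and the element types of `attempts` and `outputs` are all assumptions. It applies only to commands that emit the envelope.

```typescript
// Field names follow the documented envelope; optionality and element types are assumed.
interface InferEnvelope {
  ok: boolean;
  capability: string;
  transport: "local" | "gateway";
  provider?: string;
  model?: string;
  attempts: unknown[];
  outputs: unknown[];
  error?: unknown;
}

// Runtime check for the fields this sketch treats as required.
function isInferEnvelope(value: unknown): value is InferEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.ok === "boolean" &&
    typeof v.capability === "string" &&
    (v.transport === "local" || v.transport === "gateway") &&
    Array.isArray(v.attempts) &&
    Array.isArray(v.outputs)
  );
}

// Parse the stdout of a `--json` invocation and fail loudly on an unexpected shape.
function parseInferEnvelope(stdout: string): InferEnvelope {
  const parsed: unknown = JSON.parse(stdout);
  if (!isInferEnvelope(parsed)) {
    throw new Error("Unexpected infer JSON shape");
  }
  return parsed;
}
```

Validating at the boundary keeps downstream automation from silently reading `undefined` if a command returns a different payload.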
## Notes
- `model run` reuses the agent runtime so provider/model overrides behave like normal agent execution.
- `tts status` defaults to gateway because it reflects gateway-managed TTS state.

@@ -35,6 +35,7 @@ This page describes the current CLI behavior. If commands change, update this do
- [`logs`](/cli/logs)
- [`system`](/cli/system)
- [`models`](/cli/models)
- [`infer`](/cli/capability)
- [`memory`](/cli/memory)
- [`directory`](/cli/directory)
- [`nodes`](/cli/nodes)
@@ -248,6 +249,16 @@ openclaw [--dev] [--profile <name>] <command>
fallbacks list|add|remove|clear
image-fallbacks list|add|remove|clear
scan
infer (alias: capability)
list
inspect
model run|list|inspect|providers|auth login|logout|status
image generate|edit|describe|describe-many|providers
audio transcribe|providers
tts convert|voices|providers|status|enable|disable|set-provider
video generate|describe|providers
web search|fetch|providers
embedding create|providers
auth add|login|login-github-copilot|setup-token|paste-token
auth order get|set|clear
sandbox

@@ -4,7 +4,9 @@ export {
DEFAULT_LOCAL_MODEL,
getBuiltinMemoryEmbeddingProviderDoctorMetadata,
listBuiltinAutoSelectMemoryEmbeddingProviderDoctorMetadata,
registerBuiltInMemoryEmbeddingProviders,
} from "./src/memory/provider-adapters.js";
export { createEmbeddingProvider } from "./src/memory/embeddings.js";
export {
resolveMemoryCacheSummary,
resolveMemoryFtsState,

@@ -9,6 +9,7 @@ export {
isTtsProviderConfigured,
listSpeechVoices,
maybeApplyTtsToPayload,
resolveExplicitTtsOverrides,
resolveTtsAutoMode,
resolveTtsConfig,
resolveTtsPrefsPath,

@@ -25,8 +25,8 @@ import type { ReplyPayload } from "openclaw/plugin-sdk/reply-runtime";
import { isVerbose, logVerbose } from "openclaw/plugin-sdk/runtime-env";
import { resolvePreferredOpenClawTmpDir } from "openclaw/plugin-sdk/sandbox";
import {
-CONFIG_DIR,
normalizeOptionalString,
+resolveConfigDir,
resolveUserPath,
stripMarkdown,
} from "openclaw/plugin-sdk/text-runtime";
@@ -41,6 +41,7 @@ import {
summarizeText,
type SpeechModelOverridePolicy,
type SpeechProviderConfig,
type SpeechProviderOverrides,
type SpeechVoiceOption,
type TtsDirectiveOverrides,
type TtsDirectiveParseResult,
@@ -173,7 +174,7 @@ function resolveTtsPrefsPathValue(prefsPath: string | undefined): string {
if (envPath) {
return resolveUserPath(envPath);
}
-return path.join(CONFIG_DIR, "settings", "tts.json");
+return path.join(resolveConfigDir(process.env), "settings", "tts.json");
}
function resolveModelOverridePolicy(
@@ -502,6 +503,66 @@ export function setTtsProvider(prefsPath: string, provider: TtsProvider): void {
});
}
export function resolveExplicitTtsOverrides(params: {
cfg: OpenClawConfig;
prefsPath?: string;
provider?: string;
modelId?: string;
voiceId?: string;
}): TtsDirectiveOverrides {
const providerInput = params.provider?.trim();
const modelId = params.modelId?.trim();
const voiceId = params.voiceId?.trim();
const config = resolveTtsConfig(params.cfg);
const prefsPath = params.prefsPath ?? resolveTtsPrefsPath(config);
const selectedProvider =
canonicalizeSpeechProviderId(providerInput, params.cfg) ??
(modelId || voiceId ? getTtsProvider(config, prefsPath) : undefined);
if (providerInput && !selectedProvider) {
throw new Error(`Unknown TTS provider "${providerInput}".`);
}
if (!modelId && !voiceId) {
return selectedProvider ? { provider: selectedProvider } : {};
}
if (!selectedProvider) {
throw new Error("TTS model or voice overrides require a resolved provider.");
}
const provider = getSpeechProvider(selectedProvider, params.cfg);
if (!provider) {
throw new Error(`speech provider ${selectedProvider} is not registered`);
}
if (!provider.resolveTalkOverrides) {
throw new Error(
`TTS provider "${selectedProvider}" does not support model or voice overrides.`,
);
}
const providerOverrides = provider.resolveTalkOverrides({
talkProviderConfig: {},
params: {
...(voiceId ? { voiceId } : {}),
...(modelId ? { modelId } : {}),
},
});
if ((voiceId || modelId) && (!providerOverrides || Object.keys(providerOverrides).length === 0)) {
throw new Error(
`TTS provider "${selectedProvider}" ignored the requested model or voice overrides.`,
);
}
const overridesRecord = providerOverrides as SpeechProviderOverrides;
return {
provider: selectedProvider,
providerOverrides: {
[provider.id]: overridesRecord,
},
};
}
export function getTtsMaxLength(prefsPath: string): number {
const prefs = readPrefs(prefsPath);
return prefs.tts?.maxLength ?? DEFAULT_TTS_MAX_LENGTH;
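
The decision flow in `resolveExplicitTtsOverrides` from the hunk above can be read as the simplified, standalone sketch below. It is not the project's code: `resolveOverridesSketch`, `SketchProvider`, `sketchRegistry`, and the `defaultProvider` argument are hypothetical stand-ins for the real provider registry and prefs lookup, and the example hook's output keys are invented for illustration.

```typescript
type OverridesSketch = {
  provider?: string;
  providerOverrides?: Record<string, Record<string, string>>;
};

interface SketchProvider {
  id: string;
  // Optional hook, named after the one used in the hunk; the signature is simplified.
  resolveTalkOverrides?: (params: { voiceId?: string; modelId?: string }) => Record<string, string> | undefined;
}

function resolveOverridesSketch(args: {
  registry: Map<string, SketchProvider>;
  defaultProvider: string;
  provider?: string;
  modelId?: string;
  voiceId?: string;
}): OverridesSketch {
  const providerInput = args.provider?.trim();
  const modelId = args.modelId?.trim();
  const voiceId = args.voiceId?.trim();
  // An explicit provider must be known to the registry.
  if (providerInput && !args.registry.has(providerInput)) {
    throw new Error(`Unknown TTS provider "${providerInput}".`);
  }
  // A bare model or voice override falls back to the default provider.
  const selected = providerInput || (modelId || voiceId ? args.defaultProvider : undefined);
  if (!modelId && !voiceId) {
    return selected ? { provider: selected } : {};
  }
  const provider = selected ? args.registry.get(selected) : undefined;
  const hook = provider?.resolveTalkOverrides;
  if (!provider || !hook) {
    throw new Error(`TTS provider "${selected}" does not support model or voice overrides.`);
  }
  const resolved = hook({
    ...(voiceId ? { voiceId } : {}),
    ...(modelId ? { modelId } : {}),
  });
  // A provider that silently drops the request is treated as an error, not a no-op.
  if (!resolved || Object.keys(resolved).length === 0) {
    throw new Error(`TTS provider "${selected}" ignored the requested model or voice overrides.`);
  }
  return { provider: provider.id, providerOverrides: { [provider.id]: resolved } };
}

// Hypothetical registry with one provider that accepts overrides and one that does not.
const sketchRegistry = new Map<string, SketchProvider>([
  [
    "openai",
    {
      id: "openai",
      resolveTalkOverrides: ({ voiceId, modelId }) => {
        const out: Record<string, string> = {};
        if (voiceId) out.voice = voiceId;
        if (modelId) out.model = modelId;
        return out;
      },
    },
  ],
  ["plain", { id: "plain" }],
]);
```

The point of failing on empty overrides is that an explicit request should never be quietly downgraded to the provider's defaults.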

@@ -232,8 +232,7 @@ function findInlineModelMatch(params: {
);
}
-export { buildModelAliasLines };
-export { buildInlineProviderModels };
+export { buildModelAliasLines, buildInlineProviderModels };
function resolveConfiguredProviderConfig(
cfg: OpenClawConfig | undefined,
@@ -336,7 +335,6 @@ function applyConfiguredProviderOverrides(params: {
providerRequest,
);
}
function resolveExplicitModelWithRegistry(params: {
provider: string;
modelId: string;

@@ -4,6 +4,7 @@ import type { RuntimeWebSearchMetadata } from "../../secrets/runtime-web-tools.t
import {
resolveWebSearchDefinition,
resolveWebSearchProviderId,
runWebSearch,
} from "../../web-search/runtime.js";
import type { AnyAgentTool } from "./common.js";
import { jsonResult } from "./common.js";
@@ -16,16 +17,17 @@ export function createWebSearchTool(options?: {
}): AnyAgentTool | null {
const runtimeProviderId =
options?.runtimeWebSearch?.selectedProvider ?? options?.runtimeWebSearch?.providerConfigured;
+const preferRuntimeProviders =
+Boolean(runtimeProviderId) &&
+!resolveManifestContractOwnerPluginId({
+contract: "webSearchProviders",
+value: runtimeProviderId,
+origin: "bundled",
+config: options?.config,
+});
const resolved = resolveWebSearchDefinition({
...options,
-preferRuntimeProviders:
-Boolean(runtimeProviderId) &&
-!resolveManifestContractOwnerPluginId({
-contract: "webSearchProviders",
-value: runtimeProviderId,
-origin: "bundled",
-config: options?.config,
-}),
+preferRuntimeProviders,
});
if (!resolved) {
return null;
@@ -36,7 +38,19 @@ export function createWebSearchTool(options?: {
name: "web_search",
description: resolved.definition.description,
parameters: resolved.definition.parameters,
-execute: async (_toolCallId, args) => jsonResult(await resolved.definition.execute(args)),
+execute: async (_toolCallId, args) => {
+const result = await runWebSearch({
+config: options?.config,
+sandboxed: options?.sandboxed,
+runtimeWebSearch: options?.runtimeWebSearch,
+preferRuntimeProviders,
+args,
+});
+return jsonResult({
+...result.result,
+provider: result.provider,
+});
+},
};
}
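
The change in this hunk routes the agent tool through the same runner the CLI uses and adds the answering provider to the result. A minimal standalone sketch of that pattern follows. The names `SearchRun`, `SearchRunner`, `createSearchTool`, and `stubRunner` are hypothetical, and the real `runWebSearch` signature is only partly visible in the diff, so the shapes here are assumptions.

```typescript
// Assumed shape of what a shared runner returns: the provider that answered plus its raw result.
interface SearchRun {
  provider: string;
  result: Record<string, unknown>;
}

type SearchRunner = (args: Record<string, unknown>) => Promise<SearchRun>;

function createSearchTool(run: SearchRunner) {
  return {
    name: "web_search",
    // Delegate to the shared runner so the tool and the CLI cannot drift apart,
    // then surface the provider alongside the provider's own result fields.
    execute: async (args: Record<string, unknown>) => {
      const { provider, result } = await run(args);
      return { ...result, provider };
    },
  };
}

// Stub runner standing in for the shared runtime; it performs no network access.
const stubRunner: SearchRunner = async (args) => ({
  provider: "stub",
  result: { query: args["query"], results: [] },
});

createSearchTool(stubRunner)
  .execute({ query: "friendly lobster" })
  .then((out) => console.log(out));
```

Spreading the result first and setting `provider` last means the runner's provider id wins if the raw result happens to carry a field with the same name.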

@@ -0,0 +1,903 @@
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { Command } from "commander";
import { beforeEach, describe, expect, it, vi } from "vitest";
import { runRegisteredCli } from "../test-utils/command-runner.js";
import { registerCapabilityCli } from "./capability-cli.js";
const mocks = vi.hoisted(() => ({
runtime: {
log: vi.fn(),
error: vi.fn(),
exit: vi.fn((code: number) => {
throw new Error(`exit ${code}`);
}),
writeJson: vi.fn(),
writeStdout: vi.fn(),
},
loadConfig: vi.fn(() => ({})),
loadAuthProfileStoreForRuntime: vi.fn(() => ({ profiles: {}, order: {} })),
listProfilesForProvider: vi.fn(() => []),
updateAuthProfileStoreWithLock: vi.fn(
async ({ updater }: { updater: (store: any) => boolean }) => {
const store = {
version: 1,
profiles: {},
order: {},
lastGood: {},
usageStats: {},
};
updater(store);
return store;
},
),
resolveMemorySearchConfig: vi.fn(() => null),
loadModelCatalog: vi.fn(async () => []),
agentCommand: vi.fn(async () => ({
payloads: [{ text: "local reply" }],
meta: { agentMeta: { provider: "openai", model: "gpt-5.4" } },
})),
callGateway: vi.fn(async ({ method }: { method: string }) => {
if (method === "tts.status") {
return { enabled: true, provider: "openai" };
}
if (method === "agent") {
return {
result: {
payloads: [{ text: "gateway reply" }],
meta: { agentMeta: { provider: "anthropic", model: "claude-sonnet-4-6" } },
},
};
}
return {};
}),
describeImageFile: vi.fn(async () => ({
text: "friendly lobster",
provider: "openai",
model: "gpt-4.1-mini",
})),
generateImage: vi.fn(),
transcribeAudioFile: vi.fn(async () => ({ text: "meeting notes" })),
textToSpeech: vi.fn(async () => ({
success: true,
audioPath: "/tmp/tts-source.mp3",
provider: "openai",
outputFormat: "mp3",
voiceCompatible: false,
attempts: [],
})),
setTtsProvider: vi.fn(),
resolveExplicitTtsOverrides: vi.fn(
({
provider,
modelId,
voiceId,
}: {
provider?: string;
modelId?: string;
voiceId?: string;
}) => ({
...(provider ? { provider } : {}),
...(modelId || voiceId
? {
providerOverrides: {
[provider ?? "openai"]: {
...(modelId ? { modelId } : {}),
...(voiceId ? { voiceId } : {}),
},
},
}
: {}),
}),
),
createEmbeddingProvider: vi.fn(async () => ({
provider: {
id: "openai",
model: "text-embedding-3-small",
embedQuery: async () => [0.1, 0.2],
embedBatch: async (texts: string[]) => texts.map(() => [0.1, 0.2]),
},
})),
registerMemoryEmbeddingProvider: vi.fn(),
listMemoryEmbeddingProviders: vi.fn(() => [
{ id: "openai", defaultModel: "text-embedding-3-small", transport: "remote" },
]),
registerBuiltInMemoryEmbeddingProviders: vi.fn(),
isWebSearchProviderConfigured: vi.fn(() => false),
isWebFetchProviderConfigured: vi.fn(() => false),
modelsStatusCommand: vi.fn(
async (_opts: unknown, runtime: { log: (...args: unknown[]) => void }) => {
runtime.log(JSON.stringify({ ok: true, providers: [{ id: "openai" }] }));
},
),
}));
vi.mock("../runtime.js", () => ({
defaultRuntime: mocks.runtime,
writeRuntimeJson: (runtime: { writeJson: (value: unknown) => void }, value: unknown) =>
runtime.writeJson(value),
}));
vi.mock("../config/config.js", () => ({
loadConfig: (...args: unknown[]) => mocks.loadConfig(...args),
}));
vi.mock("../agents/agent-command.js", () => ({
agentCommand: (...args: unknown[]) => mocks.agentCommand(...args),
}));
vi.mock("../agents/agent-scope.js", () => ({
resolveDefaultAgentId: () => "main",
resolveAgentDir: () => "/tmp/agent",
}));
vi.mock("../agents/model-catalog.js", () => ({
loadModelCatalog: (...args: unknown[]) => mocks.loadModelCatalog(...args),
}));
vi.mock("../agents/auth-profiles.js", () => ({
loadAuthProfileStoreForRuntime: (...args: unknown[]) =>
mocks.loadAuthProfileStoreForRuntime(...args),
listProfilesForProvider: (...args: unknown[]) => mocks.listProfilesForProvider(...args),
}));
vi.mock("../agents/auth-profiles/store.js", () => ({
updateAuthProfileStoreWithLock: (...args: unknown[]) =>
mocks.updateAuthProfileStoreWithLock(...args),
}));
vi.mock("../agents/memory-search.js", () => ({
resolveMemorySearchConfig: (...args: unknown[]) => mocks.resolveMemorySearchConfig(...args),
}));
vi.mock("../commands/models.js", () => ({
modelsAuthLoginCommand: vi.fn(),
modelsStatusCommand: (...args: unknown[]) => mocks.modelsStatusCommand(...args),
}));
vi.mock("../gateway/call.js", () => ({
callGateway: (...args: unknown[]) => mocks.callGateway(...args),
randomIdempotencyKey: () => "run-1",
}));
vi.mock("../gateway/connection-details.js", () => ({
buildGatewayConnectionDetailsWithResolvers: vi.fn(() => ({
url: "ws://127.0.0.1:18789",
urlSource: "local loopback",
message: "Gateway target: ws://127.0.0.1:18789",
})),
}));
vi.mock("../media-understanding/runtime.js", () => ({
describeImageFile: (...args: unknown[]) => mocks.describeImageFile(...args),
describeVideoFile: vi.fn(),
transcribeAudioFile: (...args: unknown[]) => mocks.transcribeAudioFile(...args),
}));
vi.mock("../plugins/memory-embedding-providers.js", () => ({
listMemoryEmbeddingProviders: (...args: unknown[]) => mocks.listMemoryEmbeddingProviders(...args),
registerMemoryEmbeddingProvider: (...args: unknown[]) =>
mocks.registerMemoryEmbeddingProvider(...args),
}));
vi.mock("../../extensions/memory-core/runtime-api.js", () => ({
createEmbeddingProvider: (...args: unknown[]) => mocks.createEmbeddingProvider(...args),
registerBuiltInMemoryEmbeddingProviders: (...args: unknown[]) =>
mocks.registerBuiltInMemoryEmbeddingProviders(...args),
}));
vi.mock("../image-generation/runtime.js", () => ({
generateImage: (...args: unknown[]) => mocks.generateImage(...args),
listRuntimeImageGenerationProviders: vi.fn(() => []),
}));
vi.mock("../video-generation/runtime.js", () => ({
generateVideo: vi.fn(),
listRuntimeVideoGenerationProviders: vi.fn(() => []),
}));
vi.mock("../tts/tts.js", () => ({
getTtsProvider: vi.fn(() => "openai"),
listSpeechVoices: vi.fn(async () => []),
resolveTtsConfig: vi.fn(() => ({})),
resolveTtsPrefsPath: vi.fn(() => "/tmp/tts.json"),
setTtsEnabled: vi.fn(),
setTtsProvider: (...args: unknown[]) => mocks.setTtsProvider(...args),
resolveExplicitTtsOverrides: (...args: unknown[]) => mocks.resolveExplicitTtsOverrides(...args),
textToSpeech: (...args: unknown[]) => mocks.textToSpeech(...args),
}));
vi.mock("../tts/provider-registry.js", () => ({
canonicalizeSpeechProviderId: vi.fn((provider: string) => provider),
listSpeechProviders: vi.fn(() => []),
}));
vi.mock("../web-search/runtime.js", () => ({
listWebSearchProviders: vi.fn(() => []),
isWebSearchProviderConfigured: (...args: unknown[]) =>
mocks.isWebSearchProviderConfigured(...args),
runWebSearch: vi.fn(),
}));
vi.mock("../web-fetch/runtime.js", () => ({
listWebFetchProviders: vi.fn(() => []),
isWebFetchProviderConfigured: (...args: unknown[]) => mocks.isWebFetchProviderConfigured(...args),
resolveWebFetchDefinition: vi.fn(),
}));
describe("capability cli", () => {
beforeEach(() => {
mocks.runtime.log.mockClear();
mocks.runtime.error.mockClear();
mocks.runtime.writeJson.mockClear();
mocks.loadModelCatalog
.mockReset()
.mockResolvedValue([{ id: "gpt-5.4", provider: "openai", name: "GPT-5.4" }]);
mocks.loadAuthProfileStoreForRuntime.mockReset().mockReturnValue({ profiles: {}, order: {} });
mocks.listProfilesForProvider.mockReset().mockReturnValue([]);
mocks.updateAuthProfileStoreWithLock
.mockReset()
.mockImplementation(async ({ updater }: { updater: (store: any) => boolean }) => {
const store = {
version: 1,
profiles: {},
order: {},
lastGood: {},
usageStats: {},
};
updater(store);
return store;
});
mocks.resolveMemorySearchConfig.mockReset().mockReturnValue(null);
mocks.agentCommand.mockClear();
mocks.callGateway.mockClear().mockImplementation(async ({ method }: { method: string }) => {
if (method === "tts.status") {
return { enabled: true, provider: "openai" };
}
if (method === "agent") {
return {
result: {
payloads: [{ text: "gateway reply" }],
meta: { agentMeta: { provider: "anthropic", model: "claude-sonnet-4-6" } },
},
};
}
return {};
});
mocks.describeImageFile.mockClear();
mocks.generateImage.mockReset();
mocks.transcribeAudioFile.mockClear();
mocks.textToSpeech.mockClear();
mocks.setTtsProvider.mockClear();
mocks.resolveExplicitTtsOverrides.mockClear();
mocks.createEmbeddingProvider.mockClear();
mocks.registerMemoryEmbeddingProvider.mockClear();
mocks.registerBuiltInMemoryEmbeddingProviders.mockClear();
mocks.isWebSearchProviderConfigured.mockReset().mockReturnValue(false);
mocks.isWebFetchProviderConfigured.mockReset().mockReturnValue(false);
mocks.modelsStatusCommand.mockClear();
mocks.callGateway.mockImplementation(async ({ method }: { method: string }) => {
if (method === "tts.status") {
return { enabled: true, provider: "openai" };
}
if (method === "tts.convert") {
return {
audioPath: "/tmp/gateway-tts.mp3",
provider: "openai",
outputFormat: "mp3",
voiceCompatible: false,
};
}
if (method === "agent") {
return {
result: {
payloads: [{ text: "gateway reply" }],
meta: { agentMeta: { provider: "anthropic", model: "claude-sonnet-4-6" } },
},
};
}
return {};
});
});
it("lists canonical capabilities", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "list", "--json"],
});
const payload = mocks.runtime.writeJson.mock.calls[0]?.[0] as Array<{ id: string }>;
expect(payload.some((entry) => entry.id === "model.run")).toBe(true);
expect(payload.some((entry) => entry.id === "image.describe")).toBe(true);
});
it("defaults model run to local transport", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "model", "run", "--prompt", "hello", "--json"],
});
expect(mocks.agentCommand).toHaveBeenCalledTimes(1);
expect(mocks.callGateway).not.toHaveBeenCalled();
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
capability: "model.run",
transport: "local",
}),
);
});
it("defaults tts status to gateway transport", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "tts", "status", "--json"],
});
expect(mocks.callGateway).toHaveBeenCalledWith(
expect.objectContaining({ method: "tts.status" }),
);
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({ transport: "gateway" }),
);
});
it("routes image describe through media understanding, not generation", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "image", "describe", "--file", "photo.jpg", "--json"],
});
expect(mocks.describeImageFile).toHaveBeenCalledWith(
expect.objectContaining({ filePath: expect.stringMatching(/photo\.jpg$/) }),
);
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
capability: "image.describe",
outputs: [expect.objectContaining({ kind: "image.description" })],
}),
);
});
it("fails image describe when no description text is returned", async () => {
mocks.describeImageFile.mockResolvedValueOnce({
text: undefined,
provider: undefined,
model: undefined,
});
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "image", "describe", "--file", "photo.jpg", "--json"],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringMatching(/No description returned for image/),
);
});
it("rewrites mismatched explicit image output extensions to the detected file type", async () => {
const jpegBase64 =
"/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAAkGBxAQEBUQEBAVFRUVFRUVFRUVFRUVFRUVFRUXFhUVFRUYHSggGBolHRUVITEhJSkrLi4uFx8zODMsNygtLisBCgoKDg0OGhAQGi0fHyUtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLf/AABEIAAEAAQMBIgACEQEDEQH/xAAXAAEBAQEAAAAAAAAAAAAAAAAAAQID/8QAFhEBAQEAAAAAAAAAAAAAAAAAAAER/9oADAMBAAIQAxAAAAH2AP/EABgQAQEAAwAAAAAAAAAAAAAAAAEAEQIS/9oACAEBAAEFAk1o7//EABYRAQEBAAAAAAAAAAAAAAAAAAABEf/aAAgBAwEBPwGn/8QAFhEBAQEAAAAAAAAAAAAAAAAAABEB/9oACAECAQE/AYf/xAAaEAACAgMAAAAAAAAAAAAAAAABEQAhMUFh/9oACAEBAAY/AjK9cY2f/8QAGhABAQACAwAAAAAAAAAAAAAAAAERITFBUf/aAAgBAQABPyGQk7W5jVYkA//Z";
mocks.generateImage.mockResolvedValue({
provider: "openai",
model: "gpt-image-1",
attempts: [],
images: [
{
buffer: Buffer.from(jpegBase64, "base64"),
mimeType: "image/png",
fileName: "provider-output.png",
},
],
});
const tempOutput = path.join(os.tmpdir(), `openclaw-image-mismatch-${Date.now()}.png`);
await fs.rm(tempOutput, { force: true });
await fs.rm(tempOutput.replace(/\.png$/, ".jpg"), { force: true });
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"image",
"generate",
"--prompt",
"friendly lobster",
"--output",
tempOutput,
"--json",
],
});
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
outputs: [
expect.objectContaining({
path: tempOutput.replace(/\.png$/, ".jpg"),
mimeType: "image/jpeg",
}),
],
}),
);
});
it("routes audio transcribe through transcription, not realtime", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "audio", "transcribe", "--file", "memo.m4a", "--json"],
});
expect(mocks.transcribeAudioFile).toHaveBeenCalledWith(
expect.objectContaining({ filePath: expect.stringMatching(/memo\.m4a$/) }),
);
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
capability: "audio.transcribe",
outputs: [expect.objectContaining({ kind: "audio.transcription" })],
}),
);
});
it("fails audio transcribe when no transcript text is returned", async () => {
mocks.transcribeAudioFile.mockResolvedValueOnce({ text: undefined });
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "audio", "transcribe", "--file", "memo.m4a", "--json"],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringMatching(/No transcript returned for audio/),
);
});
it("forwards transcription prompt and language hints", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"audio",
"transcribe",
"--file",
"memo.m4a",
"--language",
"en",
"--prompt",
"Focus on names",
"--json",
],
});
expect(mocks.transcribeAudioFile).toHaveBeenCalledWith(
expect.objectContaining({
filePath: expect.stringMatching(/memo\.m4a$/),
language: "en",
prompt: "Focus on names",
}),
);
});
it("uses request-scoped TTS overrides without mutating prefs", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"tts",
"convert",
"--text",
"hello",
"--model",
"openai/gpt-4o-mini-tts",
"--voice",
"alloy",
"--json",
],
});
expect(mocks.textToSpeech).toHaveBeenCalledWith(
expect.objectContaining({
overrides: expect.objectContaining({
provider: "openai",
providerOverrides: expect.objectContaining({
openai: expect.objectContaining({
modelId: "gpt-4o-mini-tts",
voiceId: "alloy",
}),
}),
}),
}),
);
expect(mocks.setTtsProvider).not.toHaveBeenCalled();
});
it("disables TTS fallback when explicit provider or voice/model selection is requested", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"tts",
"convert",
"--text",
"hello",
"--model",
"openai/gpt-4o-mini-tts",
"--voice",
"alloy",
"--json",
],
});
expect(mocks.textToSpeech).toHaveBeenCalledWith(
expect.objectContaining({
disableFallback: true,
}),
);
});
it("does not infer and forward a local provider guess for gateway TTS overrides", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"tts",
"convert",
"--gateway",
"--text",
"hello",
"--voice",
"alloy",
"--json",
],
});
expect(mocks.callGateway).toHaveBeenCalledWith(
expect.objectContaining({
method: "tts.convert",
params: expect.objectContaining({
provider: undefined,
voiceId: "alloy",
}),
}),
);
});
it("fails clearly when gateway TTS output is requested against a remote gateway", async () => {
const gatewayConnection = await import("../gateway/connection-details.js");
vi.mocked(gatewayConnection.buildGatewayConnectionDetailsWithResolvers).mockReturnValueOnce({
url: "wss://gateway.example.com",
urlSource: "config gateway.remote.url",
message: "Gateway target: wss://gateway.example.com",
});
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"tts",
"convert",
"--gateway",
"--text",
"hello",
"--output",
"hello.mp3",
"--json",
],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringContaining("--output is not supported for remote gateway TTS yet"),
);
});
it("uses only embedding providers for embedding creation", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "embedding", "create", "--text", "hello", "--json"],
});
expect(mocks.createEmbeddingProvider).toHaveBeenCalledWith(
expect.objectContaining({
provider: "auto",
fallback: "none",
}),
);
expect(mocks.runtime.writeJson).toHaveBeenCalledWith(
expect.objectContaining({
capability: "embedding.create",
provider: "openai",
model: "text-embedding-3-small",
}),
);
});
it("derives the embedding provider from a provider/model override", async () => {
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"embedding",
"create",
"--text",
"hello",
"--model",
"openai/text-embedding-3-large",
"--json",
],
});
expect(mocks.createEmbeddingProvider).toHaveBeenCalledWith(
expect.objectContaining({
provider: "openai",
fallback: "none",
model: "text-embedding-3-large",
}),
);
});
it("cleans provider auth profiles and usage stats on logout", async () => {
mocks.loadAuthProfileStoreForRuntime.mockReturnValue({
profiles: {
"openai:default": { id: "openai:default" },
"openai:secondary": { id: "openai:secondary" },
"anthropic:default": { id: "anthropic:default" },
},
order: { openai: ["openai:default", "openai:secondary"] },
lastGood: { openai: "openai:secondary" },
usageStats: {
"openai:default": { errorCount: 2 },
"openai:secondary": { errorCount: 1 },
"anthropic:default": { errorCount: 3 },
},
});
mocks.listProfilesForProvider.mockReturnValue(["openai:default", "openai:secondary"]);
let updatedStore: Record<string, any> | null = null;
mocks.updateAuthProfileStoreWithLock.mockImplementationOnce(
async ({ updater }: { updater: (store: any) => boolean }) => {
const store = {
version: 1,
profiles: {
"openai:default": { id: "openai:default" },
"openai:secondary": { id: "openai:secondary" },
"anthropic:default": { id: "anthropic:default" },
},
order: { openai: ["openai:default", "openai:secondary"] },
lastGood: { openai: "openai:secondary" },
usageStats: {
"openai:default": { errorCount: 2 },
"openai:secondary": { errorCount: 1 },
"anthropic:default": { errorCount: 3 },
},
};
updater(store);
updatedStore = store;
return store;
},
);
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "model", "auth", "logout", "--provider", "openai", "--json"],
});
expect(updatedStore).toMatchObject({
profiles: {
"anthropic:default": { id: "anthropic:default" },
},
order: {},
lastGood: {},
usageStats: {
"anthropic:default": { errorCount: 3 },
},
});
expect(mocks.runtime.writeJson).toHaveBeenCalledWith({
provider: "openai",
removedProfiles: ["openai:default", "openai:secondary"],
});
});
it("fails logout if the auth store update does not complete", async () => {
mocks.listProfilesForProvider.mockReturnValue(["openai:default"]);
mocks.updateAuthProfileStoreWithLock.mockResolvedValueOnce(null);
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "model", "auth", "logout", "--provider", "openai", "--json"],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringContaining("Failed to remove saved auth profiles for provider openai."),
);
});
it("rejects providerless audio model overrides", async () => {
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"audio",
"transcribe",
"--file",
"memo.m4a",
"--model",
"whisper-1",
"--json",
],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringContaining("Model overrides must use the form <provider/model>."),
);
expect(mocks.transcribeAudioFile).not.toHaveBeenCalled();
});
it("rejects providerless image describe model overrides", async () => {
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"image",
"describe",
"--file",
"photo.jpg",
"--model",
"gpt-4.1-mini",
"--json",
],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringContaining("Model overrides must use the form <provider/model>."),
);
expect(mocks.describeImageFile).not.toHaveBeenCalled();
});
it("rejects providerless video describe model overrides", async () => {
const mediaRuntime = await import("../media-understanding/runtime.js");
vi.mocked(mediaRuntime.describeVideoFile).mockResolvedValue({
text: "friendly lobster",
provider: "openai",
model: "gpt-4.1-mini",
} as never);
await expect(
runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: [
"capability",
"video",
"describe",
"--file",
"clip.mp4",
"--model",
"gpt-4.1-mini",
"--json",
],
}),
).rejects.toThrow("exit 1");
expect(mocks.runtime.error).toHaveBeenCalledWith(
expect.stringContaining("Model overrides must use the form <provider/model>."),
);
expect(vi.mocked(mediaRuntime.describeVideoFile)).not.toHaveBeenCalled();
});
it("bootstraps built-in embedding providers when the registry is empty", async () => {
mocks.listMemoryEmbeddingProviders.mockReturnValueOnce([]);
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "embedding", "providers", "--json"],
});
expect(mocks.registerBuiltInMemoryEmbeddingProviders).toHaveBeenCalledWith(
expect.objectContaining({
registerMemoryEmbeddingProvider: expect.any(Function),
}),
);
});
it("surfaces available, configured, and selected for web providers", async () => {
mocks.loadConfig.mockReturnValue({
tools: {
web: {
search: { provider: "gemini" },
fetch: { provider: "firecrawl" },
},
},
});
const webSearchRuntime = await import("../web-search/runtime.js");
const webFetchRuntime = await import("../web-fetch/runtime.js");
vi.mocked(webSearchRuntime.listWebSearchProviders).mockReturnValue([
{ id: "brave", envVars: ["BRAVE_API_KEY"] } as never,
{ id: "gemini", envVars: ["GEMINI_API_KEY"] } as never,
]);
vi.mocked(webFetchRuntime.listWebFetchProviders).mockReturnValue([
{ id: "firecrawl", envVars: ["FIRECRAWL_API_KEY"] } as never,
]);
mocks.isWebSearchProviderConfigured.mockReturnValueOnce(false).mockReturnValueOnce(true);
mocks.isWebFetchProviderConfigured.mockReturnValueOnce(true);
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "web", "providers", "--json"],
});
expect(mocks.runtime.writeJson).toHaveBeenCalledWith({
search: [
{
available: true,
configured: false,
selected: false,
id: "brave",
envVars: ["BRAVE_API_KEY"],
},
{
available: true,
configured: true,
selected: true,
id: "gemini",
envVars: ["GEMINI_API_KEY"],
},
],
fetch: [
{
available: true,
configured: true,
selected: true,
id: "firecrawl",
envVars: ["FIRECRAWL_API_KEY"],
},
],
});
});
it("surfaces selected and configured embedding provider state", async () => {
mocks.loadConfig.mockReturnValue({});
mocks.resolveMemorySearchConfig.mockReturnValue({
provider: "gemini",
model: "gemini-embedding-001",
});
mocks.listMemoryEmbeddingProviders.mockReturnValue([
{ id: "openai", defaultModel: "text-embedding-3-small", transport: "remote" },
{ id: "gemini", defaultModel: "gemini-embedding-001", transport: "remote" },
]);
await runRegisteredCli({
register: registerCapabilityCli as (program: Command) => void,
argv: ["capability", "embedding", "providers", "--json"],
});
expect(mocks.runtime.writeJson).toHaveBeenCalledWith([
{
available: true,
configured: false,
selected: false,
id: "openai",
defaultModel: "text-embedding-3-small",
transport: "remote",
autoSelectPriority: undefined,
},
{
available: true,
configured: true,
selected: true,
id: "gemini",
defaultModel: "gemini-embedding-001",
transport: "remote",
autoSelectPriority: undefined,
},
]);
});
});

1822
src/cli/capability-cli.ts Normal file

File diff suppressed because it is too large

View File

@@ -26,9 +26,18 @@ const { registerQaCli } = vi.hoisted(() => ({
}),
}));
const { inferAction, registerCapabilityCli } = vi.hoisted(() => {
const action = vi.fn();
const register = vi.fn((program: Command) => {
program.command("infer").alias("capability").action(action);
});
return { inferAction: action, registerCapabilityCli: register };
});
vi.mock("../acp-cli.js", () => ({ registerAcpCli }));
vi.mock("../nodes-cli.js", () => ({ registerNodesCli }));
vi.mock("../qa-cli.js", () => ({ registerQaCli }));
vi.mock("../capability-cli.js", () => ({ registerCapabilityCli }));
describe("registerSubCliCommands", () => {
const originalArgv = process.argv;
@@ -54,6 +63,8 @@ describe("registerSubCliCommands", () => {
acpAction.mockClear();
registerNodesCli.mockClear();
nodesAction.mockClear();
registerCapabilityCli.mockClear();
inferAction.mockClear();
});
afterEach(() => {
@@ -98,6 +109,17 @@ describe("registerSubCliCommands", () => {
expect(nodesAction).toHaveBeenCalledTimes(1);
});
it("registers the infer placeholder and dispatches through the capability registrar", async () => {
const program = createRegisteredProgram(["node", "openclaw", "infer"], "openclaw");
expect(program.commands.map((cmd) => cmd.name())).toEqual(["infer"]);
await program.parseAsync(["infer"], { from: "user" });
expect(registerCapabilityCli).toHaveBeenCalledTimes(1);
expect(inferAction).toHaveBeenCalledTimes(1);
});
it("replaces placeholder when registering a subcommand by name", async () => {
const program = createRegisteredProgram(["node", "openclaw", "acp", "--help"], "openclaw");

View File

@@ -74,6 +74,11 @@ const entrySpecs: readonly CommandGroupDescriptorSpec<SubCliRegistrar>[] = [
loadModule: () => import("../models-cli.js"),
exportName: "registerModelsCli",
},
{
commandNames: ["infer", "capability"],
loadModule: () => import("../capability-cli.js"),
exportName: "registerCapabilityCli",
},
{
commandNames: ["approvals"],
loadModule: () => import("../exec-approvals-cli.js"),

View File

@@ -22,6 +22,16 @@ const subCliCommandCatalog = defineCommandDescriptorCatalog([
description: "Discover, scan, and configure models",
hasSubcommands: true,
},
{
name: "infer",
description: "Run provider-backed inference commands",
hasSubcommands: true,
},
{
name: "capability",
description: "Run provider-backed inference commands (fallback alias: infer)",
hasSubcommands: true,
},
{
name: "approvals",
description: "Manage exec approvals (gateway or node host)",

View File

@@ -0,0 +1,83 @@
import { beforeEach, describe, expect, it, vi } from "vitest";
import { ErrorCodes } from "../protocol/index.js";
const mocks = vi.hoisted(() => ({
loadConfig: vi.fn(() => ({})),
resolveExplicitTtsOverrides: vi.fn(() => ({})),
textToSpeech: vi.fn(async () => ({
success: true,
audioPath: "/tmp/tts.mp3",
provider: "openai",
outputFormat: "mp3",
voiceCompatible: false,
})),
}));
vi.mock("../../config/config.js", () => ({
loadConfig: (...args: unknown[]) => mocks.loadConfig(...args),
}));
vi.mock("../../tts/provider-registry.js", () => ({
canonicalizeSpeechProviderId: vi.fn(),
getSpeechProvider: vi.fn(),
listSpeechProviders: vi.fn(() => []),
}));
vi.mock("../../tts/tts.js", () => ({
getResolvedSpeechProviderConfig: vi.fn(),
getTtsProvider: vi.fn(() => "openai"),
isTtsEnabled: vi.fn(() => true),
isTtsProviderConfigured: vi.fn(() => true),
resolveExplicitTtsOverrides: (...args: unknown[]) => mocks.resolveExplicitTtsOverrides(...args),
resolveTtsAutoMode: vi.fn(() => false),
resolveTtsConfig: vi.fn(() => ({})),
resolveTtsPrefsPath: vi.fn(() => "/tmp/tts.json"),
resolveTtsProviderOrder: vi.fn(() => ["openai"]),
setTtsEnabled: vi.fn(),
setTtsProvider: vi.fn(),
textToSpeech: (...args: unknown[]) => mocks.textToSpeech(...args),
}));
describe("ttsHandlers", () => {
beforeEach(() => {
mocks.loadConfig.mockReset();
mocks.loadConfig.mockReturnValue({});
mocks.resolveExplicitTtsOverrides.mockReset();
mocks.resolveExplicitTtsOverrides.mockReturnValue({});
mocks.textToSpeech.mockReset();
mocks.textToSpeech.mockResolvedValue({
success: true,
audioPath: "/tmp/tts.mp3",
provider: "openai",
outputFormat: "mp3",
voiceCompatible: false,
});
});
it("returns INVALID_REQUEST when TTS override validation fails", async () => {
mocks.resolveExplicitTtsOverrides.mockImplementation(() => {
throw new Error('Unknown TTS provider "bad".');
});
const { ttsHandlers } = await import("./tts.js");
const respond = vi.fn();
await ttsHandlers["tts.convert"]({
params: {
text: "hello",
provider: "bad",
},
respond,
} as never);
expect(respond).toHaveBeenCalledWith(
false,
undefined,
expect.objectContaining({
code: ErrorCodes.INVALID_REQUEST,
message: 'Error: Unknown TTS provider "bad".',
}),
);
expect(mocks.textToSpeech).not.toHaveBeenCalled();
});
});
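
The test above checks that a failed override validation is reported as `INVALID_REQUEST` before any synthesis runs. A minimal sketch of that validate-then-pin flow follows, under stated assumptions: `planTtsConvert`, `TtsPlan`, `optionalTrimmed`, and `KNOWN_PROVIDERS` are hypothetical names invented for illustration, and the provider set stands in for the real registry lookup done by `resolveExplicitTtsOverrides`.

```typescript
// Hypothetical, simplified model of the tts.convert override handling.
// None of these names come from the project; they only illustrate the flow.
type TtsPlan =
  | {
      ok: true;
      overrides: {
        provider: string | undefined;
        modelId: string | undefined;
        voiceId: string | undefined;
      };
      disableFallback: boolean;
    }
  | { ok: false; code: "INVALID_REQUEST"; message: string };

// Assumption: a fixed set stands in for the real speech provider registry.
const KNOWN_PROVIDERS = new Set(["openai"]);

function optionalTrimmed(value: unknown): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

function planTtsConvert(params: Record<string, unknown>): TtsPlan {
  const provider = optionalTrimmed(params.provider);
  const modelId = optionalTrimmed(params.modelId);
  const voiceId = optionalTrimmed(params.voiceId);
  // Validation failures surface as a request error, not a synthesis error.
  if (provider && !KNOWN_PROVIDERS.has(provider)) {
    return { ok: false, code: "INVALID_REQUEST", message: `Unknown TTS provider "${provider}".` };
  }
  return {
    ok: true,
    overrides: { provider, modelId, voiceId },
    // Any explicit override pins the request, so provider fallback is disabled.
    disableFallback: Boolean(provider || modelId || voiceId),
  };
}
```

The design point is that a caller who names a provider, model, or voice should get that exact choice or an error, never a silent substitute.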

View File

@@ -9,6 +9,7 @@ import {
getTtsProvider,
isTtsEnabled,
isTtsProviderConfigured,
resolveExplicitTtsOverrides,
resolveTtsAutoMode,
resolveTtsConfig,
resolveTtsPrefsPath,
@@ -89,7 +90,28 @@ export const ttsHandlers: GatewayRequestHandlers = {
try {
const cfg = loadConfig();
const channel = typeof params.channel === "string" ? params.channel.trim() : undefined;
-const result = await textToSpeech({ text, cfg, channel });
+const providerRaw = typeof params.provider === "string" ? params.provider.trim() : undefined;
+const modelId = typeof params.modelId === "string" ? params.modelId.trim() : undefined;
+const voiceId = typeof params.voiceId === "string" ? params.voiceId.trim() : undefined;
+let overrides;
+try {
+overrides = resolveExplicitTtsOverrides({
+cfg,
+provider: providerRaw,
+modelId,
+voiceId,
+});
+} catch (err) {
+respond(false, undefined, errorShape(ErrorCodes.INVALID_REQUEST, formatForLog(err)));
+return;
+}
+const result = await textToSpeech({
+text,
+cfg,
+channel,
+overrides,
+disableFallback: Boolean(overrides.provider || modelId || voiceId),
+});
if (result.success && result.audioPath) {
respond(true, {
audioPath: result.audioPath,

View File

@@ -121,6 +121,43 @@ describe("runCapability auto audio entries", () => {
expect(seenModel).toBe("whisper-1");
});
it("lets per-request transcription hints override configured model-entry hints", async () => {
let seenLanguage: string | undefined;
let seenPrompt: string | undefined;
const result = await runAutoAudioCase({
transcribeAudio: async (req) => {
seenLanguage = req.language;
seenPrompt = req.prompt;
return { text: "ok", model: req.model ?? "unknown" };
},
cfgExtra: {
tools: {
media: {
audio: {
enabled: true,
prompt: "configured prompt",
language: "fr",
_requestPromptOverride: "Focus on names",
_requestLanguageOverride: "en",
models: [
{
provider: "openai",
model: "whisper-1",
prompt: "entry prompt",
language: "de",
},
],
},
},
},
} as Partial<OpenClawConfig>,
});
expect(result.outputs[0]?.text).toBe("ok");
expect(seenLanguage).toBe("en");
expect(seenPrompt).toBe("Focus on names");
});
it("uses mistral when only mistral key is configured", async () => {
const isolatedAgentDir = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-audio-agent-"));
let runResult: Awaited<ReturnType<typeof runCapability>> | undefined;

View File

@@ -0,0 +1,67 @@
import { afterEach, beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import { withAudioFixture } from "./runner.test-utils.js";
const runExecMock = vi.hoisted(() => vi.fn());
vi.mock("../process/exec.js", () => ({
runExec: (...args: unknown[]) => runExecMock(...args),
}));
let runCliEntry: typeof import("./runner.entries.js").runCliEntry;
describe("media-understanding CLI audio entry", () => {
beforeAll(async () => {
({ runCliEntry } = await import("./runner.entries.js"));
});
beforeEach(() => {
runExecMock.mockReset().mockResolvedValue({ stdout: "cli transcript" });
});
afterEach(() => {
vi.clearAllMocks();
});
it("applies per-request prompt and language overrides to CLI transcription templating", async () => {
await withAudioFixture("openclaw-cli-audio", async ({ ctx, cache }) => {
await runCliEntry({
capability: "audio",
entry: {
type: "cli",
command: "mock-transcriber",
args: ["--prompt", "{{Prompt}}", "--language", "{{Language}}", "--file", "{{MediaPath}}"],
prompt: "entry prompt",
language: "de",
},
cfg: {
tools: {
media: {
audio: {
prompt: "configured prompt",
language: "fr",
_requestPromptOverride: "Focus on names",
_requestLanguageOverride: "en",
},
},
},
} as OpenClawConfig,
ctx,
attachmentIndex: 0,
cache,
config: {
prompt: "configured prompt",
language: "fr",
_requestPromptOverride: "Focus on names",
_requestLanguageOverride: "en",
} as never,
});
});
expect(runExecMock).toHaveBeenCalledWith(
"mock-transcriber",
expect.arrayContaining(["--prompt", "Focus on names", "--language", "en"]),
expect.any(Object),
);
});
});

View File

@@ -372,6 +372,20 @@ function resolveEntryRunOptions(params: {
return { maxBytes, maxChars, timeoutMs, prompt };
}
function resolveAudioRequestOverrides(config: MediaUnderstandingConfig | undefined): {
prompt?: string;
language?: string;
} {
const overrides = (config ?? {}) as MediaUnderstandingConfig & {
_requestPromptOverride?: string;
_requestLanguageOverride?: string;
};
return {
prompt: overrides._requestPromptOverride,
language: overrides._requestLanguageOverride,
};
}
async function resolveProviderExecutionAuth(params: {
providerId: string;
cfg: OpenClawConfig;
@@ -530,6 +544,7 @@ export async function runProviderEntry(params: {
throw new Error(`Audio transcription provider "${providerId}" not available.`);
}
const transcribeAudio = provider.transcribeAudio;
const requestOverrides = resolveAudioRequestOverrides(params.config);
const media = await params.cache.getBuffer({
attachmentIndex: params.attachmentIndex,
maxBytes,
@@ -569,8 +584,12 @@ export async function runProviderEntry(params: {
headers,
request,
model,
-language: entry.language ?? params.config?.language ?? cfg.tools?.media?.audio?.language,
-prompt,
+language:
+requestOverrides.language ??
+entry.language ??
+params.config?.language ??
+cfg.tools?.media?.audio?.language,
+prompt: requestOverrides.prompt ?? prompt,
query: providerQuery,
timeoutMs,
fetchFn,
@@ -651,6 +670,7 @@ export async function runCliEntry(params: {
if (!command) {
throw new Error(`CLI entry missing command for ${capability}`);
}
const requestOverrides = resolveAudioRequestOverrides(params.config);
const { maxBytes, maxChars, timeoutMs, prompt } = resolveEntryRunOptions({
capability,
entry,
@@ -683,7 +703,8 @@ export async function runCliEntry(params: {
MediaDir: path.dirname(mediaPath),
OutputDir: outputDir,
OutputBase: outputBase,
-Prompt: prompt,
+Prompt: requestOverrides.prompt ?? prompt,
+...(requestOverrides.language ? { Language: requestOverrides.language } : {}),
MaxChars: maxChars,
};
const argv = [command, ...args].map((part, index) =>

View File

@@ -150,7 +150,28 @@ export async function transcribeAudioFile(params: {
agentDir?: string;
mime?: string;
activeModel?: ActiveMediaModel;
+language?: string;
+prompt?: string;
}): Promise<{ text: string | undefined }> {
-const result = await runMediaUnderstandingFile({ ...params, capability: "audio" });
+const cfg =
+params.language || params.prompt
+? {
+...params.cfg,
+tools: {
+...params.cfg.tools,
+media: {
+...params.cfg.tools?.media,
+audio: {
+...params.cfg.tools?.media?.audio,
+...(params.language ? { _requestLanguageOverride: params.language } : {}),
+...(params.prompt ? { _requestPromptOverride: params.prompt } : {}),
+...(params.language ? { language: params.language } : {}),
+...(params.prompt ? { prompt: params.prompt } : {}),
+},
+},
+},
+}
+: params.cfg;
+const result = await runMediaUnderstandingFile({ ...params, cfg, capability: "audio" });
return { text: result.text };
}
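
The hunks above thread per-request `language` and `prompt` values through the config so that they win over model-entry and tool-level settings. A minimal sketch of that precedence chain follows. It is an illustration only: `AudioHints` and `resolveAudioHints` are hypothetical names, the two config levels in the source are collapsed into one `config` layer, and treating `prompt` with the same ordering as `language` is an assumption (the source resolves the non-request prompt elsewhere).

```typescript
// Hypothetical shapes for illustration; not the project's types.
interface AudioHints {
  language?: string;
  prompt?: string;
}

interface HintLayers {
  request?: AudioHints; // per-request overrides (highest precedence)
  entry?: AudioHints; // model-entry settings
  config?: AudioHints; // tool-level configuration (lowest precedence)
}

// Resolve each hint independently: the first defined layer wins.
function resolveAudioHints(layers: HintLayers): AudioHints {
  return {
    language: layers.request?.language ?? layers.entry?.language ?? layers.config?.language,
    prompt: layers.request?.prompt ?? layers.entry?.prompt ?? layers.config?.prompt,
  };
}

const resolved = resolveAudioHints({
  request: { language: "en" },
  entry: { language: "de", prompt: "entry prompt" },
  config: { language: "fr", prompt: "configured prompt" },
});
// resolved.language is "en" (request wins); resolved.prompt is "entry prompt".
```

Because `??` only skips `null` and `undefined`, an empty-string override would still win in this sketch; the source avoids that case by only setting the override fields when the request values are truthy.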

View File

@@ -34,6 +34,8 @@ export const listSpeechVoices: FacadeModule["listSpeechVoices"] =
createLazyFacadeValue("listSpeechVoices");
export const maybeApplyTtsToPayload: FacadeModule["maybeApplyTtsToPayload"] =
createLazyFacadeValue("maybeApplyTtsToPayload");
export const resolveExplicitTtsOverrides: FacadeModule["resolveExplicitTtsOverrides"] =
createLazyFacadeValue("resolveExplicitTtsOverrides");
export const resolveTtsAutoMode: FacadeModule["resolveTtsAutoMode"] =
createLazyFacadeValue("resolveTtsAutoMode");
export const resolveTtsConfig: FacadeModule["resolveTtsConfig"] =

View File

@@ -1,6 +1,6 @@
import fs from "node:fs";
import path from "node:path";
-import { describe, expect, it } from "vitest";
+import { describe, expect, it, vi } from "vitest";
import { withTempHome } from "../../test/helpers/temp-home.js";
import type { OpenClawConfig } from "../config/config.js";
import { resolveStatusTtsSnapshot } from "./status-config.js";
@@ -61,4 +61,44 @@ describe("resolveStatusTtsSnapshot", () => {
});
});
});
it("derives the default prefs path from OPENCLAW_CONFIG_PATH when set", async () => {
await withTempHome(
async (home) => {
const stateDir = path.join(home, ".openclaw-dev");
const prefsPath = path.join(stateDir, "settings", "tts.json");
fs.mkdirSync(path.dirname(prefsPath), { recursive: true });
fs.writeFileSync(
prefsPath,
JSON.stringify({
tts: {
auto: "always",
provider: "openai",
},
}),
);
vi.stubEnv("OPENCLAW_CONFIG_PATH", path.join(stateDir, "openclaw.json"));
try {
expect(
resolveStatusTtsSnapshot({
cfg: {
messages: {
tts: {},
},
} as OpenClawConfig,
}),
).toEqual({
autoMode: "always",
provider: "openai",
maxLength: 1500,
summarize: true,
});
} finally {
vi.unstubAllEnvs();
}
},
{ env: { OPENCLAW_STATE_DIR: undefined } },
);
});
});

View File

@@ -6,7 +6,7 @@ import {
normalizeOptionalLowercaseString,
normalizeOptionalString,
} from "../shared/string-coerce.js";
-import { CONFIG_DIR, resolveUserPath } from "../utils.js";
+import { resolveConfigDir, resolveUserPath } from "../utils.js";
import { normalizeTtsAutoMode } from "./tts-auto-mode.js";
const DEFAULT_TTS_MAX_LENGTH = 1500;
@@ -52,7 +52,7 @@ function resolveTtsPrefsPathValue(prefsPath: string | undefined): string {
if (envPath) {
return resolveUserPath(envPath);
}
-return path.join(CONFIG_DIR, "settings", "tts.json");
+return path.join(resolveConfigDir(process.env), "settings", "tts.json");
}
function readPrefs(prefsPath: string): TtsUserPrefs {

View File

@@ -10,6 +10,7 @@ export {
isTtsProviderConfigured,
listSpeechVoices,
maybeApplyTtsToPayload,
resolveExplicitTtsOverrides,
resolveTtsAutoMode,
resolveTtsConfig,
resolveTtsPrefsPath,

View File

@@ -50,6 +50,15 @@ describe("resolveConfigDir", () => {
expect(resolveConfigDir(env)).toBe(path.resolve("/tmp/openclaw-home", "state"));
});
it("falls back to the config file directory when only OPENCLAW_CONFIG_PATH is set", () => {
const env = {
HOME: "/tmp/openclaw-home",
OPENCLAW_CONFIG_PATH: "~/profiles/dev/openclaw.json",
} as NodeJS.ProcessEnv;
expect(resolveConfigDir(env)).toBe(path.resolve("/tmp/openclaw-home", "profiles", "dev"));
});
});
describe("resolveHomeDir", () => {

View File

@@ -141,6 +141,10 @@ export function resolveConfigDir(
if (override) {
return resolveUserPath(override, env, homedir);
}
const configPath = env.OPENCLAW_CONFIG_PATH?.trim();
if (configPath) {
return path.dirname(resolveUserPath(configPath, env, homedir));
}
const newDir = path.join(resolveRequiredHomeDir(env, homedir), ".openclaw");
try {
const hasNew = fs.existsSync(newDir);
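
The change above makes the config directory follow `OPENCLAW_CONFIG_PATH` when no explicit override is set. A minimal sketch of that resolution order follows, under stated assumptions: `sketchResolveConfigDir` and `expandHome` are hypothetical helpers, the override variable is assumed to be `OPENCLAW_STATE_DIR` (suggested by the tests, not shown in this hunk), and the legacy-directory probing that follows in the real function is omitted.

```typescript
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical helper: expand a leading "~" against the given home directory.
function expandHome(input: string, home: string): string {
  const expanded =
    input === "~" || input.startsWith("~/") ? path.join(home, input.slice(1)) : input;
  return path.resolve(expanded);
}

// Simplified stand-in for the resolution order; not the project's implementation.
function sketchResolveConfigDir(env: Env, home: string): string {
  // 1. An explicit state-dir override wins (assumed variable name).
  const override = env.OPENCLAW_STATE_DIR?.trim();
  if (override) return expandHome(override, home);
  // 2. Otherwise use the directory that contains the config file.
  const configPath = env.OPENCLAW_CONFIG_PATH?.trim();
  if (configPath) return path.dirname(expandHome(configPath, home));
  // 3. Otherwise fall back to the default directory under the home directory.
  return path.join(home, ".openclaw");
}
```

With this ordering, derived paths such as `settings/tts.json` sit next to a profile-specific config file instead of always landing in the default directory.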

View File

@@ -60,6 +60,16 @@ function hasEntryCredential(
});
}
export function isWebFetchProviderConfigured(params: {
provider: Pick<
PluginWebFetchProviderEntry,
"envVars" | "getConfiguredCredentialValue" | "getCredentialValue" | "requiresCredential"
>;
config?: OpenClawConfig;
}): boolean {
return hasEntryCredential(params.provider, params.config, resolveFetchConfig(params.config));
}
export function listWebFetchProviders(params?: {
config?: OpenClawConfig;
}): PluginWebFetchProviderEntry[] {

View File

@@ -289,4 +289,332 @@ describe("web search runtime", () => {
result: { query: "runtime", provider: "beta", runtimeSelectedProvider: "beta" },
});
});
it("falls back to another provider when auto-selected search execution fails", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => ({
description: "google",
parameters: {},
execute: async () => {
throw new Error("google aborted");
},
}),
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
createTool: () => ({
description: "duckduckgo",
parameters: {},
execute: async (args) => ({ ...args, provider: "duckduckgo" }),
}),
}),
]);
await expect(
runWebSearch({
config: {},
args: { query: "fallback" },
}),
).resolves.toEqual({
provider: "duckduckgo",
result: { query: "fallback", provider: "duckduckgo" },
});
});
it("does not prebuild fallback provider tools before attempting the selected provider", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => ({
description: "google",
parameters: {},
execute: async (args) => ({ ...args, provider: "google" }),
}),
}),
createProvider({
pluginId: "broken-fallback",
id: "broken-fallback",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
createTool: () => {
throw new Error("fallback createTool exploded");
},
}),
]);
await expect(
runWebSearch({
config: {},
args: { query: "selected-first" },
}),
).resolves.toEqual({
provider: "google",
result: { query: "selected-first", provider: "google" },
});
});
it("does not fall back when the provider came from explicit config selection", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => ({
description: "google",
parameters: {},
execute: async () => {
throw new Error("google aborted");
},
}),
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
createTool: () => ({
description: "duckduckgo",
parameters: {},
execute: async (args) => ({ ...args, provider: "duckduckgo" }),
}),
}),
]);
await expect(
runWebSearch({
config: {
tools: {
web: {
search: {
provider: "google",
},
},
},
},
args: { query: "configured" },
}),
).rejects.toThrow("google aborted");
});
it("does not fall back when the caller explicitly selects a provider", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => ({
description: "google",
parameters: {},
execute: async () => {
throw new Error("google aborted");
},
}),
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
}),
]);
await expect(
runWebSearch({
config: {},
providerId: "google",
args: { query: "explicit" },
}),
).rejects.toThrow("google aborted");
});
it("fails fast when an explicit provider cannot create a tool", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => null,
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
}),
]);
await expect(
runWebSearch({
config: {},
providerId: "google",
args: { query: "explicit-null-tool" },
}),
).rejects.toThrow('web_search provider "google" is not available.');
});
it("fails fast when the caller explicitly selects an unknown provider", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
}),
]);
await expect(
runWebSearch({
config: {},
providerId: "missing-id",
args: { query: "explicit-missing" },
}),
).rejects.toThrow('Unknown web_search provider "missing-id".');
});
it("still falls back when config names an unknown provider id", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => {
throw new Error("google aborted");
},
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
}),
]);
await expect(
runWebSearch({
config: {
tools: {
web: {
search: {
provider: "missing-id",
},
},
},
},
args: { query: "config-typo" },
}),
).resolves.toMatchObject({
provider: "duckduckgo",
result: expect.objectContaining({
provider: "duckduckgo",
query: "config-typo",
}),
});
});
it("honors preferRuntimeProviders during execution", async () => {
const configuredProvider = createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
});
const runtimeProvider = createProvider({
pluginId: "runtime-search",
id: "runtime-search",
credentialPath: "",
autoDetectOrder: 0,
requiresCredential: false,
});
resolveRuntimeWebSearchProvidersMock.mockReturnValue([configuredProvider, runtimeProvider]);
resolvePluginWebSearchProvidersMock.mockReturnValue([configuredProvider]);
await expect(
runWebSearch({
config: {
tools: {
web: {
search: {
provider: "google",
},
},
},
},
runtimeWebSearch: {
enabled: true,
providerConfigured: "runtime-search",
selectedProvider: "runtime-search",
providerSource: "runtime",
},
preferRuntimeProviders: false,
args: { query: "prefer-config" },
}),
).resolves.toEqual({
provider: "google",
result: { query: "prefer-config", provider: "google" },
});
});
it("returns a clear error when every fallback-capable provider is unavailable", async () => {
resolveRuntimeWebSearchProvidersMock.mockReturnValue([
createProvider({
pluginId: "google",
id: "google",
credentialPath: "tools.web.search.google.apiKey",
autoDetectOrder: 1,
getCredentialValue: () => "configured",
createTool: () => null,
}),
createProvider({
pluginId: "duckduckgo",
id: "duckduckgo",
credentialPath: "",
autoDetectOrder: 100,
requiresCredential: false,
createTool: () => null,
}),
]);
await expect(
runWebSearch({
config: {},
args: { query: "all-null-tools" },
}),
).rejects.toThrow("web_search is enabled but no provider is currently available.");
});
});

View File

@@ -78,6 +78,21 @@ function hasEntryCredential(
});
}
export function isWebSearchProviderConfigured(params: {
provider: Pick<
PluginWebSearchProviderEntry,
| "credentialPath"
| "id"
| "envVars"
| "getConfiguredCredentialValue"
| "getCredentialValue"
| "requiresCredential"
>;
config?: OpenClawConfig;
}): boolean {
return hasEntryCredential(params.provider, params.config, resolveSearchConfig(params.config));
}
export function listWebSearchProviders(params?: {
config?: OpenClawConfig;
}): PluginWebSearchProviderEntry[] {
@@ -197,21 +212,148 @@ export function resolveWebSearchDefinition(
});
}
function resolveWebSearchCandidates(
options?: ResolveWebSearchDefinitionParams,
): PluginWebSearchProviderEntry[] {
const search = resolveSearchConfig(options?.config);
const runtimeWebSearch = options?.runtimeWebSearch ?? getActiveRuntimeWebToolsMetadata()?.search;
if (!resolveWebSearchEnabled({ search, sandboxed: options?.sandboxed })) {
return [];
}
const providers = sortWebSearchProvidersForAutoDetect(
options?.preferRuntimeProviders
? resolveRuntimeWebSearchProviders({
config: options?.config,
bundledAllowlistCompat: true,
})
: resolvePluginWebSearchProviders({
config: options?.config,
bundledAllowlistCompat: true,
origin: "bundled",
}),
).filter(Boolean);
if (providers.length === 0) {
return [];
}
const preferredIds = [
options?.providerId,
runtimeWebSearch?.selectedProvider,
runtimeWebSearch?.providerConfigured,
resolveWebSearchProviderId({ config: options?.config, search, providers }),
].filter(
(value, index, array): value is string => Boolean(value) && array.indexOf(value) === index,
);
const explicitProviderId = options?.providerId?.trim();
if (explicitProviderId && !providers.some((entry) => entry.id === explicitProviderId)) {
throw new Error(`Unknown web_search provider "${explicitProviderId}".`);
}
const orderedProviders = [
...preferredIds
.map((id) => providers.find((entry) => entry.id === id))
.filter((entry): entry is PluginWebSearchProviderEntry => Boolean(entry)),
...providers.filter((entry) => !preferredIds.includes(entry.id)),
];
return orderedProviders;
}
function hasExplicitWebSearchSelection(params: {
search?: WebSearchConfig;
runtimeWebSearch?: RuntimeWebSearchMetadata;
providerId?: string;
providers?: PluginWebSearchProviderEntry[];
}): boolean {
if (params.providerId?.trim()) {
return true;
}
const availableProviderIds = new Set(
(params.providers ?? []).map((provider) => provider.id.trim().toLowerCase()),
);
const configuredProviderId =
params.search &&
"provider" in params.search &&
typeof params.search.provider === "string"
? params.search.provider.trim().toLowerCase()
: "";
if (configuredProviderId && availableProviderIds.has(configuredProviderId)) {
return true;
}
const runtimeConfiguredId = (
params.runtimeWebSearch?.selectedProvider ?? params.runtimeWebSearch?.providerConfigured
)
?.trim()
.toLowerCase();
if (
params.runtimeWebSearch?.providerSource === "configured" &&
runtimeConfiguredId &&
availableProviderIds.has(runtimeConfiguredId)
) {
return true;
}
return false;
}
export async function runWebSearch(
params: RunWebSearchParams,
): Promise<{ provider: string; result: Record<string, unknown> }> {
-const resolved = resolveWebSearchDefinition({ ...params, preferRuntimeProviders: true });
-if (!resolved) {
+const search = resolveSearchConfig(params.config);
+const runtimeWebSearch = params.runtimeWebSearch ?? getActiveRuntimeWebToolsMetadata()?.search;
+const candidates = resolveWebSearchCandidates({
+...params,
+runtimeWebSearch,
+preferRuntimeProviders: params.preferRuntimeProviders ?? true,
+});
+if (candidates.length === 0) {
throw new Error("web_search is disabled or no provider is available.");
}
-return {
-provider: resolved.provider.id,
-result: await resolved.definition.execute(params.args),
-};
+const allowFallback = !hasExplicitWebSearchSelection({
+search,
+runtimeWebSearch,
+providerId: params.providerId,
+providers: candidates,
+});
+let lastError: unknown;
+let sawUnavailableProvider = false;
+for (const candidate of candidates) {
+try {
+const definition = candidate.createTool({
+config: params.config,
+searchConfig: search as Record<string, unknown> | undefined,
+runtimeMetadata: runtimeWebSearch,
+});
+if (!definition) {
+if (!allowFallback) {
+throw new Error(`web_search provider "${candidate.id}" is not available.`);
+}
+sawUnavailableProvider = true;
+continue;
+}
+return {
+provider: candidate.id,
+result: await definition.execute(params.args),
+};
+} catch (error) {
+lastError = error;
+if (!allowFallback) {
+throw error;
+}
+}
+}
+if (sawUnavailableProvider && lastError === undefined) {
+throw new Error("web_search is enabled but no provider is currently available.");
+}
+throw lastError instanceof Error ? lastError : new Error(String(lastError));
}
export const __testing = {
resolveSearchConfig,
resolveSearchProvider: resolveWebSearchProviderId,
+resolveWebSearchProviderId,
+resolveWebSearchCandidates,
hasExplicitWebSearchSelection,
};
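
The reworked `runWebSearch` tries candidates in order, builds each tool only when its turn comes, and stops at the first failure when the provider was chosen explicitly. A self-contained sketch of that loop follows. It is a simplified illustration, not the project's code: `SearchProvider`, `SearchTool`, and `searchWithFallback` are hypothetical names, and `allowFallback` is passed in directly instead of being derived from config and runtime metadata.

```typescript
// Hypothetical minimal shapes; the real provider entries carry more fields.
interface SearchTool {
  execute(args: Record<string, unknown>): Promise<Record<string, unknown>>;
}

interface SearchProvider {
  id: string;
  createTool(): SearchTool | null;
}

async function searchWithFallback(
  candidates: SearchProvider[],
  args: Record<string, unknown>,
  allowFallback: boolean,
): Promise<{ provider: string; result: Record<string, unknown> }> {
  if (candidates.length === 0) {
    throw new Error("no provider is available");
  }
  let lastError: unknown;
  let sawUnavailable = false;
  for (const candidate of candidates) {
    try {
      // Tools are created lazily, so a broken later candidate cannot
      // affect an earlier candidate that succeeds.
      const tool = candidate.createTool();
      if (!tool) {
        if (!allowFallback) {
          throw new Error(`provider "${candidate.id}" is not available`);
        }
        sawUnavailable = true;
        continue;
      }
      return { provider: candidate.id, result: await tool.execute(args) };
    } catch (error) {
      lastError = error;
      // An explicit selection fails fast instead of silently switching providers.
      if (!allowFallback) throw error;
    }
  }
  if (sawUnavailable && lastError === undefined) {
    throw new Error("no provider is currently available");
  }
  throw lastError instanceof Error ? lastError : new Error(String(lastError));
}
```

Keeping the last thrown error means that when every candidate fails, the caller sees a real provider error; the generic "currently available" message is reserved for the case where no candidate could build a tool and nothing threw.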