mirror of
https://github.com/openclaw/openclaw.git
synced 2026-06-06 05:51:15 +08:00
fix(vllm): wire configured thinking params
Move vLLM Qwen thinking control onto configured model compat metadata and carry it through catalog/model-selection/runtime thinking contexts. Also migrate legacy provider/default request params in doctor and keep Pi/runtime model rows buildable with explicit reasoning defaults. Thanks @rendrag-git. Co-authored-by: rendrag-git <253747599+rendrag-git@users.noreply.github.com>
This commit is contained in:
@@ -459,8 +459,8 @@ Time format in system prompt. Default: `auto` (OS preference).
|
||||
- `params` merge precedence (config): `agents.defaults.params` (global base) is overridden by `agents.defaults.models["provider/model"].params` (per-model), then `agents.list[].params` (matching agent id) overrides by key. See [Prompt Caching](/reference/prompt-caching) for details.
|
||||
- `models.providers.openrouter.params.provider`: OpenRouter-wide default provider-routing policy. OpenClaw forwards this to OpenRouter's request `provider` object; per-model `agents.defaults.models["openrouter/<model>"].params.provider` and agent params override by key. See [OpenRouter provider routing](/providers/openrouter#advanced-configuration).
|
||||
- `params.extra_body`/`params.extraBody`: advanced pass-through JSON merged into `api: "openai-completions"` request bodies for OpenAI-compatible proxies. If it collides with generated request keys, the extra body wins; non-native completions routes still strip OpenAI-only `store` afterward.
|
||||
- `params.chat_template_kwargs`: vLLM/OpenAI-compatible chat-template arguments merged into top-level `api: "openai-completions"` request bodies. For `vllm/nemotron-3-*` with thinking off, the bundled vLLM plugin automatically sends `enable_thinking: false` and `force_nonempty_content: true`; explicit `chat_template_kwargs` override generated defaults, and `extra_body.chat_template_kwargs` still has final precedence. For vLLM Qwen thinking controls, set `params.qwenThinkingFormat` to `"chat-template"` or `"top-level"` on that model entry.
|
||||
- `compat.thinkingFormat`: OpenAI-compatible thinking payload style. Use `"together"` for Together-style `reasoning.enabled`, `"qwen"` for Qwen-style top-level `enable_thinking`, or `"qwen-chat-template"` for `chat_template_kwargs.enable_thinking` on Qwen-family backends that support request-level chat-template kwargs, such as vLLM. OpenClaw maps disabled thinking to `false` and enabled thinking to `true`.
|
||||
- `params.chat_template_kwargs`: vLLM/OpenAI-compatible chat-template arguments merged into top-level `api: "openai-completions"` request bodies. For `vllm/nemotron-3-*` with thinking off, the bundled vLLM plugin automatically sends `enable_thinking: false` and `force_nonempty_content: true`; explicit `chat_template_kwargs` override generated defaults, and `extra_body.chat_template_kwargs` still has final precedence. Configured vLLM Qwen and Nemotron thinking models expose binary `/think` choices (`off`, `on`) instead of the multi-level effort ladder.
|
||||
- `compat.thinkingFormat`: OpenAI-compatible thinking payload style. Use `"together"` for Together-style `reasoning.enabled`, `"qwen"` for Qwen-style top-level `enable_thinking`, or `"qwen-chat-template"` for `chat_template_kwargs.enable_thinking` on Qwen-family backends that support request-level chat-template kwargs, such as vLLM. OpenClaw maps disabled thinking to `false` and enabled thinking to `true`, and configured vLLM Qwen models expose binary `/think` choices for these formats.
|
||||
- `compat.supportedReasoningEfforts`: per-model OpenAI-compatible reasoning effort list. Include `"xhigh"` for custom endpoints that truly accept it; OpenClaw then exposes `/think xhigh` in command menus, Gateway session rows, session patch validation, agent CLI validation, and `llm-task` validation for that configured provider/model. Use `compat.reasoningEffortMap` when the backend wants a provider-specific value for a canonical level.
|
||||
- `params.preserveThinking`: Z.AI-only opt-in for preserved thinking. When enabled and thinking is on, OpenClaw sends `thinking.clear_thinking: false` and replays prior `reasoning_content`; see [Z.AI thinking and preserved thinking](/providers/zai#thinking-and-preserved-thinking).
|
||||
- `localService`: optional provider-level process manager for local/self-hosted model servers. When the selected model belongs to that provider, OpenClaw probes `healthUrl` (or `baseUrl + "/models"`), starts `command` with `args` if the endpoint is down, waits up to `readyTimeoutMs`, then sends the model request. `command` must be an absolute path. `idleStopMs: 0` keeps the process alive until OpenClaw exits; a positive value stops the OpenClaw-spawned process after that many idle milliseconds. See [Local model services](/gateway/local-model-services).
|
||||
|
||||
@@ -535,7 +535,7 @@ Configuring a custom/local provider `baseUrl` is also the narrow network trust d
|
||||
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
|
||||
- `models.providers.*.models.*.compat.requiresStringContent`: optional compatibility hint for string-only OpenAI-compatible chat endpoints. When `true`, OpenClaw flattens pure text `messages[].content` arrays into plain strings before sending the request.
|
||||
- `models.providers.*.models.*.compat.strictMessageKeys`: optional compatibility hint for strict OpenAI-compatible chat endpoints. When `true`, OpenClaw strips outgoing Chat Completions message objects to `role` and `content` before sending the request.
|
||||
- `models.providers.*.models.*.compat.thinkingFormat`: optional thinking payload hint. Use `"together"` for Together-style `reasoning.enabled`, `"qwen"` for top-level `enable_thinking`, or `"qwen-chat-template"` for `chat_template_kwargs.enable_thinking` on Qwen-family OpenAI-compatible servers that support request-level chat-template kwargs, such as vLLM.
|
||||
- `models.providers.*.models.*.compat.thinkingFormat`: optional thinking payload hint. Use `"together"` for Together-style `reasoning.enabled`, `"qwen"` for top-level `enable_thinking`, or `"qwen-chat-template"` for `chat_template_kwargs.enable_thinking` on Qwen-family OpenAI-compatible servers that support request-level chat-template kwargs, such as vLLM. Configured vLLM Qwen models expose binary `/think` choices (`off`, `on`) for these formats.
|
||||
|
||||
</Accordion>
|
||||
<Accordion title="Amazon Bedrock discovery">
|
||||
|
||||
@@ -818,6 +818,11 @@ canonical replacement.
|
||||
ranked level list. OpenClaw downgrades stale stored values by profile
|
||||
rank automatically.
|
||||
|
||||
The context includes `provider`, `modelId`, optional merged `reasoning`,
|
||||
and optional merged model `compat` facts. Provider plugins can use those
|
||||
catalog facts to expose a model-specific profile only when the configured
|
||||
request contract supports it.
|
||||
|
||||
Implement one hook instead of three. The legacy hooks keep working during
|
||||
the deprecation window but are not composed with the profile result.
|
||||
|
||||
|
||||
@@ -501,6 +501,7 @@ API key auth, and dynamic model resolution.
|
||||
|
||||
- `normalizeConfig` checks the matched provider first, then other hook-capable provider plugins until one actually changes the config. If no provider hook rewrites a supported Google-family config entry, the bundled Google config normalizer still applies.
|
||||
- `resolveConfigApiKey` uses the provider hook when exposed. The bundled `amazon-bedrock` path also has a built-in AWS env-marker resolver here, even though Bedrock runtime auth itself still uses the AWS SDK default chain.
|
||||
- `resolveThinkingProfile(ctx)` receives the selected `provider`, `modelId`, optional merged `reasoning` catalog hint, and optional merged model `compat` facts. Use `compat` only to select the provider's thinking UI/profile.
|
||||
- `resolveSystemPromptContribution` lets a provider inject cache-aware system-prompt guidance for a model family. Prefer it over `before_prompt_build` when the behavior belongs to one provider/model family and should preserve the stable/dynamic cache split.
|
||||
|
||||
For detailed descriptions and real-world examples, see [Internals: Provider Runtime Hooks](/plugins/architecture-internals#provider-runtime-hooks).
|
||||
|
||||
@@ -145,8 +145,32 @@ wildcard to the visible model catalog:
|
||||
|
||||
<Accordion title="Qwen thinking controls">
|
||||
For Qwen models served through vLLM, set
|
||||
`params.qwenThinkingFormat: "chat-template"` on the model entry when the
|
||||
server expects Qwen chat-template kwargs. OpenClaw maps `/think off` to:
|
||||
`compat.thinkingFormat: "qwen-chat-template"` on the configured provider
|
||||
model row when the server expects Qwen chat-template kwargs. Models
|
||||
configured this way expose a binary `/think` profile (`off`, `on`) because
|
||||
Qwen template thinking is an on/off request flag, not an OpenAI-style effort
|
||||
ladder.
|
||||
|
||||
```json5
|
||||
{
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
OpenClaw maps `/think off` to:
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -159,8 +183,8 @@ wildcard to the visible model catalog:
|
||||
|
||||
Non-`off` thinking levels send `enable_thinking: true`. If your endpoint
|
||||
expects DashScope-style top-level flags instead, use
|
||||
`params.qwenThinkingFormat: "top-level"` to send `enable_thinking` at the
|
||||
request root. Snake-case `params.qwen_thinking_format` is also accepted.
|
||||
`compat.thinkingFormat: "qwen"` to send `enable_thinking` at the request
|
||||
root.
|
||||
|
||||
</Accordion>
|
||||
|
||||
|
||||
@@ -134,6 +134,7 @@ Malformed local-model reasoning tags are handled conservatively. Closed `<think>
|
||||
- Provider plugins can expose `resolveThinkingProfile(ctx)` to define the model's supported levels and default.
|
||||
- Provider plugins that proxy Claude models should reuse `resolveClaudeThinkingProfile(modelId)` from `openclaw/plugin-sdk/provider-model-shared` so direct Anthropic and proxy catalogs stay aligned.
|
||||
- Each profile level has a stored canonical `id` (`off`, `minimal`, `low`, `medium`, `high`, `xhigh`, `adaptive`, or `max`) and may include a display `label`. Binary providers use `{ id: "low", label: "on" }`.
|
||||
- Profile hooks receive merged catalog facts when available, including `reasoning`, `compat.thinkingFormat`, and `compat.supportedReasoningEfforts`. Use those facts to expose binary or custom profiles only when the configured request contract supports the matching payload.
|
||||
- Tool plugins that need to validate an explicit thinking override should use `api.runtime.agent.resolveThinkingPolicy({ provider, model })` plus `api.runtime.agent.normalizeThinkingLevel(...)`; they should not keep their own provider/model level lists.
|
||||
- Tool plugins with access to configured custom model metadata can pass `catalog` into `resolveThinkingPolicy` so `compat.supportedReasoningEfforts` opt-ins are reflected in plugin-side validation.
|
||||
- Published legacy hooks (`supportsXHighThinking`, `isBinaryThinking`, and `resolveDefaultThinkingLevel`) remain as compatibility adapters, but new custom level sets should use `resolveThinkingProfile`.
|
||||
|
||||
@@ -11,6 +11,7 @@ import {
|
||||
VLLM_PROVIDER_LABEL,
|
||||
} from "./api.js";
|
||||
import { wrapVllmProviderStream } from "./stream.js";
|
||||
import { resolveThinkingProfile } from "./thinking-policy.js";
|
||||
|
||||
const PROVIDER_ID = "vllm";
|
||||
|
||||
@@ -90,6 +91,7 @@ export default definePluginEntry({
|
||||
"vLLM requires authentication to be registered as a provider. " +
|
||||
'Set VLLM_API_KEY (any value works) or run "openclaw configure". ' +
|
||||
"See: https://docs.openclaw.ai/providers/vllm",
|
||||
resolveThinkingProfile,
|
||||
wrapStreamFn: wrapVllmProviderStream,
|
||||
});
|
||||
},
|
||||
|
||||
@@ -1,7 +1,28 @@
|
||||
import { fileURLToPath } from "node:url";
|
||||
import { registerSingleProviderPlugin } from "openclaw/plugin-sdk/plugin-test-runtime";
|
||||
import { describeVllmProviderDiscoveryContract } from "openclaw/plugin-sdk/provider-test-contracts";
|
||||
import { describe, expect, it } from "vitest";
|
||||
import vllmPlugin from "./index.js";
|
||||
|
||||
describeVllmProviderDiscoveryContract({
|
||||
load: () => import("./index.js"),
|
||||
apiModuleId: fileURLToPath(new URL("./api.js", import.meta.url)),
|
||||
});
|
||||
|
||||
describe("vLLM provider registration", () => {
|
||||
it("exposes the binary thinking profile hook", async () => {
|
||||
const provider = await registerSingleProviderPlugin(vllmPlugin);
|
||||
|
||||
expect(
|
||||
provider.resolveThinkingProfile?.({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
).toEqual({
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
62
extensions/vllm/provider-policy-api.test.ts
Normal file
62
extensions/vllm/provider-policy-api.test.ts
Normal file
@@ -0,0 +1,62 @@
|
||||
import { describe, expect, it } from "vitest";
|
||||
import { resolveThinkingProfile } from "./provider-policy-api.js";
|
||||
|
||||
describe("vLLM provider thinking policy", () => {
|
||||
it("exposes a binary profile for configured Qwen chat-template models", () => {
|
||||
expect(
|
||||
resolveThinkingProfile({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
).toEqual({
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
});
|
||||
});
|
||||
|
||||
it("uses configured Qwen compat even when catalog reasoning metadata is absent", () => {
|
||||
expect(
|
||||
resolveThinkingProfile({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
).toEqual({
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
});
|
||||
});
|
||||
|
||||
it("exposes a binary profile for vLLM Nemotron 3 reasoning models", () => {
|
||||
expect(
|
||||
resolveThinkingProfile({
|
||||
provider: "vllm",
|
||||
modelId: "nemotron-3-super",
|
||||
reasoning: true,
|
||||
}),
|
||||
).toEqual({
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
});
|
||||
});
|
||||
|
||||
it("does not flatten unconfigured or non-reasoning vLLM models", () => {
|
||||
expect(
|
||||
resolveThinkingProfile({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
}),
|
||||
).toBeNull();
|
||||
expect(
|
||||
resolveThinkingProfile({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
reasoning: false,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
).toBeNull();
|
||||
});
|
||||
});
|
||||
1
extensions/vllm/provider-policy-api.ts
Normal file
1
extensions/vllm/provider-policy-api.ts
Normal file
@@ -0,0 +1 @@
|
||||
export { resolveThinkingProfile } from "./thinking-policy.js";
|
||||
@@ -101,10 +101,19 @@ describe("createVllmQwenThinkingWrapper", () => {
|
||||
});
|
||||
});
|
||||
|
||||
it("skips non-reasoning and non-completions models", () => {
|
||||
it("patches configured Qwen models unless reasoning is explicitly disabled", () => {
|
||||
expect(capturePayload({ format: "chat-template", model: { reasoning: undefined } })).toEqual({
|
||||
chat_template_kwargs: {
|
||||
enable_thinking: true,
|
||||
preserve_thinking: true,
|
||||
},
|
||||
});
|
||||
expect(capturePayload({ format: "chat-template", model: { reasoning: false } })).toStrictEqual(
|
||||
{},
|
||||
);
|
||||
});
|
||||
|
||||
it("skips non-completions models", () => {
|
||||
expect(
|
||||
capturePayload({ format: "chat-template", model: { api: "openai-responses" as never } }),
|
||||
).toStrictEqual({});
|
||||
@@ -186,7 +195,25 @@ describe("createVllmProviderThinkingWrapper", () => {
|
||||
});
|
||||
|
||||
describe("wrapVllmProviderStream", () => {
|
||||
it("registers when vLLM Qwen thinking format params are configured", () => {
|
||||
it("registers when vLLM Qwen thinking format compat is configured", () => {
|
||||
expect(
|
||||
wrapVllmProviderStream({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
extraParams: {},
|
||||
model: {
|
||||
api: "openai-completions",
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
} as Model<"openai-completions">,
|
||||
streamFn: undefined,
|
||||
} as never),
|
||||
).toBeTypeOf("function");
|
||||
});
|
||||
|
||||
it("ignores request params when Qwen thinking format compat is not configured", () => {
|
||||
expect(
|
||||
wrapVllmProviderStream({
|
||||
provider: "vllm",
|
||||
@@ -200,22 +227,42 @@ describe("wrapVllmProviderStream", () => {
|
||||
} as Model<"openai-completions">,
|
||||
streamFn: undefined,
|
||||
} as never),
|
||||
).toBeTypeOf("function");
|
||||
).toBeUndefined();
|
||||
});
|
||||
|
||||
expect(
|
||||
wrapVllmProviderStream({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
extraParams: { qwen_thinking_format: "enable_thinking" },
|
||||
model: {
|
||||
api: "openai-completions",
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
} as Model<"openai-completions">,
|
||||
streamFn: undefined,
|
||||
} as never),
|
||||
).toBeTypeOf("function");
|
||||
it("uses model compat for Qwen thinking format", () => {
|
||||
let captured: Record<string, unknown> = {};
|
||||
const baseStreamFn: StreamFn = (_model, _context, options) => {
|
||||
const payload = {};
|
||||
options?.onPayload?.(payload, _model);
|
||||
captured = payload;
|
||||
return {} as ReturnType<StreamFn>;
|
||||
};
|
||||
const model = {
|
||||
api: "openai-completions",
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
} as unknown as Model<"openai-completions">;
|
||||
const wrapped = wrapVllmProviderStream({
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
extraParams: {},
|
||||
thinkingLevel: "off",
|
||||
model,
|
||||
streamFn: baseStreamFn,
|
||||
} as never);
|
||||
|
||||
expect(wrapped).toBeTypeOf("function");
|
||||
void wrapped?.(model, { messages: [] } as Context, {});
|
||||
|
||||
expect(captured).toEqual({
|
||||
chat_template_kwargs: {
|
||||
enable_thinking: false,
|
||||
preserve_thinking: true,
|
||||
},
|
||||
});
|
||||
});
|
||||
|
||||
it("skips unconfigured vLLM and non-vLLM providers", () => {
|
||||
@@ -237,7 +284,7 @@ describe("wrapVllmProviderStream", () => {
|
||||
wrapVllmProviderStream({
|
||||
provider: "openai",
|
||||
modelId: "gpt-5.4",
|
||||
extraParams: { qwenThinkingFormat: "chat-template" },
|
||||
extraParams: {},
|
||||
model: {
|
||||
api: "openai-completions",
|
||||
provider: "openai",
|
||||
|
||||
@@ -5,43 +5,21 @@ import {
|
||||
createPayloadPatchStreamWrapper,
|
||||
isOpenAICompatibleThinkingEnabled,
|
||||
} from "openclaw/plugin-sdk/provider-stream-shared";
|
||||
import {
|
||||
resolveVllmQwenThinkingFormatFromCompat,
|
||||
type VllmQwenThinkingFormat,
|
||||
} from "./thinking-policy.js";
|
||||
|
||||
type VllmThinkingLevel = ProviderWrapStreamFnContext["thinkingLevel"];
|
||||
type VllmQwenThinkingFormat = "chat-template" | "top-level";
|
||||
|
||||
function isVllmProviderId(providerId: string): boolean {
|
||||
return normalizeProviderId(providerId) === "vllm";
|
||||
}
|
||||
|
||||
function normalizeQwenThinkingFormat(value: unknown): VllmQwenThinkingFormat | undefined {
|
||||
if (typeof value !== "string") {
|
||||
return undefined;
|
||||
}
|
||||
const normalized = value.trim().toLowerCase().replace(/_/g, "-");
|
||||
if (
|
||||
normalized === "chat-template" ||
|
||||
normalized === "chat-template-kwargs" ||
|
||||
normalized === "chat-template-kwarg" ||
|
||||
normalized === "chat-template-arguments"
|
||||
) {
|
||||
return "chat-template";
|
||||
}
|
||||
if (
|
||||
normalized === "top-level" ||
|
||||
normalized === "enable-thinking" ||
|
||||
normalized === "request-body"
|
||||
) {
|
||||
return "top-level";
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
function resolveVllmQwenThinkingFormat(
|
||||
extraParams: ProviderWrapStreamFnContext["extraParams"],
|
||||
ctx: Pick<ProviderWrapStreamFnContext, "model">,
|
||||
): VllmQwenThinkingFormat | undefined {
|
||||
return normalizeQwenThinkingFormat(
|
||||
extraParams?.qwenThinkingFormat ?? extraParams?.qwen_thinking_format,
|
||||
);
|
||||
return resolveVllmQwenThinkingFormatFromCompat(ctx.model?.compat);
|
||||
}
|
||||
|
||||
function setQwenChatTemplateThinking(payload: Record<string, unknown>, enabled: boolean): void {
|
||||
@@ -110,7 +88,7 @@ export function createVllmQwenThinkingWrapper(params: {
|
||||
delete payloadObj.reasoning;
|
||||
},
|
||||
{
|
||||
shouldPatch: ({ model }) => model.api === "openai-completions" && model.reasoning,
|
||||
shouldPatch: ({ model }) => model.api === "openai-completions" && (model.reasoning ?? true),
|
||||
},
|
||||
);
|
||||
}
|
||||
@@ -145,7 +123,7 @@ export function wrapVllmProviderStream(ctx: ProviderWrapStreamFnContext): Stream
|
||||
if (!isVllmProviderId(ctx.provider) || (ctx.model && ctx.model.api !== "openai-completions")) {
|
||||
return undefined;
|
||||
}
|
||||
const qwenFormat = resolveVllmQwenThinkingFormat(ctx.extraParams);
|
||||
const qwenFormat = resolveVllmQwenThinkingFormat(ctx);
|
||||
const shouldHandleNemotron =
|
||||
ctx.thinkingLevel === "off" &&
|
||||
isVllmNemotronModel({
|
||||
|
||||
65
extensions/vllm/thinking-policy.ts
Normal file
65
extensions/vllm/thinking-policy.ts
Normal file
@@ -0,0 +1,65 @@
|
||||
import type {
|
||||
ProviderDefaultThinkingPolicyContext,
|
||||
ProviderThinkingProfile,
|
||||
} from "openclaw/plugin-sdk/plugin-entry";
|
||||
import { normalizeProviderId } from "openclaw/plugin-sdk/provider-model-shared";
|
||||
|
||||
export type VllmQwenThinkingFormat = "chat-template" | "top-level";
|
||||
|
||||
const VLLM_BINARY_THINKING_PROFILE = {
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
} satisfies ProviderThinkingProfile;
|
||||
|
||||
export function normalizeVllmQwenThinkingFormat(
|
||||
value: unknown,
|
||||
): VllmQwenThinkingFormat | undefined {
|
||||
if (typeof value !== "string") {
|
||||
return undefined;
|
||||
}
|
||||
const normalized = value.trim().toLowerCase().replace(/_/g, "-");
|
||||
if (
|
||||
normalized === "chat-template" ||
|
||||
normalized === "chat-template-kwargs" ||
|
||||
normalized === "chat-template-kwarg" ||
|
||||
normalized === "chat-template-arguments" ||
|
||||
normalized === "qwen-chat-template"
|
||||
) {
|
||||
return "chat-template";
|
||||
}
|
||||
if (
|
||||
normalized === "top-level" ||
|
||||
normalized === "enable-thinking" ||
|
||||
normalized === "request-body" ||
|
||||
normalized === "qwen"
|
||||
) {
|
||||
return "top-level";
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
export function resolveVllmQwenThinkingFormatFromCompat(
|
||||
compat?: ProviderDefaultThinkingPolicyContext["compat"],
|
||||
): VllmQwenThinkingFormat | undefined {
|
||||
return normalizeVllmQwenThinkingFormat(compat?.thinkingFormat);
|
||||
}
|
||||
|
||||
function isVllmNemotronThinkingModel(modelId: string): boolean {
|
||||
return /\bnemotron-3(?:[-_](?:nano|super|ultra))?\b/i.test(modelId);
|
||||
}
|
||||
|
||||
export function resolveThinkingProfile(
|
||||
ctx: ProviderDefaultThinkingPolicyContext,
|
||||
): ProviderThinkingProfile | null {
|
||||
if (normalizeProviderId(ctx.provider) !== "vllm") {
|
||||
return null;
|
||||
}
|
||||
if (ctx.reasoning === false) {
|
||||
return null;
|
||||
}
|
||||
const qwenFormat = resolveVllmQwenThinkingFormatFromCompat(ctx.compat);
|
||||
if (qwenFormat || (ctx.reasoning === true && isVllmNemotronThinkingModel(ctx.modelId))) {
|
||||
return VLLM_BINARY_THINKING_PROFILE;
|
||||
}
|
||||
return null;
|
||||
}
|
||||
@@ -1033,6 +1033,100 @@ describe("loadModelCatalog", () => {
|
||||
expect(entry.contextWindow).toBe(128_000);
|
||||
});
|
||||
|
||||
it("overlays configured model compat onto discovered catalog rows", async () => {
|
||||
mockPiDiscoveryModels([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
provider: "vllm",
|
||||
reasoning: false,
|
||||
compat: { supportsStrictMode: false },
|
||||
},
|
||||
]);
|
||||
|
||||
const result = await loadModelCatalog({
|
||||
config: {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000/v1",
|
||||
api: "openai-completions",
|
||||
models: [
|
||||
{
|
||||
id: "vllm/Qwen/Qwen3-8B",
|
||||
name: "Configured Qwen3 8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig,
|
||||
});
|
||||
|
||||
const entry = requireCatalogEntry(result, "vllm", "Qwen/Qwen3-8B");
|
||||
expect(result.filter((entry) => entry.provider === "vllm")).toHaveLength(1);
|
||||
expect(entry.name).toBe("Qwen3 8B");
|
||||
expect(entry.reasoning).toBe(true);
|
||||
expect(entry.compat).toEqual(
|
||||
expect.objectContaining({
|
||||
supportsStrictMode: false,
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
it("overlays configured model compat onto persisted read-only catalog rows", async () => {
|
||||
readFileMock.mockResolvedValue(
|
||||
JSON.stringify({
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
reasoning: false,
|
||||
compat: { supportsStrictMode: false },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
}),
|
||||
);
|
||||
|
||||
const result = await loadModelCatalog({
|
||||
config: {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000/v1",
|
||||
api: "openai-completions",
|
||||
models: [
|
||||
{
|
||||
id: "vllm/Qwen/Qwen3-8B",
|
||||
name: "Configured Qwen3 8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig,
|
||||
readOnly: true,
|
||||
});
|
||||
|
||||
const entry = requireCatalogEntry(result, "vllm", "Qwen/Qwen3-8B");
|
||||
expect(result.filter((entry) => entry.provider === "vllm")).toHaveLength(1);
|
||||
expect(entry.name).toBe("Qwen3 8B");
|
||||
expect(entry.reasoning).toBe(true);
|
||||
expect(entry.compat).toEqual(
|
||||
expect.objectContaining({
|
||||
supportsStrictMode: false,
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
it("merges manifest model catalog rows on the normal catalog path", async () => {
|
||||
mockSingleOpenAiCatalogModel();
|
||||
currentPluginMetadataSnapshotMock.mockReturnValue({
|
||||
|
||||
@@ -21,6 +21,7 @@ import { resolveDefaultAgentDir } from "./agent-scope.js";
|
||||
import { modelSupportsInput as modelCatalogEntrySupportsInput } from "./model-catalog-lookup.js";
|
||||
import type { ModelCatalogEntry, ModelInputType } from "./model-catalog.types.js";
|
||||
import {
|
||||
modelKey,
|
||||
normalizeConfiguredProviderCatalogModelId,
|
||||
type ProviderModelIdNormalizationOptions,
|
||||
} from "./model-ref-shared.js";
|
||||
@@ -112,7 +113,8 @@ function instantiatePiModelRegistry(
|
||||
}
|
||||
|
||||
function catalogEntryDedupeKey(provider: string, id: string): string {
|
||||
return `${normalizeProviderId(provider)}::${normalizeLowercaseStringOrEmpty(id)}`;
|
||||
const normalizedProvider = normalizeProviderId(provider);
|
||||
return normalizeLowercaseStringOrEmpty(modelKey(normalizedProvider, id));
|
||||
}
|
||||
|
||||
function appendCatalogEntriesIfAbsent(
|
||||
@@ -130,6 +132,52 @@ function appendCatalogEntriesIfAbsent(
|
||||
}
|
||||
}
|
||||
|
||||
function mergeCatalogCompat(
|
||||
base: ModelCatalogEntry["compat"] | undefined,
|
||||
override: ModelCatalogEntry["compat"] | undefined,
|
||||
): ModelCatalogEntry["compat"] | undefined {
|
||||
if (!base) {
|
||||
return override;
|
||||
}
|
||||
if (!override) {
|
||||
return base;
|
||||
}
|
||||
return { ...base, ...override };
|
||||
}
|
||||
|
||||
function overlayConfiguredCatalogMetadata(
|
||||
base: ModelCatalogEntry,
|
||||
configured: ModelCatalogEntry,
|
||||
): ModelCatalogEntry {
|
||||
return {
|
||||
...base,
|
||||
...(configured.contextWindow !== undefined ? { contextWindow: configured.contextWindow } : {}),
|
||||
...(configured.contextTokens !== undefined ? { contextTokens: configured.contextTokens } : {}),
|
||||
...(configured.reasoning !== undefined ? { reasoning: configured.reasoning } : {}),
|
||||
...(configured.input !== undefined ? { input: configured.input } : {}),
|
||||
compat: mergeCatalogCompat(base.compat, configured.compat),
|
||||
};
|
||||
}
|
||||
|
||||
function mergeConfiguredCatalogEntries(
|
||||
models: ModelCatalogEntry[],
|
||||
entries: ModelCatalogEntry[],
|
||||
): void {
|
||||
const indexByKey = new Map(
|
||||
models.map((entry, index) => [catalogEntryDedupeKey(entry.provider, entry.id), index]),
|
||||
);
|
||||
for (const entry of entries) {
|
||||
const key = catalogEntryDedupeKey(entry.provider, entry.id);
|
||||
const existingIndex = indexByKey.get(key);
|
||||
if (existingIndex === undefined) {
|
||||
models.push(entry);
|
||||
indexByKey.set(key, models.length - 1);
|
||||
continue;
|
||||
}
|
||||
models[existingIndex] = overlayConfiguredCatalogMetadata(models[existingIndex], entry);
|
||||
}
|
||||
}
|
||||
|
||||
export function loadManifestModelCatalog(params: {
|
||||
config: OpenClawConfig;
|
||||
workspaceDir?: string;
|
||||
@@ -319,7 +367,7 @@ async function loadReadOnlyPersistedModelCatalog(params?: {
|
||||
manifestPlugins: hasConfiguredProviderModelRows(cfg) ? getManifestPlugins() : undefined,
|
||||
});
|
||||
if (configuredModels.length > 0) {
|
||||
appendCatalogEntriesIfAbsent(models, configuredModels);
|
||||
mergeConfiguredCatalogEntries(models, configuredModels);
|
||||
}
|
||||
return sortModelCatalogEntries(models);
|
||||
}
|
||||
@@ -371,7 +419,7 @@ function loadReadOnlyStaticModelCatalog(params?: {
|
||||
manifestPlugins: configuredManifestPlugins,
|
||||
});
|
||||
if (configuredModels.length > 0) {
|
||||
appendCatalogEntriesIfAbsent(models, configuredModels);
|
||||
mergeConfiguredCatalogEntries(models, configuredModels);
|
||||
}
|
||||
return sortModelCatalogEntries(models);
|
||||
}
|
||||
@@ -537,7 +585,7 @@ export async function loadModelCatalog(params?: {
|
||||
manifestPlugins: hasConfiguredProviderModelRows(cfg) ? getManifestPlugins() : undefined,
|
||||
});
|
||||
if (configuredModels.length > 0) {
|
||||
appendCatalogEntriesIfAbsent(models, configuredModels);
|
||||
mergeConfiguredCatalogEntries(models, configuredModels);
|
||||
}
|
||||
logStage("configured-models-merged", `entries=${models.length}`);
|
||||
|
||||
|
||||
@@ -562,10 +562,6 @@ function buildModelCatalogMetadata(
|
||||
if (rawKey.trim().endsWith("/*")) {
|
||||
continue;
|
||||
}
|
||||
const alias = ((entryRaw as { alias?: string } | undefined)?.alias ?? "").trim();
|
||||
if (!alias) {
|
||||
continue;
|
||||
}
|
||||
const key = resolveAllowlistModelKey({
|
||||
cfg: params.cfg,
|
||||
raw: rawKey,
|
||||
@@ -577,7 +573,10 @@ function buildModelCatalogMetadata(
|
||||
if (!key) {
|
||||
continue;
|
||||
}
|
||||
aliasByKey.set(key, alias);
|
||||
const alias = ((entryRaw as { alias?: string } | undefined)?.alias ?? "").trim();
|
||||
if (alias) {
|
||||
aliasByKey.set(key, alias);
|
||||
}
|
||||
}
|
||||
|
||||
return { configuredByKey, aliasByKey };
|
||||
@@ -598,7 +597,10 @@ function applyModelCatalogMetadata(params: {
|
||||
const nextContextTokens = configuredEntry?.contextTokens ?? params.entry.contextTokens;
|
||||
const nextReasoning = configuredEntry?.reasoning ?? params.entry.reasoning;
|
||||
const nextInput = configuredEntry?.input ?? params.entry.input;
|
||||
const nextCompat = configuredEntry?.compat ?? params.entry.compat;
|
||||
const nextCompat =
|
||||
params.entry.compat || configuredEntry?.compat
|
||||
? { ...params.entry.compat, ...configuredEntry?.compat }
|
||||
: undefined;
|
||||
|
||||
return {
|
||||
...params.entry,
|
||||
@@ -1180,9 +1182,14 @@ export function buildConfiguredModelCatalog(params: {
|
||||
typeof model?.contextTokens === "number" && model.contextTokens > 0
|
||||
? model.contextTokens
|
||||
: undefined;
|
||||
const reasoning = typeof model?.reasoning === "boolean" ? model.reasoning : undefined;
|
||||
const input = Array.isArray(model?.input) ? model.input : undefined;
|
||||
const compat = model?.compat && typeof model.compat === "object" ? model.compat : undefined;
|
||||
const reasoning =
|
||||
typeof model?.reasoning === "boolean"
|
||||
? model.reasoning
|
||||
: isVllmQwenThinkingCompat(providerId, compat)
|
||||
? true
|
||||
: undefined;
|
||||
catalog.push({
|
||||
provider: providerId,
|
||||
id,
|
||||
@@ -1199,6 +1206,16 @@ export function buildConfiguredModelCatalog(params: {
|
||||
return catalog;
|
||||
}
|
||||
|
||||
function isVllmQwenThinkingCompat(
|
||||
providerId: string,
|
||||
compat?: { thinkingFormat?: unknown } | null,
|
||||
): boolean {
|
||||
return (
|
||||
providerId === "vllm" &&
|
||||
(compat?.thinkingFormat === "qwen" || compat?.thinkingFormat === "qwen-chat-template")
|
||||
);
|
||||
}
|
||||
|
||||
export function resolveHooksGmailModel(
|
||||
params: {
|
||||
cfg: OpenClawConfig;
|
||||
|
||||
@@ -828,6 +828,59 @@ describe("model-selection", () => {
|
||||
expect(model?.id).toBe("google/gemini-3.1-pro-preview");
|
||||
expect(model?.name).toBe("Gemini 3 Pro");
|
||||
});
|
||||
|
||||
it("carries configured model compat into catalog entries for provider policy", () => {
|
||||
const cfg = {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen 3 8B",
|
||||
reasoning: true,
|
||||
compat: {
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const model = buildConfiguredModelCatalog({ cfg }).find(
|
||||
(entry) => entry.provider === "vllm" && entry.id === "Qwen/Qwen3-8B",
|
||||
);
|
||||
expect(model?.compat).toEqual({ thinkingFormat: "qwen-chat-template" });
|
||||
expect(model?.reasoning).toBe(true);
|
||||
});
|
||||
|
||||
it("does not infer reasoning from non-vLLM thinking compat", () => {
|
||||
const cfg = {
|
||||
models: {
|
||||
providers: {
|
||||
custom: {
|
||||
models: [
|
||||
{
|
||||
id: "custom-reasoning",
|
||||
name: "Custom Reasoning",
|
||||
compat: {
|
||||
thinkingFormat: "together",
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const model = buildConfiguredModelCatalog({ cfg }).find(
|
||||
(entry) => entry.provider === "custom" && entry.id === "custom-reasoning",
|
||||
);
|
||||
expect(model?.compat).toEqual({ thinkingFormat: "together" });
|
||||
expect(model?.reasoning).toBeUndefined();
|
||||
});
|
||||
});
|
||||
|
||||
describe("buildModelAliasIndex", () => {
|
||||
@@ -953,6 +1006,43 @@ describe("model-selection", () => {
|
||||
]);
|
||||
});
|
||||
|
||||
it("overlays configured provider metadata after manifest model normalization", () => {
|
||||
const cfg: OpenClawConfig = {
|
||||
models: {
|
||||
providers: {
|
||||
nvidia: {
|
||||
models: [
|
||||
{
|
||||
id: "llama-fast",
|
||||
name: "Configured Llama Fast",
|
||||
contextWindow: 128_000,
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const result = buildAllowedModelSet({
|
||||
cfg,
|
||||
catalog: [{ provider: "nvidia", id: "nvidia/llama-fast", name: "Runtime Llama Fast" }],
|
||||
defaultProvider: "anthropic",
|
||||
});
|
||||
|
||||
expect(result.allowedCatalog).toEqual([
|
||||
{
|
||||
provider: "nvidia",
|
||||
id: "nvidia/llama-fast",
|
||||
name: "Configured Llama Fast",
|
||||
contextWindow: 128_000,
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
]);
|
||||
});
|
||||
|
||||
it("keeps configured provider models visible when the catalog is otherwise allow-any", () => {
|
||||
const cfg: OpenClawConfig = {
|
||||
agents: {
|
||||
|
||||
@@ -1499,6 +1499,101 @@ describe("resolveModel", () => {
|
||||
expect(result.model?.reasoning).toBe(true);
|
||||
});
|
||||
|
||||
it("propagates compat from matching configured fallback model", () => {
|
||||
const cfg = {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000",
|
||||
api: "openai-completions",
|
||||
models: [
|
||||
{
|
||||
...makeModel("Qwen/Qwen3-8B"),
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const result = resolveModelForTest("vllm", "Qwen/Qwen3-8B", "/tmp/agent", cfg);
|
||||
|
||||
expect(result.error).toBeUndefined();
|
||||
expect(result.model?.compat).toEqual(
|
||||
expect.objectContaining({ thinkingFormat: "qwen-chat-template" }),
|
||||
);
|
||||
expect(result.model?.reasoning).toBe(false);
|
||||
});
|
||||
|
||||
it("lets configured vLLM Qwen compat override stale discovered reasoning", () => {
|
||||
mockDiscoveredModel(discoverModels, {
|
||||
provider: "vllm",
|
||||
modelId: "Qwen/Qwen3-8B",
|
||||
templateModel: {
|
||||
...makeModel("Qwen/Qwen3-8B"),
|
||||
provider: "vllm",
|
||||
api: "openai-completions",
|
||||
baseUrl: "http://localhost:9000",
|
||||
reasoning: false,
|
||||
compat: { supportsStrictMode: false },
|
||||
},
|
||||
});
|
||||
const cfg = {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000",
|
||||
api: "openai-completions",
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const result = resolveModelForTest("vllm", "Qwen/Qwen3-8B", "/tmp/agent", cfg);
|
||||
|
||||
expect(result.error).toBeUndefined();
|
||||
expect(result.model?.reasoning).toBe(true);
|
||||
expect(result.model?.compat).toEqual(
|
||||
expect.objectContaining({
|
||||
supportsStrictMode: false,
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
it("infers reasoning for matching vLLM Qwen compat fallback models", () => {
|
||||
const cfg = {
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000",
|
||||
api: "openai-completions",
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as unknown as OpenClawConfig;
|
||||
|
||||
const result = resolveModelForTest("vllm", "Qwen/Qwen3-8B", "/tmp/agent", cfg);
|
||||
|
||||
expect(result.error).toBeUndefined();
|
||||
expect(result.model?.reasoning).toBe(true);
|
||||
});
|
||||
|
||||
it("propagates image input capability from matching configured fallback model", () => {
|
||||
const cfg = {
|
||||
models: {
|
||||
|
||||
@@ -5,7 +5,7 @@ import {
|
||||
type AuthStorage,
|
||||
type ModelRegistry,
|
||||
} from "@earendil-works/pi-coding-agent";
|
||||
import type { ModelMediaInputConfig } from "../../config/types.models.js";
|
||||
import type { ModelCompatConfig, ModelMediaInputConfig } from "../../config/types.models.js";
|
||||
import type { OpenClawConfig } from "../../config/types.openclaw.js";
|
||||
import type { ProviderRuntimeModel } from "../../plugins/provider-runtime-model.types.js";
|
||||
import {
|
||||
@@ -688,6 +688,14 @@ function applyConfiguredProviderOverrides(params: {
|
||||
metadataOverrideModel?.contextWindow ?? providerConfig.contextWindow;
|
||||
const resolvedMaxTokens =
|
||||
metadataOverrideModel?.maxTokens ?? providerConfig.maxTokens ?? discoveredModel.maxTokens;
|
||||
const resolvedCompat = mergeModelCompat(discoveredModel.compat, metadataOverrideModel?.compat);
|
||||
const resolvedReasoning = resolveMergedConfiguredModelReasoning({
|
||||
provider: params.provider,
|
||||
configuredCompat: metadataOverrideModel?.compat,
|
||||
resolvedCompat,
|
||||
configuredReasoning: metadataOverrideModel?.reasoning,
|
||||
discoveredReasoning: discoveredModel.reasoning,
|
||||
});
|
||||
const requestConfig = resolveProviderRequestConfig({
|
||||
provider: params.provider,
|
||||
api:
|
||||
@@ -710,7 +718,7 @@ function applyConfiguredProviderOverrides(params: {
|
||||
...discoveredModel,
|
||||
api: requestConfig.api ?? "openai-responses",
|
||||
baseUrl: requestConfig.baseUrl ?? discoveredModel.baseUrl,
|
||||
reasoning: metadataOverrideModel?.reasoning ?? discoveredModel.reasoning,
|
||||
reasoning: resolvedReasoning,
|
||||
input: normalizedInput,
|
||||
cost: metadataOverrideModel?.cost ?? discoveredModel.cost,
|
||||
contextWindow: resolvedContextWindow ?? discoveredModel.contextWindow,
|
||||
@@ -725,7 +733,7 @@ function applyConfiguredProviderOverrides(params: {
|
||||
...(resolvedParams ? { params: resolvedParams } : {}),
|
||||
...(requestTimeoutMs !== undefined ? { requestTimeoutMs } : {}),
|
||||
headers: requestConfig.headers,
|
||||
compat: metadataOverrideModel?.compat ?? discoveredModel.compat,
|
||||
compat: resolvedCompat,
|
||||
mediaInput: mergeModelMediaInput(
|
||||
discoveredModel.mediaInput,
|
||||
metadataOverrideModel?.mediaInput,
|
||||
@@ -778,6 +786,11 @@ function resolveExplicitModelWithRegistry(params: {
|
||||
workspaceDir,
|
||||
model: {
|
||||
...inlineMatch,
|
||||
reasoning: resolveConfiguredModelReasoning({
|
||||
provider,
|
||||
compat: inlineMatch.compat,
|
||||
reasoning: inlineMatch.reasoning,
|
||||
}),
|
||||
...(resolvedParams ? { params: resolvedParams } : {}),
|
||||
...(requestTimeoutMs !== undefined ? { requestTimeoutMs } : {}),
|
||||
} as Model<Api>,
|
||||
@@ -842,6 +855,11 @@ function resolveExplicitModelWithRegistry(params: {
|
||||
workspaceDir,
|
||||
model: {
|
||||
...fallbackInlineMatch,
|
||||
reasoning: resolveConfiguredModelReasoning({
|
||||
provider,
|
||||
compat: fallbackInlineMatch.compat,
|
||||
reasoning: fallbackInlineMatch.reasoning,
|
||||
}),
|
||||
...(resolvedParams ? { params: resolvedParams } : {}),
|
||||
...(requestTimeoutMs !== undefined ? { requestTimeoutMs } : {}),
|
||||
} as Model<Api>,
|
||||
@@ -961,6 +979,11 @@ function resolveConfiguredFallbackModel(params: {
|
||||
capability: "llm",
|
||||
transport: "stream",
|
||||
});
|
||||
const fallbackReasoning = resolveConfiguredFallbackReasoning({
|
||||
provider,
|
||||
compat: configuredModel?.compat,
|
||||
reasoning: configuredModel?.reasoning,
|
||||
});
|
||||
return normalizeResolvedModel({
|
||||
provider,
|
||||
cfg,
|
||||
@@ -974,7 +997,7 @@ function resolveConfiguredFallbackModel(params: {
|
||||
api: requestConfig.api ?? "openai-responses",
|
||||
provider,
|
||||
baseUrl: requestConfig.baseUrl,
|
||||
reasoning: configuredModel?.reasoning ?? false,
|
||||
reasoning: fallbackReasoning,
|
||||
input: resolveProviderModelInput({
|
||||
provider,
|
||||
modelId,
|
||||
@@ -999,6 +1022,7 @@ function resolveConfiguredFallbackModel(params: {
|
||||
...(resolvedParams ? { params: resolvedParams } : {}),
|
||||
...(requestTimeoutMs !== undefined ? { requestTimeoutMs } : {}),
|
||||
headers: requestConfig.headers,
|
||||
compat: configuredModel?.compat,
|
||||
mediaInput: configuredModel?.mediaInput,
|
||||
} as Model<Api>,
|
||||
providerRequest,
|
||||
@@ -1033,6 +1057,71 @@ function shouldCompareProviderRuntimeResolvedModel(params: {
|
||||
);
|
||||
}
|
||||
|
||||
function resolveConfiguredFallbackReasoning(params: {
|
||||
provider: string;
|
||||
compat?: { thinkingFormat?: string } | null;
|
||||
reasoning?: boolean;
|
||||
}): boolean {
|
||||
return resolveConfiguredModelReasoning(params) ?? false;
|
||||
}
|
||||
|
||||
function resolveConfiguredModelReasoning(params: {
|
||||
provider: string;
|
||||
compat?: { thinkingFormat?: string } | null;
|
||||
reasoning?: boolean;
|
||||
}): boolean | undefined {
|
||||
if (params.reasoning !== undefined) {
|
||||
return params.reasoning;
|
||||
}
|
||||
return isVllmQwenThinkingCompat(params) ? true : undefined;
|
||||
}
|
||||
|
||||
function resolveMergedConfiguredModelReasoning(params: {
|
||||
provider: string;
|
||||
configuredCompat?: { thinkingFormat?: string } | null;
|
||||
resolvedCompat?: { thinkingFormat?: string } | null;
|
||||
configuredReasoning?: boolean;
|
||||
discoveredReasoning?: boolean;
|
||||
}): boolean {
|
||||
if (params.configuredReasoning !== undefined) {
|
||||
return params.configuredReasoning;
|
||||
}
|
||||
if (isVllmQwenThinkingCompat({ provider: params.provider, compat: params.configuredCompat })) {
|
||||
return true;
|
||||
}
|
||||
return (
|
||||
resolveConfiguredModelReasoning({
|
||||
provider: params.provider,
|
||||
compat: params.resolvedCompat,
|
||||
reasoning: params.discoveredReasoning,
|
||||
}) ?? false
|
||||
);
|
||||
}
|
||||
|
||||
function isVllmQwenThinkingCompat(params: {
|
||||
provider: string;
|
||||
compat?: { thinkingFormat?: string } | null;
|
||||
}): boolean {
|
||||
const thinkingFormat = params.compat?.thinkingFormat;
|
||||
return (
|
||||
normalizeProviderId(params.provider) === "vllm" &&
|
||||
(thinkingFormat === "qwen" || thinkingFormat === "qwen-chat-template")
|
||||
);
|
||||
}
|
||||
|
||||
function mergeModelCompat(
|
||||
base: ModelCompatConfig | undefined,
|
||||
override: ModelCompatConfig | undefined,
|
||||
): ModelCompatConfig | undefined {
|
||||
if (!base) {
|
||||
return override;
|
||||
}
|
||||
if (!override) {
|
||||
return base;
|
||||
}
|
||||
return { ...base, ...override };
|
||||
}
|
||||
|
||||
function preferProviderRuntimeResolvedModel(params: {
|
||||
explicitModel: Model<Api>;
|
||||
runtimeResolvedModel?: Model<Api>;
|
||||
|
||||
@@ -63,7 +63,7 @@ const makeConfiguredModel = (overrides: Record<string, unknown> = {}) => ({
|
||||
id: "gpt-5.4",
|
||||
name: "GPT-5.4",
|
||||
reasoning: true,
|
||||
input: ["text"],
|
||||
input: ["text"] as Array<"text">,
|
||||
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
|
||||
contextWindow: 128_000,
|
||||
maxTokens: 16_384,
|
||||
@@ -215,6 +215,115 @@ describe("createModelSelectionState catalog loading", () => {
|
||||
expect(loadModelCatalog).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("keeps configured compat when manifest thinking metadata is used", async () => {
|
||||
vi.mocked(loadModelCatalog).mockClear();
|
||||
vi.mocked(loadManifestModelCatalog).mockReturnValueOnce([
|
||||
{ provider: "vllm", id: "Qwen/Qwen3-8B", name: "Qwen3", reasoning: true },
|
||||
]);
|
||||
const cfg = {
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000/v1",
|
||||
models: [
|
||||
makeConfiguredModel({
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as OpenClawConfig;
|
||||
|
||||
const state = await createModelSelectionState({
|
||||
cfg,
|
||||
agentCfg: cfg.agents?.defaults,
|
||||
defaultProvider: "vllm",
|
||||
defaultModel: "Qwen/Qwen3-8B",
|
||||
provider: "vllm",
|
||||
model: "Qwen/Qwen3-8B",
|
||||
hasModelDirective: false,
|
||||
});
|
||||
|
||||
await expect(state.resolveThinkingCatalog()).resolves.toEqual([
|
||||
expect.objectContaining({
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
]);
|
||||
expect(loadModelCatalog).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("keeps configured compat when runtime thinking catalog is already loaded", async () => {
|
||||
vi.mocked(loadModelCatalog).mockClear();
|
||||
vi.mocked(loadModelCatalog).mockResolvedValueOnce([
|
||||
{
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3",
|
||||
reasoning: true,
|
||||
compat: { supportedReasoningEfforts: ["xhigh"] },
|
||||
},
|
||||
]);
|
||||
const cfg = {
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:9000/v1",
|
||||
models: [
|
||||
makeConfiguredModel({
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
}),
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
} as OpenClawConfig;
|
||||
|
||||
const state = await createModelSelectionState({
|
||||
cfg,
|
||||
agentCfg: cfg.agents?.defaults,
|
||||
defaultProvider: "vllm",
|
||||
defaultModel: "Qwen/Qwen3-8B",
|
||||
provider: "vllm",
|
||||
model: "Qwen/Qwen3-8B",
|
||||
hasModelDirective: true,
|
||||
});
|
||||
|
||||
await expect(state.resolveThinkingCatalog()).resolves.toEqual([
|
||||
expect.objectContaining({
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: {
|
||||
supportedReasoningEfforts: ["xhigh"],
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
},
|
||||
}),
|
||||
]);
|
||||
expect(loadModelCatalog).toHaveBeenCalledOnce();
|
||||
});
|
||||
|
||||
it("prefers per-agent thinkingDefault over model and global defaults", async () => {
|
||||
vi.mocked(loadModelCatalog).mockClear();
|
||||
const cfg = {
|
||||
|
||||
@@ -97,10 +97,8 @@ function findSelectedCatalogEntry(params: {
|
||||
model: string;
|
||||
}): ModelCatalogEntry | undefined {
|
||||
const normalizedProvider = normalizeProviderId(params.provider);
|
||||
return params.catalog?.find(
|
||||
(entry) =>
|
||||
normalizeProviderId(entry.provider) === normalizedProvider && entry.id === params.model,
|
||||
);
|
||||
const selectedKey = modelKey(normalizedProvider, params.model);
|
||||
return params.catalog?.find((entry) => modelKey(entry.provider, entry.id) === selectedKey);
|
||||
}
|
||||
|
||||
export async function createModelSelectionState(params: {
|
||||
@@ -360,6 +358,15 @@ export async function createModelSelectionState(params: {
|
||||
|
||||
let thinkingCatalog: ModelCatalog | undefined;
|
||||
let manifestModelCatalog: ModelCatalog | null = null;
|
||||
const buildThinkingCatalog = (catalog: ModelCatalog): ModelCatalog =>
|
||||
createModelVisibilityPolicy({
|
||||
cfg,
|
||||
catalog,
|
||||
defaultProvider,
|
||||
defaultModel,
|
||||
agentId: params.agentId,
|
||||
...RUNTIME_MODEL_VISIBILITY_NORMALIZATION,
|
||||
}).allowedCatalog;
|
||||
const loadManifestCatalogForThinking = async () => {
|
||||
if (manifestModelCatalog) {
|
||||
return manifestModelCatalog;
|
||||
@@ -377,7 +384,11 @@ export async function createModelSelectionState(params: {
|
||||
return thinkingCatalog;
|
||||
}
|
||||
let catalogForThinking =
|
||||
modelCatalog && modelCatalog.length > 0 ? modelCatalog : allowedModelCatalog;
|
||||
allowedModelCatalog.length > 0
|
||||
? allowedModelCatalog
|
||||
: modelCatalog && modelCatalog.length > 0
|
||||
? buildThinkingCatalog(modelCatalog)
|
||||
: [];
|
||||
let selectedCatalogEntry = findSelectedCatalogEntry({
|
||||
catalog: catalogForThinking,
|
||||
provider,
|
||||
@@ -387,7 +398,7 @@ export async function createModelSelectionState(params: {
|
||||
// allowlist rows know only provider/id; manifest rows can prove reasoning
|
||||
// support without opening the Pi auth-backed model registry.
|
||||
if (!modelCatalog && selectedCatalogEntry?.reasoning === undefined) {
|
||||
const manifestCatalog = await loadManifestCatalogForThinking();
|
||||
const manifestCatalog = buildThinkingCatalog(await loadManifestCatalogForThinking());
|
||||
const manifestSelectedEntry = findSelectedCatalogEntry({
|
||||
catalog: manifestCatalog,
|
||||
provider,
|
||||
@@ -403,13 +414,16 @@ export async function createModelSelectionState(params: {
|
||||
if (shouldHydrateRuntimeCatalog) {
|
||||
modelCatalog = await (await loadModelCatalogRuntime()).loadModelCatalog({ config: cfg });
|
||||
logStage("catalog-loaded-for-thinking", `entries=${modelCatalog.length}`);
|
||||
const runtimeSelectedEntry = modelCatalog.find(
|
||||
(entry) => entry.provider === provider && entry.id === model,
|
||||
);
|
||||
const runtimeCatalog = buildThinkingCatalog(modelCatalog);
|
||||
const runtimeSelectedEntry = findSelectedCatalogEntry({
|
||||
catalog: runtimeCatalog,
|
||||
provider,
|
||||
model,
|
||||
});
|
||||
catalogForThinking =
|
||||
runtimeSelectedEntry || !catalogForThinking || catalogForThinking.length === 0
|
||||
? modelCatalog.length > 0
|
||||
? modelCatalog
|
||||
? runtimeCatalog.length > 0
|
||||
? runtimeCatalog
|
||||
: allowedModelCatalog
|
||||
: allowedModelCatalog;
|
||||
}
|
||||
|
||||
@@ -27,6 +27,7 @@ export type ThinkingCatalogEntry = {
|
||||
id: string;
|
||||
reasoning?: boolean;
|
||||
compat?: {
|
||||
thinkingFormat?: string;
|
||||
supportedReasoningEfforts?: readonly string[] | null;
|
||||
} | null;
|
||||
};
|
||||
|
||||
@@ -255,6 +255,64 @@ describe("listThinkingLevels", () => {
|
||||
).toBe("max");
|
||||
});
|
||||
|
||||
it("passes catalog compat into provider thinking profiles", () => {
|
||||
providerRuntimeMocks.resolveProviderThinkingProfile.mockImplementation(({ context }) =>
|
||||
context.reasoning === true && context.compat?.thinkingFormat === "qwen-chat-template"
|
||||
? {
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
}
|
||||
: undefined,
|
||||
);
|
||||
const catalog = [
|
||||
{
|
||||
provider: "vllm",
|
||||
id: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
];
|
||||
|
||||
expect(listThinkingLevelLabels("vllm", "Qwen/Qwen3-8B", catalog)).toEqual(["off", "on"]);
|
||||
expect(
|
||||
resolveSupportedThinkingLevel({
|
||||
provider: "vllm",
|
||||
model: "Qwen/Qwen3-8B",
|
||||
level: "high",
|
||||
catalog,
|
||||
}),
|
||||
).toBe("low");
|
||||
});
|
||||
|
||||
it("matches provider-qualified catalog ids for provider thinking profiles", () => {
|
||||
providerRuntimeMocks.resolveProviderThinkingProfile.mockImplementation(({ context }) =>
|
||||
context.reasoning === true && context.compat?.thinkingFormat === "qwen-chat-template"
|
||||
? {
|
||||
levels: [{ id: "off" }, { id: "low", label: "on" }],
|
||||
defaultLevel: "off",
|
||||
}
|
||||
: undefined,
|
||||
);
|
||||
const catalog = [
|
||||
{
|
||||
provider: "vllm",
|
||||
id: "vllm/Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
];
|
||||
|
||||
expect(listThinkingLevelLabels("vllm", "Qwen/Qwen3-8B", catalog)).toEqual(["off", "on"]);
|
||||
expect(
|
||||
resolveSupportedThinkingLevel({
|
||||
provider: "vllm",
|
||||
model: "Qwen/Qwen3-8B",
|
||||
level: "high",
|
||||
catalog,
|
||||
}),
|
||||
).toBe("low");
|
||||
});
|
||||
|
||||
it("uses catalog compat reasoning efforts to expose xhigh for configured custom models", () => {
|
||||
const catalog = [
|
||||
{
|
||||
|
||||
@@ -57,6 +57,22 @@ type ResolvedThinkingProfile = {
|
||||
defaultLevel?: ThinkLevel | null;
|
||||
};
|
||||
|
||||
function buildCatalogModelKey(provider: string, model: string): string {
|
||||
const providerId = provider.trim();
|
||||
const modelId = model.trim();
|
||||
if (!providerId) {
|
||||
return modelId;
|
||||
}
|
||||
if (!modelId) {
|
||||
return providerId;
|
||||
}
|
||||
return normalizeOptionalLowercaseString(modelId)?.startsWith(
|
||||
`${normalizeOptionalLowercaseString(providerId)}/`,
|
||||
)
|
||||
? modelId
|
||||
: `${providerId}/${modelId}`;
|
||||
}
|
||||
|
||||
function resolveThinkingPolicyContext(params: {
|
||||
provider?: string | null;
|
||||
model?: string | null;
|
||||
@@ -66,8 +82,12 @@ function resolveThinkingPolicyContext(params: {
|
||||
const normalizedProvider = providerRaw ? normalizeProviderId(providerRaw) : "";
|
||||
const modelId = normalizeOptionalString(params.model) ?? "";
|
||||
const modelKey = normalizeOptionalLowercaseString(params.model) ?? "";
|
||||
const selectedCatalogKey =
|
||||
normalizedProvider && modelId ? buildCatalogModelKey(normalizedProvider, modelId) : undefined;
|
||||
const candidate = params.catalog?.find(
|
||||
(entry) => normalizeProviderId(entry.provider) === normalizedProvider && entry.id === modelId,
|
||||
(entry) =>
|
||||
selectedCatalogKey !== undefined &&
|
||||
buildCatalogModelKey(normalizeProviderId(entry.provider), entry.id) === selectedCatalogKey,
|
||||
);
|
||||
return {
|
||||
normalizedProvider,
|
||||
@@ -165,6 +185,7 @@ export function resolveThinkingProfile(params: {
|
||||
provider: context.normalizedProvider,
|
||||
modelId: context.modelId,
|
||||
reasoning: context.reasoning,
|
||||
compat: context.compat,
|
||||
};
|
||||
const pluginProfile = resolveProviderThinkingProfile({
|
||||
provider: context.normalizedProvider,
|
||||
|
||||
@@ -1700,6 +1700,647 @@ describe("legacy model compat migrate", () => {
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen thinking params to model compat", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [{ id: "Qwen/Qwen3-8B", name: "Qwen3 8B" }],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.models?.["vllm/Qwen/Qwen3-8B"]?.params).toEqual({
|
||||
temperature: 0.2,
|
||||
});
|
||||
expect(res.config?.models?.providers?.vllm?.models?.[0]?.compat).toEqual({
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
});
|
||||
expect(res.config?.models?.providers?.vllm?.models?.[0]?.reasoning).toBe(true);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.defaults.models."vllm/Qwen/Qwen3-8B".params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen thinking params from normalized agent model refs", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"VLLM/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.models?.["VLLM/Qwen/Qwen3-8B"]).not.toHaveProperty(
|
||||
"params",
|
||||
);
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.defaults.models."VLLM/Qwen/Qwen3-8B".params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("creates a vLLM model row for legacy Qwen top-level thinking params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwen_thinking_format: "enable_thinking",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.models?.["vllm/Qwen/Qwen3-8B"]).not.toHaveProperty(
|
||||
"params",
|
||||
);
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.defaults.models."vllm/Qwen/Qwen3-8B".params.qwen_thinking_format to models.providers.vllm.models[0].compat.thinkingFormat ("qwen").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("preserves existing vLLM model compat when removing legacy Qwen thinking params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "top-level",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.models?.["vllm/Qwen/Qwen3-8B"]).not.toHaveProperty(
|
||||
"params",
|
||||
);
|
||||
expect(res.config?.models?.providers?.vllm?.models?.[0]?.compat).toEqual({
|
||||
thinkingFormat: "qwen-chat-template",
|
||||
});
|
||||
expect(res.config?.models?.providers?.vllm?.models?.[0]?.reasoning).toBe(true);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Removed agents.defaults.models."vllm/Qwen/Qwen3-8B".params.qwenThinkingFormat; models.providers.vllm.models[0].compat.thinkingFormat is already "qwen-chat-template".',
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen thinking params onto provider-qualified model rows", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [{ id: "vllm/Qwen/Qwen3-8B", name: "Qwen3 8B" }],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "vllm/Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.defaults.models."vllm/Qwen/Qwen3-8B".params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen model-row params to model compat", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.models?.[0]).toEqual({
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
reasoning: true,
|
||||
params: { temperature: 0.2 },
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
});
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved models.providers.vllm.models[0].params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen provider params to model compat rows", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
params: {
|
||||
qwen_thinking_format: "enable_thinking",
|
||||
temperature: 0.2,
|
||||
},
|
||||
models: [
|
||||
{ id: "Qwen/Qwen3-8B", name: "Qwen3 8B" },
|
||||
{ id: "Qwen/Qwen3-14B", name: "Qwen3 14B" },
|
||||
],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen3 8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
{
|
||||
id: "Qwen/Qwen3-14B",
|
||||
name: "Qwen3 14B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved models.providers.vllm.params.qwen_thinking_format to models.providers.vllm.models[0].compat.thinkingFormat ("qwen").',
|
||||
'Moved models.providers.vllm.params.qwen_thinking_format to models.providers.vllm.models[1].compat.thinkingFormat ("qwen").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen provider params to existing and selected model rows", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "vllm/Qwen/Qwen3-8B" },
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
models: [{ id: "Qwen/Qwen3-14B", name: "Qwen3 14B" }],
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-14B",
|
||||
name: "Qwen3 14B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved models.providers.vllm.params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
'Moved models.providers.vllm.params.qwenThinkingFormat to models.providers.vllm.models[1].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("removes untargeted legacy vLLM Qwen provider params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
baseUrl: "http://localhost:8000/v1",
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm).toEqual({
|
||||
baseUrl: "http://localhost:8000/v1",
|
||||
params: { temperature: 0.2 },
|
||||
});
|
||||
expect(res.changes).toStrictEqual([
|
||||
"Removed models.providers.vllm.params.qwenThinkingFormat; no concrete vLLM model row or agent model ref exists, so configure models.providers.vllm.models[].compat.thinkingFormat on each Qwen model that needs it.",
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen provider params using the default selected model", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "vllm/Qwen/Qwen3-8B" },
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved models.providers.vllm.params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("preserves normalized vLLM provider keys when moving provider params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "vllm/Qwen/Qwen3-8B" },
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
VLLM: {
|
||||
baseUrl: "http://localhost:8000/v1",
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm).toBeUndefined();
|
||||
expect(res.config?.models?.providers?.VLLM).toEqual({
|
||||
baseUrl: "http://localhost:8000/v1",
|
||||
params: { temperature: 0.2 },
|
||||
models: [
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
],
|
||||
});
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved models.providers.vllm.params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("strips auth profile suffixes when moving legacy vLLM Qwen params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "vllm/Qwen/Qwen3-8B@local" },
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen default agent params to the selected model compat row", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "vllm/Qwen/Qwen3-8B" },
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.defaults.params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("removes untargeted legacy vLLM Qwen default agent params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.defaults?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.changes).toStrictEqual([
|
||||
"Removed agents.defaults.params.qwenThinkingFormat; no concrete vLLM model row or agent model ref exists, so configure models.providers.vllm.models[].compat.thinkingFormat on each Qwen model that needs it.",
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen per-agent params to the agent model compat row", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
list: [
|
||||
{
|
||||
id: "local",
|
||||
model: "vllm/Qwen/Qwen3-8B",
|
||||
params: {
|
||||
qwen_thinking_format: "enable_thinking",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.list?.[0]?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.list[0].params.qwen_thinking_format to models.providers.vllm.models[0].compat.thinkingFormat ("qwen").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("removes untargeted legacy vLLM Qwen per-agent params", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
list: [
|
||||
{
|
||||
id: "local",
|
||||
params: {
|
||||
qwen_thinking_format: "enable_thinking",
|
||||
temperature: 0.2,
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.list?.[0]?.params).toEqual({ temperature: 0.2 });
|
||||
expect(res.changes).toStrictEqual([
|
||||
"Removed agents.list[0].params.qwen_thinking_format; no concrete vLLM model row or agent model ref exists, so configure models.providers.vllm.models[].compat.thinkingFormat on each Qwen model that needs it.",
|
||||
]);
|
||||
});
|
||||
|
||||
it("moves legacy vLLM Qwen per-agent params using the inherited default model", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
model: "vllm/Qwen/Qwen3-8B",
|
||||
},
|
||||
list: [
|
||||
{
|
||||
id: "local",
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config?.agents?.list?.[0]).not.toHaveProperty("params");
|
||||
expect(res.config?.models?.providers?.vllm?.models).toEqual([
|
||||
{
|
||||
id: "Qwen/Qwen3-8B",
|
||||
name: "Qwen/Qwen3-8B",
|
||||
reasoning: true,
|
||||
compat: { thinkingFormat: "qwen-chat-template" },
|
||||
},
|
||||
]);
|
||||
expect(res.changes).toStrictEqual([
|
||||
'Moved agents.list[0].params.qwenThinkingFormat to models.providers.vllm.models[0].compat.thinkingFormat ("qwen-chat-template").',
|
||||
]);
|
||||
});
|
||||
|
||||
it("leaves legacy vLLM Qwen thinking params when the model compat row cannot be written", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: {
|
||||
models: "malformed",
|
||||
},
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config).toBeNull();
|
||||
expect(res.changes).toStrictEqual([]);
|
||||
});
|
||||
|
||||
it("leaves malformed vLLM provider ancestors untouched during legacy Qwen migration", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
models: {
|
||||
providers: {
|
||||
vllm: "malformed",
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(res.config).toBeNull();
|
||||
expect(res.changes).toStrictEqual([]);
|
||||
});
|
||||
|
||||
it("reports legacy vLLM Qwen thinking params before doctor fix", () => {
|
||||
const raw = {
|
||||
agents: {
|
||||
defaults: {
|
||||
models: {
|
||||
"vllm/Qwen/Qwen3-8B": {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
};
|
||||
|
||||
expect(findLegacyConfigIssues(raw).map((issue) => issue.path)).toContain(
|
||||
"agents.defaults.models",
|
||||
);
|
||||
});
|
||||
|
||||
it("reports legacy vLLM Qwen thinking params from merged extra-param sources", () => {
|
||||
const raw = {
|
||||
agents: {
|
||||
defaults: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
list: [
|
||||
{
|
||||
id: "local",
|
||||
params: {
|
||||
qwen_thinking_format: "enable_thinking",
|
||||
},
|
||||
},
|
||||
],
|
||||
},
|
||||
};
|
||||
|
||||
expect(findLegacyConfigIssues(raw).map((issue) => issue.path)).toEqual(
|
||||
expect.arrayContaining(["agents.defaults.params", "agents"]),
|
||||
);
|
||||
});
|
||||
|
||||
it("reports legacy vLLM Qwen params from normalized provider keys", () => {
|
||||
const raw = {
|
||||
models: {
|
||||
providers: {
|
||||
VLLM: {
|
||||
params: {
|
||||
qwenThinkingFormat: "chat-template",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
};
|
||||
|
||||
expect(findLegacyConfigIssues(raw).map((issue) => issue.path)).toContain("models.providers");
|
||||
});
|
||||
|
||||
it("preserves recognized model compat thinkingFormat values", () => {
|
||||
const res = migrateLegacyConfigForTest({
|
||||
models: {
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
import { splitTrailingAuthProfile } from "../../../agents/model-ref-profile.js";
|
||||
import { normalizeProviderId } from "../../../agents/provider-id.js";
|
||||
import {
|
||||
defineLegacyConfigMigration,
|
||||
ensureRecord,
|
||||
getRecord,
|
||||
type LegacyConfigMigrationSpec,
|
||||
type LegacyConfigRule,
|
||||
@@ -78,6 +80,394 @@ function hasInvalidThinkingFormat(providers: unknown): boolean {
|
||||
return false;
|
||||
}
|
||||
|
||||
const LEGACY_VLLM_QWEN_THINKING_FORMAT_KEYS = [
|
||||
"qwenThinkingFormat",
|
||||
"qwen_thinking_format",
|
||||
] as const;
|
||||
|
||||
function normalizeLegacyVllmQwenThinkingFormat(
|
||||
value: unknown,
|
||||
): "qwen" | "qwen-chat-template" | undefined {
|
||||
if (typeof value !== "string") {
|
||||
return undefined;
|
||||
}
|
||||
const normalized = value
|
||||
.trim()
|
||||
.toLowerCase()
|
||||
.replace(/[_\s]+/g, "-");
|
||||
switch (normalized) {
|
||||
case "chat-template":
|
||||
case "chat-template-argument":
|
||||
case "chat-template-arguments":
|
||||
case "chat-template-kwarg":
|
||||
case "chat-template-kwargs":
|
||||
case "qwen-chat-template":
|
||||
return "qwen-chat-template";
|
||||
case "enable-thinking":
|
||||
case "qwen":
|
||||
case "request-body":
|
||||
case "top-level":
|
||||
return "qwen";
|
||||
default:
|
||||
return undefined;
|
||||
}
|
||||
}
|
||||
|
||||
function getLegacyVllmQwenThinkingFormat(params: Record<string, unknown>):
|
||||
| {
|
||||
key: (typeof LEGACY_VLLM_QWEN_THINKING_FORMAT_KEYS)[number];
|
||||
value: unknown;
|
||||
compat: "qwen" | "qwen-chat-template" | undefined;
|
||||
}
|
||||
| undefined {
|
||||
for (const key of LEGACY_VLLM_QWEN_THINKING_FORMAT_KEYS) {
|
||||
if (Object.prototype.hasOwnProperty.call(params, key)) {
|
||||
return {
|
||||
key,
|
||||
value: params[key],
|
||||
compat: normalizeLegacyVllmQwenThinkingFormat(params[key]),
|
||||
};
|
||||
}
|
||||
}
|
||||
return undefined;
|
||||
}
|
||||
|
||||
function parseVllmAgentModelKey(key: string): string | undefined {
|
||||
const trimmed = splitTrailingAuthProfile(key).model.trim();
|
||||
const slashIndex = trimmed.indexOf("/");
|
||||
if (slashIndex <= 0) {
|
||||
return undefined;
|
||||
}
|
||||
const providerId = trimmed.slice(0, slashIndex);
|
||||
if (normalizeProviderId(providerId) !== "vllm") {
|
||||
return undefined;
|
||||
}
|
||||
const modelId = trimmed.slice(slashIndex + 1).trim();
|
||||
return modelId && modelId !== "*" ? modelId : undefined;
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingFormat(defaultModels: unknown): boolean {
|
||||
const models = getRecord(defaultModels);
|
||||
if (!models) {
|
||||
return false;
|
||||
}
|
||||
for (const [key, entry] of Object.entries(models)) {
|
||||
if (!parseVllmAgentModelKey(key)) {
|
||||
continue;
|
||||
}
|
||||
const params = getRecord(getRecord(entry)?.params);
|
||||
if (params && getLegacyVllmQwenThinkingFormat(params)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingProviderParams(provider: unknown): boolean {
|
||||
const params = getRecord(getRecord(provider)?.params);
|
||||
return Boolean(params && getLegacyVllmQwenThinkingFormat(params));
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingModelParams(provider: unknown): boolean {
|
||||
const models = getRecord(provider)?.models;
|
||||
if (!Array.isArray(models)) {
|
||||
return false;
|
||||
}
|
||||
return models.some((model) => {
|
||||
const params = getRecord(getRecord(model)?.params);
|
||||
return Boolean(params && getLegacyVllmQwenThinkingFormat(params));
|
||||
});
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingParams(params: unknown): boolean {
|
||||
const record = getRecord(params);
|
||||
return Boolean(record && getLegacyVllmQwenThinkingFormat(record));
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingAgentParams(agents: unknown): boolean {
|
||||
const list = getRecord(agents)?.list;
|
||||
if (!Array.isArray(list)) {
|
||||
return false;
|
||||
}
|
||||
return list.some((agent) => hasLegacyVllmQwenThinkingParams(getRecord(agent)?.params));
|
||||
}
|
||||
|
||||
function findOrCreateVllmModelEntry(
|
||||
raw: Record<string, unknown>,
|
||||
modelId: string,
|
||||
): { model: Record<string, unknown>; index: number } | undefined {
|
||||
const modelsRoot = getOrCreateRecord(raw, "models");
|
||||
const providers = modelsRoot ? getOrCreateRecord(modelsRoot, "providers") : undefined;
|
||||
const vllm = providers ? getOrCreateVllmProvider(providers) : undefined;
|
||||
if (!vllm) {
|
||||
return undefined;
|
||||
}
|
||||
if (vllm.models !== undefined && !Array.isArray(vllm.models)) {
|
||||
return undefined;
|
||||
}
|
||||
|
||||
const models = Array.isArray(vllm.models) ? vllm.models : [];
|
||||
vllm.models = models;
|
||||
const providerModelId = `vllm/${modelId}`;
|
||||
for (const [index, model] of models.entries()) {
|
||||
const record = getRecord(model);
|
||||
if (record?.id === modelId || record?.id === providerModelId) {
|
||||
return { model: record, index };
|
||||
}
|
||||
}
|
||||
|
||||
const model = { id: modelId, name: modelId };
|
||||
models.push(model);
|
||||
return { model, index: models.length - 1 };
|
||||
}
|
||||
|
||||
function listExistingVllmModelTargets(
|
||||
raw: Record<string, unknown>,
|
||||
): Array<{ model: Record<string, unknown>; index: number }> {
|
||||
const models = findVllmProvider(getRecord(getRecord(raw.models)?.providers))?.models;
|
||||
if (!Array.isArray(models)) {
|
||||
return [];
|
||||
}
|
||||
return models.flatMap((model, index) => {
|
||||
const record = getRecord(model);
|
||||
return record ? [{ model: record, index }] : [];
|
||||
});
|
||||
}
|
||||
|
||||
function collectVllmModelIdsFromSelection(value: unknown): string[] {
|
||||
if (typeof value === "string") {
|
||||
const modelId = parseVllmAgentModelKey(value);
|
||||
return modelId ? [modelId] : [];
|
||||
}
|
||||
const record = getRecord(value);
|
||||
if (!record) {
|
||||
return [];
|
||||
}
|
||||
const ids: string[] = [];
|
||||
if (typeof record.primary === "string") {
|
||||
const primary = parseVllmAgentModelKey(record.primary);
|
||||
if (primary) {
|
||||
ids.push(primary);
|
||||
}
|
||||
}
|
||||
if (Array.isArray(record.fallbacks)) {
|
||||
for (const fallback of record.fallbacks) {
|
||||
if (typeof fallback !== "string") {
|
||||
continue;
|
||||
}
|
||||
const modelId = parseVllmAgentModelKey(fallback);
|
||||
if (modelId) {
|
||||
ids.push(modelId);
|
||||
}
|
||||
}
|
||||
}
|
||||
return ids;
|
||||
}
|
||||
|
||||
function collectVllmModelIdsFromAgentModelMap(value: unknown): string[] {
|
||||
const models = getRecord(value);
|
||||
if (!models) {
|
||||
return [];
|
||||
}
|
||||
return Object.keys(models).flatMap((key) => {
|
||||
const modelId = parseVllmAgentModelKey(key);
|
||||
return modelId ? [modelId] : [];
|
||||
});
|
||||
}
|
||||
|
||||
function createVllmModelTargets(
|
||||
raw: Record<string, unknown>,
|
||||
modelIds: string[],
|
||||
): Array<{ model: Record<string, unknown>; index: number }> {
|
||||
const targets: Array<{ model: Record<string, unknown>; index: number }> = [];
|
||||
const seen = new Set<Record<string, unknown>>();
|
||||
for (const modelId of modelIds) {
|
||||
const target = findOrCreateVllmModelEntry(raw, modelId);
|
||||
if (!target || seen.has(target.model)) {
|
||||
continue;
|
||||
}
|
||||
seen.add(target.model);
|
||||
targets.push(target);
|
||||
}
|
||||
return targets;
|
||||
}
|
||||
|
||||
function combineVllmModelTargets(
|
||||
...groups: Array<Array<{ model: Record<string, unknown>; index: number }>>
|
||||
): Array<{ model: Record<string, unknown>; index: number }> {
|
||||
const targets: Array<{ model: Record<string, unknown>; index: number }> = [];
|
||||
const seen = new Set<Record<string, unknown>>();
|
||||
for (const group of groups) {
|
||||
for (const target of group) {
|
||||
if (seen.has(target.model)) {
|
||||
continue;
|
||||
}
|
||||
seen.add(target.model);
|
||||
targets.push(target);
|
||||
}
|
||||
}
|
||||
return targets;
|
||||
}
|
||||
|
||||
function collectVllmModelIdsFromAgentList(value: unknown): string[] {
|
||||
if (!Array.isArray(value)) {
|
||||
return [];
|
||||
}
|
||||
return value.flatMap((agent) => {
|
||||
const record = getRecord(agent);
|
||||
return record
|
||||
? [
|
||||
...collectVllmModelIdsFromSelection(record.model),
|
||||
...collectVllmModelIdsFromAgentModelMap(record.models),
|
||||
]
|
||||
: [];
|
||||
});
|
||||
}
|
||||
|
||||
function getOrCreateRecord(
|
||||
root: Record<string, unknown>,
|
||||
key: string,
|
||||
): Record<string, unknown> | undefined {
|
||||
if (root[key] === undefined) {
|
||||
const next: Record<string, unknown> = {};
|
||||
root[key] = next;
|
||||
return next;
|
||||
}
|
||||
return getRecord(root[key]) ?? undefined;
|
||||
}
|
||||
|
||||
function findVllmProvider(
|
||||
providers: Record<string, unknown> | null | undefined,
|
||||
): Record<string, unknown> | undefined {
|
||||
if (!providers) {
|
||||
return undefined;
|
||||
}
|
||||
const key = Object.keys(providers).find((entry) => normalizeProviderId(entry) === "vllm");
|
||||
return key ? (getRecord(providers[key]) ?? undefined) : undefined;
|
||||
}
|
||||
|
||||
function getOrCreateVllmProvider(
|
||||
providers: Record<string, unknown>,
|
||||
): Record<string, unknown> | undefined {
|
||||
const key = Object.keys(providers).find((entry) => normalizeProviderId(entry) === "vllm");
|
||||
if (key) {
|
||||
return getRecord(providers[key]) ?? undefined;
|
||||
}
|
||||
return getOrCreateRecord(providers, "vllm");
|
||||
}
|
||||
|
||||
function hasLegacyVllmQwenThinkingNormalizedProvider(providers: unknown): boolean {
|
||||
const providersRecord = getRecord(providers);
|
||||
if (!providersRecord || getRecord(providersRecord.vllm)) {
|
||||
return false;
|
||||
}
|
||||
const vllmProvider = findVllmProvider(providersRecord);
|
||||
return (
|
||||
hasLegacyVllmQwenThinkingProviderParams(vllmProvider) ||
|
||||
hasLegacyVllmQwenThinkingModelParams(vllmProvider)
|
||||
);
|
||||
}
|
||||
|
||||
function preserveMigratedVllmQwenReasoning(model: Record<string, unknown>): void {
|
||||
if (model.reasoning === undefined) {
|
||||
model.reasoning = true;
|
||||
}
|
||||
}
|
||||
|
||||
function removeLegacyVllmQwenThinkingParams(params: Record<string, unknown>): void {
|
||||
for (const key of LEGACY_VLLM_QWEN_THINKING_FORMAT_KEYS) {
|
||||
delete params[key];
|
||||
}
|
||||
}
|
||||
|
||||
function applyLegacyVllmQwenThinkingFormat(params: {
|
||||
sourcePath: string;
|
||||
legacyParams: Record<string, unknown>;
|
||||
target: { model: Record<string, unknown>; index: number };
|
||||
legacyFormat: NonNullable<ReturnType<typeof getLegacyVllmQwenThinkingFormat>>;
|
||||
changes: string[];
|
||||
}): boolean {
|
||||
if (!params.legacyFormat.compat) {
|
||||
removeLegacyVllmQwenThinkingParams(params.legacyParams);
|
||||
params.changes.push(
|
||||
`Removed ${params.sourcePath}.${params.legacyFormat.key} (unrecognized value ${JSON.stringify(params.legacyFormat.value)}; configure models.providers.vllm.models[].compat.thinkingFormat if needed).`,
|
||||
);
|
||||
return true;
|
||||
}
|
||||
|
||||
preserveMigratedVllmQwenReasoning(params.target.model);
|
||||
const compat = ensureRecord(params.target.model, "compat");
|
||||
const currentThinkingFormat = compat.thinkingFormat;
|
||||
if (typeof currentThinkingFormat === "string" && isModelThinkingFormat(currentThinkingFormat)) {
|
||||
removeLegacyVllmQwenThinkingParams(params.legacyParams);
|
||||
params.changes.push(
|
||||
`Removed ${params.sourcePath}.${params.legacyFormat.key}; models.providers.vllm.models[${params.target.index}].compat.thinkingFormat is already ${JSON.stringify(currentThinkingFormat)}.`,
|
||||
);
|
||||
return true;
|
||||
}
|
||||
|
||||
compat.thinkingFormat = params.legacyFormat.compat;
|
||||
removeLegacyVllmQwenThinkingParams(params.legacyParams);
|
||||
params.changes.push(
|
||||
`Moved ${params.sourcePath}.${params.legacyFormat.key} to models.providers.vllm.models[${params.target.index}].compat.thinkingFormat (${JSON.stringify(params.legacyFormat.compat)}).`,
|
||||
);
|
||||
return true;
|
||||
}
|
||||
|
||||
function removeUntargetedLegacyVllmQwenThinkingFormat(params: {
|
||||
sourcePath: string;
|
||||
legacyParams: Record<string, unknown>;
|
||||
legacyFormat: NonNullable<ReturnType<typeof getLegacyVllmQwenThinkingFormat>>;
|
||||
changes: string[];
|
||||
}): void {
|
||||
removeLegacyVllmQwenThinkingParams(params.legacyParams);
|
||||
params.changes.push(
|
||||
`Removed ${params.sourcePath}.${params.legacyFormat.key}; no concrete vLLM model row or agent model ref exists, so configure models.providers.vllm.models[].compat.thinkingFormat on each Qwen model that needs it.`,
|
||||
);
|
||||
}
|
||||
|
||||
const LEGACY_VLLM_QWEN_AGENT_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["agents", "defaults", "models"],
|
||||
message:
|
||||
'agents.defaults.models.<vllm-model>.params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.vllm.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingFormat(value),
|
||||
};
|
||||
|
||||
const LEGACY_VLLM_QWEN_PROVIDER_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["models", "providers", "vllm", "params"],
|
||||
message:
|
||||
'models.providers.vllm.params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.vllm.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingProviderParams({ params: value }),
|
||||
};
|
||||
|
||||
const LEGACY_VLLM_QWEN_PROVIDER_MODEL_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["models", "providers", "vllm", "models"],
|
||||
message:
|
||||
'models.providers.vllm.models[*].params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.vllm.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingModelParams({ models: value }),
|
||||
};
|
||||
|
||||
const LEGACY_VLLM_QWEN_NORMALIZED_PROVIDER_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["models", "providers"],
|
||||
message:
|
||||
'models.providers.<vllm>.params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.<vllm>.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingNormalizedProvider(value),
|
||||
};
|
||||
|
||||
const LEGACY_VLLM_QWEN_DEFAULT_PARAMS_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["agents", "defaults", "params"],
|
||||
message:
|
||||
'agents.defaults.params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.vllm.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingParams(value),
|
||||
};
|
||||
|
||||
const LEGACY_VLLM_QWEN_AGENT_PARAMS_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["agents"],
|
||||
message:
|
||||
'agents.list[].params.qwenThinkingFormat is legacy; run "openclaw doctor --fix" to move it to models.providers.vllm.models[].compat.thinkingFormat.',
|
||||
match: (value) => hasLegacyVllmQwenThinkingAgentParams(value),
|
||||
};
|
||||
|
||||
const INVALID_THINKING_FORMAT_RULE: LegacyConfigRule = {
|
||||
path: ["models", "providers"],
|
||||
message:
|
||||
@@ -559,6 +949,201 @@ export const LEGACY_CONFIG_MIGRATIONS_RUNTIME_MODELS: LegacyConfigMigrationSpec[
|
||||
Object.assign(raw, rewritten.value);
|
||||
},
|
||||
}),
|
||||
defineLegacyConfigMigration({
|
||||
id: "agents.defaults.models.vllm.params.qwenThinkingFormat->models.providers.vllm.models.compat.thinkingFormat",
|
||||
describe: "Move legacy vLLM Qwen thinking params to model compat metadata",
|
||||
legacyRules: [
|
||||
LEGACY_VLLM_QWEN_AGENT_THINKING_FORMAT_RULE,
|
||||
LEGACY_VLLM_QWEN_PROVIDER_THINKING_FORMAT_RULE,
|
||||
LEGACY_VLLM_QWEN_PROVIDER_MODEL_THINKING_FORMAT_RULE,
|
||||
LEGACY_VLLM_QWEN_NORMALIZED_PROVIDER_THINKING_FORMAT_RULE,
|
||||
LEGACY_VLLM_QWEN_DEFAULT_PARAMS_THINKING_FORMAT_RULE,
|
||||
LEGACY_VLLM_QWEN_AGENT_PARAMS_THINKING_FORMAT_RULE,
|
||||
],
|
||||
apply: (raw, changes) => {
|
||||
const agentsDefaults = getRecord(getRecord(raw.agents)?.defaults);
|
||||
const defaultModels = getRecord(agentsDefaults?.models);
|
||||
if (defaultModels) {
|
||||
for (const [key, entry] of Object.entries(defaultModels)) {
|
||||
const modelId = parseVllmAgentModelKey(key);
|
||||
const entryRecord = getRecord(entry);
|
||||
const params = getRecord(entryRecord?.params);
|
||||
if (!modelId || !entryRecord || !params) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const legacyFormat = getLegacyVllmQwenThinkingFormat(params);
|
||||
if (!legacyFormat) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const target = legacyFormat.compat ? findOrCreateVllmModelEntry(raw, modelId) : undefined;
|
||||
if (legacyFormat.compat && !target) {
|
||||
continue;
|
||||
}
|
||||
applyLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: `agents.defaults.models.${JSON.stringify(key)}.params`,
|
||||
legacyParams: params,
|
||||
target: target ?? { model: {}, index: -1 },
|
||||
legacyFormat,
|
||||
changes,
|
||||
});
|
||||
if (Object.keys(params).length === 0) {
|
||||
delete entryRecord.params;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const vllmProvider = findVllmProvider(getRecord(getRecord(raw.models)?.providers));
|
||||
const vllmModels = vllmProvider?.models;
|
||||
if (Array.isArray(vllmModels)) {
|
||||
for (const [index, model] of vllmModels.entries()) {
|
||||
const modelRecord = getRecord(model);
|
||||
const params = getRecord(modelRecord?.params);
|
||||
if (!modelRecord || !params) {
|
||||
continue;
|
||||
}
|
||||
const legacyFormat = getLegacyVllmQwenThinkingFormat(params);
|
||||
if (!legacyFormat) {
|
||||
continue;
|
||||
}
|
||||
applyLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: `models.providers.vllm.models[${index}].params`,
|
||||
legacyParams: params,
|
||||
target: { model: modelRecord, index },
|
||||
legacyFormat,
|
||||
changes,
|
||||
});
|
||||
if (Object.keys(params).length === 0) {
|
||||
delete modelRecord.params;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const providerParams = getRecord(vllmProvider?.params);
|
||||
if (providerParams) {
|
||||
const providerLegacyFormat = getLegacyVllmQwenThinkingFormat(providerParams);
|
||||
if (providerLegacyFormat) {
|
||||
const providerModelIds = [
|
||||
...collectVllmModelIdsFromSelection(agentsDefaults?.model),
|
||||
...collectVllmModelIdsFromAgentModelMap(defaultModels),
|
||||
...collectVllmModelIdsFromAgentList(getRecord(raw.agents)?.list),
|
||||
];
|
||||
const targets = combineVllmModelTargets(
|
||||
listExistingVllmModelTargets(raw),
|
||||
createVllmModelTargets(raw, providerModelIds),
|
||||
);
|
||||
if (targets.length === 0) {
|
||||
removeUntargetedLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: "models.providers.vllm.params",
|
||||
legacyParams: providerParams,
|
||||
legacyFormat: providerLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
} else {
|
||||
for (const target of targets) {
|
||||
applyLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: "models.providers.vllm.params",
|
||||
legacyParams: providerParams,
|
||||
target,
|
||||
legacyFormat: providerLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
}
|
||||
}
|
||||
if (Object.keys(providerParams).length === 0) {
|
||||
delete vllmProvider?.params;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const defaultParams = getRecord(agentsDefaults?.params);
|
||||
if (defaultParams) {
|
||||
const defaultLegacyFormat = getLegacyVllmQwenThinkingFormat(defaultParams);
|
||||
if (defaultLegacyFormat) {
|
||||
const defaultModelIds = [
|
||||
...collectVllmModelIdsFromSelection(agentsDefaults?.model),
|
||||
...collectVllmModelIdsFromAgentModelMap(defaultModels),
|
||||
];
|
||||
const targets =
|
||||
defaultModelIds.length > 0
|
||||
? createVllmModelTargets(raw, defaultModelIds)
|
||||
: listExistingVllmModelTargets(raw);
|
||||
if (targets.length === 0) {
|
||||
removeUntargetedLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: "agents.defaults.params",
|
||||
legacyParams: defaultParams,
|
||||
legacyFormat: defaultLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
} else {
|
||||
for (const target of targets) {
|
||||
applyLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: "agents.defaults.params",
|
||||
legacyParams: defaultParams,
|
||||
target,
|
||||
legacyFormat: defaultLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
}
|
||||
}
|
||||
if (Object.keys(defaultParams).length === 0) {
|
||||
delete agentsDefaults?.params;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const agentList = getRecord(raw.agents)?.list;
|
||||
if (!Array.isArray(agentList)) {
|
||||
return;
|
||||
}
|
||||
for (const [index, agent] of agentList.entries()) {
|
||||
const agentRecord = getRecord(agent);
|
||||
const agentParams = getRecord(agentRecord?.params);
|
||||
const agentLegacyFormat = agentParams
|
||||
? getLegacyVllmQwenThinkingFormat(agentParams)
|
||||
: undefined;
|
||||
if (!agentRecord || !agentParams || !agentLegacyFormat) {
|
||||
continue;
|
||||
}
|
||||
const explicitAgentModelIds = [
|
||||
...collectVllmModelIdsFromSelection(agentRecord.model),
|
||||
...collectVllmModelIdsFromAgentModelMap(agentRecord.models),
|
||||
];
|
||||
const inheritedDefaultModelIds = [
|
||||
...collectVllmModelIdsFromSelection(agentsDefaults?.model),
|
||||
...collectVllmModelIdsFromAgentModelMap(defaultModels),
|
||||
];
|
||||
const agentModelIds =
|
||||
explicitAgentModelIds.length > 0 ? explicitAgentModelIds : inheritedDefaultModelIds;
|
||||
const targets =
|
||||
agentModelIds.length > 0
|
||||
? createVllmModelTargets(raw, agentModelIds)
|
||||
: listExistingVllmModelTargets(raw);
|
||||
if (targets.length === 0) {
|
||||
removeUntargetedLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: `agents.list[${index}].params`,
|
||||
legacyParams: agentParams,
|
||||
legacyFormat: agentLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
} else {
|
||||
for (const target of targets) {
|
||||
applyLegacyVllmQwenThinkingFormat({
|
||||
sourcePath: `agents.list[${index}].params`,
|
||||
legacyParams: agentParams,
|
||||
target,
|
||||
legacyFormat: agentLegacyFormat,
|
||||
changes,
|
||||
});
|
||||
}
|
||||
}
|
||||
if (Object.keys(agentParams).length === 0) {
|
||||
delete agentRecord.params;
|
||||
}
|
||||
}
|
||||
},
|
||||
}),
|
||||
defineLegacyConfigMigration({
|
||||
id: "models.providers.*.models.*.compat.thinkingFormat-invalid",
|
||||
describe: "Remove unrecognized compat.thinkingFormat values from provider model entries",
|
||||
|
||||
@@ -128,6 +128,44 @@ describe("models.list", () => {
|
||||
}
|
||||
});
|
||||
|
||||
it("does not expose runtime params from catalog rows", async () => {
|
||||
const respond = vi.fn();
|
||||
await modelsHandlers["models.list"]({
|
||||
req: {
|
||||
type: "req",
|
||||
id: "req-models-list-redact-params",
|
||||
method: "models.list",
|
||||
params: { view: "all" },
|
||||
},
|
||||
params: { view: "all" },
|
||||
respond,
|
||||
client: null,
|
||||
isWebchatConnect: () => false,
|
||||
context: {
|
||||
getRuntimeConfig: () => ({}) as OpenClawConfig,
|
||||
loadGatewayModelCatalog: vi.fn(() =>
|
||||
Promise.resolve([
|
||||
{
|
||||
id: "qwen-local",
|
||||
name: "Qwen Local",
|
||||
provider: "vllm",
|
||||
params: { qwenThinkingFormat: "chat-template" },
|
||||
},
|
||||
]),
|
||||
),
|
||||
logGateway: {
|
||||
debug: vi.fn(),
|
||||
},
|
||||
} as never,
|
||||
});
|
||||
|
||||
expect(respond).toHaveBeenCalledWith(
|
||||
true,
|
||||
{ models: [{ id: "qwen-local", name: "Qwen Local", provider: "vllm" }] },
|
||||
undefined,
|
||||
);
|
||||
});
|
||||
|
||||
it("loads the full catalog for provider-scoped configured view and filters only providers", async () => {
|
||||
const catalog = [
|
||||
{ id: "claude-test", name: "Claude Test", provider: "anthropic" },
|
||||
|
||||
@@ -5,6 +5,7 @@ import {
|
||||
type ModelCatalogBrowseView,
|
||||
} from "../../agents/model-catalog-browse.js";
|
||||
import { resolveVisibleModelCatalog } from "../../agents/model-catalog-visibility.js";
|
||||
import type { ModelCatalogEntry } from "../../agents/model-catalog.types.js";
|
||||
import { resolveDefaultAgentWorkspaceDir } from "../../agents/workspace.js";
|
||||
import {
|
||||
ErrorCodes,
|
||||
@@ -22,6 +23,17 @@ function resolveModelsListView(params: Record<string, unknown>): ModelsListView
|
||||
return typeof params.view === "string" ? (params.view as ModelsListView) : "default";
|
||||
}
|
||||
|
||||
function omitRuntimeModelParams(entry: ModelCatalogEntry): ModelCatalogEntry {
|
||||
const { params: _params, ...rest } = entry as ModelCatalogEntry & {
|
||||
params?: Record<string, unknown>;
|
||||
};
|
||||
return rest;
|
||||
}
|
||||
|
||||
function omitRuntimeModelParamsFromCatalog(catalog: ModelCatalogEntry[]): ModelCatalogEntry[] {
|
||||
return catalog.map(omitRuntimeModelParams);
|
||||
}
|
||||
|
||||
export const modelsHandlers: GatewayRequestHandlers = {
|
||||
"models.list": async ({ params, respond, context }) => {
|
||||
if (!validateModelsListParams(params)) {
|
||||
@@ -56,7 +68,7 @@ export const modelsHandlers: GatewayRequestHandlers = {
|
||||
},
|
||||
});
|
||||
if (view === "all") {
|
||||
respond(true, { models: catalog }, undefined);
|
||||
respond(true, { models: omitRuntimeModelParamsFromCatalog(catalog) }, undefined);
|
||||
return;
|
||||
}
|
||||
const models = await resolveVisibleModelCatalog({
|
||||
@@ -67,7 +79,7 @@ export const modelsHandlers: GatewayRequestHandlers = {
|
||||
view,
|
||||
runtimeAuthDiscovery: false,
|
||||
});
|
||||
respond(true, { models }, undefined);
|
||||
respond(true, { models: omitRuntimeModelParamsFromCatalog(models) }, undefined);
|
||||
} catch (err) {
|
||||
respond(false, undefined, errorShape(ErrorCodes.UNAVAILABLE, String(err)));
|
||||
}
|
||||
|
||||
@@ -10,15 +10,25 @@ export type ProviderThinkingPolicyContext = {
|
||||
modelId: string;
|
||||
};
|
||||
|
||||
export type ProviderThinkingModelCompat = {
|
||||
thinkingFormat?: string;
|
||||
supportedReasoningEfforts?: readonly string[] | null;
|
||||
};
|
||||
|
||||
/**
|
||||
* Provider-owned default thinking policy input.
|
||||
*
|
||||
* `reasoning` is the merged catalog hint for the selected model when one is
|
||||
* available. Providers can use it to keep "reasoning model => low" behavior
|
||||
* without re-reading the catalog themselves.
|
||||
*
|
||||
* `compat` carries model-level request contract facts for the selected model
|
||||
* when available. Providers can use it to expose model-specific thinking
|
||||
* profiles only when the configured payload style supports them.
|
||||
*/
|
||||
export type ProviderDefaultThinkingPolicyContext = ProviderThinkingPolicyContext & {
|
||||
reasoning?: boolean;
|
||||
compat?: ProviderThinkingModelCompat | null;
|
||||
};
|
||||
|
||||
export type ProviderThinkingLevelId =
|
||||
|
||||
Reference in New Issue
Block a user