docs: document mistral medium 3.5 usage

2026-06-06 05:51:15 +08:00 · 2026-05-09 11:45:37 +01:00
parent 8dc1080db7
commit b27a251ce5
3 changed files with 54 additions and 2 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -146,6 +146,7 @@ Docs: https://docs.openclaw.ai
 - iMessage: add opt-in inbound catchup that replays messages received while the gateway was offline (crash, restart, mac sleep) on next startup. Enable with `channels.imessage.catchup.enabled: true`; tunables for `maxAgeMinutes`, `perRunLimit`, `firstRunLookbackMinutes`, and `maxFailureRetries`. Persists a per-account cursor under the OpenClaw state dir (`<openclawStateDir>/imessage/catchup/`), replays each row through the live dispatch path so allowlists/group policy/dedupe behave identically on replayed and live messages, and force-advances past wedged guids after `maxFailureRetries` to prevent stuck cursors. Extends the persisted echo-cache retention window so the agent's own outbound rows from before a gap are not re-fed as inbound on replay. Includes a regenerated `src/config/bundled-channel-config-metadata.generated.ts` so the runtime AJV schema accepts the new `channels.imessage.catchup` block. Fixes #78649. (#79387) Thanks @omarshahine.
 - Channels/Yuanbao: bump the bundled `openclaw-plugin-yuanbao` npm spec from `2.11.0` to `2.13.0` in the official external channel catalog and refresh the pinned integrity hash, so fresh installs and catalog-driven reinstalls pick up the newer Yuanbao channel plugin release. (#79620) Thanks @loongfay.
 - Providers/Mistral: add `mistral-medium-3-5` to the bundled catalog with reasoning support. Thanks @sliekens.
+- Docs/Mistral: document Medium 3.5 setup, local `infer` smoke-test usage, adjustable reasoning, and the Mistral HTTP 400 caveat for `reasoning_effort="high"` with `temperature: 0`.

 ### Breaking

--- a/docs/cli/infer.md
+++ b/docs/cli/infer.md
@@ -157,6 +157,7 @@ openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "R
 openclaw infer model run --local --model cerebras/zai-glm-4.7 --prompt "Reply with exactly: pong" --json
 openclaw infer model run --local --model google/gemini-2.5-flash --prompt "Reply with exactly: pong" --json
 openclaw infer model run --local --model groq/llama-3.1-8b-instant --prompt "Reply with exactly: pong" --json
+openclaw infer model run --local --model mistral/mistral-medium-3-5 --prompt "Reply with exactly: pong" --json
 openclaw infer model run --local --model mistral/mistral-small-latest --prompt "Reply with exactly: pong" --json
 openclaw infer model run --local --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
 openclaw infer model run --local --model ollama/qwen2.5vl:7b --prompt "Describe this image." --file ./photo.jpg --json
@@ -165,6 +166,7 @@ openclaw infer model run --local --model ollama/qwen2.5vl:7b --prompt "Describe
 Notes:

 - Local `model run` is the narrowest CLI smoke for provider/model/auth health because, for non-Codex providers, it sends only the supplied prompt to the selected model.
+- For Mistral Medium 3.5 reasoning probes, leave temperature unset so Mistral applies its default. Mistral rejects `reasoning_effort="high"` combined with `temperature: 0` with an HTTP 400 response; run `mistral/mistral-medium-3-5` with the default temperature or a non-zero value such as `0.7`.
 - `openai-codex/*` local probes are the narrow exception: OpenClaw adds a minimal system instruction so the Codex Responses transport can populate its required `instructions` field, without adding full agent context, tools, memory, or session transcript.
 - Local `model run --file` keeps that lean path and attaches image content directly to the single user message. Common image files such as PNG, JPEG, and WebP work when their MIME type is detected as `image/*`; unsupported or unrecognized files fail before the provider is called.
 - `model run --file` is best when you want to test the selected multimodal text model directly. Use `infer image describe` when you want OpenClaw's image-understanding provider selection and default image-model routing.
--- a/docs/providers/mistral.md
+++ b/docs/providers/mistral.md
@@ -58,19 +58,41 @@ OpenClaw includes a bundled Mistral plugin that registers four contracts: chat c

 ## Built-in LLM catalog

+[Mistral Medium 3.5](https://docs.mistral.ai/models/model-cards/mistral-medium-3-5-26-04)
+is the current blended Medium model in the bundled catalog: 128B dense weights,
+text and image input, 256K context, function calling, structured output, coding,
+and adjustable reasoning through the Chat Completions API. Use
+`mistral/mistral-medium-3-5` when you want Mistral's newer unified
+agentic/coding model instead of the default `mistral/mistral-large-latest`.
+
 OpenClaw currently ships this bundled Mistral catalog:

 | Model ref                        | Input       | Context | Max output | Notes                                                            |
 | -------------------------------- | ----------- | ------- | ---------- | ---------------------------------------------------------------- |
 | `mistral/mistral-large-latest`   | text, image | 262,144 | 16,384     | Default model                                                    |
 | `mistral/mistral-medium-2508`    | text, image | 262,144 | 8,192      | Mistral Medium 3.1                                               |
-| `mistral/mistral-medium-3-5`     | text, image | 262,144 | 8,192      | Mistral Medium 3.5                                               |
+| `mistral/mistral-medium-3-5`     | text, image | 262,144 | 8,192      | Mistral Medium 3.5; adjustable reasoning                         |
 | `mistral/mistral-small-latest`   | text, image | 128,000 | 16,384     | Mistral Small 4; adjustable reasoning via API `reasoning_effort` |
 | `mistral/pixtral-large-latest`   | text, image | 128,000 | 32,768     | Pixtral                                                          |
 | `mistral/codestral-latest`       | text        | 256,000 | 4,096      | Coding                                                           |
 | `mistral/devstral-medium-latest` | text        | 262,144 | 32,768     | Devstral 2                                                       |
 | `mistral/magistral-small`        | text        | 128,000 | 40,000     | Reasoning-enabled                                                |

+After onboarding, smoke-test Medium 3.5 without starting the Gateway:
+
+```bash
+openclaw infer model run --local \
+  --model mistral/mistral-medium-3-5 \
+  --prompt "Reply with exactly: mistral-ok" \
+  --json
+```
+
+To inspect the bundled Mistral catalog rows before changing config:
+
+```bash
+openclaw models list --all --provider mistral --plain
+```
+
 ## Audio transcription (Voxtral)

 Use Voxtral for batch audio transcription through the media understanding
@@ -139,7 +161,7 @@ matching `sampleRate` only if your upstream stream is already raw PCM.

 <AccordionGroup>
  <Accordion title="Adjustable reasoning">
-    `mistral/mistral-small-latest` (Mistral Small 4) and `mistral/mistral-medium-3-5` support [adjustable reasoning](https://docs.mistral.ai/capabilities/reasoning/adjustable) on the Chat Completions API via `reasoning_effort` (`none` minimizes extra thinking in the output; `high` surfaces full thinking traces before the final answer).
+    `mistral/mistral-small-latest` (Mistral Small 4) and `mistral/mistral-medium-3-5` support [adjustable reasoning](https://docs.mistral.ai/studio-api/conversations/reasoning/adjustable) on the Chat Completions API via `reasoning_effort` (`none` minimizes extra thinking in the output; `high` surfaces full thinking traces before the final answer). Mistral recommends `reasoning_effort="high"` for Medium 3.5 agentic and code use cases.

    OpenClaw maps the session **thinking** level to Mistral's API:

@@ -148,6 +170,33 @@ matching `sampleRate` only if your upstream stream is already raw PCM.
    | **off** / **minimal**                            | `none`                     |
    | **low** / **medium** / **high** / **xhigh** / **adaptive** / **max** | `high`     |

+    <Warning>
+    Do not combine Medium 3.5 reasoning mode with `temperature: 0`. The Mistral
+    HTTP API rejects `reasoning_effort="high"` plus `temperature: 0` with a 400
+    response. Leave temperature unset so Mistral uses its default, or follow
+    the [Medium 3.5 recommended settings](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B)
+    and use `temperature: 0.7` for high reasoning. For deterministic direct
+    answers, first set thinking to **off** or **minimal** so OpenClaw sends
+    `reasoning_effort: "none"`, and only then lower temperature.
+    </Warning>
+
+    Example model-scoped config for Medium 3.5 reasoning:
+
+    ```json5
+    {
+      agents: {
+        defaults: {
+          model: { primary: "mistral/mistral-medium-3-5" },
+          models: {
+            "mistral/mistral-medium-3-5": {
+              params: { thinking: "high" },
+            },
+          },
+        },
+      },
+    }
+    ```
+
    <Note>
    Other bundled Mistral catalog models do not use this parameter. Keep using `magistral-*` models when you want Mistral's native reasoning-first behavior.
    </Note>
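
Reviewer note: the thinking-level table and the temperature warning in the `docs/providers/mistral.md` hunk can be summarized as a small sketch. This is an illustration of the documented behavior only, not OpenClaw's actual implementation; the names `ThinkingLevel`, `toReasoningEffort`, and `checkMistralSampling` are hypothetical and do not come from the commit.

```typescript
// Hypothetical sketch of the documented mapping; not OpenClaw source code.
type ThinkingLevel =
  | "off"
  | "minimal"
  | "low"
  | "medium"
  | "high"
  | "xhigh"
  | "adaptive"
  | "max";

type ReasoningEffort = "none" | "high";

// Mirrors the documented table: off/minimal map to "none",
// every other thinking level maps to "high".
function toReasoningEffort(level: ThinkingLevel): ReasoningEffort {
  return level === "off" || level === "minimal" ? "none" : "high";
}

// Returns a warning string when the request would hit the documented
// HTTP 400 (reasoning_effort="high" plus temperature: 0), otherwise null.
// An unset temperature is always accepted here because Mistral then
// applies its own default.
function checkMistralSampling(
  level: ThinkingLevel,
  temperature?: number,
): string | null {
  if (toReasoningEffort(level) === "high" && temperature === 0) {
    return 'reasoning_effort="high" with temperature: 0 is rejected by Mistral (HTTP 400)';
  }
  return null;
}
```

Under these assumptions, `checkMistralSampling("high", 0)` returns a warning string, while `checkMistralSampling("minimal", 0)` and `checkMistralSampling("high")` both return `null`.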