feat: default queueing to steer

2026-06-06 05:51:15 +08:00 · 2026-04-29 22:41:53 +01:00
parent 83267e99b0
commit 4a6e10ece8
13 changed files with 143 additions and 57 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,7 @@ Docs: https://docs.openclaw.ai
 ### Changes

 - Dependencies: refresh workspace runtime, plugin, and tooling packages, including ACP, Pi, AWS SDK, TypeBox, pnpm, oxlint, oxfmt, jsdom, pdfjs, ciao, and tokenjuice, while keeping patched ACP behavior and lint gates current. Thanks @mariozechner.
+- Messages/queue: default active-run queueing to `steer` with a 500ms followup fallback debounce, and document the queue modes, precedence, and drop policies on the command queue page. Thanks @vincentkoc.
 - Providers/NVIDIA: add the NVIDIA provider with API-key onboarding, setup docs, static catalog metadata, and literal model-ref picker support so NVIDIA hosted models can be selected with their provider prefix intact. (#71204) Thanks @eleqtrizit.
 - Memory/wiki: add agent-facing people wiki metadata, canonical aliases, person cards, relationship graphs, privacy/provenance reports, evidence-kind drilldown, and search modes for person lookup, question routing, source evidence, and raw claims. Thanks @vincentkoc.
 - Messages: add global `messages.visibleReplies` so operators can require visible output to go through `message(action=send)` for any source chat, while `messages.groupChat.visibleReplies` stays available as the group/channel override. Thanks @scoootscooob.
--- a/docs/.generated/config-baseline.sha256
+++ b/docs/.generated/config-baseline.sha256
@@ -1,4 +1,4 @@
-0316c2ceef9a2da29a8860ba8c8e5218249bc561c5b44202ac78faf16b56029f  config-baseline.json
-d6f6410e05b623412f086ba59d8caea82e691e2f1367090ec2ddabfc189381ed  config-baseline.core.json
+807f97de19c66b263192c9285bb3f85785d5e505c09dc9a2e09f06edf4ff75ae  config-baseline.json
+a0d85108a55bad17e823d861994ebdd1c6fdb806febee3da7af8b821b7e1c607  config-baseline.core.json
 9f5fad66a49fa618d64a963470aa69fed9fe4b4639cc4321f9ec04bfb2f8aa50  config-baseline.channel.json
 c4231c2194206547af8ad94342dc00aadb734f43cb49cc79d4c46bdbb80c3f95  config-baseline.plugin.json
--- a/docs/concepts/messages.md
+++ b/docs/concepts/messages.md
@@ -124,9 +124,12 @@ If a run is already active, inbound messages can be queued, steered into the
 current run, or collected for a followup turn.

 - Configure via `messages.queue` (and `messages.queue.byChannel`).
- Modes: `interrupt`, `steer`, `followup`, `collect`, plus backlog variants.
+- Default mode is `steer`, with a 500ms followup debounce when steering falls
+  back to queued followup delivery.
+- Modes: `steer`, `followup`, `collect`, `steer-backlog`, `interrupt`, and the
+  legacy `queue` alias.

-Details: [Queueing](/concepts/queue).
+Details: [Command queue](/concepts/queue).

 ## Channel run ownership

--- a/docs/concepts/queue.md
+++ b/docs/concepts/queue.md
@@ -1,7 +1,8 @@
 ---
-summary: "Command queue design that serializes inbound auto-reply runs"
+summary: "Auto-reply queue modes, defaults, and per-session overrides"
 read_when:
  - Changing auto-reply execution or concurrency
+  - Explaining /queue modes or message steering behavior
 title: "Command queue"
 ---

@@ -20,25 +21,33 @@ We serialize inbound auto-reply runs (all channels) through a tiny in-process qu
 - When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
 - Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.

-## Queue modes (per channel)
+## Defaults
+
+When unset, all inbound channel surfaces use:
+
+- `mode: "steer"`
+- `debounceMs: 500`
+- `cap: 20`
+- `drop: "summarize"`
+
+`steer` is the default because it keeps the active model turn responsive without
+starting a second session run. If the current run cannot accept steering,
+OpenClaw falls back to a followup queue entry.
+
+## Queue modes

 Inbound messages can steer the current run, wait for a followup turn, or do both:

- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
+- `steer`: queue a steering message into the active Pi run. Pi delivers it **after the current assistant turn finishes executing its tool calls**, before the next LLM call. If the run is not actively streaming or steering is unavailable, OpenClaw falls back to a followup queue entry.
+- `followup`: enqueue each message for a later agent turn after the current run ends.
+- `collect`: coalesce queued messages into a **single** followup turn after the quiet window. If messages target different channels/threads, they drain individually to preserve routing.
+- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the same message for a followup turn.
 - `interrupt` (legacy): abort the active run for that session, then run the newest message.
 - `queue` (legacy alias): same as `steer`.

 Steer-backlog means you can get a followup response after the steered run, so
 streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want
 one response per inbound message.
-Send `/queue collect` as a standalone command (per-session) or set `messages.queue.byChannel.discord: "collect"`.
-
-Defaults (when unset in config):
-
- All surfaces → `collect`

 Configure globally or per channel via `messages.queue`:

@@ -46,8 +55,8 @@ Configure globally or per channel via `messages.queue`:
 {
  messages: {
    queue: {
-      mode: "collect",
-      debounceMs: 1000,
+      mode: "steer",
+      debounceMs: 500,
      cap: 20,
      drop: "summarize",
      byChannel: { discord: "collect" },
@@ -60,17 +69,33 @@ Configure globally or per channel via `messages.queue`:

 Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):

- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`).
+- `debounceMs`: quiet window before draining queued followups. Bare numbers are milliseconds; units `ms`, `s`, `m`, `h`, and `d` are accepted by `/queue` options.
+- `cap`: max queued messages per session. Values below `1` are ignored.
+- `drop: "summarize"`: default. Drop the oldest queued entries as needed, keep compact summaries, and inject them as a synthetic followup prompt.
+- `drop: "old"`: drop the oldest queued entries as needed, without preserving summaries.
+- `drop: "new"`: reject the newest message when the queue is already full.

-Summarize keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.
-Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
+Defaults: `debounceMs: 500`, `cap: 20`, `drop: summarize`.
+
+## Precedence
+
+For mode selection, OpenClaw resolves:
+
+1. Inline or stored per-session `/queue` override.
+2. `messages.queue.byChannel.<channel>`.
+3. `messages.queue.mode`.
+4. Default `steer`.
+
+For options, inline or stored `/queue` options win over config. Then
+channel-specific debounce (`messages.queue.debounceMsByChannel`), plugin
+debounce defaults, global `messages.queue` options, and built-in defaults are
+applied. `cap` and `drop` are global/session options, not per-channel config
+keys.

 ## Per-session overrides

 - Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
+- Options can be combined: `/queue collect debounce:0.5s cap:25 drop:summarize`
 - `/queue default` or `/queue reset` clears the session override.

 ## Scope and guarantees
--- a/docs/gateway/config-agents.md
+++ b/docs/gateway/config-agents.md
@@ -1226,13 +1226,13 @@ See [Multi-Agent Sandbox & Tools](/tools/multi-agent-sandbox-tools) for preceden
    ackReactionScope: "group-mentions", // group-mentions | group-all | direct | all
    removeAckAfterReply: false,
    queue: {
-      mode: "collect", // steer | followup | collect | steer-backlog | steer+backlog | queue | interrupt
-      debounceMs: 1000,
+      mode: "steer", // steer | followup | collect | steer-backlog | steer+backlog | queue | interrupt
+      debounceMs: 500,
      cap: 20,
      drop: "summarize", // old | new | summarize
      byChannel: {
-        whatsapp: "collect",
-        telegram: "collect",
+        whatsapp: "steer",
+        telegram: "steer",
      },
    },
    inbound: {
--- a/docs/gateway/configuration-examples.md
+++ b/docs/gateway/configuration-examples.md
@@ -111,18 +111,18 @@ Save to `~/.openclaw/openclaw.json` and you can DM the bot from that number.
      visibleReplies: "message_tool", // normal final replies stay private in groups/channels
    },
    queue: {
-      mode: "collect",
-      debounceMs: 1000,
+      mode: "steer",
+      debounceMs: 500,
      cap: 20,
      drop: "summarize",
      byChannel: {
-        whatsapp: "collect",
-        telegram: "collect",
-        discord: "collect",
-        slack: "collect",
-        signal: "collect",
-        imessage: "collect",
-        webchat: "collect",
+        whatsapp: "steer",
+        telegram: "steer",
+        discord: "steer",
+        slack: "steer",
+        signal: "steer",
+        imessage: "steer",
+        webchat: "steer",
      },
    },
  },
--- a/docs/help/faq.md
+++ b/docs/help/faq.md
@@ -1943,13 +1943,13 @@ lives on the [Models FAQ](/help/faq-models).
  <Accordion title='Why does it feel like the bot "ignores" rapid-fire messages?'>
    Queue mode controls how new messages interact with an in-flight run. Use `/queue` to change modes:

-    - `steer` - new messages redirect the current task
+    - `steer` - queue steering for the next model boundary in the current run
    - `followup` - run messages one at a time
-    - `collect` - batch messages and reply once (default)
+    - `collect` - batch messages and reply once
    - `steer-backlog` - steer now, then process backlog
    - `interrupt` - abort current run and start fresh

-    You can add options like `debounce:2s cap:25 drop:summarize` for followup modes.
+    Default mode is `steer`. You can add options like `debounce:0.5s cap:25 drop:summarize` for followup modes. See [Command queue](/concepts/queue).

  </Accordion>
 </AccordionGroup>
--- a/docs/tools/slash-commands.md
+++ b/docs/tools/slash-commands.md
@@ -142,7 +142,7 @@ Current source-of-truth:
    - `/exec host=<auto|sandbox|gateway|node> security=<deny|allowlist|full> ask=<off|on-miss|always> node=<id>` shows or sets exec defaults.
    - `/model [name|#|status]` shows or sets the model.
    - `/models [provider] [page] [limit=<n>|size=<n>|all]` lists configured/auth-available providers or models for a provider; add `all` to browse that provider's full catalog.
-    - `/queue <mode>` manages queue behavior (`steer`, `interrupt`, `followup`, `collect`, `steer-backlog`) plus options like `debounce:2s cap:25 drop:summarize`.
+    - `/queue <mode>` manages queue behavior (`steer`, `followup`, `collect`, `steer-backlog`, `interrupt`) plus options like `debounce:0.5s cap:25 drop:summarize`; `/queue default` or `/queue reset` clears the session override. See [Command queue](/concepts/queue).

  </Accordion>
  <Accordion title="Discovery and status">
--- a/src/auto-reply/reply/queue/settings.test.ts
+++ b/src/auto-reply/reply/queue/settings.test.ts
@@ -0,0 +1,57 @@
+import { describe, expect, it } from "vitest";
+import type { OpenClawConfig } from "../../../config/types.openclaw.js";
+import { resolveQueueSettings } from "./settings.js";
+
+describe("resolveQueueSettings", () => {
+  it("defaults inbound channels to steer with a short followup debounce", () => {
+    expect(resolveQueueSettings({ cfg: {} as OpenClawConfig })).toEqual({
+      mode: "steer",
+      debounceMs: 500,
+      cap: 20,
+      dropPolicy: "summarize",
+    });
+  });
+
+  it("uses the short debounce when collect is selected globally", () => {
+    expect(
+      resolveQueueSettings({
+        cfg: {
+          messages: {
+            queue: {
+              mode: "collect",
+            },
+          },
+        } as OpenClawConfig,
+      }),
+    ).toEqual({
+      mode: "collect",
+      debounceMs: 500,
+      cap: 20,
+      dropPolicy: "summarize",
+    });
+  });
+
+  it("keeps explicit channel queue overrides ahead of defaults", () => {
+    expect(
+      resolveQueueSettings({
+        cfg: {
+          messages: {
+            queue: {
+              mode: "steer",
+              debounceMs: 750,
+              byChannel: {
+                discord: "collect",
+              },
+            },
+          },
+        } as OpenClawConfig,
+        channel: "discord",
+      }),
+    ).toEqual({
+      mode: "collect",
+      debounceMs: 750,
+      cap: 20,
+      dropPolicy: "summarize",
+    });
+  });
+});
--- a/src/auto-reply/reply/queue/settings.ts
+++ b/src/auto-reply/reply/queue/settings.ts
@@ -5,7 +5,7 @@ import { DEFAULT_QUEUE_CAP, DEFAULT_QUEUE_DEBOUNCE_MS, DEFAULT_QUEUE_DROP } from
 import type { QueueMode, QueueSettings, ResolveQueueSettingsParams } from "./types.js";

 function defaultQueueModeForChannel(_channel?: string): QueueMode {
-  return "collect";
+  return "steer";
 }

 /** Resolve per-channel debounce override from debounceMsByChannel map. */
--- a/src/auto-reply/reply/queue/state.ts
+++ b/src/auto-reply/reply/queue/state.ts
@@ -16,7 +16,7 @@ export type FollowupQueueState = {
  lastRun?: FollowupRun["run"];
 };

-export const DEFAULT_QUEUE_DEBOUNCE_MS = 1000;
+export const DEFAULT_QUEUE_DEBOUNCE_MS = 500;
 export const DEFAULT_QUEUE_CAP = 20;
 export const DEFAULT_QUEUE_DROP: QueueDropPolicy = "summarize";

--- a/src/config/schema.base.generated.ts
+++ b/src/config/schema.base.generated.ts
@@ -18977,7 +18977,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
                ],
                title: "Queue Mode",
                description:
-                  'Queue behavior mode: "steer", "followup", "collect", "steer-backlog", "steer+backlog", "queue", or "interrupt". Keep conservative modes unless you intentionally need aggressive interruption/backlog semantics.',
+                  'Queue behavior mode. "steer" injects at the next model boundary; "followup" runs later; "collect" batches later; "steer-backlog" does both; "queue" aliases steer; "interrupt" aborts the active run.',
              },
              byChannel: {
                type: "object",
@@ -19314,7 +19314,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
                maximum: 9007199254740991,
                title: "Queue Debounce (ms)",
                description:
-                  "Global queue debounce window in milliseconds before processing buffered inbound messages. Use higher values to coalesce rapid bursts, or lower values for reduced response latency.",
+                  "Global followup queue debounce window in milliseconds before draining buffered inbound messages. Default is 500ms; higher values coalesce bursts, lower values reduce latency.",
              },
              debounceMsByChannel: {
                type: "object",
@@ -19336,7 +19336,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
                maximum: 9007199254740991,
                title: "Queue Capacity",
                description:
-                  "Maximum number of queued inbound items retained before drop policy applies. Keep caps bounded in noisy channels so memory usage remains predictable.",
+                  "Maximum number of queued inbound items retained before drop policy applies. Default is 20; keep caps bounded in noisy channels so memory usage remains predictable.",
              },
              drop: {
                anyOf: [
@@ -19355,13 +19355,13 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
                ],
                title: "Queue Drop Strategy",
                description:
-                  'Drop strategy when queue cap is exceeded: "old", "new", or "summarize". Use summarize when preserving intent matters, or old/new when deterministic dropping is preferred.',
+                  'Drop strategy when queue cap is exceeded. "summarize" drops oldest entries but preserves compact summaries; "old" drops oldest without summaries; "new" rejects the newest item.',
              },
            },
            additionalProperties: false,
            title: "Inbound Queue",
            description:
-              "Inbound message queue strategy used to buffer bursts before processing turns. Tune this for busy channels where sequential processing or batching behavior matters.",
+              "Inbound message queue strategy for messages that arrive while a session run is active. Default mode is steer, with followup fallback when steering is unavailable.",
          },
          inbound: {
            type: "object",
@@ -28302,12 +28302,12 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
    },
    "messages.queue": {
      label: "Inbound Queue",
-      help: "Inbound message queue strategy used to buffer bursts before processing turns. Tune this for busy channels where sequential processing or batching behavior matters.",
+      help: "Inbound message queue strategy for messages that arrive while a session run is active. Default mode is steer, with followup fallback when steering is unavailable.",
      tags: ["advanced"],
    },
    "messages.queue.mode": {
      label: "Queue Mode",
-      help: 'Queue behavior mode: "steer", "followup", "collect", "steer-backlog", "steer+backlog", "queue", or "interrupt". Keep conservative modes unless you intentionally need aggressive interruption/backlog semantics.',
+      help: 'Queue behavior mode. "steer" injects at the next model boundary; "followup" runs later; "collect" batches later; "steer-backlog" does both; "queue" aliases steer; "interrupt" aborts the active run.',
      tags: ["advanced"],
    },
    "messages.queue.byChannel": {
@@ -28317,7 +28317,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
    },
    "messages.queue.debounceMs": {
      label: "Queue Debounce (ms)",
-      help: "Global queue debounce window in milliseconds before processing buffered inbound messages. Use higher values to coalesce rapid bursts, or lower values for reduced response latency.",
+      help: "Global followup queue debounce window in milliseconds before draining buffered inbound messages. Default is 500ms; higher values coalesce bursts, lower values reduce latency.",
      tags: ["performance"],
    },
    "messages.queue.debounceMsByChannel": {
@@ -28327,12 +28327,12 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
    },
    "messages.queue.cap": {
      label: "Queue Capacity",
-      help: "Maximum number of queued inbound items retained before drop policy applies. Keep caps bounded in noisy channels so memory usage remains predictable.",
+      help: "Maximum number of queued inbound items retained before drop policy applies. Default is 20; keep caps bounded in noisy channels so memory usage remains predictable.",
      tags: ["advanced"],
    },
    "messages.queue.drop": {
      label: "Queue Drop Strategy",
-      help: 'Drop strategy when queue cap is exceeded: "old", "new", or "summarize". Use summarize when preserving intent matters, or old/new when deterministic dropping is preferred.',
+      help: 'Drop strategy when queue cap is exceeded. "summarize" drops oldest entries but preserves compact summaries; "old" drops oldest without summaries; "new" rejects the newest item.',
      tags: ["advanced"],
    },
    "messages.inbound": {
--- a/src/config/schema.help.ts
+++ b/src/config/schema.help.ts
@@ -1635,19 +1635,19 @@ export const FIELD_HELP: Record<string, string> = {
  "messages.groupChat.visibleReplies":
    'Overrides visible source replies for group/channel conversations. "message_tool" keeps normal final replies private and requires message(action=send) for room output; "automatic" posts normal replies as before.',
  "messages.queue":
-    "Inbound message queue strategy used to buffer bursts before processing turns. Tune this for busy channels where sequential processing or batching behavior matters.",
+    "Inbound message queue strategy for messages that arrive while a session run is active. Default mode is steer, with followup fallback when steering is unavailable.",
  "messages.queue.mode":
-    'Queue behavior mode: "steer", "followup", "collect", "steer-backlog", "steer+backlog", "queue", or "interrupt". Keep conservative modes unless you intentionally need aggressive interruption/backlog semantics.',
+    'Queue behavior mode. "steer" injects at the next model boundary; "followup" runs later; "collect" batches later; "steer-backlog" does both; "queue" aliases steer; "interrupt" aborts the active run.',
  "messages.queue.byChannel":
    "Per-channel queue mode overrides keyed by provider id (for example telegram, discord, slack). Use this when one channel’s traffic pattern needs different queue behavior than global defaults.",
  "messages.queue.debounceMs":
-    "Global queue debounce window in milliseconds before processing buffered inbound messages. Use higher values to coalesce rapid bursts, or lower values for reduced response latency.",
+    "Global followup queue debounce window in milliseconds before draining buffered inbound messages. Default is 500ms; higher values coalesce bursts, lower values reduce latency.",
  "messages.queue.debounceMsByChannel":
    "Per-channel debounce overrides for queue behavior keyed by provider id. Use this to tune burst handling independently for chat surfaces with different pacing.",
  "messages.queue.cap":
-    "Maximum number of queued inbound items retained before drop policy applies. Keep caps bounded in noisy channels so memory usage remains predictable.",
+    "Maximum number of queued inbound items retained before drop policy applies. Default is 20; keep caps bounded in noisy channels so memory usage remains predictable.",
  "messages.queue.drop":
-    'Drop strategy when queue cap is exceeded: "old", "new", or "summarize". Use summarize when preserving intent matters, or old/new when deterministic dropping is preferred.',
+    'Drop strategy when queue cap is exceeded. "summarize" drops oldest entries but preserves compact summaries; "old" drops oldest without summaries; "new" rejects the newest item.',
  "messages.inbound":
    "Direct inbound debounce settings used before queue/turn processing starts. Configure this for provider-specific rapid message bursts from the same sender.",
  "messages.inbound.byChannel":