fix(qa-lab): hard gate runtime tool coverage

2026-06-06 05:51:15 +08:00 · 2026-05-18 09:42:10 +08:00
parent 73f4657869
commit 58e1351863
19 changed files with 318 additions and 41 deletions
--- a/qa/scenarios/index.md
+++ b/qa/scenarios/index.md
@@ -28,7 +28,10 @@ Coverage tracking:
 Runtime parity tiers:

 - `standard`: required Codex-vs-Pi mock gate coverage for first-hour depth and
-  default runtime-tool fixtures; selected with
+  default runtime-tool fixtures. OpenClaw dynamic integration tools in this
+  tier are hard-gated by `openclaw qa coverage --tools --summary`; Codex-native
+  workspace rows remain separately tracked until native/live behavior is the
+  asserted surface. Selected with
  `openclaw qa suite --runtime-pair pi,codex --runtime-parity-tier standard`
 - `optional`: profile-, plugin-, or external-service-dependent runtime-tool
  fixtures that stay out of the default release gate
--- a/qa/scenarios/runtime/tools/image-generate.md
+++ b/qa/scenarios/runtime/tools/image-generate.md
@@ -13,6 +13,7 @@ successCriteria:
  - Effective tools expose image_generate after QA image-generation config is applied.
  - The mock provider plans exactly one happy-path image_generate call.
  - The mock provider plans one denied-input failure-path image_generate call.
+  - Runtime parity coverage hard-fails call/result drift in the standard direct-loading gate.
 docsRefs:
  - docs/tools/image-generation.md
 codeRefs:
@@ -29,15 +30,12 @@ execution:
      actualTool: image_generate
      bucket: openclaw-dynamic-integration
      expectedLayer: openclaw-dynamic
+      capabilityLayer: openclaw-dynamic-direct
      required: true
-      tracking: "#80319"
      codexDefaultImpact: P4
      qaImpact: P1
-      action: teach fixture/mock planner Codex searchable OpenClaw dynamic tool behavior
-      reason: image_generate is an OpenClaw integration tool; QA mock provider does not yet model Codex searchable/deferred dynamic tool declarations for this fixture.
-    knownHarnessGap:
-      issue: "#80319"
-      reason: QA mock provider does not yet model Codex searchable/deferred OpenClaw dynamic tool declarations for this fixture.
+      action: hard gate in the standard direct-loading tier
+      reason: image_generate is an OpenClaw integration tool and must stay visible and callable under Pi and Codex direct runtime parity.
    promptSnippet: "target=image_generate"
    failurePromptSnippet: "failure target=image_generate"
 ```
--- a/qa/scenarios/runtime/tools/session-status.md
+++ b/qa/scenarios/runtime/tools/session-status.md
@@ -13,6 +13,7 @@ successCriteria:
  - Effective tools expose session_status.
  - The mock provider plans exactly one happy-path session_status call.
  - The mock provider plans one denied-input failure-path session_status call.
+  - Runtime parity coverage hard-fails call/result drift in the standard direct-loading gate.
 docsRefs:
  - qa/scenarios/index.md
 codeRefs:
@@ -28,15 +29,12 @@ execution:
      actualTool: session_status
      bucket: openclaw-dynamic-integration
      expectedLayer: openclaw-dynamic
+      capabilityLayer: openclaw-dynamic-direct
      required: true
-      tracking: "#80319"
      codexDefaultImpact: P4
      qaImpact: P1
-      action: teach fixture/mock planner Codex searchable OpenClaw dynamic tool behavior
-      reason: session_status is an OpenClaw integration tool; QA mock provider does not yet model Codex searchable/deferred dynamic tool declarations for this fixture.
-    knownHarnessGap:
-      issue: "#80319"
-      reason: QA mock provider does not yet model Codex searchable/deferred OpenClaw dynamic tool declarations for this fixture.
+      action: hard gate in the standard direct-loading tier
+      reason: session_status is an OpenClaw integration tool and must stay visible and callable under Pi and Codex direct runtime parity.
    promptSnippet: "target=session_status"
    failurePromptSnippet: "failure target=session_status"
 ```
--- a/qa/scenarios/runtime/tools/sessions-spawn.md
+++ b/qa/scenarios/runtime/tools/sessions-spawn.md
@@ -13,6 +13,7 @@ successCriteria:
  - Effective tools expose sessions_spawn.
  - The mock provider plans exactly one happy-path sessions_spawn call.
  - The mock provider plans one denied-input failure-path sessions_spawn call.
+  - Runtime parity coverage hard-fails call/result drift in the standard direct-loading gate.
 docsRefs:
  - qa/scenarios/index.md
 codeRefs:
@@ -28,15 +29,12 @@ execution:
      actualTool: sessions_spawn
      bucket: openclaw-dynamic-integration
      expectedLayer: openclaw-dynamic
+      capabilityLayer: openclaw-dynamic-direct
      required: true
-      tracking: "#80319"
      codexDefaultImpact: P4
      qaImpact: P1
-      action: teach fixture/mock planner Codex searchable OpenClaw dynamic tool behavior
-      reason: sessions_spawn is an OpenClaw integration tool; QA mock provider does not yet model Codex searchable/deferred dynamic tool declarations for this fixture.
-    knownHarnessGap:
-      issue: "#80319"
-      reason: QA mock provider does not yet model Codex searchable/deferred OpenClaw dynamic tool declarations for this fixture.
+      action: hard gate in the standard direct-loading tier
+      reason: sessions_spawn is an OpenClaw integration tool and must stay visible and callable under Pi and Codex direct runtime parity.
    promptSnippet: "target=sessions_spawn"
    failurePromptSnippet: "failure target=sessions_spawn"
 ```
--- a/qa/scenarios/runtime/tools/web-fetch.md
+++ b/qa/scenarios/runtime/tools/web-fetch.md
@@ -13,6 +13,7 @@ successCriteria:
  - Effective tools expose web_fetch.
  - The mock provider plans exactly one happy-path web_fetch call.
  - The mock provider plans one denied-input failure-path web_fetch call.
+  - Runtime parity coverage hard-fails call/result drift in the standard direct-loading gate.
 docsRefs:
  - qa/scenarios/index.md
 codeRefs:
@@ -28,15 +29,12 @@ execution:
      actualTool: web_fetch
      bucket: openclaw-dynamic-integration
      expectedLayer: openclaw-dynamic
+      capabilityLayer: openclaw-dynamic-direct
      required: true
-      tracking: "#80319"
      codexDefaultImpact: P4
      qaImpact: P1
-      action: teach fixture/mock planner Codex searchable OpenClaw dynamic tool behavior
-      reason: web_fetch is an OpenClaw integration tool; QA mock provider does not yet model Codex searchable/deferred dynamic tool declarations for this fixture.
-    knownHarnessGap:
-      issue: "#80319"
-      reason: QA mock provider does not yet model Codex searchable/deferred OpenClaw dynamic tool declarations for this fixture.
+      action: hard gate in the standard direct-loading tier
+      reason: web_fetch is an OpenClaw integration tool and must stay visible and callable under Pi and Codex direct runtime parity.
    promptSnippet: "target=web_fetch"
    failurePromptSnippet: "failure target=web_fetch"
 ```
--- a/qa/scenarios/runtime/tools/web-search.md
+++ b/qa/scenarios/runtime/tools/web-search.md
@@ -13,6 +13,7 @@ successCriteria:
  - Effective tools expose web_search.
  - The mock provider plans exactly one happy-path web_search call.
  - The mock provider plans one denied-input failure-path web_search call.
+  - Runtime parity coverage hard-fails call/result drift in the standard direct-loading gate.
 docsRefs:
  - qa/scenarios/index.md
 codeRefs:
@@ -28,15 +29,12 @@ execution:
      actualTool: web_search
      bucket: openclaw-dynamic-integration
      expectedLayer: openclaw-dynamic
+      capabilityLayer: openclaw-dynamic-direct
      required: true
-      tracking: "#80319"
      codexDefaultImpact: P4
      qaImpact: P1
-      action: teach fixture/mock planner Codex searchable OpenClaw dynamic tool behavior
-      reason: web_search is an OpenClaw integration tool; QA mock provider does not yet model Codex searchable/deferred dynamic tool declarations for this fixture.
-    knownHarnessGap:
-      issue: "#80319"
-      reason: QA mock provider does not yet model Codex searchable/deferred OpenClaw dynamic tool declarations for this fixture.
+      action: hard gate in the standard direct-loading tier
+      reason: web_search is an OpenClaw integration tool and must stay visible and callable under Pi and Codex direct runtime parity.
    promptSnippet: "target=web_search"
    failurePromptSnippet: "failure target=web_search"
 ```