1 Commits

Author SHA1 Message Date
Dallin Romney
5d1c4af83e docs: add CLI LTS checklist pilot 2026-06-05 16:28:48 -07:00
2 changed files with 320 additions and 0 deletions

@@ -0,0 +1,254 @@
---
title: "LTS Deterministic Checklist Plan"
summary: "Plan for turning maturity scorecard claims into auditable release-support evidence packets"
read_when:
- Designing deterministic LTS evidence packets from scorecard inputs
- Auditing CLI support claims before an LTS or stable release
- Reusing the checklist workflow for another OpenClaw surface
---
# LTS Deterministic Checklist Plan
This plan turns the current Clanker-generated maturity scorecard into an auditable release-support artifact. The goal is not to replace Clanker judgment. The goal is to force every proposed LTS claim into a repeatable evidence table that maintainers can inspect before an LTS or stable release.
## General Flow
The deterministic checklist answers one question for every proposed LTS feature:
> Can we prove this works today, and will CI or release validation catch it if it breaks?

Use this hierarchy:
```text
Surface
  -> Category
    -> Feature
      -> Evidence: docs, source, tests, CI/Testbox proof, known gaps
```
The maturity scorecard already defines the surfaces and categories. `LTS.md` marks which categories are proposed for the initial LTS slice. The checklist should not decide LTS policy directly. It should expose which promises are strongly backed, partially backed, missing proof, or require owner judgment.
## Deterministic Checklist Artifact
For each surface, produce one Markdown evidence packet:
```text
Surface: <surface name>
Snapshot:
- Scorecard ref:
- LTS ref:
- OpenClaw source ref:
- gitcrawl freshness:
- discrawl freshness:
- CI/Testbox source:
Summary:
- Included LTS categories:
- Strongly covered categories:
- Partial categories:
- Missing-proof categories:
- Owner-decision categories:
Feature Checklist:
| Category | Feature | Docs | Source | Test or proof | Latest CI/Testbox | Verdict | Gap | Next action |
```
Use these verdicts:
- `covered`: docs, source, and integration/e2e/live proof exist, and latest CI/Testbox proof is known.
- `partial`: implementation exists, but proof is unit-only, stale, platform-limited, or not tied to the user path.
- `missing`: no credible runtime-flow proof was found.
- `owner`: evidence exists, but whether it belongs in LTS is a product/support decision.

Do not let unit tests alone mark a feature as `covered`. Unit tests can support the row, but LTS coverage should be based on integration, e2e, live, or real runtime-flow proof.
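
The verdict rules above can be sketched as a small shell function. This is only an illustration: `derive_verdict` and its five arguments are hypothetical and not part of any OpenClaw tooling. It assumes that unit-only proof maps to `partial`, that no proof at all maps to `missing`, and that an owner decision only applies once some evidence exists.

```bash
# Hypothetical helper, not part of any OpenClaw tooling.
# Usage: derive_verdict <docs> <src> <proof> <ci_known> <owner_decision>
#   docs, src, ci_known, owner_decision: yes|no
#   proof: runtime (integration/e2e/live/runtime-flow), unit, or none
derive_verdict() {
  local docs="$1" src="$2" proof="$3" ci_known="$4" owner_decision="$5"

  # No implementation, or no proof of any kind: nothing credible to cite.
  if [ "$src" != "yes" ] || [ "$proof" = "none" ]; then
    echo "missing"
    return
  fi

  # Evidence exists, but LTS inclusion is a product/support call.
  if [ "$owner_decision" = "yes" ]; then
    echo "owner"
    return
  fi

  # Covered needs docs, source, runtime proof, and known latest CI/Testbox proof.
  if [ "$docs" = "yes" ] && [ "$proof" = "runtime" ] && [ "$ci_known" = "yes" ]; then
    echo "covered"
    return
  fi

  # Everything else (unit-only, stale CI, undocumented) stays partial.
  echo "partial"
}
```

For example, `derive_verdict yes yes unit yes no` prints `partial`, which matches the rule that unit tests alone never yield `covered`.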
## Agent Orchestration Model
Use Clankers as evidence collectors and reviewers.
Recommended roles:
- `surface-auditor`: builds the first checklist for one surface.
- `skeptic-reviewer`: attacks overclaims and downgrades weak rows.
- `ci-finder`: finds latest CI/Testbox proof for cited tests.
- `normalizer`: rewrites rows into the shared verdict vocabulary.

For high-risk surfaces, run two independent `surface-auditor` agents and compare disagreement. The useful output is often the conflict list, not the average answer.
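
One possible way to produce that conflict list is to compare the verdict column of two auditor tables. The sketch below is an assumption-laden illustration: `verdict_conflicts` is a hypothetical name, both files are assumed to use the nine-column checklist table, cells are assumed to contain no literal `|`, and rows are matched on the exact Category and Feature text, so differently worded feature names show up as "only in" entries.

```bash
# Hypothetical helper: list rows where two auditor tables disagree.
# Usage: verdict_conflicts auditor-a.md auditor-b.md
verdict_conflicts() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    !/^\|/ { next }                          # ignore lines outside the table
    {
      category = trim($2)
      if (category == "Category" || category ~ /^-+$/) next   # header, separator
      key = category " / " trim($3)          # Category / Feature
      verdict = trim($8)                     # seventh table column
    }
    FILENAME == ARGV[1] { first[key] = verdict; next }
    {
      seen[key] = 1
      if (!(key in first)) print key ": only in second table (" verdict ")"
      else if (first[key] != verdict) print key ": " first[key] " vs " verdict
    }
    END {
      for (k in first) if (!(k in seen)) print k ": only in first table (" first[k] ")"
    }
  ' "$1" "$2"
}
```

An empty result means the two auditors agree on every shared row. It does not mean the rows are correct, so the skeptic review still applies.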
## Standard Surface Auditor Prompt
```text
Audit only the <SURFACE> surface for the proposed LTS checklist.
Inputs:
- LTS source: docs/kevinslin/maturity-scorecard/LTS.md
- Scorecard source: docs/kevinslin/maturity-scorecard/maturity-scorecard.md
- Surface report: docs/kevinslin/maturity-scorecard/inventory/<surface-id>/report.md
- Surface score source: docs/kevinslin/maturity-scorecard/inventory/<surface-id>/scores.yaml
- Category notes: docs/kevinslin/maturity-scorecard/inventory/<surface-id>/*.md
Task:
For every category included in LTS.md for this surface:
1. Extract the user-facing features.
2. Cite docs that promise or explain the feature.
3. Cite implementation source that owns the feature.
4. Cite integration, e2e, live, or runtime-flow tests.
5. Find latest CI/Testbox proof for the cited tests when available.
6. Mark verdict as covered, partial, missing, or owner.
7. Explain the gap and the next action.
Rules:
- Do not change LTS policy.
- Do not score by vibes.
- A row without source plus runtime-flow proof is not covered.
- Unit tests alone are supporting evidence only.
- Prefer exact file paths and line references.
- Keep the final output to the checklist table plus a short gaps summary.
```
## Standard Skeptic Prompt
```text
Review this <SURFACE> LTS checklist.
Find:
- rows that overclaim coverage
- rows where unit tests are being counted as coverage
- rows where docs/source/test do not prove the same user-facing feature
- stale or missing CI/Testbox proof
- vague feature names
- categories that should be marked owner instead of covered
Return only actionable corrections:
| Row | Problem | Required correction | Severity |
```
## CLI Pilot
The CLI is the best first pilot because it is bounded, enterprise-relevant, and easier to connect to docs, source, tests, and release proof than provider or channel surfaces.
The proposed initial LTS slice includes 6 of 8 CLI categories:
- CLI Setup
- Onboarding and Auth Setup
- Gateway Service Management
- CLI Observability
- Doctor
- Updates and Upgrades

The deferred categories are:
- Plugin and Channel Setup
- Windows and WSL2

The CLI pilot should prove whether the included categories are actually backed by deterministic evidence, and whether any deferred category is obviously stronger than an included category.
## CLI Inputs To Read
Read these first:
- `docs/kevinslin/maturity-scorecard/LTS.md`
- `docs/kevinslin/maturity-scorecard/maturity-scorecard.md`
- `docs/kevinslin/maturity-scorecard/inventory/cli-install-update-onboard-doctor/report.md`
- `docs/kevinslin/maturity-scorecard/inventory/cli-install-update-onboard-doctor/scores.yaml`
- `docs/kevinslin/maturity-scorecard/inventory/cli-install-update-onboard-doctor/*.md`

Then read the product docs that match the categories:
- `docs/cli/index.md`
- `docs/cli/onboard.md`
- `docs/cli/configure.md`
- `docs/cli/doctor.md`
- `docs/cli/gateway.md`
- `docs/cli/health.md`
- `docs/cli/logs.md`
- `docs/cli/models.md`
- `docs/start/wizard-cli-automation.md`
- `docs/start/wizard-cli-reference.md`
- `docs/reference/wizard.md`

Use docs only as claims. Every claim still needs source and runtime proof.
## CLI Evidence Search Strategy
For each included category, search source and tests by command name and user workflow.
Suggested source searches:
```bash
rg -n "openclaw (onboard|configure|doctor|gateway|health|logs|models|update)" src packages ui extensions scripts test
rg -n "doctor|onboard|configure|gateway service|service install|update channel|auth profile|model set" src packages test
rg -n "program\\.command|subcommand|Command|commander|cac|yargs|parse" src packages
```
Suggested test searches:
```bash
rg -n "doctor|onboard|configure|gateway service|update|auth profile|models" --glob '*.{test,e2e.test}.ts' src packages test
rg -n "openclaw doctor|openclaw onboard|openclaw gateway|openclaw update" test scripts docs
```
Suggested proof searches:
```bash
gh run list -R openclaw/openclaw --limit 30 --json databaseId,headSha,conclusion,status,displayTitle,createdAt,url
gh pr checks <PR-or-branch> -R openclaw/openclaw
```
If local CI proof is not enough, mark the row `partial` and recommend Crabbox/Testbox proof.
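
To pin a row to a specific run and lane, the proof searches can be narrowed per workflow and then expanded per job. The following is a minimal sketch, assuming the workflow display names used in the CLI packet (for example `Package Acceptance`) match the names GitHub Actions reports. `latest_success` and `run_jobs` are hypothetical helper names.

```bash
# Hypothetical helpers wrapping the GitHub CLI; names are illustrative only.

# Print id, SHA, creation time, and URL of the latest successful run of one workflow.
latest_success() {
  local workflow="$1"
  gh run list -R openclaw/openclaw --workflow "$workflow" --status success --limit 1 \
    --json databaseId,headSha,createdAt,url \
    --jq '.[] | "\(.databaseId)\t\(.headSha)\t\(.createdAt)\t\(.url)"'
}

# Print name, conclusion, and URL for every job in one run,
# so a checklist row can cite the lane rather than the whole run.
run_jobs() {
  local run_id="$1"
  gh run view "$run_id" -R openclaw/openclaw --json jobs \
    --jq '.jobs[] | "\(.name)\t\(.conclusion)\t\(.url)"'
}
```

A typical use is `latest_success "Package Acceptance"` followed by `run_jobs <run-id>` on the reported id. Whether a given lane appears as its own job in that output depends on how the workflow is structured, which this sketch does not assume.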
## CLI Feature Table Template
Use this table as the CLI pilot output:
```markdown
| Category | Feature | Docs | Source | Test or proof | Latest CI/Testbox | Verdict | Gap | Next action |
| -------------------------- | ------------------------------------------------------------ | ---------------------------------------- | ------ | ------------- | ----------------- | ------- | -------------------------------------------------------- | ------------------------------------ |
| CLI Setup | Package install exposes `openclaw` CLI | `docs/start/getting-started.md` | TBD | TBD | TBD | partial | Need current package install smoke proof | Find release CI or add package smoke |
| Onboarding and Auth Setup | `openclaw onboard` creates usable config/auth path | `docs/cli/onboard.md` | TBD | TBD | TBD | partial | Need non-interactive and interactive proof separated | Audit onboarding tests |
| Gateway Service Management | CLI can install/start/stop/status Gateway service | `docs/cli/gateway.md` | TBD | TBD | TBD | partial | Need Linux service proof if LTS includes Linux host path | Link service tests and Testbox run |
| CLI Observability | `openclaw health` and `openclaw logs` expose operator status | `docs/cli/health.md`, `docs/cli/logs.md` | TBD | TBD | TBD | partial | Need running Gateway proof | Audit RPC/CLI e2e |
| Doctor | `openclaw doctor --fix` repairs supported config/auth drift | `docs/cli/doctor.md` | TBD | TBD | TBD | partial | Need migration fixture coverage | Audit doctor tests |
| Updates and Upgrades       | CLI runs the supported update channel flow                   | TBD                                      | TBD    | TBD           | TBD               | partial | Need release/update smoke proof                          | Find release CI/Testbox run          |
```
Replace `TBD` with exact evidence. Do not leave `TBD` in the final artifact.
## CLI Definition Of Done
The CLI pilot is done when:
- Every included CLI LTS category has at least one feature row.
- Every feature row has docs, source, test/proof, verdict, gap, and next action.
- Rows without integration/e2e/live/runtime-flow proof are marked `partial` or `missing`.
- Latest CI/Testbox evidence is linked when available.
- A skeptic review has downgraded overclaims.
- The final summary names the top 3 CLI gaps Kevin needs to know before LTS.
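
The mechanical parts of this definition (complete rows, no leftover `TBD`, verdicts from the shared vocabulary) could be checked before human review. The sketch below is a hypothetical checker, not an existing OpenClaw script. It assumes the nine-column table from the template, no literal `|` inside cells, and that unavailable CI/Testbox proof is written out in words rather than left blank. It cannot judge whether the cited proof is really runtime-flow proof, and it only flags cells that are exactly `TBD`.

```bash
# Hypothetical checker, not an existing OpenClaw script.
# Reads a 9-column feature table on stdin, prints one line per problem,
# and exits non-zero when any problem (or no feature row) is found.
check_packet() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    !/^\|/ { next }                          # ignore lines outside the table
    {
      category = trim($2)
      if (category == "Category" || category ~ /^-+$/) next   # header, separator
      rows++
      # "| a | ... | i |" splits into 11 fields: empty, 9 cells, empty.
      if (NF != 11) {
        printf "row %d: expected 9 cells, found %d\n", rows, NF - 2
        bad++
        next
      }
      for (i = 2; i <= 10; i++) {
        cell = trim($i)
        if (cell == "" || cell == "TBD") {
          printf "row %d (%s): column %d is empty or TBD\n", rows, category, i - 1
          bad++
        }
      }
      verdict = trim($8)
      if (verdict !~ /^(covered|partial|missing|owner)$/) {
        printf "row %d (%s): unknown verdict \"%s\"\n", rows, category, verdict
        bad++
      }
    }
    END {
      if (rows == 0) { print "no feature rows found"; exit 1 }
      if (bad > 0) exit 1
    }
  '
}
```

Run it as `check_packet < packet.md`, where `packet.md` stands for whichever file holds the finished table. Running it against the unfilled template is expected to fail on every `TBD` cell.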
## Likely CLI Outcomes
Expected useful outputs:
- A short list of CLI categories that are safe to keep in LTS.
- A short list of CLI categories that need one targeted integration or package smoke test.
- A short list of CLI claims that should be narrowed before announcement.
- A reusable checklist template for the next surface.

The most valuable result is not a high score. The most valuable result is a clear distinction between:
- "covered by current release gates"
- "implemented but not release-gated"
- "documented but weakly tested"
- "needs owner decision before support promise"
## Suggested First Day Plan
1. Run one `surface-auditor` on CLI.
2. Run one `ci-finder` on the cited CLI tests and release checks.
3. Run one `skeptic-reviewer` on the completed table.
4. Normalize the verdicts.
5. Send Kevin a concise packet:
   - CLI checklist
   - top 3 gaps
   - recommended test/proof additions
   - whether the checklist format should be repeated for Gateway runtime next

@@ -0,0 +1,66 @@
---
title: "CLI LTS Deterministic Checklist"
summary: "Evidence packet for proposed CLI categories in the initial OpenClaw LTS slice"
read_when:
- Auditing whether CLI setup, onboarding, gateway service management, observability, doctor, or updates are ready for an LTS support promise
- Preparing release validation gaps for CLI support claims
- Extending the deterministic LTS checklist to another surface
---
## Surface
CLI install, update, onboard, configure, gateway, health, logs, and doctor.
## Snapshot
- Scorecard ref: planned `docs/kevinslin/maturity-scorecard/maturity-scorecard.md`; not present in this checkout.
- LTS ref: planned `docs/kevinslin/maturity-scorecard/LTS.md`; not present in this checkout.
- OpenClaw source ref: this worktree, branch `lts-checklist-cli`.
- gitcrawl freshness: not checked; this packet uses local source plus live GitHub Actions run metadata.
- discrawl freshness: not checked; no Discord/operator archive claim is needed for the CLI source proof.
- CI/Testbox source: latest observed successful runs from `gh run list` on 2026-06-05:
  - Full Release Validation: `27022847039`, `tideclaw/alpha/2026-06-05-1410Z`, `2e307827be8a7aa6fb8b70e2d5e639a6d86c7ddc`, https://github.com/openclaw/openclaw/actions/runs/27022847039
  - OpenClaw Release Checks: `27023463705`, same ref/SHA, https://github.com/openclaw/openclaw/actions/runs/27023463705
  - main CI: `27032770269`, `d896a4c7a3ef033f93bc6bd6d392e299630c52c7`, https://github.com/openclaw/openclaw/actions/runs/27032770269
  - Package Acceptance: latest observed success `26917431094`, `main`, `308114e1486dc2a2409ab1d99a1e5f8e05d97b7e`, https://github.com/openclaw/openclaw/actions/runs/26917431094
## Summary
- Included LTS categories: CLI Setup; Onboarding and Auth Setup; Gateway Service Management; CLI Observability; Doctor; Updates and Upgrades.
- Strongly covered categories: none. No category can be marked `covered` without a release-run manifest that ties each row to a successful lane.
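- Partial categories: CLI Setup; Onboarding and Auth Setup; CLI Observability; Doctor; Updates and Upgrades. All six included categories have docs, implementation, tests, and at least some Docker or release-gate proof; Gateway Service Management is listed under owner-decision categories instead.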
- Missing-proof categories: no row carries the `missing` verdict, but two `partial` rows have a specific missing proof. CLI Setup lacks an isolated current package-install smoke row in this packet; CLI Observability lacks a named release lane for `openclaw health` plus `openclaw logs` together.
- Owner-decision categories: Gateway Service Management, because the support promise must name which supervisors and operating systems are in LTS scope.
## Feature Checklist
| Category | Feature | Docs | Source | Test or proof | Latest CI/Testbox | Verdict | Gap | Next action |
| -------------------------- | ----------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| CLI Setup | Package install exposes `openclaw` and the documented command tree | `docs/cli/index.md:9`, `docs/cli/index.md:21`, `docs/cli/index.md:40` | `src/cli/program/build-program.ts:10`, `src/cli/program/build-program.ts:27`, `src/cli/program.ts:4` | Package install is exercised by `scripts/e2e/npm-onboard-channel-agent-docker.sh:134` and checks `command -v openclaw` at `scripts/e2e/npm-onboard-channel-agent-docker.sh:136`; launcher/package behavior has `test/openclaw-launcher.e2e.test.ts` | Package Acceptance `26917431094` is the latest successful package workflow observed; Release Checks `27023463705` is newer but this packet did not fetch its job manifest to prove the install lane ran. | partial | The proof is embedded in larger package/onboard lanes, not a row-level install smoke tied to the latest release run. | Add or identify a release-gated `install-smoke` row that records package spec, `openclaw --version`, command tree smoke, and runner OS. |
| Onboarding and Auth Setup | `openclaw onboard` creates usable local/remote config, auth refs, workspace, and health path | `docs/cli/onboard.md:8`, `docs/cli/onboard.md:121`, `docs/cli/onboard.md:139`, `docs/start/wizard-cli-automation.md:1`, `docs/reference/wizard.md:1` | `src/commands/onboard.ts:27`, `src/commands/onboard.ts:77`, `src/commands/onboard.ts:113`, `src/commands/onboard.ts:118` | Unit/integration: `src/commands/onboard-non-interactive.gateway.test.ts:330`, `src/commands/onboard-non-interactive.gateway.test.ts:764`, `src/commands/onboard-non-interactive.gateway-health-auth.test.ts:40`; Docker flow: `scripts/e2e/onboard-docker.sh:24`, local and remote cases at `scripts/e2e/lib/onboard/scenario.sh:244`, `scripts/e2e/lib/onboard/scenario.sh:272`; package user path at `scripts/e2e/npm-onboard-channel-agent-docker.sh:149` | Package Acceptance `26917431094` covers `npm-onboard-channel-agent` by policy (`docs/ci.md:292`); OpenClaw Release Checks `27023463705` is newer and calls package lanes by policy (`docs/ci.md:306`). | partial | Onboarding has strong runtime proof, but the row still mixes local setup, remote setup, auth SecretRefs, and package channel-agent proof. | Split into local non-interactive, remote non-interactive, interactive wizard, and SecretRef auth rows, then attach latest release job names/artifacts for each. |
| Gateway Service Management | CLI can install, start, stop, restart, and report Gateway service state | `docs/cli/gateway.md:113`, `docs/cli/gateway.md:139`, `docs/cli/gateway.md:164`; legacy alias listed at `docs/cli/index.md:37` | `src/cli/gateway-cli/register.ts:20`, `src/cli/gateway-cli/register.ts:479`, `src/cli/daemon-cli/register-service-commands.ts:57`, `src/cli/daemon-cli/register-service-commands.ts:83`, `src/cli/daemon-cli/register-service-commands.ts:106`, `src/cli/daemon-cli/register-service-commands.ts:115`, `src/cli/daemon-cli/register-service-commands.ts:129` | Unit/integration: `src/commands/gateway-readiness.test.ts:92`, `src/commands/doctor-gateway-services.test.ts:351`, `src/cli/daemon-cli/register-service-commands.test.ts:1`; Docker service-entrypoint proof: `scripts/e2e/doctor-install-switch-docker.sh:1`, `scripts/e2e/lib/doctor-install-switch/scenario.sh:122`, `scripts/e2e/lib/doctor-install-switch/scenario.sh:160`, `scripts/e2e/lib/doctor-install-switch/scenario.sh:167` | Release Checks `27023463705` is the latest successful release-check run observed; docs state release checks include `doctor-switch` in package acceptance (`docs/ci.md:306`). | owner | The Linux systemd-user Docker shim proves important flows, but LTS support scope for launchd, systemd, Scheduled Tasks, WSL2, and unmanaged supervisors is a product decision. | Define which supervisors are LTS-backed, then require one release-gated service lifecycle lane per supported supervisor. |
| CLI Observability | `openclaw health` and `openclaw logs` expose operator status and logs through Gateway RPC with fallbacks | `docs/cli/health.md:8`, `docs/cli/health.md:31`, `docs/cli/logs.md:9`, `docs/cli/logs.md:58` | `src/commands/health.ts:411`, `src/commands/health.ts:634`, `src/commands/health.ts:658`, `src/cli/logs-cli.ts:100`, `src/cli/logs-cli.ts:119`, `src/cli/logs-cli.ts:146`, `src/cli/logs-cli.ts:483`, `src/gateway/server-methods/logs.ts:12`, `src/gateway/server-methods/health.ts:121` | Unit/integration: `src/commands/health.test.ts:127`, `src/commands/health.test.ts:251`, `src/cli/logs-cli.test.ts:138`, `src/cli/logs-cli.test.ts:416`, `src/cli/logs-cli.test.ts:534`; package/onboard lane checks status surfaces at `scripts/e2e/npm-onboard-channel-agent-docker.sh:170` | main CI `27032770269` is current general CI proof; no latest named release lane was found for `openclaw health` plus `openclaw logs`. | partial | Health and logs are implemented and tested, but logs proof is mostly unit-level; package runtime proof checks status surfaces, not logs tail/follow against a running Gateway. | Add a Docker release-path observability lane that starts Gateway, asserts `openclaw health --json`, `openclaw logs --json`, and `openclaw logs --follow` reconnect behavior. |
| Doctor | `openclaw doctor --fix` repairs supported config, auth, plugin, state, and service drift | `docs/cli/doctor.md:20`, `docs/cli/doctor.md:25`, `docs/cli/doctor.md:61`, `docs/cli/doctor.md:79` | `src/commands/doctor.ts:7`, `src/commands/doctor.ts:25`, `src/flows/doctor-health.ts:20`, `src/flows/doctor-health.ts:63`, `src/flows/doctor-health.ts:81` | Unit/integration: `src/commands/doctor-config-preflight.test.ts:30`, `src/commands/doctor-config-preflight.state-migration.test.ts:49`, `src/commands/doctor-auth-flat-profiles.test.ts:415`, `src/commands/doctor-gateway-daemon-flow.test.ts:503`, `src/commands/doctor-gateway-services.test.ts:480`; Docker package repair proof: `scripts/e2e/lib/doctor-install-switch/scenario.sh:152`, `scripts/e2e/lib/doctor-install-switch/scenario.sh:201`, `scripts/e2e/npm-onboard-channel-agent-docker.sh:175` | Package Acceptance `26917431094`; Release Checks `27023463705` is newer and release policy includes `doctor-switch`, `update-corrupt-plugin`, `upgrade-survivor`, `published-upgrade-survivor`, and `update-restart-auth` (`docs/ci.md:306`). | partial | Doctor is broad; no single artifact summarizes which repair families are release-gated versus unit-only. | Generate a doctor repair matrix from check IDs/repair families and map each to unit, Docker, Package Acceptance, or missing proof. |
| Updates and Upgrades | `openclaw update` supports stable/beta/dev channel switching, package updates, post-core convergence, and update status | `docs/cli/update.md:10`, `docs/cli/update.md:34`, `docs/cli/update.md:89`, `docs/cli/update.md:121`, `docs/cli/update.md:168` | `src/cli/update-cli.ts:41`, `src/cli/update-cli.ts:45`, `src/cli/update-cli.ts:50`, `src/cli/update-cli.ts:101`, `src/cli/update-cli.ts:163`; update implementation is under `src/cli/update-cli/*` and `src/infra/update-*` | Unit/integration: `src/cli/update-cli.test.ts:3307`, `src/cli/update-cli.test.ts:3971`, `src/infra/update-runner.test.ts:2272`; Docker runtime proof: `scripts/e2e/update-channel-switch-docker.sh:1`, package-to-git and git-to-package assertions at `scripts/e2e/update-channel-switch-docker.sh:100`, `scripts/e2e/update-channel-switch-docker.sh:116`; upgrade survivor proof under `scripts/e2e/lib/upgrade-survivor/run.sh:1113` and `scripts/e2e/lib/upgrade-survivor/run.sh:1241` | Release Checks `27023463705`; Package Acceptance policy includes `update-channel-switch`, `upgrade-survivor`, `published-upgrade-survivor`, `update-restart-auth`, and plugin update lanes (`docs/ci.md:294`, `docs/ci.md:306`). | partial | This is close to covered for package/update flows, but row-level proof still lacks latest job artifact/job names and exact package spec. | Pull the latest Release Checks child Package Acceptance job summary and record lane-level outcomes for `update-channel-switch`, `upgrade-survivor`, `published-upgrade-survivor`, and `update-restart-auth`. |
## Deferred Category Check
- Plugin and Channel Setup appears at least as strongly release-gated as some included categories: Package Acceptance `smoke` includes `npm-onboard-channel-agent`, and package profile includes `skill-install`, `plugins-offline`, and `plugin-update` (`docs/ci.md:292`). If CLI LTS promises include post-onboard channel setup, this category may deserve inclusion or a narrow split.
- Windows and WSL2 have release-check hooks through cross-OS release checks (`.github/workflows/openclaw-release-checks.yml:24`, `.github/workflows/openclaw-release-checks.yml:66`) and Windows/package notes in `docs/ci.md:306`, but this packet did not audit Windows-specific source/tests. Keep deferred until a dedicated cross-OS evidence packet exists.
## Skeptic Review Corrections
| Row | Problem | Required correction | Severity |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------- | -------- |
| CLI Setup | Package install proof is only inferred from package/onboard lanes. | Keep `partial` until latest release job artifacts show an install smoke or add a standalone install smoke. | Medium |
| Onboarding and Auth Setup | Combines too many behaviors in one row; SecretRef and interactive wizard could regress independently. | Split local non-interactive, remote non-interactive, interactive, and SecretRef rows. | Medium |
| Gateway Service Management | Current strongest Docker proof uses systemd shims; launchd and Scheduled Tasks support is not equivalently proven here.  | Mark `owner`; define OS/supervisor LTS scope before upgrading to `covered`.                                | High     |
| CLI Observability | Unit proof is being asked to carry logs tail/follow support. | Add runtime Gateway logs proof or keep `partial`. | High |
| Doctor | Doctor has many repair families; a single row can hide unproven repairs. | Matrix repair families and mark each separately. | Medium |
| Updates and Upgrades | Release policy strongly covers this area, but latest lane-level artifacts were not inspected. | Fetch latest Release Checks child job summary before calling it `covered`. | Medium |
## Top Gaps For LTS
1. **Lane-level release evidence**: the current release workflows cover many CLI user paths, but the LTS artifact needs child job IDs, lane names, package specs, and artifacts for each row.
2. **Observability runtime proof**: add a Gateway-backed Docker lane for `health` and `logs`, especially `logs --follow` reconnect/journal behavior.
3. **Service support boundary**: decide whether LTS means Linux systemd user only, macOS launchd, native Windows Scheduled Tasks, WSL2, or all of them, then require one real runtime proof per supported supervisor.