OpenClaw Pre-Release Plugin Testing
Use this skill when the user asks for plugin release confidence, plugin lifecycle
sweeps, package-artifact plugin proof, or "what else should we test before
release?" It complements
; use that skill too when choosing
the cheapest safe runner or debugging a failing lane.
Goal
Prove the plugin system as a product surface, not just as source tests:
- bundled plugin lifecycle: install, inspect, enable, disable, uninstall
- package artifact behavior from a clean
- doctor/fix/config validation and idempotence
- config discovery and config round-trip
- status/log visibility and diagnostics
- gateway startup/bootstrap with plugin metadata snapshots
- public SDK compatibility for real external plugins
- live-ish provider/channel probes only when safe credentials exist
First Checks
From the OpenClaw repo root:
bash
pnpm docs:list
git status --short --branch
readlink node_modules
pnpm changed:lanes --json
In Codex worktrees under
,
must be a symlink to
the main OpenClaw checkout. Do not run
there. For broad or
package-heavy proof, use Blacksmith Testbox or GitHub Actions.
Runner Choice
Prefer this order:
- GitHub Package Acceptance for installable-package product proof.
ci-build-artifacts-testbox.yml
Testbox when Docker/package lanes need
seeded , , and package caches.
- Testbox for source checks, targeted Vitest,
package-boundary checks, or focused Docker lanes.
- Local targeted commands only for small format/static/unit probes.
Avoid long package Docker runs from a stale sparse worktree. If Testbox sync
reports hundreds of changed files or starts deleting package inputs, stop and
warm a fresh box from current
, or switch to Package Acceptance.
Existing Baseline
Run or verify these before inventing new coverage:
bash
OPENCLAW_TESTBOX=1 pnpm check:changed
pnpm run test:extensions:package-boundary:canary
pnpm run test:extensions:package-boundary:compile
pnpm test:docker:plugins
OPENCLAW_PLUGINS_E2E_CLAWHUB=0 pnpm test:docker:plugins
pnpm test:docker:plugin-update
pnpm test:docker:bundled-channel-deps:fast
For full bundled install/uninstall proof, shard the packaged sweep:
bash
OPENCLAW_BUNDLED_PLUGIN_SWEEP_TOTAL=8 \
OPENCLAW_BUNDLED_PLUGIN_SWEEP_INDEX=<0-7> \
pnpm test:docker:bundled-plugin-install-uninstall
Expected current packaged scope: 116 public bundled plugins over shards
.
Private QA plugins are source-mode only unless a package explicitly includes
them.
Confidence Matrix
Use this matrix for pre-release signoff. Record pass/fail, run URL/Testbox ID,
package SHA/version, and skipped-live reason.
| Surface | Proof | Preferred runner |
|---|
| Package artifact | Package Acceptance or custom lanes | GitHub Actions |
| Bundled lifecycle | 8-shard test:docker:bundled-plugin-install-uninstall
| Testbox or release Docker |
| External plugins | and | Testbox/package acceptance |
| Update no-op | test:docker:plugin-update
| Testbox/package acceptance |
| Channel runtime deps | test:docker:bundled-channel-deps:fast
plus key channels | Testbox/package acceptance |
| Doctor/fix | seeded bad configs + doctor --fix --non-interactive
| new Docker/Testbox harness |
| Config round-trip | , inspect, doctor, reload, diff hash | new Docker/Testbox harness |
| Gateway bootstrap | clean , plugin groups enabled/disabled, status JSON | new Docker/Testbox harness |
| SDK compatibility | directory, tgz, and external plugins using SDK subpaths | plus new smoke |
| Live-ish | redacted provider/channel probes only for present env | Testbox live lanes |
Package Acceptance Plan
Use this when validating a release branch, beta, or candidate package:
bash
gh workflow run package-acceptance.yml \
--repo openclaw/openclaw \
--ref main \
-f workflow_ref=main \
-f source=ref \
-f package_ref=<branch-or-sha> \
-f suite_profile=custom \
-f docker_lanes='plugins-offline plugin-update bundled-channel-deps-compat doctor-switch update-channel-switch config-reload mcp-channels npm-onboard-channel-agent' \
-f telegram_mode=mock-openai
Use
source=npm -f package_spec=openclaw@beta
for published beta proof. Keep
as trusted current harness code unless the release process says
otherwise.
New Testbox Harness Plan
If more certainty is needed, add or run a
Docker lane
that uses one package tarball and sharded plugin lists. Per plugin:
- Start with a clean .
- Capture .
- .
plugins inspect <id> --json
.
- , then assert disabled visibility.
- , except config-required plugins without config.
plugins registry --refresh
.
- .
plugins uninstall <id> --force
.
- Assert no config entry, allow/deny residue, install record, managed dir, or
bundled load path remains.
- Assert diagnostics contain no and output redacts
secret-looking values.
Keep
special: it is config-required. First assert install does
not enable it without embedding config, then run a second configured case.
Doctor/Fix Matrix
Seed bad states and require
doctor --fix --non-interactive
to repair them,
then run doctor again and require idempotence:
- stale
- stale
- stale channel config for missing channel plugin
- invalid
plugins.entries.<id>.config
- packaged bundled path in
- legacy
- disabled channel/plugin config that must not stage runtime deps
- root-owned global package tree that must remain unmodified
Gateway Bootstrap Matrix
Start packaged OpenClaw in Docker with clean state:
- provider plugins enabled, no credentials: ready with warnings, no crash
- channel plugins configured disabled: no runtime deps staged
- startup-activation plugins enabled: ready and reflected in status
- invalid single plugin config: bad plugin skipped/quarantined, others remain
Assert:
- gateway reaches ready
- includes plugin diagnostics
openclaw plugins inspect --all --json
is parseable
- package tree is not mutated
- logs contain no raw tokens
Config Round-Trip Representatives
Use representative plugin families instead of every plugin for deep config
round-trip:
- providers: , , ,
- channels: , , ,
- memory:
- feature/runtime: , ,
For each representative:
- Write config through CLI when possible.
- Read it back through or JSON.
- Run .
- Run .
- Trigger gateway config reload if applicable.
- Compare config hash before/after no-op commands.
External SDK Smoke
In a package Docker lane, create tiny external plugins and install them from:
Cover CJS and ESM shapes, plus at least one plugin importing focused
subpaths. Assert
sees its tool,
gateway method, CLI command, or service.
Live-Ish Probe Rules
Before live-ish work, source allowed env in Testbox and generate a redacted
availability matrix: present/missing only, never values.
Only run probes for credentials that exist. Prefer auth/catalog/status probes
over sending user-visible messages. If a probe might contact an external user,
channel, or workspace, stop and ask the user.
Reporting
Report in this shape:
text
package/ref:
tbx ids / run urls:
matrix:
bundled lifecycle:
package acceptance:
doctor/fix:
gateway bootstrap:
config round-trip:
sdk external:
live-ish:
failures:
skips:
next highest-value gap:
Say clearly when a failure is Testbox sync/env damage rather than product
behavior, and prove that with a clean rerun or current-main comparison.