openclaw-testing

OpenClaw Testing

Use this skill when deciding what to test, debugging failures, rerunning CI, or validating a change without wasting hours.

Read First

- `docs/reference/test.md` for local test commands.
- `docs/ci.md` for CI scope, release checks, Docker chunks, and runner behavior.
- Scoped `AGENTS.md` files before editing code under a subtree.

Default Rule

Prove the touched surface first. Do not reflexively run the whole suite.

1. Inspect the diff and classify the touched surface:
   - source: `pnpm changed:lanes --json`, then `pnpm check:changed`
   - tests only: `pnpm test:changed`
   - one failing file: `pnpm test <path-or-filter> -- --reporter=verbose`
   - workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
   - docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
2. Reproduce narrowly before fixing.
3. Fix root cause.
4. Rerun the same narrow proof.
5. Broaden only when the touched contract demands it.
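
As a rough illustration of the classification step, the sketch below maps changed paths to one of the surfaces listed above and prints the matching narrow command. The function names `classify_surface` and `suggest_proof` and the path patterns are assumptions made for this example, not the repo's real lane rules; `pnpm changed:lanes --json` remains the authoritative answer.

```bash
# Hypothetical helper: classify a list of changed paths.
# The patterns below are illustrative assumptions only.
classify_surface() {
  local path has_source=0 has_tests=0 has_workflow=0 has_docs=0
  for path in "$@"; do
    case "$path" in
      .github/workflows/*) has_workflow=1 ;;
      docs/*)              has_docs=1 ;;
      *.test.ts)           has_tests=1 ;;
      *)                   has_source=1 ;;
    esac
  done
  # Any source change wins: it needs the lane/check proof.
  if [ "$has_source" -eq 1 ]; then echo "source"; return; fi
  case "$has_tests$has_workflow$has_docs" in
    100) echo "tests-only" ;;
    010) echo "workflow-only" ;;
    001) echo "docs-only" ;;
    000) echo "unknown" ;;
    *)   echo "mixed" ;;
  esac
}

# Print (do not run) the narrow proof for the classified surface.
suggest_proof() {
  case "$(classify_surface "$@")" in
    source)        echo "pnpm changed:lanes --json && pnpm check:changed" ;;
    tests-only)    echo "pnpm test:changed" ;;
    workflow-only) echo "git diff --check" ;;
    docs-only)     echo "pnpm docs:list" ;;
    *)             echo "inspect the diff manually" ;;
  esac
}
```

One way to feed it is `suggest_proof $(git diff --name-only origin/main...HEAD)`, assuming the changed paths contain no whitespace.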

Guardrails

- Do not kill unrelated processes or tests. If something is running elsewhere, treat it as owned by the user or another agent.
- Do not run expensive local Docker, full release checks, full `pnpm test`, or full `pnpm check` unless the user asks or the change genuinely requires it.
- Prefer GitHub Actions for release/Docker proof when the workflow already has the prepared image and secrets.
- Use `scripts/committer "<msg>" <paths...>` when committing; stage only your files.
- If deps are missing, run `pnpm install`, retry once, then report the first actionable error.
- For Blacksmith Testbox proof, reuse only an id warmed and claimed in this operator session. `blacksmith testbox list` is diagnostics only; a listed id can have a local key and still carry stale rsync state from another lane. After warmup, run `pnpm testbox:claim --id <id>`, then prefer `pnpm testbox:run --id <id> -- "<command>"` for OpenClaw gates so stale org-visible ids fail fast before syncing. Claims older than 12 hours are stale unless `OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES` is explicitly set for long work.
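
To make the 12-hour rule concrete, here is a minimal sketch of a staleness check. It assumes the claim time is available as a Unix epoch in seconds, which this document does not specify; the function name `claim_is_stale` is hypothetical, and only `OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES` comes from the text above.

```bash
# Hypothetical staleness check for a Testbox claim.
# $1: claim time as Unix epoch seconds (assumed representation)
# $2: "now" as Unix epoch seconds (defaults to the current time)
claim_is_stale() {
  local claimed_at="$1" now="${2:-$(date +%s)}"
  # Default TTL is 12 hours = 720 minutes unless the variable is set.
  local ttl_minutes="${OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES:-720}"
  local age_seconds=$(( now - claimed_at ))
  # Stale only when strictly older than the TTL.
  [ "$age_seconds" -gt $(( ttl_minutes * 60 )) ]
}
```

A caller would warm and claim a fresh id when `claim_is_stale` succeeds rather than reusing the old one.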

Local Test Shortcuts

```bash
pnpm changed:lanes --json
pnpm check:changed       # changed typecheck/lint/guards; no Vitest
pnpm test:changed        # cheap smart changed Vitest targets
OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed
pnpm test <path-or-filter> -- --reporter=verbose
OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test <path-or-filter>
```

Use targeted file paths whenever possible. Avoid raw `vitest`; use the repo `pnpm test` wrapper so project routing, workers, and setup stay correct.

Command Semantics

- `pnpm check` and `pnpm check:changed` do not run Vitest tests. They are for typecheck, lint, and guard proof.
- `pnpm test` and `pnpm test:changed` run Vitest tests.
- `pnpm test:changed` is intentionally cheap by default: direct test edits, sibling tests, explicit source mappings, and import-graph dependents.
- `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` is the explicit broad fallback for harness/config/package edits that genuinely need it.
- Do not run extension sweeps just because core changed. If a core edit is for a specific plugin bug, run that plugin's tests explicitly. If a public SDK or contract change needs consumer proof, choose the smallest representative plugin/contract tests first, then broaden only when the risk justifies it.
- The test wrapper prints a short `[test] passed|failed|skipped ... in ...` line. Vitest's own duration is still the per-shard detail.

Routing Model

- `pnpm changed:lanes --json` answers "which check lanes does this diff touch?" It is used by `pnpm check:changed` for typecheck/lint/guard selection.
- `pnpm test:changed` answers "which Vitest targets are worth running now?" It uses the same changed path list, but applies a cheaper test-target resolver.
- Direct test edits run themselves. Source edits prefer explicit mappings, sibling `*.test.ts`, then import-graph dependents. Shared harness/config/root edits are skipped by default unless they have precise mapped tests.
- Shared group-room delivery config and source-reply prompt edits are precise mapped tests: they run the core auto-reply regressions plus Discord and Slack delivery tests so cross-channel default changes fail before a PR push.
- Public SDK or contract edits do not automatically run every plugin test. `check:changed` proves extension type contracts; the agent chooses the smallest plugin/contract Vitest proof that matches the actual risk.
- Use `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` only when a harness, config, package, or unknown-root edit really needs the broad Vitest fallback.
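
The sibling-test step of that resolver can be pictured with a small sketch. This is not the real resolver, which also consults explicit mappings and the import graph; `sibling_test_for` is a hypothetical name used only to show how a source path maps to its candidate `*.test.ts` neighbour.

```bash
# Hypothetical sketch of the "sibling *.test.ts" rule for one changed file.
sibling_test_for() {
  local source_path="$1"
  case "$source_path" in
    *.test.ts) echo "$source_path" ;;                # direct test edits run themselves
    *.ts)      echo "${source_path%.ts}.test.ts" ;;  # candidate sibling test
    *)         return 1 ;;                           # no sibling rule for other files
  esac
}

# Usage: the candidate may not exist, so check before running it.
#   target="$(sibling_test_for src/foo/bar.ts)" && [ -f "$target" ] \
#     && pnpm test "$target" -- --reporter=verbose
```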

CI Debugging

Start with current run state, not logs for everything:

```bash
gh run list --branch main --limit 10
gh run view <run-id> --json status,conclusion,headSha,url,jobs
gh run view <run-id> --job <job-id> --log
```

- Check exact SHA. Ignore newer unrelated `main` unless asked.
- For cancelled same-branch runs, confirm whether a newer run superseded it.
- Fetch full logs only for failed or relevant jobs.
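
The "check exact SHA" step can be scripted along these lines. The helper name `sha_matches` is an assumption for illustration; the `gh run view --json headSha --jq` call in the usage comment relies on the GitHub CLI's standard JSON output options and needs an authenticated `gh`.

```bash
# Compare a run's head SHA with the commit under investigation.
# Accepts an abbreviated expected SHA (prefix match). Illustrative helper only.
sha_matches() {
  local expected="$1" actual="$2"
  [ -n "$expected" ] && [ "${actual#"$expected"}" != "$actual" ]
}

# Usage:
#   head_sha="$(gh run view <run-id> --json headSha --jq .headSha)"
#   sha_matches "$(git rev-parse HEAD)" "$head_sha" \
#     || echo "run belongs to a different commit; do not debug it"
```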

GitHub Release Workflows

Use the smallest workflow that proves the current risk. The full umbrella is available, but it is usually the last step after narrower proof, not the first rerun after a focused patch.

Full Release Validation

`Full Release Validation` (`.github/workflows/full-release-validation.yml`) is the manual "everything before release" umbrella. It resolves a target ref, then dispatches:

- manual `CI` for the full normal CI graph, with Android enabled via `include_android=true`
- `Plugin Prerelease` for release-only plugin static checks, extension shards, the release-only `agentic-plugins` shard, and plugin product Docker lanes
- `OpenClaw Release Checks` for install smoke, cross-OS release checks, live and E2E checks, Docker release-path suites, OpenWebUI, QA Lab, fast Matrix, and Telegram release lanes
- optional post-publish Telegram E2E when a package spec is supplied

Run it only when validating an actual release candidate, after broad shared CI or release orchestration changes, or when explicitly asked:

```bash
gh workflow run full-release-validation.yml \
  --repo openclaw/openclaw \
  --ref main \
  -f ref=<branch-or-sha> \
  -f provider=openai \
  -f mode=both \
  -f release_profile=stable
```

Run the workflow itself from the trusted current ref, normally `--ref main`; child workflows are dispatched from that same ref even when `ref` points at an older release branch or tag. Full Release Validation has no separate child workflow ref input; choose the trusted harness by choosing the workflow run ref. Use `release_profile=minimum|stable|full` to control live/provider breadth: `minimum` keeps the fastest OpenAI/core release-critical set, `stable` adds the stable provider/backend set, and `full` adds the broad advisory provider/media matrix. Do not make `full` faster by silently dropping suites; optimize setup, artifact reuse, and sharding instead. The parent verifier job appends a child overview plus slowest-job tables for child runs; rerun only that verifier after a child rerun turns green.

Standalone manual `CI` dispatches do not run the plugin prerelease suite, the extension batch sweep, or the release-only `agentic-plugins` Vitest shard. Those lanes are intentionally reserved for the separate `Plugin Prerelease` child so PRs, main pushes, and ad hoc broad CI checks do not spend Docker/package time or all-plugin runtime time on release-only product coverage.

If a full run is already active on a newer `origin/main`, prefer watching that run over dispatching a duplicate. Do not cancel release, release-check, or child workflow runs unless Peter explicitly asks for cancellation.

The child-dispatch jobs record the child run ids. The final `Verify full validation` job re-queries those child runs and is the canonical parent gate. If a child workflow failed but was later rerun successfully, rerun only the failed parent verifier job; do not dispatch a new full umbrella unless the release evidence is stale.

For bounded recovery after a focused fix, pass `-f rerun_group=<group>`. Supported umbrella groups are `all`, `ci`, `plugin-prerelease`, `release-checks`, `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, `qa-live`, and `npm-telegram`. Use the narrowest group that covers the failed box. After a targeted release-check fix, do not restart the full umbrella by habit: dispatch the matching `rerun_group` and rerun only the parent verifier/evidence step after the child is green unless the release evidence is stale. For a single failed live/E2E shard, use `-f rerun_group=live-e2e -f live_suite_filter=<suite_id>` so the Blacksmith workflow only spends setup and queue time on that suite.
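
A guard like the following can catch a mistyped group before a dispatch is queued. It is a sketch under assumptions: the helper names are hypothetical, the group list is copied from the paragraph above, and the printed command carries only `ref` and `rerun_group`, so other inputs from the earlier dispatch example may still be needed.

```bash
# Umbrella rerun groups as listed in this section.
UMBRELLA_RERUN_GROUPS="all ci plugin-prerelease release-checks install-smoke cross-os live-e2e package qa qa-parity qa-live npm-telegram"

# Succeeds when $1 is one of the documented umbrella groups.
is_umbrella_rerun_group() {
  local candidate="$1" group
  for group in $UMBRELLA_RERUN_GROUPS; do
    [ "$group" = "$candidate" ] && return 0
  done
  return 1
}

# Print (do not run) a bounded rerun dispatch for review.
print_rerun_command() {
  local ref="$1" group="$2"
  is_umbrella_rerun_group "$group" || { echo "unknown rerun_group: $group" >&2; return 1; }
  echo "gh workflow run full-release-validation.yml --repo openclaw/openclaw --ref main -f ref=$ref -f rerun_group=$group"
}
```

Printing rather than executing keeps the final dispatch decision with the operator.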

Release Evidence

After release-candidate validation or before a release decision, record the important run ids in the private `openclaw/releases-private` evidence ledger. Use the manual `OpenClaw Release Evidence` (`openclaw-release-evidence.yml`) workflow there. It writes durable summaries under `evidence/<release-id>/` and commits:

- `release-evidence.md`
- `release-evidence.json`
- `index.json`
- `runs/<label>.json`

Use one run per line:

```text
full-release-validation openclaw/openclaw <run-id> blocking
package-acceptance openclaw/openclaw <run-id> blocking
release-checks openclaw/openclaw <run-id> blocking
```

Store summaries, run URLs, artifact metadata, timings, pass/fail state, and short release-manager notes there. Do not store raw logs, provider prompts/responses, channel transcripts, signing material, or secret-bearing config in git; raw logs stay in Actions artifacts.

When `Full Release Validation` completes and `OPENCLAW_RELEASES_PRIVATE_DISPATCH_TOKEN` is configured in the public repo, it requests the private `OpenClaw Release Evidence From Full Validation` workflow. That private workflow reads the parent full-validation run, extracts the child CI/release-checks/Telegram run ids from the parent logs, and opens the evidence PR automatically. If the token is absent or the run predates this wiring, trigger that private workflow manually with the full-validation run id.
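
Before pasting run lines into the evidence workflow, a quick format check can catch slips. The sketch below assumes four whitespace-separated fields per line (label, `owner/repo`, numeric run id, status). Only `blocking` appears in the example in this section; accepting `advisory` as a second status is an assumption, and `validate_evidence_lines` is a hypothetical name.

```bash
# Validate "one run per line" input read from stdin.
# Assumed shape: <label> <owner/repo> <numeric-run-id> <blocking|advisory>
validate_evidence_lines() {
  awk '
    NF == 0 { next }   # ignore blank lines
    NF != 4 || $2 !~ "^[^/]+/[^/]+$" || $3 !~ "^[0-9]+$" || ($4 != "blocking" && $4 != "advisory") {
      printf "line %d: invalid evidence entry\n", NR
      bad = 1
    }
    END { exit bad + 0 }
  '
}

# Usage: validate_evidence_lines < runs.txt && echo "ok to submit"
```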

Release Checks

`OpenClaw Release Checks` (`openclaw-release-checks.yml`) is the release child workflow. It is broader than normal CI but narrower than the umbrella because it does not dispatch the separate full normal CI child. It runs Package Acceptance with artifact-native delta lanes and `telegram_mode=mock-openai`, so the release package tarball also goes through offline plugin proof, bundled-channel compat, and Telegram package QA. The Docker release-path chunks cover the overlapping package/update/plugin lanes. Use it when release-path validation is needed without rerunning the entire umbrella.

```bash
gh workflow run openclaw-release-checks.yml \
  --repo openclaw/openclaw \
  --ref main \
  -f ref=<branch-or-sha> \
  -f provider=openai \
  -f mode=both \
  -f release_profile=stable \
  -f rerun_group=all
```

Release-check rerun groups are `all`, `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, and `qa-live`.

`OpenClaw Release Checks` uses the trusted workflow ref to resolve the selected ref once as `release-package-under-test` and passes that artifact into cross-OS release checks, release-path Docker live/E2E checks, and Package Acceptance. When `Full Release Validation` dispatches release checks, it passes the requested branch/tag plus an `expected_sha` so branch/tag refs resolve through the fast remote-ref path while the package and QA jobs still validate the exact SHA.

The full install-smoke child is split on purpose: one job prepares or reuses the target-SHA GHCR root Dockerfile smoke image, QR package install runs in its own job, root Dockerfile/gateway smokes pull the prepared image, and installer/Bun smokes pull the same image while building only their small installer images. If install-smoke gets slow again, first check whether the root image was reused or rebuilt before adding/removing coverage.

The full-profile native live media shards use the prebuilt `ghcr.io/openclaw/openclaw-live-media-runner:ubuntu-24.04` container so `ffmpeg` and `ffprobe` are already present. If those jobs suddenly spend minutes in dependency setup again, first check the `Live Media Runner Image` workflow and the `Verify preinstalled live media dependencies` step before assuming the media tests themselves slowed down.

The release Docker path intentionally shards the plugin/runtime tail. The workflow uses `plugins-runtime-plugins`, `plugins-runtime-services`, and `plugins-runtime-install-a` through `plugins-runtime-install-d`; aggregate aliases such as `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` remain for manual reruns.

The release QA parity box is internally split into candidate and baseline lane jobs, followed by a report job that downloads both artifacts and runs `pnpm openclaw qa parity-report`. For parity failures, inspect the failed lane first; inspect the report job when both lane summaries exist but the comparison fails.

QA Lab Matrix Profiles

`pnpm openclaw qa matrix` defaults to `--profile all`. Do not assume the CLI default is the fast release path. Use explicit profiles:

- `--profile fast`: release-critical Matrix transport contract; add `--fail-fast` only when the target CLI supports it
- `--profile transport|media|e2ee-smoke|e2ee-deep|e2ee-cli`: sharded full Matrix proof
- `OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000`: CI-friendly no-reply quiet window when paired with fast or sharded gates

`QA-Lab - All Lanes` uses explicit fast Matrix on scheduled runs; manual dispatch keeps `matrix_profile=all` as the default and always shards that full Matrix selection.

`OpenClaw Release Checks` uses explicit fast Matrix; run the all-lanes workflow when release investigation needs full Matrix media/E2EE inventory.
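
For a local look at what the sharded full proof involves, the sketch below prints one command per sharded profile named above, paired with the CI-friendly quiet window. The function name is hypothetical and the commands are echoed, not executed, so the operator can pick the single shard that matches the failure.

```bash
# Print one Matrix command per sharded profile (echo only, nothing runs).
print_matrix_shard_commands() {
  local profile
  for profile in transport media e2ee-smoke e2ee-deep e2ee-cli; do
    echo "OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000 pnpm openclaw qa matrix --profile $profile"
  done
}

# Usage: print_matrix_shard_commands | grep -- '--profile media'
```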

Reusable Live/E2E Checks

`OpenClaw Live And E2E Checks (Reusable)` (`openclaw-live-and-e2e-checks-reusable.yml`) is the preferred entry point for targeted live, Docker, model, and E2E proof. Inputs let you turn off unrelated lanes:

```bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
  --repo openclaw/openclaw \
  --ref main \
  -f ref=<sha> \
  -f include_repo_e2e=false \
  -f include_release_path_suites=false \
  -f include_openwebui=false \
  -f include_live_suites=true \
  -f live_models_only=true \
  -f live_model_providers=fireworks
```

Useful knobs:

- `docker_lanes='<lane[,lane]>'`: run selected Docker scheduler lanes against prepared artifacts instead of the release chunk matrix. Multiple selected lanes fan out as parallel targeted Docker jobs after one shared package/image preparation step.
- `include_live_suites=false`: skip live/provider suites when testing Docker scheduler or release packaging only.
- `live_models_only=true`: run only Docker live model coverage.
- `live_model_providers=fireworks` (or comma/space separated providers): run one targeted Docker live model job instead of the full provider matrix.
- blank `live_model_providers`: run the full live-model provider matrix.

Release-path Docker chunks are currently `core`, `package-update-openai`, `package-update-anthropic`, `package-update-core`, `plugins-runtime-plugins`, `plugins-runtime-services`, `plugins-runtime-install-a`, `plugins-runtime-install-b`, `plugins-runtime-install-c`, `plugins-runtime-install-d`, `bundled-channels-core`, `bundled-channels-update-a`, `bundled-channels-update-b`, and `bundled-channels-contracts`. The aggregate `bundled-channels`, `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` chunks remain valid for manual one-shot reruns, but release checks use the split chunks.

When live suites are enabled, the workflow shards broad native `pnpm test:live` coverage through `scripts/test-live-shard.mjs` instead of one serial `live-all` job:

- `native-live-src-agents`
- `native-live-src-gateway-core`
- `native-live-src-gateway-profiles` (release CI runs this with provider filters such as `OPENCLAW_LIVE_GATEWAY_PROVIDERS=anthropic`)
- `native-live-src-gateway-backends`
- `native-live-test`
- `native-live-extensions-a-k`
- `native-live-extensions-l-n`
- `native-live-extensions-openai`
- `native-live-extensions-o-z`
- `native-live-extensions-o-z-other`
- `native-live-extensions-xai`
- `native-live-extensions-media`
- `native-live-extensions-media-audio`
- `native-live-extensions-media-music`
- `native-live-extensions-media-music-google`
- `native-live-extensions-media-music-minimax`
- `native-live-extensions-media-video`

Use `node scripts/test-live-shard.mjs <shard> --list` to see the exact files before rerunning a failed native live shard. The aggregate `o-z` and `media` shards remain useful locally; release CI uses the smaller provider/media shards so one live-provider flake does not force a broad native live rerun.

For model-list or provider-selection fixes, use `live_models_only=true` plus the specific `live_model_providers` allowlist. Confirm logs show the expected `OPENCLAW_LIVE_PROVIDERS` and selected model ids before declaring proof.
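
A provider-targeted dispatch can be assembled as in the sketch below. It normalizes a comma- or space-separated provider list into a single comma-separated value and prints the command for review. Both helper names are hypothetical; the input names come from the dispatch example earlier in this section, and passing a comma-separated list relies on the "comma/space separated providers" note in the knobs list.

```bash
# Turn "fireworks, openai  xai" into "fireworks,openai,xai".
normalize_providers() {
  printf '%s\n' "$1" | tr ', ' '\n\n' | sed '/^$/d' | paste -sd, -
}

# Print (do not run) a live-models-only dispatch for one SHA.
print_live_model_dispatch() {
  local sha="$1" providers
  providers="$(normalize_providers "$2")"
  echo "gh workflow run openclaw-live-and-e2e-checks-reusable.yml --repo openclaw/openclaw --ref main -f ref=$sha -f include_repo_e2e=false -f include_release_path_suites=false -f include_openwebui=false -f include_live_suites=true -f live_models_only=true -f live_model_providers=$providers"
}
```

After the run finishes, the log check described above still applies: the printed command only selects the lanes, it does not prove which providers actually ran.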

Docker

Docker is expensive. First inspect the scheduler without running Docker:

```bash
OPENCLAW_DOCKER_ALL_DRY_RUN=1 pnpm test:docker:all
OPENCLAW_DOCKER_ALL_DRY_RUN=1 OPENCLAW_DOCKER_ALL_LANES=install-e2e pnpm test:docker:all
OPENCLAW_DOCKER_ALL_LANES=install-e2e node scripts/test-docker-all.mjs --plan-json
```

Run one failed lane locally only when explicitly asked or when GitHub is not usable:

```bash
OPENCLAW_DOCKER_ALL_LANES=<lane> \
OPENCLAW_DOCKER_ALL_BUILD=0 \
OPENCLAW_DOCKER_ALL_PREFLIGHT=0 \
OPENCLAW_SKIP_DOCKER_BUILD=1 \
OPENCLAW_DOCKER_E2E_BARE_IMAGE='<prepared-bare-image>' \
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE='<prepared-functional-image>' \
pnpm test:docker:all
```

For release validation, prefer the reusable GitHub workflow input:

```yaml
docker_lanes: install-e2e
```

Multiple lanes are allowed:

```yaml
docker_lanes: install-e2e bundled-channel-update-acpx
```

That skips the release chunk matrix and runs one targeted Docker job against the prepared GHCR images and the selected package artifact. Rerun commands generated inside GitHub artifacts include `package_artifact_run_id`, `package_artifact_name`, `docker_e2e_bare_image`, and `docker_e2e_functional_image` when available, so failed lanes can reuse the exact tarball and prepared images from the failed run. When the fix changes package contents, omit those reuse inputs so the workflow packs a new tarball. Live-only targeted reruns skip the E2E images and build only the live-test image. Release-path normal mode fans out into smaller Docker chunk jobs:

- `core`
- `package-update-openai`
- `package-update-anthropic`
- `package-update-core`
- `plugins-runtime-plugins`
- `plugins-runtime-services`
- `plugins-runtime-install-a`
- `plugins-runtime-install-b`
- `plugins-runtime-install-c`
- `plugins-runtime-install-d`
- `bundled-channels`

OpenWebUI is folded into `plugins-runtime-services` for full release-path coverage and keeps a standalone `openwebui` chunk only for OpenWebUI-only dispatches. The legacy `package-update`, `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` chunks still work as aggregate aliases for manual reruns, but the release workflow uses the split chunks so provider installer checks, plugin runtime checks, bundled plugin install/uninstall shards, and bundled-channel checks can run on separate machines.

The bundled-channel runtime-dependency coverage inside `bundled-channels` uses the split `bundled-channel-*` and `bundled-channel-update-*` lanes rather than the serial `bundled-channel-deps` lane, so failures produce cheap targeted reruns for the exact channel/update scenario. The bundled plugin install/uninstall sweep is also split into `bundled-plugin-install-uninstall-0` through `bundled-plugin-install-uninstall-7`; selecting the legacy `bundled-plugin-install-uninstall` lane expands to all eight shards.
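
The legacy-lane expansion described above can be pictured as follows. This is only an illustration of the naming scheme, not the scheduler's code; `expand_docker_lane` is a hypothetical helper, and every other lane name is passed through unchanged.

```bash
# Expand the legacy install/uninstall lane into its eight shards (-0 .. -7).
expand_docker_lane() {
  local lane="$1" shard
  if [ "$lane" = "bundled-plugin-install-uninstall" ]; then
    for shard in 0 1 2 3 4 5 6 7; do
      echo "bundled-plugin-install-uninstall-$shard"
    done
  else
    echo "$lane"
  fi
}

# Usage: rerun only the shard that failed instead of the legacy aggregate, e.g.
#   OPENCLAW_DOCKER_ALL_DRY_RUN=1 \
#   OPENCLAW_DOCKER_ALL_LANES=bundled-plugin-install-uninstall-3 pnpm test:docker:all
```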

Docker运行成本较高。首先在不运行Docker的情况下检查调度器：

bash

OPENCLAW_DOCKER_ALL_DRY_RUN=1 pnpm test:docker:all
OPENCLAW_DOCKER_ALL_DRY_RUN=1 OPENCLAW_DOCKER_ALL_LANES=install-e2e pnpm test:docker:all
OPENCLAW_DOCKER_ALL_LANES=install-e2e node scripts/test-docker-all.mjs --plan-json

仅当被明确要求或GitHub不可用时，才在本地运行单个失败的流水线：

bash

OPENCLAW_DOCKER_ALL_LANES=<lane> \
OPENCLAW_DOCKER_ALL_BUILD=0 \
OPENCLAW_DOCKER_ALL_PREFLIGHT=0 \
OPENCLAW_SKIP_DOCKER_BUILD=1 \
OPENCLAW_DOCKER_E2E_BARE_IMAGE='<prepared-bare-image>' \
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE='<prepared-functional-image>' \
pnpm test:docker:all

对于发布验证，优先使用可重用的GitHub工作流输入：

yaml

docker_lanes: install-e2e

允许多个流水线：

yaml

docker_lanes: install-e2e bundled-channel-update-acpx

这会跳过发布分片矩阵，并针对准备好的GHCR镜像和选定的包工件运行一个针对性Docker作业。GitHub工件中生成的重新运行命令包括

package_artifact_run_id

、

package_artifact_name

、

docker_e2e_bare_image

和

docker_e2e_functional_image

（如果可用），因此失败的流水线可以重用失败运行中的精确tarball和准备好的镜像。如果修复变更了包内容，请省略这些重用输入，以便工作流打包新的tarball。仅实时的针对性重新运行会跳过E2E镜像，仅构建实时测试镜像。发布路径正常模式会展开为更小的Docker分片作业：

```
core
```
```
package-update-openai
```
```
package-update-anthropic
```
```
package-update-core
```
```
plugins-runtime-plugins
```
```
plugins-runtime-services
```
```
plugins-runtime-install-a
```
```
plugins-runtime-install-b
```
```
plugins-runtime-install-c
```
```
plugins-runtime-install-d
```
```
bundled-channels
```

OpenWebUI被纳入 `plugins-runtime-services` 以实现完整的发布路径覆盖，仅在OpenWebUI专属调度时保留独立的 `openwebui` 分片。旧的 `package-update`、`plugins-runtime-core`、`plugins-runtime` 和 `plugins-integrations` 分片仍可作为聚合别名用于手动重新运行，但发布工作流使用拆分后的分片，这样提供商安装程序检查、插件运行时检查、捆绑插件安装/卸载分片和捆绑渠道检查可以在不同机器上运行。`bundled-channels` 内的捆绑渠道运行时依赖覆盖使用拆分后的 `bundled-channel-*` 和 `bundled-channel-update-*` 流水线，而不是串行的 `bundled-channel-deps` 流水线，因此失败会产生针对确切渠道/更新场景的低成本重新运行。捆绑插件安装/卸载扫描也拆分为 `bundled-plugin-install-uninstall-0` 到 `bundled-plugin-install-uninstall-7`；选择旧的 `bundled-plugin-install-uninstall` 流水线会展开为所有8个分片。

Package Acceptance

包验收

Use the manual `Package Acceptance` workflow when the question is "does this installable package work as a product?" rather than "does this source diff pass Vitest?"

In release validation, treat Package Acceptance as the package-candidate shard inside the larger release umbrella, not as a competing full-test path. Full Release Validation and private release gauntlets should call Package Acceptance for tarball resolution, Docker product/package proof, and optional Telegram QA against the same resolved `package-under-test` artifact; keep orchestration, secret policy, blocking/advisory status, and evidence rollup in the caller.

Good defaults:

```bash
gh workflow run package-acceptance.yml --ref main \
  -f source=npm \
  -f workflow_ref=main \
  -f package_spec=openclaw@beta \
  -f suite_profile=product \
  -f telegram_mode=mock-openai
```

npm candidate selection:

Resolve the registry immediately before dispatch: `npm view openclaw dist-tags --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$` and `npm view openclaw@beta version dist.tarball dist.integrity --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$`.

If Peter asks for "latest beta", use
```
source=npm
```
with
```
package_spec=openclaw@beta
```
, then record the resolved version from
```
npm view
```
or the workflow summary.
For reruns, release proof, or comparing one known package, prefer the exact immutable spec:
```
package_spec=openclaw@YYYY.M.D-beta.N
```
or
```
package_spec=openclaw@YYYY.M.D
```
.
For stable package proof, use
```
package_spec=openclaw@latest
```
only when the question is explicitly the current stable dist-tag; otherwise pin the exact version.
```
source=npm
```
only accepts registry specs for
```
openclaw@beta
```
,
```
openclaw@latest
```
, or exact OpenClaw release versions. Do not pass semver ranges, git refs, file paths, tarball URLs, or plugin package names there.
If the candidate is a tarball URL, use
```
source=url
```
with
```
package_sha256
```
. If it is an Actions tarball artifact, use
```
source=artifact
```
. If it is an unpublished source candidate, use
```
source=ref
```
with a trusted ref or SHA.
Package acceptance tests exactly the selected package candidate. Do not apply
```
openclaw update --channel beta
```
fallback semantics here; if
```
beta
```
is absent, stale, older than
```
latest
```
, or points at a broken tarball, report that tag state instead of silently testing
```
latest
```
.
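The `source=npm` rules above can be checked locally before dispatching. The sketch below is illustrative only: `is_valid_npm_spec` is a hypothetical helper, the regular expression assumes exact versions look like `YYYY.M.D` with an optional `-beta.N` suffix (taken from the placeholders above), and the sample version numbers are made up. The workflow's own validation may be stricter or shaped differently.

```bash
#!/usr/bin/env bash
# Illustrative pre-dispatch check; the workflow's own validation may differ.
# Assumed version shape: YYYY.M.D with an optional -beta.N suffix.
is_valid_npm_spec() {
  printf '%s\n' "$1" |
    grep -Eq '^openclaw@(beta|latest|[0-9]{4}\.[0-9]{1,2}\.[0-9]{1,2}(-beta\.[0-9]+)?)$'
}

# Sample specs: the first two match the assumed shape, the last two are
# a semver range and a tarball URL, which source=npm does not accept.
for spec in 'openclaw@beta' 'openclaw@2026.1.5-beta.2' \
            'openclaw@^2026.1.0' 'https://example.com/openclaw.tgz'; do
  if is_valid_npm_spec "$spec"; then
    echo "accept: $spec"
  else
    echo "reject: $spec"
  fi
done
```

A rejected spec is a hint to switch to `source=url`, `source=artifact`, or `source=ref` rather than to loosen the check.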

Profiles:

```
smoke
```
: quick confidence that the tarball installs, can onboard a channel, can run an agent turn, and basic gateway/config lanes work.
```
package
```
: release-package contract. Adds installer/update, doctor install switching, bundled plugin runtime deps, plugin install/update, and package repair lanes. This is the default native replacement for most Parallels package/update coverage.
```
product
```
: package profile plus broader product surfaces: MCP channels, cron/subagent cleanup, OpenAI web search, and OpenWebUI.
```
full
```
: split Docker release-path chunks with OpenWebUI.
```
custom
```
: exact
```
docker_lanes
```
list for a focused rerun.

Candidate sources:

```
source=npm
```
:
```
openclaw@beta
```
,
```
openclaw@latest
```
, or an exact release version.
```
source=ref
```
: pack
```
package_ref
```
using the trusted
```
workflow_ref
```
harness. This intentionally separates old package commits from new workflow/test code.
```
source=url
```
: HTTPS
```
.tgz
```
plus required
```
package_sha256
```
.

`source=artifact`: download one `.tgz` from the run and artifact identified by `artifact_run_id` and `artifact_name`.

Ref model:

```
gh workflow run ... --ref <workflow-ref>
```
selects the workflow file revision GitHub executes.
```
workflow_ref
```
is the trusted harness/script ref passed to reusable Docker E2E.
```
package_ref
```
is the source ref to build when
```
source=ref
```
. It can be an older branch/tag/SHA as long as it is reachable from an OpenClaw branch or release tag.

Example: run latest package acceptance harness against an older trusted commit:

```bash
gh workflow run package-acceptance.yml --ref main \
  -f workflow_ref=main \
  -f source=ref \
  -f package_ref=<branch-or-sha> \
  -f suite_profile=package \
  -f telegram_mode=mock-openai
```

Use `telegram_mode=mock-openai` or `telegram_mode=live-frontier` when the same resolved `package-under-test` tarball should also run through the Telegram QA workflow in the `qa-live-shared` environment. The standalone Telegram workflow still accepts a published npm spec for post-publish checks, but Package Acceptance passes the resolved artifact for `source=npm`, `ref`, `url`, and `artifact`. Use `telegram_mode=none` only when intentionally skipping Telegram credentialed package proof for a focused rerun.

Docker E2E images never copy repo sources as the app under test: the bare image is a Node/Git runner, and the functional image installs the same prebuilt npm tarball that bare lanes mount. `scripts/package-openclaw-for-docker.mjs` is the single packer for local scripts and CI and validates the tarball inventory before Docker consumes it. `scripts/test-docker-all.mjs --plan-json` is the scheduler-owned CI plan for image kind, package, live image, lane, and credential needs. Docker lane definitions live in the single scenario catalog `scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in `scripts/lib/docker-e2e-plan.mjs`. `scripts/docker-e2e.mjs` converts plan and summary JSON into GitHub outputs and step summaries. Every scheduler run writes `.artifacts/docker-tests/**/summary.json` plus `failures.json`. Read those before rerunning. Lane entries include `command`, `rerunCommand`, status, timing, timeout state, image kind, and log file path. The summary also includes top-level phase timings for preflight, image build, package prep, lane pools, and cleanup. Use `pnpm test:docker:timings <summary.json>` to rank slow lanes and phases before deciding whether a broader rerun is justified.

当你需要验证“这个可安装包是否能作为产品正常工作”，而不是“这个源代码差异是否通过Vitest测试”时，使用手动 `Package Acceptance` 工作流。

在发布验证中，将Package Acceptance视为更大发布集成工作流中的包候选分片，而不是竞争的完整测试路径。完整发布验证和私有发布流程应调用Package Acceptance进行tarball解析、Docker产品/包验证，以及针对相同解析的 `package-under-test` 工件的可选Telegram QA；将编排、密钥策略、阻塞/建议状态和证据汇总保留在调用方中。

推荐默认配置：

```bash
gh workflow run package-acceptance.yml --ref main \
  -f source=npm \
  -f workflow_ref=main \
  -f package_spec=openclaw@beta \
  -f suite_profile=product \
  -f telegram_mode=mock-openai
```

npm候选版本选择：

调度前立即解析注册表：`npm view openclaw dist-tags --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$` 和 `npm view openclaw@beta version dist.tarball dist.integrity --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$`。

如果Peter要求“最新beta版本”，使用
```
source=npm
```
和
```
package_spec=openclaw@beta
```
，然后记录
```
npm view
```
或工作流摘要中解析的版本。
对于重新运行、发布验证或比较已知包，优先使用精确的不可变规格：
```
package_spec=openclaw@YYYY.M.D-beta.N
```
或
```
package_spec=openclaw@YYYY.M.D
```
。
对于稳定包验证，仅当明确询问当前稳定dist-tag时，才使用
```
package_spec=openclaw@latest
```
；否则固定精确版本。
```
source=npm
```
仅接受
```
openclaw@beta
```
、
```
openclaw@latest
```
或精确OpenClaw发布版本的注册表规格。不要传递语义版本范围、git引用、文件路径、tarball URL或插件包名称。
如果候选版本是tarball URL，使用
```
source=url
```
并提供
```
package_sha256
```
。如果是Actions tarball工件，使用
```
source=artifact
```
。如果是未发布的源代码候选版本，使用
```
source=ref
```
并提供可信的引用或SHA。
包验收会精确测试选定的包候选版本。不要在此处应用
```
openclaw update --channel beta
```
的回退语义；如果
```
beta
```
不存在、过期、早于
```
latest
```
或指向损坏的tarball，请报告该标签状态，而不是静默测试
```
latest
```
。

配置文件：

```
smoke
```
：快速确认tarball可安装、可接入渠道、可运行Agent轮次，以及基本网关/配置流水线正常工作。
```
package
```
：发布包契约。添加安装程序/更新、医生安装切换、捆绑插件运行时依赖、插件安装/更新和包修复流水线。这是大多数Parallels包/更新覆盖的默认原生替代方案。
```
product
```
：包配置文件加上更广泛的产品场景：MCP渠道、定时任务/子Agent清理、OpenAI网页搜索和OpenWebUI。
```
full
```
：包含OpenWebUI的拆分Docker发布路径分片。
```
custom
```
：针对聚焦重新运行的精确
```
docker_lanes
```
列表。

候选版本来源：

```
source=npm
```
：
```
openclaw@beta
```
、
```
openclaw@latest
```
或精确发布版本。
```
source=ref
```
：使用可信的
```
workflow_ref
```
测试工具打包
```
package_ref
```
。这有意将旧包提交与新工作流/测试代码分离。
```
source=url
```
：HTTPS
```
.tgz
```
加上必填的
```
package_sha256
```
。

`source=artifact`：从 `artifact_run_id` 和 `artifact_name` 指定的运行和工件中下载一个 `.tgz`。

引用模型：

```
gh workflow run ... --ref <workflow-ref>
```
选择GitHub执行的工作流文件版本。
```
workflow_ref
```
是传递给可重用Docker E2E的可信测试工具/脚本引用。
```
package_ref
```
是
```
source=ref
```
时要构建的源代码引用。只要它可从OpenClaw分支或发布标签访问，就可以是旧的分支/标签/SHA。

示例：使用最新的包验收测试工具针对旧的可信提交运行：

```bash
gh workflow run package-acceptance.yml --ref main \
  -f workflow_ref=main \
  -f source=ref \
  -f package_ref=<branch-or-sha> \
  -f suite_profile=package \
  -f telegram_mode=mock-openai
```

当相同解析的 `package-under-test` tarball也需要在 `qa-live-shared` 环境中通过Telegram QA工作流运行时，使用 `telegram_mode=mock-openai` 或 `telegram_mode=live-frontier`。独立的Telegram工作流仍接受已发布的npm规格用于发布后检查，但Package Acceptance会为 `source=npm`、`ref`、`url` 和 `artifact` 传递解析后的工件。仅当有意为聚焦重新运行跳过Telegram认证包验证时，才使用 `telegram_mode=none`。

Docker E2E镜像永远不会将仓库源代码作为被测应用复制：基础镜像是Node/Git运行器，功能镜像安装与基础流水线挂载的相同预构建npm tarball。`scripts/package-openclaw-for-docker.mjs` 是本地脚本和CI的单一打包器，会在Docker使用前验证tarball清单。`scripts/test-docker-all.mjs --plan-json` 是调度器所属的CI计划，包含镜像类型、包、实时镜像、流水线和密钥需求。Docker流水线定义位于单一场景目录 `scripts/lib/docker-e2e-scenarios.mjs`；规划器逻辑位于 `scripts/lib/docker-e2e-plan.mjs`。`scripts/docker-e2e.mjs` 将计划和摘要JSON转换为GitHub输出和步骤摘要。每个调度器运行都会写入 `.artifacts/docker-tests/**/summary.json` 和 `failures.json`。重新运行前请阅读这些文件。流水线条目包括 `command`、`rerunCommand`、状态、时间、超时状态、镜像类型和日志文件路径。摘要还包括预检查、镜像构建、包准备、流水线池和清理的顶级阶段时间。在决定是否需要更广泛的重新运行前，使用 `pnpm test:docker:timings <summary.json>` 对慢流水线和阶段进行排名。

Cheap Docker Reruns

低成本Docker重新运行

First derive the smallest rerun command from artifacts:

```bash
pnpm test:docker:rerun <github-run-id>
pnpm test:docker:rerun .artifacts/docker-tests/<run>/failures.json
```

The script downloads Docker E2E artifacts for a GitHub run, reads `summary.json` and `failures.json`, and prints a combined targeted workflow command plus per-lane commands. Prefer the combined targeted command when several lanes failed for the same patch:

```bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
  -f ref=<sha> \
  -f include_repo_e2e=false \
  -f include_release_path_suites=false \
  -f include_openwebui=false \
  -f docker_lanes='install-e2e bundled-channel-update-acpx' \
  -f include_live_suites=false \
  -f live_models_only=false
```

That path still runs the prepare job, so it creates a new tarball for `<sha>`. If the SHA-tagged GHCR bare/functional image already exists, CI skips rebuilding that image and only uploads the fresh package artifact before the targeted lane job. Do not rerun the full release path unless the failed lane list or touched surface really requires it.
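When several lanes fail for the same patch, the combined command differs only in the `ref` and `docker_lanes` values. The sketch below shows one way to assemble it: `print_targeted_rerun` is a hypothetical wrapper that prints the command rather than running it, and its flag names are copied from the example above. In practice `pnpm test:docker:rerun` already generates this command from the artifacts, so treat this as an illustration of the shape, not a replacement.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper, for illustration only: prints the targeted dispatch
# command instead of executing it, so it can be reviewed before running.
print_targeted_rerun() {
  local sha="$1"
  shift
  local lanes="$*"   # remaining arguments joined with single spaces
  cat <<EOF
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \\
  -f ref=${sha} \\
  -f include_repo_e2e=false \\
  -f include_release_path_suites=false \\
  -f include_openwebui=false \\
  -f docker_lanes='${lanes}' \\
  -f include_live_suites=false \\
  -f live_models_only=false
EOF
}

# "<sha>" is a placeholder; substitute the commit under test.
print_targeted_rerun "<sha>" install-e2e bundled-channel-update-acpx
```

Printing first keeps the dispatch reviewable, which fits the guidance to avoid broad or accidental reruns.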

首先从工件中推导最小的重新运行命令：

```bash
pnpm test:docker:rerun <github-run-id>
pnpm test:docker:rerun .artifacts/docker-tests/<run>/failures.json
```

该脚本会下载GitHub运行的Docker E2E工件，读取 `summary.json` 和 `failures.json`，并打印组合的针对性工作流命令以及每个流水线的命令。当多个流水线因同一补丁失败时，优先使用组合的针对性命令：

```bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
  -f ref=<sha> \
  -f include_repo_e2e=false \
  -f include_release_path_suites=false \
  -f include_openwebui=false \
  -f docker_lanes='install-e2e bundled-channel-update-acpx' \
  -f include_live_suites=false \
  -f live_models_only=false
```

该路径仍会运行准备作业，因此会为 `<sha>` 创建新的tarball。如果带有SHA标签的GHCR基础/功能镜像已存在，CI会跳过重建该镜像，仅在针对性流水线作业前上传新的包工件。除非失败流水线列表或受影响范围确实需要，否则不要重新运行完整的发布路径。

Docker Expected Timings

Docker预期时间

Treat these as ballpark. Blacksmith queue time, GHCR pull speed, provider latency, npm cache state, and Docker daemon health can dominate.

Current local timing artifact (`.artifacts/docker-tests/lane-timings.json`) has these rough bands:

Tiny lanes, seconds to under 1 minute: `agents-delete-shared-workspace` ~3s, `plugin-update` ~7s, `config-reload` ~14s, `pi-bundle-mcp-tools` ~15s, `onboard` ~18s, `session-runtime-context` ~20s, `gateway-network` ~34s, `qr` ~44s.

Medium deterministic lanes, ~1-5 minutes:
```
npm-onboard-channel-agent
```
~96s,
```
openai-image-auth
```
~99s, bundled channel/update lanes usually ~90-300s when split,
```
openwebui
```
~225s,
```
mcp-channels
```
~274s.

Heavy deterministic lanes, ~6-10 minutes: `bundled-channel-root-owned` ~429s, `bundled-channel-setup-entry` ~420s, `bundled-channel-load-failure` ~383s, `cron-mcp-cleanup` ~567s.

Live provider lanes, often ~15-20 minutes:
```
live-gateway
```
~958s,
```
live-models
```
~1054s.
Installer/release lanes:
```
install-e2e
```
and package-update paths can vary widely with npm, provider, and package registry behavior. Budget tens of minutes; prefer GitHub targeted reruns over local repeats.

Default fallback lane timeout is 120 minutes. A timeout usually means debug the lane log/artifacts first, not “run the whole thing again.”
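To make the bands concrete, the sketch below sorts a lane duration in seconds into one of the rough bands above. The helper name `classify_lane_seconds`, the band labels, and the exact cutoffs (60s, 300s, 600s) are assumptions chosen to approximate the listed ranges; they are not thresholds used by the scheduler. The sample durations are the ones quoted above.

```bash
#!/usr/bin/env bash
# Illustrative only: cutoffs approximate the rough bands in the text
# (under 1 minute, ~1-5 minutes, ~6-10 minutes, longer).
classify_lane_seconds() {
  local seconds="$1"
  if [ "$seconds" -lt 60 ]; then
    echo "tiny"
  elif [ "$seconds" -le 300 ]; then
    echo "medium"
  elif [ "$seconds" -le 600 ]; then
    echo "heavy"
  else
    echo "long"
  fi
}

# Durations below are the approximate figures quoted in the text.
while read -r lane seconds; do
  printf '%s: ~%ss -> %s\n' "$lane" "$seconds" "$(classify_lane_seconds "$seconds")"
done <<'EOF'
qr 44
mcp-channels 274
cron-mcp-cleanup 567
live-models 1054
EOF
```

A lane that lands far outside its usual band is a reason to read its log and artifacts first, well before the 120-minute fallback timeout comes into play.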

将这些视为大致参考。Blacksmith排队时间、GHCR拉取速度、提供商延迟、npm缓存状态和Docker守护进程健康状况可能占主导。

当前本地时间工件（`.artifacts/docker-tests/lane-timings.json`）有以下大致区间：

小型流水线，几秒到1分钟以内：`agents-delete-shared-workspace` ~3秒，`plugin-update` ~7秒，`config-reload` ~14秒，`pi-bundle-mcp-tools` ~15秒，`onboard` ~18秒，`session-runtime-context` ~20秒，`gateway-network` ~34秒，`qr` ~44秒。

中等确定性流水线，约1-5分钟：
```
npm-onboard-channel-agent
```
~96秒，
```
openai-image-auth
```
~99秒，捆绑渠道/更新流水线拆分后通常约90-300秒，
```
openwebui
```
~225秒，
```
mcp-channels
```
~274秒。

大型确定性流水线，约6-10分钟：`bundled-channel-root-owned` ~429秒，`bundled-channel-setup-entry` ~420秒，`bundled-channel-load-failure` ~383秒，`cron-mcp-cleanup` ~567秒。

实时提供商流水线，通常约15-20分钟：
```
live-gateway
```
~958秒，
```
live-models
```
~1054秒。
安装程序/发布流水线：
```
install-e2e
```
和包更新路径的时间可能因npm、提供商和包注册表行为而有很大差异。预算几十分钟；优先选择GitHub针对性重新运行，而不是本地重复运行。

默认的流水线超时回退为120分钟。超时通常意味着首先调试流水线日志/工件，而不是“再次运行整个流程”。

Failure Workflow

故障处理流程

Identify exact failing job, SHA, lane, and artifact path.
Read
```
failures.json
```
,
```
summary.json
```
, and the failed lane log tail.
Use
```
pnpm test:docker:rerun <run-id|failures.json>
```
to generate targeted GitHub rerun commands.
If the lane has
```
rerunCommand
```
, use that only as a local starting point.
For Docker release failures, dispatch targeted
```
docker_lanes=<failed-lane>
```
on GitHub before considering local Docker.
Patch narrowly, then rerun the failed file/lane only.
Broaden to
```
pnpm check:changed
```
or CI only after the isolated proof passes.

确定确切的失败作业、SHA、流水线和工件路径。
阅读
```
failures.json
```
、
```
summary.json
```
和失败流水线的日志尾部。
使用
```
pnpm test:docker:rerun <run-id|failures.json>
```
生成针对性的GitHub重新运行命令。
如果流水线有
```
rerunCommand
```
，仅将其作为本地起点使用。
对于Docker发布失败，在考虑本地Docker之前，先在GitHub上调度针对性的
```
docker_lanes=<failed-lane>
```
。
进行窄范围修复，然后仅重新运行失败的文件/流水线。
仅在孤立验证通过后，才扩大到
```
pnpm check:changed
```
或CI。

When To Escalate

何时升级处理

Public SDK/plugin contract changes: run changed gate plus relevant extension validation.
Build output, lazy imports, package boundaries, or published surfaces: include
```
pnpm build
```
.
Workflow edits: run
```
pnpm check:workflows
```
.
Release branch or tag validation: use release docs and GitHub workflows; avoid local Docker unless Peter explicitly asks.

公共SDK/插件契约变更：运行变更网关测试加上相关的扩展验证。
构建输出、延迟导入、包边界或发布表面变更：包含
```
pnpm build
```
。
工作流编辑：运行
```
pnpm check:workflows
```
。
发布分支或标签验证：使用发布文档和GitHub工作流；除非Peter明确要求，否则避免使用本地Docker。