openclaw-testing
OpenClaw Testing
Use this skill when deciding what to test, debugging failures, rerunning CI,
or validating a change without wasting hours.
Read First
- `docs/reference/test.md` for local test commands.
- `docs/ci.md` for CI scope, release checks, Docker chunks, and runner behavior.
- Scoped `AGENTS.md` files before editing code under a subtree.
Default Rule
Prove the touched surface first. Do not reflexively run the whole suite.
- Inspect the diff and classify the touched surface:
  - source: `pnpm changed:lanes --json`, then `pnpm check:changed`
  - tests only: `pnpm test:changed`
  - one failing file: `pnpm test <path-or-filter> -- --reporter=verbose`
  - workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
  - docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
- Reproduce narrowly before fixing.
- Fix root cause.
- Rerun the same narrow proof.
- Broaden only when the touched contract demands it.
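The classification step above can be sketched as a small shell helper. This is only an illustration: the function names `classify_surface` and `suggest_proof` and the path patterns (`.github/workflows/`, `docs/`, `*.md`, `*.test.ts`) are assumptions, not the repo's actual routing, and `pnpm changed:lanes --json` remains the source of truth.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map one changed path to a touched surface.
# The path patterns are illustrative assumptions, not the repo's real routing.
classify_surface() {
  case "$1" in
    .github/workflows/*) echo "workflow-only" ;;
    docs/*|*.md)         echo "docs-only" ;;
    *.test.ts)           echo "tests-only" ;;
    *)                   echo "source" ;;
  esac
}

# Print (do not run) the first narrow proof command for a surface.
suggest_proof() {
  case "$1" in
    workflow-only) echo "git diff --check" ;;
    docs-only)     echo "pnpm docs:list" ;;
    tests-only)    echo "pnpm test:changed" ;;
    source)        echo "pnpm changed:lanes --json && pnpm check:changed" ;;
    *)             echo "unknown surface: $1" >&2; return 1 ;;
  esac
}

# Example: suggest a proof for every path in the current diff.
# git diff --name-only | while read -r path; do
#   suggest_proof "$(classify_surface "$path")"
# done
```

The helper only prints commands, so it stays safe to run while other agents or the user own processes elsewhere.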
Guardrails
- Do not kill unrelated processes or tests. If something is running elsewhere, treat it as owned by the user or another agent.
- Do not run expensive local Docker, full release checks, full `pnpm test`, or full `pnpm check` unless the user asks or the change genuinely requires it.
- Prefer GitHub Actions for release/Docker proof when the workflow already has the prepared image and secrets.
- Use `scripts/committer "<msg>" <paths...>` when committing; stage only your files.
- If deps are missing, run `pnpm install`, retry once, then report the first actionable error.
- For Blacksmith Testbox proof, reuse only an id warmed and claimed in this operator session. `blacksmith testbox list` is diagnostics only; a listed id can have a local key and still carry stale rsync state from another lane. After warmup, run `pnpm testbox:claim --id <id>`, then prefer `pnpm testbox:run --id <id> -- "<command>"` for OpenClaw gates so stale org-visible ids fail fast before syncing. Claims older than 12 hours are stale unless `OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES` is explicitly set for long work.
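The 12-hour claim rule can be illustrated with a small check. This is a sketch under stated assumptions: `claim_is_stale` is a hypothetical helper, and treating `OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES` as a plain minute count with a 720-minute default is inferred from the guardrail text, not taken from the Testbox tooling itself.

```shell
# Hypothetical helper: decide whether a Testbox claim is stale.
# Usage: claim_is_stale <claim-age-in-minutes>
# Assumes the TTL variable is a whole number of minutes (default 720 = 12 h).
claim_is_stale() {
  local age_minutes="$1"
  local ttl_minutes="${OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES:-720}"
  # Succeeds (exit 0) only when the claim is older than the TTL.
  [ "$age_minutes" -gt "$ttl_minutes" ]
}

# Example: a 13-hour-old claim is stale by default, but not with a longer TTL.
# claim_is_stale 780 && echo "stale: warm and claim a fresh id"
# OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES=1440 claim_is_stale 780 || echo "still valid"
```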
Local Test Shortcuts
bash
pnpm changed:lanes --json
pnpm check:changed # changed typecheck/lint/guards; no Vitest
pnpm test:changed # cheap smart changed Vitest targets
OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed
pnpm test <path-or-filter> -- --reporter=verbose
OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test <path-or-filter>
Use targeted file paths whenever possible. Avoid raw `vitest`; use the repo `pnpm test` wrapper so project routing, workers, and setup stay correct.
Command Semantics
- `pnpm check` and `pnpm check:changed` do not run Vitest tests. They are for typecheck, lint, and guard proof.
- `pnpm test` and `pnpm test:changed` run Vitest tests.
- `pnpm test:changed` is intentionally cheap by default: direct test edits, sibling tests, explicit source mappings, and import-graph dependents.
- `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` is the explicit broad fallback for harness/config/package edits that genuinely need it.
- Do not run extension sweeps just because core changed. If a core edit is for a specific plugin bug, run that plugin's tests explicitly. If a public SDK or contract change needs consumer proof, choose the smallest representative plugin/contract tests first, then broaden only when the risk justifies it.
- The test wrapper prints a short `[test] passed|failed|skipped ... in ...` line. Vitest's own duration is still the per-shard detail.
Routing Model
- `pnpm changed:lanes --json` answers "which check lanes does this diff touch?" It is used by `pnpm check:changed` for typecheck/lint/guard selection.
- `pnpm test:changed` answers "which Vitest targets are worth running now?" It uses the same changed path list, but applies a cheaper test-target resolver.
- Direct test edits run themselves. Source edits prefer explicit mappings, sibling `*.test.ts`, then import-graph dependents. Shared harness/config/root edits are skipped by default unless they have precise mapped tests.
- Shared group-room delivery config and source-reply prompt edits have precise mapped tests: they run the core auto-reply regressions plus Discord and Slack delivery tests so cross-channel default changes fail before a PR push.
- Public SDK or contract edits do not automatically run every plugin test. `check:changed` proves extension type contracts; the agent chooses the smallest plugin/contract Vitest proof that matches the actual risk.
- Use `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` only when a harness, config, package, or unknown-root edit really needs the broad Vitest fallback.
CI Debugging
Start with current run state, not logs for everything:
bash
gh run list --branch main --limit 10
gh run view <run-id> --json status,conclusion,headSha,url,jobs
gh run view <run-id> --job <job-id> --log
- Check exact SHA. Ignore newer unrelated `main` unless asked.
- For cancelled same-branch runs, confirm whether a newer run superseded it.
- Fetch full logs only for failed or relevant jobs.
GitHub Release Workflows
Use the smallest workflow that proves the current risk. The full umbrella is
available, but it is usually the last step after narrower proof, not the first
rerun after a focused patch.
Full Release Validation
`Full Release Validation` (`.github/workflows/full-release-validation.yml`):
- manual `CI` for the full normal CI graph, with Android enabled via `include_android=true`
- `Plugin Prerelease` for release-only plugin static checks, extension shards, the release-only `agentic-plugins` shard, and plugin product Docker lanes
- `OpenClaw Release Checks` for install smoke, cross-OS release checks, live and E2E checks, Docker release-path suites, OpenWebUI, QA Lab, fast Matrix, and Telegram release lanes
- optional post-publish Telegram E2E when a package spec is supplied
Run it only when validating an actual release candidate, after broad shared CI
or release orchestration changes, or when explicitly asked:
bash
gh workflow run full-release-validation.yml \
--repo openclaw/openclaw \
--ref main \
-f ref=<branch-or-sha> \
-f provider=openai \
-f mode=both \
-f release_profile=stable
Run the workflow itself from the trusted current ref, normally `--ref main`; child workflows are dispatched from that same ref even when `ref` points at an older release branch or tag. Full Release Validation has no separate child workflow ref input; choose the trusted harness by choosing the workflow run ref.
Use `release_profile=minimum|stable|full` to control live/provider breadth: `minimum` keeps the fastest OpenAI/core release-critical set, `stable` adds the stable provider/backend set, and `full` adds the broad advisory provider/media matrix. Do not make `full` faster by silently dropping suites; optimize setup, artifact reuse, and sharding instead. The parent verifier job appends a child overview plus slowest-job tables for child runs; rerun only that verifier after a child rerun turns green.
Standalone manual `CI` dispatches do not run the plugin prerelease suite, the extension batch sweep, or the release-only `agentic-plugins` Vitest shard. Those lanes are intentionally reserved for the separate `Plugin Prerelease` child so PRs, main pushes, and ad hoc broad CI checks do not spend Docker/package time or all-plugin runtime time on release-only product coverage.
If a full run is already active on a newer `origin/main`, prefer watching that run over dispatching a duplicate. Do not cancel release, release-check, or child workflow runs unless Peter explicitly asks for cancellation.
The child-dispatch jobs record the child run ids. The final `Verify full validation` job re-queries those child runs and is the canonical parent gate. If a child workflow failed but was later rerun successfully, rerun only the failed parent verifier job; do not dispatch a new full umbrella unless the release evidence is stale.
For bounded recovery after a focused fix, pass `-f rerun_group=<group>`. Supported umbrella groups are `all`, `ci`, `plugin-prerelease`, `release-checks`, `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, `qa-live`, and `npm-telegram`. Use the narrowest group that covers the failed box. After a targeted release-check fix, do not restart the full umbrella by habit: dispatch the matching `rerun_group` and rerun only the parent verifier/evidence step after the child is green unless the release evidence is stale. For a single failed live/E2E shard, use `-f rerun_group=live-e2e -f live_suite_filter=<suite_id>` so the Blacksmith workflow only spends setup and queue time on that suite.
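A bounded-recovery dispatch can be sketched as a helper that validates the group before printing the command. This is an illustration only: `rerun_dispatch_cmd` is a hypothetical name, the `provider`, `mode`, and `release_profile` values are simply reused from the documented dispatch, and rejecting `live_suite_filter` outside `live-e2e` is an assumption about intent rather than documented workflow behavior.

```shell
# Hypothetical helper: print (not run) a bounded-recovery dispatch command.
# Usage: rerun_dispatch_cmd <branch-or-sha> <group> [live-suite-id]
# The group list is copied from the text above; everything else is a sketch.
rerun_dispatch_cmd() {
  local ref="$1" group="$2" suite="${3:-}"
  case "$group" in
    all|ci|plugin-prerelease|release-checks|install-smoke|cross-os|live-e2e|package|qa|qa-parity|qa-live|npm-telegram) ;;
    *) echo "unsupported rerun_group: $group" >&2; return 1 ;;
  esac
  local cmd="gh workflow run full-release-validation.yml --repo openclaw/openclaw --ref main"
  cmd="$cmd -f ref=$ref -f provider=openai -f mode=both -f release_profile=stable"
  cmd="$cmd -f rerun_group=$group"
  # Assumption: a suite filter is only meaningful for the live-e2e group.
  if [ -n "$suite" ]; then
    if [ "$group" != "live-e2e" ]; then
      echo "live_suite_filter requires rerun_group=live-e2e" >&2
      return 1
    fi
    cmd="$cmd -f live_suite_filter=$suite"
  fi
  echo "$cmd"
}

# Example: review the command before running it by hand.
# rerun_dispatch_cmd <branch-or-sha> qa-parity
```

Printing instead of executing keeps the decision to dispatch with the operator, in line with the rule against restarting the umbrella by habit.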
Release Evidence
After release-candidate validation or before a release decision, record the important run ids in the private evidence ledger, `openclaw/releases-private`. Use the manual `OpenClaw Release Evidence` (`openclaw-release-evidence.yml`) workflow there. It writes durable summaries under `evidence/<release-id>/` and commits:
- `release-evidence.md`
- `release-evidence.json`
- `index.json`
- `runs/<label>.json`
Use one run per line:
text
full-release-validation openclaw/openclaw <run-id> blocking
package-acceptance openclaw/openclaw <run-id> blocking
release-checks openclaw/openclaw <run-id> blocking
Store summaries, run URLs, artifact metadata, timings, pass/fail state, and short release-manager notes there. Do not store raw logs, provider prompts/responses, channel transcripts, signing material, or secret-bearing config in git; raw logs stay in Actions artifacts.
When `Full Release Validation` completes and `OPENCLAW_RELEASES_PRIVATE_DISPATCH_TOKEN` is configured in the public repo, it requests the private `OpenClaw Release Evidence From Full Validation` workflow. That private workflow reads the parent full-validation run, extracts the child CI/release-checks/Telegram run ids from the parent logs, and opens the evidence PR automatically. If the token is absent or the run predates this wiring, trigger that private workflow manually with the full-validation run id.
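The one-run-per-line input can be assembled with a small formatter. This is a minimal sketch: `evidence_line` is a hypothetical helper, the field order is copied from the example lines above, and accepting `advisory` alongside `blocking` is an assumption, since only `blocking` appears in the example.

```shell
# Hypothetical helper: format one evidence line as
# "<label> <repo> <run-id> <status>".
# Usage: evidence_line <label> <run-id> [status] [repo]
evidence_line() {
  local label="$1" run_id="$2" status="${3:-blocking}" repo="${4:-openclaw/openclaw}"
  # GitHub Actions run ids are numeric; reject anything else early.
  case "$run_id" in
    ''|*[!0-9]*) echo "run id must be numeric: $run_id" >&2; return 1 ;;
  esac
  # Assumption: "advisory" is accepted in addition to the documented "blocking".
  case "$status" in
    blocking|advisory) ;;
    *) echo "unknown status: $status" >&2; return 1 ;;
  esac
  printf '%s %s %s %s\n' "$label" "$repo" "$run_id" "$status"
}

# Example: build the three lines shown above from real run ids.
# evidence_line full-release-validation <run-id>
# evidence_line package-acceptance <run-id>
# evidence_line release-checks <run-id>
```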
Release Checks
`OpenClaw Release Checks` (`openclaw-release-checks.yml`; `telegram_mode=mock-openai`):
bash
gh workflow run openclaw-release-checks.yml \
--repo openclaw/openclaw \
--ref main \
-f ref=<branch-or-sha> \
-f provider=openai \
-f mode=both \
-f release_profile=stable \
-f rerun_group=all
Release-check rerun groups are `all`, `install-smoke`, `cross-os`, `live-e2e`, `package`, `qa`, `qa-parity`, and `qa-live`. `OpenClaw Release Checks` uses the trusted workflow ref to resolve the selected ref once as `release-package-under-test` and passes that artifact into cross-OS release checks, release-path Docker live/E2E checks, and Package Acceptance. When `Full Release Validation` dispatches release checks, it passes the requested branch/tag plus an `expected_sha` so branch/tag refs resolve through the fast remote-ref path while the package and QA jobs still validate the exact SHA.
The full install-smoke child is split on purpose: one job prepares or reuses the target-SHA GHCR root Dockerfile smoke image, QR package install runs in its own job, root Dockerfile/gateway smokes pull the prepared image, and installer/Bun smokes pull the same image while building only their small installer images. If install-smoke gets slow again, first check whether the root image was reused or rebuilt before adding/removing coverage.
The full-profile native live media shards use the prebuilt `ghcr.io/openclaw/openclaw-live-media-runner:ubuntu-24.04` container so `ffmpeg`/`ffprobe` are already present. If those jobs suddenly spend minutes in dependency setup again, first check the `Live Media Runner Image` workflow and the `Verify preinstalled live media dependencies` step before assuming the media tests themselves slowed down.
The release Docker path intentionally shards the plugin/runtime tail. The workflow uses `plugins-runtime-plugins`, `plugins-runtime-services`, and `plugins-runtime-install-a` through `plugins-runtime-install-d`; aggregate aliases such as `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` remain for manual reruns.
The release QA parity box is internally split into candidate and baseline lane jobs, followed by a report job that downloads both artifacts and runs `pnpm openclaw qa parity-report`. For parity failures, inspect the failed lane first; inspect the report job when both lane summaries exist but the comparison fails.
QA Lab Matrix Profiles
`pnpm openclaw qa matrix`, `--profile all`
- `--profile fast`: release-critical Matrix transport contract; add `--fail-fast` only when the target CLI supports it
- `--profile transport|media|e2ee-smoke|e2ee-deep|e2ee-cli`: sharded full Matrix proof
- `OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000`: CI-friendly no-reply quiet window when paired with fast or sharded gates
`QA-Lab - All Lanes`, `matrix_profile=all`, `OpenClaw Release Checks`
Reusable Live/E2E Checks
`OpenClaw Live And E2E Checks (Reusable)` (`openclaw-live-and-e2e-checks-reusable.yml`):
bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
--repo openclaw/openclaw \
--ref main \
-f ref=<sha> \
-f include_repo_e2e=false \
-f include_release_path_suites=false \
-f include_openwebui=false \
-f include_live_suites=true \
-f live_models_only=true \
-f live_model_providers=fireworks
Useful knobs:
- `docker_lanes='<lane[,lane]>'`: run selected Docker scheduler lanes against prepared artifacts instead of the release chunk matrix. Multiple selected lanes fan out as parallel targeted Docker jobs after one shared package/image preparation step.
- `include_live_suites=false`: skip live/provider suites when testing Docker scheduler or release packaging only.
- `live_models_only=true`: run only Docker live model coverage.
- `live_model_providers=fireworks` (or comma/space separated providers): run one targeted Docker live model job instead of the full provider matrix.
- blank `live_model_providers`: run the full live-model provider matrix.
Release-path Docker chunks are currently `core`, `package-update-openai`, `package-update-anthropic`, `package-update-core`, `plugins-runtime-plugins`, `plugins-runtime-services`, `plugins-runtime-install-a`, `plugins-runtime-install-b`, `plugins-runtime-install-c`, `plugins-runtime-install-d`, `bundled-channels-core`, `bundled-channels-update-a`, `bundled-channels-update-b`, and `bundled-channels-contracts`. The aggregate `bundled-channels`, `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` chunks remain valid for manual one-shot reruns, but release checks use the split chunks.
When live suites are enabled, the workflow shards broad native `pnpm test:live` coverage through `scripts/test-live-shard.mjs` instead of one serial `live-all` job:
- `native-live-src-agents`
- `native-live-src-gateway-core`
- `native-live-src-gateway-profiles` (release CI runs this with provider filters such as `OPENCLAW_LIVE_GATEWAY_PROVIDERS=anthropic`)
- `native-live-src-gateway-backends`
- `native-live-test`
- `native-live-extensions-a-k`
- `native-live-extensions-l-n`
- `native-live-extensions-openai`
- `native-live-extensions-o-z`
- `native-live-extensions-o-z-other`
- `native-live-extensions-xai`
- `native-live-extensions-media`
- `native-live-extensions-media-audio`
- `native-live-extensions-media-music`
- `native-live-extensions-media-music-google`
- `native-live-extensions-media-music-minimax`
- `native-live-extensions-media-video`
Use `node scripts/test-live-shard.mjs <shard> --list` to see the exact files before rerunning a failed native live shard. The aggregate `o-z` and `media` shards remain useful locally; release CI uses the smaller provider/media shards so one live-provider flake does not force a broad native live rerun.
For model-list or provider-selection fixes, use `live_models_only=true` plus the specific `live_model_providers` allowlist. Confirm logs show the expected `OPENCLAW_LIVE_PROVIDERS` and selected model ids before declaring proof.
Docker
Docker is expensive. First inspect the scheduler without running Docker:
bash
OPENCLAW_DOCKER_ALL_DRY_RUN=1 pnpm test:docker:all
OPENCLAW_DOCKER_ALL_DRY_RUN=1 OPENCLAW_DOCKER_ALL_LANES=install-e2e pnpm test:docker:all
OPENCLAW_DOCKER_ALL_LANES=install-e2e node scripts/test-docker-all.mjs --plan-json
Run one failed lane locally only when explicitly asked or when GitHub is not usable:
bash
OPENCLAW_DOCKER_ALL_LANES=<lane> \
OPENCLAW_DOCKER_ALL_BUILD=0 \
OPENCLAW_DOCKER_ALL_PREFLIGHT=0 \
OPENCLAW_SKIP_DOCKER_BUILD=1 \
OPENCLAW_DOCKER_E2E_BARE_IMAGE='<prepared-bare-image>' \
OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE='<prepared-functional-image>' \
pnpm test:docker:all
For release validation, prefer the reusable GitHub workflow input:
yaml
docker_lanes: install-e2e
Multiple lanes are allowed:
yaml
docker_lanes: install-e2e bundled-channel-update-acpx
That skips the release chunk matrix and runs one targeted Docker job against the prepared GHCR images and the selected package artifact. Rerun commands generated inside GitHub artifacts include `package_artifact_run_id`, `package_artifact_name`, `docker_e2e_bare_image`, and `docker_e2e_functional_image` when available, so failed lanes can reuse the exact tarball and prepared images from the failed run. When the fix changes package contents, omit those reuse inputs so the workflow packs a new tarball. Live-only targeted reruns skip the E2E images and build only the live-test image. Release-path normal mode fans out into smaller Docker chunk jobs:
- `core`
- `package-update-openai`
- `package-update-anthropic`
- `package-update-core`
- `plugins-runtime-plugins`
- `plugins-runtime-services`
- `plugins-runtime-install-a`
- `plugins-runtime-install-b`
- `plugins-runtime-install-c`
- `plugins-runtime-install-d`
- `bundled-channels`
OpenWebUI is folded into `plugins-runtime-services` for full release-path coverage and keeps a standalone `openwebui` chunk only for OpenWebUI-only dispatches. The legacy `package-update`, `plugins-runtime-core`, `plugins-runtime`, and `plugins-integrations` chunks still work as aggregate aliases for manual reruns, but the release workflow uses the split chunks so provider installer checks, plugin runtime checks, bundled plugin install/uninstall shards, and bundled-channel checks can run on separate machines. The bundled-channel runtime-dependency coverage inside `bundled-channels` uses the split `bundled-channel-*` and `bundled-channel-update-*` lanes rather than the serial `bundled-channel-deps` lane, so failures produce cheap targeted reruns for the exact channel/update scenario. The bundled plugin install/uninstall sweep is also split into `bundled-plugin-install-uninstall-0` through `bundled-plugin-install-uninstall-7`; selecting the legacy `bundled-plugin-install-uninstall` lane expands to all eight shards.
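Before running any lane, the scheduler dry-run from the start of this section can be repeated per selected lane. The helper below is a sketch: `docker_lane_dry_runs` is a hypothetical name, and splitting the lane list on commas and spaces is an assumption, since the workflow's own `docker_lanes` parsing is not shown here.

```shell
# Hypothetical helper: print one scheduler dry-run command per selected lane.
# Accepts a comma- and/or space-separated lane list (an assumption about
# how lane lists are written, not the workflow's real parser).
docker_lane_dry_runs() {
  local lanes lane
  # Turn commas into spaces, then rely on shell word splitting.
  lanes=$(printf '%s' "$*" | tr ',' ' ')
  for lane in $lanes; do
    echo "OPENCLAW_DOCKER_ALL_DRY_RUN=1 OPENCLAW_DOCKER_ALL_LANES=$lane pnpm test:docker:all"
  done
}

# Example: inspect the plan for two lanes without starting Docker.
# docker_lane_dry_runs "install-e2e bundled-channel-update-acpx"
```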
Package Acceptance
Use the manual `Package Acceptance` workflow when the question is "does this installable package work as a product?" rather than "does this source diff pass Vitest?"
In release validation, treat Package Acceptance as the package-candidate shard inside the larger release umbrella, not as a competing full-test path. Full Release Validation and private release gauntlets should call Package Acceptance for tarball resolution, Docker product/package proof, and optional Telegram QA against the same resolved `package-under-test` artifact; keep orchestration, secret policy, blocking/advisory status, and evidence rollup in the caller.
Good defaults:
bash
gh workflow run package-acceptance.yml --ref main \
-f source=npm \
-f workflow_ref=main \
-f package_spec=openclaw@beta \
-f suite_profile=product \
-f telegram_mode=mock-openai
Npm candidate selection:
- Resolve the registry immediately before dispatch: `npm view openclaw dist-tags --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$` and `npm view openclaw@beta version dist.tarball dist.integrity --json --prefer-online --cache /tmp/openclaw-npm-cache-verify-$$`.
- If Peter asks for "latest beta", use `source=npm` with `package_spec=openclaw@beta`, then record the resolved version from `npm view` or the workflow summary.
- For reruns, release proof, or comparing one known package, prefer the exact immutable spec: `package_spec=openclaw@YYYY.M.D-beta.N` or `package_spec=openclaw@YYYY.M.D`.
- For stable package proof, use `package_spec=openclaw@latest` only when the question is explicitly the current stable dist-tag; otherwise pin the exact version.
- `source=npm` only accepts registry specs for `openclaw@beta`, `openclaw@latest`, or exact OpenClaw release versions. Do not pass semver ranges, git refs, file paths, tarball URLs, or plugin package names there.
- If the candidate is a tarball URL, use `source=url` with `package_sha256`. If it is an Actions tarball artifact, use `source=artifact`. If it is an unpublished source candidate, use `source=ref` with a trusted ref or SHA.
- Package acceptance tests exactly the selected package candidate. Do not apply `openclaw update --channel beta` fallback semantics here; if `beta` is absent, stale, older than `latest`, or points at a broken tarball, report that tag state instead of silently testing `latest`.
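The `source=npm` rules above can be checked locally before dispatching. This is a sketch under stated assumptions: `is_valid_npm_package_spec` is a hypothetical helper, and the date-style version pattern (`YYYY.M.D` with an optional `-beta.N` suffix) is inferred from the placeholders in the list, not from the workflow's own validation.

```shell
# Hypothetical helper: accept only the registry specs listed for source=npm.
# Usage: is_valid_npm_package_spec <spec>   (exit 0 = acceptable)
is_valid_npm_package_spec() {
  # The two dist-tags named in the text are always acceptable.
  case "$1" in
    openclaw@beta|openclaw@latest) return 0 ;;
  esac
  # Otherwise require an exact version; the pattern is an inferred assumption.
  printf '%s\n' "$1" |
    grep -Eq '^openclaw@[0-9]{4}\.[0-9]{1,2}\.[0-9]{1,2}(-beta\.[0-9]+)?$'
}

# Example: refuse semver ranges, URLs, and plugin packages up front.
# is_valid_npm_package_spec "$spec" || echo "use source=url/ref/artifact instead" >&2
```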
Profiles:
- `smoke`: quick confidence that the tarball installs, can onboard a channel, can run an agent turn, and basic gateway/config lanes work.
- `package`: release-package contract. Adds installer/update, doctor install switching, bundled plugin runtime deps, plugin install/update, and package repair lanes. This is the default native replacement for most Parallels package/update coverage.
- `product`: package profile plus broader product surfaces: MCP channels, cron/subagent cleanup, OpenAI web search, and OpenWebUI.
- `full`: split Docker release-path chunks with OpenWebUI.
- `custom`: exact `docker_lanes` list for a focused rerun.

Candidate sources:
- `source=npm`: `openclaw@beta`, `openclaw@latest`, or an exact release version.
- `source=ref`: pack `package_ref` using the trusted `workflow_ref` harness. This intentionally separates old package commits from new workflow/test code.
- `source=url`: HTTPS `.tgz` plus required `package_sha256`.
- `source=artifact`: download one `.tgz` from `artifact_run_id`/`artifact_name`.
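
For `source=url`, the two stated requirements (an HTTPS `.tgz` and a `package_sha256`) are easy to check before dispatching. This is a hedged sketch: `check_url_candidate` is a hypothetical helper, and it assumes the workflow expects the digest as 64 hexadecimal characters, which this document does not state.

```bash
# Sketch: pre-dispatch sanity check for a source=url candidate.
# Assumption: package_sha256 is a 64-character hex digest.
check_url_candidate() {
  local url sha256
  url="$1"
  sha256="$2"
  case "$url" in
    https://*.tgz) ;;
    *)
      echo "expected an HTTPS .tgz URL, got: $url" >&2
      return 1
      ;;
  esac
  if ! printf '%s\n' "$sha256" | grep -Eq '^[0-9a-fA-F]{64}$'; then
    echo "package_sha256 must be 64 hex characters" >&2
    return 1
  fi
  return 0
}
```

The digest itself can be computed from a downloaded tarball with `sha256sum <file>.tgz` on Linux or `shasum -a 256 <file>.tgz` on macOS.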

Ref model:
- `gh workflow run ... --ref <workflow-ref>` selects the workflow file revision GitHub executes.
- `workflow_ref` is the trusted harness/script ref passed to reusable Docker E2E.
- `package_ref` is the source ref to build when `source=ref`. It can be an older branch/tag/SHA as long as it is reachable from an OpenClaw branch or release tag.

Example: run latest package acceptance harness against an older trusted commit:

bash
gh workflow run package-acceptance.yml --ref main \
-f workflow_ref=main \
-f source=ref \
-f package_ref=<branch-or-sha> \
-f suite_profile=package \
-f telegram_mode=mock-openai

Use `telegram_mode=mock-openai` or `telegram_mode=live-frontier` when the same
resolved `package-under-test` tarball should also run through the Telegram QA
workflow in the `qa-live-shared` environment. The standalone Telegram workflow
still accepts a published npm spec for post-publish checks, but Package
Acceptance passes the resolved artifact for `source=npm`, `ref`, `url`, and
`artifact`. Use `telegram_mode=none` only when intentionally skipping Telegram
credentialed package proof for a focused rerun.

Docker E2E images never copy repo sources as the app under test: the bare image
is a Node/Git runner, and the functional image installs the same prebuilt npm
tarball that bare lanes mount. `scripts/package-openclaw-for-docker.mjs` is the
single packer for local scripts and CI and validates the tarball inventory
before Docker consumes it. `scripts/test-docker-all.mjs --plan-json` is the
scheduler-owned CI plan for image kind, package, live image, lane, and
credential needs. Docker lane definitions live in the single scenario catalog
`scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in
`scripts/lib/docker-e2e-plan.mjs`. `scripts/docker-e2e.mjs` converts plan and
summary JSON into GitHub outputs and step summaries. Every scheduler run writes
`.artifacts/docker-tests/**/summary.json` plus `failures.json`. Read those
before rerunning. Lane entries include `command`, `rerunCommand`, status,
timing, timeout state, image kind, and log file path. The summary also includes
top-level phase timings for preflight, image build, package prep, lane pools,
and cleanup. Use `pnpm test:docker:timings <summary.json>` to rank slow lanes
and phases before deciding whether a broader rerun is justified.
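
Because every scheduler run writes a `summary.json` somewhere under `.artifacts/docker-tests/`, a small helper can locate the newest one before you read it or rank timings. This is only a sketch: `latest_docker_summary` is a hypothetical name, not a repo script, and it assumes file modification time is a good enough proxy for "most recent run".

```bash
# Sketch: print the most recently modified summary.json under an artifact root.
# Assumption: mtime ordering reflects run order.
latest_docker_summary() {
  local root newest
  root="${1:-.artifacts/docker-tests}"
  # ls -t sorts the files handed over by find, newest first.
  newest="$(find "$root" -type f -name summary.json -exec ls -t {} + 2>/dev/null | head -n 1)"
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}
```

Typical use would be `summary="$(latest_docker_summary)" && pnpm test:docker:timings "$summary"`, which feeds the newest summary into the timing command named above.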
Cheap Docker Reruns
First derive the smallest rerun command from artifacts:

bash
pnpm test:docker:rerun <github-run-id>
pnpm test:docker:rerun .artifacts/docker-tests/<run>/failures.json

The script downloads Docker E2E artifacts for a GitHub run, reads
`summary.json`/`failures.json`, and prints a combined targeted workflow command
plus per-lane commands. Prefer the combined targeted command when several lanes
failed for the same patch:

bash
gh workflow run openclaw-live-and-e2e-checks-reusable.yml \
-f ref=<sha> \
-f include_repo_e2e=false \
-f include_release_path_suites=false \
-f include_openwebui=false \
-f docker_lanes='install-e2e bundled-channel-update-acpx' \
-f include_live_suites=false \
-f live_models_only=false

That path still runs the prepare job, so it creates a new tarball for `<sha>`.
If the SHA-tagged GHCR bare/functional image already exists, CI skips rebuilding
that image and only uploads the fresh package artifact before the targeted lane
job. Do not rerun the full release path unless the failed lane list
or touched surface really requires it.
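
To make the shape of the combined targeted command concrete, the sketch below assembles it from a SHA and a lane list. `pnpm test:docker:rerun` already prints this command for you, so treat this as an illustration only: `targeted_docker_rerun_command` is a hypothetical helper, it only prints the command and does not dispatch it, and it assumes lane names contain no single quotes.

```bash
# Sketch: print (do not run) the combined targeted rerun command.
# Inputs mirror the example above; lane names are passed as separate arguments.
targeted_docker_rerun_command() {
  if [ "$#" -lt 2 ]; then
    echo "usage: targeted_docker_rerun_command <sha> <lane> [lane...]" >&2
    return 1
  fi
  local sha lanes
  sha="$1"
  shift
  lanes="$*" # remaining arguments joined with spaces
  printf '%s' "gh workflow run openclaw-live-and-e2e-checks-reusable.yml"
  printf ' -f %s' \
    "ref=$sha" \
    include_repo_e2e=false \
    include_release_path_suites=false \
    include_openwebui=false \
    "docker_lanes='$lanes'" \
    include_live_suites=false \
    live_models_only=false
  printf '\n'
}
```

For example, `targeted_docker_rerun_command <sha> install-e2e bundled-channel-update-acpx` prints a one-line equivalent of the command shown above, ready to review before running.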
Docker Expected Timings
Treat these as ballpark. Blacksmith queue time, GHCR pull speed, provider
latency, npm cache state, and Docker daemon health can dominate.
Current local timing artifact (`.artifacts/docker-tests/lane-timings.json`) has
these rough bands:
- Tiny lanes, seconds to under 1 minute: `agents-delete-shared-workspace` ~3s, `plugin-update` ~7s, `config-reload` ~14s, `pi-bundle-mcp-tools` ~15s, `onboard` ~18s, `session-runtime-context` ~20s, `gateway-network` ~34s, `qr` ~44s.
- Medium deterministic lanes, ~1-5 minutes: `npm-onboard-channel-agent` ~96s, `openai-image-auth` ~99s, bundled channel/update lanes usually ~90-300s when split, `openwebui` ~225s, `mcp-channels` ~274s.
- Heavy deterministic lanes, ~6-10 minutes: `bundled-channel-root-owned` ~429s, `bundled-channel-setup-entry` ~420s, `bundled-channel-load-failure` ~383s, `cron-mcp-cleanup` ~567s.
- Live provider lanes, often ~15-20 minutes: `live-gateway` ~958s, `live-models` ~1054s.
- Installer/release lanes: `install-e2e` and package-update paths can vary widely with npm, provider, and package registry behavior. Budget tens of minutes; prefer GitHub targeted reruns over local repeats.

Default fallback lane timeout is 120 minutes. A timeout usually means debug the
lane log/artifacts first, not “run the whole thing again.”
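
The rough bands above can be turned into a quick back-of-the-envelope budget for a lane list. This is a sketch under stated assumptions: `lane_seconds` and `estimate_lane_budget` are hypothetical helpers, the numbers are the ballpark figures quoted above rather than live data, and the sum treats lanes as running one after another, so pooled CI runs will usually finish sooner while queue and pull time can add more.

```bash
# Sketch: serial time estimate from the ballpark lane timings listed above.
# Lanes without a quoted figure (for example install-e2e) are rejected.
lane_seconds() {
  case "$1" in
    agents-delete-shared-workspace) echo 3 ;;
    plugin-update) echo 7 ;;
    config-reload) echo 14 ;;
    pi-bundle-mcp-tools) echo 15 ;;
    onboard) echo 18 ;;
    session-runtime-context) echo 20 ;;
    gateway-network) echo 34 ;;
    qr) echo 44 ;;
    npm-onboard-channel-agent) echo 96 ;;
    openai-image-auth) echo 99 ;;
    openwebui) echo 225 ;;
    mcp-channels) echo 274 ;;
    bundled-channel-root-owned) echo 429 ;;
    bundled-channel-setup-entry) echo 420 ;;
    bundled-channel-load-failure) echo 383 ;;
    cron-mcp-cleanup) echo 567 ;;
    live-gateway) echo 958 ;;
    live-models) echo 1054 ;;
    *) return 1 ;;
  esac
}

estimate_lane_budget() {
  local total lane seconds
  total=0
  for lane in "$@"; do
    if ! seconds="$(lane_seconds "$lane")"; then
      echo "no rough timing recorded for lane: $lane" >&2
      return 1
    fi
    total=$((total + seconds))
  done
  echo "$total"
}
```

Calling `estimate_lane_budget openwebui mcp-channels` adds the two quoted figures and prints the total in seconds; compare that against the 120 minute fallback timeout before deciding how many lanes to bundle into one rerun.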
Failure Workflow
- Identify exact failing job, SHA, lane, and artifact path.
- Read `failures.json`, `summary.json`, and the failed lane log tail.
- Use `pnpm test:docker:rerun <run-id|failures.json>` to generate targeted GitHub rerun commands.
- If the lane has `rerunCommand`, use that only as a local starting point.
- For Docker release failures, dispatch targeted `docker_lanes=<failed-lane>` on GitHub before considering local Docker.
- Patch narrowly, then rerun the failed file/lane only.
- Broaden to `pnpm check:changed` or CI only after the isolated proof passes.
When To Escalate
- Public SDK/plugin contract changes: run changed gate plus relevant extension validation.
- Build output, lazy imports, package boundaries, or published surfaces: include `pnpm build`.
- Workflow edits: run `pnpm check:workflows`.
- Release branch or tag validation: use release docs and GitHub workflows; avoid local Docker unless Peter explicitly asks.