blacksmith-testbox

Blacksmith Testbox

Scope

Use Testbox when you need remote CI parity, injected secrets, hosted services, or an OS/runtime image that your local machine cannot provide cheaply.
For OpenClaw, Crabbox is a supported alternative when Blacksmith is unavailable or owned cloud capacity is preferable.
Do not default to Testbox for every local test/build loop. If the repo has documented local commands for normal iteration, use those first so you keep warm caches, local build state, and fast feedback.
Testbox is the expensive path. Reach for it deliberately.
OpenClaw maintainers can opt into Testbox-first validation by setting OPENCLAW_TESTBOX=1 in their environment or standing agent rules. This mode is maintainers-only and requires Blacksmith access.
When OPENCLAW_TESTBOX=1 is set in OpenClaw:
  • Pre-warm a Testbox early for longer, wider, or uncertain work.
  • Prefer Testbox for pnpm gates, e2e, package-like proof, and broad suites.
  • Reuse the same Testbox ID for every run command in the same task/session.
  • Use local commands only when the task explicitly sets OPENCLAW_LOCAL_CHECK_MODE=throttled|full, or when the user asks for local proof.
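The precedence between the two variables can be sketched as a small shell function. This is a minimal illustration, not code from the OpenClaw repo: the function name choose_validation_path is hypothetical, and a user's spoken request for local proof is not modeled.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenClaw): prints "local" or "testbox"
# based on the two environment variables described above.
choose_validation_path() {
  # The explicit local escape hatch wins when set to a documented value.
  case "${OPENCLAW_LOCAL_CHECK_MODE:-}" in
    throttled|full) echo "local"; return 0 ;;
  esac
  # Maintainer Testbox mode makes Testbox the default path.
  if [ "${OPENCLAW_TESTBOX:-}" = "1" ]; then
    echo "testbox"
  else
    echo "local"
  fi
}
```

With neither variable set the function prints local, which matches the guidance that Testbox is not the default.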

Install the CLI

If blacksmith is not installed, install it:
curl -fsSL https://get.blacksmith.sh | sh
For the canary channel (bleeding-edge):
BLACKSMITH_CHANNEL=canary sh -c 'curl -fsSL https://get.blacksmith.sh | sh'
Then authenticate:
blacksmith auth login

Agent-triggered browser auth (non-interactive)

When an agent needs to ensure the user is authenticated before running testbox commands (e.g. warmup, run), use browser-based auth with non-interactive mode. This opens the browser for the user to sign in; the agent does not interact with the browser. The org selector in the dashboard is skipped, so the user only sees the sign-in flow.
Required command (--organization is required with --non-interactive):
blacksmith auth login --non-interactive --organization <org-slug>
The org slug can come from the BLACKSMITH_ORG env var or the --org global flag. If neither is set, the agent should use the project's known org (e.g. from repo config or user context). Example:
blacksmith auth login --non-interactive --organization acme-corp
blacksmith --org acme-corp auth login --non-interactive --organization acme-corp
Flow: The CLI starts a local callback server, opens the browser to the dashboard auth page, and blocks for up to 2 minutes. The user completes sign-in and authorization in the browser. The dashboard redirects to localhost with the token; the CLI saves credentials and exits. The agent then proceeds.
Do not use --api-token for this flow; that is for headless/token-based auth. This skill focuses on browser-based auth when the user prefers signing in via the web UI.
Optional flags:
  • --dashboard-url <url>: Override dashboard URL (e.g. for staging)
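The org-slug lookup described above can be sketched in shell. This is a minimal sketch under stated assumptions: resolve_org and login_command are hypothetical names, and the fallback argument stands in for "the project's known org". Only BLACKSMITH_ORG and the login flags come from the text.

```shell
#!/bin/sh
# Hypothetical helper: pick the org slug from BLACKSMITH_ORG, else from a
# caller-supplied fallback (an assumption standing in for repo config).
resolve_org() {
  fallback="${1:-}"
  if [ -n "${BLACKSMITH_ORG:-}" ]; then
    echo "$BLACKSMITH_ORG"
  elif [ -n "$fallback" ]; then
    echo "$fallback"
  else
    echo "no org slug available" >&2
    return 1
  fi
}

# Prints the login command instead of running it, so an agent can log it
# or hand it to the user before the browser opens.
login_command() {
  org="$(resolve_org "${1:-}")" || return 1
  echo "blacksmith auth login --non-interactive --organization $org"
}
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; an agent would execute the printed line once the slug looks right.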

Decide first: local or Testbox

Before warming anything up, check the repo's own instructions.
Prefer local commands when:
  • the repo documents a supported local test/build workflow
  • you are iterating on unit tests, lint, typecheck, formatting, or other local-only validation
  • the value comes from warm local caches and fast repeat runs
  • the command does not need remote secrets, hosted services, or CI-only images
Prefer Testbox when:
  • the repo explicitly requires CI-parity or remote validation
  • the command needs secrets, service containers, or provisioned infra
  • you are reproducing CI-only failures
  • you need the exact workflow image/job environment from GitHub Actions
For OpenClaw specifically, normal local iteration stays local unless maintainer Testbox mode is enabled with OPENCLAW_TESTBOX=1:
  • pnpm check:changed
  • pnpm test:changed
  • pnpm test <path-or-filter>
  • pnpm test:serial
  • pnpm build
If OPENCLAW_TESTBOX=1 is enabled, run those same repo commands inside the warm Testbox. If the user wants laptop-friendly local proof for one command, use the explicit escape hatch OPENCLAW_LOCAL_CHECK_MODE=throttled.
For installable-package product proof, prefer the GitHub Package Acceptance workflow over an ad hoc Testbox command. It resolves one package candidate (source=npm, source=ref, source=url, or source=artifact), uploads it as package-under-test, and runs the reusable Docker E2E lanes against that exact tarball on GitHub/Blacksmith runners. Use workflow_ref for the trusted workflow/harness code and package_ref for the source ref to pack when testing an older trusted branch, tag, or SHA.

Setup: Warmup before coding

If you decided Testbox is warranted, warm one up early. This returns an ID instantly and boots the CI environment in the background while you work:
blacksmith testbox warmup ci-check-testbox.yml
# → tbx_01jkz5b3t9...
Save this ID in the current session. You need it for every run command. Treat blacksmith testbox list as diagnostics, not a reusable work queue. Listed boxes can be visible at the org/repo level while still being unusable or stale for the current local agent lane.
For OpenClaw maintainer Testbox mode, pre-warm at the start of longer or wider tasks:
blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90
pnpm testbox:claim --id <ID>
Use the build-artifact warmup when e2e/package/build proof benefits from seeded dist/, dist-runtime/, and build-all caches:
blacksmith testbox warmup ci-build-artifacts-testbox.yml --ref main --idle-timeout 90
pnpm testbox:claim --id <ID>
Warmup dispatches a GitHub Actions workflow that provisions a VM with the full CI environment: dependencies installed, services started, secrets injected, and a clean checkout of the repo at the dispatch ref (the default branch unless --ref is given).
In OpenClaw, raw commit SHAs are not reliable dispatch refs for warmup --ref; use a branch or tag. The build-artifact workflow resolves openclaw@beta and openclaw@latest to SHA cache keys internally.
Options:
--ref <branch|tag>     Git ref to dispatch against (default: repo's default branch)
--job <name>           Specific job within the workflow (if it has multiple)
--idle-timeout <min>   Idle timeout in minutes (default: 30)
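Since every later run needs the ID, it can help to capture it in a variable at warmup time. The sketch below assumes the ID appears on stdout with the tbx_ prefix seen in the example above; the exact output format of warmup is not specified here, and extract_tbx_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads text on stdin and prints the first token that
# looks like a Testbox ID. Assumes the "tbx_" prefix from the example above.
extract_tbx_id() {
  grep -o 'tbx_[A-Za-z0-9]*' | head -n 1
}

# Usage (not executed here):
#   TBX_ID="$(blacksmith testbox warmup ci-check-testbox.yml | extract_tbx_id)"
#   blacksmith testbox run --id "$TBX_ID" "npm test"
```

If the CLI prints the ID in a different shape, adjust the pattern rather than relying on this one.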

CRITICAL: Always run from the repo root

ALWAYS invoke blacksmith testbox commands from the root of the git repository. The CLI syncs the current working directory to the testbox using rsync with --delete. If you run from a subdirectory (e.g. cd backend && blacksmith testbox run ...), rsync will mirror only that subdirectory and delete everything else on the testbox, wiping other directories like dashboard/, cli/, etc.
# CORRECT — run from repo root, use paths in the command
blacksmith testbox run --id <ID> "cd backend && php artisan test"
blacksmith testbox run --id <ID> "cd dashboard && npm test"

# WRONG — do NOT cd into a subdirectory before invoking the CLI
cd backend && blacksmith testbox run --id <ID> "php artisan test"
If your shell is in a subdirectory, cd back to the repo root first:
cd "$(git rev-parse --show-toplevel)"
blacksmith testbox run --id <ID> "cd backend && php artisan test"
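One way to make the repo-root rule hard to violate is a guard that refuses to call the CLI from anywhere else. This is a minimal sketch: tbx_run_from_root is a hypothetical wrapper name, and it assumes git rev-parse --show-toplevel and pwd -P report the same physical path for the repo root.

```shell
#!/bin/sh
# Hypothetical wrapper: only forwards to the CLI when the current directory
# is the repository root, so rsync --delete never mirrors a subdirectory.
tbx_run_from_root() {
  root="$(git rev-parse --show-toplevel)" || return 1
  if [ "$(pwd -P)" != "$root" ]; then
    echo "refusing to run: cwd is not the repo root ($root)" >&2
    return 1
  fi
  # All arguments (e.g. --id <ID> "<command>") pass through unchanged.
  blacksmith testbox run "$@"
}
```

Failing loudly is a deliberate choice here: silently changing directory could hide the fact that the caller's relative paths were written for a subdirectory.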

Running commands

blacksmith testbox run --id <ID> "<command>"
The run command automatically waits for the testbox to become ready if it is still booting, so you can call run immediately after warmup without needing to check status first.
In OpenClaw, prefer the guarded runner wrapper so stale/reused ids fail before the Blacksmith CLI spends time syncing or emits a confusing missing-key error:
pnpm testbox:run --id <ID> -- "OPENCLAW_TESTBOX=1 pnpm check:changed"
The wrapper refuses to run when the local per-Testbox key is missing or when the id was not claimed by this OpenClaw checkout with pnpm testbox:claim --id <ID>. When it refuses, claiming the box is the expected remediation; the refusal does not indicate a GitHub account or normal SSH-key problem. A local key alone is not enough; a ready box may still carry stale rsync state from another lane.
If the agent crashes, the remote box relies on Blacksmith's idle timeout. The local OpenClaw claim marker is not deleted automatically, so the wrapper treats claims older than 12 hours as stale. Override only for intentional long-running work with OPENCLAW_TESTBOX_CLAIM_TTL_MINUTES=<minutes>.
Before spending a broad gate on a manually assembled command, you can also run:
pnpm testbox:sanity -- --id <ID>

Downloading files from a testbox

Use the download command to retrieve files or directories from a running testbox to your local machine. This is useful for fetching build artifacts, test results, coverage reports, or any output generated on the testbox.
blacksmith testbox download --id <ID> <remote-path> [local-path]
The remote path is relative to the testbox working directory (same as run). If no local path is specified, the file is saved to the current directory using the same base name.
To download a directory, append a trailing / to the remote path; this triggers recursive mode:
# Download a single file
blacksmith testbox download --id <ID> coverage/report.html

# Download a file to a specific local path
blacksmith testbox download --id <ID> build/output.tar.gz ./output.tar.gz

# Download an entire directory
blacksmith testbox download --id <ID> test-results/ ./results/
Options:
--ssh-private-key <path>   Path to SSH private key (if warmup used --ssh-public-key)

How file sync works

Understanding this model is critical for using Testbox correctly.
When you call run, the CLI performs a delta sync of your local changes to the remote testbox before executing your command:
  1. The testbox VM starts from a clean actions/checkout at the warmup ref. The workflow's setup steps (e.g. npm install, pip install, composer install) run during warmup and populate dependency directories on the remote VM.
  2. On each run, the CLI uses git to detect which files changed locally since the last sync. It syncs ONLY tracked files and untracked non-ignored files (i.e. files that git ls-files reports).
  3. .gitignore'd directories are never synced. This means directories like node_modules/, vendor/, .venv/, build/, dist/, etc. are NOT transferred from your local machine. The testbox uses its own copies of those directories, populated during the warmup workflow steps.
  4. If nothing has changed since the last sync (same git commit and working tree state), the sync is skipped entirely for speed.
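To see roughly which files fall inside that sync set, git itself can list tracked files plus untracked files that are not ignored. This is an approximate preview under an assumption: the CLI's exact git invocation is not shown in this document, and preview_sync_set is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: lists tracked files (--cached) and untracked files
# that are not ignored (--others --exclude-standard). Assumed to approximate
# the set the CLI syncs; it is not the CLI's own implementation.
preview_sync_set() {
  git ls-files --cached --others --exclude-standard
}

# Usage from the repo root (not executed here):
#   preview_sync_set | grep -c .     # how many files are in the sync set
#   preview_sync_set | grep '^dist/' # expect no output if dist/ is ignored
```

If a file you expected on the testbox is missing from this list, check .gitignore before suspecting the sync.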

Why this matters

  • Changing dependencies: If you modify package.json, requirements.txt, composer.json, go.mod, or similar dependency manifests, the lock/manifest file will be synced but the actual dependency directory will NOT. You must re-run the install command on the testbox:
    blacksmith testbox run --id <ID> "npm install && npm test"
    blacksmith testbox run --id <ID> "pip install -r requirements.txt && pytest"
    blacksmith testbox run --id <ID> "composer install && phpunit"
  • Generated/build artifacts: If your tests depend on a build step (e.g. npm run build, make), and you changed source files that affect the build output, re-run the build on the testbox before testing.
  • New untracked files: New files you create locally ARE synced (as long as they are not gitignored). You do not need to git add them first.
  • Deleted files: Files you delete locally are also deleted on the remote testbox. The sync model keeps the remote in lockstep with your local managed file set.
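The dependency rule above lends itself to a small check: given a list of changed paths, decide whether the remote command needs an install step first. This is a minimal sketch; needs_install is a hypothetical name and the manifest list is illustrative, not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper: reads changed file paths on stdin and exits 0 if any
# of them is a dependency manifest or lockfile (illustrative list only).
needs_install() {
  grep -Eq '(^|/)(package\.json|pnpm-lock\.yaml|requirements\.txt|composer\.json|go\.mod)$'
}

# Usage for an npm project (not executed here):
#   if git diff --name-only HEAD | needs_install; then
#     cmd="npm install && npm test"
#   else
#     cmd="npm test"
#   fi
#   blacksmith testbox run --id "$TBX_ID" "$cmd"
```

The pattern anchors on the file name, so a path like docs/package.json.md does not trigger an install.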

CRITICAL: Do not ban local tests

Do not assume local validation is forbidden. Many repos intentionally invest in fast, warm local loops, and forcing every run through Testbox destroys that advantage.
Use Testbox for the checks that actually need it: remote parity, secrets, services, CI-only runners, or reproducibility against the workflow image.
If the repo says local tests/builds are the normal path, follow the repo.
OpenClaw maintainer exception: if OPENCLAW_TESTBOX=1 is set by the user or agent environment, treat Testbox as the normal validation path for this repo. Use OPENCLAW_LOCAL_CHECK_MODE=throttled|full as the explicit local escape hatch.

When to use

Use Testbox when:
  • running database migrations or destructive environment checks
  • running commands that depend on secrets or environment variables not present locally
  • reproducing CI-only failures or validating against the workflow image
  • validating behavior that needs provisioned services or remote runners
  • doing a final parity check before commit/push when the repo or user wants that
Trim that list based on repo guidance. If the repo documents supported local tests/builds, prefer local for routine iteration and keep Testbox for the checks that need parity or remote state.

Workflow

  1. Decide whether the repo's local loop is the right default. For OpenClaw, OPENCLAW_TESTBOX=1 makes Testbox the maintainer default.
  2. If Testbox is warranted, warm up early:
    blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90
    → save the ID, then
    pnpm testbox:claim --id <ID>
  3. Write code while the testbox boots in the background.
  4. Run the remote command when needed:
    pnpm testbox:run --id <ID> -- "OPENCLAW_TESTBOX=1 pnpm check:changed"
  5. If tests fail, fix code and re-run against the same warm box.
  6. If you changed dependency manifests (package.json, etc.), prepend the install command:
    blacksmith testbox run --id <ID> "npm install && npm test"
  7. If a narrow PR reports a full sync or the box was reused/expired, sanity check the remote copy before a slow gate:
    pnpm testbox:run --id <ID> -- "pnpm testbox:sanity"
    If it reports missing root files or mass tracked deletions, stop the box and warm a fresh one. Use OPENCLAW_TESTBOX_ALLOW_MASS_DELETIONS=1 only for an intentional large deletion PR.
  8. If you need artifacts (coverage reports, build outputs, etc.), download them:
    blacksmith testbox download --id <ID> coverage/ ./coverage/
  9. Once green, commit and push.

OpenClaw full test suite

For OpenClaw, use the repo package manager and the measured stable full-suite profile below. It keeps six Vitest project shards active while limiting each shard to one worker to avoid worker OOMs on Testbox:
blacksmith testbox run --id <ID> "env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test"
Observed full-suite time on Blacksmith Testbox is about 3-4 minutes:
  • 173-180s on a warmed box
  • 219s on a fresh 32-vCPU box
When validating before commit/push in maintainer Testbox mode, run pnpm check:changed inside the warmed box first when appropriate, then the full suite with the profile above if broad confidence is needed.
Run pnpm testbox:sanity inside the warmed box before the broad command when the sync looks suspicious. It checks that root files such as pnpm-lock.yaml still exist and fails on 200 or more tracked deletions. That catches stale or corrupted rsync state before dependency install or Vitest failures hide the real problem.
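The two conditions the sanity check is described as enforcing can be illustrated with a short function. This is an illustration only and not the actual pnpm testbox:sanity script: the function name, its arguments, and the single root file it inspects are assumptions; only pnpm-lock.yaml and the 200-deletion threshold come from the text.

```shell
#!/bin/sh
# Illustrative sketch, NOT the real testbox:sanity implementation.
# $1: directory to inspect, $2: number of tracked deletions,
# $3: optional limit (defaults to 200, the threshold named above).
sanity_check() {
  dir="$1"
  deletions="$2"
  limit="${3:-200}"
  if [ ! -f "$dir/pnpm-lock.yaml" ]; then
    echo "missing root file: pnpm-lock.yaml" >&2
    return 1
  fi
  # "200 or more" fails, so the comparison is greater-or-equal.
  if [ "$deletions" -ge "$limit" ]; then
    echo "too many tracked deletions: $deletions" >&2
    return 1
  fi
}

# Usage from a repo root (not executed here):
#   sanity_check . "$(git ls-files --deleted | wc -l | tr -d ' ')"
```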

Examples

blacksmith testbox warmup ci-check-testbox.yml # → tbx_01jkz5b3t9...
# Run tests
blacksmith testbox run --id <ID> "npm test -- --testPathPattern=handler.test"
blacksmith testbox run --id <ID> "go test ./pkg/api/... -run TestHandler -v"
blacksmith testbox run --id <ID> "python -m pytest tests/test_api.py -k test_auth"

# Re-install deps after changing package.json, then test
blacksmith testbox run --id <ID> "npm install && npm test"

# Build and test
blacksmith testbox run --id <ID> "npm run build && npm test"

# Download artifacts from the testbox
blacksmith testbox download --id <ID> coverage/lcov-report/ ./coverage/
blacksmith testbox download --id <ID> build/output.tar.gz

Waiting for the testbox to be ready

The run command automatically waits for the testbox, so explicit waiting is usually unnecessary. If you do need to check readiness separately (e.g. before a series of runs), use the --wait flag. Do NOT use a sleep-and-recheck loop.
Correct: block until ready with a timeout:
blacksmith testbox status --id <ID> --wait [--wait-timeout 5m]
Wrong: never use sleep + status in a loop:
# BAD — do not do this
sleep 30 && blacksmith testbox status --id <ID>
while ! blacksmith testbox status --id <ID> | grep ready; do sleep 5; done
--wait polls the status and exits as soon as the testbox is ready (or when the timeout is reached). Default timeout is 5m; use --wait-timeout for longer (e.g. 10m, 1h).

Managing testboxes

# Check status of a specific testbox
blacksmith testbox status --id <ID>

# List all active testboxes for the current repo
blacksmith testbox list

# Stop a testbox when you're done (frees resources)
blacksmith testbox stop --id <ID>
Testboxes automatically shut down after being idle (default: 30 minutes). If you need a longer session, increase the timeout at warmup time. For OpenClaw maintainer work, use 90 minutes for long-running sessions:
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90
blacksmith testbox warmup ci-build-artifacts-testbox.yml --idle-timeout 90


With options

blacksmith testbox warmup ci-check-testbox.yml --ref main
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90
blacksmith testbox run --id <ID> "go test ./..."