run-automated-tests
Skill: Run Automated Tests
Purpose
Determine how a target repository expects automated tests to be executed (commands, frameworks, prerequisites, and scope), then run the best matching test suite(s) with a safety-first interaction policy.
Core Objective
Primary Goal: Produce test execution results with evidence-based command selection and safety guardrails.
Success Criteria (ALL must be met):
- ✅ Test plan discovered: Evidence sources identified (docs, CI configs, or build manifests)
- ✅ Commands selected: Appropriate test commands chosen based on mode (fast/ci/full) and constraints
- ✅ User confirmation obtained: Approval received before installing dependencies, using network, or starting services
- ✅ Tests executed: Commands run with captured output and exit codes
- ✅ Results summarized: Test Plan Summary produced with evidence, commands, execution status, and failures (if any)
Acceptance Test: Can a developer reproduce the test execution by following the Test Plan Summary without additional context?
Scope Boundaries
This skill handles:
- Discovering test commands from repository evidence (docs, CI, build manifests)
- Selecting appropriate test commands based on mode and constraints
- Executing tests with safety guardrails and user confirmation
- Summarizing test results with evidence and failure diagnostics
This skill does NOT handle:
- Test quality assessment or coverage analysis (use `review-testing`)
- Fixing failing tests or debugging test failures (use `run-repair-loop`)
- Writing new tests or test infrastructure (use development skills)
- Reviewing test code for best practices (use `review-testing`)

Handoff point: When tests complete (pass or fail), hand off to `run-repair-loop` for fixing failures or `review-testing` for quality assessment.
Use Cases
- You cloned a repo and want the correct test command without guessing.
- A repo has multiple test layers (unit/integration/e2e) and you need a safe default run plan.
- CI is failing and you want to reproduce locally by running the same commands used in workflows.
Behavior
- Establish scope and constraints (ask if ambiguous)
  - If the user did not specify, default to a fast, local, non-destructive run:
    - Unit tests only, no external services, no Docker, no network-dependent setup.
  - Ask the user to choose a mode if needed:
    - `fast`: unit tests only, minimal setup.
    - `ci`: mirror CI workflow commands as closely as possible.
    - `full`: include integration/e2e tests and service dependencies.
  - Ask whether Docker is allowed, whether network access is allowed, and whether installing dependencies is allowed.
- Discover the test plan (evidence-based)
  - Read these sources in order; stop early if a clear, explicit test command is found:
    - `README.md`, `CONTRIBUTING.md`, `TESTING.md`, `docs/testing*`, `Makefile`
    - CI configs: `.github/workflows/*.yml`, `.gitlab-ci.yml`, `azure-pipelines.yml`, `Jenkinsfile`
    - Build manifests: `package.json`, `pyproject.toml`, `setup.cfg`, `tox.ini`, `go.mod`, `pom.xml`, `build.gradle*`, `*.csproj`, `Cargo.toml`
  - Identify:
    - Primary test entrypoints (`npm test`, `pnpm test`, `yarn test`, `pytest`, `tox`, `go test`, `dotnet test`, `mvn test`, `gradle test`, `cargo test`, etc.)
    - Test layers and markers (unit vs integration vs e2e)
    - Environment prerequisites (DB, Redis, Docker Compose, required env vars, secrets)
    - How CI sets up dependencies (services, caches, artifacts)
  - Prefer explicit instructions found in docs or CI over heuristics.
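The discovery step can be sketched as a minimal, hypothetical helper. It assumes a Node-style repo; the file names and priority order mirror the source list above, and the helper name and return shape are illustrative, not part of the skill contract:

```python
import json
from pathlib import Path

# Documentation and CI sources, in the priority order described above.
DOC_SOURCES = ["README.md", "CONTRIBUTING.md", "TESTING.md", "Makefile"]
CI_GLOBS = [".github/workflows/*.yml", ".gitlab-ci.yml", "azure-pipelines.yml", "Jenkinsfile"]

def discover_test_evidence(repo: str) -> dict:
    """Collect candidate evidence files and any explicit npm test entrypoint."""
    root = Path(repo)
    evidence = [name for name in DOC_SOURCES if (root / name).exists()]
    for pattern in CI_GLOBS:
        evidence.extend(str(p.relative_to(root)) for p in root.glob(pattern))
    commands = []
    manifest = root / "package.json"
    if manifest.exists():
        evidence.append("package.json")
        scripts = json.loads(manifest.read_text()).get("scripts", {})
        if "test" in scripts:
            # An explicit "test" script is authoritative evidence, not a heuristic.
            commands.append("npm test")
    return {"evidence": evidence, "commands": commands}
```

A real implementation would extend the manifest handling to the other build systems listed above (`pyproject.toml`, `go.mod`, and so on) in the same priority order.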
- Select an execution plan
  - If `ci` mode: derive the run sequence from the repo's CI workflow steps (closest match).
  - If `fast` mode: pick the most direct unit-test command with the least prerequisites.
  - If multiple stacks exist (e.g., backend + frontend), propose running each stack separately in a deterministic order.
  - If the plan requires dependency installation or service startup, request confirmation before proceeding.
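The mode-based selection rules reduce to a small dispatch. This sketch assumes discovery has already produced per-layer command lists; the function and parameter names are illustrative:

```python
def select_plan(mode: str, unit_cmds: list, ci_cmds: list, full_cmds: list) -> list:
    """Pick the command sequence for the requested mode, per the rules above."""
    if mode == "ci":
        return ci_cmds                  # mirror the CI workflow as closely as possible
    if mode == "full":
        return unit_cmds + full_cmds    # run integration/e2e after unit tests
    return unit_cmds[:1]                # fast: the most direct unit-test command only
```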
- Execute with guardrails
  - Always print the exact commands you will run before running them.
  - Use a working directory rooted at the target repo (default `.`).
  - Capture and summarize failures:
    - First failing command and exit code
    - The most relevant error excerpt
    - Next actions (missing toolchain, missing env var, service not running, etc.)
  - Avoid destructive operations:
    - Do not run `rm -rf`, `git clean -fdx`, `docker system prune`, or database drop/migrate commands without explicit user approval.
  - If the repo requires secrets, do not ask the user to paste secrets into chat. Prefer `.env` files, secret managers, or documented local dev flows.
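The print-then-run guardrail could look like this sketch. The destructive-prefix list is a simplified stand-in for the full policy above, and the return shape loosely mirrors the `result` object in the output contract:

```python
import shlex
import subprocess

# Commands refused without explicit approval (simplified version of the policy above).
DESTRUCTIVE_PREFIXES = ("rm -rf", "git clean", "docker system prune")

def run_guarded(command: list, cwd: str = ".", approved: bool = False) -> dict:
    """Print the exact command, refuse destructive ones, capture output and exit code."""
    rendered = " ".join(shlex.quote(part) for part in command)
    if rendered.startswith(DESTRUCTIVE_PREFIXES) and not approved:
        return {"command": rendered, "status": "blocked", "exit_code": None}
    print(f"$ {rendered}")  # always show the exact command before running it
    proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return {
        "command": rendered,
        "status": "passed" if proc.returncode == 0 else "failed",
        "exit_code": proc.returncode,
        "error_excerpt": proc.stderr[-500:],  # keep only the most relevant tail
    }
```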
Input & Output
Input
- Target repository path (default `.`).
- Mode: `fast` (default), `ci`, or `full`.
- Constraints: allow dependency install (yes/no), allow network (yes/no), allow Docker (yes/no).
Output
- A short "Test Plan Summary" containing:
- Evidence: which files/paths informed the plan
- Chosen commands (in order)
- Assumptions and prerequisites
- What was executed and what was skipped (and why)
- Command transcript snippets sufficient to debug failures (do not dump extremely long logs unless asked).
Restrictions
Hard Boundaries
- Do not invent test commands when evidence exists (prefer docs/CI).
- Do not install dependencies, run Docker, or start external services without confirmation.
- Do not modify repository files unless the user explicitly requests it (exception: generating a report file if the user asked for artifacts).
- Do not exfiltrate secrets; do not request sensitive credentials in chat.
Skill Boundaries (Avoid Overlap)
Do NOT do these (other skills handle them):
- Test quality assessment: Evaluating test coverage, test design, or testing best practices → Use `review-testing`
- Fixing test failures: Debugging failing tests, repairing broken test code, or investigating root causes → Use `run-repair-loop`
- Writing tests: Creating new test cases, test infrastructure, or test frameworks → Use development/implementation skills
- Code review: Reviewing test code for quality, maintainability, or best practices → Use `review-testing`
- Repository analysis: Comprehensive codebase structure analysis or architecture review → Use `review-codebase`

When to stop and hand off:
- Tests fail and user asks "why?" or "how to fix?" → Hand off to `run-repair-loop` for debugging and repair
- User asks "are these tests good?" or "what's our coverage?" → Hand off to `review-testing` for quality assessment
- User asks "can you write tests for X?" → Hand off to development workflow for test implementation
- Tests pass and user asks "what should we test next?" → Hand off to `review-testing` for test strategy recommendations
Self-Check
Core Success Criteria (ALL must be met)
- Test plan discovered: Evidence sources identified (docs, CI configs, or build manifests)
- Commands selected: Appropriate test commands chosen based on mode (fast/ci/full) and constraints
- User confirmation obtained: Approval received before installing dependencies, using network, or starting services
- Tests executed: Commands run with captured output and exit codes
- Results summarized: Test Plan Summary produced with evidence, commands, execution status, and failures (if any)
Process Quality Checks
- Evidence-based selection: Did I identify at least one authoritative test instruction source (doc file, CI workflow, or build manifest)?
- Safety guardrails applied: Did I ask for confirmation before any action that installs dependencies, uses network, starts Docker/services, or changes state?
- Commands printed: Did I print the exact commands before running them?
- Failures diagnosed: If tests failed, did I provide the first failing command, exit code, and likely root cause category?
- No destructive operations: Did I avoid running destructive commands (`rm -rf`, `git clean`, `docker system prune`, database drops) without explicit approval?
- No secret exfiltration: Did I avoid requesting sensitive credentials in chat and prefer `.env` files or documented local dev flows?
Acceptance Test
Can a developer reproduce the test execution by following the Test Plan Summary without additional context?
If NO: Test Plan Summary is incomplete. Add missing evidence, commands, or prerequisites.
If YES: Skill execution is complete. Proceed to handoff if needed.
Examples
Example 1: JavaScript repo with package.json
User: "Run tests for this repo."
Agent:
- Inspect `package.json` scripts and `.github/workflows/*`.
- Decide `fast` mode and propose:
  - `npm test` (or `pnpm test`/`yarn test` if the repo standardizes on it)
- Ask: "Install dependencies (`npm ci`) and allow network?"
- Run: `npm ci`, then `npm test`
- Summarize results and point to failing test output if any.

Example 2 (Edge case): Monorepo with integration tests requiring Docker
User: "Mirror CI locally."
Agent:
- Parse `.github/workflows/ci.yml` and identify separate jobs:
  - backend unit tests
  - frontend tests
  - integration tests with `docker compose`
- Ask for confirmation:
  - allow Docker
  - allow network
  - which jobs to run (all vs only failing job)
- Execute in a controlled order:
  - install deps per job
  - run unit tests first
  - bring up services for integration tests
- If integration tests fail, summarize:
  - service health / port conflicts
  - missing env vars
  - how CI config differs from local
Appendix: Output contract
Each skill execution MUST produce a Test Plan Summary in this exact JSON format:

```json
{
  "test_plan_summary": {
    "mode": "fast | ci | full",
    "evidence": ["path/to/source1", "path/to/source2"],
    "commands": [
      {"command": "npm test", "purpose": "run unit tests", "order": 1}
    ],
    "prerequisites": ["npm ci", "Docker running"],
    "executed": ["npm ci", "npm test"],
    "skipped": ["integration tests - require Docker"],
    "result": {
      "status": "passed | failed | blocked",
      "exit_code": 0,
      "first_failure": {
        "command": "npm test",
        "exit_code": 1,
        "error_excerpt": "FAIL src/utils.test.js"
      }
    }
  }
}
```

| Element | Type | Description |
|---|---|---|
| `mode` | string | Selected mode: `fast`, `ci`, or `full` |
| `evidence` | array | Source files that informed the test plan |
| `commands` | array | Selected test commands with purpose and order |
| `prerequisites` | array | Required setup steps |
| `executed` | array | Commands actually run |
| `skipped` | array | Commands skipped and reason |
| `result.status` | string | Execution status: `passed`, `failed`, or `blocked` |
| `result.exit_code` | number | Exit code of test command |
| `result.first_failure` | object | First failure details (if any) |

This schema enables Agent consumption without prose parsing.
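As a sketch of how a consuming Agent might enforce the contract, the check below validates the required fields and the `status` enum. The field names come from the schema above; the helper itself is hypothetical:

```python
# Required top-level fields of test_plan_summary and their expected JSON types.
REQUIRED_FIELDS = {
    "mode": str, "evidence": list, "commands": list,
    "prerequisites": list, "executed": list, "skipped": list, "result": dict,
}

def validate_summary(payload: dict) -> list:
    """Return a list of contract violations; an empty list means the summary conforms."""
    summary = payload.get("test_plan_summary")
    if not isinstance(summary, dict):
        return ["missing test_plan_summary object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(summary.get(field), expected):
            errors.append(f"{field}: expected {expected.__name__}")
    if isinstance(summary.get("result"), dict):
        if summary["result"].get("status") not in {"passed", "failed", "blocked"}:
            errors.append("result.status: must be passed | failed | blocked")
    return errors
```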