outside-in-testing

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Outside-In Testing Skill

外部驱动测试技能

Purpose [LEVEL 1]

用途 [LEVEL 1]

This skill helps you create agentic outside-in tests that verify application behavior from an external user's perspective without any knowledge of internal implementation. Using the gadugi-agentic-test framework, you write declarative YAML scenarios that AI agents execute, observe, and validate.
Key Principle: Tests describe WHAT should happen, not HOW it's implemented. Agents figure out the execution details.
本技能可帮助你创建智能外部驱动测试,从外部用户视角验证应用行为,无需了解任何内部实现细节。借助gadugi-agentic-test框架,你只需编写声明式YAML场景,由AI代理负责执行、观测和验证。
核心原则:测试描述的是应该发生什么,而非实现方式。代理会自行处理执行细节。

When to Use This Skill [LEVEL 1]

适用场景 [LEVEL 1]

Perfect For

理想适用场景

  • Smoke Tests: Quick validation that critical user flows work
  • Behavior-Driven Testing: Verify features from user perspective
  • Cross-Platform Testing: Same test logic for CLI, TUI, Web, Electron
  • Refactoring Safety: Tests remain valid when implementation changes
  • AI-Powered Testing: Let agents handle complex interactions
  • Documentation as Tests: YAML scenarios double as executable specs
  • 冒烟测试:快速验证关键用户流程是否正常工作
  • 行为驱动测试:从用户视角验证功能
  • 跨平台测试:同一测试逻辑适用于CLI、TUI、Web、Electron
  • 重构安全保障:内部实现变更时,测试依然有效
  • AI驱动测试:让代理处理复杂交互
  • 可执行文档:YAML场景同时可作为可执行规格说明

Use This Skill When

建议使用本技能的情况

  • Starting a new project and defining expected behaviors
  • Refactoring code and need tests that won't break with internal changes
  • Testing user-facing applications (CLI tools, TUIs, web apps, desktop apps)
  • Writing acceptance criteria that can be automatically verified
  • Need tests that non-developers can read and understand
  • Want to catch regressions in critical user workflows
  • Testing complex multi-step interactions
  • 启动新项目并定义预期行为时
  • 重构代码,需要不会因内部变更而失效的测试时
  • 测试面向用户的应用(CLI工具、TUI、Web应用、桌面应用)时
  • 编写可自动验证的验收标准时
  • 需要非开发人员也能阅读和理解的测试时
  • 希望捕获关键用户流程中的回归问题时
  • 测试复杂多步骤交互时

Don't Use This Skill When

不建议使用本技能的情况

  • Need unit tests for internal functions (use test-gap-analyzer instead)
  • Testing performance or load characteristics
  • Need precise timing or concurrency control
  • Testing non-interactive batch processes
  • Implementation details matter more than behavior
  • 需要为内部函数编写单元测试时(请使用test-gap-analyzer)
  • 测试性能或负载特性时
  • 需要精确的时序或并发控制时
  • 测试非交互式批处理流程时
  • 实现细节比行为更重要时

Core Concepts [LEVEL 1]

核心概念 [LEVEL 1]

Outside-In Testing Philosophy

外部驱动测试理念

Traditional Inside-Out Testing:
python
undefined
传统内部驱动测试
python
undefined

Tightly coupled to implementation

与实现紧密耦合

def test_calculator_add(): calc = Calculator() result = calc.add(2, 3) assert result == 5 assert calc.history == [(2, 3, 5)] # Knows internal state

**Agentic Outside-In Testing**:

```yaml
def test_calculator_add(): calc = Calculator() result = calc.add(2, 3) assert result == 5 assert calc.history == [(2, 3, 5)] # 依赖内部状态

**智能外部驱动测试**:

```yaml

Implementation-agnostic behavior verification

与实现无关的行为验证

scenario: name: "Calculator Addition" steps: - action: launch target: "./calculator" - action: send_input value: "add 2 3" - action: verify_output contains: "Result: 5"

**Benefits**:

- Tests survive refactoring (internal changes don't break tests)
- Readable by non-developers (YAML is declarative)
- Platform-agnostic (same structure for CLI/TUI/Web/Electron)
- AI agents handle complexity (navigation, timing, screenshots)
scenario: name: "计算器加法测试" steps: - action: launch target: "./calculator" - action: send_input value: "add 2 3" - action: verify_output contains: "Result: 5"

**优势**:

- 测试可在重构后继续使用(内部变更不会导致测试失效)
- 非开发人员也能轻松阅读(YAML是声明式的)
- 平台无关(CLI/TUI/Web/Electron使用相同结构)
- AI代理处理复杂操作(导航、时序、截图)

The Gadugi Agentic Test Framework [LEVEL 2]

gadugi-agentic-test框架 [LEVEL 2]

Gadugi-agentic-test is a Python framework that:
  1. Parses YAML test scenarios with declarative steps
  2. Dispatches to specialized agents (CLI, TUI, Web, Electron agents)
  3. Executes actions (launch, input, click, wait, verify)
  4. Collects evidence (screenshots, logs, output captures)
  5. Validates outcomes against expected results
  6. Generates reports with evidence trails
Architecture:
YAML Scenario → Scenario Loader → Agent Dispatcher → Execution Engine
                     [CLI Agent, TUI Agent, Web Agent, Electron Agent]
                           Observers → Comprehension Agent
                                   Evidence Report
Gadugi-agentic-test是一个Python框架,具备以下功能:
  1. 解析YAML测试场景:处理声明式步骤
  2. 调度至专用代理:CLI、TUI、Web、Electron代理
  3. 执行操作:启动、输入、点击、等待、验证
  4. 收集证据:截图、日志、输出捕获
  5. 验证结果:与预期结果对比
  6. 生成报告:包含证据追踪
架构
YAML场景 → 场景加载器 → 代理调度器 → 执行引擎
                     [CLI代理、TUI代理、Web代理、Electron代理]
                           观测器 → 理解代理
                                   证据报告

Progressive Disclosure Levels [LEVEL 1]

渐进式学习层级 [LEVEL 1]

This skill teaches testing in three levels:
  • Level 1: Fundamentals - Basic single-action tests, simple verification
  • Level 2: Intermediate - Multi-step flows, conditional logic, error handling
  • Level 3: Advanced - Custom agents, visual regression, performance validation
Each example is marked with its level. Start at Level 1 and progress as needed.
本技能将测试分为三个层级:
  • Level 1: 基础 - 单操作基础测试、简单验证
  • Level 2: 中级 - 多步骤流程、条件逻辑、错误处理
  • Level 3: 高级 - 自定义代理、视觉回归、性能验证
每个示例都标记了对应的层级。建议从Level 1开始,根据需要逐步进阶。

Quick Start [LEVEL 1]

快速开始 [LEVEL 1]

Installation

安装

Prerequisites (for native module compilation):
bash
undefined
前置依赖(用于原生模块编译)
bash
undefined

macOS

macOS

xcode-select --install
xcode-select --install

Ubuntu/Debian

Ubuntu/Debian

sudo apt-get install -y build-essential python3
sudo apt-get install -y build-essential python3

Windows: Install Visual Studio Build Tools with "Desktop development with C++"

Windows: 安装Visual Studio Build Tools并勾选"Desktop development with C++"


**Install the framework:**

```bash

**安装框架**:

```bash

Install globally for CLI access

全局安装以使用CLI

npm install -g @gadugi/agentic-test
npm install -g @gadugi/agentic-test

Or install locally in your project

或在项目中本地安装

npm install @gadugi/agentic-test
npm install @gadugi/agentic-test

Verify installation

验证安装

gadugi-test --version
undefined
gadugi-test --version
undefined

Your First Test (CLI Example)

第一个测试(CLI示例)

Create
test-hello.yaml
:
yaml
scenario:
  name: "Hello World CLI Test"
  description: "Verify CLI prints greeting"
  type: cli

  prerequisites:
    - "./hello-world executable exists"

  steps:
    - action: launch
      target: "./hello-world"

    - action: verify_output
      contains: "Hello, World!"

    - action: verify_exit_code
      expected: 0
Run the test:
bash
gadugi-test run test-hello.yaml
Output:
✓ Scenario: Hello World CLI Test
  ✓ Step 1: Launched ./hello-world
  ✓ Step 2: Output contains "Hello, World!"
  ✓ Step 3: Exit code is 0

PASSED (3/3 steps successful)
Evidence saved to: ./evidence/test-hello-20250116-093045/
创建
test-hello.yaml
yaml
scenario:
  name: "Hello World CLI测试"
  description: "验证CLI输出问候语"
  type: cli

  prerequisites:
    - "./hello-world可执行文件存在"

  steps:
    - action: launch
      target: "./hello-world"

    - action: verify_output
      contains: "Hello, World!"

    - action: verify_exit_code
      expected: 0
运行测试:
bash
gadugi-test run test-hello.yaml
输出:
✓ 场景: Hello World CLI测试
  ✓ 步骤1: 启动./hello-world
  ✓ 步骤2: 输出包含"Hello, World!"
  ✓ 步骤3: 退出码为0

测试通过(3/3步骤成功)
证据已保存至: ./evidence/test-hello-20250116-093045/

Understanding the YAML Structure [LEVEL 1]

YAML结构说明 [LEVEL 1]

Every test scenario has this structure:
yaml
scenario:
  name: "Descriptive test name"
  description: "What this test verifies"
  type: cli | tui | web | electron

  # Optional metadata
  tags: [smoke, critical, auth]
  timeout: 30s

  # What must be true before test runs
  prerequisites:
    - "Condition 1"
    - "Condition 2"

  # The test steps (executed sequentially)
  steps:
    - action: action_name
      parameter1: value1
      parameter2: value2

    - action: verify_something
      expected: value

  # Optional cleanup
  cleanup:
    - action: stop_application
每个测试场景都包含以下结构:
yaml
scenario:
  name: "描述性测试名称"
  description: "本测试验证的内容"
  type: cli | tui | web | electron

  # 可选元数据
  tags: [smoke, critical, auth]
  timeout: 30s

  # 测试运行前必须满足的条件
  prerequisites:
    - "条件1"
    - "条件2"

  # 测试步骤(按顺序执行)
  steps:
    - action: action_name
      parameter1: value1
      parameter2: value2

    - action: verify_something
      expected: value

  # 可选清理步骤
  cleanup:
    - action: stop_application

Application Types and Agents [LEVEL 2]

应用类型与代理 [LEVEL 2]

CLI Applications [LEVEL 1]

CLI应用 [LEVEL 1]

Use Case: Command-line tools, scripts, build tools, package managers
Supported Actions:
  • launch
    - Start the CLI program
  • send_input
    - Send text or commands via stdin
  • send_signal
    - Send OS signals (SIGINT, SIGTERM)
  • wait_for_output
    - Wait for specific text in stdout/stderr
  • verify_output
    - Check stdout/stderr contains/matches expected text
  • verify_exit_code
    - Validate process exit code
  • capture_output
    - Save output for later verification
Example (see
examples/cli/calculator-basic.yaml
):
yaml
scenario:
  name: "CLI Calculator Basic Operations"
  type: cli

  steps:
    - action: launch
      target: "./calculator"
      args: ["--mode", "interactive"]

    - action: send_input
      value: "add 5 3\n"

    - action: verify_output
      contains: "Result: 8"
      timeout: 2s

    - action: send_input
      value: "multiply 4 7\n"

    - action: verify_output
      contains: "Result: 28"

    - action: send_input
      value: "exit\n"

    - action: verify_exit_code
      expected: 0
适用场景:命令行工具、脚本、构建工具、包管理器
支持的操作
  • launch
    - 启动CLI程序
  • send_input
    - 通过stdin发送文本或命令
  • send_signal
    - 发送OS信号(SIGINT、SIGTERM)
  • wait_for_output
    - 等待stdout/stderr中出现特定文本
  • verify_output
    - 检查stdout/stderr是否包含/匹配预期文本
  • verify_exit_code
    - 验证进程退出码
  • capture_output
    - 保存输出供后续验证
示例(见
examples/cli/calculator-basic.yaml
):
yaml
scenario:
  name: "CLI计算器基础操作"
  type: cli

  steps:
    - action: launch
      target: "./calculator"
      args: ["--mode", "interactive"]

    - action: send_input
      value: "add 5 3\n"

    - action: verify_output
      contains: "Result: 8"
      timeout: 2s

    - action: send_input
      value: "multiply 4 7\n"

    - action: verify_output
      contains: "Result: 28"

    - action: send_input
      value: "exit\n"

    - action: verify_exit_code
      expected: 0

TUI Applications [LEVEL 1]

TUI应用 [LEVEL 1]

Use Case: Terminal user interfaces (htop, vim, tmux, custom dashboard TUIs)
Supported Actions:
  • launch
    - Start TUI application
  • send_keypress
    - Send keyboard input (arrow keys, enter, ctrl+c, etc.)
  • wait_for_screen
    - Wait for specific text to appear on screen
  • verify_screen
    - Check screen contents match expectations
  • capture_screenshot
    - Save terminal screenshot (ANSI art)
  • navigate_menu
    - Navigate menu structures
  • fill_form
    - Fill TUI form fields
Example (see
examples/tui/file-manager-navigation.yaml
):
yaml
scenario:
  name: "TUI File Manager Navigation"
  type: tui

  steps:
    - action: launch
      target: "./file-manager"

    - action: wait_for_screen
      contains: "File Manager v1.0"
      timeout: 3s

    - action: send_keypress
      value: "down"
      times: 3

    - action: verify_screen
      contains: "> documents/"
      description: "Third item should be selected"

    - action: send_keypress
      value: "enter"

    - action: wait_for_screen
      contains: "documents/"
      timeout: 2s

    - action: capture_screenshot
      save_as: "documents-view.txt"
适用场景:终端用户界面(htop、vim、tmux、自定义仪表盘TUI)
支持的操作
  • launch
    - 启动TUI应用
  • send_keypress
    - 发送键盘输入(方向键、回车、ctrl+c等)
  • wait_for_screen
    - 等待屏幕上出现特定文本
  • verify_screen
    - 检查屏幕内容是否符合预期
  • capture_screenshot
    - 保存终端截图(ANSI格式)
  • navigate_menu
    - 导航菜单结构
  • fill_form
    - 填写TUI表单字段
示例(见
examples/tui/file-manager-navigation.yaml
):
yaml
scenario:
  name: "TUI文件管理器导航"
  type: tui

  steps:
    - action: launch
      target: "./file-manager"

    - action: wait_for_screen
      contains: "File Manager v1.0"
      timeout: 3s

    - action: send_keypress
      value: "down"
      times: 3

    - action: verify_screen
      contains: "> documents/"
      description: "第三个选项应被选中"

    - action: send_keypress
      value: "enter"

    - action: wait_for_screen
      contains: "documents/"
      timeout: 2s

    - action: capture_screenshot
      save_as: "documents-view.txt"

Web Applications [LEVEL 1]

Web应用 [LEVEL 1]

Use Case: Web apps, dashboards, SPAs, admin panels
Supported Actions:
  • navigate
    - Go to URL
  • click
    - Click element by selector or text
  • type
    - Type into input fields
  • wait_for_element
    - Wait for element to appear
  • verify_element
    - Check element exists/contains text
  • verify_url
    - Validate current URL
  • screenshot
    - Capture browser screenshot
  • scroll
    - Scroll page or element
Example (see
examples/web/dashboard-smoke-test.yaml
):
yaml
scenario:
  name: "Dashboard Smoke Test"
  type: web

  steps:
    - action: navigate
      url: "http://localhost:3000/dashboard"

    - action: wait_for_element
      selector: "h1.dashboard-title"
      timeout: 5s

    - action: verify_element
      selector: "h1.dashboard-title"
      contains: "Analytics Dashboard"

    - action: verify_element
      selector: ".widget-stats"
      count: 4
      description: "Should have 4 stat widgets"

    - action: click
      selector: "button.refresh-data"

    - action: wait_for_element
      selector: ".loading-spinner"
      disappears: true
      timeout: 10s

    - action: screenshot
      save_as: "dashboard-loaded.png"
适用场景:Web应用、仪表盘、SPA、管理面板
支持的操作
  • navigate
    - 访问URL
  • click
    - 通过选择器或文本点击元素
  • type
    - 在输入框中输入文本
  • wait_for_element
    - 等待元素出现
  • verify_element
    - 检查元素是否存在/包含指定文本
  • verify_url
    - 验证当前URL
  • screenshot
    - 捕获浏览器截图
  • scroll
    - 滚动页面或元素
示例(见
examples/web/dashboard-smoke-test.yaml
):
yaml
scenario:
  name: "仪表盘冒烟测试"
  type: web

  steps:
    - action: navigate
      url: "http://localhost:3000/dashboard"

    - action: wait_for_element
      selector: "h1.dashboard-title"
      timeout: 5s

    - action: verify_element
      selector: "h1.dashboard-title"
      contains: "Analytics Dashboard"

    - action: verify_element
      selector: ".widget-stats"
      count: 4
      description: "应包含4个统计组件"

    - action: click
      selector: "button.refresh-data"

    - action: wait_for_element
      selector: ".loading-spinner"
      disappears: true
      timeout: 10s

    - action: screenshot
      save_as: "dashboard-loaded.png"

Electron Applications [LEVEL 2]

Electron应用 [LEVEL 2]

Use Case: Desktop apps built with Electron (VS Code, Slack, Discord clones)
Supported Actions:
  • launch
    - Start Electron app
  • window_action
    - Interact with windows (focus, minimize, close)
  • menu_click
    - Click application menu items
  • dialog_action
    - Handle native dialogs (open file, save, confirm)
  • ipc_send
    - Send IPC message to main process
  • verify_window
    - Check window state/properties
  • All web actions (since Electron uses Chromium)
Example (see
examples/electron/single-window-basic.yaml
):
yaml
scenario:
  name: "Electron Single Window Test"
  type: electron

  steps:
    - action: launch
      target: "./dist/my-app"
      wait_for_window: true
      timeout: 10s

    - action: verify_window
      title: "My Application"
      visible: true

    - action: menu_click
      path: ["File", "New Document"]

    - action: wait_for_element
      selector: ".document-editor"

    - action: type
      selector: ".document-editor"
      value: "Hello from test"

    - action: menu_click
      path: ["File", "Save"]

    - action: dialog_action
      type: save_file
      filename: "test-document.txt"

    - action: verify_window
      title_contains: "test-document.txt"
适用场景:使用Electron构建的桌面应用(VS Code、Slack、Discord克隆版)
支持的操作
  • launch
    - 启动Electron应用
  • window_action
    - 与窗口交互(聚焦、最小化、关闭)
  • menu_click
    - 点击应用菜单项
  • dialog_action
    - 处理原生对话框(打开文件、保存、确认)
  • ipc_send
    - 向主进程发送IPC消息
  • verify_window
    - 检查窗口状态/属性
  • 所有Web操作(因为Electron基于Chromium)
示例(见
examples/electron/single-window-basic.yaml
):
yaml
scenario:
  name: "Electron单窗口基础测试"
  type: electron

  steps:
    - action: launch
      target: "./dist/my-app"
      wait_for_window: true
      timeout: 10s

    - action: verify_window
      title: "My Application"
      visible: true

    - action: menu_click
      path: ["File", "New Document"]

    - action: wait_for_element
      selector: ".document-editor"

    - action: type
      selector: ".document-editor"
      value: "Hello from test"

    - action: menu_click
      path: ["File", "Save"]

    - action: dialog_action
      type: save_file
      filename: "test-document.txt"

    - action: verify_window
      title_contains: "test-document.txt"

Test Scenario Anatomy [LEVEL 2]

测试场景剖析 [LEVEL 2]

Metadata Section

元数据部分

yaml
scenario:
  name: "Clear descriptive name"
  description: "Detailed explanation of what this test verifies"
  type: cli | tui | web | electron

  # Optional fields
  tags: [smoke, regression, auth, payment]
  priority: high | medium | low
  timeout: 60s # Overall scenario timeout
  retry_on_failure: 2 # Retry count

  # Environment requirements
  environment:
    variables:
      API_URL: "http://localhost:8080"
      DEBUG: "true"
    files:
      - "./config.json must exist"
yaml
scenario:
  name: "清晰的描述性名称"
  description: "本测试验证内容的详细说明"
  type: cli | tui | web | electron

  # 可选字段
  tags: [smoke, regression, auth, payment]
  priority: high | medium | low
  timeout: 60s # 整个场景的超时时间
  retry_on_failure: 2 # 失败重试次数

  # 环境要求
  environment:
    variables:
      API_URL: "http://localhost:8080"
      DEBUG: "true"
    files:
      - "./config.json必须存在"

Prerequisites

前置条件

Prerequisites are conditions that must be true before the test runs. The framework validates these before execution.
yaml
prerequisites:
  - "./application binary exists"
  - "Port 8080 is available"
  - "Database is running"
  - "User account test@example.com exists"
  - "File ./test-data.json exists"
If prerequisites fail, the test is skipped (not failed).
前置条件是测试运行前必须满足的条件。框架会在执行前验证这些条件。
yaml
prerequisites:
  - "./应用二进制文件存在"
  - "端口8080可用"
  - "数据库已启动"
  - "用户test@example.com已存在"
  - "文件./test-data.json存在"
如果前置条件不满足,测试会被跳过(而非标记为失败)。

Steps

测试步骤

Steps execute sequentially. Each step has:
  • action: Required - the action to perform
  • Parameters: Action-specific parameters
  • description: Optional - human-readable explanation
  • timeout: Optional - step-specific timeout
  • continue_on_failure: Optional - don't fail scenario if step fails
yaml
steps:
  # Simple action
  - action: launch
    target: "./app"

  # Action with multiple parameters
  - action: verify_output
    contains: "Success"
    timeout: 5s
    description: "App should print success message"

  # Continue even if this fails
  - action: click
    selector: ".optional-button"
    continue_on_failure: true
步骤按顺序执行。每个步骤包含:
  • action: 必填 - 要执行的操作
  • Parameters: 操作特定的参数
  • description: 可选 - 人类可读的说明
  • timeout: 可选 - 步骤的超时时间
  • continue_on_failure: 可选 - 步骤失败时不终止整个场景
yaml
steps:
  # 简单操作
  - action: launch
    target: "./app"

  # 带多个参数的操作
  - action: verify_output
    contains: "Success"
    timeout: 5s
    description: "应用应输出成功消息"

  # 即使失败也继续执行
  - action: click
    selector: ".optional-button"
    continue_on_failure: true

Verification Actions [LEVEL 1]

验证操作 [LEVEL 1]

Verification actions check expected outcomes. They fail the test if expectations aren't met.
Common Verifications:
yaml
undefined
验证操作用于检查预期结果。如果不符合预期,测试会失败。
常见验证操作
yaml
undefined

CLI: Check output contains text

CLI: 检查输出是否包含指定文本

  • action: verify_output contains: "Expected text"
  • action: verify_output contains: "Expected text"

CLI: Check output matches regex

CLI: 检查输出是否匹配正则表达式

  • action: verify_output matches: "Result: \d+"
  • action: verify_output matches: "Result: \d+"

CLI: Check exit code

CLI: 检查退出码

  • action: verify_exit_code expected: 0
  • action: verify_exit_code expected: 0

Web/TUI: Check element exists

Web/TUI: 检查元素是否存在

  • action: verify_element selector: ".success-message"
  • action: verify_element selector: ".success-message"

Web/TUI: Check element contains text

Web/TUI: 检查元素是否包含指定文本

  • action: verify_element selector: "h1" contains: "Welcome"
  • action: verify_element selector: "h1" contains: "Welcome"

Web: Check URL

Web: 检查URL

Web: Check element count

Web: 检查元素数量

  • action: verify_element selector: ".list-item" count: 5
  • action: verify_element selector: ".list-item" count: 5

Electron: Check window state

Electron: 检查窗口状态

  • action: verify_window title: "My App" visible: true focused: true
undefined
  • action: verify_window title: "My App" visible: true focused: true
undefined

Cleanup Section

清理部分

Cleanup runs after all steps complete (success or failure). Use for teardown actions.
yaml
cleanup:
  - action: stop_application
    force: true

  - action: delete_file
    path: "./temp-test-data.json"

  - action: reset_database
    connection: "test_db"
清理步骤会在所有步骤完成后执行(无论成功或失败)。用于执行收尾操作。
yaml
cleanup:
  - action: stop_application
    force: true

  - action: delete_file
    path: "./temp-test-data.json"

  - action: reset_database
    connection: "test_db"

Advanced Patterns [LEVEL 2]

高级模式 [LEVEL 2]

Conditional Logic

条件逻辑

Execute steps based on conditions:
yaml
steps:
  - action: launch
    target: "./app"

  - action: verify_output
    contains: "Login required"
    id: login_check

  # Only run if login_check passed
  - action: send_input
    value: "login admin password123\n"
    condition: login_check.passed
根据条件执行步骤:
yaml
steps:
  - action: launch
    target: "./app"

  - action: verify_output
    contains: "Login required"
    id: login_check

  # 仅当login_check通过时执行
  - action: send_input
    value: "login admin password123\n"
    condition: login_check.passed

Variables and Templating [LEVEL 2]

变量与模板 [LEVEL 2]

Define variables and use them throughout the scenario:
yaml
scenario:
  name: "Test with Variables"
  type: cli

  variables:
    username: "testuser"
    api_url: "http://localhost:8080"

  steps:
    - action: launch
      target: "./app"
      args: ["--api", "${api_url}"]

    - action: send_input
      value: "login ${username}\n"

    - action: verify_output
      contains: "Welcome, ${username}!"
定义变量并在场景中使用:
yaml
scenario:
  name: "使用变量的测试"
  type: cli

  variables:
    username: "testuser"
    api_url: "http://localhost:8080"

  steps:
    - action: launch
      target: "./app"
      args: ["--api", "${api_url}"]

    - action: send_input
      value: "login ${username}\n"

    - action: verify_output
      contains: "Welcome, ${username}!"

Loops and Repetition [LEVEL 2]

循环与重复 [LEVEL 2]

Repeat actions multiple times:
yaml
steps:
  - action: launch
    target: "./app"

  # Repeat action N times
  - action: send_keypress
    value: "down"
    times: 5

  # Loop over list
  - action: send_input
    value: "${item}\n"
    for_each:
      - "apple"
      - "banana"
      - "cherry"
重复执行操作多次:
yaml
steps:
  - action: launch
    target: "./app"

  # 重复执行N次
  - action: send_keypress
    value: "down"
    times: 5

  # 遍历列表执行
  - action: send_input
    value: "${item}\n"
    for_each:
      - "apple"
      - "banana"
      - "cherry"

Error Handling [LEVEL 2]

错误处理 [LEVEL 2]

Handle expected errors gracefully:
yaml
steps:
  - action: send_input
    value: "invalid command\n"

  # Verify error message appears
  - action: verify_output
    contains: "Error: Unknown command"
    expected_failure: true

  # App should still be running
  - action: verify_running
    expected: true
优雅处理预期的错误:
yaml
steps:
  - action: send_input
    value: "invalid command\n"

  # 验证错误消息是否出现
  - action: verify_output
    contains: "Error: Unknown command"
    expected_failure: true

  # 应用应仍在运行
  - action: verify_running
    expected: true

Multi-Step Workflows [LEVEL 2]

多步骤工作流 [LEVEL 2]

Complex scenarios with multiple phases:
yaml
scenario:
  name: "E-commerce Purchase Flow"
  type: web

  steps:
    # Phase 1: Authentication
    - action: navigate
      url: "http://localhost:3000/login"

    - action: type
      selector: "#username"
      value: "test@example.com"

    - action: type
      selector: "#password"
      value: "password123"

    - action: click
      selector: "button[type=submit]"

    - action: wait_for_url
      contains: "/dashboard"

    # Phase 2: Product Selection
    - action: navigate
      url: "http://localhost:3000/products"

    - action: click
      text: "Add to Cart"
      nth: 1

    - action: verify_element
      selector: ".cart-badge"
      contains: "1"

    # Phase 3: Checkout
    - action: click
      selector: ".cart-icon"

    - action: click
      text: "Proceed to Checkout"

    - action: fill_form
      fields:
        "#shipping-address": "123 Test St"
        "#city": "Testville"
        "#zip": "12345"

    - action: click
      selector: "#place-order"

    - action: wait_for_element
      selector: ".order-confirmation"
      timeout: 10s

    - action: verify_element
      selector: ".order-number"
      exists: true
包含多个阶段的复杂场景:
yaml
scenario:
  name: "电商购买流程"
  type: web

  steps:
    # 阶段1: 认证
    - action: navigate
      url: "http://localhost:3000/login"

    - action: type
      selector: "#username"
      value: "test@example.com"

    - action: type
      selector: "#password"
      value: "password123"

    - action: click
      selector: "button[type=submit]"

    - action: wait_for_url
      contains: "/dashboard"

    # 阶段2: 商品选择
    - action: navigate
      url: "http://localhost:3000/products"

    - action: click
      text: "Add to Cart"
      nth: 1

    - action: verify_element
      selector: ".cart-badge"
      contains: "1"

    # 阶段3: 结账
    - action: click
      selector: ".cart-icon"

    - action: click
      text: "Proceed to Checkout"

    - action: fill_form
      fields:
        "#shipping-address": "123 Test St"
        "#city": "Testville"
        "#zip": "12345"

    - action: click
      selector: "#place-order"

    - action: wait_for_element
      selector: ".order-confirmation"
      timeout: 10s

    - action: verify_element
      selector: ".order-number"
      exists: true

Level 3: Advanced Topics [LEVEL 3]

Level 3: 高级主题 [LEVEL 3]

Custom Comprehension Agents

自定义理解代理

The framework uses AI agents to interpret application output and determine if tests pass. You can customize these agents for domain-specific logic.
Default Comprehension Agent:
  • Observes raw output (text, HTML, screenshots)
  • Applies general reasoning to verify expectations
  • Returns pass/fail with explanation
Custom Comprehension Agent (see
examples/custom-agents/custom-comprehension-agent.yaml
):
yaml
scenario:
  name: "Financial Dashboard Test with Custom Agent"
  type: web

  # Define custom comprehension logic
  comprehension_agent:
    model: "gpt-4"
    system_prompt: |
      You are a financial data validator. When verifying dashboard content:
      1. All monetary values must use proper formatting ($1,234.56)
      2. Percentages must include % symbol
      3. Dates must be in MM/DD/YYYY format
      4. Negative values must be red
      5. Chart data must be logically consistent

      Be strict about formatting and data consistency.

    examples:
      - input: "Total Revenue: 45000"
        output: "FAIL - Missing currency symbol and comma separator"
      - input: "Total Revenue: $45,000.00"
        output: "PASS - Correctly formatted"

  steps:
    - action: navigate
      url: "http://localhost:3000/financial-dashboard"

    - action: verify_element
      selector: ".revenue-widget"
      use_custom_comprehension: true
      description: "Revenue should be properly formatted"
框架使用AI代理来解释应用输出并判断测试是否通过。你可以针对特定领域逻辑自定义这些代理。
默认理解代理
  • 观测原始输出(文本、HTML、截图)
  • 应用通用推理逻辑验证预期结果
  • 返回通过/失败状态及说明
自定义理解代理(见
examples/custom-agents/custom-comprehension-agent.yaml
):
yaml
scenario:
  name: "使用自定义代理的财务仪表盘测试"
  type: web

  # 定义自定义理解逻辑
  comprehension_agent:
    model: "gpt-4"
    system_prompt: |
      你是一名财务数据验证员。验证仪表盘内容时:
      1. 所有货币值必须使用正确格式($1,234.56)
      2. 百分比必须包含%符号
      3. 日期必须为MM/DD/YYYY格式
      4. 负值必须显示为红色
      5. 图表数据必须逻辑一致

      请严格检查格式和数据一致性。

    examples:
      - input: "Total Revenue: 45000"
        output: "失败 - 缺少货币符号和千位分隔符"
      - input: "Total Revenue: $45,000.00"
        output: "通过 - 格式正确"

  steps:
    - action: navigate
      url: "http://localhost:3000/financial-dashboard"

    - action: verify_element
      selector: ".revenue-widget"
      use_custom_comprehension: true
      description: "收入应格式正确"

Visual Regression Testing [LEVEL 3]

视觉回归测试 [LEVEL 3]

Compare screenshots against baseline images:
yaml
scenario:
  name: "Visual Regression - Homepage"
  type: web

  steps:
    - action: navigate
      url: "http://localhost:3000"

    - action: wait_for_element
      selector: ".page-loaded"

    - action: screenshot
      save_as: "homepage.png"

    - action: visual_compare
      screenshot: "homepage.png"
      baseline: "./baselines/homepage-baseline.png"
      threshold: 0.05 # 5% difference allowed
      highlight_differences: true
将截图与基准图片对比:
yaml
scenario:
  name: "视觉回归测试 - 首页"
  type: web

  steps:
    - action: navigate
      url: "http://localhost:3000"

    - action: wait_for_element
      selector: ".page-loaded"

    - action: screenshot
      save_as: "homepage.png"

    - action: visual_compare
      screenshot: "homepage.png"
      baseline: "./baselines/homepage-baseline.png"
      threshold: 0.05 # 允许5%的差异
      highlight_differences: true

Performance Validation [LEVEL 3]

性能验证 [LEVEL 3]

Measure and validate performance metrics:
yaml
scenario:
  name: "Performance - Dashboard Load Time"
  type: web

  performance:
    metrics:
      - page_load_time
      - first_contentful_paint
      - time_to_interactive

  steps:
    - action: navigate
      url: "http://localhost:3000/dashboard"
      measure_timing: true

    - action: verify_performance
      metric: page_load_time
      less_than: 3000 # 3 seconds

    - action: verify_performance
      metric: first_contentful_paint
      less_than: 1500 # 1.5 seconds
测量并验证性能指标:
yaml
scenario:
  name: "性能测试 - 仪表盘加载时间"
  type: web

  performance:
    metrics:
      - page_load_time
      - first_contentful_paint
      - time_to_interactive

  steps:
    - action: navigate
      url: "http://localhost:3000/dashboard"
      measure_timing: true

    - action: verify_performance
      metric: page_load_time
      less_than: 3000 # 3秒

    - action: verify_performance
      metric: first_contentful_paint
      less_than: 1500 # 1.5秒

Multi-Window Coordination (Electron) [LEVEL 3]

多窗口协调(Electron)[LEVEL 3]

Test applications with multiple windows:
yaml
scenario:
  name: "Multi-Window Chat Application"
  type: electron

  steps:
    - action: launch
      target: "./chat-app"

    - action: menu_click
      path: ["Window", "New Chat"]

    - action: verify_window
      count: 2

    - action: window_action
      window: 1
      action: focus

    - action: type
      selector: ".message-input"
      value: "Hello from window 1"

    - action: click
      selector: ".send-button"

    - action: window_action
      window: 2
      action: focus

    - action: wait_for_element
      selector: ".message"
      contains: "Hello from window 1"
      timeout: 5s
测试包含多个窗口的应用:
yaml
scenario:
  name: "多窗口聊天应用"
  type: electron

  steps:
    - action: launch
      target: "./chat-app"

    - action: menu_click
      path: ["Window", "New Chat"]

    - action: verify_window
      count: 2

    - action: window_action
      window: 1
      action: focus

    - action: type
      selector: ".message-input"
      value: "Hello from window 1"

    - action: click
      selector: ".send-button"

    - action: window_action
      window: 2
      action: focus

    - action: wait_for_element
      selector: ".message"
      contains: "Hello from window 1"
      timeout: 5s

IPC Testing (Electron) [LEVEL 3]

IPC测试(Electron)[LEVEL 3]

Test Inter-Process Communication between renderer and main:
yaml
scenario:
  name: "Electron IPC Communication"
  type: electron

  steps:
    - action: launch
      target: "./my-app"

    - action: ipc_send
      channel: "get-system-info"

    - action: ipc_expect
      channel: "system-info-reply"
      timeout: 3s

    - action: verify_ipc_payload
      contains:
        platform: "darwin"
        arch: "x64"
测试渲染进程与主进程之间的进程间通信:
yaml
scenario:
  name: "Electron IPC通信测试"
  type: electron

  steps:
    - action: launch
      target: "./my-app"

    - action: ipc_send
      channel: "get-system-info"

    - action: ipc_expect
      channel: "system-info-reply"
      timeout: 3s

    - action: verify_ipc_payload
      contains:
        platform: "darwin"
        arch: "x64"

Custom Reporters [LEVEL 3]

自定义报告器 [LEVEL 3]

Generate custom test reports:
yaml
scenario:
  name: "Test with Custom Reporting"
  type: cli

  reporting:
    format: custom
    template: "./report-template.html"
    include:
      - screenshots
      - logs
      - timing_data
      - video_recording

    email:
      enabled: true
      recipients: ["team@example.com"]
      on_failure_only: true

  steps:
    # ... test steps ...
生成自定义测试报告:
yaml
scenario:
  name: "使用自定义报告的测试"
  type: cli

  reporting:
    format: custom
    template: "./report-template.html"
    include:
      - screenshots
      - logs
      - timing_data
      - video_recording

    email:
      enabled: true
      recipients: ["team@example.com"]
      on_failure_only: true

  steps:
    # ... 测试步骤 ...

Framework Integration [LEVEL 2]

框架集成 [LEVEL 2]

Running Tests

运行测试

Single test:
bash
gadugi-test run test-scenario.yaml
Multiple tests:
bash
gadugi-test run tests/*.yaml
With options:
bash
gadugi-test run test.yaml \
  --verbose \
  --evidence-dir ./test-evidence \
  --retry 2 \
  --timeout 60s
单个测试
bash
gadugi-test run test-scenario.yaml
多个测试
bash
gadugi-test run tests/*.yaml
带选项运行
bash
gadugi-test run test.yaml \
  --verbose \
  --evidence-dir ./test-evidence \
  --retry 2 \
  --timeout 60s

CI/CD Integration

CI/CD集成

GitHub Actions (
.github/workflows/agentic-tests.yml
):
yaml
name: Agentic Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install gadugi-agentic-test
        run: npm install -g @gadugi/agentic-test

      - name: Run tests
        run: gadugi-test run tests/agentic/*.yaml

      - name: Upload evidence
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-evidence
          path: ./evidence/
GitHub Actions
.github/workflows/agentic-tests.yml
):
yaml
name: Agentic Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install gadugi-agentic-test
        run: npm install -g @gadugi/agentic-test

      - name: Run tests
        run: gadugi-test run tests/agentic/*.yaml

      - name: Upload evidence
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-evidence
          path: ./evidence/

Evidence Collection

证据收集

The framework automatically collects evidence for debugging:
evidence/
  scenario-name-20250116-093045/
    ├── scenario.yaml          # Original test scenario
    ├── execution-log.json     # Detailed execution log
    ├── screenshots/           # All captured screenshots
    │   ├── step-1.png
    │   ├── step-3.png
    │   └── step-5.png
    ├── output-captures/       # CLI/TUI output
    │   ├── stdout.txt
    │   └── stderr.txt
    ├── timing.json            # Performance metrics
    └── report.html            # Human-readable report
框架会自动收集调试用的证据:
evidence/
  scenario-name-20250116-093045/
    ├── scenario.yaml          # 原始测试场景
    ├── execution-log.json     # 详细执行日志
    ├── screenshots/           # 所有捕获的截图
    │   ├── step-1.png
    │   ├── step-3.png
    │   └── step-5.png
    ├── output-captures/       # CLI/TUI输出
    │   ├── stdout.txt
    │   └── stderr.txt
    ├── timing.json            # 性能指标
    └── report.html            # 人类可读的报告

Best Practices [LEVEL 2]

最佳实践 [LEVEL 2]

1. Start Simple, Add Complexity

1. 从简到繁,逐步增加复杂度

Begin with basic smoke tests, then add detail:
yaml
undefined
从基础冒烟测试开始,再逐步添加细节:
yaml
undefined

Level 1: Basic smoke test

Level 1: 基础冒烟测试

steps:
  • action: launch target: "./app"
  • action: verify_output contains: "Ready"
steps:
  • action: launch target: "./app"
  • action: verify_output contains: "Ready"

Level 2: Add interaction

Level 2: 添加交互

steps:
  • action: launch target: "./app"
  • action: send_input value: "command\n"
  • action: verify_output contains: "Success"
steps:
  • action: launch target: "./app"
  • action: send_input value: "command\n"
  • action: verify_output contains: "Success"

Level 3: Add error handling and edge cases

Level 3: 添加错误处理和边界情况

steps:
  • action: launch target: "./app"
  • action: send_input value: "invalid\n"
  • action: verify_output contains: "Error"
  • action: send_input value: "command\n"
  • action: verify_output contains: "Success"
undefined
steps:
  • action: launch target: "./app"
  • action: send_input value: "invalid\n"
  • action: verify_output contains: "Error"
  • action: send_input value: "command\n"
  • action: verify_output contains: "Success"
undefined

2. Use Descriptive Names and Descriptions

2. 使用描述性的名称和说明

yaml
undefined
yaml
undefined

Bad

不好的示例

scenario: name: "Test 1" steps: - action: click selector: "button"
scenario: name: "Test 1" steps: - action: click selector: "button"

Good

好的示例

scenario: name: "User Login Flow - Valid Credentials" description: "Verifies user can log in with valid email and password" steps: - action: click selector: "button[type=submit]" description: "Submit login form"
undefined
scenario: name: "用户登录流程 - 有效凭据" description: "验证用户可使用有效邮箱和密码登录" steps: - action: click selector: "button[type=submit]" description: "提交登录表单"
undefined

3. Verify Critical Paths Only

3. 仅验证关键路径

Don't test every tiny detail. Focus on user-facing behavior:
yaml
undefined
不要测试每个细节。专注于用户可见的行为:
yaml
undefined

Bad - Tests implementation details

不好的示例 - 测试实现细节

  • action: verify_element selector: ".internal-cache-status" contains: "initialized"
  • action: verify_element selector: ".internal-cache-status" contains: "initialized"

Good - Tests user-visible behavior

好的示例 - 测试用户可见的行为

  • action: verify_element selector: ".welcome-message" contains: "Welcome back"
undefined
  • action: verify_element selector: ".welcome-message" contains: "Welcome back"
undefined

4. Use Prerequisites for Test Dependencies

4. 使用前置条件处理测试依赖

yaml
scenario:
  name: "User Profile Edit"

  prerequisites:
    - "User testuser@example.com exists"
    - "User is logged in"
    - "Database is seeded with test data"

  steps:
    # Test assumes prerequisites are met
    - action: navigate
      url: "/profile"
yaml
scenario:
  name: "用户资料编辑"

  prerequisites:
    - "用户testuser@example.com已存在"
    - "用户已登录"
    - "数据库已填充测试数据"

  steps:
    # 测试假设前置条件已满足
    - action: navigate
      url: "/profile"

5. Keep Tests Independent

5. 保持测试独立性

Each test should set up its own state and clean up:
yaml
scenario:
  name: "Create Document"

  steps:
    # Create test user (don't assume exists)
    - action: api_call
      endpoint: "/api/users"
      method: POST
      data: { email: "test@example.com" }

    # Run test
    - action: navigate
      url: "/documents/new"
    # ... test steps ...

  cleanup:
    # Remove test user
    - action: api_call
      endpoint: "/api/users/test@example.com"
      method: DELETE
每个测试都应自行设置状态并清理:
yaml
scenario:
  name: "创建文档"

  steps:
    # 创建测试用户(不假设用户已存在)
    - action: api_call
      endpoint: "/api/users"
      method: POST
      data: { email: "test@example.com" }

    # 执行测试
    - action: navigate
      url: "/documents/new"
    # ... 测试步骤 ...

  cleanup:
    # 删除测试用户
    - action: api_call
      endpoint: "/api/users/test@example.com"
      method: DELETE

6. Use Tags for Organization

6. 使用标签进行组织

yaml
scenario:
  name: "Critical Payment Flow"
  tags: [smoke, critical, payment, e2e]
  # Run with: gadugi-test run --tags critical
yaml
scenario:
  name: "关键支付流程"
  tags: [smoke, critical, payment, e2e]
  # 运行方式: gadugi-test run --tags critical

7. Add Timeouts Strategically

7. 合理设置超时时间

yaml
steps:
  # Quick operations - short timeout
  - action: click
    selector: "button"
    timeout: 2s

  # Network operations - longer timeout
  - action: wait_for_element
    selector: ".data-loaded"
    timeout: 10s

  # Complex operations - generous timeout
  - action: verify_element
    selector: ".report-generated"
    timeout: 60s
yaml
steps:
  # 快速操作 - 短超时
  - action: click
    selector: "button"
    timeout: 2s

  # 网络操作 - 较长超时
  - action: wait_for_element
    selector: ".data-loaded"
    timeout: 10s

  # 复杂操作 - 宽松超时
  - action: verify_element
    selector: ".report-generated"
    timeout: 60s

Testing Strategies [LEVEL 2]

测试策略 [LEVEL 2]

Smoke Tests

冒烟测试

Minimal tests that verify critical functionality works:
yaml
scenario:
  name: "Smoke Test - Application Starts"
  tags: [smoke]

  steps:
    - action: launch
      target: "./app"
    - action: verify_output
      contains: "Ready"
      timeout: 5s
Run before every commit:
gadugi-test run --tags smoke
验证关键功能是否正常工作的最小测试:
yaml
scenario:
  name: "冒烟测试 - 应用启动"
  tags: [smoke]

  steps:
    - action: launch
      target: "./app"
    - action: verify_output
      contains: "Ready"
      timeout: 5s
每次提交前运行:
gadugi-test run --tags smoke

Happy Path Tests

正常路径测试

Test the ideal user journey:
yaml
scenario:
  name: "Happy Path - User Registration"

  steps:
    - action: navigate
      url: "/register"
    - action: type
      selector: "#email"
      value: "newuser@example.com"
    - action: type
      selector: "#password"
      value: "SecurePass123!"
    - action: click
      selector: "button[type=submit]"
    - action: wait_for_url
      contains: "/welcome"
测试理想的用户流程:
yaml
scenario:
  name: "正常路径 - 用户注册"

  steps:
    - action: navigate
      url: "/register"
    - action: type
      selector: "#email"
      value: "newuser@example.com"
    - action: type
      selector: "#password"
      value: "SecurePass123!"
    - action: click
      selector: "button[type=submit]"
    - action: wait_for_url
      contains: "/welcome"

Error Path Tests

错误路径测试

Verify error handling:
yaml
scenario:
  name: "Error Path - Invalid Login"

  steps:
    - action: navigate
      url: "/login"
    - action: type
      selector: "#email"
      value: "invalid@example.com"
    - action: type
      selector: "#password"
      value: "wrongpassword"
    - action: click
      selector: "button[type=submit]"
    - action: verify_element
      selector: ".error-message"
      contains: "Invalid credentials"
验证错误处理:
yaml
scenario:
  name: "错误路径 - 无效登录"

  steps:
    - action: navigate
      url: "/login"
    - action: type
      selector: "#email"
      value: "invalid@example.com"
    - action: type
      selector: "#password"
      value: "wrongpassword"
    - action: click
      selector: "button[type=submit]"
    - action: verify_element
      selector: ".error-message"
      contains: "Invalid credentials"

Regression Tests

回归测试

Prevent bugs from reappearing:
yaml
scenario:
  name: "Regression - Issue #123 Password Reset"
  tags: [regression, bug-123]
  description: "Verifies password reset email is sent (was broken in v1.2)"

  steps:
    - action: navigate
      url: "/forgot-password"
    - action: type
      selector: "#email"
      value: "user@example.com"
    - action: click
      selector: "button[type=submit]"
    - action: verify_element
      selector: ".success-message"
      contains: "Reset email sent"
防止问题重现:
yaml
scenario:
  name: "回归测试 - Issue #123 密码重置"
  tags: [regression, bug-123]
  description: "验证密码重置邮件可正常发送(v1.2版本曾出现问题)"

  steps:
    - action: navigate
      url: "/forgot-password"
    - action: type
      selector: "#email"
      value: "user@example.com"
    - action: click
      selector: "button[type=submit]"
    - action: verify_element
      selector: ".success-message"
      contains: "Reset email sent"

Philosophy Alignment [LEVEL 2]

理念对齐 [LEVEL 2]

This skill follows amplihack's core principles:
本技能遵循amplihack的核心原则:

Ruthless Simplicity

极致简洁

  • YAML over code: Declarative tests are simpler than programmatic tests
  • No implementation details: Tests describe WHAT, not HOW
  • Minimal boilerplate: Each test is focused and concise
  • YAML优先:声明式测试比编程式测试更简单
  • 无实现细节:测试描述的是WHAT,而非HOW
  • 最小化模板代码:每个测试都专注且简洁

Modular Design (Bricks & Studs)

模块化设计(积木式)

  • Self-contained scenarios: Each YAML file is independent
  • Clear contracts: Steps have well-defined inputs/outputs
  • Composable actions: Reuse actions across different test types
  • 独立场景:每个YAML文件都是独立的
  • 清晰的契约:步骤有明确的输入/输出
  • 可组合的操作:不同测试类型可复用操作

Zero-BS Implementation

务实实现

  • No stubs: Every example in this skill is a complete, runnable test
  • Working defaults: Tests run with minimal configuration
  • Clear errors: Framework provides actionable error messages
  • 无存根:本技能中的每个示例都是完整可运行的测试
  • 合理默认值:测试只需最少配置即可运行
  • 清晰的错误信息:框架提供可操作的错误提示

Outside-In Thinking

外部驱动思维

  • User perspective: Tests verify behavior users care about
  • Implementation agnostic: Refactoring doesn't break tests
  • Behavior-driven: Focus on outcomes, not internals
  • 用户视角:测试验证用户关心的行为
  • 与实现无关:重构不会导致测试失效
  • 行为驱动:专注于结果,而非内部实现

Common Pitfalls and Solutions [LEVEL 2]

常见陷阱与解决方案 [LEVEL 2]

Pitfall 1: Over-Specifying

陷阱1:过度指定

Problem: Test breaks when UI changes slightly
yaml
undefined
问题:UI轻微变更导致测试失效
yaml
undefined

Bad - Too specific

不好的示例 - 过于具体

  • action: verify_element selector: "div.container > div.row > div.col-md-6 > span.text-primary.font-bold" contains: "Welcome"

**Solution**: Use flexible selectors

```yaml
  • action: verify_element selector: "div.container > div.row > div.col-md-6 > span.text-primary.font-bold" contains: "Welcome"

**解决方案**:使用灵活的选择器

```yaml

Good - Focused on behavior

好的示例 - 专注于行为

  • action: verify_element selector: ".welcome-message" contains: "Welcome"
undefined
  • action: verify_element selector: ".welcome-message" contains: "Welcome"
undefined

Pitfall 2: Missing Waits

陷阱2:缺少等待

Problem: Test fails intermittently due to timing
yaml
undefined
问题:因时序问题导致测试间歇性失败
yaml
undefined

Bad - No wait for async operation

不好的示例 - 未等待异步操作

  • action: click selector: ".load-data-button"
  • action: verify_element selector: ".data-table" # May not exist yet!

**Solution**: Always wait for dynamic content

```yaml
  • action: click selector: ".load-data-button"
  • action: verify_element selector: ".data-table" # 可能还未出现!

**解决方案**:始终等待动态内容

```yaml

Good - Wait for element to appear

好的示例 - 等待元素出现

  • action: click selector: ".load-data-button"
  • action: wait_for_element selector: ".data-table" timeout: 10s
  • action: verify_element selector: ".data-table"
undefined
  • action: click selector: ".load-data-button"
  • action: wait_for_element selector: ".data-table" timeout: 10s
  • action: verify_element selector: ".data-table"
undefined

Pitfall 3: Testing Implementation Details

陷阱3:测试实现细节

Problem: Test coupled to internal state
yaml
undefined
问题:测试与内部状态耦合
yaml
undefined

Bad - Tests internal cache state

不好的示例 - 测试内部缓存状态

  • action: verify_output contains: "Cache hit ratio: 85%"

**Solution**: Test user-visible behavior

```yaml
  • action: verify_output contains: "Cache hit ratio: 85%"

**解决方案**:测试用户可见的行为

```yaml

Good - Tests response time

好的示例 - 测试响应时间

  • action: verify_response_time less_than: 100ms description: "Fast response indicates caching works"
undefined
  • action: verify_response_time less_than: 100ms description: "快速响应表明缓存正常工作"
undefined

Pitfall 4: Flaky Assertions

陷阱4:不稳定的断言

Problem: Assertions depend on exact timing or formatting
yaml
undefined
问题:断言依赖精确的时序或格式
yaml
undefined

Bad - Exact timestamp match will fail

不好的示例 - 精确时间戳匹配会失败

  • action: verify_output contains: "Created at: 2025-11-16 09:30:45"

**Solution**: Use flexible patterns

```yaml
  • action: verify_output contains: "Created at: 2025-11-16 09:30:45"

**解决方案**:使用灵活的匹配模式

```yaml

Good - Match pattern, not exact value

好的示例 - 匹配模式而非精确值

  • action: verify_output matches: "Created at: \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
undefined
  • action: verify_output matches: "Created at: \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
undefined

Pitfall 5: Not Cleaning Up

陷阱5:未清理

Problem: Tests leave artifacts that affect future runs
yaml
undefined
问题:测试留下的 artifacts 影响后续运行
yaml
undefined

Bad - No cleanup

不好的示例 - 无清理

steps:
  • action: create_file path: "./test-data.json"
  • action: launch target: "./app"

**Solution**: Always use cleanup section

```yaml
steps:
  • action: create_file path: "./test-data.json"
  • action: launch target: "./app"

**解决方案**:始终使用清理部分

```yaml

Good - Cleanup ensures clean slate

好的示例 - 清理确保环境干净

steps:
  • action: create_file path: "./test-data.json"
  • action: launch target: "./app"
cleanup:
  • action: delete_file path: "./test-data.json"
undefined
steps:
  • action: create_file path: "./test-data.json"
  • action: launch target: "./app"
cleanup:
  • action: delete_file path: "./test-data.json"
undefined

Example Library [LEVEL 1]

示例库 [LEVEL 1]

This skill includes 15 complete working examples organized by application type and complexity level:
本技能包含15个完整的可运行示例,按应用类型和复杂度层级组织:

CLI Examples

CLI示例

  1. calculator-basic.yaml [LEVEL 1] - Simple CLI arithmetic operations
  2. cli-error-handling.yaml [LEVEL 2] - Error messages and recovery
  3. cli-interactive-session.yaml [LEVEL 2] - Multi-turn interactive CLI
  1. calculator-basic.yaml [LEVEL 1] - 简单CLI算术操作
  2. cli-error-handling.yaml [LEVEL 2] - 错误消息与恢复
  3. cli-interactive-session.yaml [LEVEL 2] - 多轮交互式CLI

TUI Examples

TUI示例

  1. file-manager-navigation.yaml [LEVEL 1] - Basic TUI keyboard navigation
  2. tui-form-validation.yaml [LEVEL 2] - Complex form filling and validation
  3. tui-performance-monitoring.yaml [LEVEL 3] - TUI performance dashboard testing
  1. file-manager-navigation.yaml [LEVEL 1] - 基础TUI键盘导航
  2. tui-form-validation.yaml [LEVEL 2] - 复杂表单填写与验证
  3. tui-performance-monitoring.yaml [LEVEL 3] - TUI性能仪表盘测试

Web Examples

Web示例

  1. dashboard-smoke-test.yaml [LEVEL 1] - Simple web dashboard verification
  2. web-authentication-flow.yaml [LEVEL 2] - Multi-step login workflow
  3. web-visual-regression.yaml [LEVEL 2] - Screenshot-based visual testing
  1. dashboard-smoke-test.yaml [LEVEL 1] - 简单Web仪表盘验证
  2. web-authentication-flow.yaml [LEVEL 2] - 多步骤登录工作流
  3. web-visual-regression.yaml [LEVEL 2] - 基于截图的视觉测试

Electron Examples

Electron示例

  1. single-window-basic.yaml [LEVEL 1] - Basic Electron window test
  2. multi-window-coordination.yaml [LEVEL 2] - Multiple window orchestration
  3. electron-menu-testing.yaml [LEVEL 2] - Application menu interactions
  4. electron-ipc-testing.yaml [LEVEL 3] - Main/renderer IPC testing
  1. single-window-basic.yaml [LEVEL 1] - 基础Electron窗口测试
  2. multi-window-coordination.yaml [LEVEL 2] - 多窗口编排
  3. electron-menu-testing.yaml [LEVEL 2] - 应用菜单交互
  4. electron-ipc-testing.yaml [LEVEL 3] - 主/渲染进程IPC测试

Custom Agent Examples

自定义代理示例

  1. custom-comprehension-agent.yaml [LEVEL 3] - Domain-specific validation logic
  2. custom-reporter-integration.yaml [LEVEL 3] - Custom test reporting
See
examples/
directory for full example code with inline documentation.
  1. custom-comprehension-agent.yaml [LEVEL 3] - 特定领域验证逻辑
  2. custom-reporter-integration.yaml [LEVEL 3] - 自定义测试报告
请查看
examples/
目录获取完整示例代码及内联文档。

Framework Freshness Check [LEVEL 3]

框架版本检查 [LEVEL 3]

This skill embeds knowledge of gadugi-agentic-test version 0.1.0. To check if a newer version exists:
bash
undefined
本技能基于gadugi-agentic-test版本0.1.0。检查是否有新版本:
bash
undefined

Run the freshness check script

运行版本检查脚本

python scripts/check-freshness.py
python scripts/check-freshness.py

Output if outdated:

如果版本过时,输出如下:

WARNING: Embedded framework version is 0.1.0

WARNING: 嵌入的框架版本为0.1.0

Latest GitHub version is 0.2.5

GitHub最新版本为0.2.5

New features in 0.2.5:

0.2.5版本的新特性:

- Native Playwright support for web testing

- 原生支持Playwright用于Web测试

- Video recording for all test types

- 所有测试类型支持视频录制

- Parallel test execution

- 并行测试执行

Update with: npm update -g @gadugi/agentic-test

更新命令: npm update -g @gadugi/agentic-test


The script checks the GitHub repository for releases and compares against the embedded version. This ensures you're aware of new features and improvements.

**When to Update This Skill**:

- New framework version adds significant features
- Breaking changes in YAML schema
- New application types supported
- Agent capabilities expand

该脚本会检查GitHub仓库的版本并与嵌入版本对比。确保你了解新特性和改进。

**何时更新本技能**:

- 框架新版本添加了重要特性
- YAML schema出现破坏性变更
- 支持新的应用类型
- 代理能力扩展

Integration with Other Skills [LEVEL 2]

与其他技能集成 [LEVEL 2]

Works Well With

适配良好的技能

test-gap-analyzer:
  • Use test-gap-analyzer to find untested functions
  • Write outside-in tests for critical user-facing paths
  • Use unit tests (from test-gap-analyzer) for internal functions
philosophy-guardian:
  • Ensure test YAML follows ruthless simplicity
  • Verify tests focus on behavior, not implementation
pr-review-assistant:
  • Include outside-in tests in PR reviews
  • Verify tests cover changed functionality
  • Check test readability and clarity
module-spec-generator:
  • Generate module specs that include outside-in test scenarios
  • Use specs as templates for test YAML
test-gap-analyzer
  • 使用test-gap-analyzer找出未测试的函数
  • 为关键用户路径编写外部驱动测试
  • 使用单元测试(来自test-gap-analyzer)测试内部函数
philosophy-guardian
  • 确保测试YAML遵循极致简洁原则
  • 验证测试专注于行为而非实现
pr-review-assistant
  • 在PR评审中包含外部驱动测试
  • 验证测试覆盖了变更的功能
  • 检查测试的可读性和清晰度
module-spec-generator
  • 生成包含外部驱动测试场景的模块规格
  • 使用规格作为测试YAML的模板

Example Combined Workflow

示例组合工作流

bash
undefined
bash
undefined

1. Analyze coverage gaps

1. 分析覆盖缺口

claude "Use test-gap-analyzer on ./src"
claude "Use test-gap-analyzer on ./src"

2. Write outside-in tests for critical paths

2. 为关键路径编写外部驱动测试

claude "Use outside-in-testing to create web tests for authentication"
claude "Use outside-in-testing to create web tests for authentication"

3. Verify philosophy compliance

3. 验证理念合规性

claude "Use philosophy-guardian to review new test files"
claude "Use philosophy-guardian to review new test files"

4. Include in PR

4. 提交到PR

git add tests/agentic/ git commit -m "Add outside-in tests for auth flow"
undefined
git add tests/agentic/ git commit -m "Add outside-in tests for auth flow"
undefined

Troubleshooting [LEVEL 2]

故障排除 [LEVEL 2]

Test Times Out

测试超时

Symptom: Test exceeds timeout and fails
Causes:
  • Application takes longer to start than expected
  • Network requests are slow
  • Element never appears (incorrect selector)
Solutions:
yaml
undefined
症状:测试超过超时时间并失败
原因
  • 应用启动时间超出预期
  • 网络请求缓慢
  • 元素从未出现(选择器错误)
解决方案
yaml
undefined

Increase timeout

增加超时时间

  • action: wait_for_element selector: ".slow-loading-element" timeout: 30s # Increase from default
  • action: wait_for_element selector: ".slow-loading-element" timeout: 30s # 从默认值增加

Add intermediate verification

添加中间验证

  • action: launch target: "./app"
  • action: wait_for_output contains: "Initializing..." timeout: 5s
  • action: wait_for_output contains: "Ready" timeout: 20s
undefined
  • action: launch target: "./app"
  • action: wait_for_output contains: "Initializing..." timeout: 5s
  • action: wait_for_output contains: "Ready" timeout: 20s
undefined

Element Not Found

元素未找到

Symptom:
verify_element
or
click
fails with "element not found"
Causes:
  • Incorrect CSS selector
  • Element not yet rendered (timing issue)
  • Element in iframe or shadow DOM
Solutions:
yaml
undefined
症状
verify_element
click
因"element not found"失败
原因
  • CSS选择器错误
  • 元素尚未渲染(时序问题)
  • 元素在iframe或shadow DOM中
解决方案
yaml
undefined

Add wait before interaction

交互前添加等待

  • action: wait_for_element selector: ".target-element" timeout: 10s
  • action: click selector: ".target-element"
  • action: wait_for_element selector: ".target-element" timeout: 10s
  • action: click selector: ".target-element"

Use more specific selector

使用更具体的选择器

  • action: click selector: "button[data-testid='submit-button']"
  • action: click selector: "button[data-testid='submit-button']"

Handle iframe

处理iframe

  • action: switch_to_iframe selector: "iframe#payment-frame"
  • action: click selector: ".pay-now-button"
undefined
  • action: switch_to_iframe selector: "iframe#payment-frame"
  • action: click selector: ".pay-now-button"
undefined

Test Passes Locally, Fails in CI

本地测试通过,CI中失败

Symptom: Test works on dev machine but fails in CI environment
Causes:
  • Different screen size (web/Electron)
  • Missing dependencies
  • Timing differences (slower CI machines)
  • Environment variable differences
Solutions:
yaml
undefined
症状:在开发机器上正常工作,但在CI环境中失败
原因
  • 屏幕尺寸不同(Web/Electron)
  • 缺少依赖
  • 时序差异(CI机器较慢)
  • 环境变量不同
解决方案
yaml
undefined

Set explicit viewport size (web/Electron)

设置明确的视口大小(Web/Electron)

scenario: environment: viewport: width: 1920 height: 1080
scenario: environment: viewport: width: 1920 height: 1080

Add longer timeouts in CI

在CI中设置更长的超时

  • action: wait_for_element selector: ".element" timeout: 30s # Generous for CI
  • action: wait_for_element selector: ".element" timeout: 30s # 为CI设置宽松的超时

Verify prerequisites

验证前置条件

prerequisites:
  • "Chrome browser installed"
  • "Environment variable API_KEY is set"
undefined
prerequisites:
  • "已安装Chrome浏览器"
  • "已设置环境变量API_KEY"
undefined

Output Doesn't Match Expected

输出与预期不符

Symptom:
verify_output
fails even though output looks correct
Causes:
  • Extra whitespace or newlines
  • ANSI color codes in output
  • Case sensitivity
Solutions:
yaml
undefined
症状
verify_output
失败,尽管输出看起来正确
原因
  • 多余的空格或换行
  • 输出中包含ANSI颜色代码
  • 大小写敏感
解决方案
yaml
undefined

Use flexible matching

使用灵活匹配

  • action: verify_output matches: "Result:\s+Success" # Allow flexible whitespace
  • action: verify_output matches: "Result:\s+Success" # 允许灵活的空格

Strip ANSI codes

去除ANSI代码

  • action: verify_output contains: "Success" strip_ansi: true
  • action: verify_output contains: "Success" strip_ansi: true

Case-insensitive match

不区分大小写匹配

  • action: verify_output contains: "success" case_sensitive: false
undefined
  • action: verify_output contains: "success" case_sensitive: false
undefined

Reference: Action Catalog [LEVEL 3]

参考:操作目录 [LEVEL 3]

CLI Actions

CLI操作

ActionParametersDescription
launch
target
,
args
,
cwd
,
env
Start CLI application
send_input
value
,
delay
Send text to stdin
send_signal
signal
Send OS signal (SIGINT, SIGTERM, etc.)
wait_for_output
contains
,
matches
,
timeout
Wait for text in stdout/stderr
verify_output
contains
,
matches
,
stream
Check output content
verify_exit_code
expected
Validate exit code
capture_output
save_as
,
stream
Save output to file
操作参数描述
launch
target
,
args
,
cwd
,
env
启动CLI应用
send_input
value
,
delay
向stdin发送文本
send_signal
signal
发送OS信号(SIGINT、SIGTERM等)
wait_for_output
contains
,
matches
,
timeout
等待stdout/stderr中出现特定文本
verify_output
contains
,
matches
,
stream
检查输出内容
verify_exit_code
expected
验证退出码
capture_output
save_as
,
stream
将输出保存到文件

TUI Actions

TUI操作

ActionParametersDescription
launch
target
,
args
,
terminal_size
Start TUI application
send_keypress
value
,
times
,
modifiers
Send keyboard input
wait_for_screen
contains
,
timeout
Wait for text on screen
verify_screen
contains
,
matches
,
region
Check screen content
capture_screenshot
save_as
Save terminal screenshot
navigate_menu
path
Navigate menu structure
fill_form
fields
Fill TUI form fields
操作参数描述
launch
target
,
args
,
terminal_size
启动TUI应用
send_keypress
value
,
times
,
modifiers
发送键盘输入
wait_for_screen
contains
,
timeout
等待屏幕上出现特定文本
verify_screen
contains
,
matches
,
region
检查屏幕内容
capture_screenshot
save_as
保存终端截图
navigate_menu
path
导航菜单结构
fill_form
fields
填写TUI表单字段

Web Actions

Web操作

ActionParametersDescription
navigate
url
,
wait_for_load
Go to URL
click
selector
,
text
,
nth
Click element
type
selector
,
value
,
delay
Type into input
wait_for_element
selector
,
timeout
,
disappears
Wait for element
verify_element
selector
,
contains
,
count
,
exists
Check element state
verify_url
equals
,
contains
,
matches
Validate URL
screenshot
save_as
,
selector
,
full_page
Capture screenshot
scroll
selector
,
direction
,
amount
Scroll page/element
select_option
selector
,
value
Select dropdown option
checkbox
selector
,
checked
Check/uncheck checkbox
操作参数描述
navigate
url
,
wait_for_load
访问URL
click
selector
,
text
,
nth
点击元素
type
selector
,
value
,
delay
在输入框中输入文本
wait_for_element
selector
,
timeout
,
disappears
等待元素出现
verify_element
selector
,
contains
,
count
,
exists
检查元素状态
verify_url
equals
,
contains
,
matches
验证URL
screenshot
save_as
,
selector
,
full_page
捕获截图
scroll
selector
,
direction
,
amount
滚动页面/元素
select_option
selector
,
value
选择下拉选项
checkbox
selector
,
checked
勾选/取消勾选复选框

Electron Actions

Electron操作

ActionParametersDescription
launch
target
,
args
,
wait_for_window
Start Electron app
window_action
window
,
action
Interact with windows
menu_click
path
Click menu items
dialog_action
type
,
action
,
filename
Handle dialogs
ipc_send
channel
,
data
Send IPC message
ipc_expect
channel
,
timeout
Wait for IPC message
verify_window
title
,
visible
,
focused
,
count
Check window state
All web actionsElectron includes Chromium
操作参数描述
launch
target
,
args
,
wait_for_window
启动Electron应用
window_action
window
,
action
与窗口交互(聚焦、最小化、关闭)
menu_click
path
点击菜单项
dialog_action
type
,
action
,
filename
处理对话框(打开文件、保存、确认)
ipc_send
channel
,
data
发送IPC消息
ipc_expect
channel
,
timeout
等待IPC消息
verify_window
title
,
visible
,
focused
,
count
检查窗口状态
所有Web操作Electron基于Chromium,支持所有Web操作

Common Parameters

通用参数

ParameterTypeDescription
timeout
DurationMaximum wait time (e.g., "5s", "2m")
description
StringHuman-readable step explanation
continue_on_failure
BooleanDon't fail scenario if step fails
id
StringStep identifier for conditionals
condition
ExpressionExecute step only if condition true
参数类型描述
timeout
时长最大等待时间(例如"5s"、"2m")
description
字符串人类可读的步骤说明
continue_on_failure
布尔值步骤失败时不终止整个场景
id
字符串步骤标识符,用于条件逻辑
condition
表达式仅当条件为真时执行步骤

Quick Reference: YAML Template [LEVEL 1]

快速参考:YAML模板 [LEVEL 1]

yaml
scenario:
  # Required fields
  name: "Test Name"
  description: "What this test verifies"
  type: cli | tui | web | electron

  # Optional metadata
  tags: [smoke, critical]
  timeout: 60s

  # What must be true before running
  prerequisites:
    - "Condition 1"
    - "Condition 2"

  # Environment setup
  environment:
    variables:
      VAR_NAME: "value"

  # Variables for templating
  variables:
    username: "testuser"

  # Test steps (executed in order)
  steps:
    - action: launch
      target: "./app"

    - action: send_input
      value: "command\n"

    - action: verify_output
      contains: "Success"
      timeout: 5s

  # Cleanup (always runs)
  cleanup:
    - action: stop_application
yaml
scenario:
  # 必填字段
  name: "测试名称"
  description: "本测试验证的内容"
  type: cli | tui | web | electron

  # 可选元数据
  tags: [smoke, critical]
  timeout: 60s

  # 运行前必须满足的条件
  prerequisites:
    - "条件1"
    - "条件2"

  # 环境设置
  environment:
    variables:
      VAR_NAME: "value"

  # 模板变量
  variables:
    username: "testuser"

  # 测试步骤(按顺序执行)
  steps:
    - action: launch
      target: "./app"

    - action: send_input
      value: "command\n"

    - action: verify_output
      contains: "Success"
      timeout: 5s

  # 清理步骤(始终执行)
  cleanup:
    - action: stop_application

Skill Usage Examples [LEVEL 1]

技能使用示例 [LEVEL 1]

Invoke This Skill

调用本技能

Claude, use the outside-in-testing skill to create a CLI test for my calculator app.

Claude, use outside-in-testing to generate web tests for the login flow.

Claude, create TUI tests using the outside-in-testing skill for file navigation.
Claude, use the outside-in-testing skill to create a CLI test for my calculator app.

Claude, use outside-in-testing to generate web tests for the login flow.

Claude, create TUI tests using the outside-in-testing skill for file navigation.

What You'll Get

你将获得

  1. Complete YAML test scenario matching your requirements
  2. Inline documentation explaining each section
  3. Best practices applied (timeouts, waits, verification)
  4. Appropriate complexity level (1, 2, or 3 based on needs)
  5. Instructions for running the test
  1. 完整的YAML测试场景,符合你的需求
  2. 内联文档,解释每个部分
  3. 应用最佳实践(超时、等待、验证)
  4. 合适的复杂度层级(1、2或3,根据需求)
  5. 运行说明,指导如何执行测试

Example Request and Response

示例请求与响应

User: "Create a web test for user login with valid credentials"
Claude (using this skill):
yaml
scenario:
  name: "User Login - Valid Credentials"
  description: "Verifies user can successfully log in with valid email and password"
  type: web
  tags: [smoke, auth, critical]

  prerequisites:
    - "User test@example.com exists with password 'TestPass123!'"
    - "Application is running on http://localhost:3000"

  steps:
    # Navigate to login page
    - action: navigate
      url: "http://localhost:3000/login"
      description: "Open login page"

    # Wait for page to fully load
    - action: wait_for_element
      selector: "form.login-form"
      timeout: 5s
      description: "Wait for login form to render"

    # Fill in email field
    - action: type
      selector: "input[name='email']"
      value: "test@example.com"
      description: "Enter email address"

    # Fill in password field
    - action: type
      selector: "input[name='password']"
      value: "TestPass123!"
      description: "Enter password"

    # Submit form
    - action: click
      selector: "button[type='submit']"
      description: "Click login button"

    # Wait for redirect to dashboard
    - action: wait_for_url
      contains: "/dashboard"
      timeout: 10s
      description: "Should redirect to dashboard after successful login"

    # Verify user is logged in
    - action: verify_element
      selector: ".user-profile"
      contains: "test@example.com"
      description: "User email should appear in profile widget"

    # Capture evidence
    - action: screenshot
      save_as: "successful-login.png"
      description: "Save screenshot of logged-in state"
用户:"Create a web test for user login with valid credentials"
Claude(使用本技能):
yaml
scenario:
  name: "用户登录 - 有效凭据"
  description: "验证用户可使用有效邮箱和密码成功登录"
  type: web
  tags: [smoke, auth, critical]

  prerequisites:
    - "用户test@example.com已存在,密码为'TestPass123!'"
    - "应用运行在http://localhost:3000"

  steps:
    # 导航到登录页面
    - action: navigate
      url: "http://localhost:3000/login"
      description: "打开登录页面"

    # 等待页面完全加载
    - action: wait_for_element
      selector: "form.login-form"
      timeout: 5s
      description: "等待登录表单渲染"

    # 填写邮箱字段
    - action: type
      selector: "input[name='email']"
      value: "test@example.com"
      description: "输入邮箱地址"

    # 填写密码字段
    - action: type
      selector: "input[name='password']"
      value: "TestPass123!"
      description: "输入密码"

    # 提交表单
    - action: click
      selector: "button[type='submit']"
      description: "点击登录按钮"

    # 等待重定向到仪表盘
    - action: wait_for_url
      contains: "/dashboard"
      timeout: 10s
      description: "登录成功后应重定向到仪表盘"

    # 验证用户已登录
    - action: verify_element
      selector: ".user-profile"
      contains: "test@example.com"
      description: "用户邮箱应显示在资料组件中"

    # 捕获证据
    - action: screenshot
      save_as: "successful-login.png"
      description: "保存登录成功状态的截图"

To run this test:

运行此测试:

1. Save as test-login-valid.yaml

1. 保存为test-login-valid.yaml

2. Run: gadugi-test run test-login-valid.yaml

2. 运行: gadugi-test run test-login-valid.yaml

3. Check evidence in ./evidence/user-login-valid-credentials-TIMESTAMP/

3. 在./evidence/user-login-valid-credentials-TIMESTAMP/中查看证据

undefined
undefined

Related Resources [LEVEL 1]

相关资源 [LEVEL 1]

Official Documentation

官方文档

Level 4: Shadow Environment Integration [LEVEL 4]

Level 4: 影子环境集成 [LEVEL 4]

Run your outside-in tests in isolated shadow environments to validate changes before pushing. This combines the behavioral testing power of gadugi-agentic-test with the clean-state isolation of shadow environments.
隔离的影子环境中运行外部驱动测试,在推送前验证变更。这将gadugi-agentic-test的行为测试能力与影子环境的干净状态隔离能力相结合。

Why Use Shadow Environments for Testing

为什么在影子环境中测试

  1. Clean State: Fresh container, no host pollution
  2. Local Changes: Test uncommitted code exactly as-is
  3. Multi-Repo: Coordinate changes across multiple repos
  4. CI Parity: What shadow sees ≈ what CI will see
  1. 干净状态:全新容器,无主机污染
  2. 本地变更:精确测试未提交的代码
  3. 多仓库协调:跨多个仓库协调变更
  4. CI一致性:影子环境的测试结果≈CI的测试结果

Shadow Testing Workflow

影子测试工作流

For complete shadow environment documentation, see the shadow-testing skill. Here's how to integrate it with outside-in tests:
完整的影子环境文档请查看shadow-testing技能。以下是与外部驱动测试集成的方法:

Pattern 1: CLI Tests in Shadow (Amplifier)

模式1:CLI测试在影子环境中运行(Amplifier)

python
undefined
python
undefined

Create shadow with your local library changes

创建包含本地库变更的影子环境

shadow.create(local_sources=["~/repos/my-lib:org/my-lib"])
shadow.create(local_sources=["~/repos/my-lib:org/my-lib"])

Run outside-in test scenarios inside shadow

在影子环境中运行外部驱动测试场景

shadow.exec(shadow_id, "gadugi-test run test-scenario.yaml")
shadow.exec(shadow_id, "gadugi-test run test-scenario.yaml")

Extract evidence

提取证据

shadow.extract(shadow_id, "/evidence", "./test-evidence")
shadow.extract(shadow_id, "/evidence", "./test-evidence")

Cleanup

清理

shadow.destroy(shadow_id)
undefined
shadow.destroy(shadow_id)
undefined

Pattern 2: CLI Tests in Shadow (Standalone)

模式2:CLI测试在影子环境中运行(独立版)

bash
undefined
bash
undefined

Create shadow with local changes

创建包含本地变更的影子环境

amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name test
amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name test

Run your test scenarios

运行测试场景

amplifier-shadow exec test "gadugi-test run test-scenario.yaml"
amplifier-shadow exec test "gadugi-test run test-scenario.yaml"

Extract results

提取结果

amplifier-shadow extract test /evidence ./test-evidence
amplifier-shadow extract test /evidence ./test-evidence

Cleanup

清理

amplifier-shadow destroy test
undefined
amplifier-shadow destroy test
undefined

Pattern 3: Multi-Repo Integration Test

模式3:多仓库集成测试

yaml
undefined
yaml
undefined

test-multi-repo.yaml

test-multi-repo.yaml

scenario: name: "Multi-Repo Integration Test" type: cli
prerequisites: - "Shadow environment with core-lib and cli-tool"
steps: - action: launch target: "cli-tool"
- action: send_input
  value: "process --lib core-lib\n"

- action: verify_output
  contains: "Success: Using core-lib"

```bash
scenario: name: "多仓库集成测试" type: cli
prerequisites: - "包含core-lib和cli-tool的影子环境"
steps: - action: launch target: "cli-tool"
- action: send_input
  value: "process --lib core-lib\n"

- action: verify_output
  contains: "Success: Using core-lib"

```bash

Setup shadow with both repos

创建包含两个仓库的影子环境

amplifier-shadow create
--local ~/repos/core-lib:org/core-lib
--local ~/repos/cli-tool:org/cli-tool
--name multi-test
amplifier-shadow create
--local ~/repos/core-lib:org/core-lib
--local ~/repos/cli-tool:org/cli-tool
--name multi-test

Run test that exercises both

运行测试,验证两者的交互

amplifier-shadow exec multi-test "gadugi-test run test-multi-repo.yaml"
undefined
amplifier-shadow exec multi-test "gadugi-test run test-multi-repo.yaml"
undefined

Pattern 4: Web App Testing in Shadow

模式4:Web应用在影子环境中测试

yaml
undefined
yaml
undefined

test-web-app.yaml

test-web-app.yaml

scenario: name: "Web App with Local Library" type: web
steps: - action: navigate url: "http://localhost:3000"
- action: click
  selector: "button.process"

- action: verify_element
  selector: ".result"
  contains: "Processed with v2.0" # Your local version

```bash
scenario: name: "使用本地库的Web应用" type: web
steps: - action: navigate url: "http://localhost:3000"
- action: click
  selector: "button.process"

- action: verify_element
  selector: ".result"
  contains: "Processed with v2.0" # 你的本地版本

```bash

Shadow with library changes

创建包含本地库变更的影子环境

amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name web-test
amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name web-test

Start web app inside shadow (uses your local lib)

在影子环境中启动Web应用(使用你的本地库)

amplifier-shadow exec web-test " cd /workspace && git clone https://github.com/org/web-app && cd web-app && npm install && # Pulls your local my-lib via git URL rewriting npm start & "
amplifier-shadow exec web-test " cd /workspace && git clone https://github.com/org/web-app && cd web-app && npm install && # 通过git URL重写拉取你的本地my-lib npm start & "

Wait for app to start, then run tests

等待应用启动,然后运行测试

amplifier-shadow exec web-test "sleep 5 && gadugi-test run test-web-app.yaml"
undefined
amplifier-shadow exec web-test "sleep 5 && gadugi-test run test-web-app.yaml"
undefined

Verification Best Practices

验证最佳实践

When running tests in shadow, always verify your local sources are being used:
bash
undefined
在影子环境中运行测试时,始终验证是否使用了你的本地代码:
bash
undefined

After shadow.create, check snapshot commits

shadow.create后,检查快照提交

shadow.status(shadow_id)
shadow.status(shadow_id)

Shows: snapshot_commits: {"org/my-lib": "abc1234..."}

输出: snapshot_commits: {"org/my-lib": "abc1234..."}

When your test installs dependencies, verify commit matches

当测试安装依赖时,验证提交是否匹配

Look in test output for: my-lib @ git+...@abc1234

在测试输出中查找: my-lib @ git+...@abc1234

undefined
undefined

Complete Example: Library Change Validation

完整示例:库变更验证

yaml
undefined
yaml
undefined

test-library-change.yaml - Outside-in test

test-library-change.yaml - 外部驱动测试

scenario: name: "Validate Library Breaking Change" type: cli description: "Test that dependent app still works with new library API"
steps: - action: launch target: "/workspace/org/dependent-app/cli.py"
- action: send_input
  value: "process data.json\n"

- action: verify_output
  contains: "Processed successfully"
  description: "New library API should still work"

- action: verify_exit_code
  expected: 0

```bash
scenario: name: "验证库的破坏性变更" type: cli description: "测试依赖应用在新库API下是否仍能正常工作"
steps: - action: launch target: "/workspace/org/dependent-app/cli.py"
- action: send_input
  value: "process data.json\n"

- action: verify_output
  contains: "Processed successfully"
  description: "新库API应仍能正常工作"

- action: verify_exit_code
  expected: 0

```bash

Complete workflow

完整工作流

1. Create shadow with your breaking change

1. 创建包含破坏性变更的影子环境

amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name breaking-test
amplifier-shadow create --local ~/repos/my-lib:org/my-lib --name breaking-test

2. Install dependent app (pulls your local lib)

2. 安装依赖应用(拉取你的本地库)

amplifier-shadow exec breaking-test " cd /workspace && git clone https://github.com/org/dependent-app && cd dependent-app && pip install -e . && # This installs git+https://github.com/org/my-lib (your local version) echo 'Ready to test' "
amplifier-shadow exec breaking-test " cd /workspace && git clone https://github.com/org/dependent-app && cd dependent-app && pip install -e . && # 这会安装git+https://github.com/org/my-lib(你的本地版本) echo 'Ready to test' "

3. Run outside-in test

3. 运行外部驱动测试

amplifier-shadow exec breaking-test "gadugi-test run test-library-change.yaml"
amplifier-shadow exec breaking-test "gadugi-test run test-library-change.yaml"

If test passes, your breaking change is compatible!

如果测试通过,说明你的破坏性变更兼容!

If test fails, you've caught the issue before pushing

如果测试失败,说明你在推送前就发现了问题

undefined
undefined

When to Use Shadow Integration

何时使用影子集成

Use shadow + outside-in tests when:
  • ✅ Testing library changes with dependent projects
  • ✅ Validating multi-repo coordinated changes
  • ✅ Need clean-state validation before pushing
  • ✅ Want to catch integration issues early
  • ✅ Testing that setup/install procedures work
Don't use shadow for:
  • ❌ Simple unit tests (too much overhead)
  • ❌ Tests of already-committed code (shadow adds no value)
  • ❌ Performance testing (container overhead skews results)
当以下情况时,使用影子+外部驱动测试:
  • ✅ 测试库变更对依赖项目的影响
  • ✅ 验证多仓库的协调变更
  • ✅ 需要在推送前进行干净状态验证
  • ✅ 希望尽早发现集成问题
  • ✅ 测试安装/设置流程是否正常工作
不要在以下情况使用影子:
  • ❌ 简单单元测试(开销太大)
  • ❌ 测试已提交的代码(影子无额外价值)
  • ❌ 性能测试(容器开销会影响结果)

Learn More

了解更多

For complete shadow environment documentation, including:
  • Shell scripts for DIY setup
  • Docker Compose examples
  • Multi-language support (Python, Node, Rust, Go)
  • Troubleshooting and verification techniques
Load the shadow-testing skill:
Claude, use the shadow-testing skill to set up a shadow environment
Or for Amplifier users, the shadow tool is built-in:
python
shadow.create(local_sources=["~/repos/lib:org/lib"])

完整的影子环境文档,包括:
  • DIY设置的Shell脚本
  • Docker Compose示例
  • 多语言支持(Python、Node、Rust、Go)
  • 故障排除和验证技巧
加载shadow-testing技能
Claude, use the shadow-testing skill to set up a shadow environment
对于Amplifier用户,影子工具是内置的:
python
shadow.create(local_sources=["~/repos/lib:org/lib"])

Related Skills

相关技能

  • shadow-testing: Complete shadow environment setup and usage
  • test-gap-analyzer: Find untested code paths
  • philosophy-guardian: Review test philosophy compliance
  • pr-review-assistant: Include tests in PR reviews
  • module-spec-generator: Generate specs with test scenarios
  • shadow-testing: 完整的影子环境设置和使用
  • test-gap-analyzer: 找出未测试的代码路径
  • philosophy-guardian: 评审测试理念合规性
  • pr-review-assistant: 在PR评审中包含测试
  • module-spec-generator: 生成包含测试场景的规格

Further Reading

进一步阅读

  • Outside-in vs inside-out testing approaches
  • Behavior-driven development (BDD) principles
  • AI-powered testing best practices
  • Test automation patterns
  • Shadow environment testing methodology
  • 外部驱动与内部驱动测试方法对比
  • 行为驱动开发(BDD)原则
  • AI驱动测试最佳实践
  • 测试自动化模式
  • 影子环境测试方法论

Changelog [LEVEL 3]

更新日志 [LEVEL 3]

Version 1.1.0 (2026-01-29)

Version 1.1.0 (2026-01-29)

  • NEW: Level 4 - Shadow Environment Integration
  • Added complete shadow testing workflow patterns
  • Integration examples for Amplifier native and standalone CLI
  • Multi-repo integration test patterns
  • Web app testing in shadow environments
  • Complete workflow example for library change validation
  • References to shadow-testing skill for deep-dive documentation
  • 新增: Level 4 - 影子环境集成
  • 添加完整的影子测试工作流模式
  • Amplifier原生和独立CLI的集成示例
  • 多仓库集成测试模式
  • Web应用在影子环境中的测试
  • 库变更验证的完整工作流示例
  • 指向shadow-testing技能的参考文档

Version 1.0.0 (2025-11-16)

Version 1.0.0 (2025-11-16)

  • Initial skill release
  • Support for CLI, TUI, Web, and Electron applications
  • 15 complete working examples
  • Progressive disclosure levels (1, 2, 3)
  • Embedded gadugi-agentic-test framework documentation (v0.1.0)
  • Freshness check script for version monitoring
  • Full integration with amplihack philosophy
  • Comprehensive troubleshooting guide
  • Action reference catalog

Remember: Outside-in tests verify WHAT your application does, not HOW it does it. Focus on user-visible behavior, and your tests will remain stable across refactorings while providing meaningful validation of critical workflows.
Start at Level 1 with simple smoke tests, and progressively add complexity only when needed. The framework's AI agents handle the hard parts - you just describe what should happen.
  • 初始技能发布
  • 支持CLI、TUI、Web和Electron应用
  • 15个完整的可运行示例
  • 渐进式学习层级(1、2、3)
  • 嵌入gadugi-agentic-test框架文档(v0.1.0)
  • 版本检查脚本
  • 与amplihack理念完全对齐
  • 全面的故障排除指南
  • 操作参考目录

记住:外部驱动测试验证的是应用做什么,而非怎么做。专注于用户可见的行为,你的测试将在重构后依然稳定,同时为关键工作流提供有意义的验证。
从Level 1的简单冒烟测试开始,仅在需要时逐步增加复杂度。框架的AI代理会处理复杂的部分 - 你只需描述应该发生什么。