aiden-test-feature


Task: Test Feature & Generate Demo Report


Analyze the current branch's changes, scope the coverage level with the user, build an explicit testing strategy, start the app's dev server, test the feature in a real browser using `agent-browser`, capture screenshots and video (always — no exceptions), upload everything to S3, and generate a structured test report (always — never skip).

Phase 0: Environment Setup


1. Get artifact context IDs


Use the active task context when it is present in the prompt. Otherwise:
  • The task ID is available from the `AIDEN_TASK_ID` environment variable.
  • The conversation ID is available from the `AIDEN_SESSION_ID` environment variable.
  • Resolve `teamId` with Aiden MCP context/tools before creating the report.
When calling `create_test_report`, always pass `taskId`, `conversationId`, and `teamId` explicitly. Do not substitute one ID for another.
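As a minimal sketch, the environment-variable fallback above can be exercised like this (the `unknown` fallback values are purely illustrative, not part of the skill):

```shell
# Resolve IDs from the environment when no active task context is present.
# AIDEN_TASK_ID / AIDEN_SESSION_ID are the documented variables; the
# "unknown" fallbacks below exist only for this illustration.
TASK_ID="${AIDEN_TASK_ID:-unknown}"
SESSION_ID="${AIDEN_SESSION_ID:-unknown}"
echo "taskId=$TASK_ID conversationId=$SESSION_ID"
```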

2. Verify agent-browser is available


`agent-browser` is a CLI tool by Vercel Labs (npm: `agent-browser`) — it is NOT a skill or MCP tool. It provides headless browser automation via bash commands (open, click, fill, screenshot, record). In sandboxes it is pre-installed by `runtime-bootstrap.ts`.

```bash
command -v agent-browser >/dev/null 2>&1 && echo "agent-browser: OK" || echo "agent-browser: MISSING"
```

If missing, install it:

```bash
npm install -g agent-browser && agent-browser install
```

If installation fails (e.g. no network, no npm), STOP and tell the user:

"agent-browser is not available and could not be installed. Install it manually: `npm install -g agent-browser && agent-browser install`"

Do NOT proceed to Phase 2 without a working `agent-browser` — all browser testing depends on it.

3. Create workspace


All captures (screenshots, videos) MUST be written to `/tmp/aiden-captures/`.

```bash
mkdir -p /tmp/aiden-captures
```

CRITICAL — file path rules:
  • ONLY write capture files to `/tmp/aiden-captures/`. Never anywhere else.
  • NEVER write to `.claude/`, project directories, `reports/`, or any path inside the repo.
  • `.claude/` is a sensitive system directory — writing to it will be blocked and will abort the test run.
  • If you are tempted to create a `reports/` or `screenshots/` folder anywhere other than `/tmp/`, stop and use `/tmp/aiden-captures/` instead.

Phase 0.5: Coverage Scoping & Context Gathering (MANDATORY — do before anything else)


Never skip this phase. You must understand what the user wants before writing a single test step.

1. Ask about coverage level


Use `AskUserQuestion` to ask the user:

What level of test coverage do you want for this feature?

- light         — Happy path only. Quick smoke test to verify the main flow works.
- standard      — Happy path + key edge cases + basic error states. (Default)
- comprehensive — Full coverage: happy path, all edge cases, error states, non-happy paths, boundary values, accessibility, responsiveness.

Also: are there any specific flows, known bugs, or risky areas you want me to focus on?

Wait for the response before continuing.

2. Ask targeted follow-up questions


Based on the feature (which you may not know yet — do a quick `git diff --stat` first to get a hint), use `AskUserQuestion` to ask 1–3 targeted follow-up questions. Examples:
  • "Is there an authenticated state I should test? If yes, what test credentials should I use?"
  • "Are there any known edge cases or previous bugs related to this feature?"
  • "What's the definition of 'working correctly' for this feature — what should I see?"
  • "Are there any specific non-happy paths you're concerned about (e.g. invalid input, network errors, empty states)?"
  • "Any pages or user roles I should specifically include or exclude?"
Do not skip questions if context is unclear. Ask. A well-scoped test is 10× more valuable than a blind one.

3. Confirm the plan


After gathering answers, summarize back to the user (no tool call needed — just a short message):
  • Coverage level chosen
  • Specific areas / flows to focus on
  • Any known risks or edge cases to target
  • Rough count of test scenarios expected
Only proceed to Phase 1 after this confirmation.


Phase 1: Discover What to Test


1. Analyze the branch


Run these commands to understand what changed:

```bash
git log main..HEAD --oneline 2>/dev/null || git log HEAD~5..HEAD --oneline
git diff main...HEAD --stat 2>/dev/null || git diff HEAD~1 --stat
```

Read the actual diff to understand the feature or bug fix. Identify:
  • What part of the app is affected (which pages, components, API routes)
  • What the expected behavior change is
  • What URL path to navigate to for testing

2. Discover and start the dev server


Look at the project to figure out how to run it:
  1. Read `package.json` — check `scripts.dev`, `scripts.start`, `scripts.serve`
  2. Check for `docker-compose.yml` / `docker-compose.yaml` / `compose.yml`
  3. Check for `Makefile` (look for `dev` or `serve` targets)
  4. Check for `Procfile`, `.env`, `Pipfile`, `requirements.txt`, `Gemfile`
  5. Check for framework-specific files: `next.config.*`, `vite.config.*`, `nuxt.config.*`, `angular.json`, `manage.py`, `config/routes.rb`

Start the dev server in the background. Common patterns:

```bash
# Node.js
npm run dev &
# or: pnpm dev &, yarn dev &, npx next dev &, npx vite &

# Python
python manage.py runserver &
# or: flask run &, uvicorn main:app &

# Ruby
bundle exec rails server &

# Docker
docker compose up -d
```

3. Wait for the server


Poll until the server is responding:

```bash
# Replace PORT with the discovered port
for i in $(seq 1 30); do
  curl -sf http://localhost:PORT >/dev/null 2>&1 && break
  sleep 2
done
```

Check common ports if unclear: 3000, 5173, 8080, 4200, 8000, 4000, 3001, 8888.
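When the port is unknown, the common-port check can be sketched as a small probe loop (a sketch assuming `curl` is installed; it prints `found=none` when nothing answers):

```shell
# Probe the common dev-server ports and report the first one that responds.
FOUND=""
for PORT in 3000 5173 8080 4200 8000 4000 3001 8888; do
  if curl -sf "http://localhost:$PORT" >/dev/null 2>&1; then
    FOUND=$PORT
    break
  fi
done
echo "found=${FOUND:-none}"
```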

4. Determine the app URL


  • Parse dev server stdout/stderr for "Local:" or "ready on" messages with URLs
  • Check `.env` or `.env.local` for `PORT`, `VITE_PORT`, or similar
  • Test common ports with `curl -sf http://localhost:PORT >/dev/null`
  • If you cannot determine it, ask the user via `AskUserQuestion`
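Extracting the URL from captured server output can be sketched like this (`/tmp/dev-server.log` is a hypothetical file holding the server's stdout, and the sample "ready" line is illustrative):

```shell
# Write a sample "ready" line, then extract the first http URL from it.
printf 'ready - Local: http://localhost:3000\n' > /tmp/dev-server.log
URL=$(grep -oE 'http://[^ ]+' /tmp/dev-server.log | head -n1)
echo "url=$URL"
```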

5. Summarize before proceeding


Tell the user:
  • What you found in the diff (feature/fix summary)
  • What URL you will test
  • What test scenarios you plan to cover


Phase 1.5: Build Testing Strategy & Create Todos (MANDATORY)


Before opening the browser, you must have an explicit plan. Do not improvise test steps on the fly.

1. Draft the testing strategy


Based on:
  • The coverage level the user chose in Phase 0.5
  • The diff analysis from Phase 1
  • The context gathered from the user
Write out the full test plan as a structured list. For each scenario, note:
  • Scenario name — short label (e.g. "Happy path: create item")
  • Type — happy path / non-happy path / edge case / error state / visual / navigation
  • Steps — what to do
  • Expected outcome — what "pass" looks like
  • Screenshot needed — yes/no
Coverage requirements by level:
  • light: happy path only (1–3 scenarios)
  • standard: happy path + 2–4 non-happy paths + 1–2 error states
  • comprehensive: happy path + all non-happy paths + all error states + boundary values + empty states + responsiveness + navigation
Non-happy path examples to consider:
  • Invalid / missing required inputs
  • Submitting with no data / empty state
  • Duplicate entries (if applicable)
  • Permission denied / unauthorized access
  • Network error / API failure simulation (if testable)
  • Rapid repeated actions (double-click, spam submit)
  • Long input strings / special characters
  • Back button / browser navigation mid-flow

2. Create todos for each scenario


Use `TodoWrite` to create a todo entry for each test scenario so progress is tracked. Each todo should be the scenario name.

3. Present the strategy


Output the full testing strategy as a numbered list before proceeding. The user should be able to see exactly what will be tested before Phase 2 starts.


Phase 2: Test with agent-browser


`agent-browser` is a CLI tool — invoke it via bash, not as an MCP tool or skill. Docs: https://github.com/vercel-labs/agent-browser

Key commands: `open`, `snapshot`, `click`, `fill`, `screenshot`, `record`, `wait`, `find`, `close`

1. Open the app


```bash
agent-browser --session test-feature open http://localhost:PORT
agent-browser --session test-feature wait --load networkidle
```

2. Start video recording — MANDATORY


Always start a video recording before any interaction. No exceptions.

```bash
agent-browser --session test-feature record start /tmp/aiden-captures/happy-path.webm
```

If video recording fails for technical reasons, note the failure but continue — screenshots are still required.

3. Take initial screenshot — MANDATORY


Always capture the initial page state before any interaction.

```bash
agent-browser --session test-feature screenshot /tmp/aiden-captures/step-00-initial-state.png
```

4. Execute each scenario from the testing strategy


Work through every scenario defined in Phase 1.5. For each scenario:
  • Mark the corresponding todo as in-progress
  • Use `agent-browser snapshot -i` to discover interactive elements
  • Use `agent-browser click @eN`, `agent-browser fill @eN "text"`, etc. to interact
  • Use `agent-browser wait --load networkidle` or `agent-browser wait 1500` between actions
  • Take a screenshot after every significant state change — never go more than 2 meaningful actions without a screenshot:

    ```bash
    agent-browser --session test-feature screenshot /tmp/aiden-captures/step-NN-description.png
    ```

    Name screenshots descriptively: `step-02-form-filled.png`, `step-03-submit-clicked.png`, `step-04-success-state.png`
  • Re-snapshot after navigation or DOM changes (refs go stale)
  • If elements are hard to find by ref, use semantic locators:

    ```bash
    agent-browser --session test-feature find text "Submit" click
    agent-browser --session test-feature find role button click --name "Save"
    ```

  • Mark each todo as complete or failed based on outcome

Mandatory coverage checklist (execute ALL that apply to the coverage level chosen):

Happy path (always required):
  • Main feature flow works end-to-end as expected
  • Success state / confirmation is visible
  • Data is persisted / reflected correctly after action

Non-happy paths (required for standard + comprehensive):
  • Empty / missing required inputs — form validation fires, error messages shown
  • Invalid input values — correct rejection, no crash
  • Boundary values — minimum and maximum accepted values
  • Duplicate / conflicting data (if applicable)
  • Unauthorized / permission-denied state (if applicable)
  • Empty list / zero-state view (if applicable)
  • Rapid repeated actions (double-click submit, spam button)
  • Long strings / special characters in text inputs

Error states (required for standard + comprehensive):
  • API / network failure behavior (if simulatable)
  • Partial failure — what happens if only part of the action succeeds
  • Graceful degradation — app does not crash, user sees a meaningful message

Visual & navigation (required for comprehensive):
  • Layout is correct, no overflow or broken UI
  • Responsive on narrower viewport (resize if possible)
  • Links and navigation work; back button does not break state
  • Loading states / spinners shown while async work is in progress

5. Stop recording and clean up


```bash
agent-browser --session test-feature record stop
agent-browser --session test-feature screenshot /tmp/aiden-captures/final-state.png
agent-browser --session test-feature close
```

Keep the recording under 2 minutes. If the feature requires more exploration, split into multiple recordings (e.g. `happy-path.webm`, `error-states.webm`).

6. Fix WebM duration metadata (MANDATORY if ffmpeg is available)


WebM files recorded by agent-browser often have `duration = Infinity` in the container header — the video player then shows 0:00. Fix every `.webm` file by remuxing it through ffmpeg, which reads the entire file, computes the real duration, and writes it into the output header:

```bash
for f in /tmp/aiden-captures/*.webm; do
  if command -v ffmpeg >/dev/null 2>&1; then
    ffmpeg -y -i "$f" -c copy "${f%.webm}-fixed.webm" 2>/dev/null \
      && mv "${f%.webm}-fixed.webm" "$f" \
      || echo "ffmpeg remux failed for $f — uploading as-is"
  fi
done
```

If ffmpeg is not available, skip this step and continue — the video will still play, it just won't show the correct duration in the player.

Phase 3: Upload Captures to S3


For each captured file, use the `mcp__aiden__get_upload_url` MCP tool to get a presigned S3 URL, then `curl PUT` the file directly to S3.

Per-file upload flow


  1. Get the file size (needed by the MCP tool):

     ```bash
     SIZE=$(stat -c%s "/tmp/aiden-captures/step-01.png" 2>/dev/null || stat -f%z "/tmp/aiden-captures/step-01.png")
     ```

  2. Call the MCP tool to get a presigned upload URL:

     Tool: mcp__aiden__get_upload_url
     Parameters: {
       "taskId": "<active task ID or value of AIDEN_TASK_ID>",
       "filename": "step-01.png",
       "mimeType": "image/png",
       "size": <SIZE from step 1>
     }
     Returns:
     { "uploadUrl": "https://...", "s3Key": "sandbox-captures/..." }

  3. Upload the file to S3 using the presigned URL:

     ```bash
     curl -sf -X PUT "<uploadUrl>" \
       -H "Content-Type: image/png" \
       --data-binary @/tmp/aiden-captures/step-01.png
     ```

  4. Save the `s3Key` — you will pass it to `create_test_report` in Phase 4.

Repeat for each screenshot and video file. Common MIME types:
  • Screenshots: `image/png`
  • Video recordings: `video/webm`

If uploads fail: Continue to Phase 4 anyway — omit the `screenshotUrl`, `videoUrl`, and `screenshotUrls` fields. The structured report is still valuable without media.
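The cross-platform size lookup in step 1 (GNU `stat -c%s` first, BSD/macOS `stat -f%z` as fallback) can be sanity-checked with a stand-in file; the 5-byte file below is purely illustrative:

```shell
# Create a 5-byte stand-in capture, then compute its size the same way
# as step 1 of the upload flow.
mkdir -p /tmp/aiden-captures
printf 'hello' > /tmp/aiden-captures/step-01.png
FILE=/tmp/aiden-captures/step-01.png
SIZE=$(stat -c%s "$FILE" 2>/dev/null || stat -f%z "$FILE")
echo "size=$SIZE"
```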

Phase 4: Create Structured Test Report (MANDATORY — NEVER SKIP)


CRITICAL: You MUST call `mcp__aiden__create_test_report` to complete this skill. This is non-negotiable.
  • Do NOT output the report as markdown text
  • Do NOT summarize findings in chat only
  • Do NOT skip this phase for any reason — not because of time, not because uploads failed, not because testing was partial
  • The report MUST be persisted via this MCP tool call so the UI renders it as an interactive artifact
  • Skipping this step means the entire test session is wasted and untracked

If uploads failed in Phase 3, create the report anyway — omit the media URLs but include all steps, issues, and summary text.

Call the `mcp__aiden__create_test_report` MCP tool with structured data from your testing. Gather all the data from the previous phases and call the tool:

Tool: mcp__aiden__create_test_report
Parameters: {
  "title": "<short description of what was tested>",
  "taskId": "<active task ID or value of AIDEN_TASK_ID, if set>",
  "conversationId": "<active conversation ID or value of AIDEN_SESSION_ID>",
  "teamId": "<resolved team ID>",
  "branch": "<current branch name from git>",
  "baseBranch": "main",
  "commits": [
    { "sha": "<commit sha>", "message": "<commit message>" }
  ],
  "changedFiles": [
    { "path": "src/components/Feature.tsx", "description": "Added new feature component" }
  ],
  "appUrl": "<the URL you tested>",
  "summary": "<1-2 paragraph summary of what was tested and the outcome>",
  "status": "pass | fail | mixed",
  "steps": [
    {
      "name": "Navigate to feature page",
      "status": "pass | fail | skip",
      "screenshotUrl": "<s3Key from Phase 3 — e.g. sandbox-captures/org/task/step-01.png — omit if upload failed>",
      "notes": "Page loaded correctly"
    }
  ],
  "issues": [
    {
      "title": "Button misaligned on mobile",
      "severity": "critical | high | medium | low",
      "description": "The submit button overflows on viewports < 375px",
      "screenshotUrl": "<s3Key from Phase 3 showing the issue — omit if upload failed>"
    }
  ],
  "videoUrl": "<s3Key from Phase 3 for the happy-path recording — omit if upload failed>",
  "screenshotUrls": ["<s3Key 1 from Phase 3>", "<s3Key 2 from Phase 3>"]
}

Field guidelines


  • status: "pass" if all steps passed, "fail" if any critical step failed, "mixed" if some passed and some failed
  • steps: one entry per distinct test action (navigate, click, fill, verify). Include the screenshot URL for that step if you took one.
  • issues: only include actual problems found. Each issue should have a severity and clear description.
  • screenshotUrls: flat list of ALL screenshot s3Keys from Phase 3 (for gallery display). Use the `s3Key` field returned by `get_upload_url`, not the `uploadUrl`. Omit if no uploads succeeded.
  • commits and changedFiles: from git analysis in Phase 1
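The status rule can be sketched as a tiny shell derivation. This is a simplified sketch (it treats every failure as critical, which the rule above does not) and the step outcomes are made up for illustration:

```shell
# Derive the overall report status from a list of per-step results.
STEPS="pass pass fail"                      # illustrative step outcomes
if ! echo "$STEPS" | grep -qw fail; then
  STATUS=pass                               # no failures at all
elif ! echo "$STEPS" | grep -qw pass; then
  STATUS=fail                               # nothing passed
else
  STATUS=mixed                              # some passed, some failed
fi
echo "status=$STATUS"
```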

Output


After calling `mcp__aiden__create_test_report`, tell the user:
  • The test report title, so the user can find the generated artifact in the Aiden UI
  • A link to the artifact when you have enough context to form one (usually `/teams/<teamId>/docs?artifactId=<artifactId>`)
  • A brief summary: what was tested, how many steps passed/failed, any issues found
  • List any bugs or concerns discovered during testing
Do not lead with a raw UUID. Only include the artifact/report ID as secondary debug context if no usable title or link is available.
Do NOT render the full report as markdown. The MCP tool call creates the report as a structured artifact that the UI displays interactively.