e2e-tests-studio
E2E Behavior Validation for Frontend Modifications
Core Principle: Test Product Behavior, Not UI States
CRITICAL: Tests must verify that product features WORK correctly, not just that UI elements render.
What NOT to test (UI States):
- ❌ "Dropdown opens when clicked"
- ❌ "Modal appears after button click"
- ❌ "Loading spinner shows during request"
- ❌ "Form fields are visible"
- ❌ "Sidebar collapses"
What TO test (Product Behavior):
- ✅ "Selecting an LLM provider configures the agent to use that provider"
- ✅ "Creating a new agent persists it and shows in the agents list"
- ✅ "Running a tool with parameters returns the expected output"
- ✅ "Chat messages stream correctly and maintain conversation context"
- ✅ "Workflow execution triggers tools in the correct order"
Prerequisites
Requires the Playwright MCP server. If the `browser_navigate` tool is unavailable, instruct the user to add it:
```sh
claude mcp add playwright -- npx @playwright/mcp@latest
```
Step 1: Understand the Feature Intent
Before writing ANY test, answer these questions:
- What user problem does this feature solve?
- What is the expected outcome when the feature works correctly?
- What data flows through the system? (user input → API → state → UI)
- What should persist after page reload?
- What downstream effects should this action have?
Document these answers as comments in your test file.
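One way to keep those answers next to the tests is to render them as a header comment. The sketch below is illustrative only: the `FeatureIntent` shape, its field names, and the sample values are assumptions, not part of the project.

```ts
// Hypothetical shape for the Step 1 answers; none of these names come from the project.
interface FeatureIntent {
  feature: string;
  userProblem: string;
  expectedOutcome: string;
  dataFlow: string[];
  persistsAfterReload: string[];
  downstreamEffects: string[];
}

// Render the answers as a JSDoc-style comment block for the top of a test file.
function renderIntentComment(intent: FeatureIntent): string {
  const lines = [
    `FEATURE: ${intent.feature}`,
    `USER PROBLEM: ${intent.userProblem}`,
    `EXPECTED OUTCOME: ${intent.expectedOutcome}`,
    `DATA FLOW: ${intent.dataFlow.join(' → ')}`,
    `PERSISTS AFTER RELOAD: ${intent.persistsAfterReload.join(', ') || 'nothing'}`,
    `DOWNSTREAM EFFECTS: ${intent.downstreamEffects.join(', ') || 'none'}`,
  ];
  return ['/**', ...lines.map((line) => ` * ${line}`), ' */'].join('\n');
}

// Sample values, invented for illustration.
const providerSelectionIntent: FeatureIntent = {
  feature: 'LLM provider selection',
  userProblem: 'Users need agents to run on the provider they choose',
  expectedOutcome: 'Requests sent after the switch use the selected provider',
  dataFlow: ['user input', 'API', 'state', 'UI'],
  persistsAfterReload: ['selected provider'],
  downstreamEffects: ['next chat message is routed to the new provider'],
};

console.log(renderIntentComment(providerSelectionIntent));
```

Writing the answers down in a fixed order makes it easier to spot a test that has no stated outcome or persistence expectation.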
Step 2: Build and Start
```sh
pnpm build:cli
cd packages/playground/e2e/kitchen-sink && pnpm dev
```
Verify the server is running at http://localhost:4111
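If the check needs to be scripted rather than done by hand, a small polling helper is one option. This is a sketch under assumptions: `waitForServer` and `Probe` are hypothetical names, and the probe is injected so the helper does not depend on any particular HTTP client.

```ts
// Minimal response shape needed here; the global fetch in Node 18+ satisfies it.
type Probe = (url: string) => Promise<{ ok: boolean }>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll `url` until the probe reports a successful response or the attempts run out.
// Returns true when the server answered successfully, false otherwise.
async function waitForServer(
  url: string,
  probe: Probe,
  attempts = 30,
  intervalMs = 1000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await probe(url);
      if (response.ok) return true;
    } catch {
      // Typically a refused connection while the dev server is still starting; retry below.
    }
    if (attempt < attempts) await sleep(intervalMs);
  }
  return false;
}

// Usage (Node 18+): waitForServer('http://localhost:4111', (url) => fetch(url))
```

Returning a boolean instead of throwing leaves the caller free to decide whether a server that never came up should abort the run or only produce a warning.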
Step 3: Map Feature to Behavior Tests
Feature-to-Test Mapping Guide
| Feature Category | What to Test | Example Assertion |
|---|---|---|
| Agent Configuration | Config changes affect agent behavior | Send message → verify response uses selected model |
| LLM Provider Selection | Selected provider is used in requests | Intercept API call → verify provider in request payload |
| Tool Execution | Tool runs with correct params & returns result | Execute tool → verify output matches expected transformation |
| Workflow Execution | Steps execute in order, data flows between steps | Run workflow → verify each step's output feeds next step |
| Chat/Streaming | Messages persist, context maintained across turns | Multi-turn conversation → verify context awareness |
| MCP Server Tools | Server tools are callable and return data | Call MCP tool → verify response structure and content |
| Memory/Persistence | Data survives page reload | Create item → reload → verify item exists |
| Error Handling | Errors surface correctly to user | Trigger error condition → verify error message + recovery |
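The "verify response structure and content" assertion in the MCP row can be split into two checks: first the shape, then the content. The sketch below assumes the tool response is an object with a `content` array of `{ type: 'text', text }` items and an optional `isError` flag; that shape, and the names `isToolResult` and `checkToolResult`, are assumptions to adapt to the payload the app actually returns.

```ts
// Assumed response shape: a list of text items plus an optional error flag.
interface TextItem {
  type: 'text';
  text: string;
}

interface ToolResult {
  content: TextItem[];
  isError?: boolean;
}

// Structural check: true only when `value` matches the assumed shape.
function isToolResult(value: unknown): value is ToolResult {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as { content?: unknown; isError?: unknown };
  if (!Array.isArray(candidate.content)) return false;
  if (candidate.isError !== undefined && typeof candidate.isError !== 'boolean') return false;
  return candidate.content.every(
    (item) =>
      typeof item === 'object' &&
      item !== null &&
      (item as { type?: unknown }).type === 'text' &&
      typeof (item as { text?: unknown }).text === 'string',
  );
}

// Structure first, then content: a well-formed but wrong answer should still be reported.
function checkToolResult(value: unknown, expectedSubstring: string): string[] {
  if (!isToolResult(value)) return ['response does not match the assumed tool result shape'];
  const problems: string[] = [];
  if (value.isError) problems.push('tool reported an error');
  const text = value.content.map((item) => item.text).join('\n');
  if (!text.includes(expectedSubstring)) {
    problems.push(`missing expected content: ${expectedSubstring}`);
  }
  return problems;
}
```

In a test, the returned list can be asserted to be empty, so a failure message names every problem at once instead of stopping at the first.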
Step 4: Write Behavior-Focused Tests
Test Structure Template
```ts
import { test, expect, Page } from '@playwright/test';
import { resetStorage } from '../__utils__/reset-storage';
import { selectFixture } from '../__utils__/select-fixture';
import { nanoid } from 'nanoid';

/**
 * FEATURE: [Name of feature]
 * USER STORY: As a user, I want to [action] so that [outcome]
 * BEHAVIOR UNDER TEST: [Specific behavior being validated]
 */
test.describe('[Feature Name] - Behavior Tests', () => {
  let page: Page;

  test.beforeEach(async ({ browser }) => {
    const context = await browser.newContext();
    page = await context.newPage();
  });

  test.afterEach(async () => {
    await resetStorage(page);
  });

  test('should [verb describing behavior] when [trigger condition]', async () => {
    // ARRANGE: Set up preconditions
    // - Navigate to the feature
    // - Configure any required state

    // ACT: Perform the user action that triggers the behavior

    // ASSERT: Verify the OUTCOME, not the UI state
    // - Check data persistence
    // - Verify downstream effects
    // - Confirm API calls made correctly
  });
});
```
Behavior Test Patterns
Pattern 1: Configuration Affects Behavior
```ts
test('selecting LLM provider should use that provider for agent responses', async () => {
  // ARRANGE
  await page.goto('/agents/my-agent/chat');

  // Intercept API to verify provider
  let capturedProvider: string | null = null;
  await page.route('**/api/chat', route => {
    const body = JSON.parse(route.request().postData() || '{}');
    capturedProvider = body.provider;
    route.continue();
  });

  // ACT: Select a different provider
  await page.getByTestId('provider-selector').click();
  await page.getByRole('option', { name: 'OpenAI' }).click();

  // Send a message to trigger the agent
  await page.getByTestId('chat-input').fill('Hello');
  await page.getByTestId('send-button').click();

  // ASSERT: Verify the selected provider was used
  await expect.poll(() => capturedProvider).toBe('openai');
});
```
Pattern 2: Data Persistence
```ts
test('created agent should persist after page reload', async () => {
  // ARRANGE
  await page.goto('/agents');
  const agentName = `Test Agent ${nanoid()}`;

  // ACT: Create new agent
  await page.getByTestId('create-agent-button').click();
  await page.getByTestId('agent-name-input').fill(agentName);
  await page.getByTestId('save-agent-button').click();

  // Wait for creation to complete
  await expect(page.getByText(agentName)).toBeVisible();

  // ASSERT: Verify persistence
  await page.reload();
  await expect(page.getByText(agentName)).toBeVisible({ timeout: 10000 });
});
```
Pattern 3: Tool Execution Produces Correct Output
```ts
test('weather tool should return formatted weather data', async () => {
  // ARRANGE
  await selectFixture(page, 'weather-success');
  await page.goto('/tools/weather-tool');

  // ACT: Execute tool with parameters
  await page.getByTestId('param-city').fill('San Francisco');
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Verify OUTPUT content, not just that output appears
  const output = page.getByTestId('tool-output');
  await expect(output).toContainText('temperature');
  await expect(output).toContainText('San Francisco');

  // Verify structured data if applicable
  const outputText = await output.textContent();
  const outputData = JSON.parse(outputText || '{}');
  expect(outputData).toHaveProperty('temperature');
  expect(outputData).toHaveProperty('conditions');
});
```
Pattern 4: Workflow Step Chaining
```ts
test('workflow should pass data between steps correctly', async () => {
  // ARRANGE
  await selectFixture(page, 'workflow-multi-step');
  const sessionId = nanoid();
  await page.goto(`/workflows/data-pipeline?session=${sessionId}`);

  // ACT: Trigger workflow execution
  await page.getByTestId('workflow-input').fill('test input data');
  await page.getByTestId('run-workflow-button').click();

  // ASSERT: Verify each step received correct input from previous step
  // Wait for completion
  await expect(page.getByTestId('workflow-status')).toHaveText('completed', { timeout: 30000 });

  // Check step outputs show data transformation chain
  const step1Output = await page.getByTestId('step-1-output').textContent();
  const step2Output = await page.getByTestId('step-2-output').textContent();

  // Guard against an empty step 1 output, which would make the containment check pass vacuously
  expect(step1Output).toBeTruthy();

  // Verify step 2 received step 1's output as input
  expect(step2Output).toContain(step1Output);
});
```
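For workflows with more than two steps, the same containment check can be applied along the whole chain. The helper below is a sketch, not project code: `findBrokenLink` is a hypothetical name, and it assumes, as Pattern 4 does, that each step's output literally embeds the previous step's output.

```ts
// Returns the index of the first step that breaks the chain, or -1 when the chain holds.
// A step breaks the chain when its output is empty or missing, or when it does not
// contain the previous step's output. Empty outputs are rejected explicitly because
// the empty string is contained in every string.
function findBrokenLink(stepOutputs: Array<string | null>): number {
  for (let i = 0; i < stepOutputs.length; i++) {
    const current = stepOutputs[i];
    if (!current) return i;
    if (i > 0) {
      // Non-empty by construction: the previous iteration would have returned otherwise.
      const previous = stepOutputs[i - 1] as string;
      if (!current.includes(previous)) return i;
    }
  }
  return -1;
}

console.log(findBrokenLink(['raw', 'clean(raw)', 'store(clean(raw))']));
console.log(findBrokenLink(['raw', 'clean(other)']));
```

In a Playwright test the array could be filled from `textContent()` calls on `step-1-output`, `step-2-output`, and so on, and the result asserted to equal `-1`; returning the index rather than a boolean tells the reader which step dropped the data.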
Pattern 5: Streaming Chat with Context
```ts
test('chat should maintain conversation context across messages', async () => {
  // ARRANGE
  await selectFixture(page, 'contextual-chat');
  const chatId = nanoid();
  await page.goto(`/agents/assistant/chat/${chatId}`);

  // ACT: Multi-turn conversation
  await page.getByTestId('chat-input').fill('My name is Alice');
  await page.getByTestId('send-button').click();
  await expect(page.getByTestId('assistant-message').last()).toBeVisible({ timeout: 20000 });

  await page.getByTestId('chat-input').fill('What is my name?');
  await page.getByTestId('send-button').click();

  // ASSERT: Verify context was maintained
  const response = page.getByTestId('assistant-message').last();
  await expect(response).toContainText('Alice', { timeout: 20000 });
});
```
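A streamed message becomes visible before it is complete, so a test that needs the full final text may want to wait until the stream has settled. The helper below is one possible sketch: `waitForStableText` and its parameters are hypothetical, and the text source is injected so the helper itself has no Playwright dependency.

```ts
type TextReader = () => Promise<string>;

const pause = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll `read` until it returns the same non-empty text as the previous read
// `repeats` times in a row, then return that text.
// Throws if the text has not settled after `maxReads` reads.
async function waitForStableText(
  read: TextReader,
  repeats = 3,
  intervalMs = 250,
  maxReads = 120,
): Promise<string> {
  let last = '';
  let unchanged = 0;
  for (let i = 0; i < maxReads; i++) {
    const current = await read();
    if (current !== '' && current === last) {
      unchanged++;
      if (unchanged >= repeats) return current;
    } else {
      unchanged = 0;
    }
    last = current;
    await pause(intervalMs);
  }
  throw new Error(`text did not stabilize after ${maxReads} reads`);
}

// Usage inside a Playwright test (locator name taken from Pattern 5):
// const finalText = await waitForStableText(
//   async () => (await page.getByTestId('assistant-message').last().textContent()) ?? '',
// );
```

Playwright's `toContainText` already retries until its timeout, so this helper is only worth reaching for when the assertion depends on the complete message rather than on a substring appearing.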
Pattern 6: Error Recovery
```ts
test('should show actionable error and allow retry when API fails', async () => {
  // ARRANGE: Set up failure fixture
  await selectFixture(page, 'api-failure');
  await page.goto('/tools/flaky-tool');

  // ACT: Trigger the error
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Error is shown with recovery option
  await expect(page.getByTestId('error-message')).toContainText('failed');
  await expect(page.getByTestId('retry-button')).toBeVisible();

  // Switch to success fixture and retry
  await selectFixture(page, 'api-success');
  await page.getByTestId('retry-button').click();

  // Verify recovery worked
  await expect(page.getByTestId('tool-output')).toBeVisible({ timeout: 10000 });
  await expect(page.getByTestId('error-message')).not.toBeVisible();
});
```
Step 5: Update Existing Tests
When a test file already exists:
- Read the existing tests to understand current coverage
- Identify if tests are UI-focused or behavior-focused
- Refactor UI-focused tests to verify behavior instead:
Refactoring Example
BEFORE (UI-focused):
```ts
test('dropdown opens when clicked', async () => {
  await page.getByTestId('model-dropdown').click();
  await expect(page.getByRole('listbox')).toBeVisible();
});
```
AFTER (Behavior-focused):
```ts
test('selecting model from dropdown updates agent configuration', async () => {
  // Open dropdown and select model
  await page.getByTestId('model-dropdown').click();
  await page.getByRole('option', { name: 'GPT-4' }).click();

  // Verify the selection persists and affects behavior
  await page.reload();
  await expect(page.getByTestId('model-dropdown')).toHaveText('GPT-4');

  // Optionally: verify the model is used in actual requests
  // (via request interception or checking response metadata)
});
```
Step 6: Kitchen-Sink Fixtures for Behavior Testing
Fixtures should represent realistic scenarios, not just mock data:
Fixture Naming Convention
<feature>-<scenario>.fixture.ts
Examples:
- agent-with-tools.fixture.ts
- chat-multi-turn-context.fixture.ts
- workflow-parallel-execution.fixture.ts
- tool-validation-error.fixture.ts
- mcp-server-timeout.fixture.ts
Fixture Content Requirements
Each fixture must define:
- Scenario description (what behavior it enables testing)
- Expected outcomes (what assertions should pass)
- Edge cases covered (error states, empty states, etc.)
```ts
// fixtures/agent-provider-switch.fixture.ts
export const agentProviderSwitch = {
  name: 'agent-provider-switch',
  description: 'Tests that switching LLM providers changes agent behavior',

  // Mock responses for different providers
  responses: {
    openai: { content: 'Response from OpenAI', model: 'gpt-4' },
    anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
  },

  expectedBehavior: {
    // When provider is switched, subsequent messages use new provider
    providerSwitchAffectsNextMessage: true,
    // Provider selection persists across page reload
    providerPersistsOnReload: true,
  },
};
```
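The naming convention and content requirements above can be checked mechanically. The sketch below is an assumption-laden illustration: the `BehaviorFixture` interface mirrors the fields of the example fixture, and `validateFixture` is a hypothetical helper, not something the project is known to provide.

```ts
// Mirrors the fields used by the example fixture; the interface itself is an assumption.
interface BehaviorFixture {
  name: string;
  description: string;
  responses: Record<string, unknown>;
  expectedBehavior: Record<string, boolean>;
}

// Returns a list of problems; an empty list means the fixture passes these checks.
function validateFixture(fixture: BehaviorFixture, fileName: string): string[] {
  const problems: string[] = [];

  // <feature>-<scenario>: lowercase kebab-case with at least two segments.
  if (!/^[a-z0-9]+(-[a-z0-9]+)+$/.test(fixture.name)) {
    problems.push(`name "${fixture.name}" does not follow <feature>-<scenario>`);
  }
  if (fileName !== `${fixture.name}.fixture.ts`) {
    problems.push(`file name "${fileName}" does not match "${fixture.name}.fixture.ts"`);
  }
  if (fixture.description.trim() === '') {
    problems.push('description is empty: state which behavior the fixture enables testing');
  }
  if (Object.keys(fixture.expectedBehavior).length === 0) {
    problems.push('expectedBehavior is empty: list the outcomes the assertions should confirm');
  }
  return problems;
}

const sampleFixture: BehaviorFixture = {
  name: 'agent-provider-switch',
  description: 'Tests that switching LLM providers changes agent behavior',
  responses: { openai: { content: 'Response from OpenAI', model: 'gpt-4' } },
  expectedBehavior: { providerSwitchAffectsNextMessage: true },
};

console.log(validateFixture(sampleFixture, 'agent-provider-switch.fixture.ts'));
```

A check like this could run as a plain unit test over the fixtures directory, so a fixture without a stated scenario or expected outcome is caught before any browser test depends on it.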
Step 7: Run and Validate
```sh
cd packages/playground && pnpm test:e2e
```
Test Quality Checklist
Before considering tests complete, verify:
- Each test has a clear user story comment
- Tests verify OUTCOMES, not intermediate UI states
- Tests would FAIL if the feature broke (not just if UI changed)
- Persistence is verified via `page.reload()` where applicable
- Error scenarios are covered
- Tests use appropriate timeouts for async operations
- Fixtures represent realistic usage scenarios
Quick Reference
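| Step | Command/Action |
|---|---|
| Build | `pnpm build:cli` |
| Start | `cd packages/playground/e2e/kitchen-sink && pnpm dev` |
| App URL | http://localhost:4111 |
| Routes | |
| Run tests | `cd packages/playground && pnpm test:e2e` |
| Test dir | |
| Fixtures | |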
Anti-Patterns to Avoid
| ❌ Don't | ✅ Do Instead |
|---|---|
| Test that modal opens | Test that modal action completes and persists |
| Test that button is clickable | Test that clicking button produces expected result |
| Test loading spinner appears | Test that loaded data is correct |
| Test form validation message shows | Test that invalid form cannot submit AND valid form succeeds |
| Test dropdown has options | Test that selecting option changes system behavior |
| Test sidebar navigation works | Test that navigated page has correct data/functionality |
| Assert element is visible | Assert element contains expected data/state |