e2e-tests-studio

E2E Behavior Validation for Frontend Modifications

Core Principle: Test Product Behavior, Not UI States

CRITICAL: Tests must verify that product features WORK correctly, not just that UI elements render.

What NOT to test (UI States):

❌ "Dropdown opens when clicked"
❌ "Modal appears after button click"
❌ "Loading spinner shows during request"
❌ "Form fields are visible"
❌ "Sidebar collapses"

What TO test (Product Behavior):

✅ "Selecting an LLM provider configures the agent to use that provider"
✅ "Creating a new agent persists it and shows in the agents list"
✅ "Running a tool with parameters returns the expected output"
✅ "Chat messages stream correctly and maintain conversation context"
✅ "Workflow execution triggers tools in the correct order"

Prerequisites

Requires the Playwright MCP server. If the `browser_navigate` tool is unavailable, instruct the user to add it:

claude mcp add playwright -- npx @playwright/mcp@latest

Step 1: Understand the Feature Intent

Before writing ANY test, answer these questions:

What user problem does this feature solve?
What is the expected outcome when the feature works correctly?
What data flows through the system? (user input → API → state → UI)
What should persist after page reload?
What downstream effects should this action have?

Document these answers as comments in your test file.
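
One possible way to keep these answers consistent across test files is to hold them in a small typed record and render it as a comment header. This is only a sketch: the `FeatureIntent` type, its field names, and `renderIntentComment` are hypothetical and not part of the project, and the sample answers are illustrative.

```typescript
// Hypothetical shape for the five Step 1 answers; field names are illustrative.
interface FeatureIntent {
  userProblem: string;
  expectedOutcome: string;
  dataFlow: string;
  persistsAfterReload: string;
  downstreamEffects: string;
}

// Render the answers as a block comment to paste at the top of a test file.
function renderIntentComment(intent: FeatureIntent): string {
  const rows: Array<[string, string]> = [
    ['USER PROBLEM', intent.userProblem],
    ['EXPECTED OUTCOME', intent.expectedOutcome],
    ['DATA FLOW', intent.dataFlow],
    ['PERSISTS AFTER RELOAD', intent.persistsAfterReload],
    ['DOWNSTREAM EFFECTS', intent.downstreamEffects],
  ];
  const body = rows.map(([label, value]) => ` * ${label}: ${value}`).join('\n');
  return `/**\n${body}\n */`;
}

// Illustrative answers for a provider-selection feature.
const intentComment = renderIntentComment({
  userProblem: 'Switch the LLM provider an agent uses',
  expectedOutcome: 'Subsequent agent requests use the selected provider',
  dataFlow: 'user input -> API -> state -> UI',
  persistsAfterReload: 'The selected provider',
  downstreamEffects: 'Chat requests carry the new provider',
});
console.log(intentComment);
```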

Step 2: Build and Start

pnpm build:cli
cd packages/playground/e2e/kitchen-sink && pnpm dev

Verify that the server is running at http://localhost:4111
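
If a scripted check is preferred over opening the URL by hand, a polling helper along these lines could be used. It is a sketch under assumptions: `waitForServer`, its parameters, and the retry defaults are hypothetical, and it relies on the global `fetch` available in Node 18 and later.

```typescript
// Minimal response shape this sketch relies on.
type ProbeResponse = { ok: boolean };
type Probe = (url: string) => Promise<ProbeResponse>;

// Poll `url` until it answers with a successful status or attempts run out.
// `probe` defaults to the global fetch (Node 18+); it is injectable for testing.
async function waitForServer(
  url: string,
  attempts = 30,
  delayMs = 1000,
  probe: Probe = (u) => fetch(u),
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await probe(url);
      if (res.ok) return true;
    } catch {
      // Server is not accepting connections yet; fall through and retry.
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

Calling `waitForServer('http://localhost:4111')` resolves to `true` once the dev server responds, or `false` after the attempts are exhausted.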

Step 3: Map Feature to Behavior Tests

Feature-to-Test Mapping Guide

Feature Category	What to Test	Example Assertion
Agent Configuration	Config changes affect agent behavior	Send message → verify response uses selected model
LLM Provider Selection	Selected provider is used in requests	Intercept API call → verify provider in request payload
Tool Execution	Tool runs with correct params & returns result	Execute tool → verify output matches expected transformation
Workflow Execution	Steps execute in order, data flows between steps	Run workflow → verify each step's output feeds next step
Chat/Streaming	Messages persist, context maintained across turns	Multi-turn conversation → verify context awareness
MCP Server Tools	Server tools are callable and return data	Call MCP tool → verify response structure and content
Memory/Persistence	Data survives page reload	Create item → reload → verify item exists
Error Handling	Errors surface correctly to user	Trigger error condition → verify error message + recovery

Step 4: Write Behavior-Focused Tests

Test Structure Template

import { test, expect, Page } from '@playwright/test';
import { resetStorage } from '../__utils__/reset-storage';
import { selectFixture } from '../__utils__/select-fixture';
import { nanoid } from 'nanoid';

/**
 * FEATURE: [Name of feature]
 * USER STORY: As a user, I want to [action] so that [outcome]
 * BEHAVIOR UNDER TEST: [Specific behavior being validated]
 */

test.describe('[Feature Name] - Behavior Tests', () => {
  let page: Page;

  test.beforeEach(async ({ browser }) => {
    const context = await browser.newContext();
    page = await context.newPage();
  });

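  test.afterEach(async () => {
    await resetStorage(page);
    // Close the context created in beforeEach so browser contexts do not accumulate
    await page.context().close();
  });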

  test('should [verb describing behavior] when [trigger condition]', async () => {
    // ARRANGE: Set up preconditions
    // - Navigate to the feature
    // - Configure any required state
    // ACT: Perform the user action that triggers the behavior
    // ASSERT: Verify the OUTCOME, not the UI state
    // - Check data persistence
    // - Verify downstream effects
    // - Confirm API calls made correctly
  });
});

Behavior Test Patterns

Pattern 1: Configuration Affects Behavior

test('selecting LLM provider should use that provider for agent responses', async () => {
  // ARRANGE
  await page.goto('/agents/my-agent/chat');

  // Intercept API to verify provider
  let capturedProvider: string | null = null;
  await page.route('**/api/chat', route => {
    const body = JSON.parse(route.request().postData() || '{}');
    capturedProvider = body.provider;
    route.continue();
  });

  // ACT: Select a different provider
  await page.getByTestId('provider-selector').click();
  await page.getByRole('option', { name: 'OpenAI' }).click();

  // Send a message to trigger the agent
  await page.getByTestId('chat-input').fill('Hello');
  await page.getByTestId('send-button').click();

  // ASSERT: Verify the selected provider was used
  await expect.poll(() => capturedProvider).toBe('openai');
});
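
The route handler above calls `JSON.parse` directly, which throws if a matched request carries a body that is not JSON. A more defensive variant might look like the sketch below. `readProvider` is a hypothetical helper name, and it assumes the payload carries a top-level `provider` string as in the example above.

```typescript
// Hypothetical helper: extract a top-level `provider` string from a request body.
// Returns null when the body is missing, not valid JSON, or has no string provider.
function readProvider(postData: string | null): string | null {
  if (!postData) return null;
  try {
    const body: unknown = JSON.parse(postData);
    if (typeof body === 'object' && body !== null && 'provider' in body) {
      const provider = (body as { provider: unknown }).provider;
      return typeof provider === 'string' ? provider : null;
    }
    return null;
  } catch {
    // Body was not JSON; report "no provider" instead of crashing the handler.
    return null;
  }
}

console.log(readProvider('{"provider":"openai"}')); // openai
console.log(readProvider('not json')); // null
```

Inside the handler, the assignment would then become `capturedProvider = readProvider(route.request().postData());`.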

Pattern 2: Data Persistence

test('created agent should persist after page reload', async () => {
  // ARRANGE
  await page.goto('/agents');
  const agentName = `Test Agent ${nanoid()}`;

  // ACT: Create new agent
  await page.getByTestId('create-agent-button').click();
  await page.getByTestId('agent-name-input').fill(agentName);
  await page.getByTestId('save-agent-button').click();

  // Wait for creation to complete
  await expect(page.getByText(agentName)).toBeVisible();

  // ASSERT: Verify persistence
  await page.reload();
  await expect(page.getByText(agentName)).toBeVisible({ timeout: 10000 });
});

Pattern 3: Tool Execution Produces Correct Output

test('weather tool should return formatted weather data', async () => {
  // ARRANGE
  await selectFixture(page, 'weather-success');
  await page.goto('/tools/weather-tool');

  // ACT: Execute tool with parameters
  await page.getByTestId('param-city').fill('San Francisco');
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Verify OUTPUT content, not just that output appears
  const output = page.getByTestId('tool-output');
  await expect(output).toContainText('temperature');
  await expect(output).toContainText('San Francisco');

  // Verify structured data if applicable
  const outputText = await output.textContent();
  const outputData = JSON.parse(outputText || '{}');
  expect(outputData).toHaveProperty('temperature');
  expect(outputData).toHaveProperty('conditions');
});

Pattern 4: Workflow Step Chaining

test('workflow should pass data between steps correctly', async () => {
  // ARRANGE
  await selectFixture(page, 'workflow-multi-step');
  const sessionId = nanoid();
  await page.goto(`/workflows/data-pipeline?session=${sessionId}`);

  // ACT: Trigger workflow execution
  await page.getByTestId('workflow-input').fill('test input data');
  await page.getByTestId('run-workflow-button').click();

  // ASSERT: Verify each step received correct input from previous step
  // Wait for completion
  await expect(page.getByTestId('workflow-status')).toHaveText('completed', { timeout: 30000 });

  // Check step outputs show data transformation chain
  const step1Output = await page.getByTestId('step-1-output').textContent();
  const step2Output = await page.getByTestId('step-2-output').textContent();

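  // Verify step 2 received step 1's output as input
  // (guard against an empty step 1 output, which would make toContain pass trivially)
  expect(step1Output).toBeTruthy();
  expect(step2Output).toContain(step1Output);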
});
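
For workflows with more than two steps, the same containment check can be applied along the whole chain. The sketch below uses a hypothetical helper, `firstBrokenLink`, and assumes (as the assertion above does) that each step's displayed output embeds the previous step's output verbatim.

```typescript
// Hypothetical helper: return the index of the first step whose output does not
// contain the previous step's output, or -1 if the whole chain is intact.
// An empty or missing output counts as a break, because ''.includes('') is
// always true and would otherwise hide a step that produced nothing.
function firstBrokenLink(outputs: Array<string | null>): number {
  for (let i = 0; i < outputs.length; i++) {
    const current = outputs[i];
    if (!current) return i;
    // The previous entry is non-empty here: the loop would have returned otherwise.
    if (i > 0 && !current.includes(outputs[i - 1] as string)) return i;
  }
  return -1;
}

console.log(firstBrokenLink(['A', 'A+B', 'A+B+C'])); // -1
console.log(firstBrokenLink(['A', 'B'])); // 1
console.log(firstBrokenLink(['', 'x'])); // 0
```

In a test, the step outputs collected with `textContent()` could be passed in as an array and the result asserted with `expect(firstBrokenLink(outputs)).toBe(-1)`.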

Pattern 5: Streaming Chat with Context

test('chat should maintain conversation context across messages', async () => {
  // ARRANGE
  await selectFixture(page, 'contextual-chat');
  const chatId = nanoid();
  await page.goto(`/agents/assistant/chat/${chatId}`);

  // ACT: Multi-turn conversation
  await page.getByTestId('chat-input').fill('My name is Alice');
  await page.getByTestId('send-button').click();
  await expect(page.getByTestId('assistant-message').last()).toBeVisible({ timeout: 20000 });

  await page.getByTestId('chat-input').fill('What is my name?');
  await page.getByTestId('send-button').click();

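  // ASSERT: Verify context was maintained
  // Wait for the second reply so the check cannot pass against the first one
  await expect(page.getByTestId('assistant-message')).toHaveCount(2, { timeout: 20000 });
  const response = page.getByTestId('assistant-message').last();
  await expect(response).toContainText('Alice', { timeout: 20000 });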
});

Pattern 6: Error Recovery

test('should show actionable error and allow retry when API fails', async () => {
  // ARRANGE: Set up failure fixture
  await selectFixture(page, 'api-failure');
  await page.goto('/tools/flaky-tool');

  // ACT: Trigger the error
  await page.getByTestId('execute-tool-button').click();

  // ASSERT: Error is shown with recovery option
  await expect(page.getByTestId('error-message')).toContainText('failed');
  await expect(page.getByTestId('retry-button')).toBeVisible();

  // Switch to success fixture and retry
  await selectFixture(page, 'api-success');
  await page.getByTestId('retry-button').click();

  // Verify recovery worked
  await expect(page.getByTestId('tool-output')).toBeVisible({ timeout: 10000 });
  await expect(page.getByTestId('error-message')).not.toBeVisible();
});

Step 5: Update Existing Tests

When a test file already exists:

Read the existing tests to understand current coverage
Identify if tests are UI-focused or behavior-focused
Refactor UI-focused tests to verify behavior instead:

Refactoring Example

BEFORE (UI-focused):

test('dropdown opens when clicked', async () => {
  await page.getByTestId('model-dropdown').click();
  await expect(page.getByRole('listbox')).toBeVisible();
});

AFTER (Behavior-focused):

test('selecting model from dropdown updates agent configuration', async () => {
  // Open dropdown and select model
  await page.getByTestId('model-dropdown').click();
  await page.getByRole('option', { name: 'GPT-4' }).click();

  // Verify the selection persists and affects behavior
  await page.reload();
  await expect(page.getByTestId('model-dropdown')).toHaveText('GPT-4');

  // Optionally: verify the model is used in actual requests
  // (via request interception or checking response metadata)
});

Step 6: Kitchen-Sink Fixtures for Behavior Testing

Fixtures should represent realistic scenarios, not just mock data:

Fixture Naming Convention

<feature>-<scenario>.fixture.ts

Examples:
- agent-with-tools.fixture.ts
- chat-multi-turn-context.fixture.ts
- workflow-parallel-execution.fixture.ts
- tool-validation-error.fixture.ts
- mcp-server-timeout.fixture.ts
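
The convention can be checked mechanically. The sketch below is one possible validator; the restriction to lowercase letters and digits separated by hyphens is an assumption inferred from the examples above, and `isValidFixtureName` is a hypothetical name.

```typescript
// Matches <feature>-<scenario>.fixture.ts with lowercase, hyphen-separated words.
// At least one hyphen is required so that both a feature and a scenario are present.
const FIXTURE_NAME = /^[a-z0-9]+(?:-[a-z0-9]+)+\.fixture\.ts$/;

function isValidFixtureName(fileName: string): boolean {
  return FIXTURE_NAME.test(fileName);
}

console.log(isValidFixtureName('agent-with-tools.fixture.ts')); // true
console.log(isValidFixtureName('mcp-server-timeout.fixture.ts')); // true
console.log(isValidFixtureName('AgentTools.fixture.ts')); // false
console.log(isValidFixtureName('agent.fixture.ts')); // false
```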

Fixture Content Requirements

Each fixture must define:

Scenario description (what behavior it enables testing)
Expected outcomes (what assertions should pass)
Edge cases covered (error states, empty states, etc.)

// fixtures/agent-provider-switch.fixture.ts
export const agentProviderSwitch = {
  name: 'agent-provider-switch',
  description: 'Tests that switching LLM providers changes agent behavior',

  // Mock responses for different providers
  responses: {
    openai: { content: 'Response from OpenAI', model: 'gpt-4' },
    anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
  },

  expectedBehavior: {
    // When provider is switched, subsequent messages use new provider
    providerSwitchAffectsNextMessage: true,
    // Provider selection persists across page reload
    providerPersistsOnReload: true,
  },
};
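
The three requirements could also be enforced with a small type and a check function. This is a sketch only: `BehaviorFixture`, `checkFixture`, and the `edgeCases` field are hypothetical (the example above has no edge-case field), so treat the shape as an assumption rather than the project's fixture format.

```typescript
// Hypothetical fixture shape covering the three required items.
// `edgeCases` is an assumed field name; the example above does not include one.
interface BehaviorFixture {
  name: string;
  description: string;
  expectedBehavior: Record<string, boolean>;
  edgeCases: string[];
}

// Return a list of problems; an empty list means the fixture meets the requirements.
function checkFixture(fixture: BehaviorFixture): string[] {
  const problems: string[] = [];
  if (fixture.description.trim() === '') problems.push('missing scenario description');
  if (Object.keys(fixture.expectedBehavior).length === 0) problems.push('no expected outcomes');
  if (fixture.edgeCases.length === 0) problems.push('no edge cases listed');
  return problems;
}

const fixtureProblems = checkFixture({
  name: 'agent-provider-switch',
  description: 'Tests that switching LLM providers changes agent behavior',
  expectedBehavior: { providerSwitchAffectsNextMessage: true },
  edgeCases: [],
});
console.log(fixtureProblems); // [ 'no edge cases listed' ]
```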

Step 7: Run and Validate

cd packages/playground && pnpm test:e2e

Test Quality Checklist

Before considering tests complete, verify:

Each test has a clear user story comment
Tests verify OUTCOMES, not intermediate UI states
Tests would FAIL if the feature broke (not just if UI changed)
Persistence is verified via `page.reload()` where applicable
Error scenarios are covered
Tests use appropriate timeouts for async operations
Fixtures represent realistic usage scenarios

Quick Reference

Step	Command/Action
Build	`pnpm build:cli`
Start	`cd packages/playground/e2e/kitchen-sink && pnpm dev`
App URL	http://localhost:4111
Routes	`@packages/playground/src/App.tsx`
Run tests	`cd packages/playground && pnpm test:e2e`
Test dir	`packages/playground/e2e/tests/`
Fixtures	`packages/playground/e2e/kitchen-sink/fixtures/`

Anti-Patterns to Avoid

❌ Don't	✅ Do Instead
Test that modal opens	Test that modal action completes and persists
Test that button is clickable	Test that clicking button produces expected result
Test loading spinner appears	Test that loaded data is correct
Test form validation message shows	Test that invalid form cannot submit AND valid form succeeds
Test dropdown has options	Test that selecting option changes system behavior
Test sidebar navigation works	Test that navigated page has correct data/functionality
Assert element is visible	Assert element contains expected data/state
