Usability Tester


Validate that users can successfully complete core tasks through systematic observation.

Core Principle


Watch users struggle. The best way to find UX issues is to observe real users attempting real tasks. Their struggles reveal truths that surveys and analytics cannot.

Test Planning


1. Define Test Objectives

```yaml
Good Objectives:
  - "Can users complete onboarding in <5 minutes?"
  - "Can users find and use the export feature?"
  - "Do users understand the pricing page?"

Bad Objectives:
  - "Test the UI" (too vague)
  - "See if users like it" (subjective, not behavioral)
```

2. Research Questions

```yaml
Examples:
  - Where do users get stuck during sign-up?
  - Can users find the settings page?
  - Do users understand what each tier includes?
  - What errors do users encounter?
```

3. Identify Core Tasks


Choose 3-5 tasks that represent key user journeys:
```yaml
Example Tasks (Project Management Tool):
  1. Sign up and create account
  2. Create your first project
  3. Invite a team member
  4. Assign a task to someone
  5. Export project data
```

4. Recruit Participants


```yaml
Sample Size:
  - 5-8 users per persona
  - After 5 users, diminishing returns (Nielsen's research)
  - Test in waves: 5 users → fix issues → test 5 more

Recruitment Criteria:
  - Match target persona
  - Haven't used product before (for onboarding tests)
  - Or: Active users (for feature tests)

Incentives:
  - $50-100 per hour (B2C)
  - $100-200 per hour (B2B professionals)
  - Gift cards work well
```

Task Scenarios


Best Practices


Good task scenario:
"Your team is launching a new project next week. Create a project
called 'Q2 Launch' and invite john@example.com to collaborate."
Why it works:
  • Realistic context
  • Clear goal
  • Natural language
  • Doesn't give step-by-step instructions
Bad task scenario:
"Click the 'New Project' button, then enter 'Q2 Launch', then
click Settings, then click Invite, then enter email."
Why it fails:
  • Step-by-step instructions
  • No context
  • Doesn't test discoverability
  • User just follows orders

Task Scenario Template


```yaml
Scenario: [Context/Motivation]
Goal: [What they need to accomplish]
Success Criteria: [How to know they completed it]

Example:
  Scenario: You're preparing for a client meeting tomorrow and need to review past conversations.
  Goal: Find all conversations with "Acme Corp" from the last 30 days
  Success Criteria: User successfully uses search/filter to find conversations
```

Conducting Tests


Think-Aloud Protocol


Key instruction to participant:
"Please think aloud as you work. Tell me what you're looking for,
what you're thinking, what you're trying to do. There are no
wrong answers - we're testing the product, not you."
What to listen for:
  • "I'm looking for..." (what they expect)
  • "I thought this would..." (mental models)
  • "This is confusing because..." (friction points)
  • "I'm not sure if..." (uncertainty)

Facilitation Rules


Do:
  • Observe silently
  • Take notes
  • Let them struggle (reveals issues)
  • Ask follow-up questions AFTER task
  • Stay neutral
Don't:
  • Help or explain
  • Lead them ("maybe try clicking...")
  • Defend design choices
  • Interrupt during task
  • Show frustration

Questions to Ask After Each Task


```yaml
Completion Questions:
  - 'On a scale of 1-5, how easy was that task?'
  - 'What were you expecting to see?'
  - 'What was confusing about that?'
  - 'If you could change one thing, what would it be?'

Discovery Questions:
  - 'Where did you expect to find that?'
  - 'What do you think this [feature] does?'
  - 'Why did you click there?'
```

Metrics to Track


Task Success Rate


```yaml
Measurement:
  - Completed: User achieved goal without help
  - Partial: User achieved goal with hints
  - Failed: User could not complete task

Calculation: Task Success Rate = (Completed Tasks / Total Attempts) × 100

Target: ≥80% for core tasks
```
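The calculation above can be sketched in a few lines of Python; the outcome labels and the 80% target follow the measurement scheme described here, while the function name is illustrative:

```python
def task_success_rate(outcomes):
    """Percentage of attempts completed without help.

    Only unassisted completions count toward the rate; "partial"
    (completed with hints) and "failed" do not.
    """
    completed = sum(1 for o in outcomes if o == "completed")
    return completed / len(outcomes) * 100

# Example: 8 participants attempting "Invite a team member"
outcomes = ["completed"] * 5 + ["partial", "failed", "failed"]
rate = task_success_rate(outcomes)
print(f"Success rate: {rate:.1f}%")  # 62.5%
print("Meets 80% target" if rate >= 80 else "Below 80% target")
```

Whether "partial" should count at all is a design choice; some teams report an assisted-completion rate separately rather than folding hints into success.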

Time on Task


```yaml
Measurement:
  - Start timer when task begins
  - Stop when user completes or gives up

Analysis:
  - Compare to baseline/previous tests
  - Identify outliers (very fast or very slow)

Target: Varies by task complexity
  - Simple task (e.g., log in): <30 seconds
  - Medium task (e.g., create project): 1-2 minutes
  - Complex task (e.g., configure integration): 3-5 minutes
```

Error Rate


```yaml
Errors:
  - Wrong path taken
  - Incorrect button clicked
  - Had to backtrack
  - Gave up and tried different approach

Calculation: Errors per Task = Total Errors / Number of Users

Target: <2 errors per task
```

Satisfaction Rating


```yaml
Post-Task Question:
  "How satisfied are you with completing this task?" (1-5 scale)

  1 = Very Dissatisfied
  2 = Dissatisfied
  3 = Neutral
  4 = Satisfied
  5 = Very Satisfied

Target: ≥4.0 average
```

Issue Severity Rating


Severity Formula


Severity = Impact × Frequency

Impact Scale (1-3)


```yaml
1 - Low Impact:
  - Minor inconvenience
  - User can easily recover
  - Cosmetic issue

2 - Medium Impact:
  - Causes delay or confusion
  - User eventually figures it out
  - Moderate frustration

3 - High Impact:
  - Blocks task completion
  - User cannot proceed without help
  - Critical to core functionality
```

Frequency Scale (1-3)


```yaml
1 - Rare:
  - Only 1-2 users encountered
  - Edge case
  - Specific conditions

2 - Occasional:
  - 3-5 users encountered
  - Somewhat common
  - Specific user types

3 - Frequent:
  - Most/all users encountered
  - Consistent issue
  - All user types
```

Combined Severity


```yaml
Critical (9):
  - Impact: 3, Frequency: 3
  - Blocks most users
  → Fix immediately before release

High (6):
  - Impact: 3, Frequency: 2 OR Impact: 2, Frequency: 3
  - Significant delay or frequent minor issue
  → Fix before release

Medium (3-4):
  - Impact: 2, Frequency: 2 OR Impact: 3, Frequency: 1
  - Minor frustration or rare blocker
  → Fix in next release

Low (1-3):
  - Impact: 1, any Frequency
  - Cosmetic or rare minor issue
  → Backlog
```

Note that with 1-3 scales the product can only be 1, 2, 3, 4, 6, or 9, and a score of 3 is ambiguous on its own: classify by the impact/frequency pair, so a rare blocker (3 × 1) outranks a frequent cosmetic issue (1 × 3).

System Usability Scale (SUS)


10-question survey (post-test, 1-5 Likert scale):
```yaml
Questions (Odd = Positive, Even = Negative):
  1. I think I would like to use this product frequently
  2. I found the product unnecessarily complex
  3. I thought the product was easy to use
  4. I think I would need support to use this product
  5. I found the various functions well integrated
  6. I thought there was too much inconsistency
  7. I imagine most people would learn this quickly
  8. I found the product cumbersome to use
  9. I felt very confident using the product
  10. I needed to learn a lot before getting going

Scoring:
  - Odd questions: Score - 1
  - Even questions: 5 - Score
  - Sum all scores
  - Multiply by 2.5
  - Result: 0-100 score

Interpretation:
  ≥80: Excellent
  68-79: Good (industry average)
  51-67: OK
  <51: Needs significant improvement
```

Test Report Template


```yaml
usability_test_summary:
  date: '2024-01-20'
  participants: 8
  participant_profile: 'New users, age 25-45, tech-savvy'

  tasks:
    - task: 'Create a new project'
      success_rate: '87.5% (7/8)'
      avg_time: '1m 24s'
      errors: 1.2 per user
      satisfaction: 4.3/5

    - task: 'Invite team member'
      success_rate: '62.5% (5/8)'
      avg_time: '2m 45s'
      errors: 2.8 per user
      satisfaction: 3.1/5

  issues:
    - issue: "Users can't find 'Invite' button"
      severity: high
      impact: 3
      frequency: 3
      affected_users: 7/8
      recommendation: "Move 'Invite' button to top of project page, make it more prominent"

    - issue: 'Confusion about project vs workspace'
      severity: medium
      impact: 2
      frequency: 3
      affected_users: 6/8
      recommendation: 'Add tooltip explaining difference, update onboarding'

    - issue: 'Export button text unclear'
      severity: low
      impact: 1
      frequency: 2
      affected_users: 2/8
      recommendation: "Change 'Export' to 'Export to CSV'"

  sus_score: 72 (Good)

  key_insights:
    - 'Onboarding is smooth (87.5% success)'
    - 'Team collaboration features hard to discover'
    - 'Overall product easy to use once features are found'

  recommended_actions:
    1. "High priority: Redesign invite flow"
    2. "Medium priority: Add contextual help for workspace vs project"
    3. "Low priority: Update button labels"
```

Remote vs In-Person Testing


Remote Testing (Moderated)


Tools: Zoom, Google Meet, UserTesting.com
```yaml
Pros:
  - Can test with users anywhere
  - Lower cost (no travel)
  - Easier to recruit
  - Record sessions easily

Cons:
  - Can't see body language as well
  - Technical issues possible
  - Harder to build rapport
  - Screen sharing can lag

Best Practices:
  - Test your setup beforehand
  - Have backup communication method
  - Ask user to share screen + turn on camera
  - Record session (with permission)
```

In-Person Testing


```yaml
Pros:
  - See full body language
  - Better rapport
  - No technical issues
  - Can see facial expressions

Cons:
  - Limited geographic reach
  - Higher cost
  - Harder to schedule
  - Need physical space

Best Practices:
  - Set up quiet room
  - Have snacks/water
  - Use screen recording software
  - Position yourself behind/beside user
```

Test Frequency


```yaml
When to Test:
  - Pre-launch: Test prototypes/designs
  - Post-launch: Test new features
  - Ongoing: Test every major release
  - Quarterly: Full usability audit

Continuous Testing:
  - Week 1: Test with 5 users
  - Week 2: Fix issues
  - Week 3: Test with 5 new users
  - Repeat until success rate ≥80%
```

Tools & Software


```yaml
Remote Testing:
  - UserTesting.com (recruit + test)
  - UserZoom (enterprise solution)
  - Lookback (live testing)
  - Maze (unmoderated testing)

Recording:
  - Zoom (screen + audio)
  - Loom (quick recordings)
  - OBS (advanced recording)

Analysis:
  - Dovetail (organize insights)
  - Notion (collaborative notes)
  - Miro (affinity mapping)
  - Excel/Sheets (metrics tracking)
```

Quick Start Checklist


Planning Phase


  • Define test objectives
  • Write 3-5 task scenarios
  • Recruit 5-8 participants
  • Prepare test script
  • Set up recording

Testing Phase


  • Welcome participant
  • Explain think-aloud protocol
  • Conduct tasks (don't help!)
  • Ask follow-up questions
  • Administer SUS survey
  • Thank participant

Analysis Phase


  • Calculate success rates
  • Identify common issues
  • Rate issue severity
  • Create report
  • Share with team
  • Prioritize fixes

Common Pitfalls


❌ Testing with employees: They know the product too well
❌ Helping users during tasks: Let them struggle to find real issues
❌ Only testing happy path: Test error cases and edge cases too
❌ Not enough participants: 5 minimum per persona
❌ Ignoring low-severity issues: They add up to poor experience
❌ Testing but not fixing: Usability tests are worthless if you don't act

Summary


Great usability testing:
  • ✅ Test with 5-8 users per persona
  • ✅ Use realistic task scenarios (not step-by-step)
  • ✅ Think-aloud protocol (understand mental models)
  • ✅ Don't help users during tasks
  • ✅ Track success rate, time, errors, satisfaction
  • ✅ Rate issues by severity (impact × frequency)
  • ✅ Fix high-priority issues before release
  • ✅ Test continuously, not just once