Usability Tester
Validate that users can successfully complete core tasks through systematic observation.
Core Principle
Watch users struggle. The best way to find UX issues is to observe real users attempting real tasks. Their struggles reveal truth that surveys and analytics cannot.
Test Planning
1. Define Test Objectives
```yaml
Good Objectives:
  - "Can users complete onboarding in <5 minutes?"
  - "Can users find and use the export feature?"
  - "Do users understand the pricing page?"

Bad Objectives:
  - "Test the UI" (too vague)
  - "See if users like it" (subjective, not behavioral)
```

2. Research Questions
```yaml
Examples:
  - Where do users get stuck during sign-up?
  - Can users find the settings page?
  - Do users understand what each tier includes?
  - What errors do users encounter?
```

3. Identify Core Tasks
Choose 3-5 tasks that represent key user journeys:
```yaml
Example Tasks (Project Management Tool):
  1. Sign up and create account
  2. Create your first project
  3. Invite a team member
  4. Assign a task to someone
  5. Export project data
```

4. Recruit Participants
```yaml
Sample Size:
  - 5-8 users per persona
  - After 5 users, diminishing returns (Nielsen's research)
  - Test in waves: 5 users → fix issues → test 5 more

Recruitment Criteria:
  - Match target persona
  - Haven't used product before (for onboarding tests)
  - Or: Active users (for feature tests)

Incentives:
  - $50-100 per hour (B2C)
  - $100-200 per hour (B2B professionals)
  - Gift cards work well
```

Task Scenarios
Best Practices
✅ Good task scenario:
"Your team is launching a new project next week. Create a project
called 'Q2 Launch' and invite john@example.com to collaborate."

Why it works:
- Realistic context
- Clear goal
- Natural language
- Doesn't give step-by-step instructions
❌ Bad task scenario:
"Click the 'New Project' button, then enter 'Q2 Launch', then
click Settings, then click Invite, then enter email."

Why it fails:
- Step-by-step instructions
- No context
- Doesn't test discoverability
- User just follows orders
Task Scenario Template
```yaml
Scenario: [Context/Motivation]
Goal: [What they need to accomplish]
Success Criteria: [How to know they completed it]

Example:
  Scenario: You're preparing for a client meeting tomorrow and need to review past conversations.
  Goal: Find all conversations with "Acme Corp" from the last 30 days
  Success Criteria: User successfully uses search/filter to find conversations
```

Conducting Tests
Think-Aloud Protocol
Key instruction to participant:
"Please think aloud as you work. Tell me what you're looking for,
what you're thinking, what you're trying to do. There are no
wrong answers - we're testing the product, not you."

What to listen for:
- "I'm looking for..." (what they expect)
- "I thought this would..." (mental models)
- "This is confusing because..." (friction points)
- "I'm not sure if..." (uncertainty)
Facilitation Rules
✅ Do:
- Observe silently
- Take notes
- Let them struggle (reveals issues)
- Ask follow-up questions AFTER task
- Stay neutral
❌ Don't:
- Help or explain
- Lead them ("maybe try clicking...")
- Defend design choices
- Interrupt during task
- Show frustration
Questions to Ask After Each Task
```yaml
Completion Questions:
  - 'On a scale of 1-5, how easy was that task?'
  - 'What were you expecting to see?'
  - 'What was confusing about that?'
  - 'If you could change one thing, what would it be?'

Discovery Questions:
  - 'Where did you expect to find that?'
  - 'What do you think this [feature] does?'
  - 'Why did you click there?'
```

Metrics to Track
Task Success Rate
```yaml
Measurement:
  - Completed: User achieved goal without help
  - Partial: User achieved goal with hints
  - Failed: User could not complete task

Calculation: Task Success Rate = (Completed Tasks / Total Attempts) × 100

Target: ≥80% for core tasks
```
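The calculation is simple enough to automate when tallying sessions. A minimal sketch (Python; the function name and sample counts are illustrative):

```python
def task_success_rate(completed: int, total_attempts: int) -> float:
    """Task Success Rate = (Completed Tasks / Total Attempts) x 100."""
    if total_attempts == 0:
        raise ValueError("no attempts recorded")
    return completed / total_attempts * 100

# 7 of 8 participants completed the task without help
rate = task_success_rate(7, 8)
print(f"{rate:.1f}%")                                   # 87.5%
print("meets target" if rate >= 80 else "below target")
```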
Time on Task
```yaml
Measurement:
  - Start timer when task begins
  - Stop when user completes or gives up

Analysis:
  - Compare to baseline/previous tests
  - Identify outliers (very fast or very slow)

Target: Varies by task complexity
  - Simple task (e.g., log in): <30 seconds
  - Medium task (e.g., create project): 1-2 minutes
  - Complex task (e.g., configure integration): 3-5 minutes
```
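Spotting the outliers mentioned in the analysis step can be as simple as flagging times far from the mean. A sketch under an assumed two-standard-deviation convention (Python; names and sample times are illustrative):

```python
from statistics import mean, stdev

def find_outliers(times_s: list[float], k: float = 2.0) -> list[float]:
    """Return task times more than k standard deviations from the mean."""
    if len(times_s) < 2:
        return []
    m, s = mean(times_s), stdev(times_s)
    return [t for t in times_s if abs(t - m) > k * s]

# Seconds on task for 8 participants; one got badly stuck
print(find_outliers([84, 91, 78, 88, 95, 82, 90, 300]))  # [300]
```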
Error Rate
```yaml
Errors:
  - Wrong path taken
  - Incorrect button clicked
  - Had to backtrack
  - Gave up and tried different approach

Calculation: Errors per Task = Total Errors / Number of Users

Target: <2 errors per task
```
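Tallying errors per task follows directly from the formula above (Python sketch; names and counts are illustrative):

```python
def errors_per_task(total_errors: int, num_users: int) -> float:
    """Errors per Task = Total Errors / Number of Users."""
    if num_users == 0:
        raise ValueError("no users observed")
    return total_errors / num_users

# 8 users made 10 observed errors on the invite task
rate = errors_per_task(10, 8)
print(f"{rate:.2f} errors per user")                    # 1.25 errors per user
print("within target" if rate < 2 else "exceeds target")
```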
Satisfaction Rating
```yaml
Post-Task Question:
  "How satisfied are you with completing this task?" (1-5 scale)
  1 = Very Dissatisfied
  2 = Dissatisfied
  3 = Neutral
  4 = Satisfied
  5 = Very Satisfied

Target: ≥4.0 average
```

Issue Severity Rating
Severity Formula
Severity = Impact × Frequency

Impact Scale (1-3)
```yaml
1 - Low Impact:
  - Minor inconvenience
  - User can easily recover
  - Cosmetic issue

2 - Medium Impact:
  - Causes delay or confusion
  - User eventually figures it out
  - Moderate frustration

3 - High Impact:
  - Blocks task completion
  - User cannot proceed without help
  - Critical to core functionality
```

Frequency Scale (1-3)
```yaml
1 - Rare:
  - Only 1-2 users encountered
  - Edge case
  - Specific conditions

2 - Occasional:
  - 3-5 users encountered
  - Somewhat common
  - Specific user types

3 - Frequent:
  - Most/all users encountered
  - Consistent issue
  - All user types
```

Combined Severity
```yaml
Critical (score 9):
  - Impact: 3, Frequency: 3
  - Blocks most users
  → Fix immediately, before release

High (score 6):
  - Impact: 3, Frequency: 2 OR Impact: 2, Frequency: 3
  - Significant delay or frequent minor issue
  → Fix before release

Medium (score 3-4):
  - Impact: 2, Frequency: 2 OR Impact: 3, Frequency: 1
  - Minor frustration or rare blocker
  → Fix in next release

Low (score 1-3):
  - Impact: 1, any frequency
  - Cosmetic or rare minor issue
  → Backlog
```
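Because the bands are defined by (impact, frequency) combinations rather than by raw score alone, they can be encoded directly. A sketch (Python; combinations not listed in the table fall through to Low):

```python
def severity(impact: int, frequency: int) -> tuple[int, str]:
    """Score an issue (Impact x Frequency) and band it per the table above."""
    if not (1 <= impact <= 3 and 1 <= frequency <= 3):
        raise ValueError("impact and frequency must each be 1-3")
    score = impact * frequency
    if (impact, frequency) == (3, 3):
        band = "Critical"   # blocks most users -> fix immediately
    elif (impact, frequency) in {(3, 2), (2, 3)}:
        band = "High"       # fix before release
    elif (impact, frequency) in {(2, 2), (3, 1)}:
        band = "Medium"     # fix in next release
    else:
        band = "Low"        # backlog
    return score, band

print(severity(3, 3))   # (9, 'Critical')
print(severity(2, 3))   # (6, 'High')
print(severity(1, 3))   # (3, 'Low')
```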
System Usability Scale (SUS)
10-question survey (post-test, 1-5 Likert scale):
```yaml
Questions (Odd = Positive, Even = Negative):
  1. I think I would like to use this product frequently
  2. I found the product unnecessarily complex
  3. I thought the product was easy to use
  4. I think I would need support to use this product
  5. I found the various functions well integrated
  6. I thought there was too much inconsistency
  7. I imagine most people would learn this quickly
  8. I found the product cumbersome to use
  9. I felt very confident using the product
  10. I needed to learn a lot before getting going

Scoring:
  - Odd questions: Score - 1
  - Even questions: 5 - Score
  - Sum all scores
  - Multiply by 2.5
  - Result: 0-100 score

Interpretation:
  ≥80: Excellent
  68-79: Good (industry average)
  51-67: OK
  <51: Needs significant improvement
```
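The scoring procedure can be checked with a small helper (Python; `responses` holds the ten 1-5 answers in question order):

```python
def sus_score(responses: list[int]) -> float:
    """Compute a 0-100 SUS score from ten 1-5 Likert responses.

    Odd-numbered questions (positive): score - 1
    Even-numbered questions (negative): 5 - score
    Sum the adjusted values, then multiply by 2.5.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("need exactly ten responses, each 1-5")
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

print(sus_score([3] * 10))                        # 50.0 (all neutral)
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0 (best possible)
```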
Test Report Template
```yaml
usability_test_summary:
  date: '2024-01-20'
  participants: 8
  participant_profile: 'New users, age 25-45, tech-savvy'

  tasks:
    - task: 'Create a new project'
      success_rate: '87.5% (7/8)'
      avg_time: '1m 24s'
      errors: '1.2 per user'
      satisfaction: '4.3/5'
    - task: 'Invite team member'
      success_rate: '62.5% (5/8)'
      avg_time: '2m 45s'
      errors: '2.8 per user'
      satisfaction: '3.1/5'

  issues:
    - issue: "Users can't find 'Invite' button"
      severity: high
      impact: 3
      frequency: 3
      affected_users: '7/8'
      recommendation: "Move 'Invite' button to top of project page, make it more prominent"
    - issue: 'Confusion about project vs workspace'
      severity: medium
      impact: 2
      frequency: 3
      affected_users: '6/8'
      recommendation: 'Add tooltip explaining difference, update onboarding'
    - issue: 'Export button text unclear'
      severity: low
      impact: 1
      frequency: 2
      affected_users: '2/8'
      recommendation: "Change 'Export' to 'Export to CSV'"

  sus_score: '72 (Good)'

  key_insights:
    - 'Onboarding is smooth (87.5% success)'
    - 'Team collaboration features hard to discover'
    - 'Overall product easy to use once features are found'

  recommended_actions:
    - 'High priority: Redesign invite flow'
    - 'Medium priority: Add contextual help for workspace vs project'
    - 'Low priority: Update button labels'
```
Remote vs In-Person Testing
Remote Testing (Moderated)
Tools: Zoom, Google Meet, UserTesting.com
```yaml
Pros:
  - Can test with users anywhere
  - Lower cost (no travel)
  - Easier to recruit
  - Record sessions easily

Cons:
  - Can't see body language as well
  - Technical issues possible
  - Harder to build rapport
  - Screen sharing can lag

Best Practices:
  - Test your setup beforehand
  - Have backup communication method
  - Ask user to share screen + turn on camera
  - Record session (with permission)
```

In-Person Testing
```yaml
Pros:
  - See full body language
  - Better rapport
  - No technical issues
  - Can see facial expressions

Cons:
  - Limited geographic reach
  - Higher cost
  - Harder to schedule
  - Need physical space

Best Practices:
  - Set up quiet room
  - Have snacks/water
  - Use screen recording software
  - Position yourself behind/beside user
```

Test Frequency
```yaml
When to Test:
  - Pre-launch: Test prototypes/designs
  - Post-launch: Test new features
  - Ongoing: Test every major release
  - Quarterly: Full usability audit

Continuous Testing:
  - Week 1: Test with 5 users
  - Week 2: Fix issues
  - Week 3: Test with 5 new users
  - Repeat until success rate ≥80%
```

Tools & Software
```yaml
Remote Testing:
  - UserTesting.com (recruit + test)
  - UserZoom (enterprise solution)
  - Lookback (live testing)
  - Maze (unmoderated testing)

Recording:
  - Zoom (screen + audio)
  - Loom (quick recordings)
  - OBS (advanced recording)

Analysis:
  - Dovetail (organize insights)
  - Notion (collaborative notes)
  - Miro (affinity mapping)
  - Excel/Sheets (metrics tracking)
```

Quick Start Checklist
Planning Phase
- Define test objectives
- Write 3-5 task scenarios
- Recruit 5-8 participants
- Prepare test script
- Set up recording
Testing Phase
- Welcome participant
- Explain think-aloud protocol
- Conduct tasks (don't help!)
- Ask follow-up questions
- Administer SUS survey
- Thank participant
Analysis Phase
- Calculate success rates
- Identify common issues
- Rate issue severity
- Create report
- Share with team
- Prioritize fixes
Common Pitfalls
❌ Testing with employees: They know the product too well
❌ Helping users during tasks: Let them struggle to find real issues
❌ Only testing happy path: Test error cases and edge cases too
❌ Not enough participants: 5 minimum per persona
❌ Ignoring low-severity issues: They add up to poor experience
❌ Testing but not fixing: Usability tests are worthless if you don't act
Summary
Great usability testing:
- ✅ Test with 5-8 users per persona
- ✅ Use realistic task scenarios (not step-by-step)
- ✅ Think-aloud protocol (understand mental models)
- ✅ Don't help users during tasks
- ✅ Track success rate, time, errors, satisfaction
- ✅ Rate issues by severity (impact × frequency)
- ✅ Fix high-priority issues before release
- ✅ Test continuously, not just once