tool-design-sprint-test-and-score
<!-- PM-Skills | https://github.com/product-on-purpose/pm-skills | Apache 2.0 -->
Design Sprint Test and Score (Friday)
Friday is the sprint's payoff. 5 target-profile customers run the prototype while the team observes; the team synthesizes observations into a scorecard against the sprint questions; the Decider makes the build / iterate / pivot / stop call by end-of-day. The week's 35-40 person-days plus customer recruiting cost converts into one actionable decision.
Family contract: `docs/reference/skill-families/design-sprint-skills-contract.md`. This skill is a member of `design-sprint-skills`.
When to Use
- It is Day 5 of the Design Sprint and Thursday's prototype passed its trial run.
- 5 confirmed participants are scheduled (the canonical cohort; 4 is acceptable if one participant cancels and there is no buffer participant; pause if fewer than 4).
- The team can observe interviews live (in-person or via Zoom breakout room) and synthesize during the day.
- The Decider is present Friday PM for the post-interview review (canonically 14:00-18:00 PT window covering observation of slots 4-5 plus Decider review by 17:30 PT).
When NOT to Use
- Thursday prototype did not pass trial run. Re-run trial; if still failing at 19:00 PT Thursday, postpone Friday.
- Fewer than 3 customers confirmed. Per Ratified Decision 3, the canonical guidance is 5 customers; 3-4 or 6-7 gets a documented warning; below 3 or above 7 should trigger a re-decision (postpone or split testing). Note: the v0.1.0 family validator does NOT mechanically enforce these thresholds (cohort count is in the EXAMPLE artifact, not in frontmatter); enforcement is a v2.16 validator-expansion candidate.
- Decider unavailable for the post-interview review window. Without Decider, the day produces observations without a call.
- The team plans to use this skill to write the executive memo. Per Ratified Decision 4: exec memo authoring is delegated to `foundation-stakeholder-update` (existing pm-skills foundation skill); this skill produces the Decider summary only.
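The cohort-size thresholds from Ratified Decision 3 can be read as a small classifier. The sketch below is illustrative only: the function name `cohort_status` and its return labels are hypothetical, and it is not part of the family validator (which, as noted above, does not enforce these thresholds in v0.1.0).

```python
def cohort_status(confirmed: int) -> str:
    """Classify a confirmed-customer count against the Ratified Decision 3 thresholds.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    if confirmed == 5:
        return "canonical"   # the canonical cohort size
    if 3 <= confirmed <= 7:
        return "warning"     # 3-4 or 6-7: proceed, but document the warning
    return "re-decide"       # below 3 or above 7: postpone or split testing
```

For example, `cohort_status(4)` returns `"warning"` and `cohort_status(2)` returns `"re-decide"`.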
What This Skill Produces
A single bundled artifact with six sections:
- Per-customer interview observation notes: one section per customer; covers Context (Act 2) reactions, Tasks (Act 4) behavior with timestamps, Debrief (Act 5) reactions including pricing. Captured live during the day's interviews.
- Best quotes: 5-15 verbatim customer quotes the team flags as most signal-bearing. Used in the Decider summary and in any downstream pitch or planning artifact.
- Scorecard grid: rows are the sprint questions (from Monday); columns are the 5 customers; each cell is Y / N / partial / unclear with a one-line note; rightmost column is the team's day-end decision per question (Validated / Invalidated / Inconclusive).
- Observed patterns: 4 buckets (worked, hesitated, broke trust, unexpected) with 2-4 patterns per bucket. Each pattern names how many customers showed it.
- Hot takes: one short paragraph per team member capturing their personal read on Friday before group synthesis biases the read. Written silently in parallel.
- Decider summary: the Decider's call (build / iterate / pivot / stop / reframe) plus the highest-confidence learning, the most important revision the team would make to the prototype direction, and the next artifact the team will produce (the post-sprint deliverable).
See `references/TEMPLATE.md` for the canonical structure and `references/EXAMPLE.md` for the Brainshelf book-catalog Friday artifact.
Friday Time Structure
Friday is the longest day: customer interviews start early (canonically 09:00 PT) and the Decider review concludes the day (canonically 17:30 PT).
- 09:00-16:30: 5 customer interviews of 50-60 minutes each at 09:00 / 10:30 / 12:00 / 14:00 / 15:30. Each slot: 10 min setup + 50-55 min interview + 5 min team huddle to capture observations before next customer.
- 13:00-14:00: Lunch (slot 3 wraps ~13:00; lunch overlaps the slot 3 to slot 4 buffer)
- 16:30-16:45: Last-customer wrap; observation note tidy
- 16:45-17:00: Team writes hot takes silently in parallel
- 17:00-17:30: Decider reviews scorecard + hot takes; makes the call
- 17:30-18:00: Decider summary captured; team begins post-sprint disposition (next-step calendar, downstream deliverable assignment)
This skill's 270-minute timebox covers the synthesis sections (scorecard, patterns, hot takes, Decider summary). The 5 interviews themselves (~5 hours of interview time) run in parallel with continuous observation capture.
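To sanity-check the interview cadence, the slack between consecutive slots can be computed from the listed start times. This is a rough sketch under two assumptions that the schedule does not state explicitly: the listed times are interview start times (with setup happening just before them), and each interview runs the full 55 minutes. The names `INTERVIEW_STARTS` and `slack_between_slots` are hypothetical.

```python
from datetime import datetime, timedelta

INTERVIEW_STARTS = ["09:00", "10:30", "12:00", "14:00", "15:30"]
SETUP_MIN = 10
INTERVIEW_MIN = 55   # upper end of the 50-55 minute range (assumption)
HUDDLE_MIN = 5

def slack_between_slots(starts=INTERVIEW_STARTS):
    """Minutes left between slots after interview, huddle, and the next setup."""
    times = [datetime.strptime(s, "%H:%M") for s in starts]
    overhead = timedelta(minutes=INTERVIEW_MIN + HUDDLE_MIN + SETUP_MIN)
    return [
        int((nxt - cur - overhead).total_seconds() // 60)
        for cur, nxt in zip(times, times[1:])
    ]

print(slack_between_slots())  # [20, 20, 50, 20]
```

Under these assumptions every transition leaves 20 minutes of slack, except the gap between slots 3 and 4, which leaves 50 minutes and is where lunch fits.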
Scorecard Mechanic
The scorecard is a 2-D grid. Rows are sprint questions from Monday's map-and-target (typically 3-7). Columns are the 5 customers (anonymized IDs). Each cell answers: did this customer's interview validate, invalidate, or leave inconclusive the row's question?
| Question | C1 | C2 | C3 | C4 | C5 | Day-end decision |
|---|---|---|---|---|---|---|
| Q1 | Y | Y | N | Y | partial | Inconclusive (3 of 5 Y, 1 of 5 N) |
| Q2 | N | Y | unclear | N | N | Inconclusive (3 of 5 N, 1 of 5 Y) |
| ... | ... | ... | ... | ... | ... | ... |
Day-end decision rules:
- Validated: 4 or 5 of 5 Y (strong signal); 3 of 5 Y with no N (directional). For 4-customer cohorts: 4 Y is Validated; 3 Y with no N is directional.
- Invalidated: 4 or 5 of 5 N. For 4-customer cohorts: 4 N is Invalidated; 3 N with no Y is directional.
- Inconclusive: all other patterns. Inconclusive questions get scheduled for follow-up (a smaller test, a quant experiment, or a second Design Sprint).
The Decider can override day-end decisions but should record reasoning.
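The day-end rules above can be sketched as a function over one scorecard row. This is a minimal illustration under a strict reading in which `partial` and `unclear` cells count as neither Y nor N; that reading, the function name `day_end_decision`, and the `"strong"` / `"directional"` labels in the second tuple element are assumptions of this sketch. A Decider override is a human step and is not modeled.

```python
from collections import Counter

def day_end_decision(cells):
    """Apply the day-end rules to one row of cell reads (4 or 5 customers).

    Returns (decision, signal). Cells other than "Y" and "N" (for example
    "partial" or "unclear") are counted toward neither side.
    """
    n = len(cells)
    if n not in (4, 5):
        raise ValueError("rules are only stated for 4- or 5-customer cohorts")
    counts = Counter(cells)
    yes, no = counts["Y"], counts["N"]

    if yes >= 4:                          # 4 or 5 of 5 Y; 4 of 4 Y
        return ("Validated", "strong")
    if yes == 3 and no == 0:              # 3 Y with no N
        return ("Validated", "directional")
    if no >= 4:                           # 4 or 5 of 5 N; 4 of 4 N
        return ("Invalidated", "strong")
    if n == 4 and no == 3 and yes == 0:   # stated for 4-customer cohorts only
        return ("Invalidated", "directional")
    return ("Inconclusive", None)         # all other patterns: schedule follow-up
```

For instance, `day_end_decision(["Y", "Y", "Y", "partial", "unclear"])` returns `("Validated", "directional")`, while `day_end_decision(["Y", "N", "partial", "Y", "unclear"])` returns `("Inconclusive", None)`.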
Common Pitfalls
- Observation notes too narrative, not behavioral. "Customer seemed confused" is a narrative; "Customer hovered on the capture button for 4 seconds without tapping, then tapped twice in rapid succession" is behavior. Behavior is data; narrative is interpretation.
- Scorecard cells filled in by consensus. Each observer writes their cell read; differences are surfaced, not averaged. If C1's read on Q1 is split 2 Y vs 2 N across the team, the cell is "split" with an explanatory note.
- Hot takes written after group synthesis. Hot takes are written SILENTLY and in PARALLEL before group synthesis. Writing them after a group debrief produces consensus, not signal.
- Decider hesitating on the call because "we want more data." Friday's job is to produce a call with the data you have. If the call truly cannot be made, the call is "iterate" (re-sprint with adjustments). "Defer" is not an answer.
- Skipping the Decider summary because "we'll write it up Monday." The summary is captured Friday before the team leaves. Monday is too late; context decays fast.
- Treating "5 customers" as a soft target. Per the canonical research, 5 is where confidence about patterns crosses an inflection point. Fewer than 4 produces noisy signal; more than 7 produces synthesis-overload without much marginal signal.
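The "surfaced, not averaged" rule for scorecard cells can be illustrated with a small helper that combines the observers' individual reads for one cell. This is a sketch only: the function name `cell_read` is hypothetical, and treating any disagreement (not just an even 2-vs-2 split) as "split" is an assumption of this example.

```python
from collections import Counter

def cell_read(observer_reads):
    """Combine per-observer reads for one scorecard cell without averaging.

    Returns (value, note). Unanimous reads pass through unchanged; any
    disagreement is recorded as "split" with the counts spelled out.
    """
    if not observer_reads:
        raise ValueError("at least one observer read is required")
    counts = Counter(observer_reads)
    if len(counts) == 1:
        return (observer_reads[0], "")
    note = " vs ".join(f"{n} {value}" for value, n in counts.most_common())
    return ("split", note)
```

With four observers split evenly, `cell_read(["Y", "Y", "N", "N"])` returns `("split", "2 Y vs 2 N")`, which keeps the disagreement visible for the Decider instead of hiding it behind a majority vote.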
Cross-Skill Usage
Prerequisites: `tool-design-sprint-prototype-plan`. Friday consumes the prototype and the interview script from Thursday. Without a working prototype that passed trial run, Friday cannot run.
This skill does NOT invoke `tool-note-and-vote`. Friday has no voting moment; the scorecard cells are individual reads and the Decider summary is the Decider's call.
This skill does NOT author an executive memo (per Ratified Decision 4). If the team wants an exec memo or stakeholder update, the next invocation is `foundation-stakeholder-update`, which consumes the Decider summary as input.
Downstream invocations after the sprint closes: `deliver-prd` (if Decider call is "build"); `measure-experiment-design` (if "iterate" requires a smaller follow-on experiment); `iterate-pivot-decision` (if "pivot" requires documenting the pivot rationale); `foundation-stakeholder-update` (if any of the above need stakeholder communication).
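The downstream routing can be summarized as a lookup from the Decider's call to the next skill. This is a simplified sketch: it treats the "iterate" and "pivot" follow-ons as defaults, although the text makes them conditional, and it returns nothing for "stop" or "reframe" because no downstream skill is named for those calls. The names `NEXT_SKILL` and `downstream_skills` are hypothetical.

```python
# Decider call -> downstream skill named in the text above.
NEXT_SKILL = {
    "build": "deliver-prd",
    "iterate": "measure-experiment-design",
    "pivot": "iterate-pivot-decision",
}
STAKEHOLDER_SKILL = "foundation-stakeholder-update"

def downstream_skills(call, needs_stakeholder_comms=False):
    """List the skills to invoke after the sprint closes, in order."""
    skills = []
    if call in NEXT_SKILL:
        skills.append(NEXT_SKILL[call])
    if needs_stakeholder_comms:
        skills.append(STAKEHOLDER_SKILL)
    return skills
```

For example, `downstream_skills("pivot", needs_stakeholder_comms=True)` returns `["iterate-pivot-decision", "foundation-stakeholder-update"]`.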
Canonical Sources
- Knapp, J., Zeratsky, J., and Kowitz, B. Sprint. Simon and Schuster, 2016. Friday chapter (Chapters 18-20).
- GV Design Sprint Guide. "Sprint Week Friday." https://www.gv.com/sprint/
- Character Capital. "Design Sprint Day 5." https://www.character.vc
- Google Design Sprint Kit. "Friday scorecard template + interview observation worksheet." https://designsprintkit.withgoogle.com/
- Nielsen, J. (2000). "Why You Only Need to Test with 5 Users." Nielsen Norman Group. https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/ (canonical research for the 5-customer cohort size).
Decider Checkpoint
This skill ends with a Decider Checkpoint in `references/TEMPLATE.md`. The Decider's call (build / iterate / pivot / stop / reframe) IS the checkpoint; the sprint cannot close without it. The checkpoint also captures the next artifact the team owns producing (a PRD, a smaller experiment, a pivot memo, or a stakeholder update), so that Monday's post-sprint work can start cleanly.
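As a rough illustration, the checkpoint can be modeled as a record that must hold one of the five allowed calls before the sprint closes. The class name `DeciderCheckpoint`, its fields, and the choice to also require a non-empty next artifact are assumptions of this sketch, not the structure defined in `references/TEMPLATE.md`.

```python
from dataclasses import dataclass

ALLOWED_CALLS = {"build", "iterate", "pivot", "stop", "reframe"}

@dataclass
class DeciderCheckpoint:
    call: str            # the Decider's call
    next_artifact: str   # e.g. a PRD, a smaller experiment, a pivot memo

    def can_close_sprint(self) -> bool:
        """True only when a valid call and a next artifact are both recorded."""
        return self.call in ALLOWED_CALLS and bool(self.next_artifact.strip())
```

Here `DeciderCheckpoint("defer", "PRD").can_close_sprint()` is `False`, which mirrors the pitfall noted earlier that "defer" is not an answer.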