tool-design-sprint-test-and-score

<!-- PM-Skills | https://github.com/product-on-purpose/pm-skills | Apache 2.0 -->

Design Sprint Test and Score (Friday)

Friday is the sprint's payoff. Five target-profile customers run the prototype while the team observes; the team synthesizes its observations into a scorecard against the sprint questions; and the Decider makes the build / iterate / pivot / stop call by end of day. The week's 35-40 person-days, plus the cost of recruiting customers, convert into one actionable decision.
Family contract: docs/reference/skill-families/design-sprint-skills-contract.md. This skill is a member of design-sprint-skills.

When to Use

  • It is Day 5 of the Design Sprint and Thursday's prototype passed its trial run.
  • 5 confirmed participants are scheduled (the canonical count; 4 is acceptable if one cancels and there is no backup participant; pause if fewer than 4).
  • The team can observe interviews live (in-person or via Zoom breakout room) and synthesize during the day.
  • The Decider is present Friday PM for the post-interview review (canonically 14:00-18:00 PT window covering observation of slots 4-5 plus Decider review by 17:30 PT).

When NOT to Use

  • Thursday's prototype did not pass its trial run. Re-run the trial; if it is still failing at 19:00 PT Thursday, postpone Friday.
  • Fewer than 3 customers confirmed. Per Ratified Decision 3, the canonical guidance is 5 customers; 3-4 or 6-7 gets a documented warning; below 3 or above 7 should trigger a re-decision (postpone or split testing). Note: the v0.1.0 family validator does NOT mechanically enforce these thresholds (cohort count is in the EXAMPLE artifact, not in frontmatter); enforcement is a v2.16 validator-expansion candidate.
  • The Decider is unavailable for the post-interview review window. Without the Decider, the day produces observations but no call.
  • The team plans to use this skill to write the executive memo. Per Ratified Decision 4: exec memo authoring is delegated to foundation-stakeholder-update (existing pm-skills foundation skill); this skill produces the Decider summary only.
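The cohort-size thresholds from Ratified Decision 3 can be sketched as a small check. This is an illustrative helper only: the function name `cohort_status` and its return labels are hypothetical, and, as noted above, the v0.1.0 family validator does not enforce these thresholds.

```python
def cohort_status(confirmed: int) -> str:
    """Classify a confirmed-customer count per Ratified Decision 3.

    Hypothetical helper; the labels are illustrative, not part of the skill.
    """
    if confirmed == 5:
        return "canonical"      # the canonical guidance
    if 3 <= confirmed <= 7:
        return "warning"        # 3-4 or 6-7: proceed with a documented warning
    return "re-decide"          # below 3 or above 7: postpone or split testing


if __name__ == "__main__":
    for n in (2, 4, 5, 7, 8):
        print(n, cohort_status(n))
```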

What This Skill Produces

A single bundled artifact with six sections:
  1. Per-customer interview observation notes: one section per customer; covers Context (Act 2) reactions, Tasks (Act 4) behavior with timestamps, Debrief (Act 5) reactions including pricing. Captured live during the day's interviews.
  2. Best quotes: 5-15 verbatim customer quotes the team flags as most signal-bearing. Used in the Decider summary and in any downstream pitch or planning artifact.
  3. Scorecard grid: rows are the sprint questions (from Monday); columns are the 5 customers; each cell is Y / N / partial / unclear with a one-line note; rightmost column is the team's day-end decision per question (Validated / Invalidated / Inconclusive).
  4. Observed patterns: 4 buckets (worked, hesitated, broke trust, unexpected) with 2-4 patterns per bucket. Each pattern names how many customers showed it.
  5. Hot takes: one short paragraph per team member capturing their personal read on Friday before group synthesis biases the read. Written silently in parallel.
  6. Decider summary: the Decider's call (build / iterate / pivot / stop / reframe) plus the highest-confidence learning, the most important revision the team would make to the prototype direction, and the next artifact the team will produce (the post-sprint deliverable).
See references/TEMPLATE.md for the canonical structure and references/EXAMPLE.md for the Brainshelf book-catalog Friday artifact.

Friday Time Structure

Friday is the longest day: customer interviews start early (canonically 09:00 PT) and the Decider review concludes the day (canonically 17:30 PT).
  • 09:00-16:30: 5 customer interviews of 50-60 minutes each at 09:00 / 10:30 / 12:00 / 14:00 / 15:30. Each slot: 10 min setup + 50-55 min interview + 5 min team huddle to capture observations before next customer.
  • 13:00-14:00: Lunch (slot 3 wraps ~13:00; lunch overlaps the slot 3 to slot 4 buffer)
  • 16:30-16:45: Last-customer wrap; observation note tidy
  • 16:45-17:00: Team writes hot takes silently in parallel
  • 17:00-17:30: Decider reviews scorecard + hot takes; makes the call
  • 17:30-18:00: Decider summary captured; team begins post-sprint disposition (next-step calendar, downstream deliverable assignment)
This skill's 270-minute timebox covers the synthesis sections (scorecard, patterns, hot takes, Decider summary). The 5 interviews themselves (~5 hours of interview time) run in parallel with continuous observation capture.
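As a rough sanity check on the interview schedule above, the slack between the end of one slot and the start of the next can be computed. This sketch assumes the 10-minute setup begins at the listed start time and the interview runs the full 55 minutes; both are assumptions, and the names `SLOT_STARTS` and `slot_slack` are hypothetical.

```python
from datetime import datetime, timedelta

# Canonical start times from the schedule above.
SLOT_STARTS = ["09:00", "10:30", "12:00", "14:00", "15:30"]
# Assumed per-slot breakdown in minutes: setup + interview (upper bound) + huddle.
SETUP, INTERVIEW, HUDDLE = 10, 55, 5


def slot_slack(starts=SLOT_STARTS, slot_minutes=SETUP + INTERVIEW + HUDDLE):
    """Return minutes of slack between each slot's end and the next slot's start."""
    times = [datetime.strptime(s, "%H:%M") for s in starts]
    length = timedelta(minutes=slot_minutes)
    return [
        int((nxt - (cur + length)).total_seconds() // 60)
        for cur, nxt in zip(times, times[1:])
    ]


if __name__ == "__main__":
    print(slot_slack())  # slack after slots 1-4, in minutes
```

A negative value would mean a slot overruns into the next one; under these assumptions the longest gap falls between slots 3 and 4, where lunch sits.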

Scorecard Mechanic

The scorecard is a 2-D grid. Rows are sprint questions from Monday's map-and-target (typically 3-7). Columns are the 5 customers (anonymized IDs). Each cell answers: did this customer's interview validate, invalidate, or leave inconclusive the row's question?
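|     | C1  | C2  | C3      | C4  | C5      | Day-end decision                 |
|-----|-----|-----|---------|-----|---------|----------------------------------|
| Q1  | Y   | Y   | N       | Y   | partial | Validated (4 of 5)               |
| Q2  | N   | Y   | unclear | N   | N       | Invalidated (3 of 5 N, 1 of 5 Y) |
| ... | ... | ... | ...     | ... | ...     | ...                              |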
Day-end decision rules:
  • Validated: 4 or 5 of 5 Y (strong signal); 3 of 5 Y with no N (directional). For 4-customer cohorts: 4 Y is Validated; 3 Y with no N is directional.
  • Invalidated: 4 or 5 of 5 N. For 4-customer cohorts: 4 N is Invalidated; 3 N with no Y is directional.
  • Inconclusive: all other patterns. Inconclusive questions get scheduled for follow-up (a smaller test, a quant experiment, or a second Design Sprint).
The Decider can override day-end decisions but should record reasoning.
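The day-end decision rules above can be sketched as a function over one scorecard row. This is a minimal sketch, not a canonical implementation: the function name `day_end_decision` is hypothetical, and it assumes "partial" and "unclear" count as neither Y nor N. A team that counts "partial" toward Y would get different results, and the Decider can still override whatever the rules produce.

```python
from collections import Counter


def day_end_decision(cells):
    """Apply the day-end decision rules to one scorecard row.

    cells: per-customer reads, each "Y", "N", "partial", or "unclear".
    Assumption: "partial" and "unclear" count as neither Y nor N.
    """
    n = len(cells)
    if n not in (4, 5):
        raise ValueError("rules are stated only for 4- or 5-customer cohorts")
    counts = Counter(cells)
    yes, no = counts["Y"], counts["N"]

    if yes >= 4:
        return "Validated"
    if yes == 3 and no == 0:
        return "Validated (directional)"
    if no >= 4:
        return "Invalidated"
    # The rules name a directional Invalidated only for 4-customer cohorts.
    if n == 4 and no == 3 and yes == 0:
        return "Invalidated (directional)"
    return "Inconclusive"


if __name__ == "__main__":
    # Hypothetical rows, not taken from the example grid.
    print(day_end_decision(["Y", "Y", "Y", "Y", "N"]))
    print(day_end_decision(["Y", "Y", "Y", "partial", "unclear"]))
    print(day_end_decision(["N", "Y", "unclear", "Y", "N"]))
```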

Common Pitfalls

  • Observation notes too narrative, not behavioral. "Customer seemed confused" is a narrative; "Customer hovered on the capture button for 4 seconds without tapping, then tapped twice in rapid succession" is behavior. Behavior is data; narrative is interpretation.
  • Scorecard cells filled in by consensus. Each observer writes their cell read; differences are surfaced, not averaged. If C1's read on Q1 is split 2 Y vs 2 N across the team, the cell is "split" with an explanatory note.
  • Hot takes written after group synthesis. Hot takes are written SILENTLY and in PARALLEL before group synthesis. Writing them after a group debrief produces consensus, not signal.
  • Decider hesitating on the call because "we want more data." Friday's job is to produce a call with the data you have. If the call truly cannot be made, the call is "iterate" (re-sprint with adjustments). "Defer" is not an answer.
  • Skipping the Decider summary because "we'll write it up Monday." The summary is captured Friday before the team leaves. Monday is too late; context decays fast.
  • Treating "5 customers" as a soft target. Per the canonical research, 5 is where confidence about patterns crosses an inflection point. Fewer than 4 produces noisy signal; more than 7 produces synthesis-overload without much marginal signal.
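The scorecard-cell pitfall above (surface differences, do not average them) can be illustrated with a short sketch. The function name `cell_value` and the note format are hypothetical, and treating any disagreement among observers as "split" is an assumption; the text only gives the 2 Y vs 2 N case.

```python
from collections import Counter


def cell_value(reads):
    """Combine individual observer reads for one scorecard cell.

    reads: dict mapping observer name -> "Y", "N", "partial", or "unclear".
    Returns (value, note). Disagreement is surfaced as "split", never averaged.
    """
    distinct = set(reads.values())
    if len(distinct) == 1:
        return distinct.pop(), ""
    tally = Counter(reads.values())
    note = ", ".join(f"{count} {value}" for value, count in sorted(tally.items()))
    return "split", note


if __name__ == "__main__":
    # Hypothetical observers reading C1 on Q1.
    print(cell_value({"ana": "Y", "ben": "Y", "chi": "N", "dev": "N"}))
```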

Cross-Skill Usage

Prerequisites: tool-design-sprint-prototype-plan. Friday consumes the prototype and the interview script from Thursday. Without a working prototype that passed trial run, Friday cannot run.
This skill does NOT invoke tool-note-and-vote. Friday has no voting moment; the scorecard cells are individual reads and the Decider summary is the Decider's call.
This skill does NOT author an executive memo (per Ratified Decision 4). If the team wants an exec memo or stakeholder update, the next invocation is foundation-stakeholder-update, which consumes the Decider summary as input.
Downstream invocations after the sprint closes: deliver-prd (if Decider call is "build"); measure-experiment-design (if "iterate" requires a smaller follow-on experiment); iterate-pivot-decision (if "pivot" requires documenting the pivot rationale); foundation-stakeholder-update (if any of the above need stakeholder communication).
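The downstream routing described above can be sketched as a lookup from the Decider's call to candidate next skills. The names `NEXT_SKILL` and `downstream_skills` are hypothetical, and the mapping is a simplification: the source makes the "iterate" and "pivot" follow-ons conditional, and it names no specific downstream skill for "stop" or "reframe".

```python
# Candidate follow-on skill per Decider call, as listed above.
NEXT_SKILL = {
    "build": "deliver-prd",
    "iterate": "measure-experiment-design",  # only if a smaller follow-on experiment is needed
    "pivot": "iterate-pivot-decision",       # only if the pivot rationale must be documented
}
VALID_CALLS = {"build", "iterate", "pivot", "stop", "reframe"}


def downstream_skills(call, needs_stakeholder_comms=False):
    """Return candidate skills to invoke after the sprint closes."""
    if call not in VALID_CALLS:
        raise ValueError(f"unknown Decider call: {call!r}")
    skills = []
    if call in NEXT_SKILL:
        skills.append(NEXT_SKILL[call])
    if needs_stakeholder_comms:
        skills.append("foundation-stakeholder-update")
    return skills


if __name__ == "__main__":
    print(downstream_skills("build", needs_stakeholder_comms=True))
```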

Canonical Sources

Decider Checkpoint

This skill ends with a Decider Checkpoint in references/TEMPLATE.md. The Decider's call (build / iterate / pivot / stop / reframe) IS the checkpoint; the sprint cannot close without it. The checkpoint also captures the next artifact the team is responsible for producing (a PRD, a smaller experiment, a pivot memo, or a stakeholder update), which lets Monday's post-sprint work begin cleanly.