pol-probe-advisor

Purpose

Guide product managers through selecting the right Proof of Life (PoL) probe type, out of five flavors, based on their hypothesis, risk, and available resources. Use this when you need to eliminate a specific risk or test a narrow hypothesis but aren't sure which validation method to use. This interactive skill ensures you match the cheapest prototype to the harshest truth, not the prototype you're most comfortable building.

This is not a tool for deciding if you should validate (you should). It's a decision framework for choosing how to validate most effectively.

Key Concepts

The Core Problem: Method-Hypothesis Mismatch

Common failure mode: PMs choose validation methods based on tooling comfort ("I know Figma, so I'll design a prototype") rather than learning goal. Result: validate the wrong thing, miss the actual risk.

Solution: Work backwards from the hypothesis. Ask: "What specific risk am I eliminating? What's the cheapest path to harsh truth?"

The 5 PoL Probe Flavors (Quick Reference)

| Type | Core Question | Best For | Timeline |
|---|---|---|---|
| Feasibility Check | "Can we build this?" | Technical unknowns, API dependencies, data integrity | 1-2 days |
| Task-Focused Test | "Can users complete this job without friction?" | Critical UI moments, field labels, decision points | 2-5 days |
| Narrative Prototype | "Does this workflow earn stakeholder buy-in?" | Storytelling, explaining complex flows, alignment | 1-3 days |
| Synthetic Data Simulation | "Can we model this without production risk?" | Edge cases, unknown-unknowns, statistical modeling | 2-4 days |
| Vibe-Coded PoL Probe | "Will this solution survive real user contact?" | Workflow/UX validation with real interactions | 2-3 days |

Golden Rule: "Use the cheapest prototype that tells the harshest truth."
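
The quick reference above can be read as a lookup from core question to probe type. A minimal Python sketch of that lookup follows; the names `Probe`, `PROBES`, and `recommend_probe` are illustrative assumptions and not part of the skill, and the integer keys simply number the five rows in order.

```python
from typing import NamedTuple, Tuple


class Probe(NamedTuple):
    name: str
    core_question: str
    timeline_days: Tuple[int, int]  # (min, max) days from the quick reference


# Keys number the five probe flavors in table order (hypothetical encoding).
PROBES = {
    1: Probe("Feasibility Check", "Can we build this?", (1, 2)),
    2: Probe("Task-Focused Test", "Can users complete this job without friction?", (2, 5)),
    3: Probe("Narrative Prototype", "Does this workflow earn stakeholder buy-in?", (1, 3)),
    4: Probe("Synthetic Data Simulation", "Can we model this without production risk?", (2, 4)),
    5: Probe("Vibe-Coded PoL Probe", "Will this solution survive real user contact?", (2, 3)),
}


def recommend_probe(option: int) -> Probe:
    """Return the probe type that matches the selected core question."""
    if option not in PROBES:
        raise ValueError("No matching core question; refine the hypothesis first")
    return PROBES[option]
```

Keeping the mapping as plain data makes the point of the golden rule visible: the choice is driven by the question being asked, not by the tools someone happens to know.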

Anti-Patterns (What This Is NOT)

Not "build the prototype you're comfortable with": Match method to hypothesis, not skillset
Not "pick based on stakeholder preference": Optimize for learning, not internal politics
Not "choose the most impressive option": Impressive ≠ informative
Not "default to code": Writing code should be your last resort, not your first

When to Use This Skill

✅ Use this when:

You have a clear hypothesis but don't know which validation method to use
You're unsure whether to build code, create a video, or run a simulation
You need to eliminate a specific risk quickly (within days)
You want to avoid prototype theater

❌ Don't use this when:

You don't have a hypothesis yet (use `problem-statement.md` or `problem-framing-canvas.md` first)

You're trying to impress executives (that's not validation)
You already know the answer (confirmation bias)
You need to ship an MVP (this is for pre-MVP reconnaissance)

Facilitation Source of Truth

Use `workshop-facilitation` as the default interaction protocol for this skill.

It defines:

session heads-up + entry mode (Guided, Context dump, Best guess)
one-question turns with plain-language prompts
progress labels (for example, Context Qx/8 and Scoring Qx/5)
interruption handling and pause/resume behavior
numbered recommendations at decision points
quick-select numbered response options for regular questions (include `Other (specify)` when useful)

This file defines the domain-specific assessment content. If there is a conflict, follow this file's domain logic.

Application

This interactive skill uses adaptive questioning to recommend the right PoL probe type based on your context.

Step 0: Gather Context

Agent asks:

Let's figure out which PoL probe type is right for your validation needs. First, I need some context:

1. What hypothesis are you testing? (Describe in one sentence, or use "If [we do X] for [persona], then [outcome]" format)

2. What specific risk are you trying to eliminate? Examples:

Technical feasibility ("Can our API handle real-time data?")
User task completion ("Can users find the 'export' button?")
Stakeholder alignment ("Will leadership approve this direction?")
Edge case behavior ("How does the system handle duplicate entries?")
Workflow validation ("Will users complete the 3-step onboarding?")

3. What's your timeline?

Hours (same-day validation)
1-2 days (quick spike)
3-5 days (moderate effort)
1 week+ (too long—consider breaking into smaller probes)

4. What resources do you have available? Examples:

Engineering capacity (1 dev for 1 day)
Design tools (Figma, Loom, Sora)
AI/no-code tools (ChatGPT Canvas, Replit, Airtable)
User access (10 users from waitlist, 5 beta customers, etc.)
Budget (for UsabilityHub, Optimal Workshop, etc.)
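
The four context answers can be captured in a small record before any recommendation is made. The sketch below shows one hypothetical way to do that in Python; the `ProbeContext` fields and the `timeline_warning` helper are assumptions for illustration, and the seven-day cutoff is one reading of the "1 week+" option above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProbeContext:
    hypothesis: str        # "If [we do X] for [persona], then [outcome]"
    risk: str              # the single risk to eliminate
    timeline_days: float   # e.g. 0.5 for same-day validation
    resources: List[str] = field(default_factory=list)


def timeline_warning(ctx: ProbeContext) -> Optional[str]:
    """Flag timelines of a week or more, per the Step 0 timeline options."""
    if ctx.timeline_days >= 7:
        return "Too long: consider breaking into smaller probes"
    return None
```

For instance, a three-day checkout test produces no warning, while a three-week plan is flagged before any probe type is discussed.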

Step 1: Identify the Core Question

Agent synthesizes user input and asks:

Based on your hypothesis and risk, which of these core questions are you really trying to answer?

Offer 5 options (aligned to probe types):

1. "Can we build this?": You're uncertain about technical feasibility, API integration, data availability, or third-party dependencies
2. "Can users complete this job without friction?": You're validating critical UI moments, field labels, navigation, or decision points
3. "Does this workflow earn stakeholder buy-in?": You need to explain a complex flow, align leadership, or "tell vs. test" the story
4. "Can we model this without production risk?": You need to explore edge cases, simulate user behavior, or test prompt logic safely
5. "Will this solution survive real user contact?": You need users to interact with a semi-functional workflow to catch UX/workflow issues

User response: [Select one number, or describe if none fit]

Step 2: Recommend PoL Probe Type

Based on user selection, agent recommends the matching probe type:

Option 1 Selected: "Can we build this?"

→ Recommended Probe: Feasibility Check

What it is: A 1-2 day spike-and-delete test to surface technical risk. Not meant to impress anyone—meant to reveal blockers fast.

Methods:

GenAI prompt chains (test if AI can handle your use case)
API sniff tests (verify third-party integrations work)
Data integrity sweeps (check if your data supports the feature)
Third-party tool evaluation (test if Zapier/Stripe/Twilio does what you think)

Timeline: 1-2 days

Tools:

ChatGPT/Claude (prompt testing)
Postman/Insomnia (API testing)
Jupyter notebooks (data exploration)
Proof-of-concept scripts (throwaway code)

Success Criteria Example:

Pass: API returns expected data format in <200ms
Fail: API times out, or data structure incompatible with our schema
Learn: Identify specific technical blocker
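
As a rough illustration of the pass/fail criteria above, a throwaway spike script might time one call and check the payload shape. This is a sketch under assumptions: `call` stands in for the real request (no HTTP library is used here), `is_valid` is a hypothetical check against your own schema, and the 200 ms budget is the example figure from the criteria.

```python
import time
from typing import Any, Callable


def feasibility_verdict(
    call: Callable[[], Any],
    is_valid: Callable[[Any], bool],
    budget_ms: float = 200.0,
) -> str:
    """Time one call and classify it against the latency budget.

    Any exception, including a timeout raised by the caller's client,
    counts as a fail.
    """
    start = time.perf_counter()
    try:
        payload = call()
    except Exception:
        return "fail"
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms < budget_ms and is_valid(payload):
        return "pass"
    return "fail"
```

A single timed call is a weak signal; in practice a spike would likely repeat the call several times and note which condition failed, which is the "learn" part of the criteria.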

Disposal Plan: Delete all spike code after documenting findings.

Next Step: Would you like me to generate a `pol-probe` artifact documenting this feasibility check?

Option 2 Selected: "Can users complete this job without friction?"

→ Recommended Probe: Task-Focused Test

What it is: Validate critical moments—field labels, decision points, navigation, drop-off zones—using specialized testing tools. Focus on observable task completion, not opinions.

Methods:

Optimal Workshop (tree testing, card sorting)
UsabilityHub (5-second tests, click tests, preference tests)
Maze (prototype testing with heatmaps)
Loom-recorded task walkthroughs (ask users to "think aloud")

Timeline: 2-5 days

Tools:

Optimal Workshop ($200/month)
UsabilityHub ($100-300/month)
Maze (free tier available)
Loom (free for basic)

Success Criteria Example:

Pass: 80%+ users complete task in <2 minutes
Fail: <60% completion, or 3+ users get stuck on same step
Learn: Identify exact friction point (specific field, button, etc.)
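
The example thresholds above can be applied mechanically once sessions are tallied. The following Python sketch assumes a hypothetical `Session` record per participant; the "inconclusive" result for outcomes that land between the pass and fail thresholds is an assumption, since the criteria do not name that case.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    completed: bool
    seconds: float
    stuck_step: Optional[str] = None  # step where the user stalled, if any


def task_test_verdict(sessions: List[Session]) -> str:
    """Pass at 80%+ completion in under 2 minutes; fail below 60%
    completion or when 3+ users get stuck on the same step."""
    if not sessions:
        raise ValueError("need at least one session")
    total = len(sessions)
    completed = sum(s.completed for s in sessions)
    fast = sum(s.completed and s.seconds < 120 for s in sessions)
    stuck = Counter(s.stuck_step for s in sessions if s.stuck_step)
    worst = max(stuck.values(), default=0)
    if completed / total < 0.60 or worst >= 3:
        return "fail"
    if fast / total >= 0.80:
        return "pass"
    return "inconclusive"
```

The `stuck` counter doubles as the "learn" output: its most common key is the step to investigate first.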

Disposal Plan: Archive session recordings, document learnings, delete test prototype.

Next Step: Would you like me to generate a `pol-probe` artifact documenting this task-focused test?

Option 3 Selected: "Does this workflow earn stakeholder buy-in?"

→ Recommended Probe: Narrative Prototype

What it is: Tell the story, don't test the interface. Use video walkthroughs or slideware storyboards to explain workflows and measure interest. This is "tell vs. test"—you're validating the narrative, not the UI.

Methods:

Loom walkthroughs (screen recording with voiceover)
Sora/Synthesia/Veo3 (AI-generated explainer videos)
Slideware storyboards (PowerPoint/Keynote with illustrations)
Storyboard sketches (use the `storyboard.md` component skill)

Timeline: 1-3 days

Tools:

Loom (free, fast)
Sora/Synthesia (text-to-video, paid)
PowerPoint/Keynote (slideware animation)
Figma (static storyboard frames)

Success Criteria Example:

Pass: 8/10 stakeholders say "I'd use this" or "This solves the problem"
Fail: Stakeholders ask "Why would I use this?" or suggest alternative approaches
Learn: Identify which part of the narrative resonates (or doesn't)

Disposal Plan: Archive video, document feedback, delete supporting files.

Next Step: Would you like me to generate a `pol-probe` artifact documenting this narrative prototype?

Option 4 Selected: "Can we model this without production risk?"

→ Recommended Probe: Synthetic Data Simulation

What it is: Use simulated users, synthetic data, or prompt logic testing to explore edge cases and unknown-unknowns without touching production. Think "wind tunnel testing, cheaper than postmortem."

Methods:

Synthea (synthetic patient data generation)
DataStax LangFlow (test prompt logic without real users)
Monte Carlo simulations (model probabilistic outcomes)
Synthetic user behavior scripts (simulate click patterns, load testing)

Timeline: 2-4 days

Tools:

Synthea (open-source, healthcare)
DataStax LangFlow (prompt chain testing)
Python + Faker library (generate synthetic data)
Locust/k6 (load testing with synthetic users)

Success Criteria Example:

Pass: System handles 10,000 synthetic users with <1% error rate
Fail: Edge cases cause crashes or incorrect outputs
Learn: Identify which edge cases break the system
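
To make the idea concrete, here is a small simulation sketch that uses only Python's standard `random` module in place of Faker or a load-testing tool. Everything in it is illustrative: `handle_signup` is a deliberately naive, hypothetical stand-in for the system under test, and the 5% edge-case ratio is an assumed value.

```python
import random
from typing import Dict, List


def make_synthetic_users(n: int, seed: int = 42) -> List[dict]:
    """Generate n synthetic signup records, mixing in deliberate edge cases."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    edge_cases = ["", "   ", "no-at-sign", "a@b", "UPPER@EXAMPLE.COM", "dup@example.com"]
    users = []
    for i in range(n):
        if rng.random() < 0.05:  # roughly 5% edge cases (assumed ratio)
            email = rng.choice(edge_cases)
        else:
            email = f"user{i}@example.com"
        users.append({"id": i, "email": email})
    return users


def handle_signup(user: dict) -> str:
    """Hypothetical, deliberately naive stand-in for the system under test."""
    local, domain = user["email"].split("@")  # crashes without exactly one "@"
    return f"{local.lower()}@{domain.lower()}"


def run_simulation(users: List[dict]) -> dict:
    """Feed every synthetic user through the handler and tally crashes."""
    failures: Dict[str, int] = {}
    for user in users:
        try:
            handle_signup(user)
        except Exception:
            failures[user["email"]] = failures.get(user["email"], 0) + 1
    error_rate = sum(failures.values()) / len(users)
    verdict = "pass" if error_rate < 0.01 else "fail"
    return {"error_rate": error_rate, "failures": failures, "verdict": verdict}
```

Running `run_simulation(make_synthetic_users(10_000))` returns the error rate alongside a per-input failure tally, so the "learn" question (which edge cases break the system) is answered by the keys of `failures`.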

Disposal Plan: Delete synthetic data, archive findings, document edge cases.

Next Step: Would you like me to generate a `pol-probe` artifact documenting this synthetic data simulation?

Option 5 Selected: "Will this solution survive real user contact?"

→ Recommended Probe: Vibe-Coded PoL Probe

What it is: A Frankensoft stack (ChatGPT Canvas + Replit + Airtable) that creates just enough illusion for users to interact with a semi-functional workflow. Not production-grade—just enough to catch UX/workflow signals in 48 hours.

⚠️ Warning: This is the riskiest probe type. It looks real enough to confuse momentum with maturity. Use only when you need real user contact and other methods won't suffice.

Methods:

ChatGPT Canvas (quick UI generation)
Replit (host throwaway code)
Airtable (fake database)
Carrd/Webflow (landing page + workflow mockup)

Timeline: 2-3 days

Stack Example:

ChatGPT Canvas: Generate form UI
Replit: Host simple Flask/Node app
Airtable: Capture form submissions
Loom: Record user sessions for post-mortem analysis

Success Criteria Example:

Pass: 8/10 users complete workflow, 0 critical confusion moments
Fail: Users get stuck, ask "Is this broken?", or abandon mid-flow
Learn: Identify exact step where users lose confidence

Disposal Plan: Delete all code after user sessions, archive Loom recordings, document learnings.

Next Step: Would you like me to generate a `pol-probe` artifact documenting this vibe-coded probe?

Step 3: Apply Component Skill

Agent offers:

I recommend using [selected probe type] for your hypothesis. Would you like me to:

1. Generate a complete `pol-probe` artifact (using the `pol-probe.md` template) with your hypothesis, success criteria, timeline, and disposal plan?
2. Walk through the execution steps for this probe type (tools, methods, user recruitment)?
3. Refine the hypothesis before proceeding (if it seems too broad or ambiguous)?

User selects option, and agent executes accordingly.

Step 4: Refine or Iterate (Optional)

If user is uncertain or hypothesis is too broad:

Agent asks:

Your hypothesis seems broad or tests multiple things at once. PoL probes work best when they're narrow and falsifiable. Let me ask a few clarifying questions:

Clarifying Questions:

What's the smallest thing you could test first? (Isolate one variable)
What would "failure" look like? (If you can't describe failure, you can't measure success)
Is this testing user behavior, technical feasibility, or stakeholder alignment? (Pick one focus)

Agent then re-recommends probe type based on refined hypothesis.

Examples

Example 1: Choosing Task-Focused Test

User Context:

Hypothesis: "Users can complete our checkout flow in under 3 minutes"
Risk: High cart abandonment on mobile
Timeline: 3 days
Resources: UsabilityHub account, 20 users from waitlist

Agent Reasoning:

Core question: "Can users complete this job without friction?"
Risk is task completion (observable behavior), not technical feasibility
Timeline fits Task-Focused Test (2-5 days)

Recommendation: Task-Focused Test using UsabilityHub

Method: UsabilityHub prototype task test with mobile users from the waitlist
Success criteria: 80%+ complete checkout in <3 minutes
Disposal: Delete prototype after analyzing heatmaps

Example 2: Choosing Feasibility Check

User Context:

Hypothesis: "We can auto-tag support tickets using GPT-4 with <5% error rate"
Risk: Unknown if AI can handle industry-specific jargon
Timeline: 1 day
Resources: 100 past support tickets, OpenAI API access (GPT-4)

Agent Reasoning:

Core question: "Can we build this?"
Risk is technical feasibility (AI capability), not user behavior
Timeline is short (1 day = spike territory)

Recommendation: Feasibility Check using prompt engineering spike

Method: Run 100 tickets through GPT-4 prompt chain, measure error rate
Success criteria: <5% misclassified tickets
Disposal: Delete spike code after documenting findings

Outcome: Error rate was 18%. Decided NOT to build feature. Saved 6 weeks of development.
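
The measurement in this example reduces to a simple error-rate calculation over labeled tickets. A minimal Python sketch follows, assuming each past ticket already carries a human-assigned tag; `classify` is a placeholder for whatever wraps the GPT-4 prompt chain, and no real API call is shown.

```python
from typing import Callable, List, Tuple


def error_rate(tickets: List[Tuple[str, str]], classify: Callable[[str], str]) -> float:
    """Fraction of (text, expected_tag) pairs the classifier gets wrong."""
    if not tickets:
        raise ValueError("need at least one labeled ticket")
    wrong = sum(1 for text, expected in tickets if classify(text) != expected)
    return wrong / len(tickets)


def spike_verdict(rate: float, threshold: float = 0.05) -> str:
    """Pass only when the measured error rate is under the threshold."""
    return "pass" if rate < threshold else "fail"
```

With 18 of 100 tickets mis-tagged, `error_rate` returns 0.18 and `spike_verdict` reports a fail against the 5% threshold, matching the outcome above.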

Example 3: Choosing Narrative Prototype (NOT Vibe-Coded)

User Context:

Hypothesis: "Enterprise buyers will understand our multi-tenant security model"
Risk: Complex technical concept needs explanation
Timeline: 2 days
Resources: Loom, 5 enterprise prospects

Agent Reasoning:

Core question: "Does this workflow earn stakeholder buy-in?"
Risk is comprehension/alignment, not task completion
Building a functional prototype would over-invest; with a 2-day timeline, explaining the concept is enough

Recommendation: Narrative Prototype using Loom walkthrough

Method: 5-minute Loom explaining security model with diagrams
Success criteria: 4/5 prospects say "I understand this" without follow-up questions
Disposal: Archive video after 5 sessions

Why NOT Vibe-Coded Probe: You don't need users to interact with security settings—you need them to understand the concept. Narrative is cheaper and faster.

Example 4: Avoiding Prototype Theater

User Context:

Hypothesis: "Executives will approve budget if we show a polished demo"
Risk: Internal politics
Timeline: 3 weeks
Resources: Full design team

Agent Response:

⚠️ This is prototype theater, not a PoL probe.

You're testing internal politics, not user behavior or technical feasibility. PoL probes eliminate product risk, not organizational risk.

Recommendation: Skip the prototype entirely. Instead:

Use a Narrative Prototype (1-day Loom walkthrough) to explain the concept
Test with 5 target users (not executives) to validate the hypothesis
Present user feedback to executives, not a polished demo

If executives need a demo, build it after you've validated the hypothesis with real users.

Common Pitfalls

1. Choosing Based on Tooling Comfort

Failure Mode: "I know Figma, so I'll design a UI prototype" (even if design isn't the risk).

Consequence: Validate the wrong thing; miss the actual risk.

Fix: Answer the core question first, then pick the method. If you need a Feasibility Check but only know design tools, pair with an engineer for 1 day.

2. Defaulting to Code

Failure Mode: "Let's just build it and see what happens."

Consequence: 2 weeks of development before learning you tested the wrong hypothesis.

Fix: Ask: "What's the cheapest prototype that tells the harshest truth?" Usually it's NOT code.

3. Confusing Vibe-Coded Probes with MVPs

Failure Mode: Vibe-Coded probe "looks real," so team treats it like production code.

Consequence: Scope creep, technical debt, resistance to disposal.

Fix: Set disposal date before building. Vibe-Coded probes are Frankensoft by design—celebrate the jank, delete after learning.

4. Testing Multiple Things at Once

Failure Mode: "Let's test the workflow, the pricing, and the UI in one probe."

Consequence: Ambiguous results—you won't know which variable caused failure.

Fix: One probe, one hypothesis. If you have 3 hypotheses, run 3 probes.

5. Skipping Success Criteria

Failure Mode: "We'll know it when we see it."

Consequence: No harsh truth—just opinions and vanity metrics.

Fix: Write success criteria before building. Define "pass," "fail," and "learn" thresholds.

References

Related Skills

External Frameworks

Jeff Patton — User Story Mapping (lean validation principles)
Marty Cagan — Inspired (2014 prototype flavors framework)
Dean Peters — Vibe First, Validate Fast, Verify Fit (Dean Peters' Substack, 2025)

Tools by Probe Type

Feasibility: ChatGPT/Claude, Postman, Jupyter
Task-Focused: Optimal Workshop, UsabilityHub, Maze
Narrative: Loom, Sora, Synthesia, PowerPoint
Synthetic Data: Synthea, DataStax LangFlow, Faker
Vibe-Coded: ChatGPT Canvas, Replit, Airtable, Carrd
