evaluating-new-technology


Evaluating New Technology

Scope

Covers
  • Evaluating a new tool/platform/vendor (including AI products) for adoption
  • Emerging tech “should we use this?” decisions
  • Build vs buy decisions and tech stack changes
  • Running a proof-of-value pilot and capturing evidence
  • First-pass risk review (security/privacy/compliance, vendor claims, operational readiness)
When to use
  • “Evaluate this new AI tool/vendor for our team.”
  • “Should we build this in-house or buy a vendor?”
  • “We’re considering changing our analytics/experimentation stack—make a recommendation.”
  • “Create a technology evaluation doc with a pilot plan, risks, and decision memo.”
When NOT to use
  • You don’t have a real problem/job to solve yet (use problem-definition first).
  • You need a full product strategy/roadmap (use ai-product-strategy).
  • You’re designing how to build an LLM system (use building-with-llms).
  • You need a formal security assessment / penetration testing (engage security; this skill produces a structured first pass).

Inputs

Minimum required
  • Candidate technology (what it is, vendor/build option, links if available)
  • Problem/workflow to improve + who it’s for
  • Current approach/stack and what’s not working
  • Constraints: data sensitivity, privacy/compliance, budget, timeline, regions, deployment model (SaaS/on-prem)
  • Decision context: who decides, adoption scope, risk tolerance
Missing-info strategy
  • Ask up to 5 questions from references/INTAKE.md (3–5 at a time).
  • If still missing, proceed with explicit assumptions and present 2–3 options (e.g., buy vs build vs defer).
  • Do not request secrets. If asked to run tools, change production systems, or sign up for vendors, require explicit confirmation.

Outputs (deliverables)

Produce a Technology Evaluation Pack (in chat; or as files if requested), in this order:
  1. Evaluation brief (problem, stakeholders, decision, constraints, non-goals, assumptions)
  2. Options & criteria matrix (status quo + alternatives, criteria, scoring, notes)
  3. Build vs buy analysis (bandwidth/TCO, core competency, opportunity cost, lock-in)
  4. Pilot (proof-of-value) plan (hypotheses, scope, metrics, timeline, exit criteria)
  5. Risk & guardrails review (security/privacy/compliance, vendor claims, mitigations)
  6. Decision memo (recommendation, rationale, trade-offs, adoption/rollback plan)
  7. Risks / Open questions / Next steps (always included)
Templates: references/TEMPLATES.md

Workflow (8 steps)

1) Start with the problem (avoid tool bias)

  • Inputs: Candidate tech, target workflow/users, current pain.
  • Actions: Write a one-sentence problem statement and “who feels it.” List 3–5 symptoms and 3–5 non-goals.
  • Outputs: Draft Evaluation brief (problem + non-goals).
  • Checks: You can explain the decision without naming the tool.

2) Define “good” and hard constraints

  • Inputs: Success metrics, constraints, risk tolerance, decision deadline.
  • Actions: Define success metrics (leading + lagging) and must-have constraints (privacy, compliance, security, uptime, latency/cost if relevant). Capture “deal breakers.”
  • Outputs: Evaluation brief (success + constraints + deal breakers).
  • Checks: A stakeholder can say what would make this a clear “yes” or “no.”

3) Map options and evaluation criteria (workflows → ROI)

  • Inputs: Current stack, alternatives, stakeholders.
  • Actions: List options: status quo, 1–3 vendors, build, hybrid. Define criteria anchored to workflows enabled and ROI (time saved, revenue impact, risk reduction), not feature checklists.
  • Outputs: Options & criteria matrix.
  • Checks: Every criterion is measurable or at least falsifiable in a pilot.
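As a sketch of what the options & criteria matrix can reduce to once criteria are anchored to ROI, a minimal weighted-scoring pass is shown below. The criteria names, weights, and 1–5 scores are all illustrative assumptions, not data from any real evaluation:

```python
# Hypothetical options matrix: criterion -> weight (weights sum to 1.0).
CRITERIA = {
    "time_saved_per_week": 0.4,
    "integration_effort": 0.3,   # scored inversely: higher = easier to integrate
    "risk_reduction": 0.3,
}

# Hypothetical option -> per-criterion scores on a 1-5 scale.
OPTIONS = {
    "status_quo": {"time_saved_per_week": 1, "integration_effort": 5, "risk_reduction": 2},
    "vendor_a":   {"time_saved_per_week": 4, "integration_effort": 3, "risk_reduction": 4},
    "build":      {"time_saved_per_week": 4, "integration_effort": 2, "risk_reduction": 3},
}

def weighted_score(scores: dict) -> float:
    """Weighted sum of an option's per-criterion scores."""
    return sum(CRITERIA[c] * s for c, s in scores.items())

# Rank options, best first; ties and close calls are what the pilot resolves.
ranked = sorted(OPTIONS, key=lambda o: weighted_score(OPTIONS[o]), reverse=True)
for option in ranked:
    print(f"{option}: {weighted_score(OPTIONS[option]):.2f}")
```

The point of the sketch is the shape, not the numbers: each row must trace back to a workflow or ROI claim that the pilot can falsify.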

4) Fast reality check: integration + data fit

  • Inputs: Architecture constraints, data sources, integration points.
  • Actions: Identify required integrations (SSO, data pipelines, APIs, logs). Note migration complexity, data ownership, and export/exit path. For PLG/growth tools, sanity-check the stack layers (data hub → analytics → lifecycle).
  • Outputs: Notes added to Options & criteria matrix (integration complexity + stack fit).
  • Checks: You can describe the end-to-end data/control flow in 5–10 bullets.

5) Build vs buy with “bandwidth” as a first-class cost

  • Inputs: Engineering capacity, core competencies, opportunity cost.
  • Actions: Compare build vs buy using a bandwidth/TCO ledger (build time, maintenance, on-call, upgrades, vendor management). Prefer building only when it’s a core differentiator or the vendor market is immature/unacceptable.
  • Outputs: Build vs buy analysis.
  • Checks: The analysis includes opportunity cost and who would maintain the system 12 months from now.
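The bandwidth/TCO ledger can be sketched as plain first-year arithmetic. Every figure below is an illustrative assumption (not a vendor quote or a real cost model), and a real ledger would extend this past year one:

```python
# Assumed fully loaded cost of one engineer-week (illustrative).
ENGINEER_WEEK_COST = 5_000

# Hypothetical build-side ledger, in engineer-weeks.
build = {
    "initial_build_weeks": 16,
    "maintenance_weeks_per_year": 8,  # upgrades, on-call, bug fixes
    "opportunity_cost_weeks": 6,      # roadmap work displaced by the build
}

# Hypothetical buy-side ledger.
buy = {
    "license_per_year": 50_000,
    "integration_weeks": 3,
    "vendor_mgmt_weeks_per_year": 2,
}

build_total = ENGINEER_WEEK_COST * sum(build.values())
buy_total = buy["license_per_year"] + ENGINEER_WEEK_COST * (
    buy["integration_weeks"] + buy["vendor_mgmt_weeks_per_year"]
)

print(f"build year-1 cost: ${build_total:,}")
print(f"buy   year-1 cost: ${buy_total:,}")
```

Note that opportunity cost appears as a first-class line item, matching the check above: the ledger is wrong if it only counts build time.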

6) Risk & guardrails review (be skeptical of “100% safe” claims)

  • Inputs: Data sensitivity, threat model, vendor posture, deployment model.
  • Actions: Identify key risks (security, privacy, compliance, reliability, lock-in). For AI vendors: treat “guardrails catch everything” claims as marketing; assume determined attackers exist and design defense-in-depth (permissions, logging, human approval points, eval/red-team).
  • Outputs: Risk & guardrails review.
  • Checks: Each top risk has an owner and a mitigation or a “blocker” label.
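The check above (every top risk has an owner plus a mitigation or a "blocker" label) can be made mechanical. The risks, owners, and mitigations below are hypothetical examples, not a real register:

```python
# Hypothetical risk register entries for an AI vendor evaluation.
risks = [
    {"risk": "PII sent to vendor API", "owner": "security-lead",
     "mitigation": "field-level redaction before egress", "blocker": False},
    {"risk": "guardrails bypassed via prompt injection", "owner": "ml-lead",
     "mitigation": "human approval on high-impact actions + red-team evals", "blocker": False},
    {"risk": "no data export path at contract end", "owner": "eng-lead",
     "mitigation": None, "blocker": True},  # unresolved, so explicitly labelled a blocker
]

def gate_passes(register: list) -> bool:
    """Each risk needs an owner, plus either a mitigation or an explicit blocker flag."""
    return all(r["owner"] and (r["mitigation"] or r["blocker"]) for r in register)

print("risk gate passes:", gate_passes(risks))
```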

7) Plan a proof-of-value pilot (or document why you can skip it)

  • Inputs: Criteria, risks, timeline, stakeholders.
  • Actions: Define pilot hypotheses, scope, success metrics, test dataset, and evaluation method. Specify timeline, resourcing, and exit criteria (adopt / iterate / reject). Include rollback and data deletion requirements.
  • Outputs: Pilot plan.
  • Checks: A team can run the pilot without extra meetings; success/failure is unambiguous.
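Exit criteria are unambiguous when they compile down to a single decision function. The metric names and thresholds below are illustrative assumptions for a support-agent pilot, not criteria from the skill itself:

```python
# Hypothetical exit criteria for a proof-of-value pilot.
EXIT_CRITERIA = {
    "deflection_rate": 0.30,   # adopt only if >= 30% of tickets resolved by the tool
    "p95_latency_s": 3.0,      # and p95 latency stays under 3 seconds
    "critical_incidents": 0,   # and zero critical security/privacy incidents
}

def pilot_decision(results: dict) -> str:
    """Map pilot measurements to a single adopt / iterate / reject outcome."""
    if results["critical_incidents"] > EXIT_CRITERIA["critical_incidents"]:
        return "reject"        # deal breaker: trigger rollback and pilot data deletion
    hit_targets = (results["deflection_rate"] >= EXIT_CRITERIA["deflection_rate"]
                   and results["p95_latency_s"] <= EXIT_CRITERIA["p95_latency_s"])
    return "adopt" if hit_targets else "iterate"

print(pilot_decision({"deflection_rate": 0.34, "p95_latency_s": 2.1, "critical_incidents": 0}))
```

If the team cannot write the pilot down at this level of precision, the success/failure judgment is not yet unambiguous.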

8) Decide, communicate, and quality-gate

  • Inputs: Completed pack drafts.
  • Actions: Write the Decision memo with recommendation, trade-offs, and adoption plan. Run references/CHECKLISTS.md and score with references/RUBRIC.md. Always include Risks / Open questions / Next steps.
  • Outputs: Final Technology Evaluation Pack.
  • Checks: Decision is actionable (owner, date, next actions) and reversible where possible.

Quality gate (required)

  • Use references/CHECKLISTS.md and references/RUBRIC.md.
  • Always include: Risks, Open questions, Next steps.

Examples

Example 1 (AI vendor): “Use evaluating-new-technology to evaluate an AI ‘prompt guardrails’ vendor for our support agent. Constraints: SOC2 required, PII present, must support SSO, budget $50k/yr, decision in 3 weeks.”
Expected: evaluation pack that treats guardrail claims skeptically and proposes defense-in-depth + a measurable pilot.
Example 2 (analytics stack): “Use evaluating-new-technology to choose between PostHog and Amplitude for our PLG product. Current stack: Segment + data warehouse; goal is faster iteration on onboarding and activation.”
Expected: options matrix + pilot plan tied to workflows (experiments, funnels, lifecycle triggers) and migration effort.
Boundary example: “What’s the best new AI tool we should adopt?”
Response: out of scope without a problem/workflow; ask intake questions and/or propose running problem-definition first.