# assess


Comprehensive assessment skill for answering "is this good?" with structured evaluation, scoring, and actionable recommendations.

## Quick Start


```bash
/assess backend/app/services/auth.py
/assess our caching strategy
/assess the current database schema
/assess frontend/src/components/Dashboard
```


## STEP 0: Verify User Intent with AskUserQuestion


BEFORE creating tasks, clarify assessment dimensions:

```python
AskUserQuestion(
  questions=[{
    "question": "What dimensions to assess?",
    "header": "Dimensions",
    "options": [
      {"label": "Full assessment (Recommended)", "description": "All dimensions: quality, maintainability, security, performance"},
      {"label": "Code quality only", "description": "Readability, complexity, best practices"},
      {"label": "Security focus", "description": "Vulnerabilities, attack surface, compliance"},
      {"label": "Quick score", "description": "Just give me a 0-10 score with brief notes"}
    ],
    "multiSelect": False
  }]
)
```
Based on the answer, adjust the workflow:

- **Full assessment:** All 7 phases, parallel agents
- **Code quality only:** Skip the security and performance phases
- **Security focus:** Prioritize the security-auditor agent
- **Quick score:** Single pass, brief output
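The answer-to-workflow mapping above can be sketched as a simple lookup (the configuration keys are illustrative, not a defined schema):

```python
# Map each AskUserQuestion option label to workflow adjustments (illustrative).
WORKFLOWS = {
    "Full assessment (Recommended)": {"phases": "all", "parallel_agents": 6},
    "Code quality only":             {"skip_phases": ["security", "performance"]},
    "Security focus":                {"priority_agent": "security-auditor"},
    "Quick score":                   {"single_pass": True, "brief_output": True},
}

def workflow_for(answer: str) -> dict:
    # Fall back to the full assessment for unrecognized answers.
    return WORKFLOWS.get(answer, WORKFLOWS["Full assessment (Recommended)"])
```

Defaulting to the full assessment keeps an unexpected answer from silently skipping dimensions.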


## STEP 0b: Select Orchestration Mode


Choose Agent Teams (mesh: assessors cross-validate scores) or the Task tool (star: all assessors report to the lead):

1. `ORCHESTKIT_PREFER_TEAMS=1` → Agent Teams mode
2. Agent Teams unavailable → Task tool mode (default)
3. Otherwise: full assessment with 6 dimension agents → recommend Agent Teams; quick score or single dimension → Task tool

| Aspect | Task Tool | Agent Teams |
|---|---|---|
| Score calibration | Lead normalizes independently | Assessors discuss disagreements |
| Cross-dimension findings | Lead correlates after completion | Security assessor alerts performance assessor of overlap |
| Cost | ~200K tokens | ~500K tokens |
| Best for | Quick scores, single dimension | Full multi-dimensional assessment |

**Fallback:** If Agent Teams encounters issues, fall back to the Task tool for the remaining assessment.
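The decision order above can be sketched as a small helper; only the `ORCHESTKIT_PREFER_TEAMS` variable comes from this skill, while the `teams_available` flag stands in for a hypothetical availability check:

```python
import os

def select_mode(full_assessment: bool, teams_available: bool) -> str:
    """Pick an orchestration mode following the decision order above (sketch)."""
    if not teams_available:
        return "task-tool"  # default when Agent Teams is unavailable
    if os.environ.get("ORCHESTKIT_PREFER_TEAMS") == "1":
        return "agent-teams"
    # Full 6-dimension assessments benefit from cross-validation; quick scores
    # and single dimensions stay cheaper with the star topology.
    return "agent-teams" if full_assessment else "task-tool"
```

A quick-score run with teams available but no preference flag would then resolve to the Task tool.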


## Task Management (CC 2.1.16)


Create the main assessment task:

```python
TaskCreate(
    subject="Assess: {target}",
    description="Comprehensive evaluation with quality scores and recommendations",
    activeForm="Assessing {target}",
)
```

Create subtasks for the 7-phase process:

```python
for phase in ["Understand target", "Rate quality", "List pros/cons",
              "Compare alternatives", "Generate suggestions",
              "Estimate effort", "Compile report"]:
    TaskCreate(subject=phase, activeForm=f"Working on: {phase}")
```

---

## What This Skill Answers


| Question | How It's Answered |
|---|---|
| "Is this good?" | Quality score 0-10 with reasoning |
| "What are the trade-offs?" | Structured pros/cons list |
| "Should we change this?" | Improvement suggestions with effort |
| "What are the alternatives?" | Comparison with scores |
| "Where should we focus?" | Prioritized recommendations |


## Workflow Overview


| Phase | Activities | Output |
|---|---|---|
| 1. Target Understanding | Read code/design, identify scope | Context summary |
| 2. Quality Rating | 6-dimension scoring (0-10) | Scores with reasoning |
| 3. Pros/Cons Analysis | Strengths and weaknesses | Balanced evaluation |
| 4. Alternative Comparison | Score alternatives | Comparison matrix |
| 5. Improvement Suggestions | Actionable recommendations | Prioritized list |
| 6. Effort Estimation | Time and complexity estimates | Effort breakdown |
| 7. Assessment Report | Compile findings | Final report |


## Phase 1: Target Understanding


Identify what's being assessed (code, design, approach, decision, pattern) and gather context:

```python
# PARALLEL - gather context
Read(file_path="$ARGUMENTS")                                  # if a file path was given
Grep(pattern="$ARGUMENTS", output_mode="files_with_matches")  # related files
mcp__memory__search_nodes(query="$ARGUMENTS")                 # past decisions
```

---

## Phase 2: Quality Rating (6 Dimensions)


Rate each dimension 0-10 and compute a weighted composite score. See the Scoring Rubric for details.

| Dimension | Weight | What It Measures |
|---|---|---|
| Correctness | 0.20 | Does it work correctly? |
| Maintainability | 0.20 | Easy to understand/modify? |
| Performance | 0.15 | Efficient, no bottlenecks? |
| Security | 0.15 | Follows best practices? |
| Scalability | 0.15 | Handles growth? |
| Testability | 0.15 | Easy to test? |

**Composite Score:** Weighted average of all dimensions.

Launch 6 parallel agents (one per dimension) with `run_in_background=True`.
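Because the weights sum to 1.0, the composite is a plain weighted sum of the per-dimension scores; a minimal sketch:

```python
# Dimension weights from the table above (sum to 1.0).
WEIGHTS = {
    "correctness": 0.20, "maintainability": 0.20, "performance": 0.15,
    "security": 0.15, "scalability": 0.15, "testability": 0.15,
}

def composite_score(scores: dict) -> float:
    """Weighted average of per-dimension 0-10 scores, rounded to one decimal."""
    return round(sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS), 1)

scores = {"correctness": 8, "maintainability": 7, "performance": 6,
          "security": 9, "scalability": 7, "testability": 5}
composite = composite_score(scores)  # weighted sum is 7.05 before rounding
```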

## Phase 2 Alternative: Agent Teams


In Agent Teams mode, form an assessment team where dimension assessors cross-validate scores and discuss disagreements:

```python
TeamCreate(team_name="assess-{target-slug}", description="Assess {target}")

Task(subagent_type="code-quality-reviewer", name="correctness-assessor",
     team_name="assess-{target-slug}",
     prompt="""Assess CORRECTNESS (0-10) and MAINTAINABILITY (0-10) for: {target}
     When you find issues that affect security, message security-assessor.
     When you find issues that affect performance, message perf-assessor.
     Share your scores with all teammates for calibration; if scores diverge
     significantly (>2 points), discuss the disagreement.""")

Task(subagent_type="security-auditor", name="security-assessor",
     team_name="assess-{target-slug}",
     prompt="""Assess SECURITY (0-10) for: {target}
     When correctness-assessor flags security-relevant patterns, investigate deeper.
     When you find performance-impacting security measures, message perf-assessor.
     Share your score and flag any cross-dimension trade-offs.""")

Task(subagent_type="performance-engineer", name="perf-assessor",
     team_name="assess-{target-slug}",
     prompt="""Assess PERFORMANCE (0-10) and SCALABILITY (0-10) for: {target}
     When security-assessor flags performance trade-offs, evaluate the impact.
     When you find testability issues (hard-to-benchmark code), message test-assessor.
     Share your scores with reasoning for the composite calculation.""")

Task(subagent_type="test-generator", name="test-assessor",
     team_name="assess-{target-slug}",
     prompt="""Assess TESTABILITY (0-10) for: {target}
     Evaluate test coverage, test quality, and ease of testing.
     When other assessors flag dimension-specific concerns, verify test coverage
     for those areas. Share your score and any coverage gaps found.""")
```

Team teardown after report compilation:

```python
SendMessage(type="shutdown_request", recipient="correctness-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="security-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="perf-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="test-assessor", content="Assessment complete")
TeamDelete()
```

**Fallback:** If team formation fails, use the standard Phase 2 Task spawns above.


## Phase 3: Pros/Cons Analysis

```markdown
### Pros (Strengths)

| # | Strength | Impact | Evidence |
|---|----------|--------|----------|
| 1 | [strength] | High/Med/Low | [example] |

### Cons (Weaknesses)

| # | Weakness | Severity | Evidence |
|---|----------|----------|----------|
| 1 | [weakness] | High/Med/Low | [example] |

**Net Assessment:** [Strengths outweigh / Balanced / Weaknesses dominate]
**Recommended action:** [Keep as-is / Improve / Reconsider / Rewrite]
```

---

## Phase 4: Alternative Comparison


See Alternative Analysis for the full comparison template.

| Criteria | Current | Alternative A | Alternative B |
|----------|---------|---------------|---------------|
| Composite | [N.N] | [N.N] | [N.N] |
| Migration Effort | N/A | [1-5] | [1-5] |

## Phase 5: Improvement Suggestions


See Improvement Prioritization for effort/impact guidelines.

| Suggestion | Effort (1-5) | Impact (1-5) | Priority (I/E) |
|------------|--------------|--------------|----------------|
| [action] | [N] | [N] | [ratio] |

**Quick Wins** = Effort <= 2 AND Impact >= 4. Always highlight these first.
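The Priority (I/E) ratio and the quick-win rule can be sketched as follows (the `Suggestion` type is an illustrative stand-in for a table row, not part of the skill):

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    action: str
    effort: int  # 1-5 (implementation cost)
    impact: int  # 1-5 (expected benefit)

    @property
    def priority(self) -> float:
        """Priority (I/E): impact divided by effort; higher ranks first."""
        return self.impact / self.effort

def quick_wins(suggestions: list) -> list:
    # Quick Wins = Effort <= 2 AND Impact >= 4; always surfaced first.
    return [s for s in suggestions if s.effort <= 2 and s.impact >= 4]

items = [
    Suggestion("Add input validation", effort=1, impact=5),
    Suggestion("Refactor the service layer", effort=4, impact=4),
]
wins = quick_wins(items)  # only "Add input validation" qualifies
```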

## Phase 6: Effort Estimation


| Timeframe | Tasks | Total |
|-----------|-------|-------|
| Quick wins (< 1 hr) | [list] | X min |
| Short-term (< 1 day) | [list] | X hrs |
| Medium-term (1-3 days) | [list] | X days |

## Phase 7: Assessment Report


See the Scoring Rubric for the full report template.

```markdown
# Assessment Report: $ARGUMENTS

**Overall Score:** [N.N]/10 (Grade: [A+/A/B/C/D/F])
**Verdict:** [EXCELLENT | GOOD | ADEQUATE | NEEDS WORK | CRITICAL]

## Answer: Is This Good?

[YES / MOSTLY / SOMEWHAT / NO]
[Reasoning]
```

---

## Grade Interpretation


| Score | Grade | Verdict |
|-------|-------|---------|
| 9.0-10.0 | A+ | EXCELLENT |
| 8.0-8.9 | A | GOOD |
| 7.0-7.9 | B | GOOD |
| 6.0-6.9 | C | ADEQUATE |
| 5.0-5.9 | D | NEEDS WORK |
| 0.0-4.9 | F | CRITICAL |
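The score-to-grade mapping above is a simple threshold lookup; a minimal sketch:

```python
# (minimum score, grade, verdict) rows from the Grade Interpretation table.
GRADE_BANDS = [
    (9.0, "A+", "EXCELLENT"),
    (8.0, "A", "GOOD"),
    (7.0, "B", "GOOD"),
    (6.0, "C", "ADEQUATE"),
    (5.0, "D", "NEEDS WORK"),
    (0.0, "F", "CRITICAL"),
]

def grade(score: float):
    """Return (grade, verdict) for a 0-10 composite score."""
    for floor, letter, verdict in GRADE_BANDS:
        if score >= floor:
            return letter, verdict
    return "F", "CRITICAL"  # scores below 0 clamp to the lowest band
```

For example, a composite of 7.4 maps to grade B with verdict GOOD.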


## Key Decisions


| Decision | Choice | Rationale |
|----------|--------|-----------|
| 6 dimensions | Comprehensive coverage | All quality aspects without overwhelming |
| 0-10 scale | Industry standard | Easy to understand and compare |
| Parallel assessment | 6 agents | Fast, thorough evaluation |
| Effort/Impact scoring | 1-5 scale | Simple prioritization math |


## Related Skills


- **assess-complexity** - Task complexity assessment
- **verify** - Post-implementation verification
- **code-review-playbook** - Code review patterns
- **quality-gates** - Quality gate patterns

**Version:** 1.0.0 (January 2026)