quality-auditor

Quality Auditor

Overview

Evaluates tools, frameworks, systems, and codebases against the highest industry standards across 12 weighted dimensions. Produces evidence-based scores, identifies anti-patterns, and generates prioritized improvement roadmaps. Applies extra scrutiny to AI-generated code through the verification gap protocol, ensuring velocity does not compromise integrity.

When to use: Auditing code quality, reviewing AI-generated code, scoring codebases against industry benchmarks, enforcing pre-commit quality gates, comparing tools or frameworks, assessing technical debt.

When NOT to use: Quick code reviews without scoring, style-only linting (use a linter), feature implementation, routine PR reviews that do not require a full audit.

Quick Reference

Dimension	Weight	What to Evaluate
Code Quality	10%	Structure, patterns, SOLID, duplication, complexity, error handling
Architecture	10%	Design, modularity, scalability, coupling/cohesion, API design
Documentation	10%	Completeness, clarity, accuracy, examples, troubleshooting
Usability	10%	Learning curve, installation ease, error messages, ergonomics
Performance	8%	Speed, resource usage, caching, bundle size, Core Web Vitals
Security	10%	OWASP Top 10, input validation, auth, secrets, dependencies
Testing	8%	Coverage (unit/integration/e2e), quality, automation, organization
Maintainability	8%	Technical debt, readability, refactorability, versioning
Developer Experience	10%	Setup ease, debugging, tooling, hot reload, IDE integration
Accessibility	8%	WCAG compliance, keyboard nav, screen readers, cognitive load
CI/CD	5%	Automation, pipelines, deployment, rollback, monitoring
Innovation	3%	Novel approaches, forward-thinking design, unique value
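
The weights above sum to 100%, so one plausible way to combine per-dimension scores is a plain weighted average. The sketch below assumes that reading; the actual methodology lives in the Audit Rubric reference, and the names `WEIGHTS` and `weighted_score` are illustrative, not part of the skill.

```python
# Weights copied from the Quick Reference table; they sum to 1.0.
WEIGHTS = {
    "Code Quality": 0.10,
    "Architecture": 0.10,
    "Documentation": 0.10,
    "Usability": 0.10,
    "Performance": 0.08,
    "Security": 0.10,
    "Testing": 0.08,
    "Maintainability": 0.08,
    "Developer Experience": 0.10,
    "Accessibility": 0.08,
    "CI/CD": 0.05,
    "Innovation": 0.03,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Return the weighted average of per-dimension scores (each on the 1-10 scale).

    Assumption: a simple weighted average; the skill's rubric may differ.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        # Refuse to produce an overall score until all 12 dimensions are scored.
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 2)
```

Raising on a missing dimension, rather than silently renormalizing, keeps an audit from reporting an overall score that skipped part of the table.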

Audit Phases

Phase	Name	Purpose
0	Resource Completeness	Verify registry/filesystem parity; the audit fails if this check fails
1	Discovery	Read docs, examine code, test system, review supporting materials
2	Evaluation	Score each dimension with evidence, strengths, and weaknesses
3	Synthesis	Executive summary, detailed scores, recommendations, risk matrix
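
Phase 0 can be pictured as a set comparison between what the registry lists and what exists on disk. The sketch below is only an illustration: it assumes the registry can be reduced to a set of relative POSIX-style paths, and `check_parity` and `phase0_passes` are hypothetical helpers, not functions the skill provides.

```python
from pathlib import Path


def check_parity(registry: set[str], root: Path) -> dict[str, list[str]]:
    """Compare registered resource paths against the files found under root."""
    on_disk = {
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    }
    return {
        # Listed in the registry but absent from the filesystem.
        "missing_on_disk": sorted(registry - on_disk),
        # Present on the filesystem but never registered.
        "unregistered": sorted(on_disk - registry),
    }


def phase0_passes(report: dict[str, list[str]]) -> bool:
    """Parity holds only when both difference lists are empty."""
    return not report["missing_on_disk"] and not report["unregistered"]
```

Reporting both directions of the mismatch gives the auditor concrete evidence to cite before moving on to Discovery.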

Scoring Scale

Score	Rating	Meaning
10	Exceptional	Industry-leading, sets new standards
8-9	Excellent	Exceeds expectations significantly
6-7	Good	Meets expectations with improvements needed
5	Acceptable	Below average; significant improvements needed
3-4	Poor	Major gaps and fundamental problems
1-2	Critical	Barely functional or non-functional
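
The scale maps directly to a lookup. A minimal sketch, assuming integer scores only (the table does not say how fractional scores should be rated); `rating_for` is an illustrative name.

```python
def rating_for(score: int) -> str:
    """Map an integer score from 1 to 10 to its rating in the Scoring Scale table."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score == 10:
        return "Exceptional"
    if score >= 8:
        return "Excellent"
    if score >= 6:
        return "Good"
    if score == 5:
        return "Acceptable"
    if score >= 3:
        return "Poor"
    return "Critical"
```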

Common Mistakes

Mistake	Correct Pattern
Giving inflated scores without evidence	Every score must cite specific files, metrics, or code examples as evidence
Skipping Phase 0 resource completeness check	Always verify registry completeness first; missing resources cap the overall score at 6/10
Evaluating only code quality, ignoring other dimensions	Score all 12 dimensions with appropriate weights; architecture, security, and DX carry the same 10% weight as code quality
Accepting superficial "LGTM" reviews	Perform deep semantic audits checking contract integrity, security sanitization, and performance hygiene
Trusting AI-generated code without verification	Apply the verification gap protocol: critic agents, verifiable goals, human oversight for critical paths
Proceeding after audit failure without re-audit	Stop, analyze the deviation, remediate, then restart the checklist from step 1
Using 10/10 scores without exceptional evidence	Reserve 10/10 for truly industry-leading work; most quality tools score 6-7
Surface-level static analysis only	Combine linting with architectural fit checks, risk-based PR categorization, and context-aware validation
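
Two of the rules above are mechanical enough to check in code: every score needs evidence, and incomplete resources cap the overall score at 6/10. The sketch below assumes a simple record per dimension; `DimensionScore`, `unsupported_scores`, and `cap_overall` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class DimensionScore:
    name: str
    score: int
    # Files, metrics, or code examples that justify the score.
    evidence: list[str] = field(default_factory=list)


def unsupported_scores(scores: list[DimensionScore]) -> list[str]:
    """Return the names of dimensions that were scored without any evidence."""
    return [item.name for item in scores if not item.evidence]


def cap_overall(overall: float, resources_complete: bool) -> float:
    """Apply the 6/10 cap from the table when the resource check found gaps."""
    if resources_complete:
        return overall
    return min(overall, 6.0)
```

An audit runner could refuse to publish a report while `unsupported_scores` returns a non-empty list, which turns the "no inflated scores" rule into a hard gate rather than a guideline.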

Delegation

Discover codebase structure and gather audit evidence: Use `Explore` agent to survey file organization, dependencies, test coverage, and documentation
Execute targeted quality checks across dimensions: Use `Task` agent to run linters, security scanners, performance profilers, and accessibility audits
Design quality improvement roadmap: Use `Plan` agent to prioritize quick wins, short-term, and long-term recommendations from audit findings

For stylistic cleanup of AI-generated prose and code (em dash overuse, slop vocabulary, over-commenting, verbose naming), use the `de-slopify` skill.

If the `usability-tester` skill is available, delegate usability dimension evaluation and user flow validation to it. Otherwise, recommend:
pnpm dlx skills add oakoss/agent-skills -s usability-tester -a claude-code -y

References

Audit Rubric -- pass/warn/fail thresholds, weighted scoring methodology, automated vs manual checklists, score caps, report format
Dimension Rubrics -- detailed scoring criteria, evidence requirements, and rubric tables for all 12 dimensions
Audit Report Template -- structured report format, executive summary, recommendations, risk assessment
Anti-Patterns Guide -- code, architecture, security, testing, and process anti-patterns to identify during audits
Verification Gap Protocol -- AI code verification methodology, critic agents, rejection protocol, risk-based review strategies
