agentic-quality-engineering
Agentic Quality Engineering
<default_to_action>
When implementing agentic QE or coordinating agents:
- SPAWN appropriate agent(s) for the task using the `Task` tool with the agent type
- CONFIGURE agent coordination (hierarchical/mesh/sequential)
- EXECUTE with PACT principles: Proactive analysis, Autonomous operation, Collaborative feedback, Targeted risk focus
- VALIDATE results through quality gates before deployment
- LEARN from outcomes - store patterns in the `aqe/learning/*` namespace
Quick Agent Selection:
- Test generation needed → `qe-test-generator`
- Coverage gaps → `qe-coverage-analyzer`
- Quality decision → `qe-quality-gate`
- Security scan → `qe-security-scanner`
- Performance test → `qe-performance-tester`
- Full pipeline → `qe-fleet-commander`
Critical Success Factors:
- Agents amplify human expertise, not replace it
- Human-in-the-loop for critical decisions
- Measure: bugs caught, time saved, coverage improved
</default_to_action>
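The Quick Agent Selection list can be read as a lookup from a need to an agent type. A minimal sketch, assuming a hypothetical `selectAgent` helper and illustrative `QeNeed` labels; only the agent type names come from the list above.

```typescript
// Hypothetical need labels; only the agent type names come from the list above.
type QeNeed =
  | "test-generation"
  | "coverage-gaps"
  | "quality-decision"
  | "security-scan"
  | "performance-test"
  | "full-pipeline";

const AGENT_FOR_NEED: Record<QeNeed, string> = {
  "test-generation": "qe-test-generator",
  "coverage-gaps": "qe-coverage-analyzer",
  "quality-decision": "qe-quality-gate",
  "security-scan": "qe-security-scanner",
  "performance-test": "qe-performance-tester",
  "full-pipeline": "qe-fleet-commander",
};

// Returns the agent type to spawn for a given need.
function selectAgent(need: QeNeed): string {
  return AGENT_FOR_NEED[need];
}

console.log(selectAgent("coverage-gaps")); // qe-coverage-analyzer
```

Typing the table as `Record<QeNeed, string>` makes the compiler flag any need that has no agent assigned.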
Quick Reference Card
When to Use
- Designing autonomous testing systems
- Scaling QE with intelligent agents
- Implementing multi-agent coordination
- Building CI/CD quality pipelines
PACT Principles
| Principle | Agent Behavior | Human Role |
|---|---|---|
| Proactive | Analyze pre-merge, predict risk | Set guardrails |
| Autonomous | Execute tests, fix flaky tests | Review critical |
| Collaborative | Multi-agent coordination | Provide context |
| Targeted | Risk-based prioritization | Define risk areas |
19-Agent Fleet
| Category | Agents | Primary Use |
|---|---|---|
| Core Testing (5) | test-generator, test-executor, coverage-analyzer, quality-gate, quality-analyzer | Daily testing |
| Performance/Security (2) | performance-tester, security-scanner | Non-functional |
| Strategic (3) | requirements-validator, production-intelligence, fleet-commander | Planning |
| Advanced (4) | regression-risk-analyzer, test-data-architect, api-contract-validator, flaky-test-hunter | Specialized |
| Visual/Chaos (2) | visual-tester, chaos-engineer | Edge cases |
| Deployment (1) | deployment-readiness | Release |
| Analysis (1) | code-complexity | Maintainability |
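The grouping in the table can also be kept as data, for example to check which category a given agent belongs to. A small sketch: the category and agent names are copied from the table, while the `FLEET` constant and the `categoryOf` helper are hypothetical.

```typescript
// Category and agent names copied from the fleet table; the helper is hypothetical.
const FLEET: Record<string, string[]> = {
  "Core Testing": ["test-generator", "test-executor", "coverage-analyzer", "quality-gate", "quality-analyzer"],
  "Performance/Security": ["performance-tester", "security-scanner"],
  "Strategic": ["requirements-validator", "production-intelligence", "fleet-commander"],
  "Advanced": ["regression-risk-analyzer", "test-data-architect", "api-contract-validator", "flaky-test-hunter"],
  "Visual/Chaos": ["visual-tester", "chaos-engineer"],
  "Deployment": ["deployment-readiness"],
  "Analysis": ["code-complexity"],
};

// Returns the category an agent is listed under, or undefined if it is not listed.
function categoryOf(agent: string): string | undefined {
  for (const category of Object.keys(FLEET)) {
    if (FLEET[category].indexOf(agent) !== -1) return category;
  }
  return undefined;
}

console.log(categoryOf("flaky-test-hunter")); // Advanced
```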
Coordination Patterns
Hierarchical: fleet-commander → [generators] → [executors] → quality-gate
Mesh: test-gen ↔ coverage ↔ quality (peer decisions)
Sequential: risk-analyzer → test-gen → executor → coverage → gate
Success Criteria
✅ 10x deployment frequency with same/better quality
✅ Coverage gaps detected in real-time
✅ Bugs caught pre-production
❌ Agents acting without human oversight on critical decisions
❌ Deploying all 19 agents at once (start with 1-2)
Core Concepts
QE Evolution
| Stage | Approach | Limitation |
|---|---|---|
| Traditional | Manual everything | Human bottleneck |
| Automation | Scripts + fixed scenarios | Needs orchestration |
| Agentic | AI agents + human judgment | Requires trust-building |
Core Premise: Agents amplify human expertise for 10x scale.
Key Capabilities
1. Intelligent Test Generation
```typescript
// Agent analyzes code change, generates targeted tests
const tests = await qeTestGenerator.generate(prDiff);
// → Happy path, edge cases, error handling tests
```
2. Pattern Detection - Scan logs, find anomalies, correlate errors
3. Adaptive Strategy - Adjust test focus based on risk signals
4. Root Cause Analysis - Link failures to code changes, suggest fixes
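To make "adjust test focus based on risk signals" concrete, here is a minimal sketch of one way to rank modules for testing. The signal fields, caps, and weights are illustrative assumptions, not something the skill defines.

```typescript
// All field names, caps, and weights below are illustrative assumptions.
interface RiskSignal {
  module: string;
  linesChanged: number;   // size of the code change
  recentFailures: number; // failures seen in recent runs
  coverage: number;       // current coverage, 0-100
}

// Higher score means the module should be tested first.
function riskScore(s: RiskSignal): number {
  const churn = Math.min(s.linesChanged / 100, 1);    // capped at 1
  const failures = Math.min(s.recentFailures / 5, 1); // capped at 1
  const uncovered = 1 - s.coverage / 100;
  return 0.4 * churn + 0.4 * failures + 0.2 * uncovered;
}

// Returns module names ordered from highest to lowest risk.
function prioritize(signals: RiskSignal[]): string[] {
  return signals
    .slice() // do not mutate the caller's array
    .sort((a, b) => riskScore(b) - riskScore(a))
    .map((s) => s.module);
}

const order = prioritize([
  { module: "billing", linesChanged: 20, recentFailures: 0, coverage: 90 },
  { module: "auth", linesChanged: 150, recentFailures: 3, coverage: 78 },
  { module: "search", linesChanged: 60, recentFailures: 1, coverage: 50 },
]);
console.log(order); // [ 'auth', 'search', 'billing' ]
```

In practice the weights would be set by a human, in line with the "Define risk areas" role in the PACT table.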
Agent Coordination
Memory Namespaces
aqe/test-plan/* - Test planning decisions
aqe/coverage/* - Coverage analysis results
aqe/quality/* - Quality metrics and gates
aqe/learning/* - Patterns and Q-values
aqe/coordination/* - Cross-agent state
Memory Operations (MCP Tools)
CRITICAL: Always use `mcp__agentic-qe__memory_store` with `persist: true` for learnings.
1. Store data to persistent memory:
```javascript
// Store test plan decisions (persisted to .agentic-qe/memory.db)
mcp__agentic-qe__memory_store({
  key: "aqe/test-plan/pr-123",
  namespace: "aqe/test-plan",
  value: {
    prNumber: 123,
    riskLevel: "medium",
    requiredCoverage: 85,
    testTypes: ["unit", "integration"],
    estimatedTime: 1800
  },
  persist: true, // ⚠️ REQUIRED for cross-session persistence
  ttl: 604800    // 7 days (0 = permanent)
})
```
2. Retrieve prior learnings before task:
```javascript
// Query patterns before starting test generation
const priorData = await mcp__agentic-qe__memory_retrieve({
  key: "aqe/learning/patterns/test-generation/*",
  namespace: "aqe/learning",
  includeMetadata: true
})
// Use patterns to guide current task
if (priorData.success) {
  console.log(`Loaded ${priorData.patterns.length} prior patterns`);
}
```
3. Store coverage analysis results:
```javascript
mcp__agentic-qe__memory_store({
  key: "aqe/coverage/auth-module",
  namespace: "aqe/coverage",
  value: {
    moduleId: "auth-module",
    currentCoverage: 78,
    gaps: ["error-handling", "edge-cases"],
    suggestedTests: 12,
    priority: "high"
  },
  persist: true,
  ttl: 1209600 // 14 days
})
```
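Because `persist: true` is easy to forget and `ttl` is given in seconds, one option is to build the argument object through a small helper. This is only a sketch: `buildStoreArgs` and `MemoryStoreArgs` are hypothetical names, and the field names simply mirror the examples above.

```typescript
// Hypothetical helper that builds the argument object for mcp__agentic-qe__memory_store.
// Field names (key, namespace, value, persist, ttl) mirror the examples above.
interface MemoryStoreArgs {
  key: string;
  namespace: string;
  value: unknown;
  persist: boolean;
  ttl: number; // seconds; 0 = permanent
}

const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

function buildStoreArgs(namespace: string, id: string, value: unknown, ttlDays: number): MemoryStoreArgs {
  if (namespace.indexOf("aqe/") !== 0) {
    throw new Error(`namespace must start with "aqe/": ${namespace}`);
  }
  return {
    key: `${namespace}/${id}`,
    namespace,
    value,
    persist: true, // always persist so the data survives across sessions
    ttl: ttlDays * SECONDS_PER_DAY,
  };
}

const args = buildStoreArgs("aqe/coverage", "auth-module", { currentCoverage: 78 }, 14);
console.log(args.key, args.ttl); // aqe/coverage/auth-module 1209600
```

The 14-day and 7-day values used in the examples above fall out of the same arithmetic: 14 × 86400 = 1209600 and 7 × 86400 = 604800.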
Three-Phase Memory Protocol
For coordinated multi-agent tasks, use the STATUS → PROGRESS → COMPLETE pattern:
```javascript
// PHASE 1: STATUS - Task starting
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/status",
  namespace: "aqe/coordination",
  value: { status: "running", agent: "qe-test-generator", startTime: Date.now() },
  persist: true
})
// PHASE 2: PROGRESS - Intermediate updates
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/progress",
  namespace: "aqe/coordination",
  value: { progress: 50, action: "generating-unit-tests", testsGenerated: 25 },
  persist: true
})
// PHASE 3: COMPLETE - Task finished
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/complete",
  namespace: "aqe/coordination",
  value: {
    status: "complete",
    result: "success",
    testsGenerated: 47,
    coverageAchieved: 92.3,
    duration: 15000
  },
  persist: true
})
```
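The three writes above follow a fixed order, so they can be wrapped around a unit of work. A minimal sketch, assuming a hypothetical `runWithProtocol` wrapper and a synchronous `store` callback standing in for the MCP tool (the real call is asynchronous); the key layout and value fields follow the example above.

```typescript
// Sketch only: function and type names are hypothetical; keys and fields follow the example above.
interface CoordinationRecord {
  key: string;
  namespace: string;
  value: Record<string, unknown>;
  persist: boolean;
}

type Phase = "status" | "progress" | "complete";

function phaseRecord(taskId: string, phase: Phase, value: Record<string, unknown>): CoordinationRecord {
  return {
    key: `aqe/coordination/${taskId}/${phase}`,
    namespace: "aqe/coordination",
    value,
    persist: true,
  };
}

// Runs `work`, writing STATUS first, PROGRESS on each report, and COMPLETE last.
function runWithProtocol(
  taskId: string,
  agent: string,
  store: (record: CoordinationRecord) => void,
  work: (report: (progress: number, action: string) => void) => Record<string, unknown>
): void {
  const startTime = Date.now();
  store(phaseRecord(taskId, "status", { status: "running", agent, startTime }));
  const result = work((progress, action) =>
    store(phaseRecord(taskId, "progress", { progress, action }))
  );
  store(phaseRecord(taskId, "complete", {
    status: "complete",
    ...result,
    duration: Date.now() - startTime,
  }));
}

// Usage with an in-memory array standing in for the memory store.
const written: CoordinationRecord[] = [];
runWithProtocol("task-123", "qe-test-generator", (r) => { written.push(r); }, (report) => {
  report(50, "generating-unit-tests");
  return { result: "success", testsGenerated: 47 };
});
console.log(written.map((r) => r.key));
```

This sketch does not handle the case where `work` throws; a fuller version would probably still write a final record so that other agents are not left waiting on a task that never completes.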
Blackboard Events
| Event | Trigger | Subscribers |
|---|---|---|
| | New tests created | executor, coverage |
| | Gap detected | test-generator |
| | Gate evaluated | fleet-commander |
| | Vulnerability found | quality-gate |
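The trigger and subscriber columns describe a publish/subscribe arrangement. A minimal in-process sketch of that idea follows; the `Blackboard` class is hypothetical, and the topic strings are placeholders because the actual event names are not shown in the table.

```typescript
// Placeholder topic names: the real event names are not given in the table above.
type Handler = (payload: unknown) => void;

class Blackboard {
  private handlers: { [topic: string]: Handler[] } = {};

  subscribe(topic: string, handler: Handler): void {
    if (!this.handlers[topic]) this.handlers[topic] = [];
    this.handlers[topic].push(handler);
  }

  // Delivers the payload to every subscriber of the topic; returns how many were notified.
  publish(topic: string, payload: unknown): number {
    const subscribers = this.handlers[topic] || [];
    for (const handler of subscribers) handler(payload);
    return subscribers.length;
  }
}

const board = new Blackboard();
const notified: string[] = [];
// "New tests created" notifies the executor and coverage agents.
board.subscribe("tests-created", () => { notified.push("executor"); });
board.subscribe("tests-created", () => { notified.push("coverage"); });
// "Gap detected" notifies the test generator.
board.subscribe("gap-detected", () => { notified.push("test-generator"); });

board.publish("tests-created", { count: 12 });
console.log(notified); // [ 'executor', 'coverage' ]
```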
Example: PR Quality Pipeline
```typescript
// 1. Risk analysis
const risks = await Task("Analyze PR", prDiff, "qe-regression-risk-analyzer");
// 2. Generate tests for risks
const tests = await Task("Generate tests", risks, "qe-test-generator");
// 3. Execute + analyze
const results = await Task("Run tests", tests, "qe-test-executor");
const coverage = await Task("Check coverage", results, "qe-coverage-analyzer");
// 4. Quality decision
const decision = await Task("Evaluate", {results, coverage}, "qe-quality-gate");
// → GO/NO-GO with rationale
```
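The last step returns a GO/NO-GO decision with a rationale. As an illustration of what such a decision could look like, here is a rule-based sketch; the input shapes and the `evaluateGate` function are assumptions, since the skill does not define what the executor or coverage agents return.

```typescript
// Hypothetical shapes: the skill does not define what the executor or coverage agents return.
interface TestResults { passed: number; failed: number; }
interface CoverageReport { percent: number; }
interface GateDecision { decision: "GO" | "NO-GO"; rationale: string[]; }

function evaluateGate(results: TestResults, coverage: CoverageReport, requiredCoverage: number): GateDecision {
  const rationale: string[] = [];
  if (results.failed > 0) {
    rationale.push(`${results.failed} test(s) failed`);
  }
  if (coverage.percent < requiredCoverage) {
    rationale.push(`coverage ${coverage.percent}% is below required ${requiredCoverage}%`);
  }
  if (rationale.length > 0) {
    return { decision: "NO-GO", rationale };
  }
  return { decision: "GO", rationale: ["all tests passed and coverage meets the threshold"] };
}

const gate = evaluateGate({ passed: 47, failed: 0 }, { percent: 78 }, 85);
console.log(gate.decision, gate.rationale); // NO-GO [ 'coverage 78% is below required 85%' ]
```

Returning every failed condition, not just the first, gives the human reviewer the full picture when a critical decision is escalated.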
Implementation Phases
| Phase | Duration | Goal | Agent(s) |
|---|---|---|---|
| Experiment | Weeks 1-4 | Validate one use case | 1 agent |
| Integrate | Months 2-3 | CI/CD pipeline | 3-4 agents |
| Scale | Months 4-6 | Multiple use cases | 8+ agents |
| Evolve | Ongoing | Continuous learning | Full fleet |
Phase 1 Example
```bash
# Week 1: Deploy single agent
aqe agent spawn qe-test-generator

# Weeks 2-3: Generate tests for 10 PRs
# Track: bugs found, test quality, review time

# Week 4: Measure impact
aqe agent metrics qe-test-generator
# → Tests: 150, Bugs: 12, Time saved: 8h
```
---
Limitations & Strengths
Agents Excel At
- Volume: Scan thousands of logs in seconds
- Patterns: Find correlations humans miss
- Tireless: 24/7 testing and monitoring
- Speed: Instant code change analysis
Agents Need Humans For
- Business context and priorities
- Ethical judgment and trade-offs
- Creative exploration ("what if" scenarios)
- Domain expertise (healthcare, finance, legal)
Best Practices
| Do | Don't |
|---|---|
| Start with one agent, one use case | Deploy the full fleet at once |
| Build feedback loops early | Deploy and forget |
| Human reviews agent output | Auto-merge without review |
| Measure bugs caught, time saved | Track vanity metrics (test count) |
| Build trust gradually | Give full autonomy immediately |
Trust Progression
Month 1: Agent suggests → Human decides
Month 2: Agent acts → Human reviews after
Month 3: Agent autonomous on low-risk
Month 4: Agent handles critical with oversight
Agent Coordination Hints
```yaml
coordination:
  topology: hierarchical
  commander: qe-fleet-commander
  memory_namespace: aqe/coordination
  blackboard_topic: qe-fleet
preload_skills:
  - agentic-quality-engineering  # Always (this skill)
  - risk-based-testing           # For prioritization
  - quality-metrics              # For measurement
agent_assignments:
  qe-test-generator: [api-testing-patterns, tdd-london-chicago]
  qe-coverage-analyzer: [quality-metrics, risk-based-testing]
  qe-security-scanner: [security-testing, risk-based-testing]
  qe-performance-tester: [performance-testing]
```
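One way to read these hints is that each agent loads the preloaded skills plus its own assignments. That reading is an assumption, as is the `skillsFor` helper below; the sketch mirrors the YAML as a typed object with the values copied from it.

```typescript
// Mirrors the YAML above as a typed object; the skillsFor helper and its merge rule are assumptions.
interface CoordinationHints {
  coordination: {
    topology: "hierarchical" | "mesh" | "sequential";
    commander: string;
    memory_namespace: string;
    blackboard_topic: string;
  };
  preload_skills: string[];
  agent_assignments: { [agent: string]: string[] };
}

const hints: CoordinationHints = {
  coordination: {
    topology: "hierarchical",
    commander: "qe-fleet-commander",
    memory_namespace: "aqe/coordination",
    blackboard_topic: "qe-fleet",
  },
  preload_skills: ["agentic-quality-engineering", "risk-based-testing", "quality-metrics"],
  agent_assignments: {
    "qe-test-generator": ["api-testing-patterns", "tdd-london-chicago"],
    "qe-coverage-analyzer": ["quality-metrics", "risk-based-testing"],
    "qe-security-scanner": ["security-testing", "risk-based-testing"],
    "qe-performance-tester": ["performance-testing"],
  },
};

// Skills an agent loads: the preloaded ones plus its own assignments, without duplicates.
function skillsFor(agent: string, config: CoordinationHints): string[] {
  const combined = config.preload_skills.concat(config.agent_assignments[agent] || []);
  return combined.filter((skill, index) => combined.indexOf(skill) === index);
}

console.log(skillsFor("qe-test-generator", hints));
```

Restricting `topology` to the three patterns named earlier (hierarchical, mesh, sequential) lets the compiler reject a misspelled value.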
Related Skills
- `holistic-testing-pact` - PACT principles deep dive
- `risk-based-testing` - Prioritize agent focus
- `quality-metrics` - Measure agent effectiveness
- `api-testing-patterns`, `security-testing`, `performance-testing` - Specialized testing
Resources
- Agent definitions: `.claude/agents/`
- CLI: `aqe agent --help`
- Fleet status: `aqe fleet status`
Success Metric: Deploy 10x more frequently with same or better quality through intelligent agent collaboration.