ai-agents

AI Agents Development — Production Skill Hub

Modern Best Practices (January 2026): deterministic control flow, bounded tools, auditable state, MCP-based tool integration, handoff-first orchestration, multi-layer guardrails, OpenTelemetry tracing, and human-in-the-loop controls (OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/).
This skill provides production-ready operational patterns for designing, building, evaluating, and deploying AI agents. It centralizes procedures, checklists, decision rules, and templates used across RAG agents, tool-using agents, OS agents, and multi-agent systems.
No theory. No narrative. Only operational steps and templates.

When to Use This Skill

Codex should activate this skill whenever the user asks for:
  • Designing an agent (LLM-based, tool-based, OS-based, or multi-agent).
  • Scoping capability maturity and rollout risk for new agent behaviors.
  • Creating action loops, plans, workflows, or delegation logic.
  • Writing tool definitions, MCP tools, schemas, or validation logic.
  • Generating RAG pipelines, retrieval modules, or context injection.
  • Building memory systems (session, long-term, episodic, task).
  • Creating evaluation harnesses, observability plans, or safety gates.
  • Preparing CI/CD, rollout, deployment, or production operational specs.
  • Producing any template in /references/ or /assets/.
  • Implementing MCP servers or integrating Model Context Protocol.
  • Setting up agent handoffs and orchestration patterns.
  • Configuring multi-layer guardrails and safety controls.
  • Evaluating whether to build an agent (build vs not decision).
  • Calculating agent ROI, token costs, or cost/benefit analysis.
  • Assessing hallucination risk and mitigation strategies.
  • Deciding when to kill an agent project (kill triggers).
  • For prompt scaffolds, retrieval tuning, or security depth, see Scope Boundaries below.

Scope Boundaries (Use These Skills for Depth)

  • Prompt scaffolds & structured outputs → ai-prompt-engineering
  • RAG retrieval & chunking → ai-rag
  • Search tuning (BM25/HNSW/hybrid) → ai-rag
  • Security/guardrails → ai-mlops
  • Inference optimization → ai-llm-inference

Default Workflow (Production)

  • Pick an architecture with the Decision Tree (below); default to workflow/FSM/DAG for production.
  • Draft an agent spec with assets/core/agent-template-standard.md (or assets/core/agent-template-quick.md).
  • Specify tools and handoffs with JSON Schema using assets/tools/tool-definition.md and references/api-contracts-for-agents.md.
  • Add retrieval only when needed; start with assets/rag/rag-basic.md and scale via assets/rag/rag-advanced.md + references/rag-patterns.md.
  • Add eval + telemetry early via references/evaluation-and-observability.md.
  • Run the go/no-go gate with assets/checklists/agent-safety-checklist.md.
  • Plan deploy/rollback and safety controls via references/deployment-ci-cd-and-safety.md.
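
The tool-specification step can be sketched as follows. This is a minimal illustration and not the content of assets/tools/tool-definition.md: the tool name `get_order_status`, its fields, and the hand-rolled `validate_args` helper are hypothetical, and the helper checks only `required`, `type`, and `additionalProperties` rather than full JSON Schema. The `inputSchema` key follows the field name MCP uses for a tool's input schema.

```python
from typing import Any

# Hypothetical tool definition: the tool name and its fields are illustrative only.
GET_ORDER_STATUS: dict[str, Any] = {
    "name": "get_order_status",
    "description": "Look up the status of one order by its ID.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "include_history": {"type": "boolean"},
        },
        "required": ["order_id"],
        "additionalProperties": False,
    },
}

# Maps the JSON Schema type names used above to Python types.
_PYTHON_TYPES: dict[str, type] = {"string": str, "boolean": bool, "object": dict}


def validate_args(schema: dict[str, Any], args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    errors: list[str] = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in properties:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unexpected field: {name}")
            continue
        type_name = properties[name]["type"]
        if not isinstance(value, _PYTHON_TYPES[type_name]):
            errors.append(f"field {name} should be {type_name}")
    return errors
```

Validating arguments before the tool runs keeps malformed model output from reaching a side-effecting call; a production setup would more likely rely on a full JSON Schema validator.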

Quick Reference

| Agent Type | Core Control Flow | Interfaces | MCP/A2A | When to Use |
|---|---|---|---|---|
| Workflow Agent (FSM/DAG) | Explicit state transitions | State store, tool allowlist | MCP | Deterministic, auditable flows |
| Tool-Using Agent | Route → call tool → observe | Tool schemas, retries/timeouts | MCP | External actions (APIs, DB, files) |
| RAG Agent | Retrieve → answer → cite | Retriever, citations, ACLs | MCP | Knowledge-grounded responses |
| Planner/Executor | Plan → execute steps with caps | Planner prompts, step budget | MCP (+A2A) | Multi-step problems with bounded autonomy |
| Multi-Agent (Orchestrated) | Delegate → merge → validate | Handoff contracts, eval gates | A2A | Specialization with explicit handoffs |
| OS Agent | Observe UI → act → verify | Sandbox, UI grounding | MCP | Desktop/browser control under strict guardrails |
| Code/SWE Agent | Branch → edit → test → PR | Repo access, CI gates | MCP | Coding tasks with review/merge controls |

Framework Selection (2026)

| Framework | Architecture | Best For | Ease |
|---|---|---|---|
| LangGraph | Graph-based, stateful | Enterprise, compliance, auditability | Medium |
| OpenAI Agents SDK | Tool-centric, lightweight | Fast prototyping, OpenAI ecosystem | Easy |
| Google ADK | Code-first, multi-language | Gemini/Vertex AI, polyglot teams | Medium |
| Pydantic AI | Type-safe, graph FSM | Production Python, type safety | Medium |
| CrewAI | Role-based crews | Team workflows, content generation | Easiest |
| AutoGen | Conversational | Code generation, research | Medium |
| AWS Bedrock Agents | Managed infrastructure | Enterprise AWS, knowledge bases | Easy |

See references/modern-best-practices.md for a detailed framework comparison and selection guide.

Decision Tree: Choosing Agent Architecture

What does the agent need to do?
    ├─ Answer questions from knowledge base?
    │   ├─ Simple lookup? → RAG Agent (LangChain/LlamaIndex + vector DB)
    │   └─ Complex multi-step? → Agentic RAG (iterative retrieval + reasoning)
    ├─ Perform external actions (APIs, tools, functions)?
    │   ├─ 1-3 tools, linear flow? → Tool-Using Agent (LangGraph + MCP)
    │   └─ Complex workflows, branching? → Planning Agent (ReAct/Plan-Execute)
    ├─ Write/modify code autonomously?
    │   ├─ Single file edits? → Tool-Using Agent with code tools
    │   └─ Multi-file, issue resolution? → Code/SWE Agent (HyperAgent pattern)
    ├─ Delegate tasks to specialists?
    │   ├─ Fixed workflow? → Multi-Agent Sequential (A → B → C)
    │   ├─ Manager-Worker? → Multi-Agent Hierarchical (Manager + Workers)
    │   └─ Dynamic routing? → Multi-Agent Group Chat (collaborative)
    ├─ Control desktop/browser?
    │   └─ OS Agent (Anthropic Computer Use + MCP for system access)
    └─ Hybrid (combination of above)?
        └─ Planning Agent that coordinates:
            - Tool-using for actions (MCP)
            - RAG for knowledge (MCP)
            - Multi-agent for delegation (A2A)
            - Code agents for implementation
Protocol Selection:
  • Use MCP for: Tool access, data retrieval, single-agent integration
  • Use A2A for: Agent-to-agent handoffs, multi-agent coordination, task delegation
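
The first two levels of the tree and the protocol rule can be encoded as lookup tables, which makes the choice testable. This is only a sketch: the string keys (`"knowledge"`, `"simple_lookup"`, and so on) and both function names are hypothetical labels invented here, and treating any uncovered combination as the hybrid branch is an assumption.

```python
# Hypothetical (need, detail) keys encoding the first two levels of the tree above.
ARCHITECTURE_BY_NEED: dict[tuple[str, str], str] = {
    ("knowledge", "simple_lookup"): "RAG Agent",
    ("knowledge", "multi_step"): "Agentic RAG",
    ("actions", "few_tools_linear"): "Tool-Using Agent",
    ("actions", "branching"): "Planning Agent",
    ("code", "single_file"): "Tool-Using Agent with code tools",
    ("code", "multi_file"): "Code/SWE Agent",
    ("delegate", "fixed_workflow"): "Multi-Agent Sequential",
    ("delegate", "manager_worker"): "Multi-Agent Hierarchical",
    ("delegate", "dynamic_routing"): "Multi-Agent Group Chat",
    ("desktop", "any"): "OS Agent",
}

# Interaction kinds taken from the Protocol Selection bullets above.
PROTOCOL_BY_INTERACTION: dict[str, str] = {
    "tool_access": "MCP",
    "data_retrieval": "MCP",
    "single_agent_integration": "MCP",
    "agent_handoff": "A2A",
    "multi_agent_coordination": "A2A",
    "task_delegation": "A2A",
}


def choose_architecture(need: str, detail: str) -> str:
    """Look up an architecture; uncovered cases fall to the hybrid branch (an assumption)."""
    return ARCHITECTURE_BY_NEED.get((need, detail), "Planning Agent (hybrid coordinator)")


def choose_protocol(interaction: str) -> str:
    """Return "MCP" or "A2A"; unknown interaction kinds are rejected, not guessed."""
    if interaction not in PROTOCOL_BY_INTERACTION:
        raise ValueError(f"unknown interaction: {interaction}")
    return PROTOCOL_BY_INTERACTION[interaction]
```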

Core Concepts (Vendor-Agnostic)

Control Flow Options

  • Reactive: direct tool routing per user request (fast, brittle if unbounded).
  • Workflow (FSM/DAG): explicit states and transitions (default for deterministic production).
  • Planner/Executor: plan with strict budgets, then execute step-by-step (use when branching is unavoidable).
  • Orchestrated multi-agent: separate roles with validated handoffs (use when specialization is required).
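
The Workflow (FSM/DAG) option can be sketched as an explicit transition table. The state names and the `Run` class are hypothetical; the point is only that every legal move is enumerated up front and the run keeps an auditable history.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RECEIVED = "received"
    RETRIEVED = "retrieved"
    DRAFTED = "drafted"
    APPROVED = "approved"
    FAILED = "failed"


# Every legal transition is listed explicitly; anything else is rejected.
TRANSITIONS: dict[State, set[State]] = {
    State.RECEIVED: {State.RETRIEVED, State.FAILED},
    State.RETRIEVED: {State.DRAFTED, State.FAILED},
    State.DRAFTED: {State.APPROVED, State.FAILED},
    State.APPROVED: set(),  # terminal
    State.FAILED: set(),    # terminal
}


@dataclass
class Run:
    state: State = State.RECEIVED
    history: list[State] = field(default_factory=lambda: [State.RECEIVED])

    def advance(self, target: State) -> None:
        """Move to `target` if the table allows it, recording the step."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.state = target
        self.history.append(target)
```

Because the model can only request a transition and never invent one, a run that reaches `APPROVED` is guaranteed to have passed through every earlier state.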

Memory Types (Tradeoffs)

  • Short-term (session): cheap, ephemeral; best for conversational continuity.
  • Episodic (task): scoped to a case/ticket; supports audit and replay.
  • Long-term (profile/knowledge): high risk; requires consent, retention limits, and provenance.

Failure Handling (Production Defaults)

  • Classify errors: retriable vs fatal vs needs-human.
  • Bound retries: max attempts, backoff, jitter; avoid retry storms.
  • Fallbacks: degraded mode, smaller model, cached answers, or safe refusal.
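
These three defaults fit in one small wrapper. A minimal sketch, assuming the caller maps its own errors onto the three hypothetical exception classes below; the default attempt count and delays are placeholders, not recommendations from this skill.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetriableError(Exception):
    """Transient failure such as a timeout or rate limit."""


class FatalError(Exception):
    """Failure that retrying cannot fix, such as a rejected request."""


class NeedsHumanError(Exception):
    """Failure that should be escalated to a person."""


def call_with_retries(
    fn: Callable[[], T],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    fallback: Optional[Callable[[], T]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry only RetriableError; fatal and needs-human errors propagate at once."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetriableError:
            if attempt == max_attempts:
                break
            # Exponential backoff capped at max_delay, with full jitter so
            # many clients do not retry in lockstep.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
    if fallback is not None:
        return fallback()  # degraded mode: cached answer, smaller model, refusal
    raise RetriableError(f"gave up after {max_attempts} attempts")
```

The `sleep` parameter exists so tests can pass a no-op instead of waiting.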

Do / Avoid

Do
  • Do keep state explicit and serializable (replayable runs).
  • Do enforce tool allowlists, scopes, and idempotency for side effects.
  • Do log traces/metrics for model calls and tool calls (OpenTelemetry GenAI semantic conventions: https://opentelemetry.io/docs/specs/semconv/gen-ai/).
Avoid
  • Avoid runaway autonomy (unbounded loops or step counts).
  • Avoid hidden state (implicit memory that cannot be audited).
  • Avoid untrusted tool outputs without validation/sanitization.
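
The allowlist and idempotency items can be illustrated with a small gateway that sits between the agent and its tools. `ToolGateway` and its key scheme are hypothetical; the in-memory result cache stands in for whatever durable store a real deployment would use.

```python
import hashlib
import json
from typing import Any, Callable


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside its allowlist."""


class ToolGateway:
    def __init__(self, allowlist: dict[str, Callable[..., Any]]) -> None:
        self._tools = dict(allowlist)
        self._results: dict[str, Any] = {}  # in-memory only; assumed durable in production

    def call(self, run_id: str, tool: str, args: dict[str, Any]) -> Any:
        if tool not in self._tools:
            raise ToolNotAllowed(tool)
        key = self._idempotency_key(run_id, tool, args)
        if key in self._results:
            # Same run, same tool, same arguments: return the recorded
            # result instead of repeating the side effect.
            return self._results[key]
        result = self._tools[tool](**args)
        self._results[key] = result
        return result

    @staticmethod
    def _idempotency_key(run_id: str, tool: str, args: dict[str, Any]) -> str:
        payload = json.dumps({"run": run_id, "tool": tool, "args": args}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With this in place, a retried step that re-issues an identical `send_email` call within one run does not send a second message.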

Navigation: Economics & Decision Framework

Should You Build an Agent?

  • Build vs Not Decision Framework -
    references/build-vs-not-decision.md
    • 10-second test (volume, cost, error tolerance)
    • Red flags and immediate disqualifiers
    • Alternatives to agents (usually better)
    • Full decision tree with stage gates
    • Kill triggers during development and post-launch
    • Pre-build validation checklist

Agent ROI & Token Economics

  • Agent Economics -
    references/agent-economics.md
    • Token pricing by model (January 2026)
    • Cost per task by agent type
    • ROI calculation formula and tiers
    • Hallucination cost framework and mitigation ROI
    • Investment decision matrix
    • Monthly tracking dashboard
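
For orientation, the arithmetic behind cost per task is simple enough to sketch. Everything here is an assumption: `ModelPrice`, the function names, and the generic (benefit - cost) / cost ROI definition are illustrative and may differ from the formula in references/agent-economics.md, and no real model prices are implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Prices in currency units per one million tokens (placeholder values only)."""
    input_per_million: float
    output_per_million: float


def cost_per_task(
    price: ModelPrice, input_tokens: int, output_tokens: int, calls_per_task: int = 1
) -> float:
    """Token cost of one task, assuming every model call has the same token counts."""
    per_call = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000
    return per_call * calls_per_task


def simple_roi(monthly_benefit: float, monthly_cost: float) -> float:
    """Generic ROI: net gain divided by cost."""
    return (monthly_benefit - monthly_cost) / monthly_cost
```

For example, with a placeholder price of 2.0 per million input tokens and 8.0 per million output tokens, a task of three calls at 10,000 input and 1,000 output tokens each costs 3 × (0.020 + 0.008) = 0.084.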

Navigation: Core Concepts & Patterns

Governance & Maturity

  • Agent Maturity & Governance -
    references/agent-maturity-governance.md
    • Capability maturity levels (L0-L4)
    • Identity & policy enforcement
    • Fleet control and registry management
    • Deprecation rules and kill switches

Modern Best Practices

  • Modern Best Practices -
    references/modern-best-practices.md
    • Model Context Protocol (MCP)
    • Agent-to-Agent Protocol (A2A)
    • Agentic RAG (Dynamic Retrieval)
    • Multi-layer guardrails
    • LangGraph over LangChain
    • OpenTelemetry for agents

Context Management

  • Context Engineering -
    references/context-engineering.md
    • Progressive disclosure
    • Session management
    • Memory provenance
    • Retrieval timing
    • Multimodal context

Core Operational Patterns

  • Operational Patterns -
    references/operational-patterns.md
    • Agent loop pattern (PLAN → ACT → OBSERVE → UPDATE)
    • OS agent action loop
    • RAG pipeline pattern
    • Tool specification
    • Memory system pattern
    • Multi-agent workflow
    • Safety & guardrails
    • Observability
    • Evaluation patterns
    • Deployment & CI/CD
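
The PLAN → ACT → OBSERVE → UPDATE loop named above can be sketched with a hard step budget. This is not the pattern from references/operational-patterns.md, whose details are not shown here; `AgentState`, `run_agent_loop`, and the callable signatures are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class AgentState:
    goal: str
    observations: list[dict[str, Any]] = field(default_factory=list)
    done: bool = False


def run_agent_loop(
    state: AgentState,
    plan: Callable[[AgentState], Optional[dict[str, Any]]],
    act: Callable[[dict[str, Any]], Any],
    max_steps: int = 5,
) -> AgentState:
    """Run at most `max_steps` iterations; `plan` returns None when finished."""
    for _ in range(max_steps):
        action = plan(state)                                # PLAN
        if action is None:
            state.done = True
            return state
        result = act(action)                                # ACT
        observation = {"action": action, "result": result}  # OBSERVE
        state.observations.append(observation)              # UPDATE
    # Budget exhausted: `done` stays False so the caller can escalate.
    return state
```

Returning with `done` still `False` gives the caller an explicit signal to fall back or hand off to a human rather than looping further.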

Navigation: Protocol Implementation

  • MCP Practical Guide -
    references/mcp-practical-guide.md
    Building MCP servers, tool integration, and standardized data access
  • MCP Server Builder -
    references/mcp-server-builder.md
    End-to-end checklist for workflow-focused MCP servers (design → build → test)
  • A2A Handoff Patterns -
    references/a2a-handoff-patterns.md
    Agent-to-agent communication, task delegation, and coordination protocols
  • Protocol Decision Tree -
    references/protocol-decision-tree.md
    When to use MCP vs A2A, decision framework, and selection criteria

Navigation: Agent Capabilities

  • Agent Operations -
    references/agent-operations-best-practices.md
    Action loops, planning, observation, and execution patterns
  • RAG Patterns -
    references/rag-patterns.md
    Contextual retrieval, agentic RAG, and hybrid search strategies
  • Memory Systems -
    references/memory-systems.md
    Session, long-term, episodic, and task memory architectures
  • Tool Design & Validation -
    references/tool-design-specs.md
    Tool schemas, validation, error handling, and MCP integration

Skill Packaging & Sharing

  • Skill Lifecycle -
    references/skill-lifecycle.md
    Scaffold, validate, package, and share skills with teams (Slack-ready)
  • API Contracts for Agents -
    references/api-contracts-for-agents.md
    Request/response envelopes, safety gates, streaming/async patterns, error taxonomy
  • Multi-Agent Patterns -
    references/multi-agent-patterns.md
    Manager-worker, sequential, handoff, and group chat orchestration
  • OS Agent Capabilities -
    references/os-agent-capabilities.md
    Desktop automation, UI grounding, and computer use patterns
  • Code/SWE Agents -
    references/code-swe-agents.md
    SE 3.0 paradigm, autonomous coding patterns, SWE-Bench, HyperAgent architecture

Navigation: Production Operations

  • Evaluation & Observability -
    references/evaluation-and-observability.md
    OpenTelemetry GenAI, metrics, LLM-as-judge, and monitoring
  • Deployment, CI/CD & Safety -
    references/deployment-ci-cd-and-safety.md
    Multi-layer guardrails, HITL controls, NIST AI RMF, production checklists

Navigation: Templates (Copy-Paste Ready)

Checklists

  • Agent Design & Safety Checklist -
    assets/checklists/agent-safety-checklist.md
    Go/No-Go safety gate: permissions, HITL triggers, eval gates, observability, rollback

Core Agent Templates

  • Standard Agent Template -
    assets/core/agent-template-standard.md
    Full production spec: memory, tools, RAG, evaluation, observability, safety
  • Specialized Agent Template -
    assets/core/agent-template-specialized.md
    Domain-specific agents with custom capabilities and constraints
  • Quick Agent Template -
    assets/core/agent-template-quick.md
    Minimal viable agent for rapid prototyping

RAG Templates

  • Basic RAG -
    assets/rag/rag-basic.md
    Simple retrieval-augmented generation pipeline
  • Advanced RAG -
    assets/rag/rag-advanced.md
    Contextual retrieval, reranking, and agentic RAG patterns
  • Hybrid Retrieval -
    assets/rag/hybrid-retrieval.md
    Semantic + keyword search with BM25 fusion
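
One common way to merge a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). The template's own fusion method is not stated here, so RRF is a swapped-in example; the function name and document IDs are hypothetical, and `k = 60` is just the conventional default constant.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # keyword ranking (illustrative IDs)
vector_hits = ["b", "c", "d"]  # semantic ranking (illustrative IDs)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

RRF uses only rank positions, so BM25 scores and embedding similarities never need to be put on a common scale.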

Tool Templates

  • Tool Definition -
    assets/tools/tool-definition.md
    MCP-compatible tool schemas with validation and error handling
  • Tool Validation Checklist -
    assets/tools/tool-validation-checklist.md
    Testing, security, and production readiness checks

Multi-Agent Templates

  • Manager-Worker Template -
    assets/multi-agent/manager-worker-template.md
    Orchestration pattern with task delegation and result aggregation
  • Evaluator-Router Template -
    assets/multi-agent/evaluator-router-template.md
    Dynamic routing with quality assessment and domain classification

Service Layer Templates

  • FastAPI Agent Service -
    ../dev-api-design/assets/fastapi/fastapi-complete-api.md
    Auth, pagination, validation, error handling; extend with model lifespan loads, SSE, background tasks

External Sources Metadata

  • Curated References -
    data/sources.json
    Authoritative sources spanning standards, protocols, and production agent frameworks

Shared Utilities (Centralized patterns — extract, don't duplicate)

  • ../software-clean-code-standard/utilities/llm-utilities.md - Token counting, streaming, cost estimation
  • ../software-clean-code-standard/utilities/error-handling.md - Effect Result types, correlation IDs
  • ../software-clean-code-standard/utilities/resilience-utilities.md - p-retry v6, circuit breaker for API calls
  • ../software-clean-code-standard/utilities/logging-utilities.md - pino v9 + OpenTelemetry integration
  • ../software-clean-code-standard/utilities/observability-utilities.md - OpenTelemetry SDK, tracing, metrics
  • ../software-clean-code-standard/utilities/testing-utilities.md - Test factories, fixtures, mocks
  • ../software-clean-code-standard/references/clean-code-standard.md - Canonical clean code rules (CC-*) for citation

Trend Awareness Protocol

IMPORTANT: When users ask recommendation questions about AI agents, you MUST use WebSearch to check current trends before answering. If WebSearch is unavailable, use data/sources.json + any available web browsing tools, and explicitly state what you verified vs assumed.

Trigger Conditions

  • "What's the best agent framework for [use case]?"
  • "What should I use for [multi-agent/tool use/orchestration]?"
  • "What's the latest in AI agents?"
  • "Current best practices for [agent architecture/MCP/A2A]?"
  • "Is [LangGraph/CrewAI/AutoGen] still relevant in 2026?"
  • "[Agent framework A] vs [Agent framework B]?"
  • "Best way to build [coding agent/RAG agent/OS agent]?"
  • "What MCP servers are available?"

Required Searches

  1. Search: "AI agent frameworks best practices 2026"
  2. Search: "[LangGraph/CrewAI/AutoGen/Semantic Kernel] comparison 2026"
  3. Search: "AI agent trends January 2026"
  4. Search: "MCP servers available 2026"

What to Report

After searching, provide:
  • Current landscape: What agent frameworks are popular NOW
  • Emerging trends: New patterns gaining traction (MCP, A2A, agentic coding)
  • Deprecated/declining: Frameworks or patterns losing relevance
  • Recommendation: Based on fresh data, not just static knowledge

Example Topics (verify with fresh search)

  • Agent frameworks (LangGraph, CrewAI, AutoGen, Semantic Kernel, Pydantic AI)
  • MCP ecosystem (available servers, new integrations)
  • Agentic coding (Codex CLI, Claude Code, Cursor, Windsurf, Cline)
  • Multi-agent patterns (hierarchical, collaborative, competitive)
  • Tool use protocols (MCP, function calling)
  • Agent evaluation (SWE-Bench, AgentBench, GAIA)
  • OS/computer use agents (computer-use APIs, browser automation)

Related Skills

This skill integrates with complementary skills:

Core Dependencies

  • ../ai-llm/
    - LLM patterns, prompt engineering, and model selection for agents
  • ../ai-rag/
    - Deep RAG implementation: chunking, embedding, reranking
  • ../ai-prompt-engineering/
    - System prompt design, few-shot patterns, reasoning strategies

Production & Operations

  • ../qa-observability/
    - OpenTelemetry, metrics, distributed tracing
  • ../software-security-appsec/
    - OWASP Top 10, input validation, secure tool design
  • ../ops-devops-platform/
    - CI/CD pipelines, deployment strategies, infrastructure

Supporting Patterns

  • ../dev-api-design/
    - REST/GraphQL design for agent APIs and tool interfaces
  • ../ai-mlops/
    - Model deployment, monitoring, drift detection
  • ../qa-debugging/
    - Agent debugging, error analysis, root cause investigation
Usage pattern: Start here for agent architecture, then reference specialized skills for deep implementation details.

Usage Notes

  • Modern Standards: Default to MCP for tools, agentic RAG for retrieval, handoff-first for multi-agent
  • Lightweight SKILL.md: Use this file for quick reference and navigation
  • Drill-down resources: Reference detailed resources for implementation guidance
  • Copy-paste templates: Use templates when the user asks for structured artifacts
  • External sources: Reference data/sources.json for authoritative documentation links
  • No theory: Never include theoretical explanations; only operational steps

Key Modern Migrations

Traditional → Modern:
  • Custom APIs → Model Context Protocol (MCP)
  • Static RAG → Agentic RAG with contextual retrieval
  • Ad-hoc handoffs → Versioned handoff APIs with JSON Schema
  • Single guardrail → Multi-layer defense (5+ layers)
  • LangChain agents → LangGraph stateful workflows
  • Custom observability → OpenTelemetry GenAI standards
  • Model-centric → Context engineering-centric
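
As an illustration of the "versioned handoff" migration, the envelope below carries an explicit schema version that the receiving agent checks before acting. The field names, the `"1.0"` version string, and `parse_handoff` are hypothetical, and the checks are hand-written rather than driven by a JSON Schema document.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any

SUPPORTED_VERSIONS = {"1.0"}  # hypothetical version identifier


@dataclass(frozen=True)
class Handoff:
    schema_version: str
    from_agent: str
    to_agent: str
    task: str
    context: dict[str, Any]


def parse_handoff(raw: str) -> Handoff:
    """Parse and validate an incoming handoff, rejecting unknown shapes or versions."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("handoff must be a JSON object")
    expected = {f.name for f in fields(Handoff)}
    if set(data) != expected:
        raise ValueError(f"fields must be exactly {sorted(expected)}")
    if data["schema_version"] not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schema_version: {data['schema_version']}")
    return Handoff(**data)


# Round trip: the sending agent serializes, the receiving agent validates.
outgoing = Handoff("1.0", "triage", "billing", "refund order 42", {"order_id": "42"})
incoming = parse_handoff(json.dumps(asdict(outgoing)))
```

Rejecting unknown versions outright, rather than best-effort parsing, keeps a receiving agent from acting on a payload whose meaning has changed.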

AI-Native SDLC Template

  • Use assets/agent-template-ainative-sdlc.md for the Delegate → Review → Own runbook (guardrails + outputs checklist).