google-adk

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Google Agent Development Kit (ADK) Skill

Google Agent Development Kit (ADK) 技能

Purpose

用途

Provide specialized guidance for developing agentic applications and multi-agent systems using Google's Agent Development Kit (ADK). Enable AI assistants to design agents, build tools, orchestrate multi-agent workflows, implement memory/state management, and deploy agent-based applications following code-first development patterns.
提供使用Google Agent Development Kit (ADK)开发智能体应用和多智能体系统的专业指导。支持AI助手遵循代码优先的开发模式,进行智能体设计、工具构建、多智能体工作流编排、内存/状态管理以及智能体应用部署。

When to Use This Skill

何时使用此技能

Invoke this skill when:
  • Building conversational AI agents with tool integration
  • Creating multi-agent orchestration systems
  • Developing workflow agents (sequential, parallel, iterative)
  • Implementing custom tools for agents
  • Designing agent architectures for complex tasks
  • Deploying agent applications to production
  • Evaluating agent performance and behavior
  • Implementing human-in-the-loop patterns
Do NOT use this skill for:
  • Generic Python development (use Python-specific skills)
  • Simple REST API development (ADK is for agentic systems)
  • Frontend development (ADK is backend agent framework)
  • Direct LLM API usage without agent orchestration (use LLM provider SDKs)
  • Non-Python agent frameworks (LangChain, CrewAI, AutoGPT - different patterns)
在以下场景调用此技能:
  • 构建集成工具的对话式AI智能体
  • 创建多智能体编排系统
  • 开发工作流智能体(顺序、并行、迭代型)
  • 为智能体实现自定义工具
  • 为复杂任务设计智能体架构
  • 将智能体应用部署到生产环境
  • 评估智能体的性能与行为
  • 实现人在回路(Human-in-the-loop)模式
请勿在以下场景使用此技能:
  • 通用Python开发(使用Python专属技能)
  • 简单REST API开发(ADK专为智能体系统设计)
  • 前端开发(ADK是后端智能体框架)
  • 无智能体编排的直接LLM API调用(使用LLM提供商SDK)
  • 非Python智能体框架(LangChain、CrewAI、AutoGPT - 模式不同)

Core ADK Concepts

ADK核心概念

Platform Architecture

平台架构

Framework Philosophy:
  • Code-first approach - Define agents in Python code (not YAML/JSON configs)
  • Model-agnostic - Optimized for Gemini but supports other LLMs
  • Composable - Build complex systems from simple agent primitives
  • Observable - Built-in integration with tracing and monitoring tools
Supported Languages:
  • Python (primary, most mature) -
    google-adk
    package
  • Go (available) -
    adk-go
    repository
  • Java (available) -
    adk-java
    repository
Runtime Environment:
  • Python 3.9+ required
  • Agent Engine for deployment (containerized execution)
  • Web UI for development/testing (Angular + FastAPI)
  • CLI for evaluation and deployment operations
框架理念:
  • 代码优先方法 - 在Python代码中定义智能体(而非YAML/JSON配置)
  • 模型无关 - 针对Gemini优化,但支持其他LLM
  • 可组合 - 从简单智能体原语构建复杂系统
  • 可观测 - 内置追踪与监控工具集成
支持语言:
  • Python(主要,最成熟)-
    google-adk
  • Go(可用)-
    adk-go
    仓库
  • Java(可用)-
    adk-java
    仓库
运行环境:
  • 要求Python 3.9+
  • 用于部署的Agent Engine(容器化执行)
  • 用于开发/测试的Web UI(Angular + FastAPI)
  • 用于评估与部署操作的CLI

Agent Types and Hierarchy

智能体类型与层级

1. LlmAgent (Dynamic, model-driven)
Use for:
  • Conversational interfaces
  • Decision-making with uncertainty
  • Natural language understanding
  • Creative tasks (content generation)
  • Contextual reasoning
Characteristics:
  • Uses LLM for decision-making
  • Non-deterministic execution
  • Tool selection driven by model
  • Handles ambiguous inputs
2. Workflow Agents (Deterministic, programmatic)
Sequential Agent:
  • Executes tools in fixed order
  • Use for: Multi-step processes with dependencies
  • Example: Data pipeline (fetch → transform → load)
Parallel Agent:
  • Executes multiple tools concurrently
  • Use for: Independent operations requiring aggregation
  • Example: Multi-source data gathering
Loop Agent:
  • Repeats execution until condition met
  • Use for: Iterative refinement, convergence tasks
  • Example: Generator-critic pattern
3. Custom Agents (User-defined logic)
Use for:
  • Domain-specific orchestration
  • Complex state machines
  • Integration with existing systems
  • Specialized execution patterns
Agent Composition:
  • Agents can contain sub-agents (hierarchical)
  • Parent agent coordinates child agents
  • Supports multi-level nesting
1. LlmAgent(动态,模型驱动)
适用场景:
  • 对话式界面
  • 带不确定性的决策
  • 自然语言理解
  • 创意任务(内容生成)
  • 上下文推理
特点:
  • 使用LLM进行决策
  • 非确定性执行
  • 工具选择由模型驱动
  • 处理模糊输入
2. 工作流智能体(确定性,程序化)
顺序智能体:
  • 按固定顺序执行工具
  • 适用场景:带依赖的多步骤流程
  • 示例:数据管道(获取 → 转换 → 加载)
并行智能体:
  • 同时执行多个工具
  • 适用场景:需要聚合的独立操作
  • 示例:多源数据收集
循环智能体:
  • 重复执行直到满足条件
  • 适用场景:迭代优化、收敛任务
  • 示例:生成器-批评者模式
3. 自定义智能体(用户定义逻辑)
适用场景:
  • 领域专属编排
  • 复杂状态机
  • 与现有系统集成
  • 特殊执行模式
智能体组合:
  • 智能体可包含子智能体(层级结构)
  • 父智能体协调子智能体
  • 支持多层嵌套

Tool Ecosystem

工具生态

Tool Categories:
  1. Built-in Tools:
    • Search - Web search via Google Search API
    • Code Execution - Python code interpreter (sandboxed)
    • Google Cloud tools - Vertex AI, BigQuery, Cloud Storage
  2. Custom Function Tools:
    • Python functions wrapped as tools
    • Automatic schema generation from type hints
    • Supports async functions
  3. OpenAPI Tools:
    • Auto-generate from OpenAPI/Swagger specs
    • HTTP-based service integration
  4. MCP (Model Context Protocol) Tools:
    • Integration with MCP servers
    • Cross-framework tool sharing
Tool Attributes:
  • Name - Unique identifier
  • Description - Natural language explanation for LLM
  • Parameters - JSON schema defining inputs
  • Function - Execution logic
  • Confirmation - Optional human-in-the-loop approval
工具类别:
  1. 内置工具:
    • Search - 通过Google Search API实现网页搜索
    • Code Execution - Python代码解释器(沙箱环境)
    • Google Cloud工具 - Vertex AI、BigQuery、Cloud Storage
  2. 自定义函数工具:
    • 包装为工具的Python函数
    • 从类型提示自动生成Schema
    • 支持异步函数
  3. OpenAPI工具:
    • 从OpenAPI/Swagger规范自动生成
    • 基于HTTP的服务集成
  4. MCP(Model Context Protocol)工具:
    • 与MCP服务器集成
    • 跨框架工具共享
工具属性:
  • Name - 唯一标识符
  • Description - 供LLM使用的自然语言说明
  • Parameters - 定义输入的JSON Schema
  • Function - 执行逻辑
  • Confirmation - 可选的人在回路审批

Memory and State Management

内存与状态管理

Session Management:
  • Agent maintains conversation history
  • Automatic context window management
  • Configurable history retention
State Persistence:
  • Custom state objects per agent
  • Serialization support (JSON, pickle)
  • Database integration for long-term storage
Context Caching:
  • Reduces token usage for repeated context
  • Automatic cache invalidation
  • Configurable cache TTL
会话管理:
  • 智能体维护对话历史
  • 自动上下文窗口管理
  • 可配置的历史保留规则
状态持久化:
  • 每个智能体的自定义状态对象
  • 序列化支持(JSON、pickle)
  • 用于长期存储的数据库集成
上下文缓存:
  • 减少重复上下文的令牌使用
  • 自动缓存失效
  • 可配置的缓存TTL

Agent Development Methodology

智能体开发方法论

Planning Phase

规划阶段

Step 1: Define Agent Purpose
  • Primary objective (single responsibility)
  • Input/output format
  • Success criteria
  • Failure modes
Step 2: Identify Required Tools
Decision criteria:
  • Use built-in tools when available (Search, Code Execution)
  • Create custom functions for simple operations (<100 lines)
  • Use OpenAPI tools for existing REST APIs
  • Use MCP tools for cross-framework compatibility
Step 3: Select Agent Type
START: What's the agent's decision pattern?
  ├─> Requires natural language reasoning? ─Yes─> LlmAgent ★
  ├─> Fixed sequence of steps?
  │   └─> Sequential Workflow Agent ★
  ├─> Independent parallel operations?
  │   └─> Parallel Workflow Agent ★
  ├─> Iterative refinement needed?
  │   └─> Loop Workflow Agent ★
  └─> Custom orchestration logic?
      └─> Custom Agent ★
Step 4: Design Multi-Agent Architecture (if needed)
Patterns:
  • Coordinator/Dispatcher - Central agent routes to specialists
  • Sequential Pipeline - Output of Agent A → Input of Agent B
  • Parallel Fan-Out/Gather - Distribute work, aggregate results
  • Hierarchical Decomposition - Break complex task into subtasks
步骤1:定义智能体用途
  • 主要目标(单一职责)
  • 输入/输出格式
  • 成功标准
  • 失败模式
步骤2:确定所需工具
决策标准:
  • 优先使用内置工具(Search、Code Execution)
  • 为简单操作(少于100行)创建自定义函数
  • 为现有REST API使用OpenAPI工具
  • 为跨框架兼容性使用MCP工具
步骤3:选择智能体类型
开始:智能体的决策模式是什么?
  ├─> 需要自然语言推理? ─是─> LlmAgent ★
  ├─> 固定步骤序列?
  │   └─> 顺序工作流智能体 ★
  ├─> 独立并行操作?
  │   └─> 并行工作流智能体 ★
  ├─> 需要迭代优化?
  │   └─> 循环工作流智能体 ★
  └─> 自定义编排逻辑?
      └─> 自定义智能体 ★
步骤4:设计多智能体架构(如需要)
模式:
  • 协调器/调度器 - 中央智能体将请求路由到专业智能体
  • 顺序流水线 - 智能体A的输出 → 智能体B的输入
  • 并行扇出/聚合 - 分配任务,聚合结果
  • 层级分解 - 将复杂任务拆分为子任务

Implementation Phase

实现阶段

Agent Implementation Examples:
[See Code Examples: examples/google_adk_agent_implementation.py]
Key agent patterns demonstrated:
  • LlmAgent -
    create_weather_assistant()
    - Conversational agent with custom tools
  • SequentialAgent -
    create_data_pipeline()
    - Ordered execution (fetch → transform → save)
  • ParallelAgent -
    create_market_researcher()
    - Concurrent tool execution
  • LoopAgent -
    create_content_generator()
    - Iterative refinement with break conditions
  • Session Management -
    create_stateful_session()
    - Multi-turn conversation with history
智能体实现示例:
[查看代码示例:examples/google_adk_agent_implementation.py]
展示的核心智能体模式:
  • LlmAgent -
    create_weather_assistant()
    - 集成自定义工具的对话式智能体
  • SequentialAgent -
    create_data_pipeline()
    - 有序执行(获取 → 转换 → 保存)
  • ParallelAgent -
    create_market_researcher()
    - 并发工具执行
  • LoopAgent -
    create_content_generator()
    - 带终止条件的迭代优化
  • 会话管理 -
    create_stateful_session()
    - 带历史记录的多轮对话

Testing Phase

测试阶段

Web UI Testing:
bash
undefined
Web UI测试:
bash
undefined

Start API server

启动API服务器

adk api_server --port 8000
adk api_server --port 8000

Start web UI (separate terminal)

启动Web UI(单独终端)

cd adk-web npm install npm start
cd adk-web npm install npm start

**Programmatic Testing:**

```python

**程序化测试:**

```python

Unit test for agent

智能体单元测试

def test_weather_agent(): agent = create_weather_agent() response = agent.run("Weather in NYC?") assert "weather" in response.content.lower() assert response.success is True
def test_weather_agent(): agent = create_weather_agent() response = agent.run("Weather in NYC?") assert "weather" in response.content.lower() assert response.success is True

Integration test with mock tools

带模拟工具的集成测试

def test_pipeline_agent(): agent = create_pipeline_agent(mock_tools=True) result = agent.run({"input": "test_data"}) assert result["status"] == "completed"

**Evaluation Framework:**

```python
from google.adk.evaluation import evaluate_agent
def test_pipeline_agent(): agent = create_pipeline_agent(mock_tools=True) result = agent.run({"input": "test_data"}) assert result["status"] == "completed"

**评估框架:**

```python
from google.adk.evaluation import evaluate_agent

Criteria-based evaluation

基于标准的评估

results = evaluate_agent( agent=my_agent, test_cases=[ {"input": "What's 2+2?", "expected_output": "4"}, {"input": "Explain quantum computing", "criteria": "mentions_qubits"} ], evaluator_model="gemini-2.0-flash" )
print(f"Pass rate: {results.pass_rate}") print(f"Average score: {results.avg_score}")
undefined
results = evaluate_agent( agent=my_agent, test_cases=[ {"input": "What's 2+2?", "expected_output": "4"}, {"input": "Explain quantum computing", "criteria": "mentions_qubits"} ], evaluator_model="gemini-2.0-flash" )
print(f"Pass rate: {results.pass_rate}") print(f"Average score: {results.avg_score}")
undefined

Tool Development

工具开发

[See Code Examples: examples/google_adk_tools_example.py]
[查看代码示例:examples/google_adk_tools_example.py]

Custom Function Tools

自定义函数工具

Examples demonstrated:
  • Basic Function Tool -
    calculate_tax()
    - Simple tool with type hints
  • Async Tool -
    fetch_user_data()
    - Asynchronous API calls
  • HITL Confirmation -
    send_email()
    ,
    delete_user_account()
    - Human approval required
  • Input Validation -
    send_email_tool()
    - Email format validation and sanitization
  • Retry Logic -
    fetch_external_data()
    - Automatic retry with exponential backoff
  • Rate Limiting -
    call_external_api()
    - Decorator-based rate limiting
  • Error Handling -
    fetch_stock_price()
    - Graceful degradation on failures
展示的示例:
  • 基础函数工具 -
    calculate_tax()
    - 带类型提示的简单工具
  • 异步工具 -
    fetch_user_data()
    - 异步API调用
  • HITL审批 -
    send_email()
    delete_user_account()
    - 需要人工审批
  • 输入验证 -
    send_email_tool()
    - 邮箱格式验证与清理
  • 重试逻辑 -
    fetch_external_data()
    - 带指数退避的自动重试
  • 速率限制 -
    call_external_api()
    - 基于装饰器的速率限制
  • 错误处理 -
    fetch_stock_price()
    - 故障时优雅降级

OpenAPI Tools

OpenAPI工具

Integration Pattern:
  • Load tools from OpenAPI spec URL
  • Optional tool filtering for specific operations
  • Automatic schema generation from spec [See:
    create_api_agent()
    in examples/google_adk_tools_example.py]
集成模式:
  • 从OpenAPI规范URL加载工具
  • 可选针对特定操作的工具过滤
  • 从规范自动生成Schema [查看:examples/google_adk_tools_example.py中的
    create_api_agent()
    ]

MCP Tool Integration

MCP工具集成

Integration Pattern:
  • Connect to MCP server endpoint
  • Import tools for cross-framework compatibility [See:
    create_mcp_agent()
    in examples/google_adk_tools_example.py]
集成模式:
  • 连接到MCP服务器端点
  • 导入工具以实现跨框架兼容性 [查看:examples/google_adk_tools_example.py中的
    create_mcp_agent()
    ]

Multi-Agent Orchestration

多智能体编排

[See Code Examples: examples/google_adk_multi_agent.py]
[查看代码示例:examples/google_adk_multi_agent.py]

Multi-Agent Patterns

多智能体模式

Pattern 1: Coordinator/Dispatcher (Complexity: 4)
  • Use case: Route user requests to specialized agents
  • Function:
    create_coordinator_system()
    - Weather + Finance specialists
Pattern 2: Sequential Pipeline (Complexity: 3)
  • Use case: Multi-stage processing with dependencies
  • Function:
    create_content_pipeline()
    - Research → Write → Edit
Pattern 3: Parallel Fan-Out/Gather (Complexity: 4)
  • Use case: Aggregate results from multiple sources
  • Function:
    create_market_analysis_system()
    - Technical + Fundamental + Sentiment analysis
Pattern 4: Hierarchical Decomposition (Complexity: 5)
  • Use case: Break complex tasks into manageable subtasks
  • Function:
    create_project_management_system()
    - Multi-level agent hierarchy
Pattern 5: Generator-Critic Loop (Complexity: 4)
  • Use case: Iterative refinement with feedback
  • Function:
    create_quality_content_system()
    - Generate → Critique → Refine
Pattern 6: Human-in-the-Loop (HITL) (Complexity: 3)
  • Use case: Critical decisions require human approval
  • Function:
    create_account_management_agent()
    - Confirmation before deletion
Pattern 7: State Management
  • Use case: Persistent user context across sessions
  • Class:
    StatefulAgent
    - In-memory state storage with history
Pattern 8: Database Persistence
  • Use case: Long-term state storage
  • Functions:
    save_state()
    ,
    load_state()
    - PostgreSQL-backed persistence
模式1:协调器/调度器(复杂度:4)
  • 适用场景: 将用户请求路由到专业智能体
  • 函数:
    create_coordinator_system()
    - 天气+金融专业智能体
模式2:顺序流水线(复杂度:3)
  • 适用场景: 带依赖的多阶段处理
  • 函数:
    create_content_pipeline()
    - 调研 → 写作 → 编辑
模式3:并行扇出/聚合(复杂度:4)
  • 适用场景: 聚合多来源结果
  • 函数:
    create_market_analysis_system()
    - 技术面+基本面+情绪分析
模式4:层级分解(复杂度:5)
  • 适用场景: 将复杂任务拆分为可管理的子任务
  • 函数:
    create_project_management_system()
    - 多层智能体层级
模式5:生成器-批评者循环(复杂度:4)
  • 适用场景: 带反馈的迭代优化
  • 函数:
    create_quality_content_system()
    - 生成 → 评审 → 优化
模式6:人在回路(HITL)(复杂度:3)
  • 适用场景: 关键决策需要人工审批
  • 函数:
    create_account_management_agent()
    - 删除前需确认
模式7:状态管理
  • 适用场景: 跨会话的持久用户上下文
  • 类:
    StatefulAgent
    - 带历史记录的内存状态存储
模式8:数据库持久化
  • 适用场景: 长期状态存储
  • 函数:
    save_state()
    load_state()
    - 基于PostgreSQL的持久化

Memory and State Management

内存与状态管理

[See Code Examples: examples/google_adk_multi_agent.py - State Management section]
[查看代码示例:examples/google_adk_multi_agent.py - 状态管理部分]

Session Management

会话管理

Basic Session Pattern:
  • Multi-turn conversation with history retention
  • Automatic context window management
  • Configurable history limits
[See:
create_stateful_session()
in examples/google_adk_agent_implementation.py]
基础会话模式:
  • 带历史记录保留的多轮对话
  • 自动上下文窗口管理
  • 可配置的历史限制
[查看:examples/google_adk_agent_implementation.py中的
create_stateful_session()
]

State Persistence

状态持久化

Custom State Object:
  • In-memory state storage per user
  • Dataclass-based state modeling
  • Conversation history tracking
[See:
StatefulAgent
class in examples/google_adk_multi_agent.py]
Database Persistence:
  • Long-term state storage with SQLAlchemy
  • JSON-serialized state data
  • PostgreSQL/MySQL support
[See:
save_state()
,
load_state()
functions in examples/google_adk_multi_agent.py]
自定义状态对象:
  • 每个用户的内存状态存储
  • 基于数据类的状态建模
  • 对话历史追踪
[查看:examples/google_adk_multi_agent.py中的
StatefulAgent
类]
数据库持久化:
  • 带SQLAlchemy的长期状态存储
  • JSON序列化的状态数据
  • 支持PostgreSQL/MySQL
[查看:examples/google_adk_multi_agent.py中的
save_state()
load_state()
函数]

Deployment Options

部署选项

[See Code Examples: examples/google_adk_deployment.py]
[查看代码示例:examples/google_adk_deployment.py]

Agent Engine (Managed Service)

Agent Engine(托管服务)

Deployment Commands:
bash
pip install google-adk[cli]
adk auth login
adk deploy --agent-file agent.py --agent-name my_agent --project-id my-gcp-project --region us-central1
[See:
create_production_agent()
for configuration example]
部署命令:
bash
pip install google-adk[cli]
adk auth login
adk deploy --agent-file agent.py --agent-name my_agent --project-id my-gcp-project --region us-central1
[查看:配置示例
create_production_agent()
]

Cloud Run Deployment

Cloud Run部署

Components:
  • FastAPI server with agent endpoints
  • Dockerfile for containerization
  • Health check and error handling
  • Environment configuration
[See: FastAPI app implementation, Dockerfile reference, deployment commands in examples/google_adk_deployment.py]
组件:
  • 带智能体端点的FastAPI服务器
  • 用于容器化的Dockerfile
  • 健康检查与错误处理
  • 环境配置
[查看:examples/google_adk_deployment.py中的FastAPI应用实现、Dockerfile参考、部署命令]

Docker Containerization

Docker容器化

Self-Hosted Options:
  • Docker Compose with Redis
  • Single container deployment
  • Environment variable configuration
[See: docker-compose.yml reference, deployment commands in examples/google_adk_deployment.py]
自托管选项:
  • 带Redis的Docker Compose
  • 单容器部署
  • 环境变量配置
[查看:examples/google_adk_deployment.py中的docker-compose.yml参考、部署命令]

Resource Requirements

资源要求

Agent ComplexityCPURAMConcurrent Requests
Simple LlmAgent1 core512MB10
Workflow Agent2 cores1GB20
Multi-Agent (3-5 agents)4 cores2GB10
Complex Multi-Agent (>5)8 cores4GB5
智能体复杂度CPU内存并发请求数
简单LlmAgent1核512MB10
工作流智能体2核1GB20
多智能体(3-5个)4核2GB10
复杂多智能体(>5个)8核4GB5

Evaluation and Testing

评估与测试

[See Code Examples: examples/google_adk_deployment.py - Evaluation endpoint]
[查看代码示例:examples/google_adk_deployment.py - 评估端点]

Criteria-Based Evaluation

基于标准的评估

Pattern:
  • Define custom evaluation criteria (accuracy, helpfulness, etc.)
  • Run test cases against agent
  • Analyze pass rate and scores
[See:
evaluate_agent_endpoint()
in examples/google_adk_deployment.py]
模式:
  • 定义自定义评估标准(准确性、有用性等)
  • 针对智能体运行测试用例
  • 分析通过率与得分
[查看:examples/google_adk_deployment.py中的
evaluate_agent_endpoint()
]

User Simulation Evaluation

用户模拟评估

Pattern:
  • Simulate user interactions with defined goals
  • Track goal completion rate
  • Measure average turns to completion
[See documentation for UserSimulator examples]
模式:
  • 模拟带既定目标的用户交互
  • 追踪目标完成率
  • 测量完成平均轮次
[查看UserSimulator示例文档]

Best Practices

最佳实践

[See Code Examples: examples/google_adk_tools_example.py - Tool Design Best Practices section]
[查看代码示例:examples/google_adk_tools_example.py - 工具设计最佳实践部分]

Agent Instruction Writing

智能体指令编写

Effective Patterns:
  • Clear role and responsibilities
  • Structured format with constraints
  • Specific tool usage guidance
  • Example interactions
[See: GOOD vs BAD examples at end of examples/google_adk_tools_example.py]
有效模式:
  • 清晰的角色与职责
  • 带约束的结构化格式
  • 具体的工具使用指导
  • 示例交互
[查看:examples/google_adk_tools_example.py末尾的优秀vs糟糕示例]

Tool Design Principles

工具设计原则

Key Principles:
  1. Single Responsibility - One clear purpose per tool
  2. Descriptive Naming - Clear action and object naming
  3. Type Hints - Complete type annotations for all parameters
[See: Tool design examples at end of examples/google_adk_tools_example.py]
核心原则:
  1. 单一职责 - 每个工具仅一个明确用途
  2. 描述性命名 - 清晰的动作与对象命名
  3. 类型提示 - 所有参数的完整类型注解
[查看:examples/google_adk_tools_example.py末尾的工具设计示例]

Error Handling

错误处理

Strategies:
  • Graceful Degradation - Return error messages instead of raising exceptions
  • Retry Logic - Automatic retry with exponential backoff
  • Input Validation - Validate and sanitize all inputs
[See:
fetch_stock_price()
,
fetch_external_data()
,
send_email_tool()
in examples/google_adk_tools_example.py]
策略:
  • 优雅降级 - 返回错误消息而非抛出异常
  • 重试逻辑 - 带指数退避的自动重试
  • 输入验证 - 验证并清理所有输入
[查看:examples/google_adk_tools_example.py中的
fetch_stock_price()
fetch_external_data()
send_email_tool()
]

Security and Safety

安全与防护

Implementation:
  • Input Validation - Email format validation, length limits
  • Rate Limiting - Decorator-based request throttling
  • Sanitization - Remove dangerous HTML/script content
[See:
send_email_tool()
,
rate_limit()
decorator in examples/google_adk_tools_example.py]
实现:
  • 输入验证 - 邮箱格式验证、长度限制
  • 速率限制 - 基于装饰器的请求限流
  • 清理 - 移除危险的HTML/脚本内容
[查看:examples/google_adk_tools_example.py中的
send_email_tool()
rate_limit()
装饰器]

Performance Optimization

性能优化

Async Tools:
  • Automatic parallel execution for async functions
  • Improved throughput for I/O-bound operations
[See:
fetch_price()
,
create_portfolio_analyzer()
in examples/google_adk_tools_example.py]
异步工具:
  • 异步函数的自动并行执行
  • 提升I/O密集型操作的吞吐量
[查看:examples/google_adk_tools_example.py中的
fetch_price()
create_portfolio_analyzer()
]

Quality Gates

质量门

Definition of Done: Agents

智能体完成定义:

An agent is production-ready when:
  1. Functionality:
    • ✓ Agent completes primary objective on test cases
    • ✓ Tool execution succeeds with valid inputs
    • ✓ Error handling covers expected failure modes
    • ✓ Multi-turn conversations maintain context
  2. Performance:
    • ✓ Response time <5 seconds for simple queries
    • ✓ Response time <30 seconds for complex workflows
    • ✓ Evaluation pass rate ≥80% on criteria
    • ✓ Resource usage within deployment limits
  3. Safety:
    • ✓ Input validation on all tools
    • ✓ High-risk actions require confirmation (HITL)
    • ✓ No hardcoded credentials or API keys
    • ✓ Rate limiting on external API calls
  4. Observability:
    • ✓ Logging configured for debugging
    • ✓ Tracing enabled for multi-agent workflows
    • ✓ Evaluation metrics tracked
    • ✓ Error alerts configured
  5. Documentation:
    • ✓ Agent purpose and capabilities documented
    • ✓ Tool descriptions clear and accurate
    • ✓ Example usage provided
    • ✓ Known limitations documented
智能体满足以下条件时可投入生产:
  1. 功能:
    • ✓ 智能体在测试用例中完成主要目标
    • ✓ 工具在输入有效时执行成功
    • ✓ 错误处理覆盖预期故障模式
    • ✓ 多轮对话保持上下文
  2. 性能:
    • ✓ 简单查询响应时间<5秒
    • ✓ 复杂工作流响应时间<30秒
    • ✓ 标准评估通过率≥80%
    • ✓ 资源使用在部署限制内
  3. 安全:
    • ✓ 所有工具都有输入验证
    • ✓ 高风险操作需要确认(HITL)
    • ✓ 无硬编码凭证或API密钥
    • ✓ 外部API调用有速率限制
  4. 可观测性:
    • ✓ 配置用于调试的日志
    • ✓ 多智能体工作流启用追踪
    • ✓ 追踪评估指标
    • ✓ 配置错误告警
  5. 文档:
    • ✓ 记录智能体用途与能力
    • ✓ 工具描述清晰准确
    • ✓ 提供示例用法
    • ✓ 记录已知限制

Definition of Done: Tools

工具完成定义:

A tool is production-ready when:
  1. Interface:
    • ✓ Function has type hints for all parameters
    • ✓ Docstring explains purpose, args, returns
    • ✓ Parameter descriptions guide LLM selection
    • ✓ Return values are JSON-serializable
  2. Reliability:
    • ✓ Error handling with informative messages
    • ✓ Input validation prevents invalid operations
    • ✓ Timeout configured for long-running operations
    • ✓ Retry logic for transient failures
  3. Testing:
    • ✓ Unit tests cover success cases
    • ✓ Unit tests cover error cases
    • ✓ Integration tests with agent execution
    • ✓ Performance benchmarks for expensive operations
工具满足以下条件时可投入生产:
  1. 接口:
    • ✓ 函数对所有参数有类型提示
    • ✓ 文档字符串说明用途、参数、返回值
    • ✓ 参数描述指导LLM选择
    • ✓ 返回值可JSON序列化
  2. 可靠性:
    • ✓ 错误处理带信息性消息
    • ✓ 输入验证防止无效操作
    • ✓ 为长时间运行的操作配置超时
    • ✓ 针对瞬时故障的重试逻辑
  3. 测试:
    • ✓ 单元测试覆盖成功场景
    • ✓ 单元测试覆盖错误场景
    • ✓ 与智能体执行的集成测试
    • ✓ 昂贵操作的性能基准

Error Handling Guide

错误处理指南

Common Issues and Resolutions

常见问题与解决方案

Issue: Agent doesn't call the right tool
  • Cause: Tool description unclear or ambiguous
  • Resolution:
    python
    # BAD: Vague description
    def process(data):
        """Process data."""  # Too generic
    
    # GOOD: Specific description
    def validate_email_format(email: str) -> bool:
        """Check if email address matches valid format (user@domain.com).
    
        Use this tool ONLY to validate email syntax, not to verify
        if email exists or is deliverable."""
Issue: Agent loops indefinitely
  • Cause: No termination condition in Loop Agent
  • Resolution:
    python
    # Add max_iterations and explicit break condition
    agent = LoopAgent(
        tools=[...],
        max_iterations=10,  # Hard limit
        break_condition=lambda result: result.get("completed", False)
    )
Issue: "Tool execution failed" errors
  • Cause: Tool raises unhandled exception
  • Resolution:
    python
    def robust_tool(param: str) -> str:
        try:
            result = risky_operation(param)
            return f"Success: {result}"
        except SpecificError as e:
            return f"Operation failed: {e.message}"
        except Exception as e:
            logger.error(f"Unexpected error in robust_tool: {e}")
            return "Temporary service error, please try again"
Issue: Agent response is too slow
  • Cause: Sequential tool calls when parallelization possible
  • Resolution:
    python
    # Use Parallel Agent or async tools
    agent = ParallelAgent(
        tools=[tool1, tool2, tool3]  # Execute concurrently
    )
Issue: Context limit exceeded
  • Cause: Conversation history too long
  • Resolution:
    python
    session = Session(
        agent=my_agent,
        max_history_turns=10,  # Limit history
        context_window_tokens=30000  # Set explicit limit
    )
Issue: Deployment fails on Cloud Run
  • Cause: Missing dependencies or environment variables
  • Resolution:
    bash
    # Ensure requirements.txt is complete
    pip freeze > requirements.txt
    
    # Set required environment variables
    gcloud run deploy my-agent \
      --set-env-vars GOOGLE_API_KEY=your_key,AGENT_CONFIG=prod
问题:智能体未调用正确工具
  • 原因: 工具描述模糊或歧义
  • 解决方案:
    python
    # 糟糕:描述模糊
    def process(data):
        """Process data."""  # 过于通用
    
    # 优秀:具体描述
    def validate_email_format(email: str) -> bool:
        """检查邮箱地址是否符合有效格式(user@domain.com)。
    
        仅使用此工具验证邮箱语法,请勿用于验证邮箱是否存在或可送达。"""
问题:智能体无限循环
  • 原因: Loop Agent无终止条件
  • 解决方案:
    python
    # 添加max_iterations和显式终止条件
    agent = LoopAgent(
        tools=[...],
        max_iterations=10,  # 硬限制
        break_condition=lambda result: result.get("completed", False)
    )
问题:"Tool execution failed"错误
  • 原因: 工具抛出未处理的异常
  • 解决方案:
    python
    def robust_tool(param: str) -> str:
        try:
            result = risky_operation(param)
            return f"Success: {result}"
        except SpecificError as e:
            return f"Operation failed: {e.message}"
        except Exception as e:
            logger.error(f"Unexpected error in robust_tool: {e}")
            return "临时服务错误,请重试"
问题:智能体响应过慢
  • 原因: 可并行化时使用了顺序工具调用
  • 解决方案:
    python
    # 使用Parallel Agent或异步工具
    agent = ParallelAgent(
        tools=[tool1, tool2, tool3]  # 并发执行
    )
问题:上下文限制超出
  • 原因: 对话历史过长
  • 解决方案:
    python
    session = Session(
        agent=my_agent,
        max_history_turns=10,  # 限制历史长度
        context_window_tokens=30000  # 设置显式限制
    )
问题:Cloud Run部署失败
  • 原因: 缺少依赖或环境变量
  • 解决方案:
    bash
    # 确保requirements.txt完整
    pip freeze > requirements.txt
    
    # 设置所需环境变量
    gcloud run deploy my-agent \
      --set-env-vars GOOGLE_API_KEY=your_key,AGENT_CONFIG=prod

Debugging Strategies

调试策略

1. Enable verbose logging:
python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("google.adk")
logger.setLevel(logging.DEBUG)
2. Test tools independently:
python
undefined
1. 启用详细日志:
python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("google.adk")
logger.setLevel(logging.DEBUG)
2. 独立测试工具:
python
undefined

Test tool without agent

无需智能体直接测试工具

tool = Tool.from_function(my_function) result = my_function("test_input") print(f"Tool output: {result}")

**3. Use web UI for interactive debugging:**

```bash
adk api_server --debug
tool = Tool.from_function(my_function) result = my_function("test_input") print(f"Tool output: {result}")

**3. 使用Web UI进行交互式调试:**

```bash
adk api_server --debug

View tool calls, agent reasoning, response generation

查看工具调用、智能体推理、响应生成


**4. Inspect agent execution trace:**

```python
response = agent.run("test query", return_trace=True)
print(response.trace)  # Shows all tool calls and decisions

**4. 检查智能体执行追踪:**

```python
response = agent.run("test query", return_trace=True)
print(response.trace)  # 显示所有工具调用与决策

Complexity Ratings

复杂度评级

TaskRatingDescription
Simple LlmAgent with tools2Basic conversational agent
Sequential Workflow Agent2Fixed-order tool execution
Parallel Workflow Agent3Concurrent operations
Loop Workflow Agent3Iterative refinement
Custom Agent3User-defined orchestration
Coordinator/Dispatcher (2-3 agents)4Multi-agent routing
Sequential Pipeline (3+ agents)4Chained agent execution
Hierarchical Multi-Agent (>5 agents)5Complex nested architecture
Custom tool development2Python function wrapper
OpenAPI tool integration2Auto-generated from spec
MCP tool integration3Cross-framework tools
Deployment to Agent Engine2Managed deployment
Self-hosted Docker deployment3Container orchestration
Advanced evaluation framework4Custom criteria and simulation
任务评级描述
带工具的简单LlmAgent2基础对话式智能体
顺序工作流智能体2固定顺序工具执行
并行工作流智能体3并发操作
循环工作流智能体3迭代优化
自定义智能体3用户定义编排
协调器/调度器(2-3个智能体)4多智能体路由
顺序流水线(3+个智能体)4链式智能体执行
层级多智能体(>5个)5复杂嵌套架构
自定义工具开发2Python函数包装
OpenAPI工具集成2从规范自动生成
MCP工具集成3跨框架工具
部署到Agent Engine2托管部署
自托管Docker部署3容器编排
高级评估框架4自定义标准与模拟

References

参考资料

Official Documentation

官方文档

Python SDK Resources

Python SDK资源

Additional Languages

其他语言

Community Resources

社区资源

Related Skills

相关技能

  • For workflow automation: Use
    n8n
    skill
  • For API design: Use
    api-design-architect
    skill
  • For cloud deployment: Use
    cloud-devops-expert
    skill
  • For Python development: Use Python-specific skills
  • For LLM integration: Use model provider SDKs (OpenAI, Anthropic, etc.)

Version: 1.0.0 Last Updated: 2025-11-13 Complexity Rating: 3 (Moderate - requires agent architecture knowledge) Estimated Learning Time: 10-15 hours for proficiency
  • 工作流自动化:使用
    n8n
    技能
  • API设计:使用
    api-design-architect
    技能
  • 云部署:使用
    cloud-devops-expert
    技能
  • Python开发:使用Python专属技能
  • LLM集成:使用模型提供商SDK(OpenAI、Anthropic等)

版本: 1.0.0 最后更新: 2025-11-13 复杂度评级: 3(中等 - 需要智能体架构知识) 预计学习时间: 熟练掌握需10-15小时