tool-design

Tool Design for Agents

Tools are the primary mechanism through which agents interact with the world. They define the contract between deterministic systems and non-deterministic agents. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix. Effective tool design follows specific principles that account for how agents perceive and use tools.

When to Activate

Activate this skill when:

Creating new tools for agent systems
Debugging tool-related failures or misuse
Optimizing existing tool sets for better agent performance
Designing tool APIs from scratch
Evaluating third-party tools for agent integration
Standardizing tool conventions across a codebase

Core Concepts

Tools are contracts between deterministic systems and non-deterministic agents. The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. Effective tool descriptions are prompt engineering that shapes agent behavior.

Key principles include: clear descriptions that answer what, when, and what returns; response formats that balance completeness and token efficiency; error messages that enable recovery; and consistent conventions that reduce cognitive load.

Detailed Topics

The Tool-Agent Interface

Tools as Contracts

Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract and make appropriate requests. Agents must infer the contract from descriptions and generate calls that match expected formats.

This fundamental difference requires rethinking API design. The contract must be unambiguous, examples must illustrate expected patterns, and error messages must guide correction. Every ambiguity in tool definitions becomes a potential failure mode.

Tool Description as Prompt

Tool descriptions are loaded into agent context and collectively steer behavior. The descriptions are not just documentation; they are prompt engineering that shapes how agents reason about tool use.

Poor descriptions like "Search the database" with cryptic parameter names force agents to guess. Optimized descriptions include usage context, examples, and defaults. The description answers: what the tool does, when to use it, and what it produces.

Namespacing and Organization

As tool collections grow, organization becomes critical. Namespacing groups related tools under common prefixes, helping agents select appropriate tools at the right time.

Namespacing creates clear boundaries between functionality. When an agent needs database information, it routes to the database namespace. When it needs web search, it routes to web namespace.
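As a minimal sketch of prefix-based namespacing, the snippet below groups tool names by the text before the first underscore. The tool names, the `group_by_namespace` helper, and the "misc" catch-all are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_namespace(tool_names, separator="_"):
    """Group tool names by the prefix before the first separator."""
    groups = defaultdict(list)
    for name in tool_names:
        prefix, _, rest = name.partition(separator)
        # Names without a separator fall into a catch-all group.
        namespace = prefix if rest else "misc"
        groups[namespace].append(name)
    return dict(groups)


# Hypothetical tool names following a "<namespace>_<action>" convention.
tools = ["db_query_rows", "db_describe_table", "web_search", "web_fetch_page", "ping"]
namespaces = group_by_namespace(tools)
print(namespaces)
```

A grouping like this can also be used to audit a tool set: a large "misc" group suggests tools that lack a clear home.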

The Consolidation Principle

Single Comprehensive Tools

The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. This leads to a preference for single comprehensive tools over multiple narrow tools.

Instead of implementing list_users, list_events, and create_event, implement schedule_event that finds availability and schedules. The comprehensive tool handles the full workflow internally rather than requiring agents to chain multiple calls.

Why Consolidation Works

Agents have limited context and attention. Each tool in the collection competes for attention in the tool selection phase. Each tool adds description tokens that consume context budget. Overlapping functionality creates ambiguity about which tool to use.

Consolidation reduces token consumption by eliminating redundant descriptions. It eliminates ambiguity by having one tool cover each workflow. It reduces tool selection complexity by shrinking the effective tool set.

When Not to Consolidate

Consolidation is not universally correct. Tools with fundamentally different behaviors should remain separate. Tools used in different contexts benefit from separation. Tools that might be called independently should not be artificially bundled.
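To make the schedule_event idea concrete, here is one possible sketch of a consolidated tool that checks availability and books in a single call. The in-memory `BUSY` calendar, the 30-minute search step, and the return shape are all assumptions for illustration, not part of the original skill.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory calendar: attendee -> list of (start, end) busy intervals.
BUSY = {
    "alice": [(datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 10))],
    "bob": [(datetime(2025, 1, 6, 10), datetime(2025, 1, 6, 11))],
}


def _is_free(attendee, start, end):
    """True if the attendee has no busy interval overlapping [start, end)."""
    return all(end <= b_start or start >= b_end for b_start, b_end in BUSY.get(attendee, []))


def schedule_event(title, attendees, earliest, latest, duration_minutes=30):
    """Find the first slot where every attendee is free, then book it."""
    duration = timedelta(minutes=duration_minutes)
    start = earliest
    while start + duration <= latest:
        end = start + duration
        if all(_is_free(a, start, end) for a in attendees):
            for a in attendees:
                BUSY.setdefault(a, []).append((start, end))
            return {"status": "scheduled", "title": title,
                    "start": start.isoformat(), "end": end.isoformat()}
        start += timedelta(minutes=30)  # try the next half-hour slot
    return {"status": "no_availability",
            "hint": "Widen the time window or shorten the duration."}


result = schedule_event("Sync", ["alice", "bob"],
                        datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 17))
print(result)
```

The agent makes one call and receives either a booking or a recovery hint, instead of chaining list and create calls itself.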

Tool Description Engineering

Description Structure

Effective tool descriptions answer four questions:

What does the tool do? Clear, specific description of functionality. Avoid vague language like "helps with" or "can be used for." State exactly what the tool accomplishes.

When should it be used? Specific triggers and contexts. Include both direct triggers ("User asks about pricing") and indirect signals ("Need current market rates").

What inputs does it accept? Parameter descriptions with types, constraints, and defaults. Explain what each parameter controls.

What does it return? Output format and structure. Include examples of successful responses and error conditions.

Default Parameter Selection

Defaults should reflect common use cases. They reduce agent burden by eliminating unnecessary parameter specification. They prevent errors from omitted parameters.
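The four questions and the role of defaults could be captured in a tool specification like the sketch below. The `search_orders` tool, its parameters, and the `apply_defaults` helper are hypothetical; the dictionary layout loosely resembles JSON Schema but is not tied to any particular framework.

```python
# Hypothetical tool specification covering what, when, inputs, and returns.
SEARCH_ORDERS_TOOL = {
    "name": "search_orders",
    "description": (
        "Search a customer's orders by status. "                              # what
        "Use when the user asks about order history or needs an order ID. "  # when
        "Returns a list of orders, newest first."                             # returns
    ),
    "parameters": {
        "customer_id": {"type": "string",
                        "description": 'Customer ID, e.g. "CUST-000001". Required.'},
        "status": {"type": "string", "enum": ["open", "shipped", "cancelled"],
                   "default": "open", "description": "Order status to filter by."},
        "limit": {"type": "integer", "minimum": 1, "maximum": 50,
                  "default": 10, "description": "Maximum number of orders to return."},
    },
}


def apply_defaults(tool, arguments):
    """Fill in defaults for omitted parameters and report missing required ones."""
    resolved = dict(arguments)
    missing = []
    for name, spec in tool["parameters"].items():
        if name in resolved:
            continue
        if "default" in spec:
            resolved[name] = spec["default"]
        else:
            missing.append(name)
    if missing:
        raise ValueError(f"Missing required parameters: {', '.join(missing)}")
    return resolved


call = apply_defaults(SEARCH_ORDERS_TOOL, {"customer_id": "CUST-000001"})
print(call)
```

With defaults for the common case, the agent only has to supply the one parameter it cannot infer.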

Response Format Optimization

Tool response size significantly impacts context usage. Implementing response format options gives agents control over verbosity.

Concise format returns essential fields only, appropriate for confirmation or basic information. Detailed format returns complete objects with all fields, appropriate when full context is needed for decisions.

Include guidance in tool descriptions about when to use each format. Agents learn to select appropriate formats based on task requirements.
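A minimal sketch of a format option, assuming a hypothetical customer record and an arbitrary choice of "essential" fields:

```python
# Assumed set of fields that count as essential for the concise format.
CONCISE_FIELDS = ("customer_id", "name", "status")


def format_customer(record, format="concise"):
    """Return either the essential fields or the complete record."""
    if format == "concise":
        return {key: record[key] for key in CONCISE_FIELDS if key in record}
    if format == "detailed":
        return dict(record)
    raise ValueError('format must be "concise" or "detailed"')


# Hypothetical record for illustration.
record = {
    "customer_id": "CUST-000001",
    "name": "Ada Example",
    "status": "active",
    "email": "ada@example.com",
    "notes": "Prefers email contact.",
}
concise = format_customer(record)
detailed = format_customer(record, format="detailed")
print(concise)
```

Defaulting to the concise format keeps routine lookups cheap, while the detailed format remains one parameter away.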

Error Message Design

Error messages serve two audiences: developers debugging issues and agents recovering from failures. For agents, error messages must be actionable. They must tell the agent what went wrong and how to correct it.

Design error messages that enable recovery. For retryable errors, include retry guidance. For input errors, include corrected format. For missing data, include what's needed.
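One way to make errors actionable is to return them as structured data with explicit recovery hints. The sketch below is illustrative only: the `tool_error` helper, the field names, and the RATE_LIMITED code are assumptions, while the CUST-###### ID pattern follows the example tool shown later in this document.

```python
import re

CUSTOMER_ID_PATTERN = re.compile(r"^CUST-\d{6}$")


def tool_error(code, message, retryable=False, **guidance):
    """Build an error payload that says what went wrong and how to fix it."""
    return {"error": {"code": code, "message": message,
                      "retryable": retryable, **guidance}}


def validate_customer_id(customer_id):
    """Return None if the ID is valid, otherwise an input error with the expected format."""
    if not CUSTOMER_ID_PATTERN.match(customer_id):
        return tool_error(
            "INVALID_FORMAT",
            f"customer_id {customer_id!r} does not match the expected pattern.",
            expected_format="CUST-######",
            example="CUST-000001",
        )
    return None


def rate_limited(retry_after_seconds):
    """Retryable error that includes explicit retry guidance."""
    return tool_error(
        "RATE_LIMITED",
        "Too many requests. Retry the same call after the delay.",
        retryable=True,
        retry_after_seconds=retry_after_seconds,
    )


print(validate_customer_id("12345"))
print(rate_limited(30))
```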

Tool Definition Schema

Use a consistent schema across all tools. Establish naming conventions: verb-noun pattern for tool names, consistent parameter names across tools, consistent return field names.
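A naming convention is easier to keep when it is checked automatically. The following sketch assumes snake_case names and a small approved verb list; both are illustrative choices rather than requirements from this skill.

```python
import re

# Assumed convention: snake_case, at least two words, starting with an approved verb.
TOOL_NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)+$")
APPROVED_VERBS = {"get", "list", "create", "update", "delete", "search"}


def check_tool_name(name):
    """Return a list of problems with a tool name; an empty list means it passes."""
    problems = []
    if not TOOL_NAME_PATTERN.match(name):
        problems.append("name should be snake_case with at least two words")
    elif name.split("_")[0] not in APPROVED_VERBS:
        problems.append(f"name should start with one of: {', '.join(sorted(APPROVED_VERBS))}")
    return problems


for candidate in ["get_customer", "customer", "customer_get"]:
    print(candidate, check_tool_name(candidate))
```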

Tool Collection Design

Research shows tool description overlap causes model confusion. More tools do not always lead to better outcomes. A reasonable guideline is 10-20 tools for most applications. If more are needed, use namespacing to create logical groupings.

Implement mechanisms to help agents select the right tool: tool grouping, example-based selection, and hierarchy with umbrella tools that route to specialized sub-tools.
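The umbrella-tool hierarchy can be sketched as a single entry point that dispatches to sub-tools. The `database` tool and its two sub-tools below are hypothetical stubs that return placeholder data.

```python
def _query_rows(table, limit=10):
    """Stub sub-tool: a real version would run a query."""
    return {"tool": "query_rows", "table": table, "limit": limit}


def _describe_table(table):
    """Stub sub-tool: a real version would return the table schema."""
    return {"tool": "describe_table", "table": table}


SUB_TOOLS = {"query_rows": _query_rows, "describe_table": _describe_table}


def database(action, **kwargs):
    """Umbrella tool: route an action to the matching specialized sub-tool."""
    handler = SUB_TOOLS.get(action)
    if handler is None:
        # Unknown actions return a recoverable error listing the valid choices.
        return {"error": {"code": "UNKNOWN_ACTION",
                          "message": f"Unknown action {action!r}.",
                          "valid_actions": sorted(SUB_TOOLS)}}
    return handler(**kwargs)


print(database("describe_table", table="orders"))
print(database("drop_table", table="orders"))
```

The agent sees one tool description instead of many, at the cost of a slightly larger parameter surface on that one tool.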

MCP Tool Naming Requirements

When using MCP (Model Context Protocol) tools, always use fully qualified tool names to avoid "tool not found" errors.

Format:

ServerName:tool_name

Correct: Fully qualified names

"Use the BigQuery:bigquery_schema tool to retrieve table schemas."
"Use the GitHub:create_issue tool to create issues."

Incorrect: Unqualified names

"Use the bigquery_schema tool..." # May fail with multiple servers


Without the server prefix, agents may fail to locate tools, especially when multiple MCP servers are available. Establish naming conventions that include server context in all tool references.
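The sketch below illustrates why unqualified names become fragile once several servers are connected. It is a toy lookup over a hypothetical registry, not the MCP protocol or any client library's actual resolution logic; the server and tool names other than BigQuery:bigquery_schema are invented for the example.

```python
# Hypothetical registry: server name -> tools that server exposes.
REGISTRY = {
    "BigQuery": {"bigquery_schema", "run_query"},
    "Analytics": {"run_query"},
}


def resolve_tool(reference):
    """Resolve "ServerName:tool_name" or a bare tool name to (server, tool)."""
    if ":" in reference:
        server, _, tool = reference.partition(":")
        if tool in REGISTRY.get(server, set()):
            return (server, tool)
        raise LookupError(f"tool not found: {reference}")
    matches = sorted(server for server, tools in REGISTRY.items() if reference in tools)
    if len(matches) == 1:
        return (matches[0], reference)
    if not matches:
        raise LookupError(f"tool not found: {reference}")
    options = ", ".join(f"{server}:{reference}" for server in matches)
    raise LookupError(f"ambiguous tool name {reference!r}; use one of: {options}")


print(resolve_tool("BigQuery:bigquery_schema"))
try:
    resolve_tool("run_query")
except LookupError as exc:
    print(exc)
```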

Using Agents to Optimize Tools

Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements. Production testing shows this approach achieves a 40% reduction in task completion time by helping future agents avoid mistakes.

The Tool-Testing Agent Pattern:

python

def optimize_tool_description(tool_spec, failure_examples):
    """
    Use an agent to analyze tool failures and improve descriptions.
    
    Process:
    1. Agent attempts to use tool across diverse tasks
    2. Collect failure modes and friction points
    3. Agent analyzes failures and proposes improvements
    4. Test improved descriptions against same tasks
    """
    prompt = f"""
    Analyze this tool specification and the observed failures.
    
    Tool: {tool_spec}
    
    Failures observed:
    {failure_examples}
    
    Identify:
    1. Why agents are failing with this tool
    2. What information is missing from the description
    3. What ambiguities cause incorrect usage
    
    Propose an improved tool description that addresses these issues.
    """
    
    # get_agent_response is a placeholder for whatever LLM client the system uses.
    return get_agent_response(prompt)

This creates a feedback loop: agents using tools generate failure data, which agents then use to improve tool descriptions, which reduces future failures.

Testing Tool Design

Evaluate tool designs against criteria: unambiguity, completeness, recoverability, efficiency, and consistency. Test tools by presenting representative agent requests and evaluating the resulting tool calls.
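A small harness for this kind of test might look like the sketch below. The `stub_agent` function stands in for a real model call and is purely hypothetical, as are the test cases and the list_orders tool name.

```python
def evaluate_tool_calls(agent, cases):
    """Run each request through the agent and compare against the expected call."""
    failures = []
    for case in cases:
        actual = agent(case["request"])
        if actual != case["expected_call"]:
            failures.append({"request": case["request"],
                             "expected": case["expected_call"],
                             "actual": actual})
    passed = len(cases) - len(failures)
    return {"pass_rate": passed / len(cases), "failures": failures}


def stub_agent(request):
    """Stand-in for a real agent; a real harness would call the model here."""
    if "customer" in request.lower():
        return {"tool": "get_customer", "arguments": {"customer_id": "CUST-000001"}}
    return {"tool": "search", "arguments": {"query": request}}


cases = [
    {"request": "Look up customer CUST-000001",
     "expected_call": {"tool": "get_customer",
                       "arguments": {"customer_id": "CUST-000001"}}},
    {"request": "List open orders",
     "expected_call": {"tool": "list_orders", "arguments": {"status": "open"}}},
]
report = evaluate_tool_calls(stub_agent, cases)
print(report["pass_rate"])
```

The recorded failures show which requests led to the wrong tool or arguments, which is the raw material for revising descriptions.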

Practical Guidance

Anti-Patterns to Avoid

Vague descriptions: "Search the database for customer information" leaves too many questions unanswered.

Cryptic parameter names: Parameters named x, val, or param1 force agents to guess meaning.

Missing error handling: Tools that fail with generic errors provide no recovery guidance.

Inconsistent naming: Using id in some tools, identifier in others, and customer_id in some creates confusion.
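Some of these anti-patterns can be caught with a simple lint pass over tool definitions. The checks below are a rough sketch: the eight-word threshold is an arbitrary assumption, and the cryptic names and vague phrases are the ones called out in this document.

```python
CRYPTIC_PARAMETER_NAMES = {"x", "val", "param1"}
VAGUE_PHRASES = ("helps with", "can be used for")


def lint_tool(description, parameter_names):
    """Return warnings for vague descriptions and cryptic parameter names."""
    warnings = []
    # Assumed heuristic: very short descriptions rarely cover what, when, and returns.
    if len(description.split()) < 8:
        warnings.append("description is too short to say what, when, and what returns")
    for phrase in VAGUE_PHRASES:
        if phrase in description.lower():
            warnings.append(f"description uses vague phrase {phrase!r}")
    for param in parameter_names:
        if param in CRYPTIC_PARAMETER_NAMES:
            warnings.append(f"parameter {param!r} is cryptic; use a descriptive name")
    return warnings


warnings = lint_tool("Search the database.", ["x", "query"])
for warning in warnings:
    print(warning)
```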

Tool Selection Framework

When designing tool collections:

Identify distinct workflows agents must accomplish
Group related actions into comprehensive tools
Ensure each tool has a clear, unambiguous purpose
Document error cases and recovery paths
Test with actual agent interactions

Examples

Example 1: Well-Designed Tool

python

def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.
    
    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity
    
    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record
    
    Returns:
        Customer object with requested fields
    
    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """

Example 2: Poor Tool Design

This example demonstrates several tool design anti-patterns:

python

def search(query):
    """Search the database."""
    pass

Problems with this design:

Vague name: "search" is ambiguous - search what, for what purpose?
Missing parameters: What database? What format should query take?
No return description: What does this function return? A list? A string? Error handling?
No usage context: When should an agent use this versus other tools?
No error handling: What happens if the database is unavailable?

Failure modes:

Agents may call this tool when they should use a more specific tool
Agents cannot determine correct query format
Agents cannot interpret results
Agents cannot recover from failures

Guidelines

Write descriptions that answer what, when, and what returns
Use consolidation to reduce ambiguity
Implement response format options for token efficiency
Design error messages for agent recovery
Establish and follow consistent naming conventions
Limit tool count and use namespacing for organization
Test tool designs with actual agent interactions
Iterate based on observed failure modes

Integration

This skill connects to:

context-fundamentals - How tools interact with context
multi-agent-patterns - Specialized tools per agent
evaluation - Evaluating tool effectiveness

References

Internal reference:

Best Practices Reference - Detailed tool design guidelines

Related skills in this collection:

context-fundamentals - Tool context interactions
evaluation - Tool testing patterns

External resources:

MCP (Model Context Protocol) documentation
Framework tool conventions
API design best practices for agents

Skill Metadata

Created: 2025-12-20
Last Updated: 2025-12-20
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0
