tool-design

Tool Design for Agents

Tools are the primary mechanism through which agents interact with the world. They define the contract between deterministic systems and non-deterministic agents. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix. Effective tool design follows specific principles that account for how agents perceive and use tools.

When to Activate

Activate this skill when:
  • Creating new tools for agent systems
  • Debugging tool-related failures or misuse
  • Optimizing existing tool sets for better agent performance
  • Designing tool APIs from scratch
  • Evaluating third-party tools for agent integration
  • Standardizing tool conventions across a codebase

Core Concepts

Tools are contracts between deterministic systems and non-deterministic agents. The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. Effective tool descriptions are prompt engineering that shapes agent behavior.
Key principles include: clear descriptions that answer what, when, and what returns; response formats that balance completeness and token efficiency; error messages that enable recovery; and consistent conventions that reduce cognitive load.

Detailed Topics

The Tool-Agent Interface

Tools as Contracts
Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract and make appropriate requests. Agents must infer the contract from descriptions and generate calls that match expected formats.
This fundamental difference requires rethinking API design. The contract must be unambiguous, examples must illustrate expected patterns, and error messages must guide correction. Every ambiguity in tool definitions becomes a potential failure mode.
Tool Description as Prompt
Tool descriptions are loaded into agent context and collectively steer behavior. The descriptions are not just documentation; they are prompt engineering that shapes how agents reason about tool use.
Poor descriptions like "Search the database" with cryptic parameter names force agents to guess. Optimized descriptions include usage context, examples, and defaults. The description answers: what the tool does, when to use it, what inputs it accepts, and what it produces.
Namespacing and Organization
As tool collections grow, organization becomes critical. Namespacing groups related tools under common prefixes, helping agents select appropriate tools at the right time.
Namespacing creates clear boundaries between functionality. When an agent needs database information, it routes to the database namespace. When it needs web search, it routes to the web namespace.
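As a rough illustration of prefix-based namespacing, the sketch below groups tool names by the text before the first underscore. The tool names and the `namespace_action` naming convention are assumptions made for this example, not names taken from any particular framework.

```python
from collections import defaultdict

# Hypothetical tool names; the "<namespace>_<action>" convention is an assumption.
TOOL_NAMES = [
    "db_query_rows",
    "db_describe_table",
    "web_search",
    "web_fetch_page",
]


def group_by_namespace(tool_names):
    """Group tool names by the prefix before the first underscore."""
    groups = defaultdict(list)
    for name in tool_names:
        namespace, _, _ = name.partition("_")
        groups[namespace].append(name)
    return dict(groups)
```

Calling `group_by_namespace(TOOL_NAMES)` yields one group for `db` and one for `web`, which is the boundary an agent can use to narrow its choice before picking a specific tool.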

The Consolidation Principle

Single Comprehensive Tools
The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. This leads to a preference for single comprehensive tools over multiple narrow tools.
Instead of implementing list_users, list_events, and create_event, implement schedule_event that finds availability and schedules. The comprehensive tool handles the full workflow internally rather than requiring agents to chain multiple calls.
Why Consolidation Works
Agents have limited context and attention. Each tool in the collection competes for attention in the tool selection phase. Each tool adds description tokens that consume context budget. Overlapping functionality creates ambiguity about which tool to use.
Consolidation reduces token consumption by eliminating redundant descriptions. It eliminates ambiguity by having one tool cover each workflow. It reduces tool selection complexity by shrinking the effective tool set.
When Not to Consolidate
Consolidation is not universally correct. Tools with fundamentally different behaviors should remain separate. Tools used in different contexts benefit from separation. Tools that might be called independently should not be artificially bundled.
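One way a consolidated schedule_event tool might look is sketched below. The in-memory `BUSY` calendar, the 30-minute search step, and the return shape are all assumptions chosen for illustration; a real implementation would sit on top of an actual calendar backend.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory calendar: attendee -> list of (start, end) busy intervals.
BUSY = {
    "alice": [(datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 10))],
    "bob": [(datetime(2025, 1, 6, 10), datetime(2025, 1, 6, 11))],
}
EVENTS = []


def _is_free(attendee, start, end):
    """True if [start, end) overlaps none of the attendee's busy intervals."""
    return all(
        end <= busy_start or start >= busy_end
        for busy_start, busy_end in BUSY.get(attendee, [])
    )


def schedule_event(title, attendees, earliest, latest, duration_minutes=30):
    """Find the first slot where every attendee is free, then book it.

    Returns {"status": "scheduled", "start": ...} on success, or
    {"status": "no_availability"} if no common slot exists in the window.
    """
    duration = timedelta(minutes=duration_minutes)
    start = earliest
    while start + duration <= latest:
        end = start + duration
        if all(_is_free(attendee, start, end) for attendee in attendees):
            for attendee in attendees:
                BUSY.setdefault(attendee, []).append((start, end))
            EVENTS.append(
                {"title": title, "start": start, "attendees": list(attendees)}
            )
            return {"status": "scheduled", "start": start.isoformat()}
        start += timedelta(minutes=30)  # assumed search granularity
    return {"status": "no_availability"}
```

The agent makes a single call with the meeting constraints; the availability lookup and the booking that would otherwise be separate tools happen inside the function.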

Architectural Reduction

The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities. Production evidence shows this approach can outperform sophisticated multi-tool architectures.
The File System Agent Pattern
Instead of building custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool. The agent uses standard Unix utilities (grep, cat, find, ls) to explore, understand, and operate on your system.
This works because:
  1. File systems are a proven abstraction that models understand deeply
  2. Standard tools have predictable, well-documented behavior
  3. The agent can chain primitives flexibly rather than being constrained to predefined workflows
  4. Good documentation in files replaces the need for summarization tools
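A single command execution tool of this kind could be sketched roughly as follows. The allowlist, the timeout, and the return shape are assumptions for illustration. An allowlist like this is not a sandbox (for example, `find` accepts actions that modify files), so real deployments would need stronger isolation.

```python
import shlex
import subprocess

# Assumed allowlist of Unix utilities; adjust for your environment.
ALLOWED_COMMANDS = {"grep", "cat", "find", "ls"}


def run_command(command, cwd=".", timeout_seconds=10):
    """Run one allowlisted command without a shell and return its output."""
    try:
        args = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        return {"ok": False, "error": f"Could not parse command: {exc}"}
    if not args or args[0] not in ALLOWED_COMMANDS:
        return {
            "ok": False,
            "error": f"Command not allowed. Use one of: {sorted(ALLOWED_COMMANDS)}",
        }
    try:
        completed = subprocess.run(
            args, cwd=cwd, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": f"Timed out after {timeout_seconds}s"}
    except OSError as exc:  # e.g. the utility is not installed
        return {"ok": False, "error": f"Could not start command: {exc}"}
    return {
        "ok": completed.returncode == 0,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Passing an argument list to `subprocess.run` rather than a shell string means pipes and redirects are not interpreted, which keeps each call to a single utility.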
When Reduction Outperforms Complexity
Reduction works when:
  • Your data layer is well-documented and consistently structured
  • The model has sufficient reasoning capability to navigate complexity
  • Your specialized tools were constraining rather than enabling the model
  • You're spending more time maintaining scaffolding than improving outcomes
Reduction fails when:
  • Your underlying data is messy, inconsistent, or poorly documented
  • The domain requires specialized knowledge the model lacks
  • Safety constraints require limiting what the agent can do
  • Operations are truly complex and benefit from structured workflows
Stop Constraining Reasoning
A common anti-pattern is building tools to "protect" the model from complexity: pre-filtering context, constraining options, and wrapping interactions in validation logic. These guardrails often become liabilities as models improve.
The question to ask: are your tools enabling new capabilities, or are they constraining reasoning the model could handle on its own?
Build for Future Models
Models improve faster than tooling can keep up. An architecture optimized for today's model may be over-constrained for tomorrow's. Build minimal architectures that can benefit from model improvements rather than sophisticated architectures that lock in current limitations.
See Architectural Reduction Case Study for production evidence.

Tool Description Engineering

Description Structure
Effective tool descriptions answer four questions:
What does the tool do? Clear, specific description of functionality. Avoid vague language like "helps with" or "can be used for." State exactly what the tool accomplishes.
When should it be used? Specific triggers and contexts. Include both direct triggers ("User asks about pricing") and indirect signals ("Need current market rates").
What inputs does it accept? Parameter descriptions with types, constraints, and defaults. Explain what each parameter controls.
What does it return? Output format and structure. Include examples of successful responses and error conditions.
Default Parameter Selection
Defaults should reflect common use cases. They reduce agent burden by eliminating unnecessary parameter specification. They prevent errors from omitted parameters.

Response Format Optimization

Tool response size significantly impacts context usage. Implementing response format options gives agents control over verbosity.
Concise format returns essential fields only, appropriate for confirmation or basic information. Detailed format returns complete objects with all fields, appropriate when full context is needed for decisions.
Include guidance in tool descriptions about when to use each format. Agents learn to select appropriate formats based on task requirements.
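A minimal sketch of a response format option is shown below. The field names in `CONCISE_FIELDS` and the record layout are hypothetical; the point is only that one parameter switches between a small projection and the full object.

```python
# Hypothetical set of "essential" fields for the concise format.
CONCISE_FIELDS = ("customer_id", "name", "status")


def format_customer(record, response_format="concise"):
    """Return either the essential fields or the complete record."""
    if response_format == "concise":
        return {key: record[key] for key in CONCISE_FIELDS if key in record}
    if response_format == "detailed":
        return dict(record)
    raise ValueError('response_format must be "concise" or "detailed"')
```

Defaulting to the concise format keeps routine calls cheap in tokens, while the detailed format remains one argument away when the agent needs the whole record.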

Error Message Design

Error messages serve two audiences: developers debugging issues and agents recovering from failures. For agents, error messages must be actionable. They must tell the agent what went wrong and how to correct it.
Design error messages that enable recovery. For retryable errors, include retry guidance. For input errors, include the expected format. For missing data, state what is needed.
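The sketch below shows one possible shape for recovery-oriented errors. The dictionary keys (`retryable`, `expected_format`, `example`) are assumptions for illustration; the `CUST-######` pattern follows the customer ID format used in this document's examples.

```python
import re

# Six digits after the prefix, matching the "CUST-######" format.
CUSTOMER_ID_PATTERN = re.compile(r"CUST-\d{6}")


def make_error(code, message, retryable=False, **hints):
    """Build an error payload that tells the agent how to recover."""
    error = {"error": code, "message": message, "retryable": retryable}
    error.update(hints)  # e.g. expected_format, example, retry_after_seconds
    return error


def validate_customer_id(customer_id):
    """Return None if the ID is valid, otherwise an actionable error."""
    if CUSTOMER_ID_PATTERN.fullmatch(customer_id):
        return None
    return make_error(
        "INVALID_FORMAT",
        f"customer_id {customer_id!r} does not match the expected pattern.",
        expected_format="CUST-######",
        example="CUST-000001",
    )
```

A retryable failure can use the same helper, for instance `make_error("RATE_LIMITED", "Too many requests.", retryable=True, retry_after_seconds=5)`, so the agent can tell whether to fix its input or simply wait.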

Tool Definition Schema

Use a consistent schema across all tools. Establish naming conventions: verb-noun pattern for tool names, consistent parameter names across tools, consistent return field names.
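As an illustration, a shared schema can be enforced with a small check like the one below. The required keys and the snake_case regular expression are assumptions for this sketch; the regex can only confirm a multi-word snake_case shape, not that the first word is actually a verb.

```python
import re

# Assumed convention: lowercase snake_case with at least two words, e.g. "get_customer".
TOOL_NAME_PATTERN = re.compile(r"[a-z]+(_[a-z]+)+")
REQUIRED_KEYS = ("name", "description", "parameters", "returns")


def check_tool_definition(tool):
    """Return a list of problems found in a tool definition dict."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in tool]
    name = tool.get("name", "")
    if not TOOL_NAME_PATTERN.fullmatch(name):
        problems.append(f"name {name!r} is not verb_noun snake_case")
    for param, spec in tool.get("parameters", {}).items():
        if "description" not in spec:
            problems.append(f"parameter {param!r} has no description")
    return problems


# Hypothetical definition following the schema above.
get_customer_tool = {
    "name": "get_customer",
    "description": "Retrieve customer information by ID.",
    "parameters": {
        "customer_id": {"type": "string", "description": 'Format "CUST-######"'},
        "format": {
            "type": "string",
            "default": "concise",
            "description": '"concise" or "detailed"',
        },
    },
    "returns": "Customer object with requested fields",
}
```

Running a check like this over every tool definition in a codebase is one inexpensive way to keep names and parameter documentation consistent.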

Tool Collection Design

Research shows tool description overlap causes model confusion. More tools do not always lead to better outcomes. A reasonable guideline is 10-20 tools for most applications. If more are needed, use namespacing to create logical groupings.
Implement mechanisms to help agents select the right tool: tool grouping, example-based selection, and hierarchy with umbrella tools that route to specialized sub-tools.
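The umbrella-tool idea can be sketched as a single entry point that dispatches to sub-tools. The `database` tool, its actions, and the stub handlers below are hypothetical and stand in for real implementations.

```python
def _describe_table(table):
    # Stub handler; a real one would query the database catalog.
    return f"schema for {table}"


def _count_rows(table):
    # Stub handler; a real one would run a COUNT query.
    return f"row count for {table}"


# Hypothetical sub-tools reachable through one umbrella tool.
DATABASE_ACTIONS = {
    "describe_table": _describe_table,
    "count_rows": _count_rows,
}


def database(action, **kwargs):
    """Umbrella tool: route an action name to the matching sub-tool."""
    handler = DATABASE_ACTIONS.get(action)
    if handler is None:
        return {"error": "UNKNOWN_ACTION", "valid_actions": sorted(DATABASE_ACTIONS)}
    return {"result": handler(**kwargs)}
```

The agent sees one tool description instead of several, and an unknown action returns the list of valid ones so the agent can correct itself on the next call.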

MCP Tool Naming Requirements

When using MCP (Model Context Protocol) tools, always use fully qualified tool names to avoid "tool not found" errors.
Format: ServerName:tool_name

```python
# Correct: Fully qualified names
"Use the BigQuery:bigquery_schema tool to retrieve table schemas."
"Use the GitHub:create_issue tool to create issues."

# Incorrect: Unqualified names
"Use the bigquery_schema tool..."  # May fail with multiple servers
```

Without the server prefix, agents may fail to locate tools, especially when multiple MCP servers are available. Establish naming conventions that include server context in all tool references.
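The ServerName:tool_name convention can be handled with plain string helpers, as in the sketch below. These functions are illustrative only and are not part of any MCP SDK; they simply build and validate references in the format described above.

```python
def qualify_tool_name(server_name, tool_name):
    """Build a fully qualified reference such as "GitHub:create_issue"."""
    return f"{server_name}:{tool_name}"


def parse_tool_reference(reference):
    """Split "ServerName:tool_name"; reject references without a server prefix."""
    server_name, separator, tool_name = reference.partition(":")
    if not separator or not server_name or not tool_name:
        raise ValueError(
            f"{reference!r} is not fully qualified; expected ServerName:tool_name"
        )
    return server_name, tool_name
```

A check like `parse_tool_reference` can be run over prompt templates or documentation to catch unqualified tool names before they reach an agent.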

Using Agents to Optimize Tools

Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements. Production testing shows this approach achieves a 40% reduction in task completion time by helping future agents avoid mistakes.
The Tool-Testing Agent Pattern:
```python
def optimize_tool_description(tool_spec, failure_examples):
    """
    Use an agent to analyze tool failures and improve descriptions.

    Process:
    1. Agent attempts to use tool across diverse tasks
    2. Collect failure modes and friction points
    3. Agent analyzes failures and proposes improvements
    4. Test improved descriptions against same tasks
    """
    prompt = f"""
    Analyze this tool specification and the observed failures.

    Tool: {tool_spec}

    Failures observed:
    {failure_examples}

    Identify:
    1. Why agents are failing with this tool
    2. What information is missing from the description
    3. What ambiguities cause incorrect usage

    Propose an improved tool description that addresses these issues.
    """

    # get_agent_response is a placeholder for whatever client sends the
    # prompt to a model and returns its text; it is not defined here.
    return get_agent_response(prompt)
```
This creates a feedback loop: agents using tools generate failure data, which agents then use to improve tool descriptions, which reduces future failures.

Testing Tool Design

Evaluate tool designs against criteria: unambiguity, completeness, recoverability, efficiency, and consistency. Test tools by presenting representative agent requests and evaluating the resulting tool calls.
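A small harness for this kind of test might look like the sketch below. The `stub_agent` function stands in for a real agent call and, like the test cases, is hypothetical; in practice `choose_tool_call` would wrap a model request that returns the selected tool and its arguments.

```python
def evaluate_tool_calls(cases, choose_tool_call):
    """Compare the tool call chosen for each request with the expected one.

    cases: list of (request, expected_tool, expected_args) tuples.
    choose_tool_call: callable mapping a request to (tool_name, args).
    """
    failures = []
    for request, expected_tool, expected_args in cases:
        tool, args = choose_tool_call(request)
        if tool != expected_tool or args != expected_args:
            failures.append(
                {
                    "request": request,
                    "expected": (expected_tool, expected_args),
                    "actual": (tool, args),
                }
            )
    return {
        "passed": len(cases) - len(failures),
        "failed": len(failures),
        "failures": failures,
    }


def stub_agent(request):
    # Hypothetical stand-in for a model: keyword routing only.
    if "customer" in request.lower():
        return "get_customer", {"customer_id": "CUST-000001"}
    return "search", {"query": request}


CASES = [
    ("Look up customer CUST-000001", "get_customer", {"customer_id": "CUST-000001"}),
    ("Schedule a meeting with Bob", "schedule_event", {"attendee": "Bob"}),
]
```

The `failures` list records the request, the expected call, and the actual call, which is the raw material for revising ambiguous descriptions.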

Practical Guidance

Anti-Patterns to Avoid

Vague descriptions: "Search the database for customer information" leaves too many questions unanswered.
Cryptic parameter names: Parameters named x, val, or param1 force agents to guess meaning.
Missing error handling: Tools that fail with generic errors provide no recovery guidance.
Inconsistent naming: Using id in some tools, identifier in others, and customer_id in still others creates confusion.

Tool Selection Framework

When designing tool collections:
  1. Identify distinct workflows agents must accomplish
  2. Group related actions into comprehensive tools
  3. Ensure each tool has a clear, unambiguous purpose
  4. Document error cases and recovery paths
  5. Test with actual agent interactions

Examples

Example 1: Well-Designed Tool
```python
def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """
```
Example 2: Poor Tool Design
This example demonstrates several tool design anti-patterns:
```python
def search(query):
    """Search the database."""
    pass
```
Problems with this design:
  1. Vague name: "search" is ambiguous - search what, for what purpose?
  2. Missing parameters: What database? What format should query take?
  3. No return description: What does this function return? A list? A string? Error handling?
  4. No usage context: When should an agent use this versus other tools?
  5. No error handling: What happens if the database is unavailable?
Failure modes:
  • Agents may call this tool when they should use a more specific tool
  • Agents cannot determine correct query format
  • Agents cannot interpret results
  • Agents cannot recover from failures

Guidelines

  1. Write descriptions that answer what, when, and what returns
  2. Use consolidation to reduce ambiguity
  3. Implement response format options for token efficiency
  4. Design error messages for agent recovery
  5. Establish and follow consistent naming conventions
  6. Limit tool count and use namespacing for organization
  7. Test tool designs with actual agent interactions
  8. Iterate based on observed failure modes
  9. Question whether each tool enables or constrains the model
  10. Prefer primitive, general-purpose tools over specialized wrappers
  11. Invest in documentation quality over tooling sophistication
  12. Build minimal architectures that benefit from model improvements

Integration

This skill connects to:
  • context-fundamentals - How tools interact with context
  • multi-agent-patterns - Specialized tools per agent
  • evaluation - Evaluating tool effectiveness

References

Internal references:
  • Best Practices Reference - Detailed tool design guidelines
  • Architectural Reduction Case Study - Production evidence for tool minimalism
Related skills in this collection:
  • context-fundamentals - Tool context interactions
  • evaluation - Tool testing patterns
External resources:
  • MCP (Model Context Protocol) documentation
  • Framework tool conventions
  • API design best practices for agents
  • Vercel d0 agent architecture case study

Skill Metadata

Created: 2025-12-20
Last Updated: 2025-12-23
Author: Agent Skills for Context Engineering Contributors
Version: 1.1.0