tool-design
When to Use This Skill
Use this skill when building tools that agents can use effectively, including when applying architectural reduction patterns.
Tool Design for Agents
Tools are the primary mechanism through which agents interact with the world. They define the contract between deterministic systems and non-deterministic agents. Unlike traditional software APIs designed for developers, tool APIs must be designed for language models that reason about intent, infer parameter values, and generate calls from natural language requests. Poor tool design creates failure modes that no amount of prompt engineering can fix. Effective tool design follows specific principles that account for how agents perceive and use tools.
When to Activate
Activate this skill when:
- Creating new tools for agent systems
- Debugging tool-related failures or misuse
- Optimizing existing tool sets for better agent performance
- Designing tool APIs from scratch
- Evaluating third-party tools for agent integration
- Standardizing tool conventions across a codebase
Core Concepts
Tools are contracts between deterministic systems and non-deterministic agents. The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. Effective tool descriptions are prompt engineering that shapes agent behavior.
Key principles include: clear descriptions that state what the tool does, when to use it, and what it returns; response formats that balance completeness and token efficiency; error messages that enable recovery; and consistent conventions that reduce cognitive load.
Detailed Topics
The Tool-Agent Interface
Tools as Contracts
Tools are contracts between deterministic systems and non-deterministic agents. When humans call APIs, they understand the contract and make appropriate requests. Agents must infer the contract from descriptions and generate calls that match expected formats.
This fundamental difference requires rethinking API design. The contract must be unambiguous, examples must illustrate expected patterns, and error messages must guide correction. Every ambiguity in tool definitions becomes a potential failure mode.
Tool Description as Prompt
Tool descriptions are loaded into agent context and collectively steer behavior. The descriptions are not just documentation—they are prompt engineering that shapes how agents reason about tool use.
Poor descriptions like "Search the database" with cryptic parameter names force agents to guess. Optimized descriptions include usage context, examples, and defaults. The description answers: what the tool does, when to use it, and what it produces.
Namespacing and Organization
As tool collections grow, organization becomes critical. Namespacing groups related tools under common prefixes, helping agents select appropriate tools at the right time.
Namespacing creates clear boundaries between functionality. When an agent needs database information, it routes to the database namespace. When it needs web search, it routes to the web namespace.
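A minimal sketch of prefix-based namespacing follows. It assumes a hypothetical convention in which the text before the first underscore is the namespace; the tool names and the grouping helper are illustrative and not taken from any particular framework.

```python
from collections import defaultdict

# Hypothetical tool names; the "namespace_" prefix convention is an assumption.
TOOL_NAMES = ["db_get_schema", "db_run_query", "web_search_pages", "web_fetch_page"]


def group_by_namespace(tool_names):
    """Group tool names by the prefix before the first underscore."""
    groups = defaultdict(list)
    for name in tool_names:
        namespace, _, _action = name.partition("_")
        groups[namespace].append(name)
    return dict(groups)


namespaces = group_by_namespace(TOOL_NAMES)
```

Grouping like this can also drive how tools are presented to the agent, for example by listing each namespace with a one-line summary before its members.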
The Consolidation Principle
Single Comprehensive Tools
The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better. This leads to a preference for single comprehensive tools over multiple narrow tools.
Instead of implementing list_users, list_events, and create_event as separate tools, implement a single schedule_event tool that finds availability and schedules the event. The comprehensive tool handles the full workflow internally rather than requiring agents to chain multiple calls.
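One way to picture the consolidated tool is the sketch below. It assumes a hypothetical in-memory calendar (`BUSY`) and a fixed 30-minute search step; a real implementation would call a calendar backend, and the return shape is illustrative only.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory calendar: attendee -> list of (start, end) busy intervals.
BUSY = {
    "alice": [(datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 10))],
    "bob": [(datetime(2025, 1, 6, 10), datetime(2025, 1, 6, 11))],
}


def _is_free(attendee, start, end):
    """Return True if no busy interval of the attendee overlaps [start, end)."""
    return all(end <= b_start or start >= b_end
               for b_start, b_end in BUSY.get(attendee, []))


def schedule_event(attendees, duration_minutes, window_start, window_end):
    """Find the first slot where every attendee is free and book it.

    Covers the list_users / list_events / create_event chain in one call.
    Returns the booked slot, or an actionable error if no slot exists.
    """
    duration = timedelta(minutes=duration_minutes)
    start = window_start
    while start + duration <= window_end:
        end = start + duration
        if all(_is_free(a, start, end) for a in attendees):
            for a in attendees:
                BUSY.setdefault(a, []).append((start, end))
            return {"status": "scheduled",
                    "start": start.isoformat(), "end": end.isoformat()}
        start += timedelta(minutes=30)  # assumed search granularity
    return {"status": "error", "code": "NO_AVAILABILITY",
            "message": "No common free slot; widen the window or shorten the duration."}


result = schedule_event(["alice", "bob"], 60,
                        datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 17))
```

The agent makes one call and receives either a booked slot or an error that says how to adjust the request, so there is no intermediate state for it to track.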
Why Consolidation Works
Agents have limited context and attention. Each tool in the collection competes for attention in the tool selection phase. Each tool adds description tokens that consume context budget. Overlapping functionality creates ambiguity about which tool to use.
Consolidation reduces token consumption by eliminating redundant descriptions. It eliminates ambiguity by having one tool cover each workflow. It reduces tool selection complexity by shrinking the effective tool set.
When Not to Consolidate
Consolidation is not universally correct. Tools with fundamentally different behaviors should remain separate. Tools used in different contexts benefit from separation. Tools that might be called independently should not be artificially bundled.
Architectural Reduction
The consolidation principle, taken to its logical extreme, leads to architectural reduction: removing most specialized tools in favor of primitive, general-purpose capabilities. Production evidence shows this approach can outperform sophisticated multi-tool architectures.
The File System Agent Pattern
Instead of building custom tools for data exploration, schema lookup, and query validation, provide direct file system access through a single command execution tool. The agent uses standard Unix utilities (grep, cat, find, ls) to explore, understand, and operate on your system.
This works because:
- File systems are a proven abstraction that models understand deeply
- Standard tools have predictable, well-documented behavior
- The agent can chain primitives flexibly rather than being constrained to predefined workflows
- Good documentation in files replaces the need for summarization tools
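The single command execution tool described above might look like the following sketch. It assumes a POSIX environment with grep, cat, find, and ls on the PATH; the allowlist, the timeout, and the return shape are assumptions made for illustration.

```python
import shlex
import subprocess

# Utilities the agent may run; this allowlist is an assumption of the sketch.
ALLOWED_COMMANDS = {"grep", "cat", "find", "ls"}


def run_command(command: str, cwd: str = ".", timeout_seconds: int = 10) -> dict:
    """Run one allowlisted Unix command and return its output for the agent."""
    try:
        args = shlex.split(command)
    except ValueError as exc:  # for example, an unbalanced quote
        return {"ok": False, "error": f"Could not parse command: {exc}"}
    if not args or args[0] not in ALLOWED_COMMANDS:
        return {"ok": False,
                "error": f"Command not allowed. Use one of: {sorted(ALLOWED_COMMANDS)}"}
    try:
        completed = subprocess.run(args, cwd=cwd, capture_output=True,
                                   text=True, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return {"ok": False,
                "error": f"Timed out after {timeout_seconds}s; narrow the search."}
    return {"ok": completed.returncode == 0,
            "stdout": completed.stdout, "stderr": completed.stderr}
```

Checking the command name is not a sandbox: `find` accepts flags such as `-delete` and `-exec`, so a real deployment would still need filesystem permissions or an isolated environment to enforce read-only access.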
When Reduction Outperforms Complexity
Reduction works when:
- Your data layer is well-documented and consistently structured
- The model has sufficient reasoning capability to navigate complexity
- Your specialized tools were constraining rather than enabling the model
- You're spending more time maintaining scaffolding than improving outcomes
Reduction fails when:
- Your underlying data is messy, inconsistent, or poorly documented
- The domain requires specialized knowledge the model lacks
- Safety constraints require limiting what the agent can do
- Operations are truly complex and benefit from structured workflows
Stop Constraining Reasoning
A common anti-pattern is building tools to "protect" the model from complexity, for example by pre-filtering context, constraining options, or wrapping interactions in validation logic. These guardrails often become liabilities as models improve.
The question to ask: are your tools enabling new capabilities, or are they constraining reasoning the model could handle on its own?
Build for Future Models
Models improve faster than tooling can keep up. An architecture optimized for today's model may be over-constrained for tomorrow's. Build minimal architectures that can benefit from model improvements rather than sophisticated architectures that lock in current limitations.
See Architectural Reduction Case Study for production evidence.
Tool Description Engineering
Description Structure
Effective tool descriptions answer four questions:
What does the tool do? Clear, specific description of functionality. Avoid vague language like "helps with" or "can be used for." State exactly what the tool accomplishes.
When should it be used? Specific triggers and contexts. Include both direct triggers ("User asks about pricing") and indirect signals ("Need current market rates").
What inputs does it accept? Parameter descriptions with types, constraints, and defaults. Explain what each parameter controls.
What does it return? Output format and structure. Include examples of successful responses and error conditions.
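The four questions can be captured in a small structure so that every tool answers them the same way. This is a sketch only: the `ToolDescription` class, its field names, and the pricing tool are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolDescription:
    """Holds the four answers a description should give (names are illustrative)."""
    what: str
    when: str
    inputs: dict  # parameter name -> type, constraints, and default
    returns: str

    def render(self) -> str:
        """Render the description text that is loaded into the agent's context."""
        lines = [self.what, "", "Use when:", f"  {self.when}", "", "Args:"]
        lines += [f"  {name}: {desc}" for name, desc in self.inputs.items()]
        lines += ["", "Returns:", f"  {self.returns}"]
        return "\n".join(lines)


pricing = ToolDescription(
    what="Look up the current list price for a product SKU.",
    when="User asks about pricing, or a quote needs current rates.",
    inputs={"sku": 'str, format "SKU-####"',
            "currency": 'str, ISO 4217 code, default "USD"'},
    returns="Object with sku, price, currency; NOT_FOUND error if the SKU is unknown.",
)
text = pricing.render()
```

Because every field is required, a description that skips one of the four questions fails at construction time rather than at agent runtime.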
Default Parameter Selection
Defaults should reflect common use cases. They reduce agent burden by eliminating unnecessary parameter specification. They prevent errors from omitted parameters.
Response Format Optimization
Tool response size significantly impacts context usage. Implementing response format options gives agents control over verbosity.
Concise format returns essential fields only, appropriate for confirmation or basic information. Detailed format returns complete objects with all fields, appropriate when full context is needed for decisions.
Include guidance in tool descriptions about when to use each format. Agents learn to select appropriate formats based on task requirements.
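A minimal sketch of the two formats, assuming a hypothetical customer record and an illustrative choice of which fields count as essential:

```python
# Hypothetical customer record; field names are illustrative.
CUSTOMER = {
    "customer_id": "CUST-000001", "name": "Ada Example", "status": "active",
    "email": "ada@example.com", "created_at": "2024-03-01",
    "notes": "Prefers email contact.",
}

# Assumed set of essential fields for the concise format.
CONCISE_FIELDS = ("customer_id", "name", "status")


def format_response(record: dict, response_format: str = "concise") -> dict:
    """Return essential fields for "concise", the full record for "detailed"."""
    if response_format == "concise":
        return {key: record[key] for key in CONCISE_FIELDS}
    if response_format == "detailed":
        return dict(record)
    raise ValueError('response_format must be "concise" or "detailed"')


concise = format_response(CUSTOMER)
detailed = format_response(CUSTOMER, "detailed")
```

Defaulting to the concise format keeps routine calls cheap in tokens, and the agent opts into the full record only when a decision needs it.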
Error Message Design
Error messages serve two audiences: developers debugging issues and agents recovering from failures. For agents, error messages must be actionable. They must tell the agent what went wrong and how to correct it.
Design error messages that enable recovery. For retryable errors, include retry guidance. For input errors, include corrected format. For missing data, include what's needed.
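As an illustration of an input error that carries its own correction, the sketch below validates the CUST-###### ID format used elsewhere in this skill. The error field names (`expected_format`, `example`, `retryable`) are assumptions, not a standard.

```python
import re

CUSTOMER_ID_PATTERN = re.compile(r"CUST-\d{6}")


def validate_customer_id(customer_id: str) -> dict:
    """Return ok, or an error that states what went wrong and how to fix it."""
    if CUSTOMER_ID_PATTERN.fullmatch(customer_id):
        return {"ok": True}
    return {
        "ok": False,
        "code": "INVALID_FORMAT",
        "message": f"customer_id {customer_id!r} is not valid.",
        "expected_format": "CUST-###### (six digits)",
        "example": "CUST-000001",
        "retryable": True,  # the agent can retry with a corrected ID
    }


error = validate_customer_id("12345")
```

The agent can read `expected_format` and `example` to build a corrected call without further guidance.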
Tool Definition Schema
Use a consistent schema across all tools. Establish naming conventions: verb-noun pattern for tool names, consistent parameter names across tools, consistent return field names.
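A naming convention is easier to keep when it is checked mechanically. The sketch below assumes one possible reading of the verb-noun pattern: lowercase snake_case names that begin with a verb from an approved list. Both the list and the regular expression are illustrative.

```python
import re

# Assumed convention: snake_case, starting with one of these verbs.
APPROVED_VERBS = {"get", "list", "create", "update", "delete", "search", "schedule"}
SNAKE_CASE = re.compile(r"[a-z]+(_[a-z]+)+")


def follows_verb_noun(tool_name: str) -> bool:
    """Check that a tool name is snake_case and begins with an approved verb."""
    if not SNAKE_CASE.fullmatch(tool_name):
        return False
    return tool_name.split("_", 1)[0] in APPROVED_VERBS


names = ["get_customer", "schedule_event", "customerLookup", "search"]
violations = [name for name in names if not follows_verb_noun(name)]
```

Here `customerLookup` fails on casing and `search` fails because it has no noun; a check like this could run in continuous integration whenever a tool is added.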
Tool Collection Design
Research shows tool description overlap causes model confusion. More tools do not always lead to better outcomes. A reasonable guideline is 10-20 tools for most applications. If more are needed, use namespacing to create logical groupings.
Implement mechanisms to help agents select the right tool: tool grouping, example-based selection, and hierarchy with umbrella tools that route to specialized sub-tools.
MCP Tool Naming Requirements
When using MCP (Model Context Protocol) tools, always use fully qualified tool names to avoid "tool not found" errors.
Format: ServerName:tool_name
Correct: Fully qualified names
"Use the BigQuery:bigquery_schema tool to retrieve table schemas."
"Use the GitHub:create_issue tool to create issues."
Incorrect: Unqualified names
"Use the bigquery_schema tool..." # May fail with multiple servers
Without the server prefix, agents may fail to locate tools, especially when multiple MCP servers are available. Establish naming conventions that include server context in all tool references.
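A small helper can enforce the ServerName:tool_name convention wherever tool references are written or validated. This sketch only parses the format described in this section; the function name is hypothetical and it does not call any MCP library.

```python
def parse_qualified_name(name: str) -> tuple[str, str]:
    """Split "ServerName:tool_name" into (server, tool); reject unqualified names."""
    server, separator, tool = name.partition(":")
    if not separator or not server or not tool:
        raise ValueError(
            f'{name!r} is not fully qualified; expected "ServerName:tool_name" '
            '(for example "BigQuery:bigquery_schema").'
        )
    return server, tool


server, tool = parse_qualified_name("BigQuery:bigquery_schema")
```

The error message repeats the expected format and an example so that an agent that wrote an unqualified name can correct it on the next attempt.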
Using Agents to Optimize Tools
Claude can optimize its own tools. When given a tool and observed failure modes, it diagnoses issues and suggests improvements. Production testing shows this approach achieved a 40% reduction in task completion time by helping future agents avoid mistakes.
The Tool-Testing Agent Pattern:
```python
def optimize_tool_description(tool_spec, failure_examples):
    """
    Use an agent to analyze tool failures and improve descriptions.

    Process:
    1. Agent attempts to use tool across diverse tasks
    2. Collect failure modes and friction points
    3. Agent analyzes failures and proposes improvements
    4. Test improved descriptions against same tasks
    """
    prompt = f"""
    Analyze this tool specification and the observed failures.

    Tool: {tool_spec}

    Failures observed:
    {failure_examples}

    Identify:
    1. Why agents are failing with this tool
    2. What information is missing from the description
    3. What ambiguities cause incorrect usage

    Propose an improved tool description that addresses these issues.
    """
    # get_agent_response is not defined here; it stands for whatever
    # function sends a prompt to your agent and returns its reply.
    return get_agent_response(prompt)
```
This creates a feedback loop: agents using tools generate failure data, which agents then use to improve tool descriptions, which reduces future failures.
Testing Tool Design
Evaluate tool designs against five criteria: unambiguity, completeness, recoverability, efficiency, and consistency. Test tools by presenting representative agent requests and evaluating the resulting tool calls.
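A simple harness for this kind of test might look like the sketch below. The `select_tool` callable stands in for a real agent call, and the keyword-based selector exists only so the example runs on its own; the requests and tool names are hypothetical.

```python
def evaluate_tool_selection(select_tool, cases):
    """Run each request through select_tool and report mismatches.

    select_tool: callable mapping a request string to a tool name. In practice
    this would wrap a model call; here the caller supplies a stand-in.
    cases: list of (request, expected_tool_name) pairs.
    """
    failures = []
    for request, expected in cases:
        actual = select_tool(request)
        if actual != expected:
            failures.append({"request": request, "expected": expected, "actual": actual})
    accuracy = 1 - len(failures) / len(cases) if cases else 0.0
    return {"accuracy": accuracy, "failures": failures}


# Stand-in selector for demonstration only; a real harness would query the agent.
def keyword_selector(request):
    return "schedule_event" if "meeting" in request.lower() else "get_customer"


report = evaluate_tool_selection(keyword_selector, [
    ("Book a meeting with Bob tomorrow", "schedule_event"),
    ("What is the status of CUST-000001?", "get_customer"),
    ("Set up a call with the design team", "schedule_event"),
])
```

The recorded failures (here, the "Set up a call" request routed to the wrong tool) are the raw material for revising descriptions and rerunning the same cases.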
Practical Guidance
Anti-Patterns to Avoid
Vague descriptions: "Search the database for customer information" leaves too many questions unanswered.
Cryptic parameter names: Parameters named x, val, or param1 force agents to guess meaning.
Missing error handling: Tools that fail with generic errors provide no recovery guidance.
Inconsistent naming: Using id in some tools, identifier in others, and customer_id in still others creates confusion.
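Some of these anti-patterns can be caught with a lightweight lint pass before a tool reaches an agent. The sketch below is a rough heuristic: the list of cryptic names and the minimum description length are assumptions chosen for illustration.

```python
# Thresholds and name lists below are assumptions for illustration.
CRYPTIC_NAMES = {"x", "val", "param1", "data", "arg"}
MIN_DESCRIPTION_WORDS = 8


def lint_tool(name: str, description: str, parameters: list) -> list:
    """Return human-readable findings for common tool design anti-patterns."""
    findings = []
    if len(description.split()) < MIN_DESCRIPTION_WORDS:
        findings.append(f"{name}: description is too short to cover purpose, usage, and returns")
    for parameter in parameters:
        if parameter in CRYPTIC_NAMES:
            findings.append(f"{name}: parameter {parameter!r} is cryptic")
    return findings


findings = lint_tool("search", "Search the database.", ["x", "query"])
```

A word count cannot judge whether a description is actually clear, so a check like this complements testing with real agent interactions rather than replacing it.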
Tool Selection Framework
When designing tool collections:
- Identify distinct workflows agents must accomplish
- Group related actions into comprehensive tools
- Ensure each tool has a clear, unambiguous purpose
- Document error cases and recovery paths
- Test with actual agent interactions
Examples
Example 1: Well-Designed Tool
```python
def get_customer(customer_id: str, format: str = "concise"):
    """
    Retrieve customer information by ID.

    Use when:
    - User asks about specific customer details
    - Need customer context for decision-making
    - Verifying customer identity

    Args:
        customer_id: Format "CUST-######" (e.g., "CUST-000001")
        format: "concise" for key fields, "detailed" for complete record

    Returns:
        Customer object with requested fields

    Errors:
        NOT_FOUND: Customer ID not found
        INVALID_FORMAT: ID must match CUST-###### pattern
    """
```
Example 2: Poor Tool Design
This example demonstrates several tool design anti-patterns:
```python
def search(query):
    """Search the database."""
    pass
```
Problems with this design:
- Vague name: "search" is ambiguous - search what, for what purpose?
- Missing parameters: What database? What format should query take?
- No return description: What does this function return? A list? A string? Error handling?
- No usage context: When should an agent use this versus other tools?
- No error handling: What happens if the database is unavailable?
Failure modes:
- Agents may call this tool when they should use a more specific tool
- Agents cannot determine correct query format
- Agents cannot interpret results
- Agents cannot recover from failures
Guidelines
- Write descriptions that state what the tool does, when to use it, and what it returns
- Use consolidation to reduce ambiguity
- Implement response format options for token efficiency
- Design error messages for agent recovery
- Establish and follow consistent naming conventions
- Limit tool count and use namespacing for organization
- Test tool designs with actual agent interactions
- Iterate based on observed failure modes
- Question whether each tool enables or constrains the model
- Prefer primitive, general-purpose tools over specialized wrappers
- Invest in documentation quality over tooling sophistication
- Build minimal architectures that benefit from model improvements
Integration
This skill connects to:
- context-fundamentals - How tools interact with context
- multi-agent-patterns - Specialized tools per agent
- evaluation - Evaluating tool effectiveness
References
Internal references:
- Best Practices Reference - Detailed tool design guidelines
- Architectural Reduction Case Study - Production evidence for tool minimalism
Related skills in this collection:
- context-fundamentals - Tool context interactions
- evaluation - Tool testing patterns
External resources:
- MCP (Model Context Protocol) documentation
- Framework tool conventions
- API design best practices for agents
- Vercel d0 agent architecture case study
Skill Metadata
Created: 2025-12-20
Last Updated: 2025-12-23
Author: Agent Skills for Context Engineering Contributors
Version: 1.1.0