context-fundamentals
Context Engineering Fundamentals
Context is the complete state available to a language model at inference time. It includes everything the model can attend to when generating responses: system instructions, tool definitions, retrieved documents, message history, and tool outputs. Understanding context fundamentals is prerequisite to effective context engineering.
When to Activate
Activate this skill when:
- Designing new agent systems or modifying existing architectures
- Debugging unexpected agent behavior that may relate to context
- Optimizing context usage to reduce token costs or improve performance
- Onboarding new team members to context engineering concepts
- Reviewing context-related design decisions
Core Concepts
Context comprises several distinct components, each with different characteristics and constraints. The attention mechanism creates a finite budget that constrains effective context usage. Progressive disclosure manages this constraint by loading information only as needed. The engineering discipline is curating the smallest high-signal token set that achieves desired outcomes.
Detailed Topics
The Anatomy of Context
System Prompts
System prompts establish the agent's core identity, constraints, and behavioral guidelines. They are loaded once at session start and typically persist throughout the conversation. System prompts should be extremely clear and use simple, direct language at the right altitude for the agent.
The right altitude balances two failure modes. At one extreme, engineers hardcode complex brittle logic that creates fragility and maintenance burden. At the other extreme, engineers provide vague high-level guidance that fails to give concrete signals for desired outputs or falsely assumes shared context. The optimal altitude strikes a balance: specific enough to guide behavior effectively, yet flexible enough to provide strong heuristics.
Organize prompts into distinct sections using XML tagging or Markdown headers to delineate background information, instructions, tool guidance, and output description. The exact formatting matters less as models become more capable, but structural clarity remains valuable.
Tool Definitions
Tool definitions specify the actions an agent can take. Each tool includes a name, description, parameters, and return format. After serialization, tool definitions sit near the front of the context, typically adjacent to the system prompt.
Tool descriptions collectively steer agent behavior. Poor descriptions force agents to guess; optimized descriptions include usage context, examples, and defaults. The consolidation principle states that if a human engineer cannot definitively say which tool should be used in a given situation, an agent cannot be expected to do better.
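As a concrete illustration, here is a minimal tool definition in the common JSON-schema style. The tool name, description wording, and parameters are hypothetical, not taken from any particular framework; the point is that the description carries usage context, limits, and defaults rather than forcing the agent to guess.

```python
# A hypothetical tool definition in the common JSON-schema style.
# Note that the description states when to use the tool, its limits,
# and its defaults -- the signals that steer agent behavior.
read_file_tool = {
    "name": "read_file",
    "description": (
        "Read a UTF-8 text file and return its contents. "
        "Use for source files and docs under 1 MB; for larger files, "
        "prefer a ranged read. Default encoding is utf-8."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "path": {
                "type": "string",
                "description": "Path relative to the repository root",
            },
            "encoding": {"type": "string", "default": "utf-8"},
        },
        "required": ["path"],
    },
    "returns": "File contents as a string, or an error message.",
}
```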
Retrieved Documents
Retrieved documents provide domain-specific knowledge, reference materials, or task-relevant information. Agents use retrieval augmented generation to pull relevant documents into context at runtime rather than pre-loading all possible information.
The just-in-time approach maintains lightweight identifiers (file paths, stored queries, web links) and uses these references to load data into context dynamically. This mirrors human cognition: we generally do not memorize entire corpuses of information but rather use external organization and indexing systems to retrieve relevant information on demand.
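A minimal sketch of the just-in-time pattern, with a throwaway temp file standing in for a real document: the agent's context holds only a one-line reference, and full content is read only when needed.

```python
import pathlib
import tempfile

class DocumentRef:
    """Lightweight identifier kept in context instead of full content."""

    def __init__(self, path: str):
        self.path = pathlib.Path(path)

    def summary_line(self) -> str:
        # What sits in the agent's context: a one-line reference, not the text.
        return f"{self.path.name} ({self.path.stat().st_size} bytes)"

    def load(self) -> str:
        # Full content enters context only when the agent decides it is needed.
        return self.path.read_text(encoding="utf-8")

# Demo with a throwaway file standing in for a real document.
tmp = pathlib.Path(tempfile.mkdtemp()) / "api_notes.md"
tmp.write_text("Endpoint list...\n" * 100, encoding="utf-8")
ref = DocumentRef(str(tmp))
summary = ref.summary_line()   # cheap: a few tokens
full = ref.load()              # expensive: loaded just in time
```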
Message History
Message history contains the conversation between the user and agent, including previous queries, responses, and reasoning. For long-running tasks, message history can grow to dominate context usage.
Message history serves as scratchpad memory where agents track progress, maintain task state, and preserve reasoning across turns. Effective management of message history is critical for long-horizon task completion.
Tool Outputs
Tool outputs are the results of agent actions: file contents, search results, command execution output, API responses, and similar data. Tool outputs comprise the majority of tokens in typical agent trajectories, with research showing observations (tool outputs) can reach 83.9% of total context usage.
Tool outputs consume context whether they are relevant to current decisions or not. This creates pressure for strategies like observation masking, compaction, and selective tool result retention.
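A sketch of one such strategy, observation masking: older tool outputs are replaced with a short placeholder while the most recent ones are kept verbatim. The message format and the `"tool"` role name are assumptions for illustration, not a fixed API.

```python
def mask_old_observations(messages, keep_last=2, placeholder="[output elided]"):
    """Replace all but the most recent tool outputs with a short placeholder.

    `messages` is a list of {"role": ..., "content": ...} dicts; the
    role name "tool" is an illustrative convention.
    """
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    # Mask everything except the last `keep_last` tool outputs.
    to_mask = set(tool_indices[:-keep_last]) if keep_last else set(tool_indices)
    return [
        {**m, "content": placeholder} if i in to_mask else m
        for i, m in enumerate(messages)
    ]

history = [
    {"role": "user", "content": "Find the bug"},
    {"role": "tool", "content": "10,000 lines of grep output..."},
    {"role": "assistant", "content": "Narrowing down..."},
    {"role": "tool", "content": "stack trace..."},
    {"role": "tool", "content": "file contents..."},
]
trimmed = mask_old_observations(history, keep_last=2)
```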
Context Windows and Attention Mechanics
The Attention Budget Constraint
Language models process tokens through attention mechanisms that create pairwise relationships between all tokens in context. For n tokens, this creates n² relationships that must be computed and stored. As context length increases, the model's ability to capture these relationships gets stretched thin.
Models develop attention patterns from training data distributions where shorter sequences predominate. This means models have less experience with and fewer specialized parameters for context-wide dependencies. The result is an "attention budget" that depletes as context grows.
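The quadratic growth is easy to see with a back-of-the-envelope calculation. The count below is of ordered token pairs including self-attention; causal masking roughly halves the count but does not change the scaling.

```python
def pairwise_relationships(n_tokens: int) -> int:
    # Every token attends to every token, so the attention
    # matrix has n * n entries.
    return n_tokens * n_tokens

# Doubling the context quadruples the attention work.
small = pairwise_relationships(1_000)   # 1,000,000 pairs
large = pairwise_relationships(2_000)   # 4,000,000 pairs
growth = large / small
```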
Position Encoding and Context Extension
Position encoding interpolation allows models to handle sequences longer than their training length by remapping positions into the originally trained range. However, this adaptation degrades the model's sense of token position. Models remain highly capable at longer contexts but show reduced precision for information retrieval and long-range reasoning compared to their performance on shorter contexts.
The Progressive Disclosure Principle
Progressive disclosure manages context efficiently by loading information only as needed. At startup, agents load only skill names and descriptions—sufficient to know when a skill might be relevant. Full content loads only when a skill is activated for specific tasks.
This approach keeps agents fast while giving them access to more context on demand. The principle applies at multiple levels: skill selection, document loading, and even tool result retrieval.
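A sketch of this two-stage loading, with hypothetical skill names and in-memory loader functions standing in for real skill files:

```python
class SkillRegistry:
    """Progressive disclosure: metadata up front, bodies on activation.

    Skill names and loaders here are hypothetical placeholders.
    """

    def __init__(self):
        self._descriptions = {}  # always in context: cheap
        self._loaders = {}       # full content: loaded lazily

    def register(self, name, description, loader):
        self._descriptions[name] = description
        self._loaders[name] = loader

    def startup_context(self) -> str:
        # Only names and one-line descriptions load at session start.
        return "\n".join(
            f"{n}: {d}" for n, d in sorted(self._descriptions.items())
        )

    def activate(self, name) -> str:
        # The full skill body enters context only when a task calls for it.
        return self._loaders[name]()

registry = SkillRegistry()
registry.register("context-fundamentals", "Core concepts of model context",
                  lambda: "Full text of the context-fundamentals skill...")
registry.register("tool-design", "Guidance for writing tool definitions",
                  lambda: "Full text of the tool-design skill...")

overview = registry.startup_context()   # always loaded
body = registry.activate("tool-design")  # loaded on demand
```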
Context Quality Versus Context Quantity
The assumption that larger context windows solve memory problems has been empirically debunked. Context engineering means finding the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes.
Several factors create pressure for context efficiency. Processing cost grows disproportionately with context length: doubling the tokens more than doubles the time and compute, since attention cost scales quadratically with sequence length. Model performance degrades beyond certain context lengths even when the window technically supports more tokens. Long inputs remain expensive even with prefix caching.
The guiding principle is informativity over exhaustiveness. Include what matters for the decision at hand, exclude what does not, and design systems that can access additional information on demand.
Context as Finite Resource
Context must be treated as a finite resource with diminishing marginal returns. Like humans with limited working memory, language models have an attention budget drawn on when parsing large volumes of context.
Every new token introduced depletes this budget by some amount. This creates the need for careful curation of available tokens. The engineering problem is optimizing utility against inherent constraints.
Context engineering is iterative and the curation phase happens each time you decide what to pass to the model. It is not a one-time prompt writing exercise but an ongoing discipline of context management.
Practical Guidance
File-System-Based Access
Agents with filesystem access can use progressive disclosure naturally. Store reference materials, documentation, and data externally. Load files only when needed using standard filesystem operations. This pattern avoids stuffing context with information that may not be relevant.
The file system itself provides structure that agents can navigate. File sizes suggest complexity; naming conventions hint at purpose; timestamps serve as proxies for relevance. This metadata lets agents refine their behavior efficiently without loading file contents.
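One way to exploit such metadata, sketched here with throwaway files and modification time as a cheap relevance proxy (the file names are illustrative):

```python
import os
import pathlib
import tempfile
import time

def rank_by_recency(directory):
    """Order files newest-first, using mtime as a cheap relevance proxy."""
    paths = [p for p in pathlib.Path(directory).iterdir() if p.is_file()]
    return sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)

# Demo: two throwaway files with explicit timestamps.
root = pathlib.Path(tempfile.mkdtemp())
old = root / "legacy_notes.md"
old.write_text("old")
new = root / "current_design.md"
new.write_text("new")

now = time.time()
os.utime(old, (now - 3600, now - 3600))  # touched an hour ago
os.utime(new, (now, now))                # touched just now

ranked = rank_by_recency(root)  # recent files first
```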
Hybrid Strategies
The most effective agents employ hybrid strategies. Pre-load some context for speed (like CLAUDE.md files or project rules), but enable autonomous exploration for additional context as needed. The decision boundary depends on task characteristics and context dynamics.
For contexts with less dynamic content, pre-loading more upfront makes sense. For rapidly changing or highly specific information, just-in-time loading avoids stale context.
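A minimal sketch of the hybrid split, assembling stable content inline and listing volatile content as references to load on demand. The section labels and file paths are hypothetical.

```python
def build_initial_context(preloaded: dict, jit_refs: list) -> str:
    """Hybrid strategy: stable content inline, volatile content as references.

    Keys and file names here are illustrative.
    """
    parts = ["# Preloaded (stable) context"]
    for name, text in preloaded.items():
        parts.append(f"## {name}\n{text}")
    parts.append("# Available on demand (load just in time)")
    parts.extend(f"- {ref}" for ref in jit_refs)
    return "\n".join(parts)

context = build_initial_context(
    preloaded={"project_rules": "Use type hints. Run tests before commit."},
    jit_refs=["docs/api/endpoints.md", "logs/latest_run.txt"],
)
```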
Context Budgeting
Design with explicit context budgets in mind. Know the effective context limit for your model and task. Monitor context usage during development. Implement compaction triggers at appropriate thresholds. Design systems assuming context will degrade rather than hoping it will not.
Effective context budgeting requires understanding not just raw token counts but also attention distribution patterns. The middle of context receives less attention than the beginning and end. Place critical information at attention-favored positions.
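A sketch of an explicit budget with a compaction trigger; the window size and the 0.75 threshold are illustrative numbers in the commonly suggested 70-80% range.

```python
class ContextBudget:
    """Track token usage against an explicit budget.

    Window size and threshold are illustrative, not prescriptive.
    """

    def __init__(self, window: int, compact_at: float = 0.75):
        self.window = window
        self.compact_at = compact_at
        self.used = 0

    def add(self, tokens: int) -> None:
        self.used += tokens

    def utilization(self) -> float:
        return self.used / self.window

    def needs_compaction(self) -> bool:
        # Fire before the window fills, not after quality has degraded.
        return self.utilization() >= self.compact_at

budget = ContextBudget(window=100_000, compact_at=0.75)
budget.add(60_000)
early = budget.needs_compaction()   # 60% used: below threshold
budget.add(20_000)
late = budget.needs_compaction()    # 80% used: compaction due
```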
Examples
Example 1: Organizing System Prompts
```markdown
<BACKGROUND_INFORMATION>
You are a Python expert helping a development team.
Current project: Data processing pipeline in Python 3.9+
</BACKGROUND_INFORMATION>

<INSTRUCTIONS>
- Write clean, idiomatic Python code
- Include type hints for function signatures
- Add docstrings for public functions
- Follow PEP 8 style guidelines
</INSTRUCTIONS>

<TOOL_GUIDANCE>
Use bash for shell operations, python for code tasks.
File operations should use pathlib for cross-platform compatibility.
</TOOL_GUIDANCE>

<OUTPUT_DESCRIPTION>
Provide code blocks with syntax highlighting.
Explain non-obvious decisions in comments.
</OUTPUT_DESCRIPTION>
```
Example 2: Progressive Document Loading
```markdown
Instead of loading all documentation at once:

Step 1: Load summary
docs/api_summary.md           # Lightweight overview

Step 2: Load specific section as needed
docs/api/endpoints.md         # Only when API calls needed
docs/api/authentication.md    # Only when auth context needed
```
Guidelines
- Treat context as a finite resource with diminishing returns
- Place critical information at attention-favored positions (beginning and end)
- Use progressive disclosure to defer loading until needed
- Organize system prompts with clear section boundaries
- Monitor context usage during development
- Implement compaction triggers at 70-80% utilization
- Design for context degradation rather than hoping to avoid it
- Prefer smaller high-signal context over larger low-signal context
Integration
This skill provides foundational context that all other skills build upon. It should be studied first before exploring:
- context-degradation - Understanding how context fails
- context-optimization - Techniques for extending context capacity
- multi-agent-patterns - How context isolation enables multi-agent systems
- tool-design - How tool definitions interact with context
References
Internal reference:
- Context Components Reference - Detailed technical reference
Related skills in this collection:
- context-degradation - Understanding context failure patterns
- context-optimization - Techniques for efficient context use
External resources:
- Research on transformer attention mechanisms
- Production engineering guides from leading AI labs
- Framework documentation on context window management
Skill Metadata
Created: 2025-12-20
Last Updated: 2025-12-20
Author: Agent Skills for Context Engineering Contributors
Version: 1.0.0