enrich
MANDATORY PREPARATION
Invoke {{command_prefix}}agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run {{command_prefix}}teach-maestro first.
Consult the knowledge-systems reference in the agent-workflow skill for RAG architecture, chunking strategies, and retrieval patterns.
Add knowledge sources to ground the workflow in facts. Without grounding, agents tend to hallucinate; with grounding, they can cite their sources.
Knowledge Source Assessment
Identify what knowledge the workflow needs:
| Knowledge Type | Source | Update Frequency | Access Pattern |
|---|---|---|---|
| Domain docs | Internal docs, specs | Monthly | Semantic search |
| Code context | Codebase | Real-time | Code search |
| User data | Database, CRM | Real-time | Structured query |
| External data | APIs, web | Real-time | API call |
| Historical | Logs, past interactions | Daily | Time-range query |
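
The assessment can also be kept as data so later steps can query it. The following is a minimal sketch: the `KnowledgeSource` class and the `sources_needing_live_access` helper are hypothetical names introduced for illustration, and the rows simply mirror the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeSource:
    """One row of the knowledge source assessment (hypothetical structure)."""
    knowledge_type: str
    source: str
    update_frequency: str
    access_pattern: str


# Rows copied from the assessment table.
SOURCES = [
    KnowledgeSource("Domain docs", "Internal docs, specs", "Monthly", "Semantic search"),
    KnowledgeSource("Code context", "Codebase", "Real-time", "Code search"),
    KnowledgeSource("User data", "Database, CRM", "Real-time", "Structured query"),
    KnowledgeSource("External data", "APIs, web", "Real-time", "API call"),
    KnowledgeSource("Historical", "Logs, past interactions", "Daily", "Time-range query"),
]


def sources_needing_live_access(sources):
    """Return sources that change in real time and so cannot rely on a periodic index alone."""
    return [s for s in sources if s.update_frequency == "Real-time"]
```

Filtering by update frequency like this is one way to decide which sources go through the RAG pipeline and which need a live query or API call.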
Add RAG Pipeline
For document-based knowledge (consult the knowledge-systems reference in the agent-workflow skill):
- Select documents: Identify the authoritative source documents
- Chunk strategy: Choose chunking based on document type (prefer semantic chunking over token-based)
- Embed: Use an embedding model appropriate for the domain
- Index: Store in a vector database with metadata
- Retrieve: Implement hybrid search (semantic + keyword)
- Inject: Add retrieved context to the prompt with source attribution
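The steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not a reference implementation: paragraph splitting stands in for semantic chunking, a bag-of-words count vector stands in for a real embedding model, an in-memory list stands in for a vector database, and the names `Chunk`, `retrieve`, `build_prompt`, and the `alpha` weight are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # metadata kept alongside the text for attribution


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_by_paragraph(doc_text, source):
    """Split on blank lines: a crude stand-in for semantic chunking."""
    parts = re.split(r"\n\s*\n", doc_text)
    return [Chunk(p.strip(), source) for p in parts if p.strip()]


def embed(text):
    """Toy 'embedding': word counts. A real pipeline would call an embedding model here."""
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the chunk."""
    q = set(tokenize(query))
    return len(q & set(tokenize(text))) / len(q) if q else 0.0


def retrieve(query, chunks, k=2, alpha=0.5):
    """Hybrid search: alpha * semantic score + (1 - alpha) * keyword score."""
    qv = embed(query)
    scored = [
        (alpha * cosine(qv, embed(c.text)) + (1 - alpha) * keyword_score(query, c.text), c)
        for c in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question, retrieved):
    """Inject retrieved chunks with numbered source attribution."""
    context = "\n".join(f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(retrieved, 1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Keeping `source` on every chunk is what makes the final attribution step possible; swapping in a real embedding model or vector store should only require replacing `embed` and the list passed to `retrieve`.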
Add Structured Data
For database-backed knowledge:
- Define the query interface: Natural language → structured query
- Add guardrails: Read-only access, query complexity limits
- Format results: Transform raw data into context the model can use
- Attribute: Include data source and freshness in the context
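A minimal sketch of the guardrail, formatting, and attribution steps, using SQLite from the Python standard library. It assumes the natural-language-to-SQL translation has already happened upstream; `run_guarded_query`, `format_for_context`, and the `MAX_ROWS` limit are hypothetical names and values chosen for illustration.

```python
import sqlite3
from datetime import datetime, timezone

MAX_ROWS = 50  # assumed cap on result size; tune per workflow


def run_guarded_query(conn, sql, params=()):
    """Run a single SELECT and return at most MAX_ROWS rows as dicts."""
    statement = sql.strip().rstrip(";")
    # Conservative check: this also rejects SELECTs containing ';' inside string literals.
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only single SELECT statements are allowed")
    # Second layer of defense: ask SQLite itself to refuse writes on this connection.
    conn.execute("PRAGMA query_only = ON")
    cursor = conn.execute(statement, params)
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(MAX_ROWS)]


def format_for_context(rows, source, fetched_at):
    """Render rows as compact text, headed by the data source and freshness."""
    header = f"Source: {source} (fetched {fetched_at.isoformat()})"
    lines = [", ".join(f"{k}={v}" for k, v in row.items()) for row in rows]
    return "\n".join([header, *lines])
```

A possible call looks like `format_for_context(run_guarded_query(conn, "SELECT id, plan FROM customers WHERE plan = ?", ("pro",)), "crm.customers", datetime.now(timezone.utc))`, where the table and source names are made up. In production, a database role with read-only permissions is a stronger guarantee than string checks alone.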
Add Real-Time Data
For live information:
- Identify APIs: What external services provide the needed data?
- Cache strategy: How often does the data change? Cache accordingly
- Fallback: What happens when the API is down?
- Attribution: Include data timestamp and source
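One way to combine caching, fallback, and attribution is a small wrapper around the API call. The sketch below is illustrative only: `CachedSource`, `Reading`, and the choice to serve the last good value marked as stale when the API fails are assumptions, not something the workflow prescribes.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Reading:
    value: Any
    source: str
    fetched_at: float  # Unix timestamp of the last successful fetch
    stale: bool = False  # True when served as a fallback after a failed refresh


class CachedSource:
    """Cache a live source for ttl_seconds and fall back to the last good value on errors."""

    def __init__(self, name: str, fetch: Callable[[], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.time):
        self._name = name
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached: Optional[Reading] = None

    def get(self) -> Reading:
        now = self._clock()
        if self._cached is not None and now - self._cached.fetched_at < self._ttl:
            return self._cached  # still fresh enough
        try:
            value = self._fetch()
        except Exception:
            if self._cached is None:
                raise  # nothing to fall back to
            # API is down: serve the old value, clearly flagged as stale.
            return Reading(self._cached.value, self._name, self._cached.fetched_at, stale=True)
        self._cached = Reading(value, self._name, now)
        return self._cached
```

Because every `Reading` carries `source`, `fetched_at`, and `stale`, the prompt can state how old the data is instead of presenting a fallback value as current.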
Enrichment Checklist
- Every knowledge source has attribution (source, date, confidence)
- Retrieval quality tested independently of generation quality
- Chunk sizes tested and optimized for the document types
- Fallbacks exist for all external knowledge sources
- Knowledge base has a refresh/update strategy
- PII is handled appropriately in knowledge sources
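To test retrieval independently of generation, the retriever's ranked output can be scored against a small hand-labeled set of relevant document IDs, with no model call involved. The sketch below uses recall@k and mean reciprocal rank as example metrics; the function names and the shape of `cases` are assumptions made for illustration.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(retrieved_ids, 1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate_retrieval(cases, k=3):
    """cases: list of (retrieved_ids, relevant_ids) pairs, one per test query."""
    n = len(cases)
    return {
        "recall@k": sum(recall_at_k(r, rel, k) for r, rel in cases) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in cases) / n,
    }
```

Running the same labeled queries before and after a chunk-size change gives a direct comparison for the chunking item in the checklist as well.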
Recommended Next Step
After enrichment, run {{command_prefix}}evaluate to test retrieval quality, or {{command_prefix}}iterate to set up continuous monitoring of knowledge freshness.
NEVER:
- Index everything without curation (garbage in = garbage out)
- Skip source attribution (hallucination without attribution is undetectable)
- Build RAG without testing retrieval quality first
- Use fixed chunk sizes for all document types
- Assume embedding similarity equals relevance