# Senior Prompt Engineer

Prompt engineering patterns, LLM evaluation frameworks, and agentic system design.
## Quick Start

```bash
# Analyze and optimize a prompt file
python scripts/prompt_optimizer.py prompts/my_prompt.txt --analyze

# Evaluate RAG retrieval quality
python scripts/rag_evaluator.py --contexts contexts.json --questions questions.json

# Visualize agent workflow from definition
python scripts/agent_orchestrator.py agent_config.yaml --visualize
```
---

## Tools Overview
### 1. Prompt Optimizer

Analyzes prompts for token efficiency, clarity, and structure. Generates optimized versions.

**Input:** Prompt text file or string
**Output:** Analysis report with optimization suggestions

**Usage:**

```bash
# Analyze a prompt file
python scripts/prompt_optimizer.py prompt.txt --analyze
```
Output:

```
Token count: 847
Estimated cost: $0.0025 (GPT-4)
Clarity score: 72/100

Issues found:
- Ambiguous instruction at line 3
- Missing output format specification
- Redundant context (lines 12-15 repeat lines 5-8)

Suggestions:
1. Add explicit output format: "Respond in JSON with keys: ..."
2. Remove redundant context to save 89 tokens
3. Clarify "analyze" -> "list the top 3 issues with severity ratings"
```
```bash
# Generate optimized version
python scripts/prompt_optimizer.py prompt.txt --optimize --output optimized.txt

# Count tokens for cost estimation
python scripts/prompt_optimizer.py prompt.txt --tokens --model gpt-4

# Extract and manage few-shot examples
python scripts/prompt_optimizer.py prompt.txt --extract-examples --output examples.json
```
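The implementation of `prompt_optimizer.py` isn't shown here, so purely as an illustration, the kind of heuristics behind `--analyze` could be sketched as follows. The 4-characters-per-token estimate, the vague-verb list, and the format-keyword check are all assumptions for the sketch, not the script's actual logic:

```python
import re

# Illustrative vague verbs that often produce ambiguous instructions.
VAGUE_VERBS = {"analyze", "consider", "handle", "improve"}

def analyze_prompt(text: str) -> dict:
    """Return a toy analysis report: token estimate plus detected issues."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    issues = []
    # Flag prompts that never specify an output format.
    if not re.search(r"\b(json|format|respond with)\b", lowered):
        issues.append("Missing output format specification")
    # Flag vague verbs that should be replaced with concrete asks.
    for verb in sorted(VAGUE_VERBS & words):
        issues.append(f'Ambiguous instruction: vague verb "{verb}"')
    return {
        "token_estimate": max(1, len(text) // 4),  # rough ~4 chars/token heuristic
        "issues": issues,
    }

report = analyze_prompt("Analyze this review and tell me what you think.")
print(report["issues"])
```

A real token count would come from a model-specific tokenizer; the heuristic here only conveys the shape of the report.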
---

### 2. RAG Evaluator

Evaluates Retrieval-Augmented Generation quality by measuring context relevance and answer faithfulness.

**Input:** Retrieved contexts (JSON) and questions/answers
**Output:** Evaluation metrics and quality report

**Usage:**

```bash
# Evaluate retrieval quality
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json
```
Output:

```
=== RAG Evaluation Report ===
Questions evaluated: 50

Retrieval Metrics:
  Context Relevance: 0.78 (target: >0.80)
  Retrieval Precision@5: 0.72
  Coverage: 0.85

Generation Metrics:
  Answer Faithfulness: 0.91
  Groundedness: 0.88

Issues Found:
- 8 questions had no relevant context in top-5
- 3 answers contained information not in context

Recommendations:
1. Improve chunking strategy for technical documents
2. Add metadata filtering for date-sensitive queries
```
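Precision@5 and coverage from the report above are standard retrieval metrics. As a minimal sketch (the `retrieved_ids`/`relevant_ids` representation is an assumption, not the real evaluator's input schema):

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that are relevant."""
    top = retrieved_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant_ids) / len(top)

def coverage(retrieved_ids, relevant_ids):
    """Fraction of relevant documents found anywhere in the retrieval."""
    if not relevant_ids:
        return 1.0
    return len(set(retrieved_ids) & set(relevant_ids)) / len(relevant_ids)

retrieved = ["d1", "d7", "d3", "d9", "d2"]
relevant = {"d1", "d2", "d4"}
print(precision_at_k(retrieved, relevant))  # 2 of top 5 are relevant -> 0.4
print(coverage(retrieved, relevant))        # d1 and d2 found, d4 missed -> 0.666...
```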
```bash
# Evaluate with custom metrics
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
    --metrics relevance,faithfulness,coverage

# Export detailed results
python scripts/rag_evaluator.py --contexts retrieved.json --questions eval_set.json \
    --output report.json --verbose
```
---

### 3. Agent Orchestrator

Parses agent definitions and visualizes execution flows. Validates tool configurations.

**Input:** Agent configuration (YAML/JSON)
**Output:** Workflow visualization, validation report

**Usage:**

```bash
# Validate agent configuration
python scripts/agent_orchestrator.py agent.yaml --validate
```
Output:

```
=== Agent Validation Report ===
Agent: research_assistant
Pattern: ReAct

Tools (4 registered):
  [OK]   web_search  - API key configured
  [OK]   calculator  - No config needed
  [WARN] file_reader - Missing allowed_paths
  [OK]   summarizer  - Prompt template valid

Flow Analysis:
  Max depth: 5 iterations
  Estimated tokens/run: 2,400-4,800
  Potential infinite loop: No

Recommendations:
1. Add allowed_paths to file_reader for security
2. Consider adding early exit condition for simple queries
```
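The per-tool checks behind `--validate` could be sketched as a lookup of required config keys per tool. The rule table and config shape below are illustrative assumptions, not the orchestrator's real schema:

```python
# Hypothetical required-keys table; a real validator would derive this
# from each tool's registration metadata.
RULES = {
    "web_search": ["api_key"],
    "file_reader": ["allowed_paths"],
    "calculator": [],
}

def validate_tools(config: dict) -> list[str]:
    """Emit one [OK]/[WARN] line per registered tool."""
    lines = []
    for tool, settings in config.get("tools", {}).items():
        missing = [key for key in RULES.get(tool, []) if key not in settings]
        if missing:
            lines.append(f"[WARN] {tool} - Missing {', '.join(missing)}")
        else:
            lines.append(f"[OK] {tool}")
    return lines

config = {"tools": {"web_search": {"api_key": "..."},
                    "file_reader": {},
                    "calculator": {}}}
for line in validate_tools(config):
    print(line)
```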
```bash
# Visualize agent workflow (ASCII)
python scripts/agent_orchestrator.py agent.yaml --visualize
```

Output:

```
┌─────────────────────────────────────────┐
│           research_assistant            │
│             (ReAct Pattern)             │
└────────────────────┬────────────────────┘
                     │
            ┌────────▼────────┐
            │   User Query    │
            └────────┬────────┘
                     │
            ┌────────▼────────┐
            │      Think      │◄──────┐
            └────────┬────────┘       │
                     │                │
            ┌────────▼────────┐       │
            │   Select Tool   │       │
            └────────┬────────┘       │
                     │                │
      ┌──────────────┼──────────────┐ │
      ▼              ▼              ▼ │
[web_search]   [calculator]  [file_reader]
      │              │              │ │
      └──────────────┼──────────────┘ │
                     │                │
            ┌────────▼────────┐       │
            │     Observe     │───────┘
            └────────┬────────┘
                     │
            ┌────────▼────────┐
            │  Final Answer   │
            └─────────────────┘
```
```bash
# Export workflow as Mermaid diagram
python scripts/agent_orchestrator.py agent.yaml --visualize --format mermaid
```
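As a sketch of what a Mermaid export might emit for the ReAct loop shown above (the hand-written edge list stands in for the parsed YAML; the real `--format mermaid` output is not shown here):

```python
# Hypothetical edge list for the ReAct flow; node names are illustrative.
EDGES = [
    ("UserQuery", "Think"),
    ("Think", "SelectTool"),
    ("SelectTool", "Observe"),
    ("Observe", "Think"),        # the ReAct loop back-edge
    ("Observe", "FinalAnswer"),
]

def to_mermaid(edges) -> str:
    """Serialize an edge list as a Mermaid top-down flowchart."""
    lines = ["graph TD"]
    lines += [f"    {src} --> {dst}" for src, dst in edges]
    return "\n".join(lines)

print(to_mermaid(EDGES))
```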
---

## Prompt Engineering Workflows
### Prompt Optimization Workflow

Use when improving an existing prompt's performance or reducing token costs.

**Step 1: Baseline the current prompt**

```bash
python scripts/prompt_optimizer.py current_prompt.txt --analyze --output baseline.json
```

**Step 2: Identify issues**

Review the analysis report for:
- Token waste (redundant instructions, verbose examples)
- Ambiguous instructions (unclear output format, vague verbs)
- Missing constraints (no length limits, no format specification)

**Step 3: Apply optimization patterns**

| Issue | Pattern to Apply |
|---|---|
| Ambiguous output | Add explicit format specification |
| Too verbose | Extract to few-shot examples |
| Inconsistent results | Add role/persona framing |
| Missing edge cases | Add constraint boundaries |

**Step 4: Generate the optimized version**

```bash
python scripts/prompt_optimizer.py current_prompt.txt --optimize --output optimized.txt
```

**Step 5: Compare results**

```bash
python scripts/prompt_optimizer.py optimized.txt --analyze --compare baseline.json
```

Shows: token reduction, clarity improvement, issues resolved.

**Step 6: Validate with test cases**

Run both prompts against your evaluation set and compare outputs.
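The comparison in Step 5 amounts to diffing two analysis reports. A minimal sketch, assuming report keys modeled on the sample `--analyze` output earlier (the actual `baseline.json` format is not shown in this document):

```python
def compare_reports(baseline: dict, optimized: dict) -> dict:
    """Summarize token savings, clarity change, and resolved issues."""
    saved = baseline["token_count"] - optimized["token_count"]
    return {
        "tokens_saved": saved,
        "token_reduction_pct": round(100 * saved / baseline["token_count"], 1),
        "clarity_delta": optimized["clarity_score"] - baseline["clarity_score"],
        "issues_resolved": len(set(baseline["issues"]) - set(optimized["issues"])),
    }

baseline = {"token_count": 847, "clarity_score": 72,
            "issues": ["ambiguous instruction", "redundant context"]}
optimized = {"token_count": 758, "clarity_score": 88,
             "issues": ["ambiguous instruction"]}
print(compare_reports(baseline, optimized))
# {'tokens_saved': 89, 'token_reduction_pct': 10.5, 'clarity_delta': 16, 'issues_resolved': 1}
```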
---

### Few-Shot Example Design Workflow

Use when creating examples for in-context learning.

**Step 1: Define the task clearly**

```
Task: Extract product entities from customer reviews
Input: Review text
Output: JSON with {product_name, sentiment, features_mentioned}
```

**Step 2: Select diverse examples (3-5 recommended)**

| Example Type | Purpose |
|---|---|
| Simple case | Shows basic pattern |
| Edge case | Handles ambiguity |
| Complex case | Multiple entities |
| Negative case | What NOT to extract |

**Step 3: Format consistently**

```
Example 1:
Input: "Love my new iPhone 15, the camera is amazing!"
Output: {"product_name": "iPhone 15", "sentiment": "positive", "features_mentioned": ["camera"]}

Example 2:
Input: "The laptop was okay but battery life is terrible."
Output: {"product_name": "laptop", "sentiment": "mixed", "features_mentioned": ["battery life"]}
```

**Step 4: Validate example quality**

```bash
python scripts/prompt_optimizer.py prompt_with_examples.txt --validate-examples
```

Checks: consistency, coverage, format alignment.

**Step 5: Test with held-out cases**

Ensure the model generalizes beyond your examples.
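One consistency check `--validate-examples` might run: every example's Output must parse as JSON and expose the same keys. A sketch under that assumption (the real validator's checks are not documented here):

```python
import json

def check_examples(outputs: list[str]) -> list[str]:
    """Flag example outputs that are invalid JSON or key-inconsistent."""
    problems, expected_keys = [], None
    for i, raw in enumerate(outputs, 1):
        try:
            keys = set(json.loads(raw))
        except json.JSONDecodeError:
            problems.append(f"Example {i}: output is not valid JSON")
            continue
        if expected_keys is None:
            expected_keys = keys  # first example sets the reference schema
        elif keys != expected_keys:
            problems.append(f"Example {i}: keys {sorted(keys)} differ from example 1")
    return problems

outputs = [
    '{"product_name": "iPhone 15", "sentiment": "positive", "features_mentioned": ["camera"]}',
    '{"product_name": "laptop", "sentiment": "mixed"}',  # missing features_mentioned
]
print(check_examples(outputs))
```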
---

### Structured Output Design Workflow

Use when you need reliable JSON/XML/structured responses.

**Step 1: Define the schema**

```json
{
  "type": "object",
  "properties": {
    "summary": {"type": "string", "maxLength": 200},
    "sentiment": {"enum": ["positive", "negative", "neutral"]},
    "confidence": {"type": "number", "minimum": 0, "maximum": 1}
  },
  "required": ["summary", "sentiment"]
}
```

**Step 2: Include the schema in the prompt**

```
Respond with JSON matching this schema:
- summary (string, max 200 chars): Brief summary of the content
- sentiment (enum): One of "positive", "negative", "neutral"
- confidence (number 0-1): Your confidence in the sentiment
```

**Step 3: Add format enforcement**

```
IMPORTANT: Respond ONLY with valid JSON. No markdown, no explanation.
Start your response with { and end with }
```

**Step 4: Validate outputs**

```bash
python scripts/prompt_optimizer.py structured_prompt.txt --validate-schema schema.json
```
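Validating model responses against the schema above can be done with the `jsonschema` package; a dependency-free sketch that hand-checks the same constraints looks like this (the checks are hardcoded to this one schema for illustration):

```python
import json

def validate_response(raw: str) -> list[str]:
    """Check a model response against the summary/sentiment/confidence schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    errors = []
    if "summary" not in data:
        errors.append("missing required key: summary")
    elif len(data["summary"]) > 200:
        errors.append("summary exceeds 200 chars")
    # sentiment is required, so a missing key also fails the enum check
    if data.get("sentiment") not in {"positive", "negative", "neutral"}:
        errors.append("sentiment not in allowed enum")
    confidence = data.get("confidence")  # optional key
    if confidence is not None and not 0 <= confidence <= 1:
        errors.append("confidence outside [0, 1]")
    return errors

good = '{"summary": "Short.", "sentiment": "positive", "confidence": 0.9}'
bad = '{"summary": "Short.", "sentiment": "meh"}'
print(validate_response(good))  # []
print(validate_response(bad))   # ['sentiment not in allowed enum']
```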
---

## Reference Documentation

| File | Contains | Load when user asks about |
|---|---|---|
| | 10 prompt patterns with input/output examples | "which pattern?", "few-shot", "chain-of-thought", "role prompting" |
| | Evaluation metrics, scoring methods, A/B testing | "how to evaluate?", "measure quality", "compare prompts" |
| | Agent architectures (ReAct, Plan-Execute, Tool Use) | "build agent", "tool calling", "multi-agent" |
## Common Patterns Quick Reference

| Pattern | When to Use | Example |
|---|---|---|
| Zero-shot | Simple, well-defined tasks | "Classify this email as spam or not spam" |
| Few-shot | Complex tasks, consistent format needed | Provide 3-5 examples before the task |
| Chain-of-Thought | Reasoning, math, multi-step logic | "Think step by step..." |
| Role Prompting | Expertise needed, specific perspective | "You are an expert tax accountant..." |
| Structured Output | Need parseable JSON/XML | Include schema + format enforcement |
## Common Commands

```bash
# Prompt Analysis
python scripts/prompt_optimizer.py prompt.txt --analyze   # Full analysis
python scripts/prompt_optimizer.py prompt.txt --tokens    # Token count only
python scripts/prompt_optimizer.py prompt.txt --optimize  # Generate optimized version

# RAG Evaluation
python scripts/rag_evaluator.py --contexts ctx.json --questions q.json  # Evaluate
python scripts/rag_evaluator.py --contexts ctx.json --compare baseline  # Compare to baseline

# Agent Development
python scripts/agent_orchestrator.py agent.yaml --validate       # Validate config
python scripts/agent_orchestrator.py agent.yaml --visualize      # Show workflow
python scripts/agent_orchestrator.py agent.yaml --estimate-cost  # Token estimation
```