prompt-engineer

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Prompt Engineer

Prompt工程师

Overview

概述

Design, test, and optimize prompts for LLM interactions. This skill covers prompt patterns (few-shot, chain-of-thought, ReAct), system prompt design, output formatting, prompt evaluation, and prompt optimization techniques.
为LLM交互设计、测试并优化prompt。本技能涵盖prompt模式(few-shot、chain-of-thought、ReAct)、系统prompt设计、输出格式化、prompt评估以及prompt优化技术。

Features

功能特性

  • Prompt patterns: few-shot, zero-shot, chain-of-thought, ReAct, self-consistency
  • System prompt design: role definition, constraints, output format specification
  • Output formatting: JSON, XML, markdown, structured templates
  • Prompt evaluation: quality metrics, consistency testing, edge case analysis
  • Prompt optimization: token reduction, clarity improvement, robustness testing
  • Prompt模式:few-shot、zero-shot、chain-of-thought、ReAct、self-consistency
  • 系统prompt设计:角色定义、约束条件、输出格式规范
  • 输出格式化:JSON、XML、markdown、结构化模板
  • Prompt评估:质量指标、一致性测试、边缘案例分析
  • Prompt优化:token精简、清晰度提升、鲁棒性测试

Usage

使用方法

  1. Identify the user's prompt need (pattern selection, system prompt, output format, or optimization)
  2. Follow the corresponding workflow below
  3. Produce structured outputs: prompt templates, system prompts, output schemas, or evaluation reports
  1. 识别用户的prompt需求(模式选择、系统prompt、输出格式或优化)
  2. 遵循下方对应的工作流程
  3. 生成结构化输出:prompt模板、系统prompt、输出schema、评估报告

Examples

示例

  • User: "Write a prompt for summarization" Agent: Runs Prompt Design workflow, selects zero-shot pattern, defines role and constraints, produces prompt with output format
  • User: "Optimize this prompt" Agent: Runs Prompt Optimization workflow, identifies ambiguity, reduces token count, adds clarity, tests edge cases
  • User: "Evaluate prompt quality" Agent: Runs Prompt Evaluation workflow, tests against quality metrics, identifies failure modes, produces improvement recommendations
  • 用户:"Write a prompt for summarization" Agent:执行Prompt设计工作流,选择zero-shot模式,定义角色与约束条件,生成带有输出格式的prompt
  • 用户:"Optimize this prompt" Agent:执行Prompt优化工作流,识别模糊点,减少token数量,提升清晰度,测试边缘案例
  • 用户:"Evaluate prompt quality" Agent:执行Prompt评估工作流,对照质量指标进行测试,识别失效模式,生成改进建议

When to Use

使用场景

  • Designing, versioning, and evaluating prompts for LLM-powered features
  • Building agent workflows (ReAct, tool use, multi-agent coordination)
  • Optimizing accuracy, format compliance, latency, and token cost
  • Deploying guardrails, observability, and abuse defenses for GenAI in production
  • 为LLM驱动的功能设计、版本管理并评估prompt
  • 构建Agent工作流(ReAct、工具调用、多Agent协作)
  • 优化准确性、格式合规性、延迟和token成本
  • 在生产环境中为生成式AI部署防护机制、可观测性和滥用防御措施

When NOT to Use

非适用场景

  • Classical ML model training, feature engineering, or statistical A/B tests → use
    data-scientist
  • General technical writing, API reference, or runbooks → use
    tech-writer-researcher
  • Cloud infrastructure, CI/CD, or Kubernetes operations → use
    infrastructure-engineer
  • Revenue recognition or finance close procedures → use
    senior-revenue-accountant
  • Multi-feature token reduction roadmap → use
    ai-token-improvement-plan-engineer
  • Rigorous token-efficiency experiments and ablations → use
    research-engineer-scientist-tokens
  • 传统机器学习模型训练、特征工程或统计A/B测试 → 使用
    data-scientist
  • 通用技术文档编写、API参考或运行手册 → 使用
    tech-writer-researcher
  • 云基础设施、CI/CD或Kubernetes运维 → 使用
    infrastructure-engineer
  • 收入确认或财务结账流程 → 使用
    senior-revenue-accountant
  • 多功能token精简路线图 → 使用
    ai-token-improvement-plan-engineer
  • 严谨的token效率实验与消融研究 → 使用
    research-engineer-scientist-tokens

Core Workflows

核心工作流

1. Prompt Design Workflow

1. Prompt设计工作流

Step-by-step process:
  1. Define the task clearly
    • What input does the user provide?
    • What output format is required?
    • What constraints must be enforced?
  2. Choose the pattern
    PatternWhenStructure
    Zero-shotSimple, well-defined tasksInstructions + input
    Few-shotPattern recognition, formattingExamples + task
    Chain-of-thoughtReasoning, math, logic"Let's think step by step"
    Role-basedDomain expertise needed"You are a senior X..."
    StructuredAPI/programmatic consumptionJSON schema, XML template
  3. Draft and iterate
    • Start simple, add complexity only where needed
    • Use clear separators (###, XML tags, markdown)
    • Specify output format explicitly
    • Include constraints and what to avoid
  4. Test with edge cases
    • Empty input, malformed input, adversarial input
    • Boundary conditions
    • Multiple languages or formats
分步流程:
  1. 明确任务定义
    • 用户提供什么输入?
    • 需要什么输出格式?
    • 必须遵守哪些约束条件?
  2. 选择合适的模式
    模式适用场景结构
    Zero-shot简单、定义清晰的任务指令 + 输入
    Few-shot模式识别、格式化任务示例 + 任务
    Chain-of-thought推理、数学、逻辑任务"Let's think step by step"
    基于角色需要领域专业知识"You are a senior X..."
    结构化API/程序化调用JSON schema、XML模板
  3. 草稿与迭代
    • 从简单版本开始,仅在必要时增加复杂度
    • 使用清晰的分隔符(###、XML标签、markdown)
    • 明确指定输出格式
    • 包含约束条件及需规避的内容
  4. 边缘案例测试
    • 空输入、格式错误的输入、对抗性输入
    • 边界条件
    • 多语言或多格式输入

2. Prompt Optimization & Testing

2. Prompt优化与测试

Evaluation dimensions:
  • Accuracy: Does it produce correct results? (human or model judge)
  • Consistency: Same input → same output? (temperature, seed control)
  • Format compliance: Does output match the schema? (JSON validator)
  • Latency: Time to first token, total generation time
  • Cost: Tokens consumed (input + output)
Testing workflow:
  1. Build a benchmark dataset (50-200 diverse examples)
  2. Establish baseline with current prompt
  3. Modify one variable at a time (prompt, model, temperature)
  4. Run A/B comparison on benchmark
  5. Measure and document improvement
评估维度:
  • 准确性:是否生成正确结果?(人工或模型判断)
  • 一致性:相同输入是否产生相同输出?(temperature、seed控制)
  • 格式合规性:输出是否匹配schema?(JSON验证器)
  • 延迟:首token生成时间、总生成时间
  • 成本:消耗的token数量(输入 + 输出)
测试工作流:
  1. 构建基准数据集(50-200个多样化示例)
  2. 使用当前prompt建立基准线
  3. 每次仅修改一个变量(prompt、模型、temperature)
  4. 在基准数据集上进行A/B对比
  5. 测量并记录改进效果

3. Agent Orchestration

3. Agent编排

Agent patterns:
PatternWhenComponents
ReActTool-using agentReasoning + Action + Observation loop
Plan-and-SolveMulti-step tasksPlanner → Executor → Checker
ReflexionSelf-improvementExecute → Evaluate → Revise
Multi-agentComplex workflowsSpecialist agents + coordinator
Tool use checklist:
  • Tool schemas are clearly defined (name, description, parameters)
  • Agent can handle tool failure gracefully
  • Tool results are summarized, not passed raw to user
  • Rate limits and costs are monitored
Agent模式:
模式适用场景组件
ReAct工具调用型Agent推理 + 动作 + 观察循环
Plan-and-Solve多步骤任务规划器 → 执行器 → 检查器
Reflexion自我提升执行 → 评估 → 修订
多Agent复杂工作流专业Agent + 协调器
工具调用检查清单:
  • 工具schema定义清晰(名称、描述、参数)
  • Agent可优雅处理工具调用失败
  • 工具结果经过汇总,而非直接传递给用户
  • 监控速率限制与成本

4. Production Patterns

4. 生产环境模式

Security checklist:
  • Input validated and sanitized
  • Prompt injection defenses in place (delimiters, output filtering)
  • No sensitive data in prompts (PII, secrets)
  • Output filtered for harmful content
  • Rate limiting and abuse detection
Observability:
  • Log all prompts and responses (with PII redaction)
  • Track token usage and cost per user/request
  • Monitor for drift in output quality
  • Alert on error rates and latency spikes
安全检查清单:
  • 输入已验证与清理
  • 已部署prompt注入防御措施(分隔符、输出过滤)
  • prompt中无敏感数据(PII、密钥)
  • 输出已过滤有害内容
  • 速率限制与滥用检测
可观测性:
  • 记录所有prompt与响应(含PII脱敏)
  • 跟踪每个用户/请求的token使用量与成本
  • 监控输出质量漂移
  • 针对错误率与延迟峰值发出警报