ai-sorting
Build an AI Content Sorter
Guide the user through building an AI that sorts, tags, or categorizes content. Powered by DSPy classification — works with any label set.
Step 1: Define the sorting task
Ask the user:
- What are you sorting? (tickets, emails, reviews, messages, etc.)
- What are the categories? (list all labels/buckets)
- One category per item, or multiple? (e.g., "priority" vs "all applicable tags")
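The answers to these questions can be written down as a small task description before any model code exists. The sketch below is one way to do that; `SortingTask` and its field names are hypothetical and not part of DSPy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SortingTask:
    """Hypothetical container for the three answers collected in Step 1."""

    content_type: str          # what is being sorted, e.g. "support tickets"
    categories: tuple          # every label/bucket the sorter may assign
    multi_label: bool = False  # False: one category per item, True: all applicable tags

    def __post_init__(self):
        # A sorter needs at least two buckets and no repeated labels.
        if len(self.categories) < 2:
            raise ValueError("need at least two categories")
        if len(set(self.categories)) != len(self.categories):
            raise ValueError("categories must be unique")


task = SortingTask(
    content_type="support tickets",
    categories=("billing", "technical", "account", "feature_request", "general"),
)
print(task)
```

Catching a duplicated or missing label at this point is cheaper than discovering it after the training data has been labeled.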
Step 2: Build the sorter
Single category (most common)
python
import dspy
from typing import Literal

# Your categories
CATEGORIES = ["billing", "technical", "account", "feature_request", "general"]

class SortContent(dspy.Signature):
    """Sort the message into the correct category."""
    message: str = dspy.InputField(desc="The content to sort")
    category: Literal[tuple(CATEGORIES)] = dspy.OutputField(desc="The assigned category")

sorter = dspy.ChainOfThought(SortContent)

Using `Literal` restricts the output field to the listed categories, so the AI can't return a label outside the set. `ChainOfThought` adds a reasoning step, which usually improves accuracy over bare `Predict`.
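The `Literal[tuple(CATEGORIES)]` annotation may look unusual, so here is a small standard-library sketch of what it expands to. It uses only `typing` and does not call DSPy; the `CategoryLabel` alias and the `is_valid_label` helper are illustrative names.

```python
from typing import Literal, get_args

CATEGORIES = ["billing", "technical", "account", "feature_request", "general"]

# Subscripting Literal with a tuple is the same as listing each label by hand:
# Literal["billing", "technical", "account", "feature_request", "general"]
CategoryLabel = Literal[tuple(CATEGORIES)]


def is_valid_label(label: str) -> bool:
    """Return True if the label is one of the values allowed by the Literal type."""
    return label in get_args(CategoryLabel)


print(get_args(CategoryLabel))
# ('billing', 'technical', 'account', 'feature_request', 'general')
print(is_valid_label("billing"), is_valid_label("refund"))
# True False
```

Keeping the labels in one `CATEGORIES` list means the type annotation and any later checks stay in sync when a category is added or renamed.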
Multiple tags
python
class TagContent(dspy.Signature):
    """Assign all applicable tags to the content."""
    message: str = dspy.InputField(desc="The content to tag")
    tags: list[Literal[tuple(CATEGORIES)]] = dspy.OutputField(desc="All applicable tags")

tagger = dspy.ChainOfThought(TagContent)

Step 3: Test the quality
python
from dspy.evaluate import Evaluate

def sorting_metric(example, prediction, trace=None):
    # Exact match between the predicted and the labeled category
    return prediction.category == example.category

evaluator = Evaluate(
    devset=devset,  # held-out labeled examples
    metric=sorting_metric,
    num_threads=4,
    display_progress=True,
    display_table=5,
)
score = evaluator(sorter)

For multi-tag, use F1 or Jaccard similarity instead of exact match.
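As a minimal sketch of the multi-tag case, the two functions below compute Jaccard similarity and F1 over tag sets. They follow the same `(example, prediction, trace=None)` shape as `sorting_metric` and assume the `tags` field from `TagContent`; the function names, and the choice to score two empty tag sets as a perfect match, are assumptions made for illustration.

```python
from types import SimpleNamespace


def jaccard_metric(example, prediction, trace=None):
    """Size of the tag intersection divided by the size of the tag union."""
    gold, pred = set(example.tags), set(prediction.tags)
    if not gold and not pred:
        return 1.0  # assumption: "no tags" predicted for "no tags" counts as correct
    return len(gold & pred) / len(gold | pred)


def f1_metric(example, prediction, trace=None):
    """Harmonic mean of precision and recall over the predicted tags."""
    gold, pred = set(example.tags), set(prediction.tags)
    if not gold and not pred:
        return 1.0
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Stand-in objects with a `tags` attribute, used only to exercise the metrics.
example = SimpleNamespace(tags=["billing", "account"])
prediction = SimpleNamespace(tags=["billing", "technical"])
print(jaccard_metric(example, prediction))  # one shared tag out of three distinct tags
print(f1_metric(example, prediction))       # precision 0.5, recall 0.5
```

Jaccard penalizes extra and missing tags equally, while F1 gives partial matches a somewhat higher score; which one fits better depends on how costly a wrong tag is compared with a missing one.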
Step 4: Improve accuracy
Start with `BootstrapFewShot`, which is fast and usually gives a solid boost:
python
optimizer = dspy.BootstrapFewShot(
    metric=sorting_metric,
    max_bootstrapped_demos=4,
)
optimized_sorter = optimizer.compile(sorter, trainset=trainset)

If accuracy still isn't good enough, upgrade to `MIPROv2`:
python
optimizer = dspy.MIPROv2(
    metric=sorting_metric,
    auto="medium",
)
optimized_sorter = optimizer.compile(sorter, trainset=trainset)
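The optimizers take a `trainset` and the evaluator in Step 3 takes a `devset`, but the guide does not show where either comes from. One possible approach, sketched below in plain Python, is a per-category split of labeled `(message, category)` pairs so that each label shows up in both sets. The `stratified_split` helper, the 20% dev fraction, and the placeholder records are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(records, dev_fraction=0.2, seed=0):
    """Split (message, category) pairs per category into train and dev lists."""
    by_label = defaultdict(list)
    for message, category in records:
        by_label[category].append((message, category))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, dev = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        # Hold out at least one item per label, unless the label has a single item.
        n_dev = max(1, round(len(items) * dev_fraction)) if len(items) > 1 else 0
        dev.extend(items[:n_dev])
        train.extend(items[n_dev:])
    return train, dev


# Placeholder data for illustration only; real records come from your own labeled content.
records = [
    (f"sample message {i}", label)
    for label in ("billing", "technical")
    for i in range(5)
]
train_records, dev_records = stratified_split(records)
print(len(train_records), len(dev_records))  # 8 2

# Each pair would then be wrapped for DSPy, for example:
#   dspy.Example(message=message, category=category).with_inputs("message")
```

Evaluating on items the optimizer never saw gives a more honest accuracy figure than scoring on the training examples themselves.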
Step 5: Use it
python
result = optimized_sorter(message="I was charged twice on my credit card last month")
print(f"Category: {result.category}")
print(f"Reasoning: {result.reasoning}")
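In practice a sorter usually runs over many items at once. The sketch below shows one way to sort a batch and tally the results. To stay runnable without a configured language model it uses `keyword_sorter`, a hypothetical stand-in that mimics the call shape of `optimized_sorter` (a `message=` keyword argument in, an object with a `category` attribute out); `sort_batch` is likewise an illustrative helper.

```python
from collections import Counter
from types import SimpleNamespace


def keyword_sorter(message: str):
    """Hypothetical stand-in for optimized_sorter; uses simple keyword rules."""
    text = message.lower()
    if "charged" in text or "invoice" in text:
        return SimpleNamespace(category="billing")
    if "error" in text or "crash" in text:
        return SimpleNamespace(category="technical")
    return SimpleNamespace(category="general")


def sort_batch(sorter, messages):
    """Run the sorter on each message and return (message, category) pairs."""
    results = []
    for message in messages:
        prediction = sorter(message=message)
        results.append((message, prediction.category))
    return results


messages = [
    "I was charged twice on my credit card last month",
    "The app shows an error when I log in",
    "What are your opening hours?",
]
results = sort_batch(keyword_sorter, messages)
counts = Counter(category for _, category in results)
print(counts)
```

Passing the real `optimized_sorter` in place of `keyword_sorter` should work the same way, assuming it is called with `message=` as shown in Step 5.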
Key patterns
- Use `Literal` types to lock outputs to valid categories
- Use `ChainOfThought` over `Predict`: reasoning improves sorting accuracy
- Include a `hint` field during training for tricky examples:
  python
  class SortWithHint(dspy.Signature):
      message: str = dspy.InputField()
      hint: str = dspy.InputField(desc="Optional hint for ambiguous cases")
      category: Literal[tuple(CATEGORIES)] = dspy.OutputField()
  Set `hint` in training data, leave empty at inference time.
- Confidence scores: Add a float output field if you need confidence
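If a float confidence output is added, one common use is to send low-confidence items to a person instead of acting on them automatically. The sketch below assumes the prediction exposes `category` and `confidence` attributes; the field name `confidence`, the `route` helper, the `needs_review` bucket, and the 0.7 cutoff are all hypothetical. A model-reported confidence is not guaranteed to be a calibrated probability, so the cutoff is best tuned against labeled data.

```python
from types import SimpleNamespace

REVIEW_THRESHOLD = 0.7  # hypothetical cutoff, not a recommended value


def route(prediction, threshold=REVIEW_THRESHOLD):
    """Return the predicted category, or 'needs_review' when confidence is low."""
    # Clamp to [0, 1] in case the model returns a value outside the expected range.
    confidence = min(1.0, max(0.0, float(prediction.confidence)))
    if confidence < threshold:
        return "needs_review"
    return prediction.category


# Stand-in predictions used only to exercise the routing logic.
confident = SimpleNamespace(category="billing", confidence=0.92)
unsure = SimpleNamespace(category="account", confidence=0.40)
print(route(confident))  # billing
print(route(unsure))     # needs_review
```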
Additional resources
- For worked examples (sentiment, intent, topics), see examples.md
- Need scores instead of categories? Use /ai-scoring
- Next: /ai-improving-accuracy to measure and improve your AI