ai-sorting

Build an AI Content Sorter

Guide the user through building an AI that sorts, tags, or categorizes content. Powered by DSPy classification — works with any label set.

Step 1: Define the sorting task

Ask the user:

What are you sorting? (tickets, emails, reviews, messages, etc.)
What are the categories? (list all labels/buckets)
One category per item, or multiple? (e.g., "priority" vs "all applicable tags")

Step 2: Build the sorter

Single category (most common)

python

import dspy
from typing import Literal

# Your categories
CATEGORIES = ["billing", "technical", "account", "feature_request", "general"]

class SortContent(dspy.Signature):
    """Sort the message into the correct category."""
    message: str = dspy.InputField(desc="The content to sort")
    category: Literal[tuple(CATEGORIES)] = dspy.OutputField(desc="The assigned category")

sorter = dspy.ChainOfThought(SortContent)


Using `Literal` restricts the output to the listed categories, so the AI can't return a label outside the set. `ChainOfThought` adds a reasoning step, which typically improves accuracy over a bare `Predict`.
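
To make the `Literal[tuple(CATEGORIES)]` idiom less opaque, here is a small standard-library sketch showing that subscripting `Literal` with a tuple is the same as listing the labels by hand. The `CategoryType` alias and the `is_valid_label` helper are hypothetical names used only for illustration; they are not part of DSPy.

```python
from typing import Literal, get_args

CATEGORIES = ["billing", "technical", "account", "feature_request", "general"]

# Subscripting Literal with a tuple unpacks it into individual literal values.
CategoryType = Literal[tuple(CATEGORIES)]

# The allowed values can be read back from the type.
allowed = get_args(CategoryType)


def is_valid_label(label: str) -> bool:
    """Hypothetical helper: check a label against the allowed set."""
    return label in allowed


print(allowed)
print(is_valid_label("billing"), is_valid_label("refunds"))
```

Because the type is built from the `CATEGORIES` list, adding or renaming a category only needs a change in one place.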

Multiple tags

python

class TagContent(dspy.Signature):
    """Assign all applicable tags to the content."""
    message: str = dspy.InputField(desc="The content to tag")
    tags: list[Literal[tuple(CATEGORIES)]] = dspy.OutputField(desc="All applicable tags")

tagger = dspy.ChainOfThought(TagContent)

Step 3: Test the quality

python

from dspy.evaluate import Evaluate

def sorting_metric(example, prediction, trace=None):
    return prediction.category == example.category

evaluator = Evaluate(
    devset=devset,
    metric=sorting_metric,
    num_threads=4,
    display_progress=True,
    display_table=5,
)
score = evaluator(sorter)

For multi-tag, use F1 or Jaccard similarity instead of exact match.
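
As a rough sketch of what a multi-tag metric might look like, the functions below compute Jaccard similarity and F1 over sets of tags. The names `jaccard`, `f1`, and `tagging_metric` are assumptions made for this example, and `SimpleNamespace` objects stand in for real examples and predictions so the snippet runs without a model. Treating two empty tag sets as a perfect match is also a choice made here, not something the guide specifies.

```python
from types import SimpleNamespace


def jaccard(gold, predicted):
    """Jaccard similarity: size of the intersection over size of the union."""
    gold_set, pred_set = set(gold), set(predicted)
    if not gold_set and not pred_set:
        return 1.0  # assumption: two empty tag sets count as a perfect match
    return len(gold_set & pred_set) / len(gold_set | pred_set)


def f1(gold, predicted):
    """F1: harmonic mean of precision and recall over the tag sets."""
    gold_set, pred_set = set(gold), set(predicted)
    if not gold_set and not pred_set:
        return 1.0
    overlap = len(gold_set & pred_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def tagging_metric(example, prediction, trace=None):
    """Same (example, prediction, trace) shape as sorting_metric, scored by Jaccard."""
    return jaccard(example.tags, prediction.tags)


# Stand-ins for a labeled example and a model prediction.
example = SimpleNamespace(tags=["billing", "account"])
prediction = SimpleNamespace(tags=["billing", "technical"])
score = tagging_metric(example, prediction)
print(score, f1(example.tags, prediction.tags))
```

Jaccard penalizes extra and missing tags equally; F1 is a little more forgiving of partial overlap, so which one fits depends on how costly a wrong tag is for the task.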

Step 4: Improve accuracy

Start with `BootstrapFewShot`, which is fast and usually gives a solid boost:

python

optimizer = dspy.BootstrapFewShot(
    metric=sorting_metric,
    max_bootstrapped_demos=4,
)
optimized_sorter = optimizer.compile(sorter, trainset=trainset)

If accuracy still isn't good enough, upgrade to `MIPROv2`:

python

optimizer = dspy.MIPROv2(
    metric=sorting_metric,
    auto="medium",
)
optimized_sorter = optimizer.compile(sorter, trainset=trainset)
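
Both optimizers take a `trainset`, and Step 3 evaluates on a separate `devset`, but the guide does not show how to produce the two. One possible approach, sketched below with the standard library only, is a per-category split so that every label appears in training. The `stratified_split` helper, the `dev_fraction` parameter, and the sample rows are all made up for illustration; each row would still need wrapping in `dspy.Example` before being passed to DSPy.

```python
import random
from collections import defaultdict


def stratified_split(rows, dev_fraction=0.25, seed=0):
    """Split (message, category) rows into train and dev lists, per category."""
    by_category = defaultdict(list)
    for message, category in rows:
        by_category[category].append((message, category))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, dev = [], []
    for category in sorted(by_category):
        items = by_category[category]
        rng.shuffle(items)
        # Always keep at least one item of each category for training.
        n_dev = min(int(len(items) * dev_fraction), len(items) - 1)
        dev.extend(items[:n_dev])
        train.extend(items[n_dev:])
    return train, dev


# Made-up labeled rows, for illustration only.
labeled = [
    ("I was charged twice", "billing"),
    ("Refund has not arrived", "billing"),
    ("Invoice shows the wrong amount", "billing"),
    ("Please update my card", "billing"),
    ("App crashes on launch", "technical"),
    ("Login page will not load", "technical"),
    ("Export button does nothing", "technical"),
    ("Sync keeps failing", "technical"),
]

train_rows, dev_rows = stratified_split(labeled)
print(len(train_rows), len(dev_rows))
```

Keeping the dev rows out of training matters because a score measured on examples the optimizer has already seen will overstate accuracy.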

Step 5: Use it

python

result = optimized_sorter(message="I was charged twice on my credit card last month")
print(f"Category: {result.category}")
print(f"Reasoning: {result.reasoning}")
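
In practice the sorter usually runs over many items rather than one. The sketch below shows one way that might be wrapped, assuming any callable that accepts `message=` and returns an object with a `category` attribute. `sort_batch` and `fake_sorter` are hypothetical names; `fake_sorter` replaces `optimized_sorter` only so the snippet runs without a model, and falling back to `"general"` on errors is a design choice made here, not something the guide prescribes.

```python
from types import SimpleNamespace


def sort_batch(sorter, messages, fallback="general"):
    """Sort each message; if the sorter raises, assign the fallback category."""
    results = []
    for message in messages:
        try:
            category = sorter(message=message).category
        except Exception:
            # Broad catch keeps the batch going; narrow it in real code.
            category = fallback
        results.append((message, category))
    return results


def fake_sorter(message):
    """Stand-in for optimized_sorter so the sketch runs without a model."""
    if not message.strip():
        raise ValueError("empty message")
    label = "billing" if "charged" in message else "general"
    return SimpleNamespace(category=label)


routed = sort_batch(fake_sorter, ["I was charged twice", "   ", "Hello there"])
print(routed)
```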

Key patterns

Use `Literal` types to lock outputs to valid categories
Use `ChainOfThought` over `Predict`: reasoning improves sorting accuracy

Include a `hint` field during training for tricky examples:

python

class SortWithHint(dspy.Signature):
    message: str = dspy.InputField()
    hint: str = dspy.InputField(desc="Optional hint for ambiguous cases")
    category: Literal[tuple(CATEGORIES)] = dspy.OutputField()

Set `hint` in training data and leave it empty at inference time.

Confidence scores: add a float output field to the signature if you need a confidence value alongside the category.
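
If a confidence field is added, one possible use is to send low-confidence items to a person instead of trusting the label. The sketch below assumes the prediction carries a `confidence` attribute between 0 and 1; that field name, the `route` helper, the `0.7` threshold, and the `needs_review` label are all assumptions made for illustration. A model's self-reported confidence is not a calibrated probability, so any threshold should be checked against labeled data.

```python
from types import SimpleNamespace


def route(prediction, threshold=0.7, review_label="needs_review"):
    """Return the category, or a review label when confidence is low or unusable."""
    try:
        confidence = float(getattr(prediction, "confidence", None))
    except (TypeError, ValueError):
        return review_label  # missing or non-numeric confidence
    if not 0.0 <= confidence <= 1.0:
        return review_label  # out-of-range values are treated as unreliable
    if confidence < threshold:
        return review_label
    return prediction.category


# Stand-ins for model predictions.
confident = SimpleNamespace(category="billing", confidence=0.9)
unsure = SimpleNamespace(category="technical", confidence=0.4)
missing = SimpleNamespace(category="account")

print(route(confident), route(unsure), route(missing))
```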

Additional resources

For worked examples (sentiment, intent, topics), see examples.md
Need scores instead of categories? Use `/ai-scoring`
Next: `/ai-improving-accuracy` to measure and improve your AI
