
Outlines: Structured Text Generation

When to Use This Skill

Use Outlines when you need to:
  • Guarantee valid JSON/XML/code structure during generation
  • Use Pydantic models for type-safe outputs
  • Support local models (Transformers, llama.cpp, vLLM)
  • Maximize inference speed with zero-overhead structured generation
  • Generate against JSON schemas automatically
  • Control token sampling at the grammar level
GitHub Stars: 8,000+ | From: dottxt.ai (formerly .txt)

Installation

```bash
# Base installation
pip install outlines

# With specific backends
pip install outlines transformers      # Hugging Face models
pip install outlines llama-cpp-python  # llama.cpp
pip install outlines vllm              # vLLM for high throughput
```

Quick Start

Basic Example: Classification

```python
import outlines

# Load model
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Generate with type constraint
prompt = "Sentiment of 'This product is amazing!': "
generator = outlines.generate.choice(model, ["positive", "negative", "neutral"])
sentiment = generator(prompt)

print(sentiment)  # "positive" (guaranteed one of these)
```

With Pydantic Models

```python
from pydantic import BaseModel
import outlines

class User(BaseModel):
    name: str
    age: int
    email: str

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Generate structured output
prompt = "Extract user: John Doe, 30 years old, john@example.com"
generator = outlines.generate.json(model, User)
user = generator(prompt)

print(user.name)   # "John Doe"
print(user.age)    # 30
print(user.email)  # "john@example.com"
```

Core Concepts

1. Constrained Token Sampling

Outlines uses Finite State Machines (FSM) to constrain token generation at the logit level.
How it works:
  1. Convert schema (JSON/Pydantic/regex) to context-free grammar (CFG)
  2. Transform CFG into Finite State Machine (FSM)
  3. Filter invalid tokens at each step during generation
  4. Fast-forward when only one valid token exists
Benefits:
  • Zero overhead: Filtering happens at token level
  • Speed improvement: Fast-forward through deterministic paths
  • Guaranteed validity: Invalid outputs impossible
```python
from pydantic import BaseModel
import outlines

# Pydantic model -> JSON schema -> CFG -> FSM
class Person(BaseModel):
    name: str
    age: int

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Behind the scenes:
# 1. Person -> JSON schema
# 2. JSON schema -> CFG
# 3. CFG -> FSM
# 4. FSM filters tokens during generation
generator = outlines.generate.json(model, Person)
result = generator("Generate person: Alice, 25")
```
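The logit-level filtering in step 3 can be sketched with a toy vocabulary. This is purely illustrative, not Outlines' actual internals: disallowed tokens get their scores set to negative infinity, so softmax assigns them zero probability and they can never be sampled.

```python
import math

# Toy sketch of logit-level token filtering (illustrative only; not
# Outlines' real FSM implementation). Assume a tiny vocabulary and an
# FSM state that only allows the tokens '{' and '"'.
vocab = ["{", "}", '"', "name", ":", "42"]
logits = [1.2, 0.7, 0.5, 2.0, 0.1, 0.3]  # raw model scores

allowed = {"{", '"'}  # tokens valid in the current FSM state

# Set disallowed tokens to -inf so they can never be sampled
masked = [
    score if token in allowed else -math.inf
    for token, score in zip(vocab, logits)
]

# Only allowed tokens remain candidates; greedy pick for illustration
best = vocab[masked.index(max(masked))]
print(best)  # "{"
```

Note that "name" had the highest raw score, but the mask removes it from consideration entirely; validity is enforced before sampling, not checked afterward.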

2. Structured Generators

Outlines provides specialized generators for different output types.

Choice Generator

```python
# Multiple choice selection
generator = outlines.generate.choice(
    model,
    ["positive", "negative", "neutral"]
)
sentiment = generator("Review: This is great!")
# Result: one of the three choices
```

JSON Generator

```python
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

# Generate valid JSON matching schema
generator = outlines.generate.json(model, Product)
product = generator("Extract: iPhone 15, $999, available")

# Guaranteed valid Product instance
print(type(product))  # <class '__main__.Product'>
```

Regex Generator

```python
# Generate text matching a regex
generator = outlines.generate.regex(
    model,
    r"[0-9]{3}-[0-9]{3}-[0-9]{4}"  # Phone number pattern
)
phone = generator("Generate phone number:")
# Result: "555-123-4567" (guaranteed to match the pattern)
```

Integer/Float Generators

```python
# Generate specific numeric types
int_generator = outlines.generate.integer(model)
age = int_generator("Person's age:")  # Guaranteed integer

float_generator = outlines.generate.float(model)
price = float_generator("Product price:")  # Guaranteed float
```

3. Model Backends

Outlines supports multiple local and API-based backends.

Transformers (Hugging Face)

```python
import outlines

# Load from Hugging Face
model = outlines.models.transformers(
    "microsoft/Phi-3-mini-4k-instruct",
    device="cuda"  # Or "cpu"
)

# Use with any generator
generator = outlines.generate.json(model, YourModel)
```

llama.cpp

```python
# Load GGUF model
model = outlines.models.llamacpp(
    "./models/llama-3.1-8b-instruct.Q4_K_M.gguf",
    n_gpu_layers=35
)
generator = outlines.generate.json(model, YourModel)
```

vLLM (High Throughput)

```python
# For production deployments
model = outlines.models.vllm(
    "meta-llama/Llama-3.1-8B-Instruct",
    tensor_parallel_size=2  # Multi-GPU
)
generator = outlines.generate.json(model, YourModel)
```

OpenAI (Limited Support)

```python
# Basic OpenAI support
model = outlines.models.openai(
    "gpt-4o-mini",
    api_key="your-api-key"
)

# Note: some features are limited with API models
generator = outlines.generate.json(model, YourModel)
```

4. Pydantic Integration

Outlines has first-class Pydantic support with automatic schema translation.
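That translation starts from Pydantic's standard JSON Schema export (a plain Pydantic v2 API, shown here without Outlines involved), which you can inspect directly to see what Outlines will compile:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Pydantic v2 emits the JSON Schema that Outlines then compiles to an FSM
schema = User.model_json_schema()
print(schema["required"])                   # ['name', 'age']
print(schema["properties"]["age"]["type"])  # 'integer'
```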

Basic Models

```python
from pydantic import BaseModel, Field
import outlines

class Article(BaseModel):
    title: str = Field(description="Article title")
    author: str = Field(description="Author name")
    word_count: int = Field(description="Number of words", gt=0)
    tags: list[str] = Field(description="List of tags")

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, Article)

article = generator("Generate article about AI")
print(article.title)
print(article.word_count)  # Guaranteed > 0
```

Nested Models

```python
class Address(BaseModel):
    street: str
    city: str
    country: str

class Person(BaseModel):
    name: str
    age: int
    address: Address  # Nested model

generator = outlines.generate.json(model, Person)
person = generator("Generate person in New York")

print(person.address.city)  # "New York"
```

Enums and Literals

```python
from enum import Enum
from typing import Literal

class Status(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

class Application(BaseModel):
    applicant: str
    status: Status  # Must be one of the enum values
    priority: Literal["low", "medium", "high"]  # Must be one of the literals

generator = outlines.generate.json(model, Application)
app = generator("Generate application")

print(app.status)  # Status.PENDING (or APPROVED/REJECTED)
```

Common Patterns

Pattern 1: Data Extraction

```python
from pydantic import BaseModel
import outlines

class CompanyInfo(BaseModel):
    name: str
    founded_year: int
    industry: str
    employees: int

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, CompanyInfo)

text = """
Apple Inc. was founded in 1976 in the technology industry.
The company employs approximately 164,000 people worldwide.
"""

prompt = f"Extract company information:\n{text}\n\nCompany:"
company = generator(prompt)

print(f"Name: {company.name}")
print(f"Founded: {company.founded_year}")
print(f"Industry: {company.industry}")
print(f"Employees: {company.employees}")
```

Pattern 2: Classification

```python
from typing import Literal
from pydantic import BaseModel
import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# Binary classification
generator = outlines.generate.choice(model, ["spam", "not_spam"])
result = generator("Email: Buy now! 50% off!")

# Multi-class classification
categories = ["technology", "business", "sports", "entertainment"]
category_gen = outlines.generate.choice(model, categories)
category = category_gen("Article: Apple announces new iPhone...")

# With confidence
class Classification(BaseModel):
    label: Literal["positive", "negative", "neutral"]
    confidence: float

classifier = outlines.generate.json(model, Classification)
result = classifier("Review: This product is okay, nothing special")
```

Pattern 3: Structured Forms

```python
from pydantic import BaseModel
import outlines

class UserProfile(BaseModel):
    full_name: str
    age: int
    email: str
    phone: str
    country: str
    interests: list[str]

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, UserProfile)

prompt = """
Extract user profile from:
Name: Alice Johnson
Age: 28
Email: alice@example.com
Phone: 555-0123
Country: USA
Interests: hiking, photography, cooking
"""

profile = generator(prompt)
print(profile.full_name)
print(profile.interests)  # ["hiking", "photography", "cooking"]
```

Pattern 4: Multi-Entity Extraction

```python
from typing import Literal
from pydantic import BaseModel
import outlines

class Entity(BaseModel):
    name: str
    type: Literal["PERSON", "ORGANIZATION", "LOCATION"]

class DocumentEntities(BaseModel):
    entities: list[Entity]

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, DocumentEntities)

text = "Tim Cook met with Satya Nadella at Microsoft headquarters in Redmond."
prompt = f"Extract entities from: {text}"

result = generator(prompt)
for entity in result.entities:
    print(f"{entity.name} ({entity.type})")
```

Pattern 5: Code Generation

```python
from pydantic import BaseModel
import outlines

class PythonFunction(BaseModel):
    function_name: str
    parameters: list[str]
    docstring: str
    body: str

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
generator = outlines.generate.json(model, PythonFunction)

prompt = "Generate a Python function to calculate factorial"
func = generator(prompt)

print(f"def {func.function_name}({', '.join(func.parameters)}):")
print(f'    """{func.docstring}"""')
print(f"    {func.body}")
```

Pattern 6: Batch Processing

```python
from pydantic import BaseModel
import outlines

def batch_extract(texts: list[str], schema: type[BaseModel]):
    """Extract structured data from multiple texts."""
    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, schema)

    results = []
    for text in texts:
        result = generator(f"Extract from: {text}")
        results.append(result)

    return results

class Person(BaseModel):
    name: str
    age: int

texts = [
    "John is 30 years old",
    "Alice is 25 years old",
    "Bob is 40 years old"
]

people = batch_extract(texts, Person)
for person in people:
    print(f"{person.name}: {person.age}")
```

Backend Configuration

Transformers

```python
import outlines

# Basic usage
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# GPU configuration
model = outlines.models.transformers(
    "microsoft/Phi-3-mini-4k-instruct",
    device="cuda",
    model_kwargs={"torch_dtype": "float16"}
)

# Popular models
model = outlines.models.transformers("meta-llama/Llama-3.1-8B-Instruct")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.3")
model = outlines.models.transformers("Qwen/Qwen2.5-7B-Instruct")
```

llama.cpp

```python
# Load GGUF model
model = outlines.models.llamacpp(
    "./models/llama-3.1-8b.Q4_K_M.gguf",
    n_ctx=4096,       # Context window
    n_gpu_layers=35,  # GPU layers
    n_threads=8       # CPU threads
)

# Full GPU offload
model = outlines.models.llamacpp(
    "./models/model.gguf",
    n_gpu_layers=-1  # All layers on GPU
)
```

vLLM (Production)

```python
# Single GPU
model = outlines.models.vllm("meta-llama/Llama-3.1-8B-Instruct")

# Multi-GPU
model = outlines.models.vllm(
    "meta-llama/Llama-3.1-70B-Instruct",
    tensor_parallel_size=4  # 4 GPUs
)

# With quantization
model = outlines.models.vllm(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization="awq"  # Or "gptq"
)
```

Best Practices

1. Use Specific Types

```python
# ✅ Good: Specific types
class Product(BaseModel):
    name: str
    price: float    # Not str
    quantity: int   # Not str
    in_stock: bool  # Not str

# ❌ Bad: Everything as string
class Product(BaseModel):
    name: str
    price: str     # Should be float
    quantity: str  # Should be int
```

2. Add Constraints

```python
from pydantic import BaseModel, Field

# ✅ Good: With constraints
class User(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    age: int = Field(ge=0, le=120)
    email: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$")

# ❌ Bad: No constraints
class User(BaseModel):
    name: str
    age: int
    email: str
```

3. Use Enums for Categories

```python
from enum import Enum

# ✅ Good: Enum for a fixed set
class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

class Task(BaseModel):
    title: str
    priority: Priority

# ❌ Bad: Free-form string
class Task(BaseModel):
    title: str
    priority: str  # Can be anything
```

4. Provide Context in Prompts

```python
# ✅ Good: Clear context
prompt = """
Extract product information from the following text.
Text: iPhone 15 Pro costs $999 and is currently in stock.
Product:
"""

# ❌ Bad: Minimal context
prompt = "iPhone 15 Pro costs $999 and is currently in stock."
```

5. Handle Optional Fields

```python
from typing import Optional

# ✅ Good: Optional fields for incomplete data
class Article(BaseModel):
    title: str                    # Required
    author: Optional[str] = None  # Optional
    date: Optional[str] = None    # Optional
    tags: list[str] = []          # Default empty list

# Generation can succeed even if author/date are missing
```

Comparison to Alternatives

| Feature | Outlines | Instructor | Guidance | LMQL |
|---|---|---|---|---|
| Pydantic Support | ✅ Native | ✅ Native | ❌ No | ❌ No |
| JSON Schema | ✅ Yes | ✅ Yes | ⚠️ Limited | ✅ Yes |
| Regex Constraints | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Local Models | ✅ Full | ⚠️ Limited | ✅ Full | ✅ Full |
| API Models | ⚠️ Limited | ✅ Full | ✅ Full | ✅ Full |
| Zero Overhead | ✅ Yes | ❌ No | ⚠️ Partial | ✅ Yes |
| Automatic Retrying | ❌ No | ✅ Yes | ❌ No | ❌ No |
| Learning Curve | Low | Low | Low | High |
When to choose Outlines:
  • Using local models (Transformers, llama.cpp, vLLM)
  • Need maximum inference speed
  • Want Pydantic model support
  • Require zero-overhead structured generation
  • Want to control the token sampling process
When to choose alternatives:
  • Instructor: Need API models with automatic retrying
  • Guidance: Need token healing and complex workflows
  • LMQL: Prefer declarative query syntax

Performance Characteristics

Speed:
  • Zero overhead: Structured generation as fast as unconstrained
  • Fast-forward optimization: Skips deterministic tokens
  • 1.2-2x faster than post-generation validation approaches
Memory:
  • FSM compiled once per schema (cached)
  • Minimal runtime overhead
  • Efficient with vLLM for high throughput
Accuracy:
  • 100% valid outputs (guaranteed by FSM)
  • No retry loops needed
  • Deterministic token filtering
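The "compiled once per schema (cached)" point has a practical corollary: build a generator once and reuse it across calls. A memoization sketch of the idea (illustrative only, using `functools.lru_cache`; Outlines' internal cache differs in detail):

```python
from functools import lru_cache

compile_calls = 0

@lru_cache(maxsize=None)
def compile_fsm(schema_json: str) -> str:
    """Stand-in for the expensive schema -> CFG -> FSM compilation."""
    global compile_calls
    compile_calls += 1
    return f"FSM<{schema_json}>"

schema = '{"type": "object"}'
for _ in range(1000):          # 1000 generations against the same schema
    fsm = compile_fsm(schema)  # compiled on the first call only

print(compile_calls)  # 1
```

The same reasoning explains why re-creating a generator inside a hot loop for the same schema is the main performance mistake to avoid.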

Resources

See Also

  • references/json_generation.md
    - Comprehensive JSON and Pydantic patterns
  • references/backends.md
    - Backend-specific configuration
  • references/examples.md
    - Production-ready examples