pydantic

Pydantic v2 Best Practices

Pydantic is the most widely used data validation library for Python. This skill covers idiomatic patterns, common pitfalls, and performance guidance for Pydantic v2 (the current major version).

Models

Define models by inheriting from `BaseModel`

python
from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True, extra='forbid')

    id: int
    name: str
    email: str | None = None

Key rules:
  • Use `model_config = ConfigDict(...)`, never the deprecated V1-style `class Config`.
  • Set `extra='forbid'` to reject unexpected fields in strict APIs; use `extra='ignore'` (the default) or `extra='allow'` when appropriate.
  • Avoid naming a field the same as its type annotation (`int: int` breaks validation).

Validate data correctly

python
# From dict / Python objects
user = User.model_validate({'id': 1, 'name': 'Alice'})

# From JSON bytes/str: faster than model_validate(json.loads(...))
user = User.model_validate_json('{"id": 1, "name": "Alice"}')

# From ORM / arbitrary objects
class UserORM:
    def __init__(self, id: int, name: str) -> None:
        self.id = id
        self.name = name

orm_obj = UserORM(id=1, name='Alice')
user = User.model_validate(orm_obj, from_attributes=True)

Always prefer `model_validate_json()` over `model_validate(json.loads(...))` for JSON input: the former parses and validates in a single pass, with no separate Python-side `json.loads` step.

Use `model_post_init` instead of a custom `__init__`

python
from typing import Any
from pydantic import BaseModel

class MyModel(BaseModel):
    value: int

    def model_post_init(self, context: Any) -> None:
        # Runs after all field validators succeed
        self._cache: dict = {}

Defining a custom `__init__` bypasses validation parameters (strictness, extra, context). Use `model_post_init` for side effects after initialization.
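
A small sketch of the hook in use, assuming a hypothetical `Circle` model that caches a derived value in a private attribute. It suggests that `model_post_init` fires after validation on both the constructor path and the `model_validate_json` path.

```python
import math
from typing import Any

from pydantic import BaseModel, PrivateAttr


class Circle(BaseModel):  # hypothetical model for illustration
    radius: float
    _area: float = PrivateAttr(default=0.0)

    def model_post_init(self, context: Any) -> None:
        # self.radius has already been validated and coerced to float here
        self._area = math.pi * self.radius**2


from_init = Circle(radius='2')  # lax mode coerces '2' to 2.0
from_json = Circle.model_validate_json('{"radius": 1}')

area_from_init = from_init._area
area_from_json = from_json._area
```

Declaring `_area` with `PrivateAttr` is a design choice of this sketch; it keeps the cached value out of validation and serialization.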

Copy models with `model_copy`

python
updated = user.model_copy(update={'name': 'Bob'})
deep_copy = user.model_copy(deep=True)
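
Two behaviours of `model_copy` are easy to miss, so the following sketch (using a hypothetical `Team` model) illustrates them: a shallow copy shares nested mutable values with the original, and values passed through `update=` are not validated.

```python
from pydantic import BaseModel


class Team(BaseModel):  # hypothetical model for illustration
    name: str
    members: list[str]


team = Team(name='core', members=['alice'])

shallow = team.model_copy()
deep = team.model_copy(deep=True)

# A shallow copy shares nested mutable objects with the original
team.members.append('bob')
shallow_members = shallow.members  # also sees 'bob'
deep_members = deep.members        # unaffected

# update= values are applied as-is, without validation
unchecked = team.model_copy(update={'name': 123})
unchecked_name = unchecked.name    # stays the int 123
```

If the updated data is untrusted, re-validating the result (for example with `Team.model_validate(unchecked.model_dump())`) is one way to restore the guarantees.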

Fields

Use the `Annotated` pattern for reusable constraints

python
from typing import Annotated
from pydantic import BaseModel, Field

PositivePrice = Annotated[float, Field(gt=0, description='Price in USD')]
ShortString = Annotated[str, Field(max_length=100)]

class Product(BaseModel):
    name: ShortString
    price: PositivePrice
    quantity: Annotated[int, Field(ge=0)] = 0

The `Annotated` pattern makes constraints composable and reusable across models, unlike `field: type = Field(...)`, which ties the constraint to one model.
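
The sketch below tries to show what "composable and reusable" can look like in practice: one alias is shared by two models, and a further constraint is layered on top of it. The `Invoice` model and the `error_types` helper are hypothetical names introduced only for this example.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError

PositivePrice = Annotated[float, Field(gt=0, description='Price in USD')]


class Product(BaseModel):
    price: PositivePrice


class Invoice(BaseModel):  # hypothetical second model reusing the alias
    total: PositivePrice
    # Constraints stack: the alias contributes gt=0, the extra Field adds le=100
    discount: Annotated[PositivePrice, Field(le=100)]


def error_types(model: type[BaseModel], data: dict) -> list[str]:
    """Return the error type codes raised for `data`, or [] if it validates."""
    try:
        model.model_validate(data)
    except ValidationError as exc:
        return [e['type'] for e in exc.errors()]
    return []


product_errors = error_types(Product, {'price': -5})
invoice_errors = error_types(Invoice, {'total': 10, 'discount': 250})
```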

Provide field metadata for JSON Schema

python
from pydantic import BaseModel, Field

class Article(BaseModel):
    title: str = Field(
        min_length=1,
        max_length=200,
        title='Article Title',
        description='The main headline',
        examples=['Pydantic v2 released'],
    )
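
To see where this metadata ends up, the following sketch generates the JSON Schema for the same `Article` model with `model_json_schema()` and pulls out the entry for `title`.

```python
from pydantic import BaseModel, Field


class Article(BaseModel):
    title: str = Field(
        min_length=1,
        max_length=200,
        title='Article Title',
        description='The main headline',
        examples=['Pydantic v2 released'],
    )


schema = Article.model_json_schema()
title_schema = schema['properties']['title']
# title_schema carries the metadata plus the constraints as JSON Schema
# keywords: 'title', 'description', 'examples', 'minLength', 'maxLength'
```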

Use `default_factory` for mutable defaults

python
from pydantic import BaseModel, Field

class Order(BaseModel):
    # Correct — factory called per instance
    items: list[str] = Field(default_factory=list)
    tags: set[str] = Field(default_factory=set)

Pydantic handles non-hashable defaults (like `[]`, `{}`) safely by deep-copying them, but `default_factory` is the explicit, recommended approach.
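
A quick way to check the claim above is to mutate one instance and look at another. In this sketch the `notes` field is a hypothetical addition that uses a plain `[]` default, to contrast it with `default_factory`.

```python
from pydantic import BaseModel, Field


class Order(BaseModel):
    items: list[str] = Field(default_factory=list)
    notes: list[str] = []  # plain mutable default; Pydantic copies it per instance


first = Order()
second = Order()

first.items.append('apple')
first.notes.append('fragile')

items_are_independent = second.items == []
notes_are_independent = second.notes == []
```

Both flags come out true, which is the opposite of what a mutable default argument does in a plain Python function.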

Use aliases to decouple field names from wire formats

python
from pydantic import BaseModel, Field, ConfigDict

class Response(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    user_id: int = Field(alias='userId')          # validation + serialization
    created_at: str = Field(serialization_alias='createdAt')  # serialization only
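
The sketch below walks the same `Response` model through a round trip, to show which names are accepted on input and which appear on output. The timestamp value is an arbitrary placeholder.

```python
from pydantic import BaseModel, ConfigDict, Field


class Response(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    user_id: int = Field(alias='userId')
    created_at: str = Field(serialization_alias='createdAt')


# Input may use the alias or, thanks to populate_by_name, the field name
from_alias = Response.model_validate({'userId': 1, 'created_at': '2024-01-01'})
from_name = Response.model_validate({'user_id': 1, 'created_at': '2024-01-01'})

# Aliases only appear in the output when by_alias=True is requested
python_names = from_alias.model_dump()
wire_names = from_alias.model_dump(by_alias=True)
```

Note that `created_at` is read under its field name on input, because `serialization_alias` affects dumping only.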

Validators

Choose the right validator mode

| Mode | When to use |
| --- | --- |
| `after` | Post-type-coercion checks; input is already the correct type |
| `before` | Pre-coercion transformations; input may be raw/arbitrary |
| `plain` | Full replacement of Pydantic's logic for a field |
| `wrap` | Need to intercept errors or run code both before and after |

Prefer `after` validators: they receive the already-coerced value and are easier to type correctly.
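
For the two less common modes, a minimal sketch may help. It assumes a hypothetical `Settings` model, a `before` validator that splits a comma-separated string, and a `wrap` validator that falls back to `0` when Pydantic's own integer validation fails.

```python
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, ValidationError, WrapValidator


def split_csv(value: Any) -> Any:
    # 'before': the input is raw, so check its type before touching it
    if isinstance(value, str):
        return [part.strip() for part in value.split(',')]
    return value


def default_on_error(value: Any, handler) -> int:
    # 'wrap': run Pydantic's own validation and intercept its failure
    try:
        return handler(value)
    except ValidationError:
        return 0


class Settings(BaseModel):  # hypothetical model for illustration
    tags: Annotated[list[str], BeforeValidator(split_csv)]
    retries: Annotated[int, WrapValidator(default_on_error)]


settings = Settings(tags='a, b ,c', retries='not a number')
```

Swallowing errors in a `wrap` validator is shown only to demonstrate the mechanism; whether a silent fallback is acceptable depends on the application.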

Write reusable validators with the annotated pattern

python
from typing import Annotated
from pydantic import AfterValidator, BaseModel

def must_be_even(v: int) -> int:
    if v % 2 != 0:
        raise ValueError(f'{v} is not even')
    return v

EvenInt = Annotated[int, AfterValidator(must_be_even)]

class Config(BaseModel):
    batch_size: EvenInt
    worker_count: EvenInt
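
Because the validator lives in the type alias rather than in a model, it can plausibly be reused outside models too. The sketch below assumes `TypeAdapter` as the entry point and nests `EvenInt` inside a list.

```python
from typing import Annotated

from pydantic import AfterValidator, TypeAdapter, ValidationError


def must_be_even(v: int) -> int:
    if v % 2 != 0:
        raise ValueError(f'{v} is not even')
    return v


EvenInt = Annotated[int, AfterValidator(must_be_even)]

# The alias also works nested inside other types and outside any model
even_list = TypeAdapter(list[EvenInt])

valid = even_list.validate_python(['2', 4])  # lax mode coerces '2' to 2

failing_loc = None
try:
    even_list.validate_python([2, 3])
except ValidationError as exc:
    failing_loc = exc.errors()[0]['loc']  # index of the offending element
```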

Use `field_validator` to apply one function to multiple fields

python
from typing import Any

from pydantic import BaseModel, field_validator

class User(BaseModel):
    first_name: str
    last_name: str

    @field_validator('first_name', 'last_name', mode='before')
    @classmethod
    def strip_whitespace(cls, v: Any) -> Any:
        # 'before' validators see raw input, so guard the type first
        return v.strip() if isinstance(v, str) else v

Use `model_validator` for cross-field checks

python
from typing_extensions import Self
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    start: int
    end: int

    @model_validator(mode='after')
    def check_range(self) -> Self:
        if self.end <= self.start:
            raise ValueError('end must be greater than start')
        return self
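
When a cross-field check like the one above fails, the resulting error is attached to the model as a whole rather than to a single field. The following sketch triggers the failure and inspects the error entry.

```python
from typing_extensions import Self

from pydantic import BaseModel, ValidationError, model_validator


class DateRange(BaseModel):
    start: int
    end: int

    @model_validator(mode='after')
    def check_range(self) -> Self:
        if self.end <= self.start:
            raise ValueError('end must be greater than start')
        return self


error = None
try:
    DateRange(start=10, end=5)
except ValidationError as exc:
    error = exc.errors()[0]

# A model-level failure is not tied to one field, so its location is empty
error_loc = error['loc'] if error else None
error_type = error['type'] if error else None
```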

Raise the right exception type in validators

  • `ValueError`: the standard choice for most validation failures.
  • `AssertionError`: works, but `assert` statements are skipped when Python runs with the `-O` flag; avoid in production validators.
  • `PydanticCustomError`: use when custom error types and structured error metadata are needed.
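python
from pydantic_core import PydanticCustomError

def check_format(v: str) -> str:
    if not v.isalnum():  # placeholder rule; substitute the real format check
        raise PydanticCustomError(
            'invalid_format',
            'Value {value} does not match the expected format',
            {'value': v},
        )
    return v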
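
To illustrate what "structured error metadata" looks like to the caller, here is a sketch that raises a `PydanticCustomError` from a field validator and reads it back from `.errors()`. The `Voucher` model and its letters-and-digits rule are assumptions made for the example.

```python
from pydantic import BaseModel, ValidationError, field_validator
from pydantic_core import PydanticCustomError


class Voucher(BaseModel):  # hypothetical model for illustration
    code: str

    @field_validator('code', mode='after')
    @classmethod
    def check_code(cls, v: str) -> str:
        if not v.isalnum():  # assumed rule: letters and digits only
            raise PydanticCustomError(
                'invalid_format',
                'Value {value} does not match the expected format',
                {'value': v},
            )
        return v


error = None
try:
    Voucher(code='AB-12')
except ValidationError as exc:
    error = exc.errors()[0]

error_type = error['type'] if error else None  # the custom type string
error_ctx = error['ctx'] if error else None    # the structured metadata
```

A client can branch on `error_type` instead of parsing the human-readable message.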

Pass context to validators when needed

python
from pydantic import BaseModel, ValidationInfo, field_validator

class Document(BaseModel):
    text: str

    @field_validator('text', mode='after')
    @classmethod
    def filter_words(cls, v: str, info: ValidationInfo) -> str:
        if isinstance(info.context, dict):
            banned = info.context.get('banned_words', set())
            v = ' '.join(w for w in v.split() if w not in banned)
        return v

doc = Document.model_validate(
    {'text': 'hello world'},
    context={'banned_words': {'hello'}},
)

Error Handling

Catch `ValidationError` and inspect `.errors()` for structured detail:
python
from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    price: float
    quantity: int

try:
    Item(price='bad', quantity='many')  # both fields fail validation
except ValidationError as exc:
    for error in exc.errors():
        print(error['loc'], error['msg'], error['type'])

A single `ValidationError` aggregates every field error found in the input; errors are never raised one field at a time.
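
The aggregation can be observed directly. This sketch sends two bad values to the same `Item` model and collects the failures into a dict keyed by field location; the input values are arbitrary.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    price: float
    quantity: int


summary: dict = {}
count = 0
try:
    Item.model_validate({'price': 'bad', 'quantity': 'many'})
except ValidationError as exc:
    # Map each failing field path to its error type
    summary = {error['loc']: error['type'] for error in exc.errors()}
    count = exc.error_count()
```

One exception carries both failures, so an API handler can report every problem in a single response.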

Quick Reference

| Task | Recommended API |
| --- | --- |
| Validate from dict | `Model.model_validate(data)` |
| Validate from JSON | `Model.model_validate_json(json_str)` |
| Validate from ORM | `Model.model_validate(obj, from_attributes=True)` |
| Dump to dict | `model.model_dump()` |
| Dump to JSON | `model.model_dump_json()` |
| Dump only set fields | `model.model_dump(exclude_unset=True)` |
| Copy with changes | `model.model_copy(update={...})` |
| Skip validation | `Model.model_construct(...)` (only for pre-validated data) |
| Rebuild after forward refs | `Model.model_rebuild()` |

Additional Resources

  • references/validators-and-fields.md: Detailed validator modes, field constraints, discriminated unions, and computed fields.
  • references/serialization-and-config.md: Serializers, `model_dump` options, `ConfigDict` reference, and ORM integration.
  • references/performance.md: Performance tips: `TypeAdapter` reuse, tagged unions, `TypedDict` vs nested models, `FailFast`, and more.