pydantic-ai-model-integration


PydanticAI Model Integration

Provider Model Strings

Format: provider:model-name

python
from pydantic_ai import Agent

OpenAI

Agent('openai:gpt-4o')
Agent('openai:gpt-4o-mini')
Agent('openai:o1-preview')

Anthropic

Agent('anthropic:claude-sonnet-4-5')
Agent('anthropic:claude-haiku-4-5')

Google (API Key)

Agent('google-gla:gemini-2.0-flash')
Agent('google-gla:gemini-2.0-pro')

Google (Vertex AI)

Agent('google-vertex:gemini-2.0-flash')

Groq

Agent('groq:llama-3.3-70b-versatile')
Agent('groq:mixtral-8x7b-32768')

Mistral

Agent('mistral:mistral-large-latest')

Other providers

Agent('cohere:command-r-plus')
Agent('bedrock:anthropic.claude-3-sonnet')
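The provider:model-name format is just a prefixed string, where the model part may itself contain dots (as in the Bedrock example). As an illustration of the format's shape only (not PydanticAI's actual internal parsing), splitting on the first colon recovers the two parts:

```python
def split_model_string(spec: str) -> tuple[str, str]:
    """Split a 'provider:model-name' string on the first colon.

    Illustrative sketch only -- PydanticAI resolves these strings
    internally; this just demonstrates the format.
    """
    provider, sep, model = spec.partition(':')
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model-name', got {spec!r}")
    return provider, model

print(split_model_string('openai:gpt-4o'))
print(split_model_string('bedrock:anthropic.claude-3-sonnet'))
```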

Model Settings

python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

agent = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(
        temperature=0.7,
        max_tokens=1000,
        top_p=0.9,
        timeout=30.0,  # Request timeout
    )
)

Override per-run

result = await agent.run(
    'Generate creative text',
    model_settings=ModelSettings(temperature=1.0),
)

Fallback Models

Chain models for resilience:

python
from pydantic_ai.models.fallback import FallbackModel

Try models in order until one succeeds

fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    'google-gla:gemini-2.0-flash',
)
agent = Agent(fallback)
result = await agent.run('Hello')

Custom fallback conditions

from pydantic_ai.exceptions import ModelAPIError

def should_fallback(error: Exception) -> bool:
    """Only fallback on rate limits or server errors."""
    if isinstance(error, ModelAPIError):
        return error.status_code in (429, 500, 502, 503)
    return False

fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    fallback_on=should_fallback,
)
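The gating logic in should_fallback can be exercised without any API calls by substituting a stand-in for the real error class. The FakeModelAPIError below is hypothetical, for illustration only:

```python
class FakeModelAPIError(Exception):
    """Stand-in for pydantic_ai's ModelAPIError, for illustration only."""
    def __init__(self, status_code: int):
        super().__init__(f"API error {status_code}")
        self.status_code = status_code

def should_fallback(error: Exception) -> bool:
    """Only fall back on rate limits or server errors."""
    if isinstance(error, FakeModelAPIError):
        return error.status_code in (429, 500, 502, 503)
    return False

print(should_fallback(FakeModelAPIError(429)))  # rate limit: try next model
print(should_fallback(FakeModelAPIError(401)))  # auth error: surface it
print(should_fallback(ValueError('oops')))      # unrelated error: surface it
```

The point of restricting fallback to 429/5xx is that auth or validation errors will fail on every provider, so retrying elsewhere only adds cost and latency.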

Streaming Responses

python
async def stream_response():
    async with agent.run_stream('Tell me a story') as response:
        # Stream text output
        async for chunk in response.stream_output():
            print(chunk, end='', flush=True)

    # Access final result after streaming
    print(f"\nTokens used: {response.usage().total_tokens}")

Streaming with Structured Output

python
from pydantic import BaseModel

class Story(BaseModel):
    title: str
    content: str
    moral: str

agent = Agent('openai:gpt-4o', output_type=Story)

async with agent.run_stream('Write a fable') as response:
    # For structured output, stream_output yields partial JSON
    async for partial in response.stream_output():
        print(partial)  # Partial Story object as parsed

    # Final validated result
    story = response.output

Dynamic Model Selection

python
import os

Environment-based selection

model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o')
agent = Agent(model)

Runtime model override

result = await agent.run(
    'Hello',
    model='anthropic:claude-sonnet-4-5',  # Override default
)

Context manager override

with agent.override(model='google-gla:gemini-2.0-flash'):
    result = agent.run_sync('Hello')

Deferred Model Checking

Delay model validation for testing:

python

Default: Validates model immediately (checks env vars)

agent = Agent('openai:gpt-4o')

Deferred: Validates only on first run

agent = Agent('openai:gpt-4o', defer_model_check=True)

Useful for testing with override

from pydantic_ai.models.test import TestModel

with agent.override(model=TestModel()):
    result = agent.run_sync('Test')  # No OpenAI key needed

Usage Tracking

python
result = await agent.run('Hello')

Request usage (last request)

usage = result.usage()
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")

Full run usage (all requests in run)

run_usage = result.run_usage()
print(f"Total requests: {run_usage.requests}")

Usage Limits

python
from pydantic_ai.usage import UsageLimits

Limit token usage

result = await agent.run(
    'Generate content',
    usage_limits=UsageLimits(
        total_tokens=1000,
        request_tokens=500,
        response_tokens=500,
    ),
)

Provider-Specific Features

OpenAI

python
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'gpt-4o',
    api_key='your-key',  # Or use OPENAI_API_KEY env var
    base_url='https://custom-endpoint.com'  # For Azure, proxies
)

Anthropic

python
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel(
    'claude-sonnet-4-5',
    api_key='your-key'  # Or ANTHROPIC_API_KEY
)

Common Model Patterns

| Use Case            | Recommendation                                                     |
|---------------------|--------------------------------------------------------------------|
| General purpose     | openai:gpt-4o or anthropic:claude-sonnet-4-5                       |
| Fast/cheap          | openai:gpt-4o-mini or anthropic:claude-haiku-4-5                   |
| Long context        | anthropic:claude-sonnet-4-5 (200k) or google-gla:gemini-2.0-flash  |
| Reasoning           | openai:o1-preview                                                  |
| Cost-sensitive prod | FallbackModel with fast model first                                |
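The table above can be encoded as a small selection helper, which pairs naturally with the environment-based selection shown earlier. This is a sketch only; the use-case keys below are made up for illustration:

```python
# Hypothetical mapping of the use cases above to model strings.
MODEL_BY_USE_CASE = {
    'general': 'openai:gpt-4o',
    'fast': 'openai:gpt-4o-mini',
    'long-context': 'anthropic:claude-sonnet-4-5',
    'reasoning': 'openai:o1-preview',
}

def pick_model(use_case: str, default: str = 'openai:gpt-4o') -> str:
    """Return the model string for a use case, falling back to a default."""
    return MODEL_BY_USE_CASE.get(use_case, default)

print(pick_model('fast'))     # openai:gpt-4o-mini
print(pick_model('unknown'))  # falls back to openai:gpt-4o
```

The returned string can then be passed straight to Agent(...), since PydanticAI accepts provider:model-name strings directly.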