pydantic-ai-model-integration


PydanticAI Model Integration

Provider Model Strings

Format: provider:model-name

python
from pydantic_ai import Agent

OpenAI

Agent('openai:gpt-4o')
Agent('openai:gpt-4o-mini')
Agent('openai:o1-preview')

Anthropic

Agent('anthropic:claude-sonnet-4-5')
Agent('anthropic:claude-haiku-4-5')

Google (API Key)

Agent('google-gla:gemini-2.0-flash')
Agent('google-gla:gemini-2.0-pro')

Google (Vertex AI)

Agent('google-vertex:gemini-2.0-flash')

Groq

Agent('groq:llama-3.3-70b-versatile')
Agent('groq:mixtral-8x7b-32768')

Mistral

Agent('mistral:mistral-large-latest')

Other providers

Agent('cohere:command-r-plus')
Agent('bedrock:anthropic.claude-3-sonnet')
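The provider:model-name format is just a prefixed string, where the model part may itself contain dots (as in the Bedrock example). As an illustration of the format's shape only (not PydanticAI's actual internal parsing), splitting on the first colon recovers the two parts:

```python
def split_model_string(spec: str) -> tuple[str, str]:
    """Split a 'provider:model-name' string on the first colon.

    Illustrative sketch only -- PydanticAI resolves these strings
    internally; this just demonstrates the format.
    """
    provider, sep, model = spec.partition(':')
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model-name', got {spec!r}")
    return provider, model

print(split_model_string('openai:gpt-4o'))
print(split_model_string('bedrock:anthropic.claude-3-sonnet'))
```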

Model Settings

python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

agent = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(
        temperature=0.7,
        max_tokens=1000,
        top_p=0.9,
        timeout=30.0,  # Request timeout
    )
)

Override per-run

result = await agent.run(
    'Generate creative text',
    model_settings=ModelSettings(temperature=1.0),
)

Fallback Models

Chain models for resilience:

python
from pydantic_ai.models.fallback import FallbackModel

Try models in order until one succeeds

fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    'google-gla:gemini-2.0-flash',
)
agent = Agent(fallback)
result = await agent.run('Hello')

Custom fallback conditions

from pydantic_ai.exceptions import ModelAPIError

def should_fallback(error: Exception) -> bool:
    """Only fallback on rate limits or server errors."""
    if isinstance(error, ModelAPIError):
        return error.status_code in (429, 500, 502, 503)
    return False

fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    fallback_on=should_fallback,
)
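The gating logic in should_fallback can be exercised without any API calls by substituting a stand-in for the real error class. The FakeModelAPIError below is hypothetical, for illustration only:

```python
class FakeModelAPIError(Exception):
    """Stand-in for pydantic_ai's ModelAPIError, for illustration only."""
    def __init__(self, status_code: int):
        super().__init__(f"API error {status_code}")
        self.status_code = status_code

def should_fallback(error: Exception) -> bool:
    """Only fall back on rate limits or server errors."""
    if isinstance(error, FakeModelAPIError):
        return error.status_code in (429, 500, 502, 503)
    return False

print(should_fallback(FakeModelAPIError(429)))  # rate limit: try next model
print(should_fallback(FakeModelAPIError(401)))  # auth error: surface it
print(should_fallback(ValueError('oops')))      # unrelated error: surface it
```

The point of restricting fallback to 429/5xx is that auth or validation errors will fail on every provider, so retrying elsewhere only adds cost and latency.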

Streaming Responses

python
async def stream_response():
    async with agent.run_stream('Tell me a story') as response:
        # Stream text output
        async for chunk in response.stream_output():
            print(chunk, end='', flush=True)

    # Access final result after streaming
    print(f"\nTokens used: {response.usage().total_tokens}")

Streaming with Structured Output

python
from pydantic import BaseModel

class Story(BaseModel):
    title: str
    content: str
    moral: str

agent = Agent('openai:gpt-4o', output_type=Story)

async with agent.run_stream('Write a fable') as response:
    # For structured output, stream_output yields partial JSON
    async for partial in response.stream_output():
        print(partial)  # Partial Story object as parsed

    # Final validated result
    story = response.output

Dynamic Model Selection

python
import os

Environment-based selection

model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o')
agent = Agent(model)

Runtime model override

result = await agent.run(
    'Hello',
    model='anthropic:claude-sonnet-4-5',  # Override default
)

Context manager override

with agent.override(model='google-gla:gemini-2.0-flash'):
    result = agent.run_sync('Hello')

Deferred Model Checking

Delay model validation for testing:

python

Default: Validates model immediately (checks env vars)

agent = Agent('openai:gpt-4o')

Deferred: Validates only on first run

agent = Agent('openai:gpt-4o', defer_model_check=True)

Useful for testing with override

from pydantic_ai.models.test import TestModel

with agent.override(model=TestModel()):
    result = agent.run_sync('Test')  # No OpenAI key needed

Usage Tracking

python
result = await agent.run('Hello')

Request usage (last request)

usage = result.usage()
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")

Full run usage (all requests in run)

run_usage = result.run_usage()
print(f"Total requests: {run_usage.requests}")

Usage Limits

python
from pydantic_ai.usage import UsageLimits

Limit token usage

result = await agent.run(
    'Generate content',
    usage_limits=UsageLimits(
        total_tokens=1000,
        request_tokens=500,
        response_tokens=500,
    ),
)

Provider-Specific Features

OpenAI

python
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'gpt-4o',
    api_key='your-key',  # Or use OPENAI_API_KEY env var
    base_url='https://custom-endpoint.com'  # For Azure, proxies
)

Anthropic

python
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel(
    'claude-sonnet-4-5',
    api_key='your-key'  # Or ANTHROPIC_API_KEY
)

Common Model Patterns

| Use Case            | Recommendation                                                     |
|---------------------|--------------------------------------------------------------------|
| General purpose     | openai:gpt-4o or anthropic:claude-sonnet-4-5                       |
| Fast/cheap          | openai:gpt-4o-mini or anthropic:claude-haiku-4-5                   |
| Long context        | anthropic:claude-sonnet-4-5 (200k) or google-gla:gemini-2.0-flash  |
| Reasoning           | openai:o1-preview                                                  |
| Cost-sensitive prod | FallbackModel with fast model first                                |
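The table above can be encoded as a small selection helper, which pairs naturally with the environment-based selection shown earlier. This is a sketch only; the use-case keys below are made up for illustration:

```python
# Hypothetical mapping of the use cases above to model strings.
MODEL_BY_USE_CASE = {
    'general': 'openai:gpt-4o',
    'fast': 'openai:gpt-4o-mini',
    'long-context': 'anthropic:claude-sonnet-4-5',
    'reasoning': 'openai:o1-preview',
}

def pick_model(use_case: str, default: str = 'openai:gpt-4o') -> str:
    """Return the model string for a use case, falling back to a default."""
    return MODEL_BY_USE_CASE.get(use_case, default)

print(pick_model('fast'))     # openai:gpt-4o-mini
print(pick_model('unknown'))  # falls back to openai:gpt-4o
```

The returned string can then be passed straight to Agent(...), since PydanticAI accepts provider:model-name strings directly.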