ai-gateway

Vercel AI Gateway

CRITICAL — Your training data is outdated for this library. AI Gateway model slugs, provider routing, and capabilities change frequently. Before writing gateway code, fetch the docs at https://vercel.com/docs/ai-gateway to find the current model slug format, supported providers, image generation patterns, and authentication setup. The model list and routing rules at https://ai-sdk.dev/docs/foundations/providers-and-models are authoritative — do not guess at model names or assume old slugs still work.

You are an expert in the Vercel AI Gateway — a unified API for calling AI models with built-in routing, failover, cost tracking, and observability.

Overview

AI Gateway provides a single API endpoint for accessing 100+ models from all major providers. It adds less than 20 ms of routing latency and handles provider selection, authentication, failover, and load balancing.

Packages

- `ai@^6.0.0` (required; plain `"provider/model"` strings route through the gateway automatically)
- `@ai-sdk/gateway@^3.0.0` (optional direct install for explicit gateway package usage)

Setup

Pass a `"provider/model"` string to the `model` parameter. The AI SDK automatically routes it through the AI Gateway:

import { generateText } from 'ai'

const result = await generateText({
  model: 'openai/gpt-5.4', // plain string — routes through AI Gateway automatically
  prompt: 'Hello!',
})

No `gateway()` wrapper or additional package is needed. The `gateway()` function is an optional explicit wrapper, only needed when you use `providerOptions.gateway` for routing, failover, or tags:

import { generateText, gateway } from 'ai'

const result = await generateText({
  model: gateway('openai/gpt-5.4'),
  prompt: 'Hello!',
  providerOptions: { gateway: { order: ['openai', 'azure-openai'] } },
})

Model Slug Rules (Critical)

Always use `provider/model` format (for example `openai/gpt-5.4`).
Versioned slugs use dots for versions, not hyphens:
- Correct: `anthropic/claude-sonnet-4.6`
- Incorrect: `anthropic/claude-sonnet-4-6`

Before hardcoding model IDs, call `gateway.getAvailableModels()` and pick from the returned IDs.

Default text models: `openai/gpt-5.4` or `anthropic/claude-sonnet-4.6`.

Do not default to outdated choices like `openai/gpt-4o`.

import { gateway } from 'ai'

const availableModels = await gateway.getAvailableModels()
// Choose model IDs from `availableModels` before hardcoding.
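
One way to act on the returned list is to pick the first preferred slug that the gateway actually reports. The sketch below is a pure helper, not part of the AI SDK: `pickModel` and the sample IDs are hypothetical, and it assumes you have already extracted an array of ID strings from the `getAvailableModels()` result (check the current docs for the exact response shape).

```typescript
// Hypothetical helper: choose the first preferred slug present in the available list.
function pickModel(preferred: string[], available: string[]): string {
  const known = new Set(available)
  const match = preferred.find((slug) => known.has(slug))
  if (match === undefined) {
    throw new Error(`None of the preferred models are available: ${preferred.join(', ')}`)
  }
  return match
}

// Sample IDs for illustration only; real values come from the gateway.
const sampleAvailable = ['openai/gpt-5.4', 'anthropic/claude-sonnet-4.6']

// The hyphenated slug is skipped because it is not in the available list.
const chosen = pickModel(
  ['anthropic/claude-sonnet-4-6', 'anthropic/claude-sonnet-4.6'],
  sampleAvailable,
)
console.log(chosen)
```

Failing loudly when nothing matches keeps a stale slug from reaching a request.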

Authentication (OIDC — Default)

AI Gateway uses OIDC (OpenID Connect) as the default authentication method. No manual API keys are needed.

Setup

bash

vercel link                    # Connect to your Vercel project
# Enable AI Gateway in the Vercel dashboard: https://vercel.com/{team}/{project}/settings → AI Gateway
vercel env pull .env.local     # Provisions VERCEL_OIDC_TOKEN automatically

How It Works

- `vercel env pull` writes a `VERCEL_OIDC_TOKEN` to `.env.local`, a short-lived JWT (~24h)
- The `@ai-sdk/gateway` package reads this token via `@vercel/oidc` (`getVercelOidcToken()`)
- No `AI_GATEWAY_API_KEY` or provider-specific keys (like `ANTHROPIC_API_KEY`) are needed
- On Vercel deployments, OIDC tokens are auto-refreshed, so no maintenance is required

Local Development

For local dev, the OIDC token from `vercel env pull` is valid for ~24 hours. When it expires:

bash

vercel env pull .env.local --yes   # Re-pull to get a fresh token
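
To tell whether a local token needs re-pulling, you can inspect its expiry claim. The sketch below assumes the token is a standard three-segment JWT carrying an `exp` claim in seconds since the epoch; `decodeJwtPayload` and `isTokenExpired` are hypothetical helper names. It does not verify the signature, so use it only as a local freshness check.

```typescript
// Decode the payload segment of a JWT without verifying the signature.
function decodeJwtPayload(token: string): { exp?: number } {
  const segments = token.split('.')
  if (segments.length !== 3) {
    throw new Error('Not a JWT: expected three dot-separated segments')
  }
  // Convert base64url to standard base64 and restore padding before decoding.
  const base64 = segments[1].replace(/-/g, '+').replace(/_/g, '/')
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4)
  return JSON.parse(atob(padded))
}

// Returns true when the token's `exp` claim (seconds since epoch) is in the past.
function isTokenExpired(token: string, nowMs: number = Date.now()): boolean {
  const { exp } = decodeJwtPayload(token)
  // Assumption: a token without `exp` is treated as expired to stay on the safe side.
  if (typeof exp !== 'number') return true
  return exp * 1000 <= nowMs
}
```

In a local script you might pass `process.env.VERCEL_OIDC_TOKEN` to `isTokenExpired` and re-run `vercel env pull .env.local --yes` when it returns `true`.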

Alternative: Manual API Key

If you prefer a static key (e.g., for CI or non-Vercel environments):

bash

# Set AI_GATEWAY_API_KEY in your environment.
# When set, the gateway uses it in place of VERCEL_OIDC_TOKEN (see Auth Priority).
export AI_GATEWAY_API_KEY=your-key-here

Auth Priority

The `@ai-sdk/gateway` package resolves authentication in this order:

1. `AI_GATEWAY_API_KEY` environment variable (if set)
2. `VERCEL_OIDC_TOKEN` via `@vercel/oidc` (default on Vercel and after `vercel env pull`)
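
The priority order described above can be pictured with a small resolver. This is an illustration only, not the package's actual code: `resolveAuth` and `AuthSource` are hypothetical names, and the sketch reads both values from a plain environment object, whereas the real package obtains the OIDC token through `@vercel/oidc`.

```typescript
type AuthSource =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string }
  | { kind: 'none' }

// Illustrative only: mirrors the documented priority order.
function resolveAuth(env: Record<string, string | undefined>): AuthSource {
  // 1. A static API key wins when it is set to a non-empty value.
  const apiKey = env.AI_GATEWAY_API_KEY
  if (apiKey) return { kind: 'api-key', token: apiKey }

  // 2. Otherwise use the OIDC token, if one is present.
  const oidcToken = env.VERCEL_OIDC_TOKEN
  if (oidcToken) return { kind: 'oidc', token: oidcToken }

  return { kind: 'none' }
}

console.log(resolveAuth({ AI_GATEWAY_API_KEY: 'key', VERCEL_OIDC_TOKEN: 'jwt' }).kind)
```

A practical consequence: a leftover `AI_GATEWAY_API_KEY` in your shell will shadow a freshly pulled OIDC token.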

Provider Routing

Configure how AI Gateway routes requests across providers:

const result = await generateText({
  model: gateway('anthropic/claude-sonnet-4.6'),
  prompt: 'Hello!',
  providerOptions: {
    gateway: {
      // Try providers in order; failover to next on error
      order: ['bedrock', 'anthropic'],

      // Restrict to specific providers only
      only: ['anthropic', 'vertex'],

      // Fallback models if primary model fails
      models: ['openai/gpt-5.4', 'google/gemini-3-flash'],

      // Track usage per end-user
      user: 'user-123',

      // Tag for cost attribution and filtering
      tags: ['feature:chat', 'env:production', 'team:growth'],
    },
  },
})

Routing Options

Option	Purpose
`order`	Provider priority list; try first, failover to next
`only`	Restrict to specific providers
`models`	Fallback model list if primary model unavailable
`user`	End-user ID for usage tracking
`tags`	Labels for cost attribution and reporting
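
If you assemble these options dynamically, a typed shape helps catch misspelled keys. The interface below uses only the field names from the table; the `string[]` and `string` types are inferred from the examples in this document, and `compactRoutingOptions` is a hypothetical helper that drops empty values before they are passed as `providerOptions.gateway`.

```typescript
// Shape of the routing options listed in the table (types inferred from the examples).
interface GatewayRoutingOptions {
  order?: string[]
  only?: string[]
  models?: string[]
  user?: string
  tags?: string[]
}

// Hypothetical helper: keep only options that carry a meaningful value.
function compactRoutingOptions(options: GatewayRoutingOptions): GatewayRoutingOptions {
  const result: GatewayRoutingOptions = {}
  if (options.order?.length) result.order = [...options.order]
  if (options.only?.length) result.only = [...options.only]
  if (options.models?.length) result.models = [...options.models]
  if (options.user?.trim()) result.user = options.user.trim()
  if (options.tags?.length) result.tags = [...options.tags]
  return result
}

console.log(compactRoutingOptions({ order: [], user: ' user-123 ', tags: ['feature:chat'] }))
```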

Cache-Control Headers

AI Gateway supports response caching to reduce latency and cost for repeated or similar requests:

const result = await generateText({
  model: gateway('openai/gpt-5.4'),
  prompt: 'What is the capital of France?',
  providerOptions: {
    gateway: {
      // Cache identical requests for 1 hour
      cacheControl: 'max-age=3600',
    },
  },
})

Caching strategies

Header Value	Behavior
`max-age=3600`	Cache response for 1 hour
`max-age=0`	Bypass cache, always call provider
`s-maxage=86400`	Cache at the edge for 24 hours
`stale-while-revalidate=600`	Serve stale for 10 min while refreshing in background

When to use caching

- Static knowledge queries: FAQs, translations, factual lookups. Cache aggressively
- User-specific conversations: Do not cache, because each response depends on conversation history
- Embeddings: Cache embedding results for identical inputs to save cost
- Structured extraction: Cache when extracting structured data from identical documents

Cache key composition

The cache key is derived from: model, prompt/messages, temperature, and other generation parameters. Changing any parameter produces a new cache key.
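
To see why changing any parameter yields a new key, consider the sketch below. It is not the gateway's actual key derivation; it only illustrates the principle by hashing a stably serialized set of parameters with SHA-256. `CacheKeyInput`, `stableStringify`, and `cacheKey` are hypothetical names.

```typescript
import { createHash } from 'node:crypto'

interface CacheKeyInput {
  model: string
  prompt: string
  temperature?: number
}

// Serialize with sorted keys so property order does not change the hash.
function stableStringify(value: Record<string, unknown>): string {
  const sortedKeys = Object.keys(value).sort()
  return JSON.stringify(sortedKeys.map((key) => [key, value[key]]))
}

// Derive a hex digest from the generation parameters.
function cacheKey(input: CacheKeyInput): string {
  return createHash('sha256').update(stableStringify({ ...input })).digest('hex')
}

const keyA = cacheKey({ model: 'openai/gpt-5.4', prompt: 'Hi', temperature: 0 })
const keyB = cacheKey({ temperature: 0, prompt: 'Hi', model: 'openai/gpt-5.4' })
const keyC = cacheKey({ model: 'openai/gpt-5.4', prompt: 'Hi', temperature: 0.7 })
console.log(keyA === keyB, keyA === keyC)
```

Here `keyA` and `keyB` match because only the property order differs, while `keyC` differs because the temperature changed.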

Per-User Rate Limiting

Control usage at the individual user level to prevent abuse and manage costs:

const result = await generateText({
  model: gateway('openai/gpt-5.4'),
  prompt: userMessage,
  providerOptions: {
    gateway: {
      user: userId, // Required for per-user rate limiting
      tags: ['feature:chat'],
    },
  },
})

Rate limit configuration

Configure rate limits at https://vercel.com/{team}/{project}/settings → AI Gateway → Rate Limits:

- Requests per minute per user: Throttle individual users (e.g., 20 RPM)
- Tokens per day per user: Cap daily token consumption (e.g., 100K tokens/day)
- Concurrent requests per user: Limit parallel calls (e.g., 3 concurrent)

Handling rate limit responses

When a user exceeds their limit, the gateway returns HTTP 429:

import { generateText, gateway, APICallError } from 'ai'

try {
  const result = await generateText({
    model: gateway('openai/gpt-5.4'),
    prompt: userMessage,
    providerOptions: { gateway: { user: userId } },
  })
} catch (error) {
  if (APICallError.isInstance(error) && error.statusCode === 429) {
    const retryAfter = error.responseHeaders?.['retry-after']
    return new Response(
      JSON.stringify({ error: 'Rate limited', retryAfter }),
      { status: 429 }
    )
  }
  throw error
}
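
The `retry-after` value read above is a raw header string. Under RFC 9110 it can be either a number of seconds or an HTTP-date, so a client that wants to schedule a retry needs to normalize it. The helper below is a minimal sketch; `retryAfterToMs` is a hypothetical name, and whether the gateway sends seconds or a date is not stated here, so both forms are handled.

```typescript
// Parse a Retry-After header value into a delay in milliseconds.
// Returns undefined when the header is absent or unparseable.
function retryAfterToMs(
  value: string | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value === undefined) return undefined
  const trimmed = value.trim()

  // Form 1: delay in whole seconds, e.g. "30".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000

  // Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed)
  if (Number.isNaN(dateMs)) return undefined

  // Never return a negative delay if the date is already in the past.
  return Math.max(0, dateMs - nowMs)
}

console.log(retryAfterToMs('30'))
```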

Budget Alerts and Cost Controls

Tagging for cost attribution

Use tags to track spend by feature, team, and environment:

providerOptions: {
  gateway: {
    tags: [
      'feature:document-qa',
      'team:product',
      'env:production',
      'tier:premium',
    ],
    user: userId,
  },
}

Setting up budget alerts

In the Vercel dashboard at https://vercel.com/{team}/{project}/settings → AI Gateway:

1. Navigate to AI Gateway → Usage & Budgets
2. Set monthly budget thresholds (e.g., $500/month warning, $1000/month hard limit)
3. Configure alert channels (email, Slack webhook, Vercel integration)
4. Optionally set per-tag budgets for granular control

Budget isolation best practice

Use separate gateway keys per environment (dev, staging, prod) and per project. This keeps dashboards clean and budgets isolated:

- Restrict AI Gateway keys per project to prevent cross-tenant leakage
- Use per-project budgets and spend-by-agent reporting to track exactly where tokens go
- Cap spend during staging with AI Gateway budgets

Pre-flight cost controls

The AI Gateway dashboard provides observability (traces, token counts, spend tracking) but no programmatic metrics API. Build your own cost guardrails by estimating token counts and rejecting expensive requests before they execute:

import { generateText } from 'ai'

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4) // rough estimate
}

async function callWithBudget(prompt: string, maxTokens: number) {
  const estimated = estimateTokens(prompt)
  if (estimated > maxTokens) {
    throw new Error(`Prompt too large: ~${estimated} tokens exceeds ${maxTokens} limit`)
  }
  return generateText({ model: 'openai/gpt-5.4', prompt })
}

The AI SDK's `usage` field on responses gives actual token counts after each request. Store these for historical tracking and cost analysis.
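
A minimal sketch of such tracking follows. It assumes the usage object exposes `inputTokens` and `outputTokens` (check the field names in your SDK version), and the prices are placeholders you would replace with current list prices. `estimateCostUsd` and `UsageLedger` are hypothetical names, and the ledger is in-memory only.

```typescript
// Token counts as reported after a request (field names are an assumption).
interface TokenUsage {
  inputTokens: number
  outputTokens: number
}

// Prices in USD per million tokens; placeholder values, not authoritative.
interface ModelPrice {
  inputPerMillion: number
  outputPerMillion: number
}

function estimateCostUsd(usage: TokenUsage, price: ModelPrice): number {
  const weighted =
    usage.inputTokens * price.inputPerMillion +
    usage.outputTokens * price.outputPerMillion
  return weighted / 1_000_000
}

// Hypothetical in-memory ledger keyed by user ID.
class UsageLedger {
  private totals = new Map<string, number>()

  // Adds a cost to the user's running total and returns the new total.
  record(userId: string, costUsd: number): number {
    const next = (this.totals.get(userId) ?? 0) + costUsd
    this.totals.set(userId, next)
    return next
  }

  total(userId: string): number {
    return this.totals.get(userId) ?? 0
  }
}

const ledger = new UsageLedger()
const cost = estimateCostUsd(
  { inputTokens: 1000, outputTokens: 500 },
  { inputPerMillion: 2.5, outputPerMillion: 15 },
)
ledger.record('user-123', cost)
console.log(ledger.total('user-123'))
```

In production you would persist the totals rather than keep them in memory, so they survive restarts and can be compared against budgets.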

Hard spending limits

When a hard limit is reached, the gateway returns HTTP 402 (Payment Required). Handle this gracefully:

if (APICallError.isInstance(error) && error.statusCode === 402) {
  // Budget exceeded — degrade gracefully
  return fallbackResponse()
}

Cost optimization patterns

- Use cheaper models for classification/routing, expensive models for generation
- Cache embeddings and static queries (see Cache-Control above)
- Set per-user daily token caps to prevent runaway usage
- Monitor cost-per-feature with tags to identify optimization targets

Audit Logging

AI Gateway logs every request for compliance and debugging:

What's logged

- Timestamp, model, provider used
- Input/output token counts
- Latency (routing + provider)
- User ID and tags
- HTTP status code
- Failover chain (which providers were tried)

Accessing logs

- Vercel Dashboard at `https://vercel.com/{team}/{project}/ai` → Logs. Filter by model, user, tag, status, or date range
- Vercel API: Query logs programmatically:

bash

curl -H "Authorization: Bearer $VERCEL_TOKEN" \
  "https://api.vercel.com/v1/ai-gateway/logs?projectId=$PROJECT_ID&limit=100"

- Log Drains: Forward AI Gateway logs to Datadog, Splunk, or other providers via Vercel Log Drains (configure at `https://vercel.com/dashboard/{team}/~/settings/log-drains`) for long-term retention and custom analysis

Compliance considerations

- AI Gateway does not log prompt or completion content by default
- Enable content logging in project settings if required for compliance
- Logs are retained per your Vercel plan's retention policy
- Use the `user` field consistently to support audit trails

Error Handling Patterns

Provider unavailable

When a provider is down, the gateway automatically fails over if you configured `order` or `models`:

const result = await generateText({
  model: gateway('anthropic/claude-sonnet-4.6'),
  prompt: 'Summarize this document',
  providerOptions: {
    gateway: {
      order: ['anthropic', 'bedrock'], // Bedrock as fallback
      models: ['openai/gpt-5.4'],   // Final fallback model
    },
  },
})
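
The ordering semantics can be simulated locally to reason about which provider a request would land on. This sketch does not call the gateway, where the real failover happens; `planFailover` and the health callback are hypothetical names used only for illustration.

```typescript
interface FailoverPlan {
  selected: string | undefined // provider that would serve the request, if any
  tried: string[] // providers attempted, in order
}

// Walk the provider list in order and stop at the first one reported healthy.
function planFailover(
  order: string[],
  isHealthy: (provider: string) => boolean,
): FailoverPlan {
  const tried: string[] = []
  for (const provider of order) {
    tried.push(provider)
    if (isHealthy(provider)) return { selected: provider, tried }
  }
  return { selected: undefined, tried }
}

// Simulate an outage at the first provider in the list.
const plan = planFailover(['anthropic', 'bedrock'], (p) => p !== 'anthropic')
console.log(plan)
```

The `tried` array corresponds to the failover chain that the audit logs record for each request.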

Quota exceeded at provider

If your provider API key hits its quota, the gateway tries the next provider in the `order` list. Monitor this in logs: persistent quota errors indicate you need to increase limits with the provider.

Invalid model identifier

// Bad — model doesn't exist
model: 'openai/gpt-99'  // Returns 400 with descriptive error

// Good — use models listed in Vercel docs
model: 'openai/gpt-5.4'

Timeout handling

Gateway has a default timeout per provider. For long-running generations, use streaming:

import { streamText } from 'ai'

const result = streamText({
  model: 'anthropic/claude-sonnet-4.6',
  prompt: longDocument,
})

for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Complete error handling template

import { generateText, gateway, APICallError } from 'ai'

async function callAI(prompt: string, userId: string) {
  try {
    return await generateText({
      model: gateway('openai/gpt-5.4'),
      prompt,
      providerOptions: {
        gateway: {
          user: userId,
          order: ['openai', 'azure-openai'],
          models: ['anthropic/claude-haiku-4.5'],
          tags: ['feature:chat'],
        },
      },
    })
  } catch (error) {
    if (!APICallError.isInstance(error)) throw error

    switch (error.statusCode) {
      case 402: return { text: 'Budget limit reached. Please try again later.' }
      case 429: return { text: 'Too many requests. Please slow down.' }
      case 503: return { text: 'AI service temporarily unavailable.' }
      default: throw error
    }
  }
}

Gateway vs Direct Provider — Decision Tree

Use this to decide whether to route through AI Gateway or call a provider SDK directly:

Need failover across providers?
  └─ Yes → Use Gateway
  └─ No
      Need cost tracking / budget alerts?
        └─ Yes → Use Gateway
        └─ No
            Need per-user rate limiting?
              └─ Yes → Use Gateway
              └─ No
                  Need audit logging?
                    └─ Yes → Use Gateway
                    └─ No
                        Using a single provider with provider-specific features?
                          └─ Yes → Use direct provider SDK
                          └─ No → Use Gateway (simplifies code)
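
The same tree can be written as a small function, which is handy for encoding the decision in a project checklist or test. The `Requirements` fields and `recommend` are hypothetical names that map one-to-one onto the questions in the tree.

```typescript
interface Requirements {
  failover: boolean
  costTracking: boolean
  perUserRateLimiting: boolean
  auditLogging: boolean
  singleProviderWithSpecificFeatures: boolean
}

type Recommendation = 'gateway' | 'direct-provider-sdk'

// Mirrors the decision tree: any gateway-only need wins; otherwise the
// provider-specific-features question decides.
function recommend(req: Requirements): Recommendation {
  if (req.failover || req.costTracking || req.perUserRateLimiting || req.auditLogging) {
    return 'gateway'
  }
  return req.singleProviderWithSpecificFeatures ? 'direct-provider-sdk' : 'gateway'
}

const noNeeds: Requirements = {
  failover: false,
  costTracking: false,
  perUserRateLimiting: false,
  auditLogging: false,
  singleProviderWithSpecificFeatures: false,
}

console.log(recommend({ ...noNeeds, singleProviderWithSpecificFeatures: true }))
```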

When to use direct provider SDK

- You need provider-specific features not exposed through the gateway (e.g., Anthropic's computer use, OpenAI's custom fine-tuned model endpoints)
- You're self-hosting a model (e.g., vLLM, Ollama) that isn't registered with the gateway
- You need request-level control over HTTP transport (custom proxies, mTLS)

When to always use Gateway

- Production applications: failover and observability are essential
- Multi-tenant SaaS: per-user tracking and rate limiting
- Teams with cost accountability: tag-based budgeting

Claude Code Compatibility

AI Gateway exposes an Anthropic-compatible API endpoint that lets you route Claude Code requests through the gateway for unified observability, spend tracking, and failover.

Configuration

Set these environment variables to route Claude Code through AI Gateway:

bash

export ANTHROPIC_BASE_URL="https://ai-gateway.vercel.sh"
export ANTHROPIC_AUTH_TOKEN="your-vercel-ai-gateway-api-key"
export ANTHROPIC_API_KEY=""  # Must be empty string — Claude Code checks this first

Important: Setting `ANTHROPIC_API_KEY` to an empty string is required. Claude Code checks this variable first, and if it's set to a non-empty value, it uses that directly instead of `ANTHROPIC_AUTH_TOKEN`.

Claude Code Max Subscription

AI Gateway supports Claude Code Max subscriptions. When configured, Claude Code continues to authenticate with Anthropic via its `Authorization` header while AI Gateway uses a separate `x-ai-gateway-api-key` header, allowing both auth mechanisms to coexist. This gives you unified observability at no additional token cost.

Using Non-Anthropic Models

Override the default Anthropic models by setting:

bash

export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5.4"
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.6"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"

Latest Model Availability

Model	Slug	Input	Output
GPT-5.4	`openai/gpt-5.4`	$2.50/M tokens	$15.00/M tokens
GPT-5.4 Pro	`openai/gpt-5.4-pro`	$30.00/M tokens	$180.00/M tokens

Supported Providers

- OpenAI (GPT-5.x including GPT-5.4 and GPT-5.4 Pro, o-series)
- Anthropic (Claude 4.x)
- Google (Gemini)
- xAI (Grok)
- Mistral
- DeepSeek
- Amazon Bedrock
- Azure OpenAI
- Cohere
- Perplexity
- Alibaba (Qwen)
- Meta (Llama)
- And many more (100+ models total)

Pricing

- Zero markup: Tokens are billed at the exact provider list price with no middleman markup, whether you use Vercel-managed keys or Bring Your Own Key (BYOK)
- Free tier: Every Vercel team gets $5 of free AI Gateway credits per month (refreshes every 30 days, starts on first request). No commitment is required, so you can keep experimenting with LLMs on the free tier
- Pay-as-you-go: Beyond free credits, purchase AI Gateway Credits at any time with no obligation. Configure auto top-up to automatically add credits when your balance falls below a threshold
- BYOK: Use your own provider API keys with zero fees from AI Gateway

Multimodal Support

Text and image generation both route through the gateway. For embeddings, use a direct provider SDK.

// Text — through gateway
const { text } = await generateText({
  model: 'openai/gpt-5.4',
  prompt: 'Hello',
})

// Image — through gateway (multimodal LLMs return images in result.files)
const result = await generateText({
  model: 'google/gemini-3.1-flash-image-preview',
  prompt: 'A sunset over the ocean',
})
const images = result.files.filter((f) => f.mediaType?.startsWith('image/'))

// Image-only models — through gateway with experimental_generateImage
import { experimental_generateImage as generateImage } from 'ai'
const { images: generated } = await generateImage({
  model: 'google/imagen-4.0-generate-001',
  prompt: 'A sunset',
})

Default image model: `google/gemini-3.1-flash-image-preview`, for fast multimodal image generation via the gateway.

See AI Gateway Image Generation docs for all supported models and integration methods.

Key Benefits

- Unified API: One interface for all providers, no provider-specific code
- Automatic failover: If a provider is down, requests route to the next
- Cost tracking: Per-user, per-feature attribution with tags
- Observability: Built-in monitoring of all model calls
- Low latency: <20ms routing overhead
- No lock-in: Switch models/providers by changing a string

When to Use AI Gateway

Scenario	Use Gateway?
Production app with AI features	Yes — failover, cost tracking
Prototyping with single provider	Optional — direct provider works fine
Multi-provider setup	Yes — unified routing
Need provider-specific features	Use direct provider SDK + Gateway as fallback
Cost tracking and budgeting	Yes — user tracking and tags
Multi-tenant SaaS	Yes — per-user rate limiting and audit
Compliance requirements	Yes — audit logging and log drains
