

AI Models Reference Skill


Load with: base.md + llm-patterns.md
Last Updated: December 2025

Philosophy


Use the right model for the job. Bigger isn't always better - match model capabilities to task requirements. Consider cost, latency, and accuracy tradeoffs.

Model Selection Matrix


| Task | Recommended | Why |
| --- | --- | --- |
| Complex reasoning | Claude Opus 4.5, o3, Gemini 3 Pro | Highest accuracy |
| Fast chat/completion | Claude Haiku, GPT-4.1 mini, Gemini Flash | Low latency, cheap |
| Code generation | Claude Sonnet 4.5, Codestral, GPT-4.1 | Strong coding |
| Vision/images | Claude Sonnet, GPT-4o, Gemini 3 Pro | Multimodal |
| Embeddings | text-embedding-3-small, Voyage | Cost-effective |
| Voice synthesis | Eleven Labs v3, OpenAI TTS | Natural sounding |
| Image generation | FLUX.2, DALL-E 3, SD 3.5 | Different styles |

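The matrix above can also live in code as a small lookup table, so call sites stay declarative. The task names and default picks below are illustrative, not an official mapping; adjust them to your own provider mix:

```typescript
// Illustrative task-to-default-model lookup based on the matrix above.
// Model IDs are examples; verify against each provider's current docs.
type Task = 'reasoning' | 'chat' | 'code' | 'vision' | 'embeddings';

const DEFAULT_MODEL: Record<Task, { provider: string; model: string }> = {
  reasoning: { provider: 'anthropic', model: 'claude-opus-4-5-20251101' },
  chat: { provider: 'google', model: 'gemini-2.5-flash' },
  code: { provider: 'anthropic', model: 'claude-sonnet-4-5-20250929' },
  vision: { provider: 'openai', model: 'gpt-4.1' },
  embeddings: { provider: 'openai', model: 'text-embedding-3-small' },
};

function pickModel(task: Task): { provider: string; model: string } {
  return DEFAULT_MODEL[task];
}
```

Centralizing the mapping also gives you one place to update when model IDs change.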

Anthropic (Claude)


Documentation


Latest Models (December 2025)


typescript
const CLAUDE_MODELS = {
  // Flagship - highest capability
  opus: 'claude-opus-4-5-20251101',

  // Balanced - best for most tasks
  sonnet: 'claude-sonnet-4-5-20250929',

  // Previous generation (still excellent)
  opus4: 'claude-opus-4-20250514',
  sonnet4: 'claude-sonnet-4-20250514',

  // Fast & cheap - high volume tasks
  haiku: 'claude-haiku-3-5-20241022',
} as const;

Usage


typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude!' }
  ],
});

Model Selection


claude-opus-4-5-20251101 (Opus 4.5)
├── Best for: Complex analysis, research, nuanced writing
├── Context: 200K tokens
├── Cost: $5/$25 per 1M tokens (input/output)
└── Use when: Accuracy matters most

claude-sonnet-4-5-20250929 (Sonnet 4.5)
├── Best for: Code, general tasks, balanced performance
├── Context: 200K tokens
├── Cost: $3/$15 per 1M tokens
└── Use when: Default choice for most applications

claude-haiku-3-5-20241022 (Haiku 3.5)
├── Best for: Classification, extraction, high-volume
├── Context: 200K tokens
├── Cost: $0.25/$1.25 per 1M tokens
└── Use when: Speed and cost matter most

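With the per-token prices listed above, request cost can be estimated up front. A minimal sketch; the constants are a snapshot of the listed rates and will drift as pricing changes:

```typescript
// Rough per-request cost estimate from the listed Claude prices
// (USD per 1M tokens; snapshot, check current pricing before relying on it).
const CLAUDE_PRICING = {
  opus: { input: 5, output: 25 },
  sonnet: { input: 3, output: 15 },
  haiku: { input: 0.25, output: 1.25 },
} as const;

function estimateClaudeCost(
  tier: keyof typeof CLAUDE_PRICING,
  inputTokens: number,
  outputTokens: number,
): number {
  const p = CLAUDE_PRICING[tier];
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}
```

For example, a Sonnet call with 1,000 input and 1,000 output tokens works out to about $0.018.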

OpenAI


Documentation


Latest Models (December 2025)


typescript
const OPENAI_MODELS = {
  // GPT-5 series (latest)
  gpt5: 'gpt-5.2',
  gpt5Mini: 'gpt-5-mini',

  // GPT-4.1 series (recommended for most)
  gpt41: 'gpt-4.1',
  gpt41Mini: 'gpt-4.1-mini',
  gpt41Nano: 'gpt-4.1-nano',

  // Reasoning models (o-series)
  o3: 'o3',
  o3Pro: 'o3-pro',
  o4Mini: 'o4-mini',

  // Legacy but still useful
  gpt4o: 'gpt-4o',           // Still has audio support
  gpt4oMini: 'gpt-4o-mini',

  // Embeddings
  embeddingSmall: 'text-embedding-3-small',
  embeddingLarge: 'text-embedding-3-large',

  // Image generation
  dalle3: 'dall-e-3',
  gptImage: 'gpt-image-1',

  // Audio
  tts: 'tts-1',
  ttsHd: 'tts-1-hd',
  whisper: 'whisper-1',
} as const;

Usage


typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Chat completion
const response = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// With vision
const visionResponse = await openai.chat.completions.create({
  model: 'gpt-4.1',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        { type: 'image_url', image_url: { url: 'https://...' } },
      ],
    },
  ],
});

// Embeddings
const embedding = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Your text here',
});

Model Selection


o3 / o3-pro
├── Best for: Math, coding, complex multi-step reasoning
├── Context: 200K tokens
├── Cost: Premium pricing
└── Use when: Hardest problems, need chain-of-thought

gpt-4.1
├── Best for: General tasks, coding, instruction following
├── Context: 1M tokens (!)
├── Cost: Lower than GPT-4o
└── Use when: Default choice, replaces GPT-4o

gpt-4.1-mini / gpt-4.1-nano
├── Best for: High-volume, cost-sensitive
├── Context: 1M tokens
├── Cost: Very low
└── Use when: Simple tasks at scale

o4-mini
├── Best for: Fast reasoning at low cost
├── Context: 200K tokens
├── Cost: Budget reasoning
└── Use when: Need reasoning but cost-conscious

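At scale these endpoints are frequently rate limited, so calls are usually wrapped in retries. A minimal sketch of exponential backoff with full jitter; the base delay and attempt cap are illustrative defaults, not values recommended by any provider:

```typescript
// Exponential backoff with full jitter for retrying rate-limited (HTTP 429)
// API calls. Base delay and ceiling are illustrative defaults.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling; // "full jitter": uniform in [0, ceiling)
}

async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```

Wrap any SDK call, e.g. `withRetries(() => openai.chat.completions.create({ ... }))`. A production version would retry only on retryable errors (429, 5xx) rather than on everything.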

Google (Gemini)


Documentation


Latest Models (December 2025)


typescript
const GEMINI_MODELS = {
  // Gemini 3 (Latest)
  gemini3Pro: 'gemini-3-pro-preview',
  gemini3ProImage: 'gemini-3-pro-image-preview',
  gemini3Flash: 'gemini-3-flash-preview',

  // Gemini 2.5 (Stable)
  gemini25Pro: 'gemini-2.5-pro',
  gemini25Flash: 'gemini-2.5-flash',
  gemini25FlashLite: 'gemini-2.5-flash-lite',

  // Specialized
  gemini25FlashTTS: 'gemini-2.5-flash-preview-tts',
  gemini25FlashAudio: 'gemini-2.5-flash-native-audio-preview-12-2025',

  // Previous generation
  gemini2Flash: 'gemini-2.0-flash',
} as const;

Usage


typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });

const result = await model.generateContent('Hello!');
const response = result.response.text();

// With vision (separate variable: `result` is already declared above)
const visionModel = genAI.getGenerativeModel({ model: 'gemini-2.5-pro' });
const imagePart = {
  inlineData: {
    data: base64Image,
    mimeType: 'image/jpeg',
  },
};
const visionResult = await visionModel.generateContent(['Describe this:', imagePart]);

Model Selection


gemini-3-pro-preview
├── Best for: Highest-quality multimodal work (Google's flagship)
├── Context: 2M tokens
├── Cost: Premium
└── Use when: Need absolute best quality

gemini-2.5-pro
├── Best for: State-of-the-art thinking, complex tasks
├── Context: 2M tokens
├── Cost: $1.25/$5 per 1M tokens
└── Use when: Long context, complex reasoning

gemini-2.5-flash
├── Best for: Fast, balanced performance
├── Context: 1M tokens
├── Cost: $0.075/$0.30 per 1M tokens
└── Use when: Speed and cost matter

gemini-2.5-flash-lite
├── Best for: Ultra-fast, lowest cost
├── Context: 1M tokens
├── Cost: $0.04/$0.15 per 1M tokens
└── Use when: High volume, simple tasks

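The context windows above can drive selection directly: pick the cheapest tier whose window fits the request. A sketch using the listed windows; verify current limits before relying on them:

```typescript
// Pick the cheapest Gemini tier whose context window fits the request,
// using the windows listed above (sketch; limits change between releases).
const GEMINI_TIERS = [
  { model: 'gemini-2.5-flash-lite', contextTokens: 1_000_000 },
  { model: 'gemini-2.5-flash', contextTokens: 1_000_000 },
  { model: 'gemini-2.5-pro', contextTokens: 2_000_000 },
] as const; // ordered cheapest to most expensive

function pickGeminiModel(promptTokens: number): string | undefined {
  return GEMINI_TIERS.find((t) => promptTokens <= t.contextTokens)?.model;
}
```

Returning `undefined` when nothing fits forces the caller to chunk or summarize instead of silently truncating.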

Eleven Labs (Voice)


Documentation


Latest Models (December 2025)


typescript
const ELEVENLABS_MODELS = {
  // Latest - highest quality (alpha)
  v3: 'eleven_v3',

  // Production ready
  multilingualV2: 'eleven_multilingual_v2',
  turboV2_5: 'eleven_turbo_v2_5',

  // Ultra-low latency
  flashV2_5: 'eleven_flash_v2_5',
  flashV2: 'eleven_flash_v2', // English only
} as const;

Usage


typescript
import { ElevenLabsClient } from 'elevenlabs';

const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

// Text to speech
const audio = await elevenlabs.textToSpeech.convert('voice-id', {
  text: 'Hello, world!',
  model_id: 'eleven_turbo_v2_5',
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
  },
});

// Stream audio (for real-time)
const audioStream = await elevenlabs.textToSpeech.convertAsStream('voice-id', {
  text: 'Streaming audio...',
  model_id: 'eleven_flash_v2_5',
});

Model Selection


eleven_v3 (Alpha)
├── Best for: Highest quality, emotional range
├── Latency: ~1s+ (not for real-time)
├── Languages: 74
└── Use when: Quality over speed, pre-rendered

eleven_turbo_v2_5
├── Best for: Balanced quality and speed
├── Latency: ~250-300ms
├── Languages: 32
└── Use when: Good quality with reasonable latency

eleven_flash_v2_5
├── Best for: Real-time, conversational AI
├── Latency: <75ms
├── Languages: 32
└── Use when: Live voice agents, chatbots

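The latency figures above suggest a simple selection rule: choose the highest-quality model that still fits the latency budget. The thresholds below are illustrative, derived from the approximate latencies listed, not official figures:

```typescript
// Choose a TTS model from a latency budget, using the approximate
// latencies listed above (illustrative thresholds).
function pickTtsModel(latencyBudgetMs: number): string {
  if (latencyBudgetMs < 250) return 'eleven_flash_v2_5'; // <75ms, real-time
  if (latencyBudgetMs < 1000) return 'eleven_turbo_v2_5'; // ~250-300ms
  return 'eleven_v3'; // ~1s+, highest quality, pre-rendered
}
```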

Replicate


Documentation


Popular Models (December 2025)


typescript
const REPLICATE_MODELS = {
  // FLUX.2 (Latest - November 2025)
  flux2Pro: 'black-forest-labs/flux-2-pro',
  flux2Flex: 'black-forest-labs/flux-2-flex',
  flux2Dev: 'black-forest-labs/flux-2-dev',

  // FLUX.1 (Still excellent)
  flux11Pro: 'black-forest-labs/flux-1.1-pro',
  fluxKontext: 'black-forest-labs/flux-kontext', // Image editing
  fluxSchnell: 'black-forest-labs/flux-schnell',

  // Video
  stableVideo4D: 'stability-ai/sv4d-2.0',

  // Audio
  musicgen: 'meta/musicgen',

  // LLMs (if needed outside main providers)
  llama: 'meta/llama-3.2-90b-vision',
} as const;

Usage


typescript
import Replicate from 'replicate';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Image generation with FLUX.2
const output = await replicate.run('black-forest-labs/flux-2-pro', {
  input: {
    prompt: 'A serene mountain landscape at sunset',
    aspect_ratio: '16:9',
    output_format: 'webp',
  },
});

// Image editing with Kontext
const edited = await replicate.run('black-forest-labs/flux-kontext', {
  input: {
    image: 'https://...',
    prompt: 'Change the sky to sunset colors',
  },
});

Model Selection


flux-2-pro
├── Best for: Highest quality, up to 4MP
├── Speed: ~6s
├── Cost: $0.015 + per megapixel
└── Use when: Professional quality needed

flux-2-flex
├── Best for: Fine details, typography
├── Speed: ~22s
├── Cost: $0.06 per megapixel
└── Use when: Need precise control

flux-2-dev (Open source)
├── Best for: Fast generation
├── Speed: ~2.5s
├── Cost: $0.012 per megapixel
└── Use when: Speed over quality

flux-kontext
├── Best for: Image editing with text
├── Speed: Variable
├── Cost: Per run
└── Use when: Edit existing images

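The per-megapixel rates above translate into a quick budget estimate for a given output resolution. A sketch covering only the models with a clean per-megapixel rate listed (flux-2-pro's base fee is priced differently and is omitted):

```typescript
// Estimated image cost from the per-megapixel rates listed above
// (snapshot pricing; check Replicate's model pages before budgeting).
const FLUX_PER_MEGAPIXEL_USD = {
  'flux-2-flex': 0.06,
  'flux-2-dev': 0.012,
} as const;

function estimateFluxCost(
  model: keyof typeof FLUX_PER_MEGAPIXEL_USD,
  widthPx: number,
  heightPx: number,
): number {
  const megapixels = (widthPx * heightPx) / 1_000_000;
  return megapixels * FLUX_PER_MEGAPIXEL_USD[model];
}
```

For example, a 1000x1000 (1 MP) flux-2-dev render is roughly $0.012.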

Stability AI


Documentation


Latest Models (December 2025)


typescript
const STABILITY_MODELS = {
  // Image generation
  sd35Large: 'sd3.5-large',
  sd35LargeTurbo: 'sd3.5-large-turbo',
  sd3Medium: 'sd3-medium',

  // Video
  sv4d: 'sv4d-2.0', // Stable Video 4D 2.0

  // Upscaling
  upscale: 'esrgan-v1-x2plus',
} as const;

Usage


typescript
// v2beta stable-image endpoints expect multipart/form-data, not JSON
const form = new FormData();
form.append('prompt', 'A futuristic city at night');
form.append('output_format', 'webp');
form.append('aspect_ratio', '16:9');
form.append('model', 'sd3.5-large');

const response = await fetch(
  'https://api.stability.ai/v2beta/stable-image/generate/sd3',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
      Accept: 'image/*', // or 'application/json' for base64 output
    },
    body: form,
  }
);


Mistral AI


Documentation


Latest Models (December 2025)


typescript
const MISTRAL_MODELS = {
  // Flagship
  large: 'mistral-large-latest',  // Points to 2411

  // Medium tier
  medium: 'mistral-medium-2505',  // Medium 3

  // Small/Fast
  small: 'mistral-small-2506',    // Small 3.2

  // Code specialized
  codestral: 'codestral-2508',
  devstral: 'devstral-medium-2507',

  // Reasoning (Magistral)
  magistralMedium: 'magistral-medium-2507',
  magistralSmall: 'magistral-small-2507',

  // Audio
  voxtral: 'voxtral-small-2507',

  // OCR
  ocr: 'mistral-ocr-2505',
} as const;

Usage


typescript
import MistralClient from '@mistralai/mistralai';

const client = new MistralClient(process.env.MISTRAL_API_KEY);

const response = await client.chat({
  model: 'mistral-large-latest',
  messages: [{ role: 'user', content: 'Hello!' }],
});

// Code completion with Codestral
const codeResponse = await client.chat({
  model: 'codestral-2508',
  messages: [{ role: 'user', content: 'Write a Python function to...' }],
});

Model Selection


mistral-large-latest (123B params)
├── Best for: Complex reasoning, knowledge tasks
├── Context: 128K tokens
└── Use when: Need high capability

codestral-2508
├── Best for: Code generation, 80+ languages
├── Speed: 2.5x faster than predecessor
└── Use when: Code-focused tasks

magistral-medium-2507
├── Best for: Multi-step reasoning
├── Specialty: Transparent chain-of-thought
└── Use when: Need reasoning traces


Voyage AI (Embeddings)


Documentation


Latest Models (December 2025)


typescript
const VOYAGE_MODELS = {
  // General purpose
  large2: 'voyage-large-2',
  large2Instruct: 'voyage-large-2-instruct',

  // Code specialized
  code2: 'voyage-code-2',
  code3: 'voyage-code-3',

  // Multilingual
  multilingual2: 'voyage-multilingual-2',

  // Domain specific
  law2: 'voyage-law-2',
  finance2: 'voyage-finance-2',
} as const;

Usage


typescript
const response = await fetch('https://api.voyageai.com/v1/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'voyage-code-3',
    input: ['Your code to embed'],
  }),
});

const { data } = await response.json();
const embedding = data[0].embedding;

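Once you have embedding vectors, comparing them typically reduces to cosine similarity. A provider-agnostic helper:

```typescript
// Cosine similarity between two embedding vectors: the standard way to
// compare results from any embeddings API.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Values range from -1 to 1, with 1 meaning identical direction; for retrieval, rank candidates by similarity to the query embedding.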

Quick Reference


Cost Comparison (per 1M tokens, approx.)


| Provider | Cheap | Mid | Premium |
| --- | --- | --- | --- |
| Anthropic | $0.25 (Haiku) | $3 (Sonnet 4.5) | $5 (Opus 4.5) |
| OpenAI | $0.15 (4.1-nano) | $2 (4.1) | $15+ (o3) |
| Google | $0.04 (Flash-lite) | $0.08 (Flash) | $1.25 (Pro) |
| Mistral | $0.25 (Small) | $2.70 (Medium) | $8 (Large) |

Best For Each Task


Reasoning/Analysis    → Claude Opus 4.5, o3, Gemini 3 Pro
Code Generation       → Claude Sonnet 4.5, Codestral 2508, GPT-4.1
Fast Responses        → Claude Haiku, GPT-4.1-mini, Gemini Flash
Long Context          → Gemini 2.5 Pro (2M), GPT-4.1 (1M), Claude (200K)
Vision                → GPT-4.1, Claude Sonnet, Gemini 3 Pro
Embeddings            → Voyage code-3, text-embedding-3-small
Voice Synthesis       → Eleven Labs v3/flash, OpenAI TTS
Image Generation      → FLUX.2 Pro, DALL-E 3, SD 3.5
Video Generation      → Stable Video 4D 2.0, Runway
Image Editing         → FLUX Kontext, gpt-image-1

Environment Variables Template


bash
# .env.example (NEVER commit actual keys)

# LLMs
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AI...
MISTRAL_API_KEY=...

# Media
ELEVENLABS_API_KEY=...
REPLICATE_API_TOKEN=r8_...
STABILITY_API_KEY=sk-...

# Embeddings
VOYAGE_API_KEY=pa-...
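Whichever subset of these keys an application uses, it is better to validate them at startup than to fail on the first API call. A minimal sketch; the required-key list is an example to trim to your providers:

```typescript
// Fail fast at startup if required API keys are absent or blank.
// The key names mirror the template above; trim to the providers you use.
function missingKeys(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((k) => !(env[k] ?? '').trim());
}

// Example startup check (illustrative key list):
// const missing = missingKeys(process.env, ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY']);
// if (missing.length) throw new Error(`Missing env vars: ${missing.join(', ')}`);
```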

Model Update Checklist


When models update:
□ Check official changelog/blog
□ Update model ID strings
□ Test with existing prompts
□ Compare output quality
□ Check pricing changes
□ Update context limits if changed


Sources
