google-gemini-api


Google Gemini API - Complete Guide

Version: 3.0.0 (14 Known Issues Added)
Package: @google/genai@1.35.0 (⚠️ NOT @google/generative-ai)
Last Updated: 2026-01-21

⚠️ CRITICAL SDK MIGRATION WARNING

DEPRECATED SDK: `@google/generative-ai` (sunset November 30, 2025)
CURRENT SDK: `@google/genai` v1.27+
If you see code using `@google/generative-ai`, it's outdated!
This skill uses the correct current SDK and provides a complete migration guide.


Status

✅ Phase 1 Complete:
  • ✅ Text Generation (basic + streaming)
  • ✅ Multimodal Inputs (images, video, audio, PDFs)
  • ✅ Function Calling (basic + parallel execution)
  • ✅ System Instructions & Multi-turn Chat
  • ✅ Thinking Mode Configuration
  • ✅ Generation Parameters (temperature, top-p, top-k, stop sequences)
  • ✅ Both Node.js SDK (@google/genai) and fetch approaches
✅ Phase 2 Complete:
  • ✅ Context Caching (cost optimization with TTL-based caching)
  • ✅ Code Execution (built-in Python interpreter and sandbox)
  • ✅ Grounding with Google Search (real-time web information + citations)
📦 Separate Skills:
  • Embeddings: See `google-gemini-embeddings` skill for text-embedding-004


Table of Contents

Quick Start

Installation

CORRECT SDK:
```bash
npm install @google/genai@1.35.0
```
❌ WRONG (DEPRECATED):
```bash
npm install @google/generative-ai  # DO NOT USE!
```

Environment Setup

```bash
export GEMINI_API_KEY="..."
```
Or create a `.env` file:
```
GEMINI_API_KEY=...
```
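A missing key is usually only noticed when the first request fails with an auth error. A small startup guard can fail fast instead. This is a sketch; `requireEnv` is a hypothetical helper, not part of the SDK:

```typescript
// Hypothetical helper: fail fast if a required environment variable is
// missing or empty, rather than constructing a client with an undefined key.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage:
// const apiKey = requireEnv('GEMINI_API_KEY');
// const ai = new GoogleGenAI({ apiKey });
```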

First Text Generation (Node.js SDK)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
```

First Text Generation (Fetch - Cloudflare Workers)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```


Current Models (2025)

Gemini 3 Series (December 2025)

gemini-3-flash

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Status: 🆕 Generally Available (December 2025)
  • Description: Google's fastest and most efficient Gemini 3 model for production workloads
  • Best for: High-throughput applications, low-latency responses, cost-sensitive production
  • Features: Enhanced multimodal, function calling, streaming, thinking mode
  • Benchmark Performance: Matches gemini-2.5-pro quality at gemini-2.5-flash speed/cost
  • Recommended for: Production use cases requiring speed + quality balance

gemini-3-pro-preview

  • Context: TBD (documentation pending)
  • Status: Preview release (November 18, 2025)
  • Description: Google's newest and most intelligent AI model with state-of-the-art reasoning
  • Best for: Most complex reasoning tasks, advanced multimodal understanding, benchmark-critical applications
  • Features: Enhanced multimodal (text, image, video, audio, PDF), function calling, streaming
  • Benchmark Performance: Outperforms Gemini 2.5 Pro on every major AI benchmark
  • ⚠️ Preview Models Warning: Preview models have NO SLAs and can change or be deprecated with little notice. Use GA (generally available) models for production. See Issue #13

Gemini 2.5 Series (General Availability - Stable)

gemini-2.5-pro

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: State-of-the-art thinking model for complex reasoning
  • Best for: Code, math, STEM, complex problem-solving
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Best price-performance workhorse model
  • Best for: Large-scale processing, low-latency, high-volume, agentic use cases
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash-lite

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Cost-optimized, fastest 2.5 model
  • Best for: High throughput, cost-sensitive applications
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

Model Feature Matrix

| Feature | 3-Flash | 3-Pro (Preview) | 2.5-Pro | 2.5-Flash | 2.5-Flash-Lite |
|---|---|---|---|---|---|
| Thinking Mode | ✅ Default ON | TBD | ✅ Default ON | ✅ Default ON | ✅ Default ON |
| Function Calling | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multimodal | ✅ Enhanced | ✅ Enhanced | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ | ✅ | ✅ |
| Context Window | 1,048,576 in | TBD | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | 65,536 max | TBD | 65,536 max | 65,536 max | 65,536 max |
| Status | GA | Preview | Stable | Stable | Stable |

⚠️ Context Window Correction

ACCURATE (Gemini 2.5): Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (previous generation) had a 2M-token context window
GEMINI 3: Context window specifications pending official documentation
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.


SDK vs Fetch Approaches

Node.js SDK (@google/genai)

Pros:
  • Type-safe with TypeScript
  • Easier API (simpler syntax)
  • Built-in chat helpers
  • Automatic SSE parsing for streaming
  • Better error handling
Cons:
  • Requires Node.js or compatible runtime
  • Larger bundle size
  • May not work in all edge runtimes
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility

Fetch-based (Direct REST API)

Pros:
  • Works in any JavaScript environment (Cloudflare Workers, Deno, Bun, browsers)
  • Minimal dependencies
  • Smaller bundle size
  • Full control over requests
Cons:
  • More verbose syntax
  • Manual SSE parsing for streaming
  • No built-in chat helpers
  • Manual error handling
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes


Text Generation

Basic Text Generation (SDK)

```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about artificial intelligence'
});

console.log(response.text);
```

Basic Text Generation (Fetch)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'Write a haiku about artificial intelligence' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Response Structure

```typescript
{
  text: string,                  // Convenience accessor for text content
  candidates: [
    {
      content: {
        parts: [
          { text: string }       // Generated text
        ],
        role: string             // "model"
      },
      finishReason: string,      // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
      index: number
    }
  ],
  usageMetadata: {
    promptTokenCount: number,
    candidatesTokenCount: number,
    totalTokenCount: number
  }
}
```

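Given this response shape, a small helper can join the text parts and surface truncation instead of silently returning partial output. This is a sketch built on the structure above; `extractText` is a hypothetical name, not an SDK function:

```typescript
// Minimal response types matching the structure documented above.
interface GeminiCandidate {
  content: { parts: { text?: string }[]; role: string };
  finishReason: string;
  index: number;
}
interface GeminiResponse {
  candidates: GeminiCandidate[];
}

// Join all text parts of the first candidate; flag MAX_TOKENS truncation,
// which means the model hit the output-token limit mid-generation.
function extractText(response: GeminiResponse): { text: string; truncated: boolean } {
  const candidate = response.candidates?.[0];
  if (!candidate) throw new Error('No candidates in response');
  const text = candidate.content.parts.map(p => p.text ?? '').join('');
  return { text, truncated: candidate.finishReason === 'MAX_TOKENS' };
}
```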

Streaming

Streaming with SDK (Async Iteration)

```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a 200-word story about time travel'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

Streaming with Fetch (SSE Parsing)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
    }),
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';

  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;

    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates[0]?.content?.parts[0]?.text;
      if (text) {
        process.stdout.write(text);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
```
Key Points:
  • Use the `streamGenerateContent` endpoint (not `generateContent`)
  • Parse Server-Sent Events (SSE) format: `data: {json}\n\n`
  • Handle incomplete chunks in the buffer
  • Skip empty lines and `[DONE]` markers

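The buffering logic in the loop above can be factored into a pure function, which makes it unit-testable independent of the network. A sketch; `drainSSEBuffer` is a hypothetical helper name:

```typescript
// Consume complete lines from an SSE buffer: return the text chunks found
// so far plus the unconsumed remainder to carry into the next read.
function drainSSEBuffer(buffer: string): { texts: string[]; rest: string } {
  const lines = buffer.split('\n');
  const rest = lines.pop() ?? ''; // the last line may be an incomplete chunk
  const texts: string[] = [];
  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;
    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates?.[0]?.content?.parts?.[0]?.text;
      if (text) texts.push(text);
    } catch (e) {
      // Skip lines that are not valid JSON
    }
  }
  return { texts, rest };
}
```

In the fetch loop above, each `reader.read()` result would be appended to the buffer, then `drainSSEBuffer` called and its `rest` carried forward.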

Multimodal Inputs

Gemini 2.5 models support text + images + video + audio + PDFs in the same request.

Images (Vision)

SDK Approach

```typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'What is in this image?' },
        {
          inlineData: {
            data: base64Image,
            mimeType: 'image/jpeg'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Fetch Approach

```typescript
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'What is in this image?' },
            {
              inlineData: {
                data: base64Image,
                mimeType: 'image/jpeg'
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```
Supported Image Formats:
  • JPEG (`.jpg`, `.jpeg`)
  • PNG (`.png`)
  • WebP (`.webp`)
  • HEIC (`.heic`)
  • HEIF (`.heif`)
Max Image Size: 20MB per image
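Picking the right `inlineData.mimeType` by hand is easy to get wrong. A hedged convenience sketch that maps a filename to one of the supported image MIME types listed above (`imageMimeType` is a hypothetical helper, not part of the SDK):

```typescript
// Map file extensions to the MIME types accepted for inlineData,
// based on the supported image formats listed above.
const IMAGE_MIME_TYPES: Record<string, string> = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.webp': 'image/webp',
  '.heic': 'image/heic',
  '.heif': 'image/heif',
};

function imageMimeType(filename: string): string {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase();
  const mime = IMAGE_MIME_TYPES[ext];
  if (!mime) throw new Error(`Unsupported image format: ${filename}`);
  return mime;
}
```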

Video

```typescript
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe what happens in this video' },
        {
          inlineData: {
            data: base64Video,
            mimeType: 'video/mp4'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Video Formats:
  • MP4 (`.mp4`)
  • MPEG (`.mpeg`)
  • MOV (`.mov`)
  • AVI (`.avi`)
  • FLV (`.flv`)
  • MPG (`.mpg`)
  • WebM (`.webm`)
  • WMV (`.wmv`)
Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use File API for larger files - Phase 2)

Audio

```typescript
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Transcribe and summarize this audio' },
        {
          inlineData: {
            data: base64Audio,
            mimeType: 'audio/mp3'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Audio Formats:
  • MP3 (`.mp3`)
  • WAV (`.wav`)
  • FLAC (`.flac`)
  • AAC (`.aac`)
  • OGG (`.ogg`)
  • OPUS (`.opus`)
Max Audio Size: 20MB

PDFs

```typescript
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize the key points in this PDF' },
        {
          inlineData: {
            data: base64Pdf,
            mimeType: 'application/pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy

Multiple Inputs

You can combine multiple modalities in one request:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Compare these two images and describe the differences:' },
        { inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
        { inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});
```

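Assembling mixed `parts` arrays by hand gets repetitive. A small builder can accept plain strings and pre-encoded attachments and produce the request shape shown above. This is a sketch; `buildParts` and the `Attachment` type are hypothetical, not SDK types:

```typescript
// Hypothetical builder: turn strings and base64 attachments into the
// `parts` array expected by generateContent requests.
type Attachment = { data: string; mimeType: string }; // base64 payload
type Part = { text: string } | { inlineData: Attachment };

function buildParts(inputs: (string | Attachment)[]): Part[] {
  return inputs.map((input) =>
    typeof input === 'string' ? { text: input } : { inlineData: input }
  );
}

// Usage:
// contents: [{ parts: buildParts(['Compare:', img1, img2]) }]
```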

Function Calling

Gemini supports function calling (tool use) to connect models with external APIs and systems.

Basic Function Calling (SDK)

```typescript
import { GoogleGenAI, FunctionCallingConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Define function declarations
const getCurrentWeather = {
  name: 'get_current_weather',
  description: 'Get the current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name, e.g. San Francisco'
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit']
      }
    },
    required: ['location']
  }
};

// Make request with tools
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather in Tokyo?',
  config: {
    tools: [
      { functionDeclarations: [getCurrentWeather] }
    ]
  }
});

// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;

if (functionCall) {
  console.log('Function to call:', functionCall.name);
  console.log('Arguments:', functionCall.args);

  // Execute the function (your implementation)
  const weatherData = await fetchWeather(functionCall.args.location);

  // Send function result back to model
  const finalResponse = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      'What\'s the weather in Tokyo?',
      response.candidates[0].content, // Original assistant response with function call
      {
        parts: [
          {
            functionResponse: {
              name: functionCall.name,
              response: weatherData
            }
          }
        ]
      }
    ],
    config: {
      tools: [
        { functionDeclarations: [getCurrentWeather] }
      ]
    }
  });

  console.log(finalResponse.text);
}
```

Function Calling (Fetch)

```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'What\'s the weather in Tokyo?' }] }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_current_weather',
              description: 'Get the current weather for a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'City name'
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;

if (functionCall) {
  // Execute function and send result back (same flow as SDK)
}
```

Parallel Function Calling

Gemini can call multiple independent functions simultaneously:
```typescript
const tools = [
  {
    functionDeclarations: [
      {
        name: 'get_weather',
        description: 'Get weather for a location',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      },
      {
        name: 'get_population',
        description: 'Get population of a city',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            city: { type: 'string' }
          },
          required: ['city']
        }
      }
    ]
  }
];

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather and population of Tokyo?',
  config: { tools }
});

// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
  part => part.functionCall
);

console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
```
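One way to service parallel calls is to dispatch each `functionCall` to a local handler, run them concurrently with `Promise.all`, and package the results as `functionResponse` parts for the follow-up request. A sketch under those assumptions; `runFunctionCalls` and the handler map are hypothetical, not SDK APIs:

```typescript
// Dispatch each model-requested call to a registered handler and build
// the functionResponse parts to send back, preserving call order.
type FunctionCall = { name: string; args: Record<string, unknown> };
type Handler = (args: Record<string, unknown>) => Promise<unknown>;

async function runFunctionCalls(
  calls: FunctionCall[],
  handlers: Record<string, Handler>
): Promise<{ functionResponse: { name: string; response: unknown } }[]> {
  return Promise.all(
    calls.map(async (call) => {
      const handler = handlers[call.name];
      if (!handler) throw new Error(`No handler registered for ${call.name}`);
      return { functionResponse: { name: call.name, response: await handler(call.args) } };
    })
  );
}
```

The returned array would go into a `{ parts: [...] }` content entry of the follow-up `generateContent` request, as in the basic function-calling example above.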

Function Calling Modes

```typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // Force function call
        // mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
        // mode: FunctionCallingConfigMode.NONE, // Never call functions
        allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
      }
    }
  }
});
```
Modes:
  • AUTO
    (default): Model decides whether to call functions
  • ANY
    : Force model to call at least one function
  • NONE
    : Disable function calling for this request

typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '天气怎么样?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // 强制调用函数
        // mode: FunctionCallingConfigMode.AUTO, // 模型自主决定(默认)
        // mode: FunctionCallingConfigMode.NONE, // 禁止调用函数
        allowedFunctionNames: ['get_current_weather'] // 可选:限制只能调用特定函数
      }
    }
  }
});
模式说明:
  • AUTO
    (默认): 模型决定是否调用函数
  • ANY
    : 强制模型至少调用一个函数
  • NONE
    : 本次请求禁止调用函数

System Instructions

系统指令

System instructions guide the model's behavior and set context. They are separate from the conversation messages.
系统指令用于引导模型行为并设置上下文,与对话消息分开。

SDK Approach

SDK实现方式

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
  },
  contents: 'Explain what a database is'
});

console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的AI助手,说话风格要像海盗,使用航海术语,句子结尾要加"arrr"。'
  },
  contents: '解释什么是数据库'
});

console.log(response.text);
// 输出: "Ahoy there! A database be like a treasure chest... arrr"

Fetch Approach

Fetch实现方式

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
        ]
      },
      contents: [
        { parts: [{ text: 'Explain what a database is' }] }
      ]
    }),
  }
);
Key Points:
  • System instructions are NOT part of
    contents
    array
  • They are set once at the top level of the request
  • They persist for the entire conversation (when using multi-turn chat)
  • They don't count as user or model messages

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: '你是一个乐于助人的AI助手,说话风格要像海盗。' }
        ]
      },
      contents: [
        { parts: [{ text: '解释什么是数据库' }] }
      ]
    }),
  }
);
关键点:
  • 系统指令不属于
    contents
    数组
  • 系统指令设置在请求的顶层
  • 在多轮对话中,系统指令会持续生效
  • 系统指令不计入用户或模型消息

Multi-turn Chat

多轮对话

For conversations with history, use the SDK's chat helpers or manually manage conversation state.
对于需要上下文的对话,使用SDK的对话助手或手动管理对话状态。

SDK Chat Helpers (Recommended)

SDK对话助手(推荐)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: 'You are a helpful coding assistant.'
  },
  history: [] // Start empty or with previous messages
});

// Send first message
const response1 = await chat.sendMessage({ message: 'What is TypeScript?' });
console.log('Assistant:', response1.text);

// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage({ message: 'How do I install it?' });
console.log('Assistant:', response2.text);

// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的编程助手。'
  },
  history: [] // 从空对话开始,或传入历史消息
});

// 发送第一条消息
const response1 = await chat.sendMessage({ message: '什么是TypeScript?' });
console.log('助手:', response1.text);

// 发送跟进消息(上下文会自动维护)
const response2 = await chat.sendMessage({ message: '如何安装它?' });
console.log('助手:', response2.text);

// 获取完整对话历史
const history = chat.getHistory();
console.log('完整对话:', history);

Manual Chat Management (Fetch)

手动管理对话(Fetch)

typescript
const conversationHistory = [];

// First turn
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: 'What is TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// Add to history
conversationHistory.push(
  { role: 'user', parts: [{ text: 'What is TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// Second turn (include full history)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: 'How do I install it?' }] }
      ]
    }),
  }
);
Message Roles:
  • user
    : User messages
  • model
    : Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.

typescript
const conversationHistory = [];

// 第一轮对话
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: '什么是TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// 添加到历史记录
conversationHistory.push(
  { role: 'user', parts: [{ text: '什么是TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// 第二轮对话(包含完整历史记录)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: '如何安装它?' }] }
      ]
    }),
  }
);
消息角色:
  • user
    : 用户消息
  • model
    : 助手响应
⚠️ 重要提示: 对话助手是SDK专属功能。使用Fetch时,必须手动管理对话历史。
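With fetch, the history bookkeeping above can be factored into a small helper. This `ChatHistory` class is our own sketch (not part of any SDK); it only builds the `contents` array — the fetch call itself stays exactly as shown above:

```typescript
type Turn = { role: 'user' | 'model'; parts: { text: string }[] };

// Tracks alternating user/model turns and builds the `contents`
// array for the next request.
class ChatHistory {
  private turns: Turn[] = [];

  addUser(text: string): void {
    this.turns.push({ role: 'user', parts: [{ text }] });
  }

  addModel(text: string): void {
    this.turns.push({ role: 'model', parts: [{ text }] });
  }

  // Full history plus the new user message, ready for the request body.
  nextContents(userText: string): Turn[] {
    return [...this.turns, { role: 'user', parts: [{ text: userText }] }];
  }
}
```

After each successful response, call `addUser` and `addModel` with the exchanged texts so the next `nextContents(...)` carries the full conversation.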

Thinking Mode

思考模式

Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
Gemini 2.5模型默认开启思考模式以提升质量。您可以配置思考预算。

Configure Thinking Budget (SDK)

配置思考预算(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Solve this complex math problem: ...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
    }
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '解决这个复杂的数学问题: ...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // 最大思考token数(默认值取决于模型)
    }
  }
});

Configure Thinking Budget (Fetch)

配置思考预算(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '解决这个复杂的数学问题: ...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);

Configure Thinking Level (SDK) - New in v1.30.0

配置思考级别(SDK)- v1.30.0新增

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Solve this complex problem: ...',
  config: {
    thinkingConfig: {
      thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
    }
  }
});
Thinking Levels:
  • LOW
    : Minimal internal reasoning (faster, lower quality)
  • MEDIUM
    : Balanced reasoning (default)
  • HIGH
    : Maximum reasoning depth (slower, higher quality)
Key Points:
  • Thinking mode is always enabled on Gemini 2.5 models (cannot be disabled)
  • Higher thinking budgets allow more internal reasoning (may increase latency)
  • thinkingLevel
    provides simpler control than
    thinkingBudget
    (new in v1.30.0)
  • Default budget varies by model (usually sufficient for most tasks)
  • Only increase budget/level for very complex reasoning tasks

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '解决这个复杂的问题: ...',
  config: {
    thinkingConfig: {
      thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
    }
  }
});
思考级别说明:
  • LOW
    : 最小内部推理(速度快,质量较低)
  • MEDIUM
    : 平衡的推理(默认)
  • HIGH
    : 最大推理深度(速度慢,质量较高)
关键点:
  • Gemini 2.5模型始终开启思考模式(无法关闭)
  • 更高的思考预算允许更多内部推理(可能增加延迟)
  • thinkingLevel
    thinkingBudget
    提供更简单的控制(v1.30.0新增)
  • 默认预算因模型而异(通常足以应对大多数任务)
  • 仅在处理非常复杂的推理任务时才增加预算/级别
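To see what a given budget or level actually cost, inspect `usageMetadata` on the response. The field names below follow the public REST reference (`thoughtsTokenCount` for thinking tokens, billed as output); treat them as an assumption if your SDK version differs:

```typescript
// Subset of usageMetadata as returned by generateContent.
interface UsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number; // visible output tokens
  thoughtsTokenCount?: number;   // internal thinking tokens
}

// Thinking tokens are billed as output, so the billable output is the sum.
function billedOutputTokens(u: UsageMetadata): number {
  return (u.candidatesTokenCount ?? 0) + (u.thoughtsTokenCount ?? 0);
}
```

Usage: `billedOutputTokens(response.usageMetadata ?? {})` after any `generateContent` call.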

Generation Configuration

生成配置

Customize model behavior with generation parameters.
使用生成参数自定义模型行为。

All Configuration Options (SDK)

所有配置选项(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a creative story',
  config: {
    temperature: 0.9,           // Randomness (0.0-2.0, default: 1.0)
    topP: 0.95,                 // Nucleus sampling (0.0-1.0)
    topK: 40,                   // Top-k sampling
    maxOutputTokens: 2048,      // Max tokens to generate
    stopSequences: ['END'],     // Stop generation if these appear
    responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
    candidateCount: 1           // Number of response candidates (usually 1)
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '写一个创意故事',
  config: {
    temperature: 0.9,           // 随机性(0.0-2.0,默认值:1.0)
    topP: 0.95,                 // 核采样(0.0-1.0)
    topK: 40,                   // Top-k采样
    maxOutputTokens: 2048,      // 最大生成token数
    stopSequences: ['END'],     // 如果出现这些序列则停止生成
    responseMimeType: 'text/plain', // 或使用'application/json'开启JSON模式
    candidateCount: 1           // 响应候选数(通常为1)
  }
});

All Configuration Options (Fetch)

所有配置选项(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a creative story' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '写一个创意故事' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);

Parameter Guidelines

参数指南

Parameter       | Range   | Default   | Use Case
temperature     | 0.0-2.0 | 1.0       | Lower = more focused, higher = more creative
topP            | 0.0-1.0 | 0.95      | Nucleus sampling threshold
topK            | 1-100+  | 40        | Limit to top K tokens
maxOutputTokens | 1-65536 | Model max | Control response length
stopSequences   | Array   | None      | Stop generation at specific strings
Tips:
  • For factual tasks: Use low temperature (0.0-0.3)
  • For creative tasks: Use high temperature (0.7-1.5)
  • topP and topK both control randomness; use one or the other (not both)
  • Always set maxOutputTokens to prevent excessive generation

参数            | 范围    | 默认值     | 适用场景
temperature     | 0.0-2.0 | 1.0        | 值越低越聚焦,值越高越有创意
topP            | 0.0-1.0 | 0.95       | 核采样阈值
topK            | 1-100+  | 40         | 限制仅考虑前K个token
maxOutputTokens | 1-65536 | 模型最大值 | 控制响应长度
stopSequences   | 数组    | 无         | 当出现指定序列时停止生成
提示:
  • 对于事实性任务: 使用低temperature(0.0-0.3)
  • 对于创意任务: 使用高temperature(0.7-1.5)
  • topPtopK都用于控制随机性,使用其中一个即可(不要同时使用)
  • 始终设置maxOutputTokens以避免过度生成
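The ranges in the table above can be checked before a request goes out. A small sketch (the helper name is ours, not an SDK function):

```typescript
// Checks a generationConfig against the documented ranges; returns
// a list of problems (empty when the config is valid).
function validateGenerationConfig(cfg: {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
}): string[] {
  const errors: string[] = [];
  if (cfg.temperature !== undefined && (cfg.temperature < 0 || cfg.temperature > 2)) {
    errors.push('temperature must be in [0.0, 2.0]');
  }
  if (cfg.topP !== undefined && (cfg.topP < 0 || cfg.topP > 1)) {
    errors.push('topP must be in [0.0, 1.0]');
  }
  if (cfg.topK !== undefined && cfg.topK < 1) {
    errors.push('topK must be >= 1');
  }
  if (cfg.maxOutputTokens !== undefined && cfg.maxOutputTokens < 1) {
    errors.push('maxOutputTokens must be >= 1');
  }
  return errors;
}
```

Calling this before `generateContent` turns an opaque API 400 into an actionable local error message.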

Context Caching

上下文缓存

Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and improve latency.
上下文缓存允许您缓存频繁使用的内容(如系统指令、大型文档或视频文件),可降低高达90%的成本并减少延迟。

How It Works

工作原理

  1. Create a cache with your repeated content
  2. Reference the cache in subsequent requests
  3. Save tokens - cached tokens cost significantly less
  4. TTL management - caches expire after specified time
  1. 创建缓存:将重复使用的内容存入缓存
  2. 引用缓存:在后续请求中引用该缓存
  3. 节省token:缓存的token成本远低于普通token
  4. TTL管理:缓存会在指定时间后过期

Benefits

优势

  • Cost savings: Up to 90% reduction on cached tokens
  • Reduced latency: Faster responses by reusing processed content
  • Consistent context: Same large context across multiple requests
  • 成本节约:缓存的输入token比普通token便宜约90%
  • 延迟降低:通过复用已处理内容提升响应速度
  • 上下文一致:在多个请求中保持相同的大型上下文

Cache Creation (SDK)

创建缓存(SDK)

typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // Identifier for the cache
    systemInstruction: 'You are an expert at analyzing legal documents.',
    contents: documentText,
    ttl: '3600s', // Cache for 1 hour
  }
});

console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 为大型文档创建缓存
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // 缓存标识
    systemInstruction: '你是一名法律文档分析专家。',
    contents: documentText,
    ttl: '3600s', // 缓存1小时
  }
});

console.log('缓存已创建:', cache.name);
console.log('过期时间:', cache.expireTime);

Cache Creation (Fetch)

创建缓存(Fetch)

typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: 'You are an expert at analyzing legal documents.' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('Cache created:', cache.name);
typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: '你是一名法律文档分析专家。' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('缓存已创建:', cache.name);

Using a Cache (SDK)

使用缓存(SDK)

typescript
// Generate content using the cache
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Same model the cache was created with
  contents: 'Summarize the key points in the document',
  config: { cachedContent: cache.name } // Reference the cache by name
});

console.log(response.text);
typescript
// 使用缓存生成内容
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // 与创建缓存时相同的模型
  contents: '总结文档的关键点',
  config: { cachedContent: cache.name } // 通过名称引用缓存
});
});

console.log(response.text);

Using a Cache (Fetch)

使用缓存(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // e.g. 'cachedContents/abc123'
      contents: [
        { parts: [{ text: 'Summarize the key points in the document' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // 例如 'cachedContents/abc123'
      contents: [
        { parts: [{ text: '总结文档的关键点' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

Update Cache TTL (SDK)

更新缓存TTL(SDK)

typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // Extend to 2 hours
  }
});
typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // 延长至2小时
  }
});

Update Cache with Expiration Time (SDK)

使用过期时间更新缓存(SDK)

typescript
// Set a specific expiration time (RFC 3339 timestamp)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});
typescript
// 设置具体的过期时间(RFC 3339时间戳)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});

List and Delete Caches (SDK)

列出和删除缓存(SDK)

typescript
// List all caches
const caches = await ai.caches.list();
for (const cache of caches) {
  console.log(cache.name, cache.displayName);
}

// Delete a specific cache
await ai.caches.delete({ name: cache.name });
typescript
// 列出所有缓存
const caches = await ai.caches.list();
for (const cache of caches) {
  console.log(cache.name, cache.displayName);
}

// 删除指定缓存
await ai.caches.delete({ name: cache.name });

Caching with Video Files

视频文件缓存

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Upload video file (the Node SDK accepts a file path)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// Wait for processing
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// Create cache with video
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: 'You are an expert video analyzer.',
    contents: [
      { parts: [{ fileData: { fileUri: videoFile.uri, mimeType: videoFile.mimeType } }] }
    ],
    ttl: '300s' // 5 minutes
  }
});

// Use cache for multiple queries
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What happens in the first minute?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Describe the main characters',
  config: { cachedContent: cache.name }
});
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 上传视频文件(Node SDK接受文件路径)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// 等待处理完成
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// 创建包含视频的缓存
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: '你是一名专业的视频分析师。',
    contents: [
      { parts: [{ fileData: { fileUri: videoFile.uri, mimeType: videoFile.mimeType } }] }
    ],
    ttl: '300s' // 缓存5分钟
  }
});

// 使用缓存进行多次查询
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '视频第一分钟发生了什么?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '描述主要角色',
  config: { cachedContent: cache.name }
});

Key Points

关键点

When to Use Caching:
  • Large system instructions used repeatedly
  • Long documents analyzed multiple times
  • Video/audio files queried with different prompts
  • Consistent context across conversation sessions
TTL Guidelines:
  • Short sessions: 300s (5 min) to 3600s (1 hour)
  • Long sessions: 3600s (1 hour) to 86400s (24 hours)
  • Maximum: 7 days
Cost Savings:
  • Cached input tokens: ~90% cheaper than regular tokens
  • Output tokens: Same price (not cached)
Important:
  • Some model names require an explicit version suffix for caching (e.g.
    gemini-1.5-flash-001
    rather than
    gemini-1.5-flash
    ); check the model list if cache creation is rejected
  • Caches are automatically deleted after TTL expires
  • Update TTL before expiration to extend cache lifetime

何时使用缓存:
  • 重复使用的大型系统指令
  • 需要多次分析的长文档
  • 需要用不同查询提问的视频/音频文件
  • 跨对话会话的一致上下文
TTL指南:
  • 短会话:300s(5分钟)至3600s(1小时)
  • 长会话:3600s(1小时)至86400s(24小时)
  • 最大值:7天
成本节约:
  • 缓存的输入token:比普通token便宜约90%
  • 输出token:价格不变(不缓存)
重要提示:
  • 部分模型名称在使用缓存时需要明确的版本后缀(例如:
    gemini-1.5-flash-001
    ,而不是
    gemini-1.5-flash
    );如果创建缓存被拒绝,请检查模型列表
  • 缓存会在TTL过期后自动删除
  • 在过期前更新TTL以延长缓存生命周期
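The TTL format and the ~90% figure above lend themselves to two small helpers. Both are our own sketches: the 0.1 multiplier is the document's approximation of the cached-token discount, and real pricing also includes a per-hour cache storage fee that is not modeled here:

```typescript
// Formats a TTL string the caching API accepts: whole seconds plus an 's'
// suffix, clamped to the documented 7-day maximum.
function ttlSeconds(seconds: number): string {
  const MAX_TTL = 7 * 24 * 3600; // 7 days
  return `${Math.min(Math.max(1, Math.floor(seconds)), MAX_TTL)}s`;
}

// Rough input-token cost for a prompt of `tokens` reused across `requests`,
// assuming cached tokens cost ~10% of the regular per-million-token rate.
function estimateInputCost(
  tokens: number,
  requests: number,
  pricePerMTokUSD: number,
  cached: boolean
): number {
  const rate = (cached ? 0.1 : 1) * pricePerMTokUSD;
  return (tokens * requests * rate) / 1_000_000;
}
```

For example, a 100k-token document queried 50 times drops from roughly `estimateInputCost(100000, 50, rate, false)` to a tenth of that when cached, which is the break-even intuition behind the "when to use caching" list above.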

Code Execution

代码执行

Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Gemini模型可以生成并执行Python代码,解决需要计算、数据分析或可视化的问题。

How It Works

工作原理

  1. Model generates executable Python code
  2. Code runs in secure sandbox
  3. Results are returned to the model
  4. Model incorporates results into response
  1. 模型生成可执行的Python代码
  2. 代码在安全沙箱中运行
  3. 结果返回给模型
  4. 模型将结果整合到响应中

Supported Operations

支持的操作

  • Mathematical calculations
  • Data analysis and statistics
  • File processing (CSV, JSON, etc.)
  • Chart and graph generation
  • Algorithm implementation
  • Data transformations
  • 数学计算
  • 数据分析和统计
  • 文件处理(CSV、JSON等)
  • 图表和图形生成
  • 算法实现
  • 数据转换

Available Python Packages

可用的Python包

Standard Library:
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
Data Science:
  • numpy
    ,
    pandas
    ,
    scipy
Visualization:
  • matplotlib
    ,
    seaborn
Note: Limited package availability compared to full Python environment
标准库:
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
数据科学:
  • numpy
    ,
    pandas
    ,
    scipy
可视化:
  • matplotlib
    ,
    seaborn
注意: 与完整Python环境相比,可用包有限

Basic Code Execution (SDK)

基础代码执行(SDK)

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Parse response parts
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Generated Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Execution Output:', part.codeExecutionResult.output);
  }
}
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '前50个质数的和是多少?生成并运行计算代码。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 解析响应部分
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('生成的代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('执行输出:', part.codeExecutionResult.output);
  }
}

Basic Code Execution (Fetch)

基础代码执行(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Result:', part.codeExecutionResult.output);
  }
}
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: '前50个质数的和是多少?生成并运行计算代码。' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('结果:', part.codeExecutionResult.output);
  }
}

Chat with Code Execution (SDK)

带代码执行的对话(SDK)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);

response = await chat.sendMessage({
  message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});

// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: '我有一个数学问题想请教你。' });
console.log(response.text);

response = await chat.sendMessage({
  message: '计算斐波那契数列的前20项并求和。'
});

// 模型会生成并执行代码,然后给出答案
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('输出:', part.codeExecutionResult.output);
}

Data Analysis Example

数据分析示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    Analyze this sales data and calculate:
    1. Total revenue
    2. Average sale price
    3. Best-selling month

    Data (CSV format):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    分析这份销售数据并计算:
    1. 总营收
    2. 平均售价
    3. 最畅销的月份

    数据(CSV格式):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型会生成pandas/numpy代码来分析数据
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('分析代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('结果:', part.codeExecutionResult.output);
}

Visualization Example

可视化示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
  if (part.codeExecutionResult) {
    console.log('Execution output:', part.codeExecutionResult.output);
  }
  if (part.inlineData) {
    // Chart images come back as separate inline image parts (base64-encoded)
    console.log('Chart image received:', part.inlineData.mimeType);
  }
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '创建一个柱状图,展示100以内质数的末位数字分布。生成图表并描述模式。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型生成matplotlib代码,执行后描述结果
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('图表代码:', part.executableCode.code);
  if (part.codeExecutionResult) {
    console.log('执行输出:', part.codeExecutionResult.output);
  }
  if (part.inlineData) {
    // 图表图片会作为单独的内联图片部分返回(base64编码)
    console.log('收到图表图片:', part.inlineData.mimeType);
  }
}

Response Structure

响应结构

typescript
{
  candidates: [
    {
      content: {
        parts: [
          { text: "I'll calculate that for you." },
          {
            executableCode: {
              language: "PYTHON",
              code: "def is_prime(n):\n  if n <= 1:\n    return False\n  ..."
            }
          },
          {
            codeExecutionResult: {
              outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
              output: "5117\n"
            }
          },
          { text: "The sum of the first 50 prime numbers is 5117." }
        ]
      }
    }
  ]
}

Error Handling

typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('Code execution failed:', part.codeExecutionResult.output);
    } else {
      console.log('Success:', part.codeExecutionResult.output);
    }
  }
}

Key Points

When to Use Code Execution:
  • Complex mathematical calculations
  • Data analysis and statistics
  • Algorithm implementations
  • File parsing and processing
  • Chart generation
  • Computational problems
Limitations:
  • Sandbox environment (limited file system access)
  • Limited Python package availability
  • Execution timeout limits
  • No network access from code
  • No persistent state between executions
Best Practices:
  • Specify what calculation or analysis you need clearly
  • Request code generation explicitly ("Generate and run code...")
  • Check
    outcome
    field for errors
  • Use for deterministic computations, not for general programming
Important:
  • Available on all Gemini 2.5 models (Pro, Flash, Flash-Lite)
  • Code runs in isolated sandbox for security
  • Supports Python with standard library and common data science packages
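Since every consumer of code execution ends up walking the parts array the same way, the outcome check above can be collected into one small helper. This is a sketch against the response shape shown in this section, not an SDK API; the `Part` interface below is a hand-written subset.

```typescript
// Minimal subset of the response part shape shown above (not an SDK type).
interface Part {
  text?: string;
  executableCode?: { language: string; code: string };
  codeExecutionResult?: { outcome: string; output: string };
}

// Concatenate execution output, throwing if any execution failed,
// so callers never silently consume output from a failed run.
function collectExecutionOutput(parts: Part[]): string {
  const outputs: string[] = [];
  for (const part of parts) {
    if (part.codeExecutionResult) {
      if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
        throw new Error(`Code execution failed: ${part.codeExecutionResult.output}`);
      }
      outputs.push(part.codeExecutionResult.output);
    }
  }
  return outputs.join('');
}

// Usage against the sample response structure from this section:
const sample: Part[] = [
  { text: "I'll calculate that for you." },
  { codeExecutionResult: { outcome: 'OUTCOME_OK', output: '5117\n' } }
];
console.log(collectExecutionOutput(sample)); // "5117\n"
```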

Grounding with Google Search

Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.

How It Works

  1. Model determines if it needs current information
  2. Automatically performs Google Search
  3. Processes search results
  4. Incorporates findings into response
  5. Provides citations and source URLs

Benefits

  • Real-time information: Access to current events and data
  • Reduced hallucinations: Answers grounded in web sources
  • Verifiable: Citations allow fact-checking
  • Up-to-date: Not limited to model's training cutoff

Grounding Options

1. Google Search (googleSearch) - Recommended for Gemini 2.5

typescript
const groundingTool = {
  googleSearch: {}
};
Features:
  • Simple configuration
  • Automatic search when needed
  • Available on all Gemini 2.5 models

2. FileSearch - New in v1.29.0 (Preview)

typescript
const fileSearchTool = {
  fileSearch: {
    fileSearchStoreId: 'store-id-here' // Created via FileSearchStore APIs
  }
};
Features:
  • Search through your own document collections
  • Upload and index custom knowledge bases
  • Alternative to web search for proprietary data
  • Preview feature (requires FileSearchStore setup)
Note: See FileSearch documentation for store creation and management.

3. Google Search Retrieval (googleSearchRetrieval) - Legacy (Gemini 1.5)

typescript
const retrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: 'MODE_DYNAMIC',
      dynamicThreshold: 0.7 // Only search if confidence < 70%
    }
  }
};
Features:
  • Dynamic threshold control
  • Used with Gemini 1.5 models
  • More configuration options

Basic Grounding (SDK) - Gemini 2.5

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Who won the euro 2024?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
  console.log('Search was performed!');
  console.log('Sources:', response.candidates[0].groundingMetadata);
}

Basic Grounding (Fetch) - Gemini 2.5

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'Who won the euro 2024?' }] }
      ],
      tools: [
        { google_search: {} }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

if (data.candidates[0].groundingMetadata) {
  console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}

Dynamic Retrieval (SDK) - Gemini 1.5

typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'Who won the euro 2024?',
  config: {
    tools: [
      {
        googleSearchRetrieval: {
          dynamicRetrievalConfig: {
            mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
            dynamicThreshold: 0.7 // Search only if confidence < 70%
          }
        }
      }
    ]
  }
});

console.log(response.text);

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (high confidence)');
}

Grounding Metadata Structure

typescript
{
  groundingMetadata: {
    searchQueries: [
      { text: "euro 2024 winner" }
    ],
    webPages: [
      {
        url: "https://example.com/euro-2024-results",
        title: "UEFA Euro 2024 Final Results",
        snippet: "Spain won UEFA Euro 2024..."
      }
    ],
    citations: [
      {
        startIndex: 42,
        endIndex: 47,
        uri: "https://example.com/euro-2024-results"
      }
    ],
    retrievalQueries: [
      {
        query: "who won euro 2024 final"
      }
    ]
  }
}

Chat with Grounding (SDK)

typescript
const chat = await ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

let response = await chat.sendMessage('What are the latest developments in quantum computing?');
console.log(response.text);

// Check grounding sources
if (response.candidates[0].groundingMetadata) {
  const sources = response.candidates[0].groundingMetadata.webPages || [];
  console.log(`Sources used: ${sources.length}`);
  sources.forEach(source => {
    console.log(`- ${source.title}: ${source.url}`);
  });
}

// Follow-up still has grounding enabled
response = await chat.sendMessage('Which company made the biggest breakthrough?');
console.log(response.text);

Combining Grounding with Function Calling

typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather like in the city that won Euro 2024?',
  config: {
    tools: [
      { googleSearch: {} },
      { functionDeclarations: [weatherFunction] }
    ]
  }
});

// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in response

Checking if Grounding was Used

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is 2+2?', // Model knows this without search
  config: {
    tools: [{ googleSearch: {} }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (no search needed)');
} else {
  console.log('Search was performed');
}

Key Points

When to Use Grounding:
  • Current events and news
  • Real-time data (stock prices, sports scores, weather)
  • Fact-checking and verification
  • Questions about recent developments
  • Information beyond model's training cutoff
When NOT to Use:
  • General knowledge questions
  • Mathematical calculations
  • Code generation
  • Creative writing
  • Tasks requiring internal reasoning only
Cost Considerations:
  • Grounding adds latency (search takes time)
  • Additional token costs for retrieved content
  • Use dynamicThreshold to control when searches happen (Gemini 1.5)
Important Notes:
  • Grounding requires Google Cloud project (not just API key)
  • Search results quality depends on query phrasing
  • Citations may not cover all facts in response
  • Search is performed automatically based on confidence
Gemini 2.5 vs 1.5:
  • Gemini 2.5: Use googleSearch (simple, recommended)
  • Gemini 1.5: Use googleSearchRetrieval with dynamicThreshold
Best Practices:
  • Always check groundingMetadata to see if search was used
  • Display citations to users for transparency
  • Use specific, well-phrased questions for better search results
  • Combine with function calling for hybrid workflows
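To make the "display citations to users" best practice concrete, here is a small formatting helper. It assumes the webPages shape shown in the metadata structure above; these interfaces are illustrative, not official SDK types, and field names may differ across SDK versions, so treat this as a sketch.

```typescript
// Illustrative shapes based on the groundingMetadata structure above
// (not official SDK types; verify field names against your SDK version).
interface WebPage { url: string; title: string; snippet?: string }
interface GroundingMetadata { webPages?: WebPage[] }

// Turn grounding sources into numbered citation lines for display.
// Returns an empty list when no search was performed.
function formatSources(metadata?: GroundingMetadata): string[] {
  if (!metadata?.webPages) return [];
  return metadata.webPages.map((p, i) => `[${i + 1}] ${p.title} - ${p.url}`);
}

const lines = formatSources({
  webPages: [{ url: 'https://example.com/euro-2024-results', title: 'UEFA Euro 2024 Final Results' }]
});
console.log(lines.join('\n'));
// [1] UEFA Euro 2024 Final Results - https://example.com/euro-2024-results
```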

Known Issues Prevention

This skill prevents 14 documented issues:

Issue #1: Multi-byte Character Corruption in Streaming

Error: Garbled text or � symbols when streaming responses with non-English text
Source: GitHub Issue #764
Why It Happens: The TextDecoder converts chunks to strings without the {stream: true} option. Multi-byte UTF-8 characters (Chinese, Japanese, Korean, emoji) split across chunks create invalid strings.
Prevention:
typescript
// The SDK already fixes this, but if implementing custom streaming:
const decoder = new TextDecoder();
const { value } = await reader.read();
const text = decoder.decode(value, { stream: true }); // ← stream: true required
Affected: All non-English languages using multi-byte characters
Status: Fixed in SDK, but documented for custom implementations
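For custom streaming implementations, the fix extends to a final flush: calling decode() with no arguments after the stream ends emits any buffered partial character. A self-contained sketch using the standard ReadableStream/TextDecoder web APIs (no SDK involved):

```typescript
// Decode a byte stream to text without corrupting multi-byte characters:
// stream: true buffers partial sequences, and a final decode() flushes them.
async function readAll(reader: ReadableStreamDefaultReader<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  let text = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
  }
  text += decoder.decode(); // flush any trailing partial character
  return text;
}

// Demonstration: split a 2-character Chinese string mid-character.
const bytes = new TextEncoder().encode('你好'); // 6 bytes, 3 per character
const stream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(bytes.slice(0, 4)); // ends in the middle of 好
    controller.enqueue(bytes.slice(4));
    controller.close();
  }
});
readAll(stream.getReader()).then(text => console.log(text)); // 你好
```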

Issue #2: Safety Settings Method Parameter Not Supported

Error: "method parameter is not supported in Gemini API"
Source: GitHub Issue #810
Why It Happens: The method parameter in safetySettings only works with the Vertex AI Gemini API, not the Gemini Developer API or Google AI Studio. The SDK allows passing it without validation.
Prevention:
typescript
// ❌ WRONG - Fails with Gemini Developer API:
config: {
  safetySettings: [{
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    method: HarmBlockMethod.SEVERITY // Not supported!
  }]
}

// ✅ CORRECT - Omit 'method' for Gemini Developer API:
config: {
  safetySettings: [{
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
    // No 'method' field
  }]
}
Affected: Gemini Developer API and Google AI Studio users
Status: Known limitation, use Vertex AI if you need the method parameter

Issue #3: Safety Settings Have Model-Specific Thresholds

Error: Content passes through despite strict safety settings, or safetyRatings shows NEGLIGIBLE with empty output
Source: GitHub Issue #872
Why It Happens: Different models have different blocking thresholds. gemini-2.5-flash blocks more strictly than gemini-2.0-flash. Additionally, promptFeedback only appears when INPUT is blocked; if the model generates a refusal message, safetyRatings may show NEGLIGIBLE.
Prevention:
typescript
// Check BOTH promptFeedback AND empty response:
if (response.candidates[0].finishReason === 'SAFETY' ||
    !response.text || response.text.trim() === '') {
  console.log('Content blocked or refused');
}

// Be aware: Different models have different thresholds
// gemini-2.5-flash: Lower threshold (stricter blocking)
// gemini-2.0-flash: Higher threshold (more permissive)
Affected: All models when using safety settings
Status: Known behavior, model-specific thresholds are by design

Issue #4: FunctionCallingConfigMode.ANY Causes Infinite Loop

Error: Model loops forever calling tools and never returns a text response
Source: GitHub Issue #908
Why It Happens: When FunctionCallingConfigMode.ANY is set with automatic function calling (CallableTool), the model is forced to call at least one tool on every turn and cannot stop, looping until the max invocations limit.
Prevention:
typescript
// ❌ WRONG - Loops forever:
config: {
  toolConfig: {
    functionCallingConfig: {
      mode: FunctionCallingConfigMode.ANY // Forces tool calls forever
    }
  }
}

// ✅ CORRECT - Use AUTO mode (model decides):
config: {
  toolConfig: {
    functionCallingConfig: {
      mode: FunctionCallingConfigMode.AUTO // Model can choose to answer directly
    }
  }
}

// Or use manual function calling (check for functionCall, execute, send back)
Affected: Automatic function calling with CallableTool
Status: Known limitation, use AUTO mode or manual function calling

Issue #5: Structured Output Doesn't Preserve Escaped Backslashes (Gemini 3)

Error: JSON.parse fails on structured output, or keys with backslashes are incorrect
Source: GitHub Issue #1226
Why It Happens: When using responseMimeType: "application/json" with schema keys containing escaped backslashes (e.g., \\a for the key \a), the model output doesn't preserve JSON escaping. It emits a single backslash, causing invalid JSON.
Prevention:
typescript
// Avoid using backslashes in JSON schema keys
// Or manually post-process if required:
let jsonText = response.text;
// Add custom escaping logic if needed
Affected: Gemini 3 models with structured output using backslashes in keys
Status: Known issue, workaround required
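One possible post-processing step (a heuristic of my own, not an official fix): double any backslash that is not already the start of a valid JSON escape before parsing. The helper name and regex below are illustrative, and the heuristic will not repair every case (for example, it does not validate \uXXXX digit counts).

```typescript
// Heuristic repair: a backslash not followed by a valid JSON escape
// character (" \ / b f n r t u) gets doubled so JSON.parse accepts it.
function escapeLoneBackslashes(raw: string): string {
  return raw.replace(/\\(?!["\\\/bfnrtu])/g, '\\\\');
}

const broken = '{"\\a": 1}';                  // the text {"\a": 1} - invalid JSON
const fixed = escapeLoneBackslashes(broken);  // the text {"\\a": 1}
console.log(JSON.parse(fixed));               // { '\\a': 1 }
```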

Issue #6: Large PDFs from S3 Signed URLs Fail with "Document has no pages"

Error: ApiError: {"error":{"code":400,"message":"The document has no pages.","status":"INVALID_ARGUMENT"}}
Source: GitHub Issue #1259
Why It Happens: Larger PDFs (e.g., 20 MB) from AWS S3 signed URLs fail when passed via fileData.fileUri. The API cannot fetch or process the PDF from signed URLs.
Prevention:
typescript
// ❌ WRONG - Fails with large PDFs from S3:
contents: [{
  parts: [{
    fileData: {
      fileUri: 'https://bucket.s3.region.amazonaws.com/file.pdf?X-Amz-Algorithm=...'
    }
  }]
}]

// ✅ CORRECT - Fetch and encode to base64:
const pdfResponse = await fetch(signedUrl);
const pdfBuffer = await pdfResponse.arrayBuffer();
const base64Pdf = Buffer.from(pdfBuffer).toString('base64');

contents: [{
  parts: [{
    inlineData: {
      data: base64Pdf,
      mimeType: 'application/pdf'
    }
  }]
}]
Affected: PDF files from external signed URLs
Status: Known limitation, use base64 inline data instead

Issue #7: 404 NOT_FOUND with Uploaded Video on Gemini 3 Models

Error: 404 NOT_FOUND when using uploaded video files with Gemini 3 models
Source: GitHub Issue #1220
Why It Happens: Some Gemini 3 models (gemini-3-flash-preview, gemini-3-pro-preview) are not available in the free tier or have limited access even with paid accounts. Video file uploads fail with 404.
Prevention:
typescript
// ❌ WRONG - 404 error with Gemini 3:
const response = await ai.models.generateContent({
  model: 'gemini-3-pro-preview', // 404 error
  contents: [{
    parts: [
      { text: 'Describe this video' },
      { fileData: { fileUri: videoFile.uri }}
    ]
  }]
});

// ✅ CORRECT - Use Gemini 2.5 for video understanding:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Works
  contents: [{
    parts: [
      { text: 'Describe this video' },
      { fileData: { fileUri: videoFile.uri }}
    ]
  }]
});
Affected: Gemini 3 preview models with video uploads
Status: Known limitation, use Gemini 2.5 models for video

Issue #8: Batch API Returns 429 Despite Being Under Quota

Error: 429 RESOURCE_EXHAUSTED when using the Batch API, even when under documented quota
Source: GitHub Issue #1264
Why It Happens: The Batch API may apply dynamic rate limiting based on server load, or undocumented limits beyond the static quotas.
Prevention:
typescript
// Implement exponential backoff for Batch API:
async function batchWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.batches.create(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000;
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
Affected: Batch API users on the paid tier
Status: Under investigation, use retry logic

Issue #9: Context Caching Only Works with Gemini 1.5 Models

Error: 404 NOT FOUND when creating caches with Gemini 2.0, 2.5, or 3.0 models
Source: GitHub Issue #339
Why It Happens: Context caching only supports Gemini 1.5 Pro and Gemini 1.5 Flash models. Documentation examples incorrectly show Gemini 2.0+ models.
Prevention:
typescript
// ❌ WRONG - 404 error:
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash', // Not supported
  config: { /* ... */ }
});

// ✅ CORRECT - Use Gemini 1.5 with explicit version:
const cache = await ai.caches.create({
  model: 'gemini-1.5-flash-001', // Explicit version required
  config: { /* ... */ }
});
Affected: All Gemini 2.x and 3.x users trying to use context caching
Status: Known limitation, only Gemini 1.5 models support caching

Issue #10: Structured Output Occasionally Returns Backticks Causing JSON.parse Error

Error: SyntaxError: Unexpected token '`' when parsing JSON responses
Source: GitHub Issue #976 (https://github.com/googleapis/js-genai/issues/976)
Why It Happens: When using responseMimeType: "application/json", the response occasionally includes markdown code fence backticks wrapping the JSON (```json ... ```), breaking JSON.parse().
Prevention:
typescript
// Strip markdown code fences before parsing:
let jsonText = response.text.trim();

if (jsonText.startsWith('```json')) {
  jsonText = jsonText.replace(/^```json\n/, '').replace(/\n```$/, '');
} else if (jsonText.startsWith('```')) {
  jsonText = jsonText.replace(/^```\n/, '').replace(/\n```$/, '');
}

const data = JSON.parse(jsonText);
Affected: All models when using structured output with responseMimeType: "application/json"
Status: Known intermittent issue, workaround required

Issue #11: Gemini 3 Temperature Below 1.0 Causes Looping/Degraded Reasoning

Error: Infinite loops or degraded reasoning quality on complex tasks
Source: Official Troubleshooting Docs
Why It Happens: Gemini 3 models are optimized for temperature 1.0. Lowering temperature below 1.0 may cause looping behavior or degraded performance on complex mathematical/reasoning tasks.
Prevention:
typescript
// ❌ WRONG - May cause issues with Gemini 3:
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Solve this complex math problem: ...',
  config: {
    temperature: 0.3 // May cause looping/degradation
  }
});

// ✅ CORRECT - Keep default temperature:
const response = await ai.models.generateContent({
  model: 'gemini-3-flash',
  contents: 'Solve this complex math problem: ...',
  config: {
    temperature: 1.0 // Recommended for Gemini 3
  }
});
// Or omit temperature config entirely (uses default 1.0)
Affected: Gemini 3 series models
Status: Official recommendation, keep temperature at 1.0

Issue #12: Massive Rate Limit Reductions in December 2025 (Free Tier)

Error: Sudden 429 RESOURCE_EXHAUSTED errors after December 6, 2025
Source: LaoZhang AI Blog | HowToGeek
Why It Happens: Google reduced free tier rate limits by 80-90% without a wide announcement, catching developers off guard.
Changes:
  • Gemini 2.5 Pro: daily requests cut to 100 RPD (was ~250, a ~60% reduction)
  • Gemini 2.5 Flash: ~20 requests per day (was ~250) - 90% reduction
  • Free tier now impractical for production
Prevention:
typescript
// For production, upgrade to paid tier:
// https://ai.google.dev/pricing

// For free tier, implement aggressive rate limiting:
const rateLimiter = {
  requests: 0,
  resetTime: Date.now() + 24 * 60 * 60 * 1000,
  async checkLimit() {
    if (Date.now() > this.resetTime) {
      this.requests = 0;
      this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
    }
    if (this.requests >= 20) {
      throw new Error('Daily limit reached');
    }
    this.requests++;
  }
};

await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
Affected: Free tier users (December 6, 2025 onwards) Status: Permanent change, upgrade to paid tier for production

错误: 2025年12月6日后突然出现429 RESOURCE_EXHAUSTED错误 来源: LaoZhang AI Blog | HowToGeek 原因: Google在未广泛通知的情况下将免费层速率限制最多降低了90%,让开发者措手不及。
变化:
  • Gemini 2.5 Pro: 每日请求数减少约60%(从约250降至100 RPD)
  • Gemini 2.5 Flash: 每日请求数减少90%(从约250降至约20 RPD)
  • 免费层现在不适用于生产环境
预防措施:
typescript
// 生产环境请升级到付费层:
// https://ai.google.dev/pricing

// 免费层请实施严格的速率限制:
const rateLimiter = {
  requests: 0,
  resetTime: Date.now() + 24 * 60 * 60 * 1000,
  async checkLimit() {
    if (Date.now() > this.resetTime) {
      this.requests = 0;
      this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
    }
    if (this.requests >= 20) {
      throw new Error('已达到每日限制');
    }
    this.requests++;
  }
};

await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
影响范围: 2025年12月6日之后的免费层用户 状态: 永久变更,生产环境请升级到付费层

Issue #13: Preview Models Have No SLAs and Can Change Without Warning

问题#13: 预览模型无SLA,可能随时变更

Error: Unexpected behavior changes, deprecation, or service interruptions Source: Arsturn Blog | Official docs Why It Happens: Preview and experimental models (e.g.,
gemini-2.5-flash-preview
,
gemini-3-pro-preview
) have no service level agreements (SLAs) and are inherently unstable. Google can change or deprecate them with little notice.
Prevention:
typescript
// ❌ WRONG - Using preview models in production:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash-preview', // No SLA!
  contents: 'Production traffic'
});

// ✅ CORRECT - Use GA (generally available) models:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Stable, with SLA
  contents: 'Production traffic'
});

// Or use specific version numbers for stability:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash-001', // Pinned version
  contents: 'Production traffic'
});
Affected: Users of preview/experimental models in production Status: Known limitation, use GA models for production

错误: 意外的行为变更、弃用或服务中断 来源: Arsturn Blog | 官方文档 原因: 预览和实验模型(例如
gemini-2.5-flash-preview
,
gemini-3-pro-preview
)无服务级别协议(SLA),本质上不稳定。Google可能随时变更或弃用这些模型,且通知有限。
预防措施:
typescript
// ❌ 错误 - 生产环境使用预览模型:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash-preview', // 无SLA!
  contents: '生产流量'
});

// ✅ 正确 - 使用正式可用(GA)模型:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // 稳定,有SLA
  contents: '生产流量'
});

// 或使用特定版本号以保证稳定性:
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash-001', // 固定版本
  contents: '生产流量'
});
影响范围: 在生产环境中使用预览/实验模型的用户 状态: 已知限制,生产环境请使用GA模型

Issue #14: API Key Leakage Auto-Blocking (Security Enhancement)

问题#14: API密钥泄露自动拦截(安全增强)

Error: "Invalid API key" after accidentally committing key to GitHub Source: AI Free API Blog | Official troubleshooting Why It Happens: Google proactively scans for publicly exposed API keys (e.g., in GitHub repos) and automatically blocks them from accessing the Gemini API as a security measure.
Prevention:
typescript
// Best practices:
// 1. Use .env files (never commit)
// 2. Use environment variables in production
// 3. Rotate keys if exposed
// 4. Use .gitignore:

// .gitignore
.env
.env.local
*.key
Affected: Users who accidentally commit API keys to public repos Status: Security feature, rotate keys if exposed

错误: 意外将密钥提交到GitHub后提示"Invalid API key" 来源: AI Free API Blog | 官方故障排查 原因: Google主动扫描公开暴露的API密钥(例如GitHub仓库中的密钥),并自动阻止这些密钥访问Gemini API,作为安全措施。
预防措施:
typescript
// 最佳实践:
// 1. 使用.env文件(永远不要提交到仓库)
// 2. 生产环境使用环境变量
// 3. 如果密钥泄露,立即轮换
// 4. 使用.gitignore:

// .gitignore
.env
.env.local
*.key
影响范围: 意外将API密钥提交到公共仓库的用户 状态: 安全功能,密钥泄露后请立即轮换

Error Handling

错误处理

Common Errors

常见错误

1. Invalid API Key (401)

1. 无效API密钥(401)

typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
Solution: Verify
GEMINI_API_KEY
environment variable is set correctly.
typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
解决方案: 验证
GEMINI_API_KEY
环境变量是否正确设置。
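Most 401s come from a missing or empty environment variable. A minimal fail-fast check at startup can surface that before the first API call (a sketch; `requireEnv` is a hypothetical helper, not part of @google/genai):

```typescript
// Sketch: fail fast at startup instead of hitting a 401 on the first request.
// `requireEnv` is an illustrative helper, not part of the SDK.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value || value.trim() === '') {
    throw new Error(`${name} is not set - check your environment or .env file`);
  }
  return value;
}

// Usage (assuming the client setup from earlier sections):
// const ai = new GoogleGenAI({ apiKey: requireEnv('GEMINI_API_KEY') });
```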

2. Rate Limit Exceeded (429)

2. 超出速率限制(429)

typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
Solution: Implement exponential backoff retry strategy.
typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
解决方案: 实现指数退避重试策略。

3. Model Not Found (404)

3. 模型未找到(404)

typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
Solution: Use correct model names:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite
typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
解决方案: 使用正确的模型名称:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite

4. Context Length Exceeded (400)

4. 超出上下文长度(400)

typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
解决方案: 减小输入大小。Gemini 2.5模型最大支持1,048,576输入token。
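To avoid paying for a request that will be rejected anyway, a rough client-side size check can run first. The ~4-characters-per-token figure below is only a heuristic for English text, not the real tokenizer, so treat this as a sketch:

```typescript
// Sketch: rough pre-check against the 1,048,576-token input window.
// ~4 chars/token is a heuristic for English text, not the actual tokenizer.
const GEMINI_25_INPUT_LIMIT = 1_048_576;

function roughTokenEstimate(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitsInputWindow(text: string, limit = GEMINI_25_INPUT_LIMIT): boolean {
  return roughTokenEstimate(text) <= limit;
}
```

For exact counts, prefer the token-counting endpoint of the API itself; the heuristic is only a cheap early filter.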

Exponential Backoff Pattern

指数退避模式

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
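One refinement to the pattern above: the fixed 1s/2s/4s schedule can make many clients retry in lockstep after a shared outage. "Full jitter" randomizes each delay up to the exponential cap; a sketch:

```typescript
// Sketch: full-jitter backoff delay - a random wait in [0, base * 2^attempt),
// capped at maxMs. Spreads retries out across clients instead of synchronizing them.
function jitteredDelay(attempt: number, baseMs = 1000, maxMs = 30_000): number {
  const cap = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.random() * cap;
}

// Drop-in replacement for the fixed delay in generateWithRetry:
// const delay = jitteredDelay(i);
```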

Rate Limits

速率限制

⚠️ December 2025 Update - Major Free Tier Reductions

⚠️ 2025年12月更新 - 免费层大幅缩减

CRITICAL: Google reduced free tier limits by up to 90% on December 6-7, 2025 without wide announcement. Free tier is now primarily for prototyping only.
重要提示: Google在2025年12月6-7日未广泛通知的情况下,将免费层限制最多降低了90%。免费层现在主要用于原型开发。

Free Tier (Gemini API) - Current Limits

免费层(Gemini API)- 当前限制

Rate limits vary by model:
Gemini 2.5 Pro:
  • Requests per minute: 5 RPM
  • Tokens per minute: 125,000 TPM
  • Requests per day: 100 RPD (was ~250 before Dec 2025) - ~60% reduction
Gemini 2.5 Flash:
  • Requests per minute: 10 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: ~20 RPD (was ~250 before Dec 2025) - 90% reduction
Gemini 2.5 Flash-Lite:
  • Requests per minute: 15 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: 1,000 RPD (unchanged)
速率限制因模型而异:
Gemini 2.5 Pro:
  • 每分钟请求数: 5 RPM
  • 每分钟token数: 125,000 TPM
  • 每日请求数: 100 RPD(2025年12月前约为250)- 减少约60%
Gemini 2.5 Flash:
  • 每分钟请求数: 10 RPM
  • 每分钟token数: 250,000 TPM
  • 每日请求数: 约20 RPD(2025年12月前约为250)- 减少90%
Gemini 2.5 Flash-Lite:
  • 每分钟请求数: 15 RPM
  • 每分钟token数: 250,000 TPM
  • 每日请求数: 1,000 RPD(无变化)
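The daily limiter shown in Issue #12 ignores the per-minute caps above. A sliding-window throttle can space requests against an RPM budget (a sketch; the class and its API are illustrative, not part of any SDK):

```typescript
// Sketch: sliding-window RPM throttle. Records timestamps of recent calls
// and reports how long to wait before the next call is allowed.
class RpmThrottle {
  private timestamps: number[] = [];
  constructor(private rpm: number) {}

  // Milliseconds to wait before another request fits in the 60s window (0 = go now).
  msUntilNextSlot(now = Date.now()): number {
    this.timestamps = this.timestamps.filter(t => now - t < 60_000);
    if (this.timestamps.length < this.rpm) return 0;
    return 60_000 - (now - this.timestamps[0]);
  }

  record(now = Date.now()): void {
    this.timestamps.push(now);
  }
}

// Usage with the free-tier Flash figure above (10 RPM - may change):
// const throttle = new RpmThrottle(10);
```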

Paid Tier (Tier 1)

付费层(Tier 1)

Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
  • Requests per minute: 150 RPM
  • Tokens per minute: 2,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash:
  • Requests per minute: 1,000 RPM
  • Tokens per minute: 1,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash-Lite:
  • Requests per minute: 4,000 RPM
  • Tokens per minute: 4,000,000 TPM
  • Requests per day: Not specified
需要将结算账户链接到您的Google Cloud项目。
Gemini 2.5 Pro:
  • 每分钟请求数: 150 RPM
  • 每分钟token数: 2,000,000 TPM
  • 每日请求数: 10,000 RPD
Gemini 2.5 Flash:
  • 每分钟请求数: 1,000 RPM
  • 每分钟token数: 1,000,000 TPM
  • 每日请求数: 10,000 RPD
Gemini 2.5 Flash-Lite:
  • 每分钟请求数: 4,000 RPM
  • 每分钟token数: 4,000,000 TPM
  • 每日请求数: 未指定

Higher Tiers (Tier 2 & 3)

更高层级(Tier 2 & 3)

Tier 2 (requires $250+ spending and 30-day wait):
  • Even higher limits available
Tier 3 (requires $1,000+ spending and 30-day wait):
  • Maximum limits available
Tips:
  • Implement rate limit handling with exponential backoff
  • Use batch processing for high-volume tasks
  • Monitor usage in Google AI Studio
  • Choose the right model based on your rate limit needs
  • Official rate limits: https://ai.google.dev/gemini-api/docs/rate-limits

Tier 2(需要消费250美元以上,等待30天):
  • 提供更高的限制
Tier 3(需要消费1,000美元以上,等待30天):
  • 提供最高限制
提示:
  • 实现带指数退避的速率限制处理
  • 大批量任务使用批量处理
  • 在Google AI Studio中监控使用情况
  • 根据速率限制需求选择合适的模型
  • 官方速率限制文档: https://ai.google.dev/gemini-api/docs/rate-limits

SDK Migration Guide

SDK迁移指南

From @google/generative-ai to @google/genai

从@google/generative-ai迁移到@google/genai

1. Update Package

1. 更新包


Remove deprecated SDK

移除已弃用的SDK

npm uninstall @google/generative-ai
npm uninstall @google/generative-ai

Install current SDK

安装当前SDK

npm install @google/genai@1.35.0
npm install @google/genai@1.35.0

2. Update Imports

2. 更新导入

Old (DEPRECATED):
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
New (CURRENT):
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
旧版(已弃用):
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
新版(当前):
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// 直接使用ai.models.generateContent()

3. Update API Calls

3. 更新API调用

Old:
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
New:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
旧版:
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
新版:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;

4. Update Streaming

4. 更新流式输出

Old:
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
New:
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
旧版:
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
新版:
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}

5. Update Chat

5. 更新对话

Old:
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
New:
typescript
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available

旧版:
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
新版:
typescript
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text可直接获取

Production Best Practices

生产环境最佳实践

1. Always Do

1. 必须遵守

✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to the API (saves costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model for the task (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro / gemini-2.5-flash / gemini-2.5-flash-lite
✅ 使用@google/genai(请勿使用@google/generative-ai)
✅ 设置maxOutputTokens以避免过度生成
✅ 实现带指数退避的速率限制处理
✅ 使用环境变量存储API密钥(永远不要硬编码)
✅ API调用前验证输入(节省成本)
✅ 使用流式输出提升长响应的用户体验
✅ 根据需求选择合适的模型(Pro用于复杂推理,Flash用于平衡,Flash-Lite用于速度)
✅ 优雅处理错误(使用try-catch)
✅ 监控token使用以控制成本
✅ 使用正确的模型名称: gemini-2.5-pro/flash/flash-lite

2. Never Do

2. 禁止操作

❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never assume generic rate limits (each model has different limits - check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (you will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
❌ 永远不要使用@google/generative-ai(已弃用!)
❌ 永远不要在代码中硬编码API密钥
❌ 永远不要声称Gemini 2.5有200万token上下文窗口(实际为1,048,576输入token)
❌ 永远不要在客户端代码中暴露API密钥
❌ 永远不要跳过错误处理(始终使用try-catch)
❌ 永远不要使用通用速率限制(每个模型的限制不同,请查看官方文档)
❌ 未经用户同意永远不要发送个人身份信息(PII)
❌ 永远不要信任未验证的用户输入
❌ 永远不要忽略速率限制(会收到429错误)
❌ 永远不要使用旧模型名称如gemini-1.5-pro(请使用2.5系列模型)

3. Security

3. 安全

  • API Key Storage: Use environment variables or secret managers
  • Server-Side Only: Never expose API keys in browser JavaScript
  • Input Validation: Sanitize all user inputs before API calls
  • Rate Limiting: Implement your own rate limits to prevent abuse
  • Error Messages: Don't expose API keys or sensitive data in error logs
  • API密钥存储: 使用环境变量或密钥管理器
  • 仅在服务端使用: 永远不要在浏览器JavaScript中暴露API密钥
  • 输入验证: API调用前清理所有用户输入
  • 速率限制: 实现自己的速率限制以防止滥用
  • 错误消息: 错误日志中不要暴露API密钥或敏感数据
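To make the input-validation point concrete, a sketch of basic prompt hygiene before an API call (the 100,000-character cap is an arbitrary example, not an API limit):

```typescript
// Sketch: basic prompt hygiene before an API call. The cap is an
// arbitrary example limit, not something the Gemini API enforces.
function sanitizePrompt(input: string, maxLength = 100_000): string {
  if (typeof input !== 'string' || input.trim().length === 0) {
    throw new Error('Prompt must be a non-empty string');
  }
  if (input.length > maxLength) {
    throw new Error(`Prompt exceeds ${maxLength} characters`);
  }
  // Strip ASCII control characters except tab, newline, and carriage return.
  return input.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, '');
}
```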

4. Cost Optimization

4. 成本优化

  • Choose Right Model: Use Flash for most tasks, Pro only when needed
  • Set Token Limits: Use maxOutputTokens to control costs
  • Batch Requests: Process multiple items efficiently
  • Cache Results: Store responses when appropriate
  • Monitor Usage: Track token consumption in Google Cloud Console
  • 选择合适的模型: 大多数任务使用Flash,仅在需要时使用Pro
  • 设置token限制: 使用maxOutputTokens控制成本
  • 批量请求: 高效处理多个项目
  • 缓存结果: 适当时存储响应
  • 监控使用情况: 在Google Cloud Console中跟踪token消耗
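"Cache Results" can be as little as an in-memory map keyed by model + prompt with a TTL. A sketch (note this is client-side and unrelated to Gemini's server-side context caching covered earlier):

```typescript
// Sketch: tiny client-side response cache with TTL, keyed by model + prompt.
// Avoids re-sending identical prompts; distinct from server-side context caching.
class ResponseCache {
  private store = new Map<string, { text: string; expires: number }>();
  constructor(private ttlMs = 5 * 60 * 1000) {}

  get(model: string, prompt: string, now = Date.now()): string | undefined {
    const entry = this.store.get(`${model}\u0000${prompt}`);
    if (!entry || entry.expires < now) return undefined;
    return entry.text;
  }

  set(model: string, prompt: string, text: string, now = Date.now()): void {
    this.store.set(`${model}\u0000${prompt}`, { text, expires: now + this.ttlMs });
  }
}
```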

5. Performance

5. 性能

  • Use Streaming: Better perceived latency for long responses
  • Parallel Requests: Use Promise.all() for independent calls
  • Edge Deployment: Deploy to Cloudflare Workers for low latency
  • Connection Pooling: Reuse HTTP connections when possible

  • 使用流式输出: 提升长响应的感知延迟
  • 并行请求: 使用Promise.all()处理独立调用
  • 边缘部署: 部署到Cloudflare Workers以降低延迟
  • 连接池: 可能的话复用HTTP连接
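A bare Promise.all over many prompts can blow through the RPM limits described earlier; a small concurrency limiter keeps parallelism bounded. A sketch (the helper name and the limit in the usage line are illustrative):

```typescript
// Sketch: run async tasks with bounded concurrency, preserving input order.
async function mapConcurrent<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker pulls the next unclaimed index until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => worker())
  );
  return results;
}

// Usage sketch with the client from earlier sections:
// const answers = await mapConcurrent(prompts, 5, p =>
//   ai.models.generateContent({ model: 'gemini-2.5-flash', contents: p })
// );
```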

Quick Reference

快速参考

Installation

安装

bash
npm install @google/genai@1.35.0
bash
npm install @google/genai@1.35.0

Environment

环境配置

bash
export GEMINI_API_KEY="..."
bash
export GEMINI_API_KEY="..."

Models (2025-2026)

模型(2025-2026)

  • gemini-3-flash
    (1,048,576 in / 65,536 out) - NEW Best speed+quality balance
  • gemini-2.5-pro
    (1,048,576 in / 65,536 out) - Best for complex reasoning
  • gemini-2.5-flash
    (1,048,576 in / 65,536 out) - Proven price-performance balance
  • gemini-2.5-flash-lite
    (1,048,576 in / 65,536 out) - Fastest, most cost-effective
  • gemini-3-flash
    (1,048,576输入 / 65,536输出)- 新增 最佳速度+质量平衡
  • gemini-2.5-pro
    (1,048,576输入 / 65,536输出)- 复杂推理最佳选择
  • gemini-2.5-flash
    (1,048,576输入 / 65,536输出)- 经过验证的性价比平衡
  • gemini-2.5-flash-lite
    (1,048,576输入 / 65,536输出)- 最快、最具成本效益

Basic Generation

基础生成

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Your prompt here'
});
console.log(response.text);
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '您的提示词'
});
console.log(response.text);

Streaming

流式输出

typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}
typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}

Multimodal

多模态

typescript
contents: [
  {
    parts: [
      { text: 'What is this?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]
typescript
contents: [
  {
    parts: [
      { text: '这是什么?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]

Function Calling

函数调用

typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

Last Updated: 2026-01-21 Production Validated: All features tested with @google/genai@1.35.0 Phase: 2 Complete ✅ (All Core + Advanced Features) Known Issues: 14 documented errors prevented Changes: Added Known Issues Prevention section with 14 community-researched findings from post-training-cutoff period (May 2025-Jan 2026)
typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

最后更新: 2026-01-21 生产环境验证: 所有功能已通过@google/genai@1.35.0测试 阶段: 第二阶段完成 ✅(所有核心+高级功能) 已知问题: 已预防14个已记录的错误 变更: 新增已知问题预防部分,包含14个社区研究的发现(2025年5月-2026年1月,模型训练截止日期之后的内容)