google-gemini-api

Google Gemini API - Complete Guide

Version: Phase 2 Complete ✅ Package: @google/genai@1.27.0 (⚠️ NOT @google/generative-ai) Last Updated: 2025-10-25

⚠️ CRITICAL SDK MIGRATION WARNING

DEPRECATED SDK: `@google/generative-ai` (sunset November 30, 2025)
CURRENT SDK: `@google/genai` v1.27+

If you see code using `@google/generative-ai`, it's outdated! This skill uses the correct current SDK and provides a complete migration guide.


Status

✅ Phase 1 Complete:
  • ✅ Text Generation (basic + streaming)
  • ✅ Multimodal Inputs (images, video, audio, PDFs)
  • ✅ Function Calling (basic + parallel execution)
  • ✅ System Instructions & Multi-turn Chat
  • ✅ Thinking Mode Configuration
  • ✅ Generation Parameters (temperature, top-p, top-k, stop sequences)
  • ✅ Both Node.js SDK (@google/genai) and fetch approaches
✅ Phase 2 Complete:
  • ✅ Context Caching (cost optimization with TTL-based caching)
  • ✅ Code Execution (built-in Python interpreter and sandbox)
  • ✅ Grounding with Google Search (real-time web information + citations)
📦 Separate Skills:
  • Embeddings: See the `google-gemini-embeddings` skill for text-embedding-004



Quick Start


Installation


✅ CORRECT SDK:
```bash
npm install @google/genai@1.27.0
```
❌ WRONG (DEPRECATED):
```bash
npm install @google/generative-ai  # DO NOT USE!
```

Environment Setup


```bash
export GEMINI_API_KEY="..."
```
Or create a `.env` file:
```
GEMINI_API_KEY=...
```
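
Before constructing the client, it can help to fail fast when the key is absent, rather than letting the SDK surface an opaque auth error on the first request. A minimal sketch (the `requireEnv` helper name is ours, not part of the SDK):

```typescript
// Fail fast with a clear message when a required variable is missing or blank.
function requireEnv(name: string, env: Record<string, string | undefined> = process.env): string {
  const value = env[name];
  if (!value || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage:
// const ai = new GoogleGenAI({ apiKey: requireEnv('GEMINI_API_KEY') });
```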

First Text Generation (Node.js SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
```

First Text Generation (Fetch - Cloudflare Workers)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```


Current Models (2025)


Gemini 2.5 Series (General Availability)


gemini-2.5-pro

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: State-of-the-art thinking model for complex reasoning
  • Best for: Code, math, STEM, complex problem-solving
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Best price-performance workhorse model
  • Best for: Large-scale processing, low-latency, high-volume, agentic use cases
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash-lite

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Cost-optimized, fastest 2.5 model
  • Best for: High throughput, cost-sensitive applications
  • Features: Thinking mode (supported, off by default), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

Model Feature Matrix


| Feature | Pro | Flash | Flash-Lite |
| --- | --- | --- | --- |
| Thinking Mode | ✅ Default ON | ✅ Default ON | ✅ Supported (default OFF) |
| Function Calling | ✅ | ✅ | ✅ |
| Multimodal | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ |
| Context Window | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | 65,536 max | 65,536 max | 65,536 max |

⚠️ Context Window Correction


ACCURATE: Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (the previous generation) had a 2M-token context window.
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.
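
A cheap pre-flight guard against these limits catches oversized prompts before the API rejects them. The sketch below uses the documented 2.5 limits; for an exact input count, use the SDK's `countTokens` method (shown in the comment). The helper name is ours:

```typescript
// Documented limits for all Gemini 2.5 models (input and output are separate budgets).
const MAX_INPUT_TOKENS = 1_048_576;
const MAX_OUTPUT_TOKENS = 65_536;

// Pre-flight check before sending a request.
// For an exact input count, use the SDK:
//   const { totalTokens } = await ai.models.countTokens({ model: 'gemini-2.5-flash', contents });
function fitsContextWindow(inputTokens: number, requestedOutputTokens = 0): boolean {
  return inputTokens <= MAX_INPUT_TOKENS && requestedOutputTokens <= MAX_OUTPUT_TOKENS;
}
```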


SDK vs Fetch Approaches


Node.js SDK (@google/genai)


Pros:
  • Type-safe with TypeScript
  • Easier API (simpler syntax)
  • Built-in chat helpers
  • Automatic SSE parsing for streaming
  • Better error handling
Cons:
  • Requires Node.js or compatible runtime
  • Larger bundle size
  • May not work in all edge runtimes
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility

Fetch-based (Direct REST API)


Pros:
  • Works in any JavaScript environment (Cloudflare Workers, Deno, Bun, browsers)
  • Minimal dependencies
  • Smaller bundle size
  • Full control over requests
Cons:
  • More verbose syntax
  • Manual SSE parsing for streaming
  • No built-in chat helpers
  • Manual error handling
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes


Text Generation


Basic Text Generation (SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about artificial intelligence'
});

console.log(response.text);
```

Basic Text Generation (Fetch)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'Write a haiku about artificial intelligence' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Response Structure


```typescript
{
  text: string,                  // Convenience accessor for text content
  candidates: [
    {
      content: {
        parts: [
          { text: string }       // Generated text
        ],
        role: string             // "model"
      },
      finishReason: string,      // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
      index: number
    }
  ],
  usageMetadata: {
    promptTokenCount: number,
    candidatesTokenCount: number,
    totalTokenCount: number
  }
}
```
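
Given that shape, a small helper can read responses defensively instead of assuming `candidates[0].content.parts[0]` always exists (a blocked or truncated response may omit parts). A sketch; the helper name and the trimmed interface are ours:

```typescript
// Shapes follow the documented response structure above, trimmed to the fields we read.
interface GenerateContentLike {
  candidates?: Array<{
    content?: { parts?: Array<{ text?: string }> };
    finishReason?: string;
  }>;
  usageMetadata?: { totalTokenCount?: number };
}

// Collect all text parts and flag truncation, instead of indexing parts[0] blindly.
function summarizeResponse(res: GenerateContentLike): { text: string; truncated: boolean; totalTokens: number } {
  const candidate = res.candidates?.[0];
  const text = (candidate?.content?.parts ?? [])
    .map(p => p.text ?? '')
    .join('');
  return {
    text,
    truncated: candidate?.finishReason === 'MAX_TOKENS',
    totalTokens: res.usageMetadata?.totalTokenCount ?? 0,
  };
}
```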


Streaming


Streaming with SDK (Async Iteration)


```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a 200-word story about time travel'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

Streaming with Fetch (SSE Parsing)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
    }),
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';

  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;

    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates[0]?.content?.parts[0]?.text;
      if (text) {
        process.stdout.write(text);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
```
Key Points:
  • Use the `streamGenerateContent` endpoint with `?alt=sse` (not `generateContent`); without `alt=sse` the REST API streams a JSON array rather than SSE
  • Parse the Server-Sent Events (SSE) format: `data: {json}\n\n`
  • Handle incomplete chunks in the buffer
  • Skip empty lines and `[DONE]` markers


Multimodal Inputs


Gemini 2.5 models support text + images + video + audio + PDFs in the same request.

Images (Vision)


SDK Approach


```typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'What is in this image?' },
        {
          inlineData: {
            data: base64Image,
            mimeType: 'image/jpeg'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Fetch Approach


```typescript
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'What is in this image?' },
            {
              inlineData: {
                data: base64Image,
                mimeType: 'image/jpeg'
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```
Supported Image Formats:
  • JPEG (`.jpg`, `.jpeg`)
  • PNG (`.png`)
  • WebP (`.webp`)
  • HEIC (`.heic`)
  • HEIF (`.heif`)
Max Image Size: 20MB per image
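
When loading files dynamically, the `mimeType` you send has to match the actual format. A small lookup covering exactly the formats listed above (the helper name is ours; both `.jpg` and `.jpeg` map to `image/jpeg`):

```typescript
// Map a file extension to the MIME type for the image formats Gemini accepts.
// Returns null for anything outside the supported list.
function imageMimeType(filename: string): string | null {
  const ext = filename.toLowerCase().split('.').pop() ?? '';
  const table: Record<string, string> = {
    jpg: 'image/jpeg',
    jpeg: 'image/jpeg',
    png: 'image/png',
    webp: 'image/webp',
    heic: 'image/heic',
    heif: 'image/heif',
  };
  return table[ext] ?? null;
}
```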

Video


```typescript
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe what happens in this video' },
        {
          inlineData: {
            data: base64Video,
            mimeType: 'video/mp4'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Video Formats:
  • MP4 (`.mp4`)
  • MPEG (`.mpeg`)
  • MOV (`.mov`)
  • AVI (`.avi`)
  • FLV (`.flv`)
  • MPG (`.mpg`)
  • WebM (`.webm`)
  • WMV (`.wmv`)
Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use the File API for larger files - Phase 2)

Audio


```typescript
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Transcribe and summarize this audio' },
        {
          inlineData: {
            data: base64Audio,
            mimeType: 'audio/mp3'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Audio Formats:
  • MP3 (`.mp3`)
  • WAV (`.wav`)
  • FLAC (`.flac`)
  • AAC (`.aac`)
  • OGG (`.ogg`)
  • OPUS (`.opus`)
Max Audio Size: 20MB

PDFs


```typescript
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize the key points in this PDF' },
        {
          inlineData: {
            data: base64Pdf,
            mimeType: 'application/pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy

Multiple Inputs


You can combine multiple modalities in one request:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Compare these two images and describe the differences:' },
        { inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
        { inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});
```
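
A small builder can assemble that `parts` array from a prompt plus any number of attachments, so the request shape stays consistent across call sites (the `Attachment` type and helper name are ours):

```typescript
// A base64-encoded payload plus its MIME type, matching the inlineData shape above.
interface Attachment { data: string; mimeType: string }

// Build a multimodal parts array: one text part followed by one inlineData part per attachment.
function buildParts(prompt: string, attachments: Attachment[]): Array<{ text: string } | { inlineData: Attachment }> {
  return [
    { text: prompt },
    ...attachments.map(a => ({ inlineData: { data: a.data, mimeType: a.mimeType } })),
  ];
}

// Usage:
// const response = await ai.models.generateContent({
//   model: 'gemini-2.5-flash',
//   contents: [{ parts: buildParts('Compare these images:', [img1, img2]) }]
// });
```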


Function Calling


Gemini supports function calling (tool use) to connect models with external APIs and systems.

Basic Function Calling (SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Define function declarations
const getCurrentWeather = {
  name: 'get_current_weather',
  description: 'Get the current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name, e.g. San Francisco'
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit']
      }
    },
    required: ['location']
  }
};

// Make request with tools
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather in Tokyo?',
  config: {
    tools: [
      { functionDeclarations: [getCurrentWeather] }
    ]
  }
});

// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;

if (functionCall) {
  console.log('Function to call:', functionCall.name);
  console.log('Arguments:', functionCall.args);

  // Execute the function (your implementation)
  const weatherData = await fetchWeather(functionCall.args.location);

  // Send function result back to model
  const finalResponse = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      'What\'s the weather in Tokyo?',
      response.candidates[0].content, // Original assistant response with function call
      {
        parts: [
          {
            functionResponse: {
              name: functionCall.name,
              response: weatherData
            }
          }
        ]
      }
    ],
    config: {
      tools: [
        { functionDeclarations: [getCurrentWeather] }
      ]
    }
  });

  console.log(finalResponse.text);
}
```

Function Calling (Fetch)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'What\'s the weather in Tokyo?' }] }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_current_weather',
              description: 'Get the current weather for a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'City name'
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;

if (functionCall) {
  // Execute function and send result back (same flow as SDK)
}
```

Parallel Function Calling


Gemini can call multiple independent functions simultaneously:
```typescript
const tools = [
  {
    functionDeclarations: [
      {
        name: 'get_weather',
        description: 'Get weather for a location',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      },
      {
        name: 'get_population',
        description: 'Get population of a city',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            city: { type: 'string' }
          },
          required: ['city']
        }
      }
    ]
  }
];

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather and population of Tokyo?',
  config: { tools }
});

// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
  part => part.functionCall
);

console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
```
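Executing those calls typically means running every handler concurrently and sending one `functionResponse` part back per call. A sketch under the assumption that you keep your implementations in a registry keyed by function name (the registry and helper names are ours, not SDK API):

```typescript
type Handler = (args: Record<string, unknown>) => Promise<unknown>;
type CallPart = { functionCall?: { name: string; args: Record<string, unknown> } };

// Pure step: pair each functionCall part with its handler (or null when unregistered).
function planCalls(parts: CallPart[], handlers: Record<string, Handler>) {
  return parts
    .filter(p => p.functionCall)
    .map(p => ({
      name: p.functionCall!.name,
      args: p.functionCall!.args,
      handler: handlers[p.functionCall!.name] ?? null,
    }));
}

// Async step: run all handlers in parallel and build the functionResponse parts
// to append to the conversation before calling generateContent again.
async function runFunctionCalls(parts: CallPart[], handlers: Record<string, Handler>) {
  return Promise.all(
    planCalls(parts, handlers).map(async ({ name, args, handler }) => ({
      functionResponse: {
        name,
        response: handler ? await handler(args) : { error: `no handler for ${name}` },
      },
    }))
  );
}
```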

Function Calling Modes


```typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // Force function call
        // mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
        // mode: FunctionCallingConfigMode.NONE, // Never call functions
        allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
      }
    }
  }
});
```
Modes:
  • `AUTO` (default): Model decides whether to call functions
  • `ANY`: Force the model to call at least one function
  • `NONE`: Disable function calling for this request


System Instructions


System instructions guide the model's behavior and set context. They are separate from the conversation messages.

SDK Approach


```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  config: {
    // In @google/genai, systemInstruction lives inside `config`
    systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
  },
  contents: 'Explain what a database is'
});

console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
```

Fetch Approach


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
        ]
      },
      contents: [
        { parts: [{ text: 'Explain what a database is' }] }
      ]
    }),
  }
);
```
Key Points:
  • System instructions are NOT part of the `contents` array
  • They are set once at the top level of the request
  • They persist for the entire conversation (when using multi-turn chat)
  • They don't count as user or model messages

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: '你是一个乐于助人的AI助手,说话风格要像海盗。' }
        ]
      },
      contents: [
        { parts: [{ text: '解释什么是数据库' }] }
      ]
    }),
  }
);
关键点
  • 系统指令不属于
    contents
    数组
  • 需在请求的顶层设置一次
  • 在多轮对话中持续生效
  • 不计入用户或模型消息

Multi-turn Chat

多轮对话

For conversations with history, use the SDK's chat helpers or manually manage conversation state.
对于需要历史上下文的对话,可使用SDK的对话助手,或手动管理对话状态。

SDK Chat Helpers (Recommended)

SDK对话助手(推荐)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: 'You are a helpful coding assistant.'
  },
  history: [] // Start empty or with previous messages
});

// Send first message
const response1 = await chat.sendMessage({ message: 'What is TypeScript?' });
console.log('Assistant:', response1.text);

// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage({ message: 'How do I install it?' });
console.log('Assistant:', response2.text);

// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的编程助手。'
  },
  history: [] // 从空开始,或传入历史消息
});

// 发送第一条消息
const response1 = await chat.sendMessage({ message: '什么是TypeScript?' });
console.log('助手:', response1.text);

// 发送跟进消息(上下文自动维护)
const response2 = await chat.sendMessage({ message: '怎么安装它?' });
console.log('助手:', response2.text);

// 获取完整对话历史
const history = chat.getHistory();
console.log('完整对话:', history);

Manual Chat Management (Fetch)

手动管理对话(Fetch)

typescript
const conversationHistory = [];

// First turn
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: 'What is TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// Add to history
conversationHistory.push(
  { role: 'user', parts: [{ text: 'What is TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// Second turn (include full history)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: 'How do I install it?' }] }
      ]
    }),
  }
);
Message Roles:
  • user
    : User messages
  • model
    : Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.

typescript
const conversationHistory = [];

// 第一轮对话
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: '什么是TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// 添加到历史
conversationHistory.push(
  { role: 'user', parts: [{ text: '什么是TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// 第二轮对话(包含完整历史)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: '怎么安装它?' }] }
      ]
    }),
  }
);
消息角色
  • user
    :用户消息
  • model
    :助手响应
⚠️ 注意:对话助手仅SDK支持。使用Fetch时,必须手动管理对话历史。

Thinking Mode

思考模式

Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
Gemini 2.5系列模型默认开启思考模式以提升输出质量。您可以配置思考预算。

Configure Thinking Budget (SDK)

配置思考预算(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Solve this complex math problem: ...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
    }
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '解决这个复杂的数学问题:...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // 最大思考令牌数(默认值因模型而异)
    }
  }
});

Configure Thinking Budget (Fetch)

配置思考预算(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);
Key Points:
  • Thinking is enabled by default on all Gemini 2.5 models; on 2.5 Pro it cannot be disabled, while 2.5 Flash and Flash-Lite accept a thinkingBudget of 0 to turn it off
  • Higher thinking budgets allow more internal reasoning (may increase latency)
  • Default budget varies by model (usually sufficient for most tasks)
  • Only increase budget for very complex reasoning tasks

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '解决这个复杂的数学问题:...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);
关键点
  • Gemini 2.5系列模型默认开启思考模式;2.5 Pro无法禁用,2.5 Flash与Flash-Lite可将thinkingBudget设为0以关闭
  • 更高的思考预算允许更深入的内部推理(可能增加延迟)
  • 默认预算因模型而异(通常足以应对大多数任务)
  • 仅在处理极复杂推理任务时才需要增加预算
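The API can also return summaries of the model's reasoning when `thinkingConfig.includeThoughts` is set to `true`; thought-summary parts then arrive in the same `parts` array flagged with `thought: true` (field names per my reading of the current API docs; verify before relying on them). A small sketch of separating thoughts from the final answer:

```typescript
// Split thought-summary parts from answer parts in a response-shaped object.
interface Part { text?: string; thought?: boolean; }

function splitThoughts(parts: Part[]): { thoughts: string[]; answer: string } {
  const thoughts = parts.filter(p => p.thought && p.text).map(p => p.text as string);
  const answer = parts.filter(p => !p.thought && p.text).map(p => p.text as string).join('');
  return { thoughts, answer };
}

// Illustrative parts array, shaped like candidates[0].content.parts:
const demoParts: Part[] = [
  { text: 'First I factor the expression...', thought: true },
  { text: 'The answer is 42.' }
];
console.log(splitThoughts(demoParts).answer); // → The answer is 42.
```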

Generation Configuration

生成配置

Customize model behavior with generation parameters.
通过生成参数自定义模型行为。

All Configuration Options (SDK)

所有配置选项(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a creative story',
  config: {
    temperature: 0.9,           // Randomness (0.0-2.0, default: 1.0)
    topP: 0.95,                 // Nucleus sampling (0.0-1.0)
    topK: 40,                   // Top-k sampling
    maxOutputTokens: 2048,      // Max tokens to generate
    stopSequences: ['END'],     // Stop generation if these appear
    responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
    candidateCount: 1           // Number of response candidates (usually 1)
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '写一个有创意的故事',
  config: {
    temperature: 0.9,           // 随机性(0.0-2.0,默认:1.0)
    topP: 0.95,                 // 核采样(0.0-1.0)
    topK: 40,                   // Top-k采样
    maxOutputTokens: 2048,      // 最大生成令牌数
    stopSequences: ['END'],     // 遇到这些字符串时停止生成
    responseMimeType: 'text/plain', // 或指定'application/json'启用JSON模式
    candidateCount: 1           // 响应候选数(通常为1)
  }
});

All Configuration Options (Fetch)

所有配置选项(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a creative story' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '写一个有创意的故事' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);

Parameter Guidelines

参数指南

Parameter        | Range    | Default   | Use Case
temperature      | 0.0-2.0  | 1.0       | Lower = more focused, higher = more creative
topP             | 0.0-1.0  | 0.95      | Nucleus sampling threshold
topK             | 1-100+   | 40        | Limit sampling to the top K tokens
maxOutputTokens  | 1-65536  | Model max | Control response length
stopSequences    | Array    | None      | Stop generation at specific strings
Tips:
  • For factual tasks: Use low temperature (0.0-0.3)
  • For creative tasks: Use high temperature (0.7-1.5)
  • topP and topK both control randomness; use one or the other (not both)
  • Always set maxOutputTokens to prevent excessive generation

参数             | 范围     | 默认值     | 适用场景
temperature      | 0.0-2.0  | 1.0        | 值越低输出越聚焦,值越高输出越有创意
topP             | 0.0-1.0  | 0.95       | 核采样阈值
topK             | 1-100+   | 40         | 限制仅考虑前K个令牌
maxOutputTokens  | 1-65536  | 模型最大值 | 控制响应长度
stopSequences    | 数组     | 无         | 遇到指定字符串时停止生成
提示
  • 事实类任务:使用低温度(0.0-0.3)
  • 创意类任务:使用高温度(0.7-1.5)
  • topPtopK都用于控制随机性,建议只使用其中一个(不要同时使用)
  • 始终设置maxOutputTokens以避免过度生成
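The `responseMimeType: 'application/json'` option mentioned above enables JSON mode, and pairing it with a `responseSchema` constrains the output shape. A hedged sketch of such a config (the schema fields are illustrative, and the uppercase type names follow the REST API's Schema `Type` enum as I understand it; verify against the current SDK docs):

```typescript
// JSON-mode generation config; the schema below is illustrative.
const jsonConfig = {
  temperature: 0.2,                     // low temperature suits structured extraction
  responseMimeType: 'application/json', // ask for JSON instead of prose
  responseSchema: {
    type: 'OBJECT',
    properties: {
      title:  { type: 'STRING' },
      rating: { type: 'NUMBER' }
    },
    required: ['title', 'rating']
  }
};

// Passed as: ai.models.generateContent({ model, contents, config: jsonConfig })
console.log(Object.keys(jsonConfig.responseSchema.properties)); // → [ 'title', 'rating' ]
```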

Context Caching

上下文缓存

Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and improve latency.
上下文缓存允许您缓存频繁使用的内容(如系统指令、大型文档或视频文件),可降低高达90%的成本并提升响应速度。

How It Works

工作原理

  1. Create a cache with your repeated content
  2. Reference the cache in subsequent requests
  3. Save tokens - cached tokens cost significantly less
  4. TTL management - caches expire after specified time
  1. 创建缓存:将重复使用的内容存入缓存
  2. 引用缓存:在后续请求中引用该缓存
  3. 节省令牌:缓存令牌的成本远低于普通令牌
  4. TTL管理:缓存会在指定时间后过期

Benefits

优势

  • Cost savings: Up to 90% reduction on cached tokens
  • Reduced latency: Faster responses by reusing processed content
  • Consistent context: Same large context across multiple requests
  • 成本节省:缓存令牌的成本比普通令牌低约90%
  • 延迟降低:通过复用已处理内容提升响应速度
  • 上下文一致:在多个请求中使用相同的大型上下文

Cache Creation (SDK)

创建缓存(SDK)

typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // Identifier for the cache
    systemInstruction: 'You are an expert at analyzing legal documents.',
    contents: documentText,
    ttl: '3600s', // Cache for 1 hour
  }
});

console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 为大型文档创建缓存
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // 缓存标识符
    systemInstruction: '你是一名法律文档分析专家。',
    contents: documentText,
    ttl: '3600s', // 缓存1小时
  }
});

console.log('缓存已创建:', cache.name);
console.log('过期时间:', cache.expireTime);

Cache Creation (Fetch)

创建缓存(Fetch)

typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: 'You are an expert at analyzing legal documents.' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('Cache created:', cache.name);
typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: '你是一名法律文档分析专家。' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('缓存已创建:', cache.name);

Using a Cache (SDK)

使用缓存(SDK)

typescript
// Generate content using the cache
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Same model the cache was created for
  contents: 'Summarize the key points in the document',
  config: {
    cachedContent: cache.name // Reference the cache by name
  }
});

console.log(response.text);
typescript
// 使用缓存生成内容
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // 与创建缓存时相同的模型
  contents: '总结文档的要点',
  config: {
    cachedContent: cache.name // 通过名称引用缓存
  }
});

console.log(response.text);

Using a Cache (Fetch)

使用缓存(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // e.g. "cachedContents/abc123"
      contents: [
        { parts: [{ text: 'Summarize the key points in the document' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // 例如 "cachedContents/abc123"
      contents: [
        { parts: [{ text: '总结文档的要点' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

Update Cache TTL (SDK)

更新缓存TTL(SDK)

typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // Extend to 2 hours
  }
});
typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // 延长至2小时
  }
});

Update Cache with Expiration Time (SDK)

设置缓存过期时间(SDK)

typescript
// Set a specific expiration time (RFC 3339 timestamp string)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString() // e.g. "2025-10-25T12:10:00.000Z"
  }
});
typescript
// 设置具体过期时间(RFC 3339时间戳字符串)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString() // 例如 "2025-10-25T12:10:00.000Z"
  }
});

List and Delete Caches (SDK)

列出和删除缓存(SDK)

typescript
// List all caches (the SDK returns an async pager)
const pager = await ai.caches.list();
for await (const cache of pager) {
  console.log(cache.name, cache.displayName);
}

// Delete a specific cache
await ai.caches.delete({ name: cache.name });
typescript
// 列出所有缓存(SDK返回异步分页器)
const pager = await ai.caches.list();
for await (const cache of pager) {
  console.log(cache.name, cache.displayName);
}

// 删除指定缓存
await ai.caches.delete({ name: cache.name });

Caching with Video Files

视频文件缓存

typescript
import { GoogleGenAI, createUserContent, createPartFromUri } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Upload video file (the SDK accepts a file path and reads it for you)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// Wait for processing (declared with `let` so the handle can be refreshed)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// Create cache with video
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: 'You are an expert video analyzer.',
    contents: createUserContent(createPartFromUri(videoFile.uri, videoFile.mimeType)),
    ttl: '300s' // 5 minutes
  }
});

// Use cache for multiple queries
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What happens in the first minute?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Describe the main characters',
  config: { cachedContent: cache.name }
});
typescript
import { GoogleGenAI, createUserContent, createPartFromUri } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 上传视频文件(传入文件路径,SDK会自动读取)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// 等待处理完成(使用let声明以便刷新文件句柄)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// 为视频创建缓存
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: '你是一名专业的视频分析专家。',
    contents: createUserContent(createPartFromUri(videoFile.uri, videoFile.mimeType)),
    ttl: '300s' // 缓存5分钟
  }
});

// 使用缓存进行多次查询
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '视频第一分钟发生了什么?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '描述主要角色',
  config: { cachedContent: cache.name }
});

Key Points

关键点

When to Use Caching:
  • Large system instructions used repeatedly
  • Long documents analyzed multiple times
  • Video/audio files queried with different prompts
  • Consistent context across conversation sessions
TTL Guidelines:
  • Short sessions: 300s (5 min) to 3600s (1 hour)
  • Long sessions: 3600s (1 hour) to 86400s (24 hours)
  • Maximum: 7 days
Cost Savings:
  • Cached input tokens: ~90% cheaper than regular tokens
  • Output tokens: Same price (not cached)
Important:
  • You must use explicit model version suffixes (e.g.,
    gemini-2.5-flash-001
    , NOT just
    gemini-2.5-flash
    )
  • Caches are automatically deleted after TTL expires
  • Update TTL before expiration to extend cache lifetime

何时使用缓存
  • 重复使用的大型系统指令
  • 需要多次分析的长文档
  • 需用不同查询提问的视频/音频文件
  • 跨对话会话的一致上下文
TTL指南
  • 短会话:300秒(5分钟)至3600秒(1小时)
  • 长会话:3600秒(1小时)至86400秒(24小时)
  • 最大值:7天
成本节省
  • 缓存输入令牌:比普通令牌便宜约90%
  • 输出令牌:价格与普通令牌相同(不缓存)
注意事项
  • 必须使用明确的模型版本后缀(例如:
    gemini-2.5-flash-001
    ,而非仅
    gemini-2.5-flash
  • 缓存会在TTL到期后自动删除
  • 需在到期前更新TTL以延长缓存生命周期
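TTLs are passed to the API as second-denominated strings like '3600s'. A tiny helper for building them (a hypothetical convenience, not part of the SDK), clamped to the 7-day maximum noted above:

```typescript
// Build a TTL string from hours/minutes, clamped to the documented 7-day cap.
const MAX_TTL_SECONDS = 7 * 24 * 3600;

function ttlOf({ hours = 0, minutes = 0 }: { hours?: number; minutes?: number }): string {
  const seconds = Math.min(hours * 3600 + minutes * 60, MAX_TTL_SECONDS);
  return `${seconds}s`;
}

console.log(ttlOf({ hours: 1 }));       // → 3600s
console.log(ttlOf({ hours: 24 * 30 })); // → 604800s (clamped to 7 days)
```

Usage would look like `ttl: ttlOf({ hours: 2 })` in a `caches.create` or `caches.update` config.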

Code Execution

代码执行

Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Gemini模型可以生成并执行Python代码,解决需要计算、数据分析或可视化的问题。

How It Works

工作原理

  1. Model generates executable Python code
  2. Code runs in secure sandbox
  3. Results are returned to the model
  4. Model incorporates results into response
  1. 模型生成可执行的Python代码
  2. 代码在安全沙箱中运行
  3. 将结果返回给模型
  4. 模型将结果整合到响应中

Supported Operations

支持的操作

  • Mathematical calculations
  • Data analysis and statistics
  • File processing (CSV, JSON, etc.)
  • Chart and graph generation
  • Algorithm implementation
  • Data transformations
  • 数学计算
  • 数据分析与统计
  • 文件处理(CSV、JSON等)
  • 图表生成
  • 算法实现
  • 数据转换

Available Python Packages

可用的Python包

Standard Library:
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
Data Science:
  • numpy
    ,
    pandas
    ,
    scipy
Visualization:
  • matplotlib
    ,
    seaborn
Note: Limited package availability compared to full Python environment
标准库
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
数据科学
  • numpy
    ,
    pandas
    ,
    scipy
可视化
  • matplotlib
    ,
    seaborn
注意:与完整Python环境相比,可用包有限

Basic Code Execution (SDK)

基础代码执行(SDK)

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Parse response parts
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Generated Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Execution Output:', part.codeExecutionResult.output);
  }
}
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '前50个质数的和是多少?生成并运行计算代码。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 解析响应部分
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('生成的代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('执行输出:', part.codeExecutionResult.output);
  }
}

Basic Code Execution (Fetch)

基础代码执行(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Result:', part.codeExecutionResult.output);
  }
}
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: '前50个质数的和是多少?生成并运行计算代码。' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('结果:', part.codeExecutionResult.output);
  }
}

Chat with Code Execution (SDK)

带代码执行的对话(SDK)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);

response = await chat.sendMessage({
  message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});

// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: '我有一个数学问题想问你。' });
console.log(response.text);

response = await chat.sendMessage({
  message: '计算斐波那契数列的前20项并求和。'
});

// 模型会生成并执行代码,然后给出答案
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('输出:', part.codeExecutionResult.output);
}

Data Analysis Example

数据分析示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    Analyze this sales data and calculate:
    1. Total revenue
    2. Average sale price
    3. Best-selling month

    Data (CSV format):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    分析以下销售数据并计算:
    1. 总营收
    2. 平均售价
    3. 销量最高的月份

    数据(CSV格式):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型会生成pandas/numpy代码来分析数据
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('分析代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('结果:', part.codeExecutionResult.output);
}

Visualization Example

可视化示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
  if (part.codeExecutionResult) {
    // Note: Chart image data would be in output
    console.log('Execution completed');
  }
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '创建一个柱状图,展示100以内质数的末位数字分布。生成图表并描述规律。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型生成matplotlib代码,执行后描述结果
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('图表代码:', part.executableCode.code);
  if (part.codeExecutionResult) {
    // 注意:图表图片数据会包含在输出中
    console.log('执行完成');
  }
}

Response Structure

响应结构

typescript
{
  candidates: [
    {
      content: {
        parts: [
          { text: "I'll calculate that for you." },
          {
            executableCode: {
              language: "PYTHON",
              code: "def is_prime(n):\n  if n <= 1:\n    return False\n  ..."
            }
          },
          {
            codeExecutionResult: {
              outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
              output: "5117\n"
            }
          },
          { text: "The sum of the first 50 prime numbers is 5117." }
        ]
      }
    }
  ]
}
typescript
{
  candidates: [
    {
      content: {
        parts: [
          { text: "我来帮你计算。" },
          {
            executableCode: {
              language: "PYTHON",
              code: "def is_prime(n):\n  if n <= 1:\n    return False\n  ..."
            }
          },
          {
            codeExecutionResult: {
              outcome: "OUTCOME_OK", // 或"OUTCOME_FAILED"
              output: "5117\n"
            }
          },
          { text: "前50个质数的和是5117。" }
        ]
      }
    }
  ]
}

Error Handling

错误处理

typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('Code execution failed:', part.codeExecutionResult.output);
    } else {
      console.log('Success:', part.codeExecutionResult.output);
    }
  }
}
typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('代码执行失败:', part.codeExecutionResult.output);
    } else {
      console.log('成功:', part.codeExecutionResult.output);
    }
  }
}

Key Points

关键点

When to Use Code Execution:
  • Complex mathematical calculations
  • Data analysis and statistics
  • Algorithm implementations
  • File parsing and processing
  • Chart generation
  • Computational problems
Limitations:
  • Sandbox environment (limited file system access)
  • Limited Python package availability
  • Execution timeout limits
  • No network access from code
  • No persistent state between executions
Best Practices:
  • Specify what calculation or analysis you need clearly
  • Request code generation explicitly ("Generate and run code...")
  • Check
    outcome
    field for errors
  • Use for deterministic computations, not for general programming
Important:
  • Available on all Gemini 2.5 models (Pro, Flash, Flash-Lite)
  • Code runs in isolated sandbox for security
  • Supports Python with standard library and common data science packages

何时使用代码执行
  • 复杂数学计算
  • 数据分析与统计
  • 算法实现
  • 文件解析与处理
  • 图表生成
  • 计算类问题
限制
  • 沙箱环境(文件系统访问受限)
  • Python可用包有限
  • 执行超时限制
  • 代码无法访问网络
  • 执行之间无持久化状态
最佳实践
  • 清晰说明您需要的计算或分析内容
  • 明确要求生成代码(例如:“生成并运行代码...”)
  • 检查
    outcome
    字段是否有错误
  • 仅用于确定性计算,而非通用编程
注意
  • 所有Gemini 2.5系列模型(Pro、Flash、Flash-Lite)均支持
  • 代码在隔离沙箱中运行以保障安全
  • 支持Python标准库和常见数据科学包

Grounding with Google Search

基于Google搜索的事实校验

Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.
事实校验功能将模型与实时网络信息连接,减少幻觉并提供最新的、经过事实核查的响应(包含引用)。

How It Works

工作原理

  1. Model determines if it needs current information
  2. Automatically performs Google Search
  3. Processes search results
  4. Incorporates findings into response
  5. Provides citations and source URLs
  1. 模型判断是否需要当前信息
  2. 自动执行Google搜索
  3. 处理搜索结果
  4. 将发现整合到响应中
  5. 提供引用和来源URL

Benefits

优势

  • Real-time information: Access to current events and data
  • Reduced hallucinations: Answers grounded in web sources
  • Verifiable: Citations allow fact-checking
  • Up-to-date: Not limited to model's training cutoff
  • 实时信息:获取当前事件和数据
  • 减少幻觉:答案基于网络来源
  • 可验证:引用支持事实核查
  • 内容更新:不受模型训练截止日期限制

Two Grounding APIs

两种事实校验API

1. Google Search (
googleSearch
) - Recommended for Gemini 2.5

1. Google Search(
googleSearch
)- 推荐用于Gemini 2.5

typescript
const groundingTool = {
  googleSearch: {}
};
Features:
  • Simple configuration
  • Automatic search when needed
  • Available on all Gemini 2.5 models
typescript
const groundingTool = {
  googleSearch: {}
};
特性
  • 配置简单
  • 自动在需要时执行搜索
  • 所有Gemini 2.5系列模型均支持

2. Google Search Retrieval (
googleSearchRetrieval
) - Legacy (Gemini 1.5)

2. Google Search Retrieval(
googleSearchRetrieval
)- 旧版(Gemini 1.5)

typescript
const retrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: 'MODE_DYNAMIC',
      dynamicThreshold: 0.7 // Only search if confidence < 70%
    }
  }
};
Features:
  • Dynamic threshold control
  • Used with Gemini 1.5 models
  • More configuration options
typescript
const retrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: 'MODE_DYNAMIC',
      dynamicThreshold: 0.7 // 仅当置信度<70%时执行搜索
    }
  }
};
特性
  • 动态阈值控制
  • 用于Gemini 1.5系列模型
  • 配置选项更多

Basic Grounding (SDK) - Gemini 2.5

基础事实校验(SDK)- Gemini 2.5

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Who won Euro 2024?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
  console.log('Search was performed!');
  console.log('Sources:', response.candidates[0].groundingMetadata);
}
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '谁赢得了2024年欧洲杯?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// 检查是否使用了事实校验
if (response.candidates[0].groundingMetadata) {
  console.log('已执行搜索!');
  console.log('来源:', response.candidates[0].groundingMetadata);
}

Basic Grounding (Fetch) - Gemini 2.5

基础事实校验(Fetch)- Gemini 2.5

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'Who won Euro 2024?' }] }
      ],
      tools: [
        { google_search: {} }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

if (data.candidates[0].groundingMetadata) {
  console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: '谁赢得了2024年欧洲杯?' }] }
      ],
      tools: [
        { google_search: {} }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

if (data.candidates[0].groundingMetadata) {
  console.log('事实校验元数据:', data.candidates[0].groundingMetadata);
}

Dynamic Retrieval (SDK) - Gemini 1.5

动态检索(SDK)- Gemini 1.5

typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'Who won Euro 2024?',
  config: {
    tools: [
      {
        googleSearchRetrieval: {
          dynamicRetrievalConfig: {
            mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
            dynamicThreshold: 0.7 // Search only if confidence < 70%
          }
        }
      }
    ]
  }
});

console.log(response.text);

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (high confidence)');
}
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: '谁赢得了2024年欧洲杯?',
  config: {
    tools: [
      {
        googleSearchRetrieval: {
          dynamicRetrievalConfig: {
            mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
            dynamicThreshold: 0.7 // 仅当置信度<70%时执行搜索
          }
        }
      }
    ]
  }
});

console.log(response.text);

if (!response.candidates[0].groundingMetadata) {
  console.log('模型使用自身知识回答(置信度高)');
}

Grounding Metadata Structure

事实校验元数据结构

typescript
{
  groundingMetadata: {
    webSearchQueries: [
      "euro 2024 winner"
    ],
    searchEntryPoint: {
      renderedContent: "..." // HTML/CSS for displaying Search Suggestions
    },
    groundingChunks: [
      {
        web: {
          uri: "https://example.com/euro-2024-results",
          title: "UEFA Euro 2024 Final Results"
        }
      }
    ],
    groundingSupports: [
      {
        segment: {
          startIndex: 42,
          endIndex: 47,
          text: "Spain"
        },
        groundingChunkIndices: [0]
      }
    ]
  }
}
typescript
{
  groundingMetadata: {
    webSearchQueries: [
      "2024欧洲杯冠军"
    ],
    searchEntryPoint: {
      renderedContent: "..." // 用于展示搜索建议的HTML/CSS
    },
    groundingChunks: [
      {
        web: {
          uri: "https://example.com/euro-2024-results",
          title: "2024欧洲杯决赛结果"
        }
      }
    ],
    groundingSupports: [
      {
        segment: {
          startIndex: 42,
          endIndex: 47,
          text: "西班牙"
        },
        groundingChunkIndices: [0]
      }
    ]
  }
}

Chat with Grounding (SDK)

带事实校验的对话(SDK)

typescript
const chat = await ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

let response = await chat.sendMessage({ message: 'What are the latest developments in quantum computing?' });
console.log(response.text);

// Check grounding sources
if (response.candidates[0].groundingMetadata) {
  const chunks = response.candidates[0].groundingMetadata.groundingChunks || [];
  console.log(`Sources used: ${chunks.length}`);
  chunks.forEach(chunk => {
    console.log(`- ${chunk.web?.title}: ${chunk.web?.uri}`);
  });
}

// Follow-up still has grounding enabled
response = await chat.sendMessage({ message: 'Which company made the biggest breakthrough?' });
console.log(response.text);
typescript
const chat = await ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

let response = await chat.sendMessage({ message: '量子计算的最新进展是什么?' });
console.log(response.text);

// 检查事实校验来源
if (response.candidates[0].groundingMetadata) {
  const chunks = response.candidates[0].groundingMetadata.groundingChunks || [];
  console.log(`使用的来源数:${chunks.length}`);
  chunks.forEach(chunk => {
    console.log(`- ${chunk.web?.title}: ${chunk.web?.uri}`);
  });
}

// 跟进问题仍会启用事实校验
response = await chat.sendMessage({ message: '哪家公司取得了最大突破?' });
console.log(response.text);
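Beyond listing sources, the grounding metadata can be turned into inline citations for display. The sketch below assumes the `groundingSupports` / `groundingChunks` shape returned by current Gemini API responses; `addCitations` is an illustrative helper, and you should adapt the field access if your SDK version differs:

```typescript
// Hedged sketch: append [n](url) citation markers to grounded text,
// assuming the groundingSupports / groundingChunks metadata shape.

interface GroundingChunk { web?: { uri?: string; title?: string } }
interface GroundingSupport {
  segment?: { startIndex?: number; endIndex?: number; text?: string };
  groundingChunkIndices?: number[];
}
interface GroundingMetadata {
  groundingChunks?: GroundingChunk[];
  groundingSupports?: GroundingSupport[];
}

function addCitations(text: string, metadata: GroundingMetadata): string {
  const supports = metadata.groundingSupports ?? [];
  const chunks = metadata.groundingChunks ?? [];
  let result = text;
  // Insert from the end of the text so earlier indices stay valid.
  const sorted = [...supports].sort(
    (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0)
  );
  for (const support of sorted) {
    const end = support.segment?.endIndex;
    if (end === undefined) continue;
    const marks = (support.groundingChunkIndices ?? [])
      .map(i => chunks[i]?.web?.uri ? `[${i + 1}](${chunks[i].web!.uri})` : '')
      .join('');
    result = result.slice(0, end) + marks + result.slice(end);
  }
  return result;
}

// Usage sketch:
// const cited = addCitations(
//   response.text,
//   response.candidates[0].groundingMetadata
// );
```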

Combining Grounding with Function Calling

结合事实校验与函数调用

typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather like in the city that won Euro 2024?',
  config: {
    tools: [
      { googleSearch: {} },
      { functionDeclarations: [weatherFunction] }
    ]
  }
});

// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in response
typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: '获取指定地点的当前天气',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: '城市名称' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '赢得2024年欧洲杯的城市现在天气怎么样?',
  config: {
    tools: [
      { googleSearch: {} },
      { functionDeclarations: [weatherFunction] }
    ]
  }
});

// 模型会:
// 1. 使用Google搜索找到2024欧洲杯冠军
// 2. 调用get_current_weather函数获取该城市天气
// 3. 将两者结果整合到响应中
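When the model decides to call `get_current_weather`, the response contains function-call parts that your code must execute locally and send back in the next turn. A minimal dispatch sketch — `runFunctionCalls` and the handler registry are illustrative names, not SDK APIs, while the `{ name, args }` call shape and the `functionResponse` part format follow the current API:

```typescript
// Hedged sketch: dispatch the model's function calls to local handlers
// and build the functionResponse parts for the next turn.

type Handler = (args: Record<string, unknown>) => Promise<unknown>;

interface FunctionCall { name: string; args: Record<string, unknown> }

async function runFunctionCalls(
  calls: FunctionCall[],
  handlers: Record<string, Handler>
) {
  return Promise.all(
    calls.map(async call => {
      const handler = handlers[call.name];
      const response = handler
        ? await handler(call.args)
        : { error: `Unknown function: ${call.name}` };
      // Shape expected back by the API: a functionResponse part.
      return { functionResponse: { name: call.name, response } };
    })
  );
}

// Usage sketch (response.functionCalls is provided by @google/genai):
//
// const parts = await runFunctionCalls(response.functionCalls ?? [], {
//   get_current_weather: async ({ location }) => ({ tempC: 21, location })
// });
// // Send `parts` back in the next generateContent / sendMessage turn.
```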

Checking if Grounding was Used

检查是否使用了事实校验

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is 2+2?', // Model knows this without search
  config: {
    tools: [{ googleSearch: {} }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (no search needed)');
} else {
  console.log('Search was performed');
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '2+2等于多少?', // 模型无需搜索即可回答
  config: {
    tools: [{ googleSearch: {} }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('模型使用自身知识回答(无需搜索)');
} else {
  console.log('已执行搜索');
}

Key Points

关键点

When to Use Grounding:
  • Current events and news
  • Real-time data (stock prices, sports scores, weather)
  • Fact-checking and verification
  • Questions about recent developments
  • Information beyond model's training cutoff
When NOT to Use:
  • General knowledge questions
  • Mathematical calculations
  • Code generation
  • Creative writing
  • Tasks requiring internal reasoning only
Cost Considerations:
  • Grounding adds latency (search takes time)
  • Additional token costs for retrieved content
  • Use
    dynamicThreshold
    to control when searches happen (Gemini 1.5)
Important Notes:
  • Grounding requires Google Cloud project (not just API key)
  • Search results quality depends on query phrasing
  • Citations may not cover all facts in response
  • Search is performed automatically based on confidence
Gemini 2.5 vs 1.5:
  • Gemini 2.5: Use
    googleSearch
    (simple, recommended)
  • Gemini 1.5: Use
    googleSearchRetrieval
    with
    dynamicThreshold
Best Practices:
  • Always check
    groundingMetadata
    to see if search was used
  • Display citations to users for transparency
  • Use specific, well-phrased questions for better search results
  • Combine with function calling for hybrid workflows

何时使用事实校验
  • 当前事件与新闻
  • 实时数据(股票价格、体育比分、天气)
  • 事实核查与验证
  • 关于近期发展的问题
  • 超出模型训练截止日期的信息
何时不使用
  • 通用知识问题
  • 数学计算
  • 代码生成
  • 创意写作
  • 仅需内部推理的任务
成本考量
  • 事实校验会增加延迟(搜索需要时间)
  • 检索内容会产生额外令牌成本
  • 使用
    dynamicThreshold
    控制搜索时机(Gemini 1.5)
注意事项
  • 事实校验需要Google Cloud项目(不仅是API密钥)
  • 搜索结果质量取决于查询措辞
  • 引用可能无法覆盖响应中的所有事实
  • 搜索会根据置信度自动执行
Gemini 2.5 vs 1.5
  • Gemini 2.5:使用
    googleSearch
    (简单,推荐)
  • Gemini 1.5:使用
    googleSearchRetrieval
    并配置
    dynamicThreshold
最佳实践
  • 始终检查
    groundingMetadata
    以确认是否执行了搜索
  • 向用户展示引用以保证透明度
  • 使用具体、清晰的问题以获得更好的搜索结果
  • 结合函数调用实现混合工作流

Error Handling

错误处理

Common Errors

常见错误

1. Invalid API Key (401)

1. 无效API密钥(401)

typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
Solution: Verify
GEMINI_API_KEY
environment variable is set correctly.
typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
解决方案:确认
GEMINI_API_KEY
环境变量已正确设置。

2. Rate Limit Exceeded (429)

2. 超出速率限制(429)

typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
Solution: Implement exponential backoff retry strategy.
typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
解决方案:实现指数退避重试策略。

3. Model Not Found (404)

3. 模型未找到(404)

typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
Solution: Use correct model names:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite
typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
解决方案:使用正确的模型名称:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite

4. Context Length Exceeded (400)

4. 超出上下文长度(400)

typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
解决方案:减小输入大小。Gemini 2.5系列模型最多支持1,048,576输入令牌。
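Before sending large inputs, it is cheaper to reject obviously oversized payloads locally; for an exact count the SDK exposes `ai.models.countTokens`. The local estimator below uses a rough characters-per-token heuristic — an assumption for pre-screening, not an API guarantee:

```typescript
// Hedged sketch: rough local size check before calling the API.
// ~4 characters per token is a heuristic, not an exact tokenizer.

const MAX_INPUT_TOKENS = 1_048_576; // Gemini 2.5 input limit

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function assertFits(text: string, limit = MAX_INPUT_TOKENS): void {
  const estimate = estimateTokens(text);
  if (estimate > limit) {
    throw new Error(
      `Input likely exceeds limit: ~${estimate} tokens (max ${limit})`
    );
  }
}

// For an exact count, prefer the API itself:
// const { totalTokens } = await ai.models.countTokens({
//   model: 'gemini-2.5-flash',
//   contents: text
// });
```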

Exponential Backoff Pattern

指数退避模式

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1秒, 2秒, 4秒
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
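The fixed 1s/2s/4s schedule above can synchronize retries across many clients hammering the API at the same moments. A common refinement is "full jitter": pick a random delay up to the exponential cap. The generic wrapper below works with any async call; the option names are illustrative:

```typescript
// Hedged sketch: exponential backoff with full jitter for 429s.
// Works with any async function, e.g. () => ai.models.generateContent(req).

interface RetryOptions {
  maxRetries?: number;   // total attempts
  baseDelayMs?: number;  // cap for the first backoff
}

async function withRetry<T>(
  fn: () => Promise<T>,
  { maxRetries = 3, baseDelayMs = 1000 }: RetryOptions = {}
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const retryable = error?.status === 429;
      if (!retryable || attempt >= maxRetries - 1) throw error;
      // Full jitter: random delay in [0, baseDelayMs * 2^attempt).
      const cap = baseDelayMs * 2 ** attempt;
      const delay = Math.random() * cap;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch:
// const response = await withRetry(() =>
//   ai.models.generateContent({ model: 'gemini-2.5-flash', contents: prompt })
// );
```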

Rate Limits

速率限制

Free Tier (Gemini API)

免费版(Gemini API)

Rate limits vary by model:
Gemini 2.5 Pro:
  • Requests per minute: 5 RPM
  • Tokens per minute: 125,000 TPM
  • Requests per day: 100 RPD
Gemini 2.5 Flash:
  • Requests per minute: 10 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: 250 RPD
Gemini 2.5 Flash-Lite:
  • Requests per minute: 15 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: 1,000 RPD
速率限制因模型而异:
Gemini 2.5 Pro
  • 每分钟请求数:5 RPM
  • 每分钟令牌数:125,000 TPM
  • 每天请求数:100 RPD
Gemini 2.5 Flash
  • 每分钟请求数:10 RPM
  • 每分钟令牌数:250,000 TPM
  • 每天请求数:250 RPD
Gemini 2.5 Flash-Lite
  • 每分钟请求数:15 RPM
  • 每分钟令牌数:250,000 TPM
  • 每天请求数:1,000 RPD
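To stay under a model's RPM cap proactively (rather than reacting to 429s), a small client-side sliding-window limiter helps. This is an illustrative utility, not part of the SDK; the window length is configurable so it can model any of the RPM figures above:

```typescript
// Hedged sketch: sliding-window request limiter for client-side RPM caps.
// Example: new SlidingWindowLimiter(10, 60_000) models Flash's free-tier 10 RPM.

class SlidingWindowLimiter {
  private timestamps: number[] = [];

  constructor(
    private readonly maxRequests: number,
    private readonly windowMs: number
  ) {}

  /** Returns true and records the request if under the cap, else false. */
  tryAcquire(now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Usage sketch:
// const limiter = new SlidingWindowLimiter(10, 60_000);
// if (limiter.tryAcquire()) {
//   await ai.models.generateContent({ model: 'gemini-2.5-flash', contents: prompt });
// } else {
//   // queue or delay the request
// }
```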

Paid Tier (Tier 1)

付费版(Tier 1)

Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
  • Requests per minute: 150 RPM
  • Tokens per minute: 2,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash:
  • Requests per minute: 1,000 RPM
  • Tokens per minute: 1,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash-Lite:
  • Requests per minute: 4,000 RPM
  • Tokens per minute: 4,000,000 TPM
  • Requests per day: Not specified
需要将计费账户关联到您的Google Cloud项目。
Gemini 2.5 Pro
  • 每分钟请求数:150 RPM
  • 每分钟令牌数:2,000,000 TPM
  • 每天请求数:10,000 RPD
Gemini 2.5 Flash
  • 每分钟请求数:1,000 RPM
  • 每分钟令牌数:1,000,000 TPM
  • 每天请求数:10,000 RPD
Gemini 2.5 Flash-Lite
  • 每分钟请求数:4,000 RPM
  • 每分钟令牌数:4,000,000 TPM
  • 每天请求数:未指定

Higher Tiers (Tier 2 & 3)

更高等级(Tier 2 & 3)

Tier 2 (requires $250+ spending and 30-day wait):
  • Even higher limits available
Tier 3 (requires $1,000+ spending and 30-day wait):
  • Maximum limits available
Tips:
  • Implement rate limit handling with exponential backoff
  • Use batch processing for high-volume tasks
  • Monitor usage in Google AI Studio
  • Choose the right model based on your rate limit needs
  • Official rate limits: https://ai.google.dev/gemini-api/docs/rate-limits

Tier 2(需消费250美元以上并等待30天):
  • 可获得更高的限制
Tier 3(需消费1000美元以上并等待30天):
  • 可获得最高限制
提示
  • 实现带指数退避的速率限制处理
  • 对高吞吐量任务使用批量处理
  • 在Google AI Studio中监控使用情况
  • 根据速率限制需求选择合适的模型
  • 官方速率限制文档:https://ai.google.dev/gemini-api/docs/rate-limits

SDK Migration Guide

SDK迁移指南

From @google/generative-ai to @google/genai

从@google/generative-ai迁移到@google/genai

1. Update Package

1. 更新包

bash
# Remove deprecated SDK
npm uninstall @google/generative-ai

# Install current SDK
npm install @google/genai@1.27.0
bash
# 卸载已废弃的SDK
npm uninstall @google/generative-ai

# 安装当前SDK
npm install @google/genai@1.27.0

2. Update Imports

2. 更新导入

Old (DEPRECATED):
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
New (CURRENT):
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
旧版(已废弃)
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
新版(当前)
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// 直接使用ai.models.generateContent()

3. Update API Calls

3. 更新API调用

Old:
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
New:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
旧版
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
新版
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;

4. Update Streaming

4. 更新流式传输

Old:
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
New:
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
旧版
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
新版
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}

5. Update Chat

5. 更新对话

Old:
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
New:
typescript
const chat = await ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available

旧版
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
新版
typescript
const chat = await ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// 直接使用response.text

Production Best Practices

生产环境最佳实践

1. Always Do

1. 必须遵守

✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to API (save costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model based on your needs (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro/flash/flash-lite
✅ 使用@google/genai(而非@google/generative-ai)
✅ 设置maxOutputTokens以避免过度生成
✅ 实现速率限制处理并使用指数退避
✅ 使用环境变量存储API密钥(绝不要硬编码)
✅ 在发送到API前验证输入(节省成本)
✅ 使用流式传输提升长响应的用户体验
✅ 根据需求选择合适的模型(Pro用于复杂推理,Flash用于平衡,Flash-Lite用于速度)
✅ 优雅处理错误(使用try-catch)
✅ 监控令牌使用以控制成本
✅ 使用正确的模型名称:gemini-2.5-pro/flash/flash-lite

2. Never Do

2. 绝对禁止

❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never use generic rate limits (each model has different limits - check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
❌ 绝不要使用@google/generative-ai(已废弃!)
❌ 绝不要在代码中硬编码API密钥
❌ 绝不要声称Gemini 2.5支持200万令牌(实际为1,048,576输入令牌)
❌ 绝不要在客户端代码中暴露API密钥
❌ 绝不要跳过错误处理(始终使用try-catch)
❌ 绝不要使用通用速率限制(每个模型的限制不同,请查看官方文档)
❌ 绝不要在未获得用户同意的情况下发送PII(个人身份信息)
❌ 绝不要信任未验证的用户输入
❌ 绝不要忽略速率限制(会收到429错误)
❌ 绝不要使用旧模型名称如gemini-1.5-pro(使用2.5系列模型)

3. Security

3. 安全

  • API Key Storage: Use environment variables or secret managers
  • Server-Side Only: Never expose API keys in browser JavaScript
  • Input Validation: Sanitize all user inputs before API calls
  • Rate Limiting: Implement your own rate limits to prevent abuse
  • Error Messages: Don't expose API keys or sensitive data in error logs
  • API密钥存储:使用环境变量或密钥管理器
  • 仅在服务端使用:绝不要在浏览器JavaScript中暴露API密钥
  • 输入验证:在调用API前清理所有用户输入
  • 速率限制:实现自己的速率限制以防止滥用
  • 错误消息:不要在错误日志中暴露API密钥或敏感数据

4. Cost Optimization

4. 成本优化

  • Choose Right Model: Use Flash for most tasks, Pro only when needed
  • Set Token Limits: Use maxOutputTokens to control costs
  • Batch Requests: Process multiple items efficiently
  • Cache Results: Store responses when appropriate
  • Monitor Usage: Track token consumption in Google Cloud Console
  • 选择合适的模型:大多数任务使用Flash,仅在必要时使用Pro
  • 设置令牌限制:使用maxOutputTokens控制成本
  • 批量处理:对高吞吐量任务使用批量处理
  • 缓存结果:在合适的场景下缓存响应
  • 监控使用情况:在Google Cloud控制台跟踪令牌消耗
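The "cache results" point above can be as simple as a TTL map keyed by model and prompt. This is an application-level cache, distinct from the API's server-side context caching feature; all names here are illustrative:

```typescript
// Hedged sketch: application-level TTL cache for identical prompts.
// Distinct from Gemini's server-side context caching feature.

class ResponseCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private readonly ttlMs: number) {}

  async getOrGenerate(
    key: string,
    generate: () => Promise<string>,
    now: number = Date.now()
  ): Promise<string> {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > now) return hit.value; // cache hit: no API call
    const value = await generate();
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}

// Usage sketch:
// const cache = new ResponseCache(5 * 60_000);
// const text = await cache.getOrGenerate(
//   `gemini-2.5-flash:${prompt}`,
//   async () => (await ai.models.generateContent({
//     model: 'gemini-2.5-flash', contents: prompt
//   })).text
// );
```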

5. Performance

5. 性能

  • Use Streaming: Better perceived latency for long responses
  • Parallel Requests: Use Promise.all() for independent calls
  • Edge Deployment: Deploy to Cloudflare Workers for low latency
  • Connection Pooling: Reuse HTTP connections when possible

  • 使用流式传输:长响应时提升感知延迟
  • 并行请求:对独立调用使用Promise.all()
  • 边缘部署:部署到Cloudflare Workers以降低延迟
  • 连接池:尽可能复用HTTP连接
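Plain `Promise.all()` fires every request at once, which can trip RPM limits; a bounded-concurrency map keeps the parallelism benefit while capping in-flight requests. An illustrative helper, not an SDK API:

```typescript
// Hedged sketch: Promise.all with a concurrency cap, so parallel
// generateContent calls don't exceed your rate limits.

async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker pulls the next unclaimed index until items run out.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}

// Usage sketch: at most 5 in-flight requests.
// const answers = await mapWithConcurrency(prompts, 5, async prompt =>
//   (await ai.models.generateContent({
//     model: 'gemini-2.5-flash', contents: prompt
//   })).text
// );
```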

Quick Reference

快速参考

Installation

安装

bash
npm install @google/genai@1.27.0
bash
npm install @google/genai@1.27.0

Environment

环境配置

bash
export GEMINI_API_KEY="..."
bash
export GEMINI_API_KEY="..."

Models (2025)

模型(2025)

  • gemini-2.5-pro
    (1,048,576 in / 65,536 out) - Best for complex reasoning
  • gemini-2.5-flash
    (1,048,576 in / 65,536 out) - Best price-performance balance
  • gemini-2.5-flash-lite
    (1,048,576 in / 65,536 out) - Fastest, most cost-effective
  • gemini-2.5-pro
    (1,048,576输入 / 65,536输出)- 最佳复杂推理模型
  • gemini-2.5-flash
    (1,048,576输入 / 65,536输出)- 最佳性价比模型
  • gemini-2.5-flash-lite
    (1,048,576输入 / 65,536输出)- 最快、最具成本效益的模型

Basic Generation

基础生成

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Your prompt here'
});
console.log(response.text);
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '你的提示词'
});
console.log(response.text);

Streaming

流式传输

typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}
typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}

Multimodal

多模态

typescript
contents: [
  {
    parts: [
      { text: 'What is this?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]
typescript
contents: [
  {
    parts: [
      { text: '这是什么?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]

Function Calling

函数调用

typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

Last Updated: 2025-10-25
Production Validated: All features tested with @google/genai@1.27.0
Phase: 2 Complete ✅ (All Core + Advanced Features)
typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

最后更新:2025-10-25
生产环境验证:所有特性均已通过@google/genai@1.27.0测试
阶段:第二阶段已完成 ✅(所有核心+高级功能)