google-gemini-api

Google Gemini API - Complete Guide

Version: Phase 2 Complete ✅ Package: @google/genai@1.27.0 (⚠️ NOT @google/generative-ai) Last Updated: 2025-10-25

⚠️ CRITICAL SDK MIGRATION WARNING

DEPRECATED SDK: `@google/generative-ai` (sunset November 30, 2025)
CURRENT SDK: `@google/genai` v1.27+

If you see code using `@google/generative-ai`, it's outdated! This skill uses the correct current SDK and provides a complete migration guide.


Status

✅ Phase 1 Complete:
  • ✅ Text Generation (basic + streaming)
  • ✅ Multimodal Inputs (images, video, audio, PDFs)
  • ✅ Function Calling (basic + parallel execution)
  • ✅ System Instructions & Multi-turn Chat
  • ✅ Thinking Mode Configuration
  • ✅ Generation Parameters (temperature, top-p, top-k, stop sequences)
  • ✅ Both Node.js SDK (@google/genai) and fetch approaches
✅ Phase 2 Complete:
  • ✅ Context Caching (cost optimization with TTL-based caching)
  • ✅ Code Execution (built-in Python interpreter and sandbox)
  • ✅ Grounding with Google Search (real-time web information + citations)
📦 Separate Skills:
  • Embeddings: See the `google-gemini-embeddings` skill for text-embedding-004



Quick Start


Installation


✅ CORRECT SDK:
```bash
npm install @google/genai@1.27.0
```
❌ WRONG (DEPRECATED):
```bash
npm install @google/generative-ai  # DO NOT USE!
```

Environment Setup


```bash
export GEMINI_API_KEY="..."
```
Or create a `.env` file:
```
GEMINI_API_KEY=...
```
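
Before constructing the client, it can help to fail fast when the key is absent, rather than letting the SDK surface an opaque auth error on the first request. A minimal sketch (the `requireEnv` helper name is ours, not part of the SDK):

```typescript
// Fail fast with a clear message when a required variable is missing or blank.
function requireEnv(name: string, env: Record<string, string | undefined> = process.env): string {
  const value = env[name];
  if (!value || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage:
// const ai = new GoogleGenAI({ apiKey: requireEnv('GEMINI_API_KEY') });
```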

First Text Generation (Node.js SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
```

First Text Generation (Fetch - Cloudflare Workers)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```


Current Models (2025)


Gemini 2.5 Series (General Availability)


gemini-2.5-pro

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: State-of-the-art thinking model for complex reasoning
  • Best for: Code, math, STEM, complex problem-solving
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Best price-performance workhorse model
  • Best for: Large-scale processing, low-latency, high-volume, agentic use cases
  • Features: Thinking mode (default on), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

gemini-2.5-flash-lite

  • Context: 1,048,576 input tokens / 65,536 output tokens
  • Description: Cost-optimized, fastest 2.5 model
  • Best for: High throughput, cost-sensitive applications
  • Features: Thinking mode (supported, off by default), function calling, multimodal, streaming
  • Knowledge cutoff: January 2025

Model Feature Matrix


| Feature | Pro | Flash | Flash-Lite |
| --- | --- | --- | --- |
| Thinking Mode | ✅ Default ON | ✅ Default ON | ✅ Supported (default OFF) |
| Function Calling | ✅ | ✅ | ✅ |
| Multimodal | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ |
| Context Window | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | 65,536 max | 65,536 max | 65,536 max |

⚠️ Context Window Correction


ACCURATE: Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (the previous generation) had a 2M-token context window.
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.
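
A cheap pre-flight guard against these limits catches oversized prompts before the API rejects them. The sketch below uses the documented 2.5 limits; for an exact input count, use the SDK's `countTokens` method (shown in the comment). The helper name is ours:

```typescript
// Documented limits for all Gemini 2.5 models (input and output are separate budgets).
const MAX_INPUT_TOKENS = 1_048_576;
const MAX_OUTPUT_TOKENS = 65_536;

// Pre-flight check before sending a request.
// For an exact input count, use the SDK:
//   const { totalTokens } = await ai.models.countTokens({ model: 'gemini-2.5-flash', contents });
function fitsContextWindow(inputTokens: number, requestedOutputTokens = 0): boolean {
  return inputTokens <= MAX_INPUT_TOKENS && requestedOutputTokens <= MAX_OUTPUT_TOKENS;
}
```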


SDK vs Fetch Approaches


Node.js SDK (@google/genai)


Pros:
  • Type-safe with TypeScript
  • Easier API (simpler syntax)
  • Built-in chat helpers
  • Automatic SSE parsing for streaming
  • Better error handling
Cons:
  • Requires Node.js or compatible runtime
  • Larger bundle size
  • May not work in all edge runtimes
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility

Fetch-based (Direct REST API)


Pros:
  • Works in any JavaScript environment (Cloudflare Workers, Deno, Bun, browsers)
  • Minimal dependencies
  • Smaller bundle size
  • Full control over requests
Cons:
  • More verbose syntax
  • Manual SSE parsing for streaming
  • No built-in chat helpers
  • Manual error handling
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes


Text Generation


Basic Text Generation (SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about artificial intelligence'
});

console.log(response.text);
```

Basic Text Generation (Fetch)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'Write a haiku about artificial intelligence' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Response Structure


```typescript
{
  text: string,                  // Convenience accessor for text content
  candidates: [
    {
      content: {
        parts: [
          { text: string }       // Generated text
        ],
        role: string             // "model"
      },
      finishReason: string,      // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
      index: number
    }
  ],
  usageMetadata: {
    promptTokenCount: number,
    candidatesTokenCount: number,
    totalTokenCount: number
  }
}
```
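
Given that shape, a small helper can read responses defensively instead of assuming `candidates[0].content.parts[0]` always exists (a blocked or truncated response may omit parts). A sketch; the helper name and the trimmed interface are ours:

```typescript
// Shapes follow the documented response structure above, trimmed to the fields we read.
interface GenerateContentLike {
  candidates?: Array<{
    content?: { parts?: Array<{ text?: string }> };
    finishReason?: string;
  }>;
  usageMetadata?: { totalTokenCount?: number };
}

// Collect all text parts and flag truncation, instead of indexing parts[0] blindly.
function summarizeResponse(res: GenerateContentLike): { text: string; truncated: boolean; totalTokens: number } {
  const candidate = res.candidates?.[0];
  const text = (candidate?.content?.parts ?? [])
    .map(p => p.text ?? '')
    .join('');
  return {
    text,
    truncated: candidate?.finishReason === 'MAX_TOKENS',
    totalTokens: res.usageMetadata?.totalTokenCount ?? 0,
  };
}
```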


Streaming


Streaming with SDK (Async Iteration)


```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a 200-word story about time travel'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

Streaming with Fetch (SSE Parsing)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
    }),
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';

  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;

    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates[0]?.content?.parts[0]?.text;
      if (text) {
        process.stdout.write(text);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
```
Key Points:
  • Use the `streamGenerateContent` endpoint with `?alt=sse` (not `generateContent`); without `alt=sse` the REST API streams a JSON array rather than SSE
  • Parse the Server-Sent Events (SSE) format: `data: {json}\n\n`
  • Handle incomplete chunks in the buffer
  • Skip empty lines and `[DONE]` markers


Multimodal Inputs


Gemini 2.5 models support text + images + video + audio + PDFs in the same request.

Images (Vision)


SDK Approach


```typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'What is in this image?' },
        {
          inlineData: {
            data: base64Image,
            mimeType: 'image/jpeg'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Fetch Approach


```typescript
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'What is in this image?' },
            {
              inlineData: {
                data: base64Image,
                mimeType: 'image/jpeg'
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```
Supported Image Formats:
  • JPEG (`.jpg`, `.jpeg`)
  • PNG (`.png`)
  • WebP (`.webp`)
  • HEIC (`.heic`)
  • HEIF (`.heif`)
Max Image Size: 20MB per image
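
When loading files dynamically, the `mimeType` you send has to match the actual format. A small lookup covering exactly the formats listed above (the helper name is ours; both `.jpg` and `.jpeg` map to `image/jpeg`):

```typescript
// Map a file extension to the MIME type for the image formats Gemini accepts.
// Returns null for anything outside the supported list.
function imageMimeType(filename: string): string | null {
  const ext = filename.toLowerCase().split('.').pop() ?? '';
  const table: Record<string, string> = {
    jpg: 'image/jpeg',
    jpeg: 'image/jpeg',
    png: 'image/png',
    webp: 'image/webp',
    heic: 'image/heic',
    heif: 'image/heif',
  };
  return table[ext] ?? null;
}
```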

Video


```typescript
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe what happens in this video' },
        {
          inlineData: {
            data: base64Video,
            mimeType: 'video/mp4'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Video Formats:
  • MP4 (`.mp4`)
  • MPEG (`.mpeg`)
  • MOV (`.mov`)
  • AVI (`.avi`)
  • FLV (`.flv`)
  • MPG (`.mpg`)
  • WebM (`.webm`)
  • WMV (`.wmv`)
Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use the File API for larger files - Phase 2)

Audio


```typescript
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Transcribe and summarize this audio' },
        {
          inlineData: {
            data: base64Audio,
            mimeType: 'audio/mp3'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Supported Audio Formats:
  • MP3 (`.mp3`)
  • WAV (`.wav`)
  • FLAC (`.flac`)
  • AAC (`.aac`)
  • OGG (`.ogg`)
  • OPUS (`.opus`)
Max Audio Size: 20MB

PDFs


```typescript
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize the key points in this PDF' },
        {
          inlineData: {
            data: base64Pdf,
            mimeType: 'application/pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```
Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy

Multiple Inputs


You can combine multiple modalities in one request:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Compare these two images and describe the differences:' },
        { inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
        { inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});
```
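
A small builder can assemble that `parts` array from a prompt plus any number of attachments, so the request shape stays consistent across call sites (the `Attachment` type and helper name are ours):

```typescript
// A base64-encoded payload plus its MIME type, matching the inlineData shape above.
interface Attachment { data: string; mimeType: string }

// Build a multimodal parts array: one text part followed by one inlineData part per attachment.
function buildParts(prompt: string, attachments: Attachment[]): Array<{ text: string } | { inlineData: Attachment }> {
  return [
    { text: prompt },
    ...attachments.map(a => ({ inlineData: { data: a.data, mimeType: a.mimeType } })),
  ];
}

// Usage:
// const response = await ai.models.generateContent({
//   model: 'gemini-2.5-flash',
//   contents: [{ parts: buildParts('Compare these images:', [img1, img2]) }]
// });
```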


Function Calling


Gemini supports function calling (tool use) to connect models with external APIs and systems.

Basic Function Calling (SDK)


```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Define function declarations
const getCurrentWeather = {
  name: 'get_current_weather',
  description: 'Get the current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name, e.g. San Francisco'
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit']
      }
    },
    required: ['location']
  }
};

// Make request with tools
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather in Tokyo?',
  config: {
    tools: [
      { functionDeclarations: [getCurrentWeather] }
    ]
  }
});

// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;

if (functionCall) {
  console.log('Function to call:', functionCall.name);
  console.log('Arguments:', functionCall.args);

  // Execute the function (your implementation)
  const weatherData = await fetchWeather(functionCall.args.location);

  // Send function result back to model
  const finalResponse = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      'What\'s the weather in Tokyo?',
      response.candidates[0].content, // Original assistant response with function call
      {
        parts: [
          {
            functionResponse: {
              name: functionCall.name,
              response: weatherData
            }
          }
        ]
      }
    ],
    config: {
      tools: [
        { functionDeclarations: [getCurrentWeather] }
      ]
    }
  });

  console.log(finalResponse.text);
}
```

Function Calling (Fetch)


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'What\'s the weather in Tokyo?' }] }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_current_weather',
              description: 'Get the current weather for a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'City name'
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;

if (functionCall) {
  // Execute function and send result back (same flow as SDK)
}
```

Parallel Function Calling


Gemini can call multiple independent functions simultaneously:
```typescript
const tools = [
  {
    functionDeclarations: [
      {
        name: 'get_weather',
        description: 'Get weather for a location',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      },
      {
        name: 'get_population',
        description: 'Get population of a city',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            city: { type: 'string' }
          },
          required: ['city']
        }
      }
    ]
  }
];

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather and population of Tokyo?',
  config: { tools }
});

// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
  part => part.functionCall
);

console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
```
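Executing those calls typically means running every handler concurrently and sending one `functionResponse` part back per call. A sketch under the assumption that you keep your implementations in a registry keyed by function name (the registry and helper names are ours, not SDK API):

```typescript
type Handler = (args: Record<string, unknown>) => Promise<unknown>;
type CallPart = { functionCall?: { name: string; args: Record<string, unknown> } };

// Pure step: pair each functionCall part with its handler (or null when unregistered).
function planCalls(parts: CallPart[], handlers: Record<string, Handler>) {
  return parts
    .filter(p => p.functionCall)
    .map(p => ({
      name: p.functionCall!.name,
      args: p.functionCall!.args,
      handler: handlers[p.functionCall!.name] ?? null,
    }));
}

// Async step: run all handlers in parallel and build the functionResponse parts
// to append to the conversation before calling generateContent again.
async function runFunctionCalls(parts: CallPart[], handlers: Record<string, Handler>) {
  return Promise.all(
    planCalls(parts, handlers).map(async ({ name, args, handler }) => ({
      functionResponse: {
        name,
        response: handler ? await handler(args) : { error: `no handler for ${name}` },
      },
    }))
  );
}
```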

Function Calling Modes


```typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // Force function call
        // mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
        // mode: FunctionCallingConfigMode.NONE, // Never call functions
        allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
      }
    }
  }
});
```
Modes:
  • `AUTO` (default): Model decides whether to call functions
  • `ANY`: Force the model to call at least one function
  • `NONE`: Disable function calling for this request


System Instructions


System instructions guide the model's behavior and set context. They are separate from the conversation messages.

SDK Approach


```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  config: {
    // In @google/genai, systemInstruction lives inside `config`
    systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
  },
  contents: 'Explain what a database is'
});

console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
```

Fetch Approach


```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
        ]
      },
      contents: [
        { parts: [{ text: 'Explain what a database is' }] }
      ]
    }),
  }
);
```
Key Points:
  • System instructions are NOT part of the `contents` array
  • They are set once at the top level of the request
  • They persist for the entire conversation (when using multi-turn chat)
  • They don't count as user or model messages

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: '你是一个乐于助人的AI助手,说话风格要像海盗。' }
        ]
      },
      contents: [
        { parts: [{ text: '解释什么是数据库' }] }
      ]
    }),
  }
);
关键点
  • 系统指令不属于
    contents
    数组
  • 需在请求的顶层设置一次
  • 在多轮对话中持续生效
  • 不计入用户或模型消息

Multi-turn Chat

多轮对话

For conversations with history, use the SDK's chat helpers or manually manage conversation state.
对于需要历史上下文的对话,可使用SDK的对话助手,或手动管理对话状态。

SDK Chat Helpers (Recommended)

SDK对话助手(推荐)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: 'You are a helpful coding assistant.'
  },
  history: [] // Start empty or with previous messages
});

// Send first message
const response1 = await chat.sendMessage({ message: 'What is TypeScript?' });
console.log('Assistant:', response1.text);

// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage({ message: 'How do I install it?' });
console.log('Assistant:', response2.text);

// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的编程助手。'
  },
  history: [] // 从空开始,或传入历史消息
});

// 发送第一条消息
const response1 = await chat.sendMessage({ message: '什么是TypeScript?' });
console.log('助手:', response1.text);

// 发送跟进消息(上下文自动维护)
const response2 = await chat.sendMessage({ message: '怎么安装它?' });
console.log('助手:', response2.text);

// 获取完整对话历史
const history = chat.getHistory();
console.log('完整对话:', history);

Manual Chat Management (Fetch)

手动管理对话(Fetch)

typescript
const conversationHistory = [];

// First turn
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: 'What is TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// Add to history
conversationHistory.push(
  { role: 'user', parts: [{ text: 'What is TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// Second turn (include full history)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: 'How do I install it?' }] }
      ]
    }),
  }
);
Message Roles:
  • user
    : User messages
  • model
    : Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.

typescript
const conversationHistory = [];

// 第一轮对话
const response1 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{ text: '什么是TypeScript?' }]
        }
      ]
    }),
  }
);

const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;

// 添加到历史
conversationHistory.push(
  { role: 'user', parts: [{ text: '什么是TypeScript?' }] },
  { role: 'model', parts: [{ text: assistantReply1 }] }
);

// 第二轮对话(包含完整历史)
const response2 = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        ...conversationHistory,
        { role: 'user', parts: [{ text: '怎么安装它?' }] }
      ]
    }),
  }
);
消息角色
  • user
    :用户消息
  • model
    :助手响应
⚠️ 注意:对话助手仅SDK支持。使用Fetch时,必须手动管理对话历史。

Thinking Mode

思考模式

Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
Gemini 2.5系列模型默认开启思考模式以提升输出质量。您可以配置思考预算。

Configure Thinking Budget (SDK)

配置思考预算(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Solve this complex math problem: ...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
    }
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '解决这个复杂的数学问题:...',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192 // 最大思考令牌数(默认值因模型而异)
    }
  }
});

Configure Thinking Budget (Fetch)

配置思考预算(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);
Key Points:
  • Thinking is enabled by default on all Gemini 2.5 models; on 2.5 Pro it cannot be disabled, while 2.5 Flash and Flash-Lite accept a thinkingBudget of 0 to turn it off
  • Higher thinking budgets allow more internal reasoning (may increase latency)
  • Default budget varies by model (usually sufficient for most tasks)
  • Only increase budget for very complex reasoning tasks

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '解决这个复杂的数学问题:...' }] }],
      generationConfig: {
        thinkingConfig: {
          thinkingBudget: 8192
        }
      }
    }),
  }
);
关键点
  • Gemini 2.5系列模型默认开启思考模式;2.5 Pro无法禁用,2.5 Flash与Flash-Lite可将thinkingBudget设为0以关闭
  • 更高的思考预算允许更深入的内部推理(可能增加延迟)
  • 默认预算因模型而异(通常足以应对大多数任务)
  • 仅在处理极复杂推理任务时才需要增加预算
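The API can also return summaries of the model's reasoning when `thinkingConfig.includeThoughts` is set to `true`; thought-summary parts then arrive in the same `parts` array flagged with `thought: true` (field names per my reading of the current API docs; verify before relying on them). A small sketch of separating thoughts from the final answer:

```typescript
// Split thought-summary parts from answer parts in a response-shaped object.
interface Part { text?: string; thought?: boolean; }

function splitThoughts(parts: Part[]): { thoughts: string[]; answer: string } {
  const thoughts = parts.filter(p => p.thought && p.text).map(p => p.text as string);
  const answer = parts.filter(p => !p.thought && p.text).map(p => p.text as string).join('');
  return { thoughts, answer };
}

// Illustrative parts array, shaped like candidates[0].content.parts:
const demoParts: Part[] = [
  { text: 'First I factor the expression...', thought: true },
  { text: 'The answer is 42.' }
];
console.log(splitThoughts(demoParts).answer); // → The answer is 42.
```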

Generation Configuration

生成配置

Customize model behavior with generation parameters.
通过生成参数自定义模型行为。

All Configuration Options (SDK)

所有配置选项(SDK)

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a creative story',
  config: {
    temperature: 0.9,           // Randomness (0.0-2.0, default: 1.0)
    topP: 0.95,                 // Nucleus sampling (0.0-1.0)
    topK: 40,                   // Top-k sampling
    maxOutputTokens: 2048,      // Max tokens to generate
    stopSequences: ['END'],     // Stop generation if these appear
    responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
    candidateCount: 1           // Number of response candidates (usually 1)
  }
});
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '写一个有创意的故事',
  config: {
    temperature: 0.9,           // 随机性(0.0-2.0,默认:1.0)
    topP: 0.95,                 // 核采样(0.0-1.0)
    topK: 40,                   // Top-k采样
    maxOutputTokens: 2048,      // 最大生成令牌数
    stopSequences: ['END'],     // 遇到这些字符串时停止生成
    responseMimeType: 'text/plain', // 或指定'application/json'启用JSON模式
    candidateCount: 1           // 响应候选数(通常为1)
  }
});

All Configuration Options (Fetch)

所有配置选项(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a creative story' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: '写一个有创意的故事' }] }],
      generationConfig: {
        temperature: 0.9,
        topP: 0.95,
        topK: 40,
        maxOutputTokens: 2048,
        stopSequences: ['END'],
        responseMimeType: 'text/plain',
        candidateCount: 1
      }
    }),
  }
);

Parameter Guidelines

参数指南

Parameter        | Range    | Default   | Use Case
temperature      | 0.0-2.0  | 1.0       | Lower = more focused, higher = more creative
topP             | 0.0-1.0  | 0.95      | Nucleus sampling threshold
topK             | 1-100+   | 40        | Limit sampling to the top K tokens
maxOutputTokens  | 1-65536  | Model max | Control response length
stopSequences    | Array    | None      | Stop generation at specific strings
Tips:
  • For factual tasks: Use low temperature (0.0-0.3)
  • For creative tasks: Use high temperature (0.7-1.5)
  • topP and topK both control randomness; use one or the other (not both)
  • Always set maxOutputTokens to prevent excessive generation

参数             | 范围     | 默认值     | 适用场景
temperature      | 0.0-2.0  | 1.0        | 值越低输出越聚焦,值越高输出越有创意
topP             | 0.0-1.0  | 0.95       | 核采样阈值
topK             | 1-100+   | 40         | 限制仅考虑前K个令牌
maxOutputTokens  | 1-65536  | 模型最大值 | 控制响应长度
stopSequences    | 数组     | 无         | 遇到指定字符串时停止生成
提示
  • 事实类任务:使用低温度(0.0-0.3)
  • 创意类任务:使用高温度(0.7-1.5)
  • topPtopK都用于控制随机性,建议只使用其中一个(不要同时使用)
  • 始终设置maxOutputTokens以避免过度生成
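The `responseMimeType: 'application/json'` option mentioned above enables JSON mode, and pairing it with a `responseSchema` constrains the output shape. A hedged sketch of such a config (the schema fields are illustrative, and the uppercase type names follow the REST API's Schema `Type` enum as I understand it; verify against the current SDK docs):

```typescript
// JSON-mode generation config; the schema below is illustrative.
const jsonConfig = {
  temperature: 0.2,                     // low temperature suits structured extraction
  responseMimeType: 'application/json', // ask for JSON instead of prose
  responseSchema: {
    type: 'OBJECT',
    properties: {
      title:  { type: 'STRING' },
      rating: { type: 'NUMBER' }
    },
    required: ['title', 'rating']
  }
};

// Passed as: ai.models.generateContent({ model, contents, config: jsonConfig })
console.log(Object.keys(jsonConfig.responseSchema.properties)); // → [ 'title', 'rating' ]
```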

Context Caching

上下文缓存

Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and improve latency.
上下文缓存允许您缓存频繁使用的内容(如系统指令、大型文档或视频文件),可降低高达90%的成本并提升响应速度。

How It Works

工作原理

  1. Create a cache with your repeated content
  2. Reference the cache in subsequent requests
  3. Save tokens - cached tokens cost significantly less
  4. TTL management - caches expire after specified time
  1. 创建缓存:将重复使用的内容存入缓存
  2. 引用缓存:在后续请求中引用该缓存
  3. 节省令牌:缓存令牌的成本远低于普通令牌
  4. TTL管理:缓存会在指定时间后过期

Benefits

优势

  • Cost savings: Up to 90% reduction on cached tokens
  • Reduced latency: Faster responses by reusing processed content
  • Consistent context: Same large context across multiple requests
  • 成本节省:缓存令牌的成本比普通令牌低约90%
  • 延迟降低:通过复用已处理内容提升响应速度
  • 上下文一致:在多个请求中使用相同的大型上下文

Cache Creation (SDK)

创建缓存(SDK)

typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // Identifier for the cache
    systemInstruction: 'You are an expert at analyzing legal documents.',
    contents: documentText,
    ttl: '3600s', // Cache for 1 hour
  }
});

console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 为大型文档创建缓存
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');

const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'large-doc-cache', // 缓存标识符
    systemInstruction: '你是一名法律文档分析专家。',
    contents: documentText,
    ttl: '3600s', // 缓存1小时
  }
});

console.log('缓存已创建:', cache.name);
console.log('过期时间:', cache.expireTime);

Cache Creation (Fetch)

创建缓存(Fetch)

typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: 'You are an expert at analyzing legal documents.' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('Cache created:', cache.name);
typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/cachedContents',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      model: 'models/gemini-2.5-flash',
      displayName: 'large-doc-cache',
      systemInstruction: {
        parts: [{ text: '你是一名法律文档分析专家。' }]
      },
      contents: [
        { parts: [{ text: documentText }] }
      ],
      ttl: '3600s'
    }),
  }
);

const cache = await response.json();
console.log('缓存已创建:', cache.name);

Using a Cache (SDK)

使用缓存(SDK)

typescript
// Generate content using the cache
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // Same model the cache was created for
  contents: 'Summarize the key points in the document',
  config: {
    cachedContent: cache.name // Reference the cache by name
  }
});

console.log(response.text);
typescript
// 使用缓存生成内容
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // 与创建缓存时相同的模型
  contents: '总结文档的要点',
  config: {
    cachedContent: cache.name // 通过名称引用缓存
  }
});

console.log(response.text);

Using a Cache (Fetch)

使用缓存(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // e.g. "cachedContents/abc123"
      contents: [
        { parts: [{ text: 'Summarize the key points in the document' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // 例如 "cachedContents/abc123"
      contents: [
        { parts: [{ text: '总结文档的要点' }] }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

Update Cache TTL (SDK)

更新缓存TTL(SDK)

typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // Extend to 2 hours
  }
});
typescript
import { UpdateCachedContentConfig } from '@google/genai';

await ai.caches.update({
  name: cache.name,
  config: {
    ttl: '7200s' // 延长至2小时
  }
});

Update Cache with Expiration Time (SDK)

设置缓存过期时间(SDK)

typescript
// Set a specific expiration time (RFC 3339 timestamp string)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString() // e.g. "2025-10-25T12:10:00.000Z"
  }
});
typescript
// 设置具体过期时间(RFC 3339时间戳字符串)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);

await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString() // 例如 "2025-10-25T12:10:00.000Z"
  }
});

List and Delete Caches (SDK)

列出和删除缓存(SDK)

typescript
// List all caches (the SDK returns an async pager)
const pager = await ai.caches.list();
for await (const cache of pager) {
  console.log(cache.name, cache.displayName);
}

// Delete a specific cache
await ai.caches.delete({ name: cache.name });
typescript
// 列出所有缓存(SDK返回异步分页器)
const pager = await ai.caches.list();
for await (const cache of pager) {
  console.log(cache.name, cache.displayName);
}

// 删除指定缓存
await ai.caches.delete({ name: cache.name });

Caching with Video Files

视频文件缓存

typescript
import { GoogleGenAI, createUserContent, createPartFromUri } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Upload video file (the SDK accepts a file path and reads it for you)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// Wait for processing (declared with `let` so the handle can be refreshed)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// Create cache with video
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: 'You are an expert video analyzer.',
    contents: createUserContent(createPartFromUri(videoFile.uri, videoFile.mimeType)),
    ttl: '300s' // 5 minutes
  }
});

// Use cache for multiple queries
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What happens in the first minute?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Describe the main characters',
  config: { cachedContent: cache.name }
});
typescript
import { GoogleGenAI, createUserContent, createPartFromUri } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// 上传视频文件(传入文件路径,SDK会自动读取)
let videoFile = await ai.files.upload({
  file: './video.mp4'
});

// 等待处理完成(使用let声明以便刷新文件句柄)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}

// 为视频创建缓存
const cache = await ai.caches.create({
  model: 'gemini-2.5-flash',
  config: {
    displayName: 'video-analysis-cache',
    systemInstruction: '你是一名专业的视频分析专家。',
    contents: createUserContent(createPartFromUri(videoFile.uri, videoFile.mimeType)),
    ttl: '300s' // 缓存5分钟
  }
});

// 使用缓存进行多次查询
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '视频第一分钟发生了什么?',
  config: { cachedContent: cache.name }
});

const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '描述主要角色',
  config: { cachedContent: cache.name }
});

Key Points

关键点

When to Use Caching:
  • Large system instructions used repeatedly
  • Long documents analyzed multiple times
  • Video/audio files queried with different prompts
  • Consistent context across conversation sessions
TTL Guidelines:
  • Short sessions: 300s (5 min) to 3600s (1 hour)
  • Long sessions: 3600s (1 hour) to 86400s (24 hours)
  • Maximum: 7 days
Cost Savings:
  • Cached input tokens: ~90% cheaper than regular tokens
  • Output tokens: Same price (not cached)
Important:
  • You must use explicit model version suffixes (e.g.,
    gemini-2.5-flash-001
    , NOT just
    gemini-2.5-flash
    )
  • Caches are automatically deleted after TTL expires
  • Update TTL before expiration to extend cache lifetime

何时使用缓存
  • 重复使用的大型系统指令
  • 需要多次分析的长文档
  • 需用不同查询提问的视频/音频文件
  • 跨对话会话的一致上下文
TTL指南
  • 短会话:300秒(5分钟)至3600秒(1小时)
  • 长会话:3600秒(1小时)至86400秒(24小时)
  • 最大值:7天
成本节省
  • 缓存输入令牌:比普通令牌便宜约90%
  • 输出令牌:价格与普通令牌相同(不缓存)
注意事项
  • 必须使用明确的模型版本后缀(例如:
    gemini-2.5-flash-001
    ,而非仅
    gemini-2.5-flash
  • 缓存会在TTL到期后自动删除
  • 需在到期前更新TTL以延长缓存生命周期
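TTLs are passed to the API as second-denominated strings like '3600s'. A tiny helper for building them (a hypothetical convenience, not part of the SDK), clamped to the 7-day maximum noted above:

```typescript
// Build a TTL string from hours/minutes, clamped to the documented 7-day cap.
const MAX_TTL_SECONDS = 7 * 24 * 3600;

function ttlOf({ hours = 0, minutes = 0 }: { hours?: number; minutes?: number }): string {
  const seconds = Math.min(hours * 3600 + minutes * 60, MAX_TTL_SECONDS);
  return `${seconds}s`;
}

console.log(ttlOf({ hours: 1 }));       // → 3600s
console.log(ttlOf({ hours: 24 * 30 })); // → 604800s (clamped to 7 days)
```

Usage would look like `ttl: ttlOf({ hours: 2 })` in a `caches.create` or `caches.update` config.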

Code Execution

代码执行

Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Gemini模型可以生成并执行Python代码,解决需要计算、数据分析或可视化的问题。

How It Works

工作原理

  1. Model generates executable Python code
  2. Code runs in secure sandbox
  3. Results are returned to the model
  4. Model incorporates results into response
  1. 模型生成可执行的Python代码
  2. 代码在安全沙箱中运行
  3. 将结果返回给模型
  4. 模型将结果整合到响应中

Supported Operations

支持的操作

  • Mathematical calculations
  • Data analysis and statistics
  • File processing (CSV, JSON, etc.)
  • Chart and graph generation
  • Algorithm implementation
  • Data transformations
  • 数学计算
  • 数据分析与统计
  • 文件处理(CSV、JSON等)
  • 图表生成
  • 算法实现
  • 数据转换

Available Python Packages

可用的Python包

Standard Library:
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
Data Science:
  • numpy
    ,
    pandas
    ,
    scipy
Visualization:
  • matplotlib
    ,
    seaborn
Note: Limited package availability compared to full Python environment
标准库
  • math
    ,
    statistics
    ,
    random
    ,
    datetime
    ,
    json
    ,
    csv
    ,
    re
  • collections
    ,
    itertools
    ,
    functools
数据科学
  • numpy
    ,
    pandas
    ,
    scipy
可视化
  • matplotlib
    ,
    seaborn
注意:与完整Python环境相比,可用包有限

Basic Code Execution (SDK)

基础代码执行(SDK)

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Parse response parts
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Generated Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Execution Output:', part.codeExecutionResult.output);
  }
}
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '前50个质数的和是多少?生成并运行计算代码。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 解析响应部分
for (const part of response.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('生成的代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('执行输出:', part.codeExecutionResult.output);
  }
}

Basic Code Execution (Fetch)

基础代码执行(Fetch)

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('Text:', part.text);
  }
  if (part.executableCode) {
    console.log('Code:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('Result:', part.codeExecutionResult.output);
  }
}
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      tools: [{ code_execution: {} }],
      contents: [
        {
          parts: [
            { text: '前50个质数的和是多少?生成并运行计算代码。' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();

for (const part of data.candidates[0].content.parts) {
  if (part.text) {
    console.log('文本:', part.text);
  }
  if (part.executableCode) {
    console.log('代码:', part.executableCode.code);
  }
  if (part.codeExecutionResult) {
    console.log('结果:', part.codeExecutionResult.output);
  }
}

Chat with Code Execution (SDK)

带代码执行的对话(SDK)

typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);

response = await chat.sendMessage({
  message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});

// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

let response = await chat.sendMessage({ message: '我有一个数学问题想问你。' });
console.log(response.text);

response = await chat.sendMessage({
  message: '计算斐波那契数列的前20项并求和。'
});

// 模型会生成并执行代码,然后给出答案
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('输出:', part.codeExecutionResult.output);
}

Data Analysis Example

数据分析示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    Analyze this sales data and calculate:
    1. Total revenue
    2. Average sale price
    3. Best-selling month

    Data (CSV format):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `
    分析以下销售数据并计算:
    1. 总营收
    2. 平均售价
    3. 销量最高的月份

    数据(CSV格式):
    month,sales,revenue
    Jan,150,45000
    Feb,200,62000
    Mar,175,53000
    Apr,220,68000
  `,
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型会生成pandas/numpy代码来分析数据
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('分析代码:', part.executableCode.code);
  if (part.codeExecutionResult) console.log('结果:', part.codeExecutionResult.output);
}

Visualization Example

可视化示例

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
  if (part.codeExecutionResult) {
    // Note: Chart image data would be in output
    console.log('Execution completed');
  }
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '创建一个柱状图,展示100以内质数的末位数字分布。生成图表并描述规律。',
  config: {
    tools: [{ codeExecution: {} }]
  }
});

// 模型生成matplotlib代码,执行后描述结果
for (const part of response.candidates[0].content.parts) {
  if (part.text) console.log(part.text);
  if (part.executableCode) console.log('图表代码:', part.executableCode.code);
  if (part.codeExecutionResult) {
    // 注意:图表图片数据会包含在输出中
    console.log('执行完成');
  }
}

Response Structure

响应结构

typescript
{
  candidates: [
    {
      content: {
        parts: [
          { text: "I'll calculate that for you." },
          {
            executableCode: {
              language: "PYTHON",
              code: "def is_prime(n):\n  if n <= 1:\n    return False\n  ..."
            }
          },
          {
            codeExecutionResult: {
              outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
              output: "5117\n"
            }
          },
          { text: "The sum of the first 50 prime numbers is 5117." }
        ]
      }
    }
  ]
}
typescript
{
  candidates: [
    {
      content: {
        parts: [
          { text: "我来帮你计算。" },
          {
            executableCode: {
              language: "PYTHON",
              code: "def is_prime(n):\n  if n <= 1:\n    return False\n  ..."
            }
          },
          {
            codeExecutionResult: {
              outcome: "OUTCOME_OK", // 或"OUTCOME_FAILED"
              output: "5117\n"
            }
          },
          { text: "前50个质数的和是5117。" }
        ]
      }
    }
  ]
}

Error Handling

错误处理

typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('Code execution failed:', part.codeExecutionResult.output);
    } else {
      console.log('Success:', part.codeExecutionResult.output);
    }
  }
}
typescript
for (const part of response.candidates[0].content.parts) {
  if (part.codeExecutionResult) {
    if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
      console.error('代码执行失败:', part.codeExecutionResult.output);
    } else {
      console.log('成功:', part.codeExecutionResult.output);
    }
  }
}

Key Points

关键点

When to Use Code Execution:
  • Complex mathematical calculations
  • Data analysis and statistics
  • Algorithm implementations
  • File parsing and processing
  • Chart generation
  • Computational problems
Limitations:
  • Sandbox environment (limited file system access)
  • Limited Python package availability
  • Execution timeout limits
  • No network access from code
  • No persistent state between executions
Best Practices:
  • Specify what calculation or analysis you need clearly
  • Request code generation explicitly ("Generate and run code...")
  • Check
    outcome
    field for errors
  • Use for deterministic computations, not for general programming
Important:
  • Available on all Gemini 2.5 models (Pro, Flash, Flash-Lite)
  • Code runs in isolated sandbox for security
  • Supports Python with standard library and common data science packages

何时使用代码执行
  • 复杂数学计算
  • 数据分析与统计
  • 算法实现
  • 文件解析与处理
  • 图表生成
  • 计算类问题
限制
  • 沙箱环境(文件系统访问受限)
  • Python可用包有限
  • 执行超时限制
  • 代码无法访问网络
  • 执行之间无持久化状态
最佳实践
  • 清晰说明您需要的计算或分析内容
  • 明确要求生成代码(例如:“生成并运行代码...”)
  • 检查
    outcome
    字段是否有错误
  • 仅用于确定性计算,而非通用编程
注意
  • 所有Gemini 2.5系列模型(Pro、Flash、Flash-Lite)均支持
  • 代码在隔离沙箱中运行以保障安全
  • 支持Python标准库和常见数据科学包

Grounding with Google Search

基于Google搜索的事实校验

Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.
事实校验功能将模型与实时网络信息连接,减少幻觉并提供最新的、经过事实核查的响应(包含引用)。

How It Works

工作原理

  1. Model determines if it needs current information
  2. Automatically performs Google Search
  3. Processes search results
  4. Incorporates findings into response
  5. Provides citations and source URLs
  1. 模型判断是否需要当前信息
  2. 自动执行Google搜索
  3. 处理搜索结果
  4. 将发现整合到响应中
  5. 提供引用和来源URL

Benefits

优势

  • Real-time information: Access to current events and data
  • Reduced hallucinations: Answers grounded in web sources
  • Verifiable: Citations allow fact-checking
  • Up-to-date: Not limited to model's training cutoff
  • 实时信息:获取当前事件和数据
  • 减少幻觉:答案基于网络来源
  • 可验证:引用支持事实核查
  • 内容更新:不受模型训练截止日期限制

Two Grounding APIs

两种事实校验API

1. Google Search (
googleSearch
) - Recommended for Gemini 2.5

1. Google Search(
googleSearch
)- 推荐用于Gemini 2.5

typescript
const groundingTool = {
  googleSearch: {}
};
Features:
  • Simple configuration
  • Automatic search when needed
  • Available on all Gemini 2.5 models
typescript
const groundingTool = {
  googleSearch: {}
};
特性
  • 配置简单
  • 自动在需要时执行搜索
  • 所有Gemini 2.5系列模型均支持

2. Google Search Retrieval (
googleSearchRetrieval
) - Legacy (Gemini 1.5)

2. Google Search Retrieval(
googleSearchRetrieval
)- 旧版(Gemini 1.5)

typescript
const retrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: 'MODE_DYNAMIC',
      dynamicThreshold: 0.7 // Only search if confidence < 70%
    }
  }
};
Features:
  • Dynamic threshold control
  • Used with Gemini 1.5 models
  • More configuration options
typescript
const retrievalTool = {
  googleSearchRetrieval: {
    dynamicRetrievalConfig: {
      mode: 'MODE_DYNAMIC',
      dynamicThreshold: 0.7 // 仅当置信度<70%时执行搜索
    }
  }
};
特性
  • 动态阈值控制
  • 用于Gemini 1.5系列模型
  • 配置选项更多

Basic Grounding (SDK) - Gemini 2.5

基础事实校验(SDK)- Gemini 2.5

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Who won Euro 2024?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
  console.log('Search was performed!');
  console.log('Sources:', response.candidates[0].groundingMetadata);
}
typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '谁赢得了2024年欧洲杯?',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

console.log(response.text);

// 检查是否使用了事实校验
if (response.candidates[0].groundingMetadata) {
  console.log('已执行搜索!');
  console.log('来源:', response.candidates[0].groundingMetadata);
}

Basic Grounding (Fetch) - Gemini 2.5

基础事实校验(Fetch)- Gemini 2.5

typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'Who won Euro 2024?' }] }
      ],
      tools: [
        { google_search: {} }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

if (data.candidates[0].groundingMetadata) {
  console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: '谁赢得了2024年欧洲杯?' }] }
      ],
      tools: [
        { google_search: {} }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);

if (data.candidates[0].groundingMetadata) {
  console.log('事实校验元数据:', data.candidates[0].groundingMetadata);
}

Dynamic Retrieval (SDK) - Gemini 1.5

动态检索(SDK)- Gemini 1.5

typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'Who won Euro 2024?',
  config: {
    tools: [
      {
        googleSearchRetrieval: {
          dynamicRetrievalConfig: {
            mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
            dynamicThreshold: 0.7 // Search only if confidence < 70%
          }
        }
      }
    ]
  }
});

console.log(response.text);

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (high confidence)');
}
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: '谁赢得了2024年欧洲杯?',
  config: {
    tools: [
      {
        googleSearchRetrieval: {
          dynamicRetrievalConfig: {
            mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
            dynamicThreshold: 0.7 // 仅当置信度<70%时执行搜索
          }
        }
      }
    ]
  }
});

console.log(response.text);

if (!response.candidates[0].groundingMetadata) {
  console.log('模型使用自身知识回答(置信度高)');
}

Grounding Metadata Structure

事实校验元数据结构

typescript
{
  groundingMetadata: {
    webSearchQueries: [
      "euro 2024 winner"
    ],
    searchEntryPoint: {
      renderedContent: "..." // HTML/CSS for displaying Search Suggestions
    },
    groundingChunks: [
      {
        web: {
          uri: "https://example.com/euro-2024-results",
          title: "UEFA Euro 2024 Final Results"
        }
      }
    ],
    groundingSupports: [
      {
        segment: {
          startIndex: 42,
          endIndex: 47,
          text: "Spain"
        },
        groundingChunkIndices: [0]
      }
    ]
  }
}
typescript
{
  groundingMetadata: {
    webSearchQueries: [
      "2024欧洲杯冠军"
    ],
    searchEntryPoint: {
      renderedContent: "..." // 用于展示搜索建议的HTML/CSS
    },
    groundingChunks: [
      {
        web: {
          uri: "https://example.com/euro-2024-results",
          title: "2024欧洲杯决赛结果"
        }
      }
    ],
    groundingSupports: [
      {
        segment: {
          startIndex: 42,
          endIndex: 47,
          text: "西班牙"
        },
        groundingChunkIndices: [0]
      }
    ]
  }
}

Chat with Grounding (SDK)

带事实校验的对话(SDK)

typescript
const chat = await ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

let response = await chat.sendMessage({ message: 'What are the latest developments in quantum computing?' });
console.log(response.text);

// Check grounding sources
if (response.candidates[0].groundingMetadata) {
  const chunks = response.candidates[0].groundingMetadata.groundingChunks || [];
  console.log(`Sources used: ${chunks.length}`);
  chunks.forEach(chunk => {
    console.log(`- ${chunk.web?.title}: ${chunk.web?.uri}`);
  });
}

// Follow-up still has grounding enabled
response = await chat.sendMessage({ message: 'Which company made the biggest breakthrough?' });
console.log(response.text);
typescript
const chat = await ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    tools: [{ googleSearch: {} }]
  }
});

let response = await chat.sendMessage({ message: '量子计算的最新进展是什么?' });
console.log(response.text);

// 检查事实校验来源
if (response.candidates[0].groundingMetadata) {
  const chunks = response.candidates[0].groundingMetadata.groundingChunks || [];
  console.log(`使用的来源数:${chunks.length}`);
  chunks.forEach(chunk => {
    console.log(`- ${chunk.web?.title}: ${chunk.web?.uri}`);
  });
}

// 跟进问题仍会启用事实校验
response = await chat.sendMessage({ message: '哪家公司取得了最大突破?' });
console.log(response.text);
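Beyond listing sources, the grounding metadata can be turned into inline citations for display. The sketch below assumes the `groundingSupports` / `groundingChunks` shape returned by current Gemini API responses; `addCitations` is an illustrative helper, and you should adapt the field access if your SDK version differs:

```typescript
// Hedged sketch: append [n](url) citation markers to grounded text,
// assuming the groundingSupports / groundingChunks metadata shape.

interface GroundingChunk { web?: { uri?: string; title?: string } }
interface GroundingSupport {
  segment?: { startIndex?: number; endIndex?: number; text?: string };
  groundingChunkIndices?: number[];
}
interface GroundingMetadata {
  groundingChunks?: GroundingChunk[];
  groundingSupports?: GroundingSupport[];
}

function addCitations(text: string, metadata: GroundingMetadata): string {
  const supports = metadata.groundingSupports ?? [];
  const chunks = metadata.groundingChunks ?? [];
  let result = text;
  // Insert from the end of the text so earlier indices stay valid.
  const sorted = [...supports].sort(
    (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0)
  );
  for (const support of sorted) {
    const end = support.segment?.endIndex;
    if (end === undefined) continue;
    const marks = (support.groundingChunkIndices ?? [])
      .map(i => chunks[i]?.web?.uri ? `[${i + 1}](${chunks[i].web!.uri})` : '')
      .join('');
    result = result.slice(0, end) + marks + result.slice(end);
  }
  return result;
}

// Usage sketch:
// const cited = addCitations(
//   response.text,
//   response.candidates[0].groundingMetadata
// );
```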

Combining Grounding with Function Calling

结合事实校验与函数调用

typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: 'Get current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather like in the city that won Euro 2024?',
  config: {
    tools: [
      { googleSearch: {} },
      { functionDeclarations: [weatherFunction] }
    ]
  }
});

// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in response
typescript
const weatherFunction = {
  name: 'get_current_weather',
  description: '获取指定地点的当前天气',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: { type: 'string', description: '城市名称' }
    },
    required: ['location']
  }
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '赢得2024年欧洲杯的城市现在天气怎么样?',
  config: {
    tools: [
      { googleSearch: {} },
      { functionDeclarations: [weatherFunction] }
    ]
  }
});

// 模型会:
// 1. 使用Google搜索找到2024欧洲杯冠军
// 2. 调用get_current_weather函数获取该城市天气
// 3. 将两者结果整合到响应中
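When the model decides to call `get_current_weather`, the response contains function-call parts that your code must execute locally and send back in the next turn. A minimal dispatch sketch — `runFunctionCalls` and the handler registry are illustrative names, not SDK APIs, while the `{ name, args }` call shape and the `functionResponse` part format follow the current API:

```typescript
// Hedged sketch: dispatch the model's function calls to local handlers
// and build the functionResponse parts for the next turn.

type Handler = (args: Record<string, unknown>) => Promise<unknown>;

interface FunctionCall { name: string; args: Record<string, unknown> }

async function runFunctionCalls(
  calls: FunctionCall[],
  handlers: Record<string, Handler>
) {
  return Promise.all(
    calls.map(async call => {
      const handler = handlers[call.name];
      const response = handler
        ? await handler(call.args)
        : { error: `Unknown function: ${call.name}` };
      // Shape expected back by the API: a functionResponse part.
      return { functionResponse: { name: call.name, response } };
    })
  );
}

// Usage sketch (response.functionCalls is provided by @google/genai):
//
// const parts = await runFunctionCalls(response.functionCalls ?? [], {
//   get_current_weather: async ({ location }) => ({ tempC: 21, location })
// });
// // Send `parts` back in the next generateContent / sendMessage turn.
```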

Checking if Grounding was Used

检查是否使用了事实校验

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is 2+2?', // Model knows this without search
  config: {
    tools: [{ googleSearch: {} }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('Model answered from its own knowledge (no search needed)');
} else {
  console.log('Search was performed');
}
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '2+2等于多少?', // 模型无需搜索即可回答
  config: {
    tools: [{ googleSearch: {} }]
  }
});

if (!response.candidates[0].groundingMetadata) {
  console.log('模型使用自身知识回答(无需搜索)');
} else {
  console.log('已执行搜索');
}

Key Points

关键点

When to Use Grounding:
  • Current events and news
  • Real-time data (stock prices, sports scores, weather)
  • Fact-checking and verification
  • Questions about recent developments
  • Information beyond model's training cutoff
When NOT to Use:
  • General knowledge questions
  • Mathematical calculations
  • Code generation
  • Creative writing
  • Tasks requiring internal reasoning only
Cost Considerations:
  • Grounding adds latency (search takes time)
  • Additional token costs for retrieved content
  • Use
    dynamicThreshold
    to control when searches happen (Gemini 1.5)
Important Notes:
  • Grounding requires Google Cloud project (not just API key)
  • Search results quality depends on query phrasing
  • Citations may not cover all facts in response
  • Search is performed automatically based on confidence
Gemini 2.5 vs 1.5:
  • Gemini 2.5: Use
    googleSearch
    (simple, recommended)
  • Gemini 1.5: Use
    googleSearchRetrieval
    with
    dynamicThreshold
Best Practices:
  • Always check
    groundingMetadata
    to see if search was used
  • Display citations to users for transparency
  • Use specific, well-phrased questions for better search results
  • Combine with function calling for hybrid workflows

何时使用事实校验
  • 当前事件与新闻
  • 实时数据(股票价格、体育比分、天气)
  • 事实核查与验证
  • 关于近期发展的问题
  • 超出模型训练截止日期的信息
何时不使用
  • 通用知识问题
  • 数学计算
  • 代码生成
  • 创意写作
  • 仅需内部推理的任务
成本考量
  • 事实校验会增加延迟(搜索需要时间)
  • 检索内容会产生额外令牌成本
  • 使用
    dynamicThreshold
    控制搜索时机(Gemini 1.5)
注意事项
  • 事实校验需要Google Cloud项目(不仅是API密钥)
  • 搜索结果质量取决于查询措辞
  • 引用可能无法覆盖响应中的所有事实
  • 搜索会根据置信度自动执行
Gemini 2.5 vs 1.5
  • Gemini 2.5:使用
    googleSearch
    (简单,推荐)
  • Gemini 1.5:使用
    googleSearchRetrieval
    并配置
    dynamicThreshold
最佳实践
  • 始终检查
    groundingMetadata
    以确认是否执行了搜索
  • 向用户展示引用以保证透明度
  • 使用具体、清晰的问题以获得更好的搜索结果
  • 结合函数调用实现混合工作流

Error Handling

错误处理

Common Errors

常见错误

1. Invalid API Key (401)

1. 无效API密钥(401)

typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
Solution: Verify
GEMINI_API_KEY
environment variable is set correctly.
typescript
{
  error: {
    code: 401,
    message: 'API key not valid. Please pass a valid API key.',
    status: 'UNAUTHENTICATED'
  }
}
解决方案:确认
GEMINI_API_KEY
环境变量已正确设置。

2. Rate Limit Exceeded (429)

2. 超出速率限制(429)

typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
Solution: Implement exponential backoff retry strategy.
typescript
{
  error: {
    code: 429,
    message: 'Resource has been exhausted (e.g. check quota).',
    status: 'RESOURCE_EXHAUSTED'
  }
}
解决方案:实现指数退避重试策略。

3. Model Not Found (404)

3. 模型未找到(404)

typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
Solution: Use correct model names:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite
typescript
{
  error: {
    code: 404,
    message: 'models/gemini-3.0-flash is not found',
    status: 'NOT_FOUND'
  }
}
解决方案:使用正确的模型名称:
gemini-2.5-pro
,
gemini-2.5-flash
,
gemini-2.5-flash-lite

4. Context Length Exceeded (400)

4. 超出上下文长度(400)

typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
typescript
{
  error: {
    code: 400,
    message: 'Request payload size exceeds the limit',
    status: 'INVALID_ARGUMENT'
  }
}
解决方案:减小输入大小。Gemini 2.5系列模型最多支持1,048,576输入令牌。
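Before sending large inputs, it is cheaper to reject obviously oversized payloads locally; for an exact count the SDK exposes `ai.models.countTokens`. The local estimator below uses a rough characters-per-token heuristic — an assumption for pre-screening, not an API guarantee:

```typescript
// Hedged sketch: rough local size check before calling the API.
// ~4 characters per token is a heuristic, not an exact tokenizer.

const MAX_INPUT_TOKENS = 1_048_576; // Gemini 2.5 input limit

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function assertFits(text: string, limit = MAX_INPUT_TOKENS): void {
  const estimate = estimateTokens(text);
  if (estimate > limit) {
    throw new Error(
      `Input likely exceeds limit: ~${estimate} tokens (max ${limit})`
    );
  }
}

// For an exact count, prefer the API itself:
// const { totalTokens } = await ai.models.countTokens({
//   model: 'gemini-2.5-flash',
//   contents: text
// });
```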

Exponential Backoff Pattern

指数退避模式

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}

typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1秒, 2秒, 4秒
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
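The fixed 1s/2s/4s schedule above can synchronize retries across many clients hammering the API at the same moments. A common refinement is "full jitter": pick a random delay up to the exponential cap. The generic wrapper below works with any async call; the option names are illustrative:

```typescript
// Hedged sketch: exponential backoff with full jitter for 429s.
// Works with any async function, e.g. () => ai.models.generateContent(req).

interface RetryOptions {
  maxRetries?: number;   // total attempts
  baseDelayMs?: number;  // cap for the first backoff
}

async function withRetry<T>(
  fn: () => Promise<T>,
  { maxRetries = 3, baseDelayMs = 1000 }: RetryOptions = {}
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const retryable = error?.status === 429;
      if (!retryable || attempt >= maxRetries - 1) throw error;
      // Full jitter: random delay in [0, baseDelayMs * 2^attempt).
      const cap = baseDelayMs * 2 ** attempt;
      const delay = Math.random() * cap;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch:
// const response = await withRetry(() =>
//   ai.models.generateContent({ model: 'gemini-2.5-flash', contents: prompt })
// );
```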

Rate Limits

速率限制

Free Tier (Gemini API)

免费版(Gemini API)

Rate limits vary by model:
Gemini 2.5 Pro:
  • Requests per minute: 5 RPM
  • Tokens per minute: 125,000 TPM
  • Requests per day: 100 RPD
Gemini 2.5 Flash:
  • Requests per minute: 10 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: 250 RPD
Gemini 2.5 Flash-Lite:
  • Requests per minute: 15 RPM
  • Tokens per minute: 250,000 TPM
  • Requests per day: 1,000 RPD
速率限制因模型而异:
Gemini 2.5 Pro
  • 每分钟请求数:5 RPM
  • 每分钟令牌数:125,000 TPM
  • 每天请求数:100 RPD
Gemini 2.5 Flash
  • 每分钟请求数:10 RPM
  • 每分钟令牌数:250,000 TPM
  • 每天请求数:250 RPD
Gemini 2.5 Flash-Lite
  • 每分钟请求数:15 RPM
  • 每分钟令牌数:250,000 TPM
  • 每天请求数:1,000 RPD
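To stay under a model's RPM cap proactively (rather than reacting to 429s), a small client-side sliding-window limiter helps. This is an illustrative utility, not part of the SDK; the window length is configurable so it can model any of the RPM figures above:

```typescript
// Hedged sketch: sliding-window request limiter for client-side RPM caps.
// Example: new SlidingWindowLimiter(10, 60_000) models Flash's free-tier 10 RPM.

class SlidingWindowLimiter {
  private timestamps: number[] = [];

  constructor(
    private readonly maxRequests: number,
    private readonly windowMs: number
  ) {}

  /** Returns true and records the request if under the cap, else false. */
  tryAcquire(now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Usage sketch:
// const limiter = new SlidingWindowLimiter(10, 60_000);
// if (limiter.tryAcquire()) {
//   await ai.models.generateContent({ model: 'gemini-2.5-flash', contents: prompt });
// } else {
//   // queue or delay the request
// }
```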

Paid Tier (Tier 1)

付费版(Tier 1)

Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
  • Requests per minute: 150 RPM
  • Tokens per minute: 2,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash:
  • Requests per minute: 1,000 RPM
  • Tokens per minute: 1,000,000 TPM
  • Requests per day: 10,000 RPD
Gemini 2.5 Flash-Lite:
  • Requests per minute: 4,000 RPM
  • Tokens per minute: 4,000,000 TPM
  • Requests per day: Not specified
需要将计费账户关联到您的Google Cloud项目。
Gemini 2.5 Pro
  • 每分钟请求数:150 RPM
  • 每分钟令牌数:2,000,000 TPM
  • 每天请求数:10,000 RPD
Gemini 2.5 Flash
  • 每分钟请求数:1,000 RPM
  • 每分钟令牌数:1,000,000 TPM
  • 每天请求数:10,000 RPD
Gemini 2.5 Flash-Lite
  • 每分钟请求数:4,000 RPM
  • 每分钟令牌数:4,000,000 TPM
  • 每天请求数:未指定

Higher Tiers (Tier 2 & 3)

更高等级(Tier 2 & 3)

Tier 2 (requires $250+ spending and 30-day wait):
  • Even higher limits available
Tier 3 (requires $1,000+ spending and 30-day wait):
  • Maximum limits available
Tips:
  • Implement rate limit handling with exponential backoff
  • Use batch processing for high-volume tasks
  • Monitor usage in Google AI Studio
  • Choose the right model based on your rate limit needs
  • Official rate limits: https://ai.google.dev/gemini-api/docs/rate-limits

Tier 2(需消费250美元以上并等待30天):
  • 可获得更高的限制
Tier 3(需消费1000美元以上并等待30天):
  • 可获得最高限制
提示
  • 实现带指数退避的速率限制处理
  • 对高吞吐量任务使用批量处理
  • 在Google AI Studio中监控使用情况
  • 根据速率限制需求选择合适的模型
  • 官方速率限制文档:https://ai.google.dev/gemini-api/docs/rate-limits

SDK Migration Guide

SDK迁移指南

From @google/generative-ai to @google/genai

从@google/generative-ai迁移到@google/genai

1. Update Package

1. 更新包

bash
# Remove deprecated SDK
npm uninstall @google/generative-ai

# Install current SDK
npm install @google/genai@1.27.0
bash
# 卸载已废弃的SDK
npm uninstall @google/generative-ai

# 安装当前SDK
npm install @google/genai@1.27.0

2. Update Imports

2. 更新导入

Old (DEPRECATED):
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
New (CURRENT):
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
旧版(已废弃)
typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
新版(当前)
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// 直接使用ai.models.generateContent()

3. Update API Calls

3. 更新API调用

Old:
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
New:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
旧版
typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
新版
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;

4. Update Streaming

4. 更新流式传输

Old:
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
New:
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
旧版
typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
新版
typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}

5. Update Chat

5. 更新对话

Old:
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
New:
typescript
const chat = await ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available

旧版
typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
新版
typescript
const chat = await ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// 直接使用response.text

Production Best Practices

生产环境最佳实践

1. Always Do

1. 必须遵守

✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to API (save costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model based on your needs (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro/flash/flash-lite
✅ 使用@google/genai(而非@google/generative-ai)
✅ 设置maxOutputTokens以避免过度生成
✅ 实现速率限制处理并使用指数退避
✅ 使用环境变量存储API密钥(绝不要硬编码)
✅ 在发送到API前验证输入(节省成本)
✅ 使用流式传输提升长响应的用户体验
✅ 根据需求选择合适的模型(Pro用于复杂推理,Flash用于平衡,Flash-Lite用于速度)
✅ 优雅处理错误(使用try-catch)
✅ 监控令牌使用以控制成本
✅ 使用正确的模型名称:gemini-2.5-pro/flash/flash-lite

2. Never Do

2. 绝对禁止

❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never use generic rate limits (each model has different limits - check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
❌ 绝不要使用@google/generative-ai(已废弃!)
❌ 绝不要在代码中硬编码API密钥
❌ 绝不要声称Gemini 2.5支持200万令牌(实际为1,048,576输入令牌)
❌ 绝不要在客户端代码中暴露API密钥
❌ 绝不要跳过错误处理(始终使用try-catch)
❌ 绝不要使用通用速率限制(每个模型的限制不同,请查看官方文档)
❌ 绝不要在未获得用户同意的情况下发送PII(个人身份信息)
❌ 绝不要信任未验证的用户输入
❌ 绝不要忽略速率限制(会收到429错误)
❌ 绝不要使用旧模型名称如gemini-1.5-pro(使用2.5系列模型)

3. Security

3. 安全

  • API Key Storage: Use environment variables or secret managers
  • Server-Side Only: Never expose API keys in browser JavaScript
  • Input Validation: Sanitize all user inputs before API calls
  • Rate Limiting: Implement your own rate limits to prevent abuse
  • Error Messages: Don't expose API keys or sensitive data in error logs
  • API密钥存储:使用环境变量或密钥管理器
  • 仅在服务端使用:绝不要在浏览器JavaScript中暴露API密钥
  • 输入验证:在调用API前清理所有用户输入
  • 速率限制:实现自己的速率限制以防止滥用
  • 错误消息:不要在错误日志中暴露API密钥或敏感数据

4. Cost Optimization

4. 成本优化

  • Choose Right Model: Use Flash for most tasks, Pro only when needed
  • Set Token Limits: Use maxOutputTokens to control costs
  • Batch Requests: Process multiple items efficiently
  • Cache Results: Store responses when appropriate
  • Monitor Usage: Track token consumption in Google Cloud Console
  • 选择合适的模型:大多数任务使用Flash,仅在必要时使用Pro
  • 设置令牌限制:使用maxOutputTokens控制成本
  • 批量处理:对高吞吐量任务使用批量处理
  • 缓存结果:在合适的场景下缓存响应
  • 监控使用情况:在Google Cloud控制台跟踪令牌消耗
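The "cache results" point above can be as simple as a TTL map keyed by model and prompt. This is an application-level cache, distinct from the API's server-side context caching feature; all names here are illustrative:

```typescript
// Hedged sketch: application-level TTL cache for identical prompts.
// Distinct from Gemini's server-side context caching feature.

class ResponseCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private readonly ttlMs: number) {}

  async getOrGenerate(
    key: string,
    generate: () => Promise<string>,
    now: number = Date.now()
  ): Promise<string> {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > now) return hit.value; // cache hit: no API call
    const value = await generate();
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}

// Usage sketch:
// const cache = new ResponseCache(5 * 60_000);
// const text = await cache.getOrGenerate(
//   `gemini-2.5-flash:${prompt}`,
//   async () => (await ai.models.generateContent({
//     model: 'gemini-2.5-flash', contents: prompt
//   })).text
// );
```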

5. Performance

5. 性能

  • Use Streaming: Better perceived latency for long responses
  • Parallel Requests: Use Promise.all() for independent calls
  • Edge Deployment: Deploy to Cloudflare Workers for low latency
  • Connection Pooling: Reuse HTTP connections when possible

  • 使用流式传输:长响应时提升感知延迟
  • 并行请求:对独立调用使用Promise.all()
  • 边缘部署:部署到Cloudflare Workers以降低延迟
  • 连接池:尽可能复用HTTP连接
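Plain `Promise.all()` fires every request at once, which can trip RPM limits; a bounded-concurrency map keeps the parallelism benefit while capping in-flight requests. An illustrative helper, not an SDK API:

```typescript
// Hedged sketch: Promise.all with a concurrency cap, so parallel
// generateContent calls don't exceed your rate limits.

async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker pulls the next unclaimed index until items run out.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}

// Usage sketch: at most 5 in-flight requests.
// const answers = await mapWithConcurrency(prompts, 5, async prompt =>
//   (await ai.models.generateContent({
//     model: 'gemini-2.5-flash', contents: prompt
//   })).text
// );
```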

Quick Reference

快速参考

Installation

安装

bash
npm install @google/genai@1.27.0
bash
npm install @google/genai@1.27.0

Environment

环境配置

bash
export GEMINI_API_KEY="..."
bash
export GEMINI_API_KEY="..."

Models (2025)

模型(2025)

  • gemini-2.5-pro
    (1,048,576 in / 65,536 out) - Best for complex reasoning
  • gemini-2.5-flash
    (1,048,576 in / 65,536 out) - Best price-performance balance
  • gemini-2.5-flash-lite
    (1,048,576 in / 65,536 out) - Fastest, most cost-effective
  • gemini-2.5-pro
    (1,048,576输入 / 65,536输出)- 最佳复杂推理模型
  • gemini-2.5-flash
    (1,048,576输入 / 65,536输出)- 最佳性价比模型
  • gemini-2.5-flash-lite
    (1,048,576输入 / 65,536输出)- 最快、最具成本效益的模型

Basic Generation

基础生成

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Your prompt here'
});
console.log(response.text);
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '你的提示词'
});
console.log(response.text);

Streaming

流式传输

typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}
typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}

Multimodal

多模态

typescript
contents: [
  {
    parts: [
      { text: 'What is this?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]
typescript
contents: [
  {
    parts: [
      { text: '这是什么?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]

Function Calling

函数调用

typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

Last Updated: 2025-10-25
Production Validated: All features tested with @google/genai@1.27.0
Phase: 2 Complete ✅ (All Core + Advanced Features)
typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}

最后更新:2025-10-25
生产环境验证:所有特性均已通过@google/genai@1.27.0测试
阶段:第二阶段已完成 ✅(所有核心+高级功能)