# Gemini API Development Skill

Source: Official Gemini API documentation scraped 2026-02-27
Coverage: All 81 documentation files

## Quick Start
Python:

```python
from google import genai  # CRITICAL: NOT google.generativeai

client = genai.Client()  # Uses GEMINI_API_KEY env var automatically
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Hello world"
)
print(response.text)
```

JavaScript:

```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({}); // Uses GEMINI_API_KEY env var
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Hello world"
});
console.log(response.text);
```

## CRITICAL GOTCHAS (Read First)
- **SDK import:** `from google import genai` — NOT `google.generativeai` (legacy)
- **Temperature:** Default is 1.0 for Gemini 3 — do NOT lower it; causes loops/degraded performance
- **Thinking params:** Gemini 3 uses `thinking_level` ("low"/"medium"/"high"); Gemini 2.5 uses `thinking_budget` (integer tokens)
- **Thought signatures:** Gemini 3 REQUIRES thought signatures echoed back during function calling or you get a 400 error. SDKs handle this automatically in chat mode.
- **API default:** SDK defaults to `v1beta`. Use `http_options={'api_version': 'v1alpha'}` for experimental features.
- **REST auth header:** `x-goog-api-key: $GEMINI_API_KEY` (not Authorization Bearer)
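The REST conventions above can be exercised with only the standard library; this sketch builds the request without sending it (uncomment the last lines with a real key):

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("GEMINI_API_KEY", "YOUR_KEY")
url = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-3-flash-preview:generateContent")
req = urllib.request.Request(
    url,
    data=json.dumps({"contents": [{"parts": [{"text": "Hello world"}]}]}).encode(),
    headers={"x-goog-api-key": API_KEY, "Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["candidates"][0]["content"]["parts"][0]["text"])
```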
## Models Reference

### Gemini 3 Series (Current)
| Model | String | Notes |
|---|---|---|
| Gemini 3.1 Pro Preview | `gemini-3.1-pro-preview` | Latest |
| Gemini 3 Flash Preview | `gemini-3-flash-preview` | Default workhorse; shutdown: no date |
| Gemini 3 Pro Image Preview | `gemini-3-pro-image-preview` | "Nano Banana Pro" — native image gen |
| Gemini 3.1 Flash Image Preview | `gemini-3.1-flash-image-preview` | "Nano Banana 2" — fast image gen |

⚠️ Gemini 3 Pro Preview (`gemini-3-pro-preview`) shuts down March 9, 2026 → migrate to `gemini-3.1-pro-preview`

### Gemini 2.5 Series (Stable)
| Model | String | Shutdown |
|---|---|---|
| Gemini 2.5 Pro | `gemini-2.5-pro` | June 17, 2026 |
| Gemini 2.5 Flash | `gemini-2.5-flash` | June 17, 2026 |
| Gemini 2.5 Flash Lite | `gemini-2.5-flash-lite` | July 22, 2026 |
| Gemini 2.5 Flash Image | `gemini-2.5-flash-image` | Oct 2, 2026 |
### Gemini 2.0 Series (Deprecating)

- `gemini-2.0-flash`, `gemini-2.0-flash-lite` → shutdown June 1, 2026
### Specialized Models

- TTS: `gemini-2.5-flash-preview-tts`
- Live API: `gemini-2.5-flash-native-audio-preview-12-2025`
- Computer Use: `gemini-2.5-computer-use-preview-10-2025`, `gemini-3-flash-preview`
- Deep Research: `deep-research-pro-preview-12-2025` (via Interactions API only)
- Embeddings: `gemini-embedding-001`
- Video (Veo): `veo-3.1-generate-preview`
- Images (Imagen): `imagen-4.0-generate-001`
- Music (Lyria): `models/lyria-realtime-exp`
- Robotics: `gemini-robotics-er-1.5-preview`
- LearnLM: experimental tutor model

### Latest Aliases

- `gemini-pro-latest` → `gemini-3-pro-preview`
- `gemini-flash-latest` → `gemini-3-flash-preview`
## Libraries & Installation

```bash
pip install google-genai        # Python
npm install @google/genai       # JavaScript
go get google.golang.org/genai  # Go
```

Java: `com.google.genai:google-genai:1.0.0`
C#: `dotnet add package Google.GenAI`

**OpenAI compatibility** (3-line change):

```python
from openai import OpenAI

client = OpenAI(
    api_key="GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)
response = client.chat.completions.create(model="gemini-3-flash-preview", messages=[...])
```

## Core Generation

### System Instructions
```python
from google.genai import types

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="User message",
    config=types.GenerateContentConfig(
        system_instruction="You are a helpful assistant.",
        temperature=1.0,
        max_output_tokens=1024,
    )
)
```

### Multi-turn Chat
```python
chat = client.chats.create(model="gemini-3-flash-preview")
response = chat.send_message("Hello")
print(response.text)
response2 = chat.send_message("Tell me more")
print(response2.text)
```

### Streaming
```python
for chunk in client.models.generate_content_stream(
    model="gemini-3-flash-preview",
    contents="Write a long story"
):
    print(chunk.text, end="")
```

### Token Counting
```python
# Before sending:
count = client.models.count_tokens(model="gemini-3-flash-preview", contents=prompt)
print(count.total_tokens)

# After generating:
print(response.usage_metadata)
```

Fields: `prompt_token_count`, `candidates_token_count`, `thoughts_token_count`, `total_token_count`

1 token ≈ 4 characters; 100 tokens ≈ 60-80 English words.
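For rough budgeting without a round-trip, the ≈4-characters rule can be wrapped in a tiny estimator; it is a heuristic only, and `count_tokens` remains authoritative:

```python
def rough_token_estimate(text: str) -> int:
    # Heuristic from the rule of thumb above: ~4 characters per token.
    return max(1, len(text) // 4)

print(rough_token_estimate("Hello world"))  # 2
```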
---

## Thinking (Reasoning)

### Gemini 3 — thinking_level

```python
config=types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(thinking_level="low")  # "low", "medium", "high"
)
```

### Gemini 2.5 — thinking_budget
```python
config=types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(thinking_budget=1024)  # token budget; 0=disabled
)
```

Thinking is enabled by default on 2.5 and 3 models — causes higher latency/tokens. Disable if optimizing for speed.
## Multimodal Input

### Images (Inline — under 20MB)
```python
with open('image.jpg', 'rb') as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type='image/jpeg'),
        "Caption this image."
    ]
)
```

### Images (URL fetch)
```python
import requests

image_bytes = requests.get("https://example.com/image.jpg").content
image = types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg")
```

### PDF Documents (Inline — under 50MB)
```python
import pathlib

filepath = pathlib.Path('file.pdf')
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[
        types.Part.from_bytes(data=filepath.read_bytes(), mime_type='application/pdf'),
        "Summarize this document"
    ]
)
```

### Audio (via Files API)
```python
myfile = client.files.upload(file="sample.mp3")
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=["Describe this audio", myfile]
)
```

Audio capabilities: transcription, translation, speaker diarization, emotion detection, timestamps.
For real-time audio → use Live API.

### Video (via Files API)
```python
myfile = client.files.upload(file="video.mp4")
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[myfile, "Summarize this video"]
)
```

### YouTube URLs
```python
# Include YouTube URL directly in contents
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=["https://www.youtube.com/watch?v=XXXXX", "Summarize this video"]
)
```

---

## File Input Methods Comparison
| Method | Max Size | Best For | Persistence |
|---|---|---|---|
| Inline data | 100MB (50MB PDF) | Small files, one-off | None |
| Files API | 2GB/file, 20GB/project | Large files, reuse | 48 hours |
| GCS URI | 2GB/file, unlimited storage | GCS files | 30 days (registration) |
| External URLs | 100MB | Public URLs, AWS/Azure/GCS | None (fetched per request) |
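As a convenience, the size limits above can be encoded in a small helper; `choose_input_method` is a hypothetical name, not an SDK function:

```python
INLINE_LIMIT = 100 * 1024**2      # 100MB inline
INLINE_PDF_LIMIT = 50 * 1024**2   # 50MB inline for PDFs
FILES_API_LIMIT = 2 * 1024**3     # 2GB per file (Files API and GCS URIs)

def choose_input_method(size_bytes: int, is_pdf: bool = False) -> str:
    # Pick an input method using the per-file limits from the table above.
    inline_cap = INLINE_PDF_LIMIT if is_pdf else INLINE_LIMIT
    if size_bytes <= inline_cap:
        return "inline"      # one-off, no persistence
    if size_bytes <= FILES_API_LIMIT:
        return "files_api"   # reusable for 48 hours
    raise ValueError("per-file limit is 2GB")
```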
## Files API

```python
# Upload
myfile = client.files.upload(file="path/to/file.pdf")
print(myfile.uri)  # use in requests

# List
for file in client.files.list():
    print(file.name)

# Delete
client.files.delete(name=myfile.name)
```

**Files API limits:** 2GB per file, 20GB per project, 48-hour TTL.

---

## Structured Output
```python
import json
from pydantic import BaseModel

class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Give me a chocolate cake recipe",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=Recipe,
    )
)
recipe = json.loads(response.text)
```
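With `response_mime_type="application/json"` the reply should already be bare JSON, but a defensive parser that also tolerates markdown-fenced output costs little (a hypothetical utility, not SDK behavior):

```python
import json

def parse_structured(text: str) -> dict:
    # Strip markdown fences defensively before parsing.
    cleaned = text.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.strip("`")
        cleaned = cleaned[cleaned.find("{"):cleaned.rfind("}") + 1]
    return json.loads(cleaned)

print(parse_structured('```json\n{"name": "cake"}\n```')["name"])  # cake
```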
## Tools: Built-in
### Google Search (Grounding)
```python
grounding_tool = types.Tool(google_search=types.GoogleSearch())
config = types.GenerateContentConfig(tools=[grounding_tool])
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Who won Euro 2024?",
    config=config
)
# Check response.candidates[0].grounding_metadata for citations
```

Response includes `groundingMetadata` with `webSearchQueries`, `groundingChunks`, `groundingSupports`.
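A sketch of turning that metadata into a citation list; it assumes the REST-style dict shape (`groundingChunks[].web.uri`/`title`), so verify field names against the grounding docs:

```python
def extract_citations(grounding_metadata: dict) -> list[str]:
    # Collect "title: uri" strings from each grounding chunk.
    citations = []
    for chunk in grounding_metadata.get("groundingChunks", []):
        web = chunk.get("web", {})
        if web.get("uri"):
            citations.append(f'{web.get("title", "untitled")}: {web["uri"]}')
    return citations

sample = {
    "webSearchQueries": ["euro 2024 winner"],
    "groundingChunks": [
        {"web": {"uri": "https://example.com/euro2024", "title": "Euro 2024"}},
    ],
}
print(extract_citations(sample))  # ['Euro 2024: https://example.com/euro2024']
```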
### URL Context
```python
config = types.GenerateContentConfig(tools=[{"url_context": {}}])
# Include URLs in the prompt text
```

### Google Maps (NOT available with Gemini 3)

```python
# Only for Gemini 2.5 models
config = types.GenerateContentConfig(
    tools=[types.Tool(google_maps=types.GoogleMaps())],
    tool_config=types.ToolConfig(retrieval_config=types.RetrievalConfig(
        lat_lng=types.LatLng(latitude=34.05, longitude=-118.25)
    ))
)
```

### Code Execution
```python
config = types.GenerateContentConfig(
    tools=[types.Tool(code_execution=types.CodeExecution())]
)
```

## Function Calling

```python
def get_weather(location: str) -> dict:
    return {"temp": 72, "condition": "sunny"}

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="What's the weather in NYC?",
    config=types.GenerateContentConfig(
        tools=[get_weather],
        tool_config=types.ToolConfig(
            function_calling_config=types.FunctionCallingConfig(mode="AUTO")
        )
    )
)

# Check for function calls
for part in response.candidates[0].content.parts:
    if part.function_call:
        result = get_weather(**part.function_call.args)
        # Send result back...
```
**⚠️ Gemini 3 Thought Signatures in Function Calling:**

When Gemini 3 returns function calls, each step includes a `thoughtSignature`. You MUST echo it back exactly — omitting it causes a 400 error. **The SDK handles this automatically if you use the chat API or append the full response object to history.**

Manual handling pattern:

```python
# After getting FC response, include the full model turn (with signatures) in next request
contents = [
    {"role": "user", "parts": [{"text": "original request"}]},
    model_response.candidates[0].content,  # includes thoughtSignature
    {"role": "user", "parts": [{"function_response": {"name": "fn", "response": result}}]}
]
```
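The check-and-respond loop above generalizes into a small dispatcher. This is a sketch, not SDK code; `dispatch_tools` and the stand-in part object are hypothetical:

```python
from types import SimpleNamespace

def dispatch_tools(parts, registry):
    # Run every function call in a model turn and build the
    # function_response parts to echo back alongside the model turn.
    responses = []
    for part in parts:
        call = getattr(part, "function_call", None)
        if call is None:
            continue
        result = registry[call.name](**dict(call.args))
        responses.append({"function_response": {"name": call.name, "response": result}})
    return responses

# Stand-in for an SDK part, for illustration only:
fake_part = SimpleNamespace(function_call=SimpleNamespace(
    name="get_weather", args={"location": "NYC"}))
print(dispatch_tools([fake_part], {"get_weather": lambda location: {"temp": 72}}))
# [{'function_response': {'name': 'get_weather', 'response': {'temp': 72}}}]
```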
---

## Embeddings
```python
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?"
)
print(result.embeddings)
```

Batch:

```python
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents=["text 1", "text 2", "text 3"]
)
```

Model: `gemini-embedding-001` (GA until July 14, 2026)
Use case: semantic search, RAG, classification, clustering.
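The semantic-search use case reduces to cosine similarity over the returned vectors. A stdlib-only sketch, with toy vectors standing in for `result.embeddings` values:

```python
import math

def cosine(a, b):
    # Cosine similarity of two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors standing in for gemini-embedding-001 outputs:
docs = {"doc1": [1.0, 0.0, 0.2], "doc2": [0.1, 0.9, 0.0]}
query = [0.9, 0.1, 0.1]
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # doc1
```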
---

## File Search (RAG)
Managed RAG — free file storage and free embedding generation at query time. Pay only for initial indexing + model tokens.

```python
import time

# Create store
file_search_store = client.file_search_stores.create(
    config={'display_name': 'my-store'}
)

# Upload directly
operation = client.file_search_stores.upload_to_file_search_store(
    file='document.pdf',
    file_search_store_name=file_search_store.name,
    config={'display_name': 'My Doc'}
)
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

# Query
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="What does the document say about X?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(
            file_search=types.FileSearch(
                file_search_store_names=[file_search_store.name]
            )
        )]
    )
)
```

---

## Context Caching
Reduces cost by caching repeated large contexts. Paid tier only.

```python
from google.genai import types

# Create cache
cache = client.caches.create(
    model="gemini-3-flash-preview",
    config=types.CreateCachedContentConfig(
        contents=[large_document_content],
        system_instruction="You are an expert analyst.",
        ttl="3600s",  # 1 hour
        display_name="my-cache"
    )
)

# Use cache
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="What are the key findings?",
    config=types.GenerateContentConfig(cached_content=cache.name)
)
print(response.usage_metadata.cached_content_token_count)
```

Implicit caching: 2048+ token prefix is automatically cached at 75% discount.
Explicit caching: manual TTL control.
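Back-of-envelope math for the implicit-cache discount above (a sketch of the 75%-off rule, not official billing logic):

```python
def effective_prompt_tokens(prefix_tokens, suffix_tokens, cached=True):
    # Cached prefix tokens bill at 25% of normal price (75% discount).
    return prefix_tokens * (0.25 if cached else 1.0) + suffix_tokens

print(effective_prompt_tokens(4000, 500))                # 1500.0
print(effective_prompt_tokens(4000, 500, cached=False))  # 4500.0
```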
---

## Batch API
50% cost reduction for non-urgent workloads. 24-hour SLO.

```python
import time

# Create batch job (see batch-api.md for full syntax)
batch_job = client.batches.create(
    model="gemini-3-flash-preview",
    src="gs://bucket/requests.jsonl",
    config=types.CreateBatchJobConfig(dest="gs://bucket/responses/")
)

# Poll
while batch_job.state not in ["JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED"]:
    time.sleep(30)
    batch_job = client.batches.get(name=batch_job.name)
```
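A sketch of producing the `requests.jsonl` input; the per-line shape (a caller-chosen `key` plus a `request` mirroring a generateContent body) is an assumption to verify against batch-api.md:

```python
import json

prompts = ["Summarize doc A", "Summarize doc B"]
with open("requests.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        line = {
            "key": f"request-{i}",  # your own correlation id
            "request": {"contents": [{"parts": [{"text": prompt}]}]},
        }
        f.write(json.dumps(line) + "\n")
```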
---

## Interactions API
Used for agents (Deep Research). Not accessible via `generate_content`.

```python
import time

interaction = client.interactions.create(
    input="Research the history of quantum computing",
    agent='deep-research-pro-preview-12-2025',
    background=True  # REQUIRED for long tasks
)
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        break
    time.sleep(10)
```

For combining with your own data:

```python
interaction = client.interactions.create(
    input="Compare our Q4 report to public benchmarks",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    tools=[{"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store"]}]
)
```

## Live API (Real-time Voice/Video)
For interactive, streaming audio/video sessions via WebSocket.

```python
import asyncio
from google import genai

client = genai.Client()
model = "gemini-2.5-flash-native-audio-preview-12-2025"

async def main():
    async with client.aio.live.connect(
        model=model,
        config={"response_modalities": ["AUDIO"]}
    ) as session:
        await session.send_client_content(
            turns="Hello, how are you?",
            turn_complete=True
        )
        async for response in session.receive():
            if response.data:
                # raw PCM audio bytes (24kHz, 16-bit, little-endian)
                pass
            if response.server_content and response.server_content.turn_complete:
                break

asyncio.run(main())
```

Audio format: raw PCM, little-endian, 16-bit. Output: 24kHz. Input: natively 16kHz (resampled if different).
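Those raw PCM bytes can be written to a playable WAV with the stdlib `wave` module; `save_pcm_as_wav` is a hypothetical helper, with `chunks` standing in for bytes collected from `response.data`:

```python
import wave

def save_pcm_as_wav(chunks, path, rate=24000):
    # Live API output: raw PCM, 16-bit little-endian, mono, 24kHz.
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 16-bit
        wf.setframerate(rate)
        wf.writeframes(b"".join(chunks))

save_pcm_as_wav([b"\x00\x00" * 240], "live_out.wav")  # 240 frames of silence
```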
Session limits:
- Audio-only: 15 min (without compression)
- Audio+video: 2 min
- Connection: ~10 min → use Session Resumption

Session Resumption:

```python
config=types.LiveConnectConfig(
    session_resumption=types.SessionResumptionConfig(handle=previous_handle)
)
# Save new handle from session_resumption_update messages
```
**Context window compression (for long sessions):**

```python
config=types.LiveConnectConfig(
    context_window_compression=types.ContextWindowCompressionConfig(
        sliding_window=types.SlidingWindow()
    )
)
```

Tools in Live API: Google Search ✅, Function calling ✅, Google Maps ❌, Code execution ❌, URL context ❌

Live API function calling (manual tool response required):

```python
# After receiving tool_call in response:
await session.send_tool_response(function_responses=[
    types.FunctionResponse(id=fc.id, name=fc.name, response={"result": "ok"})
    for fc in response.tool_call.function_calls
])
```

---

## Ephemeral Tokens (Live API Security)
For client-side Live API connections. Short-lived tokens that expire, reducing risk vs. exposing API keys.

```python
import datetime

now = datetime.datetime.now(tz=datetime.timezone.utc)
client = genai.Client(http_options={'api_version': 'v1alpha'})
token = client.auth_tokens.create(config={
    'uses': 1,
    'expire_time': now + datetime.timedelta(minutes=30),
    'new_session_expire_time': now + datetime.timedelta(minutes=1),
    'http_options': {'api_version': 'v1alpha'},
})
# Send token.name to client; use as API key for Live API only
```
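The two expiry fields behave differently: `expire_time` bounds the whole token, while `new_session_expire_time` bounds when a new session may start. A stdlib illustration of the windows configured above:

```python
import datetime

now = datetime.datetime.now(tz=datetime.timezone.utc)
expire_time = now + datetime.timedelta(minutes=30)
new_session_expire_time = now + datetime.timedelta(minutes=1)

# A connection attempt 5 minutes from now: token still valid overall,
# but too late to start a NEW session.
attempt = now + datetime.timedelta(minutes=5)
print(attempt < expire_time)              # True
print(attempt < new_session_expire_time)  # False
```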
Can lock token to specific config:

```python
'live_connect_constraints': {
    'model': 'gemini-2.5-flash-native-audio-preview-12-2025',
    'config': {'response_modalities': ['AUDIO']}
}
```

## Image Generation (Nano Banana)
- Nano Banana 2 = `gemini-3.1-flash-image-preview` — fast/high-volume
- Nano Banana Pro = `gemini-3-pro-image-preview` — pro quality, thinking
- Nano Banana = `gemini-2.5-flash-image` — speed/efficiency

```python
from PIL import Image

response = client.models.generate_content(
    model="gemini-3.1-flash-image-preview",
    contents="Create a picture of a tropical beach at sunset"
)
for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.png")
```

All generated images include SynthID watermark.
## Video Generation (Veo 3.1)
```python
import time

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",
    prompt="A serene mountain lake at dawn"
)
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("output.mp4")
```

Capabilities: 8-second 720p/1080p/4K, portrait (9:16) or landscape (16:9), audio, video extension, first/last frame specification, up to 3 reference images.

## Image Generation (Imagen — Standalone)
```python
response = client.models.generate_images(
    model='imagen-4.0-generate-001',
    prompt='Robot holding a red skateboard',
    config=types.GenerateImagesConfig(number_of_images=4)
)
for gen_image in response.generated_images:
    gen_image.image.show()
```

## TTS (Text-to-Speech)
```python
import wave

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",
    contents="Say cheerfully: Have a wonderful day!",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name='Kore')
            )
        )
    )
)
data = response.candidates[0].content.parts[0].inline_data.data
with wave.open("out.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(24000)
    wf.writeframes(data)
```

Multi-speaker TTS (up to 2 speakers):

```python
config=types.GenerateContentConfig(
    response_modalities=["AUDIO"],
    speech_config=types.SpeechConfig(
        multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
            speaker_voice_configs=[
                types.SpeakerVoiceConfig(
                    speaker='Joe',
                    voice_config=types.VoiceConfig(
                        prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name='Kore')
                    )
                ),
                types.SpeakerVoiceConfig(
                    speaker='Jane',
                    voice_config=types.VoiceConfig(
                        prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name='Aoede')
                    )
                ),
            ]
        )
    )
)
```

Prompt must name speakers matching the config: "TTS the conversation between Joe and Jane: Joe: ... Jane: ..."
Music Generation (Lyria RealTime)
Experimental. Real-time streaming music via WebSocket.
```python
client = genai.Client(http_options={'api_version': 'v1alpha'})
async with client.aio.live.music.connect(model='models/lyria-realtime-exp') as session:
    await session.set_weighted_prompts(prompts=[
        types.WeightedPrompt(text='minimal techno', weight=1.0)
    ])
    await session.set_music_generation_config(
        config=types.LiveMusicGenerationConfig(bpm=90, temperature=1.0)
    )
    await session.play()
    # Receive audio chunks
    async for message in session.receive():
        audio_data = message.server_content.audio_chunks[0].data
        # process PCM audio...
```

Control: `session.play()`, `session.pause()`, `session.stop()`, `session.reset_context()`
Steer by sending new weighted prompts mid-stream. Reset context after BPM/scale changes.
Computer Use
Browser automation agent. Requires Playwright or similar for action execution.
```python
config = genai.types.GenerateContentConfig(
    tools=[types.Tool(
        computer_use=types.ComputerUse(
            environment=types.Environment.ENVIRONMENT_BROWSER,
            excluded_predefined_functions=["drag_and_drop"]  # optional
        )
    )]
)
response = client.models.generate_content(
    model='gemini-2.5-computer-use-preview-10-2025',
    contents=[{"role": "user", "parts": [{"text": "Search Amazon for wireless headphones"}]}],
    config=config
)
```
Model returns function_calls with actions like `type_text_at`, `click_at`
Check response.candidates[0].content.parts for function_call items
Coordinates are normalized 0-999; convert to actual pixels
Recommended screen: 1440x900
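The normalized-to-pixel conversion can be sketched as below. This is an assumption-laden sketch: the linear mapping and treating 999 as the last addressable coordinate are inferred from the note above, so verify against the official action-handling docs before relying on it.

```python
def denormalize(x_norm: int, y_norm: int,
                screen_w: int = 1440, screen_h: int = 900) -> tuple[int, int]:
    """Map model coordinates (0-999 on each axis) to actual screen pixels.

    Assumes a linear mapping where 0 is the first pixel and 999 the last;
    defaults use the recommended 1440x900 screen.
    """
    x_px = round(x_norm / 999 * (screen_w - 1))
    y_px = round(y_norm / 999 * (screen_h - 1))
    return x_px, y_px
```

Feed the resulting pixel coordinates to your executor (e.g. Playwright's `page.mouse.click`).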
**Safety:** Check `safety_decision` in response — `require_confirmation` means pause before executing.
---
Safety Settings
```python
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Your prompt",
    config=types.GenerateContentConfig(
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
            )
        ]
    )
)
```

Threshold options: `OFF`, `BLOCK_NONE`, `BLOCK_ONLY_HIGH`, `BLOCK_MEDIUM_AND_ABOVE`, `BLOCK_LOW_AND_ABOVE`
Default for Gemini 2.5/3: All filters `OFF` by default.
Categories: Harassment, Hate speech, Sexually explicit, Dangerous
Built-in protections (cannot be disabled): Child safety, etc.
Check blocked response: `response.candidates[0].finish_reason == "SAFETY"` → inspect `safety_ratings`.
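A small guard for detecting blocked responses might look like this. It is a sketch that compares the string form of `finish_reason`, since the exact enum type varies across SDK versions; the demo object below is a stand-in, not a real API response.

```python
from types import SimpleNamespace

def safety_block_info(response):
    """Return (blocked, ratings): whether the first candidate was stopped by
    the safety filter, plus its safety_ratings for inspection.

    str() handles both plain strings ("SAFETY") and enum values whose string
    form ends in "SAFETY" (e.g. FinishReason.SAFETY)."""
    cand = response.candidates[0]
    blocked = str(getattr(cand, "finish_reason", "")).endswith("SAFETY")
    return blocked, getattr(cand, "safety_ratings", None)

# Demo with a stand-in response object (real responses come from generate_content)
demo = SimpleNamespace(candidates=[SimpleNamespace(
    finish_reason="SAFETY",
    safety_ratings=[{"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "LOW"}],
)])
blocked, ratings = safety_block_info(demo)
```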
Media Resolution
Control token usage for images/videos/PDFs:
Global (all models):
```python
config = types.GenerateContentConfig(
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_HIGH  # LOW, MEDIUM, HIGH
)
```

Per-part (Gemini 3 only, experimental):
```python
client = genai.Client(http_options={'api_version': 'v1alpha'})
image_part = types.Part.from_bytes(
    data=image_bytes, mime_type='image/jpeg',
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_HIGH
)
```
Long Context
Most Gemini models support 1M+ token context windows.
1M tokens ≈ 50K lines of code, 8 novels, 200 podcast transcripts.
Optimization: Use context caching when reusing large contexts — 4x cheaper (Flash) + lower latency.
Best practice: Put your query at the END of the prompt (after all context material).
Multi-needle limitation: Model performs ~99% on single retrieval but degrades with many simultaneous retrievals.
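The query-at-the-end practice can be captured in a small request builder (a hypothetical helper using the plain-dict `contents` form, not an SDK API):

```python
def build_long_context_request(documents: list[str], query: str) -> list[dict]:
    """Assemble contents with the bulky context material first and the
    query as the final part, per the long-context best practice."""
    parts = [{"text": doc} for doc in documents]
    parts.append({"text": query})  # query goes at the END
    return [{"role": "user", "parts": parts}]

contents = build_long_context_request(
    ["<transcript 1>", "<transcript 2>"],
    "Which transcript mentions pricing?",
)
```

Pass the result directly as the `contents` argument to `generate_content`.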
API Versions
| Version | Use | Default? |
|---|---|---|
| `v1` | Stable, production | No |
| `v1beta` | New features, may change | Yes (SDK default) |
| `v1alpha` | Experimental only | No |

```python
client = genai.Client(http_options={'api_version': 'v1'})  # force stable
```
Authentication
API Key (default):

```bash
export GEMINI_API_KEY=your_key_here
```

REST header: `x-goog-api-key: $GEMINI_API_KEY`

OAuth (for production with stricter controls):
- Enable Generative Language API in Cloud console
- Configure OAuth consent screen
- Create OAuth 2.0 Client ID
- Use application-default-credentials
Rate Limits & Billing
Tiers: Free Tier → Paid Tier (pay-as-you-go)
Upgrade: AI Studio → API Keys → Set up Billing
Paid tier benefits: Higher rate limits, advanced models, data not used for training.
Rate limit headers: Check `x-goog-quota-*` headers in responses.
Error 429: Rate limit exceeded → implement exponential backoff or request quota increase.
Common Error Codes
| HTTP | Status | Cause | Fix |
|---|---|---|---|
| 400 | INVALID_ARGUMENT | Malformed request | Check API reference |
| 400 | Missing thought_signature | Gemini 3 FC without signature | Use SDK chat or echo signatures |
| 403 | PERMISSION_DENIED | Wrong API key | Check key permissions |
| 429 | RESOURCE_EXHAUSTED | Rate limit hit | Backoff, upgrade tier |
| 500 | INTERNAL | Context too long / server error | Reduce context, retry |
| 503 | UNAVAILABLE | Overloaded | Retry or switch model |
| 504 | DEADLINE_EXCEEDED | Request too large | Increase timeout |
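The retryable rows above (429, 500, 503) are usually handled with an exponential-backoff wrapper. A minimal sketch: `TransientApiError` is a hypothetical stand-in for whatever exception your SDK raises with an HTTP status code attached, so adapt the `except` clause to your client library.

```python
import random
import time

class TransientApiError(Exception):
    """Stand-in for an SDK error that carries an HTTP status code."""
    def __init__(self, code: int):
        super().__init__(f"HTTP {code}")
        self.code = code

def with_backoff(fn, retries=5, base=0.5, retryable=(429, 500, 503), sleep=time.sleep):
    """Call fn(), retrying retryable HTTP codes with exponential backoff + jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except TransientApiError as exc:
            if exc.code not in retryable or attempt == retries - 1:
                raise
            # 0.5s, 1s, 2s, ... plus jitter to avoid synchronized retries
            sleep(base * (2 ** attempt) + random.uniform(0, 0.25))
```

Usage: `with_backoff(lambda: client.models.generate_content(model=..., contents=...))`.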
Framework Integrations
CrewAI
```python
from crewai import LLM

gemini_llm = LLM(model='gemini/gemini-3-flash-preview', api_key=api_key, temperature=1.0)
```

LangGraph
```python
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-3-flash-preview")
```

LlamaIndex
```python
from llama_index.llms.google_genai import GoogleGenAI

llm = GoogleGenAI(model="gemini-3-flash-preview")
```

Vercel AI SDK
```bash
npm install ai @ai-sdk/google
```

See Also
For detailed reference on specific topics:
- Function calling deep-dive → `references/tools-and-agents.md`
- Files API & multimodal → `references/files-and-media.md`
- Caching, Batch, Live deep-dive → `references/advanced-features.md`
- Embeddings & RAG → `references/embeddings-and-rag.md`
- Image/Video generation → `references/generation.md`