# gemini-sdk-expert
🤖 Skill: gemini-sdk-expert (v1.3.0)
## Executive Summary

Expert guidance for the `@google/genai` SDK: structured output, function calling, context caching, and multimodal reasoning.

## 📋 Table of Contents

- 🚀 Core Capabilities
- 🚫 The "Do Not" List (Anti-Patterns)
- ⚡ Quick Start: JSON Enforcement
- 🛠 Standard Production Patterns
- 🧩 Advanced Agentic Patterns
- 💾 Context Caching Strategy
- 📸 Multimodal Integration
- 📖 Reference Library
## 🚀 Core Capabilities
- Strict Structured Output: Leveraging `responseSchema` for 100% reliable JSON generation.
- Agentic Function Calling: Enabling models to interact with private APIs and tools.
- Long-Form Context Management: Using Context Caching for massive datasets (2M+ tokens).
- Native Multimodal Reasoning: Processing video, audio, and documents as first-class inputs.
- Latency Optimization: Strategic model selection (Flash vs. Pro) and streaming responses.
## 🚫 The "Do Not" List (Anti-Patterns)
| Anti-Pattern | Why it fails in 2026 | Modern Alternative |
|---|---|---|
| Regex Parsing | Fragile and prone to hallucination. | Use `responseSchema`. |
| Old SDK (`@google/generative-ai`) | Outdated, lacks 2026 features. | Use `@google/genai` only. |
| Uncached Large Contexts | Extremely expensive and slow. | Use Context Caching for repetitive queries. |
| Hardcoded API Keys | Security risk. | Use secure environment variables. |
| Single-Model Bias | Pro is overkill for simple extraction. | Use Gemini 3 Flash for speed/cost tasks. |
## ⚡ Quick Start: JSON Enforcement
The #1 rule in 2026: Structure at the Source.
```typescript
import { GoogleGenAI, Type } from "@google/genai";

// Optional: Set API Version via env
// process.env.GOOGLE_GENAI_API_VERSION = "v1beta1";

// Read the key from the environment (never hardcode it).
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const schema = {
  type: Type.OBJECT,
  properties: {
    status: { type: Type.STRING, enum: ["COMPLETE", "PENDING", "ERROR"] },
    summary: { type: Type.STRING },
    priority: { type: Type.NUMBER }
  },
  required: ["status", "summary"]
};

// Always set the response MIME type to application/json
const result = await ai.models.generateContent({
  model: "gemini-3-flash",
  contents: [{ role: "user", parts: [{ text: "Evaluate task X..." }] }],
  config: {
    responseMimeType: "application/json",
    responseSchema: schema
  }
});
```
## 🛠 Standard Production Patterns
### Pattern A: The Data Extractor (Flash)

Best for processing thousands of documents quickly and cheaply.

- Model: `gemini-3-flash`
- Config: High `topP`, low `temperature` for deterministic extraction.
### Pattern B: The Complex Reasoner (Pro)

Best for architectural decisions, coding assistance, and deep media analysis.

- Model: `gemini-3-pro`
- Config: Enable Strict Mode in schemas for 100% adherence.
## 🧩 Advanced Agentic Patterns

### Parallel Function Calling
Reduce round-trips by allowing the model to call multiple tools at once.
See References: Function Calling for implementation.
### Semantic Caching
Store and retrieve embeddings of common queries to bypass the LLM for identical requests.
## 💾 Context Caching Strategy
In 2026, we don't re-upload. We cache.

- Warm-up Phase: Initial context upload.
- Persistence Phase: Referencing the cache via `cachedContent`.
- Cleanup Phase: Managing TTLs to optimize storage costs.

See References: Context Caching for more.
## 📸 Multimodal Integration
Gemini 3 understands the world visually and audibly.
- Video: Scene detection and temporal reasoning.
- Audio: Sentiment, tone, and environment detection.
- Document: Visual layout and OCR.
See References: Multimodal Mastery for details.
## 📖 Reference Library
Detailed deep-dives into Gemini SDK excellence:
- Structured Output: Nested schemas and validation.
- Function Calling: Tools, execution loops, and security.
- Context Caching: Reducing cost and latency.
- Multimodal 2026: Video, audio, and PDF mastery.
Updated: January 31, 2026 - 10:45