# Google Gemini Embeddings

Complete production-ready guide to the Google Gemini embeddings API. This skill provides comprehensive coverage of the `gemini-embedding-001` model for generating text embeddings, including SDK usage, REST API patterns, batch processing, RAG integration with Cloudflare Vectorize, and advanced use cases such as semantic search and document clustering.

## Table of Contents

1. Quick Start
2. gemini-embedding-001 Model
3. Basic Embeddings
4. Batch Embeddings
5. Task Types
6. RAG Patterns
7. Error Handling

## 1. Quick Start

### Installation

Install the Google Generative AI SDK:

```bash
npm install @google/genai@^1.37.0
```

For TypeScript projects:

```bash
npm install -D typescript@^5.0.0
```

### Environment Setup

Set your Gemini API key as an environment variable:

```bash
export GEMINI_API_KEY="your-api-key-here"
```

Get your API key from https://aistudio.google.com/apikey.

### First Embedding Example

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'What is the meaning of life?',
  config: {
    taskType: 'RETRIEVAL_QUERY',
    outputDimensionality: 768
  }
});

console.log(response.embedding.values); // [0.012, -0.034, ...]
console.log(response.embedding.values.length); // 768
```

Result: A 768-dimension embedding vector representing the semantic meaning of the text.

## 2. gemini-embedding-001 Model

### Model Specifications

**Current Model:** `gemini-embedding-001` (stable, production-ready)

- Status: Stable
- Experimental: `gemini-embedding-exp-03-07` (deprecated October 2025, do not use)

### Dimensions

The model supports flexible output dimensionality using Matryoshka Representation Learning:

| Dimension | Use Case | Storage | Performance |
|---|---|---|---|
| 768 | Recommended for most use cases | Low | Fast |
| 1536 | Balance between accuracy and efficiency | Medium | Medium |
| 3072 | Maximum accuracy (default) | High | Slower |
| 128-3071 | Custom (any value in range) | Variable | Variable |

**Default:** 3072 dimensions. **Recommended:** 768, 1536, or 3072 for optimal performance.
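Because the model uses Matryoshka Representation Learning, a full 3072-dimension embedding can also be shortened client-side by truncating to a prefix and renormalizing. A minimal sketch (the helper name and the mock vector are illustrative, not part of the SDK):

```typescript
// Truncate a full-size embedding to a smaller Matryoshka prefix and
// renormalize the result to unit length.
function truncateEmbedding(vector: number[], dims: number): number[] {
  const prefix = vector.slice(0, dims);
  const magnitude = Math.sqrt(prefix.reduce((sum, v) => sum + v * v, 0));
  return prefix.map(v => v / magnitude);
}

// Example: shrink a mock 3072-dimension vector down to 768 dimensions.
const full = Array.from({ length: 3072 }, (_, i) => Math.sin(i + 1));
const small = truncateEmbedding(full, 768);
console.log(small.length); // 768
```

Truncating locally avoids a second API call when the same text needs to be stored at several dimensionalities.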

### Context Window

- Input Limit: 2,048 tokens per text
- Input Type: Text only (no images, audio, or video)

### Rate Limits

| Tier | RPM | TPM | RPD | Requirements |
|---|---|---|---|---|
| Free | 100 | 30,000 | 1,000 | No billing account |
| Tier 1 | 3,000 | 1,000,000 | - | Billing account linked |
| Tier 2 | 5,000 | 5,000,000 | - | $250+ spending, 30-day wait |
| Tier 3 | 10,000 | 10,000,000 | - | $1,000+ spending, 30-day wait |

RPM = Requests Per Minute; TPM = Tokens Per Minute; RPD = Requests Per Day.
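On the client side, the RPM caps above can be respected with a small sliding-window pacer. This sketch (function and variable names are illustrative) computes how long to wait before the next request is allowed:

```typescript
// Sliding-window pacing helper: given recent request timestamps (ms) and an
// RPM cap, return how many ms to wait before the next request is allowed.
function msUntilNextSlot(timestamps: number[], rpm: number, now: number): number {
  const windowStart = now - 60_000;
  const recent = timestamps.filter(t => t > windowStart);
  if (recent.length < rpm) return 0;          // under the cap: go now
  const oldest = Math.min(...recent);
  return oldest + 60_000 - now;               // wait until the oldest request ages out
}

// Free tier: 100 RPM. With 100 calls in the last minute, we must wait.
const now = 120_000;
const stamps = Array.from({ length: 100 }, (_, i) => now - 59_000 + i * 10);
console.log(msUntilNextSlot(stamps, 100, now)); // 1000
```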

### Output Format

```typescript
{
  embedding: {
    values: number[] // Array of floating-point numbers
  }
}
```

## 3. Basic Embeddings

### SDK Approach (Node.js)

Single text embedding:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'The quick brown fox jumps over the lazy dog',
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});

console.log(response.embedding.values);
// [0.00388, -0.00762, 0.01543, ...]
```

### Fetch Approach (Cloudflare Workers)

For Workers/edge environments without SDK support:

```typescript
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const apiKey = env.GEMINI_API_KEY;
    const text = "What is the meaning of life?";

    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': apiKey,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          content: {
            parts: [{ text }]
          },
          taskType: 'RETRIEVAL_QUERY',
          outputDimensionality: 768
        })
      }
    );

    const data = await response.json();

    // Response format:
    // {
    //   embedding: {
    //     values: [0.012, -0.034, ...]
    //   }
    // }

    return new Response(JSON.stringify(data), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};
```

### Response Parsing

```typescript
interface EmbeddingResponse {
  embedding: {
    values: number[];
  };
}

const response: EmbeddingResponse = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: 'Sample text',
  config: { taskType: 'SEMANTIC_SIMILARITY' }
});

const embedding: number[] = response.embedding.values;
const dimensions: number = embedding.length; // 3072 by default
```

### Normalization Requirement

⚠️ **CRITICAL:** When using dimensions other than 3072, you MUST normalize embeddings before computing similarity. Only 3072-dimensional embeddings are pre-normalized by the API.

**Why this matters:** Non-normalized embeddings have varying magnitudes that distort cosine similarity calculations, leading to incorrect search results.

Normalization helper function:

```typescript
/**
 * Normalize embedding vector for accurate similarity calculations.
 * REQUIRED for dimensions other than 3072.
 *
 * @param vector - Embedding values from API response
 * @returns Normalized vector (unit length)
 */
function normalize(vector: number[]): number[] {
  const magnitude = Math.sqrt(
    vector.reduce((sum, val) => sum + val * val, 0)
  );
  return vector.map(val => val / magnitude);
}

// Usage with 768 or 1536 dimensions
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: {
    taskType: 'RETRIEVAL_QUERY',
    outputDimensionality: 768  // NOT 3072
  }
});

// ❌ WRONG - Use raw values directly
const embedding = response.embedding.values;
await vectorize.insert([{ id, values: embedding }]);

// ✅ CORRECT - Normalize first
const normalized = normalize(response.embedding.values);
await vectorize.insert([{ id, values: normalized }]);
```
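Once both vectors are unit length, cosine similarity reduces to a plain dot product. A small local helper (illustrative, not part of any SDK) for sanity-checking similarity scores:

```typescript
// Cosine similarity for unit-length vectors: reduces to a plain dot product.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// 45-degree angle between two unit vectors → similarity ≈ 0.707
const a = [1, 0, 0];
const b = [Math.SQRT1_2, Math.SQRT1_2, 0];
console.log(cosineSimilarity(a, b).toFixed(3)); // 0.707
```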

## 4. Batch Embeddings

### Multiple Texts in One Request (SDK)

Generate embeddings for multiple texts simultaneously:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const texts = [
  "What is the meaning of life?",
  "How does photosynthesis work?",
  "Tell me about the history of the internet."
];

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of strings
  config: {
    taskType: 'RETRIEVAL_DOCUMENT',
    outputDimensionality: 768
  }
});

// Process each embedding
response.embeddings.forEach((embedding, index) => {
  console.log(`Text ${index}: ${texts[index]}`);
  console.log(`Embedding: ${embedding.values.slice(0, 5)}...`);
  console.log(`Dimensions: ${embedding.values.length}`);
});
```

### Batch REST API (fetch)

Use the `batchEmbedContents` endpoint:

```typescript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:batchEmbedContents',
  {
    method: 'POST',
    headers: {
      'x-goog-api-key': apiKey,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      requests: texts.map(text => ({
        model: 'models/gemini-embedding-001',
        content: {
          parts: [{ text }]
        },
        taskType: 'RETRIEVAL_DOCUMENT'
      }))
    })
  }
);

const data = await response.json();
// data.embeddings: Array of {values: number[]}
```
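When parsing the batch response, it is safer to pair embeddings with their inputs through a helper that fails loudly on a count mismatch rather than silently mis-aligning results. A sketch (names are illustrative):

```typescript
// Pair inputs with returned embeddings, failing loudly on a count mismatch
// instead of silently mis-aligning results.
function zipEmbeddings<T>(
  inputs: T[],
  embeddings: { values: number[] }[]
): Array<{ input: T; values: number[] }> {
  if (inputs.length !== embeddings.length) {
    throw new Error(
      `Expected ${inputs.length} embeddings, got ${embeddings.length}`
    );
  }
  return inputs.map((input, i) => ({ input, values: embeddings[i].values }));
}

// Example with mock data
const paired = zipEmbeddings(["a", "b"], [{ values: [0.1] }, { values: [0.2] }]);
console.log(paired[1].input); // b
```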

### Batch API Known Issues

⚠️ **Ordering Bug (December 2025):** The batch API may not preserve ordering with large batch sizes (>500 texts).

- Symptom: entry 328 appears at position 628 (silent data corruption)
- Impact: results cannot be reliably matched back to input texts
- Workaround: process smaller batches (<100 texts) or add unique IDs to verify ordering
- Status: acknowledged by Google; internal bug filed (P0 priority)
- Source: GitHub Issue #1207

⚠️ **Memory Limit (December 2025):** Large batches (>10k embeddings) can crash with `ERR_STRING_TOO_LONG`.

- Error: `Cannot create a string longer than 0x1fffffe8 characters`
- Cause: the API response includes excessive whitespace, pushing the response past Node.js's string limit (~536 MB)
- Workaround: limit to <5,000 texts per batch
- Source: GitHub Issue #1205

⚠️ **Rate Limit Anomaly (January 2026):** The batch API may return `429 RESOURCE_EXHAUSTED` even when under quota.

- Status: under investigation by Google
- Workaround: implement exponential backoff and retry logic
- Source: GitHub Issue #1264

### Chunking for Rate Limits

When processing large datasets, chunk requests to stay within rate limits:

```typescript
async function batchEmbedWithRateLimit(
  texts: string[],
  batchSize: number = 50, // REDUCED from 100 due to ordering bug
  delayMs: number = 60000 // 1 minute delay between batches
): Promise<number[][]> {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);

    console.log(`Processing batch ${i / batchSize + 1} (${batch.length} texts)`);

    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: {
        taskType: 'RETRIEVAL_DOCUMENT',
        outputDimensionality: 768
      }
    });

    allEmbeddings.push(...response.embeddings.map(e => e.values));

    // Wait before next batch (except last batch)
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }

  return allEmbeddings;
}

// Usage
const embeddings = await batchEmbedWithRateLimit(documents, 50);
```

### Performance Optimization

Tips:

1. Use the batch API when embedding multiple texts (one request instead of many)
2. Choose lower dimensions (768) for faster processing and less storage
3. Implement exponential backoff for rate limit errors
4. Cache embeddings to avoid redundant API calls
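A cache for tip 4 can be as simple as a map keyed by the exact input text. A minimal sketch, using a synchronous stub in place of the real (asynchronous) API call; all names are illustrative:

```typescript
// Minimal in-memory embedding cache keyed by exact text. `embedFn` stands in
// for a real API call; a production version would be async and call
// embedContent.
function makeCachedEmbedder(embedFn: (text: string) => number[]) {
  const store = new Map<string, number[]>();
  return {
    embed(text: string): number[] {
      let values = store.get(text);
      if (!values) {
        values = embedFn(text);
        store.set(text, values);
      }
      return values;
    },
    get size() { return store.size; },
  };
}

// Stub embed function counts how many "API calls" are made.
let calls = 0;
const cached = makeCachedEmbedder(() => { calls++; return [0.1, 0.2]; });
cached.embed("hello");
cached.embed("hello"); // second lookup hits the cache
console.log(calls); // 1
```

For long-running services, consider bounding the cache size (e.g. LRU eviction) and keying on a hash of the text plus the task type and dimensionality.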

## 5. Task Types

The `taskType` parameter optimizes embeddings for specific use cases. Always specify a task type for best results.

### Available Task Types (8 total)

| Task Type | Use Case | Example |
|---|---|---|
| `RETRIEVAL_QUERY` | User search queries | "How do I fix a flat tire?" |
| `RETRIEVAL_DOCUMENT` | Documents to be indexed/searched | Product descriptions, articles |
| `SEMANTIC_SIMILARITY` | Comparing text similarity | Duplicate detection, clustering |
| `CLASSIFICATION` | Categorizing texts | Spam detection, sentiment analysis |
| `CLUSTERING` | Grouping similar texts | Topic modeling, content organization |
| `CODE_RETRIEVAL_QUERY` | Code search queries | "function to sort array" |
| `QUESTION_ANSWERING` | Questions seeking answers | FAQ matching |
| `FACT_VERIFICATION` | Verifying claims with evidence | Fact-checking systems |

### When to Use Which

RAG Systems (Retrieval Augmented Generation):

```typescript
// When embedding user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY' } // ← Use RETRIEVAL_QUERY
});

// When embedding documents for indexing
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: documentText,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Use RETRIEVAL_DOCUMENT
});
```

Semantic Search:

```typescript
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'SEMANTIC_SIMILARITY' }
});
```

Document Clustering:

```typescript
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'CLUSTERING' }
});
```

### Impact on Quality

Using the correct task type significantly improves retrieval quality:

```typescript
// ❌ BAD: No task type specified
const embedding1 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery
});

// ✅ GOOD: Task type specified
const embedding2 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY' }
});
```

Result: Using the right task type can improve search relevance by 10-30%.

## 6. RAG Patterns

RAG (Retrieval Augmented Generation) combines vector search with LLM generation to create AI systems that answer questions using custom knowledge bases.

### Document Ingestion Pipeline

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Generate embeddings for chunks
async function embedChunks(chunks: string[]): Promise<number[][]> {
  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    contents: chunks,
    config: {
      taskType: 'RETRIEVAL_DOCUMENT', // ← Documents for indexing
      outputDimensionality: 768 // ← Match Vectorize index dimensions
    }
  });

  return response.embeddings.map(e => e.values);
}

// Store in Cloudflare Vectorize
async function storeInVectorize(
  env: Env,
  chunks: string[],
  embeddings: number[][]
) {
  const vectors = chunks.map((chunk, i) => ({
    id: `doc-${Date.now()}-${i}`,
    values: embeddings[i],
    metadata: { text: chunk }
  }));

  await env.VECTORIZE.insert(vectors);
}
```
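The pipeline above assumes the document is already split into chunks. A naive character-based chunker with overlap (illustrative; production pipelines usually split on sentence or token boundaries to respect the 2,048-token input limit):

```typescript
// Fixed-size chunker with overlap. Character counts approximate token
// counts; keep maxChars conservative to stay under the model's input limit.
function chunkText(text: string, maxChars = 1500, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + maxChars));
    if (start + maxChars >= text.length) break;
    start += maxChars - overlap; // step back by `overlap` to preserve context
  }
  return chunks;
}

const doc = "x".repeat(4000);
const chunks = chunkText(doc);
console.log(chunks.length); // 3
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk.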

### Query Flow (Retrieve + Generate)

```typescript
async function ragQuery(env: Env, userQuery: string): Promise<string> {
  // 1. Embed user query
  const queryResponse = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: userQuery,
    config: {
      taskType: 'RETRIEVAL_QUERY', // ← Query, not document
      outputDimensionality: 768
    }
  });

  const queryEmbedding = queryResponse.embedding.values;

  // 2. Search Vectorize for similar documents
  const results = await env.VECTORIZE.query(queryEmbedding, {
    topK: 5,
    returnMetadata: true
  });

  // 3. Extract context from top results
  const context = results.matches
    .map(match => match.metadata.text)
    .join('\n\n');

  // 4. Generate response with context
  const response = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: `Context:\n${context}\n\nQuestion: ${userQuery}\n\nAnswer based on the context above:`
  });

  return response.text;
}
```
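For local testing without a Vectorize index, step 2 can be mimicked in memory: rank stored vectors by dot product, which equals cosine similarity when the vectors are normalized. All names here are illustrative:

```typescript
// In-memory stand-in for the Vectorize query step: rank stored vectors by
// dot product (equivalent to cosine similarity for unit-length vectors).
interface StoredVector { id: string; values: number[]; text: string; }

function topK(query: number[], store: StoredVector[], k: number) {
  const dot = (a: number[], b: number[]) =>
    a.reduce((sum, v, i) => sum + v * b[i], 0);
  return store
    .map(v => ({ id: v.id, text: v.text, score: dot(query, v.values) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const store: StoredVector[] = [
  { id: "a", values: [1, 0], text: "about cats" },
  { id: "b", values: [0, 1], text: "about dogs" },
  { id: "c", values: [0.6, 0.8], text: "about pets" },
];
console.log(topK([1, 0], store, 2).map(m => m.id)); // [ 'a', 'c' ]
```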

### Integration with Cloudflare Vectorize

Create a Vectorize index (768 dimensions for Gemini):

```bash
npx wrangler vectorize create gemini-embeddings --dimensions 768 --metric cosine
```

Bind it in `wrangler.jsonc`:

```jsonc
{
  "name": "my-rag-app",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-25",
  "vectorize": {
    "bindings": [
      {
        "binding": "VECTORIZE",
        "index_name": "gemini-embeddings"
      }
    ]
  }
}
```

Complete RAG Worker: see `templates/rag-with-vectorize.ts` for the full implementation.

## 7. Error Handling

### Common Errors

**1. API Key Missing or Invalid**

```typescript
// ❌ Error: API key not set
const ai = new GoogleGenAI({});

// ✅ Correct
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

if (!process.env.GEMINI_API_KEY) {
  throw new Error('GEMINI_API_KEY environment variable not set');
}
```

**2. Dimension Mismatch**

```typescript
// ❌ Error: Embedding has 3072 dims, Vectorize expects 768
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text
  // No outputDimensionality specified → defaults to 3072
});

await env.VECTORIZE.insert([{
  id: '1',
  values: embedding.embedding.values // 3072 dims, but index is 768!
}]);

// ✅ Correct: Match dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 } // ← Match index dimensions
});
```

**3. Rate Limiting**

```typescript
// ❌ Error: 429 Too Many Requests
for (let i = 0; i < 1000; i++) {
  await ai.models.embedContent({ /* ... */ }); // Exceeds 100 RPM on free tier
}

// ✅ Correct: Implement rate limiting
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        content: text,
        config: { taskType: 'SEMANTIC_SIMILARITY' }
      });
    } catch (error: any) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```

See `references/top-errors.md` for all 8 documented errors with detailed solutions.

### Known Issues Prevention

This section documents additional issues discovered in production use (beyond basic errors above).

#### Issue #9: Normalization Required for Non-3072 Dimensions

**Error:** Incorrect similarity scores; no error thrown
**Source:** Official Embeddings Documentation
**Why it happens:** Only 3072-dimensional embeddings are pre-normalized by the API. All other dimensions (128-3071) have varying magnitudes that distort cosine similarity.
**Prevention:** Always normalize embeddings when using dimensions other than 3072.

```typescript
function normalize(vector: number[]): number[] {
  const magnitude = Math.sqrt(vector.reduce((sum, val) => sum + val * val, 0));
  return vector.map(val => val / magnitude);
}

// When using 768 or 1536 dimensions
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 }
});

const normalized = normalize(response.embedding.values);
// Now safe for similarity calculations
```

#### Issue #10: Batch API Ordering Bug

**Error:** Silent data corruption; embeddings returned in the wrong order
**Source:** GitHub Issue #1207
**Why it happens:** The batch API does not preserve ordering with large batch sizes (>500 texts); for example, entry 328 can appear at position 628.
**Prevention:** Process smaller batches (<100 texts) or add unique identifiers to verify ordering.

```typescript
// Safer approach with verification
const taggedTexts = texts.map((text, i) => `[ID:${i}] ${text}`);
const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: taggedTexts,
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});

// Verify ordering by parsing IDs if needed
```

Issue #11: Batch API Memory Limit


Error:
Cannot create a string longer than 0x1fffffe8 characters
Source: GitHub Issue #1205
Why It Happens: The batch API response contains excessive whitespace, so with large payloads (>10k embeddings) the response size exceeds the Node.js string limit (~536MB).
Prevention: Limit batches to <5,000 texts per request.
typescript
// Safe batch size
async function batchEmbedSafe(texts: string[]) {
  const maxBatchSize = 5000;
  if (texts.length > maxBatchSize) {
    throw new Error(`Batch too large: ${texts.length} texts (max: ${maxBatchSize})`);
  }
  // Process batch...
}

Issue #12: LangChain Dimension Parameter Ignored (Community-sourced)


Error: Dimension mismatch - getting 3072 dimensions instead of the specified 768
Source: Medium Article
Verified: Multiple community reports
Why It Happens: LangChain's GoogleGenerativeAIEmbeddings class silently ignores the output_dimensionality parameter when it is passed to the constructor (Python SDK).
Prevention: Pass the dimension parameter to the embed_documents() method, not the constructor. JavaScript users should verify that the new @google/genai SDK does not have similar behavior.
python
# ❌ WRONG - parameter silently ignored
embeddings = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-001",
    output_dimensionality=768  # IGNORED!
)

# ✅ CORRECT - pass to method
embeddings = GoogleGenerativeAIEmbeddings(model="gemini-embedding-001")
result = embeddings.embed_documents(["text"], output_dimensionality=768)

Issue #13: Single Requests Use Batch Endpoint (Community-sourced)


Error: Hitting rate limits faster than expected with single text embeddings
Source: GitHub Issue #427 (Python SDK)
Verified: Official issue in the googleapis organization
Why It Happens: The embed_content() function internally calls the batchEmbedContents endpoint even for single texts. This causes higher rate limit consumption (the batch endpoint has different limits).
Prevention: Add delays between single embedding requests and implement exponential backoff for 429 errors.
typescript
// Add delays to avoid rate limits
async function embedWithDelay(text: string, delayMs: number = 100) {
  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: text,
    config: { taskType: 'SEMANTIC_SIMILARITY' }
  });
  await new Promise(resolve => setTimeout(resolve, delayMs));
  return response.embedding.values;
}


8. Best Practices


Always Do


Specify Task Type
typescript
// Task type optimizes embeddings for your use case
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { taskType: 'RETRIEVAL_QUERY' } // ← Always specify
});
Match Dimensions with Vectorize
typescript
// Ensure embeddings match your Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text,
  config: { outputDimensionality: 768 } // ← Match index
});
Implement Rate Limiting
typescript
// Use exponential backoff for 429 errors
async function embedWithBackoff(text: string) {
  // Implementation from Error Handling section
}
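A minimal exponential-backoff wrapper might look like the following; the retry count, base delay, and error detection are illustrative choices, not SDK behavior:

```typescript
// Retry a promise-returning call on 429 rate-limit errors with
// exponential delay plus jitter, capped at 30 seconds per wait.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 5,
  baseDelayMs: number = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const isRateLimit = err?.status === 429 || /429/.test(String(err?.message ?? ""));
      if (!isRateLimit || attempt >= maxRetries) throw err;
      const delayMs = Math.min(baseDelayMs * 2 ** attempt, 30_000) + Math.random() * 250;
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}
```

Usage: wrap any embedding call, e.g. `await withBackoff(() => ai.models.embedContent({ ... }))`; non-429 errors propagate immediately.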
Cache Embeddings
typescript
// Cache embeddings to avoid redundant API calls
const cache = new Map<string, number[]>();

async function getCachedEmbedding(text: string): Promise<number[]> {
  if (cache.has(text)) {
    return cache.get(text)!;
  }

  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    content: text,
    config: { taskType: 'SEMANTIC_SIMILARITY' }
  });

  const embedding = response.embedding.values;
  cache.set(text, embedding);
  return embedding;
}
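One caveat with the cache above: keying by text alone returns stale vectors if you later change taskType or outputDimensionality. A sketch of a config-aware cache key (the helper name is ours):

```typescript
// Build a cache key that includes the embedding configuration, so
// changing task type or dimensionality never returns a mismatched vector.
function cacheKey(text: string, taskType: string, dimensions: number): string {
  return `${taskType}:${dimensions}:${text}`;
}

// e.g. cache.set(cacheKey(text, 'SEMANTIC_SIMILARITY', 768), embedding);
```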
Use Batch API for Multiple Texts
typescript
// Single batch request vs multiple individual requests
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of texts
  config: { taskType: 'RETRIEVAL_DOCUMENT' }
});

Never Do


Don't Skip Task Type
typescript
// Reduces quality by 10-30%
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text
  // Missing taskType!
});
Don't Mix Different Dimensions
typescript
// Can't compare embeddings with different dimensions
const emb1 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text1,
  config: { outputDimensionality: 768 }
});

const emb2 = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: text2,
  config: { outputDimensionality: 1536 } // Different dimensions!
});

// ❌ Can't calculate similarity between different dimensions
const similarity = cosineSimilarity(emb1.embedding.values, emb2.embedding.values);
Don't Use Wrong Task Type for RAG
typescript
// Reduces search quality
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  content: query,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // Wrong! Should be RETRIEVAL_QUERY
});


Using Bundled Resources


Templates (templates/)


  • package.json
    - Package configuration with verified versions
  • basic-embeddings.ts
    - Single text embedding with SDK
  • embeddings-fetch.ts
    - Fetch-based for Cloudflare Workers
  • batch-embeddings.ts
    - Batch processing with rate limiting
  • rag-with-vectorize.ts
    - Complete RAG implementation with Vectorize

References (references/)


  • model-comparison.md
    - Compare Gemini vs OpenAI vs Workers AI embeddings
  • vectorize-integration.md
    - Cloudflare Vectorize setup and patterns
  • rag-patterns.md
    - Complete RAG implementation strategies
  • dimension-guide.md
    - Choosing the right dimensions (768 vs 1536 vs 3072)
  • top-errors.md
    - 8 common errors and detailed solutions

Scripts (scripts/)


  • check-versions.sh
    - Verify @google/genai package version is current


Official Documentation


Related Skills


  • google-gemini-api - Main Gemini API for text/image generation
  • cloudflare-vectorize - Vector database for storing embeddings
  • cloudflare-workers-ai - Workers AI embeddings (BGE models)


Success Metrics


Token Savings: ~60% compared to manual implementation
Errors Prevented: 13 documented errors with solutions (8 basic + 5 known issues)
Production Tested: ✅ Verified in RAG applications
Package Version: @google/genai@1.37.0
Last Updated: 2026-01-21
Changes: Added normalization requirement, batch API warnings (ordering bug, memory limits, rate limit anomaly), LangChain compatibility notes


License


MIT License - Free to use in personal and commercial projects.

Questions or Issues?