Google Gemini embeddings API (gemini-embedding-001) for RAG and semantic search. Use it for vector search and Vectorize integration, or when you run into dimension mismatches, rate limits, or text truncation.

npx skill4agent add secondsky/claude-skills google-gemini-embeddings

Model: gemini-embedding-001

bun add @google/genai@^1.27.0
bun add -d typescript@^5.0.0
export GEMINI_API_KEY="your-api-key-here"

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: 'What is the meaning of life?',
  config: {
    taskType: 'RETRIEVAL_QUERY',
    outputDimensionality: 768
  }
});

// The SDK returns one entry in `embeddings` per input
const values = response.embeddings?.[0]?.values ?? [];
console.log(values); // [0.012, -0.034, ...]
console.log(values.length); // 768

Models: gemini-embedding-001, gemini-embedding-exp-03-07

| Dimension | Use Case | Storage | Performance |
|---|---|---|---|
| 768 | Recommended for most use cases | Low | Fast |
| 1536 | Balance between accuracy and efficiency | Medium | Medium |
| 3072 | Maximum accuracy (default) | High | Slower |
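
Google's embeddings documentation notes that only the full 3072-dimension output arrives normalized, so vectors requested at 768 or 1536 dimensions generally need L2 normalization before similarity comparisons. The following is a minimal sketch of that step, assuming the embedding is already available as a plain `number[]`. The helper names `l2Normalize` and `storageBytes` are hypothetical, and the storage figure assumes 4-byte float32 values, which may not match your vector store.

```typescript
// Hypothetical helper: scale a vector to unit length (L2 norm = 1).
function l2Normalize(values: number[]): number[] {
  const norm = Math.sqrt(values.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return values.slice(); // zero vector is returned unchanged
  return values.map((v) => v / norm);
}

// Hypothetical helper: raw bytes per vector, ASSUMING float32 storage.
function storageBytes(dimensions: number): number {
  return dimensions * 4;
}

const unit = l2Normalize([3, 4]);
console.log(unit); // [0.6, 0.8]
console.log(storageBytes(768), storageBytes(1536), storageBytes(3072)); // 3072 6144 12288
```

Under the float32 assumption, a 768-dimension vector takes a quarter of the space of a 3072-dimension one, which is where the Storage column's Low/High distinction comes from.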

See references/dimension-guide.md and references/model-comparison.md.

| Tier | RPM | TPM | RPD |
|---|---|---|---|
| Free | 100 | 30,000 | 1,000 |
| Tier 1 | 3,000 | 1,000,000 | - |
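
One simple way to turn these limits into client-side pacing is to space requests evenly so the per-minute cap is never exceeded. The sketch below assumes the RPM figures in the table still apply to your project. `minIntervalMs` and `estimateMinutes` are hypothetical helpers, not part of any SDK.

```typescript
// Hypothetical helper: minimum spacing between requests for a given RPM cap.
function minIntervalMs(requestsPerMinute: number): number {
  if (requestsPerMinute <= 0) {
    throw new Error("requestsPerMinute must be positive");
  }
  return Math.ceil(60_000 / requestsPerMinute);
}

// Hypothetical helper: minutes needed to send `requestCount` requests at that pace.
function estimateMinutes(requestCount: number, requestsPerMinute: number): number {
  return Math.ceil(requestCount / requestsPerMinute);
}

console.log(minIntervalMs(100)); // 600 (free tier: one request every 600 ms)
console.log(minIntervalMs(3000)); // 20 (Tier 1)
console.log(estimateMinutes(1000, 100)); // 10 (the free-tier daily quota at full pace)
```

This only accounts for RPM. Token-heavy inputs can still hit the TPM limit first, so retry handling for 429 responses remains necessary.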

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: 'The quick brown fox jumps over the lazy dog',
  config: {
    taskType: 'SEMANTIC_SIMILARITY',
    outputDimensionality: 768
  }
});

// The SDK returns one entry in `embeddings` per input
console.log(response.embeddings?.[0]?.values);
// [0.00388, -0.00762, 0.01543, ...]

// Bindings used by the fetch-based Worker example
interface Env {
  GEMINI_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const apiKey = env.GEMINI_API_KEY;
    const text = "What is the meaning of life?";

    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': apiKey,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          content: {
            parts: [{ text }]
          },
          taskType: 'RETRIEVAL_QUERY',
          outputDimensionality: 768
        })
      }
    );

    const data = await response.json();
    // Response format:
    // {
    //   embedding: {
    //     values: [0.012, -0.034, ...]
    //   }
    // }

    return new Response(JSON.stringify(data), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};

// Shape of the SDK response: one entry in `embeddings` per input
interface EmbeddingResponse {
  embeddings?: {
    values?: number[];
  }[];
}

const response: EmbeddingResponse = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: 'Sample text',
  config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
});

const embedding: number[] = response.embeddings?.[0]?.values ?? [];
const dimensions: number = embedding.length; // 768

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const texts = [
  "What is the meaning of life?",
  "How does photosynthesis work?",
  "Tell me about the history of the internet."
];

const response = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of strings
  config: {
    taskType: 'RETRIEVAL_DOCUMENT',
    outputDimensionality: 768
  }
});

// Process each embedding (returned in the same order as `texts`)
response.embeddings?.forEach((embedding, index) => {
  const values = embedding.values ?? [];
  console.log(`Text ${index}: ${texts[index]}`);
  console.log(`Embedding: ${values.slice(0, 5)}...`);
  console.log(`Dimensions: ${values.length}`);
});

async function batchEmbedWithRateLimit(
  texts: string[],
  batchSize: number = 100, // Texts per batch (free tier allows 100 RPM)
  delayMs: number = 60000 // 1 minute delay between batches
): Promise<number[][]> {
  const allEmbeddings: number[][] = [];

  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    console.log(`Processing batch ${i / batchSize + 1} (${batch.length} texts)`);

    const response = await ai.models.embedContent({
      model: 'gemini-embedding-001',
      contents: batch,
      config: {
        taskType: 'RETRIEVAL_DOCUMENT',
        outputDimensionality: 768
      }
    });

    allEmbeddings.push(...(response.embeddings ?? []).map(e => e.values ?? []));

    // Wait before next batch (except last batch)
    if (i + batchSize < texts.length) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }

  return allEmbeddings;
}

// Usage (`documents` is the string[] of texts to embed)
const embeddings = await batchEmbedWithRateLimit(documents, 100);

Task types (taskType):

| Task Type | Use Case | Example |
|---|---|---|
| RETRIEVAL_QUERY | User search queries | "How do I fix a flat tire?" |
| RETRIEVAL_DOCUMENT | Documents to be indexed/searched | Product descriptions, articles |
| SEMANTIC_SIMILARITY | Comparing text similarity | Duplicate detection, clustering |
| CLASSIFICATION | Categorizing texts | Spam detection, sentiment analysis |
| CLUSTERING | Grouping similar texts | Topic modeling, content organization |
| CODE_RETRIEVAL_QUERY | Code search queries | "function to sort array" |
| QUESTION_ANSWERING | Questions seeking answers | FAQ matching |
| FACT_VERIFICATION | Verifying claims with evidence | Fact-checking systems |
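
In a retrieval setup, documents are embedded once with RETRIEVAL_DOCUMENT and each incoming query with RETRIEVAL_QUERY, after which documents are ranked by similarity to the query vector. The sketch below illustrates only that ranking step, on small made-up vectors. The `rankBySimilarity` helper, the `IndexedDoc` type, and the sample data are hypothetical; real vectors would come from the embedding API with the same outputDimensionality on both sides.

```typescript
interface IndexedDoc {
  id: string;
  vector: number[]; // would be embedded with taskType RETRIEVAL_DOCUMENT
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions must match");
  let dot = 0;
  let magA = 0;
  let magB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    magA += x * x;
    magB += y * y;
  });
  return magA === 0 || magB === 0 ? 0 : dot / Math.sqrt(magA * magB);
}

// Hypothetical helper: the topK documents most similar to the query vector.
function rankBySimilarity(query: number[], docs: IndexedDoc[], topK = 3) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosine(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional vectors standing in for real embeddings.
const sampleDocs: IndexedDoc[] = [
  { id: "tires", vector: [1, 0, 0] },
  { id: "baking", vector: [0, 1, 0] },
  { id: "bikes", vector: [0.6, 0.3, 0.1] },
];
const queryVector = [0.9, 0.1, 0]; // would be embedded with taskType RETRIEVAL_QUERY

console.log(rankBySimilarity(queryVector, sampleDocs, 2).map((r) => r.id));
// ["tires", "bikes"]
```

A vector database such as Vectorize performs this ranking server-side; the in-memory version is only practical for small collections.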

// When embedding user queries
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: {
    taskType: 'RETRIEVAL_QUERY', // ← Use RETRIEVAL_QUERY
    outputDimensionality: 768
  }
});

// When embedding documents for indexing
const docEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: documentText,
  config: {
    taskType: 'RETRIEVAL_DOCUMENT', // ← Use RETRIEVAL_DOCUMENT
    outputDimensionality: 768
  }
});

Error 1: Dimension mismatch

Vector dimensions do not match. Expected 768, got 3072

Fix: set outputDimensionality to match the index.

// ❌ BAD: No outputDimensionality (defaults to 3072)
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text
});

// ✅ GOOD: Match Vectorize index dimensions
const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match your index
});

Error 2: Rate limiting

429 Too Many Requests - Rate limit exceeded

// ✅ GOOD: Exponential backoff
async function embedWithRetry(text: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await ai.models.embedContent({
        model: 'gemini-embedding-001',
        contents: text,
        config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
      });
    } catch (error: any) {
      // Retry only on 429, and only while attempts remain
      if (error.status === 429 && attempt < maxRetries - 1) {
        const delay = Math.pow(2, attempt) * 1000; // 1s, 2s, ... doubling per retry
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
  // Only reached when maxRetries is less than 1
  throw new Error('maxRetries must be at least 1');
}

Error 3: Text truncation

function chunkText(text: string, maxTokens = 2000): string[] {
  const words = text.split(/\s+/);
  const chunks: string[] = [];
  let currentChunk: string[] = [];

  for (const word of words) {
    currentChunk.push(word);

    // Rough estimate: 1 token ≈ 0.75 words, so tokens ≈ words / 0.75
    if (currentChunk.length / 0.75 >= maxTokens) {
      chunks.push(currentChunk.join(' '));
      currentChunk = [];
    }
  }

  if (currentChunk.length > 0) {
    chunks.push(currentChunk.join(' '));
  }

  return chunks;
}

Error 4: Wrong task type (RETRIEVAL_DOCUMENT used for a query)

// ❌ BAD: Wrong task type for RAG query
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_DOCUMENT' } // ← Wrong!
});

// ✅ GOOD: Correct task types
const queryEmbedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: userQuery,
  config: { taskType: 'RETRIEVAL_QUERY', outputDimensionality: 768 }
});

Error 5: Cosine similarity out of range

Similarity values out of range (-1.5 to 1.2)

// ✅ GOOD: Proper cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vector dimensions must match');
  }

  let dotProduct = 0;
  let magnitudeA = 0;
  let magnitudeB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }

  if (magnitudeA === 0 || magnitudeB === 0) {
    return 0; // Handle zero vectors
  }

  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}

See references/top-errors.md.

const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { taskType: 'RETRIEVAL_QUERY' } // ← Always specify
});

const embedding = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: text,
  config: { outputDimensionality: 768 } // ← Match index
});

// Use exponential backoff for 429 errors (see Error 2)

const cache = new Map<string, number[]>();

async function getCachedEmbedding(text: string): Promise<number[]> {
  if (cache.has(text)) {
    return cache.get(text)!;
  }

  const response = await ai.models.embedContent({
    model: 'gemini-embedding-001',
    contents: text,
    config: { taskType: 'SEMANTIC_SIMILARITY', outputDimensionality: 768 }
  });

  // The SDK returns one entry in `embeddings` per input
  const embedding = response.embeddings?.[0]?.values;
  if (!embedding) {
    throw new Error('No embedding returned');
  }

  cache.set(text, embedding);
  return embedding;
}

// Single batch request vs multiple individual requests
const embeddings = await ai.models.embedContent({
  model: 'gemini-embedding-001',
  contents: texts, // Array of texts
  config: { taskType: 'RETRIEVAL_DOCUMENT', outputDimensionality: 768 }
});

references/rag-patterns.md
references/vectorize-integration.md
references/dimension-guide.md
references/model-comparison.md
references/top-errors.md

package.json
basic-embeddings.ts
embeddings-fetch.ts
batch-embeddings.ts
rag-with-vectorize.ts
semantic-search.ts
clustering.ts
model-comparison.md
vectorize-integration.md
rag-patterns.md
dimension-guide.md
top-errors.md
check-versions.sh

/websites/ai_google_dev_gemini-api