cloudflare-vectorize

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Cloudflare Vectorize

Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.

Status: Production Ready ✅ Last Updated: 2025-11-21 Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) Latest Versions: wrangler@4.50.0, @cloudflare/workers-types@4.20251014.0 Token Savings: ~65% Errors Prevented: 8 Dev Time Saved: ~3 hours

Cloudflare Vectorize 完整实现指南——一款基于Cloudflare Workers构建的全球分布式向量数据库，用于开发语义搜索、RAG（检索增强生成）及AI驱动的应用。

状态：已就绪可投入生产 ✅ 最后更新：2025-11-21 依赖项：cloudflare-worker-base（用于Worker配置）、cloudflare-workers-ai（用于生成嵌入向量） 最新版本：wrangler@4.50.0, @cloudflare/workers-types@4.20251014.0 Token节省率：约65% 避免的错误数量：8个 节省的开发时间：约3小时

What This Skill Provides

本技能提供的功能

Core Capabilities

核心能力

✅ Index Management: Create, configure, and manage vector indexes
✅ Vector Operations: Insert, upsert, query, delete, and list vectors
✅ Metadata Filtering: Advanced filtering with 10 metadata indexes per index
✅ Semantic Search: Find similar vectors using cosine, euclidean, or dot-product metrics
✅ RAG Patterns: Complete retrieval-augmented generation workflows
✅ Workers AI Integration: Native embedding generation with @cf/baai/bge-base-en-v1.5
✅ OpenAI Integration: Support for text-embedding-3-small/large models
✅ Document Processing: Text chunking and batch ingestion pipelines

✅ 索引管理：创建、配置和管理向量索引
✅ 向量操作：插入、更新插入、查询、删除和列出向量
✅ 元数据过滤：每个索引支持10个元数据索引的高级过滤
✅ 语义搜索：使用余弦、欧氏或点积指标查找相似向量
✅ RAG模式：完整的检索增强生成工作流
✅ Workers AI集成：原生支持@cf/baai/bge-base-en-v1.5嵌入向量生成
✅ OpenAI集成：支持text-embedding-3-small/large模型
✅ 文档处理：文本分块和批量导入流水线

Templates Included

包含的模板

basic-search.ts - Simple vector search with Workers AI
rag-chat.ts - Full RAG chatbot with context retrieval
document-ingestion.ts - Document chunking and embedding pipeline
metadata-filtering.ts - Advanced filtering examples

basic-search.ts - 基于Workers AI的简单向量搜索
rag-chat.ts - 带上下文检索的完整RAG聊天机器人
document-ingestion.ts - 文档分块与嵌入向量流水线
metadata-filtering.ts - 高级过滤示例

Critical Setup Rules

关键配置规则

⚠️ MUST DO BEFORE INSERTING VECTORS

⚠️ 插入向量前必须完成的操作

bash

undefined

bash

undefined

1. Create the index with FIXED dimensions and metric

1. 创建具有固定维度和指标的索引

bunx wrangler vectorize create my-index
--dimensions=768
--metric=cosine

2. Create metadata indexes IMMEDIATELY (before inserting vectors!)

2. 立即创建元数据索引（必须在插入向量之前！）

bunx wrangler vectorize create-metadata-index my-index
--property-name=category
--type=string

bunx wrangler vectorize create-metadata-index my-index
--property-name=timestamp
--type=number


**Why**: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.

bunx wrangler vectorize create-metadata-index my-index
--property-name=category
--type=string

bunx wrangler vectorize create-metadata-index my-index
--property-name=timestamp
--type=number


**原因**：元数据索引必须在插入向量前创建。在元数据索引创建前添加的向量无法通过该属性进行过滤。

Index Configuration (Cannot Be Changed Later)

索引配置（创建后无法修改）

bash

undefined

bash

undefined

Dimensions MUST match your embedding model output:

维度必须与嵌入模型输出匹配：

- Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions

- Workers AI @cf/baai/bge-base-en-v1.5：768维度

- OpenAI text-embedding-3-small: 1536 dimensions

- OpenAI text-embedding-3-small：1536维度

- OpenAI text-embedding-3-large: 3072 dimensions

- OpenAI text-embedding-3-large：3072维度

Metrics determine similarity calculation:

指标决定相似度计算方式：

- cosine: Best for normalized embeddings (most common)

- cosine：最适合归一化嵌入向量（最常用）

- euclidean: Absolute distance between vectors

- euclidean：向量间的绝对距离

- dot-product: For non-normalized vectors

- dot-product：用于非归一化向量

undefined

undefined

Wrangler Configuration

Wrangler配置

wrangler.jsonc:

jsonc

{
  "name": "my-vectorize-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-21",
  "vectorize": [
    {
      "binding": "VECTORIZE_INDEX",
      "index_name": "my-index"
    }
  ],
  "ai": {
    "binding": "AI"
  }
}

wrangler.jsonc:

jsonc

{
  "name": "my-vectorize-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-21",
  "vectorize": [
    {
      "binding": "VECTORIZE_INDEX",
      "index_name": "my-index"
    }
  ],
  "ai": {
    "binding": "AI"
  }
}

TypeScript Types

TypeScript类型定义

typescript

export interface Env {
  VECTORIZE_INDEX: VectorizeIndex;
  AI: Ai;
}

interface VectorizeVector {
  id: string;
  values: number[] | Float32Array | Float64Array;
  namespace?: string;
  metadata?: Record<string, string | number | boolean | string[]>;
}

interface VectorizeMatches {
  matches: Array<{
    id: string;
    score: number;
    values?: number[];
    metadata?: Record<string, any>;
    namespace?: string;
  }>;
  count: number;
}

typescript

export interface Env {
  VECTORIZE_INDEX: VectorizeIndex;
  AI: Ai;
}

interface VectorizeVector {
  id: string;
  values: number[] | Float32Array | Float64Array;
  namespace?: string;
  metadata?: Record<string, string | number | boolean | string[]>;
}

interface VectorizeMatches {
  matches: Array<{
    id: string;
    score: number;
    values?: number[];
    metadata?: Record<string, any>;
    namespace?: string;
  }>;
  count: number;
}

Common Operations

常见操作

Quick Reference

快速参考

Operation	Method	Key Point
Insert	`insert([...])`	Keeps first if ID exists
Upsert	`upsert([...])`	Overwrites if ID exists (use for updates)
Query	`query(vector, { topK, filter })`	Returns similar vectors
Delete	`deleteByIds([...])`	Remove by ID array
Get	`getByIds([...])`	Retrieve specific vectors

操作	方法	关键点
插入	`insert([...])`	若ID已存在则保留首个向量
更新插入	`upsert([...])`	若ID已存在则覆盖（用于更新）
查询	`query(vector, { topK, filter })`	返回相似向量
删除	`deleteByIds([...])`	通过ID数组删除
获取	`getByIds([...])`	检索指定向量

Filter Operators

过滤运算符

Operator	Example	Description
`$eq`	`{ category: "docs" }`	Equality (implicit)
`$ne`	`{ status: { $ne: "archived" } }`	Not equal
`$in`	`{ category: { $in: ["a", "b"] } }`	In array
`$nin`	`{ category: { $nin: ["x"] } }`	Not in array
`$gte/$lt`	`{ timestamp: { $gte: 123 } }`	Range queries

📄 Full operations guide: Load

references/vector-operations.md

for complete insert/upsert/query/delete examples with code.

运算符	示例	描述
`$eq`	`{ category: "docs" }`	等于（隐式）
`$ne`	`{ status: { $ne: "archived" } }`	不等于
`$in`	`{ category: { $in: ["a", "b"] } }`	在数组中
`$nin`	`{ category: { $nin: ["x"] } }`	不在数组中
`$gte/$lt`	`{ timestamp: { $gte: 123 } }`	范围查询

📄 完整操作指南：加载

references/vector-operations.md

获取包含代码的完整插入/更新插入/查询/删除示例。

Embedding Generation

嵌入向量生成

Model	Provider	Dimensions	Best For
`@cf/baai/bge-base-en-v1.5`	Workers AI	768	Free, general purpose
`text-embedding-3-small`	OpenAI	1536	Balance quality/cost
`text-embedding-3-large`	OpenAI	3072	Highest quality

📄 Integration guides:

Load

references/integration-workers-ai-bge-base.md

for Workers AI setup

Load

references/integration-openai-embeddings.md

for OpenAI integration

模型	提供商	维度	最佳适用场景
`@cf/baai/bge-base-en-v1.5`	Workers AI	768	免费、通用场景
`text-embedding-3-small`	OpenAI	1536	平衡质量与成本
`text-embedding-3-large`	OpenAI	3072	最高质量

📄 集成指南：

加载

references/integration-workers-ai-bge-base.md

获取Workers AI配置方法

加载

references/integration-openai-embeddings.md

获取OpenAI集成方法

Metadata Best Practices

元数据最佳实践

Key Limits

关键限制

Limit	Value
Max metadata indexes	10 per index
Max metadata size	10 KiB per vector
String index	First 64 bytes (UTF-8)
Filter size	Max 2048 bytes

限制	数值
最大元数据索引数	每个索引10个
最大元数据大小	每个向量10 KiB
字符串索引	前64字节（UTF-8）
过滤条件大小	最大2048字节

Invalid Key Characters

无效键字符

Keys cannot: be empty, contain

(reserved for nesting), contain

, or start with

📄 Complete metadata guide: Load

references/metadata-guide.md

for cardinality best practices, nested metadata, and advanced filtering patterns.

键不能：为空、包含

（保留用于嵌套）、包含

、或以

开头。

📄 完整元数据指南：加载

references/metadata-guide.md

获取基数最佳实践、嵌套元数据及高级过滤模式。

RAG Pattern (Full Example)

RAG模式（完整示例）

typescript

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { question } = await request.json();

    // 1. Generate embedding for user question
    const questionEmbedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
      text: question
    });

    // 2. Search vector database for similar content
    const results = await env.VECTORIZE_INDEX.query(
      questionEmbedding.data[0],
      {
        topK: 3,
        returnMetadata: 'all',
        filter: { type: "documentation" }
      }
    );

    // 3. Build context from retrieved documents
    const context = results.matches
      .map(m => m.metadata.content)
      .join('\n\n---\n\n');

    // 4. Generate answer with LLM using context
    const answer = await env.AI.run('@cf/meta/llama-3-8b-instruct', {
      messages: [
        {
          role: "system",
          content: `Answer based on this context:\n\n${context}`
        },
        {
          role: "user",
          content: question
        }
      ]
    });

    return Response.json({
      answer: answer.response,
      sources: results.matches.map(m => m.metadata.title)
    });
  }
};

typescript

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { question } = await request.json();

    // 1. 为用户问题生成嵌入向量
    const questionEmbedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
      text: question
    });

    // 2. 在向量数据库中搜索相似内容
    const results = await env.VECTORIZE_INDEX.query(
      questionEmbedding.data[0],
      {
        topK: 3,
        returnMetadata: 'all',
        filter: { type: "documentation" }
      }
    );

    // 3. 从检索到的文档构建上下文
    const context = results.matches
      .map(m => m.metadata.content)
      .join('\n\n---\n\n');

    // 4. 使用上下文通过LLM生成答案
    const answer = await env.AI.run('@cf/meta/llama-3-8b-instruct', {
      messages: [
        {
          role: "system",
          content: `Answer based on this context:\n\n${context}`
        },
        {
          role: "user",
          content: question
        }
      ]
    });

    return Response.json({
      answer: answer.response,
      sources: results.matches.map(m => m.metadata.title)
    });
  }
};

Document Chunking Strategy

文档分块策略

Recommended chunk sizes: 300-500 characters for semantic coherence.

Key metadata for chunks:

```
doc_id
```
: Parent document ID
```
chunk_index
```
: Position in document
```
content
```
: Text for retrieval display

📄 Full chunking implementation: See

templates/document-ingestion.ts

for complete chunking pipeline.

推荐分块大小：300-500字符，保证语义连贯性。

分块的关键元数据：

```
doc_id
```
：父文档ID
```
chunk_index
```
：在文档中的位置
```
content
```
：用于检索展示的文本

📄 完整分块实现：查看

templates/document-ingestion.ts

获取完整分块流水线。

Common Errors & Solutions

常见错误与解决方案

Error 1: Metadata Index Created After Vectors Inserted

错误1：元数据索引在插入向量后创建

Problem: Filtering doesn't work on existing vectors
Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting

问题：现有向量无法被过滤
解决方案：删除并重新插入向量，或者在插入向量前创建元数据索引

Error 2: Dimension Mismatch

错误2：维度不匹配

Problem: "Vector dimensions do not match index configuration"
Solution: Ensure embedding model output matches index dimensions:
  - Workers AI bge-base: 768
  - OpenAI small: 1536
  - OpenAI large: 3072

问题："Vector dimensions do not match index configuration"
解决方案：确保嵌入模型输出与索引维度匹配：
  - Workers AI bge-base：768
  - OpenAI small：1536
  - OpenAI large：3072

Error 3: Invalid Metadata Keys

错误3：无效元数据键

Problem: "Invalid metadata key"
Solution: Keys cannot:
  - Be empty
  - Contain . (dot)
  - Contain " (quote)
  - Start with $ (dollar sign)

问题："Invalid metadata key"
解决方案：键不能：
  - 为空
  - 包含.（点）
  - 包含"（引号）
  - 以$（美元符号）开头

Error 4: Filter Too Large

错误4：过滤条件过大

Problem: "Filter exceeds 2048 bytes"
Solution: Simplify filter or split into multiple queries

问题："Filter exceeds 2048 bytes"
解决方案：简化过滤条件或拆分为多个查询

Error 5: Range Query on High Cardinality

错误5：高基数字段上的范围查询

Problem: Slow queries or reduced accuracy
Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps

问题：查询缓慢或准确性降低
解决方案：使用低基数字段进行范围查询，或使用秒而非毫秒作为时间戳单位

Error 6: Insert vs Upsert Confusion

错误6：插入与更新插入混淆

Problem: Updates not reflecting in index
Solution: Use upsert() to overwrite existing vectors, not insert()

问题：更新未反映在索引中
解决方案：使用upsert()覆盖现有向量，而非insert()

Error 7: Missing Bindings

错误7：缺少绑定配置

Problem: "VECTORIZE_INDEX is not defined"
Solution: Add [[vectorize]] binding to wrangler.jsonc

问题："VECTORIZE_INDEX is not defined"
解决方案：在wrangler.jsonc中添加[[vectorize]]绑定

Error 8: Namespace vs Metadata Confusion

错误8：命名空间与元数据混淆

Problem: Unclear when to use namespace vs metadata filtering
Solution:
  - Namespace: Partition key, applied BEFORE metadata filters
  - Metadata: Flexible key-value filtering within namespace

问题：不清楚何时使用命名空间 vs 元数据过滤
解决方案：
  - 命名空间：分区键，在元数据过滤前应用
  - 元数据：命名空间内灵活的键值对过滤

Wrangler CLI Reference

Wrangler CLI参考

Essential commands:

bash

undefined

核心命令：

bash

undefined

Create index (dimensions/metric are PERMANENT)

创建索引（维度/指标为永久设置）

bunx wrangler vectorize create <name> --dimensions=768 --metric=cosine

Create metadata index (MUST be before inserting vectors!)

创建元数据索引（必须在插入向量之前！）

bunx wrangler vectorize create-metadata-index <name> --property-name=category --type=string

Get index info

获取索引信息

bunx wrangler vectorize info <name>


📄 **Full CLI reference**: Load `references/wrangler-commands.md` for all vectorize commands.

bunx wrangler vectorize info <name>


📄 **完整CLI参考**：加载`references/wrangler-commands.md`获取所有vectorize命令。

Performance Tips

性能优化技巧

Batch Operations: Insert/upsert in batches of 100-1000 vectors
Selective Return: Only use
```
returnValues: true
```
when needed (saves bandwidth)
Metadata Cardinality: Keep indexed metadata fields low cardinality for range queries
Namespace Filtering: Apply namespace filter before metadata filters (processed first)
Query Optimization: Use topK=3-10 for best latency (larger values increase search time)

批量操作：按100-1000个向量的批次进行插入/更新插入
选择性返回：仅在需要时使用
```
returnValues: true
```
（节省带宽）
元数据基数：范围查询使用低基数的元数据字段
命名空间过滤：先应用命名空间过滤（优先处理）
查询优化：使用topK=3-10以获得最佳延迟（值越大搜索时间越长）

When to Use This Skill

何时使用本技能

✅ Use Vectorize when:

Building semantic search over documents, products, or content
Implementing RAG chatbots with context retrieval
Creating recommendation engines based on similarity
Building multi-tenant applications (use namespaces)
Need global distribution and low latency

❌ Don't use Vectorize for:

Traditional relational data (use D1)
Key-value lookups (use KV)
Large file storage (use R2)
Real-time collaborative state (use Durable Objects)

✅ 推荐使用Vectorize的场景：

为文档、产品或内容构建语义搜索
实现带上下文检索的RAG聊天机器人
基于相似度创建推荐引擎
构建多租户应用（使用命名空间）
需要全球分布式部署与低延迟

❌ 不推荐使用Vectorize的场景：

传统关系型数据（使用D1）
键值对查找（使用KV）
大文件存储（使用R2）
实时协作状态（使用Durable Objects）

When to Load References

何时加载参考文档

Reference File	Load When...
`references/vector-operations.md`	Need full insert/upsert/query/delete code examples
`references/metadata-guide.md`	Setting up metadata indexes, filtering best practices
`references/wrangler-commands.md`	Using Vectorize CLI commands
`references/integration-workers-ai-bge-base.md`	Integrating Workers AI embeddings
`references/integration-openai-embeddings.md`	Integrating OpenAI embeddings
`references/embedding-models.md`	Comparing embedding model options
`references/index-operations.md`	Index lifecycle management

参考文件	加载时机...
`references/vector-operations.md`	需要完整的插入/更新插入/查询/删除代码示例时
`references/metadata-guide.md`	设置元数据索引、过滤最佳实践时
`references/wrangler-commands.md`	使用Vectorize CLI命令时
`references/integration-workers-ai-bge-base.md`	集成Workers AI嵌入向量时
`references/integration-openai-embeddings.md`	集成OpenAI嵌入向量时
`references/embedding-models.md`	比较嵌入模型选项时
`references/index-operations.md`	索引生命周期管理时

Templates

模板

Template	Purpose
`templates/basic-search.ts`	Simple vector search
`templates/rag-chat.ts`	Complete RAG chatbot
`templates/document-ingestion.ts`	Document chunking pipeline
`templates/metadata-filtering.ts`	Advanced filtering

模板	用途
`templates/basic-search.ts`	简单向量搜索
`templates/rag-chat.ts`	完整RAG聊天机器人
`templates/document-ingestion.ts`	文档分块流水线
`templates/metadata-filtering.ts`	高级过滤

Official Documentation

官方文档

Version: 1.0.0 Status: Production Ready ✅ Token Savings: ~65% Errors Prevented: 8 major categories Dev Time Saved: ~2.5 hours per implementation

版本：1.0.0 状态：已就绪可投入生产 ✅ Token节省率：约65% 避免的错误数量：8大类别 节省的开发时间：每个实现约2.5小时