alicloud-ai-search-dashvector

Category: provider

DashVector Vector Search

Use DashVector to manage collections and perform vector similarity search with optional filters and sparse vectors.

Prerequisites

Install SDK (recommended in a venv to avoid PEP 668 limits):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashvector
```

Provide credentials and endpoint via environment variables:
- `DASHVECTOR_API_KEY`
- `DASHVECTOR_ENDPOINT` (cluster endpoint)
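
A missing variable is easier to diagnose before the client is constructed than after a failed request. The sketch below is one way to fail fast. The helper name `load_dashvector_settings` is hypothetical and not part of the SDK. It only reads the two variables listed above.

```python
import os


def load_dashvector_settings(env=None):
    """Return (api_key, endpoint), or raise if either variable is unset or empty.

    Hypothetical helper for illustration; it is not part of the dashvector SDK.
    """
    env = os.environ if env is None else env
    required = ("DASHVECTOR_API_KEY", "DASHVECTOR_ENDPOINT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["DASHVECTOR_API_KEY"], env["DASHVECTOR_ENDPOINT"]
```

The returned pair can then be passed as `api_key` and `endpoint` when creating the client.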

Normalized operations

Create collection

- `name` (str)
- `dimension` (int)
- `metric` (str: `cosine` | `dotproduct` | `euclidean`)
- `fields_schema` (optional dict of field types)
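
Checking these parameters locally can catch a bad metric name or dimension before a request is sent. The following is a minimal sketch. The helper `validate_collection_spec` is hypothetical, and the set of accepted field types (`str`, `int`, `float`, `bool`) is an assumption that should be checked against the SDK version in use.

```python
ALLOWED_METRICS = {"cosine", "dotproduct", "euclidean"}
# Assumption: field types the service accepts; verify against your SDK version.
ALLOWED_FIELD_TYPES = (str, int, float, bool)


def validate_collection_spec(name, dimension, metric="cosine", fields_schema=None):
    """Validate create-collection parameters and return them as keyword arguments."""
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(dimension, bool) or not isinstance(dimension, int) or dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
    spec = {"name": name, "dimension": dimension, "metric": metric}
    if fields_schema is not None:
        for field, field_type in fields_schema.items():
            if field_type not in ALLOWED_FIELD_TYPES:
                raise ValueError(f"unsupported type for field {field!r}: {field_type!r}")
        spec["fields_schema"] = dict(fields_schema)
    return spec
```

The resulting dict uses the same keyword names as the parameter list above, so it could be unpacked into the create call.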

Upsert docs

- `docs`: list of `{id, vector, fields}` or tuples
- Supports `sparse_vector` and multi-vector collections
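
When ingestion code receives a mix of dicts and tuples, normalizing them first keeps the upsert call simple and lets vector length be checked up front. The sketch below assumes tuples take the form `(id, vector)` or `(id, vector, fields)`. The helper `normalize_docs` is hypothetical and does not call the SDK.

```python
def normalize_docs(docs, dimension):
    """Convert dicts or tuples into uniform {id, vector, fields} records.

    Assumes tuples are (id, vector) or (id, vector, fields).
    Raises ValueError when a vector does not match the collection dimension.
    """
    normalized = []
    for item in docs:
        if isinstance(item, dict):
            doc_id, vector = item["id"], item["vector"]
            fields = item.get("fields") or {}
        else:
            doc_id, vector, *rest = item
            fields = rest[0] if rest else {}
        if len(vector) != dimension:
            raise ValueError(
                f"doc {doc_id!r}: vector has {len(vector)} values, expected {dimension}"
            )
        normalized.append({"id": doc_id, "vector": list(vector), "fields": dict(fields)})
    return normalized
```

Each record can then be turned into whatever document object the upsert call expects.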

Query docs

- `vector` or `id` (pass at most one; if neither is given, only the filter is applied)
- `topk` (int)
- `filter` (SQL-like where clause)
- `output_fields` (list of field names)
- `include_vector` (bool)
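
The rule that `vector` and `id` should not be combined is easy to enforce in a small wrapper. This is a sketch under stated assumptions: the helper `build_query_kwargs` and its argument names `doc_id` and `where` are hypothetical (chosen to avoid shadowing Python builtins), and the default `topk=10` is illustrative only.

```python
def build_query_kwargs(vector=None, doc_id=None, topk=10, where=None,
                       output_fields=None, include_vector=False):
    """Assemble query keyword arguments, rejecting vector and id together."""
    if vector is not None and doc_id is not None:
        raise ValueError("pass either vector or id, not both")
    if isinstance(topk, bool) or not isinstance(topk, int) or topk <= 0:
        raise ValueError("topk must be a positive integer")
    kwargs = {"topk": topk, "include_vector": include_vector}
    if vector is not None:
        kwargs["vector"] = list(vector)
    if doc_id is not None:
        kwargs["id"] = doc_id
    if where:
        kwargs["filter"] = where  # SQL-like where clause
    if output_fields is not None:
        kwargs["output_fields"] = list(output_fields)
    return kwargs
```

Leaving out both `vector` and `doc_id` produces a filter-only query, matching the behavior described above.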

Quickstart (Python SDK)

```python
import os
import dashvector
from dashvector import Doc

client = dashvector.Client(
    api_key=os.getenv("DASHVECTOR_API_KEY"),
    endpoint=os.getenv("DASHVECTOR_ENDPOINT"),
)
```

1) Create a collection

```python
ret = client.create(
    name="docs",
    dimension=768,
    metric="cosine",
    fields_schema={"title": str, "source": str, "chunk": int},
)
assert ret
```

2) Upsert docs

```python
collection = client.get(name="docs")
ret = collection.upsert(
    [
        Doc(id="1", vector=[0.01] * 768, fields={"title": "Intro", "source": "kb", "chunk": 0}),
        Doc(id="2", vector=[0.02] * 768, fields={"title": "FAQ", "source": "kb", "chunk": 1}),
    ]
)
assert ret
```

3) Query

```python
ret = collection.query(
    vector=[0.01] * 768,
    topk=5,
    filter="source = 'kb' AND chunk >= 0",
    output_fields=["title", "source", "chunk"],
    include_vector=False,
)
for doc in ret:
    print(doc.id, doc.fields)
```

Script quickstart

```bash
python skills/ai/search/alicloud-ai-search-dashvector/scripts/quickstart.py
```

Environment variables:

- `DASHVECTOR_API_KEY`
- `DASHVECTOR_ENDPOINT`
- `DASHVECTOR_COLLECTION` (optional)
- `DASHVECTOR_DIMENSION` (optional)

Optional args: `--collection`, `--dimension`, `--topk`, `--filter`

Notes for Claude Code/Codex

- Prefer `upsert` for idempotent ingestion.
- Keep `dimension` aligned to your embedding model output size.
- Use filters to enforce tenant or dataset scoping.
- If using sparse vectors, pass `sparse_vector={token_id: weight, ...}` when upserting/querying.
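
Scoping is only reliable if the scope clause cannot be bypassed by caller-supplied input. One possible sketch follows. The helper `scoped_filter` is hypothetical, the allowlist pattern for values is an arbitrary illustrative choice, and wrapping the extra condition in parentheses assumes the filter syntax supports grouping.

```python
import re

# Illustrative allowlist: letters, digits, underscore, hyphen.
_SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def scoped_filter(field, value, extra=None):
    """Build a filter that always constrains `field` to `value`.

    Values outside the allowlist are rejected rather than escaped.
    Assumption: the filter syntax accepts parentheses for grouping.
    """
    if not _SAFE_TOKEN.fullmatch(field):
        raise ValueError(f"unsafe field name: {field!r}")
    if not _SAFE_TOKEN.fullmatch(value):
        raise ValueError(f"unsafe scope value: {value!r}")
    clause = f"{field} = '{value}'"
    if extra:
        # Group the extra condition so an OR inside it cannot widen the scope.
        return f"{clause} AND ({extra})"
    return clause
```

For example, `scoped_filter("source", "kb", "chunk >= 0")` yields a clause of the same shape as the quickstart filter, with the extra condition grouped.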

Error handling

- 401/403: invalid `DASHVECTOR_API_KEY`
- 400: invalid collection schema or dimension mismatch
- 429/5xx: retry with exponential backoff
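
Exponential backoff for the retryable cases can be sketched as a generic wrapper. How the SDK surfaces a 429 or 5xx status is not covered here, so the sketch takes a caller-supplied `is_retryable` predicate. The helper name `call_with_backoff` and the delay defaults are hypothetical choices, not SDK features.

```python
import random
import time


def call_with_backoff(func, is_retryable, max_attempts=5,
                      base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call func() until its result is not retryable or attempts run out.

    The wait doubles after each retryable result (capped at max_delay),
    plus random jitter so that concurrent clients do not retry in lockstep.
    Returns the last result either way.
    """
    result = None
    for attempt in range(max_attempts):
        result = func()
        if not is_retryable(result):
            return result
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 2))
    return result
```

Only 429 and 5xx should be treated as retryable. Authentication (401/403) and validation (400) errors will not succeed on a retry and are better reported immediately.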

References

- DashVector Python SDK: `Client.create`, `Collection.upsert`, `Collection.query`
- Source list: `references/sources.md`
