alicloud-ai-search-dashvector
Category: provider
DashVector Vector Search
Use DashVector to manage collections and perform vector similarity search with optional filters and sparse vectors.
Prerequisites
- Install the SDK (a venv is recommended to avoid PEP 668 restrictions):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashvector
```

- Provide credentials and endpoint via environment variables:
  - `DASHVECTOR_API_KEY`
  - `DASHVECTOR_ENDPOINT` (cluster endpoint)
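
Checking both variables before building a client tends to give a clearer failure than a rejected request later. A minimal sketch, assuming a hypothetical helper named `load_dashvector_config` (it is not part of the SDK):

```python
import os

REQUIRED_VARS = ("DASHVECTOR_API_KEY", "DASHVECTOR_ENDPOINT")


def load_dashvector_config(env=None):
    """Return the two required settings, or raise if any is missing or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced by a plain dict in tests.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["DASHVECTOR_API_KEY"].strip(),
        "endpoint": env["DASHVECTOR_ENDPOINT"].strip(),
    }
```

The returned keys match the `api_key` and `endpoint` keyword arguments that `dashvector.Client` takes in the quickstart.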
Normalized operations
Create collection
- `name` (str)
- `dimension` (int)
- `metric` (str: `cosine` | `dotproduct` | `euclidean`)
- `fields_schema` (optional dict of field types)
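
These parameters can be checked locally before any request is sent. The sketch below is a hypothetical pre-flight helper, not an SDK function; the set of accepted field types (`str`, `int`, `float`, `bool`) is an assumption to confirm against the SDK reference.

```python
ALLOWED_METRICS = ("cosine", "dotproduct", "euclidean")
# Assumption: scalar Python types accepted as field types in fields_schema.
ALLOWED_FIELD_TYPES = (str, int, float, bool)


def validate_collection_params(name, dimension, metric="cosine", fields_schema=None):
    """Validate create-collection arguments and return them as a dict."""
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(dimension, bool) or not isinstance(dimension, int) or dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {ALLOWED_METRICS}")
    for field, field_type in (fields_schema or {}).items():
        if field_type not in ALLOWED_FIELD_TYPES:
            raise ValueError(f"unsupported type for field {field!r}: {field_type!r}")
    return {
        "name": name,
        "dimension": dimension,
        "metric": metric,
        "fields_schema": fields_schema,
    }
```

The returned dict uses the same keyword names as the `client.create(...)` call in the quickstart, so it could be unpacked into that call.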
Upsert docs
- `docs`: list of `{id, vector, fields}` or tuples
- Supports `sparse_vector` and multi-vector collections
Query docs
- `vector` or `id` (one required; if both are empty, only the filter is applied)
- `topk` (int)
- `filter` (SQL-like where clause)
- `output_fields` (list of field names)
- `include_vector` (bool)
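
Since `filter` is a plain string, building it from values by hand is easy to get wrong. The following is a rough sketch of a hypothetical `build_filter` helper that joins equality conditions with `AND`. How the service escapes quotes inside string literals is not covered here, so the helper simply rejects such values rather than guess.

```python
def build_filter(equals, extra=None):
    """Build a SQL-like filter string from equality conditions.

    equals: dict mapping field name -> str, int, or float value.
    extra:  optional raw clause appended with AND, unchanged.
    """
    clauses = []
    for field, value in equals.items():
        if isinstance(value, bool):
            # Boolean literal syntax is not assumed here.
            raise TypeError(f"unsupported value type for {field!r}: bool")
        if isinstance(value, (int, float)):
            literal = str(value)
        elif isinstance(value, str):
            if "'" in value:
                raise ValueError(f"single quote not allowed in value for {field!r}")
            literal = f"'{value}'"
        else:
            raise TypeError(f"unsupported value type for {field!r}: {type(value).__name__}")
        clauses.append(f"{field} = {literal}")
    if extra:
        clauses.append(extra)
    return " AND ".join(clauses)


print(build_filter({"source": "kb"}, extra="chunk >= 0"))
# source = 'kb' AND chunk >= 0
```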
Quickstart (Python SDK)

```python
import os
import dashvector
from dashvector import Doc

client = dashvector.Client(
    api_key=os.getenv("DASHVECTOR_API_KEY"),
    endpoint=os.getenv("DASHVECTOR_ENDPOINT"),
)

# 1) Create a collection
ret = client.create(
    name="docs",
    dimension=768,
    metric="cosine",
    fields_schema={"title": str, "source": str, "chunk": int},
)
assert ret

# 2) Upsert docs
collection = client.get(name="docs")
ret = collection.upsert(
    [
        Doc(id="1", vector=[0.01] * 768, fields={"title": "Intro", "source": "kb", "chunk": 0}),
        Doc(id="2", vector=[0.02] * 768, fields={"title": "FAQ", "source": "kb", "chunk": 1}),
    ]
)
assert ret

# 3) Query
ret = collection.query(
    vector=[0.01] * 768,
    topk=5,
    filter="source = 'kb' AND chunk >= 0",
    output_fields=["title", "source", "chunk"],
    include_vector=False,
)
for doc in ret:
    print(doc.id, doc.fields)
```

Script quickstart

```bash
python skills/ai/search/alicloud-ai-search-dashvector/scripts/quickstart.py
```

Environment variables:
- `DASHVECTOR_API_KEY`
- `DASHVECTOR_ENDPOINT`
- `DASHVECTOR_COLLECTION` (optional)
- `DASHVECTOR_DIMENSION` (optional)

Optional args: `--collection`, `--dimension`, `--topk`, `--filter`.

Notes for Claude Code/Codex
- Prefer `upsert` for idempotent ingestion.
- Keep `dimension` aligned to your embedding model output size.
- Use filters to enforce tenant or dataset scoping.
- If using sparse vectors, pass `sparse_vector={token_id: weight, ...}` when upserting/querying.
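
Upserts are only idempotent when the same content always maps to the same document ID, and a dimension mismatch is cheaper to catch before the request than after. A minimal sketch of both checks follows. The helper names and the choice of a truncated SHA-256 hex digest as the ID are illustrative assumptions, not SDK conventions.

```python
import hashlib


def stable_doc_id(source, chunk):
    """Derive a deterministic 32-character hex ID from source and chunk index.

    Re-ingesting the same (source, chunk) pair yields the same ID, so an
    upsert overwrites the existing doc instead of creating a duplicate.
    """
    digest = hashlib.sha256(f"{source}:{chunk}".encode("utf-8")).hexdigest()
    return digest[:32]


def check_dimension(vector, dimension):
    """Return the vector unchanged, or raise if its length is not `dimension`."""
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    return vector
```

With these, a doc for chunk 0 of source `kb` could be built as `Doc(id=stable_doc_id("kb", 0), vector=check_dimension(vec, 768), fields=...)`, where `vec` is the embedding model output.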
Error handling
- 401/403: invalid `DASHVECTOR_API_KEY`
- 400: invalid collection schema or dimension mismatch
- 429/5xx: retry with exponential backoff
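
The retry advice above can be sketched as a small generic wrapper. This is one possible shape, not SDK behaviour: `retry_with_backoff` and its parameters are hypothetical, and the caller supplies `should_retry` because how a throttling or server error surfaces on a response object is not specified here.

```python
import random
import time


def retry_with_backoff(call, should_retry, max_attempts=5,
                       base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call `call()` until `should_retry(result)` is false or attempts run out.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...),
    is capped at max_delay, and gets up to 50% random jitter added.
    Exceptions raised by `call` are not caught and propagate to the caller.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        result = call()
        if not should_retry(result) or attempt == max_attempts - 1:
            return result
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay + random.uniform(0, delay / 2))
```

A call site might look like `retry_with_backoff(lambda: collection.query(...), should_retry=is_transient)`, where `is_transient` is a predicate you write to recognise throttling or server errors. Authentication and schema errors should not be retried, since repeating the request cannot fix them.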
References
- DashVector Python SDK: `Client.create`, `Collection.upsert`, `Collection.query`
- Source list: `references/sources.md`