Redis vector search guidance covering HNSW vs FLAT algorithm choice, vector index configuration (dims, distance metric, datatype), filtered hybrid search combining vector similarity with TAG or NUMERIC filters, and the RAG retrieval pattern with RedisVL. Use when defining a VECTOR field in FT.CREATE, integrating embeddings (OpenAI, Cohere, sentence-transformers), tuning HNSW parameters (M, EF_CONSTRUCTION, EF_RUNTIME), building a retrieval-augmented generation pipeline, or filtering vector results by attribute.
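```
npx skill4agent add redis/agent-skills redis-vector-search
```

Applies to `VECTOR` fields defined with `FT.CREATE` or with a RedisVL `IndexSchema`. Related: `redis-query-engine` (`FT.CREATE`, `FT.SEARCH`).

Vector field parameters:

- `DIM`: must match the output size of the embedding model (for example, 1536 for `text-embedding-3-small`).
- `DISTANCE_METRIC`: `COSINE`, `IP`, or `L2`.
- `TYPE` (`datatype` in RedisVL): `FLOAT32` or `FLOAT16`.

```
FT.CREATE idx:docs ON HASH PREFIX 1 doc:
  SCHEMA
    content TEXT
    embedding VECTOR HNSW 6
      TYPE FLOAT32
      DIM 1536
      DISTANCE_METRIC COSINE
```

```python
from redisvl.schema import IndexSchema

# Same index as the FT.CREATE command above, expressed as a RedisVL schema.
schema = IndexSchema.from_dict({
    "index": {"name": "idx:docs", "prefix": "doc:"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536, "algorithm": "HNSW",
            "datatype": "FLOAT32", "distance_metric": "COSINE",
        }},
    ]
})
```

| Algorithm | Speed | Accuracy | Memory | Best for |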
|---|---|---|---|---|
| HNSW | Fast (approximate) | ~95%+ recall (tunable) | Higher | Large datasets (>10k vectors), latency-sensitive |
| FLAT | Slow (exact) | 100% | Lower | Small datasets (<10k), accuracy-critical |
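The table's rule of thumb can be turned into a small helper that picks the algorithm from the expected dataset size. This is a minimal sketch, not part of RedisVL: `vector_field_attrs` and `FLAT_MAX_VECTORS` are hypothetical names, the 10k cut-off comes from the table above (which does not say which side exactly 10k falls on), and the HNSW values are assumed to be the Redis defaults written in RedisVL's attribute spelling (`m`, `ef_construction`, `ef_runtime`).

```python
from typing import Any, Dict

# Rule-of-thumb threshold from the comparison table; tune for your workload.
FLAT_MAX_VECTORS = 10_000


def vector_field_attrs(n_vectors: int, dims: int, exact: bool = False) -> Dict[str, Any]:
    """Return an `attrs` dict for a RedisVL vector field (hypothetical helper)."""
    attrs: Dict[str, Any] = {
        "dims": dims,
        "datatype": "FLOAT32",
        "distance_metric": "COSINE",
    }
    if exact or n_vectors < FLAT_MAX_VECTORS:
        # Small or accuracy-critical datasets: exact brute-force search.
        attrs["algorithm"] = "FLAT"
    else:
        # Larger datasets: approximate search with tunable recall.
        attrs["algorithm"] = "HNSW"
        # Assumed defaults; raise them for higher recall at the cost of
        # memory, build time, and query latency.
        attrs.update({"m": 16, "ef_construction": 200, "ef_runtime": 10})
    return attrs


small = vector_field_attrs(n_vectors=5_000, dims=1536)
large = vector_field_attrs(n_vectors=1_000_000, dims=1536)
```

The returned dict can be dropped into the `"attrs"` entry of the `IndexSchema.from_dict` call shown earlier.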
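
HNSW tuning parameters: `M` (maximum edges per graph node), `EF_CONSTRUCTION` (candidates examined while building the graph), and `EF_RUNTIME` (candidates examined per query). Higher values improve recall at the cost of memory, build time, or query latency.

```python
from redisvl.query import VectorQuery
from redisvl.query.filter import Num, Tag

# Assumes `index` is a connected RedisVL SearchIndex whose schema defines
# `category` as a TAG field and `date` as a NUMERIC field, and that
# `query_embedding` has the same dimensions as the indexed vectors.
filters = (Tag("category") == "technology") & (Num("date") >= 2024)

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "category", "date"],
    num_results=10,
    filter_expression=filters,
)
results = index.query(query)
```

To combine vector similarity with full-text scoring rather than attribute filters, RedisVL also provides `HybridQuery` and `AggregateHybridQuery`.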
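For readers working with `FT.SEARCH` directly, the filtered vector query above roughly corresponds to a pre-filter followed by a KNN clause. The sketch below only builds the command arguments, so it runs without a Redis server. `pack_float32` and `filtered_knn_args` are hypothetical helpers, the index is assumed to define `category` (TAG) and `date` (NUMERIC) alongside `embedding`, and the vector is packed as little-endian FLOAT32, which matches typical x86 and ARM hosts.

```python
import struct
from typing import List, Sequence, Union


def pack_float32(vector: Sequence[float]) -> bytes:
    """Pack floats as little-endian FLOAT32 bytes for a FLOAT32 VECTOR field."""
    return struct.pack(f"<{len(vector)}f", *vector)


def filtered_knn_args(
    index_name: str, filter_expr: str, k: int, vector: Sequence[float]
) -> List[Union[str, bytes]]:
    """Build FT.SEARCH arguments: filter first, then KNN over the matches."""
    query = f"({filter_expr})=>[KNN {k} @embedding $vec AS vector_distance]"
    return [
        "FT.SEARCH", index_name, query,
        "PARAMS", "2", "vec", pack_float32(vector),
        "SORTBY", "vector_distance",
        "DIALECT", "2",  # vector queries need query dialect 2 or later
    ]


args = filtered_knn_args(
    "idx:docs",
    "@category:{technology} @date:[2024 +inf]",
    k=10,
    vector=[0.1, 0.2, 0.3],  # toy 3-dim vector; a real one must match DIM
)
# With redis-py, the command could then be sent as:
#   redis.Redis().execute_command(*args)
```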
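```python
from redisvl.query import VectorQuery

# Assumes `index` (a connected RedisVL SearchIndex), `embed_model`,
# `documents`, and `llm` are defined elsewhere.

# Index documents with embeddings.
# Note: list-valued vectors suit JSON storage; for HASH storage (as in the
# FT.CREATE example) store FLOAT32 bytes instead, for example
# np.asarray(vec, dtype=np.float32).tobytes().
records = [{"content": doc.content,
            "embedding": embed_model.encode(doc.content).tolist(),
            "source": doc.source}
           for doc in documents]
index.load(records)

# Retrieve relevant context for a user question
q_emb = embed_model.encode(user_question).tolist()
results = index.query(VectorQuery(
    vector=q_emb,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5,
))

# Generate with retrieved context
context = "\n".join(r["content"] for r in results)
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
```

The examples use the `COSINE` distance metric and load records in a batch with `index.load([...])`.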
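The generation step above joins every retrieved chunk into the prompt. One possible refinement, sketched here under assumptions, is to tag each chunk with its `source` and stop adding chunks once a character budget is reached. `build_prompt` and `max_chars` are hypothetical names; the result dicts are assumed to carry the `content` and `source` fields requested in `return_fields`.

```python
from typing import Dict, List


def build_prompt(question: str, results: List[Dict[str, str]], max_chars: int = 4000) -> str:
    """Assemble a RAG prompt from retrieved results (hypothetical helper)."""
    parts: List[str] = []
    used = 0
    for r in results:  # results are assumed to arrive most-similar first
        chunk = f"[{r.get('source', 'unknown')}] {r['content']}"
        if used + len(chunk) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(chunk)
        used += len(chunk)
    context = "\n".join(parts)
    return f"Context: {context}\n\nQuestion: {question}"


sample_results = [
    {"content": "Redis supports HNSW.", "source": "a.md"},
    {"content": "x" * 50, "source": "b.md"},
]
prompt = build_prompt("What?", sample_results, max_chars=40)
```

A character budget is only a rough stand-in for a token limit; a tokenizer-based count would be more precise if one is available for the target model.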