# AI Data Engineering
## Purpose
Build data infrastructure for AI/ML systems including RAG pipelines, feature stores, and embedding generation. Provides architecture patterns, orchestration workflows, and evaluation metrics for production AI applications.
## When to Use
Use this skill when:
- Building RAG (Retrieval-Augmented Generation) pipelines
- Implementing semantic search or vector databases
- Setting up ML feature stores for real-time serving
- Creating embedding generation pipelines
- Evaluating RAG quality with RAGAS metrics
- Orchestrating data workflows for AI systems
- Integrating with frontend skills (ai-chat, search-filter)
Skip this skill if:
- Building traditional CRUD applications (use databases-relational)
- Simple key-value storage (use databases-nosql)
- No AI/ML components in the application
## RAG Pipeline Architecture
RAG pipelines have 5 distinct stages. Understanding this architecture is critical for production implementations.
```
┌─────────────────────────────────────────────────────────────┐
│ RAG Pipeline (5 Stages)                                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ 1. INGESTION  → Load documents (PDF, DOCX, Markdown)        │
│ 2. INDEXING   → Chunk (512 tokens) + Embed + Store          │
│ 3. RETRIEVAL  → Query embedding + Vector search + Filters   │
│ 4. GENERATION → Context injection + LLM streaming           │
│ 5. EVALUATION → RAGAS metrics (faithfulness, relevancy)     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

For complete RAG architecture with implementation patterns, see:

- `references/rag-architecture.md` - Detailed 5-stage breakdown
- `examples/langchain-rag/basic_rag.py` - Working implementation
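To make the five stages concrete, here is a toy, dependency-free sketch of the whole loop. Everything in it is a stand-in: bag-of-words counts replace real embeddings, a prompt string replaces the LLM call, and token overlap replaces RAGAS metrics; names such as `answer_query` and `faithfulness` are illustrative only, not part of any library.

```python
import math
from collections import Counter

# 1. INGESTION: load raw documents (plain strings stand in for PDF/DOCX loaders).
docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]

# 2. INDEXING: one chunk per document here; bag-of-words counts stand in
# for real embedding vectors.
def embed(text):
    return Counter(text.lower().replace(".", " ").replace("?", " ").split())

index = [(doc, embed(doc)) for doc in docs]

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 3. RETRIEVAL: embed the query, rank chunks by similarity, take top-k.
def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 4. GENERATION: inject retrieved context into the prompt
# (a real pipeline streams this through an LLM).
def answer_query(query):
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

# 5. EVALUATION: crude faithfulness proxy -- fraction of answer tokens
# found in the retrieved context (RAGAS uses an LLM judge instead).
def faithfulness(answer, context):
    a, c = set(embed(answer)), set(embed(context))
    return len(a & c) / len(a) if a else 0.0

prompt = answer_query("What is the capital of France?")
```

A production pipeline swaps each stand-in for the real component (loaders, an embedding model, a vector database, an LLM, RAGAS) without changing the shape of the flow.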
## Chunking Strategies
Chunking is the most critical decision for RAG quality. Poor chunking breaks retrieval.
Default Recommendation:
- Size: 512 tokens
- Overlap: 50-100 tokens
- Method: Fixed token-based
Why these values:
- Too small (<256 tokens): Loses context, requires many retrievals
- Too large (>1024 tokens): Includes irrelevant content, hits token limits
- Overlap prevents information loss at chunk boundaries
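To make the default concrete, here is a minimal, dependency-free sketch of fixed-size chunking with overlap. List elements stand in for tokenizer tokens; in practice you would use a token-aware splitter such as LangChain's `TokenTextSplitter`.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into fixed-size chunks; consecutive chunk
    starts advance by (chunk_size - overlap), so adjacent chunks
    share `overlap` tokens at the boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
# 1200 tokens -> 3 chunks starting at 0, 462, 924; the last 50 tokens
# of each chunk repeat as the first 50 of the next.
```

The overlap is what protects a sentence that straddles a boundary: it appears whole in at least one of the two neighboring chunks.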
**Alternative strategies for special cases:**

```python
# Code-aware chunking (preserves functions/classes)
from langchain.text_splitter import Language, RecursiveCharacterTextSplitter

code_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON,
    chunk_size=512,
    chunk_overlap=50
)

# Semantic chunking (splits on meaning, not tokens)
# Note: SemanticChunker is provided by the langchain-experimental package.
from langchain_experimental.text_splitter import SemanticChunker

semantic_splitter = SemanticChunker(
    embeddings=embeddings,  # any embeddings instance, e.g. OpenAIEmbeddings()
    breakpoint_threshold_type="percentile"  # split at semantic boundaries
)
```

**See:** `references/chunking-strategies.md` for the complete decision framework

## Embedding Generation
嵌入生成
Embedding quality directly impacts retrieval accuracy. Voyage AI is currently best-in-class.
Primary Recommendation: Voyage AI voyage-3
- Dimensions: 1024
- MTEB Score: 69.0 (highest as of Dec 2025)
- Cost: $$$ but 9.74% better than OpenAI
- Use for: Production systems requiring best retrieval quality
Cost-Effective Alternative: OpenAI text-embedding-3-small
- Dimensions: 1536
- MTEB Score: 62.3
- Cost: $ (5x cheaper than voyage-3)
- Use for: Development, prototyping, cost-sensitive applications
**Implementation:**

```python
from langchain_voyageai import VoyageAIEmbeddings
from langchain_openai import OpenAIEmbeddings

# Production (best quality)
embeddings = VoyageAIEmbeddings(
    model="voyage-3",
    voyage_api_key="your-api-key"
)

# Development (cost-effective)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    openai_api_key="your-api-key"
)
```

**See:** `references/embedding-strategies.md` for complete provider comparison

## RAGAS Evaluation Metrics
Traditional metrics (BLEU, ROUGE) don't measure RAG quality. RAGAS provides LLM-as-judge evaluation.
4 Core Metrics:
| Metric | Measures | Good Score |
|---|---|---|
| Faithfulness | Factual consistency with retrieved context | > 0.8 |
| Answer Relevancy | Does answer address the user's question? | > 0.7 |
| Context Precision | Are retrieved chunks actually relevant? | > 0.6 |
| Context Recall | Were all necessary chunks retrieved? | > 0.7 |
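Context precision and recall are the familiar set metrics applied to retrieved chunks. RAGAS estimates relevance with an LLM judge; this dependency-free sketch uses exact chunk identity instead, and the chunk names are made up for illustration.

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved chunks that are actually relevant."""
    if not retrieved:
        return 0.0
    return sum(c in relevant for c in retrieved) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant chunks that were retrieved."""
    if not relevant:
        return 0.0
    return sum(c in retrieved for c in relevant) / len(relevant)

retrieved = ["chunk-a", "chunk-b", "chunk-c"]
relevant = {"chunk-a", "chunk-c", "chunk-d"}
# precision = 2/3 (chunk-b was noise), recall = 2/3 (chunk-d was missed)
```

Low precision means the retriever returns noise that wastes context-window budget; low recall means the generator never sees facts it needs.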
**Quick evaluation script:**

```bash
# Run RAGAS evaluation (TOKEN-FREE script execution)
python scripts/evaluate_rag.py --dataset eval_data.json --output results.json
```
**Manual implementation:**

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France's capital is Paris."]],
    "ground_truth": ["Paris"]
}

# ragas expects a Hugging Face Dataset, not a plain dict
result = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(f"Faithfulness: {result['faithfulness']}")
print(f"Answer Relevancy: {result['answer_relevancy']}")
```

**See:** `references/evaluation-metrics.md` for the complete RAGAS implementation guide

## Feature Stores
Feature stores solve the "training-serving skew" problem by providing consistent feature computation.

**Primary Recommendation: Feast** - Open source, works with any backend (PostgreSQL, Redis, DynamoDB, S3, BigQuery, Snowflake)

**Basic usage:**

```python
from feast import FeatureStore

store = FeatureStore(repo_path="feature_repo/")

# Online serving (low-latency)
features = store.get_online_features(
    features=["user_features:total_orders"],
    entity_rows=[{"user_id": 1001}]
).to_dict()
```

**See:** `references/feature-stores.md` for complete Feast setup and alternatives (Tecton, Hopsworks)

## LangChain Orchestration
LangChain is the primary framework for LLM orchestration with the largest ecosystem (24,215+ API reference snippets).

Context7 Library ID: `/websites/langchain_oss_python_langchain` (Trust: High, Snippets: 435)

**Basic RAG Chain:**

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_qdrant import QdrantVectorStore
from langchain_voyageai import VoyageAIEmbeddings

# Setup retriever (qdrant_client is an already-configured QdrantClient)
vectorstore = QdrantVectorStore(
    client=qdrant_client,
    collection_name="documents",
    embedding=VoyageAIEmbeddings(model="voyage-3")
)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})

# Build chain
prompt = ChatPromptTemplate.from_template(
    "Answer based on context:\n{context}\n\nQuestion: {question}"
)
chain = {"context": retriever, "question": lambda x: x} | prompt | ChatOpenAI() | StrOutputParser()

# Stream response
for chunk in chain.stream("What is the capital of France?"):
    print(chunk, end="", flush=True)
```

**See:** `references/langchain-patterns.md` - Complete LangChain 0.3+ patterns with streaming and hybrid search
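The retriever above requests `search_type="mmr"`. Maximal Marginal Relevance greedily balances relevance to the query against redundancy with results already selected, so near-duplicate chunks don't crowd out distinct information. A dependency-free sketch of the idea (the similarity numbers are made up for illustration):

```python
def mmr(query_sim, doc_sims, k=2, lambda_mult=0.5):
    """Greedy MMR selection.
    query_sim: query_sim[i] = similarity(query, doc_i)
    doc_sims:  doc_sims[i][j] = similarity(doc_i, doc_j)
    """
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            # Relevance minus penalty for similarity to anything already chosen.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_mult * query_sim[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Doc 1 is nearly a duplicate of doc 0; doc 2 is less relevant but distinct.
query_sim = [0.9, 0.85, 0.6]
doc_sims = [[1.0, 0.95, 0.2],
            [0.95, 1.0, 0.2],
            [0.2, 0.2, 1.0]]
picked = mmr(query_sim, doc_sims, k=2)
# Picks doc 0, then skips the near-duplicate doc 1 in favor of doc 2.
```

Plain similarity search would return docs 0 and 1 here; MMR trades a little relevance for coverage, which is usually what RAG context windows need.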
## Orchestration Tools

Modern AI pipelines require workflow orchestration beyond cron jobs.

**Primary Recommendation: Dagster (for ML/AI pipelines)** - Asset-centric design, best lineage tracking, perfect for RAG

**Example: Embedding Pipeline**

```python
from dagster import asset
from langchain_voyageai import VoyageAIEmbeddings

@asset
def raw_documents():
    """Load documents from S3."""
    documents = ...  # loading implementation omitted
    return documents

@asset
def chunked_documents(raw_documents):
    """Split into 512-token chunks with 50-token overlap."""
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=50)
    return splitter.split_documents(raw_documents)

@asset
def embedded_documents(chunked_documents):
    """Generate embeddings with Voyage AI."""
    embeddings = VoyageAIEmbeddings(model="voyage-3")
    return embeddings.embed_documents([doc.page_content for doc in chunked_documents])
```

**See:** `references/orchestration-tools.md` for complete Dagster patterns and alternatives (Prefect, Airflow 3.0, dbt)

## Integration with Frontend Skills
### ai-chat Skill → RAG Backend

The ai-chat skill consumes RAG pipeline outputs for streaming responses.

**Backend API (FastAPI):**

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI

app = FastAPI()

@app.post("/api/rag/stream")
async def stream_rag(query: str):
    async def generate():
        # vectorstore is assumed to be initialized at application startup
        chain = RetrievalQA.from_chain_type(
            llm=OpenAI(streaming=True),
            retriever=vectorstore.as_retriever()
        )
        async for chunk in chain.astream(query):
            yield chunk
    return StreamingResponse(generate(), media_type="text/plain")
```

**See:** `references/rag-architecture.md` for complete frontend integration patterns

### search-filter Skill → Semantic Search
The search-filter skill uses semantic search backends for vector similarity.

**Backend (Qdrant + Voyage AI):**

```python
from qdrant_client import QdrantClient
from langchain_voyageai import VoyageAIEmbeddings

# app is the FastAPI instance defined above
@app.post("/api/search/semantic")
async def semantic_search(query: str, filters: dict):
    query_vector = VoyageAIEmbeddings(model="voyage-3").embed_query(query)
    results = QdrantClient().search(
        collection_name="documents",
        query_vector=query_vector,
        query_filter=filters,
        limit=10
    )
    return {"results": results}
```

## Data Versioning
**Primary Recommendation: LakeFS** (acquired DVC team November 2025)

Git-like operations on data lakes: branch, commit, merge, time travel. Works with S3/Azure/GCS.

```python
import lakefs

# High-level lakefs SDK; the repository name is illustrative
repo = lakefs.repository("example-repo")
branch = repo.branch("experiment-voyage-3").create(source_reference="main")
branch.commit(message="Updated embeddings to voyage-3")
branch.merge_into(repo.branch("main"))
```

**See:** `references/data-versioning.md` for complete LakeFS setup

## Quick Start Workflow
**1. Set up vector database:**

```bash
# Run Qdrant setup script (TOKEN-FREE execution)
python scripts/setup_qdrant.py --collection docs --dimension 1024
```

**2. Chunk and embed documents:**

```bash
# Chunk documents (TOKEN-FREE execution)
python scripts/chunk_documents.py \
    --input data/documents/ \
    --chunk-size 512 \
    --overlap 50 \
    --output data/chunks/
```

**3. Implement RAG pipeline:**

See `examples/langchain-rag/basic_rag.py` for a complete working example.

**4. Evaluate with RAGAS:**

```bash
# Run evaluation (TOKEN-FREE execution)
python scripts/evaluate_rag.py \
    --dataset data/eval_qa.json \
    --output results/ragas_metrics.json
```

**5. Deploy with orchestration:**

See `examples/dagster-pipelines/embedding_pipeline.py` for production deployment.

## Dependencies
**Required Python packages:**

```bash
# Core RAG
pip install langchain langchain-core langchain-openai langchain-voyageai langchain-qdrant
pip install langchain-experimental  # for SemanticChunker

# Vector database
pip install qdrant-client

# Evaluation
pip install ragas datasets

# Feature stores
pip install feast

# Orchestration
pip install dagster dagster-webserver

# Data versioning
pip install lakefs
```

**Optional for alternatives:**

```bash
# LlamaIndex (alternative to LangChain)
pip install llama-index

# dbt (SQL transformations)
pip install dbt-core dbt-postgres

# Prefect (alternative orchestration)
pip install prefect
```

## Troubleshooting
Common Issues:
1. Poor retrieval quality - Check chunk size (try 512 tokens), increase overlap (50-100), try hybrid search, re-rank with Cohere
2. Slow embedding generation - Batch documents (100-1000), use async APIs, cache with Redis, use smaller model for dev
3. High LLM costs - Reduce retrieved chunks (k=3), use cheaper re-ranking models, cache frequent queries
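The batching advice in item 2 can be sketched with a small, library-agnostic helper; `embed_documents` in the usage comment is whatever batch-embedding call your provider exposes.

```python
def batched(items, batch_size=256):
    """Yield successive fixed-size batches (the last one may be short).
    One API call per batch amortizes per-request overhead."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Usage with an embedding client (illustrative):
# vectors = []
# for batch in batched(texts, 256):
#     vectors.extend(embeddings.embed_documents(batch))

docs = [f"doc-{i}" for i in range(1000)]
batches = list(batched(docs, 256))
# 1000 documents -> batches of 256, 256, 256, 232
```

Combine batching with async requests and a Redis cache keyed on content hash, and most embedding-throughput problems disappear.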
**See:** `references/rag-architecture.md` for the complete troubleshooting guide

## Best Practices
最佳实践
**Chunking:** Default to 512 tokens with 50-token overlap. Use semantic chunking for complex documents. Preserve code structure for source code.

**Embeddings:** Use Voyage AI voyage-3 for production, OpenAI text-embedding-3-small for development. Never mix embedding models (re-embed everything if changing).

**Evaluation:** Run RAGAS metrics on every pipeline change. Maintain a test dataset of 50+ question-answer pairs. Track metrics over time.

**Orchestration:** Use Dagster for ML/AI pipelines, dbt for SQL transformations only. Version control all pipeline code.

**Frontend Integration:** Always stream LLM responses. Implement retry logic. Show citations/sources to users. Handle empty results gracefully.
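The retry advice can be sketched as a small exponential-backoff wrapper. This is illustrative only: production code would also cap total elapsed time, add jitter, and catch only transient error types rather than bare `Exception`.

```python
import time

def with_retries(fn, attempts=3, base_delay=0.1):
    """Call fn(); on failure retry with exponential backoff
    (base_delay, 2*base_delay, ...), re-raising after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Demo: a call that fails twice, then succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = with_retries(flaky)
```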
## Additional Resources
**Reference Documentation:**
- `references/rag-architecture.md` - Complete RAG pipeline guide
- `references/chunking-strategies.md` - Decision framework for chunking
- `references/embedding-strategies.md` - Embedding model comparison
- `references/langchain-patterns.md` - LangChain 0.3+ patterns
- `references/feature-stores.md` - Feast setup and alternatives
- `references/evaluation-metrics.md` - RAGAS implementation guide

**Working Examples:**
- `examples/langchain-rag/basic_rag.py` - Simple RAG chain
- `examples/langchain-rag/streaming_rag.py` - Streaming responses
- `examples/langchain-rag/hybrid_search.py` - Vector + BM25
- `examples/llamaindex-agents/query_engine.py` - LlamaIndex alternative
- `examples/feast-features/` - Complete feature store setup
- `examples/dagster-pipelines/embedding_pipeline.py` - Production pipeline

**Executable Scripts (TOKEN-FREE):**
- `scripts/evaluate_rag.py` - RAGAS evaluation runner
- `scripts/chunk_documents.py` - Document chunking utility
- `scripts/benchmark_retrieval.py` - Retrieval quality benchmark
- `scripts/setup_qdrant.py` - Qdrant collection setup