Build RAG systems - embeddings, vector stores, chunking, and retrieval optimization
```bash
npx skill4agent add pluginagentmarketplace/custom-plugin-ai-agents rag-systems
```

| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| | string | Yes | RAG goal | - |
| | enum | No | | |
| | string | No | Embedding model | |
| | int | No | Chunk size in chars | |

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Split documents into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

# 2. Embed the chunks and index them in a vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Retrieve the top-k most similar chunks for a query
docs = vectorstore.similarity_search("query", k=5)
```

| Content Type | Size (chars) | Overlap (chars) | Rationale |
|---|---|---|---|
| Technical docs | 500-800 | 100 | Preserve code |
| Legal docs | 1000-1500 | 200 | Keep clauses |
| Q&A/FAQ | 200-400 | 50 | Atomic answers |
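The size/overlap trade-offs in the table can be illustrated with a minimal character-window chunker. This is a simplified stand-in for `RecursiveCharacterTextSplitter` (which additionally tries to split on separator boundaries like paragraphs and sentences), not the library's actual implementation:

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` chars."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Q&A/FAQ settings from the table: small, atomic chunks.
# Each chunk starts 150 chars after the previous one, so adjacent
# chunks share 50 characters of context across the boundary.
faq_chunks = chunk_text("x" * 1000, chunk_size=200, chunk_overlap=50)
```

Larger overlaps reduce the chance that an answer is cut in half at a chunk boundary, at the cost of indexing (and paying to embed) some text twice.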

| Model | Cost/1M tokens |
|---|---|
| text-embedding-3-small | $0.02 |
| text-embedding-3-large | $0.13 |
| Cohere embed-v3 | $0.10 |
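A quick way to compare the models above is to estimate the cost of embedding a corpus. The sketch below uses the per-million-token prices from the table and a rough chars/4 token heuristic (an assumption; real token counts depend on the tokenizer):

```python
# Prices per 1M tokens, taken from the table above (USD).
PRICE_PER_1M_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "cohere-embed-v3": 0.10,
}

def embedding_cost(total_chars: int, model: str) -> float:
    """Estimate USD embedding cost, assuming ~4 characters per token."""
    tokens = total_chars / 4
    return tokens / 1_000_000 * PRICE_PER_1M_TOKENS[model]

# 10,000 documents of ~4,000 chars each is ~10M tokens:
# $0.20 with the small model vs $1.30 with the large one.
cost_small = embedding_cost(10_000 * 4_000, "text-embedding-3-small")
```

At a 6.5x price gap, it is usually worth benchmarking whether the larger model's retrieval quality actually pays off on your corpus.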

| Issue | Solution |
|---|---|
| Irrelevant results | Improve chunking, add reranking |
| Missing context | Increase k, use parent retriever |
| Hallucinations | Add "only use context" prompt |
| Slow retrieval | Add caching, reduce k |
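For the hallucination row, the "only use context" fix amounts to grounding the prompt in the retrieved chunks. A minimal sketch (the template wording and helper name are illustrative assumptions, not part of this skill):

```python
def build_rag_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that forbids answering beyond the retrieved context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below. "
        'If the context does not contain the answer, say "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# The retrieved docs from similarity_search would supply the chunks here.
prompt = build_rag_prompt("What chunk size is used?", ["Chunks are 1000 chars."])
```

The explicit "I don't know" escape hatch matters: without it, models tend to fall back on parametric knowledge when the context is thin.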
Related skills: llm-integration · agent-memory · ai-agent-basics