Use when "LangChain", "LLM chains", "ReAct agents", "tool calling", or asking about "RAG pipelines", "conversation memory", "document QA", "agent tools", "LangSmith"
npx skill4agent add eyadsibai/ltk langchain-agents

**Core Components**

| Component | Purpose | Key Concept |
|---|---|---|
| Chat Models | LLM interface | Unified API across providers |
| Agents | Tool use + reasoning | ReAct pattern |
| Chains | Sequential operations | Composable pipelines |
| Memory | Conversation state | Buffer, summary, vector |
| Retrievers | Document lookup | Vector search, hybrid |
| Tools | External capabilities | Functions agents can call |
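The components above are essentially function composition: each stage consumes the previous stage's output. A framework-free sketch of a prompt → model → parser chain, where `FakeChatModel` and the template are illustrative stand-ins, not LangChain classes:

```python
# Minimal sketch of chain composition: prompt -> model -> parser.
# FakeChatModel is a stand-in; a real chain would plug in a provider chat model.

class FakeChatModel:
    def invoke(self, prompt: str) -> str:
        return f"ECHO: {prompt}"

def prompt_template(topic: str) -> str:
    return f"Explain {topic} in one sentence."

def output_parser(raw: str) -> str:
    return raw.removeprefix("ECHO: ").strip()

def chain(topic: str) -> str:
    # Each stage feeds the next -- the "composable pipelines" idea.
    return output_parser(FakeChatModel().invoke(prompt_template(topic)))

print(chain("vector search"))
```

Swapping any stage (a different model, a stricter parser) leaves the rest of the pipeline untouched, which is the main payoff of this design.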

**Agent Patterns**

| Pattern | Description | Use Case |
|---|---|---|
| ReAct | Reason-Act-Observe loop | General tool use |
| Plan-and-Execute | Plan first, then execute | Complex multi-step |
| Self-Ask | Generate sub-questions | Research tasks |
| Structured Chat | JSON tool calling | API integration |
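The ReAct loop can be sketched without any framework: the model alternates Thought → Action → Observation until it emits a final answer. The scripted "model" and calculator tool below are illustrative stand-ins for a real LLM and tool set:

```python
# Framework-free sketch of the ReAct loop.
# eval() here is for illustration only -- never eval untrusted input.

TOOLS = {"calculator": lambda expr: str(eval(expr))}

def scripted_model(transcript: str) -> str:
    # A real agent would call an LLM here; this script mimics two turns.
    if "Observation:" not in transcript:
        return "Thought: I need math.\nAction: calculator[2 + 3]"
    return "Final Answer: 5"

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = scripted_model(transcript)
        if reply.startswith("Final Answer:"):
            return reply.split(":", 1)[1].strip()
        # Parse "Action: tool[input]" and run the named tool.
        action = reply.split("Action:")[1].strip()
        name, arg = action.split("[", 1)
        observation = TOOLS[name.strip()](arg.rstrip("]"))
        transcript += f"\n{reply}\nObservation: {observation}"
    return "gave up"

print(react("What is 2 + 3?"))
```

The `max_steps` cap matters in practice: without it, a confused model can loop on the same action indefinitely.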

**Tool Definition Elements**

| Element | Purpose |
|---|---|
| Name | How agent refers to tool |
| Description | When to use (critical for selection) |
| Parameters | Input schema |
| Return type | What agent receives back |
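A tool definition bundles exactly these four elements. A sketch using a plain dataclass (`ToolSpec` and `get_weather` are hypothetical names, not LangChain APIs):

```python
# Sketch of a tool definition carrying the four elements the agent sees.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolSpec:
    name: str            # how the agent refers to the tool
    description: str     # drives tool *selection* -- keep it specific
    parameters: dict     # JSON-schema-style input description
    returns: str         # what the agent receives back
    func: Callable

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real API call

weather_tool = ToolSpec(
    name="get_weather",
    description="Look up current weather for a city. Use for any weather question.",
    parameters={"city": {"type": "string"}},
    returns="string",
    func=get_weather,
)

print(weather_tool.func("Paris"))  # the agent invokes this after selecting the tool
```

The description is the element that most affects behavior: the model chooses tools by matching the user's request against these descriptions, so vague descriptions lead to wrong tool calls.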

**RAG Pipeline Stages**

| Stage | Purpose | Options |
|---|---|---|
| Load | Ingest documents | Web, PDF, GitHub, DBs |
| Split | Chunk into pieces | Recursive, semantic |
| Embed | Convert to vectors | OpenAI, Cohere, local |
| Store | Index vectors | Chroma, FAISS, Pinecone |
| Retrieve | Find relevant chunks | Similarity, MMR, hybrid |
| Generate | Create response | LLM with context |
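The six stages fit in a few dozen lines as a toy pipeline. The bag-of-words "embedding" and template "generation" below are stand-ins for a real embedding model and LLM:

```python
# Toy end-to-end RAG: load -> split -> embed -> store -> retrieve -> generate.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())  # stand-in for a real embedding model

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Load + split: here the "documents" are already small chunks.
chunks = [
    "Chroma is a local vector store good for development.",
    "Pinecone is a managed cloud vector store for production scale.",
    "MMR retrieval balances diversity and relevance.",
]
store = [(c, embed(c)) for c in chunks]          # store: index the vectors

def retrieve(query: str, k: int = 1) -> list:
    qv = embed(query)
    ranked = sorted(store, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

def generate(query: str) -> str:                 # generate: LLM with context
    context = " ".join(retrieve(query))
    return f"Answer based on: {context}"

print(generate("Which store suits local development?"))
```

Every production RAG system is this skeleton with better parts: real loaders, a tuned splitter, a learned embedding model, a persistent store, and an LLM in the generate step.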

**Chunking Strategies**

| Strategy | Best For | Typical Size |
|---|---|---|
| Recursive | General text | 500-1000 chars |
| Semantic | Coherent passages | Variable |
| Token-based | LLM context limits | 256-512 tokens |
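The recursive strategy can be sketched without a framework: split on the coarsest separator first (paragraphs, then lines, then words), recursing with finer separators only for pieces that are still too long:

```python
# Framework-free sketch of recursive character splitting.

def recursive_split(text: str, max_len: int = 80,
                    seps: tuple = ("\n\n", "\n", " ")) -> list:
    if len(text) <= max_len or not seps:
        return [text]                       # small enough, or out of separators
    sep, finer = seps[0], seps[1:]
    if sep not in text:
        return recursive_split(text, max_len, finer)
    chunks, buf = [], ""
    for part in text.split(sep):
        candidate = f"{buf}{sep}{part}" if buf else part
        if len(candidate) <= max_len:
            buf = candidate                 # greedily merge small pieces
        else:
            if buf:
                chunks.append(buf)
            if len(part) > max_len:         # piece itself too big: go finer
                chunks.extend(recursive_split(part, max_len, finer))
                buf = ""
            else:
                buf = part
    if buf:
        chunks.append(buf)
    return chunks

print(recursive_split("para one " * 10 + "\n\n" + "para two " * 10, max_len=95))
```

The point of the separator hierarchy is to keep semantically related text together: a paragraph boundary is only crossed when a paragraph alone exceeds the limit.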

**Retrieval Strategies**

| Strategy | How It Works |
|---|---|
| Similarity | Nearest neighbors by embedding |
| MMR | Diversity + relevance balance |
| Hybrid | Keyword + semantic combined |
| Self-query | LLM generates metadata filters |
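MMR is simple enough to show directly: greedily pick the document most relevant to the query while penalizing similarity to documents already selected. A sketch over precomputed similarity scores:

```python
# Sketch of Maximal Marginal Relevance over precomputed similarities.

def mmr(query_sim, doc_sims, k=2, lam=0.5):
    """query_sim[i]: similarity of doc i to the query.
    doc_sims[i][j]: similarity between docs i and j.
    lam=1.0 is pure relevance; lam=0.0 is pure diversity."""
    selected, remaining = [], list(range(len(query_sim)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; doc 2 is less relevant but different.
query_sim = [0.9, 0.85, 0.6]
doc_sims = [[1.0, 0.95, 0.1],
            [0.95, 1.0, 0.1],
            [0.1, 0.1, 1.0]]
print(mmr(query_sim, doc_sims))  # picks 0 first, then 2 over the duplicate 1
```

Plain similarity search would return the two near-duplicates and waste context window; MMR trades a little relevance for coverage.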

**Memory Types**

| Type | Stores | Best For |
|---|---|---|
| Buffer | Full conversation | Short conversations |
| Window | Last N messages | Medium conversations |
| Summary | LLM-generated summary | Long conversations |
| Vector | Embedded messages | Semantic recall |
| Entity | Extracted entities | Track facts about people/things |
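Window memory is the easiest to sketch: keep only the last N exchanges so the prompt stays bounded. Buffer memory is the same idea with no limit (`WindowMemory` is a hypothetical name for illustration):

```python
# Sketch of window memory: a bounded deque of (user, assistant) turns.
from collections import deque

class WindowMemory:
    def __init__(self, n: int = 3):
        self.turns = deque(maxlen=n)       # oldest turns fall off automatically

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def as_prompt(self) -> str:
        # Rendered into the next prompt so the model sees recent context.
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)

mem = WindowMemory(n=2)
for i in range(4):
    mem.add(f"question {i}", f"answer {i}")
print(mem.as_prompt())  # only questions 2 and 3 survive
```

Summary and vector memory exist precisely because this truncation is lossy: they compress or index the dropped turns instead of discarding them.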

**Document Loaders**

| Source | Loader Type |
|---|---|
| Web pages | WebBaseLoader, AsyncChromium |
| PDFs | PyPDFLoader, UnstructuredPDF |
| Code | GitHubLoader, DirectoryLoader |
| Databases | SQLDatabase, Postgres |
| APIs | Custom loaders |

**Vector Stores**

| Store | Type | Best For |
|---|---|---|
| Chroma | Local | Development, small datasets |
| FAISS | Local | Large local datasets |
| Pinecone | Cloud | Production, scale |
| Weaviate | Self-hosted/Cloud | Hybrid search |
| Qdrant | Self-hosted/Cloud | Filtering, metadata |

**LangSmith Features**

| Feature | Benefit |
|---|---|
| Tracing | See every LLM call, tool use |
| Evaluation | Test prompts systematically |
| Datasets | Store test cases |
| Monitoring | Track production performance |
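Tracing is typically switched on with environment variables rather than code changes; the variable names below follow LangSmith's setup documentation, but check the current docs since they evolve:

```shell
# Enable LangSmith tracing for any LangChain program run from this shell.
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"   # from the LangSmith UI
export LANGCHAIN_PROJECT="my-rag-app"                 # optional: groups runs by project
```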

**Best Practices**

| Practice | Why |
|---|---|
| Start simple | Add agents and RAG only when a plain chain isn't enough |
| Enable streaming | Better UX for long responses |
| Use LangSmith | Essential for debugging |
| Optimize chunk size | 500-1000 chars typically works |
| Cache embeddings | They're expensive to compute |
| Test retrieval separately | RAG quality depends on retrieval |
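Caching embeddings is the cheapest win in the table: memoize by text hash so repeated chunks (re-indexing, retries) never trigger a second paid call. A sketch where `expensive_embed` stands in for a real embedding API:

```python
# Sketch of an embedding cache keyed by content hash.
import hashlib

CALLS = 0

def expensive_embed(text: str) -> list:
    global CALLS
    CALLS += 1                      # stand-in for a paid API call
    return [float(len(text))]

_cache: dict = {}

def cached_embed(text: str) -> list:
    key = hashlib.sha256(text.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = expensive_embed(text)
    return _cache[key]

cached_embed("same chunk")
cached_embed("same chunk")          # second call served from the cache
print(CALLS)  # 1
```

Hashing the content (rather than keying on document ID) means the cache survives re-chunking as long as the chunk text is byte-identical.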

**LangChain vs LangGraph**

| Aspect | LangChain | LangGraph |
|---|---|---|
| Best for | Quick agents, RAG | Complex workflows |
| Code to start | <10 lines | ~30 lines |
| State management | Limited | Native |
| Branching logic | Basic | Advanced |
| Human-in-loop | Manual | Built-in |