langchain-architecture
LangChain Architecture
Master the LangChain framework for building sophisticated LLM applications with agents, chains, memory, and tool integration.
Do not use this skill when
- The task is unrelated to langchain architecture
- You need a different domain or tool outside this scope
Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open resources/implementation-playbook.md.
Use this skill when
- Building autonomous AI agents with tool access
- Implementing complex multi-step LLM workflows
- Managing conversation memory and state
- Integrating LLMs with external data sources and APIs
- Creating modular, reusable LLM application components
- Implementing document processing pipelines
- Building production-grade LLM applications
Core Concepts
1. Agents
Autonomous systems that use LLMs to decide which actions to take.
Agent Types:
- ReAct: Reasoning + Acting in interleaved manner
- OpenAI Functions: Leverages function calling API
- Structured Chat: Handles multi-input tools
- Conversational: Optimized for chat interfaces
- Self-Ask with Search: Decomposes complex queries (sketched below)
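To make one of these concrete, here is a minimal Self-Ask with Search sketch using the same legacy initialize_agent API as the Quick Start below. It assumes a SerpAPI key is configured in the environment; note this agent type expects exactly one tool, named "Intermediate Answer".
```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import OpenAI
from langchain.utilities import SerpAPIWrapper

llm = OpenAI(temperature=0)
search = SerpAPIWrapper()

# Self-Ask requires a single tool with this exact name
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="Search for answers to factual sub-questions.",
    )
]

agent = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)
agent.run("Who was US president when the Eiffel Tower was completed?")
```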
2. Chains
Sequences of calls to LLMs or other utilities.
Chain Types:
- LLMChain: Basic prompt + LLM combination
- SequentialChain: Multiple chains in sequence
- RouterChain: Routes inputs to specialized chains
- TransformChain: Data transformations between steps (sketched below)
- MapReduceChain: Parallel processing with aggregation
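As one concrete example, a TransformChain wraps an ordinary Python function so data cleanup can sit between LLM steps without an LLM call. A minimal sketch; the variable names are illustrative:
```python
from langchain.chains import TransformChain

def clean_text(inputs: dict) -> dict:
    # Plain-Python preprocessing: no LLM involved
    text = inputs["text"].strip().replace("\n\n", "\n")
    return {"clean_text": text}

transform_chain = TransformChain(
    input_variables=["text"],
    output_variables=["clean_text"],
    transform=clean_text,
)

print(transform_chain({"text": "  some   raw\n\ninput  "})["clean_text"])
```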
3. Memory
Systems for maintaining context across interactions.
Memory Types:
- ConversationBufferMemory: Stores all messages
- ConversationSummaryMemory: Summarizes older messages
- ConversationBufferWindowMemory: Keeps last N messages
- EntityMemory: Tracks information about entities
- VectorStoreMemory: Semantic similarity retrieval
4. Document Processing
Loading, transforming, and storing documents for retrieval.
Components:
- Document Loaders: Load from various sources
- Text Splitters: Chunk documents intelligently
- Vector Stores: Store and retrieve embeddings
- Retrievers: Fetch relevant documents (see the sketch after this list)
- Indexes: Organize documents for efficient access
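A small usage sketch for the retrieval side, assuming a vectorstore built as in Pattern 1 below; the query string is illustrative:
```python
# Wrap the vector store as a retriever and fetch the top-k matches
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
docs = retriever.get_relevant_documents("How do I deploy the service?")
for doc in docs:
    print(doc.page_content[:80])
```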
5. Callbacks
Hooks for logging, monitoring, and debugging.
Use Cases:
- Request/response logging
- Token usage tracking (sketched below)
- Latency monitoring
- Error handling
- Custom metrics collection
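For OpenAI models, token usage tracking needs no custom handler; LangChain ships a context manager for it. A sketch, assuming an agent like the one created in the Quick Start below:
```python
from langchain.callbacks import get_openai_callback

# Everything run inside the context manager is metered
with get_openai_callback() as cb:
    agent.run("What's the weather in SF?")

print(f"Total tokens: {cb.total_tokens}")
print(f"Estimated cost (USD): {cb.total_cost}")
```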
Quick Start
```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory

# Initialize LLM
llm = OpenAI(temperature=0)

# Load tools
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Add memory
memory = ConversationBufferMemory(memory_key="chat_history")

# Create agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True
)

# Run agent
result = agent.run("What's the weather in SF? Then calculate 25 * 4")
```
Architecture Patterns
Pattern 1: RAG with LangChain
```python
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

# Load and process documents
loader = TextLoader('documents.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(texts, embeddings)

# Create retrieval chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
    return_source_documents=True
)

# Query
result = qa_chain({"query": "What is the main topic?"})
```
Pattern 2: Custom Agent with Tools
```python
from langchain.agents import AgentType, initialize_agent
from langchain.tools import tool

@tool
def search_database(query: str) -> str:
    """Search internal database for information."""
    # Your database search logic
    return f"Results for: {query}"

@tool
def send_email(recipient: str, content: str) -> str:
    """Send an email to specified recipient."""
    # Email sending logic
    return f"Email sent to {recipient}"

tools = [search_database, send_email]
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)
```
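A usage sketch; the query and address below are illustrative, and the agent decides which tool to call purely from the docstrings above:
```python
result = agent.run("Look up the Q3 numbers and email a summary to ops@example.com")
print(result)
```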
Pattern 3: Multi-Step Chain
```python
from langchain.chains import LLMChain, SequentialChain
from langchain.prompts import PromptTemplate

# Step 1: Extract key information
extract_prompt = PromptTemplate(
    input_variables=["text"],
    template="Extract key entities from: {text}\n\nEntities:"
)
extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")

# Step 2: Analyze entities
analyze_prompt = PromptTemplate(
    input_variables=["entities"],
    template="Analyze these entities: {entities}\n\nAnalysis:"
)
analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")

# Step 3: Generate summary
summary_prompt = PromptTemplate(
    input_variables=["entities", "analysis"],
    template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:"
)
summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")

# Combine into sequential chain
overall_chain = SequentialChain(
    chains=[extract_chain, analyze_chain, summary_chain],
    input_variables=["text"],
    output_variables=["entities", "analysis", "summary"],
    verbose=True
)
```
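Running the combined chain returns every declared output variable; the input text here is illustrative:
```python
result = overall_chain({"text": "Acme Corp acquired Widget Inc for $2M in March 2023."})
print(result["entities"])
print(result["summary"])
```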
Memory Management Best Practices
Choosing the Right Memory Type
```python
# For short conversations (< 10 messages)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()

# For long conversations (summarize old messages)
from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(llm=llm)

# For sliding window (last N messages)
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=5)

# For entity tracking
from langchain.memory import ConversationEntityMemory
memory = ConversationEntityMemory(llm=llm)

# For semantic retrieval of relevant history
from langchain.memory import VectorStoreRetrieverMemory
memory = VectorStoreRetrieverMemory(retriever=retriever)
```
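A hybrid worth knowing: ConversationSummaryBufferMemory keeps recent messages verbatim and folds older ones into a running summary once a token budget is exceeded. A minimal sketch; tune max_token_limit to your model's context window:
```python
from langchain.memory import ConversationSummaryBufferMemory

# Recent turns stay verbatim; overflow is summarized by the LLM
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=500)
```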
Callback System
Custom Callback Handler
```python
from langchain.callbacks.base import BaseCallbackHandler

class CustomCallbackHandler(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        print(f"LLM started with prompts: {prompts}")

    def on_llm_end(self, response, **kwargs):
        print(f"LLM ended with response: {response}")

    def on_llm_error(self, error, **kwargs):
        print(f"LLM error: {error}")

    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"Chain started with inputs: {inputs}")

    def on_agent_action(self, action, **kwargs):
        print(f"Agent taking action: {action}")

# Use callback
agent.run("query", callbacks=[CustomCallbackHandler()])
```
Testing Strategies
```python
import pytest
from unittest.mock import Mock

def test_agent_tool_selection():
    # Mock LLM to return specific tool selection
    mock_llm = Mock()
    mock_llm.predict.return_value = "Action: search_database\nAction Input: test query"
    agent = initialize_agent(tools, mock_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
    result = agent.run("test query")
    # Verify correct tool was selected
    assert "search_database" in str(mock_llm.predict.call_args)

def test_memory_persistence():
    memory = ConversationBufferMemory()
    memory.save_context({"input": "Hi"}, {"output": "Hello!"})
    assert "Hi" in memory.load_memory_variables({})['history']
    assert "Hello!" in memory.load_memory_variables({})['history']
```
Performance Optimization
1. Caching
```python
from langchain.cache import InMemoryCache
import langchain

langchain.llm_cache = InMemoryCache()
```
2. Batch Processing
```python
# Process multiple documents in parallel
from concurrent.futures import ThreadPoolExecutor
from langchain.document_loaders import DirectoryLoader

loader = DirectoryLoader('./docs')
docs = loader.load()

def process_doc(doc):
    return text_splitter.split_documents([doc])

with ThreadPoolExecutor(max_workers=4) as executor:
    split_docs = list(executor.map(process_doc, docs))
```
3. Streaming Responses
```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])
```
Resources
- references/agents.md: Deep dive on agent architectures
- references/memory.md: Memory system patterns
- references/chains.md: Chain composition strategies
- references/document-processing.md: Document loading and indexing
- references/callbacks.md: Monitoring and observability
- assets/agent-template.py: Production-ready agent template
- assets/memory-config.yaml: Memory configuration examples
- assets/chain-example.py: Complex chain examples
Common Pitfalls
- Memory Overflow: Not managing conversation history length
- Tool Selection Errors: Poor tool descriptions confuse agents
- Context Window Exceeded: Exceeding LLM token limits
- No Error Handling: Not catching and handling agent failures (see the sketch after this list)
- Inefficient Retrieval: Not optimizing vector store queries
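A defensive-execution sketch covering the timeout and error-handling pitfalls; max_iterations, max_execution_time, and handle_parsing_errors are standard initialize_agent options, but the values here are illustrative:
```python
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    max_iterations=5,             # stop runaway reasoning loops
    max_execution_time=30,        # seconds before the agent gives up
    handle_parsing_errors=True,   # recover from malformed LLM output
)

try:
    result = agent.run("query")
except Exception as exc:
    # Surface a safe fallback instead of a stack trace
    result = f"Agent failed: {exc}"
```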
Production Checklist
- Implement proper error handling
- Add request/response logging
- Monitor token usage and costs
- Set timeout limits for agent execution
- Implement rate limiting
- Add input validation
- Test with edge cases
- Set up observability (callbacks)
- Implement fallback strategies
- Version control prompts and configurations