LangChain Architecture

Master the LangChain framework for building sophisticated LLM applications with agents, chains, memory, and tool integration.

Do not use this skill when

  • The task is unrelated to langchain architecture
  • You need a different domain or tool outside this scope

Instructions

  • Clarify goals, constraints, and required inputs.
  • Apply relevant best practices and validate outcomes.
  • Provide actionable steps and verification.
  • If detailed examples are required, open resources/implementation-playbook.md.

Use this skill when

  • Building autonomous AI agents with tool access
  • Implementing complex multi-step LLM workflows
  • Managing conversation memory and state
  • Integrating LLMs with external data sources and APIs
  • Creating modular, reusable LLM application components
  • Implementing document processing pipelines
  • Building production-grade LLM applications

Core Concepts

1. Agents

Autonomous systems that use LLMs to decide which actions to take.
Agent Types:
  • ReAct: Reasoning + Acting in interleaved manner
  • OpenAI Functions: Leverages function calling API
  • Structured Chat: Handles multi-input tools
  • Conversational: Optimized for chat interfaces
  • Self-Ask with Search: Decomposes complex queries
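
The ReAct pattern listed above can be sketched as a plain-Python loop. This is a simplified illustration only, not LangChain's actual agent implementation; the scripted `fake_llm` and the `calc` tool are stand-ins for a real model and real tools:

```python
# Minimal ReAct-style loop: the model alternates Thought -> Action ->
# Observation until it emits a final answer.
def react_loop(llm, tools, question, max_steps=5):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)              # model proposes the next step
        transcript += step + "\n"
        if step.startswith("Final Answer:"):
            return step[len("Final Answer:"):].strip()
        if step.startswith("Action:"):
            name, _, arg = step[len("Action:"):].strip().partition(" ")
            observation = tools[name](arg)  # act, then feed the result back
            transcript += f"Observation: {observation}\n"
    return None

# Scripted "LLM" that first calls a calculator tool, then answers.
script = iter(["Action: calc 25*4", "Final Answer: 100"])
fake_llm = lambda prompt: next(script)
tools = {"calc": lambda expr: str(eval(expr))}  # eval: demo only

print(react_loop(fake_llm, tools, "What is 25 * 4?"))  # -> 100
```

The key property is the interleaving: each tool observation is appended to the transcript before the model is called again, so later reasoning can depend on earlier results.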

2. Chains

Sequences of calls to LLMs or other utilities.
Chain Types:
  • LLMChain: Basic prompt + LLM combination
  • SequentialChain: Multiple chains in sequence
  • RouterChain: Routes inputs to specialized chains
  • TransformChain: Data transformations between steps
  • MapReduceChain: Parallel processing with aggregation
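
The dispatch idea behind RouterChain can be illustrated in plain Python. This is a sketch under simplifying assumptions: the real RouterChain uses an LLM to choose the destination, while here a keyword match stands in, and the lambdas stand in for specialized chains:

```python
# Route each input to the first matching specialized chain,
# falling back to a default chain when nothing matches.
def route(query, chains, default):
    for keyword, chain in chains.items():
        if keyword in query.lower():
            return chain(query)
    return default(query)

chains = {
    "math": lambda q: "math chain handled: " + q,
    "history": lambda q: "history chain handled: " + q,
}
print(route("A math puzzle", chains, lambda q: "general: " + q))
# -> math chain handled: A math puzzle
```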

3. Memory

Systems for maintaining context across interactions.
Memory Types:
  • ConversationBufferMemory: Stores all messages
  • ConversationSummaryMemory: Summarizes older messages
  • ConversationBufferWindowMemory: Keeps last N messages
  • EntityMemory: Tracks information about entities
  • VectorStoreMemory: Semantic similarity retrieval
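
The trade-off behind ConversationBufferWindowMemory — bounded context at the cost of forgetting old turns — can be seen in a stdlib-only sketch (illustrative, not LangChain's implementation):

```python
from collections import deque

# Sliding-window memory: keep only the last k exchanges.
class WindowMemory:
    def __init__(self, k):
        self.turns = deque(maxlen=k)   # oldest turns fall off automatically

    def save(self, user, ai):
        self.turns.append((user, ai))

    def history(self):
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

mem = WindowMemory(k=2)
for i in range(4):
    mem.save(f"msg {i}", f"reply {i}")
print(mem.history())   # only msg 2 and msg 3 remain
```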

4. Document Processing

Loading, transforming, and storing documents for retrieval.
Components:
  • Document Loaders: Load from various sources
  • Text Splitters: Chunk documents intelligently
  • Vector Stores: Store and retrieve embeddings
  • Retrievers: Fetch relevant documents
  • Indexes: Organize documents for efficient access
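
The chunking step performed by text splitters can be sketched in a few lines. This is the basic fixed-size idea behind `chunk_size` and `chunk_overlap` (LangChain's splitters are smarter about separators; this sketch splits on raw character offsets):

```python
# Fixed-size chunking with overlap: the overlap keeps context that
# would otherwise be cut off at a chunk boundary.
def split_text(text, chunk_size, chunk_overlap):
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # -> ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```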

5. Callbacks

Hooks for logging, monitoring, and debugging.
Use Cases:
  • Request/response logging
  • Token usage tracking
  • Latency monitoring
  • Error handling
  • Custom metrics collection
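
The hook mechanism these use cases rely on can be sketched without LangChain. The `call_llm` driver and handler method names below are illustrative stand-ins for the framework's event dispatch:

```python
import time

# Callback-style hooks: the caller fires events, handlers decide
# what to record (here, latency per call).
class LatencyTracker:
    def __init__(self):
        self.start = None
        self.latencies = []

    def on_llm_start(self, prompt):
        self.start = time.perf_counter()

    def on_llm_end(self, response):
        self.latencies.append(time.perf_counter() - self.start)

def call_llm(prompt, handlers):
    for h in handlers:
        h.on_llm_start(prompt)
    response = prompt.upper()          # stand-in for a real model call
    for h in handlers:
        h.on_llm_end(response)
    return response

tracker = LatencyTracker()
call_llm("hello", [tracker])
print(f"latency: {tracker.latencies[0]:.6f}s")
```

Because handlers are passed in rather than hard-coded, logging, metrics, and error handling stay decoupled from the call site.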

Quick Start

python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory

# Initialize LLM
llm = OpenAI(temperature=0)

# Load tools
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# Add memory
memory = ConversationBufferMemory(memory_key="chat_history")

# Create agent
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True
)

# Run agent
result = agent.run("What's the weather in SF? Then calculate 25 * 4")

Architecture Patterns

Pattern 1: RAG with LangChain

python
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

# Load and process documents
loader = TextLoader('documents.txt')
documents = loader.load()

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(texts, embeddings)

# Create retrieval chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
    return_source_documents=True
)

# Query
result = qa_chain({"query": "What is the main topic?"})

Pattern 2: Custom Agent with Tools

python
from langchain.agents import AgentType, initialize_agent
from langchain.tools import tool

@tool
def search_database(query: str) -> str:
    """Search internal database for information."""
    # Your database search logic
    return f"Results for: {query}"

@tool
def send_email(recipient: str, content: str) -> str:
    """Send an email to specified recipient."""
    # Email sending logic
    return f"Email sent to {recipient}"

tools = [search_database, send_email]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

Pattern 3: Multi-Step Chain

python
from langchain.chains import LLMChain, SequentialChain
from langchain.prompts import PromptTemplate

# Step 1: Extract key information
extract_prompt = PromptTemplate(
    input_variables=["text"],
    template="Extract key entities from: {text}\n\nEntities:"
)
extract_chain = LLMChain(llm=llm, prompt=extract_prompt, output_key="entities")

# Step 2: Analyze entities
analyze_prompt = PromptTemplate(
    input_variables=["entities"],
    template="Analyze these entities: {entities}\n\nAnalysis:"
)
analyze_chain = LLMChain(llm=llm, prompt=analyze_prompt, output_key="analysis")

# Step 3: Generate summary
summary_prompt = PromptTemplate(
    input_variables=["entities", "analysis"],
    template="Summarize:\nEntities: {entities}\nAnalysis: {analysis}\n\nSummary:"
)
summary_chain = LLMChain(llm=llm, prompt=summary_prompt, output_key="summary")

# Combine into sequential chain
overall_chain = SequentialChain(
    chains=[extract_chain, analyze_chain, summary_chain],
    input_variables=["text"],
    output_variables=["entities", "analysis", "summary"],
    verbose=True
)

Memory Management Best Practices

Choosing the Right Memory Type

python
# For short conversations (< 10 messages)
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()

# For long conversations (summarize old messages)
from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(llm=llm)

# For sliding window (last N messages)
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=5)

# For entity tracking
from langchain.memory import ConversationEntityMemory
memory = ConversationEntityMemory(llm=llm)

# For semantic retrieval of relevant history
from langchain.memory import VectorStoreRetrieverMemory
memory = VectorStoreRetrieverMemory(retriever=retriever)

Callback System

Custom Callback Handler

python
from langchain.callbacks.base import BaseCallbackHandler

class CustomCallbackHandler(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        print(f"LLM started with prompts: {prompts}")

    def on_llm_end(self, response, **kwargs):
        print(f"LLM ended with response: {response}")

    def on_llm_error(self, error, **kwargs):
        print(f"LLM error: {error}")

    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"Chain started with inputs: {inputs}")

    def on_agent_action(self, action, **kwargs):
        print(f"Agent taking action: {action}")

# Use callback
agent.run("query", callbacks=[CustomCallbackHandler()])

Testing Strategies

python
import pytest
from unittest.mock import Mock

def test_agent_tool_selection():
    # Mock LLM to return specific tool selection
    mock_llm = Mock()
    mock_llm.predict.return_value = "Action: search_database\nAction Input: test query"

    agent = initialize_agent(tools, mock_llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

    result = agent.run("test query")

    # Verify correct tool was selected
    assert "search_database" in str(mock_llm.predict.call_args)

def test_memory_persistence():
    memory = ConversationBufferMemory()

    memory.save_context({"input": "Hi"}, {"output": "Hello!"})

    assert "Hi" in memory.load_memory_variables({})['history']
    assert "Hello!" in memory.load_memory_variables({})['history']

Performance Optimization

1. Caching

python
import langchain
from langchain.cache import InMemoryCache

langchain.llm_cache = InMemoryCache()

2. Batch Processing

python
from concurrent.futures import ThreadPoolExecutor
from langchain.document_loaders import DirectoryLoader

# Process multiple documents in parallel
loader = DirectoryLoader('./docs')
docs = loader.load()

def process_doc(doc):
    return text_splitter.split_documents([doc])

with ThreadPoolExecutor(max_workers=4) as executor:
    split_docs = list(executor.map(process_doc, docs))

3. Streaming Responses

python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()])

Resources

  • references/agents.md: Deep dive on agent architectures
  • references/memory.md: Memory system patterns
  • references/chains.md: Chain composition strategies
  • references/document-processing.md: Document loading and indexing
  • references/callbacks.md: Monitoring and observability
  • assets/agent-template.py: Production-ready agent template
  • assets/memory-config.yaml: Memory configuration examples
  • assets/chain-example.py: Complex chain examples

Common Pitfalls

  1. Memory Overflow: Not managing conversation history length
  2. Tool Selection Errors: Poor tool descriptions confuse agents
  3. Context Window Exceeded: Exceeding LLM token limits
  4. No Error Handling: Not catching and handling agent failures
  5. Inefficient Retrieval: Not optimizing vector store queries
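
Pitfalls 1 and 3 share one mitigation: bound the history before it reaches the model. A stdlib-only sketch (the 4-characters-per-token estimate is a rough assumption for illustration; a real tokenizer should be used in production):

```python
# Trim the oldest messages until the history fits a token budget.
def trim_history(messages, max_tokens):
    def tokens(msg):
        return len(msg) // 4 + 1       # rough chars-to-tokens estimate
    trimmed = list(messages)
    while trimmed and sum(tokens(m) for m in trimmed) > max_tokens:
        trimmed.pop(0)                 # drop the oldest message first
    return trimmed

history = ["a" * 40, "b" * 40, "c" * 40]     # ~11 "tokens" each
print(trim_history(history, max_tokens=25))  # oldest message dropped
```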

Production Checklist

  • Implement proper error handling
  • Add request/response logging
  • Monitor token usage and costs
  • Set timeout limits for agent execution
  • Implement rate limiting
  • Add input validation
  • Test with edge cases
  • Set up observability (callbacks)
  • Implement fallback strategies
  • Version control prompts and configurations
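
The timeout and fallback items above can be combined in one wrapper. This is a sketch; `agent_fn` stands in for your agent's run method, and the fallback here is a simple canned reply:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Run the agent in a worker thread; fall back on timeout or error.
def run_with_fallback(agent_fn, query, timeout_s, fallback):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(agent_fn, query)
        try:
            return future.result(timeout=timeout_s)
        except TimeoutError:
            return fallback("timed out")
        except Exception as exc:
            return fallback(str(exc))

def slow_agent(query):
    time.sleep(0.5)                    # simulate a slow agent run
    return "done"

result = run_with_fallback(slow_agent, "q", timeout_s=0.05,
                           fallback=lambda reason: f"fallback: {reason}")
print(result)  # -> fallback: timed out
```

Note that the executor still waits for the worker thread to finish on exit; for hard cancellation of a runaway agent, process-level isolation is needed.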