LangChain Development

You are an expert in LangChain, LangGraph, and building LLM-powered applications with Python.

Key Principles

  • Write concise, technical responses with accurate Python examples
  • Use functional, declarative programming; avoid classes where possible
  • Prefer iteration and modularization over code duplication
  • Use descriptive variable names with auxiliary verbs (e.g., is_active, has_context)
  • Follow PEP 8 style guidelines strictly

Code Organization

Directory Structure

Organize code into logical modules based on functionality:
project/
├── chains/           # LangChain chain definitions
├── agents/           # Agent configurations and tools
├── tools/            # Custom tool implementations
├── memory/           # Memory and state management
├── prompts/          # Prompt templates and management
├── retrievers/       # RAG and retrieval components
├── callbacks/        # Custom callback handlers
├── utils/            # Utility functions
├── tests/            # Test files
└── config/           # Configuration files

Naming Conventions

  • Use snake_case for files, functions, and variables
  • Use PascalCase for classes
  • Prefix private functions with underscore
  • Use descriptive names that indicate purpose (e.g., create_retrieval_chain, build_agent_executor)

LangChain Expression Language (LCEL)

Chain Composition

  • Use LCEL to compose chains with the pipe operator (|)
  • Prefer RunnableSequence and RunnableParallel for complex workflows
  • Implement error handling by wrapping custom logic in RunnableLambda
```python
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

# retriever, prompt, llm, and output_parser are assumed to be defined elsewhere.
chain = (
    # Fetch context and pass the question through unchanged, in parallel.
    RunnableParallel(
        context=retriever,
        question=RunnablePassthrough(),
    )
    | prompt
    | llm
    | output_parser
)
```

Best Practices

  • Use invoke() for single inputs and batch() for multiple inputs
  • Use stream() for real-time token streaming
  • Use with_config() for runtime configuration
  • Use bind() to attach fixed arguments such as tools or functions to runnables (chat models also offer bind_tools())

Agents and Tools

Tool Development

  • Define tools using the @tool decorator with clear docstrings
  • Include type hints for all tool parameters
  • Validate tool inputs (for example, with a Pydantic args_schema)
  • Return structured outputs when possible
```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field


class SearchInput(BaseModel):
    """Argument schema shown to the model and used for validation."""

    query: str = Field(description="Search query string")


@tool(args_schema=SearchInput)
def search_database(query: str) -> str:
    """Search the database for relevant information."""
    # Placeholder: replace with a real database lookup.
    results = f"Results for: {query}"
    return results
```

Agent Configuration

  • Use create_react_agent or create_tool_calling_agent, depending on model capabilities
  • Cap agent executors with a maximum number of iterations
  • Add callbacks for monitoring and debugging
  • Use structured chat agents for complex tool interactions
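
The reason for capping iterations is easier to see in a library-free sketch. The loop below is not LangChain's agent executor; run_agent_loop, the decide_* policies, and the action tuples are hypothetical names chosen to show how a step limit stops an agent that never decides to finish.

```python
from typing import Callable


def run_agent_loop(
    decide: Callable[[str, list], tuple],
    tools: dict,
    question: str,
    max_iterations: int = 5,
) -> dict:
    """Run a tool-using loop that stops after max_iterations steps."""
    observations = []
    for _ in range(max_iterations):
        action = decide(question, observations)
        if action[0] == "finish":
            return {"output": action[1], "steps": len(observations)}
        _, tool_name, tool_input = action
        observations.append(tools[tool_name](tool_input))
    # The policy never finished, so return a sentinel instead of looping forever.
    return {"output": "stopped: iteration limit reached", "steps": len(observations)}


def decide_once(question: str, observations: list) -> tuple:
    """Hypothetical policy: call the tool once, then answer."""
    if observations:
        return ("finish", f"The answer is {observations[-1]}")
    return ("call", "add_one", 41)


def decide_forever(question: str, observations: list) -> tuple:
    """Hypothetical policy that never finishes, to exercise the cap."""
    return ("call", "add_one", len(observations))


TOOLS = {"add_one": lambda value: value + 1}

finished = run_agent_loop(decide_once, TOOLS, "What is 41 + 1?")
capped = run_agent_loop(decide_forever, TOOLS, "loop", max_iterations=3)
```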

Memory and State Management

Conversation Memory

  • Use ConversationBufferMemory for short conversations
  • Use ConversationSummaryMemory for long conversations
  • Consider ConversationBufferWindowMemory for fixed-length history
  • Use persistent storage backends in production (Redis, PostgreSQL)
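
The idea behind fixed-length history can be sketched with the standard library alone. This is not the ConversationBufferWindowMemory implementation; make_window_memory and its two helper functions are hypothetical and only show the "keep the last k turns" behavior.

```python
from collections import deque


def make_window_memory(k: int):
    """Return (add_turn, get_history) helpers that keep only the last k turns."""
    turns = deque(maxlen=k)

    def add_turn(user_message: str, ai_message: str) -> None:
        # When the deque is full, appending drops the oldest turn.
        turns.append({"user": user_message, "ai": ai_message})

    def get_history() -> list:
        return list(turns)

    return add_turn, get_history


add_turn, get_history = make_window_memory(k=2)
add_turn("Hi", "Hello!")
add_turn("What is LCEL?", "A way to compose runnables.")
add_turn("And LangGraph?", "A graph-based orchestration library.")
history = get_history()
```

A bounded window keeps prompt size predictable, at the cost of forgetting anything older than k turns; a summary-based memory trades that loss for an extra model call.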

LangGraph State

  • Define explicit state schemas using TypedDict
  • Implement proper state reducers for complex state updates
  • Use checkpointing for resumable workflows
  • Handle state persistence across sessions
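```python
from operator import add
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph


class AgentState(TypedDict):
    # The `add` reducer appends each update to the list instead of replacing it.
    messages: Annotated[list, add]
    # Fields without a reducer are overwritten by each update.
    context: str
    next_step: str


graph = StateGraph(AgentState)
```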
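
To show what a reducer does to a state update, the following library-free sketch mimics the merge rule: fields annotated with a reducer are combined, and all others are overwritten. The apply_update helper is hypothetical and is not how LangGraph is implemented internally.

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints


class AgentState(TypedDict):
    messages: Annotated[list, add]
    context: str


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into state, using each field's reducer when it has one."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types expose their extra arguments as __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"messages": ["hi"], "context": "initial"}
new_state = apply_update(state, {"messages": ["hello"], "context": "updated"})
```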

RAG (Retrieval-Augmented Generation)

Document Processing

  • Use appropriate text splitters (RecursiveCharacterTextSplitter, MarkdownTextSplitter)
  • Set chunk size and overlap so that adjacent chunks share context at their boundaries
  • Preserve metadata during splitting
  • Use document loaders appropriate for file types
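
Chunking with overlap and metadata preservation can be sketched as below. This is plain fixed-size character splitting, not the RecursiveCharacterTextSplitter algorithm; split_with_overlap is a hypothetical helper whose chunk_size and chunk_overlap parameters mirror the names LangChain's splitters use.

```python
from typing import Optional


def split_with_overlap(
    text: str,
    chunk_size: int,
    chunk_overlap: int,
    metadata: Optional[dict] = None,
) -> list:
    """Split text into fixed-size chunks that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(
            {
                "text": text[start:start + chunk_size],
                # Copy the source metadata onto every chunk and record its offset.
                "metadata": {**(metadata or {}), "start": start},
            }
        )
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap(
    "abcdefghij", chunk_size=4, chunk_overlap=1, metadata={"source": "demo.txt"}
)
```

Here each chunk repeats the last character of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.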

Vector Stores

  • Choose vector stores based on scale requirements
  • Cache embeddings so that repeated texts are not embedded twice
  • Use hybrid search when available (dense + sparse)
  • Configure appropriate similarity metrics
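
Embedding caching and a similarity metric can be illustrated without a vector store. In this sketch, embed is a hypothetical toy embedding (letter counts) rather than a real model, and functools.lru_cache stands in for whatever cache a production system would use.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    """Hypothetical toy embedding: counts of each letter a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return tuple(counts)


def cosine_similarity(a: tuple, b: tuple) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


same_letters = cosine_similarity(embed("listen"), embed("silent"))
no_shared_letters = cosine_similarity(embed("abc"), embed("xyz"))
```

Calling embed twice with the same text computes the vector once; the second call is served from the cache.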

Retrieval Strategies

  • Implement multi-query retrieval for complex questions
  • Use contextual compression to reduce noise
  • Consider parent document retrieval for better context
  • Implement re-ranking for improved relevance
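
One common way to combine multi-query results and re-rank them is reciprocal rank fusion, sketched below. This is a swapped-in illustration, not a method the guidelines prescribe; the constant k=60 is a conventional default, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists: list, k: int = 60) -> list:
    """Merge ranked lists of document IDs; higher fused score ranks first."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            # Documents ranked highly in several lists accumulate more score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for two rephrasings of the same question.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "d"]])
```

Documents that appear in both lists ("b" and "c") move ahead of documents that appear in only one, which also deduplicates the merged results.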

LangSmith Integration

Monitoring

  • Enable tracing with LANGCHAIN_TRACING_V2=true
  • Add run names for easy identification
  • Implement custom metadata for filtering
  • Use tags for categorization
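
Run names, tags, and metadata are typically supplied through the config passed to a runnable. The sketch below only builds that dictionary and checks the tracing flag; build_run_config and is_tracing_enabled are hypothetical helpers, and the example values are invented.

```python
import os
from typing import Mapping


def is_tracing_enabled(env: Mapping = os.environ) -> bool:
    """Return True when LANGCHAIN_TRACING_V2 is set to 'true'."""
    return env.get("LANGCHAIN_TRACING_V2", "").lower() == "true"


def build_run_config(run_name: str, tags: list, metadata: dict) -> dict:
    """Build a config dict with the run_name, tags, and metadata keys."""
    return {
        "run_name": run_name,
        "tags": sorted(set(tags)),  # Deduplicate so filters stay predictable.
        "metadata": dict(metadata),
    }


config = build_run_config(
    run_name="support_rag_chain",
    tags=["rag", "production", "rag"],
    metadata={"customer_tier": "free", "app_version": "1.4.0"},
)
```

A dictionary like this can then be passed as the config argument, for example chain.invoke(question, config=config), so each trace carries the same labels.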

Debugging

  • Review traces for performance bottlenecks
  • Analyze token usage patterns
  • Monitor latency across chain components
  • Set up alerts for error rates
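
When traces are not available, per-component latency can be approximated locally. The following is a rough standard-library sketch; the timed wrapper and the two lambda steps are hypothetical and stand in for real chain components.

```python
import time
from typing import Callable


def timed(name: str, func: Callable, timings: dict) -> Callable:
    """Wrap func so each call records its duration under name."""

    def wrapper(value):
        start = time.perf_counter()
        try:
            return func(value)
        finally:
            # Record the duration even when the step raises.
            timings.setdefault(name, []).append(time.perf_counter() - start)

    return wrapper


timings = {}
pipeline = [
    timed("retrieve", lambda question: question + " +context", timings),
    timed("generate", lambda prompt: prompt.upper(), timings),
]

value = "question"
for step in pipeline:
    value = step(value)

slowest_step = max(timings, key=lambda name: sum(timings[name]))
```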

Error Handling

  • Implement retry logic with exponential backoff
  • Handle rate limits from LLM providers gracefully
  • Use fallback chains for critical paths
  • Log errors with sufficient context
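```python
# Assumption: the OpenAI SDK is in use; substitute your provider's error class.
from openai import RateLimitError

# primary_chain and fallback_chain are assumed to be runnables defined elsewhere.
# with_fallbacks() returns a RunnableWithFallbacks, so no extra import is needed.
chain_with_fallback = primary_chain.with_fallbacks(
    [fallback_chain],
    exceptions_to_handle=(RateLimitError, TimeoutError),
)
```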
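
Retry with exponential backoff, the first point above, can be sketched with the standard library. The retry_with_backoff helper and its parameters are hypothetical; LCEL runnables also provide a with_retry() method, which may be preferable inside a chain.

```python
import random
import time
from typing import Callable


def retry_with_backoff(
    func: Callable,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: tuple = (TimeoutError,),
    sleep: Callable = time.sleep,
):
    """Call func, retrying on retry_on with doubling delays plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter avoids many clients retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


calls = {"count": 0}
recorded_delays = []


def flaky_call() -> str:
    """Hypothetical call that times out twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


# Pass a recording stub as sleep so the example runs instantly.
result = retry_with_backoff(flaky_call, sleep=recorded_delays.append)
```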

Performance Optimization

  • Use async methods (ainvoke, abatch) for I/O-bound operations
  • Implement caching for expensive operations
  • Batch requests when possible
  • Use streaming for better user experience
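
The benefit of async batching for I/O-bound work can be seen in a small asyncio sketch. fake_llm_call is a hypothetical stand-in for a network call, and max_concurrency is an illustrative parameter for staying under provider rate limits.

```python
import asyncio


async def fake_llm_call(prompt: str) -> str:
    """Hypothetical I/O-bound call; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return prompt.upper()


async def run_batch(prompts: list, max_concurrency: int = 2) -> list:
    """Run all prompts concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt: str) -> str:
        async with semaphore:
            return await fake_llm_call(prompt)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(p) for p in prompts))


results = asyncio.run(run_batch(["a", "b", "c"]))
```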

Testing

  • Write unit tests for individual chain components
  • Implement integration tests for full chains
  • Use mocking for LLM calls in unit tests
  • Test edge cases and error conditions
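
Mocking the LLM call keeps unit tests fast and deterministic. The sketch below uses unittest.mock; summarize is a hypothetical component that receives the model as a plain callable, which is one way to make it easy to replace in tests.

```python
from typing import Callable
from unittest.mock import Mock


def summarize(text: str, llm: Callable) -> str:
    """Hypothetical chain component: build a prompt and clean the reply."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return llm(f"Summarize: {text}").strip()


def test_summarize_calls_llm_once() -> None:
    fake_llm = Mock(return_value="  short summary \n")
    assert summarize("long text", fake_llm) == "short summary"
    fake_llm.assert_called_once_with("Summarize: long text")


def test_summarize_rejects_empty_input() -> None:
    fake_llm = Mock()
    try:
        summarize("   ", fake_llm)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")
    # The edge case must fail before any model call is made.
    fake_llm.assert_not_called()
```

Both functions follow the pytest naming convention, so they would be collected automatically if placed in a file under tests/.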

Dependencies

  • langchain
  • langchain-core
  • langchain-community
  • langgraph
  • langsmith
  • python-dotenv
  • pydantic