python-best-practices

Python Best Practices

Purpose

This skill provides guidance on Python development best practices to ensure code quality, maintainability, and consistency across your Python projects.

When to Use This Skill

Auto-activates when:
  • Working with Python files (*.py)
  • The conversation mentions "python", "best practices", or "style guide"
  • Adding type hints or docstrings
  • Refactoring Python code

Style Guidelines

PEP 8 Compliance

Follow the PEP 8 style guide for Python code:
  • Indentation: 4 spaces per indentation level
  • Line Length: Maximum 79 characters for code, 72 for docstrings/comments
  • Blank Lines: 2 blank lines between top-level definitions, 1 between methods
  • Imports: Always at the top of the file (after the module docstring), grouped (stdlib, third-party, local)
  • Naming Conventions:
    • snake_case for functions, variables, modules
    • PascalCase for classes
    • UPPER_SNAKE_CASE for constants
    • Leading underscore (_name) for internal/private names
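
As a quick illustration of the naming conventions above, here is a minimal sketch. Every name in it (`MAX_CONNECTIONS`, `ConnectionPool`, `_in_use`, `acquire_slot`) is hypothetical and chosen only to show the casing rules.

```python
MAX_CONNECTIONS = 10  # UPPER_SNAKE_CASE: module-level constant


class ConnectionPool:  # PascalCase: class name
    """Track how many connection slots are in use (hypothetical example)."""

    def __init__(self) -> None:
        self._in_use = 0  # leading underscore: internal attribute

    def acquire_slot(self) -> bool:  # snake_case: method name
        """Reserve one slot; return False when the pool is full."""
        if self._in_use >= MAX_CONNECTIONS:
            return False
        self._in_use += 1
        return True


connection_pool = ConnectionPool()  # snake_case: variable name
first_slot_acquired = connection_pool.acquire_slot()
```

Note the two blank lines around the top-level class and the single blank line between its methods, matching the blank-line rule in the list.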

Import Organization

Always organize imports in this order:

```python
# 1. Standard library imports
import os
import sys
from pathlib import Path

# 2. Third-party imports
import requests
import numpy as np

# 3. Local application imports
from myapp.core import MyClass
from myapp.utils import helper_function
```

When an import is needed only for type hints, guard it with `TYPE_CHECKING` to avoid circular imports:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers, so no import happens at runtime
    from myapp.other_module import OtherClass


def my_function(obj: "OtherClass") -> None:
    """Function that uses OtherClass only for type hints."""
    pass
```

Type Hints

Always Use Type Hints

Type hints improve code clarity and let static type checkers catch errors early:

```python
def process_data(
    items: list[str],
    max_count: int | None = None,
    verbose: bool = False
) -> dict[str, int]:
    """Process items and return counts.

    Parameters
    ----------
    items : list[str]
        List of items to process
    max_count : int | None, optional
        Maximum items to process, by default None
    verbose : bool, optional
        Enable verbose output, by default False

    Returns
    -------
    dict[str, int]
        Dictionary mapping items to counts
    """
    result: dict[str, int] = {}

    # Slicing with None keeps every item, so no special case is needed
    for item in items[:max_count]:
        result[item] = result.get(item, 0) + 1
        if verbose:
            print(f"Processed: {item}")

    return result
```

Modern Type Syntax (Python 3.10+)

Use the modern union syntax with `|` instead of `Union` or `Optional`:

```python
# Good (Python 3.10+)
def get_value(key: str) -> int | None:
    pass


# Avoid (old style)
from typing import Optional

def get_value(key: str) -> Optional[int]:
    pass
```

Generic Types

Use built-in generic types (Python 3.9+):

```python
# Good (Python 3.9+)
def process_list(items: list[str]) -> dict[str, int]:
    pass


# Avoid (old style)
from typing import List, Dict

def process_list(items: List[str]) -> Dict[str, int]:
    pass
```

Docstrings

NumPy Style Docstrings

Use NumPy-style docstrings for consistency:

```python
import statistics


def calculate_statistics(
    data: list[float],
    include_median: bool = True
) -> dict[str, float]:
    """Calculate statistical measures for a dataset.

    This function computes mean, standard deviation, and optionally
    median for the provided dataset.

    Parameters
    ----------
    data : list[float]
        List of numerical values to analyze
    include_median : bool, optional
        Whether to calculate median, by default True

    Returns
    -------
    dict[str, float]
        Dictionary containing:
        - 'mean': arithmetic mean
        - 'std': standard deviation
        - 'median': median value (if include_median=True)

    Raises
    ------
    ValueError
        If data is empty or has fewer than two values

    Examples
    --------
    >>> calculate_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
    {'mean': 3.0, 'std': 1.581, 'median': 3.0}

    Notes
    -----
    Standard deviation uses Bessel's correction (ddof=1).
    """
    if not data:
        raise ValueError("Data cannot be empty")

    # One possible implementation using the standard library; values are
    # rounded to 3 decimals so the output matches the example above.
    result = {
        "mean": round(statistics.mean(data), 3),
        # statistics.stdev is the sample standard deviation (ddof=1); it
        # raises StatisticsError, a ValueError subclass, for a single value
        "std": round(statistics.stdev(data), 3),
    }
    if include_median:
        result["median"] = round(statistics.median(data), 3)
    return result
```

Class Docstrings

```python
from pathlib import Path
from typing import Any


class DataProcessor:
    """Process and transform data from various sources.

    This class provides methods for loading, transforming, and
    validating data from multiple input formats.

    Parameters
    ----------
    source_dir : Path
        Directory containing source data files
    cache_enabled : bool, optional
        Enable result caching, by default True

    Attributes
    ----------
    source_dir : Path
        Directory path for source files
    cache : dict[str, Any] | None
        Cache for processed results, or None when caching is disabled

    Examples
    --------
    >>> processor = DataProcessor(Path("/data"))
    >>> results = processor.process_files()
    """

    def __init__(self, source_dir: Path, cache_enabled: bool = True) -> None:
        """Initialize the DataProcessor."""
        self.source_dir = source_dir
        # The annotation includes None because the cache can be disabled
        self.cache: dict[str, Any] | None = {} if cache_enabled else None
```

Error Handling

Specific Exception Types

Use specific exception types, not a bare `except`:

```python
import logging

logger = logging.getLogger(__name__)

# file_path is assumed to be defined earlier

# Good
try:
    with open(file_path) as f:
        data = f.read()
except FileNotFoundError:
    logger.error(f"File not found: {file_path}")
    raise
except PermissionError:
    logger.error(f"Permission denied: {file_path}")
    raise

# Avoid
try:
    with open(file_path) as f:
        data = f.read()
except:  # Too broad!
    pass
```

Context Managers

Always use context managers for resources:

```python
# Good
with open(file_path) as f:
    content = f.read()

# Avoid
f = open(file_path)
content = f.read()
f.close()  # Easy to forget, and skipped entirely if read() raises
```
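
The same pattern extends to resources you define yourself. The sketch below uses `contextlib.contextmanager` from the standard library. `working_directory` is a hypothetical helper written for this example, not part of any library.

```python
import os
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(path: Path) -> Iterator[Path]:
    """Temporarily switch the current working directory (hypothetical helper)."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # Runs even if the body of the with-block raises
        os.chdir(previous)


start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(Path(tmp)):
        inside = Path.cwd()
restored = Path.cwd()
```

The `try`/`finally` around `yield` is what guarantees cleanup, which is the same guarantee `with open(...)` gives for closing a file.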

Custom Exceptions

Define custom exceptions for domain-specific errors:

```python
class ValidationError(Exception):
    """Raised when data validation fails."""
    pass


class DataProcessingError(Exception):
    """Raised when data processing encounters an error."""

    def __init__(self, message: str, item_id: str):
        super().__init__(message)
        self.item_id = item_id
```
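
A minimal sketch of how such an exception might be raised and handled. `process_item` and the sample items are hypothetical. The exception class is repeated so the example runs on its own.

```python
class DataProcessingError(Exception):
    """Raised when data processing encounters an error."""

    def __init__(self, message: str, item_id: str):
        super().__init__(message)
        self.item_id = item_id


def process_item(item_id: str, raw_value: str) -> int:
    """Parse raw_value as an integer (hypothetical helper)."""
    try:
        return int(raw_value)
    except ValueError as exc:
        # "from exc" keeps the original error attached as __cause__
        raise DataProcessingError(
            f"Cannot parse value {raw_value!r}", item_id
        ) from exc


failed_ids: list[str] = []
for item_id, raw_value in [("a1", "42"), ("b2", "oops")]:
    try:
        process_item(item_id, raw_value)
    except DataProcessingError as err:
        # The extra attribute tells the caller which item failed
        failed_ids.append(err.item_id)
```

Carrying `item_id` on the exception lets callers react to the specific failing item without parsing the error message.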

Common Patterns

Dataclasses for Data Structures

Use `dataclasses` for simple data containers:

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """User profile information."""

    username: str
    email: str
    age: int
    tags: list[str] = field(default_factory=list)
    is_active: bool = True

    def __post_init__(self) -> None:
        """Validate fields after initialization."""
        if self.age < 0:
            raise ValueError("Age cannot be negative")
```
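
A short usage sketch for the dataclass above, assuming the same `User` definition (repeated here so the example is self-contained). The sample users are made up for illustration.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class User:
    """User profile information."""

    username: str
    email: str
    age: int
    tags: list[str] = field(default_factory=list)
    is_active: bool = True

    def __post_init__(self) -> None:
        if self.age < 0:
            raise ValueError("Age cannot be negative")


alice = User("alice", "alice@example.com", 30)
bob = User("bob", "bob@example.com", 25)

# default_factory gives each instance its own list
alice.tags.append("admin")

# The generated __eq__ compares instances field by field
same_as_alice = User("alice", "alice@example.com", 30, tags=["admin"])

# asdict() converts the instance to a plain dictionary
alice_as_dict = asdict(alice)
```

Using `field(default_factory=list)` rather than a literal `[]` default matters here: dataclasses reject mutable defaults such as `[]` precisely because they would be shared between instances.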

Enums for Fixed Sets

Use `Enum` for fixed sets of values:

```python
from enum import Enum, auto


class Status(Enum):
    """Processing status values."""

    PENDING = auto()
    PROCESSING = auto()
    COMPLETED = auto()
    FAILED = auto()


# Usage
current_status = Status.PENDING
if current_status == Status.COMPLETED:
    print("Done!")
```
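
Enums also support lookup by name and iteration, which helps when a status arrives as text (for example from a config file). A minimal sketch, where `parse_status` is a hypothetical helper:

```python
from enum import Enum, auto


class Status(Enum):
    """Processing status values."""

    PENDING = auto()
    PROCESSING = auto()
    COMPLETED = auto()
    FAILED = auto()


def parse_status(name: str) -> Status:
    """Convert a string such as "pending" to a Status member (hypothetical helper)."""
    try:
        return Status[name.upper()]  # lookup by member name
    except KeyError:
        raise ValueError(f"Unknown status: {name!r}") from None


# Iterating over an Enum yields members in definition order
all_names = [status.name for status in Status]
```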

Pathlib for File Operations

Use `pathlib.Path` instead of `os.path`:

```python
from pathlib import Path

# Good
data_dir = Path("/data")
file_path = data_dir / "input.txt"

if file_path.exists():
    content = file_path.read_text()

# Avoid
import os

data_dir = "/data"
file_path = os.path.join(data_dir, "input.txt")

if os.path.exists(file_path):
    with open(file_path) as f:
        content = f.read()
```
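
A slightly fuller sketch of common `pathlib` operations. It works inside a temporary directory so it can run anywhere; the file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    (data_dir / "input.txt").write_text("hello\n", encoding="utf-8")
    (data_dir / "notes.md").write_text("# notes\n", encoding="utf-8")

    # glob() yields Path objects; sort the names for a stable order
    text_files = sorted(p.name for p in data_dir.glob("*.txt"))

    # Name components are available as attributes
    report = data_dir / "input.txt"
    stem, suffix = report.stem, report.suffix
    content = report.read_text(encoding="utf-8")
```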

List Comprehensions

Use comprehensions for simple transformations and filters; they are concise and usually faster than an equivalent loop that calls `append()`:

```python
# Good
squared = [x**2 for x in range(10) if x % 2 == 0]

# Avoid
squared = []
for x in range(10):
    if x % 2 == 0:
        squared.append(x**2)
```
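
The same syntax covers dictionaries, sets, and generator expressions. A minimal sketch with made-up sample data:

```python
words = ["apple", "banana", "cherry", "avocado"]

# Dict comprehension: map each word to its length
lengths = {word: len(word) for word in words}

# Set comprehension: collect distinct first letters
first_letters = {word[0] for word in words}

# Generator expression: sum without building an intermediate list
total_chars = sum(len(word) for word in words)
```

When the logic needs several statements or nested conditions, a regular loop is usually easier to read than a long comprehension.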

Code Organization

Module Structure

Organize modules with clear sections:

```python
"""Module for data processing utilities.

This module provides functions for loading, transforming, and
validating data from various sources.
"""

# Standard library imports
import os
import sys
from pathlib import Path

# Third-party imports
import requests
import pandas as pd

# Local imports
from myapp.core import BaseProcessor
from myapp.utils import validate_input

# Constants
MAX_RETRIES = 3
DEFAULT_TIMEOUT = 30


# Exceptions
class ProcessingError(Exception):
    """Raised when processing fails."""
    pass


# Functions
def load_data(source: str) -> pd.DataFrame:
    """Load data from source."""
    pass


def main() -> None:
    """CLI entry point (placeholder so the guard below has a target)."""
    pass


# Classes
class DataProcessor(BaseProcessor):
    """Process and validate data."""
    pass


# Module initialization
if __name__ == "__main__":
    # CLI entry point
    main()
```

Avoid Magic Numbers

Use named constants instead of magic numbers:

```python
import requests

# Good
MAX_RETRIES = 3
TIMEOUT_SECONDS = 30


def fetch_data(url: str) -> dict:
    for attempt in range(MAX_RETRIES):
        response = requests.get(url, timeout=TIMEOUT_SECONDS)
        if response.status_code == 200:
            return response.json()
    # Fail loudly instead of silently returning None after the last attempt
    raise RuntimeError(f"Failed to fetch {url} after {MAX_RETRIES} attempts")


# Avoid
def fetch_data(url: str) -> dict:
    for attempt in range(3):  # What is 3?
        response = requests.get(url, timeout=30)  # Why 30?
        if response.status_code == 200:
            return response.json()
```

Testing

Use pytest for Testing

```python
import pytest
from myapp.processor import DataProcessor


def test_process_valid_data():
    """Test processing with valid input."""
    processor = DataProcessor()
    result = processor.process([1, 2, 3])
    assert result == [2, 4, 6]


def test_process_empty_data():
    """Test processing with empty input."""
    processor = DataProcessor()
    with pytest.raises(ValueError):
        processor.process([])


@pytest.fixture
def sample_data():
    """Provide sample data for tests."""
    return [1, 2, 3, 4, 5]


def test_with_fixture(sample_data):
    """Test using fixture."""
    processor = DataProcessor()
    result = processor.process(sample_data)
    assert len(result) == len(sample_data)
```

Key Takeaways

  1. Follow PEP 8 style guidelines consistently
  2. Always use type hints for function signatures
  3. Write NumPy-style docstrings for all public functions/classes
  4. Use specific exception types, not a bare `except`
  5. Prefer `pathlib.Path` over `os.path`
  6. Use dataclasses and enums for structured data
  7. Organize imports: stdlib → third-party → local
  8. Avoid magic numbers, use named constants
  9. Write tests using pytest
  10. Use modern Python syntax (built-in generics in 3.9+, `|` unions in 3.10+)