numpy-best-practices

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

NumPy Best Practices

NumPy最佳实践

Expert guidelines for NumPy development, focusing on array programming, numerical computing, and performance optimization.

NumPy开发的专家指南，重点关注数组编程、数值计算和性能优化。

Code Style and Structure

代码风格与结构

Write concise, technical Python code with accurate NumPy examples
Prefer vectorized operations over explicit loops for performance
Use descriptive variable names reflecting data content (e.g.,
```
weights
```
,
```
gradients
```
,
```
input_array
```
)
Follow PEP 8 style guidelines for Python code
Use functional programming patterns when appropriate

编写简洁、专业的Python代码，并附带准确的NumPy示例
为提升性能，优先使用向量化操作而非显式循环
使用能反映数据内容的描述性变量名（例如
```
weights
```
、
```
gradients
```
、
```
input_array
```
）
遵循Python的PEP 8风格指南
酌情使用函数式编程模式

Array Creation and Manipulation

数组创建与操作

Use appropriate array creation functions:

np.array()

np.zeros()

np.ones()

np.empty()

np.arange()

np.linspace()

Prefer
```
np.zeros()
```
or
```
np.empty()
```
for pre-allocation when array size is known
Use
```
np.concatenate()
```
,
```
np.vstack()
```
,
```
np.hstack()
```
for combining arrays
Leverage broadcasting for operations on arrays with different shapes

使用合适的数组创建函数：

np.array()

、

np.zeros()

、

np.ones()

、

np.empty()

、

np.arange()

、

np.linspace()

当已知数组大小时，优先使用
```
np.zeros()
```
或
```
np.empty()
```
进行预分配
使用
```
np.concatenate()
```
、
```
np.vstack()
```
、
```
np.hstack()
```
来合并数组
利用广播机制处理不同形状数组间的操作

Indexing and Slicing

索引与切片

Use advanced indexing with boolean arrays for conditional selection
Prefer views over copies when possible to save memory
Use
```
np.where()
```
for conditional element selection
Understand the difference between fancy indexing (creates copy) and basic slicing (creates view)

使用布尔数组进行高级索引以实现条件筛选
尽可能优先使用视图而非副本，以节省内存
使用
```
np.where()
```
进行条件元素筛选
理解花式索引（会创建副本）与基础切片（会创建视图）之间的区别

Data Types

数据类型

Specify appropriate data types explicitly using
```
dtype
```
parameter
Use
```
np.float32
```
for memory-efficient computations when full precision is not needed
Be aware of integer overflow with fixed-size integer types
Use
```
np.asarray()
```
for type conversion without unnecessary copies

使用
```
dtype
```
参数显式指定合适的数据类型
当不需要全精度时，使用
```
np.float32
```
进行内存高效的计算
注意固定大小整数类型的整数溢出问题
使用
```
np.asarray()
```
进行类型转换，避免不必要的副本

Performance Optimization

性能优化

Vectorization

向量化

Always prefer vectorized operations over Python loops
Use NumPy universal functions (ufuncs) for element-wise operations
Leverage
```
np.einsum()
```
for complex tensor operations
Use
```
np.dot()
```
or
```
@
```
operator for matrix multiplication

始终优先使用向量化操作而非Python循环
使用NumPy通用函数（ufuncs）进行逐元素操作
利用
```
np.einsum()
```
处理复杂张量操作
使用
```
np.dot()
```
或
```
@
```
运算符进行矩阵乘法

Memory Management

内存管理

Use
```
np.ndarray.flags
```
to check memory layout (C-contiguous vs Fortran-contiguous)
Prefer in-place operations with
```
out
```
parameter when possible
Use memory-mapped arrays (
```
np.memmap
```
) for large datasets
Be mindful of array copies vs views

使用
```
np.ndarray.flags
```
检查内存布局（C连续 vs Fortran连续）
尽可能使用带有
```
out
```
参数的原地操作
对大型数据集使用内存映射数组（
```
np.memmap
```
）
注意区分数组副本与视图

Computation Efficiency

计算效率

Use
```
np.sum()
```
,
```
np.mean()
```
,
```
np.std()
```
with
```
axis
```
parameter for aggregations
Leverage
```
np.cumsum()
```
,
```
np.cumprod()
```
for cumulative operations
Use
```
np.searchsorted()
```
for efficient sorted array operations

使用带
```
axis
```
参数的
```
np.sum()
```
、
```
np.mean()
```
、
```
np.std()
```
进行聚合操作
利用
```
np.cumsum()
```
、
```
np.cumprod()
```
进行累积操作
使用
```
np.searchsorted()
```
实现高效的有序数组操作

Error Handling and Validation

错误处理与验证

Validate input shapes and data types before computations
Use assertions for dimension checking with informative messages
Handle NaN and Inf values appropriately with
```
np.isnan()
```
,
```
np.isinf()
```
Use
```
np.errstate()
```
context manager for controlling floating-point error handling

在计算前验证输入的形状和数据类型
使用断言进行维度检查，并提供信息丰富的提示消息
使用
```
np.isnan()
```
、
```
np.isinf()
```
妥善处理NaN和Inf值
使用
```
np.errstate()
```
上下文管理器控制浮点错误处理

Random Number Generation

随机数生成

Use
```
np.random.default_rng()
```
for modern random number generation
Set seeds for reproducibility:
```
rng = np.random.default_rng(seed=42)
```
Prefer the new Generator API over legacy
```
np.random
```
functions
Use appropriate distributions:
```
rng.normal()
```
,
```
rng.uniform()
```
,
```
rng.choice()
```

使用
```
np.random.default_rng()
```
进行现代随机数生成
设置种子以保证可复现性：
```
rng = np.random.default_rng(seed=42)
```
优先使用新的Generator API而非旧版
```
np.random
```
函数
使用合适的分布函数：
```
rng.normal()
```
、
```
rng.uniform()
```
、
```
rng.choice()
```

Linear Algebra

线性代数

Use
```
np.linalg
```
for linear algebra operations
Leverage
```
np.linalg.solve()
```
instead of computing inverse for linear systems
Use
```
np.linalg.eig()
```
,
```
np.linalg.svd()
```
for decompositions
Check matrix condition with
```
np.linalg.cond()
```
before inversion

使用
```
np.linalg
```
进行线性代数操作
求解线性系统时，优先使用
```
np.linalg.solve()
```
而非计算逆矩阵
使用
```
np.linalg.eig()
```
、
```
np.linalg.svd()
```
进行矩阵分解
求逆前使用
```
np.linalg.cond()
```
检查矩阵条件数

Testing and Documentation

测试与文档

Write unit tests using
```
pytest
```
with
```
np.testing
```
assertions
Use
```
np.testing.assert_array_equal()
```
for exact comparisons
Use
```
np.testing.assert_array_almost_equal()
```
for floating-point comparisons
Include comprehensive docstrings following NumPy docstring format

使用
```
pytest
```
结合
```
np.testing
```
断言编写单元测试
使用
```
np.testing.assert_array_equal()
```
进行精确比较
使用
```
np.testing.assert_array_almost_equal()
```
进行浮点数比较
遵循NumPy文档字符串格式，编写全面的文档字符串

Key Conventions

关键约定

Import as
```
import numpy as np
```
Use
```
snake_case
```
for variables and functions
Document array shapes in docstrings
Profile code with
```
%timeit
```
to identify bottlenecks

按
```
import numpy as np
```
的方式导入
变量和函数使用
```
snake_case
```
命名法
在文档字符串中说明数组形状
使用
```
%timeit
```
对代码进行性能分析，找出瓶颈