NumPy - Numerical Python

The fundamental package for numerical computing in Python, providing multi-dimensional arrays and fast operations.

When to Use

Working with multi-dimensional arrays and matrices
Performing element-wise operations on arrays
Linear algebra computations (matrix multiplication, eigenvalues, SVD)
Random number generation and statistical distributions
Fourier transforms and signal processing basics
Mathematical operations (trigonometric, exponential, logarithmic)
Broadcasting operations across different array shapes
Vectorizing Python loops for performance
Reading and writing numerical data to files
Building numerical algorithms and simulations
Serving as foundation for pandas, scikit-learn, SciPy

Reference Documentation

Official docs: https://numpy.org/doc/
Search patterns: `np.array`, `np.zeros`, `np.dot`, `np.linalg`, `np.random`, `np.broadcast`

Core Principles

Use NumPy For

Task	Function	Example
Create arrays	`array` , `zeros` , `ones`	`np.array([1, 2, 3])`
Mathematical ops	`+` , `*` , `sin` , `exp`	`np.sin(arr)`
Linear algebra	`dot` , `linalg.inv`	`np.dot(A, B)`
Statistics	`mean` , `std` , `percentile`	`np.mean(arr)`
Random numbers	`random.rand` , `random.normal`	`np.random.rand(10)`
Indexing	`[]` , boolean, fancy	`arr[arr > 0]`
Broadcasting	Automatic	`arr + scalar`
Reshaping	`reshape` , `flatten`	`arr.reshape(2, 3)`

Do NOT Use For

String manipulation (use built-in str or pandas)
Complex data structures (use pandas DataFrame)
Symbolic mathematics (use SymPy)
Deep learning (use PyTorch, TensorFlow)
Sparse matrices (use scipy.sparse)

Quick Reference

Installation

bash

# pip
pip install numpy

# conda
conda install numpy

# Specific version
pip install numpy==1.26.0

Standard Imports

python

import numpy as np

# Common submodules
from numpy import linalg as la
from numpy import random as rand
from numpy import fft

# Never import *: it shadows built-ins such as sum, min, max, abs, any, and all
# from numpy import *  # DON'T DO THIS!

Basic Pattern - Array Creation

python

import numpy as np

# From list
arr = np.array([1, 2, 3, 4, 5])

# Zeros and ones
zeros = np.zeros((3, 4))
ones = np.ones((2, 3))

# Range
range_arr = np.arange(0, 10, 2)  # [0, 2, 4, 6, 8]

# Linspace
linspace_arr = np.linspace(0, 1, 5)  # [0, 0.25, 0.5, 0.75, 1]

print(f"Array: {arr}")
print(f"Shape: {arr.shape}")
print(f"Dtype: {arr.dtype}")

Basic Pattern - Array Operations

python

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise operations
c = a + b   # [5, 7, 9]
d = a * b   # [4, 10, 18]
e = a ** 2  # [1, 4, 9]

# Mathematical functions
f = np.sin(a)
g = np.exp(a)

print(f"Sum: {c}")
print(f"Product: {d}")

Basic Pattern - Linear Algebra

python

import numpy as np

# Matrix multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Dot product
C = np.dot(A, B)  # or A @ B

# Matrix inverse
A_inv = np.linalg.inv(A)

# Eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)

print(f"Matrix product:\n{C}")
print(f"Eigenvalues: {eigenvalues}")

Critical Rules

✅ DO

Use vectorization - Avoid Python loops, use array operations
Specify dtype explicitly - For memory efficiency and precision control
Use views when possible - Avoid unnecessary copies
Broadcast properly - Understand broadcasting rules
Check array shapes - Use `.shape` frequently
Use axis parameter - For operations along specific dimensions
Pre-allocate arrays - Don't grow arrays in loops
Use appropriate dtypes - int32, float64, complex128, etc.
Copy when needed - Use `.copy()` for independent arrays
Use built-in functions - They're optimized in C
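
Two of the rules above, explicit dtypes and the axis parameter, are easy to check interactively. The following is a minimal sketch; the array names (`a64`, `a32`, `m`) are illustrative and not taken from the original.

```python
import numpy as np

# dtype controls memory: same length, half the bytes per element
a64 = np.zeros(1000, dtype=np.float64)
a32 = np.zeros(1000, dtype=np.float32)
print(a64.nbytes, a32.nbytes)  # 8000 4000

# axis selects the dimension that gets reduced
m = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
col_sums = m.sum(axis=0)        # collapse rows: one value per column
row_sums = m.sum(axis=1)        # collapse columns: one value per row
print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
```

Checking `.shape` and `.nbytes` like this before a large computation is a cheap way to catch a wrong axis or an oversized dtype early.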

❌ DON'T

Loop over arrays - Use vectorization instead
Grow arrays dynamically - Pre-allocate instead
Use Python lists for math - Convert to arrays first
Ignore memory layout - C-contiguous vs Fortran-contiguous matters
Mix dtypes carelessly - Know implicit type promotion rules
Modify arrays during iteration - Results can be hard to predict
Use == to test whether two arrays are equal - `==` compares element-wise; use `np.array_equal()` or `np.allclose()`
Assume an array is a view or a copy - Check with the `.base` attribute
Ignore NaN handling - Use `np.nanmean()`, `np.nanstd()`, etc.
Use outdated APIs - Check for deprecated functions
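
The NaN and type-promotion items above can be illustrated briefly. This is a sketch with made-up values; `np.result_type` is used here only to show which dtype NumPy would pick when two dtypes meet.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])
plain_mean = np.mean(values)    # nan: a single NaN propagates into the result
nan_mean = np.nanmean(values)   # 2.0: NaN entries are skipped

# Implicit promotion when arrays of different dtypes are combined
promoted = np.result_type(np.int32, np.float32)
mixed = np.array([1, 2], dtype=np.int32) + np.array([0.5, 0.5], dtype=np.float32)

print(plain_mean, nan_mean)   # nan 2.0
print(promoted, mixed.dtype)  # float64 float64
```

Note that int32 combined with float32 promotes to float64, not float32, because float32 cannot represent every int32 value exactly.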

Anti-Patterns (NEVER)

python

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# ❌ BAD: Python loops
result = []
for i in range(len(arr)):
    result.append(arr[i] * 2)
result = np.array(result)

# ✅ GOOD: Vectorization
result = arr * 2

# ❌ BAD: Growing arrays
result = np.array([])
for i in range(1000):
    result = np.append(result, i)  # Very slow: np.append copies the whole array each time

# ✅ GOOD: Pre-allocate
result = np.zeros(1000)
for i in range(1000):
    result[i] = i

# Even better: Use arange
result = np.arange(1000)

arr1 = np.array([1.0, 2.0, 3.0])
arr2 = np.array([1.0, 2.0, 3.0])

# ❌ BAD: Comparing arrays with ==
# if arr1 == arr2:  # Raises ValueError: the truth value of a multi-element array is ambiguous
#     print("Equal")

# ✅ GOOD: Use appropriate comparison
if np.array_equal(arr1, arr2):
    print("Equal")

# Or for floating point
if np.allclose(arr1, arr2, rtol=1e-5):
    print("Close enough")

# ❌ BAD: Ignoring dtypes
arr = np.array([1, 2, 3])
arr[0] = 1.5  # Silently truncates to 1!

# ✅ GOOD: Explicit dtype
arr = np.array([1, 2, 3], dtype=float)
arr[0] = 1.5  # Now works correctly

# ❌ BAD: Unintentional modification
a = np.array([1, 2, 3])
b = a       # b is just a reference!
b[0] = 999  # Also modifies a!

# ✅ GOOD: Explicit copy
a = np.array([1, 2, 3])
b = a.copy()  # b is independent
b[0] = 999    # a is unchanged

Array Creation

Basic Array Creation

python

import numpy as np

# From Python list
arr1 = np.array([1, 2, 3, 4, 5])

# From nested list (2D)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

# Specify dtype
arr3 = np.array([1, 2, 3], dtype=np.float64)
arr4 = np.array([1, 2, 3], dtype=np.int32)

# From tuple
arr5 = np.array((1, 2, 3))

# Complex numbers
arr6 = np.array([1+2j, 3+4j])

print(f"1D array: {arr1}")
print(f"2D array:\n{arr2}")
print(f"Float array: {arr3}")

Special Array Creation

python

import numpy as np

# Zeros
zeros = np.zeros((3, 4))  # 3x4 array of zeros

# Ones
ones = np.ones((2, 3, 4))  # 2x3x4 array of ones

# Empty (uninitialized)
empty = np.empty((2, 2))  # Faster but values are garbage

# Full (constant value)
full = np.full((3, 3), 7)  # 3x3 array filled with 7

# Identity matrix
identity = np.eye(4)  # 4x4 identity matrix

# Diagonal matrix
diag = np.diag([1, 2, 3, 4])

print(f"Zeros shape: {zeros.shape}")
print(f"Identity:\n{identity}")

Range-Based Creation

python

import numpy as np

# Arange (like Python range)
a = np.arange(10)         # [0, 1, 2, ..., 9]
b = np.arange(2, 10)      # [2, 3, 4, ..., 9]
c = np.arange(0, 10, 2)   # [0, 2, 4, 6, 8]
d = np.arange(0, 1, 0.1)  # [0, 0.1, 0.2, ..., 0.9]

# Linspace (linearly spaced)
e = np.linspace(0, 1, 5)     # [0, 0.25, 0.5, 0.75, 1]
f = np.linspace(0, 10, 100)  # 100 points from 0 to 10

# Logspace (logarithmically spaced)
g = np.logspace(0, 2, 5)  # [1, 10^0.5, 10, 10^1.5, 100]

# Geomspace (geometrically spaced)
h = np.geomspace(1, 1000, 4)  # [1, 10, 100, 1000]

print(f"Arange: {a}")
print(f"Linspace: {e}")

Array Copies and Views

python

import numpy as np

original = np.array([1, 2, 3, 4, 5])

# View (shares memory)
view = original[:]
view[0] = 999  # Modifies original!

# Copy (independent)
copy = original.copy()
copy[0] = 777  # Doesn't affect original

# Check if array is a view
print(f"Is view? {view.base is original}")
print(f"Is copy? {copy.base is None}")

# Some operations create views, some create copies
slice_view = original[1:3]                # View
boolean_copy = original[original > 2]     # Copy!
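
Besides inspecting `.base`, `np.shares_memory` gives a direct answer to whether two arrays overlap in memory. The sketch below is illustrative; the variable names are not from the original.

```python
import numpy as np

original = np.arange(10)
slice_view = original[2:6]        # basic slice -> view
fancy_copy = original[[2, 3, 4]]  # integer-array index -> copy

print(np.shares_memory(original, slice_view))  # True
print(np.shares_memory(original, fancy_copy))  # False

# Writing through the view changes the original
slice_view[0] = -1
print(original[2])  # -1
```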

Array Indexing and Slicing

Basic Indexing

python

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Single element
print(arr[0])   # 10
print(arr[-1])  # 50 (last element)

# Slicing
print(arr[1:4])  # [20, 30, 40]
print(arr[:3])   # [10, 20, 30]
print(arr[2:])   # [30, 40, 50]
print(arr[::2])  # [10, 30, 50] (every 2nd element)

# Negative indices
print(arr[-3:-1])  # [30, 40]

# Reverse
print(arr[::-1])  # [50, 40, 30, 20, 10]

Multi-Dimensional Indexing

python

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Single element
print(arr[0, 0])    # 1
print(arr[1, 2])    # 6
print(arr[-1, -1])  # 9

# Row slicing
print(arr[0])     # [1, 2, 3] (first row)
print(arr[1, :])  # [4, 5, 6] (second row)

# Column slicing
print(arr[:, 0])  # [1, 4, 7] (first column)
print(arr[:, 1])  # [2, 5, 8] (second column)

# Sub-array
print(arr[0:2, 1:3])  # [[2, 3], [5, 6]]

# Every other element
print(arr[::2, ::2])  # [[1, 3], [7, 9]]

Boolean Indexing

python

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Boolean condition
mask = arr > 5
print(mask)  # [False, False, False, False, False, True, True, True, True, True]

# Boolean indexing
filtered = arr[arr > 5]
print(filtered)  # [6, 7, 8, 9, 10]

# Multiple conditions (use & and |, not 'and' and 'or')
result = arr[(arr > 3) & (arr < 8)]
print(result)  # [4, 5, 6, 7]

# Or condition
result = arr[(arr < 3) | (arr > 8)]
print(result)  # [1, 2, 9, 10]

# Negation
result = arr[~(arr > 5)]
print(result)  # [1, 2, 3, 4, 5]

Fancy Indexing

python

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Index with array of integers
indices = np.array([0, 2, 4])
result = arr[indices]
print(result)  # [10, 30, 50]

# 2D fancy indexing
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

rows = np.array([0, 2])
cols = np.array([1, 2])
result = arr2d[rows, cols]  # Elements at (0,1) and (2,2)
print(result)  # [2, 9]

# Combining boolean and fancy indexing
mask = arr > 25
indices_of_large = np.where(mask)[0]
print(indices_of_large)  # [2, 3, 4]

Array Operations

Element-wise Operations

python

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Arithmetic operations
print(a + b)   # [6, 8, 10, 12]
print(a - b)   # [-4, -4, -4, -4]
print(a * b)   # [5, 12, 21, 32]
print(a / b)   # [0.2, 0.333..., 0.428..., 0.5]
print(a ** 2)  # [1, 4, 9, 16]
print(a // b)  # [0, 0, 0, 0] (floor division)
print(a % b)   # [1, 2, 3, 4] (modulo)

# With scalars
print(a + 10)  # [11, 12, 13, 14]
print(a * 2)   # [2, 4, 6, 8]

Mathematical Functions

python

import numpy as np

x = np.array([0, np.pi/6, np.pi/4, np.pi/3, np.pi/2])

# Trigonometric
sin_x = np.sin(x)
cos_x = np.cos(x)
tan_x = np.tan(x)

# Inverse trig
arcsin_x = np.arcsin([0, 0.5, 1])

# Exponential and logarithm
arr = np.array([1, 2, 3, 4])
exp_arr = np.exp(arr)
log_arr = np.log(arr)
log10_arr = np.log10(arr)

# Rounding
floats = np.array([1.2, 2.7, 3.5, 4.9])
print(np.round(floats))  # [1, 3, 4, 5]
print(np.floor(floats))  # [1, 2, 3, 4]
print(np.ceil(floats))   # [2, 3, 4, 5]

# Absolute value
print(np.abs([-1, -2, 3, -4]))  # [1, 2, 3, 4]

Aggregation Functions

python

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Sum
print(np.sum(arr))          # 45 (all elements)
print(np.sum(arr, axis=0))  # [12, 15, 18] (column sums)
print(np.sum(arr, axis=1))  # [6, 15, 24] (row sums)

# Mean
print(np.mean(arr))  # 5.0

# Standard deviation
print(np.std(arr))  # ~2.58

# Min and max
print(np.min(arr))     # 1
print(np.max(arr))     # 9
print(np.argmin(arr))  # 0 (index of min in the flattened array)
print(np.argmax(arr))  # 8 (index of max in the flattened array)

# Median and percentiles
print(np.median(arr))          # 5.0
print(np.percentile(arr, 25))  # 3.0 (25th percentile)

Broadcasting

Broadcasting Rules

python

import numpy as np

# Scalar and array
arr = np.array([1, 2, 3, 4])
result = arr + 10  # Broadcast scalar to array shape
print(result)      # [11, 12, 13, 14]

# 1D and 2D
arr1d = np.array([1, 2, 3])
arr2d = np.array([[10], [20], [30]])

result = arr1d + arr2d
print(result)
# [[11, 12, 13],
#  [21, 22, 23],
#  [31, 32, 33]]

# Broadcasting example: standardization
data = np.random.randn(100, 3)      # 100 samples, 3 features
mean = np.mean(data, axis=0)        # Mean of each column
std = np.std(data, axis=0)          # Std of each column
standardized = (data - mean) / std  # Broadcasting!
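
The rule behind these examples is that shapes are compared from the trailing dimension backwards, and each pair of sizes must be equal or one of them must be 1. A minimal sketch using `np.broadcast_shapes` (available in NumPy 1.20 and later) follows; the `weights` and `data` arrays are hypothetical.

```python
import numpy as np

# Compatible: trailing sizes match or one of them is 1
print(np.broadcast_shapes((3, 1), (3,)))    # (3, 3)
print(np.broadcast_shapes((100, 3), (3,)))  # (100, 3)

# Incompatible: trailing sizes 3 and 100 differ and neither is 1
incompatible = False
try:
    np.broadcast_shapes((100, 3), (100,))
except ValueError:
    incompatible = True
print(incompatible)  # True

# Fix: add an axis so the per-row vector has shape (100, 1)
data = np.ones((100, 3))
weights = np.arange(100.0)
scaled = data * weights[:, np.newaxis]
print(scaled.shape)  # (100, 3)
```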

Explicit Broadcasting

python

import numpy as np

# Using broadcast_to
arr = np.array([1, 2, 3])
broadcasted = np.broadcast_to(arr, (4, 3))
print(broadcasted)
# [[1, 2, 3],
#  [1, 2, 3],
#  [1, 2, 3],
#  [1, 2, 3]]

# Using newaxis
arr1d = np.array([1, 2, 3])
col_vector = arr1d[:, np.newaxis]  # Shape (3, 1)
row_vector = arr1d[np.newaxis, :]  # Shape (1, 3)

# Outer product using broadcasting
outer = col_vector * row_vector
print(outer)
# [[1, 2, 3],
#  [2, 4, 6],
#  [3, 6, 9]]

Linear Algebra

Matrix Operations

python

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Matrix multiplication
C = np.dot(A, B)  # Traditional
C = A @ B         # Modern syntax (Python 3.5+)

# Element-wise multiplication
D = A * B  # Not matrix multiplication!

# Matrix transpose
A_T = A.T

# Trace (sum of diagonal)
trace = np.trace(A)

# Matrix power
A_squared = np.linalg.matrix_power(A, 2)

print(f"Matrix product:\n{C}")
print(f"Transpose:\n{A_T}")
print(f"Trace: {trace}")

Solving Linear Systems

python

import numpy as np

# Solve Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

# Solve for x
x = np.linalg.solve(A, b)
print(f"Solution: {x}")  # [2, 3]

# Verify solution
print(f"Verification: {np.allclose(A @ x, b)}")  # True

# Matrix inverse
A_inv = np.linalg.inv(A)
print(f"Inverse:\n{A_inv}")

# Determinant
det = np.linalg.det(A)
print(f"Determinant: {det}")

Eigenvalues and Eigenvectors

python

import numpy as np

# Square matrix
A = np.array([[1, 2], [2, 1]])

# Eigenvalue decomposition
eigenvalues, eigenvectors = np.linalg.eig(A)

print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")

# Verify: A @ v = λ * v (eigenvectors are the columns)
for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]

    left = A @ v
    right = lam * v

    print(f"Eigenvalue {i}: {np.allclose(left, right)}")

Singular Value Decomposition (SVD)

python

import numpy as np

# Any matrix
A = np.array([[1, 2, 3], [4, 5, 6]])

# SVD: A = U @ S @ Vt
U, s, Vt = np.linalg.svd(A)

# Reconstruct original matrix
S = np.zeros((2, 3))
S[:2, :2] = np.diag(s)
A_reconstructed = U @ S @ Vt

print(f"Original:\n{A}")
print(f"Reconstructed:\n{A_reconstructed}")
print(f"Close? {np.allclose(A, A_reconstructed)}")

# Singular values
print(f"Singular values: {s}")
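
Padding `S` with zeros can be avoided by asking for the reduced decomposition with `full_matrices=False`. The sketch below is one possible variant of the reconstruction, not the only way to do it; the rank-1 approximation is added as an illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Reduced SVD: U is (2, 2), s is (2,), Vt is (2, 3)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# U * s scales each column of U by its singular value, i.e. U @ diag(s)
A_rebuilt = (U * s) @ Vt
print(np.allclose(A, A_rebuilt))  # True

# Rank-1 approximation from the largest singular value
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0])
print(np.linalg.matrix_rank(A_rank1))  # 1
```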

Matrix Norms

python

import numpy as np

A = np.array([[1, 2], [3, 4]])

# Frobenius norm (default)
norm_fro = np.linalg.norm(A)

# 1-norm (max column sum)
norm_1 = np.linalg.norm(A, ord=1)

# Infinity norm (max row sum)
norm_inf = np.linalg.norm(A, ord=np.inf)

# 2-norm (spectral norm)
norm_2 = np.linalg.norm(A, ord=2)

print(f"Frobenius: {norm_fro:.4f}")
print(f"1-norm: {norm_1:.4f}")
print(f"2-norm: {norm_2:.4f}")
print(f"inf-norm: {norm_inf:.4f}")

Random Number Generation

Basic Random Generation

python

import numpy as np

# Set seed for reproducibility
np.random.seed(42)

# Random floats [0, 1)
rand_uniform = np.random.rand(5)  # 1D array of 5 elements
rand_2d = np.random.rand(3, 4)    # 3x4 array

# Random integers
rand_int = np.random.randint(0, 10, size=5)  # [0, 10)
rand_int_2d = np.random.randint(0, 100, size=(3, 3))

# Random normal distribution
rand_normal = np.random.randn(1000)  # Mean=0, std=1
rand_normal_custom = np.random.normal(loc=5, scale=2, size=1000)

# Random choice
choices = np.random.choice(['a', 'b', 'c'], size=10)
weighted_choices = np.random.choice([1, 2, 3], size=100, p=[0.1, 0.3, 0.6])

Statistical Distributions

python

import numpy as np

# Uniform distribution [low, high)
uniform = np.random.uniform(low=0, high=10, size=1000)

# Normal (Gaussian) distribution
normal = np.random.normal(loc=0, scale=1, size=1000)

# Exponential distribution
exponential = np.random.exponential(scale=2, size=1000)

# Binomial distribution
binomial = np.random.binomial(n=10, p=0.5, size=1000)

# Poisson distribution
poisson = np.random.poisson(lam=3, size=1000)

# Beta distribution
beta = np.random.beta(a=2, b=5, size=1000)

# Chi-squared distribution
chisq = np.random.chisquare(df=2, size=1000)

Modern Random Generator (numpy.random.Generator)

python

import numpy as np

# Create generator
rng = np.random.default_rng(seed=42)

# Generate random numbers
rand = rng.random(size=10)
ints = rng.integers(low=0, high=100, size=10)
normal = rng.normal(loc=0, scale=1, size=10)

# Shuffle array in-place
arr = np.arange(10)
rng.shuffle(arr)

# Sample without replacement
sample = rng.choice(100, size=10, replace=False)

print(f"Random: {rand}")
print(f"Shuffled: {arr}")

Reshaping and Manipulation

Reshaping Arrays

python

import numpy as np

# Original array
arr = np.arange(12)  # [0, 1, 2, ..., 11]

# Reshape
arr_2d = arr.reshape(3, 4)
arr_3d = arr.reshape(2, 2, 3)

# Automatic dimension calculation with -1
arr_auto = arr.reshape(3, -1)  # Automatically calculates 4

# Flatten to 1D
flat = arr_2d.flatten()  # Returns copy
flat = arr_2d.ravel()    # Returns view if possible

# Transpose
arr_t = arr_2d.T

print(f"Original shape: {arr.shape}")
print(f"2D shape: {arr_2d.shape}")
print(f"3D shape: {arr_3d.shape}")

Stacking and Splitting

python

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])

# Vertical stacking (vstack)
vstacked = np.vstack([a, b, c])
print(vstacked)
# [[1, 2, 3],
#  [4, 5, 6],
#  [7, 8, 9]]

# Horizontal stacking (hstack)
hstacked = np.hstack([a, b, c])
print(hstacked)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Column stack
col_stacked = np.column_stack([a, b, c])

# Concatenate (more general)
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
concat_axis0 = np.concatenate([arr1, arr2], axis=0)
concat_axis1 = np.concatenate([arr1, arr2], axis=1)

# Splitting
arr = np.arange(12)
split = np.split(arr, 3)  # Split into 3 equal parts
print(split)  # [array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([8, 9, 10, 11])]

File I/O

Text Files

python

import numpy as np

# Save to text file
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

np.savetxt('data.txt', data)
np.savetxt('data.csv', data, delimiter=',')
np.savetxt('data_formatted.txt', data, fmt='%.2f')

# Load from text file
loaded = np.loadtxt('data.txt')
loaded_csv = np.loadtxt('data.csv', delimiter=',')

# Skip header rows
loaded_skip = np.loadtxt('data.txt', skiprows=1)

# Load specific columns
loaded_cols = np.loadtxt('data.csv', delimiter=',', usecols=(0, 2))

Binary Files (.npy, .npz)

python

import numpy as np

# Save single array
arr = np.random.rand(100, 100)
np.save('array.npy', arr)

# Load single array
loaded = np.load('array.npy')

# Save multiple arrays (np.savez writes an uncompressed .npz archive)
arr1 = np.random.rand(10, 10)
arr2 = np.random.rand(20, 20)
np.savez('arrays.npz', first=arr1, second=arr2)

# Load multiple arrays
loaded = np.load('arrays.npz')
loaded_arr1 = loaded['first']
loaded_arr2 = loaded['second']

# Compressed save
np.savez_compressed('arrays_compressed.npz', arr1=arr1, arr2=arr2)

Advanced Techniques

Universal Functions (ufuncs)

python

import numpy as np

# Ufuncs operate element-wise
arr = np.array([1, 2, 3, 4, 5])

# Built-in ufuncs
result = np.sqrt(arr)
result = np.exp(arr)
result = np.log(arr)

# Wrap a scalar Python function with np.vectorize
def my_func(x):
    return x**2 + 2*x + 1

vectorized = np.vectorize(my_func)
result = vectorized(arr)

# Decorator form (same mechanism: np.vectorize is a convenience wrapper
# that loops in Python, not a compiled ufunc, so it is not faster)
@np.vectorize
def better_func(x):
    return x**2 + 2*x + 1
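
When the function body consists only of array-aware operations, as `my_func` does, it can usually be applied to the whole array directly, which keeps the loop in compiled code. A small sketch comparing the two routes:

```python
import numpy as np

def my_func(x):
    return x**2 + 2*x + 1

arr = np.arange(1, 6)  # [1, 2, 3, 4, 5]

via_vectorize = np.vectorize(my_func)(arr)  # calls my_func once per element
via_arrays = my_func(arr)                   # one pass of array arithmetic

print(via_arrays)                                 # [ 4  9 16 25 36]
print(np.array_equal(via_vectorize, via_arrays))  # True
```

`np.vectorize` remains useful when the wrapped function contains scalar-only logic, such as Python `if` branches, that cannot be written as array expressions.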

Structured Arrays

python

import numpy as np

# Define dtype
dt = np.dtype([('name', 'U20'), ('age', 'i4'), ('weight', 'f8')])

# Create structured array
data = np.array([
    ('Alice', 25, 55.5),
    ('Bob', 30, 70.2),
    ('Charlie', 35, 82.1)
], dtype=dt)

# Access by field name
names = data['name']
ages = data['age']

# Sort by field
sorted_data = np.sort(data, order='age')

print(f"Names: {names}")
print(f"Sorted by age:\n{sorted_data}")

Memory Layout and Performance

python

import numpy as np

# C-contiguous (row-major, default)
arr_c = np.array([[1, 2, 3], [4, 5, 6]], order='C')

# Fortran-contiguous (column-major)
arr_f = np.array([[1, 2, 3], [4, 5, 6]], order='F')

# Check memory layout
print(f"C-contiguous? {arr_c.flags['C_CONTIGUOUS']}")
print(f"F-contiguous? {arr_c.flags['F_CONTIGUOUS']}")

# Make contiguous
arr_made_c = np.ascontiguousarray(arr_f)
arr_made_f = np.asfortranarray(arr_c)

# Memory usage
print(f"Memory (bytes): {arr_c.nbytes}")
print(f"Item size: {arr_c.itemsize}")

Advanced Indexing with ix_

```python
import numpy as np

arr = np.arange(20).reshape(4, 5)

# Select specific rows and columns
rows = np.array([0, 2])
cols = np.array([1, 3, 4])

# ix_ creates an open mesh from the index arrays
result = arr[np.ix_(rows, cols)]
print(result)
# [[ 1  3  4]
#  [11 13 14]]

# Equivalent to
result = arr[[0, 2]][:, [1, 3, 4]]
```
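To see why `np.ix_` is needed, it helps to look at the index arrays it returns and to compare them with plain fancy indexing, which pairs indices element by element instead of forming every combination. A minimal sketch using the same `arr`:

```python
import numpy as np

arr = np.arange(20).reshape(4, 5)
rows = np.array([0, 2])
cols = np.array([1, 3, 4])

# ix_ reshapes the index arrays so they broadcast against each other
row_idx, col_idx = np.ix_(rows, cols)
print(row_idx.shape, col_idx.shape)  # (2, 1) (1, 3)

block = arr[row_idx, col_idx]  # every row/column combination

# Plain fancy indexing pairs the indices: elements (0, 1) and (2, 3)
paired = arr[[0, 2], [1, 3]]
print(paired)
```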

Practical Workflows


Statistical Analysis

```python
import numpy as np

# Generate sample data
np.random.seed(42)
data = np.random.normal(loc=100, scale=15, size=1000)

# Descriptive statistics
mean = np.mean(data)
median = np.median(data)
std = np.std(data)  # population standard deviation (ddof=0 by default)
var = np.var(data)

# Percentiles
q25, q50, q75 = np.percentile(data, [25, 50, 75])

# Histogram: 20 bins give 20 counts and 21 bin edges
counts, bins = np.histogram(data, bins=20)
```

Correlation coefficient
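A correlation coefficient can be computed with `np.corrcoef`. The sketch below is only an illustration: the second variable `y` and its relationship to `x` are hypothetical, constructed so that the two series are clearly correlated.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=100, scale=15, size=1000)
# Hypothetical second variable: linearly related to x plus independent noise
y = 0.5 * x + rng.normal(loc=0, scale=5, size=1000)

corr_matrix = np.corrcoef(x, y)  # 2x2 matrix of Pearson correlations
r = corr_matrix[0, 1]            # off-diagonal entry is corr(x, y)
print(f"Pearson r: {r:.3f}")
```

The diagonal of the returned matrix is always 1 (each variable against itself), so the off-diagonal entry is the value of interest.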

```python
# Estimate π (assumes estimate_pi(n_samples) is already defined and NumPy is imported as np)
pi_est = estimate_pi(10_000_000)
print(f"π estimate: {pi_est:.6f}")
print(f"Error: {abs(pi_est - np.pi):.6f}")
```
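The call above depends on an `estimate_pi` function. One way such a function could look is sketched below, assuming a Monte Carlo approach that samples points in the unit square and counts those inside the quarter circle. The signature, the `seed` parameter, and the sample count are assumptions made for illustration.

```python
import numpy as np

def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi (illustrative sketch, not the original definition)."""
    rng = np.random.default_rng(seed)
    x = rng.random(n_samples)
    y = rng.random(n_samples)
    # Fraction of points inside the quarter circle approximates pi / 4
    inside = (x**2 + y**2) <= 1.0
    return 4.0 * np.count_nonzero(inside) / n_samples

pi_est = estimate_pi(1_000_000)
print(f"π estimate: {pi_est:.6f}")
print(f"Error: {abs(pi_est - np.pi):.6f}")
```

The error shrinks roughly with the square root of the sample count, so each additional correct digit costs about 100 times more samples.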

Polynomial Fitting

```python
import numpy as np

# Generate noisy data
x = np.linspace(0, 10, 50)
y_true = 2*x**2 + 3*x + 1
y_noisy = y_true + np.random.normal(0, 10, size=50)

# Fit polynomial (degree 2); coefficients are returned highest power first
coeffs = np.polyfit(x, y_noisy, deg=2)
print(f"Coefficients: {coeffs}")  # Roughly [2, 3, 1]; the lower-order terms are more sensitive to the noise

# Predict
y_pred = np.polyval(coeffs, x)

# Evaluate fit quality
residuals = y_noisy - y_pred
rmse = np.sqrt(np.mean(residuals**2))
print(f"RMSE: {rmse:.2f}")

# Create polynomial object
poly = np.poly1d(coeffs)
print(f"Polynomial:\n{poly}")
```
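NumPy also ships the newer `numpy.polynomial` package, which its documentation recommends for new code. The sketch below fits the same quadratic with `Polynomial.fit`; the data is kept noise-free here (an assumption made so the recovered coefficients are easy to check). Note that this API orders coefficients from lowest to highest power, the reverse of `np.polyfit`.

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.linspace(0, 10, 50)
y = 2*x**2 + 3*x + 1  # noise-free so the result is easy to verify

# fit() works in a scaled domain; convert() maps back to the original x
fitted = Polynomial.fit(x, y, deg=2)
coef_low_first = fitted.convert().coef
print(coef_low_first)  # approximately [1, 3, 2]

# The fitted object is callable
y_pred = fitted(x)
```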

Image Processing Basics

```python
import numpy as np

# Create synthetic image (grayscale)
image = np.random.rand(100, 100)

# Apply transformations
# Rotate 90 degrees (counterclockwise)
rotated = np.rot90(image)

# Flip vertically
flipped_v = np.flipud(image)

# Flip horizontally
flipped_h = np.fliplr(image)

# Transpose
transposed = image.T

# Normalize to [0, 255] (assumes the image is not constant, so max > min)
normalized = ((image - image.min()) / (image.max() - image.min()) * 255).astype(np.uint8)

print(f"Original shape: {image.shape}")
print(f"Value range: [{image.min():.2f}, {image.max():.2f}]")
```
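The effect of each transformation is easier to see on a tiny array than on random pixels. A minimal sketch with a hypothetical 2x2 "image":

```python
import numpy as np

img = np.array([[1, 2],
                [3, 4]])

rot = np.rot90(img)          # counterclockwise: right column becomes top row
up_down = np.flipud(img)     # rows reversed
left_right = np.fliplr(img)  # columns reversed

print(rot)         # [[2 4] [1 3]]
print(up_down)     # [[3 4] [1 2]]
print(left_right)  # [[2 1] [4 3]]
```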

Distance Matrices

```python
import numpy as np

# Points in 2D
points = np.random.rand(100, 2)

# Pairwise distances (broadcasting); diff has shape (100, 100, 2)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
distances = np.sqrt(np.sum(diff**2, axis=2))

print(f"Distance matrix shape: {distances.shape}")
print(f"Max distance: {distances.max():.4f}")

# Find nearest neighbors
for i in range(5):
    # Exclude self (distance = 0)
    dists = distances[i].copy()
    dists[i] = np.inf
    nearest = np.argmin(dists)
    print(f"Point {i} nearest to point {nearest}, distance: {distances[i, nearest]:.4f}")
```
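The broadcasting approach builds an intermediate of shape (n, n, d), which can become large. A common alternative, sketched here under the assumption that small rounding errors are acceptable, uses the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2 a·b so that only (n, n) arrays are created.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((100, 2))

# Reference: broadcasting builds a (100, 100, 2) intermediate
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
d_broadcast = np.sqrt(np.sum(diff**2, axis=2))

# Alternative: squared norms plus one matrix product, (100, 100) intermediates only
sq_norms = np.sum(points**2, axis=1)
sq_dists = sq_norms[:, np.newaxis] + sq_norms[np.newaxis, :] - 2.0 * (points @ points.T)
d_dot = np.sqrt(np.maximum(sq_dists, 0.0))  # clip tiny negatives caused by rounding

print(np.allclose(d_broadcast, d_dot, atol=1e-6))
```

The matrix-product form is less accurate for nearly identical points, so the broadcasting version remains the safer default for small inputs.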

Sliding Window Operations

```python
import numpy as np

def sliding_window_view(arr, window_size):
    """Create sliding window views of a 1-D array."""
    shape = (arr.shape[0] - window_size + 1, window_size)
    strides = (arr.strides[0], arr.strides[0])
    # writeable=False guards against accidental writes through overlapping windows
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides, writeable=False)

# Time series data
data = np.random.rand(100)

# Create sliding windows
windows = sliding_window_view(data, window_size=10)

# Compute statistics for each window
window_means = np.mean(windows, axis=1)
window_stds = np.std(windows, axis=1)

print(f"Number of windows: {len(windows)}")
print(f"First window mean: {window_means[0]:.4f}")
```
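The hand-written helper above relies on `as_strided`, which performs no bounds checking. Assuming NumPy 1.20 or newer is available, the library's own `numpy.lib.stride_tricks.sliding_window_view` does the same job with validation and returns a read-only view. A short sketch on a small array:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.arange(5)

# Each row is one window of length 3; no data is copied
windows = sliding_window_view(data, window_shape=3)
print(windows)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

window_means = windows.mean(axis=1)
print(window_means)
```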

Performance Optimization


Vectorization Examples

```python
import numpy as np
import time

# Bad: Python loop
def sum_python_loop(arr):
    total = 0
    for x in arr:
        total += x**2
    return total

# Good: Vectorized
def sum_vectorized(arr):
    return np.sum(arr**2)

# Benchmark (perf_counter is a monotonic, high-resolution clock)
arr = np.random.rand(1000000)

start = time.perf_counter()
result1 = sum_python_loop(arr)
time_loop = time.perf_counter() - start

start = time.perf_counter()
result2 = sum_vectorized(arr)
time_vec = time.perf_counter() - start

print(f"Loop time: {time_loop:.4f}s")
print(f"Vectorized time: {time_vec:.4f}s")
print(f"Speedup: {time_loop/time_vec:.1f}x")
```
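A single timed run can be noisy, so repeated measurement with the standard-library `timeit` module is often more reliable. The sketch below also compares `np.sum(a**2)` with `np.dot(a, a)`, which computes the same sum of squares without allocating the temporary `a**2`. The array size and repeat count are arbitrary choices, and no particular speedup is implied.

```python
import timeit
import numpy as np

arr = np.random.default_rng(0).random(100_000)

def sum_vectorized(a):
    return np.sum(a**2)  # allocates a temporary array for a**2

def sum_dot(a):
    return np.dot(a, a)  # same sum of squares, no temporary array

# Total time over 50 calls each
t_vec = timeit.timeit(lambda: sum_vectorized(arr), number=50)
t_dot = timeit.timeit(lambda: sum_dot(arr), number=50)

print(f"np.sum(a**2): {t_vec:.4f}s for 50 runs")
print(f"np.dot(a, a): {t_dot:.4f}s for 50 runs")
```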

Memory-Efficient Operations

```python
import numpy as np

# Bad: Creates three named intermediate arrays
def inefficient(arr):
    temp1 = arr * 2
    temp2 = temp1 + 5
    temp3 = temp2 ** 2
    return temp3

# Good: In-place operations (one copy, then no further allocations)
def efficient(arr):
    result = arr.copy()
    result *= 2
    result += 5
    result **= 2
    return result

# Most concise: single expression. NumPy evaluates it step by step and
# still allocates temporaries, so prefer the in-place form when memory is tight.
def single_expression(arr):
    return (arr * 2 + 5) ** 2
```
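Another way to control allocations is the `out=` argument that NumPy ufuncs accept, which writes each step into a buffer the caller provides. The helper name `scaled_square` and its signature are hypothetical, chosen only for this sketch.

```python
import numpy as np

def scaled_square(arr, out=None):
    """Compute (arr * 2 + 5) ** 2, writing every step into a single buffer."""
    if out is None:
        out = np.empty_like(arr, dtype=np.float64)
    np.multiply(arr, 2, out=out)
    np.add(out, 5, out=out)
    np.square(out, out=out)
    return out

arr = np.arange(4, dtype=np.float64)
result = scaled_square(arr)
print(result)  # [ 25.  49.  81. 121.]
```

Passing a preallocated `out` buffer is mainly useful inside loops, where the same scratch array can be reused on every iteration.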

Using numexpr for Complex Expressions

```python
import numpy as np

# For very large arrays and complex expressions,
# numexpr can be faster (requires installation)

# Without numexpr
a = np.random.rand(10000000)
b = np.random.rand(10000000)
result = 2*a + 3*b**2 - np.sqrt(a)

# With numexpr (if installed)
try:
    import numexpr as ne
    result = ne.evaluate('2*a + 3*b**2 - sqrt(a)')
except ImportError:
    pass  # numexpr is optional; the NumPy result above is kept
```

Common Pitfalls and Solutions


NaN Handling

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, 5, np.nan])

# Problem: Regular functions return NaN
mean = np.mean(arr)  # Returns nan

# Solution: Use nan-safe functions
mean = np.nanmean(arr)  # Returns 3.0
std = np.nanstd(arr)
sum_val = np.nansum(arr)

# Check for NaN
has_nan = np.isnan(arr).any()
where_nan = np.where(np.isnan(arr))[0]

# Remove NaN
arr_clean = arr[~np.isnan(arr)]

print(f"Mean (nan-safe): {mean}")
print(f"NaN positions: {where_nan}")
```
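Dropping NaN values changes the array length, which is not always acceptable. A minimal sketch of two ways to fill them instead, using the same `arr`: replacing with the NaN-safe mean via `np.where`, or with a constant via `np.nan_to_num`. Which fill value is appropriate depends on the data and is an assumption here.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, 5, np.nan])

# Replace NaN with the mean of the valid entries (3.0 here)
filled_mean = np.where(np.isnan(arr), np.nanmean(arr), arr)

# Replace NaN with a constant
filled_zero = np.nan_to_num(arr, nan=0.0)

print(filled_mean)  # [1. 2. 3. 4. 5. 3.]
print(filled_zero)  # [1. 2. 0. 4. 5. 0.]
```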

Integer Division Pitfall

```python
import numpy as np

# Problem: Integer division with integers
a = np.array([1, 2, 3])
b = np.array([2, 2, 2])
result = a / b  # With Python 3, this is fine

# But be careful with older code or explicit int types
a_int = np.array([1, 2, 3], dtype=np.int32)
b_int = np.array([2, 2, 2], dtype=np.int32)

# In NumPy, / always gives a float result
result_float = a_int / b_int  # [0.5, 1.0, 1.5]

# Use // for integer (floor) division
result_int = a_int // b_int  # [0, 1, 1]

print(f"Float division: {result_float}")
print(f"Integer division: {result_int}")
```
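A related detail is that `//` rounds toward negative infinity, not toward zero, which matters as soon as negative values appear. A short sketch contrasting floor division with truncation (the sample values are arbitrary):

```python
import numpy as np

a = np.array([-7, 7], dtype=np.int32)

floored = a // 2                              # rounds toward negative infinity
truncated = np.trunc(a / 2).astype(np.int32)  # rounds toward zero

print(floored)    # [-4  3]
print(truncated)  # [-3  3]
```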

Array Equality

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0])

# Problem: Can't use == directly in a boolean context
# if a == b:  # ValueError: the truth value of an array is ambiguous
#     ...

# Solution 1: Element-wise comparison
equal_elements = a == b  # Boolean array

# Solution 2: Check if all elements equal
all_equal = np.all(a == b)

# Solution 3: array_equal (also checks that the shapes match)
array_equal = np.array_equal(a, b)

# Solution 4: For floating point, use allclose
c = a + 1e-10
close_enough = np.allclose(a, c, rtol=1e-5, atol=1e-8)

print(f"All equal: {all_equal}")
print(f"Arrays equal: {array_equal}")
print(f"Close enough: {close_enough}")
```
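The tolerances in `np.allclose` combine as |a − b| ≤ atol + rtol · |b|, checked element by element. The sketch below illustrates how the two terms behave near zero and how NaN values are treated; the specific magnitudes are arbitrary examples.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

near = np.allclose(a, a + 1e-10)  # True: well inside the default tolerances
far = np.allclose(a, a + 1e-3)    # False: exceeds atol + rtol * |b|

# Near zero the relative term vanishes, so atol decides the outcome
tiny_default = np.allclose(1e-9, 0.0)           # True with the default atol=1e-8
tiny_strict = np.allclose(1e-9, 0.0, atol=0.0)  # False once atol is removed

# NaN is not equal to NaN unless explicitly requested
nan_arr = np.array([np.nan, 1.0])
nan_equal = np.array_equal(nan_arr, nan_arr)               # False
nan_close = np.allclose(nan_arr, nan_arr, equal_nan=True)  # True

print(near, far, tiny_default, tiny_strict, nan_equal, nan_close)
```

When comparing values that are expected to be very small, it is usually worth setting `atol` explicitly instead of relying on the default.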

Memory Leaks with Views

```python
import numpy as np

# Problem: Large array kept in memory
large_array = np.random.rand(1000000, 100)  # about 800 MB of float64
small_view = large_array[0:10]  # Just 10 rows

# The large buffer is kept in memory because small_view references it!
del large_array  # Doesn't free memory!

# Solution: Make a copy
large_array = np.random.rand(1000000, 100)
small_copy = large_array[0:10].copy()
del large_array  # Now memory is freed

# Check if it's a view
print(f"Is view? {small_view.base is not None}")
print(f"Is copy? {small_copy.base is None}")
```
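Besides inspecting `.base`, `np.shares_memory` reports directly whether two arrays overlap in memory. A minimal sketch on a much smaller array (the sizes are arbitrary):

```python
import numpy as np

big = np.zeros((1000, 100))
view = big[:10]          # slice: shares the buffer of big
copy = big[:10].copy()   # independent buffer

print(np.shares_memory(big, view))  # True
print(np.shares_memory(big, copy))  # False
print(view.base is big)             # True
print(copy.base is None)            # True
```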


This NumPy guide covers more than 50 examples spanning the major array operations and numerical computing workflows.

