verde

Verde - Spatial Data Gridding

Quick Reference

python

import verde as vd

# Basic gridding
spline = vd.Spline()
spline.fit(coordinates, values)  # coordinates = (lon, lat) tuple
grid = spline.grid(spacing=0.1, data_names=['elevation'])  # Returns xarray Dataset

# Access result (the variable name comes from data_names)
elevation = grid.elevation.values

# Save output
grid.to_netcdf('output.nc')

Key Classes

Class	Purpose
`Spline`	Bi-harmonic spline interpolation (smooth, good extrapolation)
`Linear`	Delaunay triangulation (fast, no extrapolation)
`Cubic`	Cubic interpolation (medium smoothness)
`Chain`	Pipeline of processing steps
`BlockReduce`	Decimate data to block means/medians
`Trend`	Polynomial trend fitting and removal
`Vector`	Grid 2-component vector data

Essential Operations

Grid Scattered Data

python

coordinates = (longitude, latitude)  # Tuple of 1D arrays
values = elevation  # 1D array

spline = vd.Spline()
spline.fit(coordinates, values)
grid = spline.grid(spacing=0.1, data_names=['elevation'])

Project to Cartesian

python

import pyproj

projection = pyproj.Proj(proj='merc', lat_ts=latitude.mean())  # true-scale latitude at the data's mean latitude
proj_coords = projection(longitude, latitude)

spline = vd.Spline()
spline.fit(proj_coords, values)
grid = spline.grid(spacing=1000)  # 1000m spacing
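To see why the spacing switches from degrees to meters after projecting, here is a minimal sketch of the forward Mercator formula on a sphere. It is an approximation for illustration only: `pyproj` uses an ellipsoid by default, so its output will differ slightly. The `EARTH_RADIUS_M` constant and the `spherical_mercator` function are assumptions of this sketch, not part of Verde or pyproj.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius (spherical model)


def spherical_mercator(lon_deg, lat_deg, lat_ts_deg=0.0):
    """Project one lon/lat pair (degrees) to Mercator easting/northing (meters)."""
    # The latitude of true scale shrinks the whole map by cos(lat_ts)
    scale = EARTH_RADIUS_M * math.cos(math.radians(lat_ts_deg))
    easting = scale * math.radians(lon_deg)
    northing = scale * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return easting, northing


easting, northing = spherical_mercator(1.0, 0.0)
print(f"1 degree of longitude at the equator is about {easting / 1000:.1f} km")
```

One degree maps to roughly a hundred kilometers here, which is why a spacing of `0.1` suits geographic coordinates while a value such as `1000` suits projected ones.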

Block Reduce Large Datasets

python

import numpy as np

reducer = vd.BlockReduce(reduction=np.median, spacing=0.1)
coords_reduced, values_reduced = reducer.filter(coordinates, values)
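The idea behind a block median can be sketched in plain Python. This is not Verde's implementation: the `block_median` helper is hypothetical, and it assumes blocks are anchored at the coordinate origin, whereas a real gridding library would typically anchor them to the data region.

```python
import math
from collections import defaultdict
from statistics import median


def block_median(coordinates, values, spacing):
    """Reduce scattered points to one median point per square block."""
    easting, northing = coordinates
    blocks = defaultdict(list)
    for x, y, v in zip(easting, northing, values):
        # Assumption: blocks are anchored at the origin, not at the data region
        key = (math.floor(x / spacing), math.floor(y / spacing))
        blocks[key].append((x, y, v))
    reduced = []
    for key in sorted(blocks):
        xs, ys, vs = zip(*blocks[key])
        # The same reduction is applied to the coordinates and to the values
        reduced.append((median(xs), median(ys), median(vs)))
    east_r, north_r, values_r = (list(column) for column in zip(*reduced))
    return (east_r, north_r), values_r


coordinates = ([0.01, 0.02, 0.04, 0.15], [0.01, 0.03, 0.02, 0.01])
values = [1.0, 3.0, 5.0, 10.0]
coords_reduced, values_reduced = block_median(coordinates, values, spacing=0.1)
print(values_reduced)
```

The three clustered points collapse into a single point, while the isolated fourth point passes through unchanged.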

Remove Trend Before Gridding

python

trend = vd.Trend(degree=2)  # Quadratic
trend.fit(coordinates, values)
residuals = values - trend.predict(coordinates)

# Grid residuals, then add trend back
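What "remove the trend, grid the residuals, add the trend back" amounts to numerically can be sketched with NumPy alone. This is an illustration under stated assumptions, not Verde's code: it uses a degree-1 polynomial (a plane), synthetic data, and a hypothetical `plane_design_matrix` helper.

```python
import numpy as np


def plane_design_matrix(easting, northing):
    """Columns of a degree-1 polynomial: 1, easting, northing."""
    return np.column_stack([np.ones_like(easting), easting, northing])


# Synthetic data: a regional plane plus a short-wavelength signal
rng = np.random.default_rng(42)
easting = rng.uniform(0, 10, 50)
northing = rng.uniform(0, 10, 50)
signal = np.sin(easting) * np.cos(northing)
values = 2.0 + 3.0 * easting - 1.0 * northing + signal

# Fit the plane by least squares and subtract it from the data
design = plane_design_matrix(easting, northing)
coefs, *_ = np.linalg.lstsq(design, values, rcond=None)
residuals = values - design @ coefs

# After the residuals are gridded, the same plane is evaluated on the
# grid nodes and added back to the gridded residuals
grid_east, grid_north = np.meshgrid(np.linspace(0, 10, 5), np.linspace(0, 10, 5))
trend_on_grid = (
    plane_design_matrix(grid_east.ravel(), grid_north.ravel()) @ coefs
).reshape(grid_east.shape)

print(f"std before: {values.std():.2f}, after: {residuals.std():.2f}")
```

The residuals have a much smaller spread than the raw values, which is what makes them easier for a spline to fit.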

Processing Pipeline

python

chain = vd.Chain([
    ('trend', vd.Trend(degree=1)),
    ('reduce', vd.BlockReduce(np.median, spacing=0.05)),
    ('spline', vd.Spline())
])
chain.fit(coordinates, values)
grid = chain.grid(spacing=0.01)
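The pipeline pattern can be illustrated with toy classes in plain Python. The sketch assumes that a chain fits each step in turn, passes reduced data on after a filtering step, passes residuals on after a predicting step, and sums the predictions of all predicting steps. `ToyChain`, `MeanTrend`, `EveryOther`, and `Nearest` are hypothetical stand-ins, not Verde classes.

```python
class MeanTrend:
    """Toy 'trend': predicts the mean of the data it was fitted on."""

    def fit(self, points, data):
        self.mean_ = sum(data) / len(data)
        return self

    def predict(self, points):
        return [self.mean_ for _ in points]


class EveryOther:
    """Toy 'reducer': keeps every second point."""

    def fit(self, points, data):
        return self

    def filter(self, points, data):
        return points[::2], data[::2]


class Nearest:
    """Toy 'gridder': predicts the value of the closest fitted point."""

    def fit(self, points, data):
        self.points_, self.data_ = list(points), list(data)
        return self

    def predict(self, points):
        def closest(p):
            dists = [(p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in self.points_]
            return self.data_[dists.index(min(dists))]

        return [closest(p) for p in points]


class ToyChain:
    """Fit steps in order; predictions are the sum over predicting steps."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, points, data):
        for _, step in self.steps:
            step.fit(points, data)
            if hasattr(step, "filter"):
                # Filtering steps hand reduced data to the next step
                points, data = step.filter(points, data)
            else:
                # Predicting steps hand their residuals to the next step
                data = [d - p for d, p in zip(data, step.predict(points))]
        return self

    def predict(self, points):
        total = [0.0] * len(points)
        for _, step in self.steps:
            if hasattr(step, "predict"):
                total = [t + p for t, p in zip(total, step.predict(points))]
        return total


points = [(0, 0), (1, 0), (2, 0), (3, 0)]
data = [10.0, 12.0, 14.0, 16.0]
chain = ToyChain([("trend", MeanTrend()), ("reduce", EveryOther()), ("nearest", Nearest())])
chain.fit(points, data)
print(chain.predict([(0, 0), (3, 0)]))
```

Because the prediction sums every step, the trend is restored automatically and no manual "add trend back" step is needed.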

Cross-Validation

python

spline = vd.Spline()
scores = vd.cross_val_score(spline, coordinates, values)  # default cv is a shuffled 5-fold KFold
print(f"Mean R2: {scores.mean():.3f}")
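For readers unfamiliar with what the scores mean, the following plain-Python sketch shows shuffled k-fold splitting and the R² score on a 1D toy problem. The `r2_score`, `kfold_indices`, and `nearest_value` helpers are hypothetical and stand in for the library machinery; the nearest-neighbor predictor is only a placeholder for a real gridder.

```python
import random


def r2_score(observed, predicted):
    """Coefficient of determination: 1 is perfect, 0 is no better than the mean."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n_points, n_splits, seed=0):
    """Yield (train, test) index lists for shuffled k-fold splitting."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def nearest_value(train_x, train_v, x):
    """Placeholder interpolator: value of the closest training point."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_v[closest]


xs = [i * 0.1 for i in range(50)]
vs = [2.0 * x + 1.0 for x in xs]

scores = []
for train, test in kfold_indices(len(xs), n_splits=5):
    train_x = [xs[i] for i in train]
    train_v = [vs[i] for i in train]
    predicted = [nearest_value(train_x, train_v, xs[i]) for i in test]
    scores.append(r2_score([vs[i] for i in test], predicted))

print(f"Mean R2: {sum(scores) / len(scores):.3f}")
```

Each point is held out exactly once, so the mean score estimates how well the interpolator predicts values at locations it has not seen.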

Mask Far from Data

python

grid = spline.grid(spacing=0.1)
# When grid= is given, distance_mask returns the grid itself with far-away nodes set to NaN
grid_masked = vd.distance_mask(coordinates, maxdist=0.2, grid=grid)
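The masking rule itself is simple: keep a grid node only if some data point lies within `maxdist` of it. A brute-force sketch in plain Python follows. The `near_data_mask` helper is hypothetical, and whether the library treats the boundary as inclusive is an assumption here.

```python
import math


def near_data_mask(data_points, grid_points, maxdist):
    """Return True for each grid point within maxdist of at least one data point."""
    mask = []
    for gx, gy in grid_points:
        nearest = min(math.hypot(gx - dx, gy - dy) for dx, dy in data_points)
        mask.append(nearest <= maxdist)  # assumption: boundary counts as "near"
    return mask


data_points = [(0.0, 0.0), (1.0, 0.0)]
grid_points = [(0.1, 0.1), (0.5, 0.0), (0.5, 0.5)]
grid_values = [10.0, 20.0, 30.0]

mask = near_data_mask(data_points, grid_points, maxdist=0.2)
masked = [v if keep else math.nan for v, keep in zip(grid_values, mask)]
print(mask)
```

Only the first node survives. A production implementation would use a spatial index rather than this quadratic loop.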

Grid Parameters

Parameter	Description
`spacing`	Grid cell size (same units as coordinates)
`region`	(west, east, south, north) bounds
`shape`	(n_north, n_east) grid dimensions
`adjust`	'spacing' or 'region' - which to adjust for exact fit
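The `adjust` option matters whenever the region width is not an exact multiple of the spacing. The sketch below shows the trade-off along one dimension. It is a simplified illustration with a hypothetical `grid_line` helper, and it assumes the upper edge is the one that moves when the region is adjusted.

```python
def grid_line(start, stop, spacing, adjust="spacing"):
    """Return node positions along one dimension of a regular grid."""
    n_intervals = max(1, round((stop - start) / spacing))
    if adjust == "spacing":
        # Keep the region; stretch or shrink the spacing to fit it exactly
        spacing = (stop - start) / n_intervals
    elif adjust != "region":
        raise ValueError(f"adjust must be 'spacing' or 'region', got {adjust!r}")
    # With adjust="region" the spacing is kept, so the last node lands at
    # start + n_intervals * spacing instead of at stop
    return [start + i * spacing for i in range(n_intervals + 1)]


print(grid_line(0.0, 1.0, 0.3, adjust="spacing"))
print(grid_line(0.0, 1.0, 0.3, adjust="region"))
```

With `adjust="spacing"` the nodes end exactly at 1.0 but sit about 0.333 apart; with `adjust="region"` they stay 0.3 apart and the grid ends near 0.9.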

Gridder Comparison

Gridder	Speed	Smoothness	Extrapolation
`Spline`	Medium	High	Good
`Linear`	Fast	Low	None
`Cubic`	Fast	Medium	None

When to Use vs Alternatives

Use Case	Tool	Why
General spatial gridding	Verde	ML-style API, pipelines, cross-validation
Basic 1D/2D interpolation	scipy.interpolate	Simpler API, no spatial focus
Potential field gridding	Harmonica	Equivalent sources designed for gravity/magnetics
Command-line batch gridding	GMT	Powerful CLI, good for automation scripts
Geostatistical interpolation	scikit-gstat / pykrige	Variogram-based with uncertainty
Very large datasets (10M+ pts)	GMT / GDAL	Better memory handling at scale
Vector data (GPS velocities)	Verde (`Vector`)	Built-in 2-component vector gridding
Trend removal + gridding	Verde (`Chain`)	Pipeline combines steps cleanly

Choose Verde when: You need a Pythonic, scikit-learn-style API for gridding scattered spatial data with built-in cross-validation, trend removal, and pipelines. Ideal for exploratory analysis and reproducible workflows.

Choose scipy.interpolate when: You have a simple interpolation task without spatial coordinates, projections, or need for validation.

Choose GMT when: You need command-line batch processing of large datasets, or are integrating with shell-based workflows and need the `surface` or `nearneighbor` tools.

Common Workflows

Grid Scattered Spatial Data with Validation

Load scattered point data (coordinates + values)
Project geographic coordinates to Cartesian if needed
Inspect data distribution and identify clusters or gaps
Apply `BlockReduce` to decimate dense clusters
Remove regional trend with `Trend(degree=1)` or `Trend(degree=2)`
Cross-validate gridder parameters with `cross_val_score()`
Tune `Spline(damping=...)` or `Spline(mindist=...)` based on CV scores
Fit chosen gridder (Spline, Linear, or Cubic) on residuals
Grid onto regular spacing with `.grid()`
Add trend back to gridded residuals
Apply `distance_mask()` to clip extrapolation artifacts
Visualize grid with xarray plotting: `grid.elevation.plot()`
Save result to NetCDF with `grid.to_netcdf()`

Common Issues

Issue	Solution
Poor extrapolation	Use `distance_mask()` to mask far from data
Slow with large data	Use `BlockReduce` first
Regional trends	Remove with `Trend` before gridding
Wrong spacing	Check coordinate units (degrees vs meters)

References

Gridders - Available gridders and parameters
Cross-Validation - Parameter tuning methods

Scripts

scripts/grid_data.py - Grid scattered data to NetCDF
