Verde - Spatial Data Gridding

Quick Reference

```python
import verde as vd

# Basic gridding
spline = vd.Spline()
spline.fit(coordinates, values)  # coordinates = (lon, lat) tuple
# Name the data variable so it can be accessed as grid.elevation
grid = spline.grid(spacing=0.1, data_names=['elevation'])  # Returns xarray Dataset

# Access result
elevation = grid.elevation.values

# Save output
grid.to_netcdf('output.nc')
```

Key Classes

| Class | Purpose |
| --- | --- |
| `Spline` | Bi-harmonic spline interpolation (smooth, good extrapolation) |
| `Linear` | Delaunay triangulation (fast, no extrapolation) |
| `Cubic` | Cubic interpolation (medium smoothness) |
| `Chain` | Pipeline of processing steps |
| `BlockReduce` | Decimate data to block means/medians |
| `Trend` | Polynomial trend fitting and removal |
| `Vector` | Grid 2-component vector data |

Essential Operations

Grid Scattered Data

```python
coordinates = (longitude, latitude)  # Tuple of 1D arrays
values = elevation  # 1D array

spline = vd.Spline()
spline.fit(coordinates, values)
grid = spline.grid(spacing=0.1, data_names=['elevation'])
```

Project to Cartesian

```python
import pyproj

# Mercator projection with true scale at the mean latitude of the data
projection = pyproj.Proj(proj='merc', lat_ts=latitude.mean())
proj_coords = projection(longitude, latitude)  # (easting, northing) in meters

spline = vd.Spline()
spline.fit(proj_coords, values)
grid = spline.grid(spacing=1000)  # 1000m spacing
```

Block Reduce Large Datasets

```python
import numpy as np

reducer = vd.BlockReduce(reduction=np.median, spacing=0.1)
coords_reduced, values_reduced = reducer.filter(coordinates, values)
```
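
To make the effect of block reduction concrete, here is a minimal pure-Python sketch of the idea. It is not Verde's implementation: the helper name `block_median` and the sample points are assumptions made for illustration. Points are binned into square blocks of side `spacing`, and each occupied block is replaced by a single point carrying the median coordinates and the median value.

```python
import math
import statistics
from collections import defaultdict


def block_median(easting, northing, values, spacing):
    """Replace all points that fall in the same square block by their medians."""
    blocks = defaultdict(list)
    for x, y, v in zip(easting, northing, values):
        # Integer block index along each axis
        key = (math.floor(x / spacing), math.floor(y / spacing))
        blocks[key].append((x, y, v))
    reduced = []
    for key in sorted(blocks):
        xs, ys, vs = zip(*blocks[key])
        reduced.append(
            (statistics.median(xs), statistics.median(ys), statistics.median(vs))
        )
    return reduced


# Hypothetical sample: three clustered points (one is an outlier) and one isolated point
easting = [0.01, 0.02, 0.03, 0.55]
northing = [0.01, 0.03, 0.02, 0.55]
values = [10.0, 12.0, 50.0, 7.0]

reduced = block_median(easting, northing, values, spacing=0.1)
for point in reduced:
    print(point)
```

The cluster collapses to one point and the outlier value of 50.0 does not survive, which is why the median is a common choice of reduction for noisy, oversampled tracks.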

Remove Trend Before Gridding

```python
trend = vd.Trend(degree=2)  # Quadratic
trend.fit(coordinates, values)
residuals = values - trend.predict(coordinates)

# Grid residuals, then add trend back
```
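
The decomposition behind trend removal can be sketched without any libraries. The example below is an illustration under stated assumptions, not Verde's code: `fit_plane` and `predict_plane` are hypothetical helpers that fit a degree-1 trend (a plane) by least squares, and the data are a synthetic plane plus one local bump. It shows that the residuals are much smaller than the raw values, and that adding the trend back recovers the original data.

```python
def fit_plane(x, y, v):
    """Least-squares fit of v ~ a + b*x + c*y using the 3x3 normal equations."""
    rows = [(1.0, xi, yi) for xi, yi in zip(x, y)]
    # Augmented matrix [A^T A | A^T v]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * vi for r, vi in zip(rows, v))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def predict_plane(coefs, x, y):
    """Evaluate the fitted plane at the given coordinates."""
    a, b, c = coefs
    return [a + b * xi + c * yi for xi, yi in zip(x, y)]


# Synthetic data on a 3x3 layout: regional plane plus a bump at the centre
x = [0.0, 1.0, 2.0] * 3
y = [0.0] * 3 + [1.0] * 3 + [2.0] * 3
bump = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
values = [5.0 + 2.0 * xi - 3.0 * yi + bi for xi, yi, bi in zip(x, y, bump)]

coefs = fit_plane(x, y, values)
trend = predict_plane(coefs, x, y)
residuals = [v - t for v, t in zip(values, trend)]
restored = [r + t for r, t in zip(residuals, trend)]

print("value range:   ", max(values) - min(values))
print("residual range:", max(residuals) - min(residuals))
```

In a real workflow the residuals would be interpolated by the gridder, and the trend evaluated at the grid nodes would be added to the gridded residuals.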

Processing Pipeline

```python
chain = vd.Chain([
    ('trend', vd.Trend(degree=1)),
    ('reduce', vd.BlockReduce(np.median, spacing=0.05)),
    ('spline', vd.Spline())
])
chain.fit(coordinates, values)
grid = chain.grid(spacing=0.01)
```
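
The way a chain of estimators combines can be sketched with toy classes. This is a simplified 1D illustration, not Verde's `Chain`: the classes `Constant`, `Slope` and `ResidualChain` are hypothetical, and filter steps such as `BlockReduce` are left out. Each step is fitted to the residuals left by the previous steps, and the final prediction is the sum of all step predictions.

```python
class Constant:
    """Toy estimator: predicts the mean of the data it was fitted to."""

    def fit(self, x, v):
        self.c = sum(v) / len(v)
        return self

    def predict(self, x):
        return [self.c for _ in x]


class Slope:
    """Toy estimator: least-squares line through the origin."""

    def fit(self, x, v):
        self.m = sum(xi * vi for xi, vi in zip(x, v)) / sum(xi * xi for xi in x)
        return self

    def predict(self, x):
        return [self.m * xi for xi in x]


class ResidualChain:
    """Fit each step on the residuals of the previous ones; sum the predictions."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, x, v):
        residuals = list(v)
        for _name, step in self.steps:
            step.fit(x, residuals)
            predicted = step.predict(x)
            residuals = [r - p for r, p in zip(residuals, predicted)]
        return self

    def predict(self, x):
        total = [0.0] * len(x)
        for _name, step in self.steps:
            total = [t + p for t, p in zip(total, step.predict(x))]
        return total


x = [-1.0, 0.0, 1.0]
values = [1.0, 3.0, 5.0]  # v = 3 + 2x

chain = ResidualChain([("offset", Constant()), ("slope", Slope())])
chain.fit(x, values)
prediction = chain.predict([2.0])
print(prediction)
```

Because every step only sees what the earlier steps could not explain, the order of the steps matters: a trend placed first removes the regional signal before the later steps run.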

Cross-Validation

```python
from sklearn.model_selection import KFold

spline = vd.Spline()
# cv takes a scikit-learn splitter object rather than an integer
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = vd.cross_val_score(spline, coordinates, values, cv=cv)
print(f"Mean R2: {scores.mean():.3f}")
```
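
What a k-fold R2 score measures can be shown with a standard-library sketch. Everything here is an assumption for illustration: `kfold_indices` stands in for a scikit-learn splitter, `nearest_value` is a deliberately crude nearest-neighbour predictor standing in for a gridder, and the data are synthetic. Each fold is held out in turn, the predictor uses only the remaining points, and R2 is computed on the held-out points.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle range(n) and deal the indices into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_value(train, x, y):
    """Predict with the value of the closest training point."""
    return min(train, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)[2]


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic scattered data: a smooth field sampled at random locations
rng = random.Random(42)
points = []
for _ in range(100):
    x, y = rng.random(), rng.random()
    points.append((x, y, x + 2.0 * y))

scores = []
for fold in kfold_indices(len(points), k=5):
    held_out = set(fold)
    train = [p for i, p in enumerate(points) if i not in held_out]
    test = [points[i] for i in fold]
    predicted = [nearest_value(train, px, py) for px, py, _ in test]
    scores.append(r2_score([v for _, _, v in test], predicted))

print(f"Mean R2: {sum(scores) / len(scores):.3f}")
```

A score of 1 means the held-out points were predicted exactly. Scores near or below 0 mean the predictor does no better than the mean of the held-out values.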

Mask Far from Data

```python
grid = spline.grid(spacing=0.1)
# When grid= is passed, distance_mask returns the grid with far-away nodes set to NaN
grid_masked = vd.distance_mask(coordinates, maxdist=0.2, grid=grid)
```
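
The rule behind distance masking is simple enough to sketch by brute force. The helper `within_maxdist` and the sample points below are assumptions for illustration, not Verde's implementation (which avoids checking every pair of points). A grid node is kept only if its nearest data point lies within `maxdist`.

```python
import math


def within_maxdist(data_points, grid_points, maxdist):
    """True for grid nodes whose nearest data point is at most maxdist away."""
    mask = []
    for gx, gy in grid_points:
        nearest = min(math.hypot(gx - dx, gy - dy) for dx, dy in data_points)
        mask.append(nearest <= maxdist)
    return mask


data_points = [(0.0, 0.0), (1.0, 0.0)]
grid_points = [(0.1, 0.1), (0.5, 0.5), (3.0, 3.0)]
grid_values = [1.0, 2.0, 3.0]

mask = within_maxdist(data_points, grid_points, maxdist=0.2)
# Nodes that fail the test are replaced by NaN
masked = [v if keep else math.nan for v, keep in zip(grid_values, mask)]
print(mask)  # [True, False, False]
```

`maxdist` is in the same units as the coordinates, so a value that suits degrees will be far too small for projected coordinates in meters.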

Grid Parameters

| Parameter | Description |
| --- | --- |
| `spacing` | Grid cell size (same units as coordinates) |
| `region` | (west, east, south, north) bounds |
| `shape` | (n_north, n_east) grid dimensions |
| `adjust` | 'spacing' or 'region' - which to adjust for exact fit |
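
How `region`, `spacing`, `shape` and `adjust` relate can be worked through with a little arithmetic. The sketch below is an assumption-laden illustration rather than Verde's code: `axis_layout` and `grid_shape` are hypothetical helpers, and when the region is adjusted this sketch moves only the upper boundary. When the region length is not a whole multiple of the spacing, either the spacing is stretched to fit the region or the region is changed to fit the spacing.

```python
def axis_layout(start, stop, spacing, adjust="spacing"):
    """Return (start, stop, spacing, n_nodes) for one grid axis."""
    n_intervals = max(1, round((stop - start) / spacing))
    if adjust == "spacing":
        # Keep the region, change the spacing so the nodes fit exactly
        spacing = (stop - start) / n_intervals
    elif adjust == "region":
        # Keep the spacing, move the upper boundary (assumption of this sketch)
        stop = start + n_intervals * spacing
    else:
        raise ValueError("adjust must be 'spacing' or 'region'")
    return start, stop, spacing, n_intervals + 1


def grid_shape(region, spacing, adjust="spacing"):
    """Shape as (n_north, n_east) for a (west, east, south, north) region."""
    west, east, south, north = region
    n_east = axis_layout(west, east, spacing, adjust)[3]
    n_north = axis_layout(south, north, spacing, adjust)[3]
    return (n_north, n_east)


print(axis_layout(0.0, 10.0, 3.0, adjust="spacing"))
print(axis_layout(0.0, 10.0, 3.0, adjust="region"))
print(grid_shape((0.0, 10.0, 0.0, 4.0), spacing=3.0))
```

Note that the shape lists the north-south count first, matching the row-major layout of the gridded arrays.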

Gridder Comparison

| Gridder | Speed | Smoothness | Extrapolation |
| --- | --- | --- | --- |
| `Spline` | Medium | High | Good |
| `Linear` | Fast | Low | None |
| `Cubic` | Fast | Medium | None |

When to Use vs Alternatives

| Use Case | Tool | Why |
| --- | --- | --- |
| General spatial gridding | Verde | ML-style API, pipelines, cross-validation |
| Basic 1D/2D interpolation | scipy.interpolate | Simpler API, no spatial focus |
| Potential field gridding | Harmonica | Equivalent sources designed for gravity/magnetics |
| Command-line batch gridding | GMT | Powerful CLI, good for automation scripts |
| Geostatistical interpolation | scikit-gstat / pykrige | Variogram-based with uncertainty |
| Very large datasets (10M+ pts) | GMT / GDAL | Better memory handling at scale |
| Vector data (GPS velocities) | Verde (`Vector`) | Built-in 2-component vector gridding |
| Trend removal + gridding | Verde (`Chain`) | Pipeline combines steps cleanly |

Choose Verde when: You need a Pythonic, scikit-learn-style API for gridding scattered spatial data with built-in cross-validation, trend removal, and pipelines. Ideal for exploratory analysis and reproducible workflows.
Choose scipy.interpolate when: You have a simple interpolation task without spatial coordinates, projections, or need for validation.
Choose GMT when: You need command-line batch processing of large datasets or are integrating with shell-based workflows and need `surface` or `nearneighbor`.

Common Workflows

Grid Scattered Spatial Data with Validation

  • Load scattered point data (coordinates + values)
  • Project geographic coordinates to Cartesian if needed
  • Inspect data distribution and identify clusters or gaps
  • Apply `BlockReduce` to decimate dense clusters
  • Remove regional trend with `Trend(degree=1)` or `Trend(degree=2)`
  • Cross-validate gridder parameters with `cross_val_score()`
  • Tune `Spline(damping=...)` or `Spline(mindist=...)` based on CV scores
  • Fit chosen gridder (Spline, Linear, or Cubic) on residuals
  • Grid onto regular spacing with `.grid()`
  • Add trend back to gridded residuals
  • Apply `distance_mask()` to clip extrapolation artifacts
  • Visualize grid with xarray plotting: `grid.elevation.plot()`
  • Save result to NetCDF with `grid.to_netcdf()`

Common Issues

| Issue | Solution |
| --- | --- |
| Poor extrapolation | Use `distance_mask()` to mask far from data |
| Slow with large data | Use `BlockReduce` first |
| Regional trends | Remove with `Trend` before gridding |
| Wrong spacing | Check coordinate units (degrees vs meters) |
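
For the spacing issue, a quick back-of-the-envelope conversion shows how different the two unit systems are. This is a rough sketch on a spherical Earth, with the mean radius and the helper name `degree_spacing_to_metres` as assumptions. It is meant as a sanity check on a chosen spacing, not as a substitute for a proper projection.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def degree_spacing_to_metres(spacing_deg, latitude_deg):
    """Approximate ground size (east-west, north-south) of a spacing in degrees."""
    metres_per_degree = 2.0 * math.pi * EARTH_RADIUS_M / 360.0
    north_south = spacing_deg * metres_per_degree
    # Meridians converge towards the poles, so east-west distances shrink
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


east_west, north_south = degree_spacing_to_metres(0.1, latitude_deg=60.0)
print(f"0.1 degrees at 60N: {east_west:.0f} m east-west, {north_south:.0f} m north-south")
```

A spacing of 0.1 is therefore roughly 11 km north-south in geographic coordinates, but only 10 cm if the coordinates have been projected to meters.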

References

  • Gridders - Available gridders and parameters
  • Cross-Validation - Parameter tuning methods

Scripts

  • scripts/grid_data.py - Grid scattered data to NetCDF