verde
Verde - Spatial Data Gridding
Quick Reference
```python
import verde as vd

# Basic gridding
spline = vd.Spline()
spline.fit(coordinates, values)  # coordinates = (lon, lat) tuple
grid = spline.grid(spacing=0.1, data_names=["elevation"])  # Returns xarray Dataset

# Access result (the variable name comes from data_names)
elevation = grid.elevation.values

# Save output
grid.to_netcdf('output.nc')
```
Key Classes
| Class | Purpose |
|---|---|
| `Spline` | Bi-harmonic spline interpolation (smooth, good extrapolation) |
| `Linear` | Delaunay triangulation (fast, no extrapolation) |
| `Cubic` | Cubic interpolation (medium smoothness) |
| `Chain` | Pipeline of processing steps |
| `BlockReduce` | Decimate data to block means/medians |
| `Trend` | Polynomial trend fitting and removal |
| `Vector` | Grid 2-component vector data |
Essential Operations
Grid Scattered Data
```python
coordinates = (longitude, latitude)  # Tuple of 1D arrays
values = elevation  # 1D array
spline = vd.Spline()
spline.fit(coordinates, values)
grid = spline.grid(spacing=0.1, data_names=['elevation'])
```
Project to Cartesian
```python
import pyproj

# Mercator projection with true scale at the mean latitude of the data
projection = pyproj.Proj(proj='merc', lat_ts=latitude.mean())
proj_coords = projection(longitude, latitude)  # (easting, northing) in meters
spline = vd.Spline()
spline.fit(proj_coords, values)
grid = spline.grid(spacing=1000)  # 1000 m spacing
```
Block Reduce Large Datasets
```python
import numpy as np

reducer = vd.BlockReduce(reduction=np.median, spacing=0.1)
coords_reduced, values_reduced = reducer.filter(coordinates, values)
```
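To make the decimation step concrete, here is a minimal, library-free sketch of block-median reduction. It is not Verde's implementation: the helper `block_median` is hypothetical, and anchoring the blocks at the origin is an assumption made for illustration (Verde derives its blocks from the data region).

```python
from collections import defaultdict
from statistics import median


def block_median(easting, northing, values, spacing):
    """Reduce scattered points to one median point per square block.

    Hypothetical helper for illustration only. Blocks are anchored at the
    origin (0, 0), which is an assumption and not how Verde lays them out.
    """
    blocks = defaultdict(list)
    for x, y, v in zip(easting, northing, values):
        # Integer block index along each axis
        key = (int(x // spacing), int(y // spacing))
        blocks[key].append((x, y, v))
    reduced = []
    for key in sorted(blocks):
        xs, ys, vs = zip(*blocks[key])
        # Median of the coordinates and of the values inside the block
        reduced.append((median(xs), median(ys), median(vs)))
    return reduced


# Three points share the block [0, 1) x [0, 1); one point sits alone.
east = [0.1, 0.2, 0.3, 1.5]
north = [0.1, 0.2, 0.3, 1.5]
vals = [10.0, 20.0, 60.0, 5.0]
reduced = block_median(east, north, vals, spacing=1.0)
print(reduced)  # [(0.2, 0.2, 20.0), (1.5, 1.5, 5.0)]
```

The dense cluster collapses to a single point, and the outlier value of 60.0 does not drag the block value upward as a mean would.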
Remove Trend Before Gridding
```python
trend = vd.Trend(degree=2)  # Quadratic
trend.fit(coordinates, values)
residuals = values - trend.predict(coordinates)

# Grid residuals, then add trend back
```
Processing Pipeline
```python
chain = vd.Chain([
    ('trend', vd.Trend(degree=1)),
    ('reduce', vd.BlockReduce(np.median, spacing=0.05)),
    ('spline', vd.Spline())
])
chain.fit(coordinates, values)
grid = chain.grid(spacing=0.01)
```
Cross-Validation
```python
from sklearn.model_selection import KFold

spline = vd.Spline()
# cross_val_score expects a scikit-learn splitter object rather than an integer
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = vd.cross_val_score(spline, coordinates, values, cv=cv)
print(f"Mean R2: {scores.mean():.3f}")
```
Mask Far from Data
```python
grid = spline.grid(spacing=0.1)
# With grid=..., distance_mask returns the grid already masked (NaN far from data)
grid_masked = vd.distance_mask(coordinates, maxdist=0.2, grid=grid)
```
Grid Parameters
| Parameter | Description |
|---|---|
| `spacing` | Grid cell size (same units as coordinates) |
| `region` | (west, east, south, north) bounds |
| `shape` | (n_north, n_east) grid dimensions |
| `adjust` | 'spacing' or 'region' - which to adjust for exact fit |
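A region is rarely an exact multiple of the requested spacing, which is why the choice between adjusting 'spacing' or 'region' exists. The sketch below works through that arithmetic for one dimension. The helper `fit_spacing` is hypothetical and assumes the number of intervals is rounded to the nearest integer; check Verde's documentation for the exact rule it applies.

```python
def fit_spacing(start, stop, spacing, adjust="spacing"):
    """Return (n_points, spacing, stop) for one grid dimension.

    Hypothetical helper: assumes the number of intervals is rounded to the
    nearest integer, then either the spacing or the upper bound is changed
    so that grid nodes land exactly on both ends.
    """
    n_points = int(round((stop - start) / spacing)) + 1
    if adjust == "spacing":
        # Keep the region, stretch or shrink the spacing
        spacing = (stop - start) / (n_points - 1)
    elif adjust == "region":
        # Keep the spacing, move the upper bound
        stop = start + (n_points - 1) * spacing
    else:
        raise ValueError("adjust must be 'spacing' or 'region'")
    return n_points, spacing, stop


# A 10-unit-wide region with a requested spacing of 3 units
keep_region = fit_spacing(0.0, 10.0, 3.0, adjust="spacing")
keep_spacing = fit_spacing(0.0, 10.0, 3.0, adjust="region")
print(keep_region)   # 4 nodes, spacing stretched to about 3.33, region unchanged
print(keep_spacing)  # 4 nodes, spacing kept at 3.0, upper bound moved to 9.0
```

Adjusting the spacing preserves the requested bounds, while adjusting the region preserves the requested cell size; which matters more depends on whether the grid must align with other data.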
Gridder Comparison
| Gridder | Speed | Smoothness | Extrapolation |
|---|---|---|---|
| `Spline` | Medium | High | Good |
| `Linear` | Fast | Low | None |
| `Cubic` | Fast | Medium | None |
When to Use vs Alternatives
| Use Case | Tool | Why |
|---|---|---|
| General spatial gridding | Verde | ML-style API, pipelines, cross-validation |
| Basic 1D/2D interpolation | scipy.interpolate | Simpler API, no spatial focus |
| Potential field gridding | Harmonica | Equivalent sources designed for gravity/magnetics |
| Command-line batch gridding | GMT | Powerful CLI, good for automation scripts |
| Geostatistical interpolation | scikit-gstat / pykrige | Variogram-based with uncertainty |
| Very large datasets (10M+ pts) | GMT / GDAL | Better memory handling at scale |
| Vector data (GPS velocities) | Verde (`Vector`) | Built-in 2-component vector gridding |
| Trend removal + gridding | Verde (`Chain`) | Pipeline combines steps cleanly |
Choose Verde when: You need a Pythonic, scikit-learn-style API for gridding
scattered spatial data with built-in cross-validation, trend removal, and pipelines.
Ideal for exploratory analysis and reproducible workflows.
Choose scipy.interpolate when: You have a simple interpolation task without
spatial coordinates, projections, or need for validation.
Choose GMT when: You need command-line batch processing of large datasets or
are integrating with shell-based workflows and need `surface` or `nearneighbor`.
Common Workflows
Grid Scattered Spatial Data with Validation
- Load scattered point data (coordinates + values)
- Project geographic coordinates to Cartesian if needed
- Inspect data distribution and identify clusters or gaps
- Apply `BlockReduce` to decimate dense clusters
- Remove regional trend with `Trend(degree=1)` or `Trend(degree=2)`
- Cross-validate gridder parameters with `cross_val_score()`
- Tune `Spline(damping=...)` or `Spline(mindist=...)` based on CV scores
- Fit chosen gridder (Spline, Linear, or Cubic) on residuals
- Grid onto regular spacing with `.grid()`
- Add trend back to gridded residuals
- Apply `distance_mask()` to clip extrapolation artifacts
- Visualize grid with xarray plotting: `grid.elevation.plot()`
- Save result to NetCDF with `grid.to_netcdf()`
Common Issues
| Issue | Solution |
|---|---|
| Poor extrapolation | Use `Spline` for better extrapolation, or `distance_mask()` to clip it |
| Slow with large data | Use `BlockReduce` first |
| Regional trends | Remove with `Trend` before gridding |
| Wrong spacing | Check coordinate units (degrees vs meters) |
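The unit check in the last row is easy to get wrong because a spacing of 0.1 is tiny in meters but roughly 11 km in degrees. The sketch below converts a spacing in degrees to approximate meters. It assumes a spherical Earth with a 6371 km mean radius, and `degrees_to_meters` is a hypothetical helper, not a Verde function.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean radius of a spherical Earth (assumption)


def degrees_to_meters(spacing_deg, latitude_deg):
    """Approximate (east-west, north-south) size in meters of a spacing in degrees."""
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = spacing_deg * meters_per_degree
    # Meridians converge toward the poles, so east-west distances shrink
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


ew_equator, ns_equator = degrees_to_meters(0.1, 0.0)
ew_60, ns_60 = degrees_to_meters(0.1, 60.0)
print(round(ns_equator), round(ew_60))  # 11119 5560
```

The same 0.1 degree cell is about half as wide at 60 degrees latitude as at the equator, which is one reason to project to Cartesian coordinates before gridding.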
References
- Gridders - Available gridders and parameters
- Cross-Validation - Parameter tuning methods
Scripts
- scripts/grid_data.py - Grid scattered data to NetCDF