earth2studio-deterministic-forecast
Earth2Studio Deterministic Forecast Skill
Guide users through building deterministic (single-member) weather forecast
inference scripts using `earth2studio.run.deterministic`.
Prerequisites
- Earth2Studio installed with CUDA-capable GPU
- Python 3.10+, network access for model weights and data
Live Doc References
Fetch relevant docs to verify current APIs before recommending components:
| Component | URL |
|---|---|
| Prognostic models | https://nvidia.github.io/earth2studio/modules/models_px.html |
| Data sources (analysis) | https://nvidia.github.io/earth2studio/modules/datasources_analysis.html |
| Data sources (forecast) | https://nvidia.github.io/earth2studio/modules/datasources_forecast.html |
| IO backends | https://nvidia.github.io/earth2studio/modules/io.html |
| Deterministic workflow source (`run.py`) | https://github.com/NVIDIA/earth2studio/blob/main/earth2studio/run.py |
Workflow
1. Gather Requirements (skip what's already provided)
- Time horizon (hours/days/weeks)
- Variables of interest (t2m, wind, geopotential, etc.)
- Region (global or specific like CONUS)
- GPU/VRAM available
2. Select Model
Fetch the prognostic models page. Filter by time horizon, region, and VRAM. Note the model's:
- Input variables (`input_coords["variable"]`)
- Time step size (`output_coords["lead_time"]`)
3. Select Data Source
The data source must provide all model input variables. Verify via the lexicon at
`earth2studio/lexicon/<source>.py`. Common pairings: Global models → GFS/ARCO/IFS;
Regional → HRRR.
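The compatibility check described above amounts to a set difference between the model's input variables and the data source's vocabulary. A minimal sketch, using a hand-written toy vocabulary rather than a real lexicon; `missing_variables`, `toy_vocab`, and the variable names in `model_inputs` are hypothetical and only illustrate the idea. In practice the keys would come from the `VOCAB` mapping in the chosen source's lexicon file.

```python
from typing import Iterable, Mapping


def missing_variables(model_vars: Iterable[str], vocab: Mapping[str, object]) -> list[str]:
    """Return the model input variables that the data source vocabulary lacks."""
    return sorted(v for v in set(model_vars) if v not in vocab)


# Toy stand-in for a lexicon VOCAB mapping (assumption: only the keys matter here)
toy_vocab = {"t2m": "...", "u10m": "...", "v10m": "...", "z500": "..."}

# Hypothetical model input list; a real one comes from input_coords["variable"]
model_inputs = ["t2m", "u10m", "v10m", "z500", "tcwv"]

gaps = missing_variables(model_inputs, toy_vocab)
if gaps:
    print(f"Data source cannot supply: {gaps}")
```

An empty result means the pairing is usable; any listed name means a different data source (or model) is needed.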
4. Select IO Backend
Default: `ZarrBackend`. Use `NetCDF4Backend` for legacy tools,
`XarrayBackend` for in-memory/small runs.
5. Calculate nsteps
`nsteps = forecast_hours // model_step_hours` (integer division, since `nsteps` must be a whole number)
Example: 5-day forecast with 6h step → `nsteps = 120 // 6 = 20`
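The arithmetic above can be wrapped in a small helper that also catches horizons the model step cannot hit exactly. This is a sketch only; `compute_nsteps` is a hypothetical name, not part of Earth2Studio, and raising on a non-multiple is a choice made here (plain floor division would silently shorten the forecast instead).

```python
def compute_nsteps(forecast_hours: int, model_step_hours: int) -> int:
    """Number of model steps needed to cover forecast_hours exactly."""
    if forecast_hours <= 0 or model_step_hours <= 0:
        raise ValueError("forecast_hours and model_step_hours must be positive")
    nsteps, remainder = divmod(forecast_hours, model_step_hours)
    if remainder:
        raise ValueError(
            f"{forecast_hours}h is not a multiple of the {model_step_hours}h model step"
        )
    return nsteps


# 5-day forecast with a 6-hour model step
nsteps = compute_nsteps(forecast_hours=5 * 24, model_step_hours=6)
print(nsteps)
```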
6. Decide: output_coords Filtering
- Filter variables (`output_coords`) when user requests specific variables (e.g., "t2m and wind") - reduces output size
- Save all variables (omit `output_coords`) when user says "all variables" or doesn't specify - preserves full model output
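The decision above can be captured in one helper that returns either a variable filter or an empty mapping. A minimal sketch: `build_output_coords` is a hypothetical name, and treating an empty `OrderedDict` as "no filtering" is an assumption about how `deterministic` handles its `output_coords` argument, so check the live `run.py` source before relying on it.

```python
from collections import OrderedDict
from typing import Optional, Sequence

import numpy as np


def build_output_coords(variables: Optional[Sequence[str]] = None) -> OrderedDict:
    """Build an output_coords mapping; an empty mapping means keep all variables."""
    if not variables:
        return OrderedDict()
    # Preserve the requested order while dropping duplicates
    unique = list(dict.fromkeys(variables))
    return OrderedDict({"variable": np.array(unique)})


subset = build_output_coords(["t2m", "u10m", "t2m"])  # user asked for specific variables
everything = build_output_coords(None)                # user asked for all variables
print(list(subset["variable"]), len(everything))
```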
7. Generate Script
```python
from collections import OrderedDict

import numpy as np
import torch

from earth2studio.models.px import <ModelClass>
from earth2studio.data import <DataSourceClass>
from earth2studio.io import <IOBackendClass>
from earth2studio.run import deterministic

# Load the model with its default pretrained weights
model = <ModelClass>.load_model(<ModelClass>.load_default_package())
data = <DataSourceClass>()
io = <IOBackendClass>("<output_path>")

# Include output_coords ONLY if user requested specific variables
output_coords = OrderedDict({"variable": np.array(["t2m", "u10m"])})

io = deterministic(
    time=["YYYY-MM-DDTHH:MM:SS"],
    nsteps=<N>,
    prognostic=model,
    data=data,
    io=io,
    output_coords=output_coords,  # omit if saving all variables
    device=torch.device("cuda"),
)
```
8. Manual Loop Alternative
When the user explicitly requests a manual implementation (NOT using `earth2studio.run.deterministic`), follow this checklist in order. Signatures follow `run.py`; verify them against the live source:
- fetch_data - Get initial conditions, with `ic = model.input_coords()`: `x, coords = fetch_data(data, time, variable=ic["variable"], lead_time=ic["lead_time"], device=device)`
- Setup total_coords - Build coordinate arrays for the time and lead_time dimensions (`nsteps + 1` lead times, because the initial state is written as well)
- io.add_array - Initialize IO backend with total_coords before loop
- create_iterator - Create prognostic iterator: `model_iter = model.create_iterator(x, coords)`
- Loop through nsteps - `for step, (x, coords) in enumerate(model_iter):`; the iterator yields the initial state first, so stop with `if step == nsteps: break` after writing
- map_coords - Filter output variables if needed: `x_out, coords_out = map_coords(x, coords, output_coords)`
- split_coords - Prepare for IO write: `split_coords(x_out, coords_out)` returns the per-variable tensors, the remaining coordinates, and the array names
- io.write - Write each step to backend: `io.write(*split_coords(x_out, coords_out))`
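The time and lead_time arrays for the total_coords step can be built with NumPy alone. A minimal sketch, assuming a 6-hour model step and assuming the iterator yields the initial state first (as the loop in `run.py` does), which is why there are `nsteps + 1` lead times. The initialization date is illustrative, and `valid_time` is computed here only as a sanity check on the forecast horizon.

```python
import numpy as np

nsteps = 20
step = np.timedelta64(6, "h")  # assumed model step; take it from output_coords["lead_time"]

# One initialization time (illustrative value)
time = np.array(["2024-01-01T00:00:00"], dtype="datetime64[ns]")

# Lead times 0h, 6h, ..., nsteps * 6h: the initial state plus nsteps forecast steps
lead_time = np.arange(nsteps + 1) * step

# Valid time of every output slice, shaped (time, lead_time)
valid_time = time[:, None] + lead_time[None, :]

print(lead_time.shape, lead_time[-1], valid_time[0, -1])
```

These two arrays, together with the model's spatial coordinates, are what the IO backend needs before the loop starts.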
9. Explain Next Steps
- How to change forecast time or run multiple initializations
- How to read output (`xr.open_zarr(...)`)
- Point to diagnostic workflow for post-processing
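For the multiple-initializations case, the `time` argument in the generated script is already a list, so a list of ISO 8601 strings can be generated and passed in. A minimal sketch using only the standard library; `init_times` is a hypothetical helper, and it assumes each list entry is treated as a separate initialization time.

```python
from datetime import datetime, timedelta


def init_times(start: str, count: int, every_hours: int) -> list[str]:
    """Evenly spaced initialization times as YYYY-MM-DDTHH:MM:SS strings."""
    first = datetime.fromisoformat(start)
    return [
        (first + timedelta(hours=every_hours * i)).strftime("%Y-%m-%dT%H:%M:%S")
        for i in range(count)
    ]


# Four initializations, 12 hours apart
times = init_times("2024-01-01T00:00:00", count=4, every_hours=12)
for t in times:
    print(t)
```

The resulting list can replace the single-element `time=[...]` list in the script.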
Ownership
Owns: Model selection, data source compatibility, IO backend selection,
nsteps calculation, generating `earth2studio.run.deterministic` scripts.
Does not own: Ensemble workflows, diagnostics, data-only fetch, installation,
model training.
Troubleshooting
See `references/troubleshooting.md` for common errors and solutions.
Reminders
- Always fetch live docs before recommending models or data sources - APIs change between releases
- Verify lexicon compatibility - Model input variables must exist in data source's VOCAB
- Use `load_default_package()` - This is the standard pattern for loading model weights
- Time format is ISO 8601 - Use `"YYYY-MM-DDTHH:MM:SS"` format for the `time` argument
- Wind speed needs both components - If user asks for "wind speed", include both `u10m` and `v10m`
- nsteps is integer division - `nsteps = total_hours // model_step_hours`
- ZarrBackend is the default - Only suggest alternatives if user has specific requirements
- GPU is required - All prognostic models require CUDA; CPU inference is not supported
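The wind speed reminder follows from the definition of speed as the magnitude of the horizontal wind vector, which is why both components have to be saved. A minimal sketch of the post-processing step with NumPy; `wind_speed` is a hypothetical helper and the sample values are made up for illustration.

```python
import numpy as np


def wind_speed(u10m: np.ndarray, v10m: np.ndarray) -> np.ndarray:
    """10 m wind speed from its eastward (u) and northward (v) components."""
    return np.hypot(u10m, v10m)


# Made-up component values in m/s
u = np.array([3.0, 0.0, -6.0])
v = np.array([4.0, 2.0, 8.0])

speed = wind_speed(u, v)
print(speed)
```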