ai-ml-timeseries
Time Series Forecasting: Modern Patterns & Production Best Practices
Modern Best Practices (January 2026):
- Treat time as a first-class axis: temporal splits, rolling backtests, and point-in-time correctness.
- Default to strong baselines (naive/seasonal naive) before complex models.
- Prevent leakage: feature windows and aggregations must use only information available at prediction time.
- Evaluate by horizon and segment; a single aggregate metric hides failures.
- Prefer probabilistic forecasts (quantiles/intervals) when decisions are risk-sensitive; check calibration via interval coverage and score with pinball loss or CRPS.
- For many related series, consider global + hierarchical approaches (shared models + reconciliation); validate across levels and key segments.
- Treat time zones/DST as first-class; validate timestamp alignment before feature generation.
- Define retraining cadence and degraded modes (fallback model, last-known-good forecast).
This skill provides operational, copy-paste-ready forecasting workflows: TS-specific EDA, temporal validation, lag/rolling features, model selection, multi-step forecasting, backtesting, generative foundation models (Chronos, TimesFM), and production deployment with drift monitoring.
It focuses on hands-on forecasting execution, not theory.
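As a starting point for the "strong baselines first" practice above, here is a minimal sketch of a seasonal naive forecast scored with MASE. The function names, the synthetic weekly series, and the 49/7 train/test split are illustrative assumptions, not part of this skill's assets.

```python
import numpy as np

def seasonal_naive(history, horizon, season_length):
    """Repeat the last observed season forward for `horizon` steps."""
    history = np.asarray(history, dtype=float)
    last_season = history[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_season, reps)[:horizon]

def mase(y_true, y_pred, y_train, season_length=1):
    """Forecast MAE divided by the in-sample MAE of the seasonal naive method."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    scale = np.mean(np.abs(y_train[season_length:] - y_train[:-season_length]))
    return float(np.mean(np.abs(y_true - y_pred)) / scale)

# Synthetic daily series: weekly pattern plus a slow upward trend.
weekly = np.array([10, 12, 14, 13, 15, 20, 22], dtype=float)
series = np.tile(weekly, 8) + 0.1 * np.arange(56)
train, test = series[:49], series[49:]  # temporal split, no shuffling

baseline = seasonal_naive(train, horizon=7, season_length=7)
baseline_mase = mase(test, baseline, train, season_length=7)
```

A learned model is worth its complexity only if its MASE on the same split is clearly below the baseline's; a value near or above 1 means it does no better than repeating last season.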
When to Use This Skill
Claude should invoke this skill when the user asks for hands-on time series forecasting, e.g.:
- "Build a time series model for X."
- "Create lag features / rolling windows."
- "Help design a forecasting backtest."
- "Pick the right forecasting model for my data."
- "Fix leakage in forecasting."
- "Evaluate multi-horizon forecasts."
- "Use LLMs or generative models for TS."
- "Set up monitoring for a forecast system."
- "Implement LightGBM for time series."
- "Use transformer models (TimesFM, Chronos) for forecasting."
- "Apply temporal classification/survival modelling for event prediction."
If the user is asking about general ML modelling, deployment, or infrastructure, prefer:
- ai-ml-data-science - General data science workflows, EDA, feature engineering, evaluation
- ai-mlops - Model deployment, monitoring, drift detection, retraining automation
If the user is asking about LLM/RAG/search, prefer:
- ai-llm - LLM fine-tuning, prompting, evaluation
- ai-rag - RAG pipeline design and optimization
Quick Reference
| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| TS EDA & Decomposition | Pandas, statsmodels | | Identifying trend, seasonality, outliers |
| Lag/Rolling Features | Pandas, NumPy | | Creating temporal features for ML models |
| Model Training (Tree-based) | LightGBM, XGBoost | | Tabular TS with seasonality, covariates |
| Deep Learning (Sequence models) | Transformers, RNNs | | Long-term dependencies, complex patterns |
| Event forecasting | Binary/time-to-event models | Temporal labeling + rolling validation | Sparse events and alerts |
| Backtesting | Custom rolling windows | | Temporal validation without leakage |
| Metrics Evaluation | scikit-learn, custom | | Multi-horizon forecast accuracy |
| Production Deployment | MLflow, Airflow | Scheduled pipelines | Automated retraining, drift monitoring |
Decision Tree: Choosing Time Series Approach
User needs time series forecasting for: [Data Type]
├─ Strong Seasonality?
│ ├─ Simple patterns? → LightGBM with seasonal features
│ ├─ Complex patterns? → LightGBM + Prophet comparison
│ └─ Multiple seasonalities? → Prophet or TBATS
│
├─ Long-term Dependencies (>50 steps)?
│ ├─ Transformers (TimesFM, Chronos) → Best for complex patterns
│ └─ RNNs/LSTMs → Good for sequential dependencies
│
├─ Event Forecasting (binary outcomes)?
│ └─ Temporal classification / survival modelling → validate with time-based splits
│
├─ Intermittent/Sparse Data (many zeros)?
│ ├─ Croston/SBA → Classical intermittent methods
│ └─ LightGBM with zero-inflation features → Modern approach
│
├─ Multiple Covariates?
│ ├─ LightGBM → Best with many features
│ └─ TFT/DeepAR → If deep learning needed
│
└─ Explainability Required (healthcare, finance)?
  ├─ LightGBM → SHAP values, feature importance
  └─ Linear models → Most interpretable
Core Concepts (Vendor-Agnostic)
- Time axis: splits, features, and labels must respect time ordering and availability.
- Non-stationarity: seasonality, trend, and regime shifts are normal; monitor and retrain intentionally.
- Evaluation: rolling/expanding backtests; report horizon-wise and segment-wise performance.
- Operationalization: define retraining cadence, fallback models, and data freshness contracts.
- Data governance: treat time series as potentially sensitive; enforce access control, retention, and PII scrubbing in logs.
Implementation Practices (Tooling Examples)
- Build features with explicit time windows; store cutoff timestamps with each training run.
- Backtest with a standardized harness (rolling/expanding windows, horizon-wise metrics).
- Log production forecasts with metadata (model version, horizon, data cut) to enable debugging.
- Implement fallbacks (baseline model, last-known-good, “insufficient data” handling) for outages and anomalies.
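The logging and fallback practices above could be combined roughly as follows. This is a sketch under stated assumptions: `ForecastRecord`, `forecast_with_fallback`, the `min_history` threshold, and the model version string are hypothetical names chosen for illustration, and the baseline is a plain last-value forecast.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Tuple

@dataclass(frozen=True)
class ForecastRecord:
    """One logged forecast with the metadata needed to debug it later."""
    series_id: str
    model_version: str
    data_cutoff: datetime      # latest timestamp the model was allowed to see
    horizon: int
    values: Tuple[float, ...]
    source: str                # "primary", "baseline", or "insufficient_data"

def forecast_with_fallback(series_id, history, horizon, data_cutoff,
                           primary, model_version, min_history=14):
    """Try the primary model; degrade to a last-value baseline on failure."""
    if len(history) < min_history:
        return ForecastRecord(series_id, model_version, data_cutoff,
                              horizon, (), "insufficient_data")
    try:
        values = [float(v) for v in primary(history, horizon)]
        if len(values) != horizon or not all(math.isfinite(v) for v in values):
            raise ValueError("primary model returned an invalid forecast")
        source = "primary"
    except Exception:
        values = [float(history[-1])] * horizon  # naive last-value fallback
        source = "baseline"
    return ForecastRecord(series_id, model_version, data_cutoff,
                          horizon, tuple(values), source)

def broken_model(history, horizon):
    # Stand-in for an outage (missing artifact, timeout, bad input).
    raise RuntimeError("model artifact unavailable")

cutoff = datetime(2026, 1, 5, tzinfo=timezone.utc)
record = forecast_with_fallback("store_42", [5.0] * 20 + [7.0], 3,
                                cutoff, broken_model, "hypothetical-v1")
```

Recording `source` alongside the forecast makes it possible to measure how often production is running in degraded mode.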
Do / Avoid
Do
- Do start with naive/seasonal naive baselines and compare against learned models (Forecasting: Principles and Practice: https://otexts.com/fpp3/).
- Do backtest with rolling windows and preserve point-in-time correctness.
- Do monitor for data pipeline changes (missing timestamps, level shifts, calendar changes).
- Do align metrics/loss to the decision: asymmetric costs, service levels, and probabilistic targets (quantiles/intervals) when needed.
Avoid
- Avoid random splits for forecasting problems.
- Avoid features that use future information (future aggregates, leakage via target encoding).
- Avoid optimizing only aggregate metrics; always inspect horizon-wise errors and worst segments.
- Avoid MAPE when the target can be 0 or near-0; prefer MASE or WAPE (sMAPE is bounded but still unstable when both actual and forecast are near 0) and report horizon-wise.
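To make the metric guidance above concrete, the sketch below shows MAPE breaking on a zero target while WAPE stays finite, plus pinball loss and interval coverage for probabilistic forecasts. The function names and the toy numbers are illustrative assumptions.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined or infinite when y_true has zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return float(np.mean(np.abs((y_true - y_pred) / y_true)))

def wape(y_true, y_pred):
    """Weighted absolute percentage error: total absolute error over total volume."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sum(np.abs(y_true - y_pred)) / np.sum(np.abs(y_true)))

def pinball_loss(y_true, y_quantile, q):
    """Quantile loss for a forecast of the q-th quantile (0 < q < 1)."""
    diff = np.asarray(y_true, float) - np.asarray(y_quantile, float)
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

def interval_coverage(y_true, lower, upper):
    """Share of actuals inside [lower, upper]; compare with the nominal level."""
    y_true = np.asarray(y_true, float)
    return float(np.mean((y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))))

y_true = np.array([0.0, 2.0, 4.0, 10.0])
y_pred = np.array([1.0, 2.0, 3.0, 8.0])

mape_value = mape(y_true, y_pred)               # infinite: first actual is 0
wape_value = wape(y_true, y_pred)               # 4 / 16
q90_loss = pinball_loss(y_true, y_pred, q=0.9)  # treats y_pred as a P90 forecast
coverage = interval_coverage(y_true, y_pred - 1, y_pred + 1)
```

In a real evaluation these would be computed per horizon and per segment rather than once over the pooled data.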
Navigation: Core Patterns
Time Series EDA & Data Preparation
- TS EDA Best Practices
  - Frequency detection, missing timestamps, decomposition
  - Outlier detection, level shifts, seasonality analysis
  - Granularity selection and stability checks
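A minimal sketch of the frequency-detection and missing-timestamp checks listed above, using pandas. The helper name `audit_timestamps` and the sample dates are assumptions; the step is inferred as the most common gap between consecutive unique timestamps, which is a heuristic rather than a guarantee.

```python
import pandas as pd

def audit_timestamps(index):
    """Infer the dominant step, then report duplicates and missing timestamps."""
    index = pd.DatetimeIndex(index).sort_values()
    n_duplicates = int(index.duplicated().sum())
    unique = index.unique()
    deltas = unique.to_series().diff().dropna()
    step = deltas.mode().iloc[0]  # most common gap, taken as the sampling step
    expected = pd.date_range(unique[0], unique[-1], freq=step)
    missing = expected.difference(unique)
    return {"step": step, "n_duplicates": n_duplicates, "missing": missing}

# Daily data with one duplicated day and two missing days.
raw_index = pd.to_datetime([
    "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03",
    "2024-01-05", "2024-01-06", "2024-01-08", "2024-01-09", "2024-01-10",
])
report = audit_timestamps(raw_index)
```

Running this before any resampling or feature generation surfaces gaps that would otherwise silently shift lag features.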
Feature Engineering
- Lag & Rolling Patterns
  - Lag features (lag_1, lag_7, lag_28 for daily data)
  - Rolling windows (mean, std, min, max, EWM)
  - Avoiding leakage, seasonal lags, datetime features
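One way to build the lag and rolling features above without leakage is to shift the target by one step before applying any rolling window, so each row only sees values strictly before its own timestamp. The column names `series_id`, `ds`, and `y` and the helper name are assumptions for this sketch.

```python
import pandas as pd

def add_lag_rolling_features(df, target="y", group="series_id", time_col="ds",
                             lags=(1, 7, 28), windows=(7, 28)):
    """Add per-series lag and rolling-mean features that use only past values."""
    df = df.sort_values([group, time_col]).copy()
    grouped = df.groupby(group)[target]
    for lag in lags:
        df[f"lag_{lag}"] = grouped.shift(lag)
    # Shift by one step first so the window never includes the current target.
    shifted = grouped.shift(1)
    for w in windows:
        df[f"roll_mean_{w}"] = shifted.groupby(df[group]).transform(
            lambda s: s.rolling(w).mean()
        )
    return df

demo = pd.DataFrame({
    "series_id": ["a"] * 5 + ["b"] * 5,
    "ds": list(pd.date_range("2024-01-01", periods=5, freq="D")) * 2,
    "y": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0],
})
features = add_lag_rolling_features(demo, lags=(1, 2), windows=(3,))
```

Grouping by series before shifting matters: without it, the first rows of series `b` would inherit the last values of series `a`.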
Model Selection
- Model Selection Guide
  - Decision rules: Strong seasonality → LightGBM, Long-term → Transformers
  - Benchmark comparison: LightGBM vs Prophet vs Transformers vs RNNs
  - Explainability considerations for mission-critical domains
- LightGBM TS Patterns (feature-based forecasting best practices)
  - Why LightGBM excels: performance + efficiency + explainability
  - Feature engineering for tree-based models
  - Hyperparameter tuning for time series
Forecasting Strategies
- Multi-Step Forecasting Patterns
  - Direct strategy (separate models per horizon)
  - Recursive strategy (feed predictions back)
  - Seq2Seq strategy (Transformers, RNNs for long horizons)
- Intermittent Demand Patterns
  - Croston, SBA, ADIDA for sparse data
  - LightGBM with zero-inflation features (modern approach)
  - Two-stage hurdle models, hierarchical Bayesian
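The difference between the recursive and direct strategies above can be sketched with a small linear autoregression fitted by least squares. The linear model is a stand-in chosen to keep the example dependency-free (it is not LightGBM), and all function names are illustrative assumptions.

```python
import numpy as np

def make_windows(y, n_lags, step_ahead):
    """Pair each window of n_lags values with the value step_ahead steps later."""
    X, t = [], []
    for i in range(n_lags, len(y) - step_ahead + 1):
        X.append(y[i - n_lags:i])
        t.append(y[i + step_ahead - 1])
    return np.array(X), np.array(t)

def fit_linear(X, t):
    """Least-squares fit with an intercept column appended."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    return coef

def predict_linear(coef, window):
    return float(np.dot(coef[:-1], window) + coef[-1])

def recursive_forecast(y, n_lags, horizon):
    """One 1-step model; each prediction is fed back in as an input."""
    coef = fit_linear(*make_windows(y, n_lags, 1))
    window = list(y[-n_lags:])
    preds = []
    for _ in range(horizon):
        nxt = predict_linear(coef, window)
        preds.append(nxt)
        window = window[1:] + [nxt]
    return np.array(preds)

def direct_forecast(y, n_lags, horizon):
    """One model per horizon step, all predicting from the last observed window."""
    last = y[-n_lags:]
    return np.array([
        predict_linear(fit_linear(*make_windows(y, n_lags, h)), last)
        for h in range(1, horizon + 1)
    ])

# Noise-free seasonal signal with period 12 (synthetic).
y_demo = np.sin(2 * np.pi * np.arange(48) / 12)
rec = recursive_forecast(y_demo, n_lags=2, horizon=6)
direct = direct_forecast(y_demo, n_lags=2, horizon=6)
```

On this noise-free signal both strategies recover the continuation; on real data the recursive strategy compounds its own errors over the horizon, while the direct strategy costs one model per step.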
Validation & Evaluation
- Backtesting Patterns
  - Rolling window backtest, expanding window
  - Temporal train/validation split (no IID splits!)
  - Horizon-wise metrics, segment-level evaluation
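A minimal expanding-window backtest that reports MAE per horizon step might look like the sketch below. The helper names, the last-value forecaster, and the fold settings are assumptions for illustration; any callable taking `(history, horizon)` could be plugged in.

```python
import numpy as np

def expanding_window_splits(n_obs, initial_train, horizon, step):
    """Yield (train_idx, test_idx) pairs; training data always precedes the test block."""
    cutoff = initial_train
    while cutoff + horizon <= n_obs:
        yield np.arange(0, cutoff), np.arange(cutoff, cutoff + horizon)
        cutoff += step

def backtest(y, forecaster, initial_train, horizon, step):
    """Return MAE per horizon step, averaged over all folds."""
    errors = []
    for train_idx, test_idx in expanding_window_splits(len(y), initial_train, horizon, step):
        preds = np.asarray(forecaster(y[train_idx], horizon), dtype=float)
        errors.append(np.abs(y[test_idx] - preds))
    return np.mean(np.vstack(errors), axis=0)

def last_value_forecaster(history, horizon):
    """Naive baseline: repeat the last observed value."""
    return np.repeat(history[-1], horizon)

y = np.arange(20, dtype=float)  # synthetic linear trend
mae_by_horizon = backtest(y, last_value_forecaster,
                          initial_train=10, horizon=3, step=2)
```

On a linear trend the naive forecast falls one unit further behind with each step, which is exactly the kind of horizon-wise degradation a single pooled metric would hide.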
Generative & Advanced Models
- TS-LLM Patterns
  - Chronos, TimesFM, Lag-Llama (Transformer models)
  - Event forecasting patterns (temporal classification, survival modelling)
  - Tokenization, discretization, trajectory sampling
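To illustrate what tokenization and discretization of a numeric series can mean, the sketch below scales a context window by its mean absolute value and maps it to uniform bins. This is a generic illustration, not the tokenizer of any particular model; the bin count, the clipping range, and the function names are all assumptions.

```python
import numpy as np

def tokenize(context, n_bins=100, low=-5.0, high=5.0):
    """Scale by mean absolute value, then map each value to a uniform bin id."""
    context = np.asarray(context, dtype=float)
    scale = float(np.mean(np.abs(context)))
    if scale == 0.0:
        scale = 1.0  # all-zero context: avoid dividing by zero
    edges = np.linspace(low, high, n_bins + 1)
    tokens = np.clip(np.digitize(context / scale, edges) - 1, 0, n_bins - 1)
    return tokens, scale, edges

def detokenize(tokens, scale, edges):
    """Map bin ids back to bin centers and undo the scaling."""
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[tokens] * scale

context = np.array([10.0, 12.0, 8.0, 15.0, 9.0])
tokens, scale, edges = tokenize(context)
reconstructed = detokenize(tokens, scale, edges)
```

The round trip is lossy by at most half a bin width (in scaled units), which is the precision cost of turning a continuous series into a fixed vocabulary.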
Production Deployment
- Production Deployment Patterns
  - Feature pipelines (same code for train/serve)
  - Retraining strategies (time-based, drift-triggered)
  - Monitoring (error drift, feature drift, volume drift)
  - Fallback strategies, streaming ingestion, data governance
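Two of the monitoring signals above, feature drift and error drift, can be approximated with a population stability index (PSI) and a recent-versus-reference error ratio. This is a sketch: the function names, bin count, window sizes, and any alert thresholds applied to the outputs are assumptions to tune per system.

```python
import numpy as np

def population_stability_index(reference, current, n_bins=10):
    """PSI between a reference and a current sample, using reference quantile bins."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Interior bin edges from reference quantiles; outer bins are open-ended.
    interior = np.quantile(reference, np.linspace(0, 1, n_bins + 1)[1:-1])
    ref_counts = np.bincount(np.searchsorted(interior, reference, side="right"),
                             minlength=n_bins)
    cur_counts = np.bincount(np.searchsorted(interior, current, side="right"),
                             minlength=n_bins)
    eps = 1e-6  # floor to keep the log finite for empty bins
    ref_frac = np.clip(ref_counts / len(reference), eps, None)
    cur_frac = np.clip(cur_counts / len(current), eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

def error_drift_ratio(abs_errors, reference_window, recent_window):
    """Recent mean absolute error divided by the preceding reference-window error."""
    abs_errors = np.asarray(abs_errors, dtype=float)
    reference = abs_errors[-(reference_window + recent_window):-recent_window]
    recent = abs_errors[-recent_window:]
    return float(recent.mean() / reference.mean())

reference_feature = np.linspace(0.0, 1.0, 1000)
psi_stable = population_stability_index(reference_feature, reference_feature)
psi_shifted = population_stability_index(reference_feature, reference_feature + 0.5)
drift_ratio = error_drift_ratio([1.0] * 30 + [2.0] * 7,
                                reference_window=30, recent_window=7)
```

Either signal crossing its threshold could trigger retraining or a switch to the fallback forecast.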
Navigation: Templates (Copy-Paste Ready)
Data Preparation
- TS EDA Template - Reproducible structure for time series analysis
- Resample & Fill Template - Handle missing timestamps and resampling
Feature Templates
- Lag & Rolling Features - Create temporal features for ML models
- Calendar Features - Business calendars, holidays, events
Model Templates
- Forecast Model Template - End-to-end forecasting pipeline (LightGBM, transformers, RNNs)
- Multi-Step Strategy - Direct, recursive, and seq2seq approaches
Evaluation Templates
- Backtest Template - Rolling window validation setup
- TS Metrics Template - MAPE, MAE, RMSE, MASE, pinball loss
Advanced Templates
- TS-LLM Template - Time series foundation model patterns and experimental approaches
Related Skills
For adjacent topics, reference these skills:
- ai-ml-data-science - EDA workflows, feature engineering patterns, model evaluation, SQLMesh transformations
- ai-mlops - Production deployment, monitoring, retraining pipelines
- ai-llm - Fine-tuning approaches applicable to time series LLMs (Chronos, TimesFM)
- ai-prompt-engineering - Prompt design patterns for time series LLMs
- data-sql-optimization - SQL optimization for time series data storage and retrieval
External Resources
See data/sources.json for curated web resources including:
- Classical methods (statsmodels, Prophet, ARIMA)
- Deep learning frameworks (PyTorch Forecasting, GluonTS, Darts, NeuralProphet)
- Transformer models (TimesFM, Chronos, Lag-Llama, Informer, Autoformer)
- Anomaly detection tools (PyOD, STUMPY, Isolation Forest)
- Feature engineering libraries (tsfresh, TSFuse, Featuretools)
- Production deployment (Kats, MLflow, sktime)
- Benchmarks and datasets (M5 Competition, Monash Time Series, UCI)
Usage Notes
For Claude:
- Activate this skill for hands-on forecasting tasks, feature engineering, backtesting, or production setup
- Start with Quick Reference and Decision Tree for fast guidance
- Drill into references/ for detailed implementation patterns
- Use assets/ for copy-paste ready code
- Always check for temporal leakage (future data in training)
- Start with strong baselines; choose model family based on horizon, covariates, and latency/cost constraints
- Emphasize explainability for healthcare/finance domains
- Monitor for data distribution shifts in production
Key Principle: Time series forecasting is about temporal structure, not IID assumptions. Use temporal validation, avoid future leakage, and choose models based on horizon length and data characteristics.