signal-classification
Signal Classification
Predict whether an asset's price will move up or down over a forward horizon using supervised machine learning classifiers. This skill covers the full pipeline: label creation, model training, walk-forward validation, feature importance analysis, and threshold optimization for trading applications.
Why Tree-Based Models Dominate Trading ML
XGBoost and LightGBM are the workhorses of quantitative trading ML for good reason:
- Non-linear relationships: Financial features interact in complex, non-linear ways that trees capture naturally
- Robust to feature scale: No need to normalize or standardize inputs — trees split on rank order
- Built-in feature importance: Understand which features drive predictions without separate analysis
- Fast training and inference: Train on thousands of samples in seconds, predict in microseconds
- Handle missing values: Native support for NaN without imputation hacks
- Regularization built in: max_depth, min_child_weight, and subsample all help limit overfitting
Linear models and deep learning have their place, but for tabular trading features with fewer than 100k samples, gradient-boosted trees typically match or outperform the alternatives.
Classification Types
Binary Classification
The simplest and most common setup. Predict whether forward returns exceed a threshold:
- Up signal: forward return > +1%
- Down signal: forward return < -1%
- Neutral (excluded): -1% to +1% — drop these from training to create cleaner labels
```python
import numpy as np


def create_binary_labels(
    prices: np.ndarray, horizon: int = 24, threshold: float = 0.01
) -> np.ndarray:
    """Create binary labels from forward returns.

    Args:
        prices: Array of prices.
        horizon: Forward return look-ahead in bars.
        threshold: Minimum return magnitude for a label.

    Returns:
        Array of labels: 1 (up), 0 (down), NaN (neutral).
    """
    # Return from bar t to bar t + horizon. np.roll wraps around,
    # so the last `horizon` entries are invalid and set to NaN.
    fwd_returns = np.roll(prices, -horizon) / prices - 1
    fwd_returns[-horizon:] = np.nan
    labels = np.where(fwd_returns > threshold, 1,
                      np.where(fwd_returns < -threshold, 0, np.nan))
    return labels
```
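Because neutral bars are labeled NaN, features and labels have to be filtered together before training. The following is a minimal sketch, assuming a feature matrix `X` aligned row by row with the label array; the helper name `drop_neutral` is hypothetical and not part of this skill's scripts.

```python
import numpy as np


def drop_neutral(X: np.ndarray, labels: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Keep only rows whose label is 0 or 1 (hypothetical helper)."""
    keep = ~np.isnan(labels)  # NaN marks neutral or unlabeled bars
    return X[keep], labels[keep].astype(int)


# Toy data: 5 bars, 2 features; bars 1 and 4 carry no label.
X = np.arange(10, dtype=float).reshape(5, 2)
labels = np.array([1.0, np.nan, 0.0, 1.0, np.nan])
X_clean, y_clean = drop_neutral(X, labels)
```

After filtering, row positions no longer correspond to bar numbers, so any time-based split is best computed on the original bar index or timestamps.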
Multi-Class Classification
Three classes for finer signal granularity:
| Class | Condition | Typical use |
|---|---|---|
| Strong Up | fwd_return > +2% | High confidence long |
| Mild Up | +0.5% to +2% | Moderate confidence |
| Down | fwd_return < -0.5% | Avoid / short |
Multi-class reduces per-class sample size. Use only with large datasets (1000+ samples per class).
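The three bands in the table can be turned into labels much like the binary case. This is only a sketch under stated assumptions: the function name `create_multiclass_labels` and the integer encoding (2, 1, 0) are hypothetical, and returns between -0.5% and +0.5%, which the table leaves unassigned, are treated as NaN.

```python
import numpy as np


def create_multiclass_labels(
    prices: np.ndarray,
    horizon: int = 24,
    strong_up: float = 0.02,
    mild_up: float = 0.005,
    down: float = -0.005,
) -> np.ndarray:
    """Label bars as 2 (strong up), 1 (mild up), 0 (down), or NaN.

    Assumes horizon >= 1. Bars without a full forward window stay NaN.
    """
    prices = np.asarray(prices, dtype=float)
    fwd_returns = np.full(prices.shape, np.nan)
    # Return from bar t to bar t + horizon.
    fwd_returns[:-horizon] = prices[horizon:] / prices[:-horizon] - 1

    labels = np.full(prices.shape, np.nan)
    labels[fwd_returns > strong_up] = 2
    labels[(fwd_returns >= mild_up) & (fwd_returns <= strong_up)] = 1
    labels[fwd_returns < down] = 0
    return labels


toy_prices = np.array([100.0, 103.0, 104.0, 103.0, 103.1, 103.1])
toy_labels = create_multiclass_labels(toy_prices, horizon=1)
```

Counting the labels per class (for example with `np.unique` on the non-NaN entries) is a quick way to check the 1000-samples-per-class guideline before training.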
Probability Calibration
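Raw probabilities from XGBoost/LightGBM are often poorly calibrated: a predicted probability of 0.7 does not necessarily mean the model is right 70% of the time. Use calibration to fix this:
```python
from sklearn.calibration import CalibratedClassifierCV

# base_model is an unfitted classifier (e.g. XGBClassifier).
# Note: an integer cv uses stratified folds, which are not time-ordered.
calibrated = CalibratedClassifierCV(base_model, cv=5, method="isotonic")
calibrated.fit(X_train, y_train)
probs = calibrated.predict_proba(X_test)[:, 1]
```
With enough calibration data, isotonic calibration often works better than Platt scaling for tree models; with small samples it can overfit, and Platt scaling (method="sigmoid") is the safer choice.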
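One way to check whether calibration helped is a reliability table: bin the predicted probabilities and compare each bin's mean prediction with the observed frequency of the positive class. The sketch below uses only NumPy and synthetic data; `reliability_table` is a hypothetical helper name, and scikit-learn's `sklearn.calibration.calibration_curve` offers similar functionality.

```python
import numpy as np


def reliability_table(probs: np.ndarray, y_true: np.ndarray, n_bins: int = 5):
    """Return (mean predicted prob, observed positive rate, count) per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to a bin; clip so prob == 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(probs, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            rows.append((float(probs[mask].mean()),
                         float(y_true[mask].mean()),
                         int(mask.sum())))
    return rows


# Synthetic, perfectly calibrated predictions: P(y = 1) equals the predicted prob.
rng = np.random.default_rng(0)
demo_probs = rng.uniform(0.0, 1.0, size=5000)
demo_y = (rng.uniform(0.0, 1.0, size=5000) < demo_probs).astype(int)
table = reliability_table(demo_probs, demo_y)
```

For a well-calibrated model the first two numbers in each row are close; large gaps in the high-probability bins matter most, since those are the bins that generate trades.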
Walk-Forward Validation
This is the single most important concept in trading ML. Standard k-fold cross-validation, especially with random shuffling, lets the model train on data that comes after the test samples, which creates lookahead bias. Walk-forward validation respects time ordering.
How It Works
```
Window 1: [===TRAIN===][GAP][=TEST=]
Window 2:    [===TRAIN===][GAP][=TEST=]
Window 3:       [===TRAIN===][GAP][=TEST=]
Window 4:          [===TRAIN===][GAP][=TEST=]
```
Each window:
- Train on past N bars
- Skip a gap (embargo) equal to the forward return horizon
- Predict on next M bars
- Record out-of-sample predictions
- Slide forward and repeat
Typical Parameters
| Parameter | Value | Rationale |
|---|---|---|
| Train window | 30 days (720 hourly bars) | Enough data to learn, recent enough to be relevant |
| Test window | 7 days (168 hourly bars) | Enough predictions for statistical significance |
| Step size | 1 day (24 bars) | Overlap test windows for more data points |
| Gap (embargo) | Same as forward horizon | Prevents label leakage |
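To get a feel for these numbers, the window count can be worked out in advance. The sketch below assumes 90 days of hourly bars and a 24-bar forward horizon as the gap; `count_windows` is a hypothetical helper, not part of the skill's scripts.

```python
def count_windows(
    n_samples: int,
    train_size: int = 720,
    test_size: int = 168,
    step_size: int = 24,
    gap: int = 24,
) -> int:
    """Number of complete walk-forward windows that fit in n_samples bars."""
    window = train_size + gap + test_size  # bars consumed by one full window
    if n_samples < window:
        return 0
    return (n_samples - window) // step_size + 1


# 90 days of hourly bars: (2160 - 912) // 24 + 1 = 53 windows.
n_windows = count_windows(90 * 24)
```

With a 24-bar step and a 168-bar test window, consecutive test windows share 144 bars, so a given bar can be predicted by up to seven different models. That is worth keeping in mind when aggregating out-of-sample results.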
Walk-Forward Implementation
```python
from typing import Iterator

import numpy as np


def walk_forward_splits(
    n_samples: int,
    train_size: int = 720,
    test_size: int = 168,
    step_size: int = 24,
    gap: int = 24,
) -> Iterator[tuple[np.ndarray, np.ndarray]]:
    """Generate walk-forward train/test index splits.

    Args:
        n_samples: Total number of samples.
        train_size: Number of training samples per window.
        test_size: Number of test samples per window.
        step_size: Step between successive windows.
        gap: Gap between train end and test start.

    Yields:
        Tuples of (train_indices, test_indices).
    """
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_idx = np.arange(start, start + train_size)
        # Skip `gap` bars after the training window to avoid label leakage.
        test_start = start + train_size + gap
        test_idx = np.arange(test_start, test_start + test_size)
        yield train_idx, test_idx
        start += step_size
```
See references/validation_methods.md for purged CV, CPCV, and evaluation metrics.
Model Training Pipeline
Full Pipeline Overview
- Feature engineering: compute technical indicators, on-chain metrics, volume features (see the feature-engineering skill)
- Label creation: forward returns with threshold, drop neutral zone
- Walk-forward split: time-ordered train/test windows with gap
- Train model: XGBoost or LightGBM on each training window
- Predict on test: generate out-of-sample probability predictions
- Aggregate predictions: concatenate all out-of-sample results
- Evaluate: accuracy, precision, recall, F1, AUC, profit factor
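The train, predict, and aggregate steps above can be sketched as a single loop. This is an illustration under stated assumptions rather than the skill's actual script: `walk_forward_predict` and `make_model` are hypothetical names, the model only needs scikit-learn style `fit` and `predict_proba` methods, and the step size defaults to the test size so that no bar is predicted twice.

```python
import numpy as np


def walk_forward_predict(X, y, make_model, train_size=720, test_size=168,
                         step_size=168, gap=24):
    """Collect out-of-sample probabilities from successive walk-forward windows.

    make_model: zero-argument callable returning an unfitted classifier with
        scikit-learn style fit / predict_proba (e.g. lambda: XGBClassifier()).
    y: float array of labels where NaN marks neutral bars.
    Returns (test_indices, probabilities), concatenated across windows.
    """
    oos_index, oos_probs = [], []
    start = 0
    while start + train_size + gap + test_size <= len(X):
        train = np.arange(start, start + train_size)
        test_start = start + train_size + gap
        test = np.arange(test_start, test_start + test_size)

        # Drop neutral (NaN) labels from the training window only.
        labeled = train[~np.isnan(y[train])]
        model = make_model()
        model.fit(X[labeled], y[labeled].astype(int))

        oos_index.append(test)
        oos_probs.append(model.predict_proba(X[test])[:, 1])
        start += step_size

    if not oos_index:
        return np.array([], dtype=int), np.array([])
    return np.concatenate(oos_index), np.concatenate(oos_probs)
```

The returned indices make it straightforward to align probabilities with realized forward returns for the evaluation step.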
Quick Training Example
```python
from xgboost import XGBClassifier

model = XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.05,
    subsample=0.8,
    colsample_bytree=0.8,
    eval_metric="logloss",
    random_state=42,
)
# X_val / y_val are assumed to be a held-out slice that comes after X_train in time.
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=False,
)
probabilities = model.predict_proba(X_test)[:, 1]
```
See references/model_guide.md for parameter recommendations and tuning.
SHAP Feature Importance
SHAP (SHapley Additive exPlanations) is a widely used, theoretically grounded method for understanding model predictions.
Global Feature Importance
Which features matter most across all predictions:
```python
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Summary plot (top 15 features)
shap.summary_plot(shap_values, X_test, max_display=15)
```
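The ranking shown in a summary plot can also be computed as numbers, which is handy for logging. The sketch below assumes the SHAP values form a 2-D array of shape (n_samples, n_features), as is common for a binary tree classifier; depending on the SHAP version and model, the values may instead come per class. The helper name `rank_features` and the feature names are hypothetical, and the toy values are made up.

```python
import numpy as np


def rank_features(shap_values: np.ndarray, feature_names: list[str], top_k: int = 15):
    """Rank features by mean absolute SHAP value, largest first."""
    importance = np.abs(shap_values).mean(axis=0)
    order = np.argsort(importance)[::-1][:top_k]
    return [(feature_names[i], float(importance[i])) for i in order]


# Toy SHAP matrix: 3 samples x 3 features.
toy_shap = np.array([[0.2, -0.1, 0.0],
                     [-0.4, 0.1, 0.05],
                     [0.3, -0.2, 0.0]])
ranking = rank_features(toy_shap, ["rsi_14", "volume_z", "funding_rate"])
```

Taking the absolute value before averaging matters: a feature that pushes predictions strongly in both directions would otherwise average out to near zero.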
Local Explanations
Why a specific prediction was made:
```python
# Explain a single prediction
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])
```
Temporal Feature Importance
Track how feature importance drifts over walk-forward windows. If a feature's importance drops significantly, the market regime may have shifted.
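One simple way to track this drift is to store one importance vector per walk-forward window and compare the latest window with a trailing average. The sketch below is illustrative only: `flag_importance_drops`, the 5-window lookback, and the 50% drop ratio are assumptions, not values taken from this skill.

```python
import numpy as np


def flag_importance_drops(
    importances: np.ndarray, lookback: int = 5, drop_ratio: float = 0.5
) -> np.ndarray:
    """Flag features whose latest importance fell below a fraction of their recent mean.

    importances: shape (n_windows, n_features), one row per walk-forward window
        (for example mean absolute SHAP values).
    Returns a boolean array of shape (n_features,).
    """
    if len(importances) < lookback + 1:
        # Not enough history to form a baseline yet.
        return np.zeros(importances.shape[1], dtype=bool)
    baseline = importances[-(lookback + 1):-1].mean(axis=0)
    latest = importances[-1]
    return latest < drop_ratio * baseline


# Feature 0 is stable; feature 1 drops from 0.20 to 0.05 in the latest window.
history = np.array([[0.30, 0.20]] * 5 + [[0.30, 0.05]])
flags = flag_importance_drops(history)
```

A flagged feature is a prompt to investigate rather than proof of a regime shift, since importance can also move when correlated features trade places.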
Threshold Optimization
The default 0.5 probability threshold is almost never optimal for trading.
Why Not 0.5?
- Class imbalance: if 60% of labels are "up", a 0.5 threshold is too aggressive
- Trading costs: marginal signals (0.51 probability) rarely cover transaction costs
- Asymmetric payoffs: precision matters more than recall for trading
Optimize for Profit Factor
```python
import numpy as np


def optimize_threshold(
    probabilities: np.ndarray,
    returns: np.ndarray,
    thresholds: np.ndarray | None = None,
) -> tuple[float, float]:
    """Find threshold that maximizes profit factor.

    Args:
        probabilities: Model predicted probabilities.
        returns: Actual forward returns.
        thresholds: Thresholds to search over.

    Returns:
        Tuple of (best_threshold, best_profit_factor).
    """
    if thresholds is None:
        thresholds = np.arange(0.50, 0.85, 0.01)
    best_threshold, best_pf = 0.5, 0.0
    for t in thresholds:
        signals = probabilities >= t
        # Require at least 10 signals so a handful of trades cannot drive the result.
        if signals.sum() < 10:
            continue
        signal_returns = returns[signals]
        wins = signal_returns[signal_returns > 0].sum()
        losses = abs(signal_returns[signal_returns < 0].sum())
        # Profit factor = gross wins / gross losses.
        # Thresholds with no losing trades score 0.0 here and are never selected.
        pf = wins / losses if losses > 0 else 0.0
        if pf > best_pf:
            best_pf = pf
            best_threshold = float(t)
    return best_threshold, best_pf
```
Typical finding: the optimal threshold is 0.60-0.75 for crypto trading signals.
Crypto-Specific Considerations
Short Training Windows
Crypto market regimes change fast. A model trained on 6 months of data may perform worse than one trained on 30 days. Use shorter training windows and retrain frequently.
Class Imbalance
Most time periods are "flat" (returns within the neutral zone). Strategies to handle this:
- Drop neutral zone: only train on clear up/down labels
- Undersample or down-weight the majority class: scale_pos_weight in XGBoost handles the reweighting
- SMOTE: synthetic minority oversampling (use cautiously; it can introduce lookahead)
- Adjust threshold: raise the probability threshold to compensate
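For the reweighting option, a common starting value for XGBoost's scale_pos_weight is the ratio of negative to positive training samples. A minimal sketch, assuming labels encoded as 1 (up), 0 (down), and NaN (neutral); the helper name `compute_scale_pos_weight` is hypothetical.

```python
import numpy as np


def compute_scale_pos_weight(y: np.ndarray) -> float:
    """Ratio of negative to positive samples, ignoring NaN (neutral) labels."""
    y = y[~np.isnan(y)]
    n_pos = int((y == 1).sum())
    n_neg = int((y == 0).sum())
    if n_pos == 0:
        raise ValueError("no positive samples in the training window")
    return n_neg / n_pos


y_train_demo = np.array([1, 0, 0, 0, np.nan, 1, 0, 0])
weight = compute_scale_pos_weight(y_train_demo)  # 5 negatives / 2 positives = 2.5
```

The ratio should be recomputed on each training window, since the class balance shifts as the market moves between trending and ranging periods.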
Transaction Costs
A model with 55% accuracy sounds good, but after 0.5% round-trip costs (slippage + fees), many signals become unprofitable. Always evaluate signals net of costs:
```python
net_return = gross_return - 0.005  # 50 bps round-trip
```
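The effect of costs on the required hit rate can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, symmetric average wins and losses of 2% per trade; both function names are hypothetical.

```python
def breakeven_hit_rate(avg_win: float, avg_loss: float, cost: float) -> float:
    """Hit rate at which the expected net return per trade is zero.

    Solves p * avg_win - (1 - p) * avg_loss - cost = 0 for p.
    avg_win and avg_loss are positive gross magnitudes; cost is round-trip.
    """
    return (avg_loss + cost) / (avg_win + avg_loss)


def expected_net_return(hit_rate: float, avg_win: float, avg_loss: float, cost: float) -> float:
    """Expected return per trade after costs."""
    return hit_rate * avg_win - (1 - hit_rate) * avg_loss - cost


# With 2% average wins and losses and 50 bps round-trip costs:
p_star = breakeven_hit_rate(0.02, 0.02, 0.005)           # 0.625
edge_55 = expected_net_return(0.55, 0.02, 0.02, 0.005)   # about -0.003 per trade
```

Under these assumptions a 55% hit rate loses roughly 30 bps per trade, and the model needs about 62.5% just to break even.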
Feature Decay
Features lose predictive power over time as more participants discover and trade on them. Monitor rolling performance and retrain when metrics degrade.
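Monitoring can be as simple as a rolling hit rate over the most recent out-of-sample predictions, compared against a baseline. The sketch below is a hedged illustration: the function names, the 100-prediction window, and the 0.52 baseline are assumptions rather than recommendations from this skill.

```python
import numpy as np


def rolling_hit_rate(
    predictions: np.ndarray, outcomes: np.ndarray, window: int = 100
) -> np.ndarray:
    """Rolling fraction of correct predictions; the first window - 1 entries are NaN."""
    hits = (predictions == outcomes).astype(float)
    out = np.full(len(hits), np.nan)
    if len(hits) >= window:
        # Windowed sums via differences of the cumulative sum.
        csum = np.concatenate([[0.0], np.cumsum(hits)])
        out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out


def needs_retrain(hit_rates: np.ndarray, baseline: float = 0.52) -> bool:
    """True when the most recent rolling hit rate is below the baseline."""
    valid = hit_rates[~np.isnan(hit_rates)]
    return bool(len(valid) > 0 and valid[-1] < baseline)


# Toy example: the model is right for 100 bars, then wrong for 100 bars.
demo_preds = np.ones(200)
demo_outcomes = np.concatenate([np.ones(100), np.zeros(100)])
demo_rates = rolling_hit_rate(demo_preds, demo_outcomes, window=50)
retrain = needs_retrain(demo_rates)
```

The same pattern works for other rolling metrics, such as profit factor net of costs, which is often closer to what the strategy actually cares about.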
Integration with Other Skills
| Skill | Integration |
|---|---|
| feature-engineering | Compute input features for the classifier |
| | Backtest trading strategies from ML signals |
| | Train separate models per regime, or use regime as a feature |
| | Size positions based on classifier confidence |
| | Apply portfolio-level risk limits to ML-generated signals |
Files
References
- references/model_guide.md: XGBoost and LightGBM parameter guide, tuning, and ensembling
- references/validation_methods.md: Walk-forward, purged CV, CPCV, and evaluation metrics
Scripts
- scripts/train_classifier.py: Train a signal classifier with walk-forward validation and feature importance
- scripts/walk_forward_backtest.py: Backtest ML signals vs buy-and-hold with walk-forward validation
Dependencies
```bash
# Core (required)
uv pip install pandas numpy scikit-learn

# Optional (recommended)
uv pip install xgboost lightgbm shap
```
Key Takeaways
- Walk-forward validation is non-negotiable — random CV will give you wildly inflated results
- Optimize threshold for profit factor, not accuracy — a high-precision, low-recall model beats a high-accuracy one
- Short training windows for crypto — 30 days beats 6 months in most regimes
- Monitor feature decay — retrain when rolling metrics drop below baseline
- Always evaluate net of costs — a 55% accurate model may be unprofitable after fees
- SHAP over raw feature importance — SHAP gives consistent, theoretically grounded explanations