rangebar-eval-metrics

Range Bar Evaluation Metrics

Machine-readable reference + computation scripts for state-of-the-art metrics evaluating range bar (price-based sampling) data.

When to Use This Skill

Use this skill when:

Evaluating ML model performance on range bar data
Computing Sharpe ratios with non-IID bar sequences
Running Walk-Forward Optimization metric analysis
Calculating PSR, DSR, or MinTRL statistical tests
Generating evaluation reports from fold results

Quick Start

bash

# Compute metrics from predictions + actuals
python scripts/compute_metrics.py --predictions preds.npy --actuals actuals.npy --timestamps ts.npy

# Generate full evaluation report
python scripts/generate_report.py --results folds.jsonl --output report.md

Metric Tiers

Tier	Purpose	Metrics	Compute
Primary (5)	Research decisions	weekly_sharpe, hit_rate, cumulative_pnl, n_bars, positive_sharpe_rate	Per-fold + aggregate
Secondary/Risk (5)	Additional context	max_drawdown, bar_sharpe, return_per_bar, profit_factor, cv_fold_returns	Per-fold
ML Quality (3)	Prediction health	ic, prediction_autocorr, is_collapsed	Per-fold
Diagnostic (5)	Final validation	psr, dsr, autocorr_lag1, effective_n, binomial_pvalue	Aggregate only
Extended Risk (5)	Deep risk analysis	var_95, cvar_95, omega_ratio, sortino_ratio, ulcer_index	Per-fold (optional)
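
As a rough illustration of the Extended Risk tier, the sketch below computes historical `var_95` and `cvar_95` from a vector of per-period returns. The function name `var_cvar_95`, the sign convention (losses reported as positive numbers), and the use of the empirical 5th percentile are assumptions made for this example; `risk-metrics.md` remains the reference for the definitions the scripts actually use.

```python
import numpy as np


def var_cvar_95(returns: np.ndarray) -> tuple[float, float]:
    """Historical 95% VaR and CVaR as positive loss numbers (assumed convention)."""
    returns = np.asarray(returns, dtype=float)
    if returns.size == 0:
        return float("nan"), float("nan")
    # 5th percentile of the return distribution marks the start of the loss tail
    cutoff = np.percentile(returns, 5)
    # CVaR averages every return at or below that cutoff
    tail = returns[returns <= cutoff]
    var_95 = -cutoff
    cvar_95 = -tail.mean()
    return float(var_95), float(cvar_95)
```

With this convention CVaR is never smaller than VaR, which makes a quick sanity check on per-fold output.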

Why Range Bars Need Special Treatment

Range bars violate standard IID assumptions:

Variable duration: Bars form based on price movement, not time
Autocorrelation: High-volatility periods cluster bars → temporal correlation
Non-constant information: More bars during volatility = more information per day

Canonical solution: Daily aggregation via `_group_by_day()` before Sharpe calculation.

References

Core Reference Files

Topic	Reference File
Sharpe Ratio Calculations	sharpe-formulas.md
Risk Metrics (VaR, Omega, Ulcer)	risk-metrics.md
ML Prediction Quality (IC, Autocorr)	ml-prediction-quality.md
Crypto Market Considerations	crypto-markets.md
Temporal Aggregation Rules	temporal-aggregation.md
JSON Schema for Metrics	metrics-schema.md
Anti-Patterns (Transaction Costs)	anti-patterns.md
SOTA 2025-2026 (SHAP, BOCPD, etc.)	sota-2025-2026.md
Worked Examples (BTC, EUR/USD)	worked-examples.md
Structured Logging (NDJSON)	structured-logging.md

Related Skills

Skill	Relationship
adaptive-wfo-epoch	Uses `weekly_sharpe`, `psr`, `dsr` for WFE calculation

Dependencies

bash

pip install -r requirements.txt
# Or (quote the specifiers so the shell does not treat ">" as a redirect):
# pip install "numpy>=1.24" "pandas>=2.0" "scipy>=1.10"

Key Formulas

Daily-Aggregated Sharpe (Primary Metric)

python

import numpy as np


def weekly_sharpe(pnl: np.ndarray, timestamps: np.ndarray) -> float:
    """Sharpe with daily aggregation for range bars."""
    daily_pnl = _group_by_day(pnl, timestamps)  # Sum PnL per calendar day
    if len(daily_pnl) < 2 or np.std(daily_pnl) == 0:
        return 0.0
    daily_sharpe = np.mean(daily_pnl) / np.std(daily_pnl)
    # For crypto (7-day week): sqrt(7). For equities: sqrt(5)
    return daily_sharpe * np.sqrt(7)  # Crypto default

Information Coefficient (Prediction Quality)

python

import numpy as np
from scipy.stats import spearmanr

def information_coefficient(predictions: np.ndarray, actuals: np.ndarray) -> float:
    """Spearman rank IC - captures rank (monotonic) alignment, not magnitude."""
    ic, _ = spearmanr(predictions, actuals)
    return ic  # Range: [-1, 1]. >0.02 acceptable, >0.05 good, >0.10 excellent

Probabilistic Sharpe Ratio (Statistical Validation)

python

from scipy.stats import norm

def psr(sharpe: float, se: float, benchmark: float = 0.0) -> float:
    """P(true Sharpe > benchmark)."""
    return norm.cdf((sharpe - benchmark) / se)
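
The `se` argument above has to come from somewhere. One common choice is the non-normality-adjusted standard error of the Sharpe ratio (Mertens 2002, as used in the Bailey and López de Prado PSR). The sketch below is an illustration under that assumption, not necessarily what `scripts/compute_metrics.py` does; the function names are hypothetical. The Sharpe ratio and its standard error here are both per observation period (for example per day), so any benchmark must be expressed on the same un-annualized scale.

```python
import numpy as np
from scipy.stats import kurtosis, norm, skew


def sharpe_standard_error(returns: np.ndarray) -> tuple[float, float]:
    """Return (sharpe, se) per observation period, adjusted for skew and kurtosis."""
    returns = np.asarray(returns, dtype=float)
    n = returns.size
    if n < 3:
        return float("nan"), float("nan")
    sd = returns.std(ddof=1)
    if sd == 0:
        return float("nan"), float("nan")
    sr = returns.mean() / sd
    g3 = skew(returns)                    # sample skewness
    g4 = kurtosis(returns, fisher=False)  # raw kurtosis, equals 3 for a normal
    # Reduces to (1 + 0.5 * sr**2) / (n - 1) when returns are normal
    variance = (1.0 - g3 * sr + (g4 - 1.0) / 4.0 * sr**2) / (n - 1)
    return float(sr), float(np.sqrt(max(variance, 0.0)))


def psr_from_returns(returns: np.ndarray, benchmark: float = 0.0) -> float:
    """PSR computed directly from a return series (hypothetical helper)."""
    sr, se = sharpe_standard_error(returns)
    if not np.isfinite(se) or se == 0:
        return float("nan")
    return float(norm.cdf((sr - benchmark) / se))
```

For range bars, passing the daily-aggregated PnL series rather than per-bar returns keeps this consistent with the daily aggregation used for `weekly_sharpe`.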

Annualization Factors

Market	Daily → Weekly	Daily → Annual	Rationale
Crypto (24/7)	sqrt(7) = 2.65	sqrt(365) = 19.1	7 trading days/week
Equity	sqrt(5) = 2.24	sqrt(252) = 15.9	5 trading days/week

NEVER use sqrt(252) for crypto markets.

CRITICAL: Session Filter Changes Annualization

View	Filter	Weekly factor (days_per_week)	Rationale
Session-filtered (London-NY)	Weekdays 08:00-16:00	sqrt(5) (days_per_week = 5)	Trading like equities
All-bars (unfiltered)	None	sqrt(7) (days_per_week = 7)	Full 24/7 crypto

Using sqrt(7) for session-filtered data overstates Sharpe by ~18%!

See crypto-markets.md for detailed rationale.
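
The ~18% figure follows directly from the ratio of the two scaling factors, as this short check shows:

```python
import math

# Ratio of the 7-day factor to the 5-day factor
ratio = math.sqrt(7) / math.sqrt(5)
overstatement_pct = (ratio - 1.0) * 100.0
print(f"sqrt(7)/sqrt(5) = {ratio:.4f} -> overstated by {overstatement_pct:.1f}%")
# prints: sqrt(7)/sqrt(5) = 1.1832 -> overstated by 18.3%
```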

Dual-View Metrics

For comprehensive analysis, compute metrics with BOTH views:

Session-filtered (London 08:00 to NY 16:00): Primary strategy evaluation
All-bars: Regime detection, data quality diagnostics
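
A session mask for the first view might be sketched as follows. This reads "London 08:00 to NY 16:00" as: at or after 08:00 Europe/London local time and before 16:00 America/New_York local time, on weekdays. That reading, the function name `session_mask`, and the assumption that timestamps are Unix seconds are all choices made for this example; `crypto-markets.md` is the place to confirm the exact session definition.

```python
import numpy as np
import pandas as pd


def session_mask(timestamps: np.ndarray) -> np.ndarray:
    """True for bars inside the London-open to New-York-close weekday window."""
    utc = pd.to_datetime(np.asarray(timestamps, dtype=np.int64), unit="s", utc=True)
    # Local wall-clock views handle daylight-saving shifts in both cities
    london = utc.tz_convert("Europe/London")
    new_york = utc.tz_convert("America/New_York")
    is_weekday = (london.dayofweek < 5) & (new_york.dayofweek < 5)
    after_open = london.hour >= 8
    before_close = new_york.hour < 16
    return np.asarray(is_weekday & after_open & before_close)
```

The session-filtered view would then use `pnl[mask]` and `timestamps[mask]` with the sqrt(5) factor, while the all-bars view uses the unfiltered arrays with sqrt(7).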

Academic References

Concept	Citation
Deflated Sharpe Ratio	Bailey & López de Prado (2014)
Sharpe SE with Non-Normality	Mertens (2002)
Statistics of Sharpe Ratios	Lo (2002)
Omega Ratio	Keating & Shadwick (2002)
Ulcer Index	Peter Martin (1987)

Decision Framework

Go Criteria (Research)

yaml

go_criteria:
  - positive_sharpe_rate > 0.55
  - mean_weekly_sharpe > 0
  - cv_fold_returns < 1.5
  - mean_hit_rate > 0.50
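
These thresholds are straightforward to apply programmatically. The helper below is a hypothetical sketch (`check_go_criteria` is not one of the listed scripts) that takes already-aggregated metrics as a dict keyed by the names above and reports each criterion separately.

```python
def check_go_criteria(metrics: dict[str, float]) -> dict[str, bool]:
    """Evaluate each research go criterion; NaN inputs compare as False (fail)."""
    return {
        "positive_sharpe_rate > 0.55": metrics["positive_sharpe_rate"] > 0.55,
        "mean_weekly_sharpe > 0": metrics["mean_weekly_sharpe"] > 0,
        "cv_fold_returns < 1.5": metrics["cv_fold_returns"] < 1.5,
        "mean_hit_rate > 0.50": metrics["mean_hit_rate"] > 0.50,
    }


# Example with made-up numbers: one criterion fails, so the overall answer is no-go
example = {
    "positive_sharpe_rate": 0.62,
    "mean_weekly_sharpe": 0.4,
    "cv_fold_returns": 1.8,
    "mean_hit_rate": 0.53,
}
results = check_go_criteria(example)
is_go = all(results.values())
```

Returning the per-criterion results rather than a single boolean makes it easier to see which threshold blocked a run.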

Publication Criteria

yaml

publication_criteria:
  - binomial_pvalue < 0.05
  - psr > 0.85
  - dsr > 0.50 # If n_trials > 1
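
This document does not spell out what `binomial_pvalue` is computed over. One plausible reading, assumed here, is a one-sided sign test on the number of folds with a positive Sharpe ratio against a 50% null. Under that assumption it could be sketched with `scipy.stats.binomtest`:

```python
from scipy.stats import binomtest


def binomial_pvalue(n_positive: int, n_total: int, p_null: float = 0.5) -> float:
    """One-sided p-value that the positive-fold rate exceeds p_null (assumed definition)."""
    if n_total <= 0:
        return float("nan")
    result = binomtest(n_positive, n_total, p=p_null, alternative="greater")
    return float(result.pvalue)
```

For example, 9 positive folds out of 10 gives a p-value of 11/1024, roughly 0.011, which would clear the 0.05 threshold.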

Scripts

Script	Purpose
`scripts/compute_metrics.py`	Compute all metrics from predictions/actuals
`scripts/generate_report.py`	Generate Markdown report from fold results
`scripts/validate_schema.py`	Validate metrics JSON against schema

Remediations (2026-01-19 Multi-Agent Audit)

The following fixes were applied based on a 12-subagent adversarial audit:

Issue	Root Cause	Fix	Source
`weekly_sharpe=0`	Constant predictions	Model collapse detection + architecture fix	model-expert
`IC=None`	Zero variance predictions	Return 1.0 for constant (semantically correct)	model-expert
`prediction_autocorr=NaN`	Division by zero	Guard for std < 1e-10, return 1.0	model-expert
Ulcer Index divide-by-zero	Peak equity = 0	Guard with np.where(peak > 1e-10, ...)	risk-analyst
Omega/Profit Factor unreliable	Too few samples	min_days parameter (default: 5)	robustness-analyst
BiLSTM mean collapse	Architecture too small	hidden_size: 16→48, dropout: 0.5→0.3	model-expert
`profit_factor=1.0` (n_bars=0)	Early return wrong value	Return NaN when no data to compute ratio	risk-analyst
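
Two of the guards in the table can be sketched as below. This is an illustration of the pattern rather than the audited code: the lag-1 definition of `prediction_autocorr`, the percentage-drawdown form of the Ulcer Index, and the NaN return for inputs that are too short are assumptions made for this example.

```python
import numpy as np

EPS = 1e-10


def prediction_autocorr(predictions: np.ndarray) -> float:
    """Lag-1 autocorrelation of predictions; 1.0 when they are (near) constant."""
    pred = np.asarray(predictions, dtype=float)
    if pred.size < 3:
        return float("nan")  # assumption: too few points to estimate
    head, tail = pred[:-1], pred[1:]
    # Guard: zero variance would make the correlation divide by zero
    if min(np.std(head), np.std(tail)) < EPS:
        return 1.0
    return float(np.corrcoef(head, tail)[0, 1])


def ulcer_index(equity: np.ndarray) -> float:
    """Root-mean-square percentage drawdown from the running peak."""
    equity = np.asarray(equity, dtype=float)
    if equity.size == 0:
        return float("nan")
    peak = np.maximum.accumulate(equity)
    valid = peak > EPS
    # Substitute a harmless denominator where the peak is ~0, then mask the result
    safe_peak = np.where(valid, peak, 1.0)
    drawdown_pct = np.where(valid, (equity - peak) / safe_peak * 100.0, 0.0)
    return float(np.sqrt(np.mean(drawdown_pct**2)))
```

Because `np.where` evaluates both branches, the division uses `safe_peak` rather than `peak`; guarding only the selection would still emit divide-by-zero warnings.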

Model Collapse Detection

python

# ALWAYS check for model collapse after prediction
pred_std = np.std(predictions)
if pred_std < 1e-6:
    logger.warning(
        f"Constant predictions detected (std={pred_std:.2e}). "
        "Model collapsed to mean - check architecture."
    )

Recommended BiLSTM Architecture

python

# BEFORE (causes collapse on range bars)
HIDDEN_SIZE = 16
DROPOUT = 0.5

# AFTER (prevents collapse)
HIDDEN_SIZE = 48  # Triple capacity
DROPOUT = 0.3  # Less aggressive regularization

See reference docs for complete implementation details.

---

Troubleshooting

Issue	Cause	Solution
weekly_sharpe is 0	Constant predictions	Check for model collapse, increase hidden_size
IC returns None	Zero variance in predictions	Model collapsed - check architecture
prediction_autocorr is NaN	Division by zero	Guard for std < 1e-10 in autocorr calculation
Ulcer Index divide error	Peak equity is zero	Add guard: np.where(peak > 1e-10, ...)
profit_factor = 1.0	No bars processed	Return NaN when n_bars is 0
Sharpe inflated 18%	Wrong annualization for data	Use sqrt(5) for session-filtered, sqrt(7) for 24/7
PSR/DSR not computed	Missing scipy	Install: `pip install scipy`
Timestamps not parsed	Wrong format	Ensure Unix timestamps, not datetime strings
