algo-risk-credit

Credit Scoring Model

Overview

Credit scoring models predict the probability of default (PD) from borrower characteristics using logistic regression or gradient boosting. Output: a score (300-850 range) or PD (0-1). Used for loan approval, pricing, and portfolio risk management.

When to Use

Trigger conditions:
  • Building a scorecard for loan/credit approval decisions
  • Predicting default probability for risk-based pricing
  • Evaluating existing credit models for discriminatory power
When NOT to use:
  • For corporate bankruptcy prediction (use Altman Z-Score)
  • For market risk measurement (use VaR)

Algorithm

IRON LAW: A Credit Model Must Discriminate AND Be Calibrated
Discrimination (AUC): correctly ranking good vs bad borrowers.
Calibration: predicted PD matches actual default rates.
A model with AUC=0.85 whose predicted PD is 2x the actual default rate
ranks borrowers well but misprices them systematically. Both properties are required.
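
A minimal sketch of why the two properties are independent, using made-up numbers: doubling every predicted PD leaves the ranking (and therefore any rank-based metric such as AUC) unchanged, while the average predicted PD drifts to twice the observed default rate. The helper names `rank_order` and `calibration_ratio` are illustrative, not part of any library.

```python
# Toy portfolio (made-up numbers): 1 = defaulted, 0 = repaid.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
pd_calibrated = [0.05, 0.05, 0.10, 0.10, 0.10, 0.20, 0.20, 0.30, 0.40, 0.50]
pd_inflated = [2 * p for p in pd_calibrated]  # same ordering, twice the level


def rank_order(pds):
    """Borrower indices sorted from lowest to highest predicted PD."""
    return sorted(range(len(pds)), key=lambda i: pds[i])


def calibration_ratio(pds, outcomes):
    """Mean predicted PD divided by the observed default rate (1.0 = calibrated on average)."""
    mean_predicted = sum(pds) / len(pds)
    observed_rate = sum(outcomes) / len(outcomes)
    return mean_predicted / observed_rate


# Identical ranking means identical discrimination.
same_ranking = rank_order(pd_calibrated) == rank_order(pd_inflated)
ratio_calibrated = calibration_ratio(pd_calibrated, labels)
ratio_inflated = calibration_ratio(pd_inflated, labels)

print("same ranking:", same_ranking)
print("calibration ratio (calibrated):", round(ratio_calibrated, 3))
print("calibration ratio (inflated):", round(ratio_inflated, 3))
```

An average ratio is only a coarse check; the by-decile comparison described in Phase 2 is stricter.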

Phase 1: Input Validation

Collect: borrower features (income, debt ratio, credit history length, delinquency count, utilization), outcome variable (default within 12-24 months). Handle: missing values, class imbalance (typically 2-5% default rate). Gate: Sufficient defaults (300+ events), features available at decision time.
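
The countable part of this gate can be sketched as below. It assumes borrowers arrive as a list of dicts with a binary outcome under a hypothetical key `default`, and that missing values are stored as `None`. Whether a feature is available at decision time cannot be verified from the data alone and still needs a manual review.

```python
def check_phase1_gate(records, outcome_key="default", min_events=300):
    """Summarise event count, default rate and per-feature missing rate for the Phase 1 gate."""
    if not records:
        raise ValueError("records must not be empty")
    n = len(records)
    events = sum(1 for r in records if r.get(outcome_key) == 1)
    # Every key seen in any record, except the outcome, is treated as a feature.
    feature_keys = sorted({k for r in records for k in r} - {outcome_key})
    missing_rate = {
        k: sum(1 for r in records if r.get(k) is None) / n for k in feature_keys
    }
    return {
        "n": n,
        "events": events,
        "default_rate": events / n,
        "missing_rate": missing_rate,
        "gate_passed": events >= min_events,  # 300+ default events, per the gate above
    }
```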

Phase 2: Core Algorithm

  1. Feature engineering: WOE (Weight of Evidence) binning for logistic regression, or direct encoding for GBDT
  2. Train model: logistic regression (interpretable, regulatory-preferred) or GBDT (higher accuracy)
  3. Calibrate: Platt scaling on holdout, ensure predicted PD matches actual default rate by decile
  4. Convert to score: Score = offset + factor × log(odds), where odds are good:bad, i.e. (1 − PD) / PD, so that a higher score means lower risk; scaled to the 300-850 range
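
Steps 1 and 4 above can be sketched as follows. The WOE sign convention used here (log of good share over bad share) is one common choice, and the 0.5 smoothing is an assumption to keep the logarithm finite for empty cells. The scaling constants (600 points at 50:1 good:bad odds, 20 points to double the odds) are hypothetical illustration values, not the ones behind the sample output in this document, and clipping to 300-850 is likewise an assumption about how the range is enforced.

```python
import math


def woe_by_bin(bin_counts, smoothing=0.5):
    """bin_counts maps bin label -> (n_good, n_bad). Returns (WOE per bin, information value)."""
    total_good = sum(g for g, _ in bin_counts.values())
    total_bad = sum(b for _, b in bin_counts.values())
    k = len(bin_counts)
    woe, iv = {}, 0.0
    for label, (good, bad) in bin_counts.items():
        # Smoothed shares keep log() finite when a bin has no goods or no bads.
        share_good = (good + smoothing) / (total_good + smoothing * k)
        share_bad = (bad + smoothing) / (total_bad + smoothing * k)
        woe[label] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[label]
    return woe, iv


def scaling_constants(base_score=600.0, base_odds=50.0, pdo=20.0):
    """Offset and factor such that base_odds maps to base_score and doubling the odds adds pdo points."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset, factor


def pd_to_score(pd, offset, factor, lo=300, hi=850):
    """Score = offset + factor * ln(good:bad odds), clipped to [lo, hi]."""
    if not 0.0 < pd < 1.0:
        raise ValueError("pd must be strictly between 0 and 1")
    odds_good = (1.0 - pd) / pd
    raw = offset + factor * math.log(odds_good)
    return max(lo, min(hi, raw))


# Made-up utilization bins: (goods, bads) per bin.
bins = {"util<30%": (800, 10), "util 30-60%": (600, 20), "util>60%": (300, 30)}
woe, iv = woe_by_bin(bins)
offset, factor = scaling_constants()
print(woe, round(iv, 3))
print(round(pd_to_score(0.035, offset, factor), 1))
```

With this convention a positive WOE marks a bin that holds proportionally more good borrowers than bad ones.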

Phase 3: Verification

Evaluate: AUC (>0.70 acceptable, >0.80 good), KS statistic, Gini coefficient. Population stability index (PSI) for monitoring drift. Gate: AUC > 0.70, calibration acceptable, no discriminatory bias in protected attributes.
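
A dependency-free sketch of the three discrimination metrics, assuming `scores` are predicted PDs (higher means riskier) and `labels` use 1 for default. The pairwise AUC is quadratic in sample size and is meant for illustration; a production pipeline would more likely use a library routine.

```python
def auc(scores, labels):
    """Probability that a random defaulter has a higher PD than a random non-defaulter (ties count half)."""
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


def gini(scores, labels):
    """Gini coefficient as the usual rescaling of AUC: 2 * AUC - 1."""
    return 2 * auc(scores, labels) - 1


def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of defaulters and non-defaulters."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    ks, i = 0.0, 0
    while i < len(pairs):
        j = i
        # Consume all observations tied at this score before comparing the two CDFs.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_bad += pairs[j][1]
            cum_good += 1 - pairs[j][1]
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


# Toy data (made-up numbers).
labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
pds = [0.05, 0.05, 0.10, 0.10, 0.10, 0.20, 0.20, 0.30, 0.40, 0.50]
print(auc(pds, labels), gini(pds, labels), ks_statistic(pds, labels))
```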

Phase 4: Output

Return score, PD, and key risk drivers.

Output Format

```json
{
  "score": 680,
  "pd": 0.035,
  "risk_grade": "B",
  "top_risk_factors": [{"factor": "high_utilization", "impact": -45}, {"factor": "short_history", "impact": -30}],
  "metadata": {"model": "logistic_regression", "auc": 0.78, "vintage": "2024-Q3"}
}
```

Examples

Sample I/O

Input: Borrower: income=$60K, DTI=35%, 5yr credit history, 0 delinquencies, 60% utilization
Expected: Score ~680, PD ~3.5%, Grade B (some risk from high utilization)

Edge Cases

| Input | Expected | Why |
|---|---|---|
| No credit history (thin file) | High uncertainty, default to conservative | Insufficient data for scoring |
| All features identical | Same score regardless of outcome | Model can't differentiate; need more features |
| Major economy shift | PSI > 0.25, model needs recalibration | Population has shifted from training distribution |
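
The PSI check in the last row can be sketched as below, assuming the score range has already been cut into the same bins for the training (expected) and current (actual) populations. The bin counts are made up, and the `eps` floor for empty bins is an assumption rather than a standard.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index: sum over bins of (a - e) * ln(a / e) on bin shares."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both populations must use the same bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor the shares so an empty bin does not produce log(0) or division by zero.
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        value += (a_share - e_share) * math.log(a_share / e_share)
    return value


training_bins = [100, 200, 400, 200, 100]  # made-up counts per score band
current_bins = [300, 300, 250, 100, 50]    # population has moved toward the low bands
drift = psi(training_bins, current_bins)
print(round(drift, 3), "recalibrate" if drift > 0.25 else "stable enough")
```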

Gotchas

  • Reject inference: Training data only includes approved applicants. Rejected applicants' outcomes are unknown, creating selection bias. Use reject inference techniques.
  • Fair lending: Models must not discriminate by protected attributes (race, gender, age). Even proxy variables (zip code ≈ race) can create disparate impact. Test with fairness metrics.
  • Through-the-door vs on-the-books: TTD samples include all applicants; OTB only approved ones. Model purpose determines which sample to use.
  • Vintage analysis: Default rates vary by economic conditions. A 2019-trained model may not predict well in a recession. Track model performance by vintage.
  • Regulatory requirements: Regulators and standard setters (Basel Committee, OCC, FDIC) set specific requirements for model validation, documentation, and fair lending testing.
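
One simple fairness metric for the fair lending point above is the adverse impact ratio: each group's approval rate divided by a reference group's. The sketch below uses made-up decisions and hypothetical group labels. The 0.8 cutoff is the "four-fifths" screening heuristic borrowed from US employment guidelines; treating it as a lending threshold is an assumption, and it does not replace a proper disparate impact analysis.

```python
def approval_rate(decisions):
    """Share of approved applications; decisions are 1 (approved) or 0 (declined)."""
    return sum(decisions) / len(decisions)


def adverse_impact_ratio(decisions_by_group, reference_group):
    """Approval rate of each non-reference group relative to the reference group."""
    ref_rate = approval_rate(decisions_by_group[reference_group])
    return {
        group: approval_rate(decisions) / ref_rate
        for group, decisions in decisions_by_group.items()
        if group != reference_group
    }


# Made-up decisions: group_a approves 80 of 100, group_b approves 50 of 100.
decisions = {
    "group_a": [1] * 80 + [0] * 20,
    "group_b": [1] * 50 + [0] * 50,
}
ratios = adverse_impact_ratio(decisions, reference_group="group_a")
flagged = [g for g, r in ratios.items() if r < 0.8]  # four-fifths screening heuristic
print(ratios, flagged)
```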

References

  • For WOE binning methodology, see
    references/woe-binning.md
  • For reject inference techniques, see
    references/reject-inference.md