algo-risk-credit
Credit Scoring Model
Overview
Credit scoring models predict the probability of default (PD) from borrower characteristics using logistic regression or gradient boosting. Output: a score (300-850 range) or PD (0-1). Used for loan approval, pricing, and portfolio risk management.
When to Use
Trigger conditions:
- Building a scorecard for loan/credit approval decisions
- Predicting default probability for risk-based pricing
- Evaluating existing credit models for discriminatory power
When NOT to use:
- For corporate bankruptcy prediction (use Altman Z-Score)
- For market risk measurement (use VaR)
Algorithm
IRON LAW: A Credit Model Must Discriminate AND Be Calibrated
Discrimination (AUC): correctly ranking good vs bad borrowers.
Calibration: predicted PD matches actual default rates.
A model with AUC=0.85 but predicted PD 2x actual default rate will
cause systematic over/under-pricing. Need BOTH properties.
Phase 1: Input Validation
Collect: borrower features (income, debt ratio, credit history length, delinquency count, utilization), outcome variable (default within 12-24 months). Handle: missing values, class imbalance (typically 2-5% default rate).
Gate: Sufficient defaults (300+ events), features available at decision time.
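The gate above can be checked mechanically before any modelling starts. The following is a minimal sketch, assuming the sample arrives as a list of dicts with `None` for missing values and a 0/1 outcome column named `default`; the names `GateReport` and `check_input_gate` are hypothetical, and only the 300-event threshold comes from the text.

```python
from dataclasses import dataclass

MIN_DEFAULT_EVENTS = 300  # gate from the text: 300+ default events


@dataclass
class GateReport:
    n_rows: int
    n_defaults: int
    default_rate: float
    missing_share: dict  # feature name -> share of rows with a missing value
    passed: bool


def check_input_gate(rows, features, target="default"):
    """Summarise the sample and apply the minimum-defaults gate.

    rows: list of dicts; a missing feature value is represented as None.
    """
    if not rows:
        raise ValueError("empty sample")
    n_rows = len(rows)
    n_defaults = sum(1 for r in rows if r[target] == 1)
    missing_share = {
        f: sum(1 for r in rows if r.get(f) is None) / n_rows for f in features
    }
    return GateReport(
        n_rows=n_rows,
        n_defaults=n_defaults,
        default_rate=n_defaults / n_rows,
        missing_share=missing_share,
        passed=n_defaults >= MIN_DEFAULT_EVENTS,
    )
```

The report only counts events and missing values. Whether each feature is actually available at decision time is a data-lineage question that code like this cannot answer.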
Phase 2: Core Algorithm
- Feature engineering: WOE (Weight of Evidence) binning for logistic regression, or direct encoding for GBDT
- Train model: logistic regression (interpretable, regulatory-preferred) or GBDT (higher accuracy)
- Calibrate: Platt scaling on holdout, ensure predicted PD matches actual default rate by decile
- Convert to score: Score = offset + factor × ln(odds), where odds = (1 − PD) / PD so that a higher score means lower risk; scaled to the 300-850 range
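Two of the steps above reduce to short formulas. The sketch below shows WOE for a single pre-binned feature and the PD-to-score conversion. It assumes the common convention WOE = ln(share of goods / share of bads) and the "points to double the odds" (PDO) scaling, where factor = PDO / ln 2 and offset = base score − factor × ln(base odds). The defaults (600 points at 50:1 odds, PDO of 20) and the 0.5 smoothing constant are illustrative assumptions, not values from this document, and they are not the constants behind the sample output shown later.

```python
import math


def woe_by_bin(bin_counts):
    """bin_counts: {bin_label: (n_good, n_bad)} -> {bin_label: WOE}.

    Convention assumed here: WOE = ln(share of goods / share of bads),
    so a positive WOE marks a lower-risk bin.
    """
    total_good = sum(g for g, _ in bin_counts.values())
    total_bad = sum(b for _, b in bin_counts.values())
    woe = {}
    for label, (good, bad) in bin_counts.items():
        # Add 0.5 to each cell so an empty cell does not produce log(0).
        good_share = (good + 0.5) / total_good
        bad_share = (bad + 0.5) / total_bad
        woe[label] = math.log(good_share / bad_share)
    return woe


def pd_to_score(pd, base_score=600.0, base_odds=50.0, pdo=20.0,
                lo=300.0, hi=850.0):
    """Score = offset + factor * ln(odds), with odds = (1 - PD) / PD."""
    if not 0.0 < pd < 1.0:
        raise ValueError("pd must lie strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds = (1.0 - pd) / pd
    # Clamp to the 300-850 reporting range.
    return max(lo, min(hi, offset + factor * math.log(odds)))
```

With these constants, halving the odds of repayment lowers the score by `pdo` points, which makes score differences easy to explain to reviewers.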
Phase 3: Verification
Evaluate: AUC (>0.70 acceptable, >0.80 good), KS statistic, Gini coefficient. Population stability index (PSI) for monitoring drift.
Gate: AUC > 0.70, calibration acceptable, no discriminatory bias with respect to protected attributes.
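The discrimination metrics and a coarse calibration check can be computed without any library. This is a sketch on toy data (the six PDs and labels are made up for illustration), using a pairwise AUC that is quadratic in sample size and so only suitable for small samples. `calibration_ratio` is a hypothetical helper that compares mean predicted PD with the observed default rate; a by-decile check would apply the same ratio within each score band.

```python
def auc(pds, labels):
    """Probability that a random defaulter has a higher PD than a random
    non-defaulter (ties count as half)."""
    bad = [p for p, y in zip(pds, labels) if y == 1]
    good = [p for p, y in zip(pds, labels) if y == 0]
    wins = 0.0
    for b in bad:
        for g in good:
            wins += 1.0 if b > g else 0.5 if b == g else 0.0
    return wins / (len(bad) * len(good))


def gini(pds, labels):
    """Gini coefficient as a rescaling of AUC."""
    return 2.0 * auc(pds, labels) - 1.0


def ks_statistic(pds, labels):
    """Largest gap between the cumulative PD distributions of bads and goods."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    pairs = sorted(zip(pds, labels))
    cum_bad = cum_good = 0
    best = 0.0
    for i, (p, y) in enumerate(pairs):
        if y == 1:
            cum_bad += 1
        else:
            cum_good += 1
        # Evaluate only after the last row of a run of tied PDs.
        if i + 1 == len(pairs) or pairs[i + 1][0] != p:
            best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
    return best


def calibration_ratio(pds, labels):
    """Mean predicted PD divided by observed default rate (1.0 is ideal)."""
    return (sum(pds) / len(pds)) / (sum(labels) / len(labels))


# Toy data: doubling every PD keeps the ranking but breaks calibration.
pds = [0.01, 0.02, 0.03, 0.10, 0.20, 0.05]
labels = [0, 0, 1, 0, 1, 0]
doubled = [2 * p for p in pds]
print(auc(pds, labels), auc(doubled, labels))
print(calibration_ratio(pds, labels), calibration_ratio(doubled, labels))
```

The doubled PDs give the same AUC, KS, and Gini as the originals while the calibration ratio doubles, which is why the gate checks both properties.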
Phase 4: Output
Return score, PD, and key risk drivers.
Output Format
```json
{
  "score": 680,
  "pd": 0.035,
  "risk_grade": "B",
  "top_risk_factors": [{"factor": "high_utilization", "impact": -45}, {"factor": "short_history", "impact": -30}],
  "metadata": {"model": "logistic_regression", "auc": 0.78, "vintage": "2024-Q3"}
}
```
Examples
Sample I/O
Input: Borrower: income=$60K, DTI=35%, 5yr credit history, 0 delinquencies, 60% utilization
Expected: Score ~680, PD ~3.5%, Grade B (some risk from high utilization)
Edge Cases
| Input | Expected | Why |
|---|---|---|
| No credit history (thin file) | High uncertainty, default to conservative | Insufficient data for scoring |
| All features identical | Same score regardless of outcome | Model can't differentiate — need more features |
| Major economy shift | PSI > 0.25, model needs recalibration | Population has shifted from training distribution |
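The PSI threshold in the last row can be monitored with a few lines of code. A minimal sketch, assuming both populations have already been counted into the same score bands (the banding itself and the `eps` floor for empty bands are assumptions), using PSI = Σ (actual share − expected share) × ln(actual share / expected share):

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index over matching score bands.

    expected_counts: band counts in the training (reference) sample.
    actual_counts:   band counts in the current population.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both samples must use the same bands")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor each share at eps so an empty band does not produce log(0).
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        total += (a_share - e_share) * math.log(a_share / e_share)
    return total


def needs_recalibration(expected_counts, actual_counts, threshold=0.25):
    """Apply the PSI > 0.25 rule from the edge-case table."""
    return psi(expected_counts, actual_counts) > threshold
```

Each term in the sum is non-negative, so PSI is zero only when the two distributions match band for band.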
Gotchas
- Reject inference: Training data only includes approved applicants. Rejected applicants' outcomes are unknown, creating selection bias. Use reject inference techniques.
- Fair lending: Models must not discriminate by protected attributes (race, gender, age). Even proxy variables (zip code ≈ race) can create disparate impact. Test with fairness metrics.
- Through-the-door vs on-the-books: TTD samples include all applicants; OTB only approved ones. Model purpose determines which sample to use.
- Vintage analysis: Default rates vary by economic conditions. A 2019-trained model may not predict well in a recession. Track model performance by vintage.
- Regulatory requirements: Regulators and standard setters (Basel Committee, OCC, FDIC) have specific requirements for model validation, documentation, and fair lending testing.
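As one illustration of the fairness testing mentioned in the fair lending item above, the sketch below computes an adverse impact ratio: each group's approval rate divided by the approval rate of a reference group. The function names are hypothetical. The 0.8 cutoff is the "four-fifths" heuristic borrowed from US employment-selection guidelines and is used here only as an assumed screening threshold, not as a lending standard stated in this document.

```python
def approval_rate(decisions):
    """decisions: list of 0/1 flags, 1 = approved."""
    return sum(decisions) / len(decisions)


def adverse_impact_ratio(decisions_by_group, reference_group):
    """Approval rate of each group relative to the reference group."""
    ref_rate = approval_rate(decisions_by_group[reference_group])
    return {
        group: approval_rate(decisions) / ref_rate
        for group, decisions in decisions_by_group.items()
        if group != reference_group
    }


def flag_groups(decisions_by_group, reference_group, threshold=0.8):
    """Groups whose ratio falls below the assumed screening threshold."""
    ratios = adverse_impact_ratio(decisions_by_group, reference_group)
    return sorted(g for g, r in ratios.items() if r < threshold)
```

A ratio below the threshold is a prompt for further review, for example checking whether a proxy variable drives the gap. It is not proof of discrimination on its own.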
References
- For WOE binning methodology, see references/woe-binning.md
- For reject inference techniques, see references/reject-inference.md