A comprehensive automated credit risk scorecard modeling library

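AutoScore - Fully Automated Credit Risk Scorecard Modeling Library

AutoScore is a Python library for developing credit scorecard models. It automates the full workflow: data cleaning, binning, feature selection, model training, and scorecard generation.

✨ Key Features

🚀 One-call modeling: go from raw data to a scorecard in a single line of code
📊 Smart binning: automatic optimal binning with support for monotonicity constraints
🔍 Feature selection: several filtering methods (IV, VIF, correlation, stepwise regression)
📈 OOT validation: built-in out-of-time validation
🎯 Scorecard generation: automatically produces a standard scorecard
📋 Model report: generates an Excel model report in one call
💾 Lightweight deployment: exports a lightweight model in JSON format
🔄 Online learning: supports incremental model updates

📦 Installation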

pip install autoscore

Dependencies:

pandas >= 1.0.0
numpy >= 1.18.0
scikit-learn >= 0.22.0
matplotlib >= 3.0.0
openpyxl >= 3.0.0
joblib >= 0.14.0

🚀 Quick Start

Minimal usage

import pandas as pd
from autoscore import AutoScore

# Load the data
df = pd.read_csv('dataset.csv')

# Create a model instance
score = AutoScore(random_state=42)

# Run the whole pipeline in one call
fit_result = score.fit(
    df=df,
    target_col='target',
    exclude_cols=['id', 'create_time']
)

# Inspect the results
print(f"Selected features: {score.selected_features}")
print(f"Test AUC: {score.model_metrics['test']['auc']:.4f}")
print(f"Test KS: {score.model_metrics['test']['ks']:.4f}")

# Generate the report
score.create_report('scorecard_report.xlsx')

# Export the lightweight model
score.export_lightweight('model.json')
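
The quick start prints a test AUC and KS. As a rough illustration of what these two numbers measure, the sketch below computes them from 0/1 labels and predicted bad probabilities in plain Python. It is not AutoScore's internal implementation; the function names `auc` and `ks` and the sample data are assumptions made for this example.

```python
def auc(y_true, y_prob):
    """Probability that a random bad (1) is ranked above a random good (0)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def ks(y_true, y_prob):
    """Largest gap between the cumulative bad and cumulative good shares."""
    n_bad = sum(y_true)
    n_good = len(y_true) - n_bad
    # Walk down the ranking from highest to lowest predicted risk.
    ranked = sorted(zip(y_prob, y_true), reverse=True)
    cum_bad = cum_good = 0
    best = 0.0
    for i, (prob, label) in enumerate(ranked):
        cum_bad += label
        cum_good += 1 - label
        # Only measure the gap once a group of tied probabilities is complete.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != prob
        if last_of_tie:
            best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
    return best


# Made-up labels and probabilities, for illustration only.
y = [0, 0, 1, 1, 0, 1]
p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
print(f"AUC: {auc(y, p):.4f}")
print(f"KS: {ks(y, p):.4f}")
```

Both metrics depend only on how the model ranks observations, so they are unchanged by any monotonic rescaling of the probabilities into scores.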

Using configuration classes (recommended)

from autoscore import (
    AutoScore, DataConfig, BinningConfig, SelectionConfig, ModelConfig
)

# Data configuration
data_cfg = DataConfig(
    test_size=0.3,
    date_col='create_time',
    oot_split_date='2023-10-01',
    exclude_cols=['id']
)

# Binning configuration
bin_cfg = BinningConfig(
    max_bins=5,
    monotonicity_constraints={'duration': 1, 'age': -1}
)

# Feature selection configuration
sel_cfg = SelectionConfig(
    iv_threshold=0.02,
    vif_threshold=5.0,
    selection_method='backward_lr'
)

# Model configuration
model_cfg = ModelConfig(pdo=20, base_score=600)

# Fit the model (df is the DataFrame loaded in the previous example)
score = AutoScore(random_state=42)
fit_result = score.fit(
    df=df,
    target_col='target',
    data_config=data_cfg,
    binning_config=bin_cfg,
    selection_config=sel_cfg,
    model_config=model_cfg
)

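📖 Documentation

Core class: AutoScore

Initialization

score = AutoScore(random_state=42)

Parameter	Type	Default	Description
random_state	int	42	Random seed

fit method parameters

Basic parameters:

Parameter	Type	Default	Description
df	pd.DataFrame	required	Full dataset
target_col	str	'target'	Target column name (1 = bad customer)
exclude_cols	list	[]	Columns to exclude

Data split parameters:

Parameter	Type	Default	Description
test_size	float	0.3	Test set proportion
date_col	str	'create_time'	Date column name
oot_split_date	str	None	OOT split date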
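
To make the three data split parameters concrete, here is a minimal sketch of one way such a split can work. It assumes that rows dated on or after the split date form the OOT set and that the remaining rows are shuffled into train and test; AutoScore's actual boundary handling and sampling are not documented here, and `split_with_oot` is a hypothetical helper.

```python
import random
from datetime import date


def split_with_oot(rows, date_key, oot_split_date, test_size=0.3, seed=42):
    """Return (train, test, oot) lists of row dicts."""
    cutoff = date.fromisoformat(oot_split_date)
    # Assumption: the split date itself belongs to the OOT sample.
    in_time = [r for r in rows if date.fromisoformat(r[date_key]) < cutoff]
    oot = [r for r in rows if date.fromisoformat(r[date_key]) >= cutoff]
    shuffled = in_time[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test], oot


# Ten September rows and three October rows, made up for illustration.
rows = [{"id": i, "create_time": f"2023-09-{i + 1:02d}", "target": i % 2}
        for i in range(10)]
rows += [{"id": 10 + i, "create_time": f"2023-10-{i + 1:02d}", "target": 0}
         for i in range(3)]

train, test, oot = split_with_oot(rows, "create_time", "2023-10-01")
print(len(train), len(test), len(oot))
```

Keeping the OOT rows out of the random split is what lets the OOT metrics reflect performance on a later time period than the one the model was trained on.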

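Binning parameters:

Parameter	Type	Default	Description
max_bins	int	10	Maximum number of bins
min_bin_pct	float	0.05	Minimum share of samples per bin
monotonicity_constraints	dict	None	Monotonicity constraints

Feature selection parameters:

Parameter	Type	Default	Description
iv_threshold	float	0.02	IV filtering threshold
vif_threshold	float	10.0	VIF threshold
selection_method	str	'backward_lr'	Selection method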
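
The `iv_threshold` parameter refers to Information Value (IV), which is built from the Weight of Evidence (WOE) of each bin. The sketch below shows the textbook calculation from per-bin good and bad counts. It is an illustration rather than AutoScore's code: the sign convention (WOE as the log of bad share over good share), the 0.5 adjustment for empty cells, and the name `woe_iv` are all assumptions.

```python
import math


def woe_iv(bins):
    """bins: list of (n_good, n_bad) per bin. Returns (woe_per_bin, iv)."""
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woes, iv = [], 0.0
    for good, bad in bins:
        # Assumption: treat an empty cell as 0.5 so the logarithm stays finite.
        good_share = max(good, 0.5) / total_good
        bad_share = max(bad, 0.5) / total_bad
        woe = math.log(bad_share / good_share)
        woes.append(woe)
        iv += (bad_share - good_share) * woe
    return woes, iv


# Three bins of a made-up feature: (goods, bads) in each bin.
woes, iv = woe_iv([(400, 20), (300, 30), (300, 50)])
print([round(w, 4) for w in woes], round(iv, 4))
print("kept" if iv >= 0.02 else "dropped")
```

Flipping the sign convention changes the sign of each WOE value but leaves IV unchanged, because every term is the product of a share difference and its own log ratio.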

Scorecard parameters:

Parameter	Type	Default	Description
pdo	int	20	Points to double the odds (PDO)
base_score	int	600	Base score
base_odds	int	50	Base odds
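
The three scorecard parameters usually enter through the standard scaling formula score = offset + factor × ln(odds), with factor = pdo / ln 2 and offset = base_score − factor × ln(base_odds). The sketch below applies that formula under the assumption that odds mean good:bad and that higher scores mean lower risk; whether AutoScore uses exactly this convention is not stated here, and the function names are illustrative.

```python
import math


def scaling_constants(pdo=20, base_score=600, base_odds=50):
    """Return (factor, offset) of the linear map from log-odds to score."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def probability_to_score(p_bad, pdo=20, base_score=600, base_odds=50):
    """Convert a predicted bad probability into a score (good:bad odds assumed)."""
    factor, offset = scaling_constants(pdo, base_score, base_odds)
    odds_good = (1 - p_bad) / p_bad
    return offset + factor * math.log(odds_good)


# Odds of 50:1 map to the base score; doubling the odds adds pdo points.
print(round(probability_to_score(1 / 51), 2))
print(round(probability_to_score(1 / 101), 2))
```

With the defaults, a customer at 50:1 good:bad odds receives 600 points and one at 100:1 receives 620.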

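Return value

The fit method returns a dictionary with the following keys:

Key	Description
selected_features	Final selected features
bin_transformer	Binning transformer
model	Trained model
scorecard_transformer	Scorecard transformer
performance	Performance metrics

Configuration classes

DataConfig

data_cfg = DataConfig(
    test_size=0.3,                    # test set proportion
    date_col='create_time',           # date column name
    oot_split_date='2023-10-01',      # OOT split date
    exclude_cols=['id'],              # columns to exclude
    protected_features=['duration']   # protected features
)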

BinningConfig

bin_cfg = BinningConfig(
    max_bins=5,                       # maximum number of bins
    min_bin_pct=0.05,                 # minimum share of samples per bin
    monotonicity_constraints={        # monotonicity constraints
        'duration': 1,                # 1 = increasing
        'age': -1                     # -1 = decreasing
    },
    final_bins=None                   # final binning rules
)

SelectionConfig

sel_cfg = SelectionConfig(
    iv_threshold=0.02,                # IV threshold
    vif_threshold=10.0,               # VIF threshold
    corr_threshold=0.8,               # correlation threshold
    selection_method='backward_lr',   # selection method
    final_features=None               # final feature list
)
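
A correlation threshold such as `corr_threshold=0.8` is typically applied by dropping one feature from every highly correlated pair. The sketch below shows one common greedy variant that keeps the feature with the higher IV. The tie-breaking rule, the helper names, and the sample data are assumptions for illustration; AutoScore's own procedure may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, iv_by_feature, threshold=0.8):
    """columns: dict of name -> values. Visit features from highest to lowest IV
    and keep one only if it is not too correlated with any feature already kept."""
    kept = []
    for name in sorted(columns, key=iv_by_feature.get, reverse=True):
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up data: duration_x2 is an exact multiple of duration.
columns = {
    "duration": [6, 12, 24, 36, 48],
    "duration_x2": [12, 24, 48, 72, 96],
    "age": [50, 23, 41, 35, 29],
}
ivs = {"duration": 0.20, "duration_x2": 0.18, "age": 0.10}
print(drop_correlated(columns, ivs))
```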

ModelConfig

model_cfg = ModelConfig(
    pdo=20,                           # points to double the odds
    base_score=600,                   # base score
    base_odds=50                      # base odds
)

Advanced features

OOT (out-of-time) validation

score = AutoScore(random_state=42)
fit_result = score.fit(
    df=df,
    target_col='target',
    date_col='create_time',
    oot_split_date='2023-10-01'
)

print(f"Train AUC: {score.model_metrics['train']['auc']:.4f}")
print(f"Test AUC: {score.model_metrics['test']['auc']:.4f}")
print(f"OOT AUC: {score.model_metrics['oot']['auc']:.4f}")
print(f"PSI (Train vs OOT): {score.model_metrics['psi_train_oot']:.4f}")
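
The last line above reports a Population Stability Index (PSI) between the training and OOT samples. As a sketch of the usual definition, PSI sums (actual share − expected share) × ln(actual share / expected share) over score bins. The bin edges, the small floor for empty bins, and the function name `psi` are assumptions of this example, not AutoScore's implementation.

```python
import math
from bisect import bisect_right


def psi(expected, actual, edges, eps=1e-6):
    """PSI between two samples of scores, binned by the given edges."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Made-up score samples for illustration.
train_scores = [520, 560, 600, 640, 680, 700]
oot_scores = [500, 540, 560, 600, 620, 660]
print(round(psi(train_scores, oot_scores, edges=[550, 650]), 4))
```

A commonly quoted rule of thumb reads PSI below 0.1 as stable and above 0.25 as a significant shift, though the cutoffs are conventions rather than statistical tests.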

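Monotonicity constraints

constraints = {
    'duration': 1,       # monotonically increasing
    'credit_amount': 0,  # auto-detect
    'age': -1            # monotonically decreasing
}

score.fit(
    df=df,
    target_col='target',
    monotonicity_constraints=constraints
)

Constraint values:

Value	Meaning	Typical use
1	Monotonically increasing	Features where risk rises with the value, such as loan term or debt
-1	Monotonically decreasing	Features where risk falls as the value rises, such as age or income
0	Auto-detect	Features whose trend is uncertain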
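
To illustrate what these constraint values ask of a binning, the sketch below checks whether the bad rates of ordered bins satisfy a constraint. It assumes the constraint applies to the bad rate across bins and reads 0 as "either direction is acceptable"; both are interpretations made for this example, and `is_monotonic` is a hypothetical helper.

```python
def is_monotonic(bad_rates, direction):
    """direction: 1 = non-decreasing, -1 = non-increasing, 0 = either one."""
    pairs = list(zip(bad_rates, bad_rates[1:]))
    increasing = all(a <= b for a, b in pairs)
    decreasing = all(a >= b for a, b in pairs)
    if direction == 1:
        return increasing
    if direction == -1:
        return decreasing
    return increasing or decreasing


# Bad rate per bin, ordered from the lowest to the highest feature value.
print(is_monotonic([0.02, 0.05, 0.09], 1))   # rising risk, constraint 1
print(is_monotonic([0.02, 0.09, 0.05], 0))   # no consistent trend
```

When a binning fails such a check, adjacent bins are usually merged until the trend holds, which is why a constraint can leave a feature with fewer bins than `max_bins`.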

Custom binning

# Get the binning transformer
bin_transformer = fit_result['bin_transformer']

# Set custom binning rules
custom_rules = {
    'duration': [12, 36, 60],
    'age': [25, 35, 45, 55, 65]
}
bin_transformer.set_rules(custom_rules)
# X_train, y_train: the training features and labels
bin_transformer.recalculate_stats(X_train, y_train)

# Visualize
bin_transformer.visualize_binning(['duration', 'age'])
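
The lists in `custom_rules` read as cut points, so three cut points define four bins. The sketch below shows how a value maps to a bin under that reading. Whether AutoScore closes intervals on the left or on the right is not stated here; this example assumes left-closed intervals, and both helper names are hypothetical.

```python
from bisect import bisect_right


def assign_bin(value, cut_points):
    """Index of the bin containing value; intervals are assumed left-closed."""
    return bisect_right(cut_points, value)


def bin_labels(cut_points):
    """Human-readable interval labels for the bins defined by cut_points."""
    edges = [float("-inf")] + list(cut_points) + [float("inf")]
    return [f"[{lo}, {hi})" for lo, hi in zip(edges, edges[1:])]


cuts = [12, 36, 60]
labels = bin_labels(cuts)
for duration in (6, 12, 48, 60):
    print(duration, labels[assign_bin(duration, cuts)])
```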

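Resuming from an intermediate state

# Option 1: pass already-fitted components through pipeline
score2 = AutoScore(random_state=42)
score2.fit(df=df, target_col='target', pipeline=(bin_transformer,))

# Option 2: skip binning by supplying final_bins
# (existing_rules: binning rules you already have)
score.fit(df=df, binning_config=BinningConfig(final_bins=existing_rules))

# Option 3: skip feature selection by supplying final_features
# (my_features: the list of feature names to use)
score.fit(df=df, selection_config=SelectionConfig(final_features=my_features))

Protected features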

score.fit(
    df=df,
    target_col='target',
    protected_features=['duration', 'age'],
    iv_threshold=0.05
)
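
The example above raises `iv_threshold` while protecting two features. Assuming that "protected" means a feature is exempt from the filters, the interaction can be sketched as follows. This is an interpretation for illustration only: `filter_by_iv` here is a standalone hypothetical function, not the `FeatureSelector` method, and the IV values are made up.

```python
def filter_by_iv(iv_by_feature, threshold, protected=()):
    """Keep features whose IV meets the threshold, plus any protected ones."""
    return [
        name for name, iv in iv_by_feature.items()
        if iv >= threshold or name in protected
    ]


ivs = {"duration": 0.03, "age": 0.01, "credit_amount": 0.12, "job": 0.004}
print(filter_by_iv(ivs, threshold=0.05))
print(filter_by_iv(ivs, threshold=0.05, protected=["duration", "age"]))
```

Protection is useful for variables that business or regulatory requirements say must appear in the scorecard even when their statistical signal is weak.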

Lightweight model deployment

# Export
score.export_lightweight('model.json')

# Load and use
from autoscore import LightweightScorer

scorer = LightweightScorer('model.json')
probas = scorer.predict_proba(X_new)
scores = scorer.predict_score(X_new)
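
The appeal of a JSON model is that scoring needs nothing beyond a table lookup. The sketch below illustrates the idea with a hypothetical JSON layout of cut points and points per bin. The layout is invented for this example, and the file actually written by `export_lightweight` may look entirely different.

```python
import json
from bisect import bisect_right

# Hypothetical layout, invented for illustration.
MODEL_JSON = """
{
  "base_points": 500,
  "features": {
    "duration": {"cuts": [12, 36], "points": [40, 25, 5]},
    "age":      {"cuts": [25, 45], "points": [10, 30, 45]}
  }
}
"""


def score_record(model, record):
    """Add the points of the bin each feature value falls into."""
    total = model["base_points"]
    for name, spec in model["features"].items():
        bin_index = bisect_right(spec["cuts"], record[name])
        total += spec["points"][bin_index]
    return total


model = json.loads(MODEL_JSON)
print(score_record(model, {"duration": 24, "age": 50}))
```

A scorer of this shape depends only on the standard library, which is what makes a JSON export convenient for services that cannot carry the full modeling stack.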

Online learning

new_metrics = score.online_fit(
    X_new, y_new,
    retain_binning=True,
    retain_features=True
)
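
One way to picture an incremental update that retains the binning and the feature set is to keep the existing logistic-regression coefficients and refine them with a few gradient steps on the new batch. The sketch below does this with plain stochastic gradient descent on WOE-encoded inputs. It is a generic illustration under those assumptions, not a description of how `online_fit` works; `sgd_update`, `log_loss`, and the data are hypothetical.

```python
import math


def log_loss(weights, bias, X, y):
    """Average negative log-likelihood of a logistic model on (X, y)."""
    total = 0.0
    for x, label in zip(X, y):
        z = bias + sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(y)


def sgd_update(weights, bias, X_new, y_new, lr=0.1, epochs=20):
    """Refine existing coefficients on a new batch; inputs are not modified."""
    w, b = list(weights), bias
    for _ in range(epochs):
        for x, label in zip(X_new, y_new):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - label  # gradient of the log loss with respect to z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# One WOE-encoded feature; the starting coefficients stand in for an existing model.
X_new = [[0.5], [-0.4], [0.9], [-0.8]]
y_new = [1, 0, 1, 0]
old_w, old_b = [0.0], 0.0
new_w, new_b = sgd_update(old_w, old_b, X_new, y_new)
print(log_loss(old_w, old_b, X_new, y_new), log_loss(new_w, new_b, X_new, y_new))
```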

Using components on their own

BinningProcess

from autoscore import BinningProcess

binning = BinningProcess(max_bins=5, min_bin_pct=0.05)
binning.fit(X, y)

print(f"Binning rules: {binning.rules}")
print(f"IV values: {binning._iv_values}")

X_woe = binning.transform(X)

FeatureSelector

from autoscore import FeatureSelector

selector = FeatureSelector()
features_iv = selector.filter_by_iv(iv_df, threshold=0.02)
features_vif = selector.filter_by_vif(X, threshold=10)

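Best practices

Parameter tuning suggestions

Parameter	Recommended value	Notes
max_bins	5-10	Too many bins leads to overfitting
min_bin_pct	0.05-0.1	Ensures each bin has enough samples
iv_threshold	0.02-0.05	Adjust to the dataset
vif_threshold	5-10	Controls multicollinearity
selection_method	'backward_lr'	Recommended

Typical workflow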

from autoscore import (
    AutoScore, DataConfig, BinningConfig, SelectionConfig, ModelConfig
)

# 1. Configure
data_cfg = DataConfig(test_size=0.3, date_col='create_time', oot_split_date='2023-10-01')
bin_cfg = BinningConfig(max_bins=5, monotonicity_constraints={'duration': 1, 'age': -1})
sel_cfg = SelectionConfig(iv_threshold=0.02, vif_threshold=5.0)
model_cfg = ModelConfig(pdo=20, base_score=600)

# 2. Fit
score = AutoScore(random_state=42)
fit_result = score.fit(df=df, target_col='target',
                       data_config=data_cfg, binning_config=bin_cfg,
                       selection_config=sel_cfg, model_config=model_cfg)

# 3. Validate
print(f"Train AUC: {score.model_metrics['train']['auc']:.4f}")
print(f"Test AUC: {score.model_metrics['test']['auc']:.4f}")
if score.dataset_stats.get('oot'):
    print(f"OOT AUC: {score.model_metrics['oot']['auc']:.4f}")

# 4. Report
score.create_report('model_report.xlsx')

# 5. Deploy
score.export_lightweight('model.json')

FAQ

Date format errors

# Correct format: 'YYYY-MM-DD'
oot_split_date='2023-10-01'
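
A quick way to catch a malformed split date before fitting is to parse it with the standard library. The helper below is a hypothetical pre-check and not part of AutoScore.

```python
from datetime import datetime


def is_valid_split_date(text):
    """Return True if text parses as a year-month-day date such as '2023-10-01'."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


for candidate in ("2023-10-01", "2023/10/01", "2023-02-30"):
    print(candidate, is_valid_split_date(candidate))
```

The check rejects both wrong separators and impossible calendar dates, two mistakes that are easy to make when the date comes from a config file or a command-line argument.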

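Feature selection fails

# Option 1: lower the IV threshold
iv_threshold=0.01

# Option 2: specify the features directly
selection_config=SelectionConfig(final_features=['duration', 'age'])

Poor model performance

# Adjust the binning parameters
max_bins=8
min_bin_pct=0.03

# Adjust the monotonicity constraints
monotonicity_constraints={'duration': 1, 'age': -1}

📁 Project structure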

autoscore/
├── autoscore/
│   ├── __init__.py          # package entry point
│   ├── pipeline.py          # main pipeline
│   ├── config.py            # configuration classes
│   ├── binning.py           # binning module
│   ├── selection.py         # feature selection
│   ├── modeling.py          # model training
│   ├── evaluation.py        # model evaluation
│   ├── scorecard.py         # scorecard
│   ├── reporting.py         # report generation
│   ├── lightweight_scorer.py # lightweight scorer
│   ├── validation.py        # data validation
│   ├── eda.py               # exploratory data analysis
│   └── plot_utils.py        # plotting utilities
├── example.py               # complete example
├── readme.md                # documentation
├── setup.py                 # installation config
└── LICENSE                  # license

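📝 Changelog

v1.3.0 (2026-02-24)

Added lightweight model export and deployment
Added online learning
Added protected features
Improved the monotonicity constraint algorithm for binning
Improved documentation and example code

v1.2.0

Added OOT (out-of-time) validation
Added monotonicity constraint settings
Improved the feature selection workflow

v1.1.0

Added configuration classes as a way to pass parameters
Added model report generation
Improved binning visualization

v1.0.0

Initial release
Core functionality implemented

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

🤝 Contributing

Issues and pull requests are welcome!

📧 Contact

Project home: https://github.com/autoscore/autoscore
Issue tracker: https://github.com/autoscore/autoscore/issues

Version: 1.3.0
Last updated: 2026-02-24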
