Credit risk modeling factory: WOE binning, scorecards, LightGBM, Excel reporting.

SuperModelingFactory

A credit risk modeling factory: a complete Python toolchain for credit scorecard development and model management.

Installation

pip install supermodelingfactory

macOS users also need to install the OpenMP runtime (a lightgbm dependency):

brew install libomp

Supported environments: Python 3.10 / 3.11 / 3.12 / 3.13 on macOS arm64, Linux x86_64, and Windows x86_64.

See INSTALL.md for details.
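
As a quick sanity check before installing, the supported combinations listed above can be tested from Python. This is a minimal sketch using only the standard library. The helper name `is_supported` is hypothetical and not part of the package, and the machine strings (`arm64`, `x86_64`, `AMD64`) are what `platform.machine()` typically reports on those platforms.

```python
import platform
import sys

# Supported targets as listed above: (platform.system(), platform.machine()).
SUPPORTED_PLATFORMS = {
    ("Darwin", "arm64"),
    ("Linux", "x86_64"),
    ("Windows", "AMD64"),
}
SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13)}


def is_supported(py_version, system, machine):
    """Return True if this interpreter/platform pair is in the supported list."""
    return (tuple(py_version[:2]) in SUPPORTED_PYTHONS
            and (system, machine) in SUPPORTED_PLATFORMS)


if __name__ == "__main__":
    ok = is_supported(sys.version_info, platform.system(), platform.machine())
    print("supported" if ok else "not in the supported list")
```

On an unsupported combination, pip would have to fall back to the source distribution, which may require a local build toolchain.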

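License

This project is licensed under the Business Source License 1.1, with a Change Date of 2030-06-24. Until then:

  • ✅ Allowed: personal learning, academic research, internal evaluation, prototyping, teaching
  • ❌ Not allowed: any production, commercial, or revenue-generating use

After 2030-06-24 the license automatically converts to Apache 2.0. For commercial licensing, contact the author.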

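The core algorithm modules (22 in total, spread across WOE / Feature / Model / Eval / Sample / Core) are compiled with Cython and distributed as .so / .pyd files; their source can be read in the repository but is not included in the wheels.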

Project overview

SuperModelingFactory integrates the three core capabilities needed across the end-to-end credit risk modeling workflow:

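| Subproject | Role | Core capabilities |
|---|---|---|
| Modeling_Tool | Modeling engine | Binning, WOE encoding, feature analysis, model training and evaluation, sample management |
| ExcelMaster | Reporting engine | Programmatic Excel workbook generation with charts, conditional formatting, and cursor-based streaming writes |
| Report | Report templates | Model performance reports, batch export of WOE plots, multi-model comparison reports |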
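
To make the WOE encoding listed for Modeling_Tool concrete, here is a minimal sketch of Weight of Evidence and Information Value computed from per-bin good/bad counts. It is not the package's implementation. The function name `woe_iv` is hypothetical, the counts are made up, and the sign convention (good share over bad share) is an assumption, since some libraries use the inverse.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the total IV.

    `bins` is a list of (good_count, bad_count) tuples, one per bin.
    Uses WOE = ln(good share / bad share); every bin must contain
    at least one good and one bad record.
    """
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    woes, iv = [], 0.0
    for good, bad in bins:
        good_share = good / total_good
        bad_share = bad / total_bad
        woe = math.log(good_share / bad_share)
        woes.append(woe)
        iv += (good_share - bad_share) * woe
    return woes, iv


# Three bins of a hypothetical feature: (goods, bads). Illustrative counts only.
woes, iv = woe_iv([(100, 40), (300, 40), (600, 20)])
for i, w in enumerate(woes):
    print(f"bin {i}: WOE = {w:+.3f}")
print(f"IV = {iv:.3f}")
```

Under this convention, bins with a negative WOE hold a larger share of bads than of goods, and IV summarizes how strongly the feature separates the two groups.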

Project structure

SuperModelingFactory/
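├── Modeling_Tool/          # Core modeling toolkit
│   ├── Core/               #   Infrastructure: binning, ODPS, utilities, encryption
│   ├── WOE/                #   WOE encoding: binning, transformation, mapping, visualization
│   ├── Feature/            #   Feature analysis: distribution shift, PSI, correlation filtering
│   ├── Model/              #   Model training: LR, LightGBM, XGBoost, variable selection
│   ├── Eval/               #   Model evaluation: gains tables, ROC/KS, performance summaries
│   └── Sample/             #   Sample management: splitting, stratification, reject inference, distribution adaptation
├── ExcelMaster/            # Excel reporting engine
│   ├── ExcelFormatTool.py  #   Format definitions (50+ preset cell formats)
│   ├── ExcelMaster.py      #   Core engine (cursor-based streaming writes, charts, conditional formatting)
│   ├── Template.py         #   Analysis report templates (PVA, Bivar, GridSearch, etc.)
│   └── Utility.py          #   Utilities (colors, paths, PSI report processing, etc.)
└── Report/                 # Model evaluation report templates
    └── Report_Tool.py      #   Performance reports, WOE plots, multi-model comparison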
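
The PSI check handled under Feature/ compares how a variable is distributed in two samples. The sketch below shows the usual Population Stability Index formula on pre-binned counts. It is an illustration only. The function name `psi`, the `eps` floor for empty bins, and the counts are assumptions, not the package's API.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned distributions."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both samples must use the same bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Floor the shares at eps so empty bins do not produce log(0).
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        value += (a_share - e_share) * math.log(a_share / e_share)
    return value


# Illustrative bin counts for a development sample and a later sample.
train_bins = [100, 200, 400, 200, 100]
test_bins = [90, 210, 380, 220, 100]
print(round(psi(train_bins, test_bins), 4))
```

Identical distributions give a PSI of zero, and the value grows as the two samples drift apart.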

Installation from source

Dependencies

# Core dependencies
pip install pandas numpy scipy scikit-learn

# Modeling engine
pip install lightgbm xgboost joblib

# Excel reporting
pip install xlsxwriter openpyxl Pillow matplotlib seaborn

# Optional
pip install pyodps          # Alibaba Cloud MaxCompute connection
pip install imbalanced-learn # SMOTE sampling
pip install tqdm             # Progress bars

Usage

git clone <repo-url>
cd SuperModelingFactory
export PYTHONPATH="${PYTHONPATH}:$(pwd)"
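
After setting PYTHONPATH, it can help to confirm that the three top-level packages are actually discoverable before running anything heavier. This is a small sketch built on the standard library's `importlib.util.find_spec`. The helper name `is_importable` is hypothetical.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a top-level package can be found on the current sys.path."""
    return importlib.util.find_spec(package_name) is not None


for name in ("Modeling_Tool", "ExcelMaster", "Report"):
    status = "found" if is_importable(name) else "NOT found (check PYTHONPATH)"
    print(f"{name}: {status}")
```

The check only locates the packages and does not import them, so it runs even if an optional dependency is still missing.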

Quick start

A typical credit risk modeling workflow

from Modeling_Tool import (
    # Binning
    Binning, super_binning,
    # WOE encoding
    WOE_Master,
    # Feature analysis
    VarExtractionInsights, CorrelationFilter, PSICalculator,
    # Model training
    GradientBoostingModel, LRMaster,
    # Model evaluation
    GainsTableCalculator, PerformanceEvaluator,
    # Sample management
    SampleSplitter, RejectInferrer
)

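# Inputs assumed to exist: data (a pandas DataFrame with an 'is_bad' target column)
# and feature_cols (a list of candidate feature names)
# 1. Split the sample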
splitter = SampleSplitter(test_size=0.3, random_state=42, stratify=True)
train_df, test_df = splitter.split_df(data, target='is_bad')

# 2. WOE binning and encoding
woe_master = WOE_Master(train_data=train_df, varlist=feature_cols, dep='is_bad')
woe_master.fit(nbins=10, equal_freq=True)
train_woe = woe_master.transform(train_df)
test_woe = woe_master.transform(test_df)

# 3. Feature screening
psi_calc = PSICalculator(buckets=10)
psi_result = psi_calc.calculate(expected_df=train_df, current_data=test_df, varlist=feature_cols)

corr_filter = CorrelationFilter(data=train_woe, dep='is_bad')
keep_vars = corr_filter.remove_highly_correlated(feature_cols)

# 4. Model training
model = GradientBoostingModel('lgb', params={'n_estimators': 100, 'learning_rate': 0.1})
model.fit(train_woe[keep_vars], train_woe['is_bad'], test_woe[keep_vars], test_woe['is_bad'])

# 5. Model evaluation
evaluator = PerformanceEvaluator(tgt_name='is_bad', model=model.model, feature_cols=keep_vars)
evaluator.add_dataset('train', train_woe).add_dataset('test', test_woe)
perf_result = evaluator.evaluate()
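
One metric commonly reported at the evaluation step is the KS statistic, which the Eval subpackage lists alongside ROC. The following sketch shows how KS can be computed from scores and labels in plain Python. It is an illustration of the metric, not the output format or internals of PerformanceEvaluator. The function name `ks_statistic` and the sample data are hypothetical.

```python
def ks_statistic(scores, labels):
    """Maximum gap between cumulative bad and good shares, ranked by score.

    `labels` uses 1 for bad and 0 for good; both classes must be present.
    """
    total_bad = sum(labels)
    total_good = len(labels) - total_bad
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    cum_bad = cum_good = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume every record that shares this score before measuring the gap.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_bad += pairs[j][1]
            cum_good += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(cum_bad / total_bad - cum_good / total_good))
        i = j
    return best


# Illustrative scores (higher means riskier) and outcomes.
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0, 0]
print(ks_statistic(scores, labels))
```

Grouping tied scores before measuring the gap keeps the result independent of how ties happen to be ordered.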

Generating a report with ExcelMaster

from ExcelMaster.ExcelMaster import ExcelMaster

em = ExcelMaster('model_report.xlsx')
ws = em.add_worksheet('Performance')

# Stream a DataFrame into the sheet at the current cursor position
em.write_dataframe(ws, perf_result, title='Model performance summary', titleformat='BLUE_H2')
em.insert_image(ws, 'roc_curve.png', figScale=(600, 400))

em.close_workbook()
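
The idea behind cursor-based streaming writes is that the writer remembers the next free row, so successive blocks stack down the sheet without the caller computing cell addresses. The toy class below sketches that idea in memory. `SheetCursor` and `write_table` are hypothetical names, the values are placeholders, and nothing here reflects how ExcelMaster is actually implemented.

```python
class SheetCursor:
    """Toy in-memory sheet that tracks the next free row (hypothetical)."""

    def __init__(self):
        self.cells = {}  # (row, col) -> value
        self.row = 0     # next row to write to

    def write_table(self, header, rows, title=None, gap=1):
        """Write an optional title, a header and data rows at the cursor."""
        if title is not None:
            self.cells[(self.row, 0)] = title
            self.row += 1
        for line in [header, *rows]:
            for col, value in enumerate(line):
                self.cells[(self.row, col)] = value
            self.row += 1
        self.row += gap  # leave blank rows before the next block
        return self


# Placeholder values, for illustration only.
sheet = SheetCursor()
sheet.write_table(["dataset", "KS"], [["train", 0.41], ["test", 0.39]], title="Performance")
sheet.write_table(["variable", "PSI"], [["age", 0.02]])
print(sheet.row)
```

Each call starts where the previous one ended, which is what lets a report be assembled as a sequence of writes.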

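Architecture

Dependency direction

                    ┌─────────┐
                    │  Core   │  (infrastructure, no cross-package dependencies)
                    └────┬────┘
           ┌─────────┬───┼───────┬─────────┐
           ▼         ▼   ▼       ▼         ▼
         WOE      Model  Eval  Feature   Sample
           │         │              │        │
           └─────────┴──────────────┴────────┘
      (each depends one-way on Core; imports between modules are deferred)
  • Core is the foundation for every subpackage and depends on none of them
  • The other subpackages avoid circular dependencies through deferred imports (import statements inside function bodies)
  • The top-level Modeling_Tool/__init__.py exposes a curated, unified API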
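
The deferred-import pattern mentioned above can be sketched as follows. The standard-library `statistics` module stands in for a sibling subpackage, and the function name `summarize` is hypothetical. The point is only where the import statement sits.

```python
import sys


def summarize(values):
    """Return the mean and population standard deviation of `values`."""
    # Deferred import: `statistics` is loaded on the first call, not when
    # the module that defines `summarize` is itself imported.
    import statistics

    return statistics.mean(values), statistics.pstdev(values)


mean, spread = summarize([1, 2, 3, 4])
print(mean, spread, "statistics" in sys.modules)
```

Because the import runs at call time, two modules that use each other this way can both finish initializing before either one touches the other, which is how the pattern sidesteps circular imports.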

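Naming conventions

  • All public APIs are exported through __init__.py, so callers only need from Modeling_Tool import ...
  • Class names use PascalCase; function names use snake_case
  • Functions and methods starting with _ are internal and not part of the public API

Continuous integration

This repository's GitHub Actions workflow (.github/workflows/tests.yml) runs pytest automatically on pushes to main and on pull requests, with the following matrix: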

  • Python: 3.11, 3.12
  • Dependency matrix:
    • legacy: numpy<2 + scipy<1.13 + lightgbm<4
    • modern: numpy>=2 + scipy>=1.13 + lightgbm>=4
  • Shared constraint: pandas>=2.0,<2.3 (to be relaxed once issue #2 is fixed)
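
To reproduce one cell of this matrix locally, it helps to list the jobs explicitly. The sketch below expands the matrix in Python, assuming it is a full cross product of Python versions and dependency sets (the workflow file itself is not shown here, so that is an assumption). The names `expand_matrix`, `DEPENDENCY_SETS`, and `COMMON` are hypothetical.

```python
from itertools import product

PYTHON_VERSIONS = ["3.11", "3.12"]
DEPENDENCY_SETS = {
    "legacy": ["numpy<2", "scipy<1.13", "lightgbm<4"],
    "modern": ["numpy>=2", "scipy>=1.13", "lightgbm>=4"],
}
COMMON = ["pandas>=2.0,<2.3"]


def expand_matrix():
    """Yield (python_version, set_name, requirements) for every job."""
    for py, (name, reqs) in product(PYTHON_VERSIONS, DEPENDENCY_SETS.items()):
        yield py, name, reqs + COMMON


for py, name, reqs in expand_matrix():
    quoted = " ".join(f'"{r}"' for r in reqs)
    print(f"python {py} / {name}: pip install {quoted}")
```

The requirement strings are quoted in the printed command because `<` and `>` would otherwise be interpreted by the shell.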

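The test cases live in a separate private repository, SuperModelingFactory_pytest; the workflow clones it across repositories using secrets.PYTEST_REPO_TOKEN.

Configuring the PAT (one-time setup)

  1. Go to GitHub Settings · Tokens (classic) and generate a new token with the repo scope checked (read-only access to the private repository is all that is needed)
  2. In this repository, go to Settings → Secrets and variables → Actions → New repository secret
  3. Name: PYTEST_REPO_TOKEN, Value: paste the token

If you use a fine-grained PAT instead, grant it the Contents: Read permission on the SuperModelingFactory_pytest repository.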

Version

  • Version: 1.0.0
  • Author: Jingkai Sun

License

Internal project, for team use only.
