
A tool for analyzing rule effectiveness in credit risk management


rulelift

A Python toolkit for analyzing the effectiveness of rules in credit risk management.

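Features

rulelift helps you analyze how effective your credit risk rules are. It provides:

  • Estimated metrics based on the bad rate of each user's rating (USER_LEVEL_BADRATE)
  • Actual metrics based on observed overdue outcomes (USER_TARGET)
  • Core metrics: hit rate, overdue rate, recall, precision, and lift
  • Custom field mapping
  • Structured analysis output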

Installation

pip install rulelift

Quick start

1. Load the example data

from rulelift import load_example_data

# Load the example data
df = load_example_data()

# Inspect the data structure
df.head()

2. Analyze rule effectiveness

from rulelift import analyze_rules

# Analyze rule effectiveness
result = analyze_rules(df)

# Inspect the analysis results
print(result.head())

# Sort by lift, highest first
result_sorted = result.sort_values(by='actual_lift', ascending=False)
print(result_sorted.head())

API documentation

analyze_rules

def analyze_rules(rule_score, rule_col='RULE', user_id_col='USER_ID', 
                 user_level_badrate_col='USER_LEVEL_BADRATE', user_target_col='USER_TARGET')

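Parameters

  • rule_score: DataFrame, records of the customers intercepted (hit) by each rule
  • rule_col: str, name of the rule-name column, default 'RULE'
  • user_id_col: str, name of the user ID column, default 'USER_ID'
  • user_level_badrate_col: str, name of the column holding the bad rate of the user's rating, default 'USER_LEVEL_BADRATE'
  • user_target_col: str, name of the column holding the user's actual overdue outcome, default 'USER_TARGET'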

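Return value

  • DataFrame containing the evaluation metrics for every rule:
    • rule: rule name
    • hit_rate_pred: estimated hit rate, based on rating bad rates
    • estimated_badrate_pred: estimated overdue rate, based on rating bad rates
    • estimated_recall_pred: estimated recall, based on rating bad rates
    • estimated_precision_pred: estimated precision, based on rating bad rates
    • estimated_lift_pred: estimated lift, based on rating bad rates
    • hit_rate: hit rate, based on actual overdue outcomes
    • actual_badrate: actual overdue rate, based on actual overdue outcomes
    • actual_recall: actual recall, based on actual overdue outcomes
    • actual_precision: actual precision, based on actual overdue outcomes
    • actual_lift: actual lift, based on actual overdue outcomes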
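
The README does not spell out how the estimated (`_pred`) columns are derived from USER_LEVEL_BADRATE. One plausible reading, shown here purely as an assumption and not as rulelift's actual implementation, is that each user's rating bad rate stands in for the unknown outcome, so the estimated overdue rate of a group is the mean of its rating bad rates. The user IDs, values, and helper name below are hypothetical.

```python
# Hypothetical population: USER_ID -> bad rate of the user's rating
# (the USER_LEVEL_BADRATE column). Values are made up for illustration.
level_badrate = {
    "u1": 0.20, "u2": 0.10, "u3": 0.10,
    "u4": 0.02, "u5": 0.02, "u6": 0.02,
}
# Users hit by one hypothetical rule.
hit_users = {"u1", "u2"}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: the expected overdue rate of a group is the average
# rating bad rate of its members.
est_hit_badrate = mean(level_badrate[u] for u in hit_users)
est_overall_badrate = mean(level_badrate.values())

# Estimated lift compares the rule-hit group with the whole population.
est_lift = est_hit_badrate / est_overall_badrate

print(f"estimated overdue rate of hit users: {est_hit_badrate:.4f}")
print(f"estimated overall overdue rate:      {est_overall_badrate:.4f}")
print(f"estimated lift:                      {est_lift:.4f}")
```

Under this reading, the estimated metrics can be computed before any real overdue outcome is observed, which may be why they are reported alongside the actual ones.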

load_example_data

def load_example_data(file_path='./data/hit_rule_info.csv')

Parameters

  • file_path: str, path to the example data file, default './data/hit_rule_info.csv'

Return value

  • DataFrame containing the example data

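Example data structure

The example data contains the following fields:

Field               Description
RULE                Rule name
USER_ID             User ID
HIT_DATE            Date the rule was hit
USER_LEVEL          User rating
USER_LEVEL_BADRATE  Bad rate associated with the user's rating
USER_TARGET         Whether the user became overdue (1 = overdue, 0 = not overdue)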
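
To make the schema concrete, here is a small sketch that writes a few rows in this layout as CSV text. Only the field names come from the table above; the rule names, user IDs, dates, rating labels, and bad-rate values are invented, and the real file's value formats may differ.

```python
import csv
import io

# Field names taken from the documented schema.
FIELDS = ["RULE", "USER_ID", "HIT_DATE", "USER_LEVEL",
          "USER_LEVEL_BADRATE", "USER_TARGET"]

# Hypothetical rows: one row per (rule, user) hit. All values are made up.
rows = [
    {"RULE": "rule_a", "USER_ID": "u001", "HIT_DATE": "2024-01-05",
     "USER_LEVEL": "A", "USER_LEVEL_BADRATE": 0.02, "USER_TARGET": 0},
    {"RULE": "rule_a", "USER_ID": "u002", "HIT_DATE": "2024-01-06",
     "USER_LEVEL": "C", "USER_LEVEL_BADRATE": 0.20, "USER_TARGET": 1},
    {"RULE": "rule_b", "USER_ID": "u002", "HIT_DATE": "2024-01-06",
     "USER_LEVEL": "C", "USER_LEVEL_BADRATE": 0.20, "USER_TARGET": 1},
]

# Serialize to CSV in memory so the example has no file-system side effects.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(csv_text)
```

Note that the same user can appear under more than one rule, as u002 does here.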

Metric definitions

Hit rate

  • Definition: number of samples hit by the rule / total number of samples
  • Meaning: the share of samples the rule covers
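
As a worked illustration of this definition, the sketch below computes the hit rate for two hypothetical rules. It assumes the total population is known as a separate set of user IDs; how rulelift itself determines the denominator is not stated here, and the function name is illustrative.

```python
# Hypothetical hit records: one (rule, user) pair per row.
hits = [
    ("rule_a", "u1"), ("rule_a", "u2"),
    ("rule_b", "u2"), ("rule_b", "u3"), ("rule_b", "u4"),
]
# Assumption: the full population of ten users is known separately.
all_users = {f"u{i}" for i in range(1, 11)}


def hit_rate(rule, hits, all_users):
    """Share of the population hit by `rule` (distinct users only)."""
    hit_users = {user for r, user in hits if r == rule}
    return len(hit_users) / len(all_users)


rate_a = hit_rate("rule_a", hits, all_users)  # 2 of 10 users
rate_b = hit_rate("rule_b", hits, all_users)  # 3 of 10 users
print(rate_a, rate_b)
```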

Overdue rate

  • Definition: number of overdue samples / total number of samples
  • Meaning: the overall overdue level of the samples

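Recall

  • Definition: number of overdue samples hit by the rule / total number of overdue samples
  • Meaning: the share of overdue samples the rule is able to identify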
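
A minimal sketch of the recall definition using plain sets; the user IDs are hypothetical and the function is illustrative, not part of rulelift.

```python
# Users hit by one hypothetical rule.
hit_users = {"u1", "u2", "u3"}
# Users who actually became overdue (USER_TARGET == 1) in the whole population.
overdue_users = {"u2", "u3", "u7", "u9"}


def recall(hit_users, overdue_users):
    """Overdue users caught by the rule / all overdue users."""
    if not overdue_users:
        return 0.0  # no overdue users at all: define recall as 0
    return len(hit_users & overdue_users) / len(overdue_users)


rule_recall = recall(hit_users, overdue_users)  # 2 of 4 overdue users are hit
print(rule_recall)
```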

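Precision

  • Definition: number of overdue samples hit by the rule / number of samples hit by the rule
  • Meaning: the share of rule-hit samples that actually became overdue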
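
Precision uses the same numerator as recall but divides by the rule's own hit count. The sketch below contrasts a broad rule with a narrow one on made-up data; the names and values are hypothetical.

```python
# Users who actually became overdue (USER_TARGET == 1). Values are made up.
overdue_users = {"u2", "u3", "u7", "u9"}

# Two hypothetical rules: one broad, one narrow.
broad_rule_hits = {"u1", "u2", "u3", "u4", "u5", "u6"}
narrow_rule_hits = {"u2", "u3", "u8"}


def precision(hit_users, overdue_users):
    """Overdue users among the rule's hits / all users hit by the rule."""
    if not hit_users:
        return 0.0  # rule hit nobody: define precision as 0
    return len(hit_users & overdue_users) / len(hit_users)


broad_precision = precision(broad_rule_hits, overdue_users)    # 2 of 6 hits
narrow_precision = precision(narrow_rule_hits, overdue_users)  # 2 of 3 hits
print(f"{broad_precision:.3f} {narrow_precision:.3f}")
```

Both rules catch the same two overdue users, so their recall is equal; the narrow rule simply rejects fewer good users to do so.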

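Lift

  • Definition: overdue rate of the samples hit by the rule / overdue rate of all samples
  • Meaning: how many times more concentrated overdue samples are among the rule's hits than in the population as a whole; the larger the value, the more effective the rule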
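
The following sketch puts the definition together for two hypothetical rules. It is a plain-Python illustration of the formula, not rulelift's implementation, and it assumes the overdue outcome of every user in the population is known.

```python
from collections import defaultdict

# Hypothetical population: USER_ID -> USER_TARGET (1 = overdue, 0 = not overdue).
targets = {
    "u1": 1, "u2": 1, "u3": 0, "u4": 0, "u5": 0,
    "u6": 0, "u7": 1, "u8": 0, "u9": 0, "u10": 0,
}
# Hypothetical hit records: one (rule, user) pair per row.
hits = [
    ("rule_a", "u1"), ("rule_a", "u2"), ("rule_a", "u3"),
    ("rule_b", "u3"), ("rule_b", "u4"), ("rule_b", "u7"), ("rule_b", "u8"),
]

# Overdue rate of all samples: 3 overdue users out of 10.
overall_badrate = sum(targets.values()) / len(targets)

# Group the distinct users hit by each rule.
users_by_rule = defaultdict(set)
for rule, user in hits:
    users_by_rule[rule].add(user)

# Lift = overdue rate among the rule's hits / overall overdue rate.
lift = {}
for rule, users in users_by_rule.items():
    hit_badrate = sum(targets[u] for u in users) / len(users)
    lift[rule] = hit_badrate / overall_badrate

for rule in sorted(lift):
    print(f"{rule}: lift = {lift[rule]:.2f}")
```

A lift above 1 (rule_a here) means the rule's hits are riskier than the population average; a lift below 1 (rule_b) means the rule is hitting users who are, on average, less risky than the population.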

License

MIT License

Author

Author Name author@example.com

