A streamlined machine learning pipeline for QSAR and regression tasks with SHAP interpretability.

chrom_qsar

Requirements

Before running this library, make sure Python 3.8+ is installed, then install the following dependencies:

pip install pandas numpy scikit-learn xgboost lightgbm optuna shap matplotlib seaborn joblib openpyxl
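Before starting a long tuning run, it can help to confirm that every dependency is importable. The following is a minimal sketch and not part of chrom_qsar; the helper name `missing_packages` is hypothetical. Note that the scikit-learn package is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above
# (scikit-learn installs the module "sklearn").
REQUIRED = [
    "pandas", "numpy", "sklearn", "xgboost", "lightgbm",
    "optuna", "shap", "matplotlib", "seaborn", "joblib", "openpyxl",
]


def missing_packages(names):
    """Return the names from `names` that cannot be found as top-level modules."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```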

Quick Start

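Step 1: Prepare the data

Make sure your Excel data file is in the same directory as the script you run. Default data layout: column 1 of the Excel sheet is the target variable (Target/Y), and column 2 onward are the feature variables (Features/X). The first row must contain the column names.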
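The default layout can be illustrated with a small pandas table. This is a sketch only: the column names (`logk`, `MW`, `logP`) and values are made up for illustration, and the slicing shown here mirrors the layout described above rather than reproducing the library's own loading code.

```python
import pandas as pd

# Toy table in the default layout: target first, features after.
# Column names and values are invented for illustration.
df = pd.DataFrame({
    "logk": [1.2, 0.8, 1.5],
    "MW":   [180.2, 151.2, 206.3],
    "logP": [1.4, 0.5, 3.1],
})

y = df.iloc[:, 0]    # column 1 -> target (Y)
X = df.iloc[:, 1:]   # column 2 onward -> features (X)

print(y.name)            # logk
print(list(X.columns))   # ['MW', 'logP']
```

Writing such a table with `df.to_excel("data.xlsx", index=False)` (which needs openpyxl) keeps the column names in the first row and avoids an extra index column in front of the target.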

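Step 2: Train and optimize models

Create main.py and run it. The library then automatically performs data cleaning, Optuna hyperparameter tuning for 13 models, test-set evaluation, and SHAP analysis, and saves the model files.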

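# main.py
# Assumes QSARModelTrainer is exported at the top level of the chrom_qsar package.
from chrom_qsar import QSARModelTrainer

if __name__ == '__main__':
    trainer = QSARModelTrainer(
        data_path='YOUR FILEPATH',    # path to the Excel data file
        out_dir='training_results',   # directory for results and saved models
        n_trials=200                  # number of Optuna trials (100~500)
    )

    # Run the full pipeline and collect the model summary
    summary_df = trainer.run()
    print("\nTraining complete. Final model ranking:\n", summary_df)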

Step 3: Batch analysis and in-depth visualization

If you have already trained models, or want to re-analyze them on a new test set, use QSARModelAnalyzer.

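# main_analyze.py
# Assumes QSARModelAnalyzer is exported at the top level of the chrom_qsar package.
from chrom_qsar import QSARModelAnalyzer

if __name__ == '__main__':
    analyzer = QSARModelAnalyzer(
        data_path='YOUR FILEPATH',          # Excel file to analyze
        model_dir='training_results',       # directory holding the trained models
        out_dir='batch_analysis_results'    # directory for the analysis output
    )
    # This only constructs the analyzer; the method that starts the analysis is not shown in this README.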

Advanced Usage

  • To add a new model, add an entry to the dictionary in get_model_configs() in chrom_qsar/models.py.
  • If the columns in your Excel sheet are in a different order, modify the load_and_clean_data function in chrom_qsar/data.py.
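As an alternative to editing the library, the table can be reordered before training so that the target column comes first. The sketch below is not the library's method; `move_target_first` is a hypothetical helper, and the column names are invented for illustration.

```python
import pandas as pd


def move_target_first(df: pd.DataFrame, target_col: str) -> pd.DataFrame:
    """Return a copy of `df` with `target_col` moved to the first column."""
    if target_col not in df.columns:
        raise KeyError(f"Column {target_col!r} not found")
    others = [c for c in df.columns if c != target_col]
    return df[[target_col] + others].copy()


# Example: the target "logk" is stored in the last column.
raw = pd.DataFrame({"MW": [180.2, 151.2], "logP": [1.4, 0.5], "logk": [1.2, 0.8]})
fixed = move_target_first(raw, "logk")
print(list(fixed.columns))  # ['logk', 'MW', 'logP']
```

The reordered table can then be saved with `fixed.to_excel("reordered.xlsx", index=False)` and passed as `data_path`, which leaves the installed package untouched.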

Notes

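  • Chinese text rendering: the library sets plt.rcParams['font.sans-serif'] = ['SimHei', 'Arial'] internally. If your operating system (e.g. macOS or Linux) does not have the SimHei font installed, Chinese characters in the figures may be rendered as empty boxes. Replace it with a Chinese font available on your system (e.g. PingFang SC or WenQuanYi Micro Hei).
  • SHAP computation time: with PermutationExplainer or KernelExplainer, computation can be slow when the number of features or samples is large. The library sets a max_evals limit by default to balance speed and accuracy.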
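For the font issue, one option is to pick the first installed font from a list of candidates. This is a sketch under assumptions: `pick_font` and `apply_cjk_font` are hypothetical helpers, and whether a setting made this way survives depends on when chrom_qsar applies its own rcParams, which this README does not specify.

```python
def pick_font(candidates, available):
    """Return the first candidate present in `available`, or None."""
    available = set(available)
    for name in candidates:
        if name in available:
            return name
    return None


def apply_cjk_font(candidates=("SimHei", "PingFang SC", "WenQuanYi Micro Hei")):
    """Point matplotlib at the first installed font from `candidates`."""
    # Imported here so pick_font can be used without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib import font_manager

    installed = {f.name for f in font_manager.fontManager.ttflist}
    chosen = pick_font(candidates, installed)
    if chosen is not None:
        plt.rcParams["font.sans-serif"] = [chosen, "Arial"]
        # Use an ASCII hyphen for minus signs, since some CJK fonts lack U+2212.
        plt.rcParams["axes.unicode_minus"] = False
    return chosen
```

Calling `apply_cjk_font()` returns the chosen font name, or None when none of the candidates is installed, in which case the rcParams are left unchanged.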
