A streamlined machine learning pipeline for QSAR and regression tasks with SHAP interpretability.
chrom_qsar
Requirements
Before running this library, make sure Python 3.8+ is installed, then install the following dependencies:
pip install pandas numpy scikit-learn xgboost lightgbm optuna shap matplotlib seaborn joblib openpyxl
Quick Start
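Step 1: Prepare the data
Make sure your Excel data file is in the same directory as the script you run. Default data format: column 1 of the Excel sheet is the target variable (Target/Y), and column 2 onward are the feature variables (Features/X). The first row should contain the column names.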
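The default layout can be checked before training with a few lines of pandas. This is a minimal sketch, not the library's own loader: the helper name split_target_features and the column names are hypothetical, and the in-memory DataFrame stands in for the result of pd.read_excel on your file.

```python
import pandas as pd


def split_target_features(df: pd.DataFrame):
    """Split a table laid out as: column 1 = target, columns 2+ = features."""
    if df.shape[1] < 2:
        raise ValueError("expected one target column and at least one feature column")
    y = df.iloc[:, 0]   # first column: target (Y)
    X = df.iloc[:, 1:]  # remaining columns: features (X)
    return y, X


# Stand-in for pd.read_excel("your_data.xlsx"); values are made up for illustration.
df = pd.DataFrame({
    "retention_time": [1.2, 3.4, 2.8],
    "logP": [0.5, 2.1, 1.7],
    "mol_weight": [180.2, 250.3, 210.1],
})

y, X = split_target_features(df)
print(y.name)           # retention_time
print(list(X.columns))  # ['logP', 'mol_weight']
```

If the printed target name is not the column you intend to predict, the file does not match the default layout.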
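Step 2: Train and optimize models
Create main.py and run it. The library automatically performs data cleaning, Optuna hyperparameter tuning of 13 models, test-set evaluation, and SHAP analysis, then saves the model files.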
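# main.py
# Assumes QSARModelTrainer is exported at the package top level.
from chrom_qsar import QSARModelTrainer

if __name__ == '__main__':
    trainer = QSARModelTrainer(
        data_path='YOUR FILEPATH',    # path to the Excel data file
        out_dir='training_results',   # directory where results and model files are written
        n_trials=200                  # Optuna trials; 100~500
    )
    summary_df = trainer.run()
    print("\nTraining complete. Final model ranking:\n", summary_df)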
Step 3: Batch analysis and in-depth visualization
If you have already trained models, or want to re-analyze them on a new test set, use QSARModelAnalyzer.
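# main_analyze.py
# Assumes QSARModelAnalyzer is exported at the package top level.
from chrom_qsar import QSARModelAnalyzer

if __name__ == '__main__':
    analyzer = QSARModelAnalyzer(
        data_path='YOUR FILEPATH',          # Excel file to analyze
        model_dir='training_results',       # directory containing the trained models
        out_dir='batch_analysis_results'    # directory for the analysis output
    )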
Advanced Usage
- To add a new model, add an entry to the get_model_configs() dictionary in chrom_qsar/models.py.
- If the columns of your Excel sheet are in a different order, modify the load_and_clean_data function in chrom_qsar/data.py.
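An alternative to editing the library, when only the column order differs, is to reorder the table yourself so that the target comes first. The sketch below assumes nothing about load_and_clean_data beyond the default layout described in Step 1; the helper name move_target_first and the column names are hypothetical.

```python
import pandas as pd


def move_target_first(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Return a copy of df whose first column is `target`; other columns keep their order."""
    if target not in df.columns:
        raise KeyError(f"column {target!r} not found")
    ordered = [target] + [c for c in df.columns if c != target]
    return df[ordered]


# Stand-in for a sheet whose target column is last; values are made up.
df = pd.DataFrame({
    "logP": [0.5, 2.1],
    "mol_weight": [180.2, 250.3],
    "retention_time": [1.2, 3.4],
})

reordered = move_target_first(df, "retention_time")
print(list(reordered.columns))  # ['retention_time', 'logP', 'mol_weight']

# To save the result for training (requires openpyxl):
# reordered.to_excel("data_reordered.xlsx", index=False)
```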
Notes
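- Chinese text rendering: the library sets plt.rcParams['font.sans-serif'] = ['SimHei', 'Arial'] internally. If your operating system (e.g. macOS or Linux) does not have the SimHei font installed, Chinese characters in charts may render as empty boxes. Replace it with a Chinese font available on your system (e.g. PingFang SC or WenQuanYi Micro Hei).
- SHAP computation time: with PermutationExplainer or KernelExplainer, computation can be slow when there are many features or many samples. The library sets a max_evals limit by default to balance speed and accuracy.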
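To find out which of the fonts named above is actually installed, matplotlib's font manager can be queried. This is a sketch under assumptions: the helper pick_installed_font is hypothetical, and whether a setting made in your own script survives depends on when the library applies its rcParams, which is not documented here.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so this also runs without a display

import matplotlib.pyplot as plt
from matplotlib import font_manager


def pick_installed_font(candidates, installed):
    """Return the first candidate font name present in `installed`, or None."""
    for name in candidates:
        if name in installed:
            return name
    return None


# Family names of every font matplotlib has discovered on this system.
installed = {font.name for font in font_manager.fontManager.ttflist}
candidates = ["SimHei", "PingFang SC", "WenQuanYi Micro Hei"]

chosen = pick_installed_font(candidates, installed)
if chosen is not None:
    plt.rcParams["font.sans-serif"] = [chosen, "Arial"]
print("CJK font in use:", chosen)
```

If the result is None, none of the three fonts is installed and one of them needs to be added to the system before Chinese labels will render.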