AI models based on AIRCHECK data
aircheck_test_model
aircheck_test_model is a Python package for training and screening machine learning models on chemical compound datasets.
It provides a Python API (simple train and screen functions) and a Command-Line Interface (CLI) for easy integration in pipelines or local workflows.
The package is designed to work with molecular fingerprints (e.g., ECFP) and chemical structure data in formats such as CSV or Parquet.
✨ Features
- Train ML models with training and optional test datasets
- Save trained models to a specified directory
- Evaluate models on test datasets
- Screen new compounds using trained models
- Simple CLI powered by Typer
📦 Installation
Install from PyPI:
pip install aircheck_test_model
Or install locally for development:
git clone <your-repo-url>
cd aircheck_test_model
pip install -e '.[dev]'
🐍 Python API Usage
After installation, you can import the top-level functions train and screen:
from pathlib import Path
from aircheck_test_model import train, screen
# --- Train models ---
train_result, test_result = train(
    train_file="location of parquet file",
    train_column="ECFP6",
    label="LABEL",
    model_dir="aircheck_test_model/new_model",
    # test_file is optional (default=None)
)

train accepts the training and test datasets as paths to Parquet files. Datasets can be downloaded from our AIRCHECK website.
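Because train takes file paths to Parquet data, it can help to validate a path before starting a training run. The following is a minimal sketch under stated assumptions: `check_parquet_path` is a hypothetical helper that is not part of aircheck_test_model, and relying on the `.parquet` suffix is an assumption about how the files are named.

```python
from pathlib import Path


def check_parquet_path(path_str, must_exist=True):
    """Hypothetical helper: return a Path if it looks like a Parquet file."""
    path = Path(path_str)
    # Assumption: Parquet datasets carry a .parquet suffix.
    if path.suffix.lower() != ".parquet":
        raise ValueError(f"expected a .parquet file, got: {path.name}")
    if must_exist and not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    return path


# must_exist=False only checks the name, so this runs without the data file.
train_path = check_parquet_path("data/WDR91.parquet", must_exist=False)
```

The returned path can then be passed as train_file (converted with str() if the function expects a string).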
# --- Screen compounds ---
result_df = screen(
    screen_file="data/ScreenData1.csv",
    smile_column="SMILES",
    fingerprint_type="ECFP6",
    model_directory="aircheck_test_model/new_model",
)
print(result_df.head())
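screen reads SMILES strings from the column named by smile_column. As a rough sketch of preparing such an input file with the standard library only: the helper name `write_smiles_csv`, the file name and the three example molecules (ethanol, benzene, acetic acid) are illustrative assumptions, and the package may expect additional columns that are not documented here.

```python
import csv
import tempfile
from pathlib import Path


def write_smiles_csv(path, smiles_list, column="SMILES"):
    """Hypothetical helper: write one SMILES string per row under `column`."""
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])  # header row with the SMILES column name
        for smiles in smiles_list:
            writer.writerow([smiles])
    return path


# Example molecules only; these are not AIRCHECK data.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = write_smiles_csv(
        Path(tmp) / "screen_example.csv",
        ["CCO", "c1ccccc1", "CC(=O)O"],
    )
    print(csv_path.read_text())
```

A file written this way could then be passed as screen_file with smile_column="SMILES".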
💻 CLI Usage
The package also provides a command-line tool:
aircheck_test_model --help
🔹 Check Version
aircheck_test_model version
🔹 Train Models
aircheck_test_model train \
  --train-data data/WDR91.parquet \
  --column ECFP6 \
  --label LABEL \
  --model-dir aircheck_test_model/new_model \
  --test-data data/sampled_data_test_1.parquet
Arguments:
- --train-data, -t (required): Path to training data (CSV/Parquet)
- --test-data, -e: Optional path to test data
- --column, -c (required): Feature column (e.g., fingerprint type such as ECFP4, ECFP6)
- --label, -l (required): Label column name
- --model-dir, -m: Directory to save trained models (default: ~/model)
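When the train command is called from a Python pipeline, the documented options can be assembled into an argument list for subprocess. This is only a sketch: `build_train_command` is a hypothetical helper, not something the package provides, and it uses just the flags listed above.

```python
import subprocess


def build_train_command(train_data, column, label, model_dir=None, test_data=None):
    """Hypothetical helper: assemble the documented `train` CLI call."""
    cmd = [
        "aircheck_test_model", "train",
        "--train-data", str(train_data),
        "--column", column,
        "--label", label,
    ]
    # Optional flags are added only when a value is given.
    if model_dir is not None:
        cmd += ["--model-dir", str(model_dir)]
    if test_data is not None:
        cmd += ["--test-data", str(test_data)]
    return cmd


cmd = build_train_command(
    "data/WDR91.parquet",
    "ECFP6",
    "LABEL",
    model_dir="aircheck_test_model/new_model",
)
# To execute it (requires the package to be installed):
# subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting issues with paths that contain spaces.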
🔹 Screen Compounds
aircheck_test_model screen \
  --screen-data data/ScreenData1.csv \
  --column SMILES \
  --fingerprints-column ECFP6 \
  --model-dir aircheck_test_model/new_model
Arguments:
- --screen-data, -s (required): Path to compound data file
- --column, -c (required): Column containing SMILES strings
- --fingerprints-column, -l (required): Fingerprint column name
- --model-dir, -m: Directory where trained models are stored
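To screen several compound files against the same models, one option is to generate a screen command per file. The sketch below only builds the command strings from the options listed above; `screen_commands` is a hypothetical helper, and `data/ScreenData2.csv` is a made-up second file used for illustration.

```python
import shlex


def screen_commands(csv_files, model_dir, smiles_column="SMILES", fingerprint="ECFP6"):
    """Hypothetical helper: yield one shell-ready `screen` command per file."""
    for csv_file in csv_files:
        args = [
            "aircheck_test_model", "screen",
            "--screen-data", str(csv_file),
            "--column", smiles_column,
            "--fingerprints-column", fingerprint,
            "--model-dir", str(model_dir),
        ]
        # shlex.join quotes any argument that needs it for a POSIX shell.
        yield shlex.join(args)


files = ["data/ScreenData1.csv", "data/ScreenData2.csv"]  # second file is illustrative
for line in screen_commands(files, "aircheck_test_model/new_model"):
    print(line)
```

The printed lines can be pasted into a shell script or a job scheduler submission.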
🛠 Development
Run tests and linting locally:
pytest
ruff check .