A package for attention-model pipelines and feature analysis
Project description
DyGAF
DyGAF is a Python package for building attention-based model pipelines and performing feature importance analysis on tabular datasets. It supports both dependent and independent attention models, making it suitable for tasks such as disease classification, feature selection, and general tabular data analysis. The package uses Stratified K-Fold Cross-Validation and integrates seamlessly with TensorFlow and other machine learning libraries.
Features
- Dependent and Independent Attention Models: Train and evaluate models using attention-based mechanisms.
- Stratified K-Fold Cross-Validation: Preserves class proportions in every fold for consistent performance evaluation.
- Feature Importance Analysis: Calculates feature importance from model weights for better interpretability (see the illustrative sketch after this list).
- Seamless Integration: Built on top of TensorFlow, scikit-learn, and XGBoost, with compatibility for Python 3.10.
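To make the ideas above concrete, here is a minimal, purely illustrative sketch of attention-based feature importance with Stratified K-Fold evaluation. It is not DyGAF's internal code; the model architecture, layer names, epoch count, and synthetic data below are assumptions for demonstration only.
# Illustrative sketch only -- not DyGAF's implementation.
# Shows how per-feature attention weights can be averaged into importance scores.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold

def build_attention_model(n_features):
    inputs = tf.keras.Input(shape=(n_features,))
    # Per-feature attention scores, normalised with softmax so they sum to 1
    attention = tf.keras.layers.Dense(n_features, activation="softmax", name="attention")(inputs)
    weighted = tf.keras.layers.Multiply()([inputs, attention])
    hidden = tf.keras.layers.Dense(16, activation="relu")(weighted)
    output = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
    return tf.keras.Model(inputs, output)

# Synthetic binary-classification data, a stand-in for a real tabular dataset
X = np.random.rand(200, 10).astype("float32")
y = np.random.randint(0, 2, size=200)

skf = StratifiedKFold(n_splits=2, shuffle=True, random_state=4)
fold_importances = []
for train_idx, val_idx in skf.split(X, y):
    model = build_attention_model(X.shape[1])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X[train_idx], y[train_idx], epochs=5, verbose=0)
    # Read the attention layer's output on the validation fold and average per feature
    att_model = tf.keras.Model(model.input, model.get_layer("attention").output)
    fold_importances.append(att_model.predict(X[val_idx], verbose=0).mean(axis=0))

print("Mean per-feature attention:", np.mean(fold_importances, axis=0))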
Installation
Follow these steps to install DyGAF in a new conda environment.
Step 1: Create a Conda Environment
Create and activate a new conda environment with Python 3.10 to ensure compatibility with DyGAF's dependencies.
conda create --name dygaf_env python=3.10
conda activate dygaf_env
Step 2: Create requirements.txt
Create a requirements.txt file that pins the versions of DyGAF's core dependencies:
echo "numpy==1.24.0" > requirements.txt
echo "pandas==1.5.3" >> requirements.txt
echo "scikit-learn==1.5.1" >> requirements.txt
echo "tensorflow==2.10.0" >> requirements.txt
echo "xgboost==2.1.1" >> requirements.txt
Step 3: Install Dependencies
Install all dependencies listed in requirements.txt using pip:
pip install -r requirements.txt
Step 4: Install DyGAF
Finally, install DyGAF using pip:
pip install DyGAF
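To verify the installation (inside the dygaf_env environment), you can try importing the package in Python:
# Quick sanity check that the package is importable
from DyGAF import DyGAF
print("DyGAF imported successfully")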
Step 5: Usage
DyGAF can be used either via the command line interface (CLI) or in a Python environment such as Jupyter notebooks.
Command Line Interface (CLI)
Run DyGAF from the command line by specifying the dataset path, target column, random seed, and number of splits for cross-validation.
DyGAF --df_path "/path/to/csvfile.csv" --target_column Target --seed 4 --n_splits 2
Python Environment / Jupyter Notebook
You can also use DyGAF within a Python script or Jupyter notebook for more flexibility in your analysis.
# Import DyGAF function from the package
from DyGAF import DyGAF
# Set parameters for the pipeline
df_path = "/input/csvfile.csv"
target_column = "Target" # Name of the target column
seed = 4 # Seed for reproducibility
n_splits = 5 # Number of splits for Stratified K-Fold
# Run DyGAF with the parameters
features_df, accuracy = DyGAF(df_path, target_column, seed, n_splits)
# Display results
print("Feature Importance (Top 5):")
print(features_df.head())
print(f"Model Accuracy: {accuracy:.4f}")
The output folder will look like this:
output/
│
├── features_importance_seed4.csv # Contains the feature importance analysis
├── output_seed4_dependent_attention_weights.csv # Stores attention weights from the dependent attention model
└── output_seed4_independent_attention_weights.csv # Stores attention weights from the independent attention model
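If you want to work with these outputs programmatically, the CSVs can be read back with pandas (file names shown are for seed=4; adjust them to match your run):
import pandas as pd

# Read the generated outputs back for inspection
importance = pd.read_csv("output/features_importance_seed4.csv")
dep_weights = pd.read_csv("output/output_seed4_dependent_attention_weights.csv")
indep_weights = pd.read_csv("output/output_seed4_independent_attention_weights.csv")

print(importance.head())
print(dep_weights.shape, indep_weights.shape)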
Thank you for using our package!