Run production-ready ML, DL, and fusion pipelines at scale, without writing a single line of code. Refrakt makes research effortless.
About
refrakt_core is a modular deep learning and machine learning research framework for computer vision, designed for rapid experimentation, extensibility, and reproducibility. It now features a robust, thread-safe registry system, dynamic dataset handling, advanced image resizing, flexible hyperparameter overrides, and comprehensive logging and testing. Refrakt supports both classic and modern CV/ML papers, and enables seamless ML/DL/fusion pipelines.
This project aims to unify, extend, and visualize foundational and modern architectures through clean code, clear abstractions, and rigorous logging.
Key Features
- Safe Registry System: Thread-safe, import-safe, decorator-based registration for models, datasets, losses, trainers, and transforms. Backward compatible with legacy code.
- Dynamic Dataset Loader: Load datasets from custom zip files or torchvision, with automatic format detection (GAN, supervised, contrastive) and size validation.
- Standard Image Resizer/Transforms: Multiple resize strategies (maintain aspect, crop, stretch), size validation, and tensor/PIL support.
- Hyperparameter Overrides: Override any config parameter from the command line or programmatically for fast experimentation.
- Improved Logging: Context-aware logging with better error handling, supporting both TensorBoard and Weights & Biases (W&B).
- Comprehensive Testing: Smoke, sanity, unit, and integration tests for all major features.
- ML/DL/Fusion Pipelines: Support for pure-ML, pure-DL, and hybrid fusion pipelines (e.g., deep feature extraction + ML fusion head).
- Modular YAML Configs: All components (model, trainer, loss, optimizer, scheduler, feature engineering) are defined in modular YAML files.
Implemented Papers
- Vision Transformer (ViT): An Image is Worth 16x16 Words
- ResNet: Deep Residual Learning for Image Recognition
- Autoencoders: Learning Representations via Reconstruction
- Swin Transformer: Hierarchical Vision Transformer with Shifted Windows
- Attention is All You Need
- ConvNeXt: A ConvNet for the 2020s
- SRGAN: Photo-Realistic Single Image Super-Resolution with GANs
- SimCLR: A Simple Framework for Contrastive Learning
- DINO: Self-Supervised Vision Transformers
- MAE: Masked Autoencoders
- MSN: Masked Siamese Networks
Setup
# For pip install
pip install refrakt_core
# Manual setup
git clone https://github.com/refrakt-hub/refrakt_core.git
cd refrakt_core
# Create and activate a virtual environment
conda create -n refrakt python=3.10 -y
conda activate refrakt
# Install dependencies
pip install -r requirements.txt
GPU/cuML Support
If you want to use GPU-accelerated ML features (cuML), you must manually install the required dependencies after the main install. Run one of the following scripts from the project root:
# For bash users:
./install_cuml.sh
# For fish shell users:
./install_cuml.fish
This will install the appropriate cuML and RAPIDS libraries for your environment. If you do not need GPU/cuML support, you can skip this step.
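Before selecting a cuML-backed component, it can help to confirm that the optional dependency is actually importable in the active environment. The snippet below is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of Refrakt.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Report whether the optional GPU ML dependency is present.
backend = "cuml" if has_module("cuml") else "cpu-only"
print(f"ML backend available: {backend}")
```

If this prints `cpu-only` after running one of the install scripts, the conda environment that was activated is probably not the one the script installed into.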
Running Experiments
# Run with a config file
python -m refrakt_core.api --config refrakt_core/config/vit.yaml
# Or using the CLI
refrakt --config ./refrakt_core/config/resnet.yaml
# Override hyperparameters on-the-fly
python -m refrakt_core.api.train \
config.optimizer.lr=0.0005 \
config.trainer.epochs=20
Supported CLI Flags
| Flag | Description |
|---|---|
| --config | Path to YAML config file |
| --log_type | Logging backend: tensorboard, wandb, or both |
| --debug | Enable debug mode with extra verbosity |
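To make the flags above concrete, here is a rough sketch of an `argparse` parser that accepts the same three options. It is an illustration only, not Refrakt's actual parser: the program name `refrakt-sketch`, the `build_parser` helper, and the treatment of trailing arguments as dotted overrides are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three documented flags."""
    parser = argparse.ArgumentParser(prog="refrakt-sketch")
    parser.add_argument("--config", required=True, help="Path to YAML config file")
    parser.add_argument(
        "--log_type",
        choices=["tensorboard", "wandb", "both"],
        default=None,
        help="Logging backend",
    )
    parser.add_argument("--debug", action="store_true", help="Extra verbosity")
    # Assumption: anything left over is a dotted key=value override.
    parser.add_argument("overrides", nargs="*", help="e.g. optimizer.lr=0.001")
    return parser


args = build_parser().parse_args(
    ["--config", "config.yaml", "--debug", "optimizer.lr=0.001"]
)
print(args.config, args.log_type, args.debug, args.overrides)
```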
Config Structure (YAML)
All components are defined in modular YAML files under refrakt_core/config/.
runtime:
  mode: pipeline
  log_type: []
dataset:
  name: MNIST
  params:
    root: ./data
    train: true
    download: true
  transform:
    - name: Resize
      params: { size: [28, 28] }
    - name: ToTensor
    - name: Normalize
      params:
        mean: [0.1307]
        std: [0.3081]
dataloader:
  params:
    batch_size: 32
    shuffle: true
    num_workers: 4
    drop_last: false
model:
  name: vit
  wrapper: vit
  params:
    in_channels: 1
    num_classes: 10
    image_size: 28
    patch_size: 7
  fusion:
    type: cuml
    model: logistic_regression
    params:
      C: 1.0
      penalty: l2
      solver: qn
      max_iter: 1000
loss:
  name: ce_wrapped
  mode: logits
  params: {}
optimizer:
  name: adamw
  params:
    lr: 0.0003
scheduler: null
trainer:
  name: supervised
  params:
    save_dir: "./checkpoints"
    num_epochs: 1
    device: cuda
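A quick sanity check on the model parameters in the config above: a ViT splits each image into non-overlapping patches, so `image_size` should be divisible by `patch_size`. The sketch below works through that arithmetic for the MNIST values shown; `patch_grid` is a hypothetical helper, and it assumes square images and square patches.

```python
def patch_grid(image_size: int, patch_size: int, in_channels: int) -> tuple[int, int]:
    """Return (number of patches, values per flattened patch) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side, patch_size * patch_size * in_channels


num_patches, patch_dim = patch_grid(image_size=28, patch_size=7, in_channels=1)
print(num_patches, patch_dim)  # 16 49
```

With these values the transformer sees a sequence of 16 patch tokens, each built from 49 pixel values.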
Major Components & Patterns
1. Safe Registry System
Register models, datasets, losses, trainers, and transforms using decorators:
import torch
from refrakt_core.registry.safe_registry import register_model, get_model

# Register the class under a string key
@register_model("my_model")
class MyModel(torch.nn.Module):
    ...

# Look the class up by key and instantiate it
model_cls = get_model("my_model")
model = model_cls()
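For readers curious how a decorator-based, thread-safe registry can work in principle, the following is a minimal standalone sketch. It is not Refrakt's implementation: the module-level `_MODELS` dict, the lock, and the duplicate-name policy are assumptions chosen for illustration, and the function names merely mirror the ones used above.

```python
import threading
from typing import Callable, Dict, Type

_MODELS: Dict[str, Type] = {}
_LOCK = threading.Lock()


def register_model(name: str) -> Callable[[Type], Type]:
    """Decorator that stores a class under `name` (illustrative only)."""
    def decorator(cls: Type) -> Type:
        with _LOCK:  # serialize writes so concurrent imports cannot race
            if name in _MODELS and _MODELS[name] is not cls:
                raise KeyError(f"model '{name}' is already registered")
            _MODELS[name] = cls
        return cls
    return decorator


def get_model(name: str) -> Type:
    """Look up a registered class, listing known names on failure."""
    with _LOCK:
        try:
            return _MODELS[name]
        except KeyError:
            raise KeyError(f"unknown model '{name}'; known: {sorted(_MODELS)}") from None


@register_model("toy")
class ToyModel:
    pass


print(get_model("toy").__name__)  # ToyModel
```

Re-registering the same class under the same name is allowed here, which keeps repeated imports of a module harmless.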
2. Dynamic Dataset Loader
Load datasets from zip files or torchvision, with format detection:
from refrakt_core.loaders.dataset_loader import load_dataset
train_dataset, val_dataset = load_dataset("path/to/dataset.zip")
train_dataset, val_dataset = load_dataset("mnist")
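As a rough idea of what automatic format detection on a zip archive could look like, the sketch below inspects top-level folder names. The heuristic is entirely an assumption made for illustration (for example, that GAN datasets ship `lr/` and `hr/` folders); the real rules used by `load_dataset` may differ, and `guess_format` and `make_zip` are hypothetical helpers.

```python
import io
import zipfile


def guess_format(zip_bytes: bytes) -> str:
    """Guess a dataset layout from top-level folder names (assumed heuristic)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Top-level folders that contain at least one file, lowercased.
        folders = {
            name.split("/")[0].lower()
            for name in zf.namelist()
            if "/" in name and not name.endswith("/")
        }
    if {"lr", "hr"} <= folders:
        return "gan"          # paired low-/high-resolution images
    if len(folders) >= 2:
        return "supervised"   # one folder per class label
    return "contrastive"      # unlabeled images only


def make_zip(paths):
    """Build an in-memory zip with empty files at the given paths."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in paths:
            zf.writestr(path, b"")
    return buf.getvalue()


print(guess_format(make_zip(["lr/a.png", "hr/a.png"])))          # gan
print(guess_format(make_zip(["cat/1.png", "dog/1.png"])))        # supervised
print(guess_format(make_zip(["images/1.png", "images/2.png"])))  # contrastive
```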
3. Standard Image Resizer/Transforms
from refrakt_core.resizers.standard_transforms import create_standard_transform
transform = create_standard_transform(target_size=(224, 224), resize_strategy="maintain_aspect")
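The three resize strategies differ mainly in how they compute the intermediate image size. The sketch below shows one common interpretation using plain arithmetic. The strategy strings `"crop"` and `"stretch"` and the helper `resized_dims` are assumptions; only `"maintain_aspect"` appears in the call above.

```python
def resized_dims(width, height, target, strategy):
    """Return the (width, height) produced by the resize step of each strategy."""
    tw, th = target
    if strategy == "stretch":
        return tw, th                          # ignore the aspect ratio
    if strategy == "maintain_aspect":
        scale = min(tw / width, th / height)   # fit inside the target box
    elif strategy == "crop":
        scale = max(tw / width, th / height)   # cover the box, then center-crop
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return round(width * scale), round(height * scale)


for name in ("maintain_aspect", "crop", "stretch"):
    print(name, resized_dims(640, 480, (224, 224), name))
```

For a 640x480 input and a 224x224 target, `maintain_aspect` yields 224x168 (to be padded or left as is), `crop` yields 299x224 before cropping the width to 224, and `stretch` yields 224x224 directly.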
4. Hyperparameter Overrides
Override any config value from the command line or programmatically:
python train.py --config config.yaml model.name=ResNet optimizer.lr=0.001
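Programmatically, dotted overrides like the ones above amount to walking a nested dictionary and replacing a leaf. The following is a minimal sketch of that idea, not Refrakt's own override logic; `apply_overrides` is a hypothetical helper, and parsing values with `ast.literal_eval` is an assumption.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply dotted `key=value` strings to a nested dict in place."""
    for item in overrides:
        dotted, _, raw = item.partition("=")
        try:
            value = ast.literal_eval(raw)   # numbers, lists, True/False, None
        except (ValueError, SyntaxError):
            value = raw                     # fall back to a plain string
        node = config
        *parents, leaf = dotted.split(".")
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections on the way
        node[leaf] = value
    return config


cfg = {"model": {"name": "vit"}, "optimizer": {"lr": 0.0003}}
apply_overrides(cfg, ["model.name=ResNet", "optimizer.lr=0.001"])
print(cfg)  # {'model': {'name': 'ResNet'}, 'optimizer': {'lr': 0.001}}
```

Note that the dotted path has to match the config layout exactly; an override aimed at a key that does not exist creates a new entry rather than raising an error in this sketch.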
5. ML/DL/Fusion Pipelines
Supports pure-ML, pure-DL, and hybrid fusion pipelines (deep features + ML head):
from refrakt_core.api.builders.model_builder import build_model
model = build_model(cfg=config, modules=modules, device="cuda", overrides=["model.params.lr=0.0005"])
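The hybrid pattern (deep features feeding an ML head) can be shown with a deliberately tiny, dependency-free toy. Everything here is a stand-in: `extract_features` plays the role of a frozen deep backbone, and `NearestCentroidHead` plays the role of the ML model that Refrakt would build from the fusion config. Neither name exists in Refrakt.

```python
from statistics import mean


def extract_features(image):
    """Stand-in for a frozen deep backbone: mean and range of pixel values."""
    flat = [pixel for row in image for pixel in row]
    return (mean(flat), max(flat) - min(flat))


class NearestCentroidHead:
    """Stand-in for the ML head fitted on the extracted features."""

    def fit(self, feats, labels):
        self.centroids = {}
        for label in set(labels):
            rows = [f for f, y in zip(feats, labels) if y == label]
            self.centroids[label] = tuple(mean(col) for col in zip(*rows))
        return self

    def predict(self, feat):
        def dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(feat, centroid))
        return min(self.centroids, key=lambda label: dist(self.centroids[label]))


dark = [[0.0, 0.1], [0.1, 0.0]]
bright = [[0.9, 1.0], [1.0, 0.9]]

# Stage 1: extract features; stage 2: fit the ML head on those features.
head = NearestCentroidHead().fit(
    [extract_features(dark), extract_features(bright)], ["dark", "bright"]
)
print(head.predict(extract_features([[0.8, 0.9], [0.9, 0.8]])))  # bright
```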
Logging & Monitoring
- TensorBoard: logs in logs/<model_name>/tensorboard/
- Weights & Biases: auto-logged if enabled in config
tensorboard --logdir=./logs/<model_name>/tensorboard/
export WANDB_API_KEY=your_key_here
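The relationship between the `--log_type` values and the log locations described above can be sketched as follows. These helpers (`resolve_backends`, `tensorboard_dir`, `wandb_ready`) are hypothetical and only illustrate the documented conventions; they are not Refrakt functions.

```python
import os


def resolve_backends(log_type: str) -> list[str]:
    """Expand a --log_type value into the individual backends it enables."""
    table = {
        "tensorboard": ["tensorboard"],
        "wandb": ["wandb"],
        "both": ["tensorboard", "wandb"],
    }
    if log_type not in table:
        raise ValueError(f"unsupported log_type: {log_type}")
    return table[log_type]


def tensorboard_dir(model_name: str, root: str = "./logs") -> str:
    """Build the documented logs/<model_name>/tensorboard path."""
    return os.path.join(root, model_name, "tensorboard")


def wandb_ready() -> bool:
    """Check that the W&B API key has been exported in this shell."""
    return bool(os.environ.get("WANDB_API_KEY"))


print(resolve_backends("both"), tensorboard_dir("vit"), wandb_ready())
```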
Project Structure
refrakt_core/
├── api/                # CLI: train.py, test.py, inference.py
│   └── builders/       # Builders for models, losses, optimizers, datasets
├── config/             # YAML configurations for each experiment
├── losses/             # Contrastive, GAN, MAE, VAE, etc.
├── models/             # Vision architectures (ViT, ResNet, MAE, etc.)
│   └── templates/      # Base model templates and abstractions
├── trainer/            # Task-specific training logic (SimCLR, SRGAN, etc.)
├── registry/           # Safe, decorator-based plugin system
├── utils/              # Helper modules (encoders, decoders, data classes)
├── resizers/           # Image resizing and standard transforms
├── loaders/            # Dynamic and standard dataset loaders
├── transforms.py       # Data augmentation logic
├── datasets.py         # Dataset definitions and loader helpers
└── logging_config.py   # Logger wrapper for stdout + W&B/TensorBoard
Testing
Run all tests:
pytest tests/
Extending Refrakt
Add a New Model
- Create the architecture in models/your_model.py
- Inherit from a base class in models/templates/models.py
- Register it using:
from refrakt_core.registry.model_registry import register_model
@register_model("your_model")
class YourModel(BaseClassifier):
    ...
- Add a YAML config: config/your_model.yaml
- Write a custom trainer if needed (trainer/your_model.py)
Add a Custom Dataset Loader or Transform
- Implement in loaders/ or resizers/
- Register with the safe registry
Example Output
- Progress bar (via tqdm)
- Metrics printed and logged
- ./logs/<model_name>/ with TensorBoard events
- W&B dashboard if enabled
Contributing
- Clone and install:
git clone ...
pip install -r requirements-dev.txt
pre-commit install
- Follow formatting (black, isort, pylint)
- Write tests for any new feature
- Run:
pytest tests/
PRs and issues are welcome!
Future Scope
| Milestone | Description |
|---|---|
| ✅ Stage 1 | Paper re-implementations in notebooks |
| ✅ Stage 2 | Modular training + model pipelines |
| ✅ Stage 3 | Python library (refrakt train, etc.) |
| Stage 4 | TBD |
Planned additions:
- Much better code readability + extensive documentation (readthedocs)
- More sklearn and cuML models made available through the registry.
- Integration of Kolmogorov-Arnold Networks and Lagrangian Neural Networks.
- Saved checkpoints with pre-trained weights for the implemented models.
- Integrate model tracing for Fusion Blocks.
- Allow for generative / latent fusion training.
License
This repository is licensed under the MIT License. See LICENSE for full details.
Maintainer
Akshath Mangudi
If you find issues, raise them. If you learn from this, share it. Built with love and curiosity :)
Contributing
We welcome contributions! To get started:
- See CONTRIBUTING.md for detailed guidelines, including development setup, code style, and testing.
- Set up your dev environment with:
pip install -e .[dev]
# or
python scripts/dev_setup.py
- This will install all runtime and development dependencies (testing, linting, formatting, type checking, etc.) and set up pre-commit hooks for code quality.
- Please ensure your code passes all pre-commit checks and tests before opening a pull request.