An Auto-ML framework optimized for small datasets
Octopus
Octopus is a lightweight AutoML framework designed for small datasets (<1k samples) with high dimensionality (a large number of features). Its goal is to speed up machine learning projects and to increase the reliability of results on small datasets.
What distinguishes Octopus from others
- Nested cross-validation (CV)
- Performance on small datasets
- No information leakage
- No data split mistakes
- Constrained regularization
- Ensembling, optimized for (nested) CV
- Simplicity
- Time to event
- Testing system (branching workflows)
- Reporting based on nested CV
- Test predictions over all samples
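To make the nested CV points above concrete, here is a minimal sketch of how the index splits can be organised. It is not Octopus's implementation or API: the function names `k_fold_indices` and `nested_cv_splits` are hypothetical and only the Python standard library is used. The sketch shows two properties from the list: inner folds are drawn only from the outer training set (no information leakage), and the outer test folds together cover every sample exactly once.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle `indices` and split them into k (train, test) pairs."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    # Strided slices give k disjoint folds that together cover all indices.
    folds = [shuffled[i::k] for i in range(k)]
    pairs = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, test))
    return pairs


def nested_cv_splits(n_samples, n_outer, n_inner, seed=0):
    """Yield (outer_test, inner_splits) for an n_outer x n_inner nested CV.

    inner_splits is built from the outer training indices only, so the
    outer test samples stay unseen during model selection.
    """
    for outer_train, outer_test in k_fold_indices(range(n_samples), n_outer, seed):
        yield outer_test, k_fold_indices(outer_train, n_inner, seed)


if __name__ == "__main__":
    covered = []
    for outer_test, inner_splits in nested_cv_splits(20, n_outer=5, n_inner=4):
        for inner_train, inner_val in inner_splits:
            # No leakage: outer test samples never appear in an inner fold.
            assert not set(outer_test) & (set(inner_train) | set(inner_val))
        covered.extend(outer_test)
    # Every sample lands in exactly one outer test fold.
    assert sorted(covered) == list(range(20))
```

Because each sample appears in exactly one outer test fold, predictions made on those folds can be collected into a test prediction for every sample in the dataset.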
Hardware
For maximum speed, it is recommended to run Octopus on a compute node with $n \times m$ CPUs for an $n \times m$ nested cross-validation. Octopus development is done, for example, on a c5.9xlarge EC2 instance.
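The sizing rule can be checked with a little arithmetic. The sketch below assumes that $n$ and $m$ are the numbers of outer and inner folds and that each inner-fold fit occupies exactly one CPU; both are assumptions, not documented Octopus behaviour, and the helper names are hypothetical. The value 36 is the vCPU count AWS lists for a c5.9xlarge instance.

```python
import math
import os


def inner_fits(n_outer, n_inner):
    """Total number of inner-fold model fits in an n_outer x n_inner nested CV."""
    return n_outer * n_inner


def scheduling_rounds(n_outer, n_inner, cpus=None):
    """Rounds needed if each fit occupies exactly one CPU (an assumption)."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    return math.ceil(inner_fits(n_outer, n_inner) / cpus)


if __name__ == "__main__":
    print(inner_fits(5, 5))                  # 25
    print(scheduling_rounds(5, 5, cpus=36))  # 1: all fits run at once
    print(scheduling_rounds(5, 5, cpus=8))   # 4: fits queue up on a small node
```

Under these assumptions, a node with at least $n \times m$ CPUs finishes all fits in a single round, which is the situation the recommendation aims for.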
Installation
Installation works via pip or any other standard Python package manager:
# Install with recommended dependencies (includes optional packages such as AutoGluon)
pip install "octopus-automl[recommended]"
# Explicitly specify optional dependencies
pip install "octopus-automl[autogluon]" # AutoGluon
pip install "octopus-automl[boruta]" # Boruta feature selection
pip install "octopus-automl[survival]" # Support time-to-event / survival analysis
pip install "octopus-automl[examples]" # Dependencies for running examples
# Install with more than one extra, e.g.
pip install "octopus-automl[autogluon,examples]"
For contributors and Octopus developers, a dedicated dependency group exists. It contains code sanitization and quality tools.
pip install "octopus-automl[dev]"
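After installing, one way to confirm that the distribution is present in the active environment is to query its metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical, and only the distribution name `octopus-automl` is taken from the commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("octopus-automl")
    if found is None:
        print("octopus-automl is not installed in this environment")
    else:
        print(f"octopus-automl {found} is installed")
```

The same helper can be pointed at any other distribution name to check whether an optional dependency was pulled in.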
File details
Details for the file octopus_automl-0.5.3.tar.gz.
File metadata
- Download URL: octopus_automl-0.5.3.tar.gz
- Upload date:
- Size: 1.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 00d9bb9d74e0da48912489bbb03913e64576c0e78b9a99121128f54dd48194f7 |
| MD5 | 707b76a497173622fd44e1119ec75af0 |
| BLAKE2b-256 | ca4fbf7d5fda65bfbacf52379d29d39ad3ad7350bd0866b590ee1ccfb39c1a58 |
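A published digest is only useful if it is compared against the file actually downloaded. The following is a minimal sketch of such a check using Python's `hashlib`; the helper names and the local file path are assumptions for illustration, while the expected value is the SHA256 digest listed for octopus_automl-0.5.3.tar.gz.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the archive was saved.
    path = "octopus_automl-0.5.3.tar.gz"
    expected = "00d9bb9d74e0da48912489bbb03913e64576c0e78b9a99121128f54dd48194f7"
    print(matches_published_digest(path, expected))
```

Reading in chunks keeps memory use constant regardless of archive size. The same function works for the wheel when given that file's own SHA256 digest.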
Provenance
The following attestation bundles were made for octopus_automl-0.5.3.tar.gz:
Publisher: release.yml on emdgroup/octopus-automl
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: octopus_automl-0.5.3.tar.gz
- Subject digest: 00d9bb9d74e0da48912489bbb03913e64576c0e78b9a99121128f54dd48194f7
- Sigstore transparency entry: 1306414958
- Sigstore integration time:
- Permalink: emdgroup/octopus-automl@f924f481153872f357246d090c822ef871b4bbd3
- Branch / Tag: refs/tags/0.5.3
- Owner: https://github.com/emdgroup
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f924f481153872f357246d090c822ef871b4bbd3
- Trigger Event: release
File details
Details for the file octopus_automl-0.5.3-py3-none-any.whl.
File metadata
- Download URL: octopus_automl-0.5.3-py3-none-any.whl
- Upload date:
- Size: 229.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 881b973bd85d27a6aeb6bf6f4c5e88339cd879302121e27c04ea95dc25280181 |
| MD5 | 6e56409c55df4f2f6952ce590e252dac |
| BLAKE2b-256 | 80ddf8e9704c33ed1b62126facd2b724febb5c82599451dc617ea47d47bb5efc |
Provenance
The following attestation bundles were made for octopus_automl-0.5.3-py3-none-any.whl:
Publisher: release.yml on emdgroup/octopus-automl
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: octopus_automl-0.5.3-py3-none-any.whl
- Subject digest: 881b973bd85d27a6aeb6bf6f4c5e88339cd879302121e27c04ea95dc25280181
- Sigstore transparency entry: 1306415034
- Sigstore integration time:
- Permalink: emdgroup/octopus-automl@f924f481153872f357246d090c822ef871b4bbd3
- Branch / Tag: refs/tags/0.5.3
- Owner: https://github.com/emdgroup
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f924f481153872f357246d090c822ef871b4bbd3
- Trigger Event: release