A Python package for scaling and automating pre-processing, visualization, classification, and feature selection of generic data sets.
Project description
orthrus
A Python package for scaling and automating pre-processing, visualization, classification, and feature selection of generic data sets. Read the docs!
Installing the conda environment
To ensure that Python classes and functions behave consistently across platforms, we recommend installing an isolated conda environment with the dependencies listed in environment.yml. To create a new environment with these dependencies, run the following from the shell:
conda env create -f environment.yml
This will generate the conda environment orthrus and install any dependencies required by the orthrus module. If you do not have a graphics card compatible with CUDA >=11, you can replace environment.yml with environment_nocuda.yml. You can also use your own environment and install the packages listed in either environment.yml or environment_nocuda.yml.
Installing the orthrus package
orthrus is available on PyPI. To install the released package, run:
pip install orthrus
To install the orthrus package from this repo instead, first activate the orthrus environment and then navigate to your local orthrus directory:
conda activate orthrus
cd /path/to/orthrus/
Then install the package in editable mode with pip:
pip install -e .
Finally, add ORTHRUS_PATH=/path/to/orthrus/ to your environment variables (the procedure differs for each OS).
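Because the way to set an environment variable differs per OS, it can help to confirm from Python that the variable is actually visible. The following is a minimal sketch using only the standard library; `resolve_orthrus_path` is a hypothetical helper written for this illustration and is not part of the orthrus API.

```python
import os
from pathlib import Path


def resolve_orthrus_path(environ=None):
    """Return ORTHRUS_PATH as a Path, or raise an error explaining what is wrong.

    Hypothetical helper for illustration only; orthrus itself may read the
    variable differently.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("ORTHRUS_PATH")
    if not raw:
        raise EnvironmentError(
            "ORTHRUS_PATH is not set; point it at your local orthrus directory."
        )
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise EnvironmentError(f"ORTHRUS_PATH={raw!r} is not an existing directory.")
    return path


if __name__ == "__main__":
    try:
        print(f"ORTHRUS_PATH resolves to {resolve_orthrus_path()}")
    except EnvironmentError as err:
        print(f"Configuration problem: {err}")
```

Running this in a fresh shell after setting the variable shows whether the setting persisted for new sessions.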
Basic Usage
The fundamental object in the orthrus package is the DataSet class. The following example, run from within the orthrus directory, loads the iris dataset and creates a DataSet instance:
# imports
from orthrus.core.dataset import DataSet as DS
import pandas as pd
# load data and metadata
data = pd.read_csv("test_data/Iris/Data/iris_data.csv", index_col=0)
metadata = pd.read_csv("test_data/Iris/Data/iris_metadata.csv", index_col=0)
# create DataSet instance
ds = DS(name='iris', path='./test_data', data=data, metadata=metadata)
# save dataset
ds.save()
Here, path indicates where ds will save figures and results output by the class methods.
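Both CSV files in the example are read with `index_col=0`, which suggests that data and metadata are keyed by the same sample labels. Assuming that is what DataSet expects (the text above does not state it), a quick check before constructing the instance might look like the sketch below. The inline CSV contents are made-up stand-ins for the iris files, and `check_alignment` is a hypothetical helper, not an orthrus function.

```python
import io

import pandas as pd

# Tiny made-up stand-ins for iris_data.csv and iris_metadata.csv.
DATA_CSV = """sample,sepal_length,sepal_width
s1,5.1,3.5
s2,7.0,3.2
s3,6.3,3.3
"""
METADATA_CSV = """sample,species
s1,setosa
s2,versicolor
s3,virginica
"""


def check_alignment(data, metadata):
    """Return (labels only in data, labels only in metadata) as two lists."""
    only_in_data = data.index.difference(metadata.index)
    only_in_metadata = metadata.index.difference(data.index)
    return list(only_in_data), list(only_in_metadata)


# Load both tables with the first column as the sample index, as in the example above.
data = pd.read_csv(io.StringIO(DATA_CSV), index_col=0)
metadata = pd.read_csv(io.StringIO(METADATA_CSV), index_col=0)

only_in_data, only_in_metadata = check_alignment(data, metadata)
if only_in_data or only_in_metadata:
    print("Unmatched samples:", only_in_data, only_in_metadata)
```

If either list is non-empty, the two tables describe different sets of samples and are worth reconciling before they are passed to DataSet.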
Creating a Project Environment
To improve the organization and reproducibility of results, the orthrus package includes helper functions for generating a project directory and experiment subdirectories. The following example creates a project directory called Iris and then generates an experiment directory called setosa_versicolor_classify_species_svm, in which we intend to classify the setosa and versicolor species with an SVM classifier.
# imports
from orthrus.core.helper import generate_project
from orthrus.core.helper import generate_experiment
from orthrus.core.dataset import load_dataset
import shutil
# Create a project directory structure in the test path
file_path = './test_data/'
generate_project('Iris', file_path)
# move data into Data directory of Iris project directory
shutil.move('./test_data/iris.ds', './test_data/Iris/Data/iris.ds')
# create experiment directory in the Experiments directory of the Iris directory
proj_dir = './test_data/Iris/'
generate_experiment('setosa_versicolor_classify_species_svm', proj_dir)
Once the setosa_versicolor_classify_species_svm directory is created, there will be a file setosa_versicolor_classify_species_svm_params.py containing a template for experimental parameters that the user can change or extend. The Scripts directory in the Iris directory should contain general-purpose scripts that take in the specific parameters of your different experiments, so you can change an experiment on the fly with minimal code changes. Take a look at the Iris directory for an example of this workflow.
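One way a general-purpose script can take in a parameters file is to import it by path. The sketch below uses only the standard library and is an illustration of the pattern, not the mechanism orthrus itself uses; `load_params`, `run_experiment`, and the parameter names `classifier_name` and `class_labels` are all hypothetical.

```python
import importlib.util
from pathlib import Path


def load_params(params_file):
    """Import a Python parameters file by path and return it as a module object."""
    params_file = Path(params_file)
    spec = importlib.util.spec_from_file_location(params_file.stem, params_file)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load parameters from {params_file}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file so its variables become attributes
    return module


def run_experiment(params):
    """Stand-in for a script body; it only reads two hypothetical parameter names."""
    return f"Training {params.classifier_name} on classes {params.class_labels}"
```

A script written this way stays unchanged between experiments; only the path passed to `load_params` differs.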
File details
Details for the file orthrus-1.0.9.tar.gz.
File metadata
- Download URL: orthrus-1.0.9.tar.gz
- Upload date:
- Size: 119.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 713b5d6b700820585ea81860ee9cfb32f7a34f67cf5ad0e58ab51d9c076f76ba
MD5 | 024ae3fdd64161be5ef9aaa582d4f3c9
BLAKE2b-256 | 79395cd236ee81e6a814a366848c28176dd1c2824a94e012f01a901772caedce
File details
Details for the file orthrus-1.0.9-py3-none-any.whl.
File metadata
- Download URL: orthrus-1.0.9-py3-none-any.whl
- Upload date:
- Size: 131.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | de232f54a044e3819e95f9f1eb12e1376ac2ad1141c39514f9a226abd6622db9
MD5 | 7770a660a89d2d872ae13f02910a5783
BLAKE2b-256 | 34640aa15b51a48a84d11dc980e9dffe4757f3654705a3da83716c1b00c16dff