Generic Framework for ML projects

CoreML

coreml is an end-to-end machine learning framework aimed at supporting rapid prototyping. It is built on top of PyTorch and combines the components of an ML pipeline (defining the dataset object, choosing how to sample each batch, preprocessing your inputs and labels, iterating on different network architectures, applying various weight initializations, running pretrained models, freezing certain layers, changing optimizers, adding learning rate schedulers, and detailed logging) into a simple model.fit() framework, similar to scikit-learn. The codebase is modular, which makes it easy to extend to different tasks, modalities and training recipes while avoiding duplication wherever possible.

Features

  • Support for end-to-end training using PyTorch.
  • Makes every aspect of the training pipeline configurable.
  • Provides the ability to define and change architectures right in the config file.
  • Built-in support for experiment tracking using Weights & Biases.
  • Supports tracking instance-level loss over epochs.
  • Logs predictions and metrics over epochs to allow future analysis.
  • Supports saving checkpoints and optimizing thresholds based on specific subsets.
  • Defines several metrics like PrecisionAtRecall, SpecificityAtSensitivity and ConfusionMatrix.
  • Logs several classification curves like PR curve, Sensitivity-Specificity curve, ROC curve.
  • Explicitly requires data versioning.
  • Supports adding new datasets adhering to a required format.
  • Contains unit tests wherever applicable.

Setup

Clone the project:

$ git clone https://github.com/dalmia/coreml.git

Weights & Biases

We use wandb for experiment tracking. You'll need to have that set up:

  1. Install wandb:
$ pip install wandb
  2. Log in to wandb:
$ wandb login
You will be redirected to a link that shows your WANDB_API_KEY.
  3. Set the WANDB_API_KEY by adding this line to your ~/.bashrc file:
export WANDB_API_KEY=YOUR_API_KEY
  4. Run source ~/.bashrc.
  5. During training, you'll also have the option to turn off wandb.
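
To confirm that the key is visible in your current shell after sourcing ~/.bashrc, a small check like the one below can help. This is only a sketch: check_wandb_key is a hypothetical helper name, not something shipped with coreml or wandb.

```shell
# Hypothetical helper (not part of coreml or wandb):
# reports whether WANDB_API_KEY is set and non-empty in the current shell.
check_wandb_key() {
    if [ -n "${WANDB_API_KEY:-}" ]; then
        echo "WANDB_API_KEY is set"
    else
        # Send the hint to stderr and signal failure to the caller.
        echo "WANDB_API_KEY is not set; add the export line to ~/.bashrc and source it again" >&2
        return 1
    fi
}
```

Call check_wandb_key in the terminal you plan to train from; a non-zero exit status means the export line has not taken effect there.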

Docker

We use Docker containers to ensure replicability of experiments. You can either fetch the Docker image from DockerHub using the following line:

$ docker pull adalmia/coreml:v1.0

OR

You can build the image using the Dockerfile:

$ docker build -t adalmia/coreml:v1.0 .

The repository runs inside a Docker container. When creating the container, mount the directory containing your data to /data and the directory where you want to store the outputs to /output on the container. To do so, edit create_container.sh and replace /path/to/coreml, /path/to/data and /path/to/outputs with the appropriate values.

Use the following command to launch a container:

$ bash create_container.sh
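
The contents of create_container.sh are not reproduced here. As a rough illustration of the mounts described above, a launch command could be assembled along the following lines. The /workspace mount point, the container name coreml and the build_run_cmd helper are assumptions made for this sketch; only /data, /output and the image tag come from the instructions above.

```shell
# Hypothetical sketch, NOT the actual contents of create_container.sh.
# Replace these three host paths with the appropriate values.
CODE_DIR=/path/to/coreml
DATA_DIR=/path/to/data
OUTPUT_DIR=/path/to/outputs

# Assemble the docker command as text so it can be inspected first.
# /workspace and the container name "coreml" are assumptions.
build_run_cmd() {
    echo docker run -it --name coreml \
        -v "$CODE_DIR":/workspace \
        -v "$DATA_DIR":/data \
        -v "$OUTPUT_DIR":/output \
        adalmia/coreml:v1.0 bash
}

# Print the command; copy and run it once the paths look right.
build_run_cmd
```

Printing the command instead of running it directly makes it easy to verify the three host paths before a container is created.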
