A Python module and experiment manager for deep learning

Maté 🧉 your modular AI project and experiment manager

Welcome to Maté! 🎉👋🏼

Maté is an open science modular Python framework designed to streamline and simplify the development and management of machine learning projects. It was developed to address the reproducibility crisis in artificial intelligence research by promoting open science and accessible AI. 🌍💻

The framework is built around the best software engineering practices of modularity and separation of concerns, encouraging quality coding, collaboration, and the sharing of models, trainers, data loaders, and knowledge. The modular design and separation of concerns simplify the development and maintenance of machine learning models, leading to an improved developer experience. 🚀💻👨‍💻💬🧠

With Maté, you can easily install the source code of open-source projects and adhere to modularity and separation of concerns, making your models and modules sharable out of the box. This means you can collaborate more effectively with others and easily share your work. 📦💻🤝

Thank you for choosing Maté, and we can't wait to see the amazing machine learning projects you'll create with it! 🎉👨‍💻🌟

Features 🎉

  • Seamless integration with any Python library, such as PyTorch/Lightning, TensorFlow/Keras, JAX/Flax, or Hugging Face/Transformers. 🤝🤗
  • Easy-to-use interface for adding the source code of models, trainers, and data loaders to your projects. 🎨💻
  • Support for full customizability and reproducibility of results through the inclusion of dependencies in your project. 🌟🧪
  • Modular project structure that enforces a clean and organized codebase. 🧱👨‍💻👌
  • Fully compatible with plain Python: experiments can be run without mate commands. 🐍💻🚀
  • Convenient environment management through the Maté Environment API. 🌍👨‍💻🔧
  • Support for pip and conda for dependency management. 📦💻
  • Works with Colab out of the box. 🎉👌🤖

Installation 🔌

pip install yerbamate

Quick Start ⚡

Initialize a project

mate init deepnet

This will generate the following empty project structure:

/
|-- models/
|   |-- __init__.py
|-- experiments/
|   |-- __init__.py
|-- trainers/
|   |-- __init__.py
|-- data/
|   |-- __init__.py
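
For readers who want to see what this layout amounts to on disk, the sketch below recreates the same skeleton with the Python standard library. It is not how mate init is implemented; the function name make_skeleton is hypothetical and the root directory is passed in by the caller.

```python
from pathlib import Path

# The four concerns shown in the tree above.
CONCERNS = ("models", "experiments", "trainers", "data")


def make_skeleton(root: Path) -> list[Path]:
    """Create one package directory per concern and return the __init__.py paths."""
    created = []
    for concern in CONCERNS:
        package = root / concern
        package.mkdir(parents=True, exist_ok=True)
        init_file = package / "__init__.py"
        init_file.touch(exist_ok=True)  # an empty file is enough to mark a package
        created.append(init_file)
    return created
```

Calling make_skeleton(Path("deepnet")) would produce the four empty packages under a deepnet directory.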

Install an experiment

To install an experiment, use mate install, which installs a module and its dependencies from a GitHub repository. See the docs for more details.

# Short version of GitHub URL https://github.com/oalee/big_transfer/tree/master/big_transfer/experiments/bit
mate install oalee/big_transfer/experiments/bit -yo pip

# Short version of GitHub URL https://github.com/oalee/deep-vision/tree/main/deepnet/experiments/resnet
mate install oalee/deep-vision/deepnet/experiments/resnet -yo pip

Install a module

You can install independent modules such as models, trainers, and data loaders from GitHub projects that follow the independent modular project structure.

mate install oalee/lightweight-gan/lgan/trainers/lgan 
mate install oalee/big_transfer/models/bit -yo pip
mate install oalee/deep-vision/deepnet/models/vit_pytorch -yo pip
mate install oalee/deep-vision/deepnet/trainers/classification -yo pip

Setting up environment

Set up your environment before running your experiments, either by setting variables in the shell or by adding an env.json file to the root of your project. The Maté API requires results to be set in the environment. For more information, see the docs.

DATA_PATH=/path/to/data
results=/path/to/results

{
    "DATA_PATH": "/path/to/data",
    "results": "/path/to/results"
}
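
As a rough illustration of how the two sources could be combined, the sketch below reads env.json and lets variables set in the shell override it. This is not the Maté Environment API: the function name load_env and the "shell wins" precedence are assumptions made for the example, and only the requirement that results be present comes from the text above.

```python
import json
import os
from pathlib import Path


def load_env(env_file: Path, required: tuple[str, ...] = ("results",)) -> dict[str, str]:
    """Merge env.json with the process environment and check required keys."""
    values: dict[str, str] = {}
    if env_file.is_file():
        loaded = json.loads(env_file.read_text())
        values.update({key: str(value) for key, value in loaded.items()})
    # Assumption: a variable set in the shell takes precedence over the file.
    for key in list(values) + list(required):
        if key in os.environ:
            values[key] = os.environ[key]
    missing = [key for key in required if key not in values]
    if missing:
        raise KeyError(f"missing environment entries: {missing}")
    return values
```

With the env.json shown above and no overriding shell variables, load_env(Path("env.json")) returns both entries unchanged.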

Train a model

To train a model, use the mate train command, which trains the model with the specified experiment. For example, to run an experiment called learn in the bit module, use the following command:

mate train bit learn
# or alternatively use python
python -m deepnet.experiments.bit.learn train
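
The second form works because an experiment is an ordinary Python module that receives train as a command-line argument. A minimal sketch of such a module, using plain Python rather than the Maté Environment API, might look like the following. The function names, the test action, and the returned strings are all placeholders invented for the example.

```python
def train() -> str:
    # Placeholder: a real experiment would build the model, data loaders and trainer here.
    return "training"


def test() -> str:
    # Placeholder for a second, hypothetical action.
    return "testing"


ACTIONS = {"train": train, "test": test}


def main(argv: list[str]) -> str:
    """Dispatch on the first command-line argument, e.g. 'train'."""
    if not argv or argv[0] not in ACTIONS:
        choices = "|".join(ACTIONS)
        raise SystemExit(f"usage: python -m <package>.experiments.<module>.<experiment> {choices}")
    return ACTIONS[argv[0]]()
```

Calling main(sys.argv[1:]) under an if __name__ == "__main__": guard would make the file runnable with python -m.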

Project Structure 📁

Deep learning projects can be organized into the following structure with modularity and separation of concerns in mind. This offers a clean and organized codebase that is easy to maintain and is sharable out of the box. The project directory is organized as a hierarchical tree, and the user can give the root project directory any name. The project is then broken down into distinct concerns such as models, data, trainers, experiments, analyzers, and simulators, each with its own subdirectory. Within each concern, modules can be defined in their own subdirectories, such as specific models, trainers, data loaders, data augmentations, or loss functions.

└── project_name
    ├── data
    │   ├── my_independent_data_loader
    │   └── __init__.py
    ├── experiments
    │   ├── my_awesome_experiment
    │   └── __init__.py
    ├── __init__.py
    ├── models
    │   ├── awesomenet
    │   └── __init__.py
    └── trainers
        ├── big_brain_trainer
        └── __init__.py

Modularity

Modularity is a software design principle that focuses on creating self-contained, reusable, and interchangeable components. In the context of a deep learning project, modularity means creating three independent standalone modules for models, trainers, and data. This allows for a more organized, maintainable, and sharable project structure. The fourth module, experiments, is not independent; it combines the other three to create a complete experiment.

Independent Modules

Yerbamate prioritizes the organization of the project into independent modules when applicable. Independent modules only depend on Python dependencies (such as NumPy, PyTorch, TensorFlow, or Hugging Face), and the code inside the module uses relative imports to import within the module. This makes it an independent module that can be re-used when Python dependencies are installed.
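
One way to check this property by hand is to scan a module's source for absolute imports of the project package, since those tie the module to one particular project. The sketch below does this with the standard ast module. It is an illustration only, not a Yerbamate feature: the function name absolute_project_imports is hypothetical, and it does not catch relative imports that climb out of the module.

```python
import ast


def absolute_project_imports(source: str, project_root: str) -> list[str]:
    """Return absolute imports of the project package found in `source`."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped
            names = [node.module]
        else:
            continue
        offenders.extend(name for name in names if name.split(".")[0] == project_root)
    return offenders
```

A file that contains only third-party imports (for example import numpy) and relative imports (for example from .transforms import ...) yields an empty list, while from deepnet.models.bit_torch import models is reported as deepnet.models.bit_torch when the project root is deepnet.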

Non-Independent Modules

In some cases, a combination of independent modules may be necessary for a particular concern. An example of this is the experiment concern, which imports and combines models, data, and trainers to define and create a specific experiment. In such cases, the module is not independent and is designed to combine the previously defined independent modules. In the case of non-independent modules, Yerbamate creates a dependency list of independent modules that can be used to install the code and Python dependencies. This ensures that the necessary modules are installed, and that the code can be run without issue.
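
To make the idea of a dependency list concrete, the sketch below collects the independent modules that an experiment file imports. It is a simplified illustration and not Yerbamate's implementation: the function name module_dependencies and the concern/module output format are assumptions, and it treats any relative import whose path starts with a concern name as pointing at the project root.

```python
import ast

INDEPENDENT_CONCERNS = {"models", "data", "trainers"}


def module_dependencies(source: str, project_root: str) -> list[str]:
    """List 'concern/module' pairs that an experiment file imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ImportFrom) or not node.module:
            continue
        parts = node.module.split(".")
        if node.level == 0:
            # Absolute form: only project imports count; drop the package name.
            if parts[0] != project_root:
                continue
            parts = parts[1:]
        if not parts or parts[0] not in INDEPENDENT_CONCERNS:
            continue
        if len(parts) >= 2:
            found.add(f"{parts[0]}/{parts[1]}")
        else:
            # e.g. "from ...models import bit_torch"
            found.update(f"{parts[0]}/{alias.name}" for alias in node.names)
    return sorted(found)
```

For an experiment that imports from ...models.bit_torch, ...data.bit, and deepnet.trainers.bit_torch.trainer, the function returns data/bit, models/bit_torch, and trainers/bit_torch; imports of third-party libraries and of files inside the experiment itself are ignored.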

Sample Modular Project Structure

This structure highlights modularity and separation of concerns. The models, data, and trainers modules are independent and can be used in any project. The experiments module is not independent; it combines the three modules to create a complete experiment.

.
├── mate.json
└── deepnet
    ├── data
    │   ├── bit
    │   │   ├── fewshot.py
    │   │   ├── __init__.py
    │   │   ├── minibatch_fewshot.py
    │   │   ├── requirements.txt
    │   │   └── transforms.py
    │   └── __init__.py
    ├── experiments
    │   ├── bit
    │   │   ├── aug.py
    │   │   ├── dependencies.json
    │   │   ├── __init__.py
    │   │   ├── learn.py
    │   │   └── requirements.txt
    │   └── __init__.py
    ├── __init__.py
    ├── models
    │   ├── bit_torch
    │   │   ├── downloader
    │   │   │   ├── downloader.py
    │   │   │   ├── __init__.py
    │   │   │   ├── requirements.txt
    │   │   │   └── utils.py
    │   │   ├── __init__.py
    │   │   ├── models.py
    │   │   └── requirements.txt
    │   └── __init__.py
    └── trainers
        ├── bit_torch
        │   ├── __init__.py
        │   ├── lbtoolbox.py
        │   ├── logger.py
        │   ├── lr_schduler.py
        │   ├── requirements.txt
        │   └── trainer.py
        └── __init__.py

Example Projects 📚

Please check out the transfer learning, vision models, and lightweight GAN example projects.

Documentation 📚

Please check out the documentation.

Guides 📖

For more information on modularity, please check out this guide.

For running experiments on Google Colab, please check out this example.

Contribution 🤝

We welcome contributions from the community! Please check out our contributing guide for more information on how to get started.

Contact 🤝

For questions please contact:

oalee(at)proton.me

Open Science 📖

As an open science work, Yerbamate strives to promote the principles of transparency and collaboration. To this end, the history of the LaTeX files for this work is available on GitHub. These open science repositories are open to collaboration and encourage participation from the community to enhance the validity, reproducibility, accessibility, and quality of this work.

