
Package for initializing ML projects following ML Ops best practices.


ML Ops Quickstart


ML Ops Quickstart is a tool for initializing Machine Learning projects following ML Ops best practices.

Setting up new repositories is a time-consuming task that involves creating different files and configuring tools such as linters, docker containers and continuous integration pipelines. The goal of mloq is to simplify that process, so you can start writing code as fast as possible.

mloq generates customized templates for Python projects with a focus on Machine Learning. An example of the generated templates can be found in mloq-template.

Index

  1. Installation

  2. Usage

  3. Features

  4. Project Makefile

  5. License

  6. Contributing

  7. Roadmap

1. Installation

mloq is tested on Ubuntu 18.04+, and supports Python 3.6+.

Install from PyPI

pip install mloq

Install from source

git clone https://github.com/FragileTech/ml-ops-quickstart.git
cd ml-ops-quickstart
pip install -e .

2. Usage

2.1 Command line interface

To set up a new repository from scratch interactively in the current working directory:

mloq quickstart .

To load a configuration file from the current directory, initialize the project in the example directory, and overwrite any files that already exist:

mloq quickstart -f . -o example

Options:

  • --file -f: Name of the configuration file. If the given path is a directory, mloq loads the mloq.yml file present in it.

  • --override -o: Rewrite files that already exist in the target project.

Arguments:

  • OUTPUT: Path to the target project.
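The options and argument above can also be assembled programmatically, for example when scripting the creation of several projects. The following is a minimal sketch, not part of mloq: the helper name build_quickstart_command is hypothetical, and it only uses the -f and -o flags and the OUTPUT argument documented above.

```python
from typing import List, Optional


def build_quickstart_command(
    output: str, config: Optional[str] = None, override: bool = False
) -> List[str]:
    """Build the argument list for an `mloq quickstart` call (hypothetical helper)."""
    command = ["mloq", "quickstart"]
    if config is not None:
        # -f / --file: configuration file, or a directory containing mloq.yml
        command += ["-f", config]
    if override:
        # -o / --override: rewrite files that already exist in the target project
        command.append("-o")
    # OUTPUT: path to the target project
    command.append(output)
    return command


if __name__ == "__main__":
    # Reproduces the second example above: mloq quickstart -f . -o example
    print(" ".join(build_quickstart_command("example", config=".", override=True)))
```

With mloq installed, the returned list can be passed to subprocess.run(command, check=True) to run the command.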


2.2 mloq.yml config file

This YAML file contains all the information mloq uses to set up a new project. All values are strings except python_versions and requirements, which are lists of strings. null values are interpreted as missing values.

# `template` contains all the values that will be written in the generated files.
# They are loaded as a dictionary and passed to jinja2 to fill in the templates.
template: 
  project_name: null  # Name of the new Python project
  owner: null  # GitHub handle of the project owner
  author: null # Person or entity listed as the project author in setup.py
  email: null  # Owner contact email
  copyright_holder: null # Owner of the project's copyright.
  project_url: null # GitHub project url. Defaults to https://github.com/{owner}/{project_name}
  download_url: null # Download link
  bot_name: null # Bot account to push from ci when bumping the project's version
  bot_email: null # Bot account email
  license: "MIT" # Currently only MIT license is supported
  description: "example_description" # Short description of the project
  python_versions: ['3.6', '3.7', '3.8', '3.9'] # Supported Python versions
  default_branch: "master" # Name of the default git branch of the project

requirements: ["datascience", "pytorch", "dataviz"]

# workflows:
# "dist" for python packages with compiled extensions
# "python" for pure python packages.
workflow: "python"
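The rules described for the template section (null means missing, values are strings except for the list-valued python_versions, and project_url falls back to https://github.com/{owner}/{project_name}) can be illustrated with a short sketch. This is not mloq's actual implementation: the function resolve_template and the constant LIST_KEYS are hypothetical names, and the input is a plain dictionary such as one obtained by parsing mloq.yml.

```python
from typing import Any, Dict

# Template keys documented as lists of strings; all other keys hold strings.
LIST_KEYS = {"python_versions"}


def resolve_template(template: Dict[str, Any]) -> Dict[str, Any]:
    """Drop null entries, check value types and fill the project_url default."""
    # null (None) values are treated as missing
    resolved = {key: value for key, value in template.items() if value is not None}
    for key, value in resolved.items():
        if key in LIST_KEYS:
            if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
                raise TypeError(f"{key} must be a list of strings")
        elif not isinstance(value, str):
            raise TypeError(f"{key} must be a string")
    # project_url defaults to https://github.com/{owner}/{project_name}
    has_owner_and_name = all(k in resolved for k in ("owner", "project_name"))
    if "project_url" not in resolved and has_owner_and_name:
        resolved["project_url"] = "https://github.com/{owner}/{project_name}".format(
            owner=resolved["owner"], project_name=resolved["project_name"]
        )
    return resolved


if __name__ == "__main__":
    example = {
        "project_name": "my_project",  # placeholder values for illustration
        "owner": "my-handle",
        "project_url": None,
        "license": "MIT",
        "python_versions": ["3.8", "3.9"],
    }
    print(resolve_template(example)["project_url"])
    # → https://github.com/my-handle/my_project
```

The resulting dictionary is the kind of mapping that would then be handed to the template engine to fill in the generated files.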

3. Features

3.1 Repository files

Set up the following common repository files personalized for your project with the values defined in mloq.yml:

  • README.md
  • DCO.md
  • CONTRIBUTING.md
  • code_of_conduct.md
  • LICENSE
  • .gitignore

3.2 Packaging

Automatic configuration of pyproject.toml and setup.py to distribute your project as a Python package.

3.3 Code style

All the necessary configuration for the following tools is defined in pyproject.toml.

  • black: Automatic code formatter.
  • isort: Rearrange your imports automatically.
  • flakehell: Linter tool built on top of flake8, pylint and pycodestyle.

3.4 Requirements

mloq creates three different requirements files in the root directory of the project. Each file contains pinned dependencies.

  • requirements-lint.txt: Contains the dependencies for running style check analysis and automatic formatting of the code.

  • requirements-test.txt: Dependencies for running pytest, hypothesis and test coverage.

  • requirements.txt: Contains different pre-configured dependencies that can be defined in mloq.yml. The available pre-configured dependencies are:

3.5 Docker

A Dockerfile that builds a container on top of the FragileTech Docker Hub images:

  • If tensorflow or pytorch are selected as requirements the container has CUDA 11.0 installed.
  • Installs all the packages listed in requirements.txt.
  • Installs requirements-test.txt and requirements-lint.txt dependencies.
  • Installs a Jupyter notebook server with a configurable password on port 8080.
  • Installs the project in editable mode (pip install -e .)

3.6 Continuous integration using GitHub Actions

Automatically set up a continuous integration (CI) pipeline using GitHub Actions with the following jobs:

Automatic build and tests:

  • Style Check: Run flake8 and black --check to ensure a consistent code style.
  • Pytest: Test the project using pytest on all supported Python versions and output a code coverage report.
  • Test-docker: Build the project's Docker container and run the tests inside it.
  • Build-pypi: Build the project and upload it to TestPyPI with a version tag unique to each commit.
  • Test-pypi: Install the project from TestPyPI and run the tests using pytest.
  • Bump-version: Automatically bump the project's version and create a tag in the repository every time the default branch is updated.

Deploy each new version:

  • Push-docker-container: Upload the project's Docker container to Docker Hub.
  • Release-package: Upload the source of the project and the corresponding wheels to PyPI.

3.7 Testing

The latest versions of pytest, hypothesis and pytest-cov can be found in requirements-test.txt.

The folder structure for the library and tests is created. A scripts folder containing the scripts that will be run in the CI is also created in the root folder of the project.

4. Project Makefile

A Makefile will be created in the root directory of the project. It contains the following commands:

  • make style: Run isort and black to automatically arrange the imports and format the project.
  • make check: Run flakehell and check black style. If it raises any error the CI will fail.
  • make test: Clear the tests cache and run pytest.
  • make pipenv-install: Install the project in a new Pipenv environment and create a new Pipfile and Pipfile.lock.
  • make pipenv-test: Run pytest inside the project's Pipenv.
  • make docker-build: Build the project's Docker container.
  • make docker-test: Run pytest inside the project's Docker container.
  • make docker-shell: Mount the current project as a docker volume and open a terminal in the project's container.
  • make docker-notebook: Mount the current project as a docker volume and open a Jupyter notebook in the project's container. It exposes the notebook server on port 8080.

5. License

ML Ops Quickstart is released under the MIT license.

6. Contributing

Contributions are very welcome! Please check the contributing guidelines before opening a pull request.

7. Roadmap

  • Improve documentation and test coverage.
  • Configure sphinx to build the docs automatically.
  • Implement checks for additional best practices.
  • Improve command line interface and logging.
  • Add new customization options.


