Package for initializing ML projects following ML Ops best practices.

ML Ops Quickstart

ML Ops Quickstart is a tool for initializing Machine Learning projects following ML Ops best practices.

Setting up new repositories is a time-consuming task that involves creating different files and configuring tools such as linters, docker containers and continuous integration pipelines. The goal of mloq is to simplify that process, so you can start writing code as fast as possible.

mloq generates customized templates for Python projects with a focus on Machine Learning. An example of the generated templates can be found in mloq-template.

Index

  1. Installation

  2. Usage

  3. Features

  4. Project Makefile

  5. License

  6. Contributing

  7. Roadmap

1. Installation

mloq is tested on Ubuntu 18.04+, and supports Python 3.6+.

Install from PyPI

pip install mloq

Install from source

git clone https://github.com/FragileTech/ml-ops-quickstart.git
cd ml-ops-quickstart
pip install -e .
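
After either installation method, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch, not part of mloq: `installed_version` and `describe_install` are hypothetical helper names, and the sketch assumes Python 3.8 or newer, where `importlib.metadata` is part of the standard library.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe_install(package: str = "mloq") -> str:
    """Build a one-line, human-readable installation status message."""
    version = installed_version(package)
    if version is None:
        return f"{package} is not installed in this environment"
    return f"{package} {version} is installed"
```

Calling `print(describe_install())` in the target environment reports whether mloq was found there.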

2. Usage

2.1 Command line interface

Options:

  • --file -f: Name of the configuration file. If the given path is a directory, mloq loads the mloq.yml file it contains.

  • --override -o: Overwrite files that already exist in the target project.

  • --interactive -i: Missing configuration data can be defined interactively from the CLI.

Arguments:

  • OUTPUT: Path to the target project.

Usage examples

To set up a new repository from scratch interactively in the current working directory:

mloq setup -i .

To load the mloq.yml configuration file from the current directory, initialize the directory example, and overwrite all existing files without interactive prompts:

mloq setup -f . -o example
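
When mloq is driven from a script instead of typed by hand, the same invocations can be assembled programmatically. The sketch below is an illustration and not part of mloq: `build_setup_command` and `run_setup` are hypothetical helper names, and only the `setup` subcommand and the -f, -o and -i options documented above are used.

```python
import subprocess
from typing import List, Optional


def build_setup_command(
    output: str,
    config_file: Optional[str] = None,
    override: bool = False,
    interactive: bool = False,
) -> List[str]:
    """Assemble the argument list for `mloq setup` from the documented options."""
    command = ["mloq", "setup"]
    if config_file is not None:
        command += ["-f", config_file]  # configuration file, or a directory containing mloq.yml
    if override:
        command.append("-o")  # overwrite files that already exist in the target
    if interactive:
        command.append("-i")  # ask for missing values on the command line
    command.append(output)  # OUTPUT: path to the target project
    return command


def run_setup(output: str, **options) -> int:
    """Run mloq and return its exit code. Assumes the mloq executable is on PATH."""
    return subprocess.run(build_setup_command(output, **options)).returncode
```

For instance, `build_setup_command("example", config_file=".", override=True)` produces the argument list for the second command shown above.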

2.2 mloq.yml config file

This YAML file contains all the information used by mloq to set up a new project. All values are strings or booleans, except python_versions and requirements, which are lists of strings. null values are interpreted as missing values.

# This yaml file contains all the information used by mloq to set up a new project.
# All values in template are strings and booleans,
# except "python_versions" and "requirements" that are lists of strings.
# "null" values are interpreted as non-defined values.
# ------------------------------------------------------------------------------

# project_config values are necessary to define the files that will be written, and the tools
# that will be configured.
project_config:
  open_source: null  # boolean. If True, set up an Open Source project.
  docker: null  # boolean. If True, set up a Docker image for the project.
  ci: null  # Name of the GitHub Actions CI workflow that will be configured.
  mlflow: null # boolean. If True, configure an MLproject file compatible with MLflow projects.
  requirements: null # List containing the pre-defined requirements of the project.

# template contains all the values that will be written in the generated files.
# They are loaded as a dictionary and passed to jinja2 to fill in the templates.
template:
  project_name: null  # Name of the new Python project
  default_branch: null  # Name of the default branch. Used in the CI push workflow.
  owner: null  # Github handle of the project owner
  author: null  # Person(s) or entity listed as the project author in setup.py
  email: null  # Owner contact email
  copyright_holder: null  # Owner of the project's copyright.
  project_url: null  # Project download url. Defaults to https://github.com/{owner}/{project_name}
  bot_name: null  # GitHub login of the account used to push when bumping the project's version
  bot_email: null # Bot account email
  license: null  # Currently only proprietary and MIT licenses are supported
  description: null  # Short description of the project
  python_versions: null # Supported Python versions
  docker_image: null  # Your project Docker container will inherit from this image.
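
The typing rules described for mloq.yml can be checked before running the tool. The sketch below is a rough illustration under stated assumptions, not mloq's own validation: it works on a plain dictionary shaped like the parsed file (how the YAML is loaded is left out), the helper names `missing_values`, `badly_typed_values` and `default_project_url` are hypothetical, and the sample values are invented for the example.

```python
from typing import Any, Dict, List

# Fields documented as lists of strings; every other field is a string or a boolean.
LIST_FIELDS = {"python_versions", "requirements"}


def missing_values(config: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return "section.key" for every value left as null (None in Python)."""
    return [
        f"{section}.{key}"
        for section, values in config.items()
        for key, value in values.items()
        if value is None
    ]


def badly_typed_values(config: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return "section.key" for every defined value that breaks the typing rules."""
    problems = []
    for section, values in config.items():
        for key, value in values.items():
            if value is None:
                continue  # null means "not defined", which is allowed
            if key in LIST_FIELDS:
                ok = isinstance(value, list) and all(isinstance(v, str) for v in value)
            else:
                ok = isinstance(value, (str, bool))
            if not ok:
                problems.append(f"{section}.{key}")
    return problems


def default_project_url(owner: str, project_name: str) -> str:
    """Build the documented default for project_url."""
    return f"https://github.com/{owner}/{project_name}"


# Hypothetical, partially filled configuration used only for illustration.
example = {
    "project_config": {"open_source": True, "docker": None, "requirements": ["pytorch"]},
    "template": {
        "project_name": "my_project",
        "owner": "my-handle",
        "project_url": None,
        "python_versions": "3.8",  # wrong on purpose: should be a list of strings
    },
}
```

Here `missing_values(example)` lists the keys that interactive mode would still need to ask for, and `badly_typed_values(example)` flags `python_versions` because it is a string instead of a list.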

3. Features

3.1 Repository files

Set up the following common repository files personalized for your project with the values defined in mloq.yml:

  • README.md
  • DCO.md
  • CONTRIBUTING.md
  • code_of_conduct.md
  • LICENSE
  • .gitignore

3.2 Packaging

Automatic configuration of pyproject.toml and setup.py to distribute your project as a Python package.

3.3 Code style

All the necessary configuration for the following tools is defined in pyproject.toml.

  • black: Automatic code formatter.
  • isort: Rearrange your imports automatically.
  • flakehell: Linter tool built on top of flake8, pylint and pycodestyle.

3.4 Requirements

mloq creates three different requirements files in the root directory of the project. Each file contains pinned dependencies.

  • requirements-lint.txt: Contains the dependencies for running style check analysis and automatic formatting of the code.

  • requirements-test.txt: Dependencies for running pytest, hypothesis and test coverage.

  • requirements.txt: Contains different pre-configured dependencies that can be defined in mloq.yml. The available pre-configured dependencies are:

3.5 Docker

A Dockerfile that builds a container on top of the FragileTech Docker Hub images:

  • If tensorflow or pytorch are selected as requirements the container has CUDA 11.0 installed.
  • Installs all the packages listed in requirements.txt.
  • Installs requirements-test.txt and requirements-lint.txt dependencies.
  • Installs a Jupyter notebook server with a configurable password on port 8080.
  • Installs the project with pip install -e ..

3.6 Continuous integration using GitHub Actions

Automatically set up a continuous integration (CI) pipeline using GitHub Actions with the following jobs:

Automatic build and tests:

  • Style Check: Run flake8 and black --check to ensure a consistent code style.
  • Pytest: Test the project using pytest on all supported Python versions and output a code coverage report.
  • Test-docker: Build the project's Docker container and run the tests inside it.
  • Build-pypi: Build the project and upload it to Test PyPI with a version tag unique to each commit.
  • Test-pypi: Install the project from Test PyPI and run the tests using pytest.
  • Bump-version: Automatically bump the project's version and create a tag in the repository every time the default branch is updated.

Deploy each new version:

  • Push-docker-container: Upload the project's Docker container to Docker Hub.
  • Release-package: Upload the project's source distribution and the corresponding wheels to PyPI.

3.7 Testing

The latest versions of pytest, hypothesis and pytest-cov can be found in requirements-test.txt.

The folder structure for the library and tests is created. A scripts folder containing the scripts that will be run in the CI is also created in the root folder of the project.

4. Project Makefile

A Makefile will be created in the root directory of the project. It contains the following commands:

  • make style: Run isort and black to automatically arrange the imports and format the project.
  • make check: Run flakehell and check black style. If it raises any error the CI will fail.
  • make test: Clear the tests cache and run pytest.
  • make pipenv-install: Install the project in a new Pipenv environment and create a new Pipfile and Pipfile.lock.
  • make pipenv-test: Run pytest inside the project's Pipenv.
  • make docker-build: Build the project's Docker container.
  • make docker-test: Run pytest inside the project's Docker container.
  • make docker-shell: Mount the current project as a docker volume and open a terminal in the project's container.
  • make docker-notebook: Mount the current project as a Docker volume and open a Jupyter notebook in the project's container. It exposes the notebook server on port 8080.
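
Several of these targets are often run in sequence before pushing, for example style, check and test. The sketch below shows one way to chain them from Python and stop at the first failure. It is an illustration only: `make_commands` and `run_targets` are hypothetical helper names, and the sketch assumes `make` is installed and that the generated Makefile sits in the directory passed as `cwd`.

```python
import subprocess
from typing import Iterable, List

# Targets listed in the generated Makefile, as documented above.
MAKE_TARGETS = (
    "style",
    "check",
    "test",
    "pipenv-install",
    "pipenv-test",
    "docker-build",
    "docker-test",
    "docker-shell",
    "docker-notebook",
)


def make_commands(targets: Iterable[str]) -> List[List[str]]:
    """Translate target names into `make` argument lists, rejecting unknown names."""
    commands = []
    for target in targets:
        if target not in MAKE_TARGETS:
            raise ValueError(f"unknown make target: {target}")
        commands.append(["make", target])
    return commands


def run_targets(targets: Iterable[str], cwd: str = ".") -> bool:
    """Run the targets in order and return False as soon as one of them fails."""
    for command in make_commands(targets):
        if subprocess.run(command, cwd=cwd).returncode != 0:
            return False
    return True
```

A call such as `run_targets(["style", "check", "test"])` from the project root formats the code, checks the style and then runs the tests, skipping the later steps if an earlier one fails.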

5. License

ML Ops Quickstart is released under the MIT license.

6. Contributing

Contributions are very welcome! Please check the contributing guidelines before opening a pull request.

7. Roadmap

  • Improve documentation and test coverage.
  • Configure sphinx to build the docs automatically.
  • Implement checks for additional best practices.
  • Improve command line interface and logging.
  • Add new customization options.
