Package for initializing ML projects following ML Ops best practices.
ML Ops Quickstart
ML Ops Quickstart is a tool for initializing Machine Learning projects following ML Ops best practices.
Setting up new repositories is a time-consuming task that involves creating different files and configuring tools such as linters, Docker containers, and continuous integration pipelines. The goal of mloq is to simplify that process so you can start writing code as fast as possible.
mloq generates customized templates for Python projects with a focus on Machine Learning. An example of the generated templates can be found in mloq-template.
1. Installation
mloq is tested on Ubuntu 18.04+ and supports Python 3.6+.
Install from PyPI:
pip install mloq
Install from source:
git clone https://github.com/FragileTech/ml-ops-quickstart.git
cd ml-ops-quickstart
pip install -e .
2. Usage
2.1 Command line interface
To set up a new repository from scratch interactively in the current working directory:
mloq quickstart .
To load a configuration file from the current directory, initialize the directory example, and override all existing files:
mloq quickstart -f . -o example
Options:
- --file, -f: Name of the configuration file. If a directory is passed, the mloq.yml file present in it will be loaded.
- --override, -o: Rewrite files that already exist in the target project.
Arguments:
- OUTPUT: Path to the target project.
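For example, combining both flags to scaffold a project non-interactively (the directory name my-project is illustrative):
```bash
# Reuse the configuration stored in mloq.yml and overwrite any existing files
# in the target directory.
mloq quickstart --file mloq.yml --override my-project
```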
2.2 mloq.yml config file
This YAML file contains all the information used by mloq to set up a new project. All values are strings except python_versions and requirements, which are lists of strings. null values are interpreted as missing values.
# `template` contains all the values that will be written in the generated files.
# They are loaded as a dictionary and passed to jinja2 to fill in the templates.
template:
project_name: null # Name of the new Python project
owner: null # GitHub handle of the project owner
author: null # Person or entity listed as the project author in setup.py
email: null # Owner contact email
copyright_holder: null # Owner of the project's copyright.
project_url: null # GitHub project url. Defaults to https://github.com/{owner}/{project_name}
download_url: null # Download link
bot_name: null # Bot account to push from ci when bumping the project's version
bot_email: null # Bot account email
license: "MIT" # Currently only MIT license is supported
description: "example_description" # Short description of the project
python_versions: ['3.6', '3.7', '3.8', '3.9'] # Supported Python versions
default_branch: "master" # Name of the default git branch of the project
requirements: ["datascience", "pytorch", "dataviz"]
# workflows:
# "dist" for python packages with compiled extensions
# "python" for pure python packages.
workflow: "python"
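mloq reads this file itself; purely to illustrate its structure, the snippet below (a sketch assuming PyYAML is installed, not part of mloq's API) loads it and lists the values still left as null:
```python
# Illustrative only: inspect an mloq.yml and report the template values left unset.
import yaml  # PyYAML, assumed to be available

with open("mloq.yml") as config_file:
    config = yaml.safe_load(config_file)

template = config["template"]
missing = [key for key, value in template.items() if value is None]
print("Values mloq will ask for interactively:", missing)
```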
3. Features
3.1 Repository files
Set up the following common repository files, personalized for your project with the values defined in mloq.yml:
- README.md
- DCO.md
- CONTRIBUTING.md
- code_of_conduct.md
- LICENSE
- .gitignore
3.2 Packaging
Automatic configuration of pyproject.toml and setup.py to distribute your project as a Python package.
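As a rough sketch of how the mloq.yml values end up in the packaging files (every field below is illustrative, not the exact generated template):
```python
# Illustrative sketch of a generated setup.py; the real template may differ.
from setuptools import find_packages, setup

setup(
    name="example_project",            # template.project_name
    description="example_description", # template.description
    author="Jane Doe",                 # template.author (illustrative)
    author_email="owner@example.com",  # template.email (illustrative)
    url="https://github.com/owner/example_project",  # template.project_url
    license="MIT",                     # template.license
    packages=find_packages(),
    python_requires=">=3.6",           # derived from template.python_versions
)
```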
3.3 Code style
All the necessary configuration for the following tools is defined in pyproject.toml.
- black: Automatic code formatter.
- isort: Rearrange your imports automatically.
- flakehell: Linter tool built on top of flake8, pylint and pycodestyle.
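For reference, the tool sections written to pyproject.toml look roughly like this (the values shown are illustrative, not the exact generated configuration):
```toml
# Illustrative snippet; the generated pyproject.toml may use different values.
[tool.black]
line-length = 99

[tool.isort]
profile = "black"
line_length = 99

[tool.flakehell]
max_line_length = 99
```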
3.4 Requirements
mloq creates three different requirements files in the root directory of the project. Each file contains pinned dependencies.
- requirements-lint.txt: Contains the dependencies for running style check analysis and automatic formatting of the code.
- requirements-test.txt: Dependencies for running pytest, hypothesis and test coverage.
- requirements.txt: Contains different pre-configured dependencies that can be defined in mloq.yml. The available pre-configured dependencies are:
  - data-science: Common data science libraries.
  - data-visualization: Common visualization libraries.
  - Latest versions of pytorch and tensorflow.
3.5 Docker
A Dockerfile that builds a container on top of the FragileTech Docker Hub images (a sketch follows the list):
- If tensorflow or pytorch are selected as requirements, the container has CUDA 11.0 installed.
- Installs all the packages listed in requirements.txt.
- Installs the requirements-test.txt and requirements-lint.txt dependencies.
- Installs a jupyter notebook server with a configurable password on port 8080.
- Installs the project with pip install -e .
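A minimal sketch of what such a Dockerfile does; the base image tag below is hypothetical and the generated file depends on the selected requirements:
```dockerfile
# Illustrative sketch; not the Dockerfile mloq actually generates.
# The base image tag below is hypothetical.
FROM fragiletech/ubuntu18.04-cuda-11.0-py38
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt -r requirements-test.txt -r requirements-lint.txt && \
    pip install -e . && \
    pip install jupyter
EXPOSE 8080
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8080", "--allow-root"]
```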
3.6 Continuous integration using GitHub Actions
Automatically set up a continuous integration (CI) pipeline using GitHub Actions with the following jobs (a sketch of one job is shown after the list):
Automatic build and tests:
- Style Check: Run flake8 and black --check to ensure a consistent code style.
- Pytest: Test the project using pytest on all supported Python versions and output a code coverage report.
- Test-docker: Build the project's Docker container and run the tests inside it.
- Build-pypi: Build the project and upload it to Test PyPI with a version tag unique to each commit.
- Test-pypi: Install the project from Test PyPI and run the tests using pytest.
- Bump-version: Automatically bump the project's version and create a tag in the repository every time the default branch is updated.
Deploy each new version:
- Push-docker-container: Upload the project's Docker container to Docker Hub.
- Release-package: Upload the project's source distribution and wheels to PyPI.
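As an illustration of the shape of these jobs, a stripped-down style-check job could look like the following (the file name and exact steps are a sketch, not the generated workflow):
```yaml
# Illustrative sketch of a single CI job; the generated workflow defines many more.
name: ci
on: [push, pull_request]
jobs:
  style-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - run: pip install -r requirements-lint.txt
      - run: black --check .
      - run: flake8 .
```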
3.7 Testing
The latest versions of pytest, hypothesis and pytest-cov can be found in requirements-test.txt.
The folder structure for the library and tests is created. A scripts folder containing the scripts that will be run in the CI will also be created in the root folder of the project.
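As an example of the kind of test that runs under this setup (the test below is hypothetical, not generated by mloq), a pytest and hypothesis test placed under the tests folder could look like this:
```python
# Hypothetical example test; place files like this under the generated tests folder.
from hypothesis import given
from hypothesis import strategies as st


@given(st.integers(), st.integers())
def test_addition_is_commutative(a, b):
    assert a + b == b + a
```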
4. Project Makefile
A Makefile will be created in the root directory of the project (a sketch of a couple of targets follows the list). It contains the following commands:
- make style: Run isort and black to automatically arrange the imports and format the project.
- make check: Run flakehell and check black style. If it raises any error the CI will fail.
- make test: Clear the tests cache and run pytest.
- make pipenv-install: Install the project in a new Pipenv environment and create a new Pipfile and Pipfile.lock.
- make pipenv-test: Run pytest inside the project's Pipenv.
- make docker-build: Build the project's Docker container.
- make docker-test: Run pytest inside the project's Docker container.
- make docker-shell: Mount the current project as a Docker volume and open a terminal in the project's container.
- make docker-notebook: Mount the current project as a Docker volume and open a jupyter notebook in the project's container. It exposes the notebook server on port 8080.
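A sketch of how a couple of these targets could be written (the recipes in the generated Makefile may differ):
```make
# Illustrative Makefile excerpt; not the exact generated recipes.
.PHONY: style check test

style:
	isort .
	black .

check:
	flakehell lint
	black --check .

test:
	pytest
```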
5. License
ML Ops Quickstart is released under the MIT license.
6. Contributing
Contributions are very welcome! Please check the contributing guidelines before opening a pull request.
7. Roadmap
- Improve documentation and test coverage.
- Configure sphinx to build the docs automatically.
- Implement checks for additional best practices.
- Improve command line interface and logging.
- Add new customization options.
File details
Details for the file mloq-0.0.16.tar.gz.
File metadata
- Download URL: mloq-0.0.16.tar.gz
- Upload date:
- Size: 38.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 524bb9efa0e474c87e956bc20a5e35fd481aa906f9496a71bec5988486b43dfa
MD5 | aae48b94b70437f8e187506e9146eb91
BLAKE2b-256 | 15185d1904080afb5b821310fe323fd66c92b5955ab1cfb35dcb08859583f8d2
File details
Details for the file mloq-0.0.16-py3-none-any.whl.
File metadata
- Download URL: mloq-0.0.16-py3-none-any.whl
- Upload date:
- Size: 45.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | f44687190e779d33363beac9bdc3a404b9d70f3b24ad241a7220dbee0952d43d
MD5 | f5c5df91e76e325ef952c3a6cdd92fa6
BLAKE2b-256 | 7f70db5d3d6803e6e12d9327698ced426c5d19bce5ff5c7a3e03abc4b3f761eb