Official CLI for MLDock — manage datasets, trainers, training jobs, and deployments from your terminal.

mldock

Works on macOS, Linux, and Windows.


Installation

pip install mldock-io

With Docker-based local testing support:

pip install mldock-io[docker]

Quick Start

# 1. Sign in via browser
mldock login

# 2. Upload a dataset
mldock dataset create "churn-data"
mldock dataset upload data.csv --dataset-id <id>

# 3. Publish a trainer
mldock trainer publish my_trainer.py

# 4. Start training
mldock train start my_trainer --dataset <id> --follow

# 5. View deployed model
mldock model list

# 6. Run inference
mldock model predict my_trainer --input '{"age": 35, "balance": 1200}'

Authentication

Browser-based login (recommended)

mldock login

Opens your browser to the MLDock sign-in page. After you sign in and click Authorize, the CLI is authenticated automatically. Your session is saved to ~/.mldock/session.json.

mldock login --no-browser   # Print the URL instead of opening it

Self-hosted instances: the API and frontend run on separate ports (API: 8030, frontend: 5200). Use both flags:

mldock login \
  --base-url http://localhost:8030 \
  --frontend-url http://localhost:5200

--base-url sets the API endpoint for all CLI calls. --frontend-url overrides the host in the browser login URL (the server's configured FRONTEND_BASE_URL may be an internal address unreachable from your machine).
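For illustration, here is one way a CLI could apply such a frontend override: swap the scheme, host, and port of the server-returned login URL while keeping the path and query (e.g. a one-time token) intact. This is a sketch of the behaviour described above, not mldock's actual implementation, and the internal-frontend hostname is hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit

def override_frontend(login_url: str, frontend_url: str) -> str:
    """Replace the scheme+host+port of login_url with those of
    frontend_url, preserving path, query, and fragment."""
    target = urlsplit(frontend_url)
    original = urlsplit(login_url)
    return urlunsplit((target.scheme, target.netloc,
                       original.path, original.query, original.fragment))

override_frontend("http://internal-frontend:3000/cli-login?token=abc123",
                  "http://localhost:5200")
# → "http://localhost:5200/cli-login?token=abc123"
```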

Environment variables

export MLDOCK_BASE_URL=https://www.mldock.io       # API base URL
export MLDOCK_FRONTEND_URL=http://localhost:5200    # browser login URL (self-hosted only)
export MLDOCK_API_KEY=your-api-key                  # used by `mldock model predict` only

Check current session

mldock whoami

Sign out

mldock logout

Datasets

# List all datasets
mldock dataset list

# Create an empty dataset
mldock dataset create "my-dataset" --description "Training data" --visibility private

# Upload a CSV or JSON file into a dataset
mldock dataset upload data.csv --dataset-id <id>
mldock dataset upload data.jsonl --dataset-id <id>

# Download a dataset to your machine
mldock dataset pull <id>                    # saves as <name>.csv in current dir
mldock dataset pull <id> --output ./data --format jsonl

# Show dataset details
mldock dataset info <id>

# Delete a dataset
mldock dataset delete <id>

Supported upload formats: .csv, .json (array), .jsonl (one object per line)


Trainers

# List trainers in your workspace
mldock trainer list

# Publish a trainer file
mldock trainer publish my_trainer.py
mldock trainer publish my_trainer.py --name "churn-v2" --wait   # wait for scan approval

# Show trainer details
mldock trainer info <name>

# Delete (deactivate) a trainer
mldock trainer delete <name>

Testing a trainer locally

# Test with sample input
mldock trainer test my_trainer.py --input '{"age": 35, "balance": 1200}'

# Test with a local dataset
mldock trainer test my_trainer.py --input '{}' --dataset ./data.csv

# Test with a remote MLDock dataset
mldock trainer test my_trainer.py --input '{}' --remote-dataset <dataset-id>

How local testing works:

  • Docker available: runs inside a container. The container is kept alive, so re-runs skip the cold start.
  • Docker not available: creates a virtualenv at ~/.mldock/envs/<name>/. Dependencies are installed once; re-runs skip the install if requirements are unchanged.

Works on macOS, Linux, and Windows. No Docker account required for the venv path.

Your trainer class must have train() and predict() methods. Dependencies are auto-detected from imports in your .py file.
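Import-based dependency detection can be sketched with the standard ast module. This illustrates the general idea only; mldock's real detector, and how it maps module names to pip package names, may differ:

```python
import ast

def detect_imports(source: str) -> set[str]:
    """Collect top-level module names imported by a trainer file."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules
```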


Training Jobs

# Start a training run
mldock train start my_trainer
mldock train start my_trainer --dataset <id>          # with dataset
mldock train start my_trainer --gpu                   # request GPU
mldock train start my_trainer --params '{"lr": 0.01}' # with hyperparams
mldock train start my_trainer --follow                 # stream until done

# Check job status
mldock train status <job-id>
mldock train status <job-id> --follow   # poll until complete

# List recent jobs
mldock train list
mldock train list --limit 50

# Cancel a job
mldock train cancel <job-id>

Models (Deployments)

# List all deployed models
mldock model list

# Show deployment details
mldock model info <deployment-id>

# Run inference against a deployed model
mldock model predict my_trainer --input '{"text": "hello"}'
mldock model predict my_trainer --input '{"text": "hello"}' --api-key <key>

# View production metrics
mldock model metrics <deployment-id>

# Roll back to previous version
mldock model rollback <deployment-id>

# Delete a deployment
mldock model delete <deployment-id>

Platform

# Check connectivity
mldock status

# Show CLI version
mldock --version

Configuration

  • MLDOCK_BASE_URL: MLDock API base URL. Default: https://www.mldock.io
  • MLDOCK_FRONTEND_URL: browser login URL (self-hosted only; overrides scheme+host+port in the login URL returned by the server). No default.
  • MLDOCK_API_KEY: API key for inference calls. No default.

Session file: ~/.mldock/session.json (permissions: 600 — owner read/write only)

Local virtualenvs: ~/.mldock/envs/<trainer-name>/

Docker workspaces: ~/.mldock/workspaces/<trainer-name>/


Trainer File Format

Your trainer must be a Python file with a class that has train() and predict():

# my_trainer.py
from sklearn.ensemble import GradientBoostingClassifier
import pandas as pd

class ChurnPredictor:
    def __init__(self):
        self.model = None

    def train(self, dataset_path: str, **kwargs):
        df = pd.read_csv(dataset_path)
        X = df.drop("churn", axis=1)
        y = df["churn"]
        self.model = GradientBoostingClassifier()
        self.model.fit(X, y)

    def predict(self, input_data: dict) -> dict:
        if self.model is None:
            return {"error": "not trained"}
        X = pd.DataFrame([input_data])
        prob = self.model.predict_proba(X)[0][1]
        return {"churn_probability": round(float(prob), 4)}

Test it locally before publishing:

mldock trainer test my_trainer.py --input '{"age": 35, "balance": 0}'

Publishing to PyPI

pip install build twine
python -m build
twine upload dist/*

Planned Features

  • mldock trainer logs <name> — stream training logs in real time
  • mldock dataset annotate <id> — open annotation task in browser
  • mldock model ab-test create — set up A/B test between two versions
  • mldock model shadow <id> — enable shadow mode for safe promotion
  • mldock drift status <trainer> — view drift detection status
  • mldock team invite <email> — invite a team member to your workspace
  • mldock api-key create — create a scoped API key
  • mldock init — scaffold a new trainer project from a template

License

MIT — Kreateyou Technologies Ltd

