
Production-ready experiment tracker


Matyan Backend

REST API and workers for the Matyan experiment-tracking stack (a fork of Aim). The API serves reads and control operations from FoundationDB; the workers consume ingestion and control events from Kafka; artifact blobs are stored in S3, GCS, or Azure. The UI talks to this API. Training clients send data through the frontier, which publishes to Kafka for these workers to consume.

Layout

  • src/matyan_backend/ — Python package: FastAPI app (app.py), API routes under api/ (runs, experiments, tags, projects, dashboards, reports, streaming), storage/ (FDB + S3/GCS/Azure), workers/ (ingestion + control Kafka consumers), jobs/ (FDB lock, used by CLI cleanup commands), backup/ (export/restore), CLI in cli.py.
  • Entrypoints: matyan-backend start (API server, default port 53800), matyan-backend ingest-worker, matyan-backend control-worker; plus one-off CLI commands (reindex, backup, restore, finish-stale, cleanup-orphan-blobs, cleanup-tombstones, convert tensorboard).

Prerequisites

  • Python 3.12+. The repo uses uv: run the CLI with uv run matyan-backend, or install the package and call matyan-backend directly.
  • Runtime dependencies: FoundationDB (cluster file), Kafka (for workers), blob store. For local dev, typically run FDB + Kafka + S3 (RustFS) via docker-compose.

Run

  • API server: uv run matyan-backend start (or matyan-backend start). Options: --host, --port (defaults: 0.0.0.0, 53800). The API is served under /api/v1; health checks are at /health/ready/ and /health/live/; metrics are at /metrics/ when enabled.
  • Workers: uv run matyan-backend ingest-worker and uv run matyan-backend control-worker. Both require Kafka and FDB; the ingestion worker also writes to FDB and reads the blob storage config to resolve blob references.
  • CLI (one-off): reindex (rebuild indexes), backup / restore, finish-stale, cleanup-orphan-blobs, cleanup-tombstones. See the backend CLI help (matyan-backend cleanup-orphan-blobs --help, matyan-backend cleanup-tombstones --help) and References — CLI for all options. Cleanup commands are intended for CronJobs or cron; use --dry-run to preview and --lock-ttl-seconds for FDB-based single-run locking. Optional: convert tensorboard to convert TensorBoard logs to backup format.
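
The health endpoints listed above can be polled from a script, for example before starting a smoke test. The following is a minimal sketch using only the standard library; the helper names `health_url` and `is_healthy` are hypothetical, and treating any 2xx response as healthy is an assumption rather than documented behavior.

```python
import urllib.error
import urllib.request


def health_url(host: str, port: int = 53800, probe: str = "ready") -> str:
    """Build the URL for a health probe ("ready" or "live")."""
    if probe not in ("ready", "live"):
        raise ValueError(f"unknown probe: {probe!r}")
    return f"http://{host}:{port}/health/{probe}/"


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with a 2xx status.

    Assumption: any 2xx response means healthy. Connection errors,
    timeouts, and HTTP error statuses all count as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False
```

With a server started locally on the default port, `is_healthy(health_url("localhost"))` would check readiness and `is_healthy(health_url("localhost", probe="live"))` would check liveness. Note that 0.0.0.0 is the bind address, so a client connects to localhost or the host's real address instead.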

Configuration (environment variables)

| Variable | Default | Purpose |
| --- | --- | --- |
| MATYAN_ENVIRONMENT / ENVIRONMENT | development | When production, strict checks apply (see Production configuration). |
| LOG_LEVEL | INFO | Log level (loguru + uvicorn). |
| FDB_CLUSTER_FILE | fdb.cluster | Path to FoundationDB cluster file. |
| BLOB_BACKEND_TYPE | s3 | Storage backend: s3, gcs, or azure. |
| S3_ENDPOINT | http://localhost:9000 | S3-compatible API URL. |
| S3_ACCESS_KEY / S3_SECRET_KEY | (dev defaults) | S3 credentials. |
| S3_BUCKET | matyan-artifacts | Bucket for artifacts (when using s3). |
| S3_REGION | us-east-1 | S3 region (default: us-east-1). |
| GCS_BUCKET | matyan-artifacts | Bucket for artifacts (when using gcs). |
| AZURE_CONTAINER | matyan-artifacts | Container for artifacts (when using azure). |
| AZURE_CONN_STR | "" | Azure connection string. |
| AZURE_ACCOUNT_URL | "" | Azure account URL (for DefaultAzureCredential). |
| BLOB_URI_SECRET | (dev default) | Fernet key for blob URIs; must be set in production. |
| KAFKA_BOOTSTRAP_SERVERS | localhost:9092 | Kafka broker list. |
| KAFKA_DATA_INGESTION_TOPIC | data-ingestion | Topic for ingestion messages. |
| KAFKA_CONTROL_EVENTS_TOPIC | control-events | Topic for control events. |
| KAFKA_SECURITY_PROTOCOL / KAFKA_SASL_* | (empty) | Optional Kafka SASL. |
| METRICS_ENABLED | true | Expose Prometheus metrics. |
| METRICS_PORT | 9090 | Port for metrics HTTP server (workers). |
| INGEST_MAX_MESSAGES_PER_TXN | 100 | Max messages per FDB transaction (ingestion worker). |
| INGEST_MAX_TXN_BYTES | 8388608 (8 MB) | Target max transaction size; FDB limit is 10 MB. |
| CORS_ORIGINS | (localhost list) | Allowed origins for CORS. |
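
To illustrate how the two INGEST_* limits interact, here is a rough sketch of batching messages so that each batch stays within both the message-count and byte budgets. It is not the worker's actual implementation: `batch_messages` is a hypothetical helper, and it uses payload length as a stand-in for transaction size, whereas a real FDB transaction also counts keys and overhead (one reason the default target sits below the 10 MB limit).

```python
from collections.abc import Iterable, Iterator

MAX_MESSAGES_PER_TXN = 100        # INGEST_MAX_MESSAGES_PER_TXN default
MAX_TXN_BYTES = 8 * 1024 * 1024   # INGEST_MAX_TXN_BYTES default (8388608)


def batch_messages(
    messages: Iterable[bytes],
    max_messages: int = MAX_MESSAGES_PER_TXN,
    max_bytes: int = MAX_TXN_BYTES,
) -> Iterator[list[bytes]]:
    """Yield batches that respect both the count and the byte budget.

    Hypothetical helper. A single message larger than max_bytes is still
    emitted, alone in its own batch, so the caller must handle that case.
    """
    batch: list[bytes] = []
    batch_bytes = 0
    for message in messages:
        size = len(message)
        # Flush first if adding this message would break either limit.
        if batch and (len(batch) >= max_messages or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(message)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch would then map to one transaction; whichever limit is hit first closes the batch.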

Source of truth: config.py.

Production configuration

See docs/PRODUCTION_CONFIG.md for enabling production mode (MATYAN_ENVIRONMENT=production), required overrides, and supplying secrets via env or a secrets backend.
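
As a rough illustration of what a strict production check might look like, the sketch below validates the one requirement stated in the configuration table: BLOB_URI_SECRET must be set in production. The function names are hypothetical and the full set of checks lives in docs/PRODUCTION_CONFIG.md, not here. The key-format test relies on the fact that a Fernet key is 32 bytes encoded as URL-safe base64; the precedence of MATYAN_ENVIRONMENT over ENVIRONMENT is an assumption.

```python
import base64
import binascii
from collections.abc import Mapping


def is_fernet_key(value: str) -> bool:
    """Return True if value decodes to 32 bytes of URL-safe base64."""
    try:
        return len(base64.urlsafe_b64decode(value.encode("ascii"))) == 32
    except (binascii.Error, ValueError):
        return False


def production_problems(env: Mapping[str, str]) -> list[str]:
    """List configuration problems; empty outside production mode.

    Hypothetical check covering only BLOB_URI_SECRET.
    """
    # Assumption: MATYAN_ENVIRONMENT wins when both names are set.
    environment = env.get("MATYAN_ENVIRONMENT") or env.get("ENVIRONMENT") or "development"
    if environment != "production":
        return []
    problems: list[str] = []
    secret = env.get("BLOB_URI_SECRET", "")
    if not secret:
        problems.append("BLOB_URI_SECRET must be set in production")
    elif not is_fernet_key(secret):
        problems.append("BLOB_URI_SECRET is not a valid Fernet key")
    return problems
```

A deployment script could call `production_problems(os.environ)` at startup and refuse to continue if the returned list is non-empty.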

Deployment

  • Docker: Build the backend image (context from repo root); run API and workers as separate processes or containers.
  • Kubernetes/Helm: The chart in deploy/helm/matyan deploys the backend API, ingestion worker, and control worker as separate Deployments; optional CronJobs for cleanup-orphan-blobs and cleanup-tombstones. Configure FDB, blob storage (S3, GCS, Azure), and Kafka via chart values; see the chart README. Set MATYAN_ENVIRONMENT=production and required env for production.

Related

  • UI: matyan-ui calls this backend REST API.
  • Frontier: matyan-frontier publishes to Kafka; backend workers consume.
  • API models: matyan-api-models shared types (Kafka messages, run creation, etc.).
  • Monorepo: This package lives under extra/matyan-backend in the matyan-core repo.
