
Ingestion gateway: WebSocket + presigned S3 URLs, publishes to Kafka

Matyan Frontier

Ingestion gateway between training clients and the rest of the Matyan stack. Clients connect via WebSocket for metrics, params, and logs, and via REST for presigned S3 URLs; the frontier publishes to Kafka. The UI and backend do not talk to the frontier. Part of the Matyan experiment-tracking stack (fork of Aim).

Layout

  • src/matyan_frontier/ — Python package: FastAPI app, WebSocket handler, REST presign endpoint, Kafka producer, config, health, metrics.
  • Entrypoints: app.py (lifespan: start Kafka producer, create S3 clients, ensure bucket; shutdown: flush producer, close S3).
  • Routes: GET /api/v1/ws/runs/{run_id} (WebSocket), POST /api/v1/rest/artifacts/presign, GET /health/ready/, GET /health/live/, GET /metrics/ (Prometheus).
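
The lifespan ordering described for app.py can be sketched as follows. This is a minimal illustration using only the standard library, not the package's actual code: StubProducer, StubS3, and the events list are hypothetical stand-ins for the real Kafka producer and S3 clients, and the bounded flush assumes the behaviour suggested by SHUTDOWN_FLUSH_TIMEOUT.

```python
import asyncio
from contextlib import asynccontextmanager


class StubProducer:
    """Hypothetical stand-in for a Kafka producer; records calls only."""

    def __init__(self, events):
        self.events = events

    async def start(self):
        self.events.append("producer.start")

    async def flush(self):
        self.events.append("producer.flush")


class StubS3:
    """Hypothetical stand-in for an S3 client; records calls only."""

    def __init__(self, events):
        self.events = events

    async def ensure_bucket(self, bucket):
        self.events.append(f"s3.ensure_bucket:{bucket}")

    async def close(self):
        self.events.append("s3.close")


@asynccontextmanager
async def lifespan(events, bucket="matyan-artifacts", flush_timeout=5.0):
    # Startup: start the producer, create the S3 client, ensure the bucket.
    producer = StubProducer(events)
    s3 = StubS3(events)
    await producer.start()
    await s3.ensure_bucket(bucket)
    try:
        yield producer, s3
    finally:
        # Shutdown: flush the producer with a bounded wait, then close S3.
        try:
            await asyncio.wait_for(producer.flush(), timeout=flush_timeout)
        except asyncio.TimeoutError:
            events.append("producer.flush_timeout")
        await s3.close()


async def main():
    events = []
    async with lifespan(events):
        events.append("serving")
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Wrapping the flush in asyncio.wait_for keeps a slow or unreachable broker from blocking shutdown indefinitely, and S3 is closed whether or not the flush completes in time.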

Prerequisites

  • Python 3.12+. The repo uses uv: run uv run matyan-frontier from the repo, or install the package and use the matyan-frontier CLI directly.
  • Runtime dependencies: Kafka (bootstrap reachable) and an S3-compatible store (e.g. MinIO/RustFS in dev, AWS S3 in prod). The smoke test and local dev assume Kafka and S3 are up (e.g. via docker-compose).

Run (production-like)

From the frontier package directory: uv run matyan-frontier start (or matyan-frontier start if installed).

Options: --host and --port (defaults: 0.0.0.0 and 53801). The CLI applies these option defaults itself; config.py also defines host and port for other entry points.

Configuration (environment variables)

| Variable | Default | Purpose |
| --- | --- | --- |
| MATYAN_ENVIRONMENT / ENVIRONMENT | development | When production, S3/Kafka must be non-dev (validated at startup). |
| LOG_LEVEL | INFO | Log level (loguru + uvicorn). |
| HOST | 0.0.0.0 | Bind address. |
| PORT | 53801 | Bind port. |
| KAFKA_BOOTSTRAP_SERVERS | localhost:9092 | Kafka broker list. |
| KAFKA_DATA_INGESTION_TOPIC | data-ingestion | Topic for ingestion messages. |
| KAFKA_SECURITY_PROTOCOL / KAFKA_SASL_* | (empty) | Optional Kafka SASL. |
| S3_ENDPOINT | http://localhost:9000 | S3 API endpoint (e.g. MinIO). |
| S3_PUBLIC_ENDPOINT | "" | Optional; used for presigned URLs if different from S3_ENDPOINT. |
| S3_ACCESS_KEY / S3_SECRET_KEY | (dev defaults) | S3 credentials. |
| S3_BUCKET | matyan-artifacts | Bucket for artifacts. |
| S3_PRESIGN_EXPIRY | 3600 | Presigned URL expiry (seconds). |
| SHUTDOWN_FLUSH_TIMEOUT | 5.0 | Seconds to wait for Kafka flush on shutdown. |
| METRICS_ENABLED | true | Expose Prometheus /metrics/. |
| CORS_ORIGINS | (localhost list) | Allowed origins (comma-separated or repeated). |

Source of truth: config.py.
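
To make the defaults above concrete, here is a minimal sketch of reading a subset of these variables with the standard library. It is not a copy of config.py, which may use a different mechanism and different parsing rules. FrontierSettings, load_settings, and presign_endpoint are hypothetical names, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass


def _env_bool(value: str) -> bool:
    # Assumed boolean spellings; the real config may accept others.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class FrontierSettings:
    """Hypothetical container for a subset of the documented variables."""

    host: str
    port: int
    kafka_bootstrap_servers: str
    kafka_data_ingestion_topic: str
    s3_endpoint: str
    s3_public_endpoint: str
    s3_bucket: str
    s3_presign_expiry: int
    shutdown_flush_timeout: float
    metrics_enabled: bool

    @property
    def presign_endpoint(self) -> str:
        # S3_PUBLIC_ENDPOINT is used for presigned URLs only when it is set.
        return self.s3_public_endpoint or self.s3_endpoint


def load_settings(env=None) -> FrontierSettings:
    """Read settings from a mapping (os.environ by default) with table defaults."""
    env = os.environ if env is None else env
    return FrontierSettings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "53801")),
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        kafka_data_ingestion_topic=env.get("KAFKA_DATA_INGESTION_TOPIC", "data-ingestion"),
        s3_endpoint=env.get("S3_ENDPOINT", "http://localhost:9000"),
        s3_public_endpoint=env.get("S3_PUBLIC_ENDPOINT", ""),
        s3_bucket=env.get("S3_BUCKET", "matyan-artifacts"),
        s3_presign_expiry=int(env.get("S3_PRESIGN_EXPIRY", "3600")),
        shutdown_flush_timeout=float(env.get("SHUTDOWN_FLUSH_TIMEOUT", "5.0")),
        metrics_enabled=_env_bool(env.get("METRICS_ENABLED", "true")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of os.environ makes the fallback behaviour easy to check, for example that presigned URLs fall back to S3_ENDPOINT when S3_PUBLIC_ENDPOINT is empty.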

Development and smoke test

  • Development: Run Kafka and S3 (e.g. docker compose up kafka kafka-init rustfs or equivalent), then uv run matyan-frontier start. Clients (e.g. matyan-client) point at the frontier URL for WebSocket and presign.
  • Smoke test: From extra/matyan-frontier, run uv run python scripts/smoke_test.py. Prerequisites: Kafka, S3, and frontier running. The script covers WebSocket (create_run, log_metric, log_hparams, finish_run, etc.) and REST presign, and optionally verifies Kafka consumption.

See scripts/smoke_test.py for the exact command and what it checks.
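
Before running the smoke test or pointing a client at the frontier, it can help to derive the endpoint URLs and wait for the readiness route to respond. The sketch below is not part of scripts/smoke_test.py. The helper names (frontier_urls, http_ok, wait_until_ready) are hypothetical, the base URL is assumed to carry no path prefix, and a 2xx response from /health/ready/ is assumed to mean "ready".

```python
import time
import urllib.error
import urllib.request
from urllib.parse import quote, urlsplit, urlunsplit


def frontier_urls(base_url: str, run_id: str) -> dict:
    """Derive the documented routes from a base URL such as http://localhost:53801."""
    parts = urlsplit(base_url)
    # WebSocket URLs use ws/wss in place of http/https.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"

    def build(scheme: str, path: str) -> str:
        return urlunsplit((scheme, parts.netloc, path, "", ""))

    return {
        "ws": build(ws_scheme, f"/api/v1/ws/runs/{quote(run_id, safe='')}"),
        "presign": build(parts.scheme, "/api/v1/rest/artifacts/presign"),
        "ready": build(parts.scheme, "/health/ready/"),
    }


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as not ready.
        return False


def wait_until_ready(probe, attempts: int = 10, delay: float = 1.0) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    urls = frontier_urls("http://localhost:53801", "example-run")
    ready = wait_until_ready(lambda: http_ok(urls["ready"]), attempts=5)
    print(urls, "ready" if ready else "not ready")
```

Taking the probe as a callable keeps the retry loop independent of HTTP, so the same loop could wrap a Kafka or S3 check.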

Deployment

  • Docker: Build with Dockerfile.dev or Dockerfile.prod. Set the build context to match how the repo builds frontier images (the repo root where required).
  • Kubernetes/Helm: The chart in deploy/helm/matyan deploys the frontier as a separate Deployment. Configure Kafka and S3 via chart values (e.g. kafka.*, s3.*, frontier.logLevel, frontier.replicaCount). Ingress routes /api/v1/ws and /api/v1/rest/artifacts to the frontier. The frontier is stateless and can be scaled horizontally.
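
The Ingress split described above can be modelled as a small path classifier. This is an illustrative sketch only: the chart's actual Ingress rules and path type are not reproduced here, the function names are hypothetical, and matching on whole path segments is an assumption (it mirrors how Kubernetes Prefix path matching works).

```python
# Path prefixes that the document says are routed to the frontier.
FRONTIER_PREFIXES = ("/api/v1/ws", "/api/v1/rest/artifacts")


def matches_prefix(path: str, prefix: str) -> bool:
    """Match on whole path segments, so /api/v1/wsx does not match /api/v1/ws."""
    return path == prefix or path.startswith(prefix + "/")


def route(path: str) -> str:
    """Return "frontier" for frontier-bound paths and "other" for everything else."""
    if any(matches_prefix(path, prefix) for prefix in FRONTIER_PREFIXES):
        return "frontier"
    return "other"


if __name__ == "__main__":
    for p in ("/api/v1/ws/runs/abc", "/api/v1/rest/artifacts/presign", "/api/v1/rest/runs"):
        print(p, "->", route(p))
```

Because the frontier is stateless, any replica behind the Service can handle a path classified as "frontier".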

The UI talks only to the backend, not to the frontier; the frontier is for training clients.

Related

  • Backend: matyan-backend serves the REST API and consumes from Kafka (ingestion + control workers); it does not receive client traffic from the frontier directly.
  • Client: matyan-client sends tracking data to the frontier (WebSocket + presign REST).
  • Monorepo: This package lives under extra/matyan-frontier in the matyan-core repo.
