Ingestion gateway: WebSocket + presigned S3 URLs, publishes to Kafka

Matyan Frontier

Ingestion gateway between training clients and the rest of the Matyan stack. Clients connect via WebSocket for metrics, params, and logs, and via REST for presigned S3 URLs; the frontier publishes to Kafka. The UI and backend do not talk to the frontier. Part of the Matyan experiment-tracking stack (fork of Aim).

Layout

  • src/matyan_frontier/ — Python package: FastAPI app, WebSocket handler, REST presign endpoint, Kafka producer, config, health, metrics.
  • Entrypoints: app.py (lifespan: start Kafka producer, create S3 clients, ensure bucket; shutdown: flush producer, close S3).
  • Routes: GET /api/v1/ws/runs/{run_id} (WebSocket), POST /api/v1/rest/artifacts/presign, GET /health/ready/, GET /health/live/, GET /metrics/ (Prometheus).
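The startup and shutdown order listed for app.py can be sketched with an async context manager. This is a minimal illustration, not the package's code: FakeProducer, FakeS3, and run_once are hypothetical stand-ins that only record calls, and the timeout default mirrors SHUTDOWN_FLUSH_TIMEOUT.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeProducer:
    """Hypothetical stand-in for a Kafka producer; records calls only."""

    def __init__(self, events):
        self.events = events

    async def start(self):
        self.events.append("producer.start")

    async def flush(self):
        self.events.append("producer.flush")


class FakeS3:
    """Hypothetical stand-in for an S3 client; records calls only."""

    def __init__(self, events):
        self.events = events

    async def ensure_bucket(self, name):
        self.events.append(f"s3.ensure_bucket:{name}")

    async def close(self):
        self.events.append("s3.close")


@asynccontextmanager
async def lifespan(events, bucket="matyan-artifacts", flush_timeout=5.0):
    producer = FakeProducer(events)
    s3 = FakeS3(events)
    # Startup: start the producer, then make sure the bucket exists.
    await producer.start()
    await s3.ensure_bucket(bucket)
    try:
        yield producer, s3
    finally:
        # Shutdown: bound the flush so a stuck broker cannot hang the process.
        try:
            await asyncio.wait_for(producer.flush(), timeout=flush_timeout)
        except asyncio.TimeoutError:
            events.append("producer.flush_timeout")
        await s3.close()


async def run_once():
    events = []
    async with lifespan(events):
        events.append("serving")
    return events
```

Running asyncio.run(run_once()) returns the recorded calls in order, which makes the ordering easy to check in a unit test without Kafka or S3.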

Prerequisites

  • Python 3.12+. Inside the repo the package is run with uv (uv run matyan-frontier); once installed, the matyan-frontier CLI is available directly.
  • Runtime dependencies: Kafka (bootstrap reachable) and an S3-compatible store (e.g. MinIO/RustFS in dev, AWS S3 in prod). The smoke test and local dev assume Kafka and S3 are up (e.g. via docker-compose).

Run (production-like)

From the frontier package directory: uv run matyan-frontier start (or matyan-frontier start if installed).

Options: --host and --port (defaults: 0.0.0.0 and 53801). The CLI applies these option defaults itself; the config also defines host and port for other entry points.
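To make the option defaults concrete, here is a hypothetical stand-in for the CLI built with argparse. The real matyan-frontier CLI may use a different framework; only the start command, the --host and --port options, and their defaults come from this README.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors the documented command and defaults.
    parser = argparse.ArgumentParser(prog="matyan-frontier")
    sub = parser.add_subparsers(dest="command", required=True)
    start = sub.add_parser("start", help="Run the ingestion gateway")
    start.add_argument("--host", default="0.0.0.0")
    start.add_argument("--port", type=int, default=53801)
    return parser


# Equivalent of: matyan-frontier start --port 8080
args = build_parser().parse_args(["start", "--port", "8080"])
print(args.command, args.host, args.port)
```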

Configuration (environment variables)

| Variable | Default | Purpose |
| --- | --- | --- |
| MATYAN_ENVIRONMENT / ENVIRONMENT | development | When production, S3/Kafka must be non-dev (validated at startup). |
| LOG_LEVEL | INFO | Log level (loguru + uvicorn). |
| HOST | 0.0.0.0 | Bind address. |
| PORT | 53801 | Bind port. |
| KAFKA_BOOTSTRAP_SERVERS | localhost:9092 | Kafka broker list. |
| KAFKA_DATA_INGESTION_TOPIC | data-ingestion | Topic for ingestion messages. |
| KAFKA_SECURITY_PROTOCOL / KAFKA_SASL_* | (empty) | Optional Kafka SASL. |
| S3_ENDPOINT | http://localhost:9000 | S3 API endpoint (e.g. MinIO). |
| S3_PUBLIC_ENDPOINT | "" | Optional; used for presigned URLs if different from S3_ENDPOINT. |
| S3_ACCESS_KEY / S3_SECRET_KEY | (dev defaults) | S3 credentials. |
| S3_BUCKET | matyan-artifacts | Bucket for artifacts. |
| S3_PRESIGN_EXPIRY | 3600 | Presigned URL expiry (seconds). |
| SHUTDOWN_FLUSH_TIMEOUT | 5.0 | Seconds to wait for Kafka flush on shutdown. |
| METRICS_ENABLED | true | Expose Prometheus /metrics/. |
| CORS_ORIGINS | (localhost list) | Allowed origins (comma-separated or repeated). |

Source of truth: config.py.
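As a rough illustration of how these variables could map onto settings, the sketch below reads a subset of them with the documented defaults. It is not the contents of config.py: FrontierSettings, load_settings, and production_problems are hypothetical names, and the precedence of MATYAN_ENVIRONMENT over ENVIRONMENT, the boolean parsing, the empty CORS default, and the "localhost means dev" rule are all assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontierSettings:
    environment: str = "development"
    host: str = "0.0.0.0"
    port: int = 53801
    kafka_bootstrap_servers: str = "localhost:9092"
    s3_endpoint: str = "http://localhost:9000"
    s3_public_endpoint: str = ""
    s3_presign_expiry: int = 3600
    metrics_enabled: bool = True
    cors_origins: tuple = ()  # assumption: the real default is a localhost list

    @property
    def presign_endpoint(self):
        # Presigned URLs use the public endpoint only when one is set.
        return self.s3_public_endpoint or self.s3_endpoint


def load_settings(env=None):
    env = os.environ if env is None else env
    d = FrontierSettings()
    raw_origins = env.get("CORS_ORIGINS", "")
    return FrontierSettings(
        # Assumption: MATYAN_ENVIRONMENT wins when both names are set.
        environment=env.get("MATYAN_ENVIRONMENT") or env.get("ENVIRONMENT") or d.environment,
        host=env.get("HOST", d.host),
        port=int(env.get("PORT", d.port)),
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", d.kafka_bootstrap_servers),
        s3_endpoint=env.get("S3_ENDPOINT", d.s3_endpoint),
        s3_public_endpoint=env.get("S3_PUBLIC_ENDPOINT", d.s3_public_endpoint),
        s3_presign_expiry=int(env.get("S3_PRESIGN_EXPIRY", d.s3_presign_expiry)),
        # Assumption: "1", "true", and "yes" (any case) count as enabled.
        metrics_enabled=env.get("METRICS_ENABLED", "true").strip().lower() in {"1", "true", "yes"},
        cors_origins=tuple(o.strip() for o in raw_origins.split(",") if o.strip()),
    )


def production_problems(settings):
    """Hypothetical check: in production, flag endpoints still on localhost."""
    if settings.environment != "production":
        return []
    problems = []
    if "localhost" in settings.kafka_bootstrap_servers:
        problems.append("KAFKA_BOOTSTRAP_SERVERS points at localhost")
    if "localhost" in settings.s3_endpoint:
        problems.append("S3_ENDPOINT points at localhost")
    return problems
```

Passing a plain dict instead of os.environ, for example load_settings({"PORT": "9000"}), keeps the loader easy to test.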

Development and smoke test

  • Development: Run Kafka and S3 (e.g. docker compose up kafka kafka-init rustfs or equivalent), then uv run matyan-frontier start. Clients (e.g. matyan-client) point at the frontier URL for WebSocket and presign.
  • Smoke test: From extra/matyan-frontier, run uv run python scripts/smoke_test.py. Prerequisites: Kafka, S3, and frontier running. The script covers WebSocket (create_run, log_metric, log_hparams, finish_run, etc.) and REST presign, and optionally verifies Kafka consumption.

See scripts/smoke_test.py for the exact command and what it checks.
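Because the smoke test expects the frontier to be running, it can help to wait for the readiness route first. The helper below is a sketch that is not part of the package: it polls GET /health/ready/ and assumes that a 200 response means ready. The names http_status and wait_until_ready are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status of a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(base_url, attempts=30, delay=1.0,
                     fetch=http_status, sleep=time.sleep):
    """Poll the readiness route until it answers 200 or attempts run out."""
    url = base_url.rstrip("/") + "/health/ready/"
    for attempt in range(attempts):
        if fetch(url) == 200:  # assumption: 200 signals readiness
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For a local run this would be called as wait_until_ready("http://localhost:53801") before starting the smoke test. The fetch and sleep parameters exist so the loop can be tested without a live server.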

Deployment

  • Docker: Use Dockerfile.dev or Dockerfile.prod (context from repo root as needed; align with how the repo builds frontier images).
  • Kubernetes/Helm: The chart in deploy/helm/matyan deploys the frontier as a separate Deployment. Configure Kafka and S3 via chart values (e.g. kafka.*, s3.*, frontier.logLevel, frontier.replicaCount). Ingress routes /api/v1/ws and /api/v1/rest/artifacts to the frontier. The frontier is stateless and can be scaled horizontally.

The UI talks only to the backend, not to the frontier; the frontier is for training clients.
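The Ingress split described above amounts to a prefix match on the request path. The sketch below models that decision in Python; it assumes matching on whole path segments (as the Kubernetes Prefix path type does), while the chart's actual path type is not stated here. The name target_service and the "other" label are hypothetical.

```python
# Path prefixes that the Ingress sends to the frontier, per this README.
FRONTIER_PREFIXES = ("/api/v1/ws", "/api/v1/rest/artifacts")


def target_service(path):
    """Return "frontier" for frontier-routed paths, "other" for the rest."""
    for prefix in FRONTIER_PREFIXES:
        # Match the prefix itself or anything below it, on segment boundaries.
        if path == prefix or path.startswith(prefix + "/"):
            return "frontier"
    return "other"


for p in ("/api/v1/ws/runs/abc", "/api/v1/rest/artifacts/presign", "/api/v1/runs"):
    print(p, "->", target_service(p))
```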

Related

  • Backend: matyan-backend serves the REST API and consumes from Kafka (ingestion + control workers); it does not receive client traffic from the frontier directly.
  • Client: matyan-client sends tracking data to the frontier (WebSocket + presign REST).
  • Monorepo: This package lives under extra/matyan-frontier in the matyan-core repo.
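For the client side, the two frontier routes can be derived from a single base URL. This sketch is not matyan-client's code: frontier_urls is a hypothetical helper, and switching to wss when the base URL uses https is an assumption about how a deployment would be reached.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def frontier_urls(base_url, run_id):
    """Build the WebSocket and presign URLs for one run."""
    parts = urlsplit(base_url)
    # Assumption: an https base URL implies a TLS WebSocket (wss).
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    root = parts.path.rstrip("/")
    ws_path = f"{root}/api/v1/ws/runs/{quote(run_id, safe='')}"
    presign_path = f"{root}/api/v1/rest/artifacts/presign"
    return {
        "websocket": urlunsplit((ws_scheme, parts.netloc, ws_path, "", "")),
        "presign": urlunsplit((parts.scheme, parts.netloc, presign_path, "", "")),
    }


urls = frontier_urls("http://localhost:53801", "run-1")
print(urls["websocket"])
print(urls["presign"])
```

The run ID is percent-encoded so that an ID containing a slash cannot change the route.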
