aiSSEMBLE™ Model Training API

This module contains the implementation and baseline Docker image for the aiSSEMBLE model training service. The service exposes a REST API for creating model training jobs, listing jobs and their statuses, retrieving job logs, and deleting (killing) jobs.

Model Training API

POST /training-jobs?pipeline=PIPELINE_NAME

  • Request body contains all key/value pairs required for model training, such as model hyperparameters
  • Functionality:
    • Spawns the appropriate model training Kubernetes job
      • Checks that a model training image exists with the naming convention "model-training-PIPELINE_NAME"
        • Returns an error if the image is not present
      • Job naming convention: "model-training-PIPELINE_NAME-RANDOM_UUID"
    • Passes the user-provided parameters to the job
  • Returns the model training job name
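
The naming conventions and the request shape described above can be sketched in Python. This is a minimal illustration, not the service's own code: the base URL, the helper names (`training_image_name`, `training_job_name`, `build_create_job_request`), the example hyperparameters, and the choice of a JSON body are all assumptions, since the description only says the body carries key/value pairs.

```python
import json
import urllib.parse
import urllib.request
import uuid

JOB_PREFIX = "model-training"  # reserved prefix from the naming conventions above


def training_image_name(pipeline: str) -> str:
    """Image name the service checks for: model-training-PIPELINE_NAME."""
    return f"{JOB_PREFIX}-{pipeline}"


def training_job_name(pipeline: str) -> str:
    """Job name convention: model-training-PIPELINE_NAME-RANDOM_UUID."""
    return f"{training_image_name(pipeline)}-{uuid.uuid4()}"


def build_create_job_request(
    base_url: str, pipeline: str, params: dict
) -> urllib.request.Request:
    """Build (but do not send) POST /training-jobs?pipeline=PIPELINE_NAME."""
    query = urllib.parse.urlencode({"pipeline": pipeline})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/training-jobs?{query}",
        data=json.dumps(params).encode("utf-8"),  # assumption: JSON-encoded body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_job_request(
    "http://localhost:8080",  # hypothetical service address
    "my-pipeline",
    {"learning_rate": 0.01, "epochs": 10},  # hypothetical hyperparameters
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it to a running service. The exact format of the response carrying the job name is not specified here, so parsing it is left out.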

GET /training-jobs/TRAINING_JOB_NAME

  • Returns logs from the pod running the model training job, or an error if the job doesn't exist

GET /training-jobs

  • Returns a list of all model training jobs (active, failed, and completed) with their statuses
  • Filters all jobs in the cluster by the reserved job name prefix "model-training"

GET /training-jobs?pipeline=PIPELINE_NAME

  • Returns a list of all model training jobs (active, failed, and completed) with their statuses for the given pipeline
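
The filtering behind the two listing routes can be illustrated with a small pure-Python sketch. It is not the service's implementation: the function names are hypothetical, and it assumes RANDOM_UUID is a standard 36-character hyphenated UUID. Under that assumption, the pipeline name can be recovered exactly, so a pipeline named "foo" does not also match jobs that belong to "foo-bar".

```python
import uuid
from typing import Iterable, List, Optional

JOB_PREFIX = "model-training"  # reserved job name prefix
UUID_LENGTH = 36  # assumption: RANDOM_UUID is a hyphenated 36-character UUID


def pipeline_of(job_name: str) -> Optional[str]:
    """Return the pipeline encoded in a job name, or None if it does not match."""
    head = f"{JOB_PREFIX}-"
    if not job_name.startswith(head):
        return None
    remainder = job_name[len(head):]
    # Expect "<pipeline>-<uuid>": at least one character, a hyphen, then the UUID.
    if len(remainder) <= UUID_LENGTH + 1 or remainder[-UUID_LENGTH - 1] != "-":
        return None
    try:
        uuid.UUID(remainder[-UUID_LENGTH:])
    except ValueError:
        return None
    return remainder[: -UUID_LENGTH - 1]


def filter_training_jobs(
    job_names: Iterable[str], pipeline: Optional[str] = None
) -> List[str]:
    """Keep training jobs only, optionally restricted to a single pipeline."""
    selected = []
    for name in job_names:
        found = pipeline_of(name)
        if found is not None and (pipeline is None or found == pipeline):
            selected.append(name)
    return selected


jobs = [
    "model-training-foo-123e4567-e89b-12d3-a456-426614174000",
    "model-training-foo-bar-123e4567-e89b-12d3-a456-426614174001",
    "unrelated-job",
]
print(filter_training_jobs(jobs))
print(filter_training_jobs(jobs, pipeline="foo"))
```

A plain `startswith("model-training")` check would match the prefix rule as stated, but it could not separate pipelines whose names share a leading segment.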

DELETE /training-jobs/TRAINING_JOB_NAME

  • Deletes the specified Kubernetes job
  • Returns an error if the job does not exist
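
The two job-scoped routes (log retrieval via GET and deletion via DELETE) share the same URL, so a client can build both from one helper. The sketch below uses only the Python standard library. The base URL and function names are hypothetical, and because the status code returned for a missing job is not documented here, `send` simply returns whatever status and body the server provides.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Tuple


def job_url(base_url: str, job_name: str) -> str:
    """URL of a single training job: /training-jobs/TRAINING_JOB_NAME."""
    quoted = urllib.parse.quote(job_name, safe="")
    return f"{base_url.rstrip('/')}/training-jobs/{quoted}"


def build_logs_request(base_url: str, job_name: str) -> urllib.request.Request:
    """GET request that asks for the logs of the job's pod."""
    return urllib.request.Request(job_url(base_url, job_name), method="GET")


def build_delete_request(base_url: str, job_name: str) -> urllib.request.Request:
    """DELETE request that removes the Kubernetes job."""
    return urllib.request.Request(job_url(base_url, job_name), method="DELETE")


def send(request: urllib.request.Request) -> Tuple[int, str]:
    """Send a request and return (status, body), including for error responses."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses (for example, a job that does not exist) land here.
        return err.code, err.read().decode("utf-8")


base = "http://localhost:8080"  # hypothetical service address
job = "model-training-foo-123e4567-e89b-12d3-a456-426614174000"
logs_request = build_logs_request(base, job)
delete_request = build_delete_request(base, job)
print(logs_request.get_method(), logs_request.full_url)
print(delete_request.get_method(), delete_request.full_url)
```

Calling `send(logs_request)` or `send(delete_request)` against a running deployment performs the actual call; the builders alone make no network traffic.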

Remaining Items

  • Ensure appropriate Kubernetes RBAC config in Helm charts
  • Deploy model training API in downstream projects with ML training step(s)
  • In downstream projects, ensure model training image is generated into "model-training-PIPELINE_NAME"
  • In downstream projects, ensure embeddings deployment name is "PIPELINE_NAME-STEP_NAME"
  • Configure permissions/implement PDP authorization for each API route
