Library for model training in multi-cloud environments.

cascade

Cascade is a library for submitting and managing jobs across multiple cloud environments. It integrates into existing Prefect workflows and can also be used as a standalone library.

Getting Started

Installation

poetry add block-cascade

or

pip install block-cascade

Example Usage in a Prefect flow

from cascade import remote
from cascade import GcpEnvironmentConfig, GcpMachineConfig, GcpResource

# Machine configuration for the chief node.
machine_config = GcpMachineConfig("n2-standard-4", 1)

# GCP project, region, identity, container image, and network for the job.
environment_config = GcpEnvironmentConfig(
    project="ds-cash-production",
    region="us-west1",
    service_account="ds-cash-production@ds-cash-production.iam.gserviceaccount.com",
    image="us.gcr.io/ds-cash-production/cascade/cascade-test",
    network="projects/603986066384/global/networks/neteng-shared-vpc-prod"
)

# Combine the machine and environment settings into a single resource.
gcp_resource = GcpResource(
    chief=machine_config,
    environment=environment_config,
)

# The decorated function runs on the configured resource when called.
@remote(resource=gcp_resource)
def addition(a: int, b: int) -> int:
    return a + b

result = addition(1, 2)
assert result == 3
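
To make the calling pattern above easier to follow, here is a minimal local sketch of a decorator that takes a resource argument and wraps a function. It is not Cascade's implementation: `fake_remote` is a hypothetical name, nothing is submitted to a cloud, and only the arguments and result are round-tripped through `pickle` to hint at the serialization step a real submission would involve.

```python
import functools
import pickle
from typing import Any, Callable


def fake_remote(resource: Any) -> Callable:
    """Hypothetical stand-in for a remote-execution decorator; runs locally."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Serialize the call's inputs, as a submission step might before staging them.
            payload = pickle.dumps((args, kwargs))
            # Stand-in for the remote side: restore the inputs and run the function.
            staged_args, staged_kwargs = pickle.loads(payload)
            result = func(*staged_args, **staged_kwargs)
            # Round-trip the result through pickle as well.
            return pickle.loads(pickle.dumps(result))

        # Keep the resource attached to the wrapper for inspection.
        wrapper.resource = resource
        return wrapper

    return decorator


# A plain dict stands in for a resource object in this sketch.
@fake_remote(resource={"chief": "n2-standard-4"})
def addition(a: int, b: int) -> int:
    return a + b


result = addition(1, 2)
```

The caller still writes an ordinary function call, and the decorator decides where and how the body executes.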

Configuration

Cascade supports defining different resource requirements via a configuration file named either cascade.yaml or cascade.yml. The file must be located in the working directory from which the code is executed so that it can be discovered at runtime.

calculate:
  type: GcpResource
  chief:
    type: n1-standard-1
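
The discovery rule described above (one of two file names, looked up in the working directory) can be sketched as follows. This is an illustration only: `find_config` and `CONFIG_NAMES` are hypothetical names, and checking `cascade.yaml` before `cascade.yml` is an assumption, since the order Cascade itself uses is not stated here.

```python
import tempfile
from pathlib import Path
from typing import Optional

# File names taken from the text above; the lookup order is an assumption.
CONFIG_NAMES = ("cascade.yaml", "cascade.yml")


def find_config(directory: Optional[Path] = None) -> Optional[Path]:
    """Return the first matching config file in `directory` (default: cwd)."""
    base = Path(directory) if directory is not None else Path.cwd()
    for name in CONFIG_NAMES:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    missing = find_config(workdir)  # nothing there yet
    (workdir / "cascade.yml").write_text("calculate:\n  type: GcpResource\n")
    found = find_config(workdir)
    found_name = found.name if found else None
```

A check like this can help confirm that a script is being launched from the directory that actually contains the configuration file.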

You can also define a default configuration that specific tasks override, which eliminates redundant definitions.

default:
    GcpResource:
        environment:
            project: ds-cash-dev
            service_account: ds-cash-production@ds-cash-production.iam.gserviceaccount.com
            region: us-central1
        chief:
            type: n1-standard-4
calculate:
    type: GcpResource
    environment:
        project: ds-cash-production
    chief:
        count: 2
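
One way to picture how a task-specific block overrides the defaults is a recursive dictionary merge. The sketch below assumes that defaults are looked up by the task's `type` and merged key by key, with task values winning. Cascade's actual merge rules may differ. `deep_merge` and `resolve_task` are hypothetical helper names, and the dictionary is a trimmed version of the configuration above.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` take precedence over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_task(config: Dict[str, Any], task_name: str) -> Dict[str, Any]:
    """Assumed rule: defaults for the task's resource type, overlaid by the task block."""
    task = config[task_name]
    defaults = config.get("default", {}).get(task.get("type"), {})
    return deep_merge(defaults, task)


config = {
    "default": {
        "GcpResource": {
            "environment": {
                "project": "ds-cash-dev",
                "service_account": "ds-cash-production@ds-cash-production.iam.gserviceaccount.com",
            },
            "chief": {"type": "n1-standard-4"},
        }
    },
    "calculate": {
        "type": "GcpResource",
        "environment": {"project": "ds-cash-production"},
        "chief": {"count": 2},
    },
}

resolved = resolve_task(config, "calculate")
```

Under these assumptions, `calculate` keeps the default service account and machine type while replacing the project and adding a machine count.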

Authorization

Cascade requires authorization both to submit jobs to either GCP or Databricks and to stage pickled code to a cloud storage bucket. For GCP, as in the example above, an authorization token is obtained via IAM by running the following command:

gcloud auth login --update-adc

No additional configuration is required in your application's code to use this token.
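
If a job fails to authenticate, it can help to confirm that Application Default Credentials are present on the machine. The sketch below is not part of Cascade. It uses only the standard library and checks the `GOOGLE_APPLICATION_CREDENTIALS` environment variable and the well-known gcloud credentials file. `adc_file_path` and `adc_available` are hypothetical helper names, and the check does not cover every setup (for example, a custom gcloud configuration directory).

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ADC_FILE_NAME = "application_default_credentials.json"


def adc_file_path(env: Mapping[str, str], home: Path, windows: bool = False) -> Path:
    """Return the path where Application Default Credentials are expected."""
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        # An explicit credentials file takes precedence over the well-known location.
        return Path(explicit)
    if windows:
        return Path(env.get("APPDATA", "")) / "gcloud" / ADC_FILE_NAME
    return home / ".config" / "gcloud" / ADC_FILE_NAME


def adc_available(env: Optional[Mapping[str, str]] = None,
                  home: Optional[Path] = None) -> bool:
    """True if a credentials file exists at the expected location."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    return adc_file_path(env, home, windows=(os.name == "nt")).is_file()


# Paths computed for an illustrative home directory; no files are read here.
default_path = adc_file_path({}, Path("/home/user"))
explicit_path = adc_file_path(
    {"GOOGLE_APPLICATION_CREDENTIALS": "/tmp/key.json"}, Path("/home/user")
)
```

Calling `adc_available()` before submitting a job gives a quick local signal that the login step has been run.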

However, to authenticate to Databricks and AWS, you will need to provide a token and a secret key, respectively. These can be passed directly to the DatabricksResource object or set as environment variables. The following example shows how to provide these values in the configuration file.
