Library for model training in multi-cloud environments.

cascade

Cascade is a library for submitting and managing jobs across multiple cloud environments. It is designed to integrate seamlessly into existing Prefect workflows or can be used as a standalone library.

Getting Started

Installation

poetry add block-cascade

or

pip install block-cascade

Example Usage

from block_cascade import remote
from block_cascade import GcpEnvironmentConfig, GcpMachineConfig, GcpResource

machine_config = GcpMachineConfig("n2-standard-4", 1)
environment_config = GcpEnvironmentConfig(
    project="example-project",
    region="us-west1",
    service_account="example-project@vertex.iam.gserviceaccount.com",
    image="us.gcr.io/example-project/cascade/cascade-test",
    network="projects/123456789123/global/networks/shared-vpc"
)
gcp_resource = GcpResource(
    chief=machine_config,
    environment=environment_config,
)

@remote(resource=gcp_resource)
def addition(a: int, b: int) -> int:
    return a + b

result = addition(1, 2)
assert result == 3

Configuration

Cascade supports defining different resource requirements via a configuration file titled either cascade.yaml or cascade.yml. This configuration file must be located in the working directory of the code execution to be discovered at runtime.

calculate:
  type: GcpResource
  chief:
    type: n1-standard-1
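
The lookup rule described above (a file named cascade.yaml or cascade.yml in the working directory) can be sketched in a few lines. This is only an illustration of the rule, not Cascade's own implementation: the helper name `find_cascade_config` is hypothetical, and the order in which the two file names are tried is an assumption.

```python
from pathlib import Path
from typing import Optional

# File names taken from the text above; the lookup order is an assumption.
CONFIG_FILENAMES = ("cascade.yaml", "cascade.yml")


def find_cascade_config(directory: Optional[Path] = None) -> Optional[Path]:
    """Return the first Cascade config file found in `directory`, or None.

    Hypothetical helper for illustration; not part of block_cascade.
    """
    base = Path(directory) if directory is not None else Path.cwd()
    for name in CONFIG_FILENAMES:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_cascade_config()
    print(found if found else "No cascade.yaml or cascade.yml in the working directory")
```

Running a check like this from the directory where your code executes is a quick way to confirm that the configuration file is where Cascade expects it.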

You can also define a default configuration, which individual tasks can override, to avoid repeating shared settings.

default:
    GcpResource:
        environment:
            project: example-project
            service_account: example-project@vertex.iam.gserviceaccount.com
            region: us-central1
        chief:
            type: n1-standard-4
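
One way to picture "default plus task-specific override" is a recursive dictionary merge. The sketch below is an assumption about the semantics, not a description of how Cascade resolves its configuration: `merge_config` is a hypothetical helper, and the dictionaries reproduce only a subset of the YAML values above.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(default: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay `override` on `default` without mutating either."""
    merged = deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys: the override wins.
            merged[key] = deepcopy(value)
    return merged


# A subset of the YAML examples above, written as plain dicts.
default_gcp = {
    "environment": {
        "project": "example-project",
        "service_account": "example-project@vertex.iam.gserviceaccount.com",
    },
    "chief": {"type": "n1-standard-4"},
}
calculate_override = {"chief": {"type": "n1-standard-1"}}

calculate = merge_config(default_gcp, calculate_override)
print(calculate["chief"]["type"])           # n1-standard-1
print(calculate["environment"]["project"])  # example-project
```

Under this reading, the calculate task only needs to state what differs from the default (the machine type), while the shared environment settings are inherited.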

Authorization

Cascade requires authorization both to submit jobs to GCP or Databricks and to stage pickled code in a cloud storage bucket. For GCP, as in the example above, an authorization token is obtained via IAM by running the following command:

gcloud auth login --update-adc

No additional configuration is required in your application's code to use this token.
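
The command above writes Application Default Credentials (ADC) to a local file, which Google's client libraries discover on their own. If a job submission fails with an authentication error, a small check like the following can confirm that a credentials file is present. It is a sketch under stated assumptions: `find_adc_file` is a hypothetical helper, it only covers the default gcloud location on Linux and macOS (Windows and a custom `CLOUDSDK_CONFIG` directory are not handled), and it does not verify that the token is still valid.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adc_file(env: Optional[Mapping[str, str]] = None,
                  home: Optional[Path] = None) -> Optional[Path]:
    """Return the path of an Application Default Credentials file, if one exists.

    Hypothetical helper: checks GOOGLE_APPLICATION_CREDENTIALS first, then
    gcloud's default location on Linux and macOS.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # An explicitly configured credentials file takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # Default location written by gcloud on Linux and macOS.
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    path = find_adc_file()
    print(f"ADC file: {path}" if path else "No ADC file found; run the gcloud command above")
```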

However, for authenticating to Databricks and AWS you will need to provide a token and secret key respectively. These can be passed directly to the DatabricksResource object or set as environment variables. The following example shows how to provide these values in the configuration file.

For Developers

Using hermit for managing Python

When developing cascade, you can optionally use hermit to manage the Python executable used by cascade. Together with poetry for dependency management, this keeps your development environment consistent with that of other contributors. Follow hermit's installation instructions, then create a virtualenv with Python@3.9 by running:

. ./bin/activate-hermit

Then install the dependencies with poetry:

poetry install


