Skyplane efficiently transports data between cloud regions and providers.

Skyplane

🔥 Blazing fast bulk data transfers between any cloud 🔥

Skyplane is a tool for blazingly fast bulk data transfers in the cloud. Skyplane manages parallelism, data partitioning, and network paths to optimize data transfers, and can also spin up VM instances to increase transfer throughput.

You can use Skyplane to transfer data:

  • Between buckets within a cloud provider
  • Between object stores across multiple cloud providers
  • Between local storage and a cloud object store

Getting started

Installation

We recommend installing from PyPI:

$ pip install skyplane-nightly

To install Skyplane from source:

$ git clone https://github.com/skyplane-project/skyplane
$ cd skyplane
$ pip install -e .

Authenticating with cloud providers

To transfer files from cloud A to cloud B, Skyplane starts VMs (called gateways) in both A and B, so the CLI needs credentials for each cloud provider involved. Skyplane infers these credentials from each provider's own CLI, so log in to each cloud you plan to use before continuing.

⤵️  Setting up AWS credentials

To set up AWS credentials on your local machine, first install the AWS CLI.

After installing the AWS CLI, configure your AWS IAM access key ID and secret access key with aws configure:

$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json

See AWS documentation for further instructions on how to configure the AWS CLI.

⤵️  Setting up GCP credentials

To set up GCP credentials on your local machine, first install the gcloud CLI.

After installing the gcloud CLI, configure your GCP CLI credentials with gcloud auth as follows:

$ gcloud auth login
$ gcloud auth application-default login

⚠️ Even if you already have GCP credentials configured, make sure to run gcloud auth application-default login, which generates the application credentials that Skyplane uses.

⤵️  Setting up Azure credentials

To set up Azure credentials on your local machine, first install the Azure CLI.

After installing the Azure CLI, configure your Azure CLI credentials with az login as follows:

$ az login

Skyplane should now be able to authenticate with Azure, although you may need to pass your subscription ID to skyplane init later.
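Since Skyplane relies on the provider CLIs set up above, it can help to confirm they are installed before moving on. The following is a minimal sketch, not part of Skyplane: the helper name find_provider_clis is hypothetical, and it only checks that the aws, gcloud, and az executables are on PATH. It does not verify that you are logged in.

```python
import shutil

# Executable names of the three provider CLIs used in the sections above.
PROVIDER_CLIS = {"aws": "aws", "gcp": "gcloud", "azure": "az"}


def find_provider_clis(which=shutil.which):
    """Return {provider: path or None} for each provider CLI found on PATH.

    Hypothetical helper for illustration; `which` is injectable for testing.
    """
    return {provider: which(exe) for provider, exe in PROVIDER_CLIS.items()}


if __name__ == "__main__":
    for provider, path in find_provider_clis().items():
        status = path if path else "not found on PATH"
        print(f"{provider}: {status}")
```

A provider reported as not found only matters if you plan to transfer data to or from that cloud.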

Importing cloud credentials into Skyplane

After authenticating with each cloud provider, you can run skyplane init to create a configuration file for Skyplane.

$ skyplane init

====================================================
 _____ _   ____   _______ _       ___   _   _  _____
/  ___| | / /\ \ / / ___ \ |     / _ \ | \ | ||  ___|
\ `--.| |/ /  \ V /| |_/ / |    / /_\ \|  \| || |__
 `--. \    \   \ / |  __/| |    |  _  || . ` ||  __|
/\__/ / |\  \  | | | |   | |____| | | || |\  || |___
\____/\_| \_/  \_/ \_|   \_____/\_| |_/\_| \_/\____/
====================================================


(1) Configuring AWS:
    Loaded AWS credentials from the AWS CLI [IAM access key ID: ...XXXXXX]
    AWS region config file saved to /home/ubuntu/.skyplane/aws_config

(2) Configuring Azure:
    Azure credentials found in Azure CLI
    Azure credentials found, do you want to enable Azure support in Skyplane? [Y/n]: Y
    Enter the Azure subscription ID: [XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX]:
    Azure region config file saved to /home/ubuntu/.skyplane/azure_config
    Querying for SKU availbility in regions
    Azure SKU availability cached in /home/ubuntu/.skyplane/azure_sku_mapping

(3) Configuring GCP:
    GCP credentials found in GCP CLI
    GCP credentials found, do you want to enable GCP support in Skyplane? [Y/n]: Y
    Enter the GCP project ID [XXXXXXX]:
    GCP region config file saved to /home/ubuntu/.skyplane/gcp_config

Config file saved to /home/ubuntu/.skyplane/config

Using Skyplane

The easiest way to use Skyplane is through the CLI. skyplane cp accepts any local path or cloud object store location as its source or destination.

# copy files between two AWS S3 buckets
$ skyplane cp s3://... s3://...

# copy files from an AWS S3 bucket to a GCP GCS bucket
$ skyplane cp s3://... gs://...

# copy files from a local directory to a cloud object store
$ skyplane cp /path/to/local/files gs://...

Skyplane also supports incremental copies via skyplane sync:

# copy changed files from S3 to GCS
$ skyplane sync s3://... gs://...

skyplane sync diffs the contents of the source and destination and copies only the files that are new or have changed. It never deletes destination files that are no longer present in the source, so skyplane sync is always safe to run.
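The copy-only behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Skyplane's implementation: the names ObjectInfo and plan_sync are hypothetical, and the rule used to decide that a file "has changed" (missing at the destination, different size, or newer at the source) is an assumption.

```python
from typing import Dict, List, NamedTuple


class ObjectInfo(NamedTuple):
    size: int             # object size in bytes
    last_modified: float  # modification time as a Unix timestamp


def plan_sync(source: Dict[str, ObjectInfo], dest: Dict[str, ObjectInfo]) -> List[str]:
    """Return the keys that would be copied from source to dest.

    Assumed rule: copy when the key is missing at the destination,
    the sizes differ, or the source copy is newer.
    """
    to_copy = []
    for key, src in sorted(source.items()):
        dst = dest.get(key)
        if dst is None or dst.size != src.size or src.last_modified > dst.last_modified:
            to_copy.append(key)
    # Keys that exist only in dest are ignored: nothing is ever deleted.
    return to_copy


source = {
    "a.txt": ObjectInfo(10, 100.0),
    "b.txt": ObjectInfo(20, 200.0),
    "c.txt": ObjectInfo(5, 50.0),
}
dest = {
    "a.txt": ObjectInfo(10, 150.0),   # up to date
    "b.txt": ObjectInfo(25, 250.0),   # size differs
    "old.txt": ObjectInfo(1, 1.0),    # only at the destination, left alone
}
print(plan_sync(source, dest))  # ['b.txt', 'c.txt']
```

Because the plan only ever adds or overwrites objects, running it repeatedly cannot remove data from the destination.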

Accelerating transfers

Use multiple VMs

With default arguments, Skyplane sets up one VM (called a gateway) in each of the source and destination regions. We can further accelerate the transfer by using more VMs.

To use two VMs in each region, which can roughly double the transfer speed, run:

$ skyplane cp s3://... s3://... -n 2

⚠️ If you do not have enough vCPU capacity in each region, you may get an InsufficientVCPUException. Either request more vCPUs or reduce the number of parallel VMs.
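If you drive transfers from a script, the VM count can be made a parameter. The sketch below only builds the command line shown above and does not run it. The helper name build_cp_command and the bucket names are hypothetical; the only flag used is -n, as documented in this section.

```python
import shlex
from typing import List


def build_cp_command(src: str, dst: str, num_vms: int = 1) -> List[str]:
    """Build the argv for `skyplane cp SRC DST -n NUM_VMS` without running it."""
    if num_vms < 1:
        raise ValueError("num_vms must be at least 1")
    return ["skyplane", "cp", src, dst, "-n", str(num_vms)]


# Hypothetical bucket names, for illustration only.
cmd = build_cp_command("s3://src-bucket/data/", "s3://dst-bucket/data/", num_vms=2)
print(shlex.join(cmd))  # skyplane cp s3://src-bucket/data/ s3://dst-bucket/data/ -n 2
```

The resulting list can be passed to subprocess.run when you are ready to start the transfer, which makes it easy to retry with a smaller num_vms if a region lacks vCPU capacity.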

Stripe large objects across multiple VMs

Skyplane can split a single large object across multiple VMs to accelerate its transfer. Internally, Skyplane stripes the object into many small chunks that can be transferred in parallel.

To stripe large objects into chunks of at most 16 MB, run:

$ skyplane cp s3://... s3://... --max_chunk_size_mb 16

⚠️ Striping large objects is currently supported only for transfers between AWS S3 buckets.
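To make the striping idea concrete, here is a minimal sketch of how an object could be cut into fixed-size chunks and dealt out to VMs. It is not Skyplane's implementation: the function names chunk_ranges and assign_round_robin are hypothetical, the round-robin assignment is an assumption, and 1 MB is assumed to mean 1024 * 1024 bytes.

```python
from typing import List, Tuple

Chunk = Tuple[int, int]  # (byte offset, length in bytes)


def chunk_ranges(object_size: int, max_chunk_size_mb: int = 16) -> List[Chunk]:
    """Split an object of object_size bytes into chunks of at most max_chunk_size_mb."""
    if object_size < 0 or max_chunk_size_mb < 1:
        raise ValueError("object_size must be >= 0 and max_chunk_size_mb >= 1")
    chunk_bytes = max_chunk_size_mb * 1024 * 1024  # assumption: 1 MB = 2**20 bytes
    return [
        (offset, min(chunk_bytes, object_size - offset))
        for offset in range(0, object_size, chunk_bytes)
    ]


def assign_round_robin(chunks: List[Chunk], num_vms: int) -> List[List[Chunk]]:
    """Deal chunks out to VMs in turn (assumed policy, for illustration)."""
    return [chunks[i::num_vms] for i in range(num_vms)]


MIB = 1024 * 1024
chunks = chunk_ranges(40 * MIB, max_chunk_size_mb=16)
print(len(chunks))  # 3
for vm_index, vm_chunks in enumerate(assign_round_robin(chunks, num_vms=2)):
    print(vm_index, vm_chunks)
```

In this example a 40 MB object yields two full 16 MB chunks and one 8 MB remainder, so with two VMs the first handles two chunks and the second handles one.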

Download files

Source Distribution

skyplane-nightly-0.0.1.dev20220604.tar.gz (91.9 kB)

Uploaded Source

Built Distribution

skyplane_nightly-0.0.1.dev20220604-py3-none-any.whl (112.4 kB)

Uploaded Python 3
