
Cloud2SQL 🤩

Read infrastructure data from your cloud ☁️ and export it to a SQL database 📋.


Installation

Install via homebrew

This is the easiest way to install Cloud2SQL. Note that the installation takes a couple of minutes.

brew install someengineering/tap/cloud2sql

Install via Python pip

Alternatively, you can install Cloud2SQL as a Python package; Python 3.9 or higher is required.

If you only need support for a specific database, you can install cloud2sql[snowflake], cloud2sql[parquet], cloud2sql[postgresql], or cloud2sql[mysql] instead of cloud2sql[all].

pip3 install --user "cloud2sql[all]"

This installs the executable into the user install directory of your platform. Make sure that directory is listed in your PATH.
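A quick way to check whether the per-user scripts directory is already on your PATH (a stdlib-only sketch; the scheme name follows sysconfig's platform naming):

```python
import os
import sysconfig

# Per-user scripts directory, e.g. ~/.local/bin on Linux/macOS.
scripts_dir = sysconfig.get_path("scripts", f"{os.name}_user")

if scripts_dir in os.environ.get("PATH", "").split(os.pathsep):
    print(f"{scripts_dir} is on PATH.")
else:
    print(f"Add {scripts_dir} to your PATH to run cloud2sql directly.")
```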

Usage

The sources and destinations for cloud2sql are configured via a configuration file. Create your own configuration by adjusting the config template file.

You can safely delete the sections that are not relevant to you (e.g. if you do not use AWS, you can delete the aws section). Each section corresponds to a cloud provider and is enabled whenever a configuration section for it is present.

In the next section you will create a YAML configuration file. Once you have created your configuration file, you can run cloud2sql with the following command:

cloud2sql --config myconfig.yaml

Configuration

Cloud2SQL uses a YAML configuration file to define the sources and destinations.

Sources

AWS

sources:
  aws:
    # AWS Access Key ID (null to load from env - recommended)
    access_key_id: null
    # AWS Secret Access Key (null to load from env - recommended)
    secret_access_key: null
    # IAM role name to assume
    role: null
    # List of AWS profiles to collect
    profiles: null
    # List of AWS Regions to collect (null for all)
    region: null
    # Scrape the entire AWS organization
    scrape_org: false
    # Assume given role in current account
    assume_current: false
    # Do not scrape current account
    do_not_scrape_current: false

Google Cloud

sources:
  gcp:
    # GCP service account file(s)
    service_account: []
    # GCP project(s)
    project: []

Kubernetes

sources:
  k8s:
    # Configure access via kubeconfig files.
    # Structure:
    #   - path: "/path/to/kubeconfig"
    #     all_contexts: false
    #     contexts: ["context1", "context2"]
    config_files: []
    # Alternative: configure access to k8s clusters directly in the config.
    # Structure:
    #   - name: 'k8s-cluster-name'
    #     certificate_authority_data: 'CERT'
    #     server: 'https://k8s-cluster-server.example.com'
    #     token: 'TOKEN'
    configs: []

DigitalOcean

sources:
  digitalocean:
    # DigitalOcean API tokens for the teams to be collected
    api_tokens: []
    # DigitalOcean Spaces access keys for the teams to be collected, separated by colons
    spaces_access_keys: []

Destinations

SQLite

destinations:
  sqlite:
    database: /path/to/database.db

PostgreSQL

destinations:
  postgresql:
    host: 127.0.0.1
    port: 5432
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value

MySQL

destinations:
  mysql:
    host: 127.0.0.1
    port: 3306
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value

MariaDB

destinations:
  mariadb:
    host: 127.0.0.1
    port: 3306
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value

Snowflake

destinations:
  snowflake:
    host: myorg-myaccount
    user: cloud2sql
    password: changeme
    database: cloud2sql/public
    args:
      warehouse: compute_wh
      role: accountadmin

Apache Parquet

destinations:
  file:
    path: /where/to/write/parquet/files/
    format: parquet
    batch_size: 100_000

CSV

destinations:
  file:
    path: /where/to/write/to/csv/files/
    format: csv
    batch_size: 100_000

Upload to S3

destinations:
  s3:
    uri: s3://bucket_name/
    region: eu-central-1
    format: csv
    batch_size: 100_000

Upload to Google Cloud Storage

destinations:
  gcs:
    uri: gs://bucket_name/
    format: parquet
    batch_size: 100_000

My database is not listed here

Cloud2SQL uses SQLAlchemy to connect to the database. If your database is not listed here, check whether it is supported as a SQLAlchemy dialect. Install the relevant driver and use the connection string format from its documentation.
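SQLAlchemy connection strings follow the pattern dialect+driver://user:password@host:port/database. As a sketch, here is the URL that matches the PostgreSQL settings shown above (psycopg2 is a common PostgreSQL driver; substitute the dialect and driver for your database):

```python
# Build a SQLAlchemy connection URL from the same settings used in the YAML config.
user, password = "cloud2sql", "changeme"
host, port, database = "127.0.0.1", 5432, "cloud2sql"

url = f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"
print(url)  # postgresql+psycopg2://cloud2sql:changeme@127.0.0.1:5432/cloud2sql
```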

Example

We use a minimal configuration and export the data to a SQLite database. The example relies on the default AWS credentials and the default Kubernetes config.
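A sketch of what such a minimal config-example.yaml might contain (assumed structure, based on the sections above; an empty section enables a provider with its defaults):

```yaml
sources:
  aws: {}                      # default AWS credentials from the environment
  k8s:
    config_files:
      - path: ~/.kube/config   # default kubeconfig location
destinations:
  sqlite:
    database: cloud2sql.db
```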

cloud2sql --config config-example.yaml

For a more in-depth example, check out our blog post.

Local Development

Create a local development environment with the following command:

make setup
source venv/bin/activate
