Python library for loading GIS raster data to standard cloud-based data warehouses that don't natively support raster data.

raster-loader

Raster Loader is currently tested on Python 3.9, 3.10, 3.11 and 3.12.

Documentation

The Raster Loader documentation is available at raster-loader.readthedocs.io.

Install

pip install -U raster-loader

To install the optional dependencies for a specific data warehouse, add the matching extra:

pip install -U "raster-loader[bigquery]"
pip install -U "raster-loader[snowflake]"

Installing from source

git clone https://github.com/cartodb/raster-loader
cd raster-loader
pip install .

Usage

There are two ways you can use Raster Loader:

  • Using the CLI by running carto in your terminal
  • Using Raster Loader as a Python library (import raster_loader)

CLI

After installing Raster Loader, you can run the CLI by typing carto in your terminal.

Currently, Raster Loader supports uploading raster data to BigQuery. Accessing BigQuery with Raster Loader requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set to the path of a JSON file containing your BigQuery credentials. See the GCP documentation for more information.
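Before running any command, it can help to confirm that the credentials variable is set and points at a real file. The following is a minimal sketch of such a check; check_bigquery_credentials is a hypothetical helper written for illustration and is not part of Raster Loader.

```python
import os
from pathlib import Path


def check_bigquery_credentials(env=None):
    """Return the credentials path if GOOGLE_APPLICATION_CREDENTIALS is usable.

    Hypothetical helper for illustration; not part of Raster Loader.
    """
    # Allow a custom mapping to be passed in; default to the process environment.
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value)
    # The variable must point at an existing file (the JSON credentials).
    if not path.is_file():
        raise RuntimeError(f"Credentials file not found: {path}")
    return path
```

Calling check_bigquery_credentials() with no arguments inspects the current environment and raises a RuntimeError with a short message if the variable is missing or the file does not exist.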

Two commands are available:

Uploading to BigQuery

carto bigquery upload loads raster data from a local file to a BigQuery table. At a minimum, the carto bigquery upload command requires a file_path to a local raster file that can be read by GDAL and processed with rasterio. It also requires the project (the GCP project name) and dataset (the BigQuery dataset name) parameters. There are also additional parameters, such as table (BigQuery table name) and overwrite (to overwrite existing data).

For example:

carto bigquery upload \
    --file_path /path/to/my/raster/file.tif \
    --project my-gcp-project \
    --dataset my-bigquery-dataset \
    --table my-bigquery-table \
    --overwrite

This command uploads the TIFF file from /path/to/my/raster/file.tif to a BigQuery project named my-gcp-project, a dataset named my-bigquery-dataset, and a table named my-bigquery-table. If the table already contains data, this data will be overwritten because the --overwrite flag is set.
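If you want to drive the same command from a script, one option is to assemble the argument list in Python and hand it to subprocess. The sketch below only uses the flags shown in the example above; build_upload_command and run_upload are hypothetical helpers, not part of Raster Loader.

```python
import shlex
import subprocess


def build_upload_command(file_path, project, dataset, table=None, overwrite=False):
    """Build the argument list for `carto bigquery upload`.

    Hypothetical helper; it only assembles the flags shown in the README example.
    """
    cmd = [
        "carto", "bigquery", "upload",
        "--file_path", file_path,
        "--project", project,
        "--dataset", dataset,
    ]
    # Optional parameters are appended only when requested.
    if table is not None:
        cmd += ["--table", table]
    if overwrite:
        cmd.append("--overwrite")
    return cmd


def run_upload(**kwargs):
    """Run the upload; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_upload_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_upload_command(
        "/path/to/my/raster/file.tif",
        "my-gcp-project",
        "my-bigquery-dataset",
        table="my-bigquery-table",
        overwrite=True,
    )))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.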

Inspecting a raster file on BigQuery

Use the carto bigquery describe command to retrieve information about a raster file stored in a BigQuery table.

At a minimum, this command requires a GCP project name, a BigQuery dataset name, and a BigQuery table name.

For example:

carto bigquery describe \
    --project my-gcp-project \
    --dataset my-bigquery-dataset \
    --table my-bigquery-table

Using Raster Loader as a Python library

After installing Raster Loader, you can import the package into your Python project. For example:

from raster_loader import rasterio_to_bigquery, bigquery_to_records

Currently, Raster Loader supports uploading raster data to BigQuery. Accessing BigQuery with Raster Loader requires the GOOGLE_APPLICATION_CREDENTIALS environment variable to be set to the path of a JSON file containing your BigQuery credentials. See the GCP documentation for more information.

You can use Raster Loader to upload a local raster file to an existing BigQuery table using the rasterio_to_bigquery() function:

rasterio_to_bigquery(
    file_path="path/to/raster.tif",
    project_id="my-project",
    dataset_id="my_dataset",
    table_id="my_table",
)

This function returns True if the upload was successful.
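Because the function signals success through its return value, a script may want to turn anything other than a successful result into an exception. The sketch below shows one way to do that; upload_or_fail is a hypothetical wrapper, and the upload callable is passed in so the wrapper itself does not depend on Raster Loader.

```python
def upload_or_fail(upload, **kwargs):
    """Call `upload(**kwargs)` and raise if it does not report success.

    Hypothetical wrapper for illustration. `upload` is expected to be a
    callable such as rasterio_to_bigquery that returns True on success.
    """
    if not upload(**kwargs):
        raise RuntimeError(f"Upload did not report success for arguments: {kwargs}")
    return True


# Intended usage (requires raster_loader and valid credentials):
#
#   from raster_loader import rasterio_to_bigquery
#   upload_or_fail(
#       rasterio_to_bigquery,
#       file_path="path/to/raster.tif",
#       project_id="my-project",
#       dataset_id="my_dataset",
#       table_id="my_table",
#   )
```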

You can also access and inspect a raster file from a BigQuery table using the bigquery_to_records() function:

records_df = bigquery_to_records(
    project_id="my-project",
    dataset_id="my_dataset",
    table_id="my_table",
)

This function returns a DataFrame with some samples from the raster table on BigQuery (10 rows by default).

Development

See CONTRIBUTING.md for information on how to contribute to this project.

ROADMAP.md contains a list of features and improvements planned for future versions of Raster Loader.

Releasing

1. Create and merge a release PR updating the CHANGELOG

  • Branch: release/X.Y.Z
  • Title: Release vX.Y.Z
  • Description: CHANGELOG release notes

Example:

## [0.7.0] - 2024-06-02

### Added
- Support raster overviews (#140)

### Enhancements
- increase chunk-size to 10000 (#142)

### Bug Fixes
- fix: make the gdalwarp examples consistent (#143)

2. Create and push a tag vX.Y.Z

This will trigger an automatic workflow that will publish the package at https://pypi.org/project/raster-loader.

3. Create the GitHub release

Go to the tags page (https://github.com/CartoDB/raster-loader/tags), select the release tag, and click "Create a new release".

  • Title: vX.Y.Z
  • Description: CHANGELOG release notes

Example:

### Added
- Support raster overviews (#140)

### Enhancements
- increase chunk-size to 10000 (#142)

### Bug Fixes
- fix: make the gdalwarp examples consistent (#143)
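The naming conventions used across the three release steps (branch release/X.Y.Z, PR title Release vX.Y.Z, tag and release title vX.Y.Z) can be derived from a single version string. The following is a small sketch of that mapping; release_names is a hypothetical helper and is not part of the project's tooling.

```python
import re

# X.Y.Z with numeric components, as in the examples above (e.g. 0.7.0).
_VERSION = re.compile(r"\d+\.\d+\.\d+")


def release_names(version):
    """Return the branch, PR title, tag, and release title for a version.

    Hypothetical helper for illustration; not part of Raster Loader.
    """
    if not _VERSION.fullmatch(version):
        raise ValueError(f"Expected a version of the form X.Y.Z, got {version!r}")
    return {
        "branch": f"release/{version}",
        "pr_title": f"Release v{version}",
        "tag": f"v{version}",
        "release_title": f"v{version}",
    }
```

For example, release_names("0.7.0") gives the branch release/0.7.0, the PR title Release v0.7.0, and the tag v0.7.0.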
