A parser intended for use with VirtualiZarr to create virtual Zarr stores from TIFFs
Virtual TIFF
A parser for creating virtual Zarr stores from TIFF files using VirtualiZarr 2.0 and async-tiff.
Background
First, some thoughts on why we should virtualize GeoTIFFs and/or COGs:
- Provide faster access, without any data duplication, to non-cloud-optimized GeoTIFFs that contain some form of internal tiling (see notebook #1).
- Provide fully async I/O for both GeoTIFFs and COGs using Zarr-Python.
- Allow loading a stack of GeoTIFFs/COGs into a data cube with fewer GET requests than stackstac/odc-stac, thereby decreasing cost and increasing performance.
- Give users a lazily loaded DataTree containing both the data and the overviews, so scientists can use the overviews not only for tile-based visualization but also for quickly iterating on analytics.
- Include ETags in the virtualized datasets to support reproducibility.
- A motivation that's less clear to me, but maybe possible, is using the virtualization layer to access COGs with disparate CRSs as a single dataset (https://github.com/zarr-developers/geozarr-spec/issues/53).
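To make "without any data duplication" concrete: a tiled TIFF already records where each tile lives in the file (the TileOffsets and TileByteCounts tags), so a virtual store only needs a table of byte ranges pointing into the original file. The following is a rough sketch of that idea in plain Python, not the actual virtual-tiff or VirtualiZarr implementation. `ChunkRef`, `build_manifest`, the `"row.col"` key format, and the offsets are all hypothetical and chosen for illustration; it assumes a single-band image whose tiles are listed left-to-right, top-to-bottom, as the TIFF specification describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical reference to one tile's bytes inside the original file."""

    path: str
    offset: int
    length: int


def build_manifest(path, tile_offsets, tile_byte_counts, tiles_across):
    """Map illustrative "row.col" chunk keys to byte ranges in `path`.

    Assumes tiles are listed left-to-right, top-to-bottom.
    """
    if len(tile_offsets) != len(tile_byte_counts):
        raise ValueError("TileOffsets and TileByteCounts must have equal length")
    manifest = {}
    for index, (offset, length) in enumerate(zip(tile_offsets, tile_byte_counts)):
        row, col = divmod(index, tiles_across)
        manifest[f"{row}.{col}"] = ChunkRef(path, offset, length)
    return manifest


# Made-up values for a 2 x 2 grid of tiles; no pixel data is copied.
manifest = build_manifest(
    "s3://example-bucket/example.tif",
    tile_offsets=[100, 300, 500, 700],
    tile_byte_counts=[200, 200, 200, 150],
    tiles_across=2,
)
print(manifest["1.0"])
```

A reader can then fetch any chunk with a single ranged request against the original TIFF, which is why no second copy of the data is required.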
Getting started
The library can be installed from PyPI:
python -m pip install virtual-tiff
You can use Virtual TIFF to load data directly:
import obstore
from virtualizarr.registry import ObjectStoreRegistry
from virtual_tiff import VirtualTIFF
import xarray as xr
# Configuration
bucket_url = "s3://e84-earth-search-sentinel-data/"
file_url = f"{bucket_url}sentinel-2-c1-l2a/10/T/FR/2023/12/S2B_T10TFR_20231223T190950_L2A/B04.tif"
# Setup and open dataset
s3_store = obstore.store.from_url(bucket_url, region="us-west-2", skip_signature=True)
registry = ObjectStoreRegistry({bucket_url: s3_store})
parser = VirtualTIFF(ifd=0)
manifest_store = parser(url=file_url, registry=registry)
ds = xr.open_zarr(manifest_store, zarr_format=3, consolidated=False)
ds.load()
Alternatively, create a virtual dataset:
import obstore
from virtualizarr import open_virtual_dataset
from virtualizarr.registry import ObjectStoreRegistry
from virtual_tiff import VirtualTIFF
# Configuration
bucket_url = "s3://e84-earth-search-sentinel-data/"
file_url = f"{bucket_url}sentinel-2-c1-l2a/10/T/FR/2023/12/S2B_T10TFR_20231223T190950_L2A/B04.tif"
# Setup and open dataset
s3_store = obstore.store.from_url(bucket_url, region="us-west-2", skip_signature=True)
registry = ObjectStoreRegistry({bucket_url: s3_store})
ds = open_virtual_dataset(
    url=file_url,
    registry=registry,
    parser=VirtualTIFF(ifd=0),
)
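Both examples build `file_url` by appending an object key to `bucket_url`, and register the store under that same `bucket_url` string. If you start from a full file URL instead (for example, one copied from a catalog), you need to derive a matching prefix. A minimal sketch using only the standard library follows; `split_object_url` is a hypothetical helper, not part of virtual-tiff, VirtualiZarr, or obstore.

```python
from urllib.parse import urlparse


def split_object_url(url: str) -> tuple[str, str]:
    """Split "scheme://bucket/key" into ("scheme://bucket/", "key")."""
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected a URL like s3://bucket/key, got {url!r}")
    prefix = f"{parsed.scheme}://{parsed.netloc}/"
    key = parsed.path.lstrip("/")
    return prefix, key


file_url = (
    "s3://e84-earth-search-sentinel-data/"
    "sentinel-2-c1-l2a/10/T/FR/2023/12/S2B_T10TFR_20231223T190950_L2A/B04.tif"
)
bucket_url, key = split_object_url(file_url)
print(bucket_url)
print(key)
```

The resulting `bucket_url` has the same form as the one used in the configuration above, so it can be passed to `obstore.store.from_url` and used as the registry key in the same way.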
Contributing
- Clone the repository:
git clone https://github.com/virtual-zarr/virtual-tiff.git
- Pull baseline image data from the DVC remote:
pixi run -e test download-test-images
WARNING: This will download ~1.4 GB of TIFFs for testing to your machine.
- Run the test suite:
pixi run -e test run-tests
WARNING: Some tests will fail due to the incomplete status of the implementation.
- Start a shell in the development environment if needed:
pixi run -e test zsh
License
virtual-tiff is distributed under the terms of the MIT license.