Virtual Computation Cube for Earth Observation Satellite Data

virtughan

The name combines two words, virtual and cube, with cube translated into the Nepali word घन (ghan). The project is also described as a virtual computation cube.

Demo: https://virtughan.com/

Install

As a Python package:

https://pypi.org/project/virtughan/

pip install virtughan

Basic Usage

Follow Notebook Here

Background

We started by looking at how Google Earth Engine (GEE) computes results on the fly at different zoom levels on large-scale Earth observation datasets. We were fascinated by the approach and wanted to replicate something similar ourselves in an open-source manner. We knew Google uses its own tiling scheme, so we started from there.

Initially, we faced a challenge: how could we generate tiles and compute at the same time without pre-computing the whole dataset? Pre-computation would lead to larger processed data sizes, which we didn’t want. And so the exploration began, and the concept of on-the-fly tile computation was introduced.

At university, we were introduced to the concept of data cubes and the advantages of having a time dimension and semantic layers in the data. It seemed fascinating, despite the challenge of maintaining terabytes of satellite imagery. We thought that maybe we could achieve something similar by developing an approach where one doesn’t need to replicate data but can still build a data cube with semantic layers and computation. This raised another challenge: how to make it work? And hence came the virtual data cube.

We started converting Sentinel-2 images to Cloud Optimized GeoTIFFs (COGs) and experimented with the time dimension using Python’s xarray to compute the data. We found that earth-search’s effort to store Sentinel images as COGs made it easier for us to build virtual data cubes across the world without storing any data. This felt like an achievement and proof that modern data cubes should focus on improving computation rather than worrying about how to manage terabytes of data.

We wanted to build something to show that this approach actually works and is scalable. We deliberately chose to use only our laptops to run the prototype and process a year’s worth of data without expensive servers.

Learn about COG and how to generate one for this project Here

Purpose

1. Efficient On-the-Fly Tile Computation

This research explores how to perform real-time calculations on satellite images at different zoom levels, similar to Google Earth Engine, but using open-source tools. By using Cloud Optimized GeoTIFFs (COGs) with Sentinel-2 imagery, large images can be analyzed without needing to pre-process or store them. The study highlights how this method can scale well and work efficiently, even with limited hardware. Our main focus is on how to scale the computation across zoom levels without introducing server overhead.
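To make the zoom-level scaling concrete, the sketch below estimates the ground resolution of a Web Mercator tile and picks a matching COG overview. It is only an illustration, not virtughan's implementation: the helper names, the 256-pixel tile size, the 10 m native resolution, and the overview factors (1, 2, 4, 8, 16) are all assumptions, and the overviews actually present in a COG depend on how it was created.

```python
import math

# Ground resolution of one pixel at the equator for zoom 0,
# assuming Web Mercator and 256 x 256 pixel tiles.
EQUATOR_M_PER_PIXEL_Z0 = 156543.03392804097


def tile_resolution_m(lat_deg: float, zoom: int) -> float:
    """Approximate metres per pixel of a Web Mercator tile at a latitude."""
    return EQUATOR_M_PER_PIXEL_Z0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def pick_overview_factor(native_res_m, target_res_m, factors=(1, 2, 4, 8, 16)):
    """Pick the coarsest overview that is still at least as fine as the tile.

    `factors` are hypothetical decimation factors; factor 1 is full resolution.
    """
    usable = [f for f in factors if native_res_m * f <= target_res_m]
    # If even full resolution is coarser than the tile, fall back to it.
    return max(usable) if usable else min(factors)


if __name__ == "__main__":
    lat = 28.28139  # latitude used in the tile example in this README
    for zoom in (8, 12, 14):
        res = tile_resolution_m(lat, zoom)
        factor = pick_overview_factor(10.0, res)
        print(f"zoom {zoom}: ~{res:.1f} m/pixel, overview factor {factor}")
```

Reading a coarser overview at low zoom levels means fewer bytes are fetched per tile, which is one reason the per-tile cost can stay roughly constant as the viewed area grows.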

Watch

Example Python usage

import asyncio
from io import BytesIO

import mercantile
from PIL import Image
from virtughan.tile import TileProcessor

lat, lon = 28.28139, 83.91866
zoom_level = 12
# Find the XYZ tile that contains the point at this zoom level
x, y, z = mercantile.tile(lon, lat, zoom_level)


async def main():
    tile_processor = TileProcessor()

    # cached_generate_tile is awaited, so it has to run inside a coroutine
    image_bytes, feature = await tile_processor.cached_generate_tile(
        x=x,
        y=y,
        z=z,
        start_date="2020-01-01",
        end_date="2025-01-01",
        cloud_cover=30,
        bands=("red", "nir"),
        formula="(nir - red) / (nir + red)",
        colormap_str="RdYlGn",
        collection="sentinel-2-l2a",
    )

    image = Image.open(BytesIO(image_bytes))

    print(f"Tile: {x}_{y}_{z}")
    print(f"Date: {feature['properties']['datetime']}")
    print(f"Cloud Cover: {feature['properties']['eo:cloud_cover']}%")

    image.save(f'tile_{x}_{y}_{z}.png')


# In a script; inside a notebook, use `await main()` instead
asyncio.run(main())
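The example above relies on mercantile to turn a coordinate into a tile index. As a rough sketch of what that step involves, the dependency-free code below applies the standard Web Mercator (XYZ) tile formulas. The function names are hypothetical, and this is not how virtughan or mercantile is implemented internally.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_bounds(x: int, y: int, zoom: int):
    """Return (west, south, east, north) of a tile in degrees."""
    n = 2 ** zoom

    def lat_of(row: int) -> float:
        # Latitude of the upper edge of tile row `row`
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    west = x / n * 360.0 - 180.0
    east = (x + 1) / n * 360.0 - 180.0
    return west, lat_of(y + 1), east, lat_of(y)


if __name__ == "__main__":
    x, y = lonlat_to_tile(83.91866, 28.28139, 12)
    print("tile:", x, y)
    print("bounds:", tile_bounds(x, y, 12))
```

For the point used above, this gives tile x=3002, y=1712 at zoom 12. Only the data inside those bounds needs to be read to render the tile.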

2. Virtual Computation Cubes: Focusing on Computation

While storing large images can offer some benefits, we believe that emphasizing efficient computation yields far greater advantages and largely removes the need to worry about large-scale image storage. COGs make it possible to analyze images directly without storing the entire dataset. This introduces the idea of virtual computation cubes, where images are stacked and processed over time, allowing for analysis across different layers (including semantic layers) without needing to download or save everything, so the original data is never replicated. In this setup, a data provider can store and convert images to COGs, while users or service providers focus on calculations. This approach reduces the need for terabytes of storage and makes it easier to process large datasets quickly.

Example Python usage

Example NDVI calculation

from virtughan.engine import VirtughanProcessor

processor = VirtughanProcessor(
    bbox=[83.84765625, 28.22697003891833, 83.935546875, 28.304380682962773],
    start_date="2023-01-01",
    end_date="2025-01-01",
    cloud_cover=30,
    formula="(nir - red) / (nir + red)",
    bands=["red", "nir"],
    operation="median",
    timeseries=True,
    output_dir="virtughan_output",
    collection="sentinel-2-l2a",
)

processor.compute()
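To illustrate what a band formula combined with operation="median" means, here is a minimal sketch that computes NDVI per scene and then takes a per-pixel median over time. It uses plain Python lists and made-up reflectance values to stay dependency-free. The function names are hypothetical, and virtughan's own processing may differ.

```python
import math
from statistics import median


def ndvi(red: float, nir: float) -> float:
    """(nir - red) / (nir + red), or NaN where the denominator is zero."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else math.nan


def ndvi_grid(red_grid, nir_grid):
    """Apply the NDVI formula pixel by pixel to two equally sized grids."""
    return [
        [ndvi(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_grid, nir_grid)
    ]


def median_composite(grids):
    """Per-pixel median over a time-ordered list of grids, skipping NaN."""
    rows, cols = len(grids[0]), len(grids[0][0])
    composite = []
    for i in range(rows):
        row = []
        for j in range(cols):
            values = [g[i][j] for g in grids if not math.isnan(g[i][j])]
            row.append(median(values) if values else math.nan)
        composite.append(row)
    return composite


# Three made-up 1 x 2 scenes (illustrative reflectance values only)
scenes = [
    {"red": [[0.1, 0.2]], "nir": [[0.5, 0.2]]},
    {"red": [[0.2, 0.1]], "nir": [[0.6, 0.3]]},
    {"red": [[0.3, 0.0]], "nir": [[0.3, 0.0]]},
]
composite = median_composite([ndvi_grid(s["red"], s["nir"]) for s in scenes])

if __name__ == "__main__":
    print(composite)
```

The median is a common choice for temporal composites because a single cloudy or otherwise unusual scene has little influence on the result.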

Sentinel-1 SAR (cross-pol ratio)

Sentinel-1 is supported via Planetary Computer's RTC (Radiometrically Terrain Corrected) product. Bands are polarizations (vv, vh, hh, hv) and pixel values are gamma0 in linear scale. SAR sees through clouds, so cloud_cover is ignored for this collection. Filter by acquisition mode (IW, EW, SM, WV) via the extra_query field.

from virtughan.engine import VirtughanProcessor

processor = VirtughanProcessor(
    bbox=[83.92, 28.19, 83.99, 28.24],  # Phewa Lake, Pokhara
    start_date="2025-01-01",
    end_date="2025-01-20",
    cloud_cover=0,
    formula="10 * log10(vv / vh)",  # cross-pol ratio in dB
    bands=["vv", "vh"],
    operation="median",
    timeseries=False,
    output_dir="virtughan_s1_output",
    collection="sentinel-1-rtc",
    extra_query={"sar:instrument_mode": {"eq": "IW"}},
)
processor.compute()
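The formula above converts a ratio of linear-scale backscatter values to decibels. The short sketch below shows that arithmetic for single pixel values. The function names are hypothetical, and treating non-positive values as missing is an assumption made for this illustration, not documented virtughan behaviour.

```python
import math


def linear_to_db(value: float) -> float:
    """Convert a linear-scale backscatter value to decibels."""
    return 10.0 * math.log10(value) if value > 0 else math.nan


def cross_pol_ratio_db(vv: float, vh: float) -> float:
    """10 * log10(vv / vh); NaN when either input is not positive."""
    if vv <= 0 or vh <= 0:
        return math.nan
    return 10.0 * math.log10(vv / vh)


if __name__ == "__main__":
    vv, vh = 0.1, 0.01  # made-up linear gamma0 values
    print(cross_pol_ratio_db(vv, vh))
    # The ratio in dB equals the difference of the two bands in dB
    print(linear_to_db(vv) - linear_to_db(vh))
```

Because the ratio in dB is the difference of the two bands in dB, the same result can be obtained from products that are already delivered in dB by subtracting them.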

Summary

This research introduces methods for using COGs, the SpatioTemporal Asset Catalog (STAC) API, and NumPy arrays to improve the way large Earth observation datasets are accessed and processed. The method allows users to focus on specific areas of interest, process data across different bands and layers over time, and maintain optimal resolution while ensuring fast performance. With the STAC API, it becomes easier to search for and process only the necessary data without downloading entire images (not even a single full scene, only the parts that are needed). The study shows how COGs can improve the handling of large datasets, making access faster and computation efficient and scalable across different zoom levels.
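As an illustration of the search step, the sketch below builds the JSON body of a STAC API item search from the same kind of parameters used in the examples above. It only constructs the request and does not send it. The function name is hypothetical, and the "query" field assumes the catalog supports the STAC API query extension. This is not necessarily the request virtughan issues internally.

```python
import json


def build_stac_search(bbox, start_date, end_date, collection,
                      max_cloud_cover=None, extra_query=None, limit=100):
    """Build a STAC item-search body (to be POSTed to a catalog's /search)."""
    body = {
        "collections": [collection],
        "bbox": list(bbox),  # [west, south, east, north]
        # Closed RFC 3339 interval covering both days in full
        "datetime": f"{start_date}T00:00:00Z/{end_date}T23:59:59Z",
        "limit": limit,
    }
    query = dict(extra_query or {})
    if max_cloud_cover is not None:
        # Keep only scenes below the cloud-cover threshold
        query["eo:cloud_cover"] = {"lt": max_cloud_cover}
    if query:
        body["query"] = query
    return body


if __name__ == "__main__":
    body = build_stac_search(
        bbox=[83.84765625, 28.22697003891833, 83.935546875, 28.304380682962773],
        start_date="2023-01-01",
        end_date="2025-01-01",
        collection="sentinel-2-l2a",
        max_cloud_cover=30,
    )
    print(json.dumps(body, indent=2))
```

The response to such a search lists items whose assets point at COG URLs, so only the matching scenes, and within them only the needed byte ranges, have to be read.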

flowchart

Sample case study:

Watch Video

Local Setup

This project has a FastAPI backend and a plain JS frontend.

Quick Start

git clone https://github.com/virtughan/virtughan.git
cd virtughan
just run

Docker

docker build -t virtughan .
docker run --rm -p 8080:8080 virtughan

For full setup details and configuration options, see the installation guide.

Tech Stack

Resources and Credits

Contribute

Liked the concept? Want to be part of it?

If you have experience with JavaScript, FastAPI, or building geospatial Python libraries, we’d love your contributions! But you don’t have to be a coder to help; spreading the word is just as valuable.

How You Can Contribute

Code Contributions

  • Fork the repository and submit a PR with improvements, bug fixes, or features. Use commitizen for your commits.
  • Help us refine our development guidelines!

Documentation & Testing

  • Improve our docs to make it easier for others to get started.
  • Test features and report issues to help us build a robust system.

Spread the Word

  • Share the project on social media or among developer communities.
  • Bring in more contributors who might be interested!

Support Us
If you love what we’re building, consider buying us a coffee ☕ to keep the project going!

Buy Us a Coffee

Acknowledgment

This project was initiated during the project work of our master's program, the Copernicus Master in Digital Earth. We are thankful to everyone from the program who was involved and supported us.

Copyright

© 2024 – Concept by Kshitij and Upen. Distributed under the GNU General Public License v3.0.
