Common BigEarthNet Tools

A personal collection of common tools to interact with the BigEarthNet dataset.

This library provides a collection of high-level tools to better work with the BigEarthNet dataset.

bigearthnet_common tries to accomplish three goals:

  1. Collect the most relevant constants in a single place to reduce the time spent looking for them, such as:
    • The 19 or 43 class nomenclature strings
    • URL
    • Band statistics (mean/variance) as integers and floats
    • Channel names
    • etc.
  2. Provide parsing functions to convert the metadata JSON files to a geopandas GeoDataFrame.
    • Allow for easy top-level statistical analysis of the data in a familiar pandas style
    • Provide functions to enrich GeoDataFrames with frequently needed BigEarthNet metadata (like the season or country of the patch)
  3. Simplify the building procedure by providing a command-line interface with reproducible results
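To make the second goal more concrete, here is a minimal sketch of what parsing one metadata file and enriching it with a season could look like. It uses only the standard library instead of geopandas, and it is not the library's implementation: the JSON field names (`labels`, `acquisition_date`), the date format, the helper names, and the month-to-season mapping are all assumptions chosen for illustration.

```python
import json
from datetime import datetime

# Hypothetical metadata snippet; field names and date format are assumptions.
EXAMPLE_METADATA = """
{
  "labels": ["Coniferous forest", "Sea and ocean"],
  "acquisition_date": "2017-06-13 10:10:31"
}
"""


def month_to_season(month: int) -> str:
    """Map a month number (1-12) to a meteorological season.

    Assumes northern-hemisphere seasons; the library may define seasons differently.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Fall"


def metadata_to_record(raw_json: str) -> dict:
    """Parse one metadata document and add a derived 'season' entry."""
    meta = json.loads(raw_json)
    acquired = datetime.strptime(meta["acquisition_date"], "%Y-%m-%d %H:%M:%S")
    return {
        "labels": meta["labels"],
        "acquisition_date": acquired,
        "season": month_to_season(acquired.month),
    }


record = metadata_to_record(EXAMPLE_METADATA)
print(record["season"])  # Summer
```

A list of such records maps naturally onto the rows of a tabular structure, which is the role the GeoDataFrame plays in the library.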

Installation

I strongly recommend using mamba or conda with miniforge to install the package:

  • mamba/conda install bigearthnet-common -c conda-forge

Because bigearthnet_common is built on top of geopandas, the same installation restrictions apply. For more details, please review the geopandas installation documentation.

The package is also available via PyPI and can be installed with:

  • pip install bigearthnet_common (not recommended)

TL;DR

The most relevant functions are exposed as CLI entry points. To quickly search for BigEarthNet constants of interest, call:

  • ben_constants_prompt or
  • python -m bigearthnet_common.constants

To build the tabular data, use:

  • ben_gdf_builder --help or
  • python -m bigearthnet_common.gdf_builder --help

Deep Learning

One of the primary purposes of the dataset is to let deep learning researchers and practitioners train their models on multi-spectral satellite data. For this use case, the general recommendation is to drop patches that are covered by seasonal snow or clouds and to prefer the newer 19-class nomenclature over the original 43-class nomenclature. Following these recommendations means that some patches have to be excluded from the raw BigEarthNet dataset that is provided at BigEarthNet.
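The exclusion step described above boils down to a set difference. The following is a minimal sketch with made-up patch names; the function name `drop_unrecommended` and the way the snow and cloud lists are represented are hypothetical and do not reflect how the library stores or applies them.

```python
from typing import Iterable, List


def drop_unrecommended(
    patch_names: Iterable[str],
    snowy: Iterable[str],
    cloudy: Iterable[str],
) -> List[str]:
    """Return the patch names that are in neither the snow nor the cloud list.

    The input order of patch_names is preserved.
    """
    excluded = set(snowy) | set(cloudy)
    return [name for name in patch_names if name not in excluded]


# Made-up patch names, for illustration only.
all_patches = ["patch_a", "patch_b", "patch_c", "patch_d"]
snow_patches = {"patch_b"}
cloud_patches = {"patch_d"}

kept = drop_unrecommended(all_patches, snow_patches, cloud_patches)
print(kept)  # ['patch_a', 'patch_c']
```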

To simplify the procedure of pre-converting the JSON metadata files, the library provides a single command that will generate a recommended GeoDataFrame with extra metadata (country/season data of each patch) while dropping all patches that are not recommended for deep learning research.

To generate such a GeoDataFrame and store it as a parquet file, use:

  • ben_gdf_builder build-recommended-parquet (available after installing the package) or
  • python -m bigearthnet_common.gdf_builder build-recommended-parquet

If you want to read the raw JSON files and convert them to a GeoDataFrame (stored as a parquet file) without dropping any patches or adding any metadata, use:

  • ben_gdf_builder build-raw-ben-parquet (available after installing the package) or
  • python -m bigearthnet_common.gdf_builder build-raw-ben-parquet
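If the builder needs to be called from a Python script rather than a shell, the two invocation styles listed above can be wrapped in a small helper. This is only a sketch: the helper names `gdf_builder_command` and `run_gdf_builder` are hypothetical, and the arguments each subcommand expects are not shown here, so check `--help` first.

```python
import shutil
import subprocess
import sys
from typing import List


def gdf_builder_command(*args: str) -> List[str]:
    """Build an argv list for the builder CLI.

    Prefers the ben_gdf_builder entry point when it is on PATH and
    otherwise falls back to the `python -m` form.
    """
    if shutil.which("ben_gdf_builder"):
        return ["ben_gdf_builder", *args]
    return [sys.executable, "-m", "bigearthnet_common.gdf_builder", *args]


def run_gdf_builder(*args: str) -> subprocess.CompletedProcess:
    """Run the builder and capture its output without raising on failure."""
    return subprocess.run(
        gdf_builder_command(*args),
        capture_output=True,
        text=True,
        check=False,
    )


# Only constructs the command; nothing is executed at this point.
help_command = gdf_builder_command("--help")
print(help_command)
```

Calling `run_gdf_builder("--help")` executes the command and returns a `CompletedProcess` whose `stdout` and `returncode` can be inspected; a non-zero return code typically means the package is not installed in the active environment.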

Contributing

Contributions are always welcome!

Please look at the corresponding ipynb notebook from the nbs folder to review the source code. These notebooks include extensive documentation, visualizations, and tests. The automatically generated Python files are available in the bigearthnet_common module.

More information is available in the contributing guidelines document.
