Sentinel-2 product chunk descriptors (drb-chunk)

drb-chunk-sentinel2

Chunk descriptors for the Sentinel-2 L2A product type, built on top of the drb-chunk add-on.

What it is

drb-chunk-sentinel2 ships a generated cortex.ttl that declares 68 chunk descriptors, grouped into 4 collections (R10m, R20m, R60m, QI), on the Sentinel-2 L2A class URI:

http://knowledge-base.gael.fr/drb/sentinel-2/product_user_level-2a

Each descriptor is a drb:chunk entry linking a logical chunk name (e.g. B04_10m) and a drb:collection (e.g. R10m) to an XQuery expression that navigates the product tree to the corresponding JP2 raster node.

The package does not modify the Sentinel-2 topic TTL and registers no entry-point group. It is a pure descriptor extension merged into the topic graph at runtime (see How attachment works).

L1C support is planned — same pattern, different IMG_DATA layout — but is not present in this release.


Chunks exposed

68 chunks in total, grouped into 4 collections (tile 512 × 512). The three IMG_DATA resolution collections hold the spectral + ancillary bands (named <LAYER>_<RES>); the QI collection holds the QI_DATA quality masks and the preview.

Collection  Count  Contents
R10m        7      B02 B03 B04 B08 AOT WVP (uint16) · TCI (uint8 RGB)
R20m        14     B01–B07 B8A B11 B12 AOT WVP (uint16) · SCL TCI (uint8)
R60m        15     B01–B07 B8A B09 B11 B12 AOT WVP (uint16) · SCL TCI (uint8)
QI          32     CLDPRB_{20m,60m}, SNWPRB_{20m,60m}, CLASSI_B00, DETFOO_B01–B12+B8A, QUALIT_B01–B12+B8A, PVI (all uint8)

R10m — 10 980 × 10 980 pixels, tile 512 × 512

Chunk name Band dtype
B02_10m B02 uint16
B03_10m B03 uint16
B04_10m B04 uint16
B08_10m B08 uint16
AOT_10m AOT uint16
WVP_10m WVP uint16
TCI_10m TCI uint8 (RGB)

R20m — 5 490 × 5 490 pixels

Chunk name Band dtype
B01_20m B01 uint16
B02_20m B02 uint16
B03_20m B03 uint16
B04_20m B04 uint16
B05_20m B05 uint16
B06_20m B06 uint16
B07_20m B07 uint16
B8A_20m B8A uint16
B11_20m B11 uint16
B12_20m B12 uint16
AOT_20m AOT uint16
WVP_20m WVP uint16
SCL_20m SCL uint8
TCI_20m TCI uint8 (RGB)

R60m — 1 830 × 1 830 pixels

Chunk name Band dtype
B01_60m B01 uint16
B02_60m B02 uint16
B03_60m B03 uint16
B04_60m B04 uint16
B05_60m B05 uint16
B06_60m B06 uint16
B07_60m B07 uint16
B8A_60m B8A uint16
B09_60m B09 uint16
B11_60m B11 uint16
B12_60m B12 uint16
AOT_60m AOT uint16
WVP_60m WVP uint16
SCL_60m SCL uint8
TCI_60m TCI uint8 (RGB)

QI — quality masks & preview (QI_DATA)

All uint8. Probability masks carry a resolution suffix; the per-band detector-footprint and quality masks are named after their band.

Chunk name                           Layer
CLDPRB_20m, CLDPRB_60m               cloud probability
SNWPRB_20m, SNWPRB_60m               snow probability
CLASSI_B00                           classification mask
DETFOO_B01–DETFOO_B12, DETFOO_B8A    detector footprint (per band)
QUALIT_B01–QUALIT_B12, QUALIT_B8A    quality mask (per band)
PVI                                  preview image
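
As a cross-check of the counts quoted above, the chunk names can be enumerated from the tables. This is only a sketch transcribed from this page, not the package's _generate.py; the dictionaries and the chunk_names helper are illustrative names.

```python
# Band lists transcribed from the tables above (illustrative, not _generate.py).
SPECTRAL = {
    "10m": ["B02", "B03", "B04", "B08"],
    "20m": ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
            "B8A", "B11", "B12"],
    "60m": ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
            "B8A", "B09", "B11", "B12"],
}
ANCILLARY = {
    "10m": ["AOT", "WVP", "TCI"],
    "20m": ["AOT", "WVP", "SCL", "TCI"],
    "60m": ["AOT", "WVP", "SCL", "TCI"],
}
# Per-band QI masks cover B01..B12 plus B8A (13 bands).
QI_BANDS = [f"B{i:02d}" for i in range(1, 13)] + ["B8A"]


def chunk_names() -> dict[str, list[str]]:
    """Return {collection: [chunk names]} following the <LAYER>_<RES> scheme."""
    collections = {
        f"R{res}": [f"{layer}_{res}" for layer in SPECTRAL[res] + ANCILLARY[res]]
        for res in SPECTRAL
    }
    # Probability masks carry a resolution suffix.
    qi = [f"{kind}_{res}" for kind in ("CLDPRB", "SNWPRB") for res in ("20m", "60m")]
    qi.append("CLASSI_B00")
    # Detector-footprint and quality masks are named after their band.
    qi += [f"{kind}_{band}" for kind in ("DETFOO", "QUALIT") for band in QI_BANDS]
    qi.append("PVI")
    collections["QI"] = qi
    return collections


names = chunk_names()
counts = {collection: len(members) for collection, members in names.items()}
print(counts)                # {'R10m': 7, 'R20m': 14, 'R60m': 15, 'QI': 32}
print(sum(counts.values()))  # 68
```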

Each chunk's drb:source is an XQuery expression of the form:

GRANULE/*/IMG_DATA/R10m/*[fn:matches(fn:name(),'.*_B04_10m\.jp2$')]

Note: the leading .* is required because fn:matches in drb's XQuery engine uses full-match semantics (re.fullmatch), not search semantics. The pattern must match the entire filename, not just a suffix.
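
The difference between the two semantics can be reproduced with Python's own re module. The snippet below is a sketch only: the filename is a made-up member name, and it exercises re directly rather than drb's XQuery engine.

```python
import re

# Hypothetical JP2 member name, for illustration only.
filename = "T19VCG_20220101T000000_B04_10m.jp2"

suffix_only = r"_B04_10m\.jp2$"
with_prefix = r".*_B04_10m\.jp2$"

# Search semantics: the pattern may match anywhere in the string.
search_hit = re.search(suffix_only, filename) is not None
# Full-match semantics: the pattern must consume the whole string.
full_without_prefix = re.fullmatch(suffix_only, filename) is not None
full_with_prefix = re.fullmatch(with_prefix, filename) is not None

print(search_hit, full_without_prefix, full_with_prefix)  # True False True
```

Under full-match semantics the trailing $ is redundant, but it is harmless and keeps the pattern valid under search semantics too.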


How attachment works

cortex.ttl extends the L2A class by URI, targeting sentinel-2:product_user_level-2a. It does not include a setup.cfg entry-point group — the TTL is not auto-discovered by the resolver on its own.

At runtime the chunk TTL must be merged into the topic graph in one of two ways (see Bootstrap):

  • In a Fuseki deployment, add the packaged TTL to the named-graph list loaded into the dataset.
  • In an offline setup, compose it together with the S2-SAFE topic TTL in a single RDFDao([path_to_s2_ttl, cortex_path()]).

Once merged, drb-chunk reads the descriptors via get_dao(topic).graph and makes them available through the standard chunk API.

The packaged TTL path is returned by:

from drb.addons.chunk.sentinel2 import cortex_path
print(cortex_path())   # /path/to/drb/addons/chunk/sentinel2/cortex.ttl

Bootstrap

Mode 1 — Fuseki (recommended for production)

Set the environment variables before resolving any node:

export FUSEKI_URL=http://localhost:3030
export DATASET=drb
export DRB_FUSEKI_GRAPHS="http://drb.gael.fr/graph/kb/drbx-kb-topics-sentinel-2-safe/latest,http://drb.gael.fr/graph/kb/drbx-kb-topics-safe/latest"

DRB_FUSEKI_GRAPHS is a comma-separated list of Fuseki named-graph URIs. The value above lists only the two topic graphs; the chunk-descriptor graph (loaded from this package's cortex.ttl) must also be merged into the same dataset/graph set.
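
The same variables can be set from Python before any node is resolved. The sketch below assumes the chunk TTL has already been uploaded to Fuseki under its own named graph; CHUNK_GRAPH is a hypothetical URI and fuseki_graphs is an illustrative helper, neither comes from the package.

```python
import os

TOPIC_GRAPHS = [
    "http://drb.gael.fr/graph/kb/drbx-kb-topics-sentinel-2-safe/latest",
    "http://drb.gael.fr/graph/kb/drbx-kb-topics-safe/latest",
]
# Hypothetical URI: use the name the chunk graph was actually loaded under.
CHUNK_GRAPH = "http://example.org/graph/drb-chunk-sentinel2"


def fuseki_graphs(*uris: str) -> str:
    """Join URIs with commas, dropping duplicates while keeping order."""
    return ",".join(dict.fromkeys(uris))


os.environ["FUSEKI_URL"] = "http://localhost:3030"
os.environ["DATASET"] = "drb"
os.environ["DRB_FUSEKI_GRAPHS"] = fuseki_graphs(*TOPIC_GRAPHS, CHUNK_GRAPH)

print(os.environ["DRB_FUSEKI_GRAPHS"].split(","))
```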

Mode 2 — Offline vendored TTL

Compose the two TTL files into a single RDFDao and register it before resolving:

from drb.topics.dao import ManagerDao
from drb.topics.dao.rdf_dao import RDFDao
from drb.addons.chunk.sentinel2 import cortex_path

s2_ttl = "/path/to/vendored/sentinel-2-safe-topics.ttl"
# Merge the S2 topic descriptors and the chunk descriptor into one graph.
ManagerDao().add_dao_instance(RDFDao([s2_ttl, str(cortex_path())]))

See examples/demo_s2_l2a_chunk.py for a complete runnable demonstration of both modes.


Worked example

import numpy
from drb.topics import resolver
from drb.addons.addon import AddonManager
from drb.chunk.selection import WindowSelection

# 1. Resolve the Sentinel-2 product stored on S3 (bootstrap the KB topics first; see Bootstrap)
topic, node = resolver.resolve(
    "s3://my-sentinel2-bucket/"
    "S2A_MSIL2A_20220101T000000_N0400_R000_T19VCG_20220101T000000.SAFE")

# 2. Discover the chunks, by collection, declared for this topic
addon = AddonManager().get_addon("chunk")
print(addon.available_collections(topic))
#   {'R10m': ['B02_10m', ...], 'R20m': [...], 'R60m': [...], 'QI': [...]}

# 3a. Build one chunk and read a 512x512 window
chunk = addon.apply(node, chunk_name="B04_10m")
window = WindowSelection(x=0, y=0, w=512, h=512)
array = chunk.select(window).get_impl(numpy.ndarray)
print(array.shape)          # (1, 512, 512) — single band, uint16

# 3b. Or build every chunk of a whole collection at once
r10m_chunks = addon.apply(node, collection="R10m")   # list[Chunk] (7 here)
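
To walk a whole raster window by window, the 512 × 512 grid can be generated up front. The helper below is a sketch in plain Python with illustrative names (tile_windows, Window); each tuple maps onto the x, y, w, h keyword arguments of WindowSelection used above. Clipping the edge windows to the raster bounds is an assumption, since this page does not say how out-of-range windows are handled.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (x, y, w, h)


def tile_windows(width: int, height: int, tile: int = 512) -> Iterator[Window]:
    """Yield row-major windows covering the raster, clipping edge tiles."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)


# R10m rasters are 10 980 x 10 980 pixels.
windows = list(tile_windows(10980, 10980))
print(len(windows))  # 484 (22 x 22)
print(windows[0])    # (0, 0, 512, 512)
print(windows[-1])   # (10752, 10752, 228, 228)
```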

Limitations

  • Windowing is correct but performs no partial fetch: WindowSelection returns the right pixel values, but the current implementation materialises the full JP2 member from the archive before slicing, so there is no partial network fetch (e.g. an HTTP range request on a cloud-hosted SAFE). This is a drb-chunk v1 limitation.
  • Single-tile materialisation: each apply / select call opens one JP2 node. Multi-tile mosaicking is not supported in this release.
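
For a sense of scale, the decoded pixel footprint of a full R10m band can be compared with that of a single window. This is back-of-the-envelope arithmetic with illustrative names (ITEMSIZE, raster_bytes). It counts uncompressed samples only; the JP2 member is compressed, so the bytes actually transferred are fewer, and this page does not state how much of the raster is decoded in memory.

```python
ITEMSIZE = {"uint8": 1, "uint16": 2}  # bytes per sample


def raster_bytes(width: int, height: int, dtype: str, bands: int = 1) -> int:
    """Uncompressed size in bytes of a width x height x bands raster."""
    return width * height * bands * ITEMSIZE[dtype]


full = raster_bytes(10980, 10980, "uint16")  # one full R10m band
window = raster_bytes(512, 512, "uint16")    # one 512 x 512 window

print(full)                  # 241120800
print(window)                # 524288
print(round(full / window))  # 460
```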

See docs/evolution-requests.md for tracked improvement requests.


Regenerating the descriptor

The cortex.ttl is generated from a band×resolution table embedded in _generate.py. To regenerate after editing the table:

python -m drb.addons.chunk.sentinel2._generate

The file is written next to the module (i.e. drb/addons/chunk/sentinel2/cortex.ttl). Commit the result.


Installation

pip install drb-chunk-sentinel2

Requires drb-chunk (installed as a dependency). A working drb-chunk setup with either a Fuseki instance or a vendored S2 topic TTL is needed at runtime (see Bootstrap).
