Sentinel-1 product chunk descriptors (drb-chunk)

drb-chunk-sentinel1

Chunk descriptors for Sentinel-1 Level-1 product types: GRD (detected amplitude) and SLC (Single Look Complex), built on top of the drb-chunk add-on.

What it is

drb-chunk-sentinel1 ships a generated cortex.ttl that declares chunk descriptors for Level-1 GRD and SLC products, attached once to the existing Sentinel-1 Level-1 class URI:

http://knowledge-base.gael.fr/drb/sentinel-1/product_level-1

Every Level-1 product, whatever its acquisition mode (IW/EW/SM) or satellite (S1A/B/C/D), resolves to a subclass of product_level-1 and inherits the chunks through the topic graph's subClassOf chain. The knowledge base has no product-type-specific topic class, and none was added. Instead, each chunk's drb:source XQuery self-selects on the filename token (-grd-<pol>- for GRD, -slc-<pol>- for SLC), so products of the non-matching type yield an empty source for that chunk.
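
The self-selection on the filename token can be pictured with a small stand-alone sketch. It does not reproduce the package's XQuery: `select_measurements` and the file names below are hypothetical, chosen only to show how a `-grd-<pol>-` or `-slc-<pol>-` token leaves the non-matching product type with an empty result.

```python
import re


def select_measurements(filenames, product_type, pol):
    """Return the names carrying the -<type>-<pol>- token (hypothetical helper)."""
    token = re.compile(
        rf"-{re.escape(product_type)}-{re.escape(pol)}-", re.IGNORECASE)
    return [name for name in filenames if token.search(name)]


# Illustrative names only; real measurement file names are longer.
grd_files = ["s1c-iw-grd-vv-example-001.tiff", "s1c-iw-grd-vh-example-002.tiff"]

print(select_measurements(grd_files, "grd", "vv"))  # one match
print(select_measurements(grd_files, "slc", "vv"))  # [] (wrong product type)
```

An empty list here corresponds to the "empty source" case described above.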

The package does not modify the Sentinel-1 topic TTL and registers no entry-point group. It is a pure descriptor extension merged into the topic graph at runtime (see How attachment works).


Chunks exposed

One measurement chunk per polarization, tile 512 × 512, uint16, one band. Array shape is read at runtime from the raster itself (GRD dimensions vary per product/slice) — never hard-coded in the descriptor.

| Chunk name | Content | dtype | Tile |
|---|---|---|---|
| VV | VV-polarization detected amplitude | uint16 | 512×512 |
| VH | VH-polarization detected amplitude | uint16 | 512×512 |
| HH | HH-polarization detected amplitude | uint16 | 512×512 |
| HV | HV-polarization detected amplitude | uint16 | 512×512 |
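
Because the array shape is only known at runtime, the 512 × 512 tile grid has to be derived from the raster dimensions. The following is a minimal sketch of that arithmetic, not the drb-chunk implementation; `tile_windows` is a hypothetical helper and the raster size is made up.

```python
import math


def tile_windows(height, width, tile=512):
    """Yield (x, y, w, h) windows covering a height x width raster."""
    for row in range(math.ceil(height / tile)):
        for col in range(math.ceil(width / tile)):
            x, y = col * tile, row * tile
            # Edge tiles are clipped to the raster bounds.
            yield (x, y, min(tile, width - x), min(tile, height - y))


windows = list(tile_windows(height=1000, width=1300))
print(len(windows))   # 6
print(windows[-1])    # (1024, 512, 276, 488)
```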

Mode × polarization coverage

The four chunk descriptors are declared once, mode-agnostic, and apply uniformly across every Level-1 GRD product type. Which polarizations are actually present on a given product depends on its acquisition mode and polarization scheme; absent polarizations resolve to an empty XQuery source (see Limitations).

| Mode | Product type | VV | VH | HH | HV | Validated live |
|---|---|---|---|---|---|---|
| IW (Interferometric Wide) | GRDH | | | | | yes (local S1C fixture, SDV) |
| EW (Extra Wide) | GRDM | | | | | ⚠️ no (no local EW fixture) |
| SM (Stripmap) | GRDH / GRDM | | | | | ⚠️ no (no local SM fixture) |

A single product carries at most a dual-pol pair (e.g. SDV → VV+VH, SDH → HH+HV) or a single polarization (SSV → VV, SSH → HH). available_chunks still lists all four chunk names for every GRD product, because they are declared on the shared product_level-1 ancestor, but apply() on an absent polarization raises DrbChunkError rather than returning silently wrong data.
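
Since the listing does not tell which polarizations a product really carries, a caller may want to check the product name before calling apply(). The sketch below is one way to do that, assuming the polarization code appears as a `1S<code>` token as in the product names used in this document; `expected_polarizations` is a hypothetical helper and is not part of the package.

```python
import re

# Polarization code -> chunk names, following the mapping given above.
SCHEMES = {"DV": ["VV", "VH"], "DH": ["HH", "HV"], "SV": ["VV"], "SH": ["HH"]}


def expected_polarizations(product_name):
    """Guess the polarizations from the '1SDV'-style token (hypothetical helper)."""
    match = re.search(r"_1S(DV|DH|SV|SH)_", product_name)
    if match is None:
        raise ValueError(f"no polarization token in {product_name!r}")
    return SCHEMES[match.group(1)]


print(expected_polarizations("S1C_IW_GRDH_1SDV_..._7B0C.SAFE"))  # ['VV', 'VH']
```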


Collection taxonomy

drb:collection carries the SAFE data family; drb:chunkName carries the addressing within it (polarization for GRD; swath×pol×burst-index for SLC). This release populates only measurement; the other collections are reserved names for future increments (see the design spec §5 for the full rationale).

| Collection | Content | Status | Increment |
|---|---|---|---|
| measurement | SAR image raster; GRD: 1 chunk/pol (detected, ground-range, uint16, regular 512² tiling) | implemented (this release) | 1 |
| bursts | SLC measurement re-viewed per burst (per swath), geometry from swathTiming/linesPerBurst + burstList | reserved (new annotation-driven TilingScheme) | 2 |
| calibration | Radiometric LUTs (sigma0/beta0/gamma/dn), coarse azimuth×range grid, per pol | reserved (candidate LUT/coarse-grid reader) | future / RFE |
| noise | Thermal-noise LUTs (range + azimuth), per pol | reserved | future / RFE |
| geolocation | geolocationGridPoint (lat/lon/height/incidence…) coarse grid | reserved | future / RFE |
| rfi | RFI detection/mitigation reports (flags) | out of scope (no regular array) | |
| preview | Reduced-resolution quicklook/browse | reserved | future |

How attachment works

cortex.ttl extends the Level-1 class by URI, targeting sentinel-1:product_level-1, the Level-1 root that every GRD (and SLC/WV) product inherits from. The package declares no setup.cfg entry-point group, so the resolver does not auto-discover the TTL on its own.

At runtime the chunk TTL must be merged into the topic graph in one of two ways (see Bootstrap):

  • In a Fuseki deployment, add the packaged TTL to the named-graph list loaded into the dataset.
  • In an offline setup, compose it together with the S1-SAFE topic TTLs in a single RDFDao([...]).

Once merged, drb-chunk reads the descriptors via get_dao(topic).graph (walking subClassOf upward from the resolved topic) and makes them available through the standard chunk API.
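
The upward subClassOf walk can be illustrated with a toy class graph in plain Python. This is only a sketch of the inheritance idea and does not use the drb or rdflib APIs; the parent class `product` and the helper `inherited_chunks` are hypothetical.

```python
# Toy class graph: child -> parent. Only product_level-1 is named as the
# attachment point in this document; the "product" parent is illustrative.
SUBCLASS_OF = {
    "product_level-1_iw_s": "product_level-1",
    "product_level-1": "product",
}
# Chunk descriptors keyed by the class that declares them.
DECLARED = {"product_level-1": ["VV", "VH", "HH", "HV"]}


def inherited_chunks(topic):
    """Collect chunk names declared on the topic or on any ancestor."""
    chunks, seen = [], set()
    while topic is not None and topic not in seen:
        seen.add(topic)  # guards against accidental cycles
        chunks.extend(DECLARED.get(topic, []))
        topic = SUBCLASS_OF.get(topic)
    return chunks


print(inherited_chunks("product_level-1_iw_s"))  # ['VV', 'VH', 'HH', 'HV']
```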

The packaged TTL path is returned by:

from drb.addons.chunk.sentinel1 import cortex_path
print(cortex_path())   # /path/to/drb/addons/chunk/sentinel1/cortex.ttl

Bootstrap

Mode 1 — Fuseki (recommended for production)

Set the environment variables before resolving any node:

export FUSEKI_URL=http://localhost:3030
export DATASET=drb
export DRB_FUSEKI_GRAPHS="http://drb.gael.fr/graph/kb/drbx-kb-topics-sentinel-1-safe/latest,http://drb.gael.fr/graph/kb/drbx-kb-topics-safe/latest"

DRB_FUSEKI_GRAPHS is a comma-separated list of Fuseki named-graph URIs; the chunk-descriptor graph (loaded from this package's cortex.ttl) must also be merged into the same dataset/graph set. No change to drb-fuseki itself is required — the chunk descriptor attaches to the pre-existing product_level-1 URI, so the KB is used exactly as shipped.

Mode 2 — Offline vendored TTL

Compose three files into a single RDFDao and register it before resolving. The base safe-topics.ttl is the owl:imports root that carries the subClassOf+ drb:item closure — loading only the Sentinel-1-specific TTL loses that closure and every product* class vanishes from the DAO:

from drb.topics.dao import ManagerDao
from drb.topics.dao.rdf_dao import RDFDao
from drb.addons.chunk.sentinel1 import cortex_path

safe_topics_ttl = "/path/to/vendored/safe-topics.ttl"
s1_safe_ttl = "/path/to/vendored/sentinel-1-safe-topics.ttl"
ManagerDao().add_dao_instance(
    RDFDao([safe_topics_ttl, s1_safe_ttl, cortex_path()]))

See examples/demo_s1_grd_chunk.py for a complete runnable demonstration of both modes, and examples/RESULTS-grd.md for a captured run against the local S1C IW GRDH fixture.


Worked example

import numpy
from drb.topics import resolver
from drb.addons.addon import AddonManager
from drb.chunk.selection import WindowSelection

# 1. Resolve the product (bootstrap the KB topics first; see Bootstrap).
#    resolver.create() path-walks a nested "x.SAFE.zip/x.SAFE" URL directly.
node = resolver.create(
    "/data/S1C_IW_GRDH_1SDV_..._7B0C.SAFE.zip/"
    "S1C_IW_GRDH_1SDV_..._7B0C.SAFE")
topic = resolver.resolve(node)[0]
print(topic.uri)
#   http://knowledge-base.gael.fr/drb/sentinel-1/product_level-1_iw_s

# 2. Discover the chunks, by collection, declared for this topic
#    (inherited from product_level-1 -- no GRD-specific topic class exists)
addon = AddonManager().get_addon("chunk")
print(addon.available_collections(topic))
#   {'measurement': ['VV', 'VH', 'HH', 'HV']}

# 3. Build a chunk and read a 512x512 window
chunk = addon.apply(node, chunk_name="VV", topic=topic)
window = WindowSelection(x=0, y=0, w=512, h=512)
array = chunk.select(window).get_impl(numpy.ndarray)
print(array.shape, array.dtype)   # (1, 512, 512) uint16

SLC bursts

Single Look Complex (SLC) products are resolved as product_level-1_iw_s[_abc] (a satellite-letter class) and carry burst-indexed chunks in the bursts collection. Each chunk is identified by its swath and polarization (e.g., IW1_VV, IW2_HH); the bursts inside it are enumerated dynamically at runtime by reading burstList from the SLC annotation XML.

Chunk addressing and geometry

An SLC burst chunk is addressed as <SWATH>_<POL> (e.g., IW1_VV):

| Chunk name | Content | Tiling scheme | dtype | Sample rate |
|---|---|---|---|---|
| IW1_VV | Swath 1, VV polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |
| IW1_VH | Swath 1, VH polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |
| IW2_VV | Swath 2, VV polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |
| IW2_VH | Swath 2, VH polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |
| IW3_VV | Swath 3, VV polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |
| IW3_VH | Swath 3, VH polarization, per-burst complex SAR image | Burst-indexed (N bursts × 2D) | complex64 | native |

Burst geometry is dynamic: linesPerBurst and samplesPerBurst are read from the product's annotation XML, and burst windows are assembled from the burstList offsets. All bursts within a swath carry the same sample count; line counts may vary per-burst within a swath (nominal case: all equal; edge bursts occasionally shorter).
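
As a rough illustration of how burst windows follow from the annotation, the sketch below parses a synthetic, heavily trimmed swathTiming fragment with the standard library. It is not the package's reader: it assumes the nominal case where every burst has linesPerBurst lines, it ignores the per-burst offsets, and `burst_windows` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Synthetic fragment for illustration only (not taken from a real product).
ANNOTATION = """
<product>
  <swathTiming>
    <linesPerBurst>500</linesPerBurst>
    <samplesPerBurst>1296</samplesPerBurst>
    <burstList count="3">
      <burst/>
      <burst/>
      <burst/>
    </burstList>
  </swathTiming>
</product>
"""


def burst_windows(annotation_xml):
    """Return ((line_start, line_end), (sample_start, sample_end)) per burst."""
    timing = ET.fromstring(annotation_xml.strip()).find("swathTiming")
    lines = int(timing.findtext("linesPerBurst"))
    samples = int(timing.findtext("samplesPerBurst"))
    count = len(timing.findall("burstList/burst"))
    # Nominal case: bursts are stacked one after another along the line axis.
    return [((i * lines, (i + 1) * lines), (0, samples)) for i in range(count)]


for window in burst_windows(ANNOTATION):
    print(window)
```

With these made-up values the first window is ((0, 500), (0, 1296)), the same shape of tuple as the window metadata key described further down.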

Addressing bursts via IselSelection(per_dim={"burst": i})

The chunk is a multi-dimensional array with the burst index as the first dimension (e.g., shape (N, lines, samples) for a complex64 image). Bursts are enumerated via chunk.tiles() (yields (0,), (1,), ..., (N-1,) tuples) and accessed individually via chunk.tile((i,)) or via selection:

from drb.chunk.selection import IselSelection
import numpy

# addon, node and topic are obtained as in the SLC worked example.
# Enumerate bursts for swath IW1, polarization VV
chunk = addon.apply(node, chunk_name="IW1_VV", topic=topic)
burst_count = len(list(chunk.tiles()))  # e.g. 9

# Read burst 3 as a complex64 numpy array
burst3 = chunk.select(
    IselSelection(per_dim={"burst": 3})).get_impl(numpy.ndarray)
print(burst3.shape)  # (1, lines, samples)

Kept metadata per burst

Each burst reference holds metadata (accessed via chunk.tile((i,)).info):

| Key | Example | Type | Notes |
|---|---|---|---|
| burstIndex | 3 | int | 0-based burst index within the swath |
| burstIdAbsolute | 900015 | int | Product-relative burst ID from burstList/@absolute |
| azimuthTime | 2026-01-20T00:10:05.123456 | str | Burst acquisition timestamp (ISO 8601) |
| byteOffset | 245760 | int | Byte offset of the burst's first line in the measurement file |
| window | ((0, 500), (0, 1296)) | tuple | Line and sample indices bounding the burst in the measurement raster (line_start:line_end, sample_start:sample_end) |
| footprint | POLYGON((20 10, 21 10, ...)) | str | Burst's geolocation footprint (WKT POLYGON) derived from the coarse geolocation grid |

Deferred metadata

The firstValidSample / lastValidSample arrays (per-line validity masks) and the per-burst radiometric data (calibration, noise LUTs) are not yet exposed as chunks — they are reserved for future increments. Descriptive metadata (incidence angle, Doppler, etc.) is also deferred.

Bootstrap (same as GRD)

SLC chunk descriptors are merged into the same topic graph as GRD chunks. The bootstrap procedure (Fuseki or offline vendored TTL) is identical; see Bootstrap.

Worked example

import numpy
from drb.topics import resolver
from drb.addons.addon import AddonManager
from drb.chunk.selection import IselSelection

# 1. Resolve an SLC product.
node = resolver.create(
    "s3://bucket/S1A_IW_SLC__1SDV_20260120T001002_..._1234.SAFE/"
    "measurement/s1a-iw1-slc-vv-20260120t001002-...-001.tiff")
topic = resolver.resolve(node)[0]
print(topic.uri)
#   http://knowledge-base.gael.fr/drb/sentinel-1/product_level-1_iw_s_a

# 2. Get the chunk, enumerating bursts at runtime.
addon = AddonManager().get_addon("chunk")
chunk = addon.apply(node, chunk_name="IW1_VV", topic=topic)
bursts = list(chunk.tiles())
print(f"IW1_VV: {len(bursts)} bursts")  # e.g. "9 bursts"

# 3. Read burst metadata and data.
ref = chunk.tile((0,))
print(ref.window)         # e.g. ((0, 500), (0, 1296))
print(ref.info['azimuthTime'])  # 2026-01-20T00:10:02.123456
print(ref.info['footprint'])    # POLYGON((20 10, 21 10, ...))

# 4. Read the burst's full data as a numpy array.
burst0 = chunk.select(
    IselSelection(per_dim={"burst": 0})).get_impl(numpy.ndarray)
print(burst0.shape, burst0.dtype)  # (1, 500, 1296) complex64

See examples/demo_s1_slc_burst.py for a complete runnable demonstration against the offline synthetic fixture, and examples/RESULTS-slc.md for a captured run.


Limitations

  • Chunks are listed even when not materialisable on a given product. Because all four polarization chunks (and, from increment 2, the SLC bursts) are declared once on the shared product_level-1 ancestor, available_chunks/available_collections on any Level-1 product (GRD, SLC, or WV) lists all of them, whether or not that specific product carries that polarization or product type. apply(chunk_name=...) raises DrbChunkError if the chunk's XQuery source matches nothing in the given product — never silently wrong data.
  • EW GRDM / SM GRD are not validated against a live product in this release — only IW GRDH was exercised end-to-end (no local EW/SM fixture available). The chunk descriptors are mode-agnostic and unit-parsed (tests/test_cortex_ttl.py), so EW/SM are expected to work identically, but this is a documented gap, not a claim. See examples/RESULTS-grd.md.
  • Windowing is correct but not a true partial network fetch on every medium. Locally (plain file or zip member) a 512×512 window reads back in ~0.02 s with no measurable zip overhead. Over S3, GRD's raster layout (striped, 1-line blocks, no internal tiling, no overviews) caps efficiency at line-band granularity — see docs/evolution-requests.md RFE 1.
  • Single-tile materialisation: each apply / select call opens one measurement GeoTIFF. Multi-tile mosaicking is not supported in this release (existing drb-chunk v1 deferral, tracked in docs/evolution-requests.md).
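
The line-band granularity mentioned in the windowing limitation can be put into numbers with a back-of-the-envelope sketch. It assumes an uncompressed uint16 raster whose blocks are single full lines; the raster width is a hypothetical value and both helper functions are illustrative, not part of the package.

```python
def bytes_fetched_striped(raster_width, window_height, itemsize=2):
    """Bytes read when every block is one full raster line (uncompressed)."""
    return raster_width * window_height * itemsize


def bytes_needed(window_width, window_height, itemsize=2):
    """Bytes the requested window itself contains."""
    return window_width * window_height * itemsize


width = 25_000  # hypothetical GRD raster width in samples
fetched = bytes_fetched_striped(width, 512)
needed = bytes_needed(512, 512)
print(fetched, needed, round(fetched / needed, 1))  # 25600000 524288 48.8
```

Under these assumptions a 512×512 window pulls roughly 49 times more bytes than it returns, which is why a remote read cannot be a true partial fetch on such a layout.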

Regenerating the descriptor

The cortex.ttl is generated from a small polarization table embedded in _generate.py. To regenerate after editing the table:

python -m drb.addons.chunk.sentinel1._generate

The file is written next to the module (i.e. drb/addons/chunk/sentinel1/cortex.ttl). Commit the result.


Installation

pip install drb-chunk-sentinel1

This package's only declared runtime dependency is rdflib (it ships a generated cortex.ttl descriptor, see requirements.txt). The consuming environment is responsible for providing a working drb stack — drb, drb-chunk, drb-extractor, drb-driver-image, rasterio, numpy (listed in requirements-test.txt for dev/CI) — plus either a Fuseki instance or the vendored S1 topic TTLs at runtime (see Bootstrap). Resolving a .SAFE.zip or S3-hosted product additionally requires drb-driver-zip / drb-driver-s3, which are not declared dependencies of this package either.
