A tool for reading TSG spectra data into Xarray.

tsg-xr: A tool for loading TSG datasets into Xarray

The file format associated with The Spectral Geologist™ (and specifically Hylogger™ datasets which have been processed with the software) consists of an ensemble of files:

  • Binary data files containing spectra, high resolution imagery and profilometer data
  • Configuration files (principally text, similar in format to TOML)
  • Low resolution core imagery exports (hole overview, per-tray imagery; as JPEG images with associated markup)

tsg-xr relies heavily on the file reader of pytsg to access these data, and presents them in an Xarray format to condense the otherwise complex arrangement. Here pytsg provides an efficient interface to the binary components of the TSG file format, and tsg-xr largely just arranges this into a condensed data structure which allows easier subsequent use (and serialization to indexable formats, e.g. Zarr).

Usage

tsg-xr is intended to be used to read directories containing ensembles of TSG files; to do so, just point the load_tsg function at the appropriate directory:

from tsgxr import load_tsg

ds = load_tsg("./Hylogger_Hole_42")

Key array-based data can be accessed directly from this xarray.Dataset object:

ds.Spectra
ds.Image
ds.Lidar

For example, to extract and plot the first metre of core imagery:

import matplotlib.pyplot as plt

ds.Image.sel(depth=slice(0, 1)).plot.imshow(yincrease=False)
plt.gca().set(aspect="equal")  # fix the aspect ratio

Similarly, to plot the spectra from a specific interval (e.g. 9.2 to 9.3 m here) against wavelength, you can provide a slice to the xarray.DataArray.sel method. Note the holedepth coordinate used here, which is associated with spectral samples, as opposed to the depth coordinate associated with RGB imagery; these are currently separate indexes:

ds.Spectra.sel(holedepth=slice(9.2, 9.3)).plot.scatter(x='wavelength', add_legend=False, color='k', alpha=0.5, s=2)
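To illustrate how the two separate depth indexes behave, here is a minimal sketch using a synthetic dataset. It assumes only the variable and coordinate names used above (Spectra, Image, holedepth, depth, wavelength); the array sizes, values, and the extra "x" and "band" dimension names are made up for illustration and do not reflect what load_tsg actually returns.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a loaded dataset; sizes and values are invented.
holedepth = np.arange(0.0, 10.0, 0.5)        # spectral sample depths (m)
wavelength = np.linspace(400.0, 2500.0, 8)   # wavelengths (nm)
depth = np.linspace(0.0, 10.0, 200)          # image row depths (m)

ds = xr.Dataset(
    {
        "Spectra": (
            ("holedepth", "wavelength"),
            np.random.rand(holedepth.size, wavelength.size),
        ),
        # "x" and "band" are assumed dimension names for this sketch
        "Image": (("depth", "x", "band"), np.zeros((depth.size, 4, 3))),
    },
    coords={"holedepth": holedepth, "wavelength": wavelength, "depth": depth},
)

# Each array is sliced along its own depth index; label-based slices
# include both end points when they match coordinate values.
spectra_interval = ds.Spectra.sel(holedepth=slice(9.0, 9.5))
image_interval = ds.Image.sel(depth=slice(0, 1))

print(spectra_interval.sizes)
print(image_interval.sizes)
```

Because the two coordinates are independent, a slice on holedepth leaves the imagery untouched, and vice versa.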

Scalars and other spectral features are also available; spectral feature (centre, depth, width) data is grouped for brevity:

ds.Centres
ds.Depths
ds.Widths
ds["Grp1 sTSAS"]
...
ds["Min1 sTSAS"]
...
ds["Wt1 sTSAS"]
...

Configuration related to integer-encoding of sample data is also included in the dataset attributes:

ds.attrs
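As a rough sketch of what working with an integer encoding can look like, the snippet below maps integer codes back to labels. The attribute layout, the codes and the label names are all invented for illustration; inspect ds.attrs on a real dataset to see the keys and structure that tsg-xr actually stores.

```python
# Hypothetical attribute layout, invented for illustration only;
# a real dataset's ds.attrs may be organised quite differently.
attrs = {"Min1 sTSAS": {0: "NULL", 1: "mineral_a", 2: "mineral_b"}}

# Made-up integer codes, standing in for an integer-encoded scalar.
codes = [1, 1, 2, 0, 2]

# Translate each code to its label, falling back to "unknown".
lookup = attrs["Min1 sTSAS"]
labels = [lookup.get(code, "unknown") for code in codes]
print(labels)
```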

Installation

The tsg-xr package can be installed standalone into your local environment using pip, or you can create an environment with related dependencies using Anaconda (useful for a development scenario, or if you're only using the tool for a single project).

Option 1: Standalone Installation

The package can be installed directly from GitHub using pip:

pip install git+https://github.com/CSIRO-GeoscienceAnalytics/tsg-xr

Option 2: Set up an Environment

An environment.yml file is included in this repository, so you can create a conda environment if you use an Anaconda distribution of some form. After cloning this repository and navigating to this directory, the environment can be created as follows:

conda env create -f environment.yml

Alternatively, if you have mamba installed locally (encouraged), you can get there faster with:

mamba env create -f environment.yml

Command Line Interface

Converting TSG files to Zarr

A minimal command line interface exists for conversion of TSG files to Zarr archives. A selection of configuration options is available from the command line, and these are listed in the help menu:

tsgxr tsg2zarr --help

Basic usage is as follows, where <Path> refers to either i) an individual TSG scalars file (.tsg), ii) a Hylogger TSG directory, or iii) a directory containing multiple Hylogger TSG directories (multiple datasets can be converted simultaneously):

tsgxr tsg2zarr <Path>
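To make the three kinds of input concrete, the following sketch guesses which kind a given path is before you pass it to the command line. classify_tsg_path is a hypothetical helper and not part of tsg-xr, and the rule it uses (a dataset directory is one that directly contains a .tsg file) is an assumption made for illustration; the tool's own detection logic may differ.

```python
import tempfile
from pathlib import Path


def _has_tsg_file(directory: Path) -> bool:
    """Return True if the directory directly contains a .tsg file."""
    return any(
        p.is_file() and p.suffix.lower() == ".tsg" for p in directory.iterdir()
    )


def classify_tsg_path(path) -> str:
    """Hypothetical helper: guess which of the three input kinds `path` is."""
    path = Path(path)
    if path.is_file():
        return "scalars_file" if path.suffix.lower() == ".tsg" else "unknown"
    if path.is_dir():
        if _has_tsg_file(path):
            return "dataset_directory"
        if any(c.is_dir() and _has_tsg_file(c) for c in path.iterdir()):
            return "collection_directory"
    return "unknown"


# Build a throwaway directory tree (names are made up) and classify each level.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    hole = root / "Hylogger_Hole_42"
    hole.mkdir()
    scalars = hole / "Hole_42.tsg"
    scalars.touch()
    results = {
        "file": classify_tsg_path(scalars),
        "dataset": classify_tsg_path(hole),
        "collection": classify_tsg_path(root),
    }

print(results)
```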

Outputs are by default added to the Hylogger TSG directories themselves, but can be optionally collated into a separate directory; outputs will use the hole name extracted from the TSG dataset and be specific to the spectra specified (NIR or TIR):

tsgxr tsg2zarr <Path> --output_dir "./collated_zarr_archives/"

Note that by default, this will create zipped Zarr archives. These can be directly opened in e.g. Xarray.
