Download and load soil spectral data
SoilSpecData
A Python package for handling soil spectroscopy data, with a focus on the Open Soil Spectral Library (OSSL).
Installation
pip install soilspecdata
If you want to install the development version, run in the project root:
pip install -e ".[dev]"
Features
- Easy loading and handling of the OSSL dataset
- Support for both VISNIR (Visible Near-Infrared) and MIR (Mid-Infrared) spectral data
- Flexible wavelength range filtering
- Convenient access to soil properties and metadata
- Automatic caching of downloaded data
- Get aligned spectra and target variable(s)
- Further datasets to come …
Quick Start
# Import the package
from soilspecdata.datasets.ossl import get_ossl
Load the OSSL dataset:
ossl = get_ossl()
- Get MIR spectra (600-4000 cm⁻¹):
mir_data = ossl.get_mir(require_valid=True)
- Get VISNIR spectra with custom wavelength range:
visnir_data = ossl.get_visnir(wmin=500, wmax=1000, require_valid=True)
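To illustrate what a wavelength range filter does conceptually, here is a minimal NumPy sketch. It is not the package's implementation: the helper name `filter_range` is hypothetical, and treating `wmin` and `wmax` as inclusive bounds is an assumption.

```python
import numpy as np


def filter_range(wavelengths, spectra, wmin, wmax):
    """Keep only the columns whose wavelength lies in [wmin, wmax].

    Hypothetical helper; inclusive bounds are an assumption.
    """
    mask = (wavelengths >= wmin) & (wavelengths <= wmax)
    return wavelengths[mask], spectra[:, mask]


# Toy data: 2 samples measured at 400, 500, ..., 1100
wavelengths = np.arange(400, 1101, 100)
spectra = np.arange(16, dtype=float).reshape(2, 8)

w, s = filter_range(wavelengths, spectra, wmin=500, wmax=1000)
```

The same boolean mask is applied to the wavelength axis and to the spectra columns, so the two stay in sync after filtering.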
- Get soil properties (e.g., CEC):
properties = ossl.get_properties(['cec_usda.a723_cmolc.kg'], require_complete=True)
For more details on the OSSL dataset and its variables, see the OSSL documentation.
- Get metadata (e.g., geographical coordinates):
metadata = ossl.get_properties(['longitude.point_wgs84_dd', 'latitude.point_wgs84_dd'], require_complete=False)
- Or get aligned spectra and target variable(s) directly:
X, y, ids = ossl.get_aligned_data(
    spectra_data=mir_data,
    target_cols='cec_usda.a723_cmolc.kg'
)
X.shape, y.shape, ids.shape
((57062, 1701), (57062, 1), (57062,))
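For intuition, aligning spectra with a target variable roughly means keeping only the samples that have both a spectrum and a non-missing target value, in matching row order. The sketch below shows that idea on toy data. The function `align_sketch` and the dictionary of targets are hypothetical and are not how `get_aligned_data` is necessarily implemented.

```python
import numpy as np


def align_sketch(spectra, spectra_ids, targets):
    """Return (X, y, ids) for samples with a spectrum and a valid target.

    `targets` maps sample ID -> float value (NaN marks a missing value).
    Hypothetical helper for illustration only.
    """
    keep = [
        i for i, sid in enumerate(spectra_ids)
        if sid in targets and not np.isnan(targets[sid])
    ]
    X = spectra[keep]
    y = np.array([targets[spectra_ids[i]] for i in keep], dtype=float).reshape(-1, 1)
    ids = np.asarray(spectra_ids)[keep]
    return X, y, ids


spectra = np.array([[0.1, 0.2, 0.3],
                    [0.4, 0.5, 0.6],
                    [0.7, 0.8, 0.9]])
spectra_ids = ["a", "b", "c"]
# "b" has a missing target; "d" has no spectrum
targets = {"a": 12.5, "b": float("nan"), "c": 8.0, "d": 3.0}

X, y, ids = align_sketch(spectra, spectra_ids, targets)
```

As in the output above, `X` and `y` share the same first dimension and `y` keeps a trailing axis of length 1 for a single target.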
- Plot the first 20 MIR spectra:
from matplotlib import pyplot as plt
plt.figure(figsize=(12, 3))
plt.plot(mir_data.wavenumbers, mir_data.spectra[:20,:].T, alpha=0.3, color='steelblue', lw=1)
plt.gca().invert_xaxis()
plt.grid(True, linestyle='--', alpha=0.7)
plt.xlabel('Wavenumber (cm⁻¹)')
plt.ylabel('Absorbance');
Data Structure
The package returns spectral data in a structured format containing:
- Wavenumbers
- Spectra measurements
- Measurement type (reflectance/absorbance)
- Sample IDs
Properties and metadata are returned as pandas DataFrames indexed by sample ID.
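One way to picture such a container is the dataclass below. This is only a sketch: `wavenumbers` and `spectra` match the attribute names used in the Quick Start, while the class name `SpectraSketch` and the fields `measurement_type` and `sample_ids` are assumed names that may differ from the package's own.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SpectraSketch:
    """Hypothetical stand-in for the structured spectra object."""
    wavenumbers: np.ndarray    # shape (n_features,)
    spectra: np.ndarray        # shape (n_samples, n_features)
    measurement_type: str      # "reflectance" or "absorbance"
    sample_ids: np.ndarray     # shape (n_samples,)

    def __post_init__(self):
        # Check that the axes of the arrays agree with each other
        n_samples, n_features = self.spectra.shape
        if len(self.wavenumbers) != n_features:
            raise ValueError("wavenumbers must match the spectra columns")
        if len(self.sample_ids) != n_samples:
            raise ValueError("sample_ids must match the spectra rows")


# 600-4000 in steps of 2 gives 1701 features, the column count shown in the Quick Start
demo = SpectraSketch(
    wavenumbers=np.arange(600, 4001, 2),
    spectra=np.zeros((3, 1701)),
    measurement_type="absorbance",
    sample_ids=np.array(["s1", "s2", "s3"]),
)
```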
Cache Management
By default, the OSSL dataset is cached in ~/.soilspecdata/. To force a fresh download:
ossl = get_ossl(force_download=True)
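The general download-then-cache pattern behind a flag like `force_download` can be sketched with the standard library alone. Everything here is illustrative: `get_cached`, `fake_fetch` and the file name `ossl.bin` are hypothetical, and the package's actual cache layout may differ.

```python
import tempfile
from pathlib import Path


def get_cached(path, fetch, force_download=False):
    """Return the cached bytes, calling `fetch` only when needed."""
    if force_download or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())
    return path.read_bytes()


calls = []


def fake_fetch():
    # Stands in for a real download; records each invocation
    calls.append(1)
    return b"spectra"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / ".soilspecdata" / "ossl.bin"  # hypothetical file name
    get_cached(target, fake_fetch)                       # first call: fetches
    get_cached(target, fake_fetch)                       # second call: cache hit
    data = get_cached(target, fake_fetch, force_download=True)  # fetches again

n_fetches = len(calls)
```

In this sketch the fetch function runs only on the first call and on the forced call; the second call reads from disk.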
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
Apache License 2.0
Citation
TBC
File details
Details for the file soilspecdata-0.0.3.tar.gz.
File metadata
- Download URL: soilspecdata-0.0.3.tar.gz
- Upload date:
- Size: 12.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a9f92615981b88a4d743a32231935e0064c67abe1b9fef7b7d2525177c0095a |
| MD5 | e499224cc3dc9d1b76efa8411532fad2 |
| BLAKE2b-256 | 2195cf6391affc137aea2274c7a723d2177fe90edbabc82a025285842adeb14a |
File details
Details for the file soilspecdata-0.0.3-py3-none-any.whl.
File metadata
- Download URL: soilspecdata-0.0.3-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 92d27ef5001ad41bef4e1a78c975bd32e3318e5f06ffb32220e1a3e60981a72f |
| MD5 | c877bb783632f70198ed15c80b542582 |
| BLAKE2b-256 | 7f6589639934a04930a617e6be390912ee67aa660229526c0535711bbfd68d37 |