
Soil infrared spectra preprocessing utilities


SoilSpecTfm

Spectral Processing Tools for Soil Spectroscopy

By translating specialized soil spectroscopy methods into the scikit-learn framework, SoilSpecTfm and SoilSpecData connect this niche domain with Python’s vast machine learning ecosystem, making advanced ML/DL tools accessible to soil scientists.

Transforms implemented or planned so far (items marked [x] are available):

  • Baseline corrections:

    • [x] SNV: Standard Normal Variate
    • [x] MSC: Multiplicative Scatter Correction
    • Detrend: Detrend the spectrum (planned)
    • ALS: Asymmetric Least Squares baseline correction (planned)
  • Derivatives:

    • [x] TakeDerivative: Take derivative (1st, 2nd, etc.) of the spectrum and apply Savitzky-Golay smoothing
    • GapSegmentDerivative: (planned)
  • Smoothing:

  • Other transformations:

    • [x] ToAbsorbance: Transform the spectrum to absorbance
    • [x] Resample: Resample the spectrum to a new wavenumber range
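To make the list above concrete, here is a minimal NumPy/SciPy sketch of what SNV, MSC and a Savitzky-Golay derivative compute. It follows the textbook definitions of these transforms and is not taken from the SoilSpecTfm source; the function names `snv`, `msc` and `sg_derivative` and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(X):
    """Standard Normal Variate: center and scale each spectrum (row) on its own."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def msc(X, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum.

    The reference defaults to the mean spectrum (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    out = np.empty_like(X)
    for i, x in enumerate(X):
        # Fit x ≈ intercept + slope * ref, then undo offset and scaling
        slope, intercept = np.polyfit(ref, x, deg=1)
        out[i] = (x - intercept) / slope
    return out


def sg_derivative(X, window_length=11, polyorder=2, deriv=1):
    """Savitzky-Golay smoothed derivative along the wavenumber axis."""
    return savgol_filter(X, window_length=window_length,
                         polyorder=polyorder, deriv=deriv, axis=1)


# Toy spectra: one base shape seen with different scalings and offsets
base = np.sin(np.linspace(0, 3 * np.pi, 200)) + 2.0
X = np.array([a * base + b for a, b in [(1.0, 0.0), (1.5, 0.3), (0.7, -0.2)]])

X_snv = snv(X)           # each row now has zero mean and unit standard deviation
X_msc = msc(X)           # each row is mapped back onto the mean spectrum
X_d1 = sg_derivative(X)  # first derivative, same shape as X
```

Because the toy spectra differ only by a scale and an offset, both SNV and MSC collapse them onto a single curve, which is the scatter effect these corrections are meant to remove.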

Key Features:

  • Seamless integration with scikit-learn’s machine learning ecosystem
  • Complements the SoilSpecData package for soil spectroscopy workflows
  • Pipeline-ready transformers with consistent API

All transformers follow scikit-learn conventions:

  • Implement fit/transform interface
  • Support get_params/set_params for GridSearchCV
  • Provide detailed documentation and examples
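As a rough illustration of these conventions, the sketch below defines a small transformer on top of scikit-learn's `BaseEstimator` and `TransformerMixin`. The class `RowScaler` and its `with_std` parameter are hypothetical and are not part of SoilSpecTfm; they only show how `fit`/`transform` and `get_params`/`set_params` fit together.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class RowScaler(BaseEstimator, TransformerMixin):
    """Hypothetical row-wise scaler, used only to illustrate the conventions."""

    def __init__(self, with_std=True):
        # scikit-learn expects __init__ to only store its arguments
        self.with_std = with_std

    def fit(self, X, y=None):
        # Stateless transform: there is nothing to learn from X
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        out = X - X.mean(axis=1, keepdims=True)
        if self.with_std:
            out = out / X.std(axis=1, keepdims=True)
        return out


X = np.random.default_rng(0).normal(size=(5, 50))

# get_params is inherited from BaseEstimator and reflects the __init__ arguments
params = RowScaler().get_params()

# Inside a Pipeline, parameters are addressed as <step name>__<parameter>
pipe = Pipeline([("scale", RowScaler())])
pipe.set_params(scale__with_std=False)
X_tfm = pipe.fit_transform(X)
```

The `<step name>__<parameter>` naming is the same one GridSearchCV uses in its parameter grid, which is why exposing every option through `__init__` matters.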

Installation

pip install soilspectfm

Quick Start

from soilspectfm.core import (SNV, 
                              TakeDerivative, 
                              ToAbsorbance, 
                              Resample, 
                              WaveletDenoise)

from sklearn.pipeline import Pipeline

Loading OSSL dataset

Let’s use the OSSL dataset as an example, loaded with the SoilSpecData package.

from soilspecdata.datasets.ossl import get_ossl
ossl = get_ossl()
mir_data = ossl.get_mir()

Preprocessing pipeline

Transforms are fully compatible with scikit-learn and can be used in a pipeline as follows:

pipe = Pipeline([
    ('snv', SNV()), # Standard Normal Variate transformation
    ('denoise', WaveletDenoise()), # Wavelet denoising
    ('deriv', TakeDerivative(window_length=11, polyorder=2, deriv=1)) # First derivative
])

X_tfm = pipe.fit_transform(mir_data.spectra)

Quick visualization

from soilspectfm.visualization import plot_spectra
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 7))

ax1 = plot_spectra(
    mir_data.spectra, 
    mir_data.wavenumbers,
    ax=ax1,
    ascending=False,
    color='black',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='Raw Spectra'
)

ax2 = plot_spectra(
    X_tfm,
    mir_data.wavenumbers,
    ax=ax2,
    ascending=False,
    color='steelblue',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='SNV + Wavelet Denoising + 1st Derivative Transformed Spectra'
)

plt.tight_layout()

Dependencies

  • fastcore
  • numpy
  • scipy
  • scikit-learn
  • matplotlib


Contributing

Developer guide

If you are new to nbdev, here are some useful pointers to get you started.

Install soilspectfm in development mode:

# make sure the soilspectfm package is installed in development mode
$ pip install -e .

# make changes under the nbs/ directory
# ...

# run nbdev_prepare so the changes apply to the soilspectfm package
$ nbdev_prepare

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Support

For questions and support, please open an issue on GitHub.
