Soil infrared spectra preprocessing utilities

SoilSpecTfm

Spectral Processing Tools for Soil Spectroscopy

By translating specialized soil spectroscopy methods into the scikit-learn framework, SoilSpecTfm and SoilSpecData connect this niche domain with Python’s vast machine learning ecosystem, making advanced ML/DL tools accessible to soil scientists.

Transforms implemented so far ([x]) or planned:

  • Baseline corrections:

    • [x] SNV: Standard Normal Variate
    • [x] MSC: Multiplicative Scatter Correction
    • Detrend: Detrend the spectrum (planned)
    • ALS: Detrend the spectrum using Asymmetric Least Squares (planned)
  • Derivatives:

    • [x] TakeDerivative: Take derivative (1st, 2nd, etc.) of the spectrum and apply Savitzky-Golay smoothing
    • GapSegmentDerivative: (planned)
  • Smoothing:

  • Other transformations:

    • [x] ToAbsorbance: Transform the spectrum to absorbance
    • [x] Resample: Resample the spectrum to a new wavenumber range
    • [x] Trim: Trim the spectrum to a specific wavenumber range
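
To make two of the transforms listed above concrete, here is a minimal NumPy/SciPy sketch of what SNV and a Savitzky-Golay derivative compute. It follows the textbook definitions and is not taken from the package source, so details such as the degrees of freedom used for the standard deviation may differ from SoilSpecTfm's own SNV and TakeDerivative. The function names `snv` and `sg_derivative` and the synthetic data are illustrative assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def sg_derivative(spectra, window_length=11, polyorder=2, deriv=1):
    """Savitzky-Golay smoothed derivative taken along the wavenumber axis."""
    return savgol_filter(spectra, window_length, polyorder, deriv=deriv, axis=1)


# Synthetic stand-in for real spectra: 5 samples x 200 wavenumbers
rng = np.random.default_rng(0)
X = rng.random((5, 200)) + np.linspace(0, 1, 200)

X_snv = snv(X)               # each row now has mean 0 and unit standard deviation
X_d1 = sg_derivative(X_snv)  # first derivative, same shape as the input
```

SNV works row by row, so it needs no fitting step, whereas MSC relies on a reference spectrum estimated from a set of spectra.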

Key Features:

  • Seamless integration with scikit-learn’s machine learning ecosystem
  • Complements the SoilSpecData package for soil spectroscopy workflows
  • Pipeline-ready transformers with consistent API

All transformers follow scikit-learn conventions:

  • Implement fit/transform interface
  • Support get_params/set_params for GridSearchCV
  • Provide detailed documentation and examples
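
As an illustration of what these conventions allow, the sketch below defines a hypothetical transformer, `SGDerivative`, and tunes one of its parameters with GridSearchCV. It is a stand-in written for this example, not the package's TakeDerivative, and the synthetic data and the Ridge model are assumptions chosen only to keep the example self-contained.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class SGDerivative(BaseEstimator, TransformerMixin):
    """Hypothetical Savitzky-Golay derivative transformer (illustration only)."""

    def __init__(self, window_length=11, polyorder=2, deriv=1):
        # Storing arguments unchanged is what makes get_params/set_params work
        self.window_length = window_length
        self.polyorder = polyorder
        self.deriv = deriv

    def fit(self, X, y=None):
        return self  # stateless: nothing is learned from the data

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return savgol_filter(X, self.window_length, self.polyorder,
                             deriv=self.deriv, axis=1)


# Synthetic data: 40 smooth "spectra" and a target tied to a local slope
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100)).cumsum(axis=1)
y = X[:, 50] - X[:, 40] + rng.normal(scale=0.1, size=40)

pipe = Pipeline([('deriv', SGDerivative()), ('model', Ridge(alpha=1.0))])

# The step name and parameter name are joined with a double underscore
grid = GridSearchCV(pipe, {'deriv__window_length': [5, 11, 21]}, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```

Because preprocessing parameters are exposed this way, they can be tuned jointly with the model's hyperparameters in a single cross-validated search.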

Installation

pip install soilspectfm

Quick Start

from soilspectfm.core import (SNV, 
                              TakeDerivative, 
                              ToAbsorbance, 
                              Resample, 
                              WaveletDenoise)

from sklearn.pipeline import Pipeline

Loading OSSL dataset

As an example, let’s load the OSSL dataset with the SoilSpecData package.

from soilspecdata.datasets.ossl import get_ossl
ossl = get_ossl()
mir_data = ossl.get_mir()

Preprocessing pipeline

Transforms are fully compatible with scikit-learn and can be used in a pipeline as follows:

pipe = Pipeline([
    ('snv', SNV()), # Standard Normal Variate transformation
    ('denoise', WaveletDenoise()), # Wavelet denoising
    ('deriv', TakeDerivative(window_length=11, polyorder=2, deriv=1)) # First derivative
])

X_tfm = pipe.fit_transform(mir_data.spectra)
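
A fitted pipeline can also be reused on new data, which matters for transforms that learn something during fit. The sketch below uses a hypothetical `MeanRefMSC` class, a textbook multiplicative scatter correction that takes the mean training spectrum as its reference. It is an assumption made for illustration and may differ from the package's MSC. The synthetic spectra are offset and scaled copies of one base curve, so a correct MSC should map every held-out spectrum onto the training reference.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class MeanRefMSC(BaseEstimator, TransformerMixin):
    """Hypothetical MSC using the mean training spectrum as reference."""

    def fit(self, X, y=None):
        # The reference spectrum is learned from the training set only
        self.reference_ = np.asarray(X, dtype=float).mean(axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # Fit x = intercept + slope * reference for every spectrum at once
        A = np.column_stack([self.reference_, np.ones_like(self.reference_)])
        coef, *_ = np.linalg.lstsq(A, X.T, rcond=None)
        slope, intercept = coef[0], coef[1]
        return (X - intercept[:, None]) / slope[:, None]


# Synthetic spectra: one base curve with a random offset and gain per sample
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 6, 150)) + 2.0
offsets = rng.uniform(-0.5, 0.5, size=30)
gains = rng.uniform(0.5, 1.5, size=30)
X = offsets[:, None] + gains[:, None] * base
X_train, X_test = X[:20], X[20:]

msc_pipe = Pipeline([('msc', MeanRefMSC())])
msc_pipe.fit(X_train)                     # learn the reference on training data
X_test_tfm = msc_pipe.transform(X_test)   # apply it unchanged to held-out data
```

Fitting on the training split and only transforming the held-out split keeps information from the test spectra out of the preprocessing step.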

Quick visualization

from soilspectfm.visualization import plot_spectra
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 7))

ax1 = plot_spectra(
    mir_data.spectra, 
    mir_data.wavenumbers,
    ax=ax1,
    ascending=False,
    color='black',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='Raw Spectra'
)

ax2 = plot_spectra(
    X_tfm,
    mir_data.wavenumbers,
    ax=ax2,
    ascending=False,
    color='steelblue',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='SNV + Wavelet Denoising + 1st Derivative Transformed Spectra'
)

plt.tight_layout()

Dependencies

  • fastcore
  • numpy
  • scipy
  • scikit-learn
  • matplotlib

Further references

Contributing

Developer guide

If you are new to using nbdev, here are some useful pointers to get you started.

Install soilspectfm in development mode:

# make sure the soilspectfm package is installed in development mode
$ pip install -e .

# make changes under the nbs/ directory
# ...

# compile so that the changes apply to soilspectfm
$ nbdev_prepare

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Support

For questions and support, please open an issue on GitHub.
