Soil infrared spectra preprocessing utilities

SoilSpecTfm

Spectral Processing Tools for Soil Spectroscopy

By translating specialized soil spectroscopy methods into the scikit-learn framework, SoilSpecTfm and SoilSpecData connect this niche domain with Python’s vast machine learning ecosystem, making advanced ML/DL tools accessible to soil scientists.

Transforms implemented so far ([x]) or planned include:

  • Baseline corrections:

    • [x] SNV: Standard Normal Variate
    • [x] MSC: Multiplicative Scatter Correction
    • Detrend: Remove the trend from the spectrum (planned)
    • ALS: Asymmetric Least Squares baseline correction (planned)
  • Derivatives:

    • [x] TakeDerivative: Take the derivative (1st, 2nd, etc.) of the spectrum with Savitzky-Golay smoothing
    • GapSegmentDerivative: (planned)
  • Smoothing:

  • Other transformations:

    • [x] ToAbsorbance: Transform the spectrum to absorbance
    • [x] Resample: Resample the spectrum to a new wavenumber range
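
To make two of the listed transforms concrete, here is a minimal NumPy sketch of what SNV and a reflectance-to-absorbance conversion compute. It is not the SoilSpecTfm implementation: the function names `snv` and `reflectance_to_absorbance` are hypothetical, the data is synthetic, and the use of the population standard deviation (ddof=0) is an assumption that the library may not share.

```python
import numpy as np

def snv(spectra: np.ndarray) -> np.ndarray:
    """Standard Normal Variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)  # assumption: population std (ddof=0)
    return (spectra - mean) / std

def reflectance_to_absorbance(reflectance: np.ndarray) -> np.ndarray:
    """Apparent absorbance A = log10(1 / R) for reflectance R in (0, 1]."""
    return -np.log10(reflectance)

# Synthetic stand-in for real data: 3 spectra x 5 wavenumbers
rng = np.random.default_rng(0)
reflectance = rng.uniform(0.1, 0.9, size=(3, 5))

absorbance = reflectance_to_absorbance(reflectance)
absorbance_snv = snv(absorbance)
```

After SNV, every row has zero mean and unit standard deviation, which removes per-sample offset and scaling differences before modelling.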

Key Features:

  • Seamless integration with scikit-learn’s machine learning ecosystem
  • Complements the SoilSpecData package for soil spectroscopy workflows
  • Pipeline-ready transformers with consistent API

All transformers follow scikit-learn conventions:

  • Implement fit/transform interface
  • Support get_params/set_params for GridSearchCV
  • Provide detailed documentation and examples
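
The conventions above can be illustrated with a small, self-contained sketch. `ScaleByConstant` is a hypothetical transformer written only for this example and is not part of SoilSpecTfm; it shows the fit/transform interface and how `get_params`/`set_params` expose constructor arguments under the `step__param` names that GridSearchCV uses.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline

class ScaleByConstant(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: multiply every spectrum by a constant factor."""

    def __init__(self, factor: float = 1.0):
        # Store constructor arguments unchanged so get_params/set_params work
        self.factor = factor

    def fit(self, X, y=None):
        # Nothing to learn; record the input width to mark the estimator as fitted
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def transform(self, X):
        return np.asarray(X) * self.factor

pipe = Pipeline([('scale', ScaleByConstant(factor=2.0))])
X = np.ones((2, 3))
X_scaled = pipe.fit_transform(X)

# Nested names of the form <step>__<param> are what a GridSearchCV param_grid refers to
pipe.set_params(scale__factor=0.5)
X_rescaled = pipe.transform(X)
```

Because the parameter is reachable as `scale__factor`, a grid such as `{'scale__factor': [0.5, 1.0, 2.0]}` could be searched once the pipeline ends in an estimator.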

Installation

pip install soilspectfm

Quick Start

from soilspectfm.core import (SNV, 
                              TakeDerivative, 
                              ToAbsorbance, 
                              Resample, 
                              WaveletDenoise)

from sklearn.pipeline import Pipeline

Loading OSSL dataset

Let’s load the OSSL dataset as an example, using the SoilSpecData package.

from soilspecdata.datasets.ossl import get_ossl
ossl = get_ossl()
mir_data = ossl.get_mir()

Preprocessing pipeline

Transforms are fully compatible with scikit-learn and can be used in a pipeline as follows:

pipe = Pipeline([
    ('snv', SNV()), # Standard Normal Variate transformation
    ('denoise', WaveletDenoise()), # Wavelet denoising
    ('deriv', TakeDerivative(window_length=11, polyorder=2, deriv=1)) # First derivative
])

X_tfm = pipe.fit_transform(mir_data.spectra)
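
For intuition about the derivative step, the sketch below applies a Savitzky-Golay first derivative directly with `scipy.signal.savgol_filter`, using the same `window_length`, `polyorder` and `deriv` values as the pipeline. Whether TakeDerivative calls this function internally is an assumption, and the sine curves are synthetic stand-ins for real spectra.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic stand-in: 4 "spectra" sampled at 200 evenly spaced points
x = np.linspace(0, 2 * np.pi, 200)
phases = (0.0, 0.5, 1.0, 1.5)
spectra = np.vstack([np.sin(x + p) for p in phases])

# Smoothed first derivative along the wavenumber axis (axis=1);
# delta is the sample spacing, so the result is in units of d(signal)/dx
first_deriv = savgol_filter(
    spectra, window_length=11, polyorder=2, deriv=1, delta=x[1] - x[0], axis=1
)

# The analytical derivative of sin(x + p) is cos(x + p)
expected = np.vstack([np.cos(x + p) for p in phases])
max_error = np.abs(first_deriv[:, 10:-10] - expected[:, 10:-10]).max()
```

Away from the edges, the filtered derivative tracks the analytical one closely; a longer window smooths more strongly at the cost of flattening narrow peaks.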

Quick visualization

from soilspectfm.visualization import plot_spectra
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 7))

ax1 = plot_spectra(
    mir_data.spectra, 
    mir_data.wavenumbers,
    ax=ax1,
    ascending=False,
    color='black',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='Raw Spectra'
)

ax2 = plot_spectra(
    X_tfm,
    mir_data.wavenumbers,
    ax=ax2,
    ascending=False,
    color='steelblue',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='SNV + Wavelet Denoising + 1st Derivative Transformed Spectra'
)

plt.tight_layout()

Dependencies

  • fastcore
  • numpy
  • scipy
  • scikit-learn
  • matplotlib

Further references

Contributing

Developer guide

If you are new to nbdev, here are some useful pointers to get you started.

Install soilspectfm in development mode:

# make sure the soilspectfm package is installed in development mode
$ pip install -e .

# make changes under nbs/ directory
# ...

# compile so that changes apply to soilspectfm
$ nbdev_prepare

License

This project is licensed under the Apache 2.0 License; see the LICENSE file for details.

Support

For questions and support, please open an issue on GitHub.
