
Soil infrared spectra preprocessing utilities


SoilSpecTfm

A Python package providing scikit-learn compatible transforms for spectroscopic data preprocessing.

It’s designed to work seamlessly with both MIR (Mid-Infrared) and VISNIR (Visible-Near Infrared) spectral data.

WORK IN PROGRESS

Installation

pip install soilspectfm

Quick Start

from soilspectfm.core import SNV, TakeDerivative, ToAbsorbance
from sklearn.pipeline import Pipeline

Loading OSSL dataset

Let’s use the OSSL dataset as an example, loaded with the SoilSpecData package.

from soilspecdata.datasets.ossl import get_ossl
ossl = get_ossl()
mir_data = ossl.get_mir()

Preprocessing pipeline

Transforms implemented so far are marked [x]; unmarked entries are not yet available:

  • Baseline corrections:

    • [x] SNV: Standard Normal Variate
    • [x] MSC: Multiplicative Scatter Correction
    • Detrend: Detrend the spectrum (SOON)
    • ALS: Asymmetric Least Squares baseline correction (SOON)
  • Derivatives:

    • [x] TakeDerivative: Take derivative (1st, 2nd, etc.) of the spectrum and apply Savitzky-Golay smoothing
    • GapSegmentDerivative: …
  • Smoothing:

    • WaveletDenoise: Wavelet denoising
    • SavGolSmooth: Savitzky-Golay smoothing
  • Other transformations:

    • [x] ToAbsorbance: Transform the spectrum to absorbance
    • Resample: Resample the spectrum to a new wavenumber range
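
To make the list above more concrete, here is a minimal NumPy sketch of what three of these transforms conventionally compute. It is not SoilSpecTfm's implementation: the function names `snv`, `msc`, and `to_absorbance` are hypothetical, and the sketch assumes spectra are stored as rows of a 2D array (samples x wavenumbers).

```python
import numpy as np

def snv(X):
    """Standard Normal Variate: center and scale each spectrum (row) on its own."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    # Population standard deviation (ddof=0); some implementations use ddof=1
    std = X.std(axis=1, keepdims=True)
    return (X - mean) / std

def msc(X, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum.

    Each spectrum is regressed on the reference (x = a + b * ref, least squares),
    then corrected as (x - a) / b. The mean spectrum is the default reference.
    """
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, x in enumerate(X):
        slope, intercept = np.polyfit(ref, x, deg=1)  # highest degree first
        corrected[i] = (x - intercept) / slope
    return corrected

def to_absorbance(R):
    """Convert reflectance (0 < R <= 1) to apparent absorbance, A = log10(1 / R)."""
    return -np.log10(np.asarray(R, dtype=float))

# Synthetic stand-in for real spectra: 5 samples, 50 points,
# each an offset and scaled copy of the same underlying shape
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 3, 50)) + 2.0
offsets = rng.normal(0, 0.1, 5)
scales = rng.uniform(0.8, 1.2, 5)
X = np.array([a + b * base for a, b in zip(offsets, scales)])

X_snv = snv(X)  # every row now has zero mean and unit standard deviation
X_msc = msc(X)  # additive and multiplicative differences between rows removed
```

SNV treats each spectrum independently, whereas MSC depends on a reference spectrum, so a fitted MSC step would typically need to keep the reference learned from the training set.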

Transforms are fully compatible with scikit-learn and can be used in a pipeline as follows:

pipe = Pipeline([
    ('snv', SNV()), # Standard Normal Variate transformation
    ('deriv', TakeDerivative(window_length=11, polyorder=2, deriv=1)) # First derivative
])

X_tfm = pipe.fit_transform(mir_data.spectra)
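
As a rough illustration of what scikit-learn compatibility involves, the sketch below defines a hypothetical `SavGolDerivative` transformer built on `scipy.signal.savgol_filter` and chains it in a `Pipeline`. The class is not part of SoilSpecTfm, and whether `TakeDerivative` is implemented this way is an assumption; the parameter names simply mirror those used above. Synthetic data stands in for `mir_data.spectra`.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

class SavGolDerivative(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: Savitzky-Golay smoothed derivative of each spectrum."""

    def __init__(self, window_length=11, polyorder=2, deriv=1):
        # scikit-learn convention: store constructor arguments unchanged
        self.window_length = window_length
        self.polyorder = polyorder
        self.deriv = deriv

    def fit(self, X, y=None):
        # Stateless transform: nothing to learn from the data
        return self

    def transform(self, X):
        # Filter along axis=1, i.e. across wavenumbers within each spectrum
        return savgol_filter(
            np.asarray(X, dtype=float),
            window_length=self.window_length,
            polyorder=self.polyorder,
            deriv=self.deriv,
            axis=1,
        )

def row_standardize(X):
    """Per-spectrum centering and scaling (an SNV-style step)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# Synthetic stand-in for real spectra: 4 samples, 200 points
rng = np.random.default_rng(42)
X = np.cumsum(rng.normal(size=(4, 200)), axis=1)

sketch_pipe = Pipeline([
    ("snv", FunctionTransformer(row_standardize)),
    ("deriv", SavGolDerivative(window_length=11, polyorder=2, deriv=1)),
])
X_out = sketch_pipe.fit_transform(X)  # same shape as X
```

Because the constructor arguments are stored as attributes under the same names, `get_params`, `set_params`, and `sklearn.base.clone` work on this class, which is what lets such a step take part in cross-validation or a parameter search.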

Quick visualization

from soilspectfm.visualization import plot_spectra
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 7))

ax1 = plot_spectra(
    mir_data.spectra, 
    mir_data.wavenumbers,
    ax=ax1,
    ascending=False,
    color='black',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='Raw Spectra'
)

ax2 = plot_spectra(
    X_tfm,
    mir_data.wavenumbers,
    ax=ax2,
    ascending=False,
    color='steelblue',
    alpha=0.6,
    lw=0.5,
    xlabel='Wavenumber (cm$^{-1}$)',
    title='SNV + Derivative (1st order) Transformed Spectra'
)

plt.tight_layout()

Dependencies

  • fastcore
  • numpy
  • scipy
  • scikit-learn
  • matplotlib

Further references

Contributing

Developer guide

If you are new to nbdev, here are some useful pointers to get you started.

Install soilspectfm in development mode:

# make sure the soilspectfm package is installed in development mode
$ pip install -e .

# make changes under the nbs/ directory
# ...

# run nbdev_prepare so the changes apply to the soilspectfm package
$ nbdev_prepare

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Support

For questions and support, please open an issue on GitHub.
