Python package that provides a full range of functionality to process and analyze vibrational spectra (Raman, SERS, FTIR, etc.).

Project description


BoxSERS is a complete, ready-to-use Python library for applying data augmentation, dimensionality reduction, spectral correction, machine learning and other methods specially designed and adapted for vibrational spectra (Raman, FTIR, SERS, etc.).

BoxSERS Installation

From PyPI

pip install boxsers

From GitHub

pip install git+https://github.com/ALebrun-108/BoxSERS.git

Requirements

The main packages required to run the code are listed below:

  • scikit-learn
  • SciPy
  • NumPy
  • pandas
  • Matplotlib
  • TensorFlow (GPU or CPU)

Labels associated with spectra can take one of the following three forms:

Label type | Examples
Text       | Cholic, Deoxycholic, Lithocholic, ...
Integer    | 0, 3, 1, ...
Binary     | [1 0 0 0], [0 0 0 1], [0 1 0 0], ...
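
The three label forms carry the same information, so one can be converted into another. The following is a minimal sketch using NumPy only; it does not rely on any BoxSERS function, and the label values are hypothetical.

```python
import numpy as np

# Hypothetical text labels, one per spectrum
lab_text = np.array(["Cholic", "Lithocholic", "Deoxycholic", "Cholic"])

# Text -> integer: np.unique sorts the class names and returns the index of each label
classnames, lab_int = np.unique(lab_text, return_inverse=True)

# Integer -> binary: each label selects one row of an identity matrix
lab_bin = np.eye(len(classnames), dtype=int)[lab_int]

# Binary -> integer -> text
lab_int_back = lab_bin.argmax(axis=1)
lab_text_back = classnames[lab_int_back]

print(lab_int)
print(lab_bin)
```

With this approach the integer values follow the alphabetical order of the class names, which may differ from the order used when the data were recorded.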

Included Features


Module boxsers.misc_tools

This module provides functions for a variety of utilities.

  • data_split : Randomly splits an initial set of spectra into two new subsets, referred to in this function as subset A and subset B.

  • load_rruff : Loads a subset of Raman spectra from the RRUFF database and returns it as three related lists containing Raman shifts, intensities and mineral names.
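
To illustrate what a random split into subsets A and B involves, here is a minimal NumPy sketch. The function `random_split` and its `seed` parameter are hypothetical and are not part of BoxSERS; only the `b_size` name is borrowed from the usage of `data_split`.

```python
import numpy as np

def random_split(spec, lab, b_size=0.4, seed=None):
    """Randomly split spectra and labels into subsets A and B.

    Subset B receives the fraction b_size of the spectra, subset A the rest.
    """
    rng = np.random.default_rng(seed)
    n = spec.shape[0]
    idx = rng.permutation(n)          # shuffled spectrum indices
    n_b = int(round(b_size * n))      # number of spectra sent to subset B
    idx_b, idx_a = idx[:n_b], idx[n_b:]
    return spec[idx_a], spec[idx_b], lab[idx_a], lab[idx_b]

# Synthetic data: 10 spectra of 5 points each, labels 0..9
spec = np.arange(50, dtype=float).reshape(10, 5)
lab = np.arange(10)
spec_a, spec_b, lab_a, lab_b = random_split(spec, lab, b_size=0.4, seed=0)
```

The same permutation indexes both arrays, so each spectrum stays paired with its label.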

Module boxsers.visual_tools

This module provides different tools to visualize vibrational spectra quickly.

  • spectro_plot : Returns a plot of the selected spectrum or spectra.

  • random_plot : Plots a number of randomly selected spectra from a set of spectra.

  • distribution_plot : Returns a bar plot that shows how the spectra in a given set are distributed across the classes.

# Code example:
from boxsers.misc_tools import data_split
from boxsers.visual_tools import spectro_plot, random_plot, distribution_plot

# spec (spectra array), lab (labels) and wn (Raman shift column) are assumed to be already loaded

# randomly splits the spectra (spec) and the labels (lab) into training and test subsets.
(spec_train, spec_test, lab_train, lab_test) = data_split(spec, lab, b_size=0.4)
# resulting train|test set proportions = 0.6|0.4

# plots the class distribution within the training set.
distribution_plot(lab_train, title='Train set distribution')

# plots 4 randomly selected spectra
random_plot(wn, spec, random_spectra=4)
# plots the first and third spectra
spectro_plot(wn, spec[0], spec[2])

Module boxsers.preprocessing

This module provides multiple functions to preprocess vibrational spectra. These features improve spectrum quality and can improve performance for machine learning applications.

  • baseline_subtraction : Subtracts the baseline signal from the spectrum or spectra using Asymmetric Least Squares (ALS) estimation.

  • intensity_normalization : Normalizes the spectrum or spectra using one of the norms available in this function.

  • savgol_smoothing : Smooths the spectrum or spectra using a Savitzky-Golay polynomial filter.

  • spectral_cut : Removes or sets to zero a delimited spectral region of the spectrum or spectra.

  • spline_interpolation : Performs a one-dimensional spline interpolation of the spectra to reproduce them on a new x-axis.
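
To give an idea of how Asymmetric Least Squares baseline estimation works, here is a minimal sketch of the commonly used ALS formulation, written with SciPy sparse matrices. It is not the BoxSERS implementation, which may differ; the function name `als_baseline` is hypothetical, and the parameter names `lam`, `p` and `niter` are borrowed from the `baseline_subtraction` call shown in the code example.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def als_baseline(y, lam=1e4, p=0.001, niter=10):
    """Estimate the baseline of a 1D spectrum y by asymmetric least squares."""
    n = y.size
    # Second-order difference operator, shape (n - 2, n): penalizes curvature
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(niter):
        W = sparse.diags(w, 0)
        # Weighted, smoothness-penalized least-squares fit
        z = spsolve(sparse.csc_matrix(W + penalty), w * y)
        # Points above the fit (peaks) get a small weight p, points below get 1 - p
        w = np.where(y > z, p, 1.0 - p)
    return z

# Synthetic spectrum: sloped baseline + one Gaussian peak + a little noise
x = np.arange(500)
rng = np.random.default_rng(0)
y = 0.01 * x + 5.0 * np.exp(-0.5 * ((x - 250) / 5.0) ** 2) + rng.normal(0, 0.05, x.size)

baseline = als_baseline(y)
y_cor = y - baseline  # baseline-corrected spectrum
```

A larger `lam` gives a stiffer (smoother) baseline, while a smaller `p` keeps the baseline closer to the lower envelope of the spectrum.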

# Code example:
import numpy as np
from boxsers.preprocessing import baseline_subtraction, spectral_cut, intensity_normalization, spline_interpolation

# interpolates the spectra with splines onto a new Raman shift axis (new_wn)
new_wn = np.linspace(500, 3000, 1000)
spec_cor = spline_interpolation(spec, wn, new_wn)
# removes the baseline signal estimated with the ALS method
(spec_cor, baseline) = baseline_subtraction(spec_cor, lam=1e4, p=0.001, niter=10)
# normalizes each spectrum individually so that the maximum value equals one and the minimum value zero
spec_cor = intensity_normalization(spec_cor)
# removes the part of the spectra delimited by the Raman shift values wn_start and wn_end
wn_start, wn_end = 1000, 1200  # illustrative bounds, not taken from the library
spec_cor, wn_cor = spectral_cut(spec_cor, new_wn, wn_start, wn_end)
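
Savitzky-Golay smoothing is listed among the preprocessing functions but does not appear in the example. The sketch below shows the underlying operations with `scipy.signal.savgol_filter`, followed by a min-max normalization written in NumPy. It is a stand-in under stated assumptions: it does not call `savgol_smoothing` or `intensity_normalization`, whose signatures are not shown here, and the window length and polynomial order are arbitrary.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic data: two noisy spectra sharing one Gaussian band
rng = np.random.default_rng(0)
wn = np.linspace(500, 3000, 1000)
clean = np.exp(-0.5 * ((wn - 1500) / 30) ** 2)
spec = np.vstack([clean, 2 * clean]) + rng.normal(0, 0.05, size=(2, wn.size))

# Savitzky-Golay filter applied along the spectral axis (one spectrum per row)
spec_smooth = savgol_filter(spec, window_length=11, polyorder=3, axis=1)

# Min-max normalization of each spectrum: minimum -> 0, maximum -> 1
spec_min = spec_smooth.min(axis=1, keepdims=True)
spec_max = spec_smooth.max(axis=1, keepdims=True)
spec_norm = (spec_smooth - spec_min) / (spec_max - spec_min)
```

A wider window removes more noise but starts to flatten narrow bands, so the window should stay small compared with the narrowest peak of interest.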

Module boxsers.data_augmentation

This module provides several data augmentation methods that generate new spectra by adding different variations to existing spectra.

  • aug_mixup : Randomly generates new spectra by mixing together several spectra with a Dirichlet probability distribution.

  • aug_noise : Randomly generates new spectra with Gaussian noise added.

  • aug_multiplier : Randomly generates new spectra with multiplicative factors applied.

  • aug_ioffset : Randomly generates new spectra shifted in intensity.

  • aug_xshift : Randomly generates new spectra shifted along the spectral (x) axis.

  • aug_linslope : Randomly generates new spectra with additional linear slopes.
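
The two sketches below illustrate the ideas behind noise addition and Dirichlet mixup using NumPy only. They are not the BoxSERS implementations: the function names, the parameters `n_spec` and `alpha`, and the definition of the noise level (mean absolute intensity divided by `snr`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(spec, snr=10):
    """Return noisy copies of the spectra (noise std = mean |intensity| / snr, an assumed definition)."""
    sigma = np.abs(spec).mean(axis=1, keepdims=True) / snr
    return spec + rng.normal(size=spec.shape) * sigma

def mixup(spec, lab_bin, n_spec=2, alpha=0.5, quantity=1):
    """Create `quantity` new spectra, each a Dirichlet-weighted mix of n_spec random spectra."""
    n = spec.shape[0]
    idx = rng.integers(0, n, size=(quantity, n_spec))                # spectra to mix
    weights = rng.dirichlet(alpha * np.ones(n_spec), size=quantity)  # each row sums to 1
    spec_mix = np.einsum("qk,qkj->qj", weights, spec[idx])
    lab_mix = np.einsum("qk,qkc->qc", weights, lab_bin[idx])         # labels mixed with the same weights
    return spec_mix, lab_mix

# Synthetic data: 6 spectra of 20 points, 3 classes in binary form
spec = rng.random((6, 20))
lab_bin = np.eye(3, dtype=int)[[0, 1, 2, 0, 1, 2]]

spec_nse = add_gaussian_noise(spec, snr=10)
spec_mix, lab_mix = mixup(spec, lab_bin, n_spec=2, alpha=0.5, quantity=4)
```

Mixing the labels with the same weights as the spectra yields soft labels, which is why mixup is usually paired with labels in binary form.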

# Code example:

import numpy as np
from sklearn.utils import shuffle
from boxsers.data_augmentation import aug_noise, aug_multiplier
from boxsers.visual_tools import spectro_plot

# generates new spectra with added Gaussian noise (snr = 10)
spec_nse, lab_nse = aug_noise(spec, lab, snr=10)
# generates new spectra with multiplicative factors applied (0.15 = multiplier limit, mult_lim)
spec_mul, lab_mul = aug_multiplier(spec, lab, 0.15)

# plots one original spectrum next to one spectrum from each augmented set
spectro_plot(wn, spec[0], spec_nse[0], spec_mul[0])

# stacks all generated spectra and originals in a single array
# (labels are assumed to be in binary form, one row per spectrum)
spec_aug = np.vstack((spec, spec_nse, spec_mul))
lab_aug = np.vstack((lab, lab_nse, lab_mul))

# spectra and labels are randomly mixed
spec_aug, lab_aug = shuffle(spec_aug, lab_aug)

Module boxsers.dimension_reduction

This module provides different techniques to perform dimensionality reduction of vibrational spectra.

  • SpectroPCA : Principal Component Analysis (PCA) model.

  • SpectroFA : Factor Analysis (FA) model.

  • SpectroICA : Independent Component Analysis (ICA) model.

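# Code example:

from boxsers.dimension_reduction import SpectroPCA, SpectroFA, SpectroICA

# classnames: list of class names used in the plot legend (assumed to be defined)
pca_model = SpectroPCA(n_comp=50)
pca_model.fit_model(spec_train)  # fits the PCA model on the training spectra
# scatter plot of the test spectra on components 1 and 2, labeled with lab_test
pca_model.scatter_plot(spec_test, lab_test, targets=classnames, component_x=1, component_y=2)
pca_model.component_plot(wn, component=2)  # plots the second component against the Raman shift
spec_pca = pca_model.transform_spectra(spec_test)  # projects the test spectra onto the components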
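
For readers unfamiliar with the fit/transform pattern, the following sketch performs the equivalent steps directly with scikit-learn's `PCA` on synthetic data. Whether `SpectroPCA` wraps this class internally is an assumption; the sketch only shows the general technique.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 40 training and 10 test spectra of 200 points each
rng = np.random.default_rng(0)
spec_train = rng.normal(size=(40, 200))
spec_test = rng.normal(size=(10, 200))

pca = PCA(n_components=5)
pca.fit(spec_train)                       # learns the components from the training spectra
spec_test_pca = pca.transform(spec_test)  # projects the test spectra onto these components

# Fraction of the training variance captured by each component
explained = pca.explained_variance_ratio_
```

Fitting on the training set only and reusing the fitted model on the test set avoids leaking test information into the reduced representation.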

Unsupervised Machine Learning

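# Code example:

from boxsers.machine_learning import SpectroGmixture, SpectroKmeans

kmeans_model = SpectroKmeans(n_cluster=5)  # K-means model with 5 clusters
kmeans_model.fit_model(spec_train)  # fits the clusters on the training spectra
kmeans_model.scatter_plot(spec_test)  # scatter plot of the test spectra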
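
The sketch below shows the same kind of clustering done directly with scikit-learn's `KMeans` on synthetic data. It is an illustration of the technique, not of `SpectroKmeans` itself, whose internals are not described here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: two well-separated groups of 20 spectra (50 points each)
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.1, size=(20, 50))
group_b = rng.normal(5.0, 0.1, size=(20, 50))
spec_train = np.vstack([group_a, group_b])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(spec_train)                  # finds the cluster centers
clusters = kmeans.predict(spec_train)   # assigns each spectrum to its nearest center
```

Cluster indices are arbitrary, so they should be compared with known labels only through a mapping or a label-invariant score.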

Supervised Machine Learning

  • Convolutional Neural Network (3 x 1D convolutional layers, 2 x dense layers)

# Code example:

from boxsers.dimension_reduction import SpectroPCA, SpectroFA, SpectroICA

# same steps as the SpectroPCA example, here with an ICA model
ica_model = SpectroICA(n_comp=50)
ica_model.fit_model(spec_train)
ica_model.scatter_plot(spec_test, lab_test, targets=classnames, component_x=1, component_y=2)
ica_model.component_plot(wn, component=2)
spec_ica = ica_model.transform_spectra(spec_train)

Module boxsers.validation_metrics

This module provides different tools to evaluate the quality of a model’s predictions.

  • cf_matrix : Returns a confusion matrix (built with scikit-learn) generated on a given set of spectra.

  • clf_report : Returns a classification report generated from a given set of spectra
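
Since the confusion matrix is built with scikit-learn, the sketch below shows the underlying scikit-learn calls on a small hypothetical set of true and predicted labels. It does not call `cf_matrix` or `clf_report`, whose signatures are not shown here.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical integer labels: true classes and model predictions
lab_true = np.array([0, 0, 1, 1, 2, 2])
lab_pred = np.array([0, 1, 1, 1, 2, 0])
classnames = ["Cholic", "Deoxycholic", "Lithocholic"]

# Rows = true classes, columns = predicted classes
cm = confusion_matrix(lab_true, lab_pred)

# Precision, recall and F1-score for each class
report = classification_report(lab_true, lab_pred, target_names=classnames)

print(cm)
print(report)
```

Both functions expect labels in integer or text form; labels in binary form can be converted first with `argmax(axis=1)`.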
