Python toolkit for spectral data processing: baseline correction, normalization, smoothing, despiking, similarity metrics, peak analysis, and multi-format I/O.

SpectraKit

Python toolkit for spectral data processing: smoothing, baseline correction, normalization, scatter correction, derivatives, peak analysis, and more.

SpectraKit is a lightweight, pip-installable library for preprocessing and analyzing spectral data from IR, Raman, and NIR spectroscopy. It follows a functional design with NumPy arrays as the primary data type and requires only NumPy + SciPy as core dependencies.

Documentation | API Reference | Project Page | Examples

Installation

pip install pyspectrakit

Note: The PyPI distribution name is pyspectrakit because of a naming conflict. The import name is still spectrakit (import spectrakit).

Optional extras for additional functionality:

pip install "pyspectrakit[io]"         # HDF5 file support
pip install "pyspectrakit[cli]"        # Command-line interface
pip install "pyspectrakit[baselines]"  # pybaselines backend (200+ methods)
pip install "pyspectrakit[fitting]"    # lmfit peak fitting
pip install "pyspectrakit[sklearn]"    # scikit-learn integration
pip install "pyspectrakit[plot]"       # Plotting utilities
pip install "pyspectrakit[all]"        # Everything above

Quick Start

import numpy as np
from spectrakit import smooth_savgol, baseline_als, normalize_snv

# Load your spectral data (N spectra, W wavelengths)
spectra = np.loadtxt("data.csv", delimiter=",")

# Process with individual functions
smoothed = smooth_savgol(spectra, window_length=11)
corrected = baseline_als(smoothed, lam=1e6, p=0.01)
normalized = normalize_snv(corrected)

All functions accept both single spectra (W,) and batches (N, W).
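One common way to support both layouts is to operate along the last axis, which is the wavelength axis for a single spectrum (W,) and for a batch (N, W) alike. The sketch below illustrates that convention with plain NumPy; `center_each_spectrum` is a hypothetical helper written for this example, not a SpectraKit function, and SpectraKit's internals may handle shapes differently.

```python
import numpy as np


def center_each_spectrum(y):
    """Subtract each spectrum's own mean; accepts shape (W,) or (N, W)."""
    y = np.asarray(y, dtype=float)
    if y.ndim not in (1, 2):
        raise ValueError("expected shape (W,) or (N, W)")
    # axis=-1 is the wavelength axis in both layouts
    return y - y.mean(axis=-1, keepdims=True)


single = np.array([1.0, 2.0, 3.0])                       # (W,)
batch = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])  # (N, W)

single_out = center_each_spectrum(single)
batch_out = center_each_spectrum(batch)
```

The output keeps the input's shape, so a single spectrum gives the same result on its own as it does as a row of a batch.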

Pipeline

Chain steps for reproducibility:

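from spectrakit import smooth_savgol, baseline_als, normalize_snv
from spectrakit.pipeline import Pipeline

# Each step is a name, a function, and the keyword arguments to call it with
pipe = Pipeline()
pipe.add("smooth", smooth_savgol, window_length=11)
pipe.add("baseline", baseline_als, lam=1e6)
pipe.add("normalize", normalize_snv)

# spectra: array of shape (W,) or (N, W), loaded as in Quick Start
processed = pipe.transform(spectra)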
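To make the idea of chaining named steps concrete, here is a minimal sketch of how such a pipeline could work: each step is a function bound to its keyword arguments, applied in insertion order. `MiniPipeline`, `offset`, and `scale` are hypothetical names invented for this illustration. This is not SpectraKit's `Pipeline` implementation, only one plausible shape for the same pattern.

```python
from functools import partial

import numpy as np


class MiniPipeline:
    """Hypothetical stand-in: an ordered list of named, pre-configured steps."""

    def __init__(self):
        self.steps = []

    def add(self, name, func, **params):
        # Bind the keyword arguments now so every run uses the same settings
        self.steps.append((name, partial(func, **params)))
        return self

    def transform(self, y):
        out = np.asarray(y, dtype=float)
        for _name, step in self.steps:
            out = step(out)
        return out


def offset(y, value=0.0):
    return y - value


def scale(y, factor=1.0):
    return y * factor


pipe = MiniPipeline()
pipe.add("offset", offset, value=1.0)
pipe.add("scale", scale, factor=2.0)

result = pipe.transform(np.array([1.0, 2.0, 3.0]))
```

Because the parameters are stored with the steps, the same object can be reapplied to new data without restating them, which is where the reproducibility benefit comes from.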

scikit-learn Integration

Use any SpectraKit function in an sklearn pipeline:

from sklearn.pipeline import Pipeline as SkPipeline
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from spectrakit import smooth_savgol, baseline_als, normalize_snv
from spectrakit.sklearn import SpectralTransformer

pipe = SkPipeline([
    ("smooth", SpectralTransformer(smooth_savgol, window_length=11)),
    ("baseline", SpectralTransformer(baseline_als, lam=1e6)),
    ("normalize", SpectralTransformer(normalize_snv)),
    ("pca", PCA(n_components=10)),
    ("svm", SVC()),
])

# X_train, X_test: spectra of shape (N, W); y_train: class labels of length N
pipe.fit(X_train, y_train)
predictions = pipe.predict(X_test)

Features

Smoothing

Method	Function	Description
Savitzky-Golay	`smooth_savgol(y)`	Polynomial least-squares smoothing
Whittaker	`smooth_whittaker(y)`	Penalized least-squares smoother
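For readers unfamiliar with the Whittaker smoother, the following is a small dense-matrix sketch of the penalized least-squares idea: minimize the squared misfit plus `lam` times the squared finite differences of the result. The function name `whittaker_smooth` and its parameters are assumptions made for this example; it is not the code behind `smooth_whittaker`, and a practical implementation would use sparse matrices.

```python
import numpy as np


def whittaker_smooth(y, lam=100.0, order=2):
    """Solve (I + lam * D^T D) z = y for a single 1-D spectrum."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # D is the (n - order, n) finite-difference operator
    D = np.diff(np.eye(n), n=order, axis=0)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)


def roughness(y):
    """Sum of squared second differences, the quantity being penalized."""
    return float(np.sum(np.diff(y, n=2) ** 2))


rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 100)
noisy = np.sin(x) + 0.2 * rng.normal(size=x.size)
smoothed = whittaker_smooth(noisy, lam=100.0)
```

Larger `lam` values give a smoother result. With `order=2`, a perfectly straight line passes through unchanged because its second differences are zero.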

Baseline Correction

Method	Function	Description
ALS	`baseline_als(y)`	Asymmetric least squares
SNIP	`baseline_snip(y)`	Statistics-sensitive non-linear iterative peak clipping
Polynomial	`baseline_polynomial(y)`	Iterative polynomial fit
Rubberband	`baseline_rubberband(y)`	Convex hull envelope
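As an illustration of what asymmetric least squares does, the sketch below follows the commonly described scheme: fit a smooth curve, then reweight so that points above the curve (likely peaks) count with weight `p` and points below it count with weight `1 - p`, and repeat. The name `als_baseline`, the dense solver, and the default values are assumptions for this example; it does not reproduce `baseline_als` and is only practical for short spectra.

```python
import numpy as np


def als_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Estimate a smooth baseline under a 1-D spectrum (dense sketch)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)   # second-difference operator
    penalty = lam * D.T @ D
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted penalized least-squares fit with the current weights
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the fit are treated as peaks and down-weighted
        w = np.where(y > z, p, 1.0 - p)
    return z


x = np.linspace(0.0, 1.0, 200)
true_baseline = 1.0 + 0.5 * x
peak = np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
y = true_baseline + peak

estimated = als_baseline(y)
corrected = y - estimated
```

Here `lam` controls baseline stiffness and `p` controls how strongly peaks are ignored; both usually need tuning per instrument and sample type.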

Normalization

Method	Function	Description
SNV	`normalize_snv(y)`	Zero mean, unit variance per spectrum
Min-Max	`normalize_minmax(y)`	Scale to [0, 1]
Area	`normalize_area(y)`	Unit area under curve
Vector	`normalize_vector(y)`	L2 norm = 1
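The four normalizations in the table can be written in a few lines of NumPy, which may help when choosing between them. The function names below are hypothetical and the details are assumptions: population standard deviation (`ddof=0`) for SNV and unit point spacing for the area. SpectraKit's own functions may make different choices.

```python
import numpy as np


def snv(y):
    """Center each spectrum and divide by its own standard deviation."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean(axis=-1, keepdims=True)) / y.std(axis=-1, keepdims=True)


def minmax(y):
    """Scale each spectrum to the interval [0, 1]."""
    y = np.asarray(y, dtype=float)
    lo = y.min(axis=-1, keepdims=True)
    hi = y.max(axis=-1, keepdims=True)
    return (y - lo) / (hi - lo)


def trapezoid(y):
    """Trapezoidal area along the last axis, assuming unit spacing."""
    return (0.5 * (y[..., 1:] + y[..., :-1])).sum(axis=-1, keepdims=True)


def area(y):
    """Scale each spectrum to unit area under the curve."""
    y = np.asarray(y, dtype=float)
    return y / trapezoid(y)


def vector(y):
    """Scale each spectrum to unit L2 norm."""
    y = np.asarray(y, dtype=float)
    return y / np.linalg.norm(y, axis=-1, keepdims=True)


batch = np.array([[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 10.0]])
```

All four operate row by row, so each spectrum is normalized independently of the others in the batch.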

Derivatives

Method	Function	Description
Savitzky-Golay	`derivative_savgol(y)`	SG polynomial derivative
Gap-Segment	`derivative_gap_segment(y)`	Norris-Williams derivative
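A Savitzky-Golay derivative fits a local polynomial in a sliding window and differentiates the fit. The sketch below uses SciPy's `savgol_filter` directly to show the idea; whether `derivative_savgol` wraps this function or takes the same parameters is an assumption not confirmed by this README.

```python
import numpy as np
from scipy.signal import savgol_filter

# A quadratic test signal whose exact derivative is 2 * x
x = np.linspace(0.0, 2.0, 41)
y = x ** 2

# deriv=1 requests the first derivative; delta is the sample spacing,
# needed so the result is in units of y per unit x rather than per sample
dy = savgol_filter(y, window_length=7, polyorder=2, deriv=1, delta=x[1] - x[0])
```

Because the signal is itself a degree-2 polynomial and `polyorder=2`, the estimate matches `2 * x` up to floating-point error. On real spectra, the window length trades noise suppression against peak distortion.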

Scatter Correction

Method	Function	Description
MSC	`scatter_msc(y)`	Multiplicative scatter correction
EMSC	`scatter_emsc(y)`	Extended MSC with polynomial terms
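Multiplicative scatter correction is easiest to see in code: regress each spectrum against a reference (commonly the batch mean), then remove the fitted offset and slope. The sketch below is a plain NumPy version with a hypothetical name `msc`; it is not the implementation of `scatter_msc`, and the choice of reference is an assumption.

```python
import numpy as np


def msc(spectra, reference=None):
    """Remove per-spectrum additive and multiplicative effects, shape (N, W)."""
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row ~ slope * ref + intercept
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Three copies of one spectral shape with different gains and offsets
base = np.array([0.1, 0.4, 0.9, 0.3, 0.2])
batch = np.vstack([1.0 * base + 0.0, 2.0 * base + 0.5, 0.5 * base - 0.1])

fixed = msc(batch)
```

In this synthetic case the rows differ only by gain and offset, so after correction they all collapse onto the mean spectrum. EMSC extends the same regression with additional terms such as polynomials in the wavelength axis.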

Spectral Transforms

Method	Function	Description
Kubelka-Munk	`transform_kubelka_munk(y)`	Reflectance to K-M units
ATR Correction	`transform_atr_correction(y, wn)`	ATR depth-of-penetration
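The Kubelka-Munk function maps diffuse reflectance R (as a fraction between 0 and 1) to F(R) = (1 - R)^2 / (2R). A minimal sketch follows; the name `kubelka_munk` and the input validation are assumptions for this example, and `transform_kubelka_munk` may treat percent reflectance or out-of-range values differently.

```python
import numpy as np


def kubelka_munk(reflectance):
    """Convert fractional reflectance in (0, 1] to Kubelka-Munk units."""
    r = np.asarray(reflectance, dtype=float)
    if np.any(r <= 0.0) or np.any(r > 1.0):
        raise ValueError("reflectance must be a fraction in (0, 1]")
    return (1.0 - r) ** 2 / (2.0 * r)


km = kubelka_munk([1.0, 0.5, 0.25])
```

A perfectly reflecting sample (R = 1) maps to 0, and the value grows as reflectance falls.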

Operations

Function	Description
`spectral_subtract(a, b)`	Spectral subtraction
`spectral_average(y)`	Mean spectrum from batch
`spectral_interpolate(y, wn, new_wn)`	Resample to new axis
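Resampling onto a new axis is often done with linear interpolation. One detail that trips people up is that wavenumber axes are frequently stored in descending order, while `np.interp` expects increasing sample points. The sketch below handles that by sorting first; `resample` is a hypothetical helper for a single 1-D spectrum, and `spectral_interpolate` may use a different interpolation method.

```python
import numpy as np


def resample(y, wn, new_wn):
    """Linearly interpolate a 1-D spectrum from axis wn onto new_wn."""
    y = np.asarray(y, dtype=float)
    wn = np.asarray(wn, dtype=float)
    new_wn = np.asarray(new_wn, dtype=float)
    # np.interp requires increasing x, so sort the original axis first
    order = np.argsort(wn)
    return np.interp(new_wn, wn[order], y[order])


wn = np.array([4000.0, 3000.0, 2000.0, 1000.0])   # descending, as in many IR files
y = np.array([4.0, 3.0, 2.0, 1.0])
new_wn = np.array([3500.0, 2500.0, 1500.0])

resampled = resample(y, wn, new_wn)
```

Note that `np.interp` clamps to the end values outside the original range rather than extrapolating, so the new axis should normally stay within the measured one.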

Peak Analysis

Function	Description
`peaks_find(y)`	Find peaks with scipy.signal
`peaks_integrate(y)`	Integrate peak regions
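Since the table mentions scipy.signal, here is a sketch of peak finding with `scipy.signal.find_peaks` followed by trapezoidal integration of a region. The synthetic spectrum, the `height` threshold, and the helper `trapezoid_area` are assumptions for illustration; the parameters accepted by `peaks_find` and `peaks_integrate` are not documented in this README.

```python
import numpy as np
from scipy.signal import find_peaks

# Two Gaussian bands centered at x = 3 and x = 7
x = np.linspace(0.0, 10.0, 1001)
y = (np.exp(-0.5 * ((x - 3.0) / 0.2) ** 2)
     + 0.5 * np.exp(-0.5 * ((x - 7.0) / 0.3) ** 2))

# Keep only local maxima at least 0.1 high
indices, properties = find_peaks(y, height=0.1)
positions = x[indices]


def trapezoid_area(x, y, lo, hi):
    """Trapezoidal area of y over the interval lo <= x <= hi."""
    mask = (x >= lo) & (x <= hi)
    xs, ys = x[mask], y[mask]
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))


area_first = trapezoid_area(x, y, 2.0, 4.0)
```

For a Gaussian of unit height and width parameter 0.2, the analytic area is 0.2 * sqrt(2 * pi), which gives a quick sanity check on the integration limits.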

Similarity Metrics

Metric	Function	Range
Cosine	`similarity_cosine(a, b)`	[-1, 1]
Pearson	`similarity_pearson(a, b)`	[-1, 1]
Spectral Angle	`similarity_spectral_angle(a, b)`	[0, pi]
Euclidean	`similarity_euclidean(a, b)`	[0, inf)
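The ranges in the table follow directly from the definitions. The sketch below writes the four metrics in NumPy for two 1-D spectra of equal length; the function names are hypothetical and this is not the code behind the `similarity_*` functions.

```python
import numpy as np


def cosine(a, b):
    """Cosine of the angle between two spectra, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centered spectra."""
    return cosine(a - a.mean(), b - b.mean())


def spectral_angle(a, b):
    """Angle in radians, in [0, pi]; clip guards against rounding past 1."""
    return float(np.arccos(np.clip(cosine(a, b), -1.0, 1.0)))


def euclidean(a, b):
    """Straight-line distance, in [0, inf); sensitive to overall intensity."""
    return float(np.linalg.norm(a - b))


a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
```

Cosine, Pearson, and spectral angle ignore overall intensity scaling, while Euclidean distance does not, so normalizing first matters most for the last one.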

I/O Formats

Format	Function	Dependencies
JCAMP-DX	`read_jcamp(path)`	None
SPC	`read_spc(path)`	spc-spectra
CSV/TSV	`read_csv(path)`	None
HDF5	`read_hdf5(path)` / `write_hdf5(spec, path)`	h5py
Bruker OPUS	`read_opus(path)`	None

Optional Backends

Backend	Extra	Description
pybaselines	`[baselines]`	200+ baseline methods via `pybaselines_method()`
lmfit	`[fitting]`	Peak fitting with Gaussian, Lorentzian, Voigt models

Visualization

from spectrakit.plot import plot_spectrum, plot_comparison, plot_baseline

Requires the plot extra: pip install "pyspectrakit[plot]".

Spectrum Container

import numpy as np
from spectrakit import Spectrum

spec = Spectrum(
    intensities=np.array([...]),       # (W,) or (N, W)
    wavenumbers=np.array([...]),       # (W,), optional
    metadata={"instrument": "Bruker"},
    source_format="jcamp",
    label="ethanol_ir",
)

CLI

pip install "pyspectrakit[cli]"

spectrakit info ethanol.dx
spectrakit convert ethanol.dx ethanol.h5

Examples

See the examples/ directory for Jupyter notebooks:

Quick Start — basic preprocessing workflow
Baseline Methods — comparing correction algorithms
Derivatives & Peaks — derivative analysis and peak finding
Scatter Correction — MSC vs EMSC vs SNV
sklearn Pipeline — classification with preprocessing

Development

git clone https://github.com/ktubhyam/spectrakit.git
cd spectrakit
pip install -e ".[all,dev]"
pytest

See CONTRIBUTING.md for guidelines.

Citation

If you use SpectraKit in your research, please cite:

@software{spectrakit,
  author = {Karthikeyan, Tubhyam},
  title = {SpectraKit: Python toolkit for spectral data processing},
  url = {https://github.com/ktubhyam/spectrakit},
  license = {MIT}
}

License

MIT

Release history

Version	Release date
1.9.6	Feb 26, 2026
1.9.5	Feb 26, 2026
1.9.4	Feb 26, 2026
1.9.3	Feb 26, 2026
1.9.2	Feb 26, 2026
1.9.1 (this version)	Feb 25, 2026
1.9.0	Feb 25, 2026
1.8.1	Feb 25, 2026
1.8.0	Feb 25, 2026
1.7.2	Feb 25, 2026
1.7.1	Feb 25, 2026

Download files

File	Type	Size	Uploaded	Tags
pyspectrakit-1.9.1.tar.gz	Source distribution	542.1 kB	Feb 25, 2026	Source
pyspectrakit-1.9.1-py3-none-any.whl	Built distribution	294.4 kB	Feb 25, 2026	Python 3

Both files were uploaded via twine/6.2.0 CPython/3.12.7, without Trusted Publishing.

Hashes for pyspectrakit-1.9.1.tar.gz
Algorithm	Hash digest
SHA256	`31e0a11cb645326319c7e65d4ce5d2ef8c53a8a2dbb9d85f18093772291da88f`
MD5	`61024c49ad2707668702182335313de5`
BLAKE2b-256	`cfd4b8988f9ae24c48eb905a07d761200c554895c4e665e3228ec2ecbfa1649a`

Hashes for pyspectrakit-1.9.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d63945a24a2e0653417683e22b79b4d43cac8ab0a4f3ef6801c8bd11b7c7870d`
MD5	`3739ff3deab9998c7c01f51664fe6c94`
BLAKE2b-256	`0a0285a29c79d663314ddde6d4abf840685f37f2dcad15ef8cb7e161ece004bc`
