Package to remove cosmic spikes from Raman Spectra.
Spyky
Spyky incorporates the removal of cosmic spikes from Raman spectra, as described by Whitaker & Hayes [1], into a Python package compatible with sklearn pipelines and parameter optimization.
Reading .spc files
Spyky can read several .spc files stored in one directory, using the spc-io package. Currently spyky only supports .spc files with a global X array and a single Y array.
read_spc() returns: the spectra in an array of shape (n_files, n_wavelengths), meaning one row is one spectrum; the wavelengths (note that all .spc files must have the same wavelengths); the names of the read files.
>>> from spyky.reader import read_spc
>>> path = r"./spectra_bin"
>>> spectra, wavelength, names = read_spc(path)
>>> print(spectra)
[[ 684. 721. 776. ... 22819. 22517. 22036.]
[ 667. 724. 770. ... 22575. 22275. 21819.]
[ 676. 726. 775. ... 22618. 22346. 21851.]]
>>> print(wavelength)
[ 32.1 34.4 36.6 ... 3286.9 3287.8 3288.7 ]
>>> print(names)
['example_file_1.spc', 'example_file_2.spc', 'test_file_1.spc']
By default, all .spc files in the specified path are read. Specify pattern to filter which files to read; patterns are matched with fnmatch.
>>> path = r"./spectra_bin"
>>> s, w, names = read_spc(path, pattern="example*.spc")
>>> print(names)
['example_file_1.spc', 'example_file_2.spc']
>>> s, w, names = read_spc(path, pattern="example*1*.spc")
>>> print(names)
['example_file_1.spc']
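To experiment with patterns before reading any files, the matching can be tried on plain file names with the standard library's fnmatch module. This is only a sketch of the wildcard rules; the variable names are illustrative, and how read_spc() applies the pattern internally is not shown here.

```python
import fnmatch

# File names taken from the example above.
names = ["example_file_1.spc", "example_file_2.spc", "test_file_1.spc"]

# "*" matches any run of characters.
first_files = fnmatch.filter(names, "*_1.spc")

# "?" matches exactly one character.
example_files = fnmatch.filter(names, "example_file_?.spc")

# "[!e]" matches one character that is not "e".
non_example_files = fnmatch.filter(names, "[!e]*.spc")

print(first_files)        # ['example_file_1.spc', 'test_file_1.spc']
print(example_files)      # ['example_file_1.spc', 'example_file_2.spc']
print(non_example_files)  # ['test_file_1.spc']
```

Note that fnmatch normalises case according to the operating system, so matching is case-insensitive on Windows and case-sensitive on most other platforms.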
There is also the option to export the read files as a .csv by specifying export_to. The header of the .csv file will contain the wavelengths.
>>> s, w, names = read_spc(path, export_to=r"./spectra.csv")
>>> print(names)
['example_file_1.spc', 'example_file_2.spc', 'test_file_1.spc']
Spike Removal
The class DeSpike is written so that it integrates into sklearn preprocessing pipelines and is compatible with hyperparameter optimization tools like GridSearchCV. It therefore implements .fit() and .transform() methods, both of which take the spectra as input. First use .fit() to calculate the modified z-scores, then use .transform() to perform the correction, as explained in [1].
>>> from spyky.spikes import DeSpike
>>> spiky = DeSpike(window=5, threshold=6)
>>> spiky.fit(spectra)
>>> despiked = spiky.transform(spectra)
>>> print(despiked)
[[ 802.75 721. 776. ... 22819. 22517. 22982. ]
[ 811. 731. 783. ... 22947. 22662. 23119.6 ]
[ 796.5 724. 770. ... 22575. 22275. 22719.6 ]]
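For readers who want to see the idea behind the modified z-score step, here is a minimal standard-library sketch of the approach described in [1] for a single spectrum. It is not spyky's implementation: the function names and the half_width parameter are assumptions made for this illustration, and half_width need not correspond to spyky's window.

```python
from statistics import median


def modified_z_scores(values):
    """Return 0.6745 * (x - median) / MAD for each x in values."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread at all: nothing can be called an outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def despike(spectrum, half_width=2, threshold=6.0):
    """Replace flagged points by the mean of their unflagged neighbours.

    half_width is a hypothetical parameter of this sketch.
    """
    # Differencing turns a narrow spike into a pair of extreme steps.
    diffs = [b - a for a, b in zip(spectrum, spectrum[1:])]
    scores = modified_z_scores(diffs)
    # diffs[i] is the step from point i to point i + 1; flag the landing point.
    flagged = [i + 1 for i, z in enumerate(scores) if abs(z) > threshold]
    flagged_set = set(flagged)

    cleaned = list(spectrum)
    for i in flagged:
        lo = max(0, i - half_width)
        hi = min(len(spectrum), i + half_width + 1)
        neighbours = [spectrum[j] for j in range(lo, hi) if j not in flagged_set]
        if neighbours:  # keep the original value if every neighbour is flagged
            cleaned[i] = sum(neighbours) / len(neighbours)
    return cleaned, flagged


# Toy spectrum with one artificial spike at index 5.
spectrum = [10, 11, 10, 12, 11, 500, 12, 11, 10, 11, 12, 10]
cleaned, flagged = despike(spectrum)
print(flagged)       # [5, 6]
print(max(cleaned))  # 12
```

Because both the step into the spike and the step out of it are extreme, the point right after the spike is flagged as well; it is simply replaced by the mean of its unflagged neighbours.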
In a pipeline this might look like:
>>> from sklearn.pipeline import make_pipeline
>>> from spyky.reader import read_spc
>>> from spyky.spikes import DeSpike
>>> s, w, n = read_spc(r"./spectra_bin")
>>> pipe = make_pipeline(DeSpike(window=5, threshold=6))
>>> pipe.fit(s)
>>> corrected = pipe.transform(s)
>>> print(corrected)
[[ 802.75 721. 776. ... 22819. 22517. 22982. ]
[ 811. 731. 783. ... 22947. 22662. 23119.6 ]
[ 796.5 724. 770. ... 22575. 22275. 22719.6 ]]
Use "despike__window" and "despike__threshold" as keys in param_grid to test different values with GridSearchCV.
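The key names follow sklearn's step__parameter convention: make_pipeline names each step after its lowercased class name, so DeSpike becomes "despike". A possible grid could look like the sketch below; the candidate values are purely illustrative, and the expansion with itertools only serves to show which combinations a grid search would try.

```python
from itertools import product

# Keys are "<step name>__<parameter name>"; the values are illustrative guesses.
param_grid = {
    "despike__window": [3, 5, 7],
    "despike__threshold": [4, 6, 8],
}

# A grid search evaluates every combination of the listed values.
candidates = [
    dict(zip(param_grid, combo)) for combo in product(*param_grid.values())
]

print(len(candidates))  # 9
print(candidates[0])    # {'despike__window': 3, 'despike__threshold': 4}
```

To actually run the search, param_grid would be passed to GridSearchCV together with a pipeline. Since GridSearchCV needs something to score, the pipeline has to end in an estimator, or a scoring function has to be supplied.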
Sometimes the algorithm falsely identifies steep sections of a normal spectrum as spikes. If this happens, you can use the ignore and ignore_ref parameters to supply an array containing the wavelengths you want to be ignored. If you do not supply ignore_ref, the index of the wavenumber array is used instead. Please note that your input to ignore must match that of ignore_ref; this is easiest to achieve with a mask, as in the example below.
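# w is the wavelength array returned by read_spc(); mask the region to protect
wcut = (w > 500) & (w < 1000)
# pass the masked wavelengths as ignore and the full array as ignore_ref
spiky = DeSpike(threshold=3.3, ignore=w[wcut], ignore_ref=w)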
References
[1] D. A. Whitaker and K. Hayes, "A simple algorithm for despiking Raman spectra," Chemometrics and Intelligent Laboratory Systems, vol. 179, pp. 82-84, Aug. 2018, doi: 10.1016/j.chemolab.2018.06.009.