A data analysis package for PI-ICR Mass Spectroscopy
Project description
piicrgmms
A package for implementing Gaussian mixture models as a data analysis tool in PI-ICR Mass Spectroscopy experiments. It was first developed in the Fall of 2020 to be used in PI-ICR experiments at Argonne National Laboratory (Lemont, IL, U.S.). At its core is a modified version of the 'mixture' module from the package scikit-learn. The modified version, sklearn_mixture_piicr, retains all the same components as the original version. In addition, it contains two classes with restricted fitting algorithms: a GMM fit where the phase dimension of the component means is not a parameter, and a BGM fit where the number of components is not a parameter. The rest of the package facilitates quick, intuitive use of the GMM algorithms through the use of 4 classes, and visualization methods for debugging.
1. DataFrame
- This class is responsible for processing the .lmf file and phase shifts. As attributes, it holds the processed data for easy access, as well as any data cuts.
2. GaussianMixtureModel
- This class fits Gaussian mixture models to the
DataFrame object. As parameters, it takes:
- Cartesian/Polar coordinates
- Number of components to use
- Covariance matrix type
- Information criterion
- Allows for 'strict' fits, i.e. fits where the number of components is specified.
3. BayesianGaussianModel
- Exact same as the GaussianMixtureModel class, but uses the BayesianGaussianModel class from scikit-learn instead of the GaussianMixtureModel class.
4. PhaseFirstGaussianModel
-
Implements a fit where the phase dimension is fit to first, followed by a GMM fit to both spatial dimensions in which the phase dimension of the component means is fixed. This type of fit was found to work especially well with data sets in which there were many species, like the 168Ho data.
-
Only works with Polar coordinates
Each model class also includes the ability to visualize results in several ways (clustering results, One-dimensional histograms, Probability density function) and the ability to copy fit results to the clipboard for pasting into an Excel spreadsheet.
Installation
Dependencies
piicrgmms requires:
- Python (>=3.6)
- scikit-learn (>=0.23.2)
- pandas (>=1.2.0)
- matplotlib (>=3.3.0)
- lmfit (>=1.0.0)
- joblib (>=1.0.0)
- tqdm (>=4.56.0)
- pillow (>=8.1.0)
- webcolors(>=1.11.1)
User Installation
Assuming Python and pip
have already been installed, decide
whether you want a system-wide or local installation, and
which Python distribution (e.g. Anaconda) you want to
install under. Then, open the Command Prompt (for regular
Python distribution) or the Prompt for another distribution
(e.g. Anaconda Prompt for Anaconda), and run either:
pip install piicrgmms
for a system-wide installation (works for regular Python distributions only), ORpip install -U piicrgmms
for a local installation.
If you want to install in a virtual environment instead, then navigate to the virtual environment's directory, activate the virtual environment, and install with the commands above.
Source code
You can check the latest source code with the command
git clone https://github.com/colinweber27/piicrgmms
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for piicrgmms-0.0.0-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 081d22e4d4af7c998ed26d39631b3632e73e419136515b3205cfee1a61487ca8 |
|
MD5 | fb63d35185c0afb894307ac62fa175cb |
|
BLAKE2b-256 | e819e42ec23fc3196598e46b430120841a93cd176a61e84ba323cbb5859c45a0 |