Tool for automatic analysis of multiple HPLC results
HPLC data analysis in Python
The hplc_data_analysis tool automates the typical analysis of HPLC data, saving time, avoiding human error, and increasing comparability of results from different groups.
Some key features:
- handle multiple HPLC semi-quantitative data tables (obtained with different methods)
- build a database of all identified compounds and their relevant properties using PubChemPy
- split each compound into its functional groups using a published fragmentation algorithm
- produce single-file reports, single-replicate reports (a replicate being the combination of several methods applied to one vial), comprehensive multi-sample reports (average and deviation of replicates), and aggregated reports based on functional group mass fractions in the samples
- provide plotting capabilities
Framework
File
A .txt or .csv file located in the project folder that contains time, area, and concentration information for many compounds from a single measurement.
Depending on the instrument export parameters, the structure of Files can differ slightly. The Project-Parameters ensure that every File is loaded into the same data structure, so that all downstream computations can be performed.
A good naming convention for Files ensures that the code handles replicates of the same sample correctly. Filenames must follow the convention:
method_name-of-sample-with-dashes-only_replicatenumber
Examples that are correctly handled:
- 210_Bio-oil-foodwaste-250C_1
- 254_Bio-oil-foodwaste-250C_1
- 210_FW_2
- 254_FW_2
Examples of NON-ACCEPTABLE names (the underscores that separate method, sample name, and replicate number are missing or misplaced):
- 210-bio_oil_1
- 254-FW1
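The naming convention above can be checked with a small parser. The following is a minimal sketch, not part of hplc_data_analysis: the function name parse_filename and the regular expression are hypothetical, and the sketch assumes that the method contains neither dashes nor underscores and that the replicate number is an integer.

```python
import re
from typing import NamedTuple

# Hypothetical pattern (an assumption, not taken from the package):
# method _ sample-name-with-dashes-only _ replicate number
_FILENAME_PATTERN = re.compile(
    r"^(?P<method>[^_-]+)_(?P<sample>[^_]+)_(?P<replicate>\d+)$"
)


class FileName(NamedTuple):
    method: str
    sample: str
    replicate: int


def parse_filename(name: str) -> FileName:
    """Split a filename (without extension) into method, sample and replicate."""
    match = _FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow method_sample-name_replicate")
    return FileName(match["method"], match["sample"], int(match["replicate"]))


if __name__ == "__main__":
    for name in ["210_Bio-oil-foodwaste-250C_1", "254_FW_2", "210-bio_oil_1", "254-FW1"]:
        try:
            print(name, "->", parse_filename(name))
        except ValueError as error:
            print(name, "-> rejected:", error)
```

Under these assumptions the four acceptable examples are parsed and the two non-acceptable ones are rejected.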
Replicate
If several Files belong to the same material (Sample, see below) but were obtained with different methods that detect different compounds (for example, different detector wavelengths), they can be merged into the same Replicate.
A Replicate is the union of Files obtained with different methods that are complementary in the analysis of a material.
Sample
A collection of Replicates that repeat the same measurement and allow reproducibility to be assessed.
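The File, Replicate, and Sample hierarchy can be illustrated by grouping filenames on their sample name and replicate number. This is only a sketch of the idea under the naming convention described above; split_filename and group_files are hypothetical helpers, not functions of hplc_data_analysis.

```python
from collections import defaultdict


def split_filename(name):
    """Split 'method_sample-name_replicate' on its two underscores."""
    method, sample, replicate = name.split("_")
    return method, sample, int(replicate)


def group_files(filenames):
    """Return {sample: {replicate number: [filenames]}}."""
    samples = defaultdict(lambda: defaultdict(list))
    for name in filenames:
        _method, sample, replicate = split_filename(name)
        samples[sample][replicate].append(name)
    # Convert the nested defaultdicts to plain dicts for easier inspection
    return {sample: dict(replicates) for sample, replicates in samples.items()}


if __name__ == "__main__":
    filenames = ["210_FW_1", "254_FW_1", "210_FW_2", "254_FW_2"]
    print(group_files(filenames))
```

Here the Sample FW has two Replicates, and each Replicate is the union of the two Files measured with methods 210 and 254.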
Project
The folder path indicates where the Files are located and where the output folder will be created.
The Project-Parameters are valid for each Sample.
The Project can generate Reports and Plots for all Files, Replicates, or Samples, or only for some of them.
Reports
Reports contain the results for one parameter (abbreviated as param) for all Files, Replicates, or Samples.
There are two types of reports:
Reports (simple-reports or compound-reports)
These reports give the param value for each compound in each File, Replicate, or Sample.
Example: the values of conc_vial_mg_L for each compound in each File are collected in a single pandas DataFrame (and saved as an Excel worksheet) for easy comparison.
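The layout of such a report can be sketched with plain pandas. This is not the package's own code: the concentration values are made up for illustration, and leaving compounds that are absent from a File as NaN is an assumption about how missing values might be represented.

```python
import pandas as pd

# Made-up conc_vial_mg_L values per File (illustrative only, not real data)
conc_vial_mg_L = {
    "210_FW_1": {"furfural": 12.0, "phenol": 3.5},
    "254_FW_1": {"phenol": 3.0, "acetic acid": 8.0},
}

# One column per File, one row per compound; compounds that a File
# does not contain are left as NaN.
report = pd.DataFrame(conc_vial_mg_L).sort_index()

if __name__ == "__main__":
    print(report)
```

A table like this can then be written to a worksheet with DataFrame.to_excel, which needs an Excel engine such as openpyxl to be installed.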
Aggrreps (aggregated reports)
These reports give the param value for each aggregated functional group in each File, Replicate, or Sample.
The results for individual compounds are aggregated by functional group (see this paper for details).
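One plausible reading of this aggregation is a mass-fraction-weighted sum: each compound's value is distributed over its functional groups in proportion to their mass fractions, and the contributions are summed per group. The sketch below illustrates that reading only. The function name, the placeholder compound and group names, and all numbers are hypothetical, and the package may implement the step differently.

```python
def aggregate_by_functional_group(compound_values, group_mass_fractions):
    """Sum value * mass fraction over all compounds, separately for each group.

    compound_values: {compound: param value}
    group_mass_fractions: {compound: {functional group: mass fraction}}
    """
    aggregated = {}
    for compound, value in compound_values.items():
        for group, fraction in group_mass_fractions[compound].items():
            aggregated[group] = aggregated.get(group, 0.0) + value * fraction
    return aggregated


if __name__ == "__main__":
    # Placeholder names and made-up numbers (not real chemistry)
    values = {"compound_X": 10.0, "compound_Y": 4.0}
    fractions = {
        "compound_X": {"group_1": 0.5, "group_2": 0.5},
        "compound_Y": {"group_1": 0.25, "group_3": 0.75},
    }
    print(aggregate_by_functional_group(values, fractions))
```

With these inputs, group_1 collects 10.0 × 0.5 + 4.0 × 0.25 = 6.0, group_2 collects 5.0, and group_3 collects 3.0.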
Plots
Each report can be plotted using the plot_report method of the Project class.
Documentation
Check out the documentation.
Installation
You can install the package from PyPI:
pip install hplc_data_analysis
Examples
Each example is available as a folder in the examples folder and contains the code and the necessary input data.
To run the examples:
- Install hplc_data_analysis in your Python environment
- Download the folder that contains the example
- Run the code
- If you run the scripts as Jupyter Notebooks, replace the relative path at the beginning of each example with the absolute path to the folder where the code is located
Plotting with myfigure
Plots rely on myfigure, a package that simplifies scientific plotting in data analysis packages.
Check out its documentation and GitHub.