
cinnabar (formerly Arsenic)

[Badges: CI | codecov | Documentation Status | DOI]

Reporting relative free energy results

Issue: we need to report statistics consistently, and we would like to plot these results consistently too.

Solution: a package that reliably accepts relative free energy results and is untied to any particular method, system, or software package. To that end, the input should be as close to the raw, unconverted data as possible.

USAGE

python cinnabar.py example.csv

OPTIONS

python cinnabar.py --help
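
For orientation, here is a sketch of the kind of data an input file like example.csv might carry, based on the terminology and error columns discussed below. The block and column names here are illustrative assumptions, not a specification; check the documentation for the exact format the parser expects.

```
# Experimental block: one absolute value per ligand
# Ligand, expt_DG, expt_dDG
lig_A, -8.8, 0.1
lig_B, -9.5, 0.1
lig_C, -9.1, 0.1

# Calculated block: one relative value per simulated edge
# Ligand1, Ligand2, calc_DDG, calc_dDDG(MBAR), calc_dDDG(additional)
lig_A, lig_B, -0.6, 0.1, 0.2
lig_B, lig_C,  0.5, 0.1, 0.2
```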

Terminology

D denotes a difference (i.e. relative), while d denotes a variance (i.e. an error bar): dDG would be the variance (error bar) of an absolute free energy, and DDG would be the relative free energy between two molecules.
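
Spelled out for an edge from ligand A to ligand B, this notation gives DDG(A→B) = DG(B) − DG(A), with dDG and dDDG as the corresponding error bars.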

Plots to output

There are two ways of thinking about the results of free energy simulations. One is as a method developer, who cares about how far a simulation lands from the true experimental value. The other is as a drug designer: how does all the information from this method actually help me pick which molecule to make next? In either case, statistics should definitely be printed on the plots.

DDG’s

These should represent the primary data (i.e. for the method developer): the direct output of the relative free energy simulations. There is still discussion to be had about the best way to report these. Open questions include:

  • Should we report only the edges that were run, or all edges?

  • Should we symmetrise?

    If we only report the edges we ran, it becomes harder to compare results generated with different sets of edges for the same system; i.e. if I run only the easy edges, I will look better than another method that ran more edges. Plotting all edges gets around this, but moves us further from the primary data and is somewhat redundant with the DG plot.

    Correlation statistics depend on the sign chosen for each edge, so if we are to report them, symmetrising is the only way to make them robust (see the sketch after this list). An alternative is to neither symmetrise nor report correlation statistics, reporting only RMSE and AUE for these plots.

    If we are using these primary-data plots, it should be very clear which edges are being plotted, so that we know whether we are comparing one network to another. Perhaps a NetworkX graph should be attached.
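
To make the sign-dependence concrete, here is a minimal standalone sketch (illustrative numbers, plain numpy, not cinnabar's own code): reversing the direction of an edge flips the sign of both its calculated and experimental DDG, so the prediction itself is unchanged, yet the correlation statistic can change completely.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    return np.corrcoef(x, y)[0, 1]

# Hypothetical calculated vs. experimental DDGs (kcal/mol) for four edges,
# all written in the same (arbitrary) A -> B direction.
expt = np.array([1.0, 1.1, 1.2, 1.3])
calc = np.array([1.3, 1.2, 1.1, 1.0])
print(pearson_r(calc, expt))   # ~ -1.0 with this choice of edge directions

# Reverse the direction of the last two edges: both the calculated and the
# experimental DDG change sign together, so the predictions are unchanged,
# yet the correlation statistic is now completely different.
flip = np.array([1, 1, -1, -1])
print(pearson_r(calc * flip, expt * flip))   # ~ +1.0
```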

DG’s

These should represent the overall result (i.e. for the drug designer), where the relative free energies are combined consistently (i.e. using MLE) to convert the available DDG's into DG's. As there can only be N_ligands data points on these plots, any statistics can be used, but rank-ordering measures are probably the most useful.
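
As a rough illustration of the MLE idea, here is a generic weighted least-squares sketch over a hypothetical three-ligand network (not cinnabar's implementation): under Gaussian edge errors, the maximum likelihood node values are the weighted least-squares solution of the edge equations, determined only up to an additive constant.

```python
import numpy as np

# Hypothetical network: 3 ligands (0, 1, 2) and 3 measured edges (i -> j).
edges = [(0, 1), (1, 2), (0, 2)]
ddg   = np.array([1.0, -0.4, 0.7])   # calc DDG_ij = DG_j - DG_i (kcal/mol)
sigma = np.array([0.2,  0.2, 0.3])   # per-edge error bars

n_ligands = 3
# Incidence matrix A: each row encodes DG_j - DG_i for one edge.
A = np.zeros((len(edges), n_ligands))
for row, (i, j) in enumerate(edges):
    A[row, i], A[row, j] = -1.0, 1.0

# Under Gaussian edge errors, the MLE of the node DGs is the weighted
# least-squares solution; the absolute offset is unidentifiable, so shift
# the estimates afterwards (here: relative to the network mean).
w = 1.0 / sigma**2
dg, *_ = np.linalg.lstsq(A * np.sqrt(w)[:, None], ddg * np.sqrt(w), rcond=None)
dg -= dg.mean()
print(np.round(dg, 3))
```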

Statistics

  • RMSE - this is good

  • MUE - this has issues when comparing between targets, as it depends on the dynamic range (noted by C. Bayly), but less so when comparing between methods. C. Bayly suggested Relative Absolute Error. Additionally, GRAM from GSK would be a good measure to incorporate ("GRAM: A True Null Model for Relative Binding Affinity Predictions", J. Chem. Inf. Model.)

  • R2/Kendall tau etc. (correlation coefficients) - there are issues with using these statistics on some DDG plots; they carry more useful meaning for DG results (see `examples/WhyNotToUseR2ForDDG.ipynb`)
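
For reference, a minimal sketch of how these statistics could be computed for a set of per-ligand DG estimates (illustrative numbers; plain numpy/scipy rather than cinnabar's own statistics code):

```python
import numpy as np
from scipy import stats

# Hypothetical per-ligand DG estimates vs. experiment (kcal/mol).
calc = np.array([-9.1, -8.2, -10.4, -7.9, -9.8])
expt = np.array([-9.4, -8.0, -10.1, -8.5, -9.6])

diff = calc - expt
rmse = np.sqrt(np.mean(diff**2))             # root-mean-square error
mue  = np.mean(np.abs(diff))                 # mean unsigned (absolute) error
r2   = np.corrcoef(calc, expt)[0, 1] ** 2    # squared Pearson correlation
tau, _ = stats.kendalltau(calc, expt)        # rank-ordering agreement

print(f"RMSE={rmse:.2f}  MUE={mue:.2f}  R2={r2:.2f}  Kendall tau={tau:.2f}")
```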

Errors

How do we compare errors? Several sources:

  • MBAR

  • Repeats (same simulation again)

  • Repeats (forward/backward variety)

  • Cycle closures

  • Other sources (?)

We would like to handle these consistently. The input to the software should carry two errors: (a) one generated from pymbar, as these are the de facto standard, and (b) another column containing any other errors that may be generated, which may be used to try to compensate for the underestimation of the MBAR errors.
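
As one possible, purely illustrative way the two columns might be combined into a single error bar for plotting (an assumption, not necessarily what cinnabar does): treat the sources as independent and add them in quadrature.

```python
import numpy as np

# Hypothetical edge uncertainties (kcal/mol).
dDDG_mbar       = 0.11   # column (a): statistical error from pymbar/MBAR
dDDG_additional = 0.25   # column (b): e.g. scatter over repeats or cycle closure

# One simple (assumed) combination rule: treat the sources as independent and
# add them in quadrature. This is an illustration, not necessarily how
# cinnabar itself treats the two columns.
dDDG_total = np.sqrt(dDDG_mbar**2 + dDDG_additional**2)
print(f"combined error bar: {dDDG_total:.2f} kcal/mol")
```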

Plot styles - it may be impossible to agree completely on a plot style (and maybe that is not necessary)

Colours? Colourblind friendly?

Different colors for distance from equality (like David Hahn/de Groot lab)?

Error bars style?

Guidelines at n units from equality?

TODO (move this to project board)

  • Generate a set of plots that people are happy with

  • Add GRAM analysis for MUE

  • Incorporate edge errors into the bootstrapping?

  • Handle repeats properly

  • Handle forwards and backwards edges properly

  • Have an entry point for absolute free energies too

  • Plots that look at other success metrics? i.e. a histogram of errors? (one like in METK?)

  • Currently everything is plotted against experiment; we would like to do forcefield X vs. forcefield Y too

Copyright

Copyright (c) 2021, Hannah Bruce Macdonald

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.1.

