ChemCurry

chemcurry is a chemical curation workflow package that streamlines building curation workflows and produces detailed reports about which chemicals were flagged and when, in a way that enforces reproducibility and easy sharing. The Molecular Modeling Lab @ UNC often needs to generate these reports to show to our PI and to share our workflows with new members, so this package was developed to standardize that process.

While most chemical curation workflows for any project can be built in under 100 lines of code, the core idea behind chemcurry is to ensure reproducibility and to make workflows easy to build and share among chemists with any level of coding background. Most cheminformatics projects and publications need some type of curation and, frankly, the methods used are often not up to par with scientific reproducibility standards. We believe that lack of reproducibility hurts our field, and chemcurry aims to fix that (for at least one part of it).

Closely related to the philosophy of reproducibility, chemcurry was also designed to make it easy to add new curation functions. There is a simple API that really only requires you to write the same code you would write if you were doing it manually in a notebook or script.
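For context, the kind of manual notebook-style curation mentioned above might look like the following when written directly against RDKit. This is only a sketch and not chemcurry code: the function name `curate_manually`, its thresholds, and the example SMILES are assumptions made for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def curate_manually(smiles_list, min_mw=10.0, max_mw=100.0):
    """Hand-rolled curation: parse, filter by molecular weight, strip stereochemistry."""
    kept, flagged = [], []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)  # returns None when the SMILES cannot be parsed
        if mol is None:
            flagged.append((smi, "could not be parsed"))
            continue
        mw = Descriptors.MolWt(mol)
        if not (min_mw <= mw <= max_mw):
            flagged.append((smi, f"molecular weight {mw:.1f} outside [{min_mw}, {max_mw}]"))
            continue
        Chem.RemoveStereochemistry(mol)  # modifies the molecule in place
        kept.append(Chem.MolToSmiles(mol))  # canonical SMILES without stereo marks
    return kept, flagged


kept, flagged = curate_manually(
    ["CCCC", "C[C@H](N)C(=O)O", "CCCCCCCCCCCC", "not_a_smiles"]
)
for smi, reason in flagged:
    print(f"flagged {smi}: {reason}")
```

Even in a sketch this small, the bookkeeping of what was flagged and why has to be written by hand each time, which is the part a shared workflow and report format is meant to standardize.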

What about curation with labels or non-chemical properties?

chemcurry is designed to operate on explicit chemical properties, meaning that if a property cannot be calculated from the chemical alone, it will not fit into the workflow. If you need a curation workflow that can use external properties (say, to curate a data set with IC50 values for a machine learning/QSAR model), look into chemcurry-learn, which extends chemcurry to support this.

Installing

You can install the chemcurry package using pip:

pip install chemcurry

This package was built using poetry, so you can also install it by cloning the repository and creating a poetry environment (though this is not recommended outside of development).

git clone https://github.com/jimmyjbling/ChemCurry
cd ChemCurry
poetry install

Building and running a workflow

Building a chemical curation workflow with chemcurry requires only a few lines of code:

smiles = ["CCCC", "CCCO", "CCCCN"]

from chemcurry.workflow import CurationWorkflow
from chemcurry.steps import AddH, Add3D, FilterMW, RemoveStereochem

steps = [
    AddH(),
    Add3D(timeout=30),
    FilterMW(max_mw=100, min_mw=10),
    RemoveStereochem()
]

my_workflow = CurationWorkflow(steps=steps)
curated_chemicals = my_workflow.curate_smiles(smiles)

The result of the workflow run, a CuratedChemicalSet, contains all the information about which compounds failed curation, which compounds were altered, and why and how each change happened. You can save that information in a human-readable report by running:

curated_chemicals.write_report("path/to/my/report.txt")

You can also extract the curated chemicals, either as canonical SMILES or as RDKit Mol objects:

curated_mols = curated_chemicals.to_mols()
curated_smiles = curated_chemicals.to_smiles()

History tracking

You can optionally turn on history tracking mode if you want extremely detailed information about how chemicals evolve as they progress through curation. This comes at the expense of extra memory. All you need to do is set history_tracking=True when initializing your workflow. This saves a copy of each molecule after every update made to it, so you can render the molecule's full history. To do so, loop through the Molecule objects attached to the curation output in its molecules attribute.

Note: Right now there is not a lot you can do with history. In the future, extra features such as viewing the history of a molecule as an image might be added.
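To make the memory trade-off concrete, here is a minimal sketch of the general idea of keeping a copy after each update. It is not chemcurry's implementation: `TrackedRecord`, its fields, and the `update` method are hypothetical names used only for illustration, and a plain string stands in for a molecule.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class TrackedRecord:
    """Hypothetical stand-in for a molecule record; not a chemcurry class."""

    value: str
    track_history: bool = False
    history: list = field(default_factory=list)

    def update(self, new_value, note):
        self.value = new_value
        if self.track_history:
            # Keep a copy of the state after the update. With real molecule
            # objects this copy is what costs the extra memory.
            self.history.append((note, copy.deepcopy(self.value)))


tracked = TrackedRecord("C[C@H](N)C(=O)O", track_history=True)
tracked.update("CC(N)C(=O)O", "removed stereochemistry")

untracked = TrackedRecord("C[C@H](N)C(=O)O")
untracked.update("CC(N)C(=O)O", "removed stereochemistry")

for note, snapshot in tracked.history:
    print(f"{note}: {snapshot}")
```

Both records end in the same state; only the tracked one can show how it got there, at the cost of one stored copy per update.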

Saving, loading and sharing workflows

After making and using a workflow, there is a good chance you will want to save it, either so you can use it again later without having to redefine it, or so you can share it as part of a publication or project. You can do this by creating a workflow file (see here for more info on these files).

All you need to do is:

my_workflow.save_workflow_file("path/to/my/workflow.json")

To load an existing one, you can use:

my_workflow = CurationWorkflow.load("path/to/my/workflow.json")

Simple as that. There are some checks and other things happening under the hood to help prioritize reproducibility and prevent unexpected behavior. You can read more about how all that works here.
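As a rough illustration of why a workflow file helps reproducibility, the sketch below round-trips a workflow description through JSON and refuses to load it when the recorded package version differs. Everything here is an assumption: the file layout, the `package_version` key, and the `save_spec`/`load_spec` helpers are hypothetical and do not describe chemcurry's actual file format or its checks.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical description of a workflow: step names plus their parameters.
workflow_spec = {
    "package_version": "0.1.2",
    "steps": [
        {"name": "AddH", "params": {}},
        {"name": "Add3D", "params": {"timeout": 30}},
        {"name": "FilterMW", "params": {"max_mw": 100, "min_mw": 10}},
        {"name": "RemoveStereochem", "params": {}},
    ],
}


def save_spec(spec, path):
    """Write the description as sorted, indented JSON so files diff cleanly."""
    Path(path).write_text(json.dumps(spec, indent=2, sort_keys=True))


def load_spec(path, expected_version):
    """Read the description back, rejecting files written for another version."""
    spec = json.loads(Path(path).read_text())
    if spec.get("package_version") != expected_version:
        raise ValueError(
            f"workflow written for version {spec.get('package_version')!r}, "
            f"expected {expected_version!r}"
        )
    return spec


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "workflow.json"
    save_spec(workflow_spec, path)
    loaded = load_spec(path, expected_version="0.1.2")
    try:
        load_spec(path, expected_version="0.2.0")
        mismatch_rejected = False
    except ValueError:
        mismatch_rejected = True
```

Recording every step with its parameters, plus the version that produced the file, is what lets someone else rerun the same curation rather than an approximation of it.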

Creating custom curation steps

The curation functions that already exist in chemcurry are unlikely to cover everything you need. chemcurry defines very simple APIs that allow you to easily write your own curation steps. You can read more about how that works here.

If you do make your own, we humbly request that you submit them to chemcurry so that the community can benefit from them. Simply make a fork, push your new function (and its unit test), and then make a pull request. You can read more about contributing to chemcurry [here].

Project details

Natural Language
- English
Operating System
- OS Independent
Release history

This version

0.1.2

Dec 26, 2024

0.1.0

Dec 20, 2024

Download files

Source Distribution

chemcurry-0.1.2.tar.gz (23.5 kB)

Uploaded Dec 26, 2024 Source

Built Distribution

chemcurry-0.1.2-py3-none-any.whl (27.2 kB)

Uploaded Dec 26, 2024 Python 3

File details

Details for the file chemcurry-0.1.2.tar.gz.

File metadata

Download URL: chemcurry-0.1.2.tar.gz
Upload date: Dec 26, 2024
Size: 23.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.4 CPython/3.11.9 Windows/10

File hashes

Hashes for chemcurry-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`161ed55a0feba12225cf0a5ac8b7a845f8c9a9d2d3a20019c44a46ce91562f61`
MD5	`2cbafe932c937bd573245b0dfad9f5f8`
BLAKE2b-256	`2db157c629a8039b504e187dc57abae17b739562c7255ac7098b9143cb35ac43`

File details

Details for the file chemcurry-0.1.2-py3-none-any.whl.

File metadata

Download URL: chemcurry-0.1.2-py3-none-any.whl
Upload date: Dec 26, 2024
Size: 27.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.4 CPython/3.11.9 Windows/10

File hashes

Hashes for chemcurry-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bea9a25b6038b590e8da22e794d9bbb36fa13571c7fd3b6b673e87b85ac054ff`
MD5	`3deda493db605bda6dc466b8147a74e6`
BLAKE2b-256	`98183888334ccdcc56e51abb2186f331feac52229eda97dc19fa7139dc717c05`
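A downloaded file can be checked against the SHA256 digests listed above with Python's standard library. The sketch below is one way to do it; the helper names `sha256_of_file` and `matches_published_digest` are illustrative and not part of chemcurry or pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, after downloading the source distribution, `matches_published_digest("chemcurry-0.1.2.tar.gz", "161ed55a0feba12225cf0a5ac8b7a845f8c9a9d2d3a20019c44a46ce91562f61")` should return True if the file is intact.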
