Useful functions and scripts for working with small molecules.

Chem Func

Installation

Optionally, create a conda environment.

conda create -y -n chemfunc python=3.12
conda activate chemfunc

Install the latest version of Chem Func using pip.

pip install chemfunc

Alternatively, clone the repository and install the local version of the package.

git clone https://github.com/swansonk14/chemfunc.git
cd chemfunc
pip install -e .

Note: If you encounter the error ImportError: libXrender.so.1: cannot open shared object file: No such file or directory, run conda install -c conda-forge xorg-libxrender.

Features

Chem Func contains a variety of useful functions and scripts for working with small molecules.

Functions can be imported from the chemfunc package. For example:

from pathlib import Path
from chemfunc.sdf_to_smiles import sdf_to_smiles

sdf_to_smiles(
    data_path=Path('molecules.sdf'),
    save_path=Path('molecules.csv')
)

Most modules can also be run as scripts from the command line using the chemfunc command along with the appropriate function name. For example:

chemfunc sdf_to_smiles \
    --data_path molecules.sdf \
    --save_path molecules.csv

To see a list of available scripts, run chemfunc -h.

For each script, run chemfunc <script_name> -h to see a description of the arguments for that script.

Contents

Below is a list of the contents of the package.

canonicalize_smiles.py (function, script)

Canonicalizes SMILES using RDKit canonicalization and optionally strips salts.

chemical_diversity.py (function, script)

Computes the chemical diversity of a set of molecules in terms of Tanimoto distances.
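
As a rough illustration of what a Tanimoto-distance-based diversity measure looks like, the sketch below represents each fingerprint as a set of on-bit indices and averages the pairwise distances. This is not the chemfunc implementation; the function names, the set representation, and the choice of the mean as the summary statistic are all assumptions made for the example.

```python
from itertools import combinations


def tanimoto_distance(fp_a: set, fp_b: set) -> float:
    """Return 1 - |A & B| / |A | B| for two sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        # Assumption: two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)


def mean_pairwise_distance(fps: list) -> float:
    """Average Tanimoto distance over all unordered pairs of fingerprints."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        raise ValueError("Need at least two fingerprints.")
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy fingerprints (sets of on-bit indices), not real molecules.
toy_fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = mean_pairwise_distance(toy_fps)
print(round(diversity, 3))
```

A value near 1 suggests the molecules share few fingerprint bits; a value near 0 suggests a set of close analogs.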

cluster_molecules.py (function, script)

Performs k-means clustering to cluster molecules based on Morgan fingerprints.

compute_properties.py (function, script)

Computes one or more molecular properties for a set of molecules.

convert_sdf.py (functions)

Functions to convert SDF files to SMILES or SMARTS. Used by sdf_to_smiles and sdf_to_smarts.

deduplicate_smiles.py (function, script)

Deduplicates a CSV file by SMILES.
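
The idea can be sketched with the standard library alone. The example below keeps the first row seen for each SMILES string; the column name smiles, the keep-first policy, and exact string matching (no canonicalization) are assumptions for illustration and may differ from what the chemfunc script does.

```python
import csv
import io


def deduplicate_rows(rows, smiles_column="smiles"):
    """Keep the first row for each distinct SMILES string (exact match)."""
    seen = set()
    unique = []
    for row in rows:
        key = row[smiles_column]
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical in-memory CSV standing in for a file on disk.
toy_csv = "smiles,activity\nCCO,1\nc1ccccc1,0\nCCO,1\n"
rows = list(csv.DictReader(io.StringIO(toy_csv)))
unique_rows = deduplicate_rows(rows)
print([row["smiles"] for row in unique_rows])
```

Because matching is on the raw string, two different SMILES strings for the same molecule would both be kept in this sketch; canonicalizing first avoids that.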

filter_molecules.py (function, script)

Filters molecules to those with values in a certain range.
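
A minimal sketch of range filtering is shown below. The function name, the inclusive bounds, and the optional min_value and max_value parameters are assumptions for this example, not the script's documented arguments; run chemfunc filter_molecules -h for the real ones.

```python
def filter_by_range(rows, column, min_value=None, max_value=None):
    """Keep rows whose value in `column` lies within [min_value, max_value].

    Assumption: both bounds are inclusive, and a bound of None is not applied.
    """
    kept = []
    for row in rows:
        value = float(row[column])
        if min_value is not None and value < min_value:
            continue
        if max_value is not None and value > max_value:
            continue
        kept.append(row)
    return kept


# Hypothetical rows; the "logp" values are made up for illustration.
toy_rows = [
    {"smiles": "CCO", "logp": "-0.1"},
    {"smiles": "c1ccccc1", "logp": "2.0"},
    {"smiles": "CCCCCCCC", "logp": "4.5"},
]
in_range = filter_by_range(toy_rows, "logp", min_value=0.0, max_value=3.0)
print([row["smiles"] for row in in_range])
```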

measure_experimental_reproducibility.py (function, script)

Measures the experimental reproducibility of two biological replicates by using one replicate to predict the other.

molecular_fingerprints.py (functions, script)

Contains functions to compute fingerprints for molecules. Parallelized for speed. The function save_fingerprints can be used as a script to compute fingerprints from a CSV file and save them as an NPZ file.

molecular_properties.py (functions)

Contains functions to compute molecular properties. Parallelized for speed.

molecular_similarities.py (functions)

Contains functions to compute similarities between molecules. Parallelized for speed.

nearest_neighbor.py (function, script)

Given a dataset of molecules, computes the nearest neighbor molecule in a second dataset using one of several similarity metrics.
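
To make the nearest-neighbor idea concrete, the sketch below finds the most similar reference fingerprint for a query using Tanimoto similarity on sets of on-bit indices. The names, the set representation, and the use of Tanimoto as the metric are assumptions for illustration; the chemfunc script supports several metrics and has its own interface.

```python
def tanimoto_similarity(fp_a: set, fp_b: set) -> float:
    """Return |A & B| / |A | B| for two sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        # Assumption: two empty fingerprints are treated as identical.
        return 1.0
    return len(fp_a & fp_b) / len(union)


def nearest_neighbor(query: set, reference: dict):
    """Return (name, similarity) of the reference entry most similar to query."""
    best_name = max(
        reference, key=lambda name: tanimoto_similarity(query, reference[name])
    )
    return best_name, tanimoto_similarity(query, reference[best_name])


# Hypothetical reference set mapping a molecule name to its fingerprint bits.
toy_reference = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 9},
    "mol_c": {7, 8},
}
name, similarity = nearest_neighbor({1, 2, 3}, toy_reference)
print(name, similarity)
```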

plot_property_distribution.py (function, script)

Plots the distribution of molecular properties of a set of molecules.

plot_tsne.py (function, script)

Runs t-SNE on molecular fingerprints from one or more chemical libraries.

regression_to_classification.py (function, script)

Converts regression data to classification data using given thresholds.
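
One way to picture threshold-based conversion is to count how many thresholds each value meets or exceeds. The sketch below does this with bisect; the function name and the convention that a value equal to a threshold falls in the higher class are assumptions, and the script itself may use a different convention (for example, treating low values as the positive class).

```python
from bisect import bisect_right


def to_class_labels(values, thresholds):
    """Map each value to the number of thresholds it meets or exceeds.

    With one threshold this gives binary labels (0 or 1); with k thresholds
    it gives k + 1 ordered classes. Assumption: value == threshold goes up.
    """
    ordered = sorted(thresholds)
    return [bisect_right(ordered, value) for value in values]


# Hypothetical regression targets and thresholds.
labels = to_class_labels([0.1, 0.5, 0.9, 2.0], thresholds=[0.5, 1.0])
print(labels)
```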

sample_molecules.py (function, script)

Samples molecules from a CSV file, either uniformly at random across the entire dataset or uniformly at random from each cluster within the data.
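
The two sampling modes can be sketched as follows. The function names, the cluster column name, and the fixed seed are assumptions for this example rather than the script's actual arguments.

```python
import random
from collections import defaultdict


def sample_uniform(rows, n, seed=0):
    """Sample n rows uniformly at random from the whole dataset."""
    return random.Random(seed).sample(rows, n)


def sample_per_cluster(rows, n_per_cluster, cluster_key="cluster", seed=0):
    """Sample up to n_per_cluster rows uniformly at random from each cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row[cluster_key]].append(row)
    sampled = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        # Small clusters contribute all of their members.
        sampled.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sampled


# Hypothetical rows with made-up cluster labels.
toy_rows = [
    {"smiles": "CCO", "cluster": 0},
    {"smiles": "CCCO", "cluster": 0},
    {"smiles": "CCCCO", "cluster": 0},
    {"smiles": "c1ccccc1", "cluster": 1},
    {"smiles": "Cc1ccccc1", "cluster": 1},
    {"smiles": "CC(=O)O", "cluster": 2},
]
whole = sample_uniform(toy_rows, 3)
stratified = sample_per_cluster(toy_rows, 2)
print(len(whole), len(stratified))
```

Per-cluster sampling guarantees that small clusters are represented, whereas sampling across the whole dataset tends to favor large clusters.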

sdf_to_smarts.py (function, script)

Converts an SDF file to a CSV file with SMARTS.

sdf_to_smiles.py (function, script)

Converts an SDF file to a CSV file with SMILES.

select_from_clusters.py (function, script)

Selects the best molecule from each cluster.
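
A minimal sketch of per-cluster selection is given below, assuming that "best" means the highest value in a score column. The function name, the column names, and the higher-is-better convention are assumptions for illustration; the script's own options determine how molecules are actually ranked.

```python
def select_best_per_cluster(rows, cluster_key="cluster", score_key="score"):
    """Return the highest-scoring row from each cluster, ordered by cluster label.

    Assumption: higher scores are better; ties keep the first row seen.
    """
    best = {}
    for row in rows:
        label = row[cluster_key]
        if label not in best or row[score_key] > best[label][score_key]:
            best[label] = row
    return [best[label] for label in sorted(best)]


# Hypothetical rows with made-up scores and cluster labels.
toy_rows = [
    {"smiles": "CCO", "cluster": 0, "score": 0.2},
    {"smiles": "CCCO", "cluster": 0, "score": 0.9},
    {"smiles": "c1ccccc1", "cluster": 1, "score": 0.4},
    {"smiles": "Cc1ccccc1", "cluster": 1, "score": 0.1},
]
selected = select_best_per_cluster(toy_rows)
print([row["smiles"] for row in selected])
```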

smiles_to_svg.py (function, script)

Converts a SMILES string to an SVG image of the molecule.

visualize_molecules.py (function, script)

Converts a file of SMILES to images of molecular structures.

visualize_reactions.py (function, script)

Converts a file of reaction SMARTS to images of chemical reactions.

Download files

Source Distribution

chemfunc-1.0.10.tar.gz (19.6 kB)

Uploaded Source

Built Distribution

chemfunc-1.0.10-py3-none-any.whl (31.8 kB)

Uploaded Python 3

File details

Details for the file chemfunc-1.0.10.tar.gz.

File metadata

  • Download URL: chemfunc-1.0.10.tar.gz
  • Upload date:
  • Size: 19.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for chemfunc-1.0.10.tar.gz
Algorithm Hash digest
SHA256 8e664469d63b9858922193167a11443cae505906f19d29e4e14bb9e1f4aa3ff7
MD5 cdd78be7470701c1997172f7acd9413d
BLAKE2b-256 e72f045b6befffcff4b753d7c1f75f68eda413a3ae676b9fb1a8143e96b3800a

File details

Details for the file chemfunc-1.0.10-py3-none-any.whl.

File metadata

  • Download URL: chemfunc-1.0.10-py3-none-any.whl
  • Upload date:
  • Size: 31.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for chemfunc-1.0.10-py3-none-any.whl
Algorithm Hash digest
SHA256 e0c8a689890cfcd3ff49a4025993b5953c528887485b88e9fa7d5d6a7ebb25a1
MD5 f7e749c8cacb4f9cf9d9058e63e0aa5b
BLAKE2b-256 8b99fa9c452537d9590c8b7c0ddcc5f524eb3ab93a37b87a50b446c3991152a9
