Cheminformatics tools for astrochemistry

Doing astrochemistry with robots.

The astrochem_ml package is designed to bring accessible cheminformatics to astrochemical discovery. The main features, some of which are still under active development, are interfaces to common RDKit operations that are relevant to astrochemistry, and pre-trained embedding models ready for machine learning projects that combine molecules and astrophysics.

The plan is to deliver a general-purpose library, along with a command line interface for several common tasks.

Installation

The package is not yet on PyPI, so for now you can install astrochem_ml directly from GitHub:

`pip install git+https://github.com/laserkelvin/astrochem_ml`

Features

Molecule generation

Much of the functionality wraps the rdkit package, the main library for cheminformatics in Python. For all molecule interactions, we convert back and forth between native rdkit objects and SMILES/SMARTS strings.

  • Exhaustive isotopologue generation in SMILES

>>> from astrochem_ml.smiles import isotopes
# exhaustively enumerate all possible combinations of isotopologues;
# the user can set the threshold for natural abundance and whether
# to include hydrogens
>>> isotopes.generate_all_isos("c1ccccc1", explicit_h=False)
['c1[13cH]c[13cH][13cH][13cH]1', ... 'c1ccccc1', '[13cH]1[13cH][13cH][13cH][13cH][13cH]1', 'c1c[13cH][13cH][13cH]c1']
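
For readers who want to see what this kind of enumeration involves, here is a minimal sketch written directly against RDKit. It is not how astrochem_ml implements `generate_all_isos`: the function name `enumerate_13c_isotopologues` is hypothetical, only 13C substitution is considered, and no natural-abundance threshold is applied. It tries every labeled/unlabeled combination over the carbon atoms and relies on canonical SMILES to merge symmetry-equivalent results.

```python
from itertools import product

from rdkit import Chem


def enumerate_13c_isotopologues(smiles: str) -> list[str]:
    """Return the unique 13C isotopologues of a molecule as canonical SMILES.

    Hypothetical helper for illustration; not part of astrochem_ml.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # indices of every carbon atom, aromatic or aliphatic
    carbon_idx = [atom.GetIdx() for atom in mol.GetAtoms() if atom.GetAtomicNum() == 6]

    unique = set()
    # isotope 0 means "natural abundance" in RDKit; 13 labels the atom as 13C
    for labels in product((0, 13), repeat=len(carbon_idx)):
        trial = Chem.Mol(mol)  # work on a copy so the input molecule is untouched
        for idx, isotope in zip(carbon_idx, labels):
            trial.GetAtomWithIdx(idx).SetIsotope(isotope)
        # canonical SMILES collapses symmetry-equivalent labelings into one string
        unique.add(Chem.MolToSmiles(trial))
    return sorted(unique)


benzene_isos = enumerate_13c_isotopologues("c1ccccc1")
print(len(benzene_isos), "unique isotopologues")
```

The loop visits 2^n labelings for n carbon atoms, so this brute-force approach is only practical for small molecules.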
  • Functional group substitutions

Replace substructures with other functional groups, and keep track of the results in a tree data structure!

>>> from astrochem_ml.smiles import MoleculeGenerator
# randomly grow out possible structures starting from benzene,
# and iteratively replace structures with other functional groups
>>> benzene = MoleculeGenerator("c1ccccc1", substructs=["c", "cC#N", "cC=O", "cN"])
>>> benzene.grow_tree(50)
100%|██████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 237.44it/s]
>>> print(benzene)
c1ccccc1
├── Nc1ccccc1
├── N#Cc1ccccc1
└── O=Cc1ccccc1
├── Nc1ccccc1C=O
   └── N#Cc1ccccc1C=O
├── Nc1cccc(C=O)c1
   ├── Nc1cccc(C=O)c1N
      ├── Nc1c(C=O)ccc(C=O)c1N
      ├── Nc1cc(C=O)cc(C=O)c1N
...

Printing the generator gives a high-level view of every structure that was generated and the parent it was derived from.
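
The parent/child bookkeeping behind a printout like this can be sketched in plain Python. The `StructureNode` class below is hypothetical and does no chemistry; it only stores SMILES strings and renders them with the same box-drawing connectors. Which structure is the parent of which in this example is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StructureNode:
    """One generated structure plus the structures derived from it (hypothetical)."""

    smiles: str
    children: list[StructureNode] = field(default_factory=list)

    def add(self, smiles: str) -> StructureNode:
        """Attach a derived structure and return its node for further growth."""
        child = StructureNode(smiles)
        self.children.append(child)
        return child

    def render(self, prefix: str = "") -> list[str]:
        """Return one text line per descendant, indented by depth."""
        lines = []
        for i, child in enumerate(self.children):
            last = i == len(self.children) - 1
            connector = "└── " if last else "├── "
            lines.append(f"{prefix}{connector}{child.smiles}")
            # descendants are indented; a vertical bar continues open branches
            lines.extend(child.render(prefix + ("    " if last else "│   ")))
        return lines

    def __str__(self) -> str:
        return "\n".join([self.smiles, *self.render()])


root = StructureNode("c1ccccc1")
root.add("Nc1ccccc1")
root.add("N#Cc1ccccc1")
aldehyde = root.add("O=Cc1ccccc1")
aldehyde.add("Nc1ccccc1C=O")  # assumed child of the aldehyde, for illustration
print(root)
```

Because each node keeps its own list of children, tracing where a structure came from is just a walk from the root down to that node.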

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

