This repo contains code for extracting the molecules in https://data.hpc.imperial.ac.uk/resolve/?doi=4618 into an AtomLite database.

Why?

Because the data was published in an out-of-date format, namely an stk JSON dump.

How?

The easiest thing to do is install the package with pip:

pip install cage-json-extractor

Now download the files from the link above.

Then run

tar xf cages.tar.gz
cage_json_extractor cages/amine2aldehyde3.json cage_prediction.db amine2aldehyde3.db
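
To check that the conversion produced something, one option is to peek inside the new file. The sketch below assumes that an AtomLite database is an ordinary SQLite file (AtomLite is built on top of SQLite) and uses only the Python standard library. The helper name list_tables is hypothetical and is not part of cage-json-extractor.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of the tables in an SQLite file, sorted by name."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Assumption: the file name matches the output of the command above.
    path = Path("amine2aldehyde3.db")
    if path.exists():
        print(list_tables(path))
    else:
        print(f"{path} not found; run cage_json_extractor first")
```

The existence check matters because sqlite3.connect would otherwise silently create an empty file with that name.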

Now, to extract all the shape-persistent 4+6 cages, we can run

extract_cages amine2aldehyde3.db FourPlusSix --output_directory extracted_cages

This will create a folder, extracted_cages, which holds one sub-folder for every shape-persistent 4+6 cage in amine2aldehyde3.db. In each sub-folder you will find the .mol files of the cage and its building blocks.
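
As a quick way to see what was written, the sketch below walks the output folder and lists the .mol files in each sub-folder. It relies only on the layout described above (one sub-folder per cage, .mol files inside). The helper name collect_mol_files is hypothetical, and no assumption is made about how the individual files are named.

```python
from pathlib import Path


def collect_mol_files(output_directory):
    """Map each sub-folder name to the sorted names of the .mol files in it."""
    root = Path(output_directory)
    cages = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        cages[folder.name] = sorted(f.name for f in folder.glob("*.mol"))
    return cages


if __name__ == "__main__":
    # Assumption: the folder name matches --output_directory used above.
    root = Path("extracted_cages")
    if root.is_dir():
        for cage, mol_files in collect_mol_files(root).items():
            print(cage, mol_files)
    else:
        print(f"{root} not found; run extract_cages first")
```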

Enjoy! (and sorry I deprecated the .json files)

=)
