unimol_tools is a Python package for property prediction with Uni-Mol on molecules, materials, and proteins.

Project description

Uni-Mol tools for property prediction and other downstream tasks.

Documentation of Uni-Mol tools is available at https://unimol.readthedocs.io/en/latest/

Details can be found in the Bohrium notebook.

Install

  • PyTorch is required. Install the build that matches your environment; if you use CUDA, install a CUDA-enabled build. See https://pytorch.org/get-started/locally/ for details.
  • RDKit currently requires numpy<2.0.0, so install RDKit together with numpy<2.0.0.
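
The numpy constraint above can be checked in an existing environment before installing RDKit. The following is a minimal sketch using only the standard library; the helper name `is_numpy_compatible` is hypothetical and is not part of unimol_tools.

```python
import re


def is_numpy_compatible(version: str, upper_bound=(2, 0, 0)) -> bool:
    """Return True if `version` is strictly below `upper_bound`."""
    # Keep only the leading numeric release part, e.g. (1, 26, 4) from "1.26.4rc1".
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    release = tuple(int(part or 0) for part in match.groups())
    return release < upper_bound


if __name__ == "__main__":
    try:
        import numpy
        print(numpy.__version__, is_numpy_compatible(numpy.__version__))
    except ImportError:
        print("numpy is not installed")
```

If the check reports an incompatible version, reinstalling with `pip install "numpy<2.0.0"` is the usual remedy.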

Option 1: Installing from PyPI (Recommended)

pip install unimol_tools

We recommend installing huggingface_hub so that the required Uni-Mol models can be downloaded automatically at runtime. It can be installed with:

pip install huggingface_hub

huggingface_hub downloads and manages models from the Hugging Face Hub, which is where the Uni-Mol pretrained models are hosted.

Option 2: Installing from source

## Clone repository
git clone https://github.com/deepmodeling/Uni-Mol.git
cd Uni-Mol/unimol_tools

## Dependencies installation
pip install -r requirements.txt

## Install
python setup.py install

Models on Hugging Face

The UniMol pretrained models can be found at dptech/Uni-Mol-Models.

If the download is slow, you can use a mirror, such as:

export HF_ENDPOINT=https://hf-mirror.com

Setting the HF_ENDPOINT environment variable specifies the mirror address for the Hugging Face Hub to use when downloading models.

Modify the default directory for weights

Setting the UNIMOL_WEIGHT_DIR environment variable specifies the directory for pre-trained weights if the weights have been downloaded from another source.

export UNIMOL_WEIGHT_DIR=/path/to/your/weights/dir/
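
The same two variables can also be set from Python instead of the shell. The sketch below assumes that the libraries read `HF_ENDPOINT` and `UNIMOL_WEIGHT_DIR` when they are imported, so the variables are set first; the helper name `configure_unimol_env` is hypothetical and not part of unimol_tools.

```python
import os
from pathlib import Path
from typing import Optional


def configure_unimol_env(weight_dir: Optional[str] = None,
                         hf_endpoint: Optional[str] = None) -> dict:
    """Set the environment variables described above for the current process."""
    applied = {}
    if hf_endpoint is not None:
        os.environ["HF_ENDPOINT"] = hf_endpoint
        applied["HF_ENDPOINT"] = hf_endpoint
    if weight_dir is not None:
        path = Path(weight_dir).expanduser().resolve()
        path.mkdir(parents=True, exist_ok=True)  # create the directory if missing
        os.environ["UNIMOL_WEIGHT_DIR"] = str(path)
        applied["UNIMOL_WEIGHT_DIR"] = str(path)
    return applied


# Assumed usage: call this before importing unimol_tools or huggingface_hub.
# configure_unimol_env(weight_dir="/path/to/your/weights/dir/",
#                      hf_endpoint="https://hf-mirror.com")
```

Variables set this way affect only the current Python process and its children, unlike `export` lines placed in a shell profile.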

News

  • 2024-07-23: User experience improvement: added UNIMOL_WEIGHT_DIR.
  • 2024-06-25: unimol_tools has been published to PyPI! Hugging Face is now used to manage the pretrained models.
  • 2024-06-20: unimol_tools v0.1.0 released. We removed the dependency on Uni-Core and will publish to PyPI soon.
  • 2024-03-20: unimol_tools documentation is available at https://unimol.readthedocs.io/en/latest/

Molecule property prediction

from unimol_tools import MolTrain, MolPredict

# `data` is supplied by you. Currently supported inputs are a SMILES-based
# csv/txt file, or a custom dict such as
# {'atoms': [['C','C'], ['C','H','O']], 'coordinates': [coordinates_1, coordinates_2]}
data = 'train.csv'  # hypothetical path; replace with your own file or dict

# Train a classifier
clf = MolTrain(task='classification',
               data_type='molecule',
               epochs=10,
               batch_size=16,
               metrics='auc',
               )
pred = clf.fit(data=data)

# Load the saved model and predict
clf = MolPredict(load_model='../exp')
res = clf.predict(data=data)
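
The SMILES-based csv input mentioned above can be prepared with the standard library. This is a minimal sketch that assumes the column names `SMILES` and `TARGET`; check the documentation for the names your version expects. The helper `write_smiles_csv` is hypothetical, and any molecules and labels passed to it here are toy values for illustration only.

```python
import csv
from pathlib import Path


def write_smiles_csv(path, smiles, targets,
                     smiles_col="SMILES", target_col="TARGET"):
    """Write one row per molecule: a SMILES string and its label."""
    if len(smiles) != len(targets):
        raise ValueError("smiles and targets must have the same length")
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([smiles_col, target_col])   # header row (assumed names)
        writer.writerows(zip(smiles, targets))      # one molecule per row
    return path
```

For example, `write_smiles_csv("train.csv", ["CCO", "c1ccccc1"], [1, 0])` produces a two-molecule file whose path could then be passed as `data`.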

Uni-Mol molecule-level and atom-level representation

import numpy as np
from unimol_tools import UniMolRepr
# single smiles unimol representation
clf = UniMolRepr(data_type='molecule', remove_hs=False)
smiles = 'c1ccc(cc1)C2=NCC(=O)Nc3c2cc(cc3)[N+](=O)[O]'
smiles_list = [smiles]
unimol_repr = clf.get_repr(smiles_list, return_atomic_reprs=True)
# CLS token repr
print(np.array(unimol_repr['cls_repr']).shape)
# atomic level repr, align with rdkit mol.GetAtoms()
print(np.array(unimol_repr['atomic_reprs']).shape)
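
One possible use of the CLS representations is comparing molecules by vector similarity. The sketch below is an illustration rather than part of unimol_tools: it computes cosine similarity in plain Python, and the two short vectors are stand-ins for rows of `unimol_repr['cls_repr']`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors; in practice these would be two rows of unimol_repr['cls_repr'].
vec_a = [1.0, 0.0, 2.0]
vec_b = [2.0, 0.0, 4.0]
print(cosine_similarity(vec_a, vec_b))  # parallel vectors give a value close to 1
```

With numpy already imported, the same quantity can be computed with `np.dot` and `np.linalg.norm`.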

Please kindly cite our papers if you use the data/code/model.

@inproceedings{
  zhou2023unimol,
  title={Uni-Mol: A Universal 3D Molecular Representation Learning Framework},
  author={Gengmo Zhou and Zhifeng Gao and Qiankun Ding and Hang Zheng and Hongteng Xu and Zhewei Wei and Linfeng Zhang and Guolin Ke},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=6K2RM6wVqKu}
}
@misc{lu2023highly,
      title={Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+}, 
      author={Shuqi Lu and Zhifeng Gao and Di He and Linfeng Zhang and Guolin Ke},
      year={2023},
      eprint={2303.16982},
      archivePrefix={arXiv},
      primaryClass={physics.chem-ph}
}

License

This project is licensed under the terms of the MIT license. See LICENSE for additional details.
