Skip to main content

unimol_tools is a Python package for property prediciton with Uni-Mol in molecule, materials and protein.

Project description

Uni-Mol tools for various prediction and downstreams.

Documentation of Uni-Mol tools is available at https://unimol.readthedocs.io/en/latest/

Details can be found in bohrium notebook

Install

  • pytorch is required, please install pytorch according to your environment. if you are using cuda, please install pytorch with cuda. More details can be found at https://pytorch.org/get-started/locally/
  • currently, rdkit needs with numpy<2.0.0, please install rdkit with numpy<2.0.0.

Option 1: Installing from PyPi (Recommended)

pip install unimol_tools

We recommend installing huggingface_hub so that the required unimol models can be automatically downloaded at runtime! It can be install by

pip install huggingface_hub

huggingface_hub allows you to easily download and manage models from the Hugging Face Hub, which is key for using UniMol models.

Option 2: Installing from source

## Dependencies installation
pip install -r requirements.txt

## Clone repository
git clone https://github.com/deepmodeling/Uni-Mol.git
cd Uni-Mol/unimol_tools

## Install
python setup.py install

Models in Huggingface

The UniMol pretrained models can be found at dptech/Uni-Mol-Models.

If the download is slow, you can use other mirrors, such as:

export HF_ENDPOINT=https://hf-mirror.com

Setting the HF_ENDPOINT environment variable specifies the mirror address for the Hugging Face Hub to use when downloading models.

Modify the default directory for weights

Setting the UNIMOL_WEIGHT_DIR environment variable specifies the directory for pre-trained weights if the weights have been downloaded from another source.

export UNIMOL_WEIGHT_DIR=/path/to/your/weights/dir/

News

  • 2024-07-23: User experience improvements: Add UNIMOL_WEIGHT_DIR.
  • 2024-06-25: unimol_tools has been publish to pypi! Huggingface has been used to manage the pretrain models.
  • 2024-06-20: unimol_tools v0.1.0 released, we remove the dependency of Uni-Core. And we will publish to pypi soon.
  • 2024-03-20: unimol_tools documents is available at https://unimol.readthedocs.io/en/latest/

molecule property prediction

from unimol_tools import MolTrain, MolPredict
clf = MolTrain(task='classification', 
                data_type='molecule', 
                epochs=10, 
                batch_size=16, 
                metrics='auc',
                )
pred = clf.fit(data = data)
# currently support data with smiles based csv/txt file, and
# custom dict of {'atoms':[['C','C],['C','H','O']], 'coordinates':[coordinates_1,coordinates_2]}

clf = MolPredict(load_model='../exp')
res = clf.predict(data = data)

unimol molecule and atoms level representation

import numpy as np
from unimol_tools import UniMolRepr
# single smiles unimol representation
clf = UniMolRepr(data_type='molecule', remove_hs=False)
smiles = 'c1ccc(cc1)C2=NCC(=O)Nc3c2cc(cc3)[N+](=O)[O]'
smiles_list = [smiles]
unimol_repr = clf.get_repr(smiles_list, return_atomic_reprs=True)
# CLS token repr
print(np.array(unimol_repr['cls_repr']).shape)
# atomic level repr, align with rdkit mol.GetAtoms()
print(np.array(unimol_repr['atomic_reprs']).shape)

Please kindly cite our papers if you use the data/code/model.

@inproceedings{
  zhou2023unimol,
  title={Uni-Mol: A Universal 3D Molecular Representation Learning Framework},
  author={Gengmo Zhou and Zhifeng Gao and Qiankun Ding and Hang Zheng and Hongteng Xu and Zhewei Wei and Linfeng Zhang and Guolin Ke},
  booktitle={The Eleventh International Conference on Learning Representations },
  year={2023},
  url={https://openreview.net/forum?id=6K2RM6wVqKu}
}
@article{gao2023uni,
  title={Uni-qsar: an auto-ml tool for molecular property prediction},
  author={Gao, Zhifeng and Ji, Xiaohong and Zhao, Guojiang and Wang, Hongshuai and Zheng, Hang and Ke, Guolin and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2304.12239},
  year={2023}
}

License

This project is licensed under the terms of the MIT license. See LICENSE for additional details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unimol_tools-0.1.1.tar.gz (52.0 kB view details)

Uploaded Source

File details

Details for the file unimol_tools-0.1.1.tar.gz.

File metadata

  • Download URL: unimol_tools-0.1.1.tar.gz
  • Upload date:
  • Size: 52.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.15

File hashes

Hashes for unimol_tools-0.1.1.tar.gz
Algorithm Hash digest
SHA256 7add6707270ec50a3f49d50d4d5f748c7b213d9930748feeab7c4019107fd646
MD5 9b15dec1389c3809a16050c9115a449f
BLAKE2b-256 b4dd2dc097f22fb68d777fa154aa69ae500036b0c9e9ec603df7942caa13cf18

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page