A package for the paper "Learning Molecular Representation in a Cell"

The Package for InfoAlign: Learning Molecular Representation in a Cell

InfoAlign is a package for learning molecular representations from bottleneck information derived from molecular structures, cell morphology, and gene expression. For details, please refer to our paper.

This package uses a pretrained model based on the method described in the paper. It takes molecules as input (e.g., a single SMILES string or a list of SMILES strings) and outputs their learned representations. These molecular representations can be applied to various downstream tasks, such as molecular property prediction.

For related projects by the main ML researcher and developer, visit: https://github.com/liugangcode/InfoAlign.

Installation

Install the package via pip:

pip install infoalign

Usage

Command Line Interface (CLI)

infoalign_pred --input {path_to_input_smiles.csv} \
               --output {path_to_output.npy} \
               --output-to-input-column  # This adds the representation to the input CSV as an additional column
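
The CLI reads its SMILES strings from a CSV file. As a minimal sketch, such a file could be prepared with Python's standard library as shown below. The helper name `write_smiles_csv` and the header name `smiles` are assumptions made for illustration; this README does not state which column header `infoalign_pred` expects, so check the package source before relying on it.

```python
import csv
from pathlib import Path


def write_smiles_csv(smiles_list, path, column="smiles"):
    """Write one SMILES string per row under a single header column.

    The default header name is an assumption, not a documented requirement
    of infoalign_pred.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])          # header row
        for smiles in smiles_list:
            writer.writerow([smiles])      # one molecule per row
    return path
```

For example, `write_smiles_csv(["CCC", "CCO"], "input_smiles.csv")` produces a file that could then be passed to `--input`, provided the header matches what the CLI expects.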

Python API

from infoalign.representer import InfoAlignRepresenter

model = InfoAlignRepresenter(model_path='infoalign_model/pretrain.pt')

# For a single SMILES string
one_rep = model.predict('CCC')

# For a list of SMILES strings
two_reps = model.predict(['CCC', 'CCC'])
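
One simple downstream use of learned representations is comparing molecules by the similarity of their vectors. The sketch below computes cosine similarity in pure Python. The vectors are made-up stand-ins, not real InfoAlign output, and the code assumes each representation can be flattened to a sequence of floats; this README does not document the return type or shape of `predict`.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    a = [float(x) for x in a]
    b = [float(x) for x in b]
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors for illustration only (hypothetical values).
rep_a = [0.1, 0.3, -0.2]
rep_b = [0.2, 0.6, -0.4]
print(cosine_similarity(rep_a, rep_b))
```

With real output, each row returned by `model.predict([...])` would take the place of `rep_a` and `rep_b`, assuming the rows are one vector per input molecule.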

Citation

If you find this repository helpful, please cite our paper:

@article{liu2024learning,
  title={Learning Molecular Representation in a Cell},
  author={Liu, Gang and Seal, Srijit and Arevalo, John and Liang, Zhenwen and Carpenter, Anne E and Jiang, Meng and Singh, Shantanu},
  journal={arXiv preprint arXiv:2406.12056},
  year={2024}
}

Acknowledgement

This project template was adapted from: https://github.com/lwaekfjlk/python-project-template. We thank the authors for their open-source contribution.
