Molecule functional group extraction and comparison
AccFG: Accurate Functional Group Extraction and Molecular Structure Comparison
Introduction
AccFG is a tool for precise functional group (FG) extraction and molecular structure comparison.
Installation
We provide two methods to install AccFG:
Installation by pip (recommended)
pip install accfg
Installation from GitHub repository
To install AccFG, follow these steps:
- Clone/download the repository:
git clone https://github.com/xuanliugit/AccFG.git
- Navigate to the project directory:
cd AccFG
- Install the required dependencies:
conda create --name accfg python=3.10
conda activate accfg
pip install -r requirements.txt
- Quick start:
# Get functional groups from SMILES
python run_accfg.py 'CN(C)/N=N/C1=C(NC=N1)C(=O)N'
# Compare two molecules
python run_accfg.py 'CNC(=O)Cc1nc(-c2ccccc2)cs1' --compare_smi 'CCNCCc1nc2ccccc2s1'
The FG dictionary is stored in ./accfg/fgs_common.csv and ./accfg/fgs_heterocycle.csv
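SMILES strings contain characters such as parentheses, `=`, `/`, and `#` that a shell would otherwise interpret, which is why the quick-start commands wrap them in single quotes. When the script is launched from Python instead, passing the arguments as a list sidesteps shell quoting. The sketch below assumes it is run from the repository root, where `run_accfg.py` lives; `build_accfg_command` is a hypothetical helper name, not part of AccFG.

```python
import shlex
import sys


def build_accfg_command(smi, compare_smi=None, script="run_accfg.py"):
    """Build an argv list for the quick-start CLI (hypothetical helper)."""
    cmd = [sys.executable, script, smi]
    if compare_smi is not None:
        # --compare_smi is the flag shown in the quick-start example
        cmd += ["--compare_smi", compare_smi]
    return cmd


cmd = build_accfg_command("CNC(=O)Cc1nc(-c2ccccc2)cs1",
                          compare_smi="CCNCCc1nc2ccccc2s1")
# shlex.join renders the same command with shell-safe quoting
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Because the command is a list, no SMILES character ever reaches a shell unquoted.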
Usage
FG extraction
To extract functional groups:
# example.py
from accfg import AccFG, print_fg_tree
afg = AccFG(print_load_info=True)
smi = 'CN(C)/N=N/C1=C(NC=N1)C(=O)N'
fgs, fg_graph = afg.run(smi, show_atoms=True, show_graph=True)
print_fg_tree(fg_graph, fgs.keys(), show_atom_idx=True)  # Print the FG tree
'''
├──Primary amide: ((10, 12, 11),)
...
'''
print(fgs)
'''
{'Primary amide': [(10, 12, 11)], 'Triazene': [(1, 3, 4)], 'imidazole': [(5, 9, 8, 7, 6)]}
'''
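`fgs` maps each FG name to a list of atom-index tuples, one tuple per occurrence. A common follow-up is to invert that mapping so each atom index points to the FG it belongs to. The sketch below does this in plain Python on the dictionary printed above; `atom_to_fg` is an illustrative helper, not part of the AccFG API.

```python
def atom_to_fg(fgs):
    """Invert {fg_name: [atom_idx_tuple, ...]} into {atom_idx: fg_name}."""
    mapping = {}
    for name, occurrences in fgs.items():
        for atoms in occurrences:
            for idx in atoms:
                mapping[idx] = name
    return mapping


# Output of afg.run(...) copied from the example above
fgs = {'Primary amide': [(10, 12, 11)],
       'Triazene': [(1, 3, 4)],
       'imidazole': [(5, 9, 8, 7, 6)]}

lookup = atom_to_fg(fgs)
print(lookup[10])      # Primary amide
print(sorted(lookup))  # [1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

If an atom appeared in more than one FG, the last one visited would win; collect names in a list per atom if that case matters for your data.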
Example with user-defined FGs:
# example.py
from accfg import AccFG, print_fg_tree
my_fgs_dict = {'Cephem': 'O=C(O)C1=CCS[C@@H]2CC(=O)N12', 'Thioguanine': 'Nc1nc(=S)c2[nH]cnc2[nH]1'}
my_afg = AccFG(user_defined_fgs=my_fgs_dict, print_load_info=True)
cephalosporin_C = 'CC(=O)OCC1=C(N2[C@@H]([C@@H](C2=O)NC(=O)CCC[C@H](C(=O)O)N)SC1)C(=O)O'
fgs, fg_graph = my_afg.run(cephalosporin_C, show_atoms=True, show_graph=True)
print_fg_tree(fg_graph, fgs.keys(), show_atom_idx=True)  # Print the FG tree
'''
├──Primary aliphatic amine: ((21,),)
├──...
'''
To print the top-level functional groups:
print(fgs)  # Show top-level FGs
'''
{'Primary aliphatic amine': [(21,)],
'Carboxylic acid': [(22, 23, 24)],
'Carboxylic ester': [(1, 2, 3, 4)],
'Secondary amide': [(15, 16, 14, 13)],
'Cephem': [(8, 7, 9, 6, 5, 27, 26, 25, 13, 11, 12, 10)]}
'''
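When the set of user-defined FGs grows, it can be convenient to keep the name/SMILES pairs in a file and build the `user_defined_fgs` dictionary from it. The sketch below reads such pairs with the standard `csv` module. The `name,smiles` column layout and the helper `load_fg_dict` are assumptions made for illustration; they are not the layout of the bundled `fgs_common.csv` and not part of AccFG.

```python
import csv
import io


def load_fg_dict(handle):
    """Read 'name,smiles' rows into a {name: SMILES} dict."""
    reader = csv.DictReader(handle)
    return {row["name"].strip(): row["smiles"].strip() for row in reader}


# In-memory stand-in for a file (hypothetical column names)
example_csv = io.StringIO(
    "name,smiles\n"
    "Cephem,O=C(O)C1=CCS[C@@H]2CC(=O)N12\n"
    "Thioguanine,Nc1nc(=S)c2[nH]cnc2[nH]1\n"
)
my_fgs_dict = load_fg_dict(example_csv)
print(my_fgs_dict["Cephem"])
# Then, as in the example above: AccFG(user_defined_fgs=my_fgs_dict)
```

For a real file, replace the `io.StringIO` object with `open(path, newline='')`.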
FG extraction visualization
from accfg import draw_mol_with_fgs, molimg
molimg(draw_mol_with_fgs(cephalosporin_C, afg=my_afg, img_size=(900,900)))
This displays an image of the molecule with the FGs highlighted.
Molecular structure comparison
from accfg import AccFG, compare_mols, draw_compare_mols, draw_RascalMCES, img_grid
smi_1, smi_2 = ('CNC(=O)Cc1nc(-c2ccccc2)cs1', 'CCNCCc1nc2ccccc2s1')
diff = compare_mols(smi_1, smi_2)
print(diff)  # Print the structural difference
'''
(([('Secondary amide', 1, [(2, 3, 1)]),
...
'''
draw_RascalMCES(smi_1, smi_2)  # Draw the RascalMCES comparison
Molecular structure comparison visualization
img = img_grid(draw_compare_mols(smi_1, smi_2),num_columns=2)
with open('results/compare_mols.png', 'wb') as f:
    img.save(f, format='PNG')
img
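The `open('results/compare_mols.png', 'wb')` call above raises `FileNotFoundError` if the `results` folder does not exist yet. A small wrapper can create the folder first. This is a minimal sketch: `save_png` is a hypothetical helper, and it assumes only that `img` exposes a Pillow-style `save(fp, format=...)` method, as the call above suggests.

```python
from pathlib import Path


def save_png(img, path):
    """Save an image object as PNG, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        img.save(f, format="PNG")
    return path


# Usage with the grid image from the example above:
# save_png(img, 'results/compare_mols.png')
```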
Run
To process the BBBP, Lipophilicity, and BACE datasets and the CHEMBL drugs, run:
python run_data.py
The results are written to ./molecule_data. The code that processes the data is in exam_data.py.
All other examples from the manuscript are in example.ipynb.