Get UNIFAC functional groups from PubChem compound names or SMILES representations.

Open In Colab License Python 3.10+ Docs PyPI version Powered by RDKit

ugropy is a Python library that obtains subgroups for different thermodynamic group contribution models from either the name or the SMILES representation of a molecule. If the name is given, the library uses PubChemPy to obtain the SMILES representation from PubChem. In both cases, ugropy uses RDKit to search for the functional groups in the molecule.

ugropy is tested with Python 3.10, 3.11, 3.12, 3.13 and 3.14 on Linux, Windows and macOS.

You can access the documentation here: https://ipqa-research.github.io/ugropy/

Try ugropy now

You can try ugropy without installing it by clicking on the Colab badge.

You can install ugropy with pip:

pip install ugropy
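
After installing, it can be useful to confirm which version is active in the current environment. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of ugropy.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Prints the version string if ugropy is installed, otherwise None.
print(installed_version("ugropy"))
```

Returning None instead of raising keeps the check usable in setup scripts that only need to decide whether to run the install step.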

Citing ugropy

ugropy now has an article! If you use ugropy in your research, please cite:

@article{brandolin2025ugropy,
  title={Ugropy: An Extensible Python Package for Thermodynamic Model Functional Group Identification via Mathematical Optimization},
  author={Brandol{\'\i}n, Salvador E and Benelli, Federico E and Magario, Ivana and Scilipoti, Jos{\'e} A},
  journal={Industrial \& Engineering Chemistry Research},
  volume={64},
  number={35},
  pages={17217--17227},
  year={2025},
  publisher={ACS Publications},
  doi = {10.1021/acs.iecr.5c02552}
}

Check the publication here: https://doi.org/10.1021/acs.iecr.5c02552

Models implemented

Gibbs / EoS models

  • Classic liquid-vapor UNIFAC
  • Predictive Soave-Redlich-Kwong (PSRK)
  • Dortmund (modified UNIFAC)

Property estimators

  • Joback
  • Abdulelah-Gani (beta)

Writers

ugropy allows you to convert the obtained functional groups or estimated properties to the input format required by the following thermodynamic libraries:

  • Clapeyron.jl
  • Caleb Bell's Thermo
  • yaeos

Example of use

Here is a little taste of ugropy. Please check the full tutorial in the documentation to see all it has to offer!

Get groups from the molecule's name:

from ugropy import Groups


hexane = Groups("hexane")

print(hexane.unifac.subgroups)
print(hexane.psrk.subgroups)
print(hexane.dortmund.subgroups)
print(hexane.joback.subgroups)
print(hexane.agani.primary.subgroups)
{'CH3': 2, 'CH2': 4}
{'CH3': 2, 'CH2': 4}
{'CH3': 2, 'CH2': 4}
{'-CH3': 2, '-CH2-': 4}
{'CH3': 2, 'CH2': 4}

Get groups from molecule's SMILES:

propanol = Groups("CCCO", "smiles")

print(propanol.unifac.subgroups)
print(propanol.psrk.subgroups)
print(propanol.dortmund.subgroups)
print(propanol.joback.subgroups)
print(propanol.agani.primary.subgroups)
{'CH3': 1, 'CH2': 2, 'OH': 1}
{'CH3': 1, 'CH2': 2, 'OH': 1}
{'CH3': 1, 'CH2': 2, 'OH (P)': 1}
{'-CH3': 1, '-CH2-': 2, '-OH (alcohol)': 1}
{'CH3': 1, 'CH2': 2, 'OH': 1}
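
One quick way to sanity-check a fragmentation is to confirm that the subgroups add up to the whole molecule. The sketch below recomputes the molecular weight from the UNIFAC subgroup counts printed above. The atom table `GROUP_ATOMS` and the helper `molecular_weight` are illustrative assumptions written for this example; they are not ugropy's internal data or API, and the table covers only the three subgroups that appear here.

```python
# Standard atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Atom counts for the subgroups used in the examples above.
# Illustrative table only; it is not taken from ugropy.
GROUP_ATOMS = {
    "CH3": {"C": 1, "H": 3},
    "CH2": {"C": 1, "H": 2},
    "OH": {"O": 1, "H": 1},
}


def molecular_weight(subgroups):
    """Sum the mass of every subgroup occurrence in a {name: count} dict."""
    total = 0.0
    for group, count in subgroups.items():
        atoms = GROUP_ATOMS[group]
        group_mass = sum(ATOMIC_MASS[a] * n for a, n in atoms.items())
        total += count * group_mass
    return total


# Subgroup dictionaries copied from the outputs shown above.
hexane_groups = {"CH3": 2, "CH2": 4}
propanol_groups = {"CH3": 1, "CH2": 2, "OH": 1}

print(round(molecular_weight(hexane_groups), 3))
print(round(molecular_weight(propanol_groups), 3))
```

The results, about 86.18 g/mol for hexane and 60.10 g/mol for 1-propanol, match the known molecular weights, so every atom is covered by exactly one subgroup.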

Estimate properties with the Joback and Abdulelah-Gani models!

limonene = Groups("limonene")

print(limonene.joback.subgroups)
print(f"{limonene.joback.critical_temperature}")
print(f"{limonene.joback.vapor_pressure(176 + 273.15)}")
{'-CH3': 2, '=CH2': 1, '=C<': 1, 'ring-CH2-': 3, 'ring>CH-': 1, 'ring=CH-': 1, 'ring=C<': 1}
657.4486692170663 kelvin
1.0254019428522743 bar

print(limonene.agani.primary.subgroups)
print(limonene.agani.secondary.subgroups)
print(limonene.agani.tertiary.subgroups)
print(f"{limonene.agani.critical_temperature}")
print(limonene.agani.molecular_weight / limonene.agani.liquid_molar_volume)
{'CH3': 2, 'CH2=C': 1, 'CH2 (cyclic)': 3, 'CH (cyclic)': 1, 'CH=C (cyclic)': 1}
{'CH3-CHm=CHn (m,n in 0..2)': 1, '(CHn=C)cyc-CH3 (n in 0..2)': 1, 'CHcyc-C=CHn (n in 1..2)': 1}
{}
640.1457030826214 kelvin
834.8700605718585 gram / liter

Visualize your results! (The next code creates the ugropy logo)

mol = Groups("CCCC1=C(COC(C)(C)COC(=O)OCC)C=C(CC2=CC=CC=C2)C=C1", "smiles")

mol.unifac.draw(
    title="ugropy",
    width=800,
    height=450,
    title_font_size=50,
    legend_font_size=14
)

ugropy can obtain multiple solutions, even nonoptimal ones if desired. For example:

from ugropy import unifac


solutions = unifac.get_groups(
    "9,10-dihydroanthracene",
    search_multiple_solutions=True,
    search_nonoptimal=True
)

for sol in solutions:
    print(sol.subgroups)
{'ACH': 8, 'AC': 2, 'ACCH2': 2}
{'CH2': 1, 'ACH': 8, 'AC': 3, 'ACCH2': 1}
{'CH2': 2, 'ACH': 8, 'AC': 4}
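
To compare the solutions, you can count how many subgroups each one uses. The sketch below assumes that "optimal" means the smallest total number of subgroups; this is consistent with the order of the output above, but the criterion is an assumption here. The dictionaries are copied from that output, and `total_groups` is a hypothetical helper, not a ugropy function.

```python
# Solutions copied from the output above (9,10-dihydroanthracene, UNIFAC).
solutions = [
    {"ACH": 8, "AC": 2, "ACCH2": 2},
    {"CH2": 1, "ACH": 8, "AC": 3, "ACCH2": 1},
    {"CH2": 2, "ACH": 8, "AC": 4},
]


def total_groups(subgroups):
    """Total number of subgroup occurrences in a {name: count} dict."""
    return sum(subgroups.values())


# Sort from the fewest subgroups to the most.
ranked = sorted(solutions, key=total_groups)

for sol in ranked:
    print(total_groups(sol), sol)
```

The three solutions use 12, 13 and 14 subgroups, so under this assumption the first one is the optimal fragmentation and the other two are the nonoptimal alternatives.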

Write the Clapeyron.jl .csv input files:

from ugropy import Groups, writers

names = ["limonene", "adrenaline", "Trinitrotoluene"]

grps = [Groups(n) for n in names]

# Write the csv files into a database directory
writers.to_clapeyron(
    molecules_names=names,
    unifac_groups=[g.unifac.subgroups for g in grps],
    psrk_groups=[g.psrk.subgroups for g in grps],
    joback_objects=[g.joback for g in grps],
    path="database"
)
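
After the call, you may want to see which files were produced. The following sketch lists every .csv file under the output directory using only the standard library. It makes no assumption about the file names or subdirectories that the writer creates; `list_csv_files` is a hypothetical helper, not part of ugropy.

```python
from pathlib import Path


def list_csv_files(directory):
    """Return the sorted relative paths of all .csv files under `directory`."""
    root = Path(directory)
    if not root.is_dir():
        # Nothing has been written yet (or the path is wrong).
        return []
    # rglob searches subdirectories too; as_posix gives "/" on every platform.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.csv"))


# "database" is the path passed to writers.to_clapeyron above.
for name in list_csv_files("database"):
    print(name)
```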

Obtain the subgroup numbers used by Caleb Bell's Thermo and the yaeos Python API:

from ugropy import unifac

names = ["hexane", "ethanol"]

grps = [unifac.get_groups(n) for n in names]

groups_numbers = [g.subgroups_num for g in grps]

print(groups_numbers)
[{1: 2, 2: 4}, {1: 1, 2: 1, 14: 1}]
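
The numeric keys are the UNIFAC subgroup numbers that replace the subgroup names. The sketch below reproduces the mapping for the three subgroups involved, using the numbers visible in the output above (CH3 is 1, CH2 is 2, OH is 14). The lookup table and the helper `to_subgroup_numbers` are illustrative assumptions for this example, not ugropy's implementation.

```python
# Subgroup numbers inferred from the output above; a tiny excerpt, not the
# full UNIFAC table.
UNIFAC_SUBGROUP_NUMBER = {"CH3": 1, "CH2": 2, "OH": 14}


def to_subgroup_numbers(subgroups):
    """Convert {subgroup name: count} into {subgroup number: count}."""
    return {
        UNIFAC_SUBGROUP_NUMBER[name]: count
        for name, count in subgroups.items()
    }


# Name-keyed UNIFAC subgroups of hexane and ethanol.
hexane_groups = {"CH3": 2, "CH2": 4}
ethanol_groups = {"CH3": 1, "CH2": 1, "OH": 1}

print([to_subgroup_numbers(g) for g in (hexane_groups, ethanol_groups)])
# [{1: 2, 2: 4}, {1: 1, 2: 1, 14: 1}]
```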
