A Python Package for Accessing Chemical Information from PubChem (https://pubchem.ncbi.nlm.nih.gov/).

PubChemQuery

PubChemQuery is a Python package that provides a simple and intuitive API for retrieving chemical information from the PubChem database. With this package, you can easily fetch chemical data, including:

  • CID (Compound ID) by name

  • All CIDs by name

  • 2D images by CID or name

  • SDF (Structure Data File) by CID or name

  • Compound properties, including:

    • Molecular formula and weight

    • SMILES and InChI representations

    • IUPAC name and title

    • Physicochemical properties (e.g., XLogP, exact mass, TPSA)

    • Structural features (e.g., bond and atom counts, stereochemistry)

    • 3D properties (e.g., volume, steric quadrupole moments, feature counts)

    • Fingerprint and conformer information

The package offers a straightforward interface, allowing users to access PubChem data with minimal code. Whether you're a chemist, researcher, or developer, PubChemQuery simplifies the process of integrating chemical information into your projects.

Key Features:

  • Retrieve chemical data by name or CID

  • Access 2D images and SDF files

  • Get compound properties, including physicochemical, structural, and 3D features

  • Easy-to-use API with minimal code required

Simple and Concise API:

The package provides a function for each of the tasks above, making it easy to integrate PubChem data into your projects:

  • get_cid_by_inchi(inchi): Get a CID by InChI

  • get_cids_by_formula(formula): Get CIDs by formula

  • get_cid_by_name(name): Get CID by name

  • get_cids_by_name(name): Get all CIDs by name

  • get_image_by_cid(cid): Get 2D image by CID

  • get_image_by_name(name): Get 2D image by name

  • get_image_by_inchi(inchi): Get 2D image by InChI

  • get_structure_by_cid(cid): Get SDF by CID

  • get_structure_by_name(name): Get SDF by name

  • get_similar_structures_cids_by_compound_id(cid/SMILES/InChI): Get the CIDs of similar structures by CID, SMILES, or InChI (pass compound_id='SMILES' or compound_id='InChI' when the input is not a CID)

Compound Object:

The package also includes a Compound object that encapsulates the retrieved data, providing a convenient way to access and manipulate the data.

  • compound(cid_or_name): Create a compound object with properties and methods
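PubChem exposes this kind of data through its PUG REST web service. The source does not show how PubChemQuery talks to PubChem internally, so the following is only a sketch of the URL shapes such lookups correspond to. The helper name pug_rest_url and the restriction to the "cid" and "name" namespaces are assumptions made for illustration; they are not part of the package.

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(namespace, identifier, operation=None, output="JSON"):
    """Build a PUG REST compound URL (hypothetical helper, not part of PubChemQuery)."""
    if namespace not in {"cid", "name"}:
        raise ValueError(f"unsupported namespace: {namespace!r}")
    # Percent-encode the identifier so names containing spaces stay valid in a URL.
    encoded = quote(str(identifier), safe="")
    parts = [PUG_REST_BASE, "compound", namespace, encoded]
    if operation:
        parts.append(operation)
    parts.append(output)
    return "/".join(parts)


# CIDs for a name, selected properties for a CID, and a 2D image and SDF for a CID
print(pug_rest_url("name", "benzene", "cids"))
print(pug_rest_url("cid", 2244, "property/MolecularFormula,MolecularWeight"))
print(pug_rest_url("cid", 241, output="PNG"))
print(pug_rest_url("cid", 241, output="SDF"))
```

The sketch only builds strings and performs no network access. InChI lookups are left out because InChI strings contain slashes and are usually sent in a POST body rather than in the URL path.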

Getting Started:

To use PubChemQuery, simply install the package and import it into your Python script. Refer to the example code snippets below for a quick start.

Installation

Install PubChemQuery with pip

  pip install PubChemQuery
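To confirm that the install succeeded, one option is to ask the standard library which version of the distribution is present. This is a minimal sketch; the helper name installed_version is hypothetical and simply wraps importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None
print(installed_version("PubChemQuery"))
```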

Examples

Import package as:

import pubchemquery as pcq

Use the functions to retrieve data:

# get all cids by formula
cids = pcq.get_cids_by_formula('C6H6')
print(type(cids), len(cids))

# get a cid by inchi
cid = pcq.get_cid_by_inchi(
    'InChI=1S/C6H5NO3/c8-6-3-1-5(2-4-6)7(9)10/h1-4,8H')
print(cid)

# get a cid by name
cid = pcq.get_cid_by_name('benzene')
print(cid)

# get all cids by name
cids = pcq.get_cids_by_name('benzene')
print(type(cids), len(cids))
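When looking up many names, some may be misspelled or unknown to PubChem. The source does not say how PubChemQuery reports a failed lookup, so the sketch below assumes a failure shows up either as an exception or as an empty result. The helper lookup_many and the stand-in fake_lookup are hypothetical; in real use the callable would be one of the lookup functions above.

```python
def lookup_many(lookup, names):
    """Apply a lookup callable to each name and separate hits from failures."""
    found, failed = {}, {}
    for name in names:
        try:
            result = lookup(name)
        except Exception as exc:  # assumption: the package's exception types are unknown
            failed[name] = repr(exc)
            continue
        # Treat None and empty containers or strings as "nothing found"
        if result is None or (hasattr(result, "__len__") and len(result) == 0):
            failed[name] = "no result"
        else:
            found[name] = result
    return found, failed


def fake_lookup(name):
    """Stand-in for a real lookup so the sketch runs without network access."""
    table = {"benzene": 241}
    return table[name]  # raises KeyError for unknown names


found, failed = lookup_many(fake_lookup, ["benzene", "not-a-compound"])
print(found)
print(sorted(failed))
```

With the package installed, the same helper could be called as lookup_many(pcq.get_cid_by_name, names).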
# get 2d image
# by cid
image = pcq.get_image_by_cid('241')
image  # a bare expression displays the image only in a notebook or REPL

# by name
image = pcq.get_image_by_name('benzene')
image

# by inchi
image = pcq.get_image_by_inchi(
    'InChI=1S/C6H5NO3/c8-6-3-1-5(2-4-6)7(9)10/h1-4,8H')
print(image)

# get sdf by cid
sdf = pcq.get_structure_by_cid('241')
print(sdf)

# get sdf by name
sdf = pcq.get_structure_by_name('benzene')
print(sdf)
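An SDF file can hold several molecule records, each terminated by a line containing only $$$$. Assuming the structure functions above return the SDF as a text string (the source prints it but does not state the type), a minimal way to split it into records could look like this. The helper name split_sdf_records is hypothetical.

```python
def split_sdf_records(sdf_text):
    """Split SDF text into one string per molecule record."""
    records = []
    current = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            # "$$$$" closes the current record
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record that lacks the "$$$$" terminator
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records


# Tiny made-up input, only to show the record boundaries
sample = "mol1\n  header\nM  END\n$$$$\nmol2\nM  END\n$$$$\n"
records = split_sdf_records(sample)
print(len(records))
```

If the function returns bytes instead of text, decode it first (for example with sdf.decode('utf-8')).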
# get similar structure cids by cid
# cids = pcq.get_similar_structures_cids_by_compound_id('241')

# by SMILES
# cids = pcq.get_similar_structures_cids_by_compound_id(
#     'C1=CC=CC=C1', compound_id='SMILES')

# by InChI
cids = pcq.get_similar_structures_cids_by_compound_id(
    'InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H', compound_id='InChI')
print(type(cids), len(cids))
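Each of the calls above presumably results in at least one request to PubChem, and PubChem asks clients to stay at or below five requests per second. Whether PubChemQuery throttles requests itself is not stated in the source, so a script that loops over many compounds may want its own pacing. The Throttle class below is a hypothetical helper sketched for that purpose, with the clock and sleep functions injectable so it can be tested without waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In a loop, calling throttle.wait() before each pcq lookup keeps the request rate under the chosen limit.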

Make a compound and then get its properties:

# make a compound from a cid
cid = 2244
# compound = pcq.compound(cid)

# or from a name
name = '2-acetyloxybenzoic acid'
compound = pcq.compound(name)
print(compound)

# properties
# InChI
print(compound.InChI)
# InChIKey
print(compound.InChIKey)
# IUPACName
print(compound.IUPACName)
# similar structure cids
print(len(compound.similar_structure_cids))
# image
compound.image
# dataframe
compound.prop_df()
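To pull a handful of properties into a plain dictionary, for example before writing them to a file, attribute access can be wrapped in a small helper. This is a sketch: summarize is a hypothetical name, and only the attribute names InChI, InChIKey, and IUPACName are taken from the example above. A stand-in object replaces the real compound so the code runs without network access.

```python
from types import SimpleNamespace


def summarize(compound, fields=("InChI", "InChIKey", "IUPACName")):
    """Collect selected attributes into a dict, using None for missing ones."""
    return {field: getattr(compound, field, None) for field in fields}


# Stand-in for an object returned by pcq.compound(...)
demo = SimpleNamespace(IUPACName="2-acetyloxybenzoic acid")
summary = summarize(demo)
print(summary)
```

With the package installed, summarize(pcq.compound(2244)) would be the corresponding call. If the attributes are fetched lazily, each access may trigger a request.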

FAQ

For any questions, contact me on LinkedIn.
