A Python Package for Accessing Chemical Information from PubChem (https://pubchem.ncbi.nlm.nih.gov/).
PubChemQuery
PubChemQuery is a Python package that provides a simple and intuitive API for retrieving chemical information from the PubChem database. With this package, you can easily fetch chemical data, including:
- CID (Compound ID) by name
- All CIDs by name
- 2D images by CID or name
- SDF (Structure Data File) by CID or name
- Compound properties, including:
  - Molecular formula and weight
  - SMILES and InChI representations
  - IUPAC name and title
  - Physicochemical properties (e.g., XLogP, exact mass, TPSA)
  - Structural features (e.g., bond and atom counts, stereochemistry)
  - 3D properties (e.g., volume, steric quadrupole moments, feature counts)
  - Fingerprint and conformer information
The package offers a straightforward interface, allowing users to access PubChem data with minimal code. Whether you're a chemist, researcher, or developer, PubChemQuery simplifies the process of integrating chemical information into your projects.
Key Features:
- Retrieve chemical data by name or CID
- Access 2D images and SDF files
- Get compound properties, including physicochemical, structural, and 3D features
- Easy-to-use API with minimal code required
Simple and Concise API:
The following functions cover the tasks listed above, along with lookups by InChI or formula and similarity searches:
- get_cid_by_inchi(inchi): Get a CID by InChI
- get_cids_by_formula(formula): Get CIDs by formula
- get_cid_by_name(name): Get a CID by name
- get_cids_by_name(name): Get all CIDs by name
- get_image_by_cid(cid): Get a 2D image by CID
- get_image_by_name(name): Get a 2D image by name
- get_image_by_inchi(inchi): Get a 2D image by InChI
- get_structure_by_cid(cid): Get an SDF by CID
- get_structure_by_name(name): Get an SDF by name
- get_similar_structures_cids_by_compound_id(cid/SMILES/InChI): Get the CIDs of similar structures by CID, SMILES, or InChI
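For orientation, lookups like these correspond to requests against PubChem's public PUG REST service. The sketch below only builds request URLs with the standard library and makes no network calls. `build_pug_url` is a hypothetical helper, not part of PubChemQuery, and whether the package issues exactly these requests internally is an assumption.

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_pug_url(namespace, identifier, operation="", output="JSON"):
    """Build a PUG REST compound URL (hypothetical helper, not a PubChemQuery function).

    namespace  -- how the compound is identified, e.g. "name" or "cid"
    identifier -- the name or CID itself
    operation  -- what to retrieve, e.g. "cids" or "property/MolecularFormula"
    output     -- response format, e.g. "JSON", "SDF" or "PNG"
    """
    # Percent-encode the identifier so spaces and reserved characters are URL-safe.
    parts = [PUG_REST_BASE, "compound", namespace, quote(str(identifier), safe="")]
    if operation:
        parts.append(operation)
    parts.append(output)
    return "/".join(parts)


print(build_pug_url("name", "benzene", "cids"))   # name -> CIDs
print(build_pug_url("cid", 241, output="SDF"))    # CID -> SDF record
print(build_pug_url("cid", 241, output="PNG"))    # CID -> 2D image
print(build_pug_url("cid", 2244, "property/MolecularFormula,MolecularWeight"))
```

Identifiers that contain slashes, such as InChI strings, do not fit cleanly into a URL path, so this sketch covers only name and CID lookups.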
Compound Object:
The package also includes a Compound object that encapsulates the retrieved data, providing a convenient way to access and manipulate it.
- compound(cid_or_name): Create a compound object with properties and methods
Getting Started:
To use PubChemQuery, install the package and import it into your Python script. Refer to the example code snippets below for a quick start.
Installation
Install PubChemQuery with pip:
pip install PubChemQuery
Examples
Import package as:
import pubchemquery as pcq
Use the functions to retrieve data:
# get all cids by formula
cids = pcq.get_cids_by_formula('C6H6')
print(type(cids), len(cids))
# get a cid by inchi
cid = pcq.get_cid_by_inchi(
    'InChI=1S/C6H5NO3/c8-6-3-1-5(2-4-6)7(9)10/h1-4,8H')
print(cid)
# get a cid by name
cid = pcq.get_cid_by_name('benzene')
print(cid)
# get all cids by name
cids = pcq.get_cids_by_name('benzene')
print(type(cids), len(cids))
# get 2d image
# by cid
image = pcq.get_image_by_cid('241')
image
# by name
image = pcq.get_image_by_name('benzene')
image
# by inchi
image = pcq.get_image_by_inchi(
    'InChI=1S/C6H5NO3/c8-6-3-1-5(2-4-6)7(9)10/h1-4,8H')
print(image)
# get sdf by cid
sdf = pcq.get_structure_by_cid('241')
print(sdf)
# get sdf by name
sdf = pcq.get_structure_by_name('benzene')
print(sdf)
# get similar structure cids
# by cid
# cids = pcq.get_similar_structures_cids_by_compound_id('241')
# by SMILES
# cids = pcq.get_similar_structures_cids_by_compound_id(
#     'C1=CC=CC=C1', compound_id='SMILES')
# by InChI
cids = pcq.get_similar_structures_cids_by_compound_id(
    'InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H', compound_id='InChI')
print(type(cids), len(cids))
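When looking up many names, some lookups may fail, for example because of a misspelled name or a network error. The helper below is a minimal sketch of a batch lookup that records failures instead of stopping. `lookup_many` and its `pause` default are hypothetical, and how PubChemQuery reports a failed lookup is not documented here, so the sketch simply catches any exception. PubChem asks clients to stay at or below five requests per second, hence the optional pause between calls.

```python
import time


def lookup_many(names, lookup, pause=0.25):
    """Apply `lookup` to each name; map failures to None instead of raising.

    `lookup` is any callable taking one name, e.g. pcq.get_cid_by_name.
    `pause` is the delay in seconds between calls (hypothetical default).
    """
    results = {}
    for i, name in enumerate(names):
        if i and pause:
            time.sleep(pause)  # stay under PubChem's request-rate limit
        try:
            results[name] = lookup(name)
        except Exception:  # the real function's failure mode is an assumption
            results[name] = None
    return results


# Stand-in lookup so the sketch runs offline;
# with the package installed, pass pcq.get_cid_by_name instead.
def fake_lookup(name):
    known = {"benzene": 241}
    return known[name]  # raises KeyError for unknown names


print(lookup_many(["benzene", "not-a-compound"], fake_lookup, pause=0))
```

Passing the lookup function as an argument keeps the helper independent of the package, so the same loop works with any of the single-argument functions listed earlier.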
Make a compound and then get its properties:
# make a compound from a cid
cid = 2244
# compound = pcq.compound(cid)
# or from a name
name = '2-acetyloxybenzoic acid'
compound = pcq.compound(name)
print(compound)
# properties
# InChI
print(compound.InChI)
# InChIKey
print(compound.InChIKey)
# IUPACName
print(compound.IUPACName)
# similar structure cids
print(len(compound.similar_structure_cids))
# image
compound.image
# dataframe
compound.prop_df()
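To collect several properties at once, the attribute names shown above can be read in a loop. This is a minimal sketch: `summarize` is a hypothetical helper, not part of PubChemQuery, and the stand-in object below only mimics two of the attributes used in this example so that the code runs without network access.

```python
from types import SimpleNamespace


def summarize(compound, fields=("InChI", "InChIKey", "IUPACName")):
    """Return {field: value} for the given attribute names.

    Missing attributes map to None rather than raising AttributeError.
    """
    return {field: getattr(compound, field, None) for field in fields}


# Stand-in for pcq.compound('2-acetyloxybenzoic acid'); values are hand-written.
stub = SimpleNamespace(
    InChIKey="BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    IUPACName="2-acetyloxybenzoic acid",
)

for field, value in summarize(stub).items():
    print(f"{field}: {value}")
```

With the package installed, `summarize(pcq.compound(2244))` should work the same way, assuming the Compound object exposes these names as plain attributes as in the example above.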
FAQ
For any questions, contact me on LinkedIn.
File details
Details for the file pubchemquery-1.5.0.tar.gz.
File metadata
- Download URL: pubchemquery-1.5.0.tar.gz
- Upload date:
- Size: 14.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | dd71d568f525c8b8a74d4ac4d7357f06822bee0cdf7ffe943a0d834d5c615e07
MD5 | 7393a74db4a0fce9afce4fdc65d1f531
BLAKE2b-256 | 6c065bc4a35d28a3d6d597fde7ad6a14cff72c838e87330210157405e46d74a9
File details
Details for the file PubChemQuery-1.5.0-py3-none-any.whl.
File metadata
- Download URL: PubChemQuery-1.5.0-py3-none-any.whl
- Upload date:
- Size: 15.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | f9ea581da6d7307ac4ce1e5dabeba6d74b01c214622a9e4842539fd1c409b870
MD5 | 615eb7b0ea82bcd36498574bb234bf79
BLAKE2b-256 | 16ba531ce538a715e96a8cfad7d425aceddbc90e33a0f17f03b428768885255d