Simplifies interaction with the PubChem database via PUG-REST API.

# PubChmAPI Library

## Overview

### Introduction

The **PubChmAPI** Python package simplifies interaction with the PubChem database via the PUG-REST API. Instead of hand-writing one hard-coded function per endpoint, as traditional wrappers do, PubChmAPI generates its endpoint functions dynamically through metaprogramming, with the aim of covering the full PubChem schema. It also handles URL generation, automatic batching, and request throttling.
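To make the idea concrete, here is a minimal sketch of generating endpoint functions from a table instead of writing each one by hand. It is not PubChmAPI's actual source: `make_endpoint`, `ENDPOINTS`, `registry`, and the `batch_size` parameter are hypothetical names chosen for illustration. The sketch assumes that identifiers are batched as comma-separated values in the URL path and that each function returns one URL per batch. Throttling is left out.

```python
from typing import Callable, Dict, Iterable, List, Union

BASE_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

Identifier = Union[str, int]


def make_endpoint(domain: str, id_type: str, operation: str,
                  fmt: str = "txt", batch_size: int = 100) -> Callable[..., List[str]]:
    """Return a function that builds PUG-REST URLs (hypothetical helper)."""

    def endpoint(identifiers: Union[Identifier, Iterable[Identifier]]) -> List[str]:
        # Accept either a single identifier or an iterable of identifiers.
        if isinstance(identifiers, (str, int)):
            ids = [str(identifiers)]
        else:
            ids = [str(i) for i in identifiers]
        urls = []
        # Batching: one URL per chunk of comma-separated identifiers.
        for start in range(0, len(ids), batch_size):
            chunk = ",".join(ids[start:start + batch_size])
            urls.append(f"{BASE_URL}/{domain}/{id_type}/{chunk}/{operation}/{fmt}")
        return urls

    # Name the generated function after the naming pattern used in this README.
    endpoint.__name__ = f"{domain}_{id_type}_get_{operation}"
    return endpoint


# Hypothetical endpoint table: (domain, identifier type, operation, format).
ENDPOINTS = [
    ("compound", "cid", "synonyms", "txt"),
    ("compound", "cid", "sids", "txt"),
    ("gene", "geneid", "summary", "json"),
]

# Generate one function per table row and register it under its name.
registry: Dict[str, Callable[..., List[str]]] = {}
for domain, id_type, operation, fmt in ENDPOINTS:
    func = make_endpoint(domain, id_type, operation, fmt, batch_size=2)
    registry[func.__name__] = func

print(registry["compound_cid_get_synonyms"](2244))
print(registry["compound_cid_get_sids"]([2244, 3672, 2519]))
```

With `batch_size=2`, the second call yields two URLs: one for CIDs `2244,3672` and one for CID `2519`. Adding an endpoint means adding a row to the table rather than writing a new function.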

---

## Naming Convention

Functions in **PubChmAPI** follow a strict semantic naming convention to eliminate ambiguity:

`domain_identifier_get_operation_option`

* **Domain:** The primary database being queried (e.g., `compound`, `substance`, `assay`, `gene`).  
* **Identifier:** The input type provided (e.g., `cid`, `name`, `smiles`, `geneid`).  
* **Operation:** The specific data to retrieve (e.g., `properties`, `aids`, `synonyms`).  
* **Option (Optional):** Filters or variants (e.g., `active`, `inactive`, `2d`).  
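The convention above can be read mechanically. The following sketch splits a function name into its four parts. `parse_function_name` is a hypothetical helper, not part of PubChmAPI, and it assumes that the domain, identifier, and operation contain no underscores themselves.

```python
from typing import Dict, Optional


def parse_function_name(name: str) -> Dict[str, Optional[str]]:
    """Split a name of the form domain_identifier_get_operation[_option]."""
    head, sep, tail = name.partition("_get_")
    if not sep:
        raise ValueError(f"missing '_get_' in {name!r}")
    # The part before '_get_' holds the domain and the identifier type.
    domain, _, identifier = head.partition("_")
    # The part after '_get_' holds the operation and, optionally, an option.
    operation, _, option = tail.partition("_")
    if not (domain and identifier and operation):
        raise ValueError(f"incomplete function name: {name!r}")
    return {
        "domain": domain,
        "identifier": identifier,
        "operation": operation,
        "option": option or None,
    }


print(parse_function_name("compound_cid_get_aids_active"))
print(parse_function_name("compound_name_get_MolecularWeight"))
```

The first call reports the option `active`; the second has no option, so that field is `None`.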

---

## Functions

### Compound Property Functions (By Name)

Retrieve calculated properties using a compound name (e.g., "Aspirin").  
**Format:** `compound_name_get_[Property](identifier)`

#### Code Example

```python
# test_pubchmapi.py
from PubChmAPI import (
    compound_name_get_Title, 
    compound_name_get_MolecularFormula,
    compound_name_get_MolecularWeight, 
    compound_name_get_CanonicalSMILES, 
    compound_name_get_InChI,
    compound_name_get_InChIKey, 
    compound_name_get_IUPACName,
    compound_name_get_XLogP, 
    compound_name_get_ExactMass
)

def test_pubchmapi():
    compound = "Aspirin"
    print("Testing PubChmAPI functions for:", compound)
    
    print("Title:", compound_name_get_Title(compound))
    print("Molecular Formula:", compound_name_get_MolecularFormula(compound))
    print("Molecular Weight:", compound_name_get_MolecularWeight(compound))
    print("Canonical SMILES:", compound_name_get_CanonicalSMILES(compound))
    print("InChI:", compound_name_get_InChI(compound))
    print("InChIKey:", compound_name_get_InChIKey(compound))
    print("IUPAC Name:", compound_name_get_IUPACName(compound))
    print("XLogP:", compound_name_get_XLogP(compound))
    print("Exact Mass:", compound_name_get_ExactMass(compound))

if __name__ == "__main__":
    test_pubchmapi()
```

#### Sample Output

```text
Testing PubChmAPI functions for: Aspirin
Title: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/Title/txt']
Molecular Formula: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/MolecularFormula/txt']
Molecular Weight: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/MolecularWeight/txt']
Canonical SMILES: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/SMILES/txt']
InChI: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/InChI/txt']
InChIKey: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/InChIKey/txt']
IUPAC Name: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/IUPACName/txt']
XLogP: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/XLogP/txt']
Exact Mass: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/ExactMass/txt']
```
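In this sample output each call returns a list of PUG-REST URLs. If you need the data behind those URLs, one option is a small download helper such as the sketch below. `fetch_all` and `_http_get` are hypothetical names, not PubChmAPI functions, and the demo uses a canned response so it runs without network access.

```python
from typing import Callable, List
from urllib.request import urlopen


def _http_get(url: str) -> str:
    """Download a URL and return its body as text (performs a real request)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def fetch_all(urls: List[str], get: Callable[[str], str] = _http_get) -> List[str]:
    """Fetch every URL in the list and return the stripped response bodies."""
    return [get(url).strip() for url in urls]


# Offline demo: a stand-in getter that returns a canned response.
urls = ["https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/Aspirin/property/MolecularFormula/txt"]
canned = lambda url: "C9H8O4\n"
print(fetch_all(urls, get=canned))
```

Dropping the `get=` argument makes `fetch_all` issue real HTTP requests, in which case you should respect PubChem's request-rate limits.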

### Compound CID Functions

Retrieve data using a Compound Identifier (CID).  
**Format:** `compound_cid_get_[Operation](identifier)`

#### Code Example

```python
# test_pubchmapi_cid.py
from PubChmAPI import (
    compound_cid_get_description,
    compound_cid_get_synonyms,
    compound_cid_get_sids,
    compound_cid_get_cids,
    compound_cid_get_conformers,
    compound_cid_get_png,
    compound_cid_get_aids,
    compound_cid_get_aids_active,
    compound_cid_get_aids_inactive,
    compound_cid_get_assaysummary
)

def test_cid_functions():
    cid = 2244  # Aspirin CID
    print(f"Testing PubChmAPI CID functions for CID: {cid}")

    print("Description:", compound_cid_get_description(cid))
    print("Synonyms:", compound_cid_get_synonyms(cid)[:5], "...")
    print("Substance IDs:", compound_cid_get_sids(cid)[:5], "...")
    print("Self-retrieved CIDs:", compound_cid_get_cids(cid))
    print("Conformers:", compound_cid_get_conformers(cid)[:3], "...")
    print("PNG URL or data type:", type(compound_cid_get_png(cid)))
    print("All Assay IDs:", compound_cid_get_aids(cid)[:5], "...")
    print("Active Assay IDs:", compound_cid_get_aids_active(cid)[:5], "...")
    print("Inactive Assay IDs:", compound_cid_get_aids_inactive(cid)[:5], "...")
    print("Assay Summary:", compound_cid_get_assaysummary(cid))

if __name__ == "__main__":
    test_cid_functions()
```

#### Sample Output

```text
Testing PubChmAPI CID functions for CID: 2244
Description: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/description/xml']
Synonyms: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/synonyms/txt'] ...
Substance IDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/sids/txt'] ...
Self-retrieved CIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/cids/txt']
Conformers: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/conformers/xml'] ...
PNG URL or data type: <class 'list'>
All Assay IDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/aids/txt'] ...
Active Assay IDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/aids/txt?aids_type=active'] ...
Inactive Assay IDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/aids/txt?aids_type=inactive'] ...
Assay Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/2244/assaysummary/xml']
```
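The `active` and `inactive` options appear in this output as an `aids_type` query parameter on the same `aids` endpoint. A minimal sketch of that mapping follows. `build_aids_url` is a hypothetical helper written for illustration, not a PubChmAPI function.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_aids_url(cid: int, option: Optional[str] = None) -> str:
    """Build the assay-ID URL for a CID, optionally filtered by activity."""
    url = f"{BASE_URL}/compound/cid/{cid}/aids/txt"
    if option is not None:
        if option not in ("active", "inactive"):
            raise ValueError(f"unsupported option: {option!r}")
        # The option becomes a query parameter rather than a path segment.
        url += "?" + urlencode({"aids_type": option})
    return url


print(build_aids_url(2244))
print(build_aids_url(2244, "active"))
```

The second call produces the same URL as the "Active Assay IDs" line in the sample output.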

### Biological Domain Functions

Retrieve data related to proteins, genes, taxonomy, and cell lines.

#### Code Example

```python
# test_pubchmapi_biological.py
from PubChmAPI import (
    # Protein
    protein_accession_get_summary,
    protein_accession_get_aids,
    protein_gi_get_summary,
    protein_synonym_get_aids,
    # Gene
    gene_geneid_get_summary,
    gene_geneid_get_aids,
    gene_genesymbol_get_summary,
    gene_genesymbol_get_aids,
    # Taxonomy
    taxonomy_taxid_get_summary,
    taxonomy_taxid_get_aids,
    taxonomy_synonym_get_aids,
    # Cell line
    cell_cellacc_get_summary,
    cell_cellacc_get_aids,
    cell_synonym_get_summary
)

def test_biological_functions():
    print("Testing PubChmAPI Biological Functions\n")

    # Protein
    accession = "P68871"
    gi = "4506723"
    protein_syn = "Hemoglobin"
    print("Protein Functions:")
    print("Summary:", protein_accession_get_summary(accession))
    print("AIDs:", protein_accession_get_aids(accession)[:5], "...")
    print("GI Summary:", protein_gi_get_summary(gi))
    print("Synonym AIDs:", protein_synonym_get_aids(protein_syn)[:5], "...\n")

    # Gene
    geneid = "3043"
    symbol = "HBB"
    print("Gene Functions:")
    print("GeneID Summary:", gene_geneid_get_summary(geneid))
    print("GeneID AIDs:", gene_geneid_get_aids(geneid)[:5], "...")
    print("Symbol Summary:", gene_genesymbol_get_summary(symbol))
    print("Symbol AIDs:", gene_genesymbol_get_aids(symbol)[:5], "...\n")

    # Taxonomy
    taxid = "9606"
    tax_syn = "Human"
    print("Taxonomy Functions:")
    print("TaxID Summary:", taxonomy_taxid_get_summary(taxid))
    print("TaxID AIDs:", taxonomy_taxid_get_aids(taxid)[:5], "...")
    print("Synonym AIDs:", taxonomy_synonym_get_aids(tax_syn)[:5], "...\n")

    # Cell Line
    cellacc = "CVCL_0030"
    cell_syn = "HeLa"
    print("Cell Line Functions:")
    print("Cell Summary:", cell_cellacc_get_summary(cellacc))
    print("Cell AIDs:", cell_cellacc_get_aids(cellacc)[:5], "...")
    print("Synonym Summary:", cell_synonym_get_summary(cell_syn))

if __name__ == "__main__":
    test_biological_functions()
```

#### Sample Output

```text
Testing PubChmAPI Biological Functions

Protein Functions:
Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/protein/accession/P68871/summary/json']
AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/protein/accession/P68871/aids/txt'] ...
GI Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/protein/gi/4506723/summary/json']
Synonym AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/protein/synonym/Hemoglobin/aids/txt'] ...

Gene Functions:
GeneID Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/gene/geneid/3043/summary/json']
GeneID AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/gene/geneid/3043/aids/txt'] ...
Symbol Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/gene/genesymbol/HBB/summary/json']
Symbol AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/gene/genesymbol/HBB/aids/txt'] ...

Taxonomy Functions:
TaxID Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/taxonomy/taxid/9606/summary/json']
TaxID AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/taxonomy/taxid/9606/aids/txt'] ...
Synonym AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/taxonomy/synonym/Human/aids/txt'] ...

Cell Line Functions:
Cell Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/cell/cellacc/CVCL_0030/summary/json']
Cell AIDs: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/cell/cellacc/CVCL_0030/aids/txt'] ...
Synonym Summary: ['https://pubchem.ncbi.nlm.nih.gov/rest/pug/cell/synonym/HeLa/summary/json']
```

Ahmed Alhilal

0.0.42 (12/12/2025)

  • First Release
