BioThings Typed Client

About BioThings.io

BioThings.io is a platform that provides a network of high-performance biomedical APIs and tools for building FAIR (Findable, Accessible, Interoperable, and Reusable) data services. The platform includes several key components:

  • Core BioThings APIs:

    • MyGene.info - Gene Annotation Service
    • MyVariant.info - Variant Annotation Service
    • MyChem.info - Chemical and Drug Annotation Service
    • MyDisease.info - Disease Annotation Service
    • Taxonomy API - For querying taxonomic information
  • Development Tools:

    • BioThings SDK - A Python-based toolkit for building high-performance data APIs
    • BioThings Studio - A pre-configured environment for building and administering BioThings APIs
  • Discovery and Integration Tools:

    • SmartAPI - A registry for semantically annotated APIs
    • BioThings Explorer - A tool for exploring biological data through linked API services

This typed client library is built on top of the BioThings ecosystem, providing type-safe access to these services through Python.

Project Description

A strongly typed Python wrapper around the BioThings Client library. It adds type safety and better IDE support through Python's type hints and Pydantic models.

Features

  • Type Safety: Strongly typed models for all BioThings data using Pydantic
  • IDE Support: Full autocompletion and type checking in modern IDEs
  • Synchronous & Asynchronous: Support for both sync and async operations
  • Helper Methods: Additional utility methods for common operations
  • Validation: Runtime type checking and data validation
  • Compatibility: Maintains full compatibility with the original BioThings client
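To make the type-safety and validation points concrete without requiring the library or network access, here is a minimal sketch that uses a standard-library dataclass as a stand-in for a Pydantic model. `GeneRecord` and `from_raw` are hypothetical names used only for illustration; they are not part of biothings-typed-client, whose real models are Pydantic classes covering many more fields.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical stand-in for a typed gene model (not the library's class)."""

    id: str
    symbol: str
    name: Optional[str] = None

    @classmethod
    def from_raw(cls, raw: Dict[str, Any]) -> "GeneRecord":
        # Reject payloads whose required fields are missing or not strings,
        # so bad data fails here rather than deep inside downstream code.
        for key in ("_id", "symbol"):
            if not isinstance(raw.get(key), str):
                raise ValueError(f"field {key!r} must be a string, got {raw.get(key)!r}")
        return cls(id=raw["_id"], symbol=raw["symbol"], name=raw.get("name"))


# Illustrative payload shaped like a gene hit; not fetched from the API.
raw_hit = {"_id": "1017", "symbol": "CDK2", "name": "cyclin dependent kinase 2"}
gene = GeneRecord.from_raw(raw_hit)
print(gene.symbol)  # attribute access that an IDE can autocomplete and type-check
```

Compared with indexing a raw dictionary, a misspelled attribute is flagged by the type checker, and a malformed payload is rejected at the boundary.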

Installation

Clone the Repository

git clone https://github.com/longevity-genie/biothings-typed-client.git
cd biothings-typed-client

Using pip

pip install biothings-typed-client

Using UV (Recommended)

UV is a fast Python package installer and resolver, written in Rust. It's designed to be a drop-in replacement for pip and pip-tools.

  1. Install UV (if you haven't already):

    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Inside the cloned repository, install the project and its dependencies:

    uv sync
    
  3. Alternatively, create a virtual environment and install the published package into it:

    uv venv
    source .venv/bin/activate  # On Unix/macOS
    # or
    .venv\Scripts\activate  # On Windows
    uv pip install biothings-typed-client
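
After installing, it can help to confirm that the distribution is visible to the interpreter you are actually running. The following is a small sketch using only the standard library; `installed_version` is a hypothetical helper name, not something the package provides.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("biothings-typed-client")
if version:
    print(f"biothings-typed-client {version} is installed")
else:
    print("biothings-typed-client is not installed in this environment")
```

If the second message appears, the most likely cause is that the virtual environment is not activated.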
    

Quick Start

Synchronous Client

from biothings_typed_client.variants import VariantClient

# Initialize the client
client = VariantClient()

# Get a single variant
variant = client.getvariant("chr7:g.140453134T>C")
if variant:
    print(f"Variant ID: {variant.get_variant_id()}")
    print(f"Chromosome: {variant.chrom}")
    print(f"Position: {variant.vcf.position}")
    print(f"Reference: {variant.vcf.ref}")
    print(f"Alternative: {variant.vcf.alt}")

# Get multiple variants
variants = client.getvariants(["chr7:g.140453134T>C", "chr9:g.107620835G>A"])
for variant in variants:
    print(f"Found variant: {variant.get_variant_id()}")

# Query variants
results = client.query("dbnsfp.genename:cdk2", size=5)
for hit in results["hits"]:
    print(f"Found variant: {hit['_id']}")
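
The IDs passed to `getvariant` above are HGVS-style strings such as `chr7:g.140453134T>C`. As a rough sketch, an ID of this shape can be checked locally before a request is sent. The pattern below is an assumption that covers only single-nucleotide substitutions and is deliberately loose about chromosome names; `parse_snv_id` and `SnvId` are hypothetical names, not part of the library.

```python
import re
from typing import NamedTuple, Optional

# Matches only simple substitutions such as "chr7:g.140453134T>C".
_SNV_ID = re.compile(
    r"^chr(?P<chrom>[0-9]{1,2}|X|Y|MT):g\.(?P<pos>[0-9]+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


class SnvId(NamedTuple):
    chrom: str
    position: int
    ref: str
    alt: str


def parse_snv_id(variant_id: str) -> Optional[SnvId]:
    """Split an SNV-style ID into its parts, or return None if it does not match."""
    match = _SNV_ID.match(variant_id)
    if match is None:
        return None
    return SnvId(match["chrom"], int(match["pos"]), match["ref"], match["alt"])


parsed = parse_snv_id("chr7:g.140453134T>C")
print(parsed)
print(parse_snv_id("rs58991260"))  # not an HGVS-style ID, so this is None
```

Insertions, deletions, and other variant types use different ID shapes and would need their own patterns.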

Asynchronous Client

import asyncio
from biothings_typed_client.variants import VariantClientAsync

async def main():
    # Initialize the client
    client = VariantClientAsync()
    
    # Get a single variant
    variant = await client.getvariant("chr7:g.140453134T>C")
    if variant:
        print(f"Variant ID: {variant.get_variant_id()}")
        print(f"Has clinical significance: {variant.has_clinical_significance()}")
        print(f"Has functional predictions: {variant.has_functional_predictions()}")
    
    # Query variants
    results = await client.query("dbnsfp.genename:cdk2", size=5)
    print("\nQuery results:")
    print(results)

    # Close the client when finished
    await client.close()

# Run the async code
asyncio.run(main())
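
The asynchronous client is most useful when several lookups run concurrently. The sketch below shows the pattern with `asyncio.gather`. So that it runs offline, it uses `StubVariantClient`, a hypothetical stand-in that mimics only the `getvariant` and `close` coroutines; with the real library you would pass a `VariantClientAsync` instance instead.

```python
import asyncio
from typing import Dict, List, Optional


class StubVariantClient:
    """Offline stand-in for an async client (hypothetical, for illustration)."""

    async def getvariant(self, variant_id: str) -> Optional[Dict[str, str]]:
        await asyncio.sleep(0.01)  # simulate network latency
        return {"_id": variant_id}

    async def close(self) -> None:
        pass


async def fetch_variants(client, variant_ids: List[str]) -> list:
    # Schedule all lookups at once; gather returns results in input order.
    return await asyncio.gather(*(client.getvariant(v) for v in variant_ids))


async def main() -> list:
    client = StubVariantClient()
    try:
        return await fetch_variants(
            client, ["chr7:g.140453134T>C", "chr9:g.107620835G>A"]
        )
    finally:
        await client.close()


fetched = asyncio.run(main())
print([v["_id"] for v in fetched])
```

For long ID lists, a single batch call such as `getvariants` (shown in the synchronous example) is probably the better fit; concurrency helps most when the requests differ in kind.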

Gene Client Examples

Synchronous Gene Client

from biothings_typed_client.genes import GeneClient

# Initialize the client
client = GeneClient()

# Get a single gene
gene = client.getgene("1017")  # Using Entrez ID
if gene:
    print(f"Gene ID: {gene.id}")
    print(f"Symbol: {gene.symbol}")
    print(f"Name: {gene.name}")

# Get multiple genes
genes = client.getgenes(["1017", "1018"])  # Using Entrez IDs
for gene in genes:
    print(f"Found gene: {gene.symbol} ({gene.name})")

# Query genes
results = client.query("symbol:CDK2", size=5)
for hit in results["hits"]:
    print(f"Found gene: {hit['symbol']} ({hit['name']})")

# Batch query genes
genes = client.querymany(["CDK2", "BRCA1"], scopes=["symbol"], size=1)
for gene in genes:
    print(f"Found gene: {gene['symbol']} ({gene['name']})")
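
Batch queries can return several hits for one term and none for another, so it is often convenient to group the results by the term that produced them. The sketch below assumes the typed client passes through the result shape of the upstream BioThings client, where each result carries a `query` key and unmatched terms come back with `notfound` set to `True`. `group_by_query` is a hypothetical helper, and the sample data is illustrative rather than fetched from the API.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

Result = Dict[str, Any]


def group_by_query(results: Iterable[Result]) -> Tuple[Dict[str, List[Result]], List[str]]:
    """Split batch results into hits keyed by query term and a list of misses."""
    hits: Dict[str, List[Result]] = defaultdict(list)
    missing: List[str] = []
    for result in results:
        if result.get("notfound"):
            missing.append(result["query"])
        else:
            hits[result["query"]].append(result)
    return dict(hits), missing


# Illustrative results, not fetched from the API.
sample = [
    {"query": "CDK2", "_id": "1017", "symbol": "CDK2"},
    {"query": "NOT_A_GENE", "notfound": True},
]
hits, missing = group_by_query(sample)
print(sorted(hits), missing)
```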

Asynchronous Gene Client

import asyncio
from biothings_typed_client.genes import GeneClientAsync

async def main():
    # Initialize the client
    client = GeneClientAsync()
    
    # Get a single gene
    gene = await client.getgene("1017")  # Using Entrez ID
    if gene:
        print(f"Gene ID: {gene.id}")
        print(f"Symbol: {gene.symbol}")
        print(f"Name: {gene.name}")
    
    # Query genes
    results = await client.query("symbol:CDK2", size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found gene: {hit['symbol']} ({hit['name']})")

# Run the async code
asyncio.run(main())

Chemical Client Examples

Synchronous Chemical Client

from biothings_typed_client.chem import ChemClient

# Initialize the client
client = ChemClient()

# Get a single chemical
chem = client.getchem("ZRALSGWEFCBTJO-UHFFFAOYSA-N")  # Using InChI key
if chem:
    print(f"Chemical ID: {chem.id}")
    print(f"Molecular Formula: {chem.pubchem.molecular_formula}")
    print(f"SMILES: {chem.pubchem.smiles}")
    print(f"Molecular Weight: {chem.pubchem.molecular_weight}")
    print(f"XLogP: {chem.pubchem.xlogp}")
    print(f"Hydrogen Bond Donors: {chem.pubchem.hydrogen_bond_donor_count}")
    print(f"Hydrogen Bond Acceptors: {chem.pubchem.hydrogen_bond_acceptor_count}")
    print(f"Rotatable Bonds: {chem.pubchem.rotatable_bond_count}")
    print(f"Topological Polar Surface Area: {chem.pubchem.topological_polar_surface_area} Ų")

# Get multiple chemicals
chems = client.getchems(["ZRALSGWEFCBTJO-UHFFFAOYSA-N", "RRUDCFGSUDOHDG-UHFFFAOYSA-N"])
for chem in chems:
    print(f"\nFound chemical: {chem.id}")
    if chem.has_pubchem():
        print(f"Molecular Formula: {chem.pubchem.molecular_formula}")
        print(f"Molecular Weight: {chem.pubchem.molecular_weight}")

# Query chemicals with different query-string forms
print("\n=== Fielded Queries ===")
results = client.query("pubchem.molecular_formula:C6H12O6", size=5)
for hit in results["hits"]:
    print(f"Found chemical: {hit['_id']}")

print("\n=== Range Queries (bounded) ===")
results = client.query("pubchem.molecular_weight:[100 TO 200]", size=5)
for hit in results["hits"]:
    print(f"Found chemical: {hit['_id']}")

print("\n=== Range Queries (open-ended) ===")
results = client.query("pubchem.xlogp:>2", size=5)
for hit in results["hits"]:
    print(f"Found chemical: {hit['_id']}")

print("\n=== Boolean Queries ===")
results = client.query("pubchem.hydrogen_bond_donor_count:>2 AND pubchem.hydrogen_bond_acceptor_count:>4", size=5)
for hit in results["hits"]:
    print(f"Found chemical: {hit['_id']}")

# Batch query chemicals with field filtering
chems = client.querymany(
    ["C6H12O6", "C12H22O11"],
    scopes=["pubchem.molecular_formula"],
    fields=["pubchem.molecular_weight", "pubchem.xlogp", "pubchem.smiles"],
    size=1
)
for chem in chems:
    print(f"\nFound chemical: {chem['_id']}")
    if 'pubchem' in chem:
        print(f"Molecular Weight: {chem['pubchem'].get('molecular_weight')}")
        print(f"XLogP: {chem['pubchem'].get('xlogp')}")
        print(f"SMILES: {chem['pubchem'].get('smiles')}")

Asynchronous Chemical Client

import asyncio
from biothings_typed_client.chem import ChemClientAsync

async def main():
    # Initialize the client
    client = ChemClientAsync()
    
    # Get a single chemical
    chem = await client.getchem("ZRALSGWEFCBTJO-UHFFFAOYSA-N")  # Using InChI key
    if chem:
        print(f"Chemical ID: {chem.id}")
        print(f"Has PubChem info: {chem.has_pubchem()}")
        if chem.has_pubchem():
            print(f"Molecular Formula: {chem.pubchem.molecular_formula}")
            print(f"Molecular Weight: {chem.pubchem.molecular_weight}")
            print(f"XLogP: {chem.pubchem.xlogp}")
            print(f"Hydrogen Bond Donors: {chem.pubchem.hydrogen_bond_donor_count}")
            print(f"Hydrogen Bond Acceptors: {chem.pubchem.hydrogen_bond_acceptor_count}")
            print(f"Rotatable Bonds: {chem.pubchem.rotatable_bond_count}")
            print(f"Topological Polar Surface Area: {chem.pubchem.topological_polar_surface_area} Ų")
    
    # Query chemicals with different query-string forms
    print("\n=== Fielded Queries ===")
    results = await client.query("pubchem.molecular_formula:C6H12O6", size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found chemical: {hit['_id']}")

    print("\n=== Range Queries (bounded) ===")
    results = await client.query("pubchem.molecular_weight:[100 TO 200]", size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found chemical: {hit['_id']}")

    print("\n=== Range Queries (open-ended) ===")
    results = await client.query("pubchem.xlogp:>2", size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found chemical: {hit['_id']}")
        
    print("\n=== Boolean Queries ===")
    results = await client.query("pubchem.hydrogen_bond_donor_count:>2 AND pubchem.hydrogen_bond_acceptor_count:>4", size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found chemical: {hit['_id']}")
    
    await client.close()

# Run the async code
asyncio.run(main())
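
In the example above, `await client.close()` runs only if every query succeeds. One way to guarantee cleanup is a small async context manager, sketched here under the assumption that the client exposes a `close()` coroutine as shown. `closing_client` and `StubClient` are hypothetical names; the stub exists only so the snippet runs offline.

```python
import asyncio
from contextlib import asynccontextmanager


class StubClient:
    """Offline stand-in exposing query() and close() coroutines (hypothetical)."""

    def __init__(self) -> None:
        self.closed = False

    async def query(self, q: str) -> dict:
        if not q:
            raise ValueError("empty query")
        return {"hits": []}

    async def close(self) -> None:
        self.closed = True


@asynccontextmanager
async def closing_client(client):
    # Yield the client and always await close(), even if the body raises.
    try:
        yield client
    finally:
        await client.close()


async def run_failing_query() -> StubClient:
    stub = StubClient()
    try:
        async with closing_client(stub) as client:
            await client.query("")  # raises ValueError inside the block
    except ValueError:
        pass
    return stub


stub = asyncio.run(run_failing_query())
print(stub.closed)
```

With the real library, the same wrapper could be applied to a `ChemClientAsync` instance in place of the stub.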

The chemical client provides access to detailed chemical compound information from MyChem.info, including:

  • Structural Information:

    • Molecular formula
    • SMILES strings
    • InChI and InChIKey
    • IUPAC names
  • Physical Properties:

    • Molecular weight
    • Exact mass
    • Monoisotopic weight
    • XLogP (octanol-water partition coefficient)
    • Topological polar surface area
  • Chemical Properties:

    • Hydrogen bond donors/acceptors
    • Rotatable bonds
    • Chiral centers
    • Formal charge
    • Molecular complexity
  • Stereochemistry:

    • Chiral atom count
    • Chiral bond count
    • Defined/undefined stereocenters

For more information about available fields and data sources, see the MyChem.info documentation.
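
As an example of putting these properties to use, the sketch below checks a compound against Lipinski's rule of five, using molecular weight, logP, and hydrogen bond donor and acceptor counts. The parameter names mirror the `pubchem` attributes used in the examples above. `lipinski_violations` is a hypothetical helper, XLogP is treated as an estimate of logP, and the numeric inputs are invented values chosen only to exercise the checks.

```python
from typing import List, Optional


def lipinski_violations(
    molecular_weight: Optional[float],
    xlogp: Optional[float],
    hydrogen_bond_donor_count: Optional[int],
    hydrogen_bond_acceptor_count: Optional[int],
) -> List[str]:
    """Return the rule-of-five criteria a compound violates (missing values are skipped)."""
    violations = []
    if molecular_weight is not None and molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if xlogp is not None and xlogp > 5:
        violations.append("logP > 5")
    if hydrogen_bond_donor_count is not None and hydrogen_bond_donor_count > 5:
        violations.append("more than 5 hydrogen bond donors")
    if hydrogen_bond_acceptor_count is not None and hydrogen_bond_acceptor_count > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Hypothetical property values, not taken from any real compound record.
print(lipinski_violations(320.4, 2.1, 2, 5))
print(lipinski_violations(812.0, 6.3, 4, 12))
```

With the client, the arguments would come from fields such as `chem.pubchem.molecular_weight` and `chem.pubchem.xlogp`, after checking `chem.has_pubchem()`.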

Taxon Client Examples

Synchronous Taxon Client

from biothings_typed_client.taxons import TaxonClient

# Initialize the client
client = TaxonClient()

# Get a single taxon
taxon = client.gettaxon(9606)  # Using taxon ID for Homo sapiens
if taxon:
    print(f"Taxon ID: {taxon.id}")
    print(f"Scientific Name: {taxon.scientific_name}")
    print(f"Common Name: {taxon.common_name}")

# Get multiple taxa
taxa = client.gettaxons([9606, 10090])  # Homo sapiens and Mus musculus
for taxon in taxa:
    print(f"Found taxon: {taxon.scientific_name}")

# Query taxa (quote multi-word values so the whole phrase is matched against the field)
results = client.query('scientific_name:"Homo sapiens"', size=5)
for hit in results["hits"]:
    print(f"Found taxon: {hit['scientific_name']}")

# Batch query taxa
taxa = client.querymany(["Homo sapiens", "Mus musculus"], scopes=["scientific_name"], size=1)
for taxon in taxa:
    print(f"Found taxon: {taxon['scientific_name']}")

Asynchronous Taxon Client

import asyncio
from biothings_typed_client.taxons import TaxonClientAsync

async def main():
    # Initialize the client
    client = TaxonClientAsync()
    
    # Get a single taxon
    taxon = await client.gettaxon(9606)  # Using taxon ID for Homo sapiens
    if taxon:
        print(f"Taxon ID: {taxon.id}")
        print(f"Has lineage: {taxon.has_lineage()}")
        print(f"Has common name: {taxon.has_common_name()}")
    
    # Query taxa (quote multi-word values so the whole phrase is matched against the field)
    results = await client.query('scientific_name:"Homo sapiens"', size=5)
    print("\nQuery results:")
    for hit in results["hits"]:
        print(f"Found taxon: {hit['scientific_name']}")

# Run the async code
asyncio.run(main())

Variant Client Examples

Synchronous Variant Client

from biothings_typed_client.variants import VariantClient

# Initialize the client
client = VariantClient()

# Get a single variant
variant = client.getvariant("chr7:g.140453134T>C")
if variant:
    print(f"Variant ID: {variant.get_variant_id()}")
    print(f"Has clinical significance: {variant.has_clinical_significance()}")
    print(f"Variant details: {variant.model_dump_json(indent=2)}")
else:
    print("Variant not found")

# Query variants using different syntax
print("\n=== Simple Queries ===")
results = client.query("rs58991260")
print(f"Query 'rs58991260' results: {results['total']} hits")
if results['hits']:
    print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
    print(f"Score: {results['hits'][0].get('_score', 'No score')}")

print("\n=== Fielded Queries ===")
results = client.query("dbsnp.vartype:snp")
print(f"Query 'dbsnp.vartype:snp' results: {results['total']} hits")
if results['hits']:
    print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
    print(f"Score: {results['hits'][0].get('_score', 'No score')}")

print("\n=== Range Queries ===")
results = client.query("dbnsfp.polyphen2.hdiv.score:>0.99")
print(f"Query 'dbnsfp.polyphen2.hdiv.score:>0.99' results: {results['total']} hits")
if results['hits']:
    print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
    print(f"Score: {results['hits'][0].get('_score', 'No score')}")

print("\n=== Wildcard Queries ===")
results = client.query("dbnsfp.genename:CDK?")
print(f"Query 'dbnsfp.genename:CDK?' results: {results['total']} hits")
if results['hits']:
    print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
    print(f"Score: {results['hits'][0].get('_score', 'No score')}")

print("\n=== Boolean Queries ===")
results = client.query("_exists_:dbsnp AND dbsnp.vartype:snp")
print(f"Query '_exists_:dbsnp AND dbsnp.vartype:snp' results: {results['total']} hits")
if results['hits']:
    print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
    print(f"Score: {results['hits'][0].get('_score', 'No score')}")
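
The query strings above follow a few recurring forms: `field:value`, ranges, and boolean combinations. When queries are assembled from user input, small helper functions can keep those forms consistent. The following is a sketch with hypothetical helper names (`term`, `between`, `greater_than`, `all_of`). It quotes values that contain whitespace but deliberately leaves wildcard characters such as `?` untouched, and it does not try to escape every reserved character.

```python
def term(field: str, value: str) -> str:
    """Build field:value, quoting the value when it contains whitespace."""
    if any(ch.isspace() for ch in value):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{field}:"{escaped}"'
    return f"{field}:{value}"


def between(field: str, low, high) -> str:
    """Build an inclusive range clause, e.g. field:[0.9 TO 1]."""
    return f"{field}:[{low} TO {high}]"


def greater_than(field: str, bound) -> str:
    """Build an open-ended range clause, e.g. field:>0.99."""
    return f"{field}:>{bound}"


def all_of(*clauses: str) -> str:
    """Join clauses with AND, parenthesising each one to keep precedence explicit."""
    return " AND ".join(f"({clause})" for clause in clauses)


print(term("dbnsfp.genename", "CDK?"))
print(greater_than("dbnsfp.polyphen2.hdiv.score", 0.99))
print(between("dbnsfp.polyphen2.hdiv.score", 0.9, 1))
print(all_of(term("_exists_", "dbsnp"), term("dbsnp.vartype", "snp")))
print(term("scientific_name", "Homo sapiens"))
```

The resulting strings can be passed to `client.query(...)` exactly like the hand-written ones.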

Asynchronous Variant Client

import asyncio
from biothings_typed_client.variants import VariantClientAsync

async def main():
    client = VariantClientAsync()
    
    # Get a single variant
    variant = await client.getvariant("chr7:g.140453134T>C")
    if variant:
        print(f"Variant ID: {variant.get_variant_id()}")
        print(f"Has clinical significance: {variant.has_clinical_significance()}")
        print(f"Variant details: {variant.model_dump_json(indent=2)}")
    else:
        print("Variant not found")
        
    # Query variants using different syntax
    print("\n=== Simple Queries ===")
    results = await client.query("rs58991260")
    print(f"Query 'rs58991260' results: {results['total']} hits")
    if results['hits']:
        print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
        print(f"Score: {results['hits'][0].get('_score', 'No score')}")
        
    print("\n=== Fielded Queries ===")
    results = await client.query("dbsnp.vartype:snp")
    print(f"Query 'dbsnp.vartype:snp' results: {results['total']} hits")
    if results['hits']:
        print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
        print(f"Score: {results['hits'][0].get('_score', 'No score')}")
        
    print("\n=== Range Queries ===")
    results = await client.query("dbnsfp.polyphen2.hdiv.score:>0.99")
    print(f"Query 'dbnsfp.polyphen2.hdiv.score:>0.99' results: {results['total']} hits")
    if results['hits']:
        print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
        print(f"Score: {results['hits'][0].get('_score', 'No score')}")
        
    print("\n=== Wildcard Queries ===")
    results = await client.query("dbnsfp.genename:CDK?")
    print(f"Query 'dbnsfp.genename:CDK?' results: {results['total']} hits")
    if results['hits']:
        print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
        print(f"Score: {results['hits'][0].get('_score', 'No score')}")
        
    print("\n=== Boolean Queries ===")
    results = await client.query("_exists_:dbsnp AND dbsnp.vartype:snp")
    print(f"Query '_exists_:dbsnp AND dbsnp.vartype:snp' results: {results['total']} hits")
    if results['hits']:
        print(f"First result: {results['hits'][0].get('_id', 'No ID')}")
        print(f"Score: {results['hits'][0].get('_score', 'No score')}")
    
    await client.close()

# Run the async code
asyncio.run(main())

Available Clients

The library currently provides the following typed clients:

  • VariantClient / VariantClientAsync: For accessing variant data
  • GeneClient / GeneClientAsync: For accessing gene data
  • ChemClient / ChemClientAsync: For accessing chemical compound data
  • TaxonClient / TaxonClientAsync: For accessing taxonomic information
  • More clients coming soon...

Response Models

The library provides strongly typed response models for all data types. For example, the VariantResponse model includes:

class VariantResponse(BaseModel):
    id: str = Field(description="Variant identifier")
    version: int = Field(description="Version number")
    chrom: str = Field(description="Chromosome number")
    hg19: GenomicLocation = Field(description="HG19 genomic location")
    vcf: VCFInfo = Field(description="VCF information")
    
    # Optional annotation fields
    cadd: Optional[CADDScore] = None
    clinvar: Optional[ClinVarAnnotation] = None
    cosmic: Optional[CosmicAnnotation] = None
    dbnsfp: Optional[DbNSFPPrediction] = None
    dbsnp: Optional[DbSNPAnnotation] = None
    # ... and more

Helper Methods

The response models include useful helper methods:

# Get a standardized variant ID
variant.get_variant_id()

# Check for clinical significance
variant.has_clinical_significance()

# Check for functional predictions
variant.has_functional_predictions()
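
Because annotation blocks such as `clinvar` and `dbsnp` are optional, code that reads nested attributes should guard against `None` first. The sketch below illustrates that pattern with dataclass stand-ins. `VariantStub`, `DbSnpStub`, `has_dbsnp`, and the `rsid` field are all assumptions made for illustration, and the IDs are placeholders. The library's own helpers may be implemented differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DbSnpStub:
    rsid: str  # assumed field name, for illustration only


@dataclass
class VariantStub:
    id: str
    dbsnp: Optional[DbSnpStub] = None

    def has_dbsnp(self) -> bool:
        # Hypothetical helper in the style of has_clinical_significance().
        return self.dbsnp is not None


def rsid_or_none(variant: VariantStub) -> Optional[str]:
    # Guard the optional block before touching its attributes.
    return variant.dbsnp.rsid if variant.dbsnp is not None else None


# Placeholder records; neither the IDs nor the rsid refer to real variants.
annotated = VariantStub(id="placeholder-1", dbsnp=DbSnpStub(rsid="rs0000000"))
bare = VariantStub(id="placeholder-2")

print(rsid_or_none(annotated))
print(rsid_or_none(bare))
```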

Development

Running Tests

# Basic test run
pytest tests/

# For more detailed output with uv
uv run pytest -vvv

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • BioThings for the original client library
  • Pydantic for the data validation framework
