🧬 Biomedical Knowledge Lookup

A unified Python library for biological concept lookup across 29+ biomedical knowledge sources including BioPortal, OLS, UMLS, ChEMBL, DisGeNET, and more. Built for bioinformatics researchers, knowledge graph developers, and biomedical data scientists.

✨ Features

  • 🔍 29+ Knowledge Sources: Comprehensive coverage of biomedical ontologies and databases
  • ⚡ Unified API: Single interface for all sources with consistent results
  • 🔄 Multi-source Annotation: Cross-reference concepts across multiple databases
  • 📊 RDF Export: Convert results to RDF format for knowledge graphs
  • 💾 Intelligent Caching: Built-in caching system for performance optimization
  • 🔄 Async Support: Asynchronous operations for scalable applications
  • 🧪 Comprehensive Testing: Full test suite with unit and integration tests
  • 📚 Rich Documentation: Extensive examples and API documentation

🚀 Quick Start

Installation

pip install biomedical-knowledge-lookup
# optional UMLS support
pip install "biomedical-knowledge-lookup[umls]"
# or
poetry add biomedical-knowledge-lookup
# optional UMLS support
poetry add biomedical-knowledge-lookup -E umls
# or from source
git clone https://github.com/JonasHeinickeBio/biomedical-knowledge-lookup.git
cd biomedical-knowledge-lookup
poetry install

UMLS support is optional and requires installing the umls extra plus a valid UMLS API key.

Basic Usage

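import asyncio

from knowledge_lookup import CentralKnowledgeLookup, KnowledgeSource


async def main():
    # Initialize the lookup system
    lookup = CentralKnowledgeLookup()

    # Search for concepts across multiple sources
    # (BioPortal and UMLS require API keys; see Configuration)
    results = await lookup.search_concepts(
        "diabetes mellitus",
        sources=[KnowledgeSource.BIOPORTAL, KnowledgeSource.OLS, KnowledgeSource.UMLS],
    )

    # Get detailed information about a specific concept
    concept_details = await lookup.get_concept_details("DOID:9351")

    # Export results to RDF
    rdf_graph = lookup.export_to_rdf(results)
    return results, concept_details, rdf_graph


# The lookup methods are coroutines, so run them inside an event loop
results, concept_details, rdf_graph = asyncio.run(main())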

Advanced Usage with Multi-source Annotation

import asyncio

from knowledge_lookup import MultiSourceAnnotator


async def main():
    # Annotate text with concepts from multiple sources
    annotator = MultiSourceAnnotator()
    annotations = await annotator.annotate_text(
        "Type 2 diabetes is associated with insulin resistance",
        confidence_threshold=0.7,
    )

    # Get consensus annotations across sources
    return annotator.get_consensus_annotations(annotations)


# annotate_text() is a coroutine, so run it inside an event loop
consensus = asyncio.run(main())
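
The rule behind get_consensus_annotations() is not described here. One plausible reading, sketched below purely as an assumption, is a simple vote: a concept is kept for a text span only when at least `min_sources` distinct sources proposed it. The function name `consensus_by_vote`, the tuple layout, and the `EX:` concept IDs are hypothetical placeholders and not part of the library.

```python
from collections import defaultdict


def consensus_by_vote(annotations, min_sources=2):
    """Keep (span, concept_id) pairs proposed by at least `min_sources` distinct sources.

    `annotations` is an iterable of (source, span, concept_id) tuples. This
    layout is an assumption for illustration, not the library's data model.
    """
    voters = defaultdict(set)
    for source, span, concept_id in annotations:
        voters[(span, concept_id)].add(source)
    return sorted(pair for pair, sources in voters.items() if len(sources) >= min_sources)


# Placeholder annotations; the EX: identifiers are made up for the example.
example = [
    ("OLS", "Type 2 diabetes", "EX:0001"),
    ("BioPortal", "Type 2 diabetes", "EX:0001"),
    ("OLS", "insulin resistance", "EX:0002"),
]
agreed = consensus_by_vote(example)
```

In this sketch only the "Type 2 diabetes" annotation survives, because two sources agree on it, while "insulin resistance" has a single supporter. Raising `min_sources` trades recall for precision.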

📋 Supported Knowledge Sources

Source        | Description                          | API Key Required
BioPortal     | NCBO BioPortal ontology repository   | Yes
OLS           | Ontology Lookup Service              | No
UMLS          | Unified Medical Language System      | Yes
ChEMBL        | Chemical database                    | No
DisGeNET      | Disease-gene associations            | No
DrugBank      | Drug information database            | No
Ensembl       | Genome annotation database           | No
Gene Ontology | Molecular function/process/component | No
HPO           | Human Phenotype Ontology             | No
Mondo         | Mondo Disease Ontology               | No
OpenTargets   | Target-disease associations          | No
PubChem       | Chemical information                 | No
Reactome      | Pathway database                     | No
UniProt       | Protein sequence database            | No
WikiData      | Structured knowledge base            | No
ZOOMA         | Ontology mapping service             | No
And 13+ more  | See full list in documentation       | Varies

🏗️ Architecture

knowledge_lookup/
├── adapters/           # Individual source adapters
├── models.py          # Data models and enums
├── central_lookup.py  # Main lookup coordinator
├── multi_source_annotator.py  # Cross-source annotation
├── rdf_converter.py   # RDF export utilities
├── cache.py          # Caching system
└── base.py           # Abstract base classes

📖 Documentation

Additional Resources

Example Notebooks

Explore interactive examples in the examples/ directory:

  • Basic concept lookup
  • Multi-source annotation
  • RDF export and knowledge graph construction
  • Performance benchmarking

🔧 Configuration

API Keys

Some sources require API keys. Set them as environment variables:

export BIOPORTAL_API_KEY="your_key_here"
export UMLS_API_KEY="your_key_here"
# ... etc

Or create a .env file:

BIOPORTAL_API_KEY=your_key_here
UMLS_API_KEY=your_key_here

Advanced Configuration

from knowledge_lookup import CentralKnowledgeLookup, KnowledgeSource, LookupConfig

config = LookupConfig(
    rate_limits={
        KnowledgeSource.BIOPORTAL: 10,  # requests per second
        KnowledgeSource.OLS: 20,
    },
    cache_enabled=True,
    cache_dir="./cache"
)

lookup = CentralKnowledgeLookup(config)

🧪 Testing

# Run all tests
poetry run pytest

# Run specific test categories
poetry run pytest -m "unit"        # Unit tests only
poetry run pytest -m "integration" # Integration tests
poetry run pytest -m "not slow"    # Skip slow tests

# Run with coverage
poetry run pytest --cov=knowledge_lookup

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Adding New Adapters

  1. Extend KnowledgeSourceAdapter in base.py
  2. Implement required methods: search_concepts(), get_concept_details()
  3. Add to adapters/__init__.py
  4. Add tests in tests/unit/test_adapters/
  5. Update documentation
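
Steps 1 and 2 can be sketched as follows. Because the real base class lives in base.py and its exact signatures are not shown here, the example defines a stand-in `KnowledgeSourceAdapter` with assumed signatures. The `InMemoryAdapter` class, the dict-shaped results, and the `EX:0001` identifier are likewise hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional


class KnowledgeSourceAdapter(ABC):
    """Stand-in for the base class in base.py; the real signatures may differ."""

    @abstractmethod
    async def search_concepts(self, query: str) -> list:
        ...

    @abstractmethod
    async def get_concept_details(self, concept_id: str) -> Optional[dict]:
        ...


class InMemoryAdapter(KnowledgeSourceAdapter):
    """Hypothetical adapter backed by a dict instead of a remote API."""

    def __init__(self, concepts: dict):
        self._concepts = concepts  # maps concept ID -> label

    async def search_concepts(self, query: str) -> list:
        # Case-insensitive substring match on the label.
        q = query.lower()
        return [
            {"id": cid, "label": label}
            for cid, label in self._concepts.items()
            if q in label.lower()
        ]

    async def get_concept_details(self, concept_id: str) -> Optional[dict]:
        label = self._concepts.get(concept_id)
        return None if label is None else {"id": concept_id, "label": label}


adapter = InMemoryAdapter({"DOID:9351": "diabetes mellitus", "EX:0001": "asthma"})
hits = asyncio.run(adapter.search_concepts("diabetes"))
```

A real adapter would replace the dict lookups with HTTP calls to the source and return the library's own model types from models.py rather than plain dicts.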

Development Setup

git clone https://github.com/JonasHeinickeBio/biomedical-knowledge-lookup.git
cd biomedical-knowledge-lookup
poetry install
poetry run pre-commit install

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built upon the AID-PAIS Knowledge Graph project
  • Thanks to all contributors and the biomedical research community
  • Special thanks to the maintainers of the various knowledge sources

🔬 Citation

If you use this library in your research, please cite:

@software{heinicke_biomedical_knowledge_lookup_2025,
  author = {Heinicke, Jonas},
  title = {Biomedical Knowledge Lookup: Unified biological concept lookup across 29+ biomedical knowledge sources},
  url = {https://github.com/JonasHeinickeBio/biomedical-knowledge-lookup},
  version = {1.0.0},
  year = {2025}
}

⭐ Star this repository if you find it useful!
