AI-powered Biomedical Discovery Agent System

BioDisco 🧬🤖

BioDisco is a framework for scientific hypothesis generation that mines biomedical literature and knowledge graphs using AI agents. It coordinates multiple agents to automatically discover patterns, generate hypotheses, and gather supporting evidence from biomedical literature and knowledge databases.

🌟 Features

  • Multi-Agent AI System: Coordinated AI agents for different aspects of scientific discovery
  • Hypothesis Generation: Automated generation of novel biomedical hypotheses
  • Literature Mining: Intelligent PubMed search and literature analysis
  • Knowledge Graph Integration: Neo4j-based knowledge graph for storing and querying biomedical entities
  • Evidence Collection: Systematic gathering and linking of supporting evidence
  • Simple Python Interface: Easy-to-use API for scientific discovery

🚀 Quick Start

Installation

pip install biodisco

Basic Usage

BioDisco provides a simple interface for biomedical discovery:

import BioDisco

# Simple disease-based discovery
results = BioDisco.generate("Role of GPR153 in vascular injury and disease")

You need to set your OpenAI API key in the environment variable OPENAI_API_KEY.

In your terminal:

export OPENAI_API_KEY=your_openai_api_key_here

or create a .env file in your project directory (check .env.example):

# OpenAI API Configuration
OPENAI_API_KEY=your_openai_api_key_here
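
Before calling BioDisco.generate, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch: missing_env_vars is a hypothetical helper, not part of BioDisco, and it only checks that the variable is non-empty, not that the key is valid.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or blank in env (defaults to os.environ).

    Illustrative helper only; it is not part of BioDisco.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("OPENAI_API_KEY is set.")
```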

Development Installation

git clone https://github.com/yujingke/BioDisco.git
cd BioDisco
pip install -e .

🔧 PubMed and Knowledge Graph Integration

By default, PubMed and knowledge graph integration are off. Follow the steps below to set them up.

PubMed Setup

Set the environment variable DISABLE_PUBMED=False, either in your .env file or with the export command

or

pass disable_pubmed=False to the generate function

# Turn on PubMed integration
results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False)

Neo4j Setup

1. Install Neo4j server

First, install the Neo4j server. Follow the Neo4j installation instructions for your OS.

2. Add Neo4j login details as environment variables

export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=your_neo4j_password

or set these in your .env file (check .env.example)
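
Once the server is running, the login details can be checked from Python before BioDisco uses them. This is a sketch under stated assumptions: it relies on the official neo4j Python driver being installed, load_neo4j_config and check_neo4j are hypothetical helpers rather than BioDisco functions, and the fallback values simply mirror the export example above.

```python
import os


def load_neo4j_config(env=None):
    """Collect the Neo4j settings named above from the environment."""
    env = os.environ if env is None else env
    return {
        # Fallbacks mirror the example values above (an assumption).
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", ""),
    }


def check_neo4j(config):
    """Raise an exception if the server is unreachable or rejects the login."""
    # Imported lazily so load_neo4j_config works without the driver installed.
    from neo4j import GraphDatabase

    auth = (config["user"], config["password"])
    with GraphDatabase.driver(config["uri"], auth=auth) as driver:
        driver.verify_connectivity()
```

Calling check_neo4j(load_neo4j_config()) returns silently on success and raises a driver exception otherwise, which separates connection problems from problems inside BioDisco itself.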

3. Download and Set Up PrimeKG

  • Download PrimeKG
wget -O kg.csv https://dataverse.harvard.edu/api/access/datafile/6180620
  • Run split_nodes_edges.py (it should be in the same location as kg.csv) to create nodes.csv and edges.csv

  • Run build_kg_index.py (it should be in the same location as nodes.csv)

  • Set the environment variable KG_PATH to the location of these files (check .env.example)

export KG_PATH=/path/to/your/kg_specific_files
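
A quick way to confirm that KG_PATH points at the right directory is to look for the files created in the steps above. The sketch below assumes nodes.csv and edges.csv live directly in that directory; the index files written by build_kg_index.py are not named in this README, so they are not checked. missing_kg_files is a hypothetical helper, not part of BioDisco.

```python
import os
from pathlib import Path


def missing_kg_files(kg_path, expected=("nodes.csv", "edges.csv")):
    """Return the expected file names that are absent from the kg_path directory."""
    directory = Path(kg_path)
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    kg_path = os.environ.get("KG_PATH")
    if not kg_path:
        print("KG_PATH is not set")
    else:
        print("Missing files:", missing_kg_files(kg_path) or "none")
```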

Import PrimeKG to Neo4j

neo4j-admin database import full --nodes nodes.csv --relationships edges.csv --overwrite-destination

Start Neo4j

neo4j start

Turn on PubMed and KG Integration

results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False, disable_kg=False)

or

set the environment variable DISABLE_KG=False (check .env.example)
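
When both flags are set through the environment, it is easy to lose track of which integrations are actually switched on. The following sketch reports that from the DISABLE_PUBMED and DISABLE_KG variables. It assumes that only the value False (in any letter case) enables an integration and that an unset variable leaves it off, as described above; BioDisco's own parsing may differ, and integration_status is a hypothetical helper.

```python
import os


def integration_status(env=None):
    """Report which integrations the DISABLE_* variables would switch on.

    Assumption: only the literal value "False" (any letter case) enables an
    integration; an unset variable or any other value leaves it off.
    """
    env = os.environ if env is None else env

    def enabled(var):
        return env.get(var, "True").strip().lower() == "false"

    return {"pubmed": enabled("DISABLE_PUBMED"), "kg": enabled("DISABLE_KG")}


if __name__ == "__main__":
    print(integration_status())
```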

📖 Detailed Usage

Setting number of iterations and PubMed cutoff

import BioDisco

results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False, disable_kg=False, n_iterations=3, start_year=2020)
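
If the same options are reused across many topics, they can be collected in one place. The sketch below is a hypothetical wrapper, not part of BioDisco: the keyword names are the ones used in the call above, the default values are illustrative rather than BioDisco's own defaults, and the generate function is passed in so the helper can be tried without API keys.

```python
def run_discovery(topic, generate, **overrides):
    """Call generate with both integrations on and shared options filled in.

    generate is expected to be BioDisco.generate. The values below are
    illustrative and are not BioDisco's own defaults.
    """
    options = {
        "disable_pubmed": False,
        "disable_kg": False,
        "n_iterations": 3,
        "start_year": 2020,
    }
    # Per-call keyword arguments take precedence over the shared options.
    options.update(overrides)
    return generate(topic, **options)
```

With BioDisco imported, run_discovery("Role of GPR153 in vascular injury and disease", BioDisco.generate, n_iterations=5) would run the same query as above with five iterations.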

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

biodisco-0.1.0.tar.gz (32.1 kB)

Uploaded Source

Built Distribution

biodisco-0.1.0-py3-none-any.whl (34.2 kB)

Uploaded Python 3

File details

Details for the file biodisco-0.1.0.tar.gz.

File metadata

  • Download URL: biodisco-0.1.0.tar.gz
  • Upload date:
  • Size: 32.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for biodisco-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6d2d81ca6ef493dedc7a4209acb35b6d8b1bee9e775a2172a0bb2bb839fba676
MD5 5a0713ef22ef31fcc737a37e13caa410
BLAKE2b-256 0e2664cb58d6af1dbe3d70d2aebf14986d792e7a5ab113e9e8e8c59b19f7534f

File details

Details for the file biodisco-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: biodisco-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 34.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for biodisco-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3dde5c34a40296b1dd53980ffcb434a1c03ecb6c1da6463ba90fa065a958520d
MD5 6493c8bec88e32b1e78224b2fd682bce
BLAKE2b-256 cb60fbe131f40c6d65d83eda4451d1720cd9ebadb7ee02cf3720ce6a3702f153
