AI-powered Biomedical Discovery Agent System
BioDisco 🧬🤖
BioDisco is a framework for generating scientific hypotheses by mining biomedical literature and knowledge graphs with AI agents. Multiple coordinated agents automatically discover patterns, generate hypotheses, and gather supporting evidence from biomedical literature and knowledge databases.
🌟 Features
- Multi-Agent AI System: Coordinated AI agents for different aspects of scientific discovery
- Hypothesis Generation: Automated generation of novel biomedical hypotheses
- Literature Mining: Intelligent PubMed search and literature analysis
- Knowledge Graph Integration: Neo4j-based knowledge graph for storing and querying biomedical entities
- Evidence Collection: Systematic gathering and linking of supporting evidence
- Simple Python Interface: Easy-to-use API for scientific discovery
🚀 Quick Start
Installation
pip install biodisco
Basic Usage
BioDisco provides a simple interface for biomedical discovery:
import BioDisco
# Simple disease-based discovery
results = BioDisco.generate("Role of GPR153 in vascular injury and disease")
Set your OpenAI API key as the environment variable OPENAI_API_KEY.
In your terminal:
export OPENAI_API_KEY=your_openai_api_key_here
or create a .env file in your project directory (check .env.example)
# OpenAI API Configuration
OPENAI_API_KEY=your_openai_api_key_here
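If you are unsure whether the key is visible to Python, a small pre-flight check can help. The sketch below is an illustration only: the helpers `load_env_file` and `require_openai_key` are hypothetical and not part of BioDisco, and whether BioDisco reads `.env` on its own is not stated here. The parser copies simple `KEY=value` lines into `os.environ` using only the standard library.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy simple KEY=value lines from a .env file into os.environ.

    Hypothetical helper for illustration. It skips blank lines and
    comments, and it does not override variables that are already set.
    Returns a dict of the variables it added.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded


def require_openai_key():
    """Return OPENAI_API_KEY, or raise a clear error if it is missing or empty."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or add it to .env")
    return key
```

Calling `load_env_file()` followed by `require_openai_key()` before `BioDisco.generate(...)` surfaces a missing key early, with a readable error message.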
Development Installation
git clone https://github.com/yujingke/BioDisco.git
cd BioDisco
pip install -e .
🔧 PubMed and Knowledge Graph Integration
PubMed and knowledge graph integration are off by default. Follow the steps below to set them up.
PubMed Setup
Set the environment variable DISABLE_PUBMED=False, either in your .env file or with the export command,
or
pass disable_pubmed=False to the generate function:
## Turn on PubMed Integration
results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False)
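The environment-variable route can also be driven from Python. The sketch below rests on an assumption that is not confirmed here: that BioDisco reads `DISABLE_PUBMED` from the process environment when it is imported or when `generate` is called. The helper `enable_pubmed` is hypothetical and not part of BioDisco.

```python
import os


def enable_pubmed(enabled=True):
    """Set DISABLE_PUBMED in the process environment and return its new value.

    Hypothetical helper: enabling PubMed maps to the string "False",
    because the variable expresses "disable", not "enable".
    """
    os.environ["DISABLE_PUBMED"] = str(not enabled)
    return os.environ["DISABLE_PUBMED"]


if __name__ == "__main__":
    # Assumption: set the flag before importing BioDisco so the package sees it.
    print("DISABLE_PUBMED =", enable_pubmed(True))
```

Passing `disable_pubmed=False` directly to `generate` avoids relying on that assumption.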
Neo4j Setup
1. Install Neo4j server
First, install the Neo4j server. Follow the official Neo4j installation instructions for your OS.
2. Add Neo4j login details as environment variables
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=your_neo4j_password
or set these in your .env file (check .env.example)
3. Download and Setup PrimeKG
- Download PrimeKG
wget -O kg.csv https://dataverse.harvard.edu/api/access/datafile/6180620
- Run split_nodes_edges.py (should be in the same location as kg.csv) to create nodes.csv and edges.csv
- Run build_kg_index.py (should be in the same location as nodes.csv)
- Add the location of the files as the environment variable KG_PATH (check .env.example)
export KG_PATH=/path/to/your/kg_specific_files
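Before importing into Neo4j, it can be worth confirming that `KG_PATH` points at a directory holding the files created above. The following is a minimal sketch: `missing_kg_files` is a hypothetical helper, not part of BioDisco, and it only checks for `nodes.csv` and `edges.csv` because the names of the files written by `build_kg_index.py` are not given here.

```python
import os
from pathlib import Path

# Files that split_nodes_edges.py is described as creating.
EXPECTED_KG_FILES = ("nodes.csv", "edges.csv")


def missing_kg_files(kg_path=None):
    """Return the expected KG files that are absent from the KG directory.

    Uses the KG_PATH environment variable when no path is passed in.
    """
    location = kg_path if kg_path is not None else os.environ.get("KG_PATH")
    if not location:
        raise ValueError("KG_PATH is not set")
    kg_dir = Path(location)
    return [name for name in EXPECTED_KG_FILES if not (kg_dir / name).is_file()]
```

An empty list means both files were found; any names returned still need to be generated or moved into place.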
Import PrimeKG to Neo4j
neo4j-admin database import full --nodes nodes.csv --relationships edges.csv --overwrite-destination
Start Neo4j
neo4j start
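Once the server is running, a quick check from Python can confirm that the credentials work. This is a sketch under stated assumptions: the helper names are hypothetical and not part of BioDisco, and `check_neo4j_connection` assumes the official `neo4j` Python driver is installed. It uses that driver's `GraphDatabase.driver` and `verify_connectivity` calls.

```python
import os

REQUIRED_NEO4J_VARS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def missing_neo4j_settings(env=None):
    """Return the names of Neo4j variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_NEO4J_VARS if not str(env.get(name, "")).strip()]


def check_neo4j_connection():
    """Verify that the server accepts the configured URI and credentials.

    Raises RuntimeError if settings are missing; driver errors propagate
    if the server is unreachable or rejects the login.
    """
    missing = missing_neo4j_settings()
    if missing:
        raise RuntimeError("Missing Neo4j settings: " + ", ".join(missing))

    # Imported lazily so the settings check above works without the driver.
    from neo4j import GraphDatabase

    auth = (os.environ["NEO4J_USER"], os.environ["NEO4J_PASSWORD"])
    with GraphDatabase.driver(os.environ["NEO4J_URI"], auth=auth) as driver:
        driver.verify_connectivity()
```

Running `check_neo4j_connection()` before enabling knowledge graph integration separates configuration problems from problems inside a discovery run.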
Turn on PubMed and KG Integration
results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False, disable_kg=False)
or
set the environment variable DISABLE_KG=False (check .env.example)
📖 Detailed Usage
Setting number of iterations and PubMed cutoff
import BioDisco
results = BioDisco.generate("Role of GPR153 in vascular injury and disease", disable_pubmed=False, disable_kg=False, n_iterations=3, start_year=2020)
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.