MedDiscover is an AI-powered tool that assists biomedical researchers by combining Retrieval-Augmented Generation (RAG) with LLMs fine-tuned on PubMed literature. It supports document retrieval, knowledge extraction, and interactive querying over biomedical research papers, helping researchers find relevant insights quickly. The package supports both GPU-based embeddings (MedCPT) and CPU-friendly alternatives (OpenAI Ada-002 embeddings), so it can also be used on machines without a GPU.

MedDiscover

MedDiscover is an AI-powered tool designed to assist biomedical researchers using RAG-LLM models fine-tuned on PubMed literature.

CLI evaluation (headless)

Install the package (or use it in editable mode), set your OPENAI_API_KEY, and run the built-in evaluator:

pip install .
export OPENAI_API_KEY=...
# optional: ALLOW_MEDCPT_CPU=1 to force MedCPT on CPU

meddiscover-eval \
  --pdfs med_discover_ai/eval_samples/sample_pdfs/fmed-11-1345659.pdf med_discover_ai/eval_samples/sample_pdfs/s10549-023-07033-8.pdf \
  --qa_csv med_discover_ai/eval_samples/sample_qa.csv \
  --embedding_model "MedCPT (GPU Recommended)" \
  --llm_models gpt-4.1-mini \
  --k 3 \
  --max_tokens 64 \
  --out_dir ./eval_outputs_demo

# Evaluate two LLMs in one run by passing a comma-separated list to --llm_models (example)
meddiscover-eval \
  --pdfs med_discover_ai/eval_samples/sample_pdfs/fmed-11-1345659.pdf med_discover_ai/eval_samples/sample_pdfs/s10549-023-07033-8.pdf \
  --qa_csv med_discover_ai/eval_samples/sample_qa.csv \
  --embedding_model "MedCPT (GPU Recommended)" \
  --llm_models gpt-4.1-mini,gpt-4.1-nano \
  --k 3 --max_tokens 64 --out_dir ./eval_outputs_all

  • For Ada-based retrieval, set --embedding_model to "OpenAI Ada-002 (CPU/Cloud)".
  • RAGAS metrics are optional; if their dependencies are missing or the QA CSV lacks a reference column, they fall back to None.
  • Re-ranking stays disabled on CPU; pass --rerank only when a GPU and a cross-encoder are available.
  • Ollama models are available in both the UI and the CLI; prefix the model name with ollama:, e.g. --llm_models ollama:gemma3:4b.
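
When the evaluator is driven from a script, it can help to assemble the command line in one place. The following is a minimal sketch, not part of the package: `build_eval_command` is a hypothetical helper, its defaults simply mirror the example values above, and it assumes that `--rerank` is a bare switch that takes no value. It only builds and prints the command; it does not run `meddiscover-eval`.

```python
import shlex
from typing import List, Sequence


def build_eval_command(
    pdfs: Sequence[str],
    qa_csv: str,
    llm_models: Sequence[str],
    embedding_model: str = "OpenAI Ada-002 (CPU/Cloud)",
    k: int = 3,
    max_tokens: int = 64,
    out_dir: str = "./eval_outputs_demo",
    rerank: bool = False,
) -> List[str]:
    """Assemble an argv list for meddiscover-eval (hypothetical helper)."""
    if not pdfs:
        raise ValueError("at least one PDF path is required")
    if not llm_models:
        raise ValueError("at least one LLM model is required")

    cmd = [
        "meddiscover-eval",
        "--pdfs", *pdfs,
        "--qa_csv", qa_csv,
        "--embedding_model", embedding_model,
        # Several models are passed as one comma-separated value.
        "--llm_models", ",".join(llm_models),
        "--k", str(k),
        "--max_tokens", str(max_tokens),
        "--out_dir", out_dir,
    ]
    if rerank:
        # Assumption: --rerank is a flag without an argument.
        # Only enable it when a GPU and a cross-encoder are available.
        cmd.append("--rerank")
    return cmd


# CPU-friendly run: Ada embeddings, one OpenAI model and one Ollama model.
cmd = build_eval_command(
    pdfs=["med_discover_ai/eval_samples/sample_pdfs/fmed-11-1345659.pdf"],
    qa_csv="med_discover_ai/eval_samples/sample_qa.csv",
    llm_models=["gpt-4.1-mini", "ollama:gemma3:4b"],
)

# shlex.join quotes arguments containing spaces, so the result can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once the package is installed and OPENAI_API_KEY is set; keeping the arguments as a list avoids quoting problems with the embedding model name, which contains spaces and parentheses.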

Source Distribution

med_discover_ai-1.0.11.tar.gz (549.4 kB)

Built Distribution

med_discover_ai-1.0.11-py3-none-any.whl (555.5 kB, Python 3)
