A comprehensive toolkit for fine-tuning medical large language models with RAG capabilities

Medical LLM Fine-tuning with RAG System

A comprehensive PyPI package for fine-tuning large language models on medical literature with entity relationship extraction capabilities, featuring Qwen3-4B-Thinking model integration and advanced RAG-Anything multimodal document processing.

📦 Quick Start

# Install the package
pip install "medllm-finetune-rag[all]"   # quoted so zsh (the macOS default shell) does not expand the brackets

# Manual Unsloth installation (for 2x faster training)
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

# Quick training
medllm train medical_data.json

# Or use Python API
python -c "from medllm import quick_train; quick_train('medical_data.json')"

🚀 Features

🤖 Advanced Model Support

Qwen3-4B-Thinking Integration: State-of-the-art reasoning model with thinking capabilities
Unsloth Acceleration: 2x faster training with reduced VRAM usage
Thinking Mode: Support for reasoning-based inference with <think></think> tags
Multiple Training Formats: Instruction fine-tuning and NER training formats

📄 RAG-Anything Integration

Multimodal Document Processing: PDF, DOCX, images, tables, equations
Advanced Parsers: MinerU and Docling for robust document extraction
LightRAG Knowledge Graph: Graph-based retrieval for enhanced context
Multiple Search Modes: Naive, local, global, and hybrid retrieval strategies

🏥 Medical Specialization

Medical Entity Extraction: Specialized for Bacteria, Disease, Evidence entities
Relationship Extraction: Complex medical relationship understanding
Domain-Specific Processing: Optimized for medical literature analysis
Evaluation Metrics: Built-in medical NER evaluation tools

⚡ Platform Optimization

Apple Silicon Support: Optimized for Mac M2/M3 with MPS backend
CUDA Acceleration: Full GPU acceleration for NVIDIA cards
CPU Fallback: Reliable CPU-only training option
Universal Configuration: Single config system for all platforms
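
The backend preference described above (CUDA, then MPS, then CPU) can be sketched in a few lines. This is a minimal illustration rather than the package's actual detection code: `pick_backend`, `detect_backend`, and the `CONFIG_BY_BACKEND` mapping are hypothetical names, and the pairing of backends with the config files listed in this README is an assumption.

```python
# Hypothetical mapping from backend name to the config files named in this README.
CONFIG_BY_BACKEND = {
    "cuda": "config/config_cuda.yaml",
    "mps": "config/config_mac_m2.yaml",
    "cpu": "config/config_cpu.yaml",
}


def pick_backend(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS, and fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_backend() -> str:
    """Ask PyTorch which backends exist; assume CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps attribute at all.
    mps = getattr(torch.backends, "mps", None)
    return pick_backend(
        torch.cuda.is_available(),
        bool(mps is not None and mps.is_available()),
    )
```

Calling `CONFIG_BY_BACKEND[detect_backend()]` would then give the config file to pass to `--config`. Keeping the decision in a pure function makes the fallback order easy to check without any GPU present.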

🏗️ Project Structure

medllm-finetune-rag/
├── core/                           # Core training modules
│   ├── medical_llm_trainer.py     # Main training class with Unsloth integration
│   ├── data_processing.py         # Data preprocessing and format conversion
│   ├── medical_rag_system.py      # RAG-Anything system implementation
│   ├── evaluation_metrics.py      # Model evaluation tools
│   └── huggingface_uploader.py    # HuggingFace model upload utility
├── config/                        # Configuration files
│   ├── config_mac_m2.yaml        # Mac M2/M3 optimized configuration
│   ├── config_cuda.yaml          # CUDA GPU configuration
│   └── config_cpu.yaml           # CPU-only configuration
├── scripts/                       # Utility scripts
│   ├── setup_environment.py      # Environment setup and verification
│   ├── run_training_pipeline.py  # Legacy training pipeline runner
│   ├── quick_start.sh            # Quick setup script
│   └── setup_qwen3.sh           # Qwen3 specific environment setup
├── docs/                          # Documentation
│   ├── UNIVERSAL_TRAINER_GUIDE.md # Universal trainer documentation
│   ├── README_QWEN3.md          # Qwen3 integration guide
│   └── README_MAC_M2.md         # Mac M2 setup guide
├── examples/                      # Example scripts and demos
│   ├── medical_rag_demo.py       # Comprehensive RAG-Anything demo
│   ├── quick_rag_test.py         # Quick RAG functionality test
│   ├── qwen3_example.py          # Qwen3 model example
│   └── run_mac_m2.py            # Mac M2 optimized runner
├── universal_trainer.py          # Universal training script
├── run.py                        # Simple command wrapper
├── requirements.txt              # Python dependencies
├── .env                          # Environment variables (API keys)
└── README.md                     # This file

🔧 Installation

Prerequisites

Python 3.9-3.13
PyTorch with appropriate backend (CUDA/MPS/CPU)
Git and GitHub CLI (optional, for repository management)
OpenAI API key (optional, for RAG-Anything functionality)

Quick Setup

Clone the repository:
```bash
git clone <repository-url>
cd medllm-finetune-rag
```

Automated setup (recommended):
```bash
chmod +x scripts/setup_qwen3.sh
./scripts/setup_qwen3.sh
```

This will automatically:

Create and activate a virtual environment
Install all required dependencies including RAG-Anything
Set up platform-specific optimizations (Mac M2/CUDA/CPU)
Verify the installation

Configure API keys (optional, for RAG functionality):

# Create .env file with your API keys
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
echo "HF_TOKEN=your_huggingface_token_here" >> .env

Quick test:

# Test basic functionality
python examples/quick_rag_test.py

# Test model inference
python run.py test
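
For scripts of your own that need the keys from the `.env` file created during setup, a dependency-free loader might look like the sketch below. How the package itself reads `.env` is not documented here, so `parse_env_text` and `load_env_file` are hypothetical helpers; a library such as python-dotenv is a common alternative.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Load a .env file into os.environ without overriding existing variables."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

Using `setdefault` means a key already exported in the shell takes precedence over the file, which is usually the less surprising behaviour.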

Manual Installation

# Core dependencies
pip install torch torchvision torchaudio
# Version specifiers are quoted so the shell does not treat ">" as a redirect
pip install "transformers>=4.36.0"
pip install "datasets>=2.14.0"
pip install "peft>=0.6.0"
pip install "bitsandbytes>=0.41.0"

# Unsloth for efficient training (if compatible)
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install unsloth_zoo
pip install "trl>=0.7.0"

# Additional utilities
pip install wandb
pip install scikit-learn
pip install numpy pandas
pip install tqdm
pip install PyYAML

🚀 Quick Start

1. Universal Trainer (Recommended)

The universal trainer provides a unified interface for all training operations:

# Quick inference test
python run.py test

# Training with different configurations
python run.py train          # Mac M2 optimized
python run.py cuda          # CUDA GPU training
python run.py cpu           # CPU-only training

# Training with HuggingFace upload
python run.py train-upload

# Interactive mode
python run.py interactive

# Full pipeline with upload
python run.py full-upload

2. RAG-Anything Demo

# Test RAG functionality
python examples/quick_rag_test.py

# Run comprehensive RAG demo
python examples/medical_rag_demo.py

3. Configuration-Based Training

# Use specific configuration files
python universal_trainer.py --config config/config_mac_m2.yaml --mode train
python universal_trainer.py --config config/config_cuda.yaml --mode inference
python universal_trainer.py --config config/config_cpu.yaml --mode eval

4. Medical Training Pipeline

For research and development with full control over each stage:

# Complete medical pipeline (all stages)
python scripts/run_training_pipeline.py --stage all

# Individual stages
python scripts/run_training_pipeline.py --stage 1    # Data preprocessing
python scripts/run_training_pipeline.py --stage 2    # Model training
python scripts/run_training_pipeline.py --stage 3    # RAG system building
python scripts/run_training_pipeline.py --stage 4    # Evaluation
python scripts/run_training_pipeline.py --stage 5    # Demo inference

# Custom parameters
python scripts/run_training_pipeline.py --stage 2 --model Qwen/Qwen3-4B-Thinking-2507 --epochs 5

5. Direct Training API

from core.medical_llm_trainer import MedicalLLMTrainer

# Initialize trainer with Qwen3-4B-Thinking
trainer = MedicalLLMTrainer(
    model_name="Qwen/Qwen3-4B-Thinking-2507",
    use_unsloth=True,
    use_qlora=True,
    max_seq_length=2048
)

# Train on your data
trainer.train("path/to/your/training_data.json")

# Inference with thinking mode
result = trainer.inference(
    "Hepatitis C virus causes chronic liver infection.",
    enable_thinking=True
)
print(result)

🛠️ Training Tools Comparison

This project provides multiple ways to train and use medical LLMs, each optimized for different use cases:

📋 Tools Overview

| Tool | Purpose | Best For | Complexity |
|------|---------|----------|------------|
| `run.py` | Simple command wrapper | Quick tasks, daily use | ⭐ Low |
| `scripts/run_training_pipeline.py` | Complete medical pipeline | Research, development | ⭐⭐⭐ High |
| `universal_trainer.py` | Configuration-driven trainer | Production, customization | ⭐⭐ Medium |
| Direct API | Python integration | Custom applications | ⭐⭐ Medium |

🚀 run.py - Quick & Easy Commands

Perfect for: Daily use, quick testing, simple training

# Quick commands - just works!
python run.py test              # Quick inference test
python run.py train             # Full training (Mac M2 optimized)
python run.py train-upload      # Train + upload to HuggingFace
python run.py cuda              # CUDA GPU training
python run.py interactive       # Interactive chat mode
python run.py full-upload       # Complete pipeline + upload

How it works: Simple wrapper that translates commands to universal_trainer.py calls

113 lines of code
No dependencies beyond Python's standard-library subprocess module
Instant gratification
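
The wrapper idea described above can be sketched as a lookup table from short commands to `universal_trainer.py` arguments. This is not the real `run.py`: the `COMMANDS` table and `build_argv` are hypothetical, and the config/mode pairs are guesses based on the commands and config files named in this README.

```python
import sys
from typing import Dict, List

# Hypothetical command table; the real run.py may map commands differently.
COMMANDS: Dict[str, List[str]] = {
    "test": ["--config", "config/config_mac_m2.yaml", "--mode", "inference"],
    "train": ["--config", "config/config_mac_m2.yaml", "--mode", "train"],
    "cuda": ["--config", "config/config_cuda.yaml", "--mode", "train"],
    "cpu": ["--config", "config/config_cpu.yaml", "--mode", "train"],
}


def build_argv(command: str) -> List[str]:
    """Translate a short command into a universal_trainer.py invocation."""
    if command not in COMMANDS:
        known = ", ".join(sorted(COMMANDS))
        raise ValueError(f"unknown command {command!r}; expected one of: {known}")
    # sys.executable keeps the call inside the active virtual environment.
    return [sys.executable, "universal_trainer.py", *COMMANDS[command]]
```

A wrapper built this way would hand the result to `subprocess.run(argv, check=True)`, which is consistent with the note that `subprocess` is the only dependency.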

🏥 run_training_pipeline.py - Medical Research Pipeline

Perfect for: Medical NLP research, stage-by-stage development, detailed analysis

# Stage-based execution with full control
python scripts/run_training_pipeline.py --stage all                    # Complete pipeline
python scripts/run_training_pipeline.py --stage 1                     # Data preprocessing only
python scripts/run_training_pipeline.py --stage 2 --epochs 10         # Custom training
python scripts/run_training_pipeline.py --stage 3                     # RAG system building
python scripts/run_training_pipeline.py --stage 4                     # Evaluation with metrics
python scripts/run_training_pipeline.py --stage 5                     # Demo inference

# Advanced options
python scripts/run_training_pipeline.py --stage 2 \
    --model Qwen/Qwen3-4B-Thinking-2507 \
    --device cuda \
    --batch_size 4 \
    --lr 3e-4

Pipeline Stages:

Data Preprocessing: Load, clean, augment medical data
Model Training: Fine-tune Qwen3-4B-Thinking for medical tasks
RAG System Building: Create RAG-Anything enhanced retrieval system
Evaluation: Comprehensive metrics and performance analysis
Demo Inference: Multi-mode RAG inference demonstration
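
To make the `--stage` semantics concrete, here is a small sketch of how `all` versus a single stage number could be resolved. It only mirrors the stage list above; `resolve_stages` and `parse_args` are hypothetical, and the default of 3 epochs is borrowed from the sample training configuration rather than from the script itself.

```python
import argparse
from typing import List, Optional

# Stage numbers and names as listed in this README.
STAGES = {
    1: "Data Preprocessing",
    2: "Model Training",
    3: "RAG System Building",
    4: "Evaluation",
    5: "Demo Inference",
}


def resolve_stages(stage: str) -> List[int]:
    """Expand 'all' to every stage; otherwise validate a single stage number."""
    if stage == "all":
        return sorted(STAGES)
    if stage.isdigit() and int(stage) in STAGES:
        return [int(stage)]
    raise ValueError(f"--stage must be 'all' or one of {sorted(STAGES)}, got {stage!r}")


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two flags used in this sketch; argv=None reads sys.argv."""
    parser = argparse.ArgumentParser(description="Stage selector sketch")
    parser.add_argument("--stage", default="all")
    parser.add_argument("--epochs", type=int, default=3)
    return parser.parse_args(argv)
```

For example, `resolve_stages(parse_args(["--stage", "4"]).stage)` yields `[4]`, so only the evaluation stage would run.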

Features:

673 lines of comprehensive functionality
Async RAG-Anything integration
Detailed error handling and recovery
Medical-specific evaluation metrics
Stage-by-stage execution control

⚙️ universal_trainer.py - Configuration-Driven

Perfect for: Production deployments, custom configurations, reproducible experiments

# Configuration-based training
python universal_trainer.py --config config/config_mac_m2.yaml --mode train
python universal_trainer.py --config config/config_cuda.yaml --mode inference
python universal_trainer.py --config config/config_cpu.yaml --mode eval

# Custom configurations
python universal_trainer.py --config my_custom_config.yaml --mode full --upload-to-hf

Features:

YAML-based configuration
Platform optimization (Mac M2/CUDA/CPU)
HuggingFace integration
RAG-Anything support

🎯 When to Use Which Tool?

| Scenario | Recommended Tool | Command Example |
|----------|------------------|-----------------|
| Quick test | `run.py` | `python run.py test` |
| Daily training | `run.py` | `python run.py train` |
| Research experiment | `run_training_pipeline.py` | `python scripts/run_training_pipeline.py --stage all` |
| Custom evaluation | `run_training_pipeline.py` | `python scripts/run_training_pipeline.py --stage 4` |
| RAG development | `run_training_pipeline.py` | `python scripts/run_training_pipeline.py --stage 3` |
| Production deployment | `universal_trainer.py` | `python universal_trainer.py --config prod_config.yaml` |
| Custom integration | Direct API | See Python examples below |

Data Processing

from core.data_processing import MedicalDataProcessor

# Process raw medical data
processor = MedicalDataProcessor("raw_data.json")
processor.load_data()
processor.save_processed_data("processed_data/")

Using the Mock Trainer (for development)

from examples.english_stable_solution import MockMedicalTrainer

# Use when network issues prevent model download
trainer = MockMedicalTrainer()
result = trainer.inference(
    "Streptococcus pneumoniae causes pneumonia.",
    enable_thinking=True
)

📊 Model Configuration

Supported Models

Qwen3-4B-Thinking-2507 (Recommended): Advanced reasoning capabilities
Qwen2.5-7B-Instruct: General instruction following
Custom models: Compatible with HuggingFace transformers

Training Parameters

model:
  name: "Qwen/Qwen3-4B-Thinking-2507"
  use_unsloth: true
  use_qlora: true
  torch_dtype: "bfloat16"
  max_seq_length: 2048

training:
  batch_size: 4
  gradient_accumulation_steps: 4
  learning_rate: 2.0e-4
  num_train_epochs: 3
  warmup_steps: 10
  save_steps: 100

🧠 Entity Types and Relations

Entity Types

Bacteria: Bacteria, viruses, and other pathogens
Disease: Diseases, symptoms, and pathological conditions
Evidence: Research evidence, conclusions, and findings

Relationship Types

is_a: Hierarchical relationship
biomarker_for: Biomarker relationship
correlated_with: Correlation relationship
has_relationship: General relationship
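
One way to keep annotations consistent with these lists is to validate them when they are constructed. The sketch below is an assumption about how such a schema could be modelled; the `Entity` and `Relation` classes and their field names are hypothetical and do not describe the package's actual data format.

```python
from dataclasses import dataclass

# Allowed labels, copied from the lists in this README.
ENTITY_TYPES = {"Bacteria", "Disease", "Evidence"}
RELATION_TYPES = {"is_a", "biomarker_for", "correlated_with", "has_relationship"}


@dataclass(frozen=True)
class Entity:
    text: str
    type: str

    def __post_init__(self) -> None:
        if self.type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.type!r}")


@dataclass(frozen=True)
class Relation:
    head: Entity
    relation: str
    tail: Entity

    def __post_init__(self) -> None:
        if self.relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation!r}")


h_pylori = Entity("Helicobacter pylori", "Bacteria")
gastric_cancer = Entity("gastric cancer", "Disease")
link = Relation(h_pylori, "correlated_with", gastric_cancer)
```

Rejecting unknown labels early tends to surface typos such as `correlates_with` before they reach training data.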

🍎 Mac M2/M3 Support

Special optimizations for Apple Silicon:

MPS Backend: Automatic detection and usage of Metal Performance Shaders
Virtual Environment: Automatic setup to avoid system conflicts
Fallback Mechanisms: Graceful degradation when Unsloth is incompatible
Memory Optimization: Efficient memory usage for limited RAM

See docs/README_MAC_M2.md for detailed setup instructions.

🔄 Thinking Mode

The Qwen3-4B-Thinking model supports reasoning mode with <think></think> tags:

# Enable thinking mode for complex reasoning
result = trainer.inference(
    "Analyze the relationship between H. pylori and gastric cancer.",
    enable_thinking=True
)

# Output includes reasoning process:
# <think>
# I need to analyze the relationship between H. pylori and gastric cancer...
# </think>
# 
# Final analysis in JSON format...
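
Downstream code usually wants the final answer without the reasoning trace. A minimal sketch for separating the two follows; `split_thinking` is a hypothetical helper, and it assumes the output is a plain string shaped like the example above. It also tolerates output where only the closing tag appears in the generated text.

```python
from typing import Tuple


def split_thinking(output: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from text that may contain a </think> tag."""
    before, closing, after = output.partition("</think>")
    if not closing:
        # No reasoning block at all: everything is the answer.
        return "", output.strip()
    # Remove the opening tag if present; it may be missing from the output.
    reasoning = before.replace("<think>", "", 1).strip()
    return reasoning, after.strip()


sample = '<think>\nCheck the entities first.\n</think>\n\n{"entities": []}'
reasoning, answer = split_thinking(sample)
```

Here `answer` holds only the JSON text, which can then be passed to `json.loads` if the model followed the expected output format.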

📈 Evaluation

Built-in evaluation metrics:

Entity Recognition: Precision, Recall, F1-score
Relation Extraction: Accuracy and relationship-specific metrics
Medical Accuracy: Domain-specific evaluation criteria

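from core.evaluation_metrics import MedicalEvaluator

# `trainer` is the MedicalLLMTrainer from the Direct Training API example;
# `test_data` is your own held-out evaluation set.
evaluator = MedicalEvaluator()
results = evaluator.evaluate_model(trainer, test_data)
print(f"F1 Score: {results['f1_score']:.4f}")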
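
For reference, entity-level precision, recall, and F1 can be computed from sets of predicted and gold entities as sketched below. This illustrates the standard exact-match definitions only; it is not the implementation inside `MedicalEvaluator`, and the `prf1` helper and the example entities are made up.

```python
from typing import Dict, Set, Tuple

Entity = Tuple[str, str]  # (surface text, entity type)


def prf1(predicted: Set[Entity], gold: Set[Entity]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over sets of entities."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1_score": f1}


gold = {("Helicobacter pylori", "Bacteria"), ("gastric cancer", "Disease")}
predicted = {("Helicobacter pylori", "Bacteria"), ("gastritis", "Disease")}
scores = prf1(predicted, gold)
print(f"F1 Score: {scores['f1_score']:.4f}")  # F1 Score: 0.5000
```

One of two predictions is correct and one of two gold entities is found, so precision and recall are both 0.5 and so is their harmonic mean.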

🛠️ Troubleshooting

Common Issues

Network Download Errors:
- Use the mock trainer for development: examples/english_stable_solution.py
- Try different networks or download during off-peak hours
- Use examples/download_step_by_step.py for manual downloads

Mac M2 Compatibility:
- Use virtual environment to avoid system conflicts
- Fallback to standard transformers if Unsloth fails
- Check docs/README_MAC_M2.md for specific solutions

Memory Issues:
- Reduce batch size and max sequence length
- Enable gradient checkpointing
- Use QLoRA for memory-efficient fine-tuning
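
To see why QLoRA helps with memory, a rough estimate of weight storage at different precisions is useful. The figures below cover model weights only (no activations, optimizer state, or LoRA adapters) and assume roughly 4 billion parameters, as the "4B" in the model name suggests; `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


# Assumption: about 4 billion parameters; the exact count is not stated here.
PARAMS = 4e9
print(f"bf16 weights : {weight_memory_gb(PARAMS, 16):.1f} GB")  # 8.0 GB
print(f"4-bit weights: {weight_memory_gb(PARAMS, 4):.1f} GB")   # 2.0 GB
```

Actual usage will be higher once activations and adapter gradients are included, which is where a smaller batch size, a shorter sequence length, and gradient checkpointing come in.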

Environment Issues

# Reset virtual environment
rm -rf venv
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

📚 Quick Reference

🚀 Most Common Commands

# For everyday use (recommended)
python run.py test                    # Quick test
python run.py train                   # Train model
python run.py interactive            # Chat mode

# For research and development
python scripts/run_training_pipeline.py --stage all    # Full pipeline
python scripts/run_training_pipeline.py --stage 4      # Evaluation only

# For RAG functionality
python examples/quick_rag_test.py                      # Test RAG
python examples/medical_rag_demo.py                    # Full RAG demo

# For setup and configuration
python scripts/setup_environment.py                    # Auto setup
python scripts/setup_environment.py --mode install     # Install deps only

🔧 File Structure Quick Guide

medllm-finetune-rag/
├── run.py                            # 👈 Start here - simple commands
├── scripts/run_training_pipeline.py # 👈 Research pipeline
├── universal_trainer.py             # Configuration-driven trainer
├── examples/                        # Demo scripts and examples
├── config/                          # Platform-specific configs
└── core/                            # Core modules (advanced use)

💡 Troubleshooting Quick Fixes

# Dependencies missing?
python scripts/setup_environment.py --mode install

# Import errors?
pip install -r requirements.txt

# Mac M2 issues?
bash scripts/setup_qwen3.sh

# Want to start fresh?
rm -rf venv && python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

🤝 Contributing

Fork the repository
Create a feature branch: git checkout -b feature-name
Commit changes: git commit -am 'Add feature'
Push to branch: git push origin feature-name
Submit a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Unsloth for efficient LLM training
Qwen Team for the Qwen3-4B-Thinking model
HuggingFace for the transformers ecosystem
Medical research community for domain expertise

📞 Support

📧 Issues: Use GitHub Issues for bug reports and feature requests
📚 Documentation: Check the docs/ directory for detailed guides
💬 Discussions: Use GitHub Discussions for questions and community support

Note: This project is designed for research and educational purposes. Ensure compliance with relevant medical data regulations and ethical guidelines when working with medical literature and patient data.


This version: 0.1.2 (yanked), Aug 19, 2025

