EndoReg Db Django App
EndoregDB - Professional Data Infrastructure for Clinical Research
EndoregDB is a comprehensive database framework designed to manage medical and research-related data for clinical trials. This repository focuses on efficient data processing, automated deployment, security, and reproducibility, offering a flexible setup for local development environments as well as distributed systems. It supports the integration of AI/ML tools and advanced image and report processing.
This infrastructure was originally designed for clinical research studies and is optimized for handling large data volumes, including:
- Medical reports,
- Patient imaging and video data,
- Clinical product and treatment data, and more.
Ingress contract
The package supports two first-class ingest boundaries:
- watcher: trusted local filesystem intake
- api: authenticated remote upload intake
Both boundaries create UploadJob records and converge on the same shared ingest services. The downstream processing model is shared; only the trust boundary differs.
For shared multi-center deployments, set ENDOREG_DEPLOYMENT_ROLE=central_hub. In that role the package requires authenticated API uploads with a declared center_key and refuses the default-center fallback on the API path.
AI and automation consumers should use the API read surfaces for reports, videos, frames, and patient timelines rather than reading STORAGE_DIR directly. Those media endpoints are the package-level contract for center-scoped access.
The node-to-node transfer API under /api/media/hub/transfers/ is supported
for central_hub deployments. In standalone and site_node deployments
those endpoints return 404. /api/upload/ remains the primary hub boundary.
For the current transport-security phase, transfer deployments must:
- use HTTPS or equivalent secure transport
- require proxy-verified mTLS for node-authenticated transfer requests
- keep NetworkNode.shared_secret limited to request authentication rather than payload encryption
For downstream upgrade and deployment impact, see
docs/deployment_note_hub_contract.md.
For the full current-state hub behavior, see
docs/wiki/hub_ingest_current_state.md.
Ingest workflow
The package is designed around one shared ingest core with multiple boundary adapters:
- A watcher, api, or optional transfer ingress accepts a file or transfer payload.
- The boundary resolves the center_key scope and creates an UploadJob or TransferJob.
- Provenance is normalized at creation time so audit and cleanup logic do not depend on caller-specific payload shapes.
- Shared processing services import, anonymize, and link the resulting media objects.
- Retention policy decides cleanup eligibility.
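As a rough illustration of the provenance-normalization step above, the boundary-specific payload shapes might collapse into one record like this. All field and function names here are hypothetical:

```python
def normalize_provenance(boundary: str, payload: dict) -> dict:
    """Collapse boundary-specific payload shapes into one provenance record.

    Hypothetical sketch: the point is that downstream audit and cleanup
    code sees a single shape regardless of which boundary accepted the file.
    """
    if boundary == "watcher":
        return {
            "boundary": "watcher",
            "center_key": payload.get("center_key", "default"),
            "source_ref": payload["path"],      # local filesystem path
            "authenticated": False,             # trusted local intake
        }
    if boundary == "api":
        return {
            "boundary": "api",
            "center_key": payload["center_key"],  # declared, required
            "source_ref": payload["upload_id"],
            "authenticated": True,
        }
    raise ValueError(f"unknown ingest boundary: {boundary!r}")
```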
The cleanup contract is strict:
- UploadJob.retention_policy=preserve_source: successful completion keeps the source artifact and marks cleanup as skipped
- UploadJob.retention_policy=delete_after_success: successful completion marks the source artifact as cleanup-eligible
- TransferJob.cleanup_policy=retain_all: no cleanup is requested
- Transfer cleanup policies other than retain_all are recorded as deferred operator intent
This keeps ingest behavior idempotent, auditable, and safe for production cleanup automation.
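The cleanup contract above condenses into a small decision table. The sketch below uses hypothetical outcome strings and a hypothetical function name; the real contract lives on the UploadJob and TransferJob models:

```python
def cleanup_outcome(job_kind: str, policy: str, succeeded: bool) -> str:
    """Map a finished job's retention/cleanup policy to a cleanup outcome.

    Hypothetical sketch of the contract described in the README.
    """
    if not succeeded:
        return "not_eligible"  # cleanup only follows successful completion
    if job_kind == "upload":
        if policy == "preserve_source":
            return "skipped"            # source artifact is kept
        if policy == "delete_after_success":
            return "cleanup_eligible"   # source artifact may be removed
    elif job_kind == "transfer":
        if policy == "retain_all":
            return "no_cleanup"
        return "deferred_operator_intent"  # recorded, not executed
    raise ValueError(f"unhandled policy {policy!r} for job kind {job_kind!r}")
```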
🚀 Key Features
System Architecture
- Modular Design: Built on scalable and reusable components to simplify integration into various environments.
- Multi-System Support: Manages configurations for local workstations and production servers.
- Role-Specific Configuration: Predefined roles for common use cases:
- Medical data processing systems
- AI/ML model deployment
- Research workstation configuration
Security & Data Management
- Data Encryption: All sensitive data is encrypted, and privacy policies are enforced.
- Impermanence: Stateless system configuration with persistence for critical data.
- Access Control: Role-based access and identity management integration.
Data and Processing Environment
- Data Processing: Optimized for processing medical datasets with preprocessing tools.
- AI/ML Support:
- Integration of machine learning tools for predictive analysis.
- TensorFlow, PyTorch, and other frameworks supported for model training.
- Image/Video Processing: Support for analyzing patient images and clinical videos.
Development Tools & Infrastructure
- Data Science Toolchains: Pre-configured environments for data processing, analysis, and visualization.
- Monitoring & Logging: Setup for continuous monitoring and logging to ensure system stability and performance.
🛠 Getting Started
Prerequisites
- A Linux-based system (Ubuntu/Debian recommended) or NixOS
- Hardware with sufficient storage for data processing (at least 1 TB recommended)
Quick Start
1. Clone the repository:

   git clone https://github.com/wg-lux/endoreg-db.git
   cd endoreg-db

2. Set up your Python environment. The repository ships a devenv.nix file. This Nix configuration sets up a Python development environment for a Django-based project using uv for dependency management. It defines project directories, environment variables, runtime packages, and several development tasks and scripts.

   Some available test shortcuts:
   - runtests: runs all tests — uv run python runtests.py
   - runtests-dataloader: runs dataloader tests — uv run python runtests.py 'dataloader'
   - runtests-other: runs other miscellaneous tests — uv run python runtests.py 'other'
   - runtests-helpers: runs helper module tests — uv run python runtests.py 'helpers'
   - runtests-administration: runs admin module tests — uv run python runtests.py 'administration'
   - runtests-medical: runs medical module tests — uv run python runtests.py 'medical'

3. Allow direnv to load the environment:

   direnv allow

4. Run the tests via the devenv script:

   runtests
Tests Overview
- These tests ensure the functionality of different models and scenarios.
- After running them, you can review the pass/fail summary in the test runner output.
Apply the database migrations:

python manage.py migrate

This creates the tables and updates your database schema to match the current state of your Django models.

To load the base database data, run:

python manage.py load_base_db_data
Accessing the Django Shell
- To fetch or interact with data from the terminal, start the Django shell:
python manage.py shell
- Using the Django shell, you can:
- Import database models
- Fetch data from the database
- Access related data through model relationships (e.g., foreign keys, one-to-many, many-to-many)
- Examples are shown below.
EXAMPLE # 1
- Explanation: This script fetches a patient by ID and prints their related examination(s) using Django ORM. It retrieves the examination name linked to the patient from the PatientExamination table.
EXAMPLE # 2
- Explanation: In the Django shell, a specific ExaminationIndication named "colonoscopy_screening" was fetched, and its related FindingIntervention records were accessed using the reverse relation expected_interventions. The first intervention (colon_lesion_polypectomy_cold_snare) was then queried to confirm it is also linked to multiple indications, demonstrating a many-to-many relationship between indications and interventions.
EXAMPLE # 3
- Explanation: All required labels (polyp, instrument, digital_chromo_endoscopy, etc.) are confirmed to exist. The first available video (VideoFile) was loaded, with a valid frame_dir. Using the label "polyp", 8 labeled polyp segments were found in that video, with specific start and end frame numbers.
EXAMPLE # 4
- Images a and b show all classifications together with their choices.
- Explanation: Using the Django shell to fetch all morphology classifications (e.g., NICE, Paris) and their related choices from the database.
📦 Database Backup and Restore
This project includes two shell scripts to export and import database data in JSON format using Django's management commands.
Setup
First, make the scripts executable:
chmod +x import_db.sh
chmod +x export_db.sh
Export (Backup) the Database
To export the current database into a JSON file:
./export_db.sh
This will create a backup file such as endoreg_db_backup.json.
Commands in export_db.sh:
- python manage.py dumpdata --indent 4 --output=endoreg_db_backup.json (if the migrate command generates and stores data in database tables, those tables need to be excluded from the dump)
- python manage.py shell < fix_endoreg_db_backup_json.py
Import (Restore) the Database
To load the data back into the database:
./import_db.sh
Commands in import_db.sh:
- rm dev_db.sqlite3
- python manage.py migrate
- python manage.py shell < fix_endoreg_db_backup_json.py
- python manage.py loaddata endoreg_db_backup_fixed.json
📁 Repository Structure
endoreg-db/
├── endoreg_db/ # Main Django app for medical data
│ ├── data/ # Medical knowledge base
│ ├── management/ # Data wrangling operations
│ ├── models/ # Data models
│ ├── migrations/ # Database migrations
│ └── serializers/ # Serializers for data
├── .gitignore # Git ignore file for unnecessary files
└── README.md # Project description and setup instructions
🔒 Security Features
- Data Encryption: All sensitive patient data is encrypted.
- Role-Based Access Control: Configurable roles for managing access to various parts of the system.
- Logging & Auditing: Comprehensive logging system that tracks user activities and data changes.
🖥️ Supported Systems
- Workstations: Local development or research workstations with low data processing demands.
- Servers: Scalable server infrastructure for processing large data volumes, integrated with cloud services for scalability.
🛟 Support
For issues and questions:
- Create an issue in the repository
- Review the Deployment Guide for common issues
📜 License
MIT - see LICENSE
📖 Further Documentation
All extended documentation lives in the project Wiki → Browse the Wiki »
Standalone Modules In This Checkout
This repository now vendors two standalone LX modules that should be used directly for report rendering and terminology bundle authoring:
- lx-report-generator: standalone Rust PDF renderer
- lx-terminology-editor: local terminology bundle editor and publisher
lx-report-generator with Nix
From the repo root:
cd lx-report-generator
direnv allow # optional
devenv shell
./target/release/report_pdf_renderer \
--input examples/report_payload.json \
--output /tmp/report_example.pdf
To wire it into endoreg_db:
export ENDOREG_REPORT_PDF_RENDERER_BIN="$PWD/target/release/report_pdf_renderer"
lx-terminology-editor with Nix
From the repo root:
cd lx-terminology-editor
direnv allow # optional
devenv shell
python server.py
Then open:
http://localhost:4173
The editor can publish a terminology bundle locally under:
lx-terminology-editor/.published/<publish-name>/<version>/
and writes a registry file at:
lx-terminology-editor/.published/kb_registry.json
That registry can then be used as an LX_DTYPES_KB_REGISTRY source.
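Assuming LX_DTYPES_KB_REGISTRY is consumed as an environment variable pointing at the registry file (an assumption; check the module's own documentation), wiring the locally published registry up could look like this:

```shell
# Hypothetical wiring: point LX_DTYPES_KB_REGISTRY at the locally published
# registry file (path as produced by the editor, see above).
export LX_DTYPES_KB_REGISTRY="$PWD/lx-terminology-editor/.published/kb_registry.json"
```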
Optimization Documentation
- Complete Optimization Project Report
- Test Performance Optimization Guide
- Test Performance Optimization - Success Summary
- Test Performance Optimization: Complete Implementation Summary
- Test Suite Optimization - Final Status Report
- Test Suite Analysis & Optimization Plan
Models and Migration Documentation
- Models Documentation
- Test Migration & Optimization Report
- Test Migration Success Summary
- Test Optimization Migration Guide
API Documentation
Frame Anonymization (Frame-Anonymisierung)
Tutorials Documentation
Keycloak
- How to Create a New Account for Keycloak + Nextcloud
- Integration with the frontend
- Merging Multi-User Accounts in Nextcloud // current options
- New user login steps for Keycloak and Nextcloud
- Keycloak integration with backend endpoint
Coding Principles & Practices
Figures
- Coloreg
- EndoReg Framework
- EndoReg Data Collection Workflow
- A Shared Data Platform for Clinic and Research (Eine gemeinsame Datenplattform für Klinik & Forschung)
Miscellaneous