Reproducible DHIS2 Python SDK for LMIC scenarios
Project description
pydhis2 is a next-generation Python library for interacting with DHIS2, the world's largest health information management system. It provides a clean, modern, and efficient API for data extraction, analysis, and management, with a strong emphasis on reproducible workflows, a critical need in scientific research and public health analysis, especially in Low and Middle-Income Country (LMIC) contexts.
Why pydhis2?
- Modern & Asynchronous: Built with asyncio for high-performance, non-blocking I/O, making it ideal for large-scale data operations. A synchronous client is also provided for simplicity in smaller scripts.
- Reproducible by Design: From project templates to a powerful CLI, pydhis2 is built to support standardized, shareable, and verifiable data analysis pipelines.
- Seamless DataFrame Integration: Natively convert DHIS2 analytics data into Pandas DataFrames with a single method call (.to_pandas()), connecting you instantly to the PyData ecosystem.
- Powerful Command Line Interface: Automate common tasks like data pulling and configuration directly from your terminal.
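To illustrate why non-blocking I/O helps with large-scale pulls, here is a minimal sketch that uses only the standard library. It does not call pydhis2: `fake_fetch` is a hypothetical stand-in that simulates a network request with `asyncio.sleep`, and the resource names are placeholders.

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float) -> str:
    # Hypothetical stand-in for a network call; sleeping yields to the event loop.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def fetch_all(names: list[str], delay: float = 0.1) -> list[str]:
    # gather() runs all coroutines concurrently and preserves input order.
    return await asyncio.gather(*(fake_fetch(n, delay) for n in names))


def run_demo() -> tuple[list[str], float]:
    # Time three simulated requests that overlap instead of running back to back.
    start = time.perf_counter()
    results = asyncio.run(fetch_all(["analytics", "metadata", "events"]))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the three simulated requests wait at the same time, the total elapsed time should be close to one delay rather than the sum of all three.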
Getting Started
1. Installation
Install pydhis2 directly from PyPI:
pip install pydhis2
2. Verify Your Installation
Use the built-in CLI to run a quick demo. This will connect to a live DHIS2 server, fetch data, and confirm that your installation is working correctly.
# Check the installed version
pydhis2 version
# Run the quick demo
pydhis2 demo quick
A successful run will produce the following output:
============================================================
pydhis2 Quick Demo
============================================================
=== Testing: https://demos.dhis2.org/dq ===
Found working API endpoint!
System: Data Quality
Version: 2.38.4.3
Found working server: https://demos.dhis2.org/dq
2. Querying Analytics data...
Retrieved 1 data records
...
Demo completed successfully!
Basic Usage
Here is a simple example of how to use pydhis2 in a Python script to fetch analytics data and load it into a Pandas DataFrame.
Create a file named my_analysis.py:
import asyncio
import sys

from pydhis2 import get_client, DHIS2Config
from pydhis2.core.types import AnalyticsQuery

# pydhis2 provides both an async and a sync client
AsyncDHIS2Client, _ = get_client()


async def main():
    # 1. Configure the connection to a DHIS2 server
    config = DHIS2Config(
        base_url="https://demos.dhis2.org/dq",
        auth=("demo", "District1#"),
    )
    async with AsyncDHIS2Client(config) as client:
        # 2. Define the query parameters
        query = AnalyticsQuery(
            dx=["b6mCG9sphIT"],  # Data element: ANC 1 Outlier Threshold
            ou="qzGX4XdWufs",    # Org unit: A-1 District Hospital
            pe="2023",           # Period: Year 2023
        )
        # 3. Fetch data and convert it directly to a Pandas DataFrame
        df = await client.analytics.to_pandas(query)

    # 4. Analyze and display the results
    print("✅ Data fetched successfully!")
    print(f"Retrieved {len(df)} records.")
    print("\n--- Data Preview ---")
    print(df.head())


if __name__ == "__main__":
    # Standard fix for asyncio on Windows
    if sys.platform == 'win32':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    asyncio.run(main())
Run your script from the terminal:
python my_analysis.py
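For readers curious about what a table conversion like this involves, the sketch below shows one plausible mapping, assuming the DHIS2 analytics response shape of a `headers` list plus a `rows` list. It is not pydhis2's actual implementation; `analytics_to_records` is a hypothetical helper, and the sample payload is made up (the IDs are reused from the example above, the value is invented).

```python
from typing import Any


def analytics_to_records(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Pair each row's values with the column names listed in `headers`."""
    columns = [header["name"] for header in payload["headers"]]
    return [dict(zip(columns, row)) for row in payload["rows"]]


# Made-up sample payload for illustration only; the value is not real data.
sample = {
    "headers": [{"name": "dx"}, {"name": "ou"}, {"name": "pe"}, {"name": "value"}],
    "rows": [["b6mCG9sphIT", "qzGX4XdWufs", "2023", "12.0"]],
}

records = analytics_to_records(sample)
print(records)
```

A list of dictionaries like `records` can be passed straight to `pandas.DataFrame(records)` if you want to inspect a raw response by hand.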
Server Configuration
While you can pass credentials directly in your script, we recommend using environment variables for better security and flexibility.
1. Environment Variables (Recommended)
export DHIS2_URL="https://your-dhis2-server.com"
export DHIS2_USERNAME="your_username"
export DHIS2_PASSWORD="your_password"
pydhis2 will automatically detect and use these variables.
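If you prefer to read these variables explicitly, for example to fail early with a clear message, a minimal sketch could look like the following. The helper `load_dhis2_settings` is hypothetical and not part of pydhis2; only the three variable names come from the section above.

```python
import os
from typing import Mapping, Optional


def load_dhis2_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Collect DHIS2 connection settings, raising if any variable is unset or empty."""
    env = os.environ if env is None else env
    names = {
        "base_url": "DHIS2_URL",
        "username": "DHIS2_USERNAME",
        "password": "DHIS2_PASSWORD",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned values can then be passed to `DHIS2Config(base_url=..., auth=(username, password))` in the same way as the in-script configuration.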
2. In-Script Configuration
from pydhis2 import DHIS2Config
config = DHIS2Config(
base_url="https://your-dhis2-server.com",
auth=("your_username", "your_password")
)
3. Using the CLI
The CLI provides a convenient way to set and cache your credentials.
pydhis2 config --url "https://your-dhis2-server.com" --username "your_username"
A Reproducible Workflow: Project Templates
Beyond being a library, pydhis2 promotes a standardized workflow that is essential for scientific research. To jumpstart your analysis, we provide a project template powered by Cookiecutter.
Why use the template?
- Standardization: Ensures every project starts with a clean, logical structure.
- Rapid Start: Generate a fully functional project skeleton in a single command.
- Best Practices: Includes pre-configured settings for DHIS2 connections, data quality pipelines, and environment management.
- Focus on Analysis: Spend less time on boilerplate setup and more time on your research.
How to Use
- Install Cookiecutter:
  pip install cookiecutter
- Generate your project: Point Cookiecutter to the pydhis2 template.
  cookiecutter gh:HzaCode/pyDHIS2 --directory pydhis2/templates
  You'll be prompted for details like your project name and author:
  project_name [My DHIS-2 Analysis Project]: Malaria Analysis Malawi
  project_slug [malaria_analysis_malawi]:
  author_name [Your Name]: Dr. Evans
- Get a complete, ready-to-use project structure:
  malaria-analysis-malawi/
  ├── configs/        # DHIS-2 & DQR configurations
  ├── data/           # Raw and processed data
  ├── pipelines/      # Analysis pipeline definitions
  ├── scripts/        # Runner scripts
  ├── .env.example    # Environment variable template
  └── README.md       # A dedicated README for your new project
You can now cd into your new project directory and begin your analysis immediately!
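The prompt above suggests a default slug derived from the project name. The template's actual expression is not shown here, so the following is only a guess at the kind of rule involved; `slugify` is a hypothetical helper.

```python
import re


def slugify(project_name: str) -> str:
    # Lowercase, then collapse every run of non-alphanumeric characters into "_".
    return re.sub(r"[^a-z0-9]+", "_", project_name.lower()).strip("_")


print(slugify("Malaria Analysis Malawi"))
```

For the project name used in the example prompt, this yields the same slug the prompt offers as its default.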
Command Line Interface
pydhis2 provides a powerful CLI for common data operations. (Note: Implementation is in progress)
# Pull analytics data and save as Parquet
pydhis2 analytics pull --dx "b6mCG9sphIT" --ou "qzGX4XdWufs" --pe "2023" --out analytics.parquet
# Pull tracker events
pydhis2 tracker events --program "program_id" --out events.parquet
# Run a data quality review
pydhis2 dqr analyze --input analytics.parquet --html dqr_report.html
For a full list of commands, run pydhis2 --help.
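When a pipeline needs to call the CLI from Python, it can help to build the argument list programmatically. The sketch below assumes only the `analytics pull` flags shown above; because the CLI is still in progress, treat them as subject to change. `build_pull_command` is a hypothetical helper.

```python
import shlex
from typing import List


def build_pull_command(dx: str, ou: str, pe: str, out: str) -> List[str]:
    """Assemble the argument list for the `pydhis2 analytics pull` command."""
    return [
        "pydhis2", "analytics", "pull",
        "--dx", dx,
        "--ou", ou,
        "--pe", pe,
        "--out", out,
    ]


cmd = build_pull_command("b6mCG9sphIT", "qzGX4XdWufs", "2023", "analytics.parquet")
# shlex.join renders the list as a shell-safe string for logging.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems and keeps the exact command in your analysis log.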
Supported Endpoints
| Endpoint | Read | Write | DataFrame | Pagination | Streaming |
|---|---|---|---|---|---|
| Analytics | ✅ | - | ✅ | ✅ | ✅ |
| DataValueSets | ✅ | ✅ | ✅ | ✅ | ✅ |
| Tracker Events | ✅ | ✅ | ✅ | ✅ | ✅ |
| Metadata | ✅ | ✅ | ✅ | - | - |
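As a rough picture of what the Pagination and Streaming columns mean in practice, the sketch below walks a paged resource lazily with a generator. It does not use pydhis2: `fetch_page` is a hypothetical callable, the `pager`/`pageCount` shape is an assumption modeled on DHIS2 paged responses, and the data is fake.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(
    fetch_page: Callable[[int], Dict[str, Any]], key: str
) -> Iterator[Dict[str, Any]]:
    """Yield items page by page so only one page is held in memory at a time."""
    page = 1
    while True:
        payload = fetch_page(page)
        yield from payload.get(key, [])
        if page >= payload["pager"]["pageCount"]:
            break
        page += 1


# Fake in-memory "server" with five items, served two per page.
_FAKE = [{"id": f"item{i}"} for i in range(5)]


def fake_fetch_page(page: int, page_size: int = 2) -> Dict[str, Any]:
    start = (page - 1) * page_size
    page_count = -(-len(_FAKE) // page_size)  # ceiling division
    return {
        "pager": {"page": page, "pageCount": page_count},
        "events": _FAKE[start:start + page_size],
    }


ids = [item["id"] for item in iter_items(fake_fetch_page, "events")]
print(ids)
```

The generator requests the next page only when the caller asks for more items, which is the property that makes streaming large result sets feasible.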
Compatibility
- Python: ≥ 3.9
- DHIS2: ≥ 2.36
- Platforms: Windows, Linux, macOS
Contributing
Contributions are welcome and highly encouraged! pydhis2 is a community-driven project, and we believe that collaboration is key to building robust and useful tools for the open-science community.
Please see our Contributing Guide for details on how to get started. Also, be sure to review our Code of Conduct.
Community & Support
- Documentation: For in-depth guides and API references.
- GitHub Issues: To report bugs or request new features.
- GitHub Discussions: For questions, ideas, and community conversation.
- Changelog: Version history and release notes.
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
File details
Details for the file pydhis2-0.2.1.tar.gz.
File metadata
- Download URL: pydhis2-0.2.1.tar.gz
- Upload date:
- Size: 329.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cc0ee16d2cf1351b93616449f33b47e139f447421cd475a7bd09be25a89f53db |
| MD5 | 809cc27de4e5c8ccfd3d77b686ab5056 |
| BLAKE2b-256 | 19fc7bc32cf69aa4c6e137ce10fac1b6dc98821b75c0ab0ea713b787640046f3 |
File details
Details for the file pydhis2-0.2.1-py3-none-any.whl.
File metadata
- Download URL: pydhis2-0.2.1-py3-none-any.whl
- Upload date:
- Size: 89.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c3bf268c5c15aa2964ec23f1996400348e2f3f810c41f70428b966fa28134f4f |
| MD5 | 9b8f82312c8578d63fe4a64b9c3b44b4 |
| BLAKE2b-256 | dbd3295f0a6a0ff6b3896b556c9ce53fc7c3b677e31a6fba3a9050a2b6f9ab67 |