Reproducible DHIS2 Python SDK for LMIC scenarios

pydhis2

A modern Python SDK for DHIS2, designed for robust and reproducible scientific workflows.

pydhis2 is a next-generation Python library for interacting with DHIS2, the world's largest health information management system. It provides a clean, modern, and efficient API for data extraction, analysis, and management, with a strong emphasis on creating reproducible workflows, a critical need in scientific research and public health analysis, especially in low- and middle-income country (LMIC) contexts.


✨ Why pydhis2?

  • 🚀 Modern & Asynchronous: Built with asyncio for high-performance, non-blocking I/O, making it ideal for large-scale data operations. A synchronous client is also provided for simplicity in smaller scripts.
  • Reproducible by Design: From project templates to a powerful CLI, pydhis2 is built to support standardized, shareable, and verifiable data analysis pipelines.
  • 🐼 Seamless DataFrame Integration: Natively convert DHIS2 analytics data into Pandas DataFrames with a single method call (.to_pandas()), connecting you instantly to the PyData ecosystem.
  • 🔧 Powerful Command Line Interface: Automate common tasks like data pulling and configuration directly from your terminal.
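
As a rough illustration of why non-blocking I/O helps with large pulls, the sketch below uses only the standard library. `fake_fetch` is a hypothetical stand-in for a network request, not a pydhis2 call; the point is only that several waits can overlap instead of running back to back.

```python
import asyncio
import time


async def fake_fetch(period: str, delay: float = 0.1) -> dict:
    """Hypothetical stand-in for one network request (not a pydhis2 call)."""
    await asyncio.sleep(delay)  # simulates waiting on the server
    return {"period": period, "rows": 1}


async def fetch_all(periods: list[str]) -> list[dict]:
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_fetch(p) for p in periods))


def run_demo() -> tuple[list[dict], float]:
    """Fetch three periods concurrently and report the elapsed wall time."""
    start = time.perf_counter()
    results = asyncio.run(fetch_all(["2021", "2022", "2023"]))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(results, f"{elapsed:.2f}s")
```

Because the three simulated requests wait at the same time, the total run time should be close to one delay rather than the sum of all three.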

🚀 Getting Started

1. Installation

Install pydhis2 directly from PyPI:

pip install pydhis2

2. Verify Your Installation

Use the built-in CLI to run a quick demo. This will connect to a live DHIS2 server, fetch data, and confirm that your installation is working correctly.

# Check the installed version
pydhis2 version

# Run the quick demo
pydhis2 demo quick

A successful run produces output similar to the following (the server and version reported may differ):

============================================================
pydhis2 Quick Demo
============================================================
=== Testing: https://demos.dhis2.org/dq ===
   Found working API endpoint!
   System: Data Quality
   Version: 2.38.4.3
Found working server: https://demos.dhis2.org/dq

2. Querying Analytics data...
Retrieved 1 data records
...
Demo completed successfully!
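
If you would rather confirm the installation from inside Python, for example in a notebook, a minimal sketch using the standard library's `importlib.metadata` is shown here. The helper name `installed_version` is an illustration and is not part of pydhis2.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pydhis2")
    print(f"pydhis2 version: {version}" if version else "pydhis2 is not installed")
```

Recording this version string alongside your results is a cheap way to make an analysis easier to reproduce later.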

📖 Basic Usage

Here is a simple example of how to use pydhis2 in a Python script to fetch analytics data and load it into a Pandas DataFrame.

Create a file named my_analysis.py:

import asyncio
import sys
from pydhis2 import get_client, DHIS2Config
from pydhis2.core.types import AnalyticsQuery

# pydhis2 provides both an async and a sync client
AsyncDHIS2Client, _ = get_client()

async def main():
    # 1. Configure the connection to a DHIS2 server
    config = DHIS2Config(
        base_url="https://demos.dhis2.org/dq",
        auth=("demo", "District1#")
    )
  
    async with AsyncDHIS2Client(config) as client:
        # 2. Define the query parameters
        query = AnalyticsQuery(
            dx=["b6mCG9sphIT"],   # Data element: ANC 1 Outlier Threshold
            ou="qzGX4XdWufs",    # Org unit: A-1 District Hospital
            pe="2023"            # Period: Year 2023
        )

        # 3. Fetch data and convert it directly to a Pandas DataFrame
        df = await client.analytics.to_pandas(query)

        # 4. Analyze and display the results
        print("✅ Data fetched successfully!")
        print(f"Retrieved {len(df)} records.")
        print("\n--- Data Preview ---")
        print(df.head())

if __name__ == "__main__":
    # Standard fix for asyncio on Windows
    if sys.platform == 'win32':
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    asyncio.run(main())

Run your script from the terminal:

python my_analysis.py

🔧 Server Configuration

While you can pass credentials directly in your script, we recommend using environment variables for better security and flexibility.

1. Environment Variables (Recommended)

export DHIS2_URL="https://your-dhis2-server.com"
export DHIS2_USERNAME="your_username"
export DHIS2_PASSWORD="your_password"

pydhis2 will automatically detect and use these variables.
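
If a script should fail early with a clear message when a variable is unset, the values can also be read explicitly. The sketch below uses only the standard library; `load_dhis2_settings` is a hypothetical helper, not a pydhis2 function. Its return keys mirror the `base_url` and `auth` arguments used in the DHIS2Config examples in this README, so passing them on as `DHIS2Config(**settings)` is an assumption based on those examples.

```python
import os
from typing import Mapping


def load_dhis2_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the three DHIS2_* variables, failing loudly if any is unset."""
    names = ("DHIS2_URL", "DHIS2_USERNAME", "DHIS2_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "base_url": env["DHIS2_URL"],
        "auth": (env["DHIS2_USERNAME"], env["DHIS2_PASSWORD"]),
    }


if __name__ == "__main__":
    try:
        settings = load_dhis2_settings()
    except RuntimeError as exc:
        print(exc)
    else:
        # Never print the password; the URL is enough to confirm the setup.
        print(f"Configured for {settings['base_url']}")
```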

2. In-Script Configuration

from pydhis2 import DHIS2Config

config = DHIS2Config(
    base_url="https://your-dhis2-server.com",  
    auth=("your_username", "your_password")
)

3. Using the CLI

The CLI provides a convenient way to set and cache your credentials.

pydhis2 config --url "https://your-dhis2-server.com" --username "your_username"

๐Ÿ—๏ธ A Reproducible Workflow: Project Templates

Beyond being a library, pydhis2 promotes a standardized workflow that is essential for scientific research. To jumpstart your analysis, we provide a project template powered by Cookiecutter.

Why use the template?

  • Standardization: Ensures every project starts with a clean, logical structure.
  • Rapid Start: Generate a fully functional project skeleton in a single command.
  • Best Practices: Includes pre-configured settings for DHIS2 connections, data quality pipelines, and environment management.
  • Focus on Analysis: Spend less time on boilerplate setup and more time on your research.

How to Use

  1. Install Cookiecutter:

    pip install cookiecutter
    
  2. Generate your project: Point Cookiecutter to the pydhis2 template. It will prompt you for project details.

    cookiecutter gh:HzaCode/pyDHIS2 --directory pydhis2/templates
    

    You'll be prompted for details like your project name and author:

    project_name [My DHIS-2 Analysis Project]: Malaria Analysis Malawi
    project_slug [malaria_analysis_malawi]:
    author_name [Your Name]: Dr. Evans
    
  3. Get a complete, ready-to-use project structure:

    malaria-analysis-malawi/
    ├── configs/          # DHIS-2 & DQR configurations
    ├── data/             # Raw and processed data
    ├── pipelines/        # Analysis pipeline definitions
    ├── scripts/          # Runner scripts
    ├── .env.example      # Environment variable template
    └── README.md         # A dedicated README for your new project
    

You can now cd into your new project directory and begin your analysis immediately!
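
To confirm that generation produced the layout shown in the tree above, a small check with `pathlib` can help. This is a sketch only: `missing_entries` is a hypothetical helper, and the expected names are copied from that tree, so adjust them if the template changes.

```python
from pathlib import Path

# Entries copied from the project tree shown in this README
EXPECTED = ("configs", "data", "pipelines", "scripts", ".env.example", "README.md")


def missing_entries(project_dir: str) -> list[str]:
    """Return the expected files or directories that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    print("Project layout looks complete." if not missing else f"Missing: {missing}")
```

Run it from inside the generated project directory; an empty result means every expected entry is present.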

๐Ÿ–ฅ๏ธ Command Line Interface

pydhis2 provides a powerful CLI for common data operations. (Note: Implementation is in progress)

# Pull analytics data and save as Parquet
pydhis2 analytics pull --dx "b6mCG9sphIT" --ou "qzGX4XdWufs" --pe "2023" --out analytics.parquet

# Pull tracker events
pydhis2 tracker events --program "program_id" --out events.parquet

# Run a data quality review
pydhis2 dqr analyze --input analytics.parquet --html dqr_report.html

For a full list of commands, run pydhis2 --help.
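
For repeated pulls, the CLI can be driven from a Python script. The sketch below builds the `pydhis2 analytics pull` command with the flags shown above and, by default, only returns the command lines without running them. `analytics_pull_command` and `pull_years` are hypothetical helpers, and because the CLI is still in progress, the flags may change.

```python
import shlex
import subprocess


def analytics_pull_command(dx: str, ou: str, pe: str, out: str) -> list[str]:
    """Build the argv for the `pydhis2 analytics pull` command shown above."""
    return ["pydhis2", "analytics", "pull",
            "--dx", dx, "--ou", ou, "--pe", pe, "--out", out]


def pull_years(dx: str, ou: str, years: range, dry_run: bool = True) -> list[str]:
    """Pull one Parquet file per year; with dry_run=True only return the commands."""
    commands = []
    for year in years:
        argv = analytics_pull_command(dx, ou, str(year), f"analytics_{year}.parquet")
        commands.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return commands


if __name__ == "__main__":
    for line in pull_years("b6mCG9sphIT", "qzGX4XdWufs", range(2021, 2024)):
        print(line)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual identifiers.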

📊 Supported Endpoints

Endpoint        Read  Write  DataFrame  Pagination  Streaming
Analytics       ✅    -      ✅         ✅          ✅
DataValueSets   ✅    ✅     ✅         ✅          ✅
Tracker Events  ✅    ✅     ✅         ✅          ✅
Metadata        ✅    ✅     ✅         -           -

📋 Compatibility

  • Python: ≥ 3.9
  • DHIS2: ≥ 2.36
  • Platforms: Windows, Linux, macOS
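
A script that depends on pydhis2 can guard against unsupported interpreters up front. This is a minimal sketch based on the Python requirement listed above; `python_is_supported` is a hypothetical helper, not part of pydhis2.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version from the compatibility list above


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("This analysis requires Python 3.9 or newer.")
    print("Python version is supported.")
```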

๐Ÿค Contributing

Contributions are welcome and highly encouraged! pydhis2 is a community-driven project, and we believe that collaboration is key to building robust and useful tools for the open-science community.

Please see our Contributing Guide for details on how to get started. Also, be sure to review our Code of Conduct.

📞 Community & Support

  • 📖 Documentation: For in-depth guides and API references.
  • 🐛 GitHub Issues: To report bugs or request new features.
  • 💬 GitHub Discussions: For questions, ideas, and community conversation.
  • 📝 Changelog: Version history and release notes.

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
