OpenHEXA SDK

Open-source Data integration platform

OpenHEXA Python SDK

OpenHEXA is an open-source data integration platform developed by Bluesquare.

Its goal is to facilitate data integration and analysis workflows, in particular in the context of public health projects.

Please refer to the OpenHEXA wiki for more information about OpenHEXA.

This repository contains the code of the OpenHEXA SDK, a library that allows you to write code for the OpenHEXA platform. It is particularly useful for writing OpenHEXA data pipelines, but it can also be used in the OpenHEXA notebooks environment.

The OpenHEXA wiki has a section dedicated to the SDK: Using the OpenHEXA SDK.

Requirements

The OpenHEXA SDK requires Python version 3.9 or newer, but it is not yet compatible with Python 3.12 or later versions.
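To check whether an interpreter falls inside that range before installing, a small script along these lines may help. It is only a sketch: the bounds are taken from the requirement stated above, and the names `MIN_SUPPORTED`, `FIRST_UNSUPPORTED` and `is_supported_python` are illustrative and not part of the SDK.

```python
import sys

# Bounds taken from the requirement above: 3.9 <= version < 3.12.
MIN_SUPPORTED = (3, 9)
FIRST_UNSUPPORTED = (3, 12)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is in the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor < FIRST_UNSUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} is in the supported range.")
    else:
        print(f"Python {current} is outside the supported range (3.9 to 3.11).")
```

Running it with the interpreter you plan to use for the virtual environment tells you whether you need to pick a different Python version first.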

If you want to be able to run pipelines in a containerized environment on your machine, you will need Docker.

Quickstart

Here's a super minimal example to get you started. First, create a new directory and a virtual environment:

mkdir openhexa-pipelines-quickstart
cd openhexa-pipelines-quickstart
python -m venv venv
source venv/bin/activate

You can then install the OpenHEXA SDK:

pip install --upgrade openhexa.sdk

💡 New OpenHEXA SDK versions are released on a regular basis. Don't forget to update your local installations with pip install --upgrade from time to time!
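If you are unsure which version is currently installed in the active virtual environment, the standard library can tell you. The sketch below assumes the distribution name is `openhexa.sdk`, as used in the pip command above; `installed_version` is an illustrative helper, not an SDK function.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="openhexa.sdk"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("openhexa.sdk is not installed in this environment.")
    else:
        print(f"openhexa.sdk {found} is installed.")
```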

Now that the SDK is installed within your virtual environment, you can use the openhexa CLI utility to create a new pipeline:

openhexa pipelines init "My awesome pipeline"

Great! As you can see in the console output, the OpenHEXA CLI has created a new directory, which contains the basic structure required for an OpenHEXA pipeline. You can now run the pipeline by passing the path of the new directory to the CLI:

openhexa pipelines run ./my_awesome_pipeline

Congratulations! You have successfully run your first pipeline locally.

If you inspect the actual pipeline code, you will see that it doesn't do much, but it is still a perfectly valid OpenHEXA pipeline.

Let's publish it to an actual OpenHEXA workspace so that it can run online.

Using the OpenHEXA web interface, within a workspace, navigate to the Pipelines tab and click on "Create".

Copy the command displayed in the popup and paste it in your terminal:

openhexa workspaces add <workspace>

You will be prompted for an authentication token, which you can also find in the popup.

After adding the workspace using the CLI, you can now push your pipeline:

openhexa pipelines push 

As this is the first push, the CLI will ask you to confirm the creation of the pipeline. After confirmation, the console will output the link to the pipeline screen in the OpenHEXA interface.

You can now open the link and run the pipeline using the OpenHEXA web interface.

Contributing

The following sections explain how to set up a local development environment if you want to participate in the development of the SDK.

SDK development setup

Install the SDK in editable mode:

python -m venv venv # Create a virtual environment for this project
source venv/bin/activate # Activate the venv
pip install -e ".[dev]"  # Necessary to be able to run the openhexa CLI

Using a local installation of OpenHEXA to run pipelines

While it is possible to run pipelines locally using only the SDK, if you want to run OpenHEXA in a more realistic setting you will need to install the OpenHEXA app and frontend components. Please refer to the installation instructions for more information.

You can then configure the OpenHEXA CLI to connect to your local backend:

openhexa config set_url http://localhost:8000

Note: you can monitor the status of your pipelines at http://localhost:8000/pipelines/status

Running the tests

You can run the tests using pytest:

pytest

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openhexa_sdk-0.3.0.tar.gz (53.4 kB)

Uploaded Source

Built Distribution

openhexa.sdk-0.3.0-py3-none-any.whl (64.0 kB)

Uploaded Python 3

File details

Details for the file openhexa_sdk-0.3.0.tar.gz.

File metadata

  • Download URL: openhexa_sdk-0.3.0.tar.gz
  • Upload date:
  • Size: 53.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for openhexa_sdk-0.3.0.tar.gz
Algorithm Hash digest
SHA256 fcaf8dc8c99801d33bcdce8671aeba2ca74b94261e1b538579e3cba2e23880dc
MD5 c2df181d0e31182d3b398db6c9140107
BLAKE2b-256 088f66d8fcbd6b52cbbcb7eb681d8e76f0e82ead07d8dca570573d08c88f23c7

See more details on using hashes here.

File details

Details for the file openhexa.sdk-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: openhexa.sdk-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 64.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for openhexa.sdk-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e140cf2ec99557b91e2c84f62b5ab4159a5ac258914b29cc53b945d39f731e23
MD5 b939acb815289cfd8d58a30ae80eee96
BLAKE2b-256 8e06c0b4d014a22d255a6131dad454687714220e067f5961e8cff4d055a9358f

See more details on using hashes here.
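If you download either file manually, you can compare its SHA256 digest with the values listed above. The following is a minimal sketch using only the standard library. The digests are copied from the "File hashes" sections, the helper names are illustrative, and the script assumes the files sit in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" sections above.
EXPECTED_SHA256 = {
    "openhexa_sdk-0.3.0.tar.gz": (
        "fcaf8dc8c99801d33bcdce8671aeba2ca74b94261e1b538579e3cba2e23880dc"
    ),
    "openhexa.sdk-0.3.0-py3-none-any.whl": (
        "e140cf2ec99557b91e2c84f62b5ab4159a5ac258914b29cc53b945d39f731e23"
    ),
}


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Return True only if the file name is known and its digest matches."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of_file(path) == expected


if __name__ == "__main__":
    for name in EXPECTED_SHA256:
        if Path(name).exists():
            status = "OK" if matches_published_hash(name) else "MISMATCH"
            print(f"{name}: {status}")
        else:
            print(f"{name}: not found in the current directory")
```

A file whose name is not in the table is treated as unverified and never as valid, so a renamed download fails the check.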
