Open-source Data integration platform

OpenHEXA Python SDK

OpenHEXA is an open-source data integration platform developed by Bluesquare.

Its goal is to facilitate data integration and analysis workflows, in particular in the context of public health projects.

Please refer to the OpenHEXA wiki for more information about OpenHEXA.

This repository contains the code of the OpenHEXA SDK, a library that allows you to write code for the OpenHEXA platform. It is particularly useful for writing OpenHEXA data pipelines, but can also be used in the OpenHEXA notebooks environment.

The OpenHEXA wiki has a section dedicated to the SDK: Using the OpenHEXA SDK.

Requirements

The OpenHEXA SDK requires Python version 3.11, 3.12 or 3.13.
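
As a quick sanity check before installing, the sketch below tests whether the running interpreter is one of the versions listed above. The helper name `is_supported` and the `SUPPORTED_MINORS` constant are illustrative and not part of the SDK.

```python
import sys

# Python versions listed in the Requirements section above.
SUPPORTED_MINORS = {(3, 11), (3, 12), (3, 13)}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) pair is one of the listed versions."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


current = f"{sys.version_info[0]}.{sys.version_info[1]}"
if is_supported():
    print(f"Python {current} is listed as supported.")
else:
    print(f"Python {current} is not listed; use 3.11, 3.12 or 3.13.")
```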

If you want to be able to run pipelines in a containerized environment on your machine, you will need Docker.

Quickstart

Here's a super minimal example to get you started. First, create a new directory and a virtual environment:

mkdir openhexa-pipelines-quickstart
cd openhexa-pipelines-quickstart
python -m venv venv
source venv/bin/activate

You can then install the OpenHEXA SDK:

pip install --upgrade openhexa.sdk

💡 New OpenHEXA SDK versions are released on a regular basis. Don't forget to update your local installations with pip install --upgrade from time to time!

Now that the SDK is installed within your virtual environment, you can use the openhexa CLI utility to create a new pipeline:

openhexa pipelines init "My awesome pipeline"

Great! As you can see in the console output, the OpenHEXA CLI has created a new directory, which contains the basic structure required for an OpenHEXA pipeline. You can now run the pipeline by passing the path of the new pipeline directory:

openhexa pipelines run ./my_awesome_pipeline

Congratulations! You have successfully run your first pipeline locally.
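
In this quickstart, the pipeline named "My awesome pipeline" ended up in a directory called my_awesome_pipeline. The sketch below reproduces that mapping for this one example only. It is a hypothetical helper (`to_directory_name` is an invented name), not the CLI's actual implementation, so always rely on the directory name printed in the console output.

```python
import re


def to_directory_name(pipeline_name: str) -> str:
    """Lowercase the name and join its alphanumeric words with underscores.

    Illustrative only: the OpenHEXA CLI may apply different rules.
    """
    words = re.findall(r"[a-z0-9]+", pipeline_name.lower())
    return "_".join(words)


print(to_directory_name("My awesome pipeline"))  # prints: my_awesome_pipeline
```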

If you inspect the actual pipeline code, you will see that it doesn't do a lot of things, but it is still a perfectly valid OpenHEXA pipeline.

Let's publish it to an actual OpenHEXA workspace so that it can run online.

Using the OpenHEXA web interface, within a workspace, navigate to the Pipelines tab and click on "Create".

Copy the command displayed in the popup and paste it into your terminal:

openhexa workspaces add <workspace>

You will be prompted for an authentication token, which you can also find in the popup.

After adding the workspace using the CLI, you can now push your pipeline:

openhexa pipelines push

As this is the first push, the CLI will ask you to confirm the creation of the pipeline. After confirmation, the console will output the link to the pipeline screen in the OpenHEXA interface.

You can now open the link and run the pipeline using the OpenHEXA web interface.

Contributing

The following sections explain how to set up a local development environment if you want to participate in the development of the SDK.

SDK development setup

Install the SDK in editable mode:

python -m venv venv # Create a virtual environment for this project
source venv/bin/activate # Activate the venv
pip install -e ".[dev]"  # Necessary to be able to run the openhexa CLI

Using a local installation of OpenHEXA to run pipelines

While it is possible to run pipelines locally using only the SDK, if you want to run OpenHEXA in a more realistic setting you will need to install the OpenHEXA app and frontend components. Please refer to the installation instructions for more information.

You can then configure the OpenHEXA CLI to connect to your local backend:

openhexa config set_url http://localhost:8000

Note: you can monitor the status of your pipelines at http://localhost:8000/pipelines/status

Using a local version of the SDK to run pipelines

If you want to use a local version of the SDK to run pipelines, you can build a Docker image with the local version of the SDK installed in it:

docker build --platform linux/amd64 -t local_image:v1 -f images/Dockerfile .

Then reference the image name and tag in the .env file of your OpenHEXA app:

DEFAULT_WORKSPACE_IMAGE=local_image:v1

Alternatively, you can set the following in the workspace.yaml configuration file in your pipeline directory:

env:
  WORKSPACE_DOCKER_IMAGE: local_image:v1

Running the tests

You can run the tests using pytest:

pytest

Codegen from the GraphQL schema

We use code generation to create Python client code from our GraphQL schema. This involves one tool:

  • ariadne-codegen: Generates typed Python GraphQL client code from GraphQL files

The code generation process:

  1. The GraphQL schema is manually copied from the OpenHEXA monorepo and saved in openhexa/graphql/schema.generated.graphql
  2. ariadne-codegen uses both the schema and queries to generate typed Python client code

To run code generation manually:

pip install ariadne-codegen
python -m ariadne_codegen

ariadne-codegen runs automatically via pre-commit hooks and CI/CD when GraphQL files are modified.

You can add new queries or mutations in the openhexa/graphql/queries.graphql directory, and they will be picked up by the code generation process.

Example of usage of the generated code:

from sdk import OpenHexaClient

# connect to OpenHEXA backend using environment variables
OpenHexaClient().get_countries(workspace_slug="workspace_slug_example")

# or explicitly pass the URL and token
OpenHexaClient(server_url="app.demo.openhexa.org", token="supersecuretoken")

Release

This project uses release-please to manage releases using conventional commits.

To release a new version:

  1. You need at least one commit with a conventional commit message (feat|fix) since the last release.
  2. release-please will create a new release PR on GitHub.
  3. Once the PR is merged, release-please will create a new release on GitHub.
  4. A GitHub action will build the package when the GitHub release is created and upload it to PyPI and Anaconda.
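
To illustrate step 1, the sketch below checks whether a commit subject line starts with one of the two types mentioned above (feat or fix), with an optional scope. It is a simplified assumption of what counts as a qualifying commit: release-please applies its own rules, which may cover more cases, and the names `RELEASABLE` and `is_releasable` are invented for this example.

```python
import re

# Simplified pattern: "feat" or "fix", an optional "(scope)", an optional "!",
# then ": " followed by a description. This only mirrors the (feat|fix) hint
# above and is not release-please's actual parser.
RELEASABLE = re.compile(r"^(feat|fix)(\([^)]+\))?!?: \S")


def is_releasable(subject: str) -> bool:
    """Return True if the commit subject looks like a feat or fix commit."""
    return RELEASABLE.match(subject) is not None


for subject in [
    "feat: add dataset helpers",
    "fix(cli): handle missing token",
    "chore: bump dependencies",
]:
    print(f"{subject!r}: {is_releasable(subject)}")
```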
