Hyperion

A headless ETL / ELT / data pipeline and integration SDK for Python.

📚 Documentation: https://tomasvotava.github.io/hyperion/

Hyperion organises data assets (a data catalog), validates them against Avro schemas, abstracts storage/queue/cache/secrets backends behind ports and adapters, and gives you a framework for writing data sources. The same code runs on a laptop (local filesystem) or on AWS (S3/DynamoDB/SQS) — the wiring is configuration, not code.
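
The ports-and-adapters idea described above can be sketched in a few lines. This is a minimal illustration, not Hyperion's actual API: `StoragePort`, `LocalFileStorage`, `InMemoryStorage`, `build_storage`, `save_records`, and the `storage_backend` / `storage_root` config keys are all hypothetical names invented for this example, and the in-memory adapter merely stands in for a remote backend such as S3 so the sketch runs offline.

```python
import json
import tempfile
from pathlib import Path
from typing import Protocol


class StoragePort(Protocol):
    """Hypothetical port: the only storage interface pipeline code sees."""

    def write(self, key: str, data: bytes) -> None: ...

    def read(self, key: str) -> bytes: ...


class LocalFileStorage:
    """Adapter backed by a directory on the local filesystem."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def write(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryStorage:
    """Stand-in for a remote adapter so the sketch needs no cloud account."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def read(self, key: str) -> bytes:
        return self._blobs[key]


def build_storage(config: dict[str, str]) -> StoragePort:
    """Pick an adapter from configuration; the calling code never changes."""
    backend = config.get("storage_backend", "memory")
    if backend == "local":
        return LocalFileStorage(Path(config["storage_root"]))
    if backend == "memory":
        return InMemoryStorage()
    raise ValueError(f"unknown storage backend: {backend!r}")


def save_records(storage: StoragePort, key: str, records: list[dict]) -> None:
    """Pipeline code depends only on the port, never on a concrete backend."""
    storage.write(key, json.dumps(records).encode("utf-8"))


# The same save_records call runs against either backend.
with tempfile.TemporaryDirectory() as tmp:
    for config in (
        {"storage_backend": "local", "storage_root": tmp},
        {"storage_backend": "memory"},
    ):
        storage = build_storage(config)
        save_records(storage, "customers/records.json", [{"id": 1}])
        print(config["storage_backend"], storage.read("customers/records.json"))
```

Only `build_storage` knows which backend is in play, which is what lets the wiring live in configuration rather than in pipeline code.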

Features

  • Data Catalog — manage and organise data assets across backends
  • Schema management — validate and store Avro schemas per asset
  • Ports & adapters — swap S3/DynamoDB/SQS for local filesystem via config
  • Source framework — define sources that extract data into the catalog
  • Caching — in-memory, local file, and DynamoDB caching
  • Asynchronous processing — utilities for async operations and task queues
  • CLI runner — run sources standalone or in "Argo Workflow" mode
  • Geo utilities — Haversine math in the lite core; Google Maps via [geo]
  • Asset collections — a typed, declarative interface over groups of assets
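
For readers unfamiliar with the Haversine math mentioned in the Geo utilities item, here is a standalone sketch of the formula. It is not Hyperion's own function: the name `haversine_km` and the choice of 6371.0 km as the mean Earth radius are assumptions of this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


# One degree of longitude along the equator.
print(haversine_km(0.0, 0.0, 0.0, 1.0))
```

The formula treats the Earth as a sphere, so it needs nothing beyond the standard library, which is consistent with this math living in the lite core.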

Installation

pip install 'hyperion-sdk[catalog]'
# or
poetry add 'hyperion-sdk[catalog]'

As of 1.0.0 the default install is a slim lite core; heavy backends are opt-in extras — install only the ones you use:

Extra                    Enables
hyperion-sdk[aws]        DynamoDB cache/keyval, S3 storage/schema, SQS queue, AWS Secrets Manager
hyperion-sdk[data]       pandera↔polars typing, asset schemas, SpatialKMeans
hyperion-sdk[catalog]    avro-backed Catalog (works with local filesystem storage alone)
hyperion-sdk[geo]        Google Maps geocoding (Haversine math stays lite)
hyperion-sdk[snappy]     snappy-compressed filesystem cache / DynamoDB keyval
hyperion-sdk[all]        everything (parity with the pre-1.0 full install)
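
If your own code depends on an opt-in extra, it can help to fail early with an actionable message when the extra is missing. The sketch below shows one common way to do that with the standard library. `require_extra` is a hypothetical helper written for this example, not part of Hyperion, and which importable module corresponds to which extra is something you would need to check yourself.

```python
from importlib.util import find_spec


def require_extra(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError if a top-level module is not installed.

    `module_name` must be a top-level module name (no dots); `extra` is the
    hyperion-sdk extra that is assumed to provide it.
    """
    if find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is not installed; "
            f"install it with: pip install 'hyperion-sdk[{extra}]'"
        )


# 'json' ships with Python, so this check passes silently.
require_extra("json", "catalog")
```

Calling the helper at the top of a module that needs a heavy backend turns a late, confusing import failure into an immediate hint about which extra to install.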

We follow Semantic Versioning — pin with a selector such as ^1.0.0. Upgrading from a pre-1.0 release? Import paths moved and the Catalog constructor changed — see the migration guide (also pinned at the top of CHANGELOG.md).
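
The caret selector is Poetry syntax: `^1.0.0` means `>=1.0.0,<2.0.0`. pip does not understand the caret, so the equivalent pip requirement is `'hyperion-sdk>=1.0.0,<2.0.0'`. The sketch below spells out the caret rule for plain `MAJOR.MINOR.PATCH` versions. The function names are made up for this example, and pre-release or build tags are deliberately not handled.

```python
Version = tuple[int, int, int]


def caret_bounds(version: str) -> tuple[Version, Version]:
    """Return the inclusive lower and exclusive upper bound of ^version."""
    major, minor, patch = (int(part) for part in version.split("."))
    lower = (major, minor, patch)
    # The caret allows changes that do not modify the left-most non-zero part.
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return lower, upper


def satisfies_caret(candidate: str, selector: str) -> bool:
    """Check whether a MAJOR.MINOR.PATCH candidate satisfies a ^selector."""
    lower, upper = caret_bounds(selector.lstrip("^"))
    major, minor, patch = (int(part) for part in candidate.split("."))
    return lower <= (major, minor, patch) < upper


print(satisfies_caret("1.0.2", "^1.0.0"))
print(satisfies_caret("2.0.0", "^1.0.0"))
```

Under this rule a `^1.0.0` pin accepts every 1.x release and rejects 2.0.0, which matches the Semantic Versioning promise that only a major release may break compatibility.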

Quickstart

from datetime import datetime, timezone

from hyperion.catalog.catalog import Catalog
from hyperion.domain.assets import DataLakeAsset

# Build the catalog from configuration; the backend wiring is config, not code.
catalog = Catalog.from_config()

# Describe the asset, then store its records in the catalog.
asset = DataLakeAsset(name="customer_data", date=datetime.now(timezone.utc), schema_version=1)
catalog.store_asset(asset, [{"id": 1, "name": "Customer 1"}])

A runnable, no-AWS walkthrough is in the first DataLakeAsset tutorial.

Documentation

Full documentation is published at https://tomasvotava.github.io/hyperion/ and follows the Diataxis structure (tutorials, how-to guides, reference, and explanation).

Development

Hyperion uses Poetry:

git clone https://github.com/tomasvotava/hyperion.git
cd hyperion
poetry install
poetry run pre-commit install

poetry run pytest                  # run tests
poetry run pre-commit run -a       # ruff + mypy + hooks
poetry install --with docs         # docs toolchain
poetry run mkdocs serve            # preview the docs site locally

The project uses ruff (linting), mypy (type checking), and commitizen (conventional commits). See the architecture explanation for the layered design.

Contributing

See CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License — see the LICENSE file for details.


