Apache Spark integration for Ordeq

Ordeq

Ordeq (pronounced /ɒɹdɛk/) is a framework for developing data pipelines. It simplifies IO and modularizes pipeline logic. Ordeq elevates your proof-of-concept to a production-grade pipeline. See the introduction for an easy-to-follow example of how Ordeq helps.
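
To make "simplifies IO and modularizes pipeline logic" concrete, here is a minimal standard-library sketch of the general idea: IO objects own loading and saving, while pipeline steps stay plain functions. This is not Ordeq's API. The names JsonFile, keep_adults and run_step are hypothetical and exist only for illustration; see the introduction in the documentation for the real interface.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class JsonFile:
    """Hypothetical IO object: knows where the data lives and how to load/save it."""

    path: Path

    def load(self) -> list[dict]:
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, records: list[dict]) -> None:
        self.path.write_text(json.dumps(records), encoding="utf-8")


def keep_adults(people: list[dict]) -> list[dict]:
    """Pure pipeline logic: no file paths, no reading or writing."""
    return [person for person in people if person["age"] >= 18]


def run_step(func, source: JsonFile, target: JsonFile) -> None:
    """Hypothetical runner: load the input, apply the logic, save the output."""
    target.save(func(source.load()))


# Wire the step to concrete files in a temporary directory.
workdir = Path(tempfile.mkdtemp())
people_in = JsonFile(workdir / "people.json")
adults_out = JsonFile(workdir / "adults.json")

people_in.save([{"name": "Ada", "age": 36}, {"name": "Tim", "age": 9}])
run_step(keep_adults, people_in, adults_out)
result = adults_out.load()
```

Because keep_adults never touches a file, it can be unit-tested with in-memory lists, and the storage format can change without touching the logic.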

Installation

Ordeq is lightweight, with zero dependencies.

To install Ordeq, run:

uv pip install ordeq

Integrations

Ordeq integrates seamlessly with existing tooling. It provides out-of-the-box integrations with 25+ popular libraries. In total, Ordeq offers over 100 IOs via these integrations.

You can install them as needed. For example, for reading and writing data with Pandas, install the ordeq-pandas package:

uv pip install ordeq-pandas
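
After installing, you may want to confirm which Ordeq packages are present in the active environment. The sketch below uses only the standard library's importlib.metadata and the distribution names from the commands above; the helper installed_version is a hypothetical name, not part of Ordeq.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as used in the install commands above.
for name in ("ordeq", "ordeq-pandas"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

The check looks up distribution metadata only, so it does not import the packages themselves.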

Some of the available integrations:

Data processing

Pandas
Spark
NumPy
Polars
Ibis
Joblib
HuggingFace
Pillow
SentenceTransformers
Requests
Pydantic
DuckDB
NetworkX
TOML
PyMuPDF
ChromaDB

Plotting

Matplotlib
Altair
Plotly Express

Cloud storage

Google Cloud Storage
Azure Storage Blob
AWS S3
Boto3
BigQuery

Have a look at the package overview and API reference for a list of available packages.

Documentation

Documentation is available at https://ing-bank.github.io/ordeq/.

Why consider Ordeq?

  • Ordeq is the GenAI companion: it gives your project structure and consistency, so that GenAI can thrive
  • It offers seamless integrations with existing data & ML tooling, such as Spark, Pandas, Pydantic and PyMuPDF, and adding new integrations is trivial
  • It's actively developed and trusted by data scientists, engineers, analysts and machine learning engineers at ING

Learning Ordeq

To learn more about Ordeq, check out the following resources:

Visualizing pipelines

Ordeq makes it easy to visualize your pipelines with a single line of code. Read more in the documentation.

The following figure shows an example Retrieval-Augmented Generation (RAG) pipeline built with Ordeq and visualized with Mermaid:

[Figure: RAG pipeline]
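
For readers unfamiliar with Mermaid, the sketch below shows how a pipeline graph can be expressed as Mermaid flowchart text. It is a hand-rolled illustration, not Ordeq's visualization function: to_mermaid is a hypothetical helper, and the node names are invented stand-ins for RAG steps rather than the contents of the figure.

```python
def to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Render (source, target) pairs as a top-down Mermaid flowchart."""
    lines = ["graph TD"]
    for source, target in edges:
        lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


# Hypothetical RAG-style steps, chosen only to illustrate the output format.
rag_edges = [
    ("documents", "chunks"),
    ("chunks", "embeddings"),
    ("embeddings", "retrieved"),
    ("question", "retrieved"),
    ("retrieved", "answer"),
]

diagram = to_mermaid(rag_edges)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer to produce a diagram.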

Acknowledgements

Ordeq builds upon design choices and ideas from Kedro and other frameworks. It has been developed at ING, with contributions from various individuals. Please refer to the acknowledgements section in the documentation for more details.

License

Ordeq is available under the MIT license. Please refer to the license and notice for more details.


