
Omnipy is a type-driven Python library for data conversion, parsing and wrangling; tool and web service interoperability; and scalable dataflow orchestration

Reason this release was yanked:

Wrong version format


Omnipy is a type-driven Python library for:

  • data parsing, wrangling, visualization, and conversion
  • tool and web service interoperability, and
  • scalable dataflow orchestration

[Figure: Conceptual overview of Omnipy]

Why use Omnipy?

Dataflows, Not Workflows

Traditional workflows rely on command-line tools and intermediate files, adding complexity to data pipelines. Omnipy replaces this with dataflows that operate directly in memory or on standard formats like JSON or CSV. Built on Pydantic models, Omnipy enhances data parsing, conversion, and serialization for structured data processing.

"It's Static Typing!"… "It's Dynamic!"… "It's Omnipy!"

Omnipy blends Python’s dynamic typing with runtime type safety. Models behave like native Python structures while ensuring type guarantees without the rigidity of static typing. Defined in Python, Omnipy models can be as general or specific as needed.

Parse, Don’t Validate

Strict validation often breaks pipelines when data is messy. Inspired by "Parse, don't validate", Omnipy eagerly parses input into structured models that retain integrity throughout the pipeline. This approach aligns with the Robustness Principle: "be liberal in what you accept, and conservative in what you send!"
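
As a minimal illustration of the idea in plain Python (this is not Omnipy's implementation, and `parse_int_list` is a hypothetical helper), the sketch below converts messy input into a list of integers once, at the boundary, and rejects only values that cannot be converted without losing information:

```python
def parse_int_list(values):
    """Parse an iterable of mixed values into a list of ints.

    Hypothetical helper for illustration only; not part of Omnipy.
    """
    parsed = []
    for value in values:
        # Refuse floats with a fractional part instead of silently truncating.
        if isinstance(value, float) and not value.is_integer():
            raise ValueError(f"not a whole number: {value!r}")
        try:
            parsed.append(int(value))
        except (TypeError, ValueError) as err:
            raise ValueError(f"cannot parse {value!r} as an integer") from err
    return parsed


numbers = parse_int_list((123, "234", 345.0))
print(numbers)  # [123, 234, 345]
```

Code downstream of the parsing step can then rely on every element being an `int`, without repeating the checks.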

Self-Constraining Data Models

Omnipy models aren’t just one-time validators. A Model[list[int]]() behaves like a list but ensures its elements are always integers. Every modification parses data to enforce integrity, rolling back invalid operations automatically.
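
The behavior described above can be imitated in a few lines of plain Python. The following is only a sketch under simplifying assumptions: `IntList` is a hypothetical class, it covers just `append` and `extend`, and it says nothing about how Omnipy implements snapshots internally.

```python
class IntList(list):
    """A list that re-parses its content after each change.

    Illustrative sketch only; not Omnipy's Model class.
    """

    def __init__(self, values=()):
        super().__init__(int(v) for v in values)

    def _mutate(self, operation, *args):
        snapshot = list(self)  # copy of the last valid state
        try:
            operation(self, *args)
            self[:] = [int(v) for v in self]  # parse everything again
        except (TypeError, ValueError):
            self[:] = snapshot  # roll back the invalid operation
            raise

    def append(self, value):
        self._mutate(list.append, value)

    def extend(self, values):
        self._mutate(list.extend, values)


numbers = IntList((123, "234", 345.0))
try:
    numbers.extend(["456", "abc"])  # the second element cannot be parsed
except ValueError as err:
    print(f"rolled back: {err}")
print(numbers)  # [123, 234, 345]
numbers.append("456")
print(numbers)  # [123, 234, 345, 456]
```

Note that the failed `extend` leaves no trace of `"456"` either: the whole operation is undone, not only the offending element.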

Omnify Your Data Pipelines

Omnipy invites you to "omnify" pipelines — break them into reusable, universal components. By defining dataflows and tasks with structured input and output models, Omnipy simplifies reuse and promotes good coding practices, improving maintainability as projects grow.

Catalog of Components for Interoperability

Omnipy includes components for tasks like asynchronous API requests with rate limiting, parsing JSON or tabular data, and flattening nested data into relational tables. Integration with REST APIs and data wrangling/analysis tools like Pandas simplifies interoperability across diverse systems. Expect the catalog to grow as the community expands!
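
To make "flattening nested data into relational tables" concrete, here is a small standard-library sketch. It does not use or reproduce Omnipy's own components; `flatten_records` and the `_id` / `_parent_id` column names are assumptions made for this example.

```python
def flatten_records(records, child_key):
    """Split nested records into a parent table and a child table.

    Hypothetical helper, not an Omnipy component. Each parent row gets
    an `_id`, and each child row points back to it via `_parent_id`.
    """
    parents, children = [], []
    for parent_id, record in enumerate(records):
        parent_row = {"_id": parent_id}
        for key, value in record.items():
            if key != child_key:
                parent_row[key] = value
        parents.append(parent_row)
        for child in record.get(child_key, []):
            children.append({"_parent_id": parent_id, **child})
    return parents, children


studies = [
    {"title": "Study A", "samples": [{"name": "s1"}, {"name": "s2"}]},
    {"title": "Study B", "samples": [{"name": "s3"}]},
]
study_table, sample_table = flatten_records(studies, "samples")
print(study_table)
print(sample_table)
```

The two resulting lists of flat dictionaries can be loaded directly into tabular tools, for example as two Pandas data frames joined on `_id` and `_parent_id`.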

Built to Scale

Omnipy’s hierarchical Dataset structure simplifies batch processing of directory-based data, including parsing, serialization, and metadata handling. With built-in Prefect support, Omnipy scales seamlessly from local experiments to distributed deployment, meeting the demands of projects large and small.
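
As a rough analogy for batch processing of directory-based data, the sketch below loads every JSON file in a directory into a dictionary keyed by file name and applies one function to all items. It uses only the standard library; `load_json_dir` and `map_items` are hypothetical helpers and are not Omnipy's Dataset API.

```python
import json
from pathlib import Path


def load_json_dir(directory):
    """Parse every *.json file in a directory into a dict keyed by file stem."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.json"))
    }


def map_items(dataset, func):
    """Apply func to each item of the dataset, keeping the item names."""
    return {name: func(item) for name, item in dataset.items()}


# Example usage (assumes a directory "input" containing JSON files):
# dataset = load_json_dir("input")
# sizes = map_items(dataset, len)
```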

Installing Omnipy

  1. Make sure that your Python version is between 3.10 and 3.13 (Python 3.14 is not yet supported), e.g.:

    $ python --version
    Python 3.10.14
    
  2. Create and activate a virtual environment for your project, e.g.:

    $ python -m venv myproject
    $ source myproject/bin/activate
    

    TIP:

    • If you need help with setting up a virtual environment, check out the relevant section in the FastAPI documentation. (Omnipy does not depend on FastAPI; their documentation is simply excellent!)
    • If you are using Omnipy in a Jupyter notebook, you can most likely skip this step.
  3. Install Omnipy using:

    $ pip install omnipy
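
The version requirement from step 1 can also be checked from within Python, for example at the top of a notebook. This is a small sketch, not something Omnipy provides; the `is_supported` helper and the two constants are assumptions that simply restate the range given above.

```python
import sys

SUPPORTED_MIN = (3, 10)  # lowest supported major.minor, as stated in step 1
SUPPORTED_MAX = (3, 13)  # highest supported major.minor, as stated in step 1


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version is within the stated range."""
    return SUPPORTED_MIN <= tuple(version_info[:2]) <= SUPPORTED_MAX


print(is_supported())
```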
    

Getting started

To define an Omnipy data model, simply concretize the generic Model class by specifying a data type in brackets, e.g. Model[list[int]].

The following creates a data model of a list of integers, and parses some data into that model:

>>> from omnipy import Model
>>> data = (123, '234', 345.0)  # Note that the input data is a tuple of mixed types
>>> data_as_list_of_ints = Model[list[int]](data)
>>> data_as_list_of_ints  # The data is now parsed into a list of integers
>>> print(data_as_list_of_ints._docs())

Omnipy models are self-constraining, meaning that they always ensure that the data they contain is of the correct type. For example, if you try to append a string that cannot be parsed as an integer to the list of integers, an error is raised:

>>> data_as_list_of_ints.append('abc')  # This will raise an error
>>> try:
...     data_as_list_of_ints.append('abc')
... except Exception as err:
...     print(err)

Importantly, Omnipy models automatically revert to the last valid snapshot after an error occurs, so you can retry the operation and continue from where you left off without re-parsing the data. This is particularly useful in large dataflows, where it lets you handle errors gracefully without losing your progress.

>>> data_as_list_of_ints
>>> print(data_as_list_of_ints._docs())
>>> data_as_list_of_ints.append('456')
>>> data_as_list_of_ints
>>> print(data_as_list_of_ints._docs())

More to come soon...

Running example scripts

  • Install omnipy-examples:
    • pip install omnipy-examples
  • Example script:
    • omnipy-examples isajson
  • For help on the command line interface:
    • omnipy-examples --help
  • For help on a particular example:
    • omnipy-examples isajson --help

Output of flow runs

By default, the output appears in the data directory, in a subdirectory named with a timestamp.

  • It is recommended to install a file viewer that is capable of browsing tar.gz files. For instance, the "File Expander" plugin in PyCharm is excellent for this.
  • To unpack the compressed files of a run on the command line (just make sure to replace the datetime string from this example):
    for f in data/2023_02_03-12_51_51/*.tar.gz; do mkdir -p "${f%.tar.gz}"; tar xzvf "$f" -C "${f%.tar.gz}"; done
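
The same unpacking can be done from Python, which may be handy on systems without a Unix shell. This is a sketch using only the standard library; `unpack_run` is a hypothetical helper, and it assumes the archives come from your own runs and are therefore trusted.

```python
import tarfile
from pathlib import Path


def unpack_run(run_dir):
    """Extract each .tar.gz archive in run_dir into a directory of the same name."""
    extracted = []
    for archive in sorted(Path(run_dir).glob("*.tar.gz")):
        # "result.tar.gz" is unpacked into a sibling directory "result".
        target = archive.with_name(archive.name[: -len(".tar.gz")])
        target.mkdir(exist_ok=True)
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
        extracted.append(target)
    return extracted


# Example usage (replace the datetime string with that of your own run):
# unpack_run("data/2023_02_03-12_51_51")
```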
    

Run with the Prefect engine

Omnipy is integrated with the powerful Prefect dataflow orchestration library.

  • To run an example using the Prefect engine:
    • omnipy-examples --engine prefect isajson
  • After completion of some runs, you can check the flow logs and orchestration options in the Prefect UI:
    • prefect server start

More info on Prefect configuration will come soon.


