
Project description

pytransflow

A simple library for record-level processing using flows of transformations defined as YAML files

Overview

pytransflow lets you process records by defining a flow of transformations. Each flow has its own configuration, defined in a YAML file, which can be as simple as:

description: A simple test flow
instant_fail: True
fail_scenarios:
  percentage_of_failed_records: 90
variables:
  a: B
transformations:
  - prefix:
      field: a
      value: test
      condition: "@a/c/d/e == !:a"
      ignore_errors:
        - output_already_exists
      output_datasets:
        - k
  - add_field:
      name: test/a/b
      value: { "a": "b" }
      input_datasets:
        - k
      output_datasets:
        - x
        - z

Processing is initiated using the Flow class:

from pprint import pprint

from pytransflow.core import Flow

records = [...]  # input records to process

flow = Flow(name="<flow-name>")
flow.process(records)
pprint(flow.datasets)  # End result
pprint(flow.failed_records)  # Failed records
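
As a rough, purely hypothetical sketch of driving the flow shown above end to end: input records are plain dictionaries and the results end up grouped by output dataset. The record contents and the flow name simple_test_flow below are placeholders, not examples from pytransflow's documentation:

from pprint import pprint

from pytransflow.core import Flow

# Hypothetical input records; each record is a plain dict (placeholder data)
records = [
    {"a": "hello"},
    {"a": "world"},
]

# "simple_test_flow" is a placeholder for the name of the YAML flow above
flow = Flow(name="simple_test_flow")
flow.process(records)

# Datasets produced by the flow, e.g. the "x" and "z" output datasets above
pprint(flow.datasets)

# Records that failed during processing, if any
pprint(flow.failed_records)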

Refer to the Getting Started wiki page for additional examples and guided first steps, or check out the blog post that introduces pytransflow.

Features

The following are some of the features that pytransflow provides:

  • Define processing flows using YAML files
  • Use all kinds of flow configurations to fine-tune the flow
  • Leverage pydantic's features for data validation
  • Apply transformations only when a defined condition is met
  • Build your own library of transformations
  • Use multiple input and output datasets
  • Ignore specific errors during processing
  • Set conditions for output datasets
  • Track failed records
  • Define flow fail scenarios
  • Process records in parallel
  • Use flow-level variables, and more

For more information on these features and how to use them, please refer to the Wiki Page.

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pytransflow-0.1.1.tar.gz (24.6 kB)


Built Distribution

pytransflow-0.1.1-py3-none-any.whl (44.5 kB)


File details

Details for the file pytransflow-0.1.1.tar.gz.

File metadata

  • Download URL: pytransflow-0.1.1.tar.gz
  • Size: 24.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for pytransflow-0.1.1.tar.gz:

  • SHA256: 059224e7d7216b337b56ca4b23eeb4bd36efdcc2cada9c160eb94b1722e9db9c
  • MD5: 5e8a62873aad248e2d3e70dc215a9c18
  • BLAKE2b-256: df495e1765e95cdf57141e1760d3b10a61046fcb0a64221e7cdd79bedba21b4b


File details

Details for the file pytransflow-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pytransflow-0.1.1-py3-none-any.whl
  • Size: 44.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for pytransflow-0.1.1-py3-none-any.whl:

  • SHA256: 9ae60f9552489865a277b3df2fd26fbe144579f4a44510532143a5c4aa7ee775
  • MD5: 446c5eb51cb67ae69a77098418ffc224
  • BLAKE2b-256: e8b4f4184aaad49869f67d90f79493b48873b147df4dce2d1dda1994370fc8ea

