A Data Dependency Graph Framework and Executor

DataJet

DataJet abstracts over function calls by mapping inputs through a graph of functions to the desired outputs. You declare your data transformations (functions from inputs to outputs) once, and DataJet works out how to get from any input to any output reachable through the graph of functions.

Key Features

  • Lazy: evaluate and return only the data you need.
  • Declarative: declare data and the functions on that data explicitly, using plain Python.
  • Dependency-free: just Python.

Installation

Requirements:

  • Python >=3.8

To get started, install DataJet from PyPI:

pip install datajet

Why would I use this?

  • DataJet simplifies the codebase of dynamic systems that have multiple ways to calculate a data point from different inputs.
  • DataJet decouples downstream calculations from the mechanics of calculating their upstream dependencies.

Quickstart

from datajet import execute

dollars = [7.98, 20.94, 37.9, 30.31]
units = [1, 3, 5, 4]

# Each function's parameter names refer to other keys in the map or the context.
def prices(dollars, units):
    return [d / u for d, u in zip(dollars, units)]

def average_price(prices):
    return sum(prices) / len(prices)

def average_price_rounded_down(average_price):
    # Round down to two decimal places.
    return average_price * 1000 // 10 / 100


datajet_map = {
    "prices": prices,
    "average_price": average_price,
    "average_price_rounded_down": average_price_rounded_down,
}

execute(
    datajet_map,
    context={
        "dollars": dollars,
        "units": units,
    },
    fields=["average_price_rounded_down"],
)
# Result: {'average_price_rounded_down': 7.52}

If you already have prices, you can pass them in directly and request the fields you need:

prices = [3.99, 4.49, 2.89, 2.79, 2.99]

execute(
    datajet_map,
    context={"prices": prices},
    fields=["average_price", "average_price_rounded_down"],
)
# Result: {'average_price': 3.4299999999999997, 'average_price_rounded_down': 3.42}

Important Details

Keys can be any hashable object. The value for each key can be a function or a plain object. Functions can take zero or more parameters. Unless the map explicitly declares a function's inputs, each parameter name must match another key in the dict. See the Datamap reference for how to declare inputs explicitly.
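To make the name-matching rule concrete, here is a minimal sketch of how parameter names could be resolved against keys. It is not DataJet's implementation: the `resolve` helper and the example names (`tax_rate`, `total`, `unused_report`) are hypothetical, and the sketch assumes string keys and no explicitly declared inputs.

```python
import inspect

def resolve(datamap, context, field, cache=None):
    """Return the value for `field`, computing only what it depends on.

    Hypothetical helper for illustration; not part of the datajet package.
    """
    cache = {} if cache is None else cache
    if field in context:      # values supplied by the caller take priority
        return context[field]
    if field in cache:        # already computed during this call
        return cache[field]
    node = datamap[field]
    if callable(node):
        # Parameter names double as the keys of the upstream dependencies.
        params = inspect.signature(node).parameters
        value = node(**{p: resolve(datamap, context, p, cache) for p in params})
    else:
        value = node          # plain object: use it as-is
    cache[field] = value
    return value

calls = []  # records which functions actually ran

def tax_rate():               # zero-parameter function
    calls.append("tax_rate")
    return 0.25

def total(subtotal, tax_rate):
    calls.append("total")
    return subtotal * (1 + tax_rate)

def unused_report(total):
    calls.append("unused_report")
    return f"total={total}"

datamap = {
    "subtotal": 80.0,         # plain object
    "tax_rate": tax_rate,
    "total": total,
    "unused_report": unused_report,
}

result = resolve(datamap, context={}, field="total")
```

Only `tax_rate` and `total` run in this sketch; `unused_report` is never called because nothing requested it, which is the lazy behaviour described under Key Features.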

You can also define multiple ways of calculating a piece of data by giving a list of functions as the value for its key. As before, each function's parameters must match other keys in the dict unless you declare its inputs explicitly.
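The idea of several alternatives per key can be sketched as follows. This is an illustration under stated assumptions, not DataJet's algorithm: the `resolve` helper, the `Unresolvable` exception, and the example functions are hypothetical, and the sketch simply takes the first alternative whose inputs can be satisfied, which may differ from how DataJet chooses.

```python
import inspect

class Unresolvable(LookupError):
    """Raised when a field cannot be computed from the given context."""

def resolve(datamap, context, field, _active=()):
    """Hypothetical helper: try each alternative for `field` in order."""
    if field in context:
        return context[field]
    # Unknown key, or a cycle back to a field we are already computing.
    if field not in datamap or field in _active:
        raise Unresolvable(field)
    node = datamap[field]
    # Simplification: any list is treated as a list of alternatives.
    candidates = node if isinstance(node, list) else [node]
    for candidate in candidates:
        if not callable(candidate):
            return candidate
        try:
            kwargs = {
                name: resolve(datamap, context, name, _active + (field,))
                for name in inspect.signature(candidate).parameters
            }
        except Unresolvable:
            continue  # inputs are missing: try the next alternative
        return candidate(**kwargs)
    raise Unresolvable(field)

def price_from_totals(dollars, units):
    return dollars / units

def price_from_cents(price_cents):
    return price_cents / 100

datamap = {"price": [price_from_totals, price_from_cents]}

# The same key is reachable from two different sets of inputs.
from_totals = resolve(datamap, {"dollars": 9.0, "units": 3}, "price")
from_cents = resolve(datamap, {"price_cents": 250}, "price")
```

With `dollars` and `units` in the context the first function is used; with only `price_cents` the sketch falls through to the second. If neither set of inputs is available, it raises `Unresolvable`.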

Full Documentation

https://bmritz.github.io/datajet/

Development

To create the development environment locally:

git clone
make install

This starts a Poetry shell with all the dev dependencies installed. Run deactivate to exit the shell.

To run the tests:

make test

Development troubleshooting

If you see:

urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>

On macOS, go to /Applications/Python3.x and run 'Install Certificates.command'.

Built on ideas inspired by
