Format Representing Interdependent Data Actions As YAML

FRIDAAY

Who needs SQL, Python, JavaScript and CSV? Get it all done by FRIDAAY

Usage

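FRIDAAY uses Poetry to manage both its dependencies and its virtual environment:

$ poetry install          # or '$ poetry update'
$ poetry env use python3  # select the Python 3 interpreter for the environment
$ poetry run pytest       # run the test suite once
$ poetry run ptw          # watch the source tree and re-run the tests on change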

Overview

FRIDAAY defines a new "atomic unit" of abstraction for the modern data stack: the Data Action.
Each Data Action defines a semantic mapping for creating a new "frame" from existing frames (or inline data). This allows analysts and data scientists to declaratively specify their intent, empowering the underlying platform to efficiently satisfy those requirements. We call this production-ready alternative to traditional exploratory notebooks a PipeBook.
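As a rough illustration of the idea, a Data Action can be pictured as a small declarative record: a name, an operation, an optional source frame, and free-form parameters. The sketch below is an assumption for explanatory purposes only. The DataAction class and the parse_action helper are hypothetical names, not part of FRIDAAY; only the keys do, from, and doc are taken from the example further down.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


# Hypothetical model of a Data Action. The field names mirror YAML keys seen
# in the example (`do`, `from`, `doc`); this is not FRIDAAY's own class.
@dataclass(frozen=True)
class DataAction:
    name: str                     # key of the YAML block, e.g. "demo_pets"
    do: str                       # operation to perform, e.g. "sql.select"
    source: Optional[str] = None  # frame named by `from`, if any
    doc: str = ""                 # optional human-readable description
    params: Dict[str, Any] = field(default_factory=dict)


def parse_action(name: str, spec: Dict[str, Any]) -> DataAction:
    """Split one parsed YAML block into well-known keys and other parameters."""
    known = {"do", "from", "doc"}
    return DataAction(
        name=name,
        do=spec["do"],
        source=spec.get("from"),
        doc=spec.get("doc", ""),
        params={k: v for k, v in spec.items() if k not in known},
    )


action = parse_action(
    "demo_pets",
    {"do": "sql.select", "from": "test_data", "save": ["table"]},
)
print(action.do, action.source, action.params)
```

Because the record states what should be produced rather than how, a platform is free to decide how and where to run it.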

Today, business logic and data dependencies are trapped inside complex (and often incompatible) programming languages such as SQL, Python, and Scala, and inside competing APIs such as Spark versus pandas or TensorFlow versus MLflow. FRIDAAY replaces these with a simple yet extensible "programming format" based on YAML that enables:

  • fine-grained orchestration
  • full-fidelity no-code visual programming of data pipelines
  • platform and language independence
  • reusable specification of dashboards and data apps
  • inline tests and alerting
  • uniform specification of external integrations
  • schema-aware autocompletion and templates
  • ad-hoc materialization and incrementalism
  • version-controlled user-facing semantic models and metric layers
  • deterministic transformations between versions and vendors
  • novel interaction paradigms beyond notebooks and REPLs
  • turning legacy code into structured data, which we can manage using all our data superpowers
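To make the orchestration point concrete: once each action names the frame it reads from, an execution order can be derived from the data alone. The following is a minimal sketch, assuming that a from key holds the name of the upstream frame. The execution_order function and the pet_report action are hypothetical, and this is not how FRIDAAY itself is known to schedule work.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each action may name the frame it reads via `from`.
# `pet_report` is invented here purely to give the graph a third node.
pipeline = {
    "pet_report": {"do": "sql.select", "from": "demo_pets"},
    "demo_pets": {"do": "sql.select", "from": "test_data"},
    "test_data": {"do": "sql.load"},
}


def execution_order(actions):
    """Return action names so that every action comes after the frame it reads."""
    graph = {
        name: ({spec["from"]} if "from" in spec else set())
        for name, spec in actions.items()
    }
    # TopologicalSorter takes a mapping of node -> predecessors and
    # yields predecessors before the nodes that depend on them.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
print(order)
```

The order in which the actions are written no longer matters; the dependency graph alone determines what runs first.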

Example

This example ships with the package, in the folder returned by path_resource(PKG_ID, PIPE_FOLDER):

fridaay:
  version: 0.1
  do: core.init
  imports:
    sql: dad_sql_pandas
  set: # global constants (COMMENT)
    NAME: demo_pets
    SAPIENT: Human

test_data:
  doc: Sample data for test purposes
  do: sql.load
  columns: ['Name', 'Age', 'Weight', 'Type', 'Timestamp']
  data:
  - ['Ernie', 54, 170.5, 'Human Tech Nerd', 2020-03-20]
  - ['Qhuinn', 7, 36.3, 'English Cocker Spaniel', 2022-06-27]
  - ['Frolic', 2, 76.2, 'Chocolate Labrador', 2022-06-27]

demo_pets:
  do: sql.select
  from: $$ # last frame
  cols:
    Name: .str Personal Name
    Age: .int.year Age
    Weight: .float.pound Current Weight
  where_all:
  - ['Name', '!=', 'Ernie']
  #- ['Timestamp', '>', 2022-01-01]
  save: [table]
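For readers who think in code, the intent of the two actions above can be approximated in plain Python. This is only a sketch under stated assumptions: that sql.load builds a frame from the inline rows, that where_all keeps rows for which every condition holds, and that cols lists the columns to keep. The load and select functions are hypothetical stand-ins, and the type annotations in cols (such as .int.year) and the save step are left out.

```python
import operator
from datetime import date

COLUMNS = ["Name", "Age", "Weight", "Type", "Timestamp"]
ROWS = [
    ["Ernie", 54, 170.5, "Human Tech Nerd", date(2020, 3, 20)],
    ["Qhuinn", 7, 36.3, "English Cocker Spaniel", date(2022, 6, 27)],
    ["Frolic", 2, 76.2, "Chocolate Labrador", date(2022, 6, 27)],
]

# Assumed meaning of the comparison strings that appear in `where_all`.
OPERATORS = {"!=": operator.ne, ">": operator.gt}


def load(columns, data):
    """Build a frame (a list of row dicts) from inline data."""
    return [dict(zip(columns, row)) for row in data]


def select(frame, cols, where_all):
    """Keep rows that satisfy every condition, then keep only `cols`."""
    kept = [
        row for row in frame
        if all(OPERATORS[op](row[col], value) for col, op, value in where_all)
    ]
    return [{col: row[col] for col in cols} for row in kept]


test_data = load(COLUMNS, ROWS)
demo_pets = select(test_data, ["Name", "Age", "Weight"], [["Name", "!=", "Ernie"]])
for row in demo_pets:
    print(row)
```

The YAML version carries the same information as data, which is what lets a tool inspect, validate, or reorder the steps before anything is executed.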

Releases

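$ poetry version patch            # bump to the next patch release
$ poetry build && poetry publish  # build the sdist and wheel, then upload them
$ poetry version prepatch         # move on to the next pre-release version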
