artigraph

Declarative Data Production

Artigraph is a tool to improve the authorship, management, and quality of data. It emphasizes that the core deliverable of a data pipeline or workflow is the data, not the tasks.

Artigraph is hosted by the LF AI and Data Foundation as a Sandbox project.

Installation

Artigraph can be installed from PyPI on Python 3.9+ with pip install arti.
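
After installing, a quick sanity check can confirm that the interpreter meets the stated Python 3.9+ requirement and that the arti distribution is visible. This is a minimal sketch using only the standard library; the helper name installed_version is an illustration, not part of Artigraph.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of Artigraph.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# The installation note above states that Python 3.9+ is required.
python_ok = sys.version_info >= (3, 9)
# "arti" is the distribution name used in the pip command above.
arti_version = installed_version("arti")
print(f"Python 3.9+: {python_ok}; arti installed: {arti_version or 'no'}")
```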

Example

This sample from the spend example highlights computing the total amount spent from a series of purchase transactions:

from pathlib import Path
from typing import Annotated

from arti import Annotation, Artifact, Graph, producer
from arti.formats.json import JSON
from arti.storage.local import LocalFile
from arti.types import Collection, Date, Float64, Int64, Struct
from arti.versions import SemVer

DIR = Path(__file__).parent


class Vendor(Annotation):
    name: str


class Transactions(Artifact):
    """Transactions partitioned by day."""

    type = Collection(
        element=Struct(fields={"id": Int64(), "date": Date(), "amount": Float64()}),
        partition_by=("date",),
    )


class TotalSpend(Artifact):
    """Aggregate spend over all time."""

    type = Float64()
    format = JSON()
    storage = LocalFile()


@producer(version=SemVer(major=1, minor=0, patch=0))
def aggregate_transactions(
    transactions: Annotated[list[dict], Transactions]
) -> Annotated[float, TotalSpend]:
    return sum(txn["amount"] for txn in transactions)


with Graph(name="test-graph") as g:
    g.artifacts.vendor.transactions = Transactions(
        annotations=[Vendor(name="Acme")],
        format=JSON(),
        storage=LocalFile(path=str(DIR / "transactions" / "{date.iso}.json")),
    )
    g.artifacts.spend = aggregate_transactions(
        transactions=g.artifacts.vendor.transactions
    )
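
The storage path in the sample contains a {date.iso} placeholder, which suggests one file per date partition. As a rough illustration of how such a template could expand, the sketch below uses plain str.format with a stand-in object. DateKey and expand are hypothetical names; this is not Artigraph's actual implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateKey:
    """Hypothetical stand-in for a date partition key; not part of arti."""

    value: date

    @property
    def iso(self) -> str:
        # ISO 8601 calendar date, e.g. "2021-10-01".
        return self.value.isoformat()


def expand(template: str, day: date) -> str:
    # str.format resolves "{date.iso}" via attribute lookup on the keyword argument.
    return template.format(date=DateKey(day))


template = "transactions/{date.iso}.json"
paths = [expand(template, d) for d in (date(2021, 10, 1), date(2021, 10, 2))]
print(paths)  # ['transactions/2021-10-01.json', 'transactions/2021-10-02.json']
```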

The full example can be run with docker run --rm artigraph/example-spend, which logs output like the following:

INFO:root:Writing mock Transactions data:
INFO:root:      /usr/src/app/transactions/2021-10-01.json: [{'id': 1, 'amount': 9.95}, {'id': 2, 'amount': 7.5}]
INFO:root:      /usr/src/app/transactions/2021-10-02.json: [{'id': 3, 'amount': 5.0}, {'id': 4, 'amount': 12.0}, {'id': 4, 'amount': 7.55}]
INFO:root:Building aggregate_transactions(transactions=Transactions(format=JSON(), storage=LocalFile(path='/usr/src/app/transactions/{date.iso}.json'), annotations=(Vendor(name='Acme'),)))...
INFO:root:Build finished.
INFO:root:Final Spend data:
INFO:root:      /tmp/test-graph/spend/7564053533177891797/spend.json: 42.0
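
The logged total can be cross-checked by hand. The sketch below copies the mock records from the log output and sums their amounts in plain Python, without Artigraph. It assumes the producer sees every partition flattened into one list; the comparison uses a tolerance because floating-point addition can round.

```python
import math

# Mock partitions copied from the log output above (file name -> records).
partitions = {
    "2021-10-01.json": [{"id": 1, "amount": 9.95}, {"id": 2, "amount": 7.5}],
    "2021-10-02.json": [
        {"id": 3, "amount": 5.0},
        {"id": 4, "amount": 12.0},
        {"id": 4, "amount": 7.55},
    ],
}

# Assumption: all partitions are flattened into a single list[dict].
transactions = [txn for records in partitions.values() for txn in records]
total = sum(txn["amount"] for txn in transactions)

# Compare with a tolerance rather than ==, since float addition can round.
assert math.isclose(total, 42.0)
print(round(total, 2))
```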

Community

Everyone is welcome to join the community. Learn more in our support and contributing pages!

Presentations

Download files

Source Distribution

arti-0.0.4.tar.gz (62.2 kB)

Uploaded Source

Built Distribution

arti-0.0.4-py3-none-any.whl (80.3 kB)

Uploaded Python 3

File details

Details for the file arti-0.0.4.tar.gz.

File metadata

  • Download URL: arti-0.0.4.tar.gz
  • Upload date:
  • Size: 62.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.1 CPython/3.10.6 Linux/5.15.0-1034-azure

File hashes

Hashes for arti-0.0.4.tar.gz
Algorithm Hash digest
SHA256 6376159e99e69f55234eab7334a440fef42defac89b7e193625f1be5fb43a3c6
MD5 c037dc85197cfde58e4a5ab1b5be4607
BLAKE2b-256 f8da6619cbc475d254348a85e4f96a7bd59dd496d8994c338e6602f5890d69bc
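
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; the helper names sha256_of and matches and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.lower()


# SHA256 for arti-0.0.4.tar.gz, copied from the listing above.
EXPECTED = "6376159e99e69f55234eab7334a440fef42defac89b7e193625f1be5fb43a3c6"
# Hypothetical local path; adjust to wherever the archive was downloaded.
archive = Path("arti-0.0.4.tar.gz")
if archive.exists():
    print("hash ok" if matches(archive, EXPECTED) else "hash MISMATCH")
```

The same pattern applies to the wheel by swapping in its file name and SHA256 digest.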

File details

Details for the file arti-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: arti-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 80.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.1 CPython/3.10.6 Linux/5.15.0-1034-azure

File hashes

Hashes for arti-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 2547853707c7530d4484851b4a71d85fff117cfc499db1b3ac0c3ff0b789075c
MD5 9bdc99108b3ffca0f7e15a23f66dfb87
BLAKE2b-256 29d66e156d2166317f46dfda2fb0eb6d09c1f5244d78b4708f6bebed5d93b239
