Skip to main content

A framework for declarative, functional dataset definitions.

Project description

Labrea Logo

Labrea

A framework for declarative, functional dataset definitions.

lifecycle PyPI Downloads pre-commit Coverage docs

Installation

Labrea is available for install via pip.

pip install labrea

Alternatively, you can install the latest development version from GitHub.

pip install git+https://github.com/8451/labrea@develop

Usage

See our usage guide here.

Labrea exposes a dataset decorator that allows you to define datasets and their dependencies in a declarative manner. Dependencies can either be other datasets or Options, which are values that can be passed in at runtime via a dictionary.

from labrea import dataset, Option
import pandas as pd


@dataset
def stores(path: str = Option('PATHS.STORES')) -> pd.DataFrame:
    return pd.read_csv(path)


@dataset
def transactions(path: str = Option('PATHS.SALES')) -> pd.DataFrame:
    return pd.read_csv(path)


@dataset
def sales_by_region(
        stores_: pd.DataFrame = stores,
        transactions_: pd.DataFrame = transactions
) -> pd.DataFrame:
    """Merge stores to transactions, sum sales by region"""
    return pd.merge(transactions_, stores_, on='store_id').groupby('region')['sales'].sum().reset_index()


options = {
    'PATHS': {
        'STORES': 'path/to/stores.csv',
        'SALES': 'path/to/sales.csv'
    }
}


stores(options)
## +-----------------+-----------+
## | store_id        | region    |
## |-----------------+-----------|
## | 1               | North     |
## | 2               | North     |
## | 3               | South     |
## | 4               | South      |
## +-----------------+-----------+

transactions(options)
## +-----------------+-----------------+-----------------+
## | store_id        | sales           | transaction_id  |
## |-----------------+-----------------+-----------------|
## | 1               | 100             | 1               |
## | 2               | 200             | 2               |
## | 3               | 300             | 3               |
## | 4               | 400             | 4               |
## +-----------------+-----------------+-----------------+

sales_by_region(options)
## +-----------------+-----------------+
## | region          | sales           |
## |-----------------+-----------------|
## | North           | 300             |
## | South           | 700             |
## +-----------------+-----------------+

Contributing

If you would like to contribute to labrea, please read the Contributing Guide.

Changelog

A summary of recent updates to labrea can be found in the Changelog.

Maintainers

Maintainer Email
Austin Warner austin.warner@8451.com
Michael Stoepel michael.stoepel@8451.com

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

labrea-2.0.1.tar.gz (245.4 kB view hashes)

Uploaded Source

Built Distribution

labrea-2.0.1-py3-none-any.whl (44.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page