# daisy

A thin utility wrapper around [luigi](https://github.com/spotify/luigi).

# Features

## Formatted local targets

Classes inheriting from `daisy.FormattedLocalTargetBase` are provided
for dumping and loading objects with a one-liner.

``` python
import daisy
import pandas as pd

df = pd.DataFrame({"a": [1,2,3], "b": [4,5,6]})
df
# =>
#    a  b
# 0  1  4
# 1  2  5
# 2  3  6

targ = daisy.CsvTarget("./output.csv")

# dumping
targ.dump(df)

# loading
df2 = targ.load()

df2
# =>
#    a  b
# 0  1  4
# 1  2  5
# 2  3  6

# `daisy.FormattedLocalTargetBase` also inherits from `luigi.LocalTarget`,
# so the original API is available as well.
with targ.open("r") as fd:
    s = fd.read()
```
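
The dump/load pattern itself can be illustrated without luigi or pandas. The following is a minimal sketch using only the standard library. `JsonFileTarget` is a hypothetical stand-in written for this example; it is not daisy's actual `JsonTarget` and makes no claim about how daisy implements its targets.

```python
import json
import os
import tempfile


class JsonFileTarget:
    """Hypothetical stand-in for a formatted target (not daisy's real class)."""

    def __init__(self, path):
        self.path = path

    def exists(self):
        # A target knows whether its output file is already present.
        return os.path.exists(self.path)

    def dump(self, obj):
        # Create the parent directory if needed, then serialize in one call.
        os.makedirs(os.path.dirname(os.path.abspath(self.path)), exist_ok=True)
        with open(self.path, "w") as fd:
            json.dump(obj, fd)

    def load(self):
        # Deserialize the file back into a Python object.
        with open(self.path) as fd:
            return json.load(fd)


tmp_dir = tempfile.mkdtemp()
targ = JsonFileTarget(os.path.join(tmp_dir, "out", "metadata.json"))
existed_before = targ.exists()

targ.dump({"a": [1, 2, 3]})
restored = targ.load()
```

The caller never touches file handles or the serialization format; both are decided by the target class.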

## Default output for task

`daisy.Task` inherits from `luigi.Task` and provides a default `output` method.

When you set the file extension via the `ext` attribute,
daisy automatically configures the corresponding `FormattedLocalTarget` with a default file name.


### Single output

``` python
class TaskA(daisy.Task):
    param1 = daisy.Parameter()
    ext = "csv"
```

is equivalent to:

``` python
import daisy
import luigi

class TaskA(luigi.Task):
    param1 = luigi.Parameter()

    def output(self):
        return daisy.CsvTarget("./data/TaskA/TaskA(param1={}).csv".format(self.param1))
```
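
The default file name in the example above follows the pattern `<data_dir>/<TaskName>/<TaskName>(<params>).<ext>`. As a rough illustration, the sketch below builds such a path with plain string formatting. `default_output_path` is a hypothetical helper, and the comma-separated rendering of several parameters is an assumption; the source only shows the single-parameter case.

```python
def default_output_path(task_name, params, ext, data_dir="./data"):
    """Build a path like ./data/TaskA/TaskA(param1=x).csv (assumed layout)."""
    # Assumption: parameters are rendered as comma-separated name=value pairs.
    rendered = ", ".join("{}={}".format(k, v) for k, v in params.items())
    file_name = "{}({}).{}".format(task_name, rendered, ext)
    # Join with "/" so the result matches the README's path style.
    return "/".join([data_dir.rstrip("/"), task_name, file_name])


path = default_output_path("TaskA", {"param1": "x"}, "csv")
# path == "./data/TaskA/TaskA(param1=x).csv"
```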


### Multiple outputs

``` python
class TaskA(daisy.Task):
    param1 = daisy.Parameter()

    ext = {
        "vectors": "npy",
        "metadata": "json"
    }
```

is equivalent to:

``` python
import daisy
import luigi

class TaskA(luigi.Task):
    param1 = luigi.Parameter()

    def output(self):
        return {
            "vectors": daisy.NpyTarget("./data/TaskA/TaskA(param1={}).npy".format(self.param1)),
            "metadata": daisy.JsonTarget("./data/TaskA/TaskA(param1={}).json".format(self.param1))
        }
```

The available target classes, object types, and extensions are as follows:

| Target class | Object type | Extension |
| --- | --- | --- |
| `CsvTarget` | `pandas.DataFrame` | `csv` |
| `NpyTarget` | `numpy.ndarray` | `npy` |
| `JsonTarget` | `dict` | `json` |
| `PickleTarget` | `object` | `pkl`, `pickle` |
| `FeatherTarget` | `pandas.DataFrame` | `feather` |

See the [source code](./daisy/formatted_target.py) for details.
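
One way to picture the extension lookup is a plain dictionary keyed by extension. The sketch below is an illustration only: `TARGET_BY_EXT` and `resolve_targets` are hypothetical names, the mapping is copied from the table above, and class names are kept as strings so the example runs without daisy installed. The real dispatch in daisy may be structured differently.

```python
# Extension -> target class name, copied from the table above.
TARGET_BY_EXT = {
    "csv": "CsvTarget",
    "npy": "NpyTarget",
    "json": "JsonTarget",
    "pkl": "PickleTarget",
    "pickle": "PickleTarget",
    "feather": "FeatherTarget",
}


def resolve_targets(ext):
    """Return a class name for a str `ext`, or a dict of names for a dict `ext`."""
    if isinstance(ext, dict):
        # Multiple outputs: resolve each named output separately.
        return {key: resolve_targets(value) for key, value in ext.items()}
    try:
        return TARGET_BY_EXT[ext]
    except KeyError:
        raise ValueError("unsupported extension: {!r}".format(ext))


single = resolve_targets("csv")
multiple = resolve_targets({"vectors": "npy", "metadata": "json"})
```

A string `ext` resolves to one target, while a dict `ext` resolves to a dict with the same keys, which mirrors the single-output and multiple-output cases described earlier.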


## Configuration

To configure daisy, edit the `[daisy]` section of `luigi.cfg`.

``` INI
[daisy]

# default output directory
# (defaults to ./data when unset)
data_dir = ./data
```
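
To show how a `[daisy]` section with a fallback value behaves, here is a small sketch using the standard library's `configparser`. luigi has its own configuration loader, so this is not how daisy reads the setting; it only shows the section/option lookup and the `./data` default. The value `./output` is an example chosen for this sketch.

```python
import configparser

# Example config text; in practice this would live in luigi.cfg.
cfg_text = """
[daisy]
data_dir = ./output
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)

# Use the configured directory, falling back to ./data when it is absent.
data_dir = parser.get("daisy", "data_dir", fallback="./data")

# With an empty [daisy] section, the fallback is returned instead.
empty = configparser.ConfigParser()
empty.read_string("[daisy]\n")
default_dir = empty.get("daisy", "data_dir", fallback="./data")
```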

