# daisy
A thin utility wrapper around [luigi](https://github.com/spotify/luigi).
# Features
## Formatted local targets
Classes inheriting from `daisy.FormattedLocalTargetBase` are provided
for dumping and loading objects in a single line.
``` python
import daisy
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df
# =>
#    a  b
# 0  1  4
# 1  2  5
# 2  3  6
targ = daisy.CsvTarget("./output.csv")
# dumping
targ.dump(df)
# loading
df2 = targ.load()
df2
# =>
#    a  b
# 0  1  4
# 1  2  5
# 2  3  6
# `daisy.FormattedLocalTargetBase` also inherits from `luigi.LocalTarget`,
# so the original API remains available.
with targ.open("r") as fd:
    s = fd.read()
```
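To make the dump/load pattern concrete, here is a minimal standard-library sketch of how such a target might be structured. It is not daisy's implementation: `FormattedTargetSketch` and `JsonTargetSketch` are hypothetical names, and the real classes build on `luigi.LocalTarget` rather than on a plain path wrapper.

``` python
import json
import os
import tempfile


class FormattedTargetSketch:
    """Hypothetical stand-in for a formatted local target (not daisy's code)."""

    def __init__(self, path):
        self.path = path

    def open(self, mode="r"):
        # Create the parent directory on write so dump() works on a fresh path.
        if "w" in mode:
            os.makedirs(os.path.dirname(os.path.abspath(self.path)), exist_ok=True)
        return open(self.path, mode)

    def exists(self):
        return os.path.exists(self.path)

    def dump(self, obj):
        raise NotImplementedError

    def load(self):
        raise NotImplementedError


class JsonTargetSketch(FormattedTargetSketch):
    """Serializes a dict as JSON; a subclass only defines the format."""

    def dump(self, obj):
        with self.open("w") as fd:
            json.dump(obj, fd)

    def load(self):
        with self.open("r") as fd:
            return json.load(fd)


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    targ = JsonTargetSketch(os.path.join(tmp, "nested", "output.json"))
    targ.dump({"a": [1, 2, 3]})
    restored = targ.load()
    existed = targ.exists()
```

The base class owns the path and file handling, so each format only needs a `dump`/`load` pair. That is presumably why one class per file type is enough.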
## Default output for task
`daisy.Task` inherits from `luigi.Task` and provides a default `output` method.
When a file extension is set via the `ext` attribute,
daisy automatically configures the corresponding `FormattedLocalTarget` with a default file name.
### Single output
``` python
import daisy


class TaskA(daisy.Task):
    param1 = daisy.Parameter()
    ext = "csv"
```
is equivalent to:
``` python
import daisy
import luigi

class TaskA(luigi.Task):
    param1 = luigi.Parameter()

    def output(self):
        return daisy.CsvTarget("./data/TaskA/TaskA(param1={}).csv".format(self.param1))
```
### Multiple outputs
``` python
import daisy


class TaskA(daisy.Task):
    param1 = daisy.Parameter()
    ext = {
        "vectors": "npy",
        "metadata": "json",
    }
```
is equivalent to:
``` python
import daisy
import luigi


class TaskA(luigi.Task):
    param1 = luigi.Parameter()

    def output(self):
        return {
            "vectors": daisy.NpyTarget("./data/TaskA/TaskA(param1={}).npy".format(self.param1)),
            "metadata": daisy.JsonTarget("./data/TaskA/TaskA(param1={}).json".format(self.param1)),
        }
```
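The default file names in the two equivalences above follow one pattern, which can be sketched in plain Python. This is an illustration only: `default_path` and `resolve_outputs` are hypothetical helpers, and joining several parameters with `", "` is an assumption, since the examples show a single parameter only.

``` python
def default_path(task_name, params, ext, data_dir="./data"):
    """Build '<data_dir>/<task>/<task>(<k>=<v>, ...).<ext>' (assumed layout)."""
    args = ", ".join("{}={}".format(k, v) for k, v in params.items())
    return "{}/{}/{}({}).{}".format(data_dir, task_name, task_name, args, ext)


def resolve_outputs(task_name, params, ext, data_dir="./data"):
    """Return one path for a string `ext`, or a dict of paths for a dict `ext`."""
    if isinstance(ext, dict):
        return {key: default_path(task_name, params, e, data_dir)
                for key, e in ext.items()}
    return default_path(task_name, params, ext, data_dir)


single = resolve_outputs("TaskA", {"param1": "x"}, "csv")
multi = resolve_outputs("TaskA", {"param1": "x"},
                        {"vectors": "npy", "metadata": "json"})
```

Here `single` is `./data/TaskA/TaskA(param1=x).csv`, and `multi` holds one path per key, mirroring the dictionary returned by the handwritten `output` method.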
The available extensions and their corresponding target classes and object types are as follows:
| Target class | Object type | Extension |
| --- | --- | --- |
| `CsvTarget` | `pandas.DataFrame` | `csv` |
| `NpyTarget` | `numpy.ndarray` | `npy` |
| `JsonTarget` | `dict` | `json` |
| `PickleTarget` | `object` | `pkl`, `pickle` |
| `FeatherTarget` | `pandas.DataFrame` | `feather` |
See the [source code](./daisy/formatted_target.py) for details.
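The table can be read as a lookup from extension to target class. The sketch below encodes it as a dictionary of class names. `TARGET_BY_EXT` and `target_class_name` are hypothetical, and the normalization step (stripping a leading dot, lowercasing) is an assumption rather than documented daisy behavior.

``` python
# Extension -> target class name, transcribed from the table above.
TARGET_BY_EXT = {
    "csv": "CsvTarget",
    "npy": "NpyTarget",
    "json": "JsonTarget",
    "pkl": "PickleTarget",
    "pickle": "PickleTarget",
    "feather": "FeatherTarget",
}


def target_class_name(ext):
    """Return the target class name for an extension, or raise ValueError."""
    key = ext.lstrip(".").lower()  # assumed normalization, not daisy's
    try:
        return TARGET_BY_EXT[key]
    except KeyError:
        raise ValueError("unsupported extension: {!r}".format(ext)) from None


name = target_class_name("pkl")
```

Both `pkl` and `pickle` resolve to `PickleTarget`, matching the two extensions listed in that row.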
## Configuration
To configure daisy, edit the `[daisy]` section of `luigi.cfg`.
``` INI
[daisy]
# default output directory
data_dir = ./data
```
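As a rough illustration of how such a setting is parsed, the sketch below reads `data_dir` from INI text with the standard-library `configparser`, falling back to `./data` when the option is absent. It does not show daisy's own lookup, which presumably goes through luigi's configuration mechanism. `read_data_dir` and the `./out` value are hypothetical.

``` python
import configparser

# Example config text; "./out" is an arbitrary illustrative value.
CFG_TEXT = """
[daisy]
# default output directory
data_dir = ./out
"""


def read_data_dir(cfg_text, default="./data"):
    """Return [daisy] data_dir from INI text, or `default` if it is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("daisy", "data_dir", fallback=default)


configured = read_data_dir(CFG_TEXT)   # value taken from the [daisy] section
fallback = read_data_dir("")           # no section present, default is used
```

INI values are plain strings, so the directory is written bare, with no quotes and no Python expression.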