Python data pipelines
Features
This package implements the basics for building pipelines similar to magrittr in R. Pipelines are created using the >> operator. Internally, it uses singledispatch to provide a unified API for different kinds of inputs (SQL databases, HDF, simple dicts, …).
A basic example of what can be built with this package:
>>> from my_library import append_col
>>> import pandas as pd
>>> pd.DataFrame({"a" : [1,2,3]}) >> append_col(x=3)
   a  X
0  1  3
1  2  3
2  3  3
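How a verb such as append_col could combine >> with singledispatch can be sketched roughly as follows. This is a minimal illustration and not the package's actual implementation: the names PipeVerb and pipeverb are hypothetical, and plain dicts and lists stand in for the data sources so that the sketch runs without pandas.

```python
from functools import singledispatch, wraps


class PipeVerb:
    """Hypothetical helper: holds a function and its arguments until data arrives via >>."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python falls back to this for `data >> verb` when `data`
        # does not implement >> for this operand itself.
        return self.func(data, *self.args, **self.kwargs)


def pipeverb(func):
    """Hypothetical decorator: make func(data, ...) usable as data >> func(...)."""
    dispatcher = singledispatch(func)

    @wraps(func)
    def make_verb(*args, **kwargs):
        # Calling the verb only records the arguments; the work happens on >>.
        return PipeVerb(dispatcher, *args, **kwargs)

    # Expose register() so implementations can be added per input type.
    make_verb.register = dispatcher.register
    return make_verb


@pipeverb
def append_col(data, x=1):
    """Fallback for input types that have no registered implementation."""
    raise NotImplementedError(
        f"append_col is not implemented for {type(data).__name__}"
    )


@append_col.register(dict)
def _append_col_dict(data, x=1):
    # Column-oriented dict: add a column "X" of matching length, returning new data.
    n = len(next(iter(data.values()), []))
    return {**data, "X": [x] * n}


@append_col.register(list)
def _append_col_list(data, x=1):
    # List of row dicts: add the key "X" to a copy of every row.
    return [{**row, "X": x} for row in data]


result = {"a": [1, 2, 3]} >> append_col(x=3)
print(result)  # {'a': [1, 2, 3], 'X': [3, 3, 3]}
```

Registering another implementation, for example for pandas.DataFrame, would follow the same pattern with a different type passed to register. Each implementation returns new data instead of modifying its input, which keeps the steps of a pipeline independent of each other.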
In the future, this package might also implement the verbs from the R packages dplyr and tidyr for pandas.DataFrame, and/or I will fold it into one of the other available implementations of dplyr-style pipelines/verbs for pandas.
Documentation
The documentation can be found on ReadTheDocs: https://pydatapipes.readthedocs.io
License
Free software: MIT license
Credits
magrittr and its usage in dplyr / tidyr for the idea of using pipelines in this way
lots of Python implementations of dplyr-style pipelines: dplython, pandas_ply, dfply
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2016-10-22)
First release on PyPI.