A Grammar of Data Manipulation in Python
datar
Documentation | Reference Maps | Notebook Examples | API
datar is a re-imagining of APIs for data manipulation in Python, with support for multiple backends. The APIs are aligned with the tidyverse packages in R as closely as possible.
Installation
pip install -U datar
# install with a backend
pip install -U datar[pandas]
# More backends support coming soon
Backends
Repo | Badges |
---|---|
datar-numpy | |
datar-pandas | |
datar-arrow | |
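To see which of the backend packages listed above are present in the current environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch: it assumes the installed distribution names match the repo names in the table, and `installed_versions` is a hypothetical helper, not part of datar.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: distribution names are the same as the backend repo names.
BACKENDS = ("datar-numpy", "datar-pandas", "datar-arrow")


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(("datar",) + BACKENDS)
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

If every backend shows as not installed, installing one of the extras from the Installation section is the likely next step.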
Example usage
# with pandas backend
from datar import f
from datar.dplyr import mutate, filter_, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter_, if_else, tibble
df = tibble(
    x=range(4),  # or c[:4] (from datar.base import c)
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""
df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""
df >> filter_(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""
df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter_(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic
df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
    df
    >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
    >> ggplot(aes(x="x", y="y"))
    + theme_classic()
    + geom_line(aes(color="sign"), size=1.2)
)
# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull
dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
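The `>>` pipes and `f` expressions used throughout these examples can be pictured with a small plain-Python sketch. It assumes rows stored as a list of dicts instead of a data frame. `Expr`, `Placeholder`, `PipedCall`, and `verb` are hypothetical names chosen for illustration; they do not show how datar or pipda are actually implemented.

```python
import operator


class Expr:
    """A deferred expression, evaluated later against one row (a dict)."""

    def __init__(self, fn):
        self._fn = fn

    def evaluate(self, row):
        return self._fn(row)

    def _binary(self, op, other):
        # Combine self with another Expr or a plain value, still deferred.
        def fn(row):
            rhs = other.evaluate(row) if isinstance(other, Expr) else other
            return op(self.evaluate(row), rhs)

        return Expr(fn)

    def __gt__(self, other):
        return self._binary(operator.gt, other)

    def __eq__(self, other):
        return self._binary(operator.eq, other)


class Placeholder:
    """`f.x` stands for "column x of whatever data is piped in"."""

    def __getattr__(self, name):
        return Expr(lambda row: row[name])


class PipedCall:
    """A verb call that still waits for its data from the left of `>>`."""

    def __init__(self, func, args, kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def __rrshift__(self, data):
        # Reached for `data >> PipedCall(...)` when data has no `>>` of its own.
        return self.func(data, *self.args, **self.kwargs)


def verb(func):
    """Wrap func(data, ...) so it can be written as `data >> func(...)`."""

    def wrapper(*args, **kwargs):
        return PipedCall(func, args, kwargs)

    return wrapper


@verb
def mutate(rows, **columns):
    out = []
    for row in rows:
        new_row = dict(row)
        for name, value in columns.items():
            # Deferred expressions are resolved per row; constants are kept.
            new_row[name] = value.evaluate(new_row) if isinstance(value, Expr) else value
        out.append(new_row)
    return out


@verb
def filter_(rows, condition):
    return [row for row in rows if condition.evaluate(row)]


f = Placeholder()
rows = [{"x": i, "y": y} for i, y in enumerate(["zero", "one", "two", "three"])]

result = rows >> mutate(z=f.x) >> filter_(f.x > 1)
print(result)

# An existing function can be wrapped as well, as long as it takes data first.
sort_values = verb(sorted)
ordered = [3, 1, 2] >> sort_values(reverse=True)
print(ordered)
```

Because a list defines no `>>` operator of its own, Python falls back to `PipedCall.__rrshift__`, which is what lets an ordinary function such as `sorted` be used on the right-hand side of a pipe in this sketch.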
Testimonials
Thanks for your excellent package to port R (dplyr) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with dplyr.
File details
Details for the file datar-0.15.6.tar.gz.
File metadata
- Download URL: datar-0.15.6.tar.gz
- Upload date:
- Size: 10.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.8.10 Linux/5.15.0-1058-azure
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 5136c3b0dc4851f0db32e3d44ea137ebaac94f4fdf39f5e4d4de30d875d90b7e |
MD5 | 98aba35792ada8137e4d0452a20c3c26 |
BLAKE2b-256 | 5557c9a6468c5b6f2e719483ac0174844f130abf7b577a516e775e04ce220460 |
File details
Details for the file datar-0.15.6-py3-none-any.whl.
File metadata
- Download URL: datar-0.15.6-py3-none-any.whl
- Upload date:
- Size: 10.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.8.10 Linux/5.15.0-1058-azure
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 1736ea0d0c7cffa8ae3dcb78813daedd595a602edceb6069558e83fb3452cc16 |
MD5 | 26828ed1c81d302c479bf41943ac1177 |
BLAKE2b-256 | e1bacf0d0753b0f78f41648611acc6b6185277496de47c022a14398cb900f4c1 |