A Grammar of Data Manipulation in Python

datar

Documentation | Reference Maps | Notebook Examples | API

datar is a re-imagining of APIs for data manipulation in Python, with support for multiple backends. Its APIs are aligned with the tidyverse packages in R as closely as possible.

Installation

pip install -U datar

# install with a backend
pip install -U datar[pandas]

# Support for more backends will be added in the future

Backends

Repo
datar-numpy
datar-pandas

Example usage

# with pandas backend
from datar import f
from datar.dplyr import mutate, filter, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter, if_else, tibble

df = tibble(
    x=range(4),  # or c[:4]  (from datar.base import c)
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
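
In the examples above, an expression such as `f.x > 1` is not evaluated on the spot; it describes a column expression that the verb resolves against the data frame it receives. The following is a minimal pure-Python sketch of that idea, using a list of dicts in place of a data frame. It is not how datar or pipda implements `f`: the names `Expr`, `_Symbol` and `keep_rows` are hypothetical, and the `f` defined here is only a stand-in for illustration.

```python
import operator


class Expr:
    """A deferred expression: a function waiting for a row (hypothetical)."""

    def __init__(self, func):
        self._func = func

    def evaluate(self, row):
        return self._func(row)

    def _binary(self, other, op):
        # Build a new deferred expression instead of computing a value now.
        def run(row):
            right = other.evaluate(row) if isinstance(other, Expr) else other
            return op(self.evaluate(row), right)

        return Expr(run)

    def __gt__(self, other):
        return self._binary(other, operator.gt)

    def __ge__(self, other):
        return self._binary(other, operator.ge)


class _Symbol:
    """Stand-in for `f`: attribute access yields a deferred column lookup."""

    def __getattr__(self, name):
        return Expr(lambda row: row[name])


f = _Symbol()  # illustrative only, not datar's `f`


def keep_rows(rows, condition):
    """Keep the rows for which the deferred condition evaluates to true."""
    return [row for row in rows if condition.evaluate(row)]


names = ["zero", "one", "two", "three"]
rows = [{"x": i, "y": name} for i, name in enumerate(names)]
kept = keep_rows(rows, f.x > 1)
print(kept)  # [{'x': 2, 'y': 'two'}, {'x': 3, 'y': 'three'}]
```

The comparison returns another `Expr` rather than a boolean, which is why the same expression can be handed to a function and evaluated later against whatever data that function receives.
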
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
    df
    >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
    >> ggplot(aes(x="x", y="y"))
    + theme_classic()
    + geom_line(aes(color="sign"), size=1.2)
)

# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull

dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
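
`register_verb` above wraps an ordinary function so that it can appear on the right-hand side of `>>`. A rough pure-Python sketch of one way such a mechanism can work, assuming nothing about pipda's internals: calling the wrapped function only stores its arguments, and Python's reflected right-shift hook (`__rrshift__`) later supplies the piped-in data as the first argument. `as_verb` and `PendingCall` are hypothetical names used only for this illustration.

```python
import functools


class PendingCall:
    """A verb call that is still waiting for its piped-in data."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # `data >> pending` lands here when type(data) does not handle `>>`.
        return self.func(data, *self.args, **self.kwargs)


def as_verb(func):
    """Hypothetical stand-in for a verb-registering decorator."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return PendingCall(func, args, kwargs)

    return wrapper


@as_verb
def head(values, n=2):
    return values[:n]


@as_verb
def total(values):
    return sum(values)


result = [3, 1, 4, 1, 5] >> head(3) >> total()
print(result)  # 8
```

Because `>>` is left-associative, each step receives the result of the previous one, which is what lets verbs chain in the same way as the `iris >> pull(...) >> dist_plot()` line.
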

Testimonials

@coforfe:

Thanks for your excellent package to port R (dplyr) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with dplyr.
