Skip to main content

A Grammar of Data Manipulation in python

Project description

datar

A Grammar of Data Manipulation in python

Pypi Github Building Docs and API Codacy Codacy coverage Downloads

Documentation | Reference Maps | Notebook Examples | API

datar is a re-imagining of APIs for data manipulation in python with multiple backends supported. Those APIs are aligned with tidyverse packages in R as much as possible.

Installation

pip install -U datar

# install with a backend
pip install -U datar[pandas]

# or
pip install -U datar[polars]

Backends

Repo Badges
datar-numpy 3 18
datar-pandas 4 19
datar-arrow 23 24
datar-polars 25

Example usage

# with pandas backend
from datar import f
from datar.dplyr import mutate, filter_, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter_, if_else, tibble

df = tibble(
    x=range(4),  # or c[:4]  (from datar.base import c)
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter_(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter_(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
    df
    >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
    >> ggplot(aes(x="x", y="y"))
    + theme_classic()
    + geom_line(aes(color="sign"), size=1.2)
)

example

# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull

dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()

example

Testimonials

@coforfe:

Thanks for your excellent package to port R (dplyr) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with dplyr.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datar-0.16.0.tar.gz (11.1 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

datar-0.16.0-py3-none-any.whl (10.3 MB view details)

Uploaded Python 3

File details

Details for the file datar-0.16.0.tar.gz.

File metadata

  • Download URL: datar-0.16.0.tar.gz
  • Upload date:
  • Size: 11.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.11 {"installer":{"name":"uv","version":"0.11.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for datar-0.16.0.tar.gz
Algorithm Hash digest
SHA256 1a6327fbb4eecf718b3adcb685410715dd764ec96bca4012fdecd029b1fae545
MD5 2cc60dbb282b7bc14a9cfe1bd3f0b2ef
BLAKE2b-256 d34a9032ced94deac3215a183fb5d6392f211ad3d87e0dcb2cd7c409271f8e78

See more details on using hashes here.

File details

Details for the file datar-0.16.0-py3-none-any.whl.

File metadata

  • Download URL: datar-0.16.0-py3-none-any.whl
  • Upload date:
  • Size: 10.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.11 {"installer":{"name":"uv","version":"0.11.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for datar-0.16.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d8760d9c0e61635b7688e7611f77a91628fbb14d1f9b5e9cc321dc4f2ba56c41
MD5 33e8066f3ed9d2853051eb04f90d7777
BLAKE2b-256 ff63f64c80d53cb5b729c78cea4bce71864fd295f9d17b40b22e1cd2f4bc2061

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page