A port of dplyr and other related R packages to Python, built on pipda.

datar

A Grammar of Data Manipulation in Python

datar is a re-imagining of the APIs of Python data manipulation libraries (currently only pandas is supported), so that you can manipulate your data the way you would with dplyr in R.

datar is an in-depth port of tidyverse packages, such as dplyr, tidyr, forcats and tibble, as well as some functions from base R.

Installation

pip install -U datar

# install pdtypes support
pip install -U datar[pdtypes]

# install dependencies for modin as backend
pip install -U datar[modin]
# you may also need to install dependencies for modin engines
# pip install -U modin[ray]

Example usage

from datar import f
from datar.dplyr import mutate, filter, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter, if_else, tibble

df = tibble(
    x=range(4),  # or f[:4]
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar.base import sin, pi
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2*pi, 500))
(df >>
  mutate(y=sin(f.x), sign=if_else(f.y>=0, "positive", "negative")) >>
  ggplot(aes(x='x', y='y')) +
  theme_classic() +
  geom_line(aes(color='sign'), size=1.2))
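The examples above build expressions such as `f.x > 1` before any data frame is involved, and the verb evaluates them later against the piped data. As a rough illustration of that deferred-evaluation idea only (this is not how datar or pipda is implemented; the `Deferred` class and its `resolve` method are hypothetical names, and the sketch evaluates one row at a time rather than whole columns), a placeholder can be written in plain Python like this:

```python
import operator


class Deferred:
    """Hypothetical stand-in for a placeholder like `f`: it records an
    expression and evaluates it only when a row mapping is supplied."""

    def __init__(self, evaluate=lambda row: row):
        self._evaluate = evaluate

    def __getattr__(self, name):
        # Only called for attributes that do not exist, e.g. `f.x`.
        if name.startswith("_"):
            raise AttributeError(name)
        # f.x means: look up key "x" in whatever the parent evaluates to.
        return Deferred(lambda row: self._evaluate(row)[name])

    def _binary(self, other, op):
        def evaluate(row):
            # The right-hand side may itself be a deferred expression.
            right = other.resolve(row) if isinstance(other, Deferred) else other
            return op(self._evaluate(row), right)

        return Deferred(evaluate)

    def __gt__(self, other):
        return self._binary(other, operator.gt)

    def __eq__(self, other):
        return self._binary(other, operator.eq)

    def resolve(self, row):
        """Evaluate the recorded expression against one row (a dict)."""
        return self._evaluate(row)


f = Deferred()
rows = [
    {"x": 0, "y": "zero"},
    {"x": 1, "y": "one"},
    {"x": 2, "y": "two"},
    {"x": 3, "y": "three"},
]

condition = f.x > 1  # nothing is evaluated yet
kept = [row for row in rows if condition.resolve(row)]
print([row["y"] for row in kept])  # ['two', 'three']
```

The point of the sketch is that `f.x > 1` yields an object rather than a boolean, which is what lets a verb decide when, and against which data, the expression is evaluated.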

# easy to integrate with other libraries
# for example: klib
import klib
from datar.core.factory import verb_factory
from datar.datasets import iris
from datar.dplyr import pull

dist_plot = verb_factory(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
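The `verb_factory` call above turns an ordinary function into something that can sit on the right-hand side of `>>`. As a rough illustration of the general idea only (this is not datar's or pipda's actual implementation; `PipeableVerb` and `make_verb` are hypothetical names), a pipeable wrapper can be sketched in plain Python by implementing `__rrshift__`:

```python
from functools import partial


class PipeableVerb:
    """Hypothetical wrapper: stores a function plus the extra arguments
    given at call time, and receives the data through `data >> verb`."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python falls back to the right operand's __rrshift__ when the
        # left operand does not implement >> for this pair of types.
        return self.func(data, *self.args, **self.kwargs)


def make_verb(func):
    """Return a factory so that `make_verb(func)(...)` builds a verb."""
    return partial(PipeableVerb, func)


# Two toy verbs operating on a plain list of numbers.
keep_above = make_verb(lambda data, limit: [v for v in data if v > limit])
total = make_verb(lambda data: sum(data))

result = [1, 5, 2, 8] >> keep_above(2) >> total()
print(result)  # 13
```

Because `>>` is left-associative, each verb receives the result of everything to its left, which is what makes the chained style in the examples above possible.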
