TidyBear
A tidier approach to pandas.
This package was originally a collection of functions, routines, and processes that I found myself repeating often. It has since evolved into a desire to work my way through the tidyverse and reimplement my favorite tidy features in Python. This project is not aimed at creating a better experience for every pandas task, but rather a different one that sometimes feels more natural to me. I hope something here can be useful to you.
Installation
pip install tidybear
Usage
import pandas as pd
import tidybear as tb
Verbs
# rename columns
tb.rename(data, old="new")
# select columns
tb.select(data, ["col1", "col2"])
# count number of rows across multiple columns
tb.count(data, ["col1", "col2"])
# pivot long to wide or wide to long
tb.pivot_longer(data, ["val1", "val2"], names_to="val_type")
tb.pivot_wider(data, names_from="val_type", values_from="value")
# slice rows
tb.slice_max(data, order_by="val1", n=10)
tb.slice_min(data, order_by="val1", n=10, groupby="col1")
# join dataframes
tb.left_join(data1, data2, "colA") # use "colA" as key
tb.right_join(data1, data2, col1A="col1B") # use "col1A" from left and "col1B" from right
tb.cross_join(data1, data2)
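For readers coming from plain pandas, a few of the verbs above can be compared to their closest pandas idioms. This is a rough sketch, not tidybear's implementation: it assumes the verbs behave like their tidyverse namesakes, and the sample frame and the output column name "n" are made up for illustration.

```python
import pandas as pd

# Illustrative data only; the column names mirror the examples above.
data = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "val1": [1, 5, 3],
    "val2": [10, 20, 30],
})

# Wide to long (what pivot_longer describes) via DataFrame.melt.
long = data.melt(
    id_vars="col1",
    value_vars=["val1", "val2"],
    var_name="val_type",
    value_name="value",
)

# The n rows with the largest val1 (what slice_max describes).
top2 = data.nlargest(2, "val1")

# The smallest val1 row within each col1 group (what slice_min with groupby describes).
min_per_group = data.sort_values("val1").groupby("col1").head(1)

# Number of rows per value of col1 (what count describes); "n" is an assumed name.
counts = data.groupby(["col1"]).size().reset_index(name="n")

# Cross join: every row on the left paired with every row on the right.
other = pd.DataFrame({"k": [1, 2]})
crossed = data.merge(other, how="cross")
```

The pandas versions work, but each one uses a different method name and argument style, which is the friction the verb functions are meant to smooth over.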
Groupby and Summarise API
with tb.GroupBy(df, "group_var") as g:
    g.n()
    g.sum("value", name="total_value")
    g.n_distinct("ids", name="n_unique_ids")
    summary = g.summarise()
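As a point of comparison, the same kind of summary can be written with pandas named aggregation. This is a sketch under assumptions: the sample data is invented, and the name "n" for the row count is a guess at what g.n() would produce by default.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "group_var": ["a", "a", "b"],
    "value": [1.0, 2.0, 3.0],
    "ids": [10, 10, 11],
})

summary = (
    df.groupby("group_var")
    .agg(
        n=("value", "size"),              # rows per group (assumed name "n")
        total_value=("value", "sum"),     # sum of value per group
        n_unique_ids=("ids", "nunique"),  # distinct ids per group
    )
    .reset_index()
)
```

Each keyword in agg plays the role of one statement inside the GroupBy block: it names an output column and says which input column and statistic feed it.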
TidySelectors
- everything() - Select all columns
- last_col - Select the last column
- first_col - Select the first column
- contains(pattern) - Select columns that contain the literal string
- matches(pattern) - Select columns that match the regular expression pattern
- starts_with(pattern) - Select columns that start with the literal string
- ends_with - Select all columns that end with the literal string
- num_range - Select all columns that match a numeric range like x01, x02, x03
These can be used in a variety of tidybear verbs:
from tidybear.selectors import contains, everything
# select all columns that contain "foo"
tb.select(data, contains("foo"))
# pivot all columns to long format
tb.pivot_longer(data, everything())
You can also negate these, so if you wanted every column except one, you could do:
from tidybear.selectors import last_col
tb.select(data, -last_col())
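To make the idea of a negatable selector concrete, here is one way such objects could be built. This is a hypothetical sketch and not tidybear's source: the Selector class and its internals are invented for illustration, and only the function names contains, matches, and last_col are borrowed from the list above.

```python
import re
from typing import Callable, List


class Selector:
    """Hypothetical selector: maps a list of column names to the chosen subset."""

    def __init__(self, pick: Callable[[List[str]], List[str]]):
        self.pick = pick

    def __call__(self, columns: List[str]) -> List[str]:
        return self.pick(columns)

    def __neg__(self) -> "Selector":
        # Unary minus returns the complement, preserving the original column order.
        def complement(columns: List[str]) -> List[str]:
            chosen = set(self.pick(columns))
            return [c for c in columns if c not in chosen]

        return Selector(complement)


def contains(pattern: str) -> Selector:
    # Literal substring match.
    return Selector(lambda cols: [c for c in cols if pattern in c])


def matches(pattern: str) -> Selector:
    # Regular expression match anywhere in the column name.
    return Selector(lambda cols: [c for c in cols if re.search(pattern, c)])


def last_col() -> Selector:
    # Slicing keeps this safe for an empty column list.
    return Selector(lambda cols: cols[-1:])


columns = ["id", "foo_a", "foo_b", "bar"]
with_foo = contains("foo")(columns)
numbered = matches(r"_[ab]$")(columns)
all_but_last = (-last_col())(columns)
```

Because a selector here is just a function of the column names, it could be applied to a DataFrame with something like data[selector(list(data.columns))], and negation falls out of defining __neg__ once on the wrapper.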
Coming Soon (maybe)
- Method chaining
- Tribbles!