TidyBear
A tidier approach to pandas.
This package began as a collection of functions, routines, and processes that I often found myself repeating. It has since evolved into an effort to work through the tidyverse and reimplement my favorite tidy features in Python. The project does not aim to improve every pandas task; it offers a different approach that sometimes feels more natural to me. I hope something here is useful to you.
Installation
pip install tidybear
Usage
import pandas as pd
import tidybear as tb
Verbs
# rename columns
tb.rename(data, old="new")
# select columns
tb.select(data, ["col1", "col2"])
# count number of rows across multiple columns
tb.count(data, ["col1", "col2"])
# pivot long to wide or wide to long
tb.pivot_longer(data, ["val1", "val2"], names_to="val_type")
tb.pivot_wider(data, names_from="val_type", values_from="value")
# slice rows
tb.slice_max(data, order_by="val1", n=10)
tb.slice_min(data, order_by="val1", n=10, groupby="col1")
# join dataframes
tb.left_join(data1, data2, "colA") # use "colA" as key
tb.right_join(data1, data2, col1A="col1B") # use "col1A" from left and "col1B" from right
tb.cross_join(data1, data2)
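For readers coming from plain pandas, the sketch below shows roughly what several of these verbs correspond to. It assumes the verbs behave like their tidyverse namesakes; it is not taken from tidybear's source, and the sample frames, the "n" column name, and the "value" column name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the tidybear docs.
data = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": ["x", "y", "x"],
    "val1": [1, 5, 3],
    "val2": [10, 20, 30],
})

# count: number of rows per combination of col1 and col2
counts = data.groupby(["col1", "col2"]).size().reset_index(name="n")

# pivot_longer: stack val1 and val2 into name/value pairs
long = data.melt(
    id_vars=["col1", "col2"],
    value_vars=["val1", "val2"],
    var_name="val_type",
    value_name="value",
)

# pivot_wider: spread the name/value pairs back into columns
wide = long.pivot(
    index=["col1", "col2"], columns="val_type", values="value"
).reset_index()

# slice_max: the n rows with the largest val1
top = data.nlargest(2, "val1")

# slice_min within groups: the row with the smallest val1 per col1
bottom = data.sort_values("val1").groupby("col1").head(1)

# joins: same key name on both sides, different key names, and cross
left = pd.DataFrame({"colA": [1, 2], "x": ["a", "b"]})
right = pd.DataFrame({"colA": [2, 3], "y": ["c", "d"]})
left_joined = left.merge(right, how="left", on="colA")

a = pd.DataFrame({"col1A": [1, 2], "x": ["a", "b"]})
b = pd.DataFrame({"col1B": [2, 3], "y": ["c", "d"]})
right_joined = a.merge(b, how="right", left_on="col1A", right_on="col1B")

crossed = left.merge(right, how="cross")  # requires pandas >= 1.2
```

Whether tidybear returns exactly these column names and row orders is an assumption; the point is only to show the pandas calls that the verbs wrap conceptually.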
Groupby and Summarise API
with tb.GroupBy(df, "group_var") as g:
    g.n()
    g.sum("value", name="total_value")
    g.n_distinct("ids", name="n_unique_ids")
    summary = g.summarise()
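As a point of comparison, a summary like the one above can be written with pandas named aggregation. This is a minimal sketch under assumptions: the sample data is invented, and the output column name "n" for the row count is a guess rather than something the tidybear docs state.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "group_var": ["a", "a", "b"],
    "value": [1.0, 2.0, 3.0],
    "ids": [10, 10, 20],
})

summary = (
    df.groupby("group_var")
    .agg(
        n=("value", "size"),              # rows per group
        total_value=("value", "sum"),     # sum of "value" per group
        n_unique_ids=("ids", "nunique"),  # distinct "ids" per group
    )
    .reset_index()
)
```

Each keyword in `agg` names one output column, which plays the same role as the `name=` argument in the GroupBy block.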
TidySelectors
- everything() - Select all columns
- last_col - Select last column
- first_col - Select first column
- contains(pattern) - Select columns that contain the literal string
- matches(pattern) - Select columns that match the regular expression pattern
- starts_with(pattern) - Select columns that start with the literal string
- ends_with - Select all columns that end with the literal string
- num_range - Select all columns that match a numeric range like x01, x02, x03
These can be used in a variety of tidybear verbs:
from tidybear.selectors import contains, everything
# select all columns that contain "foo"
tb.select(data, contains("foo"))
# pivot all columns to long format
tb.pivot_longer(data, everything())
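To make the difference between the literal and regular-expression selectors concrete, here is a small standard-library sketch of the matching rules applied to a list of column names. The `pick_*` helpers and their signatures are hypothetical stand-ins written for this illustration; they are not tidybear functions, and tidybear's own argument names may differ.

```python
import re

# Hypothetical column names.
columns = ["id", "foo_a", "foo_b", "x01", "x02", "x03", "bar_foo"]


def pick_contains(cols, literal):
    """Columns whose name contains the literal string."""
    return [c for c in cols if literal in c]


def pick_matches(cols, pattern):
    """Columns whose name matches the regular expression."""
    return [c for c in cols if re.search(pattern, c)]


def pick_starts_with(cols, prefix):
    """Columns whose name starts with the literal string."""
    return [c for c in cols if c.startswith(prefix)]


def pick_ends_with(cols, suffix):
    """Columns whose name ends with the literal string."""
    return [c for c in cols if c.endswith(suffix)]


def pick_num_range(cols, prefix, numbers, width):
    """Columns named prefix + zero-padded number, e.g. x01, x02."""
    wanted = {f"{prefix}{n:0{width}d}" for n in numbers}
    return [c for c in cols if c in wanted]


print(pick_contains(columns, "foo"))
print(pick_matches(columns, r"^x\d+$"))
print(pick_num_range(columns, "x", range(1, 3), width=2))
```

Note that a literal match treats characters such as `.` or `+` as plain text, while the regular-expression version interprets them as pattern syntax.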
You can also negate these, so if you wanted everything except one column, you could do:
from tidybear.selectors import last_col
tb.select(data, -last_col())
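One way a selector could support the unary minus is by defining `__neg__`. The sketch below is only a guess at the general idea, not tidybear's actual implementation; the `Selector` class, its `resolve` method, and `last_col_sketch` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Selector:
    """Hypothetical selector: wraps a function that picks column names."""

    pick: Callable[[List[str]], List[str]]
    negated: bool = False

    def __neg__(self) -> "Selector":
        # -selector flips the negation flag and leaves the rule unchanged
        return Selector(self.pick, not self.negated)

    def resolve(self, columns: List[str]) -> List[str]:
        picked = set(self.pick(columns))
        if self.negated:
            # keep everything the rule did NOT pick, in original order
            return [c for c in columns if c not in picked]
        return [c for c in columns if c in picked]


def last_col_sketch() -> Selector:
    """Hypothetical counterpart of last_col(): picks the final column."""
    return Selector(lambda cols: cols[-1:])


cols = ["a", "b", "c"]
print(last_col_sketch().resolve(cols))
print((-last_col_sketch()).resolve(cols))
```

In plain pandas, dropping the last column can be written as `data.iloc[:, :-1]`.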
Coming Soon (maybe)
- Method chaining
- Tribbles!