dplyr-like data manipulation
piplyr
piplyr is a Python package that provides dplyr-like data manipulation capabilities for pandas DataFrames. It simplifies data manipulation tasks in Python by offering a set of intuitive methods for data filtering, selection, transformation, and aggregation.
Installation
You can install piplyr directly from PyPI:
pip install piplyr
Usage
Here are some basic examples of how to use piplyr:
Initializing piplyr
import pandas as pd
from piplyr import piplyr
# Sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# Initialize piplyr with the DataFrame
pi = piplyr(df)
Grouping Data
# Group by a column
grouped_pi = pi.group_by('A')
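Grouping is usually most useful when followed by a summary. As a point of comparison, the sketch below shows the plain-pandas version of "group, then aggregate". It uses only pandas; the sample frame with a repeated key and the column name mean_B are illustrative assumptions, and how piplyr itself combines group_by with summarize is not shown in this README.

```python
import pandas as pd

# Illustrative frame with a repeated key in column A (not the README's sample data)
df = pd.DataFrame({'A': [1, 1, 2], 'B': [4, 5, 6]})

# Group by A, then compute one aggregate per group using named aggregation
summary = df.groupby('A', as_index=False).agg(mean_B=('B', 'mean'))

print(summary)
```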
Sorting Data
# Sort the DataFrame by a column
sorted_pi = pi.sort_by('B').to_df
to_df
Use to_df to convert the generated data to a pandas DataFrame.
Selecting Columns
# Select specific columns
selected_pi = pi.select('A', 'B')
Dropping Columns
# Drop specific columns
dropped_pi = pi.drop_col('B')
Renaming Columns
# Rename columns in the DataFrame
renamed_pi = pi.rename_col({'A': 'new_A'})
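For readers coming from pandas, the three column operations above (select, drop_col, rename_col) correspond roughly to the following plain-pandas calls. This is a comparison sketch using only pandas, not piplyr's implementation.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Counterpart of select('A', 'B'): keep only the listed columns
selected_df = df[['A', 'B']]

# Counterpart of drop_col('B'): remove the listed column
dropped_df = df.drop(columns=['B'])

# Counterpart of rename_col({'A': 'new_A'}): map old names to new names
renamed_df = df.rename(columns={'A': 'new_A'})

print(list(renamed_df.columns))
```

Each call returns a new DataFrame and leaves df unchanged.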
Filtering Rows
# Filter rows based on a condition
filtered_pi = pi.filter_row('A > 1')
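The condition string in the example looks like a pandas query expression. Assuming filter_row accepts the same syntax as DataFrame.query (an inference from the example, not something this README states), the plain-pandas counterpart would be:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Keep only the rows where column A is greater than 1
filtered_df = df.query('A > 1')

print(filtered_df)
```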
Mutating Columns
# Add a new column or modify existing ones
mutated_pi = pi.mutate(new_col=lambda x: x['A'] * 2)
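The keyword-plus-lambda style of mutate closely resembles DataFrame.assign in pandas, where each keyword names a column and each callable receives the whole DataFrame. A comparison sketch using only pandas:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# The lambda receives the DataFrame (x) and returns the values of the new column
mutated_df = df.assign(new_col=lambda x: x['A'] * 2)

print(mutated_df)
```

Passing the name of an existing column instead of new_col overwrites that column in the returned copy.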
Summarizing Data
# Summarize the DataFrame
summarized_pi = pi.summarize(mean_A=lambda x: x['A'].mean())
Executing SQL Queries
# Execute an SQL query on the DataFrame. Always refer to the table as df in the FROM clause, regardless of your DataFrame's variable name
sql_pi = pi.sql_plyr('SELECT * FROM df WHERE A > 1')
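One way to see why the table is always called df: a wrapper can load the DataFrame into a temporary database under a fixed table name and run the query there. The sketch below does this with the standard-library sqlite3 module and pandas. The helper run_sql is hypothetical, and this README does not say which SQL engine sql_plyr actually uses.

```python
import sqlite3

import pandas as pd


def run_sql(frame, query):
    """Hypothetical helper: run a SQL query against a DataFrame exposed as table 'df'."""
    conn = sqlite3.connect(':memory:')
    try:
        # The table name is fixed to 'df', whatever the Python variable is called
        frame.to_sql('df', conn, index=False)
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


my_data = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
sql_result = run_sql(my_data, 'SELECT * FROM df WHERE A > 1')

print(sql_result)
```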
Chaining Methods
# Chain multiple operations
result_pi = pi.select('A', 'B').filter_row('A > 1').summarize(avg_B=lambda x: x['B'].mean())
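Chaining works when every method returns an object that exposes the same methods. The sketch below illustrates that pattern with a small hypothetical wrapper class, MiniPipe. It is not piplyr's source code, and its method bodies are assumptions built on plain pandas.

```python
import pandas as pd


class MiniPipe:
    """Hypothetical wrapper illustrating method chaining (not piplyr's implementation)."""

    def __init__(self, df):
        self.df = df

    def select(self, *columns):
        # Return a new wrapper so that further calls can be chained
        return MiniPipe(self.df[list(columns)])

    def filter_row(self, condition):
        return MiniPipe(self.df.query(condition))

    def summarize(self, **aggregations):
        # Each keyword becomes a column holding the value its callable returns
        values = {name: [func(self.df)] for name, func in aggregations.items()}
        return MiniPipe(pd.DataFrame(values))


df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

result = (
    MiniPipe(df)
    .select('A', 'B')
    .filter_row('A > 1')
    .summarize(avg_B=lambda x: x['B'].mean())
)

print(result.df)
```

Because each step wraps a new DataFrame, the original df is left untouched by the chain.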
Additional Methods
The package also includes several other methods like join, count_na, distinct, pivot_longer, pivot_wider, clean_names, separate, str_pad, str_sub, str_extract, str_detect, str_len, str_lower, str_upper, str_startswith, str_endswith, str_contains, fct_lump, fct_infreq, fct_relevel, fct_recode, fct_reorder and others.
Each of these methods provides specific data manipulation functionalities and can be explored further in the package documentation.
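Several of these names mirror tidyr verbs. Since the signatures of the piplyr methods are not given in this README, the sketch below shows only the underlying idea of pivot_longer and pivot_wider using plain pandas (melt and pivot). The frame and the column names id, name and value are illustrative assumptions.

```python
import pandas as pd

# Illustrative wide-format frame: one row per id, one column per measurement
wide = pd.DataFrame({'id': [1, 2], 'x': [10, 20], 'y': [30, 40]})

# Wide to long: one row per (id, measurement name) pair
long = wide.melt(id_vars='id', var_name='name', value_name='value')

# Long back to wide: one column per distinct value in 'name'
wide_again = long.pivot(index='id', columns='name', values='value').reset_index()

print(long)
print(wide_again)
```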
More Examples
Please consult the docstrings of the various methods, which include explanations and examples.
Contributing
Contributions to piplyr are welcome! Please refer to the contribution guidelines for more information.
License
This project is licensed under the MIT License.