
Common tidyverse functions for pandas!

Project description

Project Status: Active (the project has reached a stable, usable state and is being actively developed). Python 3.9.0. License: MIT.

💪 Bringing the Power of tidyverse to Pandas!

tidyversetopandas is a Python package designed for users familiar with R's tidyverse who are transitioning to Python. It bridges the syntax gap between R and Python by offering pandas equivalents to popular tidyverse functions. This package is particularly beneficial for data scientists and analysts who seek to leverage pandas' robust capabilities with the familiar syntax of tidyverse.

🐍 Fitting into the Python Ecosystem

While pandas is a powerful tool for data manipulation in Python, it can be challenging for those accustomed to R's tidyverse syntax. tidyversetopandas aims to blend these two worlds. In the Python ecosystem, it sits alongside packages that bring tidyverse-like functionality to Python's data manipulation landscape, predominantly on top of pandas. The goal is to make pandas more accessible to those accustomed to tidyverse syntax.

Two notable packages in this domain are tidypandas and siuba. Like tidyversetopandas, both try to bridge the gap between R's tidyverse approach and Python's pandas library, offering users familiar with R's data manipulation tools a more comfortable transition to Python's data science ecosystem.

🔑 Key Functions:

  • mutate(): Similar to its tidyverse counterpart, this function allows for the creation of new columns or modification of existing ones in a DataFrame.
  • filter(): Enables row-wise filtering, making it easier to sift through a DataFrame based on specified conditions.
  • select(): Facilitates the selection of specific columns in a DataFrame, streamlining data manipulation and analysis.
  • arrange(): Offers sorting capabilities for a DataFrame based on one or multiple columns.

⚙️ Installation

pip install tidyversetopandas

🏃 Usage

Let's try out tidyversetopandas.

Import package

Import the package into your Python environment after installation:

from tidyversetopandas import tidyversetopandas as ttp

Loading Data

Begin by loading your data into a pandas dataframe. This package assumes that you have a dataframe ready for manipulation named df.
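If you do not have data at hand, a small DataFrame like the one sketched here is enough to follow along. The column names and values are hypothetical and only chosen to resemble the examples in this section; adjust them to match your own data (the mutate example, for instance, refers to a lowercase column b).

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "A": [1, 2, 3],
        "B": [4, 5, 6],
        "C": [9, 8, 7],
    }
)
```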

Mutate

Use mutate to create new columns or modify existing ones. We can do this by writing the expression we want as a string.

df = ttp.mutate(df, "b=b*2")
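For comparison, here is roughly what the same operation looks like in plain pandas. This is a sketch of the pandas counterpart, not a description of how tidyversetopandas implements mutate; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data with a lowercase column b, as in the mutate example.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Overwrite column b with b * 2; assign returns a new DataFrame.
doubled = df.assign(b=df["b"] * 2)

# DataFrame.eval also accepts an assignment written as a string
# and returns a new DataFrame by default.
doubled_eval = df.eval("b = b * 2")
```

Both calls leave the original df unchanged and return a modified copy.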

Filter

The filter function is used to subset dataframes based on specified conditions. For instance, to select rows where 'A' is greater than 1 and 'B' is less than 6:

df = ttp.filter(df, "A > 1 and B < 6")
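The condition string reads much like the expressions accepted by pandas' own DataFrame.query. As a point of reference, a plain-pandas version of the same filter might look like this; the sample data is hypothetical, and this is not a claim about the package's internals.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Keep rows where A > 1 and B < 6, using a query string.
filtered = df.query("A > 1 and B < 6")

# The same filter written as a boolean mask.
masked = df[(df["A"] > 1) & (df["B"] < 6)]
```

With this data, only the row with A = 2 and B = 5 satisfies both conditions.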

Arrange

Sort your dataframe with arrange. You can sort by multiple columns and specify ascending or descending order. For example, to sort by 'A' in ascending order and then by 'C':

df = ttp.arrange(df, True, "A", "C")
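In plain pandas, multi-column sorting is done with DataFrame.sort_values. The sketch below assumes that the True argument in the call above plays the role of an ascending flag; that reading is an assumption, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data with a tie in column A.
df = pd.DataFrame({"A": [2, 1, 2], "C": [5, 9, 3]})

# Sort by A, then break ties by C, both ascending.
sorted_df = df.sort_values(by=["A", "C"], ascending=True)

# sort_values also accepts one direction per column:
# here A ascending and C descending.
mixed = df.sort_values(by=["A", "C"], ascending=[True, False])
```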

Select

To keep only certain columns, use the select function. For example, to keep only the column 'A':

df = ttp.select(df, "A")
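For reference, column selection in plain pandas is usually written with a list of column names. This is a sketch of the pandas counterpart with hypothetical sample data, not the package's implementation.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [9, 8, 7]})

# Passing a list of names keeps the result a DataFrame
# (a bare df["A"] would return a Series instead).
only_a = df[["A"]]

# Several columns can be kept the same way.
a_and_c = df[["A", "C"]]
```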

📖 Developer Guide

🛠️ Installation in Development Mode

  1. Clone the repository and navigate to the project root directory.

  2. Create a virtual environment and activate it.

conda env create -f environment.yml
conda activate tidyversetopandas

  3. Make sure Poetry is installed; if it is not, install it first. Then run the following command to install the package in development mode.

poetry install

✅ Testing

To run the tests, use the following command:

pytest tests/

To run tests with coverage, use the following command:

pytest tests/ --cov=tidyversetopandas

To view the coverage report, use the following command:

pytest --cov=tidyversetopandas --cov-report html tests/

This will create an htmlcov directory containing the coverage report in HTML format. Open the index.html file in this directory with a web browser to view the detailed coverage report.

🤝 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

©️ License

tidyversetopandas was created by Thomas, Sophia, Lily, and Nando. It is licensed under the terms of the MIT license.

👥 Contributors

  1. Thomas Jian
  2. Sophia Zhao
  3. Lily Tao
  4. Farrandi Hernando

Credits

tidyversetopandas was created with cookiecutter and the py-pkgs-cookiecutter template.
