pandas-selector

Simple, composable selectors for loc[], iloc[], assign() and others.

These details have not been verified by PyPI

Project links

Development Status
- 5 - Production/Stable
Environment
- Console
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering

Project description

Pandas Selector

Access the calling pandas data frame in loc[], iloc[], assign() and other methods with DF to write better chains of data frame operations, e.g.:

df = (df
      # Select all rows with column "x" < 2
      .loc[DF["x"] < 2]
      .assign(
          # Shift "x" by its minimum.
          y = DF["x"] - DF["x"].min(),
          # Clip "x" to its central 50% window. Note how DF is used
          # in the argument to `clip()`.
          z = DF["x"].clip(
              lower=DF["x"].quantile(0.25),
              upper=DF["x"].quantile(0.75)
          ),
      )
     )

Overview

Motivation: Make chaining Pandas operations easier and bring functionality to Pandas similar to Spark’s col() function or referencing columns in R’s dplyr.
Install from PyPI with pip install pandas-selector. Pandas versions 1.0+ (^1.0) are supported.
Documentation can be found at readthedocs.
Source code can be obtained from GitHub.

Example: Create new column and filter

Instead of writing “traditional” Pandas like this:

import pandas as pd

df_in = pd.DataFrame({"x": range(5)})
df = df_in.assign(y = df_in["x"] // 2)
df = df.loc[df["y"] <= 1]
df
#    x  y
# 0  0  0
# 1  1  0
# 2  2  1
# 3  3  1

One can write:

from pandas_selector import DF
df = (df_in
      .assign(y = DF["x"] // 2)
      .loc[DF["y"] <= 1]
     )

This is especially handy when iterating on data frame manipulations interactively, e.g. in a notebook (just imagine you have to rename df to df_out).

You can also access all methods and attributes of the data frame from the context:
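For comparison, here is a minimal sketch of the same chain in plain pandas, which accepts callables in assign() and loc[]. It is not meant to describe how pandas-selector works internally; it only shows the lambda boilerplate that DF stands in for. The names df_lambda and d are illustrative.

```python
import pandas as pd

df_in = pd.DataFrame({"x": range(5)})

# Each callable receives the intermediate data frame at that step of
# the chain, so no explicit variable name is needed here either.
df_lambda = (df_in
             .assign(y=lambda d: d["x"] // 2)
             .loc[lambda d: d["y"] <= 1]
            )
```

The result has the same rows and columns as the "traditional" version above.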

df = pd.DataFrame({
    "X": range(5),
    "y": ["1", "a", "c", "D", "e"],
})
df.loc[DF["y"].str.isupper() | DF["y"].str.isnumeric()]
#    X  y
# 0  0  1
# 3  3  D
df.loc[:, DF.columns.str.isupper()]
#    X
# 0  0
# 1  1
# 2  2
# 3  3
# 4  4
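As a cross-check, the two selections above can be sketched with plain pandas callables, assuming only pandas itself. The variable names rows and cols are illustrative and not part of the package.

```python
import pandas as pd

df = pd.DataFrame({
    "X": range(5),
    "y": ["1", "a", "c", "D", "e"],
})

# Row filter: keep rows whose "y" value is upper case or numeric.
rows = df.loc[lambda d: d["y"].str.isupper() | d["y"].str.isnumeric()]

# Column filter: keep columns whose name is upper case.
cols = df.loc[:, lambda d: d.columns.str.isupper()]
```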

You can even use DF in the arguments to methods:

df = pd.DataFrame({
    "x": range(5),
    "y": range(2, 7),
})
df.assign(z = DF['x'].clip(lower=2.2, upper=DF['y'].median()))
#    x  y    z
# 0  0  2  2.2
# 1  1  3  2.2
# 2  2  4  2.2
# 3  3  5  3.0
# 4  4  6  4.0
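The nested use of DF inside the clip() arguments corresponds to referring to the same intermediate frame twice. A hedged sketch with a plain pandas callable (the names result and d are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "x": range(5),
    "y": range(2, 7),
})

# The callable's argument d plays the role of DF, both as the object
# being clipped and inside the `upper` argument.
result = df.assign(
    z=lambda d: d["x"].clip(lower=2.2, upper=d["y"].median())
)
```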

When working with pd.Series, the S object can be used in the same way as DF:

from pandas_selector import S
s = pd.Series(range(5))
s[S < 3]
# 0    0
# 1    1
# 2    2
# dtype: int64
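The same selection can be sketched in plain pandas with a callable passed to loc[], which receives the series itself. The name small is illustrative.

```python
import pandas as pd

s = pd.Series(range(5))

# The callable is evaluated against the series being indexed.
small = s.loc[lambda x: x < 3]
```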

Similar projects for pandas

siuba
- (+) active
- (-) new API to learn
pandas-ply
- (-) stale(?), last change 6 years ago
- (-) new API to learn
- (-) Symbol / pandas_ply.X works only with ply_* functions
pandas-select
- (+) no explicit df necessary
- (-) new API to learn
pandas-selectable
- (+) simple select accessor
- (-) usage inside chains clumsy (needs explicit df):
```
((df
  .select.A == 'a')
  .select.B == 'b'
)
```
- (-) hard-coded str, dt accessor methods
- (?) composable?

Development

Development is containerized with [Docker](https://www.docker.com/) to separate it from the host system and improve reproducibility. No other prerequisites are needed on the host system.

Recommendation for Windows users: install WSL 2 (tested on Ubuntu 20.04), and for containerized workflows, Docker Desktop for Windows.

Common tasks are collected in the Makefile (see make help for a complete list):

Run the unit tests: make test, or make watch to rerun the tests continuously on code changes.
Build the documentation: make docs
TODO: Update the poetry.lock file: make lock
Add a dependency:
1. Start a shell in a new container.
2. Add dependency with poetry add in the running container. This will update poetry.lock automatically:
```
# 1. On the host system
% make shell
# 2. In the container instance:
I have no name!@7d0e85b3a303:/app$ poetry add --dev --lock falcon
```
Build the development image: make devimage (Note: this should be done automatically for the targets.)

Release history

1.5.0

Apr 17, 2024

1.3.4

Apr 15, 2023

1.3.3

Apr 5, 2023

1.3.2

Mar 21, 2022

1.3.1

Feb 21, 2022

1.3.0 yanked

Feb 21, 2022

Reason this release was yanked:

broken build

1.2.2

Feb 19, 2022

This version

1.2.1

Feb 19, 2022

1.1.0

Sep 12, 2021

1.0.0

Sep 3, 2021

0.1.2

Aug 22, 2021

0.1.1

Apr 13, 2021

0.1.0

Apr 12, 2021

Download files

Source Distribution

pandas_selector-1.2.1.tar.gz (15.0 kB view details)

Uploaded Feb 19, 2022 Source

Built Distribution

pandas_selector-1.2.1-py3-none-any.whl (13.4 kB view details)

Uploaded Feb 19, 2022 Python 3

File details

Details for the file pandas_selector-1.2.1.tar.gz.

File metadata

Download URL: pandas_selector-1.2.1.tar.gz
Upload date: Feb 19, 2022
Size: 15.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.1.5 CPython/3.9.2 Linux/5.10.0-10-amd64

File hashes

Hashes for pandas_selector-1.2.1.tar.gz
Algorithm	Hash digest
SHA256	`1140f4ef0ac6e6c4b7fbc57de7974f15196d57d98960d371270115402bfb3a94`
MD5	`fff79c67fe78600581c93e826c987622`
BLAKE2b-256	`61db461fc53d8cc570668501c847ca638c9616b7bcaeee4d455c5e8c18eb4388`

File details

Details for the file pandas_selector-1.2.1-py3-none-any.whl.

File metadata

Download URL: pandas_selector-1.2.1-py3-none-any.whl
Upload date: Feb 19, 2022
Size: 13.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.1.5 CPython/3.9.2 Linux/5.10.0-10-amd64

File hashes

Hashes for pandas_selector-1.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`913e4fd620c34b84b2257e683fc6473acae227b6d4af3e7566e728d51214e553`
MD5	`4782dfa9aa739b30d12116c1ad87dab2`
BLAKE2b-256	`4025a1f459251931fc2526df02da24f82a95b129abf5b61e2c1ef537abd4c647`
