Split a dataframe by boolean array

``pandas-refract``: Convenient partitioning by Truthy/Falsey array

**pandas-refract** is an MIT-licensed Python package with a simple function that lets users divide their
dataframes by the truthiness or falseyness of a provided array.

The eventual goal of this package is to add a feature to the pandas library that lets users ``.pop`` rows
from a dataframe where a condition is met. As far as I can tell, pandas cannot currently do this in the way the example below shows.

The ideal case would be::

    target_df = df.pop(df['target_column'] == 'target_value')
    non_target_df = df

What is required now is::

    target_df = df[df['target_column'] == 'target_value']
    non_target_df = df[df['target_column'] != 'target_value']
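
Until something like this exists in pandas, a pop-style helper can be sketched in a few lines. The function below is a hypothetical illustration (``pop_rows`` is not part of pandas or of pandas-refract), and it assumes the dataframe has a unique index so that dropping by label removes exactly the matching rows:

```python
import pandas as pd


def pop_rows(df, mask):
    """Remove rows where mask is True from df in place and return them.

    Hypothetical helper; assumes df has a unique index.
    """
    popped = df[mask].copy()
    df.drop(index=popped.index, inplace=True)
    return popped


df = pd.DataFrame({'target_column': ['target_value', 'other', 'target_value'],
                   'x': [1, 2, 3]})
target_df = pop_rows(df, df['target_column'] == 'target_value')
non_target_df = df  # df now holds only the rows that were not popped
```

Because the helper mutates ``df``, the original dataframe itself becomes the "non-target" half, which is the behaviour the ideal case describes.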

Admittedly, this package provides nothing that pandas cannot already do. It does, however, add a layer of
convenience for more complex slicing where you need to separate rows by a condition rather than remove them.


The simplest examples in plain pandas require::

    # Split on null vs. non-null values
    df1 = df[df.column.notnull()].reset_index(drop=True)
    df2 = df[df.column.isnull()].reset_index(drop=True)

    # Split on equality with a string
    df1 = df[df.column == 'test_string'].reset_index(drop=True)
    df2 = df[df.column != 'test_string'].reset_index(drop=True)

With pandas-refract this becomes::

    df1, df2 = refract(df, df.column.notnull(), True)

    df1, df2 = refract(df, df.column == 'test_string', True)

You don't have to pass an explicit boolean array, though; any array of truthy and falsey values works::

    import pandas as pd

    data = {'a': ['', 'truthy', '', 'truthy'],
            'b': [0, 1, 2, 3]}

    df = pd.DataFrame(data)

    truthy_df, falsey_df = refract(df, df.a)
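
To make the truthy/falsey idea concrete, here is a minimal sketch in plain pandas of how such a split could work. ``split_by_truthiness`` is a hypothetical stand-in written for illustration; it is not the package's actual implementation, and its ``reset_index`` parameter is an assumption:

```python
import pandas as pd


def split_by_truthiness(df, values, reset_index=False):
    """Return (truthy_df, falsey_df), split by Python truthiness of values.

    Hypothetical stand-in, not the pandas-refract implementation.
    Note that bool(float('nan')) is True, so NaN counts as truthy here.
    """
    # Evaluate each value with Python's own bool() and align to df's index.
    mask = pd.Series([bool(v) for v in values], index=df.index)
    truthy, falsey = df[mask], df[~mask]
    if reset_index:
        truthy = truthy.reset_index(drop=True)
        falsey = falsey.reset_index(drop=True)
    return truthy, falsey


data = {'a': ['', 'truthy', '', 'truthy'],
        'b': [0, 1, 2, 3]}
df = pd.DataFrame(data)

truthy_df, falsey_df = split_by_truthiness(df, df.a)
# truthy_df holds the rows where 'a' is a non-empty string (b is 1 and 3);
# falsey_df holds the rows where 'a' is '' (b is 0 and 2).
```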

A more complex example, selecting rows where 'a' is falsey and 'b' is an odd number::

    df1, df2 = refract(df, (~df.a.map(bool)) & (df.b % 2 == 1))
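
The same condition can be checked in plain pandas to see which rows land on each side. This sketch uses made-up sample data (the values are chosen for illustration only) and converts 'a' to booleans explicitly before negating it, since ``~`` is not defined for a column of strings:

```python
import pandas as pd

# Illustrative data: rows 0 and 2 have a falsey 'a' and an odd 'b'.
df = pd.DataFrame({'a': ['', 'x', '', 'y'],
                   'b': [1, 2, 3, 4]})

# True where 'a' is falsey (empty string) and 'b' is odd.
mask = (~df.a.map(bool)) & (df.b % 2 == 1)

df1 = df[mask].reset_index(drop=True)    # rows matching the condition
df2 = df[~mask].reset_index(drop=True)   # every other row
```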
