Construct a pipeline of functions to be applied independently to the groups of a groupby object.
Extends the .pipe and .apply methods of the Pandas groupby object.
The extension is available on PyPI:
pip install gcGroupbyExtension-gcalmettes
(Or, if you do not want to install the package in your Python distribution, download this repo and place the gcGroupbyExtension folder in the folder you are running your Python script or notebook from.)
Once installed, the extension can be imported via:
What problems does this extension try to solve?
Pandas provides both the .pipe and .apply methods to work on its groupby object.
The main difference between .pipe and .apply in the groupby context is scope. With .pipe, you have access to the entire groupby object (every group). With .apply, you only have access to one subcomponent at a time: in the context of a groupby, the subcomponents are slices of the dataframe that called groupby, where each slice is itself a dataframe. The same holds for a series groupby.
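The scope difference can be seen with plain pandas. The following is a minimal sketch on a made-up DataFrame (the column names `team` and `score` and both helper functions are assumptions for illustration); it does not use this extension.

```python
import pandas as pd

# Toy data, invented for this illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1, 2, 10, 20],
})
grouped = df.groupby("team")


def spread(gb):
    # .pipe passes the whole GroupBy object in a single call,
    # so the function can use any GroupBy method across all groups.
    return gb["score"].max() - gb["score"].min()


def score_range(scores):
    # .apply calls the function once per group;
    # each call only sees that group's slice (here, a Series).
    return scores.max() - scores.min()


per_team_spread = grouped.pipe(spread)
per_team_range = grouped["score"].apply(score_range)

print(per_team_spread)
print(per_team_range)
```

Both calls return one value per team here; what differs is what each function receives as its argument.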
- The .pipe method can be chained, while the .apply method cannot.
- You can use the .agg method to limit the application of functions to particular columns of the groups, but it is cumbersome to apply specific functions independently to only a selection of the groups.
- There is no easy way to construct independent pipelines of functions for each group.
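To make the limitation concrete, here is a sketch in plain pandas on invented data (the columns `team`, `score`, `bonus` and the set `only_groups` are assumptions). Selecting columns with .agg takes one call, while restricting a transformation to some groups needs a manual loop and reassembly.

```python
import pandas as pd

# Toy data, invented for this illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "c", "c"],
    "score": [1, 2, 10, 20, 100, 200],
    "bonus": [5, 5, 6, 6, 7, 7],
})

# .agg can target columns: one reduction per column, identical for every group.
summary = df.groupby("team").agg({"score": "mean", "bonus": "sum"})

# Targeting groups instead means looping, testing each group name,
# and stitching the pieces back together by hand.
only_groups = {"a", "c"}
pieces = []
for name, group in df.groupby("team"):
    if name in only_groups:
        group = group.assign(score=group["score"] * 2)
    pieces.append(group)
doubled = pd.concat(pieces)

print(summary)
print(doubled)
```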
This extension provides this capability.
What does this extension actually do?
This extension lets you construct a pipeline of functions to be applied independently to the groups of a groupby object. The functions or transformations can be the same for all groups, or scoped to one or more specific groups.
This library registers a custom accessor on pandas DataFrame and Series objects.
The methods of this extension are registered under the .gc namespace.
See the DEMO notebook for details.
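For readers unfamiliar with custom accessors, pandas offers a public registration hook, `pandas.api.extensions.register_dataframe_accessor`. The sketch below shows the general mechanism only. The accessor name `demo_groups`, the class, and its `split` method are hypothetical and are not taken from this library's code.

```python
import pandas as pd


# "demo_groups" is a made-up accessor name for this sketch, not the library's.
@pd.api.extensions.register_dataframe_accessor("demo_groups")
class DemoGroupsAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is called on.
        self._obj = pandas_obj

    def split(self, by):
        """Return a dict mapping each group name to its slice of the DataFrame."""
        return {name: group for name, group in self._obj.groupby(by)}


df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 10]})

# Once registered, the accessor is available on every DataFrame.
groups = df.demo_groups.split("team")
print(sorted(groups))
```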
Care to show the syntax?
Sure! See the DEMO notebook for more details, but basically, you can do things like this:
    (df.gc.groupby("nameOfColumn")
        .resetIndex()  # this is a special method baked in
        .apply(lambda x: x * 5, lambda x: x + x.iloc)  # accepts multiple functions
        .apply(mySpecialFunction, onlyGroups=['group1'])  # limit the function to specific group(s)
        .apply(lambda x: x - x.mean(), ignoreGroups=['group4', 'group6'])  # limit the function to specific group(s)
        .apply(lambda x: x.std(axis=1))
        .concat(axis=0, multiIndex=None).plot()
    )
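The idea behind such a chain can be approximated in plain pandas. The following is a rough sketch under stated assumptions, not the extension's implementation: the class `GroupPipeline` and its `only` and `ignore` parameters are hypothetical stand-ins that mirror the spirit of onlyGroups and ignoreGroups.

```python
import pandas as pd


class GroupPipeline:
    """Hypothetical stand-in for a per-group pipeline; not the library's class."""

    def __init__(self, df, by):
        # Keep one DataFrame per group, without the grouping column.
        self.groups = {name: g.drop(columns=by) for name, g in df.groupby(by)}

    def apply(self, *funcs, only=None, ignore=None):
        for name in self.groups:
            if only is not None and name not in only:
                continue
            if ignore is not None and name in ignore:
                continue
            for func in funcs:  # functions run in the order given
                self.groups[name] = func(self.groups[name])
        return self  # returning self is what makes the calls chainable

    def concat(self):
        # Group names become the outer level of the resulting index.
        return pd.concat(self.groups)


df = pd.DataFrame({"team": ["a", "a", "b", "b"], "score": [1.0, 3.0, 10.0, 30.0]})

result = (
    GroupPipeline(df, "team")
    .apply(lambda x: x * 2, lambda x: x + 1)    # both functions, every group
    .apply(lambda x: x - x.mean(), only=["a"])  # group "a" only
    .apply(lambda x: x / 10, ignore=["a"])      # every group except "a"
    .concat()
)
print(result)
```

Each group follows its own sequence of transformations, and the groups are only recombined at the end.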