A Pandas Enhancement

An enhancement to the pandas module.
kungfu monkey-patches common convenience methods onto (Data)Frame and Series in pandas.
jerryzhujian9_at_gmail.com
Tested under Python 2.7.

Install:
https://pypi.python.org/pypi/kungfu
pip install kungfu
The above command automatically takes care of the following requirements:
Requires pandas 0.12.0 (tested with 0.12.0-2), which also installs python-dateutil (dateutil), numpy, pytz and six.
Requires openpyxl for writing Excel files (tested with 1.5.8; version 1.6.1 or higher, but lower than 2.0.0, may also work).
Requires xlrd for reading Excel files and xlwt for writing .xls (old format) files.
(pip install pandas==0.12.0; pip install openpyxl==1.5.8; pip install xlrd; pip install xlwt)

Usage:
http://pandas.pydata.org/pandas-docs/version/0.12.0/genindex.html

Generally, every call (monkey-patched or not) returns a new object and leaves the original frame or series unchanged.
To change the original frame or series, assign the return value back to it.
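
A minimal sketch of this convention, using plain pandas rather than kungfu's own methods (the frame and column names are made up for illustration):

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# rename() returns a new frame; the original is left untouched
renamed = frame.rename(columns={"a": "x"})
original_columns = list(frame.columns)

# assign the return value back to make the change stick
frame = frame.rename(columns={"a": "x"})
updated_columns = list(frame.columns)
```

The same pattern should apply to the monkey-patched methods listed below, assuming they follow the rule stated above.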

Think of a series as a column of a frame, with the series name as the column name.
Think of a single list as a series, and therefore as a column of a frame, when converting it to a series or frame.
However, for a list of lists, think of each inner list (i.e. sublist) as a row!
Mnemonic: list = series = column
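
The mnemonic can be checked with plain pandas constructors. This is only a sketch; the column name "score" and the values are illustrative assumptions:

```python
import pandas as pd

# a single list becomes one column
column_frame = pd.DataFrame({"score": [1, 2, 3]})

# a named series becomes a column carrying the series name
series = pd.Series([1, 2, 3], name="score")
from_series = pd.DataFrame(series)

# a list of lists: each sublist becomes a row
row_frame = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
```

Here column_frame and from_series each have three rows and one column, while row_frame has two rows and three columns.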

from pandas import isnull as isna
from pandas import isnull as isnull
Frame.read    = Frame.Read           Frame.save    = Frame.Save           Frame.write    = Frame.Save
Frame.peek    = Frame.Print          Frame.Peek    = Frame.Print          Frame.play     = Frame.Play
Frame.sel     = Frame.Sel            Frame.selcol  = Frame.SelCol         Frame.selrow   = Frame.SelRow
Frame.delete  = Frame.Del            Frame.groupv  = Frame.GroupV         Frame.splith   = Frame.SplitH
Frame.recols  = Frame.ReorderCols    Frame.rerows  = Frame.ReorderRows    Frame.rncols   = Frame.RenameCols
Frame.newcol  = Frame.NewCol         Frame.findval = Frame.FindVal        Frame.countval = Frame.CountVal
Frame.cnames  = Frame.Columns        Frame.names   = Frame.Columns        Frame.rnames   = Frame.Indices
Frame.num     = Frame.ToNum          Frame.maskout = Frame.Maskout        # Frame.fillna = Frame.FillNA

Series.play    = Series.Play         Series.peek     = Series.Print       Series.Peek   = Series.Print
Series.sel     = Series.Sel          Series.countval = Series.CountVal    # Series.unique = Series.Uniques --existing method
Series.len     = Series.Size         Series.names    = Series.Indices     Series.rnames = Series.Indices
Series.cames   = Series.Indices      Series.num      = Series.ToNum       Series.str    = Series.ToStr
Series.maskout = Series.Maskout      # Series.fillna = Series.FillNA

mergelr = MergeLR                    concatvh = ConcatVH
Frame.tolist(), Frame.list()   <-- homebrew
Series.tolist(), Series.list() <-- existing method in pandas
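
For readers unfamiliar with monkey-patching, the alias pattern above can be sketched as follows. The helper count_val and the attribute names CountValDemo / countvaldemo are hypothetical and chosen so they do not clash with kungfu's real methods; this is not kungfu's implementation.

```python
import pandas as pd


def count_val(self, value):
    """Hypothetical helper: count the cells of a frame equal to value."""
    return int((self == value).sum().sum())


# attach the function to the class, then add a lowercase alias for it
pd.DataFrame.CountValDemo = count_val
pd.DataFrame.countvaldemo = pd.DataFrame.CountValDemo

frame = pd.DataFrame({"a": [1, 2, 1], "b": [1, 0, 0]})
hits = frame.countvaldemo(1)
```

Because the attribute is set on the class, every existing and future DataFrame instance picks up the new method.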

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Frame IO, Frame info
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
frame = Frame.Read(path, sep=",", header=0)
frame = Frame.Readx(path, sheetname='Sheet1', header=0)
frame.Print([column=None])                 series.Print()
frame.Save(outputFile[, columns=None])
f = Frame.Play()                           s = Series.Play()

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Selection, grouping
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
frame.Sel(*args)                           frame.SelCol(column)
frame.SelRow(row)                          frame.Del(*args)
frame.GroupV(edgeMatchSeries, groupColumnName='AutoGroup')
frame.SplitH(subFrameSize=1, resetIndex=True)
series.Sel(elements=[])

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Reorganize
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
General notes on "join":
When joining along an axis, the indices of the frames do not have to be in the same order,
e.g. ["a","b","c","f"] for the left frame and ["b","c","a","e"] for the right frame.
The join matches the labels and returns the combined frame (in a certain order).
MergeLR(left, right, join='union', onKeys=[], sort=True)
ConcatVH(frameList, axis=0, join="union", sort=False)
frame.ReorderCols(columns=[])              frame.ReorderRows(indices=[])
frame.RenameCols(newColumns=[])            frame.NewCol([newColumnName="NewColumn"[, newColumnValue=NA]])
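
The label matching described in the notes on "join" can be illustrated with plain pandas. This sketch uses pd.concat directly instead of MergeLR / ConcatVH, and it assumes that kungfu's join='union' corresponds to pandas' join="outer"; the column names and values are made up.

```python
import pandas as pd

left = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "f"])
right = pd.DataFrame({"y": [10, 20, 30, 40]}, index=["b", "c", "a", "e"])

# rows are matched by index label, not by position
union = pd.concat([left, right], axis=1, join="outer")

# an inner join keeps only the labels present in both frames
inner = pd.concat([left, right], axis=1, join="inner")
```

In the union, label "a" pairs x=1 with y=30 even though "a" sits at different positions in the two frames; labels found in only one frame get NA in the other frame's column.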

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Stats, Processing
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
IsNA(object)                               frame.FindVal(valToFind)
frame.CountVal(valToCount)                 frame.Columns()
frame.Indices()                            frame.ToNum()
frame.Maskout(condition)                   frame.FillNA(*args, **kwargs)
frame.mean(axis=0), frame.median(axis=0), frame.sum(axis=0)
frame.corr(method='')                      series.CountVal(valToCount)
series.Uniques()                           series.Size()
series.Indices()                           series.ToNum()
series.Maskout(condition)                  series.FillNA(*args, **kwargs)
series.mean(axis=0), series.median(axis=0), series.sum(axis=0)
series.corr(other, method='')              series = series.ToStr()

Loop how-to:
for columnName, columnSeries in frame.iteritems():
    columnIndex = frame.Columns().index(columnName)
    columnUniques = columnSeries.Uniques()
for rowIndex, rowSeries in frame.iterrows():
    # work with rowSeries here
for index, value in series.iteritems():
    # work with value here
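
As a rough equivalent in plain pandas: the sketch below assumes a recent pandas release, where iteritems() has been replaced by items(), and uses list(frame.columns) and unique() in place of kungfu's Columns() and Uniques(). The sample data is illustrative only.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 4]})

# loop over columns: name, position and unique values of each column
uniques = {}
for column_name, column_series in frame.items():
    column_index = list(frame.columns).index(column_name)
    values = sorted(int(v) for v in column_series.unique())
    uniques[column_name] = (column_index, values)

# loop over rows: each row arrives as a series indexed by column name
row_sums = []
for row_index, row_series in frame.iterrows():
    row_sums.append(int(row_series.sum()))

# loop over the (index, value) pairs of a single series
pairs = [(index, int(value)) for index, value in frame["a"].items()]
```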

Also consider apply, applymap, map
apply works on a row / column basis of a DataFrame, applymap works element-wise on a DataFrame,
and map works element-wise on a Series.
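
A short sketch of the three, using plain pandas with made-up data. Newer pandas releases rename DataFrame.applymap to DataFrame.map, so the example picks whichever is available; that fallback is an assumption about the installed version, not part of kungfu.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# apply: the function receives one whole column (axis=0) or row (axis=1)
col_totals = frame.apply(lambda col: col.sum(), axis=0)
row_totals = frame.apply(lambda row: row.sum(), axis=1)

# applymap: the function receives one element of the frame at a time
# (called DataFrame.map in newer pandas, hence the lookup)
elementwise = getattr(frame, "map", None) or frame.applymap
doubled = elementwise(lambda v: v * 2)

# map: the function receives one element of the series at a time
squared = frame["a"].map(lambda v: v ** 2)
```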

Source distribution: kungfu-1.4.4.tar.gz (19.5 kB)