IPython wrapper to more easily manipulate Pandas dataframes.

Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Project description

pandacell

Author: Eirik B. Stavestrand

Introduces a %df (or %%df) magic which can be used in Jupyter notebooks and the IPython console.

The magic executes the contents of a cell on a Pandas DataFrame.

Description

Pandas is great and all, but writing Pandas code can be tedious. For example, simply summing two columns looks like this:

    In [1]: df["a"] + df["b"]

It might not look like such a big deal, but all those brackets and quotation marks add up. Using pandacell, the above syntax can be written as:

    In [2]: %df a + b

Under the hood, this is accomplished simply by passing the cell contents as a string to Pandas' DataFrame.eval method. This isn't very complex, but it provides a fair amount of functionality and makes the code considerably more readable.
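To make the equivalence concrete, here is a minimal sketch in plain pandas, without the magic. The sample data is made up for illustration, and the sketch only shows the DataFrame.eval call that the description above refers to, not pandacell's actual source.

```python
import pandas as pd

# Hypothetical sample data, not taken from the project.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Plain pandas: index each column explicitly.
explicit = df["a"] + df["b"]

# Roughly what `%df a + b` is described as doing:
# the expression is handed to DataFrame.eval as a string.
evaluated = df.eval("a + b")

print(evaluated.tolist())
```

Both variables hold the same column-wise sum; the string form simply drops the brackets and quotation marks.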

If you wish to store the result in a new column, use regular assignment along with the -i (or --inplace) flag:

    In [3]: %df -i c = a + b

It also works with multiple assignments:

    In [4]: %%df -i
       ...: c = a + b
       ...: f = c - a
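In plain pandas, the in-place behaviour corresponds to the inplace parameter of DataFrame.eval. The sketch below assumes the -i flag maps onto that parameter, which the source does not state explicitly, and uses made-up data.

```python
import pandas as pd

# Hypothetical sample data, not taken from the project.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Without inplace, eval returns a new DataFrame and leaves df untouched.
preview = df.eval("c = a + b")
columns_before = list(df.columns)

# With inplace=True the new columns are added to df itself. A multi-line
# string performs several assignments, and later lines can use earlier results.
df.eval("c = a + b\nf = c - a", inplace=True)

print(list(df.columns))
```

Because f is computed from c on the second line, the order of the assignments matters.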

You can use Pandas' various accessors and series method calls:

    In [5]: %%df -i
       ...: name_upper = name.str.upper()
       ...: yr = timestamp.dt.year
       ...: lower_cased = species.where(cond=species.str[0].str.islower(), other=None)

Because names in an expression are assumed to be columns of the dataframe, regular variables in the local or global namespace must be prefixed with @:

    In [6]: a = 1
       ...: %df a = @a + 1

    In [7]: def myfunc(row):
       ...:     return row + 43
       ...: %df b = a.apply(@myfunc)
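The @ prefix is standard DataFrame.eval syntax, so it can be tried outside the magic as well. In this sketch the helper column_plus_variable and the data are hypothetical; the local variable is deliberately given the same name as the column to show how the two are told apart.

```python
import pandas as pd


def column_plus_variable(frame, a):
    # Inside the expression, bare `a` is the column of `frame`,
    # while `@a` is the local Python variable passed to this function.
    return frame.eval("a + @a")


# Hypothetical sample data, not taken from the project.
df = pd.DataFrame({"a": [1, 2, 3]})
shifted = column_plus_variable(df, 10)

print(shifted.tolist())
```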

By default, pandacell operates on any dataframe named df. This can be overridden with the -n (or --name) flag:

    In [8]: %df -n=df_in c = a + b

You can also subset a dataframe with the -q (or --query) flag:

    In [9]: %df -q species == "setosa"
    Out[9]:
        sepal_length  sepal_width  petal_length  petal_width species  a
    0            5.1          3.5           1.4          0.2  setosa  0
    1            4.9          3.0           1.4          0.2  setosa  0


    In [10]: %df -q species.isna() #check for missing values
    Out[10]:
    Empty DataFrame
    Columns: [sepal_length, sepal_width, petal_length, petal_width, species]
    Index: []
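For comparison, the same kind of filtering can be written with DataFrame.query in plain pandas. That the -q flag delegates to DataFrame.query is an assumption based on its name; the data below is a made-up three-row stand-in for the iris table shown above.

```python
import pandas as pd

# Hypothetical stand-in for the iris data used in the transcript.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 7.0],
        "species": ["setosa", "setosa", "versicolor"],
    }
)

# Keep only the rows where the boolean expression holds.
setosa = df.query('species == "setosa"')

# Method calls on a column are evaluated here with the Python engine.
missing = df.query("species.isna()", engine="python")

print(len(setosa), len(missing))
```

By default query returns a filtered copy, so df itself still has all of its rows afterwards.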

The -q flag can be combined with the -i flag to subset the dataframe in place:

    In [11]: %df -q -i species == "setosa"

Pandacell even supports comments:

    In [12]: %%df -i
        ...: # Line comment
        ...: c = a + b # Comment at end of line

Inspired by

https://github.com/catherinedevlin/ipython-sql

Development

https://github.com/eirki/pandacell

Release history

This version

2020.9.19

Sep 19, 2020

Download files

Source Distribution

pandacell-2020.9.19.tar.gz (3.5 kB)

Uploaded Sep 19, 2020 Source

Built Distribution

pandacell-2020.9.19-py3-none-any.whl (3.3 kB)

Uploaded Sep 19, 2020 Python 3

Hashes for pandacell-2020.9.19.tar.gz
Algorithm	Hash digest
SHA256	`6ac93b932fce46e73d0b14d8546d252fa5fb91e00f6f7634b8bbe591ba9b1fe9`
MD5	`8f26fd797be6d916c0f5cfb02a2c3caf`
BLAKE2b-256	`b65615de0418333fd99ed341228e3362c22dd7ef922fddad0b5fa4ea472c2868`

Hashes for pandacell-2020.9.19-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ffe0a9beeb42d7c741125300b69ffe4c1d1bac998914d60d279ec3c51e52b7cf`
MD5	`f7f3f0564fb67f1bfa9057842e8dacab`
BLAKE2b-256	`7f7cccdaa697eb16ce149e4d8d46c04df9e7d0b8591fa246d2a6ef60c50285fe`