A helper class that makes appending to a Pandas DataFrame efficient

pandas-appender

Have you ever wanted to append a bunch of rows to a Pandas DataFrame? Turns out that it's extremely inefficient to do so for a large dataframe, because each append copies the existing data. You're supposed to make multiple dataframes and pd.concat them instead.
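To make the pd.concat advice concrete, here is a minimal sketch of the chunk-and-concat pattern using only plain pandas. The function name `build_by_chunks` and the `chunk_size` default are made up for illustration; this is not the code inside pandas-appender, just the general idea it wraps.

```python
import pandas as pd


def build_by_chunks(rows, chunk_size=10_000):
    """Hypothetical helper: buffer dict rows, turn each full buffer into a
    small DataFrame, and concatenate all of the chunks exactly once."""
    chunks = []
    buffer = []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= chunk_size:
            # Building a DataFrame from a list of dicts is cheap compared
            # with growing one big DataFrame a row at a time.
            chunks.append(pd.DataFrame(buffer))
            buffer = []
    if buffer:
        # Flush whatever is left over after the loop.
        chunks.append(pd.DataFrame(buffer))
    if not chunks:
        return pd.DataFrame()
    # A single concat at the end avoids repeated copying of the full data.
    return pd.concat(chunks, ignore_index=True)


df = build_by_chunks({'i': i} for i in range(25_000))
```

Rows sit in a plain Python list until a chunk fills up, so the extra memory is bounded by roughly one chunk plus the finished chunks.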

So... helper function? Pandas doesn't seem to have one. Roll your own? OK then. Here's that helper. It can append around 1 million small rows per CPU-second, and has modest additional memory usage.

Install

pip install pandas-appender

Usage

from pandas_appender import PDF_Appender

pdfa = PDF_Appender(ignore_index=True)
for i in range(1_000_000):
    pdfa.append({'i': i})

df = pdfa.finalize()

TODO

Add a df_template argument so that columns with dtype='category' can be represented efficiently. Or, build this template from the df passed in?
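For background on why a template would help, the sketch below uses plain pandas only. It does not use any pandas-appender API, and df_template is still just a TODO. It shows two things: a category column takes much less memory than the same strings stored as objects, and chunks that share one CategoricalDtype keep that dtype through pd.concat. The column name `color` and the sample values are invented for the example.

```python
import pandas as pd

# The same repeated strings, stored as Python objects and as a category.
colors = ['red', 'green', 'blue'] * 100_000
as_object = pd.Series(colors, dtype=object)
as_category = as_object.astype('category')

object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

# A shared dtype plays the role a template could play: every chunk is
# built with the same categories, so concat does not have to fall back
# to a wider dtype.
color_dtype = pd.CategoricalDtype(['red', 'green', 'blue'])
chunk_a = pd.DataFrame({'color': pd.Series(['red', 'green'], dtype=color_dtype)})
chunk_b = pd.DataFrame({'color': pd.Series(['blue'], dtype=color_dtype)})
combined = pd.concat([chunk_a, chunk_b], ignore_index=True)

print(object_bytes, category_bytes)
print(combined['color'].dtype)
```

When chunks are converted to category independently, their category sets can differ, and pd.concat then no longer returns a categorical column. Knowing the dtype up front avoids that.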

Source distribution: pandas_appender-0.9.0.tar.gz (8.2 kB)