Lightweight Pandas monkey-patch for async map, apply, transform, etc.

aiopandas

🚀 Async-Powered Pandas: Lightweight Pandas monkey-patch that adds async support to map, apply, applymap, aggregate, and transform, enabling seamless handling of async functions with controlled parallel execution (max_parallel).

✨ Features

  • Drop-in replacement for Pandas functions, now supporting async functions.
  • Automatic async execution with controlled concurrency via max_parallel.
  • Built-in error handling: choose between raising, ignoring, or logging errors.
  • Supports tqdm for real-time progress tracking.

🚀 Quick Start

import aiopandas as pd  # Monkey-patches Pandas with async methods
import asyncio

# Create a sample DataFrame
df = pd.DataFrame({'x': range(10)})

# Define an async function (simulating API calls, I/O, etc.)
async def f(x):
    await asyncio.sleep(0.1 * x)  # Simulate async processing
    return x * 2  # Example transformation

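# Apply the async function to the DataFrame column.
# Top-level await works in Jupyter/IPython; in a plain script, put this
# inside an async function and run it with asyncio.run().
df['y'] = await df.x.amap(f, max_parallel=5)  # Default max_parallel=16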
print(df)
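
To give a feel for what a cap like max_parallel means, here is a minimal standard-library sketch. It is not aiopandas' actual implementation: the helper name bounded_map is hypothetical, and the use of asyncio.Semaphore is an assumption about one reasonable way to limit how many calls are in flight at once.

```python
import asyncio


async def bounded_map(func, items, max_parallel=16):
    """Apply async `func` to each item with at most `max_parallel` calls in flight.

    Hypothetical helper for illustration; not part of aiopandas.
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def run_one(item):
        async with semaphore:  # waits here once the limit is reached
            return await func(item)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(run_one(item) for item in items))


async def demo():
    in_flight = 0
    peak = 0

    async def f(x):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)  # track the highest concurrency observed
        await asyncio.sleep(0.01)
        in_flight -= 1
        return x * 2

    results = await bounded_map(f, range(10), max_parallel=5)
    return results, peak


results, peak = asyncio.run(demo())
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
print(peak)     # never exceeds max_parallel
```

Results come back in input order even though the calls overlap, which is what makes it safe to assign them straight back to a column.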

โš ๏ธ Handling Errors Gracefully

aiopandas includes built-in error handling, allowing you to manage failures without breaking the entire operation.

  1. Default behavior (raise): stops on the first error

async def f(x):
    if x > 50 and x % 3:
        raise Exception('exception example')
    await asyncio.sleep(0.01 * x)
    return x

df = pd.DataFrame({'x': range(100)})

df['y'] = await df.x.amap(f, max_parallel=50)  # Raises an exception

Output (Error traceback):

Exception: exception example

  2. Ignore errors (on_error='ignore')

df['y'] = await df.x.amap(f, max_parallel=50, on_error='ignore')  # Easy to ignore exceptions

Now, instead of crashing, rows that trigger exceptions return NaN:

print(df['y'])
0      0.0
1      1.0
2      2.0
...
95     NaN
96    96.0
97     NaN
98     NaN
99    99.0
Name: y, Length: 100, dtype: float64
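
The NaN pattern above can be reproduced without Pandas. The following sketch assumes that "ignore" amounts to catching the exception and substituting NaN; call_or_nan is a hypothetical stand-in, not the library's code.

```python
import asyncio
import math


async def f(x):
    # Same failure rule as the example above
    if x > 50 and x % 3:
        raise Exception('exception example')
    await asyncio.sleep(0)
    return x


async def call_or_nan(func, x):
    """Return func(x), or NaN if it raises (stand-in for on_error='ignore')."""
    try:
        return await func(x)
    except Exception:
        return float('nan')


async def main():
    return await asyncio.gather(*(call_or_nan(f, x) for x in range(100)))


values = asyncio.run(main())
failed = [x for x, v in zip(range(100), values) if math.isnan(v)]
print(values[95:])  # [nan, 96, nan, nan, 99]
print(len(failed))  # 32
```

Because NaN is a float, Pandas stores a column that mixes integers and NaN as float64, which is why 96 appears as 96.0 in the output above.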

  3. Custom error handling (on_error=print)

You can log or process errors with a custom function (or coroutine):

df['y'] = await df.x.amap(f, max_parallel=50, on_error=print)  # Print errors instead of failing

Output:

exception example
exception example
exception example
...
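
Accepting either a plain function or a coroutine as the handler can be sketched as below. This is an illustration under assumptions: call_with_handler and both handlers are hypothetical names, and the text above does not say whether aiopandas uses the handler's return value as the row value; the sketch returns it only to make the behavior visible.

```python
import asyncio
import inspect


async def call_with_handler(func, x, on_error):
    """Run func(x); on failure pass the exception to on_error, awaiting it if needed."""
    try:
        return await func(x)
    except Exception as exc:
        outcome = on_error(exc)
        if inspect.isawaitable(outcome):  # handler was a coroutine function
            outcome = await outcome
        return outcome


async def f(x):
    if x % 2:
        raise ValueError(f'bad value {x}')
    return x


seen = []


def sync_handler(exc):
    seen.append(str(exc))  # record the message; implicitly returns None


async def async_handler(exc):
    await asyncio.sleep(0)  # stands in for an async logging call
    seen.append(str(exc))
    return -1  # hypothetical fallback value


async def main():
    first = await asyncio.gather(
        *(call_with_handler(f, x, sync_handler) for x in range(4)))
    second = await asyncio.gather(
        *(call_with_handler(f, x, async_handler) for x in range(4)))
    return first, second


first, second = asyncio.run(main())
print(first)   # [0, None, 2, None]
print(second)  # [0, -1, 2, -1]
```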

📊 Progress Tracking with tqdm

To visualize progress, pass tqdm as an argument:

from tqdm import tqdm

df['y'] = await df.x.amap(f, max_parallel=5, tqdm=tqdm)

Example output:

 69%|█████████████████████████████████████████████████████                | 69/100 [00:06<00:03, 9.99it/s]
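
One way a progress bar can be driven from concurrent calls is to tick it as each call finishes. The sketch below is an assumption about the general pattern, not aiopandas' code. CountingBar, make_bar, and map_with_progress are hypothetical; CountingBar mimics only the total=, update(), and close() parts of tqdm's interface so the example runs without tqdm installed.

```python
import asyncio


class CountingBar:
    """Hypothetical stand-in exposing the subset of tqdm's interface used below."""

    def __init__(self, total=None):
        self.total = total
        self.n = 0

    def update(self, n=1):
        self.n += n

    def close(self):
        pass


async def map_with_progress(func, items, bar_factory):
    items = list(items)
    bar = bar_factory(total=len(items))

    async def run_one(item):
        try:
            return await func(item)
        finally:
            bar.update(1)  # tick when the call finishes, success or failure

    try:
        return await asyncio.gather(*(run_one(item) for item in items))
    finally:
        bar.close()


bars = []


def make_bar(total):
    # Keep a reference to the bar so the final count can be inspected
    bar = CountingBar(total=total)
    bars.append(bar)
    return bar


async def double(x):
    await asyncio.sleep(0.001 * x)
    return x * 2


results = asyncio.run(map_with_progress(double, range(20), make_bar))
print(bars[0].n, '/', bars[0].total)
```

Passing the tqdm class itself as bar_factory should work the same way here, since tqdm accepts total= and provides update() and close().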

🎯 Why Use aiopandas?

  • Ideal for async API calls (e.g., LLMs, web scraping, database queries).
  • Can greatly speed up Pandas workflows that are dominated by async I/O, because the waits overlap instead of running one after another.
  • Minimal code changes: just swap .map() for .amap() (or .apply() for .aapply(), etc.) and you're good to go!
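
The speed-up for I/O-bound work comes from overlapping waits. A minimal standard-library sketch of the effect follows; fetch is a hypothetical stand-in for a network call, and the timings depend on the machine.

```python
import asyncio
import time


async def fetch(x):
    await asyncio.sleep(0.05)  # stands in for a network round trip
    return x


async def sequential(items):
    # One call at a time: total time is roughly the sum of the waits
    return [await fetch(x) for x in items]


async def concurrent(items):
    # All calls in flight together: total time is roughly the longest wait
    return await asyncio.gather(*(fetch(x) for x in items))


async def timed(coro):
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


async def main():
    seq_result, seq_time = await timed(sequential(range(10)))
    con_result, con_time = await timed(concurrent(range(10)))
    return seq_result, seq_time, con_result, con_time


seq_result, seq_time, con_result, con_time = asyncio.run(main())
print(f'sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s')
```

With ten calls of 0.05 s each, the sequential version needs at least 0.5 s, while the concurrent one finishes in about the time of a single call. CPU-bound functions would not see this benefit.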

📦 Installation

pip install aiopandas

Or install it from source:

git clone https://github.com/telekinesis-inc/aiopandas.git
cd aiopandas
pip install .

💡 Contributing

Pull requests are welcome! If you find issues or have suggestions, feel free to open an issue. 🚀

🙌 Acknowledgements

The monkey patching in aiopandas was heavily inspired by, and largely copied and adapted from, the tqdm.pandas() method. Special thanks to the tqdm maintainers for their excellent work on integrating progress bars with Pandas.
