Power up your data science workflow with ChatGPT
pandas-gpt 
Power up your data science workflow with LLMs.
pandas-gpt is a Python library for doing almost anything with a pandas DataFrame using ChatGPT or any other Large Language Model (LLM).
Installation
pip install pandas-gpt[openai]
The openai and litellm dependencies are optional extras: the command above installs the openai extra, and pandas-gpt[litellm] installs the litellm extra.
Next, set the OPENAI_API_KEY environment variable to your OpenAI API key, or use the following code snippet:
import openai
openai.api_key = '<API Key>'
If you're looking for a free alternative to the OpenAI API, we encourage using Google Gemini for code completion:
pip install pandas-gpt[litellm]
import pandas_gpt
pandas_gpt.completer = pandas_gpt.LiteLLM('gemini/gemini-1.5-pro', api_key='...')
Examples
Setup and usage examples are available in this Google Colab notebook.
import pandas as pd
import pandas_gpt
df = pd.read_csv('https://gist.githubusercontent.com/bluecoconut/9ce2135aafb5c6ab2dc1d60ac595646e/raw/c93c3500a1f7fae469cba716f09358cfddea6343/sales_demo_with_pii_and_all_states.csv')
# Data transformation
df = df.ask('drop purchases from Laurenchester, NY')
df = df.ask('add a new Category column with values "cheap", "regular", or "expensive"')
# Queries
weekday = df.ask('which day of the week had the largest number of orders?')
top_10 = df.ask('what are the top 10 most popular products, as a table')
# Plotting
df.ask('plot monthly and hourly sales')
top_10.ask('horizontal bar plot with pastel colors')
# Allow changes to original dataset
df.ask('do something interesting', mutable=True)
# Show source code before running
df.ask('convert prices from USD to GBP', verbose=True)
Custom Language Models
It's possible to use a different language model with the completer config option:
import pandas_gpt
# Global default
pandas_gpt.completer = pandas_gpt.OpenAI('gpt-3.5-turbo')
# Custom completer for a specific request
df.ask('Do something interesting with the data', completer=pandas_gpt.LiteLLM('gemini/gemini-1.5-pro'))
By default, API keys are picked up from environment variables such as OPENAI_API_KEY.
It's also possible to specify an API key for a particular call:
df.ask('Do something important with the data', completer=pandas_gpt.OpenAI('gpt-4o', api_key='...'))
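As a small sketch of the environment-variable approach described above, the helper below reads a key and fails early with a clear message when it is missing. The name require_api_key is hypothetical and not part of pandas-gpt; the commented lines show how its result might be passed to a completer through the api_key argument shown earlier.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper (not part of pandas-gpt): raises a clear error
    instead of letting a later API call fail with a less obvious one.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return key


# Possible usage, assuming pandas_gpt is installed and df is a DataFrame:
#   import pandas_gpt
#   completer = pandas_gpt.OpenAI('gpt-4o', api_key=require_api_key())
#   df.ask('Do something important with the data', completer=completer)
```

Reading the key at the call site keeps it out of source files while still making a missing key obvious before any request is sent.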
OpenAI
pandas_gpt.completer = pandas_gpt.OpenAI('gpt-4o')
LiteLLM
pandas_gpt.completer = pandas_gpt.LiteLLM('gemini/gemini-1.5-pro')
Local (Huggingface)
pandas_gpt.completer = pandas_gpt.LiteLLM('huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct')
OpenRouter
pandas_gpt.completer = pandas_gpt.OpenRouter('anthropic/claude-3.5-sonnet')
Anything
def my_custom_completer(prompt: str) -> str:
    # Use an LLM or any other method to create a `process()` function that
    # takes a pandas DataFrame as a single argument, does some operations on it,
    # and returns a DataFrame.
    return 'def process(df): ...'
pandas_gpt.completer = my_custom_completer
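To make the completer contract above concrete, here is a minimal sketch of a rule-based completer together with a check that its output defines process(df). The names head_completer and defines_process are hypothetical, and the check relies only on the standard-library ast module. How pandas-gpt itself validates or executes the returned source is not covered here, so treat this as an illustration rather than the library's behavior.

```python
import ast


def head_completer(prompt: str) -> str:
    """Hypothetical completer: ignores the prompt and always returns
    source code for a process() function that keeps the first 5 rows."""
    return "def process(df):\n    return df.head(5)\n"


def defines_process(source: str) -> bool:
    """Return True if `source` parses as Python and defines a top-level
    function named process that takes exactly one positional argument."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == "process":
            return len(node.args.args) == 1
    return False


source = head_completer("show the first rows")
print(defines_process(source))

# Possible usage, assuming pandas_gpt is installed:
#   import pandas_gpt
#   pandas_gpt.completer = head_completer
```

A check like this can be run on a completer's output before wiring it in, since a completer that returns malformed source would otherwise only fail at request time.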
If you want to use a fully customized API host such as Azure OpenAI Service,
you can globally configure the openai and pandas-gpt packages:
import openai
openai.api_type = 'azure'
openai.api_base = '<Endpoint>'
openai.api_version = '<Version>'
openai.api_key = '<API Key>'
import pandas_gpt
pandas_gpt.completer = pandas_gpt.OpenAI(
    model='gpt-3.5-turbo',
    engine='<Engine>',
    deployment_id='<Deployment ID>',
)
Alternatives
- GitHub Copilot: General-purpose code completion (paid subscription)
- Sketch: AI-powered data summarization and code suggestions (works without an API key)
Disclaimer
Please note that the limitations of ChatGPT also apply to this library: the generated code can be wrong or unsafe. I recommend using pandas-gpt in a sandboxed environment such as Google Colab, Kaggle, or GitPod.
File details
Details for the file pandas_gpt-1.0.0.tar.gz.
File metadata
- Download URL: pandas_gpt-1.0.0.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa6d7142e70d775d24e547a00fe6a58d3e14c82959e2934d6e5d2633e1e91275 |
| MD5 | a19e392eee14f7d4d62bec0ad0de486a |
| BLAKE2b-256 | 86356209ecfeb8555fe44c86bc5b097dac4728d904412b90c521b94263406687 |
File details
Details for the file pandas_gpt-1.0.0-py3-none-any.whl.
File metadata
- Download URL: pandas_gpt-1.0.0-py3-none-any.whl
- Upload date:
- Size: 7.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 26f15359cf49798757f986c3061dc0a490eda7fd6af9f0f0ca4c29005261d786 |
| MD5 | f2ad41f942867047b5565d49f1d3b7c4 |
| BLAKE2b-256 | 402f1782aa04fcafc462352f8d336f1fdb445061fb76fe776552da3629a48716 |