YOLOPandas

Interact with Pandas objects via LLMs and LangChain.

YOLOPandas lets you specify commands with natural language and execute them directly on Pandas objects. You can preview the code before executing, or set yolo=True to execute the code straight from the LLM.

Warning: YOLOPandas will execute arbitrary Python code on the machine it runs on. This is a dangerous thing to do.

https://user-images.githubusercontent.com/26529506/214591990-c295a283-b9e6-4775-81e4-28917183ebb1.mp4

Quick Install

pip install yolopandas

Basic usage

YOLOPandas adds an llm accessor to Pandas dataframes.

from yolopandas import pd

df = pd.DataFrame(
    [
        {"name": "The Da Vinci Code", "type": "book", "price": 15, "quantity": 300, "rating": 4},
        {"name": "Jurassic Park", "type": "book", "price": 12, "quantity": 400, "rating": 4.5},
        {"name": "Jurassic Park", "type": "film", "price": 8, "quantity": 6, "rating": 5},
        {"name": "Matilda", "type": "book", "price": 5, "quantity": 80, "rating": 4},
        {"name": "Clockwork Orange", "type": None, "price": None, "quantity": 20, "rating": 4},
        {"name": "Walden", "type": None, "price": None, "quantity": 100, "rating": 4.5},
    ],
)

df.llm.query("What item is the least expensive?")

The above will generate Pandas code to answer the question, and prompt the user to accept or reject the proposed code. Accepting it in this case will return a Pandas dataframe containing the result.

Alternatively, you can execute the LLM output without first previewing it:

df.llm.query("What item is the least expensive?", yolo=True)
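The preview-then-run flow can be pictured with a small sketch. This is not YOLOPandas' actual implementation: run_generated_code, its confirm parameter, and the hard-coded generated string are hypothetical stand-ins for the LLM round trip, and plain eval is used only to show why running unreviewed code is risky.

```python
import pandas as pd


def run_generated_code(df, code, yolo=False, confirm=input):
    """Hypothetical helper: show generated code, then evaluate it if accepted."""
    if not yolo:
        # Preview mode: print the code and ask before running anything.
        print(code)
        answer = confirm("Accept and run this code? [y/N] ")
        if answer.strip().lower() != "y":
            return None
    # eval executes arbitrary Python, which is exactly the danger the warning describes.
    return eval(code, {"pd": pd, "df": df})


df = pd.DataFrame(
    [
        {"name": "Matilda", "price": 5},
        {"name": "Jurassic Park", "price": 8},
    ]
)

# Stand-in for what an LLM might return for "What item is the least expensive?"
generated = "df[df['price'] == df['price'].min()]"

# The confirm callables below simulate a user typing "y" or "n".
result = run_generated_code(df, generated, confirm=lambda prompt: "y")
rejected = run_generated_code(df, generated, confirm=lambda prompt: "n")
```

Passing yolo=True to this sketch skips the confirmation step entirely, which mirrors the trade-off between convenience and safety described above.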

The return value of .query is whatever the generated code evaluates to; we do not constrain it. For instance, "Show me products under $10" will return a dataframe, while the query "Split the dataframe into two, 1/3 in one, 2/3 in the other. Return (df1, df2)" can return a tuple of two dataframes. You can also chain queries together, for instance:

df.llm.query("Group by type and take the mean of all numeric columns.", yolo=True).llm.query("Make a bar plot of the result and use a log scale.", yolo=True)
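To make the different return types concrete, here is a sketch of plain Pandas code that queries like the ones above might translate to. The code an LLM actually generates will vary; these lines are only an assumed, hand-written equivalent and do not call YOLOPandas at all.

```python
import pandas as pd

df = pd.DataFrame(
    [
        {"name": "The Da Vinci Code", "type": "book", "price": 15, "quantity": 300, "rating": 4},
        {"name": "Jurassic Park", "type": "book", "price": 12, "quantity": 400, "rating": 4.5},
        {"name": "Jurassic Park", "type": "film", "price": 8, "quantity": 6, "rating": 5},
        {"name": "Matilda", "type": "book", "price": 5, "quantity": 80, "rating": 4},
        {"name": "Clockwork Orange", "type": None, "price": None, "quantity": 20, "rating": 4},
        {"name": "Walden", "type": None, "price": None, "quantity": 100, "rating": 4.5},
    ],
)

# "Show me products under $10" -> a dataframe (rows with a missing price drop out)
under_10 = df[df["price"] < 10]

# "Split the dataframe into two, 1/3 in one, 2/3 in the other" -> a tuple of dataframes
cut = len(df) // 3
df1, df2 = df.iloc[:cut], df.iloc[cut:]

# "Group by type and take the mean of all numeric columns" -> a dataframe indexed by type
means = df.groupby("type").mean(numeric_only=True)
```

Because the grouped result is itself a dataframe, it carries the llm accessor too, which is what makes chaining a second query possible.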

See the example notebook for more ideas.

LangChain Components

This package uses several LangChain components, making it easy to work with if you are familiar with LangChain. In particular, it utilizes the LLM, Chain, and Memory abstractions.

LLM Abstraction

Because YOLOPandas works with LangChain's LLM abstraction, you can swap in different LLM providers. You can do this in a few different ways:

  1. You can change the default LLM by specifying a config path using the LLPANDAS_LLM_CONFIGURATION environment variable. The file at this path should be in one of the accepted formats.

  2. If you have a LangChain LLM wrapper in memory, you can set it as the default LLM to use by doing:

import yolopandas
yolopandas.set_llm(llm)

  3. You can set the LLM wrapper to use for a specific dataframe by doing: df.llm.reset_chain(llm=llm)
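
For the first option, a config file could be prepared along these lines. This is a sketch under assumptions: the keys in llm_config follow LangChain's JSON serialization for its OpenAI wrapper as best understood and may differ between LangChain versions, the model name is purely illustrative, and setting the variable before importing yolopandas assumes the default LLM is resolved at import time.

```python
import json
import os
import tempfile

# Assumed example of a serialized LangChain LLM (keys may vary by LangChain version).
llm_config = {
    "_type": "openai",
    "model_name": "text-davinci-003",  # illustrative model name
    "temperature": 0.0,
}

# Write the config to disk; a temporary directory is used here for illustration.
config_path = os.path.join(tempfile.mkdtemp(), "llm.json")
with open(config_path, "w") as f:
    json.dump(llm_config, f, indent=2)

# Point YOLOPandas at the file via the environment variable named above.
# Assumption: this should happen before yolopandas is imported.
os.environ["LLPANDAS_LLM_CONFIGURATION"] = config_path
```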

Chain Abstraction

Because YOLOPandas works with LangChain's Chain abstraction, you can swap in different chains. This is useful if you want to customize the prompt or the chain itself.

To use a custom chain for a particular dataframe, you can do:

df.llm.set_chain(chain)

If you ever want to reset the chain to the base chain, you can do:

df.llm.reset_chain()

Memory Abstraction

The default chain used by YOLOPandas utilizes the LangChain concept of memory. This allows it to "remember" previous commands, making it possible to ask follow-up questions or request commands that build on previous interactions.

For example, the query "Make a seaborn plot of price grouped by type" can be followed with "Can you use a dark theme, and pastel colors?" upon viewing the initial result.
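
Conceptually, memory means earlier exchanges are fed back to the LLM along with the new query. The sketch below illustrates that idea only; build_prompt, the history list, and the stored code string are hypothetical and are not how LangChain or YOLOPandas implement memory.

```python
def build_prompt(history, query):
    """Hypothetical helper: prepend earlier exchanges to the new query."""
    lines = []
    for past_query, past_code in history:
        lines.append(f"User: {past_query}")
        lines.append(f"Code: {past_code}")
    lines.append(f"User: {query}")
    return "\n".join(lines)


history = []

first = "Make a seaborn plot of price grouped by type"
prompt_1 = build_prompt(history, first)

# Stand-in for whatever code the LLM returned for the first query.
history.append((first, "sns.barplot(data=df, x='type', y='price')"))

follow_up = "Can you use a dark theme, and pastel colors?"
prompt_2 = build_prompt(history, follow_up)
```

The second prompt contains the first query and its code, so a follow-up that only mentions themes and colors still has the plot it refers to in context.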

By default, memory is turned on. To turn it off by default, set the environment variable LLPANDAS_USE_MEMORY=False.

If you are resetting the chain, you can also specify whether to use memory there:

df.llm.reset_chain(use_memory=False)
