LLM assistant for the development of Spark applications
LLM Assistant for Apache Spark
Installation
pip install spark-llm
Usage
Initialization
from langchain.chat_models import ChatOpenAI
from spark_llm import SparkLLMAssistant
llm = ChatOpenAI(model_name='gpt-4') # using gpt-4 can achieve better results
assistant = SparkLLMAssistant(llm=llm)
assistant.activate()  # activate partial functions for Spark DataFrame
Data Ingestion
auto_df = assistant.create_df("2022 USA national auto sales by brand")
auto_df.show(n=5)
rank | brand | us_sales_2022 | sales_change_vs_2021 |
---|---|---|---|
1 | Toyota | 1849751 | -9 |
2 | Ford | 1767439 | -2 |
3 | Chevrolet | 1502389 | 6 |
4 | Honda | 881201 | -33 |
5 | Hyundai | 724265 | -2 |
Plot
auto_df.llm.plot()
To plot with an instruction:
auto_df.llm.plot("pie chart for top 5 brands and the others' market shares")
DataFrame Transformation
auto_top_growth_df = auto_df.llm.transform("top brand with the highest growth")
auto_top_growth_df.show()
brand | us_sales_2022 | sales_change_vs_2021 |
---|---|---|
Cadillac | 134726 | 14 |
DataFrame Explanation
auto_top_growth_df.llm.explain()
In summary, this dataframe is retrieving the brand with the highest sales change in 2022 compared to 2021. It presents the results sorted by sales change in descending order and only returns the top result.
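The logic described in that explanation (sort by sales change in descending order, keep only the top result) can be sketched in plain Python. This is an illustration only, not the code the assistant generates: the helper name `top_growth` is hypothetical, and the rows are the ones shown in the sample outputs above rather than the full table.

```python
# Rows copied from the sample outputs in this README (a subset of the data).
rows = [
    {"brand": "Toyota", "us_sales_2022": 1849751, "sales_change_vs_2021": -9},
    {"brand": "Ford", "us_sales_2022": 1767439, "sales_change_vs_2021": -2},
    {"brand": "Chevrolet", "us_sales_2022": 1502389, "sales_change_vs_2021": 6},
    {"brand": "Honda", "us_sales_2022": 881201, "sales_change_vs_2021": -33},
    {"brand": "Hyundai", "us_sales_2022": 724265, "sales_change_vs_2021": -2},
    {"brand": "Cadillac", "us_sales_2022": 134726, "sales_change_vs_2021": 14},
]


def top_growth(rows, n=1):
    """Hypothetical helper: sort by sales change, descending, and keep the top n rows."""
    ranked = sorted(rows, key=lambda r: r["sales_change_vs_2021"], reverse=True)
    return ranked[:n]


top = top_growth(rows)
print(top[0]["brand"], top[0]["sales_change_vs_2021"])
```

On these rows the sketch selects Cadillac, which matches the transformed DataFrame shown above.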
Refer to example.ipynb for more detailed usage examples.
DataFrame Attribute Verification
auto_top_growth_df.llm.verify("expect sales change percentage to be between -100 and 100")
result: True
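For intuition, the property being verified here amounts to a range check on the sales change column. The sketch below shows such a check in plain Python. It is an assumption about what the check means, not the library's implementation, and `within_range` is a hypothetical helper name.

```python
def within_range(values, low=-100, high=100):
    """Hypothetical helper: True if every value lies in the closed interval [low, high]."""
    return all(low <= v <= high for v in values)


# sales_change_vs_2021 values from auto_top_growth_df as shown above (one row: Cadillac).
changes = [14]
result = within_range(changes)
print(f"result: {result}")
```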
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
License
Licensed under the Apache License 2.0.