LIDA: Automatic Generation of Visualizations from Data
Project description
LIDA: Automatic Generation of Visualizations and Infographics using Large Language Models
LIDA uses off-the-shelf large language models to generate grammar-agnostic visualization specifications and data-faithful infographics.
Note:: To create visualizations, LIDA generates and executes code. Ensure that you run LIDA in a secure environment. Acknowledge this by setting the environment variable
LIDA_ALLOW_CODE_EVAL=1
How it works
LIDA comprises of 4 modules - A SUMMARIZER that converts data into a rich but compact natural language summary, a GOAL EXPLORER that enumerates visualization goals given the data, a VISGENERATOR that generates, refines, executes and filters visualization code and an INFOGRAPHER module (tbd) that yields data-faithful stylized graphics using IGMs. LIDA provides a python api, and a hybrid user interface (direct manipulation and multilingual natural language) for interactive chart, infographics and data story generation.
Details on the components of LIDA are described in the paper here and in this tutorial notebook.
Requirements and Installation
Verify Environment - Python 3.10+.
Setup and verify that your python environment is python 3.10
or higher (preferably, use Conda).
Once requirements are met, setup your api key and run the following command to install the library in the repository root:
Setup your openai api key
export OPENAI_API_KEY=<your key>
pip install lida
Alternatively you can install the library in dev model by cloning this repo and running pip install -e .
in the repository root.
Features and Python API
LIDA provides a python
api for generating visualizations and infographics - data summary generation, visualization goals, visualization generation, visualization editing. Learn more about the api (e.g., switching between LLM providers such as Cohere, PaLM, Huggingface etc) by running the tutorial notebook.
Data Summarization
from lida.modules import Manager
lida = Manager()
summary = lida.summarize("data/cars.json") # generate data summary
Visualization Goal Generation
goals = lida.generate_goals(summary, n=5) # generate goals
Visualization Generation
# generate code specifications for charts
vis_specs = lida.generate_viz(summary=summary, goal=goals[0], library="matplotlib") # altair, matplotlib etc
# execute code to return charts (raster images or other formats)
charts = lida.execute_viz(code_specs=vis_specs)
Visualization Editing
# modify chart using natural language
instructions = ["convert this to a bar chart", "change the color to red", "change y axes label to Fuel Efficiency"]
vis_specs = lida.edit_viz(code=charts[0].code, summary=summary, instructions=instructions, library="matplotlib")
edited_chartspecs = lida.execute_viz(code_specs=vis_specs, data=manager.data)
LIDA also supports other operations like visualization explanations, repair, recommendation etc. See the tutorial notebook for more details.
Getting Started
The fastest and recommended way to get started after installation will be to try out the web ui or run the tutorial notebook.
Web UI
You can use the library from the bundled ui by running the following command:
lida ui --port=8080
Then navigate to http://localhost:8080/ in your browser.
Finally, you can call lida from your application via its web api. To view the web api specification, navigate to http://localhost:8080/api/docs
in your browser.
Documentation and Citation
A short paper describing LIDA (Accepted at ACL 2023 Conference) is available here.
@article{dibia2023lida,
title={LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models},
author={Victor Dibia},
year={2023},
eprint={2303.02927},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
LIDA builds on insights in automatic generation of visualizaiton from an earlier paper - Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.