LIDA: Automatic Generation of Visualizations from Data

LIDA: Automatic Generation of Visualizations and Infographics using Large Language Models

LIDA uses off-the-shelf large language models to generate grammar-agnostic visualization specifications and data-faithful infographics.

Note: To create visualizations, LIDA generates and executes code. Ensure that you run LIDA in a secure environment. Acknowledge this by setting the environment variable LIDA_ALLOW_CODE_EVAL=1.
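
If you drive LIDA from a Python script or notebook rather than a shell, the same flag can be set in-process. This is a minimal sketch, assuming the flag is read from the process environment and that setting it before importing lida is sufficient; the helper `code_eval_allowed` is hypothetical and not part of LIDA.

```python
import os

# Opt in to code execution explicitly. Only do this in an environment you
# trust, since generated visualization code will be executed.
# Assumption: set this before importing lida so the flag is visible to it.
os.environ["LIDA_ALLOW_CODE_EVAL"] = "1"


def code_eval_allowed() -> bool:
    """Hypothetical helper: report whether the opt-in flag is set to 1."""
    return os.environ.get("LIDA_ALLOW_CODE_EVAL") == "1"


print(code_eval_allowed())
```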

How it works

LIDA comprises 4 modules: a SUMMARIZER that converts data into a rich but compact natural language summary, a GOAL EXPLORER that enumerates visualization goals given the data, a VISGENERATOR that generates, refines, executes and filters visualization code, and an INFOGRAPHER module (tbd) that yields data-faithful stylized graphics using image generation models (IGMs). LIDA provides a Python API and a hybrid user interface (direct manipulation and multilingual natural language) for interactive chart, infographic and data story generation.

Details on the components of LIDA are described in the paper here and in this tutorial notebook.

Requirements and Installation

Verify Environment - Python 3.10+. Set up and verify that your Python environment is Python 3.10 or higher (preferably, use Conda).

Once requirements are met, set up your OpenAI API key and run the following commands to install the library:

export OPENAI_API_KEY=<your key>
pip install lida

Alternatively, you can install the library in dev mode by cloning this repo and running pip install -e . in the repository root.
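
Before the first run it can help to confirm both requirements at once. The check below is a minimal sketch, not part of LIDA: `preflight` and `MIN_PYTHON` are hypothetical names, and it only assumes what is stated above (Python 3.10 or higher and an `OPENAI_API_KEY` environment variable).

```python
import os
import sys

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 10)


def preflight(version_info=sys.version_info, environ=os.environ) -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.10+ is required, found {found}")
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

The version and environment are passed as parameters so the check can be exercised without changing the real interpreter or shell state.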

Features and Python API

LIDA provides a Python API for generating visualizations and infographics: data summary generation, visualization goal generation, visualization generation, and visualization editing. Learn more about the API (e.g., switching between LLM providers such as Cohere, PaLM, Huggingface, etc.) by running the tutorial notebook.

Data Summarization

from lida.modules import Manager

lida = Manager()
summary = lida.summarize("data/cars.json") # generate data summary

Visualization Goal Generation

goals = lida.generate_goals(summary, n=5) # generate goals

Visualization Generation

# generate code specifications for charts
vis_specs = lida.generate_viz(summary=summary, goal=goals[0], library="matplotlib") # altair, matplotlib etc

# execute code to return charts (raster images or other formats)
charts = lida.execute_viz(code_specs=vis_specs)
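
The three steps above can be chained into one helper. This is a sketch under stated assumptions: `first_chart` is a hypothetical wrapper, and it relies only on the four `Manager` methods shown in this README (`summarize`, `generate_goals`, `generate_viz`, `execute_viz`) with the keyword arguments used above. The manager is passed in, so the helper itself does not import lida.

```python
def first_chart(manager, data_path, n_goals=5, library="matplotlib"):
    """Summarize a dataset, pick the first generated goal, and render it.

    `manager` is expected to behave like the Manager instance shown above.
    Returns a (summary, goal, charts) tuple.
    """
    summary = manager.summarize(data_path)
    goals = manager.generate_goals(summary, n=n_goals)
    if not goals:
        # Guard added for this sketch; the README does not say whether an
        # empty goal list can occur.
        raise ValueError("no visualization goals were generated")
    goal = goals[0]
    vis_specs = manager.generate_viz(summary=summary, goal=goal, library=library)
    charts = manager.execute_viz(code_specs=vis_specs)
    return summary, goal, charts
```

With the `lida` instance created earlier, a call would look like `summary, goal, charts = first_chart(lida, "data/cars.json")`.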

Visualization Editing

# modify chart using natural language
instructions = ["convert this to a bar chart", "change the color to red", "change y axes label to Fuel Efficiency"]
vis_specs = lida.edit_viz(code=charts[0].code,  summary=summary, instructions=instructions, library="matplotlib")
# execute the edited code; `lida` is the Manager instance created above
edited_charts = lida.execute_viz(code_specs=vis_specs, data=lida.data)

LIDA also supports other operations like visualization explanations, repair, recommendation etc. See the tutorial notebook for more details.

Getting Started

The fastest and recommended way to get started after installation is to try out the web UI or run the tutorial notebook.

Web UI

You can use the library from the bundled ui by running the following command:

lida ui  --port=8080

Then navigate to http://localhost:8080/ in your browser.

Finally, you can call lida from your application via its web api. To view the web api specification, navigate to http://localhost:8080/api/docs in your browser.
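
To check from a script that the UI server is running before calling it, a plain HTTP probe is enough. This is a minimal sketch using only the standard library and the two URLs mentioned above; `lida_urls` and `is_up` are hypothetical helpers, and no specific web API endpoints are assumed beyond the `/api/docs` page.

```python
import http.client
import urllib.request


def lida_urls(host="localhost", port=8080):
    """Build the UI and API docs URLs for a locally running `lida ui` server."""
    base = f"http://{host}:{port}"
    return {"ui": f"{base}/", "api_docs": f"{base}/api/docs"}


def is_up(url, timeout=2.0):
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply.
        return False


if __name__ == "__main__":
    for name, url in lida_urls().items():
        print(name, url, "up" if is_up(url) else "not reachable")
```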

Documentation and Citation

A short paper describing LIDA (accepted at the ACL 2023 conference) is available here.

@article{dibia2023lida,
      title={LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models},
      author={Victor Dibia},
      year={2023},
      eprint={2303.02927},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

LIDA builds on insights in automatic generation of visualizations from an earlier paper: Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks.
