
A package for visualizing embedding spaces from Hugging Face models

Project description

LlmEmbeddingXrVizualization


Google Colab Demonstration: https://colab.research.google.com/drive/1ngNpXc42u_02hHu2kFF3LyljNxWLAaaP#scrollTo=1uBORXM-ATLG

A package for visualizing Large Language Model (LLM) embedding spaces from Hugging Face models with just the model name as input!

Inspired by the belief that data should be experienced, not just viewed, we're bridging the gap between 2D plots and spatial understanding of the LLM embedding space. The fundamental limitation of 2D screens, which must compress three dimensions into two, has always forced us to sacrifice either information or clarity. Our platform removes that constraint by turning the embeddings into immersive XR visualizations, using nothing but the name of the model on Hugging Face. Every visualization can be viewed on Meta Quest XR headsets. We're not just plotting data: we're creating a way to discover insights through spatial exploration, one that respects the dimensionality of the data.

Each word or sentence embedding is positioned in virtual space at its computed coordinates, so the relative distances and scale of the representation are preserved. This precision is particularly useful when visualizing LLM embedding spaces, because it lets users physically explore how concepts are related within these models. By walking through the three-dimensional embedding space, researchers can intuitively check whether semantically similar concepts cluster together and spot unexpected relationships that traditional 2D visualizations might miss.
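The clustering check described above can also be approximated numerically. The following is a minimal sketch, not part of this package: it assumes you already have 3D coordinates for each word together with its domain label, and the function name `mean_pairwise_distances` and the toy coordinates are hypothetical. It compares the average distance between points in the same domain with the average distance between points in different domains.

```python
import math
from itertools import combinations


def mean_pairwise_distances(points):
    """Return (mean within-domain distance, mean between-domain distance).

    `points` is a list of (domain, (x, y, z)) tuples. This helper is
    hypothetical and is not provided by LlmEmbeddingXrVizualization.
    """
    within, between = [], []
    for (dom_a, pos_a), (dom_b, pos_b) in combinations(points, 2):
        distance = math.dist(pos_a, pos_b)  # Euclidean distance in 3D
        if dom_a == dom_b:
            within.append(distance)
        else:
            between.append(distance)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(within), mean(between)


# Toy coordinates, invented purely for illustration.
toy_points = [
    ("animals", (0.0, 0.0, 0.0)),
    ("animals", (1.0, 0.0, 0.0)),
    ("food", (10.0, 0.0, 0.0)),
    ("food", (11.0, 0.0, 0.0)),
]

within_mean, between_mean = mean_pairwise_distances(toy_points)
print(within_mean < between_mean)  # True when same-domain points sit closer together
```

If the within-domain mean is clearly smaller than the between-domain mean, the domains form separated clusters in the reduced space. This is only a rough numeric complement to exploring the space in a headset.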

Installation

pip install LlmEmbeddingXrVizualization

Usage

llm-embedding-viz --help


llm-embedding-viz


Example website for opening the generated 3D object (a '.dae' file).

Example experience on Meta Quest 3 (PlotVerseXR trailer).

llm-embedding-viz --model_name "distilbert/distilbert-base-uncased-finetuned-sst-2-english" -c path_to_your_labels_domains.csv -r isomap -s

The CSV file must have 'domains' and 'words' columns.
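As an illustration of the expected CSV layout, here is a minimal sketch that writes such a file with Python's standard `csv` module and checks that both required columns are present. The helper names `write_labels_csv` and `has_required_columns`, and the example domains and words, are assumptions made for this example and are not part of the package.

```python
import csv
import io

REQUIRED_COLUMNS = {"domains", "words"}


def write_labels_csv(rows, fileobj):
    """Write (domain, word) pairs under a 'domains,words' header."""
    writer = csv.DictWriter(fileobj, fieldnames=["domains", "words"])
    writer.writeheader()
    for domain, word in rows:
        writer.writerow({"domains": domain, "words": word})


def has_required_columns(fileobj):
    """Return True if the CSV header contains both required columns."""
    reader = csv.DictReader(fileobj)
    return REQUIRED_COLUMNS.issubset(reader.fieldnames or [])


# Hypothetical example data: these domains and words are illustrative only.
rows = [
    ("animals", "cat"),
    ("animals", "dog"),
    ("food", "bread"),
    ("food", "cheese"),
]

buffer = io.StringIO()
write_labels_csv(rows, buffer)
buffer.seek(0)
print(has_required_columns(buffer))
```

To produce a file on disk instead of an in-memory buffer, open it with `open(path, "w", newline="")` and pass the file object to `write_labels_csv`, then give that path to the `-c` option.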


Generated plot for the -s flag.

References

This idea started at a hackathon: https://devpost.com/software/plotversexr.

Generative AI tools such as GitHub Copilot and ChatGPT were used extensively in this project.

Duke University Explainable AI class: AIPI 590.
