Visualize text embeddings with interactive plots

embedding-visualizer

Interactive 2D visualization of text embeddings using OpenAI's embedding API and Plotly.

Install

pip install embedding-visualizer

Or, for development, from a checkout of the repository:

cd embedding_visualizer
uv sync

The package calls OpenAI's embedding API, so the OPENAI_API_KEY environment variable must be set.
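
A missing key may only surface once embeddings are requested, so it can help to check for it up front. The following is a minimal sketch; `require_api_key` is a hypothetical helper written for this illustration and is not part of embedding-visualizer.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key, or raise a clear error if it is unset.

    Hypothetical helper for illustration; not part of embedding-visualizer.
    `env` defaults to os.environ and exists only to make the check testable.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before requesting embeddings"
        )
    return key
```

Calling `require_api_key()` at the top of a script fails fast with a readable message instead of an error from deeper in the call stack.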

Usage

from embedding_visualizer import visualize_embeddings, PrincipalComponent, TextEmbedding

docs = [
    {
        "text": "This text will be embedded",
        "label": "optional, groups points in the legend",
        "color": "optional, color for the point/label group",
        "line-id": "optional, connects points with the same id",
        "hover": "optional hover text",
    },
    # ...
]

# t-SNE or PCA projection
plot = visualize_embeddings(docs=docs, projection="t-sne")  # or "pca"

# Custom per-axis projection
plot = visualize_embeddings(
    docs=docs,
    x_projection=PrincipalComponent(1),
    y_projection=TextEmbedding("some text to project onto"),
    title="My Embedding Plot",
)

plot.display()              # show in Jupyter or browser
plot.to_html("plot.html")   # self-contained HTML file

Doc fields

Field     Required   Description
text      yes        Text to embed
label     no         Legend group name; points with the same label share a color
color     no         Point color. If label is set, applies to the whole group
line-id   no         Connects points with the same id in document order
hover     no         Custom hover text (defaults to first 100 chars of text)
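
Since only `text` is required and the other keys are free-form, a typo such as `lineid` could go unnoticed. The sketch below checks a list of docs against the fields listed above; `check_docs` and `ALLOWED_FIELDS` are hypothetical names for this illustration, and the library may perform its own validation.

```python
# Field names taken from the Doc fields list; everything else here is illustrative.
ALLOWED_FIELDS = {"text", "label", "color", "line-id", "hover"}


def check_docs(docs):
    """Validate doc dicts and return them unchanged.

    Hypothetical helper; not part of embedding-visualizer.
    """
    for i, doc in enumerate(docs):
        text = doc.get("text")
        if not isinstance(text, str) or not text:
            raise ValueError(f"doc {i}: 'text' is required and must be a non-empty string")
        unknown = set(doc) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"doc {i}: unknown field(s) {sorted(unknown)}")
    return docs


docs = check_docs([
    # Two points in the same legend group, joined by a line via a shared line-id.
    {"text": "The cat sat on the mat", "label": "animals", "line-id": "a"},
    {"text": "The dog chased the cat", "label": "animals", "line-id": "a"},
    # A point in a different group with custom hover text.
    {"text": "Stocks fell on Monday", "label": "finance", "hover": "news headline"},
])
```

The validated `docs` list can then be passed as the `docs` argument of `visualize_embeddings`.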

Projections

  • projection="t-sne" — t-SNE with cosine metric (default)
  • projection="pca" — PCA
  • PrincipalComponent(n) — project onto the nth principal component (1-indexed)
  • TextEmbedding("text") — cosine similarity with a reference text's embedding
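
To make the `TextEmbedding` axis concrete: each point's coordinate on that axis is its cosine similarity to the reference embedding. The sketch below computes that value in plain Python on toy 3-dimensional vectors. The vectors are made up for illustration, and the code shows the standard formula rather than the library's internal implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (illustrative values only).
reference = [1.0, 0.0, 0.0]
points = [
    [2.0, 0.0, 0.0],  # same direction as the reference
    [0.0, 3.0, 0.0],  # orthogonal to the reference
    [1.0, 1.0, 0.0],  # 45 degrees from the reference
]

# One axis coordinate per point.
axis_values = [cosine_similarity(p, reference) for p in points]
```

Because cosine similarity ignores vector length, the first point scores the same as the reference itself even though it is twice as long.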

Example

examples/repo_files.py embeds every Python file in this repository at multiple truncation points and connects versions of the same file with lines:

uv run python examples/repo_files.py
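
The idea behind that example can be sketched without the script itself: build one doc per truncation point and give all versions of a file the same `line-id` so they are joined by a line. The function below is a hypothetical illustration, not the code in examples/repo_files.py; in particular, the truncation fractions are an assumption.

```python
def truncation_docs(name, text, fractions=(0.25, 0.5, 1.0)):
    """Build one doc per truncation point for a single file.

    Hypothetical sketch; the fractions are assumed, not taken from the example script.
    """
    if not text:
        return []  # nothing to embed for an empty file
    docs = []
    for frac in fractions:
        cut = max(1, int(len(text) * frac))  # keep at least one character
        docs.append({
            "text": text[:cut],
            "label": name,        # one legend entry per file
            "line-id": name,      # connects the versions in document order
            "hover": f"{name} (first {cut} chars)",
        })
    return docs


docs = truncation_docs("example.py", "abcdefgh")
```

Docs for several files can be concatenated into one list; points are connected in document order, so each file's versions should stay in increasing truncation order.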

Caching

Embeddings are cached to ~/.cache/embedding_visualizer/ so repeated runs don't re-call the API.
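
The general pattern behind such a cache can be sketched as follows. This is a generic illustration under assumptions: the hashing scheme, the JSON file layout, and the names `cached_embedding`, `embed_fn`, and `model` are all hypothetical and do not describe the package's actual cache format.

```python
import hashlib
import json
from pathlib import Path


def cached_embedding(text, embed_fn, cache_dir, model="example-model"):
    """Return embed_fn(text), reusing a JSON file on disk when one exists.

    Hypothetical sketch; not the cache layout used by embedding-visualizer.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key on model name and text so a model change does not reuse stale vectors.
    key = hashlib.sha256(f"{model}\n{text}".encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    vector = list(embed_fn(text))  # the only place the (paid) API would be called
    path.write_text(json.dumps(vector), encoding="utf-8")
    return vector
```

Here `embed_fn` stands in for whatever function calls the embedding API; a second call with the same text and model reads the stored vector instead of invoking it again.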

Download files

Source Distribution

embedding_visualizer-0.1.2.tar.gz (106.1 kB)

Uploaded Source

Built Distribution

embedding_visualizer-0.1.2-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file embedding_visualizer-0.1.2.tar.gz.

File metadata

  • Download URL: embedding_visualizer-0.1.2.tar.gz
  • Size: 106.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for embedding_visualizer-0.1.2.tar.gz
Algorithm     Hash digest
SHA256        3c4c898d40306c9c3f77d822389a946c398b461326e5d2cf9c8102537af02ad1
MD5           2dd2a44b09ab7ea6113c1de7f0debb31
BLAKE2b-256   d6bfc4f409d63d6c0372765f13be8b376692b2f0f1359cad69ebd76e5e75f692

Provenance

The following attestation bundles were made for embedding_visualizer-0.1.2.tar.gz:

Publisher: workflow.yaml on nielsrolf/embedding_visualizer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file embedding_visualizer-0.1.2-py3-none-any.whl.

File hashes

Hashes for embedding_visualizer-0.1.2-py3-none-any.whl
Algorithm     Hash digest
SHA256        e1d979f1cfd1512a8e2065c3addfeafc819842d23e8f0fb181c86c875f9e3483
MD5           d277ed00c4b39bd227fba69696dde067
BLAKE2b-256   3210b0ad266421a6a43fc238f2f6114357287a35dfa6dde43fc315b747b143ed

Provenance

The following attestation bundles were made for embedding_visualizer-0.1.2-py3-none-any.whl:

Publisher: workflow.yaml on nielsrolf/embedding_visualizer
