High-level APIs and training utilities for the LISA Vision-to-Speech model

lisaai

High-level, repo-independent tools for the LISA Vision-to-Speech model.

Features (v0.2.0)

  • True Vision-to-Speech: Direct vision-to-mel-spectrogram synthesis, with no text or CTC intermediate representation.
  • Improved Fusion: Proper cross-attention between vision and audio streams.
  • Training Utilities: Full support for masked modelling, contrastive alignment, and mel-generation losses.
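
For readers unfamiliar with contrastive alignment, here is a generic illustration of the idea: paired vision and audio embeddings are pulled together while mismatched pairs are pushed apart. This is a minimal sketch of a symmetric InfoNCE-style loss in pure Python. It is not taken from lisaai, and the function name, the temperature value, and the choice of InfoNCE are all assumptions; the package's actual loss may differ.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_alignment_loss(vision, audio, temperature=0.07):
    """Hypothetical symmetric InfoNCE over paired embeddings (lists of vectors)."""
    v = [_normalize(x) for x in vision]
    a = [_normalize(x) for x in audio]
    # Cosine similarity of every vision vector with every audio vector.
    logits = [[sum(p * q for p, q in zip(vi, aj)) / temperature for aj in a] for vi in v]
    transposed = [list(col) for col in zip(*logits)]
    # Average the vision-to-audio and audio-to-vision directions.
    return 0.5 * (_cross_entropy_diagonal(logits) + _cross_entropy_diagonal(transposed))
```

With correctly paired inputs the loss is close to zero; swapping the audio rows so that every pair is mismatched makes it large.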

Install

From PyPI:

pip install lisaai --upgrade

From source:

pip install -e .
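
After either install route, the standard library's importlib.metadata can report which version ended up in the environment. A minimal sketch; the helper name installed_version is illustrative and not part of lisaai:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("lisaai")
    print(found if found is not None else "lisaai is not installed")
```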

CLI

  • Inspect a checkpoint (folder or .safetensors):
lisa.inspect "PATH/TO/MODEL_DIR"
  • Compare param counts by saved prefixes:
lisa.compare-params "PATH/TO/MODEL_DIR"

Python API

from lisaai import inspect_checkpoint, compare_param_counts

rep = inspect_checkpoint("PATH/TO/MODEL_DIR")
print(rep["inferred_dimensions"])  # keys: vision_embed_dim, fusion_hidden_dim
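
The report is used above as a dictionary with an inferred_dimensions entry. Here is a minimal sketch of reading it defensively, assuming only that shape. The helper describe_dimensions and the sample values are hypothetical, and no other report keys are assumed:

```python
def describe_dimensions(report):
    """Format the 'inferred_dimensions' entry of a report dict, one line per dimension."""
    dims = report.get("inferred_dimensions") or {}
    if not dims:
        return "no inferred dimensions found"
    return "\n".join(f"{name}: {dims[name]}" for name in sorted(dims))


# Hypothetical sample shaped like the report above; the numbers are placeholders,
# not values from a real checkpoint.
sample = {"inferred_dimensions": {"vision_embed_dim": 768, "fusion_hidden_dim": 512}}
print(describe_dimensions(sample))
```

With lisaai installed, passing the result of inspect_checkpoint("PATH/TO/MODEL_DIR") to the same helper would format a real report.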

Runtime loading is optional and requires the original repo to be available:

from lisaai import load_model
m = load_model()  # will use LISA_MODEL_PATH if set
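
Because load_model() reads LISA_MODEL_PATH when it is set, the variable can also be set from Python before the call. A minimal sketch using only the standard library; the context manager model_path_env is a hypothetical helper, not part of lisaai, and it restores the previous value on exit:

```python
import os
from contextlib import contextmanager


@contextmanager
def model_path_env(path):
    """Temporarily set LISA_MODEL_PATH, restoring the previous state afterwards."""
    previous = os.environ.get("LISA_MODEL_PATH")
    os.environ["LISA_MODEL_PATH"] = str(path)
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop("LISA_MODEL_PATH", None)
        else:
            os.environ["LISA_MODEL_PATH"] = previous


with model_path_env("PATH/TO/MODEL_DIR"):
    # With lisaai installed, load_model() could be called at this point.
    print(os.environ["LISA_MODEL_PATH"])
```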
