A unified library for autoencoder-family models across deterministic, variational, and quantized latent spaces.

autoencoders

A latent-model toolkit for deterministic, variational, and quantized autoencoders

Python 3.10+ | PyTorch | 20+ model families | Datasets | Checkpoint API

Build, train, serialize, and export latent models with one consistent API.

autoencoders is a PyTorch-first library for autoencoder-family models across deterministic, variational, and quantized latent spaces.

The project goal is simple: make autoencoders feel composable, serializable, and reusable in the same way transformers did for sequence models.

Why autoencoders

🧩 Unified API
One package shape across `AE`, `VAE`, `VQ-VAE`, `PQ-VAE`, `RQ-VAE`, `WAE`, `AAE`, and more.
🧠 Latent-first design
Treat reconstruction, posterior statistics, quantized codes, and exported latents as first-class outputs.
📦 Reusable checkpoints
Use `save_pretrained()` and `from_pretrained()` for stable, shareable model artifacts.
🚀 Real training flow
Ship with trainers, datasets, shell wrappers, and packaging hooks for end-to-end experiments.

What It Covers

Current model families include:

  • Deterministic models: AE, DAE, CAE, SAE, TopKSAE, KLSAE, WAE, AAE
  • Variational models: VAE, DVAE, BetaVAE, BetaTCVAE, DIPVAE, InfoVAE, MMDVAE, FactorVAE, VampPriorVAE, HVAE
  • Quantized models: VQVAE, GumbelVQ, FSQ, RFSQ, PQVAE, RQVAE, VQVAE2

Core interfaces include:

  • Config + Model + Output + Export
  • save_pretrained() / from_pretrained()
  • encode() / decode() / reconstruct() / export()
  • family-specific trainers for deterministic, variational, and quantized models

At a Glance

| Family | Examples | Key outputs |
| --- | --- | --- |
| Deterministic | `AE`, `DAE`, `CAE`, `SAE`, `TopKSAE`, `KLSAE` | `reconstruction`, `latents`, sparse and contractive penalties |
| Variational | `VAE`, `DVAE`, `BetaVAE`, `HVAE` | `posterior_mean`, `posterior_logvar`, `kl_loss`, `free_bits_kl_loss` |
| Quantized | `VQVAE`, `FSQ`, `PQVAE`, `RQVAE` | `quantized_latents`, `codebook_indices`, usage and perplexity metrics |
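
The usage and perplexity metrics listed for quantized models can be illustrated with a small standalone computation. This is a sketch, not the library's implementation: it assumes the common definition of perplexity as the exponential of the entropy of the empirical code-usage distribution, and `codebook_perplexity` and `codebook_usage` are hypothetical helper names.

```python
import math
from collections import Counter


def codebook_perplexity(indices):
    """Perplexity of the empirical code-usage distribution.

    Assumption: perplexity = exp(entropy). It ranges from 1.0 (one code
    used for everything) up to the number of distinct codes (uniform use).
    """
    counts = Counter(indices)
    total = len(indices)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


def codebook_usage(indices, codebook_size):
    """Fraction of codebook entries that were selected at least once."""
    return len(set(indices)) / codebook_size


uniform = [0, 1, 2, 3] * 8   # four codes, used equally often
collapsed = [0] * 32         # a single code used for every input

print(codebook_perplexity(uniform))
print(codebook_perplexity(collapsed))
print(codebook_usage(uniform, codebook_size=256))
```

A perplexity far below the codebook size usually signals that only a few codes are carrying the signal, which is the situation that options such as `dead_code_reset` are meant to address.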

Installation

Install the package:

pip install autoencoders

Install with PyTorch dependencies:

pip install "autoencoders[torch]"

Install with encoder-backed text dataset support:

pip install "autoencoders[text]"

Install with CLIP-backed multimodal dataset support:

pip install "autoencoders[clip]"

Install everything commonly needed for experiments:

pip install "autoencoders[all]"

If you are working from source and plan to build or publish packages:

pip install "autoencoders[dev]"

Quick Start

Build a basic AE + MLP model explicitly from a sample spec:

import torch

from autoencoders import AutoencoderConfig, AutoencoderModel
from autoencoders.data.base import TensorSpec

model = AutoencoderModel(
    config=AutoencoderConfig(latent_dim=16),
    sample_spec=TensorSpec(shape=(50,)),
    encoder="mlp",
    encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
    decoder="mlp",
    decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
)

inputs = torch.randn(32, 50)
outputs = model(inputs)

print(outputs.loss)
print(outputs.latents.shape)
print(outputs.reconstruction.shape)

Save and load checkpoints:

model.save_pretrained("artifacts/ae")
restored = AutoencoderModel.from_pretrained("artifacts/ae")

Inspect the model pipeline and layer-by-layer shape trace:

for step in model.get_pipeline_trace():
    print(step.name, "->", step.output_spec)

Train from YAML:

python examples/trainer.py --config examples/configs/glove/ae.yaml --epoch 5

Product Surface

Use the package at three different layers:

  • Model layer: build or load latent models with typed configs
  • Training layer: train deterministic, variational, or quantized families with dedicated trainers
  • Experiment layer: run reusable YAML configs with one trainer entrypoint on real datasets

Model Loading

Load a model dynamically by name while still keeping backbone selection explicit:

from autoencoders import load_model
from autoencoders.data.base import TensorSpec

model = load_model(
    "vae",
    sample_spec=TensorSpec(shape=(50,)),
    latent_dim=16,
    kl_weight=0.1,
    free_bits=0.02,
    encoder="mlp",
    encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
    decoder="mlp",
    decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
)

Datasets

The library currently ships with embedding-first datasets plus one image dataset for CNN-backed experiments:

  • glove
  • fasttext
  • numberbatch
  • snli
  • multinli
  • flickr30k
  • cifar10

Load a dataset directly:

from autoencoders import load_dataset

dataset = load_dataset("glove", dim=50, max_vectors=50000)
loaders = dataset.get_dataloaders(batch_size=256)

Encoder-backed sentence datasets materialize embeddings during prepare() and cache the result just like static embedding tables:

dataset = load_dataset(
    "snli",
    encoder_name="sentence-transformers/all-MiniLM-L6-v2",
    max_vectors=50000,
)
loaders = dataset.get_dataloaders(batch_size=256)

CLIP-backed multimodal datasets follow the same cached artifact pattern:

dataset = load_dataset(
    "flickr30k",
    encoder_name="ViT-B-32",
    encoder_pretrained="laion2b_s34b_b79k",
    modality="both",
    max_vectors=50000,
)
loaders = dataset.get_dataloaders(batch_size=256)

Image data uses H x W x C specs end to end:

dataset = load_dataset("cifar10", max_examples=10000)
print(dataset.get_sample_spec())  # TensorSpec(shape=(32, 32, 3))

Downloaded datasets use a global cache:

  • default: ~/.cache/autoencoders
  • override with: AUTOENCODERS_CACHE=/your/cache/path
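
The override rule above can be sketched in a few lines. This is an assumption about how the lookup behaves, not the library's code: `resolve_cache_dir` is a hypothetical helper, and treating an empty variable as unset is a choice made for this example.

```python
import os
from pathlib import Path

DEFAULT_CACHE = "~/.cache/autoencoders"


def resolve_cache_dir(environ=None):
    """Return the cache directory, preferring AUTOENCODERS_CACHE when set.

    Assumption: an unset or empty variable falls back to the default.
    """
    env = os.environ if environ is None else environ
    override = env.get("AUTOENCODERS_CACHE")
    return Path(override or DEFAULT_CACHE).expanduser()


print(resolve_cache_dir({}))
print(resolve_cache_dir({"AUTOENCODERS_CACHE": "/your/cache/path"}))
```

Passing the environment as an argument keeps the helper easy to test without mutating `os.environ`.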

With datasets and caching built in, the package is useful both as:

  • a standalone training library
  • a latent-model subsystem inside larger PyTorch projects

Training API

Deterministic training:

from autoencoders import AETrainer, TrainingConfig

trainer = AETrainer(
    model=model,
    args=TrainingConfig(
        output_dir="artifacts/ae-run",
        epochs=5,
        batch_size=256,
    ),
)

trainer.fit(loaders, metadata={"dataset": "glove", "model": "ae"})

Variational training:

from autoencoders import (
    TrainingConfig,
    VAETrainer,
    VariationalAutoencoderConfig,
    VariationalAutoencoderModel,
)
from autoencoders.data.base import TensorSpec

trainer = VAETrainer(
    model=VariationalAutoencoderModel(
        config=VariationalAutoencoderConfig(
            latent_dim=16,
            kl_weight=0.1,
            free_bits=0.02,
            kl_warmup_epochs=20,
        ),
        sample_spec=TensorSpec(shape=(50,)),
        encoder="mlp",
        encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
        decoder="mlp",
        decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
    ),
    args=TrainingConfig(output_dir="artifacts/vae-run", epochs=10),
)
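
The README does not spell out how `kl_weight`, `free_bits`, and `kl_warmup_epochs` interact, so the following is only a sketch under stated assumptions: the KL weight ramps linearly from 0 to `kl_weight` over the warmup epochs, and free bits are applied as a per-dimension floor on the KL term (in nats). The helper names are hypothetical and the library may implement either piece differently.

```python
import math


def kl_per_dim(mean, logvar):
    """KL(N(mean, exp(logvar)) || N(0, 1)) for each latent dimension, in nats."""
    return [0.5 * (math.exp(lv) + m * m - 1.0 - lv) for m, lv in zip(mean, logvar)]


def free_bits_kl(mean, logvar, free_bits):
    """Sum of per-dimension KL terms, each floored at `free_bits`."""
    return sum(max(kl, free_bits) for kl in kl_per_dim(mean, logvar))


def warmup_weight(epoch, kl_weight, kl_warmup_epochs):
    """Assumed schedule: linear ramp from 0 to `kl_weight` over the warmup."""
    if kl_warmup_epochs <= 0:
        return kl_weight
    return kl_weight * min(1.0, epoch / kl_warmup_epochs)


mean = [0.0, 1.0]      # posterior mean for a 2-dimensional latent
logvar = [0.0, 0.0]    # unit variance in both dimensions

print(kl_per_dim(mean, logvar))
print(free_bits_kl(mean, logvar, free_bits=0.02))
print(warmup_weight(epoch=10, kl_weight=0.1, kl_warmup_epochs=20))
```

Under these assumptions, a dimension whose posterior matches the prior still contributes `free_bits` to the loss, which removes the incentive to push its KL all the way to zero early in training.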

Quantized training:

from autoencoders import VQTrainer, TrainingConfig, load_model
from autoencoders.data.base import TensorSpec

trainer = VQTrainer(
    model=load_model(
        "rqvae",
        sample_spec=TensorSpec(shape=(50,)),
        latent_dim=16,
        codebook_size=256,
        num_quantizers=4,
        use_ema_codebook=True,
        dead_code_reset=True,
        encoder="mlp",
        encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
        decoder="mlp",
        decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
    ),
    args=TrainingConfig(output_dir="artifacts/rqvae-run", epochs=10),
)
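
To show what `num_quantizers` controls in a residual quantizer, here is a minimal pure-Python sketch of the general technique. It is not the library's implementation: each stage picks the nearest entry from its own codebook, and the next stage quantizes whatever error is left. The function names and the toy codebooks are illustrative assumptions.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared L2 distance)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(vector, codebooks):
    """Quantize `vector` with one codebook per stage.

    Returns the chosen index at each stage and the summed reconstruction.
    """
    residual = list(vector)
    quantized = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        code = codebook[idx]
        indices.append(idx)
        quantized = [q + c for q, c in zip(quantized, code)]
        residual = [r - c for r, c in zip(residual, code)]
    return indices, quantized


# Two toy stages: a coarse codebook followed by a finer one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
indices, quantized = residual_quantize([1.2, 0.8], codebooks)
print(indices, quantized)
```

Each additional stage adds one index per input and refines the reconstruction, so `num_quantizers=4` with `codebook_size=256` would, in this scheme, describe each latent with four indices.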

Training Entry Point

Source checkouts now use one unified YAML-driven entrypoint:

  • examples/trainer.py

The legacy examples/train_ae.py wrapper still forwards into the same code path for basic AE runs.

Useful examples:

python examples/trainer.py --config examples/configs/glove/ae.yaml --epoch 5
python examples/trainer.py --config examples/configs/glove/vae.yaml --epoch 5
python examples/trainer.py --config examples/configs/glove/vqvae.yaml --epoch 5
python examples/trainer.py --config examples/configs/cifar10/vqvae.yaml --epoch 5

Each config is organized into five sections:

  • dataset
  • model
  • encoder
  • decoder
  • trainer
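
A quick structural check for the five sections might look like the sketch below. Only the section names come from this README; the keys inside each section and the `missing_sections` helper are hypothetical, so consult the files under examples/configs for the real schema.

```python
REQUIRED_SECTIONS = ("dataset", "model", "encoder", "decoder", "trainer")


def missing_sections(config):
    """Return the required top-level sections absent from `config`."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical config contents; only the five section names come from the README.
config = {
    "dataset": {"name": "glove", "dim": 50},
    "model": {"name": "ae", "latent_dim": 16},
    "encoder": {"type": "mlp", "hidden_dims": [64, 32]},
    "decoder": {"type": "mlp", "hidden_dims": [64, 50]},
    "trainer": {"epochs": 5, "batch_size": 256},
}

print(missing_sections(config))
print(missing_sections({"dataset": {}, "model": {}}))
```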

Launch-Ready Features

  • 🗃️ Checkpoints: save_pretrained() and from_pretrained()
  • 📤 Exports: standardized latent artifact export across model families
  • 📚 Real datasets: static embedding tables, sentence corpora, and CLIP-backed image-text corpora
  • 🎛️ Family-specific trainers: deterministic, variational, quantized, and adversarial flows
  • 🧪 Packaging: buildable sdist and wheel, ready for PyPI publication

Design Direction

The library is organized around latent model families rather than a single monolithic interface:

  • BaseAutoencoderModel
  • BaseVariationalAutoencoderModel
  • BaseVectorQuantizedAutoencoderModel

Matching outputs are also family-specific:

  • BaseAutoencoderOutput
  • VariationalAutoencoderOutput
  • QuantizedAutoencoderOutput

This keeps the shared API stable without flattening away meaningful model differences such as posterior statistics or codebook indices.
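
The idea of a shared core plus family-specific fields can be sketched with plain dataclasses. The `Toy*` classes below are illustrative stand-ins, not the library's output classes; only the field names are borrowed from the outputs mentioned in this README.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyOutput:
    """Fields every family shares (stand-in, not the library class)."""
    reconstruction: List[float]
    latents: List[float]
    loss: Optional[float] = None


@dataclass
class ToyVariationalOutput(ToyOutput):
    """Adds posterior statistics on top of the shared fields."""
    posterior_mean: Optional[List[float]] = None
    posterior_logvar: Optional[List[float]] = None


@dataclass
class ToyQuantizedOutput(ToyOutput):
    """Adds discrete codes on top of the shared fields."""
    codebook_indices: Optional[List[int]] = None


def latent_size(output: ToyOutput) -> int:
    """Relies only on shared fields, so it accepts any family's output."""
    return len(output.latents)


vae_out = ToyVariationalOutput(
    reconstruction=[0.1, 0.2],
    latents=[0.5],
    posterior_mean=[0.5],
    posterior_logvar=[0.0],
)
vq_out = ToyQuantizedOutput(
    reconstruction=[0.1, 0.2],
    latents=[1.0, 1.0],
    codebook_indices=[3, 7],
)

print(latent_size(vae_out), latent_size(vq_out))
print(vq_out.codebook_indices)
```

Code written against the shared fields works for every family, while family-aware code can still reach for `posterior_mean` or `codebook_indices` when it needs them.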

Current Scope

autoencoders is intentionally embedding-first, with a growing image path for CNN-backed quantized models. The current core is aimed at:

  • representation learning on embedding matrices
  • latent compression
  • variational latent modeling
  • quantized latent tokenization

Future raw-modality frontends and multimodal adapters can be layered on top of this core.

Repository Status

This project is still early, but the current package already supports:

  • trainable deterministic, variational, and quantized autoencoder families
  • reusable checkpoints
  • exportable latent artifacts
  • real embedding datasets with download and cache support
  • package metadata and distribution artifacts ready for publication workflows

Development

Build the package locally:

python -m build

Check the generated distribution:

twine check dist/*
