A unified library for autoencoder-family models across deterministic, variational, and quantized latent spaces.

autoencoders

A latent-model toolkit for deterministic, variational, and quantized autoencoders

Python 3.10+ | PyTorch | 20+ model families | Datasets | Checkpoint API

Build, train, serialize, and export latent models with one consistent API.

autoencoders is a PyTorch-first library for autoencoder-family models across deterministic, variational, and quantized latent spaces.

The project goal is simple: make autoencoders feel composable, serializable, and reusable in the same way transformers did for sequence models.

Why autoencoders

🧩 Unified API
One package shape across `AE`, `VAE`, `VQ-VAE`, `PQ-VAE`, `RQ-VAE`, `WAE`, `AAE`, and more.

🧠 Latent-first design
Treat reconstruction, posterior statistics, quantized codes, and exported latents as first-class outputs.

📦 Reusable checkpoints
Use `save_pretrained()` and `from_pretrained()` for stable, shareable model artifacts.

🚀 Real training flow
Ship with trainers, datasets, shell wrappers, and packaging hooks for end-to-end experiments.

What It Covers

Current model families include:

Deterministic models: AE, DAE, CAE, SAE, TopKSAE, KLSAE, WAE, AAE
Variational models: VAE, DVAE, BetaVAE, BetaTCVAE, DIPVAE, InfoVAE, MMDVAE, FactorVAE, VampPriorVAE, HVAE
Quantized models: VQVAE, GumbelVQ, FSQ, RFSQ, PQVAE, RQVAE, VQVAE2

Core interfaces include:

Config + Model + Output + Export
save_pretrained() / from_pretrained()
encode() / decode() / reconstruct() / export()
family-specific trainers for deterministic, variational, and quantized models

At a Glance

| Family | Examples | Key outputs |
| --- | --- | --- |
| Deterministic | `AE`, `DAE`, `CAE`, `SAE`, `TopKSAE`, `KLSAE` | `reconstruction`, `latents`, sparse and contractive penalties |
| Variational | `VAE`, `DVAE`, `BetaVAE`, `HVAE` | `posterior_mean`, `posterior_logvar`, `kl_loss`, `free_bits_kl_loss` |
| Quantized | `VQVAE`, `FSQ`, `PQVAE`, `RQVAE` | `quantized_latents`, `codebook_indices`, usage and perplexity metrics |
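
The usage and perplexity metrics listed for quantized models can be illustrated with a small standalone sketch. This is not the library's implementation; it assumes the common definitions (usage as the fraction of codebook entries selected at least once, perplexity as the exponential of the entropy of the empirical code distribution), and `codebook_stats` is a hypothetical helper name.

```python
import math
from collections import Counter


def codebook_stats(indices, codebook_size):
    """Return (usage_fraction, perplexity) for a flat list of code indices."""
    counts = Counter(indices)
    total = len(indices)
    # Fraction of codebook entries that were selected at least once.
    usage = len(counts) / codebook_size
    # Perplexity is exp(entropy) of the empirical code distribution.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return usage, math.exp(entropy)


# Four of eight codes are used, each equally often.
usage, perplexity = codebook_stats([0, 1, 2, 3, 0, 1, 2, 3], codebook_size=8)
print(usage, perplexity)
```

A perplexity close to the codebook size suggests codes are used evenly, while a low value is a typical symptom of codebook collapse.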

Installation

Install the package:

pip install autoencoders

Install with PyTorch dependencies:

pip install "autoencoders[torch]"

Install with encoder-backed text dataset support:

pip install "autoencoders[text]"

Install with CLIP-backed multimodal dataset support:

pip install "autoencoders[clip]"

Install everything commonly needed for experiments:

pip install "autoencoders[all]"

If you are working from source and plan to build or publish packages:

pip install "autoencoders[dev]"

Documentation

The repository now ships with an MkDocs site that documents:

the dataset, backbone, and DataSpec surface
the unified YAML training entrypoint
tree-structured model parameter references for deterministic, variational, and quantized families

Preview the docs locally:

mkdocs serve

Build the static site:

mkdocs build --strict

Quick Start

Build a basic AE + MLP model explicitly from a sample spec:

import torch

from autoencoders import AutoencoderConfig, AutoencoderModel
from autoencoders.data.base import TensorSpec

model = AutoencoderModel(
    config=AutoencoderConfig(latent_dim=16),
    sample_spec=TensorSpec(shape=(50,)),
    encoder="mlp",
    encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
    decoder="mlp",
    decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
)

inputs = torch.randn(32, 50)
outputs = model(inputs)

print(outputs.loss)
print(outputs.latents.shape)
print(outputs.reconstruction.shape)

Save and load checkpoints:

model.save_pretrained("artifacts/ae")
restored = AutoencoderModel.from_pretrained("artifacts/ae")
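
The save/load pair follows a directory-based round trip. The sketch below shows that general pattern with a toy config only; the on-disk layout of real `autoencoders` checkpoints is not documented here, so the `config.json` file name and the `ToyConfig` class are assumptions made for illustration.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ToyConfig:
    """Hypothetical stand-in, not the library's AutoencoderConfig."""

    latent_dim: int = 16

    def save_pretrained(self, directory):
        # Write the config as JSON inside the target directory.
        path = Path(directory)
        path.mkdir(parents=True, exist_ok=True)
        (path / "config.json").write_text(json.dumps(asdict(self)))

    @classmethod
    def from_pretrained(cls, directory):
        # Rebuild the config from the JSON file written above.
        data = json.loads((Path(directory) / "config.json").read_text())
        return cls(**data)


with tempfile.TemporaryDirectory() as tmp:
    ToyConfig(latent_dim=32).save_pretrained(tmp)
    restored_config = ToyConfig.from_pretrained(tmp)

print(restored_config)
```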

Inspect the model pipeline and layer-by-layer shape trace:

for step in model.get_pipeline_trace():
    print(step.name, "->", step.output_spec)

Train from YAML:

python examples/trainer.py --config examples/configs/glove/ae.yaml --epoch 5

Product Surface

Use the package at three different layers:

Model layer: build or load latent models with typed configs
Training layer: train deterministic, variational, or quantized families with dedicated trainers
Experiment layer: run reusable YAML configs with one trainer entrypoint on real datasets

Model Loading

Load a model dynamically by name while still keeping backbone selection explicit:

from autoencoders import load_model
from autoencoders.data.base import TensorSpec

model = load_model(
    "vae",
    sample_spec=TensorSpec(shape=(50,)),
    latent_dim=16,
    kl_weight=0.1,
    free_bits=0.02,
    encoder="mlp",
    encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
    decoder="mlp",
    decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
)
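
Loading by name is typically backed by a registry that maps strings to classes. The following is a minimal sketch of that pattern under stated assumptions, not the actual `load_model` internals; `MODEL_REGISTRY`, `register_model`, `ToyVAE`, and `load_toy_model` are all hypothetical names.

```python
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records a model class under a string key."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("vae")
class ToyVAE:
    def __init__(self, latent_dim, kl_weight=1.0):
        self.latent_dim = latent_dim
        self.kl_weight = kl_weight


def load_toy_model(name, **kwargs):
    # Look up the class by name and forward the keyword arguments to it.
    try:
        cls = MODEL_REGISTRY[name]
    except KeyError:
        known = sorted(MODEL_REGISTRY)
        raise ValueError(f"unknown model {name!r}; known: {known}") from None
    return cls(**kwargs)


toy = load_toy_model("vae", latent_dim=16, kl_weight=0.1)
print(type(toy).__name__, toy.latent_dim, toy.kl_weight)
```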

Datasets

The library currently ships with embedding-first datasets plus one image dataset for CNN- and ViT-backed experiments:

glove
fasttext
numberbatch
snli
multinli
flickr30k
cifar10

Load a dataset directly:

from autoencoders import load_dataset

dataset = load_dataset("glove", dim=50, max_vectors=50000)
loaders = dataset.get_dataloaders(batch_size=256)

Encoder-backed sentence datasets materialize embeddings during prepare() and cache the result just like static embedding tables:

dataset = load_dataset(
    "snli",
    encoder_name="sentence-transformers/all-MiniLM-L6-v2",
    max_vectors=50000,
)
loaders = dataset.get_dataloaders(batch_size=256)

CLIP-backed multimodal datasets follow the same cached artifact pattern:

dataset = load_dataset(
    "flickr30k",
    encoder_name="ViT-B-32",
    encoder_pretrained="laion2b_s34b_b79k",
    modality="both",
    max_vectors=50000,
)
loaders = dataset.get_dataloaders(batch_size=256)

Image data uses H x W x C specs end to end:

dataset = load_dataset("cifar10", max_examples=10000)
print(dataset.get_sample_spec())  # TensorSpec(shape=(32, 32, 3))

Backbone Semantics

Backbones are configured explicitly and built from the dataset-driven sample_spec.

MLPModule consumes tensor specs whose last dimension is the feature width.
CNNModule consumes image-like TensorSpec(shape=(H, W, C)) values and handles HWC <-> NCHW conversion internally.
VisionTransformerModule also consumes image-like TensorSpec(shape=(H, W, C)), patchifies them internally, and exposes sequence-shaped latent specs.
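
The shape bookkeeping behind these backbones can be sketched with plain tuples. This is illustrative arithmetic only: the helper names are hypothetical, `patch_size` and `embed_dim` are assumed parameters, and the sketch ignores any class token the real ViT module might add.

```python
def hwc_to_nchw_shape(spec_shape, batch_size):
    """Map an (H, W, C) sample spec to the (N, C, H, W) batch shape a CNN expects."""
    h, w, c = spec_shape
    return (batch_size, c, h, w)


def vit_sequence_shape(spec_shape, patch_size, embed_dim):
    """Return (num_patches, embed_dim) for non-overlapping square patches."""
    h, w, _ = spec_shape
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    return ((h // patch_size) * (w // patch_size), embed_dim)


cifar_spec = (32, 32, 3)
print(hwc_to_nchw_shape(cifar_spec, batch_size=256))
print(vit_sequence_shape(cifar_spec, patch_size=4, embed_dim=64))
```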

Auto-inferred decoders are intentionally strict:

decoder: null is supported only when reversing the encoder produces a decoder whose runtime input spec matches the model's decoder input spec.
Models whose decoder space differs from encoder output space, such as hierarchical or latent-shape-changing variants, must provide an explicit decoder config.
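
One way to picture the strictness rule for an MLP backbone is to mirror the encoder's layer widths and check that the mirrored stack starts at the width the decoder actually receives. The sketch below assumes that interpretation of "reversing the encoder"; the function names are hypothetical and the library's real check may differ.

```python
def reverse_mlp_widths(input_dim, hidden_dims, latent_dim):
    """Mirror the encoder layer widths to get candidate decoder widths."""
    return [input_dim, *hidden_dims, latent_dim][::-1]


def infer_decoder_widths(input_dim, hidden_dims, latent_dim, decoder_input_dim):
    widths = reverse_mlp_widths(input_dim, hidden_dims, latent_dim)
    # Refuse to infer a decoder when the latent shape changes before decoding.
    if widths[0] != decoder_input_dim:
        raise ValueError(
            f"reversed encoder expects width {widths[0]}, "
            f"but the decoder receives width {decoder_input_dim}; "
            "provide an explicit decoder config"
        )
    return widths


mirrored = infer_decoder_widths(50, [64, 32], 16, decoder_input_dim=16)
print(mirrored)
```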

For explicit image decoders, use transpose: true when you want an upsampling transposed-convolution stack:

decoder:
  name: cnn
  config:
    channels: [64, 3]
    kernel_sizes: [4, 4]
    strides: [2, 2]
    paddings: [1, 1]
    activation: relu
    use_bias: true
    transpose: true
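
To see why this configuration upsamples, the spatial size after each transposed convolution can be computed with the standard formula used by PyTorch's `ConvTranspose2d` (with dilation 1 and no output padding). The 8 x 8 starting grid below is an assumption chosen for illustration, not a value taken from the library.

```python
def conv_transpose_out(size, kernel_size, stride, padding):
    """Output length of a transposed convolution (dilation=1, output_padding=0)."""
    return (size - 1) * stride - 2 * padding + kernel_size


size = 8  # assumed latent grid height/width
sizes = [size]
# kernel_sizes, strides, and paddings from the YAML config above.
for k, s, p in zip([4, 4], [2, 2], [1, 1]):
    size = conv_transpose_out(size, k, s, p)
    sizes.append(size)

print(sizes)
```

With kernel size 4, stride 2, and padding 1, each layer exactly doubles the spatial resolution, so two layers take an 8 x 8 grid to 32 x 32.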

Downloaded datasets use a global cache:

default: ~/.cache/autoencoders
override with: AUTOENCODERS_CACHE=/your/cache/path
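
A minimal sketch of how such a cache location might be resolved, using the environment variable and default path stated above. `resolve_cache_dir` is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory, preferring AUTOENCODERS_CACHE when set."""
    env = os.environ if env is None else env
    # Fall back to the documented default when the variable is unset or empty.
    raw = env.get("AUTOENCODERS_CACHE") or "~/.cache/autoencoders"
    return Path(raw).expanduser()


print(resolve_cache_dir({"AUTOENCODERS_CACHE": "/your/cache/path"}))
print(resolve_cache_dir({}))
```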

Explicit backbones and a shared dataset cache make the package useful both as:

a standalone training library
a latent-model subsystem inside larger PyTorch projects

Training API

Deterministic training:

from autoencoders import AETrainer, TrainingConfig

trainer = AETrainer(
    model=model,
    args=TrainingConfig(
        output_dir="artifacts/ae-run",
        epochs=5,
        batch_size=256,
    ),
)

trainer.fit(loaders, metadata={"dataset": "glove", "model": "ae"})

Variational training:

from autoencoders import TrainingConfig, VAETrainer, VariationalAutoencoderConfig, VariationalAutoencoderModel
from autoencoders.data.base import TensorSpec

trainer = VAETrainer(
    model=VariationalAutoencoderModel(
        config=VariationalAutoencoderConfig(
            latent_dim=16,
            kl_weight=0.1,
            free_bits=0.02,
            kl_warmup_epochs=20,
        ),
        sample_spec=TensorSpec(shape=(50,)),
        encoder="mlp",
        encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
        decoder="mlp",
        decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
    ),
    args=TrainingConfig(output_dir="artifacts/vae-run", epochs=10),
)
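
The `kl_weight`, `free_bits`, and `kl_warmup_epochs` settings interact roughly as sketched below. This is a hedged illustration, not the library's code: it assumes a diagonal Gaussian posterior against a standard normal prior, a per-dimension free-bits floor, and a linear warmup schedule. All function names are hypothetical.

```python
import math


def gaussian_kl_per_dim(mean, logvar):
    """KL(N(mean, exp(logvar)) || N(0, 1)) for each latent dimension."""
    return [0.5 * (m * m + math.exp(lv) - 1.0 - lv) for m, lv in zip(mean, logvar)]


def free_bits_kl(mean, logvar, free_bits):
    # Each dimension contributes at least `free_bits` nats to the penalty.
    return sum(max(kl, free_bits) for kl in gaussian_kl_per_dim(mean, logvar))


def warmup_kl_weight(epoch, kl_weight, kl_warmup_epochs):
    # Linearly ramp the KL weight from 0 to kl_weight over the warmup epochs.
    if kl_warmup_epochs <= 0:
        return kl_weight
    return kl_weight * min(1.0, epoch / kl_warmup_epochs)


kl = free_bits_kl(mean=[0.0, 1.0], logvar=[0.0, 0.0], free_bits=0.02)
weight = warmup_kl_weight(epoch=10, kl_weight=0.1, kl_warmup_epochs=20)
print(kl, weight)
```

The floor keeps the optimizer from pushing already-small KL terms to zero, which is one common way to reduce posterior collapse.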

Quantized training:

from autoencoders import VQTrainer, TrainingConfig, load_model
from autoencoders.data.base import TensorSpec

trainer = VQTrainer(
    model=load_model(
        "rqvae",
        sample_spec=TensorSpec(shape=(None, 50)),
        latent_dim=16,
        codebook_size=256,
        num_quantizers=4,
        use_ema_codebook=True,
        dead_code_reset=True,
        encoder="mlp",
        encoder_config={"hidden_dims": [64, 32], "activation": "relu", "use_bias": True},
        decoder="mlp",
        decoder_config={"hidden_dims": [64, 50], "activation": "relu", "use_bias": True},
    ),
    args=TrainingConfig(output_dir="artifacts/rqvae-run", epochs=10),
)
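
For readers unfamiliar with residual quantization, the idea behind `num_quantizers` can be sketched in plain Python: each stage quantizes whatever the previous stages left unexplained. This is a toy illustration with hand-picked two-entry codebooks, not the library's `rqvae` implementation, and it omits EMA updates and dead-code resets.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry with the smallest squared distance."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(vector, codebooks):
    residual = list(vector)
    quantized = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        # Quantize the current residual, then subtract the chosen code from it.
        idx = nearest_code(residual, codebook)
        code = codebook[idx]
        indices.append(idx)
        quantized = [q + c for q, c in zip(quantized, code)]
        residual = [r - c for r, c in zip(residual, code)]
    return indices, quantized


codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],     # coarse stage
    [[0.0, 0.0], [0.25, -0.25]],  # finer stage
]
indices, quantized = residual_quantize([1.2, 0.8], codebooks)
print(indices, quantized)
```

The final quantized vector is the sum of the selected codes, and the list of indices (one per stage) is the discrete token sequence for that input.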

Training Entry Point

Source checkouts now use one unified YAML-driven entrypoint:

examples/trainer.py

The legacy examples/train_ae.py wrapper still forwards into the same code path for basic AE runs.

Useful examples:

python examples/trainer.py --config examples/configs/glove/ae.yaml --epoch 5
python examples/trainer.py --config examples/configs/glove/vae.yaml --epoch 5
python examples/trainer.py --config examples/configs/glove/vqvae.yaml --epoch 5
python examples/trainer.py --config examples/configs/cifar10/vqvae.yaml --epoch 5
python examples/trainer.py --config examples/configs/cifar10/vqvae_vit.yaml --epoch 5

Each config is organized into five sections:

dataset
model
encoder
decoder
trainer

Each section uses the name + config form except trainer, which is a flat config block. Runtime overrides such as --epoch 5, --lr 0.001, or --max_vectors 5000 are substituted into ${...:default}$ placeholders inside the YAML files before training starts.
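
Placeholder substitution of this kind can be sketched with a regular expression. The exact placeholder grammar is not specified above, so this example assumes the form `${name:default}` with an optional trailing `$`, and `resolve_placeholders` is a hypothetical helper rather than the entrypoint's real resolver.

```python
import re

# Assumed syntax: ${name:default}, optionally followed by a closing "$".
PLACEHOLDER = re.compile(r"\$\{(\w+):([^}]*)\}\$?")


def resolve_placeholders(text, overrides):
    """Replace each placeholder with its override, or its default if none is given."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        return str(overrides.get(name, default))
    return PLACEHOLDER.sub(substitute, text)


yaml_text = "trainer:\n  epochs: ${epoch:10}$\n  lr: ${lr:0.001}$\n"
resolved = resolve_placeholders(yaml_text, {"epoch": 5})
print(resolved)
```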

Launch-Ready Features

🗃️ Checkpoints: save_pretrained() and from_pretrained()
📤 Exports: standardized latent artifact export across model families
📚 Real datasets: static embedding tables, sentence corpora, and CLIP-backed image-text corpora
🎛️ Family-specific trainers: deterministic, variational, quantized, and adversarial flows
🧪 Packaging: buildable sdist and wheel, ready for PyPI publication

Design Direction

The library is organized around latent model families rather than a single monolithic interface:

BaseAutoencoderModel
BaseVariationalAutoencoderModel
BaseVectorQuantizedAutoencoderModel

Matching outputs are also family-specific:

BaseAutoencoderOutput
VariationalAutoencoderOutput
QuantizedAutoencoderOutput

This keeps the shared API stable without flattening away meaningful model differences such as posterior statistics or codebook indices.
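
The shared-base, family-specific-extension idea can be pictured with dataclass inheritance. The classes below are toy stand-ins with assumed field layouts (field names borrowed from the key outputs listed earlier), not the library's actual output types.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyBaseOutput:
    """Fields every family is assumed to share."""
    loss: float
    reconstruction: list
    latents: list


@dataclass
class ToyVariationalOutput(ToyBaseOutput):
    posterior_mean: list
    posterior_logvar: list
    kl_loss: float


@dataclass
class ToyQuantizedOutput(ToyBaseOutput):
    quantized_latents: list
    codebook_indices: list


def shared_field_names(*output_types):
    """Return the field names common to all given output dataclasses."""
    names = [{f.name for f in fields(t)} for t in output_types]
    return sorted(set.intersection(*names))


print(shared_field_names(ToyVariationalOutput, ToyQuantizedOutput))
```

Code that only needs the shared fields can accept any family's output, while family-aware code can still reach posterior statistics or codebook indices directly.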

Current Scope

autoencoders is intentionally embedding-first, with a growing image path for CNN- and ViT-backed quantized models. The current core is aimed at:

representation learning on embedding matrices
latent compression
variational latent modeling
quantized latent tokenization

Future raw-modality frontends and multimodal adapters can be layered on top of this core.

Repository Status

This project is still early, but the current package already supports:

trainable deterministic, variational, and quantized autoencoder families
reusable checkpoints
exportable latent artifacts
real embedding datasets with download and cache support
package metadata and distribution artifacts ready for publication workflows

Development

Build the package locally:

python -m build

Check the generated distribution:

twine check dist/*

Release history

0.6.2 (May 19, 2026)
0.6.1 (May 19, 2026)
0.6.0 (May 19, 2026)
0.5.2 (May 19, 2026)
0.5.1 (May 19, 2026)
0.5.0 (May 19, 2026), this version
0.4.3 (May 19, 2026)
0.4.2 (May 19, 2026)
0.4.1 (May 18, 2026)
0.4.0 (May 18, 2026)
0.3.0 (May 18, 2026)
0.2.0 (May 17, 2026)
0.1.0 (May 14, 2026)
0.0.1 (Oct 26, 2023)

Download files

Source Distribution

autoencoders-0.5.0.tar.gz (132.3 kB view details)

Uploaded May 19, 2026 Source

Built Distribution

autoencoders-0.5.0-py3-none-any.whl (129.4 kB view details)

Uploaded May 19, 2026 Python 3

File details

Details for the file autoencoders-0.5.0.tar.gz.

File metadata

Download URL: autoencoders-0.5.0.tar.gz
Upload date: May 19, 2026
Size: 132.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for autoencoders-0.5.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `930968698339af88d919a29ee0617d8150dbc72cfdb7e67a7957785f60027d7a` |
| MD5 | `1131bd64c39eab9b9e6346f9c10e82df` |
| BLAKE2b-256 | `e9935e939ea28c02492c81b8e2ea0837e6f501558bea04008354d800fa83f1c1` |

File details

Details for the file autoencoders-0.5.0-py3-none-any.whl.

File metadata

Download URL: autoencoders-0.5.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 129.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for autoencoders-0.5.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `967efee3824eb98c412cace022fea84092fe723b9b687aeb2c027134a32653d0` |
| MD5 | `0990adca7582848022d402639e71fdac` |
| BLAKE2b-256 | `2a6590d28a0bf419c748ebfd7f26028d6dcfc9bebe3e16e76e96853b71047e5b` |
