Interpolate between discrete sequences.
Transformer-VAE (WIP)
Transformer-VAEs learn smooth latent spaces of discrete sequences without any explicit rules in their decoders.
This can be used for program synthesis, drug discovery, music generation and much more!
To see how it works, check out this blog post.
This repo is in active development, but I should be coming out with a full release soon.
Install
Install using pip:
pip install transformer_vae
Usage
You can execute the module directly to train it on your own data.
python -m transformer_vae \
--project_name="T5-VAE" \
--output_dir=poet \
--do_train \
--huggingface_dataset=poems
Or you can import Transformer-VAE to use as a package much like a Huggingface model.
from transformer_vae import T5_VAE_Model
model = T5_VAE_Model.from_pretrained('t5-vae-poet')
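For example, here is a minimal sketch of running the loaded model on a single line of text. The 't5-base' tokenizer and the Huggingface-style forward call are assumptions based on how T5 models usually behave, not this package's documented API:
from transformers import T5TokenizerFast

# Continues from the snippet above; assumes the model behaves like a standard
# Huggingface T5 model. Tokenizer name and call signature are assumptions.
tokenizer = T5TokenizerFast.from_pretrained("t5-base")
inputs = tokenizer("A line of poetry to encode.", return_tensors="pt")
# A VAE reconstructs its input, so the labels mirror the inputs (hypothetical).
outputs = model(input_ids=inputs["input_ids"], labels=inputs["input_ids"])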
Training
Set up Weights & Biases for logging, see client.
Get a dataset to model; it must be represented as text. This is what we will be interpolating over.
This can be a text file with each line representing a sample.
python -m transformer_vae \
--project_name="T5-VAE" \
--output_dir=poet \
--do_train \
--train_file=poems.txt
Alternatively, separate each sample with a line containing only <|endoftext|>:
python -m transformer_vae \
--project_name="T5-VAE" \
--output_dir=poet \
--do_train \
--train_file=poems.txt \
--multiline_samples
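For reference, here is a small sketch of how such a multi-line-sample file is laid out (the poems are placeholders):
# Each sample is a multi-line poem; a line containing only <|endoftext|>
# separates samples, matching the format described above.
samples = [
    "Roses are red,\nViolets are blue.",
    "Shall I compare thee\nto a summer's day?",
]
with open("poems.txt", "w") as f:
    for poem in samples:
        f.write(poem + "\n<|endoftext|>\n")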
Alternatively, provide a Huggingface dataset.
python -m transformer_vae \
--project_name="T5-VAE" \
--output_dir=poet \
--do_train \
--dataset=poems \
--content_key text
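If you are unsure which column of a Huggingface dataset holds the text, you can inspect it with the datasets library first. The dataset name "poems" and the "text" column below simply mirror the flags in the command above and may need replacing for a real dataset:
from datasets import load_dataset

# "poems" mirrors the example flag above; swap in a real dataset name as needed.
dataset = load_dataset("poems", split="train")
print(dataset.column_names)  # confirm which column holds the text to model
print(dataset[0]["text"])    # "text" matches the --content_key flag above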
Experiment with different parameters.
Once finished, upload to the Huggingface model hub.
# TODO
Explore the produced latent space using Colab_T5_VAE.ipynb or by visiting this Colab page.
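If you just want the gist of what the notebook does, the core operation is linear interpolation between two latent codes followed by decoding. Here is a conceptual sketch where encode_text and decode_latent are hypothetical stand-ins, not the package's documented API:
import torch

# Conceptual sketch only: encode_text() and decode_latent() are hypothetical
# stand-ins for however the model turns text into a latent code and back.
def interpolate(model, text_a, text_b, steps=5):
    z_a = model.encode_text(text_a)    # latent code for sample A (hypothetical)
    z_b = model.encode_text(text_b)    # latent code for sample B (hypothetical)
    for t in torch.linspace(0, 1, steps):
        z = (1 - t) * z_a + t * z_b    # linear interpolation in latent space
        print(model.decode_latent(z))  # decode the interpolated code (hypothetical)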
Contributing
Install with tests:
pip install -e .[test]
Possible contributions to make:
- Could the docs be more clear? Would it be worth having a docs site/blog?
- Use a Funnel Transformer encoder; would it be more efficient?
- Allow defining an alternative token set.
- Store the latent codes from the previous step to use in the MMD loss so smaller batch sizes are possible (a rough sketch of this idea follows below).
Feel free to ask what would be useful!
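On the last contribution idea above: an MMD estimate needs many latent samples per update, so keeping a buffer of codes from earlier steps lets small batches borrow extra samples. A rough sketch, assuming a Gaussian-kernel MMD against a standard-normal prior (not the repo's actual implementation):
import torch

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise Gaussian kernel between rows of x and rows of y.
    dists = torch.cdist(x, y) ** 2
    return torch.exp(-dists / (2 * sigma ** 2))

def mmd(latents, prior_samples):
    # Maximum mean discrepancy between latent codes and prior samples.
    k_xx = gaussian_kernel(latents, latents).mean()
    k_yy = gaussian_kernel(prior_samples, prior_samples).mean()
    k_xy = gaussian_kernel(latents, prior_samples).mean()
    return k_xx + k_yy - 2 * k_xy

cached_codes = []  # latent codes saved (detached) from previous training steps

def mmd_with_cache(latents, max_cached=256):
    pool = torch.cat([latents] + cached_codes) if cached_codes else latents
    prior = torch.randn_like(pool)          # samples from the N(0, I) prior
    loss = mmd(pool, prior)
    cached_codes.append(latents.detach())   # reuse these codes on the next step
    while sum(c.size(0) for c in cached_codes) > max_cached:
        cached_codes.pop(0)
    return loss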