General purpose model trainer for PyTorch that is more flexible than it should be, by 🐸Coqui.

Maintainers

eginhard

Project description

👟 Trainer

An opinionated general purpose model trainer on PyTorch with a simple code base. Fork of the original, unmaintained repository. New PyPI package: coqui-tts-trainer

Installation

From PyPI:

pip install coqui-tts-trainer

From Github:

git clone https://github.com/idiap/coqui-ai-Trainer
cd coqui-ai-Trainer
pip install -e .

Implementing a model

Subclass TrainerModel and override its methods.

Training a model with auto-optimization

See the MNIST example.

Training a model with advanced optimization

With 👟, you can define the whole optimization cycle yourself, as in the GAN example below. This gives you more under-the-hood control and flexibility for advanced training loops.

You just have to use the scaled_backward() function to handle mixed precision training.

...

def optimize(self, batch, trainer):
    imgs, _ = batch

    # sample noise
    z = torch.randn(imgs.shape[0], 100)
    z = z.type_as(imgs)

    # train discriminator
    imgs_gen = self.generator(z)
    logits = self.discriminator(imgs_gen.detach())
    fake = torch.zeros(imgs.size(0), 1)
    fake = fake.type_as(imgs)
    loss_fake = trainer.criterion(logits, fake)

    valid = torch.ones(imgs.size(0), 1)
    valid = valid.type_as(imgs)
    logits = self.discriminator(imgs)
    loss_real = trainer.criterion(logits, valid)
    loss_disc = (loss_real + loss_fake) / 2

    # step discriminator
    self.scaled_backward(loss_disc, None, trainer)

    if trainer.total_steps_done % trainer.grad_accum_steps == 0:
        trainer.optimizer[0].step()
        trainer.optimizer[0].zero_grad()

    # train generator
    imgs_gen = self.generator(z)

    valid = torch.ones(imgs.size(0), 1)
    valid = valid.type_as(imgs)

    logits = self.discriminator(imgs_gen)
    loss_gen = trainer.criterion(logits, valid)

    # step generator
    self.scaled_backward(loss_gen, None, trainer)
    if trainer.total_steps_done % trainer.grad_accum_steps == 0:
        trainer.optimizer[1].step()
        trainer.optimizer[1].zero_grad()
    return {"model_outputs": logits}, {"loss_gen": loss_gen, "loss_disc": loss_disc}

...

See the GAN training example with Gradient Accumulation
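
The stepping condition in the example above can be looked at in isolation. The following is a minimal sketch, not part of the library: should_step is a hypothetical helper that mirrors the `trainer.total_steps_done % trainer.grad_accum_steps == 0` check, and the step counter starting at 1 is an assumption made for illustration only.

```python
def should_step(total_steps_done: int, grad_accum_steps: int) -> bool:
    """Mirror of the check used in optimize(): step only on multiples of grad_accum_steps."""
    return total_steps_done % grad_accum_steps == 0


grad_accum_steps = 4

# Assumption: the counter runs from 1 to 12 here. Gradients from the
# iterations in between are accumulated because zero_grad() is skipped.
stepping_iterations = [
    step for step in range(1, 13) if should_step(step, grad_accum_steps)
]
print(stepping_iterations)  # [4, 8, 12]
```

Note that a counter value of 0 also satisfies the condition, so whether the very first iteration triggers an optimizer step depends on when the trainer increments its counter.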

Training with Batch Size Finder

See the test script here for training with the batch size finder.

The batch size finder starts at a default batch size (2048, but it can be user-defined) and searches for the largest batch size that fits on your hardware. Expect it to run multiple training attempts until it finds one. To use it, call trainer.fit_with_largest_batch_size(starting_batch_size=2048) instead of trainer.fit(), where starting_batch_size is the batch size at which the search starts. This is very useful if you want to use as much GPU memory as possible.
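
To make the search concrete, here is a rough sketch of how such a finder could work. It is not the library's implementation: the halving strategy, the use of MemoryError, and the names find_largest_batch_size, fake_fit, and logged_fit are all assumptions made for illustration.

```python
def find_largest_batch_size(try_fit, starting_batch_size=2048):
    """Return the first batch size for which try_fit succeeds.

    Assumption: the batch size is halved after every out-of-memory failure.
    """
    batch_size = starting_batch_size
    while True:
        try:
            try_fit(batch_size)
        except MemoryError:
            if batch_size <= 1:
                raise  # even a single sample does not fit
            batch_size //= 2
        else:
            return batch_size


def fake_fit(batch_size, limit=300):
    # Hypothetical stand-in for a training run: anything above `limit`
    # is treated as running out of memory.
    if batch_size > limit:
        raise MemoryError(f"batch size {batch_size} does not fit")


attempts = []


def logged_fit(batch_size):
    # Record every attempted batch size before trying it.
    attempts.append(batch_size)
    fake_fit(batch_size)


best = find_largest_batch_size(logged_fit, starting_batch_size=2048)
print(best, attempts)  # 256 [2048, 1024, 512, 256]
```

Under this halving assumption the result is the largest value reachable by halving the starting size, which is why several training attempts are expected before one succeeds.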

Training with DDP

$ python -m trainer.distribute --script path/to/your/train.py --gpus "0,1"

We don't use .spawn() to initiate multi-GPU training since it has certain limitations:

- Everything must be picklable.
- .spawn() trains the model in subprocesses, and the model in the main process is not updated.
- A DataLoader with N processes gets really slow when N is large.
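
The picklability limitation can be seen with the standard library alone. This is a small sketch: is_picklable is a hypothetical helper, and the config dictionary and lambda are made-up examples of objects a training script might pass to a subprocess.

```python
import pickle


def is_picklable(obj) -> bool:
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data pickles without trouble.
config = {"lr": 1e-3, "batch_size": 32}

# A lambda has no importable name, so pickle cannot serialize it.
collate_fn = lambda batch: batch

print(is_picklable(config))      # True
print(is_picklable(collate_fn))  # False
```

Spawned subprocesses receive their arguments through pickling, so an object like the lambda above would have to be rewritten as a module-level function.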

Training with Accelerate

Setting use_accelerate to True in TrainerArgs enables training with Accelerate.

You can also use it for multi-gpu or distributed training.

CUDA_VISIBLE_DEVICES="0,1,2" accelerate launch --multi_gpu --num_processes 3 train_recipe_autoregressive_prompt.py

See the Accelerate docs.

Adding a callback

👟 supports callbacks to customize your runs. You can either set callbacks in your model implementation or pass them explicitly to the Trainer.

Please check trainer.utils.callbacks to see available callbacks.

Here is how you pass an explicit callback to a 👟Trainer object, for example for weight reinitialization.

def my_callback(trainer):
    print(" > My callback was called.")

trainer = Trainer(..., callbacks={"on_init_end": my_callback})
trainer.fit()
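
For intuition, the following toy sketch shows how a dictionary of callbacks keyed by hook name could be dispatched. ToyTrainer is a hypothetical stand-in and not the real Trainer class; only the hook name on_init_end and the callback signature are taken from the example above.

```python
class ToyTrainer:
    """Hypothetical stand-in that only demonstrates callback dispatch."""

    def __init__(self, callbacks=None):
        self.callbacks = dict(callbacks or {})
        self.events = []
        # Fire the hook once initialization has finished.
        self._run_callback("on_init_end")

    def _run_callback(self, name):
        # Look up the hook by name and call it with the trainer itself.
        callback = self.callbacks.get(name)
        if callback is not None:
            callback(self)


def my_callback(trainer):
    # A real callback could reinitialize weights here; this one records the call.
    trainer.events.append("on_init_end")


toy = ToyTrainer(callbacks={"on_init_end": my_callback})
print(toy.events)  # ['on_init_end']
```

Because the callback receives the trainer object, it can inspect or modify whatever state the trainer exposes at that point in the run.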

Profiling example

Create the torch profiler as you like and pass it to the trainer.

import torch
profiler = torch.profiler.profile(
    activities=[
        torch.profiler.ProfilerActivity.CPU,
        torch.profiler.ProfilerActivity.CUDA,
    ],
    schedule=torch.profiler.schedule(wait=1, warmup=1, active=3, repeat=2),
    on_trace_ready=torch.profiler.tensorboard_trace_handler("./profiler/"),
    record_shapes=True,
    profile_memory=True,
    with_stack=True,
)
prof = trainer.profile_fit(profiler, epochs=1, small_run=64)

Then run TensorBoard:
```
tensorboard --logdir="./profiler/"
```
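
The schedule arguments determine which training steps are actually traced. The sketch below re-implements that logic in plain Python for illustration only. schedule_phase is a hypothetical helper, the phase names are simplified labels rather than the profiler's own action values, and the cycle behavior reflects a general understanding of torch.profiler.schedule rather than anything stated in this README.

```python
def schedule_phase(step, wait=1, warmup=1, active=3, repeat=2):
    """Return a simplified phase label for a given step index."""
    cycle = wait + warmup + active
    if repeat > 0 and step >= cycle * repeat:
        return "none"  # all cycles are finished, profiling is idle
    position = step % cycle
    if position < wait:
        return "wait"
    if position < wait + warmup:
        return "warmup"
    return "record"


phases = [schedule_phase(step) for step in range(12)]
for step, phase in enumerate(phases):
    print(step, phase)

# With wait=1, warmup=1, active=3, repeat=2, six steps are recorded in total.
print(phases.count("record"))  # 6
```

Under these assumptions, the run passed to profile_fit should cover at least ten steps so that both cycles complete.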

Supported Experiment Loggers

- Tensorboard - actively maintained
- ClearML - actively maintained
- MLFlow
- Aim
- WandB

To add a new logger, you must subclass BaseDashboardLogger and override its methods.

Release history

- 0.3.1 (this version): Jun 30, 2025
- 0.3.0: Jun 11, 2025
- 0.2.3: Feb 4, 2025
- 0.2.2: Jan 7, 2025
- 0.2.1 (yanked): Jan 6, 2025. Reason this release was yanked: Rejects valid configs
- 0.2.0: Dec 3, 2024
- 0.1.7: Nov 13, 2024
- 0.1.6: Nov 4, 2024
- 0.1.5: Sep 12, 2024
- 0.1.4: Jun 29, 2024
- 0.1.3: Jun 28, 2024
- 0.1.2: Jun 27, 2024
- 0.1.1: May 3, 2024
- 0.1.0: Apr 4, 2024

Download files

Source Distribution

coqui_tts_trainer-0.3.1.tar.gz (50.3 kB)

Uploaded Jun 30, 2025 Source

Built Distribution

coqui_tts_trainer-0.3.1-py3-none-any.whl (57.2 kB)

Uploaded Jun 30, 2025 Python 3

File details

Details for the file coqui_tts_trainer-0.3.1.tar.gz.

File metadata

Download URL: coqui_tts_trainer-0.3.1.tar.gz
Upload date: Jun 30, 2025
Size: 50.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for coqui_tts_trainer-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`ca32abaf43febb4012a6a0c61e265b1635f91455acbce17fd34a2b5eae3af28c`
MD5	`313a30a519861a85ebbfa036dc165871`
BLAKE2b-256	`e9f868315f71c420382873a8b2bd47b0c3113213f51bf4e932d56c9aed659b80`
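
A downloaded file can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch: sha256_of_file and matches_published_hash are hypothetical helper names, and the file is assumed to sit in the current directory when the check is run.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling matches_published_hash("coqui_tts_trainer-0.3.1.tar.gz", "ca32abaf43febb4012a6a0c61e265b1635f91455acbce17fd34a2b5eae3af28c") should return True for an intact download of the source distribution.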

Provenance

The following attestation bundles were made for coqui_tts_trainer-0.3.1.tar.gz:

Publisher: pypi-release.yml on idiap/coqui-ai-Trainer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: coqui_tts_trainer-0.3.1.tar.gz
- Subject digest: ca32abaf43febb4012a6a0c61e265b1635f91455acbce17fd34a2b5eae3af28c
- Sigstore transparency entry: 256283074
- Sigstore integration time: Jun 30, 2025
Source repository:
- Permalink: idiap/coqui-ai-Trainer@6a00a7fa8f5c6cc63b5ba684fe02f4cd1cb409d5
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/idiap
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@6a00a7fa8f5c6cc63b5ba684fe02f4cd1cb409d5
- Trigger Event: release

File details

Details for the file coqui_tts_trainer-0.3.1-py3-none-any.whl.

File metadata

Download URL: coqui_tts_trainer-0.3.1-py3-none-any.whl
Upload date: Jun 30, 2025
Size: 57.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for coqui_tts_trainer-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eba2449b1c7e6a1fb7454608595949dbe4c1a409bd0d23e1fe93163637600392`
MD5	`0e7358340fab04aa30c818ce4973030d`
BLAKE2b-256	`1579d08e2b3974448bbfede88d7d3c81b67896ccaf8ed77ef10d4ddb8e7d194f`

Provenance

The following attestation bundles were made for coqui_tts_trainer-0.3.1-py3-none-any.whl:

Publisher: pypi-release.yml on idiap/coqui-ai-Trainer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: coqui_tts_trainer-0.3.1-py3-none-any.whl
- Subject digest: eba2449b1c7e6a1fb7454608595949dbe4c1a409bd0d23e1fe93163637600392
- Sigstore transparency entry: 256283089
- Sigstore integration time: Jun 30, 2025
Source repository:
- Permalink: idiap/coqui-ai-Trainer@6a00a7fa8f5c6cc63b5ba684fe02f4cd1cb409d5
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/idiap
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@6a00a7fa8f5c6cc63b5ba684fe02f4cd1cb409d5
- Trigger Event: release
