A comprehensive implementation of the cycle-consistency training paradigm, extending the Hugging Face Transformers trainer API to accommodate arbitrary combinations of generative models.

These details have not been verified by PyPI

Project description

Cycleformers

A Python library for efficient cycle-consistency training of transformer models. Cycleformers simplifies iterative back-translation and supports both causal and seq2seq architectures. It also implements Multi-Adapter Cycle-Consistency Training (MACCT), which trains LoRA adapters on a frozen base model, allowing 7.5x larger models within the same memory footprint.

Features

🤗 Seamless integration with Hugging Face Transformers
🚀 PEFT/LoRA support for memory-efficient training
🤖 Compatible with both causal and seq2seq models
🔥 Optimized for various hardware configurations

Quick Tour

Installation

pip install cycleformers

Training

The CycleTrainer class extends, and significantly redesigns, the 🤗 Transformers trainer. It abstracts away the specifics of training while remaining configurable. Both seq2seq and causal architectures are supported, and each can be trained via PEFT adapter swapping for memory-efficient configurations. Check the [docs] for [usage] details and [examples].
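For intuition, the back-translation cycle that the trainer abstracts away can be sketched with toy stand-ins. This is a conceptual illustration only, not Cycleformers code: `LookupModel`, `half_cycle`, and the sample data are hypothetical names invented here, and "training" is reduced to memorising pairs.

```python
from typing import Callable, Dict, List, Tuple


class LookupModel:
    """Toy stand-in for a generative model: memorises source -> target pairs."""

    def __init__(self, fallback: Callable[[str], str]) -> None:
        self.table: Dict[str, str] = {}
        self.fallback = fallback  # behaviour on inputs not seen in training

    def generate(self, text: str) -> str:
        return self.table.get(text, self.fallback(text))

    def train_step(self, pairs: List[Tuple[str, str]]) -> int:
        for source, target in pairs:
            self.table[source] = target
        return len(pairs)


def half_cycle(real_batch: List[str], generator: LookupModel, learner: LookupModel) -> int:
    """Generate synthetic inputs with one model, then train the other
    model to reconstruct the real batch from those synthetic inputs."""
    synthetic = [generator.generate(text) for text in real_batch]
    return learner.train_step(list(zip(synthetic, real_batch)))


# Domain A is lowercase text, domain B is uppercase text (arbitrary toy domains).
a_to_b = LookupModel(fallback=str.upper)
b_to_a = LookupModel(fallback=lambda text: text)  # starts out as the identity

dataset_a = ["hello", "world"]
dataset_b = ["FOO", "BAR"]

# One full cycle: each model takes a turn as generator and as learner.
half_cycle(dataset_a, generator=a_to_b, learner=b_to_a)
half_cycle(dataset_b, generator=b_to_a, learner=a_to_b)

print(b_to_a.generate("HELLO"))  # b_to_a has learned to map this back to domain A
```

Neither dataset needs to be aligned with the other: each model only ever sees real text as a reconstruction target and synthetic text as input.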

To train two identical models, pass a single model and tokenizer along with two datasets:

from transformers import AutoModelForCausalLM, AutoTokenizer

from cycleformers import CycleTrainer, CycleTrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# dataset_A and dataset_B must be loaded beforehand
args = CycleTrainingArguments(output_dir="gpt2-cct")
trainer = CycleTrainer(
    args,
    models=model,
    tokenizers=tokenizer,
    train_dataset_A=dataset_A,
    train_dataset_B=dataset_B,
)
trainer.train()

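Any two models can be combined for fully customisable training (🚧 currently both must be seq2seq or both causal):

from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

# Note: per the 🚧 caveat above, both models must currently share an architecture,
# so this mixed causal/seq2seq pair shows the interface rather than a supported setup.
model_A = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
model_B = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base", device_map="auto")
tokenizer_A = AutoTokenizer.from_pretrained("gpt2")
# Each tokenizer must come from the same checkpoint as its model
tokenizer_B = AutoTokenizer.from_pretrained("google/flan-t5-base")

trainer = CycleTrainer(
    args,
    models={
        "A": model_A,
        "B": model_B,
    },
    tokenizers={
        "A": tokenizer_A,
        "B": tokenizer_B,
    },
    train_dataset_A=dataset_A,
    train_dataset_B=dataset_B,
)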

Multi-Adapter Cycle-Consistency Training (MACCT)

The CycleTrainer class is also set up to accept a single base model and train two PEFT adapters on top of it, switching between them to emulate the two-model setup. This allows training 7.5x larger models within the same memory footprint:

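from peft import LoraConfig

# LoraConfig, rather than the PeftConfig base class, accepts the LoRA fields below
peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    inference_mode=False,
    bias="none",
)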

args = CycleTrainingArguments(output_dir="gpt2-macct")
trainer = CycleTrainer(
    args, 
    model = model,
    tokenizer = tokenizer,
    peft_configs = peft_config # Or same A, B dict
)
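
To see why two adapters are cheap compared with two full models, the following sketch counts the trainable parameters LoRA adds to a single linear layer at the rank used above (r=16). The helper names and the 768 x 768 layer shape (the hidden size of GPT-2 small) are chosen here for illustration. The sketch does not reproduce the 7.5x figure, which also depends on factors such as optimizer state and activations.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in a dense d_in x d_out linear layer (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable weights LoRA adds to that layer: an r x d_in matrix
    plus a d_out x r matrix, so r * (d_in + d_out) in total."""
    return r * (d_in + d_out)


d_in = d_out = 768  # illustrative layer shape
rank = 16           # matches r=16 in the config above

two_full_copies = 2 * full_param_count(d_in, d_out)
two_adapters = 2 * lora_param_count(d_in, d_out, rank)

print(f"two full layers : {two_full_copies:,} trainable weights")
print(f"two LoRA adapters: {two_adapters:,} trainable weights")
print(f"ratio            : {two_full_copies / two_adapters:.1f}x fewer")
```

The frozen base weights are stored once and shared, so only the adapter weights are duplicated per direction.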

Citing

If you use Cycleformers in your research, please cite:

A citation will be added once a Zenodo record or paper is available.

Project details

Release history

This version

0.1.0

Dec 8, 2024

Download files

Source Distribution

cycleformers-0.1.0.tar.gz (15.7 kB view details)

Uploaded Dec 8, 2024 Source

Built Distribution

cycleformers-0.1.0-py3-none-any.whl (16.0 kB view details)

Uploaded Dec 8, 2024 Python 3

File details

Details for the file cycleformers-0.1.0.tar.gz.

File metadata

Download URL: cycleformers-0.1.0.tar.gz
Upload date: Dec 8, 2024
Size: 15.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cycleformers-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`23f5985ab08eba95b7106d7d13c7ff3f9f7bd08655564981df63744725dfafb7`
MD5	`13ac53c911b6ed1b5feb37e9c966bfad`
BLAKE2b-256	`5987bfc7a537a4834d2977d8eab37f710b8c06062e3ee364ef20ccbc3e8a779c`
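
A downloaded archive can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch: the function names are illustrative, and the local path assumes the file was saved under its published name in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for cycleformers-0.1.0.tar.gz
EXPECTED_SHA256 = "23f5985ab08eba95b7106d7d13c7ff3f9f7bd08655564981df63744725dfafb7"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


archive = Path("cycleformers-0.1.0.tar.gz")  # assumed local download location
if archive.exists():
    print("OK" if matches_expected(archive) else "MISMATCH")
```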

Provenance

The following attestation bundles were made for cycleformers-0.1.0.tar.gz:

Publisher: release.yml on wrmthorne/cycleformers

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cycleformers-0.1.0.tar.gz
- Subject digest: 23f5985ab08eba95b7106d7d13c7ff3f9f7bd08655564981df63744725dfafb7
- Sigstore transparency entry: 154030375
- Sigstore integration time: Dec 8, 2024
Source repository:
- Permalink: wrmthorne/cycleformers@a2ff666c5e3505c42a4a50de70cca9116d68af10
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/wrmthorne
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a2ff666c5e3505c42a4a50de70cca9116d68af10
- Trigger Event: push

File details

Details for the file cycleformers-0.1.0-py3-none-any.whl.

File metadata

Download URL: cycleformers-0.1.0-py3-none-any.whl
Upload date: Dec 8, 2024
Size: 16.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cycleformers-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cd05e82ac0654133d0580603bce064ad455ec6a8918a788a42d47e3b9dfa1074`
MD5	`1ee2ea12d648d6c98aa5b098ea3d95aa`
BLAKE2b-256	`b309615d7e84df1f24527bdd9499ff3ca3f8f163a95c1c13b1fc00c88f6f3011`

Provenance

The following attestation bundles were made for cycleformers-0.1.0-py3-none-any.whl:

Publisher: release.yml on wrmthorne/cycleformers

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cycleformers-0.1.0-py3-none-any.whl
- Subject digest: cd05e82ac0654133d0580603bce064ad455ec6a8918a788a42d47e3b9dfa1074
- Sigstore transparency entry: 154030376
- Sigstore integration time: Dec 8, 2024
Source repository:
- Permalink: wrmthorne/cycleformers@a2ff666c5e3505c42a4a50de70cca9116d68af10
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/wrmthorne
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a2ff666c5e3505c42a4a50de70cca9116d68af10
- Trigger Event: push
