dytr: Dynamic Transformer for Multi-Task Learning with Continual Learning Support

These details have not been verified by PyPI

Project links

Bug Reports

Project description

dytr - Dynamic Transformer Library

dytr is a flexible PyTorch library for multi-task learning with dynamic transformer architectures. Train multiple tasks sequentially or simultaneously while preserving performance on previous tasks through built-in continual learning techniques. it also supports to finetune and modify pretrained model such as bert.

Build dynamic transformers that learn multiple tasks.

🚀 What's New in v0.1.3

✨ Universal Pretrained Model Loader

Load ANY model architecture directly:

Architecture	Supported Models	Loads As
Encoder-only	BERT, RoBERTa, DistilBERT, ALBERT	Shared encoder
Decoder-only	GPT-2, LLaMA, Mistral, Phi, Gemma, Qwen	Causal LM task
Encoder-decoder	T5, BART, MT5, Pegasus, Flan-T5	Seq2Seq task

from dytr import PretrainedModelLoader

loader = PretrainedModelLoader()
model = loader.load_pretrained("gpt2", task_name="text_generation")

🔧 LoRA Training Support

Mode	Description
Single	One LoRA adapter across all tasks
Multi-Task	Separate LoRA adapters per task
Progressive	4-phase training (Warmup → Expansion → Hybrid → Native)

from dytr.training.lora import LoRATrainer

trainer = LoRATrainer(model, config, mode="multi", rank=8)
model = trainer.train(task_configs, train_datasets, val_datasets)

Why dytr?

🎯 Multi-Task Ready - Train classification, generation, and sequence tasks in one model
🧠 Never Forgets - Built-in EWC and experience replay prevent catastrophic forgetting
🔧 No Black Box - Full control over architecture, understand every component
⚡ Lightweight - Pure PyTorch, minimal dependencies
📦 Pretrained Support - Load BERT, RoBERTa, and more as your encoder backbone and fine tune it on multiple tasks.
🔧 LoRA Support - Parameter-efficient fine-tuning with fewer parameters

Architecture Overview

unnamed

Installation

pip install dytr

pip install git+https://github.com/AAlsubari/dytr.git

Quick Start

from dytr import DynamicTransformer, ModelConfig, TaskConfig, TrainingStrategy, Trainer, SingleDatasetProcessing
import pandas as pd

# 1. Configure your transformer
config = ModelConfig(
    embed_dim=256,
    num_layers=6,
    num_heads=8,
    max_seq_len=256
)

# 2. Create the model
model = DynamicTransformer(config)

# data loading and processing
train_data = pd.DataFrame({
    'text': ['Great movie!', 'Terrible film.', 'Amazing acting!', 'Boring plot.'],
    'label': [1, 0, 1, 0]
})
train_dataset = SingleDatasetProcessing(
    df=train_data,
    tokenizer=model.tokenizer,
    max_len=128,
    task_name="sentiment_analysis",
    strategy=TrainingStrategy.SENTENCE_CLASSIFICATION,
    text_column="text",
    label_column="label"
)
# 3. Add a task
task = TaskConfig(
    task_name="sentiment_analysis",
    training_strategy=TrainingStrategy.SENTENCE_CLASSIFICATION,
    num_labels=2,# train_data.num_labels
)
#model.add_task(task) # not require it will be add automatically during the training process

# Initialize trainer and train
trainer = Trainer(model, config, exp_dir="./experiments")
train_datasets = {"sentiment_analysis": (train_dataset, TrainingStrategy.SENTENCE_CLASSIFICATION)}
model = trainer.train([task], train_datasets, {})# you can set more than one for list of tasks and dataset for multitasks training 

# 4. Generate predictions
result = model.generate("This product is amazing!", task_name="sentiment_analysis")
print(f"Prediction: {result['prediction']}")

# Save the entire multi-task model
model.save_model("multi_task_model.pt")

# Load the model
loaded_model = DynamicTransformer.load_model("multi_task_model.pt")

Core Capabilities

Multiple Training Strategies

Strategy	Purpose	Use Case
Causal LM	Autoregressive text generation	Chatbots, content creation
Seq2Seq	Input to output transformation	Translation, summarization
Sentence Classification	Document-level categorization	Sentiment, topic detection
Token Classification	Token-level labeling	Named entity recognition, POS tagging

Continual Learning

Train tasks sequentially without losing previous knowledge:

config = ModelConfig(
    use_ewc=True,              # Protect important weights
    use_replay=True,           # Replay old samples
    use_task_adapters=True,    # Task-specific modules
    ewc_lambda=1000.0,
    replay_buffer_size=2000
)

model = DynamicTransformer(config)

# Train tasks one after another
for task in task_list:
    model.add_task(task)
    trainer.train([task], train_data, val_data)
    # Previous tasks remain accurate
    # The trainer automatically handles EWC and replay buffer, but you should add the samples to the pretrained model

Universal Pretrained Model Loader

Load any model architecture as your backbone:

from dytr import PretrainedModelLoader

loader = PretrainedModelLoader()

# Encoder-only (BERT) → Shared encoder
model = loader.load_pretrained("bert-base-uncased")

# Decoder-only (GPT-2) → Causal LM task
model = loader.load_pretrained("gpt2", task_name="generation")

# Encoder-decoder (T5) → Seq2Seq task
model = loader.load_pretrained("t5-small", task_name="translation")

Pretrained Encoders

Load powerful encoders as your backbone and extend them with tasks:

from dytr import PretrainedModelLoader
model_name='prajjwal1/bert-tiny'
loader = PretrainedModelLoader()
config = ModelConfig(tokenizer_name=model_name, per_device_train_batch_size=32, num_train_epochs=3, per_device_eval_batch_size=8, special_tokens={}, use_task_adapters=False, use_ewc=True, use_replay=True, use_rotary_embedding=False, training_from_scratch=False)

# Load pretrained BERT as your encoder
model = loader.load_pretrained(model_name, config)

# Now add your own tasks - the model is fully dytr compatible
class_train = pd.DataFrame(
        {
            "text": [
                "Great product!",
                "Poor quality.",
                "Excellent service!",
                "Very disappointed.",
                "Highly recommended!",
            ],
            "label": [1, 0, 1, 0, 1],
        }
    )
classification_task = TaskConfig(
        task_name="sentiment",
        training_strategy=TrainingStrategy.SENTENCE_CLASSIFICATION,
        num_labels=2,
        text_column="text",
        label_column="label",
        max_length=128,
    )
class_dataset = SingleDatasetProcessing(
        df=class_train,
        tokenizer=model.tokenizer,
        max_len=classification_task.max_length,
        task_name=classification_task.task_name,
        strategy=classification_task.training_strategy,
        num_labels=classification_task.num_labels,
        text_column=classification_task.text_column,
        label_column=classification_task.label_column,
    )
# Causal LM task data (text generation)
lm_train = pd.DataFrame(
        {
            "text": [
                "The sun rises in the east.",
                "Cats are adorable animals.",
                "Machine learning is fascinating.",
                "Python is a great programming language.",
                "Deep learning powers modern AI.",
            ]
        }
    )
lm_task = TaskConfig(
        task_name="text_generation",
        training_strategy=TrainingStrategy.CAUSAL_LM,
        max_length=256,
    )
lm_dataset = SingleDatasetProcessing(
        df=lm_train,
        tokenizer=model.tokenizer,
        max_len=lm_task.max_length,
        task_name=lm_task.task_name,
        strategy=lm_task.training_strategy,
        text_column="text",
    )
train_datasets = {
        classification_task.task_name: (class_dataset, classification_task.training_strategy),
        lm_task.task_name: (lm_dataset, lm_task.training_strategy),
    }
val_datasets = {
        #classification_task.task_name: (class_val_dataset, classification_task.training_strategy)
    }

# 6. Train model
print("\n6. Training model...")
trainer = Trainer(model, config, exp_dir="./multi_task_experiments")
model = trainer.train([classification_task, lm_task], train_datasets, val_datasets)

#model = trainer.train([ lm_task], train_datasets, val_datasets)
test_texts = ["This is amazing!", "I hate this."]
for text in test_texts:
      result = model.generate(text, task_name="sentiment")
      sentiment = "POSITIVE" if result["prediction"] == 1 else "NEGATIVE"
      print(f"      {text} -> {sentiment}")

# Test generation
print("\n   Text generation test:")
prompt = "The future of technology"
generated = model.generate(prompt, task_name="text_generation", max_new_tokens=20)
print(f"      Prompt: {prompt}")
print(f"      Generated: {generated}")



#model.add_task(sentiment_task)
#model.add_task(ner_task)
#model.add_task(translation_task)

# Train, generate, and use just like any dytr model

Task-Specific Learning Rates

Different components learn at different speeds:

config = ModelConfig(
    learning_rate=3e-4,
    head_lr_mult=2.0,      # Task heads: fast adaptation
    decoder_lr_mult=0.5,   # Decoders: moderate
    shared_lr_mult=0.1     # Shared encoder: preserve knowledge
)

Who Should Use dytr?

Audience	Why It Matters
Researchers	Customize every aspect of the transformer architecture, Test continual learning algorithms with EWC and experience replay, experiment with multi-task architectures, Experiment with task-specific learning rates and adapters, Analyze forgetting behavior across sequential tasks
Developers	Add new tasks without retraining from scratch, Load pretrained models and extend them with your own tasks, Build production-ready multi-task systems without complex dependencies
Students	Understand transformers from scratch with transparent, readable code, Visualize the impact of hyperparameters on model size, Learn multi-task learning concepts hands-on
Organizations	Deploy single models that handle multiple tasks efficiently , Deploy lighter, faster inference systems, Maintain knowledge across task updates with continual learning

Key Differentiators

Full Transparency - No hidden complexity, understand every component
Continual Learning First - Built from the ground up for sequential task learning
Truly Dynamic - Add or remove tasks without retraining from scratch
Pure PyTorch - No heavy dependencies, easy to customize

Requirements

Python 3.8+
PyTorch 1.10+
NumPy
pandas
scikit-learn
tqdm
requests

Documentation

ModelConfig: Architecture, training, and continual learning parameters
TaskConfig: Dataset configuration, column mapping, task-specific settings
TrainingStrategy: Causal LM, Seq2Seq, Sentence Classification, Token Classification
PretrainedModelLoader: Pretrain models loader for encoder-only, decoder-only, and encoder-decoder models
LoRATrainer: Parameter-efficient fine-tuning with single, multi-task, and progressive modes

License

Apache License 2.0

Author

Dr. Akram Alsubari

Contributing

Contributions are welcome! Open issues or share your use cases.

Support and Contact

For questions, issues, or suggestions: For questions, issues, or suggestions:

📧 Email: akram.alsubari@outlook.com
🔗 LinkedIn: https://www.linkedin.com/in/akram-alsubari/
📱 Connect: Feel free to reach out for collaborations, research discussions, or feedback
🎓 Research Interests: Natural Language Processing, Deep Learning, Transformers, Continual Learning, Multi-Task Learning, Large Language Models

Build once. Learn multiple tasks. Never forget.

Project details

These details have not been verified by PyPI

Project links

Bug Reports

Release history Release notifications | RSS feed

0.1.5

May 7, 2026

This version

0.1.4

Apr 29, 2026

0.1.3

Apr 25, 2026

0.1.2

Apr 21, 2026

0.1.1

Apr 7, 2026

0.1.0

Apr 1, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dytr-0.1.4.tar.gz (70.5 kB view details)

Uploaded Apr 29, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

dytr-0.1.4-py3-none-any.whl (80.3 kB view details)

Uploaded Apr 29, 2026 Python 3

File details

Details for the file dytr-0.1.4.tar.gz.

File metadata

Download URL: dytr-0.1.4.tar.gz
Upload date: Apr 29, 2026
Size: 70.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dytr-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`5c4a2a90c459e054085e219b6c1a5850cf75989f966e79570f3764791c716e51`
MD5	`55126adab49cf6a598ff34183cf75c4c`
BLAKE2b-256	`01056b811ede8bb8d80d5068343943fd2f2ef3db4afc5cba52c283e993a2203c`

See more details on using hashes here.

File details

Details for the file dytr-0.1.4-py3-none-any.whl.

File metadata

Download URL: dytr-0.1.4-py3-none-any.whl
Upload date: Apr 29, 2026
Size: 80.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dytr-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`03cdd01ffb9c2ae8df9c1bc2b359685f3bad0e0811e9d0894725c605b3e1b557`
MD5	`e3a54c5a6bbc934778e3c797b8a51244`
BLAKE2b-256	`224c92070e8503ae5ad61a2704edd38228a756795b16fb78f9baadcce29090ba`

See more details on using hashes here.

dytr 0.1.4

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

dytr - Dynamic Transformer Library

🚀 What's New in v0.1.3

✨ Universal Pretrained Model Loader

🔧 LoRA Training Support

Why dytr?

Architecture Overview

Installation

Quick Start

Core Capabilities

Multiple Training Strategies

Continual Learning

Universal Pretrained Model Loader

Pretrained Encoders

Task-Specific Learning Rates

Who Should Use dytr?

Key Differentiators

Requirements

Documentation

License

Author

Contributing

Support and Contact

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes