Transformer Cloner
Clone and prune transformer models with new tokenizers. Create smaller, more efficient models by mapping vocabularies, reducing dimensions, and pruning layers.
Features
- 🔄 Vocabulary Mapping: Map tokens from a new tokenizer to original model embeddings
- 📉 Model Pruning: Reduce hidden size, layers, attention heads, and more
- 🎯 Multiple Strategies: Choose from mean, sum, first, last, weighted, max, min for embedding combination
- ✅ Validation: Automatic config validation to prevent incompatible architectures
- 🚀 Fast: Batch processing for efficient token ID mapping
Installation
```shell
pip install transformer-cloner
```
Quick Start
Clone with New Tokenizer
```python
from transformer_cloner import TransformerCloner, EmbeddingStrategy

cloner = TransformerCloner(
    org_model_id="google/gemma-3-270m-it",
    target_tokenizer_id="your-username/custom-tokenizer",
)

# Clone with mean embedding strategy
model = cloner.clone(strategy=EmbeddingStrategy.MEAN)
model.save_pretrained("cloned-model")
```
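To make the idea of vocabulary mapping concrete, here is a minimal, library-independent sketch. It assumes that each target token's text is re-tokenized with the source vocabulary, which yields the list of source token IDs whose embeddings are later combined. The function `build_token_mapping`, the toy vocabularies, and the greedy longest-match tokenizer are all hypothetical illustrations, not part of `transformer_cloner`, and the library's actual mapping logic may differ.

```python
def build_token_mapping(target_vocab, source_vocab):
    """Map each target token ID to the source token IDs that spell the same text.

    Both vocabularies are plain dicts of token text -> token ID (toy stand-ins
    for real tokenizers).
    """

    def tokenize_with_source(text):
        # Greedy longest-match tokenization over the source vocabulary.
        ids = []
        start = 0
        while start < len(text):
            for end in range(len(text), start, -1):
                piece = text[start:end]
                if piece in source_vocab:
                    ids.append(source_vocab[piece])
                    start = end
                    break
            else:
                # No source token covers the text at this position.
                raise ValueError(f"cannot tokenize {text!r} at position {start}")
        return ids

    return {
        target_id: tokenize_with_source(token)
        for token, target_id in target_vocab.items()
    }


# Toy vocabularies (hypothetical).
source_vocab = {"trans": 0, "form": 1, "er": 2, "s": 3}
target_vocab = {"transformer": 0, "forms": 1, "er": 2}

mapping = build_token_mapping(target_vocab, source_vocab)
for target_id, source_ids in mapping.items():
    print(target_id, "->", source_ids)
```

In this toy setup, the target token `transformer` maps to three source tokens, so its new embedding has to be derived from three source embeddings. That is the situation the embedding strategy resolves.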
Prune Model Architecture
```python
from transformer_cloner import TransformerCloner, PruningConfig, EmbeddingStrategy

cloner = TransformerCloner(
    org_model_id="google/gemma-3-270m-it",
    target_tokenizer_id="your-username/custom-tokenizer",
)

# Create a smaller model
pruning_config = PruningConfig(
    hidden_size=320,         # Reduce embedding dimension
    num_hidden_layers=9,     # Fewer layers
    intermediate_size=1024,  # Smaller FFN
    num_attention_heads=2,   # Fewer attention heads
)

model = cloner.clone_pruned(
    pruning_config=pruning_config,
    strategy=EmbeddingStrategy.MEAN,
)
model.save_pretrained("pruned-model")
```
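To get a feel for how much a pruning configuration shrinks a model, the following sketch gives a rough parameter estimate. It assumes a decoder-only architecture with grouped-query attention, a gated feed-forward block (three projection matrices), and tied input/output embeddings, and it ignores normalization weights and biases. The helper `estimate_params` is hypothetical and not part of `transformer_cloner`. The vocabulary size, `num_key_value_heads`, and `head_dim` values below are illustrative assumptions, since the example above does not set them.

```python
def estimate_params(
    vocab_size,
    hidden_size,
    num_hidden_layers,
    intermediate_size,
    num_attention_heads,
    num_key_value_heads,
    head_dim,
    tie_embeddings=True,
):
    """Rough parameter count for a decoder-only transformer (norms and biases ignored)."""
    # Attention projections: Q, K, V, and output.
    q_proj = hidden_size * num_attention_heads * head_dim
    kv_proj = 2 * hidden_size * num_key_value_heads * head_dim
    o_proj = num_attention_heads * head_dim * hidden_size
    # Gated FFN: gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size
    per_layer = q_proj + kv_proj + o_proj + mlp
    # Embedding table; counted twice if the output head is not tied to it.
    embeddings = vocab_size * hidden_size
    if not tie_embeddings:
        embeddings *= 2
    return embeddings + num_hidden_layers * per_layer


# Values from the pruning example, plus assumed vocab_size, KV heads, and head_dim.
total = estimate_params(
    vocab_size=16000,        # assumption
    hidden_size=320,
    num_hidden_layers=9,
    intermediate_size=1024,
    num_attention_heads=2,
    num_key_value_heads=1,   # assumption
    head_dim=256,            # assumption
)
print(f"approx. {total / 1e6:.1f}M parameters")
```

For small models the embedding table can account for a large share of the total, which is why vocabulary size matters alongside the architectural dimensions.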
Vocabulary Pruning (Direct 1:1 Mapping)
```python
from transformer_cloner import TransformerCloner

cloner = TransformerCloner(
    org_model_id="google/gemma-3-270m-it",
    target_tokenizer_id="google/gemma-3-270m-it",  # Same tokenizer
)

# Keep only first 16k tokens
model, tokenizer = cloner.clone_with_vocab_pruning(vocab_size=16000)
model.save_pretrained("vocab-pruned-model")
tokenizer.save_pretrained("vocab-pruned-model")
```
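Conceptually, keeping only the first tokens amounts to truncating the embedding table and dropping every vocabulary entry whose ID falls outside the new range. The sketch below illustrates this with plain Python lists. The function `prune_vocab` and the toy data are hypothetical, and the library's real implementation (which also has to produce a working tokenizer) is likely more involved.

```python
def prune_vocab(embeddings, vocab, vocab_size):
    """Keep the first `vocab_size` embedding rows and the tokens that point at them.

    `embeddings` is a list of rows (one per token ID); `vocab` maps token text
    to token ID. IDs are preserved, so the mapping stays 1:1.
    """
    if not 0 < vocab_size <= len(embeddings):
        raise ValueError("vocab_size must be between 1 and the current vocabulary size")
    pruned_embeddings = [list(row) for row in embeddings[:vocab_size]]
    pruned_vocab = {token: idx for token, idx in vocab.items() if idx < vocab_size}
    return pruned_embeddings, pruned_vocab


# Toy example: four tokens with 2-dimensional embeddings.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
vocab = {"<pad>": 0, "the": 1, "cat": 2, "zebra": 3}

small_embeddings, small_vocab = prune_vocab(embeddings, vocab, vocab_size=3)
print(len(small_embeddings), sorted(small_vocab))
```

Because token IDs are unchanged for the retained entries, no embedding combination strategy is needed in this mode.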
Embedding Strategies
When a target token maps to multiple source tokens, choose how to combine them:
| Strategy | Description |
|---|---|
| `MEAN` | Average of all source embeddings (default) |
| `SUM` | Sum of all source embeddings |
| `FIRST` | Use only the first token's embedding |
| `LAST` | Use only the last token's embedding |
| `WEIGHTED` | Weighted average (more weight to first tokens) |
| `MAX` | Element-wise maximum |
| `MIN` | Element-wise minimum |
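The strategies in the table can be illustrated with a small, dependency-free sketch that operates on lists of floats. The function `combine` is a hypothetical stand-in, not the library's API. In particular, the weighting scheme for `weighted` (linearly decaying weights n, n-1, ..., 1, normalized to sum to 1) is an assumption, since the exact weights are not documented here.

```python
def combine(vectors, strategy="mean"):
    """Combine several source embeddings (equal-length lists of floats) into one."""
    if not vectors:
        raise ValueError("need at least one source embedding")
    n = len(vectors)
    columns = list(zip(*vectors))  # one tuple per embedding dimension

    if strategy == "mean":
        return [sum(col) / n for col in columns]
    if strategy == "sum":
        return [sum(col) for col in columns]
    if strategy == "first":
        return list(vectors[0])
    if strategy == "last":
        return list(vectors[-1])
    if strategy == "max":
        return [max(col) for col in columns]
    if strategy == "min":
        return [min(col) for col in columns]
    if strategy == "weighted":
        # Assumed scheme: earlier tokens get linearly larger weights.
        weights = [n - i for i in range(n)]
        total = sum(weights)
        return [sum(w * x for w, x in zip(weights, col)) / total for col in columns]
    raise ValueError(f"unknown strategy: {strategy}")


# Two source embeddings of dimension 2 for one target token.
source_embeddings = [[1.0, 4.0], [3.0, 0.0]]
for name in ("mean", "sum", "first", "last", "weighted", "max", "min"):
    print(name, combine(source_embeddings, name))
```

`MEAN` keeps the combined vector on roughly the same scale as the individual source embeddings, whereas `SUM` grows with the number of source tokens. This is one plausible reason for `MEAN` being the default.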
Pruning Options
| Parameter | Description |
|---|---|
| `hidden_size` | Embedding dimension |
| `num_hidden_layers` | Number of transformer layers |
| `intermediate_size` | FFN intermediate dimension |
| `num_attention_heads` | Number of attention heads |
| `num_key_value_heads` | Number of KV heads (for GQA) |
| `head_dim` | Dimension per attention head |
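Some combinations of these parameters cannot form a valid architecture, which is what config validation guards against. The sketch below shows the kind of checks such validation might perform. `ToyPruningConfig` and `validate` are hypothetical names used for illustration only, and the rules shown (positive values, attention heads divisible by KV heads, hidden size divisible by the head count when `head_dim` is not set) are common constraints for grouped-query attention models, not a description of the library's actual checks.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ToyPruningConfig:
    """Hypothetical mirror of the pruning options; None means 'keep original'."""
    hidden_size: Optional[int] = None
    num_hidden_layers: Optional[int] = None
    intermediate_size: Optional[int] = None
    num_attention_heads: Optional[int] = None
    num_key_value_heads: Optional[int] = None
    head_dim: Optional[int] = None


def validate(cfg: ToyPruningConfig) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config looks consistent."""
    problems = []

    # Every value that is set must be a positive integer.
    for f in fields(cfg):
        value = getattr(cfg, f.name)
        if value is not None and (not isinstance(value, int) or value <= 0):
            problems.append(f"{f.name} must be a positive integer, got {value!r}")
    if problems:
        return problems

    heads, kv_heads = cfg.num_attention_heads, cfg.num_key_value_heads
    # GQA: each KV head serves an equal-sized group of query heads.
    if heads is not None and kv_heads is not None and heads % kv_heads != 0:
        problems.append("num_attention_heads must be divisible by num_key_value_heads")

    # Without an explicit head_dim, it is typically derived as hidden_size / heads.
    if cfg.head_dim is None and cfg.hidden_size is not None and heads is not None:
        if cfg.hidden_size % heads != 0:
            problems.append("hidden_size must be divisible by num_attention_heads")

    return problems


ok = validate(ToyPruningConfig(hidden_size=320, num_hidden_layers=9,
                               intermediate_size=1024, num_attention_heads=2))
bad = validate(ToyPruningConfig(hidden_size=320, num_attention_heads=4,
                                num_key_value_heads=3))
print(ok)
print(bad)
```

Returning a list of problems rather than raising on the first one lets a caller report every inconsistency at once.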
License
MIT License - see LICENSE for details.