
😊 TPTT

arXiv · PyPI · Release · Documentation

Transforming Pretrained Transformers into Titans

TPTT is a modular Python library designed to inject efficient linearized attention (LiZA) mechanisms, such as Memory as Gate (described in Titans), into pretrained transformers 🤗.


Features

  • Flexible Attention Injection: Seamlessly wrap and augment standard Transformer attention layers with linearized attention variants for latent memory.
  • Support for Linear Attention: Includes implementations of DeltaNet and DeltaProduct, with optional recurrent nonlinearity between chunks (a minimal sketch of the underlying recurrence follows this list).
  • Modular Design: Easily extend or customize operators and integration strategies.
  • Compatibility: Designed to integrate with Hugging Face Transformers and similar PyTorch models.
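
For intuition, here is a minimal PyTorch sketch of the order-1 delta-rule recurrence these operators build on. This is an illustration only: TPTT's actual operators are chunked and parallel, while the function below is a naive sequential loop, and the function name is hypothetical.

import torch

def delta_rule_attention(q, k, v, beta):
    # Naive sequential delta rule (illustration only, not TPTT's implementation):
    #   S_t = S_{t-1} + beta_t * k_t (v_t - S_{t-1}^T k_t)^T
    # q, k: (B, T, D); v: (B, T, Dv); beta: (B, T)
    B, T, D = k.shape
    S = torch.zeros(B, D, v.shape[-1], dtype=q.dtype, device=q.device)  # memory state
    out = []
    for t in range(T):
        kt, vt = k[:, t], v[:, t]                        # (B, D), (B, Dv)
        bt = beta[:, t].unsqueeze(-1)                    # (B, 1)
        pred = torch.einsum("bd,bdv->bv", kt, S)         # value currently bound to kt
        S = S + torch.einsum("bd,bv->bdv", kt, bt * (vt - pred))  # delta update
        out.append(torch.einsum("bd,bdv->bv", q[:, t], S))        # read with query
    return torch.stack(out, dim=1)                       # (B, T, Dv)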

Overview

Note: order-2 DeltaProduct has the same expressiveness as Titans.
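
To make this concrete, here are the standard recurrences from the DeltaNet / DeltaProduct literature (notation assumed here, not taken from this page). DeltaNet (order 1) applies one Householder-style delta step per token:

S_t = S_{t-1} \left( I - \beta_t k_t k_t^\top \right) + \beta_t v_t k_t^\top

Order-n DeltaProduct applies n such steps per token, with S_t^{(0)} = S_{t-1} and S_t = S_t^{(n)}:

S_t^{(i)} = S_t^{(i-1)} \left( I - \beta_{t,i} k_{t,i} k_{t,i}^\top \right) + \beta_{t,i} v_{t,i} k_{t,i}^\top, \quad i = 1, \dots, n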

Installation and Usage

pip install tptt

Titanesque Documentation

  • TPTT-LiZA Training:
    Instructions for training TPTT-based models with LoRA and advanced memory management (a minimal LoRA wiring sketch follows this list).

  • TPTT_LiZA_Evaluation:
    Guide for evaluating language models with LightEval and Hugging Face Transformers.

  • TPTT_LiZA_FromScratch:
    Integrating the LinearAttention module into PyTorch deep learning projects.
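
A minimal sketch of wiring a PEFT LoraConfig into TpttConfig. The hyperparameters and target module names below are illustrative assumptions for a Llama-style base model, not values prescribed by TPTT; the lora_config argument itself appears, commented out, in the basic usage below.

from peft import LoraConfig
import tptt

# Hypothetical LoRA setup; adjust rank and targets for your base model.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Llama-style names
    task_type="CAUSAL_LM",
)

config = tptt.TpttConfig(
    base_model_name="meta-llama/Llama-3.2-1B",
    lora_config=lora_config,  # the argument left commented out in the snippet below
)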

Basic usage:

import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM
import tptt
from tptt import save_tptt_safetensors, get_tptt_model, load_tptt_safetensors

##### Transforming into Titans (Tptt)
base_model_name = "meta-llama/Llama-3.2-1B"
config = tptt.TpttConfig(
    base_model_name=base_model_name,
    # lora_config=lora_config,
)
model = tptt.TpttModel(config)
# Manual local save; path and name are your target directory and filename
save_tptt_safetensors(model, path, name)

##### Pretrained Titans from Transformer
repo_id = "ffurfaro/Titans-Llama-3.2-1B"
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

##### More customization for other models (BERT, ViT, etc.)
model, linear_cache = get_tptt_model(model, config)  # you can activate bidirectional mode
model = load_tptt_safetensors(repo_or_path, model)  # from saved LoRA weights only

##### Using LinearAttention from scratch
layers = nn.ModuleList([
    tptt.LinearAttention(hidden_dim=64, num_heads=4)
    for _ in range(num_layers)  # num_layers: your model depth
])
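
As a follow-up, inference with a converted or hub-loaded model is assumed to go through the standard Hugging Face generate() API, as the pretrained checkpoint example above suggests:

# Minimal inference sketch (assumes the model exposes the standard generate() API)
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))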

Development

  • Code is organized into modular components under the src/tptt directory.
  • Use pytest for testing and Sphinx for documentation (see the Documentation link above 🔥).
  • Contributions and feature requests are welcome!

Requirements

  • Python 3.11+
  • PyTorch
  • einops
  • Transformers
  • Peft

See requirements.txt for the full list.


Citation

If you use TPTT in your academic work, please cite:

@article{furfaro2025tptt,
  title={TPTT: Transforming Pretrained Transformer into Titans},
  author={Furfaro, Fabien},
  journal={arXiv preprint arXiv:2506.17671},
  year={2025}
}

Contact

For questions or support, please open an issue on the GitHub repository or contact the maintainer.
