
Comprehensive deep learning framework with state-of-the-art implementations: GPT, DeepSeek-V3 with MLA/MoE, YOLO, CenterNet, and specialized audio/vision models built on PyTorch Lightning

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

APTT – Antons PyTorch Tools

APTT (Antons PyTorch Tools) is a comprehensive deep learning framework built on PyTorch Lightning that provides production-ready implementations of state-of-the-art architectures including transformer language models (GPT, DeepSeek-V3), object detection (YOLO, CenterNet), and specialized neural networks for vision and audio tasks.

🚀 Features

Language Models & NLP

  • ✅ GPT-2/GPT-3 Architecture: Full transformer implementation with configurable layers
  • ✅ DeepSeek-V3: State-of-the-art LLM with Multi-Head Latent Attention (MLA) and Mixture-of-Experts (MoE)
    • Multi-Head Latent Attention with KV-Compression
    • Auxiliary-Loss-Free Load Balancing
    • Multi-Token Prediction (MTP)
    • Rotary Position Embeddings (RoPE)
  • ✅ Text Dataset Loaders: Support for .txt, .jsonl, and pre-tokenized data with a sliding window
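
The sliding-window loading mentioned in the last bullet can be pictured with a small standalone sketch. It does not use APTT's loaders: the function name `sliding_windows` and the `stride` parameter are hypothetical, chosen only to illustrate how a long token sequence might be cut into overlapping training windows.

```python
from typing import List


def sliding_windows(token_ids: List[int], max_seq_len: int, stride: int) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most max_seq_len tokens.

    Hypothetical helper for illustration; not part of APTT's API.
    """
    if max_seq_len <= 0 or stride <= 0:
        raise ValueError("max_seq_len and stride must be positive")
    windows: List[List[int]] = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_seq_len])
        if start + max_seq_len >= len(token_ids):
            break  # this window already reaches the end of the sequence
        start += stride
    return windows


tokens = list(range(10))
windows = sliding_windows(tokens, max_seq_len=4, stride=2)
for window in windows:
    print(window)
```

A stride smaller than `max_seq_len` makes consecutive windows overlap, so tokens near a window boundary are also seen with more left context in the next window.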

Computer Vision

  • ✅ Object Detection: YOLO (v3/v4/v5), CenterNet, EfficientDet
  • ✅ Feature Extractors: ResNet, DarkNet, EfficientNet, MobileNet, FPN
  • ✅ Tracking: RNN-based object tracking with ReID

Audio Processing

  • ✅ Beamforming: Multi-channel audio processing
  • ✅ Direction of Arrival (DOA): Acoustic source localization
  • ✅ Feature Networks: WaveNet, Complex-valued networks

Training & Optimization

  • 🧠 Continual Learning: Built-in knowledge distillation and LwF (Learning without Forgetting)
  • 🧩 Pluggable Callbacks: TorchScript export, TensorRT optimization, t-SNE visualization
  • ⚙️ Modular Design: Composable heads, losses, layers, and metrics
  • 📊 Visualization Tools: Embedding analysis, training metrics, model profiling
  • 🗂️ Flexible Dataset Loaders: Image, audio, text with augmentation support

๐Ÿ› ๏ธ Installation

# Clone the repository
git clone https://github.com/afeldman/aptt.git
cd aptt

# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

# Install for CPU
uv sync --extra cpu --extra dev

# Install for CUDA 12.4
uv sync --extra cu124 --extra dev

# For documentation building
apt-get install libgraphviz-dev  # Linux
brew install graphviz             # macOS

🎯 Quick Start

Language Model Training (DeepSeek-V3)

import pytorch_lightning as pl
from aptt.modules.deepseek import DeepSeekModule
from aptt.lightning_base.dataset import TextDataLoader

# Prepare dataset (`tokenizer` is not defined in this snippet and must be created beforehand)
datamodule = TextDataLoader(
    train_data_path="data/train.txt",
    val_data_path="data/val.txt",
    tokenizer=tokenizer,
    max_seq_len=512,
    batch_size=32,
    return_mtp=True,  # Enable Multi-Token Prediction
)

# Create model
model = DeepSeekModule(
    vocab_size=50000,
    d_model=2048,
    n_layers=24,
    n_heads=16,
    use_moe=True,           # Enable Mixture-of-Experts
    use_mtp=True,           # Enable Multi-Token Prediction
    n_routed_experts=256,
    n_expert_per_token=8,
)

# Train
trainer = pl.Trainer(max_steps=100000, accelerator="gpu")
trainer.fit(model, datamodule)

Object Detection (YOLO)

import pytorch_lightning as pl
from aptt.modules.yolo import YOLOModule

model = YOLOModule(
    num_classes=80,
    model_size="yolov5s",
    pretrained=True,
)

# `datamodule` must be a detection data module prepared beforehand
trainer = pl.Trainer(max_epochs=100, accelerator="gpu")
trainer.fit(model, datamodule)

📚 Documentation

Core Modules

Language Models

Computer Vision

  • Detection models (YOLO, CenterNet, EfficientDet)
  • Feature extractors (ResNet, DarkNet, EfficientNet, FPN)
  • Object tracking systems

Audio Processing

  • Beamforming algorithms
  • DOA estimation
  • Complex-valued neural networks

Examples

# Language Models
python examples/llm_modules_example.py      # GPT & DeepSeek-V3
python examples/llm_loss_head_example.py    # Loss functions & heads
python examples/moe_example.py              # Mixture-of-Experts
python examples/text_dataset_simple.py      # Text data loading

# View all examples
ls examples/

Build Documentation Locally

cd docs
make html
# Open docs/_build/html/index.html

๐Ÿ—๏ธ Project Structure

aptt/
โ”œโ”€โ”€ src/aptt/                      # Core source code
โ”‚   โ”œโ”€โ”€ callbacks/                 # Training callbacks (TensorRT, t-SNE, etc.)
โ”‚   โ”œโ”€โ”€ heads/                     # Output heads (classification, detection, LM)
โ”‚   โ”œโ”€โ”€ layers/                    # Neural network layers
โ”‚   โ”‚   โ”œโ”€โ”€ attention/             # Attention mechanisms (MLA, RoPE, KV-Compression)
โ”‚   โ”‚   โ””โ”€โ”€ moe.py                 # Mixture-of-Experts
โ”‚   โ”œโ”€โ”€ lightning_base/            # Lightning modules and utilities
โ”‚   โ”‚   โ””โ”€โ”€ dataset/               # Dataset loaders (image, audio, text)
โ”‚   โ”œโ”€โ”€ loss/                      # Loss functions
โ”‚   โ”œโ”€โ”€ metric/                    # Evaluation metrics
โ”‚   โ”œโ”€โ”€ model/                     # Model architectures
โ”‚   โ”‚   โ”œโ”€โ”€ beamforming/           # Audio beamforming
โ”‚   โ”‚   โ””โ”€โ”€ detection/             # Object detection
โ”‚   โ”œโ”€โ”€ modules/                   # Lightning modules
โ”‚   โ”‚   โ”œโ”€โ”€ deepseek.py            # DeepSeek-V3 module
โ”‚   โ”‚   โ”œโ”€โ”€ gpt.py                 # GPT module
โ”‚   โ”‚   โ”œโ”€โ”€ yolo.py                # YOLO module
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ””โ”€โ”€ utils/                     # Utility functions
โ”œโ”€โ”€ examples/                      # Usage examples
โ”‚   โ”œโ”€โ”€ llm_modules_example.py     # Language model examples
โ”‚   โ”œโ”€โ”€ moe_example.py             # MoE examples
โ”‚   โ””โ”€โ”€ text_dataset_simple.py     # Dataset examples
โ”œโ”€โ”€ tests/                         # Unit tests
โ”œโ”€โ”€ docs/                          # Sphinx documentation
โ”‚   โ”œโ”€โ”€ llm_modules.md             # LLM documentation
โ”‚   โ”œโ”€โ”€ moe.md                     # MoE documentation
โ”‚   โ””โ”€โ”€ text_dataset.md            # Dataset documentation
โ”œโ”€โ”€ pyproject.toml                 # Project configuration
โ”œโ”€โ”€ README.md                      # This file
#

🎓 Key Concepts

Multi-Head Latent Attention (MLA)

DeepSeek-V3's efficient attention mechanism with low-rank KV-compression:

from aptt.layers.attention.mla import MultiHeadLatentAttention

attention = MultiHeadLatentAttention(
    d=2048,     # Model dimension
    n_h=16,     # Number of heads
    d_h_c=256,  # Compressed KV dimension
    d_h_r=64,   # Per-head RoPE dimension
)
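
To see why low-rank KV-compression matters, it helps to count what has to be cached per token during generation. The back-of-the-envelope sketch below assumes the scheme described in the DeepSeek papers: standard multi-head attention caches full keys and values for every head, while MLA caches one compressed KV latent plus one shared RoPE key. It also assumes that `d_h_c` and `d_h_r` above correspond to those two cached dimensions and that the head size is `d / n_h`; APTT's actual caching may differ. The function name is hypothetical.

```python
from typing import Tuple


def kv_cache_elements_per_token(d_model: int, n_heads: int,
                                d_kv_latent: int, d_rope: int) -> Tuple[int, int]:
    """Return (standard MHA, MLA) cached elements per token and per layer.

    Hypothetical helper for illustration; not part of APTT's API.
    """
    d_head = d_model // n_heads
    mha = 2 * n_heads * d_head   # full keys and values for every head
    mla = d_kv_latent + d_rope   # compressed KV latent + shared RoPE key
    return mha, mla


mha, mla = kv_cache_elements_per_token(d_model=2048, n_heads=16,
                                       d_kv_latent=256, d_rope=64)
print(f"MHA: {mha}, MLA: {mla}, reduction: {mha / mla:.1f}x")
```

Under these assumptions, the configuration shown above would cache 320 values per token and layer instead of 4096.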

Mixture-of-Experts (MoE)

Sparse expert activation with auxiliary-loss-free load balancing:

from aptt.layers.moe import DeepSeekMoE

moe = DeepSeekMoE(
    d_model=2048,
    n_shared_experts=1,    # Always active
    n_routed_experts=256,  # Selectively activated
    n_expert_per_token=8,  # Top-K experts per token
)
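
Auxiliary-loss-free load balancing, as described in the DeepSeek-V3 report, adds a per-expert bias to the routing scores only when selecting the top-K experts; the gate weights still come from the unbiased scores. After each step the bias is nudged down for overloaded experts and up for underloaded ones. The following plain-Python sketch illustrates that idea for a single token. The function names, the sigmoid affinity, and the step size `gamma` are assumptions made for illustration, not APTT's `DeepSeekMoE` internals.

```python
import math
from typing import Dict, List


def route_token(logits: List[float], bias: List[float], top_k: int) -> Dict[int, float]:
    """Pick top_k experts by biased score; weight them by unbiased score."""
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]  # sigmoid affinities
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(scores[i] for i in chosen)
    return {i: scores[i] / norm for i in chosen}  # gates sum to 1


def update_bias(bias: List[float], load: List[int], gamma: float) -> List[float]:
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - gamma)
        elif n < mean_load:
            new_bias.append(b + gamma)
        else:
            new_bias.append(b)
    return new_bias


logits = [2.0, 1.0, 0.5, -1.0]
gates = route_token(logits, bias=[0.0, 0.0, 0.0, 0.0], top_k=2)
rerouted = route_token(logits, bias=[-5.0, 0.0, 0.0, 0.0], top_k=2)
new_bias = update_bias([0.0, 0.0, 0.0, 0.0], load=[3, 1, 0, 0], gamma=0.1)
print(sorted(gates), sorted(rerouted), new_bias)
```

Because the bias only affects which experts are selected, balancing the load does not add a gradient term that competes with the language-modeling loss.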

Multi-Token Prediction (MTP)

Predict multiple future tokens simultaneously:

from aptt.modules.deepseek import DeepSeekModule

# Dataset with MTP targets
# (`TextDataset` and `tokenizer` must be imported/created beforehand)
dataset = TextDataset(
    data_path="train.txt",
    tokenizer=tokenizer,
    return_mtp=True,
    mtp_depth=3,  # Predict 1, 2, 3 tokens ahead
)

# Model with MTP loss
model = DeepSeekModule(
    vocab_size=50000,
    use_mtp=True,
    mtp_lambda=0.3,  # MTP loss weight
)
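
What "predict 1, 2, 3 tokens ahead" means for the training targets can be sketched without any framework code. The example below builds one target sequence per offset and combines a main loss with the averaged extra losses weighted by `mtp_lambda`, which follows the weighting described in the DeepSeek-V3 report. The helper names and the exact target layout are assumptions for illustration; APTT's `TextDataset` may arrange its MTP targets differently.

```python
from typing import List, Tuple


def mtp_targets(token_ids: List[int], depth: int) -> Tuple[List[int], List[List[int]]]:
    """Return inputs and one target list per offset 1..depth.

    Hypothetical helper for illustration; not part of APTT's API.
    """
    if depth < 1 or len(token_ids) <= depth:
        raise ValueError("need depth >= 1 and more than `depth` tokens")
    n_positions = len(token_ids) - depth
    inputs = token_ids[:n_positions]
    # targets[k - 1][i] is the token k steps after inputs[i]
    targets = [token_ids[k:k + n_positions] for k in range(1, depth + 1)]
    return inputs, targets


def combined_loss(main_loss: float, mtp_losses: List[float], mtp_lambda: float) -> float:
    """Main loss plus lambda times the mean of the extra prediction losses."""
    return main_loss + mtp_lambda * sum(mtp_losses) / len(mtp_losses)


inputs, targets = mtp_targets([10, 11, 12, 13, 14, 15], depth=3)
total = combined_loss(main_loss=2.0, mtp_losses=[3.0, 4.0], mtp_lambda=0.3)
print(inputs, targets, total)
```

The offset-1 targets are the ordinary next-token labels; the deeper offsets provide the additional training signal.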

📊 Model Zoo

Language Models

| Model | Parameters | Config | Performance |
|---|---|---|---|
| GPT-Small | 124M | `d_model=768, n_layers=12` | GPT-2 baseline |
| DeepSeek-Small | 51M | `d_model=512, n_layers=4, use_moe=True` | Demo config |
| DeepSeek-Base | 1.3B | `d_model=2048, n_layers=24, n_experts=256` | Production |
| DeepSeek-V3 | 685B | `d_model=7168, n_layers=60, n_experts=256` | Full scale |
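
The 124M figure for GPT-Small can be checked with a quick parameter count. The sketch below assumes the standard GPT-2 layout, which the table does not state: a vocabulary of 50257 tokens, a context length of 1024, biases on all linear layers, a 4x MLP width, and an output layer tied to the token embedding. The function name is hypothetical.

```python
def gpt2_param_count(vocab_size: int, context_len: int, d_model: int, n_layers: int) -> int:
    """Count parameters of a GPT-2 style decoder with tied output embeddings."""
    embeddings = vocab_size * d_model + context_len * d_model  # token + position
    attention = 4 * d_model * d_model + 4 * d_model  # QKV and output projection, with biases
    mlp = 8 * d_model * d_model + 5 * d_model        # two linear layers, hidden width 4 * d_model
    layer_norms = 4 * d_model                         # two LayerNorms per block (scale + shift)
    final_norm = 2 * d_model
    return embeddings + n_layers * (attention + mlp + layer_norms) + final_norm


total = gpt2_param_count(vocab_size=50257, context_len=1024, d_model=768, n_layers=12)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

The same formula does not apply to the DeepSeek rows, since MoE layers and MLA change the per-layer parameter count.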

Object Detection

| Model | Backbone | mAP | FPS |
|---|---|---|---|
| YOLOv5s | CSPDarknet | 37.4 | 140 |
| YOLOv5m | CSPDarknet | 45.4 | 100 |
| CenterNet | ResNet-50 | 42.1 | 45 |

🧪 Testing

# Run all tests
pytest

# Run specific test
pytest tests/test_tensor_rt_export_callback.py

# With coverage
pytest --cov=aptt

๐Ÿ› ๏ธ Development

Code Quality

# Format code
ruff format .

# Lint
ruff check .

# Type checking
mypy src/aptt

Pre-commit Hooks

# Install pre-commit
pip install pre-commit

# Setup hooks
pre-commit install

# Run manually
pre-commit run --all-files

📖 Citation

If you use APTT in your research, please cite:

@software{aptt2025,
  title = {APTT: Antons PyTorch Tools},
  author = {Anton Feldmann},
  year = {2025},
  url = {https://github.com/afeldman/aptt}
}

For DeepSeek-V3:

@article{deepseekai2024deepseekv3,
  title={DeepSeek-V3 Technical Report},
  author={DeepSeek-AI},
  journal={arXiv preprint arXiv:2412.19437},
  year={2024}
}

๐Ÿค Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (`git checkout -b feature/amazing-feature`)
  3. Commit your changes (`git commit -m 'Add amazing feature'`)
  4. Push to the branch (`git push origin feature/amazing-feature`)
  5. Open a Pull Request

Please ensure:

  • Code follows the style guide (Ruff + MyPy)
  • Tests pass (`pytest`)
  • Documentation is updated

๐Ÿ™ Acknowledgments

๐Ÿ“ง Contact

Anton Feldmann - anton.feldmann@gmail.com

Project Link: https://github.com/afeldman/aptt


Version: 0.2.0 | Python: >=3.11 | PyTorch: >=2.6.0 | Lightning: >=2.5.1
