Skip to main content

A highly configurable implementation of our approach in the Aftab paper, benchmarking different convolutional neural networks and their effects on the final results.

Project description

Overview

Aftab (آفتاب) is a benchmarking framework for evaluating CNN-based encoders in PQN across Atari environments.
It provides standardized training, evaluation, and reproducibility tools for deep reinforcement learning research.

IQM HNS IQM HNS (Last 50M Frames)
Global Performance Last 50M Frames

Global performance of base encoders.

IQM HNS IQM HNS (Last 50M Frames)
Hadamax Global Performance Last 50M Frames

Comparison of two Gamma encoder variants based on findings from Hadamax Encoding: Elevating Performance in Model-Free Atari .

Installation

Install via pip:

pip install aftab

Usage

from aftab import Aftab
from aftab import aftab_environments

seeds = [1, 2, 3, 4]

for environment in aftab_environments:
    agent = Aftab(encoder="gamma", frames="pilot")
    for seed in seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

Defining a Custom Encoder

You can define your own encoder as a PyTorch module and pass it to the agent:

import torch
from aftab import Aftab

class CustomImageEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
  
    def forward(self, x):
        pass

agent = Aftab(encoder=CustomImageEncoder, frames="pilot")

Results

Base Encoder Experiments

Hadamax Experiments

Note: The Eta variant has significantly more parameters than other variants, primarily due to the encoder producing a large number of features.

Parameter Count

Base Encoder Variations

Variant Encoder Parameters Q Regression Head Total Parameters
PQN 78,304 1,686,500 1,764,804
Alpha 174,752 1,782,948 1,957,700
Beta 89,008 1,782,948 1,871,956
Gamma 117,168 1,725,364 1,842,532
Delta 78,552 1,850,588 1,929,140
Epsilon 80,112 2,179,828 2,259,940
Zeta 77,232 2,537,396 2,614,628
Eta 78,400 23,739,460 23,817,860
Theta 76,288 1,127,428 1,203,716

Hadamax Variants

Variant Encoder Parameters Q Regression Head Total Parameters
PQN Hadamax 156,608 3,968,516 4,125,124
Gamma Hadamax V1 234,336 1,609,220 1,843,556
Gamma Hadamax V2 234,336 3,280,388 3,514,724

Hyperparameters

Hyperparameter Value
Learning rate $2.5 \times 10^{-4}$
Training environments 128
Test environments 8
Optimizer Rectified Adam
Weight decay 0
$\epsilon$ $1 \times 10^{-5}$
$\beta_{1}$ 0.9
$\beta_{2}$ 0.999
Total Frames 200,000,000
Loss function Mean Squared Error
Scheduler Linear Annealing
$\epsilon$-greedy exploration 10% of total frames
Discount factor ($\gamma$) 0.99
GAE ($\lambda$) 0.65
Epochs 2
Batch size 4096

Used in encoder and Hadamax experiments.

Statistical Significance

PQN Alpha Beta Gamma Delta Epsilon Zeta Eta Theta
PQN - - - - - - - - -
Alpha 0 - - - - - - - -
Beta 0 0.847 - - - - - - -
Gamma 0 0.295 0.802 - - - - - -
Delta 0 0 0 0 - - - - -
Epsilon 0 0.104 0.068 0.01 0 - - - -
Zeta 0 0.145 0.293 0.024 0 0.552 - - -
Eta 0.001 0.337 0.757 0.221 0 0.819 0.967 - -
Theta 0.431 0 0.004 0 0.046 0.001 0.001 0.002 -
Gamma Hadamax Gamma V1 Hadamax Gamma V2 Hadamax
Gamma - - - -
Hadamax Gamma V1 0 - - -
Hadamax Gamma V2 0 0.72 - -
Hadamax Nature DQN 0 0.078 0.151 -

Reproducibility

Due to the stochastic nature of deep reinforcement learning, exact reproducibility via fixed datasets is not feasible.
Instead, we provide a set of random seeds used in our experiments.

from aftab import aftab_seeds

print(aftab_seeds)

Full experiment replication:

from aftab import Aftab
from aftab import aftab_environments
from aftab import aftab_seeds

for environment in aftab_environments:
    agent = Aftab()
    for seed in aftab_seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

A comprehensive set of Atari environments is available via EnvPool:
https://envpool.readthedocs.io/en/latest/env/atari.html#available-tasks

Citation

@article{aftab2026benchmarking,
  title={Aftab: Benchmarking {CNN} Encoders in {PQN}},
  author={Shieenavaz, Taha and Zareshahraki, Shabnam and Nanni, Loris},
  journal={arXiv preprint arXiv:YYMM.NNNNN},
  year={2026}
}

License

© 2025 Taha Shieenavaz.
Licensed under CC BY-NC 4.0: https://creativecommons.org/licenses/by-nc/4.0/

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aftab-0.0.79.tar.gz (2.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aftab-0.0.79-py3-none-any.whl (57.8 kB view details)

Uploaded Python 3

File details

Details for the file aftab-0.0.79.tar.gz.

File metadata

  • Download URL: aftab-0.0.79.tar.gz
  • Upload date:
  • Size: 2.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.0.79.tar.gz
Algorithm Hash digest
SHA256 b1abd91c9267d8eafd44b9ac3c4562a072cbf02ab9a606ada2b2a3ad9b78e5f0
MD5 a95d81f385a7e3bab1796f77979d51b6
BLAKE2b-256 8ac26d13a7b90cfe714409b078d0a2eac62c1d024dc816ef55423ebec955d44f

See more details on using hashes here.

File details

Details for the file aftab-0.0.79-py3-none-any.whl.

File metadata

  • Download URL: aftab-0.0.79-py3-none-any.whl
  • Upload date:
  • Size: 57.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.0.79-py3-none-any.whl
Algorithm Hash digest
SHA256 8bace39ed51165dd895408851d4794fab9d25b39653ddf19e6fb1605bff6319d
MD5 63959af74c64902c2e0ae002357e16f5
BLAKE2b-256 2e6d0bb0ae2567ffb7d9d79bc1eb3d70105252ec7a7e110222ed8ec54c31c91e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page