Skip to main content

A highly configurable implementation of our approach in the Aftab paper, benchmarking different convolutional neural networks and their effects on the final results.

Project description

Overview

Aftab (Persian: آفتاب, meaning "sun" or "sun rays") is a benchmarking framework for evaluating CNN-based encoders in PQN across Atari environments.
It provides standardized training, evaluation, and reproducibility tools for deep reinforcement learning research.

IQM HNS IQM HNS (Last 50M Frames)
Global Performance Last 50M Frames

Global performance of base encoders.

IQM HNS IQM HNS (Last 50M Frames)
Hadamax Global Performance Last 50M Frames

Comparison of two Gamma encoder variants based on findings from Hadamax Encoding: Elevating Performance in Model-Free Atari .

Installation

Install via pip:

pip install aftab

Usage

from aftab import Aftab
from aftab import aftab_environments

seeds = [1, 2, 3, 4]

for environment in aftab_environments:
    agent = Aftab(encoder="gamma", frames="pilot")
    for seed in seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

Defining a Custom Encoder

You can define your own encoder as a PyTorch module and pass it to the agent:

import torch
from aftab import Aftab

class CustomImageEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
  
    def forward(self, x):
        pass

agent = Aftab(encoder=CustomImageEncoder, frames="pilot")

Results

Base Encoder Experiments

Hadamax Experiments

Note: The Eta variant has significantly more parameters than other variants, primarily due to the encoder producing a large number of features.

Parameter Count

Base Encoder Variations

Variant Encoder Parameters Q Regression Head Total Parameters
PQN 78,304 1,686,500 1,764,804
Alpha 174,752 1,782,948 1,957,700
Beta 89,008 1,782,948 1,871,956
Gamma 117,168 1,725,364 1,842,532
Delta 78,552 1,850,588 1,929,140
Epsilon 80,112 2,179,828 2,259,940
Zeta 77,232 2,537,396 2,614,628
Eta 78,400 23,739,460 23,817,860
Theta 76,288 1,127,428 1,203,716

Hadamax Variants

Variant Encoder Parameters Q Regression Head Total Parameters
PQN Hadamax 156,608 3,968,516 4,125,124
Gamma Hadamax V1 234,336 1,609,220 1,843,556
Gamma Hadamax V2 234,336 3,280,388 3,514,724

Hyperparameters

Hyperparameter Value
Learning rate $2.5 \times 10^{-4}$
Training environments 128
Test environments 8
Optimizer Rectified Adam
Weight decay 0
$\epsilon$ $1 \times 10^{-5}$
$\beta_{1}$ 0.9
$\beta_{2}$ 0.999
Total Frames 200,000,000
Loss function Mean Squared Error
Scheduler Linear Annealing
$\epsilon$-greedy exploration 10% of total frames
Discount factor ($\gamma$) 0.99
GAE ($\lambda$) 0.65
Epochs 2
Batch size 4096

Used in encoder and Hadamax experiments.

Statistical Significance

PQN Alpha Beta Gamma Delta Epsilon Zeta Eta Theta
PQN - - - - - - - - -
Alpha 0 - - - - - - - -
Beta 0 0.847 - - - - - - -
Gamma 0 0.295 0.802 - - - - - -
Delta 0 0 0 0 - - - - -
Epsilon 0 0.104 0.068 0.01 0 - - - -
Zeta 0 0.145 0.293 0.024 0 0.552 - - -
Eta 0.001 0.337 0.757 0.221 0 0.819 0.967 - -
Theta 0.431 0 0.004 0 0.046 0.001 0.001 0.002 -
Gamma Hadamax Gamma V1 Hadamax Gamma V2 Hadamax
Gamma - - - -
Hadamax Gamma V1 0 - - -
Hadamax Gamma V2 0 0.72 - -
Hadamax Nature DQN 0 0.078 0.151 -

Reproducibility

Due to the stochastic nature of deep reinforcement learning, exact reproducibility via fixed datasets is not feasible.
Instead, we provide a set of random seeds used in our experiments.

from aftab import aftab_seeds

print(aftab_seeds)

Full experiment replication:

from aftab import Aftab
from aftab import aftab_environments
from aftab import aftab_seeds

for environment in aftab_environments:
    agent = Aftab()
    for seed in aftab_seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

A comprehensive set of Atari environments is available via EnvPool:
https://envpool.readthedocs.io/en/latest/env/atari.html#available-tasks

Citation

@article{aftab2026benchmarking,
  title={Aftab: Benchmarking {CNN} Encoders in {PQN}},
  author={Shieenavaz, Taha and Zareshahraki, Shabnam and Nanni, Loris},
  journal={arXiv preprint arXiv:YYMM.NNNNN},
  year={2026}
}

Related Works

@misc{2107.09645,
  Author = {Denis Yarats and Rob Fergus and Alessandro Lazaric and Lerrel Pinto},
  Title = {Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning},
  Year = {2021},
  Eprint = {arXiv:2107.09645},
}
@misc{2403.03950,
  Author = {Jesse Farebrother and Jordi Orbay and Quan Vuong and Adrien Ali Taïga and Yevgen Chebotar and Ted Xiao and Alex Irpan and Sergey Levine and Pablo Samuel Castro and Aleksandra Faust and Aviral Kumar and Rishabh Agarwal},
  Title = {Stop Regressing: Training Value Functions via Classification for Scalable Deep RL},
  Year = {2024},
  Eprint = {arXiv:2403.03950},
}

License

© 2025 Taha Shieenavaz.
Licensed under CC BY-NC 4.0: https://creativecommons.org/licenses/by-nc/4.0/

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aftab-0.1.29.tar.gz (2.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aftab-0.1.29-py3-none-any.whl (58.5 kB view details)

Uploaded Python 3

File details

Details for the file aftab-0.1.29.tar.gz.

File metadata

  • Download URL: aftab-0.1.29.tar.gz
  • Upload date:
  • Size: 2.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.1.29.tar.gz
Algorithm Hash digest
SHA256 3e2b894c877ffc594906263599d8134390f07f45ec2de8824c8a7cbdcab5424f
MD5 6e090013f1975e387d92e3c1b3ecfba4
BLAKE2b-256 1a2cd98b33cec55ecc9324282ee595f5af2ef3d8e37b7fb5c1eaee5bd3601f2d

See more details on using hashes here.

File details

Details for the file aftab-0.1.29-py3-none-any.whl.

File metadata

  • Download URL: aftab-0.1.29-py3-none-any.whl
  • Upload date:
  • Size: 58.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.1.29-py3-none-any.whl
Algorithm Hash digest
SHA256 4129f554434fe9668f7c4e863e21c06a4c608d7d2deec39a71c590daeba5e585
MD5 fc57599835fabc7830271847753f0fd6
BLAKE2b-256 c720e2b967b95e9c71f743d5a37ce15fb26da1371aff56d879a2a5e691bcb2a5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page