Skip to main content

A highly configurable implementation of our approach in the Aftab paper, benchmarking different convolutional neural networks and their effects on the final results.

Project description

Overview

Aftab (آفتاب) is a benchmarking framework for evaluating CNN-based encoders in PQN across Atari environments.
It provides standardized training, evaluation, and reproducibility tools for deep reinforcement learning research.

IQM HNS IQM HNS (Last 50M Frames)
Global Performance Last 50M Frames

Global performance of base encoders.

IQM HNS IQM HNS (Last 50M Frames)
Hadamax Global Performance Last 50M Frames

Comparison of two Gamma encoder variants based on findings from Hadamax Encoding: Elevating Performance in Model-Free Atari .

Installation

Install via pip:

pip install aftab

Usage

from aftab import Aftab
from aftab import aftab_environments

seeds = [1, 2, 3, 4]

for environment in aftab_environments:
    agent = Aftab(encoder="gamma", frames="pilot")
    for seed in seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

Defining a Custom Encoder

You can define your own encoder as a PyTorch module and pass it to the agent:

import torch
from aftab import Aftab

class CustomImageEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
  
    def forward(self, x):
        pass

agent = Aftab(encoder=CustomImageEncoder, frames="pilot")

Results

Base Encoder Experiments

Hadamax Experiments

Note: The Eta variant has significantly more parameters than other variants, primarily due to the encoder producing a large number of features.

Parameter Count

Base Encoder Variations

Variant Encoder Parameters Q Regression Head Total Parameters
PQN 78,304 1,686,500 1,764,804
Alpha 174,752 1,782,948 1,957,700
Beta 89,008 1,782,948 1,871,956
Gamma 117,168 1,725,364 1,842,532
Delta 78,552 1,850,588 1,929,140
Epsilon 80,112 2,179,828 2,259,940
Zeta 77,232 2,537,396 2,614,628
Eta 78,400 23,739,460 23,817,860
Theta 76,288 1,127,428 1,203,716

Hadamax Variants

Variant Encoder Parameters Q Regression Head Total Parameters
PQN Hadamax 156,608 3,968,516 4,125,124
Gamma Hadamax V1 234,336 1,609,220 1,843,556
Gamma Hadamax V2 234,336 3,280,388 3,514,724

Hyperparameters

Hyperparameter Value
Learning rate $2.5 \times 10^{-4}$
Training environments 128
Test environments 8
Optimizer Rectified Adam
Weight decay 0
$\epsilon$ $1 \times 10^{-5}$
$\beta_{1}$ 0.9
$\beta_{2}$ 0.999
Total Frames 200,000,000
Loss function Mean Squared Error
Scheduler Linear Annealing
$\epsilon$-greedy exploration 10% of total frames
Discount factor ($\gamma$) 0.99
GAE ($\lambda$) 0.65
Epochs 2
Batch size 4096

Used in encoder and Hadamax experiments.

Statistical Significance

PQN Alpha Beta Gamma Delta Epsilon Zeta Eta Theta
PQN - - - - - - - - -
Alpha 0 - - - - - - - -
Beta 0 0.847 - - - - - - -
Gamma 0 0.295 0.802 - - - - - -
Delta 0 0 0 0 - - - - -
Epsilon 0 0.104 0.068 0.01 0 - - - -
Zeta 0 0.145 0.293 0.024 0 0.552 - - -
Eta 0.001 0.337 0.757 0.221 0 0.819 0.967 - -
Theta 0.431 0 0.004 0 0.046 0.001 0.001 0.002 -
Gamma Hadamax Gamma V1 Hadamax Gamma V2 Hadamax
Gamma - - - -
Hadamax Gamma V1 0 - - -
Hadamax Gamma V2 0 0.72 - -
Hadamax Nature DQN 0 0.078 0.151 -

Reproducibility

Due to the stochastic nature of deep reinforcement learning, exact reproducibility via fixed datasets is not feasible.
Instead, we provide a set of random seeds used in our experiments.

from aftab import aftab_seeds

print(aftab_seeds)

Full experiment replication:

from aftab import Aftab
from aftab import aftab_environments
from aftab import aftab_seeds

for environment in aftab_environments:
    agent = Aftab()
    for seed in aftab_seeds:
        agent.train(environment=environment, seed=seed)
        agent.log()

A comprehensive set of Atari environments is available via EnvPool:
https://envpool.readthedocs.io/en/latest/env/atari.html#available-tasks

Citation

@article{aftab2026benchmarking,
  title={Aftab: Benchmarking {CNN} Encoders in {PQN}},
  author={Shieenavaz, Taha and Zareshahraki, Shabnam and Nanni, Loris},
  journal={arXiv preprint arXiv:YYMM.NNNNN},
  year={2026}
}

License

© 2025 Taha Shieenavaz.
Licensed under CC BY-NC 4.0: https://creativecommons.org/licenses/by-nc/4.0/

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aftab-0.0.83.tar.gz (2.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aftab-0.0.83-py3-none-any.whl (58.9 kB view details)

Uploaded Python 3

File details

Details for the file aftab-0.0.83.tar.gz.

File metadata

  • Download URL: aftab-0.0.83.tar.gz
  • Upload date:
  • Size: 2.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.0.83.tar.gz
Algorithm Hash digest
SHA256 7d472ea898c23db0223aa82d2031d09b3abace8f295eecd35ba5fe48291a6692
MD5 dc989ff3066e2e4d2c430e2d80748417
BLAKE2b-256 d6124ea4ffbc6e34ff36356c926f6e18ce06aae38b2929977753e9855de205a8

See more details on using hashes here.

File details

Details for the file aftab-0.0.83-py3-none-any.whl.

File metadata

  • Download URL: aftab-0.0.83-py3-none-any.whl
  • Upload date:
  • Size: 58.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for aftab-0.0.83-py3-none-any.whl
Algorithm Hash digest
SHA256 1ede5dc3140e195eefc6902d3fdcf7d4b728d42cab847e9d3364307f3ba01ff8
MD5 d142f2ecda46d4132ffe91b778221674
BLAKE2b-256 c23f2157b1eb64c6e954755c63186a24094e143d64e1891cc77bebaeb0901686

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page