A highly configurable implementation of the approach from the Aftab paper, for benchmarking convolutional neural network encoders and their effect on final performance.
Overview
Aftab (آفتاب) is a benchmarking framework for evaluating CNN-based encoders in PQN across Atari environments.
It provides standardized training, evaluation, and reproducibility tools for deep reinforcement learning research.
Figure (two panels: IQM HNS; IQM HNS over the last 50M frames): Global performance of base encoders.
Figure (two panels: IQM HNS; IQM HNS over the last 50M frames): Comparison of two Gamma encoder variants based on findings from "Hadamax Encoding: Elevating Performance in Model-Free Atari".
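The figures report IQM HNS. As a rough sketch of what that metric involves, assuming HNS denotes the human-normalized score and IQM the interquartile mean (the mean of the middle 50% of values), it could be computed as below. The function names and sample numbers are illustrative and are not taken from the aftab package.

```python
def human_normalized_score(score, random_score, human_score):
    """Rescale a raw game score so that random play maps to 0 and human play to 1."""
    return (score - random_score) / (human_score - random_score)


def interquartile_mean(values):
    """Mean of the middle 50% of values (drops the lowest and highest 25%)."""
    ordered = sorted(values)
    cut = len(ordered) // 4  # number of values trimmed from each end
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


# Illustrative numbers only, not results from the paper.
hns = human_normalized_score(score=60.0, random_score=10.0, human_score=110.0)
iqm = interquartile_mean([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 100.0])
print(hns, iqm)
```

Trimming both tails keeps a single outlier game (the 100.0 above) from dominating the aggregate, which is the usual motivation for reporting IQM instead of a plain mean.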
Installation
Install via pip:
pip install aftab
Usage
from aftab import Aftab
from aftab import aftab_environments

seeds = [1, 2, 3, 4]

# Train one agent per environment, once for each seed.
for environment in aftab_environments:
    agent = Aftab(encoder="gamma", frames="pilot")
    for seed in seeds:
        agent.train(environment=environment, seed=seed)
    agent.log()
Defining a Custom Encoder
You can define your own encoder as a PyTorch module and pass it to the agent:
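import torch
from aftab import Aftab


class CustomImageEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Define the encoder layers here.

    def forward(self, x):
        # Placeholder: map a batch of observations to features.
        pass


# The encoder class itself is passed to the agent.
agent = Aftab(encoder=CustomImageEncoder, frames="pilot")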
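When writing a custom encoder, it helps to know how many features the convolutional stack produces before they reach the Q regression head. The sketch below applies the standard convolution output-size formula to an assumed 84x84 input and an assumed three-layer stack. The layer sizes are those of the classic Nature DQN network and are used purely for illustration; the source does not state the input size or output shape that Aftab expects.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after one convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size, layers):
    """Number of features after a stack of square convolutions, once flattened.

    `layers` is a list of (out_channels, kernel, stride) tuples.
    """
    size = input_size
    channels = None
    for out_channels, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        channels = out_channels
    return channels * size * size


# Assumed example stack: (out_channels, kernel, stride) for each layer.
example_layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
features = flattened_features(84, example_layers)
print(features)  # 3136
```

A larger flattened feature count directly inflates the first linear layer of the head, so this number is worth checking before training a new variant.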
Results
Base Encoder Experiments
Hadamax Experiments
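Note: The Eta variant has far more parameters than the other variants (about 23.8M in total, versus 1.2M to 2.6M for the other base variants). Nearly all of them sit in the Q regression head, primarily because the Eta encoder produces a much larger number of features.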
Parameter Count
Base Encoder Variations
| Variant | Encoder Parameters | Q Regression Head | Total Parameters |
|---|---|---|---|
| PQN | 78,304 | 1,686,500 | 1,764,804 |
| Alpha | 174,752 | 1,782,948 | 1,957,700 |
| Beta | 89,008 | 1,782,948 | 1,871,956 |
| Gamma | 117,168 | 1,725,364 | 1,842,532 |
| Delta | 78,552 | 1,850,588 | 1,929,140 |
| Epsilon | 80,112 | 2,179,828 | 2,259,940 |
| Zeta | 77,232 | 2,537,396 | 2,614,628 |
| Eta | 78,400 | 23,739,460 | 23,817,860 |
| Theta | 76,288 | 1,127,428 | 1,203,716 |
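
The totals in the table are the sum of the encoder and Q regression head counts. The short sketch below rechecks that for a few rows and computes how much of each model sits in the head, which makes the Eta outlier explicit. The dictionary layout and the helper name are illustrative.

```python
# (encoder parameters, Q regression head parameters, total) copied from the table above.
parameter_counts = {
    "PQN": (78_304, 1_686_500, 1_764_804),
    "Gamma": (117_168, 1_725_364, 1_842_532),
    "Eta": (78_400, 23_739_460, 23_817_860),
    "Theta": (76_288, 1_127_428, 1_203_716),
}


def head_share(encoder, head):
    """Fraction of all parameters that belong to the Q regression head."""
    return head / (encoder + head)


for name, (encoder, head, total) in parameter_counts.items():
    assert encoder + head == total  # the reported total is the sum of both parts
    print(f"{name}: {head_share(encoder, head):.1%} of parameters in the head")
```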
Hadamax Variants
| Variant | Encoder Parameters | Q Regression Head | Total Parameters |
|---|---|---|---|
| PQN Hadamax | 156,608 | 3,968,516 | 4,125,124 |
| Gamma Hadamax V1 | 234,336 | 1,609,220 | 1,843,556 |
| Gamma Hadamax V2 | 234,336 | 3,280,388 | 3,514,724 |
Hyperparameters
| Hyperparameter | Value |
|---|---|
| Learning rate | $2.5 \times 10^{-4}$ |
| Training environments | 128 |
| Test environments | 8 |
| Optimizer | Rectified Adam |
| Weight decay | 0 |
| Optimizer $\epsilon$ | $1 \times 10^{-5}$ |
| $\beta_{1}$ | 0.9 |
| $\beta_{2}$ | 0.999 |
| Total frames | 200,000,000 |
| Loss function | Mean Squared Error |
| Scheduler | Linear Annealing |
| $\epsilon$-greedy exploration | 10% of total frames |
| Discount factor ($\gamma$) | 0.99 |
| GAE ($\lambda$) | 0.65 |
| Epochs | 2 |
| Batch size | 4096 |
These hyperparameters were used in both the base encoder and the Hadamax experiments.
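
To make the schedule rows concrete, here is a minimal sketch of linear annealing. It assumes that ε-greedy exploration decays linearly over the first 10% of the 200M frames and that the learning rate decays linearly over the whole run. The ε start and end values (1.0 and 0.01) and the final learning rate of 0 are placeholders chosen for illustration, since the table does not list them.

```python
def linear_anneal(step, start, end, duration):
    """Linearly interpolate from `start` to `end` over `duration` steps, then hold `end`."""
    fraction = min(max(step / duration, 0.0), 1.0)
    return start + fraction * (end - start)


TOTAL_FRAMES = 200_000_000
EXPLORATION_FRAMES = TOTAL_FRAMES // 10  # 10% of total frames, from the table

# Assumed values: the table does not give the epsilon range or the final learning rate.
EPS_START, EPS_END = 1.0, 0.01
LR_START, LR_END = 2.5e-4, 0.0

for frame in (0, 10_000_000, 20_000_000, 100_000_000):
    epsilon = linear_anneal(frame, EPS_START, EPS_END, EXPLORATION_FRAMES)
    lr = linear_anneal(frame, LR_START, LR_END, TOTAL_FRAMES)
    print(frame, round(epsilon, 4), lr)
```

Clamping the fraction to the range 0 to 1 means ε simply stays at its final value once the exploration window has passed.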
Statistical Significance
| | PQN | Alpha | Beta | Gamma | Delta | Epsilon | Zeta | Eta | Theta |
|---|---|---|---|---|---|---|---|---|---|
| PQN | - | - | - | - | - | - | - | - | - |
| Alpha | 0 | - | - | - | - | - | - | - | - |
| Beta | 0 | 0.847 | - | - | - | - | - | - | - |
| Gamma | 0 | 0.295 | 0.802 | - | - | - | - | - | - |
| Delta | 0 | 0 | 0 | 0 | - | - | - | - | - |
| Epsilon | 0 | 0.104 | 0.068 | 0.01 | 0 | - | - | - | - |
| Zeta | 0 | 0.145 | 0.293 | 0.024 | 0 | 0.552 | - | - | - |
| Eta | 0.001 | 0.337 | 0.757 | 0.221 | 0 | 0.819 | 0.967 | - | - |
| Theta | 0.431 | 0 | 0.004 | 0 | 0.046 | 0.001 | 0.001 | 0.002 | - |
| | Gamma | Hadamax Gamma V1 | Hadamax Gamma V2 | Hadamax Nature DQN |
|---|---|---|---|---|
| Gamma | - | - | - | - |
| Hadamax Gamma V1 | 0 | - | - | - |
| Hadamax Gamma V2 | 0 | 0.72 | - | - |
| Hadamax Nature DQN | 0 | 0.078 | 0.151 | - |
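
One way to read these tables programmatically is sketched below. It assumes that each entry is a p-value for the pairwise comparison between the row and column variants, and that a reported 0 is a value that rounds to zero at the printed precision. Neither is stated in the source, and the 0.05 threshold is likewise an illustrative choice.

```python
# Lower-triangular entries of the Hadamax table, keyed by (row, column).
pairwise = {
    ("Hadamax Gamma V1", "Gamma"): 0.0,
    ("Hadamax Gamma V2", "Gamma"): 0.0,
    ("Hadamax Gamma V2", "Hadamax Gamma V1"): 0.72,
    ("Hadamax Nature DQN", "Gamma"): 0.0,
    ("Hadamax Nature DQN", "Hadamax Gamma V1"): 0.078,
    ("Hadamax Nature DQN", "Hadamax Gamma V2"): 0.151,
}


def pairs_below(table, threshold):
    """Return the (row, column) pairs whose entry is strictly below `threshold`."""
    return sorted(pair for pair, value in table.items() if value < threshold)


ALPHA = 0.05  # assumed significance level, not taken from the source
for row, column in pairs_below(pairwise, ALPHA):
    print(f"{row} vs {column}")
```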
Reproducibility
Due to the stochastic nature of deep reinforcement learning, exact reproducibility via fixed datasets is not feasible.
Instead, we provide a set of random seeds used in our experiments.
from aftab import aftab_seeds
print(aftab_seeds)
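
Fixing a seed makes a pseudo-random stream repeatable, which is the property the provided seeds rely on. The standard-library sketch below illustrates only that idea. How aftab seeds the environments, PyTorch, and other sources of randomness internally is not described here, and `sample_actions` with its 18-action default is a hypothetical helper.

```python
import random


def sample_actions(seed, count=5, num_actions=18):
    """Draw `count` pseudo-random action indices from a generator seeded with `seed`."""
    generator = random.Random(seed)
    return [generator.randrange(num_actions) for _ in range(count)]


# The same seed reproduces the same sequence of draws.
first = sample_actions(seed=1)
second = sample_actions(seed=1)
print(first == second)  # True
```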
Full experiment replication:
from aftab import Aftab
from aftab import aftab_environments
from aftab import aftab_seeds

# Repeat training on every environment with every published seed.
for environment in aftab_environments:
    agent = Aftab()
    for seed in aftab_seeds:
        agent.train(environment=environment, seed=seed)
    agent.log()
A comprehensive set of Atari environments is available via EnvPool:
https://envpool.readthedocs.io/en/latest/env/atari.html#available-tasks
Citation
@article{aftab2026benchmarking,
title={Aftab: Benchmarking {CNN} Encoders in {PQN}},
author={Shieenavaz, Taha and Zareshahraki, Shabnam and Nanni, Loris},
journal={arXiv preprint arXiv:YYMM.NNNNN},
year={2026}
}
License
© 2025 Taha Shieenavaz.
Licensed under CC BY-NC 4.0: https://creativecommons.org/licenses/by-nc/4.0/