A highly configurable implementation of our approach in the Aftab paper, benchmarking different convolutional neural networks and their effects on the final results.

[Figure: IQM HNS, global performance]
[Figure: IQM HNS, last 50M frames]

Installation

The whole project is packaged as an installable Python library, which you can install with pip.

pip install aftab

Usage

You can import the agent and configure all of its hyperparameters as shown in the following example.

from aftab import Aftab

seeds = [1, 2, 3, 4]
environments = ["Pong-v5", "IceHockey-v5"]

for environment in environments:
  agent = Aftab(encoder="gamma", frames="pilot")
  for seed in seeds:
    agent.train(environment=environment, seed=seed)
    agent.save(environment=environment, seed=seed, name="MyCustomAftabTests")

Defining Custom Encoder

You can define a custom encoder as a PyTorch module and pass it to the agent at initialization. Aftab detects the custom module and runs the experiments with it.

import torch
from aftab import Aftab

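class CustomImageEncoder(torch.nn.Module):
  def __init__(self):
    super().__init__()
    # Define the encoder layers here.

  def forward(self, x):
    # Compute and return the encoded features for x (left as a placeholder here).
    pass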

agent = Aftab(encoder=CustomImageEncoder, frames="pilot")
# ...

Results

Note: When interpreting the results, bear in mind that the Eta variant has significantly more parameters than the other variants, principally because its encoder yields a large number of features (see the Parameter Count table).

Game PQN Alpha Beta Gamma Delta Epsilon Zeta Eta Theta
Alien 0.542 1.785 2.408 2.845 0.381 2.262 1.840 1.675 0.411
Amidar 0.640 1.346 0.862 1.038 0.529 1.381 0.726 1.023 0.665
Assault 30.169 24.547 28.574 35.629 24.864 34.124 29.913 34.288 35.067
Asterix 40.617 15.338 14.821 14.311 38.146 10.255 12.616 22.730 38.306
Asteroids 0.034 1.828 0.349 1.438 0.223 1.039 1.752 0.064 0.026
Atlantis 47.172 43.482 44.532 46.336 42.942 47.557 45.964 44.951 46.165
Bank Heist 1.885 2.001 2.112 1.721 1.836 1.699 2.131 1.718 1.949
Battle Zone 1.212 1.425 1.894 1.880 1.197 2.121 1.538 1.067 1.073
Beam Rider 1.144 2.973 2.355 2.388 1.069 1.997 1.591 1.348 1.152
Berzerk 2.251 4.676 0.343 1.227 1.364 1.475 2.984 0.868 1.368
Bowling 0.044 0.100 0.037 0.133 0.075 0.059 0.092 0.113 0.056
Boxing 8.244 8.325 8.297 8.285 8.325 8.324 8.310 8.325 8.319
Breakout 12.019 15.045 15.712 16.763 11.809 12.695 12.870 17.246 14.332
Centipede 0.828 1.300 0.990 1.473 0.693 0.955 0.665 1.120 0.540
Chopper Command 2.679 24.380 47.179 55.506 0.839 40.922 31.491 3.387 1.515
Crazy Climber 6.268 6.594 7.767 6.467 6.786 7.175 6.645 6.839 6.152
Defender 3.173 4.566 4.316 5.861 3.155 3.876 3.967 3.259 5.806
Demon Attack 72.523 72.842 73.151 72.884 70.913 72.627 72.075 71.496 70.971
Double Dunk 7.763 7.783 7.960 7.804 7.571 7.869 7.732 8.171 7.605
Enduro 2.723 2.737 2.709 2.693 2.724 2.711 2.697 2.696 2.710
Fishing Derby 2.525 2.623 2.545 2.590 2.459 2.652 2.655 2.559 2.509
Freeway 1.132 1.148 1.140 1.112 1.136 1.137 1.140 1.140 1.136
Frostbite 1.545 2.564 2.318 2.172 1.179 1.494 2.108 1.804 1.075
Gopher 24.506 25.816 33.980 29.682 17.045 21.291 24.041 31.583 18.757
Gravitar 0.235 0.267 0.396 0.353 0.188 0.255 0.328 0.510 0.118
Hero 0.782 0.849 0.754 0.719 0.694 0.788 0.883 1.069 0.448
Ice Hockey 0.850 1.183 1.065 1.890 0.703 1.050 1.056 1.253 0.631
James Bond 8.134 12.279 6.875 17.127 4.725 28.243 12.081 13.439 8.514
Kangaroo 4.480 4.801 4.773 4.623 4.068 4.584 4.688 4.518 4.652
Krull 7.621 8.625 8.707 8.474 7.064 8.117 8.678 8.008 7.655
Kung Fu Master 1.415 1.776 1.487 1.377 1.579 1.412 1.517 1.478 1.791
Montezuma's Revenge 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.004 0.000
Ms. Pac-Man 0.454 0.681 0.952 0.812 0.515 0.745 0.801 1.112 0.657
Name This Game 2.278 2.825 2.561 3.240 2.322 2.275 2.251 1.877 2.579
Phoenix 24.840 41.930 38.203 33.452 7.793 29.553 42.589 38.638 13.875
Pitfall! 0.032 0.033 0.026 0.033 0.034 0.033 0.034 0.033 0.030
Pong 1.181 1.181 1.181 1.181 1.181 1.181 1.181 1.181 1.181
Private Eye 0.012 0.000 0.001 0.001 -0.000 0.000 0.000 -0.000 0.001
Q*bert 1.572 1.851 1.874 1.816 1.332 1.762 1.860 1.881 1.647
River Raid 1.355 1.672 1.669 1.737 1.405 1.731 1.819 1.495 1.378
Road Runner 7.268 10.362 10.011 21.188 7.391 10.825 11.094 7.851 7.007
Robotank 7.109 7.435 7.254 6.960 7.263 7.185 7.369 6.577 6.855
Seaquest 0.187 0.235 0.194 0.201 0.198 0.210 0.407 0.409 0.192
Skiing -0.581 0.457 -0.179 -0.393 -0.500 0.554 0.496 0.541 -0.388
Solaris 0.111 0.074 0.153 0.190 0.077 0.133 0.117 0.180 0.282
Space Invaders 4.841 3.929 8.819 5.266 4.504 5.061 1.731 15.333 4.159
Star Gunner 27.278 38.522 43.751 42.478 24.829 32.671 31.591 24.522 24.052
Surround 1.069 1.139 1.161 1.123 0.928 1.141 1.198 1.197 0.946
Tennis 1.381 2.290 1.375 2.676 1.459 2.266 1.363 1.875 1.402
Time Pilot 5.901 15.564 20.210 14.657 4.866 12.508 12.864 14.877 4.423
Tutankham 1.519 1.571 1.557 1.538 1.512 1.498 1.536 1.601 1.543
Up 'n Down 23.231 23.353 23.545 24.046 16.812 27.409 27.981 7.877 24.057
Venture 0.000 0.000 0.018 0.001 0.000 0.000 0.000 0.000 0.000
Video Pinball 315.049 365.167 312.556 366.645 359.358 302.946 365.137 372.327 344.773
Wizard of Wor 4.443 7.319 7.254 6.468 3.191 7.742 6.266 8.059 5.384
Yars' Revenge 2.222 2.632 2.559 2.550 1.820 2.373 2.722 2.653 2.198
Zaxxon 1.834 2.109 2.539 2.306 1.739 1.617 1.727 2.267 1.904
Median 1.885 2.623 2.408 2.550 1.579 2.262 2.108 1.877 1.791
IQM 2.692 3.536 3.472 3.481 2.374 3.315 3.207 3.114 2.649
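As a rough illustration of how summary rows such as Median and IQM can be computed from per-game scores, the sketch below uses the first eight PQN entries of the table as a sample. The helper name `interquartile_mean` is hypothetical, and the sketch assumes that the IQM is the mean of the middle 50% of the sorted scores. The aggregate rows above may have been computed differently (for example, over individual seeds), so this is not expected to reproduce them.

```python
import statistics


def interquartile_mean(scores):
    """Mean of the middle 50% of the scores (hypothetical helper).

    Uses simple integer trimming, so it is exact only when the number
    of scores is divisible by four.
    """
    ordered = sorted(scores)
    n = len(ordered)
    trim = n // 4  # number of values dropped from each tail
    middle = ordered[trim:n - trim]
    return sum(middle) / len(middle)


# First eight PQN scores from the table (Alien through Battle Zone).
pqn_sample = [0.542, 0.640, 30.169, 40.617, 0.034, 47.172, 1.885, 1.212]

sample_median = statistics.median(pqn_sample)
sample_iqm = interquartile_mean(pqn_sample)
print(f"median={sample_median:.4f}, iqm={sample_iqm:.4f}")
```

Both statistics damp the influence of outlier games such as Video Pinball, which is why they are commonly preferred over a plain mean for Atari results.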

Parameter Count

Variant Encoder Parameters Regression Head Parameters Total Parameters
PQN 78,304 1,686,500 1,764,804
Alpha 174,752 1,782,948 1,957,700
Beta 89,008 1,782,948 1,871,956
Gamma 117,168 1,725,364 1,842,532
Delta 78,552 1,850,588 1,929,140
Epsilon 80,112 2,179,828 2,259,940
Zeta 77,232 2,537,396 2,614,628
Eta 78,400 23,739,460 23,817,860
Theta 76,288 1,127,428 1,203,716
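The totals in the table can be sanity-checked with a few lines of Python. The sketch below copies the counts from the table, confirms that each total equals encoder plus regression head, and reports each variant's size relative to the PQN baseline. The variable names are illustrative and not part of the library.

```python
# Parameter counts copied from the table: (encoder, regression head, total).
parameter_counts = {
    "PQN": (78_304, 1_686_500, 1_764_804),
    "Alpha": (174_752, 1_782_948, 1_957_700),
    "Beta": (89_008, 1_782_948, 1_871_956),
    "Gamma": (117_168, 1_725_364, 1_842_532),
    "Delta": (78_552, 1_850_588, 1_929_140),
    "Epsilon": (80_112, 2_179_828, 2_259_940),
    "Zeta": (77_232, 2_537_396, 2_614_628),
    "Eta": (78_400, 23_739_460, 23_817_860),
    "Theta": (76_288, 1_127_428, 1_203_716),
}

# Each total should equal encoder + regression head.
mismatches = [
    name
    for name, (encoder, head, total) in parameter_counts.items()
    if encoder + head != total
]

# Size of every variant relative to the PQN baseline.
baseline_total = parameter_counts["PQN"][2]
relative_size = {
    name: total / baseline_total
    for name, (_, _, total) in parameter_counts.items()
}

print("rows with inconsistent totals:", mismatches)
for name, ratio in relative_size.items():
    print(f"{name:8s} {ratio:6.2f}x the PQN baseline")
```

The ratios make the note about Eta concrete: its encoder is about the same size as the others, and almost all of the extra parameters sit in the regression head.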

Hyperparameters

Hyperparameter Value
Learning rate $2.5 \times 10^{-4}$
Training environments 128
Test environments 8
Optimizer Rectified Adam
Adam Weight decay 0
Adam $\epsilon$ $1 \times 10^{-5}$
Adam $\beta_{1}$ 0.9
Adam $\beta_{2}$ 0.999
Total Frames 200,000,000
Loss function Mean Squared Error
Scheduler Linear Annealing
$\epsilon$-greedy exploration 10% of total frames
Discount factor ($\gamma$) 0.99
GAE parameter ($\lambda$) 0.65
Epochs 2
Batch size 4096
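The table states that ε-greedy exploration is annealed over 10% of the total frames. One way to read that is a linear ε schedule like the sketch below. The function name `epsilon_at` and the start and end values (1.0 and 0.01) are assumptions for illustration only; the table does not state them, and the library's own schedule may differ.

```python
TOTAL_FRAMES = 200_000_000    # "Total Frames" from the table
EXPLORATION_FRACTION = 0.10   # "10% of total frames" from the table


def epsilon_at(frame, start=1.0, end=0.01):
    """Linearly anneal epsilon from `start` to `end`, then hold it at `end`.

    `start` and `end` are hypothetical values, not taken from the table.
    """
    anneal_frames = TOTAL_FRAMES * EXPLORATION_FRACTION
    progress = min(max(frame / anneal_frames, 0.0), 1.0)
    return start + progress * (end - start)


for frame in (0, 10_000_000, 20_000_000, 200_000_000):
    print(frame, round(epsilon_at(frame), 4))
```

Under these assumptions the agent explores heavily for the first 20M frames and acts almost greedily for the remaining 180M.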

Statistical Significance

Variant PQN Alpha Beta Gamma Delta Epsilon Zeta Eta Theta
PQN 1 0 0 0 0 0 0 0.001 0.431
Alpha 0 1 0.847 0.295 0 0.104 0.145 0.337 0
Beta 0 0.847 1 0.802 0 0.068 0.293 0.757 0.004
Gamma 0 0.295 0.802 1 0 0.01 0.024 0.221 0
Delta 0 0 0 0 1 0 0 0 0.046
Epsilon 0 0.104 0.068 0.01 0 1 0.552 0.819 0.001
Zeta 0 0.145 0.293 0.024 0 0.552 1 0.967 0.001
Eta 0.001 0.337 0.757 0.221 0 0.819 0.967 1 0.002
Theta 0.431 0 0.004 0 0.046 0.001 0.001 0.002 1
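Assuming the entries above are pairwise p-values (the table does not name the test that produced them), the sketch below checks that the matrix is symmetric and lists the variants that cannot be distinguished from a given variant at a 0.05 threshold. The 0.05 cutoff and the helper name `indistinguishable_from` are illustrative choices.

```python
variants = ["PQN", "Alpha", "Beta", "Gamma", "Delta",
            "Epsilon", "Zeta", "Eta", "Theta"]

# Rows copied from the table, in the same order as `variants`.
p_values = [
    [1, 0, 0, 0, 0, 0, 0, 0.001, 0.431],
    [0, 1, 0.847, 0.295, 0, 0.104, 0.145, 0.337, 0],
    [0, 0.847, 1, 0.802, 0, 0.068, 0.293, 0.757, 0.004],
    [0, 0.295, 0.802, 1, 0, 0.01, 0.024, 0.221, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0.046],
    [0, 0.104, 0.068, 0.01, 0, 1, 0.552, 0.819, 0.001],
    [0, 0.145, 0.293, 0.024, 0, 0.552, 1, 0.967, 0.001],
    [0.001, 0.337, 0.757, 0.221, 0, 0.819, 0.967, 1, 0.002],
    [0.431, 0, 0.004, 0, 0.046, 0.001, 0.001, 0.002, 1],
]

# A pairwise comparison matrix should be symmetric.
size = len(variants)
is_symmetric = all(
    p_values[i][j] == p_values[j][i]
    for i in range(size)
    for j in range(size)
)


def indistinguishable_from(name, alpha=0.05):
    """Variants whose p-value against `name` is at least `alpha`."""
    row = p_values[variants.index(name)]
    return [
        other
        for other, p in zip(variants, row)
        if other != name and p >= alpha
    ]


print("symmetric:", is_symmetric)
print("not distinguishable from Alpha:", indistinguishable_from("Alpha"))
print("not distinguishable from PQN:", indistinguishable_from("PQN"))
```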

Reproducibility

Because deep reinforcement learning experiments cannot be tied to a standalone dataset, we instead list the seeds that were used in our experiments. These seeds can be used to replicate our results.

from aftab import Aftab

seeds = [475284, 219842, 525975, 909314]
# the rest of the code

The seeds themselves were generated randomly with Python's random library.
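If you want to draw fresh seeds for your own runs, a minimal sketch with the standard `random` module could look like the following. The helper name `draw_seeds` and the upper bound of 1,000,000 are assumptions for illustration; this is not necessarily the procedure that produced the seeds listed above.

```python
import random


def draw_seeds(count=4, upper=1_000_000, rng=None):
    """Draw `count` distinct integer seeds below `upper` (hypothetical helper)."""
    rng = rng or random.Random()
    return rng.sample(range(upper), count)


seeds = draw_seeds()
print(seeds)
```

Passing a seeded generator, for example `draw_seeds(rng=random.Random(0))`, makes the drawn seed list itself repeatable.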

As an alternative, one can simply import the default seeds from the installed library as shown below.

from aftab import aftab_seeds

print(aftab_seeds)

Available Atari Environments

The maintainers of the EnvPool library provide a comprehensive set of Atari environments, listed in the EnvPool documentation.

Aftab passes the environment argument directly to EnvPool, so you can use any environment from that list that your project needs.

Citation

Please cite this work if you find it useful.

@article{aftab2026benchmarking,
  title={Aftab: Benchmarking {CNN} Encoders in {PQN}},
  author={Shieenavaz, Taha and Zareshahraki, Shabnam and Nanni, Loris},
  journal={arXiv preprint arXiv:YYMM.NNNNN},
  year={2026}
}
