An implementation of the DQN algorithm and all improvements of Rainbow DQN
Modular-DQN
Fully modular implementation of Rainbow DQN that allows each feature to be toggled individually.
Authors
This project was part of the Fachprojekt "Applied Deep Reinforcement Learning" at TU Dortmund.
Install
To install, run `pip install modular-dqn`.
Requirements
- torch
- prettytable
- gymnasium
- opencv-python
- wandb (optional)
- rich (optional)
Usage
To start learning with our implementation, simply execute the `modular-dqn` command. We provide sensible default values for most hyperparameters, and each of them can be adjusted individually.
Required Arguments
| Argument | Type | Description |
|---|---|---|
| --env, --environment | string | Gymnasium environment to learn |
| -s, --steps | int | Number of steps in training |
| -π, --policy | string | Network to use (MLP \| …) |
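As a minimal sketch, the three required arguments can be assembled into a command line from Python. The helper name `build_command`, the environment id `CartPole-v1`, and the step count are illustrative assumptions and not part of the package; only the flags themselves come from the tables in this README.

```python
import shlex
from typing import List, Optional


def build_command(env: str, steps: int, policy: str,
                  extra: Optional[List[str]] = None) -> List[str]:
    """Assemble an argument list for the modular-dqn command (hypothetical helper)."""
    cmd = ["modular-dqn", "--env", env, "--steps", str(steps), "--policy", policy]
    cmd.extend(extra or [])  # optional flags such as --ddqn or --per
    return cmd


# Example values are assumptions chosen for illustration only.
cmd = build_command("CartPole-v1", 50_000, "MLP", extra=["--ddqn", "--per"])
print(shlex.join(cmd))
# To actually launch training: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when an argument contains spaces.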
Optional Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| --device | string | None | Device used by PyTorch (cpu \| …) |
| --lr, --learning_rate | float | 1e-3 | Learning rate used |
| -ε, --epsilon | float | 1 | Initial epsilon used for epsilon-greedy policy |
| --edi, --epsilon_decay_interval | int | 1e3 | Epsilon decay step interval |
| --eds, --epsilon_decay_step | float | 0.1 | Size of epsilon decay step |
| --e_min, --epsilon_min | float | 0.1 | Minimal epsilon value |
| -𝛾, --gamma | float | 0.9 | Discount factor for future rewards |
| -𝜏, --tau | float | 0.95 | Polyak update factor for target network |
| --bs, --batch_size | int | 32 | Batch size used to update the Q-Function |
| --seed | int | None | Seed for the environment |
| --rm_size, --replay_memory_size | int | 1e7 | Replay memory maximum capacity |
| --rec_trigger | int | None | Records every rec_trigger episodes if provided |
| --wandb | boolean | False | Whether progress should be logged to wandb |
| --tags | List[string] | None | Tags to add to the run on wandb |
| --li, --log_interval | int | 10 | Number of episodes between logs |
| --load_file | string | None | Relative path where the network should be loaded from if provided |
| --optimizer | string | SGD | Name of optimizer to be used (e.g. SGD \| …) |
| --skip_frames, --skp | int | 1 | The number of frames to skip each step |
| --clip, --reward_clipping | int | None | Set to 0 for hard clipping or to any other scale for soft clipping (divided by scale) |
| -α, --alpha | float | 0.5 | Alpha for priority replay |
| -β, --beta | float | 0.5 | Beta for priority replay |
| --store_model | Flag | - | Stores model every rec_trigger interval if set |
| --ddqn | Flag | - | Enables double deep Q-Learning |
| --per | Flag | - | Enables prioritized replay memory |
| --n_step | int | 1 | Sets n-step transition length |
| --noisy | Flag | - | Enables noisy linear layers for exploration |
| --dueling | Flag | - | Uses dueling networks architecture |
| --cat, --categorical | Flag | - | Uses categorical DQN loss |
| --rainbow | Flag | - | Enables all improvements (--n_step should still be set) |
| --kwargs | Dict | None | Additional kwargs passed to environment on creation (usage …) |
| --progress | Flag | - | Display progress bar in terminal (requires rich) |
| --loss | string | SmoothL1 | Name of loss function to be used |
| --obs_size | Tuple | None | Rescale image observations to given size (usage …) |
| --heatmaps | float | 0.0 | Heatmap opacity in videos or 0.0 for no heatmaps (only works with CNN and image observations) |
| --graphs | Flag | False | Generates Q-Value graph in videos (currently only works on Linux) |
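To make the exploration and target-network hyperparameters more concrete, here is a small sketch of how such values are commonly used. It is not taken from the package source. It assumes that epsilon is lowered by `--epsilon_decay_step` once every `--epsilon_decay_interval` steps until it reaches `--epsilon_min`, and that `--tau` is the weight kept on the old target parameters in the Polyak update; the actual implementation may differ on both points. The function names `epsilon_at` and `polyak_update` are hypothetical.

```python
def epsilon_at(step, epsilon=1.0, decay_interval=1000, decay_step=0.1, epsilon_min=0.1):
    """Step-wise linear epsilon decay (assumed schedule, defaults from the table)."""
    n_decays = step // decay_interval  # number of completed decay intervals
    return max(epsilon_min, epsilon - n_decays * decay_step)


def polyak_update(target, online, tau=0.95):
    """Soft target update, assuming tau weights the old target parameters."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


for step in (0, 999, 1000, 5000, 20000):
    print(step, round(epsilon_at(step), 3))

print(polyak_update([1.0, 2.0], [0.0, 0.0]))
```

Under these assumptions, the default values move epsilon from 1 to its floor of 0.1 after 9,000 steps, while the target network drifts slowly towards the online network.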
Available Optimizers
| Name | Optimizer |
|---|---|
| Adadelta | torch.optim.Adadelta |
| Adagrad | torch.optim.Adagrad |
| Adam | torch.optim.Adam |
| AdamW | torch.optim.AdamW |
| SparseAdam | torch.optim.SparseAdam |
| Adamax | torch.optim.Adamax |
| ASGD | torch.optim.ASGD |
| LBFGS | torch.optim.LBFGS |
| NAdam | torch.optim.NAdam |
| RAdam | torch.optim.RAdam |
| RMSProp | torch.optim.RMSprop |
| Rprop | torch.optim.Rprop |
| SGD | torch.optim.SGD |
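The table above can be read as a lookup from the `--optimizer` name to a PyTorch class. The following sketch shows one way such a lookup could work; the names `OPTIMIZER_PATHS` and `resolve_optimizer` are hypothetical, and the package may resolve names differently. Note that the name `RMSProp` maps to the class `torch.optim.RMSprop`, so the capitalization differs.

```python
import importlib

# Name -> dotted path, copied from the table above.
OPTIMIZER_PATHS = {
    "Adadelta": "torch.optim.Adadelta",
    "Adagrad": "torch.optim.Adagrad",
    "Adam": "torch.optim.Adam",
    "AdamW": "torch.optim.AdamW",
    "SparseAdam": "torch.optim.SparseAdam",
    "Adamax": "torch.optim.Adamax",
    "ASGD": "torch.optim.ASGD",
    "LBFGS": "torch.optim.LBFGS",
    "NAdam": "torch.optim.NAdam",
    "RAdam": "torch.optim.RAdam",
    "RMSProp": "torch.optim.RMSprop",
    "Rprop": "torch.optim.Rprop",
    "SGD": "torch.optim.SGD",
}


def resolve_optimizer(name):
    """Return the optimizer class for a table name (torch is imported lazily)."""
    if name not in OPTIMIZER_PATHS:
        known = ", ".join(sorted(OPTIMIZER_PATHS))
        raise ValueError(f"Unknown optimizer {name!r}; expected one of: {known}")
    module_name, _, attr = OPTIMIZER_PATHS[name].rpartition(".")
    return getattr(importlib.import_module(module_name), attr)
```

With PyTorch installed, `resolve_optimizer("Adam")` returns `torch.optim.Adam`, which can then be instantiated with the model parameters and a learning rate.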
Available Loss Functions
| Name | Loss |
|---|---|
| L1 | torch.nn.functional.l1_loss |
| MSE | torch.nn.functional.mse_loss |
| CrossEntropy | torch.nn.functional.cross_entropy |
| CTC | torch.nn.functional.ctc_loss |
| NLL | torch.nn.functional.nll_loss |
| PoissonNLL | torch.nn.functional.poisson_nll_loss |
| GaussianNLL | torch.nn.functional.gaussian_nll_loss |
| KLDiv | torch.nn.functional.kl_div |
| BCE | torch.nn.functional.binary_cross_entropy |
| Huber | torch.nn.functional.huber_loss |
| SmoothL1 | torch.nn.functional.smooth_l1_loss |
| SoftMargin | torch.nn.functional.soft_margin_loss |
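To illustrate why the default `SmoothL1` loss is a common choice for Q-learning, the sketch below evaluates the per-element Smooth L1 formula in plain Python and compares it with the squared error. It uses the default `beta=1.0` of `torch.nn.functional.smooth_l1_loss`; the helper name `smooth_l1` and the sample TD errors are made up for this example.

```python
def smooth_l1(error, beta=1.0):
    """Per-element Smooth L1: quadratic near zero, linear for large errors."""
    abs_err = abs(error)
    if abs_err < beta:
        return 0.5 * abs_err ** 2 / beta
    return abs_err - 0.5 * beta


# Hypothetical TD errors: small, medium, and an outlier.
td_errors = [0.5, -2.0, 10.0]

for err in td_errors:
    print(err, smooth_l1(err), err ** 2)
# 0.5 0.125 0.25
# -2.0 1.5 4.0
# 10.0 9.5 100.0
```

For the outlier, the squared error is 100 while Smooth L1 is 9.5, so a single large TD error has far less influence on the gradient.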
File details
Details for the file modulardqn-1.0.0.tar.gz.
File metadata
- Download URL: modulardqn-1.0.0.tar.gz
- Upload date:
- Size: 32.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bc04035253634b746feaeec4cba3d4873bdba74bdf6966d04b1f7bc94f814cec |
| MD5 | 65e3cc284d8ff45b87f8d0a29316390c |
| BLAKE2b-256 | 594b2d9205ee93fd215a06996e3cea29f6c60e48c4cb1e538d1504d1b160cc10 |
File details
Details for the file modulardqn-1.0.0-py3-none-any.whl.
File metadata
- Download URL: modulardqn-1.0.0-py3-none-any.whl
- Upload date:
- Size: 38.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 99d2318c987eaacfc04d5437c8be2035880c428365d92011275efa1e604bd7ff |
| MD5 | 7d3ca8237e2169255ed2eb01604837e4 |
| BLAKE2b-256 | 140079a1431921c07de3a3b3f730888a1f616b091f31f05653a0843b71d6d0c5 |
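A downloaded file can be checked against the SHA256 digests listed above. This is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify` are hypothetical, and the files are assumed to keep their original names after download.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "modulardqn-1.0.0.tar.gz":
        "bc04035253634b746feaeec4cba3d4873bdba74bdf6966d04b1f7bc94f814cec",
    "modulardqn-1.0.0-py3-none-any.whl":
        "99d2318c987eaacfc04d5437c8be2035880c428365d92011275efa1e604bd7ff",
}


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the listed one for its name."""
    path = Path(path)
    return sha256_of(path) == EXPECTED_SHA256[path.name]
```

Calling `verify("modulardqn-1.0.0.tar.gz")` in the download directory returns `True` only if the file matches the published digest.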