Unsupervised pre-training with PPG
Project description
Unsupervised On-Policy Reinforcement Learning
This project combines Active Pre-Training (APT) with an on-policy algorithm, Phasic Policy Gradient (PPG).
Active Pre-Training
Active Pre-Training pre-trains a model-free algorithm before a downstream task is defined. It computes an intrinsic reward from a particle-based estimate of the entropy of visited states. This reduces training time when several downstream tasks are defined later, e.g. robots for a warehouse.
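As a rough sketch of the particle-based entropy reward: APT-style methods score each state by its distance to its k nearest neighbours in a batch, so states in sparsely visited regions earn higher intrinsic reward. The function name, the `k` default, and the `log(1 + mean distance)` form are illustrative assumptions, not this package's exact API.

```python
import numpy as np

def apt_intrinsic_reward(states, k=3):
    """Particle-based entropy estimate: reward each state by the mean
    distance to its k nearest neighbours within the batch.

    states: array of shape (batch, dim) of state embeddings.
    """
    # pairwise Euclidean distances between all state embeddings
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # sort each row; column 0 is the distance to the state itself, skip it
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # higher mean k-NN distance -> sparser region -> larger reward
    return np.log(1.0 + knn.mean(axis=1))

rewards = apt_intrinsic_reward(np.random.randn(16, 4))
```

Because the reward depends only on states (not on any task reward), it can drive exploration before a downstream task exists.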
Phasic Policy Gradient
An improved version of Proximal Policy Optimization (PPO) that adds auxiliary epochs to train representations shared between the policy and the value network.
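In PPG's auxiliary phase, the shared representation is trained on a value-fitting loss while a behavioral-cloning KL term keeps the policy close to its snapshot from before the auxiliary epochs. A minimal sketch of that objective, assuming a discrete action space and illustrative names (`ppg_aux_loss`, `beta_clone` follows the paper's clone coefficient):

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def ppg_aux_loss(values, value_targets, logits, old_logits, beta_clone=1.0):
    """Auxiliary-phase objective: fit the value head while a KL penalty
    anchors the policy to its pre-auxiliary-phase snapshot (old_logits)."""
    # value regression on the shared trunk
    value_loss = 0.5 * np.mean((values - value_targets) ** 2)
    # KL(pi_old || pi_new) averaged over the batch
    p_old, p_new = softmax(old_logits), softmax(logits)
    kl = np.sum(p_old * (np.log(p_old) - np.log(p_new)), axis=-1).mean()
    return value_loss + beta_clone * kl
```

When the policy has not moved from its snapshot and the value head fits its targets, the loss is zero; any drift in either term is penalized, which is what lets PPG train the shared layers aggressively during auxiliary epochs without destroying the policy.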
Project details
Download files
Download the file for your platform.
Source Distribution
active-pre-train-ppg-0.0.5.tar.gz (15.6 kB)
Built Distribution
active_pre_train_ppg-0.0.5-py3-none-any.whl
Hashes for active-pre-train-ppg-0.0.5.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 21dcb93030ef49325885d1939edb6e39cc9773349d033a583f6970a534b74f20
MD5 | 1a317eeaf056f3c81ef487fe14712b66
BLAKE2b-256 | 1aa224c7f1c7f7355b5e5c1977c5cc77589a67d95fef994feb768d28c4b5d38d
Hashes for active_pre_train_ppg-0.0.5-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 69b33932bb5d09e1eac2b5aad95679e1c557ee61af5b6db82c130ffdbbc40d0c
MD5 | 69d7be9dd251fb76736bf7fb9032ba8d
BLAKE2b-256 | 629d272c6f2b0f48338668a2696dd501de3b87575063b4725a3e8ac85c05f6f7