Unsupervised pre-training with PPG
Project description
Unsupervised On-Policy Reinforcement Learning
This work combines Active Pre-Training with an on-policy algorithm, Phasic Policy Gradient (PPG).
Active Pre-Training
Active Pre-Training is used to pre-train a model-free agent before a downstream task is defined. It computes an intrinsic reward from a particle-based estimate of the entropy of visited states. This reduces training time when the same pre-trained agent must later solve several tasks, e.g. robots in a warehouse.
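The particle-based entropy reward can be sketched as follows. This is a minimal NumPy illustration of the general idea, not this package's actual API; the function name and the k-nearest-neighbour formulation are assumptions.

```python
import numpy as np

def apt_intrinsic_reward(states, k=3):
    """Hypothetical sketch of APT's intrinsic reward: each state is
    rewarded with log(1 + mean distance to its k nearest neighbours),
    so states in sparsely visited regions of state space score higher."""
    # Pairwise Euclidean distances between state representations.
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Sort each row and skip column 0 (a state's distance to itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    return np.log(1.0 + knn.mean(axis=1))
```

With this shaping, an isolated state far from the rest of the batch receives a larger reward than states inside a dense cluster, which drives the agent toward uniform state coverage.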
Phasic Policy Gradient
An improved version of Proximal Policy Optimization that adds auxiliary epochs to train shared representations between the policy and value networks.
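The auxiliary phase can be sketched roughly as below. This is an illustrative NumPy sketch, not this package's API; the function name and the sample-based KL estimator are assumptions.

```python
import numpy as np

def aux_phase_loss(value_pred, value_targ, logp_new, logp_old, beta_clone=1.0):
    """Hypothetical sketch of PPG's auxiliary objective: fit the value
    head attached to the shared trunk, while a behavioural-cloning KL
    term keeps the policy close to its pre-auxiliary-phase snapshot."""
    # Regression of the shared-trunk value head onto value targets.
    value_loss = 0.5 * np.mean((value_pred - value_targ) ** 2)
    # Sample-based estimate of KL(old || new) from action log-probs.
    kl_clone = np.mean(logp_old - logp_new)
    return value_loss + beta_clone * kl_clone
```

The KL term is what lets the auxiliary epochs distill value information into the shared representation without distorting the policy that the preceding policy phase produced.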
Project details
Download files
Download the file for your platform.
Source Distribution
active-pre-train-ppg-0.0.6.tar.gz (15.7 kB)
Built Distribution
active_pre_train_ppg-0.0.6-py3-none-any.whl
Hashes for active-pre-train-ppg-0.0.6.tar.gz

Algorithm | Hash digest
---|---
SHA256 | d483bff5ff2f95e2a4bc09d7fbdeff2769838fd99db0e3224d40f95cdf79e310
MD5 | e2b831020f1ccedb121dee820f761141
BLAKE2b-256 | f20b2b6dd9b060d0e0834545b988d341cb46d0ce23cad32e9c54535c6056c53c
Hashes for active_pre_train_ppg-0.0.6-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 4641587bc9b71a752cb408cfcb001484dd800726e6cea4e6d09a2efaff9eed03
MD5 | a552813c8d97c6d921a9b7a9fc24c38e
BLAKE2b-256 | 5b1b48f71266a106d6fbd35546c1abed549cbf185be007f944747b617ac7ae0b