Unsupervised pre-training with PPG
Project description
Unsupervised On-Policy Reinforcement Learning
This work combines Active Pre-Training with an on-policy algorithm, Phasic Policy Gradient (PPG).
Active Pre-Training
Active Pre-Training is used to pre-train a model-free agent before a downstream task is defined. It computes an intrinsic reward from a particle-based estimate of the entropy of visited states. This reduces training time when the same pre-trained agent must later solve several tasks, e.g. robots in a warehouse.
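The particle-based entropy reward can be sketched as follows. This is a minimal NumPy illustration of the general idea, not this package's actual API; the function name and the k-nearest-neighbour formulation are assumptions.

```python
import numpy as np

def apt_intrinsic_reward(states, k=3):
    """Hypothetical sketch of APT's intrinsic reward: each state is
    rewarded with log(1 + mean distance to its k nearest neighbours),
    so states in sparsely visited regions of state space score higher."""
    # Pairwise Euclidean distances between state representations.
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Sort each row and skip column 0 (a state's distance to itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    return np.log(1.0 + knn.mean(axis=1))
```

With this shaping, an isolated state far from the rest of the batch receives a larger reward than states inside a dense cluster, which drives the agent toward uniform state coverage.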
Phasic Policy Gradient
An improved version of Proximal Policy Optimization that adds auxiliary epochs to train shared representations between the policy and value networks.
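The auxiliary phase can be sketched roughly as below. This is an illustrative NumPy sketch, not this package's API; the function name and the sample-based KL estimator are assumptions.

```python
import numpy as np

def aux_phase_loss(value_pred, value_targ, logp_new, logp_old, beta_clone=1.0):
    """Hypothetical sketch of PPG's auxiliary objective: fit the value
    head attached to the shared trunk, while a behavioural-cloning KL
    term keeps the policy close to its pre-auxiliary-phase snapshot."""
    # Regression of the shared-trunk value head onto value targets.
    value_loss = 0.5 * np.mean((value_pred - value_targ) ** 2)
    # Sample-based estimate of KL(old || new) from action log-probs.
    kl_clone = np.mean(logp_old - logp_new)
    return value_loss + beta_clone * kl_clone
```

The KL term is what lets the auxiliary epochs distill value information into the shared representation without distorting the policy that the preceding policy phase produced.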
Project details
Download files
Download the file for your platform.
Source Distribution
active-pre-train-ppg-0.0.6.tar.gz (15.7 kB)
Built Distribution
active_pre_train_ppg-0.0.6-py3-none-any.whl
Hashes for active-pre-train-ppg-0.0.6.tar.gz

Algorithm | Hash digest
---|---
SHA256 | d483bff5ff2f95e2a4bc09d7fbdeff2769838fd99db0e3224d40f95cdf79e310
MD5 | e2b831020f1ccedb121dee820f761141
BLAKE2b-256 | f20b2b6dd9b060d0e0834545b988d341cb46d0ce23cad32e9c54535c6056c53c
Hashes for active_pre_train_ppg-0.0.6-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 4641587bc9b71a752cb408cfcb001484dd800726e6cea4e6d09a2efaff9eed03
MD5 | a552813c8d97c6d921a9b7a9fc24c38e
BLAKE2b-256 | 5b1b48f71266a106d6fbd35546c1abed549cbf185be007f944747b617ac7ae0b