A smart way to create a DDQL (double deep Q-learning) agent
Project description
BaseAgent Class
BaseAgent is an abstract class that implements a reinforcement learning agent. It can be extended to create specific agents for different environments.
Constructor
The constructor accepts the following parameters:
- num_actions: The number of possible actions that the agent can take.
- environment: The environment in which the agent operates, represented as a numpy array.
- fit_each_n_steps: The number of steps after which the agent trains its models.
- exploration_rate: The probability that the agent chooses a random action instead of the optimal action.
- exploration_rate_decay: The decay rate of the exploration rate.
- gamma: The discount factor used in the Q-value update equation (see the sketch after this list).
- cumulative_rewards_max_length: The maximum length of the array that keeps track of cumulative rewards.
- memory_max_length: The maximum length of the agent's memory.
- memory_batch_size: The batch size used for training the models.
- allow_episode_tracking: A boolean flag indicating whether the agent should track episodes.
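The exploration_rate implements an epsilon-greedy policy, and gamma enters the Q-value update. The package's exact update rule is not reproduced here, but for a double deep Q-learning agent the target typically takes the standard double DQN form, with theta the online-network weights and theta-minus the target-network weights:

$$
y_t = r_t + \gamma \, Q\big(s_{t+1},\, \arg\max_{a'} Q(s_{t+1}, a'; \theta);\, \theta^-\big)
$$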
Methods
The BaseAgent class implements the following methods:
- start_episode: Starts a new episode.
- stop_episode: Ends the current episode.
- get_episodes: Returns all recorded episodes.
- reset_episodes: Resets all recorded episodes.
- is_memory_ready: Checks whether the agent's memory is ready for training.
- step: Performs a step of the agent, choosing an action, receiving a reward, and updating the models.
- get_last_cumulative_rewards: Returns the sum of the last cumulative rewards.
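Several of these methods combine naturally in a training loop. The sketch below is hypothetical: it assumes a concrete subclass such as the MyAgent defined later on this page, and it infers the call signatures of is_memory_ready and get_last_cumulative_rewards from the descriptions above.

    # `agent` is an instance of a concrete BaseAgent subclass
    agent.start_episode()
    for step in range(1_000):
        agent.step()  # choose an action, receive a reward, train every fit_each_n_steps steps
        if agent.is_memory_ready() and step % 100 == 0:
            # sum of the most recent cumulative rewards
            print(step, agent.get_last_cumulative_rewards())
    agent.stop_episode()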
In addition, the following abstract methods must be implemented in subclasses:
- reset_state: Resets the agent's state at the start of a new episode.
- _get_reward: Calculates the reward received by the agent for undertaking an action in a given state.
- _get_model: Returns the model used by the agent to learn the Q-value function.
Usage Example
To use the class, you need to extend it and implement the abstract methods. Here's an example of how this might be done:
    class MyAgent(BaseAgent):
        def reset_state(self):
            # Reset the agent's state at the start of a new episode
            ...

        def _get_reward(self, action, environment):
            # Compute the reward for taking `action` in `environment`
            ...

        def _get_model(self, state_features):
            # Build and return the model that learns the Q-value function
            ...
Once the subclass is defined, you can create an instance of the agent and use it as follows:
    import numpy as np

    agent = MyAgent(num_actions=4, environment=np.array([0, 0, 0, 0]))
    agent.start_episode()
    for _ in range(100):
        agent.step()
    agent.stop_episode()
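When allow_episode_tracking is enabled, the recorded episodes can be inspected afterwards; a minimal sketch, assuming the method names listed above:

    episodes = agent.get_episodes()  # all recorded episodes (requires allow_episode_tracking=True)
    print(agent.get_last_cumulative_rewards())  # sum of the last cumulative rewards
    agent.reset_episodes()  # clear the recorded episodes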
Notes
BaseAgent is designed to be used with discrete environments and deep learning models. If you wish to use a continuous environment or a different learning model, you may need to make some modifications to the class.
File details
Details for the file ddqla-0.1.25.tar.gz
File metadata
- Download URL: ddqla-0.1.25.tar.gz
- Upload date:
- Size: 123.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest
---|---
SHA256 | c236f789a1b98ab413beaa52a46aa57a2d1e69cf2f49b964b6d8caa9907b5ed1
MD5 | 1929f0bc109092002f39f93cd9479c75
BLAKE2b-256 | fbb7983a648f79fa3aba9ff35bcb4005a5993ea07146fbec239fc394ba131b15
File details
Details for the file ddqla-0.1.25-py3-none-any.whl
File metadata
- Download URL: ddqla-0.1.25-py3-none-any.whl
- Upload date:
- Size: 9.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest
---|---
SHA256 | 245f3d5db38d441297bf6d193ee36cc09ae5057e383f25609159f8ce1a96d065
MD5 | c52fdda3b9ef9f8c0604fa18de58220d
BLAKE2b-256 | f0a695914e45fa5c4420954b0e15619af7cda058a8d51937a5d85f17d799a7f0