Reinforcement learning library to expedite application and research
Ludus is a reinforcement learning library for expediting the development of RL applications and research. Ludus is still in an alpha development stage, so new algorithms and models are continuously being added as the framework grows. You can find many state-of-the-art algorithms implemented and ready to use. Additionally, the platform has ready-to-go integration with popular training environments like OpenAI's gym. The high-level API of Ludus, combined with easily accessible and well-documented open source code, allows for both efficiency and control.
These instructions will help you quickly get Ludus up and running, ready for RL applications. Ludus is built on Python 3, and there is no guarantee it will work with Python 2.
Ludus requires several libraries to get started. Anaconda 3 is recommended as it contains most of the required libraries, as well as many related libraries that may be useful. Nearly all prerequisites are installed by the standard pip installation described below. The exception is TensorFlow. While GPU-enabled TensorFlow is strongly recommended over the CPU version, it is not required. You can find a guide to installing GPU-enabled TensorFlow here, and CPU TensorFlow here.
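Since TensorFlow is not pulled in by the pip installation, it can help to confirm it is present before installing Ludus. The following is a minimal sketch using only the Python standard library; the helper name check_prerequisites is hypothetical and is not part of Ludus.

```python
import importlib.util
import sys


def check_prerequisites():
    """Report whether the interpreter is Python 3 and TensorFlow is importable.

    Hypothetical helper for illustration; not part of the Ludus API.
    """
    return {
        # Ludus is built on Python 3.
        "python3": sys.version_info.major >= 3,
        # find_spec returns None when a top-level package is not installed.
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If TensorFlow is reported as missing, install it separately before running the pip command below.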
The recommended method of installation is with the command,
pip install ludus.
Alternatively, the package can be installed by cloning the repository and running,
python setup.py install in the root directory of the project.
Your First Ludus Agent
To get started, the vpg_cartpole example steps through the creation and training process for a simple agent. It is recommended that you use the notebook as an initial testing ground, and a template for other agents.
In Ludus, the process of creating an intelligent agent can be divided into 3 major steps. Performing the 1st step, and then repeating steps 2 and 3 in a training loop is the typical program flow:
1. Creating input networks: Depending on the type of trainer you use to train your agent, different input neural networks may be required. The simplest case is Vanilla Policy Gradient (VPG / VPGTrainer), which requires only one network that maps observations to actions. Other, more complex methods like Proximal Policy Optimization (PPO / PPOTrainer) require two networks, one choosing actions and another estimating state values. If you are not familiar with these concepts, it is recommended that you look thoroughly through the examples, as they are concise and easy to work with.
2. Environment simulation & data gathering: An EnvController instance is created and used to gather data from the environment. By setting the n_threads argument to an integer greater than 1, you can run multiple environments in parallel. The sim_episodes function is used to simulate the environment and gather data. Because Ludus handles environment simulation and data collection for you, custom environments can easily be integrated into the framework so long as they conform to a specific format (more on this here). Once data has been gathered through an instance of EnvController, it can be retrieved with a get_data() call on the instance.
3. Training: Training one epoch on the data is as simple as calling network.train(ec.get_data()), where network is an instance of a trainer class (for example VPGTrainer or PPOTrainer) and ec is an instance of EnvController. After a get_data() call, the training data memory buffer is reset unless otherwise specified in the function parameters.
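The three steps above can be sketched as a small, self-contained program. This sketch does not import Ludus: ToyEnv, ToyTrainer, and ToyEnvController are hypothetical stand-ins that borrow only the method names mentioned above (sim_episodes, get_data, train), and their argument lists (including the reset flag on get_data) are assumptions rather than the library's actual signatures. The trainer here is a tabular softmax policy with a reward-weighted log-likelihood update, not a neural network.

```python
import math
import random


class ToyEnv:
    """Hypothetical two-state environment: reward 1 when the action equals the observation."""

    def reset(self):
        self.steps = 0
        self.obs = random.randint(0, 1)
        return self.obs

    def step(self, action):
        reward = 1.0 if action == self.obs else 0.0
        self.steps += 1
        done = self.steps >= 10  # fixed-length episodes
        self.obs = random.randint(0, 1)
        return self.obs, reward, done, {}


class ToyTrainer:
    """Stand-in for a trainer: a tabular softmax policy over two actions."""

    def __init__(self, lr=0.5):
        self.lr = lr
        self.logits = [[0.0, 0.0], [0.0, 0.0]]  # logits[obs][action]

    def probs(self, obs):
        exps = [math.exp(v) for v in self.logits[obs]]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, obs):
        return 0 if random.random() < self.probs(obs)[0] else 1

    def train(self, data):
        # One pass over the data: move logits along reward * grad log pi(a | obs).
        for obs, action, reward in data:
            p = self.probs(obs)
            for a in (0, 1):
                grad = (1.0 if a == action else 0.0) - p[a]
                self.logits[obs][a] += self.lr * reward * grad


class ToyEnvController:
    """Stand-in controller that simulates episodes and buffers the collected data."""

    def __init__(self, make_env):
        self.make_env = make_env
        self.buffer = []

    def sim_episodes(self, trainer, n_episodes):
        for _ in range(n_episodes):
            env = self.make_env()
            obs, done = env.reset(), False
            while not done:
                action = trainer.act(obs)
                next_obs, reward, done, _ = env.step(action)
                self.buffer.append((obs, action, reward))
                obs = next_obs

    def get_data(self, reset=True):
        data = list(self.buffer)
        if reset:  # assumed flag mirroring the "buffer is reset" behaviour
            self.buffer.clear()
        return data


def run(n_epochs=50, seed=0):
    random.seed(seed)
    trainer = ToyTrainer()                      # step 1: create the policy
    ec = ToyEnvController(ToyEnv)
    for _ in range(n_epochs):
        ec.sim_episodes(trainer, n_episodes=4)  # step 2: gather data
        trainer.train(ec.get_data())            # step 3: train one epoch
    return trainer, ec


if __name__ == "__main__":
    trainer, _ = run()
    print(trainer.probs(0), trainer.probs(1))
```

For the real API, including how networks are built and passed to a trainer, the vpg_cartpole example remains the reference.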
Using a Custom Game Environment
While this feature is supported and easy to implement, the documentation is not yet complete. For the time being, it is recommended that you examine OpenAI's gym. An environment with the same step and initialization functions (the same input arguments and return values) will work with the Ludus framework.
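As an illustration, here is a minimal sketch of an environment that follows the classic gym call pattern (gym releases before 0.26), where reset() returns an observation and step(action) returns an (observation, reward, done, info) tuple. The class GuessParityEnv is hypothetical, and it is an assumption that this interface is everything Ludus needs; the framework may also expect attributes such as observation and action spaces.

```python
import random


class GuessParityEnv:
    """Hypothetical custom environment using the classic gym call pattern.

    The observation is an integer from 0 to 9; the agent earns a reward of 1
    when its action (0 or 1) matches the parity of the observation.
    """

    def __init__(self, max_steps=10, seed=None):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0
        self.state = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        self.state = self.rng.randint(0, 9)
        return self.state

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        reward = 1.0 if action == self.state % 2 else 0.0
        self.steps += 1
        done = self.steps >= self.max_steps
        self.state = self.rng.randint(0, 9)
        return self.state, reward, done, {}


if __name__ == "__main__":
    env = GuessParityEnv(max_steps=5, seed=0)
    obs, done = env.reset(), False
    while not done:
        # A perfect policy: answer with the parity of the observation.
        obs, reward, done, info = env.step(obs % 2)
        print(obs, reward, done)
```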
- numpy - Efficient mathematical operations
- opencv2 - Image manipulation in 2D environments
- TensorFlow - Creating and training neural networks
- gym - Environments for training agents
Further, more in-depth documentation is in the works, though it is not quite ready.
- Edan Meyer (ejmejm) is currently the lead and only developer for the project.
This project is licensed under the MIT License. Please see the attached license file for more details.