Skip to main content

Python package that helps to quickly implement MCTS to solve reinforcement learning problems.

Project description

mcts-simple

mcts-simple is a Python3 library that allows reinforcement learning problems to be solved easily with its implementations of Monte Carlo Tree Search.

Monte Carlo Tree Search (MCTS)

MCTS attempts to identify the most promising moves at each state by choosing random actions at that state for every episode (playouts/rollouts). The final game result of each episode is then used to determine the weight of all nodes traversed during that episode so that the probability of choosing an action that yields higher current and potential rewards is increased.

There are 4 stages to the MCTS:

  1. Selection
    • Traverse through the search tree from the root node to a leaf node, while only selecting the most promising child nodes. Leaf node in this case refers to a node that has not yet gone through the expansion stage, rather than its traditional definition which is "a node without child nodes".
  2. Expansion
    • If the leaf node does not lead to an outcome to the episode (e.g. win/lose/draw), create at least one child node for that leaf node and choose one child node from those created.
  3. Simulation
    • Complete one episode starting from the chosen child node, where random actions are chosen for future states. An episode is only completed when an outcome can be yielded from it.
  4. Backpropagation
    • The outcome yielded from the simulated episode in stage 3 should be used to update information in traversed nodes.

Note:

  • We assume that states are unique.
  • Root node's score is almost never evaluated, and at most only the number of visits "n" is used.

Upper Confidence bounds applied to Trees (UCT)

UCT, a variation of MCTS, is often used instead of vanilla MCTS for a few reasons, mainly:

  1. MCTS emphasizes entirely on exploitation. On the other hand, UCT is able to balance exploration and exploitation.
  2. MCTS may favour a losing move despite the presence of one or few forced refutations. UCT attempts to deal with this limitation of the original MCTS.

UCT uses the UCB1 formula to evaluate actions at each state. The exploration parameter c in the UCB1 formula is theoretically equal to sqrt(2), but it can be changed to fit your needs.

How to use mcts-simple

mcts-simple only supports python 3.7 and above.

Dependencies

mcts-simple requires the following libraries:

  • json-pickle
  • tqdm

User installation

In cmd,

pip install mcts-simple

In your python file,

from mcts_simple import *

Creating your own game environment

For the progress bar to work best, use Jupyter Notebook or another platform that supports carriage return "/r".

Create a class for your game by inheriting the Game class from mcts-simple, and define the following methods for your class:

Method What it does
__init__(self) Initialises the object.
render(self) Returns a visual representation of the current state of the game.
get_state(self) Returns current state of the game.
Note:
  1. Provide a hashable state.
  2. Ensure that the state provided during the game does not coincide with the state provided at the start of the game
  3. Best to include the player that is taking an action this turn within the state.
number_of_players(self) Returns number of players.
current_player(self) Returns the player that is taking an action this turn.
possible_actions(self) Returns the actions that can be taken this turn.
take_action(self, action) Player takes action. It is best to check if action is in possible actions (see source code). Action should be string type to support the play_with_human() method from MCTS. Note that even if action leads to the game ending, next player should still be chosen.
delete_last_action(self) Last action is removed. Current state is reverted back to previous state.
has_outcome(self) Returns True if game has ended. Returns False if game is still ongoing.
winner(self) Returns None if game is a draw. Returns the winner if one of the players won. It is best to check if outcome is defined.

After creating your environment, you're basically done! You can train and export your MCTS with just 3 lines of code (assuming your game environment class is named "YourGame":

mcts = MCTS(YourGame())
mcts.run(iterations = 50000)
mcts._export("mcts.json")

You can import your trained MCTS, with another 3 lines of code:

mcts = MCTS(TicTacToe())
mcts._import("mcts.json")
mcts.self_play(activation = "best")

If you have any issues in creating your environment, you can browse the source code or check out the examples provided here.

Contributions

I appreciate if you are able to contribute to this project, since currently I am the only one maintaining this module. This is also the first public Python package that I have written, so if you think that something is wrong with my code, you can open an issue and I'll try my best to resolve it!

There are also other variants of MCTS, so feel free to give some pointers to how they should be implemented.

To Do

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mcts-simple-0.0.3.tar.gz (8.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mcts_simple-0.0.3-py3-none-any.whl (9.4 kB view details)

Uploaded Python 3

File details

Details for the file mcts-simple-0.0.3.tar.gz.

File metadata

  • Download URL: mcts-simple-0.0.3.tar.gz
  • Upload date:
  • Size: 8.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.26.0 requests-toolbelt/0.9.1 urllib3/1.26.7 tqdm/4.46.1 importlib-metadata/4.8.2 keyring/23.5.0 rfc3986/1.4.0 colorama/0.4.3 CPython/3.7.6

File hashes

Hashes for mcts-simple-0.0.3.tar.gz
Algorithm Hash digest
SHA256 f201893dda28630b653f0e1d180bb6cf70e5ce0f7d852be0e870335c92587d5e
MD5 c883aabd3ab671aa0f2298f7a44c4d15
BLAKE2b-256 3c5cf7eb59a40c2ec727b73a847bc552debd54452fd0fc4a6494a00cb587d0d6

See more details on using hashes here.

File details

Details for the file mcts_simple-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: mcts_simple-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 9.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.26.0 requests-toolbelt/0.9.1 urllib3/1.26.7 tqdm/4.46.1 importlib-metadata/4.8.2 keyring/23.5.0 rfc3986/1.4.0 colorama/0.4.3 CPython/3.7.6

File hashes

Hashes for mcts_simple-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 9c7f6c0dec7a6dbbaa6bb1b9cd0a8ddd1dee3b9b751d735c095829bb072977e2
MD5 0e2780c7d674d1950e39a0e95f849f7e
BLAKE2b-256 cb6b3335730469ea36e38a4f8bc34cd4ba8d91399a39dc172d8963fcd0c06b37

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page