Generate crosswords using Monte Carlo Tree Search (MCTS)
MCTS Crossword Generator
This package provides a pure Python implementation for generating crosswords using Monte Carlo Tree Search (MCTS).
- A good overview of the project can be found in this blog post.
- The pip package can be found on PyPI
Quickstart
A: Install package:
- Create and activate a virtual environment based on Python >= 3.8
- Install crossword_generator package:
pip install crossword_generator
B: Generate crossword with default settings:
You can generate a crossword without providing any arguments. This will fill a 4x5 layout without any black squares using words from an English dictionary.
To do so, activate your virtual environment and choose one of the following (equivalent) options:
- Call application directly:
crossword
- Execute package:
python -m crossword_generator
- Run the main function in a Python shell or your own script:
>>> from crossword_generator import generate_crossword
>>> generate_crossword()
For the next examples I assume you are using the first option to interact with the package.
Examples
- To get started and see which input formats are required, you can download some English (comma-separated) or German (semicolon-separated) sample data.
- Let's assume you have downloaded the sample files into a directory called "crossword_input" inside your working directory.
A: Use your own layouts
- In order to use your own layouts, you will need to set the argument path_to_layout to a CSV file on your local machine.
  - The CSV file must have an index column and a header row
  - Potential letters are marked with "_" (underscore)
  - Black squares are marked with "" (empty)
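As an illustration of the layout rules above, here is a minimal sketch that builds such a CSV using only the standard library. The helper name build_layout_csv and the numeric row and column labels are assumptions made for this example; the sample files may label the index column and header row differently.

```python
import csv
import io


def build_layout_csv(rows, cols, black_squares=()):
    """Return CSV text for a rows x cols layout.

    black_squares is an iterable of (row, col) positions that stay empty.
    """
    black = set(black_squares)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: empty corner cell for the index column, then column labels.
    # Assumption: plain numeric labels are acceptable.
    writer.writerow([""] + list(range(cols)))
    for r in range(rows):
        # "_" marks a potential letter, "" marks a black square.
        cells = ["" if (r, c) in black else "_" for c in range(cols)]
        writer.writerow([r] + cells)
    return buffer.getvalue()


layout_text = build_layout_csv(3, 4, black_squares=[(0, 0), (2, 3)])
print(layout_text)
```

Writing layout_text to a file, for example "crossword_input/my_layout.csv", should give a file that can be passed via path_to_layout, provided the label assumption holds.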
Fill an empty 5x12 layout:
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv"
Fill a prefilled 5x12 layout:
crossword --path_to_layout "crossword_input/layout_5_12_prefilled.csv"
Fill an entire NYT-style 15x15 layout:
crossword --path_to_layout "crossword_input/layout_15_15_empty.csv"
Of course, you can also provide arguments from within your code:
from crossword_generator import generate_crossword

generate_crossword(
    path_to_layout="crossword_input/layout_15_15_empty.csv",
)
B: Use your own words
- In order to use your own set of words, you will need to set the argument path_to_words to a CSV file (or pattern of CSV files) on your local machine.
  - The CSV file(s) must contain a column named "answer" with the relevant words
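A word list in this shape can be produced with a few lines of standard-library Python. This is a minimal sketch: the helper name build_words_csv and the example words are illustrative, and it assumes a comma-separated file with no columns other than "answer".

```python
import csv
import io


def build_words_csv(words):
    """Return CSV text with a single "answer" column, one word per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["answer"], lineterminator="\n")
    writer.writeheader()
    for word in words:
        writer.writerow({"answer": word})
    return buffer.getvalue()


words_text = build_words_csv(["apple", "stone", "river"])
print(words_text)
```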
Fill an empty 5x12 layout with words from a list:
crossword --path_to_layout "crossword_input/layout_5_12_empty.csv" --path_to_words "crossword_input/sample_words.csv"
Again, you can do the same from within your code:
from crossword_generator import generate_crossword

generate_crossword(
    path_to_layout="crossword_input/layout_5_12_empty.csv",
    path_to_words="crossword_input/sample_words.csv",
)
C: Other arguments you might want to play with:
- num_rows & num_cols [int]
  - number of rows / columns the layout should have
  - will only be considered if path_to_layout is not specified
- max_num_words [int]
  - limits the number of words to improve runtime
- max_mcts_iterations [int]
  - sets the maximum number of MCTS iterations
  - can be increased to get a better solution or decreased to improve runtime
- random_seed [int]
  - change the seed to obtain different filled crosswords
- output_path [str]
  - if provided, save the final grid and a summary as CSV files into the provided directory
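These arguments can presumably be combined in a single call from Python, in the same way as path_to_layout in the earlier examples. The following is a minimal sketch, assuming generate_crossword() accepts each argument as a keyword under the name listed above. The values and the helper run_with_settings are illustrative only.

```python
# Illustrative values only; the keyword names are taken from the list above.
SETTINGS = {
    "num_rows": 4,
    "num_cols": 6,
    "max_num_words": 5000,
    "max_mcts_iterations": 2000,
    "random_seed": 1,
    "output_path": "crossword_output",
}


def run_with_settings(**overrides):
    """Hypothetical helper: merge overrides into SETTINGS and run the generator."""
    # Imported lazily so this module loads even without the package installed.
    from crossword_generator import generate_crossword

    settings = {**SETTINGS, **overrides}
    # Assumption: every command-line argument is also a keyword argument.
    return generate_crossword(**settings)
```

Under these assumptions, run_with_settings(random_seed=2) would fill the same 4x6 layout with a different seed and therefore a different set of words.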
Modules
- optimizer.py
  - Script that contains the main function generate_crossword()
- layout_handler.py
  - Provides the layout that will later be filled with words
  - NewLayoutHandler: creates a new layout from scratch
  - ExistingLayoutHandler: reads an existing layout from a CSV file
- word_handler.py
  - Provides the words that will later be filled into the layout
  - DictionaryWordHandler: gets words from the NLTK corpus
  - FileWordHandler: reads words from CSV files
- state.py
  - Entry: class that represents the current state of one entry of the crossword
  - CrosswordState: class that represents the current state of the whole crossword
- tree_search.py
  - TreeNode: represents one node of the MCTS tree
  - MCTS: represents the whole MCTS tree and provides all necessary functionalities such as
    - Selection
    - Expansion
    - Simulation / Rollout
    - Backpropagation
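To make the four MCTS phases concrete, here is a generic single-player sketch on a toy problem (choose five bits, reward is the fraction of ones). It is not the code in tree_search.py: the class Node, the helper functions, the UCT formula with constant c, and the toy reward are all assumptions chosen for illustration.

```python
import math
import random

DEPTH = 5  # toy problem: pick 5 bits, reward = fraction of ones


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of bits chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.state) == DEPTH

    def untried_actions(self):
        return [a for a in (0, 1) if a not in self.children]


def reward(state):
    return sum(state) / DEPTH


def uct(parent, child, c):
    # Mean reward plus an exploration bonus for rarely visited children.
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def select(node, c):
    # Selection: descend while the node is fully expanded and not terminal.
    while not node.is_terminal() and not node.untried_actions():
        parent = node
        node = max(parent.children.values(), key=lambda ch: uct(parent, ch, c))
    return node


def expand(node, rng):
    # Expansion: add one untried child (terminal nodes are returned as-is).
    if node.is_terminal():
        return node
    action = rng.choice(node.untried_actions())
    child = Node(node.state + (action,), parent=node)
    node.children[action] = child
    return child


def rollout(state, rng):
    # Simulation: complete the state with random actions and score it.
    state = list(state)
    while len(state) < DEPTH:
        state.append(rng.choice((0, 1)))
    return reward(state)


def backpropagate(node, value):
    # Backpropagation: update statistics from the leaf up to the root.
    while node is not None:
        node.visits += 1
        node.total_reward += value
        node = node.parent


def search(iterations=500, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        leaf = expand(select(root, c), rng)
        backpropagate(leaf, rollout(leaf.state, rng))
    return root


root = search()
best_action = max(root.children, key=lambda a: root.children[a].visits)
print("most visited first action:", best_action)
```

In a crossword setting, a state would correspond to a partially filled grid and an action to placing a word, but how the package encodes these is not shown here.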
References & Dependencies
- The MCTS implementation in tree_search.py is based on the algorithm provided by pbsinclair42, which I adapted in several ways:
  - Convert from 2-player to 1-player domain
  - Adjust reward function + exploration term
  - Add additional methods to analyze the game tree
  - Use PEP 8 code style
- Have a look at pyproject.toml for a list of all required and optional dependencies
  - Python >= 3.8
  - Required packages
    - nltk>=3.5
    - pandas>=1.4.0
    - numpy>=1.22.0
    - tqdm>=4.41.0
Hashes for crossword-generator-0.2.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 01b1a91c73c120e8f1f5b2688a3c3c96fe5d21bdbcfaa6505a3d01340482252f
MD5 | cadc2d11949c3e7da33e1fa60cb1c5a0
BLAKE2b-256 | b79b3b488d068a9c77158495bce6660f15a0690ecae757cf39ed47b86d042774

Hashes for crossword_generator-0.2.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 141d5151dae0600540da001b1f712554db694b42a93247491775b6d67782dcac
MD5 | f91fe1fc87fab2359d9a429d925a7154
BLAKE2b-256 | a6fbc149732d10497b38e261a434f4f9a8129878f63220992274522b42974492