Generate crosswords using Monte Carlo Tree Search (MCTS)
MCTS Crossword Generator
This package provides a pure Python implementation for generating crosswords using Monte Carlo Tree Search (MCTS).
- A good overview of the project can be found in this blog post.
- The pip package can be found on PyPI.
Quickstart
A: Install package:
- Create and activate a virtual environment based on Python >= 3.8
- Install crossword_generator package:
pip install crossword_generator
B: Generate crossword with default settings:
You can generate a crossword without providing any arguments. This will fill a 4x5 layout without any black squares using words from an English dictionary.
To do so, activate your virtual environment and choose one of the following (equivalent) options:
- Call application directly:
crossword
- Execute package:
python -m crossword_generator
- Run the main function in a Python shell or your own script:
>>> from crossword_generator import generate_crossword
>>> generate_crossword()
For the next examples I assume you are using the first option to interact with the package.
Examples
- To get started and see which input formats are required, you can download some sample data from this directory.
- Let's assume you have downloaded all files into a directory called "crossword_input".
A: Use your own layouts
- In order to use your own layouts, you will need to set argument path_to_layout to a CSV file on your local machine.
  - the CSV file must have an index column and a header row
  - potential letters are marked with "_" (underscore)
  - black squares are marked with "" (empty)
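The format rules above can be sketched with the standard library. This is only an illustration: the helper name write_layout, the file name layout_3_4.csv, and the use of running numbers as header and index labels are assumptions, not something the package prescribes. Compare the result against the sample layouts before relying on it.

```python
import csv
import os
import tempfile


def write_layout(path, rows):
    """Write a grid of "_" (letter) and "" (black square) cells as a layout CSV."""
    num_cols = len(rows[0])
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Header row: empty corner cell followed by column labels (assumed: 0, 1, 2, ...)
        writer.writerow([""] + list(range(num_cols)))
        for i, row in enumerate(rows):
            # Index column (assumed: the row number) followed by the cells of that row
            writer.writerow([i] + list(row))


# A 3x4 layout with two black squares in opposite corners
layout = [
    ["_", "_", "_", ""],
    ["_", "_", "_", "_"],
    ["", "_", "_", "_"],
]

layout_path = os.path.join(tempfile.gettempdir(), "layout_3_4.csv")
write_layout(layout_path, layout)
```

The resulting file path can then be passed as path_to_layout.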
Fill an empty 5x12 layout:
crossword --path_to_layout "crossword_input/layout_5_12.csv"
Fill a partially filled 5x12 layout:
crossword --path_to_layout "crossword_input/layout_5_12_partially_filled.csv"
Fill an entire NYT-style 15x15 layout:
crossword --path_to_layout "crossword_input/layout_15_15.csv"
Of course, you can also provide arguments from within your code:
from crossword_generator import generate_crossword
generate_crossword(
    path_to_layout="crossword_input/layout_15_15.csv",
)
B: Use your own words
- In order to use your own set of words, you will need to set argument path_to_words to a CSV file (or pattern of CSV files) on your local machine.
  - the CSV file(s) must contain a column named "answer" with the relevant words
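A word file that meets this requirement could be produced as follows. This is a minimal sketch: the word list and the file name my_words.csv are made up for illustration, and whether the package expects a particular letter case or additional columns is not stated here, so check the sample file words_example.csv.

```python
import csv
import os
import tempfile

# Illustrative words only; replace them with your own list
words = ["apple", "river", "stone", "eagle"]

words_path = os.path.join(tempfile.gettempdir(), "my_words.csv")
with open(words_path, "w", newline="") as f:
    # The only documented requirement is a column named "answer"
    writer = csv.DictWriter(f, fieldnames=["answer"])
    writer.writeheader()
    for word in words:
        writer.writerow({"answer": word})
```

The resulting file path can then be passed as path_to_words.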
Fill an empty 5x12 layout with words from a list:
crossword --path_to_layout "crossword_input/layout_5_12.csv" --path_to_words "crossword_input/words_example.csv"
Again, you can do the same from within your code:
from crossword_generator import generate_crossword
generate_crossword(
    path_to_layout="crossword_input/layout_5_12.csv",
    path_to_words="crossword_input/words_example.csv",
)
C: Other arguments you might want to play with:
- num_rows & num_cols [int]
  - number of rows / columns the layout should have
  - will only be considered if path_to_layout is not specified
- max_num_words [int]
  - limits the number of words to improve runtime
- max_mcts_iterations [int]
  - sets the maximum number of MCTS iterations
  - can be increased to get a better solution or decreased to improve runtime
- random_seed [int]
  - change the seed to obtain different filled crosswords
- output_path [str]
  - if provided, save the final grid and a summary as CSV files into the provided directory
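As a rough sketch, these arguments could be combined in a single call from Python. The keyword names are taken from the list above. All values, the directory name crossword_output, and the helper run_with_settings are illustrative assumptions, and it is assumed that every listed argument is accepted as a keyword argument of generate_crossword.

```python
# Keyword names come from the list above; the values are illustrative assumptions.
settings = {
    "num_rows": 5,               # only considered because path_to_layout is not set
    "num_cols": 6,
    "max_num_words": 10_000,     # fewer words: faster, but fewer options for the search
    "max_mcts_iterations": 500,  # more iterations: slower, but possibly a better fill
    "random_seed": 42,           # change to obtain a different filled crossword
    "output_path": "crossword_output",  # hypothetical output directory
}


def run_with_settings(settings):
    """Hypothetical helper: call generate_crossword with the given keyword arguments."""
    # Imported lazily so the settings can be inspected without the package installed
    from crossword_generator import generate_crossword

    return generate_crossword(**settings)
```

Calling run_with_settings(settings) in an environment where the package is installed should start the generation with these settings.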
Modules
- optimizer.py
  - script that contains the main function generate_crossword()
- layout_handler.py
  - Provides the layout that will later be filled with words
  - NewLayoutHandler: creates a new layout from scratch
  - ExistingLayoutHandler: reads an existing layout from a CSV file
- word_handler.py
  - Provides the words that will later be filled into the layout
  - DictionaryWordHandler: gets words from an NLTK corpus
  - FileWordHandler: reads words from CSV files
- state.py
  - Entry: class that represents the current state of one entry of the crossword
  - CrosswordState: class that represents the current state of the whole crossword
- tree_search.py
  - TreeNode: represents one node of the MCTS tree
  - MCTS: represents the whole MCTS tree and provides all necessary functionalities such as
    - Selection
    - Expansion
    - Simulation / Rollout
    - Backpropagation
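To make the four MCTS phases concrete, here is a generic single-player sketch on a toy problem (choose 4 bits, where the reward is the share of ones). It is not the code from tree_search.py: the names Node, select, expand, simulate, backpropagate and search, the UCB1 selection rule, and the toy reward are all assumptions made for illustration.

```python
import math
import random

DEPTH = 4  # toy problem: pick DEPTH bits one at a time


class Node:
    def __init__(self, state, parent=None):
        self.state = state        # tuple of bits chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.state) == DEPTH

    def untried_actions(self):
        if self.is_terminal():
            return []
        return [a for a in (0, 1) if a not in self.children]


def reward(state):
    return sum(state) / DEPTH  # share of ones, between 0 and 1


def select(node, c):
    # Selection: descend via UCB1 while the node is fully expanded and not terminal
    while not node.is_terminal() and not node.untried_actions():
        node = max(
            node.children.values(),
            key=lambda ch: ch.total_reward / ch.visits
            + c * math.sqrt(math.log(node.visits) / ch.visits),
        )
    return node


def expand(node, rng):
    # Expansion: add one untried child (a terminal node is returned unchanged)
    actions = node.untried_actions()
    if not actions:
        return node
    action = rng.choice(actions)
    child = Node(node.state + (action,), parent=node)
    node.children[action] = child
    return child


def simulate(state, rng):
    # Simulation / rollout: complete the state with random actions
    while len(state) < DEPTH:
        state = state + (rng.choice((0, 1)),)
    return reward(state)


def backpropagate(node, value):
    # Backpropagation: update statistics from the new node up to the root
    while node is not None:
        node.visits += 1
        node.total_reward += value
        node = node.parent


def search(iterations=400, c=1.0, seed=0):
    rng = random.Random(seed)
    root = Node(state=())
    for _ in range(iterations):
        leaf = select(root, c)
        child = expand(leaf, rng)
        value = simulate(child.state, rng)
        backpropagate(child, value)
    # Recommend the most visited first action
    best_action = max(root.children, key=lambda a: root.children[a].visits)
    return best_action, root
```

In a crossword setting, a state would correspond to a partially filled grid and an action to placing a word, but how the package models this is defined in state.py and tree_search.py, not in this sketch.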
References & Dependencies
- The MCTS implementation in tree_search.py is based on the algorithm provided by pbsinclair42, which I adapted in several ways:
  - Convert from 2-player to 1-player domain
  - Adjust reward function + exploration term
  - Add additional methods to analyze the game tree
  - Use PEP 8 code style
- Have a look at pyproject.toml for a list of all required and optional dependencies
  - Python >= 3.8
  - Required packages
    - nltk>=3.5
    - pandas>=1.4.0
    - numpy>=1.22.0
    - tqdm>=4.41.0
Hashes for crossword-generator-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 035c105d8ddb23e1f5869bae451de94422ed28cc3f1c2acc9f277f8f15c30462
MD5 | 08dfc191e84a2769a0f90049201d8120
BLAKE2b-256 | fddf0e35b86b230b2d75c01203cc25a082b3966e92e738116b4ed7059b01e950

Hashes for crossword_generator-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 26b70ad9f59c48bcff6b661fb9d67fb69221cd209069f79036b2c10d04ee5d78
MD5 | 34a5d549736397ea1402b17e57b041c3
BLAKE2b-256 | 3806de49b0f2d3f588a91f844e2e2b1bd70e9ac05c232061e720eeba95a64a4e