Wordle AI with SQL Backend

These details have not been verified by PyPI

Project description

WORDLE AI with SQL Backend

This package provides a Wordle solver with an SQL backend.

How to use

# Install this library via PyPI
pip install wordleaisql
# Then run the executable that comes with the library
wordleai-sql

# Alternatively, clone this repository and run without pip-install
python wordleai-sql.py

Solver session example

$ wordleai-sql

Hi, this is Wordle AI (SQLite backend, approx).

12947 remaining candidates: ['cigar', 'rebut', 'sissy', 'humph', 'awake', 'blush', 'focal', 'evade', 'naval', 'serve', '...']

Type:
  '[s]uggest <criterion>'     to let AI suggest a word (<criterion> is optional)
  '[u]pdate <word> <result>'  to provide new information
  '[e]xit'                    to finish the session

where
  <criterion>  is either 'max_n', 'mean_n', or 'mean_entropy'
  <result>     is a string of 0 (no match), 1 (partial match), and 2 (exact match)

> s
[INFO] Start AI evaluation (2022-03-09 00:37:13)
[INFO] End AI evaluation (2022-03-09 00:37:18, elapsed: 0:00:04.153101)
* Top 20 candidates ordered by mean_entropy
--------------------------------------------------------------------
  input_word         max_n        mean_n  mean_entropy  is_candidate
--------------------------------------------------------------------
       reais            30          12.0         3.094             1
       laers            33          13.5         3.218             1
       aeons            35          14.2         3.312             1
       races            32          14.4         3.323             1
       leads            34          15.1         3.349             1
       strae            33          14.8         3.376             1
       lines            43          16.4         3.386             1
       soral            35          15.6         3.427             1
       cries            48          17.4         3.429             1
       scrae            34          16.2         3.471             1
       rules            42          17.2         3.478             1
       oared            41          17.9         3.511             1
       losen            52          17.9         3.515             1
       sedan            40          17.6         3.516             1
       sured            52          19.1         3.546             1
       artis            45          19.4         3.547             1
       least            42          18.7         3.549             1
       stire            46          18.5         3.552             1
       stria            49          19.3         3.556             1
       nails            55          18.7         3.557             1
--------------------------------------------------------------------
12947 remaining candidates: ['cigar', 'rebut', 'sissy', 'humph', 'awake', 'blush', 'focal', 'evade', 'naval', 'serve', '...']

Type:
  '[s]uggest <criterion>'     to let AI suggest a word (<criterion> is optional)
  '[u]pdate <word> <result>'  to provide new information
  '[e]xit'                    to finish the session

where
  <criterion>  is either 'max_n', 'mean_n', or 'mean_entropy'
  <result>     is a string of 0 (no match), 1 (partial match), and 2 (exact match)

> u races 00000
896 remaining candidates: ['humph', 'outdo', 'digit', 'pound', 'booby', 'loopy', 'lying', 'moult', 'guild', 'thumb', '...']

Type:
  '[s]uggest <criterion>'     to let AI suggest a word (<criterion> is optional)
  '[u]pdate <word> <result>'  to provide new information
  '[e]xit'                    to finish the session

where
  <criterion>  is either 'max_n', 'mean_n', or 'mean_entropy'
  <result>     is a string of 0 (no match), 1 (partial match), and 2 (exact match)

> s
[INFO] Start AI evaluation (2022-03-09 00:37:35)
[INFO] End AI evaluation (2022-03-09 00:37:39, elapsed: 0:00:03.439437)
* Top 20 candidates ordered by mean_entropy
--------------------------------------------------------------------
  input_word         max_n        mean_n  mean_entropy  is_candidate
--------------------------------------------------------------------
       monty            41          16.8         3.454             1
       gipon            66          20.5         3.546             1
       lofty            53          20.0         3.686             1
       bilgy            70          24.2         3.746             1
       bundt            69          23.2         3.779             1
       limbo            69          23.6         3.780             1
       bundy            63          23.5         3.782             1
       found            56          23.7         3.816             1
       youth            50          22.6         3.827             1
       joint            65          23.9         3.895             1
       downy            61          25.5         3.902             1
       milko            78          27.5         3.924             1
       fungo            86          29.6         3.926             1
       lumbi            77          29.1         3.976             1
       tupik            68          28.0         3.981             1
       goopy            76          28.3         4.012             1
       jolty            59          24.3         4.015             1
       muhly            65          28.1         4.034             1
       nouny            59          25.0         4.041             1
       touzy            49          25.0         4.066             1
--------------------------------------------------------------------
896 remaining candidates: ['humph', 'outdo', 'digit', 'pound', 'booby', 'loopy', 'lying', 'moult', 'guild', 'thumb', '...']

Type:
  '[s]uggest <criterion>'     to let AI suggest a word (<criterion> is optional)
  '[u]pdate <word> <result>'  to provide new information
  '[e]xit'                    to finish the session

where
  <criterion>  is either 'max_n', 'mean_n', or 'mean_entropy'
  <result>     is a string of 0 (no match), 1 (partial match), and 2 (exact match)

> u monty 22220
'month' should be the answer!
Thank you!
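
The result strings in the session above follow the 0/1/2 encoding from the help text. As a rough illustration (not the package's actual code), the sketch below computes such a string and filters the candidates the way an update step might. The function names `judge` and `update` are hypothetical, and the handling of repeated letters assumes the standard Wordle rules.

```python
from collections import Counter


def judge(input_word, answer_word):
    """Return the result string: 0 = no match, 1 = partial match, 2 = exact match."""
    result = ["0"] * len(input_word)
    # Answer letters that are not exactly matched stay available
    # for partial matches (assumed standard Wordle behavior).
    unmatched = Counter()
    for i, (a, b) in enumerate(zip(input_word, answer_word)):
        if a == b:
            result[i] = "2"
        else:
            unmatched[b] += 1
    for i, a in enumerate(input_word):
        if result[i] == "0" and unmatched[a] > 0:
            result[i] = "1"
            unmatched[a] -= 1
    return "".join(result)


def update(candidates, input_word, result):
    """Keep only the candidates that would have produced `result` for `input_word`."""
    return [w for w in candidates if judge(input_word, w) == result]


candidates = ["month", "humph", "outdo", "digit", "pound"]
print(update(candidates, "monty", "22220"))  # ['month']
```

Under these assumptions, the only listed candidate consistent with `monty 22220` is `month`, which matches the final line of the session.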

Suggestion criteria

Input words are evaluated by the following three criteria:

"max_n": Maximum number of the candidate words that would remain.
"mean_n": Average number of the candidate words that would remain.
"mean_entropy": Average of the log2 of the number of candidate words that would remain.

Note that if there are n candidate words with equal probability, the probability of each word i is p_i = 1/n. The entropy is then -sum(p_i log2(p_i)) = -n * (1/n) log2(1/n) = log2(n). Hence, the average of log2(n) can be regarded as the average entropy.

"mean_entropy" is often used in practice and is therefore the program's default. "max_n" can be seen as a pessimistic criterion since it responds to the worst case. "mean_n" may seem an intuitive criterion, but it does not work as well as "mean_entropy", perhaps because of the skewed distribution.

See also the simulation results for a comparison of the criteria (notebook at simulation/simulation-summary.ipynb or view on nbviewer).
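
The three criteria can be sketched in a few lines. This is a minimal illustration, assuming every remaining candidate is equally likely to be the answer; the names `judge` and `evaluate` are hypothetical and are not taken from the package.

```python
import math
from collections import Counter


def judge(input_word, answer_word):
    """Result string of 0/1/2, assuming the standard Wordle rules."""
    result = ["0"] * len(input_word)
    unmatched = Counter()
    for i, (a, b) in enumerate(zip(input_word, answer_word)):
        if a == b:
            result[i] = "2"
        else:
            unmatched[b] += 1
    for i, a in enumerate(input_word):
        if result[i] == "0" and unmatched[a] > 0:
            result[i] = "1"
            unmatched[a] -= 1
    return "".join(result)


def evaluate(input_word, candidates):
    """Return (max_n, mean_n, mean_entropy) for one input word."""
    # Group the candidate answers by the result they would produce.
    groups = Counter(judge(input_word, answer) for answer in candidates)
    total = sum(groups.values())
    max_n = max(groups.values())
    # An answer in a group of size n leaves n candidates, and n of the
    # `total` answers fall into that group, hence the weight n / total.
    mean_n = sum(n * n for n in groups.values()) / total
    mean_entropy = sum(n * math.log2(n) for n in groups.values()) / total
    return max_n, mean_n, mean_entropy


candidates = ["month", "humph", "outdo", "digit", "pound"]
# Rank the input words by mean_entropy (smaller is better).
for word in sorted(candidates, key=lambda w: evaluate(w, candidates)[2]):
    print(word, evaluate(word, candidates))
```

A word that splits the candidates into many small groups gets low values on all three criteria, which is why the tables above are sorted in ascending order.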

Play and challenge mode

By default, the wordleai-sql command starts an interactive solver session.
wordleai-sql --play starts a self-play game.
wordleai-sql --challenge starts a competition against the AI.
In play and challenge modes, the answer word is chosen in accordance with the answer weights by default. Set the --no_answer_weight option to allow every word to be chosen as the answer.
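
As a rough sketch of what weighted answer selection could look like, the snippet below draws an answer in proportion to its weight, or uniformly when the weights are ignored. The function `choose_answer` and its parameters are hypothetical; how the package actually implements --no_answer_weight is not shown in this document.

```python
import random


def choose_answer(words, weights, use_weight=True, rng=random):
    """Pick an answer word, optionally in proportion to its weight."""
    if use_weight:
        # Words with weight zero are never chosen here.
        return rng.choices(words, weights=weights, k=1)[0]
    # Ignoring the weights makes every word equally likely.
    return rng.choice(words)


words = ["month", "touzy"]
weights = [1.0, 0.0]
print(choose_answer(words, weights))                    # always 'month'
print(choose_answer(words, weights, use_weight=False))  # either word
```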

Using a custom word set

The default word list is at wordleaisql/vocab/wordle-vocab.txt. The list is probably compatible with the New York Times version.
A custom word list can be used by specifying the --vocabfile=<file path> option.
- The file should contain words of the same length, one per line (separated by "\n").
- Each line may also contain a nonnegative numeric value after the word, separated by a space. It is used as the relative probability that the word is chosen as the answer (in play and challenge modes). If not specified, the word is given a weight of one.
- The file can be gzip-compressed, in which case the filename must end with ".gz".
- Although not tested thoroughly, the program should work with words containing multibyte characters (with utf8 encoding) or digits.
By default, the file name without its extension is used as the vocabname. This can be changed with --vocabname.
See the vocab-examples/ folder for some examples.

# Example
wordleai-sql --vocabname myvocab --vocabfile my-vocab.txt
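
To make the file format concrete, here is a small reader that follows the rules listed above. It is only a sketch for checking a custom word list before use; `read_vocab` is a hypothetical helper, not a function of the package.

```python
import gzip


def read_vocab(path):
    """Read a vocab file and return a dict mapping each word to its weight."""
    # Files ending with ".gz" are treated as gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    vocab = {}
    with opener(path, "rt", encoding="utf8") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            word = fields[0]
            # A missing weight defaults to one.
            weight = float(fields[1]) if len(fields) > 1 else 1.0
            if weight < 0:
                raise ValueError(f"negative weight for {word!r}")
            vocab[word] = weight
    lengths = {len(w) for w in vocab}
    if len(lengths) > 1:
        raise ValueError(f"words must share one length, found {sorted(lengths)}")
    return vocab
```

For a file containing the two lines `month 2` and `humph`, this returns `{"month": 2.0, "humph": 1.0}`.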

Backend options

SQLite with approximate evaluation (default)

wordleai-sql -b approx

With the -b approx option, words are evaluated approximately by sampling input and/or answer words.
The database setup completes quickly since it does not require precomputation of the judge results.
Evaluation also completes quickly since only a small number of input and/or answer words are involved in the calculation.
Although approximate, the engine tends to provide close-to-optimal suggestions thanks to the law of large numbers.
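
The idea of sampling can be sketched as follows. This is an illustration under stated assumptions, not the package's algorithm: the function names, the default sample sizes, and the fixed seed are all hypothetical. Scores computed on a sample of answers are smaller than the full-data values, so only the ranking of input words is meant to be approximated.

```python
import math
import random
from collections import Counter


def judge(input_word, answer_word):
    """Result string of 0/1/2, assuming the standard Wordle rules."""
    result = ["0"] * len(input_word)
    unmatched = Counter()
    for i, (a, b) in enumerate(zip(input_word, answer_word)):
        if a == b:
            result[i] = "2"
        else:
            unmatched[b] += 1
    for i, a in enumerate(input_word):
        if result[i] == "0" and unmatched[a] > 0:
            result[i] = "1"
            unmatched[a] -= 1
    return "".join(result)


def mean_entropy(input_word, answers):
    """Average log2 of the group size over the given answers."""
    groups = Counter(judge(input_word, a) for a in answers)
    return sum(n * math.log2(n) for n in groups.values()) / len(answers)


def approx_suggest(input_words, candidates, n_inputs=500, n_answers=500, seed=0):
    """Score a random sample of input words against a random sample of answers."""
    rng = random.Random(seed)
    inputs = rng.sample(input_words, min(n_inputs, len(input_words)))
    answers = rng.sample(candidates, min(n_answers, len(candidates)))
    scores = {w: mean_entropy(w, answers) for w in inputs}
    # Smaller mean_entropy first, as in the suggestion tables.
    return sorted(scores.items(), key=lambda kv: kv[1])


words = ["month", "humph", "outdo", "digit", "pound"]
print(approx_suggest(words, words, n_inputs=3, n_answers=4))
```

The cost is proportional to `n_inputs * n_answers` rather than to the square of the vocabulary size, which is the reason both setup and evaluation stay fast.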

SQLite with full evaluation

wordleai-sql -b sqlite

This engine evaluates all words against all answer candidates.
To speed up the calculation, the engine precomputes the judge results for all word pairs during setup.
- The file size becomes about 8.4GB.
- The process may take about an hour, depending on the CPU speed.
- The setup time is significantly reduced if a C++ compiler command (e.g., g++ or clang++) is available.
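
To illustrate why precomputation helps, the sketch below stores the judge result of every word pair in an SQLite table and then derives the three criteria with a single aggregate query. The table name `judges`, its columns, and the query are assumptions made for this example; the package's real schema may differ. If all 12947 words from the session above are paired with each other, the table has 12947 × 12947 = 167,624,809 rows, which is consistent with a file of several gigabytes.

```python
import math
import sqlite3
from collections import Counter


def judge(input_word, answer_word):
    """Result string of 0/1/2, assuming the standard Wordle rules."""
    result = ["0"] * len(input_word)
    unmatched = Counter()
    for i, (a, b) in enumerate(zip(input_word, answer_word)):
        if a == b:
            result[i] = "2"
        else:
            unmatched[b] += 1
    for i, a in enumerate(input_word):
        if result[i] == "0" and unmatched[a] > 0:
            result[i] = "1"
            unmatched[a] -= 1
    return "".join(result)


words = ["month", "humph", "outdo", "digit", "pound"]

conn = sqlite3.connect(":memory:")
# Not every SQLite build ships log2, so register it from Python.
conn.create_function("log2", 1, math.log2)
conn.execute("CREATE TABLE judges (input_word TEXT, answer_word TEXT, result TEXT)")
# Precompute the result for every (input, answer) pair.
conn.executemany(
    "INSERT INTO judges VALUES (?, ?, ?)",
    [(i, a, judge(i, a)) for i in words for a in words],
)

query = """
WITH counts AS (
    SELECT input_word, result, COUNT(*) AS n
    FROM judges
    GROUP BY input_word, result
)
SELECT
    input_word,
    MAX(n) AS max_n,
    1.0 * SUM(n * n) / SUM(n) AS mean_n,
    SUM(n * log2(n)) / SUM(n) AS mean_entropy
FROM counts
GROUP BY input_word
ORDER BY mean_entropy, input_word
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Once the table exists, narrowing the candidates only requires adding a filter on `answer_word` to the inner query, so no judge result has to be recomputed during a session.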

Google BigQuery backend

# --vocabname is used as the dataset name
wordleai-sql -bbq --bq_credential "gcp_credentials.json" --vocabname "wordle_dataset"

With the -bbq option, Google BigQuery is used as the backend SQL engine.

A credential JSON file must be supplied for a GCP service account with the following permissions:

bigquery.datasets.create
bigquery.datasets.get
bigquery.jobs.create
bigquery.jobs.get
bigquery.routines.create
bigquery.routines.delete
bigquery.routines.get
bigquery.routines.update
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
bigquery.tables.update
bigquery.tables.updateData

Other options

See wordleai-sql -h for other options, which should mostly be self-explanatory.

GUI application

A browser application built on Streamlit is at streamlit/app.py. It can be run with the following commands:
```
# install dependencies if not already installed
pip install pandas streamlit
streamlit run ./streamlit/app.py
```
The app is also deployed on Streamlit Cloud.

Release history

0.2.10 (this version): Mar 17, 2022
0.2.9: Mar 17, 2022
0.2.8: Mar 15, 2022
0.2.7: Mar 15, 2022
0.2.6: Mar 15, 2022
0.2.5: Mar 15, 2022
0.2.4: Mar 15, 2022
0.2.3: Mar 15, 2022
0.2.2: Mar 15, 2022
0.2.1: Mar 8, 2022
0.1.12: Mar 8, 2022
0.1.11: Mar 7, 2022
0.1.10: Mar 6, 2022
0.1.9: Mar 6, 2022
0.1.8: Mar 6, 2022
0.1.7: Mar 4, 2022
0.1.6: Mar 4, 2022
0.1.5: Mar 4, 2022
0.1.4: Mar 4, 2022
0.0.4: Feb 23, 2022
0.0.3: Feb 23, 2022
0.0.2: Feb 23, 2022
0.0.1: Feb 23, 2022

Download files

Source Distribution

wordleaisql-0.2.10.tar.gz (67.5 kB)

Uploaded Mar 17, 2022 Source

Built Distribution

wordleaisql-0.2.10-py3-none-any.whl (70.5 kB)

Uploaded Mar 17, 2022 Python 3

Hashes for wordleaisql-0.2.10.tar.gz

Algorithm	Hash digest
SHA256	`16f778038c4e0edb66bb9def7098379f40522d92ee668e626e5e0932d5e198da`
MD5	`3da145a5508470e6de6b575e414b8388`
BLAKE2b-256	`462f9b47c65cad9d56b81b4cb1eff18f2f89f3136f55aafb8a7d3ab43c5c3689`

Hashes for wordleaisql-0.2.10-py3-none-any.whl

Algorithm	Hash digest
SHA256	`1b972684e99f7e6a8d43950022f741d44d37010c1458f3808377320f2c3cc8c4`
MD5	`c5eb3f860af348c927cc044a330add06`
BLAKE2b-256	`5cc36288c0f1871ac2cd6152dac607f16d8b9eace37c2f4782e36e42aa4e58ad`