This program first converts a screenshot of a sudoku grid into plain text. Afterwards, it will solve the sudoku puzzle, display the result and save the solution as a .csv file.

Project description

sudoku-solver

A python program that takes in an image of a sudoku puzzle and outputs its answer.

Installation

This library uses tesseract-ocr for OCR.

Installation of python library and tesseract

Note: this installation mainly focuses on UNIX based systems. I have absolutely no idea how this setup would go on Windows machines.

First download the package.

pip install sudoku-solver-ocr

On UNIX systems, install tesseract through your system's package manager.

On debian based systems:

sudo apt-get install tesseract-ocr

On Arch based systems:

sudo pacman -Sy tesseract

Verify that it's installed:

tesseract --version

Configuring tessdata

Find the tessdata directory, for example with find piped into fzf (fzf can be installed with apt install fzf or your distribution's equivalent):

find / -type d -name tessdata 2>/dev/null | fzf

You may get a result like this:

/usr/share/tessdata

or generally:

/path/to/tessdata

Set the environment variable TESSDATA_PREFIX by appending an export line to ~/.bashrc:

echo 'export TESSDATA_PREFIX=/path/to/tessdata' >> ~/.bashrc

Source the ~/.bashrc file

source ~/.bashrc

Verify by printing the environment variable.

echo $TESSDATA_PREFIX

Should result in something like this:

/path/to/tessdata

in my case:

/usr/share/tessdata

Change directory to $TESSDATA_PREFIX

cd $TESSDATA_PREFIX

Check that eng.traineddata is in the directory using the ls command.

If it isn't in the directory, download the file and move it to the $TESSDATA_PREFIX directory.
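
The same check can be scripted. Below is a minimal sketch using only the Python standard library; find_traineddata is a hypothetical helper written for this example and is not part of sudoku_solver_ocr.

```python
import os
from pathlib import Path


def find_traineddata(lang="eng", prefix=None):
    """Return the path to <lang>.traineddata, or None if it cannot be found.

    Hypothetical helper: `prefix` defaults to the TESSDATA_PREFIX
    environment variable configured in the steps above.
    """
    prefix = prefix or os.environ.get("TESSDATA_PREFIX")
    if not prefix:
        return None
    candidate = Path(prefix) / f"{lang}.traineddata"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_traineddata()
    print(found if found else "eng.traineddata not found; check TESSDATA_PREFIX")
```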

Using it as a CLI program

Note: The computer vision algorithm to detect and process the sudoku grid isn't very sophisticated. In short this is the algorithm:

  1. load the image
  2. crop the image by detecting external borders (no perspective transform)
  3. split the cells by detecting contours of the cell in the grid (very error prone)
  4. feed the cells one by one into tesseract-ocr (very slow)
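
To make steps 2 and 3 concrete, here is a simplified, dependency-free sketch. It is not the library's code. It assumes the image is already binarized into nested lists (1 = ink, 0 = background), crops to the bounding box of the ink, and then slices the result into equal cells instead of detecting contours. The names crop_to_ink and split_evenly are made up for this example.

```python
def crop_to_ink(img):
    """Crop a binary image (list of rows, 1 = ink) to the bounding box of its ink."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        raise ValueError("image contains no ink")
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def split_evenly(img, n=9):
    """Split an image into an n x n 2d list of equally sized cell images.

    Assumption: the grid fills the whole image, so plain slicing
    stands in for the contour detection described above.
    """
    height, width = len(img), len(img[0])
    cells = []
    for i in range(n):
        y0, y1 = i * height // n, (i + 1) * height // n
        row_cells = []
        for j in range(n):
            x0, x1 = j * width // n, (j + 1) * width // n
            row_cells.append([row[x0:x1] for row in img[y0:y1]])
        cells.append(row_cells)
    return cells
```

Even slicing only works when the crop is tight and the grid is not skewed, which is one more reason screenshots behave better than camera photos.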

As such, it's best not to use it on photos of sudoku puzzles taken with a camera. Furthermore, from testing, these sudoku sites work best with the program (from best to worst):

  1. websudoku
  2. printable sudoku
  3. sudoku web
  4. nytimes sudoku
  5. sudoku.com

From testing, a moderate image resolution works best: neither too high nor too low.

First take a screenshot of a sudoku board from an online website as such:

screenshot of websudoku

Now run the following command:

python3 -m sudoku_solver_ocr ./path/to/image /path/to/tesseract 
  • ./path/to/image is the path to the image, relative to your current working directory.
  • /path/to/tesseract is the path to the tesseract executable itself. You can find it by running which tesseract.
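
If you'd rather not type the tesseract path by hand, the lookup can be scripted. This is a small sketch using only the standard library; build_command is a hypothetical helper, and the only thing taken from this README is the command-line shape shown above.

```python
import shutil
import sys


def build_command(image_path, tesseract_name="tesseract"):
    """Return the argv list for the CLI, or None if tesseract is not found.

    shutil.which performs the same PATH lookup as `which tesseract`.
    """
    tesseract_path = shutil.which(tesseract_name)
    if tesseract_path is None:
        return None
    return [sys.executable, "-m", "sudoku_solver_ocr", image_path, tesseract_path]
```

The returned list can be passed to subprocess.run once the package is installed.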

I've stored it in web_sudoku.png in my current working directory and my path to tesseract is /sbin/tesseract. Therefore, I'll run:

python3 -m sudoku_solver_ocr ./web_sudoku.png /sbin/tesseract

If no error arises, a sudoku board should appear on your screen.

Loading image...
Cropping image...
Splitting image...
Using OCR...
  0   1   2   3   4   5   6   7   8   
╔═══╤═══╤═══╦═══╤═══╤═══╦═══╤═══╤═══╗
║   ┃   ┃   ║   ┃ 4 ┃ 6 ║   ┃ 2 ┃   ║0
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃ 6 ┃ 1 ║ 3 ┃   ┃   ║   ┃ 7 ┃   ║1
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃   ┃ 7 ║   ┃   ┃   ║   ┃   ┃ 3 ║2
╠═══╪═══╪═══╫═══╪═══╪═══╫═══╪═══╪═══╣
║   ┃   ┃   ║   ┃ 6 ┃ 9 ║   ┃ 3 ┃   ║3
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃   ┃ 3 ║   ┃   ┃   ║ 4 ┃   ┃   ║4
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃ 5 ┃   ║ 2 ┃ 1 ┃   ║   ┃   ┃   ║5
╠═══╪═══╪═══╫═══╪═══╪═══╫═══╪═══╪═══╣
║ 5 ┃   ┃   ║   ┃   ┃   ║ 2 ┃   ┃   ║6
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃ 3 ┃   ║   ┃   ┃ 2 ║ 9 ┃ 1 ┃   ║7
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║   ┃ 2 ┃   ║ 8 ┃ 9 ┃   ║   ┃   ┃   ║8
╚═══╧═══╧═══╩═══╧═══╧═══╩═══╧═══╧═══╝

Please recheck your sudoku board:
- help
- ok
- replace <x> <y> <n>
input a command: 

As suggested, if the board is incorrect, this interface lets us replace a particular cell given an x, a y and the number n to insert. The commands are as follows:

  • help simply prints some information about sudoku
  • replace <x> <y> <n> places a number n which can be from 0 to 9 (0 to place an empty cell) at coordinates (x, y).
  • ok command breaks out of the loop and feeds the following board into a sudoku solver.
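
For illustration, the three commands could be handled roughly like this. This is a sketch, not the library's double_check implementation. It assumes that x is the column index and y is the row index, both from 0 to 8 as printed around the board; apply_command is a made-up name.

```python
def apply_command(grid, line):
    """Apply one command to `grid` in place.

    Returns 'ok', 'help', 'replaced' or 'error'.
    Assumption: x is the column and y is the row, so the cell is grid[y][x].
    """
    parts = line.split()
    if parts == ["ok"]:
        return "ok"
    if parts == ["help"]:
        return "help"
    if len(parts) == 4 and parts[0] == "replace":
        try:
            x, y, n = (int(p) for p in parts[1:])
        except ValueError:
            return "error"
        if 0 <= x <= 8 and 0 <= y <= 8 and 0 <= n <= 9:
            grid[y][x] = n  # n == 0 clears the cell
            return "replaced"
    return "error"
```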

Once ok is entered, we can either solve it while visualizing it or simply solve it and print the solution (animating it intentionally slows the algorithm down).

✅ Final board accepted.
Animate?(y/n) 

I'll simply type n.

If all goes well, we should be greeted with a solved sudoku board.

  0   1   2   3   4   5   6   7   8   
╔═══╤═══╤═══╦═══╤═══╤═══╦═══╤═══╤═══╗
║ 3 ┃ 8 ┃ 5 ║ 7 ┃ 4 ┃ 6 ║ 1 ┃ 2 ┃ 9 ║0
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 9 ┃ 6 ┃ 1 ║ 3 ┃ 2 ┃ 5 ║ 8 ┃ 7 ┃ 4 ║1
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 2 ┃ 4 ┃ 7 ║ 9 ┃ 8 ┃ 1 ║ 6 ┃ 5 ┃ 3 ║2
╠═══╪═══╪═══╫═══╪═══╪═══╫═══╪═══╪═══╣
║ 8 ┃ 7 ┃ 2 ║ 4 ┃ 6 ┃ 9 ║ 5 ┃ 3 ┃ 1 ║3
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 6 ┃ 1 ┃ 3 ║ 5 ┃ 7 ┃ 8 ║ 4 ┃ 9 ┃ 2 ║4
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 4 ┃ 5 ┃ 9 ║ 2 ┃ 1 ┃ 3 ║ 7 ┃ 8 ┃ 6 ║5
╠═══╪═══╪═══╫═══╪═══╪═══╫═══╪═══╪═══╣
║ 5 ┃ 9 ┃ 8 ║ 1 ┃ 3 ┃ 4 ║ 2 ┃ 6 ┃ 7 ║6
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 7 ┃ 3 ┃ 4 ║ 6 ┃ 5 ┃ 2 ║ 9 ┃ 1 ┃ 8 ║7
╟━━━┿━━━┿━━━╫━━━┿━━━┿━━━╫━━━┿━━━┿━━━╢
║ 1 ┃ 2 ┃ 6 ║ 8 ┃ 9 ┃ 7 ║ 3 ┃ 4 ┃ 5 ║8
╚═══╧═══╧═══╩═══╧═══╧═══╩═══╧═══╧═══╝

Save solution as CSV file?(y/n) 

We can save the solution as a .csv file. For instance if I want to save it in sudoku_solution.csv:

Save solution as CSV file?(y/n) y
Specify path (solution 1/1): sudoku_solution.csv

Now we can check that the solution is saved in sudoku_solution.csv.

CSV data as plain text:

3,8,5,7,4,6,1,2,9
9,6,1,3,2,5,8,7,4
2,4,7,9,8,1,6,5,3
8,7,2,4,6,9,5,3,1
6,1,3,5,7,8,4,9,2
4,5,9,2,1,3,7,8,6
5,9,8,1,3,4,2,6,7
7,3,4,6,5,2,9,1,8
1,2,6,8,9,7,3,4,5
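
A file in this format is easy to read back with Python's csv module. The sketch below is independent of the library's own read_sudoku_csv and write_sudoku_csv, whose exact signatures are not shown here; write_grid_csv and read_grid_csv are stand-in names.

```python
import csv


def write_grid_csv(grid, path):
    """Write a 9x9 2d list of ints to `path`, one comma-separated row per line."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(grid)


def read_grid_csv(path):
    """Read a grid written by write_grid_csv back into a 2d list of ints."""
    with open(path, newline="") as f:
        return [[int(cell) for cell in row] for row in csv.reader(f) if row]
```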

Using it as a library

Once it's installed, simply import the library:

import sudoku_solver_ocr as sso

This is an overview of all the functions. More details can be found by reading the source code (it's not very long).

A sudoku grid is represented as a 9x9 2d list of integers.

Note: these functions all live under sudoku_solver_ocr when importing. I've divided them up here only because the functions live in 3 different files.

sudoku_utils.py

This file provides useful utility functions used throughout the program.

function | details
read_sudoku_csv | reads a sudoku grid from a .csv file and returns its 2d list representation
write_sudoku_csv | writes a 2d list representation of a sudoku puzzle to a .csv file
print_board | prints the 2d list representation of the sudoku puzzle to stdout in a nice sudoku board format
show_help | prints basic info about sudoku to stdout
color_text | wraps a string with an ANSI code to give that string a certain color
double_check | runs an interactive user interface to edit the given sudoku board
format_grid | converts the 2d list of int into a 2d list of str for printing to stdout (replaces every 0 with ' ', applies coloring, etc.)
get_non_empty_coord | returns the set of non-empty (non-0) cells, each as a (y, x) tuple

This file also contains a dictionary called ANSI_CODE which maps a color to an ANSI code. This is useful for coloring strings.
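
As a rough illustration of how ANSI coloring and the (y, x) coordinate set fit together, here is a standalone sketch. The dictionary contents and the names ANSI, colorize and non_empty_coords are assumptions made for this example. They are not the library's ANSI_CODE, color_text or get_non_empty_coord.

```python
# Standard ANSI escape sequences; the library's ANSI_CODE may use other keys.
ANSI = {"red": "\033[31m", "green": "\033[32m", "reset": "\033[0m"}


def colorize(text, color):
    """Wrap `text` in the escape code for `color`, then reset the terminal color."""
    return f"{ANSI[color]}{text}{ANSI['reset']}"


def non_empty_coords(grid):
    """Return the set of (y, x) tuples for every non-zero cell in the grid."""
    return {(y, x) for y, row in enumerate(grid) for x, n in enumerate(row) if n != 0}
```

One plausible use is to record the given cells before solving, then color them differently from the solved cells when printing.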

sudoku_ocr.py

This file provides tools used to convert sudoku images to plain text.

function | details
load_and_prepare_image | loads the sudoku image, converts it to grayscale and binarizes it
crop_image | detects the external border of the sudoku grid, crops to it and returns the cropped image
split_grid | splits the sudoku grid image into 81 cell images, each containing a number or nothing, stored in a 2d list
image_to_num_grid | runs OCR on the output of split_grid (a 2d list of images); the path to the tesseract executable is also an argument

sudoku_solver.py

This file provides all the functions used to solve a sudoku puzzle.

function | details
is_possible | checks whether a number n can be placed at a particular cell of coordinates (x, y)
solve_sudoku | collects the solutions of a sudoku puzzle in a list (in case the puzzle isn't well designed or you've made a mistake while editing the cells)
solve_sudoku_single_solution | is faster than solve_sudoku but assumes that there's only one solution
solve_sudoku_animated | solves the sudoku puzzle while printing the state of the board to stdout
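
For readers curious how such a solver can work, below is a minimal backtracking sketch on the same 9x9 list-of-ints representation. It is written for this README and is not the library's code: can_place and solve_all are stand-in names, and the argument order (grid, y, x, n) is an assumption.

```python
def can_place(grid, y, x, n):
    """Return True if n does not already appear in row y, column x or their 3x3 box."""
    if any(grid[y][c] == n for c in range(9)):
        return False
    if any(grid[r][x] == n for r in range(9)):
        return False
    by, bx = 3 * (y // 3), 3 * (x // 3)
    return all(grid[r][c] != n for r in range(by, by + 3) for c in range(bx, bx + 3))


def solve_all(grid, limit=None):
    """Collect solutions of `grid` (0 = empty cell) by plain backtracking.

    Stops early once `limit` solutions are found. The grid is restored
    to its original state before returning.
    """
    solutions = []

    def backtrack():
        for y in range(9):
            for x in range(9):
                if grid[y][x] == 0:
                    for n in range(1, 10):
                        if can_place(grid, y, x, n):
                            grid[y][x] = n
                            backtrack()
                            grid[y][x] = 0  # undo the move before trying the next n
                            if limit is not None and len(solutions) >= limit:
                                return
                    return  # no candidate fits this cell: dead end
        solutions.append([row[:] for row in grid])  # no empty cell left

    backtrack()
    return solutions
```

Calling solve_all(grid, limit=1) stops at the first solution, which mirrors the trade-off between collecting every solution and assuming there is only one.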
