Simple MNIST and EMNIST data parser written in pure Python

MNIST is a database of handwritten digits available at http://yann.lecun.com/exdb/mnist/. EMNIST is an extended version of the MNIST database, described at https://www.nist.gov/itl/iad/image-group/emnist-dataset.

Requirements

  • Python 2 or Python 3

Usage

  • git clone https://github.com/sorki/python-mnist

  • cd python-mnist

  • Get MNIST data:

    ./bin/mnist_get_data.sh
  • Check preview with:

    PYTHONPATH=. ./bin/mnist_preview

Installation

Get the package from PyPI:

pip install python-mnist

or install with setup.py:

python setup.py install

Code sample:

from mnist import MNIST
mndata = MNIST('./dir_with_mnist_data_files')
images, labels = mndata.load_training()

To enable loading of gzipped files, set:

mndata.gz = True
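
A flag like gz usually just selects which function opens the file. The following is a minimal sketch of that idea and not the library's actual implementation. The helper name open_data_file and the convention of appending .gz to the file name are assumptions made for illustration.

```python
import gzip


def open_data_file(path, gz=False):
    """Open a data file for binary reading, optionally through gzip.

    Hypothetical helper: assumes compressed files carry a '.gz' suffix.
    """
    if gz:
        return gzip.open(path + '.gz', 'rb')
    return open(path, 'rb')
```

Both branches return a file-like object that yields raw bytes, so the parsing code that follows does not need to know whether the file was compressed.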

The library tries to load files named t10k-images-idx3-ubyte, train-labels-idx1-ubyte, train-images-idx3-ubyte, and t10k-labels-idx1-ubyte. If loading throws an exception, check that the file names in your data directory match these.
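
A quick way to check the names before loading is to list which of the four files are absent from the data directory. This is a small sketch using only the standard library. The function name missing_mnist_files is hypothetical and not part of the package.

```python
import os

# File names quoted in the paragraph above.
EXPECTED_FILES = (
    'train-images-idx3-ubyte',
    'train-labels-idx1-ubyte',
    't10k-images-idx3-ubyte',
    't10k-labels-idx1-ubyte',
)


def missing_mnist_files(data_dir):
    """Return the expected file names that are not present in data_dir."""
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]
```

An empty list means all four uncompressed files are in place. A non-empty list shows exactly which names to fix.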

EMNIST

  • Get EMNIST data:

    ./bin/emnist_get_data.sh
  • Check preview with:

    PYTHONPATH=. ./bin/emnist_preview

To use EMNIST datasets you need to call:

mndata.select_emnist('digits')

Here digits is one of the available EMNIST datasets. You can choose from:

  • balanced

  • byclass

  • bymerge

  • digits

  • letters

  • mnist

The EMNIST loader uses gzipped files by default. This can be disabled by setting:

mndata.gz = False

In that case you also need to unpack the EMNIST files yourself, as the bin/emnist_get_data.sh script won’t do it for you. The EMNIST loader also needs to mirror and rotate images, so it is a bit slower. If this is an issue for you, repack the data to avoid mirroring and rotation on each load.
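
For a square image, a left-right mirror followed by a 90-degree counter-clockwise rotation is the same as a transpose. The sketch below shows that per-image step on a flat, row-major pixel list, which is the kind of work that repacking would do once instead of on every load. It assumes this is the combination the data needs; the function name transpose_image is hypothetical and this is not the library's own code.

```python
def transpose_image(pixels, size=28):
    """Return the transpose of a square image stored as a flat, row-major list.

    Assumption: mirror + rotate is applied as left-right flip followed by a
    90-degree counter-clockwise rotation, which equals a transpose.
    """
    return [pixels[col * size + row]
            for row in range(size)
            for col in range(size)]
```

Applying the function twice returns the original image, which makes it easy to sanity-check a repacked file against its source.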

Notes

This package doesn’t use numpy by design: when I tried to find a working implementation, all of them were based on some archaic version of numpy and none of them worked. This package loads data files with struct.unpack instead.
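
To illustrate the struct.unpack approach, here is a minimal sketch of parsing the IDX layout used by the MNIST files: a big-endian header of 32-bit integers followed by unsigned bytes. The function names parse_labels and parse_images are hypothetical and the code is an independent sketch, not the package's implementation.

```python
import struct

IMAGES_MAGIC = 2051  # 0x00000803: unsigned bytes, 3 dimensions
LABELS_MAGIC = 2049  # 0x00000801: unsigned bytes, 1 dimension


def parse_labels(data):
    """Parse the contents of an IDX label file into a list of ints."""
    magic, count = struct.unpack('>II', data[:8])
    if magic != LABELS_MAGIC:
        raise ValueError('unexpected magic number: %d' % magic)
    return list(struct.unpack('>%dB' % count, data[8:8 + count]))


def parse_images(data):
    """Parse the contents of an IDX image file into flat pixel lists."""
    magic, count, rows, cols = struct.unpack('>IIII', data[:16])
    if magic != IMAGES_MAGIC:
        raise ValueError('unexpected magic number: %d' % magic)
    size = rows * cols
    pixels = struct.unpack('>%dB' % (count * size),
                           data[16:16 + count * size])
    # One flat, row-major list of rows * cols values per image.
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]
```

Both functions take the raw bytes of a file, so they work the same way whether the bytes came from a plain or a gzipped file.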

Example

$ PYTHONPATH=. ./bin/mnist_preview
Showing num: 3

............................
............................
............................
............................
............................
............................
.............@@@@@..........
..........@@@@@@@@@@........
.......@@@@@@......@@.......
.......@@@........@@@.......
.................@@.........
................@@@.........
...............@@@@@........
.............@@@............
.............@.......@......
.....................@......
.....................@@.....
....................@@......
...................@@@......
.................@@@@.......
................@@@@........
....@........@@@@@..........
....@@@@@@@@@@@@............
......@@@@@@................
............................
............................
............................
............................
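
A preview like the one above can be produced from a flat list of pixel values by thresholding each pixel. The following is a small sketch of that idea. The function name render and the default threshold of 200 are assumptions for illustration, not necessarily what bin/mnist_preview does.

```python
def render(pixels, width=28, threshold=200):
    """Render a flat, row-major list of 0-255 pixel values as text.

    Pixels brighter than the threshold become '@', all others '.'.
    """
    lines = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        lines.append(''.join('@' if value > threshold else '.'
                             for value in row))
    return '\n'.join(lines)
```

Lowering the threshold makes faint strokes visible at the cost of thicker, noisier shapes.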
