Simple MNIST and EMNIST data parser written in pure Python

Project Description

MNIST is a database of handwritten digits available at http://yann.lecun.com/exdb/mnist/. EMNIST is an extension of MNIST, available at https://www.nist.gov/itl/iad/image-group/emnist-dataset.

Requirements

  • Python 2 or Python 3

Usage

  • git clone https://github.com/sorki/python-mnist

  • cd python-mnist

  • Get MNIST data:

    ./get_data.sh
    
  • Check preview with:

    PYTHONPATH=. ./bin/mnist_preview
    

Installation

Get the package from PyPI:

pip install python-mnist

or install with setup.py:

python setup.py install

Code sample:

from mnist import MNIST
mndata = MNIST('./dir_with_mnist_data_files')
images, labels = mndata.load_training()
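
The following is a minimal sketch of how one loaded image might be inspected. It assumes each entry of images is a flat, row-major list of pixel values from 0 to 255 (784 values for a 28x28 digit). The helpers to_rows and render_ascii are hypothetical names used for illustration and are not part of the package; a small synthetic image stands in for real data so the snippet runs without the data files.

```python
def to_rows(flat_image, width=28):
    """Split a flat, row-major pixel list into rows of `width` pixels."""
    if len(flat_image) % width != 0:
        raise ValueError("image length is not a multiple of the width")
    return [flat_image[i:i + width] for i in range(0, len(flat_image), width)]


def render_ascii(flat_image, width=28, threshold=128):
    """Draw pixels at or above `threshold` as '#', everything else as '.'."""
    return "\n".join(
        "".join("#" if pixel >= threshold else "." for pixel in row)
        for row in to_rows(flat_image, width)
    )


# Synthetic stand-in for one entry of `images` (assumption: flat list of ints).
# This is a 4x4 image with a bright main diagonal.
sample = [255 if i % 5 == 0 else 0 for i in range(16)]
print(render_ascii(sample, width=4))
```

With real data, the same call would be render_ascii(images[0]), using the default width of 28.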

To enable loading of gzipped files, set:

mndata.gz = True
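
As a rough illustration of what such a switch can mean, the sketch below reads the same bytes from a plain file and from a gzipped copy using only the standard library. The helper read_bytes, the file name, and the ".gz" suffix convention are assumptions made for this example; they are not taken from the package's source.

```python
import gzip
import os
import tempfile


def read_bytes(path, gz=False):
    """Return the raw file contents, decompressing with gzip when gz is True."""
    opener = gzip.open if gz else open
    with opener(path, "rb") as handle:
        return handle.read()


# Write a tiny payload twice: once as-is and once gzip-compressed.
payload = b"\x00\x00\x08\x01"
tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "labels-idx1-ubyte")  # hypothetical name
gz_path = plain_path + ".gz"

with open(plain_path, "wb") as handle:
    handle.write(payload)
with gzip.open(gz_path, "wb") as handle:
    handle.write(payload)

# Both paths yield identical bytes once the gzip layer is handled.
same = read_bytes(plain_path) == read_bytes(gz_path, gz=True)
print(same)
```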

EMNIST

  • Get EMNIST data:

    ./get_emnist_data.sh
    
  • Check preview with:

    PYTHONPATH=. ./bin/emnist_preview
    

To use EMNIST datasets you need to call:

mndata.select_emnist('digits')

Here digits is one of the available EMNIST datasets. You can choose from:

  • balanced
  • byclass
  • bymerge
  • digits
  • letters
  • mnist

The EMNIST loader uses gzipped files by default. This can be disabled by setting:

mndata.gz = False

If you disable gzip loading, you also need to unpack the EMNIST files yourself, as the get_emnist_data.sh script won’t do it for you. The EMNIST loader also needs to mirror and rotate the images, so it is a bit slower. If this is an issue for you, repack the data once so that mirroring and rotation are not repeated on each load.
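
A mirror combined with a 90-degree rotation is equivalent to a transpose, so the correction can be sketched as below. This assumes a flat, row-major image and is only an illustration of the idea, not necessarily the exact code the loader runs; transpose_flat is a hypothetical helper name.

```python
def transpose_flat(flat_image, width=28, height=28):
    """Transpose a flat, row-major image of `height` rows by `width` columns.

    The result is again flat and row-major, with `width` rows of `height`
    pixels each.
    """
    if len(flat_image) != width * height:
        raise ValueError("image length does not match width * height")
    return [
        flat_image[row * width + col]
        for col in range(width)
        for row in range(height)
    ]


# A 2-row by 3-column example:
#   1 2 3        1 4
#   4 5 6   ->   2 5
#                3 6
original = [1, 2, 3, 4, 5, 6]
fixed = transpose_flat(original, width=3, height=2)
print(fixed)
```

Applying such a transform once and storing the result is one way to avoid paying the cost on every load.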

Notes

This package deliberately avoids numpy: when I tried to find a working implementation, all of them depended on some archaic version of numpy and none of them worked. It loads the data files with struct.unpack instead.
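
To show what struct.unpack-based loading can look like, here is a minimal sketch of parsing the IDX layout used by the MNIST files: a big-endian header (magic number 2051 for images, 2049 for labels) followed by unsigned bytes. The function names are hypothetical and the code is an illustration under those assumptions, not a copy of the package's implementation. Synthetic bytes are used so it runs without the data files.

```python
import struct


def parse_idx_images(data):
    """Parse IDX image bytes into a list of flat pixel lists."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("unexpected magic number for images: %d" % magic)
    size = rows * cols
    pixels = struct.unpack(">%dB" % (count * size), data[16:16 + count * size])
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]


def parse_idx_labels(data):
    """Parse IDX label bytes into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("unexpected magic number for labels: %d" % magic)
    return list(struct.unpack(">%dB" % count, data[8:8 + count]))


# Synthetic data: two 2x2 images and their two labels.
image_bytes = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(
    bytearray([0, 255, 255, 0, 1, 2, 3, 4])
)
label_bytes = struct.pack(">II", 2049, 2) + bytes(bytearray([7, 3]))

print(parse_idx_images(image_bytes))
print(parse_idx_labels(label_bytes))
```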

Release history

  • 0.5 (this version)
  • 0.4
  • 0.3
  • 0.2

Download files

python-mnist-0.5.tar.gz (9.9 kB), source distribution, uploaded Jan 9, 2018
