
A TensorFlow TFRecord Utility Package


tfrmaker


A utility package that eases the manipulation of TFRecords.


Description

tfrmaker eases the manipulation of TFRecords for your next machine learning project with TensorFlow. With it you can create, extract, and load image datasets in the form of TFRecords. Large image datasets can be converted into TFRecords and fed directly into TensorFlow models for training and testing. Key features of the package include:

  • dynamic resizing
  • splitting TFRecords into optimal shards
  • splitting TFRecords into training, validation, and testing sets
  • counting the number of images in TFRecords
  • asynchronous TFRecord creation
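
As a rough illustration of the sharding and splitting features listed above, the sketch below plans a shard count and train/validation/test sizes in plain Python. It is not tfrmaker's implementation: the function names `plan_shard_count` and `split_counts`, the 100 MiB target shard size, and the 80/10/10 ratios are all assumptions chosen for the example.

```python
def plan_shard_count(total_bytes, target_shard_bytes=100 * 1024 * 1024):
    """Return how many shards keep each file at or below the target size.

    The 100 MiB default is an assumed target, not a value taken from tfrmaker.
    """
    if total_bytes < 0 or target_shard_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target positive")
    # Ceiling division in integer arithmetic; always plan at least one shard.
    return max(1, -(-total_bytes // target_shard_bytes))


def split_counts(n_items, ratios=(0.8, 0.1, 0.1)):
    """Return (train, validation, test) counts that sum to n_items.

    Each split is rounded down; leftover items go to the first split.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    counts = [int(n_items * ratio) for ratio in ratios]
    counts[0] += n_items - sum(counts)
    return tuple(counts)
```

For instance, a 250 MiB dataset with the assumed 100 MiB target would be planned as three shards, and the counts returned by `split_counts` always add up to the number of items passed in.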

Why TFRecords?

The TFRecord format stores data as a sequence of binary records. The records are typically serialized with protocol buffers, a cross-platform, cross-language library for serializing structured data. The format has several advantages:

  • Efficient storage: TFRecord data can take up less space than the original data; it can also be partitioned into multiple files.
  • Fast I/O: The TFRecord format can be read with parallel I/O operations, which is useful for TPUs or multiple hosts.
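
To make "a sequence of binary records" concrete, the following pure-Python sketch writes and reads the TFRecord framing: a little-endian 64-bit length, a masked CRC-32C of the length, the payload, and a masked CRC-32C of the payload. It is an illustration only and is not part of tfrmaker; the helper names (`crc32c`, `masked_crc`, `write_record`, `read_records`) are made up for this example, and real projects would normally rely on TensorFlow's own readers and writers.

```python
import struct

_MASK_DELTA = 0xA282EAD8  # constant used by the TFRecord CRC masking step


def _build_crc32c_table():
    """Build the lookup table for CRC-32C (Castagnoli, reflected polynomial)."""
    table = []
    for index in range(256):
        crc = index
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _build_crc32c_table()


def crc32c(data):
    """Return the CRC-32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """Rotate the CRC right by 15 bits and add the mask constant (mod 2**32)."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + _MASK_DELTA) & 0xFFFFFFFF


def write_record(stream, payload):
    """Append one framed record to a binary stream."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    """Yield each payload from a binary stream, verifying both checksums."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated record header")
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack("<I", stream.read(4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupted record length")
        payload = stream.read(length)
        (payload_crc,) = struct.unpack("<I", stream.read(4))
        if len(payload) != length or payload_crc != masked_crc(payload):
            raise ValueError("corrupted record payload")
        yield payload
```

Because every record carries its own length, counting the records in a file (for example, the number of images) only requires walking the framing: `sum(1 for _ in read_records(stream))`. Each record adds 16 bytes of framing on top of its payload.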

Installation

Use the package manager pip to install tfrmaker.

pip install tfrmaker

Usage

A minimal example of using tfrmaker with image data organized as directories whose names are the class labels:

from tfrmaker import images, display

# Map label names to integer encodings.
LABELS = {"bishop": 0, "knight": 1, "pawn": 2, "queen": 3, "rook": 4}

# Specify the data and output directories.
DATA_DIR = "datasets/chess/"
OUTPUT_DIR = "tfrecords/chess/"

# create tfrecords from the images present in the given data directory.
images.create(DATA_DIR, LABELS, OUTPUT_DIR)

# load one or more tfrecords as an iterator object.
dataset = images.load(["tfrecords/chess/queen.tfrecord", "tfrecords/chess/bishop.tfrecord"], batch_size=32, repeat=True)

# iterate one batch and visualize it along with labels.
databatch = next(iter(dataset))
display.batch(databatch, LABELS)
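
The LABELS mapping in the example above is written by hand. When the data directory holds one sub-directory per class, a mapping of the same shape could be derived from the directory names, as in the sketch below. The helper `labels_from_directory` is hypothetical and not part of tfrmaker, and alphabetical ordering of the class names is an assumption made so that the encoding is reproducible.

```python
from pathlib import Path


def labels_from_directory(data_dir):
    """Map each class sub-directory name to an integer, in alphabetical order.

    Hypothetical helper, not a tfrmaker function. Plain files in data_dir
    are ignored; only directories are treated as classes.
    """
    class_names = sorted(
        entry.name for entry in Path(data_dir).iterdir() if entry.is_dir()
    )
    return {name: index for index, name in enumerate(class_names)}
```

Under these assumptions, a directory containing the sub-directories bishop, knight, pawn, queen, and rook would yield the same mapping as the hand-written LABELS above, and the result could be passed to images.create in its place.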

Refer to the examples folder for more advanced usage.

Support

"Your mental support by starring the repo is much appreciated."

Contribute

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT
