TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, which removes the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
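As a rough illustration of what "detect and uncompress automatically" can mean, the sketch below checks for the two-byte gzip magic number and decompresses only when it is present. This is not the tensorflow-io implementation; the helper name maybe_decompress is hypothetical and the code uses only the Python standard library.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_decompress(data: bytes) -> bytes:
    """Return the decompressed payload if data looks like gzip, else data unchanged.

    Hypothetical helper for illustration only; tensorflow-io performs its own
    detection internally.
    """
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


if __name__ == "__main__":
    payload = b"example idx payload"
    # The same call handles both the compressed and the plain form.
    print(maybe_decompress(gzip.compress(payload)) == payload)
    print(maybe_decompress(payload) == payload)
```

A caller therefore does not need to know in advance whether a file is compressed, which is the convenience the note above describes.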

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
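After installing, it can be useful to confirm which distribution actually ended up in the environment. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is an assumption made for this example and is not part of tensorflow-io.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the stable package, the nightly package, and TensorFlow itself.
    for name in ("tensorflow-io", "tensorflow-io-nightly", "tensorflow"):
        print(name, installed_version(name) or "not installed")
```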

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
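The table can also be turned into a small lookup. The sketch below copies the version pairs from the table and returns the newest listed TensorFlow I/O release for a given TensorFlow version. The names COMPATIBILITY, matches, and latest_tfio_for are hypothetical, and the matching rule (a trailing ".x" accepts any patch release, anything else must match exactly) is an assumption about how the table is meant to be read.

```python
from typing import Optional

# (TensorFlow I/O release, TensorFlow compatibility), in table order, newest first.
COMPATIBILITY = [
    ("0.17.0", "2.4.x"), ("0.16.0", "2.3.x"), ("0.15.0", "2.3.x"),
    ("0.14.0", "2.2.x"), ("0.13.0", "2.2.x"), ("0.12.0", "2.1.x"),
    ("0.11.0", "2.1.x"), ("0.10.0", "2.0.x"), ("0.9.1", "2.0.x"),
    ("0.9.0", "2.0.x"), ("0.8.1", "1.15.x"), ("0.8.0", "1.15.x"),
    ("0.7.2", "1.14.x"), ("0.7.1", "1.14.x"), ("0.7.0", "1.14.x"),
    ("0.6.0", "1.13.x"), ("0.5.0", "1.13.x"), ("0.4.0", "1.13.x"),
    ("0.3.0", "1.12.0"), ("0.2.0", "1.12.0"), ("0.1.0", "1.12.0"),
]


def matches(tf_version: str, pattern: str) -> bool:
    """Check a TensorFlow version against a table entry such as '2.4.x' or '1.12.0'."""
    if pattern.endswith(".x"):
        # Compare only the major and minor components.
        return tf_version.split(".")[:2] == pattern.split(".")[:2]
    return tf_version == pattern


def latest_tfio_for(tf_version: str) -> Optional[str]:
    """Return the first (newest) listed release compatible with tf_version, if any."""
    for tfio_version, pattern in COMPATIBILITY:
        if matches(tf_version, pattern):
            return tfio_version
    return None


if __name__ == "__main__":
    version = latest_tfio_for("2.3.1")
    if version is not None:
        print(f"pip install tensorflow-io=={version}")
```

Versions that do not appear in the table yield None rather than a guess, so a caller can fall back to the release list.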

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from one commit to the next.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010 compatible whl package:

#!/usr/bin/env bash

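# List the wheels produced by the build.
ls dist/*
# Repair each wheel inside the manylinux2010 container so that it carries the manylinux2010_x86_64 tag.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written from inside the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*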

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
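Because each generated whl package targets one Python version, it can help to check which files in wheelhouse match a given interpreter. The sketch below reads the Python tag from the standard wheel filename layout (name-version-python tag-abi tag-platform tag.whl). The helper names python_tag and wheels_for_tag are hypothetical, and compressed tags such as py2.py3 are deliberately not handled.

```python
import sys
from typing import List


def python_tag(major: int = sys.version_info.major,
               minor: int = sys.version_info.minor) -> str:
    """Return the CPython wheel tag, e.g. 'cp38' for Python 3.8."""
    return f"cp{major}{minor}"


def wheels_for_tag(filenames: List[str], tag: str) -> List[str]:
    """Keep only the wheel filenames whose Python tag equals tag."""
    matching = []
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        # The last three dash-separated fields are python tag, abi tag, platform tag.
        parts = filename[: -len(".whl")].split("-")
        if len(parts) >= 5 and parts[-3] == tag:
            matching.append(filename)
    return matching


if __name__ == "__main__":
    wheels = [
        "tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-macosx_10_14_x86_64.whl",
        "tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-macosx_10_14_x86_64.whl",
    ]
    # Show the wheels usable by the interpreter running this script.
    print(wheels_for_tag(wheels, python_tag()))
```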

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions

tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-win_amd64.whl (21.3 MB), CPython 3.9, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-manylinux2010_x86_64.whl (25.4 MB), CPython 3.9, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-macosx_10_14_x86_64.whl (21.5 MB), CPython 3.9, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-win_amd64.whl (21.3 MB), CPython 3.8, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-manylinux2010_x86_64.whl (25.4 MB), CPython 3.8, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-macosx_10_14_x86_64.whl (21.5 MB), CPython 3.8, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-win_amd64.whl (21.3 MB), CPython 3.7m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-manylinux2010_x86_64.whl (25.4 MB), CPython 3.7m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-macosx_10_14_x86_64.whl (21.5 MB), CPython 3.7m, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-win_amd64.whl (21.3 MB), CPython 3.6m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-manylinux2010_x86_64.whl (25.4 MB), CPython 3.6m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-macosx_10_14_x86_64.whl (21.5 MB), CPython 3.6m, macOS 10.14+ x86-64

File hashes

tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-win_amd64.whl
SHA256 879f9da614e1364be9cb256e622757147524d1b3deaadae1ffab48b6f5367e7c
MD5 088925d651478e1ae210ff08b7042758
BLAKE2b-256 d466e295061a36557c9f96a65eacc267368e4f2cc079fc14d59d8ca7cf882037

tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-manylinux2010_x86_64.whl
SHA256 6fb33ce43b838acdcbcbfbec4c8c0b0ec74425a517b8a23cb57fc4045670049b
MD5 22a6fbda45238ef9f545109e27336e6b
BLAKE2b-256 dfcfff4db5edc75a4e484362b1f1d569c3d8109312a170567017e02022ad1a6d

tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-macosx_10_14_x86_64.whl
SHA256 44ee1d5d4e142fdf6d1a467284e43d114970ac07e62ba0df4d62470bf5bd1ebd
MD5 2adec24a148fa9c4d3859aad12fef321
BLAKE2b-256 0d700650cfc5c2e41c269d9ab68e69375d5e701095e973527349a80ae6ce9337

tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-win_amd64.whl
SHA256 5e6c61791d931a524678c883edeb024d00a1073c36c56a876cc160d25f3be5a5
MD5 11c4f3483a0f940e099e5207aed1b860
BLAKE2b-256 2fb5ea4d94662b8df31e3edfe03667d09e1d1908221d74f84f4800edbe109630

tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-manylinux2010_x86_64.whl
SHA256 f2bc5aaa587838d612c8695c95761b8394be9cc580ae0e056fa3724df63ce0b9
MD5 069edcb9f650c161a41b2dd9d7a5ade5
BLAKE2b-256 688f2b3a41f7d4aeae718aa9915da6791a6dcb9ab758bf624e01f1cdd9d899bd

tensorflow_io_nightly-0.18.0.dev20210401015540-cp38-cp38-macosx_10_14_x86_64.whl
SHA256 9daaf5a1b9c02657a1c64f02caa1985c8353fe1561800a4222bf936a307efec7
MD5 0ec7877f4d027a53493ec8689ff7e25d
BLAKE2b-256 10efbb500677d9e0d0dec1f26fe9647d57cc1d74a03a6169ab0a76fe9aa71a17

tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-win_amd64.whl
SHA256 de58cc2c64d83e4ce76c4158d8c8943f82105e0fe19cc8000431fb47f98d2ea3
MD5 c987c189e54a44d5f16c31a6dab07f21
BLAKE2b-256 92de2e1efc2130b0169db925fa9743ace9fd4f8150818216dc93d9f220289b4a

tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-manylinux2010_x86_64.whl
SHA256 a65e693c4c55b59f03d59f442679449cb754bfdc9b9403f868d1b4320f75040e
MD5 c7e5b0947ae980d046d3ce14a8dee760
BLAKE2b-256 9b1b796f615681cd28a25d0c4405738f8586c73c76afc661f16f180edb7c427b

tensorflow_io_nightly-0.18.0.dev20210401015540-cp37-cp37m-macosx_10_14_x86_64.whl
SHA256 b89f30130dcc3c69ff1e1f12acc4231887316ad1b20557ae8d24183e62f377cf
MD5 096ca6919daf8d8f19162b9f29807482
BLAKE2b-256 54a4278eed38d729c4710cc192a556c91dec0f7e831593cd21153126680e5706

tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-win_amd64.whl
SHA256 454d963cc0fedce0ab6ee85f96d1f07003677fb8e99c3c1dbd4701f5c69c00e3
MD5 2dc0b48731326c150221455fbc3b5370
BLAKE2b-256 69df818f0513e5b454d4c524f8112c408641aa1cdffc26a81863b8d1bf25d19c

tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-manylinux2010_x86_64.whl
SHA256 b17d57acd02de8a3ad02ae4d3ef75adc88ecdeaad501848c9f7f813fab47ebda
MD5 80f0a40169d5b2d2c814afa4a3961d59
BLAKE2b-256 c2f424ea63a9a202ad7d450b2614f4917c0e173255dff0d513f37901d74b6354

tensorflow_io_nightly-0.18.0.dev20210401015540-cp36-cp36m-macosx_10_14_x86_64.whl
SHA256 8502ffdfd7650705f468e5576179630cdd0ebcf329096956e1c9affcc5ce7695
MD5 060517a4d07732d8ec688d3403b8d392
BLAKE2b-256 0da3e54f29afad536296645cf8b84a2d7a2dfd4a679ef5105af8485e459c98af
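A downloaded wheel can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the standard library; the function names sha256_of and matches_published_digest are hypothetical, and the wheel path in the usage comment is a placeholder for wherever the file was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (placeholder path; the digest is the cp39 Windows wheel entry above):
# matches_published_digest(
#     Path("tensorflow_io_nightly-0.18.0.dev20210401015540-cp39-cp39-win_amd64.whl"),
#     "879f9da614e1364be9cb256e622757147524d1b3deaadae1ffab48b6f5367e7c",
# )
```

Reading in chunks keeps memory use flat for wheels of this size (roughly 21 to 25 MB each).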
