TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
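
For intuition, the sketch below shows one way such detection could work, using only the Python standard library. A gzip stream starts with the magic bytes 0x1f 0x8b, and the IDX format used by the MNIST files begins with a small big-endian header. This is an illustrative sketch under those assumptions, not tensorflow-io's actual implementation; the helper names maybe_gunzip and parse_idx_header are hypothetical, and the data is synthetic.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, dims)."""
    # Header layout: two zero bytes, one dtype byte, one byte with the number of dims.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Synthetic IDX3 "image" file: 2 images of 28x28 uint8 pixels (dtype code 0x08).
payload = (
    struct.pack(">HBB", 0, 0x08, 3)
    + struct.pack(">III", 2, 28, 28)
    + bytes(2 * 28 * 28)
)
compressed = gzip.compress(payload)

# The raw and the gzip-compressed bytes yield the same header.
print(parse_idx_header(maybe_gunzip(payload)))
print(parse_idx_header(maybe_gunzip(compressed)))
```

Checking the magic bytes rather than the file extension means the same code path handles both compressed and uncompressed inputs.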

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
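
After installing, it can be handy to confirm which of the two distributions is actually present in the environment. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report the stable and nightly distributions separately.
for name in ("tensorflow-io", "tensorflow-io-nightly"):
    print(name, "->", installed_version(name) or "not installed")
```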

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
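
The compatibility table can also be encoded as data and queried programmatically, for example to pick a tensorflow-io version to pin for a given TensorFlow installation. The sketch below is only an illustration: the version pairs are copied from the table, while the names COMPAT, matches, and recommended_tfio are hypothetical and not part of the package.

```python
from typing import Optional

# (TensorFlow I/O version, TensorFlow compatibility), newest first, as in the table.
COMPAT = [
    ("0.19.0", "2.5.x"),
    ("0.18.0", "2.5.x"),
    ("0.17.1", "2.4.x"),
    ("0.17.0", "2.4.x"),
    ("0.16.0", "2.3.x"),
    ("0.15.0", "2.3.x"),
    ("0.14.0", "2.2.x"),
    ("0.13.0", "2.2.x"),
    ("0.12.0", "2.1.x"),
    ("0.11.0", "2.1.x"),
    ("0.10.0", "2.0.x"),
    ("0.9.1", "2.0.x"),
    ("0.9.0", "2.0.x"),
    ("0.8.1", "1.15.x"),
    ("0.8.0", "1.15.x"),
    ("0.7.2", "1.14.x"),
    ("0.7.1", "1.14.x"),
    ("0.7.0", "1.14.x"),
    ("0.6.0", "1.13.x"),
    ("0.5.0", "1.13.x"),
    ("0.4.0", "1.13.x"),
    ("0.3.0", "1.12.0"),
    ("0.2.0", "1.12.0"),
    ("0.1.0", "1.12.0"),
]


def matches(tf_version: str, pattern: str) -> bool:
    """Match 'X.Y.x' patterns by prefix and fully specified patterns exactly."""
    if pattern.endswith(".x"):
        return tf_version.startswith(pattern[:-1])  # e.g. prefix "2.5."
    return tf_version == pattern


def recommended_tfio(tf_version: str) -> Optional[str]:
    """Return the newest listed TensorFlow I/O version for `tf_version`, if any."""
    for tfio_version, pattern in COMPAT:
        if matches(tf_version, pattern):
            return tfio_version
    return None


print(recommended_tfio("2.5.0"))
print(recommended_tfio("1.14.0"))
```

Because the rows are ordered newest first, the first match is the most recent listed release for that TensorFlow series. A version that the table does not cover returns None rather than a guess.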

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
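
To check which Python version a given whl package targets, the tags can be read straight from its file name, which follows the wheel naming convention {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl. The sketch below is a minimal illustration using only the standard library; the helper parse_wheel_name and the example file name are hypothetical.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python: str
    abi: str
    platform: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its distribution, version, and tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five components, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    # The last three components are always the python, abi, and platform tags.
    python, abi, platform = parts[-3:]
    return WheelTags(parts[0], parts[1], python, abi, platform)


# Hypothetical file name of the kind that could appear in the wheelhouse directory.
tags = parse_wheel_name("tensorflow_io-0.19.0-cp37-cp37m-manylinux2010_x86_64.whl")
print(tags.python, tags.abi, tags.platform)
```

A python tag such as cp37 indicates a wheel built for CPython 3.7.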

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

