TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
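To make the idea of automatic decompression concrete, here is a minimal standard-library sketch of one way a reader could tell gzip-compressed IDX data from plain IDX data and then read the array shape from the header. It is an illustration only and not how tensorflow-io implements this; the names maybe_gunzip and read_idx_shape are hypothetical, and the payload is a small synthetic stand-in rather than the real MNIST file.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data is gzip-compressed, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_shape(data: bytes) -> tuple:
    """Parse an IDX header and return the array shape.

    The header is two zero bytes, a type code, a dimension count, and then
    one big-endian uint32 per dimension.
    """
    raw = maybe_gunzip(data)
    zero1, zero2, _type_code, ndim = struct.unpack(">BBBB", raw[:4])
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file")
    return struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])


# Synthetic stand-in for an idx3-ubyte file: 2 images of 28x28 uint8 zeros.
header = struct.pack(">BBBBIII", 0, 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

plain_shape = read_idx_shape(payload)
gzipped_shape = read_idx_shape(gzip.compress(payload))
print(plain_shape, gzipped_shape)  # (2, 28, 28) (2, 28, 28)
```

Checking the magic bytes instead of the file extension means the same call works whether or not the URL ends in .gz.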

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images are available for getting started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
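
As a rough sketch of how the table above could be used in a setup script, the snippet below maps an installed TensorFlow version to a pinned tensorflow-io requirement. The COMPATIBILITY dictionary is transcribed by hand from the table, and the helper name pip_requirement is hypothetical; neither is part of the tensorflow-io package. One assumption to note: the table lists exactly 1.12.0 for releases 0.1.0 through 0.3.0, while the sketch treats it as the whole 1.12 series.

```python
# TensorFlow release series -> tensorflow-io releases, newest first.
# Transcribed by hand from the compatibility table; not an official API.
COMPATIBILITY = {
    "2.5": ["0.19.1", "0.19.0", "0.18.0"],
    "2.4": ["0.17.1", "0.17.0"],
    "2.3": ["0.16.0", "0.15.0"],
    "2.2": ["0.14.0", "0.13.0"],
    "2.1": ["0.12.0", "0.11.0"],
    "2.0": ["0.10.0", "0.9.1", "0.9.0"],
    "1.15": ["0.8.1", "0.8.0"],
    "1.14": ["0.7.2", "0.7.1", "0.7.0"],
    "1.13": ["0.6.0", "0.5.0", "0.4.0"],
    "1.12": ["0.3.0", "0.2.0", "0.1.0"],  # assumption: table says 1.12.0 only
}


def pip_requirement(tf_version: str) -> str:
    """Return a pinned pip requirement for the newest listed matching release."""
    series = ".".join(tf_version.split(".")[:2])
    try:
        newest = COMPATIBILITY[series][0]
    except KeyError:
        raise LookupError(
            f"no tensorflow-io release listed for TensorFlow {tf_version}"
        ) from None
    return f"tensorflow-io=={newest}"


print(pip_requirement("2.4.1"))  # tensorflow-io==0.17.1
```

The printed string can be passed to pip install as is. A TensorFlow version newer than the table raises LookupError rather than guessing a release.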

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built on Linux with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3). Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command will automatically build a manylinux2010-compatible whl package:

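#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written from inside the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*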

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
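
To check which Python versions the wheels in a directory target, the tags can be read straight from the file names. The sketch below follows the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper names parse_wheel_name and python_tags and the example file name are hypothetical and are not part of the project's build tooling.

```python
from pathlib import Path
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated components."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) == 6:
        del parts[2]  # drop the optional build tag
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return WheelTags(*parts)


def python_tags(wheelhouse: str) -> list:
    """Return the sorted, de-duplicated Python tags of all wheels in a directory."""
    return sorted(
        {parse_wheel_name(p.name).python_tag for p in Path(wheelhouse).glob("*.whl")}
    )


# Hypothetical file name, used only to show the parsed fields.
tags = parse_wheel_name("example_pkg-1.0.0-cp37-cp37m-manylinux2010_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

Calling python_tags("wheelhouse") after a build would then list one tag per targeted interpreter, for example cp35, cp36, and cp37 for the versions mentioned above.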

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as those tested with live systems or emulators.

| System | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


