TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
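To illustrate what "detect and uncompress automatically" can mean, here is a minimal standard-library sketch. It is not tensorflow-io's actual implementation: it assumes gzip data is recognised by its two leading magic bytes and that the MNIST files use the IDX layout (two zero bytes, a type code, a dimension count, then big-endian dimension sizes). The helper names maybe_gunzip and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_gunzip(raw: bytes) -> bytes:
    """Return the decompressed bytes if `raw` is gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes):
    """Parse the header of an IDX file (hypothetical helper, for illustration).

    Returns (dtype_code, dims): the element type code and the dimension sizes.
    """
    data = maybe_gunzip(raw)
    # Bytes 0-1 must be zero, byte 2 is the type code, byte 3 the number of dims.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims
```

With this sketch, the same call works for both a plain and a gzip-compressed payload, which is the behaviour the note above describes from the caller's point of view.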

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
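
As a rough illustration of how the table can be used, the sketch below maps a TensorFlow version string to the newest TensorFlow I/O release listed for that series. The mapping is copied from the table above; the names LATEST_TFIO_FOR_TF and matching_tfio_version are hypothetical helpers and not part of tensorflow-io.

```python
# Newest TensorFlow I/O release listed in the table for each TensorFlow series.
LATEST_TFIO_FOR_TF = {
    "2.6": "0.20.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def matching_tfio_version(tf_version: str):
    """Return the listed TensorFlow I/O version for `tf_version`, or None."""
    # Only the major.minor part decides compatibility in the table.
    major_minor = ".".join(tf_version.split(".")[:2])
    return LATEST_TFIO_FOR_TF.get(major_minor)


if __name__ == "__main__":
    version = matching_tfio_version("2.5.0")
    print(f"pip install tensorflow-io=={version}")
```

A result of None means the TensorFlow series is not in the table, in which case the list of releases is the place to look.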

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following script will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# List the wheels that will be repaired.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the working directory back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.
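
To check which Python versions the generated packages target, the tags can be read from the wheel filenames. The sketch below is an illustration only and assumes the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag); wheel_tags and python_tags_in are hypothetical helpers, and the example filename is made up.

```python
from pathlib import Path


def wheel_tags(filename: str):
    """Return (python_tag, abi_tag, platform_tag) parsed from a wheel filename."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    # A wheel name has 5 components, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {name}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def python_tags_in(directory: str = "wheelhouse"):
    """Return the sorted set of Python tags of all wheels in `directory`."""
    return sorted({wheel_tags(p.name)[0] for p in Path(directory).glob("*.whl")})


if __name__ == "__main__":
    # Hypothetical filename, used only to show the shape of the result.
    print(wheel_tags("example_pkg-1.0.0-cp37-cp37m-manylinux2010_x86_64.whl"))
```

Calling python_tags_in() after the build lists the interpreter tags found in the wheelhouse directory, so a missing Python version is easy to spot.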

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.
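
When running such tests locally, it can help to check whether a live system or emulator is actually reachable before starting. The following is a minimal sketch of one way to do that and is not how the project's CI does it; service_is_up is a hypothetical helper, and the host and port in the example (localhost:9092, the usual Kafka broker default) are assumptions.

```python
import socket


def service_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Assumed address of a locally installed Kafka broker.
    if service_is_up("localhost", 9092):
        print("Kafka is reachable; live tests can run.")
    else:
        print("Kafka is not reachable; skip the live tests.")
```

A successful TCP connection only shows that something is listening on the port, so this is a quick precondition check rather than a health check of the service itself.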

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
