TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
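To make the automatic decompression concrete, here is a minimal standard-library sketch of the general idea: check for the gzip magic number, decompress if it is present, then parse the IDX header that MNIST image files use. It is not how tensorflow-io implements the feature internally; the helper names maybe_gunzip and parse_idx_images are hypothetical, and the tiny payload stands in for real MNIST data.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic number."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx_images(raw: bytes):
    """Parse an IDX3 ubyte image file (optionally gzipped) into nested lists."""
    data = maybe_gunzip(raw)
    # IDX3 header: magic 0x00000803, then count, rows, cols (big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 0x00000803:
        raise ValueError(f"not an IDX3 ubyte file (magic={magic:#010x})")
    pixels = data[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("pixel payload does not match header dimensions")
    size = rows * cols
    return [
        [list(pixels[i * size + r * cols : i * size + (r + 1) * cols]) for r in range(rows)]
        for i in range(count)
    ]


# Two tiny 2x3 "images" standing in for real MNIST data (hypothetical payload).
payload = struct.pack(">IIII", 0x00000803, 2, 2, 3) + bytes(range(12))
plain = parse_idx_images(payload)
zipped = parse_idx_images(gzip.compress(payload))
```

Both calls yield the same images, so the caller does not need to know whether the input was compressed.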

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images are available for getting started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
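As a rough illustration of how the table can be used, the sketch below maps a TensorFlow version to a pinned pip requirement. The dictionary holds only the newest listed TensorFlow I/O release for each TensorFlow series, copied from the table above. The name pip_requirement is hypothetical and not part of tensorflow-io, and releases newer than the table are not covered.

```python
# Newest TensorFlow I/O release listed in the table for each TensorFlow series.
LATEST_TFIO_FOR_TF = {
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def pip_requirement(tf_version: str) -> str:
    """Return a pinned requirement string for the given TensorFlow version."""
    # Only the major.minor series matters for the lookup, e.g. "2.8.4" -> "2.8".
    series = ".".join(tf_version.split(".")[:2])
    try:
        tfio_version = LATEST_TFIO_FOR_TF[series]
    except KeyError:
        raise LookupError(f"TensorFlow {tf_version} is not in the table") from None
    return f"tensorflow-io=={tfio_version}"


requirement = pip_requirement("2.8.4")
```

The resulting string can be passed to pip install or placed in a requirements file.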

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

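#!/usr/bin/env bash

# List the wheels that are about to be repaired.
ls dist/*
# Run auditwheel inside the manylinux2010 container for each wheel.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*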

The build takes some time, but once it completes, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.
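Since each generated whl package matches one Python version, it can be handy to check a wheel's file name against an interpreter before installing it. The sketch below relies on the standard wheel naming scheme (name-version-python tag-ABI tag-platform tag) and handles only CPython tags of the form cpXY. The function names and the example file name are hypothetical.

```python
import sys


def wheel_python_tag(filename: str) -> str:
    """Extract the Python tag (for example 'cp37') from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    # Layout: name-version(-build)?-python_tag-abi_tag-platform_tag.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    return parts[-3]


def matches_interpreter(filename: str, version_info=sys.version_info) -> bool:
    """Return True if the wheel's CPython tag matches the given version."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(filename) == expected


# Hypothetical file name, used only to illustrate the naming scheme.
example = "tensorflow_io-0.26.0-cp37-cp37m-macosx_10_14_x86_64.whl"
tag = wheel_python_tag(example)
```

Called with the default version_info, matches_interpreter compares the wheel against the interpreter running the script, which is the same python the build script picks up from the shell.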

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

References for emulators:

Community

Additional Information

License

Apache License 2.0
