
TensorFlow IO

Project description




TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found in the project documentation.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs to access the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save the dataset to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs for the compressed (gzip) files to the API call as is.
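
Because the HTTP/HTTPS file system is registered when tensorflow_io is imported, remote files can also be read through TensorFlow's regular file APIs. The following is a minimal sketch, assuming the HTTP(S) file system supports reads via tf.io.gfile; the URL simply reuses the MNIST bucket from the example above:

import tensorflow as tf
import tensorflow_io as tfio  # importing registers the extra file systems

# Read the first few bytes of a remote gzip file over HTTPS,
# without downloading it to local disk first.
url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
with tf.io.gfile.GFile(url, "rb") as f:
    header = f.read(16)
print(len(header), "bytes read over HTTPS")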

Please check the official documentation for more detailed and advanced usage of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
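
For example, to pair tensorflow-io with the CPU-only TensorFlow package via the extra requirement (assuming the extras named above), the command would be:

$ pip install tensorflow-io[tensorflow-cpu]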

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. The list of releases is available on the project's GitHub releases page.
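
As a quick sanity check before consulting the table, a short snippet like the one below can print the installed versions side by side (a sketch, assuming both packages expose a __version__ attribute):

import tensorflow as tf
import tensorflow_io as tfio

# Print the installed versions so they can be matched against the
# compatibility table below (e.g. tensorflow-io 0.23.1 pairs with TensorFlow 2.7.x).
print("tensorflow:", tf.__version__)
print("tensorflow-io:", tfio.__version__)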

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contribution guidelines in the repository.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# Repair each wheel in dist/ inside a manylinux2010 container, producing
# manylinux2010-compatible wheels, then restore ownership of the output.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done
sudo chown -R $(id -nu):$(id -ng) .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version of python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test; Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different Python versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      yes            yes            yes            N/A
3.7      yes            yes            yes            yes
3.8      yes            yes            yes            yes

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS, among others.
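
As a concrete illustration of one such integration, the snippet below reads records from a Kafka topic into a dataset. This is a minimal sketch: it assumes a Kafka broker reachable locally with messages already produced to a topic named "test-topic" (both are assumptions, not part of this README), and it uses the tfio.IODataset.from_kafka API, whose elements are (message, key) pairs as in the Kafka tutorial:

import tensorflow_io as tfio

# Read an existing Kafka topic into an IODataset; the topic name,
# partition, and offset here are illustrative assumptions.
kafka_ds = tfio.IODataset.from_kafka("test-topic", partition=0, offset=0)

# Each element is a (message, key) pair of tf.string tensors.
for message, key in kafka_ds.take(3):
    print(message.numpy(), key.numpy())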

We tried our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as Kinesis, PubSub, and Azure Storage, are done through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

Apache Kafka: live system, CI integration
Apache Ignite: live system, CI integration
Prometheus: live system, CI integration
Google PubSub: emulator, CI integration
Azure Storage: emulator, CI integration
AWS Kinesis: emulator, CI integration
Alibaba Cloud OSS: offline
Google BigTable/BigQuery: to be added
Elasticsearch (experimental): live system, CI integration
MongoDB (experimental): live system, CI integration

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release.

Built Distributions

tensorflow_io_nightly-0.24.0.dev20220121212647-cp310-cp310-win_amd64.whl (21.4 MB)
Uploaded: CPython 3.10, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp310-cp310-macosx_10_14_x86_64.whl (23.8 MB)
Uploaded: CPython 3.10, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-win_amd64.whl (21.4 MB)
Uploaded: CPython 3.9, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-macosx_10_14_x86_64.whl (23.8 MB)
Uploaded: CPython 3.9, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp38-cp38-win_amd64.whl (21.4 MB)
Uploaded: CPython 3.8, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp38-cp38-macosx_10_14_x86_64.whl (23.8 MB)
Uploaded: CPython 3.8, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp37-cp37m-win_amd64.whl (21.4 MB)
Uploaded: CPython 3.7m, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220121212647-cp37-cp37m-macosx_10_14_x86_64.whl (23.8 MB)
Uploaded: CPython 3.7m, macOS 10.14+ x86-64

File hashes

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 974c5eb6f3e1ee238936ab875cbae7ed2ce401aa787885ac0fe5fac721b78da9
MD5 fc5dc2d0f9e0bdfd2773de037302b6e3
BLAKE2b-256 826ecad97a4fc41d3724d85ae7908269feb62e7916b0c3b8714b805907b54c0a

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c53ec5ad16e4d5ae7ff97327c8ad45d9340930544a1ccdb374fb8103db1a0e67
MD5 e8f0c0e18a0eb4e907c8d8bf09c408f9
BLAKE2b-256 68d5c54b7694543b829a74ad62071ee088565d852ce991aaace7417bb3299827

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 8f309a2ae7a901beead3276afabe8e6be336b2937b0b031cad36aa401111db1d
MD5 cf5e35f4469cc6fa6f05e86bd0e363a6
BLAKE2b-256 ea37dfb7bbc45c6208d53fb75333b9d76c31fc9bb826023ce2d89e718f7babb3

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 1b4a1347699c879e9003d29094ec5e2c8038b5c6f597744020101be574bffdb4
MD5 8e1321913d76082505c845e0722025f1
BLAKE2b-256 035bae3631665db7fab0b5371cabe0d6b5da5c5a7ee74967a1dca44c8de4a55f

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 d276b2fe4009365ad6e7707020cdd7c01d86f8269e24b3bc6374044c7fc284fd
MD5 ff9876f58660fc1d19528e867794dec0
BLAKE2b-256 34504278920b64550e4552d533f782afe65839a0a798f50c23c791fb50fe154c

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 16430efcef19b7932f56e8a68af494ef8471ad0d09cdb81de753128d14764d34
MD5 142e672b5d77b62427ab752bde722a1f
BLAKE2b-256 63177106364eacb6f08c0724fddb6da51f1c6acf2093dc3ce2a84387b3bedae6

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 42634cc027c8d3748699e0808b97ea5043079280ce0bc3206edd6bb238f3087b
MD5 a2827360eac5b76051893a96a02d2104
BLAKE2b-256 a96469cf2052c8a2b8b6605c61464bb36a24b5f71f6eb4793f138604a73ca340

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9c80f5cd46e173a1b54db8218d292f87bcac0656d42054d9e398d7a971ab8972
MD5 3b1e0a9020cb38f8369a2f4a155d766e
BLAKE2b-256 49caa938018e7b98d0ced559b2e6a8ee4ac524810e451f90277c451149e60c64

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 cb6dbf0417bbe187837beaae190cb14b7711539aca1ec74f4cf2dcc8bd307029
MD5 e6312007d78f90ceecaa3ed94eb7dee0
BLAKE2b-256 85f6d3ce559899320e685bb971301479280a16ffee8cb392a0b4d906a1bfd457

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 7ba592acb5db885bc616436f80848489a9350e020ed9edd97eb8b0931541069c
MD5 898370673b0247112ac221d6d4604656
BLAKE2b-256 0d0b7f67a9e7d23bdb61985ea509268dd621c812dc052b5e22e7844f3aa3d96b

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 fe5043be4fde881e5fe2870517f58764ab0f6698f5b7ea90688e57ea4edc1266
MD5 825c101b1e30ec2df5bfdfd010f79eb8
BLAKE2b-256 82c187a845ea241e9719e9b039686ef8372bf95ef4676e7aee72cef1ffdbd5cb

Hashes for tensorflow_io_nightly-0.24.0.dev20220121212647-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 1fd1aa71d0cd9d51b62e2dd412de358049c1c3074e60b92c1ce882b1e87ffea6
MD5 a327325e3719e5ea834b8bbd43f0ce6f
BLAKE2b-256 1f4bca9a41f84c5e405dedb3bdf638c239f9f779620873e13cba08cec2222f3e

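A downloaded wheel can be verified against the SHA256 digests listed above. The following is a small sketch using Python's standard hashlib module; the file name is just an example and should be replaced with whichever wheel was downloaded:

import hashlib

# Compute the SHA256 digest of a downloaded wheel and compare the output
# with the corresponding value listed above.
path = "tensorflow_io_nightly-0.24.0.dev20220121212647-cp39-cp39-win_amd64.whl"
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
print(digest.hexdigest())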
