TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
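To illustrate what automatic detection of compressed input can look like, here is a minimal standard-library sketch. It is not tensorflow-io's actual implementation: the helper names maybe_decompress and parse_idx_header are hypothetical, and the sample data is synthetic. It checks for the gzip magic bytes before parsing the IDX header used by the MNIST files.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return data unchanged unless it starts with the gzip magic bytes."""
    return gzip.decompress(data) if data[:2] == GZIP_MAGIC else data


def parse_idx_header(data: bytes):
    """Parse the header of an IDX file (the format of the MNIST files).

    The header is two zero bytes, one data-type byte, one dimension-count
    byte, then one big-endian uint32 per dimension.
    Returns (dtype_code, dims).
    """
    data = maybe_decompress(data)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Synthetic "images" file: 2 images of 28x28 uint8 pixels (type code 0x08).
raw = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)

print(parse_idx_header(raw))                 # (8, (2, 28, 28))
print(parse_idx_header(gzip.compress(raw)))  # (8, (2, 28, 28))
```

The same header is recovered whether or not the payload is gzip-compressed, which is the behavior the note above describes from the caller's point of view.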

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
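After installing, it can be useful to confirm which versions actually ended up in the environment. The following is a small sketch using only the standard library (importlib.metadata, Python 3.8+); the helper name installed_version is hypothetical and not part of tensorflow-io.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of both packages in the current environment.
for name in ("tensorflow", "tensorflow-io"):
    print(name, installed_version(name))
```

A result of None for either package means it is not installed in the interpreter that ran the check.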

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
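The table can also be checked programmatically. The sketch below encodes a subset of the rows above in a dictionary and compares a TensorFlow version against the listed series. The names COMPATIBILITY and is_compatible are hypothetical and are not provided by tensorflow-io.

```python
# Subset of the compatibility table: TensorFlow I/O release -> TensorFlow series.
COMPATIBILITY = {
    "0.22.0": "2.7.x",
    "0.21.0": "2.6.x",
    "0.20.0": "2.6.x",
    "0.19.1": "2.5.x",
    "0.17.1": "2.4.x",
    "0.16.0": "2.3.x",
}


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Return True if tf_version belongs to the series listed for tfio_version."""
    series = COMPATIBILITY.get(tfio_version)
    if series is None:
        raise KeyError(f"no table entry for TensorFlow I/O {tfio_version}")
    major_minor = series.rsplit(".", 1)[0]  # "2.7.x" -> "2.7"
    # Compare only the major and minor components of the TensorFlow version.
    return tf_version.split(".")[:2] == major_minor.split(".")


print(is_compatible("0.22.0", "2.7.0"))  # True
print(is_compatible("0.22.0", "2.6.2"))  # False
```

Only the ".x" series rows are handled here; rows that pin an exact TensorFlow version (such as 1.12.0) would need an equality check instead.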

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command will automatically build a manylinux2010-compatible whl package:

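#!/usr/bin/env bash

# List the wheels currently in dist/.
ls dist/*
# Run tools/build/auditwheel from the mounted checkout on each wheel
# inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Give ownership of the working tree back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*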

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
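To see which Python version and platform a given whl file targets, the tags can be read straight from its filename. The following is a minimal sketch based on the standard wheel naming scheme; the helper name wheel_tags is hypothetical, and the example filename is one of the nightly files listed under Built Distributions on this page.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its naming-scheme fields.

    Scheme: {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    base = Path(filename).name
    if not base.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = base[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # Several platform tags can be packed into one field, separated by dots.
        "platform": platform_tag.split("."),
    }


example = (
    "tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
tags = wheel_tags(example)
print(tags["python"], tags["abi"], tags["platform"])
# cp37 cp37m ['manylinux_2_12_x86_64', 'manylinux2010_x86_64']
```

Applying the same function to every file matched by Path("wheelhouse").glob("*.whl") gives a quick overview of which interpreters the build produced wheels for.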

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We tried our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-win_amd64.whl (21.6 MB view details)

Uploaded CPython 3.10 Windows x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-macosx_10_14_x86_64.whl (23.8 MB view details)

Uploaded CPython 3.10 macOS 10.14+ x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-win_amd64.whl (21.6 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-macosx_10_14_x86_64.whl (23.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-win_amd64.whl (21.6 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-macosx_10_14_x86_64.whl (23.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-win_amd64.whl (21.6 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-macosx_10_14_x86_64.whl (23.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 a5f193015911b6baddf53b0d7c3a2a9d5ed5152e372308cc7e249c64b4d28b25
MD5 e03b5630feb9e18d0ecb7952762151e2
BLAKE2b-256 78b030e101ef6845483ae906c454fc380f824b87adab47aab33f8c10d35295e0

See more details on using hashes here.
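A downloaded file can be checked against the published SHA256 digest before it is installed. The following is a minimal sketch using only the standard library; the helper names sha256_of and verify are hypothetical, and the local path in the usage comment assumes the wheel was downloaded to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_sha256: str) -> bool:
    """Compare a file's SHA256 digest with the published value."""
    return sha256_of(path) == expected_sha256.strip().lower()


# Usage (assumes the wheel has been downloaded to the current directory):
# verify(
#     "tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-win_amd64.whl",
#     "a5f193015911b6baddf53b0d7c3a2a9d5ed5152e372308cc7e249c64b4d28b25",
# )
```

Reading in chunks keeps memory use flat for files of this size (roughly 20 MB each).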

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 7420b4840e2f49d2946ca9a15f3be4c0f638214a934f8d1fc0466e55db702a3a
MD5 1165dd14993540bbd7f0b5a1f77a6369
BLAKE2b-256 af7f2d6c9031292d45317a870a0242485becda27da6bd46d6376ffc689cb5a74

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 a1a15c57bd3ca078d70e65fc1ae0934d3246bb3c9b9391cf41f8691249b318ac
MD5 a8b85e40c93a8381d5e8c5a913724cc7
BLAKE2b-256 8a02282181815659c689fcb4dc9b613293f389cc29ab6af5e22cdd9bebbea6fc

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 5c9a2a1ab4cdbb7f9f33435118ee12592ffaeb281e8d8246444425873b5291b8
MD5 46339df053383c7efe1ac32ca177ed47
BLAKE2b-256 75d4b9163744c7e5d18d1936586941449d4fa2a1c4c6f595f7ef7ed633445feb

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 10476359d497dbf177bfe1f6591597204c8f9353f5ea580f5aefe05ebe4330d6
MD5 27c113c6c87612764bfb954f44cc3835
BLAKE2b-256 2f4eec0d59f71f8358f8b41dcbc9b40ca5d0ebcfe730f3fdd4ddda1bf72d0347

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 235197fb11954424b43434f9e2e81e17c75cef6dee6519a4ab987ace0e2248f4
MD5 4547c716a43f6d448cb5b46bef076d79
BLAKE2b-256 a6bdf98894fa6c54b1e1ad386b8c91c5d857a3956a2463ff7092522c66328caa

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 c1468297bedcf2266f4e820ee03f20fc7df8d53cefe82ecb6f7cf6f7978f2bb5
MD5 99e195a3edbc62b3be7ee5f659678a2e
BLAKE2b-256 f65a656c5a45696b6b0a2d1ddb8800394877f2d8e2fd5a697a589f873ea7d2d4

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 8e9921e75f395074e3fd7f97c16abe3af8c1a40a4f25a625387e978af8bba33d
MD5 e4e1803756be3be984e07d368a17bc3e
BLAKE2b-256 fc5b62d2c419f7816bed7a5f90f0e28941ee1e93f50453cea012da219c170844

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2ccec8509db0d38babb33d6aa1a54dffdd12f4329be0331ea00d42e0cfc854c5
MD5 1b50464a1bd4bae433107e6973d101cb
BLAKE2b-256 79770467dcb9b5364829290e9ffb5732698f114ef16988d46709d2d4a5904e74

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 abea3b545bf2f023019bf6cf3a7cd6a935620359d2419c7a4eed6c61ad69ec8a
MD5 fc301c8f9a05f5b511ad66f16ec7d7fc
BLAKE2b-256 f932da6b218dc119327328492acc6817263755c79c96581e8b01607214e2b4eb

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 b45a73d10f6c8bbe83cfc3a20057d41d1481fb27b0307bd449d84f9a60200f7b
MD5 c4e4fecd6eb78351f8f67d394358537a
BLAKE2b-256 d95b9b6cc09c470065e63cf1f5132e77a2b3af37f41bbb59be1364fa6a1edf57

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.22.0.dev20211201202542-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 d502603d7009906b051b48002363eaef35287aae68cd1fa66936df66dd659c5e
MD5 e1ed94b24a7be35c88eccb0b4d21598b
BLAKE2b-256 0f98b4405c098222f57c4adbe53d06dfed797c783d10f7a1166c29dd072eea13

See more details on using hashes here.
