TensorFlow IO

Project description




TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
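
The same pattern extends naturally to evaluation. The sketch below is a hypothetical continuation of the example above; it assumes the MNIST test split is hosted in the same bucket under the standard t10k-* file names:

# Build an evaluation pipeline from the MNIST test split (assumed file names).
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

# Evaluate the trained model on the held-out test set.
loss, accuracy = model.evaluate(d_test)
print("test accuracy:", accuracy)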

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io is able to detect and decompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
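
Because tensorflow-io registers an HTTP/HTTPS file system on import, remote files can also be accessed through the regular tf.io.gfile APIs. The snippet below is a minimal sketch of that idea, assuming the HTTP file system plugin is active in your installed version; it only peeks at the first bytes of the remote compressed file rather than downloading it:

import tensorflow as tf
import tensorflow_io as tfio  # importing tensorflow_io registers the HTTP/HTTPS file system

url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"

# Peek at the first bytes of the remote, still gzip-compressed file
# without saving it to a local directory first.
with tf.io.gfile.GFile(url, "rb") as f:
    print(f.read(2))  # gzip streams start with the magic bytes 0x1f 0x8b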

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
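
If you prefer to pin an exact release together with its matching TensorFlow, the extra can be combined with a version specifier (the version below is only illustrative; pick a pair from the compatibility table further down):

$ pip install "tensorflow-io[tensorflow]==0.23.1"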

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
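
As a quick sanity check, you can print both versions at runtime and compare them against the table above (a minimal sketch, assuming the package exposes a __version__ attribute, as recent releases do):

import tensorflow as tf
import tensorflow_io as tfio

# Expect a compatible pair, e.g. TensorFlow 2.7.x alongside TensorFlow I/O 0.23.x.
print("TensorFlow:", tf.__version__)
print("TensorFlow I/O:", tfio.__version__)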

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it easy to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contribution guidelines in the repository.

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward, but if the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# List the freshly built wheels, then repair each one inside the official
# manylinux2010 container so the result is portable across Linux distributions.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created inside the container are owned by root; hand ownership back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. If you want to build a whl package for a specific Python version, you have to alias that version of python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test; Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python  Ubuntu 18.04  Ubuntu 20.04  macOS + osx9  Windows-2019
2.7     ✓             ✓             ✓             N/A
3.7     ✓             ✓             ✓             ✓
3.8     ✓             ✓             ✓             ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We tried our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as Kinesis, PubSub, and Azure Storage, are run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested against live systems or emulators.

                              Live System  Emulator  CI Integration  Offline
Apache Kafka                  ✓                      ✓
Apache Ignite                 ✓                      ✓
Prometheus                    ✓                      ✓
Google PubSub                              ✓         ✓
Azure Storage                              ✓         ✓
AWS Kinesis                                ✓         ✓
Alibaba Cloud OSS                                                    ✓
Google BigTable/BigQuery                             to be added
Elasticsearch (experimental)  ✓                      ✓
MongoDB (experimental)        ✓                      ✓

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-win_amd64.whl (21.6 MB)

Uploaded CPython 3.10 Windows x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-macosx_10_14_x86_64.whl (23.8 MB)

Uploaded CPython 3.10 macOS 10.14+ x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-win_amd64.whl (21.6 MB)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-macosx_10_14_x86_64.whl (23.8 MB)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-win_amd64.whl (21.6 MB)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-macosx_10_14_x86_64.whl (23.8 MB)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-win_amd64.whl (21.6 MB)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-macosx_10_14_x86_64.whl (23.8 MB)

Uploaded CPython 3.7m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 c31d9f3f33ebc76c73f7135a8c62f16dac439a8612a020bbebc941af77769e0c
MD5 5ec7b39857e73ef809b27b9974fff06c
BLAKE2b-256 bdc51146d5baf7b0491b82e02231ff8e609325fd11dff0e5c9b0ae35f49df64b

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 626a9fad954820679737977f8fdaea823c38343851b16063e6f4ea81d93a6bc9
MD5 a7ec0459c3f4e45205a4dd79e32ee18d
BLAKE2b-256 f7d676280684ab0220bb357c33e54cced2aa63f8b7c9b50711d01f256510b0a1

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 56a2574d69e73ea9f5ba73d04c2cebb45113659f91df463ea93c97a3d4de26f0
MD5 d013ec79019c7061cc775096e426a4e3
BLAKE2b-256 c59524f8bcc7525c793267446ab57da41e9efa84294b214b4b6edb9175e75cde

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 683ccf2d326c2324e2700b990f3b1d0be801fe903e28f1a011e7a90094100d7a
MD5 a7410c321deb7fb590415dab3d14b760
BLAKE2b-256 b56440caad48d8097f8f75d9a0d1529a576cdd468a2e2bb5836e8c3e36632e9b

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 cd7ae919e8c506328a1d17b585b29bc2fdf28c67aea1de72bd8c0a7a1d00117d
MD5 f89f2267c21b5b149d5d986e564dc3df
BLAKE2b-256 0db46f865cb7182147790c4e2519dc8bc6e5107125224d4080bea6932f0f7100

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2ddcadb3de983dc304d5c4831c8524abfab221718624ab1bfc9223aa58c42a96
MD5 fbbf7ba417cc9c4a32c30eeb4b02fde5
BLAKE2b-256 c4897907feb48f73734b00a26326b00f2d6fc655e34d650969fefc857dbf9f1d

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 b770bed9406d2dee66c99f9336aaef09d6237208ceb8477cdf9c07cd18d29e1c
MD5 89b93955dfd21eb0303da5cad462a575
BLAKE2b-256 1efc9fb6f6876beaf69b68704d56e1c8806be1f645e3861dbed037a69aa0f51c

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 16917c01e03f4102ca23ec0960c13e9d8b8800b7d1f1a41a74dba179d14c7b6c
MD5 41b51210a7214078c2f7c96952a4899a
BLAKE2b-256 f284cd6998f0c66dd977fe910dfc0c6366127395db757d61511fc9242eb32d0c

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 00cd4da39c11b72e5e94e4e59d1450a00a3f2ade055775c74673bddf1fc197ec
MD5 2cc102be6ecd1df660cf40fd63c1c39c
BLAKE2b-256 92435394c680d09af60b845b30b84f8fc8a25c87364258f93a00070e8b63cfb8

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 cac7e13143563b6c3384e67000c1b4f09fb37980a4051d5935d0f2470c89022a
MD5 bfdae9a6cff7aa3318d5c1316d1ed90c
BLAKE2b-256 4066e285906587a10d75a0f3dbdbe4d8373b561a20c9f47e7bcedb6568f0624a

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c0b7e0b854cde6e63afc62be7f538b77714e6c3ccfd394c7fdd112e3d693a482
MD5 9d670ddeca02780a06b3e722863fd0d2
BLAKE2b-256 e6ecb96439f89a55f71c1fe052673be8a4a76b40945554c540d74565f7b907f7

File details

Details for the file tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.23.1.dev20220103161115-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 43c38e2e56c89175411bc663ed87a33517f5974b203dda60b18b260d9d3a8d74
MD5 db3a8579c2fe39706a32d727f3c54689
BLAKE2b-256 66f8ab01627a2cc982d9f1734482898fb3cd3e818e238a7271b2570bb9fa8a6f
