TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, so there is no need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
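
To make the note above concrete, here is a minimal standard-library sketch of what "detect and uncompress if needed" can look like for an MNIST IDX file. It is an illustration only and is not how tensorflow-io implements it; the helper names maybe_decompress and read_idx_header are hypothetical, and the input is a synthetic stand-in for train-images-idx3-ubyte.gz rather than the real download.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the gunzipped bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse the big-endian IDX header and return (dtype_code, dims)."""
    raw = maybe_decompress(data)
    # Bytes 0-1 are zero, byte 2 is the element type, byte 3 the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    return dtype_code, dims


# Synthetic stand-in for an idx3-ubyte file: 2 images of 28x28 zero bytes.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

# The header parses the same way whether or not the bytes are gzip-compressed.
print(read_idx_header(gzip.compress(payload)))
print(read_idx_header(payload))
```

The type code 0x08 denotes unsigned bytes, which matches the uint8 image data mentioned in the example.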

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
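
As a rough sketch of how the table above can be used, the following standard-library snippet looks up the newest TensorFlow I/O release listed for a given TensorFlow version. The rows are copied from the table; the helper names (matches, latest_tfio_for) are hypothetical and not part of tensorflow-io.

```python
# (TensorFlow I/O version, TensorFlow compatibility), newest first, as in the table.
COMPAT = [
    ("0.20.0", "2.6.x"), ("0.19.1", "2.5.x"), ("0.19.0", "2.5.x"),
    ("0.18.0", "2.5.x"), ("0.17.1", "2.4.x"), ("0.17.0", "2.4.x"),
    ("0.16.0", "2.3.x"), ("0.15.0", "2.3.x"), ("0.14.0", "2.2.x"),
    ("0.13.0", "2.2.x"), ("0.12.0", "2.1.x"), ("0.11.0", "2.1.x"),
    ("0.10.0", "2.0.x"), ("0.9.1", "2.0.x"), ("0.9.0", "2.0.x"),
    ("0.8.1", "1.15.x"), ("0.8.0", "1.15.x"), ("0.7.2", "1.14.x"),
    ("0.7.1", "1.14.x"), ("0.7.0", "1.14.x"), ("0.6.0", "1.13.x"),
    ("0.5.0", "1.13.x"), ("0.4.0", "1.13.x"), ("0.3.0", "1.12.0"),
    ("0.2.0", "1.12.0"), ("0.1.0", "1.12.0"),
]


def matches(tf_version: str, pattern: str) -> bool:
    """'2.5.x' matches any 2.5 patch release; '1.12.0' matches only itself."""
    if pattern.endswith(".x"):
        return tf_version.split(".")[:2] == pattern.split(".")[:2]
    return tf_version == pattern


def latest_tfio_for(tf_version: str):
    """Return the newest listed TensorFlow I/O version, or None if not in the table."""
    for tfio_version, pattern in COMPAT:  # rows are ordered newest first
        if matches(tf_version, pattern):
            return tfio_version
    return None


print(latest_tfio_for("2.5.1"))
```

The result could then be pinned at install time, for example pip install tensorflow-io==0.19.1 alongside TensorFlow 2.5.x.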

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | ✓ | ✓ | ✓ | N/A
3.7 | ✓ | ✓ | ✓ | ✓
3.8 | ✓ | ✓ | ✓ | ✓

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka ✓ ✓
Apache Ignite ✓ ✓
Prometheus ✓ ✓
Google PubSub ✓ ✓
Azure Storage ✓ ✓
AWS Kinesis ✓ ✓
Alibaba Cloud OSS ✓
Google BigTable/BigQuery to be added
Elasticsearch (experimental) ✓ ✓
MongoDB (experimental) ✓ ✓

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64
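
The file names above follow the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). As a small sketch of how to read them, the following standard-library parser splits a wheel file name into those parts. The names WheelTags and parse_wheel_filename are hypothetical and do not belong to pip or tensorflow-io.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split '<dist>-<version>[-<build>]-<python>-<abi>-<platform>.whl' into its tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 parts when an optional build tag is present
        raise ValueError("unexpected wheel file name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


tags = parse_wheel_filename(
    "tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-win_amd64.whl"
)
print(tags.python_tag, tags.platform_tag)
```

Here cp39 means CPython 3.9, and the platform tag tells pip which operating system and architecture the binary targets.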

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 5323fe9afb9eef58be071a79afa9f3d16387b4ee6cb92eae110990a3b90303c9
MD5 80122a330ddf999a1abe190010b4b6bf
BLAKE2b-256 ad5c53e779c8bdfa552336e4196afcb6661399365b4b5d741517e1cd89085684

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 914112a9c2565226b473af66a1585a46fdc7cd099b01551f68fce21fb519903d
MD5 d6ef2576295399658344a2dec829d037
BLAKE2b-256 bdcbabe815c9c797a105e2801cb03ad9244222d997853e6713f77a5b06ccab31

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 e1e89eee5f9e1dda3a3a55edd99d2b73a390889e7a4b01cfb0c74d3ae3bd5246
MD5 fd692e4b0ca6f92f20a8a1e43711e214
BLAKE2b-256 ebc892da4ba85883c5d3dd4be1cf503ccf42000b33aedaab7326c3b676b04dfa

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 b2726ad38edc262c54bde39914478d195ea80854f09a26207b60e4521289366b
MD5 c9127864f1219b7d5de5ad82a898bab9
BLAKE2b-256 0eafefedc185abd61e18d83dbfb1e896ef738a6746d56548127f96b4b78d5470

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 b3045c75a0701a7ff2d5d5d89ac2d52ed65001ee9aa5e930fa46dbc733a4094e
MD5 75ffc0c914543783a0a27ad5ef320f6c
BLAKE2b-256 f9b10bec2faba08c0ad74acda4ca9a993b8f5778f25d50d6d0177f93f542b93f

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 cb6c8442942945628162e880dc2f9f0d22b5aecc37f34bc69834db52cd85719d
MD5 99577e8864f0eeb80babd49600a1dfc2
BLAKE2b-256 6b06a0ec2d29759977a4d66278713eaef71d4eb1262c74cc03b5314338ab5a27

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 57010877623dad5c5b9ccf7729aa298bbf9f939fd2875609628b7c9b6f6b8feb
MD5 6514d2606403318158f6ed9a93b4544e
BLAKE2b-256 58fa5486760d7345b3d143e93245f446c0b82553d5a4f786a7123b51072cd057

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 1d6ab29582db2cdae91ed6a14006d0d177ba82728f682311f75fc29b5242be7d
MD5 b7073a162042985e655b145e4387ca9d
BLAKE2b-256 cdeef447d69d04a679283eac18c7dcc5f6a05d9248b2e66e6b8f9112c73125bf

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 1b475c9018b62ef24c3831be6c9ed5b2393a6261b2031e203ca3be274b8037c1
MD5 d1aa2ea07dc3d7643de3082280729a35
BLAKE2b-256 cbea63de3258b6e5cdbdffcf5288bfcb4c4270d9edd9ae701ab567f0c93ba549

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 965fb5c17d37ad09b09bf18f044fe68fe39fde72edcf0e4700b2cba73f78d5b9
MD5 69947d5648ee0290b36c038b2ac69f97
BLAKE2b-256 1d1d93384da845360593e0d4614f379065676de9ce9d39d788768e0afcd2dcdf

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9dd79966fbe1aec046c7a1587bd71b227869bba9cbaeea481842342d80ba9b70
MD5 0e4fc45274fd4c7ab95c263fe80a3c93
BLAKE2b-256 d25b348953ccf5210e97f3df82f2603500294ebfc6d860071c1aced46cfe9c70

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210812083251-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2c2590c031d261ef676a4eeade9c975a9c999a68ce17f9357c299e9f2d636da3
MD5 a941271e4df10cade10f026c3d66e2e0
BLAKE2b-256 a28dad6a61d26324fb361e511518f4cc6adfd6290002d07171fabf0a259d08ba

See more details on using hashes here.
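
The SHA256 digests listed above can be used to check a downloaded wheel before installing it. The following is a minimal sketch using only the standard library; the helper names sha256_of and verify_sha256 are hypothetical, and the demo hashes a small temporary file rather than a real wheel.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file; for a real check, pass the wheel path and the
# SHA256 value listed for that exact file name.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
demo_path = tmp.name
print(verify_sha256(demo_path, hashlib.sha256(b"example bytes").hexdigest()))
os.remove(demo_path)
```

For the first file listed, the call would be verify_sha256 with the path to tensorflow_io_nightly-0.20.0.dev20210812083251-cp39-cp39-win_amd64.whl and the digest 5323fe9afb9eef58be071a79afa9f3d16387b4ee6cb92eae110990a3b90303c9.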
