TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Because tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
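As a rough illustration of what automatic detection can look like, the sketch below checks for the gzip magic bytes before parsing the IDX image header. It is not tensorflow-io's implementation: the helper names maybe_decompress and read_idx_images are hypothetical, and the tiny in-memory file stands in for a real download.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)
IDX3_MAGIC = 2051         # magic number of MNIST IDX3 image files (0x00000803)


def maybe_decompress(raw: bytes) -> bytes:
    """Return the decompressed payload if raw is gzip data, else raw unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_images(raw: bytes):
    """Parse an IDX3 image file into (count, rows, cols, pixel_bytes)."""
    data = maybe_decompress(raw)
    # Header: four big-endian uint32 values (magic, count, rows, cols).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IDX3_MAGIC:
        raise ValueError(f"not an IDX3 image file (magic={magic})")
    pixels = data[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("pixel payload does not match header dimensions")
    return count, rows, cols, pixels


# Synthetic IDX3 file for demonstration: 2 images of 2x2 pixels.
plain = struct.pack(">IIII", IDX3_MAGIC, 2, 2, 2) + bytes(range(8))
compressed = gzip.compress(plain)

# The plain and gzip-compressed forms parse to the same result.
assert read_idx_images(plain) == read_idx_images(compressed)
```

The caller never has to say whether the input is compressed, which mirrors the behavior described in the note above.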

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
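
The table can also be consulted programmatically. The following is a minimal sketch, assuming the rows above are transcribed by hand; COMPATIBILITY and newest_tfio_for are hypothetical names and not part of tensorflow-io.

```python
# (TensorFlow I/O version, TensorFlow major.minor series), copied from the table.
COMPATIBILITY = [
    ("0.18.0", "2.5"),
    ("0.17.1", "2.4"), ("0.17.0", "2.4"),
    ("0.16.0", "2.3"), ("0.15.0", "2.3"),
    ("0.14.0", "2.2"), ("0.13.0", "2.2"),
    ("0.12.0", "2.1"), ("0.11.0", "2.1"),
    ("0.10.0", "2.0"), ("0.9.1", "2.0"), ("0.9.0", "2.0"),
    ("0.8.1", "1.15"), ("0.8.0", "1.15"),
    ("0.7.2", "1.14"), ("0.7.1", "1.14"), ("0.7.0", "1.14"),
    ("0.6.0", "1.13"), ("0.5.0", "1.13"), ("0.4.0", "1.13"),
    # The table lists exactly 1.12.0 for these; treated here as the 1.12 series.
    ("0.3.0", "1.12"), ("0.2.0", "1.12"), ("0.1.0", "1.12"),
]


def _version_key(version: str):
    """Turn '0.17.1' into (0, 17, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_tfio_for(tf_version: str):
    """Return the newest listed TensorFlow I/O release for a TensorFlow version.

    Only the major.minor part of tf_version is used. Returns None when the
    series does not appear in the table.
    """
    series = ".".join(tf_version.split(".")[:2])
    candidates = [tfio for tfio, tf in COMPATIBILITY if tf == series]
    return max(candidates, key=_version_key) if candidates else None


print(newest_tfio_for("2.4.1"))   # 0.17.1
print(newest_tfio_for("1.14.0"))  # 0.7.2
print(newest_tfio_for("2.6.0"))   # None
```

The returned version can then be pinned at install time, for example pip install tensorflow-io==0.17.1.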

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
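
To check which Python versions a set of built wheels covers, the tags can be read from the file names. The sketch below assumes the standard wheel naming scheme (name-version-python tag-abi tag-platform tag.whl); wheel_tags, python_versions, and the example file names are hypothetical and only illustrate the idea.

```python
from pathlib import Path


def wheel_tags(filename: str):
    """Split a wheel file name into (python_tag, abi_tag, platform_tag)."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    # name, version, optional build tag, then the three compatibility tags.
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return tuple(parts[-3:])


def python_versions(filenames):
    """Return the sorted, de-duplicated Python tags found among wheel names."""
    return sorted({wheel_tags(f)[0] for f in filenames})


# Hypothetical contents of a wheelhouse directory after a build.
EXAMPLE_WHEELS = [
    "wheelhouse/tensorflow_io-0.18.0-cp35-cp35m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.18.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.18.0-cp37-cp37m-manylinux2010_x86_64.whl",
]

print(python_versions(EXAMPLE_WHEELS))  # ['cp35', 'cp36', 'cp37']
```

Against a real build, the same function could be fed the names from Path("wheelhouse").glob("*.whl").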

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


