TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
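As a rough illustration of what detecting and uncompressing a gzip file can involve, the sketch below checks for the gzip magic bytes and then reads the IDX header used by the MNIST files. It is not tensorflow-io's implementation: the helper names `maybe_gunzip` and `read_idx_header` are hypothetical, and the sample bytes are built in memory rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header: two zero bytes, a dtype code, the number of
    dimensions, then one big-endian uint32 per dimension."""
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Build a tiny fake "images" file: dtype 0x08 (uint8), shape (2, 28, 28).
raw = (
    struct.pack(">HBB", 0, 0x08, 3)
    + struct.pack(">III", 2, 28, 28)
    + bytes(2 * 28 * 28)
)
compressed = gzip.compress(raw)

# The same header is recovered whether or not the input was compressed.
header_plain = read_idx_header(maybe_gunzip(raw))
header_gzip = read_idx_header(maybe_gunzip(compressed))
print(header_plain, header_gzip)
```

Checking the leading bytes rather than the file extension means the same code path handles both compressed and uncompressed inputs.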

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
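The table can also be consulted programmatically. The following is a minimal sketch, not part of tensorflow-io: the dictionary is copied from the table above, and the helper `compatible_tfio_versions` is a hypothetical name. An entry ending in `.x` is treated as matching any patch release of that series, while an entry such as `1.12.0` is matched exactly.

```python
# TensorFlow I/O version -> TensorFlow compatibility, copied from the table.
TFIO_TO_TF = {
    "0.17.0": "2.4.x",
    "0.16.0": "2.3.x",
    "0.15.0": "2.3.x",
    "0.14.0": "2.2.x",
    "0.13.0": "2.2.x",
    "0.12.0": "2.1.x",
    "0.11.0": "2.1.x",
    "0.10.0": "2.0.x",
    "0.9.1": "2.0.x",
    "0.9.0": "2.0.x",
    "0.8.1": "1.15.x",
    "0.8.0": "1.15.x",
    "0.7.2": "1.14.x",
    "0.7.1": "1.14.x",
    "0.7.0": "1.14.x",
    "0.6.0": "1.13.x",
    "0.5.0": "1.13.x",
    "0.4.0": "1.13.x",
    "0.3.0": "1.12.0",
    "0.2.0": "1.12.0",
    "0.1.0": "1.12.0",
}


def _matches(pattern: str, tf_version: str) -> bool:
    """'2.4.x' matches any 2.4 patch release; other patterns match exactly."""
    if pattern.endswith(".x"):
        return tf_version.startswith(pattern[:-1])  # e.g. prefix "2.4."
    return tf_version == pattern


def compatible_tfio_versions(tf_version: str) -> list:
    """Return the TensorFlow I/O versions listed for a TensorFlow version,
    newest first (the table's own order)."""
    return [tfio for tfio, tf in TFIO_TO_TF.items() if _matches(tf, tf_version)]


print(compatible_tfio_versions("2.3.1"))
print(compatible_tfio_versions("1.12.0"))
```

An empty list means the TensorFlow version is not covered by the table, in which case the list of releases is the place to check.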

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R $(id -nu):$(id -ng) .
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
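To check which Python versions the wheels in a directory target, the filename itself can be inspected, since wheel filenames encode the Python, ABI, and platform tags. The sketch below is an illustration only: `parse_wheel_filename` is a hypothetical helper, and the example filename follows the naming of this project's nightly wheels.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename of the form
    {name}-{version}[-{build}]-{python}-{abi}-{platform}.whl into its parts."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when the optional build tag is present
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


tags = parse_wheel_filename(
    "tensorflow_io_nightly-0.18.0.dev20210318195435-cp37-cp37m-manylinux2010_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

Applied to every file matching `wheelhouse/*.whl`, this gives a quick view of which interpreter versions and platforms were actually built.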

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against these systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0


Built Distributions

tensorflow_io_nightly-0.18.0.dev20210318195435-cp39-cp39-win_amd64.whl (21.3 MB): CPython 3.9, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp39-cp39-manylinux2010_x86_64.whl (25.4 MB): CPython 3.9, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp39-cp39-macosx_10_14_x86_64.whl (21.5 MB): CPython 3.9, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp38-cp38-win_amd64.whl (21.3 MB): CPython 3.8, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp38-cp38-manylinux2010_x86_64.whl (25.4 MB): CPython 3.8, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp38-cp38-macosx_10_14_x86_64.whl (21.5 MB): CPython 3.8, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp37-cp37m-win_amd64.whl (21.3 MB): CPython 3.7m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp37-cp37m-manylinux2010_x86_64.whl (25.4 MB): CPython 3.7m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp37-cp37m-macosx_10_14_x86_64.whl (21.5 MB): CPython 3.7m, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp36-cp36m-win_amd64.whl (21.3 MB): CPython 3.6m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp36-cp36m-manylinux2010_x86_64.whl (25.4 MB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210318195435-cp36-cp36m-macosx_10_14_x86_64.whl (21.5 MB): CPython 3.6m, macOS 10.14+ x86-64

