TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
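
To make the idea of automatic decompression concrete, here is a minimal sketch of how a reader could tell a gzip-compressed file from an uncompressed one by checking the two-byte gzip magic number. It uses only the Python standard library. `maybe_decompress` is a hypothetical helper written for this illustration and is not part of tensorflow-io, whose actual detection logic may differ.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_decompress(payload: bytes) -> bytes:
    """Return the decompressed bytes if payload is gzip data, else payload unchanged."""
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload


raw = b"\x00\x00\x08\x03 stand-in bytes"  # placeholder content, not real MNIST data
packed = gzip.compress(raw)

assert maybe_decompress(packed) == raw  # compressed input is unpacked
assert maybe_decompress(raw) == raw     # uncompressed input passes through
```

The caller never has to say whether the input is compressed, which mirrors how the `.gz` URLs are handed to the API call unchanged.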

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
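
After installing, it can be handy to confirm which of the two distributions is present. The following is a small sketch that relies only on the standard library's `importlib.metadata` (Python 3.8 or later). `installed_version` is a hypothetical helper name chosen for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on both the stable and the nightly distribution names used above.
for name in ("tensorflow-io", "tensorflow-io-nightly"):
    print(name, "->", installed_version(name) or "not installed")
```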

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
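
As an illustration of how the compatibility table might be used in a setup script, the sketch below maps a TensorFlow version to a pinned pip requirement. The dictionary holds the highest-numbered tensorflow-io release listed for each TensorFlow series in the table. The names `LATEST_TFIO_FOR_TF` and `matching_tfio_requirement` are hypothetical and exist only in this example; they are not part of tensorflow-io.

```python
# Highest-numbered tensorflow-io release per TensorFlow series, copied from the table.
LATEST_TFIO_FOR_TF = {
    "2.5": "0.18.0",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists exactly 1.12.0 for this row, not 1.12.x
}


def matching_tfio_requirement(tf_version: str) -> str:
    """Map a TensorFlow version such as '2.4.1' to a pinned pip requirement."""
    series = ".".join(tf_version.split(".")[:2])  # '2.4.1' -> '2.4'
    if series not in LATEST_TFIO_FOR_TF:
        raise ValueError(f"no tensorflow-io release listed for TensorFlow {tf_version}")
    return f"tensorflow-io=={LATEST_TFIO_FOR_TF[series]}"


print(matching_tfio_requirement("2.4.1"))  # tensorflow-io==0.17.1
```

The resulting string can be passed to pip as is, for example `pip install tensorflow-io==0.17.1`.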

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

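#!/usr/bin/env bash

# List the built wheels, then run the repair step on each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of the files created by the container to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*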

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
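
To check whether a generated whl package matches the Python in the shell, the Python tag in the wheel file name can be compared with the running interpreter. The sketch below assumes CPython and the standard wheel naming scheme. `interpreter_tag` and `wheel_python_tag` are hypothetical helpers written for this example, not part of the build scripts.

```python
import sys


def interpreter_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. 'cp38' for Python 3.8."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def wheel_python_tag(filename: str) -> str:
    """Extract the Python tag from a wheel file name.

    Wheel names end in '-{python tag}-{abi tag}-{platform tag}.whl', so the
    Python tag is the third field from the right.
    """
    stem = filename[: -len(".whl")]
    return stem.split("-")[-3]


name = "tensorflow_io_nightly-0.18.0.dev20210518014747-cp38-cp38-macosx_10_14_x86_64.whl"
print(wheel_python_tag(name))  # cp38
print("matches this interpreter:", wheel_python_tag(name) == interpreter_tag())
```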

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We do our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✓ | | ✓ | |
| Apache Ignite | ✓ | | ✓ | |
| Prometheus | ✓ | | ✓ | |
| Google PubSub | | ✓ | ✓ | |
| Azure Storage | | ✓ | ✓ | |
| AWS Kinesis | | ✓ | ✓ | |
| Alibaba Cloud OSS | | | | ✓ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✓ | | ✓ | |
| MongoDB (experimental) | ✓ | | ✓ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0
