TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
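The following is a minimal sketch, using only the Python standard library, of what automatic gzip detection can look like. It is not tensorflow-io's implementation: `maybe_gunzip` and `parse_idx_header` are hypothetical helper names, and the input is synthetic data laid out in the IDX format that the MNIST files use.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Decompress data if it starts with the gzip magic number, else return it unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx_header(data: bytes):
    """Parse an IDX header (hypothetical helper, not a tensorflow-io API).

    Layout: two zero bytes, one dtype code, one dimension count,
    then one big-endian uint32 per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Synthetic stand-in for an image file: 2 images of 28x28 unsigned bytes (dtype code 0x08).
raw = struct.pack(">HBB3I", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(raw)

# Compressed and uncompressed inputs yield the same header.
print(parse_idx_header(maybe_gunzip(compressed)))
print(parse_idx_header(maybe_gunzip(raw)))
```

Checking the magic number rather than the `.gz` file extension means the same code path handles both compressed and uncompressed inputs, regardless of how the URL is named.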

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
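
As an illustration, the table above can be turned into a small lookup that picks a tensorflow-io version for an installed TensorFlow version. This is a sketch under stated assumptions: `LATEST_TFIO_FOR_TF` and `matching_tfio_version` are hypothetical names that are not part of the package, the mapping keeps only the newest listed release per TensorFlow minor series, and the 1.12 entry is listed in the table for TensorFlow 1.12.0 specifically.

```python
from typing import Optional

# Newest tensorflow-io release per TensorFlow minor series, copied from the table above.
LATEST_TFIO_FOR_TF = {
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists this for TensorFlow 1.12.0 only
}


def matching_tfio_version(tf_version: str) -> Optional[str]:
    """Return the tensorflow-io version listed for a TensorFlow version, or None if unlisted."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return LATEST_TFIO_FOR_TF.get(major_minor)


for tf_version in ("2.6.0", "2.4.3", "3.0.0"):
    print(tf_version, "->", matching_tfio_version(tf_version))
```

The resulting version can then be pinned at install time, for example `pip install tensorflow-io==0.21.0` alongside TensorFlow 2.6.x.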

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

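#!/usr/bin/env bash

# List the wheels that are about to be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written from inside the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*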

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
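
The Python version and platform that a whl package targets are encoded in its filename, which follows the standard wheel naming convention `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a filename into those fields so the contents of the wheelhouse directory can be checked; `WheelTags` and `parse_wheel_filename` are hypothetical names introduced for illustration, not part of the build tooling.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # Drop the optional build tag, which sits right after the version.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return WheelTags(*parts)


tags = parse_wheel_filename(
    "tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-macosx_10_14_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

A python tag of `cp37` indicates a wheel built for CPython 3.7, which is the version that python resolved to in the shell at build time.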

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-win_amd64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 a37d92133201785ab9c62afc696b58f953379549be1496f86b704dc4fded95e4
MD5 07409c1cb6cf6e6bdd53943850f19070
BLAKE2b-256 2446d827a22a13e2b1423e0f22251f88576b33ca4ef298ad92d24d37dc7087a5

See more details on using hashes here.
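
A downloaded file can be checked against the digests listed in these tables with Python's `hashlib`. The sketch below is one way to do it: `file_digests` is a hypothetical helper name, and the demonstration hashes a small temporary file rather than a real wheel, so the printed values are not expected to match any digest on this page.

```python
import hashlib
import os
import tempfile


def file_digests(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute the three digests listed per file, reading in chunks to bound memory use."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    blake2b_256 = hashlib.blake2b(digest_size=32)  # 32 bytes = 256 bits
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
            blake2b_256.update(chunk)
    return {
        "SHA256": sha256.hexdigest(),
        "MD5": md5.hexdigest(),
        "BLAKE2b-256": blake2b_256.hexdigest(),
    }


# Demonstration on a throwaway file standing in for a downloaded wheel.
with tempfile.NamedTemporaryFile(delete=False, suffix=".whl") as tmp:
    tmp.write(b"placeholder wheel contents")
    demo_path = tmp.name

digests = file_digests(demo_path)
os.remove(demo_path)

for algorithm, digest in digests.items():
    print(algorithm, digest)
```

To verify a real download, compare `file_digests(path)["SHA256"]` with the SHA256 value listed for that exact filename. A mismatch means the file is corrupted or is not the published one.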

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 fc17a666f5723069af038f51db483b43de2b835b65d7e884b11e212eef553ccf
MD5 9de423aee24f46d869eb82d0bae76164
BLAKE2b-256 0309881e92cc9f6456ad2f24464d5d14048876372976e27d10ee65eee29acbc2

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 87ee2e950f721e5d5213dd6e2ebd795450575a180638847844a84a2664f7dd54
MD5 71191bd2766630c2247fb0e156fe4c7b
BLAKE2b-256 00acd762a8fe792323ea21691bf56580a53880ab72f6025ac4e56635df928c93

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-win_amd64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 cad670b372adfee0729d769c83474bee2b0816540ecb4ea2b22028167036432d
MD5 0539002d848a1de2b5c2a111bd61e9d4
BLAKE2b-256 c1811d0be82d936864ab339fae66c61ce28194fe45e090bcca4acf4ca8e86045

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 01aed0452049b5a1b01dc75c69e06b11a33e7fae02d4d0a88fced050e76bbc20
MD5 b0a4e9c53270d1d9075235267ef7c809
BLAKE2b-256 964b9a4620cf18c437dd6590705d0337b3d5746cff255e30836853bea99dea5d

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 676cf643f7dac7745dbbf8ba900308e20365ec7cfd357ae215c468839d6f7b12
MD5 ddb5c65ec3c9070b8d6aff158d2ebb02
BLAKE2b-256 46d3abaac17308041bdc5e498d49f66a86eb4a890fc18f7cc94cb13981de5689

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 a98d996810afc29715a16c5dbc053fabcfe4fef15b2fb2ea23ea72dac69f9423
MD5 c634e902d911c7d74dbd1a8776fbaa8c
BLAKE2b-256 2ef133c15a56af1071263e3fc0f6accaf689d85a46cf85b40a6c1455ae7a5ea7

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 94ddbd0317c85fb77f0adca33a43cdff11df7bddd05346ae4f29dc1bebb3f35a
MD5 b8f4aad89a927ea24c67b3ac634bb97d
BLAKE2b-256 26d24e2a66f9fa54230ecc7175df97aecce37dfa22e42879dc097483d562c480

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 a0a8788fc49ab9dc7f5bd73aa91a13b07f02192a72066ec2762f127fb2afdbd0
MD5 55f9d3f848eb81aa3f99a726a1f1deb0
BLAKE2b-256 8d2fbd2ce91d153fa4f13e4fdcfbe0ff2f5ac30dd47ff6a6bab8cf23de8966a4

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 bd5b176073926f56297d0f916bb06f476d341aa06136bd4e4976985ab90a97d1
MD5 dc4fe24c683c9a06c53c777a418ca6bf
BLAKE2b-256 b71435c3a5e9e2e2f300be7cbb7cd476232bbbdcb617796e6d6ba70b9b45e27d

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 f1c6ad62f34d35c9fc7778556c473b6bad1ddf16559ec0d15c578a872a4e5e11
MD5 ee5673f4d02082d3bdff2a6ad1416448
BLAKE2b-256 51213d6716b2ec11b75b888f45561724257bf16dbbb92478d0b305b7edd0ab87

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.21.0.dev20211021134553-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 8fdc3a7c2a552ca41852ba4a4a804c659e6063f1e355785c9c38cbe76a711a7f
MD5 d396e0332b69158abd5611800c6c4d88
BLAKE2b-256 0d534171c65b4d9310a382f8741a0aa8fd5ade85fe1a8aa7c785605601b00dc7
