TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
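As a rough illustration of what automatic detection can look like, the sketch below uses only the Python standard library. It checks for the two-byte gzip signature and then reads the IDX header that the MNIST files use. This is not the tensorflow-io implementation; the helper names maybe_decompress and read_idx_header are hypothetical, and the input is a small synthetic file rather than the real dataset.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return the decompressed payload if raw is gzip data, else raw unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes) -> tuple:
    """Parse an IDX header (hypothetical helper) and return the dimension sizes."""
    data = maybe_decompress(raw)
    # Bytes 0-1 are zero, byte 2 is the element type code,
    # byte 3 is the number of dimensions.
    zero, _dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    return struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])


# A synthetic IDX image file: 2 images of 28x28 uint8 pixels (type code 0x08).
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
plain = header + bytes(2 * 28 * 28)
compressed = gzip.compress(plain)

# Both calls report the same dimensions, whether or not the input is compressed.
print(read_idx_header(plain))
print(read_idx_header(compressed))
```

Checking the signature bytes instead of the file extension means the same code path handles both the .gz URLs and already-uncompressed files.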

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.28.0 | 2.11.x | Nov 21, 2022 |
| 0.27.0 | 2.10.x | Sep 08, 2022 |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

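#!/usr/bin/env bash

# List the wheels that are about to be processed.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the working directory back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*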

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
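
To check which Python versions the generated packages target, you can read the compatibility tags from the wheel filenames. The sketch below relies on the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper wheel_tags and the example filenames are hypothetical and only illustrate the idea.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its name, version, and compatibility tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = name[: -len(".whl")].split("-")
    # Five parts without a build tag, six with one; the last three are the tags.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Hypothetical contents of a wheelhouse directory.
wheels = [
    "wheelhouse/tensorflow_io-0.28.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.28.0-cp37-cp37m-manylinux2010_x86_64.whl",
]
print(sorted({wheel_tags(w)["python"] for w in wheels}))  # ['cp36', 'cp37']
```

For a real directory, the list could be built with Path("wheelhouse").glob("*.whl") instead of the hard-coded names.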

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

References for emulators:

Community

Additional Information

License

Apache License 2.0
