

Project description




TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data-processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Prepare batches of the data, just like with any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
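
The same pattern extends to evaluation. Below is a brief sketch, assuming the standard MNIST test files (t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz) are available at the same location:

# Build the test split the same way and evaluate the trained model.
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

model.evaluate(d_test)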

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save the datasets to a local directory.

NOTE: Since tensorflow-io can detect and decompress the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
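
Because importing tensorflow_io registers the HTTP/HTTPS file system with TensorFlow, remote files can also be read directly through the regular tf.io APIs. The snippet below is a minimal sketch of that idea, reusing one of the MNIST URLs above (tf.io.read_file returns the raw, still-compressed bytes):

import tensorflow as tf
import tensorflow_io as tfio  # noqa: F401 -- the import registers the extra file systems

# Read a remote file over HTTPS as if it were a local path.
url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
raw = tf.io.read_file(url)
print(raw.dtype)  # a scalar tf.string tensor holding the gzip payload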

Please check the official documentation for more detailed and interesting usages of the package.
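
As one more illustration of the kinds of I/O the package covers, the sketch below opens an audio file lazily with tfio.audio.AudioIOTensor (the file name sample.wav is a placeholder for any local WAV file):

import tensorflow_io as tfio

# Open the audio file lazily; samples are only read when sliced or converted.
audio = tfio.audio.AudioIOTensor("sample.wav")
print(audio.shape, audio.rate, audio.dtype)

# Materialize the first 1600 samples as a regular tensor.
clip = audio[0:1600]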

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
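
For example, to pull in the GPU build of TensorFlow alongside tensorflow-io (the quotes guard against shell globbing in shells such as zsh):

$ pip install "tensorflow-io[tensorflow-gpu]"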

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly
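
For example, to quickly verify the package inside the stable container (a hypothetical invocation, assuming the image exposes a python3 interpreter on its PATH):

$ docker run -it --rm tfsigio/tfio:latest python3 -c "import tensorflow_io as tfio; print(tfio.IODataset)"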

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.28.0 | 2.11.x | Nov 21, 2022 |
| 0.27.0 | 2.10.x | Sep 08, 2022 |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
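
For example, matching releases can be pinned explicitly at install time; the version numbers below are simply the most recent pairing in the table above, so substitute the row you need:

$ pip install tensorflow==2.11.* tensorflow-io==0.28.0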

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it easy to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contributing guidelines in the GitHub repository.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward, but if the system has Docker installed, the following script will automatically build manylinux2010-compatible whl packages:

#!/usr/bin/env bash

# List the wheels produced by the build.
ls dist/*

# Repair each wheel inside a manylinux2010 container; repaired wheels land in wheelhouse/.
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done

# Files created inside the container are owned by root; reclaim ownership.
sudo chown -R $(id -nu):$(id -ng) .

ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches that interpreter's version. If you want to build a whl package for a specific Python version, you have to make that version the python available in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
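
One way to do that (a sketch, not the project's prescribed setup) is to run the build from a virtual environment created with the desired interpreter, so that python on the PATH resolves to that version:

$ python3.8 -m venv build-env      # assumes a python3.8 interpreter is installed
$ source build-env/bin/activate    # 'python' now resolves to Python 3.8
$ python --version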

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

|  | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Apache Ignite | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Prometheus | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Google PubSub |  | :heavy_check_mark: | :heavy_check_mark: |  |
| Azure Storage |  | :heavy_check_mark: | :heavy_check_mark: |  |
| AWS Kinesis |  | :heavy_check_mark: | :heavy_check_mark: |  |
| Alibaba Cloud OSS |  |  |  | :heavy_check_mark: |
| Google BigTable/BigQuery |  |  | to be added |  |
| Elasticsearch (experimental) | :heavy_check_mark: |  | :heavy_check_mark: |  |
| MongoDB (experimental) | :heavy_check_mark: |  | :heavy_check_mark: |  |

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 959ab922046df5c0de02a97e768edd75fb0fcd9481c25fdfb0b17bfa467b5d7a
MD5 88414f520ca6157573a354fe677acac2
BLAKE2b-256 a7c8c525983b2b0edf884c627ee2659f79bd61fee14b3b8d128c58d975d75f16

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp311-cp311-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 33cad38a28b22c1dcb402fd36e3ab220033a6ac728195a6a38cd61d9d11fa2e3
MD5 f9cbeb9fbfdccfc261ca4e98b4ccd268
BLAKE2b-256 416f03df8245961258bb2429d29f9c28e4bc293b54ffe6d660326e3fa0e9e015

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp311-cp311-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 1dc0db0168d75307d46c1ed7620016f860ebe349fa64beb687144ba33cc5b5cc
MD5 b6acac03d9672f3aca75d7d77a1371be
BLAKE2b-256 a8597189114a35c90e2253e83900ed87eb89163a26fb74d8e2452e0abb0dc732

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 140e64898a1beac70d050b8008639009256851d7eaa1b9b403f2d544160223af
MD5 67887c7c12270ab5a37796f128cd86d8
BLAKE2b-256 41a831d4b027225e32fb95d37d4c9ebef4ee6343e408cb65355fc8e5b49beb8e

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 224c57abfea50cfe0a2a7f91ab0908d73321023c139b7fe5b312a720c7db03dc
MD5 86b7c0039132b5454c17151e7b4c90aa
BLAKE2b-256 cbde97f9c226632aaf5c452d1a8b1884f71ec384885cbd815c9b1e7f429b14ee

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 94633d0517a524dfd95ae30e2b3a81ab62ad604291b397f6492f150089b883b0
MD5 117fa34ab3ca71efd4396bfc7b591d83
BLAKE2b-256 73418de8a052ef0c260fadb7ae8e9187ff9b46cf5e55a5e05ae2220294cb23f2

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 6600caa1b3c9702c04d3da6189cef7a4327f8574c6e7a74691384487275cd406
MD5 e3c706a59b6634e7d75c70fcce3b1040
BLAKE2b-256 6348ae566fde643b5a10ec01b98e4b30f37d36488d86583a8bf29200e97df60d

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 8c4241d179b18de412bd7a34243f206e302d8a1d599412e103faad1a846291a9
MD5 050d918622da9533e98f2c77d34125df
BLAKE2b-256 81238a53798539dc066a9438ff8d6d0d826cafa76a0601b8300dee3ec521511c

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 cb50bfa063c82a93c483e892e0e64cca5a6bf17543c53992e601ac8711b08133
MD5 2b775fa9be8a7bdc274640604f15dc22
BLAKE2b-256 9156ae885c011876038284070e152e99030c6eb94f359fbea2ff4e85f7214593

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 a5b52e87614ede73469cfb164725bd46e216264f6524c7a3c59d096ed3e58fbe
MD5 f6e60d20dc11ad6ea8d3166e5d3be08d
BLAKE2b-256 c5b90dc0af045ed569932ec339b297c9f7f352c3d48e38ad5f288a23537e73fb

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 e0f54c6a6d1ee0e3a4a417fc79d679c45a0a80ec45b7580d5eaf330a9f09d53e
MD5 009f2ebb404e7bd336c2de6be213bedc
BLAKE2b-256 fd52bbd6c66830ecc56a06dfedec08a580d38ceb7fbdabb2801a30eb1e16f9ce

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 da00201d6106586552aba30fb6b6af5162667e57b1dc1c2ddd0786ac0aef18bd
MD5 94da5f2cd9eab7a9b97c3d0ce5df5327
BLAKE2b-256 59c549f97b44850f97a9b47811f3be196f87232c07b0657fda57bc3c214ef2d8

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 04c0722a86937dd133089ec19adc7fde22ad7b4268a3e6c2640924881901e29e
MD5 23c7de83f980e17a1da1046e033cd8f2
BLAKE2b-256 cf8bfe4c50bed020a1b320a7aa81068da14663a1207a91d5ea2864ffd807b034

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 58b52d6c1323e11526f0ae47d1324360fb1637d404b521763416fa69151a772b
MD5 75428d2c423f17096bf37992ef703b99
BLAKE2b-256 cbb042673f05fd403f4c3601ff6856871528ff1a480b34507545bd0f9bec6958

Hashes for tensorflow_io_gcs_filesystem_nightly-0.28.0.dev20221203124204-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 87966a213cb8a00ab9b854e3b43e65c61d040da59c8b6295e2efee6db721a589
MD5 53a4332aa378926421a8a7f57b3a634a
BLAKE2b-256 90048169f6aee79c97ca0fbb68081a77fe567b67614d4a98de0698e1b2fd2fab
