TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
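
To make the note above more concrete, here is a minimal sketch of how gzip content can be recognized from its two leading magic bytes, and how the IDX header used by the MNIST files can be read afterwards. It uses only the Python standard library and is not tensorflow-io's actual implementation. The helper names maybe_decompress and parse_idx_header are hypothetical, and the payload is synthetic.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Gunzip raw if it starts with the gzip magic bytes, else return it unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx_header(data: bytes):
    """Return (dtype_code, dims) from an IDX header.

    The header is two zero bytes, a one-byte data type code, a one-byte
    dimension count, then one big-endian uint32 per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX payload")
    dims = struct.unpack(">%dI" % ndim, data[4:4 + 4 * ndim])
    return dtype_code, dims


# Synthetic stand-in for an images file: 2 images of 28x28 uint8 pixels.
payload = (
    struct.pack(">HBB", 0, 0x08, 3)
    + struct.pack(">3I", 2, 28, 28)
    + bytes(2 * 28 * 28)
)
compressed = gzip.compress(payload)

# Both the compressed and the plain payload yield the same header.
print(parse_idx_header(maybe_decompress(compressed)))  # (8, (2, 28, 28))
print(parse_idx_header(maybe_decompress(payload)))     # (8, (2, 28, 28))
```

Checking the leading bytes rather than the file extension means the same code path handles both compressed and uncompressed inputs.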

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
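
The table can also be checked programmatically. The following is a minimal sketch, assuming only the standard library, that maps a tensorflow-io version to the TensorFlow series listed for it and compares that against an installed TensorFlow version. The dictionary copies only a few rows of the table, and the function names are hypothetical. It looks up the distribution named tensorflow, so variants such as tensorflow-cpu would need their own lookup.

```python
from importlib import metadata

# A few rows copied from the table: tensorflow-io minor series -> TensorFlow series.
TFIO_TO_TF = {
    "0.26": "2.9",
    "0.25": "2.8",
    "0.24": "2.8",
    "0.23": "2.7",
    "0.22": "2.7",
    "0.21": "2.6",
    "0.20": "2.6",
}


def expected_tf_series(tfio_version):
    """Map a tensorflow-io version such as '0.23.1' to a TensorFlow series such as '2.7'."""
    series = ".".join(tfio_version.split(".")[:2])
    return TFIO_TO_TF.get(series)


def is_compatible(tfio_version, tf_version):
    """Return True if tf_version falls in the series listed for tfio_version."""
    expected = expected_tf_series(tfio_version)
    return expected is not None and tf_version.startswith(expected + ".")


def check_installed():
    """Check the installed packages; return None if either one is missing."""
    try:
        tfio_version = metadata.version("tensorflow-io")
        tf_version = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None
    return is_compatible(tfio_version, tf_version)


print(is_compatible("0.26.0", "2.9.1"))  # True
print(is_compatible("0.26.0", "2.8.0"))  # False
```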

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
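
To confirm which Python versions the generated packages cover, the tags embedded in each wheel filename can be inspected. The sketch below assumes the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag, separated by hyphens). The function names and the example filename are hypothetical and serve only as illustration.

```python
from pathlib import Path


def parse_wheel_name(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError("unexpected wheel name: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def python_tags(directory="wheelhouse"):
    """Return the sorted Python tags found among the wheels in directory."""
    wheels = Path(directory).glob("*.whl")
    return sorted({parse_wheel_name(p.name)["python"] for p in wheels})


# Hypothetical filename, used only for illustration.
example = "tensorflow_io-0.26.0-cp37-cp37m-manylinux2010_x86_64.whl"
print(parse_wheel_name(example)["python"])    # cp37
print(parse_wheel_name(example)["platform"])  # manylinux2010_x86_64
```

Calling python_tags() after the build lists the interpreter tags present in wheelhouse, which makes a missing version easy to spot.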

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
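
Before building on macOS, it can help to check which interpreter python resolves to and which wheel tag it would produce. This is a small sketch using only the standard library; cpython_tag is a hypothetical helper and assumes a CPython interpreter. Run it with python itself so that it reports on the interpreter the build script will use.

```python
import shutil
import sys


def cpython_tag(version_info=None):
    """Return the CPython wheel tag (for example 'cp38') for a version tuple."""
    major, minor = (version_info or sys.version_info)[:2]
    return "cp%d%d" % (major, minor)


# Where "python" resolves on PATH, and what this interpreter would build.
print("python on PATH:", shutil.which("python"))
print("this interpreter:", sys.executable, cpython_tag())
```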

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |
TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✓ | | ✓ | |
| Apache Ignite | ✓ | | ✓ | |
| Prometheus | ✓ | | ✓ | |
| Google PubSub | | ✓ | ✓ | |
| Azure Storage | | ✓ | ✓ | |
| AWS Kinesis | | ✓ | ✓ | |
| Alibaba Cloud OSS | | | | ✓ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✓ | | ✓ | |
| MongoDB (experimental) | ✓ | | ✓ | |
References for emulators:

Community

Additional Information

License

Apache License 2.0
