TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
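
The idea of detecting compression automatically can be pictured with a small standard-library sketch. This is not how tensorflow-io implements it; it only assumes that gzip data starts with the bytes 0x1f 0x8b and that the MNIST files use the IDX header layout. The helper names maybe_decompress and read_idx_shape are hypothetical, and the data is synthetic rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_shape(data: bytes) -> tuple:
    """Parse the IDX header used by the MNIST files and return the array shape."""
    raw = maybe_decompress(data)
    # Header: two zero bytes, one dtype byte (0x08 = uint8), one byte for the number of dims.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not a uint8 IDX file")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    return struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])


# Build a tiny synthetic IDX "image" file (2 images of 3x3 pixels) for demonstration.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 3, 3)
payload = header + bytes(2 * 3 * 3)

print(read_idx_shape(payload))                 # plain file
print(read_idx_shape(gzip.compress(payload)))  # gzip-compressed file
```

Both calls print (2, 3, 3), since the compressed and uncompressed inputs decode to the same header.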

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow I/O, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
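
After installing, it can be useful to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is hypothetical and not part of tensorflow-io.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report both packages so the pair can be compared against the compatibility table.
for name in ("tensorflow", "tensorflow-io"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```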

Docker Images

In addition to the pip packages, Docker images are available for getting started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.31.0 2.11.x Feb 25, 2023
0.30.0 2.11.x Jan 20, 2023
0.29.0 2.11.x Dec 18, 2022
0.28.0 2.11.x Nov 21, 2022
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
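
As an illustration of how the table might be used in a script, here is a small sketch that checks a TensorFlow version against the series listed for a TensorFlow I/O release. Only a few rows are copied from the table, and the names COMPATIBILITY and is_compatible are hypothetical; nothing here is provided by the package itself.

```python
# A few rows copied from the table above: TensorFlow I/O version -> TensorFlow series.
COMPATIBILITY = {
    "0.31.0": "2.11",
    "0.27.0": "2.10",
    "0.26.0": "2.9",
    "0.25.0": "2.8",
    "0.24.0": "2.8",
    "0.23.1": "2.7",
}


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Check whether `tf_version` falls in the series listed for `tfio_version`."""
    series = COMPATIBILITY.get(tfio_version)
    if series is None:
        raise KeyError(f"{tfio_version} is not in the (partial) table")
    # Compare only major.minor, since the table lists series such as "2.11.x".
    return tf_version.split(".")[:2] == series.split(".")


print(is_compatible("0.31.0", "2.11.1"))  # True
print(is_compatible("0.27.0", "2.11.0"))  # False
```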

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

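#!/usr/bin/env bash

# List the wheels produced by the build.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the generated files back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*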

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
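
Because each generated whl package targets one Python version, it can help to check a wheel's tags against the interpreter in the shell. The sketch below relies on the standard wheel file naming convention, in which the last three dash-separated fields are the Python, ABI, and platform tags. The function names and the example file name are hypothetical, and compressed tag sets such as py2.py3 are not handled.

```python
import sys


def wheel_tags(filename: str) -> tuple:
    """Split a wheel file name into (python tag, ABI tag, platform tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file")
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = filename[: -len(".whl")].split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def matches_interpreter(filename: str, version=sys.version_info[:2]) -> bool:
    """Return True if the wheel's python tag matches the given CPython (major, minor)."""
    expected = f"cp{version[0]}{version[1]}"
    return wheel_tags(filename)[0] == expected


# Hypothetical file name, used only to demonstrate the parsing.
example = "example_pkg-1.0.0-cp37-cp37m-manylinux2010_x86_64.whl"
print(wheel_tags(example))
print(matches_interpreter(example, version=(3, 7)))
```

Called without the version argument, matches_interpreter compares against the interpreter that runs the script, which is the same python the build script would pick up.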

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python  Ubuntu 18.04  Ubuntu 20.04  macOS + osx9  Windows-2019
2.7     ✓             ✓             ✓             N/A
3.7     ✓             ✓             ✓             ✓
3.8     ✓             ✓             ✓             ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

System                        Live System  Emulator     CI Integration  Offline
Apache Kafka                  ✓            -            ✓               -
Apache Ignite                 ✓            -            ✓               -
Prometheus                    ✓            -            ✓               -
Google PubSub                 -            ✓            ✓               -
Azure Storage                 -            ✓            ✓               -
AWS Kinesis                   -            ✓            ✓               -
Alibaba Cloud OSS             -            -            -               ✓
Google BigTable/BigQuery      -            to be added  -               -
Elasticsearch (experimental)  ✓            -            ✓               -
MongoDB (experimental)        ✓            -            ✓               -
References for emulators:

Community

Additional Information

License

Apache License 2.0
