TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing aspect replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets in a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed files (gzip) to the API call as is.
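To make the "detect and uncompress if needed" idea concrete, here is a minimal standard-library sketch. It is not tensorflow-io's implementation; it only assumes that gzip streams start with the bytes 0x1f 0x8b and that MNIST image files use the IDX3 layout (magic number 2051 followed by count, rows, and columns). The helper names maybe_decompress and read_idx_images are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic bytes."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_images(raw: bytes):
    """Parse an IDX3 unsigned-byte image file into one flat pixel list per image."""
    data = maybe_decompress(raw)
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:  # 0x00000803: unsigned byte data, 3 dimensions
        raise ValueError(f"not an IDX3 ubyte file (magic={magic})")
    pixels = data[16:]
    size = rows * cols
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]


# Build a tiny file (two 2x2 images) in memory instead of downloading MNIST.
payload = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(range(8))
plain = read_idx_images(payload)
compressed = read_idx_images(gzip.compress(payload))
```

Both calls return the same images, so the caller never has to say whether the input was compressed.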

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.23.1                   2.7.x                      Dec 15, 2021
0.23.0                   2.7.x                      Dec 14, 2021
0.22.0                   2.7.x                      Nov 10, 2021
0.21.0                   2.6.x                      Sep 12, 2021
0.20.0                   2.6.x                      Aug 11, 2021
0.19.1                   2.5.x                      Jul 25, 2021
0.19.0                   2.5.x                      Jun 25, 2021
0.18.0                   2.5.x                      May 13, 2021
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
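
If you want to pick a matching release programmatically, the lookup can be sketched as below. This is only an illustration: the dictionary copies a subset of the rows from the table above, and the helper names version_key and latest_tfio_for are hypothetical, not part of tensorflow-io.

```python
# A subset of the compatibility table: tensorflow-io release -> TensorFlow series.
COMPATIBILITY = {
    "0.23.1": "2.7",
    "0.23.0": "2.7",
    "0.22.0": "2.7",
    "0.21.0": "2.6",
    "0.20.0": "2.6",
    "0.19.1": "2.5",
    "0.17.1": "2.4",
    "0.16.0": "2.3",
}


def version_key(version: str):
    """Turn "0.23.1" into (0, 23, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_tfio_for(tf_version: str):
    """Return the newest listed tensorflow-io release for a TensorFlow version.

    Only the major.minor part of tf_version is used, so "2.6.2" matches "2.6".
    Returns None when the series is not in the subset above.
    """
    series = ".".join(tf_version.split(".")[:2])
    candidates = [io for io, tf in COMPATIBILITY.items() if tf == series]
    return max(candidates, key=version_key) if candidates else None


print(latest_tfio_for("2.7.0"))  # prints 0.23.1
```

The result can then be pinned at install time, for example pip install tensorflow-io==0.23.1.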

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, then the following command will automatically build a manylinux2010 compatible whl package:

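#!/usr/bin/env bash

# List the wheels that will be repaired.
ls dist/*
for f in dist/*.whl; do
  # Run the auditwheel script inside the manylinux2010 container for each wheel.
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created inside the container are typically owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*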

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
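
To check which Python version a generated whl package targets, you can read the tags in its filename. The sketch below assumes the standard wheel naming scheme (name-version-pythontag-abitag-platformtag.whl) and uses a nightly wheel filename published for this package as sample input. The helper names wheel_tags and matches_interpreter are hypothetical, and only CPython tags such as cp38 are handled.

```python
import sys


def wheel_tags(filename: str):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are always the three tags.
    _, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return python_tag, abi_tag, platform_tag


def matches_interpreter(filename: str, version_info=sys.version_info) -> bool:
    """Return True when the wheel's CPython tag matches the given interpreter."""
    python_tag, _, _ = wheel_tags(filename)
    return python_tag == f"cp{version_info[0]}{version_info[1]}"


name = "tensorflow_io_nightly-0.24.0.dev20220127011010-cp38-cp38-macosx_10_14_x86_64.whl"
print(wheel_tags(name))  # prints ('cp38', 'cp38', 'macosx_10_14_x86_64')
```

Calling matches_interpreter(name) with no second argument compares the wheel against the python currently running, which is the version the build script would have used.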

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:


License

Apache License 2.0

