TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
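
How tensorflow-io performs this detection internally is not described here. As a rough illustration of the general idea only, the sketch below checks for the gzip magic number before parsing the header of an IDX file (the container format used by MNIST). The helper names maybe_decompress and read_idx_header and the in-memory sample data are hypothetical; only the Python standard library is used.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_decompress(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic number."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes):
    """Parse an IDX header and return (dtype_code, dims).

    Layout: two zero bytes, one dtype byte (0x08 = unsigned byte),
    one byte with the number of dimensions, then one big-endian
    32-bit size per dimension.
    """
    data = maybe_decompress(raw)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = list(struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim]))
    return dtype_code, dims


# Hypothetical stand-in for an MNIST image file: two blank 28x28 uint8 images.
plain = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
packed = gzip.compress(plain)

# The same header is recovered whether or not the input is compressed.
print(read_idx_header(plain))   # (8, [2, 28, 28])
print(read_idx_header(packed))  # (8, [2, 28, 28])
```

Checking the leading bytes rather than the file extension means the caller does not have to say whether the input is compressed, which matches the behavior described in the note above.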

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
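
If you want to check a pairing programmatically, the table can be encoded as a small lookup. The following is a minimal sketch, not part of tensorflow-io: the names COMPAT, series, compatible_tensorflow, and is_compatible are hypothetical, and only the TensorFlow 2.x rows of the table are included.

```python
# tensorflow-io release series -> compatible TensorFlow series.
# Transcribed from the TensorFlow 2.x rows of the compatibility table.
COMPAT = {
    "0.26": "2.9",
    "0.25": "2.8", "0.24": "2.8",
    "0.23": "2.7", "0.22": "2.7",
    "0.21": "2.6", "0.20": "2.6",
    "0.19": "2.5", "0.18": "2.5",
    "0.17": "2.4",
    "0.16": "2.3", "0.15": "2.3",
    "0.14": "2.2", "0.13": "2.2",
    "0.12": "2.1", "0.11": "2.1",
    "0.10": "2.0", "0.9": "2.0",
}


def series(version: str) -> str:
    """Reduce a version such as '0.26.0' to its 'major.minor' series."""
    return ".".join(version.split(".")[:2])


def compatible_tensorflow(tfio_version: str):
    """Return the TensorFlow series for a tensorflow-io version, or None if unlisted."""
    return COMPAT.get(series(tfio_version))


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """True if the table pairs this tensorflow-io version with this TensorFlow version."""
    expected = compatible_tensorflow(tfio_version)
    return expected is not None and series(tf_version) == expected


print(compatible_tensorflow("0.26.0"))   # 2.9
print(is_compatible("0.25.0", "2.8.1"))  # True
print(is_compatible("0.24.0", "2.9.0"))  # False
```

Versions missing from the mapping return None rather than a guess, so an unlisted release is reported as unknown instead of compatible.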

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

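#!/usr/bin/env bash

# List the wheels found in dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*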

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:
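
To check which Python version a generated whl package targets, you can read the tags in its filename, which follow the wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The sketch below is illustrative only and assumes CPython; the helper names parse_wheel_filename, current_cpython_tag, and matches_interpreter are hypothetical and not part of the build tooling.

```python
import sys


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_cpython_tag() -> str:
    """Tag of the running interpreter, e.g. 'cp38' for CPython 3.8 (assumes CPython)."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def matches_interpreter(filename: str, python_tag: str = None) -> bool:
    """True if the wheel's Python tag includes python_tag (default: running interpreter)."""
    tag = python_tag or current_cpython_tag()
    return tag in parse_wheel_filename(filename)["python"].split(".")


wheel = "tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-win_amd64.whl"
info = parse_wheel_filename(wheel)
print(info["python"], info["abi"], info["platform"])  # cp310 cp310 win_amd64
print(matches_interpreter(wheel, "cp310"))            # True
print(matches_interpreter(wheel, "cp39"))             # False
```

Calling matches_interpreter(wheel) without a tag compares against whichever python is running the script, which is the same interpreter the build script would pick up from the shell.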

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |
TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| System | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |
References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 a30e852fefad181d263609481f4bd732100fe744800a357ac6cc241a36b6150e
MD5 39332a3583f21412191d9c1ad093eb6f
BLAKE2b-256 7f7623b232762fdbe3f81e0a67db76550df2da51aeee9f13487938d8ace2c3d8

See more details on using hashes here.
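
As an illustration of how a listed digest can be used, the sketch below computes the SHA256 of a local file and compares it with an expected value. The helper names sha256_of and verify are hypothetical, and the temporary file merely stands in for a downloaded whl package; only the Python standard library is used.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """True if the file's SHA256 matches the expected digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo with a temporary stand-in file instead of a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example wheel contents")
    demo_path = tmp.name

expected = hashlib.sha256(b"example wheel contents").hexdigest()
print(verify(demo_path, expected))   # True
print(verify(demo_path, "0" * 64))   # False
os.remove(demo_path)
```

In practice you would pass the path of the downloaded file and the SHA256 value listed for that exact filename. pip can also enforce digests itself through its hash-checking mode (--require-hashes with --hash entries in a requirements file).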

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 8b15b790face814c1ab056bd451fe8b466c3d88e2ba83e4eb3c6e9e9d61c0b4c
MD5 35122f52fd5a4ed0113f5a5a96dae598
BLAKE2b-256 b5c83cf47a7829e3a6bb5b9b2cb6b332257583535659d23c5570a388d7299ec3

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 82ce4cb032279b78e75ce3775391862329317faf801ee9eca677fd975046ea07
MD5 544400132abf1b6f9ca60a9e15d101b2
BLAKE2b-256 17907acb410489f9003b934869bcbef79c04fa01498552796a2bd32f5137b94a

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 3b02ba41087b669066a1910819820c656fb743b00cee25e58454c8961ef42605
MD5 eb9fb120f92a5b67c67ad927bfc4a42f
BLAKE2b-256 4348a38d0eb7192aacfb8b7b4aa2dfd2b32fd5f8dfdeabaedf3c8e412c815e54

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 fa7c1ec93b175c8c12cc69548ee8776f6911fb76d9ba63065e811192d227476f
MD5 28b31fcdb3a30bfad6fb8c6cca39c677
BLAKE2b-256 10f1f36623fe774d84a4861309504a1e94a2c999234d4dca3b9ac4118465b24d

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 0c97118731fcc0a533b009bb4f38547c7e95ba579b18b8c9a607101a8a6d1ea5
MD5 7f450957b87f8480d36eff3a36a2a46e
BLAKE2b-256 1f0e6df0865d8fbbf5a9147053d6eecb0f2630fc2316a4bb0190900e0ecb627e

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 8ea0a120378b46f5bdf46885d2b76dc7a3814a4304080932ecd94d28e1796693
MD5 7909c2fa0e078c34328f4364441a750e
BLAKE2b-256 b26954b3a14a7e896b7d1735bb4f333b2854da8d0478ed4ce961838cfd159b6d

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 e82b4cc375f5101422f3da00f84f8dd336b2fcd3c7ed28fc6bd8c54ef8db661e
MD5 f1a0f0dc3e48584d89c68ff7087a4b9a
BLAKE2b-256 eb15d4c576e4eb48d19a801dc609a0fdd40d789aabb414e94c76ab319c55e4ef

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 aa3cd88eb4b2afc79e91b8b0242ba69fd66e4965d84afd463c20710a8311fc7d
MD5 3e33b2a08471fd36c1b4b31344a80ab4
BLAKE2b-256 5e4ec7fb7f26f145779e639854513f47d2281edcc7ce4381edea0cb96ec7b14c

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 c7c2e7c6a706747f1643f2515981c2d93d907d49692fe1ff6f4e7d2df3046cc0
MD5 2f4c12e02b84dc4fdc2eec3531cfdfa7
BLAKE2b-256 4651c985331930a8e8615ec11eafcc07b17b9965a43ae7f83e6e1a0d89309236

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 62ed7235f4aa7403525ab2e549fd2e8ca360aefc202b2ff8ad396092bb8d502f
MD5 7617e32fe28342b619f2d8a750d4a4f4
BLAKE2b-256 c51c213c89fe87e2360c5dfcf507b97a422f5cec06065640ae632a539c1de4e0

See more details on using hashes here.

File details

Details for the file tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem_nightly-0.26.0.dev20220517052758-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 baba2adfbf79af1933d1d130dd862b0a0faf3b1a9373b9609ed798aed1ac4a66
MD5 6748eddaed1911891688dad9d691c156
BLAKE2b-256 1098282ef95feff3584ea6b56ee89e7a2a3dbeb7aad67a033a4b33151269e1b0

See more details on using hashes here.
