TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found in the project documentation.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Prepare batches of the data just like with any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
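
The trained model can then be evaluated on the MNIST test split, streamed the same way. This is a short sketch that continues the example above; the t10k-* file names are the standard MNIST test files hosted alongside the training files:

# Evaluate on the MNIST test split, streamed over HTTPS just like the training data.
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

model.evaluate(d_test)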

In the above MNIST example, the URLs for the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs for the compressed (gzip) files to the API call as is.
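
If the HTTP/HTTPS file system registered by tensorflow-io is also visible to TensorFlow's generic file APIs, the same remote files can be read directly with tf.io.gfile. A minimal, illustrative sketch, reusing one of the MNIST URLs above:

import tensorflow as tf
import tensorflow_io as tfio  # importing makes the HTTP/HTTPS file system available

# Read the raw bytes of a remote file without downloading it to disk first.
with tf.io.gfile.GFile(
    "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz", "rb"
) as f:
    print(len(f.read()))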

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
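
After installation, a quick way to confirm that the resolved versions load together is to import both packages and print their versions. A minimal sanity check; use the tensorflow-io-nightly distribution name instead if you installed the nightly build:

import importlib.metadata

import tensorflow as tf
import tensorflow_io as tfio  # importing both packages is a quick smoke test of the install

print("tensorflow:   ", tf.__version__)
print("tensorflow-io:", importlib.metadata.version("tensorflow-io"))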

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases on the GitHub releases page.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.29.0 | 2.11.x | Dec 18, 2022
0.28.0 | 2.11.x | Nov 21, 2022
0.27.0 | 2.10.x | Sep 08, 2022
0.26.0 | 2.9.x  | May 17, 2022
0.25.0 | 2.8.x  | Apr 19, 2022
0.24.0 | 2.8.x  | Feb 04, 2022
0.23.1 | 2.7.x  | Dec 15, 2021
0.23.0 | 2.7.x  | Dec 14, 2021
0.22.0 | 2.7.x  | Nov 10, 2021
0.21.0 | 2.6.x  | Sep 12, 2021
0.20.0 | 2.6.x  | Aug 11, 2021
0.19.1 | 2.5.x  | Jul 25, 2021
0.19.0 | 2.5.x  | Jun 25, 2021
0.18.0 | 2.5.x  | May 13, 2021
0.17.1 | 2.4.x  | Apr 16, 2021
0.17.0 | 2.4.x  | Dec 14, 2020
0.16.0 | 2.3.x  | Oct 23, 2020
0.15.0 | 2.3.x  | Aug 03, 2020
0.14.0 | 2.2.x  | Jul 08, 2020
0.13.0 | 2.2.x  | May 10, 2020
0.12.0 | 2.1.x  | Feb 28, 2020
0.11.0 | 2.1.x  | Jan 10, 2020
0.10.0 | 2.0.x  | Dec 05, 2019
0.9.1  | 2.0.x  | Nov 15, 2019
0.9.0  | 2.0.x  | Oct 18, 2019
0.8.1  | 1.15.x | Nov 15, 2019
0.8.0  | 1.15.x | Oct 17, 2019
0.7.2  | 1.14.x | Nov 15, 2019
0.7.1  | 1.14.x | Oct 18, 2019
0.7.0  | 1.14.x | Jul 14, 2019
0.6.0  | 1.13.x | May 29, 2019
0.5.0  | 1.13.x | Apr 12, 2019
0.4.0  | 1.13.x | Mar 01, 2019
0.3.0  | 1.12.0 | Feb 15, 2019
0.2.0  | 1.12.0 | Jan 29, 2019
0.1.0  | 1.12.0 | Dec 16, 2018
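
As a quick runtime check, you can compare the installed TensorFlow version against the pairing in the table above. A small illustrative sketch for the tensorflow-io 0.29.0 / TensorFlow 2.11.x pairing:

import tensorflow as tf

# Per the table above, TensorFlow I/O 0.29.0 pairs with TensorFlow 2.11.x.
# This guard is only illustrative; adjust the expected version to your pairing.
major, minor = tf.__version__.split(".")[:2]
assert (major, minor) == ("2", "11"), f"expected TensorFlow 2.11.x, got {tf.__version__}"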

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contribution guidelines in the repository for details on how to get involved.

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has docker installed, then the following command will automatically build a manylinux2010 compatible whl package:

#!/usr/bin/env bash

# List the wheels produced by the regular build step.
ls dist/*

# Repair each wheel inside a manylinux2010 container so that the result
# is manylinux2010 compatible.
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done

# Files created inside the container are owned by root; hand them back to the current user.
sudo chown -R $(id -nu):$(id -ng) .

# The repaired wheels end up in the wheelhouse directory.
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command could be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. If you want to build a whl package for a specific Python version, you have to alias that version of Python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7    | ✓            | ✓            | ✓            | N/A
3.7    | ✓            | ✓            | ✓            | ✓
3.8    | ✓            | ✓            | ✓            | ✓

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS, among others.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

                             | Live System | Emulator | CI Integration | Offline
Apache Kafka                 | ✓           |          | ✓              |
Apache Ignite                | ✓           |          | ✓              |
Prometheus                   | ✓           |          | ✓              |
Google PubSub                |             | ✓        | ✓              |
Azure Storage                |             | ✓        | ✓              |
AWS Kinesis                  |             | ✓        | ✓              |
Alibaba Cloud OSS            |             |          |                | ✓
Google BigTable/BigQuery     | to be added |          |                |
Elasticsearch (experimental) | ✓           |          | ✓              |
MongoDB (experimental)       | ✓           |          | ✓              |



License

Apache License 2.0


Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp311-cp311-win_amd64.whl
SHA256 9b200ad95b10d95030de1f944a8394fde1df04d936412fb0ad6832879e10c4de
MD5 4f245afed5838ed11b233ead0ca7a940
BLAKE2b-256 bcaa3c86f086381eb3f8ad42b4b9a898b65dcda5b526874ed4eb73a522724fe5

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp311-cp311-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
SHA256 eb00812ad3fd047b4f6789dba2e0b8de3a642002840271162814f758a322c812
MD5 0025f7937cae0d9e79378df52975c1b4
BLAKE2b-256 44e6ad37f510785ba77298528d3fd657af5521ad45b318136cb37475c66056ad

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp311-cp311-macosx_10_14_x86_64.whl
SHA256 0923e98f13cfaa6e684312de2cb64e5494089af62cb291ac344d053d77f802e1
MD5 4268eb041fef196093e3127e854c638d
BLAKE2b-256 5178f38a8c6032f99ef99dc63dfbedb1d00ab460ab1389594ee28f829eafe9e8

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp310-cp310-win_amd64.whl
SHA256 e029d56b165c0a868bbf39f4bb1e3899460851f5b9146fe5721267dd187c1c51
MD5 749023eddf96bd182a640df26a79be7b
BLAKE2b-256 be4a21d4d514a78c3f589d517ac79fdfb863fb1741f59308a70232ce5d79299b

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
SHA256 15ab4df97a5cfa85ecf692b8b6e20c99e0b893c10399948c86c7b8248e6c8fdd
MD5 5517fe8c0a959ca1a538e373bd58087e
BLAKE2b-256 80807896d8e731e3858459589404fabbdc25b1adbbd62e84d2e0164d18968f5e

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp310-cp310-macosx_10_14_x86_64.whl
SHA256 1048837ca57804bcc91c16beb006cadab25ea50270ebe5fc2d483014b3cfdaad
MD5 d4938bee1eb7b9da6045cccf3055b831
BLAKE2b-256 21e2e85817b079ba81ab4b1f9e2bc9377d0a03a9c73f72e4cd7e15d4ddf7c5c6

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp39-cp39-win_amd64.whl
SHA256 53aa3c4df44f3615ffd6ba86a4cb51cac553f2029e95d31c8e0289918a0f1dfc
MD5 27b30be5a1878b3a69288ffef43d7b5f
BLAKE2b-256 5ec9ba891141da7926f61b3bb736e6eb5d21b2258c89f6508733ab8cc0ea3c4c

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
SHA256 550477d0be5c687255f98f39c2ebaec96790264f61651cc906ce343b6b0aba72
MD5 4eb9bbfaf6ec06d99d32e9c06c4ecea3
BLAKE2b-256 95e7473a2dc5573cb1bc9c252312a6d7a552a8a6092e4a2788174a37e9b9c734

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp39-cp39-macosx_10_14_x86_64.whl
SHA256 d43198c57b5f87f562a617e5a135269e61f7adf858c1766f5dadc493a2835ac0
MD5 7ea7287d120a376eebf00f83e7c4c943
BLAKE2b-256 fa07105003c61f42c833fccf2b713dd2854dc31d54d85de883f3ede36461e148

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp38-cp38-win_amd64.whl
SHA256 bbbec01d4c8e20de9830eb1f37f981c2a574d39991026df735fb8ecec3eed75f
MD5 ed5a0eb3eeddca82c95f6aeabcac0279
BLAKE2b-256 64e417c96f47beefacfa1663c151cf538b87213097d5d6283811694fd6e75a5d

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
SHA256 385ec967e382f0fc3ccdb911961433dd88ef602a0fefca4f5ced05be99add271
MD5 39f37b563f32ea2a3f54e68cd42863a4
BLAKE2b-256 5470c97cc37b9244f5779a3a41067791a8b3c4eb24e8f0b91328d09af2647016

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp38-cp38-macosx_10_14_x86_64.whl
SHA256 6d74df15c19771c4dbbf1b97ffdf42154b5ee45e34f1b1fea030ea8f5fc0837f
MD5 fb505bf7fcc5f7175c701d7a4c160e8d
BLAKE2b-256 de13df7bfe8065bb1d67d3e5113be2976cdd882d50691fcf3b749e9027a8b5f1

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp37-cp37m-win_amd64.whl
SHA256 a170b727f0095845e23036da458e127f56725fae690d3b6e626dd2dd486c4ed5
MD5 35cdd0161629e32adc30fb9ed2ee5cdf
BLAKE2b-256 d2527c6e01d77f7522939f99a54c950564ce9885d04bafaf492566d8286458af

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
SHA256 963c0ba955570b3807dfc95526c3e8b1beac2d8555e59c372de33b0bd7ef43ee
MD5 09362e540c33a614fdf560fb25c6bd0c
BLAKE2b-256 02a290845e02feb30418a63a28783f3036b8c53a2d7c8f6e15bb1577cfd07a97

tensorflow_io_gcs_filesystem_nightly-0.29.0.dev20221218181236-cp37-cp37m-macosx_10_14_x86_64.whl
SHA256 81b7e50a1e773244f026b8ca4a36a0846f83b07d17376bf8897704d2704fb3cb
MD5 cb372b0b207540cd9b18a8e379a2a904
BLAKE2b-256 ab86a3ae17ae97369edfb1096742b5a13df37e17fb3f43113bf0e50c952ae408

