TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
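To make the idea of automatic decompression concrete, here is a minimal standard-library sketch. It is not tensorflow-io's implementation; it only illustrates one common way to detect gzip data (by its two leading magic bytes) and then read the header of an IDX image file such as train-images-idx3-ubyte. The helper names maybe_gunzip and idx_image_header are hypothetical, and the payload is synthetic rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_gunzip(payload: bytes) -> bytes:
    """Return decompressed bytes if payload is gzip data, else payload unchanged."""
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload


def idx_image_header(raw: bytes) -> tuple:
    """Parse the 16-byte header of an IDX3 image file (big-endian uint32 fields)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:  # 0x00000803: unsigned byte data, 3 dimensions
        raise ValueError(f"unexpected IDX magic number: {magic}")
    return count, rows, cols


# Synthetic stand-in for an MNIST image file: two blank 28x28 images.
plain = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(plain)

# Both the raw and the gzip-compressed payload yield the same header.
header_from_plain = idx_image_header(maybe_gunzip(plain))
header_from_gzip = idx_image_header(maybe_gunzip(compressed))
print(header_from_plain, header_from_gzip)
```

The same function accepts compressed and uncompressed input, which is the behavior the note describes from the caller's point of view.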

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
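After installing, it can be useful to see which of these distributions are actually present in the environment. The following is a small sketch using only the standard library (importlib.metadata, Python 3.8+). The function names installed_version and report are hypothetical helpers, not part of tensorflow-io.

```python
from importlib import metadata

# TensorFlow distribution names mentioned in the text above.
TF_VARIANTS = ("tensorflow", "tensorflow-gpu", "tensorflow-cpu", "tensorflow-rocm")


def installed_version(dist_name: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report() -> dict:
    """Map each relevant distribution name to its installed version (or None)."""
    names = ("tensorflow-io", "tensorflow-io-nightly") + TF_VARIANTS
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

The output depends on the environment, so no expected result is shown here.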

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.30.0 2.11.x Jan 20, 2023
0.29.0 2.11.x Dec 18, 2022
0.28.0 2.11.x Nov 21, 2022
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
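
The table can also be consulted programmatically. The sketch below copies a few rows from it into a dictionary and checks whether a given TensorFlow version falls in the compatible series. Only some rows are included for brevity, and the names COMPATIBILITY and is_compatible are hypothetical, chosen for this illustration.

```python
# A few rows copied from the table above: tensorflow-io version -> TensorFlow series.
COMPATIBILITY = {
    "0.30.0": "2.11",
    "0.29.0": "2.11",
    "0.28.0": "2.11",
    "0.27.0": "2.10",
    "0.26.0": "2.9",
    "0.25.0": "2.8",
    "0.24.0": "2.8",
}


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Return True if tf_version belongs to the series listed for tfio_version."""
    required = COMPATIBILITY.get(tfio_version)
    if required is None:
        raise KeyError(f"tensorflow-io {tfio_version} is not in this partial table")
    # Compare major and minor components exactly, so "2.1" never matches "2.11".
    return tf_version.split(".")[:2] == required.split(".")


print(is_compatible("0.27.0", "2.10.1"))  # True
print(is_compatible("0.27.0", "2.11.0"))  # False
```

Comparing split components rather than string prefixes avoids treating 2.1.x as a match for 2.11.x.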

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

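# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container may be owned by root; restore ownership.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*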

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
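
To confirm which Python versions the resulting wheels target, the filenames can be inspected, since wheel names follow the pattern distribution-version-pythontag-abitag-platformtag.whl with an optional build tag after the version. The sketch below parses that pattern with the standard library. The filenames in the example are made up for illustration (including the 0.0.0 version), and parse_wheel_name and group_by_python are hypothetical helpers.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def parse_wheel_name(path: str) -> dict:
    """Split a wheel filename into its naming-convention components."""
    name = PurePosixPath(path).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError(f"unexpected wheel filename: {name}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def group_by_python(paths) -> dict:
    """Group wheel paths by their Python tag, e.g. 'cp37'."""
    groups = defaultdict(list)
    for path in paths:
        groups[parse_wheel_name(path)["python"]].append(path)
    return dict(groups)


# Hypothetical contents of the wheelhouse directory.
wheels = [
    "wheelhouse/tensorflow_io-0.0.0-cp35-cp35m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.0.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.0.0-cp37-cp37m-manylinux2010_x86_64.whl",
]
print(sorted(group_by_python(wheels)))  # ['cp35', 'cp36', 'cp37']
```

In practice the list of paths could come from pathlib.Path("wheelhouse").glob("*.whl") instead of the hard-coded examples.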

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
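
Before building on macOS, it may help to check which interpreter python resolves to and which wheel tag it would produce. This is a rough sketch only: current_python_tag and wheel_matches_interpreter are hypothetical helpers, the abbreviation table covers just CPython and PyPy, and compressed tag sets in filenames are not handled.

```python
import sys

# Interpreter abbreviations used in wheel Python tags.
_ABBREVIATIONS = {"cpython": "cp", "pypy": "pp"}


def current_python_tag() -> str:
    """Return the Python tag of the running interpreter, e.g. 'cp38'."""
    name = sys.implementation.name
    prefix = _ABBREVIATIONS.get(name, name)
    return f"{prefix}{sys.version_info.major}{sys.version_info.minor}"


def wheel_matches_interpreter(wheel_filename: str) -> bool:
    """Check whether a wheel's Python tag equals the running interpreter's tag."""
    parts = wheel_filename[: -len(".whl")].split("-")
    return parts[-3] == current_python_tag()


# Show which executable is running and the tag a wheel built with it would carry.
print(sys.executable, current_python_tag())
```

Running this with the python found in the shell shows the tag that the generated whl package is expected to carry.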

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | ✓ | ✓ | ✓ | N/A
3.7 | ✓ | ✓ | ✓ | ✓
3.8 | ✓ | ✓ | ✓ | ✓
TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
