TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
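
The note above can be illustrated with a minimal, standard-library-only sketch of how gzip detection and MNIST header parsing could work. This is not tensorflow-io's actual implementation; the helper names maybe_decompress and parse_idx3_header are hypothetical, and the payload is synthetic rather than downloaded.

```python
import gzip
import struct

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"

# Magic number of an idx3-ubyte file: 0x08 = unsigned byte, 0x03 = three dimensions.
IDX3_MAGIC = 0x00000803


def maybe_decompress(raw: bytes) -> bytes:
    """Return the decompressed bytes if `raw` is gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx3_header(data: bytes) -> tuple:
    """Parse the 16-byte big-endian header of an MNIST idx3-ubyte image file."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IDX3_MAGIC:
        raise ValueError(f"unexpected IDX magic number: {magic:#010x}")
    return count, rows, cols


# Synthetic payload: a header describing 2 images of 28x28 pixels, all zeros.
payload = struct.pack(">IIII", IDX3_MAGIC, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(payload)

# The compressed and the plain payload yield the same header.
print(parse_idx3_header(maybe_decompress(compressed)))
print(parse_idx3_header(maybe_decompress(payload)))
```

Checking the leading magic bytes, rather than the .gz file extension, is one way to make detection independent of how a URL is named.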

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
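
After installation, it can be useful to confirm which versions were actually resolved. The following is a small sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and not part of tensorflow-io.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages that need to be compatible.
for name in ("tensorflow", "tensorflow-io"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

The printed versions can then be compared against the compatibility table in this document.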

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.29.0 | 2.11.x | Dec 18, 2022 |
| 0.28.0 | 2.11.x | Nov 21, 2022 |
| 0.27.0 | 2.10.x | Sep 08, 2022 |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
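
As a rough illustration of how the table can be used programmatically, the sketch below maps a TensorFlow version string to the newest listed TensorFlow I/O release for that series. The dictionary is a hand-copied subset of the table above, and the function name recommended_tfio is hypothetical; neither is part of the tensorflow-io package.

```python
from typing import Optional

# Newest TensorFlow I/O release listed for each TensorFlow minor series.
# Hand-copied subset of the compatibility table; extend it as needed.
LATEST_TFIO = {
    "2.11": "0.29.0",
    "2.10": "0.27.0",
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
}


def recommended_tfio(tf_version: str) -> Optional[str]:
    """Return the newest listed TensorFlow I/O version for `tf_version`, if known."""
    parts = tf_version.split(".")
    if len(parts) < 2:
        return None  # need at least "major.minor" to look up a series
    return LATEST_TFIO.get(f"{parts[0]}.{parts[1]}")


print(recommended_tfio("2.10.1"))  # 0.27.0
print(recommended_tfio("1.12.0"))  # None (series not in the subset above)
```

The result can be passed to pip as a pinned requirement, for example tensorflow-io==0.27.0.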

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

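#!/usr/bin/env bash

# List the wheels that were built into dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created by the container are typically owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*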

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
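
To check which Python versions the generated packages target, the tags in each whl file name can be inspected. The sketch below parses the standard wheel file name layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper parse_wheel_name and the sample file name are illustrative assumptions, not part of the project's tooling.

```python
from pathlib import Path
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated components."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    # 5 parts without a build tag, 6 parts with one (the build tag is parts[2]).
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {name}")
    return WheelTags(parts[0], parts[1], parts[-3], parts[-2], parts[-1])


# Illustrative file name; real names depend on the version being built.
sample = "tensorflow_io-0.17.0-cp37-cp37m-manylinux2010_x86_64.whl"
print(parse_wheel_name(sample).python_tag)  # cp37

# Summarize the wheels in wheelhouse/, if that directory exists.
wheelhouse = Path("wheelhouse")
if wheelhouse.is_dir():
    for path in sorted(wheelhouse.glob("*.whl")):
        tags = parse_wheel_name(path.name)
        print(tags.python_tag, tags.platform_tag)
```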

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, including Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
