TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. The full list of file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data-processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
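
As a follow-up, below is a minimal sketch of evaluating the trained model on the MNIST test split. This part is not in the original example; the t10k file names follow the standard MNIST naming on the same mirror.

# Load the test split the same way, convert images to float32, and batch.
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

# Evaluate the model trained above.
model.evaluate(d_test)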

In the above MNIST example, the URLs used to access the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs for the compressed (gzip) files to the API call as is.
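
Below is a minimal sketch of reading a remote file through the regular tf.io.gfile API, assuming the HTTP/HTTPS file system mentioned above is registered simply by importing tensorflow_io and that the URL used above is reachable:

import tensorflow as tf
import tensorflow_io as tfio  # importing is assumed to register the HTTP/HTTPS file system

# Read the first few bytes of the remote, still gzip-compressed, label file.
url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
with tf.io.gfile.GFile(url, "rb") as f:
    print(f.read(8))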

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
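
For example, to pair tensorflow-io with the CPU-only TensorFlow build instead, one of the extras listed above can be used:

$ pip install tensorflow-io[tensorflow-cpu]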

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly
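
To quickly check that the package is importable inside a container, something like the following can be used (a sketch; it assumes python is on the image's default PATH):

$ docker run -it --rm tfsigio/tfio:latest python -c "import tensorflow_io as tfio; print(tfio.IODataset)"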

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install the matching version of TensorFlow I/O according to the table below; an example pip command follows the table. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.28.0 2.11.x Nov 21, 2022
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
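
For example, to pin a pair of matching releases from the table above (a sketch; substitute the versions appropriate for your environment):

$ pip install "tensorflow==2.11.*" "tensorflow-io==0.28.0"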

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it easy to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contribution guidelines in the repository.

Build Status and CI

Build status badges are published for Linux CPU and Linux GPU builds, on both Python 2 and Python 3.

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# List the locally built wheels, then repair each one inside the
# manylinux2010 container so it is tagged as manylinux2010_x86_64.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root, so restore file ownership, then list the repaired wheels.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package matching that version of python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
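
For illustration only (the authoritative steps are in the Auditwheel step of .github/workflows/build.yml), one way to make python resolve to a specific interpreter is a symlink earlier in the PATH; the interpreter path below is hypothetical:

$ sudo ln -sf "$(which python3.8)" /usr/local/bin/python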

Note that the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS builds and tests; Kokoro is used for Linux builds and tests. Again, because of the manylinux2010 requirement, on Linux the whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
|--------|--------------|--------------|--------------|--------------|
| 2.7    | ✓            | ✓            | ✓            | N/A          |
| 3.7    | ✓            | ✓            | ✓            | ✓            |
| 3.8    | ✓            | ✓            | ✓            | ✓            |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.
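
As an illustration of how one of these integrations surfaces in the API, below is a hedged sketch of reading a few records from Apache Kafka with tfio.IODataset.from_kafka. It assumes a Kafka broker is reachable at the default local address and that a topic named demo-topic exists (the topic name is made up; consult the tfio.IODataset.from_kafka documentation for the full argument list):

import tensorflow_io as tfio

# Create a dataset over the existing records of the (hypothetical) demo-topic.
kafka_ds = tfio.IODataset.from_kafka("demo-topic")

# Peek at the first couple of records.
for record in kafka_ds.take(2):
    print(record)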

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested against live systems or emulators.

|                              | Live System | Emulator | CI Integration | Offline |
|------------------------------|-------------|----------|----------------|---------|
| Apache Kafka                 | ✓           |          | ✓              |         |
| Apache Ignite                | ✓           |          | ✓              |         |
| Prometheus                   | ✓           |          | ✓              |         |
| Google PubSub                |             | ✓        | ✓              |         |
| Azure Storage                |             | ✓        | ✓              |         |
| AWS Kinesis                  |             | ✓        | ✓              |         |
| Alibaba Cloud OSS            |             |          |                | ✓       |
| Google BigTable/BigQuery     |             |          | to be added    |         |
| Elasticsearch (experimental) | ✓           |          | ✓              |         |
| MongoDB (experimental)       | ✓           |          | ✓              |         |

References for emulators:

Community

Additional Information

License

Apache License 2.0


