TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data-processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because tensorflow-io provides built-in support for the HTTP/HTTPS file system, eliminating the need to download and save the dataset to a local directory.

NOTE: Since tensorflow-io can detect and decompress the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
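
For illustration, here is a minimal sketch of reading one of the remote files directly; it assumes the HTTP/HTTPS file system registered by tensorflow-io is also reachable through tf.io.gfile:

import tensorflow as tf
import tensorflow_io as tfio  # importing tensorflow_io registers the HTTP/HTTPS file system

# Open a remote file just like a local one; no explicit download step is needed.
url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
with tf.io.gfile.GFile(url, "rb") as f:
    data = f.read()
print(len(data))  # size in bytes of the (still gzip-compressed) label file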

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below (a quick version check is shown after the table). You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.19.1 2.5.x Jul 23, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
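
For a quick sanity check, the installed versions can be printed and compared against the table above. A minimal sketch (it assumes both packages expose a __version__ attribute, as recent releases do; per the first row, tensorflow-io 0.19.1 pairs with TensorFlow 2.5.x):

import tensorflow as tf
import tensorflow_io as tfio

# Print the installed versions and check the pair against the compatibility table above
# (e.g. tensorflow-io 0.19.1 goes with TensorFlow 2.5.x).
print("tensorflow:", tf.__version__)
print("tensorflow-io:", tfio.__version__)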

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
# Repair each wheel in dist/ inside the manylinux2010 container so it is
# tagged as manylinux2010-compatible.
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done
# The container runs as root, so restore ownership of the output to the current user.
sudo chown -R $(id -nu):$(id -ng) .
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version of python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different python3 versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7    |      ✓       |      ✓       |      ✓       |     N/A
3.7    |      ✓       |      ✓       |      ✓       |      ✓
3.8    |      ✓       |      ✓       |      ✓       |      ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators. A minimal Kafka example is given after the table below.

System                        | Live System | Emulator | CI Integration | Offline
Apache Kafka                  |      ✓      |          |       ✓        |
Apache Ignite                 |      ✓      |          |       ✓        |
Prometheus                    |      ✓      |          |       ✓        |
Google PubSub                 |             |    ✓     |       ✓        |
Azure Storage                 |             |    ✓     |       ✓        |
AWS Kinesis                   |             |    ✓     |       ✓        |
Alibaba Cloud OSS             |             |          |                |    ✓
Google BigTable/BigQuery      | to be added |          |                |
Elasticsearch (experimental)  |      ✓      |          |       ✓        |
MongoDB (experimental)        |      ✓      |          |       ✓        |
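
As one illustration of these integrations, below is a minimal sketch of streaming records from a Kafka topic with tfio.IODataset.from_kafka. The topic name ("test") is an assumption for the example, the broker is assumed to be reachable at Kafka's default local address, and the exact element structure may differ slightly between releases:

import tensorflow_io as tfio

# Read messages from partition 0 of a hypothetical "test" topic on a local Kafka broker.
kafka_ds = tfio.IODataset.from_kafka("test", partition=0)

# Print the first few records; whether each element is a message or a message/key pair
# depends on the tensorflow-io release.
for item in kafka_ds.take(5):
    print(item)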

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions

SHA256, MD5, and BLAKE2b-256 digests for each wheel in this release:

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp39-cp39-win_amd64.whl
    SHA256:      4bb342240446de5eef0434451801dabf47e917f7e54fea05e305bd85259a8f31
    MD5:         08f419e80d38efbf6693144f827b1aa2
    BLAKE2b-256: 298d9beb993f521bbd6f06a620c0bf7c126c897fd12a5485aa30aad931543521

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    SHA256:      f32670aa0b8b82cd6a7ba5a552cd6803b7b1a9e8dac640f12a84e7e2daab828c
    MD5:         e956020949ec8e8406d8fcfdaef6fe84
    BLAKE2b-256: 1c127b70804b35973d7a400fcc99d0d5ff4a51266ff9ce3d417bb42c38134e54

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp39-cp39-macosx_10_14_x86_64.whl
    SHA256:      f7e1cd0e1aee220ffd2ce76df0380ed50138d832d1c1a20185e9cca5261175c5
    MD5:         6123de15495d0ae552dd19ac4957793d
    BLAKE2b-256: 4c41c9a932170018f79e7aa1ed154d61107bff1cf0453b08e0c5e9ce2a3c1ac2

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp38-cp38-win_amd64.whl
    SHA256:      531f895942a749389a23689aa555d68b03b956831c937784981d1629327083a1
    MD5:         6311e41cdce2614c6b05d2baaf0b227c
    BLAKE2b-256: 58b4a39d9f2a4786e445d63381cd45267421ad3c086e6523c4842e30a1a34f12

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    SHA256:      a9a1df6c0b844cfef31f7d26dc100728aaca1a7bc2a6a92db7eb2064264c5fa4
    MD5:         96250674f281fa26f50333b6401107f0
    BLAKE2b-256: 7e94dad01c4b73025c56832fdfe185d3d7e23ed32bf93e0d1edaeb8f0b353be7

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp38-cp38-macosx_10_14_x86_64.whl
    SHA256:      ac05c4717447c523f43d97a1c28e5381c595dcbae2f4676c0aeeef5ad3051cc4
    MD5:         4ecd986271814fbd20b9eaafe6449ecd
    BLAKE2b-256: 808503ffdcddec79e5c5eb4c8228dd0db6c54a3159e373b41eef73dab7fe3106

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp37-cp37m-win_amd64.whl
    SHA256:      cc4c423413087e88f26aa5f9cc14e82e3d87f1bf50d2e4a7bfb814003bf537c3
    MD5:         f3c97391d6bd2efb16baa6e7c24c9ec1
    BLAKE2b-256: e2452227667f700ccebbf04e7c37c332bdb4ce07ff9dc6c96e1a0f24ef49f4c1

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    SHA256:      4f17116a8330eda26818939f09e10352406dc0a75b1d336ac7eff4149f263a46
    MD5:         c9951ed5d3e2e9b94e339f6fd25916d8
    BLAKE2b-256: da9074f50d1155d13de66df36d2d940010125794cc26d26068391547bb0ddf54

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp37-cp37m-macosx_10_14_x86_64.whl
    SHA256:      1e0c58461c7d40801e60effc5de9f65712a0320f370c8335013cf31e78c72ca9
    MD5:         94973253393693e798955fbd893900c0
    BLAKE2b-256: 13e1c1550760f62ca2f26335320396adbc6c19903d76afe0d8f3b14a88ef4ae8

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp36-cp36m-win_amd64.whl
    SHA256:      4e5b7c07d321277677d78b13b261389f07d67dc88c85e72c9594ff95ed1dbf8c
    MD5:         3b7885ed70858e51eccb687e15e38e7f
    BLAKE2b-256: 0aef00ec95eee9a55b846bb3ff16de4602ffd90946c7a3fb1b4cdb3d0c2b809c

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    SHA256:      528390c1cf75e58700530a42bf4a42ab90eadd574bfea5a33502e58cdc915a97
    MD5:         262151b55fbd4113be569bb27d942523
    BLAKE2b-256: eb74d80896d835dca5b2e9b055c984f7048ee484b81720b144f05950bb67cd51

tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp36-cp36m-macosx_10_14_x86_64.whl
    SHA256:      98b8dfb3de982d9eed6ce8c16d0272a67eb1b14f91f3e28281a676b9a4b5e320
    MD5:         bc3d6c1fdcd9d352f96d29a7bade2e17
    BLAKE2b-256: 2b8af3917802c96e57666de8ee7f1563dcaac054274a0871fa53ffeda282bb63
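
To verify a downloaded wheel against the digests listed above, a minimal sketch (the filename and SHA256 value are those of the cp39 win_amd64 wheel from this release; substitute the pair for whichever file you downloaded):

import hashlib

# Compute the SHA256 digest of the downloaded wheel and compare it to the published value.
path = "tensorflow_io_gcs_filesystem_nightly-0.19.1.dev20210723205827-cp39-cp39-win_amd64.whl"
expected = "4bb342240446de5eef0434451801dabf47e917f7e54fea05e305bd85259a8f31"

with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("OK" if digest == expected else "hash mismatch")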
