TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Prepare batches of the data, just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
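
As a follow-up, the test split can be loaded and evaluated the same way. The snippet below is a sketch that assumes the standard MNIST test file names (t10k-*) under the same URL:

# Read the MNIST test split (file names follow the standard MNIST distribution).
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)

# Apply the same dtype conversion and batching as for training.
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

# Evaluate the trained model on the test set.
model.evaluate(d_test)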

In the above MNIST example, the URLs to access the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and decompress the MNIST dataset automatically if needed, we can pass the URLs for the compressed (gzip) files to the API call as is.
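
Because those file systems plug into TensorFlow's standard file APIs, remote files can also be read directly with tf.io.gfile. The following is a minimal sketch, assuming the HTTPS file system is registered as a side effect of importing tensorflow_io:

import tensorflow as tf
import tensorflow_io as tfio  # imported for its side effect of registering extra file systems

# Read one of the MNIST files over HTTPS without downloading it manually.
with tf.io.gfile.GFile(
    "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz", "rb"
) as f:
    raw = f.read()
print(len(raw))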

Please check the official documentation for more detailed and interesting usages of the package.
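
As one small example of such usages, tensorflow-io also provides IOTensor-style readers. The sketch below follows the audio tutorial and assumes a local audio file at sample.wav (a hypothetical path):

import tensorflow_io as tfio

# Lazily open an audio file as an IOTensor (the path is hypothetical).
audio = tfio.audio.AudioIOTensor("sample.wav")

# Inspect the sample rate and shape, then slice part of it into a regular tensor.
print(audio.rate, audio.shape)
clip = audio[0:1024]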

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
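
For example, to pair tensorflow-io with a CPU-only TensorFlow build:

$ pip install tensorflow-io[tensorflow-cpu]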

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below (an example pip command follows the table). You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.30.0 | 2.11.x | Jan 20, 2023 |
| 0.29.0 | 2.11.x | Dec 18, 2022 |
| 0.28.0 | 2.11.x | Nov 21, 2022 |
| 0.27.0 | 2.10.x | Sep 08, 2022 |
| 0.26.0 | 2.9.x | May 17, 2022 |
| 0.25.0 | 2.8.x | Apr 19, 2022 |
| 0.24.0 | 2.8.x | Feb 04, 2022 |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
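
For example, pairing the versions from the first row of the table (an illustrative command; any row's pairing works the same way):

$ pip install "tensorflow-io==0.30.0" "tensorflow==2.11.*"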

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, it depends on public contributions, bug fixes, and documentation. Please see the project's contribution guidelines and SIG IO community channels for details on how to get involved.

Build Status and CI

Build status badges: Linux CPU Python 2, Linux CPU Python 3, Linux GPU Python 2, Linux GPU Python 3.

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, then the following command will automatically build manylinux2010-compatible whl packages:

#!/usr/bin/env bash

# List the wheels produced by the regular build step.
ls dist/*

# Repair each wheel inside the manylinux2010 build image.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done

# The container runs as root, so restore ownership; repaired wheels land in wheelhouse/.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.
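
The repaired wheels can then be installed directly with pip. For example, to install the Python 3.7 wheel (the file name pattern is illustrative):

$ pip install wheelhouse/tensorflow_io-*-cp37-*.whl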

On macOS, the same repair command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version of Python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that; a minimal sketch also follows below.

Note that the above command is also the command we use when releasing packages for Linux and macOS.
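
One possible way to do that aliasing, assuming a python3.8 binary is already installed (python3.8 is only an example version, and the workflow's Auditwheel step may use a different mechanism):

#!/usr/bin/env bash

# Put a "python" symlink pointing at the desired interpreter first on PATH
# so the build script picks it up.
mkdir -p "$HOME/python-alias"
ln -sf "$(which python3.8)" "$HOME/python-alias/python"
export PATH="$HOME/python-alias:$PATH"
python --version   # should now report Python 3.8.x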

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test; Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS (a brief Kafka usage sketch follows the integration test table below).

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as Kinesis, PubSub, and Azure Storage, are run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

|  | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✓ |  | ✓ |  |
| Apache Ignite | ✓ |  | ✓ |  |
| Prometheus | ✓ |  | ✓ |  |
| Google PubSub |  | ✓ | ✓ |  |
| Azure Storage |  | ✓ | ✓ |  |
| AWS Kinesis |  | ✓ | ✓ |  |
| Alibaba Cloud OSS |  |  |  | ✓ |
| Google BigTable/BigQuery | to be added |  |  |  |
| Elasticsearch (experimental) | ✓ |  | ✓ |  |
| MongoDB (experimental) | ✓ |  | ✓ |  |
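
As an illustration of one such integration, the sketch below reads records from a Kafka topic. It assumes a broker reachable at the default address, an existing topic named my-topic, and the tfio.IODataset.from_kafka constructor shown in the Kafka tutorial:

import tensorflow_io as tfio

# Read records from an existing Kafka topic into a tf.data-compatible dataset
# (topic name and broker availability are assumptions of this sketch).
kafka_ds = tfio.IODataset.from_kafka("my-topic", partition=0, offset=0)

# Each element carries the message (and key) produced to the topic.
for item in kafka_ds.take(5):
    print(item)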

References for emulators:

Community

Additional Information

License

Apache License 2.0
