
TensorFlow IO

Project description




TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save the datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and decompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as-is.
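
Building on the example above, the held-out test split can be streamed over HTTPS in the same way. The snippet below is only a sketch: the t10k-* file names assume the standard MNIST naming at the same mirror.

# Read the MNIST test split directly over HTTPS (file names assumed to
# follow the standard t10k-* naming at the same mirror).
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)

# Apply the same dtype conversion and batching as for training.
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

# Evaluate the trained model on the test data.
model.evaluate(d_test)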

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.31.0 2.11.x Feb 25, 2023
0.30.0 2.11.x Jan 20, 2023
0.29.0 2.11.x Dec 18, 2022
0.28.0 2.11.x Nov 21, 2022
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
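
To check which versions are installed in your environment, both packages expose a __version__ attribute; a minimal check (assuming both are importable) looks like this:

import tensorflow as tf
import tensorflow_io as tfio

# Print the installed versions to compare against the compatibility table above.
print("TensorFlow:", tf.__version__)
print("TensorFlow I/O:", tfio.__version__)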

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the contribution guidelines for details.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward, but if the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# List the wheels built locally, then repair each one into a
# manylinux2010-compatible wheel inside the official manylinux2010 image.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done
# Files created inside the container are owned by root; reclaim ownership.
sudo chown -R $(id -nu):$(id -ng) .
# The repaired wheels are written to the wheelhouse directory.
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5-, 3.6-, and 3.7-compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python on the shell's PATH and will only generate a whl package that matches that python's version. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS builds and tests; Kokoro is used for Linux builds and tests. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS | Windows-2019
2.7    | ✔            | ✔            | ✔     | N/A
3.7    | ✔            | ✔            | ✔     | ✔
3.8    | ✔            | ✔            | ✔     | ✔

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered by offline tests may not have the same level of coverage as live systems or emulators.

                             | Live System | Emulator | CI Integration | Offline
Apache Kafka                 | ✔           |          | ✔              |
Apache Ignite                | ✔           |          | ✔              |
Prometheus                   | ✔           |          | ✔              |
Google PubSub                |             | ✔        | ✔              |
Azure Storage                |             | ✔        | ✔              |
AWS Kinesis                  |             | ✔        | ✔              |
Alibaba Cloud OSS            |             |          |                | ✔
Google BigTable/BigQuery     | to be added |          |                |
Elasticsearch (experimental) | ✔           |          | ✔              |
MongoDB (experimental)       | ✔           |          | ✔              |
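
As a taste of what these integrations look like from user code, below is a hedged sketch of streaming a Kafka topic into a tf.data pipeline with tfio.IODataset.from_kafka. The broker, topic name, and configuration are assumptions for illustration; see the Kafka tutorial in the official documentation for an end-to-end example.

import tensorflow_io as tfio

# Sketch only: assumes a locally reachable Kafka broker with default settings
# and a pre-existing topic named "demo-topic" (both hypothetical).
kafka_ds = tfio.IODataset.from_kafka("demo-topic")

# Each element carries the raw Kafka record as tf.string tensors and can be
# decoded, batched, and fed to a model like any other tf.data.Dataset.
for item in kafka_ds.take(5):
    print(item)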


Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

File hashes for each built distribution:

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp311-cp311-win_amd64.whl
  SHA256:      249505b09775902715c3714d197de553a1e7476fcf003a8273a00d228e3180c7
  MD5:         15151d0a7be7248580f93ca1e1f73c76
  BLAKE2b-256: e963cc97e3f1c520a7357268ad2c0f46e53da32d1e2852b83a315ed3d2c434b1

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp311-cp311-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  SHA256:      d5b53b523f82971c556634662412a2b0c97982e031b3f6926c47d0070b6016bd
  MD5:         fe2ae49ab6f3833aca9b52ffd7ea2717
  BLAKE2b-256: c5fc0e4f1204267ea95a5874a5ac1164b90a7754b43e325396dce4f168828b5c

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp311-cp311-macosx_10_14_x86_64.whl
  SHA256:      fc27209d185fa73929e6ef81b8fdb627bb08c292f89cf7db3854215ce749c9f2
  MD5:         e6a9152ae5d5c27891da083480b2aafc
  BLAKE2b-256: bfbc8457357e4df1ec24e6aa6184478ebc859a1442a9b5d8e0ff0805b1688410

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp310-cp310-win_amd64.whl
  SHA256:      a822952b4e73d0a7783073eae039c557c0509b9ad257a6c94df5a0eec94a5419
  MD5:         17a286a212ee30f99d27ef974783dd78
  BLAKE2b-256: 90a8873f9bc33048804b2dfcb38df831ab4140ae26a7131b2eea0c4393e84f81

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  SHA256:      b4e254e9e16df7e3024fa5599cffca9bf9d1de75bffdb8dfda960c34faaa88e4
  MD5:         3fe848f98c090067307401b4448c470d
  BLAKE2b-256: 7e8313638dd67ee9e477a4bd51fa08e9a722b6213d2d4a47cbf5660dd13c42c1

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp310-cp310-macosx_10_14_x86_64.whl
  SHA256:      3a324cca2149a2462e894fd3c206e3559550aceb9bcbe02559ee1818bed11998
  MD5:         a48e3825e4bca5d0e9ca692ab661854d
  BLAKE2b-256: 17f394ca19c7d7b5f5aa44238d8c6226f9922f44d58c5c31ef1f7f8e1d93a832

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp39-cp39-win_amd64.whl
  SHA256:      82eeb402b286abdde83aeec0061853bf8b96efe436e2932e8f4c61afd6b299dd
  MD5:         6ccc58c4b0896a2ce678aadac39c08f2
  BLAKE2b-256: 6dcfc457e01d6242b37b2771c5c57e4f84722c1a37f9c51b656b39690c37ef69

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  SHA256:      a5599551bcd9e4a355bb337d73321ccfa3c088d03d28254115fbb2fc156c7e55
  MD5:         d850b8e02c5a5974c6337a1796c93029
  BLAKE2b-256: 550da8939e4141da6b4a549a02e2e75af0241242311a5342fb16aedb354d5916

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp39-cp39-macosx_10_14_x86_64.whl
  SHA256:      22e0c30057f40e3820366cacb5cdc58d3b9097ba0571c06864b811c3f896d76c
  MD5:         11b91e4ab02e459883b2d949940a2c54
  BLAKE2b-256: f52b809bf9947d9c2ecd8c76924f73675c3507a85bf8110dc4118816945fac47

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp38-cp38-win_amd64.whl
  SHA256:      38b393603dd046a8718b70b3bd3d297fdae6c3e5632f4eba888c3c0d68fdfbe6
  MD5:         0e5e18d090bc39b7908bf81a9b3ce780
  BLAKE2b-256: 9822ce134cc52dbbb26ebbe76fc5f2748c48588d86ea709db0517ad2031e2b2c

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  SHA256:      9655f38cc712c9656beb7e7c79963acf166411930607def27582c1c0421ec693
  MD5:         af03a2e2bdde8909312133cbfada21e9
  BLAKE2b-256: 5343d05410dfe6269160ef728f1c82130a0a3074eb26e7d061952823b134e268

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp38-cp38-macosx_10_14_x86_64.whl
  SHA256:      476f19862f47fd33755c4595afdd19652445808b9d7f3515b8cf9c86a8ffd3c2
  MD5:         b899142260fc6a3cbb9d295fb81d411a
  BLAKE2b-256: 7fdc6dac967ed6db93f0c4d073f47c6daba1aa9504f94c0d84083c276ee12a7b

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp37-cp37m-win_amd64.whl
  SHA256:      a4867df66ab3c584279d4030708bbf0c6fc46c11dc776251004737a18feeb704
  MD5:         9ff9dad777ceda903e163508224cc50b
  BLAKE2b-256: 87da2a7375ff4af2901dd095cc77b324b96f812321c0850b8967887ea2a534d0

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
  SHA256:      ee301600b92ba56e5fa6f573c9d17b9c82d4bc763bbe2fa2b647a9eb267aec80
  MD5:         02f38978789862d12e3fbe542f962cc3
  BLAKE2b-256: c1dc19c7cf60f354ef52a71b1bf371ee4474506a2d257f65bd510be3ccfef560

tensorflow_io_gcs_filesystem_nightly-0.31.0.dev20230309180344-cp37-cp37m-macosx_10_14_x86_64.whl
  SHA256:      982b7a0baf405631d95867442ccbedbfc5def0b29677446bb355d9d9d0fb00fc
  MD5:         79b3ace4bc7f012350ae65b180ce5b2a
  BLAKE2b-256: 88a361c4723ccb8e23f024135a093a401449e4263a0f119d52d7a98cc028956a
