



TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not included in TensorFlow's built-in support. The full list of file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data-processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is possible because of the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save the dataset to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.

Please check the official documentation for more detailed and interesting usage examples of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
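For reproducible environments, the extra can also be pinned in a requirements file. The version number below is purely illustrative; pick a pair from the compatibility table further down:

```text
# requirements.txt (illustrative version; match it to the
# TensorFlow Version Compatibility table)
tensorflow-io[tensorflow]==0.27.0
```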

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
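As a sketch of how the table is meant to be read, the pairing can be expressed as a small lookup. The helper below is purely illustrative and covers only a few recent rows (where two releases target the same TensorFlow series, the newer one is used):

```python
# Illustrative mapping from a TensorFlow minor series to the matching
# tensorflow-io release, taken from a few rows of the table above.
TFIO_COMPAT = {
    "2.10": "0.27.0",
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
}

def matching_tfio_version(tf_version: str) -> str:
    """Return the tensorflow-io release matching a TensorFlow version string."""
    # Reduce e.g. "2.10.1" to its minor series "2.10".
    series = ".".join(tf_version.split(".")[:2])
    try:
        return TFIO_COMPAT[series]
    except KeyError:
        raise ValueError(
            f"no known tensorflow-io release for TensorFlow {tf_version}"
        )

print(matching_tfio_version("2.10.1"))  # -> 0.27.0
```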

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward, but if the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
# Repair each wheel inside the manylinux2010 build container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host \
    quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root; restore ownership of the output files.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
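The platform compatibility of the resulting wheels is encoded in their filenames following the PEP 427 convention, {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl. A quick, illustrative way to inspect those tags (the filename below is made up for the example):

```python
# Split a wheel filename into its PEP 427 components. Assumes the
# distribution name itself contains no dashes (true for tensorflow_io,
# which uses underscores).
def parse_wheel_name(filename: str) -> dict:
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }

# Hypothetical filename of the kind auditwheel produces:
info = parse_wheel_name("tensorflow_io-0.27.0-cp37-cp37m-manylinux2010_x86_64.whl")
print(info["platform_tag"])  # -> manylinux2010_x86_64
```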

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package matching that python's version. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test; Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python Ubuntu 18.04 Ubuntu 20.04 macOS + osx9 Windows-2019
2.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: N/A
3.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:
3.8 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:


License

Apache License 2.0


