TensorFlow utilities for efficient TFRecord processing and random access

Project description

TFD Utils

A Python library for efficient TensorFlow TFRecord processing and random access.

Features

  • Random Access to TFRecord Files: Efficiently access specific records in TFRecord files without reading the entire file
  • Automatic Index Caching: Builds and caches an index on first access for fast subsequent lookups
  • Multiple File Support: Handle single files, lists of files, or glob patterns
  • Flexible Key Types: Support for string, integer, and float keys
  • Memory Efficient: Only loads requested records into memory
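
The random-access and index features above rest on the TFRecord framing: each record is stored as an 8-byte little-endian length, a 4-byte CRC of that length, the payload, and a 4-byte CRC of the payload. The sketch below illustrates how a byte-offset index over that framing could be built and used for a single seek-and-read. It is not tfd_utils' actual implementation; the function names are hypothetical, checksums are not validated, and the demo writer fills the CRC fields with zero placeholders, so its output is not a file TensorFlow would accept.

```python
import os
import struct

# TFRecord framing: 8-byte little-endian length, 4-byte CRC of the length,
# the payload bytes, then a 4-byte CRC of the payload.
_LENGTH_BYTES = 8
_CRC_BYTES = 4


def build_offset_index(path):
    """Scan the file once and return [(payload_offset, payload_length), ...].

    Hypothetical helper: only the length headers are read; payloads are skipped.
    """
    index = []
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        while f.tell() < file_size:
            header = f.read(_LENGTH_BYTES)
            if len(header) < _LENGTH_BYTES:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header)
            payload_offset = f.tell() + _CRC_BYTES
            index.append((payload_offset, length))
            # Jump past the length CRC, the payload, and the payload CRC.
            f.seek(payload_offset + length + _CRC_BYTES)
    return index


def read_record(path, offset, length):
    """Read one payload with a single seek, without touching other records."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def write_demo_file(path, payloads):
    """Write payloads with TFRecord-style framing for demonstration only.

    The CRC fields are zero-filled placeholders (an assumption made to keep
    the sketch short), so this is not a valid TFRecord file.
    """
    with open(path, "wb") as f:
        for payload in payloads:
            f.write(struct.pack("<Q", len(payload)))
            f.write(b"\x00" * _CRC_BYTES)
            f.write(payload)
            f.write(b"\x00" * _CRC_BYTES)
```

A key-based lookup would add one more step: parse each payload once while building the index and store a mapping from the key feature's value to its (offset, length) pair.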

Quick Start

from tfd_utils.random_access import TFRecordRandomAccess

# Initialize with a single file
reader = TFRecordRandomAccess("path/to/your/file.tfrecord")

# Or with multiple files
reader = TFRecordRandomAccess([
    "path/to/file1.tfrecord",
    "path/to/file2.tfrecord"
])

# Or with a glob pattern
reader = TFRecordRandomAccess("path/to/data_*.tfrecord")

# Get a record by key
record = reader.get_record("your_key")

# Get a specific feature from a record
image_bytes = reader.get_feature("your_key", "image")

# Check if key exists
if "your_key" in reader:
    print("Key exists!")

# Get statistics
stats = reader.get_stats()
print(f"Total records: {stats['total_records']}")

Advanced Usage

Custom Key Feature

By default, the library looks for keys in a feature named 'key'. You can specify a different feature name:

# Use 'id' feature as the key
reader = TFRecordRandomAccess("file.tfrecord", key_feature_name="id")

Custom Index File

You can specify where to save the index cache:

reader = TFRecordRandomAccess(
    "file.tfrecord",
    index_file="my_custom_index.cache"
)

Rebuilding Index

If your TFRecord files change, you can rebuild the index:

reader.rebuild_index()
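
To decide when a rebuild is needed, one option is to store a signature of each data file next to the cached index and compare it on load. The sketch below shows that pattern with a JSON cache. Every name in it (file_signature, save_index, load_index_if_fresh) and the JSON layout are assumptions for illustration; the source does not say how tfd_utils stores its index or whether it detects changes on its own.

```python
import json
import os


def file_signature(path):
    """Cheap change detector: file size plus modification time in nanoseconds."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}


def save_index(index_file, data_paths, index):
    """Store the index together with the signatures of the files it covers."""
    payload = {
        "signatures": {p: file_signature(p) for p in data_paths},
        "index": index,
    }
    with open(index_file, "w", encoding="utf-8") as f:
        json.dump(payload, f)


def load_index_if_fresh(index_file, data_paths):
    """Return the cached index, or None if it is missing or out of date."""
    if not os.path.exists(index_file):
        return None
    with open(index_file, encoding="utf-8") as f:
        payload = json.load(f)
    current = {p: file_signature(p) for p in data_paths}
    if payload["signatures"] != current:
        return None  # a data file changed; the caller should rebuild
    return payload["index"]
```

When load_index_if_fresh returns None, a caller following this pattern would call reader.rebuild_index() (or rebuild its own index) and save the result again.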

License

MIT License

Download files

Source Distribution

tfd_utils-0.2.0.tar.gz (54.5 kB)

Uploaded Source

Built Distribution

tfd_utils-0.2.0-py3-none-any.whl (10.8 kB)

Uploaded Python 3

File details

Details for the file tfd_utils-0.2.0.tar.gz.

File metadata

  • Download URL: tfd_utils-0.2.0.tar.gz
  • Upload date:
  • Size: 54.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for tfd_utils-0.2.0.tar.gz:

  • SHA256: 3ee26f14b10f7072780e9cbe0d49a298e774f18212a74e86257092c08a1455a7
  • MD5: 9f2d02be9758f67ebb1058e1bffa5e86
  • BLAKE2b-256: df510041bb5fe354b48552711ccaedbfccbcb6305025c05eaaf7968b18905fab

Provenance

The following attestation bundles were made for tfd_utils-0.2.0.tar.gz:

Publisher: publish.yml on HarborYuan/tfd-utils

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tfd_utils-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: tfd_utils-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 10.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for tfd_utils-0.2.0-py3-none-any.whl:

  • SHA256: 9f32e27cad3f434c73a088dfe118d64587fe1a5e5b86d36a8a982b57d8e5e4c4
  • MD5: 92a63600f71578d8a230e7ad393528eb
  • BLAKE2b-256: 8b41b8f8752331ef5440275deb3bb788a63b5e52a09c791a28f248a829f5ffb2

Provenance

The following attestation bundles were made for tfd_utils-0.2.0-py3-none-any.whl:

Publisher: publish.yml on HarborYuan/tfd-utils
