TensorFlow utilities for efficient TFRecord processing and random access
TFD Utils
A Python library for efficient TensorFlow TFRecord processing and random access.
Features
- Random Access to TFRecord Files: Efficiently access specific records in TFRecord files without reading the entire file
- Automatic Index Caching: Builds and caches an index on first access for fast subsequent lookups
- Multiple File Support: Handle single files, lists of files, or glob patterns
- Flexible Key Types: Support for string, integer, and float keys
- Memory Efficient: Only loads requested records into memory
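To make the random-access idea concrete, here is a minimal sketch of an offset index built with only the standard library. It is not this library's implementation. It relies on the standard TFRecord framing (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, then a 4-byte CRC of the payload). The helper names `build_offset_index` and `read_record_at` are hypothetical, records are indexed by position rather than by key, and CRCs are skipped rather than verified.

```python
import struct


def build_offset_index(path):
    """Return the starting byte offset of every record in a TFRecord file.

    Hypothetical helper: only record headers are read, payloads are skipped.
    """
    offsets = []
    with open(path, "rb") as f:
        while True:
            start = f.tell()
            header = f.read(12)  # uint64 length + uint32 CRC of the length
            if len(header) < 12:
                break  # end of file, or a truncated trailing record
            (length,) = struct.unpack("<Q", header[:8])
            offsets.append(start)
            f.seek(length + 4, 1)  # skip the payload and its 4-byte CRC
    return offsets


def read_record_at(path, offset):
    """Read the raw payload of the record that starts at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        (length,) = struct.unpack("<Q", f.read(8))
        f.seek(4, 1)  # skip the length CRC (not verified in this sketch)
        return f.read(length)
```

Once the offsets are known, fetching one record costs a single seek and read, which is why caching such an index pays off on repeated lookups.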
Quick Start
from tfd_utils.random_access import TFRecordRandomAccess
# Initialize with a single file
reader = TFRecordRandomAccess("path/to/your/file.tfrecord")
# Or with multiple files
reader = TFRecordRandomAccess([
    "path/to/file1.tfrecord",
    "path/to/file2.tfrecord"
])
# Or with a glob pattern
reader = TFRecordRandomAccess("path/to/data_*.tfrecord")
# Get a record by key
record = reader.get_record("your_key")
# Get a specific feature from a record
image_bytes = reader.get_feature("your_key", "image")
# Check if key exists
if "your_key" in reader:
    print("Key exists!")
# Get statistics
stats = reader.get_stats()
print(f"Total records: {stats['total_records']}")
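The three input forms accepted above (a single path, a list of paths, or a glob pattern) can all be reduced to one flat file list. The sketch below shows one way to do that with the standard `glob` module. The function name `resolve_paths` is hypothetical, and the library may resolve its inputs differently.

```python
import glob


def resolve_paths(source):
    """Expand a path, a list of paths, or glob patterns into a file list.

    Hypothetical helper: a plain path with no wildcard matches itself
    when the file exists, so all three input forms share one code path.
    """
    patterns = [source] if isinstance(source, str) else list(source)
    files = []
    for pattern in patterns:
        files.extend(sorted(glob.glob(pattern)))
    # Drop duplicates while keeping first-seen order
    files = list(dict.fromkeys(files))
    if not files:
        raise FileNotFoundError(f"No files matched: {source!r}")
    return files
```

Sorting each pattern's matches keeps the file order stable between runs, which matters if a cached index refers to files by position.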
Advanced Usage
Custom Key Feature
By default, the library looks for keys in a feature named 'key'. You can specify a different feature name:
# Use 'id' feature as the key
reader = TFRecordRandomAccess("file.tfrecord", key_feature_name="id")
Custom Index File
You can specify where to save the index cache:
reader = TFRecordRandomAccess(
    "file.tfrecord",
    index_file="my_custom_index.cache"
)
Rebuilding Index
If your TFRecord files change, you can rebuild the index:
reader.rebuild_index()
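To decide when a rebuild is needed, one option is to compare file modification times. The sketch below assumes the index cache is an ordinary file on disk. The function name `index_is_stale` is hypothetical, and this README does not say whether the library performs any such check itself.

```python
import os


def index_is_stale(index_path, data_paths):
    """Return True if the index is missing or older than any data file.

    Hypothetical helper based only on modification times; it will not
    notice a file that was replaced with an older timestamp.
    """
    if not os.path.exists(index_path):
        return True
    index_mtime = os.path.getmtime(index_path)
    return any(os.path.getmtime(p) > index_mtime for p in data_paths)
```

A caller could run this check before lookups and call `reader.rebuild_index()` only when it returns True.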
License
MIT License