Rust implementation to read TFRecord files into PyTorch tensors

Project description

rustfrecord

The TFRecord format is a simple format for storing a sequence of binary records.
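To make the layout concrete, here is a minimal sketch of walking the records in an uncompressed TFRecord byte stream. It assumes the framing documented by TensorFlow: a little-endian uint64 length, a 4-byte masked CRC of the length, the payload, and a 4-byte masked CRC of the payload. CRC verification is skipped for brevity. The helper names `iter_records` and `frame_record` are illustrative only and are not part of rustfrecord.

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the raw payload of each record in an uncompressed TFRecord stream.

    The CRC fields are read but NOT verified in this sketch.
    """
    while True:
        header = stream.read(12)  # uint64 length + uint32 masked CRC of the length
        if not header:
            return  # clean end of stream
        if len(header) < 12:
            raise ValueError("truncated record header")
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        footer = stream.read(4)  # uint32 masked CRC of the payload
        if len(payload) < length or len(footer) < 4:
            raise ValueError("truncated record body")
        yield payload


def frame_record(payload: bytes) -> bytes:
    """Frame a payload for demonstration only, with zeroed (invalid) CRC fields."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


data = frame_record(b"first") + frame_record(b"second record")
records = list(iter_records(io.BytesIO(data)))
print(records)  # [b'first', b'second record']
```

In a real file each payload would be a serialized Example message, which is the part this package decodes into tensors.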

This package implements a high-performance reader for Example records stored in TFRecord files.

Examples are loaded into native PyTorch Tensors.

Installation

Prebuilt wheels are available for Linux (glibc 2.17 or newer, x86-64 and ARM64) with Python 3.8 or higher:

pip3 install rustfrecord
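Before installing, it can help to check whether the current interpreter matches one of the published wheels. The following is a rough sketch based only on the wheel tags listed for version 0.1.7 (Linux, x86-64 or ARM64, CPython 3.8+). The function name `wheel_available` is hypothetical, and the check does not inspect the glibc version.

```python
import platform
import sys


def wheel_available(system: str, machine: str, version: tuple) -> bool:
    """Return True if the platform matches the prebuilt wheels listed for 0.1.7."""
    return (
        system == "Linux"
        and machine in {"x86_64", "aarch64"}
        and tuple(version[:2]) >= (3, 8)
    )


ok = wheel_available(platform.system(), platform.machine(), sys.version_info)
print("prebuilt wheel expected" if ok else "a source build may be needed")
```

When no wheel matches, pip would presumably fall back to the source distribution, which needs the toolchain described under Development.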

Getting Started

The Reader class reads a TFRecord file and, when iterated, yields one Dict[str, Tensor] object per record, keyed by feature name.

import torch
from torch import Tensor
from rustfrecord import Reader

# Path to a gzip-compressed TFRecord file
filename = "data/002scattered.training_examples.tfrecord.gz"
r = Reader(filename, compressed=True)

# Each iteration yields the features of one record
for i, features in enumerate(r):
    print(features.keys())
    # ['variant_type', 'image/encoded', 'image/shape',
    #  'variant/encoded', 'label', 'alt_allele_indices/encoded',
    #  'locus', 'sequencing_type']

    label: Tensor = features['label']
    # 'image/shape' stores the image dimensions
    shape = torch.Size(tuple(features['image/shape']))
    # Take the first entry of 'image/encoded' and reshape it to those dimensions
    image: Tensor = features['image/encoded'][0].reshape(shape)

    print(i, label, image.shape)
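
Because a Reader is consumed with an ordinary for loop, generic iterator helpers can be layered on top of it. The sketch below groups consecutive items into fixed-size lists. `batched` is a hypothetical helper, not part of rustfrecord, and it is demonstrated on plain dictionaries standing in for the feature dictionaries so that it runs without a TFRecord file.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of up to batch_size consecutive items; the last may be shorter."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-in records; with rustfrecord these would come from a Reader instead.
fake_records = ({"label": i} for i in range(5))
batches = list(batched(fake_records, 2))
print([len(b) for b in batches])  # [2, 2, 1]
```

With a real file, `batched(Reader(filename, compressed=True), 32)` would be the analogous call, assuming the Reader behaves like any other iterable, as the loop above suggests.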

Development

To develop this package (not just use it), you need to install the Rust compiler and the Python development headers.

pip install uv
uv venv
source .venv/bin/activate

uv pip compile pyproject.toml -o requirements.txt
uv pip install -r requirements.txt

export LIBTORCH_USE_PYTORCH=1
CARGO_TARGET_DIR=target_maturin maturin develop

python main.py

Download files

Source Distribution

rustfrecord-0.1.7.tar.gz (16.4 kB)

Uploaded: Source

Built Distributions

rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (338.8 kB)

Uploaded: CPython 3.8+, manylinux: glibc 2.17+, x86-64

rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (311.5 kB)

Uploaded: CPython 3.8+, manylinux: glibc 2.17+, ARM64

File details

Details for the file rustfrecord-0.1.7.tar.gz.

File metadata

  • Download URL: rustfrecord-0.1.7.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.5.1

File hashes

Hashes for rustfrecord-0.1.7.tar.gz
Algorithm    Hash digest
SHA256       9b4440a8af1f1c0996feca8dfb8b2d06f8030eba998a59de86f57641bcb35fb4
MD5          c3784d5128c0fd2798a598ceac573411
BLAKE2b-256  64dcd1eb37cff911709b9d57da592a9631b3ef29fa32bcefe78ab5729d66179a

See more details on using hashes here.

File details

Details for the file rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm    Hash digest
SHA256       497cc098c48844a09c9923be0df7faa5126bdce5c3bc2206b8173cf99ff4c634
MD5          2c62a0291e0796a1b3d38411133ad750
BLAKE2b-256  f008d4cc9f7f7c3737b9fcb4827c556cb62b0190a2ddb31b6cf8fc301583baa3

File details

Details for the file rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for rustfrecord-0.1.7-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm    Hash digest
SHA256       14e9b5f3493045fcb6e82913f0e7a4033d1e38f37919ca0713c1147eca46f276
MD5          4678c1c73d85e935e41267c4ec267504
BLAKE2b-256  cdd3cb2b0b760a3ab0fe347c7dcf8c26e5d540eb5f1df88a8570c6f610c14ae5
