tar file index for constant-time member access

itar

itar builds indexes over one or more tar file shards, giving constant-time random access to individual members without extracting the archives. It ships with a lightweight CLI (itar) and a Python API.

Designed for large datasets and deep‑learning pipelines, it supports single or sharded tar archives with thread‑safe access for concurrent reads.
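
To make the idea of an index concrete, the following is a minimal sketch that uses only Python's standard `tarfile` module. It is not itar's implementation or on-disk format: the helper names `build_index`, `read_member`, and `demo` are hypothetical, and the sketch assumes an uncompressed archive containing ordinary (non-sparse) files. It scans the archive once, records each member's data offset and size, and then serves a read with one dictionary lookup, one seek, and one read.

```python
import io
import os
import tarfile
import tempfile


def build_index(tar_path):
    """Scan an uncompressed tar once; map member name -> (data offset, size)."""
    index = {}
    with tarfile.open(tar_path, "r:") as tar:  # "r:" rejects compressed archives
        for member in tar:
            if member.isfile():
                # offset_data is where the member's bytes start in the tar file.
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(tar_path, index, name):
    """Fetch one member with a dict lookup, one seek, and one read."""
    offset, size = index[name]
    with open(tar_path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def demo():
    """Build a tiny tarball in a temp directory and read a member back."""
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = os.path.join(tmp, "hello.tar")
        payload = b"Hello world!\n"
        with tarfile.open(tar_path, "w") as tar:
            info = tarfile.TarInfo("hello.txt")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
        index = build_index(tar_path)
        return read_member(tar_path, index, "hello.txt")
```

The cost of the scan is paid once when the index is built. After that, the time to locate a member no longer depends on its position in the archive, which is what makes random access practical for large datasets.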

Quickstart

pip install itar

Single tarball

echo "Hello world!" > hello.txt
tar cf hello.tar hello.txt       # regular tarball

itar index create hello.itar     # indexes hello.tar
itar index list hello.itar       # list indexed members

import itar

with itar.open("hello.itar") as archive:
    print(archive["hello.txt"].read())

Sharded tarballs

Give each shard a zero-padded numeric suffix before building the index (the two shards here use a single digit):

tar cf photos-0.tar wedding/    # shard 0
tar cf photos-1.tar vacation/   # shard 1

itar index create photos.itar   # discovers photos-0.tar, photos-1.tar, ...
itar index list -l photos.itar  # shard index, offsets, byte sizes

import itar

with itar.open("photos.itar") as photos:
    assert "wedding/cake.jpg" in photos
    img_bytes = photos["vacation/sunrise.jpg"].read()
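
The long listing above reports a shard index, offset, and byte size per member. As a rough illustration of how those three fields are enough to serve concurrent reads across shards, here is a standard-library sketch. It is an assumption-laden stand-in, not itar's API or file format: the `ShardedIndex` class, its methods, and the `demo` helper are hypothetical names, and the per-shard lock is just one simple way to keep seek-and-read pairs from interleaving between threads.

```python
import io
import os
import tarfile
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class ShardedIndex:
    """Hypothetical sketch: member name -> (shard number, data offset, size)."""

    def __init__(self, shard_paths):
        self.shard_paths = list(shard_paths)
        self.entries = {}
        for shard, path in enumerate(self.shard_paths):
            with tarfile.open(path, "r:") as tar:  # uncompressed shards only
                for member in tar:
                    if member.isfile():
                        self.entries[member.name] = (
                            shard, member.offset_data, member.size)
        # One shared handle and one lock per shard.
        self._files = [open(p, "rb") for p in self.shard_paths]
        self._locks = [threading.Lock() for _ in self.shard_paths]

    def __contains__(self, name):
        return name in self.entries

    def read(self, name):
        shard, offset, size = self.entries[name]
        with self._locks[shard]:  # seek + read must not interleave across threads
            f = self._files[shard]
            f.seek(offset)
            return f.read(size)

    def close(self):
        for f in self._files:
            f.close()


def _write_shard(path, name, payload):
    """Write a one-member uncompressed tar shard (test data only)."""
    with tarfile.open(path, "w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))


def demo():
    """Create two placeholder shards and read from them on four threads."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"photos-{i}.tar") for i in range(2)]
        _write_shard(paths[0], "wedding/cake.jpg", b"cake-bytes")
        _write_shard(paths[1], "vacation/sunrise.jpg", b"sunrise-bytes")
        index = ShardedIndex(paths)
        try:
            names = ["wedding/cake.jpg", "vacation/sunrise.jpg"] * 4
            with ThreadPoolExecutor(max_workers=4) as pool:
                results = list(pool.map(index.read, names))
        finally:
            index.close()
        return index.entries, results
```

A lock per shard keeps the sketch portable. On POSIX systems, `os.pread` would avoid the lock entirely because it reads at an explicit offset without touching the shared file position.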

Docs

Full CLI, API, and format details are available on the documentation site.
