provides functions to cut and merge file-objects in python

Project description

snipfile

snipfile is a python package that aims to provide easy-to-use primitives for transforming binary files.

It was written when I needed a parser for different sections of binary files. To avoid calculating tons of relative offsets all the time (and out of a desire to write more readable code), I decided to implement the Slice class. The other stuff quickly followed.

Classes + functions

  • class File(f: fileobj) wraps other file-like objects (e.g. files opened using open(), io.BytesIO(), etc.)
    we rely on the following methods and properties to be present: .read(), .seek(), .tell(), .name
  • class Slice(f: Filelike, *, offset:int, size:int) provides a smaller window into a file. seek() and tell() use positions relative to .offset, and read() won't read beyond the Slice's size
  • cutAt(f: Filelike, *positions:int) takes a File/Slice/... and returns a list of Slices pointing to sections of it (with the positions you provided as boundaries)
    e.g. cutAt(f, 5) returns [Slice(offset=0, size=5), Slice(offset=5, size=f.size()-5)]
  • split(f: Filelike, delimiter:bytes) cuts a file whenever it sees the delimiter (similar to b'hello\nworld\n'.split(b'\n'))
  • splitAfter(f: Filelike, delimiter:bytes) similar to split(), but keeps the delimiter at the end of each Slice
  • join(*parts: Filelike) returns a JoinedFile object that behaves the same way as a file containing each of the parts in that order would (join() can be seen as the inverse of cutAt() or splitAfter())
  • punchHole(f:Filelike, start:int, size/end:int) uses cutAt() and join() to remove a section of the file
  • class Filelike serves as the base class for the others (File, Slice, ...) ...
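
The windowing idea behind Slice and cutAt() can be sketched with the standard library alone. This is not snipfile's code: WindowSketch and cut_windows are hypothetical names invented for this illustration, and details such as clamping in seek() are assumptions. It only shows positions relative to an offset, reads that stop at the window's size, and cut positions used as boundaries. The last lines show how plain bytes behave for the split() comparison above; whether snipfile's split() also yields a trailing empty piece is not stated here.

```python
import io


class WindowSketch:
    """Hypothetical stand-in for a Slice-like window; not snipfile's code."""

    def __init__(self, f, *, offset: int, size: int):
        self._f = f            # underlying file-like object
        self._offset = offset  # where the window starts in the underlying file
        self._size = size      # number of bytes visible through the window
        self._pos = 0          # current position, relative to the window start

    def tell(self) -> int:
        return self._pos

    def seek(self, pos: int) -> int:
        # Positions are relative to the window; clamping is an assumption.
        self._pos = max(0, min(pos, self._size))
        return self._pos

    def read(self, n: int = -1) -> bytes:
        remaining = self._size - self._pos
        if n < 0 or n > remaining:
            n = remaining  # never read beyond the end of the window
        # Reposition the underlying file before every read.
        self._f.seek(self._offset + self._pos)
        data = self._f.read(n)
        self._pos += len(data)
        return data


def cut_windows(f, total_size: int, *positions: int):
    """Return windows over f, using the given positions as boundaries."""
    bounds = [0, *sorted(positions), total_size]
    return [WindowSketch(f, offset=a, size=b - a)
            for a, b in zip(bounds, bounds[1:])]


raw = io.BytesIO(b"hello world")
head, tail = cut_windows(raw, 11, 5)
head_bytes = head.read()             # window with offset=0, size=5
tail_bytes = tail.read()             # window with offset=5, size=6
rejoined = head_bytes + tail_bytes   # concatenating the pieces restores the data

# Plain-bytes behaviour of the two splitting styles:
dropped = b"hello\nworld\n".split(b"\n")               # delimiter removed
kept = b"hello\nworld\n".splitlines(keepends=True)     # delimiter kept
```

Because each window stores only a reference, an offset and a size, cutting copies no file data; bytes are read from the underlying object only when read() is called.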

Characteristics + design considerations

  • snipfile can work with large files. it won't cache data it reads in memory between read() calls;
  • there's no real issue with cutting and joining files into large numbers of chunks;
    e.g. when cutting a 1GB file into 1MB chunks, you will end up with 1000 Slice objects, each pointing at the same underlying File. (it should therefore be possible to write a hex editor with proper insert/remove/undo/redo support without using too much RAM)
    • each Slice consists of nothing more than a pointer to a File object, a start and a size. It simply implements read(), tell(), seek(), ... in a way that behaves like accessing a file that contains only that part of the original file.
  • snipfile only provides read access. You can cut and join slices as you wish, then call .writeTo() to store the modified data back to disk (but don't overwrite the file in-place unless you know what you're doing)
  • we generally assume that the wrapped file doesn't change in size; all our file-like classes will be initialized with position and/or size information, which they then rely on
  • snipfile relies heavily on seek(). it'll call seek on the underlying file before each read() (as it assumes someone else may have messed with the file in the meantime)
  • snipfile development relies on type hints to limit the potential for coding errors
  • note: at this moment, our Filelike classes don't provide a close() method (and also lack __enter__() or __exit__()). As there could be any number of objects pointing to the same file, we don't make assumptions on what should happen if you e.g. call close() on a Slice (it could be the only object pointing to that file, but there's simply no way to tell)
    make sure to call f.close() on the original file-like (if applicable)
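
The memory and seek() considerations above can be illustrated with a small standard-library sketch. It does not use snipfile: chunk_table and write_parts are hypothetical helpers, and the (offset, size) tuples merely stand in for Slice objects. The sketch assumes the source does not change size, seeks before every read, copies through a bounded buffer, and leaves closing the original file to the caller (here via with).

```python
import io


def chunk_table(total_size: int, chunk_size: int):
    """Describe a file as (offset, size) pairs; no file data is read."""
    return [(off, min(chunk_size, total_size - off))
            for off in range(0, total_size, chunk_size)]


def write_parts(src, parts, dst, bufsize: int = 64 * 1024) -> int:
    """Stream the described parts of src into dst, in order."""
    written = 0
    for offset, size in parts:
        done = 0
        while done < size:
            # Seek before every read: someone else may have moved src.
            src.seek(offset + done)
            data = src.read(min(bufsize, size - done))
            if not data:
                break  # source is shorter than described; stop this part
            dst.write(data)
            done += len(data)
        written += done
    return written


# A 1GB file described as 1MB chunks: 1000 small tuples, no data in memory.
table = chunk_table(10**9, 10**6)

# Small demonstration: write everything except bytes 5..6 (a "punched hole").
with io.BytesIO(b"hello, world") as src, io.BytesIO() as dst:
    parts = [(0, 5), (7, 5)]   # skip the ", " in the middle
    count = write_parts(src, parts, dst, bufsize=4)
    result = dst.getvalue()    # read the output before dst is closed
```

Writing to a separate destination, as done here, avoids the in-place overwrite hazard: the parts still refer to offsets in the source, so changing the source while copying would invalidate them.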

Download files

Source Distribution

snipfile-0.2.2.tar.gz (8.8 kB view details)

Uploaded Source

Built Distribution

snipfile-0.2.2-py3-none-any.whl (9.7 kB view details)

Uploaded Python 3

File details

Details for the file snipfile-0.2.2.tar.gz.

File metadata

  • Download URL: snipfile-0.2.2.tar.gz
  • Upload date:
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for snipfile-0.2.2.tar.gz
Algorithm Hash digest
SHA256 a875cea749aa4e9fd55811ce96ac51bd888ddd1302690fb3f7b7fd05bc06ff86
MD5 73ce630b4498942838f9308d9d001a42
BLAKE2b-256 ddca568b98d88a56ad7889a49ae1dd4d07051f27eadc537f4ed64aeb657222c2

File details

Details for the file snipfile-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: snipfile-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 9.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for snipfile-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 cd23be17d970a527a33feb3b618fc43a6b96e295d8d8a6ed6c5a020602429f79
MD5 70f3c72dc9d966097bc18f7c832bade0
BLAKE2b-256 c907652025dc8fca31964ed985ec0459c1ab91ae2bd6092ffa1917e24b9aba21
