
Byte Stream Representation of Piecewise-constant Array

Documentation

The documentation is available below.

Installation

A Web app of PCA-B-Stream might be available on PythonAnywhere.

This project is published on the Python Package Index (PyPI) at: https://pypi.org/project/pca-b-stream/. It should be installable from Python distribution platforms or Integrated Development Environments (IDEs). Otherwise, it can be installed from a command console using pip:

For all users (after acquiring administrative rights):

  • Installation: pip install pca-b-stream

  • Update: pip install --upgrade pca-b-stream

For the current user (no administrative rights required):

  • Installation: pip install --user pca-b-stream

  • Update: pip install --user --upgrade pca-b-stream

Brief Description

In a Few Words

The PCA-B-Stream project makes it possible to generate a printable byte stream representation of a piecewise-constant NumPy array, and to re-create the array from the byte stream, similarly to what is available as part of the COCO API.

Illustration

In Python:

>>> import pca_b_stream as pcas
>>> import numpy as nmpy
>>> # --- Array creation
>>> array = nmpy.zeros((10, 10), dtype=nmpy.uint8)
>>> array[1, 1] = 1
>>> # --- Array -> Byte stream -> Array
>>> stream = pcas.PCA2BStream(array)
>>> decoding = pcas.BStream2PCA(stream)
>>> # --- Check and print
>>> assert nmpy.array_equal(decoding, array)
>>> print(stream)
b'FnmHoFain+3jtU'

From command line:

pca2bstream some_image_file           # Prints the corresponding byte stream
bstream2pca a_byte_stream a_filename  # Creates an image from the byte stream and stores it

Motivations

The motivations for developing an alternative to existing solutions are:

  • Arrays can be of any dimension (i.e., not just 2-dimensional),

  • Their dtype can be of boolean, integer, or float types,

  • They can contain more than 2 distinct values (i.e., non-binary arrays) as long as the values are integers (potentially stored in a floating-point format though),

  • The byte stream representation is self-contained; in particular, there is no need to keep track of the array shape externally,

  • The byte stream representation contains everything needed to re-create the array exactly as it was instantiated (dtype, endianness, C or Fortran ordering); see note though.

Documentation

Functions

The pca_b_stream module defines the following functions:

  • PCA2BStream
    • Generates the byte stream representation of an array; does not check the array validity (see PCArrayIssues)

    • Input: a Numpy ndarray

    • Output: an object of type bytes

  • BStream2PCA
    • Re-creates the array from its byte stream representation; does not check the stream format validity

    • Input/Output: input and output of PCA2BStream swapped

  • PCArrayIssues
    • Checks whether an array is a valid input for stream representation generation; it is meant to be used before calling PCA2BStream

    • Input: a NumPy ndarray

    • Output: a tuple of issues, each in str format. The tuple is empty if the array is valid.

    • Additional information about what constitutes a valid piecewise-constant array is provided in the section “Motivations”.

  • BStreamDetails

Command Line Scripts

The PCA-B-Stream project defines two command line scripts: pca2bstream and bstream2pca. The former takes a path to an image file as argument, and prints the corresponding byte stream (without the “b” string type prefix). The latter takes a character string and a filename as arguments, in that order, and creates an image file with this name that corresponds to the string interpreted as a byte stream. The file must not already exist.

Byte Stream Format

A byte stream is a base85-encoded stream. Once decoded, it has the following format (in lexicographical order; all characters are in bytes format):

  • one character “0” or “1”: indicates whether the remainder of the stream is in uncompressed or ZLIB-compressed format, respectively; see note on compression. The rest of this description applies to the stream in uncompressed format

  • 3 characters “{E}{T}{O}”:
    • E: endianness among “|”, “<” and “>”

    • T: dtype character code among: "?" + numpy.typecodes["AllInteger"] + numpy.typecodes["Float"]

    • O: enumeration order among “C” (C-ordering) and “F” (Fortran-ordering)

  • one integer for the dimension of the array (1 for vectors, 2 for matrices, 3 for volumes…)

  • one integer per dimension giving the length of the array in that dimension
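The first fields described above can be read with the Python standard library alone. The following is a minimal sketch, not the project's own code: the function name split_envelope is hypothetical, it assumes the base85 variant is the one implemented by base64.b85decode, and it assumes that, for a compressed stream, everything after the flag character is a single ZLIB block. It is applied to the stream printed in the section “Illustration”.

```python
import base64
import zlib


def split_envelope(stream: bytes) -> tuple[str, str, str, bytes]:
    """Return (endianness, dtype code, order, remaining bytes) of a stream.

    Hypothetical helper, written for illustration only.
    """
    decoded = base64.b85decode(stream)
    flag, payload = decoded[:1], decoded[1:]
    if flag == b"1":
        # Assumption: the whole remainder is one ZLIB-compressed block.
        payload = zlib.decompress(payload)
    elif flag != b"0":
        raise ValueError(f"Unexpected compression flag: {flag!r}")
    # The three characters "{E}{T}{O}" come first in the uncompressed format.
    endianness, dtype_code, order = (chr(byte) for byte in payload[:3])
    return endianness, dtype_code, order, payload[3:]


endianness, dtype_code, order, rest = split_envelope(b"FnmHoFain+3jtU")
print(endianness, dtype_code, order)  # | B C
```

Here “|” means that endianness is not applicable (single-byte items), “B” is the NumPy character code of uint8, and “C” stands for C-ordering, which matches the array created in the illustration. The sketch deliberately stops before the integer fields.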

The remainder of the stream is the actual array content.

  • If the array is not all False’s or zeros:
    • one character “0” or “1”: whether the first value in the array is zero (or False) or one (or True)

    • one integer for the length of the run-length representation

    • integers of the run-length representation of the array read in its proper enumeration order

  • If the array is all False’s or zeros:
    • one character “2”

All the integers are encoded by the unsigned LEB128 encoding using the leb128 project.
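For readers unfamiliar with unsigned LEB128, here is a minimal pure-Python sketch of the encoding. The project itself relies on the leb128 package; the two functions below (encode_uleb128 and decode_uleb128) are hypothetical names written for illustration only and are not part of PCA-B-Stream.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("Unsigned LEB128 cannot encode negative integers")
    encoded = bytearray()
    while True:
        low_bits = value & 0x7F  # next group of 7 bits, least significant first
        value >>= 7
        if value:
            encoded.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            encoded.append(low_bits)  # high bit clear: last byte
            return bytes(encoded)


def decode_uleb128(data: bytes, start: int = 0) -> tuple[int, int]:
    """Decode one integer at data[start:]; return (value, next position)."""
    value = 0
    shift = 0
    position = start
    while True:
        byte = data[position]
        position += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, position
        shift += 7


print(encode_uleb128(10))      # b'\n' (a single byte, since 10 < 128)
print(encode_uleb128(624485))  # b'\xe5\x8e&'
```

Values below 128 take a single byte, so small array dimensions and short runs stay compact, while larger values grow by one byte per additional 7 bits.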

For non-boolean arrays with a maximum value of 2 or more, the content part is the concatenation of the sub-contents corresponding to each value between 1 and the maximum value in the array.
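The run-length idea behind the content part can be sketched as follows. This is one plausible reading of the description, under stated assumptions: the array is already flattened in its enumeration order into a plain Python list, each value between 1 and the maximum is handled through its own true/false mask, and the way the sub-contents are actually serialized in the stream is not modeled. The names run_lengths and per_value_runs are hypothetical.

```python
from itertools import groupby


def run_lengths(flags: list[bool]) -> tuple[bool, list[int]]:
    """Return the first flag and the lengths of the consecutive runs."""
    runs = [sum(1 for _ in group) for _, group in groupby(flags)]
    return flags[0], runs


def per_value_runs(values: list[int]) -> dict[int, tuple[bool, list[int]]]:
    """Run-length description of the mask of each value in 1..max(values)."""
    return {
        value: run_lengths([element == value for element in values])
        for value in range(1, max(values) + 1)
    }


# The 10x10 array of the illustration, flattened in C order:
# the single 1 at position [1, 1] lands at flat index 11.
flat = [0] * 100
flat[11] = 1
print(run_lengths([element == 1 for element in flat]))  # (False, [11, 1, 88])

# An array with values 0, 1 and 2: one mask per non-zero value.
print(per_value_runs([0, 0, 1, 1, 2, 0]))
# {1: (False, [2, 2, 2]), 2: (False, [4, 1, 1])}
```

Since the runs alternate between false and true, the first flag together with the run lengths is enough to rebuild each mask.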

Dependencies

The development relies on several packages:

  • Mandatory: dominate, flask, flask-bootstrap4, flask-session, flask-uploads, flask-wtf, imageio, leb128, numpy, si-fi-o, tqdm, wtforms

  • Optional: None

The mandatory dependencies, if any, are installed automatically by pip, if they are not already, as part of the installation of PCA-B-Stream. Python distribution platforms or Integrated Development Environments (IDEs) should also take care of this. The optional dependencies, if any, must be installed independently by following the related instructions, for added functionalities of PCA-B-Stream.

Acknowledgments

The project is developed with PyCharm Community.

The code is formatted by Black, The Uncompromising Code Formatter.

The imports are ordered by isort... your imports, so you don’t have to.
