PGPack format

A storage format for PGCopy dumps: the dump is compressed with LZ4 or ZSTD, or stored uncompressed, and its metadata is compressed with zlib.

PGPack structure

  • header b"PGPACK\n\x00" (8 bytes)
  • zlib.crc32 of the packed metadata, unsigned long integer (4 bytes)
  • length of the zlib-packed metadata, unsigned long integer (4 bytes)
  • zlib-packed metadata
  • compression method, unsigned char (1 byte)
  • packed pgcopy data length, unsigned long long integer (8 bytes)
  • unpacked pgcopy data length, unsigned long long integer (8 bytes)
  • packed pgcopy data
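
The layout above can be read with Python's struct module. The following is a minimal sketch, not pgpack's own implementation. The byte order is not stated above, so big-endian (network order, as used by the PGCopy binary format) is an assumption, and build_container and parse_container are hypothetical helper names.

```python
import io
import struct
import zlib

HEADER = b"PGPACK\n\x00"
# Assumption: big-endian fields; the layout above does not state the byte order.
META_PREFIX = struct.Struct("!2L")   # crc32 of packed metadata, packed metadata length
DATA_PREFIX = struct.Struct("!B2Q")  # compression method, packed length, unpacked length


def build_container(metadata: bytes, method: int, packed: bytes, unpacked_length: int) -> bytes:
    """Assemble the fields in the documented order (illustrative only)."""
    packed_metadata = zlib.compress(metadata)
    return b"".join((
        HEADER,
        META_PREFIX.pack(zlib.crc32(packed_metadata), len(packed_metadata)),
        packed_metadata,
        DATA_PREFIX.pack(method, len(packed), unpacked_length),
        packed,
    ))


def parse_container(fileobj) -> dict:
    """Read the fixed fields and stop at the start of the packed pgcopy data."""
    if fileobj.read(len(HEADER)) != HEADER:
        raise ValueError("not a PGPack header")
    crc, metadata_length = META_PREFIX.unpack(fileobj.read(META_PREFIX.size))
    packed_metadata = fileobj.read(metadata_length)
    if zlib.crc32(packed_metadata) != crc:
        raise ValueError("metadata crc32 mismatch")
    method, packed_length, unpacked_length = DATA_PREFIX.unpack(
        fileobj.read(DATA_PREFIX.size)
    )
    return {
        "metadata": zlib.decompress(packed_metadata),
        "compression_method": method,
        "packed_length": packed_length,
        "unpacked_length": unpacked_length,
        "pgcopy_start": fileobj.tell(),
    }


# Round trip on a synthetic container; 0x02 is the NONE compression method.
blob = build_container(b"example metadata", 0x02, b"pgcopy bytes", 12)
info = parse_container(io.BytesIO(blob))
```

The two length fields let a reader skip to the packed data, or allocate an output buffer, without decompressing anything first.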

Installation

From pip

pip install pgpack

From local directory

pip install .

From git

pip install git+https://github.com/0xMihalich/pgpack

Metadata format

The metadata for a PGCopy dump contains the column names and type OIDs.

Decompressed metadata structure

list[
    list[
        column number int,
        list[
            column name str,
            column oid int,
            column lengths int,
            column scale int,
            column nested int,
        ]
    ]
]
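
As a sketch of how this structure can be consumed, the snippet below pulls the column names and OIDs out of an already decoded metadata list. It assumes the metadata has been decompressed and parsed into plain Python lists. The sample values are made up for illustration (OIDs 23 and 25 are PostgreSQL's int4 and text), and split_metadata is a hypothetical helper, not part of pgpack.

```python
# Hypothetical decoded metadata: [column number, [name, oid, lengths, scale, nested]].
metadata = [
    [2, ["name", 25, -1, 0, 0]],
    [1, ["id", 23, 4, 0, 0]],
]


def split_metadata(metadata):
    """Return (column names, column OIDs) ordered by column number."""
    ordered = sorted(metadata, key=lambda item: item[0])
    columns = [column[0] for _, column in ordered]
    oids = [column[1] for _, column in ordered]
    return columns, oids


columns, oids = split_metadata(metadata)
```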

Compression methods

  • NONE (value = 0x02) PGCopy dump without compression
  • LZ4 (value = 0x82) PGCopy dump with lz4 compression
  • ZSTD (value = 0x90) PGCopy dump with zstd compression
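
To illustrate how the one-byte method field maps onto these values, here is a small stand-in enum. Method and method_from_byte are hypothetical names used only for this sketch; they mirror the values listed above but are not pgpack's CompressionMethod class.

```python
from enum import Enum


class Method(Enum):
    """Stand-in for the documented values, not pgpack's CompressionMethod."""

    NONE = 0x02
    LZ4 = 0x82
    ZSTD = 0x90


def method_from_byte(raw: bytes) -> Method:
    """Decode the single compression-method byte; unknown values raise ValueError."""
    return Method(raw[0])
```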

Getting the enum to set the compression method

from pgpack import CompressionMethod

compression_method = CompressionMethod.NONE  # no compression
compression_method = CompressionMethod.LZ4  # lz4 compression
compression_method = CompressionMethod.ZSTD  # zstd compression (PGPackWriter default)

Class PGPackReader

Initialization parameters

  • fileobj - BufferedReader object (file, BytesIO, etc.)

Methods and attributes

  • metadata - metadata as bytes
  • columns - list of column names
  • pgtypes - list of PGOid values for all columns
  • pgparam - list of PGParam values for all columns
  • pgcopy_compressed_length - integer, packed pgcopy data length
  • pgcopy_data_length - integer, unpacked pgcopy data length
  • compression_method - CompressionMethod object
  • compression_stream - BufferedReader object for decompressing the data
  • pgcopy_start - integer offset of the start of the compressed pgcopy data
  • pgcopy - PGCopyReader object
  • to_rows() - reads the uncompressed PGCopy data as a generator of Python objects
  • to_pandas() - reads the uncompressed PGCopy data as a pandas.DataFrame
  • to_polars() - reads the uncompressed PGCopy data as a polars.DataFrame
  • to_bytes() - reads the uncompressed PGCopy data as a generator of bytes
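
A minimal usage sketch for the reader, built only from the names listed above. It assumes that PGPackReader can be imported from the top-level pgpack package and that a file opened in "rb" mode is an acceptable fileobj. preview_rows is a hypothetical helper name.

```python
from itertools import islice


def preview_rows(path, limit=5):
    """Return the column names and the first `limit` rows of a PGPack file."""
    # Imported lazily so that the helper can be defined without pgpack installed.
    from pgpack import PGPackReader  # assumption: exported at package level

    with open(path, "rb") as fileobj:
        reader = PGPackReader(fileobj)
        # to_rows() is documented as a generator, so islice avoids reading
        # more rows than requested.
        rows = list(islice(reader.to_rows(), limit))
        return reader.columns, rows
```

Consuming the generator inside the with block matters here, since the rows are read from the file object on demand.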

Class PGPackWriter

Initialization parameters

  • fileobj - BufferedWriter object (file, BytesIO, etc.)
  • metadata - metadata as bytes (default is None)
  • compression_method - CompressionMethod object (default is CompressionMethod.ZSTD)

Methods and attributes

  • columns - list of column names
  • pgtypes - list of PGOid values for all columns
  • pgparam - list of PGParam values for all columns
  • pgcopy_compressed_length - integer, packed pgcopy data length (0 on initialization)
  • pgcopy_data_length - integer, unpacked pgcopy data length (-1 on initialization)
  • pgcopy_start - integer offset of the start of the compressed pgcopy data (the current offset on initialization)
  • pgcopy - PGCopyWriter object
  • from_rows(dtype_data) - writes a PGPack file from Python objects. Parameter: dtype_data, an iterable of Python objects
  • from_pandas(data_frame) - writes a PGPack file from a pandas.DataFrame. Parameter: data_frame, a pandas.DataFrame
  • from_polars(data_frame) - writes a PGPack file from a polars.DataFrame. Parameter: data_frame, a polars.DataFrame
  • from_bytes(bytes_data) - writes a PGPack file from bytes. Parameter: bytes_data, an iterable of bytes
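
As a sketch of how the reader and writer might be combined, the helper below copies one PGPack file into another with ZSTD compression. It rests on several assumptions: that both classes are importable from the top-level pgpack package, that the bytes exposed as the reader's metadata attribute are in the form the writer's metadata parameter expects, and that a file opened in "wb" mode satisfies the writer's mode check. recompress is a hypothetical helper name.

```python
def recompress(src_path, dst_path):
    """Copy a PGPack file, rewriting its pgcopy data with ZSTD compression."""
    # Imported lazily so that the helper can be defined without pgpack installed.
    from pgpack import CompressionMethod, PGPackReader, PGPackWriter

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        reader = PGPackReader(src)
        writer = PGPackWriter(
            dst,
            metadata=reader.metadata,  # assumption: reusable as-is by the writer
            compression_method=CompressionMethod.ZSTD,
        )
        # to_bytes() yields uncompressed chunks and from_bytes() accepts an
        # iterable of bytes, so the data is streamed chunk by chunk.
        writer.from_bytes(reader.to_bytes())
```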

Errors

  • PGPackError - base PGPack error
  • PGPackHeaderError - invalid header signature
  • PGPackMetadataCrcError - metadata crc32 mismatch
  • PGPackModeError - wrong file object mode
