FaPyc

A Python wrapper for the FAPEC data compressor. (C) DAPCOM Data Services S.L. - https://www.dapcom.es

The free decompression-only library is included; it has some limitations, such as the maximum number of threads and the recovery of corrupted files. Only a 'dummy' compression library is provided: to test the compressor, you can get a free evaluation license at https://www.dapcom.es/get-fapec/. For full licenses, please contact us at fapec@dapcom.es

Usage

Fapyc/Unfapyc support three execution modes; a fourth one (chunk) is currently limited to the FAPEC CLI and C API:

  • File: When invoking Fapyc or Unfapyc on a filename, it will (de)compress the file directly into another file.
  • Buffer: You can load the whole file to (de)compress into e.g. a byte array, and then invoke Fapyc/Unfapyc, which will leave the result in the output buffer. Be careful with large files, since both the input and the output are held in memory.
  • File-to-buffer decompression: You can directly decompress a file (without having to load it beforehand) and leave its decompressed output in a buffer, which you can use afterwards.
  • Chunk: FAPEC internally works in 'chunks' of data, of up to 384MB each, which allows a huge file to be (de)compressed progressively while keeping memory usage under control. This feature is not yet available in Fapyc/Unfapyc.
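
As a rough illustration of how one might choose between the file and buffer modes listed above, the sketch below picks a mode from the file size. The helper `pick_mode` and the threshold `BUFFER_LIMIT_BYTES` are hypothetical names with an arbitrary default; they are not part of fapyc. The fapyc calls appear only in comments and are the ones used in the examples further down.

```python
import os

# Hypothetical threshold (not a fapyc setting): files larger than this are
# handled in file mode instead of being loaded whole into memory.
BUFFER_LIMIT_BYTES = 256 * 1024 * 1024


def pick_mode(path, limit=BUFFER_LIMIT_BYTES):
    """Return "buffer" if the file fits within `limit` bytes, else "file"."""
    return "buffer" if os.path.getsize(path) <= limit else "file"


# Possible usage, relying on the calls shown in the examples below:
#
#   if pick_mode(filename) == "file":
#       uf = Unfapyc(filename + ".fapec")
#       uf.decompress(output=filename + ".dec")   # result goes to a file
#   else:
#       uf = Unfapyc(filename=filename + ".fapec")
#       uf.decompress()                           # result in uf.outputBuffer
```

The right threshold depends on the available memory, keeping in mind that buffer mode holds both the input and the output at the same time.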

Examples

Compress and decompress a file

In this example we use the kmall option of FAPEC, which is suited to KMALL geomaritime data files from Kongsberg Maritime:

from fapyc import Fapyc, Unfapyc

filename = input("Path to KMALL file: ")

print("Preparing to compress %s" % (filename))
# Here we invoke FAPEC to directly run on files,
# so the memory usage will be small (just 10MB or so)
# although it won't allow us to directly access the
# (de)compressed buffers.
f = Fapyc(filename, chunksize = 2048576, blen = 512)
f.compress_kmall()

print("Preparing to decompress %s" % (filename + ".fapec"))
uf = Unfapyc(filename + ".fapec")
uf.decompress(output=filename+".dec")
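
To check that the round trip above restored the original file, one option is to compare checksums of the input and of the decompressed `.dec` file. This is a minimal sketch using only the standard library, assuming the compression option in use is lossless; `file_sha256` and `same_content` are illustrative helper names, not part of fapyc.

```python
import hashlib


def file_sha256(path, block_size=1 << 20):
    """Hash a file in fixed-size blocks so memory usage stays bounded."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b""
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have identical SHA-256 digests."""
    return file_sha256(path_a) == file_sha256(path_b)


# With the variables from the example above:
#   print("Round trip OK:", same_content(filename, filename + ".dec"))
```

Hashing block by block keeps the check consistent with the low memory footprint of the file mode.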

Compress and decompress a buffer

In this example we use the tab option of FAPEC, which typically outperforms gzip and bzip2 on tabulated text data:

from fapyc import Fapyc, Unfapyc

filename = input("Path to file: ")
# Beware - Load the whole file to memory
with open(filename, "rb") as file:
    data = file.read()
f = Fapyc(buffer = data)
# Invoke our tabulated-text compression algorithm
# indicating a comma separator
f.compress_tabtxt(sep1=',')
print("Ratio =", round(float(len(data))/len(f.outputBuffer), 4))

# Now we decompress the buffer
uf = Unfapyc(buffer = f.outputBuffer)
uf.decompress()
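
The ratio computation and a round-trip check for the buffer example can be wrapped in two small helpers. This is only a sketch: `compression_ratio` and `roundtrip_ok` are hypothetical names, and it assumes that `outputBuffer` is a bytes-like object, which the use of `BytesIO(uf.outputBuffer)` in the next example suggests.

```python
def compression_ratio(original, compressed):
    """Return len(original) / len(compressed), rounded to 4 decimals."""
    if len(compressed) == 0:
        # Avoid a ZeroDivisionError if no compressed output was produced
        raise ValueError("compressed buffer is empty")
    return round(len(original) / len(compressed), 4)


def roundtrip_ok(original, restored):
    """Return True if the restored buffer matches the original byte for byte."""
    return bytes(original) == bytes(restored)


# With the variables from the example above:
#   print("Ratio =", compression_ratio(data, f.outputBuffer))
#   print("Round trip OK:", roundtrip_ok(data, uf.outputBuffer))
```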

Decompress a file into a buffer, and do some operations on it

Here we provide a more specific use case, based on the ESA/DPAC Gaia (E)DR3 bulk catalogue (which is publicly available as FAPEC-compressed CSVs). In this example, we decompress one of the files, load its CSV-formatted contents with Pandas, apply some filtering conditions, and generate histograms of the full and filtered data.

from fapyc import Unfapyc
from io import BytesIO
import pandas as pd
import matplotlib.pyplot as plt

filename = input("Path to CSV-FAPEC file: ")

### Option 1: open the file, load it to memory (beware!), and decompress the buffer:
#file = open(filename, "rb")
#data = file.read()
#uf = Unfapyc(buffer = data)

### Option 2: directly decompress from the file into a buffer:
uf = Unfapyc(filename = filename)

# Actual decompressor invocation - same for both options
uf.decompress()

# Regenerate the CSV from the bytes buffer
df = pd.read_csv(BytesIO(uf.outputBuffer), comment="#")

print("Info from the full CSV:")
print(df.info())
# Prepare some nice histograms for all data
plt.subplot(2,2,1)
plt.title("Full CSV: skymap (%d sources)" % df.shape[0])
plt.xlabel("RA")
plt.ylabel("DEC")
print("Getting 2D histogram...")
plt.hist2d(df.ra, df.dec, bins=(100, 100), cmap=plt.cm.jet)
plt.colorbar()
plt.subplot(2,2,2)
plt.title("Full CSV: G dist")
plt.xlabel("G magnitude")
plt.ylabel("Counts")
plt.yscale("log")
print("Getting histogram...")
plt.hist(df.phot_g_mean_mag, bins=(50))

# Now let's repeat, but doing the histogram from only the values that fulfil
# some conditions on some of the CSV fields
print("Loading+filtering CSV...")
iter_csv = pd.read_csv(BytesIO(uf.outputBuffer), comment="#", iterator=True, chunksize=1000)
df = pd.concat((x.query("ra_error < 0.1 & dec_error < 0.1 & ruwe > 0 & ruwe < 5") for x in iter_csv))
print("Info from the filtered CSV:")
print(df.info())
plt.subplot(2,2,3)
plt.title("Filtered CSV: skymap (%d sources)" % df.shape[0])
plt.xlabel("RA")
plt.ylabel("DEC")
print("Getting 2D histogram...")
plt.hist2d(df.ra, df.dec, bins=(100, 100), cmap=plt.cm.jet)
plt.colorbar()
plt.subplot(2,2,4)
plt.title("Filtered CSV: G dist")
plt.xlabel("G magnitude")
plt.ylabel("Counts")
plt.yscale("log")
print("Getting histogram...")
plt.hist(df.phot_g_mean_mag, bins=(50))

print("Plotting!")
plt.show()
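
If Pandas is not available, the same filtering conditions can be applied row by row with the standard `csv` module. The sketch below is an illustration rather than a replacement for the code above: `filter_rows` is a hypothetical helper, the buffer is assumed to be UTF-8 text, and only whole lines starting with `#` are skipped (Pandas' `comment="#"` also strips trailing comments).

```python
import csv
from io import BytesIO, TextIOWrapper


def filter_rows(buffer, max_error=0.1, max_ruwe=5.0, encoding="utf-8"):
    """Yield CSV rows (as dicts) that satisfy the conditions used above:
    ra_error < max_error, dec_error < max_error and 0 < ruwe < max_ruwe."""
    text = TextIOWrapper(BytesIO(buffer), encoding=encoding, newline="")
    # Drop full-line comments before handing the lines to the CSV reader
    lines = (line for line in text if not line.startswith("#"))
    for row in csv.DictReader(lines):
        try:
            ra_error = float(row["ra_error"])
            dec_error = float(row["dec_error"])
            ruwe = float(row["ruwe"])
        except (TypeError, ValueError):
            # Missing or non-numeric values (e.g. "null") cannot pass the filter
            continue
        if ra_error < max_error and dec_error < max_error and 0 < ruwe < max_ruwe:
            yield row


# With the decompressor from the example above:
#   selected = list(filter_rows(uf.outputBuffer))
#   print(len(selected), "sources pass the filter")
```

Because the rows are produced lazily, only the decompressed buffer and the rows you choose to keep are held in memory.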

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • fapyc-0.3.0-cp312-cp312-win_amd64.whl (794.8 kB): CPython 3.12, Windows x86-64
  • fapyc-0.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (908.5 kB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp311-cp311-win_amd64.whl (794.9 kB): CPython 3.11, Windows x86-64
  • fapyc-0.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (904.8 kB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp310-cp310-win_amd64.whl (794.9 kB): CPython 3.10, Windows x86-64
  • fapyc-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (872.8 kB): CPython 3.10, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp39-cp39-win_amd64.whl (802.3 kB): CPython 3.9, Windows x86-64
  • fapyc-0.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (875.5 kB): CPython 3.9, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp38-cp38-win_amd64.whl (802.4 kB): CPython 3.8, Windows x86-64
  • fapyc-0.3.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (880.0 kB): CPython 3.8, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp37-cp37m-win_amd64.whl (802.1 kB): CPython 3.7m, Windows x86-64
  • fapyc-0.3.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (852.3 kB): CPython 3.7m, manylinux (glibc 2.17+) x86-64
  • fapyc-0.3.0-cp36-cp36m-win_amd64.whl (806.2 kB): CPython 3.6m, Windows x86-64
  • fapyc-0.3.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (838.0 kB): CPython 3.6m, manylinux (glibc 2.17+) x86-64
