FaPyc
A Python wrapper for the FAPEC data compressor. (C) DAPCOM Data Services S.L. - https://www.dapcom.es
The full FAPEC compression and decompression library is included in this package, but a valid license file must be available to properly use it.
Without a license, you can still use the decompressor, albeit with some limitations (such as the maximum number of threads, the recovery of corrupted files, or the decompression of just one part of a multi-part archive).
You can get free evaluation licenses at https://www.dapcom.es/get-fapec/ to test the compressor. For full licenses, please contact us at fapec@dapcom.es
Once a valid license is obtained (either full or evaluation), you must define a FAPEC_HOME environment variable pointing to the path where you have stored your fapeclic.dat license file.
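A minimal sketch of checking this setup from Python before invoking the compressor. The helper name `find_license_file` is hypothetical and not part of fapyc; it only assumes, as described above, that FAPEC_HOME points to the directory holding fapeclic.dat.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_license_file(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the path to fapeclic.dat under FAPEC_HOME, or None if missing.

    Hypothetical helper for illustration; it is not part of fapyc.
    """
    home = env.get("FAPEC_HOME")
    if not home:
        # The variable is undefined or empty
        return None
    candidate = Path(home) / "fapeclic.dat"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    lic = find_license_file()
    if lic is None:
        print("FAPEC_HOME is not set or does not contain fapeclic.dat")
    else:
        print("License file found at", lic)
```

This only checks that the file exists; whether the license is valid is still decided by FAPEC itself.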
Usage
There are 4 execution modes, of which the first 3 are currently available in this wrapper:
- File: When invoking Fapyc or Unfapyc on a filename, it will (de)compress it directly into another file.
- Buffer: You can load the whole file to (de)compress into e.g. a byte array, and then invoke Fapyc/Unfapyc, which will leave the result in the output buffer. Be careful with large files, as this may use a lot of RAM.
- File-to-buffer decompression: You can directly decompress a file (without having to load it beforehand) and leave its decompressed output in a buffer, which you can use afterwards.
- Chunk: FAPEC internally works in 'chunks' of data, typically 1-8 MB each (and up to 384 MB each), which allows progressively (de)compressing a huge file while keeping memory usage under control. For now, this feature is only available in the FAPEC CLI, in WinFAPEC and in the C API, not yet in Fapyc/Unfapyc.
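To illustrate why chunking keeps memory usage bounded, here is a generic sketch of chunk-wise reading in plain Python. It does not use or represent the FAPEC API (chunk mode is not available in Fapyc/Unfapyc, as noted above); the name `iter_chunks` and the 1 MiB default are illustrative assumptions.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield successive blocks of at most chunk_size bytes from a binary stream.

    Only one block is held in memory at a time, regardless of the stream size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        yield block


if __name__ == "__main__":
    # Synthetic in-memory stream standing in for a large file
    demo = io.BytesIO(b"x" * 2_500_000)
    print([len(chunk) for chunk in iter_chunks(demo)])
```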
Examples
Compress and decompress a file
In this example we use the kmall option of FAPEC, suitable for this kind of geomaritime data files from Kongsberg Maritime:
from fapyc import Fapyc, Unfapyc, FapecLicense
filename = input("Path to KMALL file: ")
# Here we invoke FAPEC to directly run on files,
# so the memory usage will be small (just 10MB or so)
# although it won't allow us to directly access the
# (de)compressed buffers.
f = Fapyc(filename, chunksize = 2048576, blen = 512)
# Check that we have a valid license
lt = f.fapyc_get_lic_type()
if lt >= 0:
    ln = FapecLicense(lt).name
    lo = f.fapyc_get_lic_owner()
    print("FAPEC",ln,"license granted to",lo)
    f.compress_kmall()
    # Let's now decompress it, as a check
    print("Preparing to decompress %s" % (filename + ".fapec"))
    uf = Unfapyc(filename + ".fapec")
    uf.decompress(output=filename+".dec")
else:
    print("No valid license found")
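The example above decompresses the file "as a check" but does not compare the result with the original. A minimal sketch of such a comparison follows, using only the standard library. The helper names are hypothetical, and the check assumes that the compression settings used are lossless, so that the round trip should be byte-identical.

```python
import hashlib


def sha256_of_file(path: str, block_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def files_identical(path_a: str, path_b: str) -> bool:
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

With the names used in the example, the call would be `files_identical(filename, filename + ".dec")`.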
Decompress an image into a buffer and show it
With this example we can view a colour image compressed with FAPEC:
from fapyc import Unfapyc
import numpy as np
from matplotlib import pyplot as plt
filename = input("Path to FAPEC-compressed 8-bit RGB image file: ")
# The API does not yet provide the image dimensions (this will be added soon), so we have to indicate them manually
w,h = input("Width and height (in pixels) of the image (two space-separated values): ").split()
w = int(w)
h = int(h)
# Decompress the file into a byte array buffer
uf = Unfapyc(filename = filename)
uf.decompress()
# Check consistency (image dimensions vs. buffer size)
if len(uf.outputBuffer) != 3*w*h:
    print("Image dimensions inconsistent with file contents!")
else:
    # Reshape this one-dimensional array into a three-dimensional array (height, width, colours) to plot it
    ima = np.reshape(np.frombuffer(uf.outputBuffer, dtype=np.dtype('u1')), (h, w, 3))
    plt.imshow(ima)
    plt.show()
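Since the dimensions must be entered by hand, it can help to list which width/height pairs are compatible with the decompressed buffer size. The following is a small sketch of that idea; `candidate_dimensions` is a hypothetical helper, not part of fapyc, and it assumes 8-bit samples with 3 channels per pixel as in the example above.

```python
from typing import List, Tuple


def candidate_dimensions(nbytes: int, channels: int = 3) -> List[Tuple[int, int]]:
    """List all (width, height) pairs with width * height * channels == nbytes."""
    if nbytes <= 0 or channels <= 0 or nbytes % channels != 0:
        return []
    pixels = nbytes // channels
    pairs = []
    w = 1
    # Only divisors up to sqrt(pixels) need testing; each gives two pairs
    while w * w <= pixels:
        if pixels % w == 0:
            h = pixels // w
            pairs.append((w, h))
            if w != h:
                pairs.append((h, w))
        w += 1
    return sorted(pairs)


if __name__ == "__main__":
    # Buffer size of a hypothetical 640x480 RGB image
    for width, height in candidate_dimensions(640 * 480 * 3):
        print(width, "x", height)
```

An empty result means the buffer size is not a multiple of the channel count, so the file is probably not an 8-bit RGB image.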
Compress and decompress a buffer
In this example we use the tab option of FAPEC, which typically outperforms gzip and bzip2 on tabulated text/numerical data such as point clouds or certain scientific data files:
from fapyc import Fapyc, Unfapyc
filename = input("Path to file: ")
# Beware - this loads the whole file into memory
with open(filename, "rb") as file:
    data = file.read()
f = Fapyc(buffer = data)
# Use 2 threads
f.fapyc_set_nthreads(2)
# Invoke our tabulated-text compression algorithm
# indicating a comma separator
f.compress_tabtxt(sep1=',')
print("Ratio =", round(float(len(data))/len(f.outputBuffer), 4))
# Now we decompress the buffer into another buffer
uf = Unfapyc(buffer = f.outputBuffer)
uf.fapyc_set_useropts(0, 3, 0, 0, 0)
uf.decompress()
print("Decompressed size:", len(uf.outputBuffer))
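To put the FAPEC ratio printed above in context, the same buffer can be compressed with the gzip and bzip2 implementations from the Python standard library. This is only a sketch for comparison purposes: `baseline_ratios` is a hypothetical helper, the default compression levels are used, and the synthetic data in the demo is not representative of any real dataset.

```python
import bz2
import gzip
from typing import Dict


def baseline_ratios(data: bytes) -> Dict[str, float]:
    """Return original/compressed size ratios for gzip and bzip2."""
    if not data:
        raise ValueError("data must not be empty")
    return {
        "gzip": len(data) / len(gzip.compress(data)),
        "bzip2": len(data) / len(bz2.compress(data)),
    }


if __name__ == "__main__":
    # Synthetic comma-separated table, standing in for the file contents
    rows = ["%d,%.3f,%.3f" % (i, i * 0.5, i * 0.25) for i in range(10000)]
    sample = "\n".join(rows).encode("ascii")
    for name, ratio in baseline_ratios(sample).items():
        print(name, "ratio =", round(ratio, 4))
```

In the example above, `baseline_ratios(data)` could be called right after reading the file, and its values compared with the ratio computed from `f.outputBuffer`.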
Decompress a file into a buffer, and do some operations on it
Here we provide a quite specific use case, based on the ESA/DPAC Gaia DR3 bulk catalogue (which is publicly available as FAPEC-compressed CSVs).
In this example, we decompress two of the files, parse their CSV-formatted contents with Pandas while filtering them according to some conditions, and generate some plots.
This is just to illustrate how you can directly work on several compressed files. Note that it may require quite a lot of RAM, perhaps 4GB.
You may need to install pyqt5 with pip.
from fapyc import Unfapyc
from io import BytesIO
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors
import gc
filename = input("Path to GaiaDR3 csv.fapec file: ")
filename2 = input("Path to another GaiaDR3 csv.fapec file: ")
### Option 1: open the file, load it to memory (beware!), and decompress the buffer; it would be like this:
#file = open(filename, "rb")
#data = file.read()
#uf = Unfapyc(buffer = data)
### Option 2: directly decompress from the file into a buffer:
uf = Unfapyc(filename = filename)
# Here we'll use a verbose mode to see the decompression progress
uf.fapyc_set_useropts(2, 3, 0, 0, 0)
uf.fapyc_set_nthreads(2)
# Invoke decompressor
uf.decompress()
# Define our query (filter):
myq = "ra_error < 0.1 & dec_error < 0.1 & ruwe > 0.5 & ruwe < 2"
# Regenerate the CSV from the bytes buffer
print("Decoding and filtering CSV...")
df = pd.read_csv(BytesIO(uf.outputBuffer), comment="#").query(myq)
# Repeat for the 2nd file
uf = Unfapyc(filename = filename2)
uf.fapyc_set_useropts(2, 3, 0, 0, 0)
uf.fapyc_set_nthreads(2)
uf.decompress()
print("Decoding, filtering and joining CSV...")
df = pd.concat([df, pd.read_csv(BytesIO(uf.outputBuffer), comment="#").query(myq)])
# Remove NaNs and nulls from these two columns
df = df[np.isfinite(df['bp_rp'])]
df = df[np.isfinite(df['phot_g_mean_mag'])]
# Delete Unfapyc and force garbage collection, to try to free some memory
del uf
gc.collect()
print("Info from the filtered CSVs:")
print(df.info())
# Prepare some nice histograms for all data
plt.subplot(2,2,1)
plt.title("Skymap (%d sources)" % df.shape[0])
plt.xlabel("RA")
plt.ylabel("DEC")
print("Getting 2D histogram...")
plt.hist2d(df.ra, df.dec, bins=(200, 200), cmap=plt.cm.jet)
plt.colorbar()
plt.subplot(2,2,2)
plt.title("G-mag distribution")
plt.xlabel("G magnitude")
plt.ylabel("Counts")
plt.yscale("log")
print("Getting histogram...")
plt.hist(df.phot_g_mean_mag, bins=(100))
plt.subplot(2,2,3)
plt.title("Colour-Magnitude Diagram")
plt.xlabel("BP-RP")
plt.ylabel("G")
print("Getting 2D histogram...")
plt.hist2d(df.bp_rp, df.phot_g_mean_mag, bins=(100, 100), norm = colors.LogNorm(), cmap=plt.cm.jet)
plt.colorbar()
plt.subplot(2,2,4)
plt.title("Parallax error distribution")
plt.xlabel("G magnitude")
plt.ylabel("Parallax error")
print("Getting 2D histogram...")
plt.hist2d(df.phot_g_mean_mag, df.parallax_error, bins=(100, 100), norm = colors.LogNorm(), cmap=plt.cm.jet)
print("Plotting...")
plt.show()
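If building a full DataFrame takes too much RAM, the same filter can be applied row by row with the standard csv module, keeping only the matching rows. The sketch below is an alternative technique, not what the example above does. `iter_filtered_rows` is a hypothetical helper; it assumes a UTF-8 CSV with a header row containing the ra_error, dec_error and ruwe columns, and it only skips full-line comments (unlike `comment="#"` in Pandas, which also strips trailing comments).

```python
import csv
import io
from typing import Dict, Iterator


def iter_filtered_rows(buffer: bytes) -> Iterator[Dict[str, str]]:
    """Yield CSV rows with ra_error < 0.1, dec_error < 0.1 and 0.5 < ruwe < 2."""
    text = io.TextIOWrapper(io.BytesIO(buffer), encoding="utf-8", newline="")
    # Skip lines starting with '#'; the first remaining line is the header
    lines = (line for line in text if not line.startswith("#"))
    for row in csv.DictReader(lines):
        try:
            ra_err = float(row["ra_error"])
            dec_err = float(row["dec_error"])
            ruwe = float(row["ruwe"])
        except (KeyError, TypeError, ValueError):
            # Missing, empty or non-numeric value: treat the row as not matching
            continue
        if ra_err < 0.1 and dec_err < 0.1 and 0.5 < ruwe < 2:
            yield row


if __name__ == "__main__":
    sample = (
        b"# synthetic data, not real Gaia values\n"
        b"source_id,ra_error,dec_error,ruwe\n"
        b"1,0.05,0.05,1.0\n"
        b"2,0.50,0.05,1.0\n"
    )
    for row in iter_filtered_rows(sample):
        print(row["source_id"])
```

The yielded rows contain strings, so numeric columns still need converting before plotting.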