
Python interface for PFC-JSONL — high-performance log compression with block-level timestamp filtering


pfc-jsonl · Python Package

Python interface for PFC-JSONL — high-performance compression for structured log files (JSONL), with block-level timestamp filtering.

pip install pfc-jsonl

Requires the pfc_jsonl binary. Install it separately — see below.


What is PFC-JSONL?

PFC-JSONL compresses JSONL log files into output that is 25–37% smaller than what gzip or zstd produce on typical log data. It stores a timestamp index alongside each file, so time-range queries can run without decompressing the whole file.

Operation     Description
compress      JSONL → .pfc (with timestamp index)
decompress    .pfc → JSONL
query         Decompress only blocks matching a time range
seek_blocks   Decompress specific blocks by index (DuckDB primitive)
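
To illustrate why a per-block timestamp index avoids full decompression, here is a minimal sketch of block selection. It is not the actual PFC index format: the `BlockRange` class and `blocks_in_range` function are hypothetical, and the sketch simply assumes each block records the smallest and largest timestamp it contains.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockRange:
    """Hypothetical index entry: time span covered by one compressed block."""
    min_ts: int  # Unix epoch seconds of the earliest record in the block
    max_ts: int  # Unix epoch seconds of the latest record in the block


def blocks_in_range(index: List[BlockRange], from_ts: int, to_ts: int) -> List[int]:
    """Return positions of blocks whose time span overlaps [from_ts, to_ts]."""
    return [
        i for i, block in enumerate(index)
        if block.max_ts >= from_ts and block.min_ts <= to_ts
    ]


# Four blocks, each covering 100 seconds (made-up values for illustration).
index = [BlockRange(0, 99), BlockRange(100, 199), BlockRange(200, 299), BlockRange(300, 399)]
print(blocks_in_range(index, 150, 250))  # [1, 2]
```

Only the selected blocks would need to be decompressed; a list like this is the kind of input that seek_blocks takes.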

Quick Start

import pfc

# Compress
pfc.compress("logs/app.jsonl", "logs/app.pfc")

# Decompress
pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")

# Query by time range — only decompresses matching blocks
pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

Install the Binary

The Python package is a thin wrapper — the compression engine is the pfc_jsonl binary.

Linux (x64):

curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 \
     -o pfc_jsonl && chmod +x pfc_jsonl && sudo mv pfc_jsonl /usr/local/bin/

macOS: Coming soon.

Windows: No native binary available. Use WSL2 or a Linux machine.

Custom location: Set the PFC_BINARY environment variable:

export PFC_BINARY=/opt/tools/pfc_jsonl

Verify:

pfc_jsonl --help
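
If you want to check from Python which binary would be picked up before calling the package, a lookup could be sketched as follows. The order shown (PFC_BINARY first, then PATH) is an assumption, and `find_pfc_binary` is a hypothetical helper rather than part of the package; pfc.get_binary() reports the path the package actually uses.

```python
import os
import shutil
from typing import Mapping, Optional


def find_pfc_binary(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a candidate path to pfc_jsonl, or None if nothing is found.

    Assumed order: the PFC_BINARY override first, then a PATH search.
    """
    env = os.environ if env is None else env
    override = env.get("PFC_BINARY")
    if override:
        return override
    # shutil.which searches the given PATH string for an executable.
    return shutil.which("pfc_jsonl", path=env.get("PATH"))


print(find_pfc_binary({"PFC_BINARY": "/opt/tools/pfc_jsonl"}))  # /opt/tools/pfc_jsonl
```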

API Reference

pfc.compress(input_path, output_path, *, level="default", block_size_mb=None, workers=None, verbose=False)

Compress a JSONL file to PFC format.

pfc.compress("logs/app.jsonl", "logs/app.pfc")
pfc.compress("big.jsonl", "big.pfc", level="max", workers=4)

Parameter       Default     Description
level           "default"   "fast", "default", or "max" (also accepts 1-5)
block_size_mb   auto        Block size in MiB (power of 2, e.g. 16, 32)
workers         auto        Parallel compression workers
verbose         False       Print progress from binary
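
Since block_size_mb is expected to be a power of 2, it can help to normalize a requested size before passing it in. The helpers below are a hypothetical sketch and not part of the package; how the binary reacts to other values is not documented here.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def round_up_block_size_mb(requested_mb: int) -> int:
    """Round a requested block size in MiB up to the next power of two."""
    if requested_mb < 1:
        raise ValueError("block size must be at least 1 MiB")
    return 1 << (requested_mb - 1).bit_length()


print(round_up_block_size_mb(16))  # 16
print(round_up_block_size_mb(20))  # 32
```

The result could then be passed as block_size_mb to pfc.compress.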

pfc.decompress(input_path, output_path="-", *, verbose=False)

Decompress a PFC file back to JSONL.

pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")

pfc.query(pfc_path, from_ts, to_ts, output_path="-")

Decompress only the blocks matching a timestamp range.

pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

Timestamps can be ISO 8601 strings or Unix epoch integers (as strings).
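
When the range comes from datetime objects, both accepted forms can be produced with the standard library. This is a small sketch with hypothetical helper names; it assumes timezone-aware UTC datetimes, and how the binary interprets an ISO string without a UTC offset is not stated here.

```python
from datetime import datetime, timezone


def to_iso(dt: datetime) -> str:
    """Format as an ISO 8601 string without offset, like the examples above."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def to_epoch_str(dt: datetime) -> str:
    """Format as Unix epoch seconds, passed as a string."""
    return str(int(dt.timestamp()))


start = datetime(2026, 1, 15, 8, 0, 0, tzinfo=timezone.utc)
print(to_iso(start))        # 2026-01-15T08:00:00
print(to_epoch_str(start))
```

Either string could be used as from_ts or to_ts in pfc.query.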


pfc.seek_blocks(pfc_path, blocks, output_path="-", *, verbose=False)

Decompress specific blocks by index. Used internally by the DuckDB extension.

pfc.seek_blocks("logs/app.pfc", [0, 3, 7], "logs/selected.jsonl")


pfc.get_binary() -> str

Return the path to the pfc_jsonl binary being used.

print(pfc.get_binary())  # /usr/local/bin/pfc_jsonl


Error Handling

import pfc
from pfc import PFCError

try:
    pfc.compress("missing.jsonl", "out.pfc")
except FileNotFoundError as e:
    print(f"Binary not found: {e}")
except PFCError as e:
    print(f"Compression failed (exit {e.returncode}): {e.stderr}")
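
For readers curious how a thin wrapper can surface an exit code and stderr the way the example above reads them from PFCError, here is a generic sketch. It is not the package's implementation: `ToolError` and `run_tool` are hypothetical names, and the Python interpreter stands in for the binary so the snippet runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


class ToolError(RuntimeError):
    """Hypothetical stand-in for an error type that carries exit code and stderr."""

    def __init__(self, returncode: int, stderr: str):
        super().__init__(f"exit {returncode}: {stderr.strip()}")
        self.returncode = returncode
        self.stderr = stderr


def run_tool(args: Sequence[str]) -> str:
    """Run a command and return its stdout; raise ToolError on a non-zero exit."""
    # subprocess.run itself raises FileNotFoundError if the executable is missing.
    proc = subprocess.run(list(args), capture_output=True, text=True)
    if proc.returncode != 0:
        raise ToolError(proc.returncode, proc.stderr)
    return proc.stdout


# Simulate a failing tool that writes to stderr and exits with code 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('bad input'); sys.exit(3)"]
try:
    run_tool(failing)
except ToolError as e:
    print(e.returncode, e.stderr)  # 3 bad input
```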

Integration with Fluent Bit

Use pfc-fluentbit to receive logs from Fluent Bit and compress them automatically.

Integration with DuckDB

Use the pfc DuckDB extension to query .pfc files directly with SQL:

INSTALL pfc FROM community;
LOAD pfc;
LOAD json;
SELECT line->>'$.level' AS level, line->>'$.message' AS msg
FROM read_pfc_jsonl('logs/app.pfc')
WHERE line->>'$.level' = 'ERROR';

License

MIT — see LICENSE

The PFC-JSONL binary is proprietary software — free for personal and open-source use. Commercial use requires a license: impossibleforge@gmail.com

