
Python interface for PFC-JSONL — high-performance log compression with block-level timestamp filtering

pfc-jsonl · Python Package

Python interface for PFC-JSONL — high-performance compression for structured log files (JSONL), with block-level timestamp filtering.

pip install pfc-jsonl

Requires the pfc_jsonl binary. Install it separately — see below.


What is PFC-JSONL?

PFC-JSONL compresses JSONL log files into output that is 25–37% smaller than what gzip or zstd produce on typical log data. It stores a timestamp index alongside each file, so a time-range query only has to decompress the blocks that can contain matching entries.

Operation     Description
compress      JSONL → .pfc (with timestamp index)
decompress    .pfc → JSONL
query         Decompress only blocks matching a time range
seek_blocks   Decompress specific blocks by index (DuckDB primitive)
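
To illustrate why a per-block timestamp index avoids full decompression, here is a minimal conceptual sketch. It is not the PFC on-disk format or the binary's algorithm: the BlockRange class, the blocks_overlapping function, and the toy index values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRange:
    """Hypothetical index entry: one compressed block and its time span."""
    index: int    # position of the block in the file
    min_ts: int   # earliest timestamp in the block (Unix seconds)
    max_ts: int   # latest timestamp in the block (Unix seconds)


def blocks_overlapping(index, from_ts, to_ts):
    """Return indices of blocks whose time span intersects [from_ts, to_ts]."""
    return [b.index for b in index if b.max_ts >= from_ts and b.min_ts <= to_ts]


# Toy index for a file with four blocks (values invented for the example)
toy_index = [
    BlockRange(0, 1000, 1999),
    BlockRange(1, 2000, 2999),
    BlockRange(2, 3000, 3999),
    BlockRange(3, 4000, 4999),
]

selected = blocks_overlapping(toy_index, 2500, 3200)
print(selected)  # [1, 2]
```

Only blocks 1 and 2 would need to be decompressed for this range; blocks 0 and 3 can be skipped by looking at the index alone.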

Quick Start

import pfc

# Compress
pfc.compress("logs/app.jsonl", "logs/app.pfc")

# Decompress
pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")

# Query by time range — only decompresses matching blocks
pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

Install the Binary

The Python package is a thin wrapper — the compression engine is the pfc_jsonl binary.

Linux (x64):

curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 \
     -o pfc_jsonl && chmod +x pfc_jsonl && sudo mv pfc_jsonl /usr/local/bin/

macOS (Apple Silicon M1/M2/M3/M4):

curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-macos-arm64 \
     -o pfc_jsonl && chmod +x pfc_jsonl && sudo mv pfc_jsonl /usr/local/bin/

macOS Intel (x64): coming soon.

Windows: No native binary available. Use WSL2 or a Linux machine.

Custom location: Set the PFC_BINARY environment variable:

export PFC_BINARY=/opt/tools/pfc_jsonl

Verify:

pfc_jsonl --help
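
If you want to check from Python that the binary can be found before running a job, a lookup along these lines may help. This is a sketch under assumptions: find_pfc_binary is a hypothetical helper, and the order shown (PFC_BINARY first, then PATH) is a guess at reasonable behavior, not a description of how the package resolves the binary. pfc.get_binary() is the documented way to ask the package itself.

```python
import os
import shutil


def find_pfc_binary(env=None):
    """Locate pfc_jsonl: PFC_BINARY override first, then PATH.

    Returns the path as a string, or None if nothing usable is found.
    """
    env = os.environ if env is None else env
    override = env.get("PFC_BINARY")
    if override:
        # Accept the override only if it points at an executable file
        if os.path.isfile(override) and os.access(override, os.X_OK):
            return override
        return None
    # Fall back to a PATH search for the default binary name
    return shutil.which("pfc_jsonl", path=env.get("PATH"))


if __name__ == "__main__":
    location = find_pfc_binary()
    print(location if location else "pfc_jsonl not found; see the install steps")
```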

API Reference

pfc.compress(input_path, output_path, *, level="default", block_size_mb=None, workers=None, verbose=False)

Compress a JSONL file to PFC format.

pfc.compress("logs/app.jsonl", "logs/app.pfc")
pfc.compress("big.jsonl", "big.pfc", level="max", workers=4)

Parameter       Default     Description
level           "default"   "fast", "default", or "max" (also accepts 1-5)
block_size_mb   auto        Block size in MiB (power of 2, e.g. 16, 32)
workers         auto        Parallel compression workers
verbose         False       Print progress from binary
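
Since block_size_mb is documented as a power of 2, it can be worth validating the value before handing it to the binary. The sketch below is a hypothetical helper (is_power_of_two and compress_kwargs are not part of the package); it only builds keyword arguments and leaves unset options out so the defaults stay "auto".

```python
def is_power_of_two(n):
    """True for positive integers with exactly one bit set (1, 2, 4, 8, ...)."""
    return isinstance(n, int) and n > 0 and (n & (n - 1)) == 0


def compress_kwargs(level="default", block_size_mb=None, workers=None):
    """Validate options and return keyword arguments for pfc.compress."""
    kwargs = {"level": level}
    if block_size_mb is not None:
        if not is_power_of_two(block_size_mb):
            raise ValueError(
                f"block_size_mb must be a power of 2, got {block_size_mb!r}"
            )
        kwargs["block_size_mb"] = block_size_mb
    if workers is not None:
        if workers < 1:
            raise ValueError(f"workers must be at least 1, got {workers!r}")
        kwargs["workers"] = workers
    return kwargs


options = compress_kwargs(level="max", block_size_mb=32, workers=4)
print(options)  # {'level': 'max', 'block_size_mb': 32, 'workers': 4}

# Intended use (requires the package and the binary):
#   import pfc
#   pfc.compress("big.jsonl", "big.pfc", **options)
```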

pfc.decompress(input_path, output_path="-", *, verbose=False)

Decompress a PFC file back to JSONL.

pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")
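
One way to sanity-check a compress/decompress round trip is to compare digests of the original and restored files. This sketch assumes the round trip is expected to be byte-exact, which this page does not state explicitly; file_sha256 and same_content are hypothetical helpers, not part of the package.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical bytes (compared via SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)


# Intended use after a round trip (requires the package and the binary):
#   pfc.compress("logs/app.jsonl", "logs/app.pfc")
#   pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")
#   assert same_content("logs/app.jsonl", "logs/app_restored.jsonl")
```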

pfc.query(pfc_path, from_ts, to_ts, output_path="-")

Decompress only the blocks matching a timestamp range.

pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

Timestamps can be ISO 8601 strings or Unix epoch integers (as strings).
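
If your range starts out as datetime objects, both accepted forms can be produced with the standard library. The helper names below are hypothetical. How the binary interprets an ISO string without a UTC offset is not stated here, so the epoch form may be the safer choice when time zones matter.

```python
from datetime import datetime, timezone


def to_iso_seconds(dt):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS, matching the examples above."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def to_epoch_string(dt):
    """Format a timezone-aware datetime as Unix epoch seconds, as a string."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    return str(int(dt.timestamp()))


start = datetime(2026, 1, 15, 8, 0, 0, tzinfo=timezone.utc)
print(to_iso_seconds(start))   # 2026-01-15T08:00:00
print(to_epoch_string(start))  # 1768464000
```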


pfc.seek_blocks(pfc_path, blocks, output_path="-", *, verbose=False)

Decompress specific blocks by index. Used internally by the DuckDB extension.

pfc.seek_blocks("logs/app.pfc", [0, 3, 7], "logs/selected.jsonl")


pfc.get_binary() -> str

Return the path to the pfc_jsonl binary being used.

print(pfc.get_binary())  # /usr/local/bin/pfc_jsonl


Error Handling

import pfc
from pfc import PFCError

try:
    pfc.compress("missing.jsonl", "out.pfc")
except FileNotFoundError as e:
    print(f"Binary not found: {e}")
except PFCError as e:
    print(f"Compression failed (exit {e.returncode}): {e.stderr}")
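
A FileNotFoundError could in principle come from a missing binary or a missing input file; which cases the package reports this way is not documented on this page. One hedged way to keep the two apart is to check the input yourself before calling. compress_checked is a hypothetical wrapper, and the compression function is passed in so the sketch does not depend on the package being installed.

```python
from pathlib import Path


def compress_checked(compress_fn, input_path, output_path, **kwargs):
    """Fail early with a clear message if the input file is missing.

    compress_fn is expected to be pfc.compress (or anything with the same
    call shape); any exception it raises afterwards is passed through.
    """
    src = Path(input_path)
    if not src.is_file():
        raise ValueError(f"input file does not exist: {src}")
    return compress_fn(str(src), str(output_path), **kwargs)


# Intended use (requires the package and the binary):
#   import pfc
#   compress_checked(pfc.compress, "logs/app.jsonl", "logs/app.pfc", level="max")
```

With this pre-check in place, a FileNotFoundError raised by the call itself is more likely to point at the binary than at the input.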

Integration with Fluent Bit

Use pfc-fluentbit to receive logs from Fluent Bit and compress them automatically.

Integration with DuckDB

Use the pfc DuckDB extension to query .pfc files directly with SQL:

INSTALL pfc FROM community;
LOAD pfc;
LOAD json;
SELECT line->>'$.level' AS level, line->>'$.message' AS msg
FROM read_pfc_jsonl('logs/app.pfc')
WHERE line->>'$.level' = 'ERROR';

License

MIT — see LICENSE

The PFC-JSONL binary is proprietary software — free for personal and open-source use. Commercial use requires a license: info@impossibleforge.com
