arpeggio-shredder

Reversible shredding (flatten) and unshredding (rebuild) of semi-structured JSONL into Apache Arrow RecordBatches.

This package provides a thin Python binding over a C++ core that converts JSON documents into a columnar “atoms” representation suitable for Arrow / Parquet workflows, while preserving the ability to reconstruct the original documents exactly.

The scope is intentionally narrow: flattening with reversibility, not general JSON processing.

Key properties

  • Reversible: JSON → Arrow atoms → JSON
  • Columnar-first: output is optimized for Arrow-native pipelines
  • Deterministic identity: optional object and transaction tagging
  • Minimal Python surface: Arrow RecordBatch in, RecordBatch out
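
The reversibility property can be illustrated with a small pure-Python sketch. It does not use this package or Arrow, and it is not the package's actual algorithm, schema, or API. The names `shred` and `unshred` and the `(path, leaf)` atom layout are assumptions made for illustration only. The idea is that each leaf value is recorded with its full path, and empty containers are kept as leaves so the rebuild loses nothing.

```python
import json


def shred(doc, path=()):
    """Yield (path, leaf) atoms in document order.

    Hypothetical layout: path steps are str (object key) or int (array index).
    """
    if isinstance(doc, dict) and doc:
        for key, value in doc.items():
            yield from shred(value, path + (key,))
    elif isinstance(doc, list) and doc:
        for index, value in enumerate(doc):
            yield from shred(value, path + (index,))
    else:
        # Scalars and empty containers are leaves, so nothing is lost.
        yield path, doc


def unshred(atoms):
    """Rebuild the document from atoms emitted by shred (same order)."""
    root = None
    for path, leaf in atoms:
        if not path:
            return leaf  # the whole document is a single leaf
        if root is None:
            root = {} if isinstance(path[0], str) else []
        node = root
        for step, following in zip(path, path[1:]):
            # The type of the next step decides which container to create.
            child_type = dict if isinstance(following, str) else list
            if isinstance(node, dict):
                node = node.setdefault(step, child_type())
            else:
                if step == len(node):
                    node.append(child_type())
                node = node[step]
        if isinstance(node, dict):
            node[path[-1]] = leaf
        else:
            node.append(leaf)
    return root


line = '{"id": 7, "tags": ["a", "b"], "meta": {"empty": {}, "rows": [[1, 2], []]}}'
doc = json.loads(line)
atoms = list(shred(doc))
rebuilt = unshred(atoms)
```

In this sketch `rebuilt` equals `doc`, key order included, because atoms are emitted and consumed in document order. A columnar layout would store the paths and leaves as separate columns rather than as Python tuples.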

Native extension

The package ships with a prebuilt native extension (.so) built against Apache Arrow and exposed via pybind11.

  • No runtime compilation
  • No system Arrow installation required
  • Shared libraries are bundled into the wheel

Platform support

  • OS: Linux (manylinux-compatible)
  • Architecture: x86_64
  • Python: CPython 3.12
  • ABI: glibc 2.39 or newer (manylinux_2_39)

Other platforms are not currently supported.
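
A pre-install check against the constraints above might look like the following sketch. It uses only the standard library. The helper names `is_supported` and `current_platform_supported` are hypothetical, and the glibc 2.39 floor is taken from the `manylinux_2_39` tag in the wheel file name rather than from any documented check in the package itself.

```python
import platform
import sys


def is_supported(system, machine, implementation, py_version, libc_name, libc_version):
    """Return True if the described platform matches the published wheel."""
    if (system, machine, implementation) != ("Linux", "x86_64", "CPython"):
        return False
    if tuple(py_version[:2]) != (3, 12):
        return False
    if libc_name != "glibc":
        return False
    try:
        glibc = tuple(int(part) for part in libc_version.split(".")[:2])
    except ValueError:
        # Empty or unparsable version string: treat as unsupported.
        return False
    return glibc >= (2, 39)


def current_platform_supported():
    """Apply is_supported to the running interpreter."""
    libc_name, libc_version = platform.libc_ver()
    return is_supported(
        platform.system(),
        platform.machine(),
        platform.python_implementation(),
        sys.version_info,
        libc_name,
        libc_version,
    )
```

Calling `current_platform_supported()` before installation gives a quick yes/no answer. pip performs its own tag matching at install time, so this is a convenience check only.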

Relationship to the C++ project

This package is the Python distribution layer for the Shredder C++ project.
It does not include the full C++ documentation, tests, or build system.

License

This package is dual-licensed:

  • AGPL-3.0 for open-source use and networked deployments
  • Commercial license for proprietary or closed-source use

Commercial licensing is available via https://arpeggio.one/shop.

Source Distributions

No source distribution files are available for this release.

Built Distribution

arpeggio_shredder-0.1.9-cp312-cp312-manylinux_2_39_x86_64.whl (24.8 MB)

Uploaded for CPython 3.12, manylinux (glibc 2.39+), x86-64.

File hashes

Hashes for arpeggio_shredder-0.1.9-cp312-cp312-manylinux_2_39_x86_64.whl:

Algorithm    Hash digest
SHA256       c6dbc36bbcce968d9f6f4a1c89d8a5d9011ac2f1db18b6a19daeb29389b2414e
MD5          516fb787439da46dc10b3daa377a8e80
BLAKE2b-256  32c1ce97b56be5bf658fc26fd973a9c9aab74c47e760868ecd4ce257af8da43e
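
A downloaded wheel can be checked against the SHA256 digest above with the standard library. The sketch below assumes the wheel has already been downloaded to a local path. The function names `sha256_of` and `verify_wheel` are illustrative and are not part of this package.

```python
import hashlib

# SHA256 digest published for the 0.1.9 wheel listed above.
EXPECTED_SHA256 = "c6dbc36bbcce968d9f6f4a1c89d8a5d9011ac2f1db18b6a19daeb29389b2414e"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large wheel is never fully loaded in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected=EXPECTED_SHA256):
    """Return True if the file at path matches the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()
```

For example, `verify_wheel("arpeggio_shredder-0.1.9-cp312-cp312-manylinux_2_39_x86_64.whl")` should return True for an intact download. SHA256 is the digest to rely on here; MD5 is not suitable for integrity guarantees against tampering.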
