High-performance parallel dataframe and array processing with Arrow-backed storage

Maintainers

aeiwz

FrameX

FrameX is an Arrow-backed Python library for parallel dataframe and array processing on a single machine.

It combines:

Pandas-like tabular APIs (DataFrame, Series, GroupBy)
NumPy-compatible chunked arrays (NDArray with NumPy protocol support)
Arrow-native storage/interop (to_arrow, Parquet/IPC I/O)
Eager execution with optional lazy pipelines (.lazy().collect())
Runtime backends for local threads/processes plus optional Ray/Dask executors
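
To make the thread/process backends more concrete, here is a minimal standard-library sketch of chunked parallel execution: split a column into chunks, reduce each chunk on a worker, then combine the partial results. It illustrates the general pattern only and is not FrameX's implementation; chunked and parallel_sum are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Yield consecutive slices of `values` with at most `chunk_size` items."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def parallel_sum(values, chunk_size=1000, workers=4):
    """Sum `values` by reducing each chunk on a thread, then combining."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is summed independently; the partial sums are then merged.
        partial_sums = pool.map(sum, chunked(values, chunk_size))
        return sum(partial_sums)


total = parallel_sum(list(range(10_000)), chunk_size=2_500)
print(total)
```

With pure-Python work like this, the GIL limits thread speedups; the pattern pays off when the per-chunk kernel releases the GIL or runs in separate processes.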

Why FrameX

FrameX targets local analytics workloads that have outgrown single-threaded scripts but do not yet need distributed infrastructure.

Typical fit:

ETL and analytics pipelines on medium-to-large local datasets
Feature engineering workflows that mix table and array operations
Migration paths from Pandas scripts where API familiarity matters

Installation

From PyPI:

pip install pyframe-xpy

From source:

git clone https://github.com/aeiwz/FrameX.git
cd FrameX
pip install -e .

Requirements:

Python >=3.10
Core dependencies: pyarrow, numpy
Optional compatibility: pandas (pip install pyframe-xpy[pandas_compat])

Quick Start

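import framex as fx

# Build a small in-memory frame from a dict of columns.
df = fx.DataFrame(
    {
        "group": ["a", "a", "b"],
        "value": [10, 20, 30],
        "is_refund": [False, True, False],
    }
)

# Drop refunds, aggregate value per group, and sort by the per-group total.
result = (
    df.filter(~df["is_refund"])
      .groupby("group")
      .agg({"value": ["sum", "mean", "count"]})
      .sort("value_sum", ascending=False)
)

# to_pandas() needs pandas, which is an optional dependency (see Requirements).
print(result.to_pandas())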
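
As a cross-check on what the pipeline above should compute, the same filter, group, aggregate, and sort steps can be written in plain Python. This is only a reference sketch, assuming FrameX follows conventional semantics: rows with is_refund set are dropped, and aggregate columns are named like value_sum, as the sort call suggests. The value_mean and value_count names are extrapolated from that pattern and are assumptions.

```python
from collections import defaultdict

rows = [
    {"group": "a", "value": 10, "is_refund": False},
    {"group": "a", "value": 20, "is_refund": True},
    {"group": "b", "value": 30, "is_refund": False},
]

# Filter: keep only non-refund rows.
kept = [row for row in rows if not row["is_refund"]]

# Group: collect the values that belong to each group.
values_by_group = defaultdict(list)
for row in kept:
    values_by_group[row["group"]].append(row["value"])

# Aggregate: sum, mean, and count per group.
summary = [
    {
        "group": group,
        "value_sum": sum(values),
        "value_mean": sum(values) / len(values),
        "value_count": len(values),
    }
    for group, values in values_by_group.items()
]

# Sort: largest value_sum first.
summary.sort(key=lambda record: record["value_sum"], reverse=True)

for record in summary:
    print(record)
```

Group b (total 30) sorts ahead of group a (total 10), because the refunded row with value 20 is excluded before aggregation.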

Core API

Top-level imports:

import framex as fx

Main objects and helpers:

fx.DataFrame, fx.Series, fx.Index, fx.LazyFrame
fx.NDArray, fx.array(...)
fx.read_parquet, fx.write_parquet, fx.read_ipc, fx.write_ipc, fx.read_csv, fx.write_csv
fx.read_json, fx.write_json, fx.read_ndjson, fx.write_ndjson
fx.read_file, fx.write_file for format auto-detection
fx.from_pandas, fx.from_dask, fx.from_ray, fx.from_dataframe
fx.get_config, fx.set_backend, fx.set_workers, fx.set_serializer, fx.set_kernel_backend
fx.set_array_backend for auto/NumExpr/Numba/JAX/PyTorch/CuPy acceleration modes
fx.recommend_best_performance_config() to inspect hardware-tuned settings
fx.auto_configure_hardware() to apply best-performance config automatically
fx.StreamProcessor for micro-batch streaming pipelines

Compression:

transparent extension-based compression for read_file / write_file
supported wrappers: .gz, .bz2, .xz, .zip, and .zst/.zstd (when zstandard is installed)

Acceleration extras:

pip install "pyframe-xpy[accel]"        # numexpr + numba
pip install "pyframe-xpy[gpu]"          # cupy (CUDA)
pip install "pyframe-xpy[ml_accel]"     # jax + pytorch
pip install "pyframe-xpy[pandas_fast]"  # modin backend
pip install "pyframe-xpy[distributed]"  # Dask + Ray distributed/HPC backends
pip install zstandard                   # .zst/.zstd file compression

Backend notes:

fx.set_backend(name), where name is one of "threads", "processes", "ray", "dask", or "hpc"
Ray and Dask execution backends require their respective runtimes to be installed/available.
HPC mode ("hpc") uses cluster-oriented execution via Dask or Ray:
- FRAMEX_HPC_ENGINE=dask|ray
- FRAMEX_DASK_SCHEDULER_ADDRESS=<tcp://...> to connect to an existing Dask cluster
- FRAMEX_RAY_ADDRESS=<ray://...> to connect to an existing Ray cluster
- optional SLURM bootstrap: FRAMEX_DASK_SLURM=1 (requires dask-jobqueue)
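
The environment variables above can also be set from Python before the backend is selected. A minimal sketch follows; hpc_environment is a hypothetical helper, the scheduler address is a placeholder, and it is an assumption that FrameX reads these variables when fx.set_backend("hpc") is called rather than at import time.

```python
import os


def hpc_environment(engine, address=None, use_slurm=False):
    """Build the FRAMEX_* variables for HPC mode as a plain dict."""
    if engine not in ("dask", "ray"):
        raise ValueError("engine must be 'dask' or 'ray'")
    env = {"FRAMEX_HPC_ENGINE": engine}
    if address is not None:
        # Each engine has its own address variable.
        key = (
            "FRAMEX_DASK_SCHEDULER_ADDRESS"
            if engine == "dask"
            else "FRAMEX_RAY_ADDRESS"
        )
        env[key] = address
    if use_slurm:
        # SLURM bootstrap requires dask-jobqueue, per the notes above.
        env["FRAMEX_DASK_SLURM"] = "1"
    return env


# Placeholder address; replace with the real scheduler endpoint.
env = hpc_environment("dask", address="tcp://127.0.0.1:8786")
os.environ.update(env)

# With the variables in place, the backend would then be selected with:
#   import framex as fx
#   fx.set_backend("hpc")
```

Building the dict separately from applying it makes the configuration easy to log or pass to a subprocess instead of mutating the current environment.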

Test support notes:

Some tests are gated on optional backends and are intentionally skipped when those dependencies are not installed.
Typical skip reasons: missing dask.distributed, dask.dataframe, ray, or ray.data.
To run the full optional matrix locally:

pip install "pyframe-xpy[distributed]"
pytest -q
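
Before running the suite, it can help to see which optional modules are importable, since that predicts which gated tests will be skipped. The sketch below uses only the standard library; is_available is a hypothetical helper and is not part of FrameX or its test suite.

```python
import importlib.util

# The modules named as typical skip reasons above.
OPTIONAL_MODULES = ("dask.distributed", "dask.dataframe", "ray", "ray.data")


def is_available(module_name):
    """Return True if `module_name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing.
        return False


availability = {name: is_available(name) for name in OPTIONAL_MODULES}
for name, present in availability.items():
    status = "installed" if present else "missing (related tests will skip)"
    print(f"{name}: {status}")
```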

Documentation

Canonical docs are in docs/documents.

Website (Docs UI)

The docs website lives in the website directory (Next.js App Router).

Main docs routes (served by the local dev server):

http://localhost:3000/docs/features
http://localhost:3000/docs/tutorial_etl_pipeline
http://localhost:3000/docs/use_cases
http://localhost:3000/docs/configuration_guide
http://localhost:3000/docs/performance_test

Run locally:

cd website
npm install
npm run dev

Production build:

npm run build
npm run start

Development

Install dev dependencies:

pip install -e ".[dev]"

Run tests:

pytest

Benchmarks

Benchmark code and generated reports are in the benchmarks directory.

Run the full benchmark suite (includes an in-terminal progress bar and report generation):

python3 -m benchmarks.benchmark_suite

Run workload capability matrix checks:

python3 -m benchmarks.check_framex_workloads

Benchmark outputs are written to benchmarks/results:

benchmark_results.json
benchmark_results.csv
benchmark_report.md
framex_workload_check.json
performance_speedup.png
parallel_processing_scaling.png
multiprocessing_scaling.png
memory_peak_rss.png
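
After a benchmark run, a short script can confirm that the files listed above were actually produced. This is a sketch under the assumption that it is run from the repository root; missing_outputs is a hypothetical helper, and it treats every listed file as expected even though framex_workload_check.json presumably comes from the separate workload check.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_OUTPUTS = (
    "benchmark_results.json",
    "benchmark_results.csv",
    "benchmark_report.md",
    "framex_workload_check.json",
    "performance_speedup.png",
    "parallel_processing_scaling.png",
    "multiprocessing_scaling.png",
    "memory_peak_rss.png",
)


def missing_outputs(results_dir="benchmarks/results"):
    """Return the expected output file names that are absent from `results_dir`."""
    directory = Path(results_dir)
    return [name for name in EXPECTED_OUTPUTS if not (directory / name).is_file()]


missing = missing_outputs()
print("all outputs present" if not missing else f"missing: {missing}")
```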

Project Status

FrameX is pre-1.0 (0.1.0) and in active development.

APIs are usable and documented
compatibility/performance behavior will continue to evolve
pin versions for production-critical workloads

License

MIT

Release history

0.1.2 (Apr 9, 2026)
0.1.1 (Apr 9, 2026)
0.1.0 (Apr 9, 2026, this version)

Download files

Source Distribution

pyframe_xpy-0.1.0.tar.gz (56.4 kB)

Uploaded Apr 9, 2026 Source

Built Distribution

pyframe_xpy-0.1.0-py3-none-any.whl (67.8 kB)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file pyframe_xpy-0.1.0.tar.gz.

File metadata

Download URL: pyframe_xpy-0.1.0.tar.gz
Upload date: Apr 9, 2026
Size: 56.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pyframe_xpy-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2034cac2bdfbd1aab423f190aff45b428b18df27e4a218a31c4701fe8546b7b4`
MD5	`ef1745150dce6b0cbe8fabfcdfb045d9`
BLAKE2b-256	`53592d1b28c0130e052e1b825c28c94d0096600fb956e884e43ab68e95eb8fa1`
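
A downloaded archive can be checked against the SHA256 digest above with the standard library. This is a minimal sketch; sha256_of and verify are hypothetical helper names, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib

# SHA256 digest listed above for pyframe_xpy-0.1.0.tar.gz.
EXPECTED_SHA256 = "2034cac2bdfbd1aab423f190aff45b428b18df27e4a218a31c4701fe8546b7b4"


def sha256_of(path, block_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks so large files are never loaded whole.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches `expected`."""
    return sha256_of(path) == expected.lower()


# Usage, once the file has been downloaded:
#   verify("pyframe_xpy-0.1.0.tar.gz")
```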

Provenance

The following attestation bundles were made for pyframe_xpy-0.1.0.tar.gz:

Publisher: publish-pypi.yml on aeiwz/FrameX

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyframe_xpy-0.1.0.tar.gz
- Subject digest: 2034cac2bdfbd1aab423f190aff45b428b18df27e4a218a31c4701fe8546b7b4
- Sigstore transparency entry: 1261920108
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: aeiwz/FrameX@32691fc2fc6ed7e0a81c66f249ca4636435a7a16
- Branch / Tag: refs/heads/main
- Owner: https://github.com/aeiwz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@32691fc2fc6ed7e0a81c66f249ca4636435a7a16
- Trigger Event: workflow_dispatch

File details

Details for the file pyframe_xpy-0.1.0-py3-none-any.whl.

File metadata

Download URL: pyframe_xpy-0.1.0-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 67.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pyframe_xpy-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cba88fbc27156dd6e85246622c0a1afe552c65e3419b50c7a08605674cd1f435`
MD5	`e898cad36b2873b6c89b2dc632328ccd`
BLAKE2b-256	`909a939683abeab57ee4407f6fac4f73f2d3f0c51377ffd129a438deaed9c0ca`

Provenance

The following attestation bundles were made for pyframe_xpy-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on aeiwz/FrameX

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pyframe_xpy-0.1.0-py3-none-any.whl
- Subject digest: cba88fbc27156dd6e85246622c0a1afe552c65e3419b50c7a08605674cd1f435
- Sigstore transparency entry: 1261920122
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: aeiwz/FrameX@32691fc2fc6ed7e0a81c66f249ca4636435a7a16
- Branch / Tag: refs/heads/main
- Owner: https://github.com/aeiwz
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@32691fc2fc6ed7e0a81c66f249ca4636435a7a16
- Trigger Event: workflow_dispatch
