Python bindings for rust-data-processing: schema-first CSV/JSON/Parquet/Excel ingestion into an in-memory DataSet.

rust-data-processing

Phase 2 scope: Phase 1 baseline plus export, privacy, Arrow, incremental ETL → Python; JVM planned

Python bindings for the rust-data-processing crate. They provide schema-first ingestion from CSV, JSON, Parquet, and Excel into an in-memory DataSet, along with profiling, validation, Polars-backed pipelines, and SQL. Phase 2 adds JSONL export, privacy transforms and summaries, median, Arrow interop, and incremental ingest helpers.

Infographic: Phase 2 — Phase 1 flow plus export, privacy, median, Arrow, incremental ETL; JVM planned Phase 3.

This page is the PyPI project description (Python-only). Clone the repository for developer setup, Rust sources, and the full monorepo README.

Install

pip install rust-data-processing

Requires Python 3.10+.
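
If an install fails or the import is missing, it can help to confirm the interpreter version first. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and not part of rust-data-processing.

```python
import sys

# Minimum version taken from the "Requires Python 3.10+" note above.
MINIMUM_PYTHON = (3, 10)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when (major, minor) of version_info is at least `minimum`.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report whether the current interpreter satisfies the requirement.
if meets_minimum():
    print("Python version OK for rust-data-processing")
else:
    print("Python 3.10 or newer is required")
```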

Quick start

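import rust_data_processing as rdp

# Schema-first: declare each column's name and data type before reading the file.
schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]
# Ingest the CSV into an in-memory DataSet.
ds = rdp.ingest_from_path("path/to/data.csv", schema, {"format": "csv"})
print("rows", ds.row_count())

# Profile the DataSet; the options dict sets head_rows and the quantiles to compute.
report = rdp.profile_dataset(ds, {"head_rows": 50, "quantiles": [0.5]})
print("profile rows sampled", report["row_count"])

# Validate the DataSet with a single not_null check on the "id" column.
validation = rdp.validate_dataset(
    ds,
    {"checks": [{"kind": "not_null", "column": "id", "severity": "error"}]},
)
print("checks", validation["summary"]["total_checks"])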
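
To try the quick start without an existing file, a small CSV matching the schema can be generated first. The sketch below uses only the standard library and does not call rust-data-processing itself. The helper names `write_sample_csv` and `not_null_checks` are hypothetical, and it assumes the CSV reader expects a header row whose names match the schema. The check dictionaries reuse the shape shown in the quick start.

```python
import csv
import tempfile
from pathlib import Path

# Same schema shape as the quick start above.
SCHEMA = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]


def write_sample_csv(path, schema, rows):
    """Write `rows` (a list of dicts) to `path`, using schema names as the header.

    Hypothetical helper; assumes a header row is expected by the CSV reader.
    """
    fieldnames = [column["name"] for column in schema]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


def not_null_checks(schema, severity="error"):
    """Build one not_null check per schema column, in the dict shape used above."""
    return {
        "checks": [
            {"kind": "not_null", "column": column["name"], "severity": severity}
            for column in schema
        ]
    }


# Illustrative data only.
sample_rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
sample_path = Path(tempfile.mkdtemp()) / "data.csv"
write_sample_csv(sample_path, SCHEMA, sample_rows)
checks = not_null_checks(SCHEMA)

print(sample_path.read_text(encoding="utf-8"))
print(checks)
```

The resulting `sample_path` can be converted with `str(sample_path)` and used in place of "path/to/data.csv", and `checks` can be passed as the second argument to `rdp.validate_dataset` in the same way as the hand-written dictionary.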

Phase 2 (export, privacy, JSONL, median, Delta handoff)

Copy-paste snippets: Phase 2 Python examples (Markdown in repo). These APIs are also summarized in API.md, in the section "Export, privacy summaries, truncation (Phase 2)".

Documentation

This package on PyPI: pypi.org/project/rust-data-processing
Python examples (HTML, pdoc): GitHub Pages (examples)
Python API (HTML, pdoc): GitHub Pages (Python)
Python API (markdown): API.md in the repository
Combined site (landing + Rust rustdoc): GitHub Pages (home)
Rust crate API: docs.rs/rust-data-processing
Repository: github.com/scorpio-datalake/rust-data-processing

License

MIT OR Apache-2.0 - see LICENSE-MIT and LICENSE-APACHE in the repository.
