Python bindings for rust-data-processing: schema-first CSV/JSON/Parquet/Excel ingestion into an in-memory DataSet.

Project description

rust-data-processing

Phase 2 scope: Phase 1 baseline plus export, privacy, Arrow, incremental ETL → Python; JVM planned

Python bindings for the rust-data-processing crate: schema-first ingestion from CSV, JSON, Parquet, and Excel into an in-memory DataSet, with profiling, validation, Polars-backed pipelines, and SQL. Phase 2 adds JSONL export, privacy transforms and summaries, median, Arrow interop, and incremental ingest helpers.

Infographic: Phase 2 — Phase 1 flow plus export, privacy, median, Arrow, incremental ETL; JVM planned Phase 3.

This page is the PyPI project description (Python-only). Clone the repository for developer setup, Rust sources, and the full monorepo README.

Install

pip install rust-data-processing

Requires Python 3.10+.
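
The version requirement can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and is not part of rust-data-processing.

```python
import sys

# Minimum version taken from the requirement stated above.
MINIMUM = (3, 10)


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True when the (major, minor) version is at least `minimum`."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= minimum


if __name__ == "__main__":
    status = "ok" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Comparing tuples avoids the string-comparison pitfall where "3.9" would sort after "3.10".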

Quick start

import rust_data_processing as rdp

# Declare the expected columns up front (schema-first ingestion).
schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]
# Ingest the CSV file into an in-memory DataSet.
ds = rdp.ingest_from_path("path/to/data.csv", schema, {"format": "csv"})
print("rows", ds.row_count())

# Profile the dataset.
report = rdp.profile_dataset(ds, {"head_rows": 50, "quantiles": [0.5]})
print("profile rows sampled", report["row_count"])

# Validate the dataset with a not_null check on "id".
validation = rdp.validate_dataset(
    ds,
    {"checks": [{"kind": "not_null", "column": "id", "severity": "error"}]},
)
print("checks", validation["summary"]["total_checks"])
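
The quick start assumes a CSV file already exists at the given path. The sketch below uses only the standard library to write a small sample file whose header matches the schema, and to build one not_null check per column in the same dict shape as above. The sample rows, the temporary path, and the helper names (`write_sample_csv`, `header_matches`, `not_null_checks`) are assumptions for illustration; only the dict keys and the commented-out calls come from the quick start.

```python
import csv
import tempfile
from pathlib import Path

# Same schema as the quick start.
schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]


def write_sample_csv(path, schema, rows):
    """Write `rows` to `path` with a header taken from the schema names."""
    header = [col["name"] for col in schema]
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return header


def header_matches(path, schema):
    """Check that the CSV header lists exactly the schema's column names."""
    with open(path, newline="", encoding="utf-8") as fh:
        header = next(csv.reader(fh), [])
    return header == [col["name"] for col in schema]


def not_null_checks(schema, severity="error"):
    """Build one not_null check per schema column, in the quick start's shape."""
    return {
        "checks": [
            {"kind": "not_null", "column": col["name"], "severity": severity}
            for col in schema
        ]
    }


# Hypothetical sample data; replace with your own file.
sample_path = Path(tempfile.mkdtemp()) / "data.csv"
write_sample_csv(sample_path, schema, [[1, "Ada"], [2, "Grace"]])
print("header ok:", header_matches(sample_path, schema))
print(not_null_checks(schema))

# With the package installed, the quick start calls would then be:
# ds = rdp.ingest_from_path(str(sample_path), schema, {"format": "csv"})
# validation = rdp.validate_dataset(ds, not_null_checks(schema))
```

Checking the header before ingestion is optional; it simply surfaces a column mismatch earlier than the ingest call would.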

Phase 2 (export, privacy, JSONL, median, Delta handoff)

Copy-paste snippets are in the Phase 2 Python examples (Markdown in the repository). These APIs are also summarized in API.md, in the section "Export, privacy summaries, truncation (Phase 2)".

Documentation

Resource: Link
This package on PyPI: pypi.org/project/rust-data-processing
Python examples (HTML, pdoc): GitHub Pages (examples)
Python API (HTML, pdoc): GitHub Pages (Python)
Python API (markdown): API.md in the repository
Combined site (landing + Rust rustdoc): GitHub Pages (home)
Rust crate API: docs.rs/rust-data-processing
Repository: github.com/scorpio-datalake/rust-data-processing

License

MIT OR Apache-2.0; see LICENSE-MIT and LICENSE-APACHE in the repository.

Download files

Source Distribution

rust_data_processing-0.3.4.tar.gz (5.9 MB, Source)

Built Distributions

rust_data_processing-0.3.4-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.15t, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.15, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.14t, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp314-cp314-win_amd64.whl (29.5 MB): CPython 3.14, Windows, x86-64
rust_data_processing-0.3.4-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.14, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp314-cp314-macosx_11_0_arm64.whl (29.3 MB): CPython 3.14, macOS 11.0+, ARM64
rust_data_processing-0.3.4-cp313-cp313-win_amd64.whl (29.5 MB): CPython 3.13, Windows, x86-64
rust_data_processing-0.3.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp313-cp313-macosx_11_0_arm64.whl (29.3 MB): CPython 3.13, macOS 11.0+, ARM64
rust_data_processing-0.3.4-cp312-cp312-win_amd64.whl (29.5 MB): CPython 3.12, Windows, x86-64
rust_data_processing-0.3.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp312-cp312-macosx_11_0_arm64.whl (29.3 MB): CPython 3.12, macOS 11.0+, ARM64
rust_data_processing-0.3.4-cp311-cp311-win_amd64.whl (29.5 MB): CPython 3.11, Windows, x86-64
rust_data_processing-0.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.9 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
rust_data_processing-0.3.4-cp311-cp311-macosx_11_0_arm64.whl (29.3 MB): CPython 3.11, macOS 11.0+, ARM64
rust_data_processing-0.3.4-cp310-cp310-win_amd64.whl (29.5 MB): CPython 3.10, Windows, x86-64
rust_data_processing-0.3.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.9 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
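
The wheel file names above follow the standard layout `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, with an optional build tag after the version. As a rough sketch of how the tags can be read off a file name, the hypothetical helper below splits one into its components using only the standard library; it is an illustration, not part of rust-data-processing or pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: list


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {filename}")
    distribution, version = parts[0], parts[1]
    python_tag, abi_tag, platform = parts[-3:]
    # A compressed platform tag lists several platforms separated by dots.
    return WheelTags(distribution, version, python_tag, abi_tag, platform.split("."))


tags = parse_wheel_filename(
    "rust_data_processing-0.3.4-cp314-cp314t-"
    "manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tags)
```

An ABI tag ending in `t`, as in `cp314t`, marks a wheel built for the free-threaded CPython build, which matches the "3.14t" and "3.15t" labels in the list.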

File details

Details for the file rust_data_processing-0.3.4.tar.gz.

File metadata

  • Download URL: rust_data_processing-0.3.4.tar.gz
  • Upload date:
  • Size: 5.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rust_data_processing-0.3.4.tar.gz
Algorithm Hash digest
SHA256 054e302c6083a5c468e2cd3e30de661cce8bcb620a4ebd417e7df642a03d6050
MD5 4304fff8674eefd76c79ccc3385f511a
BLAKE2b-256 d1ea85f305275a9683c183b63ba61d685e6b172998e4e9d0174a3898aa12be6e

See more details on using hashes here.
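
A downloaded file can be compared against the published SHA256 digest. The sketch below uses only `hashlib` and `hmac` from the standard library; the helper names `sha256_of` and `matches` and the local file path are assumptions, while the expected digest is the one listed above for the source distribution.

```python
import hashlib
import hmac

# SHA256 digest listed above for rust_data_processing-0.3.4.tar.gz.
EXPECTED_SHA256 = "054e302c6083a5c468e2cd3e30de661cce8bcb620a4ebd417e7df642a03d6050"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Usage, assuming the archive was downloaded to the current directory:
# print(matches("rust_data_processing-0.3.4.tar.gz", EXPECTED_SHA256))
```

Reading in chunks keeps memory use flat for the larger wheel files.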

File details

Details for the file rust_data_processing-0.3.4-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8145decc24c03d2e509689276e3b3003ec8a5e87b8cc98b8bb81fac37916f141
MD5 ff1cb821dd8694afca1f74e99cff762e
BLAKE2b-256 898e5e4565e8ee8ea1cba16f645565de2a832b5e41987d71dd2f034320924f89

File details

Details for the file rust_data_processing-0.3.4-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b3735c04d455ca739adcadcfcfcb73c1d68514996189ab2f5e37779628f3816c
MD5 ee2f6b7c620955ff2ad3d892d7566dc7
BLAKE2b-256 d93570e008640c2f50e29bc890ab6b28ac8fa6fc7faec52e3d6239eb5d83e617

File details

Details for the file rust_data_processing-0.3.4-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ffaf546a429579b9d60a3b0296510223a8aa6a2a30072ce96390c8b263e3fa7d
MD5 75b4789a79179bbb9c337aac7fd66d27
BLAKE2b-256 d60a0a3aa8cebd60301080fe18699bae2a1048c8567e4d3e8a1c39376564590f

File details

Details for the file rust_data_processing-0.3.4-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e06e5b39b3bb0afc3e8878c9abe2e776a9b9c2252b54caa8a07a33fa5f88406e
MD5 ec848b6abbec07e0af6cb7fa28e48c5a
BLAKE2b-256 877ab6f15b614874e8bfe9b3f903c5989cd7934544446dd0b38295df2fbf5043

File details

Details for the file rust_data_processing-0.3.4-cp314-cp314-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 cf7b1b92c28dc55c59e5dbd76102237434984ec4ec1a9930e550b3899d0d43e9
MD5 45ea970f509826be610c370dd8ed11c9
BLAKE2b-256 a80915544bb8ffdbb6be847e726d54f392257bdb7846e920c4d675d7c22f81a4

File details

Details for the file rust_data_processing-0.3.4-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 865719eb29f981d5b12679c5452af5f408b10fb252b5fc1edd307e2b55d0adf8
MD5 e5164db0ffbef5e6c7cc023bb554707d
BLAKE2b-256 8b4b64725a3ba7253ae7112db92b36f06471afe1d3179384aefe8617dbe074c3

File details

Details for the file rust_data_processing-0.3.4-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 257b2d0c70901ec28cdb221e1aa670cb99caa014a38f372ff6a7670381667540
MD5 f4166ee1ee06bd4c5d4f4778a5a7a12e
BLAKE2b-256 fbc7f967601d382e6b8d009523d1e9bd6d416e828170d610fa05bd9c3f0d43f6

File details

Details for the file rust_data_processing-0.3.4-cp313-cp313-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 82bf34da23ea2af7e9df8269020eaec6d4f3fdeb0080c23d924368e66c836cd2
MD5 68d5ee8930255cdf69e25deba60bcf8d
BLAKE2b-256 ce20d711273ea5ce292d0beabbb1067f29da4fd67f481db81aeb3db36d7f5137

File details

Details for the file rust_data_processing-0.3.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a043d7f9e409fe8dbf981071b490e1a524352aa7108f6b9594f2fac0eb98fb4d
MD5 95fccd4d7e245f3f31f3fbf922809947
BLAKE2b-256 74297d8378a3601b6ffe909379a9fe94b5db43380822e8b068ae0a176ca687d2

File details

Details for the file rust_data_processing-0.3.4-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9edf05fd49dbf94a729bf42864ace172caf611a187f3df39eb787da57a909afb
MD5 c6cdf44f37c9f3e66c4cac053bc530cc
BLAKE2b-256 554a9c538440476f0fcb1496b2b14d19d853a57ac79b69522f05317dcd64f074

File details

Details for the file rust_data_processing-0.3.4-cp312-cp312-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 428257e0fc3180a7dfa2ad4f98ed396fcd94718c103ca580327fa2aff4d4f56f
MD5 bb75b9a96a972c47f76d745fc6392cb0
BLAKE2b-256 d13d2d12f343f80d3a29c4baedf1f47d23e676af97bdc2289adfac45c2dc62b5

File details

Details for the file rust_data_processing-0.3.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bde6121f17c0c4eb778719a732f9f32266e71f06ee8067fe4d8a3e27ed4bbcbd
MD5 7f653115856aa3aaae7291c1f3cdba6a
BLAKE2b-256 7b687f9d2cb010228f43bbed109792ea8e94c3c00a5217ff6bc0feb2e82e096a

File details

Details for the file rust_data_processing-0.3.4-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6fd32b2d9baa8ddefd447b96211640bb9142c80b1d5b808ddf0071ba273f1090
MD5 fa762a57aa11c7bad206067b5c21fda9
BLAKE2b-256 5d45e1c2f26820086513aa36d0e45b115fb44487a2b687767b58b560824021e5

File details

Details for the file rust_data_processing-0.3.4-cp311-cp311-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 2e373fd1a4ec39ffc829cfb1bb46e07ca02439bcad8c2905f28b7346d0e0896e
MD5 12f4e3e74917c38af539601ea86c3758
BLAKE2b-256 92c5df3734b28ff57014ad3a8d8ca81d8bfb36df54e19b480226a88d5c246061

File details

Details for the file rust_data_processing-0.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5af315465c776736a42eaeb2898eb862a4342bd329a369d18b8d2404257a9f5f
MD5 5c4957846a3dff192507a27e17b9c012
BLAKE2b-256 c1074086b7901a2d909e00c1125d3200e7c4cc6e419affd629bc2cddc8ba5c8f

File details

Details for the file rust_data_processing-0.3.4-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2ca771410e7d34ff5fd0f3b2ad8684271f5581f69af9c868a2099ed9e57d2bea
MD5 86c3d8ae5c28ee2e68113f0c60f752a4
BLAKE2b-256 6f17879e7156b12d8d4b4d987d44a0f105641a9ab45ad12f084202f3bd7d2b8d

File details

Details for the file rust_data_processing-0.3.4-cp310-cp310-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 1ec123ab1e70cfa4672d2d788d4ab5f610efcd4794b7a9b6809a6cb0484512e0
MD5 d7fb8d180fce173b28e744fe09353be8
BLAKE2b-256 0078db8f192fcdc38ab24cee11495f2d5270c41883f500cb41b8e59aa6459004

File details

Details for the file rust_data_processing-0.3.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f9c9a83df9b58563b12893d1801a0436bb4262155d43b6b919f5807c35a45798
MD5 6b0cfac21853c3ff635e433e80cd724a
BLAKE2b-256 4dfa46678718f4edec660fe64d13f7f6c06ff3969023309bcd84087adeb1476b
