Python bindings for rust-data-processing: schema-first CSV/JSON/Parquet/Excel ingestion into an in-memory DataSet.

Project description

rust-data-processing

Phase 2 scope: the Phase 1 baseline plus export, privacy, Arrow, and incremental ETL, exposed to Python. JVM support is planned for Phase 3.

Python bindings for the rust-data-processing crate: schema-first ingestion from CSV, JSON, Parquet, and Excel into an in-memory DataSet, with profiling, validation, Polars-backed pipelines, and SQL. Phase 2 adds JSONL export, privacy transforms and summaries, median, Arrow interop, and incremental ingest helpers.

This page is the PyPI project description (Python-only). Clone the repository for developer setup, Rust sources, and the full monorepo README.

Install

pip install rust-data-processing

Requires Python 3.10+.
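
A small pre-flight check can confirm both requirements before running the examples. This is a sketch using only the standard library; the helper name `environment_ready` is an illustrative assumption and not part of the package.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # taken from "Requires Python 3.10+"


def environment_ready(module_name="rust_data_processing"):
    """Return (python_ok, module_found) for the current interpreter."""
    python_ok = sys.version_info >= MIN_PYTHON
    # find_spec returns None when a top-level module cannot be located.
    module_found = importlib.util.find_spec(module_name) is not None
    return python_ok, module_found


python_ok, module_found = environment_ready()
print("python ok:", python_ok, "| module importable:", module_found)
```

If the second value is False, running the pip command above in the same environment is the likely fix.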

Quick start

import rust_data_processing as rdp

# Declare the expected columns up front (schema-first ingestion).
schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]
# Read the CSV into an in-memory DataSet.
ds = rdp.ingest_from_path("path/to/data.csv", schema, {"format": "csv"})
print("rows", ds.row_count())

# Profile the dataset; quantiles=[0.5] requests the median.
report = rdp.profile_dataset(ds, {"head_rows": 50, "quantiles": [0.5]})
print("profile rows sampled", report["row_count"])

# Validate with a single not_null check on "id" at error severity.
validation = rdp.validate_dataset(
    ds,
    {"checks": [{"kind": "not_null", "column": "id", "severity": "error"}]},
)
print("checks", validation["summary"]["total_checks"])
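
The quick start assumes a CSV file already exists at the given path. The following is a minimal sketch for producing one that matches the two-column schema, using only the standard library. The helper `write_sample_csv`, the file name `data.csv`, and the sample rows are illustrative assumptions, not part of the package.

```python
import csv
import tempfile
from pathlib import Path

# Same schema as the quick start: an int64 "id" and a utf8 "name".
schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]


def write_sample_csv(path, rows):
    """Write rows (dicts keyed by schema column names) to a CSV with a header."""
    fieldnames = [col["name"] for col in schema]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Hypothetical sample data, written to a throwaway directory.
sample_path = Path(tempfile.mkdtemp()) / "data.csv"
write_sample_csv(sample_path, [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}])
print(sample_path.read_text(encoding="utf-8"))
```

Passing `str(sample_path)` in place of "path/to/data.csv" in the quick start should then give the ingestion call a file whose header matches the schema.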

Phase 2 (export, privacy, JSONL, median, Delta handoff)

Copy-paste snippets: see the Phase 2 Python examples (Markdown, in the repository). These APIs are also summarized in API.md, in the section "Export, privacy summaries, truncation (Phase 2)".

Documentation

Resource | Link
This package on PyPI | pypi.org/project/rust-data-processing
Python examples (HTML, pdoc) | GitHub Pages (examples)
Python API (HTML, pdoc) | GitHub Pages (Python)
Python API (markdown) | API.md in the repository
Combined site (landing + Rust rustdoc) | GitHub Pages (home)
Rust crate API | docs.rs/rust-data-processing
Repository | github.com/scorpio-datalake/rust-data-processing

License

MIT OR Apache-2.0; see LICENSE-MIT and LICENSE-APACHE in the repository.


Download files

Download the file for your platform.

Source Distribution

rust_data_processing-0.3.3.tar.gz (5.9 MB)

Uploaded: Source

Built Distributions

rust_data_processing-0.3.3-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.15t; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.15; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.14t; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp314-cp314-win_amd64.whl (29.5 MB)

Uploaded: CPython 3.14; Windows x86-64

rust_data_processing-0.3.3-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.14; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp314-cp314-macosx_11_0_arm64.whl (29.3 MB)

Uploaded: CPython 3.14; macOS 11.0+ ARM64

rust_data_processing-0.3.3-cp313-cp313-win_amd64.whl (29.5 MB)

Uploaded: CPython 3.13; Windows x86-64

rust_data_processing-0.3.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.13; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp313-cp313-macosx_11_0_arm64.whl (29.3 MB)

Uploaded: CPython 3.13; macOS 11.0+ ARM64

rust_data_processing-0.3.3-cp312-cp312-win_amd64.whl (29.5 MB)

Uploaded: CPython 3.12; Windows x86-64

rust_data_processing-0.3.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.12; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp312-cp312-macosx_11_0_arm64.whl (29.3 MB)

Uploaded: CPython 3.12; macOS 11.0+ ARM64

rust_data_processing-0.3.3-cp311-cp311-win_amd64.whl (29.5 MB)

Uploaded: CPython 3.11; Windows x86-64

rust_data_processing-0.3.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.9 MB)

Uploaded: CPython 3.11; manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.3-cp311-cp311-macosx_11_0_arm64.whl (29.3 MB)

Uploaded: CPython 3.11; macOS 11.0+ ARM64

rust_data_processing-0.3.3-cp310-cp310-win_amd64.whl (29.5 MB)

Uploaded: CPython 3.10; Windows x86-64

rust_data_processing-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.8 MB)

Uploaded: CPython 3.10; manylinux: glibc 2.17+ x86-64
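
The wheel file names above follow the standard wheel naming pattern: distribution, version, Python tag, ABI tag, and platform tag, separated by dashes. As a rough sketch, the tags can be pulled apart with plain string handling. The helper `parse_wheel_filename` is hypothetical and only handles the five-part form (no optional build tag) seen in this listing.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a five-part wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated parts, got {len(parts)}")
    return WheelTags(*parts)


tags = parse_wheel_filename(
    "rust_data_processing-0.3.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
# A platform tag may bundle several compatible platforms, joined by dots.
print(tags.python_tag, tags.abi_tag, tags.platform_tag.split("."))
```

An ABI tag ending in `t`, such as `cp314t`, corresponds to the entries labelled "CPython 3.14t" in the listing.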

File details

Details for the file rust_data_processing-0.3.3.tar.gz.

File metadata

  • Download URL: rust_data_processing-0.3.3.tar.gz
  • Upload date:
  • Size: 5.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rust_data_processing-0.3.3.tar.gz
Algorithm Hash digest
SHA256 bae87dbae3deacd35c68f00d9e97b941b260d110bc85c469acf72f4df12f3117
MD5 1e033bb29f9f16a5ffe507ca9a5c3e00
BLAKE2b-256 7abb9cfbc7fa5ed5d290c847b2c23c5acf444fdc455026ff18409b5caa2c4e65

See more details on using hashes here.
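
One way to use the published digests is to compare a downloaded file's SHA256 against the value in the table. The sketch below uses only the standard library; the helper names `sha256_of_file` and `matches_published_hash` are illustrative assumptions, and the demo runs on a throwaway file rather than an actual download.

```python
import hashlib
import hmac
import os
import tempfile

# SHA256 published above for rust_data_processing-0.3.3.tar.gz.
SDIST_SHA256 = "bae87dbae3deacd35c68f00d9e97b941b260d110bc85c469acf72f4df12f3117"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's SHA256 with an expected hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Demo on a temporary file so the sketch runs without downloading anything.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
demo_ok = matches_published_hash(tmp.name, hashlib.sha256(b"example bytes").hexdigest())
os.unlink(tmp.name)
print("demo digest matches:", demo_ok)
```

For a real check, `matches_published_hash("rust_data_processing-0.3.3.tar.gz", SDIST_SHA256)` would be called on the downloaded source distribution.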

File details

Details for the file rust_data_processing-0.3.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e6bdcc06d6a45c537081d0eec486ef6feec926b2690f8bda00f3fa1ca16b098d
MD5 fb5021d1413a2184ae59d83fe38e620e
BLAKE2b-256 24c7f6054226a19b28d4374e8e2fda00a184a51ec5b846ba2ed81d049ae27077

File details

Details for the file rust_data_processing-0.3.3-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp315-cp315t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d345cbb93e33d862028cd2614f8e40d632abb4944575175ed6037afe09c87728
MD5 24be08dba308f09ee23fbfc6fd51a893
BLAKE2b-256 9aa39fa7587d86280e5f18d0497f9a3e9a4da57df68ef391058a072b9072c96e

File details

Details for the file rust_data_processing-0.3.3-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e72dd8a68f9aa74e4d3c42d52c51188582f65fb7ac52a35f2a343d419404976f
MD5 96c32f487690600551d807917b84ee3e
BLAKE2b-256 762bfd3c53aafb371ca4c32c9977c8f8adf4e243fd0b184049758ef0ec40e750

File details

Details for the file rust_data_processing-0.3.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 417f9d44ae5599af97252151a3e84bf42353851e734db7a20669c09ef901b6eb
MD5 d8e3c7d0707d6220236906912f8ac75d
BLAKE2b-256 d97ada1a20e5c062c3b455f7ad772da8828d1bf6f8b91d64a0734f5b48f53f69

File details

Details for the file rust_data_processing-0.3.3-cp314-cp314-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 08dd9d5ffdb3f65cf386c2c6cc071d991d6beef6a5719258ed42eff3f960c024
MD5 b3a9ce8828a7796df0307f1cffc230e1
BLAKE2b-256 a59ca4f4451ed72eb0970037c4345579a1a14e440d274d0eaafa8a35a5fd15c1

File details

Details for the file rust_data_processing-0.3.3-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 67b518709bcfecf49e81f0e1b669bd4d2b6028d968e29601bcf892b31a141a12
MD5 454648d4dcdfe1d0df0d2a7cf545d7a7
BLAKE2b-256 d72cb2fc8df7d523dac95187beb45e2cf02ed81b7e2d51634ca0a8c799a75fe7

File details

Details for the file rust_data_processing-0.3.3-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d191ac77cf5cdf5be4a7f1970495b096f4f65341e9dcf0e0c4e2e10169fff1f0
MD5 c34834549e249e1f10fe97dfc678c82f
BLAKE2b-256 35544c3340539e4ad3b600a8ae2eecc54eb005228ed35ce39d691e7f5c47f1d9

File details

Details for the file rust_data_processing-0.3.3-cp313-cp313-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 4acd245331a0b599079f0bae6aab259526b8b3031ccd21d41a6f172d73b7a7c7
MD5 97bdc64d761fbc955553b206d40ed7a1
BLAKE2b-256 06f4a05b54987f0ebb11ccde3d9830cc82bdf2780fb503a5bb1f24ea3996f3b6

File details

Details for the file rust_data_processing-0.3.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 50062d20a72fd69de28e046150e91ec753e13c780060f001e63a33369382667d
MD5 6b35affa28ae25f44bc8464301922a78
BLAKE2b-256 3c22a10cefb03396fbdaa855b52f8bf6a583d352cd1b423711dac3266ba735fc

File details

Details for the file rust_data_processing-0.3.3-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a5cd64c57bb77077678ca6161a35ae67208e16688113504fdb3a3bc15a5372f1
MD5 ebe72f05ec3c1dda6e42a34039f21d3e
BLAKE2b-256 161a1e5d294bf3f4204a44fb9db5e67aaec91972cbd1fb0567a883478f92c501

File details

Details for the file rust_data_processing-0.3.3-cp312-cp312-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 3c0d83dc80fdbfe9368d737cc0523dbf040433aa23309087b6821161ffa710ef
MD5 f3c79e67312585615156e169dd07a4e4
BLAKE2b-256 131a0de93513a9f0dce5559a5599e81d68f668f841fcbb7fac0e44fbd96fb089

File details

Details for the file rust_data_processing-0.3.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7f30af67f2ec6e6ebfb19ef553dd7dd8d67561742c7e49e23b8af187c6448f58
MD5 7aadad191a37df84c1921bc9b5628ab9
BLAKE2b-256 f1949640fba332010e9c6bf7f326ea3440235a3bb59ceb8774b00a386c7abbc0

File details

Details for the file rust_data_processing-0.3.3-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 764c7e9d2017691a28085015cb012608819aae40616082a404e9fd89f41f9195
MD5 b93fb99cfb5f8c6b57c22c0bfc9b3e62
BLAKE2b-256 83249b447782c1e5b22a01fabda70711d9650fff0bdb44512e91cf8b7bc51588

File details

Details for the file rust_data_processing-0.3.3-cp311-cp311-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 10d295e0b3cade4079e51b830b24cc0c4675dea741cc911f8da2d45a3918b6d3
MD5 da2d8e34b365f24ae043f558fb0f92f0
BLAKE2b-256 57f4f950db4267ddda3179b794f94f200ebafee4fbc9a2a8ade46580ca858d9e

File details

Details for the file rust_data_processing-0.3.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 17dcd091cff7bbef531bb800686f300fe93bf0656e7584cf3244761d3f31bf2c
MD5 4032a8bf53ed2226949b482b92c524f6
BLAKE2b-256 77fa30814fcd1e54f763aebad87266e78613e8dc7510d53f578780319da9a51f

File details

Details for the file rust_data_processing-0.3.3-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b1389a2424229b01c5c12c6d2bd2a5e298a6d0739feaf4ed32aa799f9afc8557
MD5 1fa2edf0455c3b61c7a09655e370750a
BLAKE2b-256 4e20eb95e2f0ffe3e8a9b7e09664bbccc30d034168aea4056cebf9baf0095f06

File details

Details for the file rust_data_processing-0.3.3-cp310-cp310-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 893eca6afbaafed2acee800ac29e5104aba79bf442ec5681cf11ce0c8b0bee98
MD5 1e79a58fef0161188cba0506763d70b7
BLAKE2b-256 e13943befe39e7b377494980a7f84b26117b4b54577ef61fed157fe30e3ea6c7

File details

Details for the file rust_data_processing-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 17e33d4fe332854001d28f97d96256c873ea471ed075d2cc071ea698826e5b2e
MD5 072fd9d88f2d8360a4ffa6ad3028ed0e
BLAKE2b-256 2d522bb9061389328096febfa42c5950c841ea659ff3b915017fceade3ff2868
