Python bindings for rust-data-processing: schema-first CSV/JSON/Parquet/Excel ingestion into an in-memory DataSet.

Project description

rust-data-processing

Phase 2 scope: Phase 1 baseline plus export, privacy, Arrow, incremental ETL → Python; JVM planned

Python bindings for the rust-data-processing crate. The package provides schema-first ingestion from CSV, JSON, Parquet, and Excel into an in-memory DataSet, along with profiling, validation, Polars-backed pipelines, and SQL. Phase 2 adds JSONL export, privacy transforms and summaries, median, Arrow interop, and incremental ingest helpers.

Infographic: Phase 2 — Phase 1 flow plus export, privacy, median, Arrow, incremental ETL; JVM planned Phase 3.

This page is the PyPI project description (Python-only). Clone the repository for developer setup, Rust sources, and the full monorepo README.

Install

pip install rust-data-processing

Requires Python 3.10+.

Quick start

import rust_data_processing as rdp

schema = [
    {"name": "id", "data_type": "int64"},
    {"name": "name", "data_type": "utf8"},
]
ds = rdp.ingest_from_path("path/to/data.csv", schema, {"format": "csv"})
print("rows", ds.row_count())

report = rdp.profile_dataset(ds, {"head_rows": 50, "quantiles": [0.5]})
print("profile rows sampled", report["row_count"])

validation = rdp.validate_dataset(
    ds,
    {"checks": [{"kind": "not_null", "column": "id", "severity": "error"}]},
)
print("checks", validation["summary"]["total_checks"])
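
To try the quick start without an existing file, a small CSV matching the two-column schema can be written with the standard library first. This is only a sketch: the helper `write_sample_csv`, the file name `sample.csv`, and the sample rows are hypothetical; only the column names and the `ingest_from_path` call come from the quick start above. The import is guarded so the script still runs where the package is not installed.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(path: Path) -> Path:
    """Write a tiny CSV whose header matches the quick-start schema (id, name)."""
    rows = [(1, "alice"), (2, "bob"), (3, "carol")]  # made-up sample data
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "name"])
        writer.writerows(rows)
    return path


# Hypothetical location: a fresh temporary directory.
sample_path = write_sample_csv(Path(tempfile.mkdtemp()) / "sample.csv")

try:
    import rust_data_processing as rdp
except ImportError:
    rdp = None  # package not installed; the CSV has still been written

if rdp is not None:
    # Same schema and call shape as the quick start.
    schema = [
        {"name": "id", "data_type": "int64"},
        {"name": "name", "data_type": "utf8"},
    ]
    ds = rdp.ingest_from_path(str(sample_path), schema, {"format": "csv"})
    print("rows", ds.row_count())
```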

Phase 2 (export, privacy, JSONL, median, Delta handoff)

Copy-paste snippets: Phase 2 Python examples (Markdown in repo). These APIs are also summarized in API.md (section Export, privacy summaries, truncation (Phase 2)).

Documentation

Resource                                  Link
This package on PyPI                      pypi.org/project/rust-data-processing
Python examples (HTML, pdoc)              GitHub Pages (examples)
Python API (HTML, pdoc)                   GitHub Pages (Python)
Python API (markdown)                     API.md in the repository
Combined site (landing + Rust rustdoc)    GitHub Pages (home)
Rust crate API                            docs.rs/rust-data-processing
Repository                                github.com/scorpio-datalake/rust-data-processing

License

MIT OR Apache-2.0 - see LICENSE-MIT and LICENSE-APACHE in the repository.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rust_data_processing-0.3.1.tar.gz (4.0 MB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rust_data_processing-0.3.1-cp314-cp314-win_amd64.whl (29.5 MB)

Uploaded CPython 3.14, Windows x86-64

rust_data_processing-0.3.1-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.2 MB)

Uploaded CPython 3.14, manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.1-cp314-cp314-macosx_11_0_arm64.whl (29.3 MB)

Uploaded CPython 3.14, macOS 11.0+ ARM64

rust_data_processing-0.3.1-cp313-cp313-win_amd64.whl (29.5 MB)

Uploaded CPython 3.13, Windows x86-64

rust_data_processing-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.2 MB)

Uploaded CPython 3.13, manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.1-cp313-cp313-macosx_11_0_arm64.whl (29.3 MB)

Uploaded CPython 3.13, macOS 11.0+ ARM64

rust_data_processing-0.3.1-cp312-cp312-win_amd64.whl (29.5 MB)

Uploaded CPython 3.12, Windows x86-64

rust_data_processing-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.2 MB)

Uploaded CPython 3.12, manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.1-cp312-cp312-macosx_11_0_arm64.whl (29.3 MB)

Uploaded CPython 3.12, macOS 11.0+ ARM64

rust_data_processing-0.3.1-cp311-cp311-win_amd64.whl (29.5 MB)

Uploaded CPython 3.11, Windows x86-64

rust_data_processing-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.2 MB)

Uploaded CPython 3.11, manylinux: glibc 2.17+ x86-64

rust_data_processing-0.3.1-cp311-cp311-macosx_11_0_arm64.whl (29.3 MB)

Uploaded CPython 3.11, macOS 11.0+ ARM64

rust_data_processing-0.3.1-cp310-cp310-win_amd64.whl (29.5 MB)

Uploaded CPython 3.10, Windows x86-64

rust_data_processing-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.2 MB)

Uploaded CPython 3.10, manylinux: glibc 2.17+ x86-64
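
The wheel names listed above follow the standard wheel naming convention, in which the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag. As a rough illustration, the sketch below splits a filename into those parts. The `WheelTags` type and `parse_wheel_filename` helper are hypothetical names introduced here, and the sketch assumes no optional build tag is present (none of the files above has one).

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its tags (assumes no build tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    stem = filename[: -len(".whl")]
    # The last three fields are always python tag, ABI tag, platform tag.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    # What remains is "<distribution>-<version>" when there is no build tag.
    distribution, version = head.split("-", 1)
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


tags = parse_wheel_filename(
    "rust_data_processing-0.3.1-cp312-cp312-"
    "manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

For real tooling, the `packaging` library's `packaging.utils.parse_wheel_filename` handles build tags and compressed tag sets; the hand-rolled version here only shows how the name is laid out.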

File details

Details for the file rust_data_processing-0.3.1.tar.gz.

File metadata

  • Download URL: rust_data_processing-0.3.1.tar.gz
  • Upload date:
  • Size: 4.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rust_data_processing-0.3.1.tar.gz
Algorithm Hash digest
SHA256 35abef59d952666dfa0497b3747f2f2e05a20b88ade939939d3d0c80fd43b2d5
MD5 53966a205ab53f6bcfabd47151913fa1
BLAKE2b-256 d1113ab51567f0ea2bcb167c038f9156c446cd94108a40b97a71b59f31390c1c

See more details on using hashes here.
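
One way to use the published digests is to hash a downloaded file locally and compare the result with the SHA256 value listed for it. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage comment is an assumption about where the file was saved.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


# Usage (assumes the sdist was downloaded to the current directory):
# expected = "35abef59d952666dfa0497b3747f2f2e05a20b88ade939939d3d0c80fd43b2d5"
# print(matches_published_hash(Path("rust_data_processing-0.3.1.tar.gz"), expected))
```

pip can also enforce digests itself through its hash-checking mode, where each requirement line carries a `--hash=sha256:...` option.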

File details

Details for the file rust_data_processing-0.3.1-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3fce02141b177cd43ded3fb8f3dd07285997983f6f8ff763f6d0c2d94ed3c700
MD5 0eadc3c38e4d83db1d6db7085b3cd958
BLAKE2b-256 c61e4b7f47cc3e80ca7f91431bdd65d8e70c7d6e3ba2d81ff6b2186cc5adfd27

File details

Details for the file rust_data_processing-0.3.1-cp314-cp314-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 e37a662c3e97d9de25bc3992fcaf56f6332a93aea3454d6c4b52c4d8fc9632c8
MD5 fa90e7893c1cc86b2b6cc17176785381
BLAKE2b-256 7c948cc90f36139aee0a7e57325f4d289dcac1a8634d833b13b6f6aa5e606ec2

File details

Details for the file rust_data_processing-0.3.1-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b6589289a44e8c0b68d00dbdda96dcaef1d706fdc799eac7c5ce6379bf1b35d3
MD5 8a2eb78c56c857b5954ce94d7c9cf009
BLAKE2b-256 10a78d2526fcc1c82395dd3aa7e2c3b00316cabc417a69e472f6da3727b293c1

File details

Details for the file rust_data_processing-0.3.1-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b2d04c4f496fbe62d52cbbd74281c26c133ae86e822281bc324df280f5d7b8dd
MD5 a2e1a4cf9d79ce8daf1e26b02dd29cb1
BLAKE2b-256 57d8850a916c4281ce43f6ea4573da91cce5d03e5ced10b73f318a9db2ca7d33

File details

Details for the file rust_data_processing-0.3.1-cp313-cp313-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 1f31baac301d937eb5959022f97c386554d8455c78641f1ab0fd106d25094811
MD5 d51c88c53f0dff1deaeb22570e2ea15e
BLAKE2b-256 7635f5627cb9012b799eea73b0cca52de896e3ebc74b90158a0eeec221865cb6

File details

Details for the file rust_data_processing-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 1e757cdd80f684908abe77d3e00333c93ace8b331ffea63734de946b899fae35
MD5 b7a8ea7a13f3875dc61b6133d1f5d84b
BLAKE2b-256 918ba5f7cf21f1b082a614d9155312707a3a91463367926d321cf8fa11342b44

File details

Details for the file rust_data_processing-0.3.1-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e0162779bcbb63e6bbc6c0a8ef42408452c927825407e204825fc3e376f79605
MD5 3e8b2662668d3077fe85c4929ddf9bc2
BLAKE2b-256 7b2ed980fa2f973e2df4c01a9b5b3a6cc9f27ad01a61a4925c56b50c65c11bb0

File details

Details for the file rust_data_processing-0.3.1-cp312-cp312-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 e3c7fc11e6c328d77f60e422faaff567457740201999b3f01fcff054f6ba217f
MD5 0af33011bc67e9ead8d8e6ed8edc60fa
BLAKE2b-256 fc0cc7bc45f855e1f85718de4e45cdfb21fecf673b2830e64e70aa6314d08c8f

File details

Details for the file rust_data_processing-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4115ec84e94fb2a87f9ffae51db2d749b3ff34cbcf4ee005cebe806dd4f68843
MD5 d9ba738a92e7524ecc27c9321d4d0666
BLAKE2b-256 1810d1c0955fb0290bbb289621836770f11bccf829eb19c4be5c30cd4c85de12

File details

Details for the file rust_data_processing-0.3.1-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 19dfd7c625892731aa8096e481ac9ef955bc5a95e2e287b9ea26e109695621e3
MD5 6dc890dde090aed2bd231f93f08cb3cd
BLAKE2b-256 bce5ea87157b3a67c93ff32ff615e2fb7b16ee3ea9073568585a1a342db88f23

File details

Details for the file rust_data_processing-0.3.1-cp311-cp311-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 a0d4d4cfb9561a81e5239fab326e9d3dcc6201f8fa07831105f4307f62e4b78a
MD5 aa133c165ac8f1eee4dd4e60bc07a3ac
BLAKE2b-256 fca7b7ab4b1d3b1885c0fadcdefb4bd2cdfe731428c995b8bf34a661e8200564

File details

Details for the file rust_data_processing-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fa47a1c935fb080c3a1eff513d86c11fb0595cb7e7a8146fd1ff7224de605c58
MD5 f468f1dcd1035b94abd372affb52533c
BLAKE2b-256 db15c90e4a5c244abda8a434ef8ea0ed72a820f0ef351949a143a7deaea19652

File details

Details for the file rust_data_processing-0.3.1-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ff76d3c287169257ccda3016dbb8502feb5829d41b45a8f0017121be8f859505
MD5 30c709be53e94103a397bfdeb353c1d5
BLAKE2b-256 e30a75f8dcef86622cf49aba3b729c163e5752454243cbe5490dc89231573147

File details

Details for the file rust_data_processing-0.3.1-cp310-cp310-win_amd64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 5f62258b26bd0473ce84df793624e7a7899744a321ee885011f60cc1d8eae8e2
MD5 9618f40ee0bb1a80e12ba5e3528bddf7
BLAKE2b-256 580cd8388e3354e98ad8338c1c46dd1f1d532208c850d757e2d63a70d11b5687

File details

Details for the file rust_data_processing-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rust_data_processing-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 91b35fd0e817790d89bc2312d14527456d21b417123ce623e405b376fa16c474
MD5 4dfc68de9351aece7fe50be6b2dea1de
BLAKE2b-256 b8ac8fe9c9f97f222527394758c1c56127c538cb02e9139862e8388193ccd30a
