High-performance data processing tools for Python, built with Rust
Oxidize
Philosophy
- Best of both worlds: Python interfaces with Rust backends for both simplicity and performance
- True parallelism: GIL release for concurrent processing
- Easy installation: Pre-built wheels, no compilation required
- Practical: Specialized solutions for common data engineering tasks
Tools
oxidize-postal
oxidize-postal is an alternative to pypostal as Python bindings for the libpostal library, which provides address parsing and normalization with international support.
oxidize-postal provides the same address parsing capabilities as pypostal but addresses key limitations: it installs without C compilation, releases the Python GIL for true parallel processing, and offers a cleaner API. It is built using Rust and the libpostal-rust bindings to the libpostal C library.
import oxidize_postal
parsed = oxidize_postal.parse_address("781 Franklin Ave Brooklyn NY 11216")
# {'house_number': '781', 'road': 'franklin ave', 'city': 'brooklyn', 'state': 'ny', 'postcode': '11216'}
expansions = oxidize_postal.expand_address("123 Main St NYC NY")
# ['123 main street nyc new york', '123 main street nyc ny', ...]
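Because the parser is described as releasing the GIL, a plain thread pool should be enough to parse many addresses concurrently. The following is a minimal sketch, not part of the package: `parse_in_parallel` is a hypothetical helper, and it takes the parsing function as an argument so that it does not depend on any particular oxidize-postal signature.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def parse_in_parallel(
    addresses: Iterable[str],
    parse_fn: Callable[[str], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Apply parse_fn to every address using a thread pool.

    Hypothetical helper: threads only run in parallel if parse_fn
    releases the GIL while it works. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input iterable
        return list(pool.map(parse_fn, addresses))
```

With the package installed, passing `oxidize_postal.parse_address` as `parse_fn` would be the natural usage; the actual speedup depends on how much of each call runs outside the GIL.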
oxidize-xml
oxidize-xml is an alternative to lxml that provides streaming XML-to-JSON conversion for large files.
oxidize-xml is more specialized and opinionated than lxml. It focuses on a common data engineering workflow: extracting repeated elements from large XML files such as API responses, log files, and data exports. It is built particularly for engineers and analysts working in DuckDB or Polars.
import oxidize_xml
# Extract repeated elements to JSON Lines
count = oxidize_xml.parse_xml_file_to_json_file("data.xml", "book", "output.jsonl")
# Stream processing for large files
json_lines = oxidize_xml.parse_xml_file_to_json_string("export.xml", "record")
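The output can then be decoded with the standard library alone. This sketch assumes the usual JSON Lines convention of one JSON object per line, and that the string returned by `parse_xml_file_to_json_string` follows it; `parse_json_lines` is a hypothetical helper, not part of oxidize-xml.

```python
import json


def parse_json_lines(text: str) -> list[dict]:
    """Decode JSON Lines text: one JSON object per non-empty line.

    Hypothetical helper. Assumes each line holds a complete JSON object.
    """
    # Split on "\n" only, so unusual Unicode line separators that may
    # appear inside JSON string values do not break a record in two.
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Under that assumption, `parse_json_lines(json_lines)` would give a list of dictionaries, one per extracted element, and the contents of `output.jsonl` could be decoded the same way after reading the file.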
Future Tools
New versions of oxidize-xml and oxidize-postal, plus new packages, are coming soon.
License
MIT License for all tools.
File details
Details for the file oxidize-0.7.0.tar.gz.
File metadata
- Download URL: oxidize-0.7.0.tar.gz
- Upload date:
- Size: 2.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b3a446265641720d93af86941ef5af8274ce732b061454e129b7c368f9e98d28 |
| MD5 | bcb871ef478c6c64ac481a7bc672c0a2 |
| BLAKE2b-256 | 29dabf69cdb542d3fd324f934445c462456a5faaa9a6b22ac27006308aead2af |
File details
Details for the file oxidize-0.7.0-py3-none-any.whl.
File metadata
- Download URL: oxidize-0.7.0-py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 148d7297013e3c843b2c4f0f0ee41d91788c0809bd15009ca8c4ea8646c81709 |
| MD5 | 26bd47c43637f149b8aeb88555cdd766 |
| BLAKE2b-256 | f9d6a583972b15f234f7c12ecef8e4c85ad509ac25359b0614995f0cc2ccb4a4 |