Oxidize

High-performance data processing tools for Python, built with Rust.

Philosophy

  • Best of both worlds: Python interfaces with Rust backends for both simplicity and performance
  • True parallelism: GIL release for concurrent processing
  • Easy installation: Pre-built wheels, no compilation required
  • Practical: Specialized solutions for common data engineering tasks

Tools

oxidize-postal

oxidize-postal is an alternative to pypostal, the Python bindings for the libpostal library, which provides address parsing and normalization with international support.

oxidize-postal provides the same address parsing capabilities as pypostal but addresses key limitations: it installs without C compilation, releases the Python GIL for true parallel processing, and offers a cleaner API. It is built in Rust on top of the libpostal-rust bindings to the libpostal C library.

import oxidize_postal

parsed = oxidize_postal.parse_address("781 Franklin Ave Brooklyn NY 11216")
# {'house_number': '781', 'road': 'franklin ave', 'city': 'brooklyn', 'state': 'ny', 'postcode': '11216'}

expansions = oxidize_postal.expand_address("123 Main St NYC NY")
# ['123 main street nyc new york', '123 main street nyc ny', ...]
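
Because the GIL is released during parsing, a plain thread pool is one way to spread a batch of addresses across cores. The following is a minimal sketch, not part of the library: `parse_many` is a hypothetical helper, and it takes the parser as an argument so that it can be tried with any callable.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def parse_many(
    addresses: Iterable[str],
    parse_fn: Callable[[str], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Parse addresses on a thread pool, preserving input order.

    Hypothetical helper: `parse_fn` is any callable that maps one
    address string to a dict, e.g. oxidize_postal.parse_address.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(parse_fn, addresses))
```

With the package installed, a call would look like `parse_many(addresses, oxidize_postal.parse_address)`. Threads only help here if the parser really does release the GIL, as described above; with a pure-Python parser the calls would still run one at a time.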

oxidize-xml

oxidize-xml is an alternative to lxml that provides streaming XML-to-JSON conversion for large files.

oxidize-xml is more specialized and opinionated, focusing on a common data engineering workflow: extracting repeated elements from large XML files such as API responses, log files, and data exports. It is built particularly for engineers and analysts working in DuckDB or Polars.

import oxidize_xml

# Extract repeated elements to JSON Lines
count = oxidize_xml.parse_xml_file_to_json_file("data.xml", "book", "output.jsonl")

# Stream processing for large files
json_lines = oxidize_xml.parse_xml_file_to_json_string("export.xml", "record")
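
Once the elements are written out as JSON Lines, they can be read back one record at a time with the standard library alone. This is a minimal sketch that assumes the output file holds one JSON object per line; `iter_jsonl` is a hypothetical helper, not part of oxidize-xml.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Iterator


def iter_jsonl(path: str | Path) -> Iterator[dict[str, Any]]:
    """Yield one decoded record per non-empty line of a JSON Lines file.

    Assumes each line is a single JSON object (an assumption about the
    output format, not something the library documents here).
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                yield json.loads(line)
```

For DuckDB or Polars there is no need for a helper like this, since both can read newline-delimited JSON directly (for example `polars.read_ndjson("output.jsonl")` or DuckDB's `read_json_auto('output.jsonl')`).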

Future Tools

New versions of oxidize-xml and oxidize-postal, plus new packages, are coming soon.

License

MIT License for all tools.
