Data Processing Toolkit
Project description
DKit (Data Toolkit)
A general-purpose data processing library for Python:
- ETL
  - maintain schemas
  - schema transforms
  - transform from one format to the other
  - support many different formats (see below)
- Data Exploration
- Data manipulation
- Report generation using LaTeX and ReportLab
- Extensive test coverage (>70%)
Data formats
Includes extensions that facilitate reading data, transforming it, and then writing it to any of the following formats:
- Parquet (using pyarrow)
- SQL (using any SQLAlchemy enabled database)
- MessagePack
- HDF5
- XML
- JSON and JSONL
- CSV
- Excel
- Apache Avro
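
As a rough illustration of the read, transform, write flow described above, the sketch below converts CSV rows to JSON Lines using only the Python standard library. It does not use DKit's own API, which is not documented in this description; the names `csv_to_jsonl` and `cast_age` and the sample data are hypothetical.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text, transform=None):
    """Convert CSV text to JSON Lines, optionally transforming each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        if transform is not None:
            row = transform(row)
        # One JSON object per output line; keys sorted for stable output.
        lines.append(json.dumps(row, sort_keys=True))
    return "\n".join(lines)


def cast_age(row):
    # Hypothetical transform: cast the "age" column from str to int.
    return {**row, "age": int(row["age"])}


sample = "name,age\nada,36\nlinus,54\n"
result = csv_to_jsonl(sample, transform=cast_age)
print(result)
```

The same read, transform, write shape applies to the other formats listed; only the reader and writer change.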
Schema Generation
Supports schema generation for the following:
- Apache Arrow
- Apache Avro
- SQL (via SQLAlchemy)
- Spark
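
To make the idea of schema generation concrete, here is a minimal sketch that infers a field-to-type mapping from sample records. It is a generic illustration in plain Python, not DKit's implementation; `infer_schema` and the widening rules are assumptions made for this example. A mapping like this could then be translated into an Arrow, Avro, SQLAlchemy, or Spark schema definition.

```python
def infer_schema(records):
    """Infer a {field: type_name} mapping from an iterable of dicts."""
    schema = {}
    for record in records:
        for key, value in record.items():
            type_name = type(value).__name__
            seen = schema.setdefault(key, type_name)
            if seen != type_name:
                # Conflicting types: widen int/float to float,
                # otherwise fall back to str (assumed rule).
                if {seen, type_name} == {"int", "float"}:
                    schema[key] = "float"
                else:
                    schema[key] = "str"
    return schema


rows = [
    {"id": 1, "price": 10, "label": "a"},
    {"id": 2, "price": 12.5, "label": "b"},
]
schema = infer_schema(rows)
print(schema)
```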
Download files
Source Distribution
libdkit-25.3.5.tar.gz (1.5 MB)
File details
Details for the file libdkit-25.3.5.tar.gz.
File metadata
- Download URL: libdkit-25.3.5.tar.gz
- Upload date:
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 92bcc80097d9c86e403b17762bc370f7479e779c3ef51ef35a0ea96c160755a4 |
| MD5 | 3a54598c8709a0538b80d77dfcfc547d |
| BLAKE2b-256 | d64ec58435560016664f317f74dad6e4161aec20cabd1aea43c1932b0094607d |
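
The published digests can be used to check a downloaded archive. The sketch below compares a file's SHA256 digest against the value listed above, using the standard `hashlib` module; the helper names `sha256_of_file` and `verify_download` and the local file path are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for libdkit-25.3.5.tar.gz.
EXPECTED_SHA256 = "92bcc80097d9c86e403b17762bc370f7479e779c3ef51ef35a0ea96c160755a4"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of_file(path) == expected
```

Assuming the archive was saved in the current directory, `verify_download("libdkit-25.3.5.tar.gz")` should return `True` for an unmodified download.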