Data Preparation Toolkit Library

Data Processing Library

This library provides a Python framework for developing transforms on data stored in files (currently, Parquet files are supported) and running them in a Ray cluster. Data files may be stored in the local file system or in COS/S3. For more details, see the documentation.

Virtual Environment

The project uses pyproject.toml and a Makefile for operations. For development, first create the virtual environment

make venv

and then either activate

source venv/bin/activate

or set up your IDE to use the venv directory when developing in this project.
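
Either way, it can help to confirm that the interpreter in use really is the one from the virtual environment. The following is a minimal sketch using only the standard library; the helper name in_virtualenv is illustrative and not part of this project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside an environment created by the venv module, sys.prefix points at the
    # environment, while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Run it from the shell where the environment was activated, or from the IDE's run configuration, and check that the reported interpreter path lies under the venv directory.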

Library Artifact Build and Publish

To test, build and publish the library

make test build publish

To bump the version number, edit the Makefile to change VERSION and rerun the above. This requires committing both the Makefile and the automatically updated pyproject.toml file.
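
Since both files must be committed together, a quick consistency check before committing may be useful. The sketch below is not part of the project. It assumes the Makefile contains a line such as VERSION=0.2.2.dev0 and that pyproject.toml contains a line such as version = "0.2.2.dev0"; the actual layout of either file may differ, and all function names are hypothetical.

```python
import re
from typing import Optional

# Assumed layouts (not confirmed by the project):
#   Makefile:        VERSION=0.2.2.dev0   or   VERSION := 0.2.2.dev0
#   pyproject.toml:  version = "0.2.2.dev0"
_MAKEFILE_VERSION = re.compile(r"^\s*VERSION\s*[:?]?=\s*(\S+)", re.MULTILINE)
_PYPROJECT_VERSION = re.compile(r"^\s*version\s*=\s*[\"']([^\"']+)[\"']", re.MULTILINE)


def makefile_version(text: str) -> Optional[str]:
    """Extract the value assigned to VERSION in Makefile text, if any."""
    match = _MAKEFILE_VERSION.search(text)
    return match.group(1) if match else None


def pyproject_version(text: str) -> Optional[str]:
    """Extract the quoted value assigned to version in pyproject.toml text, if any."""
    match = _PYPROJECT_VERSION.search(text)
    return match.group(1) if match else None


def versions_match(makefile_text: str, pyproject_text: str) -> bool:
    """Return True only when both files declare the same version string."""
    expected = makefile_version(makefile_text)
    return expected is not None and expected == pyproject_version(pyproject_text)


if __name__ == "__main__":
    # Illustrative file contents; in practice, read the two real files instead.
    makefile_sample = "VERSION=0.2.2.dev0\n"
    pyproject_sample = '[project]\nname = "data_prep_toolkit"\nversion = "0.2.2.dev0"\n'
    print("versions match:", versions_match(makefile_sample, pyproject_sample))
```

Regular expressions are used here instead of a TOML parser so that the sketch runs on older Python versions without extra dependencies; a real check would be more robust with a proper parser.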

Download files

Source Distribution

data_prep_toolkit-0.2.2.dev0.tar.gz (125.9 kB)

Uploaded Source

Built Distribution

data_prep_toolkit-0.2.2.dev0-py3-none-any.whl (73.2 kB)

Uploaded Python 3

File details

Details for the file data_prep_toolkit-0.2.2.dev0.tar.gz.

File hashes

Hashes for data_prep_toolkit-0.2.2.dev0.tar.gz
Algorithm    Hash digest
SHA256       ff35da37120d0d476e64847431b223b1920d84a8850fbb3da88510a93ddc957d
MD5          38d5ed704975a412fd46a6e098800db0
BLAKE2b-256  25a4774e2ec51832601ec7ba81fb2be36d0a9f2e60f0693067eb19d314de827a
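
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; the function names and the assumed download location (the current directory) are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for the source distribution.
EXPECTED_SHA256 = "ff35da37120d0d476e64847431b223b1920d84a8850fbb3da88510a93ddc957d"


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("data_prep_toolkit-0.2.2.dev0.tar.gz")
    if archive.exists():
        print("hash matches:", matches_published_hash(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```

The same check applies to the wheel by substituting its file name and its published SHA256 digest.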

File details

Details for the file data_prep_toolkit-0.2.2.dev0-py3-none-any.whl.

File hashes

Hashes for data_prep_toolkit-0.2.2.dev0-py3-none-any.whl
Algorithm    Hash digest
SHA256       57e4d163c608f13924c0558f32945093f55fbf5991d6c551b4c58a32ec69c406
MD5          f22c616fd122ddf1a0103ddf7a7d6d63
BLAKE2b-256  88cf17340733cc33e76d022d54a8a8976a9b051e6040757d04a7c957ba5f00b5
