Pre-processing library for ML models

Project description

lib-ml

This Python library preprocesses text data for machine learning. It provides functions for tokenizing text, padding sequences, and encoding labels, which are common steps before training a model. It can also download data from Google Drive and store and load data on disk in several formats. The library is published on PyPI and can be installed with pip.

Features

  • Data Tokenization: Convert text into sequences of integers.
  • Sequence Padding: Pad sequences to a consistent fixed length.
  • Label Encoding: Convert labels into numerical format.
  • Data Storage: Store data at a given path in a selected format.
  • Data Loading: Load data from disk or Google Drive in a selected format.
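
To make the first three features concrete, here is a minimal plain-Python sketch of what tokenizing, padding, and label encoding generally mean. It does not use or reproduce lib-ml's own implementation: the helper names (`build_vocab`, `tokenize`, `pad`, `encode_labels`), the whitespace word splitting, and the choice to pad and truncate at the end are all assumptions made for illustration.

```python
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign each distinct word an integer id, reserving 0 for padding."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def tokenize(texts: List[str], vocab: Dict[str, int]) -> List[List[int]]:
    """Convert each text into a sequence of integer ids (unknown words are skipped)."""
    return [[vocab[w] for w in text.split() if w in vocab] for text in texts]


def pad(sequences: List[List[int]], length: int, pad_value: int = 0) -> List[List[int]]:
    """Truncate or right-pad every sequence to exactly `length` items."""
    return [seq[:length] + [pad_value] * (length - len(seq)) for seq in sequences]


def encode_labels(labels: List[str]) -> List[int]:
    """Map each distinct label to an integer, in sorted label order."""
    index = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [index[label] for label in labels]


texts = ["good site", "bad site here"]
vocab = build_vocab(texts)
sequences = pad(tokenize(texts, vocab), length=4)
targets = encode_labels(["pos", "neg"])
print(sequences)  # [[1, 2, 0, 0], [3, 2, 4, 0]]
print(targets)    # [1, 0]
```

Fitting the vocabulary on the training texts only, then reusing it for validation and test data, keeps the integer ids consistent across splits.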

Installation

Install the library from PyPI using:

pip install remla-preprocessing 

Usage

Example of how to use lib-ml for text processing:

from remla_preprocessing import MLPreprocessor

# Instantiate the MLPreprocessor class
preprocessor = MLPreprocessor()

# Tokenize, pad, and encode the three data splits in one call.
# train_data, validation_data, and test_data are assumed to be loaded beforehand.
preprocessor.tokenize_pad_encode_data(train_data, validation_data, test_data)

Support

If you encounter any problems or bugs with lib-ml, feel free to open an issue on the project repository.

Download files

Source Distribution

remla_preprocess-0.1.0.tar.gz (3.4 kB)

Uploaded Source

Built Distribution

remla_preprocess-0.1.0-py3-none-any.whl (4.1 kB)

Uploaded Python 3

File details

Details for the file remla_preprocess-0.1.0.tar.gz.

File metadata

  • Download URL: remla_preprocess-0.1.0.tar.gz
  • Size: 3.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for remla_preprocess-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       22b07fa0913bbf02292d506c2af0502fb9632040ddba4de7c3e5a26a8220b6b5
MD5          82ce64b80931c21de15468651e8e5517
BLAKE2b-256  16dc26dff47980f6c38a5d6c599def19f1259c46476efc0d45493a1723ab1003
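
A published digest can be used to check that a downloaded file is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the example assumes the source distribution has been saved in the current directory under its original filename.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; the digest is the SHA256 value listed for this file.
    archive = Path("remla_preprocess-0.1.0.tar.gz")
    expected = "22b07fa0913bbf02292d506c2af0502fb9632040ddba4de7c3e5a26a8220b6b5"
    if archive.exists():
        print("digest matches:", matches_published_digest(archive, expected))
    else:
        print("archive not found; download it first")
```

pip can perform the same check at install time when a requirements file pins hashes with `--hash=sha256:...`.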

File details

Details for the file remla_preprocess-0.1.0-py3-none-any.whl.

File hashes

Hashes for remla_preprocess-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       dd7e03873936f3f4e949e4d4f4b76bb3f22583aeb5f4f29aff5a4110aa63e9e1
MD5          19bf83a5f3bb7b6ca9ae59941b1401d9
BLAKE2b-256  628d6d35ef203e89e85b59ee82a6e8fb9d25d8d873267c0995fd7be06b3445b0
