Pre-processing library for ML models

Project description

lib-ml

This Python library preprocesses text data for machine learning. It provides functions for tokenizing text, padding sequences, and encoding labels, the usual steps before training a model on text. It can also download data from Google Drive and store and load data on disk in several formats. The library is published on PyPI as remla-preprocess.

Features

  • Data Tokenization: Convert text into sequences of integers.
  • Sequence Padding: Pad sequences to a consistent fixed length.
  • Label Encoding: Convert labels into numerical format.
  • Data Storage: Store data at a given path in a selected format.
  • Data Loading: Load data from disk or Google Drive in a selected format.
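
To make the first three features concrete, here is a minimal pure-Python sketch of what tokenization, padding, and label encoding mean. It does not use lib-ml and is not its implementation: the helper names (build_vocab, tokenize, pad, encode_labels), the whitespace word-level tokens, and the choice to pad at the end with 0 are all assumptions made for illustration.

```python
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign each distinct whitespace-separated token an integer id.

    Id 0 is reserved for padding, so real tokens start at 1.
    """
    vocab: Dict[str, int] = {}
    for text in texts:
        for token in text.split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def tokenize(texts: List[str], vocab: Dict[str, int]) -> List[List[int]]:
    """Convert each text into a sequence of integer ids, skipping unknown tokens."""
    return [[vocab[t] for t in text.split() if t in vocab] for text in texts]


def pad(sequences: List[List[int]], length: int, pad_value: int = 0) -> List[List[int]]:
    """Truncate or right-pad every sequence to exactly `length` items."""
    padded = []
    for seq in sequences:
        clipped = seq[:length]
        padded.append(clipped + [pad_value] * (length - len(clipped)))
    return padded


def encode_labels(labels: List[str]) -> List[int]:
    """Map each label to the index of its class in sorted order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels]


# Hypothetical toy data, for illustration only.
texts = ["good movie", "bad movie indeed"]
vocab = build_vocab(texts)
padded = pad(tokenize(texts, vocab), length=4)
encoded = encode_labels(["pos", "neg"])

print(padded)   # [[1, 2, 0, 0], [3, 2, 4, 0]]
print(encoded)  # [1, 0]
```

Every padded sequence has the same length, which is what lets a batch of texts be stacked into a single rectangular array for training.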

Installation

Install the library from PyPI using:

pip install remla-preprocess 
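
To confirm the install succeeded, one option is to ask Python for the installed distribution's version. The sketch below uses only the standard library; installed_version is a hypothetical helper name, and the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("remla-preprocess"))
```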

Usage

Example of how to use lib-ml for text processing:

from remla_preprocessing.pre_processing import MLPreprocessor

# Instantiate the MLPreprocessor class
preprocessor = MLPreprocessor()

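# Tokenize, pad, and encode the three dataset splits in one call.
# train_data, validation_data, and test_data must already be defined
# (for example, loaded from disk beforehand).
preprocessor.tokenize_pad_encode_data(train_data, validation_data, test_data)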

Support

If you encounter any problems or bugs with lib-ml, feel free to open an issue on the project repository.

Download files

Source Distribution

remla_preprocess-0.1.1.tar.gz (3.4 kB view details)

Uploaded Source

Built Distribution

remla_preprocess-0.1.1-py3-none-any.whl (4.1 kB view details)

Uploaded Python 3

File details

Details for the file remla_preprocess-0.1.1.tar.gz.

File metadata

  • Download URL: remla_preprocess-0.1.1.tar.gz
  • Upload date:
  • Size: 3.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for remla_preprocess-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        bf590769cec928a259a93db85c93381b78fffc0c1ebce5beb68fdb7d6151da34
MD5           31dc8fd4f6b380efd1c1c4e2cca80db8
BLAKE2b-256   7108a062d2ff41b5b42b7d9a3c017f8374af66f8a7780cf6cc88079118f47f61
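
A published digest is only useful if you compare it against the file you actually downloaded. The following is a minimal sketch of that check with Python's standard hashlib module. The helper names (sha256_of_file, matches_published_hash) are hypothetical, and the script assumes the source archive sits in the current directory under its published filename.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 digest listed for remla_preprocess-0.1.1.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "bf590769cec928a259a93db85c93381b78fffc0c1ebce5beb68fdb7d6151da34"
)

# Assumed location of the downloaded archive; adjust to your own path.
archive = Path("remla_preprocess-0.1.1.tar.gz")
if archive.exists():
    print("hash matches:", matches_published_hash(archive, EXPECTED_SDIST_SHA256))
else:
    print("archive not found, nothing to verify")
```

pip can perform the same check at install time when a requirements file pins hashes and is installed with the --require-hashes option.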

File details

Details for the file remla_preprocess-0.1.1-py3-none-any.whl.

File hashes

Hashes for remla_preprocess-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        78ce9b7191e947360715606f900d2fb064bd6150256cd9e77f400f5459635455
MD5           9fa08e07c691236b0c936c45cb5954df
BLAKE2b-256   3e97bed7b87e8936e689e86ee81b2a807f666e63dff4137e57ee88f543009776
