Pre-processing library for ML models
lib-ml
This Python library preprocesses text data for machine learning. It provides functions for tokenizing text, padding sequences, and encoding labels, the usual steps before training a model on text. It can also download data from Google Drive and store and load data on disk in several formats. The library is published on PyPI and can be installed with pip.
Features
- Data Tokenization: Convert text into sequences of integers.
- Sequence Padding: Pad sequences to a consistent fixed length.
- Label Encoding: Convert labels into numerical format.
- Data Storage: Store data at a given path in a selected format.
- Data Loading: Load data from disk or Google Drive in a selected format.
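To make the first three features concrete, here is a minimal pure-Python sketch of tokenization, padding, and label encoding. It does not use lib-ml and is not the library's implementation. The helper names `build_vocab`, `tokenize`, `pad`, and `encode_labels`, the whitespace tokenization, and the reserved ids 0 (padding) and 1 (unknown token) are all assumptions made for illustration.

```python
from typing import Dict, List

PAD_ID = 0  # assumed padding id; a real library may choose differently
UNK_ID = 1  # assumed id for tokens missing from the vocabulary


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Give each distinct whitespace-separated token an id, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for token in text.split():
            if token not in vocab:
                vocab[token] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Convert text into a sequence of integer ids."""
    return [vocab.get(token, UNK_ID) for token in text.split()]


def pad(sequence: List[int], length: int) -> List[int]:
    """Truncate or right-pad a sequence to exactly `length` items."""
    return sequence[:length] + [PAD_ID] * max(0, length - len(sequence))


def encode_labels(labels: List[str]) -> List[int]:
    """Map each label to the index of its class in sorted order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels]


texts = ["good movie", "bad movie indeed"]
labels = ["pos", "neg"]

vocab = build_vocab(texts)
sequences = [pad(tokenize(text, vocab), 4) for text in texts]
encoded = encode_labels(labels)

print(sequences)  # [[2, 3, 0, 0], [4, 3, 5, 0]]
print(encoded)    # [1, 0]
```

Padding every sequence to the same length lets the data be stacked into a single rectangular array, which is the shape most model training code expects.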
Installation
Install the library from PyPI using:
pip install remla-preprocessing
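After installing, it can be useful to confirm that the module is importable from the current environment. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical, and the module name `remla_preprocessing` is taken from the import line in the usage example.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Prints True only if the module is importable in this environment.
print(is_installed("remla_preprocessing"))
```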
Usage
Example of how to use lib-ml for text processing:
from remla_preprocessing import MLPreprocessor
# Instantiate the MLPreprocessor class
preprocessor = MLPreprocessor()
# Tokenize, pad, and encode the three data splits in one call.
# train_data, validation_data, and test_data must already be loaded.
preprocessor.tokenize_pad_encode_data(train_data, validation_data, test_data)
Support
If you encounter any problems or bugs with lib-ml, feel free to open an issue on the project repository.
File details
Details for the file remla_preprocess-0.1.0.tar.gz.
File metadata
- Download URL: remla_preprocess-0.1.0.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 22b07fa0913bbf02292d506c2af0502fb9632040ddba4de7c3e5a26a8220b6b5
MD5 | 82ce64b80931c21de15468651e8e5517
BLAKE2b-256 | 16dc26dff47980f6c38a5d6c599def19f1259c46476efc0d45493a1723ab1003
File details
Details for the file remla_preprocess-0.1.0-py3-none-any.whl.
File metadata
- Download URL: remla_preprocess-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | dd7e03873936f3f4e949e4d4f4b76bb3f22583aeb5f4f29aff5a4110aa63e9e1
MD5 | 19bf83a5f3bb7b6ca9ae59941b1401d9
BLAKE2b-256 | 628d6d35ef203e89e85b59ee82a6e8fb9d25d8d873267c0995fd7be06b3445b0
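The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches` are hypothetical, and the example assumes the source archive was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust the path to wherever the file was saved.
    archive = Path("remla_preprocess-0.1.0.tar.gz")
    expected = "22b07fa0913bbf02292d506c2af0502fb9632040ddba4de7c3e5a26a8220b6b5"
    if archive.exists():
        print(matches(archive, expected))
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 3.4 kB archive but makes the helper reusable for larger downloads.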