CSV and JSON file preprocessor
Project description
Preprocessor
Preprocessor is a Python library for preprocessing CSV files and flattening JSON files.
- Handles missing values in CSV files, including replacement of missing values
- Preprocesses textual columns in CSV files with text preprocessing and word normalization
- Automatically detects the data type of each CSV column and applies the matching preprocessing
- Flattens JSON files with any level of nesting
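The description does not show how the library flattens JSON internally, so the following is only a minimal sketch of what flattening a nested JSON document can look like. The helper name `flatten_json`, the `_` separator, and the use of list indexes in key names are assumptions made for illustration, not the library's documented behavior.

```python
import json


def flatten_json(obj, parent_key="", sep="_"):
    """Hypothetical helper: flatten nested dicts and lists into one flat dict.

    Nested keys are joined with `sep`; list items use their index as the key part.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            flat.update(flatten_json(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            flat.update(flatten_json(value, new_key, sep))
    else:
        # Scalar leaf: store it under the accumulated key.
        flat[parent_key] = obj
    return flat


nested = json.loads('{"id": 1, "user": {"name": "Ann", "tags": ["a", "b"]}}')
flat = flatten_json(nested)
print(flat)
```

Each leaf value ends up under a single key built from its path, which is the shape a tabular tool needs before further preprocessing.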
Tech
Preprocessor class: Preprocessor.preprocessor(file, filetype=None)
Parameters:
- file : str, csv, dict
File to be preprocessed
- filetype : str
Type of the input file. Valid options are dataframe and json
Methods:
Preprocessor.preprocessor.csv_preprocessor(threshold_4_delete_null=0.5, no_null_columns=None, numeric_null_replace=None, textual_column_word_tokenize=False, textual_column_word_normalize=False)
Parameters:
- threshold_4_delete_null : float
Ratio of null values to the number of rows, used as the threshold for deleting nulls
- no_null_columns : list
List of columns that must not contain any null values
- numeric_null_replace : str
Strategy for replacing null values in numeric columns. Valid options are mean, median and mode
- textual_column_word_tokenize : bool
Whether to tokenize words in textual columns
- textual_column_word_normalize : str
Type of word normalization to apply to textual columns. Defaults to False
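To make the null-handling parameters above more concrete, here is a rough standard-library sketch of one plausible interpretation. It is not the library's implementation: the helper name `handle_nulls`, the rule that a column is dropped when its null ratio is strictly greater than the threshold, and the rule that rows with nulls in `no_null_columns` are dropped are all assumptions.

```python
import statistics


def handle_nulls(rows, threshold=0.5, no_null_columns=None, numeric_null_replace="mean"):
    """Hypothetical helper. `rows` is a list of dicts; None marks a missing value."""
    # Assumption: rows with a null in any "must not be null" column are removed.
    required = no_null_columns or []
    rows = [row for row in rows if all(row[col] is not None for col in required)]
    if not rows:
        return []

    fillers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    fill = fillers[numeric_null_replace]
    n_rows = len(rows)

    kept = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        null_ratio = sum(v is None for v in values) / n_rows
        # Assumption: a column is dropped when its null ratio exceeds the threshold.
        if null_ratio > threshold:
            continue
        present = [v for v in values if v is not None]
        is_numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        if is_numeric and len(present) < n_rows:
            # Replace remaining nulls in numeric columns with the chosen statistic.
            replacement = fill(present)
            values = [replacement if v is None else v for v in values]
        kept[col] = values

    return [{col: kept[col][i] for col in kept} for i in range(n_rows)]


rows = [
    {"age": 30, "score": None, "city": "Pune"},
    {"age": None, "score": None, "city": "Delhi"},
    {"age": 40, "score": 7.5, "city": None},
    {"age": 50, "score": None, "city": "Agra"},
]
cleaned = handle_nulls(rows, threshold=0.5, numeric_null_replace="mean")
print(cleaned)
```

In this sample the `score` column is three-quarters null and is dropped under the assumed rule, while the single missing `age` is filled with the mean of the remaining ages. Textual columns such as `city` are left untouched by the numeric replacement.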
File details
Details for the file Pre_processor-0.0.3.tar.gz.
File metadata
- Download URL: Pre_processor-0.0.3.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | cd5e2025443c6e2eb37430ecda07ac51e5b82336f22413014bca8b39b903fed6
MD5 | e036282e4e91f74c1549f9d2a74b070e
BLAKE2b-256 | fcf471616002a693122fae95d724b983e9d4fd12a92b132eaac02eb2358f9ebb
File details
Details for the file Pre_processor-0.0.3-py3-none-any.whl.
File metadata
- Download URL: Pre_processor-0.0.3-py3-none-any.whl
- Upload date:
- Size: 4.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 618e1319d6dfd2111ce821f4fd868bd9d79ced1cdb7bf7200c4b7e9f924a093a
MD5 | 8260015394306c036ead2b9443d93ef0
BLAKE2b-256 | f9f02b5f2aeaf70beac8d2af618f11efe9af4cebd603389e09d407f2eee5ec3c