A helper toolkit that simplifies commonly used data processing functions.
Project description
Binary Rain Helper Toolkit: Data Processing
binaryrain_helper_data_processing is a Python package that simplifies common data processing tasks. It builds on the pandas library and adds functionality that makes data processing easier, reduces boilerplate code, and provides clear error messages.
Supported File Formats
- PARQUET: For efficient columnar storage
- CSV: For common tabular data
- JSON: For structured data exchange
- DICT: For Python dictionary data
Key Functions
- create_dataframe(): simplifies creating pandas DataFrames from various formats:

      from binaryrain_helper_data_processing import FileFormat, create_dataframe

      # Create from CSV bytes
      df = create_dataframe(csv_bytes, FileFormat.CSV)

      # Create with custom options
      df = create_dataframe(parquet_bytes, FileFormat.PARQUET, file_format_options={'engine': 'pyarrow'})
- convert_dataframe_to_type(): handles converting DataFrames to different formats:

      from binaryrain_helper_data_processing import FileFormat, convert_dataframe_to_type

      # ... df is a pandas DataFrame

      # Convert to CSV bytes
      csv_bytes = convert_dataframe_to_type(df, FileFormat.CSV)

      # Convert with custom options
      parquet_bytes = convert_dataframe_to_type(df, FileFormat.PARQUET, file_format_options={'engine': 'pyarrow'})
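For orientation, the CSV round trip that these two helpers wrap can be sketched with plain pandas. This is not the package's implementation; `to_csv_bytes` and `from_csv_bytes` are hypothetical names used only for illustration.

```python
import io

import pandas as pd


def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 encoded CSV bytes (hypothetical helper)."""
    return df.to_csv(index=False).encode("utf-8")


def from_csv_bytes(data: bytes) -> pd.DataFrame:
    """Parse CSV bytes back into a DataFrame (hypothetical helper)."""
    return pd.read_csv(io.BytesIO(data))


original = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
restored = from_csv_bytes(to_csv_bytes(original))
```

Working with bytes rather than file paths keeps the same code usable for data that arrives from blob storage or an HTTP response.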
- merge_dataframes(): provides a simple way to merge multiple DataFrames:

      from binaryrain_helper_data_processing import merge_dataframes

      # ... df1 and df2 are pandas DataFrames

      # Merge DataFrames
      merged_df = merge_dataframes(df1, df2, sort=True)
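As a point of comparison, a merge with `sort=True` in plain pandas might look like the sketch below. Whether `merge_dataframes()` joins on shared columns in the same way is an assumption; the sample frames and column names are invented for illustration.

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [2, 1], "total": [20.0, 10.0]})
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ada", "Bob"]})

# With no `on` argument, pandas joins on the columns both frames share
# (here `customer_id`); sort=True orders the result by the join key.
merged = pd.merge(orders, customers, sort=True)
```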
- convert_todatetime(): automatically detects and converts date columns. Supports common date formats:
  - %d.%m.%Y (e.g., "31.12.2023")
  - %Y-%m-%d (e.g., "2023-12-31")
  - %Y-%m-%d %H:%M:%S (e.g., "2023-12-31 23:59:59")
  - %Y-%m-%dT%H:%M:%S (ISO format)

      from binaryrain_helper_data_processing import convert_todatetime

      # ... df is a pandas DataFrame

      # Convert date columns
      df = convert_todatetime(df)
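One way format detection over the four listed formats could work is to try each format against every value and keep the first one that parses them all. The sketch below uses only the standard library; `detect_date_format` is a hypothetical name, and the package may detect formats differently.

```python
from datetime import datetime
from typing import Optional, Sequence

# The four formats listed above, tried in order.
DATE_FORMATS = (
    "%d.%m.%Y",
    "%Y-%m-%d",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
)


def detect_date_format(values: Sequence[str]) -> Optional[str]:
    """Return the first format that parses every value, or None if none does."""
    if not values:
        return None
    for fmt in DATE_FORMATS:
        try:
            for value in values:
                datetime.strptime(value, fmt)
        except ValueError:
            continue  # at least one value did not match; try the next format
        return fmt
    return None
```

A column would then be converted only when a format is found, which leaves ordinary text columns untouched.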
- format_datetime_columns(): formats specific datetime columns:

      from binaryrain_helper_data_processing import format_datetime_columns

      # ... df is a pandas DataFrame

      # Format date columns directly
      df = format_datetime_columns(df, datetime_columns=['date_column1', 'date_column2'], datetime_format='%Y-%m-%d')

      # Format date columns and date values held in string columns
      # NOTE: datetime_string_columns is an assumed parameter name; check the package signature
      df = format_datetime_columns(df, datetime_columns=['date_column1', 'date_column2'], datetime_format='%Y-%m-%d', datetime_string_columns=['string_column1', 'string_column2'])
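In plain pandas, rendering selected columns in a given format can be sketched as follows. `format_columns` is a hypothetical stand-in, not the package function, and it assumes the columns are datetimes or strings that `pd.to_datetime` can parse.

```python
import pandas as pd


def format_columns(df: pd.DataFrame, columns: list, fmt: str) -> pd.DataFrame:
    """Return a copy with the given columns rendered as strings in `fmt`."""
    out = df.copy()
    for col in columns:
        # Parse first so that string columns holding dates are handled too.
        out[col] = pd.to_datetime(out[col]).dt.strftime(fmt)
    return out


df = pd.DataFrame({"created": ["2023-12-31 23:59:59", "2024-01-01 00:00:00"]})
formatted = format_columns(df, ["created"], "%d.%m.%Y")
```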
- clean_dataframe(): cleans DataFrames by removing duplicates and missing values:

      from binaryrain_helper_data_processing import clean_dataframe

      # ... df is a pandas DataFrame

      # Clean DataFrame
      df = clean_dataframe(df)
- remove_empty_values(): removes rows with empty values in a specific column:

      from binaryrain_helper_data_processing import remove_empty_values

      # ... df is a pandas DataFrame

      # Remove empty values
      df = remove_empty_values(df, filter_column='column1')
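The two cleaning steps above correspond roughly to the following plain-pandas operations. This is a sketch under the assumption that "empty" means missing or an empty string; the package's exact rules may differ, and the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 1, 2, 3],
        "city": ["Berlin", "Berlin", None, ""],
    }
)

# Drop exact duplicate rows, then rows that contain any missing value.
cleaned = df.drop_duplicates().dropna()

# Keep only rows where one specific column is neither missing nor empty.
mask = df["city"].notna() & (df["city"] != "")
filtered = df[mask]
```

Note that `dropna()` does not treat an empty string as missing, which is why the column filter checks for it explicitly.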
- format_numeric_values(): handles locale-specific number formatting:

      from binaryrain_helper_data_processing import format_numeric_values

      # ... df is a pandas DataFrame

      # Convert European number format (1.234,56) to standard format (1,234.56)
      df = format_numeric_values(
          df,
          columns=['price', 'quantity'],
          swap_separators=True,
          old_decimal_separator=',',
          old_thousands_separator='.',
          decimal_separator='.',
          thousands_separator=',',
      )
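Swapping separators on a single string needs a temporary placeholder, because a direct replacement of one separator would collide with the other. The sketch below shows the idea with the standard library only; `swap_separators` here is a hypothetical standalone function, not the package's internal code.

```python
def swap_separators(
    text: str,
    old_decimal: str = ",",
    old_thousands: str = ".",
    decimal: str = ".",
    thousands: str = ",",
) -> str:
    """Convert e.g. '1.234,56' to '1,234.56' without the separators colliding."""
    placeholder = "\x00"  # a character that cannot appear in a formatted number
    return (
        text.replace(old_thousands, placeholder)
        .replace(old_decimal, decimal)
        .replace(placeholder, thousands)
    )


converted = swap_separators("1.234,56")
# Strip the thousands separator before parsing to a float.
as_float = float(converted.replace(",", ""))
```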
Benefits
- Consistent interface for different file formats
- Simplified error handling with clear messages
- Optional format-specific configurations
- Built on pandas for robust data processing
- Type hints for better IDE support
Project details
File details
Details for the file binaryrain_helper_data_processing-0.0.8.tar.gz.
File metadata
- Download URL: binaryrain_helper_data_processing-0.0.8.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b3cb5afeda32a040891d69d4be2202d5fa538b166c42b67deb7f91b517a1766b |
| MD5 | 7bb1b4d3c64c1d7cc5ea662293919df0 |
| BLAKE2b-256 | ee585c4b2da296ae14bcf6c82367dc4d32b2a3a1c186fa7d859beb755dbf037a |
File details
Details for the file binaryrain_helper_data_processing-0.0.8-py3-none-any.whl.
File metadata
- Download URL: binaryrain_helper_data_processing-0.0.8-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ac28d8e255b83778d6c59ff60c8c988c43a7e16e0f77ec5b11d6ae72808dbed1 |
| MD5 | 31a1934a47d980529c93f449effc28e7 |
| BLAKE2b-256 | d1b52b7a4de53eba8009b014b239f0d379fb2325075e39567831b98003cd7f8d |