A library for converting CSV files to other formats (such as DataFrames or other CSV/Excel files) based on a JSON mapping.
Project Title
DataForgeToolkit is a Python library for mapping CSV or Excel files based on JSON transformation mappings.
Description
DataForgeToolkit: Flexible Data Mapping for CSV/XLSX Files
The DataForgeToolkit is a Python library that converts CSV or Excel files into customized DataFrames based on user-defined JSON mapping configurations. It applies to any structured tabular data, such as financial reports or customer datasets.
Features
Versatile File Support: Seamlessly process both CSV and Excel files, providing flexibility in handling various data formats commonly encountered in data analysis tasks.
Customizable Mapping: Define transformation mappings using a JSON file, allowing for precise specification of column names, data cleaning, and value substitutions tailored to your specific data requirements.
Efficient Data Processing: Automate data preprocessing tasks such as handling missing values, standardizing column names, and applying complex value mappings with ease.
Installation
pip install dataforgetoolkit
Usage/Examples
Define Transformation Mapping:
Create a JSON file specifying the transformation mappings for your data. Define column mappings, specify new column names, and define value substitutions as needed.
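A mapping file can be written by hand or generated from a script. The following is a minimal sketch using Python's standard `json` module; the helper `write_mapping` and the file name `mapping.json` are illustrative assumptions, while the keys are copied from the example in the JSON Transformation Mapping section.

```python
import json
import os
import tempfile

# Keys and values are taken from this README's example mapping.
mapping = {
    "transformation_mapping": [
        {
            "column": "Name",
            "new_name": "Student Name",
            "value_mappings": [{"*": "Amit Singh"}],
        },
        {
            "column": "Age_Column",
            "new_name": "Age",
            "value_mappings": [{"FILTER": "30"}],
        },
    ]
}


def write_mapping(mapping, path):
    """Write the mapping dictionary to `path` as indented JSON (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=2)
    return path


# Hypothetical location; any writable path works.
mapping_path = write_mapping(mapping, os.path.join(tempfile.gettempdir(), "mapping.json"))
```

The resulting path is what would be passed as the second argument to the toolkit.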
Use the Toolkit:
Import the DataForgeToolkit in your Python script and use the map function to convert your report files:
from dataforgetoolkit import datamapper
# First argument: path to the report file (CSV or XLSX); second argument: path to the JSON mapping file
datamapper.map('report file path csv / xlsx format', 'mapping json file path')
Access Mapped Data:
Access the transformed data as a DataFrame for further analysis or export to other formats.
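To try the steps end to end, a small input file helps. The sketch below creates a two-row CSV report with the standard `csv` module and wraps the `datamapper.map` call in a function. The helper names `write_sample_report` and `run_mapping`, the sample rows, and the file name are illustrative assumptions. It is also an assumption, based on the description above, that `datamapper.map` returns the mapped data as a pandas DataFrame.

```python
import csv
import os
import tempfile


def write_sample_report(path):
    """Create a tiny CSV report with the columns used in this README's mapping example."""
    rows = [
        {"Name": "A", "Age_Column": "30"},
        {"Name": "B", "Age_Column": "41"},
    ]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Name", "Age_Column"])
        writer.writeheader()
        writer.writerows(rows)
    return path


def run_mapping(report_path, mapping_path):
    """Call the toolkit as in the usage snippet; imported lazily so this sketch loads without it."""
    from dataforgetoolkit import datamapper

    # Assumption: the call returns the mapped data as a pandas DataFrame.
    return datamapper.map(report_path, mapping_path)


# Hypothetical location; any writable path works.
report_path = write_sample_report(os.path.join(tempfile.gettempdir(), "report.csv"))
```

With the package installed and a mapping file in place, `run_mapping(report_path, 'mapping.json')` would, under that assumption, give a DataFrame that can be exported with the standard pandas `to_csv` or `to_excel` methods.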
JSON Transformation Mapping
Transformation mappings are specified using a JSON file. Example:
{
  "transformation_mapping": [
    {
      "column": "Name",
      "new_name": "Student Name",
      "value_mappings": [
        {
          "*": "Amit Singh"
        }
      ]
    },
    {
      "column": "Age_Column",
      "new_name": "Age",
      "value_mappings": [
        {
          "FILTER": "30"
        }
      ]
    }
  ]
}
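This README does not spell out how the `"*"` and `"FILTER"` keys are interpreted. The sketch below is not the library's implementation; it shows one plausible reading in plain Python. It assumes that `"*"` replaces every value in the column, that `"FILTER"` keeps only rows whose value equals the given string, and that any other key is a plain value substitution. `apply_mapping` is a hypothetical helper and the sample rows are made up.

```python
def apply_mapping(rows, mapping):
    """Apply a transformation mapping to a list of row dictionaries.

    Assumed semantics (not confirmed by the library):
      "*"       -> replace every value in the column with the given value
      "FILTER"  -> keep only rows whose value equals the given string
      other key -> replace values equal to the key with the given value
    """
    result = [dict(row) for row in rows]  # copy so the input is not modified
    for entry in mapping["transformation_mapping"]:
        column = entry["column"]
        new_name = entry.get("new_name", column)
        for rule in entry.get("value_mappings", []):
            for key, value in rule.items():
                if key == "FILTER":
                    result = [r for r in result if str(r[column]) == value]
                elif key == "*":
                    for r in result:
                        r[column] = value
                else:
                    for r in result:
                        if str(r[column]) == key:
                            r[column] = value
        # Rename last so the rules above can address the original column name.
        for r in result:
            r[new_name] = r.pop(column)
    return result


example_mapping = {
    "transformation_mapping": [
        {"column": "Name", "new_name": "Student Name",
         "value_mappings": [{"*": "Amit Singh"}]},
        {"column": "Age_Column", "new_name": "Age",
         "value_mappings": [{"FILTER": "30"}]},
    ]
}
sample_rows = [
    {"Name": "Priya", "Age_Column": "30"},
    {"Name": "Rahul", "Age_Column": "25"},
]
mapped_rows = apply_mapping(sample_rows, example_mapping)
```

Under these assumptions, `mapped_rows` contains a single row with the renamed columns `Student Name` and `Age`, because only the first sample row passes the filter.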
Authors
-
Software Engineer
Contributing
Contributions are always welcome!
Please adhere to this project's code of conduct.
Suggest code changes by opening a PR/MR.
Used By
Intended Audience: Developers, Testers, BA
Hashes for dataforgetoolkit-1.0.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ad69a9b897167e3e0c3f991ed377444d05d9870ed3aee876ca57aec16312d128
MD5 | 9b22f2b5c568cdb665819760973ba0ed
BLAKE2b-256 | 9a51fbe07d73ad8a0fd29ff3e6aea93bef1152bfc3cbbf86b10f6f8f92124af9