A collection of handy tools for data handling, visualization, and reporting.
forgekit
forgekit is a Python 3 package designed to simplify data handling, visualization, and reporting. It provides functions for cleaning, transforming, and visualizing data, and for performing machine learning tasks.
Features
- Display and summarize DataFrames
- Export and load CSV files
- Handle missing data, outliers, and invalid values
- Data scaling and transformation (min-max scaling, standardization, log transformation)
- Generate interactive and static plots (matplotlib and Plotly)
- Machine learning utilities (K-Means clustering, feature importance calculation, train-test splitting)
Installation
You can install forgekit by cloning the repository and installing it locally.
- Clone the repository:
  git clone https://github.com/yourusername/forgekit.git
- Navigate to the package directory and install it:
  cd forgekit
  pip3 install .
- Alternatively, install the dependencies directly from requirements.txt:
  pip3 install -r requirements.txt
Usage
Here’s an example of how to use forgekit:
import pandas as pd
from forgekit import ForgeKit
# Sample data
data = {
    'Domain': ['example.com', 'example.net'],
    'Price': [10.0, 12.5]
}
df = pd.DataFrame(data)
# Display the DataFrame
ForgeKit.display_dataframe(df)
# Plot the DataFrame
ForgeKit.plot_dataframe(df, kind='bar', title="Domain Prices")
Available Functions
Data Display and Summarization:
- display_dataframe(): Display a DataFrame with a row limit.
- summary_stats(): Show summary statistics of a DataFrame.
- custom_summary(): Show a custom summary of data types, missing values, and statistics.
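The kind of overview described for custom_summary() can be approximated with plain pandas. The following is a minimal sketch, not forgekit's actual implementation; the helper name `sketch_summary` and its output columns are assumptions made for illustration.

```python
import pandas as pd


def sketch_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per column with dtype, missing count, and unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),   # data type of each column, as text
        "missing": df.isna().sum(),       # number of missing values per column
        "unique": df.nunique(),           # distinct non-missing values per column
    })


df = pd.DataFrame({
    "Domain": ["example.com", "example.net", None],
    "Price": [10.0, 12.5, 12.5],
})
summary = sketch_summary(df)
print(summary)
```

For numeric statistics such as mean and quartiles, pandas' own `df.describe()` is the usual starting point.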
Data Cleaning:
- impute_missing_data(): Handle missing values with strategies like mean, median, mode, or a constant value.
- remove_outliers(): Remove outliers using the IQR method.
- remove_duplicates(): Remove duplicate rows in the DataFrame.
- clean_text_columns(): Clean text columns by stripping whitespace and converting to lowercase.
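To make the cleaning steps concrete, here is a rough sketch of median imputation followed by IQR-based outlier removal in plain pandas. It is not forgekit's code: the helper `drop_iqr_outliers` and the conventional multiplier `k = 1.5` are assumptions, and forgekit's own functions may use different defaults.

```python
import pandas as pd


def drop_iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    mask = df[column].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


prices = pd.DataFrame({"Price": [10.0, 12.5, 11.0, None, 9.5, 250.0]})

# Fill the missing value with the column median, then drop outliers.
filled = prices.fillna({"Price": prices["Price"].median()})
cleaned = drop_iqr_outliers(filled, "Price")
print(cleaned)
```

Imputing first matters in this sketch because `between()` evaluates to False for missing values, so rows with gaps would otherwise be dropped along with the outliers.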
Data Transformation:
- minmax_scale(): Scale numerical data between 0 and 1.
- standard_scale(): Standardize numerical data to have zero mean and unit variance.
- log_transform(): Apply log transformation to reduce skewness.
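The three transformations listed above correspond to well-known formulas, sketched below with pandas and NumPy. This only illustrates the math; whether forgekit uses the population standard deviation (`ddof=0`) or `log1p` rather than a plain logarithm is an assumption here, not something this README states.

```python
import numpy as np
import pandas as pd

prices = pd.Series([10.0, 12.5, 20.0], name="Price")

# Min-max scaling: (x - min) / (max - min), mapping values into [0, 1].
minmax = (prices - prices.min()) / (prices.max() - prices.min())

# Standardization: (x - mean) / std, using the population std (ddof=0).
standard = (prices - prices.mean()) / prices.std(ddof=0)

# Log transformation: log1p computes log(1 + x), which is defined at x = 0.
logged = np.log1p(prices)

print(pd.DataFrame({"minmax": minmax, "standard": standard, "log1p": logged}))
```

Note that min-max scaling divides by zero when every value in a column is identical, so constant columns need separate handling.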
Data Visualization:
- plot_dataframe(): Generate static plots using matplotlib.
- interactive_plot(): Generate interactive plots using Plotly.
Machine Learning Tools:
- kmeans_clustering(): Perform K-Means clustering on the DataFrame.
- train_test_split_data(): Split the DataFrame into training and test sets.
- feature_importance(): Calculate feature importance using a Random Forest classifier.
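As an illustration of what a train-test split does, the sketch below splits a DataFrame using only pandas. How train_test_split_data() is actually implemented is not documented here, so the helper `sketch_split`, its parameters, and the 25% test fraction are all assumptions.

```python
import pandas as pd


def sketch_split(df: pd.DataFrame, test_size: float = 0.25, seed: int = 0):
    """Hypothetical helper: shuffle-based split into (train, test) DataFrames.

    Assumes the DataFrame index is unique, since rows are removed by index label.
    """
    test = df.sample(frac=test_size, random_state=seed)  # random subset for testing
    train = df.drop(test.index)                          # everything else for training
    return train, test


df = pd.DataFrame({"x": range(8), "y": [0, 1] * 4})
train, test = sketch_split(df, test_size=0.25, seed=42)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which helps when comparing models trained on the same data.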
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Feel free to open issues or pull requests if you would like to contribute!
File details
Details for the file forgekit-1.0.1.tar.gz.
File metadata
- Download URL: forgekit-1.0.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9da77ba9da8e55806e80804db0c72b02eeefcbfdb4ce312e438d908ebad56a87 |
| MD5 | 0e098d92b455ec618757745b8315c8da |
| BLAKE2b-256 | d11503a39f5576d6ed5ebece00fc95b16d4142fbe9b3fc17e9cb3c9b80fa5622 |
File details
Details for the file forgekit-1.0.1-py3-none-any.whl.
File metadata
- Download URL: forgekit-1.0.1-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7ff77aa74cb1ae7a215d1d8762c1465a3c1d239c7ac979b06d8ae374f07812fa |
| MD5 | 8805d5298da95c97f842097599d7c1fe |
| BLAKE2b-256 | d2a75d5e4068f09001e919e78b4a2a85d39ac23ed3d3a412755935716176f4d7 |