A collection of handy tools for data handling, visualization, and reporting.

forgekit

forgekit is a Python 3 package designed to simplify data handling, visualization, and reporting. It provides functions for cleaning, transforming, and visualizing data, as well as utilities for common machine learning tasks.

Features

  • Display and summarize DataFrames
  • Export and load CSV files
  • Handle missing data, outliers, and invalid values
  • Data scaling and transformation (min-max scaling, standardization, log transformation)
  • Generate interactive and static plots (matplotlib and Plotly)
  • Machine learning utilities (K-Means clustering, feature importance calculation, train-test splitting)

Installation

You can install forgekit by cloning the repository and installing it locally.

  1. Clone the repository:

    git clone https://github.com/yourusername/forgekit.git
    
  2. Navigate to the package directory and install it:

    cd forgekit
    pip3 install .
    
  3. Alternatively, install only the dependencies (not the package itself) from requirements.txt:

    pip3 install -r requirements.txt
    

Usage

Here’s an example of how to use forgekit:

import pandas as pd
from forgekit import ForgeKit

# Sample data
data = {
    'Domain': ['example.com', 'example.net'],
    'Price': [10.0, 12.5]
}
df = pd.DataFrame(data)

# Display the DataFrame
ForgeKit.display_dataframe(df)

# Plot the DataFrame
ForgeKit.plot_dataframe(df, kind='bar', title="Domain Prices")

Available Functions

Data Display and Summarization:

  • display_dataframe(): Display a DataFrame with a row limit.
  • summary_stats(): Show summary statistics of a DataFrame.
  • custom_summary(): Show a custom summary of data types, missing values, and statistics.
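
As a rough illustration of what a per-column summary of data types, missing values, and statistics can look like, here is a minimal sketch in plain pandas. The helper name `summarize` is hypothetical and is not forgekit's API; the output of custom_summary() may differ.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per column with dtype, missing count, unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN/None values are not counted as unique
    })


df = pd.DataFrame({
    "Domain": ["example.com", "example.net", None],
    "Price": [10.0, 12.5, 12.5],
})

summary = summarize(df)
print(summary)

# Standard descriptive statistics for the numeric columns
print(df.describe())
```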

Data Cleaning:

  • impute_missing_data(): Handle missing values with strategies like mean, median, mode, or a constant value.
  • remove_outliers(): Remove outliers using the IQR method.
  • remove_duplicates(): Remove duplicate rows in the DataFrame.
  • clean_text_columns(): Clean text columns by stripping whitespace and converting to lowercase.
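
The cleaning steps listed above can be sketched with plain pandas. This is not forgekit's implementation, only an illustration of the techniques named (text normalization, median imputation, IQR filtering); the function name `remove_iqr_outliers` and the multiplier `k=1.5` are assumptions.

```python
import pandas as pd


def remove_iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    return df[df[column].between(q1 - k * iqr, q3 + k * iqr)]


df = pd.DataFrame({
    "Domain": [" Example.com", "example.net ", "example.org", "example.io", "example.dev"],
    "Price": [10.0, 12.5, None, 11.0, 500.0],
})

# Strip whitespace and lowercase the text column
df["Domain"] = df["Domain"].str.strip().str.lower()

# Fill the missing price with the median, which is less sensitive to the 500.0 outlier
df["Price"] = df["Price"].fillna(df["Price"].median())

# Drop exact duplicate rows, then filter outliers with the IQR rule
cleaned = remove_iqr_outliers(df.drop_duplicates(), "Price")
print(cleaned)
```

Imputing before filtering matters here: the median is computed from the observed values, so the extreme price does not distort the fill value the way a mean would.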

Data Transformation:

  • minmax_scale(): Scale numerical data between 0 and 1.
  • standard_scale(): Standardize numerical data to have zero mean and unit variance.
  • log_transform(): Apply log transformation to reduce skewness.
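
For reference, the three transformations correspond to the following formulas, sketched here with pandas and NumPy. This is an illustration, not forgekit's code; in particular, using `log1p` (log of 1 + x) and the population standard deviation (`ddof=0`) are assumptions about details the README does not specify.

```python
import numpy as np
import pandas as pd

prices = pd.Series([10.0, 12.5, 15.0, 20.0], name="Price")

# Min-max scaling: (x - min) / (max - min), mapping values into [0, 1]
minmax = (prices - prices.min()) / (prices.max() - prices.min())

# Standardization: (x - mean) / std, giving zero mean and unit variance
standardized = (prices - prices.mean()) / prices.std(ddof=0)

# Log transform: log(1 + x) compresses large values and stays defined at x = 0
logged = np.log1p(prices)

print(pd.DataFrame({"minmax": minmax, "standardized": standardized, "log1p": logged}))
```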

Data Visualization:

  • plot_dataframe(): Generate static plots using matplotlib.
  • interactive_plot(): Generate interactive plots using Plotly.

Machine Learning Tools:

  • kmeans_clustering(): Perform K-Means clustering on the DataFrame.
  • train_test_split_data(): Split the DataFrame into training and test sets.
  • feature_importance(): Calculate feature importance using a Random Forest classifier.
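
The following sketch shows the underlying techniques using scikit-learn directly. It assumes scikit-learn is installed and does not reflect forgekit's function signatures, which are not documented here; the column names and parameter values are made up for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): two features and a binary label
df = pd.DataFrame({
    "x1": [1.0, 1.2, 0.8, 8.0, 8.2, 7.9, 1.1, 8.1],
    "x2": [5.0, 4.0, 6.0, 5.5, 4.5, 5.0, 5.2, 4.8],
    "label": [0, 0, 0, 1, 1, 1, 0, 1],
})
features = df[["x1", "x2"]]

# K-Means clustering: assign each row to one of two clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = kmeans.fit_predict(features)

# Train-test split: hold out 25% of the rows, preserving the class balance
X_train, X_test, y_train, y_test = train_test_split(
    features, df["label"], test_size=0.25, random_state=0, stratify=df["label"]
)

# Feature importance from a Random Forest classifier fitted on the training set
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
importances = pd.Series(model.feature_importances_, index=features.columns)
print(importances.sort_values(ascending=False))
```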

License

This project is licensed under the MIT License; see the LICENSE file for details.

Contributing

Feel free to open issues or pull requests if you would like to contribute!
