A collection of handy tools for data handling, visualization, and reporting.

These details have not been verified by PyPI

Project description

ForgeKit

ForgeKit is a Python package that wraps common data manipulation, visualization, and basic machine learning tasks in helper functions that take a pandas DataFrame. Cleaning, transforming, and visualizing data each come down to a single function call, so routine steps need less boilerplate.

Features

Data Display & Summarization: Easily display and summarize DataFrames.
Data Cleaning & Transformation: Handle missing data, remove outliers, scale data, and more.
Visualization: Create static and interactive plots using Matplotlib and Plotly.
Machine Learning Utilities: Perform K-Means clustering, evaluate feature importance, and split data into training and test sets.
Reporting: Generate markdown reports from DataFrames.

Installation

Install the package using pip:

pip install forgekit

For detailed setup instructions, see INSTALL.md.

Usage

Here's a simple example of ForgeKit in action. For a more comprehensive step-by-step workflow, refer to the examples.py file.

import pandas as pd
import forgekit as fk

# Sample Data
data = {
    'A': [5, 8, 10, 15, 20],
    'B': [10, 20, 15, 10, 5],
    'C': ['red', 'blue', 'green', 'red', 'blue']
}
df = pd.DataFrame(data)

# Display DataFrame
fk.display_dataframe(df)

# Plot DataFrame
fk.plot_dataframe(df[['A', 'B']], kind='line', title="Sample Data Plot")

# Perform K-Means Clustering
df_clustered = fk.kmeans_clustering(df[['A', 'B']], n_clusters=2)
fk.display_dataframe(df_clustered)

# Generate a Markdown Report
fk.generate_report(df, file_name='report.md')

For more usage examples, see the examples.py file in the root directory, which includes:

Data normalization and plotting
Handling missing values
K-Means clustering and more

Key Functions in forgekit

Here is an overview of the most commonly used functions in ForgeKit:

Data Display & Summarization

display_dataframe(dataframe, max_rows=10): Display a DataFrame with customizable row limits.
summary_stats(dataframe): Generate descriptive statistics for numerical columns.
custom_summary(dataframe): Display data types, missing values, and descriptive statistics in one output.
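To make the idea of a combined summary concrete, here is a minimal pandas-only sketch. The helper name `summarize_frame` is hypothetical and this is not ForgeKit's implementation; it only illustrates gathering data types, missing-value counts, and descriptive statistics in one place.

```python
import pandas as pd


def summarize_frame(df: pd.DataFrame) -> dict:
    """Hypothetical helper: collect dtypes, missing counts, and numeric stats."""
    return {
        "dtypes": df.dtypes.astype(str).to_dict(),   # column name -> dtype name
        "missing": df.isna().sum().to_dict(),        # column name -> count of NaN/None
        "stats": df.describe(),                      # numeric columns only by default
    }


df = pd.DataFrame({
    "A": [5, 8, None, 15, 20],
    "C": ["red", "blue", "green", "red", "blue"],
})
summary = summarize_frame(df)
print(summary["dtypes"])
print(summary["missing"])
print(summary["stats"])
```

Note that `DataFrame.describe()` skips non-numeric columns unless `include="all"` is passed, so column `C` appears in the dtype and missing-value sections but not in the statistics.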

Data Cleaning

impute_missing_data(dataframe, strategy='mean'): Impute missing values using strategies like mean, median, or mode.
remove_outliers(dataframe, columns): Remove outliers from specified numerical columns using the IQR method.
remove_duplicates(dataframe): Remove duplicate rows from a DataFrame.
clean_text_columns(dataframe, columns): Clean text columns by removing whitespace and standardizing case.
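The two less obvious cleaning steps, mean imputation and IQR-based outlier removal, can be sketched in plain pandas as follows. The function names `impute_mean` and `drop_iqr_outliers` and the multiplier `k=1.5` are assumptions made for illustration; ForgeKit's own functions may differ in details such as the fence multiplier or how non-numeric columns are treated.

```python
import pandas as pd


def impute_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values in numeric columns with each column's mean."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    out[numeric] = out[numeric].fillna(out[numeric].mean())
    return out


def drop_iqr_outliers(df: pd.DataFrame, columns: list[str], k: float = 1.5) -> pd.DataFrame:
    """Keep rows inside [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile(0.25), df[col].quantile(0.75)
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


raw = pd.DataFrame({"A": [5.0, 8.0, None, 15.0, 200.0], "B": [10, 20, 15, 10, 5]})
filled = impute_mean(raw)                   # the missing A value becomes the column mean
cleaned = drop_iqr_outliers(filled, ["A"])  # the row with A = 200 falls outside the fences
print(cleaned)
```

In this toy data the mean is pulled upward by the extreme value 200, which is one reason a median strategy is often preferred when outliers are expected.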

Data Transformation

minmax_scale(dataframe): Scale numerical data between 0 and 1.
standard_scale(dataframe): Standardize numerical data to have a mean of 0 and unit variance.
log_transform(dataframe, columns): Apply log transformations to reduce skewness in the data.
one_hot_encode(dataframe, columns): Perform one-hot encoding on categorical columns.
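The four transformations above correspond to short, well-known formulas. The sketch below writes them out directly with pandas and NumPy so the arithmetic is visible. It is an illustration under assumptions, not ForgeKit's code: whether ForgeKit uses `log` or `log1p`, or the population or sample standard deviation, is not stated in this description.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [5, 8, 10, 15, 20],
    "C": ["red", "blue", "green", "red", "blue"],
})

# Min-max scaling: (x - min) / (max - min) maps the column onto [0, 1].
a_minmax = (df["A"] - df["A"].min()) / (df["A"].max() - df["A"].min())

# Standardization: (x - mean) / std; ddof=0 selects the population standard deviation.
a_standard = (df["A"] - df["A"].mean()) / df["A"].std(ddof=0)

# Log transform: log1p computes log(1 + x), which stays defined at x = 0.
a_log = np.log1p(df["A"])

# One-hot encoding: one indicator column per distinct category value.
encoded = pd.get_dummies(df, columns=["C"])

print(a_minmax.tolist())
print(encoded.columns.tolist())
```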

Data Visualization

plot_dataframe(dataframe, kind='line', title): Generate static plots (e.g., line, bar, scatter) using Matplotlib.
interactive_plot(dataframe, x_col, y_col, kind='scatter', title): Create interactive plots using Plotly.
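For orientation, a static line plot of a DataFrame can be produced with pandas and Matplotlib alone, roughly as below. This is a sketch of the underlying technique, not ForgeKit's implementation; the output file name and the axis labels are arbitrary choices for the example.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"A": [5, 8, 10, 15, 20], "B": [10, 20, 15, 10, 5]})

# DataFrame.plot draws one line per column and returns the Matplotlib Axes.
ax = df.plot(kind="line", title="Sample Data Plot")
ax.set_xlabel("row index")
ax.set_ylabel("value")

# Write the figure to a PNG in the system temp directory (location chosen for the example).
out_path = os.path.join(tempfile.gettempdir(), "sample_plot.png")
ax.figure.savefig(out_path)
plt.close(ax.figure)
```

Changing `kind` to `"bar"` or passing `kind="scatter"` with `x=` and `y=` column names covers the other static plot types mentioned above.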

Machine Learning Utilities

kmeans_clustering(dataframe, n_clusters=3): Perform K-Means clustering and add cluster labels to the DataFrame.
train_test_split_data(dataframe, target_column, test_size=0.2): Split data into training and test sets.
feature_importance(dataframe, target_column): Calculate feature importance using a RandomForest classifier.
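To show what "perform K-Means clustering and add cluster labels" and "split into training and test sets" amount to, here is a small self-contained sketch using NumPy and pandas. The function `kmeans_labels` is hypothetical and uses a deliberately simple initialization (the first `n_clusters` rows) so the result is reproducible. A real implementation would more likely delegate to scikit-learn, and nothing here is confirmed to match ForgeKit's internals.

```python
import numpy as np
import pandas as pd


def kmeans_labels(points: np.ndarray, n_clusters: int, n_iter: int = 100) -> np.ndarray:
    """Lloyd's algorithm, seeded with the first n_clusters rows as centroids."""
    centroids = points[:n_clusters].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        updated = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels


df = pd.DataFrame({"A": [0, 1, 10, 11], "B": [0, 0, 10, 10]})

# Attach the cluster label of each row as a new column on a copy.
clustered = df.copy()
clustered["cluster"] = kmeans_labels(df[["A", "B"]].to_numpy(), n_clusters=2)

# Train/test split: sample 25% of the rows for testing, keep the rest for training.
test = clustered.sample(frac=0.25, random_state=0)
train = clustered.drop(test.index)

print(clustered)
print(len(train), len(test))
```

Because K-Means relies on Euclidean distance, columns on very different scales are usually scaled first (for example with min-max scaling or standardization) so that no single column dominates the distance.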

Reporting & Export

generate_report(dataframe, file_name='report.md'): Generate a markdown report of the DataFrame with descriptive statistics.
export_csv(dataframe, file_name): Export a DataFrame to a CSV file.
load_csv(file_path): Load a CSV file into a DataFrame.
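A markdown report of descriptive statistics can be assembled with nothing beyond pandas and string formatting, as sketched below. The names `frame_to_markdown` and `write_report`, the report headings, and the two-decimal formatting are assumptions for illustration; the layout of the file ForgeKit actually writes is not documented here.

```python
import tempfile
from pathlib import Path

import pandas as pd


def frame_to_markdown(df: pd.DataFrame) -> str:
    """Render a numeric DataFrame as a pipe-delimited Markdown table, index included."""
    header = "| " + " | ".join([""] + [str(c) for c in df.columns]) + " |"
    divider = "|" + " --- |" * (len(df.columns) + 1)
    rows = [
        "| " + " | ".join([str(idx)] + [f"{v:.2f}" for v in row]) + " |"
        for idx, row in zip(df.index, df.to_numpy())
    ]
    return "\n".join([header, divider, *rows])


def write_report(df: pd.DataFrame, file_name: Path) -> str:
    """Write a short Markdown report with descriptive statistics and return its text."""
    text = (
        "# Data Report\n\n"
        "## Descriptive statistics\n\n"
        + frame_to_markdown(df.describe())
        + "\n"
    )
    file_name.write_text(text, encoding="utf-8")
    return text


df = pd.DataFrame({"A": [5, 8, 10, 15, 20], "B": [10, 20, 15, 10, 5]})
report_path = Path(tempfile.gettempdir()) / "report.md"
report = write_report(df, report_path)
print(report)
```

The manual table builder avoids `DataFrame.to_markdown`, which needs the optional `tabulate` dependency.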

Example Workflow

Here’s a brief example showing how you might use ForgeKit in a typical data analysis workflow. For the full script, check out the examples.py file.

import pandas as pd
import forgekit as fk

# Step 1: Build a sample DataFrame (fk.load_csv(file_path) would load one from a CSV file)
df = pd.DataFrame({
    'A': [5, 8, 10, 15, 20],
    'B': [10, 20, 15, 10, 5],
    'C': ['red', 'blue', 'green', 'red', 'blue']
})

# Step 2: Display the DataFrame
fk.display_dataframe(df)

# Step 3: Summary statistics
fk.summary_stats(df)

# Step 4: Normalize numerical columns
df_normalized = fk.minmax_scale(df[['A', 'B']])

# Step 5: K-Means Clustering
df_clustered = fk.kmeans_clustering(df[['A', 'B']], n_clusters=2)

# Step 6: Plot the normalized data
fk.plot_dataframe(df_normalized, kind='line', title="Normalized Data")

# Step 7: Generate a markdown report
fk.generate_report(df, file_name='report.md')

This workflow shows how a few ForgeKit calls cover displaying, summarizing, scaling, clustering, plotting, and reporting on a DataFrame.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

We welcome contributions! Please feel free to open issues or pull requests if you'd like to add features, fix bugs, or improve documentation.

Release history

1.0.6: Sep 27, 2024
1.0.5: Sep 27, 2024
1.0.4: Sep 27, 2024
1.0.3: Sep 27, 2024 (this version)
1.0.2: Sep 27, 2024
1.0.1: Sep 27, 2024
1.0.0: Sep 26, 2024

Download files

Source Distribution

forgekit-1.0.3.tar.gz (7.4 kB)

Uploaded Sep 27, 2024 Source

Built Distribution

forgekit-1.0.3-py3-none-any.whl (7.7 kB)

Uploaded Sep 27, 2024 Python 3

File details

Details for the file forgekit-1.0.3.tar.gz.

File metadata

Download URL: forgekit-1.0.3.tar.gz
Upload date: Sep 27, 2024
Size: 7.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for forgekit-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`05ac78596ce70cb84bc91e23381976a4f5fbcdc8049b5279f2b3f8f67a3b68b2`
MD5	`c516f73701381ccf848706e93054ed33`
BLAKE2b-256	`5b04a86c69381a9451a5dd8a59bb4f6d3f167d4747a9d65e1fd392e8d66dbb7f`

File details

Details for the file forgekit-1.0.3-py3-none-any.whl.

File metadata

Download URL: forgekit-1.0.3-py3-none-any.whl
Upload date: Sep 27, 2024
Size: 7.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for forgekit-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e4fa9647cf1c0c4b19d6360f36eac6ce44518b1bb9547d47f01e0ded4c449bcd`
MD5	`2879cfb427f0b9f13cd5ca40ecfb7328`
BLAKE2b-256	`1b34e568f34cbffefeae78cb90cc6ee9ade0951e7a16128434adfc7ac1099873`
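The published digests can be used to check a downloaded file before installing it. The sketch below computes a file's SHA256 with the standard library and compares it with the values listed above. It assumes the files were downloaded into the current directory under their original names; the helper names `sha256_of` and `matches_published` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the tables above.
PUBLISHED_SHA256 = {
    "forgekit-1.0.3.tar.gz":
        "05ac78596ce70cb84bc91e23381976a4f5fbcdc8049b5279f2b3f8f67a3b68b2",
    "forgekit-1.0.3-py3-none-any.whl":
        "e4fa9647cf1c0c4b19d6360f36eac6ce44518b1bb9547d47f01e0ded4c449bcd",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.lower()


for name, expected in PUBLISHED_SHA256.items():
    candidate = Path(name)
    if candidate.exists():  # only check files that were actually downloaded
        status = "OK" if matches_published(candidate, expected) else "MISMATCH"
        print(f"{name}: {status}")
    else:
        print(f"{name}: not found in the current directory")
```

pip can enforce the same check at install time when a requirements file pins the hash and `--require-hashes` is used.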
