An extensible toolkit for Goal Oriented Analysis of Data

Project description

GOAD🐐 is the GOAT - Goal Oriented Analysis of Data

GOAD🐐 - When your data analysis is so fire🔥 it's got rizz✨

GOAD🐐 is a flexible Python package for analyzing, transforming, and visualizing data with an emphasis on statistical distribution fitting and modular visualization components.

📊 Features

Composable & extendable plotting system - Build complex visualizations by combining simple components. You can extend the existing components with your own.
Statistical distribution fitting - Automatically fit and compare distributions to your data. The distribution registry is extendable with additional distributions.
Extendable data transformation pipelines - Chain and reuse data transformations into pipelines. Again, extendable with custom transformation components.

Before GOAD🐐: mid data
After GOAD🐐: data got infinity aura

🚀 Quick Start

Installation

Using uv:

uv add goad-toolkit

Or, if you prefer your dependencies to be installed 100x slower, with pip:

pip install goad-toolkit

📋 Demo: Linear Model Analysis

GOAD🐐 includes a comprehensive demo that shows how to use its components together.

Main capabilities

In the demo/linear.py file you can see a showcase of the main capabilities of GOAD🐐:

Create a data processing pipeline.
Components are extendable, so you can easily add your own steps to a pipeline.
Create visualisations by stacking components; BasePlot handles the boilerplate.
The DistributionFitter tries to fit a few common distributions and adds statistical tests for you.
The results work together with the visualizer.PlotFits class to show the fits.

The main strength of this module is not that these elements exist (even though they are very useful). Its superpower is that everything is extendable: you can use it as a starting point and extend it with your own visualisations and analytics.

POV: Your data just got GOADed🐐 and now it's giving main character energy

📚 Core Components

🔄 Extendable Data Transforms

GOAD🐐 provides a pipeline approach to transform your data:

from goad.datatransforms import Pipeline, ShiftValues, ZScaler

# Create a pipeline
pipeline = Pipeline()

# Add transformations
pipeline.add(ShiftValues, name="shift_deaths", column="deaths", period=-14)
pipeline.add(ZScaler, name="scale_tests", column="positivetests", rename=True)

# Apply all transformations
result = pipeline.apply(data)
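How a call like pipeline.add(ShiftValues, name=..., column=...) can register a step without running it is easiest to see in a stripped-down sketch. The following is not GOAD's implementation: MiniPipeline, AddConstant, and Double are hypothetical stand-ins, and plain dicts of lists take the place of DataFrames so that the example runs with the standard library alone.

```python
from typing import Any


class AddConstant:
    """Hypothetical transform: add a constant to every value in a column."""

    def transform(self, data: dict, column: str, value: float) -> dict:
        data[column] = [x + value for x in data[column]]
        return data


class Double:
    """Hypothetical transform: multiply every value in a column by two."""

    def transform(self, data: dict, column: str) -> dict:
        data[column] = [2 * x for x in data[column]]
        return data


class MiniPipeline:
    """Stand-in pipeline: remembers steps, runs them in order on apply()."""

    def __init__(self) -> None:
        self.steps: list[tuple[str, type, dict[str, Any]]] = []

    def add(self, transform_cls: type, name: str, **kwargs: Any) -> None:
        # Store the class and its keyword arguments; nothing runs yet.
        self.steps.append((name, transform_cls, kwargs))

    def apply(self, data: dict) -> dict:
        # Work on a copy so the caller's data is left untouched.
        result = {key: list(values) for key, values in data.items()}
        for _name, transform_cls, kwargs in self.steps:
            result = transform_cls().transform(result, **kwargs)
        return result


pipeline = MiniPipeline()
pipeline.add(AddConstant, name="add_one", column="deaths", value=1)
pipeline.add(Double, name="double_deaths", column="deaths")

raw = {"deaths": [1, 2, 3]}
result = pipeline.apply(raw)
print(result)  # {'deaths': [4, 6, 8]}
```

Storing the class together with its keyword arguments, rather than the result of calling it, is what makes the same pipeline reusable on different datasets.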

Available transforms include:

ShiftValues - Shift values in a column by a specified period
DiffValues - Calculate the difference between consecutive values
SelectDataRange - Select rows within a specified date range
RollingAvg - Calculate rolling average of a column
ZScaler - Standardize values in a column

You can extend the pipeline with your own transformations by subclassing TransformBase. The ZScaler is implemented as follows:

class ZScaler(TransformBase):
    """Standardize the values in a column."""
    def transform(
        self, data: pd.DataFrame, column: str, rename: bool = False
    ) -> pd.DataFrame:
        """Standardize the values in a column."""
        if rename:
            colname = f"{column}_zscore"
        else:
            colname = column
        data[colname] = (data[column] - data[column].mean()) / data[column].std()
        return data
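The formula in transform is the usual z-score, (x - mean) / std. A small worked check with the standard library shows what it produces. The sample values are made up, and statistics.stdev is used because it divides by n - 1, the same default that pandas uses for .std().

```python
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample

mean = statistics.mean(values)
std = statistics.stdev(values)  # sample standard deviation (n - 1)

# Same arithmetic as ZScaler.transform, applied to a plain list.
zscores = [(x - mean) / std for x in values]

print([round(z, 2) for z in zscores])
print(statistics.mean(zscores), statistics.stdev(zscores))
```

After standardizing, the column has mean 0 and sample standard deviation 1 (up to floating-point error), which is why two series with different units can share one axis in a comparison plot.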

📊 Visualization System

The GOAD🐐 visualization system is built on a composable architecture that lets you build complex plots by combining simpler components:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

from goad.visualizer import BasePlot, PlotSettings

# Create plot settings
plotsettings = PlotSettings(
    xlabel="date",
    ylabel="normalized values",
    title="Z-Scores of Deaths and Positive Tests",
)

class LinePlot(BasePlot):
    """Plot a line plot using seaborn."""
    def build(self, data: pd.DataFrame, **kwargs):
        sns.lineplot(data=data, ax=self.ax, **kwargs)
        return self.fig, self.ax


class ComparePlot(BasePlot):
    def build(self, data: pd.DataFrame, x: str, y1: str, y2: str, **kwargs):
        compare = LinePlot(self.settings)
        self.plot_on(compare, data=data, x=x, y=y1, label=y1, **kwargs)
        self.plot_on(compare, data=data, x=x, y=y2, label=y2, **kwargs)
        plt.xticks(rotation=45)

        return self.fig, self.ax

compareplot = ComparePlot(plotsettings)
compareplot.plot(
    data=data, x="date", y1="deaths_shifted_zscore", y2="positivetests_zscore"
)

This extendable strategy lets BasePlot handle the boilerplate, while you can focus on creating the visualizations you need. It is also easier to reuse components in different contexts.
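The division of labour between plot, build, and plot_on can be illustrated without any plotting library. The sketch below is an assumption about the general pattern, not GOAD's code: MiniBasePlot, MiniLine, and MiniCompare are hypothetical, and a plain list that records draw calls stands in for a matplotlib axis.

```python
class MiniBasePlot:
    """Stand-in base class: owns the shared setup; subclasses implement build()."""

    def __init__(self, settings: dict) -> None:
        self.settings = settings
        self.ax = None  # shared drawing surface; a plain list in this sketch

    def plot(self, **kwargs) -> list:
        # Boilerplate lives here: create the surface, draw, then apply the title.
        self.ax = []
        self.build(**kwargs)
        self.ax.append(("title", self.settings["title"]))
        return self.ax

    def plot_on(self, component: "MiniBasePlot", **kwargs) -> None:
        # Let another component draw on this plot's surface instead of its own.
        component.ax = self.ax
        component.build(**kwargs)

    def build(self, **kwargs) -> None:
        raise NotImplementedError


class MiniLine(MiniBasePlot):
    def build(self, data: dict, y: str) -> None:
        self.ax.append(("line", y, data[y]))


class MiniCompare(MiniBasePlot):
    def build(self, data: dict, y1: str, y2: str) -> None:
        line = MiniLine(self.settings)
        self.plot_on(line, data=data, y=y1)
        self.plot_on(line, data=data, y=y2)


data = {"a": [1, 2], "b": [3, 4]}
drawn = MiniCompare({"title": "a vs b"}).plot(data=data, y1="a", y2="b")
print(drawn)
```

Because the composite only calls plot_on, the same MiniLine component works both on its own and inside a larger figure.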

📈 Distribution Fitting

GOAD🐐 includes tools for fitting statistical distributions to your data:

from goad.analytics import DistributionFitter
from goad.visualizer import PlotSettings, FitPlotSettings, PlotFits

fitter = DistributionFitter()
fits = fitter.fit(data["residual"], discrete=False) # we have to decide if the data is discrete or not
best = fitter.best(fits)
settings = PlotSettings(
    figsize=(12, 6), title="Residuals", xlabel="error", ylabel="probability"
)
fitplotsettings = FitPlotSettings(bins=30, max_fits=3)
fitplotter = PlotFits(settings)
fig = fitplotter.plot(
    data=data["residual"], fit_results=fits, fitplotsettings=fitplotsettings
)

For the Kolmogorov-Smirnov test (kstest), the null hypothesis is that the data follow the fitted distribution. In this example, the p-values are below 0.05, so we reject the null hypothesis for every candidate and conclude that the data do not follow any of these distributions.

The plots are sorted by log-likelihood, so even the best-ranked candidates are rejected: there is no good fit with a distribution in this case.
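To make the two quantities concrete, here is a standard-library sketch of a one-sample Kolmogorov-Smirnov check and a log-likelihood comparison. It does not use GOAD or SciPy. The data are synthetic, the helper names are hypothetical, and the p-value comes from the asymptotic Kolmogorov series, which is an approximation suited to large samples.

```python
import math
import random
import statistics


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample: list[float], cdf) -> float:
    """Largest gap between the empirical CDF and the candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


def ks_pvalue(d: float, n: int, terms: int = 100) -> float:
    """Asymptotic Kolmogorov p-value (approximation for large n)."""
    t = math.sqrt(n) * d
    if t < 0.04:
        return 1.0  # p is effectively 1 here; the truncated series is unreliable
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t) for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


random.seed(0)
skewed = [random.expovariate(1.0) for _ in range(500)]  # synthetic, non-normal

mu = statistics.mean(skewed)
sigma = statistics.pstdev(skewed)  # maximum-likelihood estimate of sigma

d = ks_statistic(skewed, lambda x: normal_cdf(x, mu, sigma))
p = ks_pvalue(d, len(skewed))

# Log-likelihoods of two candidates, used only to rank them.
loglik_normal = sum(
    -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
    for x in skewed
)
loglik_expon = sum(-math.log(mu) - x / mu for x in skewed)  # rate = 1 / mean

print(f"KS D={d:.3f}, p={p:.2g}")
print(f"log-likelihood: normal={loglik_normal:.1f}, exponential={loglik_expon:.1f}")
```

Log-likelihood only ranks candidates relative to each other, while the KS p-value says whether a given candidate is plausible at all, so a top-ranked fit can still be rejected. Because mu and sigma are estimated from the same sample, the p-value errs on the large side, which makes a rejection here conservative.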

🧩 Extending with Custom Distributions

You can easily register new distributions:

from goad.distributions import DistributionRegistry
from scipy import stats

# Create registry
registry = DistributionRegistry()

# Register a new distribution
registry.register_distribution(
    name="negative_binomial",
    dist=stats.nbinom,
    is_discrete=True,
    num_params=2
)

# Now it will be used automatically in the DistributionFitter for discrete fits
from goad.analytics import DistributionFitter
fitter = DistributionFitter()
print(fitter.registry) # shows all registered distributions
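A registry of this kind can be pictured as a dictionary of small records that the fitter filters by kind. The sketch below is a guess at the general shape, not GOAD's implementation: DistEntry, MiniRegistry, and candidates are hypothetical names, and only the four keyword arguments are taken from the register_distribution call above.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import Any


@dataclass(frozen=True)
class DistEntry:
    """Hypothetical record for one registered distribution."""

    name: str
    dist: Any
    is_discrete: bool
    num_params: int


class MiniRegistry:
    """Stand-in registry: a dict keyed by name, filterable by kind."""

    def __init__(self) -> None:
        self._entries: dict[str, DistEntry] = {}

    def register_distribution(
        self, name: str, dist: Any, is_discrete: bool, num_params: int
    ) -> None:
        if name in self._entries:
            raise ValueError(f"distribution {name!r} is already registered")
        self._entries[name] = DistEntry(name, dist, is_discrete, num_params)

    def candidates(self, discrete: bool) -> list[str]:
        # A fitter would only try distributions of the requested kind.
        return [e.name for e in self._entries.values() if e.is_discrete == discrete]


mini_registry = MiniRegistry()
mini_registry.register_distribution(
    "normal", NormalDist, is_discrete=False, num_params=2
)
mini_registry.register_distribution(
    "coin_flip", object(), is_discrete=True, num_params=1  # placeholder object
)

print(mini_registry.candidates(discrete=False))  # ['normal']
print(mini_registry.candidates(discrete=True))   # ['coin_flip']
```

Keeping is_discrete on the record is what lets a call such as fit(..., discrete=False) skip distributions that cannot apply to the data.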

🔧 Advanced Usage: Composing Plots

GOAD🐐 has a powerful plotting system that allows you to combine plot elements:

import pandas as pd

from goad.visualizer import BasePlot, LinePlot, BarWithDates, VerticalDate

# Use a base plot to create a composite
class MyCompositePlot(BasePlot):
    def build(self, data: pd.DataFrame, x: str, y1: str, y2: str, special_date: str):
        # Plot the first component - a line plot
        line_plot = LinePlot(self.settings)
        self.plot_on(line_plot, data=data, x=x, y=y1, label=y1)

        # Plot the second component - a bar chart
        bar_plot = BarWithDates(self.settings)
        self.plot_on(bar_plot, data=data, x=x, y=y2)

        # Add a vertical line
        vline = VerticalDate(self.settings)
        self.plot_on(vline, date=special_date, label="Important Event")
        return self.fig, self.ax

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

GOAD🐐 - When your data analysis is so fire🔥 it's got rizz✨

Project details

Release history

0.1.4 (Mar 23, 2025)
0.1.3 (Mar 23, 2025)
0.1.2 (Mar 23, 2025)
0.1.1 (Mar 23, 2025)
0.1.0 (Mar 23, 2025), this version

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

goad_toolkit-0.1.0-py3-none-any.whl (17.6 kB)

Uploaded Mar 23, 2025 Python 3

File details

Details for the file goad_toolkit-0.1.0-py3-none-any.whl.

File metadata

Download URL: goad_toolkit-0.1.0-py3-none-any.whl
Upload date: Mar 23, 2025
Size: 17.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.8

File hashes

Hashes for goad_toolkit-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0de021cd6d0a4fc85203c421d22c15c9cb9706f016fa9563a2d45466a94ee2df`
MD5	`1bc0d0ca8a306ad8b216a3e0eeff54db`
BLAKE2b-256	`ab877fee43a8673c2c4aef88d80cb3b7b193bebf78db24e8b7cc4175d5cc3506`
