
Project description


Data analysis, Statistics and Process improvements (DaSPi)

Visualize and analyze your data with DaSPi. This package is designed for users who want to find relevant influencing factors in processes and validate improvements. It offers many Six Sigma tools built on established Python packages such as Pandas, Scipy, Statsmodels and Matplotlib.

The goal of this package is to be easy to use and flexible so that it can be adapted to a wide array of data analysis tasks.

Why DaSPi?

There are great packages for data analysis and visualization in Python, such as Pandas, Seaborn, Altair, Statsmodels, Scipy and Pingouin. But most of the time they do not work directly with each other. Wouldn't it be great if you could use all of these packages together in one place? That's where DaSPi comes in. DaSPi is a Python package that provides a unified interface for data analysis, statistics and visualization. It lets you use all of the great packages mentioned above together in one place, making it easier to explore and understand your data.

Features

  • Ease of Use: DaSPi is designed to be easy to use, even for beginners. It provides a simple and intuitive interface that makes it easy to get started with data analysis.
  • Visualization: DaSPi provides a wide range of visualization options, including multivariate charts, joint charts, and useful templates. This makes it easy to explore and understand your data in a visual way.
  • Statistics: DaSPi provides a wide range of statistical functions and tests, including hypothesis testing, confidence intervals, and regression analysis. This makes it easy to explore and understand your data in a statistical way.
  • Open Source: DaSPi is open source, which means that it is free to use and modify. This makes it a great option for users who want to customize the package to their specific needs.
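As a rough illustration of the kind of hypothesis test and confidence interval these statistical features cover, here is a minimal sketch using plain SciPy (not the DaSPi API; the data is synthetic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
before = rng.normal(10.0, 2.0, size=30)  # baseline process samples
after = rng.normal(11.0, 2.0, size=30)   # samples after an improvement

# Two-sample t-test: did the process mean change significantly?
t_stat, p_value = stats.ttest_ind(before, after)

# 95 % confidence interval for the mean of the improved process
ci = stats.t.interval(0.95, len(after) - 1,
                      loc=np.mean(after),
                      scale=stats.sem(after))
print(f"p = {p_value:.4f}, 95 % CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```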

This package contains the following submodules:

  • plotlib: Visualizations with Matplotlib, where the division by color, marker size or shape as well as rows and columns subplots are automated depending on the given categorical data. Any plots can also be combined, such as scatter with contour plot, violin with error bars or other creative combinations.
  • anova: analysis of variance (ANOVA), which is used to compare the variance within and between two or more groups, or the effects of different treatments on a response variable. It also includes a function for calculating the variance inflation factor (VIF) for linear regression models. The main class is LinearModel, which provides methods for fitting linear regression models with interactions and automatically eliminating insignificant variables.
  • statistics: applied statistics, hypothesis tests, confidence calculations and Monte Carlo simulation. It also includes estimation of process capability and capability indices.
  • datasets: data for exercises. It includes different datasets that can be used for testing and experimentation.

Usage

Visualization

To use DaSPi, you can import the package and start exploring your data. Here is an example of how to use DaSPi to visualize a dataset:

import daspi as dsp
df = dsp.load_dataset('iris')

chart = dsp.MultivariateChart(
        source=df,
        target='length',
        feature='width',
        hue='species',
        col='leaf',
        markers=('x',)
    ).plot(
        dsp.GaussianKDEContour
    ).plot(
        dsp.Scatter
    ).label(
        feature_label='leaf width (cm)',
        target_label='leaf length (cm)',
    )

(Figure: iris leaf length vs. leaf width by species)

ANOVA

Do some ANOVA and statistics on a dataset. Run the example below in a Jupyter notebook to see the results.

import daspi as dsp
import pandas as pd

df = dsp.load_dataset('aspirin-dissolution')
model = dsp.LinearModel(
    source=df,
    target='dissolution',
    features=['employee', 'stirrer', 'brand', 'catalyst', 'water'],
    disturbances=['temperature', 'preparation'],
    order=2)

# Collect the goodness-of-fit metrics of each elimination step
df_gof = pd.DataFrame()
for data_gof in model.recursive_elimination():
    df_gof = pd.concat([df_gof, data_gof])

dsp.ResidualsCharts(model).plot().stripes().label(info=True)
dsp.ParameterRelevanceCharts(model).plot().label(info=True)
model

Formula:

dissolution ~ 16.0792 + 2.3750 employee[T.B] + 0.8375 employee[T.C] + 10.7500 brand[T.Godamed] - 3.8000 water[T.tap] - 5.7167 brand[T.Godamed]:water[T.tap]
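A prediction from this formula is the intercept plus every dummy term whose condition holds. For example, for employee B, brand Godamed and tap water, all four of those terms (including the interaction) are active. A check by hand, not a DaSPi call:

```python
# Fitted coefficients copied from the formula above
intercept = 16.0792
employee_B = 2.3750      # employee[T.B]
brand_godamed = 10.7500  # brand[T.Godamed]
water_tap = -3.8000      # water[T.tap]
interaction = -5.7167    # brand[T.Godamed]:water[T.tap]

# Predicted dissolution for employee B, brand Godamed, tap water
pred = intercept + employee_B + brand_godamed + water_tap + interaction
print(round(pred, 4))  # → 19.6875
```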

Model Summary

Hierarchical  Least Parameter  P Least   S         AIC         R²        R² adj    R² pred
True          employee         0.023298  2.374693  224.835935  0.857379  0.840400  0.813719

Parameter Statistics

Parameter Coef Std Err T P CI Low CI Upp
Intercept 16.079167 0.839581 19.151424 0.000000 14.384824 17.773509
employee[T.B] 2.375000 0.839581 2.828793 0.007133 0.680657 4.069343
employee[T.C] 0.837500 0.839581 0.997522 0.324224 -0.856843 2.531843
brand[T.Godamed] 10.750000 0.969464 11.088598 0.000000 8.793542 12.706458
water[T.tap] -3.800000 0.969464 -3.919690 0.000321 -5.756458 -1.843542
brand[T.Godamed]:water[T.tap] -5.716667 1.371030 -4.169616 0.000149 -8.483516 -2.949817

Analysis of Variance

Source DF SS MS F P η²
employee 2 46.431667 23.215833 4.116891 0.023298 0.027960
brand 1 747.340833 747.340833 132.526821 0.000000 0.450027
water 1 532.000833 532.000833 94.340328 0.000000 0.320355
brand:water 1 98.040833 98.040833 17.385695 0.000149 0.059037
Residual 42 236.845000 5.639167 nan nan 0.142621
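The rightmost ANOVA column is the effect size η² = SS_effect / SS_total, i.e. each effect's share of the total sum of squares (this interpretation is an inference from the numbers, but it checks out). A quick verification for the employee row:

```python
# Sums of squares copied from the ANOVA table above
ss = {
    'employee': 46.431667,
    'brand': 747.340833,
    'water': 532.000833,
    'brand:water': 98.040833,
    'Residual': 236.845000,
}
ss_total = sum(ss.values())

eta_sq = ss['employee'] / ss_total
print(round(eta_sq, 6))  # ≈ 0.02796, matching the table
```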

Variance Inflation Factor

Term DF VIF GVIF Threshold Collinear Method
Intercept 1 5.000000 2.236068 2.236068 True R_squared
employee 2 1.000000 1.000000 1.495349 False generalized
brand 1 1.000000 1.000000 2.236068 False R_squared
water 1 1.000000 1.000000 2.236068 False R_squared
brand:water 1 1.000000 1.000000 2.236068 False single_order-2_term

(Figure: ANOVA dissolution residual charts)

(Figure: ANOVA dissolution parameter relevance charts)

Process capability

Analyze process variation and other key performance indicators for process capability.

import daspi as dsp
import pandas as pd

df = dsp.load_dataset('drop_card')
spec_limits = dsp.SpecLimits(0, float(df.loc[0, 'usl']))
target = 'distance'

chart = dsp.ProcessCapabilityAnalysisCharts(
        source=df,
        target=target,
        spec_limits=spec_limits,
        hue='method'
    ).plot(
    ).stripes(
    ).label(
        fig_title='Process Capability Analysis',
        sub_title='Drop Card Experiment',
        target_label='Distance (cm)',
        info=True
    )

samples_parallel = df[df['method']=='parallel'][target]
samples_perpendicular = df[df['method']=='perpendicular'][target]
pd.concat([
    dsp.ProcessEstimator(samples_parallel, spec_limits).describe(),
    dsp.ProcessEstimator(samples_perpendicular, spec_limits).describe()],
    axis=1,
    ignore_index=True,
).rename(
    columns={0: 'parallel', 1: 'perpendicular'}
)
parallel perpendicular
n_samples 20 20
n_missing 0 0
n_ok 18 20
n_nok 2 0
n_errors 0 0
ok 90.00 % 100.00 %
nok 10.00 % 0.00 %
nok_norm 8.01 % 3.73 %
nok_fit 7.24 % 5.77 %
min 8.5 17.5
max 83.0 73.0
mean 42.935 48.485
median 40.75 52.5
std 22.666583 17.359489
sem 5.068402 3.8817
excess -0.900801 -1.236078
p_excess 0.288757 0.072573
skew 0.19252 -0.377538
p_skew 0.690373 0.438723
p_ad 0.754044 0.098371
dist lognorm logistic
p_dist 0.964797 0.744326
strategy norm norm
lcl -25.064748 -3.593468
ucl 110.934748 100.563468
lsl 0 0
usl 80.0 80.0
cp 0.588237 0.768072
cpk 0.545076 0.605145
Z 1.635227 1.815434
Z_lt 0.135227 0.315434
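The capability indices in the table follow the textbook definitions Cp = (USL − LSL) / 6σ and Cpk = min(USL − μ, μ − LSL) / 3σ; Z is the short-term sigma level 3·Cpk and Z_lt applies the conventional 1.5 σ long-term shift. Recomputing from the parallel column's mean and standard deviation reproduces the table values:

```python
# Statistics copied from the 'parallel' column above
mean, std = 42.935, 22.666583
lsl, usl = 0.0, 80.0

cp = (usl - lsl) / (6 * std)                 # ≈ 0.588237
cpk = min(usl - mean, mean - lsl) / (3 * std)  # ≈ 0.545076
z = 3 * cpk          # short-term sigma level, ≈ 1.635227
z_lt = z - 1.5       # long-term, with the usual 1.5 sigma shift
print(round(cp, 6), round(cpk, 6), round(z_lt, 6))
```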

(Figure: Process Capability Analysis charts)
