Lightweight data analysis & ML library. Only requires numpy - includes DataFrame, statistics, ML algorithms, and visualization with zero heavy dependencies.

QuickInsights

A data analysis and machine learning toolkit that runs on numpy alone.

pandas, scipy, scikit-learn and matplotlib are all optional. Install them when you need them; QuickInsights works without any of them.

pip install quickinsights
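
Optional dependencies are usually detected at runtime rather than imported unconditionally. The snippet below is a minimal sketch of that pattern using only the standard library; `has_module` and `pick_backend` are hypothetical helper names for illustration, not part of the QuickInsights API.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend(preferred: str = "matplotlib", fallback: str = "text") -> str:
    """Use the optional backend when it is installed, else the built-in one."""
    return preferred if has_module(preferred) else fallback


if __name__ == "__main__":
    # Prints "matplotlib" or "text" depending on what is installed.
    print(pick_backend())
```

Checking with `find_spec` avoids paying the import cost of a heavy package just to find out whether it exists.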

What is in the box

| Module | What it does | Replaces |
| dataframe | Read / write CSV-JSON-Parquet, filter, group, describe | pandas |
| stats | Descriptive stats, correlation, hypothesis tests, outlier detection | scipy |
| ml | 22 algorithms: regression, classification, clustering, reduction | scikit-learn |
| viz | Text, HTML and SVG charts; auto-fallback to matplotlib when present | matplotlib |
| io_module | Smart loader (CSV → Parquet caching), streaming, result cache | |
| analysis | One-call analyze() + quick_insight() executive summary | |
| cleaning | Missing-value handling, duplicate removal | |
| plugins | Runtime plugin registration and execution | |
| config_module | Nested key-value config with JSON / YAML / TOML back-ends | |

Getting started

import quickinsights as qi

# analyse a plain dict — no pandas required
result = qi.analyze({
    "price":    [29.99, 49.99, 19.99, 99.99],
    "rating":   [4.5, 3.8, 4.9, 4.1],
    "category": ["books", "electronics", "books", "clothing"],
})

# one-line executive summary
print(qi.quick_insight(result, target="price")["executive_summary"])

# export to HTML, CSV or JSON
qi.export(result, "report", "html")

Working with data

from quickinsights.dataframe import QuickFrame

qf = QuickFrame.read_csv("sales.csv")          # chunked reading supported
qf = qf[qf["revenue"].values > 0]              # filter
print(qf.groupby("region").mean())              # aggregate
print(qf.corr())                                # correlation matrix
qf.to_csv("filtered.csv")

QuickFrame supports: select_dtypes, sort_values, dropna / fillna, describe, value_counts, concat, rename, drop, duplicated, chunked CSV iteration, JSON and Parquet I/O.
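
To give a feel for how a group-by aggregation can be done on NumPy alone, here is a minimal sketch. It is not QuickFrame's implementation; `group_mean` is a hypothetical helper written for this example.

```python
import numpy as np


def group_mean(keys, values):
    """Mean of `values` per distinct entry of `keys`, using NumPy only."""
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    # `inverse` maps every row to the index of its group label.
    labels, inverse = np.unique(keys, return_inverse=True)
    sums = np.bincount(inverse, weights=values, minlength=len(labels))
    counts = np.bincount(inverse, minlength=len(labels))
    return dict(zip(labels.tolist(), (sums / counts).tolist()))


if __name__ == "__main__":
    regions = ["north", "south", "north", "south", "east"]
    revenue = [10.0, 20.0, 30.0, 40.0, 5.0]
    print(group_mean(regions, revenue))
```

`np.unique` with `return_inverse=True` plus `np.bincount` keeps the whole aggregation vectorised, with no Python loop over rows.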

Statistics

from quickinsights.stats import (
    pearson_correlation, ttest_ind, detect_outliers_iqr, jarque_bera
)
import numpy as np

x = np.random.randn(1000)
y = 2 * x + np.random.randn(1000) * 0.5

pearson_correlation(x, y)       # ≈ 0.97
detect_outliers_iqr(x).sum()    # number of outliers
jarque_bera(x)                  # (statistic, p_value)
ttest_ind(x, y)                 # (t, p)

Also available: spearman_correlation, covariance, skewness, kurtosis, chi2_test, zscore, kde_estimate, entropy, distance metrics.
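
For readers who want to see what two of these functions compute, the sketch below gives textbook NumPy definitions of the Pearson correlation and the IQR outlier rule. The names `pearson` and `iqr_outlier_mask` and the default `k=1.5` are assumptions made for illustration; the library's own functions may differ in signature and edge-case handling.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def iqr_outlier_mask(x, k=1.5):
    """Boolean mask of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=1000)
    b = 2 * a + rng.normal(size=1000) * 0.5
    print(pearson(a, b))
    print(int(iqr_outlier_mask(a).sum()))
```

The mask form makes it easy to count outliers with `.sum()` or to drop them with boolean indexing.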

Machine learning

22 algorithms, from scratch, in pure NumPy.

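import numpy as np
from quickinsights.ml import (
    train_test_split, StandardScaler,
    RandomForestClassifier, accuracy_score, classification_report,
)

# toy data so the snippet runs on its own: 500 samples, 4 features, binary label
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# fit the scaler on the training split, then reuse it for the test split
# (assumes StandardScaler exposes transform(), as scikit-learn's does)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = RandomForestClassifier(n_estimators=50, max_depth=8)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))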
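
Scaling statistics are normally estimated on the training split only and then applied to the test split, so that no information leaks from the held-out data. The following NumPy sketch shows that idea in isolation; `split_and_scale` is a hypothetical helper for this example, not a QuickInsights function.

```python
import numpy as np


def split_and_scale(X, y, test_size=0.2, seed=0):
    """Shuffle, split, and standardise using training-set statistics only."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_size))
    test, train = idx[:n_test], idx[n_test:]

    mean = X[train].mean(axis=0)
    std = X[train].std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero

    X_train = (X[train] - mean) / std
    X_test = (X[test] - mean) / std  # same mean/std as the training split
    return X_train, X_test, y[train], y[test]
```

After the call, the training columns have zero mean and unit variance, while the test columns are only approximately standardised, which is the expected behaviour.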

Full algorithm list

| Category | Algorithms |
| Linear | LinearRegression, LogisticRegression, RidgeRegression, LassoRegression, ElasticNet |
| Trees & ensembles | DecisionTreeClassifier, RandomForestClassifier/Regressor, GradientBoostingClassifier/Regressor |
| Neighbours | KNeighborsClassifier |
| Bayes | GaussianNB, MultinomialNB |
| Clustering | KMeans, DBSCAN, AgglomerativeClustering |
| Dimensionality reduction | PCA, t-SNE |
| Preprocessing | StandardScaler, MinMaxScaler, LabelEncoder, train_test_split |
| Evaluation | accuracy_score, MSE, MAE, R², confusion_matrix, classification_report, cross_val_score |

Visualization without matplotlib

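import numpy as np
from quickinsights.viz import text_histogram, generate_html_report

data = np.random.randn(1000)   # sample data for the histogram

# works in any terminal
print(text_histogram(data, bins=20, title="Distribution"))

# self-contained HTML report, no browser extensions needed
# analysis_results: results computed earlier (assumed here to be the output of qi.analyze())
generate_html_report(analysis_results, output_path="report.html")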

When matplotlib is installed, smart_histogram / smart_bar_chart / smart_heatmap produce regular matplotlib figures automatically.
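
A terminal histogram needs nothing beyond `np.histogram` and string formatting. The sketch below illustrates the idea; `ascii_histogram` and its `width` parameter are hypothetical and are not the signature of `text_histogram`.

```python
import numpy as np


def ascii_histogram(data, bins=10, width=40, title=""):
    """Render a histogram as text, one line per bin."""
    counts, edges = np.histogram(np.asarray(data, dtype=float), bins=bins)
    peak = max(int(counts.max()), 1)  # guard against all-empty bins
    lines = [title] if title else []
    for count, lo, hi in zip(counts, edges[:-1], edges[1:]):
        # Bar length is proportional to the tallest bin.
        bar = "#" * int(round(width * int(count) / peak))
        lines.append(f"{lo:9.3f} .. {hi:9.3f} | {bar} {int(count)}")
    return "\n".join(lines)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(ascii_histogram(rng.normal(size=1000), bins=12, title="Distribution"))
```

Because the output is a plain string, it can be printed, logged, or embedded in a report without any plotting back-end.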

Streaming large files

from quickinsights.io_module import StreamingAnalyzer

analyzer = StreamingAnalyzer(chunksize=50_000)
result = analyzer.analyze("big_file.csv")   # constant memory usage
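
One common way to keep memory constant is to hold only running statistics and merge each chunk into them. The sketch below shows that technique for the mean and variance of a single CSV column, using the pairwise merge formula of Chan et al. It is not StreamingAnalyzer's implementation; `RunningStats` and `stream_column` are hypothetical names.

```python
import csv

import numpy as np


class RunningStats:
    """Mean and sample variance of one column, updated chunk by chunk."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, chunk):
        chunk = np.asarray(chunk, dtype=float)
        if chunk.size == 0:
            return
        n_b = chunk.size
        mean_b = float(chunk.mean())
        m2_b = float(((chunk - mean_b) ** 2).sum())
        delta = mean_b - self.mean
        total = self.n + n_b
        # Merge the chunk's statistics into the running ones.
        self.mean += delta * n_b / total
        self.m2 += m2_b + delta ** 2 * self.n * n_b / total
        self.n = total

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def stream_column(lines, column, chunksize=50_000):
    """Consume an iterable of CSV lines, holding at most `chunksize` values."""
    stats, buffer = RunningStats(), []
    for row in csv.DictReader(lines):
        buffer.append(float(row[column]))
        if len(buffer) >= chunksize:
            stats.update(buffer)
            buffer.clear()
    stats.update(buffer)  # flush the final partial chunk
    return stats
```

Passing an open file object as `lines` (for example `open("big_file.csv", newline="")`) processes the file without ever loading it whole.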

Benchmarks

Measured on 5 000 samples, 20 features. Native = QuickInsights, sklearn = scikit-learn.

| Algorithm | Native | sklearn | Accuracy difference |
| GaussianNB | 0.2 ms | 1.0 ms | identical |
| Ridge | 0.4 ms | 2.2 ms | identical |
| LinearRegression | 2.2 ms | 4.0 ms | identical |
| KMeans (k = 5) | 155 ms | 637 ms | identical |
| RandomForest (20 trees) | 167 ms | 30 ms | identical |
| GradientBoosting (50 trees) | 963 ms | 93 ms | identical |
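The linear-algebra algorithms are faster than scikit-learn in this benchmark: both libraries end up in the same BLAS/LAPACK routines, but QuickInsights reaches them with less Python overhead. The tree-based ensembles are slower because they run in pure Python rather than compiled C, but they produce the same predictions.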
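
To illustrate why a linear model can be this cheap in pure NumPy, here is a minimal closed-form ridge regression. Almost all of the work happens inside `np.linalg.solve` and the matrix products, which dispatch to LAPACK/BLAS. This is a sketch under the usual textbook formulation, not the library's RidgeRegression; `ridge_fit` is a hypothetical name.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Centre the data so the intercept is handled separately.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc; the solve runs in LAPACK.
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, float(intercept)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 20))
    y = X @ rng.normal(size=20) + 3.0
    coef, intercept = ridge_fit(X, y, alpha=1.0)
    print(coef.shape, intercept)
```

The Python layer here is a handful of lines, so the runtime is dominated by the compiled routines. A tree ensemble, by contrast, needs many small data-dependent steps that cannot be expressed as a few large matrix operations.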

Installation options

pip install quickinsights                # numpy only; everything works
pip install "quickinsights[pandas]"      # adds pandas
pip install "quickinsights[viz]"         # adds matplotlib, seaborn, plotly
pip install "quickinsights[sklearn]"     # adds scikit-learn
pip install "quickinsights[full]"        # all of the above (quotes keep shells like zsh from expanding the brackets)

Project layout

src/quickinsights/
    __init__.py          core.py          error_handling.py
    dataframe/           stats/           ml/
    viz/                 io_module/       analysis/
    cleaning/            plugins/         config_module/

Running the tests

pip install pytest
pytest tests/ -v          # 107 tests, < 3 s

License

MIT — see LICENSE.

Author

Eren Ata — erena6466@gmail.com

