
A Python analytics workbench for teaching data science


PyAnalytica

A Python analytics workbench for teaching data science. Built on Shiny for Python.

Features

  • Interactive data exploration and visualization — profile, transform, pivot, crosstab
  • Guided statistical analysis — group means, proportions, correlation, chi-squared
  • Machine learning — regression, classification, model evaluation, prediction
  • AI-powered insights — optional Anthropic integration for interpretation, suggestions, and natural-language queries
  • Homework framework — YAML-based assignments with hash-checked grading for instructors
  • Report generation — export analyses as HTML, Python scripts, or Jupyter notebooks
  • Procedure builder — record, replay, and export multi-step analysis workflows
  • Bundled datasets — ready-to-use data for classroom exercises
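
The "hash-checked grading" idea from the feature list can be illustrated with a small standard-library sketch. This is not PyAnalytica's implementation: the function names, the salt, and the normalization rules (two-decimal rounding, lowercasing) are all assumptions chosen for the example. It only shows how an instructor could store a digest of the expected answer instead of the answer itself.

```python
import hashlib
import hmac


def answer_hash(answer, salt=""):
    """Hash a normalized answer so the answer key need not be stored in plain text."""
    # Normalization is an assumption: round floats to 2 decimals, trim and lowercase text.
    if isinstance(answer, float):
        normalized = f"{answer:.2f}"
    else:
        normalized = str(answer).strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def check_answer(submitted, expected_hash, salt=""):
    """Return True when the submitted answer hashes to the stored digest."""
    return hmac.compare_digest(answer_hash(submitted, salt), expected_hash)


# Instructor side: keep only the digest (for example, in the assignment file).
stored = answer_hash(2.998, salt="hw1-q3")

# Student side: the submission is normalized, hashed, and compared.
print(check_answer(3.0, stored, salt="hw1-q3"))  # True
print(check_answer(3.1, stored, salt="hw1-q3"))  # False
```

Rounding before hashing matters because a hash comparison is exact: without normalization, 2.998 and 3.0 would never match.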

Installation

pip install pyanalytica

Optional extras (in shells such as zsh, quote the argument, e.g. pip install "pyanalytica[ai]"):

pip install pyanalytica[ai]      # Anthropic AI integration
pip install pyanalytica[report]  # Jupyter notebook export
pip install pyanalytica[all]     # Everything
pip install pyanalytica[dev]     # Development and testing
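
To check from Python whether an optional dependency is present, a generic standard-library sketch like the following may help. It assumes the [ai] extra installs the `anthropic` package, which this page does not state explicitly, and `has_module` is a hypothetical helper, not part of PyAnalytica.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be found in this environment."""
    return find_spec(name) is not None


# Assumption: the [ai] extra pulls in the `anthropic` package.
if has_module("anthropic"):
    print("AI features available")
else:
    print("AI features unavailable; try: pip install pyanalytica[ai]")
```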

Quick Start

Launch the interactive workbench

pyanalytica                # CLI entry point
python -m pyanalytica      # or as a module

Use as a Python library

Every analytics function returns a (result, CodeSnippet) tuple. The CodeSnippet contains the equivalent pandas/sklearn code so students can see what runs under the hood.

from pyanalytica.data.load import load_bundled
from pyanalytica.data.profile import profile_dataframe
from pyanalytica.visualize.distribute import histogram
from pyanalytica.visualize.relate import scatter
from pyanalytica.explore.summarize import group_summarize

# Load a bundled dataset
df, code = load_bundled("tips")

# Profile the dataframe
profile = profile_dataframe(df)

# Visualize
fig, code = histogram(df, "total_bill", bins=20)
fig, code = scatter(df, x="total_bill", y="tip", color_by="smoker")

# Summarize
result, code = group_summarize(df, group_cols=["day"], agg_col="tip", agg_func="mean")
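
The (result, CodeSnippet) convention described above can be sketched without the library. The `CodeSnippet` class and `group_mean` function below are hypothetical stand-ins written for illustration, and the real PyAnalytica class may carry different fields. The sketch computes group means in plain Python and returns, alongside the result, the pandas expression a student would run to get the same answer.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class CodeSnippet:
    """Hypothetical stand-in; the real class may carry more fields."""
    code: str


def group_mean(rows, group_col, value_col):
    """Compute per-group means and return them with equivalent pandas code."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    result = {key: mean(values) for key, values in groups.items()}
    snippet = CodeSnippet(code=f"df.groupby({group_col!r})[{value_col!r}].mean()")
    return result, snippet


rows = [
    {"day": "Sat", "tip": 2.0},
    {"day": "Sat", "tip": 4.0},
    {"day": "Sun", "tip": 5.0},
]
result, code = group_mean(rows, "day", "tip")
print(result)     # {'Sat': 3.0, 'Sun': 5.0}
print(code.code)  # df.groupby('day')['tip'].mean()
```

Returning the snippet next to the result keeps the teaching aid attached to the computation it explains, so a caller can display or export the code without re-deriving it.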

Bundled Datasets

Name        Rows      Description
tips        244       Restaurant tipping data
diamonds    ~54,000   Prices and attributes of diamonds
candidates  5,000     JobMatch recruiting simulation: job candidates
jobs        500       JobMatch recruiting simulation: job postings
companies   200       JobMatch recruiting simulation: companies
events                JobMatch recruiting simulation: recruiting events

from pyanalytica.datasets import list_datasets, load_dataset

list_datasets()          # ['candidates', 'companies', 'diamonds', 'events', 'jobs', 'tips']
df = load_dataset("diamonds")

Development

git clone https://github.com/social-engineer-ai/PyAnalytica.git
cd PyAnalytica
pip install -e ".[dev,all]"
python -m pytest tests/ -v

License

MIT License. See LICENSE for details.


