PyAnalytica
A Python analytics workbench for teaching data science. Built on Shiny for Python.
Features
- Interactive data exploration and visualization — profile, transform, pivot, crosstab
- Guided statistical analysis — group means, proportions, correlation, chi-squared
- Machine learning — regression, classification, model evaluation, prediction
- AI-powered insights — optional Anthropic integration for interpretation, suggestions, and natural-language queries
- Homework framework — YAML-based assignments with hash-checked grading for instructors
- Report generation — export analyses as HTML, Python scripts, or Jupyter notebooks
- Procedure builder — record, replay, and export multi-step analysis workflows
- Bundled datasets — ready-to-use data for classroom exercises
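The hash-checked grading mentioned in the feature list can be pictured roughly as follows. This is only a sketch of the general idea, not PyAnalytica's actual implementation: the helper names `answer_hash` and `check_answer`, the rounding rule, and the choice of SHA-256 are all assumptions made for illustration. The point is that an assignment file can store a digest of the expected answer without revealing the answer itself.

```python
import hashlib


def answer_hash(value, decimals=2):
    """Hash a normalized answer (hypothetical helper, not part of PyAnalytica)."""
    # Normalize floats by rounding so 3.14159 and 3.14 hash identically.
    if isinstance(value, float):
        text = f"{value:.{decimals}f}"
    else:
        # Normalize everything else as trimmed, lower-cased text.
        text = str(value).strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_answer(submitted, expected_hash, decimals=2):
    """Return True when the submitted answer hashes to the stored digest."""
    return answer_hash(submitted, decimals) == expected_hash


# Instructor side: store only the digest in the assignment file.
stored = answer_hash(2.998)

# Student side: the submission is hashed and compared.
print(check_answer(3.0, stored))  # same value after rounding to 2 decimals
print(check_answer(3.5, stored))  # different value
```

Rounding before hashing is one possible design choice: it tolerates small floating-point differences between student and instructor environments, at the cost of accepting any answer within the rounding window.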
Installation
pip install pyanalytica
Optional extras:
pip install "pyanalytica[ai]" # Anthropic AI integration
pip install "pyanalytica[report]" # Jupyter notebook export
pip install "pyanalytica[all]" # Everything
pip install "pyanalytica[dev]" # Development and testing
Quick Start
Launch the interactive workbench
pyanalytica # CLI entry point
python -m pyanalytica # or as a module
Use as a Python library
Every analytics function returns a (result, CodeSnippet) tuple. The CodeSnippet
contains the equivalent pandas/sklearn code so students can see what runs under the hood.
from pyanalytica.data.load import load_bundled
from pyanalytica.data.profile import profile_dataframe
from pyanalytica.visualize.distribute import histogram
from pyanalytica.visualize.relate import scatter
from pyanalytica.explore.summarize import group_summarize
# Load a bundled dataset
df, code = load_bundled("tips")
# Profile the dataframe
profile = profile_dataframe(df)
# Visualize
fig, code = histogram(df, "total_bill", bins=20)
fig, code = scatter(df, x="total_bill", y="tip", color_by="smoker")
# Summarize
result, code = group_summarize(df, group_cols=["day"], agg_col="tip", agg_func="mean")
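To make the (result, CodeSnippet) convention concrete, here is a minimal, dependency-free sketch of the pattern. The `CodeSnippet` class and `group_mean` function below are hypothetical stand-ins written for illustration; PyAnalytica's real classes and signatures may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class CodeSnippet:
    """Hypothetical container for the code equivalent to an operation."""
    code: str


def group_mean(rows, group_col, agg_col):
    """Compute per-group means and return a (result, CodeSnippet) tuple."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[agg_col])
    result = {key: mean(values) for key, values in groups.items()}
    # The snippet records pandas code that would compute the same thing.
    snippet = CodeSnippet(
        code=f'df.groupby("{group_col}")["{agg_col}"].mean()'
    )
    return result, snippet


rows = [
    {"day": "Sat", "tip": 2.0},
    {"day": "Sat", "tip": 4.0},
    {"day": "Sun", "tip": 3.0},
]
result, code = group_mean(rows, "day", "tip")
print(result)     # per-group means keyed by day
print(code.code)  # the equivalent pandas expression, as text
```

Returning the snippet alongside the result keeps the computation and its explanation together, so a caller can display or export the code without re-deriving it.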
Bundled Datasets
| Name | Rows | Description |
|---|---|---|
| tips | 244 | Restaurant tipping data |
| diamonds | ~54,000 | Prices and attributes of diamonds |
| candidates | 5,000 | JobMatch recruiting simulation: job candidates |
| jobs | 500 | JobMatch recruiting simulation: job postings |
| companies | 200 | JobMatch recruiting simulation: companies |
| events | not stated | JobMatch recruiting simulation: recruiting events |
from pyanalytica.datasets import list_datasets, load_dataset
list_datasets() # ['candidates', 'companies', 'diamonds', 'events', 'jobs', 'tips']
df = load_dataset("diamonds")
Development
git clone https://github.com/social-engineer-ai/PyAnalytica.git
cd PyAnalytica
pip install -e ".[dev,all]"
python -m pytest tests/ -v
License
MIT License. See LICENSE for details.
File details
Details for the file pyanalytica-0.2.0.tar.gz.
File metadata
- Download URL: pyanalytica-0.2.0.tar.gz
- Upload date:
- Size: 1.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ccd296da7025d34bea2ac046bc4513f1412c07bc9189efd1c16cab067a3732b6 |
| MD5 | 59347f4a547c4b099137cb1121a320ca |
| BLAKE2b-256 | b7996e28fb8c12aed1b3ea66b4d03de7841d10b5e521d3c182265804c9caed04 |
File details
Details for the file pyanalytica-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pyanalytica-0.2.0-py3-none-any.whl
- Upload date:
- Size: 1.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b06e871c7c73140226b0823380c8f4fd6b834d6a75b999d71e4c1ed0398ddd16 |
| MD5 | 4814c38feefa943d7b7b40c74413ab24 |
| BLAKE2b-256 | b0cab7d5b79e486bb5f2267d996dab1fc4b1561c9af4aeca931ec2baa710f5cb |
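The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper name `sha256_of` is an assumption, and the demonstration hashes a temporary file rather than the real distribution.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a temporary file. For a real check, pass the path of the
# downloaded wheel or sdist and compare against the digest published for it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

expected = hashlib.sha256(b"example payload").hexdigest()
matches = sha256_of(demo_path) == expected
os.remove(demo_path)
print("digest matches:", matches)
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 1.4 MB wheel but makes the helper safe to reuse on larger files.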