A Python analytics workbench for teaching data science

PyAnalytica

A Python analytics workbench for teaching data science. Built on Shiny for Python.

Features

  • Interactive data exploration and visualization — profile, transform, pivot, crosstab
  • Guided statistical analysis — group means, proportions, correlation, chi-squared
  • Machine learning — regression, classification, model evaluation, prediction
  • AI-powered insights — optional Anthropic integration for interpretation, suggestions, and natural-language queries
  • Homework framework — YAML-based assignments with hash-checked grading for instructors
  • Report generation — export analyses as HTML, Python scripts, or Jupyter notebooks
  • Procedure builder — record, replay, and export multi-step analysis workflows
  • Bundled datasets — ready-to-use data for classroom exercises
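The feature list mentions hash-checked grading without describing the mechanism. One common way to do this is to store only a digest of the expected answer in the assignment file and compare it with the digest of the student's submission. The following is a minimal sketch of that general idea, not PyAnalytica's actual implementation; the helper names `answer_digest` and `check_answer`, the rounding rule, and the text normalization are all assumptions made for illustration.

```python
import hashlib


def answer_digest(value, decimals=2):
    """Return a SHA-256 hex digest of a normalized answer (hypothetical helper)."""
    if isinstance(value, float):
        # Round floats so tiny numerical differences do not change the hash.
        text = f"{value:.{decimals}f}"
    else:
        # Normalize case and surrounding whitespace for text answers.
        text = str(value).strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_answer(submitted, expected_digest, decimals=2):
    """Compare a submission against a stored digest without revealing the answer."""
    return answer_digest(submitted, decimals) == expected_digest


# An instructor would store only the digest, for example in a YAML assignment file.
stored = answer_digest(2.998)

print(check_answer(3.0, stored))    # True: both values round to "3.00"
print(check_answer(3.25, stored))   # False: "3.25" hashes differently
```

Because only the digest is stored, students can read the assignment file without seeing the expected answer. The trade-off is that the normalization rule (rounding, case folding) has to be agreed on in advance, since any difference in the normalized text changes the hash.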

Installation

pip install pyanalytica

Optional extras:

pip install "pyanalytica[ai]"      # Anthropic AI integration
pip install "pyanalytica[report]"  # Jupyter notebook export
pip install "pyanalytica[all]"     # Everything
pip install "pyanalytica[dev]"     # Development and testing (quotes keep shells such as zsh from globbing the brackets)

Quick Start

Launch the interactive workbench

pyanalytica                # CLI entry point
python -m pyanalytica      # or as a module

Use as a Python library

Every analytics function returns a (result, CodeSnippet) tuple. The CodeSnippet contains the equivalent pandas/sklearn code so students can see what runs under the hood.

from pyanalytica.data.load import load_bundled
from pyanalytica.data.profile import profile_dataframe
from pyanalytica.visualize.distribute import histogram
from pyanalytica.visualize.relate import scatter
from pyanalytica.explore.summarize import group_summarize

# Load a bundled dataset
df, code = load_bundled("tips")

# Profile the dataframe
profile = profile_dataframe(df)

# Visualize
fig, code = histogram(df, "total_bill", bins=20)
fig, code = scatter(df, x="total_bill", y="tip", color_by="smoker")

# Summarize
result, code = group_summarize(df, group_cols=["day"], agg_col="tip", agg_func="mean")
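
To make the (result, CodeSnippet) convention concrete, here is a minimal sketch of how a function following it could be written. It is not PyAnalytica's implementation: the `CodeSnippet` dataclass, its `code` field, and the `group_mean` function are hypothetical stand-ins. The sketch computes with plain Python so it runs without extra dependencies, and records the pandas one-liner a student could run to get the same numbers. The rows are made-up illustration data, not the bundled tips dataset.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CodeSnippet:
    """Hypothetical container for the code equivalent to an operation."""
    code: str


def group_mean(rows, group_col, value_col):
    """Return (result, CodeSnippet): per-group means plus equivalent pandas code."""
    groups = {}
    for row in rows:
        # Collect the values of value_col under each distinct group key.
        groups.setdefault(row[group_col], []).append(row[value_col])
    result = {key: mean(values) for key, values in groups.items()}
    # Record the pandas expression that performs the same aggregation.
    snippet = CodeSnippet(code=f'df.groupby("{group_col}")["{value_col}"].mean()')
    return result, snippet


# Made-up rows for illustration only.
rows = [
    {"day": "Thur", "tip": 2.0},
    {"day": "Thur", "tip": 4.0},
    {"day": "Fri", "tip": 5.0},
]

result, snippet = group_mean(rows, "day", "tip")
print(result)        # {'Thur': 3.0, 'Fri': 5.0}
print(snippet.code)  # df.groupby("day")["tip"].mean()
```

Returning the snippet alongside the result keeps the two in sync: the caller can display the answer in the interface and, separately, show or export the code that reproduces it.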

Bundled Datasets

Name        Rows      Description
tips        244       Restaurant tipping data
diamonds    ~54,000   Prices and attributes of diamonds
candidates  5,000     JobMatch recruiting simulation: job candidates
jobs        500       JobMatch recruiting simulation: job postings
companies   200       JobMatch recruiting simulation: companies
events                JobMatch recruiting simulation: recruiting events

from pyanalytica.datasets import list_datasets, load_dataset

list_datasets()          # ['candidates', 'companies', 'diamonds', 'events', 'jobs', 'tips']
df = load_dataset("diamonds")

Development

git clone https://github.com/social-engineer-ai/PyAnalytica.git
cd PyAnalytica
pip install -e ".[dev,all]"
python -m pytest tests/ -v

License

MIT License. See LICENSE for details.
