An advanced, open-source cricket intelligence SDK powered by DuckDB, PyArrow, and FastAPI for high-performance analytics.
Midwicket
The Open-Source Cricket Intelligence SDK
Fast, deterministic cricket analytics powered by PyArrow and DuckDB.
The Problem
Processing unstructured sports telemetry has historically been painful. Traditional APIs are slow, schemas break constantly, and calculating complex metrics like "venue bias" or "live win probability" across millions of events usually requires an expensive cloud data warehouse.
The Midwicket Solution
Midwicket brings the data warehouse to your laptop. It is a high-performance cricket intelligence SDK built on a structured pipeline architecture: a query planner routes requests between the PyArrow in-memory layer and a materialized DuckDB cache, keeping aggregations fast without cloud costs.
By combining vectorized PyArrow operations with an embedded DuckDB engine, Midwicket processes over 10 years of ball-by-ball data locally.
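The idea of serving aggregations from a materialized cache instead of rescanning raw events can be sketched with the standard library alone. This is a minimal illustration, not Midwicket's implementation: `sqlite3` stands in for the embedded DuckDB engine so the snippet runs without extra dependencies, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# sqlite3 stands in for DuckDB here; schema and rows are made up.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (batter TEXT, venue TEXT, runs INTEGER)")
con.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("Batter A", "Ground X", 4),
        ("Batter A", "Ground X", 1),
        ("Batter A", "Ground Y", 6),
        ("Batter B", "Ground X", 2),
    ],
)

# Materialize the aggregate once so later reads skip the raw event scan.
con.execute(
    "CREATE TABLE batter_totals AS "
    "SELECT batter, SUM(runs) AS runs, COUNT(*) AS balls "
    "FROM events GROUP BY batter"
)


def total_runs(batter: str) -> int:
    """Read a batter's total from the pre-computed table (0 if unknown)."""
    row = con.execute(
        "SELECT runs FROM batter_totals WHERE batter = ?", (batter,)
    ).fetchone()
    return row[0] if row else 0


print(total_runs("Batter A"))  # 11
```

The trade-off is the usual one for materialized views: reads become a keyed lookup, but the summary table has to be rebuilt or updated whenever new events are ingested.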
Key Capabilities
- Fast Local Queries: PyArrow and DuckDB power sub-second aggregations on cached, materialized views. Raw event scans remain available for ad hoc queries that no view covers.
- Pipeline Architecture: Specialized components (Executor, Planner, Storage Engine, Registry) isolate concerns and route queries along the most efficient path.
- Predictive Machine Learning: A logistic regression win probability model trained on IPL data (AUC 0.843), running entirely in memory with no external calls.
- Type-Safe & Deterministic: Immutable V1 schemas enforced via Pydantic. Queries are hashed and cached; identical inputs always produce identical outputs.
- FastAPI Backend: Production-ready REST API with auth, rate limiting, CORS, and Prometheus metrics.
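The "hashed and cached" behavior in the list above can be illustrated with a small sketch. It assumes one plausible approach (a SHA-256 digest of a canonical JSON encoding used as the cache key); the function names `query_key` and `cached` are hypothetical and are not part of the Midwicket API.

```python
import hashlib
import json

# Hypothetical in-process cache: query hash -> computed result.
_cache: dict[str, object] = {}


def query_key(name: str, params: dict) -> str:
    """Hash a query into a stable key.

    Sorting keys makes the digest independent of dict insertion order,
    so logically identical queries map to the same cache entry.
    """
    canonical = json.dumps(
        {"query": name, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cached(name: str, params: dict, compute):
    """Return the cached result for this query, computing it at most once."""
    key = query_key(name, params)
    if key not in _cache:
        _cache[key] = compute()
    return _cache[key]


# The same parameters in a different order hit the same cache entry.
first = cached("player_stats", {"player": "V Kohli", "season": 2023}, lambda: "computed")
second = cached("player_stats", {"season": 2023, "player": "V Kohli"}, lambda: "recomputed")
print(first == second)  # True
```

Canonicalizing before hashing is what makes the cache deterministic: without `sort_keys=True`, two equal dicts built in different orders could produce different keys.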
Architecture
The Midwicket engine separates concerns across a structured pipeline: incoming data flows from Cricsheet JSON through a PyArrow ingestion layer into a DuckDB cache, where a query planner decides whether to scan raw events or serve a pre-computed view.
graph LR
A[Cricsheet JSON] -->|Ingestion| B(PyArrow Pipeline)
B -->|Parquet| C{DuckDB Cache}
C -->|SQL Queries| D[Query Planner]
D -->|Express API| E[Jupyter / Colab]
D -->|FastAPI| F[Web / Mobile Clients]
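The planner's routing decision described in this section (serve a pre-computed view when one covers the request, otherwise scan raw events) can be sketched as follows. The `Query` type, the view catalogue, and the coverage rule are assumptions made for illustration; the actual planner may use different criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical request: a metric plus the columns it filters on."""
    metric: str
    filters: frozenset = frozenset()


# Hypothetical catalogue: metric -> columns each materialized view is grouped by.
MATERIALIZED_VIEWS = {
    "batter_totals": frozenset({"batter", "season"}),
    "venue_totals": frozenset({"venue"}),
}


def plan(query: Query) -> str:
    """Pick the cheapest path that can still answer the query.

    A view can serve the query only if every filter column is one the
    view was grouped by; anything else falls back to a raw event scan.
    """
    view_columns = MATERIALIZED_VIEWS.get(query.metric)
    if view_columns is not None and query.filters <= view_columns:
        return "materialized_view"
    return "raw_scan"


print(plan(Query("batter_totals", frozenset({"batter"}))))  # materialized_view
print(plan(Query("batter_totals", frozenset({"bowler"}))))  # raw_scan
```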
Quick Start
Try it instantly in your browser — no install required:
Step 1 — Install
pip install midwicket
Step 2 — Run a prediction (no data download needed)
The win probability model runs entirely in memory. No dataset, no waiting.
import midwicket.express as px
result = px.predict_win(
venue="Wankhede Stadium",
target=180,
current_score=120,
wickets_down=5,
overs_done=15.0,
)
print(f"Win Probability: {result['win_prob']:.1%}")
# Win Probability: 22.5%
The result also includes a confidence field — a heuristic certainty
indicator (0.1–0.95) that reflects how extreme the prediction is and how
much situational information is available. It is not a statistical confidence
interval; treat it as a qualitative signal.
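To make the inputs above concrete, the sketch below derives the chase state (runs required, balls remaining, required run rate) and feeds it through a logistic function. It is an illustration under stated assumptions, not the trained model: the coefficients in `toy_win_prob` are placeholders chosen only to show the shape of the calculation, the 20-over default and the cricket-notation reading of `overs_done` (15.3 meaning 15 overs and 3 balls) are assumptions, and the output will not match `predict_win`.

```python
import math


def chase_state(target: int, current_score: int, overs_done: float,
                total_overs: int = 20):
    """Derive (runs_required, balls_remaining, required_run_rate).

    Assumes overs_done uses cricket notation: 15.3 = 15 overs and 3 balls.
    """
    whole_overs = int(overs_done)
    extra_balls = round((overs_done - whole_overs) * 10)
    balls_remaining = total_overs * 6 - (whole_overs * 6 + extra_balls)
    runs_required = target - current_score
    if balls_remaining > 0:
        required_rate = runs_required / (balls_remaining / 6)
    else:
        required_rate = float("inf")
    return runs_required, balls_remaining, required_rate


def logistic(z: float) -> float:
    """Standard logistic function mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def toy_win_prob(required_rate: float, wickets_in_hand: int) -> float:
    """Placeholder coefficients for illustration only, NOT the trained model."""
    z = 2.0 - 0.35 * required_rate + 0.3 * wickets_in_hand
    return logistic(z)


runs_required, balls_remaining, required_rate = chase_state(180, 120, 15.0)
print(runs_required, balls_remaining, required_rate)  # 60 30 12.0
print(f"{toy_win_prob(required_rate, wickets_in_hand=10 - 5):.3f}")
```

A logistic model of this kind always returns a value strictly between 0 and 1, and a higher required run rate or fewer wickets in hand pushes the linear score `z` down and the probability with it.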
Step 3 — Query player stats and head-to-head matchups
Midwicket ships with a bundled in-memory dataset. Player stats and matchups work out of the box — no download needed:
import midwicket.express as px
stats = px.get_player_stats("Virat Kohli")
print(f"Player: {stats.name} | Runs: {stats.runs} | Strike Rate: {stats.strike_rate}")
matchup = px.get_matchup("V Kohli", "JJ Bumrah")
print(f"Head-to-head | Matches: {matchup.matches} | Average: {matchup.average:.1f}")
How the data layer works:
- Bundled data (default): The in-memory ZIP ships with the package. Stats and matchups read from it automatically with no setup.
- Download full history (optional): For 10+ years of ball-by-ball IPL
data (~50 MB), run this once and it persists to disk:
px.download_data()  # downloads to ./data by default
# px.download_data("~/cricket-data")  # or a custom path
- Registry: Player resolution and matchup stats are indexed in an in-memory
IdentityRegistry built from the loaded data. If a player name isn't found, get_player_stats raises EntityNotFoundError with the missing name.
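The registry behavior described above (alias resolution, with an error that carries the missing name) can be sketched with a small stand-in. The classes below reuse the names `IdentityRegistry` and `EntityNotFoundError` only for readability; they are simplified stand-ins written for this example, not the SDK's classes, and the normalization rule and the choice of "V Kohli" as the canonical form are assumptions.

```python
class EntityNotFoundError(LookupError):
    """Stand-in error that carries the name that could not be resolved."""

    def __init__(self, name: str):
        super().__init__(f"player not found: {name!r}")
        self.name = name


class IdentityRegistry:
    """Stand-in registry mapping normalized aliases to a canonical name."""

    def __init__(self):
        self._canonical = {}  # normalized alias -> canonical name

    @staticmethod
    def _normalize(name: str) -> str:
        # Case-insensitive, whitespace-insensitive matching (an assumption).
        return " ".join(name.lower().split())

    def register(self, canonical: str, *aliases: str) -> None:
        for alias in (canonical, *aliases):
            self._canonical[self._normalize(alias)] = canonical

    def resolve(self, name: str) -> str:
        try:
            return self._canonical[self._normalize(name)]
        except KeyError:
            raise EntityNotFoundError(name) from None


registry = IdentityRegistry()
registry.register("V Kohli", "Virat Kohli")

print(registry.resolve("virat  kohli"))  # V Kohli

try:
    registry.resolve("Unknown Player")
except EntityNotFoundError as err:
    print(f"missing: {err.name}")  # missing: Unknown Player
```

Catching the error at the call site, as in the last few lines, is a reasonable pattern for user-facing code that accepts free-text player names.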
Enterprise Deployment
Midwicket includes a FastAPI backend, Prometheus scrape config, and a Grafana dashboard definition. The observability stack is provisioned via Docker Compose.
Status: The FastAPI service and Prometheus integration are production-ready. The Grafana dashboard is provided as a starting point and may need metric name adjustments to match your environment.
# Clone the repository
git clone https://github.com/CodersAcademy006/Midwicket.git
cd Midwicket
# Configure environment variables
cp .env.example .env
# Edit .env: set MIDWICKET_SECRET_KEY, MIDWICKET_API_KEYS, GRAFANA_PASSWORD
# Start the FastAPI server + Prometheus + Grafana
docker-compose up -d
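Before starting the stack, it can help to confirm that the three variables named in the comment above are actually set. The helper below is a hypothetical pre-flight check, not something shipped with Midwicket, and it assumes a simple `KEY=value` file format with `#` comments.

```python
# Variable names come from the setup instructions; the checker itself is hypothetical.
REQUIRED_KEYS = ("MIDWICKET_SECRET_KEY", "MIDWICKET_API_KEYS", "GRAFANA_PASSWORD")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list:
    """Return required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


sample = """
# example contents, not real secrets
MIDWICKET_SECRET_KEY=change-me
MIDWICKET_API_KEYS=
"""
print(missing_keys(sample))  # ['MIDWICKET_API_KEYS', 'GRAFANA_PASSWORD']
```

To check a real file, pass its contents instead of the sample string, for example `missing_keys(open(".env").read())`.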
Examples
The examples/ directory contains 36 runnable scripts covering the full SDK:
| Range | Topic |
|---|---|
| 01–03 | Setup, basic session, data ingest |
| 03b–08 | Player lookup, venue stats, win prediction |
| 09–20 | Fantasy points, raw SQL, season filters, leaderboards |
| 21–27 | Partnership stats, consistency, reports, pipelines |
| 28–36 | Express API, config, full library tour |
Browse examples/ or start with 28_express_quickstart.py.
Contributing
Contributions are highly encouraged! We are actively looking for help with:
- Expanding the built-in machine learning models.
- Optimizing DuckDB materialized views.
- Writing tests for the query planner.
Before submitting code, please review the component architecture in Agents.md.
License
Midwicket is open-source software released under the MIT License.