State-level presidential election forecasting models
Election Forecasting Models
State-level presidential election forecasting using polling time-series data from the 2016 U.S. presidential election.
Installation
Local Installation
# Install with uv
uv pip install -e .
Docker
# Build the Docker image
docker build -t election-forecasting .
# Run forecasts in container
docker run -v $(pwd)/predictions:/app/predictions \
-v $(pwd)/metrics:/app/metrics \
election-forecasting election-forecast --dates 8
# Run with parallel execution (utilize host CPU cores)
docker run -v $(pwd)/predictions:/app/predictions \
-v $(pwd)/metrics:/app/metrics \
election-forecasting election-forecast --dates 16 --parallel 4
The -v flags in the commands above mount predictions/ and metrics/ from the current directory into the container, so results persist on your host machine.
Usage
Quick Start: Run Everything
# Run complete pipeline: forecast, compare, and plot
election-run-all
# With custom number of forecast dates
election-run-all --dates 8
Individual Commands
Run All Models
# Run with default 4 forecast dates
election-forecast
# Run with custom number (n) of forecast dates
election-forecast --dates n
# Run with verbose output
election-forecast -v
# Run with parallel execution (recommended for many dates)
election-forecast --dates 16 --parallel 4
# Set random seed for reproducibility
election-forecast --seed 42
Parallel Execution: Use --parallel N (or -w N) to enable multi-core processing. The workload is parallelized by forecast date, so at most min(N, number of dates) workers can be busy at once. The benefit is therefore largest with many dates (e.g., 8 or more) on a multi-core machine, and there is little to gain when N exceeds the number of dates.
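As a rough illustration of date-level parallelism, the sketch below maps a list of forecast dates over a worker pool. The names `forecast_one_date` and `run_forecasts` are hypothetical and are not the package's API. A thread pool is used here only so the example runs anywhere; CPU-bound model fitting would more likely need a process pool.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date


def forecast_one_date(forecast_date):
    """Hypothetical stand-in for running every model as of one forecast date."""
    return {"forecast_date": forecast_date.isoformat(), "status": "done"}


def run_forecasts(dates, workers=1):
    """Run one task per forecast date; results are returned in input order."""
    if workers <= 1 or not dates:
        return [forecast_one_date(d) for d in dates]
    # No more than len(dates) workers can ever be busy at once.
    with ThreadPoolExecutor(max_workers=min(workers, len(dates))) as pool:
        return list(pool.map(forecast_one_date, dates))


dates = [date(2016, 9, 1), date(2016, 10, 1), date(2016, 11, 1)]
serial = run_forecasts(dates, workers=1)
parallel = run_forecasts(dates, workers=4)
```

Because each date is an independent task, the serial and parallel runs return the same results in the same order.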
Compare Model Performance
election-compare
This generates:
- model_comparison.csv - Detailed metrics table
- model_comparison.png - Performance visualization
- Console output with rankings
Generate State-Level Plots
# Plot key swing states (default)
election-plot
# Plot all states with polling data
election-plot --all
# Plot specific states
election-plot --states FL PA MI WI
Models
1. Hierarchical Bayes (Best Overall)
Advanced Bayesian model combining fundamentals prior with Kalman-filtered polls and systematic bias correction.
File: election_forecasting/models/hierarchical_bayes.py
2. Poll Average
Simple weighted poll-of-polls average with empirical uncertainty estimation.
File: election_forecasting/models/poll_average.py
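A minimal sketch of a weighted poll average follows. The weighting scheme (sample size times an exponential recency decay) and the function name `weighted_poll_average` are assumptions made for illustration; the weights used in poll_average.py may differ.

```python
import math


def weighted_poll_average(polls, half_life_days=14.0):
    """Return (weighted mean, weighted std) of poll margins.

    polls: iterable of (margin, sample_size, age_days) tuples.
    Weighting by sample size and recency is an illustrative assumption.
    """
    polls = list(polls)
    weights = [n * 0.5 ** (age / half_life_days) for _, n, age in polls]
    total = sum(weights)
    if total <= 0:
        raise ValueError("need at least one poll with positive weight")
    mean = sum(w * m for w, (m, _, _) in zip(weights, polls)) / total
    var = sum(w * (m - mean) ** 2 for w, (m, _, _) in zip(weights, polls)) / total
    return mean, math.sqrt(var)


# Two equally sized, equally recent toy polls: margins of +2 and +4 points.
mean, spread = weighted_poll_average([(2.0, 800, 3), (4.0, 800, 3)])
```

With equal weights the result is the plain average (about 3 points) with a spread of about 1 point. A poll one half-life older than another counts half as much.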
3. Improved Kalman
Brownian motion with drift using Kalman filter/RTS smoother and stronger regularization.
File: election_forecasting/models/improved_kalman.py
4. Kalman Diffusion
Basic diffusion model with EM algorithm for parameter estimation.
File: election_forecasting/models/kalman_diffusion.py
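Three of the four models rely on a Kalman filter. The sketch below shows the core predict/update recursion for a scalar random walk observed with noise. It is a simplified illustration under stated assumptions: the noise variances `q` and `r` are fixed by hand, and the drift term, RTS smoother, and EM parameter estimation mentioned above are omitted. The function name `kalman_filter` is hypothetical.

```python
def kalman_filter(observations, q, r, x0=0.0, p0=1.0):
    """Filter a scalar random walk observed with noise.

    State:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: y_t = x_t + v_t,      v_t ~ N(0, r)
    A value of None in `observations` marks a day with no poll.
    """
    x, p = x0, p0
    means, variances = [], []
    for y in observations:
        p = p + q                      # predict: uncertainty grows by q
        if y is not None:
            k = p / (p + r)            # Kalman gain
            x = x + k * (y - x)        # move the mean toward the observation
            p = (1.0 - k) * p          # shrink the variance
        means.append(x)
        variances.append(p)
    return means, variances


means, variances = kalman_filter([1.0, None, 3.0], q=0.5, r=1.0)
```

On the day with no poll the mean is carried forward unchanged while the variance grows by `q`, which is how a diffusion model widens its uncertainty between polls.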
Data Sources
- Polls: FiveThirtyEight 2016 state-level polling data (4,209 polls across 50 states)
- Election Results: MIT Election Lab 1976-2020 presidential election results (we use 2016)
Outputs
All results are saved to:
- predictions/ - Model predictions in CSV format
- metrics/ - Evaluation metrics (Brier Score, Log Loss, MAE)
- plots/ - State-level forecast visualizations (organized by model)
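For reference, the three evaluation metrics named above can be computed as in the following sketch. These are the standard textbook definitions; the function names and the clipping constant `eps` are assumptions and may not match the package's internals.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between win probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def mean_absolute_error(predicted, actual):
    """Mean absolute difference between predicted and actual margins."""
    return sum(abs(a - b) for a, b in zip(predicted, actual)) / len(predicted)


# Toy values for illustration only.
win_probs = [0.9, 0.2, 0.5]
dem_won = [1, 0, 1]
brier = brier_score(win_probs, dem_won)
```

Lower is better for all three. Brier score and log loss evaluate the win probabilities, while MAE evaluates the predicted margins.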
License
MIT
## Diagnostics and tests
This repository includes a small diagnostics suite to check both the
calibration utilities and the end-to-end forecasting pipeline.
### 1. Generate predictions
Most diagnostics expect that model predictions have already been generated:
```bash
# From the repository root
source .venv/bin/activate
election-run-all
```
This runs all configured models, followed by the comparison and plotting
steps, and writes prediction files under `predictions/` with the
corresponding metrics files under `metrics/`.
---
### 2. Global calibration diagnostics
The script `src/scripts/calibration_diagnostics.py` loads the prediction
files and computes **overall calibration statistics** and **reliability
curves** for each model, aggregating across all states and dates.
It writes CSV summaries and plots to disk.
```bash
source .venv/bin/activate
python src/scripts/calibration_diagnostics.py
```
Inspect the outputs (e.g. CSVs and PNGs) in the `diagnostics/` directory
as configured inside the script.
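A reliability curve bins the predicted win probabilities and compares each bin's mean prediction with the observed win rate. The sketch below shows one common way to compute it, assuming equal-width bins; `reliability_curve` is a hypothetical name and the script's actual binning may differ.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean predicted prob, empirical rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)   # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    curve = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            rate = sum(y for _, y in members) / len(members)
            curve.append((mean_p, rate, len(members)))
    return curve


# Toy inputs: two low-probability forecasts, three high-probability ones.
curve = reliability_curve([0.1, 0.1, 0.9, 0.9, 1.0], [0, 1, 1, 1, 1], n_bins=2)
```

A well-calibrated model has points near the diagonal, where the mean prediction equals the empirical rate. In this toy input the low bin predicts about 0.1 but wins half the time, which would indicate under-confidence in that range.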
---
### 3. Per-state calibration and error diagnostics
To understand how models behave in individual states, we provide a
per-state diagnostics script:
```bash
source .venv/bin/activate
python src/scripts/per_state_calibration.py
```
This script:
* Loads all prediction CSVs from `predictions/`.
* Aggregates predictions **by state and model**.
* Computes:
  * Mean predicted Democratic win probability by state.
  * Mean empirical Democratic win rate by state.
  * Average predicted margin vs. actual margin by state.
  * Simple binned calibration summaries within each state (optional).
The outputs are written to:
* `diagnostics/per_state/per_state_metrics.csv`: per-state error and
summary metrics for each model (e.g. average margin error by state).
* `diagnostics/per_state/per_state_calibration.csv`: optional per-state
binned calibration statistics, if enough data are available.
* `diagnostics/per_state/calibration_<model>_<state>.png`: reliability
curves for specific (model, state) combinations.
These diagnostics are useful for identifying states where a model tends
to **under-** or **over-predict** the Democratic margin or win
probability.
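The per-state margin comparison can be sketched as a simple group-by. The column names (`model`, `state`, `pred_margin`, `actual_margin`) and the function name are assumptions for illustration, and the rows are toy numbers rather than real results.

```python
from collections import defaultdict


def per_state_margin_error(rows):
    """Average (predicted - actual) Democratic margin per (model, state)."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = (row["model"], row["state"])
        sums[key][0] += row["pred_margin"] - row["actual_margin"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Toy rows, not real forecasts or results.
rows = [
    {"model": "poll_average", "state": "WI", "pred_margin": 6.0, "actual_margin": -1.0},
    {"model": "poll_average", "state": "WI", "pred_margin": 5.0, "actual_margin": -1.0},
    {"model": "poll_average", "state": "FL", "pred_margin": 1.0, "actual_margin": -1.0},
]
errors = per_state_margin_error(rows)
```

A positive value means the model over-predicted the Democratic margin in that state on average; a negative value means it under-predicted.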
---
### 4. Diagnostics by forecast horizon (days until election)
To study how model performance changes as Election Day approaches, we
include horizon-based diagnostics:
```bash
source .venv/bin/activate
python src/scripts/horizon_diagnostics.py
```
This script:
* Stacks all prediction CSVs from `predictions/` into a single table.
* Computes the **forecast horizon** for each prediction:
  * `days_until_election = (Election Day - forecast_date)`.
* Groups by `(model, days_until_election)` and computes:
  * Brier score for win probabilities.
  * Log loss for win probabilities.
  * Mean absolute error (MAE) of predicted margins.
  * Mean predicted win probability vs. empirical win rate.
  * Mean predicted margin vs. mean actual margin.
Outputs are written to:
* `diagnostics/horizon/horizon_metrics.csv`: one row per
`(model, days_until_election)` with the metrics above.
* `diagnostics/horizon/horizon_brier_score_<model>.png`:
Brier score vs. days until election.
* `diagnostics/horizon/horizon_mae_margin_<model>.png`:
margin MAE vs. days until election.
* `diagnostics/horizon/horizon_log_loss_<model>.png`:
log loss vs. days until election.
These plots and tables summarize whether a model becomes more accurate
and better calibrated as the election gets closer, and how far in
advance its predictions are reliable.
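The horizon computation and grouping can be sketched as follows, using the Brier score as the example metric. Election Day 2016 was November 8. The row keys (`model`, `forecast_date`, `win_prob`, `dem_won`) and the function name are assumptions, and the rows are toy values.

```python
from collections import defaultdict
from datetime import date

ELECTION_DAY = date(2016, 11, 8)  # 2016 U.S. general election


def horizon_brier(rows):
    """Brier score per (model, days_until_election)."""
    groups = defaultdict(list)
    for row in rows:
        days = (ELECTION_DAY - row["forecast_date"]).days
        groups[(row["model"], days)].append((row["win_prob"] - row["dem_won"]) ** 2)
    return {key: sum(v) / len(v) for key, v in groups.items()}


# Toy rows, not real forecasts.
rows = [
    {"model": "kalman_diffusion", "forecast_date": date(2016, 10, 9), "win_prob": 0.5, "dem_won": 0},
    {"model": "kalman_diffusion", "forecast_date": date(2016, 10, 9), "win_prob": 0.5, "dem_won": 1},
    {"model": "kalman_diffusion", "forecast_date": date(2016, 11, 1), "win_prob": 0.0, "dem_won": 0},
]
scores = horizon_brier(rows)
```

Here the forecasts made on October 9 fall at a 30-day horizon and the one made on November 1 at a 7-day horizon, so `scores` has one entry per horizon for the model.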
---
### 5. Tests
We provide several test files:
* `tests/test_calibration.py` tests the low-level calibration helper
functions in `src/utils/calibration.py` against small synthetic
examples with known answers.
* `tests/test_smoke_models.py` is a “smoke test” that discovers the
first forecasting model, runs it on a small number of forecast dates,
and checks that the predictions and metrics are non-empty and contain
only finite numeric values.
* `tests/test_horizon_diagnostics.py` checks that the horizon
diagnostics (`src/diagnostics/horizon.py`) behave as expected on a
small synthetic dataset and that the main summary columns are present.
To run these tests:
```bash
source .venv/bin/activate
# Run only the calibration tests
pytest tests/test_calibration.py
# Run only the smoke test
pytest tests/test_smoke_models.py
# Run only the horizon diagnostics tests
pytest tests/test_horizon_diagnostics.py
# Or run the full test suite
pytest
```
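For orientation, the kind of check a smoke test performs (non-empty output, only finite numbers) might look like the sketch below. The helper `all_finite` and the toy rows are hypothetical and are not taken from `tests/test_smoke_models.py`.

```python
import math


def all_finite(records):
    """True if `records` is non-empty and every numeric value is finite."""
    if not records:
        return False
    for record in records:
        for value in record.values():
            if isinstance(value, (int, float)) and not math.isfinite(value):
                return False
    return True


def test_predictions_are_finite():
    # Toy rows standing in for a model's real output.
    predictions = [{"state": "PA", "win_prob": 0.62, "margin": 2.1}]
    assert all_finite(predictions)
    assert not all_finite([{"state": "PA", "win_prob": float("nan")}])
    assert not all_finite([])
```

Saved in a file under `tests/`, a function named `test_*` like this one is collected automatically by `pytest`.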