Implementation of the Alberta Plan for AI Research - continual learning with meta-learned step-sizes

Alberta Framework

Warning: This framework is under active research development. The API is unstable and subject to breaking changes between releases. It is not intended for production use.

A JAX-based research framework implementing components of The Alberta Plan for AI Research, in pursuit of the foundations of Continual AI.

"The agents are complex only because they interact with a complex world... their initial design is as simple, general, and scalable as possible." — Sutton et al., 2022

Overview

The Alberta Framework provides foundational components for continual reinforcement learning research. Built on JAX for hardware acceleration, the framework emphasizes temporal uniformity: every component updates at every time step, with no special training phases or batch processing.
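
To make this concrete, here is a minimal hand-rolled loop in plain JAX, not the framework's API, illustrating temporal uniformity with a fixed-step-size LMS learner: one weight vector, one update per observed time step, no batches or phases.

import jax.numpy as jnp
import jax.random as jr

# Illustrative only: a single linear learner updated at every time step.
key = jr.key(0)
w = jnp.zeros(10)          # learner weights
w_target = jnp.ones(10)    # unknown target the stream is generated from
alpha = 0.01               # fixed step-size (what IDBD/Autostep would adapt)

for t in range(1000):
    key, k_x, k_n = jr.split(key, 3)
    x = jr.normal(k_x, (10,))                    # observe features
    y = w_target @ x + 0.1 * jr.normal(k_n, ())  # noisy scalar target
    error = y - w @ x                            # prediction error
    w = w + alpha * error * x                    # LMS update, every step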

Project Context

This framework is developed as part of my D.Eng. work focusing on the foundations of online, continuous reinforcement learning (RL). For more background and context, see:

Roadmap

Depending on my research trajectory, I may not implement every component of the Alberta Plan. The current focus of this framework is the Step 1 Baseline Study, which investigates the interaction between adaptive optimizers and online normalization.

Step   Focus                                          Status
1      Meta-learned step-sizes (IDBD, Autostep)       Complete
2      Nonlinear function approximation (MLP, ObGD)   In Progress
3      GVF predictions, Horde architecture            Planned
4      Actor-critic with eligibility traces           Planned
5-6    Off-policy learning, average reward            Planned
7-12   Hierarchical, multi-agent, world models        Future

Installation

pip install alberta-framework

# With optional dependencies
pip install alberta-framework[gymnasium]  # RL environment support
pip install alberta-framework[dev]        # Development (pytest, ruff)

Requirements: Python >= 3.13, JAX >= 0.4, NumPy >= 2.0

Quick Start

import jax.random as jr
from alberta_framework import (
    LinearLearner, MLPLearner, LMS, IDBD, Autostep,
    ObGDBounding, AGCBounding, EMANormalizer, WelfordNormalizer,
    RandomWalkStream, run_learning_loop, run_mlp_learning_loop,
)

stream = RandomWalkStream(feature_dim=10, drift_rate=0.001)

# --- Optimizers ---

# Fixed step-size baseline
learner = LinearLearner(optimizer=LMS(step_size=0.01))

# IDBD: per-weight adaptive step-sizes via gradient correlation (Sutton, 1992)
learner = LinearLearner(optimizer=IDBD())

# Autostep: tuning-free adaptation with gradient normalization (Mahmood et al., 2012)
learner = LinearLearner(optimizer=Autostep())

# --- Adding a Normalizer ---

# EMA normalization for non-stationary feature scales
learner = LinearLearner(optimizer=IDBD(), normalizer=EMANormalizer(decay=0.99))

# Welford normalization for stationary distributions
learner = LinearLearner(optimizer=Autostep(), normalizer=WelfordNormalizer())

# --- Adding a Bounder ---

# ObGD bounding prevents overshooting (Elsayed et al., 2024)
learner = LinearLearner(optimizer=Autostep(), bounder=ObGDBounding(kappa=2.0))

# --- MLP Learner ---

# MLP with Autostep + ObGD bounding + normalization
mlp = MLPLearner(
    hidden_sizes=(128, 128),
    optimizer=Autostep(),
    bounder=ObGDBounding(kappa=2.0),
    normalizer=EMANormalizer(decay=0.99),
)

# MLP with AGC bounding — per-unit clipping scaled by weight norm (Brock et al., 2021)
mlp = MLPLearner(
    hidden_sizes=(128, 128),
    optimizer=Autostep(),
    bounder=AGCBounding(clip_factor=0.01),
)

# --- Training ---

# Linear: JIT-compiled training via jax.lax.scan
state, metrics = run_learning_loop(learner, stream, num_steps=10000, key=jr.key(42))

# MLP: same interface
state, metrics = run_mlp_learning_loop(mlp, stream, num_steps=10000, key=jr.key(42))

Core Components

Composable Architecture

Learners accept three independent, composable concerns:

  • Optimizer — per-weight step-size adaptation (LMS, IDBD, Autostep)
  • Bounder — optional global update bounding (ObGDBounding)
  • Normalizer — optional online feature normalization (EMANormalizer, WelfordNormalizer)

from alberta_framework import (
    LinearLearner, MLPLearner, Autostep, ObGDBounding, AGCBounding, EMANormalizer
)

# Linear learner with Autostep + normalization
learner = LinearLearner(
    optimizer=Autostep(),
    normalizer=EMANormalizer(decay=0.99),
)

# MLP with Autostep + ObGD bounding + normalization
mlp = MLPLearner(
    hidden_sizes=(128, 128),
    optimizer=Autostep(),
    bounder=ObGDBounding(kappa=2.0),
    normalizer=EMANormalizer(decay=0.99),
)

# MLP with AGC bounding (per-unit clipping scaled by weight norm)
mlp_agc = MLPLearner(
    hidden_sizes=(128, 128),
    optimizer=Autostep(),
    bounder=AGCBounding(clip_factor=0.01),
    normalizer=EMANormalizer(decay=0.99),
)

Optimizers

Supervised Learning:

  • LMS: Fixed step-size baseline
  • IDBD: Per-weight adaptive step-sizes via gradient correlation (Sutton, 1992); see the sketch after this list
  • Autostep: Tuning-free adaptation with gradient normalization (Mahmood et al., 2012)
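
As a rough illustration of what IDBD does under the hood, here is a from-scratch JAX sketch of the per-weight update from Sutton (1992). The function name and the meta step-size theta are illustrative, not the framework's internals.

import jax.numpy as jnp

def idbd_step(w, beta, h, x, y, theta=0.01):
    # One IDBD update (Sutton, 1992): beta holds per-weight log step-sizes,
    # h is a decaying trace used to correlate successive gradients.
    delta = y - w @ x                        # prediction error
    beta = beta + theta * delta * x * h      # meta-gradient step on log step-sizes
    alpha = jnp.exp(beta)                    # per-weight step-sizes
    w = w + alpha * delta * x                # LMS update with adapted step-sizes
    h = h * jnp.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x
    return w, beta, h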

TD Learning:

  • TDIDBD: TD learning with per-weight adaptive step-sizes and eligibility traces (Kearney et al., 2019); see the sketch after this list
  • AutoTDIDBD: TD learning with AutoStep-style normalization for improved stability
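
For context, the base update both optimizers adapt is semi-gradient TD(λ) with accumulating eligibility traces. A minimal sketch with a fixed scalar step-size follows; TDIDBD replaces alpha with per-weight step-sizes adapted online.

import jax.numpy as jnp

def td_lambda_step(w, z, x, x_next, reward, alpha=0.01, gamma=0.99, lam=0.9):
    # Semi-gradient TD(lambda) for linear value prediction.
    delta = reward + gamma * (w @ x_next) - w @ x  # TD error
    z = gamma * lam * z + x                        # accumulating eligibility trace
    w = w + alpha * delta * z                      # trace-weighted update
    return w, z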

Bounders

  • ObGDBounding: Dynamic update bounding to prevent overshooting (Elsayed et al., 2024). Decoupled from the update rule so it can be composed with any optimizer.
  • AGCBounding: Adaptive Gradient Clipping — per-unit clipping scaled by weight norm (Brock et al., 2021). Finer-grained than ObGD's global scaling.
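
A minimal sketch of the AGC rule from Brock et al. (2021), assuming row-wise per-unit norms on a dense layer's weight matrix; this is an illustration, not the framework's AGCBounding internals.

import jax.numpy as jnp

def agc_clip(w, g, clip_factor=0.01, eps=1e-3):
    # Scale each unit's gradient so its norm is at most clip_factor times
    # the norm of that unit's weights (eps floors near-zero weight norms).
    w_norm = jnp.maximum(jnp.linalg.norm(w, axis=-1, keepdims=True), eps)
    g_norm = jnp.maximum(jnp.linalg.norm(g, axis=-1, keepdims=True), 1e-6)
    scale = jnp.minimum(1.0, clip_factor * w_norm / g_norm)
    return g * scale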

Normalizers

Online feature normalization for handling varying feature scales; both update rules are sketched after the list:

  • EMANormalizer: Exponential moving average — suitable for non-stationary distributions
  • WelfordNormalizer: Welford's algorithm with Bessel's correction — suitable for stationary distributions
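
A from-scratch sketch of both updates (illustrative function names, not the framework's classes):

import jax.numpy as jnp

def welford_update(count, mean, m2, x):
    # Welford's online algorithm: exact running mean/variance with
    # Bessel's correction, appropriate for stationary distributions.
    count = count + 1
    delta = x - mean
    mean = mean + delta / count
    m2 = m2 + delta * (x - mean)
    var = m2 / jnp.maximum(count - 1, 1)
    return count, mean, m2, var

def ema_update(mean, var, x, decay=0.99):
    # Exponential moving average: older observations decay away, so the
    # statistics can track non-stationary feature scales.
    mean = decay * mean + (1 - decay) * x
    var = decay * var + (1 - decay) * (x - mean) ** 2
    return mean, var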

MLP Learner

Multi-layer perceptron for nonlinear function approximation (Elsayed et al., 2024):

from alberta_framework import MLPLearner, ObGDBounding, RandomWalkStream, run_mlp_learning_loop
import jax.random as jr

stream = RandomWalkStream(feature_dim=10)
learner = MLPLearner(
    hidden_sizes=(128, 128),
    step_size=1.0,
    bounder=ObGDBounding(kappa=2.0),
    sparsity=0.9,
)
state, metrics = run_mlp_learning_loop(learner, stream, num_steps=10000, key=jr.key(42))

Streams

Non-stationary experience generators implementing the ScanStream protocol:

  • RandomWalkStream: Gradual target drift (sketched after this list)
  • AbruptChangeStream: Sudden target switches
  • PeriodicChangeStream: Sinusoidal oscillation
  • DynamicScaleShiftStream: Time-varying feature scales
  • ScaleDriftStream: Continuous feature scale drift
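
As an illustration of the gradual drift pattern behind RandomWalkStream, a hand-rolled sketch, not the stream's actual implementation:

import jax.random as jr

def drift_target(key, w_target, drift_rate=0.001):
    # Random-walk drift: the regression target moves slightly every step,
    # so the learner must track it continually rather than converge once.
    key, k_drift = jr.split(key)
    return key, w_target + drift_rate * jr.normal(k_drift, w_target.shape)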

TD Learning

For temporal-difference learning with value function approximation:

from alberta_framework import TDLinearLearner, TDIDBD, run_td_learning_loop
import jax.random as jr

# td_stream: a TD-compatible stream supplying transitions (see Streams above)
learner = TDLinearLearner(optimizer=TDIDBD(trace_decay=0.9))
state, metrics = run_td_learning_loop(learner, td_stream, num_steps=10000, key=jr.key(42))

Gymnasium Integration

from alberta_framework.streams.gymnasium import collect_trajectory, learn_from_trajectory, PredictionMode
import gymnasium as gym

# Assumes a user-supplied `policy` and a `learner` constructed as in the
# examples above.
env = gym.make("CartPole-v1")
observations, targets = collect_trajectory(env, policy, num_steps=10000, mode=PredictionMode.REWARD)
state, metrics = learn_from_trajectory(learner, observations, targets)

Publication Tools

Multi-seed experiments with statistical analysis and publication-ready outputs:

from alberta_framework.utils import ExperimentConfig, run_multi_seed_experiment, pairwise_comparisons

# `configs` is a sequence of ExperimentConfig instances describing the runs
results = run_multi_seed_experiment(configs, seeds=30, parallel=True)
significance = pairwise_comparisons(results, test="ttest", correction="bonferroni")

Documentation

Full documentation is available at j-klawson.github.io/alberta-framework, or build it locally:

pip install alberta-framework[docs]
mkdocs serve  # http://localhost:8000

Contributing

Contributions are welcome, particularly for upcoming roadmap steps. Please ensure tests pass and follow the existing code style.

pytest tests/ -v

Citation

If you use this framework in your research, please cite:

@software{alberta_framework,
  title = {Alberta Framework: A JAX Implementation of Alberta Plan Components},
  author = {Lawson, Keith},
  year = {2026},
  url = {https://github.com/j-klawson/alberta-framework}
}

Key References

@article{sutton2022alberta,
  title = {The Alberta Plan for AI Research},
  author = {Sutton, Richard S. and Bowling, Michael and Pilarski, Patrick M.},
  year = {2022},
  eprint = {2208.11173},
  archivePrefix = {arXiv}
}

@inproceedings{sutton1992idbd,
  title = {Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta},
  author = {Sutton, Richard S.},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year = {1992}
}

@inproceedings{mahmood2012autostep,
  title = {Tuning-free Step-size Adaptation},
  author = {Mahmood, A. Rupam and Sutton, Richard S. and Degris, Thomas and Pilarski, Patrick M.},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing},
  year = {2012}
}

@inproceedings{kearney2019tidbd,
  title = {Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning},
  author = {Kearney, Alex and Veeriah, Vivek and Travnik, Jaden and Sutton, Richard S. and Pilarski, Patrick M.},
  booktitle = {International Conference on Machine Learning},
  year = {2019}
}

@article{brock2021high,
  title = {High-Performance Large-Scale Image Recognition Without Normalization},
  author = {Brock, Andrew and De, Soham and Smith, Samuel L. and Simonyan, Karen},
  journal = {arXiv preprint arXiv:2102.06171},
  year = {2021}
}

@article{elsayed2024streaming,
  title = {Streaming Deep Reinforcement Learning Finally Works},
  author = {Elsayed, Mohamed and Vasan, Gautham and Mahmood, A. Rupam},
  journal = {arXiv preprint arXiv:2410.14606},
  year = {2024}
}

License

Apache License 2.0
