Statistical Model Checking for Neural Agents Using the Gymnasium Interface
Project description
PyDSMC
Statistical Model Checking for Neural Agents Using the Gymnasium Interface
PyDSMC is an open-source Python library for statistical model checking of neural agents on arbitrary Gymnasium environments. It is designed to be lightweight and easy-to-use while being based on established statistical methods to provide guarantees on the investigated agents' performances on various properties. Implementing the Gymnasium interface, PyDSMC is widely applicable and fully agnostic to the environments underlying implementation.
PyDSMC is based on Deep Statistical Model Checking and aims to facilitate greater adoption of statistical model checking by simplifying its usage.
Please consult the accompanying paper for more details.
Table of Contents
Deep Statistical Model Checking
The train and evaluation curves of neural agents differ greatly throughout training. Just inspecting the training curves does not suffice and leads to suboptimal policy extraction. We advocate to perform statistically-backed model evaluations periodically throughout training to ensure a good extraction point.
Setup
PyDSMC can be installed from PyPI using pip install pydsmc.
We recommend using a virtual environment and officially tested python versions 3.10, 3.11, and 3.12.
To work on this project, set up a virtual environment and install all necessary dependencies---for instance, using virtualenvwrapper and executing:
mkvirtualenv --python=python3.11 pydsmc
pip install -r requirements.txt
Usage
PyDSMC mainly exposes two functionalities: Property creation and the Evaluator.
Properties
PyDSMC can analyze arbitrary properties. These can also be environment-specific. For ease-of-use, we provide ready-to-use implementations of commonly used, domain-independent properties that are parameterized and can, thus, be adjusted to each individual usecase. You can get a list of all available predefined properties by calling pydsmc.get_predefined_properties().
Creating a predefined property is straightforward. For instance, a property analyzing the achieved return could be defined as follows:
from pydsmc import create_predefined_property
return_property = create_predefined_property(
property_id='return', # Which predefined property to use
name='returnGamma0.97', # Property's name, used for storing the evaluation results
epsilon=0.025, # Half-width of the requested confidence interval (CI)
kappa=0.05, # Probability that the true mean lies within the CI
relative_error=True, # Whether epsilon represents the relative or absolute error
bounds=(0, 864), # Bounds of the property, i.e., min and max possible values
sound=True, # Whether a sound statistical method should be used
gamma=0.97 # Property specific attributes
)
Creating a custom property is equally simple. The only additional information required is a checking function check_fn that analyzes trajectory information and returns a float value as a sample result for the property:
from pydsmc import create_custom_property
crash_property = create_custom_property(
name='crash_prob', # see above
epsilon=0.025, # see above
kappa=0.05, # see above
relative_error=False, # see above
binomial=True, # This property follows a binomial distribution
bounds=(0, 1), # see above
sound=False, # see above
# The property's checking function, crash identified by last reward -100
check_fn=lambda self, t: float(t[-1][2] == -100)
)
Evaluator
We can then register the properties in an Evaluator which will do the work of evaluating all the properties until convergence is reached, or the given resource budget is exhausted.
In order to construct an Evaluator, it is first necessary to create the evaluation environments from the outside. For this, we provide a helper function create_eval_envs that creates a suitable vectorized environment according to the given configuration.
from pydsmc import create_eval_envs, Evaluator
NUM_THREADS = 1 # Recommended: 1; Often better to use more parallel environments instead
NUM_PAR_ENVS = 8 # Number of parallel environments _per thread_
envs = create_eval_envs(
num_threads=NUM_THREADS,
num_envs_per_thread=NUM_PAR_ENVS,
env_seed=42, # Seeds are incremented for each environment, making them unique
gym_id="HalfCheetah-v5",
wrappers=[gym.wrappers.NormalizeObservation, Monitor],
vecenv_cls=gym.vector.AsyncVectorEnv, # gym.vector.SyncVectorEnv, sb3.DummyVecEnv, sb3.SubprocVecEnv
# The following kwargs are passed to the gym.make function, which passes unknown args to the env
max_episode_steps=1000,
)
evaluator = Evaluator(env=envs)
Having created the Evaluator, we can register the properties and start the evaluation:
evaluator.register_property(return_property)
evaluator.register_property(crash_property)
### Or register multiple properties at once using:
# evaluator.register_properties([return_property, crash_property])
### Start the evaluation
evaluator.eval(
agent=agent, # or: predict_fn=agent.predict
stop_on_convergence=True, # Stop when all properties have converged
episode_limit=10_000, # Early stop evaluation after 10,000 episodes
time_limit=60, # Early stop evaluation after 1 hour
num_episodes_per_policy_run=100,
num_threads=NUM_THREADS,
deterministic=True,
)
Full example
A few full examples on a select set of environments can be found in example_agents to try out.
Statistical Method Selection
For each property, PyDSMC automatically selects an appropriate statistical method based on the provided parameters, following decision tree below:
Citing PyDSMC
If you use PyDSMC in your work, you can use the following BibTeX entry, citing the official Version of Record published at QEST2025.
@inproceedings{grosPyDSMC2025,
title = {{{PyDSMC}}: {{Statistical Model Checking}} for~{{Neural Agents Using}} the~{{Gymnasium Interface}}},
shorttitle = {{{PyDSMC}}},
booktitle = {Quantitative {{Evaluation}} of {{Systems}} and {{Formal Modeling}} and {{Analysis}} of {{Timed Systems}}},
author = {Gros, Timo P. and Hartmanns, Arnd and Hoese, Ivo and Meyer, Joshua and M{\"u}ller, Nicola J. and Wolf, Verena},
editor = {Prabhakar, Pavithra and Vandin, Andrea},
year = {2025},
pages = {134--156},
publisher = {Springer Nature Switzerland},
address = {Cham},
doi = {10.1007/978-3-032-05792-1_8},
isbn = {978-3-032-05792-1},
}
License
The code introduced by this project is licensed under the MIT license. Please consult the bundled LICENSE file for the full license text.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pydsmc-0.3.2.tar.gz.
File metadata
- Download URL: pydsmc-0.3.2.tar.gz
- Upload date:
- Size: 23.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b43a2c415ced4a11e9b51cad5ad1bc22e8eb32f92af10e749281def50853fd67
|
|
| MD5 |
3b3132d5aad8d65de32aaf2c621c9105
|
|
| BLAKE2b-256 |
853905bbaa4e51a816683f2dc80207913da9a1a649f9c0de8d58d805c6253c90
|
Provenance
The following attestation bundles were made for pydsmc-0.3.2.tar.gz:
Publisher:
publish.yml on neuro-mechanistic-modeling/PyDSMC
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
pydsmc-0.3.2.tar.gz -
Subject digest:
b43a2c415ced4a11e9b51cad5ad1bc22e8eb32f92af10e749281def50853fd67 - Sigstore transparency entry: 578778556
- Sigstore integration time:
-
Permalink:
neuro-mechanistic-modeling/PyDSMC@4fc619263a61842d35728077b74f0558607b163b -
Branch / Tag:
refs/tags/v0.3.2 - Owner: https://github.com/neuro-mechanistic-modeling
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@4fc619263a61842d35728077b74f0558607b163b -
Trigger Event:
release
-
Statement type:
File details
Details for the file pydsmc-0.3.2-py3-none-any.whl.
File metadata
- Download URL: pydsmc-0.3.2-py3-none-any.whl
- Upload date:
- Size: 23.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
59a4c9300544c01061527c3f05ee89383ce953f365268cff3c97490d158c4a5b
|
|
| MD5 |
f597f75843c324f574ee6be06f171d8a
|
|
| BLAKE2b-256 |
f719238f9b605f07a4fc6a36841dc42924ede218cc19001a3192f84248209a73
|
Provenance
The following attestation bundles were made for pydsmc-0.3.2-py3-none-any.whl:
Publisher:
publish.yml on neuro-mechanistic-modeling/PyDSMC
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
pydsmc-0.3.2-py3-none-any.whl -
Subject digest:
59a4c9300544c01061527c3f05ee89383ce953f365268cff3c97490d158c4a5b - Sigstore transparency entry: 578778579
- Sigstore integration time:
-
Permalink:
neuro-mechanistic-modeling/PyDSMC@4fc619263a61842d35728077b74f0558607b163b -
Branch / Tag:
refs/tags/v0.3.2 - Owner: https://github.com/neuro-mechanistic-modeling
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@4fc619263a61842d35728077b74f0558607b163b -
Trigger Event:
release
-
Statement type: