Skip to main content

Statistical Model Checking for Neural Agents Using the Gymnasium Interface

Project description

PyDSMC

Statistical Model Checking for Neural Agents Using the Gymnasium Interface

Ruff PyPI Downloads Python Tests License

PyDSMC is an open-source Python library for statistical model checking of neural agents on arbitrary Gymnasium environments. It is designed to be lightweight and easy-to-use while being based on established statistical methods to provide guarantees on the investigated agents' performances on various properties. Implementing the Gymnasium interface, PyDSMC is widely applicable and fully agnostic to the environments underlying implementation.

PyDSMC is based on Deep Statistical Model Checking and aims to facilitate greater adoption of statistical model checking by simplifying its usage.

Please consult the accompanying paper for more details.

Table of Contents

Deep Statistical Model Checking

The train and evaluation curves of neural agents differ greatly throughout training. Just inspecting the training curves does not suffice and leads to suboptimal policy extraction. We advocate to perform statistically-backed model evaluations periodically throughout training to ensure a good extraction point.

Setup

PyDSMC can be installed from PyPI using pip install pydsmc.

We recommend using a virtual environment and officially tested python versions 3.10, 3.11, and 3.12.

To work on this project, set up a virtual environment and install all necessary dependencies---for instance, using virtualenvwrapper and executing:

mkvirtualenv --python=python3.11 pydsmc
pip install -r requirements.txt

Usage

PyDSMC mainly exposes two functionalities: Property creation and the Evaluator.

Properties

PyDSMC can analyze arbitrary properties. These can also be environment-specific. For ease-of-use, we provide ready-to-use implementations of commonly used, domain-independent properties that are parameterized and can, thus, be adjusted to each individual usecase. You can get a list of all available predefined properties by calling pydsmc.get_predefined_properties().

Creating a predefined property is straightforward. For instance, a property analyzing the achieved return could be defined as follows:

from pydsmc import create_predefined_property

return_property = create_predefined_property(
    property_id='return',   # Which predefined property to use
    name='returnGamma0.97', # Property's name, used for storing the evaluation results
    epsilon=0.025,          # Half-width of the requested confidence interval (CI)
    kappa=0.05,             # Probability that the true mean lies within the CI
    relative_error=True,    # Whether epsilon represents the relative or absolute error
    bounds=(0, 864),        # Bounds of the property, i.e., min and max possible values
    sound=True,             # Whether a sound statistical method should be used
    gamma=0.97              # Property specific attributes
)

Creating a custom property is equally simple. The only additional information required is a checking function check_fn that analyzes trajectory information and returns a float value as a sample result for the property:

from pydsmc import create_custom_property

crash_property = create_custom_property(
    name='crash_prob',      # see above
    epsilon=0.025,          # see above
    kappa=0.05,             # see above
    relative_error=False,   # see above
    binomial=True,          # This property follows a binomial distribution
    bounds=(0, 1),          # see above
    sound=False,            # see above
    # The property's checking function, crash identified by last reward -100
    check_fn=lambda self, t: float(t[-1][2] == -100)
)

Evaluator

We can then register the properties in an Evaluator which will do the work of evaluating all the properties until convergence is reached, or the given resource budget is exhausted.

In order to construct an Evaluator, it is first necessary to create the evaluation environments from the outside. For this, we provide a helper function create_eval_envs that creates a suitable vectorized environment according to the given configuration.

from pydsmc import create_eval_envs, Evaluator

NUM_THREADS = 1   # Recommended: 1; Often better to use more parallel environments instead
NUM_PAR_ENVS = 8  # Number of parallel environments _per thread_

envs = create_eval_envs(
    num_threads=NUM_THREADS,
    num_envs_per_thread=NUM_PAR_ENVS,
    env_seed=42,  # Seeds are incremented for each environment, making them unique
    gym_id="HalfCheetah-v5",
    wrappers=[gym.wrappers.NormalizeObservation, Monitor],
    vecenv_cls=gym.vector.AsyncVectorEnv,  # gym.vector.SyncVectorEnv, sb3.DummyVecEnv, sb3.SubprocVecEnv
    # The following kwargs are passed to the gym.make function, which passes unknown args to the env
    max_episode_steps=1000,
)

evaluator = Evaluator(env=envs)

Having created the Evaluator, we can register the properties and start the evaluation:

evaluator.register_property(return_property)
evaluator.register_property(crash_property)
### Or register multiple properties at once using:
# evaluator.register_properties([return_property, crash_property])

### Start the evaluation
evaluator.eval(
    agent=agent,  # or: predict_fn=agent.predict
    stop_on_convergence=True,  # Stop when all properties have converged
    episode_limit=10_000,  # Early stop evaluation after 10,000 episodes
    time_limit=60,  # Early stop evaluation after 1 hour
    num_episodes_per_policy_run=100,
    num_threads=NUM_THREADS,
    deterministic=True,
)

Full example

A few full examples on a select set of environments can be found in example_agents to try out.

Statistical Method Selection

For each property, PyDSMC automatically selects an appropriate statistical method based on the provided parameters, following decision tree below:

Citing PyDSMC

If you use PyDSMC in your work, you can use the following BibTeX entry, citing the official Version of Record published at QEST2025.

[!NOTE] The official version of record has not been published yet. Once published, there will be a bibtex entry here.

License

The code introduced by this project is licensed under the MIT license. Please consult the bundled LICENSE file for the full license text.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydsmc-0.3.0.tar.gz (22.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pydsmc-0.3.0-py3-none-any.whl (23.2 kB view details)

Uploaded Python 3

File details

Details for the file pydsmc-0.3.0.tar.gz.

File metadata

  • Download URL: pydsmc-0.3.0.tar.gz
  • Upload date:
  • Size: 22.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pydsmc-0.3.0.tar.gz
Algorithm Hash digest
SHA256 44fdd349c3576ad435400b87177e7a36a5c6bcc40b840fce0d0732f6449de105
MD5 5bf204eaeafcda51f040fad3d4a51c15
BLAKE2b-256 8c6d79d6e2203c1e21de84d794569a58898eb5559102bf15482d57013657eb80

See more details on using hashes here.

Provenance

The following attestation bundles were made for pydsmc-0.3.0.tar.gz:

Publisher: publish.yml on neuro-mechanistic-modeling/PyDSMC

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pydsmc-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: pydsmc-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 23.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pydsmc-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 358dc8c3d28421307104be9ffdca31b378f39f964defcd323032cea759b96e00
MD5 eb90ba0fb615df9528f9fb91aa97e4f9
BLAKE2b-256 231a4623a5227cb74ef3eec272863559b2f6ea3d4c3f49dd7882d52019ab620f

See more details on using hashes here.

Provenance

The following attestation bundles were made for pydsmc-0.3.0-py3-none-any.whl:

Publisher: publish.yml on neuro-mechanistic-modeling/PyDSMC

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page