
An application/library for simulating the sampling of Postgres session states.

session-sampling-simulator

An application that generates synthetic database sessions and allows you to simulate the effects of sampling those sessions to understand the error characteristics.

Installing

The application is distributed as a Python package. The recommended way to use it is by installing from PyPI using either pipx or uv.

pipx install session_sampling_simulator

or

uv tool install session_sampling_simulator

Equivalently, you can download the wheel or sdist from the GitHub releases page and pass the path to that file in place of session_sampling_simulator in either command.

Warning for uv users!

The standalone Python versions that uv uses don't play nicely with tkinter and matplotlib. I've implemented a workaround for the GUI, but the GUI and charts together are a no-no right now, which means the gui analyzer command is broken. I'm working on a 'compatibility mode' that will allow this to work.

Manual installation

Of course, you can also just create a virtual environment manually, activate it and install the package.

python -m venv .venv
source .venv/bin/activate
pip install session_sampling_simulator

Using

The package installs a CLI script sesasim which you can invoke to access various functionality. It's recommended to run sesasim setup first. This creates a directory in your home directory which will be used for saving the GUI state. It's also the default location for input files and the setup command populates it with some default query files.

You can start the GUI mode with sesasim gui <mode>. To get a feel for what sesasim is all about, I suggest you start with sesasim gui single-session. This mode generates a single session according to the parameters you enter, then samples it and presents the true run time of each query alongside the run time estimated by sampling.
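
To make the idea concrete, here is a minimal sketch of how sampling can estimate run times. It is not sesasim's own code: the session representation (a list of name, start, end tuples) and both function names are assumptions made for illustration, and only uniform, equally weighted sampling is shown.

```python
from collections import Counter

# Hypothetical session: (query name, start ms inclusive, end ms exclusive).
session = [("A", 0, 30), ("B", 42, 47), ("A", 50, 90), ("B", 95, 100)]


def true_run_times(session):
    """Total milliseconds each query actually spent running."""
    totals = Counter()
    for name, start, end in session:
        totals[name] += end - start
    return dict(totals)


def estimated_run_times(session, sample_times, period_ms):
    """Credit one sample period to whichever query is running at each sample."""
    totals = Counter()
    for t in sample_times:
        for name, start, end in session:
            if start <= t < end:
                totals[name] += period_ms
                break
    return dict(totals)


period_ms = 10
sample_times = range(0, 100, period_ms)  # one sample every 10 ms
print("true:     ", true_run_times(session))
print("estimated:", estimated_run_times(session, sample_times, period_ms))
```

In this toy session the long query A is estimated well, while the short query B falls between samples and is missed entirely, which is the kind of error the simulator is meant to expose.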

The gui analyzer mode does the same thing, except over many sessions and a range of sampling frequencies, and generates a chart of the error as a function of sampling frequency.
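
The shape of that analysis can be sketched as follows. This is an illustrative stand-in, not the analyzer's implementation: the session generator, the error metric (relative error of total busy time) and every function name are assumptions, and the result is printed as a table rather than charted.

```python
import random


def make_busy_intervals(rng, duration_ms, mean_gap_ms, mean_run_ms):
    """Hypothetical generator: alternating idle gaps and query runs."""
    intervals, t = [], 0
    while True:
        t += 1 + int(rng.expovariate(1 / mean_gap_ms))
        run = 1 + int(rng.expovariate(1 / mean_run_ms))
        if t + run > duration_ms:
            return intervals
        intervals.append((t, t + run))
        t += run


def relative_error(intervals, duration_ms, period_ms):
    """Relative error of the busy time estimated by uniform sampling."""
    true_busy = sum(end - start for start, end in intervals)
    if true_busy == 0:
        return 0.0
    busy_ms = set()
    for start, end in intervals:
        busy_ms.update(range(start, end))
    hits = sum(1 for t in range(0, duration_ms, period_ms) if t in busy_ms)
    return abs(hits * period_ms - true_busy) / true_busy


def mean_error_by_period(periods, sessions=50, duration_ms=10_000, seed=0):
    """Average the relative error over many sessions for each sample period."""
    rng = random.Random(seed)
    generated = [
        make_busy_intervals(rng, duration_ms, mean_gap_ms=200, mean_run_ms=50)
        for _ in range(sessions)
    ]
    return {
        p: sum(relative_error(s, duration_ms, p) for s in generated) / sessions
        for p in periods
    }


errors = mean_error_by_period([1, 10, 100, 1000])
for period, err in errors.items():
    print(f"period {period:>5} ms: mean relative error {err:.3f}")
```

Sampling every millisecond recovers the busy time exactly in this sketch, because all interval boundaries fall on whole milliseconds.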

Detailed instructions

The CLI has --help for every command, so you can access a quick description of each parameter. Some more detail is provided below.

Single-session GUI mode

Usage: sesasim gui single-session [OPTIONS]

 A GUI mode for simulating a single session
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --query-file                     TEXT     Path to YAML file defining the     │
│                                           queries to simulate                │
│                                           [default: None]                    │
│ --num-queries                    INTEGER  Number of query inputs to render   │
│                                           in the GUI                         │
│                                           [default: None]                    │
│ --restore        --no-restore             Restore previous inputs, ignores   │
│                                           any query file specified.          │
│                                           [default: no-restore]              │
│ --help                                    Show this message and exit.        │
╰──────────────────────────────────────────────────────────────────────────────╯

If you have run sesasim setup, you can run this command with no options and it will use a default set of queries. If you have not run setup, you must provide a query file, and the --restore option will not work because your GUI inputs will not be saved.

If you provide a query file and specify --num-queries n, only the first n queries of the query file will be used. If n is greater than the number of queries in the file, the last query will be repeated to reach n.
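
The truncate-or-repeat rule can be sketched in a few lines. The function name select_queries is hypothetical and the queries are plain strings here; the real query objects loaded from the YAML file will look different.

```python
def select_queries(queries, n):
    """Return exactly n queries: the first n, or all of them padded with the last."""
    if not queries:
        raise ValueError("the query file must define at least one query")
    selected = list(queries[:n])
    # Pad with copies of the last query when n exceeds the file length.
    selected += [queries[-1]] * (n - len(selected))
    return selected


print(select_queries(["q1", "q2", "q3"], 2))
print(select_queries(["q1", "q2", "q3"], 5))
```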

GUI inputs: Global Settings

  • Duration: the duration of the session to generate in milliseconds
  • Sample Period: the average interval between samples in milliseconds
  • Randomize Session: if unchecked, query start times are not randomized; each query instead starts at exactly the specified periodicity (unless another query is already running)
  • Sampling Strategy: determines how the simulation decides when to take a sample.
    • u == uniform: a sample will be taken every P milliseconds exactly, where P is the sample period.
    • e == exponential: the probability of a given millisecond being sampled is 1/P, resulting in randomly spaced samples with an average period of P.
    • j == jittered: like uniform, but with a random jitter applied to each sample. This means the sample period is uniformly distributed between 1 and 2P-1.
    • phr == post-hoc random: a fraction 1/P of all the milliseconds in the session are selected to be sampled. This is mathematically equivalent to exponential but, unlike the other methods, it cannot be applied to streaming data in real time because it needs the whole session.
  • Use Calculated Sample Weights: if checked, the weight of each sample is calculated as half the elapsed time since the previous sample plus half the time until the next one. If unchecked, every sample is weighted equally.
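
The four strategies and the calculated weights can be sketched as below. This is an illustration of the descriptions above, not the package's code: the function names are hypothetical, the jittered strategy is modelled by drawing each gap uniformly from 1 to 2P-1, and the handling of the first and last sample (which have only one neighbour) is an assumption.

```python
import random


def sample_times(strategy, duration_ms, period_ms, rng):
    """Return sorted sample times in [0, duration_ms) for one strategy code."""
    if strategy == "u":  # uniform: exactly every P ms
        return list(range(0, duration_ms, period_ms))
    if strategy == "e":  # exponential: each ms is sampled with probability 1/P
        return [t for t in range(duration_ms) if rng.random() < 1 / period_ms]
    if strategy == "j":  # jittered: gaps drawn uniformly from 1..2P-1
        times, t = [], 0
        while t < duration_ms:
            times.append(t)
            t += rng.randint(1, 2 * period_ms - 1)
        return times
    if strategy == "phr":  # post-hoc random: choose a fraction 1/P of all ms
        return sorted(rng.sample(range(duration_ms), duration_ms // period_ms))
    raise ValueError(f"unknown strategy: {strategy}")


def calculated_weights(times):
    """Half the gap to the previous sample plus half the gap to the next."""
    weights = []
    for i, t in enumerate(times):
        before = t - times[i - 1] if i > 0 else 0  # assumption: no earlier sample
        after = times[i + 1] - t if i < len(times) - 1 else 0  # assumption: no later sample
        weights.append(before / 2 + after / 2)
    return weights


rng = random.Random(1)
for code in ("u", "e", "j", "phr"):
    times = sample_times(code, duration_ms=1000, period_ms=50, rng=rng)
    print(code, len(times), "samples")
print(calculated_weights([0, 10, 30]))
```

Note that phr needs the full session length before it can pick any sample, whereas the other three only ever look forward from the current time.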

GUI inputs: Queries

For each query, you can specify the following:

  • Mean Duration: the mean duration of that query in milliseconds
  • Duration Spread: a measure of how much the duration varies. How this is used depends on Duration Distribution.
  • Average Periodicity: the average interval between executions of the query in milliseconds. This interval is generated from an exponential distribution function.
  • Wait-state Ratios: a ratio describing how much of the query's execution is spent in each wait state. For example, 2:0:1 means that the query spends two thirds of its time in wait 1, none in wait 2 and one third in wait 3.
  • Duration Distribution: can be set to 'uniform' or 'exponential'. If set to 'uniform', the duration of each execution is drawn from a uniform distribution of width 2 x Duration Spread around Mean Duration. If set to 'exponential', it is drawn from an exponential distribution with mean Mean Duration.
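
As a rough illustration of how these inputs could drive a generator, the sketch below parses a wait-state ratio and draws intervals and durations. The function names are hypothetical, and it assumes that durations are centered on Mean Duration; it is not taken from the package.

```python
import random


def wait_state_fractions(ratio):
    """Turn a ratio string such as '2:0:1' into fractions that sum to 1."""
    parts = [float(p) for p in ratio.split(":")]
    total = sum(parts)
    if total <= 0:
        raise ValueError("at least one wait state needs a positive share")
    return [p / total for p in parts]


def draw_interval(rng, periodicity_ms):
    """Time until the next execution, exponentially distributed."""
    return rng.expovariate(1 / periodicity_ms)


def draw_duration(rng, mean_ms, spread_ms, distribution):
    """Duration of one execution under the chosen distribution."""
    if distribution == "uniform":
        # Width 2 x spread, centered on the mean (assumed center).
        return rng.uniform(mean_ms - spread_ms, mean_ms + spread_ms)
    if distribution == "exponential":
        return rng.expovariate(1 / mean_ms)
    raise ValueError(f"unknown distribution: {distribution}")


rng = random.Random(42)
print(wait_state_fractions("2:0:1"))
print(draw_interval(rng, periodicity_ms=500))
print(draw_duration(rng, mean_ms=100, spread_ms=20, distribution="uniform"))
```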
