Simulates backoff strategies for contending writes over the network.
Project description
Backoff Simulator
Context
Many clients concurrently send requests over the network to a server. The server accepts or rejects requests, according to its concurrency control. Each client backs off and retries until its request is accepted.
You want to keep low:
- the time until all requests are accepted (duration)
- the total number of requests (work)
You can keep the duration low by making clients retry rapidly. But then the server often rejects requests, so the work is high.
You can keep the work low by making clients retry sporadically. But then the server is often idle, so the duration is high.
Tradeoff!
The cost is an overall performance measure, got via an exchange rate: work_to_duration * work + duration.
Which backoff strategy minimizes the cost?
There's a well-known aws blog post and simulation about this, focusing on optimistic concurrency control. This app is based on those. But I re-implemented the simulation and added more controls.
The blog post concludes:
The return on implementation complexity of using jittered backoff is huge, and it should be considered a standard approach for remote clients.
The post has been influential. For example, the widely-used Python package backoff defaults to full jitter, citing the post. But is it really the best strategy for your use case?
Let's explore.
How to use
Write a simulations.toml describing the simulations you want to run, e.g.
[[simulation]]
title = "Locking_Example" # anything you like, but must be unique across [[simulation]] blocks
max_clients = 100 # simulate various numbers of clients, up to this maximum
repeat = 20 # simulate each (num_clients, strategy) combination this many times, to average out noise
network_mu = 10.0 # network latency is modeled as max(0, N(mu, sigma))
network_sigma = 2.0
work_to_duration = 1.0
control = "LockingServer"
write_mu = 2.0 # write duration is modeled as max(0, N(mu, sigma))
write_sigma = 1.0
strategies = [ # one or more
{ type = "Constant", constant = 0.5 },
{ type = "FullJitteredExpo", base = 2.0, cap = 1000.0 },
{ type = "EqualJitteredExpo", base = 2.0, cap = 1000.0 },
]
[[simulation]]
title = "Read_Write_OCC_Example"
max_clients = 100
repeat = 30
network_mu = 5.0
network_sigma = 1.0
write_mu = 0.0 # if write duration is negligible
write_sigma = 0.0
work_to_duration = 1.0
control = "ReadWriteOCCServer"
strategies = [
{ type = "Constant", constant = 0.0 },
{ type = "FullJitteredExpo", base = 5.0, cap = 2000.0 },
]
Then run:
backoff-simulator
You can name your config file something else and pass it via --config-file path/to/file.
For each [[simulation]] block, the app writes <title>_metrics.png, e.g.
showing work, duration and cost against number of clients, averaging across repetitions.
It also writes <title>_scatter.png, e.g.
showing the distribution of client requests over time, for some repetition at max_clients.
And it writes representative event histories to stdout, e.g.
Locking_Example + Constant
time client_id event_type event_detail
------ ----------- --------------------- --------------
0 0 client_requests_write
0 2 client_requests_write
0 5 client_requests_write
0 1 client_requests_write
0 4 client_requests_write
0 3 client_requests_write
8 2 server_accepts
8.17 4 server_rejects
8.64 0 server_rejects
9.55 2 server_commits
12.32 5 server_accepts
12.39 3 server_rejects
13.85 1 server_rejects
14.01 0 client_backs_off
14.04 5 server_commits
14.51 0 client_requests_write
18.43 4 client_backs_off
18.93 4 client_requests_write
20.95 3 client_backs_off
21.45 3 client_requests_write
22.62 1 client_backs_off
23.12 1 client_requests_write
25.22 0 server_accepts
26.72 4 server_rejects
28.45 0 server_commits
33.88 3 server_accepts
35.6 1 server_rejects
36.52 3 server_commits
38.87 4 client_backs_off
39.37 4 client_requests_write
45.68 1 client_backs_off
46.18 1 client_requests_write
49.82 4 server_accepts
51.52 4 server_commits
57.53 1 server_accepts
58.53 1 server_commits
Locking_Example + EqualJitteredExpo
time client_id event_type event_detail
------ ----------- --------------------- --------------
0 0 client_requests_write
0 2 client_requests_write
0 5 client_requests_write
0 1 client_requests_write
0 4 client_requests_write
0 3 client_requests_write
6.82 2 server_accepts
8.49 3 server_rejects
8.6 1 server_rejects
10.57 2 server_commits
11.84 4 server_accepts
12.2 0 server_rejects
12.55 5 server_rejects
13.37 4 server_commits
17.39 1 client_backs_off
19.24 1 client_requests_write
21.91 3 client_backs_off
22.58 0 client_backs_off
22.75 5 client_backs_off
22.96 3 client_requests_write
24.02 0 client_requests_write
24.23 5 client_requests_write
28.14 1 server_accepts
30.15 1 server_commits
32.1 3 server_accepts
34.18 3 server_commits
34.34 0 server_accepts
34.47 5 server_rejects
36.23 0 server_commits
46.4 5 client_backs_off
48.99 5 client_requests_write
59.03 5 server_accepts
61.55 5 server_commits
Locking_Example + FullJitteredExpo
time client_id event_type event_detail
------ ----------- --------------------- --------------
0 0 client_requests_write
0 2 client_requests_write
0 5 client_requests_write
0 1 client_requests_write
0 4 client_requests_write
0 3 client_requests_write
8.07 1 server_accepts
8.23 2 server_rejects
9.54 4 server_rejects
10.04 0 server_rejects
11.79 1 server_commits
13.53 5 server_accepts
13.57 5 server_commits
13.59 3 server_accepts
14.37 3 server_commits
18.17 4 client_backs_off
19.41 4 client_requests_write
20.26 2 client_backs_off
21.31 0 client_backs_off
22.11 2 client_requests_write
22.66 0 client_requests_write
27.18 4 server_accepts
31.76 4 server_commits
34.23 0 server_accepts
34.67 2 server_rejects
36.48 0 server_commits
46.65 2 client_backs_off
48.49 2 client_requests_write
55.36 2 server_accepts
58.1 2 server_commits
Concurrency controls
The app can simulate various concurrency controls.
The following controls are designed to manage contending writes.
LockingServer
- Parameters:
write_duration,write_sigma. - When it receives a request:
- if available, it accepts it and becomes unavailable while writing.
- if unavailable, it rejects it immediately.
WriteOnlyOCCServer
- Parameters:
write_duration,write_sigma. - The server stores a version number.
- When it receives a request, it notes the version number and tentatively writes.
- Then it checks the version again:
- if different, it aborts
- if the same, it commits and increments the version
- The writes are variable-duration. (Else a write would succeed just if no write was in progress when it arrived. So we'd end up with a locking server except it knowably does doomed-to-abort work.)
ReadWriteOCCServer
- Parameters:
write_duration,write_sigma. - The server stores a version number.
- A client first reads the version, then requests a write, passing the version.
- The server tentatively writes.
- Then it checks the version again:
- if different, it aborts
- if the same, it commits and increments the version
The following control is designed to manage overload.
ThrottlingServer
- Parameters:
window,limit. - The server accepts up to a limited number of requests in a sliding window.
- Any further requests are rejected immediately.
Backoff strategies
The app can simulate various backoff strategies.
Constant
- e.g. for
constant = 3, the client backs off for 3, 3, 3, ... - this relies entirely on network and write variance to spread out requests
Expo
min(cap, base x 2^n)- e.g. for
cap = 10, b = 2the client backs off for 2, 4, 8, 10, 10, ...
FullJitteredExpo
U(0, min(cap, base x 2^n))- i.e. a value picked uniformly at random between 0 and the capped exponential backoff
EqualJitteredExpo
min(cap, base x 2^n)) / 2 + U(0, min(cap, base x 2^n) / 2)- i.e. half the capped exponential backoff, plus a value picked uniformly at random between 0 and half the capped exponential backoff
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file backoff_simulator-0.2.0.tar.gz.
File metadata
- Download URL: backoff_simulator-0.2.0.tar.gz
- Upload date:
- Size: 10.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b809c5769bbd15749f378d3e2e20292d9d198870699689d1dfbc16d3d8814f8a
|
|
| MD5 |
3926ec3ad2b01ff7fae8dcba55c70514
|
|
| BLAKE2b-256 |
da9d092f05fcc70e8ebe583a70cad47788b9a899f3fff412e44ff86d777727d5
|
File details
Details for the file backoff_simulator-0.2.0-py3-none-any.whl.
File metadata
- Download URL: backoff_simulator-0.2.0-py3-none-any.whl
- Upload date:
- Size: 12.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7c4c2b1aa3da30f9fed0a955756415c6c9fd71c4c2cb364cb0978edd0d26518a
|
|
| MD5 |
a2cc774e0cdbe144fc981087c83a937b
|
|
| BLAKE2b-256 |
39398f4f1a3400544031904772b37badada5d925fafa4e82344bb9b0117eef24
|