
PLD_accounting

Tight numerical privacy accounting for random allocation and subsampling using Privacy Loss Distributions (PLDs).

This library provides end-to-end DP accounting for federated learning and other privacy-preserving systems using the random allocation subsampling scheme. It supports both Gaussian mechanisms and custom PLD realizations, with adaptive resolution refinement for accuracy/performance tradeoffs.


Quick Start

Install from PyPI:

pip install PLD_accounting

Compute privacy guarantees with automatic resolution tuning:

from PLD_accounting import gaussian_allocation_epsilon_range

epsilon_upper, epsilon_lower = gaussian_allocation_epsilon_range(
    sigma=3.0,          # Gaussian noise scale
    num_steps=100,      # Total training steps
    num_selected=10,    # Clients selected per step
    delta=1e-6,         # Target delta
)
print(f"ε ∈ [{epsilon_lower:.4f}, {epsilon_upper:.4f}]")

What This Library Provides

  • Random allocation accounting: Tight bounds for selecting k out of n clients per round
  • Adaptive resolution: Automatic grid refinement to balance accuracy and runtime
  • Two input modes:
    • Gaussian: Specify sigma, num_steps, num_selected → get ε or δ
    • Realization: Provide explicit PLD → compose under random allocation
  • Direction-aware bounds: Upper/lower bounds for REMOVE, ADD, or BOTH directions
  • Subsampling amplification: Direct PLD-based amplification for PREAMBLE-style workflows (DOMINATES only)
  • Efficient convolution: FFT for linear grids, geometric for multiplicative grids
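As context for the convolution bullet, here is a minimal numpy sketch of FFT-based self-composition of a PMF on a linear grid. This is a generic illustration of the technique, not this library's internal implementation; `fft_self_convolve` is a hypothetical helper name.

```python
import numpy as np

def fft_self_convolve(pmf: np.ndarray, k: int) -> np.ndarray:
    """Compose a PMF with itself k times via FFT (k-fold linear convolution)."""
    out_len = k * (len(pmf) - 1) + 1      # length of the k-fold convolution
    n = 1
    while n < out_len:                    # zero-pad to a power of two >= out_len
        n *= 2                            # so circular convolution == linear
    spec = np.fft.rfft(pmf, n)
    composed = np.fft.irfft(spec ** k, n)[:out_len]
    return np.clip(composed, 0.0, None)   # clip tiny negative FFT round-off

pmf = np.array([0.3, 0.25, 0.2, 0.15, 0.1])
composed = fft_self_convolve(pmf, 3)      # matches repeated np.convolve
```

For a grid of m points composed k times, this costs O(km log(km)) instead of the O(k m^2) of repeated direct convolution, which is why FFT is the natural choice on linear grids.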

When to Use Each Path

Gaussian Path (Most Common)

Use when: your mechanism adds Gaussian noise with a known σ

Example:

from PLD_accounting import (
    gaussian_allocation_epsilon_extended,
    PrivacyParams,
    AllocationSchemeConfig,
)

params = PrivacyParams(sigma=3.0, num_steps=100, num_selected=10, delta=1e-6)
config = AllocationSchemeConfig(loss_discretization=0.02, tail_truncation=1e-8)

epsilon = gaussian_allocation_epsilon_extended(params, config)

Adaptive variant (recommended for exploratory analysis):

epsilon_upper, epsilon_lower = gaussian_allocation_epsilon_range(
    sigma=3.0, num_steps=100, num_selected=10, delta=1e-6
)

Realization Path (Advanced)

Use when: you have an explicit PLD realization or a non-Gaussian mechanism

Example:

import numpy as np
from PLD_accounting import general_allocation_PLD, PLDRealization, AllocationSchemeConfig

# Define your mechanism's privacy loss distribution on a linear grid
remove_realization = PLDRealization(
    x_min=0.0,
    x_gap=0.1,
    PMF_array=np.array([0.3, 0.25, 0.2, 0.15, 0.1]),
)
add_realization = PLDRealization(
    x_min=0.0,
    x_gap=0.1,
    PMF_array=np.array([0.3, 0.25, 0.2, 0.15, 0.1]),
)

pld = general_allocation_PLD(
    num_steps=10,
    num_selected=5,
    num_epochs=1,
    config=AllocationSchemeConfig(),
    remove_realization=remove_realization,
    add_realization=add_realization,
)

epsilon = pld.get_epsilon_for_delta(1e-6)

Requirements for realizations:

  • Linear grid structure (uniform spacing)
  • Valid PLD per Definition 3.1: E[exp(-L)] ≤ 1, no mass at L = -∞
  • Total probability mass = 1
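The requirements above can be checked numerically before handing a realization to the library. The sketch below is a hypothetical validator (`check_pld_realization` is not part of this library's API), assuming a finite linear-grid PMF as in the example above:

```python
import numpy as np

def check_pld_realization(x_min: float, x_gap: float, pmf: np.ndarray,
                          tol: float = 1e-9) -> None:
    """Sanity-check the stated requirements for a finite linear-grid PLD."""
    losses = x_min + x_gap * np.arange(len(pmf))   # uniform (linear) grid
    assert np.all(pmf >= 0), "PMF entries must be non-negative"
    assert abs(pmf.sum() - 1.0) <= tol, "total probability mass must be 1"
    # Definition 3.1: E[exp(-L)] <= 1 over the finite support
    assert np.sum(pmf * np.exp(-losses)) <= 1.0 + tol, "E[exp(-L)] must be <= 1"

check_pld_realization(0.0, 0.1, np.array([0.3, 0.25, 0.2, 0.15, 0.1]))
```

Running such a check up front turns a subtle downstream accounting error into an immediate, descriptive failure.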

Key Concepts

Random Allocation

Selecting k clients from n candidates provides privacy amplification compared to full-batch training. This library accounts for the composition of:

  • num_steps total allocation steps
  • num_selected clients chosen per step
  • num_epochs passes through the data

For both Gaussian and realization paths, composition semantics are:

  • inner per-round composition count: floor(num_steps / num_selected)
  • outer composition count: num_selected * num_epochs
  • therefore num_steps must satisfy num_steps >= num_selected
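The composition counts above are simple integer arithmetic; the following sketch (a hypothetical helper, not a library function) makes them concrete:

```python
def allocation_composition_counts(num_steps: int, num_selected: int,
                                  num_epochs: int = 1) -> tuple[int, int]:
    """Inner/outer composition counts for the random allocation scheme."""
    if num_steps < num_selected:
        raise ValueError("num_steps must be >= num_selected")
    inner = num_steps // num_selected      # per-round composition count
    outer = num_selected * num_epochs      # outer composition count
    return inner, outer

# e.g. 100 steps, 10 selected per step, 1 epoch -> inner = 10, outer = 10
inner, outer = allocation_composition_counts(100, 10)
```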

Directions

  • REMOVE: Privacy loss when removing a data record
  • ADD: Privacy loss when adding a data record
  • BOTH: Analyze both directions (most conservative)

Bound Types

  • DOMINATES: Upper bound (pessimistic, safe for privacy proofs)
  • IS_DOMINATED: Lower bound (optimistic, for tightness evaluation)

PLD Dual

For a PLD realization L, the dual D(L) is the PLD in the reversed privacy direction (L_{Q,P}). It reflects the support (l -> -l), reweights finite mass by exp(-l), and places the residual mass at +∞. In remove-direction internals, we often use -D(L), obtained explicitly by negating D(L).
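The dual construction described above can be sketched in numpy for a finite linear-grid realization. This is an illustrative sketch of the definition, not the library's internal representation; `pld_dual` is a hypothetical name:

```python
import numpy as np

def pld_dual(x_min: float, x_gap: float, pmf: np.ndarray):
    """Dual of a finite linear-grid PLD: reflect the support (l -> -l),
    reweight finite mass by exp(-l), put the residual mass at +infinity."""
    losses = x_min + x_gap * np.arange(len(pmf))
    dual_losses = -losses[::-1]              # reflected support, ascending order
    dual_pmf = (pmf * np.exp(-losses))[::-1] # mass at -l is pmf(l) * exp(-l)
    mass_at_inf = 1.0 - dual_pmf.sum()       # residual mass placed at +inf
    return dual_losses, dual_pmf, mass_at_inf
```

Note that the finite dual mass sums to E[exp(-L)] ≤ 1, which is exactly why the residual mass at +∞ is well defined.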

Adaptive Resolution

The *_range() functions iteratively refine the discretization until a target accuracy is met:

  • Start from Poisson-subsampled estimate
  • Refine grid spacing and truncation
  • Track best upper/lower bounds
  • Stop when gap meets target (or after 10 iterations)
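The overall shape of that loop can be sketched generically. This toy version (with a made-up `estimate` callback) shows the control flow only, not the library's actual refinement heuristics:

```python
def refine_until(target_gap: float, estimate, max_iters: int = 10):
    """Toy adaptive loop: tighten the resolution until the upper/lower
    gap meets the target or the iteration budget is spent."""
    resolution = 1e-2
    best_upper, best_lower = float("inf"), 0.0
    for _ in range(max_iters):
        upper, lower = estimate(resolution)     # bounds at this resolution
        best_upper = min(best_upper, upper)     # track best bounds seen so far
        best_lower = max(best_lower, lower)
        if best_upper - best_lower <= target_gap:
            break
        resolution /= 2                         # refine the grid spacing
    return best_upper, best_lower

# Toy estimator whose bounds tighten as the resolution shrinks:
up, lo = refine_until(0.01, lambda r: (1.0 + r, 1.0 - r))
```

Because only the best bounds are kept, the returned interval can never widen as the loop refines.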

Common Workflows

1. Simple ε Query

from PLD_accounting import gaussian_allocation_epsilon_range

eps_upper, eps_lower = gaussian_allocation_epsilon_range(
    sigma=5.0, num_steps=500, num_selected=8, delta=1e-5
)

2. PLD for Multiple Queries

from PLD_accounting import gaussian_allocation_PLD, PrivacyParams, AllocationSchemeConfig

pld = gaussian_allocation_PLD(
    params=PrivacyParams(sigma=3.0, num_steps=100, num_selected=10),
    config=AllocationSchemeConfig(),
)

# Query multiple (ε, δ) pairs efficiently
for delta in [1e-4, 1e-5, 1e-6]:
    eps = pld.get_epsilon_for_delta(delta)
    print(f"δ={delta:.0e} → ε={eps:.4f}")

3. Subsampling + Composition

from PLD_accounting import gaussian_allocation_PLD, subsample_PLD

# One training round
base_pld = gaussian_allocation_PLD(...)

# Apply subsampling amplification
subsampled = subsample_PLD(base_pld, sampling_probability=0.1)

# Compose across rounds
final_pld = subsampled.self_compose(num_rounds=20)
epsilon = final_pld.get_epsilon_for_delta(1e-6)

subsample_PLD() / subsample_PMF() are DOMINATES-only utilities; they do not accept a bound-type argument.

See usage_example.py for complete runnable examples including PREAMBLE-style workflows.


Configuration Parameters

PrivacyParams

  • sigma: Gaussian noise scale (higher = more privacy)
  • num_steps: Total allocation steps
  • num_selected: Clients per step (k in the paper)
  • num_epochs: Training epochs (default 1)
  • delta or epsilon: Query target

AllocationSchemeConfig

  • loss_discretization: Grid spacing (smaller = tighter, slower). Default: 1e-3
  • tail_truncation: Truncate probability mass below this. Default: 1e-10
  • max_grid_FFT: FFT grid size limit. Default: 2,000,000
  • max_grid_mult: Geometric grid size limit. Default: 0 (unlimited)
  • convolution_method: FFT, GEOM, BEST_OF_TWO, or COMBINED

Tradeoff: Smaller loss_discretization and tail_truncation → tighter bounds but higher memory/runtime.


Requirements

  • Python ≥ 3.10
  • numpy ≥ 1.23
  • scipy ≥ 1.10
  • numba ≥ 0.58
  • dp-accounting ≥ 0.4.3

All dependencies install automatically with the package.


Development Setup

From source:

git clone https://github.com/moshenfeld/PLD_accounting.git
cd PLD_accounting
pip install -e ".[dev]"

Run tests:

pytest -q

With coverage:

./tests/run_tests.sh --coverage

Build distribution:

python -m build

API Reference

Gaussian Path

  • gaussian_allocation_epsilon_range() - adaptive upper/lower ε bounds
  • gaussian_allocation_delta_range() - adaptive upper/lower δ bounds
  • gaussian_allocation_epsilon_extended() - single ε value with fixed config
  • gaussian_allocation_delta_extended() - single δ value with fixed config
  • gaussian_allocation_PLD() - build PLD object for repeated queries

Realization Path

  • general_allocation_PLD() - build PLD from explicit realizations
  • general_allocation_epsilon() - compute ε from realizations
  • general_allocation_delta() - compute δ from realizations
  • PLDRealization - linear-grid PLD realization type

Composition

  • subsample_PLD() - apply subsampling amplification to a PLD

Project Structure

PLD_accounting/
├── discrete_dist.py              # Distribution types (Linear, Geometric, PLDRealization)
├── random_allocation_api.py      # Public API surface
├── random_allocation_accounting.py  # Shared composition logic
├── random_allocation_gaussian.py    # Gaussian-specific implementation
├── adaptive_random_allocation.py    # Adaptive resolution refinement
├── geometric_convolution.py      # Multiplicative-grid convolution
├── FFT_convolution.py           # Linear-grid convolution
├── subsample_PLD.py             # Subsampling amplification
└── ...

tests/
├── unit/                        # Type checks, validations, edge cases
└── integration/                 # End-to-end workflows, comparisons

usage_example.py                 # Runnable examples

See IMPLEMENTATION_OVERVIEW.md for architectural details.


Citation

If you use this library in your research, please cite:

[Paper citation pending]

License

[License information]


Support

For questions or issues, please open an issue on the GitHub repository.
