
PARIS: Parallel Adaptive Reweighting Importance Sampling for high-dimensional multi-modal Bayesian inference


PARIS Monte Carlo Sampler

An efficient adaptive importance sampler for high-dimensional multi-modal Bayesian inference.

PARIS (Parallel Adaptive Reweighting Importance Sampling) combines global exploration with local adaptation to tackle complex posteriors. The workflow is simple:

  1. Global Initialization: Start with a space-filling design (e.g. Latin Hypercube Sampling) to seed promising regions.
  2. Adaptive Proposals: Each seed runs its own importance sampling process, where the proposal is a Gaussian mixture centered on past weighted samples with covariance estimated from the local sample set.
  3. Dynamic Reweighting: All samples are reweighted against the evolving proposal mixture, ensuring unbiased estimates and self-correcting any early overweights.
  4. Mode Clustering: Parallel processes that converge to the same region are merged to avoid redundancy, while distinct modes are preserved.
  5. Posterior & Evidence: The collected weighted samples directly reconstruct the posterior and yield accurate Bayesian evidence estimates.

This adaptive–parallel design allows PARIS to efficiently discover, refine, and integrate over complex multi-modal landscapes with minimal tuning and far fewer likelihood calls than conventional approaches.
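
The sketch below mirrors these five steps on a toy 2D bimodal target. It is illustrative only, not the parismc implementation: the proposal update is simplified, and covariance adaptation and mode merging (step 4) are omitted.

import numpy as np
from scipy.stats import multivariate_normal, qmc
from scipy.special import logsumexp

# Schematic sketch of the PARIS workflow (illustrative only, not the library code)
rng = np.random.default_rng(0)
ndim, n_seed, n_iter = 2, 4, 200

def log_target(x):
    # Toy bimodal target on [0, 1]^2
    return np.logaddexp(
        multivariate_normal.logpdf(x, mean=[0.25, 0.25], cov=0.01 * np.eye(ndim)),
        multivariate_normal.logpdf(x, mean=[0.75, 0.75], cov=0.01 * np.eye(ndim)),
    )

# 1. Global initialization: Latin Hypercube seeds
means = list(qmc.LatinHypercube(d=ndim, seed=0).random(n_seed))
covs = [0.05 * np.eye(ndim)] * n_seed
xs, logps = [], []

for _ in range(n_iter):
    # 2. Adaptive proposals: one local Gaussian per seed
    for k in range(n_seed):
        x = rng.multivariate_normal(means[k], covs[k])
        xs.append(x)
        logps.append(log_target(x))

    # 3. Dynamic reweighting of all samples against the evolving mixture
    X = np.asarray(xs)
    log_q = np.logaddexp.reduce(
        [multivariate_normal.logpdf(X, mean=m, cov=c) for m, c in zip(means, covs)],
        axis=0,
    ) - np.log(n_seed)
    log_w = np.asarray(logps) - log_q
    w = np.exp(log_w - log_w.max())

    # Each proposal drifts toward the weighted samples nearest its center
    for k in range(n_seed):
        near = np.argsort(np.linalg.norm(X - means[k], axis=1))[:50]
        means[k] = np.average(X[near], axis=0, weights=w[near] + 1e-300)

# 5. Posterior & evidence: self-normalized weights; logZ from the raw weights
logZ = logsumexp(log_w) - np.log(len(log_w))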

Documentation

📖 Visit the official documentation site for detailed usage guides, API reference, and examples.

How it Works

PARIS combines the exploratory power of global sampling with the precision of local adaptation.

  1. Global Exploration: The sampler begins with Latin Hypercube Sampling (LHS) to uniformly cover the prior and identify promising regions.
  2. Local Adaptation: Multiple parallel "seeds" explore these regions independently. Each seed builds a local Gaussian Mixture proposal that adapts to its own sample history, automatically focusing on high-density modes.
  3. Parallel Merging: As parallel processes evolve, they monitor each other. If two seeds converge toward the same mode, they automatically merge to eliminate redundancy and save computational resources.
  4. Efficient Reweighting: A sliding window mechanism ensures that importance weights remain accurate even in high dimensions, keeping the computational cost per iteration constant regardless of the total sample size.

For the rigorous mathematical derivation and implementation details, please refer to the User Guide and our paper.
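
As a sketch of the sliding-window idea in step 4 (the names here are illustrative, not the library's internals): only the most recent alpha samples enter the reweighting, so the per-iteration cost is bounded by the window size rather than by the full sample history.

import numpy as np

def reweight_window(samples, log_target, log_proposal, alpha=1000):
    """Reweight only the trailing window of samples.

    Cost per call is O(alpha) regardless of how long the run has been going.
    """
    window = samples[-alpha:]                  # shape (<= alpha, ndim)
    log_w = log_target(window) - log_proposal(window)
    log_w -= log_w.max()                       # stabilize exponentiation
    w = np.exp(log_w)
    return window, w / w.sum()                 # self-normalized weights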

Performance

10D Gaussian Mixture Model Benchmark

PARIS demonstrates exceptional efficiency in high-dimensional, multi-modal scenarios. In a challenging 10D GMM with 10 equally weighted modes:

Method     Samples    Likelihood Calls    Log Evidence
PARIS      145,420    150,050             2.30
Dynesty    145,423    8,587,847           2.30
PTMCMC     145,400    822,352             1.91

Key Results:

  • 57× fewer likelihood evaluations than Dynesty (Dynamic Nested Sampling)
  • 5.5× fewer likelihood evaluations than PTMCMC
  • Accurate evidence estimation (2.30 vs theoretical 2.30)
  • Consistent mode recovery across all dimensions
  • Robust performance with N_LHS=10⁴, N_seed=100
Figure (10D GMM performance comparison): 1D marginalized posterior comparison. PARIS (green) closely matches the analytical target distribution (grey), while competitors show mode-recovery bias. The target's uniform-like appearance in 1D projections results from LHS-based placement of the GMM component centers, which ensures maximal separation. PARIS achieves this accuracy with dramatically fewer likelihood evaluations.

Features

  • Adaptive Proposals per Seed – Each process maintains its own proposal, evolving a local Gaussian mixture that adapts to past samples.
  • Auto-balanced Exploration – High-weight discoveries automatically attract more samples, while overweights self-correct over time.
  • Accurate Evidence Estimation – Bayesian evidence is computed directly from importance weights, no extra machinery needed.
  • Parallel Mode Discovery – Multiple seeds explore independently, merging only when they converge to the same mode.
  • Intuitive Hyperparameters – Settings like number of seeds, initial covariance, and merge thresholds map directly to prior knowledge.
  • Efficiency at Scale – Handles high-dimensional, multi-modal targets with substantially fewer likelihood calls.
  • Boundary-safe – Automatically respects [0,1]^d priors.
  • Multiprocessing Ready – Runs smoothly across CPU cores for large inference tasks.
  • Reproducibility – Fully deterministic execution when a seed is provided, ensuring consistent results across runs.
  • Flexible Initialization – Support for user-provided starting points (e.g. from Sobol sequences or specific physical locations) via the External LHS interface.

Installation

From PyPI

pip install parismc

From Source

git clone https://github.com/mx-Liu123/parismc.git
cd parismc
pip install -e .

Development Installation

git clone https://github.com/mx-Liu123/parismc.git
cd parismc
pip install -e .[dev]

Quick Start

import numpy as np
from parismc import Sampler, SamplerConfig

# Define your log-likelihood function (batched)
def log_likelihood(x):
    """Batched log-density: x has shape (n, ndim) and returns (n,).

    For a single point, pass x as x[None, :] (shape (1, ndim)).
    """
    return -0.5 * np.sum(x**2, axis=1)

# Create sampler configuration
config = SamplerConfig(
    alpha=1000,
    boundary_limiting=True,
    use_pool=False  # Set to True for multiprocessing
)

# Initialize sampler
ndim = 2
n_seed = 5
init_cov_list = [np.eye(ndim) * 0.1] * n_seed

sampler = Sampler(
    ndim=ndim,
    n_seed=n_seed,
    log_density_func=log_likelihood,
    init_cov_list=init_cov_list,
    config=config
)

# Prepare initial samples
sampler.prepare_lhs_samples(lhs_num=1000, batch_size=100)

# Run sampling; stop_dlogZ=0.1 enables early stopping once the log-evidence
# estimate stabilizes (the default stop_dlogZ=None disables this check)
sampler.run_sampling(num_iterations=500, savepath='./results', stop_dlogZ=0.1)

# Get results
samples, weights = sampler.get_samples_with_weights(flatten=True)
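
With the weighted samples in hand, standard importance-sampling estimators apply directly; for example, posterior summaries in plain NumPy (nothing PARIS-specific is assumed here):

# Weighted posterior mean and covariance from the importance weights
post_mean = np.average(samples, axis=0, weights=weights)
post_cov = np.cov(samples.T, aweights=weights)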

Log-Density Function Interface

  • Input x: numpy array of shape (n, ndim); return a 1D array of length n with log densities.
  • Single point: pass as x[np.newaxis, :] so shape is (1, ndim).
  • Multiprocessing (use_pool=True): PARIS parallelizes evaluations across rows; write your function in a vectorized way over rows.
  • Convenience pattern to support both batched and accidental single-point calls:
    def log_density(x):
        x = np.atleast_2d(x)
        return -0.5 * np.sum(x**2, axis=1)
    

Advanced Usage

Archiving Dead Processes (Full Exploration History)

By default, when parallel seeds merge, the redundant (dead) processes are discarded to save memory. However, advanced users may want to analyze the entire exploration trajectory.

You can enable this by setting keep_dead_processes=True in SamplerConfig. Before garbage collection, the sampler trims and archives the valid sample history of dead processes, keeping memory overhead minimal.

To retrieve these archived samples along with the surviving ones, simply pass include_dead=True when extracting results:

# Enable archiving
config = SamplerConfig(keep_dead_processes=True)
sampler = Sampler(..., config=config)
sampler.run_sampling(...)

# Retrieve ALL samples (both surviving and dead processes)
all_samples, all_weights = sampler.get_samples_with_weights(
    flatten=True, 
    include_dead=True
)

Custom Prior Transform

PARIS works internally in the [0, 1]^d unit hypercube. If your parameters have physical bounds (e.g., mass, distance), use a prior_transform function to map the unit cube to your physical space.

Note: Your log_density function will receive the transformed (physical) parameters.

from scipy.stats import norm

def prior_transform(u):
    """Map unit cube [0, 1] to physical space."""
    x = np.copy(u)
    # Param 0: Uniform [-5, 5]
    x[:, 0] = u[:, 0] * 10 - 5
    # Param 1: Normal(0, 1)
    x[:, 1] = norm.ppf(u[:, 1])
    return x

sampler = Sampler(
    ...,
    prior_transform=prior_transform
)

Progress Bar Output Explained

During run_sampling, the progress bar (powered by tqdm) provides real-time updates on the sampler's state. Here is a breakdown of the printed metrics:

  • samples: The total number of valid samples currently held in the sliding window across all active processes.
  • evals: The total cumulative number of likelihood function evaluations performed since the start of the sampling run (including all trial points and historical processes).
  • n_proc: The current number of active parallel processes (seeds). This number will decrease as processes merge.
  • logZ: The current estimate of the log-evidence (ln Z). Displayed as null until it is first computed.
  • dlogZ: The absolute change in the log-evidence estimate relative to its value 1000 iterations earlier, used for early stopping. Displayed as null during the first 1000 iterations.
  • max_ld: The maximum log-density (log-likelihood) value found so far by the best-performing active process.
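
The dlogZ stopping rule can be written as a standalone check. This is a sketch of the documented behaviour; the names should_stop and logZ_history are illustrative, not part of the API:

def should_stop(logZ_history, stop_dlogZ, lag=1000):
    """Stop once |logZ(i) - logZ(i - lag)| <= stop_dlogZ (sketch of the documented rule)."""
    if stop_dlogZ is None or len(logZ_history) <= lag:
        return False
    return abs(logZ_history[-1] - logZ_history[-1 - lag]) <= stop_dlogZ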

Multiprocessing Caveats

  • Windows/macOS and notebooks use the "spawn" start method. Always guard code with if __name__ == "__main__": when use_pool=True.
  • Define log_density and prior_transform at module top level. Avoid lambdas, inner functions, or closures; they are not pickleable for multiprocessing.Pool.
  • Notebooks: prefer use_pool=False. For parallel runs, move code into a .py script and run from the terminal.
  • Typical safe pattern:
    # my_script.py
    import numpy as np
    from parismc import Sampler, SamplerConfig
    
    def log_density(x):
        x = np.atleast_2d(x)
        return -0.5 * np.sum(x**2, axis=1)
    
    def main():
        ndim = 2; n_seed = 5
        cfg = SamplerConfig(use_pool=True, n_pool=4)
        sampler = Sampler(ndim, n_seed, log_density, [np.eye(ndim)*0.1]*n_seed, config=cfg)
        sampler.prepare_lhs_samples(lhs_num=1000, batch_size=100)
        sampler.run_sampling(num_iterations=500, savepath='./results', stop_dlogZ=0.1)
    
    if __name__ == "__main__":
        import multiprocessing as mp
        mp.freeze_support()               # Windows executables; harmless elsewhere
        mp.set_start_method("spawn", force=True)  # Explicit, portable start method
        main()
    
  • Troubleshooting potential hangs:
    • Script never starts or stalls at pool creation → add the main-guard and set_start_method("spawn") as above.
    • Workers crash immediately → ensure log_density is top-level and returns np.ndarray of shape (n,).
    • Notebook hangs → disable multiprocessing (use_pool=False) or run as a script.
    • Reduce n_pool to 1 to isolate pickling issues; then increase.

Runtime flag controls

  • sampler_flags.json is created/reset in the working directory when run_sampling starts.
  • Toggle a flag to true while the sampler is running; it will execute the action once and reset the flag to false.
  • Available flags:
    • output_latest_samples: write latest_samples.npy and latest_weights.npy (transformed space, flattened).
    • plot_latest_samples: write latest_corner.png (requires corner and matplotlib).
    • print_latest_infos: write latest_infos.txt with per-process diagnostics (weighted mean, covariance, max log-density/weights).
  • Quick toggle example from Python:
    import json
    with open("sampler_flags.json", "r", encoding="utf-8") as f:
        flags = json.load(f)
    flags["output_latest_samples"] = True
    with open("sampler_flags.json", "w", encoding="utf-8") as f:
        json.dump(flags, f)
    


Configuration Options

config = SamplerConfig(
    seed=42,                    # Seed for reproducible results (default: None)
    merge_confidence=0.9,       # Coverage probability mapped to the Mahalanobis merge radius R_m (0 → R_m = 0, 1 → R_m → ∞)
    alpha=1000,                 # Number of most recent samples used for weighting
    trail_size=1000,            # Maximum trial samples per iteration
    boundary_limiting=True,     # Enable boundary constraint handling
    use_beta=True,              # Use beta correction for boundaries
    integral_num=100000,        # Monte Carlo samples for beta estimation
    gamma=100,                  # Covariance update frequency
    use_pool=True,              # Enable multiprocessing
    n_pool=4,                   # Number of worker processes
    merge_type='single',        # Merging strategy: 'distance', 'single' (default), or 'multiple'
    debug=False                 # Enable debug logging
)

Key Hyperparameters

  • n_lhs: Number of initial search points. Use higher values for high-dimensional spaces or very narrow modes.
  • n_seed: Number of initial parallel processes. Typically set to a few times the expected number of modes.
  • alpha: Size of the sliding window for weighting. Controls local resolution and how much past information is retained.

For a full list of configuration options and tuning tips, please refer to the User Guide (https://mx-liu123.github.io/parismc/user/guide.html).
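
As a concrete starting point, here is a hypothetical setup echoing the scale of the 10D GMM benchmark above (N_LHS = 10⁴, N_seed = 100). The placeholder target and all values should be adapted to your own problem:

import numpy as np
from parismc import Sampler, SamplerConfig

def log_density(x):
    x = np.atleast_2d(x)              # placeholder target; replace with your own
    return -0.5 * np.sum(x**2, axis=1)

ndim, n_seed = 10, 100                # benchmark-scale number of seeds
config = SamplerConfig(alpha=1000, use_pool=False)  # use_pool=True needs a __main__ guard
sampler = Sampler(
    ndim=ndim,
    n_seed=n_seed,
    log_density_func=log_density,
    init_cov_list=[np.eye(ndim) * 0.1] * n_seed,
    config=config,
)
sampler.prepare_lhs_samples(lhs_num=10_000, batch_size=1000)
sampler.run_sampling(num_iterations=500, savepath='./results', stop_dlogZ=0.1)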

API Reference

Main Classes

  • Sampler: Main sampling class
  • SamplerConfig: Configuration dataclass

Key Methods

  • prepare_lhs_samples(): Initialize with Latin Hypercube Sampling
  • run_sampling(num_iterations, savepath, print_iter=1, stop_dlogZ=None): Execute the sampling process; if stop_dlogZ is set, sampling stops when |logZ(i) - logZ(i-1000)| <= stop_dlogZ (checked every 1000 iterations)
  • get_samples_with_weights(): Retrieve samples and importance weights
  • save_state() / load_state(): State persistence

Loading a Saved Sampler

from parismc import Sampler

# Load sampler state (e.g., from the multimodal example output)
sampler = Sampler.load_state('./multimodal_results/sampler_state.pkl')

# Access weighted samples
samples, weights = sampler.get_samples_with_weights(flatten=True)

See examples/load_sampler_example.py for a complete multi-modal loading and analysis walkthrough.
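
For a quick visual check, the weighted samples plug straight into the corner package (the same optional dependency used by the plot_latest_samples runtime flag):

import corner

# Weighted corner plot of the loaded posterior samples
fig = corner.corner(samples, weights=weights)
fig.savefig('posterior_corner.png', dpi=150)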

Utility Functions

  • find_sigma_level(): Compute confidence level thresholds
  • oracle_approximating_shrinkage(): Covariance regularization
  • Various weighting and clustering utilities

Requirements

  • Python >= 3.8
  • NumPy >= 1.20.0
  • SciPy >= 1.7.0
  • scikit-learn >= 1.0.0
  • smt >= 2.0.0
  • tqdm >= 4.62.0

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues.

Citation

If you use this software in your research, please cite:

@software{parismc,
  title={Parallel Adaptive Reweighting Importance Sampling (PARIS)},
  author={Liu, Miaoxin and Chua, Alvin J. K.},
  year={2025},
  url={https://github.com/mx-Liu123/parismc}
}
