Bayesian Pipeline for Proper Motion measurements using HST and Gaia
bp3m
bp3m is a Python pipeline for measuring improved proper motions of stars by combining multi-epoch HST imaging with Gaia DR3 astrometry. It takes a sky position or target name, automatically downloads and processes all relevant archival HST data from MAST, and simultaneously solves for the per-image HST transformations and per-star proper motions and parallaxes using a closed-form Bayesian algorithm. The result is a catalogue of stellar astrometry where every star with HST detections has significantly tighter proper motion uncertainties than Gaia alone, with the improvement scaling with the number of HST epochs and the HST-Gaia time baseline.
bp3m implements and extends the Bayesian proper motion method of McKinnon et al. (2024, ApJ 972 150), replacing the original MCMC posterior with a closed-form Gaussian solution that is analytically exact and fast enough to simultaneously fit thousands of stars across >100 HST images. The pipeline follows the science workflow of GaiaHub (del Pino et al. 2022, ApJ 933 76) and uses pypass, a Python implementation of the hst1pass photometry algorithm (Anderson 2022, WFC ISR 2022-05).
This is the actively developed version of bp3m and should be used in place of the original code. The original MCMC-based implementation is archived at https://github.com/KevinMcK95/BayesianPMs. The closed-form Gaussian posterior in this version is faster than the original, does not suffer from MCMC convergence issues, and scales to datasets that were impractical with the original code.
Installation
conda create -n bp3m_env python=3.11 -y
conda activate bp3m_env
pip install bp3m
bp3m bundles pypass (PSF-fitting photometry) and gaia_cross_match (Gaia cross-matching) as internal packages — no separate installs are needed.
Setup
After installation, run the setup command to download the required HST PSF and geometric distortion correction (GDC) library files from STScI:
bp3m-setup
By default the library files are stored in ~/.bp3m/lib. To store them elsewhere (e.g. on a large-storage server), set the BP3M_HOME environment variable before running setup:
export BP3M_HOME=/path/to/storage/.bp3m
bp3m-setup --lib-dir /path/to/storage/bp3m_lib
Quick start
bp3m --name "Leo I" --search_radius 0.1 --output_dir ./outputs
Pipeline steps
- Download Gaia — query Gaia DR3 via TAP and cache the result
- Download HST — search MAST and download FLC/FLT images
- PSF fitting — run iterative PSF photometry on each image (pypass)
- Cross-match — match each HST catalog to Gaia with an affine transformation (gaia_cross_match)
- Bayesian alignment — simultaneously solve for image transformations and stellar proper motions/parallaxes using the closed-form BP3M algorithm
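To give a feel for why a Gaussian model admits a closed-form solution, here is a toy one-dimensional sketch. It is not the bp3m implementation: it treats a single star with a Gaia-like Gaussian prior on its position offset and proper motion, updates it with two HST position measurements, and ignores the image transformations and parallax that bp3m solves for at the same time. The function name `gaussian_update` and all numbers are illustrative assumptions.

```python
import numpy as np

def gaussian_update(prior_mean, prior_cov, design, obs, obs_sigma):
    """Closed-form posterior for the linear model obs = design @ x + Gaussian noise."""
    prior_prec = np.linalg.inv(prior_cov)
    noise_prec = np.diag(1.0 / np.asarray(obs_sigma) ** 2)
    post_cov = np.linalg.inv(prior_prec + design.T @ noise_prec @ design)
    post_mean = post_cov @ (prior_prec @ prior_mean + design.T @ noise_prec @ obs)
    return post_mean, post_cov

# State vector: [position offset at the Gaia epoch (mas), proper motion (mas/yr)].
prior_mean = np.array([0.0, 0.0])
prior_cov = np.diag([0.3**2, 0.5**2])  # made-up Gaia-like uncertainties

# Two made-up HST epochs, 12 and 8 years before the Gaia reference epoch.
dt = np.array([-12.0, -8.0])
design = np.column_stack([np.ones_like(dt), dt])  # position(t) = offset + pm * dt
true_pm = 0.2                                     # mas/yr, used to fake the data
obs = true_pm * dt                                # noiseless positions for clarity
obs_sigma = np.array([0.5, 0.5])                  # mas

post_mean, post_cov = gaussian_update(prior_mean, prior_cov, design, obs, obs_sigma)
prior_pm_sigma = np.sqrt(prior_cov[1, 1])
post_pm_sigma = np.sqrt(post_cov[1, 1])
print(f"PM sigma: prior {prior_pm_sigma:.3f} -> posterior {post_pm_sigma:.3f} mas/yr")
```

Because the prior and the likelihood are both Gaussian and the model is linear in the unknowns, the posterior mean and covariance follow from two matrix expressions with no sampling step. In this toy setup, lengthening the time baseline `dt` shrinks the posterior proper motion uncertainty.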
Key features
- Closed-form Gaussian posterior (not MCMC) — exact and scales to thousands of stars across >100 images
- Full Python pipeline from HST download through proper motion measurement
- Iterative multi-pass PSF photometry with JAX acceleration (via pypass)
- Robust Gaia cross-matching with affine transformation (via gaia_cross_match)
- Magnitude-dependent chi2 uncertainty calibration
- Diagnostic plots at every pipeline stage
Primary output: stellar_astrometry.csv
The main science output is {output_dir}/{field}/BP3M_results/stellar_astrometry.csv. It contains one row per star and is designed as a near-drop-in replacement for the Gaia astrometric solution — the BP3M columns follow Gaia's naming convention and carry the same physical meaning, but with substantially reduced proper motion uncertainties for stars with multiple HST epochs.
Key columns:
| Column | Description |
|---|---|
| pmra_bp3m | Proper motion in RA×cos(Dec) [mas/yr], marginalised posterior mean |
| pmdec_bp3m | Proper motion in Dec [mas/yr], marginalised posterior mean |
| parallax_bp3m | Parallax [mas], marginalised posterior mean |
| sigma_pmra_bp3m | Uncertainty on pmra_bp3m [mas/yr] |
| sigma_pmdec_bp3m | Uncertainty on pmdec_bp3m [mas/yr] |
| sigma_parallax_bp3m | Uncertainty on parallax_bp3m [mas] |
| corr_pmra_pmdec | Correlation between pmra and pmdec |
| corr_pmra_plx | Correlation between pmra and parallax |
| corr_pmdec_plx | Correlation between pmdec and parallax |
| delta_racosdec_bp3m | BP3M position offset from Gaia in RA×cos(Dec) [mas] |
| delta_dec_bp3m | BP3M position offset from Gaia in Dec [mas] |
| n_hst_used | Total HST detections used (alignment + astrometry) |
| chi2_hst_red | Reduced HST chi2; should be ~1 for well-fit stars |
The _cond variants (e.g. pmra_bp3m_cond) are the MAP conditional posteriors with the image transformations held fixed. These are tighter but do not account for transformation uncertainty; use the marginalised columns for science.
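The sigma_* and corr_* columns in the table can be combined into a per-star 3×3 covariance matrix for (pmra, pmdec, parallax). The sketch below shows the standard construction from uncertainties and correlation coefficients. The function name `pm_plx_covariance`, the parameter ordering, and the input values are assumptions made for illustration and are not part of bp3m.

```python
import numpy as np

def pm_plx_covariance(sig_pmra, sig_pmdec, sig_plx, c_ra_dec, c_ra_plx, c_dec_plx):
    """Build the 3x3 covariance of (pmra, pmdec, parallax) from sigmas and correlations."""
    sig = np.array([sig_pmra, sig_pmdec, sig_plx])
    corr = np.array([
        [1.0,      c_ra_dec, c_ra_plx],
        [c_ra_dec, 1.0,      c_dec_plx],
        [c_ra_plx, c_dec_plx, 1.0],
    ])
    # cov[i, j] = corr[i, j] * sigma_i * sigma_j
    return corr * np.outer(sig, sig)

# Made-up values standing in for one row of stellar_astrometry.csv.
cov = pm_plx_covariance(0.05, 0.04, 0.10, 0.2, -0.1, 0.05)
print(cov)
```

The diagonal holds the squared uncertainties (in (mas/yr)² for the proper motions and mas² for the parallax), and the off-diagonal terms carry the correlations.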
To use the BP3M results as a drop-in replacement for Gaia proper motions, substitute pmra_bp3m → pmra, sigma_pmra_bp3m → pmra_error, and the corr_* columns → the corresponding Gaia correlation columns. The full 5×5 posterior covariance is also saved as v_cov_marginalised.npy for downstream use.
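The substitution described above can be written as a simple name mapping. Only the pmra_bp3m → pmra and sigma_pmra_bp3m → pmra_error pairs are stated explicitly in this README; the remaining entries are filled in by analogy using the Gaia DR3 column names, so treat them as assumptions to check against your file. The helper `to_gaia_names` is hypothetical.

```python
# Mapping from BP3M column names to the corresponding Gaia DR3 column names.
# Entries other than the pmra pair are assumed by analogy.
BP3M_TO_GAIA = {
    "pmra_bp3m": "pmra",
    "pmdec_bp3m": "pmdec",
    "parallax_bp3m": "parallax",
    "sigma_pmra_bp3m": "pmra_error",
    "sigma_pmdec_bp3m": "pmdec_error",
    "sigma_parallax_bp3m": "parallax_error",
    "corr_pmra_pmdec": "pmra_pmdec_corr",
    "corr_pmra_plx": "parallax_pmra_corr",
    "corr_pmdec_plx": "parallax_pmdec_corr",
}

def to_gaia_names(columns):
    """Return column names with BP3M names swapped for their Gaia equivalents."""
    return [BP3M_TO_GAIA.get(name, name) for name in columns]

header = ["pmra_bp3m", "sigma_pmra_bp3m", "corr_pmra_plx", "n_hst_used"]
renamed = to_gaia_names(header)
print(renamed)
```

With pandas, the same dictionary can be passed to `DataFrame.rename(columns=BP3M_TO_GAIA)`. If your table already carries columns under the Gaia names, drop or rename those first to avoid duplicates.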
v2 pipeline: extending to HST-only sources
After running the standard bp3m pipeline, you can optionally run bp3m-v2 to extend the analysis to HST-detected sources that have no Gaia counterpart. This is most useful for fields with deep, multi-epoch HST imaging where many faint sources are detected by HST but fall below the Gaia detection limit.
bp3m-v2 runs a two-step post-processing pipeline:
- Master cross-match: uses the BP3M transformation solution to project every PSF-fit HST source to RA/Dec with full uncertainty propagation, then cross-matches sources across images of the same filter to build a master HST catalogue (hst_xmatch/master_combined_v2.csv)
- v2 BP3M alignment: re-runs the Bayesian alignment including the HST-only sources, using the Gaia-constrained transformation as initialisation and phasing in HST-only sources after the transformation has converged
# Step 1: run the standard pipeline
bp3m --name "Leo I" --search_radius 0.1 --output_dir ./outputs
# Step 2: run v2 post-processing
bp3m-v2 --name "Leo I" --output_dir ./outputs
v2 outputs are written to {output_dir}/{field}/:
- BP3M_v2_results/stellar_astrometry.csv: astrometry for all sources (Gaia-matched + HST-only), same column format as the standard output
- hst_xmatch/master_combined_v2.csv: the master HST cross-match catalogue used as input to v2 BP3M
- hst_xmatch/master_combined.csv: cross-filter merged HST source catalogue
Important caveats
PM correlations between stars
The marginalised proper motion columns (pmra_bp3m, pmdec_bp3m etc.) account for uncertainty in the HST-Gaia image alignment, but this comes at a cost: because all stars share the same alignment solution, their proper motions are correlated with each other. The magnitude of this correlation depends on how many stars constrain each image transformation and how many images a star appears in.
There are two sets of PM columns in stellar_astrometry.csv:
- pmra_bp3m_cond / sigma_pmra_bp3m_cond: conditional (MAP alignment fixed). Stars are uncorrelated at fixed alignment. Use these for per-star analyses where each star is treated independently (e.g. membership probabilities).
- pmra_bp3m / sigma_pmra_bp3m: marginalised over the alignment posterior. These are conservative single-star uncertainties but stars are correlated. Use these for comparisons with Gaia or literature.
For population statistics (mean PM, velocity dispersion), neither set is strictly correct on its own. The most rigorous approach is to draw joint samples from the alignment posterior and propagate to your science quantity — see notebooks/06_alignment_posterior.ipynb for a worked example.
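The following toy sketch illustrates why star-to-star correlations matter for a population mean. It does not use bp3m outputs or reproduce the notebook: it builds a synthetic joint covariance in which every pair of stars shares the same correlation `rho` (an assumed value), draws joint samples, and compares the scatter of the sample mean with the naive sigma/sqrt(N) estimate.

```python
import numpy as np

rng = np.random.default_rng(42)
n_stars = 50
sigma = 0.05  # per-star PM uncertainty in mas/yr (made up)
rho = 0.3     # assumed star-to-star correlation from the shared alignment

# Joint covariance: equal variances with a common off-diagonal correlation.
cov = sigma**2 * ((1 - rho) * np.eye(n_stars) + rho * np.ones((n_stars, n_stars)))

# Draw joint samples of the per-star PM errors and propagate them to the mean PM.
samples = rng.multivariate_normal(np.zeros(n_stars), cov, size=20000)
mean_pm_scatter = samples.mean(axis=1).std()

naive = sigma / np.sqrt(n_stars)  # ignores correlations between stars
analytic = sigma * np.sqrt((1 + (n_stars - 1) * rho) / n_stars)
print(f"naive {naive:.4f}, sampled {mean_pm_scatter:.4f}, analytic {analytic:.4f} mas/yr")
```

In this setup the uncertainty on the mean does not keep shrinking as 1/sqrt(N); it approaches a floor of sigma × sqrt(rho) set by the shared component. For real data, the correlation structure should come from the alignment posterior and not from an assumed constant.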
Cross-telescope systematics
Combining astrometry between two telescopes (HST and Gaia) with different passbands, pixel scales, and epochs can introduce complicated systematic errors that affect the final proper motion catalogue. Common sources of systematics include:
- Colour-dependent PSF effects — differential chromatic refraction or filter-dependent PSF structure can introduce position offsets that vary with stellar colour
- Geometric distortion residuals — imperfect GDC corrections leave small systematic position errors that vary across the detector
- Epoch-dependent effects — charge transfer inefficiency (CTI), focus drift, or guide star jitter can introduce time-dependent systematics
We strongly recommend that users:
- Examine the diagnostic plots generated in BP3M_results/plots/, particularly pm_vector_diagram_detector_pos.png, which shows whether the BP3M proper motions show unexpected trends as a function of position on the HST detector. Any coherent pattern is a warning sign of unmodelled systematics.
- Check chi2_hst_distributions.png to verify that per-image chi2 distributions are consistent with the expected chi2(2) distribution. Images with large median chi2 or alpha > 2 may have problematic data.
- Be cautious when interpreting small systematic PM offsets between populations (e.g. cluster vs. field), as these can be of similar magnitude to unmodelled systematics in shallow or single-epoch datasets.
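As a numerical complement to the chi2 plot check above, a chi2 distribution with 2 degrees of freedom has a median of 2 ln 2 ≈ 1.39, so per-image medians far above that value stand out. The sketch below assumes you have collected per-detection chi2 values for each image yourself. The function `flag_images`, the `max_ratio` threshold, and the input values are hypothetical, and the threshold is unrelated to the alpha parameter mentioned above.

```python
import math
import statistics

# Median of a chi2 distribution with 2 degrees of freedom.
CHI2_2DOF_MEDIAN = 2.0 * math.log(2.0)

def flag_images(chi2_by_image, max_ratio=2.0):
    """Return names of images whose median chi2 exceeds max_ratio times the expected median."""
    flagged = []
    for name, values in chi2_by_image.items():
        if statistics.median(values) > max_ratio * CHI2_2DOF_MEDIAN:
            flagged.append(name)
    return flagged

# Made-up per-detection chi2 values for two images.
chi2_by_image = {
    "image_a": [0.3, 1.1, 1.5, 2.4, 0.9],
    "image_b": [4.0, 6.5, 3.2, 9.1, 5.5],
}
print(flag_images(chi2_by_image))
```

A median-based check is less sensitive to a handful of outlying detections than a mean, but it is only a coarse screen; the full distribution in the diagnostic plot remains the better guide.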
Where possible, mitigate systematics by using only images with long HST-Gaia time baselines (which reduce the impact of positional errors on the PM solution), restricting to a single instrument/detector, or applying per-filter or per-epoch quality cuts using the chi2_hst_red and n_hst_used columns.
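A quality cut on chi2_hst_red and n_hst_used might look like the sketch below. The thresholds (`max_chi2_red`, `min_n_hst`) are placeholder assumptions and not recommended values, and the in-memory table is a synthetic stand-in for a few rows of stellar_astrometry.csv.

```python
import csv
import io

# Synthetic stand-in for a few rows of stellar_astrometry.csv.
sample = io.StringIO(
    "pmra_bp3m,chi2_hst_red,n_hst_used\n"
    "0.12,0.9,6\n"
    "0.40,4.7,5\n"
    "-0.05,1.2,1\n"
)

def passes_cuts(row, max_chi2_red=2.0, min_n_hst=2):
    """Keep stars with an acceptable reduced chi2 and enough HST detections."""
    chi2_ok = float(row["chi2_hst_red"]) <= max_chi2_red
    n_ok = int(float(row["n_hst_used"])) >= min_n_hst
    return chi2_ok and n_ok

kept = [row for row in csv.DictReader(sample) if passes_cuts(row)]
print(f"{len(kept)} star(s) kept")
```

To apply this to real output, replace `sample` with an open file handle to your stellar_astrometry.csv and choose thresholds that suit your field and science case.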
Status and feedback
bp3m has been tested on a range of stellar fields across multiple HST instruments and epochs, but as with any research software there may be edge cases and bugs that haven't been caught yet. If you run into unexpected behaviour or incorrect results, please open a GitHub issue — all feedback is welcome.
Development notes
Code optimization, the Python translation of supporting routines, and pipeline development were assisted by Claude Code (Anthropic).
Citation
If you use bp3m in your research, please cite the original BP3M paper at a minimum:
McKinnon et al. 2024, ApJ 972 150. https://ui.adsabs.harvard.edu/abs/2024ApJ...972..150M/abstract
A new paper describing the updates and extensions in this version of bp3m is in preparation.
If you use the full pipeline (PSF fitting and/or Gaia cross-matching), please also cite the works that these components are based on:
Anderson, J. 2022, "One-Pass HST Photometry with hst1pass", Space Telescope WFC Instrument Science Report 2022-05. https://ui.adsabs.harvard.edu/abs/2022wfc..rept....5A/abstract
del Pino, A., et al. 2022, "GaiaHub: A Method for Combining HST and Gaia to Obtain Improved Proper Motions for HST Observations", ApJ 933 76. https://doi.org/10.3847/1538-4357/ac71ae