Highly Adaptive Principal Components
HAPC: Highly Adaptive Principal Components
A fast and flexible machine learning library for nonparametric high-dimensional regression and classification with guarantees.
Documentation
- Python API (rendered from docstrings): https://hapc.readthedocs.io, configured via .readthedocs.yaml and docs/ (Sphinx + autodoc). Build locally with pip install -e ".[docs]" && sphinx-build -b html docs docs/_build/html.
- R API (rendered from roxygen): a pkgdown site built by .github/workflows/pkgdown.yaml (config in _pkgdown.yml). Build locally with Rscript -e 'pkgdown::build_site()'.
Installation
Prerequisites
- Python 3.8+
- C++17 compiler (g++, clang, or MSVC), source builds only
- CMake 3.15+, source builds only
- Eigen3, source builds only
Quick Install
pip install hapc
Prebuilt wheels are published for Linux (manylinux2014, x86_64), macOS (Intel + Apple Silicon) and Windows, for CPython 3.8–3.12. No compiler, CMake or Eigen is needed when a wheel is available.
Linux / HPC clusters
The Linux wheels use the manylinux2014 baseline (glibc 2.17), so
pip install hapc works out of the box on HPC login/compute nodes —
no conda toolchain, devtoolset, or sysroot setup required:
pip install hapc
If you must build from the source distribution (niche architecture, very
old Python, or an air-gapped node), provide a C++17 compiler and either
let CMake fetch Eigen automatically (needs network) or install Eigen and
let find_package(Eigen3) find it:
# with conda compilers (recommended on HPC)
conda install -c conda-forge cxx-compiler cmake eigen
pip install hapc --no-binary hapc
Install from GitHub (latest development version)
pip install git+https://github.com/meixide/hapc.git
Or with editable install for development:
git clone https://github.com/meixide/hapc.git
cd hapc
pip install -e .
Install build dependencies
If installation fails, you may need to install build dependencies:
macOS:
brew install cmake eigen
Ubuntu/Debian:
sudo apt-get install cmake libeigen3-dev build-essential
Windows:
pip install cmake
# Install Visual Studio Build Tools or use conda
conda install -c conda-forge eigen
Quick Start
import numpy as np
from hapc.single import single_pcghal
from hapc.cv import pcghal_cv
# Generate sample data
X = np.random.randn(100, 5)
Y = X[:, 0] + 0.5 * X[:, 1] + np.random.randn(100) * 0.1
# Single fit with fixed lambda
result = single_pcghal(X, Y, maxdeg=2, npc=5, single_lambda=0.01)
print(f"Risk: {result.optimizer_output.risk:.6f}")
# Cross-validation to select lambda
lambdas = np.logspace(-4, 0, 10)
cv_result = pcghal_cv(X, Y, maxdeg=2, npc=5, lambdas=lambdas, nfolds=5)
print(f"Best lambda: {cv_result.best_lambda:.6f}")
# Make predictions
X_test = np.random.randn(20, 5)
result = single_pcghal(X, Y, maxdeg=2, npc=5, single_lambda=0.01, predict=X_test)
print(f"Predictions: {result.predictions}")
Usage
Regression
from hapc.single import single_pcghal
result = single_pcghal(
X, Y,
maxdeg=2, # Maximum degree of interactions
npc=10, # Number of principal components
single_lambda=0.01,
predict=X_test # Optional: test data for predictions
)
Classification
from hapc.single import single_pcghal
result = single_pcghal(
X, Y_binary,
maxdeg=2,
npc=10,
single_lambda=0.01,
predict=X_test
)
Cross-Validation
from hapc.cv import pcghal_cv
cv_result = pcghal_cv(
X, Y,
maxdeg=2,
npc=10,
lambdas=np.logspace(-4, 0, 20),
nfolds=5
)
print(cv_result.best_lambda)
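To make the selection step concrete, here is a minimal sketch of K-fold selection over a lambda grid. It uses a closed-form ridge regression as a stand-in learner; the helper kfold_select_lambda is hypothetical and is not part of hapc, whose internals may differ.

```python
import numpy as np

def kfold_select_lambda(X, Y, lambdas, nfolds=5, seed=0):
    """Return the penalty with the lowest mean held-out MSE (ridge stand-in, not hapc)."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), nfolds)
    mses = np.zeros(len(lambdas))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        Xtr, Ytr = X[train], Y[train]
        Xte, Yte = X[test_idx], Y[test_idx]
        for j, lam in enumerate(lambdas):
            # Closed-form ridge fit on the training folds only
            beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(p), Xtr.T @ Ytr)
            # Accumulate the held-out error, averaged over folds
            mses[j] += np.mean((Yte - Xte @ beta) ** 2) / nfolds
    best = int(np.argmin(mses))
    return lambdas[best], mses

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
Y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(100)
lambdas = np.logspace(-4, 0, 20)
best_lambda, mses = kfold_select_lambda(X, Y, lambdas)
print(best_lambda, mses.min())
```

The same pattern (fit on the training folds, score on the held-out fold, average, take the minimiser) is what a best_lambda field summarises.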
Average Treatment Effect (ATE)
Estimate the ATE E[Y(1)] − E[Y(0)] with HAPC nuisance models and a
doubly-robust (AIPW) efficient influence function. ate_hapc returns a point
estimate and a (1 − alpha) Wald confidence interval.
from hapc import ate_hapc
# W: covariates (n, p); A: binary treatment in {0,1} or {-1,+1}; Y: outcome
res = ate_hapc(W, Y, A, alpha=0.05, method="undersmooth")
print(res.estimate, res.lower, res.upper)
Two bias-control strategies are available through method:
- method="undersmooth" (default): single-sample estimator. The outcome model is undersmoothed (λ pushed below the CV-optimal value) until the empirical influence function is within σ / (√n · log n). This requires the full PC basis (npcs = n, the default) and a λ grid that reaches small λ (defaults log_lambda_out_min = -10); otherwise the gate never reaches the low-bias regime and ate_hapc emits a warning. Pass report_undersmoothing=True to print the |mean(EIF)|-vs-λ path.
- method="crossfit": DML-style K-fold cross-fitting (cf_folds, default 5, stratified by treatment). Both nuisances are fit on the training folds and the influence function is evaluated out-of-fold, giving honest point estimates and coverage without undersmoothing. Recommended under good overlap.
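As an illustration of the cross-fitted AIPW idea, the sketch below fits both nuisances on training folds and evaluates the influence function out-of-fold. It is not ate_hapc: the nuisance models are plain linear and logistic regressions, the folds are not stratified by treatment, A is assumed to be coded in {0,1}, and every function name is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def _design(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, a, n_iter=25):
    """Logistic regression by Newton's method (tiny ridge term for stability)."""
    Xd = _design(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (a - p)
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd + 1e-8 * np.eye(Xd.shape[1])
        beta += np.linalg.solve(hess, grad)
    return beta

def fit_ols(X, y):
    return np.linalg.lstsq(_design(X), y, rcond=None)[0]

def aipw_crossfit(W, Y, A, alpha=0.05, folds=5, seed=0):
    """Cross-fitted AIPW estimate of E[Y(1)] - E[Y(0)] with a Wald interval."""
    n = len(Y)
    fold_id = np.random.default_rng(seed).permutation(n) % folds
    phi = np.empty(n)
    for k in range(folds):
        tr, te = fold_id != k, fold_id == k
        # Nuisances are fit on the training folds only
        g = 1.0 / (1.0 + np.exp(-_design(W[te]) @ fit_logistic(W[tr], A[tr])))
        g = np.clip(g, 0.01, 0.99)  # guard against extreme propensities
        m1 = _design(W[te]) @ fit_ols(W[tr & (A == 1)], Y[tr & (A == 1)])
        m0 = _design(W[te]) @ fit_ols(W[tr & (A == 0)], Y[tr & (A == 0)])
        a, y = A[te], Y[te]
        # Out-of-fold doubly-robust score
        phi[te] = m1 - m0 + a * (y - m1) / g - (1 - a) * (y - m0) / (1 - g)
    est = phi.mean()
    se = phi.std(ddof=1) / np.sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return est, est - z * se, est + z * se

# Simulated data with a true ATE of 2
rng = np.random.default_rng(0)
n = 2000
W = rng.standard_normal((n, 3))
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * W[:, 0])))
Y = 2.0 * A + W[:, 0] + W[:, 1] + rng.standard_normal(n)
estimate, lower, upper = aipw_crossfit(W, Y, A)
print(estimate, lower, upper)
```

The interval is the usual Wald form, estimate ± z · sd(score) / √n, which matches the (1 − alpha) interval described above in shape if not in the choice of nuisance learners.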
Discrete-time survival (family = "logit-hazard")
Fit a discrete-time logistic hazard model with HAPC. You supply only the
observed right-censored data — baseline covariates X, the observed time
T = min(T_event, C), and the event indicator Delta = 1(T_event <= C) — and
the wrapper performs the person-period expansion (one row per
subject-per-interval-at-risk, hazard label = 1 at the event interval), prepends
the visit time as the first HAL covariate, and cross-validates the binomial fit.
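The person-period expansion described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions (integer times, an ascending time_grid), and person_period is a hypothetical helper, not the wrapper's actual code.

```python
import numpy as np

def person_period(X, T, Delta, time_grid):
    """Expand (X, T, Delta) into one row per subject per at-risk interval."""
    rows, labels, ids = [], [], []
    for i, (x, t_obs, d) in enumerate(zip(X, T, Delta)):
        for t in time_grid:
            if t > t_obs:
                break  # the subject is no longer at risk after the observed time
            rows.append(np.concatenate(([t], x)))      # time is the first covariate
            labels.append(int(d == 1 and t == t_obs))  # 1 only at the event interval
            ids.append(i)
    return np.array(rows, dtype=float), np.array(labels), np.array(ids)

X = np.array([[0.2, 1.0], [-0.5, 0.3]])
T = np.array([3, 2])       # subject 0 observed to t=3, subject 1 to t=2
Delta = np.array([1, 0])   # subject 0 has an event, subject 1 is censored
Z, Y_it, ids = person_period(X, T, Delta, time_grid=np.arange(1, 7))
print(Z)
print(Y_it)  # [0 0 1 0 0]
print(ids)   # [0 0 0 1 1]
```

The censored subject contributes only zero labels, which is how right-censoring enters the binomial fit.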
Model. The discrete hazard is the conditional event probability in interval
t given survival up to t, modelled on the logit scale by a HAPC fit f of
the augmented covariate (t, x):
lambda(t | x) = P(T_event = t | T_event >= t, X = x)
logit lambda(t | x) = f(t, x)
Person-period likelihood. Under independent right-censoring the observed-data likelihood factorises over the at-risk intervals,
prod_i prod_{t <= T_i} lambda(t|x_i)^Y_it * (1 - lambda(t|x_i))^(1 - Y_it),
with Y_it = 1(T_event_i = t),
which is exactly the Bernoulli (logistic) likelihood of the expanded
person-period table — so a binomial HAPC fit of Y_it on (t, x_i) estimates
the discrete hazard (Cox 1972; Brown 1975; Allison 1982).
Survival. The conditional survival function follows by the product-limit
relation S(t | x) = prod_{s <= t} (1 - lambda(s | x)), returned for new
subjects when predict= is supplied.
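The product-limit relation is a cumulative product over the time grid. A minimal NumPy sketch, assuming the hazards are arranged as an (m, K) array with one row per subject and one column per grid point (survival_from_hazard is a hypothetical helper):

```python
import numpy as np

def survival_from_hazard(hazard):
    """S(t | x) = prod_{s <= t} (1 - lambda(s | x)) along the time axis."""
    hazard = np.asarray(hazard, dtype=float)
    return np.cumprod(1.0 - hazard, axis=1)

haz = np.array([[0.1, 0.2, 0.5],
                [0.0, 0.0, 1.0]])
S = survival_from_hazard(haz)
print(S)  # rows: [0.9, 0.72, 0.36] and [1.0, 1.0, 0.0]
```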
from hapc import hazard_hapc
import numpy as np
# X: baseline covariates (n, p); T: observed times; Delta: 0/1 event indicator
fit = hazard_hapc(X, T, Delta, norm="1", max_degree=2, time_grid=np.arange(1, 7))
fit.hazard # estimated hazard per person-period row (CV predictions)
fit.best_lambda, fit.interior # CV-selected lambda; is it interior to the grid?
# survival curves S(t|x) for new subjects
fit = hazard_hapc(X, T, Delta, norm="1", predict=X_new)
fit.predict_survival # (m, K) survival probabilities over the grid
library(hapc)
# equivalent to cv.hapc(X, T, family = "logit-hazard", Delta = Delta, norm = "1")
fit <- hazard.hapc(X, T, Delta, norm = "1", max_degree = 2, time_grid = 1:6)
fit$hazard; fit$best_lambda; fit$interior
norm must be "1" (logistic LASSO) or "2" (logistic ridge); norm = "sv"
is not implemented for this family and is flagged.
Returns (Python HazardResult / R hapc_hazard):
- hazard: cross-validated discrete hazard for each person-period row
- lambdas, risk, best_lambda: CV grid, mean logistic deviance, selected λ
- interior: whether best_lambda is strictly inside the grid (sanity check)
- time_grid, ids/id, Y: the discrete grid and person-period bookkeeping
- predict_hazard, predict_survival: hazard surface and survival curves for new subjects (only when predict= is given)
- cv: the underlying cross-validation result
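One plausible reading of the interior check is that the selected λ is neither the smallest nor the largest grid value. The sketch below encodes that reading; it is an assumption about the semantics, and is_interior is a hypothetical helper rather than the library's code.

```python
import numpy as np

def is_interior(lambdas, risk):
    """Return (best_lambda, interior) for a CV risk curve over a lambda grid."""
    lambdas = np.asarray(lambdas, dtype=float)
    best_lambda = lambdas[int(np.argmin(risk))]
    # Strictly inside the grid: not at either end of the lambda range
    interior = bool(lambdas.min() < best_lambda < lambdas.max())
    return best_lambda, interior

grid = np.logspace(-4, 0, 5)
print(is_interior(grid, [0.9, 0.5, 0.4, 0.6, 0.8]))  # minimum in the middle
print(is_interior(grid, [0.3, 0.5, 0.6, 0.7, 0.8]))  # minimum at the grid edge
```

A minimum at the edge suggests the grid should be extended in that direction before trusting the selected value.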
Worked end-to-end examples (five hazard data-generating processes, with
true-vs-estimated hazard scatters and CV risk-vs-λ curves verifying an interior
optimum) are in
examples/hazard_logit_hazard_examples.R
and
examples/hazard_logit_hazard_examples.py.
References. Cox (1972, JRSS B); Brown (1975, Biometrics); Allison (1982, Sociological Methodology); Singer & Willett (2003, Applied Longitudinal Data Analysis); Benkeser & van der Laan (2016, IEEE DSAA).
API Reference
hapc.single.single_pcghal()
Fit PC-GHAL with a single lambda value.
Parameters:
- X (ndarray, shape (n, p)): Input features
- Y (ndarray, shape (n,)): Response variable
- maxdeg (int): Maximum degree of interactions
- npc (int): Number of principal components
- single_lambda (float): Regularization parameter
- max_iter (int, default=100): Maximum iterations
- tol (float, default=1e-6): Convergence tolerance
- verbose (bool, default=False): Print progress
- predict (ndarray, optional): Test data for predictions
- center (bool, default=True): Center the design matrix
Returns:
- result.optimizer_output.alpha: Coefficients
- result.optimizer_output.risk: Final risk
- result.optimizer_output.iter: Iterations until convergence
- result.predictions: Predictions on test data (if provided)
hapc.cv.pcghal_cv()
Cross-validation to select lambda.
Parameters:
- lambdas (ndarray): Grid of lambda values to test
- nfolds (int, default=5): Number of CV folds
- ...other parameters same as single_pcghal
Returns:
- cv_result.best_lambda: Optimal lambda
- cv_result.mses: CV errors for each lambda
- cv_result.best_model: Fitted model with best lambda
- cv_result.predictions: Predictions on test data (if provided)
Contributing
Contributions are welcome! The C++ core is shared between the R and Python packages.
git clone https://github.com/meixide/hapc.git
cd hapc
pip install -e .
pytest
License
MIT License - see LICENSE file