Weighted structured nonconvex sparse models (Python + Rust)

Project description

skein

Weighted structured nonconvex sparse models. Rust core + Python API.

Documentation: the docs site has the full conceptual reference (penalties, datafits, weights, backends), porting guides for glmnet / ncvreg / grpreg, worked examples, and an auto-generated API reference. It will be hosted on Read the Docs once the project is connected (config in .readthedocs.yaml); until then, preview it locally with mkdocs serve. CI builds the site with --strict on every PR.

skein targets a niche that's well-served in R (grpreg, ncvreg) but missing in Python at production quality: nonconvex group-structured penalties (group MCP, group SCAD, sparse-group nonconvex) with first-class support for weights along three axes — per-sample, per-feature, and per-group.

Status

v0.1 development. Core algorithms and the headline GLM family are in place; design-matrix backends (sparse, mmap, chunked) are next. See ROADMAP.md for the full plan.

Done so far:

  • Solvers — production CD core (path solver, strong rule + KKT verification, gap-safe screening, Anderson acceleration); group block-CD with LLA outer loop for nonconvex group penalties; Rayon-parallel group sweeps; operator-norm Lipschitz via power iteration.
  • Datafits — least squares, binomial logistic, Poisson (log link), Cox PH (Breslow ties). All are unified by a GlmDatafit trait that exposes a weighted-LS surrogate, so the M1/M2 inner solvers handle every GLM unchanged.
  • Penalties — MCP, SCAD, group lasso, group MCP, sparse-group lasso, sparse-group MCP. Per-feature and per-group weights honored throughout.
  • Python — sklearn-compatible estimators for every (datafit × penalty) combination; type stubs; warm-started λ-paths; standardization with original-scale coef_ / intercept_ recovery (dense backend).
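
To illustrate the weighted-LS surrogate mentioned under Datafits, here is a minimal NumPy sketch for the binomial logistic case. It is not skein's Rust implementation: the function names logistic_working_response and irls_logistic are hypothetical, and the way per-sample weights enter (scaling the curvature) is an assumption based on the standard IRLS construction.

```python
import numpy as np


def logistic_working_response(eta, y, sample_weight=None):
    """Return (w, z) for the quadratic surrogate 0.5 * sum(w * (z - eta_new)**2).

    The surrogate matches the gradient and curvature of the (weighted)
    binomial negative log-likelihood at the current linear predictor eta.
    """
    eta = np.asarray(eta, dtype=float)
    y = np.asarray(y, dtype=float)
    if sample_weight is None:
        s = np.ones_like(eta)
    else:
        s = np.asarray(sample_weight, dtype=float)
    p = 1.0 / (1.0 + np.exp(-eta))            # fitted probabilities
    h = np.clip(p * (1.0 - p), 1e-10, None)   # curvature, kept away from zero
    w = s * h                                  # working weights
    z = eta + (y - p) / h                      # working response
    return w, z


def irls_logistic(X, y, n_iter=25):
    """Unpenalized logistic fit by repeatedly solving the weighted-LS surrogate."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w, z = logistic_working_response(X @ beta, y)
        XtW = X.T * w                          # X^T W without forming diag(w)
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta


# Small synthetic check (no intercept, for brevity).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, -1.0, 0.5]))))
y = (rng.uniform(size=200) < p_true).astype(float)
beta = irls_logistic(X, y)
```

Because every GLM reduces to the same (w, z) pair, a penalized least-squares solver can be reused as the inner step; only the surrogate changes between logistic, Poisson, and Cox.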

M8 (Distribution & DX) is done: CI + cibuildwheel + Read the Docs + 25-page mkdocs site (concepts + R-porting + extending + examples + API ref) + R numerical regression suite vs glmnet/ncvreg/grpreg + stable Rust API contract. The library is pip install-able once published, documented end-to-end, and pinned against R reference fits so we don't silently drift.

Coming next: algorithmic features — M5.x adaptive weights and stability selection are the next high-value milestones; both leverage the existing per-feature/per-group weight axes that are already wired through every solver.
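
As a rough idea of how adaptive weights can ride on a per-feature weight axis, the sketch below computes adaptive-lasso-style weights from a ridge pilot fit. This is only an illustration of the general technique; the function adaptive_weights, its parameters (alpha, nu, eps), and the choice of a ridge pilot are assumptions, not the planned M5.x design.

```python
import numpy as np


def adaptive_weights(X, y, alpha=1.0, nu=1.0, eps=1e-8):
    """Per-feature weights w_j = 1 / (|b_j| + eps)**nu from a ridge pilot estimate b."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    # Ridge pilot: (X^T X + alpha * I) b = X^T y
    b = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)
    return 1.0 / (np.abs(b) + eps) ** nu


# Same synthetic setup as the quick start: three true signals out of 50 features.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([1.5, -2.0, 0.8]) + 0.1 * rng.standard_normal(n)
weights = adaptive_weights(X, y)
```

Features with a large pilot coefficient get a small weight (light penalty), while near-zero pilot coefficients get a large weight; the resulting vector could then be passed wherever per-feature weights are accepted.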

Layout

crates/skein-core/   pure Rust: traits + algorithms (no Python)
crates/skein-py/     PyO3 bindings (cdylib → skein_glm._core)
python/skein/        sklearn-compatible estimators + ABCs for extensions
tests/               pytest smoke tests
benches/             criterion (Rust) + asv (Python)

The Rust traits (DesignMatrix, Datafit, GlmDatafit, Penalty, GroupPenalty) and their Python ABC mirrors (skein.penalties.Penalty, etc.) are the extension surface for downstream per-paper projects.
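
For orientation, a penalty in this family typically needs a value function and a thresholding (proximal-type) operator. The standalone NumPy sketch below shows both for scalar MCP using the standard textbook formulas; the function names are hypothetical and this is not the actual skein.penalties.Penalty interface, whose method signatures are not shown here.

```python
import numpy as np


def mcp_value(beta, lam, gamma):
    """MCP penalty: lam*|b| - b^2/(2*gamma) for |b| <= gamma*lam, else gamma*lam^2/2."""
    a = np.abs(np.asarray(beta, dtype=float))
    inner = lam * a - a**2 / (2.0 * gamma)
    outer = 0.5 * gamma * lam**2
    return np.where(a <= gamma * lam, inner, outer)


def mcp_threshold(z, lam, gamma):
    """Minimizer of 0.5*(z - b)^2 + MCP(b; lam, gamma), valid for gamma > 1.

    Inside |z| <= gamma*lam this is a rescaled soft threshold ("firm"
    thresholding); outside it returns z unchanged, so large coefficients
    are not shrunk.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must be > 1 for a unit-curvature quadratic")
    z = np.asarray(z, dtype=float)
    soft = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
    return np.where(np.abs(z) <= gamma * lam, soft / (1.0 - 1.0 / gamma), z)


z = np.array([0.5, 2.0, 5.0])
shrunk = mcp_threshold(z, lam=1.0, gamma=3.0)   # -> [0.0, 1.5, 5.0]
```

The three inputs land in the three regimes: below lam the coefficient is zeroed, between lam and gamma*lam it is partially shrunk, and above gamma*lam it passes through unpenalized.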

Quick start

import numpy as np
from skein import MCPPathRegressor, LogisticGroupMCPPathRegressor, CoxMCPRegressor

# Nonconvex sparse least squares with a λ-path.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([1.5, -2.0, 0.8]) + 0.1 * rng.standard_normal(n)
model = MCPPathRegressor(gamma=3.0, n_lambdas=50, standardize=True).fit(X, y)
print(model.coefs_[-1, :5], model.intercepts_[-1])

# Logistic + group MCP via LLA, with sklearn-style predict/predict_proba.
groups = np.repeat(np.arange(p // 5), 5)  # 5 features per group
y_bin = (X[:, :3].sum(axis=1) > 0).astype(float)
clf = LogisticGroupMCPPathRegressor(groups=groups, gamma=3.0, n_lambdas=20).fit(X, y_bin)
proba = clf.predict_proba(X)  # shape (n, n_lambdas)

# Cox PH with right-censored survival data.
time = rng.exponential(1.0 / np.exp(X[:, :3].sum(axis=1)))
event = rng.uniform(size=n) < 0.7
cox = CoxMCPRegressor(lambda_=0.01, gamma=3.0).fit(X, time, event.astype(float))
risk = cox.predict(X)  # prognostic index η

Every regressor follows the same (datafit) × (penalty) × ({,Path}Regressor) naming scheme. The path variants warm-start across λ; their coefs_ / intercepts_ (where applicable) are 2D arrays indexed by λ.
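
To make the λ-indexing concrete, the sketch below builds a log-spaced, decreasing λ grid for least squares, the usual setup for warm-started paths. It is an assumption that skein's path solvers use this construction: the helper lambda_grid, its min_ratio parameter, and the 1/(2n) loss scaling are illustrative and not taken from the library.

```python
import numpy as np


def lambda_grid(X, y, n_lambdas=50, min_ratio=1e-3):
    """Decreasing, log-spaced grid starting at the smallest lambda with an all-zero fit.

    Assumes the loss (1 / (2n)) * ||y - X b||^2 with an unpenalized intercept,
    so lambda_max = max_j |x_j^T (y - mean(y))| / n on centered columns.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    lam_max = np.max(np.abs(Xc.T @ yc)) / n
    return np.geomspace(lam_max, min_ratio * lam_max, n_lambdas)


rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([1.5, -2.0, 0.8]) + 0.1 * rng.standard_normal(n)
lams = lambda_grid(X, y, n_lambdas=50)
```

With a grid ordered like this, row k of a 2D coefficient array would correspond to lams[k], and the last row to the smallest (least sparse) λ, which fits the quick start's use of coefs_[-1]. Each solve can start from the previous row's solution.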

Build

# Rust core only (fast iteration on algorithms)
cargo test -p skein-core

# Full Python package (requires maturin in your env)
maturin develop --release
pytest

License

MIT.

Download files

Source Distribution

skein_glm-0.3.0.tar.gz (182.4 kB view details)

Uploaded Source

Built Distributions

skein_glm-0.3.0-cp310-abi3-win_amd64.whl (1.5 MB view details)

Uploaded CPython 3.10+Windows x86-64

skein_glm-0.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB view details)

Uploaded CPython 3.10+manylinux: glibc 2.17+ x86-64

skein_glm-0.3.0-cp310-abi3-macosx_11_0_arm64.whl (1.1 MB view details)

Uploaded CPython 3.10+macOS 11.0+ ARM64

File details

Details for the file skein_glm-0.3.0.tar.gz.

File metadata

  • Download URL: skein_glm-0.3.0.tar.gz
  • Upload date:
  • Size: 182.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for skein_glm-0.3.0.tar.gz
Algorithm Hash digest
SHA256 c7a4840444b2f01d05e721abfe0a6dec0752cd7221d0d91f961798b95cf7f4d8
MD5 0f175556ef64abd3230958d0fdb7ae1a
BLAKE2b-256 df5482f93d437c01937b087836e2179c61a275e411916af4c4e648990c51ad23

Provenance

The following attestation bundles were made for skein_glm-0.3.0.tar.gz:

Publisher: wheels.yml on dvillacis/skein

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file skein_glm-0.3.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: skein_glm-0.3.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 1.5 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for skein_glm-0.3.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 953b0ebecc6ff2b6329f2fc195e97d2f62547e003087265a8d27da2f569c1d87
MD5 3def175a8ef36e7859ed20b995ff4337
BLAKE2b-256 b0c7a1ea2fd4bccfce0908d01c7eeac4750f65a78f56e2e266d8687ae6b04d6f

File details

Details for the file skein_glm-0.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for skein_glm-0.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9f4a46a536b03139c07784c1ce77d50e56ce8505488a552ed8287030ca442d84
MD5 45561bc9737fcbf49039f09fee64a5f5
BLAKE2b-256 de10d5b50cc2fdae6c95f0101e1b56ccf9a1f70531336deab44384f6cca71bce

File details

Details for the file skein_glm-0.3.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for skein_glm-0.3.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 183c4099ab273bd5a19bb3501d8bc550cdfbac8347239a62143c61cecb52db26
MD5 29fcc7d5f005b834661dcc11ceaa6eb4
BLAKE2b-256 69e0be82ea6f15e83061d27053bd7bf546d98ac60d1c70623a471bd43baf7e98
