A package implementation of COMBSS, a continuous optimisation method toward best subset selection

These details have not been verified by PyPI

Continuous Optimization Method for Best Subset Selection

Python implementation of a novel continuous optimization method for best subset selection in linear regression.

📄 Reference:
Moka, Liquet, Zhu & Muller (2024)
COMBSS: best subset selection via continuous optimization
Statistics and Computing

🔗 GitHub Repository: saratmoka/combss

Key Features

🎯 Continuous relaxation of discrete subset selection
⚡ Scalable optimization for high-dimensional data
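
To make the discrete problem concrete, here is a minimal sketch of exhaustive best subset selection with NumPy. It is not how COMBSS works internally: it enumerates every size-k subset and keeps the one with the lowest residual sum of squares, which is the combinatorial search that a continuous relaxation is meant to avoid. The function name `best_subset_exhaustive` and the synthetic data are illustrative assumptions.

```python
import itertools

import numpy as np


def best_subset_exhaustive(X, y, k):
    """Return the size-k column subset with the lowest residual sum of squares."""
    best_rss, best_subset = np.inf, None
    # The number of candidates is C(p, k), so this is only feasible for small p.
    for subset in itertools.combinations(range(X.shape[1]), k):
        cols = list(subset)
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = float(np.sum((y - X[:, cols] @ beta) ** 2))
        if rss < best_rss:
            best_rss, best_subset = rss, subset
    return best_subset, best_rss


# Synthetic data (assumption): only columns 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.standard_normal(200)

subset, rss = best_subset_exhaustive(X, y, k=2)
print("Best subset of size 2:", subset)
```

With 8 features and k=2 there are only 28 candidates, but the count grows combinatorially with the number of features, which is why exhaustive search does not scale to high-dimensional data.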

Intercept Handling

The intercept term (if included) is subject to the same selection process as other features.

Installation

pip install combss

Quick Start

A simple example:

import combss
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate sample data
X, y = make_regression(n_samples=1000, n_features=50, noise=0.1, random_state=42)

# Split into training and validation sets (60-40 split)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.4, random_state=42)

# Initialize and fit model with validation data
model = combss.linear.model()
model.fit(
    X_train=X_train, 
    y_train=y_train,
    X_val=X_val,      # Validation features
    y_val=y_val,      # Validation targets
    q=10,             # Maximum subset size
    nlam=50           # Number of λ values
)

# Results
print("Best subset indices:", model.subset)
print("Best coefficients:", model.coef_)
print("Validation MSE:", model.mse)
print("Optimal lambda:", model.lambda_)
print("Computation time (s):", model.run_time)

An example with known true coefficients:

import combss
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Configuration
n_samples = 5000
n_features = 50
n_informative = 5  # the number of non-zero coefficients
noise_level = 0.1

# Generate data with exactly 5 informative features
X, y, true_coef = make_regression(
    n_samples=n_samples,
    n_features=n_features,
    n_informative=n_informative, 
    noise=noise_level,
    coef=True,  # Return the actual coefficients used
    random_state=42
)

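# Exactly n_informative coefficients are non-zero; make_regression shuffles
# the feature order by default, so they are not necessarily the first 5 columns
print("Number of truly informative features:", int(np.count_nonzero(true_coef)))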

# Split data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.4, random_state=42)

# Initialize and fit model
model = combss.linear.model()
model.fit(
    X_train=X_train, 
    y_train=y_train,
    X_val=X_val,
    y_val=y_val,
    q=10,
    nlam=50
)

# Results analysis
print("\nIndices of true non-zero coefficients:", np.where(true_coef != 0)[0])
print("Estimated subset:", model.subset)
print("\nValidation MSE:", model.mse)
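
When the true coefficients are known, the selected subset can be scored against the true support. The sketch below is one possible way to do this with plain NumPy; `support_recovery` is a hypothetical helper, and `estimated_subset` is a hand-written stand-in for `model.subset` (assumed to hold 0-based feature indices, as listed under Output Attributes).

```python
import numpy as np


def support_recovery(true_coef, estimated_subset):
    """Return (precision, recall) of an estimated subset against the true support."""
    true_support = set(np.flatnonzero(true_coef).tolist())
    estimated = {int(j) for j in estimated_subset}
    hits = len(true_support & estimated)
    precision = hits / len(estimated) if estimated else 0.0
    recall = hits / len(true_support) if true_support else 0.0
    return precision, recall


# Toy inputs (assumptions, not output of combss).
true_coef = np.array([0.0, 2.5, 0.0, -1.0, 0.0, 4.0])  # true support is {1, 3, 5}
estimated_subset = [1, 3, 4]                            # stand-in for model.subset

precision, recall = support_recovery(true_coef, estimated_subset)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision is the share of selected features that are truly informative; recall is the share of informative features that were selected.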

Documentation

Core Parameters

| Parameter | Description | Default |
|---|---|---|
| `q` | Maximum subset size | min(n, p) |
| `nlam` | Number of λ values | 50 |
| `scaling` | Enable feature scaling | True |
| `tau` | Threshold parameter | 0.5 |
| `delta_frac` | δ/n in objective function | 1 |

Other Parameters

model.fit(
    ...,
    t_init=t_init,     # Initial point for vector t
    eta=0.001,         # Truncation parameter
    patience=10,       # Early stopping rounds
    gd_maxiter=1000,   # Maximum number of iterations for the gradient based optimization
    gd_tol=1e-5,       # Tolerance for the gradient based optimization
    cg_maxiter=1000,   # Maximum number of iterations allowed in the conjugate gradient method
    cg_tol=1e-6        # Conjugate gradient tolerance
)
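
For readers unfamiliar with what `cg_maxiter` and `cg_tol` typically control, the following is a generic textbook conjugate gradient solver for a symmetric positive definite system. It is a sketch under that assumption, not the package's internal code; the function name, the relative-residual stopping rule, and the test matrix are all illustrative choices.

```python
import numpy as np


def conjugate_gradient(A, b, maxiter=1000, tol=1e-6):
    """Solve A x = b for symmetric positive definite A.

    Stops when the residual norm falls below tol * ||b|| or after maxiter steps.
    Returns the solution estimate and the number of iterations performed.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs_old = float(r @ r)
    b_norm = float(np.linalg.norm(b)) or 1.0
    for iteration in range(maxiter):
        if np.sqrt(rs_old) <= tol * b_norm:
            return x, iteration
        Ap = A @ p
        alpha = rs_old / float(p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = float(r @ r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, maxiter


# Small well-conditioned test system (assumption for illustration).
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 10))
A = M.T @ M + np.eye(10)
b = rng.standard_normal(10)

x, n_iter = conjugate_gradient(A, b, maxiter=1000, tol=1e-6)
print("iterations:", n_iter)
```

A tighter tolerance gives a more accurate solve at the cost of more iterations, while the iteration cap bounds the worst-case cost per solve.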

Output Attributes

| Attribute | Description |
|---|---|
| `subset` | Selected feature indices (0-based) |
| `coef_` | Regression coefficients |
| `mse` | Mean squared error |
| `lambda_` | Optimal λ value |
| `run_time` | Execution time (seconds) |
| `subset_list` | The list of subsets over the grid |
| `lambda_list` | The grid of λ values |
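
The grid outputs can be inspected after fitting. As a rough sketch of one way to do that, the code below refits ordinary least squares on each candidate subset and compares validation MSE. The `lambda_list` and `subset_list` values here are hand-written stand-ins, not output of combss, and the helper `validation_mse` is hypothetical; the package's own selection rule may differ.

```python
import numpy as np


def validation_mse(X_train, y_train, X_val, y_val, subset):
    """Refit least squares on the given columns and return validation MSE."""
    cols = list(subset)
    if not cols:
        # Empty subset: predict zero (no intercept in this sketch).
        return float(np.mean(y_val ** 2))
    beta, *_ = np.linalg.lstsq(X_train[:, cols], y_train, rcond=None)
    return float(np.mean((y_val - X_val[:, cols] @ beta) ** 2))


# Synthetic zero-mean data (assumption): columns 0 and 3 carry the signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.standard_normal(300)
X_train, X_val, y_train, y_val = X[:200], X[200:], y[:200], y[200:]

# Hypothetical stand-ins for model.lambda_list and model.subset_list.
lambda_list = [1.0, 0.1, 0.01]
subset_list = [[], [0], [0, 3]]

scores = [validation_mse(X_train, y_train, X_val, y_val, s) for s in subset_list]
best = int(np.argmin(scores))
print("Best lambda:", lambda_list[best], "subset:", subset_list[best])
```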

Dependencies

Python 3.7+
NumPy (≥1.21.0)
SciPy (≥1.7.0)

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Developers

Sarat Moka (@saratmoka)
Hua Yang Hu

Project details

Release history

2.0.0

Apr 24, 2026

This version

1.1.4

Jul 7, 2025

1.1.3

Jul 5, 2025

1.1.2

Jul 5, 2025

1.1.1

Jul 5, 2025

1.1.0

Jul 5, 2025

1.0.3

May 22, 2025

1.0.2

Nov 18, 2024

1.0.1

Nov 18, 2024

1.0.0

Nov 15, 2024

Download files

Source Distribution

combss-1.1.4.tar.gz (17.5 kB)

Uploaded Jul 7, 2025 Source

Built Distribution

combss-1.1.4-py2.py3-none-any.whl (18.1 kB)

Uploaded Jul 7, 2025 Python 2, Python 3

File details

Details for the file combss-1.1.4.tar.gz.

File metadata

Download URL: combss-1.1.4.tar.gz
Upload date: Jul 7, 2025
Size: 17.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.12

File hashes

Hashes for combss-1.1.4.tar.gz
Algorithm	Hash digest
SHA256	`98e0c527235b3c47b381613170e5e7ea16c0582401e43996a1723531713a018a`
MD5	`e053f04c77ef136a3f7f4de9a8baadef`
BLAKE2b-256	`1924f2a9a950af7a98d6c4c61d9df01ccb01e02ca8753cf7fb14eaea3075b6c7`
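
A downloaded file can be checked against a published digest with the standard library. The sketch below uses only `hashlib`; the helper names `sha256_of_file` and `matches_expected` are illustrative, and the demonstration runs on a temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file (not the real distribution).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"combss")
    demo_path = tmp.name

demo_digest = hashlib.sha256(b"combss").hexdigest()
demo_ok = matches_expected(demo_path, demo_digest)
os.remove(demo_path)
print("Digest matches:", demo_ok)
```

For the actual source distribution, one would pass the local path of `combss-1.1.4.tar.gz` and the SHA256 value from the table above.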

File details

Details for the file combss-1.1.4-py2.py3-none-any.whl.

File metadata

Download URL: combss-1.1.4-py2.py3-none-any.whl
Upload date: Jul 7, 2025
Size: 18.1 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.12

File hashes

Hashes for combss-1.1.4-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`d67ecba85474dbb86afe462e7dccada6c1bdfcf1adc05c272b9a53e9d8b116cc`
MD5	`c6db9bdf9b84863da6208cf8c88c3d60`
BLAKE2b-256	`632b9d9eb222e4a4d6687ee85da63986e4f21bedf1967f389ac7eb0c99ffe7c3`
