LogiPrune

Smarter AI training through propositional logic and information theory.

LogiPrune is a preprocessing library that analyzes the logical and informational structure of your dataset before training begins, and uses that structure to reduce the hyperparameter search space — without sacrificing accuracy.

Two complementary modules. One library. Two papers.


What it does

LogiPrune (Paper 1) asks: what logical relationship exists between feature A and feature B? It finds implications (A→B), biconditionals (A↔B), incompatibilities (A→¬B), and disjunctions (A∨B), then uses those relationships to eliminate redundant features and restrict the hyperparameter grid.

LogiPruneEntropy (Paper 2) asks: how complex is that relationship? It computes the Shannon entropy H* of the 4-cell truth table distribution for each feature pair — a continuous measure of boundary complexity — and uses it to select the appropriate model depth and size a priori.


Installation

pip install logiprune                    # core (SVC, RF, any estimator)
pip install "logiprune[xgboost]"         # with XGBoost support

Quick start

Paper 1 — Propositional grid pruning (SVC / RF / any estimator)

from logiprune import LogiPrune
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Any labeled dataset works; breast_cancer is one of the benchmark sets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_grid = {
    'svc__C':      [0.1, 1, 10, 100],
    'svc__kernel': ['linear', 'rbf', 'poly'],
    'svc__gamma':  ['scale', 'auto', 0.01, 0.1],
}

lp = LogiPrune(base_grid=base_grid, verbose=True)
lp.fit(X_train, y_train)

X_pruned    = lp.transform(X_train)
pruned_grid = lp.pruned_grid()

pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
gs   = GridSearchCV(pipe, pruned_grid, cv=5, scoring='f1')
gs.fit(X_pruned, y_train)

print(lp.report())
# → Config savings: 93.8%  |  Features eliminated: 1  |  ...

Paper 2 — Entropy-based complexity selection (XGBoost)

from logiprune import LogiPruneEntropy
from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Any labeled dataset works; breast_cancer is one of the benchmark sets.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

xgb_grid = {
    'xgb__n_estimators':     [100, 200, 300],
    'xgb__max_depth':        [3, 5, 7],
    'xgb__learning_rate':    [0.05, 0.1, 0.3],
    'xgb__subsample':        [0.8, 1.0],
    'xgb__colsample_bytree': [0.8, 1.0],
}

lpe = LogiPruneEntropy(base_grid=xgb_grid, verbose=True)
lpe.fit(X_train, y_train)

X_pruned    = lpe.transform(X_train)
pruned_grid = lpe.pruned_grid()

pipe = Pipeline([('xgb', XGBClassifier(eval_metric='logloss', verbosity=0))])
gs   = GridSearchCV(pipe, pruned_grid, cv=5, scoring='f1')
gs.fit(X_pruned, y_train)

print(lpe.report())
# → H_min: 0.76  |  Config savings: 77.8%  |  Complexity: low

Combined pipeline (optimal)

from logiprune import LogiPrune, LogiPruneEntropy

# Stage 1: propositional pruning (Paper 1)
lp = LogiPrune(base_grid=base_grid)
lp.fit(X_train, y_train)
X_p1    = lp.transform(X_train)
grid_p1 = lp.pruned_grid()

# Stage 2: entropy complexity selection (Paper 2)
lpe = LogiPruneEntropy(base_grid=grid_p1)
lpe.fit(X_p1, y_train)
X_final    = lpe.transform(X_p1)
grid_final = lpe.pruned_grid()

# Stage 3: adaptive search on the reduced space
# → plug grid_final into FLAML, Optuna, or GridSearchCV
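
For stage 3, here is a minimal Optuna sketch that samples only from the reduced space; the objective below is illustrative, not part of LogiPrune, and assumes grid_final still holds SVC-style keys from the stage 1 grid.

import optuna
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def objective(trial):
    # Sample each hyperparameter from the pruned value lists only.
    params = {name.split('__', 1)[1]: trial.suggest_categorical(name, values)
              for name, values in grid_final.items()}
    pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC(**params))])
    return cross_val_score(pipe, X_final, y_train, cv=5, scoring='f1').mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=30)
print(study.best_params)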

Empirical results

Paper 1 — Five-method benchmark (SVC)

Dataset        Method     Time    F1      Config savings
breast_cancer  Baseline   14.6s   0.9861  —
breast_cancer  FLAML      14.6s   0.9861  0% (uses full budget)
breast_cancer  LogiPrune   0.7s   0.9726  93.8%
digits_0v1     Baseline    1.5s   1.0000  —
digits_0v1     LogiPrune   1.4s   1.0000  93.8%
synth_lo_1k    Baseline   32.0s   0.9100  —
synth_lo_1k    LogiPrune  10.5s   0.9100  33.3%

On structured data (n = 10,000, RF), LogiPrune is the only method that is simultaneously faster (29.3% time savings) and more accurate (F1 +0.0011) than baseline GridSearch.

Paper 2 — XGBoost entropy benchmark

Dataset        H*    Grid    Savings  ΔF1      ΔTime
breast_cancer  0.76  108→24  77.8%    +0.0133  +76.9%
synth_hi_2k    1.20  108→24  77.8%    +0.0005  +74.3%
wine           1.18  108→24  77.8%     0.0000  +74.5%

ΔF1 ≥ 0 on all datasets. H* = 0.76 on breast_cancer correctly identifies that shallow trees suffice, eliminating the deep configurations that hurt generalization.


How it works

Paper 1: Propositional vector

For each feature pair, LogiPrune sweeps discretization thresholds and keeps the one at which the logical relationship is most stable. It classifies each pair as one of the following (a minimal detection sketch follows the list):

  • Biconditional (A↔B): one feature eliminated after accuracy validation
  • Implication (A→B): linear kernel sufficient → restrict to kernel=['linear']
  • Incompatibility (A→¬B): mutual exclusion structure → restrict kernels
  • Disjunction (A∨B): compressed via t-conorms, only when both A⊢D and B⊢D (disjunction elimination rule ∨E)
  • Contingency: full grid required
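
As an illustration of the classification above, here is a minimal, self-contained sketch of how a pair could be labeled from its 4-cell truth table at fixed thresholds; the function name, confidence definitions, and cutoffs are assumptions for exposition, not LogiPrune's internals.

import numpy as np

def classify_pair(a, b, t_a, t_b, min_confidence=0.75):
    """Toy classification of the logical relation between two features
    at fixed thresholds (illustrative only)."""
    A, B = a > t_a, b > t_b                       # binarize both features
    w11 = np.mean(A & B)                          # 4-cell truth table weights
    w10 = np.mean(A & ~B)
    w01 = np.mean(~A & B)
    w00 = np.mean(~A & ~B)
    p_b_given_a = w11 / max(w11 + w10, 1e-12)     # confidence of A→B
    p_a_given_b = w11 / max(w11 + w01, 1e-12)     # confidence of B→A
    p_notb_given_a = w10 / max(w11 + w10, 1e-12)  # confidence of A→¬B
    if p_b_given_a >= min_confidence and p_a_given_b >= min_confidence:
        return "biconditional"    # A↔B: candidate for feature elimination
    if p_b_given_a >= min_confidence:
        return "implication"      # A→B: restrict kernel
    if p_notb_given_a >= min_confidence:
        return "incompatibility"  # A→¬B: mutual exclusion
    if 1.0 - w00 >= min_confidence:
        return "disjunction"      # A∨B holds on most rows
    return "contingency"          # no stable relation: full grid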

Paper 2: Truth table entropy

For each feature pair at threshold T, the 4-cell weight distribution π(T) = (w₁₁, w₁₀, w₀₁, w₀₀) has Shannon entropy H(T) = −Σ wᵢⱼ · log₂(wᵢⱼ) ∈ [0, 2.0] bits. H* = min H(T) across the threshold sweep captures the best-case simplicity of the relationship.
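
A minimal sketch of H* under the definition above, assuming a shared percentile threshold sweep over both features; the helper names are illustrative, not the library's API.

import numpy as np

def truth_table_entropy(a, b, t):
    """Shannon entropy of the 4-cell distribution at threshold t, in bits."""
    A, B = a > t, b > t
    w = np.array([np.mean(A & B), np.mean(A & ~B),
                  np.mean(~A & B), np.mean(~A & ~B)])
    w = w[w > 0]                           # 0·log2(0) = 0 by convention
    return float(-(w * np.log2(w)).sum())  # in [0, 2] bits

def h_star(a, b, percentiles=range(10, 91, 10)):
    """H* = min H(T) across the threshold sweep."""
    thresholds = np.percentile(np.concatenate([a, b]), list(percentiles))
    return min(truth_table_entropy(a, b, t) for t in thresholds)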

H*          Complexity   XGBoost restriction
[0.0, 0.5)  Very simple  max_depth=[2, 3], n_estimators=[50, 100]
[0.5, 1.0)  Simple       max_depth=[3, 4, 5], n_estimators=[100, 200]
[1.0, 1.5)  Moderate     max_depth=[4, 5, 6], n_estimators=[200, 300]
[1.5, 2.0]  Complex      Full grid
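
The band table translates directly into grid restrictions. A hypothetical mapping, with band edges taken from the table above:

def restrict_xgb_grid(h_star, base_grid):
    """Restrict depth/size parameters by H* band (illustrative)."""
    grid = dict(base_grid)
    if h_star < 0.5:    # very simple boundary
        grid['xgb__max_depth'], grid['xgb__n_estimators'] = [2, 3], [50, 100]
    elif h_star < 1.0:  # simple
        grid['xgb__max_depth'], grid['xgb__n_estimators'] = [3, 4, 5], [100, 200]
    elif h_star < 1.5:  # moderate
        grid['xgb__max_depth'], grid['xgb__n_estimators'] = [4, 5, 6], [200, 300]
    return grid         # complex (H* ≥ 1.5): full grid unchanged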

The feedback loop

After Paper 1 eliminates a feature B via A↔B, Paper 2 checks:

    if H*(A, D | without B) > H*(A, D | with B) + δ:  reinstate B

This detects when B acts as a "moderator" — its presence simplifies the A→D relationship even though A↔B suggested redundancy.
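
In code, the feedback check reduces to a one-line rule; the names and the worked numbers below are illustrative only.

def should_reinstate(h_without_b, h_with_b, delta=0.10):
    """Reinstate B if removing it raises the minimal truth-table entropy
    of the (A, D) pair by more than delta."""
    return h_without_b > h_with_b + delta

# e.g. H*(A, D) jumps from 0.76 to 0.93 once B is dropped: 0.93 > 0.76 + 0.10
assert should_reinstate(0.93, 0.76)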


Parameters

LogiPrune (Paper 1)

Parameter             Default       Description
base_grid             required      Full GridSearchCV parameter grid
min_confidence        0.75          Minimum confidence for structural relations
acc_drop_tolerance    0.04          Max accuracy drop allowed for feature elimination
theta_disj_gate       0.85          Both A⊢D and B⊢D must reach this for disjunction compression
theta_elevation       0.92          Confidence required to elevate a pair to full implication
discretizer_strategy  'percentile'  'percentile', 'minmax', or 'zscore_clip'
verbose               False         Print progress

LogiPruneEntropy (Paper 2)

Parameter             Default       Description
base_grid             required      Full hyperparameter grid
acc_drop_tolerance    0.04          Max accuracy drop allowed for feature elimination
feedback_delta        0.10          Entropy increase that triggers feature reinstatement
discretizer_strategy  'percentile'  Normalization strategy
verbose               False         Print progress
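
A usage sketch combining several of the non-default parameters documented above; the parameter names come from the tables, the values are arbitrary examples.

lp = LogiPrune(
    base_grid=base_grid,
    min_confidence=0.80,          # stricter structural relations
    acc_drop_tolerance=0.02,      # more conservative feature elimination
    theta_disj_gate=0.90,         # harder gate for disjunction compression
    discretizer_strategy='zscore_clip',
    verbose=True,
)

lpe = LogiPruneEntropy(
    base_grid=xgb_grid,
    feedback_delta=0.15,          # require a larger entropy rise to reinstate
    verbose=True,
)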

Recommended pipeline

Dataset
  → LogiPrune       (Paper 1: removes redundant features, restricts kernel/depth)
  → LogiPruneEntropy (Paper 2: restricts n_estimators, max_depth by entropy)
  → FLAML / Optuna  (searches the reduced space adaptively)
  → best model

LogiPrune+FLAML Pareto-dominates FLAML alone: same budget, smaller space, better or equal results.


When it works best

  • Medical diagnostics (blood panels, imaging features)
  • Sensor fusion (IoT, process control)
  • Financial features (ratios from shared base quantities)
  • Image descriptors (pixel/feature correlations)

When to expect modest gains

Purely synthetic Gaussian datasets with independent features have high entropy throughout. The propositional gate and entropy signal correctly recognize this and apply minimal restrictions, protecting accuracy at the cost of smaller savings.


Citation

If you use LogiPrune in your research, please cite both papers:

@article{peralta2026logiprune,
  title   = {LogiPrune: Propositional Disjunction Elimination
             for Hyperparameter Search Space Pruning},
  author  = {Peralta Del Riego, V{\'i}ctor Manuel},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2026},
}

@article{peralta2026logiprune_entropy,
  title   = {LogiPrune-Entropy: A Priori Model Complexity Selection
             via Truth Table Shannon Entropy},
  author  = {Peralta Del Riego, V{\'i}ctor Manuel},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2026},
}

License

MIT © Víctor Manuel Peralta Del Riego, 2026
