Hierarchical tree-based model for robust parcel sale price predictions using weighted Wilson means.

LayeredCompModel

Hierarchical tree-based regressor for robust predictions (e.g., parcel sale prices) using path-weighted Wilson means (95% trimmed means for outlier resistance).

Features

  • Scikit-learn compatible: Inherits BaseEstimator/RegressorMixin; works with Pipeline, GridSearchCV, cross_val_score, pickling.
  • Automatic feature handling: Categorical (one-vs-rest splits), numeric (binary search breakpoints), NaNs/missing values.
  • Robust statistics: Wilson means limit the influence of outliers on node estimates.
  • Ensemble support: LayeredCompBaggingModel for reduced variance and automatic weight_falloff optimization.
  • Configurable weighting: weight_falloff balances local accuracy vs. market normativity.
  • Explainable: explain_value(row) shows path, weights, means.
  • Serializable: to_json(), to_dict().
  • Parallel: n_jobs support.
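
To make the "robust statistics" point concrete, here is a minimal sketch of a 95% trimmed mean in plain Python. It assumes symmetric trimming (2.5% of the observations dropped from each tail); the function name trimmed_mean and that trimming rule are illustrative assumptions, not the library's implementation.

```python
def trimmed_mean(values, keep=0.95):
    """Mean of the central `keep` fraction of the sorted values.

    Assumption: trimming is symmetric, dropping (1 - keep) / 2 of the
    observations from each tail. The library may trim differently.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 < keep <= 1.0:
        raise ValueError("keep must be in (0, 1]")
    ordered = sorted(values)
    # Observations to drop from each end, rounded down
    # (the small epsilon guards against float rounding).
    cut = int(len(ordered) * (1.0 - keep) / 2.0 + 1e-9)
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


# 39 ordinary sales plus one extreme outlier.
prices = [500_000.0] * 39 + [50_000_000.0]
plain_mean = sum(prices) / len(prices)
robust_mean = trimmed_mean(prices)

print(f"plain mean:   {plain_mean:,.0f}")
print(f"trimmed mean: {robust_mean:,.0f}")
```

With one extreme sale among forty, the plain mean is pulled far above the typical price while the trimmed mean stays at it.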

NaN Handling

  • Categorical: Treated as distinct "NaN" category.
  • Numeric: Excluded from splits (robust; per SPEC.md).
  • Target y: Must be finite (raises ValueError).
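  • Strict checks: To fill missing values before they reach the model, put an imputer in front of it, e.g. Pipeline([('imputer', SimpleImputer()), ('model', LayeredCompModel())]). Note that SimpleImputer's default strategy ("mean") only works on numeric columns.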
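
The rules above can be sketched in plain Python. The helper names (is_missing, categorical_key, numeric_split_candidates, check_target) are hypothetical and only mirror the documented behaviour; they are not part of the package.

```python
import math


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def categorical_key(value):
    # Missing categories become their own "NaN" bucket.
    return "NaN" if is_missing(value) else value


def numeric_split_candidates(values):
    # Missing numerics are left out when searching for a breakpoint.
    return sorted(v for v in values if not is_missing(v))


def check_target(y):
    # A missing or non-finite target is rejected outright.
    for v in y:
        if is_missing(v) or not math.isfinite(v):
            raise ValueError("y must contain only finite values")
    return list(y)


print([categorical_key(v) for v in ["North", None, float("nan")]])
print(numeric_split_candidates([1800.0, float("nan"), 2400.0, None, 2100.0]))
```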

Installation

pip install layeredcompmodel

For development:

git clone https://github.com/JohnKossa/layeredcompmodel.git
cd layeredcompmodel
pip install -e .[dev]

Quickstart

import pandas as pd
import numpy as np
from layeredcompmodel import LayeredCompModel

# Synthetic real-estate-like data
rng = np.random.default_rng(42)
n_samples = 100
data = {
    'neighborhood': rng.choice(['North', 'South', 'East'], n_samples),
    'size_sqft': rng.normal(2000, 500, n_samples),
    'price': rng.normal(500000, 100000, n_samples) + 100 * rng.normal(0, 1, n_samples) * (rng.normal(0, 1, n_samples) * 2000)
}
df = pd.DataFrame(data)
X = df[['neighborhood', 'size_sqft']]
y = df['price']

# Train
model = LayeredCompModel(weight_falloff=0.8, n_jobs=1)
model.fit(X, y)

# Predict
predictions = model.predict(X)
print(f"Predictions shape: {predictions.shape}")
print(f"MAE: {np.mean(np.abs(predictions - y)):.0f}")

# Explain single prediction
explanation = model.explain_value(X.iloc[0])
print(explanation)

API Reference

LayeredCompModel(weight_falloff=0.5, split_metric='mae', n_jobs=1)

  • fit(X, y): Build tree from features X (DataFrame), target y (Series).
  • predict(X): Predict using path-weighted means.
  • explain_value(row): Dict with path nodes, depths, weights, wilson_means.
  • to_json(indent=4): JSON tree dump.
  • tree_: Root CompNode.
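
As a rough illustration of how a one-vs-rest categorical split might be scored under split_metric='mae', the sketch below computes a size-weighted MAE for each candidate category. It is an assumption-laden toy: it uses plain means rather than Wilson means, and the function names mae_around_mean and one_vs_rest_score are hypothetical, not the package's internals.

```python
def mae_around_mean(values):
    """Mean absolute error of predicting the mean for every value."""
    if not values:
        return 0.0
    center = sum(values) / len(values)
    return sum(abs(v - center) for v in values) / len(values)


def one_vs_rest_score(categories, targets, chosen):
    """Size-weighted MAE after splitting rows into `chosen` vs. the rest."""
    inside = [t for c, t in zip(categories, targets) if c == chosen]
    outside = [t for c, t in zip(categories, targets) if c != chosen]
    total = len(inside) * mae_around_mean(inside) + len(outside) * mae_around_mean(outside)
    return total / len(targets)


neighborhoods = ["North", "North", "South", "South", "East", "East"]
prices = [600.0, 620.0, 400.0, 420.0, 410.0, 430.0]

# Score every category as the "one" side; a lower score is a better split.
scores = {c: one_vs_rest_score(neighborhoods, prices, c) for c in sorted(set(neighborhoods))}
best = min(scores, key=scores.get)
print(scores)
print("best split:", best)
```

In this toy data "North" is the only neighborhood whose prices differ clearly from the others, so isolating it gives the lowest error.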

LayeredCompBaggingModel(tree_count=10, sample_pct=0.8, random_state=None, split_metric='mae', n_jobs=1)

  • fit(X, y): Build bagging ensemble. Automatically optimizes weight_falloff for each tree using an internal split.
  • predict(X): Return the average prediction of all trees.
  • estimators_: List of fitted LayeredCompModel instances.
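
The bagging idea can be sketched without the package: fit several estimators on random subsamples and average their predictions. MeanModel is a hypothetical stand-in for a fitted tree, sampling without replacement is an assumption, and the per-tree weight_falloff search is omitted.

```python
import random


class MeanModel:
    """Hypothetical stand-in for a fitted tree: predicts the training mean."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, n_rows):
        return [self.mean_] * n_rows


def fit_bagging(y, tree_count=10, sample_pct=0.8, random_state=None):
    rng = random.Random(random_state)
    sample_size = max(1, int(len(y) * sample_pct))
    estimators = []
    for _ in range(tree_count):
        # Assumption: each tree sees a subsample drawn without replacement.
        subsample = rng.sample(y, sample_size)
        estimators.append(MeanModel().fit(subsample))
    return estimators


def predict_bagging(estimators, n_rows):
    # Average the per-tree predictions row by row.
    per_tree = [est.predict(n_rows) for est in estimators]
    return [sum(col) / len(estimators) for col in zip(*per_tree)]


y = [float(v) for v in range(100, 200)]
estimators = fit_bagging(y, tree_count=10, sample_pct=0.8, random_state=0)
preds = predict_bagging(estimators, n_rows=3)
print(len(estimators), preds)
```

Averaging over subsamples is what reduces variance: any single subsample mean can drift, but the average of ten drifts less.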

See docs (TBD).

Examples

Run the quickstart:

python examples/quickstart.py

Expected output:

Predictions shape: (100,)
MAE: 126914
{'final_prediction': 530354.0426294187, 'weight_falloff': 0.8, 'path': [{'depth': 0, 'wilson_mean': 476353.91361128056, 'count': 100, 'is_leaf': False, 'filter_col': 'size_sqft', 'filter_val': 2101.366485546922}, {'depth': 1, 'wilson_mean': 553953.0606894617, 'count': 42, 'is_leaf': False, 'filter_col': 'neighborhood', 'filter_val': 'North'}, {'depth': 2, 'wilson_mean': 525096.3185716979, 'count': 13, 'is_leaf': True}], 'calculation': '0.199*476354 + 0.512*553953 + 0.289*525096 = 530354'}
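
The 'calculation' string in the output above is a weighted sum of the Wilson means along the path. The short check below recomputes it from the printed values. The weights are shown to three decimals only, which presumably explains why the recomputed figure is close to, but not exactly, final_prediction.

```python
# Values copied from the 'path' and 'calculation' fields of the output above.
wilson_means = [476353.91361128056, 553953.0606894617, 525096.3185716979]
weights = [0.199, 0.512, 0.289]  # rounded to three decimals in the output
reported = 530354.0426294187     # 'final_prediction'

# Path-weighted prediction: sum of weight * wilson_mean over the path nodes.
prediction = sum(w * m for w, m in zip(weights, wilson_means))
relative_gap = abs(prediction - reported) / reported

print(f"weights sum to {sum(weights):.3f}")
print(f"recomputed prediction: {prediction:,.0f}")
print(f"relative gap to final_prediction: {relative_gap:.4%}")
```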

Development & Testing

pytest tests/ --cov=layeredcompmodel
black src/
mypy src/

CI/CD, Sphinx docs: planned.

Citing

Kossa, J. (2026). LayeredCompModel. GitHub. https://github.com/JohnKossa/layeredcompmodel

License

MIT
