Hierarchical tree-based model for robust parcel sale price predictions using weighted Wilson means.

LayeredCompModel

Hierarchical tree-based regressor for robust predictions (e.g., parcel sale prices) using path-weighted Wilson means (95% trimmed means for outlier resistance).
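
To illustrate why a trimmed mean resists outliers, here is a minimal sketch. It assumes a 95% trimmed mean drops 2.5% of the values from each tail; the trimmed_mean helper is hypothetical and is not the library's implementation, whose exact Wilson-mean definition is not given here.

```python
import statistics


def trimmed_mean(values, tail_fraction=0.025):
    """Mean of the values left after dropping `tail_fraction` from each tail.

    Assumption: symmetric trimming, with the number of dropped values
    per tail rounded down. This is an illustration only.
    """
    ordered = sorted(values)
    # Small epsilon guards against float error when the product is a whole number.
    k = int(len(ordered) * tail_fraction + 1e-9)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


# 39 ordinary sale prices plus one extreme outlier.
prices = [480_000 + 1_000 * i for i in range(39)] + [5_000_000]

plain = statistics.mean(prices)
robust = trimmed_mean(prices)
print(f"plain mean:   {plain:.0f}")   # plain mean:   611525
print(f"trimmed mean: {robust:.0f}")  # trimmed mean: 499500
```

With 40 values, one value is dropped from each tail, so the single outlier no longer pulls the estimate upward.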

Features

  • Scikit-learn compatible: Inherits BaseEstimator/RegressorMixin; works with Pipeline, GridSearchCV, cross_val_score, pickling.
  • Automatic feature handling: Categorical (one-vs-rest splits), numeric (binary search breakpoints), NaNs/missing values.
  • Robust statistics: Wilson means limit the influence of outliers on each node's estimate.
  • Configurable weighting: weight_falloff balances local accuracy vs. market normativity.
  • Explainable: explain_value(row) shows path, weights, means.
  • Serializable: to_json(), to_dict().
  • Parallel: n_jobs support.
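
As a rough illustration of the one-vs-rest categorical splits mentioned in the list above, the sketch below scores each "category equals value" versus "everything else" partition. The one_vs_rest_splits function and its scoring rule (absolute error around each side's plain mean) are assumptions made for this example; the library's actual split criterion may differ.

```python
from statistics import mean


def one_vs_rest_splits(categories, targets):
    """Score every 'category == value' vs. 'rest' split, best first.

    Returns a list of (value, total_absolute_error). The scoring rule is an
    illustrative stand-in, not the library's actual criterion.
    """
    scores = []
    for value in sorted(set(categories)):
        inside = [t for c, t in zip(categories, targets) if c == value]
        outside = [t for c, t in zip(categories, targets) if c != value]
        if not inside or not outside:
            continue  # a split must leave rows on both sides
        inside_mean, outside_mean = mean(inside), mean(outside)
        error = sum(abs(t - inside_mean) for t in inside)
        error += sum(abs(t - outside_mean) for t in outside)
        scores.append((value, error))
    return sorted(scores, key=lambda item: item[1])


neighborhoods = ["North", "North", "South", "South", "East", "East"]
prices = [600_000, 620_000, 400_000, 410_000, 420_000, 390_000]

ranked = one_vs_rest_splits(neighborhoods, prices)
best_value, best_error = ranked[0]
print(best_value, best_error)  # North 60000
```

In this toy data, isolating "North" leaves two homogeneous groups, so it yields the lowest total error.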

NaN Handling

  • Categorical: Treated as distinct "NaN" category.
  • Numeric: Excluded from splits (robust; per SPEC.md).
  • Target y: Must be finite (raises ValueError).
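  • Strict checks: To guarantee NaN-free input, impute before the model, e.g. Pipeline([('imputer', SimpleImputer()), ('model', LayeredCompModel())]). Note that SimpleImputer's default strategy='mean' accepts only numeric columns.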
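
The rules above can be sketched in plain Python. All helper names here (is_missing, categorical_key, numeric_split_candidates, check_target) are hypothetical and exist only for this illustration; they are not part of the layeredcompmodel API.

```python
import math


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def categorical_key(value):
    """Map every missing categorical value to one shared 'NaN' category."""
    return "NaN" if is_missing(value) else value


def numeric_split_candidates(values):
    """Sorted distinct non-missing values; missing entries are left out of the split search."""
    return sorted({v for v in values if not is_missing(v)})


def check_target(y):
    """Reject targets that are missing or infinite."""
    for value in y:
        if is_missing(value) or not math.isfinite(value):
            raise ValueError("y must contain only finite values")


neighborhood = ["North", None, "South", float("nan")]
size_sqft = [1800.0, float("nan"), 2400.0, 2100.0]

keys = [categorical_key(v) for v in neighborhood]
candidates = numeric_split_candidates(size_sqft)
print(keys)        # ['North', 'NaN', 'South', 'NaN']
print(candidates)  # [1800.0, 2100.0, 2400.0]
```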

Installation

pip install layeredcompmodel

For development:

git clone https://github.com/JohnKossa/layeredcompmodel.git
cd layeredcompmodel
pip install -e .[dev]

Quickstart

import pandas as pd
import numpy as np
from layeredcompmodel import LayeredCompModel

# Synthetic real-estate-like data
rng = np.random.default_rng(42)
n_samples = 100
data = {
    'neighborhood': rng.choice(['North', 'South', 'East'], n_samples),
    'size_sqft': rng.normal(2000, 500, n_samples),
    'price': rng.normal(500000, 100000, n_samples) + 100 * rng.normal(0, 1, n_samples) * (rng.normal(0, 1, n_samples) * 2000)
}
df = pd.DataFrame(data)
X = df[['neighborhood', 'size_sqft']]
y = df['price']

# Train
model = LayeredCompModel(weight_falloff=0.8, n_jobs=1)
model.fit(X, y)

# Predict
predictions = model.predict(X)
print(f"Predictions shape: {predictions.shape}")
print(f"MAE: {np.mean(np.abs(predictions - y)):.0f}")

# Explain single prediction
explanation = model.explain_value(X.iloc[0:1].squeeze())
print(explanation)

API Reference

LayeredCompModel(weight_falloff=0.5, split_metric='mae', n_jobs=1)

  • fit(X, y): Build tree from features X (DataFrame), target y (Series).
  • predict(X): Predict using path-weighted means.
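  • explain_value(row): Dict with final_prediction, weight_falloff, path (one entry per node: depth, wilson_mean, count, is_leaf, plus filter_col/filter_val on non-leaf nodes), and a calculation string showing the per-node weights.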
  • to_json(indent=4): JSON tree dump.
  • tree_: Root CompNode (filter_col, filter_val, wilson_mean, children).

See docs (TBD).

Examples

See examples/quickstart.py for a runnable example (code matches Quickstart above).

Run it:

python examples/quickstart.py

Expected output:

Predictions shape: (100,)
MAE: 126914
{'final_prediction': 530354.0426294187, 'weight_falloff': 0.8, 'path': [{'depth': 0, 'wilson_mean': 476353.91361128056, 'count': 100, 'is_leaf': False, 'filter_col': 'size_sqft', 'filter_val': 2101.366485546922}, {'depth': 1, 'wilson_mean': 553953.0606894617, 'count': 42, 'is_leaf': False, 'filter_col': 'neighborhood', 'filter_val': 'North'}, {'depth': 2, 'wilson_mean': 525096.3185716979, 'count': 13, 'is_leaf': True}], 'calculation': '0.199*476354 + 0.512*553953 + 0.289*525096 = 530354'}
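
The calculation string in the output above can be checked by hand: the final prediction is a weighted sum of the Wilson means along the path. The sketch below copies the values from that output. Because the printed weights are rounded to three decimals, the recomputed sum (roughly 530,171) agrees with final_prediction only to within about 0.03%. How the weights themselves are derived from weight_falloff is not shown here.

```python
# Values copied from the expected output above.
explanation = {
    "final_prediction": 530354.0426294187,
    "path": [
        {"depth": 0, "wilson_mean": 476353.91361128056},
        {"depth": 1, "wilson_mean": 553953.0606894617},
        {"depth": 2, "wilson_mean": 525096.3185716979},
    ],
}

# Weights as printed (rounded) in the 'calculation' string.
rounded_weights = [0.199, 0.512, 0.289]

# Weighted sum of the per-node means along the path.
recomputed = sum(
    weight * node["wilson_mean"]
    for weight, node in zip(rounded_weights, explanation["path"])
)
relative_gap = abs(recomputed - explanation["final_prediction"]) / explanation["final_prediction"]

print(f"recomputed: {recomputed:.0f}")
print(f"relative gap: {relative_gap:.4%}")
```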

Development & Testing

pytest tests/ --cov=layeredcompmodel
black src/
mypy src/

CI/CD, Sphinx docs: planned.

Citing

Kossa, J. (2026). LayeredCompModel. GitHub. https://github.com/JohnKossa/layeredcompmodel

License

MIT
