Hierarchical tree-based model for robust parcel sale price predictions using weighted Wilson means.

LayeredCompModel

Hierarchical tree-based regressor for robust predictions (e.g., parcel sale prices) using path-weighted Wilson means (95% trimmed means for outlier resistance).
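To illustrate why a trimmed mean resists outliers, here is a minimal standard-library sketch. It assumes that "95% trimmed" means keeping the central 95% of the sorted values (dropping 2.5% from each tail); the package's exact trimming rule is not stated here, and `trimmed_mean` is a hypothetical helper, not part of the library.

```python
import math
from statistics import fmean


def trimmed_mean(values, keep=0.95):
    """Mean of the central `keep` fraction of the sorted values.

    Assumption: the trimmed share is split evenly between both tails and
    rounded down, so small samples may lose no observations at all.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < keep <= 1:
        raise ValueError("keep must be in (0, 1]")
    ordered = sorted(values)
    # Number of observations dropped from EACH tail.
    cut = math.floor(len(ordered) * (1 - keep) / 2)
    return fmean(ordered[cut:len(ordered) - cut])


# 39 ordinary sale prices plus one extreme outlier.
prices = [300_000 + 5_000 * i for i in range(39)] + [25_000_000]
plain = fmean(prices)
robust = trimmed_mean(prices)
print(f"plain mean:   {plain:,.0f}")
print(f"trimmed mean: {robust:,.0f}")
```

With this data the single outlier pulls the plain mean above 1,000,000, while the trimmed mean stays at 397,500.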

Features

  • Scikit-learn compatible: Inherits BaseEstimator/RegressorMixin; works with Pipeline, GridSearchCV, cross_val_score, pickling.
  • Automatic feature handling: Categorical (one-vs-rest splits), numeric (binary search breakpoints), NaNs/missing values.
  • Robust statistics: Wilson means (95% trimmed means) limit the influence of outliers on each node's estimate.
  • Configurable weighting: weight_falloff balances local accuracy vs. market normativity.
  • Explainable: explain_value(row) shows path, weights, means.
  • Serializable: to_json(), to_dict().
  • Parallel: n_jobs support.
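The one-vs-rest handling of categorical features can be pictured roughly as follows. This is only a sketch of the general idea (each distinct value against all other rows, with missing entries grouped under a "NaN" label); `one_vs_rest_candidates` is a hypothetical helper, and the package's actual split search and scoring may differ.

```python
import math


def one_vs_rest_candidates(column):
    """Yield (value, in_rows, rest_rows) for each distinct category.

    Missing entries (None or float NaN) are grouped under the label "NaN".
    This is an assumed illustration, not the library's implementation.
    """
    def label(v):
        if v is None or (isinstance(v, float) and math.isnan(v)):
            return "NaN"
        return v

    labels = [label(v) for v in column]
    # Sort by string form so mixed-type categories do not break ordering.
    for value in sorted(set(labels), key=str):
        in_rows = [i for i, lab in enumerate(labels) if lab == value]
        rest_rows = [i for i, lab in enumerate(labels) if lab != value]
        yield value, in_rows, rest_rows


neighborhood = ["North", "South", None, "East", "North", float("nan")]
for value, in_rows, rest_rows in one_vs_rest_candidates(neighborhood):
    print(value, in_rows, rest_rows)
```

A tree builder would then score each candidate partition (for example with the configured split metric) and keep the best one.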

NaN Handling

  • Categorical: Treated as distinct "NaN" category.
  • Numeric: Excluded from splits (robust; per SPEC.md).
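  • Target y: Must be finite; non-finite values raise ValueError.
  • Strict checks: To guarantee that no NaNs reach the model, impute upstream, e.g. Pipeline([('imputer', SimpleImputer()), ('model', LayeredCompModel())]). SimpleImputer's default strategy ('mean') only accepts numeric columns, so set the strategy per column type when X contains categoricals.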
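The finite-target rule can be checked before fitting. The sketch below shows one way to do that with the standard library; `check_finite_target` is a hypothetical helper written for illustration, and the error message the package itself produces is not known here.

```python
import math


def check_finite_target(y):
    """Return y as a list of floats, or raise ValueError on NaN/inf.

    Hypothetical pre-fit check; expects numeric inputs only.
    """
    values = [float(v) for v in y]
    bad = [i for i, v in enumerate(values) if not math.isfinite(v)]
    if bad:
        raise ValueError(f"y contains non-finite values at positions {bad}")
    return values


clean = check_finite_target([450_000, 512_500.0])
print(clean)

try:
    check_finite_target([450_000, float("nan"), float("inf")])
except ValueError as err:
    print(f"rejected: {err}")
```

Running a check like this on y before calling fit makes the failure explicit and points at the offending rows.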

Installation

pip install layeredcompmodel

For development:

git clone https://github.com/JohnKossa/layeredcompmodel.git
cd layeredcompmodel
pip install -e ".[dev]"

Quickstart

import pandas as pd
import numpy as np
from layeredcompmodel import LayeredCompModel

# Synthetic real-estate-like data
rng = np.random.default_rng(42)
n_samples = 100
data = {
    'neighborhood': rng.choice(['North', 'South', 'East'], n_samples),
    'size_sqft': rng.normal(2000, 500, n_samples),
    'price': rng.normal(500000, 100000, n_samples) + 100 * rng.normal(0, 1, n_samples) * (rng.normal(0, 1, n_samples) * 2000)
}
df = pd.DataFrame(data)
X = df[['neighborhood', 'size_sqft']]
y = df['price']

# Train
model = LayeredCompModel(weight_falloff=0.8, n_jobs=1)
model.fit(X, y)

# Predict on the training data (the MAE below is therefore in-sample)
predictions = model.predict(X)
print(f"Predictions shape: {predictions.shape}")
print(f"MAE: {np.mean(np.abs(predictions - y)):.0f}")

# Explain a single prediction; .squeeze() turns the one-row DataFrame into a Series
explanation = model.explain_value(X.iloc[0:1].squeeze())
print(explanation)

API Reference

LayeredCompModel(weight_falloff=0.5, split_metric='mae', n_jobs=1)

  • fit(X, y): Build tree from features X (DataFrame), target y (Series).
  • predict(X): Predict using path-weighted means.
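  • explain_value(row): Dict with final_prediction, weight_falloff, path (one entry per node with depth, wilson_mean, count, is_leaf and, for non-leaf nodes, filter_col/filter_val), and a calculation string showing the per-node weights.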
  • to_json(indent=4): JSON tree dump.
  • tree_: Root CompNode (filter_col, filter_val, wilson_mean, children).

Full documentation is planned (TBD).

Examples

See examples/quickstart.py for a runnable example (code matches Quickstart above).

Run it:

python examples/quickstart.py

Expected output:

Predictions shape: (100,)
MAE: 126914
{'final_prediction': 530354.0426294187, 'weight_falloff': 0.8, 'path': [{'depth': 0, 'wilson_mean': 476353.91361128056, 'count': 100, 'is_leaf': False, 'filter_col': 'size_sqft', 'filter_val': 2101.366485546922}, {'depth': 1, 'wilson_mean': 553953.0606894617, 'count': 42, 'is_leaf': False, 'filter_col': 'neighborhood', 'filter_val': 'North'}, {'depth': 2, 'wilson_mean': 525096.3185716979, 'count': 13, 'is_leaf': True}], 'calculation': '0.199*476354 + 0.512*553953 + 0.289*525096 = 530354'}
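The 'calculation' string in the output above is a weighted sum of the Wilson means along the path. The short check below recomputes it from the printed numbers. The weights are taken as printed (rounded to three decimals), so a small gap to final_prediction is expected; how the weights are derived from weight_falloff is not shown here and is not assumed.

```python
# Wilson means of the three path nodes, copied from the expected output.
path_means = [476353.91361128056, 553953.0606894617, 525096.3185716979]
# Weights as printed in the 'calculation' string (rounded to three decimals).
printed_weights = [0.199, 0.512, 0.289]
reported = 530354.0426294187

# Path-weighted prediction: sum of weight * node mean.
recomputed = sum(w * m for w, m in zip(printed_weights, path_means))
relative_gap = abs(recomputed - reported) / reported

print(f"weights sum to {sum(printed_weights):.3f}")
print(f"recomputed={recomputed:.0f}, reported={reported:.0f}, gap={relative_gap:.4%}")
```

The printed weights sum to one, and the recomputed value agrees with final_prediction to well under 0.1%, which is consistent with rounding in the displayed weights.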

Development & Testing

pytest tests/ --cov=layeredcompmodel
black src/
mypy src/

CI/CD, Sphinx docs: planned.

Citing

Kossa, J. (2026). LayeredCompModel. GitHub. https://github.com/JohnKossa/layeredcompmodel

License

MIT
