
A trained random forest model for predicting building structural type based on building footprint, height, area, and POI


Building Structure Type Prediction: Based on Area, Floors, Footprint, and POI Information


Introduction


Building footprint and height data are available in many map products or can be readily derived from high-resolution satellite imagery. Building structure types, by contrast, are essential inputs for structural performance analysis but are difficult to obtain. Although some studies [1-2] have developed prediction models for them, their data have not been made publicly available.

This project develops a random forest model that predicts building structure types (concrete frame, shear wall, steel frame, etc.) from building floor area, number of floors, building footprint, and surrounding Points of Interest (POI) information.

We manually compiled data for 2121 buildings from publicly available sources in Wuhan, China. Most belong to structure types common in Chinese cities: concrete frame (C1), concrete shear wall (C2, commonly residential), and light steel structure (S3, commonly industrial). Both the training data and the trained models are included in this repository.

Installation


# Create a Python 3.12 virtual environment named bldstructpred
conda create -n bldstructpred python=3.12

# Activate the environment
conda activate bldstructpred

# Install the BldStructPred package
pip install BldStructPred
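
To confirm that the installation succeeded, the installed version can be queried from Python. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of BldStructPred.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("BldStructPred")
    print(found if found else "BldStructPred is not installed in this environment")
```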

Random Forest Model


Training Data

Training dataset: data/武汉建筑训练数据_POI_LJJ.csv (Wuhan Building Training Data with POI)

Data source: Building data from Wuhan, collected through manual annotation (Original data source).

POI classification data: data/高德POI分类与编码(中英文)_V1.06_20230208.xlsx (Amap POI Classification and Encoding in Chinese and English).

Input

  • Floor area
  • Number of floors
  • Footprint coordinates
  • POI (optional): Distances to the 20 nearest POIs categorized as 'Residential Services' or 'Business Residential' within a 2 km radius of the building (obtainable through the Amap POI service)
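
The POI input described above can be pictured as a fixed-length list of distances. The sketch below is one possible way to build such a list from raw POI records, not the package's own preprocessing: the function name `poi_distance_features` and the choice to pad missing entries with the search radius are assumptions made for illustration.

```python
def poi_distance_features(pois, categories, radius_m=2000.0, k=20, fill=None):
    """Return the k smallest distances among POIs in the given top-level categories.

    pois: list of [distance_m, category1, category2, category3] records.
    Missing entries are padded with `fill` (assumed here to default to the radius).
    """
    if fill is None:
        fill = radius_m
    # Keep only POIs of the wanted category that lie inside the search radius
    nearby = sorted(
        record[0]
        for record in pois
        if record[1] in categories and record[0] <= radius_m
    )
    nearest = nearby[:k]
    return nearest + [fill] * (k - len(nearest))


poi = [[443.6, '商务住宅', '住宅区', '住宅小区']]
features = poi_distance_features(poi, categories={'商务住宅'})
print(len(features), features[:2])  # 20 [443.6, 2000.0]
```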

Output

  • Structure type: Building structure classification according to Hazus [3], e.g., C2 for concrete shear wall structures, S3 for light steel structures.
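
Because the output is a list of Hazus codes, a small lookup table can make results easier to read. The sketch below covers only the three codes described in this README; the names `HAZUS_LABELS` and `summarize` are hypothetical helpers, and other codes should be looked up in the Hazus manual [3].

```python
from collections import Counter

# Descriptions taken from this README; other Hazus codes are intentionally omitted
HAZUS_LABELS = {
    "C1": "concrete frame",
    "C2": "concrete shear wall",
    "S3": "light steel structure",
}


def summarize(predictions):
    """Count predicted codes and attach a readable description to each."""
    counts = Counter(predictions)
    return {
        code: (HAZUS_LABELS.get(code, "not listed in this README"), n)
        for code, n in sorted(counts.items())
    }


print(summarize(["C2", "C1", "C2", "S3"]))
```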

Model Performance

The random forest model trained using area, floors, footprint, and POI information achieves an overall accuracy of around 80%. The confusion matrix is shown below:

Confusion Matrix
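
For readers who want to relate a confusion matrix to the overall accuracy, the sketch below shows the arithmetic. The matrix values are invented for illustration and are not the project's results; rows are assumed to be true classes and columns predicted classes.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_recall(matrix, labels):
    """Share of each true class that was predicted correctly."""
    return {
        label: (matrix[i][i] / sum(matrix[i]) if sum(matrix[i]) else 0.0)
        for i, label in enumerate(labels)
    }


labels = ["C1", "C2", "S3"]
toy_matrix = [  # illustrative numbers only
    [40, 8, 2],
    [6, 50, 4],
    [3, 2, 35],
]
print(round(overall_accuracy(toy_matrix), 3))
print(per_class_recall(toy_matrix, labels))
```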

The feature importance is shown below. Besides area and number of floors, POI information also carries substantial weight. For example, a building surrounded by residential buildings is itself likely to be residential, and residential buildings in China are typically concrete shear wall structures, so nearby POIs help the model identify that structure type.

Feature Importance

Without POI data, the overall accuracy drops to 73%. The confusion matrix without POI data is shown below:

Confusion Matrix (No POI)

Usage


Model Training

Refer to the code in Examples/Example1.py for model training.
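
Examples/Example1.py remains the authoritative training script. For orientation only, the sketch below shows the general shape of training a random forest classifier and reading its feature importances with scikit-learn. It assumes scikit-learn and NumPy are installed, uses synthetic data with an invented labeling rule, and uses hypothetical feature names; it does not reproduce the package's feature engineering or its results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Wuhan dataset)
rng = np.random.default_rng(0)
n = 300
area = rng.uniform(100, 5000, n)
floors = rng.integers(1, 31, n)
poi_distance = rng.uniform(0, 2000, n)

# Invented rule so that the model has something to learn
y = np.where(floors > 10, "C2", np.where(area > 3000, "S3", "C1"))
X = np.column_stack([area, floors, poi_distance])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)  # overall accuracy on held-out data
importances = dict(
    zip(["area", "floors", "poi_distance"], model.feature_importances_)
)
print(f"accuracy: {accuracy:.2f}")
print(importances)
```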

Using the Trained Model for Prediction

Examples/Example2.py shows how to use the trained model for prediction:

from joblib import load
from pathlib import Path
import numpy as np
from packaging import version
import BldStructPred

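# Select the trained model file that matches the installed NumPy version
# (a separate file is provided for NumPy 1.26 and earlier)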
data_dir = Path(BldStructPred.__file__).parent / 'data'
np_version_obj = version.parse(np.__version__)
if np_version_obj < version.parse('1.27.0'):
    TRAINED_RF = data_dir / 'TrainedRF_numpy_v_1_26.joblib'
else:
    TRAINED_RF = data_dir / 'TrainedRF.joblib'

# Prepare building data (one entry per building)
Area = [32000, 500]                           # Building areas
Floor = [4, 10]                               # Numbers of floors
Footprint = [[(-80, -100), (80, -100), (80, 100), (-80, 100)],  # Footprint vertex coordinates
             [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)]]

# POI data for each building: [[distance, category1, category2, category3], ...]
# Amap category names are passed in Chinese ('商务住宅' is the 'Business Residential' category)
POI = [[[443.6, '商务住宅', '住宅区', '住宅小区']],
       [[294.7, '商务住宅', '住宅区', '住宅小区']]]

# Predict building structure types
clf = load(TRAINED_RF)
Y_test = clf.predict(Area, Floor, Footprint, POI)
print(Y_test)
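
In the example above, each Area value equals the area enclosed by the corresponding footprint polygon. Whether the package requires this relationship is not stated here, so the following shoelace-formula sketch is offered only as a consistency check for your own inputs; `polygon_area` is a hypothetical helper, not part of BldStructPred.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices via the shoelace formula."""
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to close the polygon
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


footprints = [
    [(-80, -100), (80, -100), (80, 100), (-80, 100)],
    [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)],
]
areas = [polygon_area(fp) for fp in footprints]
print(areas)  # [32000.0, 500.0]
```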

References


[1] Peng Zhou, Yuan Chang. Automated classification of building structures for urban built environment identification using machine learning. Journal of Building Engineering, 2021, 43: 103008.

[2] Zhen Xu, Yuan Wu, Ming-zhu Qi, Ming Zheng, Chen Xiong, Xinzheng Lu. Prediction of Structural Type for City-Scale Seismic Damage Simulation Based on Machine Learning. Applied Sciences, 2020, 10(5): 1795.

[3] FEMA. Hazus Inventory Technical Manual. Hazus 4.2 SP3. FEMA, 2021.

How to Cite


If you use BldStructPred in your research, please cite it as follows:

@software{you_ke_liu_2025,
  author       = {You, Tian and Ke, Ke and Liu, Jiajie},
  title        = {youtian95/BldStructPred: v0.1.0},
  month        = may,
  year         = 2025,
  publisher    = {Zenodo},
  version      = {v0.1.0},
  doi          = {10.5281/zenodo.15342789},
  url          = {https://doi.org/10.5281/zenodo.15342789}
}

Or in text format:

You, T., Ke, K., & Liu, J. (2025). youtian95/BldStructPred: v0.1.0 (v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.15342789
