A trained random forest model for predicting building structural type based on building footprint, height, area, and POI

Building Structure Type Prediction: Based on Area, Floors, Footprint, and POI Information

Introduction


Building footprint and height data are widely available in map products or can be readily extracted from high-resolution satellite imagery. Building structure types, however, are essential for structural performance analysis yet difficult to obtain. Although some studies [1-2] have developed prediction models for structure type, their data have not been made publicly available.

This project develops a random forest-based model for predicting building structure types (concrete frame, shear wall, steel frame, etc.) based on building floor area, number of floors, building footprint, and surrounding Points of Interest (POI) information.

We manually extracted data for 2121 buildings from publicly available sources in Wuhan, China. Most belong to structure types common in Chinese cities: concrete frame (C1), concrete shear wall (C2, commonly residential), and light steel structure (S3, commonly industrial). Both the training data and the trained models are fully available in this repository.

Installation


# Create a Python 3.12 virtual environment named bldstructpred
conda create -n bldstructpred python=3.12

# Activate the environment
conda activate bldstructpred

# Install the BldStructPred package
pip install BldStructPred
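
To confirm that the install succeeded inside the active environment, a quick check along these lines may help. It is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of BldStructPred.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("BldStructPred")
    if found:
        print(f"BldStructPred version: {found}")
    else:
        print("BldStructPred is not installed in this environment")
```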

Random Forest Model


Training Data

Training dataset: data/武汉建筑训练数据_POI_LJJ.csv (Wuhan Building Training Data with POI)

Data source: Building data from Wuhan, collected through manual annotation (Original data source).

POI classification data: data/高德POI分类与编码(中英文)_V1.06_20230208.xlsx (Amap POI Classification and Encoding in Chinese and English).

Input

  • Floor area
  • Number of floors
  • Footprint coordinates
  • POI (optional): The distances to the 20 nearest POIs categorized as 'Residential Services' or 'Business Residential' within a 2 km radius of the building (obtainable through the Amap POI service)
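
The POI selection described in the list above can be sketched as follows. This is an illustration only, not the package's own preprocessing: the function name nearest_pois is hypothetical, the record layout mirrors the [Distance, Category1, Category2, Category3] format used in the prediction example, distances are assumed to be in metres, and matching on Category1 is an assumption. The exact category names should be taken from the Amap POI classification file.

```python
from typing import Iterable, List, Set, Tuple

# One POI record: (distance, category1, category2, category3).
PoiRecord = Tuple[float, str, str, str]


def nearest_pois(
    pois: Iterable[PoiRecord],
    categories: Set[str],
    radius_m: float = 2000.0,
    limit: int = 20,
) -> List[PoiRecord]:
    """Keep POIs whose top-level category matches and that lie within the radius,
    sorted by distance, truncated to the `limit` nearest."""
    selected = [p for p in pois if p[1] in categories and p[0] <= radius_m]
    selected.sort(key=lambda p: p[0])
    return selected[:limit]


# Hypothetical sample records for illustration.
sample = [
    (443.6, "商务住宅", "住宅区", "住宅小区"),
    (2500.0, "商务住宅", "住宅区", "住宅小区"),   # outside the 2 km radius
    (120.0, "餐饮服务", "中餐厅", "中餐厅"),      # category not of interest
    (294.7, "商务住宅", "住宅区", "住宅小区"),
]
kept = nearest_pois(sample, categories={"商务住宅"})
print([p[0] for p in kept])  # [294.7, 443.6]
```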

Output

  • Structure type: Building structure classification according to Hazus [3], e.g., C2 for concrete shear wall structures, S3 for light steel structures.
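
For reporting, the predicted codes can be mapped to readable names. The sketch below covers only the three codes described in this README and assumes predictions arrive as plain code strings such as "C2"; the names STRUCTURE_TYPES and describe are hypothetical helpers, not part of the package.

```python
# Codes and descriptions as given in this README; other Hazus codes are
# documented in the Hazus Inventory Technical Manual [3].
STRUCTURE_TYPES = {
    "C1": "concrete frame",
    "C2": "concrete shear wall",
    "S3": "light steel structure",
}


def describe(code: str) -> str:
    """Return a readable name for a structure code, falling back to the raw code."""
    return STRUCTURE_TYPES.get(code, f"{code} (see the Hazus manual)")


# Hypothetical prediction output for two buildings.
for code in ["S3", "C2"]:
    print(code, "->", describe(code))
```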

Model Performance

The random forest model trained using area, floors, footprint, and POI information achieves an overall accuracy of around 80%. The confusion matrix is shown below:

Confusion Matrix
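
For readers less familiar with the metric, overall accuracy is the share of samples on the diagonal of the confusion matrix. The sketch below shows the computation on toy labels; the labels are illustrative only and are not the project's results, and the helper names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str]) -> Counter:
    """Count (true label, predicted label) pairs, i.e. a sparse confusion matrix."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return Counter(zip(y_true, y_pred))


def overall_accuracy(counts: Counter) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    total = sum(counts.values())
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / total if total else 0.0


# Toy labels for illustration only.
y_true = ["C1", "C2", "C2", "S3"]
y_pred = ["C1", "C2", "C1", "S3"]
counts = confusion_counts(y_true, y_pred)
print(counts[("C2", "C1")])      # 1 (one C2 building predicted as C1)
print(overall_accuracy(counts))  # 0.75
```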

Feature importance is shown below. Besides area and number of floors, POI information also contributes substantially. For example, a building surrounded by residential buildings is likely to be residential itself, and residential buildings in China are typically concrete shear wall structures.

Feature Importance

Without POI data, the overall accuracy drops to 73%. The confusion matrix without POI data is shown below:

Confusion Matrix (No POI)

Usage


Model Training

Refer to the code in Examples/Example1.py for model training.

Using the Trained Model for Prediction

Refer to the code in Examples/Example2.py for using the trained model for prediction:

from pathlib import Path
from pickle import load

import BldStructPred
from BldStructPred.StructPred import StructPred_RF

# Load the trained model bundled with the package
TRAINED_RF = Path(BldStructPred.__file__).parent / 'data/TrainedRF.pkl'
with open(TRAINED_RF, "rb") as f:
    clf = load(f)

# Prepare building data (one entry per building)
Area = [32000, 500]                           # List of building areas
Floor = [4, 10]                               # List of floor numbers
Footprint = [[(-80, -100), (80, -100), (80, 100), (-80, 100)],  # List of footprint coordinates
             [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)]]

# POI data per building: [[Distance, Category1, Category2, Category3], ...]
# Category strings are Amap names: '商务住宅' = Business Residential,
# '住宅区' = residential area, '住宅小区' = residential community
POI = [[[443.6, '商务住宅', '住宅区', '住宅小区']],
       [[294.7, '商务住宅', '住宅区', '住宅小区']]]

# Predict building structure types
result = clf.predict(Area, Floor, Footprint, POI)
print(result)
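
In the example above, each Area value equals the area enclosed by the corresponding footprint polygon (160 × 200 = 32000 and 25 × 20 = 500). This README does not state whether the model expects footprint area or total floor area, so treat this as an observation about the sample values only. A footprint's enclosed area can be checked with the shoelace formula, sketched below using only the standard library; polygon_area is a hypothetical helper, not a function of the package.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Footprints copied from the prediction example above.
footprints: List[List[Point]] = [
    [(-80, -100), (80, -100), (80, 100), (-80, 100)],
    [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)],
]
print([polygon_area(fp) for fp in footprints])  # [32000.0, 500.0]
```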

References


[1] Peng Zhou, Yuan Chang. Automated classification of building structures for urban built environment identification using machine learning. Journal of Building Engineering, 2021, 43: 103008.

[2] Zhen Xu, Yuan Wu, Ming-zhu Qi, Ming Zheng, Chen Xiong, Xinzheng Lu. Prediction of Structural Type for City-Scale Seismic Damage Simulation Based on Machine Learning. Applied Sciences, 2020, 10(5): 1795.

[3] FEMA. Hazus Inventory Technical Manual. Hazus 4.2 SP3. FEMA, 2021.

How to Cite


If you use BldStructPred in your research, please cite it as follows:

@software{you_ke_liu_2025,
  author       = {You, Tian and Ke, Ke and Liu, Jiajie},
  title        = {youtian95/BldStructPred: v0.1.0},
  month        = may,
  year         = 2025,
  publisher    = {Zenodo},
  version      = {v0.1.0},
  doi          = {10.5281/zenodo.15342789},
  url          = {https://doi.org/10.5281/zenodo.15342789}
}

Or in text format:

You, T., Ke, K., & Liu, J. (2025). youtian95/BldStructPred: v0.1.0 (v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.15342789
