A trained random forest model for predicting building structural type based on building footprint, height, area, and POI
Building Structure Type Prediction: Based on Area, Floors, Footprint, and POI Information
Introduction
Building footprint and height data are included in many map products or can be readily derived from high-resolution satellite imagery. Building structure types, however, are essential for structural performance analysis yet difficult to acquire. Although some studies [1-2] have developed prediction models for structure type, their data have not been made publicly available.
This project develops a random forest-based model for predicting building structure types (concrete frame, shear wall, steel frame, etc.) based on building floor area, number of floors, building footprint, and surrounding Points of Interest (POI) information.
We manually compiled a dataset of 2121 buildings from publicly available sources in Wuhan, China. Most belong to structure types that are common in Chinese cities: concrete frame (C1), concrete shear wall (C2, commonly residential), and light steel structure (S3, commonly industrial). Both the training data and the trained models are fully available in this repository.
Installation
# Create a Python 3.12 virtual environment named bldstructpred
conda create -n bldstructpred python=3.12
# Activate the environment
conda activate bldstructpred
# Install the BldStructPred package
pip install BldStructPred
Random Forest Model
Training Data
Training dataset: data/武汉建筑训练数据_POI_LJJ.csv (Wuhan Building Training Data with POI)
Data source: Building data from Wuhan, collected through manual annotation (Original data source).
POI classification data: data/高德POI分类与编码(中英文)_V1.06_20230208.xlsx (Amap POI Classification and Encoding in Chinese and English).
Input
- Floor area
- Number of floors
- Footprint coordinates
- POI (optional): distances to the 20 nearest POIs categorized as 'Residential Services' or 'Business Residential' within a 2 km radius of the building (obtainable through the Amap POI service)
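The POI input described in the list above can be prepared with a small filtering step. The following is a minimal sketch, not the package's own preprocessing: the helper name `select_nearest_pois` is hypothetical, and it assumes each POI record follows the `[Distance, Category1, Category2, Category3]` layout used in the prediction example, with the distance in metres and the filter applied to the top-level category. Only the category name '商务住宅' is taken from that example; the Amap name for the second category should be looked up in the POI classification table.

```python
def select_nearest_pois(pois, categories, radius_m=2000.0, k=20):
    """Return up to k POI records in the given categories within radius_m, nearest first.

    Hypothetical helper for illustration; it is not part of BldStructPred.
    Each record is [distance_m, category1, category2, category3].
    """
    in_range = [p for p in pois if p[1] in categories and p[0] <= radius_m]
    return sorted(in_range, key=lambda p: p[0])[:k]


# Made-up POI records around one building (distances in metres).
raw_pois = [
    [2500.0, '商务住宅', '住宅区', '住宅小区'],   # outside the 2 km radius
    [443.6, '商务住宅', '住宅区', '住宅小区'],
    [100.0, '餐饮服务', '中餐厅', '中餐厅'],      # category not of interest
    [294.7, '商务住宅', '住宅区', '住宅小区'],
]

selected = select_nearest_pois(raw_pois, categories={'商务住宅'})
print(selected)
```

Sorting after filtering keeps the helper simple; for a few hundred POIs per building the cost is negligible.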
Output
- Structure type: Building structure classification according to Hazus [3], e.g., C2 for concrete shear wall structures, S3 for light steel structures.
Model Performance
The random forest model trained using area, floors, footprint, and POI information achieves an overall accuracy of around 80%. The confusion matrix is shown below:
The feature importance is shown below. Besides area and number of floors, POI information is also significant. For example, if residential buildings are distributed nearby, the building in question is likely also residential. In China, residential buildings are typically concrete shear wall structures, so POI information contributes significantly to the model.
Without POI data, the overall accuracy drops to 73%. The confusion matrix without POI data is shown below:
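For readers who want to recompute these metrics on their own predictions, the following is a minimal sketch of how a confusion matrix and the overall accuracy relate. The function names are hypothetical and the labels are toy values invented for illustration; they are not drawn from the Wuhan dataset and do not reproduce the reported results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true labels and columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def overall_accuracy(matrix):
    """Overall accuracy is the diagonal sum divided by the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy labels for illustration only.
labels = ['C1', 'C2', 'S3']
y_true = ['C1', 'C2', 'C2', 'S3', 'C1']
y_pred = ['C1', 'C2', 'C1', 'S3', 'C1']

cm = confusion_matrix(y_true, y_pred, labels)
print(cm)
print(overall_accuracy(cm))
```

In this toy case one C2 building is misclassified as C1, so four of the five predictions fall on the diagonal.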
Usage
Model Training
Refer to the code in Examples/Example1.py for model training.
Using the Trained Model for Prediction
Refer to the code in Examples/Example2.py for using the trained model for prediction:
from pickle import load
from pathlib import Path

import BldStructPred
from BldStructPred.StructPred import StructPred_RF

# Load the trained model shipped with the package
TRAINED_RF = Path(BldStructPred.__file__).parent / 'data/TrainedRF.pkl'
with open(TRAINED_RF, "rb") as f:
    clf = load(f)

# Prepare building data (one entry per building)
Area = [32000, 500]  # List of building areas
Floor = [4, 10]  # List of floor numbers
Footprint = [[(-80, -100), (80, -100), (80, 100), (-80, 100)],  # List of footprint coordinates
             [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)]]
# POI data: [[Distance, Category1, Category2, Category3], ...] for each building
POI = [[[443.6, '商务住宅', '住宅区', '住宅小区']],
       [[294.7, '商务住宅', '住宅区', '住宅小区']]]

# Predict building structure types
result = clf.predict(Area, Floor, Footprint, POI)
print(result)
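When preparing inputs, it can help to sanity-check footprint polygons before prediction. The sketch below uses the shoelace formula to compute the area enclosed by a footprint; `polygon_area` is a hypothetical helper and not part of BldStructPred. For the two footprints in the example above, the polygon areas happen to equal the corresponding `Area` values. Whether the package expects this relationship in general is not stated here, so treat the comparison as an illustration only.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered (x, y) vertices (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Footprints copied from the prediction example.
footprints = [
    [(-80, -100), (80, -100), (80, 100), (-80, 100)],
    [(-12.5, -10), (12.5, -10), (12.5, 10), (-12.5, 10)],
]

areas = [polygon_area(fp) for fp in footprints]
print(areas)
```

Taking the absolute value makes the result independent of whether the vertices are listed clockwise or counterclockwise.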
References
[1] Peng Zhou, Yuan Chang. Automated classification of building structures for urban built environment identification using machine learning. Journal of Building Engineering, 2021, 43: 103008.
[2] Zhen Xu, Yuan Wu, Ming-zhu Qi, Ming Zheng, Chen Xiong, Xinzheng Lu. Prediction of Structural Type for City-Scale Seismic Damage Simulation Based on Machine Learning. Applied Sciences, 2020, 10(5): 1795.
[3] FEMA. Hazus Inventory Technical Manual. Hazus 4.2 SP3. FEMA, 2021.
How to Cite
If you use BldStructPred in your research, please cite it as follows:
@software{you_ke_liu_2025,
  author    = {You, Tian and Ke, Ke and Liu, Jiajie},
  title     = {youtian95/BldStructPred: v0.1.0},
  month     = may,
  year      = 2025,
  publisher = {Zenodo},
  version   = {v0.1.0},
  doi       = {10.5281/zenodo.15342789},
  url       = {https://doi.org/10.5281/zenodo.15342789}
}
Or in text format:
You, T., Ke, K., & Liu, J. (2025). youtian95/BldStructPred: v0.1.0 (v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.15342789