Machine learning tool for completing building attributes in urban energy models with URBANopt-BuildStock post-processing
Urban System Generator
NLR Software Record: SWR 25-36
Urban System Generator (USG) is a machine learning-based tool for completing missing building attributes in urban energy modeling workflows. Given partial building information from GeoJSON files, USG predicts missing characteristics like HVAC systems, insulation levels, and other energy-relevant properties. The tool includes post-processing for URBANopt-BuildStock compatibility.
Features
- GeoJSON Processing: Extract and convert building data from GeoJSON files
- Attribute Prediction: Use trained neural networks to predict missing building attributes
- Batch Processing: Process multiple buildings efficiently
- Post-Processing Pipeline: Validate and fix outputs for URBANopt-BuildStock compatibility
  - Missing column filler for required URBANopt parameters
  - Schema validation against options_lookup.tsv
  - Cross-field consistency enforcement (HVAC, PV, foundations, etc.)
- Simulation Ready: Output formatted for URBANopt and other urban energy simulation tools
Installation
# Clone the repository
git clone https://github.com/NatLabRockies/urban-system-generator.git
cd urban-system-generator
# Create a virtual environment
python3 -m venv .venv # Or try `python -m venv .venv` if `python3` doesn't work
# Activate the virtual environment
source .venv/bin/activate
# Upgrade packaging tools
pip install --upgrade pip setuptools wheel
# Install the package and dependencies
pip install -e .
Command Line Interface (CLI)
After installation, the usg command is available and exposes each step of the workflow:
# Convert GeoJSON to CSV
usg geojson2csv -i buildings.json -o buildings.csv
# Predict missing attributes for a single building
usg predict-single -a "Vintage" "2000s" -a "Bedrooms" "3"
# Process batch of buildings (ML inference)
usg complete -i incomplete.csv -o completed.csv
# Post-process for URBANopt-BuildStock compatibility
usg process -i completed.csv -o uo_buildstock_mapping.csv -g buildings.json
# Run complete workflow (GeoJSON → Inference → Post-processing)
usg workflow -i buildings.json -o uo_buildstock_mapping.csv --verbose
# Get help
usg --help
For detailed CLI documentation, see docs/cli_usage.md.
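The individual commands above can also be chained from a script when the intermediate CSV files are worth keeping. The following is a minimal sketch, not part of USG: it assumes that running geojson2csv, complete, and process in sequence covers the same stages as usg workflow, and the helper names and intermediate file names are hypothetical. Only the subcommands and flags shown above are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_pipeline_commands(geojson, workdir, final_csv):
    """Return the three usg commands for GeoJSON -> inference -> post-processing."""
    workdir = Path(workdir)
    incomplete = workdir / "incomplete.csv"  # hypothetical intermediate file name
    completed = workdir / "completed.csv"    # hypothetical intermediate file name
    return [
        ["usg", "geojson2csv", "-i", str(geojson), "-o", str(incomplete)],
        ["usg", "complete", "-i", str(incomplete), "-o", str(completed)],
        ["usg", "process", "-i", str(completed), "-o", str(final_csv), "-g", str(geojson)],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing step


commands = build_pipeline_commands("buildings.json", "out", "uo_buildstock_mapping.csv")
run_pipeline(commands)  # dry run: prints the commands without executing them
```

Passing dry_run=False executes the steps; the working directory must already exist.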
Quick Start (Python API)
1. Convert GeoJSON to CSV
from usg.geojson_processor import GeoJSONProcessor
from usg.resources.model_attributes import all_model_attributes

processor = GeoJSONProcessor()
processor.geojson_to_csv(
    geojson_path="buildings.json",
    output_csv_path="buildings_incomplete.csv",
    all_model_attributes=all_model_attributes,
)
2. Predict Missing Attributes
from pathlib import Path

from usg.inference import USGInference

# Path to the bundled resources (assumes this script sits in the repository root)
usg_dir = Path(__file__).parent / "usg" / "resources" / "pretrained_model"

# Initialize inference engine
inference = USGInference(
    model_path=usg_dir / "adaptive_model_1.keras",
    cat_scaler_path=usg_dir / "cat_scaler.pkl",
    num_scaler_path=usg_dir / "num_scaler.pkl",
    encoding_dict_path=usg_dir / "encoding_mapper.json",
)

# Process buildings
inference.process_buildings_batch(
    input_csv_path="buildings_incomplete.csv",
    output_csv_path="buildings_complete.csv",
)
Note: When using the CLI (usg command), resource paths are automatically resolved. The manual paths above are only needed for direct Python API usage.
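Because the Python API takes explicit paths, a wrong resource directory tends to surface only when the model is loaded. A small pre-flight check can report which files are absent first. This is a sketch under stated assumptions: the helper missing_model_files is hypothetical and not part of USG, and the file names are the four listed in this README.

```python
from pathlib import Path

# File names taken from the pretrained_model directory described in this README.
REQUIRED_MODEL_FILES = (
    "adaptive_model_1.keras",
    "cat_scaler.pkl",
    "num_scaler.pkl",
    "encoding_mapper.json",
)


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_MODEL_FILES if not (model_dir / name).is_file()]


# Relative to the repository root; adjust for your own layout.
model_dir = Path("usg") / "resources" / "pretrained_model"
missing = missing_model_files(model_dir)
if missing:
    print(f"Missing from {model_dir}: {', '.join(missing)}")
else:
    print("All pretrained model files found.")
```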
3. Single Building Prediction
# Reuses the `inference` object created in step 2
known_attrs = {
    'Geometry Building Type RECS': 'Single-Family Detached',
    'Vintage': '2000s',
    'Geometry Floor Area': '1500-1999',
    'Geometry Stories': '2',
    'Bedrooms': '3',
}
completed = inference.predict_missing_single(known_attrs)
print(completed)
4. Post-Process for URBANopt-BuildStock
from usg.postprocessor import USGPostProcessor, get_default_resource_paths

# Get bundled resource paths (recommended)
options_lookup, consistency_rules = get_default_resource_paths()

# Initialize post-processor
postprocessor = USGPostProcessor(
    options_lookup_path=options_lookup,
    consistency_rules_path=consistency_rules,
)

# Process inference output for URBANopt compatibility
postprocessor.process(
    input_csv_path="buildings_complete.csv",
    output_csv_path="uo_buildstock_mapping.csv",
    geojson_path="buildings.json",  # Optional: for climate zone extraction
    generate_reports=True,  # Generate validation reports
)
The post-processor performs three steps:
- Missing Column Filler: Adds required URBANopt-BuildStock columns with appropriate defaults
- Schema Validator: Validates all values against options_lookup.tsv and fixes invalid entries
- Consistency Processor: Enforces cross-field constraints (HVAC configurations, PV systems, etc.)
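To make the order of the three steps concrete, here is a toy sketch on a single row. It is not USG's implementation: the allowed options, defaults, and the one consistency rule are invented stand-ins for options_lookup.tsv and consistency_rules.json, and every function name is hypothetical.

```python
# Illustrative stand-ins only; the real schema lives in options_lookup.tsv.
ALLOWED = {
    "HVAC Cooling Type": {"None", "Central AC", "Room AC"},
    "HVAC Cooling Efficiency": {"None", "Standard", "High"},
}
DEFAULTS = {"HVAC Cooling Type": "None", "HVAC Cooling Efficiency": "None"}


def fill_missing_columns(row, defaults):
    """Step 1: add every required column the row lacks, using its default."""
    return {**defaults, **row}


def validate_schema(row, allowed, defaults):
    """Step 2: replace any value outside the allowed set with the default."""
    fixed = dict(row)
    for column, options in allowed.items():
        if fixed.get(column) not in options:
            fixed[column] = defaults[column]
    return fixed


def enforce_consistency(row):
    """Step 3: example rule: no cooling system implies no cooling efficiency."""
    fixed = dict(row)
    if fixed["HVAC Cooling Type"] == "None":
        fixed["HVAC Cooling Efficiency"] = "None"
    return fixed


def postprocess(row):
    """Apply the three steps in order and return a new row."""
    row = fill_missing_columns(row, DEFAULTS)
    row = validate_schema(row, ALLOWED, DEFAULTS)
    return enforce_consistency(row)


# An invalid cooling type is reset, which in turn clears the efficiency.
print(postprocess({"HVAC Cooling Type": "Unknown", "HVAC Cooling Efficiency": "High"}))
```

Running consistency checks last matters in this sketch: a value corrected by schema validation can change which cross-field rules apply.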
Repository Structure
urban-system-generator/
├── usg/ # Main package
│ ├── __init__.py # Package initialization
│ ├── inference.py # Inference engine
│ ├── geojson_processor.py # GeoJSON processing
│ ├── postprocessor.py # URBANopt post-processing
│ ├── model.py # Neural network model
│ ├── cli.py # Command-line interface
│ └── resources/ # Bundled model and post-processor resources
│ ├── pretrained_model/ # Trained model files
│ │ ├── adaptive_model_1.keras
│ │ ├── cat_scaler.pkl
│ │ ├── num_scaler.pkl
│ │ └── encoding_mapper.json
│ ├── postprocessor/ # Post-processor resources
│ │ ├── options_lookup.tsv # URBANopt-BuildStock schema
│ │ └── consistency_rules.json # Cross-field constraints
│ └── model_attributes.py # Model schema
├── examples/ # Usage examples
├── docs/ # Documentation
└── tests/ # Unit tests
Input Data Format
GeoJSON Requirements
Buildings in your GeoJSON should have the following properties:
{
  "type": "Feature",
  "properties": {
    "id": "building_001",
    "type": "Building",
    "building_type": "Single-Family Detached",
    "year_built": 2005,
    "floor_area": 2000,
    "number_of_stories": 2,
    "number_of_bedrooms": 3,
    "number_of_residential_units": 1
  },
  "geometry": {...}
}
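Before converting a file, it can help to see which of these properties each building actually carries. The sketch below uses only the standard library; the helper names are hypothetical, and the property list is copied from the example feature above. Whether USG treats any of them as strictly required is not stated here, so the result is a report rather than a hard validation.

```python
import json

# Property names copied from the example feature in this README.
EXPECTED_PROPERTIES = (
    "id", "type", "building_type", "year_built", "floor_area",
    "number_of_stories", "number_of_bedrooms", "number_of_residential_units",
)


def missing_properties(feature):
    """Return the expected property names that are absent or null."""
    props = feature.get("properties") or {}
    return [name for name in EXPECTED_PROPERTIES if props.get(name) is None]


def report_missing(geojson_text):
    """Map each building id to the expected properties it lacks."""
    data = json.loads(geojson_text)
    # Accept either a FeatureCollection or a single Feature.
    features = data.get("features", [data])
    report = {}
    for index, feature in enumerate(features):
        missing = missing_properties(feature)
        if missing:
            props = feature.get("properties") or {}
            building_id = props.get("id") or f"feature_{index}"
            report[building_id] = missing
    return report


sample = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"id": "building_001", "type": "Building", "year_built": 2005},
        "geometry": None,
    }],
})
print(report_missing(sample))
```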
Supported Building Types
- Single-Family Detached
- Single-Family Attached
- Multifamily (2-4 units)
- Multifamily (5+ units)
- Mobile Home
Model Attributes
The model can predict the following building attributes:
- HVAC Cooling Type
- HVAC Heating Type
- Heating Fuel
- Foundation Type
- Attic Type
- Insulation levels
- Window characteristics
- And more...
See usg/resources/model_attributes.py for the complete list.
Examples
See the examples/ directory for complete usage examples:
- example_usage.py - Complete workflow demonstrations
- example_geojson.json - Sample GeoJSON file
Testing
The repository includes unit and integration tests.
# Run all tests
python run_tests.py
# Run only unit tests
python run_tests.py unit
# Run integration tests
python run_tests.py integration
# Run specific test module
python run_tests.py inference
python run_tests.py geojson
# Using pytest (alternative)
pytest tests/
# Run with coverage
pytest --cov=usg tests/
Test Structure
- tests/test_inference.py - Unit tests for the inference engine
- tests/test_geojson_processor.py - Unit tests for GeoJSON processing
- tests/test_integration.py - End-to-end integration tests
Authors & Contributors
Lead Developer: Rawad El Kontar
Organization: National Laboratory of the Rockies (NLR)
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
Copyright (c) 2025-2026, Alliance for Energy Innovation, LLC
This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.
Citation
If you use this software in your research, please cite:
@software{urban_system_generator,
  title={Urban System Generator},
  author={El Kontar, Rawad},
  organization={National Laboratory of the Rockies},
  year={2025},
  url={https://github.com/NatLabRockies/urban-system-generator}
}
Acknowledgments
This work was authored by the National Laboratory of the Rockies (NLR) for the U.S. Department of Energy (DOE), operated under Contract No. DE-AC36-08GO28308. Funding was provided by NLR’s Laboratory Directed Research and Development program and the DOE Office of Science. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
Contact
For questions or support, please contact Rawad El Kontar (rawad.elkontar@nlr.gov) or open an issue on GitHub.