bciAVM is a machine learning pipeline used to predict property prices.
Project description
The Blockchain & Climate Institute (BCI) is a progressive think tank providing leading expertise in the deployment of emerging technologies for climate and sustainability actions.
As an international network of scientific and technological experts, BCI is at the forefront of innovative efforts, enabling technology transfers to create a sustainable and clean global future.
Automated Valuation Model (AVM)
About
AVM is a term for a service that uses mathematical modeling combined with databases of existing properties and transactions to calculate real estate values. The majority of automated valuation models (AVMs) compare the values of similar properties at the same point in time. Many appraisers, and even Wall Street institutions, use this type of model to value residential properties. (See "What is an AVM", Investopedia.com.)
For more detailed info about the AVM, please read the About paper in the resources directory.
Valuation Process
Key Functionality
- Supervised algorithms
- Tree-based & deep learning algorithms
- Feature engineering derived from small clusters of similar properties
- Ensemble (value blending) approaches
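The last two items above can be illustrated with a minimal sketch. It is not taken from the bciAVM source: the function names, the feature names (`floor_area`, `rooms`), the weights, and all the figures are made up for illustration. The first function derives a feature from a small cluster of comparable sales, and the second blends several model estimates into one value.

```python
from statistics import median


def comparable_median_price(target, sold, k=3):
    """Median sale price of the k sold properties closest to `target`.

    Distance is plain Euclidean distance over (floor_area, rooms); a real
    pipeline would scale the features and probably use location as well.
    """
    def distance(prop):
        return ((prop["floor_area"] - target["floor_area"]) ** 2
                + (prop["rooms"] - target["rooms"]) ** 2) ** 0.5

    nearest = sorted(sold, key=distance)[:k]
    return median(prop["price"] for prop in nearest)


def blend_values(predictions, weights):
    """Weighted average of per-model price estimates (value blending)."""
    total_weight = sum(weights[name] for name in predictions)
    weighted_sum = sum(predictions[name] * weights[name] for name in predictions)
    return weighted_sum / total_weight


# Toy data, invented for this example only
sold = [
    {"floor_area": 70, "rooms": 3, "price": 250_000},
    {"floor_area": 72, "rooms": 3, "price": 260_000},
    {"floor_area": 68, "rooms": 3, "price": 240_000},
    {"floor_area": 150, "rooms": 6, "price": 700_000},
]
target = {"floor_area": 71, "rooms": 3}

# Feature derived from the three most similar sold properties
cluster_feature = comparable_median_price(target, sold, k=3)

# Blend three hypothetical model estimates with hypothetical weights
model_estimates = {"knn": 250_000, "xgboost": 260_000, "mlp": 270_000}
model_weights = {"knn": 1.0, "xgboost": 2.0, "mlp": 1.0}
blended = blend_values(model_estimates, model_weights)
```

In the pipeline shown later in this page, the blending step appears to be learned by a final linear regressor rather than set by hand; fixed weights are used here only to keep the sketch short.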
Set the required AWS Environment Variables
export ACCESS_KEY=YOURACCESS_KEY
export SECRET_KEY=YOURSECRET_KEY
export BUCKET_NAME=bci-transition-risk-data
export TABLE_DIRECTORY=/dbfs/FileStore/tables/
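Before loading data it can help to confirm that these variables are visible to the Python process. The check below is a small sketch using only the standard library; the helper name `missing_env_vars` is hypothetical, and how bciavm itself reads these variables is not shown here.

```python
import os

# Variable names taken from the export lines above
REQUIRED_VARS = ("ACCESS_KEY", "SECRET_KEY", "BUCKET_NAME", "TABLE_DIRECTORY")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```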
Install from PyPI
pip install bciavm
Start
Load the training data from the BCI S3 bucket
from bciavm.core.config import your_bucket
from bciavm.utils.bci_utils import ReadParquetFile, get_postcodeOutcode_from_postcode, get_postcodeArea_from_outcode, drop_outliers, preprocess_data
import pandas as pd

dfPrices = pd.DataFrame()

# Read one parquet extract per year, then combine them into a single frame
yearArray = ['2020', '2019']
yearlyFrames = []
for year in yearArray:
    singlePriceEpcFile = pd.DataFrame(ReadParquetFile(your_bucket, 'epc_price_data/byDate/2021-02-04/parquet/' + year))
    yearlyFrames.append(singlePriceEpcFile)
# DataFrame.append was removed in pandas 2.0; pd.concat works in old and new versions
dfPricesEpc = pd.concat(yearlyFrames)

# Derive the postcode outcode and postcode area columns
dfPricesEpc['POSTCODE_OUTCODE'] = dfPricesEpc['Postcode'].apply(get_postcodeOutcode_from_postcode)
dfPricesEpc['POSTCODE_AREA'] = dfPricesEpc['POSTCODE_OUTCODE'].apply(get_postcodeArea_from_outcode)

# Count rows per matching type
dfPricesEpc.groupby('TypeOfMatching_m').count()['Postcode']
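For readers unfamiliar with UK postcodes, the two derived columns can be pictured with the stand-in functions below. These are hypothetical re-implementations written for illustration, not the code behind `get_postcodeOutcode_from_postcode` or `get_postcodeArea_from_outcode`, whose exact behaviour may differ. The sketch assumes the standard postcode layout: the inward code is the last three characters, the outcode is everything before it, and the area is the leading letters of the outcode.

```python
import re


def outcode_from_postcode(postcode):
    """Outward code: the postcode without its 3-character inward code."""
    compact = postcode.replace(" ", "").upper()
    return compact[:-3]


def area_from_outcode(outcode):
    """Postcode area: the leading letters of the outcode."""
    match = re.match(r"[A-Z]+", outcode.upper())
    return match.group(0) if match else ""


outcode = outcode_from_postcode("SW1A 1AA")
area = area_from_outcode(outcode)
print(outcode, area)
```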
Preprocess & split the data for training/testing
import bciavm
X_train, X_test, y_train, y_test = bciavm.preprocess_data(dfPricesEpc)
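What `preprocess_data` does internally is not described here. As a rough picture of the splitting half of such a step, the sketch below performs a seeded random hold-out split on a list of records. The function name `holdout_split`, the `price` target column, the 20% test fraction, and the toy rows are all assumptions made for this example.

```python
import random


def holdout_split(rows, target_key, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly, then split into train/test features and targets."""
    rng = random.Random(seed)
    shuffled = rows[:]          # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test_rows, train_rows = shuffled[:n_test], shuffled[n_test:]

    def split_xy(part):
        features = [{k: v for k, v in row.items() if k != target_key} for row in part]
        targets = [row[target_key] for row in part]
        return features, targets

    train_features, train_targets = split_xy(train_rows)
    test_features, test_targets = split_xy(test_rows)
    return train_features, test_features, train_targets, test_targets


# Toy records, invented for this example only
toy_rows = [{"floor_area": 50 + i, "price": 100_000 + 1_000 * i} for i in range(10)]
Xtr, Xte, ytr, yte = holdout_split(toy_rows, target_key="price")
```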
Build the pipeline and get the default pipeline parameters
from bciavm.pipelines import RegressionPipeline
class AVMPipeline(RegressionPipeline):
    custom_name = 'AVM Pipeline'
    component_graph = {
        "Preprocess Transformer": ["Preprocess Transformer"],
        'Imputer': ['Imputer', "Preprocess Transformer"],
        'One Hot Encoder': ['One Hot Encoder', "Imputer"],
        'K Nearest Neighbors Regressor': ['K Nearest Neighbors Regressor', 'One Hot Encoder'],
        "XGBoost Regressor": ["XGBoost Regressor", 'One Hot Encoder'],
        'MultiLayer Perceptron Regressor': ['MultiLayer Perceptron Regressor', 'One Hot Encoder'],
        'Final Estimator': ['Linear Regressor', "XGBoost Regressor", 'MultiLayer Perceptron Regressor', 'K Nearest Neighbors Regressor']
    }

avm_pipeline = AVMPipeline(parameters={})
avm_pipeline.parameters
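One way to read the component graph is as a dependency graph: each entry appears to list the component type first, followed by the names of the components whose output it consumes. Under that assumption (it is an interpretation of the layout above, not a documented guarantee), the standard-library `graphlib` module (Python 3.9+) can show an order in which the components could be evaluated.

```python
from graphlib import TopologicalSorter

# Same graph as in the pipeline definition above
component_graph = {
    "Preprocess Transformer": ["Preprocess Transformer"],
    "Imputer": ["Imputer", "Preprocess Transformer"],
    "One Hot Encoder": ["One Hot Encoder", "Imputer"],
    "K Nearest Neighbors Regressor": ["K Nearest Neighbors Regressor", "One Hot Encoder"],
    "XGBoost Regressor": ["XGBoost Regressor", "One Hot Encoder"],
    "MultiLayer Perceptron Regressor": ["MultiLayer Perceptron Regressor", "One Hot Encoder"],
    "Final Estimator": ["Linear Regressor", "XGBoost Regressor",
                        "MultiLayer Perceptron Regressor", "K Nearest Neighbors Regressor"],
}

# Assumption: element 0 is the component type, the rest are its inputs
dependencies = {name: spec[1:] for name, spec in component_graph.items()}

# static_order() yields every node after all of its dependencies
evaluation_order = list(TopologicalSorter(dependencies).static_order())
for step in evaluation_order:
    print(step)
```

Read this way, the three regressors all consume the one-hot-encoded features, and the final linear regressor combines their three outputs, which corresponds to the value blending approach listed under Key Functionality.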
Fit the pipeline
avm_pipeline.fit(X_train, y_train)
Score the pipeline
avm_pipeline.score(
    X_test,
    y_test,
    objectives=['MAPE', 'MdAPE', 'ExpVariance', 'MaxError', 'MedianAE',
                'MSE', 'MAE', 'R2', 'Root Mean Squared Error'],
)
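Two of the objectives above, MAPE and MdAPE, are the mean and the median of the absolute percentage errors. The sketch below computes both by hand using their standard definitions. It is not the bciavm implementation, the prices are invented, and whether the library reports percentages or fractions is not confirmed here.

```python
from statistics import mean, median


def absolute_percentage_errors(y_true, y_pred):
    """Per-property error as a percentage of the true price."""
    return [100 * abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]


def mape(y_true, y_pred):
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(y_true, y_pred))


def mdape(y_true, y_pred):
    """Median absolute percentage error, less sensitive to a few large misses."""
    return median(absolute_percentage_errors(y_true, y_pred))


# Toy prices, invented for this example only
true_prices = [200_000, 250_000, 400_000]
predicted_prices = [210_000, 250_000, 300_000]

mape_value = mape(true_prices, predicted_prices)
mdape_value = mdape(true_prices, predicted_prices)
print(mape_value, mdape_value)
```

In this toy case one large miss pulls the mean well above the median, which is a common reason to report both figures for property valuations.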
Next Steps
Read more about bciAVM on our documentation page:
How does it relate to BCI Risk Modeling?
Technical & financial support provided by CCG Analytics
Source Distribution
File details
Details for the file bciavm-1.21.28.3.tar.gz.
File metadata
- Download URL: bciavm-1.21.28.3.tar.gz
- Upload date:
- Size: 19.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f94f8c12e2719d0153329daabbb6abb25085427a27c04af2c5e18f86c792ab2f
MD5 | 80654286e63ebe650eef342d8bcc6ee6
BLAKE2b-256 | f13892bac52e77904a0245f5589f215e8910d3b2b6e9fece935b48297fec8e86