bciAVM is a machine learning pipeline used to predict property prices.

bciAVM

The Blockchain & Climate Institute (BCI) is a progressive think tank providing leading expertise in the deployment of emerging technologies for climate and sustainability actions.

As an international network of scientific and technological experts, BCI is at the forefront of innovative efforts that enable technology transfers to create a sustainable and clean global future.

Automated Valuation Model (AVM)

About

An automated valuation model (AVM) is a service that combines mathematical modeling with databases of existing properties and transactions to calculate real estate values. Most AVMs compare the values of similar properties at the same point in time. Many appraisers, and even Wall Street institutions, use this type of model to value residential properties. (See "What is an AVM" on Investopedia.com.)
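To make the "compare similar properties" idea concrete, here is a minimal sketch of a comparables-based estimate. It is not how bciAVM computes values; the function name `estimate_from_comparables` and the price-per-square-metre heuristic are assumptions chosen only for illustration.

```python
from statistics import median


def estimate_from_comparables(subject_area_m2, comparables):
    """Estimate a value from recently sold, similar properties.

    `comparables` is a list of (sale_price, floor_area_m2) tuples.
    Assumption: similarity is reduced to floor area only, which is far
    simpler than a real AVM.
    """
    price_per_m2 = [price / area for price, area in comparables if area > 0]
    if not price_per_m2:
        raise ValueError("at least one comparable with a positive area is required")
    # The median is less sensitive to a single unusual sale than the mean.
    return median(price_per_m2) * subject_area_m2


comparables = [(200_000, 100), (300_000, 100), (250_000, 125)]
estimate = estimate_from_comparables(80, comparables)
```

A production model would also match on location, property type, and sale date, which is where the learned pipeline described below comes in.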

For more detail about the AVM, see the About paper in the resources directory.

Valuation Process

Key Functionality

  • Supervised algorithms
  • Tree-based & deep learning algorithms
  • Feature engineering derived from small clusters of similar properties
  • Ensemble (value blending) approaches
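As a rough illustration of value blending, the sketch below combines per-model price predictions into one value with a linear blend. The pipeline defined later in this README uses a 'Linear Regressor' as its 'Final Estimator', which learns such weights from data; the function `blend`, the model keys, and the fixed weights here are hypothetical.

```python
def blend(predictions, weights, intercept=0.0):
    """Return a linear blend of per-model predictions for one property.

    `predictions` and `weights` are dicts keyed by model name.
    Assumption: the weights are given; in practice they would be fitted.
    """
    if predictions.keys() != weights.keys():
        raise ValueError("predictions and weights must cover the same models")
    return intercept + sum(weights[name] * predictions[name] for name in predictions)


# Hypothetical predictions from three base models for a single property.
predictions = {"knn": 200_000.0, "xgboost": 220_000.0, "mlp": 210_000.0}
weights = {"knn": 0.25, "xgboost": 0.5, "mlp": 0.25}
blended_value = blend(predictions, weights)
```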

Set the required AWS Environment Variables

export ACCESS_KEY=YOURACCESS_KEY
export SECRET_KEY=YOURSECRET_KEY
export BUCKET_NAME=bci-transition-risk-data
export TABLE_DIRECTORY=/dbfs/FileStore/tables/
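Before loading data, it can help to confirm that these variables are visible to Python. The check below is a small sketch using only the standard library; the helper `missing_env_vars` is hypothetical and not part of bciavm.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("ACCESS_KEY", "SECRET_KEY", "BUCKET_NAME", "TABLE_DIRECTORY")


def missing_env_vars(environ=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```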

Install from PyPI

pip install bciavm

Start

Load the training data from the BCI S3 bucket

import pandas as pd

from bciavm.core.config import your_bucket
from bciavm.utils.bci_utils import ReadParquetFile, get_postcodeOutcode_from_postcode, get_postcodeArea_from_outcode, drop_outliers, preprocess_data

# Read one parquet extract per year and stack them into a single DataFrame.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
yearArray = ['2020', '2019']
yearlyFrames = [
    pd.DataFrame(ReadParquetFile(your_bucket, 'epc_price_data/byDate/2021-02-04/parquet/' + year))
    for year in yearArray
]
dfPricesEpc = pd.concat(yearlyFrames)

# Derive the postcode outcode and postcode area from the full postcode.
dfPricesEpc['POSTCODE_OUTCODE'] = dfPricesEpc['Postcode'].apply(get_postcodeOutcode_from_postcode)
dfPricesEpc['POSTCODE_AREA'] = dfPricesEpc['POSTCODE_OUTCODE'].apply(get_postcodeArea_from_outcode)

# Count the records for each matching type.
dfPricesEpc.groupby('TypeOfMatching_m').count()['Postcode']
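For readers unfamiliar with UK postcodes, the two derived columns can be pictured with the stand-in functions below. They are hypothetical and are not the bciavm implementations of `get_postcodeOutcode_from_postcode` or `get_postcodeArea_from_outcode`; they assume a well-formed postcode whose inward code is the final three characters.

```python
import re


def outcode_from_postcode(postcode: str) -> str:
    """Return the outward code, e.g. 'SW1A' for 'SW1A 1AA'."""
    compact = re.sub(r"\s+", "", postcode).upper()
    # Assumption: the inward code is always the last three characters.
    return compact[:-3]


def area_from_outcode(outcode: str) -> str:
    """Return the postcode area, i.e. the leading letters of the outcode."""
    match = re.match(r"[A-Z]+", outcode.upper())
    return match.group(0) if match else ""


outcode = outcode_from_postcode("sw1a 1aa")
area = area_from_outcode(outcode)
```

Grouping by outcode or area gives progressively coarser geographic buckets, which is presumably why both columns are added before training.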

Preprocess & split the data for training/testing

import bciavm
X_train, X_test, y_train, y_test = bciavm.preprocess_data(dfPricesEpc)

Build the pipeline and get the default pipeline parameters

from bciavm.pipelines import RegressionPipeline

class AVMPipeline(RegressionPipeline):
    custom_name = 'AVM Pipeline'
    component_graph = {
        "Preprocess Transformer": ["Preprocess Transformer"],
        'Imputer': ['Imputer', "Preprocess Transformer"],
        'One Hot Encoder': ['One Hot Encoder', "Imputer"],
        'K Nearest Neighbors Regressor': ['K Nearest Neighbors Regressor', 'One Hot Encoder'],
        "XGBoost Regressor": ["XGBoost Regressor", 'One Hot Encoder'],
        'MultiLayer Perceptron Regressor': ['MultiLayer Perceptron Regressor', 'One Hot Encoder'],
        'Final Estimator': ['Linear Regressor', "XGBoost Regressor", 'MultiLayer Perceptron Regressor', 'K Nearest Neighbors Regressor']
    }

avm_pipeline = AVMPipeline(parameters={})
avm_pipeline.parameters
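The component graph reads as a small dependency graph. The sketch below assumes that the first item of each entry names the component and the remaining items name the nodes feeding into it; that reading is inferred from the dictionary above, not confirmed by the bciavm documentation. Under that assumption, the standard library's `graphlib` (Python 3.9+) gives one valid evaluation order. The helper names are hypothetical.

```python
from graphlib import TopologicalSorter

component_graph = {
    "Preprocess Transformer": ["Preprocess Transformer"],
    "Imputer": ["Imputer", "Preprocess Transformer"],
    "One Hot Encoder": ["One Hot Encoder", "Imputer"],
    "K Nearest Neighbors Regressor": ["K Nearest Neighbors Regressor", "One Hot Encoder"],
    "XGBoost Regressor": ["XGBoost Regressor", "One Hot Encoder"],
    "MultiLayer Perceptron Regressor": ["MultiLayer Perceptron Regressor", "One Hot Encoder"],
    "Final Estimator": ["Linear Regressor", "XGBoost Regressor",
                        "MultiLayer Perceptron Regressor", "K Nearest Neighbors Regressor"],
}


def parents_of(graph):
    """Map each node to its inputs (assumed to be entry[1:])."""
    return {name: entry[1:] for name, entry in graph.items()}


def evaluation_order(graph):
    """Return one order in which every node comes after all of its inputs."""
    return list(TopologicalSorter(parents_of(graph)).static_order())


order = evaluation_order(component_graph)
print(order)
```

Read this way, the three regressors each consume the one-hot encoded features, and the final linear regressor blends their predictions.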

Fit the pipeline

avm_pipeline.fit(X_train, y_train)

Score the pipeline

avm_pipeline.score(
    X_test,
    y_test,
    objectives=[
        'MAPE',
        'MdAPE',
        'ExpVariance',
        'MaxError',
        'MedianAE',
        'MSE',
        'MAE',
        'R2',
        'Root Mean Squared Error',
    ],
)
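Two of the objectives above, MAPE and MdAPE, are the mean and median of the absolute percentage errors. The sketch below shows the usual definitions with the standard library only. Whether bciavm reports them as percentages or as fractions is not stated here, so the scaling by 100 is an assumption, as are the function names.

```python
from statistics import mean, median


def absolute_percentage_errors(y_true, y_pred):
    """Return |actual - predicted| / |actual| as a percentage per property."""
    # Assumption: no actual value is zero, which holds for sale prices.
    return [100 * abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]


def mape(y_true, y_pred):
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(y_true, y_pred))


def mdape(y_true, y_pred):
    """Median absolute percentage error, less sensitive to outliers."""
    return median(absolute_percentage_errors(y_true, y_pred))


actual = [100_000.0, 200_000.0, 400_000.0]
predicted = [110_000.0, 180_000.0, 400_000.0]
mape_value = mape(actual, predicted)
mdape_value = mdape(actual, predicted)
```

MdAPE is often the headline figure for valuation models because a few badly mispriced properties can dominate the mean.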

Next Steps

Read more about bciAVM on our documentation page:

How does it relate to BCI Risk Modeling?

Technical & financial support provided by CCG Analytics

ccganalytics.com
