
A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and Agglomerative clustering


bldg_point_clustering

PyPI Package: https://pypi.org/project/bldg-point-clustering/

Docs: https://bldg-point-clustering.readthedocs.io/en/latest/

Introduction

A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and Agglomerative clustering.

Installation

Using pip with Python 3.5+, run:

$ pip install bldg_point_clustering

Quick Start

Instantiate a Featurizer object and use it to get a featurized Pandas DataFrame.

Instantiate a Cluster object and pass in the featurized DataFrame. Then, call a clustering method with the appropriate parameters.

Use the plot_3D function in the plotter module to create a 3D plot of the metrics returned by any of the clustering trials.
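To make the featurization step more concrete, here is a minimal, standard-library sketch of what a bag-of-words representation of point labels can look like. It does not use or reproduce the package's Featurizer: the tokenization rule (lowercase alphabetic runs) and the sample labels are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(label):
    """Split a label into lowercase alphabetic tokens (an assumed scheme)."""
    return re.findall(r"[a-z]+", label.lower())


def bag_of_words(labels):
    """Return (vocabulary, rows); each row counts the vocabulary tokens in one label."""
    counts = [Counter(tokenize(label)) for label in labels]
    # The vocabulary is the sorted set of every token seen in any label.
    vocabulary = sorted(set().union(*counts))
    rows = [[count[token] for token in vocabulary] for count in counts]
    return vocabulary, rows


# Hypothetical point labels, not taken from any real dataset.
labels = ["AHU1_Supply_Air_Temp", "AHU2_Supply_Air_Temp", "VAV3_Zone_Temp"]
vocabulary, rows = bag_of_words(labels)
```

Labels that differ only in a numeric suffix end up with identical rows under this scheme, which is the kind of similarity a clustering algorithm can then pick up on.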

Example Usage

Running one iteration of the KMeans algorithm.

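import pandas as pd
import numpy as np
from bldg_point_clustering.cluster import Cluster
from bldg_point_clustering.featurizer import Featurizer

# Dataset name; the CSV is read from ./datasets/<filename>.csv
filename = "GBSF"

df = pd.read_csv("./datasets/" + filename + ".csv")

# Use the first column of the dataset as the text corpus
first_column = df.iloc[:, 0]

f = Featurizer(filename, corpus=first_column)

# Build the featurized DataFrame
featurized_df = f.bag_of_words()

c = Cluster(df, featurized_df)

# Run a single KMeans pass with 300 clusters
clustered_df = c.kmeans(n_clusters=300, plot=True, to_csv=True)

# Retrieve the metrics recorded for the run
metrics = c.get_metrics_df()

# Average the Levenshtein scores returned for the run
avg_levenshtein_score = np.mean(c.get_levenshtein_scores())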
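The README does not state how get_levenshtein_scores computes its values. As one plausible reading, the sketch below computes the mean pairwise Levenshtein (edit) distance between the labels of a single cluster, using only the standard library. The function names levenshtein and mean_pairwise_distance are hypothetical and are not part of the package.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings, using a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def mean_pairwise_distance(labels):
    """Average edit distance over all unordered pairs of labels (0.0 if fewer than two)."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)


# Hypothetical labels assigned to one cluster.
cluster_labels = ["AHU1_Temp", "AHU2_Temp", "AHU3_Temp"]
score = mean_pairwise_distance(cluster_labels)
```

Under this reading, a lower average means the labels grouped into a cluster are textually closer to one another.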

Running several iterations of the KMeans algorithm.

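from bldg_point_clustering.plotter import plot_3D

# Reuses the Cluster object `c` created in the previous example
c.kmeans_trials()

# Retrieve the metrics recorded across the trials
metrics = c.get_metrics_df()

# Plot three of the metrics columns against each other
plot_3D(metrics, "n_clusters", "Avg Levenshtein Score", "Silhouette Score")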

The process is similar for the DBSCAN and Agglomerative clustering methods.

