A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and Agglomerative clustering
bldg_point_clustering
PyPI package: https://pypi.org/project/bldg-point-clustering/
Docs: https://bldg-point-clustering.readthedocs.io/en/latest/
Introduction
A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and Agglomerative clustering.
Installation
Using pip for Python 3.5+, run:
$ pip install bldg_point_clustering
Quick Start
Instantiate a Featurizer object and get a featurized pandas DataFrame.
Instantiate a Cluster object and pass in the featurized DataFrame. Then call a clustering method with the appropriate parameters.
Use the plot_3D function in the plotter module to create a 3D plot of the metrics returned by any of the clustering trials.
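To make the featurization step more concrete, here is a minimal pure-Python sketch of a bag-of-words encoding for point labels. It is not the Featurizer's actual implementation: the tokenization rule (lowercasing and splitting on non-alphanumeric characters) and the sample labels are assumptions made for illustration only.

```python
import re
from collections import Counter


def bag_of_words(labels):
    """Return (vocabulary, rows); each row counts the vocabulary tokens in one label."""
    # Assumed tokenization: lowercase, then split on any non-alphanumeric character.
    tokenized = [
        [token for token in re.split(r"[^A-Za-z0-9]+", label.lower()) if token]
        for label in labels
    ]
    # The vocabulary is the sorted set of all tokens seen in any label.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[token] for token in vocabulary])
    return vocabulary, rows


# Hypothetical point labels, invented for this example.
labels = ["AHU1.SupplyTemp", "AHU1.ReturnTemp", "VAV-3.ZoneTemp"]
vocabulary, rows = bag_of_words(labels)
```

Each row is a fixed-length numeric vector, which is the kind of input a clustering algorithm such as KMeans expects.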
Example Usage
Running one iteration of the KMeans algorithm.
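import pandas as pd
import numpy as np
from bldg_point_clustering.cluster import Cluster
from bldg_point_clustering.featurizer import Featurizer
# Load the dataset; the first column is used as the corpus to featurize.
filename = "GBSF"
df = pd.read_csv("./datasets/" + filename + ".csv")
first_column = df.iloc[:, 0]
# Featurize the corpus with a bag-of-words encoding.
f = Featurizer(filename, corpus=first_column)
featurized_df = f.bag_of_words()
# Cluster the featurized data with a single KMeans run.
c = Cluster(df, featurized_df)
clustered_df = c.kmeans(n_clusters=300, plot=True, to_csv=True)
# Retrieve the metrics for the run and average the Levenshtein scores.
metrics = c.get_metrics_df()
avg_levenshtein_score = np.mean(c.get_levenshtein_scores())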
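The example above averages the Levenshtein scores, but this page does not define how the package computes them. As a rough illustration of the underlying idea, the sketch below assumes a cluster's score is the mean edit distance over all pairs of labels in that cluster. The function names and sample labels are hypothetical and are not part of the package.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if the characters match)
            ))
        previous = current
    return previous[-1]


def mean_pairwise_distance(labels):
    """Average edit distance over all unordered pairs; 0.0 for fewer than two labels."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)


# Hypothetical labels that ended up in the same cluster.
cluster_labels = ["AHU1.SAT", "AHU2.SAT", "AHU1.RAT"]
score = mean_pairwise_distance(cluster_labels)
```

Under this assumed definition, a lower score means the labels in a cluster are textually closer to one another.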
Running several iterations of the KMeans algorithm.
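from bldg_point_clustering.plotter import plot_3D
# Reuses the Cluster object `c` created in the previous example.
c.kmeans_trials()
metrics = c.get_metrics_df()
# Plot the metrics collected across the trials in 3D.
plot_3D(metrics, "n_clusters", "Avg Levenshtein Score", "Silhouette Score")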
The process is similar for DBSCAN and Agglomerative clustering.
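Besides plotting, trial metrics can also be compared programmatically. The sketch below picks the trial with the highest silhouette score, a common heuristic because silhouette values closer to 1 indicate better-separated clusters. The numbers are invented for illustration, the `best_trial` helper is hypothetical, and only the column names are taken from the plotting call above.

```python
# Hypothetical trial results, invented for illustration only.
trials = [
    {"n_clusters": 100, "Avg Levenshtein Score": 6.2, "Silhouette Score": 0.31},
    {"n_clusters": 200, "Avg Levenshtein Score": 4.8, "Silhouette Score": 0.42},
    {"n_clusters": 300, "Avg Levenshtein Score": 4.1, "Silhouette Score": 0.38},
]


def best_trial(rows, metric="Silhouette Score"):
    """Return the row with the highest value of `metric`."""
    return max(rows, key=lambda row: row[metric])


best = best_trial(trials)
print(best["n_clusters"])
```

If the metrics are held in a pandas DataFrame, `metrics.to_dict("records")` should produce rows in this shape, assuming the DataFrame uses these column names.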