bldg_point_clustering
PyPi Package: https://pypi.org/project/bldg-point-clustering/
Docs: https://bldg-point-clustering.readthedocs.io/en/latest/
Introduction
A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and Agglomerative clustering.
Installation
Using pip for Python 3.5+, run:
$ pip install bldg_point_clustering
Quick Start
Instantiate a Featurizer object and use it to get a featurized pandas DataFrame.
Instantiate a Cluster object and pass the featurized DataFrame to it. Then call a clustering method with the appropriate parameters.
Use the plot_3D function in the plotter module to create a 3D plot of the metrics returned by any of the clustering trials.
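To make the featurization step more concrete, here is a minimal bag-of-words sketch in plain Python. It is not the package's own Featurizer code: the tokenization rule (lowercase alphabetic runs) and the sample labels are assumptions chosen for illustration.

```python
import re
from collections import Counter


def bag_of_words(labels):
    """Return (vocabulary, rows); each row holds the token counts for one label."""
    # Assumed tokenization: lowercase alphabetic runs, so digits and
    # punctuation act as separators.
    tokenized = [re.findall(r"[a-z]+", label.lower()) for label in labels]
    # A sorted vocabulary fixes the column order of the feature matrix.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[token] for token in vocabulary])
    return vocabulary, rows


# Hypothetical point labels, invented for this example.
labels = ["AHU1.ZoneTemp", "AHU2.ZoneTemp", "VAV3.DamperPos"]
vocabulary, rows = bag_of_words(labels)
print(vocabulary)
for label, row in zip(labels, rows):
    print(label, row)
```

Labels that differ only in their numeric index end up with identical rows, which is the kind of similarity a clustering step can then pick up.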
Example Usage
Running one iteration of the KMeans algorithm.
import pandas as pd
import numpy as np
from bldg_point_clustering.cluster import Cluster
from bldg_point_clustering.featurizer import Featurizer

# Load the dataset; the first column is used as the label corpus.
filename = "GBSF"
df = pd.read_csv("./datasets/" + filename + ".csv")
first_column = df.iloc[:, 0]

# Featurize the labels with a bag-of-words representation.
f = Featurizer(filename, corpus=first_column)
featurized_df = f.bag_of_words()

# Run KMeans once with 300 clusters.
c = Cluster(df, featurized_df)
clustered_df = c.kmeans(n_clusters=300, plot=True, to_csv=True)

# Collect the metrics for this run and average the Levenshtein scores.
metrics = c.get_metrics_df()
avg_levenshtein_score = np.mean(c.get_levenshtein_scores())
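The example above averages a set of Levenshtein scores. As a rough illustration of what such a score can capture, the sketch below computes the mean pairwise edit distance inside each cluster in plain Python. Whether get_levenshtein_scores computes exactly this quantity is an assumption; the helper names and the sample clusters are hypothetical.

```python
from itertools import combinations
from statistics import mean


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def mean_pairwise_distance(members):
    """Average edit distance over all pairs in one cluster; 0.0 for singletons."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 0.0
    return mean(levenshtein(a, b) for a, b in pairs)


# Hypothetical clustering result: cluster id -> point labels.
clusters = {
    0: ["AHU1.ZoneTemp", "AHU2.ZoneTemp"],
    1: ["VAV3.DamperPos"],
}
scores = {cid: mean_pairwise_distance(m) for cid, m in clusters.items()}
print(scores)
```

Under this reading, a low average means the labels grouped together are textually close, so lower values suggest tighter clusters.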
Running several iterations of the KMeans algorithm.
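from bldg_point_clustering.plotter import plot_3D

# Reuses the Cluster object c created in the previous example.
c.kmeans_trials()

# Plot the metrics gathered across the trials.
metrics = c.get_metrics_df()
plot_3D(metrics, "n_clusters", "Avg Levenshtein Score", "Silhouette Score")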
The process is similar for DBSCAN and Agglomerative clustering.
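For readers who want to see the two other algorithms on bag-of-words features, the sketch below calls scikit-learn directly. It assumes scikit-learn is installed and does not use this package's own API; the sample labels and the parameter values (eps, min_samples, n_clusters) are illustrative choices, not recommended defaults.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical point labels, invented for this example.
labels = [
    "AHU1.ZoneTemp", "AHU2.ZoneTemp", "AHU3.ZoneTemp",
    "VAV1.DamperPos", "VAV2.DamperPos", "VAV3.DamperPos",
]

# Bag-of-words features; the token pattern keeps alphabetic runs only.
vectorizer = CountVectorizer(token_pattern=r"[A-Za-z]+")
features = vectorizer.fit_transform(labels).toarray()

# DBSCAN groups points that lie within eps of each other; the number of
# clusters is discovered rather than specified.
dbscan_labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(features)

# Agglomerative clustering merges points bottom-up until n_clusters remain.
agglo_labels = AgglomerativeClustering(n_clusters=2).fit_predict(features)

# Silhouette score is one way to compare the resulting partitions.
dbscan_silhouette = silhouette_score(features, dbscan_labels)
agglo_silhouette = silhouette_score(features, agglo_labels)
print(dbscan_labels, dbscan_silhouette)
print(agglo_labels, agglo_silhouette)
```

The main practical difference is the tuning parameter: DBSCAN is driven by a distance threshold, while KMeans and Agglomerative clustering take the number of clusters up front.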