
A Python 3.5+ wrapper for clustering building point labels using KMeans, DBSCAN, and agglomerative clustering


bldg_point_clustering

PyPI package: https://pypi.org/project/bldg-point-clustering/

Docs: https://bldg-point-clustering.readthedocs.io/en/latest/

Introduction

A Python 3.5+ wrapper for clustering building point labels using the KMeans, DBSCAN, and agglomerative clustering algorithms.

Installation

Using pip with Python 3.5+, run:

$ pip install bldg_point_clustering

Quick Start

Instantiate a Featurizer object and get a featurized pandas DataFrame.

Instantiate a Cluster object, passing in the original DataFrame and the featurized DataFrame. Then call a clustering method with the appropriate parameters.

Use the plot_3D function in the plotter module to create a 3D plot of the metrics returned by any of the clustering trials.
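To make the featurization step more concrete, here is a minimal sketch of a bag-of-words encoding for point labels, using only the standard library. It is not the package's implementation: the tokenization rule, the function names tokenize and bag_of_words_matrix, and the sample labels are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(label):
    """Split a point label into lowercase runs of letters or digits.

    Assumption: this rule is illustrative only; the package's Featurizer
    may tokenize labels differently.
    """
    return [token.lower() for token in re.findall(r"[A-Za-z]+|\d+", label)]


def bag_of_words_matrix(labels):
    """Return (vocabulary, rows), where each row counts tokens per label."""
    token_lists = [tokenize(label) for label in labels]
    # Sorted vocabulary gives a stable column order.
    vocabulary = sorted({token for tokens in token_lists for token in tokens})
    rows = []
    for tokens in token_lists:
        counts = Counter(tokens)
        rows.append([counts[token] for token in vocabulary])
    return vocabulary, rows


# Hypothetical point labels, made up for this example.
sample_labels = ["AHU1.SupplyTemp", "AHU2.SupplyTemp", "VAV12.ZoneTemp"]
vocabulary, rows = bag_of_words_matrix(sample_labels)
for label, row in zip(sample_labels, rows):
    print(label, row)
```

Each row is a fixed-length count vector, which is the kind of numeric input a clustering algorithm can work with.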

Example Usage

Running one iteration of the KMeans algorithm.

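import pandas as pd
import numpy as np
from bldg_point_clustering.cluster import Cluster
from bldg_point_clustering.featurizer import Featurizer

# Name of the dataset; the CSV is read from ./datasets/<filename>.csv
filename = "GBSF"
df = pd.read_csv("./datasets/" + filename + ".csv")

# The first column of the CSV is used as the corpus to featurize
first_column = df.iloc[:, 0]

# Build the bag-of-words features
f = Featurizer(filename, corpus=first_column)
featurized_df = f.bag_of_words()

# Cluster with KMeans using 300 clusters
c = Cluster(df, featurized_df)
clustered_df = c.kmeans(n_clusters=300, plot=True, to_csv=True)

# Retrieve the metrics and average the Levenshtein scores
metrics = c.get_metrics_df()
avg_levenshtein_score = np.mean(c.get_levenshtein_scores())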
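For readers unfamiliar with the Levenshtein metric, the sketch below computes the edit distance between two strings and averages it over all pairs in a group of labels. How the package defines its Levenshtein scores is not stated here, so the pairwise-mean definition, the function names, and the sample labels are assumptions for illustration only.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def mean_pairwise_levenshtein(labels):
    """Average edit distance over all unordered pairs of labels.

    Assumption: one plausible per-cluster score, not necessarily the
    package's definition.
    """
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(x, y) for x, y in pairs) / len(pairs)


# Hypothetical cluster of similar point labels.
cluster_labels = ["AHU1.SAT", "AHU2.SAT", "AHU3.SAT"]
print(mean_pairwise_levenshtein(cluster_labels))
```

Under this definition, a lower average suggests that the labels grouped into one cluster are textually similar to each other.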

Running several iterations of the KMeans algorithm.

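from bldg_point_clustering.plotter import plot_3D

# Reuses the Cluster object c created in the previous example
c.kmeans_trials()

# Metrics collected across the trials
metrics = c.get_metrics_df()

# Plot the three named metrics columns against each other in 3D
plot_3D(metrics, "n_clusters", "Avg Levenshtein Score", "Silhouette Score")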

The process is similar for DBSCAN and agglomerative clustering.
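The silhouette score that appears among the plotted metrics can be illustrated with a short standard-library sketch. This is the textbook definition with Euclidean distance, not necessarily how the package computes it; the function name, the toy points, and the use of math.dist (which requires Python 3.8+) are assumptions for illustration.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(points[i], points[j]) for j in members)
                for other, members in clusters.items() if other != label)
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator else 0.0)
    return mean(scores)


# Toy one-dimensional points, made up for this example.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(silhouette_score(points, [0, 0, 1, 1]))  # well-separated grouping
print(silhouette_score(points, [0, 1, 0, 1]))  # poorly matched grouping
```

Scores near 1 indicate compact, well-separated clusters, while scores at or below 0 indicate points that sit closer to another cluster than to their own.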
