A k-means implementation using only NumPy

Project description

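K-means clustering is an unsupervised algorithm that partitions a dataset into K clusters. Because K must be chosen in advance, the algorithm is often run over a range of K values to estimate how many classes a given dataset most plausibly contains.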
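As a rough illustration of the clustering step, here is a minimal sketch of Lloyd's algorithm in NumPy. It is not taken from the package: the function name `kmeans`, its parameters, and the random initialization from data points are all assumptions made for this example.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Hypothetical helper: returns (centroids, labels, inertia)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid, shape (n, k).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members; leave it in place if it has none.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute assignments against the final centroids.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    # Inertia: total squared distance from each point to its own centroid.
    inertia = dists[np.arange(len(points)), labels].sum()
    return centroids, labels, inertia
```

The returned inertia (within-cluster sum of squares) is the quantity usually compared across different values of K.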

This project implements a k-means clustering pipeline that takes one or more dataset files, such as the one in the dataset folder, and estimates the best K for each. It writes the results to a text file, listing each file name followed by its estimated K.
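The description does not say how the best K is estimated or how the output line is delimited. The sketch below assumes an elbow heuristic (the K at which the inertia curve bends most sharply, measured by its second difference) and a single space between file name and K. The names `pick_elbow`, `format_results`, and `write_results` are hypothetical.

```python
import numpy as np
from pathlib import Path

def pick_elbow(inertias):
    """Return the K where the inertia curve bends most sharply.

    inertias[i] is the within-cluster sum of squares for K = i + 1.
    """
    inertias = np.asarray(inertias, dtype=float)
    if inertias.size < 3:
        raise ValueError("need inertia values for at least K = 1, 2, 3")
    # The second difference is largest where the curve flattens most abruptly.
    curvature = np.diff(inertias, n=2)
    # curvature[j] is centred on inertias[j + 1], which corresponds to K = j + 2.
    return int(np.argmax(curvature)) + 2

def format_results(results):
    """Render (file name, estimated K) pairs, one per line (assumed format)."""
    return "".join(f"{name} {k}\n" for name, k in results)

def write_results(results, out_path):
    """Write the rendered results to a text file."""
    Path(out_path).write_text(format_results(results))

# Inertia drops steeply up to K = 3 and then levels off.
best_k = pick_elbow([100, 60, 10, 8, 7])  # 3
```

Other selection criteria, such as the silhouette score or the gap statistic, would slot into the same pipeline in place of `pick_elbow`.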

Only the NumPy package may be used; all other packages are prohibited. Each line in a dataset file represents one n-dimensional data point.
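A file in this shape can be read with NumPy alone. The sketch below assumes the coordinates on each line are separated by whitespace, which the description does not specify; `load_points` is a hypothetical name.

```python
import numpy as np

def load_points(source):
    """Load one data point per line into an array of shape (n_points, n_dims).

    Assumption: values on a line are whitespace-separated.
    """
    # ndmin=2 keeps a 2-D result even when the file holds a single data point.
    return np.loadtxt(source, ndmin=2)

# np.loadtxt also accepts a list of lines, which is handy for a quick check.
points = load_points(["1.0 2.0 3.0", "4.0 5.0 6.0"])
```

If the files turn out to be comma-separated, passing `delimiter=","` to `np.loadtxt` would be the corresponding change.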

Download files

Source Distribution

kMeansBCMAssessment-cwildenb-0.3.tar.gz (3.1 kB)
