an implementation of spectral clustering for text document collections
Project description
- Homepage:
- Contact:
Overview
Spectral clustering is a modern clustering technique that is considered effective for image clustering, among other tasks. [1] [2]
This software finds clusters among documents based on the bag-of-words representation [3] and TF-IDF weighting [4].
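To illustrate the general idea, here is a minimal sketch of a two-way spectral split using NumPy. It is not taken from this package: the function name spectral_bipartition, the toy affinity matrix, and the choice of the symmetric normalized Laplacian are all assumptions made for illustration. Full spectral clustering typically runs k-means on several eigenvectors; this sketch only uses the sign of the second eigenvector.

```python
import numpy as np


def spectral_bipartition(affinity):
    """Split items into two groups using the sign of the Fiedler vector.

    `affinity` is a symmetric, non-negative similarity matrix in which
    every item has at least one positive similarity (hypothetical input).
    """
    affinity = np.asarray(affinity, dtype=float)
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^(-1/2) W D^(-1/2)
    normalized = inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    laplacian = np.eye(len(affinity)) - normalized
    # eigh returns eigenvalues in ascending order for symmetric input,
    # so column 1 belongs to the second-smallest eigenvalue.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]
    return (fiedler > 0).astype(int)


# Toy data: two tightly linked groups joined by one weak edge.
toy_affinity = [
    [0.0, 1.0, 1.0, 0.01, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.01, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
labels = spectral_bipartition(toy_affinity)
```

The sign of an eigenvector is arbitrary, so which group receives label 0 or 1 can differ between runs or platforms; only the grouping itself is meaningful.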
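The bag-of-words and TF-IDF steps can be sketched in plain Python as follows. This is an illustrative example, not the package's own code: the helper names tokenize and tfidf_vectors, the simple regular-expression tokenizer, and the raw-count TF with log(N / df) IDF variant are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    TF is the raw term count and IDF is log(N / df); this is one common
    variant, assumed here for illustration.
    """
    # Bag of words: term counts per document, word order discarded.
    bags = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(bags)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for bag in bags for term in bag)
    return [
        {term: count * math.log(float(n_docs) / doc_freq[term])
         for term, count in bag.items()}
        for bag in bags
    ]


docs = ["the cat sat", "the dog sat", "the dog barked"]
vectors = tfidf_vectors(docs)
```

With this weighting, a term that appears in every document (here "the") gets weight zero, while a term unique to one document (here "cat") gets the largest weight, which is why TF-IDF vectors tend to separate documents by their distinctive vocabulary.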
Requirements
The following software is required:
Python 2 or 3
NumPy
SciPy
How to use
Prepare documents as raw-text files
Notes
When you use the Reuters-21578 set, note that document No. 17980 may contain a non-Unicode character at line 10. The line should probably read: “world economic growth-side measures …”
http://www.daviddlewis.com/resources/testcollections/reuters21578/
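One way to guard against such stray bytes when loading raw-text files is to decode leniently. The sketch below is an assumption about how a caller might read documents, not part of this package; read_document is a hypothetical helper, and the sample bytes are made up for the demonstration.

```python
import os
import tempfile


def read_document(path, encoding="utf-8"):
    """Read a raw-text file, replacing undecodable bytes with U+FFFD."""
    with open(path, "rb") as handle:
        return handle.read().decode(encoding, "replace")


# Demonstration with a temporary file holding one invalid UTF-8 byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"growth\xffside measures")
sample_path = tmp.name

text = read_document(sample_path)
os.remove(sample_path)
```

Replacing the bad byte keeps the rest of the document usable for clustering; correcting the source file by hand remains the cleaner fix when the intended text is known.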
Download files
Source Distribution
scluster-0.0.1.tar.gz (6.6 kB)