An implementation of spectral clustering for text document collections.
Homepage: http://github.com/whym/scluster Contact: http://whym.org
Ulrike von Luxburg, A Tutorial on Spectral Clustering, 2006. http://arxiv.org/abs/0711.0189
Chris H. Q. Ding, Spectral Clustering, 2004. http://ranger.uta.edu/~chqding/Spectral/
The following software is required:
- Python 2 or 3
How to use
Prepare documents as raw-text files, and put them in a directory, for example, ‘reuters’.
Prepare a category file. For example, ‘cats.txt’ may contain:
14833 palm-oil veg-oil
14839 ship
This means that the file ‘14833’ has ‘palm-oil’ and ‘veg-oil’ as its categories, and ‘14839’ has ‘ship’ as its category.
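The category file format described above can be read with a few lines of Python. This is only a minimal sketch, not the parser scluster itself uses: it assumes one document per line, with the file name first and its categories after it, separated by whitespace. The function name `parse_categories` is hypothetical.

```python
import io


def parse_categories(lines):
    """Map each document name to its list of categories.

    Assumed format (not confirmed by scluster's source): one document
    per line, file name first, then whitespace-separated categories.
    """
    cats = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        cats[fields[0]] = fields[1:]
    return cats


# In-memory stand-in for 'cats.txt', using the example entries above.
sample = io.StringIO("14833 palm-oil veg-oil\n14839 ship\n")
categories = parse_categories(sample)
print(categories)
```

To read a real file, pass an open file object instead, for example `parse_categories(open('cats.txt'))`.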
Run: python scluster/clusterer.py cats.txt reuters/ -m kmeans
- When using the Reuters set, note that file 17980 may contain a non-Unicode character at line 10. It should probably read: “world economic growth-side measures …”
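One way to locate such a character before clustering is to check each line of a document for bytes that do not decode. The sketch below assumes the documents are meant to be UTF-8, which the text above does not state. The helper name `find_undecodable_lines` is hypothetical and is not part of scluster.

```python
def find_undecodable_lines(data, encoding="utf-8"):
    """Return 1-based numbers of the lines in `data` (bytes) that fail to decode."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Made-up sample bytes: the second line holds a byte that is invalid in UTF-8.
sample = b"first line\nsecond \xfc line\nthird line\n"
bad_lines = find_undecodable_lines(sample)
print(bad_lines)
```

For a file on disk, read it in binary mode first, for example `find_undecodable_lines(open('reuters/17980', 'rb').read())`, assuming the document is stored under that name.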