A package to create a deterministic classifier based on Zipf's law
Introduction
ZipfClassifier is a classifier that, although in principle usable on any distribution, leverages the assumption that some kinds of datasets follow Zipf's law.
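To make the assumption concrete, here is a minimal sketch, in plain Python and not using the API of this package or of zipf, of how a text can be turned into a normalized frequency distribution. Under Zipf's law, the frequency of the r-th most common token is roughly proportional to 1/r. The function name and the sample text are illustrative assumptions.

```python
from collections import Counter


def word_distribution(text):
    """Return a dict mapping each lowercase token to its relative frequency."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical sample text, for illustration only.
sample = "the cat and the dog and the bird"
distribution = word_distribution(sample)

# Tokens sorted from most to least frequent (rank 1 first).
ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
```

A real corpus is needed before the 1/r shape becomes visible; a sentence this short only shows the mechanics.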
Dependencies
ZipfClassifier uses zipf, another package of mine. I also suggest using dictances for the metrics used in classification.
Installation
pip install zipf_classifier
Working examples and explanation
A Jupyter notebook is available with a full explanation, three working examples and links to the respective datasets.
License
This package is licensed under the MIT license.
FAQs
Frequently asked questions are listed below.
Generally, which metric do you suggest?
Experimental analysis suggests that intersection_squared_hellinger works best, particularly when the learning set distributions contain a significantly greater number of events than the distribution of the document you are trying to classify.
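For intuition, here is a minimal sketch of what such a metric might compute. It assumes the name denotes the squared Hellinger distance restricted to the events present in both distributions. This is not the dictances implementation, which may differ in normalization or other details, and the function name and sample distributions are hypothetical.

```python
from math import sqrt


def intersection_squared_hellinger_sketch(p, q):
    """Squared Hellinger distance over the keys shared by dicts p and q.

    Assumption: events missing from either distribution are ignored.
    """
    shared = p.keys() & q.keys()
    return 0.5 * sum((sqrt(p[key]) - sqrt(q[key])) ** 2 for key in shared)


# Hypothetical distributions: a learned class and a short document.
learned = {"the": 0.5, "cat": 0.25, "dog": 0.25}
document = {"the": 0.5, "cat": 0.5}

distance = intersection_squared_hellinger_sketch(learned, document)
```

Because only shared events contribute in this sketch, the many events that appear in a large learning set but not in a short document do not add to the distance. This is one plausible reading of why the metric suits that case.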