parameter-free clustering algorithm

TX-Means

TX-Means is a parameter-free clustering algorithm that efficiently partitions transactional data in a completely automatic way. It is designed for the case where clustering must be applied to a massive number of different datasets, for instance when a large set of users needs to be analyzed individually and each user has generated a long history of transactions.

In this repository we provide the source code of TX-Means, the competing clustering algorithms, and the dataset used in:

Riccardo Guidotti, Anna Monreale, Mirco Nanni, Fosca Giannotti, Dino Pedreschi, "Clustering Individual Transactional Data for Masses of Users", KDD 2017, Halifax, NS, Canada

Please cite the paper above if you use our code or datasets.

Where to get it

The source code is currently hosted on GitHub at: https://github.com/riccotti/TX-Means

How to install

pip install TXMeans

How to import (some examples)

from TXMeans.txmeans import TXmeans
from TXMeans.util import count_items, remap_items, sample_size  # utility functions
from TXMeans.util import basket_list_to_bitarray, basket_bitarray_to_list  # convert baskets to and from bitarrays
from TXMeans.datamanager import read_uci_data  # convert the data into basket format
from TXMeans.validation_measures import delta_k, purity, normalized_mutual_info_score  # validation measures
from TXMeans.util import jaccard_bitarray
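
As a rough illustration of the ideas behind names such as basket_list_to_bitarray, jaccard_bitarray and purity, the sketch below encodes baskets as bit vectors, compares them with Jaccard similarity, and scores a clustering with purity. It does not call the TXMeans package. The helpers remap, to_bitmask, jaccard and purity_score are hypothetical, plain Python integers stand in for bitarrays, and the package's own functions may differ in signature and behavior (for example, they may return a distance rather than a similarity).

```python
from collections import Counter


def remap(baskets):
    """Assign each distinct item a consecutive bit position (hypothetical helper)."""
    index = {}
    for basket in baskets:
        for item in sorted(basket):
            index.setdefault(item, len(index))
    return index


def to_bitmask(basket, index):
    """Encode a basket as an integer whose set bits mark the items it contains."""
    mask = 0
    for item in basket:
        mask |= 1 << index[item]
    return mask


def jaccard(a, b):
    """Jaccard similarity |A and B| / |A or B| of two bit-encoded baskets."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two empty baskets are treated as identical
    return bin(a & b).count("1") / union


def purity_score(true_labels, cluster_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    per_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    majority = sum(max(counts.values()) for counts in per_cluster.values())
    return majority / len(true_labels)


# Toy transactional data: each basket is the set of items in one transaction.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
]
index = remap(baskets)
masks = [to_bitmask(basket, index) for basket in baskets]

print(jaccard(masks[0], masks[1]))  # two shared items out of three distinct ones
print(jaccard(masks[0], masks[2]))  # no shared items
print(purity_score(["a", "a", "b", "b"], [0, 0, 0, 1]))
```

Bit encoding makes intersection and union single bitwise operations, which is presumably why the package exposes conversion helpers between basket lists and bitarrays.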

Requirements:
  • python >= 3
  • numpy >= 1.10.1
  • pandas >= 0.18.1
  • scipy >= 0.17.1
  • bitarray >= 0.8.1
  • Java >= 8.1
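
One possible way to check the Python requirements listed above before installing is sketched here. It uses only the standard library (importlib.metadata, available from Python 3.8). The names REQUIRED, version_tuple and check_requirements are hypothetical, the version comparison is deliberately simplistic, and the Java requirement is not covered.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list above.
REQUIRED = {
    "numpy": "1.10.1",
    "pandas": "0.18.1",
    "scipy": "0.17.1",
    "bitarray": "0.8.1",
}


def version_tuple(text):
    """Turn '1.10.1' into (1, 10, 1); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return a dict mapping each package name to 'ok', 'missing' or 'too old (...)'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


for package, status in check_requirements().items():
    print(f"{package}: {status}")
```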

Download files

Source Distribution

TXMeans-0.1.1.tar.gz (35.2 kB)

Built Distribution

TXMeans-0.1.1-py3-none-any.whl (39.3 kB)
