MAT-classification: Analysis and Classification methods for Multiple Aspect Trajectory Data Mining

MAT-classification: Analysis and Classification methods for Multiple Aspect Trajectory Data Mining [MAT-Tools Framework]


[Publication] [citation.bib] [GitHub] [PyPi]

This package offers a tool to support the user in classifying multiple aspect trajectories. It integrates into a unified framework for multiple aspect trajectory mining and, more generally, for multidimensional sequence data mining methods.

Created in Dec. 2023. Copyright (C) 2023. License: GPL version 3 or later (see LICENSE file).

Installation

Install directly from the PyPI repository, or download from GitHub (Python >= 3.7 required).

    pip3 install mat-classification

Getting Started

To learn how to use this package, see MAT-classification-Tutorial.ipynb.

Available Classifiers:

Movelet-Based:

  • MMLP (Movelet): Movelet Multilayer Perceptron (MLP) with movelet features. The models were implemented in Python with Keras, using a fully connected hidden layer of 100 units, a Dropout layer with a dropout rate of 0.5, a learning rate of 10^-3, and a softmax activation function in the output layer. Adam optimization is used to minimize the categorical cross-entropy loss, with a batch size of 200 and a total of 200 epochs per training run. [REFERENCE]
  • MRF (Movelet): Movelet Random Forest (RF) with movelet features, consisting of an ensemble of 300 decision trees. The models were implemented in Python with Keras. [REFERENCE]
  • MSVN (Movelet): Movelet Support Vector Machine (SVM) with movelet features. The models were implemented in Python with Keras, using a linear kernel; all other settings are left at their defaults. [REFERENCE]
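
The MMLP description pairs a softmax output layer with the categorical cross-entropy loss. As a minimal, dependency-free sketch of what those two pieces compute (this is an illustration, not the package's Keras code; the function names and the example scores are assumptions made for this example):

```python
import math

def softmax(logits):
    """Turn raw output-layer scores into class probabilities."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: -sum(y_k * log(p_k))."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))

# Hypothetical output-layer scores for a 3-class problem.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The loss is small when the true class gets high probability,
# and large when the true class gets low probability.
loss_correct = categorical_cross_entropy([1, 0, 0], probs)
loss_wrong = categorical_cross_entropy([0, 0, 1], probs)
print(probs, loss_correct, loss_wrong)
```

An optimizer such as Adam adjusts the network weights so that this loss, averaged over each batch, goes down.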

Feature-Based:

  • POI-S: Frequency-based method for extracting features from trajectory datasets (a TF-IDF approach). The method runs on one dimension at a time (or on several, if concatenated). The models were implemented in Python with Keras. [REFERENCE]
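
To illustrate the TF-IDF idea behind frequency-based features, here is a minimal sketch that weights the values of a single trajectory dimension. It is a generic TF-IDF computation, not the POI-S implementation; the function name `tfidf_features` and the toy trajectories are assumptions made for this example, and every trajectory is assumed to be non-empty.

```python
import math
from collections import Counter

def tfidf_features(trajectories):
    """Return (vocabulary, rows): one TF-IDF vector per trajectory."""
    vocab = sorted({v for traj in trajectories for v in traj})
    n = len(trajectories)
    # Document frequency: in how many trajectories each value appears.
    df = Counter(v for traj in trajectories for v in set(traj))
    rows = []
    for traj in trajectories:
        counts = Counter(traj)
        # Term frequency (share of the trajectory) times inverse document frequency.
        rows.append([
            (counts[v] / len(traj)) * math.log(n / df[v]) for v in vocab
        ])
    return vocab, rows

# Toy data: each trajectory is the sequence of visited POI categories.
trajectories = [
    ["home", "work", "gym", "home"],
    ["home", "bar", "bar"],
    ["work", "gym", "work"],
]
vocab, rows = tfidf_features(trajectories)
for row in rows:
    print(dict(zip(vocab, (round(x, 3) for x in row))))
```

Values that are frequent in one trajectory but rare across the dataset (here, "bar") receive the highest weights, which is what makes them useful as classification features.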

Trajectory-Based:

  • MARC: Uses word embeddings for trajectory classification. It encapsulates all trajectory dimensions (space, time, and semantics) and uses them as input to a neural network classifier. The spatial dimension is encoded with geohash and combined with the others. The models were implemented in Python with Keras. [REFERENCE]
  • TRF: Random Forest for trajectory data. The optimal hyperparameters for each model are found with grid search, varying the number of trees (ne), the maximum number of features to consider at every split (mf), the maximum number of levels in a tree (md), the minimum number of samples required to split a node (mss), the minimum number of samples required at each leaf node (msl), and the method of selecting samples for training each tree (bs). [REFERENCE]
  • TXGBost: The optimal hyperparameters for each model are found with grid search, varying the number of estimators (ne), the maximum depth of a tree (md), the learning rate (lr), gamma (gm), the fraction of observations randomly sampled for each tree (ss), the subsample ratio of columns when constructing each tree (cst), and the regularization parameters (l1) and (l2). [REFERENCE]
  • BiTuler: The optimal hyperparameters for each model are found with grid search, which keeps the batch size at 64 and the learning rate at 0.001, and varies the number of units (un) of the recurrent layer, the embedding size of each attribute (es), and the dropout (dp). [REFERENCE]
  • Tulvae: The optimal hyperparameters for each model are found with grid search, which keeps the batch size at 64 and the learning rate at 0.001, and varies the number of units (un) of the recurrent layer, the embedding size of each attribute (es), the dropout (dp), and the latent variable (z). [REFERENCE]
  • DeepeST: Employs a Recurrent Neural Network (RNN), with both LSTM and Bidirectional LSTM (BLSTM) variants. The optimal hyperparameters for each model are found with grid search, which keeps the batch size at 64 and the learning rate at 0.001, and varies the number of units (un) of the recurrent layer, the embedding size of each attribute (es), and the dropout (dp). [REFERENCE]
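
Several of the classifiers above rely on grid search. The following is a minimal sketch of that technique using only the standard library; it is not the package's code. The grid keys reuse the TRF abbreviations from the list, but the candidate values and the `toy_score` function (a stand-in for "train a model and return its validation score") are assumptions made for this example.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid; the values are illustrative only.
trf_grid = {
    "ne": [100, 200, 300],   # number of trees
    "mf": ["sqrt", "log2"],  # max features per split
    "md": [10, 20],          # max tree depth
    "mss": [2, 5],           # min samples to split a node
    "msl": [1, 2],           # min samples per leaf
    "bs": [True, False],     # bootstrap sampling on/off
}

def toy_score(params):
    # Stand-in for training a model and measuring validation accuracy.
    return -abs(params["ne"] - 200) - abs(params["md"] - 20) - params["mss"]

best_params, best_score = grid_search(trf_grid, toy_score)
print(best_params, best_score)
```

The cost grows multiplicatively with each added hyperparameter (96 combinations here), which is why fixing some values, such as the batch size and learning rate, keeps the search tractable.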

Available Scripts (TODO update):

Installing the package also installs the following Python scripts for use from the system command line:

  • MAT-TC.py: Script to run classifiers on trajectory datasets, for details type: MAT-TC.py --help;
  • MAT-MC.py: Script to run movelet-based classifiers on trajectory datasets, for details type: MAT-MC.py --help;
  • POIS-TC.py: Script to run POI-F/POI-S classifiers on the method's feature matrix, for details type: POIS-TC.py --help;
  • MARC.py: Script to run MARC classifier on trajectory datasets, for details type: MARC.py --help.

One script for running the POI-F/POI-S method:

  • POIS.py: Script to run POI-F/POI-S feature extraction methods (poi, npoi, and wnpoi), for details type: POIS.py --help.

And one script for merging movelet resulting matrices:

  • MAT-MergeDatasets.py: Script to join the train.csv and test.csv movelet files of all classes for use as input to a classifier, for details type: MAT-MergeDatasets.py --help.

Citing

If you use mat-classification please cite the following paper:

TODO

Bibtex:

@inproceedings{...}

Collaborate with us

Any contribution is welcome. This is an active project and if you would like to include your code, feel free to fork the project, open an issue and contact us.

Feel free to contribute in any form, such as scientific publications referencing this package, teaching material and workshop videos.

Related packages

This package is part of the MAT-Tools Framework for Multiple Aspect Trajectory Data Mining. See the guide project:

  • mat-tools: Reference guide for MAT-Tools Framework repositories

Change Log

This package is under construction; see CHANGELOG.md.
