Multidimensional scanpath comparison
Project description
multimatch
Reimplementation of MultiMatch toolbox (Dewhurst et al., 2012) in Python.
The MultiMatch method proposed by Jarodzka, Holmqvist and Nyström (2010), implemented in Matlab as the MultiMatch toolbox and validated by Dewhurst and colleagues (2012) is a vectorbased, multidimensional approach to compute scanpath similarity.
The method represents scanpaths as geometrical vectors in a twodimensional space: Any scanpath is build up of a vector sequence in which the vectors represent saccades, and the start and end position of saccade vectors represent fixations. Two such sequences (which can differ in length) are compared on the five dimensions ‘vector shape’, ‘vector length’ (saccadic amplitude), ‘vector position’, ‘vector direction’ and ‘fixation duration’ for a multidimensional similarity evaluation. The original Matlab toolbox was kindly provided via email by Dr. Richard Dewhurst and the method was ported into Python with the intent of providing an open source alternative to the matlab toolbox. Additionally, it contains options to compute scanpath similarities from the studyforrest phase 2 eyetracking dataset, in which participants (n = 15 during fmri acquisition, n = 15 in lab) watched the movie ‘Forrest Gump’.
Installation instructions
It is recommended to use a dedicated virtualenv:
# create and enter a new virtual environment (optional) virtualenv python=python3 ~/env/multimatch . ~/env/multimatch/bin/activate
multimatch can be installed via pip. To automatically install multimatch with all dependencies, use:
# install from pyPi pip install multimatch
Method overview
The method takes two n x 3 fixation vectors (xcoordinate, ycoordinate, duration) of two scanpaths as its input.
 Step 1: Representation of scanpaths as vector sequences An idealized saccade is represented as the shortest distance between two fixations. The Cartesian coordinates of the fixations are thus the starting and ending points of a saccade. The length of a saccade in x direction is computed as the difference in x coordinates of starting and ending point. The length of a saccade in y direction is computed accordingly. To represent a saccade as a vector in twodimensional space, the lengths in x and y directions are transformed into polar coordinates (length from coordinate origin (Rho), polar angle in radians (Theta)) by means of trigonometry.
 Step 2: Scanpath simplification Scanpaths are simplified based on angle and amplitude (length) to reduce their complexity. Two or more saccades are grouped together if angles between two consecutive saccades are below an angular threshold TAmp, and intermediate fixations are shorter than a duration threshold TDur, or if the amplitude of successive saccades is below a length threshold TAmp and the surrounding fixation duration. As such, small, locally contained saccades, and saccades in the same general direction are summed to form larger, less complex saccades (Dewhurst et al., 2012). This process is repeated until no further simplifications are made. Thresholds can be set according to use case. The original simplification algorithm implements an angular threshold of 45° and an amplitude threshold of 10% of the screen diagonal (Jarodzka, Holmqvist & Nyström, 2010).
 Step 3: Temporal Alignment Two simplified scanpaths are temporally aligned in order to find pairings of saccade vectors to compare. The aim is not necessarily to align two saccade vectors that constitute the same component in their respective vector sequence, but those two vectors that are the most similar while preserving temporal order. In this way, a stray saccade in one of the two scanpaths does not lead to an overall low similarity rating, and it is further possible to compare scanpaths of unequal length. To do so, all possible pairings of saccades are evaluated in similarity by their shape (i.e. vector differences). More formally, the vector difference between each element i in scanpath S1 = {u1, u2, …, um} and each element j in scanpath S2 = {v1, v2, …, vn} is computed and stored in Matrix M as a weight. Low weights correspond to high similarity. An adjacency matrix of size M is build, defining rules on which connection between matrix elements are allowed: In order to take temporal sequence of saccades into account, connections can only be made to the right, below or belowright. Together, matrices M and the adjacency matrix constitute a matrix representation of a directed, weighted graph. The elements of the matrix are the nodes, the connection rules constitute edges and the weights define the cost associated with each connection.
 Step 4: Scanpath Selection A Dijkstra algorithm (Dijksta, 1959) is used to find the shortest path from the the first two saccade vectors to the last two saccade vectors. “Shortest” path is defined as the connection between nodes with the lowest possible sum of weights.
 Step 5: Similarity Calculation Five measures of scanpath similarity are computed on the aligned scanpaths. This is done by performing simple vector arithmetic on all aligned saccade pairs (u_i, v_j), taking the median of the results and normalizing it. As a result, all five measures are in range [0, 1] with higher values indicating higher similarity between scanpaths on the given dimension (Anderson, Anderson, Kingstone & Bischof, 2015).
For more details on the original algorithm, please see Dewhurst et al. (2012).
Examplary usage of multimatch in a terminal
required inputs:  two tabseparated files with nx3 fixation vectors (x, y, duration)
multimatch data/fixvectors/segment_10_sub19.tsv data/fixvectors/segment_10_sub01.tsv
optional inputs:  –screensize: in pixel, supply first x and then y dimension. The default size is 1280 x 720px
multimatch data/fixvectors/segment_10_sub19.tsv data/fixvectors/segment_10_sub01.tsv screensize 1280 720
if scanpath simplification should be performed, please specify in addition  –amplitude_threshold (am) in px  –direction_threshold (di) in degree  –duration_threshold (du) in seconds
Example usage with grouping:
multimatch data/fixvectors/segment_10_sub19.tsv data/fixvectors/segment_10_sub01.tsv direction_threshold 45.0 duration_threshold 0.3 amplitude_threshold 147.0
Example usage without grouping:
multimatch_forrest
The package further contains an implementation of the method specifically for use with the studyforrest phase 2 eyetracking dataset. Inputs should be the eyetracking data classified into eyemovement events by remodnav (https://github.com/psychoinformaticsde/remodnav). As such, inputs are two eyemovement classified tsvfiles of a given run of studyforrest and the respective location annotation (https://github.com/psychoinformaticsde/studyforrestdataannotations) file as well an output path. It will create scanpaths from the shots of the movie stimlus and compute the necessary nx3 data structure on its own.
Examplary usage of multimatch_forrest in a terminal:
multimatch_forrest multimatch/tests/testdata/sub10_taskmovie_run1_events.tsv multimatch/tests/testdata/sub30_taskmovie_run1_events.tsv multimatch/tests/testdata/studyforrestdataannotations/segments/avmovie/locations_run1_events.tsv output/run1/sub10vssub30
References:
Dewhurst, R., Nyström, M., Jarodzka, H., Foulsham, T., Johansson, R. & Holmqvist, K. (2012). It depends on how you look at it: scanpath comparison in multiple dimensions with MultiMatch, a vectorbased approach. Behaviour Research Methods, 44(4), 10791100.
Dijkstra, E. W. (1959). A note on two problems in connexion withgraphs. Numerische Mathematik, 1, 269–271.
Jarodzka, H., Holmqvist, K., & Nyström, M. (eds.) (2010). A vectorbased, multidimensional scanpath similarity measure. In Proceedings of the 2010 symposium on eyetracking research & applications (pp. 211218). ACM.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size  File type  Python version  Upload date  Hashes 

Filename, size multimatch0.0.1py3noneany.whl (42.6 kB)  File type Wheel  Python version py3  Upload date  Hashes View hashes 
Filename, size multimatch0.0.1.tar.gz (20.9 kB)  File type Source  Python version None  Upload date  Hashes View hashes 
Hashes for multimatch0.0.1py3noneany.whl
Algorithm  Hash digest  

SHA256  2e670337cb64e1b1a14db2f5143489e17b1cd608e2d799fe4083bd59e63e59af 

MD5  b59f69736db9e87d74a3e84e45d7ee7b 

BLAKE2256  5f2bc1df4d97d1ecb3e07613429e590cbb6f7411da63b33e6f85dc11b34b43d7 