TNA - Transition Network Analysis for Python

A Python package for analyzing sequential data as transition networks. It is designed to reproduce the results of the R TNA package to within floating-point tolerance.

Features

  • 8 Model Types: relative, frequency, co-occurrence, reverse, n-gram, gap, window, attention
  • 9 Centrality Measures: OutStrength, InStrength, ClosenessIn, ClosenessOut, Closeness, Betweenness, BetweennessRSP, Diffusion, Clustering
  • Statistical Inference: Bootstrap resampling, permutation tests, confidence intervals
  • 10+ Visualization Functions: Network plots, heatmaps, centrality charts, sequence plots
  • R Package Equivalence: Verified numerical equivalence with comprehensive test suite

Installation

# Development installation
pip install -e .

# Or install dependencies directly
pip install numpy pandas networkx scipy matplotlib seaborn

Quick Start

import tna
import pandas as pd

# Load example data (2000 learning sessions with 9 self-regulated learning behaviors)
df = tna.load_group_regulation()

# Build a TNA model (relative transition probabilities)
model = tna.tna(df)
print(model)

# Compute centrality measures
cent = tna.centralities(model)
print(cent)

# Visualize the network
tna.plot_network(model, layout='circular', edge_threshold=0.05)

# Visualize centralities
tna.plot_centralities(cent, measures=['OutStrength', 'InStrength', 'Betweenness'])

Model Building

Basic Models

# Relative transition probabilities (default)
model = tna.tna(df)

# Frequency model (raw counts)
fmodel = tna.ftna(df)

# Co-occurrence model (bidirectional)
cmodel = tna.ctna(df)

# Attention model (exponential decay weighting)
amodel = tna.atna(df, beta=0.1)
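
To make the difference between the frequency and relative models concrete, here is a minimal standalone sketch that does not use the package. It assumes first-order transitions between consecutive states and ignores missing values; the helper names transition_counts and relative_weights are hypothetical and are not part of the tna API.

```python
from collections import Counter


def transition_counts(sequences):
    """Count first-order transitions (a -> b) over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Pair each state with its successor: (s0, s1), (s1, s2), ...
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    return counts


def relative_weights(counts):
    """Row-normalize counts so each state's outgoing weights sum to 1."""
    row_totals = Counter()
    for (a, _), n in counts.items():
        row_totals[a] += n
    return {(a, b): n / row_totals[a] for (a, b), n in counts.items()}


# Three short sequences, one per row, as in wide-format data
sequences = [["A", "B", "C"], ["B", "C", "A"], ["A", "C", "B"]]
freq = transition_counts(sequences)   # raw counts (frequency model)
rel = relative_weights(freq)          # probabilities (relative model)
print(freq[("B", "C")], rel[("B", "C")])  # 2 1.0
```

In this toy data, B is always followed by C, so the relative weight of B -> C is 1.0 while its raw count is 2.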

Advanced Model Types

# All model types via build_model()
model = tna.build_model(df, type_='relative')      # Row-normalized probabilities
model = tna.build_model(df, type_='frequency')     # Raw transition counts
model = tna.build_model(df, type_='co-occurrence') # Bidirectional co-occurrence
model = tna.build_model(df, type_='reverse')       # Reverse order transitions
model = tna.build_model(df, type_='n-gram', params={'n': 2})  # Higher-order n-grams
model = tna.build_model(df, type_='gap', params={'max_gap': 3, 'decay': 0.5})  # Gap-weighted
model = tna.build_model(df, type_='window', params={'size': 3})  # Sliding window
model = tna.build_model(df, type_='attention', params={'beta': 0.1})  # Attention-weighted

Scaling Options

# Apply scaling to weight matrix
model = tna.tna(df, scaling='minmax')  # Min-max normalization [0, 1]
model = tna.tna(df, scaling='max')     # Divide by maximum
model = tna.tna(df, scaling='rank')    # Rank-based scaling
model = tna.tna(df, scaling=['minmax', 'max'])  # Multiple scalings
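
The following standalone sketch illustrates what min-max and max scaling do to a small weight matrix. It is an assumption about the arithmetic implied by the comments above, not the package's implementation; the function names scale_minmax and scale_max are hypothetical, and rank scaling is left out because its tie handling is not described here.

```python
def scale_minmax(matrix):
    """Map entries linearly so the smallest becomes 0 and the largest 1."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant matrix: avoid dividing by zero
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def scale_max(matrix):
    """Divide every entry by the largest entry."""
    hi = max(v for row in matrix for v in row)
    if hi == 0:  # all-zero matrix: nothing to scale
        return [[0.0 for _ in row] for row in matrix]
    return [[v / hi for v in row] for row in matrix]


counts = [[1, 3],
          [5, 9]]
print(scale_minmax(counts))  # [[0.0, 0.25], [0.5, 1.0]]
print(scale_max(counts)[1])  # second row divided by 9
```

The two scalings differ whenever the smallest entry is nonzero: min-max shifts it to 0, while max scaling only rescales.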

Centrality Measures

# Compute all centrality measures
cent = tna.centralities(model)

# Compute specific measures
cent = tna.centralities(model, measures=['OutStrength', 'InStrength', 'Betweenness'])

# With normalization
cent = tna.centralities(model, normalize=True)

# Include self-loops
cent = tna.centralities(model, loops=True)

Available Measures

Measure         Description
OutStrength     Sum of outgoing edge weights
InStrength      Sum of incoming edge weights
ClosenessIn     Incoming closeness centrality
ClosenessOut    Outgoing closeness centrality
Closeness       Overall closeness (treats graph as undirected)
Betweenness     Standard betweenness centrality
BetweennessRSP  Randomized Shortest Path betweenness
Diffusion       Diffusion centrality (Banerjee et al. 2014)
Clustering      Weighted clustering coefficient (Zhang & Horvath 2005)
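
As an illustration of the two simplest measures, the sketch below computes OutStrength and InStrength directly from a weight matrix. It assumes that weights[i][j] is the weight of the edge from state i to state j and that self-loops are excluded unless loops=True; the function name strengths is hypothetical and this is not the package's code.

```python
def strengths(weights, labels, loops=False):
    """Return (out_strength, in_strength) dicts keyed by state label."""
    n = len(labels)
    out_s, in_s = {}, {}
    for i, label in enumerate(labels):
        # Row sum = outgoing weight; column sum = incoming weight.
        out_s[label] = sum(weights[i][j] for j in range(n) if loops or i != j)
        in_s[label] = sum(weights[j][i] for j in range(n) if loops or i != j)
    return out_s, in_s


labels = ["A", "B", "C"]
weights = [[0.1, 0.6, 0.3],
           [0.5, 0.2, 0.3],
           [0.4, 0.4, 0.2]]

out_s, in_s = strengths(weights, labels)
out_loops, _ = strengths(weights, labels, loops=True)
print(out_s)
print(in_s)
```

With a row-normalized matrix, including self-loops makes every OutStrength equal to 1, which is one reason to leave them out when comparing states.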

Data Preparation

From Long Format Data

# Prepare raw event data
prepared = tna.prepare_data(
    data=events_df,
    actor='user_id',
    time='timestamp',
    action='event_type',
    time_threshold=900  # 15-minute session timeout (900 seconds)
)

# Build model from prepared data
model = tna.tna(prepared)

# Access statistics
print(prepared.statistics)  # n_sessions, n_actors, etc.
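
The session timeout can be pictured with a small standalone sketch. It assumes timestamps are numeric seconds and that a new session starts whenever the gap between two consecutive events of the same actor exceeds the threshold; the function split_sessions and the event names are hypothetical, and the package's own rules may differ.

```python
from collections import defaultdict


def split_sessions(events, time_threshold=900):
    """Split (actor, time_in_seconds, action) events into per-actor sessions."""
    by_actor = defaultdict(list)
    for actor, t, action in events:
        by_actor[actor].append((t, action))

    sessions = []
    for actor in sorted(by_actor):
        rows = sorted(by_actor[actor])  # chronological order
        current = [rows[0][1]]
        for (prev_t, _), (t, action) in zip(rows, rows[1:]):
            if t - prev_t > time_threshold:
                # Gap too long: close the current session, start a new one
                sessions.append((actor, current))
                current = []
            current.append(action)
        sessions.append((actor, current))
    return sessions


events = [
    ("u1", 0, "read"), ("u1", 60, "quiz"), ("u1", 2000, "read"),
    ("u2", 10, "video"), ("u2", 500, "quiz"),
]
sessions = split_sessions(events)
print(sessions)
# [('u1', ['read', 'quiz']), ('u1', ['read']), ('u2', ['video', 'quiz'])]
```

Each resulting session is one sequence, which corresponds to one row of the wide format described next.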

From Wide Format Data

# Direct from wide format (rows=sequences, cols=time steps)
df = pd.DataFrame({
    'step1': ['A', 'B', 'A'],
    'step2': ['B', 'C', 'C'],
    'step3': ['C', 'A', 'B']
})
model = tna.tna(df)

Statistical Inference

Bootstrap Analysis

# Bootstrap confidence intervals for model parameters
boot = tna.bootstrap_tna(df, n_boot=1000, ci=0.95, seed=42)

# Get summary with CIs for all edges
summary = boot.summary()

# Find significant edges
sig_edges = boot.significant_edges(threshold=0)

# Bootstrap centrality measures
cent_ci = tna.bootstrap_centralities(
    df,
    measures=['OutStrength', 'InStrength', 'Betweenness'],
    n_boot=1000,
    ci=0.95
)
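
For intuition, the sketch below bootstraps a single edge weight without using the package. It assumes that whole sequences are resampled with replacement and that the interval is read off the sorted replicates with a simple order-statistic percentile; edge_weight and bootstrap_edge_ci are hypothetical helpers, not the tna API.

```python
import random


def edge_weight(sequences, src, dst):
    """Relative transition probability src -> dst (0.0 if src never transitions)."""
    from_src = to_dst = 0
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a == src:
                from_src += 1
                to_dst += (b == dst)
    return to_dst / from_src if from_src else 0.0


def bootstrap_edge_ci(sequences, src, dst, n_boot=1000, ci=0.95, seed=42):
    """Percentile CI for one edge weight, resampling sequences with replacement."""
    rng = random.Random(seed)
    n = len(sequences)
    samples = sorted(
        edge_weight([sequences[rng.randrange(n)] for _ in range(n)], src, dst)
        for _ in range(n_boot)
    )
    alpha = (1 - ci) / 2
    lower = samples[int(alpha * (n_boot - 1))]
    upper = samples[int((1 - alpha) * (n_boot - 1))]
    return lower, upper


sequences = [["A", "B", "C", "A"], ["A", "B", "A", "C"], ["B", "C", "A", "B"],
             ["A", "C", "B", "A"], ["C", "A", "B", "C"]]
point = edge_weight(sequences, "A", "B")  # 4 of the 6 transitions out of A go to B
lower, upper = bootstrap_edge_ci(sequences, "A", "B", n_boot=500)
print(point, lower, upper)
```

Resampling sequences rather than individual transitions keeps the dependence between steps of the same sequence intact.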

Permutation Tests

# Compare two groups
result = tna.permutation_test(
    group1_df, group2_df,
    n_perm=1000,
    statistic='weights',  # or 'density', 'centrality'
    alternative='two-sided',
    seed=42
)
print(f"P-value: {result.p_value}")
print(f"Significant: {result.is_significant(0.05)}")

# Edge-wise comparison with multiple testing correction
edges = tna.permutation_test_edges(
    group1_df, group2_df,
    n_perm=1000,
    correction='fdr'  # or 'bonferroni', 'none'
)
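
The logic of a two-sided permutation test can be sketched for a single edge as follows. This is an illustration under stated assumptions (group labels of whole sequences are shuffled, and the test statistic is the absolute difference in one transition probability); transition_prob and permutation_p_value are hypothetical names and do not reflect how the package computes its statistics.

```python
import random


def transition_prob(sequences, src, dst):
    """Share of transitions leaving src that go to dst (0.0 if none leave src)."""
    pairs = [(a, b) for seq in sequences for a, b in zip(seq, seq[1:]) if a == src]
    return sum(b == dst for _, b in pairs) / len(pairs) if pairs else 0.0


def permutation_p_value(group1, group2, src, dst, n_perm=1000, seed=42):
    """Two-sided p-value for the difference in one edge weight between groups."""
    rng = random.Random(seed)
    observed = abs(transition_prob(group1, src, dst) - transition_prob(group2, src, dst))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign sequences to groups at random
        diff = abs(transition_prob(pooled[:n1], src, dst)
                   - transition_prob(pooled[n1:], src, dst))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the p-value is never exactly zero
    return (extreme + 1) / (n_perm + 1)


g1 = [["A", "B", "A", "B"]] * 10   # A is always followed by B
g2 = [["A", "C", "A", "C"]] * 10   # A is always followed by C
p_diff = permutation_p_value(g1, g2, "A", "B", n_perm=200)
p_same = permutation_p_value(g1, g1, "A", "B", n_perm=200)
print(p_diff, p_same)
```

When the two groups are identical, the observed difference is zero and every permutation is at least as extreme, so the p-value is 1.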

Confidence Intervals

import numpy as np  # needed for np.mean below

# Percentile method
ci = tna.confidence_interval(boot_samples, ci=0.95, method='percentile')

# BCa method (bias-corrected and accelerated)
ci = tna.bca_ci(data, boot_samples, statistic_func=np.mean, ci=0.95)

Visualization

Network Plots

# Basic network plot
tna.plot_network(model)

# Customized network
tna.plot_network(
    model,
    layout='circular',           # or 'spring', 'kamada_kawai'
    node_size='OutStrength',     # Size by centrality
    edge_threshold=0.05,         # Hide weak edges
    node_color='steelblue',
    edge_cmap='Blues'
)

# Network with bootstrap confidence intervals
tna.plot_network_ci(boot, edge_alpha='significance')

Centrality Plots

# Bar charts for centralities
tna.plot_centralities(
    cent,
    measures=['OutStrength', 'InStrength', 'Betweenness'],
    ncol=3
)

Heatmap

# Transition matrix heatmap
tna.plot_heatmap(model, cmap='Blues', annotate=True)

Model Comparison

# Side-by-side comparison of two models
tna.plot_comparison(
    model1, model2,
    plot_type='heatmap',
    labels=('Group 1', 'Group 2')
)

Sequence Visualization

# State distribution over time
tna.plot_sequences(df, plot_type='distribution')

# State frequencies
tna.plot_frequencies(df)

# Histogram of sequence lengths
tna.plot_histogram(df)

Statistical Plots

# Bootstrap distribution
tna.plot_bootstrap(boot, plot_type='weights')
tna.plot_bootstrap(boot, plot_type='centrality', measure='OutStrength')

# Permutation test null distribution
tna.plot_permutation(result)

Example Datasets

# Wide format: 2000 sessions x 20 time steps
df = tna.load_group_regulation()

# Long format: Actor, Time, Action columns
df_long = tna.load_group_regulation_long()

API Reference

Model Building

Function               Description
tna(x)                 Build relative transition probability model
ftna(x)                Build frequency (raw counts) model
ctna(x)                Build co-occurrence model
atna(x, beta)          Build attention-weighted model
build_model(x, type_)  Build model with specified type

Data Preparation

Function                                 Description
prepare_data(data, actor, time, action)  Prepare long-format event data
create_seqdata(x)                        Create sequence data from various formats

Centralities

Function                       Description
centralities(model, measures)  Compute centrality measures

Statistical Inference

Function                                     Description
bootstrap_tna(x, n_boot)                     Bootstrap analysis of TNA model
bootstrap_centralities(x, measures, n_boot)  Bootstrap centrality CIs
permutation_test(x1, x2, n_perm)             Permutation test for group comparison
permutation_test_edges(x1, x2, n_perm)       Edge-wise permutation tests
confidence_interval(samples, ci)             Calculate confidence interval
bca_ci(data, samples, func, ci)              BCa confidence interval

Visualization

Function                  Description
plot_network(model)       Plot transition network
plot_centralities(cent)   Plot centrality bar charts
plot_heatmap(model)       Plot transition matrix heatmap
plot_comparison(m1, m2)   Compare two models
plot_sequences(df)        Plot sequence patterns
plot_frequencies(df)      Plot state frequencies
plot_histogram(df)        Plot sequence length histogram
plot_bootstrap(boot)      Visualize bootstrap results
plot_permutation(result)  Visualize permutation test
plot_network_ci(boot)     Network with confidence intervals

Utilities

Function               Description
row_normalize(matrix)  Row-normalize a matrix
minmax_scale(matrix)   Min-max scaling to [0, 1]
max_scale(matrix)      Divide by maximum
rank_scale(matrix)     Rank-based scaling

R Package Equivalence

This package is designed to produce numerically equivalent results to the R TNA package. Key equivalences:

  • Transition matrices: Identical computation of relative, frequency, and co-occurrence matrices
  • Centrality measures: Exact ports of R implementations including custom measures (diffusion, weighted clustering)
  • Data format: Compatible with R's wide-format sequence data

Verification

# Python
model_py = tna.tna(df)
cent_py = tna.centralities(model_py)

# Results match R within floating-point precision:
# - Max absolute difference < 1e-10 for transition matrices
# - Max absolute difference < 1e-6 for centrality measures
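
A check of this kind can be sketched as below. The example assumes the R results have been exported and loaded as nested lists; the matrices shown are made-up placeholder values, not output from either package, and max_abs_diff is a hypothetical helper.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two same-shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Placeholder values standing in for a Python result and an R export
py_weights = [[0.0, 0.25], [0.75, 0.0]]
r_weights = [[0.0, 0.25 + 1e-12], [0.75, 0.0]]

diff = max_abs_diff(py_weights, r_weights)
print(diff < 1e-10)  # True
```

Using the maximum rather than the mean difference means a single badly mismatched edge is enough to fail the check.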

Citation

If you use this package in your research, please cite:

@software{tna_python,
  title = {TNA: Transition Network Analysis for Python},
  author = {Saqr, Mohammed and Tikka, Santtu and López-Pernas, Sonsoles},
  year = {2026},
  url = {https://github.com/mohsaqr/tnapy}
}

Please also cite the paper that introduces Transition Network Analysis as a method:

@INPROCEEDINGS{Saqr2025-ku,
  title     = "Transition Network Analysis: A Novel Framework for Modeling,
               Visualizing, and Identifying the Temporal Patterns of Learners
               and Learning Processes",
  author    = "Saqr, Mohammed and López-Pernas, Sonsoles and Törmänen, Tiina and
               Kaliisa, Rogers and Misiejuk, Kamila and Tikka, Santtu",
  booktitle = "Proceedings of Learning Analytics \& Knowledge (LAK '25)",
  publisher = "ACM",
  address   = "New York, NY, USA",
  doi       = "10.1145/3706468.3706513",
  pages     = "351--361",
  year      =  2025
}

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
