Skip to main content

Python implementation of the k-Full Tree (kFT) algorithm for geo-referenced time-series data summarization

Project description

k-Full Tree (kFT) Algorithm for Geo-Referenced Time-Series

A Python implementation of the k-Full Tree algorithm for summarizing geo-referenced time-series (GTS) data using spatio-temporal clustering and tree structures.

Overview

This package implements the k-Full Tree algorithm, which provides efficient summarization of spatio-temporal data by identifying k representative tree structures that capture the most significant activity patterns across both space and time dimensions.

Key Features

  • Spatio-Temporal Analysis: Handles data with both spatial (geographic) and temporal (time-series) dimensions
  • Tree-Based Summarization: Uses tree structures to represent activity patterns
  • Voronoi Partition Assignment (VPA): Advanced partitioning strategy for optimal node assignment
  • Flexible Configuration: Configurable tree degree, number of summary trees (k), and algorithm parameters
  • Comprehensive Testing: Full test suite with 31+ unit and integration tests

Algorithm Description

The k-Full Tree algorithm operates in two main phases:

  1. Phase 1 - Voronoi Partition Assignment: Assigns spatio-temporal nodes to k partitions based on distance to current summary trees
  2. Phase 2 - Summary Tree Update: Recomputes optimal ST-full trees for each partition to maximize activity coverage

The algorithm iterates between these phases until convergence, producing k summary trees that best represent the underlying spatio-temporal patterns in the data.

Installation

This project uses uv for dependency management. Make sure you have uv installed, then:

# Clone or navigate to the project directory
cd kft

# Install dependencies
uv sync

Requirements

  • Python >= 3.13
  • NetworkX >= 3.5
  • NumPy >= 2.3.2

Usage

Basic Example

from kft import create_example_gts, KFullTree

# Create example geo-referenced time-series data
gts = create_example_gts()

# Initialize k-Full Tree algorithm
kft = KFullTree(gts, k=2, max_tree_degree=2)

# Run the algorithm
summary_trees, partitions = kft.fit(use_vpa=True, verbose=True)

# Analyze results
print(f"Found {len(summary_trees)} summary trees")
print(f"Total activity coverage: {kft.get_total_coverage()}")

Advanced Usage

from kft import GeoReferencedTimeSeries, KFullTree, STNode

# Create custom GTS data
spatial_framework = ['A', 'B', 'C', 'D']
temporal_framework = [1, 2, 3, 4]
spatial_neighbors = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'], 
    'D': ['B', 'C']
}
activity_data = {
    ('A', 1): 10, ('A', 2): 8, ('A', 3): 5, ('A', 4): 2,
    ('B', 1): 3, ('B', 2): 7, ('B', 3): 9, ('B', 4): 6,
    # ... more activity data
}

# Initialize GTS
gts = GeoReferencedTimeSeries(
    spatial_framework, 
    temporal_framework, 
    spatial_neighbors, 
    activity_data
)

# Configure and run algorithm
kft = KFullTree(gts, k=3, max_tree_degree=2)
trees, partitions = kft.fit(
    use_vpa=True,           # Use Voronoi Partition Assignment
    max_iterations=100,     # Maximum iterations
    verbose=True           # Show progress
)

# Access results
for i, tree in enumerate(trees):
    print(f"Tree {i}: {tree}")
    print(f"  Activity Coverage: {tree.activity_coverage()}")
    print(f"  Nodes: {[str(n) for n in tree.nodes]}")

Data Structure

Input Data Format

The algorithm expects:

  • Spatial Framework: List of spatial region identifiers (e.g., ['A', 'B', 'C'])
  • Temporal Framework: List of time periods (e.g., [1, 2, 3, 4])
  • Spatial Neighbors: Dictionary mapping each region to its spatial neighbors
  • Activity Data: Dictionary mapping (region, time) tuples to activity values

Example Data Structure

# Spatial regions
spatial_framework = ['Region1', 'Region2', 'Region3']

# Time periods
temporal_framework = [2020, 2021, 2022, 2023]

# Adjacency relationships
spatial_neighbors = {
    'Region1': ['Region2'],
    'Region2': ['Region1', 'Region3'],
    'Region3': ['Region2']
}

# Activity measurements
activity_data = {
    ('Region1', 2020): 15.5,
    ('Region1', 2021): 18.2,
    ('Region2', 2020): 12.1,
    # ... more measurements
}

Running the Code

# Run the built-in example
uv run python kft.py

# Run the test suite
uv run python test_kft.py

# Run specific test categories
uv run python -m unittest test_kft.TestSTNode -v
uv run python -m unittest test_kft.TestKFullTree -v

API Reference

Core Classes

STNode

Represents a spatio-temporal node with spatial ID, time ID, and activity count.

STFullTree

Represents a spatio-temporal full tree with root, nodes, edges, degree, and depth.

GeoReferencedTimeSeries

Container for GTS data that builds the spatio-temporal neighbor graph.

KFullTree

Main algorithm class that implements the k-Full Tree algorithm.

Key Methods

  • KFullTree.fit(use_vpa=True, max_iterations=100, verbose=False): Run the complete algorithm
  • KFullTree.get_total_coverage(): Get total activity coverage of summary trees
  • STFullTree.activity_coverage(): Get activity coverage of a single tree
  • GeoReferencedTimeSeries.get_st_neighbors(node): Get spatio-temporal neighbors

Testing

The package includes comprehensive tests covering:

  • Unit Tests: Individual class and method testing
  • Integration Tests: Complete algorithm execution
  • Edge Cases: Empty partitions, single nodes, convergence scenarios
  • Phase-Specific Tests: Detailed testing of both algorithm phases

Run all tests:

uv run python test_kft.py -v

Academic Reference

This implementation is based on the k-Full Tree algorithm described in the academic paper included as oliver2012.pdf. The example data and methodology follow the research presented in that work.

Development

Project Structure

kft/
├── kft.py              # Main algorithm implementation
├── test_kft.py         # Comprehensive test suite
├── pyproject.toml      # Project configuration
├── CLAUDE.md           # Development guidance
├── oliver2012.pdf      # Reference paper
└── README.md           # This file

Contributing

  1. Ensure all tests pass: uv run python test_kft.py
  2. Follow the existing code style and patterns
  3. Add tests for new functionality
  4. Update documentation as needed

License

This implementation is provided for educational and research purposes. Please refer to the original academic paper for citation requirements.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kfulltree-0.0.1.tar.gz (814.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kfulltree-0.0.1-py3-none-any.whl (15.7 kB view details)

Uploaded Python 3

File details

Details for the file kfulltree-0.0.1.tar.gz.

File metadata

  • Download URL: kfulltree-0.0.1.tar.gz
  • Upload date:
  • Size: 814.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for kfulltree-0.0.1.tar.gz
Algorithm Hash digest
SHA256 ea672e57056f01d1882ded07bab6614b3206de25022aee29d710bf86e04e8b86
MD5 0c1d0429777e7400c79e1cb135750dc3
BLAKE2b-256 3c454dc57305f41494d796cd451f02954770855e4b46964a068b68680e43be15

See more details on using hashes here.

File details

Details for the file kfulltree-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: kfulltree-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 15.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for kfulltree-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 e65557e78ac950f400c12ae011b08559678829db361ce494a247b2af5ea5eff5
MD5 a0deb661011bee75eb7757394e169654
BLAKE2b-256 8cb02d447a65c909af9064ad22d105dbaa39c19bc120f111e3fee8726790269e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page