
Python tools for walk and path data on networks

walkingpandas

A Python library for massive-scale walk and path analysis on network topologies.

walkingpandas bridges static graph theory (network topologies) with massive-scale sequential data: any process that can be modelled as a walk on a graph. DuckDB is the analytical engine and Parquet the storage layer. The API feels like pandas, while computation runs out-of-core so that datasets larger than memory can still be processed.

Use cases: pedestrian and vehicle mobility, clickstream / user-journey analysis, supply-chain and logistics flows, trade and transaction routing, biological pathway traversal, communication network traces — anything where entities move step-by-step through a graph.

Features

  • Lazy Evaluation: Build up queries without executing until .compute() is called
  • Out-of-Core Processing: Handle datasets that don't fit in memory using DuckDB and Parquet
  • Integer-Native IDs: All node/edge IDs are stored as BIGINT for fast joins and compact storage; string IDs are auto-translated transparently
  • Map Matching (optional): Turn raw GPS traces into walk data via Valhalla (pip install "walkingpandas[mapmatch]"; the quotes keep shells such as zsh from expanding the brackets)
  • Spatial Filtering: Reverse spatial filtering for fast geographic queries
  • Temporal Queries: Handle time-less, single-timestamp, and dwell-time data scenarios
  • Graph Validation: Filter simple paths, cycles, and complex walks
  • Pandas-like API: Familiar interface for data scientists
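
To make the Integer-Native IDs feature concrete, here is a minimal sketch of dictionary encoding, one common way to map string IDs to compact integers and back. It is plain Python for illustration only: the IdEncoder class and its method names are hypothetical, and whether walkingpandas translates IDs this way internally is an assumption.

```python
from typing import Dict, Hashable, List


class IdEncoder:
    """Hypothetical helper: assigns one integer to each distinct external ID."""

    def __init__(self) -> None:
        self._to_int: Dict[Hashable, int] = {}
        self._to_ext: List[Hashable] = []

    def encode(self, external_id: Hashable) -> int:
        # Reuse the integer if this ID was seen before, else assign the next one.
        if external_id not in self._to_int:
            self._to_int[external_id] = len(self._to_ext)
            self._to_ext.append(external_id)
        return self._to_int[external_id]

    def decode(self, internal_id: int) -> Hashable:
        # Translate an integer back to the ID the user originally supplied.
        return self._to_ext[internal_id]


encoder = IdEncoder()
walk = ["station_a", "station_b", "station_a", "station_c"]
encoded = [encoder.encode(node) for node in walk]
decoded = [encoder.decode(i) for i in encoded]
```

Joins and comparisons then operate on the integers in encoded, and the original strings are only restored when results are shown to the user.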

Installation

pip install walkingpandas

Quick Start

import walkingpandas as wp

# 1. Network (static topology)
network = wp.Network(
    nodes="data/network/nodes.parquet",
    edges="data/network/edges.parquet"
)

# 2. WalkFrame (walks + network)
walks = wp.WalkFrame.from_parquet("data/walks/*.parquet", network=network)

# 3. Lazy query chain; nothing runs until .compute()
result = (
    walks
    .only_simple_paths()
    .passing_through([42])
    .filter(time_range=('08:00', '09:00'))
    .edge_frequencies()
    .compute()
)

Or as a one-liner when you have nodes, edges, and walks as separate Parquet paths:

walks = wp.read_dataset(
    nodes="data/network/nodes.parquet",
    edges="data/network/edges.parquet",
    walks="data/walks/*.parquet"
)
traffic = walks.edge_frequencies().compute()
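
For readers who want to see what such a lazy chain means conceptually, the following toy sketch mimics it in plain Python on in-memory lists. It is not walkingpandas code: the LazyWalks class is hypothetical, it ignores time filters, and it assumes that passing_through requires a walk to visit every listed node and that an edge is identified by its (source, target) node pair.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Tuple

Walk = Sequence[int]
Predicate = Callable[[Walk], bool]


class LazyWalks:
    """Toy lazy query: methods only record work; compute() performs it."""

    def __init__(
        self,
        walks: List[Walk],
        filters: Tuple[Predicate, ...] = (),
        count_edges: bool = False,
    ) -> None:
        self._walks = walks
        self._filters = filters
        self._count_edges = count_edges

    def _add(self, predicate: Predicate) -> "LazyWalks":
        # Return a new query object; no walk is inspected here.
        return LazyWalks(self._walks, self._filters + (predicate,), self._count_edges)

    def only_simple_paths(self) -> "LazyWalks":
        # A simple path visits no node more than once.
        return self._add(lambda walk: len(set(walk)) == len(walk))

    def passing_through(self, nodes: Iterable[int]) -> "LazyWalks":
        # Assumption: a walk qualifies only if it visits every listed node.
        required = set(nodes)
        return self._add(lambda walk: required.issubset(walk))

    def edge_frequencies(self) -> "LazyWalks":
        # Mark that compute() should aggregate instead of returning walks.
        return LazyWalks(self._walks, self._filters, count_edges=True)

    def compute(self):
        # Only now are the recorded filters applied to the data.
        kept = [w for w in self._walks if all(f(w) for f in self._filters)]
        if not self._count_edges:
            return kept
        counts: Counter = Counter()
        for walk in kept:
            # Consecutive node pairs are the edges traversed by the walk.
            counts.update(zip(walk, walk[1:]))
        return counts


toy_walks = [[1, 42, 7], [1, 42, 1], [3, 42, 7], [3, 4, 5]]
query = LazyWalks(toy_walks).only_simple_paths().passing_through([42]).edge_frequencies()
result = query.compute()
```

Here query is only a description of the work. The walk [1, 42, 1] is dropped because it revisits node 1, [3, 4, 5] is dropped because it never reaches node 42, and the remaining two walks are counted edge by edge when compute() is called.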

Documentation

Full documentation is available in the docs/ directory. To build and view locally:

pip install "walkingpandas[docs]"
mkdocs serve

Then open http://127.0.0.1:8000 in your browser.

License

MIT License - see LICENSE for details.

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Citation

If you use walkingpandas in your research, please cite:

@software{walkingpandas,
  title = {walkingpandas: Massive-Scale Walk and Path Analysis on Network Topologies},
  author = {Jürgen Hackl},
  year = {2026},
  url = {https://github.com/cisgroup/walkingpandas}
}
