Python tools for walk and path data on networks

walkingpandas

A Python library for massive-scale walk and path analysis on network topologies.

walkingpandas bridges static graph theory (network topologies) with massive-scale sequential data: any process that can be modelled as a walk on a graph. DuckDB is the analytical engine and Parquet the storage layer. The API feels like pandas, but computation runs out of core, so datasets do not have to fit in memory.

Use cases: pedestrian and vehicle mobility, clickstream / user-journey analysis, supply-chain and logistics flows, trade and transaction routing, biological pathway traversal, communication network traces — anything where entities move step-by-step through a graph.

Features

  • Lazy Evaluation: Build up a query chain; nothing executes until .compute() is called
  • Out-of-Core Processing: Handle datasets that don't fit in memory using DuckDB and Parquet
  • Integer-Native IDs: All node/edge IDs are stored as BIGINT for fast joins and compact storage; string IDs are auto-translated transparently
  • Map Matching (optional): Turn raw GPS traces into walk data via Valhalla (pip install "walkingpandas[mapmatch]")
  • Spatial Filtering: Reverse spatial filtering for fast geographic queries
  • Temporal Queries: Handle time-less, single-timestamp, and dwell-time data scenarios
  • Graph Validation: Filter simple paths, cycles, and complex walks
  • Pandas-like API: Familiar interface for data scientists
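
To make the Graph Validation categories concrete, here is a minimal pure-Python sketch of how a single walk could be classified. It is not walkingpandas code: the function name classify_walk, the label strings, and the representation of a walk as an ordered list of integer node IDs are all assumptions for illustration, and the definitions used are the standard graph-theory ones (a simple path repeats no node; a cycle repeats only its start node, at the end). The library itself may draw these lines differently.

```python
from typing import Sequence


def classify_walk(nodes: Sequence[int]) -> str:
    """Classify a walk given as an ordered sequence of node IDs.

    Hypothetical helper for illustration; not part of walkingpandas.
    """
    # A walk needs at least two nodes (one edge) to be worth classifying.
    if len(nodes) < 2:
        return "trivial"

    # Simple path: no node is visited more than once.
    if len(set(nodes)) == len(nodes):
        return "simple_path"

    # Cycle: the walk is closed (first node == last node) and, apart from
    # that shared endpoint, no node is repeated.
    if nodes[0] == nodes[-1] and len(set(nodes[:-1])) == len(nodes) - 1:
        return "cycle"

    # Everything else revisits a node somewhere in the middle.
    return "complex_walk"


if __name__ == "__main__":
    for walk in ([1, 2, 3], [1, 2, 3, 1], [1, 2, 3, 2, 4]):
        print(walk, classify_walk(walk))
```

A filter such as only_simple_paths() in the Quick Start presumably keeps only walks in the first category, evaluated inside DuckDB rather than in a Python loop.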

Installation

pip install walkingpandas

Quick Start

import walkingpandas as wp

# 1. Network (static topology)
network = wp.Network(
    nodes="data/network/nodes.parquet",
    edges="data/network/edges.parquet"
)

# 2. WalkFrame (walks + network)
walks = wp.WalkFrame.from_parquet("data/walks/*.parquet", network=network)

# 3. Lazy query chain; nothing runs until .compute()
result = (
    walks
    .only_simple_paths()
    .passing_through([42])
    .filter(time_range=('08:00', '09:00'))
    .edge_frequencies()
    .compute()
)

Or as a one-liner when you have nodes, edges, and walks as separate Parquet paths:

walks = wp.read_dataset(
    nodes="data/network/nodes.parquet",
    edges="data/network/edges.parquet",
    walks="data/walks/*.parquet"
)
traffic = walks.edge_frequencies().compute()
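
For intuition about what an edge-frequency query counts, the following sketch does the same thing in plain Python on a handful of in-memory walks. It assumes each walk is an ordered list of node IDs and that an edge is a pair of consecutive nodes; the function name count_edge_traversals is hypothetical. The real edge_frequencies() runs in DuckDB over Parquet, and its output schema is not shown here.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

Edge = Tuple[int, int]


def count_edge_traversals(walks: Iterable[Sequence[int]]) -> Counter:
    """Count how often each directed edge (u, v) is traversed.

    Hypothetical helper for illustration; not part of walkingpandas.
    """
    counts: Counter = Counter()
    for nodes in walks:
        # Consecutive node pairs are the edges traversed by this walk.
        counts.update(zip(nodes, nodes[1:]))
    return counts


if __name__ == "__main__":
    sample_walks = [
        [1, 2, 3],
        [2, 3, 4],
        [1, 2, 3, 2],
    ]
    for edge, n in count_edge_traversals(sample_walks).most_common():
        print(edge, n)
```

The loop above holds every count in memory, which is fine for a toy example but is exactly the limitation that the DuckDB and Parquet backend is meant to remove.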

Documentation

Full documentation is available in the docs/ directory. To build and view locally:

pip install "walkingpandas[docs]"
mkdocs serve

Then open http://127.0.0.1:8000 in your browser.

License

MIT License - see LICENSE for details.

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Citation

If you use walkingpandas in your research, please cite:

@software{walkingpandas,
  title = {walkingpandas: Massive-Scale Walk and Path Analysis on Network Topologies},
  author = {Jürgen Hackl},
  year = {2026},
  url = {https://github.com/cisgroup/walkingpandas}
}
