High-performance Rust + Python toolkit for parsing BGP MRT data into Parquet and querying prefixes/AS paths.

bgpframe

Rust 2024 Edition, Python 3.12+, PyO3 0.27, Polars 1.36+

English documentation. Korean version: docs/README.ko.md

A Rust-based MRT (BGP) parsing + Parquet processing library. You can build and use it directly from Python with maturin, and write concise prefix containment queries with Polars expressions.

Key Highlights

  • Fast parsing: converts MRT to Parquet with bgpkit-parser and a native Rust implementation.
  • Memory reuse: reduces allocation and copy overhead by swapping batch buffers during flush.
  • Rust scan filter: parquet_filter_updates filters large Parquet files and writes the result directly, without loading them into Python.
  • Python-friendly API: prefix/IP containment + BGP-specific filters (announce, origin, as_path).
  • Typed API: includes .pyi stubs (_core.pyi, polars_utils.pyi, __init__.pyi).

Installation

Install from PyPI:

pip install bgpframe

If you use the Polars helper expressions:

pip install "bgpframe[polars]"

Using uv:

uv add bgpframe
# or
uv add "bgpframe[polars]"

Requirements

  • Rust stable toolchain
  • Python 3.12+
  • uv
  • maturin

Development Build

uv venv
source .venv/bin/activate
uv pip install maturin
maturin develop
python -c "import bgpframe; print(bgpframe.hello())"

Run without activating the virtual environment:

uv run -- maturin develop
uv run python -c "import bgpframe; print(bgpframe.hello())"

Examples

1) MRT -> Parquet

import bgpframe

bgpframe.mrt_to_parquet(
    "https://data.ris.ripe.net/rrc00/latest-update.gz",
    "rrc00_latest.parquet",
    limit=200_000,      # optional
    batch_size=100_000, # optional
)
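The limit and batch_size options suggest a buffered pipeline: records accumulate in a buffer, and each full buffer is flushed as one batch. The sketch below illustrates that pattern in plain Python. It is not bgpframe's implementation; batched_flush is a hypothetical helper, and the assumption that limit caps the number of records while batch_size caps each flush is ours.

```python
from typing import Iterable, Iterator, List, Optional


def batched_flush(
    records: Iterable[dict],
    batch_size: int,
    limit: Optional[int] = None,
) -> Iterator[List[dict]]:
    """Yield lists of at most batch_size records, stopping after limit records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    buffer: List[dict] = []
    seen = 0
    for record in records:
        if limit is not None and seen >= limit:
            break
        buffer.append(record)
        seen += 1
        if len(buffer) >= batch_size:
            # Swap: hand the full buffer to the consumer and continue with a
            # fresh one, rather than copying its contents.
            full, buffer = buffer, []
            yield full
    if buffer:
        yield buffer  # final partial batch


# Hypothetical records standing in for parsed MRT elements.
batches = list(batched_flush(({"i": i} for i in range(10)), batch_size=4, limit=9))
print([len(b) for b in batches])
```

A larger batch_size means fewer flushes at the cost of a larger in-memory buffer.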

2) Prefix containment query (IPv4/IPv6)

import bgpframe
import polars as pl

df = pl.read_parquet("rrc00_latest.parquet")
res = df.filter(bgpframe.contains_prefix_expr("8.8.8.8"))
print(res.head())
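To make the containment semantics concrete, here is a minimal standard-library sketch, assuming contains_prefix_expr keeps rows whose prefix covers the given address or prefix. prefix_contains is a hypothetical helper for illustration only and says nothing about how bgpframe evaluates the expression internally.

```python
import ipaddress


def prefix_contains(prefix: str, target: str) -> bool:
    """Return True if target (an address or a prefix) falls inside prefix."""
    net = ipaddress.ip_network(prefix, strict=False)
    # A bare address parses as a /32 (IPv4) or /128 (IPv6) network.
    other = ipaddress.ip_network(target, strict=False)
    if net.version != other.version:
        return False  # IPv4 and IPv6 prefixes never contain each other
    return other.subnet_of(net)


routes = ["8.8.8.0/24", "8.0.0.0/9", "1.1.1.0/24", "2001:4860::/32"]
covering = [p for p in routes if prefix_contains(p, "8.8.8.8")]
print(covering)
```

Both the /24 and the less specific /9 match, which is the usual behavior when asking which announced prefixes cover a host.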

3) Filter large Parquet and write output

import bgpframe

matched = bgpframe.parquet_contains_ip(
    "rrc00_latest.parquet",
    "2001:4860:4860::8888",
    output="rrc00_latest_match_google_dns_v6.parquet",
    limit=100_000,  # optional
)
print("matched rows:", matched)

4) Combined BGP convenience filters

import bgpframe
import polars as pl

df = pl.read_parquet("rrc00_latest.parquet")

# announce + origin AS 15169 + AS_PATH includes 3356 + path length 2..5
res = bgpframe.filter_bgp_updates(
    df,
    elem_type="announce",
    origin_asn=15169,
    as_path_contains=3356,
    min_as_path_len=2,
    max_as_path_len=5,
)

# exact prefix match (host bits are normalized with strict=False behavior)
exact = df.filter(bgpframe.prefix_exact_expr("2001:4860:4860::8888/32"))

5) Rust high-speed scan filter (file -> file)

import bgpframe

matched = bgpframe.parquet_filter_updates(
    "rrc00_latest.parquet",
    output="rrc00_latest_updates_filtered.parquet",
    contains_ip="8.8.8.8",
    elem_type="announce",
    origin_asn=15169,
    as_path_contains=3356,
    min_as_path_len=2,
    max_as_path_len=8,
    limit=50_000,
)
print("matched rows:", matched)

The same code is available at example/parquet_filter_updates.py.

Recommended Query Patterns for BGP Data

  • Split event types: announce_expr(), withdraw_expr()
  • Analyze route origin: origin_asn_expr(asn)
  • Track transit/upstream ASN: as_path_contains_expr(asn)
  • Find policy/risk signals: as_path_len_between_expr(min_len=..., max_len=...)
  • Exact prefix comparisons: prefix_exact_expr("x.x.x.x/len")
  • Apply combined filters once: filter_bgp_updates(...)
  • Direct Parquet processing: parquet_filter_updates(...)
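The patterns above combine as a logical AND. The following plain-Python predicate sketches that reading over rows shaped like the schema columns. matches_update is a hypothetical reference helper, not part of bgpframe, and it takes the numeric elem_type encoding from the schema summary rather than the "announce" string accepted by filter_bgp_updates.

```python
from typing import Optional

ANNOUNCE, WITHDRAW = 1, 0  # elem_type encoding from the schema summary


def matches_update(
    row: dict,
    *,
    elem_type: Optional[int] = None,
    origin_asn: Optional[int] = None,
    as_path_contains: Optional[int] = None,
    min_as_path_len: Optional[int] = None,
    max_as_path_len: Optional[int] = None,
) -> bool:
    """Hypothetical reference predicate: every criterion that is set must hold."""
    if elem_type is not None and row["elem_type"] != elem_type:
        return False
    if origin_asn is not None and row["origin_asn"] != origin_asn:
        return False
    if as_path_contains is not None and as_path_contains not in row["as_path"]:
        return False
    if min_as_path_len is not None and row["as_path_len"] < min_as_path_len:
        return False
    if max_as_path_len is not None and row["as_path_len"] > max_as_path_len:
        return False
    return True


# Made-up rows for illustration.
rows = [
    {"elem_type": ANNOUNCE, "origin_asn": 15169, "as_path": [3356, 15169], "as_path_len": 2},
    {"elem_type": ANNOUNCE, "origin_asn": 15169, "as_path": [174, 15169], "as_path_len": 2},
    {"elem_type": WITHDRAW, "origin_asn": None, "as_path": [], "as_path_len": 0},
]
kept = [
    r for r in rows
    if matches_update(
        r,
        elem_type=ANNOUNCE,
        origin_asn=15169,
        as_path_contains=3356,
        min_as_path_len=2,
        max_as_path_len=5,
    )
]
print(len(kept))
```

A predicate like this can serve as a slow but readable oracle when checking filtered output on a small sample.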

Testing / Quality Gates

Results below are from local runs on 2026-03-01 (Asia/Seoul).

  • Rust unit tests: 7 passed
  • Rust doc tests: 0 failed
  • Python regression tests (unittest): 6 passed
  • Python doctest: 4 passed
  • Coverage (Python): 93%
  • Type check (pyrefly): 0 errors

Run commands:

# One-time workaround if cargo test fails to link against the Homebrew Python framework on macOS
mkdir -p /tmp/Python3.framework/Versions/3.9
ln -sf /opt/homebrew/Frameworks/Python.framework/Versions/Current/Python /tmp/Python3.framework/Versions/3.9/Python3

# Rust tests
DYLD_FRAMEWORK_PATH=/tmp cargo test --lib
DYLD_FRAMEWORK_PATH=/tmp cargo test --doc

# Python tests + doctest
uv run python -m unittest -v tests.test_regression
uv run python -m doctest -v src/bgpframe/polars_utils.py

# Coverage
uv run coverage erase
uv run coverage run -m unittest tests.test_regression
uv run coverage run -a -m doctest src/bgpframe/polars_utils.py
uv run coverage report

# Type check
uv run pyrefly check

CI/CD and PyPI Publishing

  • CI workflow: .github/workflows/ci.yml
    • Trigger: push to main, pull requests
    • Runs: Rust tests, Python regression tests, doctest, type checks
  • Release workflow: .github/workflows/publish-pypi.yml
    • Trigger: GitHub Release (published) or manual dispatch
    • Builds: wheels (ubuntu/macos/windows) + sdist
    • Publishes: PyPI via Trusted Publishing (OIDC)

Required setup for PyPI release

  1. Configure a PyPI Trusted Publisher for this project.
  2. In PyPI Trusted Publisher settings, use:
    • Owner: taeyun16
    • Repository: bgpframe
    • Workflow filename: publish-pypi.yml
    • Environment name: pypi
  3. In GitHub, create an environment named pypi (Settings -> Environments).
  4. Create a GitHub Release (for example, tag v0.1.0) to trigger publishing.

With Trusted Publishing, you do not need a long-lived PYPI_API_TOKEN secret.

Schema Summary

The schema is normalized to numeric/list columns and minimizes string fields.

A trailing ? on a type marks a nullable column.

Column          Type   Description
timestamp       i64    Unix timestamp in seconds
elem_type       u32    announce=1, withdraw=0
peer_ip_ver     u32    4 or 6
peer_ip_v4      u32?   Peer address; present only for IPv4 peers
peer_ip_v6_hi   u64?   Upper 64 bits of the IPv6 peer address
peer_ip_v6_lo   u64?   Lower 64 bits of the IPv6 peer address
peer_asn        u32    Peer ASN
prefix_ver      u32    4 or 6
prefix_v4       u32?   IPv4 prefix
prefix_v6_hi    u64?   Upper 64 bits of the IPv6 prefix
prefix_v6_lo    u64?   Lower 64 bits of the IPv6 prefix
prefix_len      u32    Prefix length
prefix_end_v4   u32?   IPv4 range end (query acceleration)
next_hop_ver    u32?   4 or 6
next_hop_v4     u32?   IPv4 next hop
next_hop_v6_hi  u64?   Upper 64 bits of the IPv6 next hop
next_hop_v6_lo  u64?   Lower 64 bits of the IPv6 next hop
as_path         list   Flattened AS_PATH
as_path_len     u32    AS_PATH length
has_as_set      bool   Contains AS_SET/CONFED_SET
origin_asn      u32?   Present only for a single origin ASN
local_pref      u32?   LOCAL_PREF
med             u32?   MED
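The integer prefix columns can be reproduced with the standard library, which helps when checking rows by hand. The sketch below is an illustration under assumptions: encode_prefix and v4_row_contains are hypothetical helpers, and prefix_end_v4 is assumed to be the inclusive last address of the IPv4 range.

```python
import ipaddress

MASK64 = (1 << 64) - 1


def encode_prefix(prefix: str) -> dict:
    """Map a prefix string to integer columns shaped like the schema."""
    net = ipaddress.ip_network(prefix, strict=False)  # host bits are cleared
    start = int(net.network_address)
    row = {
        "prefix_ver": net.version,
        "prefix_len": net.prefixlen,
        "prefix_v4": None,
        "prefix_end_v4": None,
        "prefix_v6_hi": None,
        "prefix_v6_lo": None,
    }
    if net.version == 4:
        row["prefix_v4"] = start
        # Assumption: the range end is the last address covered by the prefix.
        row["prefix_end_v4"] = int(net.broadcast_address)
    else:
        row["prefix_v6_hi"] = start >> 64     # upper 64 bits
        row["prefix_v6_lo"] = start & MASK64  # lower 64 bits
    return row


def v4_row_contains(row: dict, ip: str) -> bool:
    """IPv4 containment as two integer comparisons on the encoded row."""
    if row["prefix_ver"] != 4:
        return False
    value = int(ipaddress.IPv4Address(ip))
    return row["prefix_v4"] <= value <= row["prefix_end_v4"]


v4 = encode_prefix("8.8.8.0/24")
v6 = encode_prefix("2001:4860:4860::8888/32")
print(v4["prefix_v4"], v4["prefix_end_v4"], v4_row_contains(v4, "8.8.8.8"))
print(hex(v6["prefix_v6_hi"]), v6["prefix_v6_lo"])
```

Storing the range end alongside the start is one way to turn an IPv4 containment test into a pair of integer comparisons, which columnar engines evaluate cheaply.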
