High-performance Rust + Python toolkit for parsing BGP MRT data into Parquet and querying prefixes/AS paths.

bgpframe

Built with Rust 2024 Edition, Python 3.12+, PyO3 0.27, and Polars 1.36+.

English documentation. Korean version: docs/README.ko.md

A Rust-based library for parsing MRT (BGP) data and processing the resulting Parquet files. Build it with maturin, use it directly from Python, and write concise prefix containment queries as Polars expressions.

Key Highlights

  • Fast parsing: converts MRT to Parquet using bgpkit-parser and a Rust implementation.
  • Memory reuse: swaps batch buffers during flush to reduce allocation and copy overhead.
  • Rust scan filter: parquet_filter_updates filters large Parquet files and writes the output directly.
  • Python-friendly API: prefix/IP containment + BGP-specific filters (announce, origin, as_path).
  • Typed API: includes .pyi stubs (_core.pyi, polars_utils.pyi, __init__.pyi).

Installation

Install from PyPI:

pip install bgpframe

If you use the Polars helper expressions:

pip install "bgpframe[polars]"

Using uv:

uv add bgpframe
# or
uv add "bgpframe[polars]"

Requirements

  • Rust stable toolchain
  • Python 3.12+
  • uv
  • maturin

Development Build

uv venv
source .venv/bin/activate
uv pip install maturin
maturin develop
python -c "import bgpframe; print(bgpframe.hello())"

Run without activating the virtual environment:

uv run -- maturin develop
uv run python -c "import bgpframe; print(bgpframe.hello())"

Examples

1) MRT -> Parquet

import bgpframe

bgpframe.mrt_to_parquet(
    "https://data.ris.ripe.net/rrc00/latest-update.gz",
    "rrc00_latest.parquet",
    limit=200_000,      # optional
    batch_size=100_000, # optional
)
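
To illustrate how limit and batch_size could interact, here is a minimal pure-Python sketch. It assumes (this is not confirmed by the library) that limit caps the total number of records and batch_size sets how many rows are buffered before each flush. The names write_in_batches and flush are hypothetical, and the buffer hand-off only mimics the buffer-swap idea listed under Key Highlights.

```python
from collections.abc import Callable, Iterable
from itertools import islice
from typing import Optional


def write_in_batches(
    records: Iterable[dict],
    flush: Callable[[list], None],
    limit: Optional[int] = None,
    batch_size: int = 100_000,
) -> int:
    """Buffer up to batch_size records per flush; stop after limit records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = 0
    buffer: list = []
    for record in islice(records, limit):  # limit=None means no cap
        buffer.append(record)
        total += 1
        if len(buffer) == batch_size:
            # Hand the full buffer to the writer and continue with a fresh one
            full, buffer = buffer, []
            flush(full)
    if buffer:
        flush(buffer)  # final partial batch
    return total


# Hypothetical usage: 1,000 fake records, capped at 250, flushed 100 at a time
flushed_sizes: list = []
written = write_in_batches(
    ({"timestamp": i} for i in range(1_000)),
    flush=lambda batch: flushed_sizes.append(len(batch)),
    limit=250,
    batch_size=100,
)
print(written, flushed_sizes)
```

Under these assumptions, a smaller batch_size lowers peak memory at the cost of more flushes.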

2) Prefix containment query (IPv4/IPv6)

import bgpframe
import polars as pl

df = pl.read_parquet("rrc00_latest.parquet")
res = df.filter(bgpframe.contains_prefix_expr("8.8.8.8"))
print(res.head())
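
Conceptually, an IPv4 containment check reduces to an integer range test. The sketch below uses only the standard library and borrows the column names prefix_v4 and prefix_end_v4 from the Schema Summary. The helper names v4_range and row_contains_ip are hypothetical, and this is not necessarily how contains_prefix_expr is implemented.

```python
import ipaddress


def v4_range(cidr: str) -> tuple:
    """Return (first, last) address of an IPv4 prefix as unsigned integers."""
    net = ipaddress.ip_network(cidr, strict=False)
    return int(net.network_address), int(net.broadcast_address)


def row_contains_ip(prefix_v4: int, prefix_end_v4: int, ip: str) -> bool:
    """True when ip falls inside the closed range [prefix_v4, prefix_end_v4]."""
    return prefix_v4 <= int(ipaddress.IPv4Address(ip)) <= prefix_end_v4


start, end = v4_range("8.8.8.0/24")
inside = row_contains_ip(start, end, "8.8.8.8")
outside = row_contains_ip(start, end, "8.8.4.4")
print(start, end, inside, outside)
```

Storing the range end alongside the prefix start means a query needs only two integer comparisons per row, which is presumably why the schema describes prefix_end_v4 as a query-acceleration column.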

3) Filter large Parquet and write output

import bgpframe

matched = bgpframe.parquet_contains_ip(
    "rrc00_latest.parquet",
    "2001:4860:4860::8888",
    output="rrc00_latest_match_google_dns_v6.parquet",
    limit=100_000,  # optional
)
print("matched rows:", matched)

4) Combined BGP convenience filters

import bgpframe
import polars as pl

df = pl.read_parquet("rrc00_latest.parquet")

# announce + origin AS 15169 + AS_PATH includes 3356 + path length 2..5
res = bgpframe.filter_bgp_updates(
    df,
    elem_type="announce",
    origin_asn=15169,
    as_path_contains=3356,
    min_as_path_len=2,
    max_as_path_len=5,
)

# exact prefix match (host bits are normalized with strict=False behavior)
exact = df.filter(bgpframe.prefix_exact_expr("2001:4860:4860::8888/32"))
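
The "strict=False behavior" mentioned in the comment presumably refers to the standard library's ipaddress.ip_network. Assuming so, this short sketch shows what normalizing host bits means for the prefix used above. It does not call bgpframe itself.

```python
import ipaddress

# strict=False masks off the host bits instead of raising ValueError
net = ipaddress.ip_network("2001:4860:4860::8888/32", strict=False)
print(net)  # 2001:4860::/32

# strict=True is the default and rejects a prefix with host bits set
try:
    ipaddress.ip_network("2001:4860:4860::8888/32")
    rejected = False
except ValueError:
    rejected = True
print("strict mode rejects it:", rejected)
```

If the library follows the same rule, the exact-match expression above would compare against 2001:4860::/32 rather than the literal address.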

5) Rust high-speed scan filter (file -> file)

import bgpframe

matched = bgpframe.parquet_filter_updates(
    "rrc00_latest.parquet",
    output="rrc00_latest_updates_filtered.parquet",
    contains_ip="8.8.8.8",
    elem_type="announce",
    origin_asn=15169,
    as_path_contains=3356,
    min_as_path_len=2,
    max_as_path_len=8,
    limit=50_000,
)
print("matched rows:", matched)

The same code is available at example/parquet_filter_updates.py.

Recommended Query Patterns for BGP Data

  • Split event types: announce_expr(), withdraw_expr()
  • Analyze route origin: origin_asn_expr(asn)
  • Track transit/upstream ASN: as_path_contains_expr(asn)
  • Find policy/risk signals via AS path length: as_path_len_between_expr(min_len=..., max_len=...)
  • Exact prefix comparisons: prefix_exact_expr("x.x.x.x/len")
  • Apply combined filters once: filter_bgp_updates(...)
  • Direct Parquet processing: parquet_filter_updates(...)
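
The patterns above compose as a conjunction of simple per-row predicates. The sketch below expresses that idea on plain dictionaries keyed by the column names from the Schema Summary. The matches function is hypothetical, the inclusive length bounds are an assumption, and the real helpers operate on Polars expressions rather than Python dicts.

```python
from typing import Optional

ANNOUNCE, WITHDRAW = 1, 0  # elem_type encoding from the Schema Summary


def matches(
    row: dict,
    elem_type: Optional[int] = None,
    origin_asn: Optional[int] = None,
    as_path_contains: Optional[int] = None,
    min_as_path_len: Optional[int] = None,
    max_as_path_len: Optional[int] = None,
) -> bool:
    """AND together every condition that is not None."""
    if elem_type is not None and row["elem_type"] != elem_type:
        return False
    if origin_asn is not None and row["origin_asn"] != origin_asn:
        return False
    if as_path_contains is not None and as_path_contains not in row["as_path"]:
        return False
    if min_as_path_len is not None and row["as_path_len"] < min_as_path_len:
        return False
    if max_as_path_len is not None and row["as_path_len"] > max_as_path_len:
        return False
    return True


# Made-up rows for illustration only
rows = [
    {"elem_type": ANNOUNCE, "origin_asn": 15169, "as_path": [3356, 15169], "as_path_len": 2},
    {"elem_type": ANNOUNCE, "origin_asn": 15169, "as_path": [174, 15169], "as_path_len": 2},
    {"elem_type": WITHDRAW, "origin_asn": None, "as_path": [], "as_path_len": 0},
]
hits = [
    r for r in rows
    if matches(r, elem_type=ANNOUNCE, origin_asn=15169,
               as_path_contains=3356, min_as_path_len=2, max_as_path_len=5)
]
print(len(hits))
```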

Testing / Quality Gates

Results below are from local runs on 2026-03-01 (Asia/Seoul).

  • Rust unit tests: 7 passed
  • Rust doc tests: 0 failed
  • Python regression tests (unittest): 6 passed
  • Python doctest: 4 passed
  • Coverage (Python): 93%
  • Type check (pyrefly): 0 errors

Run commands:

# One-time workaround if cargo test has macOS + Homebrew Python framework link issue
mkdir -p /tmp/Python3.framework/Versions/3.9
ln -sf /opt/homebrew/Frameworks/Python.framework/Versions/Current/Python /tmp/Python3.framework/Versions/3.9/Python3

# Rust tests
DYLD_FRAMEWORK_PATH=/tmp cargo test --lib
DYLD_FRAMEWORK_PATH=/tmp cargo test --doc

# Python tests + doctest
uv run python -m unittest -v tests.test_regression
uv run python -m doctest -v src/bgpframe/polars_utils.py

# Coverage
uv run coverage erase
uv run coverage run -m unittest tests.test_regression
uv run coverage run -a -m doctest src/bgpframe/polars_utils.py
uv run coverage report

# Type check
uv run pyrefly check

CI/CD and PyPI Publishing

  • CI workflow: .github/workflows/ci.yml
    • Trigger: push to main, pull requests
    • Runs: Rust tests, Python regression tests, doctest, type checks
  • Automated release workflow: .github/workflows/release-please.yml
    • Trigger: push to main (or manual dispatch)
    • Creates/updates a Release PR, updates versions (pyproject.toml, Cargo.toml), and publishes a GitHub Release
  • Release workflow: .github/workflows/publish-pypi.yml
    • Trigger: GitHub Release (published/released) or manual dispatch
    • Builds: wheels (ubuntu/macos/windows) + sdist
    • Publishes: PyPI via Trusted Publishing (OIDC)

Required setup for PyPI release

  1. Configure a PyPI Trusted Publisher for this project.
  2. In PyPI Trusted Publisher settings, use:
    • Owner: taeyun16
    • Repository: bgpframe
    • Workflow filename: publish-pypi.yml
    • Environment name: pypi
  3. In GitHub, create environment pypi (Settings -> Environments).
  4. Create a GitHub Release (for example tag v0.1.0) to trigger publish.

With Trusted Publishing, you do not need a long-lived PYPI_API_TOKEN secret.

Automatic release flow

  1. Merge commits into main (use Conventional Commit prefixes like feat:, fix:, docs:).
  2. release-please opens/updates a Release PR with version/changelog changes.
  3. Merge the Release PR.
  4. release-please creates a GitHub Release.
  5. publish-pypi.yml runs and uploads artifacts to PyPI.

Schema Summary

The schema stores values in numeric and list columns and keeps string fields to a minimum. Types ending in ? appear to be nullable.

Column         | Type | Description
---------------|------|------------------------------------
timestamp      | i64  | Unix timestamp in seconds
elem_type      | u32  | announce=1, withdraw=0
peer_ip_ver    | u32  | 4 or 6
peer_ip_v4     | u32? | Present only for IPv4 peers
peer_ip_v6_hi  | u64? | Upper 64 bits of IPv6
peer_ip_v6_lo  | u64? | Lower 64 bits of IPv6
peer_asn       | u32  | Peer ASN
prefix_ver     | u32  | 4 or 6
prefix_v4      | u32? | IPv4 prefix
prefix_v6_hi   | u64? | Upper 64 bits of IPv6
prefix_v6_lo   | u64? | Lower 64 bits of IPv6
prefix_len     | u32  | Prefix length
prefix_end_v4  | u32? | IPv4 range end (query acceleration)
next_hop_ver   | u32? | 4 or 6
next_hop_v4    | u32? | IPv4 next hop
next_hop_v6_hi | u64? | Upper 64 bits of IPv6
next_hop_v6_lo | u64? | Lower 64 bits of IPv6
as_path        | list | Flattened AS_PATH
as_path_len    | u32  | Route length
has_as_set     | bool | Contains AS_SET/CONFED_SET
origin_asn     | u32? | Present only for a single origin ASN
local_pref     | u32? | local-pref
med            | u32? | MED
