Parse and pseudonymise FreeRADIUS / eduroam 802.1X authentication log files
eduroam-log-parser
Parse and pseudonymise FreeRADIUS / eduroam 802.1X authentication logs.
eduroam-log-parser is a zero-dependency Python library and CLI tool for
converting raw FreeRADIUS log files into structured, privacy-safe JSONL records.
It is designed for network operators, researchers, and anyone who needs to
analyse eduroam authentication flows without exposing personally identifiable
information.
Features
- Two parsers: F-TICKS/eduroam syslog lines and FreeRADIUS Auth: lines
- Deterministic pseudonymisation: SHA-256 with a caller-supplied salt; consistent across files, and re-identifiable only by an operator who holds the salt
- Realm signal classifier: 14 categories (public domain, SIM-generated, typo TLD, misrouted sub-domain, …)
- Failure classifier: TLS handshake, EAP method mismatch, policy reject, proxy routing failure, …
- Streaming API: process arbitrarily large log archives without loading them into memory
- Transparent gzip: .gz files are decompressed on the fly
- Zero dependencies: stdlib only, runs on Python 3.10+
Installation
```bash
pip install eduroam-log-parser
```
Quick start
Python API
```python
from eduroam_log_parser import parse_fticks, parse_radius_auth

# Parse a single F-TICKS line
line = (
    "2025-10-05T00:00:28+03:00 host freeradius: "
    "F-TICKS/eduroam/1.0#REALM=university.edu.tr#VISCOUNTRY=TR"
    "#VISINST=1partner.edu#USERNAME=jsmith@university.edu.tr"
    "#CSI=AA:BB:CC:DD:EE:FF#RESULT=OK#"
)
record = parse_fticks(line, salt="YOUR_SECRET_SALT")
# {
#   "log_type": "fticks",
#   "result": "OK",
#   "username_hash": "3a7f…",
#   "realm_tld": "tr",
#   "realm_signal": "syntactically_valid",
#   "failure_category": "",
#   ...
# }
```
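For readers unfamiliar with the line format, the example line above is a syslog prefix followed by `#`-separated `KEY=VALUE` attributes. The sketch below shows one way such a line could be split with the stdlib alone. `split_fticks_fields` is a hypothetical helper written for illustration; it is not part of the eduroam-log-parser API and performs no pseudonymisation or classification.

```python
def split_fticks_fields(line: str) -> dict[str, str]:
    """Split the F-TICKS payload of a syslog line into a KEY -> VALUE dict."""
    start = line.find("F-TICKS/")
    if start == -1:
        return {}  # not an F-TICKS line
    payload = line[start:]
    # The first '#'-separated token is the header (e.g. F-TICKS/eduroam/1.0);
    # the rest are KEY=VALUE pairs, plus an empty token after the final '#'.
    header, *tokens = payload.split("#")
    fields = {"_header": header}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:  # skip empty or malformed tokens
            fields[key] = value
    return fields


line = (
    "2025-10-05T00:00:28+03:00 host freeradius: "
    "F-TICKS/eduroam/1.0#REALM=university.edu.tr#VISCOUNTRY=TR"
    "#VISINST=1partner.edu#USERNAME=jsmith@university.edu.tr"
    "#CSI=AA:BB:CC:DD:EE:FF#RESULT=OK#"
)
fields = split_fticks_fields(line)
print(fields["REALM"], fields["RESULT"])
```

The raw `USERNAME` and `CSI` values are still present in this dict, which is exactly what the library's pseudonymisation step is meant to remove before anything is written out.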
Stream a whole file
```python
from pathlib import Path
from eduroam_log_parser import iter_file, parse_fticks

for record in iter_file(Path("trrad-ng.log.gz"), parse_fticks, salt="secret"):
    print(record["result"], record["realm_signal"])
```
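To illustrate what streaming with transparent gzip handling can look like, here is a minimal stdlib sketch. `iter_lines` is a hypothetical name, and selecting the opener by file suffix is an assumption of this example; the library's own `iter_file` may be implemented differently.

```python
import gzip
from pathlib import Path
from typing import Iterator


def iter_lines(path: Path) -> Iterator[str]:
    """Yield decoded lines from a plain or gzip-compressed log file, one at a time."""
    # Assumption for this sketch: compressed files are recognised by a .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Only one line is held in memory at any time.
            yield line.rstrip("\n")
```

Because the function is a generator, a caller can iterate over a multi-gigabyte archive with roughly constant memory use.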
Process a directory
```python
from pathlib import Path
from eduroam_log_parser import process_directory

stats = process_directory(
    data_dir=Path("./logs"),
    output_dir=Path("./out"),
    salt="secret",
    sources=["fticks", "radius"],
)
# Writes: out/fticks.jsonl, out/radius_auth.jsonl, out/stats.json
```
CLI
```bash
eduroam-log-parser \
  --data-dir ./logs \
  --output ./out \
  --salt "YOUR_SECRET_SALT" \
  --sources fticks radius \
  --limit 0
```
Output schema
Each record is a JSON object. Fields common to both log types:
| Field | Type | Description |
|---|---|---|
| schema_version | str | Dataset schema version |
| log_type | str | fticks or radius_auth |
| timestamp | str | ISO 8601 normalised |
| result | str | OK or FAIL |
| username_hash | str | SHA-256[:32] of local-part |
| mac_hash | str | SHA-256[:32] of normalised MAC |
| realm_hash | str | SHA-256[:32] of realm domain |
| realm_tld | str | Top-level label of realm (e.g. tr) |
| outer_identity_type | str | anonymous, numeric_identifier, institutional_format, malformed, unknown |
| realm_signal | str | Structural quality of the outer identity (14 categories) |
| failure_category | str | Root cause category (empty for successful auths) |
| failure_layer | str | Protocol layer: tls, eap, policy, radius_proxy, … |
| failure_reason | str | Sanitised reason string (IPs replaced with hashes) |
Additional fields for fticks: visinst_hash, visinst_country
Additional fields for radius_auth: nas_hash, port, via_tunnel
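As an illustration of consuming the output, the sketch below tallies failed records per `failure_category` from JSONL lines. It relies only on the field names listed in the schema; `tally_failures` is a hypothetical helper, and the category value in the sample data is a placeholder rather than a label the library is known to emit.

```python
import json
from collections import Counter
from typing import Iterable


def tally_failures(jsonl_lines: Iterable[str]) -> Counter:
    """Count failed records per failure_category from JSONL text lines."""
    counts: Counter = Counter()
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        if record.get("result") == "FAIL":
            counts[record.get("failure_category", "")] += 1
    return counts


# Placeholder records; "example_category" is not a real label from the library.
sample = [
    '{"log_type": "fticks", "result": "OK", "failure_category": ""}',
    '{"log_type": "radius_auth", "result": "FAIL", "failure_category": "example_category"}',
]
print(tally_failures(sample))
```

Any iterable of lines works as input, so an open file handle on a file such as out/radius_auth.jsonl could be passed directly.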
Realm signal categories
| Label | Meaning |
|---|---|
| syntactically_valid | Well-formed institutional domain |
| institution_subrealm | Valid sub-realm (ogr., student., …) |
| well_known_public_domain | Gmail, Hotmail, iCloud, … |
| auto_generated_sim | 3GPP / SIM-based identity |
| auto_generated_client_app | Supplicant-generated realm |
| misrouted_local_subdomain | Operator sent internal sub-domain to federation |
| malformed_* | Various structural errors |
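To give a feel for what a realm signal check involves, here is a rough sketch covering three of the labels above. The rules are assumptions made for this example (a small public-domain set, the 3GPP `3gppnetwork.org` realm suffix, and a generic domain regex); the library's classifier has 14 categories and may use different logic. `rough_realm_signal` and the `unclassified` fallback are hypothetical.

```python
import re

# Illustrative subset only; a real list would be much longer.
PUBLIC_DOMAINS = {"gmail.com", "hotmail.com", "icloud.com"}

# One or more dot-terminated labels followed by an alphabetic TLD.
DOMAIN_RE = re.compile(r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}$")


def rough_realm_signal(realm: str) -> str:
    """Return a coarse label for a realm using hypothetical heuristics."""
    realm = realm.strip().lower()
    if realm in PUBLIC_DOMAINS:
        return "well_known_public_domain"
    if realm.endswith(".3gppnetwork.org"):
        return "auto_generated_sim"
    if DOMAIN_RE.match(realm):
        return "syntactically_valid"
    return "unclassified"  # placeholder, not a label used by the library


for realm in ("university.edu.tr", "gmail.com", "wlan.mnc001.mcc286.3gppnetwork.org"):
    print(realm, rough_realm_signal(realm))
```

The order of the checks matters: public and SIM-generated realms are also syntactically valid domains, so the more specific tests run first.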
Privacy model
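All pseudonymisation is deterministic, and the hashes cannot feasibly be mapped back to the original values without the salt:
- The same username or MAC address always maps to the same hash within a dataset (that is, under a single salt)
- Datasets produced with different salts cannot be correlated with one another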
- The salt is never written to any output file
This approach follows the F-TICKS data minimisation guidelines and is compatible with GDPR pseudonymisation requirements.
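The properties listed above can be demonstrated with a few lines of stdlib code. This is a minimal sketch under stated assumptions: the salt is simply prepended to the value, and the hex digest is truncated to 32 characters to mirror the SHA-256[:32] notation in the schema. How the library actually combines salt and value is not documented here, and `pseudonymise` is a hypothetical name.

```python
import hashlib


def pseudonymise(value: str, salt: str, length: int = 32) -> str:
    """Return a truncated, salted SHA-256 hex digest of value."""
    # Assumption: the salt is prepended to the value before hashing.
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:length]


a = pseudonymise("jsmith", salt="salt-one")
b = pseudonymise("jsmith", salt="salt-one")
c = pseudonymise("jsmith", salt="salt-two")
print(a == b, a == c)  # same salt matches; a different salt does not
```

A keyed construction such as HMAC-SHA-256 (available in the stdlib `hmac` module) is a common alternative to plain concatenation when designing a scheme like this from scratch.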
Related resources
- Dataset — anonymised sample on HuggingFace: gokhaneryol/freeradius-8021x-log-dataset
- F-TICKS specification — GÉANT eduroam wiki
Citation
If you use this library or the associated dataset in academic work, please cite:
```bibtex
@software{eryol2025eduroam,
  author = {Eryol, Gökhan},
  title  = {eduroam-log-parser: Parse and pseudonymise FreeRADIUS / eduroam logs},
  year   = {2025},
  url    = {https://github.com/gokhaneryol/eduroam-log-parser},
}
```
License
MIT — see LICENSE.