LogNexs (re: Log Nexus): A Foundational Segmentation Tool for Drone Flight Log Messages
LogNexs (re: Log Nexus)
LogNexs (re: Log Nexus) is a command-line sentence segmentation tool for decrypted drone flight log messages. It uses a domain-tuned DistilBERT NER model to split noisy, multi-sentence flight log messages into semantically complete sentence records for downstream forensic review and analysis.
Features
- Batch-processes decrypted DJI CSV flight logs from an input directory.
- Extracts APP.tip and APP.warning messages into a message-level timeline.
- Downloads the required Hugging Face model with lognexus-download.
- Exports results as nested JSON or exploded XLSX rows.
- Provides a SoPID-style forensic inference pipeline through lognexus-pipeline.
- Supports optional CUDA inference when PyTorch detects an available GPU.
Installation
Create and activate a Python environment, then install from the repository:
git clone https://github.com/DroneNLP/LogNexus.git
cd LogNexus
python -m pip install .
Or install the PyPI distribution after release:
python -m pip install LogNexs
LogNexs depends on PyTorch and simpletransformers. For GPU usage, install the
PyTorch build that matches your CUDA environment before running LogNexs.
Model Setup
The NER model is not bundled into the Python package. Download it separately:
lognexus-download
By default, this downloads:
swardiantara/LogNexus-distilbert-base-uncased
into:
./model
Custom download location:
lognexus-download --model_dir /path/to/model
Input Data
Place decrypted .csv flight logs in the input directory. The CSV files must
contain these columns:
CUSTOM.date [local]
CUSTOM.updateTime [local]
APP.tip
APP.warning
LogNexs reads each non-empty APP.tip and APP.warning cell as a separate log
message while preserving the original date and time values.
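The reading rule above can be sketched with the standard library alone. This is a minimal illustration, not LogNexs's actual reader: the function name extract_messages, the "source" field, and the second and third sample rows are hypothetical, while the column names come from the list above.

```python
import csv
import io

# Column names are taken from the required-column list; everything else in
# this sketch is a hypothetical illustration, not LogNexs's own code.
DATE_COL = "CUSTOM.date [local]"
TIME_COL = "CUSTOM.updateTime [local]"
MESSAGE_COLS = ("APP.tip", "APP.warning")


def extract_messages(csv_file):
    """Yield one record per non-empty APP.tip / APP.warning cell."""
    for row in csv.DictReader(csv_file):
        for col in MESSAGE_COLS:
            text = (row.get(col) or "").strip()
            if text:
                yield {
                    "date": row[DATE_COL],   # original date value, unchanged
                    "time": row[TIME_COL],   # original time value, unchanged
                    "source": col,
                    "message": text,
                }


# Made-up sample data; only the first message appears in this README.
sample = io.StringIO(
    "CUSTOM.date [local],CUSTOM.updateTime [local],APP.tip,APP.warning\n"
    "5/12/2025,8:27:36.34 AM,Failsafe RTH.; RC signal lost. Returning to home.,\n"
    "5/12/2025,8:27:40.10 AM,,\n"
    "5/12/2025,8:27:41.52 AM,Landing.,Low battery.\n"
)
records = list(extract_messages(sample))
for record in records:
    print(record)
```

A row with both cells filled produces two records that share the same date and time, and a row with both cells empty produces none.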
Usage
Sentence Extraction
Basic run:
lognexus
This uses:
input: ./evidence
output: ./output
model: ./model
format: json
Custom paths:
lognexus --input_dir /path/to/logs --output_dir /path/to/results --model_dir /path/to/model --format json
XLSX output:
lognexus --format xlsx
GPU inference:
lognexus --cuda
If CUDA is requested but unavailable, LogNexs falls back to CPU.
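The fallback behaviour can be pictured with a short sketch. The helper name resolve_device is hypothetical and LogNexs's internal check may differ; the only library call used is PyTorch's torch.cuda.is_available().

```python
def resolve_device(use_cuda_requested: bool) -> str:
    """Return "cuda" only when it is requested and available, else "cpu"."""
    if not use_cuda_requested:
        return "cpu"
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    # Fall back to CPU when PyTorch cannot see a usable GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device(False))
print(resolve_device(True))
```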
SoPID-Style Inference Pipeline
The lognexus-pipeline command ports the working inference pipeline from
SoPID into the LogNexus package structure. It supports two paradigms:
- message: classifies whole log messages using the Hugging Face sentiment model swardiantara/drone-sentiment.
- segment: segments messages with the SoPID NER model and classifies each unique segment with a local DroPTC classifier.
Recommended message-level run:
lognexus-pipeline --paradigm message --evidence-dir ./evidence --output-dir ./pipeline-output
Segment-level run:
lognexus-pipeline \
--paradigm segment \
--model-name swardiantara/SoPID-bert-base-cased \
--model-type bert \
--pretokenizer spacy \
--tag-scheme bioes \
--droptc-model-dir ./best-model/droptc \
--evidence-dir ./evidence \
--output-dir ./pipeline-output
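For readers unfamiliar with the bioes tag scheme named in the command above, the sketch below shows how BIOES boundary tags (Begin, Inside, Outside, End, Single) can be decoded into segments. It is an assumption-laden illustration: the function decode_bioes, the label name "Event", and the sample tokens are hypothetical, and the pipeline's real label set and decoding logic are not documented here.

```python
def decode_bioes(tokens, tags):
    """Group tokens into segments using BIOES boundary tags."""
    segments, current = [], []
    for token, tag in zip(tokens, tags):
        prefix = tag.split("-", 1)[0]
        if prefix == "S":                 # single-token segment
            segments.append(token)
            current = []
        elif prefix == "B":               # start a new segment
            current = [token]
        elif prefix == "I" and current:   # continue the open segment
            current.append(token)
        elif prefix == "E" and current:   # close the open segment
            current.append(token)
            segments.append(" ".join(current))
            current = []
        else:                             # "O" or a malformed sequence: drop any open segment
            current = []
    return segments


# Hypothetical tokens and tags; "Event" is a placeholder label name.
tokens = ["RC", "signal", "lost", ".", "Returning", "to", "home", "."]
tags = ["B-Event", "I-Event", "E-Event", "O", "B-Event", "I-Event", "E-Event", "O"]
segments = decode_bioes(tokens, tags)
print(segments)
```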
Pipeline evidence can be flat (evidence/*.csv) or grouped by drone model
(evidence/{drone-model}/*.csv). Outputs are written under:
pipeline-output/{message|segment}-{before|after}/run-{n}/{drone-model}/{flight-log}/
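The two evidence layouts can be told apart with a small directory walk. The sketch below is one possible approach, not the pipeline's own discovery code: discover_logs is a hypothetical helper, the folder and file names are made up, and how the pipeline names the {drone-model} folder for flat evidence is not specified here, so flat files are simply reported with None.

```python
import tempfile
from pathlib import Path


def discover_logs(evidence_dir):
    """Return sorted (drone_model, csv_path) pairs; drone_model is None for flat files."""
    root = Path(evidence_dir)
    found = [(None, p) for p in root.glob("*.csv")]              # evidence/*.csv
    found += [(p.parent.name, p) for p in root.glob("*/*.csv")]  # evidence/{drone-model}/*.csv
    return sorted(found, key=lambda item: (item[0] or "", item[1].name))


# Build a throwaway evidence tree with one flat and one grouped log.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "flight-a.csv").touch()
    (root / "mavic-2").mkdir()
    (root / "mavic-2" / "flight-b.csv").touch()
    logs = [(model, path.name) for model, path in discover_logs(root)]

print(logs)
```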
Each processed log gets:
- unique_events.xlsx: deduplicated messages or segments for manual review.
- timeline.json: full forensic timeline with propagated labels.
- timing.json: per-log timing.
- prediction.json: segment CoNLL-style predictions, for segment runs only.
The run folder also gets timing_summary.json.
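The relationship between the deduplicated events and the full timeline can be illustrated as follows: each unique text is classified once and its label is copied back to every occurrence. This is a sketch under assumptions; propagate_labels, toy_classifier, the "label" field, and the sample timeline are hypothetical and do not reproduce the pipeline's actual data structures.

```python
def propagate_labels(timeline, classify):
    """Classify each unique message once, then copy the label to every occurrence."""
    # dict.fromkeys keeps first-seen order while removing duplicates.
    unique_texts = list(dict.fromkeys(entry["message"] for entry in timeline))
    labels = {text: classify(text) for text in unique_texts}
    labelled = [{**entry, "label": labels[entry["message"]]} for entry in timeline]
    return unique_texts, labelled


def toy_classifier(text):
    # Placeholder rule standing in for a real model; not part of LogNexs.
    return "alert" if "lost" in text.lower() else "info"


timeline = [
    {"time": "8:27:36.34 AM", "message": "RC signal lost"},
    {"time": "8:27:40.10 AM", "message": "Returning to home"},
    {"time": "8:28:02.77 AM", "message": "RC signal lost"},
]
unique_events, labelled_timeline = propagate_labels(timeline, toy_classifier)
print(unique_events)
print(labelled_timeline[2])
```

Classifying only the unique texts means a message that repeats many times in one flight is sent to the classifier once.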
Output Formats
JSON output keeps one record per original message and stores extracted sentences as a list:
[
{
"date": "5/12/2025",
"time": "8:27:36.34 AM",
"message": "Failsafe RTH.; RC signal lost. Returning to home.",
"sentence": [
"Failsafe RTH",
"RC signal lost",
"Returning to home"
]
}
]
XLSX output explodes the sentence list so each extracted sentence gets its own
spreadsheet row.
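The difference between the two formats can be shown by exploding the JSON example by hand. The sketch uses only the standard library; the explode helper is hypothetical, and the exact columns LogNexs writes to the spreadsheet are an assumption.

```python
import json

# The nested record from the JSON example above.
records = json.loads("""
[
  {
    "date": "5/12/2025",
    "time": "8:27:36.34 AM",
    "message": "Failsafe RTH.; RC signal lost. Returning to home.",
    "sentence": ["Failsafe RTH", "RC signal lost", "Returning to home"]
  }
]
""")


def explode(records):
    """Turn one record per message into one row per extracted sentence."""
    rows = []
    for record in records:
        for sentence in record["sentence"]:
            # Copy the message-level fields and replace the list with one sentence.
            rows.append({**record, "sentence": sentence})
    return rows


rows = explode(records)
for row in rows:
    print(row["time"], "|", row["sentence"])
```

Each row repeats the date, time, and original message, so a spreadsheet filter on any sentence still shows where it came from.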
Development
Install the lightweight test dependencies without downloading the ML stack:
python -m pip install pytest pandas openpyxl
python -m pip install -e . --no-deps
Run tests:
pytest
Build package artifacts:
python -m build
twine check dist/*
Publishing Note
The PyPI distribution name is LogNexs. The internal Python import package and
the installed console commands remain lognexus and lognexus-download for
compatibility with the original tool and paper terminology.
Citation
@misc{Silalahi2025LogNexus,
title = {LogNexus: A Foundational Segmentation Tool for Drone Flight Log Messages},
publisher = {Code Ocean},
year = {2025},
note = {[Source Code]},
author = {Swardiantara Silalahi and Tohari Ahmad and Hudan Studiawan}
}
File details
Details for the file lognexs-1.0.0.tar.gz.
File metadata
- Download URL: lognexs-1.0.0.tar.gz
- Upload date:
- Size: 27.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 58073e1818540e9a1ce6a479295f35aa7b21c9a13b804b136107ec98db94db4a |
| MD5 | ff8c91326566b3f9b6850c15335a2241 |
| BLAKE2b-256 | 49ab933693169be6fbcd3955ec4cb9053fe9c98528cb06761749776c43e5fe2a |
File details
Details for the file lognexs-1.0.0-py3-none-any.whl.
File metadata
- Download URL: lognexs-1.0.0-py3-none-any.whl
- Upload date:
- Size: 22.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ff4c2ead8646f35d26bbacfce5eb52cde950b75bcb0a6993845eeaf42ecbee97 |
| MD5 | 1f675084900becc1f176fc64116b4fe8 |
| BLAKE2b-256 | b5dc4a268c25dd6782affd79264eff14187c38ec4cc39dd2c8a058f174686a60 |