
LogNexs (re: Log Nexus): A Foundational Segmentation Tool for Drone Flight Log Messages


LogNexs (re: Log Nexus)

LogNexs (re: Log Nexus) is a command-line sentence segmentation tool for decrypted drone flight log messages. It uses a domain-tuned DistilBERT NER model to split noisy, multi-sentence flight log messages into semantically complete sentence records for downstream forensic review and analysis.

Features

  • Batch-processes decrypted DJI CSV flight logs from an input directory.
  • Extracts APP.tip and APP.warning messages into a message-level timeline.
  • Downloads the required Hugging Face model with lognexus-download.
  • Exports results as nested JSON or exploded XLSX rows.
  • Provides a SoPID-style forensic inference pipeline through lognexus-pipeline.
  • Supports optional CUDA inference when PyTorch detects an available GPU.

Installation

Create and activate a Python environment, then install from the repository:

git clone https://github.com/DroneNLP/LogNexus.git
cd LogNexus
python -m pip install .

Or install the released distribution from PyPI:

python -m pip install LogNexs

LogNexs depends on PyTorch and simpletransformers. For GPU usage, install the PyTorch build that matches your CUDA environment before running LogNexs.

Model Setup

The NER model is not bundled into the Python package. Download it separately:

lognexus-download

By default, this downloads:

swardiantara/LogNexus-distilbert-base-uncased

into:

./model

Custom download location:

lognexus-download --model_dir /path/to/model

Input Data

Place decrypted .csv flight logs in the input directory. The CSV files must contain these columns:

CUSTOM.date [local]
CUSTOM.updateTime [local]
APP.tip
APP.warning

LogNexs reads each non-empty APP.tip and APP.warning cell as a separate log message while preserving the original date and time values.
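The extraction rule above can be sketched with the standard csv module. This is an illustration only, not the package's actual reader: the function name iter_messages, the "source" key, and the sample rows are assumptions made for the example. Only the four column names come from the requirements listed above.

```python
import csv
import io

# Column names required by LogNexs, as listed above.
REQUIRED_COLUMNS = [
    "CUSTOM.date [local]",
    "CUSTOM.updateTime [local]",
    "APP.tip",
    "APP.warning",
]


def iter_messages(handle):
    """Yield one record per non-empty APP.tip / APP.warning cell (hypothetical helper)."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    for row in reader:
        for source in ("APP.tip", "APP.warning"):
            text = (row[source] or "").strip()
            if text:
                yield {
                    # Date and time are passed through unchanged.
                    "date": row["CUSTOM.date [local]"],
                    "time": row["CUSTOM.updateTime [local]"],
                    "source": source,
                    "message": text,
                }


# Made-up sample data: one row with a warning, one row with no messages.
sample = io.StringIO(
    "CUSTOM.date [local],CUSTOM.updateTime [local],APP.tip,APP.warning\n"
    "5/12/2025,8:27:36.34 AM,,RC signal lost. Returning to home.\n"
    "5/12/2025,8:27:37.10 AM,,\n"
)
messages = list(iter_messages(sample))
```

Rows whose APP.tip and APP.warning cells are both empty produce no records, so the resulting timeline contains only rows that carry a message.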

Usage

Sentence Extraction

Basic run:

lognexus

This uses:

input:  ./evidence
output: ./output
model:  ./model
format: json

Custom paths:

lognexus --input_dir /path/to/logs --output_dir /path/to/results --model_dir /path/to/model --format json

XLSX output:

lognexus --format xlsx

GPU inference:

lognexus --cuda

If CUDA is requested but unavailable, LogNexs falls back to CPU.
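The fallback behaviour can be pictured with a short sketch. The function name resolve_device is hypothetical and the package's own logic may differ. The only library call used is torch.cuda.is_available(), which is PyTorch's standard availability check.

```python
def resolve_device(use_cuda: bool) -> str:
    """Return "cuda" only when it is both requested and available (hypothetical helper)."""
    if not use_cuda:
        return "cpu"
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# CPU is always selected when CUDA is not requested.
device = resolve_device(use_cuda=False)
```

Calling resolve_device(use_cuda=True) on a machine without a usable GPU returns "cpu" rather than raising, which mirrors the fallback described above.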

SoPID-Style Inference Pipeline

The lognexus-pipeline command ports the working inference pipeline from SoPID into the LogNexus package structure. It supports two paradigms:

  • message: classifies whole log messages using the Hugging Face sentiment model swardiantara/drone-sentiment.
  • segment: segments messages with the SoPID NER model and classifies each unique segment with a local DroPTC classifier.

Recommended message-level run:

lognexus-pipeline --paradigm message --evidence-dir ./evidence --output-dir ./pipeline-output

Segment-level run:

lognexus-pipeline \
  --paradigm segment \
  --model-name swardiantara/SoPID-bert-base-cased \
  --model-type bert \
  --pretokenizer spacy \
  --tag-scheme bioes \
  --droptc-model-dir ./best-model/droptc \
  --evidence-dir ./evidence \
  --output-dir ./pipeline-output

Pipeline evidence can be flat (evidence/*.csv) or grouped by drone model (evidence/{drone-model}/*.csv). Outputs are written under:

pipeline-output/{message|segment}-{before|after}/run-{n}/{drone-model}/{flight-log}/

Each processed log gets:

  • unique_events.xlsx: deduplicated messages or segments for manual review.
  • timeline.json: full forensic timeline with propagated labels.
  • timing.json: per-log timing.
  • prediction.json: CoNLL-style segment predictions (segment runs only).

The run folder also gets timing_summary.json.
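For downstream review it can be convenient to gather every timeline.json under the output directory. The sketch below is one possible approach, not part of the package: collect_timelines is a hypothetical helper, the folder names example-drone and example-flight are placeholders, and the timeline file is written as an empty list because its schema is not described here.

```python
import json
import tempfile
from pathlib import Path


def collect_timelines(output_dir):
    """Map each flight-log folder (relative path) to its parsed timeline.json."""
    root = Path(output_dir)
    timelines = {}
    for path in sorted(root.rglob("timeline.json")):
        key = path.parent.relative_to(root).as_posix()
        with path.open(encoding="utf-8") as fh:
            timelines[key] = json.load(fh)
    return timelines


# Build a throwaway directory that follows the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "message-before" / "run-1" / "example-drone" / "example-flight"
    log_dir.mkdir(parents=True)
    (log_dir / "timeline.json").write_text("[]", encoding="utf-8")  # placeholder content
    timelines = collect_timelines(tmp)
```

Keying by the path relative to the output directory keeps logs from different runs or drone models distinct even when their flight-log folder names match.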

Output Formats

JSON output keeps one record per original message and stores extracted sentences as a list:

[
  {
    "date": "5/12/2025",
    "time": "8:27:36.34 AM",
    "message": "Failsafe RTH.; RC signal lost. Returning to home.",
    "sentence": [
      "Failsafe RTH",
      "RC signal lost",
      "Returning to home"
    ]
  }
]

XLSX output explodes the sentence list so each extracted sentence gets its own spreadsheet row.
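The relationship between the two formats can be sketched in plain Python using the sample record above. The explode function is an illustrative assumption. The actual XLSX column names and order may differ.

```python
import json

# The JSON record shown above, with one entry per original message.
records = json.loads("""
[
  {
    "date": "5/12/2025",
    "time": "8:27:36.34 AM",
    "message": "Failsafe RTH.; RC signal lost. Returning to home.",
    "sentence": ["Failsafe RTH", "RC signal lost", "Returning to home"]
  }
]
""")


def explode(records):
    """Return one flat row per extracted sentence (hypothetical helper)."""
    rows = []
    for record in records:
        for sentence in record["sentence"]:
            # Copy the message-level fields, then attach a single sentence.
            row = {k: v for k, v in record.items() if k != "sentence"}
            row["sentence"] = sentence
            rows.append(row)
    return rows


rows = explode(records)
```

Each row repeats the date, time, and original message, so a single spreadsheet row can be read without referring back to its neighbours.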

Development

Install the lightweight test dependencies without downloading the ML stack:

python -m pip install pytest pandas openpyxl
python -m pip install -e . --no-deps

Run tests:

pytest

Build package artifacts:

python -m build
twine check dist/*

Publishing Note

The PyPI distribution name is LogNexs. The Python import package remains lognexus, and the installed console commands remain lognexus and lognexus-download, for compatibility with the original tool and paper terminology.

Citation

@misc{Silalahi2025LogNexus,
  title = {LogNexus: A Foundational Segmentation Tool for Drone Flight Log Messages},
  publisher = {Code Ocean},
  year = {2025},
  note = {[Source Code]},
  author = {Swardiantara Silalahi and Tohari Ahmad and Hudan Studiawan}
}

Download files


Source Distribution

lognexs-1.0.0.tar.gz (27.9 kB)

Uploaded Source

Built Distribution


lognexs-1.0.0-py3-none-any.whl (22.1 kB)

Uploaded Python 3

File details

Details for the file lognexs-1.0.0.tar.gz.

File metadata

  • Download URL: lognexs-1.0.0.tar.gz
  • Size: 27.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for lognexs-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       58073e1818540e9a1ce6a479295f35aa7b21c9a13b804b136107ec98db94db4a
MD5          ff8c91326566b3f9b6850c15335a2241
BLAKE2b-256  49ab933693169be6fbcd3955ec4cb9053fe9c98528cb06761749776c43e5fe2a


File details

Details for the file lognexs-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: lognexs-1.0.0-py3-none-any.whl
  • Size: 22.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for lognexs-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       ff4c2ead8646f35d26bbacfce5eb52cde950b75bcb0a6993845eeaf42ecbee97
MD5          1f675084900becc1f176fc64116b4fe8
BLAKE2b-256  b5dc4a268c25dd6782affd79264eff14187c38ec4cc39dd2c8a058f174686a60

