
LLM-powered MAVLink flight log analysis and plotting


🧭 MAVPose

Ask questions about your drone flight in plain English. Get plots.

Inspired by the Log Pose from One Piece: it locks onto your flight data and charts a course through it.



MAVPose is a headless CLI tool that turns a natural-language prompt into a matplotlib plot of your MAVLink flight log. It runs a two-phase pipeline: the parent process first extracts clean, time-aligned telemetry into a Parquet file (like a headless database query), then hands the LLM a precise column schema and a pd.read_parquet() call: no raw binary data, no pymavlink in the generated script. This drastically reduces hallucinations and self-healing loops.

$ python cli.py flight.tlog --prompt "Plot altitude over time"

📂 Parsing log schema: flight.tlog
✅ Schema indexed.

🔍 Finding relevant message types for: 'Plot altitude over time'
   → ['GLOBAL_POSITION_INT', 'VFR_HUD']

🗃️  Extracting telemetry to Parquet...
┌────────────────────────────────────────────────────────────┐
│  🗃️  Extracted Parquet schema                              │
├────────────────────────────────────────────────────────────┤
│  [GLOBAL_POSITION_INT]  1842 rows                          │
│      time_s: float64  [0.0 … 312.4]                        │
│      alt: float64  [487320 … 512100]                       │
│  [VFR_HUD]  1842 rows                                      │
│      time_s: float64  [0.0 … 312.4]                        │
│      alt: float64  [476.1 … 501.3]                         │
└────────────────────────────────────────────────────────────┘
   Saved → /path/to/telemetry.parquet

✍️  Generating plot script with z-ai/glm-5.1...
⚙️  Running script...
✅ Plot saved to: /path/to/plot.png

Architecture

┌───────────────────────────────────────────────────────────────┐
│  PHASE 1 - Extraction  (parent process, no LLM)               │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  LogExtractor.schema_only()                                   │
│    └─ pymavlink fast scan → {msg_type: {fields, count}}       │
│    └─ ChromaDB embeddings (semantic field search)             │
│                                                               │
│  find_relevant_data_types(prompt)                             │
│    └─ vector similarity → ["GLOBAL_POSITION_INT", ...]        │
│                                                               │
│  LogExtractor.extract_all()  ←  full row materialisation      │
│    └─ per-msg-type DataFrames, time_s column, cast numerics   │
│                                                               │
│  LogExtractor.export_parquet(msg_types, path)                 │
│    └─ clean telemetry.parquet + schema summary dict           │
│                                                               │
├───────────────────────────────────────────────────────────────┤
│  PHASE 2 - LLM Plot Generation                                │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  LLM receives:                                                │
│    └─ parquet_file path                                       │
│    └─ schema: {msg_type: {rows, columns: {col: dtype+range}}} │
│    └─ output_file path (.png)                                 │
│                                                               │
│  LLM writes pandas + matplotlib script                        │
│    df = pd.read_parquet(parquet_file)                         │
│    df_x = df[df['msg_type'] == 'X']                           │
│    ...                                                        │
│                                                               │
│  run_script() → subprocess exec                               │
│    └─ on failure: attempt_to_fix_script() up to N×            │
│                                                               │
│  → plot.png saved next to log file                            │
└───────────────────────────────────────────────────────────────┘

Why this matters: In the old design, the LLM had to write pymavlink code to parse a binary .tlog; pymavlink is a notoriously tricky API, with blocking/non-blocking subtleties, timestamp inconsistencies, and message-type guesswork. By the time the LLM touched the data, it was flying blind. Now it receives a typed, time-aligned Parquet file with exact column names, dtypes, and value ranges. Writing pd.read_parquet() + matplotlib code against a known schema is trivially reliable.
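A minimal, runnable sketch of the kind of script this design lets the LLM write. It uses an in-memory DataFrame as a stand-in for telemetry.parquet (in a real run the generated script calls pd.read_parquet instead); the msg_type/time_s/alt layout mirrors the schema example above, and the values are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: render to file, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for df = pd.read_parquet("telemetry.parquet"): one long-format
# frame with every extracted message type tagged by a msg_type column.
df = pd.DataFrame({
    "msg_type": ["VFR_HUD"] * 4 + ["GLOBAL_POSITION_INT"] * 4,
    "time_s":   [0.0, 1.0, 2.0, 3.0] * 2,
    "alt":      [476.1, 480.0, 490.5, 501.3,
                 487320, 495000, 505000, 512100],
})

# Select one message type, exactly as the generated scripts do.
vfr = df[df["msg_type"] == "VFR_HUD"]

fig, ax = plt.subplots()
ax.plot(vfr["time_s"], vfr["alt"])
ax.set_xlabel("time (s)")
ax.set_ylabel("altitude (m)")
ax.set_title("Altitude over time")
fig.savefig("plot.png", dpi=100)
```

Because the schema handed to the LLM names these columns and their ranges explicitly, there is no guessing step left for the model.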


Features

  • Plain-English plots: describe any flight data; MAVPose figures out which MAVLink fields to use
  • Headless extraction layer: LogExtractor parses the log into per-message-type DataFrames with a monotonic time_s index before the LLM is ever invoked
  • Clean Parquet handoff: only the relevant message types are exported; the LLM sees exact column names, dtypes, and min/max ranges, with no binary guesswork
  • Semantic field search: ChromaDB vector embeddings surface the most relevant message types for your query
  • Self-healing scripts: the LLM debugs and rewrites failing scripts up to N times; the fix prompt also includes the full schema
  • Sandboxed execution: generated code runs in an isolated subprocess with a blocked-import denylist and a 30 s timeout
  • Interactive REPL mode: omit --prompt to enter a live loop; Parquet is re-extracted per query with the relevant types
  • Configurable model: drop in any OpenRouter model with a one-line .env change
  • Persisted vector store: ChromaDB is saved to disk; re-running on the same log skips re-embedding
  • CI-tested: lint (ruff) and pytest run on every push
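The sandboxed-execution idea above can be sketched in a few lines of stdlib Python: statically reject denylisted imports, then run the script in a subprocess with a timeout. This is an illustration of the pattern, not MAVPose's actual safe_executor; the function names and denylist contents here are assumptions:

```python
import ast
import subprocess
import sys
import tempfile

# Hypothetical denylist for illustration; the real one may differ.
DENYLIST = {"os", "subprocess", "socket", "shutil", "ctypes"}

def check_imports(source: str) -> list[str]:
    """Return any denylisted top-level modules the script imports."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            bad += [a.name for a in node.names
                    if a.name.split(".")[0] in DENYLIST]
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in DENYLIST:
                bad.append(node.module)
    return bad

def run_script(source: str, timeout: float = 30.0):
    """Run a generated script in a subprocess; (ok, error_text)."""
    blocked = check_imports(source)
    if blocked:
        return False, f"blocked imports: {blocked}"
    # Write to a temp file so the child interpreter can execute it.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stderr
```

A static AST scan like this catches `import os` before the script ever runs; the subprocess boundary and timeout then contain anything the scan misses.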

Tech Stack

Layer              Library / Service
LLM                OpenRouter → Z.ai GLM-5.1 (default)
Embeddings         OpenRouter → openai/text-embedding-3-small
LLM framework      LangChain (LCEL): langchain-openai, langchain-chroma
Vector store       ChromaDB ≥ 0.5 (persisted to disk)
Extraction layer   pandas ≥ 2.0 + pyarrow ≥ 14.0
Drone log parsing  pymavlink 2.4.37
Plotting           matplotlib 3.7.1 (in the generated script)
Sandbox            Custom subprocess executor with import denylist + timeout
Config             python-dotenv
Lint / CI          ruff + pytest + GitHub Actions
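The semantic field search in this stack is, at its core, nearest-neighbour ranking in embedding space. A toy illustration with made-up three-dimensional vectors; MAVPose actually uses ChromaDB with OpenRouter embeddings, so nothing below is its real code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Pretend embeddings of each message type's field descriptions
# (in reality these come from an embedding model, not hand-picked numbers).
msg_embeddings = {
    "GLOBAL_POSITION_INT": [0.9, 0.1, 0.0],
    "VFR_HUD":             [0.7, 0.3, 0.2],
    "BATTERY_STATUS":      [0.0, 0.9, 0.3],
}
query = [0.85, 0.15, 0.05]  # pretend embedding of "Plot altitude over time"

# Rank message types by similarity to the query, most similar first.
ranked = sorted(msg_embeddings,
                key=lambda m: cosine(query, msg_embeddings[m]),
                reverse=True)
print(ranked[:2])
# → ['GLOBAL_POSITION_INT', 'VFR_HUD']
```

The real pipeline does the same thing at scale: embed the prompt, embed each message type's schema, and keep the top matches for extraction.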

Prerequisites

  • Python 3.10+
  • An OpenRouter API key (free tier available)

Installation

# 1. Clone
git clone https://github.com/AyushMaria/MAVPose.git
cd MAVPose

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate      # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure
cp template.env .env

Open .env and fill in your key:

OPENROUTER_API_KEY=your-key-here
OPENROUTER_MODEL=z-ai/glm-5.1   # default; change to any OpenRouter model

⚠️ Never commit .env; it is already listed in .gitignore.


Usage

Single prompt

python cli.py flight.tlog --prompt "Plot altitude over time"

Interactive REPL

Omit --prompt to enter a live loop:

python cli.py flight.tlog

📊 Plot request > Plot battery voltage and current
📊 Plot request > Show GPS latitude and longitude
📊 Plot request > quit

All CLI flags

usage: mavpose [-h] [-p PROMPT] [-r N] [-v] log_file

positional arguments:
  log_file              Path to a MAVLink log file (.tlog, .bin, .log)

options:
  -h, --help            Show this help message and exit
  -p, --prompt TEXT     Plot request. Omit for interactive mode.
  -r, --retries N       Max self-healing retries on script failure (default: 3)
  -v, --verbose         Enable debug logging

The output plot is saved as plot.png, and intermediate telemetry as telemetry.parquet, both in the same directory as your log file.

Need a sample log file?
Download here


Output Files

File               Description
telemetry.parquet  Extracted, time-aligned telemetry for the relevant message types
plot.py            The LLM-generated pandas + matplotlib script
plot.png           The final plot at 400 dpi
chroma_db/         Persisted ChromaDB vector store (git-ignored)

Switching Models

MAVPose routes through OpenRouter, so any model on the platform works. Just update .env:

# Z.ai options
OPENROUTER_MODEL=z-ai/glm-5.1          # default: long-horizon agentic coding
OPENROUTER_MODEL=z-ai/glm-5            # flagship, complex systems
OPENROUTER_MODEL=z-ai/glm-5-turbo      # fast inference
OPENROUTER_MODEL=z-ai/glm-4.5          # MoE, 355B params, switchable reasoning

# OpenAI fallback
OPENROUTER_MODEL=openai/gpt-4o

Project Structure

MAVPose/
├── cli.py                         # CLI entry point: two-phase orchestration
├── app.py                         # Legacy stub (Gradio UI removed)
├── llm/
│   ├── log_extractor.py           # 🆕 Headless extraction layer (LogExtractor)
│   ├── gptPlotCreator.py          # PlotCreator: orchestrates both phases
│   ├── safe_executor.py           # Subprocess sandbox with import denylist
│   └── file_validator.py          # File validation (extension, size, symlink)
├── tests/
│   ├── test_log_extractor.py      # 🆕 Unit tests for LogExtractor
│   ├── test_extract_code_snippets.py
│   ├── test_file_validator.py
│   └── test_safe_executor.py
├── .github/workflows/ci.yml       # GitHub Actions: ruff lint + pytest
├── docs/
│   └── GPT_MAVPlot_Arch.png
├── target/                        # Output directory for plot.py and plot.png
├── template.env
├── requirements.txt
└── LICENSE

Core API: LogExtractor

from llm.log_extractor import LogExtractor

extractor = LogExtractor("flight.tlog")

# Fast schema scan (no DataFrame allocation)
schema = extractor.schema_only()
# {"GLOBAL_POSITION_INT": {"count": 1842, "fields": {"lat": "int", ...}}, ...}

# Full extraction into DataFrames
frames = extractor.extract_all()
# {"GLOBAL_POSITION_INT": pd.DataFrame with columns [time_s, msg_type, lat, lon, alt, ...], ...}

# Export to Parquet and get schema summary for the LLM
summary = extractor.export_parquet(["GLOBAL_POSITION_INT", "VFR_HUD"], "telemetry.parquet")
# {"GLOBAL_POSITION_INT": {"rows": 1842, "columns": {"time_s": {"dtype": "float64", "min": 0.0, "max": 312.4}, ...}}}
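A schema summary in the shape shown above is straightforward to derive from per-message-type DataFrames. This sketch reproduces the dict layout from the example output; it is an illustration of the idea, not LogExtractor's actual internals:

```python
import pandas as pd

def summarise(frames: dict[str, pd.DataFrame]) -> dict:
    """Build a {msg_type: {rows, columns: {col: dtype/min/max}}} summary."""
    summary = {}
    for msg_type, df in frames.items():
        cols = {}
        for col in df.columns:
            if pd.api.types.is_numeric_dtype(df[col]):
                # Numeric columns get a value range so the LLM knows units/scale.
                cols[col] = {"dtype": str(df[col].dtype),
                             "min": float(df[col].min()),
                             "max": float(df[col].max())}
            else:
                cols[col] = {"dtype": str(df[col].dtype)}
        summary[msg_type] = {"rows": len(df), "columns": cols}
    return summary

frames = {"VFR_HUD": pd.DataFrame({"time_s": [0.0, 312.4],
                                   "alt": [476.1, 501.3]})}
print(summarise(frames)["VFR_HUD"]["columns"]["alt"])
# → {'dtype': 'float64', 'min': 476.1, 'max': 501.3}
```

The value ranges are what let the LLM distinguish, say, millimetre-scaled `GLOBAL_POSITION_INT.alt` from metre-scaled `VFR_HUD.alt` without ever touching the data.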

Core API: PlotCreator

from llm.gptPlotCreator import PlotCreator

creator = PlotCreator(max_retries=3)
creator.set_logfile_name("flight.tlog")

# Phase 1a: schema scan + embeddings
creator.parse_mavlink_log()

# Phase 1b: semantic search + Parquet extraction
msg_types = creator.find_relevant_data_types("Plot altitude over time")
schema_summary = creator.extract_dataframes(msg_types)

# Phase 2: LLM writes + executes the script
creator.create_plot("Plot altitude over time", schema_summary)
result, code = creator.run_script()
# โ†’ plot.png and telemetry.parquet saved next to the log file

Running Tests

pip install pytest pytest-cov ruff
pytest tests/ -v --cov=llm

To lint:

ruff check llm/ cli.py

Troubleshooting

  • KeyError: OPENROUTER_API_KEY
    .env not configured. Copy template.env to .env and add your key.
  • ModuleNotFoundError: pandas (or pyarrow)
    Missing dependency. Run pip install -r requirements.txt.
  • Plot not generated after N retries
    The LLM script failed repeatedly. Try a more specific prompt; use --verbose to inspect the errors.
  • FileValidationError
    Wrong file type or empty file. Only .tlog, .bin, and .log files ≤ 200 MB are accepted.
  • ValueError: None of the requested message types were found
    Semantic search returned types not present in the log. Use --verbose to see which types the log actually contains.
  • ChromaDB version conflict
    Stale virtual environment. Delete venv/ and reinstall with a fresh pip install -r requirements.txt.

License

MIT; see LICENSE for details.
