# 🧭 MAVPose

LLM-powered MAVLink flight log analysis and plotting. Ask questions about your drone flight in plain English. Get plots.

Inspired by the Log Pose from One Piece: it locks onto your flight data and charts a course through it.
MAVPose is a headless CLI tool that turns a natural-language prompt into a matplotlib plot of your MAVLink flight log. It runs a two-phase pipeline: the parent process first extracts clean, time-aligned telemetry into a Parquet file (like a headless database query), then hands the LLM a precise column schema and a `pd.read_parquet()` call. No raw binary data, no pymavlink in the generated script. This drastically reduces hallucinations and self-healing loops.
```
$ python cli.py flight.tlog --prompt "Plot altitude over time"

Parsing log schema: flight.tlog
Schema indexed.
Finding relevant message types for: 'Plot altitude over time'
  → ['GLOBAL_POSITION_INT', 'VFR_HUD']
Extracting telemetry to Parquet...
┌─────────────────────────────────────────
│ Extracted Parquet schema
├─────────────────────────────────────────
│ [GLOBAL_POSITION_INT] 1842 rows
│   time_s: float64 [0.0 … 312.4]
│   alt:    float64 [487320 … 512100]
│ [VFR_HUD] 1842 rows
│   time_s: float64 [0.0 … 312.4]
│   alt:    float64 [476.1 … 501.3]
└─────────────────────────────────────────
Saved → /path/to/telemetry.parquet
Generating plot script with z-ai/glm-5.1...
Running script...
Plot saved to: /path/to/plot.png
```
## Architecture

```
┌────────────────────────────────────────────────────────────
│ PHASE 1 – Extraction (parent process, no LLM)
├────────────────────────────────────────────────────────────
│ LogExtractor.schema_only()
│   ├─ pymavlink fast scan → {msg_type: {fields, count}}
│   └─ ChromaDB embeddings (semantic field search)
│
│ find_relevant_data_types(prompt)
│   └─ vector similarity → ["GLOBAL_POSITION_INT", ...]
│
│ LogExtractor.extract_all() → full row materialisation
│   └─ per-msg-type DataFrames, time_s column, cast numerics
│
│ LogExtractor.export_parquet(msg_types, path)
│   └─ clean telemetry.parquet + schema summary dict
├────────────────────────────────────────────────────────────
│ PHASE 2 – LLM Plot Generation
├────────────────────────────────────────────────────────────
│ LLM receives:
│   ├─ parquet_file path
│   ├─ schema: {msg_type: {rows, columns: {col: dtype + range}}}
│   └─ output_file path (.png)
│
│ LLM writes a pandas + matplotlib script:
│   df = pd.read_parquet(parquet_file)
│   df_x = df[df['msg_type'] == 'X']
│   ...
│
│ run_script() → subprocess exec
│   └─ on failure: attempt_to_fix_script(), up to N retries
│
│ → plot.png saved next to the log file
└────────────────────────────────────────────────────────────
```
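The Phase-1 time alignment can be sketched in a few lines of pandas. This is an illustrative snippet, not the project's actual `LogExtractor` code: the hand-written `rows` list and the `ts` raw-timestamp column stand in for messages that would really come from pymavlink.

```python
import pandas as pd

# Hypothetical parsed messages; in MAVPose these come from pymavlink,
# one dict per decoded MAVLink message.
rows = [
    {"msg_type": "VFR_HUD", "ts": 1_699_000_000.0, "alt": 476.1},
    {"msg_type": "VFR_HUD", "ts": 1_699_000_000.5, "alt": 476.4},
    {"msg_type": "GLOBAL_POSITION_INT", "ts": 1_699_000_000.2, "alt": 487320},
]
df = pd.DataFrame(rows)

# One shared t=0 across all message types keeps the streams time-aligned.
t0 = df["ts"].min()
df["time_s"] = (df["ts"] - t0).astype("float64")

# Per-message-type DataFrames, raw timestamp column dropped.
frames = {
    msg_type: grp.drop(columns="ts").reset_index(drop=True)
    for msg_type, grp in df.groupby("msg_type")
}
print(frames["VFR_HUD"]["time_s"].tolist())  # [0.0, 0.5]
```

Because every frame shares the same `t0`, plots of different message types line up on a common x-axis without any resampling.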
**Why this matters:** In the old design, the LLM had to write pymavlink code to parse a binary .tlog, a notoriously tricky API with blocking/non-blocking subtleties, timestamp inconsistencies, and message-type guesswork. By the time the LLM touched the data, it was flying blind. Now it receives a typed, time-aligned Parquet file with exact column names, dtypes, and value ranges. Writing `pd.read_parquet()` + matplotlib code against a known schema is trivially reliable.
## Features

- Plain-English plots – describe any flight data; MAVPose figures out which MAVLink fields to use
- Headless extraction layer – `LogExtractor` parses the log into per-message-type DataFrames with a monotonic `time_s` index before the LLM is ever invoked
- Clean Parquet handoff – only the relevant message types are exported; the LLM sees exact column names, dtypes, and min/max ranges, with no binary guesswork
- Semantic field search – ChromaDB vector embeddings surface the most relevant message types for your query
- Self-healing scripts – the LLM debugs and rewrites failing scripts up to N times; the fix prompt also includes the full schema
- Sandboxed execution – generated code runs in an isolated subprocess with a blocked-import denylist and a 30 s timeout
- Interactive REPL mode – omit `--prompt` to enter a live loop; Parquet is re-extracted per query with the relevant types
- Configurable model – drop in any OpenRouter model with a one-line `.env` change
- Persisted vector store – ChromaDB is saved to disk; re-running on the same log skips re-embedding
- CI-tested – lint (ruff) and pytest run on every push
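The sandboxed-execution idea can be sketched with nothing but the standard library. This is not the project's `safe_executor.py`, just a minimal illustration of an AST-based import denylist plus a subprocess timeout; the denylist contents here are made up.

```python
import ast
import subprocess
import sys

DENYLIST = {"os", "subprocess", "socket", "shutil"}  # illustrative only

def check_imports(code: str) -> None:
    """Reject scripts that import denylisted modules (checked before execution)."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            continue
        blocked = DENYLIST.intersection(names)
        if blocked:
            raise ValueError(f"blocked import(s): {sorted(blocked)}")

def run_sandboxed(code: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Static denylist check, then execute in a fresh interpreter with a timeout."""
    check_imports(code)
    return subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True, timeout=timeout)

result = run_sandboxed("print(1 + 1)")
print(result.stdout.strip())  # 2
```

A static AST check is not a security boundary on its own; combining it with a separate process and a hard timeout is what keeps a misbehaving generated script from taking the CLI down with it.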
## Tech Stack

| Layer | Library / Service |
|---|---|
| LLM | OpenRouter – Z.ai GLM-5.1 (default) |
| Embeddings | OpenRouter – openai/text-embedding-3-small |
| LLM framework | LangChain (LCEL) – langchain-openai, langchain-chroma |
| Vector store | ChromaDB ≥ 0.5 (persisted to disk) |
| Extraction layer | pandas ≥ 2.0 + pyarrow ≥ 14.0 |
| Drone log parsing | pymavlink 2.4.37 |
| Plotting | matplotlib 3.7.1 (in generated script) |
| Sandbox | Custom subprocess executor with import denylist + timeout |
| Config | python-dotenv |
| Lint / CI | ruff + pytest + GitHub Actions |
## Prerequisites

- Python 3.10+
- An OpenRouter API key – get one here (free tier available)
## Installation

```bash
# 1. Clone
git clone https://github.com/AyushMaria/MAVPose.git
cd MAVPose

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure
cp template.env .env
```

Open `.env` and fill in your key:

```ini
OPENROUTER_API_KEY=your-key-here
OPENROUTER_MODEL=z-ai/glm-5.1   # default – change to any OpenRouter model
```

⚠️ Never commit `.env`; it is already listed in `.gitignore`.
## Usage

### Single prompt

```bash
python cli.py flight.tlog --prompt "Plot altitude over time"
```

### Interactive REPL

Omit `--prompt` to enter a live loop:

```bash
python cli.py flight.tlog
```

```
Plot request > Plot battery voltage and current
Plot request > Show GPS latitude and longitude
Plot request > quit
```

### All CLI flags

```
usage: mavpose [-h] [--prompt PROMPT] [--retries N] [--verbose] log_file

positional arguments:
  log_file            Path to a MAVLink log file (.tlog, .bin, .log)

options:
  -p, --prompt TEXT   Plot request. Omit for interactive mode.
  -r, --retries N     Max self-healing retries on script failure (default: 3)
  -v, --verbose       Enable debug logging
```

The output plot is saved as `plot.png` and intermediate telemetry as `telemetry.parquet`, both in the same directory as your log file.
Need a sample log file?
Download here
## Output Files

| File | Description |
|---|---|
| `telemetry.parquet` | Extracted, time-aligned telemetry for the relevant message types |
| `plot.py` | The LLM-generated pandas + matplotlib script |
| `plot.png` | The final plot at 400 dpi |
| `chroma_db/` | Persisted ChromaDB vector store (git-ignored) |
## Switching Models

MAVPose routes through OpenRouter, so any model on the platform works. Just update `.env`:

```ini
# Z.ai options
OPENROUTER_MODEL=z-ai/glm-5.1      # default – long-horizon agentic coding
OPENROUTER_MODEL=z-ai/glm-5        # flagship, complex systems
OPENROUTER_MODEL=z-ai/glm-5-turbo  # fast inference
OPENROUTER_MODEL=z-ai/glm-4.5      # MoE, 355B params, switchable reasoning

# OpenAI fallback
OPENROUTER_MODEL=openai/gpt-4o
```
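Under the hood this is just an environment lookup. The project uses python-dotenv to load `.env`; the toy `load_env` below is a hypothetical stand-in for that step (it overrides existing values, unlike dotenv's default), shown only so the precedence of the `.env` value over the built-in default is concrete.

```python
import os
import tempfile

def load_env(path: str) -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments stripped.
    MAVPose uses python-dotenv; this sketch just shows the mechanism."""
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()
            if "=" in line:
                key, _, value = line.partition("=")
                os.environ[key.strip()] = value.strip()

# Write a throwaway .env that overrides the default model.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("OPENROUTER_MODEL=openai/gpt-4o\n")
    env_path = fh.name

load_env(env_path)
model = os.environ.get("OPENROUTER_MODEL", "z-ai/glm-5.1")  # default if unset
print(model)  # openai/gpt-4o
```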
## Project Structure

```
MAVPose/
├── cli.py                 # CLI entry point – two-phase orchestration
├── app.py                 # Legacy stub (Gradio UI removed)
├── llm/
│   ├── log_extractor.py   # Headless extraction layer (LogExtractor)
│   ├── gptPlotCreator.py  # PlotCreator – orchestrates both phases
│   ├── safe_executor.py   # Subprocess sandbox with import denylist
│   └── file_validator.py  # File validation (extension, size, symlink)
├── tests/
│   ├── test_log_extractor.py         # Unit tests for LogExtractor
│   ├── test_extract_code_snippets.py
│   ├── test_file_validator.py
│   └── test_safe_executor.py
├── .github/workflows/ci.yml          # GitHub Actions: ruff lint + pytest
├── docs/
│   └── GPT_MAVPlot_Arch.png
├── target/                # Output directory for plot.py and plot.png
├── template.env
├── requirements.txt
└── LICENSE
```
## Core API – LogExtractor

```python
from llm.log_extractor import LogExtractor

extractor = LogExtractor("flight.tlog")

# Fast schema scan (no DataFrame allocation)
schema = extractor.schema_only()
# {"GLOBAL_POSITION_INT": {"count": 1842, "fields": {"lat": "int", ...}}, ...}

# Full extraction into DataFrames
frames = extractor.extract_all()
# {"GLOBAL_POSITION_INT": DataFrame with columns [time_s, msg_type, lat, lon, alt, ...], ...}

# Export to Parquet and get a schema summary for the LLM
summary = extractor.export_parquet(["GLOBAL_POSITION_INT", "VFR_HUD"], "telemetry.parquet")
# {"GLOBAL_POSITION_INT": {"rows": 1842,
#   "columns": {"time_s": {"dtype": "float64", "min": 0.0, "max": 312.4}, ...}}}
```
## Core API – PlotCreator

```python
from llm.gptPlotCreator import PlotCreator

creator = PlotCreator(max_retries=3)
creator.set_logfile_name("flight.tlog")

# Phase 1a: schema scan + embeddings
creator.parse_mavlink_log()

# Phase 1b: semantic search + Parquet extraction
msg_types = creator.find_relevant_data_types("Plot altitude over time")
schema_summary = creator.extract_dataframes(msg_types)

# Phase 2: LLM writes + executes the script
creator.create_plot("Plot altitude over time", schema_summary)
result, code = creator.run_script()
# → plot.png and telemetry.parquet saved next to the log file
```
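The self-healing behaviour behind `run_script()` amounts to a bounded retry loop. Here is a schematic version with stubbed run/fix callables; `run_with_healing`, `fake_run`, and `fake_fix` are hypothetical names, not the project's internals.

```python
def run_with_healing(script, run, fix, max_retries=3):
    """Run a script; on failure, ask `fix` for a rewrite, up to max_retries times."""
    for attempt in range(max_retries + 1):
        ok, error = run(script)
        if ok:
            return script
        if attempt == max_retries:
            raise RuntimeError(f"still failing after {max_retries} retries: {error}")
        script = fix(script, error)  # in MAVPose, an LLM call that also sees the schema

# Stubs: the script "fails" until the fix has been applied once.
def fake_run(script):
    return ("FIXED" in script, "NameError: name 'pdd' is not defined")

def fake_fix(script, error):
    return script + "  # FIXED"

final = run_with_healing("df = pdd.read_parquet(p)", fake_run, fake_fix)
print(final.endswith("# FIXED"))  # True
```

Feeding the full schema back into the fix prompt (as the Features list notes) is what keeps the loop short: the model corrects column names against ground truth instead of guessing again.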
## Running Tests

```bash
pip install pytest pytest-cov ruff
pytest tests/ -v --cov=llm
```

To lint:

```bash
ruff check llm/ cli.py
```
## Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| `KeyError: OPENROUTER_API_KEY` | `.env` not configured | Copy `template.env` to `.env` and add your key |
| `ModuleNotFoundError: pandas` | Missing dependency | Run `pip install -r requirements.txt` |
| `ModuleNotFoundError: pyarrow` | Missing dependency | Run `pip install -r requirements.txt` |
| Plot not generated after N retries | LLM script failed repeatedly | Try a more specific prompt; use `--verbose` to inspect errors |
| `FileValidationError` | Wrong file type or empty file | Only `.tlog`, `.bin`, `.log` files ≤ 200 MB are accepted |
| `ValueError: None of the requested message types were found` | Semantic search returned types not in the log | Use `--verbose` to see which types the log actually contains |
| ChromaDB version conflict | Stale venv | Delete `venv/` and reinstall with a fresh `pip install -r requirements.txt` |
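The checks behind `FileValidationError` can be pictured with a short stdlib sketch. This is not the project's `file_validator.py`; the extension list and 200 MB limit come from the table above, and the function name and error messages are invented.

```python
import os
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {".tlog", ".bin", ".log"}
MAX_SIZE_BYTES = 200 * 1024 * 1024  # 200 MB, per the troubleshooting table

class FileValidationError(ValueError):
    pass

def validate_log_file(path: str) -> Path:
    """Reject symlinks, wrong extensions, empty files, and oversized files."""
    p = Path(path)
    if p.is_symlink():
        raise FileValidationError(f"symlinks are not accepted: {p}")
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise FileValidationError(f"unsupported extension: {p.suffix!r}")
    size = p.stat().st_size
    if size == 0:
        raise FileValidationError(f"empty file: {p}")
    if size > MAX_SIZE_BYTES:
        raise FileValidationError(f"file too large: {size} bytes")
    return p

# Example with a small temporary .tlog file:
fd, tmp = tempfile.mkstemp(suffix=".tlog")
os.write(fd, b"\xfd\x00")  # arbitrary non-empty bytes
os.close(fd)
print(validate_log_file(tmp).suffix)  # .tlog
```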
## License

MIT – see LICENSE for details.