# 🧭 MAVPose

LLM-powered MAVLink flight log analysis and plotting. Ask questions about your drone flight in plain English. Get plots.

Inspired by the Log Pose from One Piece: it locks onto your flight data and charts a course through it.
MAVPose is a headless CLI tool that turns a natural-language prompt into a matplotlib plot of your MAVLink flight log. It runs a two-phase pipeline: the parent process first extracts clean, time-aligned telemetry into a Parquet file (like a headless database query), then hands the LLM a precise column schema and a `pd.read_parquet()` call. No raw binary data, no pymavlink in the generated script. This drastically reduces hallucinations and self-healing loops.
```
$ python cli.py flight.tlog --prompt "Plot altitude over time"

Parsing log schema: flight.tlog
Schema indexed.
Finding relevant message types for: 'Plot altitude over time'
  → ['GLOBAL_POSITION_INT', 'VFR_HUD']
Extracting telemetry to Parquet...
┌─────────────────────────────────────────
│ Extracted Parquet schema
├─────────────────────────────────────────
│ [GLOBAL_POSITION_INT] 1842 rows
│   time_s: float64 [0.0 … 312.4]
│   alt:    float64 [487320 … 512100]
│ [VFR_HUD] 1842 rows
│   time_s: float64 [0.0 … 312.4]
│   alt:    float64 [476.1 … 501.3]
└─────────────────────────────────────────
Saved → /path/to/telemetry.parquet
Generating plot script with z-ai/glm-5.1...
Running script...
Plot saved to: /path/to/plot.png
```
## Architecture

```
┌────────────────────────────────────────────────────────────
│ PHASE 1 – Extraction (parent process, no LLM)
├────────────────────────────────────────────────────────────
│ LogExtractor.schema_only()
│   ├─ pymavlink fast scan → {msg_type: {fields, count}}
│   └─ ChromaDB embeddings (semantic field search)
│
│ find_relevant_data_types(prompt)
│   └─ vector similarity → ["GLOBAL_POSITION_INT", ...]
│
│ LogExtractor.extract_all() → full row materialisation
│   └─ per-msg-type DataFrames, time_s column, cast numerics
│
│ LogExtractor.export_parquet(msg_types, path)
│   └─ clean telemetry.parquet + schema summary dict
├────────────────────────────────────────────────────────────
│ PHASE 2 – LLM Plot Generation
├────────────────────────────────────────────────────────────
│ LLM receives:
│   ├─ parquet_file path
│   ├─ schema: {msg_type: {rows, columns: {col: dtype + range}}}
│   └─ output_file path (.png)
│
│ LLM writes a pandas + matplotlib script:
│   df = pd.read_parquet(parquet_file)
│   df_x = df[df['msg_type'] == 'X']
│   ...
│
│ run_script() → subprocess exec
│   └─ on failure: attempt_to_fix_script(), up to N retries
│
│ → plot.png saved next to the log file
└────────────────────────────────────────────────────────────
```
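The Phase-1 time alignment can be sketched in a few lines of pandas. This is an illustrative snippet, not the project's actual `LogExtractor` code: the hand-written `rows` list and the `ts` raw-timestamp column stand in for messages that would really come from pymavlink.

```python
import pandas as pd

# Hypothetical parsed messages; in MAVPose these come from pymavlink,
# one dict per decoded MAVLink message.
rows = [
    {"msg_type": "VFR_HUD", "ts": 1_699_000_000.0, "alt": 476.1},
    {"msg_type": "VFR_HUD", "ts": 1_699_000_000.5, "alt": 476.4},
    {"msg_type": "GLOBAL_POSITION_INT", "ts": 1_699_000_000.2, "alt": 487320},
]
df = pd.DataFrame(rows)

# One shared t=0 across all message types keeps the streams time-aligned.
t0 = df["ts"].min()
df["time_s"] = (df["ts"] - t0).astype("float64")

# Per-message-type DataFrames, raw timestamp column dropped.
frames = {
    msg_type: grp.drop(columns="ts").reset_index(drop=True)
    for msg_type, grp in df.groupby("msg_type")
}
print(frames["VFR_HUD"]["time_s"].tolist())  # [0.0, 0.5]
```

Because every frame shares the same `t0`, plots of different message types line up on a common x-axis without any resampling.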
**Why this matters:** In the old design, the LLM had to write pymavlink code to parse a binary .tlog, a notoriously tricky API with blocking/non-blocking subtleties, timestamp inconsistencies, and message-type guesswork. By the time the LLM touched the data, it was flying blind. Now it receives a typed, time-aligned Parquet file with exact column names, dtypes, and value ranges. Writing `pd.read_parquet()` + matplotlib code against a known schema is trivially reliable.
## Features

- Plain-English plots – describe any flight data; MAVPose figures out which MAVLink fields to use
- Headless extraction layer – `LogExtractor` parses the log into per-message-type DataFrames with a monotonic `time_s` index before the LLM is ever invoked
- Clean Parquet handoff – only the relevant message types are exported; the LLM sees exact column names, dtypes, and min/max ranges, with no binary guesswork
- Semantic field search – ChromaDB vector embeddings surface the most relevant message types for your query
- Self-healing scripts – the LLM debugs and rewrites failing scripts up to N times; the fix prompt also includes the full schema
- Sandboxed execution – generated code runs in an isolated subprocess with a blocked-import denylist and a 30 s timeout
- Interactive REPL mode – omit `--prompt` to enter a live loop; Parquet is re-extracted per query with the relevant types
- Configurable model – drop in any OpenRouter model with a one-line `.env` change
- Persisted vector store – ChromaDB is saved to disk; re-running on the same log skips re-embedding
- CI-tested – lint (ruff) and pytest run on every push
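The sandboxed-execution idea can be sketched with nothing but the standard library. This is not the project's `safe_executor.py`, just a minimal illustration of an AST-based import denylist plus a subprocess timeout; the denylist contents here are made up.

```python
import ast
import subprocess
import sys

DENYLIST = {"os", "subprocess", "socket", "shutil"}  # illustrative only

def check_imports(code: str) -> None:
    """Reject scripts that import denylisted modules (checked before execution)."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            continue
        blocked = DENYLIST.intersection(names)
        if blocked:
            raise ValueError(f"blocked import(s): {sorted(blocked)}")

def run_sandboxed(code: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Static denylist check, then execute in a fresh interpreter with a timeout."""
    check_imports(code)
    return subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True, timeout=timeout)

result = run_sandboxed("print(1 + 1)")
print(result.stdout.strip())  # 2
```

A static AST check is not a security boundary on its own; combining it with a separate process and a hard timeout is what keeps a misbehaving generated script from taking the CLI down with it.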
## Tech Stack

| Layer | Library / Service |
|---|---|
| LLM | OpenRouter – Z.ai GLM-5.1 (default) |
| Embeddings | OpenRouter – openai/text-embedding-3-small |
| LLM framework | LangChain (LCEL) – langchain-openai, langchain-chroma |
| Vector store | ChromaDB ≥ 0.5 (persisted to disk) |
| Extraction layer | pandas ≥ 2.0 + pyarrow ≥ 14.0 |
| Drone log parsing | pymavlink 2.4.37 |
| Plotting | matplotlib 3.7.1 (in generated script) |
| Sandbox | Custom subprocess executor with import denylist + timeout |
| Config | python-dotenv |
| Lint / CI | ruff + pytest + GitHub Actions |
## Prerequisites

- Python 3.10+
- An OpenRouter API key – get one here (free tier available)
## Installation

```bash
# 1. Clone
git clone https://github.com/AyushMaria/MAVPose.git
cd MAVPose

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure
cp template.env .env
```

Open `.env` and fill in your key:

```ini
OPENROUTER_API_KEY=your-key-here
OPENROUTER_MODEL=z-ai/glm-5.1   # default – change to any OpenRouter model
```

⚠️ Never commit `.env`; it is already listed in `.gitignore`.
## Usage

### Single prompt

```bash
python cli.py flight.tlog --prompt "Plot altitude over time"
```

### Interactive REPL

Omit `--prompt` to enter a live loop:

```bash
python cli.py flight.tlog
```

```
Plot request > Plot battery voltage and current
Plot request > Show GPS latitude and longitude
Plot request > quit
```

### All CLI flags

```
usage: mavpose [-h] [--prompt PROMPT] [--retries N] [--verbose] log_file

positional arguments:
  log_file            Path to a MAVLink log file (.tlog, .bin, .log)

options:
  -p, --prompt TEXT   Plot request. Omit for interactive mode.
  -r, --retries N     Max self-healing retries on script failure (default: 3)
  -v, --verbose       Enable debug logging
```

The output plot is saved as `plot.png` and intermediate telemetry as `telemetry.parquet`, both in the same directory as your log file.
Need a sample log file?
Download here
## Output Files

| File | Description |
|---|---|
| `telemetry.parquet` | Extracted, time-aligned telemetry for the relevant message types |
| `plot.py` | The LLM-generated pandas + matplotlib script |
| `plot.png` | The final plot at 400 dpi |
| `chroma_db/` | Persisted ChromaDB vector store (git-ignored) |
## Switching Models

MAVPose routes through OpenRouter, so any model on the platform works. Just update `.env`:

```ini
# Z.ai options
OPENROUTER_MODEL=z-ai/glm-5.1      # default – long-horizon agentic coding
OPENROUTER_MODEL=z-ai/glm-5        # flagship, complex systems
OPENROUTER_MODEL=z-ai/glm-5-turbo  # fast inference
OPENROUTER_MODEL=z-ai/glm-4.5      # MoE, 355B params, switchable reasoning

# OpenAI fallback
OPENROUTER_MODEL=openai/gpt-4o
```
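Under the hood this is just an environment lookup. The project uses python-dotenv to load `.env`; the toy `load_env` below is a hypothetical stand-in for that step (it overrides existing values, unlike dotenv's default), shown only so the precedence of the `.env` value over the built-in default is concrete.

```python
import os
import tempfile

def load_env(path: str) -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments stripped.
    MAVPose uses python-dotenv; this sketch just shows the mechanism."""
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()
            if "=" in line:
                key, _, value = line.partition("=")
                os.environ[key.strip()] = value.strip()

# Write a throwaway .env that overrides the default model.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("OPENROUTER_MODEL=openai/gpt-4o\n")
    env_path = fh.name

load_env(env_path)
model = os.environ.get("OPENROUTER_MODEL", "z-ai/glm-5.1")  # default if unset
print(model)  # openai/gpt-4o
```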
## Project Structure

```
MAVPose/
├── cli.py                 # CLI entry point – two-phase orchestration
├── app.py                 # Legacy stub (Gradio UI removed)
├── llm/
│   ├── log_extractor.py   # Headless extraction layer (LogExtractor)
│   ├── gptPlotCreator.py  # PlotCreator – orchestrates both phases
│   ├── safe_executor.py   # Subprocess sandbox with import denylist
│   └── file_validator.py  # File validation (extension, size, symlink)
├── tests/
│   ├── test_log_extractor.py         # Unit tests for LogExtractor
│   ├── test_extract_code_snippets.py
│   ├── test_file_validator.py
│   └── test_safe_executor.py
├── .github/workflows/ci.yml          # GitHub Actions: ruff lint + pytest
├── docs/
│   └── GPT_MAVPlot_Arch.png
├── target/                # Output directory for plot.py and plot.png
├── template.env
├── requirements.txt
└── LICENSE
```
## Core API – LogExtractor

```python
from llm.log_extractor import LogExtractor

extractor = LogExtractor("flight.tlog")

# Fast schema scan (no DataFrame allocation)
schema = extractor.schema_only()
# {"GLOBAL_POSITION_INT": {"count": 1842, "fields": {"lat": "int", ...}}, ...}

# Full extraction into DataFrames
frames = extractor.extract_all()
# {"GLOBAL_POSITION_INT": DataFrame with columns [time_s, msg_type, lat, lon, alt, ...], ...}

# Export to Parquet and get a schema summary for the LLM
summary = extractor.export_parquet(["GLOBAL_POSITION_INT", "VFR_HUD"], "telemetry.parquet")
# {"GLOBAL_POSITION_INT": {"rows": 1842,
#   "columns": {"time_s": {"dtype": "float64", "min": 0.0, "max": 312.4}, ...}}}
```
## Core API – PlotCreator

```python
from llm.gptPlotCreator import PlotCreator

creator = PlotCreator(max_retries=3)
creator.set_logfile_name("flight.tlog")

# Phase 1a: schema scan + embeddings
creator.parse_mavlink_log()

# Phase 1b: semantic search + Parquet extraction
msg_types = creator.find_relevant_data_types("Plot altitude over time")
schema_summary = creator.extract_dataframes(msg_types)

# Phase 2: LLM writes + executes the script
creator.create_plot("Plot altitude over time", schema_summary)
result, code = creator.run_script()
# → plot.png and telemetry.parquet saved next to the log file
```
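The self-healing behaviour behind `run_script()` amounts to a bounded retry loop. Here is a schematic version with stubbed run/fix callables; `run_with_healing`, `fake_run`, and `fake_fix` are hypothetical names, not the project's internals.

```python
def run_with_healing(script, run, fix, max_retries=3):
    """Run a script; on failure, ask `fix` for a rewrite, up to max_retries times."""
    for attempt in range(max_retries + 1):
        ok, error = run(script)
        if ok:
            return script
        if attempt == max_retries:
            raise RuntimeError(f"still failing after {max_retries} retries: {error}")
        script = fix(script, error)  # in MAVPose, an LLM call that also sees the schema

# Stubs: the script "fails" until the fix has been applied once.
def fake_run(script):
    return ("FIXED" in script, "NameError: name 'pdd' is not defined")

def fake_fix(script, error):
    return script + "  # FIXED"

final = run_with_healing("df = pdd.read_parquet(p)", fake_run, fake_fix)
print(final.endswith("# FIXED"))  # True
```

Feeding the full schema back into the fix prompt (as the Features list notes) is what keeps the loop short: the model corrects column names against ground truth instead of guessing again.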
## Running Tests

```bash
pip install pytest pytest-cov ruff
pytest tests/ -v --cov=llm
```

To lint:

```bash
ruff check llm/ cli.py
```
## Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| `KeyError: OPENROUTER_API_KEY` | `.env` not configured | Copy `template.env` to `.env` and add your key |
| `ModuleNotFoundError: pandas` | Missing dependency | Run `pip install -r requirements.txt` |
| `ModuleNotFoundError: pyarrow` | Missing dependency | Run `pip install -r requirements.txt` |
| Plot not generated after N retries | LLM script failed repeatedly | Try a more specific prompt; use `--verbose` to inspect errors |
| `FileValidationError` | Wrong file type or empty file | Only `.tlog`, `.bin`, `.log` files ≤ 200 MB are accepted |
| `ValueError: None of the requested message types were found` | Semantic search returned types not in the log | Use `--verbose` to see which types the log actually contains |
| ChromaDB version conflict | Stale venv | Delete `venv/` and reinstall with a fresh `pip install -r requirements.txt` |
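The checks behind `FileValidationError` can be pictured with a short stdlib sketch. This is not the project's `file_validator.py`; the extension list and 200 MB limit come from the table above, and the function name and error messages are invented.

```python
import os
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {".tlog", ".bin", ".log"}
MAX_SIZE_BYTES = 200 * 1024 * 1024  # 200 MB, per the troubleshooting table

class FileValidationError(ValueError):
    pass

def validate_log_file(path: str) -> Path:
    """Reject symlinks, wrong extensions, empty files, and oversized files."""
    p = Path(path)
    if p.is_symlink():
        raise FileValidationError(f"symlinks are not accepted: {p}")
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise FileValidationError(f"unsupported extension: {p.suffix!r}")
    size = p.stat().st_size
    if size == 0:
        raise FileValidationError(f"empty file: {p}")
    if size > MAX_SIZE_BYTES:
        raise FileValidationError(f"file too large: {size} bytes")
    return p

# Example with a small temporary .tlog file:
fd, tmp = tempfile.mkstemp(suffix=".tlog")
os.write(fd, b"\xfd\x00")  # arbitrary non-empty bytes
os.close(fd)
print(validate_log_file(tmp).suffix)  # .tlog
```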
## License

MIT – see LICENSE for details.