A Voice-Controlled Drone Agent MCP Server
Project description
🚁 EchoPilot: A Voice-Controlled Drone Agent
Ever wanted to talk to your drone like you're in a sci-fi movie? EchoPilot makes it happen. This open-source project bridges the gap between natural language and autonomous flight, allowing you to command a drone simply by speaking.
At its core, EchoPilot uses a powerful, local Large Language Model (LLM) to automatically generate complex mission plans from a single voice command. Unlike solutions relying on external APIs, EchoPilot processes your commands right on your system, offering enhanced privacy and control. These plans are then executed by a robust set of tools running as a background service, communicating via the Model Context Protocol (MCP).
Demo video: https://www.youtube.com/watch?v=-IVD1Tpn5w0&ab_channel=Bilal%C4%B0leri
✨ Features
- Natural Language Control: Command the drone with complex sentences like, "takeoff, fly to the Eiffel tower, orbit it, and then return to launch."
- Automatic Mission Planning: The LLM acts as an intelligent flight officer, automatically creating a safe and logical sequence of tool calls based on your voice command.
- Decoupled Tool Server: Drone capabilities run as an independent MCP Server in the background, making the system modular and easy to extend.
- Local & Offline First: Designed to run primarily with a local LLM via Ollama and offline Text-to-Speech (`pyttsx3`), giving you full control without constant internet access.
- Telemetry-Verified Execution: This isn't fire-and-forget. The drone's tools use real-time telemetry to confirm actions are physically complete: the agent waits for the drone to actually arrive at a location before proceeding.
- Safe by Design: The Planner-Executor model ensures predictable behavior. The agent generates a full plan, which can be reviewed before the drone ever takes off.
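To make the "full plan" idea concrete, the snippet below sketches what a generated mission plan might look like as JSON. The tool names (`takeoff`, `fly_to_coordinates`, `orbit`, `return_to_launch`) and argument keys here are illustrative assumptions; the real schema is defined by the tools exposed in `drone_server.py`.

```python
import json

# A hypothetical mission plan for "takeoff, fly to the Eiffel Tower,
# orbit it, and then return to launch". Tool names and argument keys
# are illustrative; the actual schema comes from drone_server.py.
plan_json = """
[
  {"tool": "takeoff", "args": {"altitude_m": 20}},
  {"tool": "fly_to_coordinates", "args": {"lat": 48.8584, "lon": 2.2945}},
  {"tool": "orbit", "args": {"radius_m": 30}},
  {"tool": "return_to_launch", "args": {}}
]
"""

plan = json.loads(plan_json)
for step in plan:
    print(f"{step['tool']}({step['args']})")
```

Because the plan is plain data, it can be logged, diffed, or shown to the operator for review before a single motor spins.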
🛠️ Architecture: The Brain-Hand-Voice Model
This project's power comes from its decoupled, three-part architecture, connected by the Model Context Protocol (MCP).
```
+--------------------------+            +-------------------------------+
|   Drone Agent (Brain)    |            |      Tool Server (Hands)      |
|    (drone_agent.py)      |            |      (drone_server.py)        |
|--------------------------|            |-------------------------------|
| - Listens for voice      |            | - Runs as a background        |
| - Uses LLM to create     |--(MCP)-->  |   process (MCP Server)        |
|   a JSON mission plan    |            | - Exposes tools like          |
|   (a sequence of tools)  |            |   'fly_to_coordinates'        |
| - Executes plan step     |            | - Talks to drone via MAVSDK   |
|   by step                |            | - Verifies actions with       |
|                          |            |   telemetry                   |
+--------------------------+            +-------------------------------+
```
- The Agent (The Brain 🧠): The `drone_agent.py` script is the mission commander. It takes your voice command and uses the LLM to generate a mission plan: a structured list of which tools to call with which arguments.
- The Tool Server (The Hands 👐): The `drone_server.py` script is the real workhorse. It runs as an independent background process, acting as an MCP Server. It exposes the drone's physical capabilities (takeoff, land, fly) as a set of robust tools, which the Agent dynamically discovers and calls via MCP.
- The Voice Interface (The Voice 🗣️): The `speaker.py` and `voice_recognizer.py` modules provide a simple, offline interface for you to talk to the agent and for the agent to talk back to you.
⚙️ Setup Guide
Follow these steps to get the full EchoPilot experience up and running. This guide is tailored for Ubuntu Linux.
Part 1: Setting up the PX4 Simulator (SITL)
This is the most involved part, but you only have to do it once.
1. Clone PX4 Autopilot: The PX4 project is a git repository with many submodules, so it's important to clone it recursively.

   ```bash
   git clone https://github.com/PX4/PX4-Autopilot.git --recursive
   ```

2. Run the Setup Script: The PX4 team provides a script that installs all dependencies, including the Gazebo simulator.

   ```bash
   cd PX4-Autopilot
   bash ./Tools/setup/ubuntu.sh
   ```

   This script will ask for your password (`sudo`) and will take a while to run.

3. Build the Simulator: The first build compiles everything. Grab a coffee, as this can take 10-20 minutes.

   ```bash
   make px4_sitl gz_x500
   ```

   If successful, a Gazebo 3D window will launch with a drone on a runway. You can close it for now with `Ctrl+C`.
Part 2: Setting up the EchoPilot Agent
1. Clone This Repository:

   ```bash
   git clone https://github.com/Bilalileri/EchoPilot.git
   cd EchoPilot
   ```

2. Install `uv` (Recommended Python Package Manager): This project uses `uv` for fast and reliable dependency management.

   ```bash
   curl -LsSf https://astral.sh/uv/install.sh | sh
   source $HOME/.cargo/env
   ```

3. Create and Activate a Python Virtual Environment: You can use `uv` to create the environment in one step.

   ```bash
   # This will create a .venv folder based on the python version in .python-version
   uv venv
   source .venv/bin/activate
   ```

   You should see `(.venv)` at the start of your terminal prompt.

4. Install Python Packages: Using `uv sync` will install the exact versions from `uv.lock`, ensuring a reproducible setup.

   ```bash
   uv sync
   ```

   (Alternative with pip: if you prefer not to use `uv`, you can still run `pip install -r requirements.txt`, but `uv sync` is the recommended method for this project.)

5. Configure Your LLM: This project runs best with a local LLM via Ollama.

   - Install Ollama: Follow the instructions on the official Ollama website.
   - Pull a Model: We recommend Llama 3.1.

     ```bash
     ollama pull llama3.1
     ```

   - Set the Code: In `drone_agent.py`, make sure the `ChatOllama` line is active:

     ```python
     # LLM = init_chat_model("groq:llama3-8b-8192")
     LLM = ChatOllama(model="llama3.1")
     ```

   Alternative (Fast Cloud LLM with Groq):

   - Get a free API key from the Groq Console.
   - Create a file named `.env` in the project root.
   - Add your key to the `.env` file:

     ```
     GROQ_API_KEY="your_groq_api_key_here"
     ```

   - In `drone_agent.py`, make sure the `init_chat_model` line is active:

     ```python
     LLM = init_chat_model("groq:llama3-8b-8192")
     # LLM = ChatOllama(model="llama3.1")
     ```
▶️ Running the Project
Running this project requires a specific startup sequence across multiple terminals.
Step 1: Start the Core Services
First, launch the necessary background services for ROS 2 communication and the simulation.
- Terminal #1: PX4 SITL Simulation

  This command starts the drone simulation itself within Gazebo; the `gz_x500` target selects the simulated drone model.

  ```bash
  cd ~/PX4-Autopilot
  make px4_sitl gz_x500
  ```

  Wait for the Gazebo window to appear and the PX4 console to finish its startup sequence.
Step 2: Start the EchoPilot Agent
This is the final step that brings everything together.
- Terminal #2: EchoPilot Agent

  This runs the main voice-controlled agent. Make sure to activate your project's virtual environment first.

  ```bash
  source ~/echodrone/.venv/bin/activate  # Or your project's venv path
  cd ~/echodrone                         # Or your project's path
  python3 drone_agent.py
  ```
Step 3: Give Your Command
- The agent will initialize and greet you.
- When prompted, speak your command clearly into your microphone.
- Watch the agent generate the plan in Terminal #2 and monitor the drone's execution in Gazebo!
(Optional) Monitor with QGroundControl
For a professional mission control dashboard, you can also run QGroundControl. It will automatically connect to the running simulation and give you a real-time map, telemetry, and a direct view into the drone's state.
```bash
# In another terminal, from where you downloaded it:
chmod +x ./QGroundControl.AppImage
./QGroundControl.AppImage
```
Example Commands
- "Takeoff, fly 50 meters forward, and then land."
- "Takeoff and go to the Eiffel Tower and come back."
- "Takeoff, fly to Pont d'Iéna bridge in Paris, orbit it, and then land there."
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file iflow_mcp_bilalileri_echopilot_ollama_px4_mcp-0.1.0.tar.gz.
File metadata
- Download URL: iflow_mcp_bilalileri_echopilot_ollama_px4_mcp-0.1.0.tar.gz
- Upload date:
- Size: 14.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `71bf14053f903e63d29c780e20eeab5447e7252bf85e31d34470041eeead01c8` |
| MD5 | `de5b6e429cd50ce143e7d4c5eb93ffaa` |
| BLAKE2b-256 | `957cdb471a4bafa59ee4f0d3ec160c5561004241bb3599ff02ea4c0aef2a55df` |
File details
Details for the file iflow_mcp_bilalileri_echopilot_ollama_px4_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: iflow_mcp_bilalileri_echopilot_ollama_px4_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0db254f0d4943bd9786dc619e3e00cb767d8a040fb125b64697842b578a49c26` |
| MD5 | `2a70d36b4b746faaec04e60bf370a543` |
| BLAKE2b-256 | `31f2e246edd66132951e6c052f99ce71ea35e58de6e8cb807187520ae2789873` |