
A Voice-Controlled Drone Agent MCP Server


🚁 EchoPilot: A Voice-Controlled Drone Agent

Ever wanted to talk to your drone like you're in a sci-fi movie? EchoPilot makes it happen. This open-source project bridges the gap between natural language and autonomous flight, allowing you to command a drone simply by speaking.

At its core, EchoPilot uses a powerful, local Large Language Model (LLM) to automatically generate complex mission plans from a single voice command. Unlike solutions relying on external APIs, EchoPilot processes your commands right on your system, offering enhanced privacy and control. These plans are then executed by a robust set of tools running as a background service, communicating via the Model Context Protocol (MCP).

https://www.youtube.com/watch?v=-IVD1Tpn5w0&ab_channel=Bilal%C4%B0leri

✨ Features

  • Natural Language Control: Command the drone with complex sentences like, "takeoff, fly to the Eiffel tower, orbit it, and then return to launch."
  • Automatic Mission Planning: The LLM acts as an intelligent flight officer, automatically creating a safe and logical sequence of tool calls based on your voice command.
  • Decoupled Tool Server: Drone capabilities run as an independent MCP Server in the background, making the system modular and easy to extend.
  • Local & Offline First: Designed to run primarily with a local LLM via Ollama and offline Text-to-Speech (pyttsx3), giving you full control without constant internet access.
  • Telemetry-Verified Execution: This isn't fire-and-forget. The drone's tools use real-time telemetry to confirm actions are physically complete—it waits for the drone to actually arrive at a location before proceeding.
  • Safe by Design: The Planner-Executor model ensures predictable behavior. The agent generates a full plan, which can be reviewed before the drone ever takes off.
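The telemetry-verified behaviour described above can be pictured as a loop that only returns once the reported position is within a tolerance of the target. The following is a minimal sketch, not EchoPilot's actual code: `wait_until_arrived`, the 2 m tolerance, and the simulated position feed are all hypothetical, and a real tool would read fixes from the drone's telemetry stream instead.

```python
import asyncio
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


async def wait_until_arrived(positions, target, tolerance_m=2.0, timeout_s=60.0):
    """Consume (lat, lon) fixes until one is within tolerance_m of target.

    `positions` is any async iterator of fixes. Returns True on arrival,
    False if the stream ends or the timeout expires first.
    """
    async def _consume():
        async for lat, lon in positions:
            if haversine_m(lat, lon, *target) <= tolerance_m:
                return True
        return False

    try:
        return await asyncio.wait_for(_consume(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return False


async def simulated_flight(start, target, steps=5):
    """Hypothetical stand-in for telemetry: straight-line interpolation."""
    for i in range(1, steps + 1):
        f = i / steps
        yield (start[0] + f * (target[0] - start[0]),
               start[1] + f * (target[1] - start[1]))
        await asyncio.sleep(0)  # give other tasks a chance to run


def demo():
    start, target = (48.8530, 2.3499), (48.8584, 2.2945)
    return asyncio.run(wait_until_arrived(simulated_flight(start, target), target))
```

Returning `False` on timeout, instead of proceeding, is what lets a caller stop the remaining steps of a plan when the drone never reaches its waypoint.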

🛠️ Architecture: The Brain-Hand-Voice Model

This project's power comes from its decoupled, three-part architecture, connected by the Model Context Protocol (MCP).

+--------------------------+           +---------------------------------+
|    Drone Agent (Brain)   |           |       Tool Server (Hands)       |
|      (drone_agent.py)    |           |        (drone_server.py)        |
|--------------------------|           |---------------------------------|
| - Listens for Voice      |           | - Runs as a background          |
| - Uses LLM to create     |--(MCP)--> |   process (MCP Server)          |
|   a JSON mission plan    |           | - Exposes tools like            |
|   (a sequence of tools)  |           |   'fly_to_coordinates'          |
| - Executes plan step by  |           | - Talks to drone via MAVSDK     |
|   step                   |           | - Verifies actions w/ telemetry |
+--------------------------+           +---------------------------------+
  1. The Agent (The Brain 🧠): The drone_agent.py script is the mission commander. It takes your voice command and uses the LLM to generate a mission plan. This plan is a structured list of which tools to call with which arguments.

  2. The Tool Server (The Hands 👐): The drone_server.py script is the real workhorse. It runs as an independent background process, acting as an MCP Server. It exposes the drone's physical capabilities (takeoff, land, fly) as a set of robust tools. The Agent dynamically discovers and calls these tools using MCP.

  3. The Voice Interface (The Voice 🗣️): The speaker.py and voice_recognizer.py modules provide a simple, offline interface for you to talk to the agent and for the agent to talk back to you.
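To make the Brain/Hands split concrete, here is a minimal sketch of a planner-executor loop. Everything in it is an assumption for illustration: the plan schema (`tool` and `args` keys), the tool names other than `fly_to_coordinates`, and the in-process `TOOLS` registry, which stands in for tools that EchoPilot discovers and calls over MCP.

```python
import json

# Hypothetical in-process registry. In EchoPilot the tools live in
# drone_server.py and are reached over MCP instead.
TOOLS = {}


def tool(fn):
    """Register a function as a callable tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def takeoff(altitude_m=10.0):
    return f"airborne at {altitude_m} m"


@tool
def fly_to_coordinates(lat, lon):
    return f"arrived at ({lat}, {lon})"


@tool
def land():
    return "landed"


def validate_plan(plan):
    """Reject a plan before execution if any step is malformed or unknown."""
    for i, step in enumerate(plan):
        if not isinstance(step, dict) or "tool" not in step:
            raise ValueError(f"step {i}: missing 'tool'")
        if step["tool"] not in TOOLS:
            raise ValueError(f"step {i}: unknown tool {step['tool']!r}")


def execute_plan(plan_json):
    """Parse, validate, then run each step in order, collecting results."""
    plan = json.loads(plan_json)
    validate_plan(plan)  # nothing runs unless the whole plan checks out
    return [TOOLS[s["tool"]](**s.get("args", {})) for s in plan]


PLAN = """[
  {"tool": "takeoff", "args": {"altitude_m": 15}},
  {"tool": "fly_to_coordinates", "args": {"lat": 48.8584, "lon": 2.2945}},
  {"tool": "land"}
]"""

results = execute_plan(PLAN)
```

Validating the whole plan before the first call mirrors the idea that a plan can be reviewed before the drone takes off: a typo in step three is caught while the drone is still on the ground.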

⚙️ Setup Guide

Follow these steps to get the full EchoPilot experience up and running. This guide is tailored for Ubuntu Linux.

Part 1: Setting up the PX4 Simulator (SITL)

This is the most involved part, but you only have to do it once.

  1. Clone PX4 Autopilot: The PX4 project is a git repository with many submodules. It's important to clone it recursively.

    git clone https://github.com/PX4/PX4-Autopilot.git --recursive
    
  2. Run the Setup Script: The PX4 team provides a fantastic script that installs all dependencies, including the Gazebo simulator.

    cd PX4-Autopilot
    bash ./Tools/setup/ubuntu.sh
    

    This script will ask for your password (sudo) and will take a while to run.

  3. Build the Simulator: The first build compiles everything. Grab a coffee, as this can take 10-20 minutes.

    make px4_sitl gz_x500
    

    If successful, a Gazebo 3D window will launch with a drone on a runway. You can close it for now with Ctrl+C.

Part 2: Setting up the EchoPilot Agent

  1. Clone This Repository:

    git clone https://github.com/Bilalileri/EchoPilot.git
    cd EchoPilot
    
  2. Install uv (Recommended Python Package Manager): This project uses uv for fast and reliable dependency management.

    curl -LsSf https://astral.sh/uv/install.sh | sh
    source $HOME/.cargo/env
    
  3. Create and Activate a Python Virtual Environment: Use uv to create the environment, then activate it.

    # This will create a .venv folder based on the python version in .python-version
    uv venv
    source .venv/bin/activate
    

    You should see (.venv) at the start of your terminal prompt.

  4. Install Python Packages: Running uv sync installs the exact versions pinned in uv.lock, giving you a reproducible setup.

    uv sync
    

    (Alternative with pip: If you prefer not to use uv, you can still use pip with pip install -r requirements.txt, but uv sync is the recommended method for this project.)

    
    
  5. Configure Your LLM: This project runs best with a local LLM via Ollama.

    1. Install Ollama: Follow the instructions on the official Ollama website.
    2. Pull a Model: We recommend Llama 3.1.
      ollama pull llama3.1
      
    3. Set the Code: In drone_agent.py, make sure the ChatOllama line is active:
      # LLM = init_chat_model("groq:llama3-8b-8192")
      LLM = ChatOllama(model="llama3.1")
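Before starting the agent, it can help to confirm that Ollama is running and that the model has been pulled. The sketch below queries Ollama's local REST endpoint `/api/tags` (default port 11434) using only the standard library. The helper names are hypothetical, and this check is not part of EchoPilot itself.

```python
import json
import urllib.error
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default address


def has_model(tags_payload, model):
    """True if `model` appears in a parsed /api/tags response.

    Ollama lists models with a tag, e.g. "llama3.1:latest", so a bare
    name matches any tag of that model.
    """
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    return any(n == model or n.split(":")[0] == model for n in names)


def ollama_has_model(model, url=OLLAMA_TAGS_URL, timeout=3.0):
    """Ask the local Ollama server; returns False if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return has_model(json.load(resp), model)
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Calling `ollama_has_model("llama3.1")` from a startup script would then tell you whether to run `ollama pull llama3.1` first.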
      

    Alternative (Fast Cloud LLM with Groq):

    1. Get a free API key from the Groq Console.
    2. Create a file named .env in the project root.
    3. Add your key to the .env file: GROQ_API_KEY="your_groq_api_key_here"
    4. In drone_agent.py, make sure the init_chat_model line is active:
      LLM = init_chat_model("groq:llama3-8b-8192")
      # LLM = ChatOllama(model="llama3.1")
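Rather than commenting lines in and out, the choice between the two backends could be driven by the environment. This is only a sketch of that idea, not how drone_agent.py is written: `pick_model_spec` and `build_llm` are hypothetical helpers, and the `ollama:` provider prefix assumes a LangChain version whose `init_chat_model` accepts it with `langchain-ollama` installed.

```python
import os


def pick_model_spec(env=None):
    """Return a "provider:model" string: Groq if a key is set, else local Ollama."""
    env = os.environ if env is None else env
    if env.get("GROQ_API_KEY", "").strip():
        return "groq:llama3-8b-8192"
    return "ollama:llama3.1"


def build_llm(env=None):
    """Construct the chat model lazily so the import only happens when needed."""
    # Third-party import; assumed to be installed alongside the project.
    from langchain.chat_models import init_chat_model
    return init_chat_model(pick_model_spec(env))
```

With this approach the local model stays the default, and the cloud backend is used only when a key has been provided explicitly.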
      

▶️ Running the Project

Running this project requires a specific startup sequence across multiple terminals.

Step 1: Start the Core Services

First, launch the simulation that the agent will connect to.

  • Terminal #1: PX4 SITL Simulation. This command starts the drone simulation itself within Gazebo. The gz_x500 target selects the Gazebo simulator and the x500 drone model.
    cd ~/PX4-Autopilot
    make px4_sitl gz_x500
    
    Wait for the Gazebo window to appear and the PX4 console to finish its startup sequence.

Step 2: Start the EchoPilot Agent

This is the final step that brings everything together.

  • Terminal #2: EchoPilot Agent. This runs the main voice-controlled agent. Make sure to activate your project's virtual environment first.
    source ~/EchoPilot/.venv/bin/activate  # Or your project's venv path
    cd ~/EchoPilot  # Or your project's path
    python3 drone_agent.py
    

Step 3: Give Your Command

  • The agent will initialize and greet you.
  • When prompted, speak your command clearly into your microphone.
  • Watch the agent generate the plan in Terminal #2 and monitor the drone's execution in Gazebo!

(Optional) Monitor with QGroundControl

For a professional mission control dashboard, you can also run QGroundControl. It will automatically connect to the running simulation and give you a real-time map, telemetry, and a direct view into the drone's state.

# In another terminal, from where you downloaded it:
chmod +x ./QGroundControl.AppImage
./QGroundControl.AppImage

Example Commands

  • "Takeoff, fly 50 meters forward, and then land."
  • "Takeoff and go to the Eiffel Tower and come back."
  • "Takeoff, fly to Pont d'Iéna bridge in Paris, orbit it, and then land there."
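A command such as "fly 50 meters forward" has to be turned into target coordinates at some point. One common way to do that, shown here as a sketch and not necessarily how EchoPilot does it, is a flat-earth approximation that offsets the current position along the current heading. The function name and the sample position are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def offset_position(lat_deg, lon_deg, heading_deg, distance_m):
    """Move distance_m from (lat, lon) along heading_deg (0 = north, 90 = east).

    Uses a local flat-earth approximation, which is adequate for short hops
    of tens or hundreds of meters away from the poles.
    """
    heading = math.radians(heading_deg)
    d_north = distance_m * math.cos(heading)
    d_east = distance_m * math.sin(heading)
    d_lat = math.degrees(d_north / EARTH_RADIUS_M)
    # A degree of longitude shrinks with latitude, hence the cos() term.
    d_lon = math.degrees(d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + d_lat, lon_deg + d_lon


# 50 m due north of a sample position near the Eiffel Tower
new_lat, new_lon = offset_position(48.8584, 2.2945, heading_deg=0.0, distance_m=50.0)
```

For the long-distance commands in the list, such as flying to a named landmark, a planner would instead need actual coordinates for the destination; this helper only covers relative moves.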


