moves
Presentation control, reimagined. Hands-free slide navigation using offline speech recognition and hybrid similarity matching.
Overview
moves is a CLI tool that automates slide advancement during presentations based on your spoken words. By analyzing your presentation and corresponding transcript, it learns what you say during each slide, then uses speech recognition to detect when you move between sections—all offline and hands-free.
Key Features
- Offline speech recognition – Uses local ONNX models; your voice stays on your machine
- Hybrid similarity engine – Combines semantic and phonetic matching for accurate slide detection
- Automatic slide generation – Extracts slides from PDF presentations and generates templates with LLM assistance (optional manual mode)
- Speaker profiles – Save and reuse multiple presentations with different speakers
- Flexible source handling – Load presentations and transcripts from local files or Google Drive
- Interactive terminal UI – Real-time feedback with a Rich-powered dashboard showing the current slide, similarity scores, and system state
What It Does
- Prepare – Extract slides from a PDF, analyze your transcript, generate sections with speech content
- Control – Start live voice-controlled navigation with keyboard backups
- Manage – Add, edit, list, and delete speaker profiles
Installation
Requirements
- Python 3.13+
- uv package manager (or pip as fallback)
Install from PyPI
uv tool install moves-cli
# or: pip install moves-cli
# Verify installation
moves --version
Quick Start
1. Add a Speaker Profile
moves speaker add MyPresentation \
/path/to/presentation.pdf \
/path/to/transcript.txt
You can also use Google Drive URLs (the tool handles authentication):
moves speaker add MyPresentation \
"https://drive.google.com/file/d/.../view?usp=sharing" \
"https://drive.google.com/file/d/.../view?usp=sharing"
2. Configure LLM (for automatic section generation)
# Set your LLM model (e.g., Gemini 2.5 Flash-Lite)
moves settings set model gemini/gemini-2.5-flash-lite
# Set your API key (securely prompted)
moves settings set key
Tip: You can skip LLM setup and use --manual mode to generate empty templates you edit yourself.
3. Prepare the Speaker
Generate sections (speech content for each slide):
# Auto mode (uses LLM)
moves speaker prepare MyPresentation
# Or manual mode (empty template to edit yourself)
moves speaker prepare MyPresentation --manual
Edit ~/.moves/speakers/<speaker-id>/sections.md to add your spoken words for each slide if using manual mode.
4. Start Presentation Control
moves present MyPresentation
Keyboard shortcuts during presentation:
- ←/→ – Previous / Next slide (manual navigation)
- Ins – Pause/Resume microphone
- Ctrl+C – Exit
The tool listens to your speech and automatically advances slides when it detects you've moved to new content.
Documentation
- Getting Started Guide – Detailed walkthrough with examples
- Architecture – How the system works internally
- CLI Reference – Complete command documentation
- Configuration Guide – Set up the LLM, API keys, and more
- Development Guide – For contributors and developers
How It Works
┌─────────────────────────────────────────────────────────┐
│ 1. PREPARATION PHASE │
├─────────────────────────────────────────────────────────┤
│ • Extract slides from PDF │
│ • Analyze transcript to identify sections │
│ • Generate speech content for each slide (LLM or manual)│
│ • Create sections.md file with structure │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ 2. PRESENTATION PHASE │
├─────────────────────────────────────────────────────────┤
│ • Start microphone stream (real-time audio input) │
│ • Voice Activity Detector (VAD) filters silence │
│ • Speech Recognition converts audio to text (offline) │
│ • Similarity Engine matches text to chunks │
│ ├─ Semantic similarity (embeddings) │
│ └─ Phonetic similarity (fuzzy matching) │
│ • Auto-advance when high similarity match detected │
└─────────────────────────────────────────────────────────┘
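To make the matching step in the presentation phase more concrete, here is a minimal sketch of a hybrid score that blends two signals and picks the best chunk. It is not the tool's actual implementation: a bag-of-words cosine stands in for embedding-based semantic similarity, `difflib.SequenceMatcher` stands in for phonetic fuzzy matching, and the names `hybrid_score`, `best_match`, `semantic_weight`, and `threshold` (along with their default values) are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokens(text: str) -> list[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lexical_cosine(a: str, b: str) -> float:
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuzzy_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (stand-in for phonetic matching)."""
    return SequenceMatcher(None, " ".join(tokens(a)), " ".join(tokens(b))).ratio()


def hybrid_score(heard: str, chunk: str, semantic_weight: float = 0.6) -> float:
    """Weighted blend of the two signals; the weight is an assumed value."""
    return (semantic_weight * lexical_cosine(heard, chunk)
            + (1.0 - semantic_weight) * fuzzy_ratio(heard, chunk))


def best_match(heard: str, chunks: list[str], threshold: float = 0.5):
    """Return (index, score) of the best-scoring chunk, or None if below threshold."""
    if not chunks:
        return None
    score, index = max((hybrid_score(heard, c), i) for i, c in enumerate(chunks))
    return (index, score) if score >= threshold else None


if __name__ == "__main__":
    chunks = [
        "welcome everyone to today's talk about offline speech recognition",
        "next we look at how the similarity engine matches text",
        "finally here are the performance numbers",
    ]
    print(best_match("next we look at how the similarity engine matches texts", chunks))
```

In a sketch like this, the threshold keeps unrelated speech (audience questions, filler) from moving the slide: when no chunk scores high enough, `best_match` returns `None` and nothing advances.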
Data Storage
All speaker data is stored in ~/.moves/:
~/.moves/
├── settings.toml          # LLM model configuration
├── settings.key           # API key (Windows Credential Manager)
└── speakers/
    └── <speaker-id>/
        ├── speaker.yaml   # Speaker metadata
        └── sections.md    # Speech content for each slide
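Given that layout, a short script can report which speaker directories already contain a sections.md file. This is a sketch that relies only on the directory structure shown above; `list_speakers` is a hypothetical helper, not part of the moves CLI, and it only reads the filesystem.

```python
from pathlib import Path


def list_speakers(root: Path) -> dict[str, bool]:
    """Map each speaker directory name to whether its sections.md exists."""
    speakers_dir = root / "speakers"
    if not speakers_dir.is_dir():
        return {}
    return {
        entry.name: (entry / "sections.md").is_file()
        for entry in sorted(speakers_dir.iterdir())
        if entry.is_dir()
    }


if __name__ == "__main__":
    # Assumes the default data directory described above.
    for speaker_id, prepared in list_speakers(Path.home() / ".moves").items():
        status = "prepared" if prepared else "sections.md missing"
        print(f"{speaker_id}: {status}")
```

An empty result means either the data directory does not exist yet or no speaker has been added.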
Common Issues & Solutions
No speakers found?
moves speaker list
# Check ~/.moves/speakers/ directory exists
Sections not being created?
# Check LLM configuration
moves settings list
# Try manual mode (no LLM required)
moves speaker prepare MyPresentation --manual
Microphone not detected?
# Verify your system microphone works:
# Settings → Sound → Volume mixer (Windows)
# Then retry: moves present MyPresentation
Speech not being recognized?
- Speak clearly and at a normal pace
- Test microphone in a quiet environment
- Check that sections.md contains expected content
Performance Notes
- Offline processing – No cloud calls except for LLM section generation
- Real-time audio – ~32 ms analysis windows, responsive slide detection
- Memory efficient – Processed sections cached in sections.md
- First run slower – ONNX models (~500 MB) downloaded on first use
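As a rough illustration of what a ~32 ms analysis window means in practice, the arithmetic below converts a window length into a sample count and an update rate. The 16 kHz sample rate is an assumption (a common choice for speech models); this document does not state the rate the tool actually uses.

```python
def samples_per_window(sample_rate_hz: int, window_ms: float) -> int:
    """Number of audio samples covered by one analysis window."""
    return round(sample_rate_hz * window_ms / 1000)


def windows_per_second(window_ms: float) -> float:
    """How many back-to-back windows fit into one second of audio."""
    return 1000 / window_ms


if __name__ == "__main__":
    rate = 16_000  # assumed sample rate in Hz, not confirmed by the project
    print(samples_per_window(rate, 32))  # 512 samples per window
    print(windows_per_second(32))        # 31.25 windows per second
```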
Project Status
Active Development – This tool is being actively developed and improved. Feedback and contributions are welcome.
License
Licensed under the GNU General Public License v3.0. See LICENSE for details.
Contributing
Contributions are welcome! See Development Guide for setup instructions.
Questions? Check the FAQ in Getting Started or open an issue on GitHub.