Local-first AI assistant with agent mode, RAG, and MCP support
✦ AI Assistant
A cross-platform desktop AI assistant that lives in your system tray and works on whatever you are already doing — selected text, screenshots, clipboard images, and your own documents. It runs through Ollama so models stay local by default, with optional support for remote or cloud Ollama endpoints when you choose.
What makes this different
Most AI tools today are browser tabs, IDE plugins, or single-platform utilities tied to one vendor’s cloud. This project is built around a different idea: bring a capable assistant to the OS layer, without replacing your apps or sending everything to a SaaS backend.
| | Typical cloud assistants (ChatGPT, Copilot, Gemini) | Ollama WebUI / chat apps | This project |
|---|---|---|---|
| Where it runs | Vendor cloud | Local server in a browser tab | Native desktop app (Windows, macOS, Linux) |
| How you invoke it | Switch app, paste, type | Open browser, paste | Selection action bar at the cursor, global hotkey, tray |
| Context from your work | Manual copy-paste | Manual copy-paste | Captures selection, target window, screenshots |
| Your files | Upload per chat / enterprise connectors | Manual upload or plugins | Folder RAG — index a directory, ask from the action bar |
| Model choice | Vendor models only | Any Ollama model | Any Ollama model + quality presets + vision model picker |
| Privacy posture | Data leaves device by default | Stays local if Ollama is local | Local-first; you control URL, capture, and offline mode |
| Insert back into apps | Copy manually | Copy manually | Insert last reply hotkey into the foreground app |
Novel behaviors this project incorporates
1. Selection action bar (not a radial menu, not a sidebar)
After you select text, a compact toolbar appears near the cursor with one-click intents: Explain, Summarize, Translate, Ask, Screen, and Ask my files. Other tools usually make you open a separate window and paste — here the intent is chosen in context.
2. Foreground-aware capture
Before the assistant takes focus, it remembers which window and cursor position you were using. Screen capture targets that window — not the assistant’s own popup. That avoids the common “screenshot captured my chat window” failure mode.
3. System-wide workflow, not app-specific
Works in browsers, editors, PDF readers, terminals, and more via OS-level selection and hotkeys — not only inside one host application.
4. Local RAG on a folder you own
Point at a watch folder (Documents, a project directory, etc.). Files are chunked and embedded with Ollama; Ask my files pulls relevant passages into the prompt. No per-file upload dance each session.
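The chunk, embed, and retrieve flow described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the project uses Chroma with Ollama embeddings, while this sketch swaps in an in-memory list and a toy bag-of-words embedding so it runs without any services. The names `chunk_text`, `toy_embed`, and `top_passages` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows, skipping blank ones."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    windows = (text[i:i + size] for i in range(0, len(text), step))
    return [w for w in windows if w.strip()]


def toy_embed(text: str) -> Vector:
    """Stand-in for an embedding model: word counts (assumption, not Ollama)."""
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_passages(
    question: str,
    chunks: list[str],
    embed: Callable[[str], Vector] = toy_embed,
    k: int = 2,
) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]
```

In a real setup the `embed` argument would call an embedding model and the vectors would be stored once at index time rather than recomputed per question.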
5. Vision from the desktop
Paste images, capture a window, or use the Screen action for LeetCode-style problems, UI mockups, or diagrams — with configurable vision timeouts and image sizing for slow or cloud vision models.
6. Native UI per platform
PyQt6 with dedicated styling for Windows (Segoe), macOS (translucent / SF Pro), and Linux (GNOME-inspired) — not a generic Electron shell.
7. Power-user controls others often hide
Quality presets, custom model names, thinking-mode toggle for Qwen3/cloud models, remote Ollama URL detection with adjusted timeouts, chat export, recent chats, tone chips (Shorter / Simpler / Formal), and insert-reply hotkey.
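Remote URL detection with adjusted timeouts could look roughly like the sketch below. It assumes that "remote" means any host other than `localhost` or a loopback IP address; the function names and the timeout values are hypothetical and not taken from the project.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_ollama(url: str) -> bool:
    """True when the URL points at localhost or a loopback address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name (or an empty host) is treated as remote.
        return False


def request_timeout(
    url: str,
    local_seconds: float = 60.0,    # hypothetical default
    remote_seconds: float = 180.0,  # hypothetical default
) -> float:
    """Pick a longer timeout for remote or cloud endpoints."""
    return local_seconds if is_local_ollama(url) else remote_seconds
```

For example, `request_timeout("http://127.0.0.1:11434")` returns the local value, while a hosted endpoint gets the longer one.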
What others do that this project does not (by design)
- No bundled proprietary model — you install and choose models via Ollama.
- No multi-user cloud sync — chats live in your app data directory on your machine.
- No IDE-only scope — it is a general desktop assistant, not a code-editor extension.
- No always-on cloud — when Ollama runs locally, inference stays on your PC; cloud use is opt-in via your Ollama URL and model choice.
Features
- Selection action bar — Explain, Summarize, Translate, Ask, Screen, Ask my files
- Global hotkey — open assistant with current selection (Alt+S on Windows/Linux, ⌥S on macOS)
- Insert reply — paste the last AI response into any app (Ctrl+Shift+V / ⌘⇧V)
- Vision — paste images, screenshot button, window/screen capture from the action bar
- RAG — index a configurable folder with Chroma + Ollama embeddings
- Chat — streaming, recent chats, export, tone chips, safe markdown code rendering
- Settings UI — tabbed: general, AI & models, hotkeys, files, advanced
- First-run wizard — Ollama setup and model download with progress
- Tray integration — launch at login, new chat, settings, paste screenshot, quit
- Platform UI — Windows, macOS (liquid glass), Linux (GNOME-inspired)
Agent mode (beta)
Opt in via Settings → AI & models → Enable tools (beta). When enabled and your model supports Ollama tool calling (e.g. qwen3, llama3.2), the assistant can run a short read-only tool loop before answering:
- Search indexed files — RAG over your watch folder
- List / read files — only inside the configured watch folder (path-scoped; no arbitrary disk access)
- Read clipboard — current text clipboard
- Capture screen — OCR text from the foreground window (respects the screen-capture setting)
Read-only tools use the same local-first rules as chat (including offline-only mode). Requires a tool-capable Ollama model; without one, chat falls back to plain streaming.
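A short tool loop of this kind can be sketched as below. The `chat` callable is injected so the example runs without a server; it is assumed to return an assistant message shaped like the `message` field of Ollama's `/api/chat` response, with an optional `tool_calls` list whose entries carry `function.name` and `function.arguments`. The function name `run_tool_loop` and the round limit are hypothetical.

```python
from typing import Any, Callable

Message = dict[str, Any]


def run_tool_loop(
    chat: Callable[[list[Message]], Message],
    tools: dict[str, Callable[..., Any]],
    messages: list[Message],
    max_rounds: int = 3,
) -> str:
    """Let the model call read-only tools for a few rounds, then answer."""
    history = list(messages)
    for _ in range(max_rounds):
        reply = chat(history)
        history.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            # No tool requested: this is the final answer.
            return reply.get("content", "")
        for call in calls:
            name = call["function"]["name"]
            args = call["function"].get("arguments") or {}
            handler = tools.get(name)
            result = handler(**args) if handler else f"unknown tool: {name}"
            # Feed the tool output back to the model as a "tool" message.
            history.append({"role": "tool", "content": str(result)})
    # Round budget used up: ask once more and return whatever text comes back.
    # A real client would likely omit the tool list from this last request.
    return chat(history).get("content", "")
```

Because a model without tool support never returns `tool_calls`, the loop exits on the first round and behaves like plain chat.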
Desktop actions (beta)
With Enable tools and Allow desktop actions both on in Settings, the AI can — each behind an Allow / Deny dialog — write text files inside the watch folder only (.txt, .md, .csv, .json, .log), paste text where you click after a 3-second countdown, open http/https links in your browser, and open documents from the watch folder.
Constraints: text insertion is unavailable on Wayland; on macOS it needs the same Accessibility permission as hotkeys. The assistant never runs programs, presses arbitrary keys, or moves the mouse.
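The write and link restrictions above can be illustrated with a small guard. This is a sketch under stated assumptions, not the project's code: `resolve_write_target` and `is_openable_link` are hypothetical names, and the Allow / Deny dialog is left out.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions taken from the list in the text above.
ALLOWED_SUFFIXES = {".txt", ".md", ".csv", ".json", ".log"}


def resolve_write_target(watch_folder: str, requested: str) -> Path:
    """Return the resolved path, or raise if it escapes the watch folder
    or does not use an allowed text extension."""
    root = Path(watch_folder).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"{requested!r} is outside the watch folder")
    if target.suffix.lower() not in ALLOWED_SUFFIXES:
        raise PermissionError(f"{target.suffix!r} is not an allowed extension")
    return target


def is_openable_link(url: str) -> bool:
    """Accept only http/https URLs that name a host."""
    parts = urlparse(url)
    return parts.scheme in {"http", "https"} and bool(parts.netloc)
```

Resolving before comparing matters: a request such as `../escape.txt` looks harmless as a string but lands outside the folder once normalized.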
MCP servers (beta)
Configure stdio MCP servers in Settings → Advanced → MCP servers. When agent mode and MCP are enabled, tools from connected servers are advertised to the model automatically. Anything the server does not mark with a read-only hint triggers an Allow / Deny dialog that shows the exact arguments before execution.
- Disabled in offline-only mode — MCP servers are third-party programs that may use the network.
- Trust on first connect — you must explicitly trust a server before it is started.
- Requires the server's runtime (e.g. Node.js for npx @modelcontextprotocol/server-filesystem …).
Example: add a filesystem server scoped to a notes folder (npx -y @modelcontextprotocol/server-filesystem ~/Notes). Ask the agent to find action items from meeting notes; it can search and read files in that folder, then summarize. If it needs to write todo.md, a non-read-only MCP tool shows the confirmation dialog first.
SSE/HTTP MCP transport and image tool results are not in this release.
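The confirmation rule for MCP tools can be sketched as below. It assumes each tool is described by a dict with an optional `annotations` object carrying a boolean `readOnlyHint`, as in the MCP tool annotations; the helper names are hypothetical, and the dialog itself is reduced to a string that shows the exact arguments.

```python
import json
from typing import Any


def needs_confirmation(tool: dict[str, Any]) -> bool:
    """Ask the user unless the server explicitly marks the tool read-only."""
    annotations = tool.get("annotations") or {}
    # A missing or false hint is treated as "may modify something".
    return annotations.get("readOnlyHint") is not True


def describe_call(tool_name: str, arguments: dict[str, Any]) -> str:
    """Render the exact call for an Allow / Deny prompt."""
    return f"{tool_name}({json.dumps(arguments, sort_keys=True)})"
```

Defaulting to confirmation when the hint is absent keeps an unannotated server on the cautious path.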
Privacy & data
| Default | Your choice |
|---|---|
| Ollama at 127.0.0.1 | Point to a remote or cloud Ollama URL in Settings |
| Chats stored under app data on your PC | Export or delete via the chat menu |
| Screen capture can be disabled | Toggle in Settings → Advanced |
| Images stripped from saved chat JSON | Raw prompts with base64 are not persisted |
When Ollama runs locally, prompts and model output stay on your machine. If you use a cloud model (for example qwen3-vl:235b-cloud), inference goes to that endpoint — configure it explicitly in Settings → AI & models.
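Stripping images before a chat is saved could be done along these lines. The sketch assumes chat messages follow the Ollama chat format, where base64 images sit in an `images` list on a message; the `image_count` marker and the function name are hypothetical additions for illustration.

```python
import json
from typing import Any

Message = dict[str, Any]


def strip_images(messages: list[Message]) -> list[Message]:
    """Return copies of the messages without base64 image payloads."""
    cleaned = []
    for message in messages:
        slim = {key: value for key, value in message.items() if key != "images"}
        if message.get("images"):
            # Keep only a count so the saved chat shows an image was attached.
            slim["image_count"] = len(message["images"])
        cleaned.append(slim)
    return cleaned


def to_saved_json(messages: list[Message]) -> str:
    """Serialize a chat for disk with image data removed."""
    return json.dumps(strip_images(messages), ensure_ascii=False, indent=2)
```

The original message dicts are left untouched, so the in-memory chat can still show the image during the session.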
Installation
Recommended — one-line installer
Download install.py and run:
python install.py
This creates an isolated environment, installs Olly, and puts a launcher on your Desktop. Double-click it any time to start — no terminal needed.
Requires Python 3.11+ and Ollama running locally.
Manual install (for developers)
pip install olly-desktop
olly # launches the app; safe to close the terminal after
Updating
python update.py
Binary installers (optional)
Pre-built .dmg (macOS) and .exe (Windows) installers are attached to each GitHub release.
Note: The binaries are unsigned. macOS will block the .dmg at first launch; see Installation notes for the one-time workaround. The pip install above has no such restriction.
| Platform | File | Notes |
|---|---|---|
| Windows | AIAssistantSetup.exe | Recommended installer |
| Windows | AIAssistant.exe | Portable build |
| macOS | AIAssistant.dmg | Drag to Applications |
| Linux | AIAssistant-linux-x64.tar.gz | Extract and run AIAssistant/AIAssistant |
Platform notes
| OS | Permissions / limits |
|---|---|
| Windows | Antivirus may flag global hooks once — add an exclusion if prompted |
| macOS | Grant Accessibility for hotkeys and text capture |
| Linux X11 | Best support for global hotkeys and selection capture |
| Linux Wayland | Global hotkeys, selection capture, and agent text insertion may be unavailable — use the tray menu |
From source
git clone https://github.com/tp-0604/ai-assistant.git
cd ai-assistant
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Windows only:
pip install -r requirements-windows.txt
Install Ollama, then:
# macOS / Linux
./launch.sh
# or
python main.py
# Windows
launch.bat
Usage
| Action | How |
|---|---|
| Selection bar | Select text (drag, or double-click a word if enabled in Settings) |
| Open assistant | Alt+S (Windows/Linux) · ⌥S (macOS) |
| Paste image | Ctrl+V / ⌘V in chat input |
| Insert last reply | Ctrl+Shift+V / ⌘⇧V |
| Settings | Tray → Settings, or ⋮ in popup |
| Ask my files | Selection bar → Files (enable RAG and pick a folder in Settings) |
| Screenshot | Tray → Paste screenshot, or Screen in the action bar |
Settings
Stored in the app data directory — editable via Settings:
| OS | Location |
|---|---|
| Windows | %APPDATA%\AIAssistant\ |
| macOS | ~/Library/Application Support/AIAssistant/ |
| Linux | ~/.local/share/AIAssistant/ |
| Option | Description |
|---|---|
| Quality preset | Speed / Balanced / Quality models |
| AI & models | LLM, vision model, thinking mode, timeouts, Ollama URL |
| Hotkeys | Global open and insert-reply shortcuts |
| Ask my files | RAG folder, enable/disable indexing |
| Theme | Follow system / Dark / Light |
| Advanced | Screen capture, image limits, system prompt, offline-only |
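For scripts that need to find the settings directory, the per-OS locations listed above can be computed as in this sketch. The function name and its parameters are hypothetical; the fallback used when `APPDATA` is unset on Windows is an assumption, not something the project documents.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def app_data_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[str] = None,
) -> Path:
    """Return the AIAssistant data directory for the given platform."""
    env = os.environ if env is None else env
    home_dir = Path.home() if home is None else Path(home)
    if platform.startswith("win"):
        base = env.get("APPDATA")
        # Assumed fallback when APPDATA is not set.
        root = Path(base) if base else home_dir / "AppData" / "Roaming"
        return root / "AIAssistant"
    if platform == "darwin":
        return home_dir / "Library" / "Application Support" / "AIAssistant"
    return home_dir / ".local" / "share" / "AIAssistant"
```

Calling `app_data_dir()` with no arguments uses the current platform, environment, and home directory.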
Build & release
# Windows
build.bat
# macOS
./build.sh mac
# Linux
./build.sh linux
Push a version tag to build all platforms and publish to GitHub Releases:
git tag v1.2.0
git push origin v1.2.0
CI (.github/workflows/build.yml) runs tests, builds Windows/macOS/Linux artifacts, and uploads them to one release page.
Project structure
ai-assistant/
├── main.py # Entry point, tray, services
├── core/ # Ollama client, settings, RAG, capture, platform
├── ui/ # Popup, action bar, settings, onboarding, markdown
├── ui/styles/ # windows.py, macos.py, linux.py
├── utils/ # Global hotkeys
├── packaging/ # Linux desktop file
├── AI_Assistant.*.spec # PyInstaller specs per OS
└── tests/
Optional: OCR
Install Tesseract for text extraction from images when no vision model is available.
License
MIT