
SOFIA Gmail & Calendar MCP Server - A Model Context Protocol server for Google services integration

Project description

SOFIA - Sort of Functional Interactive Agent

A sophisticated AI assistant platform that combines desktop automation, email/calendar management, and multimodal interaction capabilities. SOFIA provides a flexible, privacy-conscious solution for automating tasks through natural language conversation.

Overview

SOFIA operates through multiple interfaces (web, desktop overlay, and MCP server) with a dual AI backend system supporting both local (Ollama) and cloud (OpenAI) models. The platform uses advanced computer vision for UI automation and integrates seamlessly with Google services while maintaining strict security boundaries.

Key Features

  • Desktop Automation: Computer vision-powered UI element detection using OmniParser, precise mouse/keyboard control, screenshot analysis, and sandboxed shell command execution
  • Email & Calendar Integration: Full Gmail functionality (search, compose, reply, forward) and Google Calendar management through natural language with OAuth2 authentication
  • Local Conversation Storage: Compacts past conversations and stores them locally as Markdown (.md) files for future reference
  • Multimodal Processing: Text and image chat support, audio file transcription via Whisper, real-time streaming responses, and drag-and-drop file handling
  • Flexible AI Backend: Hot-swappable between local Ollama models (privacy-focused, offline) and OpenAI API (faster, more accurate) via Brain Factory pattern
  • Safety-First Design: All file operations sandboxed to ~/SOFIA/ directory, parameter validation, graceful error handling, and secure API key management

Architecture

SOFIA employs a modular architecture with three primary interfaces:

  1. Web Interface (sofia_web.py): Gradio-based UI for multimodal chat and audio transcription
  2. Desktop Interface (sofia_desktop.py): PyQt6 transparent overlay for seamless desktop integration
  3. MCP Server (sofia_gmail.py): Model Context Protocol server for Google services

The system uses a THINK → PLAN → EXECUTE → VERIFY workflow for autonomous task completion, with concurrent tool execution and conversation memory management.
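
As a rough illustration of the workflow described above, the following Python sketch runs a plan of independent tool calls concurrently and then checks the result. It is not SOFIA's actual implementation: `Step`, `TaskResult`, `run_task`, and the `plan`/`verify` callbacks are hypothetical names introduced here, and the THINK and PLAN stages are collapsed into a single planner callback for brevity.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable

# Hypothetical tool signature: an async callable taking one string argument.
Tool = Callable[[str], Awaitable[str]]


@dataclass
class Step:
    tool: str      # name of the tool to call
    argument: str  # single string argument, kept simple for the sketch


@dataclass
class TaskResult:
    outputs: list[str] = field(default_factory=list)
    verified: bool = False


async def run_task(
    goal: str,
    plan: Callable[[str, list[str]], list[Step]],
    tools: dict[str, Tool],
    verify: Callable[[str, list[str]], bool],
    max_rounds: int = 3,
) -> TaskResult:
    """Loop over THINK/PLAN -> EXECUTE -> VERIFY until verified or out of rounds."""
    result = TaskResult()
    for _ in range(max_rounds):
        # THINK + PLAN: ask the planner for the next batch of independent steps.
        steps = plan(goal, result.outputs)
        if not steps:
            break
        # EXECUTE: run the whole batch concurrently; gather keeps input order.
        outputs = await asyncio.gather(
            *(tools[step.tool](step.argument) for step in steps)
        )
        result.outputs.extend(outputs)
        # VERIFY: stop as soon as the goal is confirmed.
        if verify(goal, result.outputs):
            result.verified = True
            break
    return result
```

The outputs collected so far are passed back to the planner on every round, which stands in for the conversation memory mentioned above.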

Demo

See SOFIA's computer use capabilities in action:

sofia_demo.webm

The demo showcases SOFIA's ability to understand visual interfaces, navigate applications, and perform desktop automation tasks.

Getting Started

Requirements

  • Ubuntu 22.04+ (Windows/macOS support planned)
  • Python 3.11+
  • For computer use and local AI: CUDA-enabled GPU (24 GB+ VRAM recommended)
  • For cloud AI: OpenAI API key
  • Google credentials for email/calendar features
  • Separate monitor recommended for desktop automation

Installation

git clone https://github.com/akim42003/SOFIA.git
cd SOFIA
conda create -n sofia python=3.11
conda activate sofia
pip install -r requirements.txt

# For local AI models
ollama create sofia -f Modelfile.enhanced

Google Services Setup

  1. Place credentials.json in project root
  2. Run python sofia_gmail.py to authenticate
  3. Follow OAuth2 flow in browser

Usage

# Web interface with audio support
python sofia_web.py      # Access at localhost:7860

# Desktop overlay interface
python sofia_desktop.py  # Transparent UI overlay

# Gmail/Calendar server
python sofia_gmail.py    # MCP server on port 3000

Computer Vision & Automation

SOFIA's desktop automation leverages advanced computer vision:

  • OmniParser Integration: YOLO-based UI element detection with Florence2/BLIP2 captioning
  • Visual Grounding: Maps natural language descriptions to precise pixel coordinates
  • Screenshot Analysis: Real-time screen capture and interpretation
  • Error Recovery: Automatic retry mechanisms with visual verification
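
The error-recovery idea in the list above can be sketched as a retry loop that re-checks the screen after every attempt. This is a minimal sketch, not SOFIA's code: `act_with_verification` is a hypothetical helper, and the `action` and `verify` callables stand in for a real click and a real screenshot check.

```python
import time
from typing import Callable


def act_with_verification(
    action: Callable[[], None],
    verify: Callable[[], bool],
    max_attempts: int = 3,
    delay_seconds: float = 0.5,
) -> bool:
    """Run `action`, then `verify`; retry until verified or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        action()          # e.g. click the detected UI element
        if verify():      # e.g. take a new screenshot and look for the expected change
            return True
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # give the UI time to settle before retrying
    return False
```

Returning a boolean rather than raising leaves the caller free to decide whether a failed action should abort the task or trigger a new plan.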

AI Backend Options

Local Models (Ollama)

  • Pros: Complete privacy, offline operation, no API costs
  • Cons: Requires CUDA GPU (24GB+ VRAM), slower inference
  • Recommended Models: mistral-small3.1:24b, llama4, custom sofia model

Cloud Models (OpenAI)

  • Pros: Faster response, higher accuracy, no hardware requirements
  • Cons: API costs, requires internet, data privacy considerations
  • Available Models: gpt-4o, gpt-o4mini, etc.

Backend Management

python switch_backend.py status   # Check current backend
python switch_backend.py openai   # Switch to OpenAI
python switch_backend.py ollama   # Switch to Ollama

Configuration

Main Configuration

Edit config/sofia_config.yaml:

ai_backend: "ollama"  # or "openai"
openai:
  model: "gpt-4o"
ollama:
  model: "sofia2"
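
To show how a configuration like this could drive backend selection, here is a minimal factory sketch in Python. The class and function names (`Brain`, `OllamaBrain`, `OpenAIBrain`, `create_brain`) are assumptions made for illustration, not SOFIA's actual API, and the config is passed as a plain dict that mirrors the YAML keys above so the example needs no YAML parser.

```python
from abc import ABC, abstractmethod


class Brain(ABC):
    """Hypothetical common interface for AI backends."""

    def __init__(self, model: str) -> None:
        self.model = model

    @abstractmethod
    def describe(self) -> str:
        """Return a short label for the backend and model in use."""


class OllamaBrain(Brain):
    def describe(self) -> str:
        return f"ollama:{self.model}"


class OpenAIBrain(Brain):
    def describe(self) -> str:
        return f"openai:{self.model}"


# Registry mapping the `ai_backend` value to a backend class.
_BACKENDS: dict[str, type[Brain]] = {
    "ollama": OllamaBrain,
    "openai": OpenAIBrain,
}


def create_brain(config: dict) -> Brain:
    """Build the backend named by config['ai_backend'] with its configured model."""
    backend = config["ai_backend"]
    try:
        brain_cls = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unknown ai_backend: {backend!r}") from None
    return brain_cls(model=config[backend]["model"])


# Dict equivalent of the YAML shown above.
config = {
    "ai_backend": "ollama",
    "openai": {"model": "gpt-4o"},
    "ollama": {"model": "sofia2"},
}
brain = create_brain(config)
```

With this shape, switching backends only requires changing the `ai_backend` value; the rest of the application talks to the `Brain` interface.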

Tool Configuration

Customize agent behavior and available tools in config/tools.

User Personalization

Create config/user_config.yaml for personal preferences and context.

Security Considerations

  • All file operations restricted to ~/SOFIA/ directory
  • API keys managed via environment variables
  • OAuth2 for Google service authentication
  • Parameter validation on all tool invocations
  • No arbitrary code execution outside sandboxed environment
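
One common way to enforce a directory restriction like the first item above is to resolve every requested path and reject anything that lands outside the sandbox root. The sketch below assumes this approach; `SANDBOX_ROOT` and `resolve_in_sandbox` are hypothetical names, and SOFIA's own check may differ.

```python
from pathlib import Path

# Assumed sandbox root, matching the ~/SOFIA/ directory named above.
SANDBOX_ROOT = Path.home() / "SOFIA"


def resolve_in_sandbox(user_path: str, root: Path = SANDBOX_ROOT) -> Path:
    """Return the absolute path for `user_path`, or raise if it escapes `root`."""
    root = root.resolve()
    # Joining with an absolute `user_path` discards `root`, and resolve()
    # collapses ".." segments and symlinks, so both escape routes are caught
    # by the containment check below.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"Path escapes sandbox: {user_path!r}")
    return candidate
```

Calling `resolve_in_sandbox("notes/todo.md")` yields a path under the sandbox root, while `"../secrets.txt"` or `"/etc/passwd"` raises `PermissionError`.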

Troubleshooting

  • GPU Memory Issues: Reduce model size or switch to CPU inference
  • OCR Performance: Ensure CUDA is properly configured for OmniParser
  • Gmail Authentication: Check credentials.json and OAuth2 token validity
  • Desktop Control: Verify PyAutoGUI permissions and display configuration

Built by Alex Kim.
