An AI trivia assistant that uses your camera and multiple AI models.
Project description
RobbingHood: an AI trivia assistant.
The bottom of this README contains a couple of feature suggestions and ways to expand the project, for anyone who comes across it. I'd also appreciate a GitHub star if you're reading this :)
Get search-grounded, up-to-date, accurate responses to any multiple-choice trivia question within seconds (under 3 seconds on average). It combines multiple AI models to give you the best possible answer in any time-based trivia game.
Features
- Real-time capture and analysis: Point your camera at the question and get instant results
- Triple-check mode: Cross-references answers from three different AI models:
  - OpenAI's GPT-4-Turbo
  - Perplexity's Sonar Pro
  - Perplexity's Sonar
- Continuous capture: Keep your camera running for seamless question-to-question transitions
- Multi-camera support: Select from available webcams on your device
- On-screen results: View answers directly in the camera feed
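One way to picture the triple-check step is as a simple majority vote over the three model answers. The sketch below is an assumption about how such cross-referencing could work, not the project's actual logic; the `consensus` function and the model-name keys are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def consensus(answers: Dict[str, Optional[str]]) -> Tuple[Optional[str], int]:
    """Return the most common answer and how many models agreed on it.

    `answers` maps a model name to the option it picked (e.g. "B"),
    or None if that model failed. Hypothetical helper for illustration.
    """
    # Normalise case and whitespace so "b " and "B" count as the same vote.
    votes = Counter(
        a.strip().upper() for a in answers.values() if a and a.strip()
    )
    if not votes:
        return None, 0
    (top, count), *rest = votes.most_common()
    # A tie for first place means there is no clear winner.
    if rest and rest[0][1] == count:
        return None, count
    return top, count


result = consensus({"gpt-4-turbo": "B", "sonar-pro": "b ", "sonar": "C"})
print(result)  # ('B', 2)
```

Returning the agreement count alongside the answer lets the UI distinguish a unanimous result from a 2-to-1 split.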
Technical Overview
This application demonstrates several software engineering principles and technologies:
- Clean Architecture: Separation of concerns with distinct layers for UI, business logic, and data
- SOLID Principles: Single responsibility, dependency injection, and interface segregation
- Concurrent Processing: Parallel API calls using ThreadPoolExecutor for optimal performance
- Real-time Computer Vision: OpenCV integration for camera feeds and image processing
- Cloud AI Integration: Multiple AI service APIs orchestrated in a single application
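The concurrent-processing point can be sketched with the standard library's `ThreadPoolExecutor`. This is a minimal illustration under stated assumptions: `fake_model` stands in for a real network call, and `ask_all` is a hypothetical name, not a function from this project.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def fake_model(delay: float, answer: str) -> Callable[[str], str]:
    """Build a stand-in for a network call to an AI model (hypothetical)."""
    def ask(question: str) -> str:
        time.sleep(delay)  # simulate network latency
        return answer
    return ask


def ask_all(question: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Query every model concurrently and collect answers as they finish."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # Map each future back to the model name that produced it.
        futures = {pool.submit(fn, question): name for name, fn in models.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # one failed model should not sink the rest
                results[name] = f"error: {exc}"
    return results


models = {
    "gpt-4-turbo": fake_model(0.2, "B"),
    "sonar-pro": fake_model(0.2, "B"),
    "sonar": fake_model(0.2, "C"),
}
start = time.perf_counter()
answers = ask_all("Which planet is the largest?", models)
elapsed = time.perf_counter() - start
print(answers, f"{elapsed:.2f}s")
```

Because the three calls overlap, the total wait is close to the slowest single call rather than the sum of all three.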
Architecture
┌─────────────┐     ┌───────────────┐     ┌──────────────┐
│     UI      │────▶│  Application  │────▶│ AI Services  │
│  (OpenCV)   │◀────│     Core      │◀────│ (API Calls)  │
└─────────────┘     └───────────────┘     └──────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │     OCR      │
                     │   Services   │
                     └──────────────┘
Technologies Used
- Python 3.8+: Core programming language
- OpenCV: Camera interfacing and image processing
- Google Cloud Vision API: Optical Character Recognition
- API Integrations: OpenAI API, Perplexity API
- Concurrent Processing: Python's ThreadPoolExecutor
- Environment Management: python-dotenv for configuration
Code Structure
robbinhood/
├── main.py # Entry point and application bootstrap
├── config.py # Configuration management
├── camera/ # Camera abstraction layer
│ ├── __init__.py
│ └── camera_manager.py # Camera operations and frame capture
├── ocr/ # Text extraction services
│ ├── __init__.py
│ └── ocr_processor.py # OCR processing with Google Vision
├── ai/ # AI model interfaces
│ ├── __init__.py
│ ├── base_processor.py # Abstract base class for AI models
│ ├── perplexity.py # Perplexity API integration
│ └── gpt4.py # OpenAI GPT-4 integration
├── ui/ # User interface components
│ ├── __init__.py
│ ├── display.py # Display management
│ └── renderer.py # Text and overlay rendering
└── core/ # Core application logic
    ├── __init__.py
    └── app.py # Main application workflows
Design Patterns Used
- Factory Pattern: For creating AI processors
- Strategy Pattern: Different AI models implement the same interface
- Dependency Injection: Components receive their dependencies
- Observer Pattern: UI updated as results become available
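The Strategy and Factory patterns listed above might look roughly like this. Only the `BaseAIProcessor` name comes from this README; the `answer` method, the `EchoProcessor` class, and the `create_processor` registry are assumptions made for illustration and may not match the real code.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class BaseAIProcessor(ABC):
    """Common interface every model wrapper implements (Strategy pattern)."""

    @abstractmethod
    def answer(self, question: str) -> str:
        """Return the model's answer to a trivia question."""


class EchoProcessor(BaseAIProcessor):
    """Toy processor used only for illustration; it makes no API call."""

    def __init__(self, label: str) -> None:
        self.label = label

    def answer(self, question: str) -> str:
        return f"[{self.label}] {question}"


# Registry mapping a model name to a constructor (Factory pattern).
_REGISTRY: Dict[str, Callable[[], BaseAIProcessor]] = {
    "sonar": lambda: EchoProcessor("sonar"),
    "sonar-pro": lambda: EchoProcessor("sonar-pro"),
}


def create_processor(name: str) -> BaseAIProcessor:
    """Look up a processor by name and build it."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown model: {name!r}") from None


processor = create_processor("sonar")
print(processor.answer("Capital of France?"))  # [sonar] Capital of France?
```

With this shape, supporting another model means adding one subclass and one registry entry, while callers keep using the same `answer` method.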
Requirements
- Python 3.8+
- Webcam
- API keys (set as environment variables):
  - PERPLEXITY_API_KEY
  - OPENAI_API_KEY
  - GOOGLE_CREDENTIALS_PATH (path to your Google Cloud Vision API JSON credentials file)
Installation
Option 1: Install from PyPI (Recommended)
pip install robbinghood-ai-trivia
That's it! You can now run the application using robbinhood-cam.
Option 2: Install from Source
- Clone the repository:
git clone https://github.com/vikvang/robbinghood.git
cd robbinghood
- Create a virtual environment (recommended):
python3 -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the application and its dependencies:
pip install -e . # The '-e' makes it an editable install
- Set up your API keys: Ensure the following environment variables are set in your shell or in a .env file. python-dotenv is already a dependency, so the application should load the .env file for you, provided Config() loads dotenv:
PERPLEXITY_API_KEY=your_perplexity_api_key
OPENAI_API_KEY=your_openai_api_key
GOOGLE_CREDENTIALS_PATH=path/to/your/google_credentials.json
- Set up Google Cloud Vision API:
  - Create a project in the Google Cloud Console
  - Enable the Vision API
  - Create a service account and download the JSON credentials file
  - Set the GOOGLE_CREDENTIALS_PATH environment variable to the path of this file.
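To check that the three variables are visible to Python before launching the app, a small validation helper can be useful. This is a sketch, not the project's `Config` class: `load_settings` and `REQUIRED_KEYS` are hypothetical names, and the `.env` loading is attempted only if python-dotenv is importable.

```python
import os
from typing import Dict, Mapping

try:
    from dotenv import load_dotenv  # provided by python-dotenv
    load_dotenv()  # loads a .env file into os.environ if one is found
except ImportError:
    pass  # fall back to variables already set in the shell

REQUIRED_KEYS = ("PERPLEXITY_API_KEY", "OPENAI_API_KEY", "GOOGLE_CREDENTIALS_PATH")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any are missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Demonstration with an explicit mapping instead of the real environment.
demo = load_settings({
    "PERPLEXITY_API_KEY": "pplx-example",
    "OPENAI_API_KEY": "sk-example",
    "GOOGLE_CREDENTIALS_PATH": "path/to/your/google_credentials.json",
})
print(sorted(demo))  # ['GOOGLE_CREDENTIALS_PATH', 'OPENAI_API_KEY', 'PERPLEXITY_API_KEY']
```

Failing fast with the list of missing names is easier to debug than an authentication error from one of the APIs later on.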
Usage
After installation, run the program from your terminal:
robbinhood-cam
This will launch the application, starting your camera.
- The application runs in Triple Check Mode, querying GPT-4, Perplexity Sonar Pro, and Perplexity Sonar for each question.
- If you have multiple cameras, you might be prompted to select one, or you can specify one using the --camera_index argument (use robbinhood-cam --list_cameras to see available camera indices).
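For readers curious how the two flags above could be parsed, here is a minimal `argparse` sketch. The flag names come from this README; the defaults, help strings, and the `build_parser` function are assumptions and may differ from the actual entry point.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="robbinhood-cam")
    parser.add_argument(
        "--camera_index",
        type=int,
        default=None,  # assumption: None means "prompt the user or auto-select"
        help="index of the camera to open",
    )
    parser.add_argument(
        "--list_cameras",
        action="store_true",
        help="print available camera indices and exit",
    )
    return parser


# Parse an explicit argument list rather than sys.argv for demonstration.
args = build_parser().parse_args(["--camera_index", "1"])
print(args.camera_index, args.list_cameras)  # 1 False
```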
Inside the application window:
- Press SPACE to capture the current camera frame for analysis.
- Press ESC to return to the main menu (where you can change cameras or exit).
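The two key bindings above can be thought of as a small mapping from key codes to actions. The sketch below assumes the key comes from OpenCV's `cv2.waitKey()`, which returns -1 when no key was pressed; the `Action` enum and `action_for_key` function are hypothetical names, and the snippet itself does not import OpenCV.

```python
from enum import Enum


class Action(Enum):
    CAPTURE = "capture"  # grab the current frame for analysis
    MENU = "menu"        # return to the main menu
    NONE = "none"        # no relevant key pressed

KEY_SPACE = 32  # ASCII code for the space bar
KEY_ESC = 27    # ASCII code for Escape


def action_for_key(raw_key: int) -> Action:
    """Translate a cv2.waitKey()-style return value into an action."""
    if raw_key < 0:  # -1 means no key was pressed during the wait
        return Action.NONE
    key = raw_key & 0xFF  # keep the low byte; some platforms set higher bits
    if key == KEY_SPACE:
        return Action.CAPTURE
    if key == KEY_ESC:
        return Action.MENU
    return Action.NONE


print(action_for_key(32))  # Action.CAPTURE
```

Keeping the mapping in a pure function makes the key handling testable without opening a camera window.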
From the main menu in the application (which appears in the terminal where you launched robbinhood-cam):
- Choose "Start Triple Check Mode" to begin or resume camera capture.
- Choose "Change Camera" to select a different camera source.
- Choose "Exit" to close the application.
Performance Considerations (I tried implementing the following, but they could be improved)
- Parallel Processing: AI model requests run concurrently for maximum speed
- Non-blocking UI: User interface remains responsive during processing
- Optimized OCR: Google Vision API provides high-quality text extraction
- Memory Management: Temporary images are properly cleaned up
Extending the Application (feature suggestions open to anyone to build on top of this)
The modular architecture makes it easy to:
- Add new AI models by implementing the BaseAIProcessor interface
- Support alternative OCR engines by creating new OCR processor classes
- Create custom UI visualizations by extending the renderer
- Add new processing modes to the application core
File details
Details for the file robbinghood_ai_trivia-0.1.0.tar.gz.
File metadata
- Download URL: robbinghood_ai_trivia-0.1.0.tar.gz
- Upload date:
- Size: 20.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d011ca66db78b3842dd55f094e122d98d336fbd8a63d9ab3a39fd650849e964c |
| MD5 | 7f96a9f48044e906c80beefe4f3ad8da |
| BLAKE2b-256 | 1964fbb531919600ed424c9ad1940b5ca111529fc1f7f07c09d57506c2dfea8a |
File details
Details for the file robbinghood_ai_trivia-0.1.0-py3-none-any.whl.
File metadata
- Download URL: robbinghood_ai_trivia-0.1.0-py3-none-any.whl
- Upload date:
- Size: 21.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad758d933707547804f98340a754c23ec29aa336650e098716ff93fae5e77ee8 |
| MD5 | 65b9d356af9c5a67192721ee494b1837 |
| BLAKE2b-256 | 6548e2e8d12ff8fcd76551384d2f617ba3577fbb2ce84797017c6abbfe9296de |