An AI assistant with PDF processing, web browsing, and speech capabilities
Karhu AI Assistant CLI
Karhu is a command-line AI assistant for productivity, research, and creative tasks. It supports file and document processing, web browsing, contextual conversations, speech synthesis and recognition, and profile/model management, all from your terminal.
Table of Contents
- Features
- Installation
- Configuration
- Usage
- Profiles and Models
- Speech and Voice Features
- Context Management
- Module Reference
- Testing
- Contributing
- License
Features
- File & Document Processing: Read and process PDF, text, and other files.
- Web Browsing & Search: Browse web pages and perform web searches directly from the CLI.
- Contextual Conversations: Maintain, save, and manage conversation context for seamless multi-turn interactions.
- Profile & Model Management: Switch between AI models and conversational profiles (e.g., coding, creative, academic, therapist).
- Speech Synthesis & Recognition: Text-to-speech (TTS) and speech-to-text (STT) support, including multiple voice engines.
- Interactive Mode: Chat with Karhu in a conversational loop with command autocompletion and history.
- Robust Error Handling: Graceful error messages and recovery for all operations.
- Extensible: Modular design for easy addition of new features and integrations.
Installation
- Clone the repository:
    git clone https://github.com/yourusername/karhu-cli.git
    cd karhu-cli
- Install dependencies (if using a virtual environment, activate it first):
    pip install -r requirements.txt
- (Optional) Install extra system dependencies for TTS/STT features (see Speech and Voice Features).
Configuration
Karhu uses JSON configuration files in src/karhu/config/:
- models.json: Define available AI models and their parameters.
- profiles.json: Define conversational profiles (e.g., coding, creative, therapist).
- system_prompt.json: Set the default system prompt for the assistant.
You can customize or add new profiles and models by editing these files.
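As a rough illustration of editing these files from a script rather than by hand, the sketch below adds an entry to a profiles file. The schema it uses (a flat JSON object mapping profile names to prompt strings) and the helper name `add_profile` are assumptions made for this example; check the actual layout of src/karhu/config/profiles.json before adapting it.

```python
import json
from pathlib import Path


def add_profile(config_path: Path, name: str, prompt: str) -> dict:
    """Add or overwrite one profile in a JSON file and return the updated mapping.

    ASSUMPTION: the file holds a flat {"profile_name": "prompt text"} object.
    Karhu's real profiles.json may be structured differently.
    """
    profiles = {}
    if config_path.exists():
        # Load the existing profiles so that other entries are preserved.
        profiles = json.loads(config_path.read_text(encoding="utf-8"))
    profiles[name] = prompt
    config_path.write_text(json.dumps(profiles, indent=2), encoding="utf-8")
    return profiles
```

Writing to a copy of the file first is a sensible precaution, since a schema mismatch would otherwise overwrite a working configuration.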
Usage
CLI Options
Run Karhu from the project root:
python src/karhu/cli.py [OPTIONS]
Main options:
- --query, -q <question>: Ask a direct question.
- --interactive, -i: Start interactive chat mode.
- --file, -f <path>: Process a specific file.
- --files, -ff <directory>: Process all files in a directory.
- --web, -w <url>: Browse a web page.
- --search, -s <query>: Perform a web search.
- --model, -m <name>: Select AI model.
- --profile, -P <name>: Select conversational profile.
- --setsprompt <prompt>: Set a custom system prompt.
- --save: Save conversation context.
- --clear, -c: Clear current context.
- --list-models: List available models.
- --list-profiles: List available profiles.
- --voices: List TTS voices.
- --kokoro-voices: List Kokoro TTS voices.
- --kokoro-blend <indices>: Blend Kokoro voices.
- --help-commands: Show all available commands.
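To make the shape of these options concrete, here is a minimal `argparse` sketch covering a subset of the flags listed above. It is not Karhu's actual cli.py; the function name `build_parser`, the `prog` value, and the choice of `store_true` for the flags that take no value are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a subset of the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="karhu")
    parser.add_argument("--query", "-q", metavar="QUESTION", help="Ask a direct question.")
    parser.add_argument("--interactive", "-i", action="store_true", help="Start interactive chat mode.")
    parser.add_argument("--file", "-f", metavar="PATH", help="Process a specific file.")
    parser.add_argument("--model", "-m", metavar="NAME", help="Select AI model.")
    parser.add_argument("--profile", "-P", metavar="NAME", help="Select conversational profile.")
    parser.add_argument("--save", action="store_true", help="Save conversation context.")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--interactive", "--profile", "therapist"])
print(args.interactive, args.profile)  # prints: True therapist
```

Options that are not supplied fall back to `None` (or `False` for the boolean flags), which makes it easy to tell "not given" apart from an explicit value.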
Interactive Mode
Start with:
python src/karhu/cli.py --interactive
Features:
- Command autocompletion and history.
- All CLI and special commands are available as !command (see below).
Interactive Commands
- !model [name]: Switch AI model.
- !list_models: List models.
- !profile [name]: Switch profile.
- !list_profiles: List profiles.
- !create_profile [name:prompt]: Create a new profile.
- !system_prompt: Show current system prompt.
- !setsprompt [prompt]: Set system prompt.
- !file [path]: Read a file.
- !files [directory]: Read all files in a directory.
- !browse [url]: Browse a web page.
- !search [query]: Web search.
- !context_size: Show context size.
- !context_info: Show context details.
- !optimize_context: Summarize/optimize context.
- !search_context [query]: Search within context.
- !chunk [id]: List/retrieve document chunks.
- !save: Save conversation.
- !clear: Clear context.
- !clearall: Clear all context/history.
- !lazy: Toggle speech-to-text mode.
- !speak: Toggle text-to-speech mode.
- !voices: List TTS voices.
- !voice [index]: Change TTS voice.
- !kokoro: Toggle Kokoro TTS.
- !kokoro_voices: List Kokoro voices.
- !kokoro_voice [index]: Change Kokoro voice.
- !kokoro_blend [indices]: Blend Kokoro voices.
- !help: Show help.
- !quit: Exit.
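The `!command [argument]` convention above can be handled with a small parser. The sketch below is one possible approach, not the logic in Karhu's interactive.py; the function name `parse_command` and the choice to return `None` for ordinary chat text are assumptions.

```python
def parse_command(line: str):
    """Split an interactive input line into (command, argument).

    Returns None when the line is ordinary chat text, i.e. it does not
    start with "!" followed immediately by a command name.
    """
    stripped = line.strip()
    if not stripped.startswith("!"):
        return None
    # Everything up to the first space is the command; the rest is its argument.
    name, _, arg = stripped[1:].partition(" ")
    if not name:
        return None
    return name, arg.strip()


print(parse_command("!model gpt-4o"))  # prints: ('model', 'gpt-4o')
print(parse_command("hello there"))    # prints: None
```

Keeping the argument as a single string leaves commands such as !search_context free to interpret multi-word queries themselves.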
Example Commands
- Process a PDF:
python src/karhu/cli.py --file path/to/file.pdf
- Web search:
python src/karhu/cli.py --search "What is quantum computing?"
- Switch to therapist profile in interactive mode:
python src/karhu/cli.py --interactive --profile therapist
Profiles and Models
Karhu supports multiple AI models (e.g., GPT-4o, Claude, Gemma) and conversational profiles (e.g., coding, creative, academic, therapist, funny, sarcastic, chill). You can switch or create new ones at runtime.
- List models: !list_models
- Switch model: !model <name>
- List profiles: !list_profiles
- Switch profile: !profile <name>
- Create profile: !create_profile name:prompt
Profiles are defined in src/karhu/config/profiles.json.
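The name:prompt argument to !create_profile is a single string with a colon separator. A minimal sketch of splitting it is shown below; the helper name `parse_profile_spec` and the decision to split on the first colon only (so that the prompt itself may contain colons) are assumptions, not confirmed Karhu behavior.

```python
def parse_profile_spec(spec: str) -> tuple[str, str]:
    """Split a "name:prompt" string into its two parts.

    Only the first colon is treated as the separator, so colons inside
    the prompt text are preserved.
    """
    name, sep, prompt = spec.partition(":")
    name, prompt = name.strip(), prompt.strip()
    if not sep or not name or not prompt:
        raise ValueError("expected a spec of the form name:prompt")
    return name, prompt


print(parse_profile_spec("pirate: Answer like a pirate: arr!"))
# prints: ('pirate', 'Answer like a pirate: arr!')
```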
Speech and Voice Features
- Text-to-Speech (TTS): Use !speak, !voices, and !voice [index] to enable TTS and select voices.
- Kokoro TTS: Advanced TTS engine with voice blending (!kokoro, !kokoro_voices, !kokoro_voice, !kokoro_blend).
- Speech-to-Text (STT): Use !lazy to toggle speech input mode.
Note: Some features may require additional system dependencies (e.g., espeak, ffmpeg, or platform-specific TTS engines).
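Before enabling speech features, it can help to check whether the external tools mentioned in the note are installed. The sketch below uses the standard library's `shutil.which` to look them up on PATH. The helper name `find_speech_tools` is hypothetical, and the default tool list only mirrors the examples in the note; the exact dependencies on your platform may differ.

```python
import shutil


def find_speech_tools(tools=("espeak", "ffmpeg")) -> dict:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


# Report which of the example dependencies are available on this machine.
for tool, path in find_speech_tools().items():
    print(f"{tool}: {path or 'not found'}")
```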
Context Management
- Save context: !save
- Clear context: !clear
- Clear all: !clearall
- Show context size/info: !context_size, !context_info
- Optimize context: !optimize_context
- Search context: !search_context [query]
- Chunking: !chunk [id] for large documents
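To illustrate the idea behind retrieving chunks by id, the sketch below splits a long text into fixed-size pieces keyed by an integer id. This README does not describe how Karhu actually chunks documents, so the fixed character-count strategy, the function name `chunk_text`, and the default size are all assumptions chosen for simplicity.

```python
def chunk_text(text: str, chunk_size: int = 1000) -> dict[int, str]:
    """Split text into consecutive chunks of at most chunk_size characters.

    Returns a mapping from chunk id (0, 1, 2, ...) to chunk content.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        start // chunk_size: text[start:start + chunk_size]
        for start in range(0, len(text), chunk_size)
    }


chunks = chunk_text("abcdefghij", chunk_size=4)
print(chunks)     # prints: {0: 'abcd', 1: 'efgh', 2: 'ij'}
print(chunks[1])  # prints: efgh
```

A real implementation would more likely split on paragraph or sentence boundaries so that a chunk does not cut a word in half.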
Module Reference
- ai_assistant.py: Core assistant logic and LLM interaction.
- cli.py: Command-line interface and argument parsing.
- interactive.py: Interactive chat mode.
- model_manager.py: Model selection and management.
- profile_manager.py: Profile selection and management.
- context_manager.py: Context storage, retrieval, and optimization.
- document_processor.py: File and document parsing.
- web_browser.py: Web browsing and search.
- TextToSpeech.py / SpeechToText.py / kokorotts.py: Speech synthesis and recognition.
- Display_help.py: Command help and documentation.
- Errors.py: Error handling and reporting.
- config_parser.py: Configuration file parsing.
- globals.py: Global state and settings.
Testing
Run all tests with:
pytest
Tests are located in the tests/ directory and cover core modules and features.
Contributing
- Fork the repository and create a new branch.
- Add your feature or fix.
- Write or update tests as needed.
- Submit a pull request with a clear description.
License
This project is licensed under the MIT License.
For questions or support, please open an issue on GitHub.