An AI-powered conversational assistant with voice support and document intelligence

Samvaad: Facilitating Dialogue-Based Learning

Note

Voice queries are now fully supported with Kokoro TTS for high-quality speech synthesis in English and Hindi
Frontend/UI is under development - currently CLI-only
Voice chat feature includes automatic markdown processing for clean display and natural pronunciation

Please see the issues for ideas or to report bugs.

Recent Updates

Kokoro TTS: Neural TTS engine with high-quality speech synthesis
Voice Queries: Ask questions or query documents in your preferred language (Hindi, English, etc.)
GPU Acceleration: Automatic GPU detection for faster processing
Performance Monitoring: Timing instrumentation for all pipeline steps
OS Compatibility: Cross-platform path resolution
Separate Requirements: CPU and GPU-specific dependency files
Interactive CLI: Improved user interface for all operations

About The Project

Samvaad (Sanskrit for "dialogue") is open-source software that combines Retrieval-Augmented Generation (RAG) with end-to-end voice capabilities. You add your documents, Samvaad indexes and stores them, and you can then hold a text or voice conversation with those documents and get accurate, context-aware answers. Built with a modular backend and a modern frontend (in the works), Samvaad makes it easy to learn new topics, clear up confusion, and keep learning, all while feeling like a friend.

Getting Started

Prerequisites

Python 3.11: This project is optimized for Python 3.11. Some dependencies (like sounddevice for voice features) provide wheels primarily for this version. Ensure you're using 3.11:
```
python --version  # Should show Python 3.11.x
```
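The same check can be run from inside Python, for example at the top of a setup script. This is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of Samvaad.

```python
import sys


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.11.x."""
    # Compare only major and minor; any 3.11 patch release qualifies.
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    if not is_supported_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Warning: Samvaad targets Python 3.11, found {found}")
```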

Follow these steps to set up and run Samvaad locally:

1. Clone the Repository

git clone https://github.com/HapoSeiz/samvaad.git
cd samvaad

2. Set Up a Virtual Environment

Install uv (if not already installed):

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows:

uv venv
.venv\Scripts\activate

macOS/Linux:

uv venv
source .venv/bin/activate

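Note: uv creates a .venv directory by default (with a dot), so the activation script is .venv/bin/activate on Unix systems and .venv\Scripts\activate on Windows. uv does not install pip into the environment by default; if the pip commands below are not found, run them as uv pip install ... or create the environment with uv venv --seed.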

3. Install Samvaad

Option 1: Install from PyPI (recommended):

For CPU-only systems:

pip install samvaad[cpu]

For GPU systems (CUDA 12.1):

pip install samvaad[gpu]

Option 2: Install from source:

For CPU-only systems:

git clone https://github.com/atharva-again/samvaad.git
cd samvaad
pip install -e .[cpu]

For GPU systems:

git clone https://github.com/atharva-again/samvaad.git
cd samvaad
pip install -e .[gpu]

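Note: Always ensure your virtual environment is activated before installing packages. In zsh (the default shell on macOS), quote the extras so the brackets are not treated as a glob pattern, e.g. pip install "samvaad[cpu]". If you encounter PyTorch installation issues, visit https://pytorch.org/get-started/locally/ for manual installation instructions.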

Important: Install Required Llama-cpp-python Fork

Samvaad requires a fork of llama-cpp-python for Gemma model support. Install the fork manually before running either install option above:

pip install git+https://github.com/inference-sh/llama-cpp-python.git

Then install Samvaad as usual:

pip install samvaad[cpu]
# or
pip install samvaad[gpu]

4. Add Your Documents

Place your documents inside the data/documents/ folder. Supported file types include:

PDF files (.pdf)
Microsoft Office documents (.docx, .pptx, .xlsx)
Text files (.txt, .md)
Web pages (.html, .htm)
Images (.png, .jpg, .jpeg, .tiff, .bmp) - with OCR support
Other formats supported by Docling (e.g., .rtf, .epub)

These will be used as the chatbot's knowledge base.
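To see which files in a folder fall within the supported types before ingesting them, a small filter like the one below can help. It is a sketch only: the extension set is copied from the list above, and `find_ingestible_files` is a hypothetical helper, not a Samvaad function.

```python
from pathlib import Path

# Extensions copied from the list above; Docling may accept more.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".txt", ".md",
    ".html", ".htm", ".png", ".jpg", ".jpeg", ".tiff", ".bmp",
    ".rtf", ".epub",
}


def find_ingestible_files(folder: str) -> list[Path]:
    """Return files under `folder` whose extension is in the supported set."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_ingestible_files("data/documents")` lists the candidates; each path could then be passed to the CLI's ingest command.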

5. Configure Environment

Create a .env file in the root directory and add your API keys:

# Copy and edit the following into .env
GEMINI_API_KEY=your_gemini_api_key_here

You can get a Gemini API key from Google AI Studio.

Note: The system works without API keys but will only show retrieved context without AI-generated answers.
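The fallback described in the note can be pictured as a simple check for the key at startup. The sketch below uses only the standard library and assumes a plain KEY=value .env file; `load_env_file` and `resolve_mode` are hypothetical names, and Samvaad's own loading logic may differ.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_mode(env_file: str = ".env") -> str:
    """Return 'generation' when a Gemini key is set, else 'retrieval-only'."""
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        key = load_env_file(env_file).get("GEMINI_API_KEY")
    return "generation" if key else "retrieval-only"
```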

6. Process Your Documents

Run the interactive CLI to ingest documents:

samvaad

Then use commands like:

i document.pdf to ingest a file
q What is the main topic? to query

7. Query Your Knowledge Base

Use the interactive CLI for querying:

samvaad

Inside the CLI:

q What are the main findings? - Basic query

Voice Queries

Samvaad supports multilingual voice queries, allowing you to ask questions in Hindi, English, Hinglish, or other languages. The system transcribes your speech and responds in the same language/style.

# Start interactive mode
samvaad

# Inside CLI:
v
# This starts voice recording mode. Speak your question in any supported language.
# The system will transcribe, process, and respond accordingly.

Supported Languages: Hindi, Hinglish (code-mixed), English, and auto-detection for other languages.

TTS Engine Options:

Kokoro TTS: Neural TTS engine with high-quality voices (English & Hindi)

# Voice query with Kokoro TTS
v

Features:

Automatic silence detection (2 seconds of silence stops recording)
Markdown-aware responses (clean text for both display and speech)
Audio responses saved to data/audio_responses/ with engine-specific filenames
Real-time language detection and appropriate voice selection
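The silence-based stop condition can be illustrated with a simple energy check over recent audio frames. This is a sketch under stated assumptions: the RMS threshold, the frame length, and the function names are illustrative, and the source does not say how Samvaad measures silence.

```python
import math


def rms(frame: list[float]) -> float:
    """Root-mean-square level of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def should_stop(frames: list[list[float]], frame_seconds: float,
                silence_seconds: float = 2.0, threshold: float = 0.01) -> bool:
    """Return True once the trailing run of quiet frames spans `silence_seconds`."""
    quiet_frames = 0
    for frame in reversed(frames):
        if rms(frame) >= threshold:
            break  # speech found; the trailing silence ends here
        quiet_frames += 1
    return quiet_frames * frame_seconds >= silence_seconds
```

A recording loop would append each captured frame to `frames` and stop as soon as `should_stop` returns True.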

API Endpoints

Samvaad provides a REST API for programmatic access:

TTS Endpoint:

POST /tts
Content-Type: application/json

{
  "text": "Your text here",
  "language": "en"
}

Supported TTS Engine:

kokoro - Neural TTS (higher quality, English & Hindi)

Response:

{
  "audio_base64": "base64_encoded_wav_data",
  "sample_rate": 24000,
  "format": "wav"
}
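A client for this endpoint can be sketched with the standard library alone. The request and response fields follow the JSON shown above; the base URL is an assumption (the source does not state a host or port), and the function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumed local address; adjust to wherever the API server is running.
BASE_URL = "http://localhost:8000"


def build_tts_request(text: str, language: str = "en") -> urllib.request.Request:
    """Build a POST /tts request with the JSON body shown above."""
    body = json.dumps({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/tts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_tts_response(payload: dict) -> bytes:
    """Decode the base64 audio field of a /tts response into raw WAV bytes."""
    return base64.b64decode(payload["audio_base64"])


def synthesize(text: str, language: str = "en", out_path: str = "speech.wav") -> str:
    """Call the endpoint and write the returned audio to `out_path`."""
    with urllib.request.urlopen(build_tts_request(text, language)) as response:
        payload = json.load(response)
    with open(out_path, "wb") as handle:
        handle.write(decode_tts_response(payload))
    return out_path
```

With the server running, `synthesize("Hello", "en")` would save the audio next to the script.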

Direct Voice Query Usage

For direct voice queries without the interactive CLI:

# Voice query with Kokoro TTS
python -m samvaad.pipeline.retrieval.query_voice

# Voice query with specific Gemini model
python -m samvaad.pipeline.retrieval.query_voice --model gemini-2.5-flash

Usage Examples

Interactive CLI

Samvaad now uses an interactive command-line interface for all operations:

samvaad

Available commands:

i <file> or ingest <file> - Process and ingest a file
q <text> or query <text> - Query the knowledge base
v or voice - Start voice query mode (supports multiple languages like Hindi, English, Hinglish)
r <file> or remove <file> - Remove a file and its embeddings
h or help - Show help
e or exit - Exit the CLI
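The short and long command forms listed above amount to an alias table. The sketch below shows one way such input could be split into a command and its arguments; `ALIASES` and `parse_command` are hypothetical and do not describe Samvaad's actual CLI code.

```python
import shlex

# Short and long command names from the list above, mapped to one canonical name.
ALIASES = {
    "i": "ingest", "ingest": "ingest",
    "q": "query", "query": "query",
    "v": "voice", "voice": "voice",
    "r": "remove", "remove": "remove",
    "h": "help", "help": "help",
    "e": "exit", "exit": "exit",
}


def parse_command(line: str) -> tuple[str, list[str]]:
    """Split a CLI line into (canonical command, remaining arguments)."""
    try:
        parts = shlex.split(line)
    except ValueError:
        # Unbalanced quotes (e.g. an apostrophe): fall back to whitespace split.
        parts = line.split()
    if not parts:
        return "help", []
    command = ALIASES.get(parts[0].lower())
    if command is None:
        return "unknown", parts
    return command, parts[1:]
```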

Document Processing

# Start interactive mode
samvaad

# Inside CLI:
i documents/research_paper.pdf
# Output includes timing: ⏱️ Parsing time: 0.1234 seconds, etc.

# Remove a document
r documents/old_file.pdf
# Output: ⏱️ Deletion time: 0.0567 seconds

Querying Your Knowledge Base

# Start interactive mode
samvaad

# Inside CLI:
q "What are the main findings?"
# Output includes total query time and sources

q "Explain the methodology" -k 8
# Retrieve more context chunks

q "What are the implications?" -m gemini-2.5-flash
# Use Gemini model for answers
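The -k and -m options shown above behave like ordinary command-line flags. A sketch of parsing them with argparse follows; the default of 3 chunks is an assumption made for illustration, and `parse_query` is a hypothetical helper rather than Samvaad's parser.

```python
import argparse
import shlex


def parse_query(arguments: str) -> argparse.Namespace:
    """Parse the text after `q`, e.g. '"Explain the methodology" -k 8'."""
    parser = argparse.ArgumentParser(prog="q", add_help=False)
    parser.add_argument("text", nargs="+", help="question words")
    # Default chunk count is assumed here, not taken from the source.
    parser.add_argument("-k", type=int, default=3, help="chunks to retrieve")
    parser.add_argument("-m", default=None, help="model name")
    parsed = parser.parse_args(shlex.split(arguments))
    parsed.text = " ".join(parsed.text)
    return parsed
```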

Performance Monitoring

The CLI now shows timing for each step:

⏱️ Parsing time: 0.1234 seconds
⏱️ Chunking time: 0.0567 seconds
⏱️ Embedding time: 1.2345 seconds
⏱️ Storage time: 0.0890 seconds
⏱️ Total query time: 2.3456 seconds
⏱️ Deletion time: 0.0123 seconds
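Per-step timings like these are commonly collected with a small context manager around each pipeline step. The sketch below reproduces the output format shown above; `timed` is a hypothetical helper, not the instrumentation Samvaad ships.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, report=print):
    """Measure the wall-clock time of a block and report it in the CLI's format."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed steps are still timed.
        elapsed = time.perf_counter() - start
        report(f"⏱️ {label} time: {elapsed:.4f} seconds")
```

Usage is `with timed("Chunking"): ...` around the step to measure; passing a different `report` callable sends the line to a logger instead of the terminal.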

GPU Acceleration

If a CUDA-compatible GPU is detected, operations will automatically use GPU acceleration for:

Document parsing (Docling)
Text embeddings (SentenceTransformer)
Cross-encoder reranking
LLM inference (if supported)

Check GPU usage with nvidia-smi during processing.
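Automatic detection of this kind usually comes down to asking PyTorch whether CUDA is usable. A minimal sketch, assuming PyTorch is the detection mechanism (the source does not say so explicitly) and using a hypothetical `pick_device` helper:

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # optional here; the CPU path works without it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Using device: {pick_device()}")
```

The returned string can be passed wherever a library accepts a device name, for instance the `device` argument of `SentenceTransformer`.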

Example Output

🔍 Processing query: 'What is the theory of Ballism?'
============================================================
⏱️ Total query time: 2.3456 seconds

📝 QUERY: What is the theory of Ballism?

🤖 ANSWER:
The theory of Ballism, formally known as the Principle of Spherical Convergence, posits that all matter and energy in the universe is subject to a fundamental force that compels it to assume a perfect spherical shape over infinitely long periods...

📚 SOURCES (3 chunks retrieved):

1. ballism.txt (Similarity: 0.847)
   Preview: The theory of Ballism, formally known as the Principle of Spherical Convergence...

2. ballism.txt (Similarity: 0.723)
   Preview: Dr. Finch's initial "Finches' Folly" experiment...
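The similarity values in the SOURCES list rank chunks against the query. The sketch below shows how such a ranking could be computed, assuming the scores are cosine similarities between embedding vectors; the source does not state which metric Samvaad uses, and both function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 3):
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in chunks.items()]
    scored.sort(reverse=True)
    return [(name, round(score, 3)) for score, name in scored[:k]]
```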

Project Structure

samvaad/
├── samvaad/          # Python code for the RAG pipeline and API
│   ├── pipeline/     # Core RAG components
│   │   ├── generation/    # LLM integration and TTS engine (Kokoro)
│   │   ├── ingestion/     # Document processing and chunking
│   │   ├── retrieval/     # Query processing and voice recognition
│   │   ├── vectorstore/   # Vector database operations
│   │   └── deletion/      # Document removal utilities
│   ├── utils/        # Utilities (hashing, DB, GPU detection)
│   ├── interfaces/   # CLI and API interfaces
│   │   ├── api.py    # FastAPI server with TTS API
│   │   └── cli.py    # Interactive CLI for testing and usage
├── data/             # Raw documents and audio responses
│   ├── documents/    # Source documents for knowledge base
│   └── audio_responses/  # Saved TTS audio files
├── tests/            # Unit and integration tests
├── requirements.txt  # Dependencies
└── README.md         # Project documentation

Directory Overview:

samvaad/: Modular RAG pipeline, TTS engine, API, and CLI (Python)
samvaad/pipeline/generation/: LLM integration (Gemini) and TTS engine (Kokoro)
samvaad/pipeline/retrieval/: Query processing, voice recognition, and markdown handling
data/documents/: Your source documents (PDFs, Office docs, text, images, etc.)
data/audio_responses/: Automatically saved TTS audio files with engine-specific names
tests/: Comprehensive test suite for reliability

Features

Kokoro TTS: Neural TTS engine with high-quality speech synthesis
Smart Markdown Processing: Automatic stripping of markdown formatting for clean terminal display and natural speech synthesis
Multilingual Voice Support: Voice queries and responses in Hindi, English, Hinglish, and auto-detection for other languages
Retrieval-Augmented Generation (RAG): Combines LLMs with your own documents for accurate, context-aware answers.
Complete Query Pipeline: Ask natural language questions and get AI-powered answers with source citations.
GPU Acceleration: Automatic GPU detection and usage for faster embeddings, parsing, and inference (when available).
Performance Monitoring: Built-in timing instrumentation for ingestion, retrieval, and deletion steps.
OS-Agnostic Paths: Cross-platform compatibility (Windows, macOS, Linux) with dynamic path resolution.
Modular Backend: Easily extend or swap components in the RAG pipeline.
Modern Frontend (Coming Soon): React + Next.js interface for a seamless chat experience.
Interactive CLI: Full document processing and querying via an interactive command-line interface.
Multiple LLM Support: Works with OpenAI GPT models and Google Gemini, with graceful fallback.
Easy Setup: Simple installation with manual PyTorch selection for CPU or GPU.
Private & Secure: Your data stays on your machine.
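The markdown processing feature listed above removes formatting markers before text is shown or spoken. A rough regex-based sketch follows; it is an approximation for illustration, not Samvaad's implementation, and `strip_markdown` is a hypothetical name. One known limitation is that it also removes paired underscores inside identifiers such as snake_case names.

```python
import re


def strip_markdown(text: str) -> str:
    """Remove common markdown markers so text reads cleanly aloud."""
    text = re.sub(r"`{3}.*?`{3}", "", text, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                  # inline code
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)     # images: keep alt text
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)      # links: keep label
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)  # headings
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.MULTILINE)       # bullets
    text = re.sub(r"(\*\*|__)(.+?)\1", r"\2", text)           # bold
    text = re.sub(r"(\*|_)(.+?)\1", r"\2", text)              # italics
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```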

Testing

Samvaad includes comprehensive unit and integration tests to ensure reliability.

Test Structure

tests/
├── unit/                    # Unit tests for individual components
│   ├── test_utils.py       # Utils (hashing, DB, GPU)
│   ├── test_preprocessing.py
│   ├── test_ingestion.py
│   ├── test_embedding.py
│   ├── test_vectorstore.py
│   ├── test_query.py
│   └── test_deletion.py
├── integration/            # Integration tests for full pipeline
│   └── test_full_pipeline.py
└── pytest.ini             # Test configuration

Running Tests

Run all tests:

pytest

Run unit tests only:

pytest tests/unit/

Run integration tests only:

pytest tests/integration/

Run specific test file:

pytest tests/unit/test_utils.py -v

Test Coverage

Unit Tests: Test individual functions and classes in isolation
Integration Tests: Test the complete RAG pipeline end-to-end
Mocking: External dependencies (APIs, databases, ML models) are mocked for reliable testing
CI/CD Ready: Tests are designed to run in automated environments
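The mocking approach can be illustrated with `unittest.mock`. In this sketch, `answer_question` is a hypothetical stand-in for a query step, not a function from Samvaad; the point is that the retriever and the model are replaced by mocks so the test needs no API key or vector store.

```python
from unittest.mock import Mock


def answer_question(question: str, retriever, llm) -> dict:
    """Tiny stand-in for a RAG query step: retrieve chunks, then generate."""
    chunks = retriever(question)
    prompt = "\n".join(chunks) + "\n\nQuestion: " + question
    return {"answer": llm(prompt), "sources": chunks}


def test_answer_question_uses_retrieved_context():
    retriever = Mock(return_value=["Ballism is a theory about spheres."])
    llm = Mock(return_value="It is about spheres.")

    result = answer_question("What is Ballism?", retriever, llm)

    retriever.assert_called_once_with("What is Ballism?")
    # The retrieved chunk must reach the model inside the prompt.
    assert "Ballism is a theory about spheres." in llm.call_args.args[0]
    assert result["answer"] == "It is about spheres."
    assert result["sources"] == ["Ballism is a theory about spheres."]
```

Saved in a file under tests/unit/, a test like this would be collected by pytest along with the rest of the suite.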

About Test Warnings

Some warnings may appear during test runs from external dependencies (e.g., docling-core, google-genai). These warnings do not come from Samvaad code; they come from upstream libraries with known deprecation issues under Pydantic v2.12+.

To reduce the warnings, keep dependencies updated:

uv pip install --upgrade docling google-genai pydantic setuptools

The warnings are deprecation notices that are expected to be fixed in future releases of the upstream libraries. They do not affect functionality: all 175+ tests pass.

Current state (as of Oct 2025):

docling-core 2.49.0: Pending upstream fix for Pydantic validator pattern
google-genai 1.45.0: Pending upstream fix for Pydantic validator pattern
setuptools 80.9.0: pkg_resources deprecation warning (expected to be removed in setuptools 81+)

These warnings will disappear once the upstream libraries update their code to use instance methods instead of classmethods for Pydantic validators (required by Pydantic v2.12+).

Continuous Integration

Automated test runs execute through GitHub Actions. The workflow runs CPU tests on all pushes and pull requests to main. GPU tests run only on pushes to main to avoid the overhead of installing large PyTorch GPU wheels on every PR. Both configurations exercise the full pytest suite. No additional secrets are required for the suite to pass because external services are mocked in the tests. You can monitor the latest builds from the Actions tab on GitHub.

Contributing

Contributions are welcome! To get started:

Fork this repository
Create a new branch (git checkout -b feature/your-feature)
Make your changes and add tests
Commit and push (git commit -am 'Add new feature')
Open a pull request

Please see the issues page for ideas or to report bugs.

Future Development

The modular design of this project makes it easy to add new features. The backend/ and frontend/ folders are completely separate, so you can build out the user interface and connect it to the backend's API when you're ready.

Project details

Version 0.1.0, released Oct 21, 2025.

