This project implements a local-first RAG chat system that reads and processes various text-based log files. It splits the content into manageable chunks, generates embeddings using Ollama or OpenAI, and allows users to interactively query the logs for specific information. The application features a customizable response format and supports configuration for user preferences.
RagQL
Project Overview
RagQL is a local-first Retrieval-Augmented Generation (RAG) system designed for natural language Q&A over your logs and databases. It provides a modular chat-like interface that can index and query local data sources such as log files and SQLite .db database files. With RagQL, you can ask questions about the contents of your logs or databases and get answers powered by a large language model, all while keeping your data private on your own machine. The system works by generating vector embeddings of your data and using those for context retrieval, then feeding relevant context to an LLM to produce answers. Its modular design makes it easy to extend (e.g. adding new data loaders or swapping components), and it supports dual LLM backends for maximum flexibility.
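The retrieve-then-answer flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not RagQL's actual code: `embed` is a toy word-count stand-in for an Ollama or OpenAI embedding call, `top_k` does a brute-force cosine search where the project uses FAISS, and `build_prompt` is a hypothetical helper.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (word counts); a real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force)."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Hypothetical prompt layout: retrieved context first, then the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


chunks = [
    "ERROR disk full on /dev/sda1",
    "INFO user alice logged in",
    "ERROR connection timeout to db",
]
context = top_k("which disk is full", chunks, k=1)
prompt = build_prompt("which disk is full", context)
print(prompt)
```

The prompt string is what would be sent to the LLM backend; only the retrieved chunks travel with the question, not the whole log.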
Importantly, RagQL supports both local and remote LLM/embedding backends. By default it favors a local setup using Ollama (which runs open-source models on your machine) for embedding generation and question answering. This means you can run RagQL completely offline. Alternatively, you can integrate OpenAI's API for embeddings and/or LLM responses – useful if you prefer OpenAI's models or if you don't have a suitable local model. This dual backend support is seamless, allowing you to switch between local and cloud as needed (with a simple flag). RagQL's chat interface and CLI tools make it easy to interactively query your data or automate queries via scripts.
Key Features
- Local-First Operation – Designed to run fully offline using local models via Ollama. All data stays on your machine, and embeddings can be generated with a local model, ensuring privacy (no log or DB data is sent to the cloud).
- Dual Backend Support (Ollama & OpenAI) – Use a local LLM or the OpenAI API as the backend provider. By default, RagQL will use an Ollama-served model if available, and you can force the use of OpenAI with a --remote flag. This gives you the choice between offline processing or OpenAI's latest models on demand.
- Flexible Data Source Indexing – Index various data sources, including plain text log files and SQLite database files (.db). RagQL uses modular loader components to parse content (e.g. reading database tables via pandas or splitting log files into chunks) and then builds a vector index (using FAISS) for efficient similarity search. This modular design makes it easy to add support for new file types or data sources in the future.
- Interactive Chat Interface – Launch an interactive REPL to have a multi-turn conversation with your data. The chat interface allows follow-up questions and retains context from previous Q&A turns (conversation memory), enabling a more natural dialogue when exploring data. You can also enter special commands in this mode to manage configuration or data sources on the fly.
- One-off Query Mode – If you prefer not to use the chat interface, you can run RagQL as a one-shot CLI tool. By specifying a question along with target source files/folders in a single command, RagQL will process the query and print the answer, then exit. This is convenient for scripting or quick queries.
- Configuration System – Easily manage API keys and default settings through configuration files. RagQL supports a .env file for sensitive settings (like your OpenAI API key or Ollama server URL) and a config.json for persistent configuration (such as a list of data sources to index by default, or other preferences). A built-in config mode (--configs) allows you to add or remove indexed sources and set keys without manually editing files. These settings persist between runs, so you can "set and forget" your environment and data sources.
- Lightweight & Extensible – Built with Python and standard libraries/frameworks (FAISS for the embedding index, pandas for data handling, argparse for the CLI, etc.), the project remains lightweight and hackable. Developers can easily extend RagQL – for example, by adding new loader modules for different file formats, or integrating alternative vector stores – thanks to its clean, modular architecture.
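As a rough illustration of the "splitting log files into chunks" step mentioned in the feature list, here is a minimal sketch of line-based chunking with overlap. The function name `chunk_lines` and its `size` and `overlap` parameters are assumptions made for this example; RagQL's actual chunking strategy may differ.

```python
def chunk_lines(lines: list[str], size: int = 3, overlap: int = 1) -> list[str]:
    """Group consecutive log lines into chunks of `size` lines.

    Neighbouring chunks share `overlap` lines so that an event spanning a
    chunk boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + size]))
        if start + size >= len(lines):
            break  # the last window already reached the end of the log
    return chunks


log = [f"line {i}" for i in range(1, 6)]
chunks = chunk_lines(log, size=3, overlap=1)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk would then be embedded and stored in the vector index; the overlap trades a little extra storage for better recall near chunk boundaries.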
Installation
Prerequisites: You'll need Python 3.10+. Poetry (for dependency management) is only needed if you want to contribute; otherwise the only dependencies are Python and the backend LLM provider of your choice. If you plan to use the local LLM mode, you should also install Ollama and have it running (Ollama is available for macOS, Linux, and Windows; it provides a local API endpoint for running models). For remote mode, you'll need an OpenAI account and API key.
Follow these steps to install RagQL:
Via Poetry (recommended):
- Clone the repo and enter it:
      git clone https://github.com/yourusername/ragql.git
      cd ragql
- Install dependencies:
      poetry install
- Configure your .env and config.json as described below.
Install via PyPI:
- RagQL is also published on PyPI, so you can install it directly:
      pip install ragql
- Or, if you're using Poetry in another project:
      poetry add ragql
- After installing via PyPI, make sure to create a .env file in your working directory:
      # .env
      OPENAI_API_KEY=<your OpenAI key>
      OLLAMA_URL=http://localhost:11434
  Then you can run:
      ragql --help
- Configure Environment – Create a .env file in the project root (or wherever you run RagQL) to store configuration like API keys. At minimum you should add:
  - OPENAI_API_KEY=<your OpenAI key> (if you plan to use OpenAI for embeddings or answers)
  - OLLAMA_URL=http://localhost:11434 (or the appropriate URL if your Ollama server is running on a different port/host; by default Ollama listens on port 11434)
  RagQL will automatically load this .env file on startup. You can also configure these via environment variables directly, but using a .env file is convenient for local development.
- (Optional) Configure Default Sources – By default, RagQL will create a rag_config.json to persist settings. You can manually create or edit this file to specify directories or files that should be indexed on startup (and other config options). However, you can also use RagQL's interactive config commands to set this up after installation (see Usage below), so manual editing isn't required.
- Run RagQL – You're all set! You can now run the tool via Poetry:
      poetry run ragql --help
  Or, if the Poetry environment is active, simply:
      ragql --help
  This will show the help message and verify that the installation was successful. (If RagQL was installed as a package or script, the ragql command should be available in your PATH.)
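To make the Configure Environment step more concrete, the sketch below shows roughly what loading a `.env` file amounts to. RagQL relies on python-dotenv for this; the stdlib-only `parse_env` helper here is a hypothetical stand-in written for illustration, and the key value `sk-example` is a placeholder.

```python
import os
import tempfile


def parse_env(path: str) -> dict[str, str]:
    """Read KEY=VALUE pairs from a .env-style file, skipping blanks and comments."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")  # split on the first "=" only
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as fh:
        fh.write("# .env\nOPENAI_API_KEY=sk-example\nOLLAMA_URL=http://localhost:11434\n")
    settings = parse_env(env_path)

print(settings["OLLAMA_URL"])
```

In a real project, python-dotenv's `load_dotenv()` does this job and also exports the values into the process environment.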
Usage
RagQL can be used in multiple ways with various command-line options. Here's a comprehensive guide:
Basic Command Structure
ragql [options] [command] [key_value]
Command-Line Options
- --help, -h – Display help message and exit
- --migrate – Migrate your config.json to the new schema while preserving unchanged fields
- --query QUESTION, -q QUESTION – Run a single RAG-powered query and exit
- --sources [SOURCES ...] – Specify one or more folders, text files, or .db files to index
- --remote – Force using OpenAI API even if OLLAMA_URL is set
- --configs – Enter configuration mode
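The options above map naturally onto an argparse parser. The following is a sketch of how such a parser could be declared, not a copy of RagQL's source. The `pick_backend` helper is hypothetical, and its fallback to OpenAI when no Ollama URL is configured is an assumption; the text only states that `--remote` overrides a configured `OLLAMA_URL`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags and the [command] [key_value] positionals."""
    parser = argparse.ArgumentParser(prog="ragql")
    parser.add_argument("--migrate", action="store_true",
                        help="migrate config.json to the new schema")
    parser.add_argument("--query", "-q", metavar="QUESTION",
                        help="run a single query and exit")
    parser.add_argument("--sources", nargs="*", default=[],
                        help="folders, text files, or .db files to index")
    parser.add_argument("--remote", action="store_true",
                        help="force the OpenAI API even if OLLAMA_URL is set")
    parser.add_argument("--configs", action="store_true",
                        help="enter configuration mode")
    parser.add_argument("command", nargs="?", help="config command, e.g. add")
    parser.add_argument("key_value", nargs="*", help="arguments for the command")
    return parser


def pick_backend(remote: bool, ollama_url: Optional[str]) -> str:
    """Hypothetical rule: --remote wins; otherwise use Ollama when a URL is set."""
    if remote or not ollama_url:
        return "openai"
    return "ollama"


args = build_parser().parse_args(
    ["--remote", "--sources", "data.db", "-q", "Summarize this database"]
)
backend = pick_backend(args.remote, "http://localhost:11434")
print(args.query, args.sources, backend)
```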
Operation Modes
- Interactive Chat Mode (REPL):
  - Launch by running ragql with no arguments
  - Enter an interactive chat interface where you can ask questions
  - Type questions and get answers based on indexed data
  - Use special commands within chat (see Configuration Commands below)
  - Exit with exit or Ctrl+C
- One-off Query Mode:
      ragql --query "Your question here" --sources path/to/data
      # or
      ragql -q "Your question here" --sources path/to/data
  - Ask a single question and get an answer
  - System will exit after providing the response
- Configuration Mode:
      ragql --configs
      # or
      ragql [command] [key_value]
Configuration Commands
The following commands can be used either in configuration mode (--configs) or directly as positional arguments:
- add <path> – Add a single file to the index configuration
- add-folder <directory> – Recursively add all files from a directory
- remove <path> – Remove a file or folder from configuration
- list – Display all configured source files/folders
- set openai key <API_KEY> – Configure your OpenAI API key
- help – Show available commands in config mode
- exit – Exit configuration mode (when in interactive mode)
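The source-management commands above boil down to edits on a list of paths. The sketch below illustrates that idea with a hypothetical `apply_command` helper; it is an assumption about the behaviour, not RagQL's implementation, and it covers only `add`, `add-folder`, `remove`, and `list` (with `remove` matching exact paths only).

```python
from pathlib import Path


def apply_command(sources: list[str], command: str, arg: str = "") -> list[str]:
    """Return the updated source list after one config command."""
    if command == "add":
        return sources if arg in sources else sources + [arg]
    if command == "add-folder":
        # Recursively collect every regular file under the directory.
        files = sorted(str(p) for p in Path(arg).rglob("*") if p.is_file())
        return sources + [f for f in files if f not in sources]
    if command == "remove":
        return [s for s in sources if s != arg]
    if command == "list":
        return sources
    raise ValueError(f"unknown command: {command}")


sources = apply_command([], "add", "logs/system.log")
sources = apply_command(sources, "add", "data/metrics.db")
sources = apply_command(sources, "remove", "logs/system.log")
print(apply_command(sources, "list"))
```

Returning a new list rather than mutating in place keeps each command easy to test on its own.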
Example Commands
- Index and Query a Single File:
      ragql --sources ~/logs/system.log -q "What errors occurred today?"
- Force Remote API Usage:
      ragql --remote --sources data.db -q "Summarize this database"
- Configure Sources via Command Line:
      ragql add-folder ~/project/logs
      ragql add ~/project/data/metrics.db
- Migrate Configuration:
      ragql --migrate
- Interactive Chat with Multiple Sources:
      ragql --sources ~/logs/ ~/databases/metrics.db
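Because one-off queries print the answer and exit, the examples above can also be driven from a script. This is a minimal sketch that assumes `ragql` is on your PATH and writes its answer to standard output; `build_query_command` and `ask` are hypothetical helper names, and only flags documented in this README are used.

```python
import os
import subprocess


def build_query_command(question: str, sources: list[str], remote: bool = False) -> list[str]:
    """Assemble the argument list for a one-off ragql query."""
    cmd = ["ragql"]
    if remote:
        cmd.append("--remote")
    # No shell is involved, so "~" has to be expanded explicitly.
    expanded = [os.path.expanduser(s) for s in sources]
    cmd += ["--sources", *expanded, "--query", question]
    return cmd


def ask(question: str, sources: list[str], remote: bool = False) -> str:
    """Run ragql once and return whatever it printed (requires ragql installed)."""
    result = subprocess.run(
        build_query_command(question, sources, remote),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


cmd = build_query_command("What errors occurred today?", ["logs/system.log"])
print(cmd)
```

Passing the arguments as a list avoids shell quoting problems when a question contains quotes or spaces.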
Configuration Files
RagQL uses two main configuration files:
- .env - For sensitive settings:
      OPENAI_API_KEY=<your OpenAI key>
      OLLAMA_URL=http://localhost:11434
- rag_config.json - For persistent configuration:
  - Stores indexed sources
  - Maintains other preferences
  - Can be managed via --configs mode or direct commands
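The persistent configuration file can be pictured as a small JSON document that is read at startup and rewritten when sources change. The schema below (a single `sources` list) is purely hypothetical, since the real layout of `rag_config.json` is not documented here; `load_config` and `save_config` are illustrative helpers.

```python
import json
import os
import tempfile


def default_config() -> dict:
    """Hypothetical minimal schema: just a list of indexed sources."""
    return {"sources": []}


def load_config(path: str) -> dict:
    """Load the config, falling back to defaults when the file does not exist yet."""
    config = default_config()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            config.update(json.load(fh))
    return config


def save_config(path: str, config: dict) -> None:
    """Write the config back so it persists between runs."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rag_config.json")
    config = load_config(path)                 # first run: defaults
    config["sources"].append("logs/system.log")
    save_config(path, config)
    reloaded = load_config(path)               # later run: persisted value

print(reloaded)
```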
Tech Stack
RagQL is built on a stack of modern tools and libraries:
- Python 3 – The core language used for development. RagQL is written in Python, making it cross-platform and easy to extend or customize.
- Poetry – Used for dependency management and packaging. This ensures a reproducible environment and easy installation of required packages.
- Ollama – Provides the local LLM backend. Ollama is an open-source tool that allows running large language models on your own hardware via a simple API. RagQL uses Ollama to generate embeddings and responses locally when available (keeping data local).
- OpenAI API – The cloud alternative for embeddings and LLM responses. RagQL can call OpenAI's API (e.g. GPT-4, GPT-3.5, or Ada for embeddings) when a local model is not available or when explicitly requested. This gives access to powerful language models hosted by OpenAI.
- FAISS – Facebook AI Similarity Search is used as the vector store for embeddings. RagQL leverages FAISS to index and search embeddings efficiently, enabling quick retrieval of relevant text chunks from your data.
- pandas – Used for data loading and manipulation, particularly for database files. For example, RagQL might use pandas to read tables from a SQLite .db file and convert them into text or CSV for indexing.
- python-dotenv – Used to load environment variables from the .env file. This simplifies configuration of API keys and URLs without hardcoding them.
- argparse – RagQL's command-line interface is built using Python's built-in argparse library. This powers the parsing of flags like --sources, --remote, and subcommands in config mode.
Additionally, RagQL is structured in a modular way (with separate components for CLI, configuration management, data loading, embedding generation, and storage). This design makes it easy for developers to understand and modify the codebase or integrate other tools (for example, swapping FAISS with a different vector database, or adding a new loader for a different file type).
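As an example of what a loader component might look like, the sketch below turns every table of a SQLite database into CSV text that could then be chunked and embedded. RagQL is described as using pandas for this; to stay dependency-free, this illustration uses only the standard library's `sqlite3` and `csv` modules, and `tables_as_csv` is a hypothetical name.

```python
import csv
import io
import sqlite3


def tables_as_csv(conn: sqlite3.Connection) -> dict[str, str]:
    """Map each table name to its contents rendered as CSV text."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )]
    out = {}
    for name in names:
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(cursor)                                 # data rows
        out[name] = buf.getvalue()
    return out


# Small in-memory database standing in for a real .db file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (name TEXT, value REAL)")
conn.execute("INSERT INTO metrics VALUES ('cpu', 0.5)")
text = tables_as_csv(conn)
conn.close()
print(text["metrics"])
```

A loader for another file type would follow the same shape: take a path or connection, return plain text keyed by some source label.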
References
- OpenAI API Documentation – Official documentation for OpenAI's API, which RagQL can use for embeddings and chat completion.
- Ollama Project Website – Information and download for Ollama, the local LLM engine used for offline mode.
- FAISS Library (Facebook AI Research) – GitHub repo for FAISS, the vector similarity search library used for embedding indexing.
- Poetry: Python Package Manager – Documentation for Poetry, used for managing RagQL's dependencies and environment.
- Pandas Library – Official site for pandas, used in RagQL for data handling (especially with .db files).
- python-dotenv – GitHub repository for python-dotenv, which RagQL uses to manage environment variables from a file.
- Python Argparse – Documentation for the argparse library used to build the CLI interface.