

RLM Knowledge Client

A portable Python package for querying local document knowledge bases using the Recursive Language Model (RLM) pattern with IBM WatsonX as the LLM backend.

Overview

This package allows you to:

  1. Index a directory of documents - including PDF, DOCX, XLSX, PPTX, and any plain-text format
  2. Query the knowledge base - using natural language questions
  3. Get AI-synthesized answers - based on relevant document content

The key innovation is the RLM pattern: instead of dumping all documents into the context (which fails for large knowledge bases), the LLM writes Python code to explore the documents on-demand, searching and reading only what's needed.

Installation

# From source
pip install -e /path/to/watsonx_rlm_knowledge

# Or install directly
pip install watsonx-rlm-knowledge

Quick Start

1. Set Environment Variables

export WATSONX_API_KEY="your-ibm-cloud-api-key"
export WATSONX_PROJECT_ID="your-watsonx-project-id"
export RLM_KNOWLEDGE_ROOT="/path/to/your/documents"

# Optional
export WATSONX_REGION_URL="https://us-south.ml.cloud.ibm.com"  # default
export WATSONX_MODEL_ID="openai/gpt-oss-120b"  # default

2. Use the Client

from watsonx_rlm_knowledge import KnowledgeClient

# Initialize client (preprocesses documents automatically)
client = KnowledgeClient.from_directory("/path/to/documents")

# Query the knowledge base
answer = client.query("How does the authentication system work?")
print(answer)

# Get detailed results
result = client.query_detailed("Explain the database schema")
print(f"Answer: {result.answer}")
print(f"Iterations: {result.iterations}")
print(f"Time: {result.total_time:.2f}s")

3. Or Use the CLI

# Query
watsonx-rlm-knowledge query "How does authentication work?"

# Interactive chat
watsonx-rlm-knowledge chat

# List documents
watsonx-rlm-knowledge list

# Search
watsonx-rlm-knowledge search "authentication"

# Statistics
watsonx-rlm-knowledge stats

Supported Document Formats

Text Files (read directly)

  • Code: .py, .js, .ts, .java, .c, .cpp, .go, .rs, .rb, etc.
  • Config: .json, .yaml, .toml, .xml, .ini, etc.
  • Documentation: .md, .txt, .rst, .tex, etc.
  • Web: .html, .css, .vue, .svelte, etc.
  • Data: .csv, .sql, .graphql, etc.

Binary Documents (converted to text)

  • PDF: .pdf
  • Word: .docx, .doc
  • Excel: .xlsx, .xls
  • PowerPoint: .pptx, .ppt
  • Other: .rtf, .odt, .ods, .odp

How It Works

The RLM Pattern

Traditional RAG (Retrieval-Augmented Generation) has limitations:

  • Embedding search may miss relevant content
  • Context windows can't hold large documents
  • Pre-chunking loses document structure

RLM (Recursive Language Model) takes a different approach:

  1. The LLM is given access to a KnowledgeContext object
  2. It writes Python code to explore documents
  3. Code is executed and results fed back
  4. The LLM iterates until it has enough information
  5. Finally outputs a FINAL_ANSWER

User Query → LLM writes Python → Execute → Results → LLM writes more Python → ... → FINAL_ANSWER
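
In sketch form, the loop above can be written as follows. This is a minimal illustration with hypothetical helper names (`llm_call`, `execute_code`); the packaged implementation lives in RLMEngine.

```python
import re

FINAL = re.compile(r"FINAL_ANSWER:\s*(.*)", re.DOTALL)

def rlm_loop(question, llm_call, execute_code, max_iterations=15):
    """Minimal RLM loop sketch: ask the LLM for exploration code,
    execute it, feed the observation back, stop at FINAL_ANSWER."""
    transcript = f"Question: {question}"
    for _ in range(max_iterations):
        response = llm_call(transcript)
        final = FINAL.search(response)
        if final:
            return final.group(1).strip()
        # Run the LLM's Python and capture its output as the observation
        observation = execute_code(response)
        transcript += f"\n[code]\n{response}\n[observation]\n{observation}"
    return "(iteration budget exhausted)"
```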

Example RLM Iteration

# LLM writes this code:
matches = knowledge.search("authentication")
obs = f"Found {len(matches)} matches for 'authentication':\n"
for m in matches[:5]:
    obs += f"  {m.path}:{m.line_number}: {m.line_text}\n"

# Results fed back:
# "Found 12 matches for 'authentication':
#   auth/login.py:45: def authenticate_user(username, password):
#   docs/api.md:23: ## Authentication Methods
#   ..."

# LLM then reads the relevant file:
content = knowledge.read_slice("auth/login.py", offset=0, nbytes=5000)
obs = content

# And continues until it can answer the question

API Reference

KnowledgeClient

The main interface for querying knowledge bases.

# Factory methods
client = KnowledgeClient.from_directory("/path/to/docs")
client = KnowledgeClient.from_credentials(
    knowledge_root="/path/to/docs",
    api_key="your-key",
    project_id="your-project"
)
client = KnowledgeClient.from_env()

# Query methods
answer = client.query("question")
result = client.query_detailed("question")  # Returns RLMResult

# Utility methods
docs = client.list_documents(pattern="*.pdf")
results = client.search("term", max_results=20)
content = client.read_document("path/to/doc.md")
stats = client.get_stats()
client.preprocess(force=True)

KnowledgeContext

Low-level access to the knowledge base (used by the RLM engine).

from watsonx_rlm_knowledge import KnowledgeContext

ctx = KnowledgeContext("/path/to/docs")

# List documents
docs = ctx.list_documents()
files = ctx.list_files()

# Search
matches = ctx.search("term", max_matches=50)
matches = ctx.grep("pattern")
matches = ctx.search_regex(r"auth\w+")

# Read content
text = ctx.head("doc.md", nbytes=5000)
text = ctx.read_slice("doc.md", offset=1000, nbytes=3000)
text = ctx.read_full("doc.md")
text = ctx.tail("doc.md")

# Document info
toc = ctx.get_table_of_contents("doc.md")
count = ctx.count_occurrences("authentication")

RLMEngine

The core engine that runs the RLM loop.

from watsonx_rlm_knowledge import RLMEngine, KnowledgeContext
from watsonx_rlm_knowledge.engine import RLMConfig

# Custom configuration
config = RLMConfig(
    max_iterations=15,      # Max exploration iterations
    max_code_retries=3,     # Retries for code errors
    temperature=0.1,        # LLM temperature
    main_max_tokens=4096,   # Max tokens for main calls
    subcall_max_tokens=2048 # Max tokens for subcalls
)

# Create engine
engine = RLMEngine(
    knowledge=ctx,
    llm_call_fn=your_llm_function,
    config=config
)

# Run query
result = engine.run("Your question here")
print(result.answer)
print(result.iterations)
print(result.observations)

DocumentPreprocessor

Handles conversion of binary documents to text.

from watsonx_rlm_knowledge import DocumentPreprocessor
from watsonx_rlm_knowledge.preprocessor import PreprocessorConfig

config = PreprocessorConfig(
    cache_dir=".rlm_cache",
    max_file_size_mb=50,
    skip_hidden=True,
    skip_dirs=(".git", "node_modules", "__pycache__")
)

preprocessor = DocumentPreprocessor("/path/to/docs", config)
preprocessor.preprocess_all(force=False)

# Get text content
text = preprocessor.get_text("/path/to/docs/report.pdf")

Configuration

WatsonX Configuration

from watsonx_rlm_knowledge.watsonx_client import WatsonXConfig

config = WatsonXConfig(
    api_key="your-key",
    project_id="your-project",
    region_url="https://us-south.ml.cloud.ibm.com",
    model_id="openai/gpt-oss-120b",
    max_tokens=8192,
    temperature=0.1,
    reasoning_effort="low"  # Reduces "thinking-only" outputs
)

Environment Variables

Variable             Description                  Default
WATSONX_API_KEY      IBM Cloud API key            (required)
WATSONX_PROJECT_ID   WatsonX project ID           (required)
WATSONX_REGION_URL   WatsonX region URL           https://us-south.ml.cloud.ibm.com
WATSONX_MODEL_ID     Model ID                     openai/gpt-oss-120b
RLM_KNOWLEDGE_ROOT   Default knowledge directory  (none)

CLI Reference

# Query the knowledge base
watsonx-rlm-knowledge query "Your question here"
watsonx-rlm-knowledge query "Your question" --detailed

# Interactive chat mode
watsonx-rlm-knowledge chat

# List documents
watsonx-rlm-knowledge list
watsonx-rlm-knowledge list --pattern "*.pdf"
watsonx-rlm-knowledge list --json

# Search documents
watsonx-rlm-knowledge search "term"
watsonx-rlm-knowledge search "term" --max-results 50

# Preprocess documents
watsonx-rlm-knowledge preprocess
watsonx-rlm-knowledge preprocess --force

# Show statistics
watsonx-rlm-knowledge stats
watsonx-rlm-knowledge stats --json

# Read a document
watsonx-rlm-knowledge read "path/to/doc.md"
watsonx-rlm-knowledge read "path/to/doc.md" --max-bytes 10000

# Global options
watsonx-rlm-knowledge --knowledge-root /path/to/docs query "question"
watsonx-rlm-knowledge --verbose query "question"

WatsonX Adapters

The WatsonX client includes adapters to work around bugs in IBM's vLLM backend:

Tool Adapter

Emulates function calling via prompt injection when native tool use is broken.
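
The general technique can be sketched as follows. This is an illustrative stand-in, not the package's actual adapter: tools are described in the prompt, and a `TOOL_CALL:` line is parsed out of the reply.

```python
import json

def call_with_tools(llm_call, prompt, tools):
    """Emulate function calling by prompt injection: describe the tools,
    then parse a TOOL_CALL: {"name": ..., "args": {...}} line."""
    tool_desc = "\n".join(f"- {name}: {fn.__doc__}" for name, fn in tools.items())
    full_prompt = (
        f"{prompt}\n\nAvailable tools:\n{tool_desc}\n"
        'To use one, reply with a line: TOOL_CALL: {"name": "...", "args": {...}}'
    )
    reply = llm_call(full_prompt)
    for line in reply.splitlines():
        if line.startswith("TOOL_CALL:"):
            call = json.loads(line[len("TOOL_CALL:"):])
            return tools[call["name"]](**call["args"])
    return reply  # no tool requested; plain answer
```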

JSON Adapter

Enforces JSON schema responses via prompt engineering.
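
A minimal version of this idea (a hypothetical helper, not the package's actual API) appends the expected shape to the prompt and retries until the reply parses:

```python
import json

def call_with_json(llm_call, prompt, schema_hint, max_retries=3):
    """Request JSON via prompt engineering; retry if the reply doesn't parse."""
    full_prompt = (
        f"{prompt}\n\nRespond ONLY with JSON matching this shape:\n{schema_hint}"
    )
    for _ in range(max_retries):
        reply = llm_call(full_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            full_prompt += "\n\nThat was not valid JSON. Reply with JSON only."
    raise ValueError("model never produced valid JSON")
```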

Message Adapter

Handles vLLM quirks like null content issues.

Thinking Handler

Automatically retries when the model returns only "reasoning_content" without actual output.
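
The retry behavior can be sketched like this, assuming a response dict with a possibly empty `content` field (the real response shape may differ):

```python
def call_with_thinking_retry(llm_call, prompt, max_retries=3):
    """Retry when the model returns reasoning but no visible content."""
    for _ in range(max_retries):
        response = llm_call(prompt)  # assumed: dict with a 'content' key
        content = (response.get("content") or "").strip()
        if content:
            return content
        prompt += "\n\nAnswer directly; do not reply with reasoning only."
    raise RuntimeError("model returned thinking-only responses")
```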

Example Use Cases

Code Documentation Q&A

client = KnowledgeClient.from_directory("./my-project")
answer = client.query("How do I configure the database connection?")

Research Paper Analysis

client = KnowledgeClient.from_directory("./papers")
answer = client.query("What are the main findings about transformer architectures?")

Policy Document Search

client = KnowledgeClient.from_directory("./policies")
answer = client.query("What is the vacation policy for remote employees?")

Troubleshooting

"WatsonX credentials not found"

Ensure you've set WATSONX_API_KEY and WATSONX_PROJECT_ID environment variables.

"Model returned thinking-only response"

The client automatically retries, but if this persists, try:

  • Setting reasoning_effort="low" in WatsonXConfig
  • Simplifying your query

Slow preprocessing

Large PDFs or many documents take time. Progress is cached, so subsequent runs are faster.

Document not found

Ensure the path is relative to your knowledge root, not absolute.
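
For example (illustrative paths), an absolute path can be converted to the root-relative form the client expects:

```python
from pathlib import Path

knowledge_root = Path("/path/to/docs")
absolute = knowledge_root / "guides" / "setup.md"

# Strip the knowledge root to get the root-relative path
relative = absolute.relative_to(knowledge_root)
print(relative.as_posix())  # guides/setup.md
```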

License

MIT License

Contributing

Contributions welcome! Please open an issue or PR.
