

RLM Knowledge Client

A portable Python package for querying local document knowledge bases using the Recursive Language Model (RLM) pattern with IBM WatsonX as the LLM backend.

Overview

This package allows you to:

  1. Index a directory of documents - including PDF, DOCX, XLSX, PPTX, and any plain-text format
  2. Query the knowledge base - using natural language questions
  3. Get AI-synthesized answers - based on relevant document content

The key innovation is the RLM pattern: instead of dumping all documents into the context (which fails for large knowledge bases), the LLM writes Python code to explore the documents on-demand, searching and reading only what's needed.

Installation

# From source
pip install -e /path/to/watsonx_rlm_knowledge

# Or install directly
pip install watsonx-rlm-knowledge

Quick Start

1. Set Environment Variables

export WATSONX_API_KEY="your-ibm-cloud-api-key"
export WATSONX_PROJECT_ID="your-watsonx-project-id"
export RLM_KNOWLEDGE_ROOT="/path/to/your/documents"

# Optional
export WATSONX_REGION_URL="https://us-south.ml.cloud.ibm.com"  # default
export WATSONX_MODEL_ID="openai/gpt-oss-120b"  # default

2. Use the Client

from watsonx_rlm_knowledge import KnowledgeClient

# Initialize client (preprocesses documents automatically)
client = KnowledgeClient.from_directory("/path/to/documents")

# Query the knowledge base
answer = client.query("How does the authentication system work?")
print(answer)

# Get detailed results
result = client.query_detailed("Explain the database schema")
print(f"Answer: {result.answer}")
print(f"Iterations: {result.iterations}")
print(f"Time: {result.total_time:.2f}s")

3. Or Use the CLI

# Query
watsonx-rlm-knowledge query "How does authentication work?"

# Interactive chat
watsonx-rlm-knowledge chat

# List documents
watsonx-rlm-knowledge list

# Search
watsonx-rlm-knowledge search "authentication"

# Statistics
watsonx-rlm-knowledge stats

Supported Document Formats

Text Files (read directly)

  • Code: .py, .js, .ts, .java, .c, .cpp, .go, .rs, .rb, etc.
  • Config: .json, .yaml, .toml, .xml, .ini, etc.
  • Documentation: .md, .txt, .rst, .tex, etc.
  • Web: .html, .css, .vue, .svelte, etc.
  • Data: .csv, .sql, .graphql, etc.

Binary Documents (converted to text)

  • PDF: .pdf
  • Word: .docx, .doc
  • Excel: .xlsx, .xls
  • PowerPoint: .pptx, .ppt
  • Other: .rtf, .odt, .ods, .odp

How It Works

The RLM Pattern

Traditional RAG (Retrieval-Augmented Generation) has limitations:

  • Embedding search may miss relevant content
  • Context windows can't hold large documents
  • Pre-chunking loses document structure

RLM (Recursive Language Model) takes a different approach:

  1. The LLM is given access to a KnowledgeContext object
  2. It writes Python code to explore documents
  3. Code is executed and results fed back
  4. The LLM iterates until it has enough information
  5. Finally outputs a FINAL_ANSWER

User Query → LLM writes Python → Execute → Results → LLM writes more Python → ... → FINAL_ANSWER
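
In sketch form, the loop above can be written as follows. This is a minimal illustration with hypothetical helper names (`llm_call`, `execute_code`); the packaged implementation lives in RLMEngine.

```python
import re

FINAL = re.compile(r"FINAL_ANSWER:\s*(.*)", re.DOTALL)

def rlm_loop(question, llm_call, execute_code, max_iterations=15):
    """Minimal RLM loop sketch: ask the LLM for exploration code,
    execute it, feed the observation back, stop at FINAL_ANSWER."""
    transcript = f"Question: {question}"
    for _ in range(max_iterations):
        response = llm_call(transcript)
        final = FINAL.search(response)
        if final:
            return final.group(1).strip()
        # Run the LLM's Python and capture its output as the observation
        observation = execute_code(response)
        transcript += f"\n[code]\n{response}\n[observation]\n{observation}"
    return "(iteration budget exhausted)"
```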

Example RLM Iteration

# LLM writes this code:
matches = knowledge.search("authentication")
obs = f"Found {len(matches)} matches for 'authentication':\n"
for m in matches[:5]:
    obs += f"  {m.path}:{m.line_number}: {m.line_text}\n"

# Results fed back:
# "Found 12 matches for 'authentication':
#   auth/login.py:45: def authenticate_user(username, password):
#   docs/api.md:23: ## Authentication Methods
#   ..."

# LLM then reads the relevant file:
content = knowledge.read_slice("auth/login.py", offset=0, nbytes=5000)
obs = content

# And continues until it can answer the question

API Reference

KnowledgeClient

The main interface for querying knowledge bases.

# Factory methods
client = KnowledgeClient.from_directory("/path/to/docs")
client = KnowledgeClient.from_credentials(
    knowledge_root="/path/to/docs",
    api_key="your-key",
    project_id="your-project"
)
client = KnowledgeClient.from_env()

# Query methods
answer = client.query("question")
result = client.query_detailed("question")  # Returns RLMResult

# Utility methods
docs = client.list_documents(pattern="*.pdf")
results = client.search("term", max_results=20)
content = client.read_document("path/to/doc.md")
stats = client.get_stats()
client.preprocess(force=True)

KnowledgeContext

Low-level access to the knowledge base (used by the RLM engine).

from watsonx_rlm_knowledge import KnowledgeContext

ctx = KnowledgeContext("/path/to/docs")

# List documents
docs = ctx.list_documents()
files = ctx.list_files()

# Search
matches = ctx.search("term", max_matches=50)
matches = ctx.grep("pattern")
matches = ctx.search_regex(r"auth\w+")

# Read content
text = ctx.head("doc.md", nbytes=5000)
text = ctx.read_slice("doc.md", offset=1000, nbytes=3000)
text = ctx.read_full("doc.md")
text = ctx.tail("doc.md")

# Document info
toc = ctx.get_table_of_contents("doc.md")
count = ctx.count_occurrences("authentication")

RLMEngine

The core engine that runs the RLM loop.

from watsonx_rlm_knowledge import RLMEngine, KnowledgeContext
from watsonx_rlm_knowledge.engine import RLMConfig

# Custom configuration
config = RLMConfig(
    max_iterations=15,      # Max exploration iterations
    max_code_retries=3,     # Retries for code errors
    temperature=0.1,        # LLM temperature
    main_max_tokens=4096,   # Max tokens for main calls
    subcall_max_tokens=2048 # Max tokens for subcalls
)

# Create engine
engine = RLMEngine(
    knowledge=ctx,
    llm_call_fn=your_llm_function,
    config=config
)

# Run query
result = engine.run("Your question here")
print(result.answer)
print(result.iterations)
print(result.observations)

DocumentPreprocessor

Handles conversion of binary documents to text.

from watsonx_rlm_knowledge import DocumentPreprocessor
from watsonx_rlm_knowledge.preprocessor import PreprocessorConfig

config = PreprocessorConfig(
    cache_dir=".rlm_cache",
    max_file_size_mb=50,
    skip_hidden=True,
    skip_dirs=(".git", "node_modules", "__pycache__")
)

preprocessor = DocumentPreprocessor("/path/to/docs", config)
preprocessor.preprocess_all(force=False)

# Get text content
text = preprocessor.get_text("/path/to/docs/report.pdf")

Configuration

WatsonX Configuration

from watsonx_rlm_knowledge.watsonx_client import WatsonXConfig

config = WatsonXConfig(
    api_key="your-key",
    project_id="your-project",
    region_url="https://us-south.ml.cloud.ibm.com",
    model_id="openai/gpt-oss-120b",
    max_tokens=8192,
    temperature=0.1,
    reasoning_effort="low"  # Reduces "thinking-only" outputs
)

Environment Variables

Variable             Description                  Default
WATSONX_API_KEY      IBM Cloud API key            (required)
WATSONX_PROJECT_ID   WatsonX project ID           (required)
WATSONX_REGION_URL   WatsonX region URL           https://us-south.ml.cloud.ibm.com
WATSONX_MODEL_ID     Model ID                     openai/gpt-oss-120b
RLM_KNOWLEDGE_ROOT   Default knowledge directory  (none)

CLI Reference

# Query the knowledge base
watsonx-rlm-knowledge query "Your question here"
watsonx-rlm-knowledge query "Your question" --detailed

# Interactive chat mode
watsonx-rlm-knowledge chat

# List documents
watsonx-rlm-knowledge list
watsonx-rlm-knowledge list --pattern "*.pdf"
watsonx-rlm-knowledge list --json

# Search documents
watsonx-rlm-knowledge search "term"
watsonx-rlm-knowledge search "term" --max-results 50

# Preprocess documents
watsonx-rlm-knowledge preprocess
watsonx-rlm-knowledge preprocess --force

# Show statistics
watsonx-rlm-knowledge stats
watsonx-rlm-knowledge stats --json

# Read a document
watsonx-rlm-knowledge read "path/to/doc.md"
watsonx-rlm-knowledge read "path/to/doc.md" --max-bytes 10000

# Global options
watsonx-rlm-knowledge --knowledge-root /path/to/docs query "question"
watsonx-rlm-knowledge --verbose query "question"

WatsonX Adapters

The WatsonX client includes adapters to work around bugs in IBM's vLLM backend:

Tool Adapter

Emulates function calling via prompt injection when native tool use is broken.
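
The general technique can be sketched as follows. This is an illustrative stand-in, not the package's actual adapter: tools are described in the prompt, and a `TOOL_CALL:` line is parsed out of the reply.

```python
import json

def call_with_tools(llm_call, prompt, tools):
    """Emulate function calling by prompt injection: describe the tools,
    then parse a TOOL_CALL: {"name": ..., "args": {...}} line."""
    tool_desc = "\n".join(f"- {name}: {fn.__doc__}" for name, fn in tools.items())
    full_prompt = (
        f"{prompt}\n\nAvailable tools:\n{tool_desc}\n"
        'To use one, reply with a line: TOOL_CALL: {"name": "...", "args": {...}}'
    )
    reply = llm_call(full_prompt)
    for line in reply.splitlines():
        if line.startswith("TOOL_CALL:"):
            call = json.loads(line[len("TOOL_CALL:"):])
            return tools[call["name"]](**call["args"])
    return reply  # no tool requested; plain answer
```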

JSON Adapter

Enforces JSON schema responses via prompt engineering.
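
A minimal version of this idea (a hypothetical helper, not the package's actual API) appends the expected shape to the prompt and retries until the reply parses:

```python
import json

def call_with_json(llm_call, prompt, schema_hint, max_retries=3):
    """Request JSON via prompt engineering; retry if the reply doesn't parse."""
    full_prompt = (
        f"{prompt}\n\nRespond ONLY with JSON matching this shape:\n{schema_hint}"
    )
    for _ in range(max_retries):
        reply = llm_call(full_prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            full_prompt += "\n\nThat was not valid JSON. Reply with JSON only."
    raise ValueError("model never produced valid JSON")
```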

Message Adapter

Handles vLLM quirks like null content issues.

Thinking Handler

Automatically retries when the model returns only "reasoning_content" without actual output.
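
The retry behavior can be sketched like this, assuming a response dict with a possibly empty `content` field (the real response shape may differ):

```python
def call_with_thinking_retry(llm_call, prompt, max_retries=3):
    """Retry when the model returns reasoning but no visible content."""
    for _ in range(max_retries):
        response = llm_call(prompt)  # assumed: dict with a 'content' key
        content = (response.get("content") or "").strip()
        if content:
            return content
        prompt += "\n\nAnswer directly; do not reply with reasoning only."
    raise RuntimeError("model returned thinking-only responses")
```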

Example Use Cases

Code Documentation Q&A

client = KnowledgeClient.from_directory("./my-project")
answer = client.query("How do I configure the database connection?")

Research Paper Analysis

client = KnowledgeClient.from_directory("./papers")
answer = client.query("What are the main findings about transformer architectures?")

Policy Document Search

client = KnowledgeClient.from_directory("./policies")
answer = client.query("What is the vacation policy for remote employees?")

Troubleshooting

"WatsonX credentials not found"

Ensure you've set WATSONX_API_KEY and WATSONX_PROJECT_ID environment variables.

"Model returned thinking-only response"

The client automatically retries, but if this persists, try:

  • Setting reasoning_effort="low" in WatsonXConfig
  • Simplifying your query

Slow preprocessing

Large PDFs or many documents take time. Progress is cached, so subsequent runs are faster.

Document not found

Ensure the path is relative to your knowledge root, not absolute.
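
For example (illustrative paths), an absolute path can be converted to the root-relative form the client expects:

```python
from pathlib import Path

knowledge_root = Path("/path/to/docs")
absolute = knowledge_root / "guides" / "setup.md"

# Strip the knowledge root to get the root-relative path
relative = absolute.relative_to(knowledge_root)
print(relative.as_posix())  # guides/setup.md
```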

License

MIT License

Contributing

Contributions welcome! Please open an issue or PR.
