Client-side tools for using large language models, full service (e.g. ChatGPT & Bard) or locally hosted (e.g. LLaMA derivatives)
OgbujiPT is a general-purpose knowledge bank system for LLM-based applications. It provides a unified API for storing, retrieving, and managing semantic knowledge across multiple backends, with support for dense vector search, sparse retrieval, hybrid search, and more.
Built with Pythonic simplicity and transparency in mind, it avoids the over-frameworking that plagues the LLM ecosystem. Every abstraction must justify its existence.
> OgbujiPT is primarily developed by the crew at Oori Data. We offer data pipelines and software engineering services around AI/LLM applications.
Quick links
Getting started
```sh
uv pip install ogbujipt
```
Quick example: In-memory knowledge bank
Perfect for prototyping, testing, or small applications—no database setup required:
```python
import asyncio
from ogbujipt.store import RAMDataDB
from sentence_transformers import SentenceTransformer

async def main():
    # Load embedding model
    model = SentenceTransformer('all-MiniLM-L6-v2')

    # Create in-memory knowledge base
    kb = RAMDataDB(embedding_model=model, collection_name='docs')
    await kb.setup()

    # Insert documents
    await kb.insert('Python is great for machine learning', metadata={'lang': 'python'})
    await kb.insert('JavaScript powers modern web applications', metadata={'lang': 'js'})

    # Semantic search
    async for result in kb.search('programming languages', limit=5):
        print(f'{result.content} (score: {result.score:.3f})')

    await kb.cleanup()

asyncio.run(main())
```
Hybrid search with reranking
Combine dense vector search with sparse BM25 retrieval, then rerank for best results:
```python
from ogbujipt.retrieval.hybrid import RerankedHybridSearch
from ogbujipt.retrieval.sparse import BM25Search
from ogbujipt.retrieval.dense import DenseSearch
from rerankers import Reranker

# Initialize components
reranker = Reranker(model_name='BAAI/bge-reranker-base')
hybrid = RerankedHybridSearch(
    strategies=[DenseSearch(), BM25Search()],
    reranker=reranker,
    rerank_top_k=20
)

# Search across knowledge bases (kb as set up in the previous example)
async for result in hybrid.execute('machine learning', backends=[kb], limit=5):
    print(f'{result.score:.3f}: {result.content[:50]}...')
```
Knowledge bank features
OgbujiPT provides a flexible knowledge bank system with multiple storage backends and retrieval strategies.
Storage backends
- In-memory (`RAMDataDB`, `RAMMessageDB`): Zero-setup stores perfect for testing, prototyping, and small applications. Drop-in replacements for the PostgreSQL versions, with identical APIs.
- PostgreSQL + pgvector: Production-ready persistent storage with advanced indexing (HNSW, IVFFlat) and full SQL capabilities.
- Qdrant: High-performance vector database with distributed capabilities.
Retrieval strategies
- Dense vector search: Semantic similarity using embeddings (e.g., SentenceTransformers, OpenAI embeddings)
- Sparse retrieval: BM25 keyword-based search for exact term matching
- Hybrid search: Combine multiple strategies using Reciprocal Rank Fusion (RRF)
- Reranking: Cross-encoder reranking for improved precision (e.g., BGE-reranker, ZeRank)
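The Reciprocal Rank Fusion step is simple enough to show directly. Below is a minimal, library-independent sketch (not OgbujiPT's implementation): each document's fused score is the sum of `1 / (k + rank)` over every ranked list it appears in, with `k=60` being the conventional constant.

```python
def rrf_fuse(rankings, k=60):
    '''Fuse multiple ranked lists of doc IDs via Reciprocal Rank Fusion.'''
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense = ['a', 'b', 'c']   # e.g. from vector search
sparse = ['b', 'c', 'd']  # e.g. from BM25
print(rrf_fuse([dense, sparse]))  # 'b' and 'c' rise because both lists agree on them
```

Documents that appear in several result lists accumulate score from each, which is why hybrid search tends to be more robust than either strategy alone.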
Message/conversation storage
Store and search chat history with semantic retrieval:
```python
import asyncio
from uuid import uuid4
from datetime import datetime, timezone
from ogbujipt.store import RAMMessageDB
from sentence_transformers import SentenceTransformer

async def chat_example():
    model = SentenceTransformer('all-MiniLM-L6-v2')
    db = RAMMessageDB(embedding_model=model, collection_name='chat')
    await db.setup()
    conversation_id = uuid4()

    # Store messages
    await db.insert(conversation_id, 'user', 'What is machine learning?',
                    datetime.now(tz=timezone.utc), {})
    await db.insert(conversation_id, 'assistant', 'ML is a subset of AI...',
                    datetime.now(tz=timezone.utc), {})

    # Semantic search over conversation
    results = await db.search(conversation_id, 'AI concepts', limit=2)
    for msg in results:
        print(f'[{msg.role}] {msg.content}')

    await db.cleanup()

asyncio.run(chat_example())
```
Design philosophy
- Composability over monoliths: Mix and match backends and strategies
- Explicit over implicit: No hidden magic—you control connection pooling, retries, caching
- Pythonic simplicity: Minimal abstractions, clear APIs, sensible defaults
- Production-ready: Structured logging, retry logic, async-first design
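As an illustration of the "explicit over implicit" point: retry behavior can live in a few lines of your own code wrapped around any call, rather than hidden inside a framework. A generic exponential-backoff sketch (not part of OgbujiPT's API):

```python
import time
import random

def with_retries(fn, attempts=3, base_delay=0.1):
    '''Call fn(), retrying with exponential backoff plus jitter on failure.'''
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Because the loop is yours, you decide which exceptions are retryable, how many attempts to make, and what to log, with nothing hidden behind a framework setting.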
LLM integration
OgbujiPT includes LLM wrapper utilities for integrating knowledge banks with language models.
Basic LLM usage
```python
from ogbujipt.llm_wrapper import openai_chat_api, prompt_to_chat

llm_api = openai_chat_api(base_url='http://localhost:8000')
prompt = 'Write a short birthday greeting for my star employee'
resp = llm_api.call(prompt_to_chat(prompt), temperature=0.1, max_tokens=256)
print(resp.first_choice_text)
```
Asynchronous API
```python
import asyncio
from ogbujipt.llm_wrapper import openai_chat_api, prompt_to_chat

async def main():
    llm_api = openai_chat_api(base_url='http://localhost:8000')
    messages = prompt_to_chat('Hello!', system='You are a helpful AI agent…')
    resp = await llm_api(messages, temperature=0.1, max_tokens=256)
    print(resp.first_choice_text)

asyncio.run(main())
```
Supported LLM backends
You can use the OpenAI cloud API, or any API that conforms to it, including Anthropic's, local LM Studio, Ollama, etc. Mac users might want to check out our sister project Toolio, which provides a local LLM inference server on Apple Silicon.
RAG example: Chat with your documents
```python
import asyncio
from ogbujipt.store import RAMDataDB
from ogbujipt.llm_wrapper import openai_chat_api, prompt_to_chat
from sentence_transformers import SentenceTransformer

async def main():
    # Setup knowledge base
    model = SentenceTransformer('all-MiniLM-L6-v2')
    kb = RAMDataDB(embedding_model=model, collection_name='docs')
    await kb.setup()
    await kb.insert('Your document content here...', metadata={'source': 'doc.pdf'})

    # Retrieve relevant context
    contexts = []
    async for result in kb.search('user question', limit=3):
        contexts.append(result.content)

    # Build RAG prompt
    context_text = '\n\n'.join(contexts)
    prompt = f"""Based on the following context, answer the question.

Context:
{context_text}

Question: user question"""

    # Get LLM response
    llm_api = openai_chat_api(base_url='http://localhost:8000')
    resp = await llm_api(prompt_to_chat(prompt))
    print(resp.first_choice_text)

asyncio.run(main())
```
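The placeholder document above would, in practice, usually be split into overlapping chunks before insertion so that retrieval returns focused passages rather than whole files. A minimal character-based chunker, independent of OgbujiPT (the size and overlap values are arbitrary starting points):

```python
def chunk_text(text, size=200, overlap=40):
    '''Split text into fixed-size chunks, each overlapping the previous one.'''
    if size <= overlap:
        raise ValueError('size must exceed overlap')
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping some shared context
    return chunks

doc = 'x' * 500
print(len(chunk_text(doc)))  # chunks small enough to embed and retrieve individually
```

Each chunk would then be inserted with its own metadata (e.g. source filename and chunk index), so search results can be traced back to their origin.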
Demos and examples
See the demo/ directory for complete examples:
Knowledge bank demos
- ram-store/: In-memory vector stores—zero setup, perfect for learning
  - simple_search_demo.py: Basic semantic search with filtering
  - chat_with_memory.py: Conversational AI with message history
- pg-hybrid/: PostgreSQL-based production examples
  - chat_with_hybrid_kb.py: Hybrid search with RRF fusion
  - hybrid_rerank_demo.py: Reranking with cross-encoders
  - chat_doc_folder_pg.py: RAG chat application
LLM demos
- Basic LLM text completion and format correction
- Multiple simultaneous queries via multiprocessing
- OpenAI-style function calling
- Discord bot integration
- Streamlit UI for PDF chat
Roadmap
OgbujiPT is evolving into a comprehensive knowledge bank system. Current focus (v0.10.0+):
✅ Implemented
- In-memory vector stores (RAMDataDB, RAMMessageDB)
- Dense vector search (PostgreSQL, Qdrant, in-memory)
- Sparse retrieval (BM25)
- Hybrid search with RRF fusion
- Cross-encoder reranking
- Message/conversation storage
- Metadata filtering
🚧 In progress
- GraphRAG support using Onya
- Unified knowledge base API
- Query classification and routing
- Multi-backend aggregation
📋 Planned
- RSS feed ingestion and caching
- Link management with update mechanisms
- Graph curation strategies
- KB maintenance and pruning (summarization, obsolescence marking)
- RBAC and multi-tenancy
- Observability (query logging, tracing, performance monitoring)
- MCP (Model Context Protocol) provider/server
- Query sampling for refinement
- Additional backends (filesystem, Marqo, etc.)
- Multi-modal support
See discussion #92 for detailed roadmap and design philosophy.
Installation
```sh
uv pip install ogbujipt
```
Optional dependencies
For specific features:
```sh
# PostgreSQL + pgvector support
uv pip install "ogbujipt[postgres]"

# Qdrant support
uv pip install "ogbujipt[qdrant]"

# Reranking support
uv pip install "rerankers[transformers]"

# GraphRAG support (when available)
uv pip install "ogbujipt[graph]"
```
Development and Contribution
See CONTRIBUTING.md and the contributor notes for development setup and guidelines.
Design principles
Avoid over-frameworks
OgbujiPT deliberately avoids becoming another LangChain. We emphasize:
- Minimal abstractions: Every layer must justify its existence
- Explicit over implicit: No hidden magic—be clear about connection pooling, retries, caching
- Configuration clarity: Help automate config without creating configuration hell
- Composability: Mix and match components rather than monolithic frameworks
- Pythonic: Old-school Python simplicity and clarity
Memory taxonomy
Different memory types need different strategies:
- Conversational memory: Recent chat history (working memory)
- Semantic memory: Long-term knowledge (documents, facts)
- Scratchpad: Temporary computation state
- Observability logs: Query/retrieval tracing
OgbujiPT provides explicit APIs for each, avoiding one-size-fits-all "universal memory" patterns.
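For instance, conversational working memory often needs nothing more than a bounded window of recent messages, with no vector store involved at all. A library-independent sketch (the class name and sizes are illustrative, not OgbujiPT API):

```python
from collections import deque

class WorkingMemory:
    '''Bounded conversational window: keeps only the most recent messages.'''
    def __init__(self, max_messages=6):
        self._buf = deque(maxlen=max_messages)  # old messages fall off the front

    def add(self, role, content):
        self._buf.append({'role': role, 'content': content})

    def context(self):
        '''Messages to include in the next LLM call, oldest first.'''
        return list(self._buf)

mem = WorkingMemory(max_messages=4)
for i in range(6):
    mem.add('user', f'message {i}')
print(len(mem.context()))  # only the 4 most recent survive
```

Semantic memory, by contrast, is exactly what the knowledge bank stores above are for, which is why the two get separate APIs rather than one "universal memory" interface.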
Resources
License
Apache 2.0. For tha culture!
Credits
Some initial ideas & code were borrowed from these projects, but with heavy refactoring:
Related projects
- mlx-tuning-fork—"very basic framework for parameterized Large Language Model (Q)LoRa fine-tuning with MLX. It uses mlx, mlx_lm, and OgbujiPT, and is based primarily on the excellent mlx-example libraries but adds very minimal architecture for systematic running of easily parameterized fine tunes, hyperparameter sweeping, declarative prompt construction, an equivalent of HF's train on completions, and other capabilities."
- living-bookmarks—"Uses [OgbujiPT] to Help a user manage their bookmarks in context of various chat, etc."
FAQ
What's unique about OgbujiPT?
Unlike frameworks that try to do everything, OgbujiPT focuses on:
- Knowledge bank primitives: Clean APIs for storage and retrieval
- Composability: Mix backends and strategies without lock-in
- Pythonic simplicity: Minimal abstractions, clear code
- Production-ready: Async-first, structured logging, retry logic
- Explicit design: No hidden magic—you control the details
Why not just use LangChain?
LangChain is great for many use cases, but it's also:
- Overly abstracted (hard to understand what's happening)
- Monolithic (hard to use just the parts you need)
- Configuration-heavy (too many ways to configure the same thing)
OgbujiPT provides a lighter-weight alternative focused on knowledge banks, with clear boundaries and explicit control.
Does this support GPU for locally-hosted models?
Yes! Make sure your LLM backend (Toolio, llama.cpp, text-generation-webui, etc.) is configured with GPU support. OgbujiPT works with any OpenAI-compatible API, so GPU acceleration is handled by your backend.
What's with the crazy name?
Enh?! Yo mama! 😝 My surname is Ogbuji, so it's a bit of a pun. This is the notorious OGPT, ya feel me?