Knowledge Management System that connects to your RAG system
Simba - Your Knowledge Management System
Connect your knowledge to any RAG system
Overview
Simba is an open-source, portable Knowledge Management System (KMS) designed specifically for seamless integration with Retrieval-Augmented Generation (RAG) systems. With its intuitive UI, modular architecture, and powerful SDK, Simba simplifies knowledge management, allowing developers to focus on building advanced AI solutions.
Features
- Powerful SDK: Comprehensive Python SDK for easy integration.
- Modular Architecture: Flexible integration of vector stores, embedding models, chunkers, and parsers.
- Modern UI: User-friendly interface for managing document chunks.
- Seamless Integration: Connects with any RAG-based system.
- Developer-Centric: Simplifies complex knowledge management tasks.
- Open Source & Extensible: Community-driven with extensive customization options.
Getting Started
Prerequisites
Ensure you have the following installed:
Quickstart Simba SDK Usage
pip install simba-client
Leverage Simba's SDK for powerful programmatic access:
from simba_sdk import SimbaClient

# Requires simba-core to be installed and the Simba server to be running
client = SimbaClient(api_url="http://localhost:8000")

# Upload a document and keep its ID
document = client.documents.create(file_path="path/to/your/document.pdf")
document_id = document[0]["id"]

# Parse the uploaded document synchronously
parsing_result = client.parser.parse_document(document_id, parser="docling", sync=True)

# Query the knowledge base and print each retrieved chunk
retrieval_results = client.retriever.retrieve(query="your-query")
for result in retrieval_results["documents"]:
    print(f"Content: {result['page_content']}")
    print(f"Metadata: {result['metadata']['source']}")
    print("====" * 10)
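The loop above reads `page_content` and `metadata["source"]` from each entry under `documents`. As a minimal sketch, assuming the response keeps exactly that shape, retrieved chunks can be grouped by source file. The helper name `group_by_source` and the sample data are hypothetical and not part of the Simba SDK.

```python
from collections import defaultdict


def group_by_source(retrieval_results):
    """Group retrieved chunk texts by their metadata["source"] value.

    Assumes the response shape used in the quickstart:
    {"documents": [{"page_content": str, "metadata": {"source": str}}, ...]}
    """
    grouped = defaultdict(list)
    for result in retrieval_results.get("documents", []):
        # Fall back to "unknown" if a chunk carries no source metadata.
        source = result.get("metadata", {}).get("source", "unknown")
        grouped[source].append(result["page_content"])
    return dict(grouped)


# Hypothetical sample data standing in for a real retrieval response.
sample = {
    "documents": [
        {"page_content": "First chunk", "metadata": {"source": "a.pdf"}},
        {"page_content": "Second chunk", "metadata": {"source": "b.pdf"}},
        {"page_content": "Third chunk", "metadata": {"source": "a.pdf"}},
    ]
}

for source, chunks in group_by_source(sample).items():
    print(f"{source}: {len(chunks)} chunk(s)")
```

With a live server, `retrieval_results` from the quickstart could be passed in place of `sample`, provided the response really has this shape.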
Explore more in the Simba SDK documentation.
Installation
Install Simba core:
pip install simba-core
Or clone and set up the repository:
git clone https://github.com/GitHamza0206/simba.git
cd simba
poetry config virtualenvs.in-project true
poetry install
source .venv/bin/activate
Configuration
Create a .env file:
OPENAI_API_KEY=your_openai_api_key
REDIS_HOST=localhost
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/1
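Before starting the server it can help to check that the .env file actually defines these four keys. The following is a minimal sketch using only the standard library. The function names are hypothetical, and treating all four keys as required is an assumption based on the example above, not a documented Simba rule.

```python
from pathlib import Path

# Keys taken from the example .env above; whether Simba requires all of
# them is an assumption made for this sketch.
EXPECTED_KEYS = (
    "OPENAI_API_KEY",
    "REDIS_HOST",
    "CELERY_BROKER_URL",
    "CELERY_RESULT_BACKEND",
)


def parse_env(text):
    """Parse KEY=value lines into a dict, skipping blanks and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not values.get(key)]


env_path = Path(".env")
values = parse_env(env_path.read_text()) if env_path.exists() else {}
print("Missing keys:", missing_keys(values))
```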
Configure config.yaml:
# config.yaml
project:
  name: "Simba"
  version: "1.0.0"
  api_version: "/api/v1"

paths:
  base_dir: null  # Will be set programmatically
  faiss_index_dir: "vector_stores/faiss_index"
  vector_store_dir: "vector_stores"

llm:
  provider: "openai"
  model_name: "gpt-4o-mini"
  temperature: 0.0
  max_tokens: null
  streaming: true
  additional_params: {}

embedding:
  provider: "huggingface"
  model_name: "BAAI/bge-base-en-v1.5"
  device: "mps"  # Apple Silicon; change to "cpu" for container compatibility
  additional_params: {}

vector_store:
  provider: "faiss"
  collection_name: "simba_collection"
  additional_params: {}

chunking:
  chunk_size: 512
  chunk_overlap: 200

retrieval:
  method: "hybrid"  # Options: default, semantic, keyword, hybrid, ensemble, reranked
  k: 5
  # Method-specific parameters
  params:
    # Semantic retrieval parameters
    score_threshold: 0.5
    # Hybrid retrieval parameters
    prioritize_semantic: true
    # Ensemble retrieval parameters
    weights: [0.7, 0.3]  # Weights for semantic and keyword retrievers
    # Reranking parameters
    reranker_model: colbert
    reranker_threshold: 0.7

# Database configuration
database:
  provider: litedb  # Options: litedb, sqlite
  additional_params: {}

celery:
  broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
  result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
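The celery entries use the shell-style `${VAR:-default}` form: the environment variable wins when it is set and non-empty, otherwise the default applies. Note that the defaults point at a host named `redis`, while the .env example uses `localhost`. The sketch below illustrates that expansion rule with the standard library. How Simba itself resolves these placeholders is an assumption, and `expand_placeholders` is a hypothetical helper.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand_placeholders(value, env=None):
    """Expand ${NAME} and ${NAME:-default} placeholders inside a string."""
    env = os.environ if env is None else env

    def replace(match):
        resolved = env.get(match.group("name"))
        if resolved:
            # Variable is set and non-empty, so it takes precedence.
            return resolved
        default = match.group("default")
        # As in the shell, ":-" treats unset and empty variables alike.
        return default if default is not None else ""

    return _PLACEHOLDER.sub(replace, value)


print(expand_placeholders("${CELERY_BROKER_URL:-redis://redis:6379/0}", env={}))
print(
    expand_placeholders(
        "${CELERY_BROKER_URL:-redis://redis:6379/0}",
        env={"CELERY_BROKER_URL": "redis://localhost:6379/0"},
    )
)
```

Passing `env` explicitly keeps the function easy to test; leaving it out falls back to `os.environ`.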
Running Simba
Start the server, frontend, and parsers:
simba server
simba front
simba parsers
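Once the commands above are running, a quick reachability check can confirm that something is listening before the SDK is pointed at it. This is a minimal sketch that only tests whether a TCP port accepts connections. Port 8000 for the API comes from the quickstart URL and 6379 for Redis from the .env example; `port_is_open` is a hypothetical helper, not a Simba command.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


# Ports taken from the examples in this README.
for name, port in (("Simba API", 8000), ("Redis", 6379)):
    status = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on localhost:{port} is {status}")
```

An open port does not prove the service is healthy; it only rules out the case where nothing is listening.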
Docker Deployment
Deploy Simba using Docker:
- CPU:
DEVICE=cpu make build
DEVICE=cpu make up
- NVIDIA GPU:
DEVICE=cuda make build
DEVICE=cuda make up
- Apple Silicon:
DEVICE=cpu make build
DEVICE=cpu make up
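The three cases above reduce to two values: `cuda` on NVIDIA hardware and `cpu` everywhere else, including Apple Silicon. As a rough sketch, a script could pick the value by looking for the `nvidia-smi` tool on the PATH. That detection heuristic is an assumption (finding the tool does not guarantee Docker has GPU access), and both helper names are hypothetical.

```python
import shutil


def pick_device(which=shutil.which):
    """Return "cuda" if nvidia-smi is on the PATH, otherwise "cpu"."""
    return "cuda" if which("nvidia-smi") else "cpu"


def make_commands(device):
    """Build the two make invocations shown above for a given device."""
    return [f"DEVICE={device} make build", f"DEVICE={device} make up"]


for command in make_commands(pick_device()):
    print(command)
```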
Roadmap
- pip install simba-core
- pip install simba-sdk
- www.simba-docs.com
- Auth & access management
- Web scraping
- Cloud integrations (Azure/AWS/GCP)
- Additional parsers and chunkers
- Enhanced UX/UI
Contributing
We welcome contributions! Follow these steps:
- Fork the repository
- Create a feature or bugfix branch
- Commit clearly documented changes
- Submit a pull request
Support & Contact
For support or inquiries, open an issue on GitHub or contact Hamza Zerouali.
File details
Details for the file simba_core-0.3.0.tar.gz.
File metadata
- Download URL: simba_core-0.3.0.tar.gz
- Upload date:
- Size: 336.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b18bef12e581dc32897855d824cac9cefd634b69b98616c879f344ed86eee937 |
| MD5 | c913dc6d3d340f07f32522509f953dc7 |
| BLAKE2b-256 | 239c622f5099e10fd75b86cacc79db4664555f2a397d0ab51801ff142701717e |
File details
Details for the file simba_core-0.3.0-py3-none-any.whl.
File metadata
- Download URL: simba_core-0.3.0-py3-none-any.whl
- Upload date:
- Size: 373.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6eb87876111556af1d3c98f1ecd77c081c0c147efd5c73a1430f32073c5fc1b6 |
| MD5 | 4aae3562f7f0b5fe8af65366cce6850f |
| BLAKE2b-256 | 82491ac6efae593e1a3ef3d7b5354a9dc207737d942c0f221c4a79acc94c8879 |