Core interfaces for hybrid search implementations (CPU version)
just-semantic-search
LLM-agnostic semantic-search library with hybrid search support and multiple backends.
Features
- 🔍 Hybrid search combining semantic and keyword search
- 🚀 Multiple backend support (Meilisearch, more coming soon)
- 📄 Smart document splitting with semantic awareness
- 🔌 LLM-agnostic - works with any embedding model
- 🎯 Optimized for scientific and technical content
- 🛠 Easy to use API and CLI tools
Installation
Make sure you have at least Python 3.11 installed.
Using pip
pip install just-semantic-search # Core package
pip install just-semantic-search-meili # Meilisearch backend
Using Poetry
poetry add just-semantic-search # Core package
poetry add just-semantic-search-meili # Meilisearch backend
From Source
# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -
# Clone the repository
git clone https://github.com/your-username/just-semantic-search.git
cd just-semantic-search
# Install dependencies and create virtual environment
poetry install
# Activate the virtual environment
poetry shell
Docker Setup for Meilisearch
The project includes a Docker Compose configuration for running Meilisearch. Simply run:
./bin/meili.sh
This will start a Meilisearch instance with vector search enabled and persistent data storage.
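Before indexing anything, it can help to confirm that the instance is reachable. The sketch below queries Meilisearch's `/health` endpoint using only the Python standard library. The helper names (`health_url`, `parse_health`, `is_available`) are illustrative and not part of just-semantic-search, and the host and port are assumed to match the values used in the Quick Start configuration.

```python
import json
import urllib.error
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 7700) -> str:
    """Build the URL of the Meilisearch health endpoint."""
    return f"http://{host}:{port}/health"


def parse_health(body: bytes) -> bool:
    """Return True if a /health response body reports status 'available'."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (or not decodable), so treat it as unhealthy
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"


def is_available(host: str = "127.0.0.1", port: int = 7700, timeout: float = 2.0) -> bool:
    """Query /health once; any connection problem counts as 'not available'."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return parse_health(resp.read())
    except (urllib.error.URLError, OSError):
        return False
```

Calling `is_available()` after running the start script should return `True` once the container has finished starting; a `False` result usually means the port or host differs from the assumed defaults.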
Quick Start
Document Splitting
from just_semantic_search.article_semantic_splitter import ArticleSemanticSplitter
from sentence_transformers import SentenceTransformer
# Initialize model and splitter
model = SentenceTransformer('thenlper/gte-base')
splitter = ArticleSemanticSplitter(model)
# Split document with metadata
documents = splitter.split_file(
    "path/to/document.txt",
    embed=True,
    title="Document Title",
    source="https://source.url"
)
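To give a feel for what splitting with semantic awareness can mean, here is a minimal sketch. It is not the `ArticleSemanticSplitter` implementation: it simply starts a new chunk whenever the cosine similarity between neighbouring sentences drops below a threshold. The `embed` function is a toy bag-of-words stand-in so the example runs without downloading a model; in practice a sentence-embedding model would take its place. All names (`embed`, `cosine`, `split_by_similarity`) and the threshold value are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_by_similarity(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Group consecutive sentences; start a new chunk when similarity drops."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        # Compare each sentence with the last sentence of the current chunk
        if chunks and cosine(embed(chunks[-1][-1]), embed(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


sentences = [
    "Genes encode proteins.",
    "Some genes encode enzymes.",
    "Meilisearch stores vectors.",
    "Meilisearch also stores keywords.",
]
chunks = split_by_similarity(sentences)
```

With these toy inputs the two gene sentences end up in one chunk and the two Meilisearch sentences in another, because the sentences on either side of the topic change share no words.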
Hybrid Search with Meilisearch
from just_semantic_search.meili.rag import MeiliConfig, MeiliRAG
# Configure Meilisearch
config = MeiliConfig(
    host="127.0.0.1",
    port=7700,
    api_key="your_api_key"
)
# Initialize RAG
rag = MeiliRAG(
    "test_index",
    "thenlper/gte-base",
    config,
    create_index_if_not_exists=True
)
# Add documents and search
# (`documents` and `model` come from the Document Splitting example above)
rag.add_documents_sync(documents)
results = rag.search(
    text_query="What are CAD-genes?",
    vector=model.encode("What are CAD-genes?")
)
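Conceptually, hybrid search blends a keyword-match score with a vector-similarity score. The sketch below illustrates that idea with a simple weighted sum in plain Python. It is not how `MeiliRAG` or Meilisearch computes rankings internally; `keyword_score`, `cosine`, `hybrid_rank`, and the `semantic_ratio` weight are hypothetical names chosen for this example, and the two-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the text (0.0 to 1.0)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query, query_vec, docs, semantic_ratio=0.5):
    """Rank (text, vector) pairs by a weighted sum of keyword and vector scores."""
    scored = []
    for text, vec in docs:
        score = ((1 - semantic_ratio) * keyword_score(query, text)
                 + semantic_ratio * cosine(query_vec, vec))
        scored.append((score, text))
    # Highest combined score first
    return [text for _, text in sorted(scored, key=lambda p: p[0], reverse=True)]


docs = [
    ("cad genes regulate metabolism", [0.9, 0.1]),
    ("search engines index documents", [0.1, 0.9]),
]
ranked = hybrid_rank("cad genes", [1.0, 0.0], docs)
```

Setting `semantic_ratio` to 0.0 reduces the ranking to pure keyword matching, and 1.0 to pure vector similarity, which makes the trade-off between the two signals easy to experiment with.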
Project Structure
The project consists of multiple components:
- core: Core interfaces for hybrid search implementations
- meili: Meilisearch backend implementation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Citation
If you use this software in your research, please cite:
@software{just_semantic_search,
title = {just-semantic-search: LLM-agnostic semantic search library},
author = {Karmazin, Alex and Kulaga, Anton},
year = {2024},
url = {https://github.com/your-username/just-semantic-search}
}
File details
Details for the file just_semantic_search-0.4.7.tar.gz.
File metadata
- Download URL: just_semantic_search-0.4.7.tar.gz
- Upload date:
- Size: 23.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.10.12 Linux/5.15.0-161-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 08bb121b6d9350fc53377d75671926a00b40c8740bef3189f57fcc1e2c4879d4 |
| MD5 | a80d1040bd6bee5ee96094ce1305fa27 |
| BLAKE2b-256 | 9300f3d8feaeca4701775f7f4301730cb46ca30aa32c550ee01b0c2ebbeac6c3 |
File details
Details for the file just_semantic_search-0.4.7-py3-none-any.whl.
File metadata
- Download URL: just_semantic_search-0.4.7-py3-none-any.whl
- Upload date:
- Size: 31.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.10.12 Linux/5.15.0-161-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1aa854e2f1c617beb7d9a054b85a0442aa538060bf6b3a4789d7307f15570a8 |
| MD5 | ea1b06348494c14b3bae65ab298079ce |
| BLAKE2b-256 | c0157e093dadffd92ee094198a97a47e3f54c3f6984a46888e973d315d1ccd54 |