
Core interfaces for hybrid search implementations (CPU version)


just-semantic-search


LLM-agnostic semantic-search library with hybrid search support and multiple backends.

Features

  • 🔍 Hybrid search combining semantic and keyword search
  • 🚀 Multiple backend support (Meilisearch, more coming soon)
  • 📄 Smart document splitting with semantic awareness
  • 🔌 LLM-agnostic - works with any embedding model
  • 🎯 Optimized for scientific and technical content
  • 🛠 Easy to use API and CLI tools

Installation

Requires Python 3.11 or later.

Using pip

pip install just-semantic-search        # Core package
pip install just-semantic-search-meili  # Meilisearch backend

Using Poetry

poetry add just-semantic-search        # Core package
poetry add just-semantic-search-meili  # Meilisearch backend

From Source

# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -

# Clone the repository
git clone https://github.com/your-username/just-semantic-search.git
cd just-semantic-search

# Install dependencies and create virtual environment
poetry install

# Activate the virtual environment
poetry shell
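
Whichever route you choose, a quick sanity check can confirm that the interpreter meets the stated minimum and that the packages are visible to it. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `installed_version` are illustrative and not part of just-semantic-search, and the distribution names are assumed to match the pip commands above (a source install may register them differently).

```python
import sys
from importlib.metadata import PackageNotFoundError, version

MIN_PYTHON = (3, 11)  # minimum version stated in this README


def python_ok(current=None):
    """Return True if the (major, minor) version meets the stated minimum."""
    current = tuple(current or sys.version_info)[:2]
    return current >= MIN_PYTHON


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    # Distribution names assumed from the pip install commands above.
    for name in ("just-semantic-search", "just-semantic-search-meili"):
        print(name, "->", installed_version(name) or "not installed")
```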

Docker Setup for Meilisearch

The project includes a Docker Compose configuration for running Meilisearch. To start it, run:

./bin/meili.sh

This starts a Meilisearch instance with vector search enabled and persistent data storage.
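
Before indexing anything, it can help to confirm that the instance is actually reachable. The sketch below polls Meilisearch's `/health` endpoint with the standard library. It assumes the container listens on `127.0.0.1:7700` (the values used in the Quick Start configuration); the function name `meili_is_available` is hypothetical and not part of this project.

```python
import json
import urllib.error
import urllib.request


def meili_is_available(host="127.0.0.1", port=7700, timeout=2.0):
    """Return True if Meilisearch answers /health with status 'available'."""
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"


if __name__ == "__main__":
    print("Meilisearch available:", meili_is_available())
```

If this prints `False`, check that the container is running and that the host and port match your setup.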

Quick Start

Document Splitting

from just_semantic_search.article_semantic_splitter import ArticleSemanticSplitter
from sentence_transformers import SentenceTransformer

# Initialize model and splitter
model = SentenceTransformer('thenlper/gte-base')
splitter = ArticleSemanticSplitter(model)

# Split document with metadata
documents = splitter.split_file(
    "path/to/document.txt",
    embed=True,
    title="Document Title",
    source="https://source.url"
)
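
To give a feel for what "semantic" splitting means, here is a toy sketch that keeps adjacent sentences together while they stay similar and starts a new chunk when the topic shifts. It is an illustration only and not the algorithm used by `ArticleSemanticSplitter`: it swaps the embedding model for a bag-of-words vector so it runs without dependencies, and the names `bow`, `cosine`, `split_by_similarity`, and `threshold` are all hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Toy bag-of-words 'embedding'; a real splitter would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def split_by_similarity(sentences, threshold=0.2):
    """Start a new chunk when a sentence is dissimilar to the current chunk."""
    chunks = []
    for sentence in sentences:
        if chunks and cosine(bow(" ".join(chunks[-1])), bow(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    text = [
        "Genes encode proteins.",
        "Proteins are built from genes.",
        "Docker runs containers.",
        "Containers isolate Docker workloads.",
    ]
    for chunk in split_by_similarity(text):
        print(chunk)
```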

Hybrid Search with Meilisearch

from just_semantic_search.meili.rag import MeiliConfig, MeiliRAG

# Configure Meilisearch
config = MeiliConfig(
    host="127.0.0.1",
    port=7700,
    api_key="your_api_key"
)

# Initialize RAG
rag = MeiliRAG(
    "test_index",
    "thenlper/gte-base",
    config,
    create_index_if_not_exists=True
)

# Add documents and search.
# `documents` and `model` come from the Document Splitting example above.
query = "What are CAD-genes?"
rag.add_documents_sync(documents)
results = rag.search(
    text_query=query,
    vector=model.encode(query)
)
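
Conceptually, a hybrid search combines a keyword match score with a vector similarity score. The sketch below shows one simple way to blend the two with a weight. It is a toy under stated assumptions, not how Meilisearch or just-semantic-search rank results; the names `keyword_score`, `hybrid_rank`, and `semantic_ratio` are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_rank(query, query_vec, docs, semantic_ratio=0.5):
    """Rank (text, vector) pairs by a weighted blend of both scores.

    semantic_ratio=1.0 uses only vector similarity, 0.0 only keywords.
    Returns a list of (score, text) tuples, best first.
    """
    scored = []
    for text, vec in docs:
        semantic = cosine(query_vec, vec)
        keyword = keyword_score(query, text)
        score = semantic_ratio * semantic + (1 - semantic_ratio) * keyword
        scored.append((score, text))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Tiny hand-made vectors stand in for real embeddings.
    docs = [("cad genes overview", [1.0, 0.0]), ("cooking pasta", [0.0, 1.0])]
    for score, text in hybrid_rank("cad genes", [1.0, 0.0], docs):
        print(f"{score:.2f}  {text}")
```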

Project Structure

The project consists of multiple components:

  • core: Core interfaces for hybrid search implementations
  • meili: Meilisearch backend implementation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Citation

If you use this software in your research, please cite:

@software{just_semantic_search,
  title = {just-semantic-search: LLM-agnostic semantic search library},
  author = {Karmazin, Alex and Kulaga, Anton},
  year = {2024},
  url = {https://github.com/your-username/just-semantic-search}
}
