Deep Semantic Search

A Python library for embedding, indexing, and applying semantic search for text and image data.

Features

  • Multi-modal Semantic Search:

    • Embedding and indexing text data using the nli-mpnet-base-v2 model
    • Embedding and indexing image data using the CLIP model
    • Semantic search for both text and image data
    • Search images by both image and text queries
  • Clustering and Image Captioning:

    • Cluster image embeddings using PyTorch KMeans (with GPU support)
    • Caption images using the BLIP model
  • Retrieval-Augmented Generation (RAG):

    • Answer questions based on search results
    • Summarize search results
    • Generate topics for image captions
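
To make the search features above concrete, here is a minimal sketch of embedding-based ranking: every item is represented as a vector, and a query is matched by cosine similarity. This is an illustration only, not the library's implementation. The function names `cosine_similarity` and `top_n_similar` and the 3-dimensional toy vectors are assumptions made for this example; the actual package may index and score differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_n_similar(query_vec, index, top_n=5):
    """Return the top_n (key, score) pairs, most similar first."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional "embeddings" (hypothetical); real models produce far larger vectors.
toy_index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}
ranked = top_n_similar([1.0, 0.0, 0.0], toy_index, top_n=2)
for key, score in ranked:
    print(f"{key}: {score:.3f}")
```

The same ranking idea applies to text and image vectors alike, which is what makes searching images by a text query possible once both live in a shared embedding space.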

Installation

pip install deep-semantic-search

Quick Start

Text Search

from deep_semantic_search import LoadTextData, TextEmbedder, TextSearch

# Load text data
loader = LoadTextData()
corpus_dict = loader.from_folder("path/to/text/files")

# Embed the text data
embedder = TextEmbedder()
embedder.embed(corpus_dict)

# Search for similar texts
search = TextSearch()
results = search.find_similar("your search query", top_n=5)

for result in results:
    print(f"Score: {result['score']}, Text: {result['text'][:100]}...")
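
The loop above suggests that each result is a dict with `score` and `text` keys. Assuming that shape, and assuming a higher score means a closer match (neither is confirmed by the package documentation here), weak matches could be dropped with a small helper. `filter_by_score` and the sample data are hypothetical:

```python
def filter_by_score(results, min_score):
    """Keep only results whose 'score' is at least min_score."""
    return [r for r in results if r["score"] >= min_score]


# Hypothetical results, shaped like the entries the loop above reads.
sample_results = [
    {"score": 0.82, "text": "Semantic search ranks documents by meaning."},
    {"score": 0.41, "text": "An unrelated note about lunch."},
]

strong = filter_by_score(sample_results, min_score=0.5)
for r in strong:
    print(f"Score: {r['score']:.2f}, Text: {r['text'][:100]}")
```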

Image Search

from deep_semantic_search import LoadImageData, ImageSearch

# Load image data
loader = LoadImageData()
image_paths = loader.from_folder("path/to/images")

# Set up image search
searcher = ImageSearch(image_paths)

# Search for similar images to a text query
results = searcher.get_similar_images_to_text("cat on a sofa", number_of_images=5)

# Display results
for path, score in results.items():
    print(f"Score: {score}, Image: {path}")
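
The loop above treats the result as a mapping from image path to score. If the order of that mapping matters for display, it can be sorted explicitly. This sketch assumes a plain dict and assumes that a higher score means a closer match; `sort_by_score` and the sample paths are made up for illustration:

```python
def sort_by_score(results):
    """Return (path, score) pairs ordered from highest to lowest score."""
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Hypothetical search output: image path -> similarity score.
sample = {
    "images/cat1.jpg": 0.31,
    "images/sofa.jpg": 0.27,
    "images/cat2.jpg": 0.35,
}

ordered = sort_by_score(sample)
for path, score in ordered:
    print(f"Score: {score:.2f}, Image: {path}")
```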

RAG (Retrieval-Augmented Generation)

from deep_semantic_search import ask_question

# Ask a question based on provided text data
texts = ["Text document 1...", "Text document 2..."]
answer = ask_question(texts, "What is the main topic discussed?")
print(answer)
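
For readers new to RAG, the general pattern is to place the retrieved passages into a prompt and ask a language model to answer from them. The sketch below shows only that prompt-assembly step. It is not how `ask_question` is implemented; `build_prompt`, its `max_chars` parameter, and the prompt wording are assumptions for illustration:

```python
def build_prompt(texts, question, max_chars=500):
    """Join retrieved passages into a numbered context, then append the question."""
    context_lines = []
    for i, text in enumerate(texts, start=1):
        # Truncate each passage so the prompt stays within a rough size budget.
        context_lines.append(f"[{i}] {text[:max_chars]}")
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    ["Text document 1...", "Text document 2..."],
    "What is the main topic discussed?",
)
print(prompt)
```

Numbering the passages makes it possible to ask the model to cite which passage supports its answer.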

Requirements

  • Python 3.8+
  • PyTorch
  • Sentence Transformers
  • Hugging Face Transformers
  • FAISS
  • LangChain
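
A quick way to check an environment against this list is to test the Python version and look up each dependency without importing it. This is a sketch: the import names in `REQUIRED_MODULES` are the usual ones for these projects but are assumptions here, and `missing_requirements` is a hypothetical helper, not part of the package.

```python
import importlib.util
import sys

# Display name -> import name (assumed; adjust if your install differs).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "Sentence Transformers": "sentence_transformers",
    "Hugging Face Transformers": "transformers",
    "FAISS": "faiss",
    "LangChain": "langchain",
}


def missing_requirements(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be found."""
    return [
        label
        for label, name in modules.items()
        if importlib.util.find_spec(name) is None
    ]


python_ok = sys.version_info >= (3, 8)
print(f"Python 3.8+: {python_ok}")
print(f"Missing: {missing_requirements() or 'none'}")
```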

License

MIT
