Deep Semantic Search

A Python library for embedding, indexing, and semantically searching text and image data.

Features

  • Multi-modal Semantic Search:

    • Embedding and indexing text data using the nli-mpnet-base-v2 model
    • Embedding and indexing image data using the CLIP model
    • Semantic search for both text and image data
    • Search images by both image and text queries
  • Clustering and Image Captioning:

    • Cluster image embeddings using PyTorch KMeans (with GPU support)
    • Caption images using the BLIP model
  • Retrieval-Augmented Generation (RAG):

    • Answer questions based on search results
    • Summarize search results
    • Generate topics for image captions
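The core idea behind the search features above can be sketched in a few lines: items and queries are mapped to vectors, and the closest vectors are returned. The snippet below is a minimal illustration using hand-written toy vectors and brute-force cosine similarity. It is not the library's implementation; the function names `cosine_similarity` and `top_n_similar` are hypothetical, and since FAISS is listed as a requirement, the library presumably relies on an index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_similar(query_vec, corpus_vecs, top_n=5):
    """Return (index, score) pairs for the top_n corpus vectors closest to the query."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy 3-dimensional "embeddings" (illustrative only; model embeddings are much larger).
corpus_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
hits = top_n_similar([1.0, 0.0, 0.0], corpus_vecs, top_n=2)
for index, score in hits:
    print(f"Item {index}: score {score:.3f}")
```

In a real pipeline the toy vectors would be replaced by model outputs, and the linear scan by a vector index once the corpus grows.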

Installation

pip install deep-semantic-search

Quick Start

Text Search

from deep_semantic_search import LoadTextData, TextEmbedder, TextSearch

# Load text data
loader = LoadTextData()
corpus_dict = loader.from_folder("path/to/text/files")

# Embed the text data
embedder = TextEmbedder()
embedder.embed(corpus_dict)

# Search for similar texts
search = TextSearch()
results = search.find_similar("your search query", top_n=5)

for result in results:
    print(f"Score: {result['score']}, Text: {result['text'][:100]}...")
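Since the loop above treats each result as a dictionary with `score` and `text` keys, results can be post-processed with ordinary Python. The following sketch filters out weak matches and shortens the text for display. The helper `filter_results`, the threshold value, and the sample data are all hypothetical, and it assumes that a higher score means a closer match, which should be verified against the library's actual output.

```python
def filter_results(results, min_score=0.5, preview_chars=100):
    """Keep results scoring at least min_score, best first, with shortened text."""
    kept = [
        {"score": result["score"], "preview": result["text"][:preview_chars]}
        for result in results
        if result["score"] >= min_score
    ]
    kept.sort(key=lambda item: item["score"], reverse=True)
    return kept


# Made-up results in the shape used by the loop above (not real library output).
sample_results = [
    {"score": 0.82, "text": "Semantic search ranks documents by meaning."},
    {"score": 0.31, "text": "An unrelated note about lunch."},
    {"score": 0.67, "text": "Embeddings map text to vectors."},
]
filtered = filter_results(sample_results, min_score=0.5)
for item in filtered:
    print(f"Score: {item['score']}, Text: {item['preview']}")
```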

Image Search

from deep_semantic_search import LoadImageData, ImageSearch

# Load image data
loader = LoadImageData()
image_paths = loader.from_folder("path/to/images")

# Set up image search
searcher = ImageSearch(image_paths)

# Search for similar images to a text query
results = searcher.get_similar_images_to_text("cat on a sofa", number_of_images=5)

# Display results
for path, score in results.items():
    print(f"Score: {score}, Image: {path}")
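The example above iterates over `results.items()`, which suggests a mapping from image path to score. Assuming that shape, and assuming higher scores mean closer matches, a small helper can put the best matches first and drop weak ones. `rank_images` and the sample paths and scores below are hypothetical and only illustrate the idea.

```python
def rank_images(results, min_score=0.0):
    """Return (path, score) pairs sorted best-first, skipping scores below min_score."""
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [(path, score) for path, score in ranked if score >= min_score]


# Made-up scores in the path -> score shape used above (not real library output).
sample_results = {
    "images/sofa.jpg": 0.27,
    "images/cat.jpg": 0.31,
    "images/dog.jpg": 0.12,
}
ranked = rank_images(sample_results, min_score=0.2)
for path, score in ranked:
    print(f"Score: {score}, Image: {path}")
```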

RAG (Retrieval-Augmented Generation)

from deep_semantic_search import ask_question

# Ask a question based on provided text data
texts = ["Text document 1...", "Text document 2..."]
answer = ask_question(texts, "What is the main topic discussed?")
print(answer)
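To give a rough idea of what retrieval-augmented generation involves, the sketch below shows one common way to combine retrieved texts and a question into a single prompt for a language model. This is an assumption about the general technique, not a description of how `ask_question` works internally. The function `build_rag_prompt`, its `max_chars` limit, and the prompt wording are hypothetical.

```python
def build_rag_prompt(texts, question, max_chars=2000):
    """Number the retrieved texts, stop once the character budget is hit, append the question."""
    context_parts = []
    used = 0
    for number, text in enumerate(texts, start=1):
        snippet = f"[{number}] {text.strip()}"
        if used + len(snippet) > max_chars:
            break  # keep the prompt within a rough size budget
        context_parts.append(snippet)
        used += len(snippet)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


texts = ["Text document 1...", "Text document 2..."]
prompt = build_rag_prompt(texts, "What is the main topic discussed?")
print(prompt)
```

The resulting string would then be sent to whichever language model is in use; numbering the snippets makes it easier to ask the model to cite its sources.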

Requirements

  • Python 3.8+
  • PyTorch
  • Sentence Transformers
  • Hugging Face Transformers
  • FAISS
  • LangChain
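Before running the examples, it can help to confirm that the requirements listed above are present in the current environment. The following sketch uses only the standard library and checks for the usual import names of these packages (`torch`, `sentence_transformers`, `transformers`, `faiss`, `langchain`). The helper `check_environment` is hypothetical and not part of this library.

```python
import sys
from importlib.util import find_spec

# Display name -> import name for each listed requirement.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "Sentence Transformers": "sentence_transformers",
    "Hugging Face Transformers": "transformers",
    "FAISS": "faiss",
    "LangChain": "langchain",
}


def check_environment():
    """Return a dict mapping each requirement to True if it appears to be satisfied."""
    report = {"Python 3.8+": sys.version_info >= (3, 8)}
    for label, module_name in REQUIRED_MODULES.items():
        try:
            report[label] = find_spec(module_name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialized modules.
            report[label] = False
    return report


report = check_environment()
for label, ok in report.items():
    print(f"{label}: {'found' if ok else 'missing'}")
```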

License

MIT
