Deep Semantic Search
A Python library for embedding, indexing, and applying semantic search for text and image data.
Features
- Multi-modal Semantic Search:
  - Embedding and indexing text data using the nli-mpnet-base-v2 model
  - Embedding and indexing image data using the CLIP model
  - Semantic search for both text and image data
  - Search images by both image and text queries
- Clustering and Image Captioning:
  - Cluster image embeddings using PyTorch KMeans (with GPU support)
  - Caption images using the BLIP model
- Retrieval-Augmented Generation (RAG):
  - Answer questions based on search results
  - Summarize search results
  - Generate topics for image captions
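The idea behind the semantic search features can be illustrated without any model at all. The sketch below is not this library's implementation: it uses hand-written 3-dimensional vectors in place of real nli-mpnet-base-v2 or CLIP embeddings and a plain Python loop in place of a FAISS index. The names `cosine_similarity`, `top_k`, and `toy_index` are made up for the illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Score every indexed vector against the query and return the best k."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (assumption: real models produce hundreds of dimensions).
toy_index = {
    "cat.txt": [0.9, 0.1, 0.0],
    "dog.txt": [0.7, 0.3, 0.1],
    "car.txt": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]

ranked = top_k(toy_query, toy_index, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is fine for a handful of items; an index such as FAISS exists to make the same nearest-neighbour lookup fast on large collections.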
Installation
pip install deep-semantic-search
Quick Start
Text Search
from deep_semantic_search import LoadTextData, TextEmbedder, TextSearch
# Load text data
loader = LoadTextData()
corpus_dict = loader.from_folder("path/to/text/files")
# Embed the text data
embedder = TextEmbedder()
embedder.embed(corpus_dict)
# Search for similar texts
search = TextSearch()
results = search.find_similar("your search query", top_n=5)
for result in results:
    print(f"Score: {result['score']}, Text: {result['text'][:100]}...")
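Search results often need a little post-processing before display. The following is a small sketch that assumes each result is a dict with `score` and `text` keys, as in the loop above, and that a higher score means a closer match. The helpers `filter_by_score` and `preview`, the threshold value, and the sample data are hypothetical and not part of the library.

```python
def filter_by_score(results, min_score=0.5):
    """Keep only results whose score is at least min_score."""
    return [r for r in results if r["score"] >= min_score]


def preview(text, width=100):
    """Shorten long text for display, adding an ellipsis only when truncated."""
    return text if len(text) <= width else text[:width] + "..."


# Hypothetical results in the same shape as the Quick Start example.
sample_results = [
    {"score": 0.82, "text": "Semantic search compares meaning rather than keywords."},
    {"score": 0.41, "text": "An unrelated note about release schedules."},
    {"score": 0.67, "text": "Embeddings map text to vectors so similar texts sit close together."},
]

for r in filter_by_score(sample_results, min_score=0.5):
    print(f"Score: {r['score']:.2f}, Text: {preview(r['text'])}")
```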
Image Search
from deep_semantic_search import LoadImageData, ImageSearch
# Load image data
loader = LoadImageData()
image_paths = loader.from_folder("path/to/images")
# Set up image search
searcher = ImageSearch(image_paths)
# Search for similar images to a text query
results = searcher.get_similar_images_to_text("cat on a sofa", number_of_images=5)
# Display results
for path, score in results.items():
    print(f"Score: {score}, Image: {path}")
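Since the image results above are iterated as a mapping from path to score, it can be handy to rank them explicitly before display. This is a sketch under two assumptions: the mapping shape shown above, and that a higher score means a closer match. `rank_images` and the sample paths and scores are hypothetical.

```python
def rank_images(results, min_score=0.0):
    """Return (path, score) pairs sorted best-first, dropping weak matches."""
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [(path, score) for path, score in ranked if score >= min_score]


# Hypothetical results in the same shape as the Quick Start example.
sample_image_results = {
    "images/a.jpg": 0.21,
    "images/b.jpg": 0.35,
    "images/c.jpg": 0.28,
}

for path, score in rank_images(sample_image_results, min_score=0.25):
    print(f"Score: {score:.2f}, Image: {path}")
```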
RAG (Retrieval-Augmented Generation)
from deep_semantic_search import ask_question
# Ask a question based on provided text data
texts = ["Text document 1...", "Text document 2..."]
answer = ask_question(texts, "What is the main topic discussed?")
print(answer)
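For readers new to RAG, the general pattern is to place the retrieved texts into a prompt as context and then ask a language model the question. The sketch below shows only the prompt-assembly step and is not how `ask_question` is implemented; `build_prompt`, the prompt wording, and the `max_chars` budget are assumptions made for illustration.

```python
def build_prompt(texts, question, max_chars=2000):
    """Assemble a RAG-style prompt from retrieved texts and a question.

    Texts are added in order until the character budget would be exceeded.
    """
    context_parts = []
    used = 0
    for i, text in enumerate(texts, start=1):
        snippet = text.strip()
        if used + len(snippet) > max_chars:
            break  # stop before overflowing the (hypothetical) context budget
        context_parts.append(f"[{i}] {snippet}")
        used += len(snippet)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


texts = ["Text document 1...", "Text document 2..."]
prompt = build_prompt(texts, "What is the main topic discussed?")
print(prompt)
```

Numbering the snippets makes it possible for a model to cite which document supported its answer, and the budget keeps the prompt within a model's context window.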
Requirements
- Python 3.8+
- PyTorch
- Sentence Transformers
- Hugging Face Transformers
- FAISS
- LangChain
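Before running the examples, it can help to confirm that the requirements listed above are importable. The sketch below uses only the standard library. The import names in `REQUIRED_MODULES` are the usual ones for these packages but are an assumption here, since this README does not list them, and `missing_dependencies` and `python_ok` are hypothetical helpers.

```python
import sys
from importlib.util import find_spec

# Display name -> import name (assumed; adjust if your environment differs).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "Sentence Transformers": "sentence_transformers",
    "Hugging Face Transformers": "transformers",
    "FAISS": "faiss",
    "LangChain": "langchain",
}


def python_ok(min_version=(3, 8)):
    """True if the running interpreter meets the minimum version."""
    return sys.version_info >= min_version


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return display names of packages whose top-level module cannot be found."""
    missing = []
    for display_name, module_name in modules.items():
        try:
            found = find_spec(module_name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(display_name)
    return missing


print("Python version OK:", python_ok())
print("Missing packages:", missing_dependencies() or "none")
```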
License
MIT