Multimodal Graph retrieval

Project description

Example Usage

from multimodal_rag import MultimodalRAG

# Initialize MultimodalRAG instance
mm_rag = MultimodalRAG(pdf_directory="/path/to/pdf_directory", output_directory="/path/to/output_directory")

# Preprocess documents
mm_rag.preprocess(directory="/path/to/pdf_directory", use_multiprocessing=True)

# Perform a multimodal query
query = "example query text"
search_results, result_paths = mm_rag.multimodal_query(query, k=5)

# Display results
print("Search Results:", search_results)
print("Result Paths:", result_paths)

Multimodal Retrieval (with captioning, image, and graph linkage)

from multimodal_rag import MultimodalRetrieval
from langchain_core.documents.base import Document

# Prepare text documents
text_documents = [
    Document(page_content="That car was on fire.", metadata={"source": "doc1.pdf", "page": 1}),
    Document(page_content="That vehicle is called lava", metadata={"source": "doc2.pdf", "page": 1})
]

# Prepare image paths
image_paths = [
    "/content/car.jpg",
    "/content/fire.jpg",
]

# Initialize and preprocess
rag = MultimodalRetrieval()
rag.preprocess(text_documents, image_paths, similarity_threshold=0.2)

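# Perform a query
query = "that car is fire"
results = rag.query(query, k=3, use_multi_hop=True)

# Alternatively, request text and image results with separate limits
# (this reassigns results, which the loops below read)
results = rag.query_balanced(query, k_text=3, k_image=3, use_multi_hop=True)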

# Print the results
print("Text Results:")
for doc, score in results["text_results"]:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
    print(f"Score: {score}")
    print()

print("Image Results:")
for metadata, score in results["image_results"]:
    print(f"Image Path: {metadata['path']}")
    print(f"Caption: {metadata['caption']}")
    print(f"Score: {score}")
    print()
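
The package's internals are not shown on this page, so the following is only a rough sketch of what `similarity_threshold` and `use_multi_hop` in the example above might correspond to conceptually: nodes (text chunks and images) are linked when their embedding similarity reaches a threshold, and a query's initial hits are then expanded along those links for a limited number of hops. The helper names `cosine`, `build_graph`, and `expand`, and the toy three-dimensional embeddings, are all assumptions made for illustration and are not part of `multimodal_rag`.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(embeddings, threshold):
    """Link every pair of nodes whose cosine similarity is >= threshold."""
    names = list(embeddings)
    graph = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine(embeddings[first], embeddings[second]) >= threshold:
                graph[first].add(second)
                graph[second].add(first)
    return graph


def expand(graph, seeds, hops):
    """Breadth-first expansion: seeds plus every node within `hops` links."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not walk further than the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen


# Hypothetical toy embeddings; a real system would use a learned encoder.
toy_embeddings = {
    "text:car on fire": [1.0, 1.0, 0.0],
    "image:car.jpg": [1.0, 0.0, 0.0],
    "image:fire.jpg": [0.0, 1.0, 0.0],
    "text:lava": [0.0, 0.0, 1.0],
}
toy_graph = build_graph(toy_embeddings, threshold=0.5)
one_hop = expand(toy_graph, ["image:car.jpg"], hops=1)
two_hops = expand(toy_graph, ["image:car.jpg"], hops=2)
```

In this toy setup a single hop from `image:car.jpg` reaches the text about the car on fire, and a second hop reaches `image:fire.jpg` through that text, which is the kind of indirect link a multi-hop option could surface.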

Download files

Source Distribution

multimodalgraphretrieval-0.0.1b0.tar.gz (7.6 kB)

Uploaded Source

Built Distribution

MultimodalGraphRetrieval-0.0.1b0-py3-none-any.whl (9.2 kB)

Uploaded Python 3

File details

Details for the file multimodalgraphretrieval-0.0.1b0.tar.gz.

File hashes

Hashes for multimodalgraphretrieval-0.0.1b0.tar.gz
Algorithm Hash digest
SHA256 9f9c0ef0a116fb85a5ff30d207e8951903fb8e848b9255d24063c8978d9c55b9
MD5 19630ef6701e93b8c926f2d0eebe2a40
BLAKE2b-256 3743ba770cc2caa1ce81263fa47abfd48fa1da7c4fbcea37e5d859088c28d3b6
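
A downloaded archive can be checked against the SHA256 digest listed above using only Python's standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_expected` are illustrative, and it assumes the archive was saved in the current directory under its published file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for multimodalgraphretrieval-0.0.1b0.tar.gz
EXPECTED_SHA256 = "9f9c0ef0a116fb85a5ff30d207e8951903fb8e848b9255d24063c8978d9c55b9"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


# Assumed download location; adjust to wherever the file was saved.
archive = Path("multimodalgraphretrieval-0.0.1b0.tar.gz")
if archive.exists():
    print("Digest matches:", matches_expected(archive))
```

The same check works for the wheel by passing its file path and its own published SHA256 digest as `expected`.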

File details

Details for the file MultimodalGraphRetrieval-0.0.1b0-py3-none-any.whl.

File hashes

Hashes for MultimodalGraphRetrieval-0.0.1b0-py3-none-any.whl
Algorithm Hash digest
SHA256 cd512f59863e6c2329d46976414fc29edf1469cccd0ec426975fe7b876e02607
MD5 004c16b490d35513580b9c9a49fc2a5a
BLAKE2b-256 97f18bc0540c5f67e03c6e99da8a02d4993bfab6264cd8249d88844fb995dd2a