A plug-and-play multimodal Retrieval-Augmented Generation (RAG) framework
📦 UniversalRAG
UniversalRAG is a plug-and-play, multimodal Retrieval-Augmented Generation (RAG) framework that lets you query **PDFs, videos, images, audio, documents**, and more using LLMs from OpenAI, Groq, and Hugging Face.
Why UniversalRAG?
UniversalRAG is designed to simplify the entire Retrieval-Augmented Generation (RAG) pipeline for developers.
- ❌ No need to write separate code for different document types
- ❌ No manual chunking, vectorizing, or retrieval code
- ❌ No need to understand the low-level details of embeddings or vector DBs
Just:
- pass in a PDF, YouTube video, audio file, or website URL
- ask your question
- get an answer backed by retrieval from that input
UniversalRAG saves developers hours of boilerplate work and enables them to focus on building real-world GenAI apps faster.
It’s your one-stop solution for multimodal RAG.
🚀 Features
- ✅ Supports multiple input types: PDF, DOCX, image, audio, video, and website URL
- ✅ Embedding-based retrieval system
- ✅ Built-in support for 3 model backends:
  - groq (LLaMA 3.1 8B Instant)
  - huggingface (HuggingFaceH4/zephyr-7b-beta)
  - openai (GPT-3.5 Turbo)
- ✅ Clean interface: Just import and ask!
- ✅ Easily extendable
- ✅ Supports .env-based API key management
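To make "embedding-based retrieval" concrete, here is a toy sketch of the idea: turn the question and each text chunk into vectors, then return the chunks most similar to the question. This is not UniversalRAG's implementation. The `embed`, `cosine`, and `retrieve` names are hypothetical, and word counts stand in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "ffmpeg is required for audio and video inputs.",
    "API keys are read from a .env file.",
    "The license is MIT.",
]
top = retrieve("Where are API keys stored?", chunks, k=1)
print(top)
```

In a full RAG pipeline the retrieved chunks would then be passed to the LLM as context for the answer.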
📦 Installation
pip install universalrag
🔐 Setup API Keys
Create a .env file in your project root and add the keys as per the model(s) you’re using:
OPENAI_API_KEY=your_openai_api_key
GROQ_API_KEY=your_groq_api_key
HUGGINGFACEHUB_API_TOKEN=your_huggingface_token
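Before running a pipeline, it can help to confirm that the key for your chosen model is actually present. The sketch below uses the variable names from the example above. The `parse_env` and `missing_key` helpers are hypothetical, not part of UniversalRAG, and the parser handles only plain `KEY=VALUE` lines rather than full dotenv syntax.

```python
# Variable names come from the .env example; the helpers are illustrative only.
REQUIRED_ENV = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "huggingface": "HUGGINGFACEHUB_API_TOKEN",
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_key(model_name, env):
    """Return the name of the missing variable, or None if it is set."""
    var = REQUIRED_ENV[model_name]
    return None if env.get(var) else var


sample = "GROQ_API_KEY=your_groq_api_key\n# OPENAI_API_KEY not set\n"
env = parse_env(sample)
print(missing_key("groq", env))
print(missing_key("openai", env))
```

With a real file, you could pass `pathlib.Path(".env").read_text()` to `parse_env` instead of the sample string.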
⸻
🧪 Example Usage
⚠️ NOTE: Place the input file (PDF, video, audio, document, or image) in the directory you run the script from, or use its full path.
⚠️ NOTE: Set input_path to the path of that file, or to the URL you want to query.
⚠️ NOTE: You choose which model to use.
⚠️ NOTE: Valid values for model_name are groq, openai, and huggingface.
from universalrag.pipeline import RAGPipeline

# 🗂️ Input file (choose one)
input_path = r"anything.pdf"  # 📄 PDF
# input_path = "lecture.mp4" # 🎥 Video
# input_path = "https://example.com" # 🌐 URL
# input_path = "meeting.wav" # 🎧 Audio
# input_path = "notes.docx" # 📃 Word Doc
# input_path = "image.jpg" # 🖼️ Image
# 🤖 Initialize pipeline with desired model
rag = RAGPipeline(input_path, model_name="groq")
# ❓ Ask a question
question = "Summarize the content."
answer = rag.ask(question)
print("\n🤖 Answer:")
print(answer)
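If you want to ask several questions about the same input, a small helper like the one below may be convenient. It relies only on the `ask(question)` method shown above. The `ask_many` name is hypothetical, and reusing one pipeline for multiple questions is an assumption, not something the package documents. `EchoPipeline` is a stand-in so the sketch runs without API keys; in practice you would pass your `RAGPipeline` instance.

```python
def ask_many(rag, questions):
    """Ask each question in turn and map question -> answer, in order."""
    return {q: rag.ask(q) for q in questions}


class EchoPipeline:
    """Stand-in with the same ask() shape, used only for this demo."""

    def ask(self, question):
        return f"(answer to: {question})"


answers = ask_many(EchoPipeline(), ["Summarize the content.", "List the key terms."])
for q, a in answers.items():
    print(q, "->", a)
```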
⚠️ Notes
• You must install model dependencies (e.g., openai, groq, transformers, langchain) as per your use case.
• You need API keys for Groq, OpenAI, or Hugging Face models.
• Some formats (e.g. audio/video) require ffmpeg to be installed.
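Because audio and video inputs need ffmpeg, a quick pre-flight check can catch a missing install before the pipeline runs. The sketch below uses `shutil.which` from the standard library. The extension list and the `needs_ffmpeg` and `preflight` helpers are assumptions for illustration; which formats UniversalRAG actually routes through ffmpeg is not stated here.

```python
import shutil
from pathlib import Path

# Assumed media extensions for illustration only.
MEDIA_EXTENSIONS = {".mp4", ".wav", ".mp3"}


def needs_ffmpeg(input_path):
    """True if the path looks like an audio or video file."""
    return Path(input_path).suffix.lower() in MEDIA_EXTENSIONS


def preflight(input_path):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if needs_ffmpeg(input_path) and shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


for path in ["anything.pdf", "lecture.mp4", "meeting.wav"]:
    print(path, preflight(path))
```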
📃 License
MIT License © 2025 Vigyat Singh
⸻
❤️ Contribute
Feel free to open issues, suggest improvements, or create pull requests!
⸻
🌐 Contact
For queries or collaborations, reach out to:
• GitHub: @vigyat13
• Email: vigyatsingh@2004.com