npmai is a lightweight Python package designed to bridge the gap between users and open-source LLMs. Connect with Ollama and 45+ other powerful models instantly, with no installation, no login, and no API keys required. It also helps you develop RAG agents without installing anything locally or in the cloud, and it is free, with no sign-in, sign-up, or usage limits.
Project description
🚀 npmai
By Sonu Kumar (Viral Boy)
✨ Features
- 🔗 Zero Setup: No local Ollama installation or complex API signups needed.
- 🤖 Multi-Model Support: Execute prompts across 10+ open-source models simultaneously.
- 🧠 Built-in Memory: (New in v0.1.3) Native memory support, with no need for external agentic frameworks.
- 🕵️‍♂️🔍📑 RAG Framework: No need to install Whisper or any other model locally, and no need to write code to convert PDFs, images, videos, or YouTube videos to text. Just use npmai.
- 🔍📑 Vectorised Database: Store files of any type in vectorised form through npmai, free of charge and with no time limit.
- ⚡ Framework Ready: Fully compatible with LangChain, CrewAI, and other orchestration tools.
- 🛠️ Universal API: Access via Python, JavaScript, C++, Java, or C.
- 🔍 Tavily: Tavily is integrated, and MCP servers and advanced tools are being integrated to make LLMs more powerful.
- 📑 LARA: We developed LARA, a research paper that addresses major problems of the RAG industry.
- 🤖 Scalability: npmai can handle 80,640 requests per 24 hours on free compute. npmai pays no one, and users and developers pay nothing to npmai.
- 🛠️ Future Features: In v0.2.0 we plan to integrate LARA and offline Ollama, together with a pool of MCP servers and tools intended to lift a 1B-parameter model well beyond what its size suggests. Please be patient; we are working on it and will launch it soon.
Achievement for npmai:-
npmai has reached 1.2 million+ installations, which reflects the reliability of npmai and the trust users place in it.
🖥️ Supported Models
| Model Name | Description |
|---|---|
| llama3.2 | Meta's latest powerful small model |
| gemma-2-instruct-9b | Google's high-performance open model |
| qwen-2.5-coder-7b | Alibaba's elite coding assistant |
| mistral-7b-instruct | Versatile and efficient instructor model |
| phi-3-medium | Microsoft's highly capable reasoning model |
| Falcon | From TII, UAE |
| Baichuan-2 | From Baichuan, China |
| InternLM | From Shanghai AI Laboratory |
| Vicuna | From LMSYS Org |
| gemma3:12b | A recent AI model by Google, listed with a knowledge cutoff of 2026; also the latest AI model in the NPMAI ECOSYSTEM |
| gemma2:9b | Model by Google DeepMind, also in the NPMAI ECOSYSTEM |
| qwen3.5:9b | One of the latest and most powerful models, listed with a knowledge cutoff of 2026; now in the NPMAI ECOSYSTEM |
Workflow:-
[Figures: npmai workflow diagram and Rag workflow diagram]
npmai Ecosystem:-
These are the core components of the NPMAI ECOSYSTEM. The ecosystem includes the following products:-
1. NPM-Rag-A.I:- NPM Rag A.I is a beautiful, easy-to-use web application that lets you instantly create and talk to your own private or public knowledge bases using RAG (Retrieval-Augmented Generation).
2. NPM-Journalist:- Raise your voice to the government without being traced: safe and secure journalism.
3. NPM-AutoCode-A.I:- A fully autonomous agent in which the AI writes code to automate your PC, then executes and debugs it. A safety check runs before any code is executed.
4. NPM-Youtube-Automation:- Just give us a video and a thumbnail and go to sleep; your video will be uploaded to YouTube automatically with full metadata, including captions. (Future update:- your video or any other post will be uploaded to all social media platforms, from Facebook to X to Instagram.)
5. NPM-Debater-A.I:- Watch a debate between 4 AI models: just enter a topic and enjoy an endless debate arena.
6. NPM-Legal-A.I:- A free legal chatbot with specialised models and processing that supports all user documents. (It currently supports only Indian law.)
7. NPM-Business-Analysis-A.I:- A business analysis AI: explain your business and it will guide you in making plans. (Future updates:- NPM-Youtube-Automation and npmai-RAG will be integrated.)
8. NPM-Data-A.I:- Analyses your bank account transaction history and gives advice on your financial situation and future.
Note:- All projects are free, deployed, and production-ready.
⚙️ Installation
Install via pip in seconds:
pip install npmai
Tip for Python 3.13+: Use py -3.13 -m pip install npmai
💡 Quick Start (Python)
```python
from npmai import Ollama

# Initialize the LLM
llm = Ollama()

# Simple invocation
response = llm.invoke("What is the future of AI?", model="llama3.2")
print(response)
```
🌐 API Usage (Other Languages)
If you aren't using Python, hit our global endpoint:
POST https://npmai-api.onrender.com
🟡 JavaScript
```javascript
const response = await fetch("https://npmai-api.onrender.com", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    prompt: "Hello! Who are you?",
    model: "llama3.2",
    temperature: 0.4
  })
});
const data = await response.json();
console.log(data.response);
```
🔵 C++
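```cpp
// Assumes cpp-httplib (built with CPPHTTPLIB_OPENSSL_SUPPORT for HTTPS) and nlohmann/json.
#include <httplib.h>
#include <nlohmann/json.hpp>

int main() {
    // Assumption: same host as the JavaScript example.
    httplib::Client cli("https://npmai-api.onrender.com");
    nlohmann::json payload = {
        {"prompt", "Explain quantum physics."},
        {"model", "llama3.2"},
        {"temperature", 0.4}
    };
    auto res = cli.Post("/llm", payload.dump(), "application/json");
}
```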
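For comparison, the same request can be sketched with only the Python standard library. This is a minimal sketch, not the SDK's implementation: it assumes the endpoint accepts the JSON body shown in the JavaScript example and replies with a JSON object that has a `response` field. `build_payload` and `post_prompt` are hypothetical helper names.

```python
import json
import urllib.request

# Endpoint as given in this section; adjust if your deployment differs.
API_URL = "https://npmai-api.onrender.com"


def build_payload(prompt: str, model: str = "llama3.2", temperature: float = 0.4) -> bytes:
    """Encode the request body from the JavaScript example as UTF-8 JSON."""
    body = {"prompt": prompt, "model": model, "temperature": temperature}
    return json.dumps(body).encode("utf-8")


def post_prompt(prompt: str, url: str = API_URL, timeout: float = 60.0) -> str:
    """POST the prompt and return the `response` field (assumed reply shape)."""
    request = urllib.request.Request(
        url,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        data = json.loads(reply.read().decode("utf-8"))
    return data["response"]
```

Calling `post_prompt("Hello! Who are you?")` would then return the model's reply as a string, provided the service is reachable and the reply has the assumed shape.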
🆕 Latest Update: Version 0.1.9
In this update:-
- npmai introduced a new distributed architecture that can handle 80K+ requests per 24 hours for free. npmai pays no one, and users and developers pay nothing to npmai.
- npmai added the change, Models, and fallback_api parameters to the Ollama class.
- npmai removed the inheritance from LangChain's LLM class.
- npmai changed the api parameter link in the Ollama class.
- npmai collaborated with developers who wanted to support us, and together we launched 56 spaces that serve 80K+ requests every 24 hours at no cost.
version 0.1.8 --->> Integrated Supabase for long-term storage of vectorised documents, added a "vector_db_use" method to the Rag class of the npmai SDK, updated the Rag class parameters for compatibility with the new Supabase integration, and updated the Rag class docstrings.
version 0.1.7 --->> Updated the Rag class parameters so that multiple files of any type can be sent at once, added a clear_memory method to the Memory class to remove memory files, and added docstrings to the Ollama, Memory, and Rag classes.
version 0.1.6 --->> Added try/except handling around API calls and added Hugging Face API endpoints as a fallback.
version 0.1.5 --->> Fixed some bugs and added link as a parameter to the Rag class.
version 0.1.4 --->> You no longer need to write code for RAG tools that convert PDFs, images, video, audio, or YouTube videos to text, and you do not need to load Whisper or other requirements locally. Everything is processed in the cloud for free, with no sign-up, sign-in, or API key hurdles.
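The try/except fallback described for v0.1.6 can be illustrated with a short sketch. It is not the SDK's actual code: `call_with_fallback` and the injected `send` callable are hypothetical names, and the two URLs are the endpoints listed under Resources in this README.

```python
from typing import Callable, Optional, Sequence

# Endpoints as listed under Resources in this README.
PRIMARY = "https://npmaiecosystem-loadbalancer.hf.space/load_balancer"
FALLBACK = "https://npmaiecosystem-loadbalancerfallback.hf.space/load_balancer"


def call_with_fallback(
    send: Callable[[str, dict], str],
    payload: dict,
    endpoints: Sequence[str] = (PRIMARY, FALLBACK),
) -> str:
    """Try each endpoint in order and return the first successful reply."""
    last_error: Optional[Exception] = None
    for url in endpoints:
        try:
            return send(url, payload)
        except Exception as error:  # broad on purpose: any failure moves to the next endpoint
            last_error = error
    raise RuntimeError("all endpoints failed") from last_error
```

Passing the transport in as `send` keeps the retry logic independent of any HTTP library, so it can be exercised with a stub function instead of a live service.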
Important Update in NPMAI RAG:-
🚀 NPMAI Update: Advanced RAG & Refine Architecture
We have officially upgraded the NPMAI Ecosystem to a more intelligent, cost-efficient, and "Product-Ready" pipeline. These updates move beyond basic RAG into High-Performance Agentic Retrieval.
🔍 1. Dynamic K-Context Retrieval (70% Coverage)
The Problem:
Standard RAG systems use a fixed k value (e.g., k=4). This is inefficient—it provides too little context for large documents (missing facts) and too much "noise" for tiny documents (wasting tokens).
The Solution: I have engineered a Proportional Scaling Logic that calculates the optimal number of chunks to retrieve based on the actual density of your vectorized database.
- Logic: `dynamic_k = max(1, int(total_chunks * 0.70))`
- How it works:
  - Short Documents: With very few chunks, k stays small. For example, a database with only 2 chunks gives int(2 * 0.70) = 1, so the single most relevant chunk is retrieved.
  - Large PDFs: If your PDF generates 100 chunks, the system automatically scales up to retrieve the 70 most relevant chunks (k = 70).
- The Impact: The AI always sees a large, proportional slice (about 70%) of the knowledge base, so retrieval adapts to the document size.
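The scaling rule can be tried in isolation. The sketch below wraps the formula from this section in a function. The name `dynamic_k` as a function and the `coverage` parameter are illustrative assumptions, not part of the npmai API.

```python
def dynamic_k(total_chunks: int, coverage: float = 0.70) -> int:
    """Number of chunks to retrieve: a fixed fraction of the index, never below 1."""
    return max(1, int(total_chunks * coverage))


# Print k for a few index sizes.
for n in (1, 2, 10, 100):
    print(n, dynamic_k(n))
```

Because `int()` truncates, small indexes round down: 2 chunks give k = 1, while 100 chunks give k = 70.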
🔄 2. Sliding Window Batch-Refinement (3-Chunk Window)
The Problem: Traditional "Refine" strategies process one chunk at a time. This is slow because it makes N separate API calls, one per chunk. For a 30-chunk document, that means 30 sequential calls, and the user waits too long.
The Solution: I have implemented a Sliding Window Batch-Refine system that processes chunks in groups of 3 instead of 1.
- Logic: `for i in range(0, total_chunks, 3):`
- How it works:
  - Instead of making a single LLM call for every 1,000 characters, the system sends a batch of 3 adjacent chunks (3,000 characters) in one go.
  - It uses the previous answer as a "Running Memory" to merge new information from the current 3-chunk batch.
- The Impact:
  - About 3x Fewer API Calls: Batching cuts the number of LLM calls by roughly 66% (for example, 10 calls instead of 30 for a 30-chunk document), which reduces total API latency by a similar amount when per-call overhead dominates.
  - Improved Coherence: The AI sees a broader context (3,000 chars vs 1,000 chars), allowing it to spot connections between facts that are split across neighboring chunks.
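A batch-refine loop of this kind can be sketched as follows. This is an illustration under assumptions, not the npmai implementation: `batch_refine` is a hypothetical name, `llm` is any callable that maps a prompt string to an answer string, and the prompt wording is invented for the example.

```python
from typing import Callable, List


def batch_refine(
    chunks: List[str],
    question: str,
    llm: Callable[[str], str],
    batch_size: int = 3,
) -> str:
    """Refine an answer over `chunks`, sending `batch_size` chunks per LLM call."""
    answer = ""  # running memory carried from one batch to the next
    for i in range(0, len(chunks), batch_size):
        context = "\n\n".join(chunks[i : i + batch_size])
        prompt = (
            f"Question: {question}\n"
            f"Current answer: {answer or '(none yet)'}\n"
            f"New context:\n{context}\n"
            "Update the answer using the new context."
        )
        answer = llm(prompt)
    return answer
```

With `batch_size=3`, a 30-chunk document needs 10 calls instead of 30. Note that the batches in this sketch do not overlap, so each chunk is sent exactly once.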
☁️ 3. Infrastructure: Persistent Supabase Integration (v0.1.8)
We have successfully integrated Supabase Object Storage to move from temporary memory to Persistent Knowledge Bases.
- Vector Persistence: All `.faiss` and `.pkl` index files are now automatically uploaded to a secure Supabase bucket.
- Multi-Platform Access: This allows NPM-Rag-AI, NPM-AutoCode-AI, and the npmai SDK to share and load the same vectorized data from anywhere in the world.
Summary: These architectural changes make NPMAI one of the most efficient open-source RAG frameworks available for developers who need Speed + Accuracy without the high cost of standard 1-by-1 refinement.
⚠️ Important Notes:-
Please star our project on GitHub.
🔗 Resources
Documentation: npmai.netlify.com
API Endpoint: https://npmaiecosystem-loadbalancer.hf.space/load_balancer
Fallback API Endpoint: https://npmaiecosystem-loadbalancerfallback.hf.space/load_balancer
Developed with ❤️ to make AI accessible to everyone.
Developer and Maintainer:- Sonu Kumar
Thank-You Statement:-
Thank you for your installations and support. The npmai PyPI package has reached 1.2 million+ installations and is now being installed about 15k times per day.
File details
Details for the file npmai-0.1.9.tar.gz.
File metadata
- Download URL: npmai-0.1.9.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 876156735c476666d93fb10da177aaad68bc1272a0ad6be7c52fd9b726fdb69b |
| MD5 | d80317af5e4572abe224644782942fc7 |
| BLAKE2b-256 | 8ca8e963202a6d08223af651190cbc76966a494b83989df3a874c5bf01039c49 |
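A downloaded archive can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch: `sha256_of` and `matches_expected` are hypothetical helper names, and the path you pass in is assumed to point at your local copy of `npmai-0.1.9.tar.gz`.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for npmai-0.1.9.tar.gz.
EXPECTED_SHA256 = "876156735c476666d93fb10da177aaad68bc1272a0ad6be7c52fd9b726fdb69b"


def sha256_of(path: Path, block_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in blocks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's digest equals the expected digest (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected(Path("npmai-0.1.9.tar.gz"))` returns `True` only if the local file is byte-for-byte identical to the published source distribution.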
File details
Details for the file npmai-0.1.9-py3-none-any.whl.
File metadata
- Download URL: npmai-0.1.9-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f256f2ecc60662de49740996d5553e5e0b30473f5e638e43a6546be1fe0e053e |
| MD5 | ba4c7f979d459da429e48159eab94fdc |
| BLAKE2b-256 | 5df09097a15d5c09645ffd473face7fcd964d7d242d4e1bd919ddd4d3f799dc1 |