
TgVectorDB

Free, unlimited vector database backed by Telegram.

Store embeddings as Telegram messages. Search them semantically. No API keys. No servers. No monthly bills. Just your Telegram account.

Turbopuffer for broke developers.

how it works

Your vectors are stored as messages in a private Telegram channel you own. A tiny local index (~1MB) routes queries to the right cluster. Search fetches only the relevant messages — not the whole database.

cold query:  ~0.5-1.5 seconds  (fetching from telegram)
warm query:  <5 milliseconds   (from local cache)
cost:        $0/month forever
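the routing step can be sketched in a few lines (a toy illustration with made-up names; the actual index layout and message fetching live inside the library):

```python
import numpy as np

def route_and_search(query_vec, centroids, clusters, top_k=3):
    """Toy sketch: pick the nearest cluster, then score only its vectors."""
    # 1. the tiny local index: find the closest cluster centroid
    cid = int(np.argmin(np.linalg.norm(centroids - query_vec, axis=1)))
    # 2. fetch only that cluster's vectors (in TgVectorDB, Telegram messages)
    candidates = clusters[cid]
    # 3. score the candidates by cosine similarity, best first
    scores = [
        (float(np.dot(query_vec, v) / (np.linalg.norm(query_vec) * np.linalg.norm(v))), i)
        for i, v in enumerate(candidates)
    ]
    return sorted(scores, reverse=True)[:top_k]
```

the point of the centroid step is that a cold query only pulls one cluster's worth of messages from telegram, not the whole database; a warm query skips the fetch entirely because the cluster is already cached locally.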

install

pip install tgvectordb

# for pdf support:
pip install tgvectordb[pdf]

get telegram credentials

  1. go to https://my.telegram.org
  2. log in with your phone number
  3. click "API development tools"
  4. create an app, grab the api_id and api_hash

quick start

from tgvectordb import TgVectorDB

db = TgVectorDB(
    api_id=12345,
    api_hash="your_api_hash_here",
    phone="+91xxxxxxxxxx",
    db_name="my-notes",
)

# add some text
db.add("photosynthesis converts sunlight into chemical energy in plants")
db.add("neural networks learn patterns from training data")
db.add("sourdough bread requires a long fermentation process")

# search
results = db.search("how do plants make food?", top_k=3)
for r in results:
    print(f"[{r['score']:.2f}] {r['text'][:80]}...")

# add a whole document
db.add_source("research_paper.pdf")

# check stats
print(db.stats())

use with a RAG chatbot

from tgvectordb import TgVectorDB

db = TgVectorDB(api_id=..., api_hash=..., phone=..., db_name="rag-bot")

# index your docs (one time)
db.add_source("handbook.pdf")
db.add_source("faq.md")

# on each question
def answer(question):
    context = db.search(question, top_k=5)
    context_text = "\n".join([r["text"] for r in context])
    # pass to your LLM of choice (ollama, llama.cpp, openai, whatever)
    return ask_llm(f"Context:\n{context_text}\n\nQuestion: {question}")
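ask_llm is whatever you wire up; the prompt assembly itself is plain string formatting. a hypothetical build_prompt helper with a crude character budget, just to illustrate one way to keep the context from blowing past your model's window:

```python
def build_prompt(question, chunks, max_context_chars=4000):
    """Assemble a RAG prompt from retrieved chunks, trimming to a budget."""
    context_parts, used = [], 0
    for chunk in chunks:
        text = chunk["text"]
        # stop adding chunks once the character budget is spent
        if used + len(text) > max_context_chars:
            break
        context_parts.append(text)
        used += len(text)
    context = "\n".join(context_parts)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

since search() returns the most relevant chunks first, truncating from the tail drops the weakest matches.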

features

  • free forever — telegram provides unlimited cloud storage at no cost
  • zero infrastructure — no docker, no servers, no databases to manage
  • durable — your data lives on telegram's multi-datacenter infrastructure
  • portable — db.restore() on any new machine and you're back in business
  • fast enough — 0.5-1.5s cold queries, <5ms warm queries with caching
  • private — data stays in your own private telegram channel

important stuff

  • uses intfloat/e5-small-v2 for embeddings (384 dims, runs on CPU)
  • vectors are int8 quantized to fit in telegram's 4096 char message limit
  • uses Telethon (MTProto) for fast message fetching, not the bot API
  • recommended: use a secondary telegram account, not your main one
  • this is for personal projects and prototyping, not production SaaS
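the int8 trick above is easy to sketch: scale each vector into [-127, 127], store the bytes, and base64 keeps a 384-dim vector well under the 4096-char limit. a toy version (the library's real wire format may differ):

```python
import base64
import numpy as np

def quantize_int8(vec):
    """Symmetric int8 quantization with one scale factor per vector."""
    scale = float(np.max(np.abs(vec))) or 1.0
    q = np.round(vec / scale * 127).astype(np.int8)
    return q, scale

def to_message(vec):
    """Pack a quantized vector as base64 text, scale prefix included."""
    q, scale = quantize_int8(vec)
    return f"{scale:.6e}|" + base64.b64encode(q.tobytes()).decode()

vec = np.random.randn(384).astype(np.float32)
msg = to_message(vec)
# 384 int8 bytes encode to 512 base64 chars, plus a short scale prefix
```

reconstruction just reverses the steps: decode the base64, reinterpret as int8, multiply by scale / 127. the rounding error per element is at most half a quantization step.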

commands

# --- Adding Data ---
db.add("text")                          
# Adds a single string of text to the database.

db.add_batch(texts, metadatas)          
# Adds multiple texts at once with optional metadata. Much faster than calling add() in a loop since it batches the embeddings and Telegram messages.

db.add_source("file.pdf")              
# Automatically parses, chunks, and adds a supported file format (.pdf, .docx, .txt, .md, .html, .csv, .json, .py, etc).

db.add_directory("./my_docs/", extensions=[".pdf", ".docx"])  
# Ingests all supported files within a directory concurrently.
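the concurrency in add_directory can be sketched with the standard library (add_source here is just a callback stand-in for the real method):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def ingest_directory(root, extensions, add_source, max_workers=4):
    """Collect files matching the given suffixes and ingest them concurrently."""
    files = [
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    ]
    # threads are fine here: the work is I/O-bound (parsing + uploading)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(add_source, files))
    return files
```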

# --- Search & Retrieval ---
db.search("query", top_k=5)            
# Performs a semantic similarity search against the stored vectors and returns the top_k most relevant chunks.

db.search("query", filter={"src": "document.pdf"}) 
# Filters the semantic search to ONLY check against vectors matching that exact metadata tag.

# --- Database Management ---
db.reindex()                           
# Forces a full re-clustering of your entire vector database. Normally this is handled automatically as your dataset grows.

db.backup()                            
# Zips up the small local SQLite index map and uploads it to a private Telegram channel for disaster recovery.

db.restore()                           
# Downloads the latest index backup from Telegram, allowing you to instantly restore your database on any machine.

db.stats()                             
# Returns a dictionary of database statistics, including vector counts, cache hit rates, and clustering metrics.

db.delete(filter={"src": "old.pdf"})   
# Permanently removes all vectors and metadata matching a specific filter from both Telegram and the local index.
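the filter argument narrows the candidate set by metadata before any similarity scoring happens. in miniature (a toy sketch, not the library's internals):

```python
import numpy as np

def filtered_search(query_vec, entries, flt, top_k=5):
    """entries: dicts with 'vec', 'text', and metadata keys like 'src'."""
    # keep only entries whose metadata matches every key in the filter
    candidates = [
        e for e in entries
        if all(e.get(k) == v for k, v in flt.items())
    ]

    # score only the survivors by cosine similarity
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = sorted(candidates, key=lambda e: cos(query_vec, e["vec"]), reverse=True)
    return scored[:top_k]
```

filtering first means a vector from the wrong source can never outrank a weaker match from the right one.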

supported file formats

works out of the box: .txt, .md, .html, .csv, .tsv, .json, .jsonl, .xml, .yaml, .py, .js, .java, .go, .rs, and most text-based files.

with optional dependencies:

  • .pdf — pip install tgvectordb[pdf] (uses pdfplumber)
  • .docx — pip install tgvectordb[docx] (uses python-docx)
  • or just: pip install tgvectordb[all]

license

MIT — do whatever you want with it.

disclaimer

this project uses telegram's cloud infrastructure as a storage backend. while projects like Pentaract have done this since 2023 without issues, it's not officially sanctioned by telegram. use a secondary account and don't abuse rate limits. see the full disclaimer in the docs.
