
A learning experience in vector databases

Project description

tinyvec | A toy implementation with aspirations of being useful


An implementation of a vector database whose purpose is not to serve you as fast as possible, but rather to let you experiment without worrying about whether you have enough memory.

Features

On Disk Vector Storage

Will this beat some fully hosted, blazingly fast tensor library? Not likely. But when looking to do RAG on a set of documents I found that either I needed to set up some behemoth database, or I needed to have >60GB of memory (and occasionally both). I decided to see whether it was possible to create something better aligned with the average user, who has a lot more disk space than available memory.

The amount of memory used depends entirely on the size of the chunks that you choose to load when fetching the k most similar vectors to your query vector. If you choose a large chunk size you will be less IO bound, but will use more memory. A small chunk size will limit your memory consumption, but will quickly increase the number of reads you need to make from your much slower disk.

For the 128-dim embeddings I was using to test, a chunk size of 1000 was about 40MB in memory.
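
The chunk size trade-off can be illustrated with a small standard-library sketch. This is not tinyvec's actual code: the function names, the file layout (raw float32 rows) and the use of cosine similarity are all assumptions made for illustration. It only shows the general idea of scanning vectors from disk one chunk at a time while holding just the current chunk and the k best matches in memory.

```python
import heapq
import math
import os
import tempfile
from array import array


def write_vectors(path, vectors):
    """Hypothetical helper: store vectors on disk as raw float32 rows."""
    with open(path, "wb") as f:
        for vec in vectors:
            array("f", vec).tofile(f)


def cosine(a, b):
    """Cosine similarity of two equal-length sequences (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_from_disk(path, query, dim, k, chunk_size):
    """Scan the file chunk by chunk, keeping only the k best matches in memory."""
    row_bytes = dim * array("f").itemsize
    best = []  # min-heap of (similarity, row_index)
    index = 0
    with open(path, "rb") as f:
        while True:
            # One disk read per chunk: larger chunks mean fewer reads, more memory.
            raw = f.read(chunk_size * row_bytes)
            if not raw:
                break
            chunk = array("f")
            chunk.frombytes(raw)
            for start in range(0, len(chunk), dim):
                score = cosine(query, chunk[start:start + dim])
                if len(best) < k:
                    heapq.heappush(best, (score, index))
                else:
                    # Push the new candidate, then drop the current worst one.
                    heapq.heappushpop(best, (score, index))
                index += 1
    return sorted(best, reverse=True)


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
write_vectors(path, vectors)
results = top_k_from_disk(path, [1.0, 0.0], dim=2, k=2, chunk_size=3)
print([i for _, i in results])  # prints [0, 2]
```

With chunk_size=3 the four stored vectors are read in two passes. Setting chunk_size=1 gives the same result with four reads, which is the memory versus IO trade-off described above.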

LangChain compatibility

This project originally came about because I wanted to run a lot of medical papers through TinyLlama on Colab. Colab only gives you 12GB of memory, but about 100GB of disk, which I'm willing to bet is an SSD. So I spent god knows how many dollars in time to create this thing that will save me from buying $50 worth of additional memory for my home machine.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

from tinyvec.langchain_store import LangchainVectorDB

# Each line of dset.txt is treated as one document.
with open("dset.txt", "r") as f:
    articles = f.readlines()

# Local text-generation model; device=0 selects the first GPU.
llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task="text-generation",
    device=0,
    pipeline_kwargs={"max_new_tokens": 512},
)
embeddings = HuggingFaceEmbeddings(model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
# Embed the documents and store the vectors in tinyvec.
vectorstore = LangchainVectorDB.from_texts(
    articles, embedding=embeddings, emb_dim=2048, individually=True
)
retriever = vectorstore.as_retriever()

template = """
Answer the question in full sentences, giving full reasoning without
repetition based only on the following context:

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("How long should I stretch if I want to grow my muscle size?"))
"""
> Answer: The recommended stretching routine for growing muscle size is 3 sets of 10-12 repetitions with 30 seconds rest in between. This routine can be done 3-4 times a week.
"""

print(chain.invoke("How should I optimize stretching for range of motion?"))
"""
Answer: The optimal stretching technique for range of motion depends on the individual's goals and the specific muscles being targeted. Here are some general guidelines:

1. Start with a warm-up: Before stretching, it's important to warm up your muscles with dynamic stretches. This will help prevent injury and improve flexibility.

2. Choose the right stretch: There are many different types of stretches, each with its own benefits and drawbacks. Choose a stretch that feels comfortable and safe for you.

3. Focus on the targeted muscles: Stretching the muscles that are causing pain or limiting movement can help improve range of motion.

4. Avoid overstretching: Overstretching can cause injury and reduce the benefits of stretching.

5. Gradually increase intensity: As you become more comfortable with a stretch, gradually increase the intensity.

6. Use props: If you're struggling to reach a certain stretch, consider using props like a foam roller, resistance band, or yoga block.

7. Monitor progress: Keep track of your progress and adjust your stretching routine as needed.

Remember, stretching is a tool to help you improve your range of motion, not a substitute for proper warm-up and injury prevention.
"""


Source Distribution

tinyvec-2024.2.10.tar.gz (13.7 kB, source)

Built Distribution

tinyvec-2024.2.10-py3-none-any.whl (11.2 kB, Python 3)
