LoLLMsVectorDB
LoLLMsVectorDB: A modular text-based database manager for retrieval-augmented generation (RAG), seamlessly integrating with the LoLLMs ecosystem. Supports various vectorization methods and directory bindings for efficient text data management.
Features
- Flexible Vectorization: Supports multiple vectorization methods including TF-IDF and Word2Vec.
- Directory Binding: Automatically updates the vector store with text data from a specified directory.
- Efficient Search: Provides fast and accurate search results with metadata to locate the original text chunks.
- Modular Design: Easily extendable to support new vectorization methods and functionalities.
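To make the TF-IDF and metadata features above more concrete, here is a minimal, library-independent sketch of TF-IDF search over text chunks that carry metadata. It is not LoLLMsVectorDB's implementation: the function names (`build_index`, `search`), the smoothed IDF formula, and the metadata fields (`source`, `chunk_id`) are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf(tokens, idf):
    """Weight raw term counts by IDF; terms unknown to the index are dropped."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_index(chunks):
    """Build a sparse TF-IDF index from dicts holding 'text' plus metadata."""
    tokenized = [tokenize(chunk["text"]) for chunk in chunks]
    n_docs = len(chunks)
    # Document frequency: in how many chunks each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an illustrative choice).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [tfidf(tokens, idf) for tokens in tokenized]
    return {"chunks": chunks, "idf": idf, "vectors": vectors}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, query, top_k=3):
    """Return the top_k (score, chunk) pairs, best match first."""
    query_vec = tfidf(tokenize(query), index["idf"])
    scored = [
        (cosine(query_vec, vec), chunk)
        for vec, chunk in zip(index["vectors"], index["chunks"])
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


chunks = [
    {"text": "Cats purr when they are happy.", "source": "cats.txt", "chunk_id": 0},
    {"text": "Dogs bark at strangers.", "source": "dogs.txt", "chunk_id": 0},
]
index = build_index(chunks)
results = search(index, "why do cats purr", top_k=1)
for score, chunk in results:
    print(round(score, 3), chunk["source"], chunk["chunk_id"])
```

Each result carries the chunk's metadata, so a caller can trace a match back to its source file and position.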
Installation
pip install lollmsvectordb
Usage
Example with TFIDFVectorizer
from lollmsvectordb import TFIDFVectorizer, VectorDatabase, DirectoryBinding
# Initialize the vectorizer
tfidf_vectorizer = TFIDFVectorizer()
tfidf_vectorizer.fit(["This is a sample text.", "Another sample text."])
# Create the vector database
db = VectorDatabase("vector_db.sqlite", tfidf_vectorizer)
# Bind a directory to the vector database
directory_binding = DirectoryBinding("path_to_your_directory", db)
# Update the vector store with text data from the directory
directory_binding.update_vector_store()
# Search for a query in the vector database
results = directory_binding.search("This is a sample text.")
print(results)
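For intuition about what a directory binding has to do, the sketch below walks a directory, reads text files, and splits them into chunks that remember where they came from. It is a conceptual illustration using only the standard library, not the code behind DirectoryBinding: the function names, the word-based chunking, the default chunk size, and the file suffix filter are all assumptions.

```python
from pathlib import Path


def chunk_words(text, chunk_size=50):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def collect_chunks(directory, chunk_size=50, suffixes=(".txt", ".md")):
    """Gather chunk records from every matching file under directory."""
    records = []
    # Sorting keeps the order of records stable between runs.
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for chunk_id, chunk in enumerate(chunk_words(text, chunk_size)):
                records.append(
                    {"text": chunk, "source": str(path), "chunk_id": chunk_id}
                )
    return records
```

Calling `collect_chunks("path_to_your_directory")` returns a list of dictionaries, each holding a chunk of text together with its source path and position. A vector store could then index these records.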
Adding New Vectorization Methods
To add a new vectorization method, create a subclass of the Vectorizer class and implement the vectorize method.
from lollmsvectordb import Vectorizer

class CustomVectorizer(Vectorizer):
    def vectorize(self, data):
        # Implement your custom vectorization logic here
        pass
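As one possible way to fill in the vectorize method, the sketch below implements a simple hashed bag-of-words vectorizer. To stay runnable without the package installed, it defines a stand-in `Vectorizer` base class; the real base class in lollmsvectordb may require other methods or a different return type. The assumptions here are that `data` is a list of strings and that the method returns one list of floats per input text. The `HashingVectorizer` name and its `dimensions` parameter are hypothetical.

```python
import hashlib
import re
from abc import ABC, abstractmethod


class Vectorizer(ABC):
    """Stand-in for lollmsvectordb.Vectorizer (assumed interface)."""

    @abstractmethod
    def vectorize(self, data):
        """Turn a list of texts into a list of vectors."""


class HashingVectorizer(Vectorizer):
    """Hashed bag-of-words: each token increments one of `dimensions` buckets."""

    def __init__(self, dimensions=64):
        self.dimensions = dimensions

    def _bucket(self, token):
        # A stable hash keeps vectors reproducible across processes,
        # unlike Python's built-in hash(), which is salted per run.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.dimensions

    def vectorize(self, data):
        vectors = []
        for text in data:
            vector = [0.0] * self.dimensions
            for token in re.findall(r"[a-z0-9]+", text.lower()):
                vector[self._bucket(token)] += 1.0
            vectors.append(vector)
        return vectors


vectorizer = HashingVectorizer(dimensions=16)
vectors = vectorizer.vectorize(["hello world", "hello hello"])
print(len(vectors), len(vectors[0]))
```

Hashing needs no fitting step, which keeps the example short. The trade-off is that unrelated tokens can collide in the same bucket when `dimensions` is small.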
Contributing
Contributions are welcome! Please fork the repository and submit a pull request.
License
This project is licensed under the MIT License.
Contact
For any questions or suggestions, feel free to reach out to the author:
- Twitter: @ParisNeo_AI
- Discord: Join our Discord
- Sub-Reddit: r/lollms
- Instagram: spacenerduino
Acknowledgements
Special thanks to the LoLLMs community for their continuous support and contributions.