
A Python module that converts a document into chunks for insertion into a Pinecone vector database.


📚 PreVectorChunks

A lightweight utility for document chunking and vector database upserts — designed for developers building RAG (Retrieval-Augmented Generation) solutions.


✨ Who Needs This Module?

Any developer working with:

  • RAG pipelines
  • Vector Databases (like Pinecone, Weaviate, etc.)
  • AI applications that need to retrieve similar content

🎯 What Does This Module Do?

This module helps you:

  • Chunk documents into smaller fragments
  • Insert (upsert) fragments into a vector database
  • Fetch & update existing chunks from a vector database
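
The three steps above can be illustrated with a minimal, self-contained sketch. It is not the module's implementation: the fixed-size splitting strategy, the in-memory dict standing in for a vector database, and the names chunk_text, upsert_chunks, and the "name#i" ID scheme are all assumptions made for illustration. A real pipeline would also compute embeddings, which this sketch omits.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def upsert_chunks(store, document_name, chunks):
    """Insert chunks into a dict, overwriting any record with the same ID."""
    for i, chunk in enumerate(chunks):
        # Hypothetical ID scheme: document name plus chunk position.
        store[f"{document_name}#{i}"] = {"document_name": document_name, "text": chunk}
    return store


store = upsert_chunks({}, "doc", chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Because each record ID is derived from the document name and chunk position, running the upsert again for the same document overwrites the existing records instead of duplicating them, which is the behaviour the term "upsert" refers to.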

📦 Installation

pip install prevectorchunks-core

How to import in a file:

from PreVectorChunks.services import chunk_documents_crud_vdb

# How to use Pinecone and OpenAI:
# Use a .env file in your project root to configure the API keys:

PINECONE_API_KEY=YOUR_API_KEY
OPENAI_API_KEY=YOUR_API_KEY
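
Whether the module reads the .env file on its own is not stated here, so it can be useful to load the file and check the keys yourself before calling it. The following is a minimal standard-library sketch; load_env_file and missing_keys are hypothetical helper names, not part of this package.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a file into os.environ and return the parsed pairs."""
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Variables already set in the environment take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded


def missing_keys(required=("PINECONE_API_KEY", "OPENAI_API_KEY")):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]
```

Calling load_env_file() and then missing_keys() at startup gives a clear error point if a key is absent. The python-dotenv package's load_dotenv function is a common ready-made alternative to the hand-written loader.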

# How to call the relevant functions:
# The four key functions are listed below.

# Chunks a document.
chunk_documents(instructions, file_path="content_playground/content.json")

# Chunks a document and inserts the chunks into the vector database (VDB); pass the index name as index_n.
chunk_and_upsert_to_vdb(index_n, instructions, file_path="content_playground/content.json")

# Loads existing chunks from the VDB, grouped by document name; pass the index name as index_n.
fetch_vdb_chunks_grouped_by_document_name(index_n)

# Updates existing chunks in the VDB; pass the index name as index_n.
update_vdb_chunks_grouped_by_document_name(index_n, dataset)
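
As a rough sketch of how two of these calls might be combined, the helper below chunks and upserts one document and then reads the stored chunks back. It relies on assumptions: the helper name chunk_upsert_and_fetch is hypothetical, the functions are assumed to be attributes of the imported chunk_documents_crud_vdb module, and their return values are not documented here. The module is passed in as an argument so the flow can be dry-run against a stand-in object without API keys.

```python
from types import SimpleNamespace


def chunk_upsert_and_fetch(crud, index_n, instructions,
                           file_path="content_playground/content.json"):
    """Chunk and upsert one document, then return the index's chunks grouped by document."""
    crud.chunk_and_upsert_to_vdb(index_n, instructions, file_path=file_path)
    return crud.fetch_vdb_chunks_grouped_by_document_name(index_n)


# Dry run against a stand-in object, so no API keys or network access are needed.
# The return value below is a placeholder, not the library's real output shape.
calls = []
fake_crud = SimpleNamespace(
    chunk_and_upsert_to_vdb=lambda index_n, instructions, file_path: calls.append(
        ("upsert", index_n, file_path)),
    fetch_vdb_chunks_grouped_by_document_name=lambda index_n: {"content.json": ["chunk 1"]},
)
result = chunk_upsert_and_fetch(fake_crud, "my-index", "split by topic")
```

With the package installed and the keys configured, the same helper would be called with the real module in place of fake_crud, for example chunk_upsert_and_fetch(chunk_documents_crud_vdb, "my-index", instructions), where "my-index" and the instructions value are placeholders.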
