
Utility package for interacting with vectorstores

Rakam Systems Vectorstore

The vectorstore package of Rakam Systems providing vector database solutions and document processing capabilities.

Overview

rakam-systems-vectorstore provides comprehensive vector storage, embedding models, and document loading capabilities. This package depends on rakam-systems-core.

Features

  • Configuration-First Design: Change your entire vector store setup via YAML — no code changes
  • Multiple Backends: PostgreSQL with pgvector and FAISS in-memory storage
  • Flexible Embeddings: SentenceTransformers, OpenAI, and Cohere
  • Document Loaders: PDF, DOCX, HTML, Markdown, CSV, and more
  • Search Capabilities: Vector search, keyword search (BM25), and hybrid search
  • Chunking: Intelligent text chunking with context preservation
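As a rough illustration of what hybrid search in the list above means, the sketch below merges a vector ranking and a keyword (BM25) ranking with reciprocal rank fusion. This is one common fusion technique and not necessarily how rakam-systems-vectorstore combines results; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Hypothetical helper, not part of the package API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative rankings: one from vector similarity, one from keyword search.
vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why hybrid search tends to help when a query mixes exact terms with looser semantic intent.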

Installation

pip install rakam-systems-vectorstore

# With specific backends (the quotes keep shells such as zsh from expanding the brackets)
pip install "rakam-systems-vectorstore[postgres]"
pip install "rakam-systems-vectorstore[faiss]"
pip install "rakam-systems-vectorstore[all]"

Available extras:

Extra             What it adds
postgres          psycopg2-binary, pgvector, django
faiss             faiss-cpu
local-embeddings  sentence-transformers, torch
openai            openai (for OpenAI embeddings)
cohere            cohere (for Cohere embeddings)
loaders           python-magic, beautifulsoup4, python-docx, pymupdf, docling, chonkie
all               Everything above
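To see which extras are usable in the current environment, a small check like the following may help. It is a sketch that relies only on the standard library; `EXTRA_MODULES` and `available_extras` are hypothetical names, the mapping covers only the backend and embedding extras listed above, and the loaders extra is left out for brevity.

```python
from importlib.util import find_spec

# Import names for the packages each extra installs. Distribution names such as
# psycopg2-binary and faiss-cpu are imported as psycopg2 and faiss.
EXTRA_MODULES = {
    "postgres": ["psycopg2", "pgvector", "django"],
    "faiss": ["faiss"],
    "local-embeddings": ["sentence_transformers", "torch"],
    "openai": ["openai"],
    "cohere": ["cohere"],
}


def available_extras():
    """Return {extra: bool}, True when every module of that extra can be found."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


status = available_extras()
for extra, installed in status.items():
    print(f"{extra}: {'installed' if installed else 'missing'}")
```

`find_spec` only locates a top-level module without importing it, so the check stays cheap even for heavy dependencies such as torch.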

Quick Start

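from rakam_systems_vectorstore import FaissStore, Node, NodeMetadata

# Configure a FAISS-backed store and the embedding model it uses.
store = FaissStore(
    name="my_store",
    base_index_path="./indexes",
    embedding_model="Snowflake/snowflake-arctic-embed-m",
    initialising=True
)

# Wrap each piece of text in a Node; the metadata records its source document and position.
nodes = [
    Node(
        content="Python is great for AI",
        metadata=NodeMetadata(source_file_uuid="doc1", position=0)
    )
]

# Index the nodes as a collection, then request up to 5 matches for the query.
store.create_collection_from_nodes("my_collection", nodes)
results, _ = store.search(collection_name="my_collection", query="AI programming", number=5)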

Core Components

  • ConfigurablePgVectorStore — PostgreSQL with pgvector, hybrid search, keyword search
  • FaissStore — In-memory FAISS-based vector search
  • ConfigurableEmbeddings — SentenceTransformers, OpenAI, Cohere backends
  • AdaptiveLoader — Auto-detects and loads PDF, DOCX, HTML, Markdown, CSV, email, code
  • TextChunker / AdvancedChunker — Sentence-based and context-aware chunking
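To make the idea of sentence-based chunking with preserved context concrete, here is a minimal sketch. It is not the implementation of TextChunker or AdvancedChunker; `chunk_sentences`, `max_chars`, and `overlap` are hypothetical names, and the sentence splitter is deliberately naive.

```python
import re


def chunk_sentences(text, max_chars=70, overlap=1):
    """Group sentences into chunks, using max_chars as a soft size target.

    The last `overlap` sentences of each chunk are repeated at the start of
    the next one so that context carries across chunk boundaries. A chunk can
    exceed max_chars when a single sentence or the carried-over context is long.
    Hypothetical helper for illustration only.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the trailing sentences over as shared context.
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = (
    "Python is great for AI. FAISS runs in memory. "
    "PostgreSQL stores vectors with pgvector. Hybrid search mixes both."
)
chunks = chunk_sentences(text, max_chars=70, overlap=1)
for chunk in chunks:
    print(chunk)
```

With `overlap=1`, each chunk after the first starts with the final sentence of the previous chunk, which is one simple way to keep neighbouring context available at retrieval time.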

Environment Variables

Variable           Description
POSTGRES_HOST      PostgreSQL host (default: localhost)
POSTGRES_PORT      PostgreSQL port (default: 5432)
POSTGRES_DB        Database name (default: vectorstore_db)
POSTGRES_USER      Database user (default: postgres)
POSTGRES_PASSWORD  Database password
OPENAI_API_KEY     For OpenAI embeddings
COHERE_API_KEY     For Cohere embeddings
HUGGINGFACE_TOKEN  For private HuggingFace models
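The following sketch shows how the PostgreSQL variables and their documented defaults resolve into connection settings, which can be handy for checking an environment before starting the store. The package reads these variables itself; `postgres_settings` and `postgres_dsn` are hypothetical helpers written for illustration and are not part of its API.

```python
import os
from urllib.parse import quote


def postgres_settings(env=None):
    """Collect PostgreSQL settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("POSTGRES_HOST", "localhost"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
        "dbname": env.get("POSTGRES_DB", "vectorstore_db"),
        "user": env.get("POSTGRES_USER", "postgres"),
        # No default is documented for the password, so it stays empty here.
        "password": env.get("POSTGRES_PASSWORD", ""),
    }


def postgres_dsn(settings):
    """Build a postgresql:// URL from the settings, escaping the credentials."""
    user = quote(settings["user"], safe="")
    password = quote(settings["password"], safe="")
    auth = f"{user}:{password}" if password else user
    return (
        f"postgresql://{auth}@{settings['host']}:{settings['port']}"
        f"/{settings['dbname']}"
    )


# Example with an explicit mapping instead of the real process environment.
settings = postgres_settings({"POSTGRES_HOST": "db.internal", "POSTGRES_PASSWORD": "s3cret"})
dsn = postgres_dsn(settings)
print(dsn)
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.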

Documentation

For PostgreSQL setup, search examples, YAML configuration, and full API reference, see the official documentation.

License

Apache 2.0

