Flex Vector

A simple and intuitive vector database abstraction layer supporting multiple vector stores.

Features

  • Unified interface for multiple vector databases
  • Default: Chroma (included in base installation - lightweight, file-based)
  • Optional Vector Databases:
    • Qdrant (pip install flexvector[qdrant])
    • Weaviate (pip install flexvector[weaviate])
    • PGVector (pip install flexvector[pgvector])
    • Milvus (pip install flexvector[milvus])
    • Azure AI Search (coming soon)
    • ...and more to come!
  • LangChain native support (included in base installation)
  • Command-line interface for common operations
  • FastEmbed fallback for embeddings (no API keys required)
  • Flexible data loading from files, direct data, or URIs
  • LlamaIndex native support (coming soon)
  • Async support for all operations

Installation

FlexVector comes with Chroma as the default vector database (lightweight and file-based). You can install additional vector databases as needed:

Base Installation (includes Chroma)

pip install flexvector

Install with Specific Vector Databases

Qdrant:

pip install flexvector[qdrant]

Weaviate:

pip install flexvector[weaviate]

PGVector (PostgreSQL):

pip install flexvector[pgvector]

Milvus:

pip install flexvector[milvus]

Full installation:

pip install flexvector[full]

CLI Tool

Installing the package makes the flexvector command available on your path:

# After installation, use the 'flexvector' command directly
flexvector --help

Quick Start

Environment Variables

  • See env.example for a list of environment variables you can set.

Note: If no OpenAI API key is provided, FlexVector automatically falls back to FastEmbed with the BAAI/bge-small-en-v1.5 model, which provides free, high-quality embeddings without requiring any API keys.

Using the Python API

from flexvector import VectorDBFactory
from flexvector.config import settings

# Check which vector databases are available
print("Available:", VectorDBFactory.list_available())
print("Installed:", VectorDBFactory.list_installed())

# Initialize client with configuration
# Use "chroma" (default), "qdrant", "weaviate", "pg", or "milvus" 
try:
    client = VectorDBFactory.get("qdrant", settings)
except ImportError as e:
    print(f"Error: {e}")
    # Fallback to default Chroma
    client = VectorDBFactory.get("chroma", settings)

# Load documents from file or directory
docs = client.load(collection_name="my_collection", path="path/to/document.txt")

# Or create and add documents directly
from langchain_core.documents import Document

doc = Document(page_content="Hello world", metadata={"source": "example"})
client.from_langchain("my_collection", [doc])

# Search
results = client.search(
    collection_name="my_collection",
    query="hello",
    top_k=5
)

# Delete collection
client.remove_collection("my_collection")

# Delete documents
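
The try/except fallback above can be generalized to an ordered list of preferred backends. The following is a minimal sketch, not part of FlexVector: `first_available` is a hypothetical helper, and it assumes only what the example above shows, namely that requesting a backend whose optional dependency is missing raises `ImportError`.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_available(names: Sequence[str], get: Callable[[str], T]) -> tuple[str, T]:
    """Return (name, client) for the first backend that can be created.

    `get` is any callable that builds a client for a backend name and raises
    ImportError when that backend's optional dependencies are not installed.
    """
    errors = []
    for name in names:
        try:
            return name, get(name)
        except ImportError as exc:  # optional extra not installed, try the next one
            errors.append(f"{name}: {exc}")
    raise ImportError("no vector database backend available: " + "; ".join(errors))


# Possible usage with the calls shown above (not executed here):
#   name, client = first_available(
#       ["qdrant", "chroma"], lambda n: VectorDBFactory.get(n, settings)
#   )
```

Passing the factory call in as a callable keeps the helper independent of FlexVector itself, so it can be exercised with a stub in tests.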

Embedding Options

FlexVector supports multiple embedding providers:

  1. OpenAI Embeddings (default when API key provided):

    • Models: text-embedding-3-small, text-embedding-3-large, etc.
    • Requires: OPENAI_API_KEY environment variable
    • High quality, configurable dimensions
  2. FastEmbed (automatic fallback):

    • Model: BAAI/bge-small-en-v1.5 (384-dimensional embeddings)
    • Requires: No API key needed
    • Free, fast, and runs locally
    • Good quality for most use cases
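
The provider choice described above can be pictured as a single check on the environment. This is an illustrative sketch only, not FlexVector's actual implementation: `choose_embedding_provider` is a hypothetical name, and treating a blank `OPENAI_API_KEY` as "not provided" is an assumption of the sketch.

```python
import os
from typing import Mapping, Optional

FASTEMBED_MODEL = "BAAI/bge-small-en-v1.5"  # fallback model named in this README


def choose_embedding_provider(
    env: Optional[Mapping[str, str]] = None,
    openai_model: str = "text-embedding-3-small",
) -> tuple[str, str]:
    """Return (provider, model): OpenAI when a key is set, FastEmbed otherwise."""
    env = os.environ if env is None else env
    # Assumption: a blank value counts the same as a missing key.
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai", openai_model
    return "fastembed", FASTEMBED_MODEL


print(choose_embedding_provider({"OPENAI_API_KEY": "sk-example"}))
print(choose_embedding_provider({}))
```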

Using the Command Line Interface

Check available vector databases:

flexvector list-databases

This command shows:

  • 📦 All available vector database types
  • ✅ Which ones are currently installed
  • 💡 Installation commands for missing dependencies

Load documents from a file:

flexvector load --input-file examples/files/data.txt --collection my_documents

# Or using python
python cli.py load --input-file examples/files/data.txt --collection my_documents

Load documents from a directory:

flexvector load --input-dir examples/files --collection research_papers

Use a specific vector database:

# With Qdrant (requires: pip install flexvector[qdrant])
flexvector load --db-type qdrant --input-file data.txt --collection docs

# With Weaviate (requires: pip install flexvector[weaviate])  
flexvector search --db-type weaviate --query "AI research" --collection papers

# With PGVector (requires: pip install flexvector[pgvector])
flexvector load --db-type pg --input-dir ./docs --collection knowledge_base

Search for documents:

flexvector search --query "What is vector database?" --collection my_documents --top-k 5

Delete a collection:

flexvector delete --collection my_documents
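
When the CLI is driven from a script, building the argument list in Python avoids shell-quoting mistakes in queries that contain spaces or punctuation. The sketch below uses only flags that appear in the commands above; `build_search_command` is a hypothetical helper, and combining `--db-type` with `--top-k` in one call is an assumption.

```python
import shlex
from typing import Optional


def build_search_command(
    query: str,
    collection: str,
    top_k: int = 5,
    db_type: Optional[str] = None,
) -> list[str]:
    """Build the argv list for a `flexvector search` invocation."""
    cmd = ["flexvector", "search"]
    if db_type is not None:
        cmd += ["--db-type", db_type]
    cmd += ["--query", query, "--collection", collection, "--top-k", str(top_k)]
    return cmd


cmd = build_search_command("What is vector database?", "my_documents")
print(shlex.join(cmd))  # shell-safe rendering, useful for logging

# To actually run it (requires flexvector to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```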

Advanced Configuration

FlexVector supports multiple configuration methods for different deployment environments:

Configuration Sources (in priority order)

  1. CLI arguments (highest priority) - Direct command-line overrides
  2. Environment variables - Runtime environment settings
  3. Configuration files - YAML, TOML, or JSON files
  4. Default values (lowest priority) - Built-in fallback values
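
One way to picture this priority order is as a stack of dictionaries searched from top to bottom. The sketch below uses the standard library's `collections.ChainMap`; it illustrates the lookup rule only and is not FlexVector's internal code. `resolve_settings` and the sample values are hypothetical, and skipping `None` values (so that an omitted CLI flag does not hide a lower layer) is an assumption.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_settings(
    cli: Mapping[str, Any],
    env: Mapping[str, Any],
    file: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> ChainMap:
    """Layer the four sources so that earlier ones win on lookup."""

    def present(layer: Mapping[str, Any]) -> dict:
        # Drop unset values so they do not mask a lower-priority layer.
        return {k: v for k, v in layer.items() if v is not None}

    return ChainMap(present(cli), present(env), present(file), present(defaults))


resolved = resolve_settings(
    cli={"EMBEDDING_MODEL": None},  # flag not passed on the command line
    env={"EMBEDDING_MODEL": "text-embedding-3-large"},
    file={
        "EMBEDDING_MODEL": "text-embedding-3-small",
        "CHROMA_DB_FILE": "./data/vectorstores/chroma-dev",
    },
    defaults={"EMBEDDING_DIMENSION": 512},
)

print(resolved["EMBEDDING_MODEL"])      # environment variable beats the file
print(resolved["CHROMA_DB_FILE"])       # only the file defines this key
print(resolved["EMBEDDING_DIMENSION"])  # falls through to the default
```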

To set up per-environment configuration:

  1. Create a configuration file:

    flexvector init-config --config-file flexvector.yaml
    
  2. Edit the configuration file for your environment:

    # flexvector.yaml
    environments:
      development:
        CHROMA_DB_FILE: "./data/vectorstores/chroma-dev"
        EMBEDDING_MODEL: "text-embedding-3-small"
      production:
        CHROMA_HTTP_URL: "https://prod-chroma.example.com"
        EMBEDDING_MODEL: "text-embedding-3-large"
    
    # Default settings
    EMBEDDING_DIMENSION: 512
    
  3. Use environment-specific settings:

    # Development
    python cli.py load --input-dir ./docs --environment development
    
    # Production  
    python cli.py search --query "AI" --environment production
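
A plausible reading of the file layout above is that top-level keys act as shared defaults and the selected `environments` entry is overlaid on top of them. The sketch below shows that merge on a plain dictionary mirroring the YAML example. `settings_for_environment` is a hypothetical function, and the merge rule is an assumption rather than documented FlexVector behavior.

```python
from typing import Any, Mapping


def settings_for_environment(config: Mapping[str, Any], environment: str) -> dict[str, Any]:
    """Overlay one environment's settings on the file's top-level defaults."""
    environments = config.get("environments", {})
    if environment not in environments:
        raise KeyError(f"unknown environment: {environment!r}")
    # Everything outside "environments" is treated as a shared default.
    merged = {k: v for k, v in config.items() if k != "environments"}
    merged.update(environments[environment])
    return merged


# Same content as the flexvector.yaml example, written as a Python dict.
config = {
    "environments": {
        "development": {
            "CHROMA_DB_FILE": "./data/vectorstores/chroma-dev",
            "EMBEDDING_MODEL": "text-embedding-3-small",
        },
        "production": {
            "CHROMA_HTTP_URL": "https://prod-chroma.example.com",
            "EMBEDDING_MODEL": "text-embedding-3-large",
        },
    },
    "EMBEDDING_DIMENSION": 512,
}

print(settings_for_environment(config, "production"))
```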
    

.env File Support

cp env.example .env
# Edit .env with your local settings

📖 See full configuration documentation for advanced configuration patterns, multiple environments, and security best practices.

Documentation

For more usage info, see docs.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Appendix: Use Cases

FlexVector is intended for a range of AI applications, including:

Research and Development

  • Prototyping: Quickly test different vector databases without changing your application code
  • A/B Testing: Compare performance across different vector stores for your specific use case
  • Academic Research: Study vector search behavior with a standardized interface

RAG Pipeline Integration

Build robust Retrieval Augmented Generation (RAG) systems with a database-agnostic approach:

  • ETL Workflows: Create efficient extract-transform-load pipelines that process documents and store embeddings without locking into a specific vector database
  • Multi-modal RAG: Store and retrieve text, images, and other data types with the same consistent interface
  • Hybrid Search Systems: Combine semantic search with traditional keyword search for improved retrieval quality
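
For the hybrid search case, one common way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF), in which each document scores 1 / (k + rank) in every list where it appears. The sketch below is a generic illustration and does not imply that FlexVector ships this feature; the function name, the document IDs, and the choice of k = 60 are assumptions.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> list[Hashable]:
    """Fuse several best-first rankings into one list using RRF scores."""
    scores: dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


semantic = ["doc_a", "doc_b", "doc_c"]  # e.g. IDs from a vector search
keyword = ["doc_b", "doc_d", "doc_a"]   # e.g. IDs from a keyword index
fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Because RRF works on ranks rather than raw scores, the two retrievers do not need comparable scoring scales.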
