VectorWave: Seamless Auto-Vectorization Framework

Development Status
- 3 - Alpha
Intended Audience
- Developers
Operating System
- OS Independent

🌟 Overview

VectorWave is a framework that uses a decorator to automatically save and manage the output of Python functions and methods in a vector database (Vector DB). With a single line of code (@vectorize), developers can turn function data into searchable vector data without handling data collection, embedding generation, or Vector DB storage themselves.

✨ Features

- @vectorize Decorator:
  1. Static Data Collection: Saves the function's source code, docstring, and metadata to the VectorWaveFunctions collection once when the script is loaded.
  2. Dynamic Data Logging: Records the execution time, success/failure status, error logs, and 'dynamic tags' to the VectorWaveExecutions collection every time the function is called.
- Distributed Tracing: By combining the @vectorize and @trace_span decorators, you can analyze the execution of complex multi-step workflows, grouped under a single trace_id.
- Search Interface: Provides search_functions (for vector search) and search_executions (for log filtering) to facilitate the construction of RAG and monitoring systems.
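
The split between static collection (once, at load time) and dynamic logging (on every call) can be illustrated with a toy decorator. This is only a sketch of the idea, not VectorWave's implementation: `vectorize_sketch`, `FUNCTIONS`, `EXECUTIONS`, and the record field names are hypothetical, and plain Python lists stand in for the two collections.

```python
import functools
import inspect
import time

# Hypothetical in-memory stand-ins for the two collections named above.
FUNCTIONS = []   # static records, one per decorated function
EXECUTIONS = []  # dynamic records, one per call


def vectorize_sketch(**tags):
    """Toy decorator showing the two phases; not the real @vectorize."""
    def decorator(func):
        # Static phase: runs once, when the decorator is applied at load time.
        try:
            source = inspect.getsource(func)
        except (OSError, TypeError):
            source = ""  # source text is not always available (e.g. under exec)
        FUNCTIONS.append({
            "function_name": func.__name__,
            "docstring": func.__doc__ or "",
            "source_code": source,
        })

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Dynamic phase: runs on every call, whether it succeeds or fails.
            start = time.perf_counter()
            status, error = "SUCCESS", None
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                status, error = "ERROR", repr(exc)
                raise
            finally:
                EXECUTIONS.append({
                    "function_name": func.__name__,
                    "status": status,
                    "error": error,
                    "duration_ms": (time.perf_counter() - start) * 1000,
                    **tags,
                })
        return wrapper
    return decorator


@vectorize_sketch(team="billing")
def add(a, b):
    """Adds two numbers."""
    return a + b


add(1, 2)
add(3, 4)
print(len(FUNCTIONS), len(EXECUTIONS))  # 1 static record, 2 execution records
```

One static record is written when `add` is defined, and each call appends one execution record carrying the decorator's tags.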

🚀 Usage

VectorWave has two parts: 'storing' via decorators and 'searching' via functions. It also supports execution flow tracing.

1. (Required) Initialize the Database and Configuration

import time
from vectorwave import (
    vectorize, 
    initialize_database, 
    search_functions, 
    search_executions
)
# [ADDITION] Import trace_span separately for distributed tracing.
from vectorwave.monitoring.tracer import trace_span 

# This only needs to be called once when the script starts.
try:
    client = initialize_database()
    print("VectorWave DB initialized successfully.")
except Exception as e:
    print(f"DB initialization failed: {e}")
    raise SystemExit(1)  # Stop with a non-zero exit status so the failure is visible.

2. [Store] Use `@vectorize` with Distributed Tracing

The @vectorize acts as the Root for tracing, and @trace_span is used on internal functions to group the execution flow under a single trace_id.

# --- Child Span Function: Captures arguments ---
@trace_span(attributes_to_capture=['user_id', 'amount'])
def step_1_validate_payment(user_id: str, amount: int):
    """(Span) Payment validation. Records user_id and amount in the log."""
    print(f"  [SPAN 1] Validating payment for {user_id}...")
    time.sleep(0.1)
    return True

@trace_span(attributes_to_capture=['user_id', 'receipt_id'])
def step_2_send_receipt(user_id: str, receipt_id: str):
    """(Span) Sends the receipt."""
    print(f"  [SPAN 2] Sending receipt {receipt_id}...")
    time.sleep(0.2)


# --- Root Function (@trace_root role) ---
@vectorize(
    search_description="Charges a user in the payment system.",
    sequence_narrative="Returns a receipt ID upon successful payment.",
    team="billing",  # <-- Custom Tag (recorded in all execution logs)
    priority=1       # <-- Custom Tag (execution priority)
)
def process_payment(user_id: str, amount: int):
    """(Root Span) Executes the user payment workflow."""
    print(f"  [ROOT EXEC] process_payment: Starting workflow for {user_id}...")
    
    # When calling child functions, the same trace_id is automatically inherited via ContextVar.
    step_1_validate_payment(user_id=user_id, amount=amount) 
    
    receipt_id = f"receipt_{user_id}_{amount}"
    step_2_send_receipt(user_id=user_id, receipt_id=receipt_id)

    print("  [ROOT DONE] process_payment")
    return {"status": "success", "receipt_id": receipt_id}

# --- Execute the Function ---
print("Now calling 'process_payment'...")
# This single call records 3 execution logs (spans) in the DB,
# all grouped under one 'trace_id'.
process_payment("user_789", 5000)

3. [Search ①] Function Definition Search (for RAG)

# Search for functions related to 'payment' using natural language (vector search).
print("\n--- Searching for 'payment' functions ---")
payment_funcs = search_functions(
    query="user payment processing",
    limit=3
)
for func in payment_funcs:
    print(f"  - Function: {func['properties']['function_name']}")
    print(f"  - Description: {func['properties']['search_description']}")
    print(f"  - Distance (lower is more similar): {func['metadata'].distance:.4f}")
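
To make the distance value in the output above concrete, here is a small sketch of ranking by vector distance. It assumes a cosine distance metric and uses made-up 3-dimensional vectors in place of real embeddings; the function names in `stored` are illustrative only, and the actual search is performed by the vector database, not by code like this.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up "embeddings" for three stored function descriptions.
stored = {
    "process_payment": [0.9, 0.1, 0.0],
    "send_email": [0.1, 0.9, 0.2],
    "resize_image": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.1]  # stands in for the embedded query text

# Smaller distance means a closer match, so sort ascending.
ranked = sorted(stored, key=lambda name: cosine_distance(query_vector, stored[name]))
for name in ranked:
    print(f"{name}: {cosine_distance(query_vector, stored[name]):.4f}")
```

The closest entry is listed first, which is why a smaller distance indicates a better match.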

4. [Search ②] Execution Log Search (Monitoring and Tracing)

The search_executions function can now search for all related execution logs (spans) based on the trace_id.

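# 1. Find the Trace ID of a specific workflow (process_payment).
latest_payment_span = search_executions(
    limit=1, 
    filters={"function_name": "process_payment"},
    sort_by="timestamp_utc",
    sort_ascending=False
)
# Guard against an empty result before indexing into it.
if not latest_payment_span:
    raise SystemExit("No execution logs found for process_payment.")
trace_id = latest_payment_span[0]["trace_id"] 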

# 2. Search all spans belonging to that Trace ID, sorted chronologically.
print(f"\n--- Full Trace for ID ({trace_id[:8]}...) ---")
trace_spans = search_executions(
    limit=10,
    filters={"trace_id": trace_id},
    sort_by="timestamp_utc",
    sort_ascending=True # Ascending sort for workflow flow analysis
)

for i, span in enumerate(trace_spans):
    print(f"  - [Span {i+1}] {span['function_name']} ({span['duration_ms']:.2f}ms)")
    # Captured arguments (user_id, amount, etc.) are stored with the child spans.
    
# Example Output:
# - [Span 1] step_1_validate_payment (100.81ms)
# - [Span 2] step_2_send_receipt (202.06ms)
# - [Span 3] process_payment (333.18ms)
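
As a worked example of reading such a trace, the sketch below uses the durations from the example output to estimate how much time the root function spent outside its two child spans. It assumes the root span's duration includes the time spent in its children, which is consistent with the numbers shown but is not stated by the library.

```python
# Durations copied from the example output above.
spans = [
    {"function_name": "step_1_validate_payment", "duration_ms": 100.81},
    {"function_name": "step_2_send_receipt", "duration_ms": 202.06},
    {"function_name": "process_payment", "duration_ms": 333.18},
]

root_name = "process_payment"
root_ms = next(s["duration_ms"] for s in spans if s["function_name"] == root_name)
child_ms = sum(s["duration_ms"] for s in spans if s["function_name"] != root_name)

# Time the root spent in its own code (printing, logging overhead, and so on).
overhead_ms = round(root_ms - child_ms, 2)
print(overhead_ms)  # 30.31
```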

⚙️ Configuration

VectorWave automatically reads Weaviate database connection info and vectorization strategy from environment variables or a .env file.

Create a .env file in your project's root directory (e.g., where test_ex/example.py is located) and set the required values.

Vectorizer Strategy (VECTORIZER)

You can select the text vectorization method via the VECTORIZER environment variable in your test_ex/.env file.

| `VECTORIZER` Setting | Description | Required Additional Settings |
| --- | --- | --- |
| `huggingface` | (Default Recommended) Uses the `sentence-transformers` library to vectorize on your local CPU. No API key is needed, making it great for immediate testing. | `HF_MODEL_NAME` (e.g., "sentence-transformers/all-MiniLM-L6-v2") |
| `openai_client` | (High-Performance) Uses the OpenAI Python client to vectorize with modern models like `text-embedding-3-small`. | `OPENAI_API_KEY` (A valid OpenAI API key) |
| `weaviate_module` | (Docker Delegate) Delegates the vectorization task to the Weaviate container's built-in module (e.g., `text2vec-openai`). | `WEAVIATE_VECTORIZER_MODULE`, `OPENAI_API_KEY` |
| `none` | Disables vectorization. Data will be stored without vectors. | None |
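
The table above can be turned into a quick pre-flight check. The helper below is a hypothetical sketch, not part of VectorWave: `missing_settings` and `REQUIRED_SETTINGS` are names invented here, and the mapping simply restates the "Required Additional Settings" column.

```python
# Strategy name -> settings the table lists as required for it.
REQUIRED_SETTINGS = {
    "huggingface": ["HF_MODEL_NAME"],
    "openai_client": ["OPENAI_API_KEY"],
    "weaviate_module": ["WEAVIATE_VECTORIZER_MODULE", "OPENAI_API_KEY"],
    "none": [],
}


def missing_settings(env):
    """Return the required variable names that are absent or empty in env."""
    strategy = env.get("VECTORIZER", "")
    if strategy not in REQUIRED_SETTINGS:
        raise ValueError(f"Unknown VECTORIZER value: {strategy!r}")
    return [name for name in REQUIRED_SETTINGS[strategy] if not env.get(name)]


print(missing_settings({"VECTORIZER": "openai_client"}))
# ['OPENAI_API_KEY']
print(missing_settings({"VECTORIZER": "huggingface",
                        "HF_MODEL_NAME": "sentence-transformers/all-MiniLM-L6-v2"}))
# []
```

In a real script, the same function could be given `os.environ` after the .env file has been loaded.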

.env File Examples

Configure your .env file according to the strategy you want to use.

Example 1: Using `huggingface` (Local, No API Key)

Uses a sentence-transformers model on your local machine. Ideal for testing without API keys.

# .env (Using HuggingFace)
# --- Basic Weaviate Connection ---
WEAVIATE_HOST=localhost
WEAVIATE_PORT=8080
WEAVIATE_GRPC_PORT=50051

# --- [Strategy 1] HuggingFace Config ---
VECTORIZER="huggingface"
HF_MODEL_NAME="sentence-transformers/all-MiniLM-L6-v2"

# (OPENAI_API_KEY is not required for this mode; the line below can be omitted)
OPENAI_API_KEY=sk-...

# --- [Advanced] Custom Properties ---
CUSTOM_PROPERTIES_FILE_PATH=.weaviate_properties
RUN_ID=test-run-001

Example 2: Using `openai_client` (Python Client, High-Performance)

Directly calls the OpenAI API via the openai Python library.

# .env (Using OpenAI Python Client)
# --- Basic Weaviate Connection ---
WEAVIATE_HOST=localhost
WEAVIATE_PORT=8080
WEAVIATE_GRPC_PORT=50051

# --- [Strategy 2] OpenAI Client Config ---
VECTORIZER="openai_client"

# [Required] You must enter a valid OpenAI API key.
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxx

# (HF_MODEL_NAME is not used in this mode; the line below can be omitted)
HF_MODEL_NAME=...

# --- [Advanced] Custom Properties ---
CUSTOM_PROPERTIES_FILE_PATH=.weaviate_properties
RUN_ID=test-run-001

Example 3: Using `weaviate_module` (Docker Delegate)

Delegates vectorization to the Weaviate Docker container instead of Python. (See vw_docker.yml config).

# .env (Delegating to Weaviate Module)
# --- Basic Weaviate Connection ---
WEAVIATE_HOST=localhost
WEAVIATE_PORT=8080
WEAVIATE_GRPC_PORT=50051

# --- [Strategy 3] Weaviate Module Config ---
VECTORIZER="weaviate_module"
WEAVIATE_VECTORIZER_MODULE=text2vec-openai
WEAVIATE_GENERATIVE_MODULE=generative-openai

# [Required] The Weaviate container will read this API key.
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxx

# --- [Advanced] Custom Properties ---
CUSTOM_PROPERTIES_FILE_PATH=.weaviate_properties
RUN_ID=test-run-001

Custom Properties and Dynamic Execution Tagging

VectorWave can store user-defined metadata in addition to static data (function definitions) and dynamic data (execution logs). This works in two steps.

Step 1: Define Custom Schema (Tag "Allow-list")

Create a JSON file at the path specified by CUSTOM_PROPERTIES_FILE_PATH in your .env file (default: .weaviate_properties).

This file instructs VectorWave to add new properties (columns) to the Weaviate collections. This file acts as an "allow-list" for all custom tags.

.weaviate_properties Example:

{
  "run_id": {
    "data_type": "TEXT",
    "description": "The ID of the specific test run"
  },
  "experiment_id": {
    "data_type": "TEXT",
    "description": "Identifier for the experiment"
  },
  "team": {
    "data_type": "TEXT",
    "description": "The team responsible for this function"
  },
  "priority": {
    "data_type": "INT",
    "description": "Execution priority level"
  }
}

This definition will add run_id, experiment_id, team, and priority properties to both the VectorWaveFunctions and VectorWaveExecutions collections.
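
Before starting a script, it can help to check that the allow-list file parses and that each entry has the expected shape. The sketch below is a hypothetical helper, not VectorWave's loader. It assumes only what the example shows: a top-level JSON object whose values each contain a `data_type`.

```python
import json


def load_custom_properties(text):
    """Parse allow-list JSON text and return {property_name: data_type}."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("Top level must be a JSON object")
    allowed = {}
    for name, spec in raw.items():
        if not isinstance(spec, dict) or "data_type" not in spec:
            raise ValueError(f"Property {name!r} needs a 'data_type'")
        allowed[name] = spec["data_type"]
    return allowed


example = """
{
  "run_id": {"data_type": "TEXT", "description": "The ID of the specific test run"},
  "team": {"data_type": "TEXT", "description": "The team responsible for this function"},
  "priority": {"data_type": "INT", "description": "Execution priority level"}
}
"""

allowed = load_custom_properties(example)
print(sorted(allowed))  # ['priority', 'run_id', 'team']
```

To check a real file, read its contents first (for example with `pathlib.Path(".weaviate_properties").read_text()`) and pass the text to the function.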

Step 2: Dynamic Execution Tagging (Adding Values)

When a function is executed, VectorWave adds tags to the VectorWaveExecutions log. These tags are collected and merged from two sources.

1. Global Tags (Environment Variables): VectorWave looks for environment variables matching the UPPERCASE name of the keys defined in Step 1 (e.g., RUN_ID, EXPERIMENT_ID). Found values are loaded as global_custom_values and added to all execution logs. Ideal for run-wide metadata.

2. Function-Specific Tags (Decorator): You can pass tags as keyword arguments (**execution_tags) directly to the @vectorize decorator. Ideal for function-specific metadata.

# --- .env file ---
# RUN_ID=global-run-abc
# TEAM=default-team

@vectorize(
    search_description="Process payment",
    sequence_narrative="...",
    team="billing",  # <-- Function-specific tag
    priority=1       # <-- Function-specific tag
)
def process_payment():
    pass

@vectorize(
    search_description="Another function",
    sequence_narrative="...",
    run_id="override-run-xyz" # <-- Overrides the global tag
)
def other_function():
    pass

Tag Merging and Validation Rules

- Validation (Important): Tags (global or function-specific) will only be saved to Weaviate if their key (e.g., run_id, team, priority) was first defined in the .weaviate_properties file (Step 1). Tags not defined in the schema are ignored, and a warning is printed at script startup.
- Priority (Override): If a tag key is defined in both places (e.g., global RUN_ID in .env and run_id="override-xyz" in the decorator), the function-specific tag from the decorator always wins.

Resulting Logs:

process_payment() execution log: {"run_id": "global-run-abc", "team": "billing", "priority": 1}
other_function() execution log: {"run_id": "override-run-xyz", "team": "default-team"}
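
The merging and validation rules can be summarized in a few lines of Python. This is a sketch of the described behavior under stated assumptions, not VectorWave's code: `merge_tags` is a hypothetical helper, a plain dict stands in for the environment, and the startup warning for unknown tags is omitted.

```python
def merge_tags(allowed_keys, environ, decorator_tags):
    """Combine global and function-specific tags, keeping only allowed keys."""
    # Global tags: environment variables named after the UPPERCASE key.
    merged = {key: environ[key.upper()]
              for key in allowed_keys if key.upper() in environ}
    # Function-specific tags override globals; keys outside the allow-list are dropped.
    for key, value in decorator_tags.items():
        if key in allowed_keys:
            merged[key] = value
    return merged


allowed = ["run_id", "experiment_id", "team", "priority"]
env = {"RUN_ID": "global-run-abc", "TEAM": "default-team"}

print(merge_tags(allowed, env, {"team": "billing", "priority": 1}))
# {'run_id': 'global-run-abc', 'team': 'billing', 'priority': 1}
print(merge_tags(allowed, env, {"run_id": "override-run-xyz"}))
# {'run_id': 'override-run-xyz', 'team': 'default-team'}
```

The two printed dictionaries match the resulting logs listed above for process_payment() and other_function().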

🤝 Contributing

All forms of contribution are welcome, including bug reports, feature requests, and code contributions. For details, please refer to CONTRIBUTING.md.

📜 License

This project is distributed under the MIT License. See the LICENSE file for details.

Release history

1.0.0

May 19, 2026

0.3.0

Feb 23, 2026

0.2.9

Feb 11, 2026

0.2.8

Jan 25, 2026

0.2.7

Jan 24, 2026

0.2.6

Dec 23, 2025

0.2.5

Dec 21, 2025

0.2.4

Dec 9, 2025

0.2.3

Dec 4, 2025

0.2.2

Dec 3, 2025

0.2.1

Nov 26, 2025

0.2.0

Nov 24, 2025

0.1.9

Nov 23, 2025

0.1.8

Nov 22, 2025

0.1.7

Nov 21, 2025

0.1.6

Nov 18, 2025

0.1.5

Nov 16, 2025

0.1.4

Nov 11, 2025

This version

0.1.3

Nov 10, 2025

0.1.2

Nov 8, 2025

0.1.1

Nov 8, 2025

0.1.0

Nov 7, 2025

Download files

Source Distribution

vectorwave-0.1.3.tar.gz (32.0 kB)

Uploaded Nov 10, 2025 Source

Built Distribution

vectorwave-0.1.3-py3-none-any.whl (36.4 kB)

Uploaded Nov 10, 2025 Python 3

File details

Details for the file vectorwave-0.1.3.tar.gz.

File metadata

Download URL: vectorwave-0.1.3.tar.gz
Upload date: Nov 10, 2025
Size: 32.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for vectorwave-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`11c7b6fc5061c3ab8b7b7c3be2079bf1abbc035b8b4834e9993e2180fd6aa620`
MD5	`893008d45bba986eee9102bac83a93db`
BLAKE2b-256	`937d0fe7d80ca82d6edc7371c2efbaf75347745fd037f2d74a74ea65f61d6b0b`

File details

Details for the file vectorwave-0.1.3-py3-none-any.whl.

File metadata

Download URL: vectorwave-0.1.3-py3-none-any.whl
Upload date: Nov 10, 2025
Size: 36.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for vectorwave-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`79c4beeb3901e84a6578234abf58cc3fae1cd24a54cdb0560230a1730896a28e`
MD5	`c76aadf5493f3e2e946cd3c4d25cb0cd`
BLAKE2b-256	`c9af00729b2483bbc87e8246a017998a08e01d30ff229f08178a6535079f5715`
