Snowflake Cortex Search vector database provider for NLWeb
NLWeb Snowflake Cortex Search Provider
Snowflake Cortex Search vector database provider for NLWeb, enabling hybrid search capabilities using Snowflake's Cortex Search Service.
Features
- Cortex Search Integration: Native integration with Snowflake Cortex Search Service
- REST API Based: Uses Snowflake's REST API for search operations
- Hybrid Search: Combines vector similarity with keyword search
- Site Filtering: Filter search results by site or URL
- PAT Authentication: Secure authentication using Programmatic Access Tokens
- Async Support: Built with async/await, so search calls do not block the event loop
Installation
pip install nlweb-snowflake-vectordb
Configuration
Configure the Snowflake Cortex Search endpoint in your config.yaml:
retrieval_endpoints:
  snowflake_prod:
    db_type: snowflake_cortex_search
    api_endpoint: "https://your-account.snowflakecomputing.com"
    api_key: "${SNOWFLAKE_PAT}"
    index_name: "MY_DATABASE.MY_SCHEMA.MY_SEARCH_SERVICE"
    vector_dimensions: 1024
The index_name should be in the format: <database>.<schema>.<service>
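A malformed index_name is easy to catch before any request is sent. The following is a minimal sketch of such a check; `ServiceName` and `parse_index_name` are hypothetical helpers written for illustration and are not part of this package. It assumes plain identifiers and does not handle quoted Snowflake identifiers that themselves contain dots.

```python
from typing import NamedTuple


class ServiceName(NamedTuple):
    """Hypothetical container for the three parts of index_name."""
    database: str
    schema: str
    service: str


def parse_index_name(index_name: str) -> ServiceName:
    """Split '<database>.<schema>.<service>' into its three parts."""
    parts = index_name.split(".")
    # Require exactly three non-empty segments.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "index_name must look like <database>.<schema>.<service>, "
            f"got {index_name!r}"
        )
    return ServiceName(*parts)


name = parse_index_name("MY_DATABASE.MY_SCHEMA.MY_SEARCH_SERVICE")
print(name.database, name.schema, name.service)
```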
Usage
Basic Search
import asyncio

from nlweb_snowflake_vectordb import SnowflakeCortexClient


async def main():
    # Initialize the client from the endpoint defined in config.yaml
    client = SnowflakeCortexClient(endpoint_name="snowflake_prod")

    # Search for documents on a single site
    results = await client.search(
        query="machine learning models",
        site="docs.example.com",
        num_results=10,
    )

    # Each result is a (url, schema_json, name, site) tuple
    for url, schema_json, name, site in results:
        print(f"{name}: {url}")


asyncio.run(main())
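Each result is unpacked as a (url, schema_json, name, site) tuple. The sketch below shows one way to turn such tuples into plain dictionaries. It assumes schema_json may arrive either as a JSON string or as an already-parsed object, and that the metadata may carry a schema.org-style "@type" key; both are assumptions, not documented guarantees. `summarize_results` and the sample row are hypothetical.

```python
import json


def summarize_results(results):
    """Convert (url, schema_json, name, site) tuples into small dicts."""
    summaries = []
    for url, schema_json, name, site in results:
        metadata = schema_json
        if isinstance(schema_json, str):
            try:
                metadata = json.loads(schema_json)
            except json.JSONDecodeError:
                # Keep the row, but drop metadata that cannot be parsed.
                metadata = None
        # "@type" is an assumed key; it is None when absent or unparseable.
        schema_type = metadata.get("@type") if isinstance(metadata, dict) else None
        summaries.append(
            {"name": name, "url": url, "site": site, "type": schema_type}
        )
    return summaries


# Hypothetical sample row, shaped like the tuples returned by search().
sample = [
    (
        "https://docs.example.com/ml-guide",
        '{"@type": "Article", "name": "ML Guide"}',
        "ML Guide",
        "docs.example.com",
    )
]
for row in summarize_results(sample):
    print(row)
```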
Search by URL
# Find a specific document by URL
# (run inside an async function, with a client created as in Basic Search)
results = await client.search_by_url(
    url="https://docs.example.com/ml-guide",
    query="machine learning",
)
Get Available Sites
# Get list of all indexed sites
sites = await client.get_sites()
print(f"Available sites: {sites}")
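Because the client methods are coroutines, searches against several sites can run concurrently. The sketch below assumes only the `search(query, site, num_results)` signature listed in the API reference; `search_many_sites` is a hypothetical helper, not part of the package, and it works with any object exposing a compatible async `search` method.

```python
import asyncio


async def search_many_sites(client, query, sites, num_results=5):
    """Run one search per site concurrently and group results by site."""
    tasks = [
        client.search(query=query, site=site, num_results=num_results)
        for site in sites
    ]
    # gather() preserves the order of its inputs, so zip() lines up correctly.
    per_site = await asyncio.gather(*tasks)
    return dict(zip(sites, per_site))
```

A typical call would pass the output of `get_sites()` (or a subset of it) as `sites`. If one search raises, `asyncio.gather` propagates that exception to the caller; pass `return_exceptions=True` to collect failures instead.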
API Reference
SnowflakeCortexClient
Main client for Snowflake Cortex Search operations.
Methods
- search(query, site, num_results, **kwargs): Search for documents by query and site
- search_by_url(url, query, **kwargs): Search for a specific document by URL
- get_sites(**kwargs): Get list of unique site names
Snowflake Cortex Search Service
This provider requires a Snowflake Cortex Search Service with the following columns:
- url: Document URL (TEXT)
- site: Site name (TEXT)
- schema_json: Schema metadata (TEXT/VARIANT)
The search service should be created with vector embeddings enabled.
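To illustrate how these columns, the PAT, and site filtering fit together, the sketch below assembles a query request in the shape of Snowflake's public Cortex Search REST API. It is not taken from this package's source: `build_query_request` is a hypothetical helper, and the URL path, the token-type header, and the `@eq` filter syntax follow my understanding of Snowflake's documentation and should be verified against it. The function only builds the request and sends nothing.

```python
from urllib.parse import quote


def build_query_request(api_endpoint, index_name, pat, query, site=None, limit=10):
    """Return (url, headers, body) for a Cortex Search query request."""
    database, schema, service = index_name.split(".")
    url = (
        f"{api_endpoint.rstrip('/')}/api/v2/databases/{quote(database)}"
        f"/schemas/{quote(schema)}/cortex-search-services/{quote(service)}:query"
    )
    headers = {
        "Authorization": f"Bearer {pat}",
        # Assumed header telling Snowflake the bearer token is a PAT.
        "X-Snowflake-Authorization-Token-Type": "PROGRAMMATIC_ACCESS_TOKEN",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = {
        "query": query,
        # Columns the provider expects the search service to expose.
        "columns": ["url", "site", "schema_json"],
        "limit": limit,
    }
    if site:
        # Restrict results to rows whose site column equals the given value.
        body["filter"] = {"@eq": {"site": site}}
    return url, headers, body


url, headers, body = build_query_request(
    "https://your-account.snowflakecomputing.com",
    "MY_DATABASE.MY_SCHEMA.MY_SEARCH_SERVICE",
    "example-token",  # placeholder; read the real PAT from the environment
    "machine learning models",
    site="docs.example.com",
)
print(url)
print(body)
```

The resulting triple could be passed to an HTTP client such as httpx, for example `await httpx.AsyncClient().post(url, headers=headers, json=body)`.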
Requirements
- Python 3.10+
- nlweb-core >= 0.5.5
- httpx >= 0.28.1
- Active Snowflake account with Cortex Search enabled
- Valid Programmatic Access Token (PAT)
Note on Data Ingestion
Unlike other vector database providers, Snowflake Cortex Search does not support programmatic document upload through this client. Data must be loaded into Snowflake tables using Snowflake's native data loading tools (COPY INTO, Snowpipe, etc.) before creating the Cortex Search Service.
This package provides read-only access to existing Cortex Search Services.
License
MIT License - see LICENSE file for details.