PostgreSQL vector support for asyncpg
Adds PostgreSQL vector support for Python.
Registers the vector and halfvec data types from the PostgreSQL extension vector with the asynchronous PostgreSQL client asyncpg, and marshals vector data to and from PostgreSQL database tables.
Internally, the data is packed into a Python bytes object: single-precision float vectors take 4 bytes per item (class Vector) and half-precision float vectors take 2 bytes per item (class HalfVector). Data is packed and unpacked with struct from the standard library as necessary.
This module provides functionality similar to pgvector-python but keeps dependencies to a minimum (e.g. no dependency on numpy).
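To make the storage layout described above concrete, here is a minimal sketch of packing with struct, using the `f` (4-byte single-precision) and `e` (2-byte half-precision) format codes. The helper names and the little-endian byte order are assumptions chosen for illustration; they are not taken from the asyncpg_vector source.

```python
import struct


def pack_single(values: list[float]) -> bytes:
    """Pack floats as IEEE 754 single precision: 4 bytes per item."""
    return struct.pack(f"<{len(values)}f", *values)


def pack_half(values: list[float]) -> bytes:
    """Pack floats as IEEE 754 half precision: 2 bytes per item."""
    return struct.pack(f"<{len(values)}e", *values)


def unpack_single(data: bytes) -> list[float]:
    """Inverse of pack_single; the item count follows from the byte length."""
    return list(struct.unpack(f"<{len(data) // 4}f", data))


def unpack_half(data: bytes) -> list[float]:
    """Inverse of pack_half; the item count follows from the byte length."""
    return list(struct.unpack(f"<{len(data) // 2}e", data))


# Sample values that are exactly representable in half precision.
values = [0.5, -1.25, 2.0]
print(len(pack_single(values)), len(pack_half(values)))
print(unpack_half(pack_half(values)))
```

For a 1536-dimensional embedding, this amounts to 6144 bytes of single-precision data versus 3072 bytes in half precision, at the cost of reduced precision per item.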
Setup
Install the package
pip install asyncpg_vector
Initialize
Register vector types with your database connection or connection pool:
Connection:
from asyncpg_vector import register_vector

async def main() -> None:
    ...
    # Enable the extension before registering its types on this connection.
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)
Pool:
import asyncpg
from asyncpg_vector import register_vector

async def init_connection(conn: asyncpg.Connection) -> None:
    # Called by the pool to initialize each connection it creates.
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)

async def main() -> None:
    ...
    pool = await asyncpg.create_pool(..., init=init_connection)
Perform similarity search
First, create a table and an index:
async def create(conn: asyncpg.Connection) -> None:
    await conn.execute("""
        CREATE TABLE IF NOT EXISTS items
        (
            id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
            content text NOT NULL,
            embedding halfvec(1536) NOT NULL,
            CONSTRAINT pk_items PRIMARY KEY (id)
        );
        CREATE INDEX IF NOT EXISTS embedding_index ON items
            USING hnsw (embedding halfvec_cosine_ops);
    """)
Next, find documents in a knowledge base that match a search phrase using vector similarity with approximate nearest neighbor semantics:
from asyncpg_vector import HalfVector

async def search(conn: asyncpg.Connection, phrase: str) -> list[str]:
    ...
    # Request the embedding of the search phrase as a base64 string.
    embedding_response = await ai_client.embeddings.create(
        input=phrase,
        model="text-embedding-3-small",
        encoding_format="base64"
    )
    embedding = HalfVector.from_float_base64(embedding_response.data[0].embedding)
    # `<=>` is the cosine distance operator; smaller values are closer matches.
    query = """
        SELECT
            id,
            content,
            embedding <=> $1 AS distance
        FROM items
        ORDER BY distance
        LIMIT 5
    """
    rows = await conn.fetch(query, embedding)
    ...
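The `<=>` operator in the query above is the cosine distance operator of the vector extension, defined as one minus the cosine similarity. As a rough illustration of what the search computes, the sketch below decodes a base64 payload of packed single-precision floats and ranks a few in-memory vectors by cosine distance. The function names, the little-endian float32 payload layout, and the sample data are assumptions for illustration only; the database uses the HNSW index to approximate this ranking rather than scanning every row.

```python
import base64
import math
import struct


def decode_float_base64(payload: str) -> list[float]:
    """Decode base64 text holding packed little-endian float32 values (assumed layout)."""
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], items: list[tuple[str, list[float]]], limit: int = 5) -> list[str]:
    """Exact (brute-force) counterpart of ORDER BY distance LIMIT n."""
    scored = [(cosine_distance(query, vec), content) for content, vec in items]
    scored.sort(key=lambda pair: pair[0])
    return [content for _, content in scored[:limit]]


# Hypothetical sample data standing in for an embedding response and table rows.
payload = base64.b64encode(struct.pack("<2f", 1.0, 0.0)).decode("ascii")
query = decode_float_base64(payload)
items = [("north", [0.0, 1.0]), ("east", [1.0, 0.0]), ("north-east", [1.0, 1.0])]
print(nearest(query, items, limit=2))  # prints ['east', 'north-east']
```

An exact scan like this grows linearly with the number of rows, which is why the table above carries an HNSW index: it trades a small amount of recall for much faster lookups.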
File details
Details for the file asyncpg_vector-0.1.0.tar.gz.
File metadata
- Download URL: asyncpg_vector-0.1.0.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c2a3c431d5e71e9201f55e01d2d3c68034513ef6c3be1c3e6cf686c0b72475b |
| MD5 | 59c40c82fcb4513b84d61f66167e9d59 |
| BLAKE2b-256 | 52b918da43f39767b1b446dced41ebe756ee4f180e8416801a006936bface6a7 |
File details
Details for the file asyncpg_vector-0.1.0-py3-none-any.whl.
File metadata
- Download URL: asyncpg_vector-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | befe4248832098c9cac9e53b41360d0184a4eb1b4c8ea746599d08cb0cb2050b |
| MD5 | 42b8f751acb4aeb79641a65a0d3b7b09 |
| BLAKE2b-256 | fda6a502940ae95bca7ff911e2c88b4f74fe03276be60418028e5bc68d5b5c25 |